Fractional pixel motion estimation (ME) is required to achieve more accurate motion vectors and higher compression efficiency. This increases the computational complexity of the ME process because of additional overheads such as interpolation and the fractional pixel search. Fast algorithms for fractional ME in H.264/AVC are presented. To reduce the complexity of fractional pixel ME, the unimodal error surface assumption is used to check only a subset of points in the fractional pixel search window. The proposed algorithm employs motion prediction, a directional quadrant- and point-based search pattern, and early termination to speed up the process. Hence, the proposed algorithm efficiently explores the neighborhood of the integer pixel, exploiting the high correlation between neighboring fractional pixels and the unimodal property of the error surface. The proposed search pattern and early termination reduce computational time by roughly 8% to 18% compared to the hierarchical fractional pixel algorithm employed in the reference software, with negligible degradation in video quality and a negligible increase in bit rate.
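The descent-with-early-termination idea above can be sketched as follows. This is a minimal illustration under the unimodal error surface assumption; the cost function, threshold `T`, and half/quarter-pel step schedule are assumptions for illustration, not the paper's exact procedure.

```python
# Hedged sketch of fractional-pixel refinement under the unimodal error
# surface assumption: test the centre first, stop early once the cost
# falls below a threshold, and otherwise descend toward the cheapest
# neighbour instead of testing every fractional point.

def fractional_refine(cost, mv, T=0.0, steps=(0.5, 0.25)):
    """cost: callable (x, y) -> distortion; mv: integer-pel motion vector."""
    best = mv
    best_cost = cost(*best)
    for step in steps:                       # half-pel, then quarter-pel
        if best_cost <= T:                   # early termination
            break
        improved = True
        while improved and best_cost > T:
            improved = False
            for dx, dy in ((step, 0), (-step, 0), (0, step), (0, -step)):
                cand = (best[0] + dx, best[1] + dy)
                c = cost(*cand)
                if c < best_cost:            # unimodal: follow the descent
                    best, best_cost, improved = cand, c, True
    return best, best_cost

# toy unimodal error surface with its minimum at (1.25, -0.5)
f = lambda x, y: (x - 1.25) ** 2 + (y + 0.5) ** 2
mv, c = fractional_refine(f, (1, 0))
```

On this toy surface the search reaches the quarter-pel minimum after testing only a handful of candidates, which is the source of the speedup the abstract reports.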
A cell is the structural and functional unit of all known living organisms, and its three-dimensional shape is an interesting research topic with many applications in biology. Usually, cells are kept surrounded by a liquid medium on glass plates. In the acquired image sequence, the liquid medium causes an unwanted background, and the glass plates produce virtual images, which makes it difficult to recover the three-dimensional shape of the cell. Therefore, conventional passive optical methods for three-dimensional shape recovery do not compute the depth map accurately. The purpose of this work is to reconstruct the three-dimensional shape of a HeLa cell by applying the shape from focus (SFF) method. SFF is a passive optical method that estimates three-dimensional shape using focal information from an image sequence. To overcome the problems caused by transparency and reflection, the transparent part is segmented from the images using the fact that the background of the cell has no focal point, and the original image sequence is divided into two sequences, for the real and virtual parts, by finding the two focused points in it. For more accurate segmentation of the background, a labeling method is used, and for automatically dividing the original image sequence into the two sequences, the iterative threshold selection method is used. The proposed approach is tested on a HeLa cell, one of the most widely used cell lines in biological research. The experimental results demonstrate its effectiveness.
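The iterative threshold selection mentioned above can be sketched as follows; this is a minimal Ridler-Calvard-style iteration on a 1-D intensity histogram, with a stopping tolerance that is an assumption, not a value taken from the paper.

```python
# Hedged sketch of iterative threshold selection: start from the global
# mean, then repeatedly set the threshold to the midpoint of the means
# of the two classes it induces, until it stops moving.

def iterative_threshold(pixels, tol=0.5):
    t = sum(pixels) / len(pixels)            # start from the global mean
    while True:
        low = [p for p in pixels if p <= t]
        high = [p for p in pixels if p > t]
        if not low or not high:
            return t
        t_new = (sum(low) / len(low) + sum(high) / len(high)) / 2
        if abs(t_new - t) < tol:             # converged
            return t_new
        t = t_new

# toy bimodal intensities: dark background around 10, bright cell around 200
vals = [8, 10, 12, 11, 9, 198, 200, 202, 199, 201]
t = iterative_threshold(vals)
```

For this toy bimodal data the threshold settles midway between the two modes, cleanly separating background from cell pixels.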
KEYWORDS: 3D image processing, Optical filters, 3D modeling, Cameras, Point spread functions, Control systems, Machine vision, Video microscopy, Image processing algorithms and systems, Shape analysis
Shape from focus (SFF) is a passive optical method for 3D shape recovery with numerous applications in machine vision, range segmentation, and video microscopy. This paper introduces a new SFF algorithm based on multidimensional scaling (MDS) analysis. In contrast to conventional focus measure operators, a three-dimensional neighborhood, which captures the effect of pixels in the previous and next frames on the focus value, is considered for each pixel in the image volume. A similarity matrix is computed using the Euclidean metric for the sequence of these 3D neighborhoods corresponding to each object point. This matrix is then provided as input to the MDS algorithm. Monotonic regression is applied, which evaluates the fitness of the approximated configuration using the stress function as the criterion. The energy of the components in the lower dimensions is employed to compute the best focused point and its corresponding depth. The proposed method is evaluated on synthetic and real image sequences. The evaluation is gauged on the basis of the unimodality and monotonicity of the focus curve. Experimental results demonstrate the effectiveness of the new method.
KEYWORDS: Principal component analysis, Computed tomography, Databases, Digital image processing, Current controlled current source, Binary data, Lung, Lung imaging, Cancer, Lung cancer
Pulmonary nodule detection is a binary classification problem; the main objective is to identify nodules in lung computed tomography (CT) images. The intra-class variability is mainly due to grey-level variance, texture differences, and shape. The purpose of this study is to develop a novel nodule detection method based on two-dimensional principal component analysis (2DPCA). We extract features from nodule candidate images using 2DPCA, and nodule candidates are classified using a threshold. The proposed method reduces the false positive (FP) rate. We tested the proposed algorithm on the Lung Imaging Database Consortium (LIDC) database of the National Cancer Institute (NCI). The experimental results demonstrate the effectiveness and efficiency of the proposed method, which achieved an 85.11% detection rate with 1.13 FPs per scan.
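The 2DPCA feature extraction step can be sketched as follows. In 2DPCA the image covariance matrix is built directly from the 2-D images, with no vectorisation, and each image is projected onto the dominant eigenvectors. The tiny 2x2 "images", the power iteration, and the single projection direction are illustrative assumptions; the paper's classification threshold is not reproduced.

```python
# Hedged sketch of 2DPCA: G = (1/M) * sum((A - Abar)^T (A - Abar)),
# then project each image A onto the dominant eigenvector x of G,
# giving a feature vector Y = A x per image.

def mat_sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_t(a):
    return [list(c) for c in zip(*a)]

def mat_mul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def dominant_eigvec(g, iters=200):
    v = [float(i + 1) for i in range(len(g))]  # asymmetric start
    for _ in range(iters):                     # power iteration
        w = [sum(g[i][j] * v[j] for j in range(len(v)))
             for i in range(len(v))]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v

def two_dpca_features(images):
    m = len(images)
    mean = [[sum(img[i][j] for img in images) / m
             for j in range(len(images[0][0]))]
            for i in range(len(images[0]))]
    g = [[0.0] * len(images[0][0]) for _ in range(len(images[0][0]))]
    for img in images:
        d = mat_sub(img, mean)
        gt = mat_mul(mat_t(d), d)
        g = [[g[i][j] + gt[i][j] / m for j in range(len(g))]
             for i in range(len(g))]
    x = dominant_eigvec(g)
    # feature vector Y = A x for each image A
    return [[sum(row[j] * x[j] for j in range(len(x))) for row in img]
            for img in images]

imgs = [[[1.0, 2.0], [3.0, 4.0]], [[2.0, 1.0], [4.0, 3.0]]]
feats = two_dpca_features(imgs)
```

Even on these two toy "images" the first 2DPCA feature separates the two patterns with opposite signs, which is the property a threshold classifier would then exploit.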
KEYWORDS: Digital filtering, 3D image processing, Fermium, Frequency modulation, Distortion, Cameras, Control systems, Statistical analysis, Device simulation, CCD cameras
The technique of estimating the depth and 3D shape of an object from images of the same sample obtained at different focus settings is called shape from focus (SFF). Conventional SFF methods sum the focus values within a small window around each pixel in the image. This produces a surface distortion effect, and an inaccurate depth map is obtained. In this paper, a fast and accurate SFF method based on an averaging filter is proposed. We suggest that averaging depth values, instead of averaging focus values, produces a more accurate depth map. The experimental results demonstrate the effectiveness and efficiency of the proposed method in comparison to conventional methods.
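The core idea above can be sketched in a few lines: take the per-pixel argmax over the focus volume to get a raw depth map, then smooth the depths themselves with an averaging filter rather than averaging focus values. The 3x3 window is an assumption for illustration.

```python
# Hedged sketch: argmax over frames gives a raw depth index per pixel,
# then an averaging filter is applied to the *depth* values.

def raw_depth_map(focus_volume):
    """focus_volume[k][i][j] = focus value of pixel (i, j) in frame k."""
    frames = len(focus_volume)
    h, w = len(focus_volume[0]), len(focus_volume[0][0])
    return [[max(range(frames), key=lambda k: focus_volume[k][i][j])
             for j in range(w)] for i in range(h)]

def average_depth(depth, r=1):
    h, w = len(depth), len(depth[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            win = [depth[a][b]
                   for a in range(max(0, i - r), min(h, i + r + 1))
                   for b in range(max(0, j - r), min(w, j + r + 1))]
            out[i][j] = sum(win) / len(win)  # average depths, not focus
    return out

# toy focus volume: 3 frames of a 2x2 image
fv = [[[9, 1], [1, 1]],
      [[1, 9], [1, 9]],
      [[1, 1], [9, 1]]]
depth = raw_depth_map(fv)
smooth = average_depth(depth)
```

Because the argmax is taken per pixel first, a single noisy focus value only perturbs one depth sample before smoothing, instead of distorting a whole window's focus sum.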
We introduce a new approach for 3-D shape recovery based on the discrete wavelet transform (DWT) and principal component analysis (PCA). A small 3-D neighborhood is considered to incorporate the effect of pixels from the previous and next frames. The intensity values of the pixels in the neighborhood are then arranged into a vector. DWT is applied to each vector to decompose it into approximation and wavelet coefficients. PCA is then applied to the modified energies of the wavelet components. The first feature in the eigenspace, as it contains the maximum variation, is employed to compute the depth. The performance of the proposed approach is tested and compared with existing methods using synthetic and real image sequences. The evaluation is gauged on the basis of the unimodality and monotonicity of the focus curve. Resolution, accuracy, root mean square error (RMSE), and correlation metrics have been applied to evaluate the performance. Experimental results and comparative analysis demonstrate the effectiveness of the proposed method.
This paper suggests a simple scheme for block motion estimation in which the search pattern is divided into a number of sectors. By employing prediction, the motion vector search is constrained to a small area. In the first step, five neighboring blocks are searched to find the predicted motion vector. The vector thus obtained is chosen as the initial search center. This predictive search center is closer to the global minimum and thus decreases the effect of the monotonic error surface assumption on the motion field estimates. Second, the prediction information is used to obtain the direction of the predicted motion vector, and based on that direction, the search area is divided into eight sectors. Third, the magnitude of the predicted motion vector is used to characterize the motion content of different blocks. Thus, the final search pattern dynamically uses the motion magnitude and direction information and significantly reduces the computational complexity. Experimental results show the speed improvement of the proposed algorithm over other fast search algorithms; in addition, the image quality measured in terms of peak SNR is also good.
The problem of 3-D shape recovery from image focus can be described as the problem of determining the shape of the focused image surface (FIS), the surface formed by the best focused points. The shape from focus (SFF) methods in the literature are fast but inaccurate because of the piecewise constant approximation of the FIS. The SFF method based on the FIS has shown better results through an exhaustive search of the FIS shape using a planar surface approximation, at the cost of a considerably higher number of computations. We present a method that treats the search for the FIS shape as an optimization problem, i.e., the maximization of the focus measure in the 3-D image volume. Each image frame in the image volume (sequence) is divided into subimage frames, and the whole image volume is divided into a number of subimage volumes. A rough depth map at only the central pixel of each subimage frame is determined using one of the traditional SFF methods. For each subimage volume, a few image frames are selected around the frame whose index in the image volume is given by the rough depth at the central pixel of the subimage frame. The search for the FIS shape is then performed in the subimage volumes using a dynamic programming optimization technique. The final depth map is obtained by collecting the depth maps of the subimage volumes. The new algorithm considerably decreases the computational complexity by searching the FIS shape in subimage volumes and shows better results.
This paper presents the use of a Genetic Algorithm as a search method for the focus measure in Shape From Focus (SFF). Previous methods compute the focus value for each pixel locally by summing all values within a small window. This summation is a good approximation of focus quality, but not an optimal one. The Genetic Algorithm is used as a fine-tuning process in which a measure of best focus is used as the fitness function corresponding to the motion parameter values that make up each gene. The experimental results show that the proposed method performs better than previous algorithms such as the Sum of Modified Laplacian (SML), Grey Level Variance (GLV), and the Tenenbaum focus measure. The results are compared using root mean square error (RMSE) and correlation. The experiments are conducted on a simulated cone, a real cone, and a TFT-LCD color filter to evaluate the performance of the proposed algorithm.
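The Sum of Modified Laplacian (SML) baseline mentioned above can be sketched as follows; the step size and window radius are assumptions for illustration.

```python
# Hedged sketch of SML: |2I(x,y) - I(x-s,y) - I(x+s,y)|
#                     + |2I(x,y) - I(x,y-s) - I(x,y+s)|,
# summed over a small window around the pixel of interest.

def sml(img, x, y, r=1, step=1):
    total = 0
    for i in range(x - r, x + r + 1):
        for j in range(y - r, y + r + 1):
            ml = (abs(2 * img[i][j] - img[i - step][j] - img[i + step][j])
                  + abs(2 * img[i][j] - img[i][j - step] - img[i][j + step]))
            total += ml
    return total

# a sharply textured patch scores high, a defocused (flat) patch scores zero
sharp = [[10 if (i + j) % 2 == 0 else 0 for j in range(5)] for i in range(5)]
flat = [[5] * 5 for _ in range(5)]
```

A well-focused frame maximises this value at each pixel, which is exactly the kind of fitness signal a GA-based fine-tuning stage can optimise over.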
In this paper we propose an algorithm for reducing the complexity of the motion estimation module in standard video compression applications. In several video coding standards, such as H.264/AVC, motion estimation is the most time-consuming subsystem. Therefore, recent research focuses on the development of novel algorithms that save computations with minimal effect on video distortion. Since real-world video sequences usually exhibit a wide range of motion content, from uniform to random, adaptive algorithms have emerged as the most robust general-purpose solutions.
In this paper, a simple, computationally efficient, and robust multi-pattern motion estimation algorithm based on the nature of error surfaces is proposed. A combination of spatial and temporal predictors is used for multiple initial search center prediction, determination of the magnitude of motion, and search pattern selection. The multiple initial predictors help to identify absolute zero-motion blocks and the true location of the global minimum based on the characteristics of error surfaces. Hence, the final predictive search center selected is closer to the global minimum. This results in a smaller number of search steps to reach the minimum location and increases computation speed. Further speedup is obtained by employing a half-stop technique and a threshold for the minimum distortion point. The computational complexity of the proposed algorithm is drastically decreased (average speedup ~43%), whereas the image quality measured in terms of PSNR (~0.20 dB loss with respect to Full Search) remains close to that of the Full Search algorithm.
In this paper, we propose a shape recovery method for measuring protrusions on the LCD color filter in the TFT-LCD manufacturing process. We use a 3-D focus measure operator to find focused points and then find the lens step that maximizes the sum of the focus measure. To reduce the computational complexity, we apply a successive focus measure update algorithm. The 3-D shape of the object can then be easily estimated from the best-focused points. Experiments are conducted on both synthetic and real images to evaluate the performance of the proposed algorithms. The experimental results show that our new method is faster than the previous method.
In the medical field, many methods have been proposed for segmenting medical images. However, there are few multiscale methods that can segment a medical image so that its various components are separated at multiple resolutions or scales. In this paper, we present a new algorithm for multiscale segmentation of high-resolution computed tomography (HRCT) images. With this new segmentation technique, we demonstrate that it is possible to segment HRCT images into their various components at multiple scales, thereby separating the information available in the HRCT image. We show that the HRCT image can be segmented to obtain separate images for bones, tissues, lungs, and the anatomical structures within the lungs. The processing is done in the frequency domain using the Discrete Cosine Transform (DCT).
This paper introduces a new approach for 3D shape recovery based on the Discrete Wavelet Transform (DWT) and Principal Component Analysis (PCA). Instead of computing focus quality locally by summing all values in a 2D or 3D window obtained after applying a focus measure, a vector consisting of seven neighboring pixels is populated for each pixel in the image volume. Each vector in the sequence is decomposed using the DWT, and PCA is then applied to the energies of the detail coefficients to transform the data into eigenspace. The first feature, as it contains the maximum variation, is employed to compute the depth. Though DWT and PCA are both computationally expensive transformations, the reduced number of data elements and algorithm iterations makes the proposed method efficient. The new approach was evaluated, and its performance compared with other methods, using synthetic and real image sequences. The evaluation is gauged on the basis of the unimodality, monotonicity, and resolution of the focus curve. Two other global statistical metrics, root mean square error (RMSE) and correlation, were also applied to the synthetic image sequence. Experimental results demonstrate the effectiveness and robustness of the new method.
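The per-pixel DWT step above can be sketched as follows. This shows only the decomposition and detail-energy computation with a one-level Haar filter pair; the 8-sample vector length and the Haar filters are illustrative assumptions, and the PCA stage over the whole sequence is not reproduced here.

```python
# Hedged sketch: a one-level Haar DWT splits a neighbourhood vector into
# approximation and detail coefficients; the energy of the detail part
# is the quantity fed to the PCA stage.

def haar_dwt(v):
    approx = [(v[i] + v[i + 1]) / 2 ** 0.5 for i in range(0, len(v), 2)]
    detail = [(v[i] - v[i + 1]) / 2 ** 0.5 for i in range(0, len(v), 2)]
    return approx, detail

def detail_energy(v):
    _, d = haar_dwt(v)
    return sum(x * x for x in d)

sharp = [0, 10, 0, 10, 0, 10, 0, 10]    # strong local variation: in focus
blurred = [5, 5, 5, 5, 5, 5, 5, 5]      # no variation: out of focus
```

An in-focus neighbourhood concentrates energy in the detail coefficients, so these energies vary strongly with focus, which is why PCA on them yields a discriminative first feature.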
The objective of 3D shape recovery using focus is to estimate the depth map of a scene or object from the best focused points of the camera lens. In Shape From Focus (SFF), the measure of focus, i.e., sharpness, is the crucial part of the final 3D shape estimation. Conventional methods compute sharpness by applying a focus measure operator to each 2D frame of the image sequence. However, such methods do not reflect the accurate focus levels in an image, because the focus levels for curved objects also require information from neighboring pixels in adjacent frames. To address this issue, we propose a new method based on focus adjustment, which takes the values of the neighboring pixels in the adjacent image frames that have the same initial depth as the center pixel and then re-adjusts the center value accordingly. Experimental results show that the proposed technique generates better shape and takes less computation time than previous SFF methods based on the Focused Image Surface (FIS) and dynamic programming.
Estimation of surface roughness is important for many applications, including optics, polymers, and semiconductors. In this paper, we propose to estimate surface roughness using one of the passive optical 3D shape recovery methods, namely shape from focus. Three-dimensional shape recovery from one or multiple observations is a challenging problem in computer vision. The objective of shape from focus is to calculate the depth map. That depth map can further be used in techniques and algorithms for recovering the three-dimensional structure of an object, which is required in many high-level vision applications. The same depth map can also be used for surface roughness estimation. Researchers often need to quickly compare fabricated samples using various measures, including surface roughness. However, the high cost involved in estimating surface roughness limits its extensive and exhaustive use. Therefore, we propose an inexpensive and fast method based on Shape From Focus (SFF). We use two microscopic test objects, a coin and a TFT-LCD cell, to estimate surface roughness.
Several active and passive methods have been proposed for recovering the 3-D shape of objects from their 2-D images. Shape from focus is one of the passive methods. An important advantage of shape/depth from focus is that, unlike stereo and motion, it is not confronted with the correspondence problem. A lot of research has been done on image focus analysis to automatically focus the imaging system or to obtain sparse depth information from the observed scene. In our method, the images are taken by varying the focus value in different steps, and each pixel in the image is taken as a single measurement. According to the thin lens model, a pixel's energy attains its maximum at the focused plane and decreases elsewhere. This change in pixel energy follows a generalized Gaussian curve. In the initial stage, we modify the pixel intensities in the images, find the maximum value in the modified pixel intensity vector and its corresponding frame, and repeat this for all pixels to compute a Raw Depth Map (RDM). The proposed algorithm is fast and precise compared to previous methods. The rigid body assumption reduces the number of pixels in the image that need to be considered for shape reconstruction.
Three-dimensional shape recovery from one or multiple observations is a challenging problem of computer vision. In this paper, we present a new focus measure for calculation of the depth map. That depth map can further be used in techniques and algorithms leading to recovery of the three-dimensional structure of an object, which is required in many high-level vision applications. The presented focus measure has shown robustness in the presence of noise compared to the earlier focus measures. This new focus measure is based on an optical transfer function using the Discrete Cosine Transform, and its results are compared with earlier focus measures, including the Sum of Modified Laplacian (SML) and Tenenbaum focus measures. With this new focus measure, the results without noise are almost similar to those of the earlier focus measures; however, a drastic improvement over the others is observed in the presence of noise. The proposed focus measure is applied to a test image, a sequence of 97 simulated cone images, and a sequence of 97 real cone images. The images were corrupted with Gaussian noise, which arises from factors such as electronic circuit noise and sensor noise due to poor illumination and/or high temperature.
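A DCT-based focus measure in the spirit of the above can be sketched as the AC energy of a block's 2-D DCT; the exact optical-transfer-function weighting of the paper is not reproduced, and the block size and unnormalised DCT-II are assumptions for illustration.

```python
import math

# Hedged sketch: separable 2-D DCT-II of a small block, with the sum of
# squared AC coefficients (everything except the DC term) as the focus
# value. Sharp, high-frequency content raises the AC energy.

def dct1d(v):
    n = len(v)
    return [sum(v[i] * math.cos(math.pi * (i + 0.5) * k / n)
                for i in range(n)) for k in range(n)]

def dct2(block):
    rows = [dct1d(r) for r in block]
    cols = [dct1d(c) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def dct_focus(block):
    coeff = dct2(block)
    return sum(coeff[u][v] ** 2
               for u in range(len(coeff)) for v in range(len(coeff[0]))
               if u + v > 0)                  # AC energy only

sharp = [[0, 10, 0, 10], [10, 0, 10, 0], [0, 10, 0, 10], [10, 0, 10, 0]]
flat = [[5] * 4 for _ in range(4)]
```

A defocused (locally constant) block has essentially zero AC energy, while an in-focus textured block does not, so maximising this value over the frame index selects the best-focused frame.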
Motion estimation is an important and computationally intensive task in video coding applications. Fast block matching algorithms reduce the computational complexity of motion estimation at the expense of accuracy. Fast motion estimation algorithms often assume a monotonic error surface in order to speed up the computations. The argument against this assumption is that the search might be trapped in a local minimum, resulting in inaccurate motion estimates. This paper investigates the state-of-the-art techniques for block-based motion estimation and presents an approach to improve their performance. Specifically, this paper suggests a simple scheme that includes spatiotemporal neighborhood information to obtain better estimates of the motion vectors. The predictive motion vector is then chosen as the initial search center. This predictive search center is found to be closer to the global minimum, and thus decreases the effect of the monotonic error surface assumption on the motion field estimates. Based on the prediction, the algorithm also chooses between a center-biased or uniform approach for slow- or fast-moving sequences. The experiments presented in this paper demonstrate the efficiency of the proposed approach.
Most previous image mosaicking techniques deal with stationary images that do not contain moving objects. Moving objects, however, cause serious errors in global motion estimation, the core process of image mosaicking, because the global motion estimate is biased by the local motions of the moving objects. Some techniques have been proposed to effectively eliminate local motions and obtain precise global motion parameters, but each has its own drawbacks.
In this paper, a contour-based approach for mosaicking images that contain moving objects is presented. First, we extract contours from each image to be mosaicked and then estimate an initial global motion. The key task of our work is to eliminate local motions and obtain a precise global motion between the two input images. To do this, we use three kinds of consistency checks: shape similarity consistency, scale consistency, and rigid transformation consistency. In these checks, local motions are detected because their motion vectors differ greatly from the dominant one, and they are removed iteratively. Moreover, since we use contour information for image mosaicking, our approach is robust against global gray-level changes between input images. Experimental results demonstrate the performance of our algorithm.
KEYWORDS: Reconstruction algorithms, Video, Video compression, Quantization, Video coding, Image compression, Computer programming, Data storage, Data conversion, Image processing
We present a new technique to improve the video compression ratio. In this technique, macroblock data are reordered in such a way that one block includes the important data of a macroblock, while the other three data blocks hold difference values in the horizontal, vertical, and diagonal directions. This results in reduced bit stream size because of low-valued data in the three blocks, giving a higher compression ratio. The proposed method can be easily used for error resilience applications as well. In that case, the important data block in a macroblock is transmitted on a secure channel while the remaining three blocks with difference data are sent via a lossy channel. In the case of an error in the lossy channel, the picture can still be reconstructed with a reasonably good quality using the block that contains important data transmitted on the secure channel. The proposed method generates better reconstruction quality when used at low bit rates.
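The reordering described above can be sketched as a Haar-like 2x2 transform: each 2x2 neighbourhood of the macroblock contributes one sample to the "important" block and one each to horizontal, vertical, and diagonal difference blocks. The integer averaging and the particular difference definitions are assumptions for illustration, not the paper's exact mapping.

```python
# Hedged sketch of macroblock reordering: from each 2x2 group
# (a b / c d) derive an important sample S and three low-valued
# difference samples (horizontal, vertical, diagonal).

def reorder_macroblock(mb):
    h, w = len(mb) // 2, len(mb[0]) // 2
    s = [[0] * w for _ in range(h)]
    hd = [[0] * w for _ in range(h)]
    vd = [[0] * w for _ in range(h)]
    dd = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            a, b = mb[2 * i][2 * j], mb[2 * i][2 * j + 1]
            c, d = mb[2 * i + 1][2 * j], mb[2 * i + 1][2 * j + 1]
            s[i][j] = (a + b + c + d) // 4       # important data
            hd[i][j] = a - b                     # horizontal difference
            vd[i][j] = a - c                     # vertical difference
            dd[i][j] = a - d                     # diagonal difference
    return s, hd, vd, dd

s, hd, vd, dd = reorder_macroblock([[10, 11], [12, 13]])
```

On smooth image content the three difference blocks hold small values, which is what makes the reordered stream cheaper to entropy-code; with integer averaging the reconstruction from all four blocks is only approximate.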
In the early stages of vision, images are processed to generate "maps", or point-by-point distributions of values of various quantities, including edge elements, fields of local motion, depth maps, and color constancy. These features are then refined and processed in the visual cortex. The next stage is recognition, which also enables simple control of behaviors such as steering and obstacle avoidance. In this paper we present a system for object shape recognition that utilizes features extracted using a human vision model. The first block of the system performs processing analogous to that in the retina for edge feature extraction. The second block represents the processing in the visual cortex, where features are refined and combined to form a stimulus to be presented to the recognition model. We use the normalized distances of the edge pixels from the mean to form a feature vector. The next block, which accomplishes the task of recognition, consists of a counterpropagation neural network model. We use grayscale images of 3D objects to train and test the system. The experiments show that the system can recognize the objects under some variations in rotation, scaling, and translation.
KEYWORDS: Video, Motion estimation, Video compression, Data communications, Algorithm development, Computer simulations, Motion detection, Information visualization, Data storage, Image analysis
In this paper, to avoid reaching a local minimum and to adapt to a variety of real-world video sequences, we propose an optimal fast search algorithm. With this objective, the search strategy is varied adaptively according to the size of the motion in each block, based on the image properties of the block. Each block in a frame is classified as a stationary, small-motion, or large-motion block. We also suggest that the motion vector of a stationary block, such as background or a still image, be set to zero, so that these blocks perform no search. For the other blocks, exploiting the advantages of conventional search algorithms adaptively, we apply the NTSS algorithm to small-motion blocks and the DS algorithm to large-motion blocks. The proposed algorithm gives faster search results and a significant improvement in terms of performance for motion-compensated frames and computational complexity.
KEYWORDS: Digital watermarking, Video, Visualization, Digital filtering, Gaussian filters, Video compression, Video processing, Linear filtering, Computer simulations, Visual compression
This paper presents a robust and efficient scene-based video watermarking method using the visual rhythm in the compressed domain. A visual rhythm is a two-dimensional abstraction of the entire three-dimensional video content, obtained by cutting through the video sequence across the time axis or by sampling a certain group of pixels in consecutive video frames. Knowing that scene changes can be easily detected using the visual rhythm and that video sequences are conveniently edited at scene boundaries, such a scene-based watermark embedding process is a logical choice for video watermarking. Temporal spread spectrum can be achieved by applying spread spectrum methods to the visual rhythm. Additive Gaussian noise, low-pass filtering, median filtering, and histogram equalization attacks are simulated for all frames. Frame sub-sampling is also simulated as a typical video attack. Simulation results show that the proposed algorithm is robust and efficient against these attacks.
Full search block matching motion estimation requires a very large amount of computing power. To overcome this problem, many fast search algorithms have been proposed. However, none of these algorithms satisfies both matching error performance and the real-time requirement at the same time. This paper proposes a novel fast block matching algorithm using the temporal correlation and center-biased behavior of motion vectors. In the proposed algorithm, we modify the new three-step search algorithm to combine a technique for the temporal correlation of motion vectors with the center-biased assumption. In real video sequences, there are many overlapping motion vectors between adjacent frames. Thus, by finding these duplicated motion vectors with a simple search rule, the proposed algorithm dramatically reduces the computational load with little quality degradation.
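The use of temporal correlation can be sketched as follows: test the co-located block's motion vector from the previous frame first, accept it immediately if its SAD is below a threshold, and otherwise fall back to a small centre-biased search. The threshold, search radius, and block size are illustrative assumptions, not the paper's modified NTSS schedule.

```python
# Hedged sketch of temporal-predictor block matching with early accept.

def sad(cur, ref, bx, by, mvx, mvy, bs):
    total = 0
    for i in range(bs):
        for j in range(bs):
            total += abs(cur[by + i][bx + j] - ref[by + mvy + i][bx + mvx + j])
    return total

def predict_then_search(cur, ref, bx, by, bs, prev_mv, thresh=2, radius=1):
    best_mv, best = prev_mv, sad(cur, ref, bx, by, *prev_mv, bs)
    if best <= thresh:                    # duplicated MV found: stop early
        return best_mv, best
    for mvx in range(-radius, radius + 1):    # small centre-biased search
        for mvy in range(-radius, radius + 1):
            c = sad(cur, ref, bx, by, mvx, mvy, bs)
            if c < best:
                best_mv, best = (mvx, mvy), c
    return best_mv, best

# toy frames: the current frame is the reference shifted right by one pixel
ref = [[i * 6 + j for j in range(6)] for i in range(6)]
cur = [[ref[i][max(j - 1, 0)] for j in range(6)] for i in range(6)]
mv, err = predict_then_search(cur, ref, 2, 2, 2, prev_mv=(-1, 0))
```

When the temporal predictor is an exact duplicate of the true motion, the SAD check succeeds and the block search is skipped entirely, which is where the dramatic reduction in computation comes from.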
This paper presents a novel wavelet compression technique to increase the compression of images. Based on an extension of the zerotree entropy coding method, this method initially uses only two symbols (significant and zerotree) to compress the image data for each level. Additionally, a sign bit is used for newly significant coefficients to indicate whether they are positive or negative. In contrast to the isolated-zero symbols used in conventional zerotree algorithms, the proposed algorithm changes them to significant coefficients and saves their locations; they are then treated just like other significant coefficients. This is done to decrease the number of symbols and hence the number of bits needed to represent them. At the end, the algorithm encodes the isolated-zero coordinates, which are used to change the values back to the originals during reconstruction. A noticeably higher compression ratio is achieved with no change in image quality.
KEYWORDS: Digital watermarking, Video, Video compression, Computer programming, Quantization, Video processing, Digital filtering, Linear filtering, Video coding, Virtual colonoscopy
In this paper, we propose a real-time video watermarking algorithm for MPEG streams. Watermarking has been studied as a method of hiding secret information in signals so as to discourage unauthorized copying or to attest the origin of the media. In the proposed algorithm, we take advantage of the compression information of MPEG bitstreams to embed the watermark into I-, P-, and B-pictures. The experimental results show that the proposed watermarking technique produces an almost invisible difference between the watermarked and original MPEG video, and reduces the processing time. Moreover, it shows robustness against a variety of attacks as well.
KEYWORDS: Image quality, Image processing algorithms and systems, Motion estimation, Video, Image segmentation, Video compression, Distortion, Signal to noise ratio, Algorithm development, Linear filtering
This paper presents a new, efficient motion estimation algorithm based on variable block sizes. The scheme allows the sizes of blocks to adapt to the local activity within the block, and the number of blocks in any frame can be varied while still accurately representing the true motion. This permits adaptive bit allocation between the representation of displacement and residual data, as well as variation of the overall bit rate on a frame-by-frame basis. In particular, this paper aims to remedy the drawbacks of the previous representative quad-tree block matching algorithm and adds a new method to improve performance. Instead of the usual quad-tree segmentation, the frame difference is computed for each quad section and a homogeneity test is then carried out for the largest block. In addition, a combined segmentation and re-merging strategy is employed, which provides a large reduction in computation with little transmission overhead and yields higher image quality. As a result, image quality is improved significantly and the computational load is decreased by approximately 50-70 percent.
KEYWORDS: Image filtering, Optical filters, Nonlinear filtering, Digital filtering, Sensors, RGB color model, Detection and tracking algorithms, Linear filtering, Interference (communication), Color image processing
Nonlinear vector median filters (VMF) and their variants represent one of the most popular approaches to color image processing. In this paper, a novel and effective method for the detection and removal of impulse noise in highly corrupted color images is proposed. First, a window operator based on an explicit use of the spatial relations between color image elements is used to detect impulse noise. Then, a spatially connected modification of the VMF is employed to remove the previously detected impulse noise. The performance of the proposed detector and filter in detecting and suppressing impulse noise in test images is compared to conventional vector median algorithms.
Segmentation of images into bit planes is one of the techniques for scalable image compression. Assuming our source image comes from a uniform quantizer, we categorize its bit planes, on the basis of their significance, into two groups called MSB-planes and LSB-planes. The MSB-planes contain low-entropy structural information, whereas the LSB-planes contain high-entropy texture information. Because of the different nature of the information and entropy of the two groups, they can be coded and reconstructed by different algorithms. The structural nature of the MSB-planes makes them more compressible at the entropy coding stage, whereas the low significance of the LSB-planes can be exploited against their high entropy. We realize the latter by subsampling the LSB-planes, which in turn requires attention to the close coupling of the MSB and LSB planes at the reconstruction stage. We introduce an estimation algorithm for the LSB planes. The reconstructed images are found to be perceptually comparable to the original images. Quantitative comparison also shows significant coding gain in terms of SNR vs. bit rate.
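The bit-plane grouping can be sketched as follows: split each 8-bit pixel into its top (MSB) and bottom (LSB) bits, and reconstruct an estimate by replacing the discarded LSBs with the midpoint of their range. The 4/4 split and the midpoint estimator are illustrative assumptions; the paper's subsampling-based LSB estimation is not reproduced.

```python
# Hedged sketch of MSB/LSB bit-plane grouping and a simple
# midpoint-based reconstruction of the discarded LSB-planes.

def split_planes(pixel, msb_bits=4):
    lsb_bits = 8 - msb_bits
    return pixel >> lsb_bits, pixel & ((1 << lsb_bits) - 1)

def reconstruct(msb, msb_bits=4):
    lsb_bits = 8 - msb_bits
    return (msb << lsb_bits) | (1 << (lsb_bits - 1))   # mid-range estimate

m, l = split_planes(203)   # 203 = 0b11001011
approx = reconstruct(m)
```

The reconstruction error is bounded by half the LSB range (here at most 8 grey levels), which is why discarding or heavily subsampling the high-entropy LSB-planes costs little perceptual quality.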
Wavelet-based image compression has been a focus of research in recent years. In this paper, we propose a compression technique based on a modification of the original EZW coding. In this lossy technique, we discard less significant information in the image data in order to achieve further compression with minimal effect on output image quality. The algorithm calculates the weight of each subband and finds the subband with minimum weight in every level. This minimum-weight subband in each level, which contributes least to image reconstruction, undergoes a thresholding process to eliminate low-valued data. Zerotree coding is then applied to the resulting output for compression. Different threshold values were applied during the experiments to observe the effect on compression ratio and reconstructed image quality. The proposed method yields a further increase in compression ratio with negligible loss in image quality.
Motion estimation is one of the fundamental problems in digital video processing. One of the most notable approaches to motion estimation is based on estimating a measure of the change of image brightness in the frame sequence, commonly referred to as optical flow. The classical approaches for finding optical flow have many drawbacks. The numerical or least-squares methods for solving the optical flow constraints are susceptible to errors in the presence of occlusion and noise. Two moving objects sharing a common border cause conflicting velocities, and taking their average yields a less satisfactory optical flow estimate. The incorrect detection of moving boundaries, since motion is usually not homogeneous, and inexact contour measurements of moving objects are further problems of optical flow methods. Therefore, information such as color and edges has been used in the literature alongside optical flow. Furthermore, the classical methods require a large amount of computation for optical flow measurement. In this paper, we propose a method that is very fast and yields better motion information for the objects in the image sequence. The possible locations of moving objects are found first, and the Hough transform is then applied only on the detected moving regions to find the optical flow vectors for those regions. A great deal of time is thus saved by not computing optical flow for the still or background parts of the image sequence. A new Boolean-based edge detection is applied to the two consecutive input images, and the differential edge image of the resulting two edge maps is computed. A mask for detecting the moving regions is made by dilating the differential edge image. After obtaining the moving regions in the image sequence with the help of this mask, we use the Hough transform and voting accumulation methods to solve the optical flow constraint equations.
The voting-based Hough transform avoids the errors associated with least-squares techniques. Calculating a large number of points along the constraint line is also avoided by using the transformed slope-intercept parameter domain. The simulation results show that the proposed method is very effective for extracting optical flow vectors and hence for tracking moving objects in the images.
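The constraint-line voting idea can be sketched as follows: each pixel's brightness-constancy constraint Ix*u + Iy*v + It = 0 defines a line in (u, v) space, and the accumulator peak gives the dominant motion. The accumulator resolution, tolerance, and synthetic gradients below are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def hough_flow(Ix, Iy, It, u_range=(-5, 5), v_range=(-5, 5), tol=0.5):
    """Estimate a dominant (u, v) motion by voting: every accumulator
    cell lying close to a pixel's constraint line receives a vote."""
    us = np.arange(u_range[0], u_range[1] + 1)
    vs = np.arange(v_range[0], v_range[1] + 1)
    acc = np.zeros((len(us), len(vs)), dtype=int)
    for ix, iy, it in zip(Ix.ravel(), Iy.ravel(), It.ravel()):
        # residual of this pixel's constraint line at every (u, v) cell
        res = np.abs(ix * us[:, None] + iy * vs[None, :] + it)
        acc += res < tol
    i, j = np.unravel_index(acc.argmax(), acc.shape)
    return int(us[i]), int(vs[j])

# Synthetic gradients consistent with a uniform motion of (u, v) = (2, -1).
rng = np.random.default_rng(0)
Ix = rng.normal(size=(8, 8))
Iy = rng.normal(size=(8, 8))
It = -(Ix * 2 + Iy * -1)           # so Ix*2 + Iy*(-1) + It = 0 exactly
print(hough_flow(Ix, Iy, It))      # recovers the planted motion (2, -1)
```

Because every pixel of the planted motion votes for the same cell while outliers scatter their votes, the peak is robust in exactly the way least squares is not.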
The bubble rise velocity in air/water two-phase flow in a vertical pipe is studied experimentally by processing consecutive series of digitized video images. The shape of the bubble, as well as its instantaneous velocity, is measured using binary image processing techniques. Digital image processing algorithms have been developed to obtain the coordinates of specified points in the image. These coordinate data were used to calculate the instantaneous bubble velocity, which can be expressed as a separation distance between two consecutive image frames. This method has many advantages: it is non-invasive and does not require sophisticated laboratory equipment. Images are digitized directly using a CCD digital camera, and image analysis involves purely computer-based computation. Hence, reasonably accurate velocity data have been obtained from a single experiment in a short period of time.
In this paper, we present a vehicle detection framework that aims at avoiding collisions and warning of dangerous situations while driving on a road at night. Potential obstacles, such as vehicles and motorcycles, are detected from image sequences by a vision system that processes the images given by a Charge-Coupled Device (CCD) camera mounted on a moving car. We compute the position and number of vehicles from these image sequences using several image processing techniques.
Wavelet-based compression is gaining popularity due to its promising compaction properties at low bit rates. The zerotree wavelet image coding scheme efficiently exploits the multi-level redundancy present in transformed data to minimize coding bits. In this paper, a new technique is proposed to achieve high compression by adding new zerotree and significance symbols to the original EZW coder. In contrast to the four symbols present in the basic EZW scheme, the modified algorithm uses eight symbols to generate fewer bits for given data. The subordinate pass of EZW is eliminated and replaced with fixed residual value transmission for easy implementation. This modification simplifies the coding technique and speeds up the process while retaining the property of embeddedness.
Using white-light interference for high-precision surface structure analysis, 3D profilometry is realized. White-light surface profilers record the position of peak fringe contrast while modulating the optical path difference of an imaging interferometer. This method is sensitive to random noise, such as spikes or missing data points, which can be misinterpreted as positions of high fringe contrast. To overcome this problem, a median filter is employed for noise reduction of the acquired profile. Interferograms are generated simultaneously by scanning an object in a direction perpendicular to the object surface. These interferograms are filtered by the median filter, and the surface height for each point of the image is obtained by finding the position of peak fringe contrast via extended depth from focus. It is shown experimentally that the proposed method is able to cope with distortions of the fringe contrast envelope of noisy white-light interferograms without smoothing the peak fringe contrast or distorting the peak position. Moreover, the computation becomes fast and simple with the proposed algorithm, which does not use complex spatial-frequency-domain processing.
KEYWORDS: Distortion, Motion estimation, Video coding, Image quality, Algorithm development, Video, Mechatronics, Video compression, Data compression, Video processing
The new three-step search (NTSS) algorithm obtains good picture quality in predicted images with reduced computation on average. To further reduce computation while maintaining error performance comparable to NTSS, this paper proposes a fast NTSS algorithm using the unimodal error surface assumption, the correlation of causal adjacent matching errors, the partial distortion elimination (PDE) algorithm, and the cross search algorithm. The proposed algorithm reduces less important checking points in the first step of NTSS by using the initial sum of absolute differences (SAD) and an adaptive SAD threshold. Instead of checking seventeen candidate points in the first step as in NTSS, our search algorithm starts with nine checking points according to the comparison between the initial SAD and the adaptive SAD threshold. The adaptive threshold is derived from the causal adjacent SADs. For further computational reduction without any degradation in prediction quality, we employ PDE and the cross search algorithm. Because the threshold adapts to the characteristics of each sequence, the algorithm is applicable to a variety of applications. Experimentally, our algorithm shows good performance in terms of PSNR of predicted images and average checking points per block compared with the conventional NTSS and TSS algorithms.
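Partial distortion elimination, used here and in several of the other abstracts, can be sketched in a few lines: the SAD is accumulated row by row and the candidate is abandoned as soon as the partial sum already exceeds the best match found so far. The block size and data are illustrative:

```python
import numpy as np

def sad_with_pde(block, cand, best_so_far):
    """Sum of absolute differences with partial distortion elimination:
    abandon the candidate as soon as the running partial SAD can no
    longer beat the best SAD found so far."""
    total = 0
    for r in range(block.shape[0]):
        total += int(np.abs(block[r].astype(int) - cand[r].astype(int)).sum())
        if total >= best_so_far:
            return None          # candidate cannot win: early termination
    return total

block = np.zeros((4, 4), dtype=np.uint8)
cand = np.ones((4, 4), dtype=np.uint8)
print(sad_with_pde(block, cand, 100))   # full SAD computed: 16
print(sad_with_pde(block, cand, 4))     # abandoned after the first row
```

The saving is free of quality loss: a candidate rejected by PDE could never have become the motion vector anyway.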
Motion estimation has been widely used by various video coding standards. Full search is the most straightforward and optimal block matching algorithm, but its huge computational complexity is its major drawback. To overcome this problem, several fast block matching motion estimation algorithms have been reported. In this paper, a fast four-step search algorithm based on strict application of the Unimodal Error Surface Assumption is proposed. A Quadrant Selection Approach has been adopted to reduce the computational complexity. The algorithm is adaptive in the sense that it can be stopped at the second or third step, depending on the motion content of the block, based on the Half Stop Technique. Simulation results show that the number of search points in our algorithm is almost half that of the conventional four-step search algorithm. The total number of search points varies from 7 to 17 in our proposed algorithm, so the worst-case computational requirement is only 17 block matches. Our algorithm is robust, as its performance is independent of the motion of the image sequences. It also possesses the regularity and simplicity of hardware-oriented features.
In this paper, we propose a fast SFF method that also provides accurate shape estimation for objects with complex geometry. Dynamic programming is a mathematical tool for efficiently determining the optimal solution to an n-variable problem. In the new SFF method, dynamic programming is modified and used to find the optimal path that gives the maximum focus measure at each pixel under certain constraints. The shape of the object can then be estimated over the 3D focused image surface (FIS). An automated CCD camera system has been developed to implement the proposed method.
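For context, the traditional per-pixel SFF pipeline that such methods improve upon can be sketched as follows; the simple second-difference focus measure and the synthetic blur used to build the test stack are illustrative assumptions:

```python
import numpy as np

def focus_measure(img):
    """Per-pixel focus measure in the spirit of the modified Laplacian:
    absolute second differences in x and y (circular borders)."""
    lx = np.abs(2 * img - np.roll(img, 1, 1) - np.roll(img, -1, 1))
    ly = np.abs(2 * img - np.roll(img, 1, 0) - np.roll(img, -1, 0))
    return lx + ly

def depth_from_focus(stack):
    """Traditional SFF: for each pixel, the index of the frame where the
    focus measure peaks is taken as the depth estimate."""
    fm = np.stack([focus_measure(f) for f in stack])
    return fm.argmax(axis=0)

def box_blur(img):
    return (img + sum(np.roll(img, s, a) for s in (1, -1) for a in (0, 1))) / 5

# Synthetic stack: frame 1 is the sharp frame, frames 0 and 2 are defocused,
# so the depth estimate should be 1 almost everywhere.
rng = np.random.default_rng(2)
sharp = rng.random((16, 16))
stack = [box_blur(sharp), sharp, box_blur(box_blur(sharp))]
depth = depth_from_focus(stack)
```

The per-pixel argmax is what makes the traditional method piece-wise constant; methods like the one above replace it with a constrained search over the FIS.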
A new approach to 3-D profilometry for the white-light interferometer is presented. Recently, many different methods have been used to analyze the data obtained from white-light interferometric profilers. The advantage of the interferometric methods is their precision, which can reach a small fraction of a wavelength, but these profilers are usually limited to relatively smooth surfaces as well as being very expensive. We detail a way to construct a profiler that uses a simple and efficient algorithm. It treats the data in a fast and simple manner, thus reducing both the acquisition and the analysis time. The method is based on a focus measure that finds the maximum variance value, and it works well with rough surfaces.
A new algorithm to compute a precise 3-D shape of a moving object for color motion stereo is described. Input data are obtained from a single color CCD camera and a moving belt. Three-dimensional shape recovery in motion stereo is formulated as a matching optimization problem over multiple color stereo images. It is shown that the matching among multiple color motion stereo images can be carried out using circular decorrelation of a color signal. Three-dimensional shape recovery using real color motion stereo images demonstrates the good performance of the algorithm in terms of reconstruction accuracy.
The three-step search (TSS) has been studied for low bit-rate video communications because of its significantly reduced computation, simplicity, and reasonable performance in motion estimation. Many modified TSS algorithms have been developed for higher computational speedup and improved error performance of motion compensation. Among them, the new three-step search (NTSS) shows very good error performance with reduced computation on average. However, NTSS can require about 30% more computation than TSS in some cases, which can be a serious problem in worst-case real-time video coding. This paper proposes an efficient and adaptive three-step search (EATSS) algorithm with an adaptive search strategy considering unequal subsampling and partial distortion elimination (PDE). The proposed search strategy reduces useless checking points in the first step of NTSS by using the initial sum of absolute differences (SAD) and a predefined SAD threshold. Instead of checking 17 candidate points in the first step as in NTSS, our search algorithm starts with 1 or 9 checking points according to the comparison between the initial SAD and the predefined threshold. Experimentally, our algorithm shows good performance in terms of PSNR of predicted images and average checking points per block compared with NTSS and TSS.
Image processing techniques are used extensively in many applications today. In particular, in fluid mechanics, image processing has become a powerful technique for studying the flow phenomena, flow patterns, and flow characteristics of two-phase flow. This paper presents a new application of image processing techniques to two-phase bubble/slug flow in a vertical pipe. The results obtained using image processing techniques (image filtering for noise reduction, edge detection and thresholding for image enhancement, etc.) show that this approach has many advantages. It can analyze, in a very short time, a single image containing a large number of bubbles as well as a large number of images, whereas other methods, such as point-by-point measurement or digitization with a digitizing table, are not applicable. Moreover, this technique also enables automatic identification and fast, relatively accurate measurement of parameters such as the size and shape of the bubbles. These studies promise great progress in the application of image processing techniques to the study of the complicated flow phenomena, flow patterns, and flow characteristics of multiphase flow.
In this paper, a new method for obtaining the 3D shape of an object by measuring relative blur between images using wavelet analysis is described. Most previous methods use inverse filtering to determine the measure of defocus. These methods suffer from some fundamental problems, such as inaccuracies in finding the frequency domain representation, windowing effects, and border effects. Besides these deficiencies, a filter such as the Laplacian of Gaussian, which produces an aggregate estimate of defocus for an unknown texture, cannot lead to accurate depth estimates because of the non-stationary nature of images. We propose a new depth from defocus (DFD) method using wavelet analysis that is capable of performing both local analysis and windowing with variable-sized regions for non-stationary images with complex textural properties. We show that the normalized image ratio of wavelet power, by Parseval's theorem, is closely related to the blur parameter and depth. Experimental results are presented demonstrating that our DFD method is faster and gives more precise shape estimates than previous DFD techniques for both synthetic and real scenes.
In this paper, we describe a novel use of neural networks for extracting the three-dimensional shape of objects based on image focus. Conventional shape from focus (SFF) methods are based on piece-wise constant or piece-wise planar approximations of the focused image surface (FIS) of the object, so they fail to provide accurate shape estimation for objects with complex geometry. The proposed scheme represents the three-dimensional shape of the FIS in a window in terms of the neural network weights. The neural network is trained to learn the shape of the FIS that maximizes the focus measure. The SFF problem is thus converted into an ordinary optimization problem in which a criterion function (focus measure) is maximized with respect to the network weights. The gradient ascent method is used to optimize the focus measure over the three-dimensional FIS. Experiments were conducted on three different types of objects to compare the performance of the proposed algorithm with that of traditional SFF methods. Experimental results demonstrate that the method of SFF using neural networks provides more accurate depth estimates than the traditional methods.
The design of filters for pattern recognition that have optimal trade-offs among the criteria of noise robustness, sharpness of the correlation peak, and Horner efficiency, when input scene noise is spatially disjoint (nonoverlapping) with the target, is presented. Computer simulations illustrate the filter performance for optical pattern recognition.
This paper examines the use of a nonlinear order-statistic-based filter, the LUM filter, as a prefilter for gradient edge detectors. The output of this filter is an order statistic from an observation vector. A wide range of characteristics can be achieved with the same filter structure by changing its parameters. Its ability to suppress impulse noise and enhance edges leads to a significant improvement in the edge map. An algorithm for automatic adjustment of the filter parameters is proposed based on identification of the impulse noise level. The algorithm's efficiency in setting optimal parameters is demonstrated, its accuracy in determining the noise level is evaluated, and its drawbacks are discussed.
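One common parameterization of the LUM smoothing component can be sketched in a few lines: the center sample is clamped between the k-th lower and k-th upper order statistics of its window, so k sweeps the filter from identity to median-like behavior. The exact variant used in the paper may differ:

```python
def lum_smooth(window, k):
    """LUM smoothing filter (one common parameterisation): clamp the
    centre sample between the k-th lower and k-th upper order
    statistics of its window.  k = 1 is the identity; larger k gives
    stronger, median-like smoothing."""
    center = window[len(window) // 2]
    s = sorted(window)
    lo, hi = s[k - 1], s[len(s) - k]
    return min(max(center, lo), hi)

# A 3x3 neighbourhood (row-major) whose centre sample is an impulse.
window = [10, 12, 11, 13, 200, 14, 12, 11, 13]
print(lum_smooth(window, k=1))   # identity: 200
print(lum_smooth(window, k=2))   # impulse clamped to 14
```

This tunable clamping is what lets a single structure trade impulse suppression against detail preservation, which is exactly the parameter the proposed noise-level identification adjusts.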
KEYWORDS: Motion estimation, Video, Video coding, Image quality, Lithium, Video compression, Digital image processing, Image processing, Polonium, Low bit rate video
The three-step search (TSS) has played a key role in real-time video encoding because of its light computational complexity, regular search rule, and reasonable performance at reduced computation. Many modified TSS algorithms have been studied to reduce the amount of computation or improve the quality of the image predicted from the obtained motion vector. This paper introduces a new concept of hierarchical search in motion estimation for further reduction of computational complexity and better error performance compared with conventional modified TSS algorithms. The structure of the proposed algorithm is similar to that of the conventional TSS algorithm; however, it uses a different search precision for each step. It is shown that the proposed algorithm is very efficient in terms of computational speedup and has improved error performance over the conventional modified TSS algorithms. The proposed algorithm will be useful in software-based real-time video coding and low bit-rate video coding.
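The baseline TSS that these variants modify can be sketched compactly: examine the center and its 8 neighbors at step sizes 4, 2, 1, re-centering on the best SAD match after each step. The block size and the synthetic frames below are illustrative:

```python
import numpy as np

def sad(a, b):
    return float(np.abs(a - b).sum())

def three_step_search(ref, cur, bx, by, bs=8):
    """Classic TSS: 9 candidates at step sizes 4, 2, 1, re-centring on
    the best match after each step (at most 25 distinct block matches)."""
    block = cur[by:by + bs, bx:bx + bs]
    mx = my = 0                                   # current motion vector
    for step in (4, 2, 1):
        best, bestd = None, (0, 0)
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                y, x = by + my + dy, bx + mx + dx
                if 0 <= y <= ref.shape[0] - bs and 0 <= x <= ref.shape[1] - bs:
                    cost = sad(block, ref[y:y + bs, x:x + bs])
                    if best is None or cost < best:
                        best, bestd = cost, (dx, dy)
        mx += bestd[0]
        my += bestd[1]
    return mx, my

# Plant a block whose true motion vector is (+4, -4) and recover it.
rng = np.random.default_rng(1)
ref = rng.random((32, 32))
cur = np.zeros_like(ref)
cur[12:20, 12:20] = ref[8:16, 16:24]
print(three_step_search(ref, cur, 12, 12))   # -> (4, -4)
```

The greedy re-centering is why TSS relies on the unimodal error surface assumption: on a multi-modal SAD surface, the first step can commit to the wrong quadrant.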
A new algorithm to compute precise depth estimates for motion stereo is described. Input data is obtained from a single CCD camera and a moving belt. It is shown that the problem of matching among multiple motion stereo images can be effectively carried out by use of adaptive correlation matching. Experimental results with real stereo images are presented to demonstrate the performance of the algorithm.
Many vision tasks are very complex and computationally intensive. Real time requirements further aggravate the situation. They usually involve both structured (low-level vision) and unstructured (high-level vision) computations. Parallel approaches offer hope in this context. Parallel approaches to vision tasks and scheduling schemes for their implementation receive special emphasis in this paper. Architectural issues are also addressed. The aim is to design algorithms which can be implemented on low cost heterogeneous networks running PVM. Issues connected with general purpose architectures also receive attention. The proposed ideas have been illustrated through a practical example (of eye location from an image sequence). Next generation multimedia environments are expected to routinely employ such high performance computing platforms.
In this paper, we describe a two-layer video codec based on MPEG-2 and its layering algorithms, where we exploit the property that the human visual system is more sensitive to low-frequency than to high-frequency coefficients in the DCT domain. We propose a new block effect reduction algorithm, called content-based AC correction, which predicts AC coefficients by considering the image content itself in the DCT domain, since the block effect occurs in blocks that have few AC coefficients. First, the algorithm detects which blocks exhibit a block effect. Second, it determines whether a block is located near an edge or a boundary of a video object. If so, a new AC correction algorithm is used; otherwise, the traditional AC correction algorithm is applied.
A new method is presented for focused image recovery from two blurred images taken by a CCD camera with different camera parameter settings. The focused image is obtained through the actual point spread function (PSF) using the constrained least squares filter. Subbarao et al. proposed an approach in which the focused image is obtained through deconvolution in the Fourier domain using the Wiener filter. Subbarao's method requires estimation of a noise parameter, which is not known a priori; this causes less accurate results in focused image recovery. In contrast, the application of constrained least squares filtering to image restoration is optimal for each given image and requires knowledge only of the noise mean and variance, which makes it very flexible. This approach gives a more accurate focused image than the previous method. One of the difficulties of the image restoration problem is finding the exact PSF that produced the blurred image. Here, camera calibration is used to find the line spread function (LSF), and the obtained LSF is used to compute the actual PSF directly via the separable optical transfer function. The new method has been implemented on an actual camera system, and experimental results of focused image recovery are provided and discussed.
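The constrained least squares filter itself has a standard Fourier-domain form, F = conj(H) G / (|H|^2 + gamma |P|^2) with P the Laplacian; the sketch below applies it to a synthetic blur rather than the paper's calibrated PSF, and the kernel and gamma are assumptions:

```python
import numpy as np

def cls_deconvolve(blurred, psf, gamma=1e-5):
    """Constrained least squares restoration in the Fourier domain.
    The psf is given full-size and centred at the origin (0, 0)."""
    H = np.fft.fft2(psf)
    lap = np.zeros_like(blurred)                  # Laplacian smoothness term
    lap[0, 0] = 4.0
    lap[0, 1] = lap[1, 0] = lap[0, -1] = lap[-1, 0] = -1.0
    P = np.fft.fft2(lap)
    G = np.fft.fft2(blurred)
    F = np.conj(H) * G / (np.abs(H) ** 2 + gamma * np.abs(P) ** 2)
    return np.real(np.fft.ifft2(F))

# Blur a random image with a small smoothing kernel, then restore it.
rng = np.random.default_rng(3)
img = rng.random((32, 32))
psf = np.zeros_like(img)
psf[0, 0] = 0.6
psf[0, 1] = psf[1, 0] = psf[0, -1] = psf[-1, 0] = 0.1
blurred = np.real(np.fft.ifft2(np.fft.fft2(img) * np.fft.fft2(psf)))
restored = cls_deconvolve(blurred, psf)
```

Unlike the Wiener filter, the only tuning here is the regularization weight gamma, which stands in for the noise constraint rather than a full noise power spectrum.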
Color pattern recognition based on projection preprocessing of red-green-blue components and single-channel matched filtering is described. We optimize the parameters of the projection preprocessing and correlation filters in terms of noise robustness, and design a phase-only filter optimized in the sense of signal-to-noise ratio for optical pattern recognition. Computer simulation results are provided to illustrate color pattern recognition using the proposed method.
We use the paraxial geometric optics model of image formation to derive a set of camera focusing techniques. These techniques do not require calibration of cameras but involve a search of the camera parameter space. The techniques are proved to be theoretically sound under weak assumptions. They include energy maximization of unfiltered, low-pass-filtered, high-pass-filtered, and bandpass-filtered images. It is shown that in the presence of high spatial frequencies, noise, and aliasing, focusing techniques based on bandpass filters perform well. The focusing techniques are implemented on a prototype camera system called the Stonybrook passive autofocusing and ranging camera system (SPARCS). The architecture of SPARCS is described briefly. The performance of the different techniques is compared experimentally. All techniques are found to perform well. The energy of the low-pass-filtered image gradient, which has better overall characteristics, is recommended for practical applications.
A new shape-from-focus method is described which is based on a new concept named Focused Image Surface (FIS). FIS of an object is defined as the surface formed by the set of points at which the object points are focused by a camera lens. According to paraxial-geometric optics, there is a one-to-one correspondence between the shape of an object and the shape of its FIS. Therefore, the problem of shape recovery can be posed as the problem of determining the shape of the FIS. From the shape of FIS the shape of the object is easily obtained. In this paper the shape of the FIS is determined by searching for a shape which maximizes a focus measure. In contrast with previous literature where the focus measure is computed over the planar image detector of the camera, here the focus measure is computed over the FIS. This results in more accurate shape recovery than the traditional methods. Also, using FIS, a more accurate focused image can be reconstructed from a sequence of images than is possible with traditional methods. The new method has been implemented on an actual camera system, and the results of shape recovery and focused image reconstruction are presented.
We use the paraxial geometric optics model of image formation to derive a set of camera focusing techniques. These techniques do not require calibration of cameras but involve a search of the camera parameter space. The techniques are proved to be theoretically sound. They include energy maximization of unfiltered, low-pass-filtered, high-pass-filtered, and band-pass-filtered images. It is shown that in the presence of high spatial frequencies, noise, and aliasing, focusing techniques based on band-pass filters perform well. The focusing techniques are implemented on a prototype camera system named SPARCS, whose architecture is described briefly. The performance of the different techniques is compared experimentally. All techniques are found to perform well. One of them, the energy of the low-pass-filtered image gradient, which has better overall characteristics, is recommended for practical applications.