A procedure for creating images with higher resolution than the sampling rate would allow is described. The enhancement algorithm augments the frequency content of the image using shape-invariant properties of edges across scale by using a non-linearity that generates phase-coherent higher harmonics. The procedure utilizes the Laplacian pyramid image representation. Results are presented depicting the power-spectra augmentation and the visual enhancement of several images. Simplicity of computations and ease of implementation allow for real-time applications such as high-definition television.
This paper presents a generalized image contrast enhancement technique which equalizes perceived brightness based on the Heinemann contrast discrimination model. This is a modified algorithm which presents an improvement over the previous study by Mokrane in its mathematically proven existence of a unique solution and in its easily tunable parameterization. The model uses a log-log representation of contrast luminosity between targets and the surround in a fixed luminosity background setting. The algorithm consists of two nonlinear gray-scale mapping functions which have seven parameters, two of which are adjustable Heinemann constants. Another parameter is the background gray level. The remaining four parameters are nonlinear functions of gray scale distribution of the image, and can be uniquely determined once the previous three are given. Tests have been carried out to examine the effectiveness of the algorithm for increasing the overall contrast of images. It can be demonstrated that the generalized algorithm provides better contrast enhancement than histogram equalization. In fact, the histogram equalization technique is a special case of the proposed mapping.
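For reference, the histogram equalization baseline that the abstract compares against can be sketched as follows. This is only the classical baseline, not the Heinemann-based mapping; the function name `equalize_histogram` and the flat-list image layout are assumptions made for illustration.

```python
from typing import List


def equalize_histogram(pixels: List[int], levels: int = 256) -> List[int]:
    """Map gray levels through the normalized cumulative histogram.

    `pixels` is a non-empty flat list of integer gray levels in [0, levels).
    """
    # Count how often each gray level occurs.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution of gray levels.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    total = len(pixels)
    # Smallest non-zero CDF value, so the darkest occupied level maps to 0.
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        return list(pixels)  # single gray level: nothing to spread

    scale = (levels - 1) / (total - cdf_min)
    lut = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [lut[p] for p in pixels]
```

A narrow band of gray levels such as `[100, 100, 101, 102, 103, 103]` is stretched so that its darkest level maps to 0 and its brightest to 255, with the ordering of levels preserved.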
Error diffusion is a procedure for generating high quality bi-level images from continuous tone images so that the continuous and halftone images appear similar when observed from a distance. It is well known that certain objectionable patterning artifacts can occur in error diffused images. Previous approaches for improving the quality of error diffused images include the application of non-standard scanning strategies (e.g., serpentine or Peano scanning), dithering the filter coefficients in error diffusion, dithering the quantizer threshold, incorporating feedback to control the average distance between dots, and designing an optimum error diffusion filter according to some distortion criterion. Here we consider a method for adjusting the error diffusion filter concurrently with the error diffusion process so that an error criterion is minimized. The minimization is performed using the LMS algorithm from adaptive signal processing. Such an algorithm produces better halftone image quality than traditional error diffusion with a fixed filter.
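The "traditional error diffusion with a fixed filter" mentioned above can be sketched as below. The sketch assumes the widely used Floyd-Steinberg weights (7/16, 3/16, 5/16, 1/16) and a raster scan; the abstract does not say which fixed filter was used as the baseline, and the LMS adaptation itself is not shown.

```python
from typing import List


def error_diffuse(image: List[List[float]]) -> List[List[int]]:
    """Halftone a grayscale image (values in [0, 1]) to 0/1 pixels.

    Fixed Floyd-Steinberg filter, left-to-right raster scan.
    """
    h, w = len(image), len(image[0])
    work = [[float(v) for v in row] for row in image]  # accumulates diffused error
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            old = work[y][x]
            new = 1 if old >= 0.5 else 0  # fixed quantizer threshold
            out[y][x] = new
            err = old - new
            # Push the quantization error onto unprocessed neighbors.
            if x + 1 < w:
                work[y][x + 1] += err * 7 / 16
            if y + 1 < h:
                if x > 0:
                    work[y + 1][x - 1] += err * 3 / 16
                work[y + 1][x] += err * 5 / 16
                if x + 1 < w:
                    work[y + 1][x + 1] += err * 1 / 16
    return out
```

Because the error is carried forward rather than discarded, the fraction of "on" pixels in a flat region approximates its gray level. An adaptive scheme such as the one in the abstract would replace the four constant weights with coefficients updated during the scan.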
Minimizing visible distortion in a quantized color image is context-dependent. Our feedback-based strategy for color image quantization looks at the quantized image as well as the original. This comparison yields useful information to guide the embedded quantization algorithm to devote, during re-quantization of the original image, more resources to areas where the most offensive distortion occurred. Our current implementation of this new strategy uses an edge detector in a scaled RGB space to reveal the location and severity of false contours, which appear in the quantized image but not in the original. The result of this false-contour detection step is used to identify uniformly colored regions in the quantized image that lie alongside significant false contours. These regions correspond directly to areas in the original image that need to be better preserved during re-quantization. A well-known divisive method and our own agglomerative method are adapted separately as the embedded quantization algorithm to demonstrate the applicability and effectiveness of this feedback-based approach.
Images are often corrupted by impulse noise due to faulty image acquisition devices or during transmission. The goal of impulse noise removal is to suppress the noise while preserving edges and details. In this paper, an efficient method is proposed to remove impulse noise from a highly corrupted image while preserving the detail features. This new method is based on a detection-estimation strategy, in which the impulse noise is detected first using a quadratic filter, and a selectively chosen local mean is used to estimate the true value of the corrupted pixel. Simulation results indicate that the new method is much more efficient and effective than the median filtering and the out-range methods.
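The detection-estimation strategy can be illustrated with a minimal sketch. Note the substitution: the abstract detects impulses with a quadratic filter, whose form is not given, so the sketch uses a simple deviation-from-local-median test instead. The function name, the 3 × 3 neighborhood, and the threshold value are all assumptions.

```python
from statistics import median
from typing import List, Tuple


def remove_impulses(image: List[List[float]], threshold: float = 50) -> List[List[float]]:
    """Two-stage impulse removal: flag suspect pixels, then re-estimate them."""
    h, w = len(image), len(image[0])

    def neighbors(y: int, x: int) -> List[Tuple[int, int]]:
        # Coordinates of the 8-neighborhood, clipped at the image border.
        return [(j, i)
                for j in range(max(0, y - 1), min(h, y + 2))
                for i in range(max(0, x - 1), min(w, x + 2))
                if (j, i) != (y, x)]

    # Detection (stand-in detector): large deviation from the neighbor median.
    flagged = [[abs(image[y][x] - median(image[j][i] for j, i in neighbors(y, x))) > threshold
                for x in range(w)]
               for y in range(h)]

    # Estimation: mean of the neighbors that were NOT flagged as impulses.
    out = [row[:] for row in image]
    for y in range(h):
        for x in range(w):
            if flagged[y][x]:
                clean = [image[j][i] for j, i in neighbors(y, x) if not flagged[j][i]]
                if clean:
                    out[y][x] = sum(clean) / len(clean)
    return out
```

Pixels that pass the detector are left untouched, which is what lets a detection-estimation scheme preserve detail better than filtering every pixel.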
The inadequacy of the classic linear approach to edge detection and scale space filtering lies in the spatial averaging of the Laplacian. The Laplacian is the divergence of the gradient and thus is the divergence of both magnitude and direction. The divergence in magnitude characterizes edges and this divergence must not be averaged if the image structure is to be preserved. We introduce a new nonlinear filtering theory that only averages the divergence of direction. This averaging keeps edges and lines intact as their direction is nondivergent. Noise does not have this nondivergent consistency and its divergent direction is averaged. Higher order structures such as corners are singular points or inflection points in the divergence of direction and also are averaged. Corners are intersection points of edges of nondivergent direction (or smooth curves of small divergence in direction) and their averaging is limited. This approach provides a better compromise between noise removal and preservation of image structure. Experiments that verify and demonstrate the adequacy of this new theory are presented.
In this paper, we develop a theory for designing weighted order statistic filters with a structural approach and discuss their applications in image filtering. By introducing a set of parameters, called M_i's, the statistical properties of weighted order statistic filters are analyzed. A theorem is presented to show that any symmetric weighted order statistic filter will drive the input to a root or oscillate in a cycle of period 2. Previously, this result had been proven only for some weighted order statistic filters. A condition is provided to guarantee the convergence of weighted order statistic filters.
A technique is proposed for estimating the parameters of 2D uniform motion of multiple moving objects in a scene, based on long-sequence image processing and the application of a recently developed multi-line fitting algorithm. Plots of the vertical and horizontal projections versus frame number give new images in which uniformly moving objects are represented by skewed band regions, with the angles of the skew from the vertical being a measure of the velocities of the moving objects. For example, vertical bands will correspond to objects with zero velocity. A recently developed algorithm called SLIDE (subspace-based line detection) can be used to efficiently determine the skew angles. SLIDE exploits the temporal coherence between the contributions of each of the moving patterns in the frame projections to enhance and distinguish a signal subspace that is defined by the desired motion parameters. A similar procedure can be used to determine the vertical velocities. Some further steps must then be taken to properly associate the horizontal and vertical velocities.
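The relation between frame projections and velocity can be sketched for the simplest case of a single object. The sketch replaces SLIDE with an ordinary least-squares line fit to the projection centroid, so it does not handle multiple objects; all function names and the synthetic test scene are illustrative assumptions.

```python
from typing import List

Frame = List[List[int]]


def column_projection(frame: Frame) -> List[int]:
    """Sum each column, giving the projection onto the horizontal axis."""
    return [sum(col) for col in zip(*frame)]


def centroid(profile: List[int]) -> float:
    """Intensity-weighted mean position of a 1D projection."""
    total = sum(profile)
    return sum(i * v for i, v in enumerate(profile)) / total


def horizontal_velocity(frames: List[Frame]) -> float:
    """Slope (pixels per frame) of centroid position versus frame number."""
    ts = list(range(len(frames)))
    xs = [centroid(column_projection(f)) for f in frames]
    n = len(ts)
    mean_t, mean_x = sum(ts) / n, sum(xs) / n
    num = sum((t - mean_t) * (x - mean_x) for t, x in zip(ts, xs))
    den = sum((t - mean_t) ** 2 for t in ts)
    return num / den


def make_frame(start: int, height: int = 8, width: int = 20) -> Frame:
    """Synthetic frame: a 3 x 3 bright block whose left edge is at `start`."""
    frame = [[0] * width for _ in range(height)]
    for y in range(2, 5):
        for x in range(start, start + 3):
            frame[y][x] = 1
    return frame


frames = [make_frame(2 * t) for t in range(5)]  # block moves 2 px per frame
velocity = horizontal_velocity(frames)
```

Stacking the projections row by row would give the image described in the abstract, in which this object appears as a band skewed away from the vertical; a stationary object would give a vertical band and a slope of zero.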
The perceived quality of images reconstructed from low bit rate compression is severely degraded by the appearance of transform coding artifacts. This paper proposes a method for producing higher quality reconstructed images based on a stochastic model for the image data. Quantization (scalar or vector) partitions the transform coefficient space and maps all points in a partition cell to a representative reconstruction point, usually taken as the centroid of the cell. The proposed image estimation technique selects the reconstruction point within the quantization partition cell which results in a reconstructed image which best fits a non-Gaussian Markov random field image model. The estimation of the best reconstructed image results in a convex constrained optimization problem which can be solved iteratively. Experimental results are shown for images compressed using scalar quantization of block DCT and using vector quantization of subband wavelet transform. The proposed image decompression provides a reconstructed image with reduced visibility of transform coding artifacts and superior perceived quality.
A 3D nonlinear postprocessing algorithm is proposed in this paper for reducing coding artifacts produced by block based motion compensated transform coding. In the proposed algorithm, a separable 3D filtering structure is used: a space-variant FIR-Median Hybrid filtering is used in spatial domain, followed by a motion compensated nonlinear filtering in temporal domain. By using this structure, the coding artifacts in the reconstructed image sequence can be effectively reduced without blurring edges or moving objects in the image sequence. The simulation results showed that significant improvement in the picture quality of low bit-rate coded video sequences can be obtained by using the proposed algorithm.
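The idea behind an FIR-median hybrid (FMH) filter can be sketched in one dimension. This is the basic, space-invariant form: the output is the median of two FIR averages and the center sample. The space-variant 2D version and the motion-compensated temporal stage of the abstract are not reproduced; the function name and the window half-length `k` are assumptions.

```python
from statistics import median
from typing import List


def fmh_filter(signal: List[float], k: int = 2) -> List[float]:
    """Basic 1D FIR-median hybrid filter.

    For each sample, take the median of: the mean of the k samples before it,
    the sample itself, and the mean of the k samples after it.
    Border samples without a full window are copied unchanged.
    """
    out = list(signal)
    for n in range(k, len(signal) - k):
        left = sum(signal[n - k:n]) / k           # backward FIR average
        right = sum(signal[n + 1:n + k + 1]) / k  # forward FIR average
        out[n] = median([left, signal[n], right])
    return out
```

An isolated spike such as `[0, 0, 0, 50, 0, 0, 0]` is removed, while a step such as `[0, 0, 0, 0, 10, 10, 10, 10]` passes through unchanged, which is the edge-preserving behavior the abstract relies on.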
To reach higher compression ratios in video sequence coding, the demands placed on the motion estimation module become greater and greater. In this paper we present a motion estimation scheme which gives a motion field defined on every block (typically 8 by 8 or 16 by 16) and closely resembles the real motion in the scene. In more complex coding schemes constraints were added to obtain better visual quality. The proposed algorithm is a `one-pass' algorithm and is not explicitly based on any statistical model. A classification procedure on a predefined number of motion vector candidates defines the final motion field. An overview is given of the concept of using global information to calculate the motion vector field. A thorough description of the new algorithm is given and some simulation results are presented on real sequences. Further research on this algorithm will be done to use it in a segmentation based codec for complex scenes.
Image segmentation provides a powerful semantic description of video imagery essential in image understanding and efficient manipulation of image data. In particular, segmentation based on image motion defines regions undergoing similar motion allowing an image coding system to more efficiently represent video sequences. This paper describes a general iterative framework for segmentation of video data. The objective of our spatiotemporal segmentation is to produce a layered image representation of the video for image coding applications whereby video data is simply described as a set of moving layers.
This paper presents an edge-based background subtraction approach to moving object location from video frames. First, a background frame without moving objects is obtained. For each subsequent frame with objects, edge detection with thresholding is performed on both the background frame and the current frame. The difference of the two edge images is computed. To emphasize the areas containing edge pixels belonging to the moving object, a selective dilation operation is performed. Connected component analysis is applied to remove insignificant candidate regions. An erosion operation is applied to compensate for earlier dilation. Finally, unwanted areas formed by the shadows of the moving object are removed. Experimental results with road scenes to locate vehicles approaching toll booths on a highway entrance demonstrate the effectiveness of the proposed approach. The extracted vehicles are input to a classification module for classifying vehicles into useful categories.
Parsing video content is an important first step in the video indexing process. This paper presents algorithms to automate the video parsing task, including video partitioning and video clip classification according to camera operations using compressed video data. We have studied and implemented two algorithms for partitioning video data compressed according to the MPEG standard. The first one is based on discrete cosine transform coefficients of video frames, and the other based on correlation of motion vectors. Algorithms to detect camera operations using motion vectors are presented.
The problem addressed is to convert standard CCIR 601 TV images into progressive images (deinterlacing) and into HDTV images. The latter is achieved through deinterlacing as an intermediate step. Deinterlacing is done by motion-compensated inter-frame interpolation for pixels with a reliable displacement estimate and intra-frame interpolation for the other pixels. The displacement estimation technique resembles a generalized Hough transform.
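The intra-frame fallback used for pixels without a reliable displacement estimate can be sketched as simple line averaging. The abstract does not specify which intra-frame interpolator is used, so vertical averaging is an assumption, as is the function name; the motion-compensated branch is not shown.

```python
from typing import List


def deinterlace_field(field_lines: List[List[float]]) -> List[List[float]]:
    """Build a progressive frame from one field by vertical line averaging.

    Each missing line is the average of the field lines above and below it;
    the last missing line repeats the final field line.
    """
    frame: List[List[float]] = []
    for idx, line in enumerate(field_lines):
        frame.append(list(line))  # keep the transmitted line as-is
        below = field_lines[idx + 1] if idx + 1 < len(field_lines) else line
        frame.append([(a + b) / 2 for a, b in zip(line, below)])
    return frame
```

A field of three lines therefore yields a frame of six lines, with every second line interpolated.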
Missile image sequences often contain a strong plume signature expelled by the missile, with a weak signal corresponding to the missile hardbody. Enhancement of the missile signal in low signal-to-noise ratio environments is necessary to accurately track the trajectory throughout the sequence. By stabilizing the image sequence a registered data set is produced in which the missile hardbody can be found in the same location within each frame. A statistical method known as product correlation (PC) may be applied to the stabilized data, enhancing missile contrast with respect to the background. An algorithm for the passive stabilization of video sequences consisting of missile imagery is described. PC is presented in the context of higher order statistics and then applied to stabilized video sequences to enhance the missile hardbody signal within the data.
Pressure sensitive paints are used to record surface pressures on wind tunnel models at high spatial resolution. Images taken at two separate pressure conditions are ratioed to yield quantitative measures of surface pressures. However, these two images have very different intensity values. This paper discusses some common motion estimation techniques, such as block matching algorithms and point correspondences, that were used to analyze this intensity variant image set. The drawbacks and limitations to each method are presented as well as any modifications that were necessary to implement each method given an intensity variant data set. Comparisons between different methods or implementations are discussed in terms of the speed of computation and the accuracy of the motion compensated result.
Motion analysis systems are currently used in a number of applications. In space research, this apparatus is useful in the study of astronauts' movements in weightlessness. However, the systems that are available on the market do not allow for easy identification of the markers held by the subject when the luminous environment is not controlled. The utilization of active coded targets allows these markers to be identified automatically and without ambiguity. This article presents a possible form of coding, called temporal asynchronous coding of active targets. A prototype integrating this coding has been developed in collaboration with an industrial partner. The implementation of this system, and certain experiments carried out, are set out in this document.
Devices called fish passes are constructed in rivers to help migratory fish get over obstacles (dams). Window panes are used to observe and count by species the fish which cross. Our goal is to automate this work by using a vision system. The images used to accomplish fish recognition and counting are taken by a video camera fitted with an electronic shutter in a backlit fish pass. The development structure is based on a micro-computer connected to an image acquisition and display system. Images, taken from an S-VHS video-tape recorder, are digitized in a 256 × 256 × 8 bit format and stored on an optical disk. The recognition operations (parameter extraction and discriminant analysis classification process) are included in a dynamic process which tracks each fish while it is in the observation field to count it. When several fish overlap, the situation is detected by a comparison of consecutive images and recognition is not performed. The classification results obtained for the `static' recognition are 90 to 100% correct recognition, depending on the species. Furthermore, the tracking process improves these results by the temporal redundancy it generates.
This paper addresses the regional boundary extraction problem, which attempts to match a reference template contour to a desired target boundary in the presence of noise. A three-step iterative strategy is used: first, a local correspondence is set up between each pixel of the reference template and the target boundary; second, shape information is used to minimize the error from the first step; and third, the template contour is updated. The first step assigns a target pixel to each pixel of the template contour, using an iterated max-min estimator which gives the upper bound for the mean square error. The second step minimizes error by using correlation information between the template pixels; the correlation is approximated as a Gaussian weighted distance function, and effectively smoothes the template contour deformation from the first step. The third step uses this smoothed deformation to update the template contour, which then becomes the starting point for the next iteration. Experimental results are shown in which the algorithm is applied to medical NMR images and performance is compared to the SNAKE algorithm.
A PC-based car license plate recognition system has been developed. The system recognizes Chinese characters and Japanese phonetic hiragana characters as well as six digits on Japanese license plates. The system consists of a CCD camera, vehicle sensors, a strobe unit, a monitoring center, and an i486-based PC. The PC includes in its extension slots: a vehicle detector board, a strobe emitter board, and an image grabber board. When a passing vehicle is detected by the vehicle sensors, the strobe emits a pulse of light. The light pulse is synchronized with the time the vehicle image is frozen on an image grabber board. The recognition process is composed of three steps: image thresholding, character region extraction, and matching-based character recognition. The recognition software can handle obscured characters. Experimental results for hundreds of outdoor images showed high recognition performance within relatively short performance times. The results confirmed that the system is applicable to a wide variety of applications such as automatic vehicle identification and travel time measurement.
In image restoration processes, the Wiener filter method derived from the minimum mean square error criterion is probably the most popular. In this method the constant Γ, which is a priori knowledge of the signal-to-noise ratio, plays a major role and its value is determined by a trial and error approach. In previous papers an expression for Γ[i, j], which is a priori knowledge of the signal-to-noise ratio at pixel [i, j], was derived assuming that two degraded versions (referred to as the first and second images) of the original image are provided. It may be possible to have many degraded versions of the original image. In this paper a new methodology is proposed to construct both the first and the second images from the given set of degraded images, which are then used with the above process to estimate Γ.
Image restoration is an estimation process that attempts to recover an ideal high quality image from a given degraded version. The Wiener filter method derived from the minimum mean square error criterion is widely used in image restoration to restore degraded images. In this method the constant Γ, which is an a priori representation of the signal-to-noise ratio for the complete image plane, is unknown; its value is supplied by the user and adjusted by trial and error. In this paper a new estimation process for Γ is proposed. First, a second image is constructed from the given degraded image (referred to as the first image) using Lagrange's interpolation technique. The version of Lagrange's interpolation used here is a modification of the original approach. Second, an expression for Γ[i, j], which is an a priori representation of the signal-to-noise ratio at pixel [i, j], is obtained using both the first and the second images and their autocorrelations. However, the Wiener filter only needs Γ for the complete image plane. Therefore an arithmetic mean of a selected set of Γ[i, j] values is calculated. This arithmetic mean is then used as Γ in the Wiener filter to restore the first image.
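The role of the constant can be made concrete with the common textbook form of the parametric Wiener filter. This is a sketch under assumptions: the symbols G (spectrum of the degraded image), H (degradation transfer function), and S (the selected set of pixels) do not appear in the abstract, and in this textbook form the constant acts as a noise-to-signal power ratio, so whether the abstract's Γ enters directly or as a reciprocal is not stated in the source.

```latex
\hat{F}(u,v) = \frac{H^{*}(u,v)}{\lvert H(u,v)\rvert^{2} + \Gamma}\, G(u,v),
\qquad
\Gamma = \frac{1}{\lvert S\rvert} \sum_{(i,j)\in S} \Gamma[i,j]
```

The second expression is the arithmetic mean described in the abstract: the per-pixel values are collapsed to a single constant because the filter is applied over the complete image plane.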
We present an improved regularized pseudo inverse method that can super-resolve passive imagery. The algorithm is a non-iterative technique that involves convolution operations in the spatial domain which allows an image to be processed in the frame time of the imager. Our method gives encouraging results, providing a significant increase in spatial resolution with only a small reduction in the final signal to noise ratio. Examples are given of the application of the technique to millimeter-wave (mm-wave) images and a comparison is made with other fast deconvolution methods and computer intensive super-resolution algorithms. Finally, a brief discussion is given of how high speed hardware is used to implement this algorithm.
A new parallel thinning algorithm is proposed here that is based on a convolution approach. This algorithm works for both four neighbor and eight neighbor connectivity. This algorithm gives good results and requires very little computation time if we exploit a high speed convolutional processor such as HNC's Vision Processor (ViP). The algorithm executes in a parallel fashion using 3 × 3 convolutions. It checks all possible 512 patterns within the 3 × 3 windows in each pass where each pass takes less than 7 milliseconds with the ViP. To maintain original connectivity, we divide the 100 patterns into two large and four small groups that avoid possible conflict. The high speed (70 milliseconds for most 512 × 512 images with objects that have a 10 pixel width or less) is due to the parallelism in HNC's ViP chip and enables real time applications. Because this algorithm takes advantage of the current VLSI technology, it checks as many as 512 patterns at the same time using a lookup table and provides the best result.
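The way a single 3 × 3 weighted sum can address a 512-entry lookup table is sketched below. Weighting the nine binary pixels by distinct powers of two gives every neighborhood pattern a unique code from 0 to 511. The deletion rule used here (remove isolated pixels only) is a placeholder chosen for illustration; the abstract's actual thinning patterns and their grouping are not given, and all names are hypothetical.

```python
from typing import Callable, List

Image = List[List[int]]

# Powers of two: each of the 9 window positions contributes one bit.
WEIGHTS = [[1, 2, 4],
           [8, 16, 32],
           [64, 128, 256]]


def pattern_index(image: Image, y: int, x: int) -> int:
    """Code (0..511) of the 3 x 3 pattern centered on (y, x); outside is 0."""
    idx = 0
    for dy in range(3):
        for dx in range(3):
            j, i = y + dy - 1, x + dx - 1
            if 0 <= j < len(image) and 0 <= i < len(image[0]):
                idx += WEIGHTS[dy][dx] * image[j][i]
    return idx


def build_lut(rule: Callable[[int], int]) -> List[int]:
    """Precompute the output pixel for every one of the 512 patterns."""
    return [rule(code) for code in range(512)]


def apply_pass(image: Image, lut: List[int]) -> Image:
    """One parallel pass: every output pixel depends only on the input image."""
    return [[lut[pattern_index(image, y, x)] for x in range(len(image[0]))]
            for y in range(len(image))]


def drop_isolated(code: int) -> int:
    """Placeholder rule: delete a set center with no set neighbors (code 16)."""
    return 0 if code == 16 else (code >> 4) & 1


LUT = build_lut(drop_isolated)
```

Because each pass reads only the previous image, all pixels can be evaluated simultaneously, which is what makes the scheme suitable for convolution hardware. A real thinning table would mark many more patterns for deletion and split them into groups so that connectivity is preserved.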
Color can be used as a very important cue for image recognition. In industrial and commercial areas, color is widely used as a trademark or identifying feature in objects, such as packaged goods, advertising signs, etc. In image database systems, one may retrieve an image of interest by specifying prominent colors and their locations in the image (image retrieval by contents). These facts enable us to detect or identify a target object using colors. However, this task depends mainly on how effectively we can identify a color and detect regions of the given color under possibly non-uniform illumination conditions such as shade, highlight, and strong contrast. In this paper, we present an effective method to detect regions matching given colors, along with the features of the region surfaces. We adopt the HVC color coordinates in the method because of its ability of completely separating the luminant and chromatic components of colors. Three basis functions functionally serving as the low-pass, high-pass, and band-pass filters, respectively, are introduced.
Interpolation based contour metamorphosis methods often yield self-intersecting intermediate contours. In this research, we present a highly automatic algorithm to achieve non-self-intersecting contour morphing. The basic idea of our approach is to represent a planar curve with the wavelet descriptor, which allows metamorphosis at different resolutions as well as spatial locations. Furthermore, to avoid self-intersection, we formulate the mapping of control vertices of key frames as a minimization problem with a cost function involving bending and stretching of an object. Experiments with the proposed morphing algorithm are conducted to demonstrate its performance.
A new digital surface called the gradually varied surface is introduced and studied in digital spaces, especially in digital manifolds. In this paper, we prove a constructive theorem: Let Σ_m be an indirectly adjacent grid space. Given a subset J of D and a mapping f_J : J → Σ_m, if the distance between any two points p and q in J is not less than the distance between f_J(p) and f_J(q) in Σ_m, then there exists an extension mapping f of f_J such that the distance between any two points p and q in D is not less than the distance between f(p) and f(q) in Σ_m. We call such an f a gradually varied surface. We also show that any digital manifold (graph) can normally immerse an arbitrary tree T. Furthermore, we discuss the gradually varied function. An envelope theorem, a uniqueness theorem, and an extension theorem concerned with having the same norm are obtained. Finally, we show an optimal uniform approximation theorem of gradually varied functionals and develop an efficient algorithm for the approximation.
The spatial resolution of the Voltage Imaging™ technology is defined as the smallest pixel pitch that can be measured for a certain voltage accuracy. It is a function of the EO material, the image forming optics, the CCD resolution and electronic filter bandwidth, and the image processor resolution and filter bandwidth. This paper starts from a square wave input signal with 100% modulation on the LCD panel. The contrast transfer function (CTF) for each individual stage in the voltage imaging process is derived. These stages include the EO material, the objective lens and imaging lens, the CCD camera spatial sampling, the CCD camera electronics filtering, and the image processor sampling and filtering. Simulated data are presented.
We describe a model-based vision system to assist the pilots in landing maneuvers under restricted visibility conditions. The system has been designed to analyze image sequences obtained from a passive millimeter wave (PMMW) imaging system mounted on the aircraft to delineate runways/taxiways, buildings, and other objects on or near runways. PMMW sensors have good response in a foggy atmosphere, but their spatial resolution is very low. However, additional data such as airport model and approximate position and orientation of aircraft are available. We exploit these data to guide our model-based system to locate objects in the low resolution image and generate warning signals to alert the pilots. We also derive analytical expressions for the accuracy of the camera position estimate obtained by detecting the position of known objects in the image.
An eigenstructure approach to region detection is presented in this paper. It first defines Block and Snapshot, followed by the data formulation. A covariance matrix is formed as an overall description of the entire image environment. Information criteria are then applied to directly determine the number of regions. This approach does not make heuristic assumptions about the model and considers the spatial correlation among the pixels. It also provides faster and more accurate operation than model-based approaches. The principle of block processing is described and encouraging results are included.
As part of a project to utilize advanced software to diagnose multi-layer circuit card assemblies, the signal traces were separated into top and bottom layers. The source images were stereo x-ray images, 2 K by 2 K by 12 bits. A stereo software algorithm was developed using steerable filters as a means of precisely determining signal trace location and orientation. A new stereopsis algorithm was derived which efficiently allocates signal traces to particular layers. The paper relates the approach taken to biological vision and its characteristic of hyperacuity, and to the subject of model building and model fitting.
In this paper, we propose an unsupervised neural network approach to an image segmentation problem. The purpose is to extract spectral lines from sonar images. A Kohonen self-organizing map is used to approximate the probability density function of the input data in a nonlinear way. The originality of our work with regard to the Kohonen approach is that constraints due to spectral line features (temporal continuity, high mean energy) are encoded into the network. The process consists of two steps. The first step determines, for each point in the image, whether a spectral line goes through that point. This step is achieved using a one-dimensional map which self-organizes until a stable state is reached. The second step consists of evaluating whether the network topology recovers the properties of spectral lines. For this purpose, we define an objective function which depends on the mean energy of the neurons and the global curvature of the network seen as a topological set of units. This process enhances the perception of spectral lines in a noisy image and has been successfully applied to a set of lofargrams with different signal-to-noise ratios.
The aim of this work is to develop a method to improve the region segmentation of images by considering each image separately and taking into account the results of the matching process. The method is carried out in several steps. First, an initial region segmentation is computed by using a split-and-merge algorithm cooperating with an edge extractor. Then, a rule-based system is used in order to improve the initial region segmentation. In the second step of the method, the regions of the images are matched by an iterative algorithm; only the reliable matches are performed and for this reason, numerous regions are left unmatched. Then, these regions are treated by another rule-based system by comparing the homologous regions on each image. Both the image segmentation and the matching results are improved at the same time, which makes the 3-D reconstruction of the facets corresponding to the matched regions easier.
This paper discusses the methods used to model the structure of x-ray images of the human body and the individual organs within the body. A generic model of a region is built up from x-ray images to aid in automatic segmentation. By using the ribs from a chest x-ray image as an example, it is shown how models of the different organs can be generated. The generic model of the chest region is built up by using a priori knowledge of the physical structure of the human body. The models of the individual organs are built up by using knowledge of the structure of the organs as well as other information contained within each image. Each image is unique and therefore information from the region surrounding the organs in the image has to be taken into account when adapting the generic model to individual images. Results showing the application of these techniques to x-ray images of the chest region, the labelling of individual organs, and the generation of models of the ribs are presented.
A new approach in single photon emission computed tomography (SPECT) is presented to reconstruct the distributions of both activity and attenuation from projection data. Poisson noise, attenuation, scatter, and collimator effects were corrected completely with a Bayesian statistical model. The attenuation distribution was modeled with a deformable template since the attenuation coefficients for the γ-ray used in SPECT were nearly uniform within different regions of the human body. A prior probability was constructed for the attenuation distribution with a Gibbs measure on a hierarchical deformable template while the activity distribution was modeled by a Markov random field. By maximizing the joint posterior probability, the maximum-a-posteriori (MAP) estimate of the distributions of both activity and attenuation was obtained. The implementation of MAP estimation was achieved approximately by a hybrid algorithm of iterated conditional modes and gradient descent.
An algorithm for the robust and fast automatic construction of a 3D model of any real object using images from multiple views is presented. The images are taken from a real object rotating in front of a stationary calibrated CCD TV camera. The object silhouettes extracted from the input images, the related turntable positions, and camera orientation are used to construct the volume model of the real object by applying the method of occluding contours. A keypoint in performing this method is a proper volume representation, characterized by low complexity and suitability for a fast computation of volume models. In the presented approach, each volume model is described by pillar-like volume elements (pillars) ensuring a computational complexity proportional to the size of the real object surface and enabling a fast and simple construction of the volume model. The fast performance is due to the simple projection feasibility of those pillars and the easy-to-perform intersection test for the object silhouette with the projected pillars. Results with real image sequences have confirmed the robustness of the developed algorithm even for the modelling of real objects with highly detailed and complex surfaces and the use of imperfect object silhouettes.
The goal of this paper is to introduce a surface tracking algorithm for 26-connected objects and to apply it to contour tracking of discrete 3D digital objects. We show that the notion of a border is insufficient to distinguish between the outer points of an object and the points of its cavities. Then, we introduce a model of 3D discrete surfaces. A classical surface tracking algorithm is introduced for 6- and 18-connected objects. We propose an original contour tracking algorithm based on the surface tracking one, and an extension to 26-connected objects. Two parallelization strategies for the contour extraction algorithm are then proposed, one using a list data structure, the other a spatial distribution of the image over the processors.