By incorporating image gradient direction information into the geodesic active contour model, we propose a novel active contour model, the directional geodesic active contour, which has the advantage of selectively detecting image edges with different gradient directions. The experimental results show the high performance of the proposed active contour in image segmentation, especially when multiple edges with different gradient directions lie near the object boundary and would otherwise confuse the contour.
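For context, the classical geodesic active contour evolves a level-set function φ as written below; since the abstract does not give the exact directional formulation, the gated edge-stopping function g_v shown (which responds only to gradients aligned with a chosen unit direction v) is an illustrative assumption, not the paper's equation.

```latex
% Classical GAC level-set evolution (Caselles et al.):
\[
  \frac{\partial \phi}{\partial t}
    = g\,|\nabla\phi|\,
      \operatorname{div}\!\left(\frac{\nabla\phi}{|\nabla\phi|}\right)
      + \nabla g \cdot \nabla\phi,
  \qquad
  g = \frac{1}{1 + |\nabla I|^{2}}.
\]
% One illustrative directional variant (an assumption): stop only at
% edges whose gradient points along a chosen unit direction v,
\[
  g_{v} = \frac{1}{1 + \max\!\left(0,\; \nabla I \cdot v\right)^{2}},
\]
% so edges with the opposite gradient direction leave g_v near 1 and
% the contour passes through them.
```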
We present a specialized form of error diffusion (ED) that addresses certain long-standing problems associated with operating on images possessing halftone structure and other local high-contrast structures. For instance, image-quality defects often occur when rendering a scanned halftone image to printable form via quantization reduction. Rendering the scanned halftone via conventional ED produces fragmented dots, which can appear grainy and be unstable in printed density. Rendering by thresholding or rehalftoning often produces moiré, and descreening blurs the image. Another difficulty arises in printers that utilize a binary image path, where an image is rasterized directly to halftone form. In that form it is difficult to perform basic image processing operations such as applying a digital tone reproduction curve. Rank-order error diffusion (ROED) has been developed to address these problems. ROED utilizes brightness ranking of pixels within a diffusion mask to diffuse quantization error at a pixel. This approach results in an image-structure-adaptive quantization with useful properties. We describe the basic methodology of ROED as well as several applications in processing halftone images.
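The abstract does not specify how the brightness ranking redirects the error, so the sketch below takes one plausible reading: the quantization error at each pixel is sent to the unprocessed neighbors in the diffusion mask whose brightness is closest to the pixel's pre-quantization value, rather than using fixed Floyd–Steinberg weights. Function and parameter names are illustrative.

```python
import numpy as np

def rank_order_ed(img, levels=2, k=2):
    """Illustrative rank-order error diffusion (ROED) sketch.

    The quantization error at each pixel is sent to the k unprocessed
    neighbors in the diffusion mask whose brightness is closest to the
    pixel's pre-quantization value -- one plausible reading of
    'brightness ranking within a diffusion mask'.
    """
    f = img.astype(np.float64)
    out = np.zeros_like(f)
    step = 255.0 / (levels - 1)
    h, w = f.shape
    mask = [(0, 1), (1, -1), (1, 0), (1, 1)]   # causal neighbor offsets
    for y in range(h):
        for x in range(w):
            old = f[y, x]
            new = np.clip(round(old / step) * step, 0.0, 255.0)
            out[y, x] = new
            err = old - new
            nbrs = [(dy, dx) for dy, dx in mask
                    if 0 <= y + dy < h and 0 <= x + dx < w]
            # rank neighbors by brightness similarity to this pixel
            nbrs.sort(key=lambda d: abs(f[y + d[0], x + d[1]] - old))
            for dy, dx in nbrs[:k]:
                f[y + dy, x + dx] += err / min(k, len(nbrs))
    return out
```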
We present a grid-fitting method that achieves high-quality display of Chinese fonts at sizes smaller than 20 grid units. The basic idea behind the proposed method is to combine several grid-fitting rules with a gray-scaling strategy. The major procedures include supersampling, adjusting the position and width of strokes, gray-scale filtering, and subsampling. The experimental results show that the jagged edges induced by the coarse resolution can be removed and that the rendition of character details and the contrast are also improved, so that character clarity and smoothness are enhanced, yielding higher legibility and more comfortable reading.
We propose a fast palette design scheme based on the K-means algorithm for color image quantization. To accelerate the K-means algorithm for palette design, we introduce stable flags for palette entries. If the squared Euclidean distances incurred by the same palette entry in two successive rounds are nearly identical, the palette entry is classified as stable. The clustering process then skips these stable entries to cut down the required computational cost. The experimental results reveal that the proposed algorithm incurs a lower computational cost than the comparative schemes while keeping approximately the same image quality.
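A minimal sketch of the stable-flag acceleration as the abstract describes it: an entry whose per-round distortion barely changes is frozen and skipped in later update rounds. The threshold `tol` and the random initialization are assumptions.

```python
import numpy as np

def kmeans_palette(pixels, k=16, n_iter=20, tol=1.0):
    """Hedged sketch of K-means palette design with 'stable' entries.

    pixels: (N, 3) float array of RGB values.  A palette entry whose
    mean squared distance changes by less than `tol` between two
    successive rounds is flagged stable and excluded from further
    updates -- an illustrative reading of the paper's acceleration.
    """
    rng = np.random.default_rng(0)
    palette = pixels[rng.choice(len(pixels), k, replace=False)].copy()
    prev_dist = np.full(k, np.inf)
    stable = np.zeros(k, dtype=bool)
    for _ in range(n_iter):
        # assign every pixel to its nearest palette entry
        d2 = ((pixels[:, None, :] - palette[None, :, :]) ** 2).sum(-1)
        labels = d2.argmin(1)
        for j in range(k):
            if stable[j]:
                continue                      # skip converged entries
            members = pixels[labels == j]
            if len(members) == 0:
                continue
            palette[j] = members.mean(0)
            dist = ((members - palette[j]) ** 2).sum(1).mean()
            if abs(dist - prev_dist[j]) < tol:
                stable[j] = True              # distortion barely changed
            prev_dist[j] = dist
        if stable.all():
            break
    return palette, labels
```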
We develop a method for automatic colorization of images (or two-dimensional fields) in order to visualize pixel values and their local differences. In many applications, local differences in pixel values are as important as their values. For example, in topography, both elevation and slope often must be considered. Gradient-based value mapping (GBVM) is a technique for colorizing pixels based on value (e.g., intensity or elevation) and gradient (e.g., local differences or slope). The method maps pixel values to a color scale (either gray-scale or pseudocolor) in a manner that emphasizes gradients in the image while maintaining ordinal relationships of values. GBVM is especially useful for high-precision data, in which the number of possible values is large. Colorization with GBVM is demonstrated with data from comprehensive two-dimensional gas chromatography (GCxGC), using both gray-scale and pseudocolor to visualize both small and large peaks, and with data from the Global Land One-Kilometer Base Elevation (GLOBE) Project, using gray-scale to visualize features that are not visible in images produced with popular value-mapping algorithms.
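The abstract does not give the GBVM mapping function, so the sketch below shows one gradient-weighted construction consistent with it: the value-to-gray mapping is monotone (preserving ordinal relationships) and allocates more dynamic range to value intervals where image gradients concentrate. The floor parameter `alpha` is an assumption.

```python
import numpy as np

def gbvm_grayscale(field, bins=1024, alpha=0.1):
    """Hedged sketch of gradient-based value mapping (GBVM).

    Builds a monotone value-to-gray mapping whose slope at each value
    level is proportional to the total gradient magnitude observed at
    that level, plus a small floor `alpha` that keeps the mapping
    strictly increasing (preserving ordinal relationships).
    """
    gy, gx = np.gradient(field.astype(np.float64))
    gmag = np.hypot(gx, gy)
    lo, hi = field.min(), field.max()
    edges = np.linspace(lo, hi, bins + 1)
    idx = np.clip(np.digitize(field, edges) - 1, 0, bins - 1)
    # total gradient magnitude falling in each value bin
    weight = np.bincount(idx.ravel(), weights=gmag.ravel(), minlength=bins)
    weight += alpha * weight.mean() + 1e-12   # monotonicity floor
    mapping = np.cumsum(weight)
    mapping = (mapping - mapping[0]) / (mapping[-1] - mapping[0])
    return mapping[idx]                        # gray levels in [0, 1]
```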
We present an adaptive contrast enhancement method based on the generalized histogram, which is obtained by relaxing the restriction that each pixel contribute an integer count. The unit count allocated to each pixel is split into a fractional count and a remainder count. The generalized histogram is generated by accumulating the fractional count at each pixel's intensity level and distributing the remainder count uniformly over all intensity levels. The intensity mapping function, which determines the contrast gain for each intensity level, is derived from the generalized histogram. Since only the fractional part of each pixel's count increases the contrast gain of its intensity level, the amount of contrast enhancement can be adjusted by varying the fractional count according to regional characteristics. The proposed scheme produces visually more pleasing results than conventional histogram equalization.
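A minimal sketch of the generalized-histogram construction and the derived mapping, following the abstract; how the fractional count `frac` is chosen from regional characteristics is application-specific and omitted here.

```python
import numpy as np

def generalized_hist_eq(img, frac):
    """Hedged sketch of generalized-histogram contrast enhancement.

    img:  uint8 image.  frac: array of the same shape, values in [0, 1],
    giving the fractional count each pixel contributes to its own
    intensity level; the remainder 1 - frac is spread uniformly over
    all 256 levels, as described in the abstract.
    """
    levels = 256
    hist = np.bincount(img.ravel(), weights=frac.ravel(), minlength=levels)
    hist += (1.0 - frac).sum() / levels   # uniform share of remainders
    cdf = np.cumsum(hist)
    mapping = (levels - 1) * cdf / cdf[-1]
    return mapping[img].astype(np.uint8)
```

With frac ≡ 1 everywhere this reduces to ordinary histogram equalization, while frac ≡ 0 yields a uniform generalized histogram and hence an essentially linear mapping; intermediate, regionally varying values interpolate between the two behaviors.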
This paper presents the development and implementation of a real-time dynamic range compensation system for optical images. Whereas conventional automatic brightness and contrast compensators are based on the average image or pixel intensity level, the proposed system utilizes histogram-profiling techniques to continuously compensate for the dynamic range of the processed video signal. The algorithms are implemented in real time with a frame grabber card forming the front-end video capture element. The proposed technique yields better image compensation than conventional methods, and several statistical methods are used to validate the analysis.
Classical nonlinear vector median-based filters are well-known methods for impulsive noise suppression in color images, but they mostly lack good detail-preserving ability. We use a class of fuzzy metrics to introduce a vector filter aimed at improving the detail-preserving ability of classical vector filters while effectively removing impulsive noise. The output of the proposed method is the pixel inside the filter window that maximizes the combined similarity in color and spatial closeness. The use of fuzzy metrics allows us to handle both criteria simultaneously, and the filter is designed so that the importance of the spatial criterion can be adjusted. We show that the filter can adapt to the density of the contaminating noise by adjusting this importance. Classical and recent filters are used as benchmarks, and the experimental results show that the proposed technique performs competitively.
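The specific fuzzy metric is not given in the abstract; the sketch below uses the common product-form fuzzy metric for both criteria and an exponent to tune the weight of spatial closeness. All parameter values are illustrative.

```python
import numpy as np

def fuzzy_vector_filter(img, y, x, r=1, K=1024.0, spatial_pow=1.0):
    """Hedged sketch of a fuzzy-metric vector filter at pixel (y, x).

    Uses the product-form fuzzy metric
        M(a, b) = prod_i (min(a_i, b_i) + K) / (max(a_i, b_i) + K)
    for color similarity (over RGB channels) and spatial closeness
    (over pixel coordinates); the window pixel maximizing the combined
    accumulated similarity is returned.  `K` and `spatial_pow` (weight
    of the spatial criterion) are illustrative parameters.
    """
    h, w, _ = img.shape
    ys, xs = np.meshgrid(np.arange(max(0, y - r), min(h, y + r + 1)),
                         np.arange(max(0, x - r), min(w, x + r + 1)),
                         indexing="ij")
    win = img[ys, xs].reshape(-1, 3).astype(np.float64)
    pos = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(np.float64)

    def fuzzy_metric(a, b):
        return np.prod((np.minimum(a, b) + K) / (np.maximum(a, b) + K), -1)

    best, best_score = None, -np.inf
    for i in range(len(win)):
        color_sim = fuzzy_metric(win[i], win).sum()    # color criterion
        spatial_sim = fuzzy_metric(pos[i], pos).sum()  # spatial criterion
        score = color_sim * spatial_sim ** spatial_pow
        if score > best_score:
            best, best_score = win[i], score
    return best
```

Raising `spatial_pow` favors pixels near the window center, which matches the abstract's claim that the spatial criterion's importance can be tuned to the noise density.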
A comprehensive survey of 48 filters for impulsive noise removal from color images is presented. The filters are formulated using a uniform notation and categorized into 8 families. The performance of these filters is compared on a large set of images that cover a variety of domains, using three effectiveness criteria and one efficiency criterion. To ensure a fair efficiency comparison, a fast and accurate approximation for the inverse cosine function is introduced. In addition, commonly used distance measures (Minkowski, angular, and directional-distance) are analyzed and evaluated. Finally, suggestions are provided on how to choose a filter given certain requirements.
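The survey's own approximation is not reproduced in the abstract; as an example of the kind of fast inverse-cosine routine involved (angular and directional distances require an arccos per pixel pair), here is the classical Abramowitz–Stegun polynomial approximation, with absolute error on the order of 5e-5 rad.

```python
import numpy as np

def fast_acos(x):
    """Classical fast arccos approximation (Abramowitz & Stegun 4.4.45),
    shown for illustration; it is not necessarily the approximation
    introduced in the survey.  Valid for x in [-1, 1].
    """
    a = np.abs(x)
    # cubic polynomial fit, valid on [0, 1]
    p = np.sqrt(1.0 - a) * (1.5707288 + a * (-0.2121144
                            + a * (0.0742610 - 0.0187293 * a)))
    # reflect for negative arguments: acos(-x) = pi - acos(x)
    return np.where(x >= 0, p, np.pi - p)
```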
Confocal microscopes (CM) are routinely used for building 3-D images of microscopic structures. Nonideal imaging conditions in a white-light CM introduce additive noise and blur, so the optical section images need to be restored prior to quantitative analysis. We present an adaptive noise filtering technique using the Karhunen–Loève expansion (KLE) computed by the method of snapshots, together with a ringing metric that quantifies the ringing artifacts introduced at various iterations of the iterative Lucy–Richardson deconvolution algorithm. The KLE provides a set of basis functions that constitute the optimal linear basis for an ensemble of empirical observations. We show that most of the noise in the scene can be removed by reconstructing the images using the KLE basis vector with the largest eigenvalue. The prefiltering scheme presented is faster and does not require prior knowledge about the image noise. Optical sections processed with the KLE prefilter can be restored using a simple inverse restoration algorithm; thus, the methodology is suitable for real-time image restoration applications. The KLE image prefilter outperforms the temporal-average prefilter in restoring CM optical sections. The ringing metric uses simple binary morphological operations to quantify the ringing artifacts and agrees with visual observation of the artifacts in the restored images.
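A hedged sketch of the method-of-snapshots KLE prefilter: eigendecompose the small M x M snapshot correlation matrix and reconstruct from the basis vector with the largest eigenvalue, as the abstract describes. The rank-1 reconstruction detail is an assumption.

```python
import numpy as np

def kle_prefilter(snapshots):
    """Hedged sketch of KLE denoising via the method of snapshots.

    snapshots: (M, H, W) stack of M noisy acquisitions of the same
    optical section.  The M x M snapshot correlation matrix is
    eigendecomposed; the image is reconstructed from the basis vector
    with the largest eigenvalue, which the abstract reports removes
    most of the noise.
    """
    M = snapshots.shape[0]
    X = snapshots.reshape(M, -1).astype(np.float64)  # rows = snapshots
    C = X @ X.T / M                                  # M x M correlation
    vals, vecs = np.linalg.eigh(C)                   # ascending order
    v = vecs[:, -1]                                  # largest eigenvalue
    phi = v @ X                                      # KLE basis vector
    phi /= np.linalg.norm(phi)
    coeffs = X @ phi                                 # project snapshots
    recon = coeffs.mean() * phi                      # rank-1 reconstruction
    return recon.reshape(snapshots.shape[1:])
```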
A computer-based integrated imaging system (CIIS) using normalized cross-correlation (NCC) for resolution enhancement is proposed to extract accurate location data of 3-D objects. Elemental images (EIs) of the target and reference objects are picked up by lenslet arrays, and target and reference plane images with enhanced resolution are then reconstructed at the output plane using the CIIS technique. Through cross-correlation between the reconstructed reference and target plane images, the 3-D location data of the target objects in a scene can be robustly extracted. Our experiments show that the proposed correlation scheme provides good discrimination and detection performance for 3-D object recognition.
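At the core of the scheme is a normalized cross-correlation between reconstructed plane images; a minimal zero-mean implementation might look as follows (the full CIIS reconstruction from elemental images is beyond this sketch).

```python
import numpy as np

def ncc(a, b):
    """Zero-mean normalized cross-correlation score between two equally
    sized plane images -- the core similarity measure of the CIIS
    correlation scheme (illustrative implementation)."""
    a = a.astype(np.float64) - a.mean()
    b = b.astype(np.float64) - b.mean()
    return (a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
```

Scanning this score over candidate shifts or reconstruction depths and taking the arg-max yields the 3-D location estimate.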
Image segmentation and its performance evaluation are difficult but important problems in computer vision. A major challenge in segmentation evaluation comes from the fundamental conflict between generality and objectivity: for general-purpose segmentation, the ground truth and segmentation accuracy may not be well defined, while if the evaluation is embedded in a specific application, the results may not extend to other applications. We present a new benchmark to evaluate five different image segmentation methods according to their capability to separate a perceptually salient structure from the background with a relatively small number of segments. This way, we not only find a large variety of images that satisfy the requirement of good generality, but also construct ground-truth segmentations that achieve good objectivity. We also present a special strategy to address two important issues underlying this benchmark: (1) most image segmentation methods are not designed to directly extract a single salient structure; and (2) many real images have multiple salient structures. We apply this benchmark to evaluate and compare the performance of several state-of-the-art image segmentation methods: the normalized-cut method, the watershed method, the efficient graph-based method, the mean-shift method, and the ratio-cut method.
We propose a practical schema for semiautomatic segmentation of images of Arctic charr. The goal is to separate differently colored parts of the fish, especially the red abdominal areas, from the other parts. The novelty and importance of the proposed system lie in the construction of a working schema rather than in its individual components. The system is important to fisheries because the coloration of a fish is connected to its genetic quality and is often used to evaluate its health status. Quantitative analysis of this kind of information provides follow-up data and a more realistic view of a fish stock than basic visual evaluation. The schema takes into consideration the economic limitations of an ordinary fishery and the educational background of its personnel. The results are evaluated visually by experts and against a neural network solution.
The marked point process (MPP) provides a useful and theoretically well-established tool for integrating spatial information into the image analysis process. We consider the problem of detecting rolling leukocytes within intravital microscopy images. A first stage of the detection method reduces the problem to a set of candidate points, each representing a possible leukocyte. Our task is then to decide which points are actual leukocytes. We propose an MPP-based approach that aims at improving both the accuracy and the efficiency of the detection process by exploiting spatial interrelationships. We construct a Markov chain Monte Carlo algorithm to obtain the maximum a posteriori (MAP) estimate of the set of points corresponding to the centroids of the leukocytes observed in the image. The optimal solution, in terms of the MAP principle, is computed with respect to all leukocytes rather than a single leukocyte. A quantitative study of our detection approach demonstrates results that compare very well with those achieved by manual detection and exceed the solution quality of two competing methods. Our approach can serve as a fully automated substitute for the tedious and time-consuming manual detection of rolling leukocytes.
We use time-frequency (t-f) analysis techniques to examine the echo returns present in synthetic aperture radar (SAR) images of land-mine fields. A flying platform illuminates a mine field containing various types of mines and "confusers" with an ultra-wideband radar. A number of familiar time-frequency distributions are used to inspect the possible mine locations above and below the ground surface. The two-dimensional plots generated by these distributions offer a large variety of features and clues that facilitate the discrimination of each mine type from the others and from possible "confusers." The results confirm that the pseudo-Wigner–Ville and Choi–Williams distributions provide the best discrimination, as was pointed out in earlier work. Larger mines, such as those denoted here as "type 1," are the easiest to discriminate. Comparison of mines to clutter objects ("confusers") shows that such objects are clearly distinguishable from all the metal mines present.
We describe a method for automatically classifying image-quality defects on printed documents. The proposed approach accepts a scanned image in which the defect has been localized a priori and performs several image processing steps to expose the region of interest. A mask is then created from the exposed region to identify bright outliers. Morphological reconstruction techniques are then applied to emphasize relevant local attributes. The classification of the defects is accomplished via a customized tree classifier that applies size or shape attributes at the corresponding nodes to yield binary decisions. Applications of this process include timely automated or assisted diagnosis and repair of printers and copiers in the field. The proposed technique was tested on a database of 276 images of synthetic and real-life defects, achieving 94.95% accuracy.
Currently, inspecting the quality of a golf ball logo requires much time and manpower, and there is a pressing need to reduce the escape rate. We present a preliminary but innovative study on the development of an automatic optical inspection (AOI) system for golf logo quality. This study includes the development of algorithms for logo matching, logo degradation measurement, and logo alignment. Logo contour features and spatial layout are implemented in the matching algorithm, whose feasibility is assured regardless of large variations in logo orientation. The proposed degradation measure is based on local window masking and coding and is capable of detecting logo contours with small extrusion or intrusion defects. A positioning mechanism is specifically designed for automatic logo alignment: good alignment is achieved through a proposed sequential control of ball orientation, which enhances the discriminating power of the degradation measure. The experiments conducted support the applicability of this approach. In one example the escape rate vanishes while the false-alarm rate is 1.23%; a good balance between the escape rate and the false-alarm rate can be reached with operational experience or further experiments. For practical implementation, parallel computing is suggested to reduce the overall processing time.
Of the two main schemes of digital video coding, variable bit rate (VBR) encoding is generally considered better than constant bit rate (CBR) encoding in terms of efficiency and encoding quality, because it retains the same quantization parameters for the whole encoding procedure (unconstrained VBR) instead of altering them according to an adaptive rate algorithm. To test this generally accepted statement, we present a quantitative comparison of the perceptual efficiency of VBR versus CBR for Moving Picture Expert Group-4 (MPEG-4) ASP CIF and QCIF encoded sequences. The comparison shows that VBR does not significantly outperform the corresponding CBR encoding quality: the deduced perceptual advantage of VBR over CBR for CIF is approximately 4 to 5% and is constant for all encoding bit rates greater than 200 kbps, while for QCIF the ratio drops to approximately 2.5%.
We propose a method for obtaining high-accuracy subpixel motion estimates using phase correlation. Our method is motivated by recently published analysis according to which the Fourier inverse of the normalized cross-power spectrum of a pair of images mutually shifted by a fractional amount can be approximated by a two-dimensional sinc function. We propose a modified version of such a function to obtain a subpixel motion estimate by means of variable-separable fitting in the vicinity of the maximum peak of the phase correlation surface. We demonstrate that our method outperforms, in terms of subpixel accuracy, not only other surface-fitting techniques but also the state of the art in phase-correlation motion estimation, including the technique that motivated our work in the first place. Furthermore, our method performs particularly well in the presence of artificially induced additive white Gaussian noise and also offers better motion vector coherence in terms of zero-order entropy.
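A hedged sketch of the overall pipeline: normalized cross-power spectrum, inverse FFT, integer peak search, then subpixel refinement. For brevity, the refinement shown is a closed-form two-point estimator consistent with the sinc peak model, not the paper's variable-separable fit of a modified sinc function.

```python
import numpy as np

def subpixel_shift(img1, img2):
    """Hedged sketch: subpixel translation estimate via phase correlation.

    The sign convention of the returned (dy, dx) depends on the argument
    order; here img2 is treated as a shifted version of img1.
    """
    F1, F2 = np.fft.fft2(img1), np.fft.fft2(img2)
    R = F1 * np.conj(F2)
    R /= np.abs(R) + 1e-12                 # normalized cross-power spectrum
    surf = np.fft.ifft2(R).real            # phase correlation surface
    h, w = surf.shape
    py, px = np.unravel_index(surf.argmax(), surf.shape)

    def refine(c0, cm, cp):
        # two-point estimator consistent with the sinc peak model
        if abs(cp) >= abs(cm):
            return cp / (cp + c0 + 1e-12)
        return -cm / (cm + c0 + 1e-12)

    dy = refine(surf[py, px], surf[(py - 1) % h, px], surf[(py + 1) % h, px])
    dx = refine(surf[py, px], surf[py, (px - 1) % w], surf[py, (px + 1) % w])
    sy, sx = py + dy, px + dx
    if sy > h / 2: sy -= h                 # wrap to signed shifts
    if sx > w / 2: sx -= w
    return sy, sx
```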
We combine predictive hexagonal pattern and partial distortion searches with the recently proposed constrained one-bit transform-based motion estimation scheme to reduce the computational load of the motion estimation process. Furthermore, the kernel used to obtain the one-bit images is simplified. Experimental results show significant reduction of the number of average search points, with only a slight loss in motion estimation accuracy.
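A minimal sketch of the one-bit transform side of such a scheme: a simplified kernel (a box filter here, standing in for the original multiband kernel) produces the binary images, and the matching cost is the number of non-matching points computed with XOR. The predictive hexagonal search, partial distortion search, and constraint mask are omitted.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def one_bit_transform(img, size=17):
    """Binary image: each pixel is compared against a locally filtered
    version of the frame.  A box filter stands in for the multiband
    kernel here (illustrative simplification)."""
    return img >= uniform_filter(img.astype(np.float64), size)

def nnmp(bits_cur, bits_ref, dy, dx):
    """Number of non-matching points (the 1BT matching cost) between the
    current binary image and the (dy, dx)-displaced reference, computed
    with XOR; a block-based search would apply this per block."""
    shifted = np.roll(bits_ref, (-dy, -dx), axis=(0, 1))
    return int(np.count_nonzero(bits_cur ^ shifted))
```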
We propose a robust image watermarking scheme that applies the fast Hadamard transform (FHT) to small blocks computed from the four discrete wavelet transform (DWT) subbands. Different transforms have different properties that can effectively match various aspects of a signal's frequencies. Our approach consists of four main steps: (1) the original image is decomposed into four subbands; (2) the subbands are divided into blocks; (3) the FHT is applied to each block; and (4) the singular-value decomposition (SVD) is applied to the watermark image before its singular values are distributed over the DC components of the transformed blocks. The proposed technique effectively improves the data embedding system, the watermark imperceptibility, and the resistance to a wide range of intentional attacks. The experimental results demonstrate the improved performance of the proposed method in comparison with existing techniques in terms of watermark imperceptibility and robustness against attacks.
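A minimal sketch of step (3) under stated assumptions (8 x 8 blocks, orthonormal scaling); the surrounding DWT, SVD, and embedding steps are omitted.

```python
import numpy as np
from scipy.linalg import hadamard

def fht_blocks(subband, n=8):
    """Hedged sketch: apply the Hadamard transform to each n x n block
    of a DWT subband.  H @ B @ H, with H the order-n Hadamard matrix
    scaled by 1/sqrt(n), is the 2-D transform whose (0, 0) 'DC'
    coefficients would later receive the watermark's singular values.
    """
    H = hadamard(n) / np.sqrt(n)          # orthonormal, symmetric
    h, w = subband.shape
    out = subband.astype(np.float64).copy()
    for i in range(0, h - n + 1, n):
        for j in range(0, w - n + 1, n):
            out[i:i+n, j:j+n] = H @ out[i:i+n, j:j+n] @ H
    return out
```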
Most digital cameras incorporate a nonlinear mapping from the amount of light falling onto the imaging sensor to the picture's intensity. This nonlinear mapping, however, causes problems in linear image processing operations such as depth from defocus (DFD). We address the erroneous effects of the nonlinear response function of a digital still camera on DFD. A spatial-domain DFD technique, the S-transform, is employed to expose the effects of the nonlinear mapping. To enhance DFD performance, we undistort the camera response function: comparagrams of differently exposed images of the same scene are used to estimate the response function, and the estimate is then fitted with a piecewise linear function. To compare DFD performance, we use four different S-transform approaches for automatic focusing (AF) of the camera. Compensating for the nonlinear response function enhances the camera's AF performance and yields more accurate blur parameter estimates.
We describe a new type of depth-fused 3-D (DFD) perception that occurs when watching a display system that uses two stereoscopic displays in place of the two 2-D displays of a conventional DFD display. Subjective tests revealed that two 3-D images of the same shape, displayed by the two stereoscopic displays, were fused into one 3-D image when viewed as overlapping, just as two 2-D images are fused in a conventional DFD display. The perceived depth of the fused 3-D image depends on both the luminance ratio of the two 3-D images and their depth as specified by binocular disparity. This result demonstrates that DFD perception is dominated by the effects of binocular disparity and image intensity, i.e., the effect of the depth of focus is much weaker.