This paper describes various illumination and image processing techniques for yarn characterization. Darkfield and back-lit illumination are compared in terms of depth-of-field tolerance and image quality. Experiments show that back-lit illumination is superior in terms of depth-of-field tolerance and contrast. Three different back-lit illumination configurations are studied: the first simply employing a light source placed behind the yarn, the second incorporating a field lens to increase the light intensity passing through the aperture, and the third using a mirror placed at 45° to the optical axis to enable imaging of two orthogonal views of the yarn core. Problems in defining the hair–core boundaries in high-resolution yarn images are addressed, and a filtering process is introduced for back-lit images. A comparison of the diameter and diameter coefficient of variation percentage measurements for the different illumination and image processing techniques is given for several yarn samples. The data are also correlated with Premier 7000 diametric irregularity tester and Uster Tester 3 irregularity measurements.
The purpose of this study is to compare the performance of a complementary metal–oxide–semiconductor (CMOS)-based digital x-ray imaging system with that of a charge-coupled device (CCD)-based system for small animal research. A CMOS-based digital x-ray imaging system was developed and tested. The core of this system is a detector module consisting of eight abutted CMOS chips, each with 512×1024 pixels and a readout unit on the side. The pixel size of the CMOS detectors is 0.048 mm. The contrast-detail detectability of the CMOS-based system was studied using different phantoms and compared with that of a CCD-based digital imaging system. The contrast-detail curves of the CMOS-based imaging system, obtained from observer-based studies, are highly comparable to those of the CCD-based imaging system, particularly at higher x-ray exposures. Images of the fine structures of a mouse acquired by the CMOS system demonstrated its capability for small animal studies. With its potential for integration, manufacturability, and low cost, CMOS-based imaging systems could be used in animal studies and potentially become useful clinical tools for diagnosis.
Tone mapping refers to the conversion of luminance values recorded by a digital camera or other acquisition device to the luminance levels available from an output device, such as a monitor or a printer. Tone mapping can improve the appearance of rendered images. Although a variety of algorithms are available, there is little information about the image tone characteristics that produce pleasing images. We devised an experiment in which preferences for images with different tone characteristics were measured. The results indicate that there is a systematic relation between image tone characteristics and perceptual image quality for images containing faces. For these images, a mean face luminance level of 46–49 CIELAB L* units and a luminance standard deviation (taken over the whole image) of 18 CIELAB L* units produced the best renderings. This information is relevant for the design of tone-mapping algorithms, particularly as many images taken by digital camera users include faces.
J. Electron. Imag. 14(2), 023004 (1 April 2005) doi:10.1117/1.1900135
Biological evolution has adapted human vision to terrestrial light contrasts. Earthly scenes have a typical bimodal contrast histogram, with a reflection mode and a shadow mode. Consequently, human sensitivity to gray-scale differences is also bimodal. Luminance differences are most discriminable at the modal intensities of the terrestrial contrast distribution, so that vision conveys maximum information about the world. By inverting this biological relationship, an image can be computationally adapted to the human visual system. Using Paul Whittle's model of gray-scale sensitivity as a basis, distinct pixel intensities in image data are mapped to optimally discriminable displayed luminances. Since this approach is scene dependent and display dependent, the optimum gray scale must be recomputed for each displayed image, background luminance, and display environment.
We consider the problem of restoring a noisy blurred image using an adaptive unsharp mask filter. Starting with a set of very high quality images, we use models for both the blur and the noise to generate a set of degraded images. With these image pairs, we optimally train the strength parameter of the unsharp mask to smooth flat areas of the image and to sharpen areas with detail. We characterize the blur and the noise for a specific hybrid analog/digital imaging system in which the original image is captured on film with a low-cost analog camera. A silver-halide print is made from this negative, and the print is scanned to obtain a digital image. Our experimental results for this imaging system demonstrate the superiority of the optimal unsharp mask over a conventional unsharp mask with fixed strength.
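The core idea of the abstract above, an unsharp mask whose strength adapts to local detail, can be illustrated with a minimal sketch. The thresholded activity measure and the parameter values here are illustrative assumptions, not the paper's trained parameters.

```python
import numpy as np

def gaussian_kernel(size=5, sigma=1.0):
    """1-D Gaussian kernel for separable blurring."""
    x = np.arange(size) - size // 2
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def blur(img, sigma=1.0):
    """Separable Gaussian blur applied along rows, then columns."""
    k = gaussian_kernel(sigma=sigma)
    tmp = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, img)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, tmp)

def adaptive_unsharp(img, lam_flat=0.0, lam_detail=1.5, t=10.0):
    """Unsharp mask whose strength varies spatially: flat areas get
    lam_flat (no boost), detailed areas get lam_detail (sharpening)."""
    low = blur(img)
    high = img - low                 # high-pass residual
    activity = np.abs(high)          # crude local-detail measure (assumption)
    lam = np.where(activity > t, lam_detail, lam_flat)
    return np.clip(img + lam * high, 0, 255)
```

A trained system would replace the fixed threshold with a strength function learned from degraded/original image pairs, as the abstract describes.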
Electrophotographic printing processes are inherently subject to analog instability, which causes stochastic reproduction of small dots and degrades image quality. To overcome this instability, this paper presents a new dispersed-dot halftoning technique for high-resolution electrophotography. We combine a Gaussian filter with a sigmoid nonlinear function to compute the probability of toner transfer in print, based on the characteristics of electrophotography. This nonlinear printer model lets us predict the image that will be printed, and with it we can produce halftoned images of good quality with small perceptual error relative to the original gray-scale image. To achieve this, we rely on iterative improvement to remove as many unstable pixels as possible, i.e., pixels whose transfer probabilities lie within a band in the nonlinear printer model. Experimental results show a marked reduction in perceptual error compared to conventional cluster-dot halftoning.
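A minimal sketch of the Gaussian-plus-sigmoid printer model described above: the binary halftone is blurred, and a sigmoid maps the blurred value to a toner transfer probability; pixels whose probability falls inside a band are flagged as unstable. The kernel size, gain, offset, and band are illustrative assumptions.

```python
import numpy as np

def transfer_probability(halftone, sigma=1.0, gain=8.0, offset=0.5):
    """Gaussian blur of the binary halftone followed by a sigmoid gives
    each pixel's probability of toner transfer (parameters are assumed)."""
    x = np.arange(-3, 4, dtype=float)
    k = np.exp(-x**2 / (2 * sigma**2))
    k /= k.sum()
    spread = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, halftone)
    spread = np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, spread)
    return 1.0 / (1.0 + np.exp(-gain * (spread - offset)))   # sigmoid nonlinearity

def unstable_pixels(prob, band=(0.25, 0.75)):
    """Pixels whose transfer probability lies within the band are 'unstable'
    and are candidates for removal by iterative improvement."""
    return (prob > band[0]) & (prob < band[1])
```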
Error diffusion is a popular halftoning algorithm that, in its most widely used form, is inherently serial. As a serial algorithm, error diffusion offers limited opportunity for large-scale parallelism. In some implementations, it may also generate excessive bus traffic between the on-chip processor and the off-chip memory used to store the modified continuous-tone image and the halftone image. We introduce a new error diffusion algorithm in which the image is processed in two groups of interlaced blocks. Within each group, the blocks may be processed entirely independently. In the first group, the error diffusion proceeds along an outward spiral from the center of each block. Errors along the boundaries of blocks in the first group are diffused into neighboring blocks in the second group, within which the error diffusion spirals inward. A tone-dependent error diffusion training framework is used to eliminate artifacts associated with the spiral scan paths. We demonstrate image quality close to that achieved by conventional line-by-line error diffusion.
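To make the serial dependency concrete, here is a sketch of conventional raster-order error diffusion with the classic Floyd–Steinberg weights (the baseline the abstract refers to, not the proposed block-interlaced algorithm): every pixel's output depends on the accumulated errors of all earlier pixels.

```python
import numpy as np

def floyd_steinberg(img):
    """Serial error diffusion: each pixel's quantization error is pushed
    to not-yet-processed neighbors, so pixel (y, x) depends on all earlier
    pixels in raster order -- the dependency the block-interlaced scheme
    is designed to break."""
    f = img.astype(float).copy()
    h, w = f.shape
    out = np.zeros_like(f)
    for y in range(h):
        for x in range(w):
            old = f[y, x]
            new = 255.0 if old >= 128.0 else 0.0
            out[y, x] = new
            err = old - new
            if x + 1 < w:
                f[y, x + 1] += err * 7 / 16
            if y + 1 < h:
                if x > 0:
                    f[y + 1, x - 1] += err * 3 / 16
                f[y + 1, x] += err * 5 / 16
                if x + 1 < w:
                    f[y + 1, x + 1] += err * 1 / 16
    return out
```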
Spectral characterization involves building a model that relates the device-dependent representation to the reflectance function of the printed color, usually represented by a large number of reflectance samples at different wavelengths. Look-up-table-based approaches, conventionally employed for colorimetric device characterization, cannot be easily scaled to multispectral representations; methods for the analytical description of devices are required instead. The article describes an innovative analytical printer model based on the Yule–Nielsen spectral Neugebauer equation, formulated with a large number of degrees of freedom in order to account for dot gain, ink interactions, and printer-driver operations. The model's parameters are estimated with genetic algorithms. No assumption is made concerning the sequence of inks during printing, and the printers are treated as RGB devices (the printer-driver operations are included in the model). We have tested our characterization method, which requires only about 130 measurements to train the learning algorithm, on four different inkjet printers, using different kinds of paper and drivers. The test set used for model evaluation was composed of 777 samples uniformly distributed over the RGB color space.
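For reference, the standard form of the Yule–Nielsen modified spectral Neugebauer equation on which the model above is based (the paper's formulation adds further degrees of freedom) is

```latex
R(\lambda)^{1/n} = \sum_{i} w_i \, R_i(\lambda)^{1/n}
```

where $R_i(\lambda)$ are the measured reflectances of the Neugebauer primaries, $w_i$ their fractional-area (Demichel) weights, and $n$ the empirical Yule–Nielsen factor accounting for optical dot gain.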
A novel image compression technique employing the self-organized clustering capability of the Fuzzy-ART neural network and 2-D run-length encoding is presented. Initially, the image is divided into 4×4 blocks, and the 16-element vectors representing the pixels in each block are applied to the Fuzzy-ART network for classification. The image is then represented by block codes, consisting of the sequence of class indices, and a codebook, consisting of the class indices and their respective gray levels. Further compression is achieved by 2-D run-length encoding, which exploits the repetitions of class indices in the block codes in the x and y directions. By controlling the vigilance parameter of Fuzzy-ART, reasonable compression can be obtained without sacrificing image quality. The experimental results show that the proposed method can be used in image communication systems where a large compression ratio is required. An efficient technique for automatically computing the vigilance parameter from the image characteristics, for optimum compression, is also presented. With the introduction of a new class of Fuzzy-ART network, namely Force-Class Fuzzy-ART, hardware implementation of the image compression module is made feasible. This architecture constrains the maximum number of classes at the output of the network by forcing new vectors into one of the closest existing categories.
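The run-length stage described above can be sketched for one scan direction: repeated class indices in the block-code sequence are collapsed into (index, run) pairs. This is the generic encoding idea, not the paper's exact 2-D scheme.

```python
def runlength_encode(indices):
    """Run-length encode a sequence of class indices: emit (index, run)
    pairs for consecutive repetitions along one scan direction."""
    runs = []
    prev, count = indices[0], 1
    for v in indices[1:]:
        if v == prev:
            count += 1
        else:
            runs.append((prev, count))
            prev, count = v, 1
    runs.append((prev, count))
    return runs

def runlength_decode(runs):
    """Invert the encoding: expand each (index, run) pair."""
    return [v for v, n in runs for _ in range(n)]
```

The 2-D variant applies the same collapsing along both the x and y directions of the block-code array.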
J. Electron. Imag. 14(2), 023010 (1 April 2005) doi:10.1117/1.1902995
In this paper, we discuss the visibility problem of digital watermarking from the standpoint of unobtrusiveness. A new approach to fully automatic watermark visibility computation is presented. In the spatial domain, visibility can be explained in terms of two aspects: contrast and luminance. Criteria for both contrast and luminance are proposed. Applying these criteria to local windows yields two kinds of images, called the T-image and the L-image. Experiments show that these two images provide a good measure of unobtrusiveness.
Many previous methods for image thresholding focused on developing automatic algorithms to determine thresholds. However, most of these methods require time-consuming computation for multilevel thresholding, so a fast and automatic thresholding method is desirable for real-time applications. This paper proposes a new and faster method for bilevel as well as multilevel image thresholding. The proposed method is developed by taking (partial) derivatives of the image between-class variance with respect to the gray levels. For bilevel thresholding, a nonlinear equation is derived and solved for an optimal threshold. For multilevel thresholding, a set of nonlinear equations is derived and solved for a set of optimal thresholds. A parameter is introduced to determine the number of classes for image classification, by subjective determination of the ratio of image features to be kept after classification. A statistical performance analysis of the proposed method versus the Bayesian classifier is included. The thresholding computation of the proposed method and of Otsu's method [N. Otsu, "A threshold selection method from gray-level histograms," IEEE Trans. Syst. Man Cybern. SMC-9, 62–66 (1979)] is discussed. Several examples illustrate the feasibility of the proposed method and its superiority in computation speed.
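As context for the speed comparison above, here is a sketch of Otsu's classic bilevel method: an exhaustive search over all gray levels for the threshold maximizing the between-class variance. This is the baseline that the paper's derivative-based solution accelerates, not the proposed method itself.

```python
import numpy as np

def otsu_threshold(img, bins=256):
    """Otsu's bilevel threshold: exhaustively maximize the between-class
    variance w0*w1*(mu0 - mu1)^2 over all candidate gray levels."""
    hist, _ = np.histogram(img, bins=bins, range=(0, bins))
    p = hist / hist.sum()
    levels = np.arange(bins)
    best_t, best_var = 0, -1.0
    for t in range(1, bins):
        w0, w1 = p[:t].sum(), p[t:].sum()
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (levels[:t] * p[:t]).sum() / w0    # class means
        mu1 = (levels[t:] * p[t:]).sum() / w1
        var = w0 * w1 * (mu0 - mu1) ** 2         # between-class variance
        if var > best_var:
            best_var, best_t = var, t
    return best_t
```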
In this paper, we present a unified theory for, and performance evaluation of, ridge direction estimation through minimization of the integral of the second directional derivative of the gray-level intensity function. The primary emphasis of this paper is on ridge orientation estimation; the subsequent ridge detection can be performed using the traditional method of finding the zero crossings of the first directional derivative. The performance of the ridge orientation estimation is evaluated in terms of the mean orientation bias and orientation standard deviation given the true orientation, and the same two measures given the noise standard deviation. We discuss two forms of our new ridge detector: the first (ISDDRO-CN) uses a noise covariance matrix estimation procedure under a colored-noise assumption, and the second (ISDDRO-WN) uses a white-noise assumption. ISDDRO-CN performs better than ISDDRO-WN in the presence of strongly correlated noise; when noise levels are moderate, it performs as well as ISDDRO-WN. ISDDRO-CN also has superior noise sensitivity characteristics. We also compare both forms of our algorithm with the maximum level set extrinsic curvature (MLSEC) algorithm designed by A. López [IEEE Trans. Pattern Anal. Mach. Intell. 21, 327–335 (1999)].
The aim of this paper is to present a new method for estimating the instantaneous frequency of a frequency-modulated signal corrupted by additive noise. Any time-frequency representation of an acquired signal is concentrated around the instantaneous frequency law of its useful component (the projection of the ridges of the time-frequency representation onto the time-frequency plane) and diffuses its noise component. Hence, by extracting the ridges of the time-frequency representation, the instantaneous frequency of the useful component can be estimated. In this paper, a new time-frequency representation is proposed. Treating this representation as an image, its ridges can be extracted with the aid of mathematical morphology operators. This ridge detection mechanism produces the projection onto the time-frequency plane, and this projection is the result of the proposed estimation method. Simulations demonstrate the qualities of the method.
J. Electron. Imag. 14(2), 023014 (1 April 2005) doi:10.1117/1.1904066
In this paper we propose a Bayesian framework for unsupervised image fusion and joint segmentation. More specifically, we consider the case where we have observed images of the same object through different imaging processes or through different spectral bands (multi- or hyperspectral images). The objective of this work is to propose a coherent approach to combining these images and obtaining a joint segmentation, which can be considered the fusion result of these observations. The proposed approach is based on a hidden Markov modeling of the images, where the hidden variables represent the common classification or segmentation labels. These label variables are modeled by the Potts Markov random field. We propose two particular models for the pixels in each segment (i.i.d. or Markovian) and develop appropriate Markov chain Monte Carlo algorithms for their implementation. Finally, we present simulation results showing the relative performance of these models and mention potential applications of the proposed methods in medical imaging and in survey and security imaging systems.
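For reference, the Potts Markov random field used above as the label prior has the standard form

```latex
p(z) \;\propto\; \exp\Big( \beta \sum_{(i,j) \in \mathcal{N}} \delta(z_i, z_j) \Big)
```

where $z$ is the label field, $\mathcal{N}$ the set of neighboring pixel pairs, $\delta$ the Kronecker delta, and $\beta > 0$ a smoothness parameter favoring spatially coherent segments; the MCMC algorithms sample the labels from the resulting posterior.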
This paper presents a method for understanding the concept of a complex object. The proposed method is part of the shape understanding approach developed by the authors. Its main novelty is that the process of understanding is tied to a visual concept represented as the symbolic name of a possible class of shapes. The possible classes of shapes, viewed as hierarchical structures, are incorporated into the shape model. At each stage of the reasoning process that leads to assigning an examined object to one of the possible classes, novel processing methods are used. These methods are very efficient because they deal with a very specific class of shapes. They are implemented as a module of the shape understanding system and tested on broad classes of shapes. The system is able to perform different shape analysis and recognition tasks based on its ability to understand different concepts of shape at different levels of cognition. The system consists of different types of experts that perform different processing and reasoning tasks.
Content-based retrieval for the comparative analysis of mammograms containing masses is presented as one part of a larger project on the development of a content-based image retrieval system for computer-aided diagnosis of breast cancer. In response to a query, masses characterized by objectively determined values related to specific mammographic features are retrieved from a database. The retrieved mammograms and their associated patient information may be used to support the radiologist's decision-making process when examining difficult-to-diagnose cases. We investigate the use of objective measures of shape, edge sharpness, and texture to retrieve mammograms with similar masses. Experiments were conducted with 57 mass regions (20 malignant and 37 benign) in mammograms. Three shape factors representing compactness, fractional concavity (Fcc), and spiculation index; Haralick's 14 statistical texture features; and four edge-sharpness measures were computed as indices for each mass region. The feature values were evaluated with linear discriminant analysis, logistic regression, and the Mahalanobis distance for their effectiveness in classifying the masses as benign or malignant. The three most effective features, Fcc, acutance (A), and sum entropy (F8), were selected from the 21 computed features based on the area under the receiver operating characteristic curve and logistic regression. Linear discriminant analysis with Fcc resulted in the highest sensitivity of 100% and specificity of 97%. The texture feature F8 and acutance A yielded average accuracies of 61% and 74%, respectively. A measure of retrieval accuracy known as precision was determined to be 91% when using the three selected features; the shape measure of fractional concavity on its own yielded a precision rate of 95%. The proposed methods should lead to an efficient tool for computer-aided diagnosis of breast cancer.
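As a concrete illustration of one of the shape factors above, here is a minimal sketch of a normalized compactness measure; the exact normalization used in the paper may differ.

```python
import math

def compactness(perimeter, area):
    """Normalized compactness shape factor P^2 / (4*pi*A): equals 1 for a
    perfect disk and grows for irregular (e.g., spiculated) contours.
    Illustrative normalization, not necessarily the paper's."""
    return perimeter ** 2 / (4.0 * math.pi * area)
```

For a unit circle (P = 2π, A = π) the measure is exactly 1; a unit square (P = 4, A = 1) scores about 1.27, reflecting its less compact boundary.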
An implementation of parametric snakes for object tracking via generalized deterministic annealing (GDA) is proposed. Given an arbitrary energy functional that quantifies the quality of the contour solution, GDA computes the snake position by approximating the solution given by stochastic simulated annealing. First, the Markov chain representing the solution space for the snake position is broken into N smaller, local Markov chains representing the position of each discrete snake sample. At each annealing temperature, GDA directly approximates the stationary distribution of the local Markov chains using a mean-field approximation for neighboring snake sample positions, and the final distribution reveals the solution. In contrast to the typical implementation via gradient descent, annealing methods can avoid suboptimal local solutions and can compute snakes that are effective in the presence of severe noise and distant initial positions. Unlike simulated annealing, GDA does not rely on random moves to slowly locate a high-quality solution and is thus appropriate for time-critical applications. Synthetic experiments (on 231 images) compare the edge localization performance of snakes computed by GDA, simulated annealing, and gradient descent under varying noise and varying initial snake positions. The effectiveness of GDA is also demonstrated in a challenging real-data application (on 910 images) in which white blood cells are tracked in video microscopy.
Integral imaging is a technique capable of displaying images with continuous parallax in full natural color. This paper presents a method of extracting a depth map from integral images through viewpoint image extraction. The approach starts with the construction of special viewpoint images from the integral image; each viewpoint image contains a two-dimensional parallel recording of the three-dimensional scene. A new mathematical expression giving the relationship between object depth and the corresponding viewpoint-image-pair displacement is derived by geometrically analyzing the integral recording process, so depth can be calculated from the corresponding displacement between two viewpoint images. A modified multibaseline algorithm, in which the baseline is defined as the sample distance between two viewpoint images, is adopted to integrate the information from multiple extracted viewpoint images. The developed depth extraction method is validated on and applied to both real photographic and computer-generated unidirectional integral images. In the photographic example, the depth measuring solution gives a precise description of the object thickness with an error of less than 0.3%.
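The paper derives its own expression from the integral recording geometry; the generic triangulation relation that underlies such displacement-based depth recovery (symbols here are illustrative, not the paper's notation) is

```latex
z = \frac{f \, B}{\Delta}
```

where $z$ is the object depth, $f$ the effective focal length, $B$ the baseline (here, the sample distance between two viewpoint images), and $\Delta$ the corresponding displacement between the viewpoint image pair: larger displacement implies a closer object.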
Conventional integral three-dimensional images, whether acquired by cameras or generated by computers, suffer from a narrow viewing range. Many methods to enlarge the viewing range of integral images have been suggested; however, so far they all involve modifications of the optical systems, which normally make the system more complex and may introduce other drawbacks in some designs. Based on observation and study of computer-generated integral images, this paper quantitatively analyzes the viewing properties of integral images in the conventional configuration and their limitations. To improve the viewing properties, a new model, the maximum viewing width (MVW) configuration, is proposed. MVW-configured integral images achieve the maximum viewing width on the viewing line at the optimum viewing distance, and a greatly extended viewing width around the viewing line, without any modification of the original optical display systems. In normal applications, an MVW integral image also has better viewing-zone transition properties than conventional images. Considerations in the selection of optimal parameters are discussed, and new definitions related to the viewing properties of integral images are given. Finally, two potential application schemes of MVW integral images besides computer generation are described.
A depth-fused three-dimensional (DFD) display, composed of two two-dimensional (2-D) images displayed at different depths, enables an observer to perceive a three-dimensional image without extra equipment. The original data for the display are a 2-D image and a depth map of the objects. The two displayed 2-D images are formed by dividing the luminance of the 2-D object image between them according to the depth data of the objects. This paper examines the effect of compressing the depth map on a DFD image. Subjective evaluations of still pictures using JPEG revealed that compression noise in the decoded image appears as depth position errors in the DFD image; nevertheless, the depth map can be represented with considerably less data than a conventional 2-D image, which makes compressing the depth map advantageous when transmitting a DFD image.
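The luminance-division rule described above can be sketched directly: a pixel's luminance is split between the front and rear planes in proportion to its normalized depth, so that the perceived depth lies between the two planes. The linear split and the depth convention (0 = front, 1 = rear) are assumptions for illustration.

```python
def dfd_split(luminance, depth):
    """Divide a pixel's luminance L between the front and rear display
    planes according to normalized depth d in [0, 1] (0 = front plane,
    1 = rear plane). The sum always equals the original luminance."""
    front = luminance * (1.0 - depth)
    rear = luminance * depth
    return front, rear
```

A JPEG error in the depth value shifts how the luminance is divided, which is why compression noise in the depth map appears as a position error in perceived depth rather than as a tonal artifact.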
Active depth from defocus (DFD) eliminates the main limitation faced by passive DFD, namely its inability to recover depth when dealing with scenes defined by weakly textured (or textureless) objects. This is achieved by projecting a dense illumination pattern onto the scene and depth can be recovered by measuring the local blurring of the projected pattern. Since the illumination pattern forces a strong dominant texture on imaged surfaces, the level of blurring is determined by applying a local operator (tuned on the frequency derived from the illumination pattern) as opposed to the case of window-based passive DFD where a large range of band pass operators are required. The choice of the local operator is a key issue in achieving precise and dense depth estimation. Consequently, in this paper we introduce a new focus operator and we propose refinements to compensate for the problems associated with a suboptimal local operator and a nonoptimized illumination pattern. The developed range sensor has been tested on real images and the results demonstrate that the performance of our range sensor compares well with those achieved by other implementations, where precise and computationally expensive optimization techniques are employed.
We introduce a novel model that captures user preference using a Bayesian approach in order to recommend preferred multimedia content. Unlike other preference models, our method traces the trend of a user's preference over time, which allows online learning and avoids exhaustive data collection. The trend is traced by modifying the frequencies of attributes so that old preferences are correlated with the current preference, under the assumption that the current preference is correlated with the near-future preference. To do this, the usage history data are partitioned into smaller sets along the time axis, and the attribute frequencies computed from each partition are weighted to reflect their significance in predicting the future preference. In the experimental section, learning and reasoning on user preferences for genres are performed by the proposed method on real TV viewing history data collected from many households. The reasoning performance of the proposed method is also compared with that of a typical method without training, demonstrating the superiority of the proposed method.
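The partition-and-weight idea above can be sketched as follows. The geometric decay weighting is an illustrative assumption; the paper's exact weighting scheme may differ.

```python
from collections import Counter

def weighted_frequencies(partitions, decay=0.5):
    """Time-weighted attribute frequencies: usage history is partitioned
    along the time axis (oldest first), and each partition's attribute
    counts are weighted so that recent behavior dominates. The geometric
    decay is an assumed weighting, not the paper's exact scheme."""
    n = len(partitions)
    totals = Counter()
    for i, part in enumerate(partitions):
        weight = decay ** (n - 1 - i)      # most recent partition gets weight 1
        for attr in part:
            totals[attr] += weight
    return totals

def predict_preference(partitions):
    """Most likely preferred attribute (e.g., a genre) under the weighted model."""
    return weighted_frequencies(partitions).most_common(1)[0][0]
```

For example, a viewer who watched mostly news long ago but drama recently is predicted to prefer drama, because the older partition's counts are discounted.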