As I start my term as Editor, the Journal of Electronic Imaging begins its tenth year of publication. The journal has come a long way since its first issue, and for that we owe two individuals a debt of gratitude: Paul Roetling and Ed Dougherty. I especially want to thank Ed Dougherty, who has served as Editor for the past six years.
SPECIAL SECTION ON STATISTICAL ISSUES IN PSYCHOMETRIC ASSESSMENT OF IMAGE QUALITY
The human visual system is central to imaging science. The idea of knowing the world through two-dimensional samples was born of our sense of sight. The human visual system inspires our designs for sensors and algorithms. Yet much of what we consider to be imaging science is in the realm of psychology. Color, for example, is a mental interpretation of sensory data. Quantification of color requires psychophysical techniques to map physical stimuli to cognitive responses.
Experimental methods and statistics derived from signal detection theory are frequently used to compare two imaging techniques, to predict human performance under different parameterizations of an imaging system, and to distinguish variables related to human visual perception from variables related to decision making. We review recent experimental results suggesting that the assumptions of signal detection theory are fundamentally unsound. Instead of shifting decision criteria under different priors, humans appear to alter the information assimilation process, representing images from categories with high prior probability more accurately (less variance) than images from categories with low prior probability. If this hypothesis is correct, detection theory measures such as d′ and area under the receiver operating characteristic may be misleading or incomplete. We propose an alternative approach that can be used to quantify the effects of suboptimal decision-making strategies without relying on a model of detection structure.
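As a quick reference for the detection-theory statistics mentioned in this abstract, the sketch below computes d′ and the implied ROC area from hit and false-alarm rates under the standard equal-variance Gaussian model; these are the textbook formulas, not anything taken from the paper itself.

```python
import numpy as np
from scipy.stats import norm

def d_prime(hit_rate, false_alarm_rate):
    """Equal-variance Gaussian sensitivity: d' = z(H) - z(F)."""
    return norm.ppf(hit_rate) - norm.ppf(false_alarm_rate)

def auc_from_d_prime(dp):
    """Area under the ROC implied by the equal-variance model."""
    return norm.cdf(dp / np.sqrt(2.0))

# Example: 80% hits, 20% false alarms
dp = d_prime(0.80, 0.20)
print(dp, auc_from_d_prime(dp))   # approx. 1.68 and 0.88
```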
In this paper, a perceptual color difference is presented as an alternative to the conventional color difference equations for complex images. This color difference is derived from the Mahalanobis distance, using covariance matrices of the differences in each color attribute. The covariance matrices for each class of images can be obtained from psychophysical experiments using just-noticeable differences in paired comparisons. We compare the resulting matrices for different classes of images; the information in each matrix gives useful trends and clues about which kinds of transformation can minimize the perceptual color difference in images when a transformation such as gamut mapping is required.
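To make the metric concrete, here is a minimal sketch of a Mahalanobis-type color difference; the covariance matrix shown is an arbitrary placeholder, whereas in the paper it would be estimated per image class from just-noticeable-difference experiments.

```python
import numpy as np

def mahalanobis_color_difference(c1, c2, cov):
    """Perceptual color difference as a Mahalanobis distance.

    c1, c2 : color coordinates (e.g., CIELAB L*, a*, b*) of two stimuli.
    cov    : covariance matrix of the attribute differences; in the paper
             this comes from psychophysical experiments, here it is only
             a placeholder.
    """
    d = np.asarray(c1, dtype=float) - np.asarray(c2, dtype=float)
    return float(np.sqrt(d @ np.linalg.inv(cov) @ d))

# Illustrative (not experimentally derived) covariance for one image class.
cov_example = np.diag([1.0, 0.8, 0.6])
print(mahalanobis_color_difference([52.0, 10.0, -5.0],
                                   [50.0, 12.0, -4.0], cov_example))
```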
This paper describes a more efficient paired comparison method that reduces the number of trials necessary for converting a table of paired comparisons into scalar data. Instead of comparing every pair of samples (the complete method), a partial method is used that makes more comparisons between closer samples than between more distant samples. A sorting algorithm is used to efficiently order the samples with paired comparisons, and each comparison is recorded. When the sorting is completed, more trials will have been conducted between closer samples than between distant samples. Regression is used to scale the resulting comparison matrix into a one-dimensional perceptual quality estimate.
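A rough sketch of the bookkeeping such a partial method needs is given below: a comparison sort in which a (here hypothetical) human judgment function supplies each verdict and every trial is logged in a win matrix. The regression step that converts the matrix into scale values is not shown.

```python
import functools

def sort_with_recorded_comparisons(samples, judge):
    """Order samples with a comparison sort while logging every trial.

    `samples` are hashable sample identifiers and `judge(a, b)` stands in
    for a human observer, returning True when sample `a` is preferred over
    sample `b`.  Comparison sorts spend most of their comparisons on items
    that end up close in the final ordering, so across repeated runs or
    observers more trials accumulate between closer samples, which is the
    idea of the partial method above.
    """
    wins = {(a, b): 0 for a in samples for b in samples if a != b}

    def cmp(a, b):
        if judge(a, b):
            wins[(a, b)] += 1
            return -1        # a ranked better than b
        wins[(b, a)] += 1
        return 1

    ordering = sorted(samples, key=functools.cmp_to_key(cmp))
    return ordering, wins
```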
We analyze data from a gamut-mapping experiment using several statistical procedures for ranked data. In this experiment, six gamut-mapping algorithms were applied to six different images and the results were ranked by 31 judges according to how well the images matched an original. We fitted two distance-based statistical models to the data: both analyses showed that aggregate preference among the six algorithms depended on the image viewed. Based on the first model, we classified the images into four classes or clusters. We applied unidimensional unfolding, a technique from mathematical psychology, to extract latent reference frames upon which judges plausibly ordered the algorithms. Four color experts gave interpretations of the derived reference frames. We used the second model to generate confidence sets for the consensus rankings and to carry out a second cluster analysis.
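For readers unfamiliar with distance-based analysis of rankings, the short sketch below computes the Kendall distance between two rankings and a simple Borda-style consensus ordering; the actual models fitted in the paper are more sophisticated than this.

```python
from itertools import combinations

def kendall_distance(r1, r2):
    """Number of item pairs ordered differently by two rankings.

    Rankings are lists of item labels from best to worst.  A distance of
    this kind underlies distance-based models for ranked data; the
    specific models used in the paper are not reproduced here.
    """
    pos1 = {item: i for i, item in enumerate(r1)}
    pos2 = {item: i for i, item in enumerate(r2)}
    return sum(1 for a, b in combinations(r1, 2)
               if (pos1[a] - pos1[b]) * (pos2[a] - pos2[b]) < 0)

def borda_consensus(rankings):
    """Simple consensus ordering by average rank (Borda count)."""
    items = rankings[0]
    avg = {it: sum(r.index(it) for r in rankings) / len(rankings) for it in items}
    return sorted(items, key=avg.get)
```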
CLUSTERING FOR IMAGE COMPRESSION AND DOCUMENT ANALYSIS
In this paper, a fast clustering algorithm (FCA) is proposed for vector quantization codebook production. The algorithm avoids iterative averaging of vectors and is based on collecting vectors with similar or closely similar characteristics to produce the corresponding clusters. FCA gives an increase in peak signal-to-noise ratio (PSNR) of about 0.3–1.1 dB over the LBG algorithm and reduces the computational cost of codebook production by 10%–60% at different bit rates. Two FCA modifications are also proposed: FCA with limited cluster size 1 and 2 (FCA-LCS1 and FCA-LCS2, respectively). FCA-LCS1 subdivides large clusters into smaller ones, while FCA-LCS2 reduces a predetermined threshold in steps until the required cluster size is reached. FCA-LCS1 and FCA-LCS2 give increases in PSNR of about 0.9–1.0 and 0.9–1.1 dB, respectively, over the FCA algorithm, at the expense of about 15%–25% and 18%–28% increases in the output codebook size.
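The following sketch shows the two generic ingredients any such comparison rests on: nearest-codeword vector-quantization coding and the PSNR figure of merit. It does not implement FCA or LBG themselves.

```python
import numpy as np

def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB, the figure of merit quoted above."""
    mse = np.mean((original.astype(float) - reconstructed.astype(float)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

def vq_encode_decode(blocks, codebook):
    """Map each image block to its nearest codeword (Euclidean distance).

    `blocks` is an (N, k) array of flattened image blocks and `codebook`
    an (M, k) array of codewords, however they were produced (LBG, FCA, ...).
    Returns the reconstructed blocks and the chosen indices.
    """
    d = ((blocks[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    indices = d.argmin(axis=1)
    return codebook[indices], indices
```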
In this work, the clustering and recognition of fonts in document images is addressed. Various font features and their clustering behavior are investigated. Font clustering is implemented both from a shape-similarity and from an OCR-performance point of view. A font recognition algorithm is developed that can identify the font group or the individual font from which a text was created.
Three-dimensional table interpolation techniques are now widely used in color management systems. These techniques are practical because complicated color conversions such as gamma conversion, matrix masking, under-color removal, or gamut mapping can be executed at once by use of a three-dimensional lookup table (3D LUT). However, in some cases the interpolated reproduction has visible artifacts that degrade neutral and color gradations. Several studies of interpolation accuracy have been published, but they focused on an average color difference derived from experiments on very few types of color conversions, with no theoretical explanation. This paper describes a theoretical evaluation of the errors and reproduced gradation curves obtained with three-dimensional interpolation for several nonlinear color conversions. Two types of errors are defined: the conventional error, and a ripple error, which is the difference between the piecewise-linear approximation and the interpolated curve; a gray gradation is used as the input image. The errors of a tetrahedral interpolation technique are also examined. Several nonlinear color conversions are tested, and significant ripples are found in the case of matrix-gamma conversion with negative coefficients and of the minimum (MIN) function. The error decreases as the distance between lattice points (D) decreases, but the rate of decrease differs considerably among conversions. From these analyses we obtain useful information about optimal 3D LUT sizes and about which conversions are suitable for three-dimensional interpolation.
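As background for the error analysis, the sketch below performs plain trilinear interpolation through a uniformly sampled 3D LUT; the tetrahedral variant examined in the paper splits each lattice cube into tetrahedra instead and is not shown.

```python
import numpy as np

def trilinear_lut(lut, rgb):
    """Interpolate one RGB value through a 3D lookup table.

    `lut` has shape (n, n, n, 3) with lattice points uniformly spaced on
    [0, 1]; `rgb` is a length-3 input in [0, 1].
    """
    n = lut.shape[0]
    pos = np.clip(np.asarray(rgb, float), 0.0, 1.0) * (n - 1)
    i0 = np.minimum(pos.astype(int), n - 2)      # lower lattice index per axis
    f = pos - i0                                 # fractional position in the cell
    out = np.zeros(3)
    for dr in (0, 1):
        for dg in (0, 1):
            for db in (0, 1):
                w = ((f[0] if dr else 1 - f[0]) *
                     (f[1] if dg else 1 - f[1]) *
                     (f[2] if db else 1 - f[2]))
                out += w * lut[i0[0] + dr, i0[1] + dg, i0[2] + db]
    return out
```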
Histogram equalization (HE) is one of the simplest and most effective techniques for enhancing gray-level images. For color images, HE becomes a more difficult task, due to the vectorial nature of the data. We propose a new method for color image enhancement that uses two hierarchical levels of HE: global and local. In order to preserve the hue, equalization is only applied to intensities. For each pixel (called the "seed" when being processed) a variable-sized, variable-shaped neighborhood is determined to contain pixels that are "similar" to the seed. Then, the histogram of the region is stretched to a range that is computed with respect to the statistical parameters of the region (mean and variance) and to the global HE function (of intensities), and only the seed pixel is given a new intensity value. We applied the proposed color HE method to various images and observed the results to be subjectively "pleasant to the human eye," with emphasized details, preserved colors, and with the histogram of intensities close to the ideal uniform one. The results compared favorably with those of three other methods (histogram explosion, histogram decimation, and three-dimensional histogram equalization) in terms of subjective visual quality.
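A minimal sketch of the global, hue-preserving level of such a scheme is shown below: the intensity histogram is equalized and each RGB vector is rescaled proportionally. The local, variable-neighborhood stage described in the abstract is not implemented, so this is only an illustration of the general idea.

```python
import numpy as np

def equalize_intensity(rgb):
    """Hue-preserving global equalization of a float RGB image in [0, 1].

    Only the intensity channel is equalized; each RGB vector is then
    rescaled proportionally so that hue is preserved.
    """
    intensity = rgb.mean(axis=2)                          # simple intensity
    hist, bins = np.histogram(intensity, bins=256, range=(0.0, 1.0))
    cdf = hist.cumsum() / hist.sum()                      # global HE mapping
    new_i = np.interp(intensity, bins[:-1], cdf)
    scale = np.where(intensity > 0, new_i / intensity, 0.0)
    return np.clip(rgb * scale[..., None], 0.0, 1.0)      # hue-preserving rescale
```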
Effective document compression algorithms require that scanned document images be first segmented into regions such as text, pictures, and background. In this paper, we present a multilayer compression algorithm for document images. This compression algorithm first segments a scanned document image into different classes, then compresses each class using an algorithm specifically designed for that class. Two algorithms are investigated for segmenting document images: a direct image segmentation algorithm called the trainable sequential MAP (TSMAP) segmentation algorithm, and a rate-distortion optimized segmentation (RDOS) algorithm. The RDOS algorithm works in a closed-loop fashion by applying each coding method to each region of the document and then selecting the method that yields the best rate-distortion trade-off. Compared with the TSMAP algorithm, the RDOS algorithm can often result in a better rate-distortion trade-off, and produces more robust segmentations by eliminating misclassifications that can cause severe artifacts. At similar bit rates, the multilayer compression algorithm using RDOS can achieve a much higher subjective quality than state-of-the-art compression algorithms, such as DjVu and SPIHT.
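The closed-loop selection at the heart of a rate-distortion optimized segmentation can be sketched generically as choosing, per region, the coder that minimizes D + λR; the snippet below shows only that selection rule, with hypothetical coder callbacks, not the actual RDOS cost or coder set from the paper.

```python
def rd_optimal_choice(region, coders, lam):
    """Pick the coder with the best rate-distortion trade-off for a region.

    `coders` maps a class name to a function returning (rate_bits,
    distortion) for the region; `lam` weighs rate against distortion.
    This is only the generic D + lambda*R selection rule.
    """
    costs = {}
    for name, code in coders.items():
        rate, distortion = code(region)
        costs[name] = distortion + lam * rate
    return min(costs, key=costs.get)
```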
In this paper we propose a method for computing JPEG quantization matrices for a given mean-square error (MSE) or peak signal-to-noise ratio (PSNR). We then employ the method to compute definition scripts for the JPEG standard's progressive operation mode using a quantization approach. It is therefore no longer necessary to use a trial-and-error procedure to obtain a desired PSNR and/or definition script, reducing cost. First, we establish a relationship between a Laplacian source and its uniform quantization error. We apply this model to the coefficients obtained in the discrete cosine transform stage of the JPEG standard. An image may then be compressed using the JPEG standard under a global MSE (or PSNR) constraint and a set of local constraints determined by the JPEG standard and visual criteria. Second, we study the JPEG standard's progressive operation mode from a quantization-based approach. A relationship between the measured image quality at a given stage of the coding process and a quantization matrix is found, so the definition-script construction problem reduces to a quantization problem. Simulations show that our method generates better quantization matrices than the classical method based on scaling the JPEG default quantization matrix. The PSNR estimate usually has an error smaller than 1 dB, and this error decreases for high PSNR values. Definition scripts can be generated that avoid an excessive number of stages and remove small stages that do not contribute a noticeable image-quality improvement during decoding.
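For contrast, the sketch below implements the classical baseline the paper argues against: bisecting on a scale factor applied to the JPEG default quantization matrix until a target PSNR is reached. The `encode_decode` callback is a placeholder for a JPEG-style quantize/reconstruct step; the paper's own model-based method avoids this loop entirely.

```python
import numpy as np

def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio in dB."""
    mse = np.mean((a.astype(float) - b.astype(float)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

def scale_matrix_for_psnr(image, default_q, encode_decode, target_db,
                          lo=0.05, hi=10.0, iters=20):
    """Trial-and-error baseline: bisect on a scale factor for the default
    quantization matrix until the decoded PSNR meets the target.

    `encode_decode(image, q)` is a placeholder that should return the image
    after DCT, quantization with matrix `q`, and reconstruction.
    """
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        q = np.clip(np.round(default_q * mid), 1, 255)
        if psnr(image, encode_decode(image, q)) >= target_db:
            lo = mid          # target met: try a coarser (larger) scale
        else:
            hi = mid          # quality too low: shrink the scale
    # return the matrix for the largest scale observed to meet the target
    return np.clip(np.round(default_q * lo), 1, 255)
```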
A spatially adaptive generalization of motion-compensated prediction that models correspondence as occurring within band-limited ranges of spatial frequencies is presented. An efficient estimation algorithm, frequency-adaptive block half-pixel interpolative prediction, is proposed for block-translational compensation in video compression. This algorithm decreases the mean-squared error (MSE) of temporal prediction for typical video sequences, capturing interframe correlations more accurately than a nonadaptive approach. The algorithm is implemented efficiently with the same hardware and software developed for traditional block-matching algorithms. Unlike generalized deformational motion models, which require an order of magnitude more computation for similar coding gains, the storage and computational requirements of the new algorithm are modest. Experiments demonstrate that wavelet coding of spatial interframes may be improved by 1 dB [peak signal-to-noise ratio (PSNR)] at medium bit rates. For H.263 coding with constant quantization, a 5% reduction in bit rate is achieved even at extremely low rates, with significant additional gains at higher rates. A detailed analysis of H.263 coding of the Foreman sequence demonstrates that a 30% bit rate reduction is possible for high-quality coding of frames that frequently violate a traditional block-translational motion model.
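As a point of reference, the sketch below is conventional full-pel block matching by minimum SAD, the baseline that half-pixel interpolative and frequency-adaptive prediction refine; neither refinement is implemented here.

```python
import numpy as np

def block_match(ref, cur, y, x, bsize=16, search=7):
    """Conventional full-pel block matching by minimum sum of absolute
    differences (SAD).

    Finds the displacement of the (bsize x bsize) block of `cur` at
    (y, x) within a +/- `search` window of the reference frame `ref`.
    """
    block = cur[y:y + bsize, x:x + bsize].astype(float)
    best, best_mv = np.inf, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            yy, xx = y + dy, x + dx
            if yy < 0 or xx < 0 or yy + bsize > ref.shape[0] or xx + bsize > ref.shape[1]:
                continue
            cand = ref[yy:yy + bsize, xx:xx + bsize].astype(float)
            sad = np.abs(block - cand).sum()
            if sad < best:
                best, best_mv = sad, (dy, dx)
    return best_mv, best
```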
In this paper, we present a perceptual measure that predicts the visibility of the well-known blocking effect in discrete cosine transform (DCT) coded image sequences. The main objective of this work is to use the measure for adaptive video postprocessing, in order to significantly improve the visual quality of the decoded video sequences at the receiver. The proposed measure is based on a visual model accounting for both the spatial and temporal properties of the human visual system; the input to the visual model is the distorted sequence only. Psychovisual experiments have been carried out to determine the eye's sensitivity to blocking artifacts by varying a number of visually significant parameters: background level and the spatial and temporal activities in the surrounding image. The measured visibility thresholds enable us to estimate the model parameters. The visual model is finally applied to real coded video sequences. The comparison of the measurement results with subjective tests shows that the proposed perceptual measure correlates well with subjective evaluation.
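A toy illustration of what a no-reference blocking measure looks at is given below: the average luminance step across 8x8 block boundaries relative to the step elsewhere. The paper's measure is far richer, modeling spatial and temporal masking, but the snippet shows the kind of signal being quantified.

```python
import numpy as np

def simple_blockiness(frame, block=8):
    """Crude spatial blockiness score: average luminance step across
    8x8 block boundaries divided by the average step elsewhere."""
    f = frame.astype(float)
    col_diff = np.abs(np.diff(f, axis=1))           # horizontal neighbour differences
    boundary = col_diff[:, block - 1::block]        # steps across vertical block edges
    mask = np.ones(col_diff.shape[1], dtype=bool)
    mask[block - 1::block] = False
    interior = col_diff[:, mask]
    return boundary.mean() / (interior.mean() + 1e-12)
```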
This paper proposes a novel technique to reduce noise while preserving edge sharpness during image filtering. The method is based on a multiresolution decomposition of the image by a discrete wavelet transform with a properly chosen wavelet basis. In the transform space, edges are implicitly located and preserved while image noise is filtered out. At each resolution level, geometric continuity is used to preserve edges that are not isolated. Finally, consecutive levels are compared to preserve edges that show continuity across scales. As a result, the proposed technique produces a filtered version of the original image in which homogeneous regions appear separated by well-defined edges. Possible applications include image presegmentation and image denoising.
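The sketch below shows one level of Haar wavelet shrinkage as a minimal stand-in for wavelet-domain denoising; it simply soft-thresholds the detail subbands and does not model the geometric-continuity and cross-scale criteria that the paper uses to protect edges.

```python
import numpy as np

def haar_denoise_one_level(img, threshold):
    """One-level Haar wavelet shrinkage (assumes an even-sized grayscale image)."""
    a = img.astype(float)
    # analysis: averages/differences along rows, then columns
    lo_r = (a[:, 0::2] + a[:, 1::2]) / 2.0
    hi_r = (a[:, 0::2] - a[:, 1::2]) / 2.0
    ll = (lo_r[0::2] + lo_r[1::2]) / 2.0
    lh = (lo_r[0::2] - lo_r[1::2]) / 2.0
    hl = (hi_r[0::2] + hi_r[1::2]) / 2.0
    hh = (hi_r[0::2] - hi_r[1::2]) / 2.0
    # soft-threshold only the detail subbands
    soft = lambda c: np.sign(c) * np.maximum(np.abs(c) - threshold, 0.0)
    lh, hl, hh = soft(lh), soft(hl), soft(hh)
    # synthesis: invert the two averaging/differencing steps
    lo_r = np.empty_like(a[:, 0::2])
    hi_r = np.empty_like(a[:, 0::2])
    lo_r[0::2], lo_r[1::2] = ll + lh, ll - lh
    hi_r[0::2], hi_r[1::2] = hl + hh, hl - hh
    out = np.empty_like(a)
    out[:, 0::2], out[:, 1::2] = lo_r + hi_r, lo_r - hi_r
    return out
```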
In this paper, a structure-adaptive approach to the estimation of image intensity for adaptive filtering is described in the context of a mixed noise model: Gaussian noise plus impulsive noise treated as outliers of the Gaussian distribution. Known adaptive filtering techniques often yield unsatisfactory results in this case because the outliers are confused with fine image details such as corner edges and thin lines. The proposed adaptive estimation procedure selects, from multiple available structuring regions, the region that best fits the neighborhood of the current point according to the maximum a posteriori probability principle. The image intensity at the current point is then estimated robustly from the sample of pixels in the selected structuring region. The described method suppresses mixed noise while leaving the original image, including corner edges and other fine details, undamaged.
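A simplified version of the idea, with variance standing in for the paper's maximum a posteriori fitting criterion and the median as the robust estimator, might look like the following; the candidate structuring regions are supplied as lists of pixel offsets.

```python
import numpy as np

def structure_adaptive_estimate(img, y, x, regions):
    """Pick the structuring region that best fits the local structure and
    estimate the pixel value robustly from it.

    `regions` is a list of candidate window shapes, each given as a list of
    (dy, dx) offsets.  Fit is judged here by sample variance and the robust
    estimate is the median; both are illustrative simplifications.
    """
    best_var, best_sample = np.inf, None
    for offsets in regions:
        vals = []
        for dy, dx in offsets:
            yy, xx = y + dy, x + dx
            if 0 <= yy < img.shape[0] and 0 <= xx < img.shape[1]:
                vals.append(float(img[yy, xx]))
        if len(vals) >= 3:
            v = np.var(vals)
            if v < best_var:
                best_var, best_sample = v, vals
    return np.median(best_sample) if best_sample is not None else float(img[y, x])
```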
Restoration of document and text images has become increasingly important in many areas of electronic imaging. This paper presents an automated system to restore low-resolution document and text images. It uses resolution expansion to enhance low-resolution images for optical character recognition as well as to improve the quality of degraded images. Several approaches to resolution expansion have been proposed in the past, such as linear interpolation and cubic spline expansion. The proposed system implements a bimodal-smooth-average (BSA) scoring function as an optimality criterion for image quality. The BSA approach differs from existing methods in that it combines three measures (bimodality, smoothness, and average) into a scoring function that serves as the optimality criterion. The idea is to create, for a given image, a strongly bimodal image with smooth regions in both the foreground and background, while allowing sharp discontinuities at the edges. The resolution-expanded image is then obtained by solving a nonlinear optimization problem subject to the constraint that the average of the resolution-expanded image must equal that of the original, unexpanded image. The system can be used to restore both binary and grayscale images as well as video frames. Its capability is demonstrated experimentally to be quantitatively and qualitatively superior to standard interpolation methods.
Morphological image processing has been widely used to process binary and grayscale images, with morphological techniques being applied to noise reduction, image enhancement, and feature detection. Relying on an ordering of the data, morphology modifies the geometrical aspects of an image: object contours in binary images and object surfaces in grayscale images. Extending morphological operators to color image processing has been problematic because it is not easy to define the geometry of a vector-valued function, and the ordering of vectors is not straightforward. We propose a new set of color morphological operators based on a combination of reduced ordering and conditional ordering of the underlying data.
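One common way to realize such an ordering, sketched below, is to rank the color vectors in each window lexicographically by a reduced order (here luminance) with conditional tie-breakers (here the green and red components), and to output the entire winning vector so that no new colors are created. The particular ordering is an illustrative assumption, not necessarily the one proposed in the paper.

```python
import numpy as np

def color_dilate(img, size=3):
    """Color dilation using a reduced ordering with conditional tie-breaks.

    In each window the output pixel is the whole color vector of the pixel
    that is largest under the ordering (luminance, then green, then red),
    so no new colors are created.
    """
    h, w, _ = img.shape
    r = size // 2
    lum = 0.299 * img[..., 0] + 0.587 * img[..., 1] + 0.114 * img[..., 2]
    # lexicographic key: reduced order (luminance), then conditional orders
    key = np.stack([lum, img[..., 1], img[..., 0]], axis=-1)
    out = img.copy()
    for y in range(h):
        for x in range(w):
            y0, y1 = max(0, y - r), min(h, y + r + 1)
            x0, x1 = max(0, x - r), min(w, x + r + 1)
            win_key = key[y0:y1, x0:x1].reshape(-1, 3)
            win_pix = img[y0:y1, x0:x1].reshape(-1, 3)
            # pick the pixel whose key is lexicographically largest
            idx = np.lexsort((win_key[:, 2], win_key[:, 1], win_key[:, 0]))[-1]
            out[y, x] = win_pix[idx]
    return out
```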
This paper describes a fast and automatic method for segmenting fiducial marks in an image taken with a metric camera. These marks can be at the four corners, at the four middle sides, or inside the image (réseau marks). The segmentation was realized using attribute-based mathematical morphology techniques. The attributes used in the morphology processing step were the area, the aspect ratio, and the orientation of the best-fitted ellipse of an object. The algorithm took about 0.25 s in the automatic segmentation stage. The positions of the automatically segmented fiducial marks were then refined to subpixel accuracy. Dozens of real images with different shapes of fiducial marks were tested, and reliable results were obtained. Accuracy comparisons were made between our method and previously published methods.
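A stripped-down version of attribute-based filtering, keeping only connected components whose area and bounding-box aspect ratio lie within given bounds, is sketched below; the orientation of the best-fitted ellipse used in the paper is omitted.

```python
import numpy as np
from scipy import ndimage

def attribute_filter(binary, min_area, max_area, max_aspect):
    """Keep connected components whose area and bounding-box aspect ratio
    fall within the given bounds, as a simplified stand-in for attribute
    openings on a binary image."""
    labels, n = ndimage.label(binary)
    keep = np.zeros_like(binary, dtype=bool)
    for i, sl in enumerate(ndimage.find_objects(labels), start=1):
        comp = labels[sl] == i
        area = int(comp.sum())
        h, w = comp.shape
        aspect = max(h, w) / max(1, min(h, w))
        if min_area <= area <= max_area and aspect <= max_aspect:
            keep[sl] |= comp
    return keep
```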
One of the most important features in image analysis and understanding is shape, and mathematical morphology is the image processing branch that deals with shape analysis. All morphological transformations are defined in terms of two primitive operations, dilation and erosion. Since many applications require the solution of morphological problems in real time, time-efficient algorithms for these two operations are crucial. In this paper, efficient algorithms for binary dilation and erosion are presented and evaluated for an advanced associative processor. Simulation results show that the proposed algorithms for this architecture reach a near-optimal speedup compared with the serial algorithm. Additionally, it is proven that the implementation of this image processor is economically feasible.
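For reference, a plain serial formulation of the two primitives, which the associative-processor algorithms parallelize, might look like this (dilation as a union of translates, erosion obtained by duality):

```python
import numpy as np

def binary_dilate(img, se):
    """Binary dilation: the union of copies of the image translated by each
    foreground offset of the structuring element `se`."""
    h, w = img.shape
    sh, sw = se.shape
    cy, cx = sh // 2, sw // 2
    out = np.zeros_like(img, dtype=bool)
    for dy in range(sh):
        for dx in range(sw):
            if se[dy, dx]:
                shifted = np.zeros_like(img, dtype=bool)
                ys, xs = dy - cy, dx - cx
                src = img[max(0, -ys):h - max(0, ys), max(0, -xs):w - max(0, xs)]
                shifted[max(0, ys):h - max(0, -ys), max(0, xs):w - max(0, -xs)] = src
                out |= shifted
    return out

def binary_erode(img, se):
    """Erosion by duality: erode(X, B) = complement of dilate(complement X, reflected B)."""
    return ~binary_dilate(~img.astype(bool), se[::-1, ::-1])
```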
Binary images are not just for morphology anymore. The reality is that binary images appear in many applications, but the topic does not get as much attention as the image processing of gray-scale or color images. The authors have done a real service by writing a good text on the foundations of the topic.