An approach to automatic target cueing (ATC) in hyperspectral images, referred to as K-means reclustering, is introduced. The objective is to extract spatial clusters of spectrally related pixels having specified and distinctive spatial characteristics. K-means reclustering has three steps: spectral cluster initialization, spectral clustering, and spatial reclustering, plus an optional dimensionality reduction step. It provides an alternative to classical ATC algorithms based on anomaly detection, in which pixels are classified as either anomaly or background clutter. K-means reclustering is used to cue targets of various sizes in AVIRIS imagery. Statistical performance and computational complexity are evaluated experimentally as a function of the designated number of spectral classes (K) and the initially specified spectral cluster centers.
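The three steps can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of 4-connectivity, and the use of region size as the only spatial characteristic are assumptions made here for illustration, and the optional dimensionality reduction step is omitted.

```python
from collections import deque

def kmeans_labels(pixels, centers, iters=10):
    """Plain K-means on spectra; `centers` supplies the K initial cluster centers."""
    centers = [list(c) for c in centers]
    labels = [0] * len(pixels)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [min(range(len(centers)),
                      key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centers[k])))
                  for p in pixels]
        # Update step: each non-empty cluster center becomes the mean of its members.
        for k in range(len(centers)):
            members = [p for p, lab in zip(pixels, labels) if lab == k]
            if members:
                centers[k] = [sum(v) / len(members) for v in zip(*members)]
    return labels

def spatial_recluster(label_img, min_size, max_size):
    """Group 4-connected pixels sharing a spectral label; keep size-bounded groups.

    Region size is an assumed stand-in for the 'specified spatial characteristics'.
    """
    rows, cols = len(label_img), len(label_img[0])
    seen = [[False] * cols for _ in range(rows)]
    cues = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            seen[r][c] = True
            queue, region = deque([(r, c)]), []
            while queue:  # breadth-first flood fill over same-label neighbours
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols and not seen[ny][nx]
                            and label_img[ny][nx] == label_img[r][c]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if min_size <= len(region) <= max_size:
                cues.append(sorted(region))
    return cues

def cue_targets(cube, centers, min_size, max_size):
    """cube[r][c] is the spectrum (list of band values) at pixel (r, c)."""
    rows, cols = len(cube), len(cube[0])
    flat = [cube[r][c] for r in range(rows) for c in range(cols)]
    labels = kmeans_labels(flat, centers)
    label_img = [labels[r * cols:(r + 1) * cols] for r in range(rows)]
    return spatial_recluster(label_img, min_size, max_size)
```

In this sketch, both K and the initial centers are passed in by the caller, mirroring the two parameters against which the abstract evaluates performance.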
The encoding of images at high quality is important in a number of applications. We have developed an approach to coding that produces no visible degradation and that we denote as perceptually transparent. Such a technique achieves modest compression, though still significantly higher than that of error free codes. Maintaining image quality is not important in the early stages of a progressive scheme, when only a reduced resolution preview is needed. In this paper, we describe a new method for the progressive transmission of high quality still images that efficiently uses the lower resolution images in the encoding process. Analysis based interpolation is used to estimate the higher resolution image, which reduces the incremental information transmitted at each step. This methodology for high quality image compression is also aimed at obtaining a compressed image of higher perceived quality than the original.
Coding techniques, such as JPEG and MPEG, result in visibly degraded images at high compression. The coding artifacts are strongly correlated with image features and result in objectionable structured errors. Among structured errors, the reduction of the end of block effect in JPEG encoding has received recent attention, with advantage being taken of the known location of block boundaries. However, end of block errors are not apparent in subband or wavelet coded images. Even for JPEG coding, end of block errors are not perceptible for moderate compression, while other artifacts are still quite apparent and disturbing. In previous work, we have shown that the quality of images can be in general improved by analysis based processing and interpolation. In this paper, we present a new approach that addresses the reduction of the end of block errors as well as other visible artifacts that persist at high image quality. We demonstrate that a substantial improvement of image quality is possible by analysis based post-processing.
In previous work, we reported on the benefits of noise reduction prior to coding of very high quality images. Perceptual transparency can be achieved with a significant improvement in compression as compared to error free codes. In this paper, we examine the benefits of preprocessing when the quality requirements are not very high, and perceptible distortion results. The use of data dependent anisotropic diffusion that maintains image structure, edges, and transitions in luminance or color is beneficial in controlling the spatial distribution of errors introduced by coding. Thus, the merit of preprocessing is for the control of coding errors. In this preliminary study, we only consider preprocessing prior to the use of the standard JPEG and MPEG coding techniques.
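As an illustration of edge-preserving diffusion used as a preprocessing step, the following is a minimal sketch of the well-known Perona-Malik scheme. It is a stand-in chosen here for illustration, not necessarily the data dependent diffusion used by the authors; the function name and the parameters `kappa` and `step` are assumptions.

```python
import math

def perona_malik(img, iters=10, kappa=10.0, step=0.2):
    """Edge-preserving smoothing of a 2-D list of grey levels.

    kappa sets the contrast above which a difference is treated as an edge;
    step must stay at or below 0.25 for stability with four neighbours.
    """
    rows, cols = len(img), len(img[0])
    cur = [[float(v) for v in row] for row in img]
    for _ in range(iters):
        nxt = [row[:] for row in cur]
        for y in range(rows):
            for x in range(cols):
                flow = 0.0
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < rows and 0 <= nx < cols:
                        diff = cur[ny][nx] - cur[y][x]
                        # Conductance falls toward zero across strong edges,
                        # so smoothing acts inside regions but not across them.
                        flow += math.exp(-(diff / kappa) ** 2) * diff
                nxt[y][x] = cur[y][x] + step * flow
        cur = nxt
    return cur
```

The smoothed image would then be passed to a standard coder such as JPEG; the coding stage itself is outside the scope of this sketch.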
We have recently proposed the use of geometry in image processing by representing an image as a surface in 3-space. The linear variations in intensity (edges) were shown to have a nondivergent surface normal. Exploiting this feature, we introduced a nonlinear adaptive filter that only averages the divergence in the direction of the surface normal. This led to an inhomogeneous diffusion (ID) that averages the mean curvature of the surface, rendering edges invariant while removing noise. This mean curvature diffusion (MCD), when applied to an isolated edge embedded in additive Gaussian noise, results in complete noise removal and edge enhancement with the edge location left intact. In this paper we introduce a new filter that will render corners (two intersecting edges), as well as edges, invariant to the diffusion process. Because many edges in images are not isolated, the corner model better represents the image than the edge model. For this reason, this new filtering technique, while encompassing MCD, also outperforms it when applied to images. Many applications will benefit from this geometrical interpretation of image processing, and those discussed in this paper include image noise removal, edge and/or corner detection and enhancement, and perceptually transparent coding.
In the perceptually transparent coding of images, we use representation and quantization strategies that exploit properties of human perception to obtain an approximate digital image indistinguishable from the original. This image is then encoded in an error free manner. The resulting coders have better performance than error free coding for a comparable quality. Further, by considering changes to images that do not produce perceptible distortion, we identify image characteristics onerous for the encoder, but perceptually unimportant. One such characteristic is the typical noise level, often imperceptible, encountered in still images. Thus, we consider adaptive noise removal to improve coder performance, without perceptible degradation of quality. In this paper, several elements contribute to coding efficiency while preserving image quality: adaptive noise removal, additive decomposition of the image with a high activity remainder, coarse quantization of the remainder, progressive representation of the remainder, using bilinear or directional interpolation methods, and efficient encoding of the sparse remainder. The overall coding performance improvement due to noise removal and the use of a progressive code is about 18%, as compared to our previous results for perceptually transparent coders. The compression ratio for a set of nine test images is 3.72 for no perceptible loss of quality.
The inadequacy of the classic linear approach to edge detection and scale space filtering lies in the spatial averaging of the Laplacian. The Laplacian is the divergence of the gradient and thus is the divergence of both magnitude and direction. The divergence in magnitude characterizes edges and this divergence must not be averaged if the image structure is to be preserved. We introduce a new nonlinear filtering theory that only averages the divergence of direction. This averaging keeps edges and lines intact as their direction is nondivergent. Noise does not have this nondivergent consistency and its divergent direction is averaged. Higher order structures such as corners are singular points or inflection points in the divergence of direction and also are averaged. Corners are intersection points of edges of nondivergent direction (or smooth curves of small divergence in direction) and their averaging is limited. This approach provides a better compromise between noise removal and preservation of image structure. Experiments that verify and demonstrate the adequacy of this new theory are presented.
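The split of the Laplacian into a magnitude part and a direction part can be written out with the product rule for divergence. The symbols are notation introduced here rather than taken from the original: I(x, y) is the image intensity, assumed to have a nonzero gradient, and n is the unit vector along the gradient.

```latex
\nabla^2 I
  = \nabla \cdot \left( |\nabla I| \, \mathbf{n} \right)
  = \mathbf{n} \cdot \nabla |\nabla I| \; + \; |\nabla I| \, \nabla \cdot \mathbf{n},
\qquad
\mathbf{n} = \frac{\nabla I}{|\nabla I|}
```

The first term carries the change of gradient magnitude along n, and the second weights the divergence of the gradient direction by the gradient magnitude. Under this reading, linear smoothing averages both terms together, whereas the proposed filtering is described as averaging only the directional part.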
A number of new approaches to image coding are being developed to meet the increasing need for a broader range of quality and usage environments for image compression. Several of these new approaches, such as subband coders, are intended to provide higher quality images at the same bit rate as compared to the JPEG standard, because they are not subject to end of block artifacts, or because they are inherently better attuned to the image representation that occurs in the peripheral visual system. Still, in the absence of a pertinent quality criterion, the quality and performance of subband coders, or wavelet coders, can be mediocre. We have developed over the past few years the elements of a methodology applicable to this problem. We reported last year, at the SPIE in San Diego, a Comparison of Coding Techniques based on a new Picture Quality Scale (PQS). In that work, we were able to rate coders designed by any criteria, on the basis of performance and quality. The problem that we are now considering is to design the coding technique so as to provide a better quality or a lower bit rate. The image quality, as evaluated by PQS, depends on a combination of several objective distortion factors, which can be identified with perceived coding artifacts, but the design of coders using all the factors is much too complex for an analytical approach. We make use instead of two design methodologies. The first one is to optimize the design of an existing subband coder using PQS as a distortion metric. The second one makes use of a methodology for the design of linear filters based on properties of human perception that we have developed previously and that may provide a tractable design method.
The encoding of Super High Definition Images presents new problems with regard to the effect of noise on the quality of images and on coding performance. Although the information content of images decreases with increasing resolution, the noise introduced in the image acquisition or scanning process remains at a high level, independently of resolution. Although this noise may not be perceptible in the original image, it will affect the quality of the encoded image, if the encoding process introduces correlation and structure in the coded noise. Further, the coder performance will be affected by the noise, even if the noise is not perceived. Therefore, there is a need to reduce the noise by pre-processing the SHD image, so as to maintain image quality and improve the encoding process. The reduction of noise cannot be performed by low pass filtering operations that will degrade image quality. We apply image analysis to this problem for adaptive noise removal. We discuss first the information theoretic issues on the effect of noise on coders. We then consider the application of adaptive noise removal techniques to the perceptually transparent and very high quality coding of still SHD images.
An application of neural networks is the recognition of objects under translation, rotation, and scale change. Most existing networks for invariant object recognition require a huge number of connections and/or processing units. In this paper, we propose a new connectionist model for invariant object recognition for binary images with a reasonable network size. The network consists of five stages. The first stage shifts the object so that the centroid of the object coincides with the center of the image plane. The second stage is a variation of the polar-coordinate transformation used to obtain two N-dimensional representations of the input object. In this stage, the θ axis is represented by the positions of the output units; therefore, any rotation of the original object becomes a cyclic shift of the output values of this stage. The third stage is a variation of the rapid transform, which provides invariant representations of cyclic-shift inputs. The next stage normalizes the outputs of the rapid transform to obtain scale invariance. The final stage is a nearest neighbor classifier. We tested the performance of the network for character recognition and good results were obtained with only one pattern per class in training.
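The cyclic-shift invariance of the third stage can be checked with a short sketch of the rapid transform in its textbook form. The abstract describes a variation of this transform whose details are not given here, so the code below illustrates the standard version only; the function name is an assumption.

```python
def rapid_transform(values):
    """Textbook rapid transform of a sequence whose length is a power of two.

    Each stage pairs element i with element i + N/2, keeps their sum and the
    absolute value of their difference, then recurses on both halves.
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return list(values)
    half = n // 2
    sums = [values[i] + values[i + half] for i in range(half)]
    diffs = [abs(values[i] - values[i + half]) for i in range(half)]
    return rapid_transform(sums) + rapid_transform(diffs)
```

A cyclic shift of the input only cyclically shifts the half-length sum and difference sequences, so by induction the output is unchanged. For example, `rapid_transform(x)` and `rapid_transform(x[k:] + x[:k])` agree for every shift `k`, which is why a rotation of the object, once expressed as a cyclic shift, no longer changes the representation.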
In the encoding of high quality images beyond current standards, a reexamination of issues in the representation, processing and encoding problems is needed. The fundamental reason for that change of emphasis is that the image representation, sampling density, color and motion parameters are no longer given by accepted practices or standards and, thus, require study. Some basic issues that should be reconsidered are as follows:
We have undertaken a study of techniques for the perceptually transparent coding of very high quality and resolution images. These techniques should be free of any of the visual artifacts due to transform coders, DPCM coders or vector quantization, so that post processing to improve image quality or resolution can be performed. The approach starts with a decomposition of an image into a spline approximation, based on a subsampled array of pixels. The spline approximation is then subtracted from the original and the resulting remainder is quantized non-uniformly and with as few levels as possible while still ensuring perceptual transparency. This differential quantization takes advantage of properties of human vision. The coarsely quantized remainder is then encoded in an error free fashion. Several techniques are being considered for this error free coding. Since the only errors are introduced in the quantization of the remainder, the errors are not perceptible and there is no structure to them.
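The decomposition into an approximation plus a coarsely quantized remainder can be sketched as follows. Several simplifications are assumed here: bilinear interpolation (a first-order spline) stands in for the spline approximation, the quantizer levels are hypothetical and not derived from any model of human vision, and the error free coding of the remainder is omitted.

```python
def bilinear_upsample(coarse, step, rows, cols):
    """Interpolate samples taken every `step` pixels back to a rows x cols grid."""
    out = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        y0 = min(y // step, len(coarse) - 1)
        y1 = min(y0 + 1, len(coarse) - 1)
        fy = (y - y0 * step) / step if y1 != y0 else 0.0
        for x in range(cols):
            x0 = min(x // step, len(coarse[0]) - 1)
            x1 = min(x0 + 1, len(coarse[0]) - 1)
            fx = (x - x0 * step) / step if x1 != x0 else 0.0
            top = coarse[y0][x0] * (1 - fx) + coarse[y0][x1] * fx
            bottom = coarse[y1][x0] * (1 - fx) + coarse[y1][x1] * fx
            out[y][x] = top * (1 - fy) + bottom * fy
    return out

def quantize(value, levels):
    """Map a remainder value to the nearest of a few non-uniform levels."""
    return min(levels, key=lambda q: abs(value - q))

def encode(img, step=2, levels=(-32, -8, -2, 0, 2, 8, 32)):
    """Return the subsampled array and the quantized remainder."""
    coarse = [row[::step] for row in img[::step]]
    approx = bilinear_upsample(coarse, step, len(img), len(img[0]))
    remainder = [[quantize(p - a, levels) for p, a in zip(prow, arow)]
                 for prow, arow in zip(img, approx)]
    return coarse, remainder

def decode(coarse, remainder, step=2):
    """Rebuild the image as interpolated approximation plus remainder."""
    approx = bilinear_upsample(coarse, step, len(remainder), len(remainder[0]))
    return [[a + r for a, r in zip(arow, rrow)]
            for arow, rrow in zip(approx, remainder)]
```

In smooth regions the remainder is mostly zero, which is what makes its subsequent error free encoding efficient. The only loss in this sketch is the quantization of the remainder, matching the error model described above.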
A new approach is developed for detection of image objects and their orientations, based on distance transforms of intermediate level edge information (i.e., edge segments and vertices). Objects are modeled with edge segments and these edges are matched to edges extracted from an image by correlating spatially transformed versions of one with a distance transform of the other. Upper bounds on change in cross-correlation between edge maps and distance transforms are shown to be simple functions of change in translation and rotation. The process of computing the optimal object rotation at each possible translation can be accelerated by one to two orders of magnitude when these bounds are applied in conjunction with an object translation-rotation traversal strategy. Examples with detection and acceleration results demonstrate the robustness and high discriminatory power of the algorithm.
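The matching idea can be sketched as an exhaustive search: a distance transform of the image edges is sampled at the positions of the transformed model edges, and the pose with the lowest mean distance wins. This sketch does not implement the bounds or the traversal strategy that give the reported acceleration; the function names, the brute-force distance transform, and the mean-distance score are assumptions made for illustration.

```python
import math

def distance_transform(edge_points, rows, cols):
    """Brute-force Euclidean distance from each pixel to the nearest edge pixel."""
    return [[min(math.hypot(y - ey, x - ex) for ey, ex in edge_points)
             for x in range(cols)] for y in range(rows)]

def transform(model, angle_deg, dy, dx):
    """Rotate (y, x) model points about the origin, round to pixels, translate."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    return [(round(x * s + y * c) + dy, round(x * c - y * s) + dx)
            for y, x in model]

def match_score(dt, points):
    """Mean distance-transform value under the points; inf if any point falls outside."""
    rows, cols = len(dt), len(dt[0])
    total = 0.0
    for y, x in points:
        if not (0 <= y < rows and 0 <= x < cols):
            return math.inf
        total += dt[y][x]
    return total / len(points)

def best_pose(dt, model, angles):
    """Exhaustively test every translation and every candidate rotation."""
    best = (math.inf, None)
    for dy in range(len(dt)):
        for dx in range(len(dt[0])):
            for angle in angles:
                score = match_score(dt, transform(model, angle, dy, dx))
                if score < best[0]:
                    best = (score, (dy, dx, angle))
    return best
```

Because the distance transform varies slowly with position, the score changes gradually under small pose changes. This is the property that makes bounds of the kind described above possible, allowing many poses to be skipped rather than evaluated one by one as in this sketch.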
The amount of data generated by computed tomography (CT) scanners is enormous, making the image reconstruction operation slow, especially for 3-D and limited-data scans requiring iterative algorithms. The inverse Radon transform, commonly used for CT image reconstructions from projections, and the forward Radon transform are computationally burdensome for single-processor computer architectures. Fortunately, the forward Radon transform and the back projection operation (involved in the inverse Radon transform) are easily calculated using a parallel pipelined processor array. Using this array the processing time for the Radon transform and the back projection can be reduced dramatically. This paper describes a unified, pipelined architecture for an integrated circuit that computes both the forward Radon transform and the back projection operation at a 10 MHz data rate in a pipelined processor array. The trade-offs between computational complexity and reconstruction error of different interpolation schemes are presented along with an evaluation of the architecture's noise characteristics due to finite word lengths. The fully pipelined architecture is designed to reconstruct 1024 pixel by 1024 pixel images using up to 1024 projections over 180 degrees. The chip contains three pipelined data-paths, each five stages long, and uses a single multiplier.
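The two operations that the architecture computes can be sketched in software. This is a sequential illustration only, with nearest-neighbour binning assumed in place of the interpolation schemes compared in the paper, and with no filtering step, so the back projection alone yields a blurred reconstruction. The function names and the detector layout are assumptions.

```python
import math

def _bin_index(x, y, c, s, center, half):
    """Detector bin hit by pixel (x, y) for a view with direction (c, s)."""
    t = (x - center) * c + (y - center) * s
    return math.floor(t + half + 0.5)

def radon(img, angles_deg):
    """Forward projection of an n x n image: one row of bin sums per angle."""
    n = len(img)
    center = (n - 1) / 2.0
    half = math.ceil(n * math.sqrt(2) / 2)  # bins cover the image diagonal
    sinogram = []
    for ang in angles_deg:
        c, s = math.cos(math.radians(ang)), math.sin(math.radians(ang))
        proj = [0.0] * (2 * half + 1)
        for y in range(n):
            for x in range(n):
                proj[_bin_index(x, y, c, s, center, half)] += img[y][x]
        sinogram.append(proj)
    return sinogram

def back_project(sinogram, angles_deg, n):
    """Unfiltered back projection: smear each projection back across the image."""
    center = (n - 1) / 2.0
    half = (len(sinogram[0]) - 1) // 2
    out = [[0.0] * n for _ in range(n)]
    for proj, ang in zip(sinogram, angles_deg):
        c, s = math.cos(math.radians(ang)), math.sin(math.radians(ang))
        for y in range(n):
            for x in range(n):
                out[y][x] += proj[_bin_index(x, y, c, s, center, half)]
    return out
```

Both routines share the same per-pixel address computation and differ only in whether the detector bin is written or read, which suggests why a single unified datapath can serve both operations. Each angle is independent of the others, so the outer loops map naturally onto a parallel array.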