Image coding methods continue to be introduced without satisfactorily answering the question, 'Which compression method is best for a given set of conditions?' This paper provides a framework for comparing and contrasting seemingly unrelated methods and then addresses the remaining issue, 'Which metric should be used?'
Multiresolution imaging consists of image gathering, signal decomposition, quantization and either reconstruction or restoration. Our assessment of this process for restoration, in terms of information and fidelity, integrates optical filtering in image gathering and display with digital filtering and decimation in signal decomposition. This approach establishes upper bounds on the information efficiency of the data transmission for images restored with the best possible visual quality.
Generally, image compression algorithms are developed and implemented without taking into account the distortions injected into the process by the image-gathering and image-display systems. Often, these distortions degrade the quality of the displayed image more than those due to coding and quantization. We assess the whole coding process, from image capture to image display, using information-theoretical analysis. The restoration procedure that we develop for discrete cosine transform (DCT) coded images accounts not only for the quantization errors introduced by the coding, but also for the aliasing and blurring errors due to non-ideal image gathering and display. This procedure maximizes the information content of the image-gathering process and the fidelity of the resultant restorations.
The discrete wavelet transform has recently emerged as a powerful technique for decomposing images into various multiresolution approximations. An image is decomposed into a sequence of orthogonal components, the first being an approximation of the original image at some 'base' resolution. By the addition of successive (orthogonal) 'error' images, approximations of higher resolution are obtained. Trellis-coded quantization (TCQ) is known as an effective scheme for quantizing memoryless sources with low to moderate complexity. The TCQ approach to data compression has led to some of the most effective source codes found to date for memoryless sources. In this work, we investigate the use of entropy-constrained TCQ for encoding wavelet coefficients at different bit rates. The lowest-resolution sub-image is quantized using a 2-D discrete cosine transform encoder. For encoding the 512 x 512, 8-bit, monochrome 'Lenna' image, a PSNR of 39.00 dB is obtained at an average bit rate of 0.89 bits/pixel.
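As a hedged illustration of the decomposition and quantization stages described above, the sketch below uses a one-level 2-D Haar transform and a plain uniform quantizer in place of the paper's filter banks and entropy-constrained TCQ (all function names are ours):

```python
import numpy as np

def haar2d(x):
    """One-level 2-D Haar decomposition into LL, LH, HL, HH sub-bands.

    x must have even dimensions.
    """
    a = (x[0::2, :] + x[1::2, :]) / 2.0   # vertical averages
    d = (x[0::2, :] - x[1::2, :]) / 2.0   # vertical differences
    ll = (a[:, 0::2] + a[:, 1::2]) / 2.0
    lh = (a[:, 0::2] - a[:, 1::2]) / 2.0
    hl = (d[:, 0::2] + d[:, 1::2]) / 2.0
    hh = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return ll, lh, hl, hh

def ihaar2d(ll, lh, hl, hh):
    """Exact inverse of haar2d."""
    a = np.empty((ll.shape[0], 2 * ll.shape[1]))
    d = np.empty_like(a)
    a[:, 0::2], a[:, 1::2] = ll + lh, ll - lh
    d[:, 0::2], d[:, 1::2] = hl + hh, hl - hh
    x = np.empty((2 * a.shape[0], a.shape[1]))
    x[0::2, :], x[1::2, :] = a + d, a - d
    return x

def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB."""
    mse = np.mean((ref - test) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)
```

Quantizing only the detail bands (e.g. `np.round(band / q) * q`) and inverting gives a quick PSNR-versus-step-size experiment, though a real coder would use TCQ and entropy coding as in the paper.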
Recent increases in focal plane size and spatial resolution, coupled with the introduction of multi-spectral sensor suites in tactical reconnaissance platforms, have exponentially increased the volume of image data to be processed. This paper proposes the use of a quadrature mirror filter compression technique whereby intrinsic multiresolution pyramid decomposition can simplify implementation of many of the algorithms required to exploit these images. A subband coding system, based on the spatial frequency and orientation of the decomposed images, is presented. Implementation of additional algorithms based on multiresolution pyramid decomposition is also discussed. These algorithms include: edge detection and restoration, real-time minification of large E-O images for screening purposes and interpolation with electronic zooming for target identification.
The tile effect is an artifact which considerably degrades the visual quality of images coded at bit rates below 1 bpp. A new algorithm, called JPEG/RBC, which is based on a two-source decomposition of a noncausal model, fits closely within the broad framework of the JPEG standard. Preliminary results indicate substantial improvement in performance and bit rates, in addition to mitigation of the tile effect.
In an effort to remove the soldier, to the extent possible, from harm's way, some of the Army's future combat vehicles will be tele-operable. Video or Forward Looking Infrared (FLIR) sensors will be mounted on the vehicle, and imagery of the surrounding scene will be transmitted to a control station via a radio frequency (RF) link. However, visual imagery is a large user of bandwidth, and bandwidth on the battlefield is very limited. Therefore, a means must be found of achieving a low-data-rate transmission. We have developed a system that accomplishes this by using two distinct techniques. First, a 25-to-1 bandwidth reduction ratio is achieved by compressing the transmitted image using a combination of the Discrete Cosine Transform and Huffman encoding. Second, a 90-to-1 bandwidth reduction ratio is achieved by transmitting only one frame every 3 seconds rather than the usual 30 frames per second. The intervening 3 seconds are filled with 89 synthetically created frames (synthetic optic flow) which appear very much like those that would have been transmitted using full-bandwidth transmission. The result of these two steps is an overall bandwidth reduction ratio of 25 x 90 = 2250 to 1.
Most of the development work on automated machine vision for space operations has assumed the presence of a dark sky background or a 'cooperative' (i.e., marked or lighted) target. In reality, the sun-lit Earth, or another natural body, will be the background much of the time, providing a far more difficult image segmentation problem. Fortunately, many natural background objects, e.g., clouds, mountain ranges, etc., exhibit fractal characteristics when viewed from orbit. Images of manmade objects such as satellites, space shuttles, and stations yield sufficiently different values for the fractal parameters that edge detection and segmentation can be accomplished. This paper describes the methods used to segment images of space scenes into manmade and natural components using fractal dimensions and lacunarities. The calculation of these parameters is described in detail, and results are presented for a variety of aerospace images.
Fractal geometry is increasingly becoming a useful tool for modeling natural phenomena. As an alternative to Euclidean concepts, fractals allow for a more accurate representation of the complexity of natural boundaries and surfaces. Since they are characterized by self-similarity, an ideal fractal surface is scale-independent; i.e., at different scales a fractal surface looks the same. This is not exactly true for natural surfaces. When viewed at different spatial resolutions, parts of natural surfaces look alike in a statistical manner, and only over a limited range of scales. In this paper, images acquired by NASA's Calibrated Airborne Multispectral Scanner are used to compute the fractal dimension as a function of spatial resolution. Three methods are used to determine the fractal dimension: Shelberg's line-divider method, the variogram method, and the triangular prism method. A description of these methods and the results of applying them to a remotely-sensed image are presented. The scanner data were acquired over western Puerto Rico in January 1990, over land and water. The aim is to study impacts of man-induced changes on land that affect sedimentation into the near-shore environment. The data were obtained over the same area at three different pixel sizes: 10 m, 20 m, and 30 m.
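Of the three estimators named above, the variogram method is the easiest to sketch. The following 1-D illustration (our own simplification, not the paper's implementation) estimates the Hurst exponent H from the log-log slope of the variogram and converts it to a fractal dimension:

```python
import numpy as np

def variogram_dimension(z, max_lag=8):
    """Estimate the fractal dimension of a 1-D profile from the variogram.

    gamma(h) = mean of (z[i+h] - z[i])^2 scales as h^(2H) for a fractal
    profile; the dimension is then D = 2 - H (for a surface, D = 3 - H).
    """
    lags = np.arange(1, max_lag + 1)
    gamma = np.array([np.mean((z[h:] - z[:-h]) ** 2) for h in lags])
    # slope of log(gamma) versus log(h) is 2H
    slope = np.polyfit(np.log(lags), np.log(gamma), 1)[0]
    H = slope / 2.0
    return 2.0 - H
```

A sanity check: a straight line has gamma(h) proportional to h^2, hence H = 1 and D = 1, as expected for a smooth (non-fractal) curve.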
The development of generalized contour/texture discrimination techniques is a central element necessary for machine-vision recognition and interpretation of arbitrary images. Here the visual perception of texture, selected studies of texture analysis in machine vision, and diverse small samples of contour and texture are all used to provide insights into the fundamental characteristics of contour and texture. From these, an experimental discrimination scheme is developed and tested on a battery of natural images. The processing of contour and texture is considered as a unified problem of zonal determinations of stasis versus change. Studies of the visual perception of texture define fine texture as a subclass that is interpreted as shading and is distinct from coarse figural-similarity textures. Perception also sets the smallest scale for contour/texture discrimination at 8 to 9 visual acuity units. Three contour/texture discrimination parameters were found to be moderately successful at this scale of discrimination: (1) lightness change in a blurred version of the image, (2) change in lightness change in the original image, and (3) percent change in edge counts relative to the local maximum.
This paper presents a geometric reasoning approach to detecting image curves. A set of curve partitioning and grouping rules is derived based on perceptual organization of curve features. This method is capable of tracing curve segments and joining them into an appropriate curve structure according to its topological and geometric properties. Experimental results are presented to demonstrate the effectiveness of the curve detection technique.
One of the more remarkable properties of the primate visual system is color constancy, the perception of the actual color of a scene independent of the spectral composition of the scene illumination. In this paper we present one approach to color constancy based on the intensity-dependent summation (IDS) filter. IDS is an adaptive filter that displays many of the spatial and frequency domain characteristics of the human visual system, including Mach bands. Our approach utilizes Mach bands generated by the filter at edges to recover the ratio of reflectances across adjacent boundaries. A diffusion process then generates a multispectral reconstruction of the image.
A common problem in manufacturing design is the accurate rendering of a 3-D object into a digitized 3-D representation. Such a rendering can be used to duplicate or modify existing objects as well as to perform quality control and object recognition in automated manufacturing operations. There are several automated ways to retrieve 3-D information. If we limit our discussion to non-destructive computer vision techniques, these include stereo vision, orthogonal silhouette projection, gray-level analysis, focal-plane analysis, interferometric analysis, and structured light projection. In this paper we present a sine-section structured light technique. This technique requires only one image of a surface area to process the topology, is insensitive to reflected intensity variations, and can be accomplished with incoherent white light, so it is insensitive to phase distortions caused by roughness. The sine-section technique projects a sine-wave image onto an object and recovers the local slopes (first order), which are then combined to reconstruct the surface topology. It has the advantage over single-slit structured light approaches, which use position (zeroth-order) information, that it uses the entire area of a 2-D image as information for reconstruction. It has the advantage over multiple slits that, locally, it is narrow-band frequency modulation of a sine wave, which decreases the side-lobe response. Globally, however, it is a wide-band modulating technique (0 to infinite frequency), so an optimum frequency demodulation technique is developed which yields average slopes in local regions of the illuminated area. The performance of this technique appears to be very robust and insensitive to intensity variations. It is also compatible with certain single-slit projection techniques for zeroth-order recovery.
In addition, it yields local carrier frequencies which can be used with second-level narrow-band phase demodulation techniques to recover higher-resolution surface variation. Results are presented for recovery of a second-order surface corrupted by additive colored Gaussian spatial noise. The colored noise is generated with a fractal filter based on a fractal dimension parameter (beta).
The challenge of information extraction in robot vision and automated inspection requires the development of efficient and dedicated hardware systems. A specific requirement relates to the hierarchical description of a scene, which is difficult to implement in real time on conventional computers. Hardware solutions may exploit parallel computing capabilities in order to provide intelligent sensing of visual information. A promising strategy seeks to exploit VLSI solutions in novel architectures for optical sensing and processing. The Multi-port Array photo-Receptor (MAR) system discussed in this paper combines optical transduction with integrated focal-plane processing. The central element of the MAR system is a full-custom VLSI photo-sensor array with hexagonal tessellation which provides parallel analog read-out from groups of pixels over prescribed areas. The overall capability of the sensor is enhanced by the addition of external analog computation which performs real-time spatial convolution at multiple resolutions and uses feedback control for automatic edge tracking. Current VLSI technology allows the fabrication of a sensor array with dimensions of up to 500 x 500 pixels on a 1.5 cm die using 1.2-micron CMOS technology. VLSI also provides the means to integrate analog computing modules and microcontrol capabilities. A set of chips required by the system has been fabricated, and a first prototype which integrates an array of 128 x 128 pixels with zero-crossing detection at seven different spatial resolutions runs at a rate of 1 Mpixel/s. Edge data at multiple resolutions are computed in real time. Parallel edge extraction at 16 different resolutions will be available from a forthcoming unit. The sensor includes arbitrary pixel displacement and nonlinear dark-current compensation. This type of integrated sensor is a good candidate for advanced applications which require small weight and size.
This paper presents a knowledge-based vision system for a telerobotics guidance project. The system is capable of recognizing and locating 3-D objects with unrestricted viewpoints in a simulated unconstrained space environment. It constructs object representations for vision tasks from wireframe models; recognizes and locates objects in a 3-D scene; and provides world modeling capability to establish, maintain, and update 3-D environment descriptions for telerobotic manipulations. In this paper, an object model is represented by an attributed hypergraph which contains direct structural (relational) information, with features grouped according to their multiple views so that the interpretation of the 3-D object and its 2-D projections are coupled. With this representation, object recognition is directed by a knowledge-directed hypothesis refinement strategy. The strategy starts with the identification of 2-D local feature characteristics to initiate feature and relation matching. Next, it refines the matching by adding 2-D features from the image according to viewpoint and geometric consistency. Finally, it links the successful matchings back to the 3-D model to recover the feature, relation, and location information of the recognized object. The paper also presents the implementation and experimental evaluation of the vision prototype.
This paper introduces a robust method which can be used to recognize English characters. In our method, we first present an Invariant Matrix (IM) corresponding to a unique character image under the polar coordinate system. This kind of matrix has many useful properties and is insensitive to image translation, scaling, rotation, and noise. On the basis of the invariant matrix, a set of similar discriminant functions (SDF) of English characters is established. Then, a feature extraction and recognition method based on the SDF functions is proposed. Feature vectors extracted by our method are reliable and have maximum similarity for samples from the same class of English character. Finally, according to our recognition model, we design a hierarchical classifier to recognize English characters. Experimental results show that the SDF function based on the IM is an efficient criterion for feature extraction of English characters and that our recognition model can obtain a recognition accuracy of 100 percent for all English characters.
Many modeling, simulation and performance analysis studies of sampled imaging systems are inherently incomplete because they are conditioned on a discrete-input, discrete-output model which only accounts for blurring during image gathering and additive noise. For those sampled imaging systems where the effects of image gathering, restoration and interpolation are significant the modeling, simulation and performance analysis should be based on a more comprehensive continuous-input, discrete-processing, continuous-output end-to-end model. This more comprehensive model should properly account for the low-pass filtering effects of image gathering prior to sampling, the potentially important noise-like effects of aliasing, additive noise, the high-pass filtering effects of restoration, and the low-pass filtering effects of image reconstruction. Yet this model should not be so complex as to preclude significant mathematical analysis, particularly the mean-square (fidelity) type of analysis so common in linear system theory. In this paper we demonstrate that, although the mathematics of this more comprehensive model is more complex, the increase in complexity is not so great as to prevent a complete fidelity-metric analysis at both the component level and at the end-to-end system level. That is, easily computed, mean-square-based fidelity metrics are developed by which both component-level and system-level performance can be predicted. In particular, it is demonstrated that these fidelity metrics can be used to quantify the combined effects of image gathering, restoration and reconstruction.
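As a toy illustration of such an end-to-end mean-square analysis, the sketch below (a 1-D blur/sample/interpolate chain with names of our own choosing, not the paper's formulation) computes a normalized fidelity for a continuous-input, discrete-processing, continuous-output model approximated on a fine grid:

```python
import numpy as np

def end_to_end_fidelity(f, blur_sigma, decimate):
    """Mean-square fidelity of a blur -> sample -> reconstruct chain.

    f is a finely sampled stand-in for the continuous input scene.
    Fidelity = 1 - ||output - input||^2 / ||input||^2, so a perfect
    end-to-end chain scores 1.
    """
    n = len(f)
    # image gathering: Gaussian acquisition blur, applied in frequency
    u = np.fft.fftfreq(n)                     # cycles per fine-grid sample
    H = np.exp(-2.0 * (np.pi * u * blur_sigma) ** 2)
    g = np.real(np.fft.ifft(np.fft.fft(f) * H))
    # sampling below the fine-grid rate (this is where aliasing can enter)
    xs = np.arange(0, n, decimate)
    samples = g[xs]
    # image reconstruction: linear interpolation back to the fine grid
    out = np.interp(np.arange(n), xs, samples)
    err = np.sum((out - f) ** 2) / np.sum(f ** 2)
    return 1.0 - err
```

For a slowly varying scene the chain is nearly transparent; adding acquisition blur lowers the fidelity, matching the qualitative trade-offs the abstract describes.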
This paper presents an assessment of visual communication that integrates the critical limiting factors of image gathering and display with the digital processing that is used to code and restore images. The approach focuses on two mathematical criteria, information and fidelity, and on their relationships to the entropy of the encoded data and to the visual quality of the restored image.
Constrained least-squares image restoration, first proposed by Hunt twenty years ago, is a linear image restoration technique in which the restoration filter is derived by maximizing the smoothness of the restored image while satisfying a fidelity constraint related to how well the restored image matches the actual data. The traditional derivation and implementation of the constrained least-squares restoration filter is based on an incomplete discrete/discrete system model which does not account for the effects of spatial sampling and image reconstruction. For many imaging systems, these effects are significant and should not be ignored. In a recent paper Park demonstrated that a derivation of the Wiener filter based on the incomplete discrete/discrete model can be extended to a more comprehensive end-to-end, continuous/discrete/continuous model. In a similar way, in this paper, we show that a derivation of the constrained least-squares filter based on the discrete/discrete model can also be extended to this more comprehensive continuous/discrete/continuous model and, by so doing, an improved restoration filter is derived. Building on previous work by Reichenbach and Park for the Wiener filter, we also show that this improved constrained least-squares restoration filter can be efficiently implemented as a small-kernel convolution in the spatial domain.
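A minimal sketch of the traditional discrete/discrete constrained least-squares filter that the paper takes as its starting point is given below (frequency-domain form with a Laplacian smoothness operator; this is the classic Hunt-style filter, not the extended continuous/discrete/continuous filter derived in the paper, and the function names are ours):

```python
import numpy as np

def cls_restore(g, h_psf, gamma):
    """Classic (discrete/discrete) constrained least-squares restoration.

    g      : blurred (and possibly noisy) image
    h_psf  : blur PSF, same shape as g, centered at index (0, 0)
    gamma  : smoothness weight; gamma = 0 reduces to inverse filtering
    """
    H = np.fft.fft2(h_psf)
    # discrete Laplacian as the smoothness operator C
    lap = np.zeros(g.shape, dtype=float)
    lap[0, 0] = -4.0
    lap[0, 1] = lap[1, 0] = lap[0, -1] = lap[-1, 0] = 1.0
    C = np.fft.fft2(lap)
    G = np.fft.fft2(g)
    # F = conj(H) G / (|H|^2 + gamma |C|^2)
    F = np.conj(H) * G / (np.abs(H) ** 2 + gamma * np.abs(C) ** 2)
    return np.real(np.fft.ifft2(F))
```

With an identity PSF and gamma = 0 the filter returns the input unchanged; increasing gamma trades data fidelity for smoothness, which is the constraint the abstract refers to.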
This paper describes the design of an efficient filter that promises to significantly improve the performance of second-generation Forward Looking Infrared (FLIR) and other digital imaging systems. The filter is based on a comprehensive model of the digital imaging process that accounts for the significant effects of sampling and reconstruction as well as acquisition blur and noise. The filter both restores, partially correcting degradations introduced during image acquisition, and interpolates, increasing apparent resolution and improving reconstruction. The filter derivation is conditioned on explicit constraints on spatial support and resolution so that it can be implemented efficiently and is practical for real-time applications. Subject to these implementation constraints, the filter optimizes end-to-end system fidelity. In experiments with simulated FLIR systems, the filter significantly increases fidelity, apparent resolution, effective range, and visual quality for a range of conditions with relatively little computation.
A contrast-enhancement algorithm is described which avoids the 'flat', noisy appearance often produced by histogram equalization. This algorithm is based upon the Intensity-dependent spatial summation (IDS) model proposed by Cornsweet and Yellott to explain certain effects in human vision. However, where IDS collapses quantization levels except at edges and histogram equalization collapses some quantization levels in the process of expanding others, our algorithm largely preserves the original brightness gradations as it expands them. Thus, the shape-from-shading cues so important to the perception of form and contour are preserved. Also, this technique, in some cases, enhances high frequency noise to a far lesser extent than does histogram equalization.
Digital x-ray imaging of the chest has been introduced in an attempt to alleviate some of the limitations of conventional radiography. The digital techniques permit software processing of stored data to compensate for factors such as exposure variation, rib and bone interposition, and density of overlying tissue. However, the digital approach introduces its own set of limitations, principally involving spatial and intensity resolution, and dynamic range. For presently available displays, an intensity range of 256 levels (8 bits) is the maximum practical dynamic range; however, using present contrast enhancement methods this is inadequate for many diagnostic purposes. This paper investigates the comparative performance and the efficient implementation, using low-cost personal computers and digital signal processors, of existing methods and a new method of adaptive histogram equalization that facilitate the display of 12-bit or 16-bit digital x-rays on 8-bit displays. Results showing the quality of the various methods and the associated computational cost are included.
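As a hedged sketch of the underlying bit-depth reduction problem, global (non-adaptive) histogram equalization from 12 bits to 8 bits can be written as a lookup table; the adaptive methods studied in the paper apply mappings of this kind within local contextual regions, which this simplified version does not do:

```python
import numpy as np

def equalize_to_8bit(img, levels_in=4096):
    """Map a high-bit-depth integer image to 8 bits via global
    histogram equalization (a non-adaptive baseline, not the paper's
    adaptive method)."""
    hist = np.bincount(img.ravel(), minlength=levels_in)
    cdf = np.cumsum(hist).astype(float)
    cdf = (cdf - cdf[0]) / (cdf[-1] - cdf[0])       # normalize to [0, 1]
    lut = np.round(cdf * 255.0).astype(np.uint8)    # 12-bit -> 8-bit LUT
    return lut[img]
```

The LUT form is what makes low-cost implementation on personal computers and DSPs attractive: the per-pixel work is a single table lookup.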
Low-frequency sinusoidal motion involves a blur length that is itself a random process. Restoration of images degraded by this type of motion is more complicated than for images blurred by other forms of relative motion between the target and the camera. Results are presented from low-frequency motion experiments in which all parameters were measured and used to calculate numerically the optical transfer function (OTF), which was then used to restore the blurred image with a Wiener filter approach. These results are much better than those achieved using a linear-motion OTF or a high-frequency-motion OTF. This method of numerically calculating the OTF and applying it to restoration of blurred images can be used for any type of vibration or motion.
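The general recipe of computing an OTF numerically from a measured motion trajectory and then restoring with a Wiener filter can be sketched as follows (a 1-D illustration with our own function names and a constant noise-to-signal parameter; the paper's measured sinusoidal-motion parameters are not reproduced here):

```python
import numpy as np

def motion_otf(trajectory, n=64):
    """Numerical OTF from a measured (sampled) motion trajectory.

    The 1-D blur PSF is taken as the dwell-time histogram of the
    displacement samples; the OTF is its normalized Fourier transform.
    """
    t = np.asarray(trajectory, dtype=float)
    t = t - t.min()
    psf, _ = np.histogram(t, bins=n, range=(0.0, t.max() + 1e-9))
    psf = psf / psf.sum()
    return np.fft.fft(psf)

def wiener_restore(blurred, otf, k=1e-3):
    """1-D Wiener restoration with a constant noise-to-signal ratio k."""
    G = np.fft.fft(blurred)
    F = np.conj(otf) * G / (np.abs(otf) ** 2 + k)
    return np.real(np.fft.ifft(F))
```

For sinusoidal vibration the dwell-time histogram is phase-dependent, which is why measuring the actual motion, as the paper does, beats assuming a linear-motion PSF.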
Methods are discussed for reducing noise from a discretely-sampled input signal where the underlying signal of interest has a broadband spectrum. The emphasis is on high-noise applications, in particular, for which the clean signal is contaminated with 100% or more noise (signal to noise ratio less than or equal to zero). We discuss conventional methods, and suggest a new method based on time delay embedding using coordinates generated by local low-pass filtering, which we call a low-pass embedding. The singular value decomposition can then be used locally in embedding space to distinguish between the dynamics and the noise. Conventional algorithms and the proposed new algorithm are evaluated for chaotic signals generated by the Lorenz and Rossler systems, to which Gaussian white noise has been added.
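The delay-embedding/SVD idea above can be sketched with a global (SSA-style) variant; the paper's method applies the SVD locally within a low-pass embedding, which this simplified sketch (our own construction) does not do:

```python
import numpy as np

def svd_denoise(x, window=20, rank=2):
    """Global delay-embedding / SVD noise reduction (an SSA-style sketch).

    Embeds x in a Hankel trajectory matrix, truncates to the leading
    singular directions, and averages anti-diagonals back to a series.
    """
    n = len(x)
    k = n - window + 1
    X = np.lib.stride_tricks.sliding_window_view(x, window)  # k x window
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    Xr = (U[:, :rank] * s[:rank]) @ Vt[:rank]
    # average anti-diagonals: each sample appears in several windows
    out = np.zeros(n)
    cnt = np.zeros(n)
    for i in range(k):
        out[i:i + window] += Xr[i]
        cnt[i:i + window] += 1
    return out / cnt
```

A pure sinusoid has an exactly rank-2 trajectory matrix, so it passes through unchanged; broadband chaotic signals such as the Lorenz and Rossler series need the local treatment the paper proposes, since they are not globally low-rank.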
Multipath propagation of radio waves in indoor/outdoor environments shows a highly irregular behavior as a function of time. Typical modeling of this phenomenon assumes the received signal is a stochastic process composed of the superposition of various altered replicas of the transmitted one, their amplitudes and phases being drawn from specific probability densities. We set out to explore the hypothesis of the presence of deterministic chaos in signals propagating inside various buildings at the University of Calgary. The correlation dimension versus embedding dimension saturates to a value between 3 and 4 for various antenna polarizations. The full Liapunov spectrum calculated contains two positive exponents and, through the Kaplan-Yorke conjecture, yields the same dimension obtained from the correlation sum. The presence of strange attractors in multipath propagation hints at better ways to predict the behavior of the signal and better methods to counter the effects of interference. The use of neural networks in nonlinear prediction is illustrated with an example, and potential applications are highlighted.
Recent work using chaotic signals to drive nonlinear systems shows that chaotic dynamics is rich in new application possibilities. The approach of using nonlinear dynamics concepts to guide synthesis of new nonlinear systems leads to the concepts of synchronization of chaotic systems and pseudoperiodic driving. These are only the beginnings of new and unique uses of chaotic dynamics.
If signal analyses were perfect and free of noise and clutter, then any transform could equally be chosen to represent the signal without loss of information. However, if the phenomenon analyzed with the Fourier transform (FT) happens to be nonlinear and dynamic, the effect of the nonlinearity must be postponed until a later time, when a complicated mode-mode coupling is attempted without any assurance of convergence. Alternatively, there exists a newer paradigm of linear transforms, the wavelet transform (WT), originally developed for French oil exploration. The WT enjoys the linear superposition principle, computational efficiency, and signal-to-noise ratio enhancement for nonsinusoidal and nonstationary signals. Our extensions to a dynamic WT, and further to an adaptive WT, are possible because there exists a large set of square-integrable functions that are special solutions of the nonlinear dynamic medium and can be adopted for the WT. In order to analyze nonlinear dynamic phenomena in the ocean, we are naturally led to the construction of a soliton mother wavelet. This common sense of 'pay the nonlinear price now and enjoy the linearity later' is certainly useful for probing any nonlinear dynamics. Research directions in wavelets, such as adaptivity and neural network implementations, are indicated, e.g., tailoring an active sonar profile for explorations.
The Wavelet Transform (WT) employs nonsinusoidal bases of compact support. The basis functions g_ab(t) = g((t-b)/a)/sqrt(a), called daughter wavelets, are constructed from a mother wavelet g(t) by means of the dilation operation with parameter a and the translation operation with parameter b. Normally, the mother wavelet g(t) is required to be an even function satisfying the admissibility condition ∫_0^∞ |G(f)|^2/f df < ∞, which guarantees that the basis set is complete. We show that the mother wavelet can be causal instead of even and still guarantee completeness. This allows mother wavelets to be selected that better match causal input signals. A parallel optical WT architecture is sketched.
A new wavelet transform normalization procedure is proposed for the construction of a weighted bank of matched filters. With the standard normalization, higher input frequencies produce larger wavelet transform magnitudes when the amplitude of the frequency components is held constant, while the new normalization produces equal responses. This is illustrated with an example of the Gibbs overshoot phenomenon, and connections to neural networks are discussed. Another example is presented which illustrates a cocktail-party effect. A derivation is given to show that an inverse transform still exists when using the new normalization.
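The effect of the normalization exponent can be checked numerically. The sketch below (our own construction, using a cosine-Gaussian wavelet) compares the matched-scale responses of unit-amplitude sinusoids under daughter wavelets g((t-b)/a)/a^p: the scale-dependent choice p = 1/2 yields frequency-dependent magnitudes, while p = 1 equalizes the responses across frequency:

```python
import numpy as np

def matched_response(freq, p, fs=1000.0, dur=2.0, w0=5.0):
    """Peak CWT magnitude of a unit sinusoid at its matched scale.

    Daughter wavelets are g((t-b)/a) / a**p with a cosine-Gaussian
    mother wavelet g(u) = cos(w0 u) exp(-u^2 / 2).
    """
    t = np.arange(0.0, dur, 1.0 / fs)
    x = np.cos(2 * np.pi * freq * t)
    a = w0 / (2 * np.pi * freq)              # scale matched to freq
    tau = np.arange(-4 * a, 4 * a, 1.0 / fs)
    g = np.cos(w0 * tau / a) * np.exp(-(tau / a) ** 2 / 2) / a ** p
    wt = np.convolve(x, g, mode='same') / fs  # discrete approximation
    return np.max(np.abs(wt))
```

Analytically, the matched response scales as a^(1-p): for p = 1/2 it grows like sqrt(a) (so the response varies with frequency at constant amplitude), and for p = 1 it is constant, which is the equal-response behavior the abstract attributes to the new normalization.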
We are investigating the possibility that a video image may productively be warped prior to presentation to a low-vision patient. This could form part of a prosthesis for certain field defects. We have done preliminary quantitative studies on some notions that may be valid in calculating the image warpings. We hope the results will help make the best use of time to be spent with human subjects, by guiding the selection of parameters and the ranges to be investigated. We liken a warping optimization to opening the largest number of spatial channels between the pixels of an input imager and resolution cells in the visual system. Some important effects that will require human evaluation are not quantified, such as local 'squashing' of the image, taken as the ratio of eigenvalues of the Jacobian of the transformation. The results indicate that the method shows quantitative promise, and they have identified some geometric transformations to evaluate further with human subjects.
A true continuous wavelet transform (CWT) must be continuous in both shift and scale. By a continuous anamorphic transformation of a one-dimensional (1D) signal and a suitable choice of kernel or filter, we can allow a normal two-dimensional (2D) optical Fourier transform image processor to perform a CWT.
This paper describes the basic concept of the wavelet transform and proposes an optical implementation of the wavelet transform using an optical correlator. The one-dimensional wavelet transform is implemented in a two-dimensional multichannel correlator with a bank of one-dimensional strip filters. In the case of the cosine-Gaussian wavelets, the wavelet-transform filters are optically recorded or computer-generated transmittance masks. Experimental results show detection of transitions in the input signal by the optical wavelet transform.
We propose an optical multichannel N^4 correlator for implementation of the wavelet transform of a two-dimensional function, which yields four-dimensional wavelet transform coefficients. We also propose an optical wavelet matched filter for two-dimensional pattern recognition. An advantage of this new band-pass matched filter is that the image features to be enhanced can be selected by choosing the wavelet functions before the matched filtering. Experimental results are shown.
The problem of designing wavelets which are most appropriate for applications to multiresolution coding of image, speech, radar, and other signals is addressed. The effects of regularity and zero moments on the design of wavelets and of the filter banks used to realize these wavelet decompositions are discussed, and insights are pointed out. The use of vector quantization with wavelet transforms is also discussed. It is observed that wavelet decompositions are a compromise between optimality and complexity, where optimality is determined from the minimization of bit rate and distortion using rate-distortion theory. The problem of designing wavelets yielding linear-phase filtering, important for applications such as television coding and radar, is discussed and a number of approaches to solutions are described. These include the use of biorthogonal rather than orthogonal bases for wavelets, which are realizable by general perfect-reconstruction filter banks in which the analysis and synthesis filters are not time-reversed versions of each other. Methods for designing linear-phase filters are briefly discussed and referenced. In the discussion of applications to radar signals, the relation of wavelet theory to a special signal called a chirplet is noted. Some connections of wavelets to splines and cardinal series are noted. Finally, wavelets which almost meet the uncertainty principle bound with equality are described.