In an effort to automatically detect image features for pattern recognition, we describe a 3-dimensional (3-D) Hough transform. We present two interlocking theoretical extensions that greatly enhance the Hough transform's ability to handle finite lineal features and allow directed search for various features while balancing memory and computational complexity. We computed the 2-D Hough transform of 1-D slices of an image, which results in a 2-D to 3-D transform. Features such as line segments cluster in a particular location, so that both line orientation and spatial extent can be determined. This approach allows the Hough transform to be more widely applied in pattern recognition, including to 3-D features.
We used a higher-order correlation-based method for signal denoising of images corrupted by multiplicative noise. Using the logarithm of an image, we applied a third-order correlation technique for identification of wavelet coefficients that contained mostly signal. In our approach, we examined wavelet coefficients in an environment where the contribution from the second-order moment of the noise had been reduced. Our results compared favorably and were less sensitive to threshold selection when compared to a second-order wavelet denoising method.
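A minimal sketch of the general setup this abstract describes, not the third-order correlation statistic itself (whose details are not given here): the logarithm converts multiplicative noise to additive noise, detail coefficients of a one-level Haar wavelet transform are soft-thresholded, and the result is exponentiated back. The test signal, noise level, and threshold are illustrative assumptions.

```python
import numpy as np

def haar_dwt(x):
    """One level of the 1-D Haar wavelet transform."""
    a = (x[0::2] + x[1::2]) / np.sqrt(2)   # approximation coefficients
    d = (x[0::2] - x[1::2]) / np.sqrt(2)   # detail coefficients
    return a, d

def haar_idwt(a, d):
    """Inverse of one Haar level."""
    x = np.empty(2 * a.size)
    x[0::2] = (a + d) / np.sqrt(2)
    x[1::2] = (a - d) / np.sqrt(2)
    return x

def denoise_multiplicative(signal, thresh):
    """Log transform, soft-threshold Haar details, exponentiate back."""
    a, d = haar_dwt(np.log(signal))        # multiplicative -> additive noise
    d = np.sign(d) * np.maximum(np.abs(d) - thresh, 0.0)
    return np.exp(haar_idwt(a, d))

rng = np.random.default_rng(0)
clean = np.linspace(1.0, 2.0, 64)              # smooth hypothetical signal
noisy = clean * rng.lognormal(0.0, 0.1, 64)    # multiplicative noise
den = denoise_multiplicative(noisy, thresh=0.1)
print(np.mean((noisy - clean) ** 2), np.mean((den - clean) ** 2))
```

A practical implementation would use more wavelet levels and, per the abstract, select coefficients by a third-order statistic rather than a fixed magnitude threshold.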
Hough transform theory provides a heuristically appealing approach toward finding lineal features in imagery. Unfortunately, direct algorithmic implementation of the theory results in many practical problems. We provide two interlocking theoretical extensions that greatly enhance the Hough transform's ability to handle finite lineal features and allow directed search for parallel lines within the scene while balancing memory and computational complexity. Both extensions involve expansion of the Hough space concept to allow easier access to processed data for both dedicated-silicon and general-purpose computer implementations.
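As a minimal illustration of the classic 2-D accumulator that such extensions build upon (not the extended transform itself), the following NumPy sketch votes each edge pixel into (rho, theta) space; the image size and test line are arbitrary assumptions.

```python
import numpy as np

def hough_lines(edges, n_theta=180):
    """Accumulate votes in (rho, theta) space for each edge pixel.
    A line satisfies x*cos(theta) + y*sin(theta) = rho."""
    rows, cols = edges.shape
    diag = int(np.ceil(np.hypot(rows, cols)))
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    acc = np.zeros((2 * diag + 1, n_theta), dtype=np.int64)
    ys, xs = np.nonzero(edges)
    for theta_idx, theta in enumerate(thetas):
        rhos = np.round(xs * np.cos(theta) + ys * np.sin(theta)).astype(int)
        np.add.at(acc, (rhos + diag, np.full_like(rhos, theta_idx)), 1)
    return acc, thetas, diag

# A horizontal line across row y = 5 of a 32 x 32 edge map.
img = np.zeros((32, 32), dtype=bool)
img[5, :] = True
acc, thetas, diag = hough_lines(img)
rho_idx, theta_idx = np.unravel_index(np.argmax(acc), acc.shape)
print(rho_idx - diag, np.degrees(thetas[theta_idx]))  # peak at rho=5, theta=90 deg
```

The memory and search-cost problems the abstract refers to are visible even here: the accumulator grows with image diagonal and angular resolution, and peaks say nothing about a segment's endpoints.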
We detected roads in aerial imagery using a method based on lineal feature detection. Our method used the products of wavelet coefficients at several scales to identify and locate lineal features. Using our approach effectively increased the size of the region we examined when looking for possible road pixels and decreased the probability of false-positive road pixels. Then, we used a shortest-path algorithm to link road pixels to form road networks. Our approach restricted possible road network solutions based on the initial detection of road pixels. We found that our approach leads to an effective method for detecting roads in aerial imagery. The method is general and can be applied to other features in imagery.
It is shown that the image chain has important effects upon the quality of feature extraction. Exact analytic ROC results are given for the case where arbitrary multivariate normal imagery is passed to a Bayesian feature detector designed for multivariate normal imagery with a diagonal covariance matrix. Plots are provided to allow direct visual inspection of many of the more readily apparent effects. Also shown is an analytic tradeoff that says doubling background contrast is equivalent to halving sensor-to-scene distance or sensor noise. It is also shown that the results provide a lower bound to the ROC of a Bayesian feature detector designed for arbitrary multivariate normal distributions.
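The contrast/noise tradeoff can be illustrated numerically for the simplest case of a scalar Gaussian detection statistic (a simplifying assumption; the abstract's multivariate setting similarly reduces to a deflection ratio along the ROC):

```python
from statistics import NormalDist

std_normal = NormalDist()

def prob_detect(p_fa, contrast, noise):
    """ROC point of a threshold test on a Gaussian statistic:
    the curve depends only on the ratio d = contrast / noise."""
    d = contrast / noise
    t = std_normal.inv_cdf(1.0 - p_fa)   # threshold set by the false-alarm rate
    return 1.0 - std_normal.cdf(t - d)

# Doubling the contrast and halving the noise give the same ROC point.
a = prob_detect(0.01, contrast=2.0, noise=1.0)
b = prob_detect(0.01, contrast=1.0, noise=0.5)
print(a, b)  # identical values
```

Because detection probability is a function of the ratio alone, any change that doubles the numerator is exactly offset by one that halves the denominator, which is the tradeoff stated above.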
We detected roads in aerial imagery based on multiresolution linear feature detection. Our method used the products of wavelet coefficients at several scales to identify and locate linear features. After detecting possible road pixels, we used a shortest-path algorithm to identify roads. The multiresolution approach effectively increased the size of the region we examined when looking for possible road pixels and reduced the effect of noise. We found that our approach leads to an effective method for detecting roads in aerial imagery.
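A toy 1-D analogue of the multiscale-product idea (the work above operates on 2-D aerial imagery, and its exact wavelet is not specified here): detail responses at several scales are multiplied, so a true edge, which responds at every scale, reinforces itself, while noise, which is largely uncorrelated across scales, is suppressed.

```python
import numpy as np

def detail(signal, scale):
    """Undecimated Haar-like detail: difference of adjacent box averages."""
    k = np.concatenate([np.ones(scale), -np.ones(scale)]) / scale
    return np.convolve(signal, k, mode="same")

rng = np.random.default_rng(1)
step = np.concatenate([np.zeros(64), np.ones(64)])   # one true edge at n = 64
noisy = step + 0.2 * rng.standard_normal(128)

# Product across scales: the edge reinforces itself, noise is suppressed.
product = detail(noisy, 2) * detail(noisy, 4) * detail(noisy, 8)
idx = int(np.argmax(np.abs(product)))
print(idx)  # near the true edge at index 64
```

In the 2-D road-detection setting, pixels surviving such a product would then be linked by the shortest-path step described above.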
One of the highest-potential uses of image fusion is the recognition of critical targets. The continuing image fusion question, then, is how to make optimal use of the often disparate forms of image detail encountered during fusion. Toward this end, many techniques have been advanced for fusion to a single viewable image. Fewer techniques have been suggested for fusion with the goal of directly improving target detection or recognition. Based upon emerging trends in pixel-accurate registration of images, we show the theoretical foundations required to optimally fuse target imagery for recognition. The results obtained can be applied to both automatic target recognition and image analysis.
We used the wavelet representation of an image for multiresolution processing. As the resolution of an object was decreased, we maintained the signal-to-noise ratio (SNR) at the highest resolution by changing the support region of a binary phase-only filter (BPOF). In this way we were able to decrease the resolution by a factor of four while maintaining the SNR at the full resolution for our test image. Although the SNR remained constant as the resolution decreased, the peak-to-correlation energy decreased. The processing of an object at a lower resolution is significant because it corresponds to processing the same object in a larger image after downsampling. Due to the limited number of pixels in currently available spatial light modulators (SLMs), multiresolution processing could extend their usefulness.
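A digital sketch of the basic BPOF correlation underlying this and the related abstracts below (the support-region manipulation is not reproduced; the reference object and sizes are illustrative assumptions):

```python
import numpy as np

def bpof(reference):
    """Binary phase-only filter: binarize the real part of the
    conjugate reference spectrum to +1 / -1."""
    F = np.conj(np.fft.fft2(reference))
    return np.where(F.real >= 0, 1.0, -1.0)

def correlate(scene, filt):
    """Frequency-domain correlation of a scene against the filter."""
    return np.real(np.fft.ifft2(np.fft.fft2(scene) * filt))

ref = np.zeros((64, 64))
ref[28:36, 28:36] = 1.0          # hypothetical 8x8 square reference object
scene = ref.copy()               # same object at the same position
plane = correlate(scene, bpof(ref))
peak = np.unravel_index(np.argmax(plane), plane.shape)
print(peak)  # sharp correlation peak at zero shift
```

Restricting which spectral samples of the filter are retained (the support region) trades peak sharpness against SNR, which is the knob the abstract describes.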
Underground objects are by nature often severely obscured, although the general character of the intervening random media may be reasonably well understood. The task of detecting these underground objects also implies that their exact location and/or orientation is not known. To partially counter these difficulties, one may, however, be given a model of the target of interest, e.g., a particular tank type, a water pipe, etc. To set up a quality framework for solution of the above problem, this paper utilizes the paradigm of Bayesian decision theory, which promises minimum-error detection given that certain probability density functions can be found. Within this framework, mathematical techniques are shown that handle the uncertainties of target location and orientation and many of the random obscuration problems, and that make best use of the target model. The approach taken can also be applied to other synergistic cases such as seeing through obscuring vegetation.
We designed binary phase-only filters from a training set of images using a statistical approach. We forced images into clusters and designed filters to recognize objects from that cluster. We report on results obtained by computer simulation comparing the performance of filters to recognize objects from clusters of one and two classes.
Based upon the requirements of Bayesian object recognition theory, this paper provides the fundamental framework to determine the joint probability density function of object regions in an IR image. This probability function contains all information about the region that is required to achieve minimum probability of error recognition. The techniques advanced here are expected to be of significant use in certain rather hostile and difficult situations such as testing piping for fault conditions within operational nuclear power plants.
Often one is interested in multisensor fusion to enhance the recognition of critical targets -- even though in isolation none of the sensors can supply sufficient information for detection. To recognize under such adverse conditions will require the best of techniques, e.g., Bayesian. Previously through careful target and sensor phenomenological modeling, we have overcome the main objection to single sensor Bayesian automatic target detection, i.e., the rigorous development of the necessary target probabilities. In this paper we show that one can further use a process of conditioning on target and sensor phenomenology to conditionally decouple the sensors. Optimal fusion will then proceed simply by combining the conditionally independent target probabilities arising from the individual sensors.
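A sketch of the fusion rule that such conditional decoupling enables: once the sensors are conditionally independent given the target hypothesis, the posterior is simply the prior times the product of per-sensor likelihoods. The two-class numbers below are hypothetical.

```python
import numpy as np

def fuse(prior, likelihoods):
    """Optimal Bayesian fusion under conditional independence:
    posterior proportional to prior * product of p(z_i | class)."""
    post = np.array(prior, dtype=float)
    for lik in likelihoods:
        post *= np.asarray(lik, dtype=float)
    return post / post.sum()

# Two classes (target, clutter); neither sensor is decisive alone,
# but their conditionally independent evidence combines.
prior = [0.5, 0.5]
ir_lik = [0.6, 0.4]      # hypothetical IR sensor likelihoods
radar_lik = [0.7, 0.3]   # hypothetical radar likelihoods
posterior = fuse(prior, [ir_lik, radar_lik])
print(posterior)  # target probability rises to about 0.78
```

Each individual sensor here gives only weak evidence (0.6 and 0.7), yet the fused posterior is markedly more decisive, which is the point of the paragraph above.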
A new technique is shown for refining and reducing incoming camouflage data based upon the Bayesian paradigm. Innovation is displayed in use of a statistical conditioning sequence that avoids the need to form target features from the data. The result is a simplified and more accurate probabilistic indication of actual target presence. This probabilistic indication can then be incorporated into a variety of target detection scenarios or, alternately, to form the basis of a theoretically optimal Bayesian target detector. Numeric simulation is presented to show the effectiveness of the technique against simulated camouflage.
A multi-technology high performance computing system based on the Open Parallel Architecture Design Specification (OPADS) platform is being evaluated for use as a graphics engine for spherical scene projection. This system is designed to make available the massive quantities of real-time processing power needed to support complete real time scene generation and projection of complex dynamical maneuvers for applications such as scientific visualization and three dimensional database creation and interaction. A comparison is also provided between head mounted projection systems and walk-in spherical scene projection systems.
One of the more useful techniques to emerge from AI is the provision of an explanation modality used by the researcher to understand and subsequently tune the reasoning of an expert system. Such a capability, missing in the arena of statistical object recognition, is not that difficult to provide. Long-standing results show that the paradigm of Bayesian object recognition is truly optimal in a minimum-probability-of-error sense. To a large degree, the Bayesian paradigm achieves optimality through adroit fusion of a wide range of lower-information data sources to give a higher-quality decision--a very 'expert system'-like capability. When the various sources of incoming data are represented by C++ classes, it becomes possible to automatically backtrack the Bayesian data fusion process, assigning relative weights to the more significant data items and their combinations. A C++ object-oriented engine is then able to synthesize 'English'-like textual descriptions of the Bayesian reasoning suitable for generalized presentation. Key concepts and examples are provided based on an actual object recognition problem.
The core concept developed here is one of digital registration of multiple frames of moderate resolution from such readily available recording sources as palm VCRs. The multiple frames are then coalesced into a single frame of highly sampled imagery to increase the critical Nyquist sampling rate in regions of interest. To achieve the highest resolution in the final product, special data-driven, spatially varying image enhancement techniques are then employed. Both the mathematics and results are shown.
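The coalescing step can be sketched in an idealized form: known integer sub-pixel shifts and no blur or noise are simplifying assumptions (real VCR frames require the data-driven enhancement mentioned above), but they show how registered low-resolution frames interleave onto a finer grid.

```python
import numpy as np

def upsample_from_frames(frames, shifts, factor=2):
    """Place each registered low-res frame onto a finer grid at its
    known sub-pixel offset, then average overlapping samples."""
    h, w = frames[0].shape
    hi = np.zeros((h * factor, w * factor))
    count = np.zeros_like(hi)
    for frame, (dy, dx) in zip(frames, shifts):
        hi[dy::factor, dx::factor] += frame
        count[dy::factor, dx::factor] += 1
    count[count == 0] = 1          # avoid division by zero in empty cells
    return hi / count

# Simulate four low-res frames of a high-res scene at offsets (dy, dx).
rng = np.random.default_rng(2)
scene = rng.random((32, 32))                 # hypothetical high-res scene
shifts = [(0, 0), (0, 1), (1, 0), (1, 1)]
frames = [scene[dy::2, dx::2] for dy, dx in shifts]
recon = upsample_from_frames(frames, shifts)
print(np.allclose(recon, scene))  # exact recovery under ideal sampling
```

With the four ideal offsets shown, every fine-grid sample is covered exactly once, so the Nyquist rate is genuinely doubled in each dimension.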
In designing a feedforward neural network for numerical computation using the backpropagation algorithm, it is essential to know that the resulting network has a practical global minimum, meaning that convergence to a stationary solution can be achieved in reasonable time and with a network of reasonable size. This is in contrast to theoretical results indicating that any square-integrable (L2) function can be computed, assuming that an unlimited number of neurons is available. A class of problems is discussed that does not fit into this category. Although these problems are conceptually simple, it is shown that in practice convergence to a stationary solution can only be approximate and very costly. Computer simulation results are shown, and concepts are presented that can improve performance through a careful redesign of the problem.
Infrared imagery of 512 x 512 pixels was processed with 128 x 128 arrays by computer simulation of an optical correlator using various correlation filters. Pyramidal processing using binary phase-only filters (BPOFs), synthetic discriminant function (SDF) filters, and feature-based filters was used to process an entire image in parallel at different resolutions. Results showed that both SDF and feature-based filters were more robust to the effects of thresholding input imagery than BPOFs. The feature-based filters offered a range of performance by setting a parameter to different values. As the value of the parameter was changed, correlation peaks within the training set became more consistent and broader. The feature-based filters were more useful than both the SDF and simple BPOFs for recognizing objects outside the training set. Furthermore, the feature-based filter was more easily calculated and trained than an SDF filter.
We modeled an optical system for estimation of the fractal dimension to provide a measure of surface roughness for an entire image and for image segmentation. Although the simulated optical result was similar to that calculated by digital techniques, both suffered from problems known to occur with estimating fractal dimension. Furthermore, the optical estimation did not have as good a resolution as that obtained with digital estimates due primarily to the limited dynamic range of the detector.
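The optical estimator itself is not reproduced here; the following digital box-counting sketch (one standard way to estimate fractal dimension, used as an assumed stand-in) shows the quantity being estimated and why limited dynamic range matters: the fit spans several decades of counts.

```python
import numpy as np

def box_count_dimension(mask, sizes=(1, 2, 4, 8, 16)):
    """Estimate fractal dimension by box counting: fit the slope of
    log(count) versus log(1/size)."""
    counts = []
    for s in sizes:
        h, w = mask.shape
        # any occupied cell inside each s x s box counts once
        boxes = mask[: h - h % s, : w - w % s].reshape(h // s, s, w // s, s)
        counts.append(boxes.any(axis=(1, 3)).sum())
    slope, _ = np.polyfit(np.log(1.0 / np.array(sizes)), np.log(counts), 1)
    return slope

# A filled region is 2-D, so its box-count dimension should be close to 2.
filled = np.ones((64, 64), dtype=bool)
dim = box_count_dimension(filled)
print(dim)  # close to 2.0
```

For a textured gray-level surface, the same fit is applied to a measure of surface area at each scale, and the estimate's resolution depends directly on how many decades of that measure the detector can represent.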
Imagery from actual sensors was used for automatic object recognition with a binary phase-only filtering (BPOF) optical correlator. The correlator was primarily tested for applications that involve objects with a nonrepeatable signature. Digital image processing techniques were used to threshold gray-level images and approximate their boundaries with polygons to limit variations of the input imagery before correlation. Correlation responses increased, and the method worked best, when relatively short line segments were used in the approximation. The best results were obtained when a convex polygon was drawn to the outward-most points of the object, which is equivalent to extracting specific shape features of the object. The method presented here reduced the sensitivity of a BPOF to changes in an object's appearance when the object varied in different or unknown ways.