Real-time image segmentation is a critical part of an automatic target recognizer (ATR). The segmentation of poorly resolved targets in low contrast thermal video images is a challenging task. Most edge-based segmenters are too susceptible to noise, contrast variation, and target boundary discontinuity. The problem is the lack of a fast (video rate) and robust method of grouping the relevant edge elements together while rejecting the irrelevant ones. We have overcome this problem by combining the processing power of a single instruction multiple data (SIMD) computer with a newly developed model-directed segmentation algorithm.
A digital machine-inspection system is being developed at Oak Ridge National Laboratory to detect flaws on printed graphic images. The inspection is based on subtraction of a digitized test image from a reference image to determine the location, number, extent, and contrast of potential flaws. When performing subtractive analysis on the digitized information, two sources of error in the amplitude of the difference image can arise: (1) spatial misregistration of the reference and test sample, and (2) random fluctuations in the printing process. Variations in printing and registration between samples generate topological artifacts related to surface structure, referred to as edge noise in the difference image. Most feature extraction routines require that the difference image be relatively free of noise to perform properly. A novel algorithm has been developed to filter edge noise from the difference images. The algorithm relies on the a priori assumption that edge noise will be located near locations having a strong intensity gradient in the reference image. The filter is derived from the structure of the reference image and is used to attenuate edge features in the difference image. The filtering algorithm, consisting of an image multiplication, a global intensity threshold, and an erosion/dilation, has reduced edge noise by 98% relative to the unfiltered image and can be implemented using off-the-shelf hardware.
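The three-step filter described above (multiplication by a reference-derived mask, global threshold, erosion/dilation) can be sketched numerically. This is a minimal illustration, not the paper's implementation: the gradient operator, threshold values, and structuring element are all assumptions.

```python
import numpy as np
from scipy import ndimage

# Sketch of the edge-noise filter described above, under assumed parameters:
# attenuate difference-image pixels near strong reference-image gradients,
# then binarize with a global threshold and clean residue with an
# erosion/dilation.

def edge_noise_filter(reference, difference, grad_thresh=30.0, diff_thresh=20.0):
    # Gradient magnitude of the reference image (Sobel approximation).
    gx = ndimage.sobel(reference.astype(float), axis=1)
    gy = ndimage.sobel(reference.astype(float), axis=0)
    grad = np.hypot(gx, gy)

    # Suppression mask: 0 near strong reference edges, 1 elsewhere.
    mask = (grad < grad_thresh).astype(float)

    # Image multiplication: attenuate edge regions of the difference image.
    attenuated = np.abs(difference.astype(float)) * mask

    # Global intensity threshold to binarize candidate flaws.
    flaws = attenuated > diff_thresh

    # Erosion followed by dilation removes isolated edge-noise specks.
    flaws = ndimage.binary_dilation(ndimage.binary_erosion(flaws))
    return flaws

# Toy example: a bright square in the reference, a misregistration artifact
# along its border, and a genuine flaw in a flat region.
ref = np.zeros((32, 32)); ref[8:24, 8:24] = 200.0
diff = np.zeros((32, 32))
diff[8:24, 7] = 50.0          # edge noise from 1-pixel misregistration
diff[2:6, 2:6] = 60.0         # real flaw, away from reference edges
result = edge_noise_filter(ref, diff)
```

The mask kills the misregistration artifact because it hugs the reference edge, while the real flaw, sitting in a flat region of the reference, survives both the masking and the morphological cleanup.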
Several problems in low-level computer vision can be mathematically formulated as linear elliptic partial differential equations of the second order. A subset of these problems can be expressed in the form of a Poisson equation, Lu(x, y) = f(x, y). In this paper, fast direct methods for solving the Poisson equations of computer vision are developed. Until recently, iterative methods were used to solve these equations; direct Fourier techniques were then suggested to speed up the computation. We present the Fourier Analysis and Cyclic Reduction (FACR) method, which is faster than the Fourier method or the Cyclic Reduction method alone. For computation on an n x n grid, the operation count for the Fourier method is O(n^2 log2 n), and that for the FACR method is O(n^2 log2 log2 n). The FACR method first reduces the system of equations into a smaller set using Cyclic Reduction. Next, the reduced system is solved by the Fourier method. The final solution is obtained by back-substituting the solution of the reduced system. With Neumann boundary conditions, a Poisson equation does not have a unique solution. We show how a physically meaningful solution can be obtained under such circumstances. Application of the FACR and other methods is discussed for two problems of low-level computer vision: lightness (reflectance from brightness) and recovering height from surface gradient.
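The Fourier half of the method above can be sketched directly: the discrete Laplacian diagonalizes under the DFT, so the Poisson equation becomes a pointwise division in the frequency domain. This sketch assumes periodic boundaries for simplicity (the paper's FACR method adds a cyclic-reduction stage before this step), and it pins the undetermined constant mode to zero, echoing the non-uniqueness discussed for the Neumann case.

```python
import numpy as np

# A minimal sketch of the direct Fourier method for the discrete Poisson
# equation Lu = f with the 5-point Laplacian and periodic boundaries.

def poisson_fft(f):
    n, m = f.shape
    kx = 2.0 * np.cos(2.0 * np.pi * np.arange(n) / n) - 2.0
    ky = 2.0 * np.cos(2.0 * np.pi * np.arange(m) / m) - 2.0
    lam = kx[:, None] + ky[None, :]          # eigenvalues of the Laplacian
    fhat = np.fft.fft2(f)
    lam[0, 0] = 1.0                          # zero mode: solution is only
    fhat[0, 0] = 0.0                         # determined up to a constant
    return np.real(np.fft.ifft2(fhat / lam))

# Verify: apply the 5-point Laplacian to a zero-mean field, solve, recover.
rng = np.random.default_rng(0)
u_true = rng.standard_normal((16, 16)); u_true -= u_true.mean()
lap = (np.roll(u_true, 1, 0) + np.roll(u_true, -1, 0) +
       np.roll(u_true, 1, 1) + np.roll(u_true, -1, 1) - 4.0 * u_true)
u_rec = poisson_fft(lap)
```

The FFT gives the O(n^2 log2 n) count quoted above; FACR's cyclic-reduction pre-pass shrinks the system first, yielding the smaller O(n^2 log2 log2 n) count.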
A real-time segmentation algorithm for infrared sequences is presented, based on the iterative application of thermal histogram processing and thresholding. The segmentation algorithm is intrinsically non-sequential and may therefore be decomposed into a graph of concurrent processing tasks well suited to a parallel implementation. The algorithm is based on the assumption that two pixel populations are present, namely object and background. Since under this hypothesis the separation goodness function should have a single maximum, the algorithm forces this behaviour by redistributing the area of the histogram, i.e. by progressively brightening the target population. An initialization phase identifies potential target areas throughout the current frame in cooperation with a temporal tracking and labeling task, and compiles the set of search windows. For each search window the thermal histogram is computed. A family of modified histograms is then obtained by removing greater and greater areas from the hot tail of the original histogram and replacing them with an impulse of the same area at the highest extreme of the thermal range. Each of these histograms enters a module which computes a separation goodness function. The separation function presents two adjacent segments: a high thermal segment and a low thermal segment. The former is constant, extending from the high extreme of the thermal range down to the lowest thermal value of the removed area; the latter has a variable shape and extends to the lowest extreme of the thermal range. The iteration is stopped as soon as the variable segment becomes monotonically decreasing. The boundary value between the two segments is chosen as the threshold within that window. Experimental results are presented.
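The histogram-redistribution step described above can be illustrated numerically. This is a hedged sketch: the paper does not fix a particular separation goodness function, so between-class variance (Otsu's criterion) is used here as a stand-in, and the toy histogram and cutoff are invented for illustration.

```python
import numpy as np

# Sketch of the hot-tail redistribution: remove the area above `cutoff`
# and replace it with an impulse of equal area at the top of the range.

def redistribute(hist, cutoff):
    h = hist.astype(float).copy()
    tail = h[cutoff:-1].sum()
    h[cutoff:-1] = 0.0
    h[-1] += tail                 # impulse at the highest thermal extreme
    return h

def between_class_variance(hist, t):
    # Otsu's criterion for threshold t (an assumed goodness measure).
    h = hist.astype(float)
    w0, w1 = h[:t].sum(), h[t:].sum()
    if w0 == 0 or w1 == 0:
        return 0.0
    bins = np.arange(len(h))
    m0 = (bins[:t] * h[:t]).sum() / w0
    m1 = (bins[t:] * h[t:]).sum() / w1
    return w0 * w1 * (m0 - m1) ** 2

# Toy bimodal window histogram: cool background plus a small hot target.
hist = np.zeros(64)
hist[8:16] = 100.0     # background population
hist[48:56] = 10.0     # target population
mod = redistribute(hist, 48)
scores = [between_class_variance(mod, t) for t in range(1, 64)]
best_t = 1 + int(np.argmax(scores))
```

Concentrating the target population into a single bright impulse makes the goodness function peak cleanly just above the background population, which is the single-maximum behaviour the algorithm forces.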
We describe a digital correlation mask that induces orthogonality among a prescribed set of reference imagery. In our particular implementation the resulting correlation is undersampled and shift-variant, though if the mask is applied in an optical correlator those limitations are removed. We derive a method of introducing orthogonality among the weights of the training set from which filter values are obtained, so that the correlation value from a given filter is representative of the unique nature of the reference object as compared against the other objects in the training class.
New methods of light field conversion and image processing are proposed on the basis of two-photon absorption, the high-frequency Kerr effect, and different types of stimulated scatterings in condensed media.
Optical correlator architectures have been reported which reduce the physical length of the conventional 4f design initially proposed by VanderLugt (1975). Attention is given to one such architecture which employs two simple lenses and obviates light collimation requirements. This 'modified 2f optical correlator' retains the highly useful Fourier transform scale feature; experimental correlation performance with a binary phase-only filter and photographic input is found to compare favorably with computer simulations of a 4f correlator.
A method is presented for designing optical correlation filters based on measuring intensity patterns in the Fourier plane. The method can be used in general to design any type of frequency plane filter but is especially attractive in the case of a binary phase-only filter (BPOF). The method can produce a filter that is well matched to the object, the transforming optical system, and the spatial light modulator used in the correlator input plane. A working filter was produced using the technique, but the filter response was weak. The suspected causes of the weak output are imprecise Fourier plane positioning, nonlinearities in the recording process, and/or aliasing effects.
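A binary phase-only filter of the kind discussed above can be sketched numerically: the reference spectrum is binarized to two phase values by the sign of its real part, a common BPOF construction (assumed here; the paper's measurement-based design differs in how the filter values are obtained).

```python
import numpy as np

# A minimal BPOF sketch: binarize the Fourier transform of the reference
# to +1/-1 by the sign of its real part, then correlate by multiplying the
# scene spectrum by the filter.

def bpof(reference):
    F = np.fft.fft2(reference)
    return np.where(F.real >= 0.0, 1.0, -1.0)   # two-level phase filter

def correlate(scene, filt):
    # The filter plays the role of the conjugate reference spectrum.
    return np.abs(np.fft.ifft2(np.fft.fft2(scene) * filt))

# Toy scene: the reference object shifted by a known offset.
rng = np.random.default_rng(1)
ref = np.zeros((64, 64)); ref[28:36, 28:36] = rng.random((8, 8)) + 1.0
scene = np.roll(ref, (10, 5), axis=(0, 1))
plane = correlate(scene, bpof(ref))
peak = np.unravel_index(np.argmax(plane), plane.shape)
```

Discarding amplitude and keeping only two phase levels still yields a sharp correlation peak at the object's shift, which is why BPOFs suit binary spatial light modulators.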
Construction rules for building composite binary phase-only filters (CBPOFs) designed for use at the nodes of a binary tree have been developed using upweighted superposition algorithms. This paper describes the use of this method to build a four-target, rotation-invariant CBPOF bank on which binary tree searches can be performed. Each of the four trees in this filter bank consists of 256 simple filters (BPOFs) and 254 composites. Sequential search of the 256 simple filters is replaced by a binary search which uses at most 26 composites and 4 simple BPOFs. An image recognition system utilizing the filter bank has been developed, assembled, and evaluated by simulation and experiment. The empirical results obtained using a hybrid optical correlator with computer-controlled magneto-optic spatial light modulators (MOSLMs) at the input and filter planes are presented.
A two-dimensional image correlator based on acousto-optic (AO) devices and charge coupled devices (CCDs) is described that can be built with existing technology to provide 1000 frames per second operation. In recent years, architectures have been developed that perform the two-dimensional correlation utilizing one-dimensional input devices. The input scene is loaded into the acousto-optic device (AOD) one line at a time. This line is then correlated against all of the rows of a reference template introduced into the optical system using a one-dimensional array of LEDs or laser diodes. However, it generally takes much longer to load the AO cell than it does to process the information. This latency severely limits the maximum throughput rate of the processor. This paper introduces a new acousto-optic correlator implementation that overcomes this bottleneck so that processing can occur close to 100% of the time. A grayscale image correlator is proposed that can be built using present technology and can realistically achieve throughput rates on the order of 10^12 operations per second. This translates to over 1000 correlations per second for input scenes with dimensions of 512 x 512 pixels and reference templates of size 64 x 64 pixels.
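The quoted throughput figures are mutually consistent, as a back-of-the-envelope check shows (assuming a direct, non-FFT correlation with one multiply-accumulate, counted as two operations, per scene-pixel/template-pixel pair):

```python
# Consistency check of the throughput claim: a direct correlation of a
# 512 x 512 scene against a 64 x 64 template at 1000 frames per second.

scene = 512 * 512                           # input scene pixels
template = 64 * 64                          # reference template pixels
ops_per_correlation = 2 * scene * template  # one multiply + one add per pair
frames_per_second = 1000
total_ops = ops_per_correlation * frames_per_second
# total_ops is about 2.1e12, i.e. on the order of 10^12 operations per
# second, matching the rate quoted in the abstract.
```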
Powerful new signal processing algorithms are now becoming available for such tasks as the detection and tracking of multiple targets via image sequence analysis. For typical real-time applications, these algorithms require throughputs on the order of hundreds to thousands of MFLOPS. In order to achieve such throughputs, it is necessary to employ parallel processing architectures, which normally require large size, weight and power consumption for their implementation. We have developed a high-performance, programmable MIMD processor called the SCC-100 which is particularly well-suited for miniaturization using hybrid wafer-scale packaging technology. A 20-node configuration with a peak throughput in excess of 1 GFLOPS will be packaged in a three-inch cube using this approach.
An account is given of the design features and performance capabilities of a coherent optical spectrum analyzer based on an electrically addressable matrix magnetooptic spatial light modulator (MMOSLM). Attention is given to the spectra of spatially periodic and aperiodic images, as well as to the optimization of MMOSLM operational regimes to enhance stability and reduce power consumption. It is demonstrated that hundreds of periodic images and tens of aperiodic ones can be easily recognized by the spectrum analyzer, using MMOSLM fragments as small as 16 x 16. The nature of input errors experienced during MMOSLM recording is discussed.
The capability of an inexpensive liquid-crystal TV (LCTV) to modulate both coherent and incoherent IR light is reported. Experiments demonstrating light modulation for wavelengths between 0.8 and 1.1 microns have been performed. Potential applications to dynamic 3-5 and 8-12 microns IR scene simulators, compact joint transform correlators, and novel electron trapping processors, are described.
A technique using data association target tracking in a motion sequence via an adaptive joint transform correlator is presented. The massive data in the field of view can be reduced to a few correlation peaks. The average velocity of a target during the tracking cycle is then determined from the location of the correlation peak. A data-association algorithm is used for the analysis of these correlation signals, for which multiple targets can be tracked. A phase-mostly liquid-crystal TV is used in the hybrid joint transform correlation system, and simultaneous tracking of three targets is demonstrated.
An alternate spatial weighting scheme is given for creating synthetic reference objects (SROs) for use in joint optical correlation. The objects are spatially weighted using an aggregate of pixels made up from the objects composing the SRO. An ad-hoc iterative algorithm is presented for finding a workable SRO. The algorithm is demonstrated by using it to find a spatially weighted scale invariant SRO for a random object. The SRO is then compared with a conventional transmission weighted SRO using linear and nonlinear spatial light modulator models. The spatially weighted SRO was found to be more robust with respect to modulator nonlinearities.
The Backscratching optical correlation algorithm has been proposed for four-degree-of-freedom tracking. In an alternating Cartesian and log-polar implementation, the tracked parameters are scale, rotation, and two-dimensional translation. The algorithm has a finite capture radius in the four-dimensional tracking space. The capture radius depends on the tracked object, the correlator architecture, and the method of filter computation. Some methods of extending the capture radius are discussed: one is a modification of matched filters, another is a careful consideration of the log-polar transform center, and another is an operational method. Simulations of the filter construction method, in which a larger capture radius is gained at the expense of precision in determining the four parameters, are presented.
Results of research into the performance of a joint Fourier transform correlator (JFTC) utilizing the binary magnetooptic spatial light modulator are discussed. Computer simulation and experimental correlation plane analyses are presented. Optical correlation performance is discussed, with emphasis on the JFTC's recognition capability in the presence of input noise and clutter. The JFTC suffers significantly more degradation in correlation SNR from the addition of input noise and clutter than the VanderLugt correlator using the same binary SLMs: when the two architectures are compared using identical input patterns containing added noise and clutter, the correlation SNR is many times lower in the JFTC. A Fourier plane thresholding technique which dramatically reduces the D.C. correlation intensity is also introduced. This technique achieves reductions in D.C. peak intensity of 30 to over 80 percent, while affecting the correlation peak intensity by less than 5 percent. Conclusions on the potential for use of the JFTC in various pattern recognition applications are discussed.
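The joint transform correlation and the Fourier-plane D.C. suppression described above can be sketched numerically. The clipping level used here (a high percentile of the joint power spectrum) is an assumption; the paper's thresholding rule may differ.

```python
import numpy as np

# Sketch of a joint transform correlator: reference and target side by
# side, square-law detection of the joint spectrum, second transform.
# The strong zero-order (D.C.) term is clipped before the second transform.

rng = np.random.default_rng(2)
obj = rng.random((16, 16))

# Joint input plane: reference on the left, target on the right.
plane_in = np.zeros((64, 128))
plane_in[24:40, 16:32] = obj          # reference
plane_in[24:40, 88:104] = obj         # target, separated by 72 pixels

jps = np.abs(np.fft.fft2(plane_in)) ** 2              # joint power spectrum
jps_clipped = np.minimum(jps, np.percentile(jps, 99)) # clip zero-order term

out = np.abs(np.fft.fft2(jps_clipped))
# Cross-correlation peaks appear at plus/minus the input separation
# (here lags (0, 72) and (0, 56) with wraparound); the D.C. term sits at
# the origin and is reduced by the clipping.
```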
This paper presents parallel computable contour based feature strings for two dimensional shape recognition. Parallel techniques are used for contour extraction and for computation of normalized contour based features independent of scale and rotation. The EREW PRAM architecture is considered, but the technique can be adapted to other parallel architectures. Illustrated examples are presented.
Image processing operations fall into two classes: local and global. In local operations, each input pixel affects only a small corresponding area of the output image; examples include edge detection, smoothing, and point operations. In global operations, any input pixel can affect many or all of the output data. Global operations include the histogram, image warping, the Hough transform, and connected components.
Parallel architectures offer a promising method for speeding up these image processing operations. Local operations are easy to parallelize, because the input data can be divided among processors, processed in parallel separately, then the outputs can be combined by concatenation.
Global operations are harder to parallelize; indeed, some require serial execution for correct computation of the result. However, an important class of global operations, namely those that are reversible (computable in forward or reverse order on a data structure), can be computed in parallel using a restricted form of divide and conquer called split and merge.
These reversible operations include the global operations mentioned above and many more besides, even such non-image-processing operations as parsing, string search, and sorting. The split and merge method is illustrated by applying it to these algorithms. A performance analysis of the method on different architectures (one-dimensional, two-dimensional, and binary tree processor arrays) is presented.
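The split-and-merge idea can be sketched with the histogram, one of the reversible global operations named above: each worker histograms its own slice of the data, and the partial results merge by elementwise addition. Threads stand in for processor-array nodes here; the chunking scheme is illustrative.

```python
from concurrent.futures import ThreadPoolExecutor

# Split-and-merge sketch for a reversible global operation: the histogram
# can be computed in any order, so partial histograms over data slices
# combine into the exact full result.

def local_histogram(pixels, levels=8):
    h = [0] * levels
    for p in pixels:
        h[p] += 1
    return h

def merge(h1, h2):
    return [a + b for a, b in zip(h1, h2)]

image = [i % 8 for i in range(1024)]            # toy 8-level "image"

# Split: divide the data among 4 workers.
chunks = [image[i::4] for i in range(4)]
with ThreadPoolExecutor(max_workers=4) as pool:
    partials = list(pool.map(local_histogram, chunks))

# Merge: combine the partial histograms pairwise.
result = partials[0]
for h in partials[1:]:
    result = merge(result, h)
```

The same split/merge pattern maps onto the one-dimensional, two-dimensional, and binary-tree arrays analyzed in the paper; only the merge topology changes.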
Automatic road detection is an important part of many scene recognition applications. The extraction of roads provides a means of navigation and position update for remotely piloted vehicles or autonomous vehicles. Roads supply strong contextual information which can be used to improve the performance of automatic target recognition (ATR) systems by directing the search for targets and adjusting target classification confidences.
This paper describes algorithmic techniques for labeling roads in high-resolution infrared imagery. In addition, real-time implementation of this structural approach using a processor array based on the Martin Marietta Geometric Arithmetic Parallel Processor (GAPP) chip is addressed.
The algorithm described is based on the hypothesis that a road consists of pairs of line segments separated by a distance "d" with opposite (antiparallel) gradient directions. The general nature of the algorithm and its parallel implementation on a single instruction, multiple data (SIMD) machine are improvements over existing work.
The algorithm seeks to identify line segments meeting the road hypothesis in a manner that performs well, even when the side of the road is fragmented due to occlusion or intersections.
The use of geometrical relationships between line segments is a powerful yet flexible method of road classification which is independent of orientation. In addition, this approach can be used to nominate other types of objects with minor parametric changes.
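The antiparallel-pair hypothesis above can be sketched as a simple geometric test over candidate line segments. The segment representation (position plus gradient angle) and the tolerances are illustrative assumptions; the paper's segment extraction and SIMD mapping are not reproduced here.

```python
import math

# Sketch of the road hypothesis: pair line segments whose gradient
# directions are roughly opposite (antiparallel) and whose separation is
# close to the expected road width d. Segments are (x, y, gradient_angle).

def antiparallel_pairs(segments, d, angle_tol=0.2, dist_tol=1.5):
    pairs = []
    for i, (x1, y1, a1) in enumerate(segments):
        for j in range(i + 1, len(segments)):
            x2, y2, a2 = segments[j]
            delta = (a1 - a2) % (2 * math.pi)
            if abs(delta - math.pi) > angle_tol:
                continue                      # gradients not opposite
            sep = math.hypot(x1 - x2, y1 - y2)
            if abs(sep - d) <= dist_tol:
                pairs.append((i, j))          # candidate road edge pair
    return pairs

# Two road edges 5 pixels apart with opposite gradients, plus a distractor.
segs = [(0.0, 0.0, 0.0), (5.0, 0.0, math.pi), (0.0, 3.0, 0.5)]
pairs = antiparallel_pairs(segs, d=5.0)
```

Because the test uses only relative angle and distance, it is orientation-independent, and retuning d and the tolerances lets the same test nominate other elongated objects.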
An optical symbolic neural net is described. It uses an optical symbolic correlator. This produces a new input neuron representation space that is shift-invariant and can accommodate multiple objects. No other neural net can handle multiple objects within the field of view. Initial optical laboratory data are presented. An optical neural net production system processes this new neuron data. This aspect of the system is briefly described.
A scheme is presented for the estimation of differential ranges on the basis of optical flow along a known direction, giving attention to the factors affecting the accuracy of results and various spatial and temporal smoothing algorithms employed to enhance the method's accuracy. It is found that while the use of edge detectors reduces noise, a priori knowledge of the environment improves the method's range discriminability. The method has been implemented on a real-time, high speed pipelined image processing engine capable of processing 60 image frames/sec, using a horizontally moving camera which generates optical flow along a scan line.
Morphological operators have proved to be an effective approach to automatic shape recognition. We cast shape recognition in noisy environments as the problem of recognizing imperfect shapes. The method we present in this paper does not require the use of all possible variations of a shape. Instead, it employs a priori known shape information as a basis for structuring elements, transforms objects into structuring elements, then uses the structuring elements in a hit-or-miss operation to find the location of the shape being recognized. The choice of structuring elements is critical. The image resulting from the hit-or-miss operation contains a set of points which indicate the locations of the target shape. Each occurrence of the target shape is represented by one point, or a small cluster of points within a known disk. A number of examples illustrating the process of recognizing imperfect shapes show that, even though the noise environment changes the appearance of the shapes to be recognized, our method provides a fast and accurate solution.
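The core hit-or-miss step can be sketched with a standard library routine. This is a minimal illustration on a clean image; the paper's contribution, structuring elements designed to tolerate imperfect shapes, is not reproduced here.

```python
import numpy as np
from scipy import ndimage

# Locating a target shape with the hit-or-miss transform: the "hit"
# structuring element is the shape itself, and the "miss" element (scipy's
# default, the complement of the shape within its window) enforces the
# required background, so each occurrence collapses to a single point.

shape = np.array([[1, 1, 1],
                  [1, 0, 1],
                  [1, 1, 1]], dtype=bool)      # a small ring-shaped target

image = np.zeros((12, 12), dtype=bool)
image[2:5, 2:5] = shape                        # first occurrence
image[7:10, 6:9] = shape                       # second occurrence
image[0, 11] = True                            # an isolated noise speck

loc = ndimage.binary_hit_or_miss(image, structure1=shape)
points = np.argwhere(loc)                      # one point per occurrence
```

The noise speck produces no detection because it fails the "hit" pattern, matching the abstract's claim that each target occurrence reduces to one point (or a small cluster).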
One of the goals of the development of vehicular systems for air, space, land and sea operations is the creation of autonomous behavior which is directed by intelligent processors. For intelligent vehicle systems the goal-directed autonomy includes convoy following. In this paper, the convoy problem is investigated as a real-time control problem, the prime input of which is a stereo vision sensing system. The general approach to maintaining a safe distance along with the calibration of the stereo system is discussed in this paper. The methodology merges intelligent real-time control with techniques developed to solve constituent problems in the system. Real-time system definitions are given as well as preliminary results of controlling an actual autonomous vehicle in real-time.
Image flow, the apparent motion of brightness patterns on the image plane, can provide important visual information such as distance, shape, surface orientation, and boundaries. It can be determined by either feature tracking or spatio-temporal analysis. We consider spatio-temporal methods, and show how differential range can be estimated from time-space imagery.
We generate a time-space image by considering only one scan line of the image obtained from a camera moving in the horizontal direction at each time interval. At the next instant of time, we shift the previous line up by one pixel, and obtain another line from the image. We continue the procedure to obtain a time-space image, where each horizontal line represents the spatial relationship of the pixels, and each vertical line the temporal relationship.
Each feature along the horizontal scan line generates an edge in the time-space image, the slope of which depends upon the distance of the feature from the camera. We apply two mutually perpendicular edge operators to the time-space image, and determine the slope of each edge. We show that this corresponds to optical flow. We use the result to obtain the differential range, and show how this can be implemented on the Pipelined Image Processing Engine (PIPE). We discuss several kinds of edge operators, show how using the zero crossings reduces the noise, and demonstrate how better discrimination can be achieved by knowledge of range.
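The slope measurement described above can be sketched numerically: a feature moving at v pixels per frame traces a line of slope v in the time-space image, and the ratio of the two perpendicular derivative responses recovers v, from which differential range follows (nearer features flow faster). The Sobel operators and the synthetic feature are illustrative; the paper's PIPE implementation and zero-crossing operators are not reproduced.

```python
import numpy as np
from scipy import ndimage

# Slope (optical flow) from a time-space image via two mutually
# perpendicular edge operators: v = -It / Ix along the feature's streak.

def flow_from_timespace(ts):
    Ix = ndimage.sobel(ts, axis=1)        # derivative along the scan line
    It = ndimage.sobel(ts, axis=0)        # derivative across frames (time)
    mask = np.abs(Ix) > 1e-3              # keep points with spatial contrast
    return -It[mask] / Ix[mask]

# Time-space image of a single feature moving 1 pixel per frame.
ts = np.zeros((20, 64))
for t in range(20):
    x = 5 + t
    ts[t, x - 1:x + 2] = [0.5, 1.0, 0.5]  # small blurred feature

v = float(np.median(flow_from_timespace(ts)))   # recovered pixels/frame
```

With a horizontally translating camera, a larger recovered slope means a larger image velocity and hence a nearer feature, which is the differential-range cue the paper exploits.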
To carry out complex tasks, such as reconnaissance missions in relatively unknown terrain, autonomous systems will need to constantly sense and perceive various aspects of their local environment. Range and motion detection capabilities will be required by an autonomous agent to create an internal representation of the environment that will be used in mission planning. Conventionally, shape-from-binocular stereo has been a popular, yet compute intensive and hence slow, technique for detecting range using passive sensors. In this paper, we present a pipeline architecture that performs correlation-based stereo matching and motion detection in near real-time. The system has been implemented using DATACUBE image processing boards and can match 256 x 256 pixel stereo image pairs in one second using a search range of 64 pixels. It has been tested on various indoor and outdoor images with generally successful results and is being used for obstacle detection in our work on autonomous navigation. The system demonstrates the feasibility of real-time stereo matching in the near future using miniaturized hardware that fits inside a vehicle. After discussing our approach, we present results using real images. We present the application of this work to autonomous navigation research.
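The correlation-based matching described above can be sketched with a sum-of-absolute-differences (SAD) window search along epipolar lines. The window size, disparity range, and SAD score are illustrative assumptions; the paper's DATACUBE pipeline is not reproduced.

```python
import numpy as np

# Window-based stereo matching sketch: for each left-image window, search
# a disparity range in the right image and keep the best (lowest SAD) offset.

def disparity_map(left, right, window=5, max_disp=16):
    h, w = left.shape
    r = window // 2
    disp = np.zeros((h, w), dtype=int)
    for y in range(r, h - r):
        for x in range(r + max_disp, w - r):
            patch = left[y - r:y + r + 1, x - r:x + r + 1]
            best, best_d = np.inf, 0
            for d in range(max_disp):
                cand = right[y - r:y + r + 1, x - d - r:x - d + r + 1]
                sad = np.abs(patch - cand).sum()
                if sad < best:
                    best, best_d = sad, d
            disp[y, x] = best_d
    return disp

# Toy scene: a textured image shifted 4 pixels between the two views.
rng = np.random.default_rng(3)
right = rng.random((32, 48))
left = np.roll(right, 4, axis=1)     # uniform disparity of 4 pixels
disp = disparity_map(left, right, window=5, max_disp=8)
```

The nested search is exactly the compute-intensive part the paper pipelines in hardware: every pixel repeats the same window comparison over the disparity range.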
A three-stage process for compressing real-time color imagery by factors in the range of 1600-to-1 is proposed for remote driving. The key is to match the resolution gradient of human vision and preserve only those cues important for driving. Some hardware components have been built and a research prototype is planned.
Stage 1 is log polar mapping, which reduces peripheral image sampling resolution to match the peripheral gradient in human visual acuity. This can yield 25-to-1 compression. Stage 2 partitions color and contrast into separate channels. This can yield 8-to-1 compression. Stage 3 is conventional block data compression such as hybrid DCT/DPCM, which can yield 8-to-1 compression. The product of all three stages is 1600-to-1 data compression.
The compressed signal can be transmitted over FM bands which do not require line-of-sight, greatly increasing the range of operation and reducing the topographic exposure of teleoperated vehicles. Since the compressed channel data contains the essential constituents of human visual perception, imagery reconstructed by inverting each of the three compression stages is perceived as complete, provided the operator's direction of gaze is at the center of the mapping. This can be achieved by eye-tracker feedback which steers the center of log polar mapping in the remote vehicle to match the teleoperator's direction of gaze.
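The Stage 1 log polar mapping can be sketched as sampling the image on rings whose radius grows exponentially, so resolution falls off toward the periphery as in human vision. The output size, ring count, and nearest-neighbor sampling here are illustrative assumptions, not the paper's hardware mapping.

```python
import numpy as np

# Log-polar sampling sketch: exponentially spaced rings around a center
# (which the eye-tracker would steer), uniformly spaced wedges per ring.

def log_polar(image, center, rings=32, wedges=64, r_min=2.0):
    h, w = image.shape
    r_max = min(center[0], center[1], h - center[0], w - center[1]) - 1
    out = np.zeros((rings, wedges))
    for i in range(rings):
        r = r_min * (r_max / r_min) ** (i / (rings - 1))  # exponential radii
        for j in range(wedges):
            theta = 2.0 * np.pi * j / wedges
            y = int(round(center[0] + r * np.sin(theta)))
            x = int(round(center[1] + r * np.cos(theta)))
            out[i, j] = image[y, x]           # nearest-neighbor sample
    return out

img = np.arange(256 * 256, dtype=float).reshape(256, 256)
lp = log_polar(img, center=(128, 128))

# Compression ratio for these illustrative parameters: 65536 / 2048 = 32,
# the same order as the 25-to-1 figure quoted for Stage 1.
ratio = img.size / lp.size
```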
Imaging systems have traditionally required large development cycles to transition from non-real-time implementations on general-purpose computers to final real-time system prototypes using custom hardware. This paper presents a flexible real-time prototyping approach for the Conceptual Definition, Demonstration and Validation phases of development for imaging system applications such as forward observer, perimeter defense, or "mobile barrier." A target acquisition and tracking system that has utilized this approach is discussed and used to compare hardware, software, resource, and schedule factors against other imaging system development programs. The testbed is shown to maintain a high degree of algorithm flexibility, allowing field test experiences to be rapidly incorporated into the system. The entire system is programmable using high order languages to minimize software costs and enhance maintainability. The system was developed and integrated into a mobile lab for field testing. During real-time testing the system was upgraded and modified to provide high detection performance with low false alarm rates. This approach has led to a more complete understanding of the problem being addressed and has positioned the system closer to its final product form.