Today an important convergence is taking place between video and computing. Fifteen years ago, computers and television had little, if any, common technology base. Consumer television receivers employed analog signal processing technology and were the only volume market for video displays. Computers, on the other hand, were the primary market for digital technology, and were used with various forms of paper input and output media. In the late 1970s and early 1980s, the computer industry went through a revolution. In addition to increasing memory, processing capabilities, and the use of microprocessors, video displays developed for television became output devices for computers for the first time. These developments spawned a plethora of new concepts and products, including personal computers, word processing, and computer graphics. Computer displays have now become a significant factor in the display industry, once dominated by television. Computer graphic applications, such as computer-aided design and computer-aided manufacturing, are now the technology drivers of display resolution, although consumer television remains the technology driver of display brightness. Furthermore, virtually every television receiver now employs digital circuitry and incorporates at least one microprocessor. In the future, television receivers will be even more heavily dependent upon digital technology.
The retina computes to let us see, but can we see the retina compute? Until now, the answer has been no, because the unconscious nature of the processing hides it from our view. Here the authors describe a method of seeing computations performed throughout the retina. This is achieved by using neurophysiological data to construct a model of the retina, and using a special-purpose image processing computer (PIPE) to implement the model in real time. Processing in the model is organized into stages corresponding to computations performed by each retinal cell type. The final stage is the transient (change-detecting) ganglion cell. A CCD camera forms the input image, and the activity of a selected retinal cell type is the output, which is displayed on a TV monitor. By changing the retinal cell type driving the monitor, the progressive transformations of the image by the retina can be observed. These simulations demonstrate the ubiquitous presence of temporal and spatial variations in the patterns of activity generated by the retina and fed into the brain. The dynamical aspects make these patterns very different from those generated by the common DOG (difference-of-Gaussians) model of the receptive field. Because the retina is so successful in biological vision systems, the processing described here may be useful in machine vision.
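For contrast with the dynamical retinal model above, the static DOG receptive field can be sketched as a fixed convolution kernel: a narrow excitatory Gaussian center minus a broad inhibitory Gaussian surround. This is a minimal illustrative sketch; the kernel size and sigmas are arbitrary choices, not parameters from the work described:

```python
import numpy as np

def gaussian_kernel(size, sigma):
    """Normalized 2-D Gaussian on a size x size grid."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    g = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return g / g.sum()

def dog_kernel(size, sigma_center, sigma_surround):
    """Difference of Gaussians: excitatory center minus inhibitory surround."""
    return gaussian_kernel(size, sigma_center) - gaussian_kernel(size, sigma_surround)

# Illustrative parameters only; not taken from the work described.
k = dog_kernel(15, sigma_center=1.0, sigma_surround=3.0)
# Both Gaussians are normalized, so the kernel sums to ~0: a uniform,
# static input evokes no response (center-surround antagonism).
print(abs(float(k.sum())) < 1e-9)  # → True
```

The zero-sum property is exactly what makes the DOG model static: it responds to spatial contrast but, lacking any temporal term, cannot produce the change-detecting behavior of the transient ganglion cell stage.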
With the advent of low-cost mid-range IR mosaic PtSi focal planes, there is an increasing need for accommodation of the human vision system in image exploitation workstation displays. This paper presents a suite of algorithms which match the neural response of the eye with the image-processing display by translating the raw 12-bit images into an enhanced 8-bit format for display. The tool box of translation algorithms includes histogram equalization, histogram projection, plateau projection, plateau equalization, modular projection, overlapping and non-overlapping zonal projection, sub-sampling projection, pseudocolor, and half gray scale/half pseudocolor. The operator/photointerpreter is presented a menu from which he may select an automatic mode which uses image statistics to enhance the image, or a manual mode optimized by the operator to his preference. The choice of the appropriate algorithm and operating mode compensates for the wide variance in IR gray scale and background clutter due to time of day, season, and atmospheric conditions. The tool box also includes standard image processing algorithms such as roam, zoom, sharpening, filtering, and convolution to manipulate and further enhance the translated images.
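Two of the listed translations, histogram equalization and a plateau variant, can be sketched as a 12-bit-to-8-bit lookup table. This is a hedged illustration of the general techniques, not the paper's implementation; the plateau value and the random test image are arbitrary:

```python
import numpy as np

def equalize_12_to_8(img12, plateau=None):
    """Map a 12-bit image to 8 bits via (optionally plateau-clipped)
    histogram equalization.

    plateau=None gives standard histogram equalization; a finite plateau
    clips each histogram bin before building the cumulative map, limiting
    the contrast allotted to large uniform backgrounds.
    """
    hist = np.bincount(img12.ravel(), minlength=4096)
    if plateau is not None:
        hist = np.minimum(hist, plateau)
    cdf = np.cumsum(hist).astype(np.float64)
    cdf /= cdf[-1]                          # normalize to [0, 1]
    lut = np.round(cdf * 255).astype(np.uint8)
    return lut[img12]

rng = np.random.default_rng(0)
img = rng.integers(0, 4096, size=(64, 64))  # stand-in for a raw 12-bit frame
out = equalize_12_to_8(img, plateau=16)
print(out.dtype, int(out.min()), int(out.max()))
```

The single lookup table is the reason such translations are cheap enough for interactive display: only the 4096-entry table changes between algorithms, not the per-pixel work.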
Operational reconnaissance technical organizations are burdened by greatly increasing workloads due to expanding capabilities for collection and delivery of large-volume near-real-time multisensor/multispectral softcopy imagery. Coupled with the tasking of reconnaissance platforms to provide this imagery are more stringent timelines for exploiting it in response to the rapidly changing threat environment being monitored. The development of a semi-automated softcopy multisensor image exploitation capability is a critical step toward integrating existing advanced image processing techniques with appropriate intelligence and cartographic data for next-generation image exploitation systems. This paper discusses the results of a recent effort to develop computer-assisted aids that enable the image analyst (IA) to rapidly and accurately exploit multispectral/multisensor imagery in combination with intelligence support data and cartographic information for target detection and identification. A key challenge of the effort was to design and implement an effective human-computer interface that would satisfy any generic IA task and readily accommodate the needs of a broad range of IAs.
The Friendly Lisp Image Processing System (FLIPS) is the interface to Advanced Target Detection (ATD), a multi-resolutional image analysis system developed by Hughes in conjunction with the Hughes Research Laboratories. Both menu- and graphics-driven, FLIPS enhances system usability by supporting the interactive nature of research and development. Although much progress has been made, fully automated image understanding technology that is both robust and reliable is not a reality. In situations where highly accurate results are required, skilled human analysts must still verify the findings of these systems. Furthermore, the systems often require processing times several orders of magnitude greater than that needed by veteran personnel to analyze the same image. The purpose of FLIPS is to facilitate the ability of an image analyst to take statistical measurements on digital imagery in a timely fashion, a capability critical in research environments where a large percentage of time is expended in algorithm development. In many cases, this entails minor modifications or code tinkering. Without a well-developed man-machine interface, throughput is unduly constricted. FLIPS provides mechanisms which support rapid prototyping for ATD. This paper examines the ATD/FLIPS system. The philosophy of ATD in addressing image understanding problems is described, and the capabilities of FLIPS are discussed, along with a description of the interaction between ATD and FLIPS. Finally, an overview of current plans for the system is outlined.
A mechanism for attaching graphic and overlay annotation to multiple bits/pixel imagery while providing levels of performance approaching that of native mode graphics systems is presented. This mechanism isolates programming complexity from the application programmer through software encapsulation under the X Window System. It ensures display accuracy throughout operations on the imagery and annotation, including zooms, pans, and modifications of the annotation. Trade-offs that affect speed of display, consumption of memory, and system functionality are explored. The use of resource files to tune the display system is discussed. The mechanism makes use of an abstraction consisting of four parts: a graphics overlay, a dithered overlay, an image overlay, and a physical display window. Data structures are maintained that retain the distinction between the four parts so that they can be modified independently, providing system flexibility. A unique technique for associating user color preferences with annotation is introduced. An interface that allows interactive modification of the mapping between image value and color is discussed. A procedure that provides for the colorization of imagery on 8-bit display systems using pixel dithering is explained. Finally, the use of these annotation mechanisms in various applications is discussed.
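The pixel-dithering idea behind colorization on 8-bit displays can be illustrated in grayscale with a classical ordered (Bayer) dither, which trades spatial resolution for apparent tonal depth. This is a minimal sketch of the general technique, not the system's actual procedure:

```python
import numpy as np

# Classical 4x4 Bayer matrix; thresholds spread evenly over [0, 1).
BAYER4 = (1.0 / 16.0) * np.array([[ 0,  8,  2, 10],
                                  [12,  4, 14,  6],
                                  [ 3, 11,  1,  9],
                                  [15,  7, 13,  5]])

def ordered_dither(gray, levels=4):
    """Quantize a float image in [0, 1] to `levels` output values, using
    position-dependent thresholds so that local mixtures of the available
    values approximate intermediate tones."""
    h, w = gray.shape
    thresh = np.tile(BAYER4, (h // 4 + 1, w // 4 + 1))[:h, :w]
    scaled = gray * (levels - 1)
    return np.floor(scaled + thresh).clip(0, levels - 1) / (levels - 1)

gray = np.linspace(0.0, 1.0, 64).reshape(1, -1).repeat(16, axis=0)  # ramp
d = ordered_dither(gray, levels=4)
print(np.unique(d).size)  # → 4
```

On an 8-bit display the same principle applies per color channel: a small palette plus spatial dithering stands in for the full color resolution the hardware lacks.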
The Strategic Computing Object-directed Reconnaissance Parallel-processing Image Understanding System (SCORPIUS) program was successfully completed in September 1990. Initiated in 1985, the program was known then as the Strategic Computing Image Understanding Program (SCIUP). SCORPIUS was a research program that combined emerging technologies from DARPA's Image Understanding (IU) and Strategic Computing Initiative (SCI) programs in a real world application. This application demonstrated the automated image exploitation of aerial imagery to extract intelligence from image data.
Automatic target recognition (ATR) schemes attempt to locate and classify given target objects. Successful approaches either require substantial computing power to correlate target models with all pixels or use a search strategy to minimize the number of pixels considered as candidates. The search refinement approach to ATR can be accomplished by continually examining various resolutions of the data sets for candidate objects or through a technique of excluding areas that could not contain the object. This paper describes an approach for excluding areas which are identified a priori as 'clutter' and therefore cannot contain the objects of interest.
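The resolution-based search refinement strategy can be sketched as a coarse-to-fine template search: a cheap pass at reduced resolution excludes most of the image, and only a small neighborhood of the coarse hit is examined at full resolution. The sum-of-absolute-differences score and the factor-of-4 pyramid here are illustrative assumptions, not the paper's method:

```python
import numpy as np

def block_mean(img, f):
    """Downsample by block-averaging with factor f."""
    h, w = img.shape
    return img[:h - h % f, :w - w % f].reshape(h // f, f, w // f, f).mean(axis=(1, 3))

def sad_map(image, template):
    """Sum of absolute differences for every template placement."""
    th, tw = template.shape
    H, W = image.shape[0] - th + 1, image.shape[1] - tw + 1
    out = np.empty((H, W))
    for i in range(H):
        for j in range(W):
            out[i, j] = np.abs(image[i:i + th, j:j + tw] - template).sum()
    return out

def coarse_to_fine(image, template, f=4):
    """Best coarse match at 1/f resolution, refined in a small
    full-resolution window; most pixels are never examined at full
    resolution."""
    th, tw = template.shape
    coarse = sad_map(block_mean(image, f), block_mean(template, f))
    cy, cx = np.unravel_index(coarse.argmin(), coarse.shape)
    y0, x0 = max(0, cy * f - f), max(0, cx * f - f)
    region = image[y0:cy * f + f + th, x0:cx * f + f + tw]
    fine = sad_map(region, template)
    fy, fx = np.unravel_index(fine.argmin(), fine.shape)
    return int(y0 + fy), int(x0 + fx)

rng = np.random.default_rng(2)
image = rng.random((64, 64))
template = image[20:36, 28:44].copy()   # plant the target at (20, 28)
print(coarse_to_fine(image, template, f=4))  # → (20, 28)
```

A-priori clutter exclusion composes naturally with this scheme: any region flagged as clutter is simply removed from the coarse map before the argmin, so it never reaches the expensive fine pass.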
Current research in the area of automatic target recognition (ATR) has produced system concepts shown to be capable of accurately recognizing targets in imagery. However, the ATR task is typically limited in scope and requires long runtimes. The authors have developed an innovative ATR system using a parallel rule based production system. The predicted signature of a three-dimensional target in the two-dimensional scene is coupled with feature extraction information to create a set of rules that will guide the image search and match the predicted signature to extracted features. As various predicted features are found, confidence is accumulated for the object and its orientation. Since there can be tens of thousands of features extracted from an image, data level parallel processing is an ideally suited architecture. The system architecture and performance are described in this paper. Work is ongoing; however, results to date are exceptionally encouraging. The ATR application used during development has generated a few hundred facts and has successfully recognized the target of interest in the presence of other targets and a cluttered background. Performance indicates that rule execution runtimes increase slowly as the number of facts increases, and are orders of magnitude faster than comparable serial implementation benchmarks.
Typical vision systems which attempt to extract features from a visual image of the world for the purposes of object recognition and navigation are limited by the use of a single sensor and no active sensor control capability. To overcome limitations and deficiencies of rigid single sensor systems, more and more researchers are investigating actively controlled, multisensor systems. To address these problems, we have developed a self-calibrating system which uses active multiple sensor control to extract features of moving objects. A key problem in such systems is registering the images, that is, finding correspondences between images from cameras of differing focal lengths, lens characteristics, and positions and orientations. The authors first propose a technique which uses correlation of edge magnitudes for continuously calibrating pan and tilt angles of several different cameras relative to a single camera with a wide angle field of view, which encompasses the views of every other sensor. A simulation of a world of planar surfaces, visual sensors, and a robot platform used to test active control for feature extraction is then described. Motion in the field of view of at least one sensor is used to center the moving object for several sensors, which then extract object features such as color, boundary, and velocity from the appropriate sensors. Results are presented from real cameras and from the simulated world.
Two new methods are proposed. The first is used to find attitude (or even position, if the planar shape size is available) of a planar shape in space from a single perspective view. The second is used to recognize a moving planar shape with two images. In the first method, a fast rotative transformation algorithm is designed for some geometric features of planar shape images, and a search tree algorithm is employed to find correct camera orientation based on a given shape model. Then the attitude of the planar shape with respect to the camera can be derived from calculated camera orientation. In the second method, the planar shape is identified among several models with two images by comparing results of the first method applied to each model.
Filtering by morphological operations is particularly suited for removal of clutter and noise objects which have been introduced into noiseless binary images. The morphological filtering is designed to exploit differences in the spatial nature (shape, size, orientation) of the objects (connected components) in the ideal noiseless images as compared to the noise/clutter objects. Since the typical noise models (union, intersection, set difference, etc.) for binary images are not additive, and the morphological processing is strongly nonlinear, optimal filtering results conventionally available for linear processing in the presence of additive noise are not directly applicable to morphological filtering of binary images. In this paper, a morphological filtering analog to the classic Wiener filter is described.
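The kind of shape- and size-selective clutter removal described can be illustrated with a binary morphological opening, which deletes connected components too small for the structuring element to fit inside. A minimal numpy sketch (the 3x3 structuring element and test image are arbitrary choices):

```python
import numpy as np

def erode(img, se):
    """Binary erosion: a pixel survives only if the structuring element,
    centered there, fits entirely inside the foreground."""
    h, w = se.shape
    ph, pw = h // 2, w // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)), constant_values=0)
    out = np.ones_like(img)
    for i in range(h):
        for j in range(w):
            if se[i, j]:
                out &= padded[i:i + img.shape[0], j:j + img.shape[1]]
    return out

def dilate(img, se):
    """Binary dilation: a pixel is set if any foreground pixel falls under
    the (symmetric) structuring element centered there."""
    h, w = se.shape
    ph, pw = h // 2, w // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)), constant_values=0)
    out = np.zeros_like(img)
    for i in range(h):
        for j in range(w):
            if se[i, j]:
                out |= padded[i:i + img.shape[0], j:j + img.shape[1]]
    return out

def opening(img, se):
    """Erosion then dilation: removes clutter components the structuring
    element cannot fit into, while restoring larger objects."""
    return dilate(erode(img, se), se)

se = np.ones((3, 3), dtype=int)
img = np.zeros((12, 12), dtype=int)
img[2:7, 2:7] = 1   # 5x5 'signal' object: survives the opening
img[9, 9] = 1       # isolated single-pixel 'clutter': removed
opened = opening(img, se)
print(int(opened[9, 9]), int(opened[4, 4]), int(opened.sum()))  # → 0 1 25
```

The nonlinearity the abstract emphasizes is visible here: opening is idempotent and not invertible, so linear-filtering optimality arguments do not transfer directly.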
In practical pattern recognition problems, one-shot classifiers such as single feedforward neural networks trained by back-propagation may operate inefficiently in a complex pattern space and/or have unstable trained configurations. An alternative is a decision tree classifier. The authors report on the design, training, and accuracy of a hierarchical classifier implementing neural nets. Each nonterminal node is a separate feedforward neural network and is neither restricted to binary decisions nor limited to using only one feature to make those decisions. The features are pyramid data structures: identical texture parameters calculated across three different image resolutions about the training sites. In this application, results show a twenty percent relative increase in accuracy over the monolithic classifier.
Image understanding is a cross-disciplinary field, drawing on concepts and algorithms from image processing, pattern recognition, and artificial intelligence. An integrated system for image understanding may require a variety of capabilities that appear quite disparate, such as image restoration to compensate for degradations detected in the data, followed by logical inference to interpret features extracted from the restored data. The authors establish that constrained optimization provides a uniform formulation for two such apparently disparate problems: restoration of blurred imagery, and logical deduction or mechanized inference. Each problem is shown to be formulable as a linear programming (LP) problem. The 'deblurred' image is regained by minimizing a linear objective function subject to the constraints imposed by the blur. The degree of truth or falsity of a consequent proposition is established by maximizing a linear objective function subject to the constraints imposed by the premises.
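A toy instance of the deblurring-as-LP idea can be written with a small known blur matrix B and nonnegativity constraints. Everything here is an illustrative assumption (the matrix, the signal, and the uniform objective); note that when B is invertible the equality constraints pin down the solution regardless of which linear objective is minimized:

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical 1-D blur: each observed sample mixes neighboring true samples.
B = np.array([[0.6, 0.4, 0.0],
              [0.2, 0.6, 0.2],
              [0.0, 0.4, 0.6]])
x_true = np.array([1.0, 3.0, 2.0])
y = B @ x_true                       # the observed blurred signal

# LP: minimize a linear objective subject to the blur constraints
# B x = y with x >= 0.
res = linprog(c=np.ones(3), A_eq=B, b_eq=y, bounds=(0, None), method="highs")
print(np.round(res.x, 6))
```

In a realistic restoration the constraint set is much larger and typically uses inequality bands around the observations to absorb noise, but the structure (linear objective, linear blur constraints, nonnegativity) is the same.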
An optical fiber inspection technique for detecting the following defects is described: hard debris, necks, and bubbles. It is assumed that there is a single predominant defect in the field of view. Two complementary images of a fiber are utilized: in the first image, the fiber is illuminated with white light from the back; in the second, the fiber is illuminated with a laser from the side and slightly above. Each defect is characterized by its typical appearance in the two images. The complementary images are used to resolve ambiguities and provide redundancy in cases where a defect cannot be clearly identified in one image alone. Due to modular system design, the system can be easily augmented if the manufacturing process introduces new defects. After describing the inspection technique, the authors show results of exercising the system on a number of images.
This paper describes a generalized flaw detection scheme for a molded and machined turbine blade. The data used are radiograph images. Based on knowledge of the molding and machining process, selective features may be isolated and classified for each possible flaw candidate. The proposed classification system requires the incorporation of many smaller pattern recognition systems. Several of these pattern recognition subsystems have been developed and implemented. The implementation of one such subsystem, whose characteristics are best realized using a back-propagation neural network, is described. The results of the network are compared with other classification schemes (k-nearest-neighbor and Bayes classifiers).
Inspection of industrial images can be a laborious task. Automating the inspection using image processing techniques works effectively only with an appropriate human interface. This paper describes a semi-automatic aircraft engine component motion registration system. Manual inspection of aircraft engine x-ray data was replaced by the use of several interactive programs running on a personal computer. This system allowed the inspector to digitize, process, tabulate, and document test image sequences without requiring image processing experience. The new environment also provided a digital replacement for the analog densitometer previously used, as well as enabling the extraction of digital templates of arbitrary size. Once two masks were selected, measurements could be performed by correlating the pair with a sequence of images in a batch process. Calibrated measurement results were sent automatically to file, printer, or screen; hardcopy output of found templates, superimposed on individual test images, was used for visual verification. Several image processing techniques for performing correlation were surveyed and three of them were implemented. The complexity, speed, and accuracy of each are presented. The methods implemented were direct normalized cross-correlation, hierarchical normalized spatial cross-correlation, and Fourier-transform-based cross-correlation (using an array processor). Extensions for scale and rotational invariance are also discussed. Attempts were made to fully automate the process, replacing the human expert with equivalent image understanding routines. The expert selected templates using criteria such as edge detail, contrast, and local histograms. These strategies were applied to automatically select templates containing the desired measurement points. Results and limitations are discussed.
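Of the three correlation methods surveyed, direct normalized cross-correlation is the simplest to sketch. A minimal numpy version follows (loop-based and slow; the hierarchical and Fourier variants trade this simplicity for speed). The test image and template are illustrative, not inspection data:

```python
import numpy as np

def ncc(image, template):
    """Direct normalized cross-correlation: the correlation coefficient of
    the mean-removed template with every same-size window of the image.
    Scores lie in [-1, 1]; the peak marks the best match (top-left corner
    of the template placement)."""
    th, tw = template.shape
    t = template - template.mean()
    tnorm = np.sqrt((t * t).sum())
    H, W = image.shape[0] - th + 1, image.shape[1] - tw + 1
    out = np.zeros((H, W))
    for i in range(H):
        for j in range(W):
            win = image[i:i + th, j:j + tw]
            wc = win - win.mean()
            denom = tnorm * np.sqrt((wc * wc).sum())
            out[i, j] = (t * wc).sum() / denom if denom > 0 else 0.0
    return out

rng = np.random.default_rng(1)
image = rng.random((32, 32))
template = image[10:18, 14:22].copy()   # template cut from the image itself
score = ncc(image, template)
peak = np.unravel_index(score.argmax(), score.shape)
print(tuple(int(v) for v in peak))  # → (10, 14)
```

The per-window mean removal and normalization are what make the score insensitive to the local brightness and contrast variations typical of x-ray sequences; plain correlation lacks that invariance.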
To assure the quality of aircraft skin, it must be free of surface imperfections and structural defects. Manual inspection methods involve mechanical and optical technologies. Machine vision instrumentation can be automated to increase the inspection rate and the repeatability of measurement. As previous industry experience has shown, machine vision instrumentation is not calibrated and certified as easily as mechanical devices. The defect must be accurately measured and documented via a printout for engineering evaluation and disposition. In actual use for inspection, the device must be portable for the factory, the flight line, or an aircraft anywhere in the world. The instrumentation must be inexpensive and operable with a mechanic/technician level of training. The instrument design requirements are extensive, requiring a multidisciplinary approach to the research and development. This paper presents image analysis results for laser images of microscopic scratch structures on various surfaces. Also discussed are the hardware and algorithms used to acquire and analyze these images. Dedicated hardware and embedded software implementing the image acquisition and analysis have been developed. In the human interface, human vision is used to determine which image should be processed. Once an image is chosen for analysis, the final answer is a numerical value of the scratch depth, a result that is reliable and repeatable. The prototype has been built and demonstrated to Boeing Commercial Airplanes Group factory Quality Assurance and flight test management with favorable response.
Assembling flint wheels into cigarette lighters requires that the wheels be inserted into the assembly from a certain side of the wheel. Computer vision techniques are used to enable the assembly system to determine the appropriate orientation of upright and fallen wheels and properly insert them into the assembly. Wheel surface images obtained under directional lighting are studied, and features are found which provide the required distinguishability. Near real-time decision making is achieved by the system, and very satisfactory results are reported.
Cartographic compilation requires precision mensuration. The calibration of mensuration processes is based on specific fiducials. External fiducials, around the exterior frame of the image, must be precisely measured to establish the overall physical geometry. Internal fiducials are provided within the image by placement of cloth panels on the ground at locations whose position is precisely known. Both types of fiducials must be known within the pixel space of a digitized image in order for the feature extraction process to be accurate with respect to delineated features. Precise mensuration of these fiducials requires that a cartographer view the image on a display and use pointing devices, such as a mouse, to pick the exact point. For accurate fiducial location, the required manual operations can be an added time-consuming task in the feature extraction process. The authors developed interactive tools which eliminate the precise pointing action for the operator. The operator is required only to 'box-in' the fiducial, using a simple drawing tool, and select either the internal or external fiducial functions; the software of the tool returns the precise location of the fiducial. The theory of the analysis used by the tool is discussed.
Methods are described for an automated recognition of height contours from graphic maps. The methods are applied to a large-scale conversion of contour data into digital format. Two data structures are used during the recognition process. The actual contour data is fully described by a line structure obtained by vectorizing the thinned medial axis of the original scanned contours. A second line structure is obtained by applying a similar procedure for the background of the image. This process yields a vectorized medial axis of the background consisting of so-called dual lines. By definition, each dual line represents a neighborhood relation between two contour line segments. The recognition of the contours consists of two phases. Closed height contours are first constructed from broken contour segments followed by attaching absolute height labels to the recognized contours. The formation of closed contours relies on a knowledge-based algorithm which utilizes both attributes of actual contour line segments and their inter-relations described by the dual lines. The dual lines are also valuable when attaching height labels on the contours.
Cartographic feature extraction is a manpower-intensive process, requiring detailed and tedious labor by a skilled cartographer to identify and delineate significant cartographic features from an image. The availability of digital image data has made feasible the use of computers to aid in the extraction of features. In particular, much interest has been generated in the potential for AI and IU techniques to automate feature extraction. In this paper we report on techniques to assist the cartographer: the cartographer initiates the process by picking a point or points on the feature, and the tools complete the delineation. We discuss two such tools, one which delineates line features and one which delineates area features. Both tools utilize neural nets to carry out the critical decisions in tracking feature boundaries. In both tools the cartographer is allowed to concentrate on the most important and professionally rewarding task, feature detection and identification, and is spared the most tedious task, feature boundary delineation.
The authors have previously proposed a network of probabilistic cellular automata (PCAs) as part of an image recognition system designed to integrate model-based and data-driven approaches in a connectionist framework. The PCA arises from some natural requirements on the system, which include incorporation of prior knowledge such as inference rules, locality of inferences, and full parallelism. This network has been applied to recognize objects in both synthetic and real data. This approach achieves recognition through the short-, rather than the long-time behavior of the dynamics of the PCA. In this paper, some methods are developed for learning the connection strengths by solving linear inequalities: the figures of merit are tendencies or directions of movement of the dynamical system. These 'dynamical' figures of merit result in inequality constraints on the connection strengths which are solved by linear programs (LP) or quadratic programs (QP). An algorithm is described for processing a large number of samples to determine weights for the PCA. The work may be regarded as either pointing out another application for constrained optimization, or as pointing out the need to extend the perceptron and similar methods for learning. The extension is needed because the neural network operates on a different principle from that for which the perceptron method was devised.