Model-based object recognition can be formulated as finding affine transforms such that the locations of all object features are consistent with the projected positions of the model from a single view. This paper describes an efficient method for computing the transform using symbolic clustering. Matches between image features and object features are explored to generate hypotheses of possible object locations. Consistent hypotheses are grouped to form clusters. Supporting evidence from the participating hypotheses of a cluster is collected to generate a new transform hypothesis. Clusters that contain a sufficient amount of evidence are selected for further verification. Hypotheses are verified by comparing the object against the image directly. The advantage of this approach is that the basic hypotheses can be computed easily and in parallel, and the clusters can be generated efficiently. Also, since clusters with strong support are selected and investigated first, the correct transforms are likely to be computed early in the hypothesize-and-test process. Therefore, the total amount of computation for the recognition task may be reduced.
The application of knowledge-based processing in image understanding systems has always occurred at a high level. These approaches have generated great claims, but limited results as they are basically simple forward chaining rule systems (e.g., if object on a road, then object is a vehicle). The application of heuristic reasoning to the low-level processing (e.g., image enhancement, segmentation, feature extraction, and classification) is a requirement to provide more accurate object and region information for high-level analysis, and truly integrate artificial intelligence throughout the entire system rather than as a post processing afterthought. This paper addresses the design and development of an integrated knowledge-based vision system in three phases. First, the application of knowledge base system techniques to image understanding is analyzed in light of deficiencies and limitations. This analysis is then exploited to produce a synergistically integrated system design. Second, the application of heuristics to low-level processing is discussed with specific application in the areas of image enhancement and segmentation. Finally, conclusions drawn to date on system performance will be presented in association with a mapping of how they are directing future work in this area.
Since the beginning of the domestic oil industry in 1859, the oil exploration process has been dependent on rules-of-thumb and best-guess strategies of expert geologists. In the absence of well-defined methods for collecting, analyzing, and interpreting geological data, oil exploration involved much guesswork and extreme financial risks, especially in the beginning when it was conducted "on a basis of feast or famine". The nature of oil exploration is such that it takes many years (sometimes decades) for a geologist to gain the knowledge and experience needed for analyzing geological data in a reliable and efficient manner. Exposure to a wide range of situations seems to be the key to developing expertise in the field. Another important issue that impacts oil exploration is the acquisition and handling of the diverse and vast amounts of data that are needed by experts to make reliable conclusions.
An expert system is being implemented to enhance the operability of the Ground Communication Facility (GCF) of the Jet Propulsion Laboratory's (JPL) Deep Space Network (DSN). The DSN is a tracking network for all of JPL's spacecraft plus a subset of the spacecraft launched by other NASA centers. A GCF upgrade task is set to replace the current, aging GCF system with modern equipment capable of supporting a knowledge-based monitor and control approach. The expert system, implemented in KEE on a SUN workstation, performs network fault management, configuration management, and performance management in real time. Monitor data are collected from each processor and from the DSCCs every five seconds. In addition to serving as input parameters of the expert system, extracted management information is used to update a management information database. For monitor and control purposes, the software of each processor is divided into layers following the OSI standard. Each layer is modeled as a finite state machine. A System Management Application Process (SMAP) is implemented at the application layer; it coordinates the layer managers of the same processor and communicates with peer SMAPs on other processors. The expert system will be tuned by augmenting the production rules as operation proceeds, and its performance will be measured.
This paper describes a knowledge-based expert system that uses return features, provided by image analysts, to identify an object as a specific instance or class of object, such as a tank or truck. Partial feature sets allow the expert system to classify occluded and unfamiliar or falsified object data returns to the most likely class with a specified reasoning path. The rule based system was developed using the Prolog version of Ml.
This paper introduces a knowledge-based list recognition and understanding system. It is capable of classifying lists of various types, recognizing the characters in the items of a list, and finally constructing a symbolic description of the list. The problems to be solved in the development of this system are divided into three parts: (1) analysis of a list image skeleton, (2) recognition of lists, and (3) construction of a list symbolic description. A blackboard model of problem solving, which realizes a hierarchical control strategy, is adopted in the development of this system. The system consists of five parts: Knowledge Base System, Control System, Dynamic Data Station (or Blackboard), Processing Modules, and I/O System. The system is still under development. Our initial investigations have achieved satisfactory results.
The U.S. Army must examine the utilization of artificial intelligence to assist combat leaders on the AirLand battlefield of the year 2000. Automated reporting, FM secure digital-burst communications and mission planning aids will provide the combat leader with the capability to fight and win on an extremely fluid and lethal battlefield.
In this paper, we discuss two control schemes, based on (1) centrally controlled region-growing and (2) iterative quadtree splitting, for incorporating knowledge-based processes into the segmentation of texture images. An important feature of these two schemes is that knowledge about the nature of the images is directly involved in the partition process rather than being used afterwards to label the resulting segments of the partition. Prototype systems which we implemented for the automatic interpretation of seismic sections are described in detail. A specific application of these systems on a test section of real seismic data from the Gulf of Mexico is presented. Test runs on the data have shown that both schemes give a much improved segmentation result over the one obtained by a conventional approach.
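As a rough illustration of the second scheme, the following is a minimal sketch of iterative quadtree splitting in which the split decision is delegated to a caller-supplied predicate. The function names, the `min_size` parameter, and the simple intensity-range predicate are assumptions made for this example; the abstract does not specify the knowledge-based tests used in the prototype systems.

```python
def quadtree_split(img, is_homogeneous, min_size=1):
    """Split a 2-D image (list of equal-length rows) into leaf blocks.

    Returns a list of (row, col, height, width) tuples. A block is kept
    as a leaf when is_homogeneous(block) is true or when it cannot be
    split further; otherwise it is divided along each splittable axis.
    """
    leaves = []
    stack = [(0, 0, len(img), len(img[0]))]  # explicit stack: iterative, not recursive
    while stack:
        r, c, h, w = stack.pop()
        block = [row[c:c + w] for row in img[r:r + h]]
        too_small = h <= min_size and w <= min_size
        if too_small or is_homogeneous(block):
            leaves.append((r, c, h, w))
            continue
        # Halve each axis that is still larger than min_size.
        row_cuts = [(r, h)] if h <= min_size else [(r, h // 2), (r + h // 2, h - h // 2)]
        col_cuts = [(c, w)] if w <= min_size else [(c, w // 2), (c + w // 2, w - w // 2)]
        for rr, hh in row_cuts:
            for cc, ww in col_cuts:
                stack.append((rr, cc, hh, ww))
    return leaves


def small_range(block, tol=0):
    """Hypothetical stand-in predicate: intensity range within tol."""
    values = [v for row in block for v in row]
    return max(values) - min(values) <= tol


image = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
blocks = quadtree_split(image, small_range)
```

Passing the predicate as an argument is one way to let domain knowledge take part in the partition itself rather than in a later labeling pass.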
Machine vision algorithms designed to model the human preattentive perception of texture boundaries often define the texture of a region on the basis of a single perceptual property. In addition, humans can segment textures even when the regions are spatially non-homogeneous in the texture properties of the primitive texture elements. This paper proposes a model of texture segmentation, and a texture segmentation algorithm based on the model, which can produce segmentations which agree with human perception.
This paper describes a new spatial-spectral feature extraction and selection technique for analysis and segmentation of the color images of natural scenes. It is a statistical-structural method which is developed by modeling several basic scene patterns such as uniformly colored object regions, textured image areas, and shadowed or highlighted image segments. The uniformly colored object parts are characterized by their spatial continuity property. The textured areas are identified by their spatial placement rules and spectral primitives. The shadowed and highlighted image segments are determined by their constant chromaticity and gradually varying lightness property. The basic rule used in this modelling process is the Julesz conjecture. The features are extracted by a simple averaging process in the local areas of these visual patterns, which are determined by the algorithm. This averaging process implicitly incorporates the spatial and spectral information contained in the local areas. Thus the extracted feature set is expected to provide better clustering in the feature space than the sensory data alone.
This paper presents a highly structured and compact representation of grey-level images. Addition and multiplication are defined for the set of all grey-level images which become commutative semigroups under these operations. Images can then be described as polynomials of two variables. Examples are given for ordered and randomized image patterns. Also examined are grey-level image magnification, and image contours.
A scene interpretation system can be partitioned into three levels of processing: low level, mid level, and high level. We describe here a systematic approach to scene interpretation that embeds these three levels of processing in a structured hierarchical blackboard architecture. Each blackboard level contains modular knowledge sources which perform numeric and/or symbolic processing. The low level processing blackboard is embedded as one of the knowledge sources within the mid level processing blackboard which is in turn embedded as a knowledge source within the high level processing blackboard. The benefits of this scene interpretation system architecture are efficient bi-directional flow of control between processing levels, modularized multispectral/multisensor image processing, and information feedback for performance improvement and for spatial/temporal tracking.
Logic programming methods have been used to implement an expert system shell which represents and reasons about the implications of causal event streams, in arbitrarily complex causal networks. The system features the capability of "forward" and "backward" inference, "fuzzy" causal inference, default vs conditional reasoning, the ability to combine 1st generation "if-then" heuristics with 2nd generation structural inference during fault diagnosis exercises and the ability to generate inferences from empirical data as well as from conceptually based models of causal relationships. A description of the direct causal connections between pairs of focal events (the causal topology), along with the enumeration of policy variables (manipulable inputs), system observables (events which can be observed) and goal variables (desired outputs) enable declarative representations of meta-level causal structures used during the causal inference process. The simplicity of these input requirements, the expressive power of an open ended Logic Programming environment and the availability of a rich set of analytic tools all combine to provide the knowledge engineer with the ability to quickly build systems which can reason symbolically about complex causal structures.
This paper presents an implementation of an expert system shell which has the capability of reasoning non-monotonically. An expert system is then developed which is capable of reasoning about software system development. The implementation of this expert system provides a working prototype that models a full scale environment for complex software system development.
REACT (Rapid Expert Assessment to Counter Threats) is a prototype artificial intelligence system which is being developed at the Grumman Corporate Research Center. The purpose of REACT is to aid pilots in determining appropriate threat response strategies during combat situations. REACT is also being designed to monitor the state of the aircraft's onboard systems in order to offload the pilot from performing some of these tasks. REACT will notify the pilot during unusual circumstances which require some corrective actions to be taken. Our approach to carrying out these tasks is to develop individual, cooperating expert systems. The REACT experts are being designed to execute independently to obtain specified goals and they will also have the capability to share information using a blackboard model so that they can work collectively to solve problems as a single system.
Communications is an expert intensive discipline. The application of expert systems for maintenance of large and complex networks, mainly as an aid in troubleshooting, can simplify the task of network management. The important steps involved in troubleshooting are fault detection, fault reporting, fault interpretation and fault isolation. At present, Network Maintenance Facilities are capable of detecting and reporting the faults to network personnel. Fault interpretation refers to the next step in the process, which involves coming up with reasons for the failure. Fault interpretation can be characterized in two ways. First, it involves such a diversity of facts that it is difficult to predict. Second, it depends on a wealth of knowledge held by network management personnel. The application of expert systems in these interpretive tasks is an important step towards automation of network maintenance. In this paper, INDEX (Intelligent Network Diagnosis Expediter), a rule based production system for computer network alarm interpretation, is described. It acts as an intelligent filter for people analyzing network alarms. INDEX analyzes the alarms in the network and identifies the proper maintenance action to be taken. The important feature of this production system is that it is data driven. Working memory is the principal data repository of production systems and its contents represent the current state of the problem. Control is based upon which productions match the constantly changing working memory elements. Implementation of the prototype is in OPS83. Major issues in rule based system development such as rule base organization, implementation and efficiency are discussed.
I am motivated to write this paper because I am continually surprised at the level of misunderstanding about AI in DoD both on the technology level and the applications level. As for AI as a technology, which is not the subject of this paper, there has been much progress over the last few years, but that progress has been incremental and evolutionary. In contrast to the big splash that AI made some few years ago when everything about the technology seemed revolutionary, this progress may be so subtle as to be invisible to the casual observer of the technical scene. Technical progress notwithstanding, there is much that can be said about the limitations of the current level of AI. As obvious as it is to those in the AI research community, for whom these limitations enable funding, it is surprising how many people believe that AI is technically "mature." I feel compelled to point out the obvious: a demonstration of some capability (in AI or other technology) on one restricted instance of a general class of problems is important as an existence proof of a technology, but it does not satisfy the general need for a technology that will be able to produce solutions for all unrestricted problems in that class. It is in this sense that I believe that AI will require much basic and engineering research from DoD and other sources for many years to come. Given the utility derived from the relatively modest level of today's technology, I believe that even incremental gains here will prove of phenomenal value to DoD and the economy in general.
The significance of compiling case histories of empirical process knowledge and the role of such histories in improving the efficiency of manufacturing process development is discussed in this paper. Methods of representing important investigations as cases and using the information from such cases to eliminate redundancy of empirical investigations in analogous process development situations are also discussed. A system is proposed that uses such methods to capture the problem-solving framework of the application domain. A conceptual design of the system is presented and discussed.
The advent of personal engineering workstations has brought substantial information processing power to the individual programmer. Advanced tools and environment capabilities supporting the software lifecycle are just beginning to become generally available. However, many of these tools are addressing only part of the software development problem by focusing on rapid construction of self-contained programs by a small group of talented engineers. Additional capabilities are required to support the development of large programming systems where a high degree of coordination and communication is required among large numbers of software engineers, hardware engineers, and managers. A major player in realizing these capabilities is the framework supporting the software development environment. In this paper we discuss our research toward a Knowledge-Based Software Assistant (KBSA) framework. We propose the development of an advanced framework containing a distributed knowledge base that can support the data representation needs of tools, provide environmental support for the formalization and control of the software development process, and offer a highly interactive and consistent user interface.
Systems that assess the real world must cope with evidence that is uncertain, ambiguous, and spread over time. Typically, the most important function of an assessment system is to identify when activities are occurring that are unusual or unanticipated. Model based temporal reasoning addresses both of these requirements. The differences among temporal reasoning schemes lie in the methods used to avoid computational intractability. If we had n pieces of data and wanted to examine how they were related, the worst case would require examining every subset of these points to see whether that subset satisfied the relations. This would be 2^n subsets, which is intractable. Models compress this: if several data points are all compatible with a model, then that model represents all those data points. Data points are then considered related if they lie within the same model or if they lie in models that are related. Models thus address the intractability problem. They also address the problem of determining unusual activities: if the data do not agree with models that are indicated by earlier data, then something out of the norm is taking place. The models can summarize what we know up to that time, so when they are not predicting correctly, either something unusual is happening or we need to revise our models. The model based reasoner developed at Advanced Decision Systems is thus both intuitive and powerful. It is currently being used on one operational system and several prototype systems. It has enough power to be used in domains spanning the spectrum from manufacturing engineering and project management to low-intensity conflict and strategic assessment.
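To make the worst-case count concrete, the sketch below enumerates every subset of n data points and contrasts that with a single pass that groups points under a model label. It is only an illustration: `count_subsets`, `group_by_model`, and the threshold-based `model_of` labeling are hypothetical names and choices, not part of the reasoner described in the abstract.

```python
from collections import defaultdict
from itertools import combinations


def count_subsets(n):
    """Count all subsets of n data points by brute-force enumeration."""
    return sum(1 for k in range(n + 1) for _ in combinations(range(n), k))


def group_by_model(points, model_of):
    """Group points by the model each is compatible with (one linear pass)."""
    groups = defaultdict(list)
    for p in points:
        groups[model_of(p)].append(p)
    return dict(groups)


# Hypothetical (time, value) observations and a toy model assignment.
observations = [(0, 3), (1, 4), (2, 12), (3, 5)]
groups = group_by_model(observations, lambda p: "routine" if p[1] < 10 else "unusual")
```

With ten points the enumeration already visits 1024 subsets, while the grouping touches each point once; points in the same group are treated as related without examining subsets.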
This paper presents a compiler and loader for C-STROBE Knowledge Bases (KBs). C-STROBE is an object-oriented programming language which supports tangled generalization hierarchies, inheritance of properties, procedural attachment and event-driven procedure invocation. The compiler writes out C-STROBE KBs in relocatable binary object modules. Symbol tables created by the compiler allow the loading of KBs quickly and without knowledge of the structure of C-STROBE KBs.
This paper is a study of methods for focusing reasoning systems in the presence of uncertainty in a dynamic environment. The aim is to capture human-level behavior in selection of the interesting from myriad confounding and conflicting occurrences. Up until now, the most frequently used methods for focusing have involved the use of numerical measures in the form of utility functions or pattern matchers. The advantages of the numerical methods are their speed and transparency. The disadvantage (from the cognitive science viewpoint) is that the numerical forms are not necessarily representative of human-like thinking. These evaluator functions are often formulated in an ad hoc manner, then tested and modified on experimental cases until they meet desired levels of performance. One way of tuning the form of the functions is by feedback, stochastic learning designs. The scoring functions capture behavior over a prescribed domain but cannot adaptively vary their scope. In particular the functions are not allowed to change form to accommodate discovery; an appealing feature for a truly reactive planner. Human judgment adjusts dynamically, whereas utility function form is rigid. We propose methods to overcome the rigidity and allow the evaluator functions to handle interesting situations as they occur. Application domains considered are target prioritization for autonomous reconnaissance vehicles, and local planning of trajectories for combat aircraft.
The management of uncertainty is an important issue in the design of expert systems for troubleshooting complicated systems because much of the information in the knowledge bases is uncertain and incomplete. Based on Dempster-Shafer theory, this paper describes a coherent algorithm which calculates basic probability assignments (BPAs) from the a priori probabilities and the statistics of a target system. It also discusses ways to propagate the BPAs in the hierarchical search space. It is shown that the computational complexity of propagating belief functions can be reduced with the restriction of dichotomy.
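For readers unfamiliar with how BPAs are combined, the following is a minimal sketch of the standard Dempster rule of combination. It is not the paper's algorithm for deriving BPAs from prior probabilities or for hierarchical propagation; the `combine` function, the two-component frame, and the mass values are assumptions chosen for illustration.

```python
from itertools import product


def combine(m1, m2):
    """Dempster's rule for two BPAs.

    Each BPA maps a frozenset (focal element) to its mass, with masses
    summing to 1. Mass falling on empty intersections is treated as
    conflict and the remaining mass is renormalized.
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; rule is undefined")
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


# Hypothetical troubleshooting frame with two candidate faulty components.
PUMP, VALVE = frozenset({"pump"}), frozenset({"valve"})
FRAME = PUMP | VALVE

m1 = {PUMP: 0.6, FRAME: 0.4}   # evidence pointing at the pump
m2 = {VALVE: 0.5, FRAME: 0.5}  # evidence pointing at the valve
result = combine(m1, m2)
```

The general rule visits every pair of focal elements, which is the cost that grows quickly on large frames; restricting each BPA to a hypothesis, its complement, and the whole frame keeps the number of focal elements small.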
The use of expert systems in nuclear power plants to provide advice to managers, supervisors and/or operators is a concept that is rapidly gaining acceptance. Generally, expert systems rely on the expertise of human experts or knowledge that has been codified in publications, books, or regulations to provide advice under a wide variety of conditions. In this work, a probabilistic risk assessment (PRA) of a nuclear power plant performed previously is used to assess the safety status of nuclear power plants and to make recommendations to the plant personnel. Nuclear power plants have many redundant systems and can continue to operate when one or more of these systems is disabled or removed from service for maintenance or testing. PRAs provide a means of evaluating the risk to the public associated with the operation of nuclear power plants with components or systems out of service. While the choice of the "source term" and methodology in a PRA may influence the absolute probability and consequences of a core melt, the ratio of two PRA calculations for two configurations of the same plant, carried out on a consistent basis, can readily identify the increase in risk associated with going from one configuration to the other. PRISIM, a personal computer program to calculate the ratio of core melt probabilities described above (based on previously performed PRAs), has been developed under the sponsorship of the U.S. Nuclear Regulatory Commission (NRC). When one or several components are removed from service, PRISIM calculates the ratio of the core melt probabilities. The inference engine of the expert system then uses this ratio and a constant risk criterion, along with information from its knowledge base (which includes information from the PRA), to advise plant personnel as to what action, if any, should be taken.
This paper solves the problem of probabilistic entailment in stochastic decision theory according to the principle of maximum entropy. The recursive computation of entropy within the tree structure reduces this problem to an assignment of uniform probability distributions on the missing information.
This paper reports the result of a model driven touch sensor recognition experiment. The touch sensor employed is a large field tactile array. Object features appropriate for touch sensor recognition are extracted from a geometric model of an object, the dual spherical image. Both geometric and dynamic features are used to identify objects and their position and orientation on the touch sensor. Experiments show that geometric features extracted from the model are effective but that dynamic features must be determined empirically. Correct object identification rates even for very similar objects exceed ninety percent, a success rate much higher than we would have expected from only two-dimensional contact patterns. Position and orientation of objects once identified are very reliable. We conclude that large field tactile sensors could prove very useful in the automatic palletizing problem when object models (from a CAD system, for example) can be utilized.
In this paper, we present an approach to color image understanding that can be used to segment and analyze surfaces with color variations due to highlights and shading. We begin with a theory that relates the reflected light from dielectric materials, such as plastic, to fundamental physical reflection processes, and describes the color of the reflected light as a linear combination of the color of the light due to surface reflection (highlights) and body reflection (object color). This theory is used in an algorithm that separates a color image into two parts: an image of just the highlights, and the original image with the highlights removed. In the past, we have applied this method to hand-segmented images. The current paper shows how to perform automatic segmentation by applying this theory in stages to identify the object and highlight colors. The result is a combination of segmentation and reflection analysis that is better than traditional heuristic segmentation methods (such as histogram thresholding), and provides important physical information about the surface geometry and material properties at the same time. We also show the importance of modeling the camera properties for this kind of quantitative analysis of color. This line of research can lead to physics-based image segmentation methods that are both more reliable and more useful than traditional segmentation methods.
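The linear combination described above can be written compactly as follows. The symbols are assumptions introduced here for illustration: C is the measured color vector at a pixel, C_s and C_b are the colors of the surface-reflection (highlight) and body-reflection (object color) components, and m_s and m_b are non-negative scalar weights that vary from pixel to pixel.

```latex
\mathbf{C} = m_s\,\mathbf{C}_s + m_b\,\mathbf{C}_b, \qquad m_s \ge 0,\; m_b \ge 0
```

Under this reading, separating the image into a highlight image and a highlight-free image amounts to estimating the two terms at each pixel.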
Inferring the 3D structures of nonrigidly moving objects from natural images is a difficult yet basic problem in computational vision. Our approach makes use of dynamic, elastically deformable models. These physically-based 3D models offer the geometric flexibility to satisfy a diversity of visual constraints. Constraints are encoded as forces which act on the models to mold their shapes, place them in proper depth, and carry them through motions so as to best account for the available image data. We demonstrate the recovery of shape, depth, and nonrigid motion from object profiles (occluding contours) in natural images. This article reviews our approach; mathematical details are found in the primary sources.
The recent development of an algebra for the manipulation of decision trees has allowed the implementation of an algorithm for generating all the irreducible forms of a decision tree. An irreducible is a syntactic form for a decision tree which is guaranteed to be optimal for some cost criterion (for example, an expected testing cost). However, each irreducible is optimal only under certain stability conditions. Thus, in the absence of specific costing information, the more demanding the stability conditions for an irreducible, the less generally useful the tree. This paper illustrates, by means of an example, a syntactic approach to decision tree inference in which all the irreducible decision trees which are consistent with respect to a given set of training examples are generated, and a test for stability is used to narrow down the selection of a reasonable inference model.
A learning classifier system (LCS) is assigned the task of learning a difficult Boolean function, the 6-multiplexer. An LCS is a type of production system that learns to generalize and instantiate rules called classifiers in response to intermittent and noisy reinforcement (payoff). This paper presents results similar to Wilson's recent experiments, using a champion reinforcement algorithm rather than Wilson's collective scheme. Performance differences are discussed.
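For reference, the target function itself is small enough to state directly. The sketch below assumes one common convention, which the abstract does not specify: the first two bits form the address (most significant first) and the remaining four bits are the data lines.

```python
def multiplexer6(bits):
    """Return the 6-multiplexer output for a sequence of six 0/1 values.

    bits[0:2] select one of the four data lines in bits[2:6].
    """
    if len(bits) != 6 or any(b not in (0, 1) for b in bits):
        raise ValueError("expected six binary digits")
    address = 2 * bits[0] + bits[1]
    return bits[2 + address]


# Address 01 selects the second data line, which holds 0 here.
example = multiplexer6([0, 1, 1, 0, 1, 1])
```

The function has 64 possible inputs, which is why it serves as a compact but non-trivial generalization test for a rule learner.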
Autonomous machines such as a planetary explorer must learn from the real universe, which can be seen as a single event of infinite complexity or as an infinite set of trivial events. Current inductive learning systems generalize from a finite set of events provided by a human teacher or a software environment. Even a constrained universe is difficult to represent as a finite set of events comprising various types of useful information. Cues used by humans to perform this representation task include temporal or spatial relationships among environmental data. We discuss a method for identifying meaningful complex events in the universe of an autonomous learning robot. When no temporal concept is known to link two descriptors, temporal proximity of events is used to pair simple descriptor events into complex events for inductive learning. When a temporal concept is known to link two descriptors, that concept is used to guide the pairing of descriptor events. We discuss the efficiency improvements which arise from using temporal concepts in this manner. The method is embedded in GPAL 1.3, a general purpose autonomous learner that uses a knowledge-based learning and control strategy which models the scientific method.
This paper presents a new method for using passive binocular vision to create a map of the top-view of a robot's environment. Numerous autonomous robot navigation systems have been proposed; most attempt to match objects in separate images by using edges or by locating significant groups of edge pixels. The method described in this paper uses two cameras (aligned in parallel) to generate stereo images from which low level image features are extracted using a new non-linear production rule system, rather than a conventional spatial filter design. The image features are registered by matching and aligning correspondingly shaped regions of constant brightness levels in both images; the binocular offsets can then be computed. The use of heuristics and production rules to relieve the computational burden associated with low level image processing is unique, both in processing the images and in locating matching regions in the images. The feature extraction algorithm, the intermediate symbolic representations, and the application of these results to hierarchical structures common to context queuing systems are presented.
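Once binocular offsets are available, a top-view position can be recovered with the standard pinhole relation for parallel cameras. The sketch below is a generic illustration, not the paper's rule-based method; the function name and the focal length, baseline, and principal-point parameters are assumptions.

```python
def top_view_point(x_left_px, disparity_px, focal_px, baseline_m, cx_px):
    """Map a matched feature to (lateral, forward) coordinates in metres.

    Assumes two identical, parallel pinhole cameras. disparity_px is the
    horizontal offset of the feature between the left and right images.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    forward = focal_px * baseline_m / disparity_px   # depth along the optical axis
    lateral = (x_left_px - cx_px) * forward / focal_px
    return lateral, forward


# Hypothetical numbers: 500 px focal length, 0.1 m baseline, image centre at x = 320.
lateral, forward = top_view_point(x_left_px=420, disparity_px=10,
                                  focal_px=500, baseline_m=0.1, cx_px=320)
```

Because depth varies inversely with disparity, distant regions produce small offsets and are the most sensitive to matching errors.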
The technique of geographic encoding for the representation of a 3-D object is introduced in this paper. The geographic codes consist of the number of visible vertices and surfaces, the vertex-surface description, the vertex-surface vector, the vertex-type description, and the vertex-link code. A novel method for automatic surface extraction from the pictorial drawing of a 3-D object is presented. This method converts a pictorial drawing into surface trees. For each candidate visible surface, a surface tree is generated from which a visible surface is automatically identified via tree tracing. The extracted surfaces are orderly arranged to generate the vertex-surface description (VSD) for the 3-D object. The other geographic codes are automatically generated from the vertex-surface description.
A rule based system for 3D shape recovery and orientation estimation from a single perspective view is described. The primary input to our system is a set of line segments extracted from images by a complex segmentation process. In practice, humans are able to interpret 3D shape and orientation from 2D images with very little a priori information. The heuristics behind shape constancy suggest that certain regularity assumptions play an important role. Fifteen rules have been developed for the rule base which can be extended to include additional rules. The current rules deal with parallel lines, perpendicular lines, and right corners in the object space that lead to the given image instance recorded by the camera. Forward chaining methodology is adopted. The implementation is written in the rule base language OPS5 in conjunction with Pascal on a VAX/VMS system. Two examples are presented, and the results are consistent with human perception.
The paper describes a 3-dimensional Shape Descriptor Function (SDF) which is invariant under the action of SO(3). We concentrate here on the analytical derivation of the SDF, and show how certain requirements constrain both the nature of the SDF and the class of manifolds over which it can be defined.
Gray scale morphology is applied to the problem of counting and locating rods and cones in whole mounted human retina viewed with Nomarski differential interference contrast (NDIC) optics. Straightforward techniques perform well on high quality images, but fail badly on similar images. Two types of adaptation are introduced, both of which markedly broaden the class of images which can be handled.
The fusion of multisensor data from LADAR, MMW, FLIR, and SAR sensors by a knowledge-based system is used in automatic target recognition and remote monitoring applications. Each sensor offers unique scene attribute and contextual information. Thus, problems of image registration as well as the effects of terrain and environmental variability upon each sensor complicate the analysis process. In this paper, the use of multipolarization and range data from MMW images to enhance the sensor fusion phase by exploiting the multiband feature of the sensor is explored. The multiband data provided by MMW sensors is unique and provides a sufficient basis for analysis as an independent problem. As specialized hardware implementations of knowledge systems become available, the ability to exploit polarization and range data in a subsystem that is independent of the fusion process presents a unique approach that increases the overall performance of the fusion process.
An intelligent real-time multiple moving object tracker is introduced. It consists of five main functional units: a multi-mode object detector, a detection confidence measure, a multi-state intelligent tracking controller, a clutter filter, and an intelligent predictor. The object detector exploits correlation, motion, and contrast detection modes cooperatively, and operates at multiple image resolutions. The detection confidence measure is used to combine dynamically the measured and predicted attributes of current tracked objects. To control and maintain good tracking, each tracking step is assigned one of eight possible tracking states, with associated strategies for intelligent detection, matching and prediction. The clutter filter uses a stored database of interesting objects to reject clutter selectively during stable tracking, based on both feature and motion history. The intelligent predictor predicts the tracking state, tracked object attributes and background properties expected in the next tracking step. This tracking scheme incorporates knowledge and control not exploited by conventional Kalman filter based trackers. We demonstrate its performance in detecting and accurately tracking dynamic targets from FLIR image sequences.
Target cueing is commonly performed using only spatial properties of images. However, image degradations and complexities, such as low resolution, low contrast, cluttered background, and noise, can result in insufficient spatial information to achieve meaningful cueing. Fortunately, in many situations, primarily when targets are moving with respect to background, temporal cueing can be performed instead of (or in addition to) spatial cueing. In general, the imaging sensor can be moving, which induces apparent scene background motion, greatly complicating the detection of target motion. We will discuss a technique that detects independent motion of objects, even in the presence of induced background motion, and can subsequently cue targets successfully in the absence of adequate spatial information. We start by calculating the optical flow field that describes image motion between successive frames in an image sequence. Issues concerning the rapid calculation of optical flow and implementation on high speed pipe-lined architecture will be discussed. The optical flow field is used to register the backgrounds in the two scenes but does not affect independent motion of small regions within the images. The registered images are compared and moving objects are distinguished by spatial differences between the registered images. The final stage of the cueing process applies both intensity and size discrimination to the result of the registered image comparison. We have successfully demonstrated our independent motion cueing algorithm on real, cluttered images and will present some of our results.
The vision model described in this paper utilizes a competitive winner-take-all network with several biologically inspired levels similar to the Neocognitron and the work of Frohn, Geiger and Singer, with improvements. Their models are simple object detectors and they incorporate feature detector levels. In the current system many of these simple object detectors are used and each one presents spatial information that is used by the top complex structured object detector level. This top level is inspired by Patrick Winston's and others' A.I. work on learning and matching structural descriptions of complex objects. The synaptic connections at this level contain relational information such as right-of, left-of, in-front-of, must-be and must-not. The winner-take-all characteristic of each level allows even partially obscured objects to be recognized. Forbidden properties of an object can be implemented by inhibitory connections while essential properties are implemented by excitatory connections. In this way a small number of missing essential properties does not automatically rule out the recognition of an object while present forbidden properties quickly rule it out.
Models of objects stored in memory have been shown to be useful for guiding the processing of computer vision systems. A major consideration in such systems, however, is how stored models are initially accessed and indexed by the system. As the number of stored models increases, the time required to search memory for the correct model becomes high. Parallel, distributed, connectionist neural networks have been shown to have appealing content addressable memory properties. This paper discusses an architecture for efficient storage and reference of model memories stored as stable patterns of activity in a parallel, distributed, connectionist neural network. The emergent properties of content addressability and resistance to noise are exploited to perform indexing of the appropriate object centered model from image centered primitives. The system consists of three network modules, each of which represents information relative to a different frame of reference. The model memory network is a large state space vector where fields in the vector correspond to ordered component objects and relative, object based spatial relationships between the component objects. The component assertion network represents evidence about the existence of object primitives in the input image. It establishes local frames of reference for object primitives relative to the image based frame of reference. The spatial relationship constraint network is an intermediate representation which enables the association between the object based and the image based frames of reference. This intermediate level represents information about possible object orderings and establishes relative spatial relationships from the image based information in the component assertion network below. It is also constrained by the lawful object orderings in the model memory network above. The system design is consistent with current psychological theories of recognition by component.
It also seems to support Marr's notions of hierarchical indexing (i.e., the specificity, adjunct, and parent indices). It supports the notion that multiple canonical views of an object may have to be stored in memory to enable its efficient identification. The use of variable fields in the state space vectors appears to keep the number of required nodes in the network down to a tractable number while imposing a semantic value on different areas of the state space. This semantic imposition supports an interface between the analogical aspects of neural networks and the propositional paradigms of symbolic processing.
Rule-based systems, which have proven to be extremely useful for several Artificial Intelligence and Expert Systems applications, currently face severe limitations due to the slow speed of their execution. To achieve the desired speed-up, this paper addresses the problem of parallelization of production systems and explores the various architectural and algorithmic possibilities. The inherent sources of parallelism in the production system structure are analyzed and the trade-offs, limitations and feasibility of exploitation of these sources of parallelism are presented. Based on this analysis, we propose a dedicated, coarse-grained, n-ary tree multiprocessor architecture for the parallel implementation of rule-based systems and then present algorithms for partitioning of rules in this architecture.
The use of neural-like networks to solve optimization problems such as the Traveling Salesman Problem has been proposed by Hopfield and Tank. The networks are based on a standard "neuron" which can be implemented by means of voltage amplifiers. The gain of network conductances and time constants as well as the value of constants in the problem's energy function have a decisive influence on the solution provided by the network, yet Hopfield and Tank do not make clear how to determine the value of these constants for a particular problem. In this paper a method for selection of constants is proposed which gives good results for the TSP. Instead of readjusting the gains and adding terms to the energy function until good results are obtained, the gains are chosen a priori and the energy function's original form is not altered. Simulation results are presented and discussed.
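To make the role of the constants concrete, the sketch below evaluates the TSP energy function in the form commonly attributed to Hopfield and Tank, with penalty weights A, B, C and distance weight D. The abstract does not restate the function, so this form, the function name `tsp_energy`, and the example values are assumptions. The code only evaluates the energy for a given activation matrix; it does not simulate the network or reproduce the paper's method for selecting constants.

```python
def tsp_energy(V, d, A, B, C, D):
    """Energy of activation matrix V (V[x][i] = city x at tour position i).

    d is a symmetric distance matrix; A, B, C, D are the energy constants.
    """
    n = len(V)
    # Penalty: a city appearing at more than one tour position.
    row = sum(V[x][i] * V[x][j]
              for x in range(n) for i in range(n) for j in range(n) if j != i)
    # Penalty: more than one city at the same tour position.
    col = sum(V[x][i] * V[y][i]
              for i in range(n) for x in range(n) for y in range(n) if y != x)
    # Penalty: total activation different from n.
    total = (sum(sum(r) for r in V) - n) ** 2
    # Tour length term (each tour edge is counted twice, hence the factor 1/2).
    dist = sum(d[x][y] * V[x][i] * (V[y][(i + 1) % n] + V[y][(i - 1) % n])
               for x in range(n) for y in range(n) if y != x for i in range(n))
    return 0.5 * (A * row + B * col + C * total + D * dist)
```

For a valid tour the three penalty terms vanish and the energy reduces to D times the tour length, which is why the relative sizes of A, B, C and D decide whether the network settles on valid, short tours.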
Computer vision deals with the extraction of information about a scene by analysis of images of that scene. The field had its origins over 30 years ago. Traditionally, it dealt with scenes that were essentially two-dimensional: documents, microscope images, high-altitude images of the earth's surface. The classical approach to analyzing such images involves segmentation of the image into parts corresponding to meaningful parts of the scene; measurement of properties of and relations among the parts; and object recognition based on comparison of the configuration of parts, properties and relations (essentially a labeled graph) with standard configurations representing the objects of interest.
This paper addresses the issue of coupling in a knowledge-based computer vision system for 3-D polyhedral object recognition. A typical vision system is fairly complex: both domain-independent and domain-dependent knowledge are used during the course of problem solving, and the two sources of knowledge interact in a very complicated manner. This paper incorporates a complete study of the 3-D object recognition problem and highlights certain important aspects such as (1) coupling of symbolic and numerical knowledge, (2) evidential reasoning, (3) distributed belief evaluation and propagation, and (4) learning and adaptation.
Modern manufacturing environments are using an increasing number of automated guided vehicles. While these robots can improve productivity, they are restricted to following preset paths marked by paint or electrical wires. A need exists for a robot cart which can determine its own path given knowledge of its surroundings and its goals. This type of robot could determine the shortest path to accomplish its goals, and it could use heuristic reasoning to determine how to get around obstacles which would normally block a painted path. Image processing techniques combined with knowledge-based reasoning can provide a solution to this problem. This paper discusses a system which acquires video images of a hallway, segments those images using image processing algorithms, and determines the classification of objects and the appropriate path for a cart using heuristic reasoning. The images are segmented using basic knowledge of the surroundings. Heuristics in the form of production rules are used to determine the classification for each segment and the corresponding confidence. The rules were developed from an analysis of the basic features common to object classes in the image database.
This paper reports on a funded research project to develop and test a microcomputer based hardware and software system that is capable of capturing and classifying digitized images of features found on topographic and production maps used by the minerals industry for exploration and planning. Images are captured using a 240 pixel-per-inch image scanner, and recognized map features are automatically incorporated into a database. Digitized images are processed using appropriate combinations of standard techniques for noise removal, boundary identification, and feature extraction. Map features are identified and feature tracing is guided by a frame-oriented knowledge-base processing system containing rules that interact with the image software. In-house development of both the image and knowledge-base processing software has greatly facilitated the necessary information flow between these two systems.
In this paper a problem-reduction approach is applied to handwritten numeral recognition and a recognition system is built. A problem-reduction representation (PRR) is used as the structural model for the character into which the semantics is injected. A powerful feature point extraction technique is designed to extract turnabouts on the strokes of a character using windows of variable size. Using these feature points, a character is segmented into a series of line segments, each with one head and one tail. A nondirection analysis algorithm in the problem-reduction approach is used to analyze characters. A heuristic ordered search method according to attributes is developed. A high recognition rate is obtained.
The methods of path planning for an Automatic Guided Vehicle (AGV) are discussed with regard to algorithm speed and method of world representation. Existing methods of path solution are often too slow for real time execution and must be done off line by the Factory Management System (FMS). By using a linked list world model and a modification of Dijkstra's algorithm to allow suboptimal results, execution may be sped up by a factor of 10 to 50, with the suboptimal error relative to the optimal path cost being bounded.
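For reference, the baseline that such a planner would modify can be sketched as the standard Dijkstra algorithm over an adjacency-list world model. This is only the unmodified, optimal algorithm: the abstract does not describe how the suboptimal variant works, so no attempt is made to reproduce it. The graph layout and the function name are assumptions for illustration.

```python
import heapq

def dijkstra(graph, start, goal):
    """Return (cost, path) of a cheapest route, or (inf, []) if none exists.

    graph maps each node to a list of (neighbour, edge_cost) pairs.
    """
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == goal:
            break
        if cost > dist.get(node, float("inf")):
            continue  # stale queue entry
        for nbr, w in graph.get(node, []):
            new_cost = cost + w
            if new_cost < dist.get(nbr, float("inf")):
                dist[nbr] = new_cost
                prev[nbr] = node
                heapq.heappush(heap, (new_cost, nbr))
    if goal not in dist:
        return float("inf"), []
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return dist[goal], path[::-1]
```

Most of the running time goes into queue operations and edge relaxations, which is where a variant that accepts bounded suboptimality would have room to save work.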
A two-level hierarchical route planner has been developed. The data input to the system is a cross-country mobility map. For a given vehicle type, this map specifies regions which are "GO" or "NO-GO." A line-thinning algorithm is used to generate a skeleton of the "GO" areas. This skeleton is then converted into a graph-theoretic structure. A first-level route planner using elevation-grid data is used to compute the traversal time of each arc of the graph. These traversal times become the weights used by the second level route planner. This route planner is an A* algorithm that is used to search for a specified number of non-competing routes, i.e., routes that have no arc-segments in common. Thus, the first level route planner does detailed planning over a small area but is subject to combinatorial explosion when a search over a wider area is required. The second level graph-search algorithm provides the capability to efficiently plan a route over a larger area but without detail about the precise path followed. This system was implemented in Common Lisp on a Lisp machine. The software has also been integrated into a workstation that was developed to provide support to Army robotic vehicle research. The workstation provides support for comparing the capabilities of alternative route finding algorithms.
Historically, dynamic graph searching techniques have been used to construct minimal cost paths through a defended environment. In small search spaces, heuristic techniques are shown to produce near optimal solutions while effecting a forty percent reduction in the effort required to generate the path. A hybrid strategy combining heuristic search and hierarchic decomposition is developed to inhibit the combinatorial problems inherent in constructing paths in a large search space.
A robot which functions in a variable and dynamic environment must be able to intercept or avoid moving objects whose locations and velocities may not be known. This paper describes a visual guidance technique for such machines which utilizes image sequences from two cameras. Comparing the relative displacements over time of an object on the left and right image planes leads to the recovery of two important quantities: 1) the location where the object will collide with the robot, and 2) the absolute object size. A third quantity, time to collision, can be recovered from the displacement of the object on a single image plane. These three items are sufficient for the robot to approach or avoid the object. Implementation of this technique on an IBM 7565 industrial robot is described. Experimentation indicates that this method is more accurate than traditional stereo analysis.
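One common way to estimate time to collision from a single image plane is to divide an image measurement by its rate of change, tau = x / (dx/dt), where x might be the object's image width or its offset from the focus of expansion. The sketch below uses that formulation with two successive measurements; it is an assumption for illustration and may differ from the paper's own derivation. The function name and parameters are hypothetical.

```python
def time_to_collision(x_prev, x_curr, dt):
    """Estimate time to collision from two successive image measurements.

    x_prev, x_curr: image extent (pixels) in consecutive frames.
    dt: time between the frames (seconds).
    Returns inf when the measurement is not growing (object not approaching).
    """
    rate = (x_curr - x_prev) / dt  # finite-difference estimate of dx/dt
    if rate <= 0:
        return float("inf")
    return x_curr / rate
```

The estimate needs neither the object's true size nor its distance, which is consistent with recovering it from one camera alone.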
A new generation of robotic systems will operate in complex, unstructured environments of industrial plants utilizing sophisticated sensory mechanisms. In this paper we consider the development of autonomous robotic systems for various inspection and manipulation tasks associated with advanced nuclear power plants. Our approach in the development of the robotic system is to utilize an array of sensors capable of sensing the robot's environment in several sensory modalities. One of the most important sensor modalities utilized is vision. We describe the development of a model-based vision system for performing a number of inspection and manipulation tasks. The system is designed and tested using a laboratory based test panel. A number of analog and digital meters and a variety of switches, valves and controls are mounted on the panel. The paper presents details of system design and development and a series of experiments performed to evaluate capabilities of the vision system.
In a highly distributed CIM (Computer-Integrated Manufacturing) environment, adaptive control using neural network processors is proposed. First, neural network concepts are employed to represent and to model knowledge in robotics and automation application domains. Then, the model captured by a neural network emulator serves as the distributed and adaptive controller of a CIM environment.
This paper describes a current Bureau of Mines research project that is applying autonomous vehicle concepts to coal mining machinery. The system consists of a specially instrumented 50-ton mining machine with an onboard integrated computer control system and a Symbolics 3600 computer. Preliminary tests demonstrated that the Symbolics could communicate with and control the mining machine through the use of a simple yet powerful communication protocol. The status of the system is displayed graphically on the Symbolics. The mining machine, the system architecture, the integrated computer control system, the communication protocol, and initial experimental test results are described. Future experiments and system enhancements are also delineated.
Finding a safe, smooth, and efficient path to move an object through obstacles is necessary for object manipulation in robotics and automation. This paper presents an approach to two-dimensional as well as three-dimensional findpath problems that divides the problem into two steps. First, rough paths are found based only on topological information. This is accomplished by assigning to each obstacle an artificial potential similar to the electrostatic potential to prevent the moving object from colliding with the obstacles, and then locating minimum potential valleys. Second, the paths defined by the minimum potential valleys are modified to obtain an optimal collision-free path and orientations of the moving object along the path. Three algorithms are given to accomplish this second step. The first algorithm simply minimizes a weighted sum of the path length and the total potential experienced by the moving object along the path. This algorithm solves only "easy" problems where the free space between the obstacles is wide. The other two algorithms are developed to handle the problems in which intelligent maneuvering of the moving object among tightly packed obstacles is necessary. These three algorithms based on potential fields are nearly complete in scope, and solve a large variety of problems.
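The objective used by the first algorithm, a weighted sum of path length and total potential along the path, can be sketched as follows. The abstract does not give the potential function or the weights, so an inverse-distance potential around point obstacles and a sum over path waypoints are assumed here purely for illustration; the names `potential` and `path_cost` are hypothetical.

```python
import math

def potential(p, obstacles):
    """Assumed electrostatic-like potential: sum of 1/distance to each obstacle."""
    return sum(1.0 / max(math.dist(p, o), 1e-9) for o in obstacles)

def path_cost(path, obstacles, w_length=1.0, w_potential=1.0):
    """Weighted sum of polyline length and potential sampled at the waypoints."""
    length = sum(math.dist(a, b) for a, b in zip(path, path[1:]))
    total_potential = sum(potential(p, obstacles) for p in path)
    return w_length * length + w_potential * total_potential
```

A minimizer over the waypoints would trade a longer route against staying away from obstacles, with the two weights setting that balance.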
Currently, the main problem with automated target recognizers (ATRs) is not a low hit rate but rather a high false alarm rate. The use of multiple sensors has frequently been proposed as a means of reducing this false alarm rate. Less frequently proposed has been the use of a soldier "in the loop". Adding a human to the system requires a careful consideration of soldier-ATR interface issues, a consideration which heretofore has been nearly absent. We propose an integration concept in which the activities of the man and the ATR are tightly woven, and they are able to facilitate each other's tasks. We also introduce the term "recognition enhancement" and propose it as a name for the discipline which studies and attempts to improve aided target recognition performance.
In state-of-the-art research, the difficult task of image understanding by computer can be organized as a hierarchy of algorithms applied to image data at low, mid, and high levels of processing. This paper discusses aspects of design and implementation of this kind of multilevel system. A project in knowledge-based interpretation of satellite IR data of the Atlantic Ocean illustrates the ideas.