The goal of ARPA's Unmanned Ground Vehicle project is to demonstrate the use of small teams of cooperating autonomous robots (2 - 4 vehicles) to carry out military tasks in an outdoor environment. The role of the University of Michigan within the project focuses on aspects of mission planning, assimilation of information provided by multiple agents, and the interaction between planning and perception. The two aspects of this work related to sensor fusion are planning observation points to maximally reduce hypothesis uncertainty, and information sharing in multivehicle scenarios to reduce the amount of perception required. Observation point planning combines the system's current knowledge about an object with the uncertainty model used to characterize observations for data fusion in order to select optimal points for additional observations. Information sharing selects those detected features in the environment which are predicted to be most useful to other cooperating vehicles in the future, adding them to the multiagent system's model of the environment while ignoring less useful features.
Sensor data enable a robotic system to react to events occurring in its environment. Much work has been done on the development of sensors and of algorithms that extract information from an environment; comparatively little work, however, has addressed multisensor communication. This paper presents a shared-memory-based communication protocol developed for the autonomous robot system KAMRO, which consists of two PUMA 260 manipulators and an omnidirectionally driven mobile platform. The proposed approach is based on logical sensors, which can be used to dynamically build hierarchical sensor units. The protocol uses a distributed blackboard structure for the transmission of sensor data and commands. To support asynchronous coupling of robots and sensors, it not only transfers single sensor values but also offers functions to estimate future values.
Most robotic systems used today rely heavily on prior knowledge of the location of specific objects in the workspace. Uncertainty occurs even in these `fixed' manufacturing workspaces, where the precision of the knowledge of object locations degrades over time; the entire system must then be reprogrammed when adjustments are made. A robot that is able to adapt to changes automatically must be able to sense its surroundings and identify the objects it encounters. The objective of this work was to design control algorithms that enable a multisensor robotic system to perform specified tasks in a real-world environment. The robotic system presented in this paper operates with little or no prior knowledge; the robot is therefore given the task of collecting information about its surroundings. Data may be collected for the purpose of object recognition, environment mapping or object manipulation. The main sensing capabilities chosen for this system are machine vision and tactile force sensing. Each sensing technique has its relative advantages and disadvantages, so it is best to combine and coordinate sensing techniques depending on the work environment and the task requirements. Experiments are performed to show that the algorithms allow the system to meet the goals of safety, accuracy and economical use of computational resources.
Westinghouse has developed and demonstrated a system for the rapid prototyping of Sensor Fusion Tracking (SFT) algorithms. The system provides an object-oriented envelope with three sets of generic software objects to aid in the development and evaluation of SFT algorithms. The first is a generic tracker model that encapsulates the idea of a tracker as a series of SFT algorithms together with the data manipulated by those algorithms, and is capable of simultaneously supporting multiple, independent trackers. The second is a set of flexible, easily extensible sensor and target models which allows many types of sensors and targets to be used. Live, recorded and simulated sensors, and combinations thereof, can be utilized as sources for the trackers. The sensor models also provide an easily extensible interface to the generic tracker model so that all sensors provide input to the SFT algorithms in the same fashion. The third is a highly versatile display and user interface that allows easy access to many of the performance measures for sensors and trackers, for easy evaluation and debugging of the SFT algorithms. The system is an object-oriented design programmed in C++. This system, together with several of the SFT algorithms developed for it, has been used with live sensors as a real-time tracking system. This paper outlines the salient features of the sensor fusion architecture and programming environment.
This paper outlines the development of a high-speed, distributed, multiprocessor data acquisition system for remote, hostile environments. It presents details of the concurrent architecture and then briefly discusses how formal methods were used to produce a software design that was refined into deadlock-free code. An outline of system testing and results is also given. The system was originally conceived to instrument gas-turbine test facilities, and the design and development are presented in this context. The latter part of this paper shows how the architecture can be simply developed or adapted to expand the range of applications. Development to date has shown the prototype to be reliable, easily expandable and maintainable due to its modular approach. A multiple-module prototype system has been successfully tested, capturing data on multiple channels at rates of up to 625 kHz in environments where component loadings exceeded 3000 g.
Conventional mobile robotic systems are `stand alone'. Program development involves loading programs into the mobile, via an umbilical. Autonomous operation, in this context, means `isolation': the user cannot interact with the program as the robot is moving around. Recent research in `swarm robotics' has exploited wireless networks as a means of providing inter-robot communication, but the population is still isolated from the human user. In this paper we report on research we are conducting into the provision of mobile robots as resources on a local area computer network, thus breaking the isolation barrier. We are making use of new multimedia workstation and wireless networking technology to link the robots to the network in order to provide a new type of resource for the user. We model the robot as a set of resources and propose a client-server architecture as the basis for providing user access to the robots. We describe the types of resources each robot can provide, and we outline the potential for cooperative robotics, human-robot cooperation, teleoperation and autonomous robot behavior within this context.
Network management concerns the ability of a network to sustain continued correct operation. What constitutes correct operation of a network depends on the nature and application of the system. A decentralized data fusion system consists of a network of sensor nodes, each capable of local fusion of sensor data. The decentralized estimation algorithms described here are constrained to operate on fully connected and tree connected topologies. Neither topology is practical: both are vulnerable to the loss of a single link. Decentralized estimation in fully connected systems exploits the fact that direct communication is possible between every pair of nodes, so the loss of any link changes the pattern of communication upon which the estimation algorithm is based. In a tree connected system, the loss of a single link can divide the network in two, stopping network-wide communication altogether. Management must maintain a continued flow of information throughout the network, along the routes which provide the best overall network performance. Reliable routing depends on the existence of a range of routes and the ability to switch routes when failures occur. Decentralized data fusion management necessitates that the routing system operate within the constraints imposed on the sensing network. A management system which reconciles the requirement for redundant routes with the need to maintain valid topologies is described. We thus provide for continued communication and estimation in a changing topological environment.
Planning a search for moving ground targets is difficult for humans and computationally intractable. This paper describes a technique to solve such problems. The main idea is to combine probability-of-detection assessments with computational search heuristics to generate sensor plans which approximately maximize either the probability of detection or a user-specified knowledge function (e.g., determining the target's probable destination, or locating the enemy tanks). In contrast to supercomputer-based moving target search planning, our technique has been implemented using workstation technology. The data structures generated by sensor planning can be used to evaluate sensor reports during plan execution. Our system revises its objective function with each sensor report, allowing the user to assess both the current situation and the expected value of future information. This capability is particularly useful in situations involving a high rate of sensor reporting, helping the user focus his attention on the sensor reports most pertinent to current needs. Our planning approach is implemented in a three-layer architecture: mobility analysis, followed by sensor coverage analysis, and concluding with sensor plan analysis. These layers make it possible to describe the physical, spatial, and temporal characteristics of a scenario in the first two layers, and to customize the final analysis to specific intelligence objectives. The architecture also allows a user to customize operational parameters in each of the three major components of the system. As examples of these performance options, we briefly describe the mobility analysis and discuss issues affecting sensor plan analysis.
In this paper the Nonlinear Information Filter is derived from the Extended Kalman Filter. A nonlinear system is considered. By linearizing the state and observation equations, a linear estimator which keeps track of total state estimates is conceived: the Extended Kalman Filter. The linearized parameters and filter equations are then expressed in information space. This gives a filter that predicts and estimates information about nonlinear state parameters given nonlinear observations and nonlinear system dynamics. The Nonlinear Information Filter derivation is contrasted with that of the Linear Information Filter, and the pitfalls of a naive extension of the latter to the former are thus identified. Furthermore, the Nonlinear Information Filter is decentralized and distributed, to give the Distributed and Decentralized Nonlinear Information Filter. Application to real decentralized data fusion and distributed control is proposed; specifically, real-time distributed/decentralized control of a navigating, modular wheeled robot is considered.
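To make the information-space form concrete, the sketch below shows a single extended information filter update in which a nonlinear observation is folded into the information vector and matrix via a linearized pseudo-observation. The function name and argument layout are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def eif_update(y, Y, z, h, H, R):
    """One extended information filter observation update.

    y, Y : prior information vector/matrix (Y = P^-1, y = Y @ x)
    z    : nonlinear observation
    h    : observation function h(x) evaluated at the prior estimate
    H    : Jacobian of the observation model at the prior estimate
    R    : observation noise covariance
    """
    Rinv = np.linalg.inv(R)
    x = np.linalg.solve(Y, y)      # recover state estimate for linearization
    nu = z - h + H @ x             # pseudo-observation in linear form
    Y_new = Y + H.T @ Rinv @ H     # information matrix update
    y_new = y + H.T @ Rinv @ nu    # information vector update
    return y_new, Y_new
```

One way to sanity-check such an implementation: for a linear observation model (h = H @ x), this update coincides exactly with the Kalman filter update.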
This paper presents SOMBRERO, a new system for recognizing and locating 3D, rigid, non-moving objects from range data. The objects may be polyhedral or curved, partially occluding, touching or lying flush with each other. For data collection, we employ 2D time-of-flight laser scanners mounted on a moving gantry robot. By combining sensor and robot coordinates, we obtain 3D Cartesian coordinates. Boundary representations (Brep's) provide view-independent geometry models that are both efficiently recognizable and derivable automatically from sensor data. SOMBRERO's methods for generating, matching and fusing Brep's are highly synergetic. A split-and-merge segmentation algorithm with dynamic triangulation builds a partial (2½D) Brep from scattered data. The recognition module matches this scene description with a model database and outputs recognized objects, their positions and orientations, and possibly surfaces corresponding to unknown objects. We present preliminary results in scene segmentation and recognition. Partial Brep's corresponding to different range sensors or viewpoints can be merged into a consistent, complete and irredundant 3D object or scene model. This fusion algorithm itself uses the recognition and segmentation methods.
Since they provide direct depth measurements from a scene, range images are important sources of information in many 3D robot vision problems such as navigation and object recognition. Many physical factors, however, contribute noise to the discrete measurements in range images, which leads us to reassess the error distribution in samples taken from real range images. This paper suggests the utility of the Lp norms in yielding reliable estimates of location and regression coefficients. This approach is compared against two commonly used approaches: equally weighted least squares, which minimizes the L2 norm, and the least-absolute-deviations approximation, which minimizes the L1 norm. The problem is one of weighted least squares, where the weights are derived from the chosen parameter p. Of particular interest is this parameter's ability to yield a variety of location estimates spanning from the sample mean to the sample median. These two estimates have wide application in image processing, especially in noise removal tasks. This paper will show the problems associated with these two techniques, and suggest solutions to minimize them. The regression module is used in a region-growing segmentation algorithm to provide a reliable partition of range images.
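The mean-to-median behaviour of the Lp location estimate can be sketched with a simple iteratively reweighted least squares (IRLS) loop. The function name, iteration count, and residual guard below are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def lp_location(x, p=1.5, iters=50, eps=1e-8):
    """IRLS location estimate under the Lp norm.

    p = 2 gives the sample mean exactly; as p -> 1 the estimate
    approaches the sample median.
    """
    x = np.asarray(x, dtype=float)
    m = x.mean()                                 # start from the L2 (mean) solution
    for _ in range(iters):
        w = (np.abs(x - m) + eps) ** (p - 2)     # eps guards zero residuals
        m = (w * x).sum() / w.sum()              # weighted least squares step
    return m
```

Sweeping p from 2 down toward 1 traces out the family of estimates between mean and median that the abstract describes.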
A method for spatial registration of 3D surfaces was developed for range data acquired by a multi-sensor optical surface scanner. Registration of 3D shapes is important for change detection and inspection. The requirement for an automatic and robust registration method stems from the need to compare digitized human anatomy surfaces obtained over extended periods of time; a typical example is comparison of the pre-operative, post-operative, and recovered facial morphology of a face-lift patient. An iterative algorithm that handles six degrees of freedom (three rotations and three translations) and does not require point-to-point correspondence of surfaces was developed. The method assumes that the surfaces are in near registration; otherwise, with surfaces having spherical symmetry, many iterations may be required before a successful outcome is achieved. Coarse registration can be obtained by visual transformations or by use of a principal axis transformation. First, points are identified on the second surface that lie on surface normals of points on the first surface. A divide-and-conquer technique is used to accelerate this process. Any points on the first surface that do not yield points on the second surface are ignored. The two sets of corresponding points (one set on each surface patch) are used in a least squares estimation scheme to minimize their distance. The estimate yields a transformation vector (consisting of rotations and translations) used to resample the second surface patch into a common coordinate system. This iterative process continues until the errors fall below a set threshold or convergence is reached. Error statistics are reported. Testing and validation of the algorithm show that the method is feasible and efficient.
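The least squares step of such an iterative registration, solving for three rotations and three translations from two sets of corresponding points, is commonly computed in closed form via the SVD. A minimal sketch, assuming the standard Kabsch construction rather than the authors' exact scheme:

```python
import numpy as np

def rigid_transform(P, Q):
    """Least-squares rigid transform (R, t) mapping point set P onto
    corresponding point set Q, via the SVD (Kabsch) method.

    P, Q : (n, 3) arrays of corresponding points, one row per pair.
    """
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)                    # cross-covariance of centred points
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    D = np.diag([1.0, 1.0, d])                   # guard against reflections
    R = Vt.T @ D @ U.T
    t = cq - R @ cp
    return R, t
```

In an ICP-style loop, this solve is repeated after each re-pairing of points until the residual error falls below a threshold, as the abstract describes.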
We describe a method for landmark-based robot position correction that uses a 2D image plus sparse range data to describe each landmark. Each landmark is learned by acquiring a CCD camera view from a known robot position. For each view, sparse range samples are taken with a spot laser ranging device, then a dense range estimate is created by interpolation. The choice of points to range is made based on the apparent deviation from the predicted range in each region of the landmark image. Position correction is done by using the approximate position of the robot and the interpolated range information to determine a reprojection of the stored camera view for each landmark. The reprojection is a prediction of the appearance of the landmark and its surroundings, from which an image of the landmark is extracted to use for correlation-based matching with images taken from the robot's position. Tests of the method in an industrial environment demonstrate its robustness to variation in viewpoint.
Our recent wavelet-based fusion research concentrated on the analysis of static images. Although there is a substantial saving in the time needed to create a fused image, there are still problems with the extraction of connected regions from the image, due to the ambiguities associated with the choice of the wavelet channels to use for reconstruction. Multiple images derived from multiple sensors can be used to assist this selection process, as well as to derive object characteristics through dynamic scene analysis. Some of our earlier work with epipolar image (EPI) analysis of fused image sequences indicated that the technique was able to act successfully as a navigation aid in highly cluttered, dynamic environments. In this paper, we present a system that combines wavelet analysis with the EPI technique. Frame-by-frame integration of information from the sensors is done within the wavelet coefficient space, followed by an EPI analysis of features derived from the fused coefficients. We also report the results of a preliminary experiment with a laboratory sequence.
Laser radar has been used for scene distance measurement since the 1980s. Because it is an active sensor, images can be obtained independent of lighting conditions: day or night, the results are the same, and laser radars are therefore widely used. How to use the range images obtained from laser radar to reconstruct a 3D scene, and then detect obstacles within it, remains an interesting problem. A radar image based on a vertical view implicitly encodes the heights of the objects in the image. This paper introduces a method to obtain an object height image using an intersection-and-locus algorithm, and then applies thinning to the resulting image. To detect obstacles, slope detection using a slope-fitting algorithm is proposed. Experimental results show that a good 3D scene description can be obtained from laser radar range images, and that the algorithm reduces the influence of noise and corrects distortion.
In Computer Assisted Surgery, the registration between pre-operative images, intra-operative images, anatomical models and guiding systems such as robots is a crucial step. This paper presents the methodology and the algorithms that we have developed to address the problem of rigid-body registration in this context. Our technique has been validated for many clinical cases where we had to register 3D anatomical surfaces with various sensory data. These sensory data can have 3D representation (3D images, range images, digitized 3D points, 2.5D ultrasound data) or they can be 2D projections (X-ray images, video images). This paper presents an overview of the results we have obtained.
START is a new automation system invented for nasopharyngeal carcinoma treatment. A laser scanner system capable of non-contact digitization of 3D surfaces is used to digitize the contours of the patient's face and shoulder, together with special landmark reference features of the patient. These features are stored in the computer in 3D digitized format. The digitized facial features, with traced landmark reference features, are used to fabricate a true-sized wood-particle laminate mould on a computer numerical controlled milling system. A Cobex mask is formed on this mould using a vacuum forming technique. With an image analysis and computer-aided design system, the X-ray film with the treatment window marked is traced automatically and converted to match the prescanned 3D information. A computer-controlled 6-axis robot can precisely mark out the required areas on the Cobex cast for treatment. Finally, the patient receives radiotherapy treatment with the Cobex cast as a positioning registration device. The new system will replace the manual procedure, offering better patient comfort, higher efficiency and enhanced accuracy.
In order for a multisensor-based mobile robot to maneuver through its environment, it should be able to navigate based on an interpretation of the world deduced from its multisensory information. This paper proposes a method to construct a 3D road model from images acquired by a color camera and a laser radar. In our method, the points in the range image corresponding to road-edge points in the color images are determined directly from the transformation between the coordinate systems of the color camera and the laser radar. In this way, the parallel-road-edge assumption required by many conventional methods is relaxed. From these corresponding points, not only the 3D road edges but also the road region itself, including all objects within it, can be determined.
Previous work in data fusion has seen the development of a range of architectures for multi-sensor data fusion systems, from fully centralised through distributed to fully decentralised [1, 2]. This paper presents some further experimental results obtained from an implementation of a multi-target tracking system built around a fully decentralised Kalman filter (DKF) [3, 4]. Here we concentrate on the problem of sensor management, and consider how the individual sensors in a decentralised sensing network can use the information in the global picture to make decisions about which targets to observe. Explicit use is made of the information available locally to a sensor to control its pointing and target detection. The sensor modality (e.g. range only, bearing only, etc.) strongly affects the way in which the sensor is managed. The tracking system integrates an essentially range-only sensor with a bearing-only sensor. The sensors run asynchronously from each other, and also exhibit asynchronous first detection. These effects are studied in the context of known target motion, as is the temporary removal of one of the sensors from the system. Of particular importance is the effect limited communications bandwidth has on the timeliness of the information exchange. Possible applications of the work are discussed, and suggestions are made for further research. This paper is organised as follows. First we review some background material, including the decentralised data fusion test bed used for the experiments reported here. Then we address the sensor management problem and describe the experiments we have performed; these focus on assessing the impact of employing sensor management on the performance of the data fusion system as a whole. Finally, we draw some conclusions and indicate some possible future work.
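A common concrete form of the local decision rule sketched above is a greedy information-gain criterion: the sensor points at the target whose predicted observation most increases the information in the global picture. A minimal sketch under assumed Gaussian models (the function name and the specific gain formula are illustrative, not necessarily the paper's):

```python
import numpy as np

def pick_target(priors, H, R):
    """Greedy sensor management: return the index of the target whose
    observation gives the largest expected information gain,
    0.5 * log det(I + H P H^T R^-1).

    priors : list of prior state covariances, one per tracked target
    H, R   : observation Jacobian and noise covariance of this sensor
    """
    Rinv = np.linalg.inv(R)
    def gain(P):
        S = H @ P @ H.T @ Rinv
        return 0.5 * np.log(np.linalg.det(np.eye(S.shape[0]) + S))
    return int(max(range(len(priors)), key=lambda i: gain(priors[i])))
```

Intuitively, the rule favours the target about which the network is currently most uncertain, which is why access to the global picture matters.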
This paper describes a new representation of the sensing and control uncertainties that occur in sensor-based robot control. A sensor is represented by three quantities: a domain, which is the set of robot configurations at which a valid measurement can be taken; an absolute sensing uncertainty field, which describes the sensor's absolute (global) accuracy; and an incremental motion uncertainty, which describes the sensor's relative (pertaining to displacements) accuracy. Control uncertainty represents the ability of a controller to drive the measured error near zero, regardless of how accurate the measurement is. These descriptions of sensing and control capability determine the evolution of uncertainty in a sensor-based motion plan. Sensor fusion is handled in this representation as the simultaneous activation of multiple sensors. A composite sensor's representation is computed from the representation of its constituent sensors. A motion planning framework based on the above representation of uncertainties is introduced through an example. This framework is designed to utilize the available sensing modes required to accomplish a task.
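For independent Gaussian sensing errors, one simple way a composite sensor's absolute uncertainty can be computed from its constituent sensors is inverse-variance combination. This particular fusion rule is an assumption for illustration, not necessarily the paper's representation:

```python
def fuse_uncertainty(sigmas):
    """Standard deviation of a composite sensor formed by simultaneously
    activating several sensors, assuming independent Gaussian errors:
    1/sigma_c^2 = sum_i 1/sigma_i^2 (inverse-variance combination)."""
    return sum(1.0 / s ** 2 for s in sigmas) ** -0.5
```

Two sensors with sigma = 1 fuse to sigma = 1/sqrt(2); adding a much noisier sensor barely changes the composite, which matches the intuition that fusion can only tighten the bound.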
Motion uncertainty arises whenever there is ambiguity in local velocity vector assignment, such as along a straight contour or in a textureless region. Motion uncertainty can be quantified by computing the entropy of the corresponding velocity probability distribution. We propose a new framework for the integration of visual motion where the objective is the reduction of motion uncertainty. Based on this approach, we have developed a model that searches for the proper extent of motion integration in order to minimize motion entropy. By modeling our task as a multi-stage stochastic optimization problem, the control structure for motion integration can be inferred through dynamic programming. Results from initial experiments demonstrate that our model is capable of analyzing image sequences containing the aperture problem and textureless regions.
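The entropy computation underlying this uncertainty measure is straightforward once the velocity probability distribution is discretized (the discretization itself is an assumption of this sketch):

```python
import numpy as np

def motion_entropy(p):
    """Shannon entropy (bits) of a discrete velocity probability distribution.

    High entropy = ambiguous local motion (e.g. the aperture problem along
    a straight contour); low entropy = a well-constrained velocity estimate.
    """
    p = np.asarray(p, dtype=float)
    p = p / p.sum()                      # normalize to a valid distribution
    nz = p[p > 0]                        # 0 * log 0 is taken as 0
    return float(-(nz * np.log2(nz)).sum())
```

A uniform distribution over candidate velocities (total ambiguity) gives maximal entropy; a single peaked velocity gives zero, so minimizing this quantity drives the integration toward unambiguous motion estimates.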
This paper is concerned with automated visual inspection of manufactured products, carried out by means of pixel-by-pixel comparison of the sensed image of the product to be inspected with some reference pattern (or image). In this framework, the disorder detection problem (or change-point problem) is of basic importance: it consists in detecting possible abrupt changes, occurring at unknown time points, in the parameters of the initial distribution of observations of a process. In this paper, the problem is considered from the point of view of both the parametric and non-parametric approaches. The purpose of this paper is to present several sequential jump detection algorithms which combine sensor information from automated visual inspection and also have applications in such areas as remote sensing, target recognition, environmental monitoring, etc. Examples are given to illustrate these algorithms.
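A classical parametric instance of such a sequential jump-detection algorithm is Page's CUSUM test. The sketch below covers the Gaussian mean-shift case, with assumed parameter names; it illustrates the general form such detectors take rather than the paper's specific algorithms:

```python
def cusum(x, mu0, mu1, sigma, threshold):
    """Page's CUSUM test for a jump in mean from mu0 to mu1 in i.i.d.
    Gaussian observations with standard deviation sigma.

    Returns the first index at which the cumulative log-likelihood ratio
    exceeds the threshold, or None if no change is detected.
    """
    g = 0.0
    for k, xk in enumerate(x):
        # Per-sample log-likelihood ratio of "changed" vs "unchanged".
        s = (mu1 - mu0) / sigma ** 2 * (xk - (mu0 + mu1) / 2.0)
        g = max(0.0, g + s)          # reset when evidence favours "unchanged"
        if g > threshold:
            return k
    return None
```

The threshold trades detection delay against false-alarm rate, which is exactly the tuning knob an inspection system exposes.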
A reconfigurable, optical, 3D scanning system with sub-second acquisition of human body surface data was designed and simulated. Sensor elements (digital cameras/light beam projectors) that meet resolution, accuracy, and speed requirements are included in the system design. The sensors are interfaced to video frame grabber(s) under computer control, resulting in a modular, low cost system. System operation and data processing are performed using a desktop graphics workstation. Surface data collected with this system can be oversampled to improve resolution and accuracy (viewed by overlapping camera/projector pairs). Multi-resolution data can be collected for different surfaces simultaneously or separately. Modeling and calibration of this reconfigurable system are achieved via a robust optimal estimation technique. Reconstruction software that allows seamless merging of range data from multiple sensors has been implemented. Laser scanners that acquire body surface range data using one or two sensors require several seconds for data collection. Surface digitization of inanimate objects is feasible with such devices, but their use in human surface metrology is limited due to motion artifacts and occluded surfaces. Use of multiple, independent active sensors providing rapid collection and multi-resolution data enables sampling of complex human surface morphology not otherwise practical. 3D facial surface data has provided accurate measurements used in facial/craniofacial plastic surgery and modern personal protective equipment systems. Whole body data obtained with this new system is applicable to human factors research, medical diagnosis/treatment, and industrial design.
This work presents a new approach to the problem of calibrating a zoomable camera. The calibration of zooming cameras is central for tasks which employ zoom to improve feature detection and correspondence, such as 3D stereo reconstruction. Our method solves for the parameters of a camera model using a global optimization technique on a sequence of images of a known calibration target obtained at different mechanical zoom settings. This approach addresses two primary weaknesses of classical camera calibration. First, the process avoids the difficulties of explicit feature detection: feature localization is instead included as part of the error measure used in the optimization. Second, images are not calibrated independently, as in previous efforts; rather, the optimization process considers all images simultaneously, representing the final calibrated camera as a function of zoom. We compute a starting point for the optimization using the measured mechanical zoom settings for the images, and certain features identified by either a high-level process or a human operator. This paper describes the details of our approach, showing initial experimental results on real data.
Issues relating to the fusion of dynamically distorted scanned imagery are discussed. In particular, the problem of resampling imagery to simulate orthographic projection is addressed. Using a suitable parameterization of the scanner trajectory enables a ray-tracing supersampling technique to be used which is highly efficient compared to conventional rendering approaches. It is seen that the problem of sampling rays within a distorted pixel can be solved by sampling platform motion during the scanline integration period. Difficulties encountered when using conventional parameterizations of motion are avoided by using the screw representation.
In this paper we report on research we are carrying out on camera localization and control for remote viewing during teleoperation. We present an approach to the localization problem which exploits a model of the task environment to guide the selection of vision filters for picking out `interesting' features. This research adapts the `interest' operator ideas of Moravec within a model-based vision framework. We present initial results which demonstrate the utility of feature-sensitive interest operators for picking out key visual landmarks.
We present a new system for supervisory automated control of multiple remote cameras. Our primary purpose in developing this system has been to provide a capability for knowledge-based, `hands-off' viewing during execution of teleoperation/telerobotic tasks. The reported technology has broader applicability to remote surveillance, telescience observation, automated manufacturing workcells, etc. We refer to this new capability as `Intelligent Viewing Control (IVC)', distinguishing it from simple programmed camera motion control. In the IVC system, camera viewing assignment, sequencing, positioning, panning, and parameter adjustment (zoom, focus, aperture, etc.) are invoked and interactively executed in real time by a knowledge-based controller, drawing on a priori known task models and constraints, including operator preferences. This multi-camera control is integrated with a real-time, high-fidelity 3D graphics simulation, which is correctly calibrated in perspective to the actual cameras and their platform kinematics (translation/pan-tilt). Such a merged graphics-with-video design allows the system user to preview and modify the planned (`choreographed') viewing sequences. Further, during actual task execution, the system operator has available both the resulting optimized video sequence and supplementary graphics views from arbitrary perspectives. IVC, including operator-interactive designation of robot task actions, is presented to the user as a well-integrated, video-graphic, single-screen user interface allowing easy access to all relevant telerobot communication/command/control resources. We describe and show pictorial results of a preliminary IVC system implementation for telerobotic servicing of a satellite.
Humans use their senses, particularly vision, to interrogate the environment in search of information pertinent to the performance of a task. We say that the user has `visual goals', and we associate `visual acts' with these goals; visual acts are patterns of `looking' displayed in acquiring the information. In this paper we present a model for visual acts based on known features of the human visual perception system, and to illustrate the model we use as a case study a task typical of mechanical manipulation operations. The model is based on human perceptual discrimination and is motivated by a query-based model of the observer.
Three dimensional surface measurements are required in a number of industrial processes. These measurements have commonly been made using contact probes, but optical sensors are now available that allow fast, non-contact measurements. A common characteristic of optical surface profilers is the trade-off between measurement accuracy and field of view. In order to measure large objects with high resolution, multiple views are required. An accurate transformation between the different views is needed to reconstruct the entire surface. We demonstrate a method of obtaining the transformation by choosing a small number of control points that lie in the overlapping region between two views. The selection of the control points is independent of the object geometry, and only an approximate knowledge of the overlapping region is required.