The Office of Naval Research (ONR) is looking for methods to perform higher levels of sensor processing onboard UAVs to alleviate the need to transmit full-motion video to ground stations over constrained data links. Charles River Analytics is particularly interested in performing intelligence, surveillance, and reconnaissance (ISR) tasks using UAV sensor feeds. Computing with approximate arithmetic can provide a 10,000x improvement in size, weight, and power (SWAP) over desktop CPUs, thereby enabling ISR processing onboard small UAVs. Charles River and Singular Computing are teaming on an ONR program to develop these low-SWAP ISR capabilities using a small, low-power, single-chip machine with many thousands of cores, developed by Singular Computing. Producing reliable results efficiently on massively parallel approximate machines requires adapting the core kernels of algorithms. We describe a feature-aided tracking algorithm adapted for the novel hardware architecture, which will be suitable for use onboard a UAV. Tests have shown the algorithm produces results equivalent to state-of-the-art traditional approaches while achieving a 6400x improvement in speed/power ratio.
One of the principal challenges in autonomous navigation for mobile ground robots is collision avoidance, especially in
dynamic environments featuring both moving and non-moving (static) obstacles. Detecting and tracking moving objects
(such as vehicles and pedestrians) presents a particular challenge because all points in the scene are in motion relative to
a moving platform. We present a solution for detecting and robustly tracking moving objects from a moving platform.
We use a novel epipolar Hough transform to identify points in the scene which do not conform to the geometric
constraints of a static scene when viewed from a moving camera. These points can then be analyzed in three different
domains: image space, Hough space and world space, allowing redundant clustering and tracking of moving objects. We
use a particle filter to model uncertainty in the tracking process and a multiple-hypothesis tracker with lifecycle
management to maintain tracks through occlusions and stop-start conditions. The result is a set of detected objects whose
position and estimated trajectory are continuously updated for use by path planning and collision avoidance systems. We
present results from experiments using a mobile test robot with a forward-looking stereo camera navigating among
multiple moving objects.
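As a minimal illustration of the geometric constraint mentioned above, the sketch below evaluates the standard epipolar residual x2^T E x1 for a camera translating forward. It is not the epipolar Hough transform itself, whose details the abstract does not give; the function names, the normalized-coordinate camera model, and the example points are assumptions chosen for illustration.

```python
# Illustrative sketch only: the plain epipolar constraint, not the authors'
# epipolar Hough transform. All names and numbers are hypothetical.

def cross_matrix(t):
    """Skew-symmetric matrix [t]_x such that [t]_x v = t x v."""
    tx, ty, tz = t
    return [[0.0, -tz, ty],
            [tz, 0.0, -tx],
            [-ty, tx, 0.0]]


def project(point):
    """Project a 3D point (camera frame) to normalized homogeneous coordinates."""
    x, y, z = point
    return (x / z, y / z, 1.0)


def epipolar_residual(E, x1, x2):
    """Return x2^T E x1, which is zero for a static point seen from both poses."""
    Ex1 = [sum(E[r][c] * x1[c] for c in range(3)) for r in range(3)]
    return sum(x2[r] * Ex1[r] for r in range(3))


# Assumed motion: the camera advances 1 unit along its optical axis with no
# rotation, so a static point X1 in the first frame becomes X2 = X1 + t.
t = (0.0, 0.0, -1.0)
E = cross_matrix(t)  # essential matrix E = [t]_x R with R = identity

X1 = (1.0, 2.0, 10.0)
static_X2 = (X1[0] + t[0], X1[1] + t[1], X1[2] + t[2])
# A second point that also moved on its own between the two frames.
moving_X2 = (static_X2[0] + 0.5, static_X2[1] + 0.3, static_X2[2])

static_residual = epipolar_residual(E, project(X1), project(static_X2))
moving_residual = epipolar_residual(E, project(X1), project(moving_X2))
```

Under these assumptions the static point gives a residual that is numerically zero, while the independently moving point does not. A point that happens to move along its own epipolar line would still satisfy the constraint, which is one reason a single residual test is only a starting point.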
Computer vision methods, such as automatic target recognition (ATR) techniques, have the potential to improve the
accuracy of military systems for weapon deployment and targeting, resulting in greater utility and reduced collateral
damage. A major challenge, however, is training the ATR algorithm to the specific environment and mission. Because of
the wide range of operating conditions encountered in practice, advanced training based on a pre-selected training set
may not provide the robust performance needed. Training on a mission-specific image set is a promising approach, but
requires rapid selection of a small but highly representative training set to support time-critical operations. To remedy
these problems and make short-notice seeker missions a reality, we developed Learning and Mining using Bagged
Augmented Decision Trees (LAMBAST). LAMBAST examines large databases and extracts sparse, representative
subsets of target and clutter samples of interest. For data mining, LAMBAST uses a variant of decision trees, called
random decision trees (RDTs). This approach guards against overfitting and can incorporate novel, mission-specific data
after initial training via perpetual learning. We augment these trees with a distribution modeling component that
eliminates redundant information, ignores misrepresentative class distributions in the database, and stops training when
decision boundaries are sufficiently sampled. These augmented random decision trees enable fast investigation of
multiple images to train a reliable, mission-specific ATR. This paper presents the augmented random decision tree
framework, develops the sampling procedure for efficient construction of the sample, and illustrates the procedure using
Scene-Based Non-Uniformity Correction (SBNUC) is an attractive alternative to radiometric calibration for infrared
sensors because it does not rely on specialized hardware. The best-known approach is Constant Statistics (CS), but it is
highly dependent on scene content and the amount of motion present, often introducing a "ghosting" artifact. In this
paper, we present a novel approach that applies a variation on CS to both the spatial and frequency domains of the
image. The result is a solution that effectively eliminates fixed-pattern noise without ghosting and is much less
dependent on scene content and scene motion than traditional CS.
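For readers unfamiliar with CS, the sketch below shows the classic form of the idea in a few lines: each pixel is normalized by its own temporal mean and standard deviation. It is a minimal sketch of textbook CS under a linear gain/offset sensor model, not the spatial/frequency variant presented in the paper; the function name and the toy data are assumptions.

```python
# Illustrative sketch of classic Constant Statistics (CS) correction, not the
# spatial/frequency variant described above. Names and data are hypothetical.
from statistics import mean, pstdev


def constant_statistics_correct(frames, eps=1e-9):
    """Normalize each pixel by its own temporal mean and standard deviation.

    frames: list of 2D lists (rows x cols) of raw sensor values.
    CS assumes every pixel sees the same temporal mean and spread of scene
    radiance, so the per-pixel mean estimates the offset and the per-pixel
    standard deviation estimates the gain.
    """
    rows, cols = len(frames[0]), len(frames[0][0])
    corrected = [[[0.0] * cols for _ in range(rows)] for _ in frames]
    for r in range(rows):
        for c in range(cols):
            series = [f[r][c] for f in frames]
            offset = mean(series)
            gain = max(pstdev(series), eps)  # guard against a constant pixel
            for k, value in enumerate(series):
                corrected[k][r][c] = (value - offset) / gain
    return corrected


# Two pixels view the same scene values over time, but the second pixel has
# an assumed gain of 2 and offset of 5 (fixed-pattern noise).
scene = [1.0, 2.0, 3.0, 4.0]
raw_frames = [[[s, 2.0 * s + 5.0]] for s in scene]
clean_frames = constant_statistics_correct(raw_frames)
```

After correction both pixels report the same value in every frame. When the scene does not change enough for the constant-statistics assumption to hold, scene content leaks into the offset and gain estimates, which is the source of the ghosting noted above.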
Many fielded mobile robot systems have demonstrated the importance of directly estimating the 3D shape of objects in the robot's vicinity. The most mature solutions available today use active laser scanning or stereo camera pairs, but both approaches require specialized and expensive sensors. In prior publications, we have demonstrated the generation of stereo images from a single very low-cost camera using structure from motion (SFM) techniques. In this paper we demonstrate the practical usage of single-camera stereo in real-world mobile robot applications. Stereo imagery tends to produce incomplete 3D shape reconstructions of man-made objects because of smooth/glary regions that defeat stereo matching algorithms. We demonstrate robust object detection despite such incompleteness through matching of simple parameterized geometric models. Results are presented where parked cars are detected, and then recognized via license plate recognition, all in real time by a robot traveling through a parking lot.
Mobile robot designers frequently look to computer vision to solve navigation, obstacle avoidance, and object detection problems such as those encountered in parking lot surveillance. Stereo reconstruction is a useful technique in this domain and can be done in two ways. The first requires a fixed stereo camera rig to provide two side-by-side images; the second uses a single camera in motion to provide the images. While stereo rigs can be accurately calibrated in advance, they rely on a fixed baseline distance between the two cameras. The advantage of a single-camera method is the flexibility to change the baseline distance to best match each scenario. This directly increases the robustness of the stereo algorithm and increases the effective range of the system. The challenge comes from accurately rectifying the images into an ideal stereo pair. Structure from motion (SFM) can be used to compute the camera motion between the two images, but its accuracy is limited and small errors can cause rectified images to be misaligned. We present a single-camera stereo system that incorporates a Levenberg-Marquardt minimization of rectification parameters to bring the rectified images into alignment.
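The benefit of an adjustable baseline can be seen from the standard pinhole stereo relation Z = fB/d, where Z is depth, f the focal length in pixels, B the baseline, and d the disparity. The sketch below is a back-of-the-envelope illustration, not part of the system described; the focal length, the two baselines, and the one-pixel disparity error are assumed values.

```python
# Illustrative sketch: how baseline length affects stereo depth uncertainty.
# Standard pinhole stereo geometry; all numeric values are hypothetical.

def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth Z = f * B / d for an ideal rectified stereo pair."""
    return focal_px * baseline_m / disparity_px


def depth_error(focal_px, baseline_m, depth_m, disparity_error_px=1.0):
    """First-order depth uncertainty: dZ ~= Z^2 / (f * B) * dd."""
    return depth_m ** 2 * disparity_error_px / (focal_px * baseline_m)


FOCAL_PX = 800.0   # assumed focal length in pixels
TARGET_M = 20.0    # assumed distance to the object of interest

# The same object viewed with a short and a long baseline.
error_short = depth_error(FOCAL_PX, 0.1, TARGET_M)
error_long = depth_error(FOCAL_PX, 0.5, TARGET_M)
depth_check = depth_from_disparity(FOCAL_PX, 0.5, 20.0)
```

With these assumed numbers, a one-pixel disparity error at 20 m corresponds to roughly 5 m of depth uncertainty for the 0.1 m baseline and roughly 1 m for the 0.5 m baseline, which is why choosing the baseline per scenario extends the effective range.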
Automatic Target Recognition (ATR) algorithms are extremely sensitive to differences between the operating conditions under which they are trained and the extended operating conditions (EOCs) in which the fielded algorithms are tested. These extended operating conditions can cause a target's signature to be drastically different from training exemplars/models. For example, a target's signature can be influenced by: the time of day, the time of year, the weather, atmospheric conditions, position of the sun or other illumination sources, the target surface and material properties, the target composition, the target geometry, sensor characteristics, sensor viewing angle and range, the target surroundings and environment, and the target and scene temperature. Recognition rates degrade if an ATR is not trained for a particular EOC. Most infrared target detection techniques are based on a very simple probabilistic theory. This theory states that a pixel should be assigned the label of "target" if a set of measurements (features) is more likely to have come from an assumed (or learned) distribution of target features than from the distribution of background features. However, most detection systems treat these learned distributions as static and they are not adapted to changing EOCs. In this paper, we present an algorithm for assigning a pixel the label of target or background based on a statistical comparison of the distributions of measurements surrounding that pixel in the image. This method provides a feature-level adaptation to changing EOCs. Results are demonstrated on infrared imagery containing several military vehicles.
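The simple probabilistic rule described above, which labels a pixel "target" when its features are more likely under the target distribution than under the background distribution, can be sketched as follows. This is a minimal sketch assuming a single scalar feature and fixed Gaussian class models; it illustrates the static baseline rule, not the adaptive method presented in the paper, and every name and parameter value is hypothetical.

```python
# Illustrative sketch of the basic likelihood-comparison rule, assuming a
# single scalar feature and Gaussian class models. Not the adaptive method
# presented in the paper; all parameter values are hypothetical.
import math


def gaussian_pdf(x, mu, sigma):
    """Density of a 1D normal distribution N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def label_pixel(feature, target_model, background_model):
    """Label a pixel by which class model makes its feature more likely."""
    p_target = gaussian_pdf(feature, *target_model)
    p_background = gaussian_pdf(feature, *background_model)
    return "target" if p_target > p_background else "background"


# Assumed (mean, standard deviation) of the feature for each class.
TARGET_MODEL = (10.0, 2.0)
BACKGROUND_MODEL = (0.0, 2.0)

hot_pixel_label = label_pixel(9.0, TARGET_MODEL, BACKGROUND_MODEL)
cool_pixel_label = label_pixel(1.0, TARGET_MODEL, BACKGROUND_MODEL)
```

Because the two models are fixed, a change in operating conditions that shifts the true feature distributions leaves the decision boundary where it was, which is the limitation of static learned distributions that the abstract points to.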
Mobile robot designers frequently look to computer vision to solve navigation, obstacle avoidance, and object detection problems. Potential solutions using low-cost video cameras are particularly alluring. Recent results in 3D scene reconstruction from a single moving camera seem particularly relevant, but robot designers who attempt to use such 3D techniques have uncovered a variety of practical concerns. We present lessons learned from developing a single-camera 3D scene reconstruction system that provides both a real-time camera motion estimate and a rough model of major 3D structures in the robot’s vicinity. Our objective is to use the motion estimate to supplement GPS (indoors in particular) and to use the model to provide guidance for further vision processing (look for signs on walls, obstacles on the ground, etc.). The computational geometry involved is closely related to traditional two-camera stereo; however, a number of degenerate cases exist. We also demonstrate how structure from motion (SFM) can be used to improve the performance of two specific robot navigation tasks.
The success of any potential application for mobile robots depends largely on the specific environment where the application takes place. Practical applications are rarely found in highly structured environments, but unstructured environments (such as natural terrain) pose major challenges to any mobile robot. We believe that semi-structured environments, such as parking lots, provide a good opportunity for successful mobile robot applications. Parking lots tend to be flat and smooth, and cars can be uniquely identified by their license plates. Our scenario is a parking lot where only known vehicles are supposed to park. The robot looks for vehicles that do not belong in the parking lot. It checks both license plates and vehicle types, in case the plate is stolen from an approved vehicle. It operates autonomously, but reports back to a guard who verifies its performance. Our interest is in developing the robot's vision system, which we call Scene Estimation & Situational Awareness Mapping Engine (SESAME). In this paper, we present initial results from the development of two SESAME subsystems, the ego-location and license plate detection systems. While their ultimate goals are obviously quite different, our design demonstrates that by sharing intermediate results, both tasks can be significantly simplified. The inspiration for this design approach comes from the basic tenets of Situational Awareness (SA), where the benefits of holistic perception are clearly demonstrated over the more typical designs that attempt to solve each sensing/perception problem in isolation.