The Office of Naval Research (ONR) is looking for methods to perform higher levels of sensor processing onboard UAVs to alleviate the need to transmit full motion video to ground stations over constrained data links. Charles River Analytics is particularly interested in performing intelligence, surveillance, and reconnaissance (ISR) tasks using UAV sensor feeds. Computing with approximate arithmetic can provide a 10,000x improvement in size, weight, and power (SWAP) over desktop CPUs, thereby enabling ISR processing onboard small UAVs. Charles River and Singular Computing are teaming on an ONR program to develop these low-SWAP ISR capabilities using a small, low-power, single-chip machine, developed by Singular Computing, with many thousands of cores. Producing reliable results efficiently on massively parallel approximate machines requires adapting the core kernels of algorithms. We describe a feature-aided tracking algorithm adapted for the novel hardware architecture, which will be suitable for use onboard a UAV. Tests have shown the algorithm produces results equivalent to state-of-the-art traditional approaches while achieving a 6400x improvement in speed/power ratio.
Computer vision methods, such as automatic target recognition (ATR) techniques, have the potential to improve the
accuracy of military systems for weapon deployment and targeting, resulting in greater utility and reduced collateral
damage. A major challenge, however, is training the ATR algorithm to the specific environment and mission. Because of
the wide range of operating conditions encountered in practice, advanced training based on a pre-selected training set
may not provide the robust performance needed. Training on a mission-specific image set is a promising approach, but
requires rapid selection of a small, but highly representative training set to support time-critical operations. To remedy
these problems and make short-notice seeker missions a reality, we developed Learning and Mining using Bagged
Augmented Decision Trees (LAMBAST). LAMBAST examines large databases and extracts sparse, representative
subsets of target and clutter samples of interest. For data mining, LAMBAST uses a variant of decision trees, called
random decision trees (RDTs). This approach guards against overfitting and can incorporate novel, mission-specific data
after initial training via perpetual learning. We augment these trees with a distribution modeling component that
eliminates redundant information, ignores misrepresentative class distributions in the database, and stops training when
decision boundaries are sufficiently sampled. These augmented random decision trees enable fast investigation of
multiple images to train a reliable, mission-specific ATR. This paper presents the augmented random decision tree
framework, develops the sampling procedure for efficient construction of the sample, and illustrates the procedure using
Automatic target detection (ATD) systems process imagery to detect and locate targets in support of intelligence,
surveillance, reconnaissance, and strike missions. Accurate prediction of ATD performance would assist in system
design and trade studies, collection management, and mission planning. Specifically, a need exists for ATD performance
prediction based exclusively on information available from the imagery and its associated metadata. In response to this
need, we undertake a modeling effort that consists of two phases: a learning phase, where image measures are computed
for a set of test images, the ATD performance is measured, and a prediction model is developed; and a second phase to
test and validate performance prediction. The learning phase produces a mapping, valid across various ATD algorithms,
which is even applicable when no image truth is available (e.g., when evaluating denied area imagery). Ongoing efforts
to develop such a prediction model have met with some success. Previous results presented models to predict
performance for several ATD methods. This paper extends the work in several ways: extension to a new ATD method,
application of the modeling to a new image set, and an investigation of systematic changes in the image properties
(resolution, noise, contrast). The paper concludes with a discussion of future research.
Automatic target recognition (ATR) using an infrared (IR) sensor is a particularly appealing combination, because an IR sensor can overcome various types of concealment and works in both day and night conditions. We present a system for ATR on low-resolution IR imagery. We describe the system architecture and methods for feature extraction and feature subset selection. We also compare two types of classifier, K-Nearest Neighbors (KNN) and Random Decision Tree (RDT). Our experiments test the recognition accuracy of the classifiers, within our ATR system, on a variety of IR datasets. Results show that RDT and KNN achieve comparable performance across the tested datasets, but that RDT requires significantly less retrieval time on large datasets and in high-dimensional feature spaces. Therefore, we conclude that RDT is a promising classifier to enable a robust, real-time ATR solution.
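The retrieval-time observation can be illustrated with a minimal brute-force KNN sketch. This is not the system described above: the function name `knn_classify`, the Euclidean distance, and the toy labels are assumptions made for illustration. Each query compares against every stored sample in every feature dimension, so the per-query cost grows with both dataset size and dimensionality.

```python
import math
from collections import Counter

def knn_classify(train_points, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training points.

    Brute-force search: one Euclidean distance per stored sample, so the
    per-query cost is proportional to (number of samples) x (feature dimension).
    """
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Hypothetical 2-D feature vectors for two classes.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["clutter", "clutter", "clutter", "target", "target", "target"]
print(knn_classify(points, labels, (5.2, 5.1)))
```

A tree-based classifier answers a query by descending a small number of nodes instead of scanning all samples, which is one plausible reason for the retrieval-time gap reported above.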
Automatic target detection (ATD) systems process imagery to detect and locate targets in support of a
variety of military missions. Accurate prediction of ATD performance would assist in system design and trade
studies, collection management, and mission planning. A need exists for ATD performance prediction based exclusively
on information available from the imagery and its associated metadata. We present a predictor based on
image measures quantifying the intrinsic ATD difficulty on an image. The modeling effort consists of two phases:
a learning phase, where image measures are computed for a set of test images, the ATD performance is measured,
and a prediction model is developed; and a second phase to test and validate performance prediction. The learning
phase produces a mapping, valid across various ATR algorithms, which is even applicable when no image truth is
available (e.g., when evaluating denied area imagery). The testbed has plug-in capability to allow rapid evaluation
of new ATR algorithms. The image measures employed in the model include: statistics derived from a constant
false alarm rate (CFAR) processor, the Power Spectrum Signature, and others. We present performance predictors
for two trained ATD classifiers, one constructed using GENIE Pro™, a tool developed at Los Alamos National
Laboratory, and the other using eCognition™, developed by Definiens (http://www.definiens.com/products). We
present analyses of the two performance predictions, and compare the underlying prediction models. The paper
concludes with a discussion of future research.
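For readers unfamiliar with the CFAR processor named among the image measures, the following is a minimal one-dimensional cell-averaging sketch. It is an illustrative assumption, not the processor used in the paper: the function name `ca_cfar`, the window sizes, and the threshold scale are all hypothetical. Each cell is compared against a threshold derived from the mean of nearby training cells, skipping guard cells adjacent to the cell under test.

```python
def ca_cfar(signal, train_per_side=4, guard_per_side=1, scale=3.0):
    """Return indices of cells exceeding `scale` times the local noise estimate.

    The noise estimate is the mean of up to `train_per_side` training cells on
    each side of the cell under test, excluding `guard_per_side` guard cells.
    """
    detections = []
    n = len(signal)
    for i in range(n):
        training = []
        first = guard_per_side + 1
        last = guard_per_side + train_per_side
        for offset in range(first, last + 1):
            if i - offset >= 0:
                training.append(signal[i - offset])
            if i + offset < n:
                training.append(signal[i + offset])
        if not training:
            continue  # no neighbours available to estimate noise
        noise = sum(training) / len(training)
        if signal[i] > scale * noise:
            detections.append(i)
    return detections

# Flat background with a single bright cell at index 5.
print(ca_cfar([1, 1, 1, 1, 1, 10, 1, 1, 1, 1, 1]))
```

Because the threshold tracks the local background level, the false alarm rate stays roughly constant as background intensity changes, which is what makes statistics of such a detector a candidate measure of image difficulty.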
Many fielded mobile robot systems have demonstrated the importance of directly estimating the 3D shape of objects in the robot's vicinity. The most mature solutions available today use active laser scanning or stereo camera pairs, but both approaches require specialized and expensive sensors. In prior publications, we have demonstrated the generation of stereo images from a single very low-cost camera using structure from motion (SFM) techniques. In this paper we demonstrate the practical usage of single-camera stereo in real-world mobile robot applications. Stereo imagery tends to produce incomplete 3D shape reconstructions of man-made objects because of smooth/glary regions that defeat stereo matching algorithms. We demonstrate robust object detection despite such incompleteness through matching of simple parameterized geometric models. Results are presented where parked cars are detected, and then recognized via license plate recognition, all in real time by a robot traveling through a parking lot.
Mobile robot designers frequently look to computer vision to solve navigation, obstacle avoidance, and object detection problems such as those encountered in parking lot surveillance. Stereo reconstruction is a useful technique in this domain and can be done in two ways. The first requires a fixed stereo camera rig to provide two side-by-side images; the second uses a single camera in motion to provide the images. While stereo rigs can be accurately calibrated in advance, they rely on a fixed baseline distance between the two cameras. The advantage of a single-camera method is the flexibility to change the baseline distance to best match each scenario. This directly increases the robustness of the stereo algorithm and increases the effective range of the system. The challenge comes from accurately rectifying the images into an ideal stereo pair. Structure from motion (SFM) can be used to compute the camera motion between the two images, but its accuracy is limited and small errors can cause rectified images to be misaligned. We present a single-camera stereo system that incorporates a Levenberg-Marquardt minimization of rectification parameters to bring the rectified images into alignment.
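The benefit of an adjustable baseline can be sketched with the standard rectified-stereo relation Z = f·B/d. The function names and numeric values below are assumptions for illustration and do not come from the paper; `focal_px` is the focal length in pixels, `baseline_m` the camera separation in metres, and `disparity_px` the measured disparity.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth of a point in a rectified stereo pair: Z = f * B / d."""
    return focal_px * baseline_m / disparity_px

def depth_uncertainty(focal_px, baseline_m, depth_m, disparity_error_px=1.0):
    """First-order depth error for a given disparity error: dZ ~ Z^2 / (f * B) * dd."""
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_error_px

# Hypothetical camera: 800-pixel focal length, target at 20 m.
for baseline in (0.5, 1.0, 2.0):
    error = depth_uncertainty(800.0, baseline, 20.0)
    print(f"baseline {baseline} m: depth error about {error:.2f} m per pixel of disparity error")
```

Under these assumptions, doubling the baseline halves the depth error at a given range, which is why choosing the baseline per scenario can extend the effective range of the system.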
Automatic Target Recognition (ATR) algorithms are extremely sensitive to differences between the operating conditions under which they are trained and the extended operating conditions (EOCs) in which the fielded algorithms are tested. These extended operating conditions can cause a target's signature to be drastically different from training exemplars/models. For example, a target's signature can be influenced by: the time of day, the time of year, the weather, atmospheric conditions, position of the sun or other illumination sources, the target surface and material properties, the target composition, the target geometry, sensor characteristics, sensor viewing angle and range, the target surroundings and environment, and the target and scene temperature. Recognition rates degrade if an ATR is not trained for a particular EOC. Most infrared target detection techniques are based on a very simple probabilistic theory. This theory states that a pixel should be assigned the label of "target" if a set of measurements (features) is more likely to have come from an assumed (or learned) distribution of target features than from the distribution of background features. However, most detection systems treat these learned distributions as static and they are not adapted to changing EOCs. In this paper, we present an algorithm for assigning a pixel the label of target or background based on a statistical comparison of the distributions of measurements surrounding that pixel in the image. This method provides a feature-level adaptation to changing EOCs. Results are demonstrated on infrared imagery containing several military vehicles.
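The simple probabilistic rule described above (label a pixel "target" when its features are more likely under the target distribution than under the background distribution) can be sketched as a log-likelihood ratio test. This shows the static baseline rule, not the adaptive method the paper presents. The independent Gaussian feature models, the function names, and the numeric values are assumptions for illustration.

```python
import math

def gaussian_log_pdf(x, mean, std):
    """Log density of a one-dimensional Gaussian."""
    return -0.5 * math.log(2.0 * math.pi * std * std) - (x - mean) ** 2 / (2.0 * std * std)

def label_pixel(features, target_model, background_model):
    """Return "target" if the features are more likely under the target model.

    Each model is a list of (mean, std) pairs, one per feature. Features are
    treated as independent, which is a simplifying assumption.
    """
    log_ratio = 0.0
    for x, (t_mean, t_std), (b_mean, b_std) in zip(features, target_model, background_model):
        log_ratio += gaussian_log_pdf(x, t_mean, t_std) - gaussian_log_pdf(x, b_mean, b_std)
    return "target" if log_ratio > 0.0 else "background"

# Hypothetical single-feature models: hot targets against a cool background.
target_model = [(10.0, 1.0)]
background_model = [(2.0, 1.0)]
print(label_pixel([9.0], target_model, background_model))
```

With fixed models like these, a shift in operating conditions (for example, a warmer background) moves the measurements while the decision boundary stays put, which is the weakness the abstract attributes to static learned distributions.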
Mobile robot designers frequently look to computer vision to solve navigation, obstacle avoidance, and object detection problems. Potential solutions using low-cost video cameras are particularly alluring. Recent results in 3D scene reconstruction from a single moving camera seem particularly relevant, but robot designers who attempt to use such 3D techniques have uncovered a variety of practical concerns. We present lessons learned from developing a single-camera 3D scene reconstruction system that provides both a real-time camera motion estimate and a rough model of major 3D structures in the robot's vicinity. Our objective is to use the motion estimate to supplement GPS (indoors in particular) and to use the model to provide guidance for further vision processing (look for signs on walls, obstacles on the ground, etc.). The computational geometry involved is closely related to traditional two-camera stereo; however, a number of degenerate cases exist. We also demonstrate how SFM can be used to improve the performance of two specific robot navigation tasks.