As unmanned systems (UMS) proliferate for security and defense applications, autonomous control system capabilities that enable them to perform tactical operations are of increasing interest. These operations, in which UMS must match or exceed the performance and speed of people or manned assets, even in the presence of dynamic mission objectives and unpredictable adversary behavior, are well beyond the capability of even the most advanced control systems demonstrated to date. In this paper we deconstruct the tactical autonomy problem, identify the key technical challenges, and place them in context with the autonomy taxonomy produced by the US Department of Defense’s Autonomy Community of Interest. We argue that two key capabilities beyond the state of the art are required to enable an initial fieldable capability: rapid abstract perception in appropriate environments, and tactical reasoning. We summarize our work to date in tactical reasoning, and present initial results from a new research program focused on abstract perception in tactical environments. This approach seeks to apply semantic labels to a broad set of objects via three core thrusts. First, we use physics-based multi-sensor fusion to enable generalization from imperfect and limited training data. Second, we pursue methods to optimize sensor perspective to improve object segmentation, mapping, and, ultimately, classification. Finally, we assess the potential impact of using sensors that have not traditionally been used by UMS to perceive their environment, for example hyperspectral imagers, on the ability to identify objects.
Fast, accurate and robust automatic target recognition (ATR) in optical aerial imagery can provide game-changing
advantages to military commanders and personnel. ATR algorithms must reject non-targets with a high degree of
confidence in a world with an infinite number of possible input images. Furthermore, they must learn to recognize new
targets without requiring massive data collections. Whereas most machine learning algorithms classify data in a closed set
manner by mapping inputs to a fixed set of training classes, open set recognizers incorporate constraints that allow for
inputs to be labelled as unknown. We have adapted two template-based open set recognizers to use computer generated
synthetic images of military aircraft as training data, to provide a baseline for military-grade ATR: (1) a frequentist
approach based on probabilistic fusion of extracted image features, and (2) an open set extension to the one-class support
vector machine (SVM). Both algorithms use histograms of oriented gradients (HOG) as features, together with artificial
augmentation of both real and synthetic image chips to compensate for minimal training data. Our results show that
open set recognizers trained with synthetic data and tested with real data can successfully discriminate real target inputs
from non-targets. However, there is still a requirement for some knowledge of the real target in order to calibrate the
relationship between synthetic template and target score distributions. We conclude by proposing algorithm modifications
that may improve the ability of synthetic data to represent real data.
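The open-set decision rule described above can be illustrated with a minimal sketch. The names here (`hog_features`, `OpenSetTemplate`) are hypothetical, and the single global orientation histogram is a drastic simplification of a full block-wise HOG descriptor; the essential idea is only that a chip whose distance to every class template exceeds a calibrated threshold is labelled unknown rather than forced into a training class:

```python
import numpy as np

def hog_features(img, bins=8):
    # Simplified stand-in for HOG: one global, magnitude-weighted
    # histogram of gradient orientations, normalized to unit mass.
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx) % np.pi
    hist, _ = np.histogram(ang, bins=bins, range=(0.0, np.pi), weights=mag)
    return hist / (hist.sum() + 1e-9)

class OpenSetTemplate:
    # Template-based open-set recognizer: a chip farther than `threshold`
    # from every class template is rejected as "unknown".
    def __init__(self, threshold):
        self.threshold = threshold
        self.templates = {}

    def fit(self, label, chips):
        self.templates[label] = np.mean([hog_features(c) for c in chips], axis=0)

    def predict(self, chip):
        f = hog_features(chip)
        label, dist = min(((k, np.linalg.norm(f - t))
                           for k, t in self.templates.items()),
                          key=lambda kv: kv[1])
        return label if dist <= self.threshold else "unknown"
```

In a synthetic-to-real setting the templates would be built from computer-generated chips, and the threshold would have to be calibrated against real score distributions, which is exactly the residual requirement for real-target knowledge noted above.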
Current techniques for building detection in Synthetic Aperture Radar (SAR) imagery can be computationally expensive and/or enforce stringent requirements for data acquisition. We present a technique that is effective and efficient at determining an approximate building location from multi-pass single-pol SAR imagery. This approximate location provides focus-of-attention to specific image regions for subsequent processing. The proposed technique assumes that for the desired image, a preprocessing algorithm has detected and labeled bright lines and shadows. Because we observe that buildings produce bright lines and shadows with predetermined relationships, our algorithm uses a graph clustering technique to find groups of bright lines and shadows that create a building. The nodes of the graph represent bright line and shadow regions, while the arcs represent the relationships between the bright lines and shadows. Constraints based on angle of depression and the relationship between connected bright lines and shadows are applied to remove unrelated arcs. Once the related bright lines and shadows are grouped, their locations are combined to provide an approximate building location. Experimental results are presented to demonstrate the outcome of this technique.
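The grouping step can be sketched with plain graph machinery. This is a hypothetical simplification: `group_building_cues` stands in for the full algorithm, and a single down-range-offset constraint replaces the actual depression-angle and connectivity constraints. Regions are nodes, constraint-passing bright-line/shadow pairs are arcs, and connected components yield candidate building locations:

```python
def group_building_cues(regions, max_offset=5.0):
    # regions: list of ("bright" | "shadow", (x, y)) centroids produced by a
    # preprocessing detector. An arc connects a bright line to a shadow lying
    # directly down-range of it (a stand-in for the depression-angle rules).
    n = len(regions)
    adj = {i: set() for i in range(n)}
    for i, (ki, (xi, yi)) in enumerate(regions):
        for j, (kj, (xj, yj)) in enumerate(regions):
            if ki == "bright" and kj == "shadow":
                if 0 < yj - yi <= max_offset and abs(xj - xi) <= max_offset:
                    adj[i].add(j)
                    adj[j].add(i)
    # Connected components of surviving arcs are candidate buildings; the mean
    # of member centroids approximates the building location.
    seen, groups = set(), []
    for s in range(n):
        if s in seen:
            continue
        stack, comp = [s], []
        while stack:
            u = stack.pop()
            if u in seen:
                continue
            seen.add(u)
            comp.append(u)
            stack.extend(adj[u] - seen)
        if len(comp) > 1:
            xs = [regions[k][1][0] for k in comp]
            ys = [regions[k][1][1] for k in comp]
            groups.append((sum(xs) / len(xs), sum(ys) / len(ys)))
    return groups
```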
Sandia National Laboratories produces copious amounts of high-resolution, single-polarization Synthetic Aperture Radar (SAR) imagery, much more than available researchers and analysts can examine. Automating the recognition of terrains and structures in SAR imagery is highly desired. The optical image processing community has shown that superpixel segmentation (SPS) algorithms divide an image into small compact regions of similar intensity. Applying these SPS algorithms to optical images can reduce image complexity, enhance statistical characterization and improve segmentation and categorization of scene objects. SPS algorithms typically require high SNR (signal-to-noise-ratio) images to define segment boundaries accurately. Unfortunately, SAR imagery contains speckle, a product of coherent image formation, which complicates the extraction of superpixel segments and could preclude their use. Some researchers have developed modified SPS algorithms that discount speckle for application to SAR imagery. We apply two widely-used SPS algorithms to speckle-reduced SAR image products, both single SAR products and combinations of multiple SAR products, which include both single polarization and multi-polarization SAR images. To evaluate the quality of resulting superpixels, we compute research-standard segmentation quality measures on the match between superpixels and hand-labeled ground-truth, as well as statistical characterization of the radar-cross-section within each superpixel. Results of this quality analysis determine the best input/algorithm/parameter set for SAR imagery. Simple Linear Iterative Clustering (SLIC) provides faster computation time, superpixels that conform to scene-relevant structures, direct control of average superpixel size, and more uniform superpixel sizes for improved statistical estimation, which will facilitate subsequent terrain/structure categorization and segmentation into scene-relevant regions.
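The essence of SLIC, grid-seeded k-means in a joint (row, column, intensity) space with a compactness weight trading spatial against radiometric distance, can be sketched in a few lines. This is a minimal illustration, not the full algorithm (it omits the restricted search window and connectivity enforcement), and it assumes a speckle-reduced input such as a multi-look-averaged SAR magnitude product:

```python
import numpy as np

def slic_like(img, n_seg=16, compactness=0.1, iters=5):
    # Minimal SLIC-style clustering: seed cluster centers on a regular grid,
    # then iterate k-means in scaled (row, col, intensity) feature space.
    h, w = img.shape
    g = int(np.sqrt(n_seg))
    rr, cc = np.meshgrid(np.linspace(0, h - 1, g + 2)[1:-1],
                         np.linspace(0, w - 1, g + 2)[1:-1], indexing="ij")
    sr, sc = rr.ravel().astype(int), cc.ravel().astype(int)
    centers = np.stack([sr.astype(float), sc.astype(float), img[sr, sc]], axis=1)
    R, C = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    rf, cf2, vf = R.ravel(), C.ravel(), img.ravel()
    feats = np.stack([rf * compactness, cf2 * compactness, vf], axis=1)
    for _ in range(iters):
        cf = np.stack([centers[:, 0] * compactness,
                       centers[:, 1] * compactness, centers[:, 2]], axis=1)
        d = ((feats[:, None, :] - cf[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        for k in range(len(centers)):
            m = labels == k
            if m.any():  # leave empty clusters unchanged
                centers[k] = [rf[m].mean(), cf2[m].mean(), vf[m].mean()]
    return labels.reshape(h, w)
```

The compactness parameter plays the role described above: small values let superpixels follow intensity structure, large values force uniform, compact segments for stable per-segment statistics.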
We have developed algorithms to automatically learn a detection map of a deployed sensor field for a virtual presence
and extended defense (VPED) system without a priori knowledge of the local terrain. The VPED system is an
unattended network of sensor pods, with each pod containing acoustic and seismic sensors. Each pod has the ability to
detect and classify moving targets at a limited range. By using a network of pods we can form a virtual perimeter with
each pod responsible for a certain section of the perimeter. The site's geography and soil conditions can affect the
detection performance of the pods. Thus, a network in the field may not have the same performance as a network
designed in the lab. To solve this problem we automatically estimate a network's detection performance as it is being
installed at a site by a mobile deployment unit (MDU). The MDU carries a GPS unit, so the system knows not only
when it can detect the MDU, but also the MDU's location. In this paper, we demonstrate how to handle anisotropic
sensor-configurations, geography, and soil conditions.
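The core of this calibration walk can be sketched as an empirical, grid-binned estimate. The function name and binning scheme here are hypothetical illustration; a fielded system would additionally model anisotropy and interpolate between visited cells. Each MDU GPS fix paired with a detect/no-detect outcome updates a per-cell detection probability:

```python
import numpy as np

def estimate_detection_map(track, detections, grid_shape, cell_size):
    # track: list of (x, y) MDU GPS fixes; detections: parallel 0/1 flags
    # indicating whether the pod network detected the MDU at that fix.
    # Returns the empirical per-cell detection probability and visit counts.
    hits = np.zeros(grid_shape)
    visits = np.zeros(grid_shape)
    for (x, y), d in zip(track, detections):
        i, j = int(y // cell_size), int(x // cell_size)
        if 0 <= i < grid_shape[0] and 0 <= j < grid_shape[1]:
            visits[i, j] += 1
            hits[i, j] += d
    # Cells the MDU never visited have no estimate (NaN), which reflects
    # why terrain and soil conditions cannot be assumed from lab models.
    pmap = np.where(visits > 0, hits / np.maximum(visits, 1), np.nan)
    return pmap, visits
```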
Airborne ground moving-target indication (GMTI) radar can track moving vehicles at large standoff distances.
Unfortunately, trajectories from multiple vehicles can become kinematically ambiguous, resulting in confusion
between a target vehicle of interest and other vehicles. We propose the use of high range resolution (HRR) radar
profiles and multinomial pattern matching (MPM) for target fingerprinting and track stitching to overcome this ambiguity.
Sandia's MPM algorithm is a robust template-based identification algorithm that has been applied successfully
to various target recognition problems. MPM utilizes a quantile transformation to map target intensity samples
to a small number of grayscale values, or quantiles. The algorithm relies on a statistical characterization of the
multinomial distribution of the sample-by-sample intensity values for target profiles. The quantile transformation
and statistical characterization procedures are extremely well suited to a robust representation of targets for HRR
profiles: they are invariant to sensor calibration, robust to target signature variations, and lend themselves to
efficient matching algorithms.
In typical HRR tracking applications, target fingerprints must be initiated on the fly from a limited number of
HRR profiles. Data may accumulate indefinitely as vehicles are tracked, and their templates must be continually
updated without becoming unbounded in size or complexity. To address this need, an incrementally updated
version of MPM has been developed. This implementation of MPM incorporates individual HRR profiles as they
become available, and fuses data from multiple aspect angles for a given target to aid in track stitching. This
paper provides a description of the incrementally updated version of MPM.
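The two mechanisms described above, a rank-based quantile transformation and an incrementally updated multinomial characterization, can be sketched as follows. The class `IncrementalMPM` and its parameters are hypothetical illustration, not Sandia's implementation; the sketch shows only why the quantile transform is calibration-invariant and how per-bin counts can absorb one HRR profile at a time without growing in size:

```python
import numpy as np

class IncrementalMPM:
    # Sketch of multinomial pattern matching: each range bin keeps counts of
    # quantile levels observed across profiles, updated one profile at a time.
    def __init__(self, n_bins, n_quantiles=4):
        self.counts = np.ones((n_bins, n_quantiles))  # Laplace-style prior
        self.n_quantiles = n_quantiles

    def quantize(self, profile):
        # Quantile transform: map intensities to a small number of grayscale
        # levels by rank; invariant to monotone sensor-calibration changes.
        edges = np.quantile(profile, np.linspace(0, 1, self.n_quantiles + 1)[1:-1])
        return np.searchsorted(edges, profile)

    def update(self, profile):
        # Incremental template update: fixed-size counts, unbounded data.
        q = self.quantize(profile)
        self.counts[np.arange(len(q)), q] += 1

    def log_likelihood(self, profile):
        # Multinomial match score of a candidate profile against the template.
        q = self.quantize(profile)
        p = self.counts / self.counts.sum(axis=1, keepdims=True)
        return float(np.log(p[np.arange(len(q)), q]).sum())
```

Because the template is just a count matrix, fusing profiles from multiple aspect angles amounts to maintaining one such matrix per aspect sector and updating whichever sector a new profile falls into.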
An unattended ground sensor (UGS) that attempts to perform target identification without providing some corresponding estimate of confidence level is of limited utility. In this context, a confidence level is a measure of probability that the detected vehicle is of a particular target class. Many identification methods attempt to match features of a detected vehicle to each of a set of target templates. Each template is formed empirically from features collected from vehicles known to be members of the particular target class. The nontarget class is inherent in this formulation and must be addressed in providing a confidence level. Often, it is difficult to adequately characterize the nontarget class empirically by feature collection, so assumptions must be made about the nontarget class. An analyst tasked with deciding how to use the confidence level of the classifier decision should have an accurate understanding of the meaning of the confidence level given. This paper compares several definitions of confidence level by considering the assumptions made in each, examining how those assumptions affect the meaning, and giving examples of implementing each definition in a practical acoustic UGS.
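The dependence of the reported confidence on the nontarget assumption can be made concrete with one Bayesian definition of confidence level. This is an illustrative sketch (the function and its arguments are hypothetical, and the paper compares several definitions, not only this one), but it shows how the same classifier scores yield different confidences when the assumed nontarget likelihood changes:

```python
def confidence(likelihoods, priors, nontarget_likelihood, nontarget_prior):
    # One definition of confidence level: the posterior P(class | features),
    # where the nontarget class enters only through an *assumed* likelihood
    # and prior. The reported number depends directly on that assumption.
    num = {k: likelihoods[k] * priors[k] for k in likelihoods}
    z = sum(num.values()) + nontarget_likelihood * nontarget_prior
    return {k: v / z for k, v in num.items()}
```

Raising the assumed nontarget likelihood lowers every target confidence, which is precisely why an analyst must know which definition, and which assumptions, produced the number.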
Automating the detection and identification of significant threats using multispectral (MS) imagery is a critical issue in remote sensing. Unlike previous multispectral target recognition approaches, we utilize a three-stage process that takes into account not only the spectral content, but also the spatial information. The first stage applies a matched filter to the calibrated MS data. Here, the matched filter is tuned to the spectral components of a given target and produces an image intensity map of where the best matches occur. The second stage represents a novel detection algorithm, known as the focus of attention (FOA) stage. The FOA performs an initial screening of the data based on intensity and size checks on the matched filter output. Next, using the target's pure components, the third stage performs constrained linear unmixing on MS pixels within the FOA detected regions. Knowledge sources derived from this process are combined using a sequential probability ratio test (SPRT). The SPRT can fuse contaminated, uncertain and disparate information from multiple sources. We demonstrate our approach on identifying a specific target using actual data collected in ideal conditions and also use approximately 35 square kilometers of urban clutter as false alarm data.
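The first of the three stages can be sketched with the classic covariance-whitened spectral matched filter. This is a textbook form offered as an assumed stand-in for the filter described above (the abstract does not specify the exact formulation), producing the intensity map that the FOA stage then screens by size and intensity:

```python
import numpy as np

def spectral_matched_filter(cube, target_sig):
    # cube: (rows, cols, bands) calibrated MS data. Returns an intensity map
    # of matched-filter scores tuned to the target spectral signature.
    h, w, b = cube.shape
    X = cube.reshape(-1, b)
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False) + 1e-6 * np.eye(b)  # regularized background
    ci = np.linalg.inv(cov)
    s = target_sig - mu
    w_vec = ci @ s / np.sqrt(s @ ci @ s)  # whitened, normalized filter
    return ((X - mu) @ w_vec).reshape(h, w)
```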
In this paper we investigate the applicability of the feature extraction mechanisms found in the neurophysiology of mammals to the problem of object recognition in synthetic aperture radar imagery. Our approach presents multiple views of target objects to a two-stage self-organizing neural network architecture. The first stage, a Neocognitron, performs two layers of feature extraction. The resulting feature vectors are presented to the second stage, an ART-2A self-organizing neural network classifier, which clusters the features into multiple object categories. In our first experiments, reported in a previous paper, the Neocognitron was trained on raw SAR imagery. The architecture was able to recognize a simulated vehicle at arbitrary azimuthal orientations at a single depression angle while rejecting clutter as well as other vehicles. Feature extraction on raw imagery yielded features that were robust but difficult to interpret. We have performed new experiments in which the self-organization process is used to discover features separately in shadow and bright returns from objects to be recognized. Feature extraction on shadow returns yields oriented contrast edge operators suggestive of bipartite simple cells observed in the striate cortex of mammals. Feature extraction on the specularity patterns in bright returns yields a mixture of orientation-independent operators similar to those found in the retina, and a collection of symmetric oriented contrast edge operators. These operators are formed at multiple positions within the receptive fields during the self-organization process and collectively resemble a two-dimensional Haar basis set. We merge the feature operators discovered separately in shadow and bright returns into a combined feature extractor front end. This front end is designed to extract the desired features from raw imagery. We compare the performance of the earlier two-stage neural network with a modified network using the new feature set.
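The second stage's clustering behavior can be illustrated with a minimal ART-2A-style sketch. This is an assumed, simplified form (the class name, learning rate, and single-pass update are illustrative, not the paper's implementation): each feature vector is matched against stored prototypes by cosine similarity, and a vigilance test decides between refining the best-matching category and committing a new one:

```python
import numpy as np

class ART2A:
    # Minimal ART-2A-style clusterer: cosine match against unit prototypes;
    # the vigilance test decides update-existing vs. create-new category.
    def __init__(self, vigilance=0.9, lr=0.1):
        self.rho, self.lr, self.protos = vigilance, lr, []

    def learn(self, x):
        x = np.asarray(x, float)
        x = x / np.linalg.norm(x)
        if self.protos:
            sims = [p @ x for p in self.protos]
            j = int(np.argmax(sims))
            if sims[j] >= self.rho:
                # Resonance: move the winning prototype toward the input.
                p = (1 - self.lr) * self.protos[j] + self.lr * x
                self.protos[j] = p / np.linalg.norm(p)
                return j
        # Mismatch with all prototypes: commit a new category.
        self.protos.append(x)
        return len(self.protos) - 1
```

The vigilance parameter governs category granularity, which is what allows a single depression angle's azimuthal views of one vehicle to collapse into a small set of categories while clutter falls elsewhere.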
Noisy objects, partially occluded objects, and objects in random positions and orientations cause significant problems for current robotic vision systems. In the past, an association graph has formed the basis for many model-based matching methods. However, the association graph has many false nodes due to local and noisy features. Objects having similar local structures create many false arcs in the association graph. The recursive and tree-search procedures for finding maximal cliques of structurally compatible matches have exponential time complexity due to these false arcs and nodes. This becomes a serious problem as the number of objects appearing in the scene and the model set size increase. Our approach is similar to randomized string-matching algorithms. Model features are represented as points on edges. A fingerprint defines a subset of model features that uniquely identifies the model. These fingerprints remove the ambiguities and inconsistencies present in the association graph and eliminate the problems of Turney's connected salient features. The vision system chooses the fingerprints at random, preventing a knowledgeable adversary from constructing examples that destroy the advantages of fingerprinting. Fingerprints consist of local model features called point vectors. We have developed a heuristic approach for extracting fingerprints from a set of model objects. A list of connected and unconnected scene edges represents the scene. A Hough-transform-style approach matches model fingerprints to scene features. Results are given for scenes containing varying amounts of occlusion.
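The random-fingerprint idea combined with Hough-style matching can be sketched as follows. This is a deliberately reduced illustration (the function name is hypothetical, features are bare 2D points rather than point vectors, and only translation is voted on, not rotation): a random subset of model points votes for the translation that aligns it with scene points, and the peak vote identifies the match despite occlusion and clutter:

```python
import random
from collections import Counter

def match_fingerprint(model_pts, scene_pts, k=3, seed=1):
    # Choose k random model points as the fingerprint (randomization defeats
    # adversarially constructed scenes), then vote Hough-style over the
    # integer translations that align fingerprint points with scene points.
    rng = random.Random(seed)
    fp = rng.sample(model_pts, k)
    votes = Counter()
    for (mx, my) in fp:
        for (sx, sy) in scene_pts:
            votes[(sx - mx, sy - my)] += 1
    (dx, dy), n = votes.most_common(1)[0]
    # n/k is the fraction of the fingerprint explained by the best pose,
    # a natural occlusion-tolerant match score.
    return (dx, dy), n / k
```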