Automatic Target Recognition (ATR) seeks to improve upon techniques from signal processing, pattern recognition (PR), and information fusion. There is currently interest in extending traditional ATR methods with Artificial Intelligence (AI) and Machine Learning (ML). In support of these opportunities, the paper discusses a methodology entitled Systems Experimentation efficiency effectiveness Evaluation Networks (SEeeEN). ATR differs from PR in that ATR is a system deployment leveraging PR in a networked environment for mission decision making, whereas PR/ML is a statistical representation of patterns for classification. ATR analysis has long been part of the COMPrehensive Assessment of Sensor Exploitation (COMPASE) Center, which utilizes measures of performance (e.g., efficiency) and measures of effectiveness (e.g., robustness) for ATR evaluation. The paper also highlights available multimodal data sets for Automated ML Target Recognition (AMLTR).
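To make the MOP/MOE distinction concrete, the hedged Python sketch below scores a hypothetical ATR classifier on a sample measure of performance (throughput, an efficiency proxy) and a sample measure of effectiveness (worst-case accuracy across operating conditions, a robustness proxy). The function names, metrics, and data layout are illustrative assumptions, not the SEeeEN definitions.

```python
import time
import numpy as np

def evaluate_atr(classifier, datasets_by_condition):
    """Score an ATR classifier with a sample MOP and MOE.

    datasets_by_condition: dict mapping an operating-condition name
    (e.g., 'clear', 'haze', 'low-sun') to (images, labels) arrays.
    All names and metrics here are illustrative assumptions.
    """
    accuracies = {}
    total_images, total_seconds = 0, 0.0
    for condition, (images, labels) in datasets_by_condition.items():
        start = time.perf_counter()
        predictions = np.array([classifier(img) for img in images])
        total_seconds += time.perf_counter() - start
        total_images += len(images)
        accuracies[condition] = float(np.mean(predictions == labels))
    return {
        "mop_images_per_second": total_images / total_seconds,  # efficiency
        "moe_worst_case_accuracy": min(accuracies.values()),    # robustness
        "per_condition_accuracy": accuracies,
    }
```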
Training deep convolutional networks for satellite or aerial image analysis often requires a large amount of training data. For a more robust algorithm, training data need variation not only in the background and target but also in radiometric conditions, such as shadowing, illumination changes, atmospheric effects, and imaging platforms with different collection geometries. Data augmentation is a commonly used approach for generating additional training data; however, it is often insufficient to account for real-world changes in lighting, location, or viewpoint outside of the collection geometry. Alternatively, image simulation can be an efficient way to augment training data with all of these variations, such as changing backgrounds, that may be encountered in real data. The Digital Imaging and Remote Sensing Image Generation (DIRSIG) model is a tool that produces synthetic imagery using a suite of physics-based radiation propagation modules. DIRSIG can simulate images from different sensors with variation in collection geometry, spectral response, solar elevation angle, atmospheric model, target, and background. Simulation of Urban Mobility (SUMO) is a multimodal traffic simulation tool that explicitly models vehicles moving through a given road network. The output of the SUMO model was incorporated into DIRSIG to generate scenes with moving vehicles. The same approach, with slight modifications, was used with helicopters as targets. Using the combination of DIRSIG and SUMO, we quickly generated many small images with the target at the center against different backgrounds. The simulations generated images with vehicles and helicopters as targets, and corresponding images without targets. Using parallel computing, 120,000 training images were generated in about an hour. Preliminary results show an improvement in the deep learning algorithm when real image training data are augmented with the simulated images, especially when obtaining sufficient real data is particularly challenging.
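As a sketch of how such simulated chips might be mixed with real imagery during training, the following assumes directories of real and DIRSIG-generated image chips and uses PyTorch's ConcatDataset; the directory names and mixing strategy are assumptions for illustration, not the configuration used in the paper.

```python
import torch
from torch.utils.data import ConcatDataset, DataLoader
from torchvision import datasets, transforms

# A common preprocessing pipeline; the chip size is illustrative.
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

# Hypothetical layout: one subfolder per class ('target', 'no_target'),
# with real chips and DIRSIG/SUMO-simulated chips in separate trees.
real_data = datasets.ImageFolder("data/real_chips", transform=preprocess)
simulated_data = datasets.ImageFolder("data/dirsig_chips", transform=preprocess)

# Augment the (scarce) real training data with the simulated imagery.
train_set = ConcatDataset([real_data, simulated_data])
train_loader = DataLoader(train_set, batch_size=64, shuffle=True, num_workers=4)

for images, labels in train_loader:
    ...  # feed batches of mixed real/simulated chips to the CNN
```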
A Scene Understanding Challenge Problem was released by AFRL at this conference in 2015 in response to DARPA's Mathematics, Sensing, Exploitation, and Execution (MSEE) program. We consider a scene understanding system to be a generalization of typical sensor exploitation systems: instead of performing a narrowly defined task (e.g., detect, track, classify), the system can perform general user-defined tasks specified in a query language. That paper [1] laid out the general challenges and methods for developing scene understanding performance models. This is an enormously challenging problem, so AFRL is now illustrating the methods with a baseline system primarily developed by the University of California, Los Angeles (UCLA) during the MSEE program. This system will be publicly available for others to utilize, compare, and contrast with related methods. This paper further explains and provides insights into the challenges, illustrating them with examples from a publicly available data set. Our intent is that these tools will remove the need to develop an entire system and enable progress by allowing researchers to focus on individual elements of the system. Finally, we provide details on how interested researchers may obtain the system and the data.
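To make the notion of a user-defined query concrete, here is a toy Python sketch in which a query is a predicate evaluated over structured scene annotations. The annotation schema and query form are our own illustrative assumptions and are far simpler than the MSEE ontology-based query language.

```python
from dataclasses import dataclass

@dataclass
class Observation:
    """A single annotated scene element; this schema is illustrative only."""
    frame: int
    label: str        # e.g., 'person', 'vehicle'
    attributes: dict  # e.g., {'action': 'entering_vehicle'}

def run_query(observations, predicate):
    """A 'query' here is simply a predicate applied to each observation."""
    return [obs for obs in observations if predicate(obs)]

scene = [
    Observation(10, "person", {"action": "walking"}),
    Observation(42, "person", {"action": "entering_vehicle"}),
    Observation(42, "vehicle", {"color": "red", "moving": False}),
]

# User-defined task: "find frames where a person enters a vehicle".
hits = run_query(scene, lambda o: o.label == "person"
                 and o.attributes.get("action") == "entering_vehicle")
print([obs.frame for obs in hits])  # -> [42]
```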
The Wright-Patterson Air Force Base 2009 Wide Area Image data set consists of 1537 frames of high-resolution image data. The data are supplied as raw images with pose files and also as projected images in National Imagery Transmission Format (NITF). The georegistration performance of the NITF images is 22.3 m root-mean-square horizontal (RMSH) error. In a previous paper, calibrated camera models were developed that reduce the georegistration error to 3.3 m RMSH. In this work, corrected pose files are generated that reduce the error to 0.9 m RMSH. This is done by correcting the pose errors in a stepwise fashion to illustrate the error sources: Global Positioning System (GPS) position bias, time registration errors, and attitude errors. The pose files are then corrected by simultaneously modifying the position and attitude to achieve the 0.9 m RMSH result. The corrected pose files are posted to allow users to perform high-accuracy projection, tracking, and other functions.
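For reference, the horizontal root-mean-square error quoted above can be computed from projected and surveyed control-point coordinates, as in this minimal numpy sketch; the array names are our own assumptions.

```python
import numpy as np

def rmsh(projected_en, surveyed_en):
    """Root-mean-square horizontal error in meters.

    projected_en, surveyed_en: (N, 2) arrays of easting/northing
    coordinates (meters) for the same N ground control points.
    """
    deltas = np.asarray(projected_en, float) - np.asarray(surveyed_en, float)
    return float(np.sqrt(np.mean(np.sum(deltas**2, axis=1))))

# Example: residuals of (0.9, 0.0) m and (0.0, 0.9) m give RMSH = 0.9 m.
print(rmsh([[100.9, 200.0], [300.0, 400.9]],
           [[100.0, 200.0], [300.0, 400.0]]))  # -> 0.9
```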
Interest in the use of active electro-optical (EO) sensors for non-cooperative target identification has steadily increased as the quality and availability of EO sources and detectors have improved. A unique and recent innovation has been the development of an airborne synthetic aperture imaging capability at optical wavelengths. To effectively exploit this new data source for target identification, one must develop an understanding of target-sensor phenomenology at these wavelengths. Current high-frequency, asymptotic electromagnetic (EM) predictors are computationally intractable under such conditions, as their required ray density is inversely proportional to wavelength. As a more efficient alternative, we have developed a geometric-optics-based simulation for synthetic aperture ladar that models the second-order statistics of the diffuse scattering commonly found at these wavelengths, but with a much lower ray density. The code has been developed, ported to high-performance computing environments, and tested on a variety of target models.
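As a hedged illustration of the statistical idea (not the authors' code), the sketch below models the coherent return from a diffuse surface as a sum of randomly phased ray contributions, which yields the negative-exponential intensity statistics characteristic of fully developed speckle; the ray count and geometry are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def diffuse_pixel_return(n_rays=200):
    """Coherent sum of n_rays diffuse-scattering contributions.

    Each ray contributes unit amplitude with a phase uniform on
    [0, 2*pi), the standard rough-surface assumption at optical
    wavelengths.
    """
    phases = rng.uniform(0.0, 2.0 * np.pi, n_rays)
    field = np.sum(np.exp(1j * phases)) / np.sqrt(n_rays)
    return np.abs(field) ** 2  # received intensity

# Second-order check: fully developed speckle has contrast
# (std/mean of intensity) close to 1.
intensities = np.array([diffuse_pixel_return() for _ in range(10000)])
print(intensities.std() / intensities.mean())  # ~1.0
```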
Object recognition is an important problem with many applications of interest to the United States Air Force (USAF). Recently the USAF released its update to Technology Horizons, a report designed to guide the science and technology direction of the Air Force. Technology Horizons specifically calls out the need to use autonomous systems in essentially all aspects of Air Force operations. Object recognition is a key enabler of autonomous exploitation of intelligence, surveillance, and reconnaissance (ISR) data, which could make the automatic searching of millions of hours of video practical. In particular, this paper focuses on vehicle recognition with Lowe's Scale-Invariant Feature Transform (SIFT) using a model database generated from semi-synthetic data. To create the model database, we used a desktop laser scanner to create a high-resolution 3D facet model. The 3D facet model was then imported into LuxRender, a physically accurate ray-tracing tool, and several views were rendered to create the model database. SIFT was selected because the algorithm is invariant to scale, noise, and illumination, making it possible to build the model database from only about one hundred original viewing locations, which keeps the size of the model database reasonable.
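A minimal sketch of the recognition step, assuming OpenCV's SIFT implementation and descriptors precomputed from the rendered views; the file names and view count are hypothetical.

```python
import cv2

sift = cv2.SIFT_create()
matcher = cv2.BFMatcher(cv2.NORM_L2)

def descriptors(path):
    """Compute SIFT descriptors for a grayscale image."""
    image = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    _, desc = sift.detectAndCompute(image, None)
    return desc

# Hypothetical model database: descriptors from each rendered view.
model_views = {f"view_{i:03d}": descriptors(f"renders/view_{i:03d}.png")
               for i in range(100)}
query_desc = descriptors("query_frame.png")

def match_score(query, model, ratio=0.75):
    """Count matches passing Lowe's ratio test."""
    matches = matcher.knnMatch(query, model, k=2)
    return sum(1 for pair in matches if len(pair) == 2
               and pair[0].distance < ratio * pair[1].distance)

best_view = max(model_views,
                key=lambda v: match_score(query_desc, model_views[v]))
print("best matching rendered view:", best_view)
```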
We present an architecture for layered sensing constructed on open-source and government off-the-shelf software. This architecture shows how leveraging existing open-source software allows practical graphical user interfaces, along with the underlying database and messaging architecture, to be rapidly assembled and utilized in real-world applications. As an example, we present a system composed of a database and a graphical user interface that can display wide-area motion imagery, ground-based sensor data, and overlays from narrow-field-of-view sensors in one composite image, with sensor data and other metadata in separate layers on the display. We further show how development time is greatly reduced by utilizing open-source software and integrating it into the final system design. The paper describes the architecture, the pros and cons of the open-source approach, and results for a layered sensing application with data from multiple disparate sensors.
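The layering concept can be sketched as a simple ordered composition of geo-registered overlays; the layer schema below is our own illustrative assumption, not the system's actual database or messaging format.

```python
from dataclasses import dataclass, field

@dataclass
class Layer:
    """One display layer; the fields are illustrative assumptions."""
    name: str          # e.g., 'WAMI base', 'ground sensors', 'NFOV overlay'
    z_order: int       # draw order: lower values are drawn first
    visible: bool = True
    items: list = field(default_factory=list)  # geo-registered features/metadata

def composite(layers):
    """Yield visible layers in draw order, as a renderer would consume them."""
    for layer in sorted(layers, key=lambda l: l.z_order):
        if layer.visible:
            yield layer

display = [
    Layer("WAMI base imagery", z_order=0),
    Layer("ground-based sensor tracks", z_order=1),
    Layer("narrow-FOV sensor overlays", z_order=2),
]
for layer in composite(display):
    print("drawing:", layer.name)
```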