A better understanding of Multi-Domain Battle (MDB) challenges in complex military environments may start with a basic scientific appreciation of the generalization and scalability offered by Machine Learning (ML) solutions that are designed, trained, and optimized to achieve a single, specific task, continuously, day and night. We examine the generalization and scalability promises of a modern deep ML solution applied to a unique spatial-spectral dataset consisting of blackbody-calibrated, longwave infrared spectra of a fixed target site containing three painted metal surrogate tanks deployed in a field of mixed vegetation. Data were collected at roughly six-minute intervals, nearly continuously, for over a year, spanning many atmospheric conditions (rain, snow, sleet, fog, etc.). This paper focuses on data collected by a Telops Hyper-Cam from a 65-meter observation tower at a slant range of roughly 550 meters from the targets. The dataset is very complex: there are no obvious spectral signatures from the target surfaces, and the complexity is due in part to natural variations among the various types of vegetation, cloud presence, and changing solar loading over time. This is precisely the environment in which MDB applications must function. We detail some of the many training sets extracted to train different deep stacked-autoencoder networks and present performance results as receiver operating characteristic curves, confusion matrices, metric-versus-time plots, and classification maps. We show the performance of ML models trained with data from various time windows, including complete diurnal cycles, when processing data from different days and environmental conditions.
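To make the stacked-autoencoder approach concrete, the following minimal sketch greedily trains two autoencoder layers, the first on the raw spectra and the second on the first layer's codes. The tower dataset itself is not reproduced here; the data, shapes, and hyperparameters below are hypothetical stand-ins chosen only to illustrate the mechanics:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for calibrated LWIR spectra: 200 pixels x 60 bands.
X = rng.normal(size=(200, 60))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_autoencoder(X, n_hidden, lr=0.05, epochs=300):
    """One autoencoder layer: sigmoid encoder, linear decoder, squared error."""
    n, d = X.shape
    W1 = rng.normal(scale=0.1, size=(d, n_hidden)); b1 = np.zeros(n_hidden)
    W2 = rng.normal(scale=0.1, size=(n_hidden, d)); b2 = np.zeros(d)
    losses = []
    for _ in range(epochs):
        H = sigmoid(X @ W1 + b1)            # encode
        X_hat = H @ W2 + b2                 # decode
        err = X_hat - X
        losses.append(float((err ** 2).mean()))
        dH = (err @ W2.T) * H * (1.0 - H)   # backprop through the encoder
        W2 -= lr * H.T @ err / n
        b2 -= lr * err.mean(axis=0)
        W1 -= lr * X.T @ dH / n
        b1 -= lr * dH.mean(axis=0)

    def encode(A, W1=W1, b1=b1):
        return sigmoid(A @ W1 + b1)

    return encode, losses

# Greedy stacking: train layer 1 on the spectra, layer 2 on layer 1's codes.
enc1, losses1 = train_autoencoder(X, 30)
enc2, losses2 = train_autoencoder(enc1(X), 10)
```

In practice the layer-2 codes would feed a classifier whose outputs generate the ROC curves, confusion matrices, and classification maps described above.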
We study the generalization and scalability behavior of a deep belief network (DBN) applied to a challenging long-wave infrared hyperspectral dataset, consisting of radiance from several manmade and natural materials within a fixed site located 500 m from an observation tower. The collections cover multiple full diurnal cycles and include different atmospheric conditions. Using complementary priors, a greedy algorithm can learn a deep, directed belief network one layer at a time, with the top two layers forming an undirected associative memory. The greedy algorithm initializes a slower learning procedure that fine-tunes the weights using a contrastive version of the wake-sleep algorithm. After fine-tuning, a network with three hidden layers forms a very good generative model of the joint distribution of spectral data and their labels, despite significant data variability between and within classes due to environmental and temperature variation occurring within and between full diurnal cycles. We argue, however, that our experiments investigating the training and augmented learning behavior of these deep nets raise more questions than they answer regarding their generalization capacity.
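The greedy, layer-by-layer learning procedure can be sketched with two Restricted Boltzmann Machines trained by one-step contrastive divergence (CD-1), the second on the first's hidden activations. This is an illustrative sketch on synthetic binary data, not the paper's configuration; all sizes and hyperparameters are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical binary data: noisy copies of two prototype bit patterns.
protos = rng.integers(0, 2, size=(2, 40))
idx = rng.integers(0, 2, size=300)
X = (protos[idx] ^ (rng.random((300, 40)) < 0.05)).astype(float)  # 5% bit flips

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_rbm(V, n_hidden, lr=0.1, epochs=100):
    """CD-1 training of one RBM layer."""
    n, d = V.shape
    W = rng.normal(scale=0.01, size=(d, n_hidden))
    a = np.zeros(d)         # visible biases
    b = np.zeros(n_hidden)  # hidden biases
    for _ in range(epochs):
        ph0 = sigmoid(V @ W + b)                     # positive phase
        h0 = (rng.random(ph0.shape) < ph0).astype(float)
        pv1 = sigmoid(h0 @ W.T + a)                  # reconstruction
        ph1 = sigmoid(pv1 @ W + b)                   # negative phase
        W += lr * (V.T @ ph0 - pv1.T @ ph1) / n
        a += lr * (V - pv1).mean(axis=0)
        b += lr * (ph0 - ph1).mean(axis=0)
    return W, a, b

W1, a1, b1 = train_rbm(X, 20)
H1 = sigmoid(X @ W1 + b1)           # layer-1 features feed the next RBM
W2, a2, b2 = train_rbm((H1 > 0.5).astype(float), 10)

# Mean squared reconstruction error of the first layer after training.
recon = sigmoid(sigmoid(X @ W1 + b1) @ W1.T + a1)
err = float(((recon - X) ** 2).mean())
```

In the full DBN, the top two layers would be trained jointly as an associative memory, and the wake-sleep-style fine-tuning would then adjust all weights.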
We study the transfer learning behavior of a Hybrid Deep Network (HDN) applied to a challenging longwave infrared hyperspectral dataset, consisting of radiance from several manmade and natural materials within a fixed site located 500 m from an observation tower, over multiple full diurnal cycles and different atmospheric conditions. The HDN architecture adopted in this study stacks a number of Restricted Boltzmann Machines to form a deep belief network for generative pre-training (initialization of the weight parameters) and then applies a discriminative learning procedure that fine-tunes all of the weights jointly to improve the network's performance. After fine-tuning, a network with three hidden layers forms a very good generative model of the joint distribution of spectral data and their labels, despite significant data variability observed between and within classes due to environmental and temperature variation occurring within full diurnal cycles. We argue, however, that our experiments investigating the training and transfer learning behavior of these deep nets in the longwave infrared region of the electromagnetic spectrum raise more questions than they answer regarding their generalization capacity.
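The discriminative fine-tuning stage can be sketched as joint cross-entropy training of a softmax output attached to the pre-trained layers. In this self-contained toy, the "pre-trained" weights are simply a random initialization standing in for the stacked-RBM result, and the two-class Gaussian data is hypothetical:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical two-class spectra: Gaussian blobs in 40 "bands".
n_per = 150
X = np.vstack([rng.normal(-0.5, 1.0, (n_per, 40)),
               rng.normal(+0.5, 1.0, (n_per, 40))])
y = np.repeat([0, 1], n_per)
Y = np.eye(2)[y]

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Stand-in for generatively pre-trained weights; a real HDN would take
# W1, b1 from stacked-RBM pre-training.
W1 = rng.normal(scale=0.1, size=(40, 15)); b1 = np.zeros(15)
W2 = rng.normal(scale=0.1, size=(15, 2));  b2 = np.zeros(2)

n, lr = len(y), 0.5
for _ in range(500):                      # joint discriminative fine-tuning
    H = sigmoid(X @ W1 + b1)
    P = softmax(H @ W2 + b2)
    dZ = (P - Y) / n                      # cross-entropy gradient at the output
    dH = (dZ @ W2.T) * H * (1.0 - H)      # ...backpropagated into layer 1
    W2 -= lr * H.T @ dZ; b2 -= lr * dZ.sum(axis=0)
    W1 -= lr * X.T @ dH; b1 -= lr * dH.sum(axis=0)

acc = float((softmax(sigmoid(X @ W1 + b1) @ W2 + b2).argmax(axis=1) == y).mean())
```

The key point of the hybrid scheme is that all weights, including the pre-trained ones, move during this stage; the generative phase only supplies the starting point.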
An ensemble learning approach using a number of weak classifiers, each learning from a random subset of the spectral features (bands) of the training samples, is used to detect and identify a specific chemical plume. The support vector machine (SVM) is used as the weak classifier, and the detection results of the multiple SVMs are combined to generate a final decision on a pixel's class membership. Because multiple learning processes are conducted in randomly selected spectral subspaces, the proposed ensemble learning can improve solution generality. This work uses a two-class approach, with samples taken from hyperspectral image (HSI) cubes collected during a release of the test chemical. Performance results, in the form of receiver operating characteristic curves, show performance similar to that of a single SVM using the full spectrum. Initial results were obtained by training with samples taken from a single HSI cube; these are compared to more recent results from training with sample data from 28 HSI cubes. Algorithms trained with high-concentration spectra show very strong responses when scored only on high-concentration data. However, performance drops substantially when low-concentration pixels are scored as well. Training with the low-concentration pixels along with the high-concentration pixels can improve overall solution generality and shows the strength of the ensemble approach. However, it appears that careful selection of the training data and the number of examples can have a significant impact on performance.
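The random-subspace ensemble can be sketched as follows: each weak linear SVM (here trained by hinge-loss subgradient descent, a simple stand-in for a full SVM solver) sees only a random subset of the bands, and their signed decisions are combined by majority vote. The release-trial HSI cubes are not reproduced; the two-class synthetic data below is hypothetical:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical two-class samples: plume vs. background pixels over 30 bands.
n_per = 150
X = np.vstack([rng.normal(-0.5, 1.0, (n_per, 30)),
               rng.normal(+0.5, 1.0, (n_per, 30))])
y = np.repeat([-1.0, 1.0], n_per)

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Full-batch subgradient descent on the regularized hinge loss."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        viol = y * (X @ w + b) < 1.0               # margin violations
        gw = lam * w - (y[viol, None] * X[viol]).sum(axis=0) / n
        gb = -y[viol].sum() / n
        w -= lr * gw
        b -= lr * gb
    return w, b

# Random-subspace ensemble: each weak SVM sees 10 of the 30 bands.
members = []
for _ in range(15):
    bands = rng.choice(30, size=10, replace=False)
    w, b = train_linear_svm(X[:, bands], y)
    members.append((bands, w, b))

# Majority vote over the weak SVMs' signed decision values.
votes = np.stack([np.sign(X[:, bands] @ w + b) for bands, w, b in members])
pred = np.sign(votes.sum(axis=0))
acc = float((pred == y).mean())
```

Averaging the raw decision values instead of voting would give a continuous score suitable for sweeping an ROC curve, as in the results reported above.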
A neural network is applied to data collected by the close-in detector for the Mine Hunter Killer (MHK) project, with promising results. We use the ground-penetrating radar (GPR) and metal detector to create three channels (two from the GPR) and train a basic, two-layer (single hidden layer), feed-forward neural network. By experimenting with the number of hidden nodes and the training goals, we were able to surpass the performance of the single sensors when we fused the three channels with our neural network and applied the trained net to different data. The fused sensors exceeded the best single-sensor performance above 95 percent detection by providing a lower, though still high, false alarm rate. Although the three-channel neural net worked best, we also saw a performance increase with fewer than three channels.
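The three-channel fusion can be sketched with a single-hidden-layer net trained on cross-entropy and compared against thresholding one sensor alone. The MHK data is not reproduced; the channel statistics, network size, and threshold below are all hypothetical:

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical feature vectors: [GPR channel 1, GPR channel 2, metal detector].
# Mines shift all three channels upward; any single channel is noisy.
n_per = 200
mines = rng.normal(1.5, 1.0, (n_per, 3))
clutter = rng.normal(0.0, 1.0, (n_per, 3))
X = np.vstack([mines, clutter])
y = np.repeat([1.0, 0.0], n_per)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# A basic two-layer (single hidden layer) feed-forward net fusing the channels.
W1 = rng.normal(scale=0.5, size=(3, 6)); b1 = np.zeros(6)
w2 = rng.normal(scale=0.5, size=6);      b2 = 0.0

n = len(y)
for _ in range(500):                       # gradient descent on cross-entropy
    H = sigmoid(X @ W1 + b1)
    p = sigmoid(H @ w2 + b2)
    dz = (p - y) / n
    dH = np.outer(dz, w2) * H * (1.0 - H)  # backprop into the hidden layer
    w2 -= H.T @ dz;  b2 -= dz.sum()
    W1 -= X.T @ dH;  b1 -= dH.sum(axis=0)

fused_acc = float(((p > 0.5) == (y > 0.5)).mean())
# Naive single-sensor baseline: threshold the metal-detector channel alone.
single_acc = float(((X[:, 2] > 0.75) == (y > 0.5)).mean())
```

With noise independent across channels, the fused decision boundary recovers accuracy that no single thresholded channel can reach, mirroring the fusion gain reported above.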
The mission of the Department of Defense Counter-drug Technology Development Program Office's Face Recognition Technology (FERET) program is to develop automatic face-recognition systems for intelligence and law enforcement applications. To achieve this objective, the program supports research in face-recognition algorithms, the collection of a large database of facial images, independent testing and evaluation of face-recognition algorithms, construction of real-time demonstration systems, and the integration of algorithms into the demonstration systems. The FERET program has established baseline performance for face recognition. The Army Research Laboratory (ARL) has been the program's technical agent since 1993, managing development of the recognition algorithms and the database collection, and conducting algorithm testing and evaluation. Currently, ARL is managing the development of several prototype face-recognition systems that will demonstrate complete real-time video face identification in an access control scenario. This paper gives an overview of the FERET program, presents performance results of the face-recognition algorithms evaluated, and addresses the future direction of the program and applications for DoD and law enforcement.
The mission of the Department of Defense (DoD) Counter-drug Technology Development Program Office's Face Recognition Technology (FERET) program is to develop automatic face recognition systems, from the development of detection and recognition algorithms in the laboratory through their demonstration in a prototype real-time system. To achieve this objective, the program supports research in face recognition algorithms, the collection of a large database of facial images, independent testing and evaluation of face recognition algorithms, construction of real-time demonstration systems, and the integration of algorithms into the demonstration systems. The FERET program has established baseline performance for face recognition. The Army Research Laboratory (ARL) has been the technical agent for the Advanced Research Projects Agency since 1993, managing development of the recognition algorithms, database collection, and algorithm testing. Currently, ARL is managing the development of several prototype face recognition systems that will demonstrate complete real-time video face identification in an access control mission. This paper gives an overview of the FERET program, presents recent performance results of the face recognition algorithms evaluated, and addresses the future direction of the program and applications for DoD and law enforcement.
This paper presents target detection and interrogation techniques for a foveal automatic target recognition (ATR) system based on hierarchical scale-space processing of imagery from a rectilinear tessellated multiacuity retinotopology. Conventional machine vision captures imagery and applies early vision techniques with uniform resolution throughout the field-of-view (FOV). In contrast, foveal active vision features graded-acuity imagers and processing coupled with context-sensitive gaze control, analogous to that prevalent throughout vertebrate vision. Foveal vision can operate more efficiently than uniform-acuity vision in dynamic scenarios with localized relevance because resolution is treated as a dynamically allocable resource. Foveal ATR exploits the difference between detection and recognition resolution requirements and sacrifices peripheral acuity to achieve a wider FOV (e.g., faster search), greater localized resolution where needed (e.g., more confident recognition at the fovea), and faster frame rates (e.g., more reliable tracking and navigation) without increasing processing requirements. The rectilinearity of the retinotopology supports a data structure that is a subset of the image pyramid. This structure lends itself to multiresolution and conventional 2-D algorithms, and features a shift invariance of perceived target shape that tolerates sensor pointing errors and supports multiresolution model-based techniques. The detection technique described in this paper searches for regions-of-interest (ROIs) using the foveal sensor's wide-FOV peripheral vision. ROIs are initially detected using anisotropic diffusion filtering and expansion template matching against a multiscale Zernike polynomial-based target model. Each ROI is then interrogated to filter out false-target ROIs by sequentially pointing a higher-acuity region of the sensor at each ROI centroid and conducting a fractal dimension test that distinguishes targets from structured clutter.
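The fractal dimension test can be illustrated with a box-counting estimate on a binary ROI mask: compact, filled target-like regions yield dimensions near 2, while thin, structured clutter trends toward 1. This is a minimal generic sketch, not the paper's actual test statistic or thresholds, and the images are toy examples:

```python
import numpy as np

def box_count_dimension(img, sizes=(1, 2, 4, 8, 16)):
    """Estimate the fractal dimension of a binary image by box counting."""
    h, w = img.shape
    counts = []
    for s in sizes:
        # Tile the image with s-by-s boxes and count boxes that touch
        # at least one foreground pixel.
        boxed = img[:h - h % s, :w - w % s].reshape(h // s, s, w // s, s)
        counts.append(int(boxed.any(axis=(1, 3)).sum()))
    # The slope of log(count) vs. log(1/size) estimates the dimension.
    slope, _ = np.polyfit(np.log(1.0 / np.asarray(sizes)), np.log(counts), 1)
    return slope

square = np.ones((64, 64), dtype=bool)   # filled region -> dimension near 2
line = np.zeros((64, 64), dtype=bool)
line[32, :] = True                        # thin structure -> dimension near 1
d_square = box_count_dimension(square)
d_line = box_count_dimension(line)
```

A decision rule would then accept an ROI as a target candidate when the estimated dimension exceeds some learned threshold, rejecting line-like structured clutter.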