We investigate the problem of identifying pixels in pairs of co-registered images that correspond to real changes on the ground. Changes that are due to environmental differences (illumination, atmospheric distortion, etc.) or sensor differences (focus, contrast, etc.) will be widespread throughout the image, and the aim is to avoid these changes in favor of changes that occur in only one or a few pixels. Formal outlier detection schemes (such as the one-class support vector machine) can identify rare occurrences, but will be confounded by pixels that are "equally rare" in both images: they may be anomalous, but they are not changes. We describe a resampling scheme we have developed that formally addresses both of these issues, and reduces the problem to a binary classification, a problem for which a large variety of machine learning tools have been developed. In principle, the effects of misregistration will manifest themselves as pervasive changes, and our method will be robust against them - but in practice, misregistration remains a serious issue.
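The reduction to binary classification can be illustrated with a small sketch. The abstract does not spell out the resampling scheme, so the construction here is an assumption: true pixel pairs (x_i, y_i) form one class, and pairs with the second image's pixels randomly shuffled form the other. The Gaussian log-likelihood-ratio classifier and the function names are likewise illustrative choices, not the authors' implementation.

```python
import numpy as np

def make_training_set(x, y, seed=0):
    """Build two classes from co-registered pixel arrays x, y of shape (n_pixels, n_bands).

    Assumed resampling: class 0 holds the true pairs, class 1 pairs each x
    pixel with a randomly chosen y pixel, which destroys the pixel-wise
    correspondence while keeping each image's own distribution.
    """
    rng = np.random.default_rng(seed)
    same = np.hstack([x, y])
    shuffled = np.hstack([x, y[rng.permutation(len(y))]])
    return same, shuffled

def gaussian_log_density(z, sample):
    """Log density (up to a constant) of rows of z under a Gaussian fit to sample."""
    mu = sample.mean(axis=0)
    cov = np.cov(sample, rowvar=False) + 1e-6 * np.eye(sample.shape[1])
    diff = z - mu
    maha = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (maha + logdet)

def change_score(x, y, seed=0):
    """Higher scores mean the pair looks more like a shuffled pair than a true one."""
    same, shuffled = make_training_set(x, y, seed)
    z = np.hstack([x, y])
    return gaussian_log_density(z, shuffled) - gaussian_log_density(z, same)
```

A pervasive difference such as a global gain and offset is absorbed into the model of the true pairs, so only pixels that break the prevailing relationship score highly. Any binary classifier could stand in for the Gaussian model used here.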
Accurate and robust techniques for automated feature extraction (AFE) from remotely-sensed imagery are an important area of research, having many applications in the civilian and military/intelligence arenas. Much work has been undertaken in developing sophisticated tools for performing these tasks. However, while many of these tools have been shown to perform quite well (such as the GENIE and Genie Pro software developed at LANL), they are not perfect. The classification algorithms produced often have significant errors, such as false alarms and missed detections. We describe some efforts at improving this situation in which we add a clutter mitigation layer to our existing AFE software (Genie Pro). This clutter mitigation layer takes as input the output from the previous feature extraction (classification) layer and, using the same training data (pixels providing examples of the classes of interest), uses machine-learning techniques similar to those used in the previous AFE layer to optimize an image-processing pipeline aimed at correcting any errors existing in the AFE output. While the AFE layer optimizes an image-processing pipeline that can combine spectral, logical, textural, morphological and other spatial operators, the clutter mitigation layer is limited to a pool of morphological operators. The resulting clutter mitigation algorithm will not only be optimized for the particular feature of interest but will also be co-optimized with the preceding feature extraction algorithm. We demonstrate these techniques on several feature extraction problems in various multi-spectral, remotely-sensed images.
We present Genie Pro, a new software tool for image analysis produced by the ISIS (Intelligent Search in Images and Signals) group at Los Alamos National Laboratory. Like the earlier GENIE tool produced by the same group, Genie Pro is a general purpose adaptive tool that derives automatic pixel classification algorithms for satellite/aerial imagery from training input provided by a human expert. Genie Pro is a complete rewrite of our earlier work that incorporates many new ideas and concepts. In particular, the new software integrates spectral information with spatial cues, such as texture, local morphology and large-scale shape information, in a much more sophisticated way. In addition, attention has been paid to how the human expert interacts with the software: Genie Pro facilitates highly efficient training through an interactive and iterative “training dialog”. Finally, the new software runs on both Linux and Windows platforms, increasing its versatility. We give detailed descriptions of the new techniques and ideas in Genie Pro, and summarize the results of a recent evaluation of the software.
We present ZEUS, an algorithm for extracting features from images and time series signals. ZEUS is designed to solve a variety of machine learning problems including time series forecasting, signal classification, and image and pixel classification of multispectral and panchromatic imagery. An evolutionary approach is used to extract features from a near-infinite space of possible combinations of nonlinear operators. Each problem type (i.e. signal or image, regression or classification, multiclass or binary) has its own set of primitive operators. We employ fairly generic operators, but note that the choice of which operators to use provides an opportunity to consult with a domain expert. Each feature is produced from a composition of some subset of these primitive operators. The fitness for an evolved set of features is given by the performance of a back-end classifier (or regressor) on training data. We demonstrate our multimodal approach to feature extraction on a variety of problems in remote sensing. The performance of this algorithm will be compared to standard approaches, and the relative benefit of various aspects of the algorithm will be investigated.
An increasing number and variety of platforms are now capable of
collecting remote sensing data over a particular scene. For many
applications, the information available from any individual sensor may
be incomplete, inconsistent or imprecise. However, other sources may
provide complementary and/or additional data. Thus, for an application
such as image feature extraction or classification, it may be that
fusing the multiple data sources can lead to more consistent and
Unfortunately, with the increased complexity of the fused data, the
search space of feature-extraction or classification algorithms also
greatly increases. With a single data source, the determination of a
suitable algorithm may be a significant challenge for an image
analyst. With the fused data, the search for suitable algorithms can
go far beyond the capabilities of a human in a realistic time frame,
and becomes the realm of machine learning, where the computational
power of modern computers can be harnessed to the task at hand.
We describe experiments in which we investigate the ability of a suite
of automated feature extraction tools developed at Los Alamos National
Laboratory to make use of multiple data sources for various feature
extraction tasks. We compare and contrast this software's capabilities
on 1) individual data sets from different data sources, 2) fused data
sets from multiple data sources, and 3) fusion of results from multiple
individual data sources.
We introduce an algorithm for classifying time series data. Since our initial application is for lightning data, we call the algorithm Zeus. Zeus is a hybrid algorithm that employs evolutionary computation for feature extraction, and a support vector machine for the final backend classification. Support vector machines have a reputation for classifying in high-dimensional spaces without overfitting, so the utility of reducing dimensionality with an intermediate feature selection step has been questioned. We address this question by testing Zeus on a lightning classification task using data acquired from the Fast On-orbit Recording of Transient Events (FORTE) satellite.
The Rapid Telescopes for Optical Response (RAPTOR) experiment is a spatially distributed system of autonomous robotic telescopes that is designed to monitor the sky for optical transients. The core of the system is composed of two telescope arrays, separated by 38 kilometers, that stereoscopically view the same 1500 square-degree field with a wide-field imaging array and a central 4 square-degree field with a more sensitive narrow-field "fovea" imager. Coupled to each telescope array is a real-time data analysis pipeline that is designed to identify interesting transients on timescales of seconds and, when a celestial transient is identified, to command the rapidly slewing robotic mounts to point the narrow-field "fovea" imagers at the transient. The two narrow-field telescopes then image the transient with higher spatial resolution and at a faster cadence to gather light curve information. Each "fovea" camera also images the transient through a different filter to provide color information. This stereoscopic monitoring array is supplemented by a rapidly slewing telescope with a low resolution spectrograph for follow-up observations of transients and a sky patrol telescope that nightly monitors about 10,000 square-degrees for variations, with timescales of a day or longer, to a depth about 100 times fainter. In addition to searching for fast transients, we will use the data stream from RAPTOR as a real-time sentinel for recognizing important variations in known sources. All of the data will be publicly released through a virtual observatory called SkyDOT (Sky Database for Objects in the Time Domain) that we are developing for studying variability of the optical sky. Altogether, the RAPTOR project aims to construct a new type of system for discovery in optical astronomy: one that explores the time domain by "mining the sky in real time".
Feature extraction from imagery is an important and long-standing problem in remote sensing. In this paper, we report on work using genetic programming to perform feature extraction simultaneously from multispectral and digital elevation model (DEM) data. We use the GENetic Imagery Exploitation (GENIE) software for this purpose, which produces image-processing software that inherently combines spatial and spectral processing. GENIE is particularly useful in exploratory studies of imagery, such as one often does in combining data from multiple sources. The user trains the software by painting the feature of interest with a simple graphical user interface. GENIE then uses genetic programming techniques to produce an image-processing pipeline. Here, we demonstrate evolution of image processing algorithms that extract a range of land cover features including towns, wildfire burnscars, and forest. We use imagery from the DOE/NNSA Multispectral Thermal Imager (MTI) spacecraft, fused with USGS 1:24000 scale DEM data.
Los Alamos National Laboratory has developed and demonstrated a highly capable system, GENIE, for the two-class problem of detecting a single feature against a background of non-feature. In addition to the two-class case, however, a commonly encountered remote sensing task is the segmentation of multispectral image data into a larger number of distinct feature classes or land cover types. To this end we have extended our existing system to allow the simultaneous classification of multiple features/classes from multispectral data. The technique builds on previous work and its core continues to utilize a hybrid evolutionary-algorithm-based system capable of searching for image processing pipelines optimized for specific image feature extraction tasks. We describe the improvements made to the GENIE software to allow multiple-feature classification and describe the application of this system to the automatic simultaneous classification of multiple features from MTI image data. We show the application of the multiple-feature classification technique to the problem of classifying lava flows on Mauna Loa volcano, Hawaii, using MTI image data and compare the classification results with standard supervised multiple-feature classification techniques.
The Cerro Grande/Los Alamos forest fire devastated over 43,000 acres (17,500 ha) of forested land, and destroyed over 200 structures in the town of Los Alamos and the adjoining Los Alamos National Laboratory. The need to measure the continuing impact of the fire on the local environment has led to the application of a number of remote sensing technologies. During and after the fire, remote-sensing data was acquired from a variety of aircraft- and satellite-based sensors, including Landsat 7 Enhanced Thematic Mapper (ETM+). We now report on the application of a machine learning technique to the automated classification of land cover using multi-spectral and multi-temporal imagery. We apply a hybrid genetic programming/supervised classification technique to evolve automatic feature extraction algorithms. We use a software package we have developed at Los Alamos National Laboratory, called GENIE, to carry out this evolution. We use multispectral imagery from the Landsat 7 ETM+ instrument from before, during, and after the wildfire. Using an existing land cover classification based on a 1992 Landsat 5 TM scene for our training data, we evolve algorithms that distinguish a range of land cover categories, and an algorithm to mask out clouds and cloud shadows. We report preliminary results of combining individual classification results using a K-means clustering approach. The details of our evolved classification are compared to the manually produced land-cover classification.
Feature identification attempts to find algorithms that can consistently separate a feature of interest from the background in the presence of noise and uncertain conditions. This paper describes the development of a high-throughput, reconfigurable computer based, feature identification system known as POOKA. POOKA is based on a novel spatio-spectral network, which can be optimized with an evolutionary algorithm on a problem-by-problem basis. The reconfigurable computer provides speed up in two places: 1) in the training environment to accelerate the computationally intensive search for new feature identification algorithms, and 2) in the application of trained networks to accelerate content based search in large multi-spectral image databases. The network is applied to several broad area features relevant to scene classification. The results are compared to those found with traditional remote sensing techniques as well as an advanced software system known as GENIE. The hardware efficiency and performance gains compared to software are also reported.
Classification of broad area features in satellite imagery is one of the most important applications of remote sensing. It is often difficult and time-consuming to develop classifiers by hand, so many researchers have turned to techniques from the fields of statistics and machine learning to automatically generate classifiers. Common techniques include Maximum Likelihood classifiers, neural networks and genetic algorithms. We present a new system called Afreet, which uses a recently developed machine learning paradigm called Support Vector Machines (SVMs). In contrast to other techniques, SVMs offer a solid mathematical foundation that provides a probabilistic guarantee on how well the classifier will generalize to unseen data. In addition, the SVM training algorithm is guaranteed to converge to the globally optimal SVM classifier, can learn highly non-linear discrimination functions, copes extremely well with high-dimensional feature spaces (such as hyperspectral data), and scales well to large problem sizes. Afreet combines an SVM with a sophisticated spatio-spectral feature construction mechanism that allows it to classify spectrally ambiguous pixels. We demonstrate the effectiveness of the system by applying Afreet to several broad area classification problems in remote sensing, and provide a comparison with conventional Maximum Likelihood classification.
Between May 6 and May 18, 2000, the Cerro Grande/Los Alamos wildfire burned approximately 43,000 acres (17,500 ha) and 235 residences in the town of Los Alamos, NM. Initial estimates of forest damage included 17,000 acres (6,900 ha) of 70-100% tree mortality. Restoration efforts following the fire were complicated by the large scale of the fire, and by the presence of extensive natural and man-made hazards. These conditions forced a reliance on remote sensing techniques for mapping and classifying the burn region. During and after the fire, remote-sensing data was acquired from a variety of aircraft-based and satellite-based sensors, including Landsat 7. We now report on the application of a machine learning technique, implemented in a software package called GENIE, to the classification of forest fire burn severity using Landsat 7 ETM+ multispectral imagery. The details of this automatic classification are compared to the manually produced burn classification, which was derived from field observations and manual interpretation of high-resolution aerial color/infrared photography.
KEYWORDS: Digital signal processing, Reconfigurable computing, Sensors, Image segmentation, Image processing, Remote sensing, Field programmable gate arrays, Feature extraction, Signal processing, Algorithm development
Compute performance and algorithm design are key problems of image processing and scientific computing in general. For example, imaging spectrometers are capable of producing data in hundreds of spectral bands with millions of pixels. These data sets show great promise for remote sensing applications, but require new and computationally intensive processing. The goal of the Deployable Adaptive Processing Systems (DAPS) project at Los Alamos National Laboratory is to develop advanced processing hardware and algorithms for high-bandwidth sensor applications. The project has produced electronics for processing multi- and hyper-spectral sensor data, as well as LIDAR data, while employing processing elements using a variety of technologies. The project team is currently working on reconfigurable computing technology and advanced feature extraction techniques, with an emphasis on their application to image and RF signal processing. This paper presents reconfigurable computing technology and advanced feature extraction algorithm work and their application to multi- and hyperspectral image processing. Related projects on genetic algorithms as applied to image processing will be introduced, as will the collaboration between the DAPS project and the DARPA Adaptive Computing Systems program. Further details are presented in other talks during this conference and in other conferences taking place during this symposium.
The “pixel purity index” (PPI) algorithm proposed by Boardman et al. [1] identifies potential endmember pixels in multispectral imagery. The algorithm generates a large number of “skewers” (unit vectors in random directions), and then computes the dot product of each skewer with each pixel. The PPI is incremented for those pixels associated with the extreme values of the dot products. A small number of pixels (a subset of those with the largest PPI values) are selected as “pure” and the rest of the pixels in the image are expressed as linear mixtures of these pure endmembers. This provides a convenient and physically-motivated decomposition of the image in terms of relatively few components. We report on a variant of the PPI algorithm in which blocks of B skewers are considered at a time. From the computation of B dot products, one can produce a much larger set of “derived” dot products that are associated with skewers that are linear combinations of the original B skewers. Since the derived dot products involve only scalar operations, instead of full vector dot products, they can be very cheaply computed. We will also discuss a hardware implementation on a field programmable gate array (FPGA) processor both of the original PPI algorithm and of the block-skewer approach. We will furthermore discuss the use of fast PPI as a front-end to more sophisticated algorithms for selecting the actual endmembers.
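The PPI procedure described above can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' software or FPGA implementation: the function names, the Gaussian sampling of skewer directions, and the choice of small integer coefficients for the derived skewers are all assumptions made for the example.

```python
import numpy as np

def random_unit_skewers(n_skewers, n_bands, rng):
    """Draw random directions and normalize them to unit length."""
    skewers = rng.normal(size=(n_skewers, n_bands))
    return skewers / np.linalg.norm(skewers, axis=1, keepdims=True)

def pixel_purity_index(pixels, n_skewers=1000, seed=0):
    """Count, for each pixel, how often it is an extreme of a skewer projection.

    pixels has shape (n_pixels, n_bands); the result has shape (n_pixels,).
    """
    rng = np.random.default_rng(seed)
    skewers = random_unit_skewers(n_skewers, pixels.shape[1], rng)
    proj = pixels @ skewers.T                 # (n_pixels, n_skewers) dot products
    ppi = np.zeros(len(pixels), dtype=int)
    # np.add.at accumulates correctly when the same pixel index repeats.
    np.add.at(ppi, proj.argmax(axis=0), 1)
    np.add.at(ppi, proj.argmin(axis=0), 1)
    return ppi

def derived_projections(block_proj, coeffs):
    """Projections onto derived skewers sum_b coeffs[k, b] * skewer_b.

    block_proj has shape (n_pixels, B) and holds the B full dot products;
    coeffs has shape (n_derived, B). Only scalar multiply-adds on the
    precomputed projections are needed, no new vector dot products.
    Derived skewers are not unit length, which does not change which
    pixels are extreme.
    """
    return block_proj @ coeffs.T
```

In a block-skewer loop, the argmax and argmin of each column of `derived_projections(...)` would update the PPI counts exactly as the original projections do.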
We describe the implementation and performance of a parallel, hybrid evolutionary-algorithm-based system, which optimizes image processing tools for feature-finding tasks in multi-spectral imagery (MSI) data sets. Our system uses an integrated spatio-spectral approach and is capable of combining suitably-registered data from different sensors. We investigate the speed-up obtained by parallelization of the evolutionary process via multiple processors (a workstation cluster) and develop a model for prediction of run-times for different numbers of processors. We demonstrate our system on Landsat Thematic Mapper MSI, covering the recent Cerro Grande fire at Los Alamos, NM, USA.
We consider the problem of pixel-by-pixel classification of a multispectral image using supervised learning. Conventional supervised classification techniques such as maximum likelihood classification, and less conventional ones such as neural networks, typically base such classifications solely on the spectral components of each pixel. It is easy to see why: the color of a pixel provides a nice, bounded, fixed dimensional space in which these classifiers work well. It is often the case, however, that spectral information alone is not sufficient to correctly classify a pixel. Maybe spatial neighborhood information is required as well. Or maybe the raw spectral components do not themselves make for easy classification, but some arithmetic combination of them would. In either of these cases we have the problem of selecting suitable spatial, spectral or spatio-spectral features that allow the classifier to do its job well. The number of all possible such features is extremely large. How can we select a suitable subset? We have developed GENIE, a hybrid learning system that combines a genetic algorithm that searches a space of image processing operations for a set that can produce suitable feature planes, and a more conventional classifier which uses those feature planes to output a final classification. In this paper we show that the use of a hybrid GA provides significant advantages over using either a GA alone or more conventional classification methods alone. We present results using high-resolution IKONOS data, looking for regions of burned forest and for roads.
We describe the implementation and performance of a genetic algorithm (GA) which evolves and combines image processing tools for multispectral imagery (MSI) datasets. Existing algorithms for particular features can also be “re-tuned” and combined with the newly evolved image processing tools to rapidly produce customized feature extraction tools. First results from our software system were presented previously. We now report on work extending our system to look for a range of broad-area features in MSI datasets. These features demand an integrated spatio-spectral approach, which our system is designed to use. We describe our chromosomal representation of candidate image processing algorithms, and discuss our set of image operators. Our application has been geospatial feature extraction using publicly available MSI and hyperspectral imagery (HSI). We demonstrate our system on NASA/Jet Propulsion Laboratory’s Airborne Visible and Infrared Imaging Spectrometer (AVIRIS) HSI which has been processed to simulate MSI data from the Department of Energy’s Multispectral Thermal Imager (MTI) instrument. We exhibit some of our evolved algorithms, and discuss their operation and performance.
We describe the implementation and performance of a genetic algorithm which generates image feature extraction algorithms for remote sensing applications. We describe our basis set of primitive image operators and present our chromosomal representation of a complete algorithm. Our initial application has been geospatial feature extraction using publicly available multi-spectral aerial-photography data sets. We present the preliminary results of our analysis of the efficiency of the classic genetic operations of crossover and mutation for our application, and discuss our choice of evolutionary control parameters. We exhibit some of our evolved algorithms, and discuss possible avenues for future progress.
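The classic crossover and mutation operations mentioned above can be illustrated on a toy chromosome. This is a generic sketch under stated assumptions, not the representation used in the paper: the operator names in `OPERATORS`, the fixed-length list encoding, and the one-point crossover are all hypothetical choices made for the example.

```python
import random

# Hypothetical pool of primitive image operators; the paper's actual
# basis set is not listed in the abstract.
OPERATORS = ["add", "subtract", "ratio", "smooth", "edge", "threshold"]

def random_chromosome(length, rng):
    """A chromosome is a fixed-length sequence of operator genes."""
    return [rng.choice(OPERATORS) for _ in range(length)]

def crossover(a, b, rng):
    """One-point crossover: swap the tails of two equal-length parents."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:], b[:point] + a[point:]

def mutate(chromosome, rate, rng):
    """Replace each gene with a random operator with probability rate."""
    return [rng.choice(OPERATORS) if rng.random() < rate else gene
            for gene in chromosome]
```

For example, with `rng = random.Random(0)`, two parents from `random_chromosome(6, rng)` can be recombined with `crossover` and the children perturbed with `mutate(child, 0.1, rng)`. The crossover and mutation rates are the kind of evolutionary control parameters the abstract refers to.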
The retrieval of scene properties (surface temperature, material type, vegetation health, etc.) from remotely sensed data is the ultimate goal of many earth observing satellites. The algorithms that have been developed for these retrievals are informed by physical models of how the raw data were generated. This includes models of radiation as emitted and/or reflected by the scene, propagated through the atmosphere, collected by the optics, detected by the sensor, and digitized by the electronics. To some extent, the retrieval is the inverse of this 'forward' modeling problem. But in contrast to this forward modeling, the practical task of making inferences about the original scene usually requires some ad hoc assumptions, good physical intuition, and a healthy dose of trial and error. The standard MTI data processing pipeline will employ algorithms developed with this traditional approach. But we will discuss some preliminary research on the use of a genetic programming scheme to 'evolve' retrieval algorithms. Such a scheme cannot compete with the physical intuition of a remote sensing scientist, but it may be able to automate some of the trial and error. In this scenario, a training set is used, which consists of multispectral image data and the associated 'ground truth;' that is, a registered map of the desired retrieval quantity. The genetic programming scheme attempts to combine a core set of image processing primitives to produce an IDL (Interactive Data Language) program which estimates this retrieval quantity from the raw data.