In a broad sense, spectral mixing phenomenology in multispectral and hyperspectral imagery can be treated in two ways, depending on how the materials in a scene are presumed to be mixed. A linear mixing model is suitable in cases where materials are presumed to be nonoverlapping areas and can be mathematically expressed as a weighted linear combination, where the weights are associated with the abundances of each material. The endmembers are spectra (real or ideal) representing unique materials in a given image such as water, soil, and vegetation. Abundances are the percentage of each endmember within a given pixel. On the other hand, it is well known that intimately mixed materials frequently exhibit nonlinear spectral mixing. Granular materials, such as soils, are often intimate mixtures of numerous different inorganic (minerals) and organic (humic) substances. And since soils are often significant constituents of spectral remote sensing scenes, intimate mixing may safely be assumed to be a common phenomenon. Another case is the mixing of fluids, such as oil and water, as occurred during the Deepwater Horizon oil spill disaster in 2010. We will consider this second case in more detail later.
Linear mixing is modeled as a linear combination of the spectra of multiple endmembers. The problem is posed in a number of ways using theory from linear statistical models with variations that impose either physical or sparseness constraints.18.104.22.168.6.–7 Sites in a scene are labeled as containing one, two, or perhaps many endmembers. Some approaches begin with what is sometimes referred to as a full model, where the full model contains all possible variables (endmembers), and subsequently eliminates variables that do not contribute to the statistical significance (e.g., using an F-statistic) of the model.8 Alternatively, other approaches are synonymous with stepwise regression, where the process begins with a pair of variables (endmembers) and introduces new variables if they contribute significantly to the model.6,8
Recent research on linear unmixing has resulted in a number of alternative approaches. A stochastic mixture model (SMM) has been proposed with the goal to incorporate inherent endmember variance.9 Related to the SMM is a continuous version of the method, known as the normal compositional model (NCM).10 In the NCM, the mean vectors and covariance matrices of the endmembers are assumed adequate to capture their spectral variability. However, a big challenge to the NCM is parameter estimation. Very recently, a Bayesian algorithm called “normal endmember spectral unmixing” has been proposed to enhance parameter estimation performed in the NCM.11 Research has also been performed on the use of sparse modeling for modeling linear mixtures in hyperspectral imagery,12,13 a data-driven stochastic approach,14 as well as some advanced methods of nonnegative matrix factorization (NMF), such as projection-based NMF,15 and an adaptive sparsity-constrained NMF.16
Intimate mixtures exhibit nonlinear spectral mixing behavior and are modeled as nonlinear combinations of spectra from multiple endmembers. In cases of intimate mixing, a linear spectral unmixing inversion applied to such nonlinear mixtures will yield subpixel material abundance estimates that do not equal the true values of the mixture constituents. An example of this is provided by Keshava and Mustard.17
An intimate mixture model can be described by nonlinear functions, which are justified by Hapke scattering theory18 and photometric phase functions.1920.–21 In such an approach, reflectance is converted to single-scattering albedo (SSA). An exact model can be difficult to obtain, but can show up to 30% improvement in measurements over the linear mixing model when intimate mixtures are present.22 Many years ago studies showed success in using a constrained energy minimization method and other linear methods applied to reflectance spectra transformed to SSA data.2324.–25
More recently, several alternative methods of spectral unmixing of nonlinear spectral mixtures have been proposed. Reviews of nonlinear spectral mixture analysis are given in Refs. 17, 26, 27, and 28 and references cited therein.
Kernel functions provide a way to generalize linear algorithms to nonlinear data.32,33 In the cases of detection and classification applications, they can induce high-dimensional feature spaces. In these spaces, previously nonseparable classes are made linearly separable. Thus, linear methods can be applied in this new feature space that provides nonlinear boundaries back in the original data space. Another example is the kernel principal component analysis method.34 The kernel, in this case, is not used to induce a high-dimensional space, but is used to better match the data structure through nonlinear mappings. It is in this mode that kernels can be applied to obtain abundances in nonlinear mixtures while essentially using a linear mixture model. What is more appealing is that the physics suggests that such a method is ideal if one can model the kernel correctly.
A limitation of earlier kernel algorithms for material detection and scene classification is that they produced abundance estimates that did not meet nonnegativity and sum-to-one constraints. This was overcome by the development of a kernel fully constrained least squares (KFCLS) method that computes kernel-based abundance estimates to meet the physical (nonnegativity and sum-to-one) abundance constraints.35 Further investigation of the KFCLS method has resulted in (1) the development of a generalized kernel for linear and intimate (nonlinear) mixtures37 and (2) an adaptive kernel-based technique for mapping linear and intimate nonlinear mixtures.38 The generalized kernel and adaptive techniques provide a way to adaptively estimate a mixture model suitable to the degree of nonlinearity that may be occurring at each pixel in a scene. This is important because a scene may contain both linear and intimate mixtures and it is not always known a priori which model is appropriate on a pixel-by-pixel basis. This problem was investigated in a study performed by Broadwater and Banerjee.38 Building upon their work, a study investigating the behavior of the generalized KFCLS and adaptive kernel-based techniques was performed using both user-defined and automatically generated endmembers determined by the support vector data description algorithm.39
An investigation using laboratory data was also performed comparing the performance of the fully constrained least squares (FCLS) method applied to spectra converted to SSA with the generalized KFCLS applied to reflectance spectra and another recently published kernel-based method, the “K-Hype” kernel applied to reflectance spectra.40 One conclusion of this study: similar accuracy in abundance estimates can be achieved using the SSA-based method as compared to the generalized KFCLS, but with a much faster computation time. However, not all kernel methods are the same. The study also determined the “K-Hype” kernel method was not worth pursuing further for this type of analysis.
Overall, the impact of nonlinear spectral mixing on algorithm results must be well understood if we are to achieve a major goal of hyperspectral imaging (HSI): the geospatial mapping and quantification of materials that comprise remotely sensed scenes. In this study, we further this understanding by investigating both phenomenology-based SSA and mathematical-based kernel methods; and we investigate some issues noted while conducting earlier studies.39 The experiments include both laboratory and airborne HSI.
There has been a relative paucity of real-spectral image data sets that contain well-characterized intimate mixtures that can be used for algorithm development and testing. An important aspect of this study is the generation of a set of HSI data that are truly nonlinear spectral mixtures for which very precise truth information exists. Our laboratory experiment is performed on highly controlled data containing predetermined, nonlinear (intimate) mixtures of two materials. In addition, airborne experiments are performed on a scene of HyMap data collected over Oahu, Hawaii, and a scene collected by the airborne visible/infrared spectrometer (AVIRIS) instrument over the oil spill region in the Gulf of Mexico during the Deepwater Horizon oil incident.41,42
Description of Algorithms
Fully Constrained Least Squares and Linear Kernel-Based Mixing Models
The fully constrained least squares (FCLS)7 spectral mixing model can be expressed as
Alternatively, the linear model can also be posed as a special case of the kernel-based mixing model derived previously by Broadwater et al.,35 where the abundances of mixture components are estimated through the process
Either way, once is obtained, the estimator vector for a mixed spectral signature is
The root mean-squared error (RMSE) between the observed and estimated spectral signature is measured as
FCLS has been shown to be successful for modeling linear mixing phenomenology.7 In the present study, we will be using FCLS in two ways. The method will be used as a benchmark to compare with the proposed nonlinear methods. The method will also be used in one of the nonlinear approaches, where we will apply the FCLS to spectra that have been converted to SSA, as discussed below in Sec. 2.2.
Using Hapke Theory: Fully Constrained Least Squares Applied to Single-Scattering Albedo Spectra
Previous studies indicate that intimate (nonlinear or macroscopic) spectral mixtures in reflectance space may be linearized when transformed into SSA space. In order to understand this, we first mention the relationship between reflectance and SSA. Reflectance is the ratio of reflected radiance to incident irradiance.43 SSA is the ratio of the total amount of power scattered to the total power removed from the wave. If we assume the materials of interest are Lambertian, then we can consider the case of hemispherical reflectance and the mathematical relationship between reflectance and SSA can then be written19
For a number of intimately mixed materials, the average SSA can be written as38
According to Eq. (7), intimate mixtures are a linear mixture of the albedos and these are nonlinearly mapped to reflectance by Eq. (6). This linear mixing of the SSAs can be seen trivially, as we can write Eq. (7) as38 Noting the form of Eqs. (8) and (9) as compared to Eq. (1) for linear mixing, the terminology for relative geometric cross section and “abundance” can be used interchangeably.
Accordingly, we investigate this behavior by applying a linear mixing method on albedo; specifically, by applying the FCLS method on data that has been converted to SSA.
Conversion to SSA is described further in Resmini et al.23 and Resmini.24 Both studies follow Hapke;18 and Mustard and Pieters1920.–21 assuming the reflectance spectra are bidirectional. The expressions to transform reflectance spectra to SSA are given by Eqs. (10) and (11) for bidirectional (bd) reflectance and for hemispherical-directional (hd) reflectance, respectively. In the derivation of both expressions, the phase angle is large enough that the opposition effect is assumed negligible.2.1, Hapke18); , , , and are the same as in Eq. (6). The two different equations can be used to generate the two sets of SSA spectra for bd and hd reflectance.
In the experiments that follow, we refer to this approach as the “FCLS on SSA” method, or simply denote it as the SSA method. The conversion of reflectance spectra to SSA using Eqs. (10) and (11) is very fast as compared to the generalized kernel least squares (GKLS) method.
Generalized Kernel Fully Constrained Least Squares (Fixed and Automated)
Choosing the linear function for the kernel in Eq. (2) works well for modeling linear mixtures; however, it is not a suitable kernel for intimate mixtures. A physics-inspired kernel was proposed and shown to provide significantly improved behavior to model nonlinear mixtures, but a result of that effort was that although each kernel provides good results for the type of mixing intended, only one kernel or the other could be used, for either linear mixtures or intimate mixtures, but not both.36
The kernel approach was further developed by Broadwater and Banerjee37,38 into a generalized method for adaptive linear and intimate mixtures. This method is motivated by attempting to simulate Hapke theory for SSA by making use of the kernel
In this study, we investigate a fixed- generalized kernel least square” (fixed- GKLS). The computation is similar in form to Eq. (2) except the minimization is done according to
For a specified , we implement this by transforming the observed and into kernel space
Equation (16) has the same constraints as Eq. (1) and the abundance estimates can be computed in the same manner as Eq. (1). The estimated mixed spectra in kernel space is computed as
The estimated error in kernel space is
The estimated error vector in the original space is computed as Eq. (4), where the components of the estimated mixed spectra are computed using the inverse transform of Eq. (14). The RMSE is computed by Eq. (5).
In addition to a fixed- GKLS implementation, an automated GKLS method is investigated that attempts to select the most appropriate gamma according to44 In computing the optimization, we seek to achieve a minimum of the model’s RMSE. In this manner, at least theoretically, the automated GKLS method has the ability to respond differently to differing degrees of nonlinearity to select the best gamma and compute more precise estimates of abundance.
Description of Experiments
Overview of Experiments
Three experiments are performed in this study. The first is an experiment conducted in the laboratory. This is an important aspect of our investigation: a set of HSI data is generated that are truly nonlinear spectral mixtures for which very precise truth information is documented. The laboratory experiment is performed on highly controlled data containing predetermined, nonlinear (intimate) mixtures of two materials. In this experiment, two granular materials were custom fabricated and mechanically mixed to form intimate mixtures. The materials are spherical beads of didymium glass and soda-lime glass, both ranging in particle size from 63 to . The glass beads materials are both translucent. Their chemical composition, densities, and particle size range are well known. The mixtures, which exhibit largely nonlinear spectral mixing, were then observed with a visible/near-infrared (VNIR; 400 to 900 nm) HSI microscope. The second and third experiments are conducted in the field with airborne hyperspectral imagery: the second experiment is performed using data from the HyMap sensor acquired during a campaign by the Navy Research Laboratory (NRL) over Oahu, Hawaii, from January 24 to February 1, 2009. The third experiment utilizes AVIRIS data collected over the oil spill region in the Gulf of Mexico during the Deepwater Horizon oil incident. The SSA, FCLS, and GKLS methods are each applied to data from the three collections.
Experiment 1: Laboratory Experiment
A Resonon Pika II imaging spectrometer with a Xenoplan 1.4/23-0902 objective lens shown in Fig. 1(a) is used to measure the didymium and soda-lime glass bead mixtures.45 We also used an Edmund Optics Gold Series telecentric lens. However, this lens gives data, providing a spatial resolution higher than what was required for the analyses. Consequently, the Edmund data are not reported in our results.
The Pika II is mounted nadir-looking at a mechanical translation table on which the sample to be imaged is placed. This device is a pushbroom sensor with a slit aperture, thus the need for a translation table to move the sample to facilitate hyperspectral image cube formation. The height of the sensor above the table is user selectable; a height was chosen such that all mixtures are captured in the same scene so that the data have a ground sample distance of . Though capable of acquiring 240 bands from 400 to 900 nm, the sensor was configured to acquire 80 bands by binning (spectrally by three) resulting in a sampling interval of and high signal-to-noise ratio spectra. Four quartz–tungsten–halogen (QTH) lamps are used for illumination approximating a hemispherical-directional illumination/viewing geometry.
Sensor and translation table operation, data acquisition, and data calibration are achieved by software that runs on a laptop computer. Calibration consists of a measurement of dark frame data (i.e., acquiring a cube with the lens cap on) and a measurement of a polytetrafluoroethylene (PTFE) reference plaque (large enough to entirely fill the field-of-view). Then, for each HSI cube measured, the sensor’s software first subtracts the dark data and then uses the PTFE data (also dark subtracted) to ratio the spectral measurements to give relative reflectance (also known as reflectance factor: Hapke18 and Schott43).
Three binary mixtures (and the two endmembers) are constructed and emplaced in the wells of a 96-well sample plate: 0/100%, 25/75%, 50/50%, 80/20%, and 100/0% of didymium/soda-lime (percentages by volume). This was done as follows: five cells of a 96-well sample plate, spray-painted flat black, were filled with the various glass bead mixtures; this is shown in Fig. 2. The volume of each cell is (0.33 mL). Three binary mixtures and the two 100% endmembers are constructed and emplaced in five of the wells of the 96-well sample plate: 0/100%, 25/75%, 50/50%, 80/20%, and 100/0% didymium/soda-lime. The glass beads and their mixtures also display subtle, though interesting, gonioapparent changes in color. We remark that the glass bead particle size range is much larger than the VNIR wavelengths used in this analysis.
Fifteen data Pika II HSI cubes were acquired; however, we focus here on the analysis of one cube comprised of 640 samples, 400 lines, and 75 bands ranging from 0.434 to . Of the 80 bands acquired the first five were discarded due to noise content.
Training and test data were extracted from the selected hyperspectral cube. Figure 2 shows polygons defining the training and test regions drawn on top of a red–green–blue (RGB) color composite context image. The training regions are shown in Fig. 2(a) and the test regions are shown in Fig. 2(b). Note that none of the test regions overlapped the training regions. Figure 2(a) shows the small training polygon regions defining the three training endmembers: DiDy (100%), lime (100%), and background. Figure 2(b) shows the test areas used to quantitatively measure the performance of the algorithms. Five regions were defined, corresponding to the five mixtures 100% DiDy, 75/25% DiDy/lime, 50/50% DiDy/lime, 25/75% DiDy/lime, and 0/100% DiDy/lime. Two additional test regions of background spectra were also extracted.
Figure 3 shows plots of the mean vectors of the spectral data extracted within the training polygons shown in Fig. 2(a). These image-derived training endmembers, as just described, are used to investigate the three methods described in Sec. 2: (1) FCLS applied in reflectance space, (2) GKLS applied in reflectance space, and (3) FCLS applied in SSA space. For purposes of conciseness in reporting results, these methods henceforth will be referred to as the FCLS, GKLS, and SSA methods, respectively.
Numerous factors affect the performance of the methods. In particular, three factors affecting the performance of all the methods are: (1) the number the endmembers used in a model, (2) how well these endmembers span the space in which the mixing occurs, and (3) the RMSE threshold for eliminating bad fits between the observed and model-estimated spectra. In addition, the GKLS method uses a parameter to determine the nonlinear behavior introduced by the kernel. We test the GKLS method at fixed values of : 0.1, 0.5, 1.0, 2.0, 3.0, 4.0, 5.0, and 6.0, as well as the automated GKLS method. For the SSA method, performance may be affected by the type of SSA conversion: hemispherical or bidirectional. If bidirectional, the view and illumination angles are factors. Although the laboratory instrument has approximately hemispherical illumination, we will conduct trials for SSA conversions made assuming both bidirectional reflectance with nadir view and illumination angles as defined by Eq. (10), denoted as SSA, as well as hemispherical reflectance as defined by Eq. (11), which we will refer to as SSA-H.
The algorithms are applied to the entire scenes. Both qualitative results (shown by images of the algorithm output) and quantitative results (applied in the test regions) are given.
Experiment 2: Airborne Experiment in Hawaii
Experiment 2 tests the proposed methods using a cube of HyMap data collected during a campaign by the NRL over Oahu, Hawaii. Flight line Az148 (acquired on February 1, 2009) was selected and analysis was performed on a subscene of the original scene with a size of 400 samples by 250 lines over government-owned property near the Waimanalo region of Oahu. The imagery consists of 125 bands, has a spectral range of 0.45 to with a spectral resolution of about , and a ground sampling distance (GSD) of . The calibrated, at-aperture radiance data were converted to reflectance by the NRL using their Tafkaa,46 atmospheric compensation tool. Tafkaa is a heavily modified version of ATREM.47 The reflectance spectra are scaled by a factor of 10,000 and converted to 16-bit integers. An RGB normal-color composite of this scene is shown in Fig. 4.
Ten fabric panels of varying size were placed at three different sites (and in varying local backgrounds) prior to the image collection. These panels are impermeable Tyvek-like house wrap, near white in appearance. The sizes of these panels are listed in Table 1. Figure 5 shows aerial photographs of the three sites. The locations of the panels at each site are annotated. Numerous ground photos were also taken of the panels. Figure 6 shows ground-truth photos for three of these panels (TY-4, TY-5, and TY-7).
Target sizes: the sizes of the 10 target panels are listed in this table. The measurements were made in units of feet, but converted to meters for easier comparison to the estimated 4.0 GSD of the airborne HyMap scene.
Site 1 contains 6 of the 10 panels (TY-4, TY-5, TY-6, TY-7, TY-8, and TY-9). TY-7 is expected to be difficult to detect because of its subpixel size in the HyMap scene and because it is partially occluded by trees. Notice that TY-7 is somewhat difficult to see in the aerial photo shown in Fig. 1; however, it is considerably more difficult to see in the HyMap data because of the reduced spatial resolution. TY-5 is expected to be difficult because of size and shading. Site 2 contains three Tyvek panels (TY-1, TY-2, and TY-3). Site 3 contains one Tyvek panel. During the campaign, spectral measurements were acquired by the USGS and NGA using an ASD spectrometer. However, these field measurements were not used for training in this study. Instead image-derived spectra were used.
Figure 7 shows four of the eight endmember spectra used in experiment 2. Similar to the procedure used in the previous experiment, training regions for eight classes of materials in the scene were defined for use as endmembers. Seven of these belonged to naturally occurring classes, such as sand, gravel, water, and different types of vegetation having regional extent (e.g., multiple pixels). These endmembers were computed as mean spectra within regions defined by polygon overlays (similar to the overlays shown in Fig. 2). However, as shown in Table 1, the Tyvek panels are comparably small and mostly subpixel in extent. TY-6 is the largest panel and is just slightly larger than a full pixel (). The spectral vector from the center point of TY-6 was extracted and used as the Tyvek endmember.
Experiment 3: Airborne Experiment: Deep Water Horizon
Experiment 3 uses a scene that was extracted from AVIRIS data collected during a JPL/USGS airborne campaign during the Deep Water Horizon (DWH) oil spill incident.41,42 As a part of these studies, during the DWH campaign, USGS collected numerous oil samples for laboratory measurements.41 Although precise ground truth is not available due to the dynamically changing environment of an oil spill on water, previous studies using feature-based spectroscopy methods determined the locations of three predominant oil thicknesses in the region.42
Out of the numerous flight lines of AVIRIS acquired on May 17, 2010, run 11 of AVIRIS was selected, which includes the actual incident site of the oil spill. This run was collected at an altitude of 28,000 feet resulting in a GSD of . The AVIRIS scene is 677 samples by 16,835 lines with 224 bands of reflectance data from 0.365 to . As part of the calibration to reflectance, the USGS scaled the imagery by a factor of 20,000. The experiment focused on a subset scene of size 500 pixels by 500 lines by 224 bands. Out of the 224 bands, 97 bands were excluded: The first 39 bands (0.365 to ) were excluded to focus the algorithm on the longer wavelengths away from the visible, the last 10 bands (2.407 to ) were excluded because of low signal-to-noise, and the remaining bands were excluded because of atmospheric absorption.
Figure 8 shows a true-color RGB composite of the subset scene. The scene is mostly water with large portions containing a great diversity of oil thicknesses. Different states of the oil can be seen visibly, appearing in different colors ranging from almost black, to dark brown, to light brown, and to orange.
Figure 9 shows laboratory-measured reflectance spectra of four oil samples at differing thicknesses. Absorption features due to C-H with band centers (bc) at 1.2 and are labeled as F1 and F2.
For purposes of training and defining the endmembers, laboratory spectra for pure oil samples with 4-mm thickness and 1.85-mm thickness are used, as well as an oil/water emulsion having a thickness of 1.0 mm. Because much of the emulsion behavior is difficult to model in the laboratory, user-defined image spectra are also used: orange-colored spectra (in the RGB true-color composite) and water spectra with trace oil.
The FCLS, GKLS, and SSA methods are applied to the AVIRIS data, reporting on the results using the fixed- implementation and the SSA conversion made assuming bidirectional reflectance with nadir input and output angles.
Overview of Results
The results for the three experiments are presented below, showing tables with model diagnostics that provide estimated abundance values for the estimated models and RMSE values for the model errors; abundance maps that provide a visual representation of the abundances of materials found in the scenes; and graphs showing a comparison of the estimated (simulated) spectra estimated by the computed models and the observed spectra at selected points.
Results of Experiment 1
The results for experiment 1 (the laboratory experiment) are shown in Figs. 10 and 11, as well as Tables 1 and 2. Figure 10 shows RGB color abundance maps for four of the trials (, , ). Qualitatively, Figs. 10(a) and 10(b) show poor correspondence to the known mixtures shown in Fig. 2. Figures 10(c) and 10(d) show much better correspondence to the actual abundances given in Fig. 2. The variations in color within the discs containing the three mixtures (75/25, 50/50, and 25/75) are noteworthy. These variations indicate the methods are detecting notable variance in the abundance proportions. This indicates that the attempt to prepare mixture proportions that are as uniform as possible were not completely successful. These variations are real and the abundance estimation methods (particularly, GKLS and SSA) are responding correctly. Static clinging and other interparticle interactions likely account for the clumping and variations observed. Nonetheless, overall population averages captured by regions of interest delineated in Fig. 2(b) covering a large portion of the individual cells yield averages close to those intended in the experiment design phase.
Abundance results: the average estimated abundances are listed for the FCLS, GKLS, and SSA methods in the five test regions. The “truth” for these regions (actual physically measured proportions by volume) is 1.0, 0.788, 0.505, 0.242, and 0.0 for DiDy100, DiD075, DiDy050, DiDy025, and DiDy000, respectively. Results for the GKLS method using a fixed gamma of γ=0.1, γ=0.5, γ=1.0, γ=2.0, γ=3.0, γ=4.0, γ=5.0, and γ=6.0 are given, as well as for the automated GKLS.
Table 2 lists the average estimated abundances for the FCLS, GKLS, SSA, and SSA-H methods in the five test regions. The “truth” (actual physically measured) percent by volume of DiDy for these regions varied slightly from the goal of 100%, 75%, 50% 25%, and 0%. In reality, these were measured as 100%, 78.8%, 50.5%, 24.2%, and 0% for DiDy100, DiD075, DiDy050, DiDy025, and DiDy000, respectively. Results for the GKLS method using a fixed gamma of , , , , , , , and are given, as well as for the automated GKLS. This table shows FCLS to be poor at predicting the abundances for DiDy075 (93.11% predicted versus 78.8% truth) and DiDy050 (74.97% predicted versus 50.5% truth), as well as very poor at predicting DiDy025 (63.88% predicted versus 24.2% truth). The results of GKLS for small values of agree with theoretical expectations of an approximately linear model. Specifically, the prediction of GKLS at is almost exactly the same as the FCLS method.
Out of the eight values tested, GKLS at provides the closest prediction for DiDy50 (49.08% versus 50.5%) and is only slightly worse at predicting the correct abundance than GKLS at .
Figure 11 shows the observed and estimated mixture (averaged) spectra of the 50/50% region using the FCLS, GKLS, and SSA-H methods. Visually, we can see both the GKLS () and SSA-H methods provide a better fit as compared to the FCLS method.
Table 3 lists the RMSE of the fit between the estimated and observed spectral mixtures for selected points in the scene. Except for DiDy at 25% (0.242), the RMSE errors for FCLS were not considerably larger than the errors for the other methods. Yet we know FCLS is poorly predicting the known abundances for these samples. We conclude RMSE is not necessarily a good indicator of a method’s accuracy to predict abundance.
Model diagnostics: the RMSE results of the fit between the estimated and observed spectral mixtures are listed for selected points in the scene. The first column lists the location and planned percentage mix of DiDy for these points. The actual physically measured mixes were 100%, 78.8%, 50.5%, 24.2%, and 0.0%. Bold values indicate lowest RMSE value retrieved for the indicated pixel.
|(x,y) DiDy %||FCLS||GKLS γ=5||GKLS γ=6||SSA||SSA-H|
|(585, 248) 100%||0.0141||0.0293||0.0369||0.0315||0.0320|
|(400, 228) 75%||0.0245||0.0244||0.0248||0.0351||0.0388|
|(324, 049) 50%||0.0357||0.0225||0.0222||0.0129||0.0190|
|(313, 062) 50%||0.0359||0.0318||0.0390||0.0118||0.0175|
|(328, 067) 50%||0.0333||0.0294||0.0533||0.0223||0.0294|
|(223, 314) 25%||0.0526||0.0380||0.0379||0.0140||0.0241|
|(224, 327) 25%||0.0495||0.0383||0.0384||0.0186||0.0298|
|(246, 330) 25%||0.0420||0.0289||0.0636||0.0121||0.0197|
The results in Tables 2 and 3 also show that the automated implementation of the GKLS method was not as successful as the fixed- GKLS (, 4, 5, or 6) for estimating the correct abundance. The automated GKLS method attempts to select the most appropriate based on achieving a minimum of the model’s RMSE. This metric seems to respond to linear and nonlinear mixtures; however, it does not appear to be a reliable metric to determine the degree of nonlinear behavior. RMSE also cannot be used to achieve the most accurate estimate of abundance. Consequently, an alternative approach should be sought for implementing an automated GKLS.
Determining which value of gives the best result for this experiment is difficult. In terms of abundance, clearly provides the best estimate for the DiDy50 mix (49% abundance). However, provides slightly better estimates of abundance for the DiDy75 and DiDy25 mixes, (84% and 0.31%, respectively). In terms of RMSE, provides a better fit than in most (but not all) cases. Noting that provides the highly close prediction of abundance for the DiDy50 mixture and the RMSE fits are better, we henceforth consider to provide the best GKLS result as compared to the others (although only slightly better than ).
The processing times on the data consisting of 256,000 pixel vectors (640 samples by 400 lines) were recorded for each of the methods. The FCLS, SSA, and fixed- GKLS runs were completed in 9 s, 9 s, and 12 s, respectively. The automated GKLS run finished in 228 s. Note the automated GKLS method was slower by a factor of 19.
Results of Experiment 2
Results for experiment 2 are shown in Tables 4Table 5Table 6–7 and Figs. 12 and 13, showing the behavior for the FCLS, GKLS, and SSA methods, quantitatively and visually. The SSA method in this experiment is synonymous to the SSA method discussed in the first experiment, which is defined by Eq. (10) assuming bidirectional reflectance with nadir view and illumination angles.
Model diagnostics of panels for experiment 2. The RMSE values of the FCLS, SSA, and GKLS methods are listed for selected panel pixels in the scene, as well as the corresponding γ value chosen by the GKLS method at these points.
|T01||(369, 126)||0.007||0.017||0.007||6.440||T06A||(211, 62)||0.016||0.018||0.011||6.440|
|T02A||(365, 128)||0.020||0.022||0.017||4.569||T06B||(211, 63)||0.000||0.000||0.000||5.126|
|T02B||(365, 129)||0.021||0.029||0.020||3.413||T06C||(212, 63)||0.010||0.017||0.009||6.440|
|T03A||(363, 182)||0.020||0.014||0.020||6.440||T07||(187, 99)||0.005||0.018||0.005||0.002|
|T03B||(362, 182)||0.039||0.024||0.039||6.440||T08||(201, 64)||0.004||0.014||0.004||0.006|
|T03C||(363, 181)||0.004||0.004||0.004||0.005||T09A||(163, 33)||0.007||0.010||0.007||2.838|
|T04A||(209, 79)||0.005||0.021||0.005||0.002||T09B||(164, 33)||0.007||0.024||0.007||0.002|
|T04B||(209, 80)||0.003||0.007||0.003||1.778||T10A||(43, 106)||0.008||0.014||0.008||3.459|
|T05A||(212, 76)||0.008||0.017||0.007||5.321||T10B||(43, 107)||0.007||0.010||0.006||2.479|
|T05B||(212, 77)||0.003||0.014||0.003||1.714||T10C||(44, 107)||0.006||0.013||0.006||0.002|
Model diagnostics of vegetation mixes for experiment 2. The RMSE values of the FCLS, SSA, and GKLS methods are listed for selected pixels containing vegetation mixtures in the scene, as well as the corresponding γ value chosen by the GKLS method at these points.
Abundance results for Tyvek panels in experiment 2. The abundance values computed by the FCLS, SSA, and GKLS methods are listed for selected locations of the Tyvek panels.
|Panel name||Location (x,y)||WaterDrk||Gravel||SandPit||Grass healthy||Trees||VegNrTyvek||Tyvek||Panel name||Location (x, y)||WaterDrk||Gravel||SandPit||Grass healthy||Trees||VegNrTyvek||Tyvek|
|FCLS||T01||(369, 126)||0.04||0.65||0.05||0.00||0.00||0.00||0.25||T07||(187, 99)||0.18||0.00||0.00||0.00||0.14||0.60||0.08|
|SSA||T01||(369, 126)||0.07||0.00||0.45||0.00||0.01||0.26||0.21||T07||(187, 99)||0.09||0.00||0.00||0.01||0.30||0.39||0.21|
|GKLS||T01||(369, 126)||0.07||0.01||0.39||0.00||0.00||0.31||0.22||T07||(187, 99)||0.18||0.00||0.00||0.00||0.14||0.60||0.08|
|FCLS||T03C||(363, 181)||0.00||0.00||0.98||0.01||0.00||0.00||0.01||T08||(201, 64)||0.00||0.00||0.07||0.25||0.00||0.46||0.22|
|SSA||T03C||(363, 181)||0.00||0.00||0.97||0.00||0.00||0.00||0.03||T08||(201, 64)||0.00||0.00||0.00||0.32||0.07||0.09||0.52|
|GKLS||T03C||(363, 181)||0.00||0.00||0.98||0.01||0.00||0.00||0.01||T08||(201, 64)||0.00||0.00||0.07||0.25||0.00||0.46||0.22|
|FCLS||T04B||(209, 80)||0.00||0.52||0.08||0.11||0.00||0.21||0.09||T09A||(163, 33)||0.00||0.00||0.19||0.16||0.00||0.60||0.02|
|SSA||T04B||(209, 80)||0.01||0.36||0.12||0.17||0.00||0.10||0.21||T09A||(163, 33)||0.00||0.00||0.24||0.19||0.00||0.36||0.22|
|GKLS||T04B||(209, 80)||0.00||0.47||0.09||0.11||0.00||0.21||0.11||T09A||(163, 33)||0.00||0.00||0.22||0.19||0.00||0.51||0.07|
|FCLS||T05B||(212, 77)||0.53||0.07||0.00||0.00||0.07||0.28||0.05||T10A||(43, 106)||0.00||0.67||0.24||0.03||0.00||0.00||0.05|
|SSA||T05B||(212, 77)||0.32||0.00||0.00||0.00||0.25||0.27||0.17||T10A||(43, 106)||0.00||0.27||0.40||0.00||0.00||0.16||0.17|
|GKLS||T05B||(212, 77)||0.49||0.02||0.00||0.00||0.08||0.34||0.07||T10A||(43, 106)||0.00||0.57||0.29||0.04||0.00||0.00||0.11|
|FCLS||T06C||(212, 63)||0.33||0.00||0.25||0.00||0.00||0.00||0.42||T10B||(43, 107)||0.00||0.57||0.17||0.08||0.00||0.00||0.18|
|SSA||T06C||(212, 63)||0.12||0.01||0.35||0.00||0.00||0.00||0.51||T10B||(43, 107)||0.00||0.04||0.35||0.01||0.00||0.27||0.33|
|GKLS||T06C||(212, 63)||0.14||0.01||0.27||0.00||0.00||0.00||0.52||T10B||(43, 107)||0.00||0.44||0.20||0.08||0.00||0.05||0.23|
The bold values emphasize the predicted abundance for “Tyvek,” the primary material of interest.
Abundance results for vegetation mixtures in experiment 2. The abundance values computed by the FCLS, SSA, and GKLS methods are listed for selected locations of vegetation mixes.
|Vegetation mix location (x,y)||WaterDrk||Gravel||Grass healthy||Trees||VegNrTyvek||Blg01||Vegetation mix location (x,y)||WaterDrk||Gravel||Grass healthy||Trees||VegNrTyvek||Blg01|
|FCLS||(277, 50)||0.06||0.00||0.15||0.05||0.74||0.01||(233, 141)||0.03||0.00||0.53||0.00||0.39||0.02|
|SSA||(277, 50)||0.03||0.08||0.29||0.13||0.47||0.00||(233, 141)||0.05||0.00||0.83||0.03||0.00||0.06|
|GKLS||(277, 50)||0.06||0.01||0.19||0.07||0.66||0.01||(233, 141)||0.06||0.00||0.63||0.00||0.26||0.03|
|FCLS||(277, 60)||0.11||0.00||0.11||0.08||0.70||0.00||(234, 141)||0.00||0.00||0.54||0.00||0.38||0.05|
|SSA||(277, 60)||0.08||0.00||0.40||0.11||0.42||(234, 141)||0.04||0.00||0.82||0.00||0.00||0.12|
|GKLS||(277, 60)||0.11||0.00||0.11||0.08||0.70||0.00||(234, 141)||0.01||0.00||0.63||0.00||0.25||0.08|
|FCLS||(248, 68)||0.00||0.00||0.27||0.21||0.52||0.00||(270, 178)||0.00||0.00||0.16||0.18||0.66||0.00|
|SSA||(248, 68)||0.01||0.00||0.51||0.27||0.21||0.00||(270, 178)||0.00||0.00||0.29||0.24||0.48||0.00|
|GKLS||(248, 68)||0.00||0.31||0.27||0.43||0.00||(270, 178)||0.00||0.00||0.19||0.21||0.60||0.00|
|FCLS||(99, 79)||0.00||0.15||0.22||0.00||0.63||0.00||(274, 179)||0.00||0.00||0.16||0.30||0.54||0.00|
|SSA||(99, 79)||0.00||0.06||0.34||0.00||0.60||0.00||(274, 179)||0.00||0.00||0.28||0.36||0.37||0.00|
|GKLS||(99, 79)||0.00||0.07||0.35||0.00||0.58||0.00||(274, 179)||0.00||0.00||0.17||0.31||0.51||0.00|
The bold values emphasize the predicted abundance for three types of vegetation, the primary materials of interest.
The abundance maps for experiment 2 are shown in Figs. 12 and 13. Figure 12 shows maps for the Tyvek panels using the GKLS and SSA methods. The map for the FCLS method is not shown because this map is almost identical to the GKLS method. The colored circles show where the panels are located. Due to limited dynamic range in the printed version, some of the low abundance panels (T02, T03, T04, and T07) cannot be seen, but the detection can be verified by noting the abundances shown in Table 6. The RGB color composite maps in Fig. 13 show the differences in vegetation modeling for the three methods. The overall greater abundance of grass predicted by the GKLS as compared to FCLS method can be seen by comparing Figs. 13(a) and 13(b). An even greater abundance of grass is predicted by the SSA method, as shown by Fig. 13(c).
The abundance retrieval capability of all methods is good; however, a moderate advantage is shown for the SSA and GKLS methods. Tables 4 and 5 list the model diagnostics for selected points in the scene. The entries in Table 4 list the results for many of the Tyvek panels. The entries in Table 5 list results for selected mixes of vegetation with other backgrounds. Note the RMSE for the GKLS is always less than or equal to the RMSE of the FCLS. This is to be expected since the GKLS algorithm iterates with increasing until the RMSE converges to a minimum RMSE. A higher value of corresponds to greater nonlinearity in the model. The RMSE values of the SSA method are almost always higher than either of the other methods; however, they were still in an acceptable range. The higher values may be attributed to SSA spectra having overall higher values than reflectance.
Figure 14 shows plots of observed and estimated spectra, as computed by the three methods, for one of the Tyvek panel (TY-5) mixtures with background at location and one of the vegetation mixtures at . Observed spectra (labeled as “O”) are displayed in dark gray and estimated modeled spectra (labeled as “E”) are displayed in lighter gray.
Figures 14(a)–14(c) show results for the T05A panel mixed with background as computed by the FCLS, GKLS, and SSA methods, respectively. The GKLS result is reported for . Note there are diagnostic Tyvek features located at 1.72 and (see Fig. 9). Overall, the fits between the observed and estimated spectra are good for the three methods, especially near the diagnostic Tyvek features. Note the GKLS improves the fit as compared to the FCLS method in agreement with the numerical results listed in Table 2.
Figures 14(d)–14(f) show results for the vegetation mix as computed by the FCLS, GKLS, and SSA methods. The GKLS result from is shown. The fits between the observed and estimated spectra were good for the three methods. As with the Tyvek panel, the GKLS improves the fit as compared to the FCLS method in agreement with the numerical results listed in Table 2. However, in this case, the SSA method was slightly better.
Tables 6 and 7 show the abundance results of experiment 2 for the three methods. Table 6 shows these results for the panels. At least one pixel location is listed for each Tyvek panel shown. Although most panels are subpixel, the ground resolution distance of the sensor can cause the panel’s signature to occur over multiple pixels. Not all locations are listed due to an uncertainty factor of the actual sample and also because of space limitations. We notice there is not always a significant difference in abundances between the FCLS and GKLS methods, such as with T04A (one of the pixels in TY-4) and T09B (one of the pixels in TY-9). However, at these locations, the SSA method actually does estimate significantly higher abundances. Overall, the abundances estimated by the SSA method are higher, and for the case of T07 it is sufficiently high to trigger the detection of that panel. The abundance estimates from the other methods applied to T07 are too low to be considered significant.
Table 7 shows the results for vegetation mixtures. The vegetation abundance estimates are notably different for each method. It is difficult to quantitatively assess the results in Table 7 because of a lack of ground truth. However, they corroborate with the qualitative results shown in Fig. 13 and indicate that the SSA method is preferable over the FCLS for better characterizing the nonlinear mixtures of vegetation in the scene.
Results of Experiment 3
The results for experiment 3 (airborne HSI data) are shown in Figs. 15Fig. 16–17 and Tables 8 and 9. Figure 15 shows an RGB composite abundance map and five single-layer abundance maps for the fixed- GKLS method (). The composite abundance map visually shows the method’s response to three thicknesses of oil: the R layer depicts the abundance of 4.0-mm-thick oil; the G layer depicts the abundance of 1.85-mm-thick oil; and the B layer depicts the abundance of 1.00-mm-thick oil. The single-layer abundance maps show the results for five of the six endmembers. Due to the small size of these photos in the report, it is difficult to see the detailed patterns, but it is possible to get a sense of the method’s behavior. The oil is expected to spread out in a certain pattern from thick to thin, which is indeed what occurs. In the original photos of the three methods (not shown here but available in supplemental material), there is little difference in the patterns observed between the GKLS and SSA methods. However, there is a notable difference in the patterns seen for the 4.0-mm-thick oil between the FCLS and GKLS (or FCLS and SSA) methods.
Abundance results: estimated abundance values of the methods are listed for selected locations (pixels) in the AVIRIS DWH scene.
|Location||DWH oil||DWH oil||DWH oil||DWH oil||(546, 481)||(1154, 112)|
|Method||(x,y)||4.0 mm||1.85 mm||1.0 mm||0.5 mm||Orange||Water|
The bold values indicate the best fitting model for each location.
Model diagnostics: RMSE values computed for the fit between the observed and estimated mixed spectra are listed for selected locations of the AVIRIS DWH scene.
Figure 16 shows the single-layer abundance maps of the 4.0-mm-thick oil for the FCLS and fixed- GKLS () method. The FCLS method shows much of the same large area of thick oil in the central-right side of the scene as the GKSL (and SSA, not shown). However, these maps also show a notable difference in pattern between the FCLS and GKLS methods, as just mentioned above. The linear FCLS indicates a significant amount of 4.0-mm-thick oil throughout the scene in areas that would not be expected. Physically we expect the oil to spread out in a certain pattern from thick to thin. There is isolated 4.0 mm oil scattered throughout the area in the FCLS photo not seen in the GKLS photo. Neither the GKLS nor the SSA methods indicate much 4.0-mm-thick oil in these regions.
Tables 8 and 9 show the abundance results and RMSE values, respectively, for the three methods at selected locations of the AVIRIS scene. In Table 8, the abundance estimates are not very different for the first three locations. Interestingly, the RMSE values in Table 9 indicate the mixtures are likely to be linear at these locations. The abundance estimates listed in Table 8 for the locations at and are in agreement with Figs. 15 and 16. The smaller RMSE value for the GKLS at this location indicates the mixture is likely to be nonlinear.
Figure 17 shows plots of the observed and estimated spectra for a thinner layer of oil at and a thicker layer of oil at . The fits are qualitatively good except for a significant difference in the F2 diagnostic region for the FCLS method and in the F1 diagnostic region for the SSA method. However, overall the fits are quite good. The larger values of RMSE listed for the SSA method are likely a consequence of the magnitude of the SSA spectra as compared to the normalized reflectance values used in the FCLS and GKLS methods (see range of the -axis in Fig. 17).
This study has investigated the use of a GKLS method applied to reflectance data, and an SSA method for nonlinear mixture analysis. In the case of the SSA method, an FCLS method is applied to data that have been converted from reflectance space to SSA space. Our baseline method was the FCLS method applied to reflectance data. Our hypothesis is that, for intimate (nonlinear) mixtures, both of these methods will provide improved modeling and abundance estimates as compared to the baseline FCLS method. Three scenarios are tested on different scenes: laboratory, aerial over land, and aerial over oil-contaminated seawater.
Overall the results for experiment 1 using laboratory data from the Pika II sensor indicate the FCLS method has a poor capability for modeling intimate mixtures. In contrast, both the GKLS and SSA methods are much better at retrieving actual material abundances. Whether or not one is better than the other is not conclusive. However, we conclude that our hypothesis is confirmed and that both of these methods provide a better estimate of abundance for mixtures exhibiting nonlinearity. For the laboratory experiment of known abundance quantities, the SSA and GKLS methods responded well to the nonlinearity present in a mixture of materials and provide better estimates of abundance than the linear FCLS method for the DiDy and soda-lime glass bead mixtures.
The GKLS parameter determines the degree of nonlinear behavior exhibited by the GKLS method and affects its accuracy for estimating abundances. The automated GKLS method attempts to select the most appropriate gamma based on achieving a minimum of the model’s RMSE. We conclude the RMSE metric seems to respond to a mixture being linear and nonlinear, but is not a reliable metric to determine the degree of nonlinearity present. It could not be used to achieve the most accurate estimate of abundance and an alternative approach should be sought to automate the GKLS method.
For mixtures known to be nonlinear, the fixed- implementation of GKLS with or 6 provides a good estimate of abundance. The fixed- GKLS and the SSA methods can be computed in approximately the same amount of time and provide approximately the same accuracy for estimating abundances. Compared to the fixed- GKLS method, the automated GKLS was much slower to compute by a factor of about 19 and did not really achieve better accuracy.
For experiments 2 and 3 using the airborne data from HyMap (scene over land) and AVIRIS (scene over water), we also conclude that our hypothesis is confirmed and that either of these methods provides a better estimate of abundance for mixtures exhibiting nonlinearity. Unfortunately, the data set used in experiment 2 was from a collection campaign not particularly focused on nonlinear mixing problems, and although it contained many nonlinear mixtures, precise ground truth is not available. However, the data provide good opportunity to observe that linear mixtures did not significantly degrade the performance of the nonlinear GKLS and SSA methods. In experiment 3 (Deepwater Horizon oil spill), the SSA and GKLS methods performed equivalently and appeared to respond better to the nonlinear mixtures of oil thicknesses in the water than the linear method. Specifically, the linear method appears to be predicting too much of the thicker oils. This can only be assessed qualitatively because, here, too, there is a lack of ground truth. Additional analysis should be performed on scenes containing a variety of known nonlinear mixtures that are well documented with ground truth.
Appendix provides information about obtaining data from the first experiment. All of the data as well as dark field data cubes (with 80, 120, and 240 bands) and some additional information [e.g., the ENVI ROI files and an ENVI ‘Spectral Math’ expression file for implementing Eqs. (2) and (3)] are available for download as one compressed (.zip) file.
Additional experiments are recommended that investigate the effects of adding endmembers to the models and also to the physical mixtures in a controlled environment. In experiment 1, the number of controlled endmembers was limited to two.
Obtaining the Data
The focus of the first experiment discussed in this study has been on the analysis of a single Pika II HSI data cube (file: 20130521_sphere_array_2.bil). However, 15 measurements, in total, of the glass beads in the 96-well sample plate were acquired as listed in Table 10. Two are simply redundant measurements ( prior to spectral subsetting in ENVI). Six others are of the same sample plate and glass bead mixtures but with 120 and 240 bands. These data are somewhat noisier than those with 80 bands but were acquired to assess the impact of an increased noise level on abundance estimations. Three cubes are with the sensor moved closer to the sample plate (still with the Xenoplan objective lens) and with 80 bands. Two additional cubes contain two replicate sets of glass bead mixtures placed in adjacent empty cells of the plate (for a total of 15 filled cells). These cubes, too, contain 80 bands. All of the data sets are with four QTH lamps and nadir-looking sensor configuration shown in Fig. 1. To date, no data have yet been acquired with a line-light source to yield true bidirectional reflectance spectra. All of these data sets as well as dark field data cubes (with 80, 120, and 240 bands) and some additional information [e.g., the ENVI ROI files and an ENVI “spectral math” expression file for implementing Eqs. (2) and (3)] are available for download as one compressed (.zip) file. Send an e-mail to firstname.lastname@example.org requesting the data; a link for downloading will be sent in return. All questions and comments about these data should be addressed to R.G. Resmini.
A list is shown of the available Pika II data sets of the glass beads. (File 20130521_sphere_array_shifted_8.bil is the same as file 20130521_sphere_array_8.bil except it has been shifted so that some regions of interest defined in other cubes in the data set also apply.)
References are made to certain commercially available products in this paper to adequately specify the experimental procedures involved. Such identification does not imply a recommendation or endorsement by the U.S. government, nor does it imply that these products are the best for the purpose specified. Thanks are given to Dr. Charles Bachmann and the U.S. Naval Research Laboratory for providing the HyMap imagery, as well as associated aerial photographs and atmospheric calibration for experiment 2. Thanks are also given to Dr. Roger Clark and Eric Livo (USGS) for the data provided in experiment 3, including the laboratory and field measurements, as well as the AVIRIS imagery. The MITRE Innovation Program (MIP) is gratefully acknowledged for funding the HSI microscopy aspect of the project in which the study presented here was conducted. This article has been approved by NGA for public release, Case 16-132.
Robert S. Rand received his BS degree in physics from the University of Massachusetts at Lowell and his PhD in engineering physics from the University of Virginia, Charlottesville. He has worked for agencies with the U.S. Department of Defense since 1977. His recent efforts have focused on hyperspectral exploitation using feature-based methods and spectral mixture models, as well as the use of three-dimensional information from LIDAR to improve the exploitation of hyperspectral imagery.
Ronald G. Resmini received his MS degree in geology from Boston College, Boston, Massachusetts, USA, and his PhD in geology from Johns Hopkins University, Baltimore, Maryland, USA. He is a research scientist at the MITRE Corporation, McLean, Virginia, USA, and an associate professor in the College of Science at George Mason University, Fairfax, Virginia, USA. He specializes in visible to infrared multi- and hyperspectral remote sensing, the geological and geophysical sciences, and algorithm development for remote sensing applications.
David W. Allen received his PhD in earth systems and geoinformation sciences with a concentration in remote sensing from George Mason University. He is a research chemist at the NIST, where he has worked for over 20 years. Current research interests include advancing hyperspectral imaging for applications related to medicine, security, and the environment. This effort encompasses the development of end-to-end analysis methodology that integrates standards and best practices for hyperspectral imager performance metrics.