The high spectral resolution afforded by hyperspectral imaging (HSI) sensors is poised to bring unprecedented advances to signature characterization applications. Thus far, much of the machine learning research devoted to HSI applications has focused on a few specific tasks, such as land-use/land-cover classification. In land classification, spatial information is highly informative, and model architectures are often designed to leverage spatial context. However, it is unclear how well these spatially tuned models will translate to tasks where spectral information is critical, such as the detection and characterization of chemicals. In this work, we compare spectral models (whose inputs are 1D spectra) and spatial-spectral models (whose inputs are 3D cubes) on the task of predicting chemical concentration maps. We find that spatial-spectral models perform best, though performance varies widely across the architectures tested. We also find that model performance is sensitive to the availability of training data, particularly when the training data do not fully capture the true variance of real-world conditions. Data augmentation can help mitigate sparse coverage of the observed parameter space (e.g., seasonal or geographic variability in ground cover), and we present augmentation strategies tailored to hyperspectral data.
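To make the augmentation idea concrete, below is a minimal sketch of one hyperspectral-tailored strategy, assuming spectra are stored as 1D NumPy arrays. The function name, parameters, and the specific perturbations chosen (global gain, smooth wavelength-correlated scaling, per-band noise) are illustrative assumptions, not the paper's actual pipeline.

```python
import numpy as np

def augment_spectrum(spectrum, rng, max_scale=0.1, noise_sigma=0.01):
    """Hypothetical augmentation: perturb a 1D spectrum to mimic
    illumination and ground-cover variability that the training
    set may not fully cover."""
    bands = spectrum.shape[0]
    # Global brightness change (e.g., illumination / sun-angle variation).
    gain = 1.0 + rng.uniform(-max_scale, max_scale)
    # Smooth, wavelength-correlated scaling: low-order polynomial in band index.
    x = np.linspace(-1.0, 1.0, bands)
    c = rng.uniform(-max_scale, max_scale, size=3)
    smooth = 1.0 + c[0] + c[1] * x + c[2] * x**2
    # Uncorrelated per-band sensor noise.
    noise = rng.normal(0.0, noise_sigma, size=bands)
    return spectrum * gain * smooth + noise

rng = np.random.default_rng(0)
spec = np.ones(224)                 # placeholder 224-band spectrum
augmented = augment_spectrum(spec, rng)
```

Because each perturbation acts on the spectral axis rather than the spatial axes, the same sketch applies whether the downstream model consumes individual pixels (1D spectra) or full cubes augmented pixel-wise.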
Including information from additional spectral bands (e.g., near-infrared) can improve deep learning model performance for many vision-oriented tasks. There are many possible ways to incorporate this additional information into a deep learning model, but the optimal fusion strategy has not yet been determined and can vary between applications. At one extreme, known as “early fusion,” additional bands are stacked as extra channels to obtain an input image with more than three channels. At the other extreme, known as “late fusion,” RGB and non-RGB bands are passed through separate branches of a deep learning model and merged immediately before a final classification or segmentation layer. In this work, we characterize the performance of a suite of multispectral deep learning models with different fusion approaches, quantify their relative reliance on different input bands, and evaluate their robustness to naturalistic image corruptions affecting one or more input channels.
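For concreteness, the sketch below contrasts the two extremes in PyTorch, assuming an RGB + NIR classification setting. The tiny backbones and class names are illustrative placeholders, not the models evaluated in the paper.

```python
import torch
import torch.nn as nn

class EarlyFusionNet(nn.Module):
    """Early fusion: RGB and NIR stacked into one 4-channel input."""
    def __init__(self, num_classes):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(4, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(32, num_classes)

    def forward(self, rgb, nir):
        x = torch.cat([rgb, nir], dim=1)   # (B, 3 + 1, H, W)
        return self.head(self.backbone(x))

class LateFusionNet(nn.Module):
    """Late fusion: separate RGB and NIR branches, merged just
    before the final classification layer."""
    def __init__(self, num_classes):
        super().__init__()
        def branch(in_ch):
            return nn.Sequential(
                nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            )
        self.rgb_branch = branch(3)
        self.nir_branch = branch(1)
        self.head = nn.Linear(64, num_classes)

    def forward(self, rgb, nir):
        feats = torch.cat([self.rgb_branch(rgb), self.nir_branch(nir)], dim=1)
        return self.head(feats)

rgb = torch.randn(2, 3, 64, 64)
nir = torch.randn(2, 1, 64, 64)
print(EarlyFusionNet(10)(rgb, nir).shape)  # torch.Size([2, 10])
print(LateFusionNet(10)(rgb, nir).shape)   # torch.Size([2, 10])
```

Intermediate ("mid") fusion strategies, which merge branch features partway through the network, fall between these two extremes and follow the same pattern.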
KEYWORDS: Multimodal imaging, Data fusion, RGB color model, Data modeling, Performance modeling, Near infrared, Education and training, Image segmentation, Image fusion, Multispectral imaging, Satellites
In overhead image segmentation tasks, including additional spectral bands beyond the traditional RGB channels can improve model performance. However, it is still unclear how incorporating this additional data impacts model robustness to adversarial attacks and natural perturbations. For adversarial robustness, the additional information could improve the model’s ability to distinguish malicious inputs, or it could simply provide new attack avenues and vulnerabilities. For natural perturbations, the additional information could better inform model decisions and weaken perturbation effects, or it could have no significant influence at all. In this work, we seek to characterize the performance and robustness of a multispectral (RGB and near-infrared) image segmentation model subjected to adversarial attacks and natural perturbations. While existing adversarial and natural robustness research has focused primarily on digital perturbations, we prioritize creating realistic perturbations designed with physical-world conditions in mind. For adversarial robustness, we focus on data poisoning attacks; for natural robustness, we extend the ImageNet-C common corruptions for fog and snow so that they perturb the input data coherently and self-consistently across bands. Overall, we find that both RGB and multispectral models are vulnerable to data poisoning attacks regardless of input or fusion architecture, and that while physically realizable natural perturbations still degrade model performance, the impact differs with fusion architecture and input data.
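As a rough illustration of a coherent multispectral corruption, the sketch below applies a single shared fog field to RGB and NIR imagery, assuming channel-last NumPy arrays with values in [0, 1]. The `nir_weight` parameter and the upsampled-noise fog field are assumptions for illustration (fog typically scatters near-infrared light somewhat less than visible light); this is not the paper's actual ImageNet-C extension.

```python
import numpy as np

def coherent_fog(rgb, nir, severity=0.5, nir_weight=0.7, rng=None):
    """Hypothetical corruption: blend one shared fog field into both
    RGB and NIR so the perturbation is spatially self-consistent
    across modalities rather than independent per channel."""
    if rng is None:
        rng = np.random.default_rng()
    h, w = rgb.shape[:2]
    # Smooth random fog field shared by all bands: coarse noise upsampled.
    coarse = rng.random((h // 8 + 1, w // 8 + 1))
    fog = np.kron(coarse, np.ones((8, 8)))[:h, :w]
    fog = severity * fog[..., None]          # (H, W, 1), broadcasts over bands
    rgb_fog = rgb * (1 - fog) + fog          # blend toward white haze
    nir_fog = nir * (1 - nir_weight * fog) + nir_weight * fog
    return rgb_fog.clip(0, 1), nir_fog.clip(0, 1)

rgb = np.random.rand(64, 64, 3)
nir = np.random.rand(64, 64, 1)
rgb_c, nir_c = coherent_fog(rgb, nir, severity=0.6)
```

Sharing one fog field across all bands is what distinguishes a physically self-consistent corruption from independently corrupting each channel, which real weather cannot do.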