Hyperspectral and light detection and ranging (LiDAR) imaging instruments capture information about ground objects from different perspectives, providing spectral–spatial and elevation descriptions, respectively. Their complementary capabilities help improve the accuracy of land-cover identification in multimodal data fusion tasks. However, their heterogeneous distributions often impede fusion and joint classification, leading to misclassification. To address this challenge, we propose a deep multiscale cross-modal attention (DMSCA) network for hyperspectral and LiDAR data fusion and joint classification. In contrast to existing methods, our primary motivation is to explore the intrinsic connection between these two remote sensing modalities and to enhance their shared attributes through several cross-modal attention mechanisms. The extracted modality features are integrated and exchanged across modalities, thereby enhancing their overall consistency. Specifically, these cross-modal attention mechanisms strengthen important local regions by taking into account the detailed hyperspectral and LiDAR geographical descriptions. The spatial-wise attention mechanism measures the contributions of neighboring samples to classification performance. The spectral-wise attention mechanism highlights the significant hyperspectral channels in terms of channel correlation. The elevation-wise attention mechanism closely couples the hyperspectral-related attention mechanisms with detailed LiDAR elevations for information fusion. Based on these mechanisms, an adaptive fusion and joint classification framework is constructed to balance the multimodal information. Experiments are conducted on three widely used datasets to evaluate the effectiveness of DMSCA. The results show that DMSCA outperforms state-of-the-art techniques both qualitatively and quantitatively.
LIDAR
Data fusion
Remote sensing
Feature extraction
Feature fusion
Education and training
Deep learning