In the field of remote sensing image (RSI) change detection (CD), existing methods often struggle to balance local and global features and to adapt to complex scenes. We therefore propose a bidirectional-enhanced transformer network to address these issues. In the encoding part, we introduce a bidirectional-enhanced attention operation that encodes information along both the horizontal and vertical directions, together with deep convolution to strengthen local contextual connections, thereby reducing computational complexity while improving the network's perception of global and local information. In the feature fusion part, we propose a channel weighting fusion module, which recalibrates channel-wise features to suppress noise and enhance semantic relevance. We evaluated the proposed method on two publicly available RSI CD datasets, LEVIR-CD and DSIFN-CD. Experimental results show that our model outperforms several state-of-the-art CD methods, including one based on convolution, three based on attention, and three based on transformers.
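The abstract outlines two components: an attention operation that encodes information horizontally and vertically with a convolution for local context, and a channel weighting fusion module. The sketch below illustrates these ideas under stated assumptions; the module names, tensor layouts, and the reading of "deep convolution" as a depthwise 3x3 convolution are illustrative choices, not the authors' implementation.

```python
# Minimal PyTorch sketch of the two ideas described in the abstract: axial
# (row-then-column) self-attention followed by a depthwise convolution for
# local context, and a squeeze-and-excitation-style channel weighting for
# bi-temporal feature fusion. All names and dimensions are assumptions for
# illustration only.
import torch
import torch.nn as nn


class BidirectionalAxialAttention(nn.Module):
    """Attend along image rows, then columns, then refine locally."""

    def __init__(self, channels: int, heads: int = 4):
        super().__init__()
        self.row_attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.col_attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        # Depthwise 3x3 convolution restores local contextual connections.
        self.local = nn.Conv2d(channels, channels, 3, padding=1, groups=channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, C, H, W)
        b, c, h, w = x.shape
        # Horizontal pass: each row is a sequence of W tokens.
        rows = x.permute(0, 2, 3, 1).reshape(b * h, w, c)
        rows, _ = self.row_attn(rows, rows, rows)
        x = rows.reshape(b, h, w, c).permute(0, 3, 1, 2)
        # Vertical pass: each column is a sequence of H tokens.
        cols = x.permute(0, 3, 2, 1).reshape(b * w, h, c)
        cols, _ = self.col_attn(cols, cols, cols)
        x = cols.reshape(b, w, h, c).permute(0, 3, 2, 1)
        # Residual depthwise convolution adds local context at low cost.
        return x + self.local(x)


class ChannelWeightingFusion(nn.Module):
    """Fuse bi-temporal features and recalibrate them channel-wise."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(2 * channels, 2 * channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(2 * channels // reduction, 2 * channels, 1),
            nn.Sigmoid(),
        )
        self.project = nn.Conv2d(2 * channels, channels, 1)

    def forward(self, f1: torch.Tensor, f2: torch.Tensor) -> torch.Tensor:
        fused = torch.cat([f1, f2], dim=1)   # (B, 2C, H, W)
        fused = fused * self.gate(fused)     # down-weight noisy channels
        return self.project(fused)           # (B, C, H, W)


if __name__ == "__main__":
    # Toy bi-temporal feature maps standing in for encoder outputs.
    x1 = torch.randn(2, 32, 16, 16)
    x2 = torch.randn(2, 32, 16, 16)
    attn = BidirectionalAxialAttention(32)
    fuse = ChannelWeightingFusion(32)
    out = fuse(attn(x1), attn(x2))
    print(out.shape)  # torch.Size([2, 32, 16, 16])
```

Splitting attention into a row pass and a column pass keeps the per-token sequence length at W or H rather than H·W, which is one common way such designs reduce computational complexity while retaining global context.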
Keywords: feature fusion, transformers, semantics, image fusion, feature extraction, remote sensing, buildings