Change Masked Modality Alignment Network for Multimodal Change Detection

Bibliographic details
Published in: IEEE Transactions on Geoscience and Remote Sensing, 2025, Vol. 63, p. 1-16
Main authors: Jiang, Fenlong; Huang, Bo; Wu, Husheng; Feng, Dan; Zhou, Yu; Zhang, Mingyang; Gong, Maoguo; Zhao, Wei; Guan, Ziyu
Format: Article
Language: English
Description
Summary: Using multimodal remote sensing images for change detection (CD) can significantly improve the feasibility and reliability of CD in challenging environments. However, differences in imaging mechanisms make multimodal images highly heterogeneous. A key challenge for multimodal CD (MCD) is that the heterogeneity of the modalities and changes in ground objects are intertwined during processing. To address this issue, this article proposes a change masked modality alignment network (CMMAN), which uses a multitask framework consisting of one CD branch and two image modal transformation (IMT) branches. Specifically, to ensure a unified feature space, the bi-temporal multimodal images are first input into the same Swin-Transformer-based encoder. The extracted features are then fed simultaneously into the CD branch and separately into the two IMT branches. In the CD branch, the decoder is also based on the Swin-Transformer, and a weakly modality-correlated feature enhancement (WMCFE) module is introduced to mitigate the interference of modality heterogeneity with CD. Both IMT branches employ a generative adversarial network (GAN) to transform between modalities, and the feature distributions of the different modalities are aligned through simultaneous optimization. Uniquely, the change probability map predicted by the CD branch is used to mask the change regions in IMT, further decoupling ground object changes from modal heterogeneity. Experimental results on multiple public datasets demonstrate that the proposed CMMAN significantly improves MCD performance and shows good compatibility and portability with various common backbone networks.
ISSN: 0196-2892, 1558-0644
DOI: 10.1109/TGRS.2024.3516001
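
The change-masking idea in the summary (using the CD branch's change probability map to exclude changed regions from the modality transformation objective) can be illustrated with a small sketch. This is not the authors' implementation: the summary does not give the loss, so the function name `change_masked_l1`, the `(1 - change_prob)` weighting, and the use of a plain L1 term in place of the GAN objective are all assumptions made for illustration.

```python
import numpy as np

def change_masked_l1(translated, target, change_prob, eps=1e-8):
    """Hypothetical masked translation loss (an assumption, not the paper's loss).

    translated, target: (H, W, C) images in the same modality.
    change_prob: (H, W) change probabilities in [0, 1] from a CD branch.
    Pixels that are likely changed get a weight close to 0, so the
    translation term is driven mainly by unchanged regions.
    """
    weight = 1.0 - np.clip(change_prob, 0.0, 1.0)           # ~1 where unchanged
    per_pixel = np.abs(translated - target).mean(axis=-1)   # (H, W) L1 error
    return float((weight * per_pixel).sum() / (weight.sum() + eps))

# Toy data: the top half of the scene has genuinely changed.
rng = np.random.default_rng(0)
target = rng.random((4, 4, 3))
translated = target.copy()
translated[:2] += 1.0                 # large discrepancy caused by real change

change_prob = np.zeros((4, 4))
change_prob[:2] = 1.0                 # CD branch flags the top half as changed

masked = change_masked_l1(translated, target, change_prob)
unmasked = change_masked_l1(translated, target, np.zeros((4, 4)))
print(masked, unmasked)
```

In this toy case the masked loss ignores the changed half entirely, while the unmasked loss is penalized for it. That is the intuition behind decoupling ground object changes from modal heterogeneity: without the mask, the translation branch would be pushed to explain real changes as modality differences.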