BCLNet: Boundary contrastive learning with gated attention feature fusion and multi-branch spatial-channel reconstruction for land use classification



Bibliographic Details
Published in: Knowledge-Based Systems, 2024-10, Vol. 302, Article 112387
Authors: Yue, Chenke; Zhang, Yin; Yan, Junhua; Luo, Zhaolong; Liu, Yong; Guo, Pengyu
Format: Article
Language: English
Online access: Full text
Description
Abstract: The fusion of optical and synthetic aperture radar (SAR) images is not only a crucial method for enhancing land use classification but also a fundamental basis for interpreting multimodal remote sensing imagery. However, the visual differences caused by the distinct nonlinear radiometric properties of optical and SAR images make it difficult for current land use classification methods to integrate multimodal features effectively, resulting in lower classification accuracy. On the one hand, these approaches may be affected by the heterogeneity among the source images and thus fail to fully exploit the semantic content of the fused features. On the other hand, the semantic embedding space does not consider the semantic relationships between pixels from a global perspective (i.e., across the entire dataset). To address these challenges, we propose the boundary contrastive learning classification network (BCLNet). Within this framework, we introduce the gated attention fusion (GAF) module, which selectively weighs distinctive and consistent features in optical and SAR imagery to enable effective fusion of modal representations. We further introduce the multi-branch spatial-channel reconstruction (MSCR) module, which augments the fused features in both the spatial and channel dimensions through a dual-dimensional feature reconstruction and selection mechanism. Finally, we adopt boundary contrastive learning to address the global pixel semantic space embedding problem. Our method is evaluated on the WHU-OPT-SAR dataset and on a multi-class scene classification dataset (MSD) that we constructed. Compared to state-of-the-art methods, our approach improves pixel classification overall accuracy (OA) and mean intersection over union (mIoU).
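The abstract describes the GAF module only at a high level: a gate selectively weighs the optical and SAR feature maps before fusion. As a minimal illustrative sketch (not the paper's implementation), one common form of such gated fusion computes a per-channel sigmoid gate from the concatenated features and blends the two modalities as a convex combination; the function name `gated_fusion` and the shapes below are hypothetical, chosen only to make the idea concrete.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(f_opt, f_sar, w, b):
    """Gated fusion of optical and SAR feature maps (illustrative sketch).

    g = sigmoid(W [f_opt; f_sar] + b)
    fused = g * f_opt + (1 - g) * f_sar

    Shapes: f_opt, f_sar: (C, H, W); w: (C, 2C); b: (C,)
    """
    # Stack both modalities along the channel axis: (2C, H, W)
    stacked = np.concatenate([f_opt, f_sar], axis=0)
    # A 1x1 convolution over channels, written as a channel-mixing einsum
    g = sigmoid(np.einsum('oc,chw->ohw', w, stacked) + b[:, None, None])
    # Gate in (0, 1): each output pixel is a convex blend of the two inputs
    return g * f_opt + (1.0 - g) * f_sar
```

With zero weights and bias the gate is 0.5 everywhere, so the fused map reduces to the plain average of the two modalities; learned weights let the network lean toward whichever modality is more informative at each location.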
ISSN: 0950-7051
DOI: 10.1016/j.knosys.2024.112387