Gradient Decoupled Learning With Unimodal Regularization for Multimodal Remote Sensing Classification
Published in: | IEEE Transactions on Geoscience and Remote Sensing, 2024, Vol. 62, p. 1-12 |
---|---|
Main authors: | , , , |
Format: | Article |
Language: | English |
Abstract: | The joint use of multisource remote-sensing data for Earth observation has drawn much attention due to its robust performance. Although many methods have been proposed to fuse multimodal data, they tend to improve the interaction between modalities while ignoring the optimization of each individual modality. Existing studies show that high-performance modalities suppress the learning of weak ones, leading to under-optimized multimodal learning. To address this, we propose a general framework called gradient decoupled network (GDNet) to assist multimodal remote sensing (RS) classification. GDNet guides each modality encoder in the multimodal model to learn probabilistic representations instead of deterministic ones. This helps decouple the encoders' gradients, reducing their influence on each other and encouraging each encoder to learn modality-specific information. We then introduce unimodal regularization for each modality encoder, aligning its logit output with both the multimodal logits and the label distribution. This introduces an independent gradient path for each modality encoder, accelerating its optimization while preserving modality-shared information. Finally, extensive experiments on three benchmark datasets demonstrate that the proposed GDNet effectively addresses the under-optimization problem in multimodal RS image classification. Code is available at https://github.com/shicaiwei123/TGRS-GDNet . |
---|---|
ISSN: | 0196-2892, 1558-0644 |
DOI: | 10.1109/TGRS.2024.3478393 |
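
The abstract names two mechanisms without giving formulas: probabilistic encoder outputs, and a unimodal regularizer that aligns each encoder's logits with the multimodal logits and with the labels. The sketch below is one plausible reading, not the authors' implementation (see the linked repository for that). It assumes a Gaussian representation sampled by reparameterization, a KL-divergence alignment term, and a weight `alpha`. All function names and parameters are hypothetical.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_representation(mean, log_var, rng):
    """Draw z = mean + std * eps with eps ~ N(0, 1).

    Assumption: the probabilistic representation is a diagonal Gaussian.
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mean, log_var)]


def cross_entropy(logits, label):
    """Negative log-probability assigned to the true class index."""
    return -math.log(softmax(logits)[label])


def kl_divergence(target_logits, source_logits):
    """KL(softmax(target) || softmax(source))."""
    p = softmax(target_logits)
    q = softmax(source_logits)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def unimodal_regularizer(unimodal_logits, multimodal_logits, label, alpha=1.0):
    """Label term plus alignment to the multimodal logits.

    Assumption: the two terms are summed, with alpha as a hypothetical weight.
    """
    return (cross_entropy(unimodal_logits, label)
            + alpha * kl_divergence(multimodal_logits, unimodal_logits))


# Usage with made-up numbers: one encoder's sampled feature and its loss term.
rng = random.Random(0)
z = sample_representation([0.5, -1.0], [-2.0, -2.0], rng)
loss = unimodal_regularizer([2.0, 0.1, -1.0], [2.5, 0.0, -1.5], label=0)
```

In a full model, one such term would be computed per modality encoder and added to the multimodal classification loss, which gives each encoder a gradient path that does not pass through the fusion module.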