DDRNet: Dual-Domain Refinement Network for Remote Sensing Image Semantic Segmentation
Published in: | IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2024, Vol. 17, pp. 20177-20189 |
---|---|
Main authors: | , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | Semantic segmentation is crucial for interpreting remote sensing images. The segmentation performance has been significantly improved recently with the development of deep learning. However, complex background samples and small objects greatly increase the challenge of the semantic segmentation task for remote sensing images. To address these challenges, we propose a dual-domain refinement network (DDRNet) for accurate segmentation. Specifically, we first propose a spatial and frequency feature reconstruction module, which separately utilizes the characteristics of the frequency and spatial domains to refine the global salient features and the fine-grained spatial features of objects. This process enhances the foreground saliency and adaptively suppresses background noise. Subsequently, we propose a feature alignment module that selectively couples the features refined from both domains via cross-attention, achieving semantic alignment between frequency and spatial domains. In addition, a meticulously designed detail-aware attention module is introduced to compensate for the loss of small objects during feature propagation. This module leverages cross-correlation matrices between high-level features and the original image to quantify the similarities among objects belonging to the same category, thereby transmitting rich semantic information from high-level features to small objects. The results on multiple datasets validate that our method outperforms the existing methods and achieves a good compromise between computational overhead and accuracy. |
---|---|
ISSN: | 1939-1404 2151-1535 |
DOI: | 10.1109/JSTARS.2024.3490584 |
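
Note on the abstract: the feature alignment module is described as coupling frequency-domain and spatial-domain features via cross-attention. The sketch below illustrates only the generic scaled dot-product cross-attention operation in plain Python. It is not the authors' implementation; the function names (`softmax`, `cross_attention`), the token values, and the choice of spatial features as queries and frequency features as keys and values are all assumptions made for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Generic scaled dot-product cross-attention (hypothetical helper).

    queries: tokens from one domain (assumed here: spatial features)
    keys, values: tokens from the other domain (assumed here: frequency features)
    Returns one output vector per query token.
    """
    d = len(queries[0])
    out = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        out.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return out


# Toy, made-up feature tokens (not taken from the paper).
spatial_tokens = [[1.0, 0.0], [0.0, 1.0]]
frequency_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

aligned = cross_attention(spatial_tokens, frequency_tokens, frequency_tokens)
```

Each row of `aligned` is a convex combination of the frequency tokens, weighted toward those most similar to the corresponding spatial token. In a real network the queries, keys, and values would normally pass through learned projections first, which this sketch omits.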