DSM-assisted unsupervised domain adaptive network for semantic segmentation of remote sensing imagery

Bibliographic details
Published in: IEEE Transactions on Geoscience and Remote Sensing, 2023-01, Vol. 61, p. 1-1
Main authors: Zhou, Shunping; Feng, Yuting; Li, Shengwen; Zheng, Daoyuan; Fang, Fang; Liu, Yuanyuan; Wan, Bo
Format: Article
Language: English
Description
Abstract: The semantic segmentation of high-resolution remote sensing imagery (RSI) is an essential task for many applications. Unsupervised domain adaptation (UDA), a promising unsupervised learning approach, has contributed notably to the advancement of high-resolution RSI semantic segmentation. Previous methods focus on reducing the domain shift between orthophotos and are limited because the information available in orthophotos is relatively homogeneous. This paper proposes a framework that introduces digital surface model (DSM) data into the unsupervised semantic segmentation of RSI. The proposed method combines RSI with DSM through two modules, a multipath encoder (MPE) and a multitask decoder (MTD), and aligns the global data distributions of the source and target domains with a UDA module. A refined post fusion (RPF) module is proposed for the inference phase to fully exploit the height information when refining the segmentation results. Specifically, MPE is designed to use RSI and DSM jointly to train the segmentation network; it iteratively fuses RSI and DSM features at multiple levels to enhance their feature representations. MTD is designed to produce fusion prediction maps by filtering out interfering DSM information and yielding accurate segmentation masks for both DSM and RSI. Experimental results show that the proposed method substantially improves semantic segmentation performance on high-resolution RSI and outperforms state-of-the-art methods. This paper provides a methodological reference for fusing multimodal data in various RSI-based unsupervised tasks.
ISSN: 0196-2892, 1558-0644
DOI:10.1109/TGRS.2023.3268362
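
The abstract says the multipath encoder fuses RSI and DSM features iteratively at multiple levels. The toy sketch below only illustrates that general idea on 1-D lists and is not the paper's implementation: the function names (`downsample`, `fuse`, `multipath_encode`), the pair-averaging stage, the element-wise-sum fusion rule, and the choice to feed the fused feature back into the image path are all assumptions made for illustration.

```python
from typing import List

Vector = List[float]


def downsample(x: Vector) -> Vector:
    """Stand-in for one encoder stage (assumption): average adjacent pairs."""
    return [(x[i] + x[i + 1]) / 2.0 for i in range(0, len(x) - 1, 2)]


def fuse(rsi: Vector, dsm: Vector) -> Vector:
    """Hypothetical fusion rule: element-wise sum of the two modalities."""
    if len(rsi) != len(dsm):
        raise ValueError("feature lengths must match")
    return [a + b for a, b in zip(rsi, dsm)]


def multipath_encode(rsi: Vector, dsm: Vector, levels: int = 3) -> List[Vector]:
    """Run two parallel paths and fuse them at every level.

    The fused feature replaces the image-path feature, so height
    information is carried into each deeper stage (an assumed design).
    """
    fused_per_level: List[Vector] = []
    for _ in range(levels):
        rsi = downsample(rsi)   # image path, one stage deeper
        dsm = downsample(dsm)   # height path, one stage deeper
        rsi = fuse(rsi, dsm)    # inject height features into the image path
        fused_per_level.append(rsi)
    return fused_per_level


if __name__ == "__main__":
    image_row = [1.0, 3.0, 5.0, 7.0, 2.0, 4.0, 6.0, 8.0]          # made-up RSI values
    height_row = [0.0, 0.0, 10.0, 10.0, 10.0, 10.0, 0.0, 0.0]     # made-up DSM values
    for level, feat in enumerate(multipath_encode(image_row, height_row), start=1):
        print(level, feat)
```

In a real network the stages would be learned convolutional blocks and the fusion rule would likely be learned as well; the sketch only shows the control flow of fusing two modality paths at each level.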