Transformer framework for depth-assisted UDA semantic segmentation
Published in: Engineering Applications of Artificial Intelligence, 2024-11, Vol. 137, Article 109206
Main authors: , , , , , , , , ,
Format: Article
Language: English
Online access: Full text
Abstract: Unsupervised domain adaptation (UDA) plays a crucial role in transferring models trained on synthetic datasets to real-world datasets. In semantic segmentation, UDA can alleviate the need for large numbers of dense semantic annotations. Some UDA semantic segmentation approaches have already leveraged depth information to enhance semantic features and improve segmentation accuracy. Building on this, we introduce a UDA multitask Transformer framework called Multi-former. Multi-former contains a semantic-segmentation network and a depth-estimation network; the depth-estimation network extracts more informative depth features to estimate depth and to assist semantic segmentation. In addition, considering the imbalanced class pixel distributions in the source domain, we present a rare class mix strategy (RCM) to balance domain adaptability across all classes. To further enhance UDA semantic segmentation performance, we design a mixed label loss weight strategy (MLW), which employs different types of weights to comprehensively exploit the features of pseudo-labels. Experimental results demonstrate the effectiveness of the proposed approach, which achieves the best mean intersection over union (mIoU) of 56.1% and 76.3% on the two synthetic-to-real UDA benchmark tasks, respectively. The code and models are available at https://github.com/fz-ss/Multi-former.
ISSN: 0952-1976
DOI: 10.1016/j.engappai.2024.109206
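
The abstract above mentions a rare class mix strategy (RCM) for balancing domain adaptation across classes, but the record does not describe its mechanics. Below is a minimal PyTorch sketch of one plausible reading: ClassMix-style cross-domain mixing in which the pasted classes are sampled with probability inversely related to their source-domain pixel frequency. The function name `rare_class_mix`, its parameters, and the frequency-based sampling rule are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch, not the authors' code: rare-class-biased cross-domain mixing.
# All names, the sampling temperature, and the way pseudo-labels are mixed are
# assumptions made for illustration only.
import torch


def rare_class_mix(src_img, src_lbl, tgt_img, tgt_pseudo_lbl,
                   class_pixel_freq, num_mix_classes=3, temperature=0.5):
    """Paste regions of (mostly rare) source classes onto a target sample.

    src_img, tgt_img:   (C, H, W) float tensors
    src_lbl:            (H, W) long tensor of source ground-truth class ids
    tgt_pseudo_lbl:     (H, W) long tensor of target pseudo-labels
    class_pixel_freq:   dict {class_id: pixel count over the source domain}
    Returns the mixed image and the mixed (pseudo-)label.
    """
    classes = torch.unique(src_lbl)
    # Sampling weight ~ freq^(-temperature): rarer classes are picked more often.
    freqs = torch.tensor([class_pixel_freq.get(int(c), 1) for c in classes],
                         dtype=torch.float)
    weights = freqs.pow(-temperature)
    k = min(num_mix_classes, len(classes))
    picked = classes[torch.multinomial(weights, k, replacement=False)]

    # Binary mask of the pixels belonging to the sampled classes.
    mask = torch.zeros_like(src_lbl, dtype=torch.bool)
    for c in picked:
        mask |= src_lbl == c

    # Copy masked source pixels and labels onto the target image/pseudo-label.
    mixed_img = torch.where(mask.unsqueeze(0), src_img, tgt_img)
    mixed_lbl = torch.where(mask, src_lbl, tgt_pseudo_lbl)
    return mixed_img, mixed_lbl
```

In a self-training UDA pipeline, such a mixed image and label would typically feed the student network's segmentation loss; the paper's mixed label loss weight strategy (MLW) presumably weights that pseudo-label term, but its exact form is not given in this record.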