Transformer framework for depth-assisted UDA semantic segmentation
Published in: Engineering Applications of Artificial Intelligence, 2024-11, Vol. 137, Article 109206
Main authors: , , , , , , , , ,
Format: Article
Language: English
Online access: Full text
Abstract: Unsupervised domain adaptation (UDA) plays a crucial role in transferring models trained on synthetic datasets to real-world datasets. In semantic segmentation, UDA can alleviate the need for large numbers of dense semantic annotations. Some UDA semantic segmentation approaches have already leveraged depth information to enhance semantic features and improve segmentation accuracy. Building on this, we introduce a UDA multitask Transformer framework called Multi-former. Multi-former contains a semantic-segmentation network and a depth-estimation network; the depth-estimation network extracts more informative depth features to estimate depth and to assist semantic segmentation. In addition, considering the imbalanced class pixel distributions in the source domain, we present a rare class mix strategy (RCM) to balance domain adaptability across all classes. To further enhance UDA semantic segmentation performance, we design a mixed label loss weight strategy (MLW), which employs different types of weights to comprehensively exploit the features of pseudo-labels. Experimental results demonstrate the effectiveness of the proposed approach, which achieves the best mean intersection over union (mIoU) of 56.1% and 76.3% on the two synthetic-to-real UDA benchmark tasks, respectively. The code and models are available at https://github.com/fz-ss/Multi-former.
ISSN: 0952-1976
DOI: 10.1016/j.engappai.2024.109206
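
The abstract above mentions a rare class mix strategy (RCM) for balancing domain adaptation across classes, but the record does not describe its mechanics. Below is a minimal PyTorch sketch of one plausible reading: ClassMix-style cross-domain mixing in which the pasted classes are sampled with probability inversely related to their source-domain pixel frequency. The function name `rare_class_mix`, its parameters, and the frequency-based sampling rule are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch, not the authors' code: rare-class-biased cross-domain mixing.
# All names, the sampling temperature, and the way pseudo-labels are mixed are
# assumptions made for illustration only.
import torch


def rare_class_mix(src_img, src_lbl, tgt_img, tgt_pseudo_lbl,
                   class_pixel_freq, num_mix_classes=3, temperature=0.5):
    """Paste regions of (mostly rare) source classes onto a target sample.

    src_img, tgt_img:   (C, H, W) float tensors
    src_lbl:            (H, W) long tensor of source ground-truth class ids
    tgt_pseudo_lbl:     (H, W) long tensor of target pseudo-labels
    class_pixel_freq:   dict {class_id: pixel count over the source domain}
    Returns the mixed image and the mixed (pseudo-)label.
    """
    classes = torch.unique(src_lbl)
    # Sampling weight ~ freq^(-temperature): rarer classes are picked more often.
    freqs = torch.tensor([class_pixel_freq.get(int(c), 1) for c in classes],
                         dtype=torch.float)
    weights = freqs.pow(-temperature)
    k = min(num_mix_classes, len(classes))
    picked = classes[torch.multinomial(weights, k, replacement=False)]

    # Binary mask of the pixels belonging to the sampled classes.
    mask = torch.zeros_like(src_lbl, dtype=torch.bool)
    for c in picked:
        mask |= src_lbl == c

    # Copy masked source pixels and labels onto the target image/pseudo-label.
    mixed_img = torch.where(mask.unsqueeze(0), src_img, tgt_img)
    mixed_lbl = torch.where(mask, src_lbl, tgt_pseudo_lbl)
    return mixed_img, mixed_lbl
```

In a self-training UDA pipeline, such a mixed image and label would typically feed the student network's segmentation loss; the paper's mixed label loss weight strategy (MLW) presumably weights that pseudo-label term, but its exact form is not given in this record.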