Non-Local Aggregation for RGB-D Semantic Segmentation

Bibliographic Details
Published in:	IEEE Signal Processing Letters, 2021, Vol. 28, pp. 658-662
Main Authors:	Zhang, Guodong; Xue, Jing-Hao; Xie, Pengwei; Yang, Sifan; Wang, Guijin
Format:	Article
Language:	English
Subjects:	Agglomeration; Benchmark testing; Convolutional neural network; Feature extraction; Image segmentation; Interpolation; Manganese; multi-modality feature fusion; RGB-D semantic segmentation; Semantic segmentation; Semantics; Training

Description
Abstract:	Exploiting both RGB (2D appearance) and Depth (3D geometry) information can improve the performance of semantic segmentation. However, because of the inherent difference between RGB and Depth information, integrating RGB-D features effectively remains a challenging problem. In this letter, to address this issue, we propose a Non-local Aggregation Network (NANet), with a well-designed Multi-modality Non-local Aggregation Module (MNAM), to better exploit the non-local context of RGB-D features at multiple stages. Compared with most existing RGB-D semantic segmentation schemes, which exploit only local RGB-D features, the MNAM enables the aggregation of non-local RGB-D information along both spatial and channel dimensions. The proposed NANet achieves performance comparable to state-of-the-art methods on the popular RGB-D benchmarks NYUDv2 and SUN-RGBD.
ISSN:	1070-9908 1558-2361
DOI:	10.1109/LSP.2021.3066071
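
The abstract describes aggregating non-local RGB-D information along both spatial and channel dimensions. The following is a minimal NumPy sketch of that general idea only, not the authors' MNAM: it assumes the two modalities are fused by element-wise addition, uses plain dot-product affinities without the learned projections a real network would have, and all function names (spatial_nonlocal, channel_nonlocal, aggregate_rgbd) are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def spatial_nonlocal(feat):
    """Each spatial position becomes a weighted average of all positions.

    feat has shape (C, H, W). The affinity between two positions is the
    scaled dot product of their C-dimensional feature vectors.
    """
    c, h, w = feat.shape
    x = feat.reshape(c, h * w)                      # (C, N) with N = H * W
    affinity = softmax(x.T @ x / np.sqrt(c), -1)    # (N, N), rows sum to 1
    out = x @ affinity.T                            # out[:, i] = sum_j a[i, j] * x[:, j]
    return out.reshape(c, h, w)


def channel_nonlocal(feat):
    """Each channel becomes a weighted average of all channels.

    The affinity between two channels is the scaled dot product of their
    flattened (H * W)-dimensional response maps.
    """
    c, h, w = feat.shape
    x = feat.reshape(c, h * w)                          # (C, N)
    affinity = softmax(x @ x.T / np.sqrt(h * w), -1)    # (C, C), rows sum to 1
    out = affinity @ x                                  # (C, N)
    return out.reshape(c, h, w)


def aggregate_rgbd(rgb_feat, depth_feat):
    """Fuse RGB and depth features, then add both non-local contexts.

    Assumption: fusion by element-wise sum and a residual connection;
    the paper's actual fusion scheme is not given in the abstract.
    """
    fused = rgb_feat + depth_feat
    return fused + spatial_nonlocal(fused) + channel_nonlocal(fused)
```

In this sketch every output position (or channel) depends on all positions (or channels) of the fused feature map, in contrast to a convolution, which only mixes a local neighbourhood. The spatial affinity matrix has (H * W)^2 entries, so a sketch like this is only practical on downsampled feature maps.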