AGWNet: Attention-guided adaptive shuffle channel gate warped feature network for indoor scene RGB-D semantic segmentation

Bibliographic details
Published in:	Displays 2024-07, Vol.83, p.102730, Article 102730
Main authors:	Xiong, Bing; Peng, Yue; Zhu, JingKe; Gu, Jia; Chen, Zhen; Qin, Wenjian
Format:	Article
Language:	English
Subjects:	Cross-modality feature propagation; Multi-level feature alignment; RGBD segmentation
Online access:	Full text

Description
Abstract:	In recent years, depth maps have proven to be valuable complementary information for semantic segmentation of indoor scenes, because the depth sensor captures the geometric relationships of objects. However, depth maps often contain holes or sparse regions, and fusing them directly with RGB images reduces the accuracy of the segmentation model. It is therefore necessary to design an efficient complementary feature fusion module that dynamically adjusts the fusion weights of the two modalities according to the quality of the input image, so that irreparable depth defects are avoided, and that exploits cross-modality correlation to further correct depth map noise. To tackle these challenges, we propose the attention-guided adaptive channel shuffle gate and feature warp network (AGWNet) for indoor scene RGB-D semantic segmentation with low-quality depth maps. Specifically, our network efficiently captures accurate features from both the RGB and depth modalities using gating and channel fusion attention modules. Furthermore, the fused features are rectified by a multilevel feature correction and alignment module through skip layers to the decoder. Extensive quantitative and qualitative evaluations on the NYU-Depth V2 and SUN RGB-D datasets show that our model outperforms previous state-of-the-art RGB-D semantic segmentation methods.
• We propose an RGB-D indoor segmentation algorithm with adaptive attention for depth map noise correction.
• We propose channel shuffling and attention to reduce anomalous features, together with gated spatial aggregation.
• We design a pyramidal module that uses multilevel features to handle noise and depth map holes.
• Experimental results on two datasets surpass current RGB-D segmentation methods.
ISSN:	0141-9382, 1872-7387
DOI:	10.1016/j.displa.2024.102730
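
As a rough illustration of two ideas named in the abstract (channel shuffling and quality-dependent gated fusion of RGB and depth features), here is a minimal NumPy sketch. It is not the AGWNet implementation: the function names channel_shuffle and gated_fusion, the ShuffleNet-style shuffle, the per-channel sigmoid gate, and the weight and bias parameters are all assumptions made for illustration only.

```python
import numpy as np


def channel_shuffle(x, groups):
    """Interleave channels across groups (ShuffleNet-style shuffle).

    x has shape (C, H, W); C must be divisible by groups.
    Hypothetical helper, not taken from the paper.
    """
    c, h, w = x.shape
    if c % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    # Split channels into groups, swap the group and per-group axes,
    # then flatten back so neighbouring channels come from different groups.
    x = x.reshape(groups, c // groups, h, w)
    x = x.transpose(1, 0, 2, 3)
    return x.reshape(c, h, w)


def gated_fusion(rgb, depth, weight, bias):
    """Blend RGB and depth feature maps with a per-channel gate in (0, 1).

    rgb, depth: arrays of shape (C, H, W).
    weight: array of shape (C, 2C); bias: array of shape (C,).
    Both are assumed (hypothetical) learned parameters.
    """
    # Global average pooling of each modality gives a descriptor of length 2C.
    desc = np.concatenate([rgb.mean(axis=(1, 2)), depth.mean(axis=(1, 2))])
    # Sigmoid gate: values near 1 favour RGB, values near 0 favour depth.
    gate = 1.0 / (1.0 + np.exp(-(weight @ desc + bias)))
    gate = gate[:, None, None]
    return gate * rgb + (1.0 - gate) * depth


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    rgb = rng.normal(size=(4, 8, 8))
    depth = rng.normal(size=(4, 8, 8))
    weight = rng.normal(size=(4, 8)) * 0.1
    bias = np.zeros(4)
    fused = gated_fusion(channel_shuffle(rgb, 2), channel_shuffle(depth, 2), weight, bias)
    print(fused.shape)
```

With a gate of this kind, a depth map full of holes could in principle be down-weighted channel by channel instead of being fused at a fixed ratio, which is the behaviour the abstract describes at a high level; the actual module design is given only in the full text.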