Enabling modality interactions for RGB-T salient object detection

Bibliographic Details
Published in: Computer Vision and Image Understanding, 2022-09, Vol. 222, p. 103514, Article 103514
Authors: Zhang, Qiang; Xi, Ruida; Xiao, Tonglin; Huang, Nianchang; Luo, Yongjiang
Format: Article
Language: English
Online Access: Full text
Description
Abstract: Most existing RGB and thermal (RGB-T) salient object detection (SOD) techniques focus on devising multi-modality feature fusion strategies to capture cross-modality complementary information within RGB and thermal images. However, most of these strategies do not explicitly extract the interactions among the features of different modalities, leading to insufficient exploitation of cross-modality complementary information. In this paper, we propose a novel RGB-T SOD model that alleviates this issue by leveraging a modality-aware and scale-aware feature fusion module. Such a module captures the cross-modality complementary information by exploiting the interactions of single-modality features across modalities and the interactions of multi-modality features across scales. A stage-wise feature aggregation module is also proposed to thoroughly exploit the cross-level complementary information and reduce its redundancy for generating accurate saliency maps with sharp boundaries. To this end, a novel multi-level feature aggregation structure with two types of feature aggregation nodes is employed. Experimental results on several benchmark datasets verify the effectiveness and superiority of our proposed model over state-of-the-art models.

Highlights:
• Exploring the interactions of features across modalities and scales for RGB-T SOD.
• Proposing a novel fusion module to capture multi-modality complementary information.
• Designing a new aggregation module to integrate multi-level complementary information.
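To make the abstract's notion of "interactions of single-modality features across modalities" concrete, the following is a minimal, hypothetical PyTorch sketch of one possible cross-modality interaction block. It is not the authors' implementation: the class name CrossModalityInteraction, the channel-attention formulation, and the fusion convolution are illustrative assumptions, shown only to indicate how RGB features can explicitly modulate thermal features and vice versa before fusion.

```python
# Hypothetical sketch (not the paper's released code): each modality derives
# channel-attention weights that re-weight the *other* modality, so the two
# feature streams interact explicitly before being fused.
import torch
import torch.nn as nn


class CrossModalityInteraction(nn.Module):
    """Exchanges channel-attention signals between RGB and thermal features."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        # Attention predicted from RGB, applied to thermal features.
        self.rgb_to_t = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        # Attention predicted from thermal, applied to RGB features.
        self.t_to_rgb = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1)

    def forward(self, f_rgb: torch.Tensor, f_t: torch.Tensor) -> torch.Tensor:
        att_for_t = self.rgb_to_t(self.pool(f_rgb))
        att_for_rgb = self.t_to_rgb(self.pool(f_t))
        # Residual re-weighting keeps the original single-modality features.
        f_t_enh = f_t * att_for_t + f_t
        f_rgb_enh = f_rgb * att_for_rgb + f_rgb
        return self.fuse(torch.cat([f_rgb_enh, f_t_enh], dim=1))


if __name__ == "__main__":
    block = CrossModalityInteraction(channels=64)
    rgb_feat = torch.randn(1, 64, 56, 56)      # single-scale RGB feature map
    thermal_feat = torch.randn(1, 64, 56, 56)  # matching thermal feature map
    fused = block(rgb_feat, thermal_feat)
    print(fused.shape)  # torch.Size([1, 64, 56, 56])
```

In a full RGB-T SOD pipeline, a block of this kind would typically be applied at each backbone scale, with the resulting multi-scale fused features then aggregated across levels, which is the role the abstract assigns to the stage-wise feature aggregation module.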
ISSN: 1077-3142, 1090-235X
DOI: 10.1016/j.cviu.2022.103514