Boosting RGB-D salient object detection with adaptively cooperative dynamic fusion network

Bibliographic Details
Published in:	Knowledge-based systems, 2022-09, Vol. 251, p. 109205, Article 109205
Main Authors:	Zhu, Jinchao; Zhang, Xiaoyu; Fang, Xian; Rahman, Muhammad Rameez Ur; Dong, Feng; Li, Yuehua; Yan, Siyu; Tan, Panlong
Format:	Article
Language:	English
Subjects:	Dilated convolution; Early fusion and late fusion; Gated mechanism; RGB-D salient object detection

Description
Summary:	The appropriate use of RGB and depth data is of great significance in promoting the development of computer vision tasks and robot-environment interactions. However, early fusion and late fusion of the two types of data each have their own advantages and disadvantages. In addition, due to the diversity of object information, using a single type of data in a specific scenario can be semantically misleading. Based on these considerations, we propose a transformer-based adaptively cooperative dynamic fusion network (ACDNet) with a dynamic composite structure (DCS) for salient object detection. This structure is designed to flexibly exploit the advantages of feature fusion at different stages. In addition, an adaptively cooperative semantic guidance (ACG) scheme is designed to suppress inaccurate features in multilevel multimodal feature fusion. Furthermore, we propose a perceptual aggregation module (PAM) to optimize the network from the perspectives of spatial perception and scale perception, which strengthens the network's ability to perceive multiscale objects. Extensive experiments conducted on 8 RGB-D SOD datasets show that the proposed network outperforms 24 state-of-the-art algorithms.
ISSN:	0950-7051, 1872-7409
DOI:	10.1016/j.knosys.2022.109205