MSEDNet: Multi-scale fusion and edge-supervised network for RGB-T salient object detection

Bibliographic details
Published in:	Neural Networks, 2024-03, Vol. 171, p. 410-422
Main authors:	Peng, Daogang; Zhou, Weiyi; Pan, Junzhen; Wang, Danhao
Format:	Article
Language:	English
Keywords:	Edge fusion loss; Multi-scale fusion; RGB-T; Salient object detection
Online access:	Full text

Description
Abstract:	RGB-T salient object detection (SOD) aims to accurately segment salient regions in both visible light images and thermal infrared images. However, most existing SOD methods neglect the critical complementarity between the two modalities, which can further improve detection accuracy. This work therefore introduces MSEDNet, an RGB-T SOD method. An encoder extracts multi-level modality features from both visible light images and thermal infrared images, and these features are then grouped into high, mid, and low levels. Three separate feature fusion modules extract the complementary information between the modalities during fusion, each applied to a specific feature level: the Edge Dilation Sharpening module for low-level features, the Spatial and Channel-Aware module for mid-level features, and the Cross-Residual Fusion module for high-level features. Finally, an edge fusion loss function is introduced for supervised learning; it extracts edge information from the different modalities and suppresses background noise. Comparative experiments demonstrate the superiority of the proposed MSEDNet over other state-of-the-art methods. The code and results can be found at the following link: https://github.com/Zhou-wy/MSEDNet.
• Proposing three novel modules for effective cross-modal information fusion.
• Presenting a novel loss function for precise refinement of local features.
• Achieving excellence in RGB-T salient object detection benchmarks with MSEDNet.
ISSN:	0893-6080, 1879-2782
DOI:	10.1016/j.neunet.2023.12.031
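
The abstract mentions edge supervision but does not give the loss formula. As a rough illustration of the general idea only, the sketch below derives an edge target from a ground-truth saliency mask and adds an edge term to a saliency term. Everything in it is an assumption made for illustration: the function names (edge_map, bce, edge_supervised_loss), the 4-neighbour boundary definition, the use of binary cross-entropy, and the edge_weight parameter. It is not the edge fusion loss defined in the paper or implemented in the linked repository.

```python
import numpy as np


def edge_map(mask: np.ndarray) -> np.ndarray:
    """Boundary of a binary mask: foreground pixels that touch background.

    Hypothetical helper; uses a 4-neighbour erosion, which is one common
    way to obtain an edge target, not necessarily the paper's.
    """
    m = mask.astype(bool)
    p = np.pad(m, 1, mode="edge")
    # A pixel survives erosion only if it and its 4 neighbours are foreground.
    eroded = (
        p[1:-1, 1:-1]
        & p[:-2, 1:-1]
        & p[2:, 1:-1]
        & p[1:-1, :-2]
        & p[1:-1, 2:]
    )
    return (m & ~eroded).astype(np.float64)


def bce(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    """Mean binary cross-entropy between probabilities and a 0/1 target."""
    pred = np.clip(np.asarray(pred, dtype=np.float64), eps, 1.0 - eps)
    target = np.asarray(target, dtype=np.float64)
    return float(np.mean(-(target * np.log(pred) + (1 - target) * np.log(1 - pred))))


def edge_supervised_loss(pred_sal, pred_edge, gt_mask, edge_weight=1.0):
    """Saliency BCE plus a weighted edge BCE (assumed form, for illustration)."""
    return bce(pred_sal, gt_mask) + edge_weight * bce(pred_edge, edge_map(gt_mask))


if __name__ == "__main__":
    gt = np.zeros((5, 5), dtype=np.uint8)
    gt[1:4, 1:4] = 1  # a 3x3 salient square
    uniform = np.full((5, 5), 0.5)  # an uninformative prediction
    print(edge_map(gt))
    print(edge_supervised_loss(uniform, uniform, gt))
```

In a sketch of this kind, the edge term penalizes predictions whose boundaries drift from the ground-truth contour, and edge_weight trades that off against the region-level saliency term.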