MFUR-Net: Multimodal feature fusion and unimodal feature refinement for RGB-D salient object detection
Saved in:
Published in: Knowledge-Based Systems, 2024-09, Vol. 299, Article 112022
Main authors:
Format: Article
Language: English
Subjects:
Online access: Full text
Abstract: RGB-D salient object detection aims to integrate multimodal feature information for accurate salient region localization. Despite the development of several RGB-D salient object detection models, existing methods struggle to fuse RGB and Depth features effectively and thus to exploit their complementary strengths. To address this challenge, this study introduces MFUR-Net, a network based on multimodal feature fusion and unimodal feature refinement. The contributions are threefold. First, a multimodal multilevel feature fusion module is proposed at the encoder stage to integrate multimodal and multilevel features, generating enhanced RGB-D features. Second, a multi-input feature aggregation module is introduced at the decoder stage, which incorporates the RGB and Depth feature streams into the RGB-D feature stream so that they collaborate with the RGB-D features to learn more discriminative information about the salient object. Third, a unimodal saliency feature refinement module refines the saliency feature information of each modality and eliminates redundancy before the feature streams are integrated into the decoder. Through this gradual refinement of saliency features, MFUR-Net achieves accurate saliency map prediction at the decoder stage. The method has been validated through extensive experiments on seven widely used datasets, demonstrating clear advantages over existing state-of-the-art techniques on key performance metrics. The source code is available at https://github.com/wangwei678/MFUR-Net.
• MFUR-Net is proposed for RGB-D saliency detection with novel mechanisms.
• A multimodal feature fusion module fuses RGB and depth into new RGB-D features.
• A unimodal feature refinement module refines features and reduces redundancy.
• A multi-input feature aggregation module aggregates the RGB, depth, and RGB-D features.
• MFUR-Net surpasses the state of the art on seven datasets.
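The decoder-stage aggregation described in the abstract, where the RGB and Depth streams are folded back into the fused RGB-D stream, can be pictured with a small sketch. The PyTorch snippet below is a minimal illustration and not the authors' implementation: the module name MultiInputAggregation, the channel width, the residual connection, and the channel-attention weighting are assumptions made for the example; the actual modules are available in the linked repository.

```python
# Minimal sketch (assumed design, not the paper's code) of aggregating
# RGB, Depth, and fused RGB-D feature streams at a decoder stage.
import torch
import torch.nn as nn

class MultiInputAggregation(nn.Module):
    """Hypothetical aggregation of RGB, Depth, and fused RGB-D features."""

    def __init__(self, channels: int):
        super().__init__()
        # Project the concatenated streams back to the working channel width.
        self.fuse = nn.Sequential(
            nn.Conv2d(3 * channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )
        # Simple channel attention to re-weight the fused response
        # (an assumption; the paper's refinement module may differ).
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, f_rgb, f_depth, f_rgbd):
        x = self.fuse(torch.cat([f_rgb, f_depth, f_rgbd], dim=1))
        # Residual link keeps the fused RGB-D stream dominant.
        return x * self.attn(x) + f_rgbd

# Usage example with dummy feature maps of matching shape.
if __name__ == "__main__":
    agg = MultiInputAggregation(channels=64)
    f_rgb = torch.randn(1, 64, 44, 44)
    f_depth = torch.randn(1, 64, 44, 44)
    f_rgbd = torch.randn(1, 64, 44, 44)
    out = agg(f_rgb, f_depth, f_rgbd)
    print(out.shape)  # torch.Size([1, 64, 44, 44])
```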
ISSN: 0950-7051, 1872-7409
DOI: 10.1016/j.knosys.2024.112022