Intra-modality Self-enhancement Mirror Network for RGB-T Salient Object Detection

Bibliographic details
Published in:	IEEE Transactions on Circuits and Systems for Video Technology, 2024-10, p. 1-1
Main authors:	Wang, Jie; Li, Guoqiang; Yu, Hongjie; Xi, Jinwen; Shi, Jie; Wu, Xueying
Format:	Article
Language:	English
Subjects:	Circuits and systems; Cross-scale fusion; Decoding; Feature extraction; Imaging; Interpolation; Intra-modality self-enhancement; Mirrors; Object detection; RGB-T images; Salient object detection; Sensitivity; Sensors; Thermal sensors

Description
Abstract:	The inherent imaging properties of sensors result in two distinct differences between the data from the two modalities in RGB-T Salient Object Detection (SOD) tasks: differences in imaging effectiveness due to varying sensitivities to specific scenes, and fundamental domain differences resulting from differences in reflecting scene characteristics. Existing methods primarily focus on pursuing unique cross-modal fusion designs to enhance model performance. However, not only do direct cross-modal fusion modes fail to improve the effectiveness of original features, but intricate cross-modal fusion designs also increase the domain differences between modalities, resulting in suboptimal performance. Therefore, in this paper, we no longer insist on pursuing unique cross-modal fusion designs but instead consider how to enhance the effectiveness of original features within modalities (mitigating differences in imaging effectiveness) and use a concise cross-modal fusion mechanism (alleviating the impact of domain differences) to achieve satisfactory performance. In this spirit, we propose the Intra-modality Self-enhancement Mirror Network (ISMNet) for RGB-T salient object detection. The core of ISMNet is the proposed Intra-modality Cross-scale Self-enhancement Module (ICSM). The main insight of ICSM is to exploit saliency clues by modeling the correlation between intra-modality cross-scale features (which exhibit strong correlations and small domain differences), thereby enhancing the effectiveness of original multi-scale features within modalities. We employ the proposed novel paradigm to mirror-expand existing typical paradigms to obtain a more robust model architecture. Extensive experiments demonstrate that our proposed new architecture and the introduced universal Intra-modality Cross-scale Self-enhancement Module improve the effectiveness of original features and help achieve state-of-the-art performance.
ISSN:	1051-8215 1558-2205
DOI:	10.1109/TCSVT.2024.3489440