DSAFuse: Infrared and visible image fusion via dual-branch spatial adaptive feature extraction
Published in: Neurocomputing (Amsterdam), 2025-02, Vol. 616, p. 128957, Article 128957
Format: Article
Language: English
Online access: Full text
Abstract: By exploiting the thermal radiation information from infrared images and the detailed texture information from visible images, image fusion technology enables more accurate target identification. However, most current image fusion methods rely primarily on convolutional neural networks for cross-modal local feature extraction and do not fully exploit long-range contextual information, which limits their performance in complex scenarios. To address this issue, this paper proposes an infrared and visible image fusion (IVIF) method termed DSAFuse, based on dual-branch spatially adaptive feature extraction. Specifically, a unimodal feature mixing module performs multi-scale spatially adaptive feature extraction on both modal images with shared weights. The extracted features are then fed into a dual-branch feature extraction module comprising flatten transformer blocks and vanilla blocks, which extract low-frequency texture features and high-frequency local detail features, respectively. Features from both modalities are then concatenated, and a bimodal feature mixing module reconstructs the fused image to produce semantically rich fusion results. Additionally, to enable end-to-end unsupervised training, a loss function consisting of a decomposition loss, a gradient loss, and a structural similarity loss is designed. Qualitative and quantitative experimental results demonstrate that DSAFuse outperforms state-of-the-art IVIF methods across various benchmark datasets: it effectively preserves the texture details and target features of the source images, produces satisfactory fusion results even in harsh environments, and enhances downstream visual tasks.
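The data flow described in the abstract (shared-weight unimodal extraction, a dual branch separating long-range context from local detail, concatenation, and bimodal reconstruction) can be summarized in code. Below is a minimal PyTorch sketch of that flow only; every module name, channel width, and in particular the cheap channel-attention gate standing in for the flatten transformer blocks is an illustrative assumption, not the paper's implementation.

```python
import torch
import torch.nn as nn

class ConvBlock(nn.Module):
    """Plain conv -> norm -> ReLU unit used throughout this sketch."""
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, k, padding=k // 2),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.body(x)

class DualBranch(nn.Module):
    """Stand-in for the paper's dual-branch extractor: a local branch of
    vanilla convolutions (high-frequency detail) plus a global gate built
    from pooling and a 1x1 conv (a crude proxy for the long-range context
    of the flatten transformer blocks, NOT their actual mechanism)."""
    def __init__(self, ch):
        super().__init__()
        self.local = nn.Sequential(ConvBlock(ch, ch), ConvBlock(ch, ch))
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(ch, ch, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        # Sum of local detail features and globally re-weighted features.
        return self.local(x) + x * self.gate(x)

class DSAFuseSketch(nn.Module):
    """End-to-end skeleton: shared unimodal encoder -> dual branch ->
    concatenation -> bimodal mixing/reconstruction head. Channel widths
    and layer counts are hypothetical."""
    def __init__(self, ch=32):
        super().__init__()
        self.encoder = ConvBlock(1, ch)   # shared weights for IR and VIS
        self.branch = DualBranch(ch)
        self.fuse = nn.Sequential(
            ConvBlock(2 * ch, ch),
            nn.Conv2d(ch, 1, 3, padding=1),
            nn.Sigmoid(),                 # fused image constrained to [0, 1]
        )

    def forward(self, ir, vis):
        f_ir = self.branch(self.encoder(ir))    # the same encoder is applied
        f_vis = self.branch(self.encoder(vis))  # to both modalities
        return self.fuse(torch.cat([f_ir, f_vis], dim=1))

model = DSAFuseSketch()
fused = model(torch.rand(1, 1, 128, 128), torch.rand(1, 1, 128, 128))
print(fused.shape)  # torch.Size([1, 1, 128, 128])
```

Applying one encoder to both inputs mirrors the weight sharing of the abstract's unimodal feature mixing module; the actual DSAFuse modules are multi-scale and spatially adaptive, which this sketch does not reproduce.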
Highlights:
• We propose a dual-branch feature extraction module for low- and high-frequency features.
• We design a spatially adaptive fusion module with multi-scale and attention mechanisms.
• We craft a loss function enabling the model to generate images with clear targets (a sketch of such a loss follows below).
• Experiments validate the superiority of our method in fusion and downstream tasks.
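For the unsupervised loss mentioned in the abstract and highlights, the sketch below shows plausible gradient and structural-similarity terms only. The Sobel-based gradient target (element-wise maximum of the source gradients), the whole-image SSIM simplification, and the weights w_grad and w_ssim are all assumptions; the paper's decomposition loss is omitted because its exact form is not given here.

```python
import torch
import torch.nn.functional as F

def sobel_grad(img):
    """Gradient magnitude of a 1-channel image batch via fixed Sobel kernels."""
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]],
                      device=img.device).view(1, 1, 3, 3)
    ky = kx.transpose(2, 3)  # Sobel-y is the transpose of Sobel-x
    gx = F.conv2d(img, kx, padding=1)
    gy = F.conv2d(img, ky, padding=1)
    return torch.sqrt(gx ** 2 + gy ** 2 + 1e-6)

def global_ssim(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    """Whole-image SSIM (no sliding window) -- a simplification of the
    windowed SSIM typically used in fusion losses."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

def fusion_loss(fused, ir, vis, w_grad=10.0, w_ssim=1.0):
    """Hypothetical composite loss: keep the strongest source gradients
    (texture/detail) and stay structurally similar to both sources.
    The paper's decomposition loss term is intentionally omitted."""
    l_grad = F.l1_loss(sobel_grad(fused),
                       torch.max(sobel_grad(ir), sobel_grad(vis)))
    l_ssim = (1 - global_ssim(fused, ir)) + (1 - global_ssim(fused, vis))
    return w_grad * l_grad + w_ssim * l_ssim

# Toy usage: the average of the two sources serves as a stand-in fused image.
ir, vis = torch.rand(1, 1, 64, 64), torch.rand(1, 1, 64, 64)
print(fusion_loss((ir + vis) / 2, ir, vis).item())
```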
ISSN: 0925-2312
DOI: 10.1016/j.neucom.2024.128957