Semantic-Injected Bidirectional Multiscale Flow Estimation Network for Infrared and Visible Image Registration
Published in: | IEEE journal of selected topics in applied earth observations and remote sensing 2025-01, Vol.18, p.1-10 |
---|---|
Main authors: | , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | Infrared and visible image registration ensures consistency in spatial positions across different modalities. Cross-modal images contain objects at different scales and cluttered backgrounds. However, most existing image registration methods adopt the same alignment strategy for different objects, which leads to insufficient multiscale feature representation and inaccurate registration of foreground objects. To address these issues, we propose a semantic-injected bidirectional multiscale flow estimation (SI-BMFE) network for infrared and visible image registration. SI-BMFE leverages feature complementarity across different scales and employs a pretrained segmentation network to extract the spatial positions of foreground objects to improve registration accuracy. Specifically, we first design a bidirectional multiscale feature enhancement (BMFE) module to integrate feature complementarity across different scales, effectively extracting both global structures and local details. BMFE pushes the network to roughly align infrared and visible images. Then, the semantic-injected flow estimation (SFE) module is introduced to estimate multi-level deformation fields for fine-grained registration of different objects. SFE utilizes a pretrained segmentation network to obtain spatial location information of foreground objects. Object location cues help the network distinguish different foreground objects from the background and focus on them. SFE exploits semantic knowledge to promote fine alignment of different foreground objects and improve the accuracy of cross-modal image registration. Extensive experiments demonstrate that our proposed method outperforms state-of-the-art registration networks on both the MSRS and RoadScene infrared and visible image registration datasets. |
---|---|
ISSN: | 1939-1404, 2151-1535 |
DOI: | 10.1109/JSTARS.2025.3527175 |
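
The abstract refers to estimated deformation fields used to align one image to another. As a rough illustration of what applying such a field means, the sketch below warps a small grayscale image with a dense per-pixel offset field using bilinear sampling. It is not the SI-BMFE network or the authors' code: the function name `warp_with_flow`, the `(dx, dy)` offset convention, and the border clamping are all assumptions made for this example.

```python
import math


def warp_with_flow(image, flow):
    """Warp a 2-D image with a dense deformation field (illustrative only).

    image: list of rows of floats, shape H x W.
    flow:  H x W grid of (dx, dy) pixel offsets. Assumed convention:
           output pixel (x, y) samples the input at (x + dx, y + dy).
    """
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = flow[y][x]
            # Clamp the sampling position to the image border.
            sx = min(max(x + dx, 0.0), w - 1.0)
            sy = min(max(y + dy, 0.0), h - 1.0)
            x0, y0 = int(math.floor(sx)), int(math.floor(sy))
            x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
            wx, wy = sx - x0, sy - y0
            # Bilinear interpolation between the four neighbouring pixels.
            top = (1 - wx) * image[y0][x0] + wx * image[y0][x1]
            bottom = (1 - wx) * image[y1][x0] + wx * image[y1][x1]
            out[y][x] = (1 - wy) * top + wy * bottom
    return out


# Usage: shift a 4 x 4 ramp image one pixel to the left.
img = [[float(4 * r + c) for c in range(4)] for r in range(4)]
shift = [[(1.0, 0.0)] * 4 for _ in range(4)]
warped = warp_with_flow(img, shift)
```

In a learned registration pipeline the offsets would come from the network rather than being set by hand, and the warp would typically be a differentiable operation inside a deep learning framework.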