Multimodal Remote Sensing Image Matching via Learning Features and Attention Mechanism

Bibliographic Details
Published in: IEEE Transactions on Geoscience and Remote Sensing, 2024-01, Vol. 62, p. 1-1
Authors: Zhang, Yongxian; Lan, Chaozhen; Zhang, Haiming; Ma, Guorui; Li, Heng
Format: Article
Language: English
Description
Abstract: Matching multimodal remote sensing images remains an ongoing challenge due to significant nonlinear radiometric differences and geometric distortions, which lead to one-to-many matches or mismatches. To tackle this challenge, we propose a novel approach for multimodal remote sensing image matching called modality-independent consistency matching (MICM), which leverages deep convolutional neural networks and the Transformer attention mechanism to improve matching performance. The proposed MICM method consists of three key steps. First, a UNet-like feature extraction backbone network learns multiscale invariant features from multimodal remote sensing images, enabling the extraction of rich and evenly distributed feature keypoints. Second, a hybrid approach combining local learned features with the Transformer attention mechanism aggregates the features, capturing fine detail while modeling long-range dependencies to enhance their representational ability. Third, a feature consistency correlation strategy maximizes the number of correct corresponding feature points, ensuring reliable matching performance. The proposed method has been extensively evaluated on both same-scene and different-scene multimodal remote sensing images captured from various imaging modes, wavebands, and platforms. The results show superior matching performance compared to commonly used and state-of-the-art handcrafted and learning-based methods on both same-scene and different-scene datasets. The proposed method serves as a valuable reference for addressing common challenges in multimodal remote sensing image matching.
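
Since this record does not include the paper's implementation, the following is a minimal sketch of the three-stage pipeline the abstract describes, written in PyTorch. All names (TinyUNetBackbone, AttentionAggregator, mutual_nn_match), layer sizes, and the mutual-nearest-neighbour consistency check are illustrative assumptions standing in for the actual MICM components, not the authors' code.

```python
# Hypothetical sketch of the MICM-style three-stage pipeline; the actual
# architecture details are not given in the abstract.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyUNetBackbone(nn.Module):
    """Stage 1 (assumed form): UNet-like encoder-decoder yielding a dense
    descriptor map and a keypoint score map for multiscale features."""
    def __init__(self, dim=64):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(1, dim, 3, padding=1), nn.ReLU())
        self.enc2 = nn.Sequential(nn.Conv2d(dim, dim * 2, 3, stride=2, padding=1), nn.ReLU())
        self.dec = nn.Sequential(nn.Conv2d(dim * 2, dim, 3, padding=1), nn.ReLU())
        self.score = nn.Conv2d(dim, 1, 1)   # keypoint score head
        self.desc = nn.Conv2d(dim, dim, 1)  # descriptor head

    def forward(self, x):
        f1 = self.enc1(x)
        f2 = self.dec(self.enc2(f1))
        f2 = F.interpolate(f2, size=f1.shape[-2:], mode="bilinear", align_corners=False)
        f = f1 + f2                          # UNet-style skip connection
        return torch.sigmoid(self.score(f)), F.normalize(self.desc(f), dim=1)

def top_keypoints(score, desc, k=256):
    """Pick the k highest-scoring locations and gather their descriptors."""
    b = score.shape[0]
    idx = score.view(b, -1).topk(k, dim=1).indices          # (b, k)
    d = desc.flatten(2).transpose(1, 2)                     # (b, h*w, c)
    return torch.gather(d, 1, idx.unsqueeze(-1).expand(-1, -1, d.shape[-1]))

class AttentionAggregator(nn.Module):
    """Stage 2 (assumed form): Transformer self-attention over the local
    descriptors, adding long-range context to the locally learned features."""
    def __init__(self, dim=64, heads=4, layers=2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(dim, heads, dim * 2, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, layers)

    def forward(self, tokens):
        return F.normalize(self.encoder(tokens), dim=-1)

def mutual_nn_match(da, db):
    """Stage 3 (assumed proxy): keep pairs that are mutual nearest neighbours
    in descriptor space, a simple stand-in for the consistency correlation."""
    sim = da @ db.transpose(-1, -2)          # cosine similarity (unit-norm descriptors)
    ab = sim.argmax(-1)                      # best match in b for each point in a
    ba = sim.argmax(-2)                      # best match in a for each point in b
    ia = torch.arange(da.shape[1], device=da.device)
    keep = ba.gather(1, ab) == ia            # round-trip (one-to-one) consistency
    return [(i.item(), j.item()) for i, j in zip(ia[keep[0]], ab[0][keep[0]])]

if __name__ == "__main__":
    optical = torch.rand(1, 1, 128, 128)     # stand-in optical patch
    sar = torch.rand(1, 1, 128, 128)         # stand-in SAR patch
    backbone, agg = TinyUNetBackbone(), AttentionAggregator()
    sa, da = backbone(optical)
    sb, db = backbone(sar)
    ta = agg(top_keypoints(sa, da))
    tb = agg(top_keypoints(sb, db))
    print(f"{len(mutual_nn_match(ta, tb))} mutually consistent matches")
```

The mutual-nearest-neighbour step illustrates why a consistency check suppresses the one-to-many matches mentioned in the abstract: a pair survives only if each point is the other's best match, which enforces one-to-one correspondence.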
ISSN: 0196-2892, 1558-0644
DOI: 10.1109/TGRS.2023.3348980