HTMatch: An efficient hybrid transformer based graph neural network for local feature matching

Detailed description

Bibliographic details
Published in:	Signal processing 2023-03, Vol.204, p.108859, Article 108859
Main authors:	Cai, Youcheng; Li, Lin; Wang, Dong; Li, Xinjie; Liu, Xiaoping
Format:	Article
Language:	English
Subjects:	Feature matching; Graph neural network; Local feature; Transformer
Online access:	Full text

Description
Highlights:
•We present a new spatial embedding module that incorporates two-view geometric constraints into the graph for better feature matching.
•The proposed HTMatch achieves state-of-the-art performance on several datasets and maintains high efficiency with low runtime and memory consumption.
Summary:	Local feature matching plays a vital role in various computer vision tasks. In this work, we present a novel network that combines feature matching and outlier rejection to find reliable correspondences between image pairs. The proposed method is a hybrid transformer-based graph neural network (GNN), termed HTMatch, which aims at accurate and efficient feature matching. Specifically, we first propose a hybrid transformer that integrates self- and cross-attention to condition the feature descriptors between image pairs. As a result, intra- and inter-graph attentional aggregation can be realized by a single transformer layer, which makes message passing more efficient. Then, we introduce a new spatial embedding module to enhance the spatial constraints across images. The spatial information from one image is embedded into the other, which can significantly improve matching performance. Finally, we adopt a seeded GNN architecture to establish a sparse graph, which improves both efficiency and effectiveness. Experiments show that HTMatch reaches state-of-the-art results on several public benchmarks.
ISSN:	0165-1684 1872-7557
DOI:	10.1016/j.sigpro.2022.108859
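
Illustration:	The summary says that self- and cross-attention are integrated so that intra- and inter-graph aggregation happen in a single layer. One way this could look, purely as an assumption and not the authors' implementation, is to let each keypoint descriptor of image A attend over the descriptors of A and B together in one softmax. The sketch below uses plain Python lists and omits learned projections, multiple heads, the spatial embedding module and the seeded graph; the names `hybrid_attention`, `desc_a` and `desc_b` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def hybrid_attention(desc_a, desc_b):
    """Update descriptors of image A with one joint attention step.

    Assumption: "hybrid" is modelled here as a single softmax over the
    concatenation of A's own descriptors (self-attention part) and B's
    descriptors (cross-attention part). No learned weights are used.
    """
    dim = len(desc_a[0])
    context = desc_a + desc_b          # intra-graph and inter-graph nodes
    scale = 1.0 / math.sqrt(dim)       # standard scaled dot-product factor
    updated, all_weights = [], []
    for query in desc_a:
        scores = [scale * sum(q * k for q, k in zip(query, key))
                  for key in context]
        weights = softmax(scores)
        # Weighted sum of context descriptors is the aggregated message.
        message = [sum(w * value[d] for w, value in zip(weights, context))
                   for d in range(dim)]
        # Residual update keeps the original descriptor information.
        updated.append([q + m for q, m in zip(query, message)])
        all_weights.append(weights)
    return updated, all_weights


if __name__ == "__main__":
    a = [[1.0, 0.0], [0.0, 1.0]]       # two toy descriptors in image A
    b = [[1.0, 0.0]]                   # one toy descriptor in image B
    new_a, attn = hybrid_attention(a, b)
    print(new_a)
    print(attn)
```

In this toy setup each row of the attention weights covers all keypoints of both images, so one pass carries both kinds of message. A real layer would add learned query, key and value projections and apply the same update symmetrically to image B.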