ContextMatcher: Detector-Free Feature Matching With Cross-Modality Context
Published in: IEEE Transactions on Circuits and Systems for Video Technology, 2024-09, Vol. 34 (9), pp. 7922-7934
Main Authors:
Format: Article
Language: English
Subjects:
Online Access: Order full text
Abstract: Existing feature matching methods tend to extract feature descriptors by relying on visual appearance alone, leading to matches that are obviously false from a geometric perspective. This paper proposes ContextMatcher, which goes beyond the visual appearance representation by introducing geometric context to guide the feature matching. Specifically, ContextMatcher comprises visual descriptor generation, a neighborhood consensus module, and a geometric context encoder. To learn visual descriptors, Transformers situated in different branches are leveraged to obtain feature descriptors. In one branch, convolutions are integrated into the self-attention layers to compensate for the lack of local structure information. In another branch, a cross-scale Transformer is proposed that injects heterogeneous receptive field sizes into the tokens. To leverage and aggregate geometric contextual information, a neighborhood consensus mechanism is proposed that re-ranks the initial pixel-level matches, imposing a geometric consensus constraint on neighboring feature descriptors. Moreover, local feature descriptors are enhanced by combining them with the geometric properties of keypoints to refine matches to the sub-pixel level. Extensive experiments on relative pose estimation and image matching show that the proposed method outperforms existing state-of-the-art methods by a large margin.
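The abstract does not give the exact formulation of the neighborhood consensus re-ranking, so the sketch below is only an illustration of the general idea, not the authors' method. It re-scores each initial pixel-level match by how well the displacement vectors of its spatial neighbors agree with its own, using a Gaussian agreement weight; the function name, the radius and sigma hyper-parameters, and the multiplicative combination with the appearance score are all assumptions made for the example.

```python
# Hypothetical sketch of neighborhood-consensus re-ranking (not the paper's exact
# formulation): a match is rewarded when its spatial neighbors in image A map to
# image B with a similar displacement, i.e. when the local geometry is consistent.
import numpy as np

def rerank_by_neighborhood_consensus(kpts_a, kpts_b, scores, radius=16.0, sigma=8.0):
    """Re-rank pixel-level matches with a geometric consensus term.

    kpts_a, kpts_b : (N, 2) arrays of matched pixel coordinates in images A and B.
    scores         : (N,) initial matching scores (e.g. softmax confidences).
    radius, sigma  : assumed hyper-parameters (neighborhood size in image A and
                     tolerance on displacement agreement, both in pixels).
    """
    disp = kpts_b - kpts_a                                  # per-match displacement vectors
    # pairwise distances between keypoints in image A define the neighborhoods
    d_a = np.linalg.norm(kpts_a[:, None, :] - kpts_a[None, :, :], axis=-1)
    neigh = (d_a < radius) & (d_a > 0)                      # neighborhood mask, excluding self
    # agreement between each match's displacement and its neighbors' displacements
    d_disp = np.linalg.norm(disp[:, None, :] - disp[None, :, :], axis=-1)
    agreement = np.exp(-(d_disp ** 2) / (2 * sigma ** 2)) * neigh
    # consensus score: average agreement over neighbors (0 if the match is isolated)
    consensus = agreement.sum(axis=1) / np.maximum(neigh.sum(axis=1), 1)
    reranked = scores * consensus                           # combine appearance and geometry
    order = np.argsort(-reranked)                           # indices sorted by new score
    return order, reranked
```

The Gaussian kernel is just one soft way to measure displacement agreement; any decreasing function of the displacement difference would serve the same purpose in this illustration.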
ISSN: 1051-8215, 1558-2205
DOI: 10.1109/TCSVT.2024.3383334