COTR: Correspondence Transformer for Matching Across Images
Main authors:
Format: Article
Language: English
Subjects:
Online access: Order full text
Abstract: We propose a novel framework for finding correspondences in images based on a deep neural network that, given two images and a query point in one of them, finds its correspondence in the other. By doing so, one has the option to query only the points of interest and retrieve sparse correspondences, or to query all points in an image and obtain dense mappings. Importantly, in order to capture both local and global priors, and to let our model relate image regions using the most relevant among said priors, we realize our network using a transformer. At inference time, we apply our correspondence network by recursively zooming in around the estimates, yielding a multiscale pipeline able to provide highly accurate correspondences. Our method significantly outperforms the state of the art on both sparse and dense correspondence problems on multiple datasets and tasks, ranging from wide-baseline stereo to optical flow, without any retraining for a specific dataset. We commit to releasing data, code, and all the tools necessary to train from scratch and ensure reproducibility.
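For intuition, here is a minimal Python sketch of the recursive zoom-in inference the abstract describes. The callable `model(img_a, img_b, query_xy)` stands in for the trained correspondence network, and the helper `crop_around`, the `zoom` factor, and the number of `levels` are hypothetical illustration choices, not the authors' released implementation.

```python
import numpy as np

def crop_around(img, center_xy, size):
    """Crop a square patch of side ~`size` centered at `center_xy`, clamped to the image."""
    h, w = img.shape[:2]
    size = min(size, h, w)
    half = size // 2
    cx = int(np.clip(center_xy[0], half, w - half))
    cy = int(np.clip(center_xy[1], half, h - half))
    patch = img[cy - half:cy + half, cx - half:cx + half]
    origin = np.array([cx - half, cy - half], dtype=np.float64)
    return patch, origin

def recursive_match(model, img_a, img_b, query_xy, levels=3, zoom=0.5):
    """Multiscale inference: run the model once on the full images, then
    repeatedly zoom in around the current estimate in both images and
    re-run it on the crops, mapping results back to full-image coordinates."""
    query = np.asarray(query_xy, dtype=np.float64)
    estimate = model(img_a, img_b, query)      # coarse pass on the full images
    size_a = min(img_a.shape[:2])
    size_b = min(img_b.shape[:2])
    for _ in range(levels):
        size_a = max(int(size_a * zoom), 16)   # shrink the crop at each level
        size_b = max(int(size_b * zoom), 16)
        patch_a, origin_a = crop_around(img_a, query, size_a)
        patch_b, origin_b = crop_around(img_b, estimate, size_b)
        local_query = query - origin_a         # query in crop coordinates
        local_est = model(patch_a, patch_b, local_query)
        estimate = local_est + origin_b        # back to full-image coordinates
    return estimate
```

A real pipeline would also resize each crop to the network's input resolution and batch many query points; the single-query loop above is only meant to show the coordinate bookkeeping behind the multiscale refinement.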
DOI: 10.48550/arxiv.2103.14167