Learning Contrastive Representation for Semantic Correspondence

Dense correspondence across semantically related images has been extensively studied, but still faces two challenges: 1) large variations in appearance, scale and pose exist even for objects from the same category, and 2) labeling pixel-level dense correspondences is labor intensive and infeasible t...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Veröffentlicht in:	International journal of computer vision 2022-05, Vol.130 (5), p.1293-1309
Hauptverfasser:	Xiao, Taihong, Liu, Sifei, De Mello, Shalini, Yu, Zhiding, Kautz, Jan, Yang, Ming-Hsuan
Format:	Artikel
Sprache:	eng
Schlagworte:	Artificial Intelligence Computer Imaging Computer Science Correspondence Image Processing and Computer Vision Learning Matching Pattern Recognition Pattern Recognition and Graphics Pixels Semantics Vision
Online-Zugang:	Volltext
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	Dense correspondence across semantically related images has been extensively studied, but still faces two challenges: 1) large variations in appearance, scale and pose exist even for objects from the same category, and 2) labeling pixel-level dense correspondences is labor intensive and infeasible to scale. Most existing methods focus on designing various matching modules using fully-supervised ImageNet pretrained networks. On the other hand, while a variety of self-supervised approaches are proposed to explicitly measure image-level similarities, correspondence matching the pixel level remains under-explored. In this work, we propose a multi-level contrastive learning approach for semantic matching, which does not rely on any ImageNet pretrained model. We show that image-level contrastive learning is a key component to encourage the convolutional features to find correspondence between similar objects, while the performance can be further enhanced by regularizing cross-instance cycle-consistency at intermediate feature levels. Experimental results on the PF-PASCAL, PF-WILLOW, and SPair-71k benchmark datasets demonstrate that our method performs favorably against the state-of-the-art approaches.
ISSN:	0920-5691 1573-1405
DOI:	10.1007/s11263-022-01602-y