The Functional Correspondence Problem
Main authors: | , , |
---|---|
Format: | Article |
Language: | eng |
Summary: | The ability to find correspondences in visual data is the essence of most
computer vision tasks. But what are the right correspondences? The task of
visual correspondence is well defined for two different images of the same object
instance. In the case of two images of objects belonging to the same category, visual
correspondence is reasonably well defined in most cases. But what about
correspondence between two objects of completely different categories, e.g., a
shoe and a bottle? Does any correspondence exist? Inspired by humans'
ability to (a) generalize beyond semantic categories and (b) infer functional
affordances, we introduce the problem of functional correspondences in this
paper. Given images of two objects, we ask a simple question: what is the set
of correspondences between these two images for a given task? For example, what
are the correspondences between a bottle and a shoe for the task of pounding or
the task of pouring? We introduce a new dataset, FunKPoint, that has ground-truth
correspondences for 10 tasks and 20 object categories. We also introduce
a modular task-driven representation for attacking this problem and demonstrate
that our learned representation is effective for this task. But most
importantly, because our supervision signal is not bound by semantics, we show
that our learned representation can generalize better on few-shot
classification problems. We hope this paper will inspire our community to think
beyond semantics and focus more on cross-category generalization and learning
representations for robotics tasks. |
---|---|
DOI: | 10.48550/arxiv.2109.01097 |