SGAligner : 3D Scene Alignment with Scene Graphs
Building 3D scene graphs has recently emerged as a topic in scene representation for several embodied AI applications to represent the world in a structured and rich manner. With their increased use in solving downstream tasks (eg, navigation and room rearrangement), can we leverage and recycle them...
Gespeichert in:
Hauptverfasser: | , , , , |
---|---|
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Building 3D scene graphs has recently emerged as a topic in scene
representation for several embodied AI applications to represent the world in a
structured and rich manner. With their increased use in solving downstream
tasks (eg, navigation and room rearrangement), can we leverage and recycle them
for creating 3D maps of environments, a pivotal step in agent operation? We
focus on the fundamental problem of aligning pairs of 3D scene graphs whose
overlap can range from zero to partial and can contain arbitrary changes. We
propose SGAligner, the first method for aligning pairs of 3D scene graphs that
is robust to in-the-wild scenarios (ie, unknown overlap -- if any -- and
changes in the environment). We get inspired by multi-modality knowledge graphs
and use contrastive learning to learn a joint, multi-modal embedding space. We
evaluate on the 3RScan dataset and further showcase that our method can be used
for estimating the transformation between pairs of 3D scenes. Since benchmarks
for these tasks are missing, we create them on this dataset. The code,
benchmark, and trained models are available on the project website. |
---|---|
DOI: | 10.48550/arxiv.2304.14880 |