Language-Embedded Gaussian Splats (LEGS): Incrementally Building Room-Scale Representations with a Mobile Robot
Main authors: | , , , , , , , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | Building semantic 3D maps is valuable for searching for objects of interest
in offices, warehouses, stores, and homes. We present a mapping system that
incrementally builds a Language-Embedded Gaussian Splat (LEGS): a detailed 3D
scene representation that encodes both appearance and semantics in a unified
representation. LEGS is trained online as a robot traverses its environment to
enable localization of open-vocabulary object queries. We evaluate LEGS on 4
room-scale scenes where we query for objects in the scene to assess how LEGS
can capture semantic meaning. We compare LEGS to LERF and find that while both
systems have comparable object query success rates, LEGS trains over 3.5x
faster than LERF. Results suggest that a multi-camera setup and incremental
bundle adjustment can boost visual reconstruction quality in constrained robot
trajectories, and suggest LEGS can localize open-vocabulary and long-tail
object queries with up to 66% accuracy. |
---|---|
DOI: | 10.48550/arxiv.2409.18108 |