Task-Driven Graph Attention for Hierarchical Relational Object Navigation
Saved in:

Main authors:
Format: Article
Language: eng
Subjects:
Online access: Order full text
Abstract: Embodied AI agents in large scenes often need to navigate to find objects. In this work, we study a naturally emerging variant of the object navigation task, hierarchical relational object navigation (HRON), where the goal is to find objects specified by logical predicates organized in a hierarchical structure (objects related to furniture, and furniture to rooms), such as finding an apple on top of a table in the kitchen. Solving such a task requires an efficient representation for reasoning about object relations and for correlating the relations in the environment with those in the task goal. HRON in large scenes (e.g., homes) is particularly challenging because of partial observability and long horizons, which calls for solutions that compactly store past information while exploring the scene effectively. We demonstrate experimentally that scene graphs are better suited as a representation than conventional alternatives such as images or 2D maps. We propose a solution that takes scene graphs as part of its input and uses graph neural networks as its backbone with an integrated task-driven attention mechanism, and demonstrate that it scales better and learns more efficiently than state-of-the-art baselines.
DOI: 10.48550/arxiv.2306.13760
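
The record itself contains no implementation details, but the abstract's description of a graph neural network backbone with a task-driven attention mechanism can be illustrated with a small sketch. The following PyTorch snippet is a minimal, assumed example of attention over scene-graph nodes conditioned on a task embedding; the class name `SceneGraphAttention`, the inputs (`x`, `edge_index`, `task_emb`), and the exact conditioning scheme are illustrative assumptions, not the paper's actual architecture.

```python
# Minimal sketch (not the paper's implementation): graph attention over
# scene-graph nodes whose attention scores are conditioned on a task embedding,
# so edges relevant to the goal predicates receive more weight.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SceneGraphAttention(nn.Module):
    """One message-passing layer with task-conditioned attention (illustrative)."""

    def __init__(self, node_dim: int, task_dim: int, hidden_dim: int):
        super().__init__()
        self.proj = nn.Linear(node_dim, hidden_dim)
        # Score depends on the source node, the target node, and the task goal.
        self.score = nn.Linear(2 * hidden_dim + task_dim, 1)

    def forward(self, x, edge_index, task_emb):
        # x: (N, node_dim) node features, e.g. object/furniture/room embeddings
        # edge_index: (2, E) directed edges (source, target) of the scene graph
        # task_emb: (task_dim,) embedding of the hierarchical goal specification
        h = self.proj(x)                                   # (N, hidden_dim)
        src, dst = edge_index
        task = task_emb.expand(src.size(0), -1)            # (E, task_dim)
        logits = self.score(torch.cat([h[src], h[dst], task], dim=-1)).squeeze(-1)

        # Normalize attention over the incoming edges of each target node.
        alpha = torch.zeros_like(logits)
        for node in dst.unique():
            mask = dst == node
            alpha[mask] = F.softmax(logits[mask], dim=0)

        # Aggregate task-weighted messages from neighbors into each node.
        out = torch.zeros_like(h)
        out.index_add_(0, dst, alpha.unsqueeze(-1) * h[src])
        return out

# Toy usage: 4 nodes (e.g. room -> furniture -> object), 3 directed edges.
x = torch.randn(4, 16)
edge_index = torch.tensor([[0, 1, 2], [1, 2, 3]])
task_emb = torch.randn(8)
layer = SceneGraphAttention(node_dim=16, task_dim=8, hidden_dim=32)
print(layer(x, edge_index, task_emb).shape)  # torch.Size([4, 32])
```

In an actual HRON agent, the output node embeddings would presumably feed a policy that selects navigation actions; the per-target softmax loop above is written for clarity and would typically be replaced by a vectorized scatter-softmax.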