Learning to Reason over Scene Graphs: A Case Study of Finetuning GPT-2 into a Robot Language Model for Grounded Task Planning
Main authors: | , , , , , |
---|---|
Format: | Article |
Language: | English |
Abstract: | Long-horizon task planning is essential for the development of intelligent
assistive and service robots. In this work, we investigate the applicability of
a smaller class of large language models (LLMs), specifically GPT-2, in robotic
task planning by learning to decompose tasks into subgoal specifications for a
planner to execute sequentially. Our method grounds the input of the LLM on the
domain that is represented as a scene graph, enabling it to translate human
requests into executable robot plans, thereby learning to reason over
long-horizon tasks, as encountered in the ALFRED benchmark. We compare our
approach with classical planning and baseline methods to examine the
applicability and generalizability of LLM-based planners. Our findings suggest
that the knowledge stored in an LLM can be effectively grounded to perform
long-horizon task planning, demonstrating the promising potential for the
future application of neuro-symbolic planning methods in robotics. |
---|---|
DOI: | 10.48550/arxiv.2305.07716 |