Learning to reason over scene graphs: a case study of finetuning GPT-2 into a robot language model for grounded task planning

Bibliographic details
Published in:	Frontiers in robotics and AI 2023-08, Vol.10, p.1221739-1221739
Main authors:	Chalvatzaki, Georgia; Younes, Ali; Nandha, Daljeet; Le, An Thai; Ribeiro, Leonardo F. R.; Gurevych, Iryna
Format:	Article
Language:	English
Subjects:	grounding; language models (LMs); pretrained models; robot learning; Robotics and AI; scene graphs; task planning
Online access:	Full text

Description
Abstract:	Long-horizon task planning is essential for the development of intelligent assistive and service robots. In this work, we investigate the applicability of a smaller class of large language models (LLMs), specifically GPT-2, in robotic task planning by learning to decompose tasks into subgoal specifications for a planner to execute sequentially. Our method grounds the input of the LLM on the domain that is represented as a scene graph, enabling it to translate human requests into executable robot plans, thereby learning to reason over long-horizon tasks, as encountered in the ALFRED benchmark. We compare our approach with classical planning and baseline methods to examine the applicability and generalizability of LLM-based planners. Our findings suggest that the knowledge stored in an LLM can be effectively grounded to perform long-horizon task planning, demonstrating the promising potential for the future application of neuro-symbolic planning methods in robotics.
ISSN:	2296-9144
DOI:	10.3389/frobt.2023.1221739