Language Models as Zero-Shot Trajectory Generators

Bibliographic Details
Published in: IEEE Robotics and Automation Letters, 2024-07, Vol. 9 (7), p. 6728-6735
Authors: Kwon, Teyun; Di Palo, Norman; Johns, Edward
Format: Article
Language: English
Description
Summary: Large Language Models (LLMs) have recently shown promise as high-level planners for robots when given access to a selection of low-level skills. However, it is often assumed that LLMs do not possess sufficient knowledge to be used for the low-level trajectories themselves. In this work, we address this assumption thoroughly, and investigate whether an LLM (GPT-4) can directly predict a dense sequence of end-effector poses for manipulation tasks, when given access to only object detection and segmentation vision models. We designed a single, task-agnostic prompt, without any in-context examples, motion primitives, or external trajectory optimisers. We then studied how well it performs across 30 real-world language-based tasks, such as "open the bottle cap" and "wipe the plate with the sponge", and we investigated which design choices in this prompt are the most important. Our conclusions raise the assumed limit of LLMs for robotics, and we reveal for the first time that LLMs do indeed possess an understanding of low-level robot control sufficient for a range of common tasks, and that they can additionally detect failures and then re-plan trajectories accordingly.
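The abstract describes a pipeline in which GPT-4, given only object detections, emits a dense sequence of end-effector poses from a single task-agnostic prompt. The following is a minimal sketch of what such a pipeline could look like, assuming the OpenAI chat completions API; the prompt wording, JSON schema, and the detect_objects() stub are illustrative assumptions, not the authors' actual prompt or code.

# Illustrative sketch only; not the paper's released prompt or code.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def detect_objects(image):
    # Placeholder for the object detection/segmentation models the
    # abstract mentions; returns name -> (x, y, z) in the robot base frame.
    return {"bottle": (0.42, -0.10, 0.05), "cap": (0.42, -0.10, 0.12)}

PROMPT = (
    "You control a robot arm. Object positions (metres, robot base frame): "
    "{objects}. Task: {task}. Reply with JSON only: "
    '{{"trajectory": [[x, y, z, roll, pitch, yaw, gripper], ...]}}'
)

def plan_trajectory(task, image):
    objects = detect_objects(image)
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user",
                   "content": PROMPT.format(objects=json.dumps(objects),
                                            task=task)}],
    )
    # Parse the model's JSON reply into a list of 7-DoF waypoints.
    return json.loads(response.choices[0].message.content)["trajectory"]

# e.g. waypoints = plan_trajectory("open the bottle cap", camera_image)

In this sketch the LLM itself produces every waypoint, with no motion primitives or trajectory optimiser in the loop, which mirrors the zero-shot setup the abstract describes.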
ISSN: 2377-3766
DOI: 10.1109/LRA.2024.3410155