I Can Tell What I am Doing: Toward Real-World Natural Language Grounding of Robot Experiences
Main Authors: | , , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online Access: | Order full text |
Summary: | Understanding robot behaviors and experiences through natural language is
crucial for developing intelligent and transparent robotic systems. Recent
advances in large language models (LLMs) make it possible to translate
complex, multi-modal robotic experiences into coherent, human-readable
narratives. However, grounding real-world robot experiences in natural
language is challenging for several reasons, including the multi-modal nature of
the data, differing sample rates, and data volume. We introduce RONAR, an LLM-based
system that generates natural language narrations from robot experiences,
aiding in behavior announcement, failure analysis, and human interaction to
recover from failure. Evaluated across various scenarios, RONAR outperforms
state-of-the-art methods and improves failure recovery efficiency. Our
contributions include a multi-modal framework for robot experience narration, a
comprehensive real-robot dataset, and empirical evidence of RONAR's
effectiveness in enhancing user experience in system transparency and failure
analysis. |
---|---|
DOI: | 10.48550/arxiv.2411.12960 |