Semantic constraints to represent common sense required in household actions for multimodal learning-from-observation robot

The learning-from-observation (LfO) paradigm allows a robot to learn how to perform actions by observing human actions. Previous research in top-down learning-from-observation has mainly focused on the industrial domain, which consists only of the real physical constraints between a manipulated tool...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Veröffentlicht in:	The International journal of robotics research 2024-02, Vol.43 (2), p.134-170
Hauptverfasser:	Ikeuchi, Katsushi, Wake, Naoki, Sasabuchi, Kazuhiro, Takamatsu, Jun
Format:	Artikel
Sprache:	eng
Schlagworte:	Households Representations Robots Semantics Upper bounds Working conditions
Online-Zugang:	Volltext
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	The learning-from-observation (LfO) paradigm allows a robot to learn how to perform actions by observing human actions. Previous research in top-down learning-from-observation has mainly focused on the industrial domain, which consists only of the real physical constraints between a manipulated tool and the robot’s working environment. To extend this paradigm to the household domain, which consists of imaginary constraints derived from human common sense, we introduce the idea of semantic constraints, which are represented similarly to the physical constraints by defining an imaginary contact with an imaginary environment. By studying the transitions between contact states under physical and semantic constraints, we derive a necessary and sufficient set of task representations that provides the upper bound of the possible task set. We then apply the task representations to analyze various actions in top-rated household YouTube videos and real home cooking recordings, classify frequently occurring constraint patterns into physical, semantic, and multi-step task groups, and determine a subset that covers standard household actions. Finally, we design and implement task models, corresponding to these task representations in the subset, with the necessary daemon functions to collect the necessary parameters to perform the corresponding household actions. Our results provide promising directions for incorporating common sense into the robot teaching literature.
ISSN:	0278-3649 1741-3176
DOI:	10.1177/02783649231212929