Interest-Driven Exploration With Observational Learning for Developmental Robots

Bibliographic details
Published in:	IEEE Transactions on Cognitive and Developmental Systems, 2023-06, Vol. 15 (2), pp. 373-384
Main authors:	Rayyes, Rania; Donat, Heiko; Steil, Jochen; Spranger, Michael
Format:	Article
Language:	English
Subjects:	Algorithms; Developmental robot; direct inverse kinematics learning from observation; Distance learning; intrinsic motivation; Knowledge based systems; learning; Manipulators; online learning; Probabilistic logic; Robot kinematics; robot model learning; Robots; socially guided exploration; Task analysis; Training

Description
Abstract:	It has been emphasized for a long time that real-world applications of developmental robots require lifelong and online learning. A major challenge in this field is the high sample complexity of algorithms, which has led to the development of intrinsic motivation approaches to render learning more efficient. However, only a few works have been demonstrated on real robots, and although these robots are supposed to share the environment with humans, there is hardly any research on integrating intrinsic motivation with learning from an interacting teacher. In this article, we tackle the efficiency challenge by proposing a novel extrinsic-intrinsic motivation learning scheme. We specifically investigate how to combine intrinsic motivation with learning from observation to accelerate learning. Our novel scheme comprises four elements: 1) a probabilistic intrinsic motivation signal yielding the robot's interest; 2) a probabilistic extrinsic motivation signal to expand the robot's knowledge by learning from observation; 3) novelty detection; and 4) novelty degree methods to enable the robot to decide autonomously how and when to explore. The efficiency and applicability of our methods are benchmarked in simulation experiments and demonstrated on the physical 7-degree-of-freedom left arm of a Baxter robot.
ISSN:	2379-8920 2379-8939
DOI:	10.1109/TCDS.2021.3057758