Quality-diversity based semi-autonomous teleoperation using reinforcement learning

Detailed description

Bibliographic details
Published in:	Neural networks, 2024-11, Vol. 179, p. 106543, Article 106543
Main authors:	Park, Sangbeom; Yoon, Taerim; Lee, Joonhyung; Park, Sunghyun; Choi, Sungjoon
Format:	Article
Language:	English
Subjects:	Algorithms; Computer Simulation; Humans; Machine Learning; Neural Networks, Computer; Quality-diversity; Reinforcement learning; Reinforcement, Psychology; Robotics - methods; Shared autonomy; Teleoperation
Online access:	Full text

Description
Abstract:	Recent successes in robot learning have significantly enhanced autonomous systems across a wide range of tasks. However, they are prone to generate similar or the same solutions, limiting the controllability of the robot to behave according to user intentions. These limited robot behaviors may lead to collisions and potential harm to humans. To resolve these limitations, we introduce a semi-autonomous teleoperation framework that enables users to operate a robot by selecting a high-level command, referred to as option. Our approach aims to provide effective and diverse options by a learned policy, thereby enhancing the efficiency of the proposed framework. In this work, we propose a quality-diversity (QD) based sampling method that simultaneously optimizes both the quality and diversity of options using reinforcement learning (RL). Additionally, we present a mixture of latent variable models to learn multiple policy distributions defined as options. In experiments, we show that the proposed method achieves superior performance in terms of the success rate and diversity of the options in simulation environments. We further demonstrate that our method outperforms manual keyboard control for time duration over cluttered real-world environments.
• Introduced a semi-autonomous teleoperation by providing selectable options.
• Proposed an efficient sampling method considering quality and diversity.
• Developed a mixture of latent models to learn multiple policy distributions.
• Generated diverse and effective options in various robotic manipulation tasks.
ISSN:	0893-6080, 1879-2782
DOI:	10.1016/j.neunet.2024.106543