Hierarchical End-to-end Control Policy for Multi-degree-of-freedom Manipulators

Bibliographic details
Published in:	International Journal of Control, Automation, and Systems, 2022, 20(10), pp. 3296-3311
Main authors:	Min, Cheol-Hui; Song, Jae-Bok
Format:	Article
Language:	English
Subjects:	Control; Deep learning; Degrees of freedom; Engineering; Manipulators; Mechatronics; Regular Papers; Robot arms; Robotics; Control and instrumentation engineering (제어계측공학)

Description
Abstract:	In recent years, several deep reinforcement learning control policies for multi-degree-of-freedom (DOF) manipulators have been proposed. To avoid complexity, previous studies have applied a number of constraints on the high-dimensional state-action space, which hinders the learning of a generalized policy function. In this study, the control problem is addressed by introducing a hierarchical reinforcement learning method that can learn the end-to-end control policy of a multi-DOF manipulator without any constraints on the state-action space. The proposed method learns a hierarchical policy using two off-policy methods. With human demonstration data and a newly proposed data-correction method, the resulting end-to-end control of the multi-DOF manipulator is shown to outperform non-hierarchical deep reinforcement learning methods.
ISSN:	1598-6446, 2005-4092
DOI:	10.1007/s12555-021-0511-4