Efficient Robot Manipulation via Reinforcement Learning with Dynamic Movement Primitives-Based Policy

Detailed Description

Bibliographic Details
Published in: Applied Sciences 2024-11, Vol. 14 (22), p. 10665
Main Authors: Li, Shangde, Huang, Wenjun, Miao, Chenyang, Xu, Kun, Chen, Yidong, Sun, Tianfu, Cui, Yunduan
Format: Article
Language: English
Online Access: Full text
Description
Abstract: Reinforcement learning (RL), which autonomously explores optimal control policies, has become a crucial direction for developing intelligent robots, while Dynamic Movement Primitives (DMPs) serve as a powerful tool for efficiently expressing robot trajectories. This article explores an efficient integration of RL and DMPs to enhance the learning efficiency and control performance of reinforcement learning in robot manipulation tasks, focusing on the form of control actions and their smoothness. A novel approach, DDPG-DMP, is proposed to address the efficiency and feasibility issues in current RL approaches that employ DMPs to generate control actions. The proposed method naturally integrates a DMP-based policy into the actor-critic framework of the traditional RL approach Deep Deterministic Policy Gradient (DDPG) and derives the corresponding update formulas to learn the networks that decide the parameters of the DMPs. A novel inverse controller is further introduced to adaptively learn the translation from observed states into various robot control signals through DMPs, eliminating the requirement for human prior knowledge. Evaluated on five robot arm control benchmark tasks, DDPG-DMP demonstrates significant advantages in control performance, learning efficiency, and smoothness of robot actions compared to related baselines, highlighting its potential in complex robot control applications.
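To make the DMP-based policy concrete: a discrete DMP generates a smooth trajectory by integrating a spring-damper ("transformation") system toward a goal, shaped by a learned forcing term whose weights, in an approach like DDPG-DMP, would be output by the actor network. The sketch below is a minimal, generic Ijspeert-style DMP rollout, not the paper's exact parameterization; the function name, gain values, and basis-function heuristic are illustrative assumptions.

```python
import numpy as np

def dmp_rollout(y0, g, weights, tau=1.0, dt=0.01,
                alpha_z=25.0, beta_z=6.25, alpha_x=1.0):
    """Integrate a one-dimensional discrete DMP for one unit of scaled time.

    weights: (n_basis,) forcing-term weights. In a DDPG-DMP-style method
    these would be predicted by the actor network (an assumption here;
    the paper's exact parameterization may differ).
    """
    n_basis = len(weights)
    # Basis-function centers placed along the canonical phase trajectory,
    # with widths set by a common heuristic.
    c = np.exp(-alpha_x * np.linspace(0.0, 1.0, n_basis))
    h = n_basis / c

    y, z, x = float(y0), 0.0, 1.0   # position, scaled velocity, phase
    traj = [y]
    for _ in range(int(1.0 / dt)):
        psi = np.exp(-h * (x - c) ** 2)
        # Forcing term: phase- and amplitude-scaled weighted basis mixture.
        f = x * (g - y0) * (psi @ weights) / (psi.sum() + 1e-10)
        # Transformation system: critically damped pull toward the goal g.
        z += dt / tau * (alpha_z * (beta_z * (g - y) - z) + f)
        y += dt / tau * z
        # Canonical system: phase decays exponentially from 1 toward 0.
        x += dt / tau * (-alpha_x * x)
        traj.append(y)
    return np.array(traj)

# With zero forcing weights the DMP reduces to the spring-damper system
# and converges smoothly from the start position to the goal.
traj = dmp_rollout(y0=0.0, g=1.0, weights=np.zeros(10))
```

Because the forcing term is scaled by the decaying phase variable, its influence vanishes near the end of the movement, which guarantees convergence to the goal regardless of the learned weights; this built-in smoothness and stability is what makes DMPs attractive as an RL action representation.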
ISSN: 2076-3417
DOI: 10.3390/app142210665