Continuous control actions learning and adaptation for robotic manipulation through reinforcement learning

Bibliographic details
Published in:	Autonomous robots 2022-03, Vol.46 (3), p.483-498
Main authors:	Shahid, Asad Ali; Piga, Dario; Braghin, Francesco; Roveda, Loris
Format:	Article
Language:	English
Subjects:	Adaptation Algorithms Artificial Intelligence Computer Imaging Control Engineering Machine learning Mechatronics Optimization Pattern Recognition and Graphics Robotics Robotics and Automation Robots Vision
Online access:	Full text

Description
Abstract:	This paper presents a learning-based method that uses simulation data to learn an object manipulation task with two model-free reinforcement learning (RL) algorithms. Learning performance is compared between an on-policy and an off-policy algorithm: Proximal Policy Optimization (PPO) and Soft Actor-Critic (SAC), respectively. To accelerate learning, a fine-tuning procedure is proposed that demonstrates the continuous adaptation of on-policy RL to new environments, allowing the learned policy to adapt to and execute a (partially) modified task. A dense reward function is designed for the task to enable efficient learning by the agent. A grasping task involving a Franka Emika Panda manipulator is considered as the reference task to be learned. The learned control policy is shown to generalize across multiple object geometries and initial robot/part configurations. The approach is finally tested on a real Franka Emika Panda robot, showing that the learned behavior can be transferred from simulation. Experimental results show a 100% success rate on the grasping tasks, making the proposed approach applicable to real applications.
ISSN:	0929-5593 1573-7527
DOI:	10.1007/s10514-022-10034-z
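
Illustrative note: the abstract mentions a dense reward function for the grasping task but does not give its form. The sketch below is only a generic example of what a dense (shaped) grasping reward can look like, not the reward used in the paper. The function name `dense_grasp_reward`, its parameters, the weights, the 0.1 m lift target, and the tanh shaping are all assumptions made for illustration.

```python
import math


def dense_grasp_reward(gripper_pos, object_pos, object_height,
                       lift_target=0.1, reach_weight=1.0,
                       lift_weight=1.0, success_bonus=1.0):
    """Hypothetical dense reward for a reach-and-lift grasping task.

    All terms and constants are illustrative assumptions; the paper's
    actual reward is not described in the abstract.
    """
    # Reaching term: approaches 1 as the gripper nears the object,
    # so the agent gets a learning signal at every step.
    distance = math.dist(gripper_pos, object_pos)
    reach_term = 1.0 - math.tanh(5.0 * distance)

    # Lifting term: fraction of the target lift height achieved, clipped to [0, 1].
    lift_fraction = min(max(object_height / lift_target, 0.0), 1.0)

    # Bonus paid only once the object reaches the target height.
    bonus = success_bonus if object_height >= lift_target else 0.0

    return reach_weight * reach_term + lift_weight * lift_fraction + bonus


if __name__ == "__main__":
    far = dense_grasp_reward((0.5, 0.0, 0.3), (0.0, 0.0, 0.0), 0.0)
    near = dense_grasp_reward((0.05, 0.0, 0.0), (0.0, 0.0, 0.0), 0.0)
    print(f"far from object: {far:.3f}, near object: {near:.3f}")
```

Unlike a sparse reward that pays out only on a completed grasp, a shaped reward of this kind gives the agent feedback throughout the episode, which is the usual motivation for a dense design.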