Robot grasping method optimization using improved deep deterministic policy gradient algorithm of deep reinforcement learning


Bibliographic details
Published in: Review of Scientific Instruments, 2021-02, Vol. 92 (2), p. 025114
Main authors: Zhang, Hongxu; Wang, Fei; Wang, Jianhui; Cui, Ben
Format: Article
Language: English
Description
Summary: Robot grasping has become a highly active research field, and the demands placed on robot operation keep rising. In previous studies, grasping based on traditional target detection algorithms has often been inefficient. This article improves a deep reinforcement learning algorithm in order to raise grasping efficiency and to handle the effect of unknown disturbances on grasping. Exploiting the way deep reinforcement learning actively explores an unknown environment, it proposes a Gaussian parameter Deep Deterministic Policy Gradient (Gaussian-DDPG) algorithm based on the Importance-Weighted Autoencoder (IWAE), with which the robot learns the grasping task autonomously. Traditional coordinate positioning methods and deep learning methods grasp poorly under disturbance (such as movement of the target object). The IWAE compresses the high-dimensional raw visual input into a latent space and passes the result to the deep reinforcement learning network as part of the state. Building on the classic DDPG algorithm, the method smoothly adds Gaussian parameters to improve exploration and dynamically sets the robot grasping space parameters to adapt to workspaces of multiple scales, ultimately achieving accurate grasping. Because the position information derived from vision may deviate, the manipulator's torque information is used to further optimize control of the grasping position, improving grasping efficiency for disturbed objects.
ISSN: 0034-6748, 1089-7623
DOI:10.1063/5.0034101
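
The summary describes two mechanisms: a compressed visual code that forms part of the state, and Gaussian perturbation added to DDPG to improve exploration. The following is a minimal sketch of how such a pipeline might be wired together, not the authors' implementation. The summary does not say whether the Gaussian term acts on actions or on network parameters; this sketch assumes zero-mean noise on the action with a decaying standard deviation. All names (`build_state`, `gaussian_explore`, `decayed_sigma`) and all numeric values are hypothetical.

```python
import numpy as np


def build_state(latent, joint_state):
    """Concatenate a latent visual code with robot readings into one state vector.

    `latent` stands in for the output of an encoder such as an IWAE;
    the encoder itself is not modelled here.
    """
    return np.concatenate([latent, joint_state])


def decayed_sigma(sigma0, decay, step, sigma_min=0.01):
    """Exponentially decay the noise scale, never dropping below sigma_min.

    The decay schedule is an assumption made for illustration.
    """
    return max(sigma_min, sigma0 * decay ** step)


def gaussian_explore(action, sigma, low, high, rng):
    """Perturb a deterministic action with Gaussian noise, then clip to bounds."""
    noisy = action + rng.normal(0.0, sigma, size=action.shape)
    return np.clip(noisy, low, high)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    state = build_state(np.zeros(4), np.ones(3))   # 4-d latent + 3 joint readings
    action = np.array([0.9, -0.9])                 # placeholder actor output
    for step in (0, 10, 100):
        sigma = decayed_sigma(0.5, 0.95, step)
        print(step, sigma, gaussian_explore(action, sigma, -1.0, 1.0, rng))
```

Clipping after adding the noise keeps the explored action inside the actuator limits, which is one simple way to respect a bounded grasping workspace.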