An Improved SAC-Based Deep Reinforcement Learning Framework for Collaborative Pushing and Grasping in Underwater Environments

Bibliographic details
Published in: IEEE Transactions on Instrumentation and Measurement, 2024, Vol. 73, p. 1-14
Main authors: Gao, Jian; Li, Yufeng; Chen, Yimin; He, Yaozhen; Guo, Jingwei
Format: Article
Language: English
Description
Abstract: Autonomous grasping is a fundamental task for underwater robots, but directly grasping tightly stacked objects leads to collisions and grasp failures, so pushing actions are needed to separate the target object and increase the grasp success (GS) rate. This article therefore proposes a novel approach that employs an improved soft actor-critic (SAC) algorithm within a deep reinforcement learning (RL) framework to achieve collaborative pushing and grasping. The scheme uses an end-to-end control strategy that maps input images to actions. Specifically, an attention mechanism is introduced in the visual perception module to extract the features needed for pushing and grasping, which enhances the training strategy. Moreover, a novel pushing reward function is designed, comprising a per-object distribution function around the target and a global object distribution assessment network named PA-Net. Furthermore, an enhanced experience replay strategy is introduced to address the sparsity of grasp action rewards. Finally, a training environment for underwater manipulators is established that incorporates variations in light, water flow noise, and pressure effects to simulate underwater working conditions more realistically. Simulation and real-world experiments demonstrate that the proposed learning strategy efficiently separates target objects and avoids inefficient pushing actions, achieving a significantly higher GS rate.
ISSN: 0018-9456, 1557-9662
DOI: 10.1109/TIM.2024.3379048
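
The abstract mentions an enhanced experience replay strategy for sparse grasp rewards but does not describe it. As a rough illustration of the general idea only, and not the authors' method, the sketch below keeps rewarded transitions in a separate pool so that each training batch can contain a fixed share of them. The class name SuccessBiasedReplayBuffer, the success_fraction parameter, and the "reward > 0 means success" rule are all assumptions made for this example.

```python
import random
from collections import deque


class SuccessBiasedReplayBuffer:
    """Hypothetical buffer (not from the paper): oversamples rewarded transitions."""

    def __init__(self, capacity=10000, success_fraction=0.5, seed=None):
        # Two pools, each capped at `capacity`, so rare successes are not
        # evicted by the much more frequent unrewarded transitions.
        self.success = deque(maxlen=capacity)
        self.other = deque(maxlen=capacity)
        self.success_fraction = success_fraction
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done):
        transition = (state, action, reward, next_state, done)
        # Assumption: a positive reward marks a successful grasp.
        pool = self.success if reward > 0 else self.other
        pool.append(transition)

    def __len__(self):
        return len(self.success) + len(self.other)

    def sample(self, batch_size):
        # Take up to the requested share from the success pool, then fill
        # the rest from the other pool. The batch may come out smaller than
        # batch_size if the pools do not hold enough transitions.
        wanted = int(round(batch_size * self.success_fraction))
        n_success = min(len(self.success), wanted)
        n_other = min(len(self.other), batch_size - n_success)
        batch = self.rng.sample(list(self.success), n_success)
        batch += self.rng.sample(list(self.other), n_other)
        self.rng.shuffle(batch)
        return batch
```

With success_fraction=0.5, a batch of 8 would hold 4 rewarded transitions whenever at least 4 are stored, regardless of how rare they are in the overall stream. A real SAC pipeline would probably also need importance weights or a decaying fraction to limit the bias this introduces.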