An Improved SAC-Based Deep Reinforcement Learning Framework for Collaborative Pushing and Grasping in Underwater Environments
Published in: | IEEE Transactions on Instrumentation and Measurement 2024, Vol. 73, p. 1-14 |
---|---|
Main authors: | , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Abstract: | Autonomous grasping is a fundamental task for underwater robots, but directly grasping tightly stacked objects leads to collisions and grasp failures, so pushing actions are needed to separate the target object and raise the grasp success (GS) rate. This article proposes a novel approach that employs an improved soft actor-critic (SAC) algorithm within a deep reinforcement learning (RL) framework to achieve collaborative pushing and grasping. The scheme uses an end-to-end control strategy that maps input images to actions. Specifically, an attention mechanism is introduced in the visual perception module to extract the features needed for pushing and grasping, which enhances the training strategy. Moreover, a novel pushing reward function is designed, comprising a per-object distribution function around the target and a global object distribution assessment network named PA-Net. Furthermore, an enhanced experience replay strategy is introduced to address the sparsity of grasp action rewards. Finally, a training environment for underwater manipulators is established that incorporates variations in light, water flow noise, and pressure effects to simulate underwater working conditions more realistically. Simulation and real-world experiments demonstrate that the proposed learning strategy efficiently separates target objects and avoids inefficient pushing actions, achieving a significantly higher GS rate. |
---|---|
ISSN: | 0018-9456, 1557-9662 |
DOI: | 10.1109/TIM.2024.3379048 |
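
The abstract mentions an enhanced experience replay strategy for sparse grasp rewards, but this record does not describe how it works. As a rough illustration of the general idea only, the sketch below shows one common way to counter reward sparsity: keep rewarded transitions in a separate pool and draw a fixed share of every batch from it. The class name `SuccessBiasedReplay`, the `success_fraction` parameter, and the `reward > 0` success criterion are all assumptions made for this example and are not taken from the article.

```python
import random
from collections import deque


class SuccessBiasedReplay:
    """Hypothetical replay buffer that oversamples rewarded transitions.

    Illustrative only; not the strategy used in the article.
    """

    def __init__(self, capacity=10000, success_fraction=0.5, seed=None):
        self.success = deque(maxlen=capacity)  # transitions with reward > 0
        self.other = deque(maxlen=capacity)    # all remaining transitions
        self.success_fraction = success_fraction
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done):
        # Assumed success criterion: any strictly positive reward.
        pool = self.success if reward > 0 else self.other
        pool.append((state, action, reward, next_state, done))

    def __len__(self):
        return len(self.success) + len(self.other)

    def sample(self, batch_size):
        # Draw a fixed share of the batch from the rewarded pool and the
        # rest from the other pool, sampling with replacement.
        if len(self) == 0:
            raise ValueError("buffer is empty")
        n_success = int(round(batch_size * self.success_fraction))
        if not self.success:
            n_success = 0           # nothing rewarded yet
        elif not self.other:
            n_success = batch_size  # only rewarded transitions stored
        batch = [self.rng.choice(self.success) for _ in range(n_success)]
        batch += [self.rng.choice(self.other)
                  for _ in range(batch_size - n_success)]
        self.rng.shuffle(batch)
        return batch
```

With `success_fraction=0.5`, half of each batch comes from rewarded transitions even when they make up only a small part of the stored experience. The trade-off is that the sampled batches no longer follow the distribution of collected data, which a real implementation may need to correct for.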