Task-level decision-making for dynamic and stochastic human-robot collaboration based on dual agents deep reinforcement learning
Human-robot collaboration as a multidisciplinary research topic is still pursuing the robots’ enhanced intelligence to be more human-compatible and fit the dynamic and stochastic characteristics of human. However, the uncertainties brought by the human partner challenge the task-planning and decisio...
Gespeichert in:
Veröffentlicht in: | International journal of advanced manufacturing technology 2021-08, Vol.115 (11-12), p.3533-3552 |
---|---|
Hauptverfasser: | , , , , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Human-robot collaboration as a multidisciplinary research topic is still pursuing the robots’ enhanced intelligence to be more human-compatible and fit the dynamic and stochastic characteristics of human. However, the uncertainties brought by the human partner challenge the task-planning and decision-making of the robot. When aiming at industrial tasks like collaborative assembly, dynamics on temporal dimension and stochasticities on the order of procedures need to be further considered. In this work, we bring a new perspective and solution based on reinforcement learning, where the problem is regarded as training an agent towards tasks in dynamic and stochastic environments. Concretely, an adapted training approach based on the deep Q learning method is proposed. This method regards both the robot and the human as the agents in the interactive training environment for deep reinforcement learning. With the consideration of task-level industrial human-robot collaboration, the training logic and the agent-environment interaction have been proposed. For the human-robot collaborative assembly tasks in the case study, it is illustrated that our method could drive the robot represented by one agent to collaborate with the human partner even the human performs randomly on the task procedures. |
---|---|
ISSN: | 0268-3768 1433-3015 1433-3015 |
DOI: | 10.1007/s00170-021-07265-2 |