Learning to Walk in Minutes Using Massively Parallel Deep Reinforcement Learning
Abstract: In this work, we present and study a training setup that achieves fast policy generation for real-world robotic tasks by using massive parallelism on a single workstation GPU. We analyze and discuss the impact of different training algorithm components in the massively parallel regime on the final policy performance and training times. In addition, we present a novel game-inspired curriculum that is well suited for training with thousands of simulated robots in parallel. We evaluate the approach by training the quadrupedal robot ANYmal to walk on challenging terrain. The parallel approach allows training policies for flat terrain in under four minutes, and for uneven terrain in twenty minutes. This represents a speedup of multiple orders of magnitude compared to previous work. Finally, we transfer the policies to the real robot to validate the approach. We open-source our training code to help accelerate further research in the field of learned legged locomotion.
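The speedup the abstract reports comes from simulating thousands of robots as one batched GPU computation. The authors' released code targets NVIDIA's Isaac Gym simulator; the following minimal, self-contained PyTorch sketch is not their implementation, but it illustrates the core idea: all per-environment state lives in batched tensors on the device, so a single step() call advances every simulated robot at once. The sizes, names, and toy double-integrator dynamics are assumptions standing in for a real physics engine.

```python
import torch

# Minimal sketch of GPU-resident, massively parallel environment stepping.
# Not the authors' implementation (their code targets NVIDIA's Isaac Gym);
# the toy double-integrator dynamics below stand in for real contact physics.

NUM_ENVS = 4096            # thousands of robots simulated as one batch
OBS_DIM, ACT_DIM = 48, 12  # illustrative sizes, assumed for a quadruped
DT = 0.005                 # simulation timestep (assumed)

device = "cuda" if torch.cuda.is_available() else "cpu"

class BatchedToyEnv:
    """All per-environment state lives in batched tensors on one device,
    so a single step() advances every environment simultaneously."""

    def __init__(self):
        self.pos = torch.zeros(NUM_ENVS, ACT_DIM, device=device)
        self.vel = torch.zeros(NUM_ENVS, ACT_DIM, device=device)

    def step(self, actions):
        self.vel += DT * actions            # toy dynamics, one fused batch op
        self.pos += DT * self.vel
        pad = torch.zeros(NUM_ENVS, OBS_DIM - 2 * ACT_DIM, device=device)
        obs = torch.cat([self.pos, self.vel, pad], dim=1)
        reward = -self.pos.square().sum(dim=1)  # placeholder reward
        return obs, reward

env = BatchedToyEnv()
policy = torch.nn.Sequential(                   # small MLP policy, assumed shape
    torch.nn.Linear(OBS_DIM, 128), torch.nn.ELU(),
    torch.nn.Linear(128, ACT_DIM),
).to(device)

obs = torch.zeros(NUM_ENVS, OBS_DIM, device=device)
with torch.no_grad():
    for _ in range(24):                         # one short rollout, all envs at once
        obs, reward = env.step(policy(obs))
print(f"collected {24 * NUM_ENVS} env steps in one batched rollout")
```

Because both simulation and policy inference stay on the GPU, there is no per-environment Python loop or CPU-to-GPU copy in the hot path, which is the essential ingredient of the reported throughput.

The "game-inspired" curriculum is specified in the paper itself; the sketch below only gestures at the general shape such a scheme could take, under assumed names and thresholds: each environment carries a terrain difficulty level, and after a batch of episodes, robots that progressed far enough are promoted to a harder level while those that stalled are demoted.

```python
import torch

# Hedged sketch of a game-inspired terrain curriculum (names, thresholds,
# and promotion logic are illustrative assumptions, not the paper's spec).

NUM_ENVS, NUM_LEVELS = 4096, 10
levels = torch.zeros(NUM_ENVS, dtype=torch.long)   # per-env terrain difficulty

def update_levels(levels, distance_walked, promote_at=4.0, demote_at=0.5):
    """Promote robots that crossed their terrain; demote robots that stalled."""
    new = levels + (distance_walked > promote_at).long() \
                 - (distance_walked < demote_at).long()
    return new.clamp_(0, NUM_LEVELS - 1)

# Example: after a batch of episodes, update every env's level at once.
distance_walked = torch.rand(NUM_ENVS) * 6.0       # fake episode results
levels = update_levels(levels, distance_walked)
```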
DOI: 10.48550/arxiv.2109.11978