Mastering Stacking of Diverse Shapes with Large-Scale Iterative Reinforcement Learning on Real Robots
Saved in:
Main authors: | , , , , , , , , , , , , , , |
---|---|
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Order full text |
Abstract: | Reinforcement learning solely from an agent's self-generated data is often believed to be infeasible for learning on real robots, due to the amount of data needed. However, if done right, agents learning from real data can be surprisingly efficient through re-using previously collected sub-optimal data. In this paper we demonstrate how the increased understanding of off-policy learning methods and their embedding in an iterative online/offline scheme ("collect and infer") can drastically improve data-efficiency by using all the collected experience, which empowers learning from real robot experience only. Moreover, the resulting policy improves significantly over the state of the art on a recently proposed real robot manipulation benchmark. Our approach learns end-to-end, directly from pixels, and does not rely on additional human domain knowledge such as a simulator or demonstrations. |
DOI: | 10.48550/arxiv.2312.11374 |
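
The "collect and infer" scheme mentioned in the abstract alternates between collecting fresh experience with the current policy (online phase) and re-fitting the policy off-policy on all experience gathered so far (offline phase). The sketch below illustrates that loop only in outline under assumed names: `ToyEnv`, `Policy`, and `fit_off_policy` are illustrative placeholders, not the paper's actual environment, pixel-based actor, or learning algorithm.

```python
import random


class ToyEnv:
    """Tiny stand-in environment (hypothetical): reach state 3 within 10 steps."""

    def reset(self):
        self.state, self.steps = 0, 0
        return self.state

    def step(self, action):
        self.state += action          # action is 0 or 1
        self.steps += 1
        done = self.state >= 3 or self.steps >= 10
        reward = 1.0 if self.state >= 3 else 0.0
        return self.state, reward, done


class Policy:
    """Placeholder policy; stands in for a learned, pixel-based actor."""

    def __init__(self, actions=(0, 1)):
        self.actions = actions

    def act(self, observation):
        return random.choice(self.actions)


def collect_episodes(policy, env, num_episodes):
    """Online phase: run the current policy and record every transition."""
    transitions = []
    for _ in range(num_episodes):
        obs, done = env.reset(), False
        while not done:
            action = policy.act(obs)
            next_obs, reward, done = env.step(action)
            transitions.append((obs, action, reward, next_obs, done))
            obs = next_obs
    return transitions


def fit_off_policy(replay_buffer):
    """Offline phase placeholder: an off-policy fit over the whole buffer
    (e.g. a Q-learning-style update) would go here; returns a new policy."""
    return Policy()


def collect_and_infer(env, iterations=5, episodes_per_iteration=10):
    replay_buffer = []                 # all experience is kept and reused
    policy = Policy()
    for _ in range(iterations):
        # gather data with the current policy, then retrain on ALL data so far
        replay_buffer += collect_episodes(policy, env, episodes_per_iteration)
        policy = fit_off_policy(replay_buffer)
    return policy


if __name__ == "__main__":
    trained = collect_and_infer(ToyEnv())
```

The key design point the abstract emphasizes is that the replay buffer is never discarded: even sub-optimal data from earlier iterations is reused in every offline re-fit, which is what makes learning from real robot experience alone data-efficient.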