Implicit Offline Reinforcement Learning via Supervised Learning
Main authors: | , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | Offline Reinforcement Learning (RL) via Supervised Learning is a simple and
effective way to learn robotic skills from a dataset collected by policies of
different expertise levels. It is as simple as supervised learning and Behavior
Cloning (BC), but takes advantage of return information. On datasets collected
by policies of similar expertise, implicit BC has been shown to match or
outperform explicit BC. Despite the benefits of using implicit models to learn
robotic skills via BC, offline RL via Supervised Learning algorithms have been
limited to explicit models. We show how implicit models can leverage return
information and match or outperform explicit algorithms to acquire robotic
skills from fixed datasets. Furthermore, we show the close relationship between
our implicit methods and other popular RL via Supervised Learning algorithms to
provide a unified framework. Finally, we demonstrate the effectiveness of our
method on high-dimensional manipulation and locomotion tasks. |
DOI: | 10.48550/arxiv.2210.12272 |