Latent Action Priors From a Single Gait Cycle Demonstration for Online Imitation Learning
Main authors: , , ,
Format: Article
Language: English
Online access: Order full text
Summary: Deep Reinforcement Learning (DRL) in simulation often results in brittle and unrealistic learning outcomes. To push the agent towards more desirable solutions, prior information can be injected into the learning process through, for instance, reward shaping, expert data, or motion primitives. We propose an additional inductive bias for robot learning: latent actions learned from an expert demonstration as priors in the action space. We show that these action priors can be learned from only a single open-loop gait cycle using a simple autoencoder. Combining these latent action priors with established style rewards for imitation in DRL achieves performance above the level of the expert demonstration and leads to more desirable gaits. Further, the action priors substantially improve performance on transfer tasks, even leading to gait transitions at higher target speeds. Videos and code are available at https://sites.google.com/view/latent-action-priors.
DOI: 10.48550/arxiv.2410.03246
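
The abstract states that the action priors can be learned from a single open-loop gait cycle with a simple autoencoder, and that the policy then acts through the learned latent space. Below is a minimal sketch of that idea, not the authors' code: the action and latent dimensions, network sizes, training schedule, and the placeholder gait-cycle data are all illustrative assumptions.

```python
import torch
import torch.nn as nn

ACTION_DIM = 12   # e.g. joint targets of a quadruped (assumption)
LATENT_DIM = 4    # size of the latent action space (assumption)

class ActionAutoencoder(nn.Module):
    """Simple autoencoder over per-timestep expert actions."""
    def __init__(self, action_dim: int, latent_dim: int):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(action_dim, 32), nn.ELU(), nn.Linear(32, latent_dim)
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 32), nn.ELU(), nn.Linear(32, action_dim)
        )

    def forward(self, actions: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(actions))

# One open-loop gait cycle: T timesteps of expert joint actions
# (random placeholder data standing in for the demonstration).
gait_cycle = torch.randn(50, ACTION_DIM)

ae = ActionAutoencoder(ACTION_DIM, LATENT_DIM)
opt = torch.optim.Adam(ae.parameters(), lr=1e-3)
for _ in range(2000):  # tiny dataset, so many passes are cheap
    opt.zero_grad()
    loss = nn.functional.mse_loss(ae(gait_cycle), gait_cycle)
    loss.backward()
    opt.step()

# During DRL, freeze the decoder and let the policy output latent
# actions, which the decoder maps back to full joint commands.
for p in ae.decoder.parameters():
    p.requires_grad_(False)

def env_action(latent_action: torch.Tensor) -> torch.Tensor:
    """Map a policy's latent action to the robot's joint-space action."""
    with torch.no_grad():
        return ae.decoder(latent_action)
```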
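
The abstract also combines these priors with "established style rewards for imitation". One common form of such a reward, assumed here in the DeepMimic style rather than taken from the paper, scores how closely the agent's pose tracks the reference gait at the same phase:

```python
import numpy as np

def style_reward(q: np.ndarray, q_ref: np.ndarray, scale: float = 5.0) -> float:
    """Exponentiated negative tracking error between the agent's joint
    positions q and the reference pose q_ref; `scale` is illustrative."""
    return float(np.exp(-scale * np.sum((q - q_ref) ** 2)))

# The reward is 1.0 for perfect tracking and decays smoothly with drift.
q_ref = np.zeros(12)
print(style_reward(q_ref, q_ref))        # 1.0
print(style_reward(q_ref + 0.1, q_ref))  # < 1.0
```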