Skill-based Model-based Reinforcement Learning
Main authors: | , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | Model-based reinforcement learning (RL) is a sample-efficient way of learning
complex behaviors by leveraging a learned single-step dynamics model to plan
actions in imagination. However, planning every action for long-horizon tasks
is not practical, akin to a human planning out every muscle movement. Instead,
humans efficiently plan with high-level skills to solve complex tasks. From
this intuition, we propose a Skill-based Model-based RL framework (SkiMo) that
enables planning in the skill space using a skill dynamics model, which
directly predicts the skill outcomes, rather than predicting all small details
in the intermediate states, step by step. For accurate and efficient long-term
planning, we jointly learn the skill dynamics model and a skill repertoire from
prior experience. We then harness the learned skill dynamics model to
accurately simulate and plan over long horizons in the skill space, which
enables efficient downstream learning of long-horizon, sparse reward tasks.
Experimental results in navigation and manipulation domains show that SkiMo
extends the temporal horizon of model-based approaches and improves the sample
efficiency for both model-based RL and skill-based RL. Code and videos are
available at https://clvrai.com/skimo |
---|---|
DOI: | 10.48550/arxiv.2207.07560 |
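
The idea in the summary, planning over whole skills with a model that predicts each skill's outcome directly, can be illustrated with a toy sketch. This is not the SkiMo implementation: `skill_dynamics` is a hypothetical hand-written stand-in for a learned model, `plan_skills` uses plain random shooting as an assumed planner, and the 2-D state, the goal, and all parameter values are invented for illustration.

```python
import numpy as np


def skill_dynamics(state, skill):
    """Hypothetical stand-in for a learned skill dynamics model.

    Maps a state and a skill vector directly to the state reached once
    the whole skill has finished, without simulating intermediate steps.
    In this toy version a skill simply displaces the state.
    """
    return state + skill


def plan_skills(state, goal, horizon=3, num_candidates=256, seed=0):
    """Random-shooting planner over skill sequences (illustrative only)."""
    rng = np.random.default_rng(seed)
    # Candidate skill sequences, shape (candidates, horizon, state_dim).
    skills = rng.uniform(-1.0, 1.0, size=(num_candidates, horizon, state.shape[0]))
    # Roll out every candidate in imagination: one model call per skill,
    # not one call per low-level action.
    states = np.repeat(state[None, :], num_candidates, axis=0)
    for t in range(horizon):
        states = skill_dynamics(states, skills[:, t])
    # Score each candidate by the distance of its imagined end state to the goal.
    costs = np.linalg.norm(states - goal, axis=1)
    best = int(np.argmin(costs))
    return skills[best], states[best], float(costs[best])


start = np.zeros(2)
goal = np.array([1.5, -1.0])
best_skills, final_state, cost = plan_skills(start, goal)
print("planned skills:", best_skills)
print("imagined final state:", final_state, "distance to goal:", cost)
```

With a horizon of three skills the planner makes three model calls per candidate, however many low-level actions each skill would expand into at execution time; that is the saving the summary refers to.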