UniMASK: Unified Inference in Sequential Decision Problems
| Field | Value |
|---|---|
| Main authors | |
| Format | Article |
| Language | English |
| Subjects | |
| Online access | Order full text |
Abstract: Randomly masking and predicting word tokens has been a successful approach in pre-training language models for a variety of downstream tasks. In this work, we observe that the same idea also applies naturally to sequential decision-making, where many well-studied tasks like behavior cloning, offline reinforcement learning, inverse dynamics, and waypoint conditioning correspond to different maskings over a sequence of states, actions, and returns. We introduce the UniMASK framework, which provides a unified way to specify models that can be trained on many different sequential decision-making tasks. We show that a single UniMASK model is often capable of carrying out many tasks with performance similar to or better than single-task models. Additionally, after fine-tuning, our UniMASK models consistently outperform comparable single-task models. Our code is publicly available at https://github.com/micahcarroll/uniMASK.
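
To make the masking idea concrete, the following is a minimal, hypothetical Python sketch of how several tasks named in the abstract reduce to binary visibility masks over a flattened (state, action, return) token sequence. It is not the authors' implementation (see the linked repository for that); names such as `make_mask` and the exact mask conventions are illustrative assumptions.

```python
import numpy as np

T = 5  # trajectory length in timesteps (illustrative)

def make_mask(s_vis, a_vis, r_vis):
    """Interleave per-timestep visibility flags for (state, action, return)
    tokens into one flat boolean mask of length 3 * T.
    True = token is given as input; False = token must be predicted."""
    return np.stack([s_vis, a_vis, r_vis], axis=1).ravel()

ones = np.ones(T, dtype=bool)
zeros = np.zeros(T, dtype=bool)

# Behavior cloning at timestep t: states s_0..s_t and actions a_0..a_{t-1}
# are observed; the model predicts a_t.
t = 2
s_vis = zeros.copy()
s_vis[: t + 1] = True
a_vis = zeros.copy()
a_vis[:t] = True
bc_mask = make_mask(s_vis, a_vis, zeros)

# Inverse dynamics: all states observed; recover the actions between them.
inverse_dynamics_mask = make_mask(ones, zeros, zeros)

# Return-conditioned behavior (offline-RL style): also condition on returns.
return_conditioned_mask = make_mask(s_vis, a_vis, ones)

# Waypoint conditioning: the current state and a future waypoint state are
# observed; intermediate states and actions are inferred.
wp_s = zeros.copy()
wp_s[[0, T - 1]] = True
waypoint_mask = make_mask(wp_s, zeros, zeros)
```

Under these assumptions, a single model trained on randomly sampled masks can be queried at inference time with any of the masks above, which is the unification the paper describes.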
DOI: 10.48550/arxiv.2211.10869