Towards robust and domain agnostic reinforcement learning competitions
Main authors: | |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | Reinforcement learning competitions have formed the basis for standard
research benchmarks, galvanized advances in the state-of-the-art, and shaped
the direction of the field. Despite this, a majority of challenges suffer from
the same fundamental problems: participant solutions to the posed challenge are
usually domain-specific, biased to maximally exploit compute resources, and not
guaranteed to be reproducible. In this paper, we present a new framework of
competition design that promotes the development of algorithms that overcome
these barriers. We propose four central mechanisms for achieving this end:
submission retraining, domain randomization, desemantization through domain
obfuscation, and the limitation of competition compute and environment-sample
budget. To demonstrate the efficacy of this design, we proposed, organized, and
ran the MineRL 2020 Competition on Sample-Efficient Reinforcement Learning. In
this work, we describe the organizational outcomes of the competition and show
that the resulting participant submissions are reproducible, non-specific to
the competition environment, and sample/resource efficient, despite the
difficult competition task. |
---|---|
DOI: | 10.48550/arxiv.2106.03748 |