Can Learned Optimization Make Reinforcement Learning Less Difficult?
Format: Article
Language: English
Abstract: While reinforcement learning (RL) holds great potential for decision making in the real world, it suffers from a number of unique difficulties which often need specific consideration. In particular: it is highly non-stationary; suffers from high degrees of plasticity loss; and requires exploration to prevent premature convergence to local optima and maximize return. In this paper, we consider whether learned optimization can help overcome these problems. Our method, Learned Optimization for Plasticity, Exploration and Non-stationarity (OPEN), meta-learns an update rule whose input features and output structure are informed by previously proposed solutions to these difficulties. We show that our parameterization is flexible enough to enable meta-learning in diverse learning contexts, including the ability to use stochasticity for exploration. Our experiments demonstrate that when meta-trained on single and small sets of environments, OPEN outperforms or equals traditionally used optimizers. Furthermore, OPEN shows strong generalization characteristics across a range of environments and agent architectures.
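To make the idea of a meta-learned update rule more concrete, here is a minimal sketch in Python/NumPy: a small per-parameter network maps simple optimizer features (gradient, momentum, training progress) to a bounded update plus a learned noise scale, echoing the abstract's use of stochasticity for exploration. The class name, feature set, and scaling constants are illustrative assumptions only and do not reproduce the paper's actual OPEN parameterization; in practice the network weights would themselves be meta-trained across environments rather than left random.

```python
import numpy as np

rng = np.random.default_rng(0)

class TinyUpdateRuleSketch:
    """Hypothetical learned-optimizer sketch: a per-parameter MLP that maps
    optimizer features to a parameter update plus a learned noise scale.
    Not the paper's OPEN architecture; meta-parameters here are untrained."""

    def __init__(self, n_features=3, hidden=8):
        # Meta-parameters of the update rule (would be meta-trained in practice).
        self.w1 = rng.normal(scale=0.1, size=(n_features, hidden))
        self.b1 = np.zeros(hidden)
        self.w2 = rng.normal(scale=0.1, size=(hidden, 2))
        self.b2 = np.zeros(2)

    def __call__(self, grad, momentum, train_frac):
        # Per-parameter input features: gradient, momentum, and a scalar
        # "training progress" feature broadcast across all parameters.
        feats = np.stack([grad, momentum, np.full_like(grad, train_frac)], axis=-1)
        h = np.tanh(feats @ self.w1 + self.b1)
        out = h @ self.w2 + self.b2
        update = 1e-3 * np.tanh(out[..., 0])        # bounded deterministic step
        noise_scale = 1e-3 * np.abs(out[..., 1])    # learned stochasticity for exploration
        return update + noise_scale * rng.normal(size=grad.shape)


# Usage: apply the learned rule in place of a hand-designed optimizer step
# on a toy quadratic loss ||params||^2.
rule = TinyUpdateRuleSketch()
params = rng.normal(size=100)
momentum = np.zeros_like(params)
for step in range(10):
    grad = 2.0 * params
    momentum = 0.9 * momentum + grad
    params -= rule(grad, momentum, train_frac=step / 10)
```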
DOI: 10.48550/arxiv.2407.07082