Hyper: Hyperparameter Robust Efficient Exploration in Reinforcement Learning
Main authors: | , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | The exploration-exploitation dilemma poses significant challenges in
reinforcement learning (RL). Recently, curiosity-based exploration methods
have achieved great success in tackling hard-exploration problems. However, they
require extensive hyperparameter tuning across environments, which
heavily limits the applicability and accessibility of this line of methods. In
this paper, we characterize this problem via an analysis of agent behavior and
conclude that choosing a proper hyperparameter is fundamentally difficult. We
then identify the difficulty and instability of the optimization when the
agent learns with curiosity. We propose our method, hyperparameter robust
exploration (Hyper), which substantially mitigates the problem by
effectively regularizing the visitation of the exploration and decoupling the
exploitation to ensure stable training. We theoretically justify that
Hyper is provably efficient under the function approximation setting and
empirically demonstrate its appealing performance and robustness in various
environments. |
---|---|
DOI: | 10.48550/arxiv.2412.03767 |