A Study of Plasticity Loss in On-Policy Deep Reinforcement Learning
Format: Article
Language: English
Abstract: Continual learning with deep neural networks presents challenges distinct from both the fixed-dataset and convex continual learning regimes. One such challenge is plasticity loss, wherein a neural network trained in an online fashion displays a degraded ability to fit new tasks. This problem has been extensively studied in both supervised learning and off-policy reinforcement learning (RL), where a number of remedies have been proposed. Still, plasticity loss has received less attention in the on-policy deep RL setting. Here we perform an extensive set of experiments examining plasticity loss and a variety of mitigation methods in on-policy deep RL. We demonstrate that plasticity loss is pervasive under domain shift in this regime, and that a number of methods developed to resolve it in other settings fail, sometimes even performing worse than applying no intervention at all. In contrast, we find that a class of "regenerative" methods is able to consistently mitigate plasticity loss in a variety of contexts, including in gridworld tasks and more challenging environments like Montezuma's Revenge and ProcGen.
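The abstract names "regenerative" methods as the consistently effective class without spelling one out. For orientation, below is a minimal sketch of one well-known regenerative intervention, shrink-and-perturb (Ash & Adams, 2020): the function name, hyperparameter values, and PyTorch framing are illustrative assumptions, not the paper's exact procedure.

```python
import copy
import torch
import torch.nn as nn

def shrink_and_perturb(model: nn.Module, shrink: float = 0.8,
                       perturb: float = 0.1) -> None:
    """Regenerative reset (illustrative sketch, not the paper's protocol):
    shrink each weight toward zero and mix in parameters from a freshly
    initialized copy of the network. The aim is to restore trainability
    after a domain shift while keeping a scaled-down version of what the
    network has already learned.
    """
    # Fresh copy re-initialized with each module's default init scheme.
    fresh = copy.deepcopy(model)
    for m in fresh.modules():
        if hasattr(m, "reset_parameters"):
            m.reset_parameters()

    with torch.no_grad():
        for p, p_fresh in zip(model.parameters(), fresh.parameters()):
            p.mul_(shrink).add_(perturb * p_fresh)
```

In a continual-learning loop, such a reset would typically be applied at each task or domain-shift boundary (or on a fixed cadence); the shrink and perturb coefficients shown here are placeholder values that would need tuning per setting.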
DOI: 10.48550/arxiv.2405.19153