Two Complementary Perspectives to Continual Learning: Ask Not Only What to Optimize, But Also How
Saved in:
Main Authors:
Format: Article
Language: English
Subjects:
Online Access: Order full text
Summary:
Proceedings of the 1st ContinualAI Unconference, 2023, PMLR 249: 37-61.
Recent years have seen considerable progress in the continual training of deep neural networks, predominantly thanks to approaches that add replay or regularization terms to the loss function to approximate the joint loss over all tasks so far. However, we show that even with a perfect approximation to the joint loss, these approaches still suffer from temporary but substantial forgetting when starting to train on a new task. Motivated by this 'stability gap', we propose that continual learning strategies should focus not only on the optimization objective, but also on the way this objective is optimized. While there is some continual learning work that alters the optimization trajectory (e.g., using gradient projection techniques), this line of research is positioned as an alternative to improving the optimization objective, whereas we argue it should be complementary. In search of empirical support for our proposition, we perform a series of pre-registered experiments combining replay-approximated joint objectives with gradient projection-based optimization routines. However, this first experimental attempt fails to show clear and consistent benefits. Nevertheless, our conceptual arguments, as well as some of our empirical results, demonstrate the distinctive importance of the optimization trajectory in continual learning, thereby opening up a new direction for continual learning research.
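
For illustration, a minimal sketch of the kind of combination the abstract describes: a replay-approximated joint objective paired with a gradient projection step in the optimization routine. This is not the paper's exact method; the model, dimensions, and data here are hypothetical, and the projection follows the well-known A-GEM-style rule (remove the component of the gradient that conflicts with the gradient on replayed past-task data).

```python
# Hedged sketch: replay-approximated joint loss + A-GEM-style gradient projection.
# All names, sizes, and data below are illustrative assumptions, not the paper's setup.
import torch
import torch.nn as nn

model = nn.Linear(10, 2)                               # toy model (hypothetical)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

def flat_grad(loss):
    """Flattened gradient of `loss` w.r.t. the model parameters."""
    grads = torch.autograd.grad(loss, model.parameters(), retain_graph=True)
    return torch.cat([g.reshape(-1) for g in grads])

def training_step(x_new, y_new, x_replay, y_replay):
    # Replay-approximated joint objective: current-task loss + replayed-task loss.
    joint_loss = loss_fn(model(x_new), y_new) + loss_fn(model(x_replay), y_replay)
    g = flat_grad(joint_loss)
    # Reference gradient on replayed data only (proxy for the past-task loss).
    g_ref = flat_grad(loss_fn(model(x_replay), y_replay))
    # Projection step: if the update would increase the replay loss,
    # remove the conflicting component (alters the optimization trajectory,
    # not the objective).
    dot = torch.dot(g, g_ref)
    if dot < 0:
        g = g - (dot / torch.dot(g_ref, g_ref)) * g_ref
    # Write the (possibly projected) gradient back and take an optimizer step.
    offset = 0
    for p in model.parameters():
        n = p.numel()
        p.grad = g[offset:offset + n].view_as(p).clone()
        offset += n
    opt.step()
    opt.zero_grad()

# Hypothetical usage: random batches standing in for current-task and replay data.
training_step(torch.randn(8, 10), torch.randint(0, 2, (8,)),
              torch.randn(8, 10), torch.randint(0, 2, (8,)))
```

In this sketch the loss already approximates the joint objective via replay, while the projection separately constrains how that objective is optimized, mirroring the paper's argument that the two perspectives are complementary rather than alternatives.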
DOI: 10.48550/arxiv.2311.04898