Reducing catastrophic forgetting of incremental learning in the absence of rehearsal memory with task-specific token
Main Authors:
Format: Article
Language: eng
Subjects:
Online Access: Order full text
Abstract: Deep learning models generally exhibit catastrophic forgetting when they learn new data continuously. Many incremental learning approaches address this problem by reusing data from previous tasks while learning new ones. However, direct access to past data raises privacy and security concerns. To address these issues, we present a novel method that preserves previous knowledge without storing previous data. The method is inspired by the vision-transformer architecture and employs a unique token that encapsulates the compressed knowledge of each task. It generates task-specific embeddings by directing attention differently depending on the task associated with the data, so that the tokens effectively mimic the effect of maintaining multiple models. Our method also incorporates a distillation process that keeps these interactions efficient even after multiple additional learning steps, thereby protecting the model against forgetting. We measured the performance of our model in terms of accuracy and backward transfer on a benchmark dataset across different task-incremental learning scenarios. Our results demonstrate the superiority of our approach, which achieved the highest accuracy and lowest backward transfer among the compared methods. Beyond presenting a new model, our approach lays the foundation for various extensions within the spectrum of vision-transformer architectures.
DOI: 10.48550/arxiv.2411.05846
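As a rough illustration of the idea summarized in the abstract, a shared vision-transformer backbone in which a learnable, task-specific token steers attention and stands in for a separate per-task model, the sketch below prepends one token per task to the patch embeddings and reads the prediction off that token's output. This is an assumption-laden reconstruction, not the authors' implementation: the class name `TaskTokenViT`, the hyperparameters, and the per-task classification heads are illustrative choices, and the paper's distillation step is omitted.

```python
# Minimal sketch (not the authors' code): a ViT-style encoder where each task
# owns a learnable token, so attention over the patches is conditioned on the
# task and the shared backbone behaves like several task-specific models.
import torch
import torch.nn as nn


class TaskTokenViT(nn.Module):
    def __init__(self, embed_dim=192, num_patches=64, num_tasks=5,
                 num_heads=3, depth=4, num_classes_per_task=10):
        super().__init__()
        # One learnable token per task, analogous to a [CLS] token.
        self.task_tokens = nn.Parameter(torch.zeros(num_tasks, 1, embed_dim))
        nn.init.trunc_normal_(self.task_tokens, std=0.02)
        # Positional embedding for the token plus the patch sequence.
        self.pos_embed = nn.Parameter(torch.zeros(1, num_patches + 1, embed_dim))
        nn.init.trunc_normal_(self.pos_embed, std=0.02)
        layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=num_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        # One classification head per task (task-incremental setting).
        self.heads = nn.ModuleList(
            [nn.Linear(embed_dim, num_classes_per_task) for _ in range(num_tasks)])

    def forward(self, patch_embeddings, task_id):
        # patch_embeddings: (batch, num_patches, embed_dim)
        b = patch_embeddings.size(0)
        token = self.task_tokens[task_id].expand(b, -1, -1)  # (b, 1, d)
        x = torch.cat([token, patch_embeddings], dim=1) + self.pos_embed
        x = self.encoder(x)
        # The task token's output embedding summarizes the image for this task.
        return self.heads[task_id](x[:, 0])


# Example: two tasks share the backbone but attend through different tokens.
model = TaskTokenViT()
patches = torch.randn(8, 64, 192)         # stand-in patch embeddings
logits_task0 = model(patches, task_id=0)  # shape (8, 10)
logits_task1 = model(patches, task_id=1)  # shape (8, 10)
```

Because only the small per-task token and head are added for each new task, older tokens can in principle be left untouched while a new task is learned; the paper's distillation loss, not shown here, is what keeps those older token-backbone interactions effective as the shared weights continue to change.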