SeQuiFi: Mitigating Catastrophic Forgetting in Speech Emotion Recognition with Sequential Class-Finetuning
Main authors: , , , ,
Format: Article
Language: English
Subjects:
Abstract: In this work, we introduce SeQuiFi, a novel approach for mitigating catastrophic forgetting (CF) in speech emotion recognition (SER). SeQuiFi adopts a sequential class-finetuning strategy, where the model is fine-tuned incrementally on one emotion class at a time, preserving and enhancing retention for each class. While various state-of-the-art (SOTA) methods, such as regularization-based, memory-based, and weight-averaging techniques, have been proposed to address CF, it remains a challenge, particularly with diverse and multilingual datasets. Through extensive experiments, we demonstrate that SeQuiFi significantly outperforms both vanilla fine-tuning and SOTA continual learning techniques in terms of accuracy and F1 scores on multiple benchmark SER datasets, including CREMA-D, RAVDESS, Emo-DB, MESD, and SHEMO, covering different languages.
DOI: 10.48550/arxiv.2410.12567
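
The core idea the abstract describes, fine-tuning on one emotion class at a time rather than on the full label set at once, can be sketched in a few lines. The snippet below is a minimal illustration under assumptions of our own: a toy linear classifier, random stand-in features, a hypothetical four-emotion label set, and plain AdamW fine-tuning. It is not the authors' released implementation, only the sequential class-by-class training loop the abstract outlines.

```python
# Minimal sketch of sequential class-finetuning: the model is fine-tuned
# incrementally on one emotion class at a time. Model, data, label set,
# and hyperparameters are hypothetical stand-ins, not the paper's setup.
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

EMOTIONS = ["anger", "happiness", "sadness", "neutral"]  # assumed class order

def finetune_on_class(model, loader, epochs=1, lr=1e-4):
    """One fine-tuning pass over the examples of a single emotion class."""
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            opt.step()
    return model

# Toy stand-ins: 64-dim speech features, one tiny linear classifier.
model = nn.Linear(64, len(EMOTIONS))
for class_idx, emotion in enumerate(EMOTIONS):
    # In practice this subset would come from a real SER dataset
    # (e.g. CREMA-D); here it is random data for illustration only.
    x = torch.randn(32, 64)
    y = torch.full((32,), class_idx, dtype=torch.long)
    loader = DataLoader(TensorDataset(x, y), batch_size=8, shuffle=True)
    model = finetune_on_class(model, loader)  # sequential: one class at a time
```

Visiting classes one after another in this way is what distinguishes the recipe from vanilla fine-tuning, which would shuffle all emotion classes together in every batch.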