Iterative Prototype Refinement for Ambiguous Speech Emotion Recognition
Main authors: | , , , , , , |
---|---|
Format: | Article |
Language: | eng |
Summary: | Recognizing emotions from speech is a daunting task due to the subtlety and
ambiguity of expressions. Traditional speech emotion recognition (SER) systems,
which typically rely on a singular, precise emotion label, struggle with this
complexity. Therefore, modeling the inherent ambiguity of emotions is an urgent
problem. In this paper, we propose an iterative prototype refinement framework
(IPR) for ambiguous SER. IPR comprises two interlinked components: contrastive
learning and class prototypes. The former provides an efficient way to obtain
high-quality representations of ambiguous samples. The latter are dynamically
updated based on ambiguous labels, that is, the similarity of the ambiguous data to
all prototypes. These refined embeddings yield precise pseudo labels, thus
reinforcing representation quality. Experimental evaluations conducted on the
IEMOCAP dataset validate the superior performance of IPR over state-of-the-art
methods, thus proving the effectiveness of our proposed method. |
---|---|
DOI: | 10.48550/arxiv.2408.00325 |
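
The summary describes a loop in which embeddings are compared with class prototypes to produce pseudo labels, and the prototypes are then updated from those labels. The sketch below illustrates one possible form of that loop. The summary does not state the update rule, so everything specific here is an assumption made for illustration: the cosine similarity, the softmax `temperature`, the `momentum` moving average, and the function names `pseudo_label` and `update_prototypes`. The contrastive learning component is not shown.

```python
import math


def normalize(v):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n > 0 else list(v)


def cosine_similarities(embedding, prototypes):
    """Cosine similarity between one embedding and every class prototype."""
    e = normalize(embedding)
    return [sum(a * b for a, b in zip(e, normalize(p))) for p in prototypes]


def pseudo_label(embedding, prototypes, temperature=0.1):
    """Soft pseudo label: softmax over similarities to all prototypes.

    The temperature value is a hypothetical choice, not taken from the paper.
    """
    sims = cosine_similarities(embedding, prototypes)
    top = max(sims)  # subtract the maximum for numerical stability
    weights = [math.exp((s - top) / temperature) for s in sims]
    total = sum(weights)
    return [w / total for w in weights]


def update_prototypes(prototypes, embedding, soft_label, momentum=0.9):
    """Move each prototype toward the embedding in proportion to its label weight.

    This moving-average rule is an assumed stand-in for the paper's update.
    """
    e = normalize(embedding)
    updated = []
    for proto, weight in zip(prototypes, soft_label):
        rate = (1.0 - momentum) * weight
        mixed = [(1.0 - rate) * a + rate * b for a, b in zip(proto, e)]
        updated.append(normalize(mixed))
    return updated


# One refinement step on toy 2-D data with two emotion classes.
prototypes = [[1.0, 0.0], [0.0, 1.0]]
embedding = [0.9, 0.1]
label = pseudo_label(embedding, prototypes)
prototypes = update_prototypes(prototypes, embedding, label)
```

Repeating the two calls over a dataset gives the iterative behaviour the summary refers to: better prototypes give sharper pseudo labels, which in turn pull the prototypes closer to the samples they represent.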