Enhancing Emotion Recognition in Incomplete Data: A Novel Cross-Modal Alignment, Reconstruction, and Refinement Framework
Main Authors: , , , , , , , , ,
Format: Article
Language: English
Online Access: Order full text
Abstract:

Multimodal emotion recognition systems rely heavily on the full availability of modalities and suffer significant performance declines when modal data is incomplete. To tackle this issue, we present the Cross-Modal Alignment, Reconstruction, and Refinement (CM-ARR) framework, which sequentially performs cross-modal alignment, reconstruction, and refinement to handle missing modalities and enhance emotion recognition. The alignment phase uses unsupervised distribution-based contrastive learning to align heterogeneous modal distributions, reducing discrepancies and modeling semantic uncertainty effectively. The reconstruction phase applies normalizing flow models to transform these aligned distributions and recover missing modalities. The refinement phase employs supervised point-based contrastive learning to disrupt semantic correlations and accentuate emotional traits, thereby enriching the affective content of the reconstructed representations. Extensive experiments on the IEMOCAP and MSP-IMPROV datasets confirm the superior performance of CM-ARR under both missing- and complete-modality conditions. Notably, averaged across six missing-modality scenarios, CM-ARR achieves absolute improvements of 2.11% in WAR and 2.12% in UAR on IEMOCAP, and of 1.71% in WAR and 1.96% in UAR on MSP-IMPROV.
DOI: 10.48550/arxiv.2407.09029
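
The alignment phase described in the abstract lends itself to a short illustration. Below is a minimal sketch, not the authors' implementation, of distribution-based contrastive alignment. It assumes each modality encoder outputs a diagonal Gaussian (mean and standard deviation) per utterance, and that matched cross-modal pairs are pulled together with an InfoNCE-style loss over a 2-Wasserstein distance; the function names and the choice of distance are illustrative assumptions.

```python
# Sketch of distribution-based contrastive alignment (assumed formulation,
# not the CM-ARR authors' code).
import torch
import torch.nn.functional as F

def wasserstein2_sq(mu_a, std_a, mu_b, std_b):
    """Squared 2-Wasserstein distance between all pairs of diagonal Gaussians.

    mu_*, std_*: (N, D) tensors. Returns an (N, N) pairwise distance matrix:
    W2^2 = ||mu_a - mu_b||^2 + ||std_a - std_b||^2 for diagonal covariances.
    """
    d_mu = ((mu_a[:, None, :] - mu_b[None, :, :]) ** 2).sum(-1)
    d_std = ((std_a[:, None, :] - std_b[None, :, :]) ** 2).sum(-1)
    return d_mu + d_std

def distribution_contrastive_loss(mu_a, std_a, mu_b, std_b, temperature=0.1):
    """InfoNCE over negative distances: matched pairs (the diagonal) are positives."""
    logits = -wasserstein2_sq(mu_a, std_a, mu_b, std_b) / temperature
    targets = torch.arange(mu_a.size(0), device=mu_a.device)
    # Symmetric loss: align modality a to b and b to a.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

if __name__ == "__main__":
    torch.manual_seed(0)
    N, D = 8, 64
    # Hypothetical per-modality encoders would produce these (mu, std) pairs.
    mu_audio, std_audio = torch.randn(N, D), torch.rand(N, D) + 0.1
    mu_text, std_text = torch.randn(N, D), torch.rand(N, D) + 0.1
    print(distribution_contrastive_loss(mu_audio, std_audio, mu_text, std_text))
```

In this formulation, the variance terms make the loss compare whole distributions rather than single point embeddings, which is one way to realize the semantic-uncertainty modeling the abstract refers to.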
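The reconstruction phase can be sketched in a similar spirit. Assuming one normalizing flow per modality that maps its aligned latent to a shared Gaussian base, a missing modality could be recovered by inverting its flow on a base code obtained from an available modality. The single RealNVP-style coupling layer, dimensions, and helper names below are all assumptions for illustration; the abstract only states that normalizing flow models transform the aligned distributions.

```python
# Sketch of flow-based reconstruction (assumed design, not the authors' code).
import math
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    """One RealNVP-style coupling layer; practical flows stack several with permutations."""
    def __init__(self, dim, hidden=128):
        super().__init__()
        self.half = dim // 2
        self.net = nn.Sequential(
            nn.Linear(self.half, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * (dim - self.half)),
        )

    def forward(self, x):
        # Split; transform the second half conditioned on the first.
        x1, x2 = x[:, :self.half], x[:, self.half:]
        s, t = self.net(x1).chunk(2, dim=-1)
        s = torch.tanh(s)  # keep scales bounded for stability
        y2 = x2 * s.exp() + t
        log_det = s.sum(dim=-1)  # log |det Jacobian| of the transform
        return torch.cat([x1, y2], dim=-1), log_det

    @torch.no_grad()
    def inverse(self, y):
        y1, y2 = y[:, :self.half], y[:, self.half:]
        s, t = self.net(y1).chunk(2, dim=-1)
        s = torch.tanh(s)
        return torch.cat([y1, (y2 - t) * (-s).exp()], dim=-1)

def flow_nll(flow, x):
    """Negative log-likelihood under a standard-normal base distribution."""
    z, log_det = flow(x)
    log_pz = -0.5 * (z.pow(2) + math.log(2 * math.pi)).sum(dim=-1)
    return -(log_pz + log_det).mean()

# Recovering a hypothetical missing audio latent from an available text latent:
#   z, _ = flow_text(text_latent)        # text -> shared base code
#   audio_hat = flow_audio.inverse(z)    # base code -> reconstructed audio latent
```

Because the coupling transform is exactly invertible with a cheap log-determinant, the same trained model serves both roles: density estimation during training and deterministic recovery of the missing modality at inference.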