Leveraging Retrieval Augment Approach for Multimodal Emotion Recognition Under Missing Modalities
Format: | Article |
Language: | English |
Abstract: | Multimodal emotion recognition (MER) relies on complete multimodal
information and robust multimodal joint representations to achieve high
performance. In reality, however, the ideal condition of full modality
integrity rarely holds, and some modalities are often missing. For example,
video, audio, or text data may be lost due to sensor failure or network
bandwidth problems, which poses a great challenge to MER research.
Traditional methods extract useful information from the available modalities
and reconstruct the missing ones to learn a robust multimodal joint
representation. These methods have laid a solid foundation for research in
this field and, to a certain extent, alleviated the difficulty of MER under
missing modalities. However, relying solely on internal reconstruction and
multimodal joint learning has its limitations, especially when the missing
information is critical for emotion recognition. To address this challenge,
we propose Retrieval Augment for Missing Modality Multimodal Emotion
Recognition (RAMER), a novel framework that introduces similar multimodal
emotion data to enhance emotion recognition performance under missing
modalities. By leveraging databases that contain related multimodal emotion
data, we can retrieve similar multimodal emotion information to fill the gaps
left by missing modalities. Extensive experimental results demonstrate that
our framework outperforms existing state-of-the-art approaches on
missing-modality MER tasks. The whole project is publicly available at
https://github.com/WooyoohL/Retrieval_Augment_MER. |
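The retrieval-augmented idea described in the abstract, filling a missing modality with information retrieved from similar samples in a database, can be illustrated with a minimal sketch. This is not the authors' actual RAMER implementation; it only shows the general pattern under assumed inputs: nearest-neighbour search over embeddings of an available modality (e.g. text), then substituting the matched entry's embedding for the missing modality (e.g. audio). All names and dimensions here are hypothetical.

```python
import numpy as np

def retrieve_missing_modality(avail_emb, db_avail, db_missing):
    """Fill a missing modality by nearest-neighbour retrieval.

    avail_emb : (d,) embedding of the sample's available modality
    db_avail  : (N, d) database embeddings for that same modality
    db_missing: (N, k) database embeddings for the modality that is missing
    Returns the missing-modality embedding of the most similar database entry.
    """
    # Cosine similarity between the query and every database entry.
    q = avail_emb / (np.linalg.norm(avail_emb) + 1e-8)
    db = db_avail / (np.linalg.norm(db_avail, axis=1, keepdims=True) + 1e-8)
    sims = db @ q
    # Use the closest match as a proxy for the missing modality.
    idx = int(np.argmax(sims))
    return db_missing[idx]

# Toy usage: 3 database samples, text dim 4, audio dim 2 (all made up).
db_text  = np.array([[1., 0., 0., 0.], [0., 1., 0., 0.], [0., 0., 1., 0.]])
db_audio = np.array([[0.1, 0.9], [0.5, 0.5], [0.9, 0.1]])
query_text = np.array([0.9, 0.1, 0., 0.])  # closest to database entry 0
audio_fill = retrieve_missing_modality(query_text, db_text, db_audio)
```

A real system would retrieve several neighbours and fuse them (or their features) rather than copying a single entry, but the core mechanism, querying an external emotion database with whatever modalities survive, is the same.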
DOI: | 10.48550/arxiv.2410.02804 |