HEU Emotion: a large-scale database for multimodal emotion recognition in the wild

Bibliographic details
Published in: Neural Computing & Applications, 2021-07, Vol. 33 (14), pp. 8669-8685
Authors: Chen, Jing; Wang, Chenhui; Wang, Kejun; Yin, Chaoqun; Zhao, Cong; Xu, Tao; Zhang, Xinyi; Huang, Ziqiang; Liu, Meichen; Yang, Tao
Format: Article
Language: English
Online access: Full text
Abstract: The study of affective computing in the wild is underpinned by databases, yet existing multimodal emotion databases collected under real-world conditions are few and small, with limited numbers of subjects and expressions in a single language. To address this gap, we collected, annotated, and prepared to release a new natural-state video database, called HEU Emotion. HEU Emotion contains 19,004 video clips in total, divided into two parts according to data source. The first part contains videos downloaded from Tumblr, Google, and Giphy, covering 10 emotions and two modalities (facial expression and body posture). The second part consists of clips extracted manually from movies, TV series, and variety shows, covering 10 emotions and three modalities (facial expression, body posture, and emotional speech). With 9,951 subjects, HEU Emotion is by far the most extensive multimodal emotion database. To provide a benchmark for emotion recognition, we evaluated HEU Emotion with a range of conventional machine learning and deep learning methods. We also propose a multimodal attention module that fuses multimodal features adaptively. After multimodal fusion, recognition accuracies on the two parts increased by 2.19% and 4.01%, respectively, over single-modal facial expression recognition.
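The abstract does not detail the attention module's architecture, so the following is only a minimal sketch of what adaptive attention-weighted fusion over per-modality feature vectors (e.g., face, posture, speech) can look like in PyTorch; the class name, layer shapes, and feature dimensions are all hypothetical, not the paper's actual method.

# Minimal sketch of attention-weighted multimodal fusion, assuming one
# fixed-size feature vector per modality. All names and sizes are
# hypothetical; the paper's actual module may differ.
import torch
import torch.nn as nn


class MultimodalAttentionFusion(nn.Module):
    """Fuses per-modality features with learned attention weights."""

    def __init__(self, feat_dim: int):
        super().__init__()
        # Scores each modality's feature vector; a softmax over the
        # modality axis turns the scores into adaptive fusion weights.
        self.score = nn.Linear(feat_dim, 1)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (batch, num_modalities, feat_dim)
        weights = torch.softmax(self.score(feats), dim=1)  # (batch, M, 1)
        fused = (weights * feats).sum(dim=1)               # (batch, feat_dim)
        return fused


if __name__ == "__main__":
    batch, modalities, dim = 4, 3, 512  # e.g., face, posture, speech
    fusion = MultimodalAttentionFusion(feat_dim=dim)
    x = torch.randn(batch, modalities, dim)
    print(fusion(x).shape)  # torch.Size([4, 512])

The fused vector can then feed a shared emotion classifier; because the weights are recomputed per sample, a modality that is uninformative for a given clip (e.g., inaudible speech) can be down-weighted adaptively.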
ISSN: 0941-0643 (print); 1433-3058 (electronic)
DOI: 10.1007/s00521-020-05616-w