Audio-Visual Continuous Recognition of Emotional State in a Multi-User System Based on Personalized Representation of Facial Expressions and Voice
This paper is devoted to tracking dynamics of psycho-emotional state based on analysis of the user’s facial video and voice. We propose a novel technology with personalized acoustic and visual lightweight neural network models that can be launched in real-time on any laptop or even mobile device. At...
Gespeichert in:
Veröffentlicht in: | Pattern recognition and image analysis 2022-09, Vol.32 (3), p.665-671 |
---|---|
Hauptverfasser: | , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | This paper is devoted to tracking dynamics of psycho-emotional state based on analysis of the user’s facial video and voice. We propose a novel technology with personalized acoustic and visual lightweight neural network models that can be launched in real-time on any laptop or even mobile device. At first, two separate user-independent classifiers (feed-forward neural networks) are trained for speech emotion and facial expression recognition in video. The former extracts acoustic features with OpenL3 or OpenSmile frameworks. The latter is based on preliminary extraction of emotional features from each frame with a pre-trained convolutional neural network. Next, both classifiers are fine-tuned using a small number of short emotional videos that should be available for each user. The face of a user is identified during the real-time tracking of emotional state to choose the concrete neural networks. The final decision about current emotion in a short time frame is predicted by blending the outputs of personalized audio and video classifiers. It is experimentally demonstrated for the Russian Acted Multimodal Affective Set that the proposed approach makes it possible to increase the emotion recognition accuracy by 2–15%. |
---|---|
ISSN: | 1054-6618 1555-6212 |
DOI: | 10.1134/S1054661822030397 |