Estimation of continuous valence and arousal levels from faces in naturalistic conditions
Published in: | Nature machine intelligence 2021-01, Vol.3 (1), p.42-50 |
---|---|
Main authors: | , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Full text |
Summary: | Facial affect analysis aims to create new types of human–computer interactions by enabling computers to better understand a person’s emotional state in order to provide ad hoc help and interactions. Since discrete emotional classes (such as anger, happiness, sadness and so on) are not representative of the full spectrum of emotions displayed by humans on a daily basis, psychologists typically rely on dimensional measures, namely valence (how positive the emotional display is) and arousal (how calming or exciting the emotional display looks). However, while estimating these values from a face is natural for humans, it is extremely difficult for computer-based systems, and automatic estimation of valence and arousal in naturalistic conditions is an open problem. Additionally, the subjectivity of these measures makes it hard to obtain good-quality data. Here we introduce a novel deep neural network architecture to analyse facial affect in naturalistic conditions with a high level of accuracy. The proposed network integrates face alignment and jointly estimates both categorical and continuous emotions in a single pass, making it suitable for real-time applications. We test our method on three challenging datasets collected in naturalistic conditions and show that our approach outperforms all previous methods. We also discuss caveats regarding the use of this tool, and ethical aspects that must be considered in its application.
The annotation of the visual signs of emotions can be important for psychological studies and even human–computer interactions. Instead of only ascribing discrete emotions, Toisoul and colleagues use a single neural network that predicts emotional labels on a spectrum of valence and arousal without separate face-alignment steps. |
---|---|
ISSN: | 2522-5839 |
DOI: | 10.1038/s42256-020-00280-0 |