Optimizable Ensemble Regression for Arousal and Valence Predictions from Visual Features

Bibliographic details
Published in: Engineering Proceedings 2023-11, Vol. 58 (1), p. 3
Main authors: Itaf Omar Joudeh, Ana-Maria Cretu, Stéphane Bouchard
Format: Article
Language: English
Description
Abstract: The cognitive state of a person can be categorized using the circumplex model of emotional states, a continuous model with two dimensions: arousal and valence. We exploit the Remote Collaborative and Affective Interactions (RECOLA) database, which includes audio, video, and physiological recordings of interactions between human participants, to predict arousal and valence values using machine learning techniques. To allow learners to focus on the most relevant data, features are extracted from the raw data. Such features can be predesigned or learned. Learned features are learned automatically and used by deep learning solutions. Predesigned features are calculated before machine learning and fed into the learner. Our previous work on video recordings focused on learned features. In this paper, we extend our work to predesigned visual features extracted from video recordings. We process these features by applying time delay and sequencing, arousal/valence labelling, and shuffling and splitting. We then train and test regressors to predict arousal and valence values. Our results outperform those from the literature. Using an optimizable ensemble, we achieve a root mean squared error (RMSE), Pearson’s correlation coefficient (PCC), and concordance correlation coefficient (CCC) of 0.1033, 0.8498, and 0.8001, respectively, on arousal predictions, and 0.07016, 0.8473, and 0.8053, respectively, on valence predictions.
ISSN:2673-4591
DOI:10.3390/ecsa-10-16009
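
Note on the reported metrics: the abstract evaluates predictions with RMSE, PCC, and CCC. The sketch below shows one common way to compute these three quantities from a list of annotated values and a list of predicted values. It is an illustration only, not the authors' code: the function names are hypothetical, population (divide-by-n) variance and covariance are assumed, and the CCC follows the standard definition 2·cov / (var_true + var_pred + (mean_true − mean_pred)²).

```python
import math


def _mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def _cov(xs, ys):
    """Population covariance (divide by n); _cov(xs, xs) is the variance."""
    mx, my = _mean(xs), _mean(ys)
    return _mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])


def rmse(y_true, y_pred):
    """Root mean squared error between annotations and predictions."""
    return math.sqrt(_mean([(t - p) ** 2 for t, p in zip(y_true, y_pred)]))


def pcc(y_true, y_pred):
    """Pearson's correlation coefficient: measures linear association only."""
    return _cov(y_true, y_pred) / math.sqrt(
        _cov(y_true, y_true) * _cov(y_pred, y_pred)
    )


def ccc(y_true, y_pred):
    """Concordance correlation coefficient: also penalizes offset and scale."""
    mean_gap = _mean(y_true) - _mean(y_pred)
    return 2 * _cov(y_true, y_pred) / (
        _cov(y_true, y_true) + _cov(y_pred, y_pred) + mean_gap ** 2
    )


if __name__ == "__main__":
    # Made-up values for illustration; these are not RECOLA data.
    annotations = [0.0, 1.0, 2.0, 3.0]
    predictions = [1.0, 2.0, 3.0, 4.0]  # perfectly correlated, constant offset
    print(rmse(annotations, predictions))
    print(pcc(annotations, predictions))
    print(ccc(annotations, predictions))
```

The example input illustrates why both correlation measures are usually reported: a constant offset leaves PCC at its maximum, while CCC drops because it also accounts for the difference in means.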