Deep neural networks for acoustic emotion recognition: Raising the benchmarks

Bibliographic details
Main authors:	Stuhlsatz, Andre; Meyer, Christine; Eyben, Florian; Zielke, Thomas; Meier, Gunter; Schuller, Bjorn
Format:	Conference proceeding
Language:	English
Subjects:	Acoustics; Affective Computing; Artificial neural networks; Deep Neural Networks; Emotion recognition; Feature extraction; Generalized Discriminant Analysis; Speech; Support vector machines

Description
Abstract:	Deep Neural Networks (DNNs) denote multilayer artificial neural networks with more than one hidden layer and millions of free parameters. We propose a Generalized Discriminant Analysis (GerDA) based on DNNs to learn discriminative features of low dimension optimized with respect to a fast classification from a large set of acoustic features for emotion recognition. On nine frequently used emotional speech corpora, we compare the performance of GerDA features and their subsequent linear classification with previously reported benchmarks obtained using the same set of acoustic features classified by Support Vector Machines (SVMs). Our results impressively show that low-dimensional GerDA features capture hidden information from the acoustic features leading to a significantly raised unweighted average recall and considerably raised weighted average recall.
ISSN:	1520-6149 2379-190X
DOI:	10.1109/ICASSP.2011.5947651
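
Note on the metrics named in the abstract: unweighted average recall (UAR) is the plain mean of the per-class recalls, while weighted average recall (WAR) weights each class recall by its class frequency, which makes it equal to overall accuracy. The sketch below illustrates the difference on an imbalanced toy example. The function name `recalls` and the toy labels are hypothetical and chosen only for illustration; this is not the authors' evaluation code.

```python
from collections import defaultdict


def recalls(y_true, y_pred):
    """Return (UAR, WAR) for two equal-length label sequences."""
    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    per_class = [correct[c] / total[c] for c in total]
    uar = sum(per_class) / len(per_class)        # every class counts equally
    war = sum(correct.values()) / len(y_true)    # classes weighted by frequency
    return uar, war


# Hypothetical, imbalanced toy labels: 8 "neutral" samples, 2 "anger" samples.
y_true = ["neutral"] * 8 + ["anger"] * 2
y_pred = ["neutral"] * 8 + ["neutral", "anger"]

uar, war = recalls(y_true, y_pred)
print(f"UAR = {uar:.2f}, WAR = {war:.2f}")  # prints "UAR = 0.75, WAR = 0.90"
```

Because one of the two "anger" samples is missed, the minority class pulls UAR down to 0.75 while WAR stays at 0.90, which is why UAR is the stricter figure on corpora with unbalanced emotion classes.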