Vowels formants analysis allows straightforward detection of high arousal emotions

Bibliographic Details
Main Authors: Vlasenko, Bogdan; Philippou-Hubner, David; Prylipko, Dmytro; Bock, Ronald; Siegert, Ingo; Wendemuth, Andreas
Format: Conference Proceeding
Language: English
Description
Summary: Recently, automatic emotion recognition from speech has attracted growing interest within the human-machine interaction research community. Most emotion recognition methods use context-independent frame-level or turn-level analysis. In this article, we introduce context-dependent vowel-level analysis applied to emotion classification. The average first formant value extracted at the vowel level is used as a one-dimensional acoustic feature vector. The Neyman-Pearson criterion is used for classification. Our classifier is able to detect high-arousal emotions with small error rates. We show that the smallest emotional unit should be the vowel instead of the word. We find that vowel-level analysis can be an important factor in developing a robust emotion classifier. Our research can also be useful for developing robust affective speech recognition methods and high-quality emotional speech synthesis systems.
ISSN: 1945-7871, 1945-788X
DOI: 10.1109/ICME.2011.6012003
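
Illustrative note: the summary describes classifying on a single feature (the average first formant, F1, per vowel) using the Neyman-Pearson criterion. The sketch below is one possible reading of that idea, not the authors' implementation. It assumes Gaussian class models for low-arousal and high-arousal speech, forms a log-likelihood ratio, and picks the decision threshold so that the false-alarm rate on the low-arousal training data stays at or below a chosen level `alpha`. The function names, the value of `alpha`, and all F1 values are hypothetical; the data are synthetic and are not results from the paper.

```python
import math
import random
import statistics


def gaussian_logpdf(x, mean, std):
    """Log density of a one-dimensional Gaussian at x."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def fit_np_detector(f1_low, f1_high, alpha=0.05):
    """Fit a Neyman-Pearson-style detector on per-vowel mean F1 values (Hz).

    Assumption (not from the paper): each class is modeled as a Gaussian.
    The threshold is set on the log-likelihood ratio so that at most a
    fraction `alpha` of the low-arousal training samples is flagged.
    """
    mu0, sd0 = statistics.mean(f1_low), statistics.stdev(f1_low)
    mu1, sd1 = statistics.mean(f1_high), statistics.stdev(f1_high)

    def log_lr(x):
        # Log-likelihood ratio: high-arousal model versus low-arousal model.
        return gaussian_logpdf(x, mu1, sd1) - gaussian_logpdf(x, mu0, sd0)

    # Empirical quantile of the scores under the low-arousal (null) class.
    null_scores = sorted(log_lr(x) for x in f1_low)
    n = len(null_scores)
    k = min(n - 1, math.ceil((1 - alpha) * n))
    threshold = null_scores[k]

    def detect(x):
        # Flag high arousal only when the score strictly exceeds the threshold.
        return log_lr(x) > threshold

    return detect, threshold


# Synthetic per-vowel mean F1 values in Hz (hypothetical, for illustration only).
rng = random.Random(0)
low = [rng.gauss(500.0, 50.0) for _ in range(200)]
high = [rng.gauss(650.0, 60.0) for _ in range(200)]

detect, threshold = fit_np_detector(low, high, alpha=0.05)
false_alarm = sum(detect(x) for x in low) / len(low)
hit_rate = sum(detect(x) for x in high) / len(high)
print(f"false-alarm rate: {false_alarm:.3f}, detection rate: {hit_rate:.3f}")
```

Fixing the false-alarm rate first and then maximizing detection is what distinguishes this criterion from a plain minimum-error threshold; in practice the threshold would be checked on held-out data, since the bound above holds only for the training samples.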