Speaking Effect Removal on Emotion Recognition From Facial Expressions Based on Eigenface Conversion

Bibliographic details
Published in:	IEEE Transactions on Multimedia, 2013-12, Vol. 15 (8), p. 1732-1744
Main authors:	WU, Chung-Hsien; WEI, Wen-Li; LIN, Jen-Chun; LEE, Wei-Yu
Format:	Article
Language:	English
Subjects:	Active appearance model; Anatomical correlates of behavior; Applied sciences; Arousal-valence emotion plane; articulatory attribute; Artificial intelligence; Behavioral psychophysiology; Biological and medical sciences; Computer science control theory systems; Context modeling; conversion function; Data processing. List processing. Character string processing; Emotion recognition; Exact sciences and technology; Face recognition; facial expression; Facial features; Fundamental and applied biological sciences. Psychology; Memory organisation. Data processing; Pattern recognition. Digital image processing. Computational geometry; Psychology. Psychoanalysis. Psychiatry; Psychology. Psychophysiology; Software; Speech; Visualization

Description
Abstract:	The speaking effect is a crucial issue that may dramatically degrade the performance of emotion recognition from facial expressions. To address this problem, an eigenface conversion-based approach is proposed to remove the speaking effect from facial expressions and thereby improve the accuracy of emotion recognition. In the proposed approach, a context-dependent linear conversion function modeled by a statistical Gaussian Mixture Model (GMM) is constructed from parallel data of speaking and non-speaking emotional facial expressions. To model the speaking effect in more detail, the conversion functions are categorized using a decision tree that considers the visual temporal context of the Articulatory Attribute (AA) classes of the corresponding input speech segments. The Arousal-Valence (A-V) emotion plane is commonly used to define emotion classes dimensionally. To verify the quadrant of the A-V plane identified from the reconstructed facial feature points, an expression template is constructed to represent the feature points of the non-speaking facial expressions for each quadrant. Given the verified quadrant, a regression scheme is then employed to estimate the A-V values of the facial expression as a precise point in the A-V emotion plane. Experimental results show that the proposed method outperforms current approaches and demonstrate that removing the speaking effect from facial expressions is useful for improving the performance of emotion recognition.
ISSN:	1520-9210, 1941-0077
DOI:	10.1109/TMM.2013.2272917
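
Note on the abstract: the GMM-based linear conversion function it mentions can be illustrated with a minimal sketch. The code below uses the standard joint-density GMM mapping known from voice conversion, where each mixture component contributes a local linear map weighted by its posterior probability. This is an assumption made for illustration; the record does not give the paper's actual formula, and the decision-tree context categories are omitted. The names `gaussian_pdf` and `convert`, and all toy parameters, are hypothetical. Here `x` stands in for eigenface coefficients of a speaking face and the output for those of the matching non-speaking face.

```python
import numpy as np


def gaussian_pdf(x, mean, cov):
    """Density of a multivariate normal distribution at a single point x."""
    d = x.shape[0]
    diff = x - mean
    solved = np.linalg.solve(cov, diff)
    norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov))
    return float(np.exp(-0.5 * diff @ solved) / norm)


def convert(x, weights, means, covs, dx):
    """Map a source vector x to the expected target vector E[y | x].

    Assumed parameter layout (hypothetical, not taken from the paper):
    means[i] is the joint mean [mu_x, mu_y], covs[i] is the joint
    covariance [[S_xx, S_xy], [S_yx, S_yy]], dx is the source dimension.
    """
    x = np.asarray(x, dtype=float)
    posteriors = []
    local_maps = []
    for w, mu, S in zip(weights, means, covs):
        mu_x, mu_y = mu[:dx], mu[dx:]
        S_xx, S_yx = S[:dx, :dx], S[dx:, :dx]
        # Unnormalized posterior of this component given the source vector.
        posteriors.append(w * gaussian_pdf(x, mu_x, S_xx))
        # Local linear conversion for this component.
        local_maps.append(mu_y + S_yx @ np.linalg.solve(S_xx, x - mu_x))
    posteriors = np.array(posteriors)
    posteriors /= posteriors.sum()
    # Posterior-weighted sum of the per-component linear maps.
    return posteriors @ np.array(local_maps)


# Toy two-component model with 1-D source and 1-D target (made-up values).
weights = [0.5, 0.5]
means = [np.array([0.0, 1.0]), np.array([10.0, -1.0])]
covs = [np.array([[1.0, 0.5], [0.5, 1.0]]) for _ in range(2)]

converted = convert([0.0], weights, means, covs, dx=1)
print(converted)
```

In practice the mixture parameters would be estimated from parallel speaking and non-speaking data, for example with an EM-based GMM fit over the stacked source and target vectors. The sketch assumes they are already available.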