Multi-Objective Based Spatio-Temporal Feature Representation Learning Robust to Expression Intensity Variations for Facial Expression Recognition

Bibliographic details
Published in: IEEE Transactions on Affective Computing, 2019-04, Vol. 10 (2), p. 223-236
Main authors: Kim, Dae Hoe; Baddar, Wissam J.; Jang, Jinhyeok; Ro, Yong Man
Format: Article
Language: English
Description
Abstract: Facial expression recognition (FER) is increasingly gaining importance in various emerging affective computing applications. In practice, achieving accurate FER is challenging due to the large amount of inter-personal variations such as expression intensity variations. In this paper, we propose a new spatio-temporal feature representation learning for FER that is robust to expression intensity variations. The proposed method utilizes representative expression-states (e.g., onset, apex and offset of expressions) which can be specified in facial sequences regardless of the expression intensity. The characteristics of facial expressions are encoded in two parts in this paper. As the first part, spatial image characteristics of the representative expression-state frames are learned via a convolutional neural network. Five objective terms are proposed to improve the expression class separability of the spatial feature representation. In the second part, temporal characteristics of the spatial feature representation in the first part are learned with a long short-term memory of the facial expression. Comprehensive experiments have been conducted on a deliberate expression dataset (MMI) and a spontaneous micro-expression dataset (CASME II). Experimental results showed that the proposed method achieved higher recognition rates in both datasets compared to the state-of-the-art methods.
ISSN: 1949-3045
DOI: 10.1109/TAFFC.2017.2695999