Modelling appearance variations in expressive and neutral face image for automatic facial expression recognition

Bibliographic details
Published in: IET Image Processing 2024-07, Vol.18 (9), p.2449-2460
Main authors: Kumar H N, Naveen, M S, Guru Prasad, Asif Shah, Mohd, Mahadevaswamy, B, Jagadeesh, K, Sudheesh
Format: Article
Language: English
Online access: Full text
Description
Summary: In automatic facial expression recognition (AFER) systems, modelling spatio-temporal feature information, fusing it, and using it effectively are challenging. State-of-the-art studies have examined integrating multiple features to enhance the recognition rate of AFER systems. However, the feature variations between expressive and neutral face images have not been fully explored for identifying the expression class. The proposed research presents an approach to AFER that models appearance variations in both expressive and neutral face images. The main contributions of the work are (i) a novel hybrid feature space that integrates the discriminative feature distributions derived from expressive and neutral face images, and (ii) the preservation of the highly discriminative latent feature distribution using autoencoders. Local binary pattern (LBP) and histogram of oriented gradients (HOG) descriptors are employed to derive discriminative texture and shape information, respectively. A component-based approach is employed, wherein the features are derived from the salient facial regions instead of the whole face. A three-stage stacked deep convolutional autoencoder (SDCA) and a multi-class support vector machine (MSVM) are employed for dimensionality reduction and classification, respectively. Empirical results on widely recognized benchmark datasets indicate that the proposed model achieves superior accuracy in AFER tasks.

This paper develops a hybrid feature space by integrating the discriminative power of HOG and LBP feature distributions derived from the salient facial regions. A novel feature space is then developed by integrating the hybrid feature spaces derived from the expressive and neutral face images to further amplify the discriminative power of the feature descriptor. A three-stage SDCA derives the representation, and an MSVM performs multi-class classification. To test its generalization power, the proposed work is evaluated on three benchmark datasets (CK+, JAFFE, and KDEF).
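The feature-construction idea in the summary can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the function names are hypothetical, the HOG step is reduced to a single orientation histogram per region (no cells or block normalization), and joining the expressive and neutral features by concatenation is an assumption, since the record does not say how the two are integrated. The SDCA and MSVM stages are omitted.

```python
import math

def lbp_histogram(region):
    """Normalized 256-bin histogram of basic 3x3 LBP codes (texture)."""
    # Neighbour offsets, clockwise from top-left; each one sets one bit.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    hist = [0.0] * 256
    for r in range(1, len(region) - 1):
        for c in range(1, len(region[0]) - 1):
            centre = region[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                if region[r + dr][c + dc] >= centre:
                    code |= 1 << bit
            hist[code] += 1.0
    total = sum(hist) or 1.0
    return [h / total for h in hist]

def orientation_histogram(region, bins=9):
    """Simplified HOG (shape): one magnitude-weighted histogram of
    unsigned gradient orientations, L2-normalized."""
    hist = [0.0] * bins
    for r in range(1, len(region) - 1):
        for c in range(1, len(region[0]) - 1):
            gx = region[r][c + 1] - region[r][c - 1]
            gy = region[r + 1][c] - region[r - 1][c]
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            index = min(int(angle / (180.0 / bins)), bins - 1)
            hist[index] += math.hypot(gx, gy)
    norm = math.sqrt(sum(h * h for h in hist)) or 1.0
    return [h / norm for h in hist]

def hybrid_features(region):
    """Hybrid feature vector for one salient region: LBP followed by HOG."""
    return lbp_histogram(region) + orientation_histogram(region)

def expression_descriptor(expressive_regions, neutral_regions):
    """Assumed fusion: concatenate expressive and neutral hybrid features
    region by region (both lists must cover the same facial regions)."""
    descriptor = []
    for expressive, neutral in zip(expressive_regions, neutral_regions):
        descriptor += hybrid_features(expressive) + hybrid_features(neutral)
    return descriptor

# Toy 5x5 "regions": a horizontal intensity ramp and a flat patch.
ramp = [[10 * c for c in range(5)] for _ in range(5)]
flat = [[10] * 5 for _ in range(5)]
vector = expression_descriptor([ramp], [flat])
print(len(vector))  # 530 = 2 images x (256 LBP bins + 9 orientation bins)
```

In practice the two descriptors would more likely come from an image library, for example `skimage.feature.local_binary_pattern` and `skimage.feature.hog` in scikit-image, and the concatenated vector would then be passed to the dimensionality-reduction and classification stages.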
ISSN: 1751-9659, 1751-9667
DOI: 10.1049/ipr2.13109