EmotionFusion: A unified ensemble R-CNN approach for advanced facial emotion analysis


Bibliographic details
Published in: Journal of intelligent & fuzzy systems 2023-12, Vol.45 (6), p.10141-10155
Main authors: Umamageswari, A., Deepa, S., Bhagyalakshmi, A., Sangari, A., Raja, K.
Format: Article
Language: eng
Online access: Full text
Description
Summary: Sentiment analysis is the technique of identifying exhibited human emotions using artificial intelligence-based technology, applied here to assess non-verbal reactions to commodities, services, or products. The facial muscles flex and contract differently for each facial expression a person makes, which helps deep learning algorithms identify an emotion. Facial emotion analysis draws on the emotions conveyed through facial expressions and has applications across many industries and domains, including healthcare, security and surveillance, forensics, and autism and cultural studies. In this study, facially expressed sentiments in real-time photographs as well as in an existing dataset are classified using object detection techniques based on deep learning. Fast Region-based Convolutional Neural Network (Fast R-CNN) is an object detection system that uses proposed regions to categorize facial expressions of emotion in real time. The study uses a high-quality video collection of 24 actors who were photographed facially expressing eight distinct emotions (Happy, Sad, Disgust, Anger, Surprise, Fear, Contempt, and Neutral). Fast R-CNN is used for classification, while mouth region-based feature extraction and the Maximally Stable Extremal Regions (MSER) method are used for feature extraction. To assess the deep network’s performance, the proposed work builds a confusion matrix. The average recognition rate of 97.6% for eight emotions indicates that the network generalizes well to new images. The suggested deep network approach may deliver superior recognition performance compared with CNN and SVM methods, and it can be applied to a variety of applications including online classrooms, video game testing, healthcare sectors, and automated industry.
ISSN:1064-1246
1875-8967
DOI:10.3233/JIFS-233842
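
The summary reports that performance was assessed with a confusion matrix and an average recognition rate over eight emotions. The sketch below shows one common way such a figure can be derived. It assumes, without confirmation from the record, that "average recognition rate" means the mean of the per-class recognition rates (diagonal count divided by row total). The function names and the toy labels are hypothetical and are not data from the article.

```python
from typing import List, Optional, Sequence

# The eight emotion classes named in the summary.
EMOTIONS = ["Happy", "Sad", "Disgust", "Anger",
            "Surprise", "Fear", "Contempt", "Neutral"]


def build_confusion_matrix(true_labels: Sequence[str],
                           predicted_labels: Sequence[str],
                           classes: Sequence[str]) -> List[List[int]]:
    """Rows are true classes, columns are predicted classes."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for truth, guess in zip(true_labels, predicted_labels):
        matrix[index[truth]][index[guess]] += 1
    return matrix


def per_class_rates(matrix: List[List[int]]) -> List[Optional[float]]:
    """Recognition rate per class: correct predictions / samples of that class.

    Classes with no samples yield None so they do not distort the average.
    """
    rates: List[Optional[float]] = []
    for i, row in enumerate(matrix):
        total = sum(row)
        rates.append(row[i] / total if total else None)
    return rates


def average_recognition_rate(matrix: List[List[int]]) -> float:
    """Assumed definition: mean of per-class rates over classes with samples."""
    rates = [r for r in per_class_rates(matrix) if r is not None]
    if not rates:
        raise ValueError("confusion matrix contains no samples")
    return sum(rates) / len(rates)


if __name__ == "__main__":
    # Toy labels for illustration only, not results from the article.
    truth = ["Happy", "Happy", "Sad", "Sad"]
    guess = ["Happy", "Sad", "Sad", "Sad"]
    cm = build_confusion_matrix(truth, guess, EMOTIONS)
    print(average_recognition_rate(cm))  # prints 0.75
```

Averaging per class rather than over all samples weights each emotion equally, which matters when some emotions have fewer examples than others. Whether the article uses this definition or overall accuracy is not stated in the record.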