Enhancing masked facial expression recognition with multimodal deep learning
Published in: Multimedia Tools and Applications, 2024-02, Vol. 83 (30), pp. 73911-73921
Format: Article
Language: English
Online access: Full text
Abstract: Facial expression recognition (FER) is an essential field for intelligent human-computer interaction, but the COVID-19 pandemic made unimodal techniques less effective because masks occlude much of the face. Multimodal approaches, which combine information from multiple modalities, are more robust at recognizing emotions from facial expressions. This study proposes a deep-learning-based multimodal methodology for recognizing facial expressions under masks together with vocal expressions. The approach uses two standard datasets, M-LFW-F and CREMA-D, to capture facial and vocal emotional cues; the combined data were used to train a multimodal neural network with fusion techniques that outperformed unimodal methods. The multimodal approach achieved an accuracy of 79.05%, compared with 68.76% for the unimodal approach, demonstrating that it outperforms unimodal techniques for facial expression recognition under masked conditions and highlighting the potential of multimodal techniques for improving FER in challenging scenarios.
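The abstract describes fusing masked-face images with vocal cues but gives no architectural detail. The following is a minimal, hypothetical PyTorch sketch of one plausible late-fusion design; the branch structures, layer sizes, and the 40-dimensional audio feature vector are illustrative assumptions, not the paper's actual architecture. Only the six emotion classes (taken from CREMA-D) come from the cited datasets.

```python
# Hypothetical sketch of a late-fusion multimodal FER network.
# Assumptions (not from the paper): a small CNN over grayscale masked-face
# crops, an MLP over fixed-size acoustic features (e.g. MFCCs), and
# feature-level fusion by concatenation.
import torch
import torch.nn as nn

class MultimodalFER(nn.Module):
    def __init__(self, num_classes: int = 6, audio_dim: int = 40):
        super().__init__()
        # Visual branch: CNN that embeds a masked-face image into 64 features.
        self.visual = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),  # -> (B, 64)
        )
        # Audio branch: MLP that embeds an acoustic feature vector.
        self.audio = nn.Sequential(
            nn.Linear(audio_dim, 64), nn.ReLU(),  # -> (B, 64)
        )
        # Fusion: concatenate both embeddings, then classify emotions.
        self.classifier = nn.Linear(64 + 64, num_classes)

    def forward(self, image: torch.Tensor, audio_feats: torch.Tensor):
        fused = torch.cat([self.visual(image), self.audio(audio_feats)], dim=1)
        return self.classifier(fused)

# Example forward pass with dummy batch shapes.
model = MultimodalFER()
logits = model(torch.randn(8, 1, 64, 64), torch.randn(8, 40))
print(logits.shape)  # torch.Size([8, 6])
```

Concatenation-based fusion is only one option; the reported gain over unimodal baselines (79.05% vs. 68.76%) could equally come from decision-level fusion, where each branch classifies separately and their outputs are averaged.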
ISSN: 1380-7501 (print); 1573-7721 (electronic)
DOI: 10.1007/s11042-024-18362-1