Emotion Recognition Using Transformers with Masked Learning
Published in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2024, pp. 4860-4865
Format: Article
Language: English
Online access: Order full text
Abstract: In recent years, deep learning has achieved innovative advancements in various fields, including the analysis of human emotions and behaviors. Initiatives such as the Affective Behavior Analysis in-the-wild (ABAW) competition have been particularly instrumental in driving research in this area by providing diverse and challenging datasets that enable precise evaluation of complex emotional states. This study leverages the Vision Transformer (ViT) and Transformer models to focus on the estimation of Valence-Arousal (VA), which signifies the positivity and intensity of emotions, the recognition of various facial expressions, and the detection of Action Units (AU) representing fundamental muscle movements. The approach moves beyond traditional Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) based methods, proposing a new Transformer-based framework that maximizes the understanding of temporal and spatial features. The core contributions of this research are a learning technique based on random frame masking and the application of Focal loss adapted to imbalanced data, enhancing the accuracy and applicability of emotion and behavior analysis in real-world settings. This approach is expected to contribute to the advancement of affective computing and deep learning methodologies.
DOI: 10.48550/arxiv.2403.13731
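
The abstract names Focal loss as the remedy for the class imbalance in the expression and Action Unit labels but gives no formulation. The sketch below is a generic binary focal loss for multi-label AU targets, not the paper's exact loss; the function name, the PyTorch framework, and the default `gamma`/`alpha` values are assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0, alpha=0.25):
    """Binary focal loss for multi-label targets such as Action Units.

    Illustrative sketch: down-weights easy, well-classified examples so
    training focuses on the rare positives that dominate imbalanced emotion
    datasets. Hyperparameters are common defaults, not values from the paper.
    """
    # Unreduced per-element binary cross-entropy so each term can be reweighted.
    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    # p_t is the predicted probability assigned to the true class of each element.
    p = torch.sigmoid(logits)
    p_t = targets * p + (1.0 - targets) * (1.0 - p)
    # alpha balances positives vs. negatives; (1 - p_t)^gamma damps easy cases.
    alpha_t = targets * alpha + (1.0 - targets) * (1.0 - alpha)
    return (alpha_t * (1.0 - p_t) ** gamma * bce).mean()

# Example: 8 clips, 12 AU logits each, with binary ground-truth labels.
logits = torch.randn(8, 12)
targets = torch.randint(0, 2, (8, 12)).float()
print(focal_loss(logits, targets))
```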
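
Likewise, the random frame masking mentioned in the abstract can be pictured as hiding whole frames of a clip before the temporal Transformer sees them. The snippet below is only one plausible reading of that idea: the function name, the `mask_ratio` parameter, and the zero mask token are assumptions, since the abstract does not specify the masking scheme.

```python
import torch

def mask_random_frames(frame_feats, mask_ratio=0.3, mask_token=None):
    """Randomly hide whole frames in a (batch, time, dim) feature sequence.

    frame_feats could be per-frame ViT embeddings; masked positions are
    replaced by a mask token (zeros by default) so the temporal Transformer
    must rely on the remaining frames.
    """
    b, t, d = frame_feats.shape
    # Independent keep/mask decision per frame.
    mask = torch.rand(b, t, device=frame_feats.device) < mask_ratio
    if mask_token is None:
        mask_token = torch.zeros(d, device=frame_feats.device)
    masked = frame_feats.clone()
    masked[mask] = mask_token  # broadcast the token over every masked frame
    return masked, mask

# Example: 4 clips, 16 frames, 768-dim per-frame features.
feats = torch.randn(4, 16, 768)
masked_feats, mask = mask_random_frames(feats, mask_ratio=0.3)
# masked_feats is then fed to the temporal Transformer as usual.
```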