Multi-modal Emotion Estimation for in-the-wild Videos
Format: Article
Language: English
Abstract: In this paper, we briefly introduce our submission to the Valence-Arousal Estimation Challenge of the 3rd Affective Behavior Analysis in-the-wild (ABAW) competition. Our method utilizes multi-modal information, i.e., visual and audio information, and employs a temporal encoder to model the temporal context in the videos. In addition, a smoothing processor is applied to obtain more reasonable predictions, and a model ensemble strategy is used to further improve performance. Experimental results show that our method achieves a CCC of 65.55% for valence and 70.88% for arousal on the validation set of the Aff-Wild2 dataset, demonstrating the effectiveness of the proposed method.
DOI: 10.48550/arxiv.2203.13032
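The abstract reports performance with the Concordance Correlation Coefficient (CCC) and mentions a smoothing step over the per-frame valence/arousal predictions. The sketch below is illustrative only: the paper does not describe the smoothing method, so a simple moving-average smoother is assumed, and all function and variable names here are hypothetical rather than taken from the authors' implementation.

```python
"""Minimal sketch of prediction smoothing and CCC evaluation.

Assumptions: a moving-average smoother stands in for the paper's unspecified
"smooth processor"; per-frame valence/arousal scores are plain NumPy arrays.
"""

import numpy as np


def smooth_predictions(preds: np.ndarray, window: int = 5) -> np.ndarray:
    """Temporally smooth per-frame scores with a moving average (assumed smoother).

    preds: shape (num_frames,), raw valence or arousal predictions for one clip.
    window: moving-average window size (value not specified in the paper).
    """
    kernel = np.ones(window) / window
    # mode="same" keeps the output length equal to the number of frames.
    return np.convolve(preds, kernel, mode="same")


def concordance_correlation_coefficient(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """CCC, the ABAW valence-arousal challenge metric:

    CCC = 2 * cov(y_true, y_pred) /
          (var(y_true) + var(y_pred) + (mean(y_true) - mean(y_pred)) ** 2)
    """
    mean_true, mean_pred = y_true.mean(), y_pred.mean()
    var_true, var_pred = y_true.var(), y_pred.var()
    cov = ((y_true - mean_true) * (y_pred - mean_pred)).mean()
    return 2.0 * cov / (var_true + var_pred + (mean_true - mean_pred) ** 2)


if __name__ == "__main__":
    # Toy example: noisy per-frame valence predictions for a single video clip.
    rng = np.random.default_rng(0)
    target = 0.5 * np.sin(np.linspace(0, 4 * np.pi, 200))       # ground truth in [-1, 1]
    raw = target + rng.normal(scale=0.2, size=target.shape)     # raw model outputs
    smoothed = smooth_predictions(raw, window=9)

    print("CCC (raw):     ", round(concordance_correlation_coefficient(target, raw), 4))
    print("CCC (smoothed):", round(concordance_correlation_coefficient(target, smoothed), 4))
```

On this toy sequence the smoothed predictions score a higher CCC than the raw ones, which is the qualitative effect the abstract attributes to its smoothing step; the actual numbers in the paper come from its own model and the Aff-Wild2 validation set, not from this sketch.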