Self-supervised Learning for Speech Emotion Recognition Task Using Audio-visual Features and Distil Hubert Model on BAVED and RAVDESS Databases

Existing pre-trained models like Distil HuBERT excel at uncovering hidden patterns and facilitating accurate recognition across diverse data types, such as audio and visual information. We harnessed this capability to develop a deep learning model that utilizes Distil HuBERT for jointly learning the...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Journal of systems science and systems engineering 2024-10, Vol.33 (5), p.576-606
Hauptverfasser: Dabbabi, Karim, Mars, Abdelkarim
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!