Deteksi Emosi Wicara pada Media On-Demand menggunakan SVM dan LSTM (Speech Emotion Detection in On-Demand Media using SVM and LSTM)

Bibliographic Details
Published in: Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) (Online), 2020-10, Vol. 4 (5), p. 799-804
Authors: Ainurrochman; Adi, Derry Pramono; Gumelar, Agustinus Bimo
Format: Article
Language: English
Abstract: To date, many speech datasets with emotion classes exist, but they rely on impromptu or deliberately acted speech, where native speakers are given a stimulus for each emotional expression. Because natural conversation from secretly recorded daily communication still raises ethical issues, using voice data sampled from movies and podcasts is the most appropriate way to obtain realistic speech. Professional actors are trained, through the Stanislavski acting method, to induce emotions that are as close to natural as possible. The speech dataset that meets this qualification is the Human voice Natural Language from On-demand media (HENLO). HENLO contains per-emotion audio clips of films and podcasts originating from on-demand media, video entertainment platforms that allow playback and download at any time. In this paper, we describe the use of sound clips from HENLO and conduct learning using a Support Vector Machine (SVM) and Long Short-Term Memory (LSTM). Between these two methods, the best strategy we found is to train the LSTM first and then feed the model's output to the SVM, with an 80:20 data split. The results of the five training phases show that the final accuracy increased by more than 17% compared to the first training. These results indicate that the two methods complement each other and that both are important for improving classification accuracy.
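
The LSTM-then-SVM strategy described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the synthetic features, layer sizes, epoch count, and the use of Keras and scikit-learn are assumptions, since the paper's actual feature extraction from HENLO clips is not given here. Only the 80:20 split and the order (train the LSTM first, then fit the SVM on its representation) come from the abstract; reusing the trained LSTM's hidden state as a fixed-length embedding is one plausible reading of "feeding the model to SVM".

```python
# Sketch of an LSTM-first, then-SVM pipeline with an 80:20 data split.
# Synthetic data and all hyperparameters below are illustrative assumptions.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
from tensorflow import keras

rng = np.random.default_rng(0)
n_clips, n_frames, n_feats, n_classes = 200, 100, 40, 4  # hypothetical sizes
X = rng.normal(size=(n_clips, n_frames, n_feats)).astype("float32")
y = rng.integers(0, n_classes, size=n_clips)

# 80:20 split, as in the paper's data split strategy.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# Step 1: train an LSTM classifier on framed audio features.
inputs = keras.Input(shape=(n_frames, n_feats))
hidden = keras.layers.LSTM(64)(inputs)  # sequence summarised into a 64-d vector
outputs = keras.layers.Dense(n_classes, activation="softmax")(hidden)
lstm = keras.Model(inputs, outputs)
lstm.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
lstm.fit(X_tr, y_tr, epochs=5, batch_size=32, verbose=0)

# Step 2: reuse the trained LSTM as a feature extractor and fit an SVM
# on its hidden representation ("feeding the model to SVM").
encoder = keras.Model(inputs, hidden)
svm = SVC(kernel="rbf")
svm.fit(encoder.predict(X_tr, verbose=0), y_tr)

pred = svm.predict(encoder.predict(X_te, verbose=0))
print("SVM-on-LSTM-features accuracy:", accuracy_score(y_te, pred))
```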
ISSN: 2580-0760
DOI: 10.29207/resti.v4i5.2073