Automatic hate speech detection in audio using machine learning algorithms

Bibliographic details
Published in: International journal of speech technology, 2024, Vol. 27 (2), p. 447-469
Main authors: Imbwaga, Joan L.; Chittaragi, Nagaratna B.; Koolagudi, Shashidhar G.
Format: Article
Language: English
Description
Summary: Although every individual is entitled to freedom of speech, limits apply when that freedom is used to target and harm another individual or a group of people, at which point it becomes hate speech. This study addresses the detection of hate speech in English and Kiswahili audio. The dataset was collected manually from YouTube videos, which were then converted to audio. Audio-based features, namely spectral, temporal, prosodic, and excitation source features, were extracted and used to train various machine learning classifiers. Experiments were conducted first for English and later for Kiswahili, a language on which the literature shows comparatively little research activity. The accuracy, recall, precision, AUC, and F1 scores obtained in detecting hate speech suggest that the Random Forest classifier performed better for English, while the Extreme Gradient Boosting classifier performed better for Kiswahili.
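
The summary compares classifiers on accuracy, recall, precision, AUC, and F1 score. As a minimal sketch of how such scores are commonly computed for a binary hate / non-hate task, the following uses only the Python standard library. The function names, the 0.5 decision threshold, and the toy labels and scores are illustrative assumptions; they are not taken from the article and do not reproduce its data or results.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels, where 1 = hate speech."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the chance that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def evaluate(y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5) -> Dict[str, float]:
    """Compute the five scores named in the summary from classifier scores."""
    # Assumption: a score at or above the threshold is predicted as hate speech.
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "auc": roc_auc(y_true, scores),
    }


# Toy example: made-up labels and classifier scores, not data from the study.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]
metrics = evaluate(y_true, scores)
print(metrics)
```

On this toy input, accuracy, precision, recall, and F1 each come out to 0.75 and AUC to 0.9375. AUC is computed from the raw scores rather than the thresholded predictions, which is why it can differ from the other four.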
ISSN: 1381-2416, 1572-8110
DOI: 10.1007/s10772-024-10116-6