Detecting Hateful and Offensive Speech in Arabic Social Media Using Transfer Learning

Bibliographic Details
Published in:	Applied sciences 2022-12, Vol.12 (24), p.12823
Main Authors:	Boulouard, Zakaria; Ouaissa, Mariya; Ouaissa, Mariyam; Krichen, Moez; Almutiq, Mutiq; Gasmi, Karim
Format:	Article
Language:	English
Subjects:	Accuracy; Algorithms; Arabic language; Artificial intelligence; Attitudes; Computer generated language analysis; COVID-19; Datasets; Deep learning; Dialects; Digital media; English language; French language; Hate speech; hate speech detection; Internet; Internet access; Machine learning; Multilingualism; natural language processing; Racism; Religion; Sentiment analysis; Social discrimination learning; Social media; social media analytics; Social networks; Spanish language; Standard dialects; Support vector machines; text mining; Transfer learning; Translations; Websites
Online Access:	Full text

Description
Abstract:	The democratization of access to the internet and social media has given every individual an opportunity to openly express his or her ideas and feelings. Unfortunately, this has also created room for extremist, racist, misogynist, and offensive opinions expressed as articles, posts, or comments. While controlling offensive speech in English-, Spanish-, and French-speaking social media communities and websites has reached a mature level, this is much less the case for their counterparts in Arabic-speaking countries. This paper presents a transfer learning solution to detect hateful and offensive speech on Arabic websites and social media platforms. It compares the performance of different BERT-based models trained to classify comments as either abusive or neutral. The training dataset contains comments in standard Arabic as well as four dialects. Their English translations are also used for comparative purposes. The models were evaluated on five metrics: accuracy, precision, recall, F1-score, and the confusion matrix.
ISSN:	2076-3417
DOI:	10.3390/app122412823