Multilingual Cross-domain Perspectives on Online Hate Speech

Bibliographic Details
Published in:	arXiv.org 2018-09
Main Authors:	De Smedt, Tom; Jaki, Sylvia; Kotzé, Eduan; Saoud, Leïla; Gwóźdź, Maja; De Pauw, Guy; Daelemans, Walter
Format:	Article
Language:	English
Subjects:	Annotations; Hate speech; Multilingualism
Online Access:	Full text

Description
Summary:	In this report, we present a study of eight corpora of online hate speech, by demonstrating the NLP techniques that we used to collect and analyze the jihadist, extremist, racist, and sexist content. Analysis of the multilingual corpora shows that the different contexts share certain characteristics in their hateful rhetoric. To expose the main features, we have focused on text classification, text profiling, keyword and collocation extraction, along with manual annotation and qualitative study.
ISSN:	2331-8422