One to rule them all: Towards Joint Indic Language Hate Speech Detection
Main authors: | , , , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | This paper is a contribution to the Hate Speech and Offensive Content
Identification in Indo-European Languages (HASOC) 2021 shared task. Social
media today is a hotbed of toxic and hateful conversations in various
languages. Recent news reports have shown that current models struggle to
automatically identify hate posted in minority languages. Therefore,
efficiently curbing hate speech is a critical challenge and problem of
interest. We present a multilingual architecture using state-of-the-art
transformer language models to jointly learn hate and offensive speech
detection across three languages, namely English, Hindi, and Marathi. On the
provided testing corpora, we achieve Macro F1 scores of 0.7996, 0.7748, and
0.8651 for sub-task 1A and 0.6268 and 0.5603 for the fine-grained
classification of sub-task 1B. These results show the efficacy of exploiting
a multilingual training scheme. |
DOI: | 10.48550/arxiv.2109.13711 |
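
The summary reports results as Macro F1, which averages the per-class F1 scores without weighting by class frequency, so a rare class counts as much as a common one. The following is a minimal sketch of that metric in plain Python, not the authors' evaluation code; the function name `macro_f1` and the example labels `HOF` and `NOT` are illustrative assumptions and do not come from this record.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (illustrative helper)."""
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        return 0.0

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical binary labels for illustration only.
gold = ["HOF", "HOF", "NOT", "NOT"]
pred = ["HOF", "NOT", "NOT", "NOT"]
score = macro_f1(gold, pred)
```

Because each class contributes equally, a model that ignores a minority class is penalized more heavily than it would be under plain accuracy.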