Optimize_Prime@DravidianLangTech-ACL2022: Abusive Comment Detection in Tamil
Main authors: | , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Abstract: | This paper addresses the problem of abusive comment detection in
low-resource Indic languages. Abusive comments are statements that are
offensive to a person or a group of people. These comments target
individuals on the basis of ethnicity, gender, caste, race, sexuality,
etc. Abusive comment detection is a significant problem, especially with the
recent rise in social media users. This paper presents the approach used by our
team, Optimize_Prime, in the ACL 2022 shared task "Abusive Comment Detection
in Tamil." The task involves detecting and classifying YouTube comments in Tamil
and in Tamil-English code-mixed format into multiple categories. We used three
methods to optimize our results: ensemble models, recurrent neural networks,
and transformers. On the Tamil data, MuRIL and XLM-RoBERTa were our best
performing models, with a macro-averaged F1 score of 0.43. For the
code-mixed data, MuRIL and M-BERT provided sublime results, with a
macro-averaged F1 score of 0.45. |
---|---|
DOI: | 10.48550/arxiv.2204.09675 |
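
The abstract reports results as macro-averaged F1 scores. As a rough illustration of that metric only (not the authors' evaluation code), the sketch below computes a per-class F1 and takes the unweighted mean over classes. The function name `macro_f1` and the example labels are hypothetical and are not taken from the shared-task data.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels for illustration only.
y_true = ["none", "none", "misogyny", "xenophobia"]
y_pred = ["none", "misogyny", "misogyny", "none"]
print(macro_f1(y_true, y_pred))
```

Because every class contributes equally to the mean, a rare class that is never predicted correctly lowers the score as much as a frequent one, which is why macro averaging is commonly used for imbalanced multi-class tasks.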