Mapping Violence: Developing an Extensive Framework to Build a Bangla Sectarian Expression Dataset from Social Media Interactions
Main authors: | |
---|---|
Format: | Article |
Language: | English |
Abstract: | Communal violence in online forums has become extremely prevalent in South
Asia, where many communities of different cultures coexist and share resources.
These societies exhibit a phenomenon characterized by strong bonds within their
own groups and animosity towards others, leading to conflicts that frequently
escalate into violent confrontations. To address this issue, we have developed
the first comprehensive framework for the automatic detection of communal
violence markers in online Bangla content, accompanied by the largest collection
(13K raw sentences) of social media interactions that fall under the definitions
of four major violence classes and their 16 coarse expressions. Our workflow
introduces a 7-step expert annotation process incorporating insights from
social scientists, linguists, and psychologists. By presenting data statistics
and benchmarking performance using this dataset, we have determined that, aside
from the category of Non-communal violence, Religio-communal violence is
particularly pervasive in Bangla text. Moreover, we have substantiated the
effectiveness of fine-tuning language models in identifying violent comments by
conducting preliminary benchmarking on the state-of-the-art Bangla deep
learning model. |
---|---|
DOI: | 10.48550/arxiv.2404.11752 |