Exploring Boundaries and Intensities in Offensive and Hate Speech: Unveiling the Complex Spectrum of Social Media Discourse
The prevalence of digital media and evolving sociopolitical dynamics have significantly amplified the dissemination of hateful content. Existing studies mainly focus on classifying texts into binary categories, often overlooking the continuous spectrum of offensiveness and hatefulness inherent in th...
Main authors: | Ayele, Abinew Ali; Jalew, Esubalew Alemneh; Ali, Adem Chanie; Yimam, Seid Muhie; Biemann, Chris |
---|---|
Format: | Article |
Language: | eng |
Keywords: | |
Online access: | Order full text |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Ayele, Abinew Ali ; Jalew, Esubalew Alemneh ; Ali, Adem Chanie ; Yimam, Seid Muhie ; Biemann, Chris |
description | The prevalence of digital media and evolving sociopolitical dynamics have
significantly amplified the dissemination of hateful content. Existing studies
mainly focus on classifying texts into binary categories, often overlooking the
continuous spectrum of offensiveness and hatefulness inherent in the text. In
this research, we present an extensive benchmark dataset for Amharic,
comprising 8,258 tweets annotated for three distinct tasks: category
classification, identification of hate targets, and rating offensiveness and
hatefulness intensities. Our study highlights that a considerable majority of
tweets fall into the lower offensiveness and hate intensity levels,
underscoring the need for early interventions by stakeholders. The prevalence
of ethnic and political hatred targets, with significant overlaps in our
dataset, emphasizes the complex relationships within Ethiopia's sociopolitical
landscape. We build classification and regression models and investigate the
efficacy of models in handling these tasks. Our results reveal that hate and
offensive speech cannot be addressed by a simplistic binary classification;
instead, they manifest as variables across a continuous range of values. The
Afro-XLMR-large model performs best, achieving F1-scores of
75.30%, 70.59%, and 29.42% for the category, target, and regression tasks,
respectively. The 80.22% correlation coefficient of the Afro-XLMR-large model
indicates strong alignments. |
doi_str_mv | 10.48550/arxiv.2404.12042 |
format | Article |
fullrecord | <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2404_12042</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2404_12042</sourcerecordid><originalsourceid>FETCH-LOGICAL-a672-f5c746b7539b7c7cdf6b687e64384896303831d4c2dc08c2175a56eb701dfd0f3</originalsourceid><addsrcrecordid>eNotkMtOwzAURLNhgQofwAr_QIITP8sOQqGVirpoWUeOfU0tJXbkPBTEz6O0rEZHRxppJkkecpxRyRh-UnF2U1ZQTLO8wLS4TX43c9eE6Pw3eg2jNyo66JHyBu38AL53w8LOo4O1C05wkVs1ADp2APr8jL78BK5ZKoYzoDK0XQPzYvUQxxYFi45BO9WgTzBOoTfX6zDGHu6SG6uaHu7_c5Wc3jencpvuDx-78mWfKi6K1DItKK8FI-taaKGN5TWXAjglkso1J5hIkhuqC6Ox1EUumGIcaoFzYw22ZJU8Xmsv66suulbFn2p5obq8QP4A97BYxQ</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Exploring Boundaries and Intensities in Offensive and Hate Speech: Unveiling the Complex Spectrum of Social Media Discourse</title><source>arXiv.org</source><creator>Ayele, Abinew Ali ; Jalew, Esubalew Alemneh ; Ali, Adem Chanie ; Yimam, Seid Muhie ; Biemann, Chris</creator><creatorcontrib>Ayele, Abinew Ali ; Jalew, Esubalew Alemneh ; Ali, Adem Chanie ; Yimam, Seid Muhie ; Biemann, Chris</creatorcontrib><description>The prevalence of digital media and evolving sociopolitical dynamics have
significantly amplified the dissemination of hateful content. Existing studies
mainly focus on classifying texts into binary categories, often overlooking the
continuous spectrum of offensiveness and hatefulness inherent in the text. In
this research, we present an extensive benchmark dataset for Amharic,
comprising 8,258 tweets annotated for three distinct tasks: category
classification, identification of hate targets, and rating offensiveness and
hatefulness intensities. Our study highlights that a considerable majority of
tweets belong to the less offensive and less hate intensity levels,
underscoring the need for early interventions by stakeholders. The prevalence
of ethnic and political hatred targets, with significant overlaps in our
dataset, emphasizes the complex relationships within Ethiopia's sociopolitical
landscape. We build classification and regression models and investigate the
efficacy of models in handling these tasks. Our results reveal that hate and
offensive speech can not be addressed by a simplistic binary classification,
instead manifesting as variables across a continuous range of values. The
Afro-XLMR-large model exhibits the best performances achieving F1-scores of
75.30%, 70.59%, and 29.42% for the category, target, and regression tasks,
respectively. The 80.22% correlation coefficient of the Afro-XLMR-large model
indicates strong alignments.</description><identifier>DOI: 10.48550/arxiv.2404.12042</identifier><language>eng</language><subject>Computer Science - Computation and Language</subject><creationdate>2024-04</creationdate><rights>http://creativecommons.org/licenses/by/4.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,780,885</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2404.12042$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2404.12042$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Ayele, Abinew Ali</creatorcontrib><creatorcontrib>Jalew, Esubalew Alemneh</creatorcontrib><creatorcontrib>Ali, Adem Chanie</creatorcontrib><creatorcontrib>Yimam, Seid Muhie</creatorcontrib><creatorcontrib>Biemann, Chris</creatorcontrib><title>Exploring Boundaries and Intensities in Offensive and Hate Speech: Unveiling the Complex Spectrum of Social Media Discourse</title><description>The prevalence of digital media and evolving sociopolitical dynamics have
significantly amplified the dissemination of hateful content. Existing studies
mainly focus on classifying texts into binary categories, often overlooking the
continuous spectrum of offensiveness and hatefulness inherent in the text. In
this research, we present an extensive benchmark dataset for Amharic,
comprising 8,258 tweets annotated for three distinct tasks: category
classification, identification of hate targets, and rating offensiveness and
hatefulness intensities. Our study highlights that a considerable majority of
tweets belong to the less offensive and less hate intensity levels,
underscoring the need for early interventions by stakeholders. The prevalence
of ethnic and political hatred targets, with significant overlaps in our
dataset, emphasizes the complex relationships within Ethiopia's sociopolitical
landscape. We build classification and regression models and investigate the
efficacy of models in handling these tasks. Our results reveal that hate and
offensive speech can not be addressed by a simplistic binary classification,
instead manifesting as variables across a continuous range of values. The
Afro-XLMR-large model exhibits the best performances achieving F1-scores of
75.30%, 70.59%, and 29.42% for the category, target, and regression tasks,
respectively. The 80.22% correlation coefficient of the Afro-XLMR-large model
indicates strong alignments.</description><subject>Computer Science - Computation and Language</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotkMtOwzAURLNhgQofwAr_QIITP8sOQqGVirpoWUeOfU0tJXbkPBTEz6O0rEZHRxppJkkecpxRyRh-UnF2U1ZQTLO8wLS4TX43c9eE6Pw3eg2jNyo66JHyBu38AL53w8LOo4O1C05wkVs1ADp2APr8jL78BK5ZKoYzoDK0XQPzYvUQxxYFi45BO9WgTzBOoTfX6zDGHu6SG6uaHu7_c5Wc3jencpvuDx-78mWfKi6K1DItKK8FI-taaKGN5TWXAjglkso1J5hIkhuqC6Ox1EUumGIcaoFzYw22ZJU8Xmsv66suulbFn2p5obq8QP4A97BYxQ</recordid><startdate>20240418</startdate><enddate>20240418</enddate><creator>Ayele, Abinew Ali</creator><creator>Jalew, Esubalew Alemneh</creator><creator>Ali, Adem Chanie</creator><creator>Yimam, Seid Muhie</creator><creator>Biemann, Chris</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20240418</creationdate><title>Exploring Boundaries and Intensities in Offensive and Hate Speech: Unveiling the Complex Spectrum of Social Media Discourse</title><author>Ayele, Abinew Ali ; Jalew, Esubalew Alemneh ; Ali, Adem Chanie ; Yimam, Seid Muhie ; Biemann, Chris</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a672-f5c746b7539b7c7cdf6b687e64384896303831d4c2dc08c2175a56eb701dfd0f3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Computer Science - Computation and Language</topic><toplevel>online_resources</toplevel><creatorcontrib>Ayele, Abinew Ali</creatorcontrib><creatorcontrib>Jalew, Esubalew Alemneh</creatorcontrib><creatorcontrib>Ali, Adem Chanie</creatorcontrib><creatorcontrib>Yimam, Seid Muhie</creatorcontrib><creatorcontrib>Biemann, Chris</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search 
Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Ayele, Abinew Ali</au><au>Jalew, Esubalew Alemneh</au><au>Ali, Adem Chanie</au><au>Yimam, Seid Muhie</au><au>Biemann, Chris</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Exploring Boundaries and Intensities in Offensive and Hate Speech: Unveiling the Complex Spectrum of Social Media Discourse</atitle><date>2024-04-18</date><risdate>2024</risdate><abstract>The prevalence of digital media and evolving sociopolitical dynamics have
significantly amplified the dissemination of hateful content. Existing studies
mainly focus on classifying texts into binary categories, often overlooking the
continuous spectrum of offensiveness and hatefulness inherent in the text. In
this research, we present an extensive benchmark dataset for Amharic,
comprising 8,258 tweets annotated for three distinct tasks: category
classification, identification of hate targets, and rating offensiveness and
hatefulness intensities. Our study highlights that a considerable majority of
tweets belong to the less offensive and less hate intensity levels,
underscoring the need for early interventions by stakeholders. The prevalence
of ethnic and political hatred targets, with significant overlaps in our
dataset, emphasizes the complex relationships within Ethiopia's sociopolitical
landscape. We build classification and regression models and investigate the
efficacy of models in handling these tasks. Our results reveal that hate and
offensive speech can not be addressed by a simplistic binary classification,
instead manifesting as variables across a continuous range of values. The
Afro-XLMR-large model exhibits the best performances achieving F1-scores of
75.30%, 70.59%, and 29.42% for the category, target, and regression tasks,
respectively. The 80.22% correlation coefficient of the Afro-XLMR-large model
indicates strong alignments.</abstract><doi>10.48550/arxiv.2404.12042</doi><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2404.12042 |
ispartof | |
issn | |
language | eng |
recordid | cdi_arxiv_primary_2404_12042 |
source | arXiv.org |
subjects | Computer Science - Computation and Language |
title | Exploring Boundaries and Intensities in Offensive and Hate Speech: Unveiling the Complex Spectrum of Social Media Discourse |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-26T12%3A54%3A29IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Exploring%20Boundaries%20and%20Intensities%20in%20Offensive%20and%20Hate%20Speech:%20Unveiling%20the%20Complex%20Spectrum%20of%20Social%20Media%20Discourse&rft.au=Ayele,%20Abinew%20Ali&rft.date=2024-04-18&rft_id=info:doi/10.48550/arxiv.2404.12042&rft_dat=%3Carxiv_GOX%3E2404_12042%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |
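
The `url` field above is an OpenURL (its `ctx_ver` is `Z39.88-2004`) whose query string carries the citation as key/value pairs such as `rft.au`, `rft.date`, and `rft_id`. The following is a minimal sketch of how those pairs might be read back out with Python's standard library. The function name `referent_fields` and the shortened `OPENURL` constant are illustrative assumptions, not part of the record or of any resolver API, and only a few of the original parameters are kept.

```python
from urllib.parse import urlsplit, parse_qs

# Shortened copy of the resolver URL from the `url` field; most of the
# original query parameters are omitted here for readability.
OPENURL = (
    "https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004"
    "&rft.genre=article"
    "&rft.au=Ayele,%20Abinew%20Ali"
    "&rft.date=2024-04-18"
    "&rft_id=info:doi/10.48550/arxiv.2404.12042"
    "&rft_id=info:oai/&rft_id=info:pmid/"
)


def referent_fields(url):
    """Collect the rft.* keys and the DOI (if present) from an OpenURL.

    Hypothetical helper for illustration only.
    """
    # parse_qs percent-decodes values and groups repeated keys into lists.
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)

    # Keep the first value of every "rft.<name>" key, dropping the prefix.
    fields = {
        key[len("rft."):]: values[0]
        for key, values in query.items()
        if key.startswith("rft.")
    }

    # rft_id is repeated; pick out the entry that carries a DOI.
    for identifier in query.get("rft_id", []):
        if identifier.startswith("info:doi/"):
            fields["doi"] = identifier[len("info:doi/"):]
    return fields


fields = referent_fields(OPENURL)
print(fields)
```

Because `rft_id` appears several times in the URL, the sketch reads it as a list and does not assume a single value. The empty `info:oai/` and `info:pmid/` identifiers are simply skipped.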