Exploring Boundaries and Intensities in Offensive and Hate Speech: Unveiling the Complex Spectrum of Social Media Discourse

The prevalence of digital media and evolving sociopolitical dynamics have significantly amplified the dissemination of hateful content. Existing studies mainly focus on classifying texts into binary categories, often overlooking the continuous spectrum of offensiveness and hatefulness inherent in th...

Bibliographic details
Published in: arXiv.org 2024-04
Main authors: Abinew Ali Ayele; Esubalew Alemneh Jalew; Adem Chanie Ali; Yimam, Seid Muhie; Biemann, Chris
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page
container_issue
container_start_page
container_title arXiv.org
container_volume
creator Abinew Ali Ayele
Esubalew Alemneh Jalew
Adem Chanie Ali
Yimam, Seid Muhie
Biemann, Chris
description The prevalence of digital media and evolving sociopolitical dynamics have significantly amplified the dissemination of hateful content. Existing studies mainly focus on classifying texts into binary categories, often overlooking the continuous spectrum of offensiveness and hatefulness inherent in the text. In this research, we present an extensive benchmark dataset for Amharic, comprising 8,258 tweets annotated for three distinct tasks: category classification, identification of hate targets, and rating of offensiveness and hatefulness intensities. Our study highlights that a considerable majority of tweets belong to the less offensive and less hateful intensity levels, underscoring the need for early interventions by stakeholders. The prevalence of ethnic and political hatred targets, with significant overlaps in our dataset, emphasizes the complex relationships within Ethiopia's sociopolitical landscape. We build classification and regression models and investigate their efficacy in handling these tasks. Our results reveal that hate and offensive speech cannot be addressed by a simplistic binary classification; instead, they manifest as variables across a continuous range of values. The Afro-XLMR-large model exhibits the best performance, achieving F1-scores of 75.30%, 70.59%, and 29.42% for the category, target, and regression tasks, respectively. The 80.22% correlation coefficient of the Afro-XLMR-large model indicates strong alignments.
format Article
fullrecord <record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_3041589696</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3041589696</sourcerecordid><originalsourceid>FETCH-proquest_journals_30415896963</originalsourceid><addsrcrecordid>eNqNjM0KgkAURocgKMp3uNA60Jk0bdkftogW1joGvebENGMzowi9fBk9QKuPwzl8AzKmjAXzeEHpiHjW3n3fp9GShiEbk9euq6U2Qt1grRtVcCPQAlcFHJRDZYXrWSg4lWWPLX5lyh1CViPm1QouqkUh-wtXIWz0o5bY9TZ3pnmALiHTueASjlgIDlthc90Yi1MyLLm06P12Qmb73XmTzmujnw1ad71_OvVRV-YvgjBOoiRi_1VvRhpOjQ</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3041589696</pqid></control><display><type>article</type><title>Exploring Boundaries and Intensities in Offensive and Hate Speech: Unveiling the Complex Spectrum of Social Media Discourse</title><source>Free E- Journals</source><creator>Abinew Ali Ayele ; Esubalew Alemneh Jalew ; Adem Chanie Ali ; Yimam, Seid Muhie ; Biemann, Chris</creator><creatorcontrib>Abinew Ali Ayele ; Esubalew Alemneh Jalew ; Adem Chanie Ali ; Yimam, Seid Muhie ; Biemann, Chris</creatorcontrib><description>The prevalence of digital media and evolving sociopolitical dynamics have significantly amplified the dissemination of hateful content. Existing studies mainly focus on classifying texts into binary categories, often overlooking the continuous spectrum of offensiveness and hatefulness inherent in the text. In this research, we present an extensive benchmark dataset for Amharic, comprising 8,258 tweets annotated for three distinct tasks: category classification, identification of hate targets, and rating offensiveness and hatefulness intensities. Our study highlights that a considerable majority of tweets belong to the less offensive and less hate intensity levels, underscoring the need for early interventions by stakeholders. 
The prevalence of ethnic and political hatred targets, with significant overlaps in our dataset, emphasizes the complex relationships within Ethiopia's sociopolitical landscape. We build classification and regression models and investigate the efficacy of models in handling these tasks. Our results reveal that hate and offensive speech can not be addressed by a simplistic binary classification, instead manifesting as variables across a continuous range of values. The Afro-XLMR-large model exhibits the best performances achieving F1-scores of 75.30%, 70.59%, and 29.42% for the category, target, and regression tasks, respectively. The 80.22% correlation coefficient of the Afro-XLMR-large model indicates strong alignments.</description><identifier>EISSN: 2331-8422</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Classification ; Continuity (mathematics) ; Correlation coefficients ; Datasets ; Digital media ; Hate speech ; Regression models ; Sociopolitical factors</subject><ispartof>arXiv.org, 2024-04</ispartof><rights>2024. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>776,780</link.rule.ids></links><search><creatorcontrib>Abinew Ali Ayele</creatorcontrib><creatorcontrib>Esubalew Alemneh Jalew</creatorcontrib><creatorcontrib>Adem Chanie Ali</creatorcontrib><creatorcontrib>Yimam, Seid Muhie</creatorcontrib><creatorcontrib>Biemann, Chris</creatorcontrib><title>Exploring Boundaries and Intensities in Offensive and Hate Speech: Unveiling the Complex Spectrum of Social Media Discourse</title><title>arXiv.org</title><description>The prevalence of digital media and evolving sociopolitical dynamics have significantly amplified the dissemination of hateful content. Existing studies mainly focus on classifying texts into binary categories, often overlooking the continuous spectrum of offensiveness and hatefulness inherent in the text. In this research, we present an extensive benchmark dataset for Amharic, comprising 8,258 tweets annotated for three distinct tasks: category classification, identification of hate targets, and rating offensiveness and hatefulness intensities. Our study highlights that a considerable majority of tweets belong to the less offensive and less hate intensity levels, underscoring the need for early interventions by stakeholders. The prevalence of ethnic and political hatred targets, with significant overlaps in our dataset, emphasizes the complex relationships within Ethiopia's sociopolitical landscape. We build classification and regression models and investigate the efficacy of models in handling these tasks. 
Our results reveal that hate and offensive speech can not be addressed by a simplistic binary classification, instead manifesting as variables across a continuous range of values. The Afro-XLMR-large model exhibits the best performances achieving F1-scores of 75.30%, 70.59%, and 29.42% for the category, target, and regression tasks, respectively. The 80.22% correlation coefficient of the Afro-XLMR-large model indicates strong alignments.</description><subject>Classification</subject><subject>Continuity (mathematics)</subject><subject>Correlation coefficients</subject><subject>Datasets</subject><subject>Digital media</subject><subject>Hate speech</subject><subject>Regression models</subject><subject>Sociopolitical factors</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>BENPR</sourceid><recordid>eNqNjM0KgkAURocgKMp3uNA60Jk0bdkftogW1joGvebENGMzowi9fBk9QKuPwzl8AzKmjAXzeEHpiHjW3n3fp9GShiEbk9euq6U2Qt1grRtVcCPQAlcFHJRDZYXrWSg4lWWPLX5lyh1CViPm1QouqkUh-wtXIWz0o5bY9TZ3pnmALiHTueASjlgIDlthc90Yi1MyLLm06P12Qmb73XmTzmujnw1ad71_OvVRV-YvgjBOoiRi_1VvRhpOjQ</recordid><startdate>20240418</startdate><enddate>20240418</enddate><creator>Abinew Ali Ayele</creator><creator>Esubalew Alemneh Jalew</creator><creator>Adem Chanie Ali</creator><creator>Yimam, Seid Muhie</creator><creator>Biemann, Chris</creator><general>Cornell University Library, arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope></search><sort><creationdate>20240418</creationdate><title>Exploring Boundaries and Intensities in Offensive and Hate Speech: Unveiling the Complex Spectrum 
of Social Media Discourse</title><author>Abinew Ali Ayele ; Esubalew Alemneh Jalew ; Adem Chanie Ali ; Yimam, Seid Muhie ; Biemann, Chris</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-proquest_journals_30415896963</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Classification</topic><topic>Continuity (mathematics)</topic><topic>Correlation coefficients</topic><topic>Datasets</topic><topic>Digital media</topic><topic>Hate speech</topic><topic>Regression models</topic><topic>Sociopolitical factors</topic><toplevel>online_resources</toplevel><creatorcontrib>Abinew Ali Ayele</creatorcontrib><creatorcontrib>Esubalew Alemneh Jalew</creatorcontrib><creatorcontrib>Adem Chanie Ali</creatorcontrib><creatorcontrib>Yimam, Seid Muhie</creatorcontrib><creatorcontrib>Biemann, Chris</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science &amp; Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection></facets><delivery><delcategory>Remote Search 
Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Abinew Ali Ayele</au><au>Esubalew Alemneh Jalew</au><au>Adem Chanie Ali</au><au>Yimam, Seid Muhie</au><au>Biemann, Chris</au><format>book</format><genre>document</genre><ristype>GEN</ristype><atitle>Exploring Boundaries and Intensities in Offensive and Hate Speech: Unveiling the Complex Spectrum of Social Media Discourse</atitle><jtitle>arXiv.org</jtitle><date>2024-04-18</date><risdate>2024</risdate><eissn>2331-8422</eissn><abstract>The prevalence of digital media and evolving sociopolitical dynamics have significantly amplified the dissemination of hateful content. Existing studies mainly focus on classifying texts into binary categories, often overlooking the continuous spectrum of offensiveness and hatefulness inherent in the text. In this research, we present an extensive benchmark dataset for Amharic, comprising 8,258 tweets annotated for three distinct tasks: category classification, identification of hate targets, and rating offensiveness and hatefulness intensities. Our study highlights that a considerable majority of tweets belong to the less offensive and less hate intensity levels, underscoring the need for early interventions by stakeholders. The prevalence of ethnic and political hatred targets, with significant overlaps in our dataset, emphasizes the complex relationships within Ethiopia's sociopolitical landscape. We build classification and regression models and investigate the efficacy of models in handling these tasks. Our results reveal that hate and offensive speech can not be addressed by a simplistic binary classification, instead manifesting as variables across a continuous range of values. The Afro-XLMR-large model exhibits the best performances achieving F1-scores of 75.30%, 70.59%, and 29.42% for the category, target, and regression tasks, respectively. 
The 80.22% correlation coefficient of the Afro-XLMR-large model indicates strong alignments.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier EISSN: 2331-8422
ispartof arXiv.org, 2024-04
issn 2331-8422
language eng
recordid cdi_proquest_journals_3041589696
source Free E- Journals
subjects Classification
Continuity (mathematics)
Correlation coefficients
Datasets
Digital media
Hate speech
Regression models
Sociopolitical factors
title Exploring Boundaries and Intensities in Offensive and Hate Speech: Unveiling the Complex Spectrum of Social Media Discourse
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-27T22%3A47%3A53IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Exploring%20Boundaries%20and%20Intensities%20in%20Offensive%20and%20Hate%20Speech:%20Unveiling%20the%20Complex%20Spectrum%20of%20Social%20Media%20Discourse&rft.jtitle=arXiv.org&rft.au=Abinew%20Ali%20Ayele&rft.date=2024-04-18&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E3041589696%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3041589696&rft_id=info:pmid/&rfr_iscdi=true