Detection of cyberhate speech towards female sport in the Arabic Xsphere

Bibliographic Details
Published in: PeerJ. Computer science 2024-06, Vol.10, p.e2138, Article e2138
Main authors: Alhayan, Fatimah; Almobarak, Monerah; Shalabi, Hawazen; Alshubaili, Luluwah; Albatati, Renad; Alqahtani, Wafa; Alhaidari, Nofe
Format: Article
Language: English
Description
Abstract: The recent rapid growth in the number of Saudi female athletes and sports enthusiasts' presence on social media has exposed them to gender-hate speech and discrimination. Hate speech, a harmful worldwide phenomenon, can have severe consequences. Its prevalence in sports has surged alongside the growing influence of social media, with X serving as a prominent platform for the expression of hate speech and discriminatory comments, often targeting women in sports. This research combines two studies that explore online hate speech and gender biases in the context of sports, proposing an automated solution for detecting hate speech targeting women in sports on platforms like X, with a particular focus on Arabic, a challenging domain with limited prior research. In Study 1, semi-structured interviews with 33 Saudi female athletes and sports fans revealed common forms of hate speech, including gender-based derogatory comments, misogyny, and appearance-related discrimination. Building upon the foundations laid by Study 1, Study 2 addresses the pressing need for effective interventions to combat hate speech against women in sports on social media by evaluating machine learning (ML) models for identifying hate speech targeting women in sports in Arabic. A dataset of 7,487 Arabic tweets was collected, annotated, and pre-processed. Term frequency-inverse document frequency (TF-IDF) and part-of-speech (POS) feature extraction techniques were used, and various ML algorithms were trained. Random Forest consistently outperformed the other methods, achieving accuracies of 85% and 84% with TF-IDF and POS features, respectively, which demonstrates the effectiveness of both feature sets in identifying Arabic hate speech. The research advances the understanding of online hate targeting Arabic women in sports by identifying various forms of such hate.
The systematic creation of a meticulously annotated Arabic hate speech dataset, specifically focused on women's sports, enhances the dataset's reliability and provides valuable insights for future research in countering hate speech against women in sports. This dataset forms a strong foundation for developing effective strategies to address online hate within the unique context of women's sports. The research findings contribute to the ongoing efforts to combat hate speech against women in sports on social media, aligning with the objectives of Saudi Arabia's Vision 2030 and recognizing the significance of female participatio
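The abstract names TF-IDF as one of the two feature sets but this record gives no formula or implementation details. As a rough illustration only, the sketch below computes TF-IDF weights in plain Python, assuming the textbook variant (term frequency normalized by document length, multiplied by log(N/df)). The function name `tf_idf` and the toy tokens are hypothetical placeholders; they are not the study's code, tokenizer, or data.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed weighting (not confirmed by the source):
    tf = count / document length, idf = log(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical placeholder tokens, not taken from the study's dataset.
docs = [
    ["match", "team", "win"],
    ["match", "player", "goal"],
    ["match", "team", "goal", "goal"],
]
vectors = tf_idf(docs)
```

Under this weighting a term that occurs in every document, such as "match" here, receives a weight of zero, while terms concentrated in few documents are weighted up. Vectors of this kind could then be passed to a classifier such as the Random Forest mentioned in the abstract.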
ISSN:2376-5992
DOI:10.7717/peerj-cs.2138