Contribution to the Moroccan Darija sentiment analysis in social networks

Bibliographic details
Published in: Social Network Analysis and Mining 2023-10, Vol.13 (1), p.138, Article 138
Main authors: El Ouahabi, Sara; El Ouahabi, Safâa; Dadi, El Wardani
Format: Article
Language: English
Description
Abstract: With the rise of social media, there has been growing interest in developing automatic sentiment analysis and opinion mining tools for natural language processing (NLP). Most current research, however, focuses on Indo-European languages, particularly English, and the large community of people who use dialects is not adequately served by existing tools. To our knowledge, there is currently no publicly available sentiment analysis dataset for the Moroccan dialect (MAD) that covers all social networks. In this work, we address this gap for the Moroccan Arabic dialect (Darija) by creating a large, high-quality dataset of Moroccan dialectal text extracted from different social media sources (Facebook, Twitter, YouTube, Instagram and websites) and covering a wide range of domains, including sports, arts, politics, education and society. The dataset is characterized by its size, quality, and variety. The work also involves experimenting with different machine learning algorithms and feature extraction models, and testing a transformer-based model (BERT).
ISSN: 1869-5469, 1869-5450
DOI: 10.1007/s13278-023-01129-1