Multi-label Arabic text classification in Online Social Networks

Bibliographic details
Published in:	Information systems (Oxford) 2021-09, Vol.100, p.101785, Article 101785
Main authors:	Omar, Ahmed; Mahmoud, Tarek M.; Abd-El-Hafeez, Tarek; Mahfouz, Ahmed
Format:	Article
Language:	English
Subjects:	Annotations; Arabic natural language processing; Arabic sentiment analysis; Arabic text classification; Classification; Communication; Computer Science; Computer Science, Information Systems; Data mining; English language; Evaluation; Information systems; Pornography; Science & Technology; Sentiment analysis; Social networks; Technology; Text categorization
Online access:	Full text

Description
Summary:	Online Social Networks (OSNs) are the most popular interactive media for communicating, posting, and sharing indefinite amounts of personal information. However, alongside interesting and attractive topics and content, some users dislike having their personal pages filled with topics outside their interests, and they do not wish to see disappointing negative posts that may appear repeatedly. People also sometimes post inappropriate or abusive content on these networks, such as insults or pornography. Most efforts in the field of text classification have focused on the English language, while research on the Arabic language, which poses numerous challenges, is scarce. In this paper, we constructed a standard multi-label Arabic dataset using manual annotation and a semi-supervised annotation technique; the dataset can be used for short text classification, sentiment analysis, and multi-label classification. We then evaluated topic classification, sentiment analysis, and multi-label classification. Based on that evaluation, we found a relationship between topics published in OSNs and hate speech. The experimental results validate the effectiveness of the proposed technique.
• We construct a standard multi-label Arabic dataset using manual annotation and semi-supervised annotation techniques.
• We train machine-learning models for topic classification, sentiment analysis, and multi-label classification in OSNs.
• We examine the relationship between topics published in OSNs and hate speech.
• We propose a technique to filter social network content.
ISSN:	0306-4379 1873-6076
DOI:	10.1016/j.is.2021.101785