A Lexicon-Based Framework for Mining and Analysis of Arabic Comparative Sentences

People tend to share their opinions on social media daily. This text needs to be accurately mined for different purposes like enhancements in services and/or products. Mining and analyzing Arabic text have been a big challenge due to many complications inherited in Arabic language. Although, many re...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Algorithms 2025-01, Vol.18 (1), p.44
Hauptverfasser: Hamed, Alaa, Keshk, Arabi, Youssef, Anas
Format: Artikel
Sprache:eng
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:People tend to share their opinions on social media daily. This text needs to be accurately mined for different purposes like enhancements in services and/or products. Mining and analyzing Arabic text have been a big challenge due to many complications inherited in Arabic language. Although, many research studies have already investigated the Arabic text sentiment analysis problem, this paper investigates the specific research topic that addresses Arabic comparative opinion mining. This research topic is not widely investigated in many research studies. This paper proposes a lexicon-based framework which includes a set of proposed algorithms for the mining and analysis of Arabic comparative sentences. The proposed framework comprises a set of contributions including an Arabic comparative sentence keywords lexicon and a proposed algorithm for the identification of Arabic comparative sentences, followed by a second proposed algorithm for the classification of identified comparative sentences into different types. The framework also comprises a third proposed algorithm that was developed to extract relations between entities in each of the identified comparative sentence types. Finally, two proposed algorithms were developed for the extraction of the preferred entity in each sentence type. The framework was evaluated using three different Arabic language datasets. The evaluation metrics used to obtain the evaluation results include precision, recall, F-score, and accuracy. The average values of the evaluation metrics for the proposed sentences identification algorithm reached 97%. The average evaluation values of the evaluation metrics for the proposed sentence type identification algorithm reached 96%. Finally, the average results showed 97% relation word extraction precision for the proposed relation extraction algorithm.
ISSN:1999-4893
1999-4893
DOI:10.3390/a18010044