A C4.5 algorithm for English emotional classification

Bibliographic details
Published in: Evolving Systems, 2019-09, Vol. 10 (3), pp. 425-451
Main authors: Ngoc, Phu Vo; Ngoc, Chau Vo Thi; Ngoc, Tran Vo Thi; Duy, Dat Nguyen
Format: Article
Language: English
Description
Summary: Sentiment analysis solutions are important and useful for many researchers and applications. In this paper, we propose a new model for English document-level sentiment classification. The model uses the C4.5 decision tree algorithm to classify the sentiment polarity (positive, negative, neutral) of English documents. Our English training data set has 140,000 sentences: 70,000 positive and 70,000 negative. We apply the C4.5 algorithm to the 70,000 positive sentences to generate a decision tree, from which many association rules of positive polarity are created. Likewise, we apply the C4.5 algorithm to the 70,000 negative sentences to generate a decision tree, from which many association rules of negative polarity are created. The sentiment of an English document is then identified based on the association rules of positive polarity and negative polarity. Our English testing data set has 25,000 documents: 12,500 positive reviews and 12,500 negative reviews. Tested on this data set, the model achieved 60.3% accuracy in sentiment classification.
ISSN: 1868-6478, 1868-6486
DOI: 10.1007/s12530-017-9180-1
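
The summary names C4.5 but does not describe how the decision trees are built. As a rough illustration only, the sketch below computes gain ratio, the split criterion C4.5 is generally known for, on a tiny made-up data set. The feature names (`has_good`, `has_the`), the four toy rows, and the helper functions are assumptions for illustration and are not taken from the paper. The paper's own feature extraction and rule generation are not given in this record, and the sketch scores features on a mixed positive/negative sample rather than reproducing the per-polarity training described in the summary.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_ratio(rows, labels, feature):
    """Gain ratio of splitting on one categorical feature.

    rows   : list of dicts mapping feature name -> value
    labels : list of class labels, aligned with rows
    """
    total = len(rows)
    # Group the class labels by the value each row has for this feature.
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[feature], []).append(label)

    # Information gain: entropy before the split minus weighted entropy after.
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    info_gain = entropy(labels) - remainder

    # Split information penalizes features that fragment the data heavily.
    split_info = -sum((len(p) / total) * math.log2(len(p) / total)
                      for p in partitions.values())
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: word-presence features for four short sentences.
rows = [
    {"has_good": True, "has_the": True},
    {"has_good": True, "has_the": False},
    {"has_good": False, "has_the": True},
    {"has_good": False, "has_the": False},
]
labels = ["positive", "positive", "negative", "negative"]

for name in ("has_good", "has_the"):
    print(name, gain_ratio(rows, labels, name))
```

In this toy sample `has_good` separates the two classes perfectly and `has_the` carries no information, so a tree builder using this criterion would split on `has_good` first.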