Effect of Random Splitting and Cross Validation for Indonesian Opinion Mining using Machine Learning Approach

Bibliographic details
Published in: International journal of advanced computer science & applications, 2022, Vol. 13 (9)
Main authors: Purba, Mariana, Ermatita, Ermatita, Abdiansah, Abdiansah, Noprisson, Handrie, Ayumi, Vina, Setiawan, Hadiguna, Salamah, Umniy, Yadi, Yadi
Format: Article
Language: English
Online access: Full text
Description
Summary: Opinion mining has been a prominent research topic in Indonesia, but many questions remain unanswered. Most past research has focused on machine learning methods and models. A comparison of the effects of random splitting and cross-validation on performance is required. The text data is in Indonesian. The goal of this study is to conduct opinion mining on Indonesian text data with a machine learning model, using both random splitting and cross-validation. The research consists of five stages: data collection, pre-processing, feature extraction, training and testing, and evaluation. In the experiments, TF-IDF features perform better than Count-Vectorizer (CV) features for Indonesian text. The best accuracy, 81%, is obtained with TF-IDF features and a Support Vector Machine (SVM) classifier evaluated with cross-validation. The results also show that cross-validation yields higher accuracy than random splitting.
ISSN: 2158-107X, 2156-5570
DOI: 10.14569/IJACSA.2022.0130917
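
The summary contrasts two evaluation protocols, random splitting and cross-validation. The sketch below is not the authors' code; it only illustrates how the two protocols differ in the way they partition the data. The function names (`random_split`, `k_fold_splits`, `majority_label_accuracy`), the 80/20 split, the choice of 5 folds, and the made-up labels are all assumptions for illustration, and a trivial majority-label predictor stands in for the TF-IDF plus SVM pipeline named in the summary.

```python
import random


def random_split(n_samples, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) for a single random hold-out split."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = max(1, round(n_samples * test_fraction))
    return indices[n_test:], indices[:n_test]


def k_fold_splits(n_samples, k=5, seed=0):
    """Yield k (train_idx, test_idx) pairs; each sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


def majority_label_accuracy(labels, train_idx, test_idx):
    """Toy stand-in for a classifier: predict the most common training label."""
    train_labels = [labels[i] for i in train_idx]
    # sorted() makes tie-breaking deterministic
    majority = max(sorted(set(train_labels)), key=train_labels.count)
    return sum(labels[i] == majority for i in test_idx) / len(test_idx)


# Made-up labels for illustration only; not the paper's data.
labels = ["positive"] * 12 + ["negative"] * 8

# Random splitting: one score from one partition of the data.
train_idx, test_idx = random_split(len(labels))
holdout_acc = majority_label_accuracy(labels, train_idx, test_idx)

# Cross-validation: the mean of k scores, one per held-out fold.
fold_accs = [
    majority_label_accuracy(labels, tr, te)
    for tr, te in k_fold_splits(len(labels))
]
cv_acc = sum(fold_accs) / len(fold_accs)

print(f"hold-out accuracy: {holdout_acc:.2f}")
print(f"5-fold mean accuracy: {cv_acc:.2f}")
```

A single random split reports one score that depends on which samples happen to land in the test set, whereas cross-validation averages over folds that together cover every sample once. The numbers printed here say nothing about the paper's reported results.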