Implementation of n-gram Methodology to Analyze Sentiment Reviews for Indonesian Chips Purchases in Shopee E-Marketplace

Bibliographic details
Published in: Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) (Online), 2023-06, Vol. 7 (3), p. 609-617
Main authors: Purbaya, Muhammad Eka, Rakhmadani, Diovianto Putra, Maliana Puspa Arum, Luthfi Zian Nasifah
Format: Article
Language: English
Description
Abstract: Chips are a well-known product among Small and Medium Enterprises (SMEs). Sentiment analysis of customer reviews is a crucial step toward improving the quality of chips as an SME product. In this research, sentiment analysis of chip purchases on the Shopee E-Marketplace was conducted with Natural Language Processing (NLP) methods, using the n-gram model and Term Frequency-Inverse Document Frequency (TF-IDF) for feature extraction and the Support Vector Machine (SVM) algorithm for sentiment classification. The objective is to identify the most suitable feature extraction model and the best SVM kernel among the linear, polynomial, Gaussian RBF, and sigmoid kernels. The experiments indicate that TF-IDF with unigram features gives the best SVM classification performance when combined with the linear kernel. Labeling the dataset with a lexicon-based approach showed that 84.31% of all reviews were positive. In the unigram model, the words "price", "cheap", and "quality" have the highest weights, all above 0.040. With the unigram model and the linear kernel, accuracy is 88.4%, precision is 87.3%, and recall is 88.4%. The F1-scores were 86.9% for unigrams, 78.5% for bigrams, and 77.4% for trigrams. Overall, the unigram model combined with a linear-kernel SVM shows strong potential for systems that detect user reviews in the Indonesian language on the Shopee E-Marketplace.
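To make the feature extraction step in the abstract concrete, the following is a minimal sketch of n-gram extraction and TF-IDF weighting in plain Python. It is not the authors' code. The function names `ngrams` and `tfidf`, the toy English reviews, and the specific TF-IDF variant (term count divided by document length, times the natural log of N/df, with no smoothing) are all assumptions made for illustration; the abstract does not say which variant or library was used.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return every contiguous run of n tokens, joined by single spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf(docs, n=1):
    """Return one {term: weight} dict per document for n-gram terms.

    Assumed variant: tf = count / terms in the document, idf = ln(N / df).
    Libraries often add smoothing and normalization, so their numbers differ.
    """
    # Whitespace tokenization on lowercased text; real reviews need more cleaning.
    term_lists = [ngrams(doc.lower().split(), n) for doc in docs]

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for terms in term_lists:
        df.update(set(terms))

    n_docs = len(docs)
    weights = []
    for terms in term_lists:
        counts = Counter(terms)
        total = len(terms)
        weights.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return weights


# Hypothetical toy reviews, not taken from the study's dataset.
reviews = [
    "cheap price good quality",
    "price cheap fast delivery",
    "chips arrived crushed",
]

unigram_weights = tfidf(reviews, n=1)  # single words such as "price"
bigram_weights = tfidf(reviews, n=2)   # word pairs such as "cheap price"
```

In a pipeline like the one the abstract describes, each dictionary would be turned into a row of a feature matrix and passed to a classifier such as an SVM. Moving from unigrams to bigrams and trigrams makes the terms rarer and the feature matrix sparser, which is one plausible reason for the lower bigram and trigram F1-scores reported above, although the abstract itself gives no explanation.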
ISSN:2580-0760
DOI:10.29207/resti.v7i3.4726