Indonesian Online News Topics Classification using Word2Vec and K-Nearest Neighbor
Published in: | Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) (Online) 2021-12, Vol.5 (6), p.1083-1089 |
---|---|
Main Author: | |
Format: | Article |
Language: | eng |
Subjects: | |
Online Access: | Full text |
Summary: | News is information disseminated by newspapers, radio, television, the internet, and other media. According to survey results, a large number of news titles on many different topics circulate on the internet, which makes it difficult for readers to find the topic they want to read. This problem can be addressed by grouping the news automatically, that is, by classification. This study classifies Indonesian-language news topics using a K-nearest neighbor (KNN) classifier together with word2vec, a word-embedding model that converts words into vectors to facilitate classification. The study also determines the optimal value of K. The word2vec and KNN model achieves an accuracy of 89.2% at K=7 and outperforms the support vector machine, logistic regression, and random forest classification models. |
---|---|
ISSN: | 2580-0760 |
DOI: | 10.29207/resti.v5i6.3547 |
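
The pipeline outlined in the summary (words converted to vectors, then KNN with a tuned K) can be illustrated with a minimal sketch. It is not the authors' implementation. The two-dimensional `TOY_VECTORS` table is a hypothetical stand-in for a trained word2vec model, and the choices of averaging word vectors per title, cosine similarity, and leave-one-out selection of K are all assumptions, since the record does not state them. The titles, labels, and function names are likewise invented for illustration.

```python
from collections import Counter
import math

# Hypothetical 2-d "embeddings" standing in for trained word2vec vectors.
TOY_VECTORS = {
    "gol": (0.9, 0.1),
    "liga": (0.8, 0.2),
    "pemain": (0.85, 0.15),
    "saham": (0.1, 0.9),
    "rupiah": (0.2, 0.8),
    "bank": (0.15, 0.85),
}

def title_vector(title):
    """Average the vectors of known words (assumption); unknown words are skipped."""
    vecs = [TOY_VECTORS[w] for w in title.lower().split() if w in TOY_VECTORS]
    if not vecs:
        return (0.0, 0.0)
    return tuple(sum(dim) / len(vecs) for dim in zip(*vecs))

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    norm = math.hypot(*a) * math.hypot(*b)
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0

def knn_predict(train, query, k):
    """train is a list of (vector, label); majority vote among the k most similar."""
    ranked = sorted(train, key=lambda item: cosine(item[0], query), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def best_k(data, candidates):
    """Pick the k with the highest leave-one-out accuracy (ties go to the smallest k)."""
    def accuracy(k):
        hits = 0
        for i, (vec, label) in enumerate(data):
            rest = data[:i] + data[i + 1:]  # hold out example i
            hits += knn_predict(rest, vec, k) == label
        return hits / len(data)
    return max(candidates, key=lambda k: (accuracy(k), -k))

# Invented toy titles with invented topic labels.
titles = [
    ("gol pemain liga", "sport"),
    ("liga gol", "sport"),
    ("pemain liga", "sport"),
    ("saham bank", "economy"),
    ("rupiah saham", "economy"),
    ("bank rupiah", "economy"),
]
train = [(title_vector(t), label) for t, label in titles]
k = best_k(train, [1, 3, 5])
print(k, knn_predict(train, title_vector("saham rupiah bank"), k))
```

On a real corpus the toy table would be replaced by vectors from a trained embedding model, and the candidate values of K would be evaluated on held-out data. The value K=7 reported in the summary comes from the authors' own dataset and is not reproduced by this toy example.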