The comparison of K-means clustering result with centroid initialization using agglomerative hierarchical clustering (case study in Eastern Indonesia Region)

Bibliographic Details
Main Authors:	Mamu, Muzdalifa D. Z.; Yahya, Lailany; Payu, Muhammad Rezky Friesta
Format:	Conference Proceeding
Language:	English
Subjects:	Centroids; Cluster analysis; Clustering; Vector quantization
Online Access:	Full text

Description
Summary:	K-Means is a popular cluster analysis method that partitions a data set into several clusters based on distance. However, the clustering results depend on the choice of initial centroids, so the algorithm can get stuck in a local optimum. In this research, the initial centroids are determined using the Agglomerative Hierarchical Clustering (AHC) method. The clustering method is applied to group districts/cities in 13 provinces of the Eastern Indonesia Region based on educational indicators from 2019. The research compares the performance of K-Means clustering with and without AHC-based centroid initialization. The Davies-Bouldin Index is used to evaluate the clustering results. The results show that K-Means with AHC initialization performed better than simple K-Means. The districts/cities are grouped into three clusters based on the Gross Enrollment Rate (GER) and Net Enrollment Rate (NER), and into five clusters based on the Repetition Rate, Drop Out Rate, and Student-Teacher Ratio.
ISSN:	0094-243X 1551-7616
DOI:	10.1063/5.0126087
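
The procedure outlined in the summary (AHC to obtain initial centroids, K-Means refinement, Davies-Bouldin evaluation) can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not state which linkage was used, so centroid linkage is assumed here, and the function names (`ahc_centroids`, `kmeans`, `davies_bouldin`) and the toy data are hypothetical. Only the Python standard library is used.

```python
import math
import random

def mean(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def ahc_centroids(data, k):
    """Merge the two closest clusters until k remain; return cluster means.
    Centroid linkage is an assumption, not taken from the source."""
    clusters = [[p] for p in data]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = math.dist(mean(clusters[i]), mean(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = clusters.pop(j)      # j > i, so index i is unaffected
        clusters[i].extend(merged)
    return [mean(c) for c in clusters]

def kmeans(data, init_centroids, max_iter=100):
    """Lloyd's algorithm started from the given centroids."""
    centroids = list(init_centroids)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in data
        ]
        if new_labels == labels:      # assignments stable: converged
            break
        labels = new_labels
        for c in range(len(centroids)):
            members = [p for p, l in zip(data, labels) if l == c]
            if members:               # keep the old centroid if a cluster empties
                centroids[c] = mean(members)
    return labels, centroids

def davies_bouldin(data, labels, centroids):
    """Davies-Bouldin Index (lower is better). Assumes distinct centroids."""
    k = len(centroids)
    scatter = []
    for c in range(k):
        members = [p for p, l in zip(data, labels) if l == c]
        s = sum(math.dist(p, centroids[c]) for p in members) / len(members) if members else 0.0
        scatter.append(s)
    worst = [
        max((scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in range(k) if j != i)
        for i in range(k)
    ]
    return sum(worst) / k

# Toy data (hypothetical): three well-separated groups in two dimensions.
data = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
        (5.0, 5.0), (5.2, 4.9), (4.8, 5.1),
        (9.0, 1.0), (9.1, 1.2), (8.9, 0.9)]
k = 3

random.seed(0)
labels_rand, cent_rand = kmeans(data, random.sample(data, k))   # simple K-Means
labels_ahc, cent_ahc = kmeans(data, ahc_centroids(data, k))     # AHC-initialized

dbi_rand = davies_bouldin(data, labels_rand, cent_rand)
dbi_ahc = davies_bouldin(data, labels_ahc, cent_ahc)
print(f"DBI random init: {dbi_rand:.3f}, DBI AHC init: {dbi_ahc:.3f}")
```

Because the AHC initialization is deterministic, it removes the run-to-run variation that random seeding introduces; whether it also yields a lower Davies-Bouldin Index depends on the data and the linkage chosen.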