K-nearest neighbor classifier optimization using purity

Bibliographic Details
Main authors:	Dinata, Rozzi Kesuma; Adek, Rizal Tjut; Hasdyna, Novia; Retno, Sujacka
Format:	Conference proceeding
Language:	English
Subjects:	Epithelium; K-nearest neighbors algorithm; Machine learning; Optimization; Purity; Reduction
Online access:	Full text

Description
Summary:	K-Nearest Neighbor (K-NN) classification on high-dimensional data loses accuracy and is slow to compute. This research uses purity as a K-NN optimization method to reduce the data dimensions. The data analyzed were the Breast Cancer Wisconsin (Original) dataset, obtained from the UCI Machine Learning Repository. The results show that the smallest purity value is 0.059322, for the Single Epithelial Cell Size attribute, and the second smallest is 0.060976, for the Clump Thickness attribute. Purity improves K-NN accuracy by 2.5 percentage points over K-NN without purity: K-NN without purity reaches 76.25% accuracy, K-NN with one-dimensional reduction by purity reaches 77.50%, and K-NN with two-dimensional reduction by purity reaches the highest accuracy, 78.75%.
ISSN:	0094-243X; 1551-7616
DOI:	10.1063/5.0117058
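
The abstract above describes removing attributes ranked by a per-attribute score before running K-NN. The record does not define how purity is computed, so the following is only a minimal sketch under stated assumptions: the scores are passed in as given numbers, the lowest-scored attributes are assumed to be the ones dropped, and the helper names (`drop_lowest_scored`, `knn_predict`), the toy rows, and the toy scores are all hypothetical and not taken from the paper or the dataset.

```python
import math
from collections import Counter


def drop_lowest_scored(rows, scores, n_drop):
    """Return (reduced_rows, kept_indices) after removing the n_drop
    columns with the smallest scores.

    Assumption: low-scored attributes are the ones to discard. The record
    does not state this explicitly, nor how the scores are computed.
    """
    order = sorted(range(len(scores)), key=lambda j: scores[j])
    dropped = set(order[:n_drop])
    keep = [j for j in range(len(scores)) if j not in dropped]
    return [[row[j] for j in keep] for row in rows], keep


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training rows
    (Euclidean distance)."""
    nearest = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: three attributes, two classes.
train_X = [[1, 9, 1], [2, 1, 2], [8, 9, 9], [9, 1, 8]]
train_y = ["benign", "benign", "malignant", "malignant"]
toy_scores = [0.9, 0.1, 0.8]  # illustrative values, not from the paper

# Remove the single lowest-scored attribute, then classify a query row
# projected onto the same kept columns.
reduced_X, keep = drop_lowest_scored(train_X, toy_scores, n_drop=1)
query = [2, 5, 1]
reduced_query = [query[j] for j in keep]
prediction = knn_predict(reduced_X, train_y, reduced_query, k=3)
print(keep, prediction)
```

Keeping the list of retained column indices lets the same projection be applied to unseen rows, so training and query data always share one attribute set.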