Sample Reduction Algorithm Based on Classification Contribution
Published in: | IAENG international journal of computer science 2023-08, Vol.50 (3), p.851 |
---|---|
Main authors: | , , , , , |
Format: | Article |
Language: | English |
Abstract: | The running time of the KNN algorithm grows exponentially when it processes datasets containing a large number of samples, and its classification performance is low. To address this problem, this paper proposes a sample reduction method based on classification contribution ranking (SRCCR). First, SRCCR denoises the data, removing noisy samples from the initial training dataset to smooth the decision boundary; next, the denoised samples are sorted in ascending order according to the classification contribution strategy; finally, representative boundary samples and center samples are selected based on the local set to form the final subset. SRCCR reduces storage requirements and execution time, and significantly improves the classification performance of the KNN algorithm. To verify the effectiveness of the proposed method, we conduct comparative experiments on 31 real datasets from the UCI and KEEL databases. Compared with several classical instance selection algorithms, the proposed SRCCR algorithm has advantages in terms of accuracy and reduction rate. The results on the two-dimensional dataset "Banana" show that the SRCCR algorithm not only selects more representative boundary and center samples, but also preserves the distribution of the original dataset. |
---|---|
ISSN: | 1819-656X 1819-9224 |
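
The abstract names a denoising step and a reduction rate but does not give their exact definitions. As a rough illustration only, the sketch below uses Wilson's Edited Nearest Neighbor rule as a stand-in for the denoising step (an assumption, not necessarily SRCCR's own rule) and computes the reduction rate as the fraction of samples removed. The function names `k_nearest`, `enn_denoise`, and `reduction_rate` and the toy data are hypothetical. The classification contribution ranking and the local-set selection are not sketched, because the abstract does not specify them.

```python
import math
from collections import Counter


def k_nearest(points, query_index, k):
    """Indices of the k samples closest to points[query_index], excluding itself."""
    query = points[query_index]
    others = [i for i in range(len(points)) if i != query_index]
    others.sort(key=lambda i: math.dist(points[i], query))
    return others[:k]


def enn_denoise(points, labels, k=3):
    """Edited Nearest Neighbor (stand-in for the paper's denoising step).

    A sample is kept only if the majority label among its k nearest
    neighbors matches its own label; otherwise it is treated as noise.
    Returns the indices of the kept samples.
    """
    kept = []
    for i in range(len(points)):
        votes = Counter(labels[j] for j in k_nearest(points, i, k))
        majority_label, _ = votes.most_common(1)[0]
        if majority_label == labels[i]:
            kept.append(i)
    return kept


def reduction_rate(n_original, n_selected):
    """Fraction of the original training samples that were removed."""
    return 1.0 - n_selected / n_original


# Hypothetical toy data: two clusters, with one mislabeled sample (index 3)
# sitting inside the "a" cluster.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.15, 0.15),
          (5.0, 5.0), (5.1, 5.2), (5.2, 5.1), (5.05, 5.1)]
labels = ["a", "a", "a", "b", "b", "b", "b", "b"]

kept = enn_denoise(points, labels, k=3)
rate = reduction_rate(len(points), len(kept))
```

On this toy data the mislabeled sample is the only one discarded, so the reduction rate reflects one removed sample out of eight. A full instance selection method would continue by ranking and selecting among the kept samples, which would raise the reduction rate further.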