Classification Based on Structural Information in Data

Bibliographic Details
Published in: Arabian journal for science and engineering (2011), 2022-02, Vol. 47 (2), p. 2239-2253
Main authors: Karabulut, Bergen; Arslan, Güvenç; Ünver, Halil Murat
Format: Article
Language: English
Online access: Full text
Description
Summary: Clustering provides structural information from unlabeled data. Studies in which the structural information of a dataset is obtained through unsupervised learning approaches such as clustering and then transferred to supervised learning are noteworthy. In this study, we propose a new preprocessing method that obtains structural information expected to represent the most meaningful summary of the training dataset before a supervised learning strategy is applied. To obtain this summary, the CURE clustering method was used. The proposed preprocessing method was combined with SVM, and a new classification method named representative points based SVM (RP-SVM) was developed. This new method was tested experimentally on various real datasets and compared with the standard SVM, KMSVM, KNN and CART methods. RP-SVM significantly reduced the training set size and produced fewer support vectors than standard SVM while achieving similar accuracy. RP-SVM achieved better accuracy with less training data than KNN and CART. In addition, RP-SVM achieves less data reduction than KMSVM, but it is a more stable method that performs well on all datasets used. The results show that the proposed method can extract structural information that supports high-quality classification.
ISSN:2193-567X
1319-8025
2191-4281
DOI:10.1007/s13369-021-06177-3
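
The summary describes reducing the training set to representative points obtained with CURE before training an SVM. The sketch below illustrates only the representative-point step as it is commonly described for CURE: pick a few well-scattered points in a cluster, then shrink them toward the cluster centroid. It is not the authors' implementation. The function names, the parameters `c` and `alpha`, and the simplification of treating each class as a single, already formed cluster are all assumptions made for illustration.

```python
import math

def centroid(points):
    """Mean of a list of equal-length coordinate tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def scattered_points(points, c):
    """Pick up to c well-scattered points by farthest-point selection."""
    center = centroid(points)
    distinct = list(dict.fromkeys(points))  # drop exact duplicates, keep order
    # Start with the point farthest from the centroid.
    chosen = [max(distinct, key=lambda p: math.dist(p, center))]
    while len(chosen) < min(c, len(distinct)):
        remaining = [p for p in distinct if p not in chosen]
        # Next: the point whose nearest already-chosen point is farthest away.
        chosen.append(max(remaining,
                          key=lambda p: min(math.dist(p, q) for q in chosen)))
    return chosen

def representative_points(points, c=4, alpha=0.3):
    """Shrink the scattered points toward the centroid by the fraction alpha."""
    center = centroid(points)
    return [tuple(x + alpha * (m - x) for x, m in zip(p, center))
            for p in scattered_points(points, c)]

def reduce_training_set(points_by_class, c=4, alpha=0.3):
    """Assumption: each class is treated as one cluster (a simplification)."""
    return [(rep, label)
            for label, points in points_by_class.items()
            for rep in representative_points(points, c, alpha)]
```

The resulting list of (point, label) pairs would then stand in for the full training set when fitting a classifier such as an SVM. How the clusters themselves are formed, and which values of `c` and `alpha` are appropriate, is left open here.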