New data reduction algorithms based on the fusion of instance and feature selection

Bibliographic details
Published in:	Knowledge-based systems 2024-07, Vol.296, p.111844, Article 111844
Main authors:	Kusy, Maciej; Zajdel, Roman
Format:	Article
Language:	English
Subjects:	Clustering; Feature selection; Instance selection; Machine learning; Nearest neighbors; Random forest
Online access:	Full text

Description
Abstract:	This paper presents two novel data reduction algorithms that combine instance selection and feature selection methods to simultaneously reduce the number of records and features. These algorithms are rigorously defined and thoroughly explained. They are applied to reduce twelve data sets selected from a commonly available database repository and a single representative of a big database. Their reduction efficiency is tested using three well-known machine learning models: a classification and regression tree, a convolutional neural network, and a support vector machine. The results are extensively analyzed, comparing the classification performance of the models on the reduced data sets with that on the original ones. The computational complexity of the algorithms is theoretically estimated and practically verified. The obtained performance results are compared with those available in the literature. It is shown that implementing the proposed data reduction algorithms consistently improves model accuracy in the vast majority of classification scenarios.
ISSN:	0950-7051, 1872-7409
DOI:	10.1016/j.knosys.2024.111844