Data Dimensionality Reduction for Data Mining: A Combined Filter-Wrapper Framework

Bibliographic details
Published in: International journal of computers, communications & control, 2012-12, Vol. 7 (5), p. 824
Main authors: Danubianu, Mirela; Pentiuc, Stefan Gheorghe; Danubianu, Dragos Mircea
Format: Article
Language: English
Description
Summary: Knowledge Discovery in Databases aims to extract new, interesting and potentially useful patterns from large amounts of data. It is a complex process whose central step is data mining, which builds models from data. Data type, quality and dimensionality are among the factors that affect the performance of a data mining task. Since high dimensionality can cause problems such as data overload, one possible solution is to reduce it. Sampling and filtering reduce the number of cases in a dataset, whereas the number of features can be reduced by feature selection. This paper presents a combined method for feature selection: a correlation-based filter is applied to the whole feature set to find the relevant features, and a wrapper is then applied to these features to find the best feature subset for a specified predictor. A case study is also presented for a data set provided by TERAPERS, a personalized speech therapy system.
ISSN: 1841-9836, 1841-9844
DOI: 10.15837/ijccc.2012.5.1337
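
The combined filter-wrapper idea in the summary can be illustrated with a minimal Python sketch. This record does not give the paper's details, so several choices here are assumptions made only for illustration: Pearson correlation with the target as the filter criterion, a fixed `threshold`, greedy forward search as the wrapper strategy, and a leave-one-out 1-nearest-neighbour classifier standing in for the "specified predictor". All function names and the toy data are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)


def correlation_filter(X, y, threshold):
    """Filter step: keep features whose |correlation| with the target reaches threshold."""
    return [j for j in range(len(X[0]))
            if abs(pearson([row[j] for row in X], y)) >= threshold]


def loo_accuracy(X, y, features):
    """Leave-one-out accuracy of a 1-NN predictor restricted to the given features."""
    hits = 0
    for i in range(len(X)):
        best_dist, best_label = None, None
        for k in range(len(X)):
            if k == i:
                continue
            dist = sum((X[i][j] - X[k][j]) ** 2 for j in features)
            if best_dist is None or dist < best_dist:
                best_dist, best_label = dist, y[k]
        hits += best_label == y[i]
    return hits / len(X)


def forward_wrapper(X, y, candidates, score=loo_accuracy):
    """Wrapper step: greedily add the feature that most improves the predictor's score."""
    selected, best = [], 0.0
    remaining = list(candidates)
    while remaining:
        top_score, top_j = max(((score(X, y, selected + [j]), j) for j in remaining),
                               key=lambda pair: pair[0])
        if top_score <= best:  # stop when no candidate improves the score
            break
        selected.append(top_j)
        remaining.remove(top_j)
        best = top_score
    return selected, best


def filter_wrapper_select(X, y, threshold=0.5):
    """Run the filter on all features, then the wrapper on the surviving ones."""
    relevant = correlation_filter(X, y, threshold)
    return forward_wrapper(X, y, relevant)


# Toy data (made up): feature 0 separates the classes, feature 1 is noise,
# feature 2 is constant.
X_demo = [[0.10, 5, 1], [0.20, 1, 1], [0.15, 3, 1],
          [0.90, 4, 1], [1.00, 2, 1], [0.95, 6, 1]]
y_demo = [0, 0, 0, 1, 1, 1]
chosen, accuracy = filter_wrapper_select(X_demo, y_demo, threshold=0.5)
```

The filter is cheap because it never trains a model. The wrapper is more expensive because every candidate subset is scored by the predictor itself. Running the filter first reduces the number of subsets the wrapper has to evaluate.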