Kick-one-out-based variable selection method for Euclidean distance-based classifier in high-dimensional settings
Published in: Journal of Multivariate Analysis, 2021-07, Vol. 184, p. 104756, Article 104756
Main authors: , ,
Format: Article
Language: English
Online access: Full text
Abstract: This paper presents a variable selection method for the Euclidean distance-based classifier in high-dimensional settings. We are concerned that the expected probabilities of misclassification (EPMC) for the Euclidean distance-based classifier may increase with the dimension when redundant variables are included among the feature values. First, we show that the Euclidean distance-based classifier restricted to the non-redundant variables attains a smaller asymptotic EPMC than the classifier using all variables. Next, we derive a kick-one-out-based variable selection method that helps reduce the EPMC, and we prove its selection consistency in the high-dimensional setting. Finally, we conduct a Monte Carlo simulation study to examine the finite-sample performance of the proposed selection method. The simulation results show that the method frequently selects the set of non-redundant variables, and that the discrimination rules constructed from the selected variables attain a smaller EPMC than the discrimination rules constructed from all variables.
ISSN: 0047-259X, 1095-7243
DOI: 10.1016/j.jmva.2021.104756
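The kick-one-out idea summarized in the abstract can be illustrated with a minimal sketch. This is an assumption-laden toy version, not the paper's exact criterion: here each variable is "kicked out" in turn and kept only when its removal would noticeably shrink the squared Euclidean distance between the two class mean vectors, with `threshold` standing in for whatever cutoff the actual method derives.

```python
import numpy as np

def kick_one_out(X1, X2, threshold):
    """Toy kick-one-out screen (illustrative only, not the paper's criterion).

    X1, X2: (n_i, p) arrays of observations for the two classes.
    Variable j is kept when deleting it would reduce the squared
    Euclidean distance between the class sample means by more than
    `threshold`; otherwise it is treated as redundant.
    """
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    # contrib[j] is exactly the loss in ||m1 - m2||^2 if variable j is removed.
    contrib = (m1 - m2) ** 2
    return [j for j, c in enumerate(contrib) if c > threshold]

# Usage: two synthetic classes where only the first two of ten variables
# carry a mean shift; the screen should flag exactly those two.
rng = np.random.default_rng(0)
X1 = rng.normal(0.0, 1.0, (500, 10))
X1[:, :2] += 2.0
X2 = rng.normal(0.0, 1.0, (500, 10))
print(kick_one_out(X1, X2, threshold=1.0))
```

The per-variable form of the criterion is what makes a kick-one-out pass cheap here: the effect of removing one coordinate from a squared Euclidean distance is just that coordinate's own squared mean gap.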