Cross-Validation in Nonparametric Regression with Outliers


Bibliographic Details
Published in:	The Annals of Statistics 2005-10, Vol.33 (5), p.2291-2310
Main author:	Leung, Denis Heng-Yan
Format:	Article
Language:	English
Subjects:	62F35; 62F40; 62G08; Bandwidth; Bandwidths; cross-validation; Data smoothing; Datasets; Error rates; Estimators; Exact sciences and technology; Inference from stochastic processes time series analysis; kernel; Linear inference, regression; Linear regression; Mathematical constants; Mathematical functions; Mathematics; Nonparametric Estimation; Nonparametric inference; nonparametric regression; Outliers; Probability and statistics; Regression analysis; robust; Sciences and techniques of general use; Signal bandwidth; smoothing; Standard deviation; Statistics; Studies
Online access:	Full text

Description
Abstract:	A popular data-driven method for choosing the bandwidth in standard kernel regression is cross-validation. Even when there are outliers in the data, robust kernel regression can be used to estimate the unknown regression curve [Robust and Nonlinear Time Series Analysis. Lecture Notes in Statist. (1984) 26 163-184]. However, under these circumstances standard cross-validation is no longer a satisfactory bandwidth selector because it is unduly influenced by extreme prediction errors caused by the existence of these outliers. A more robust method proposed here is a cross-validation method that discounts the extreme prediction errors. In large samples the robust method chooses consistent bandwidths, and the consistency of the method is practically independent of the form in which extreme prediction errors are discounted. Additionally, evaluation of the method's finite sample behavior in a simulation demonstrates that the proposed method performs favorably. This method can also be applied to other problems, for example, model selection, that require cross-validation.
ISSN:	0090-5364 2168-8966
DOI:	10.1214/009053605000000499
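
Illustration:	The idea in the abstract, cross-validation that discounts extreme prediction errors, can be sketched roughly as follows. This is not the article's procedure. It is a minimal illustration under several assumptions: a plain Nadaraya-Watson smoother with a Gaussian kernel stands in for the robust kernel regression the abstract refers to, Huber's loss with cutoff c = 1.345 is just one possible way of discounting, and the residual scale is estimated once from first differences of the responses. All function names are hypothetical.

```python
import numpy as np

def huber_rho(r, c=1.345):
    """Huber loss: quadratic for |r| <= c, linear beyond, so large errors count less."""
    a = np.abs(r)
    return np.where(a <= c, 0.5 * r**2, c * a - 0.5 * c**2)

def loo_predictions(x, y, h):
    """Leave-one-out Nadaraya-Watson predictions, Gaussian kernel, bandwidth h.

    Assumes h is large enough that every point has neighbours with nonzero weight.
    """
    d = (x[:, None] - x[None, :]) / h
    w = np.exp(-0.5 * d**2)
    np.fill_diagonal(w, 0.0)  # exclude each point from its own fit
    return (w @ y) / w.sum(axis=1)

def standard_cv_scores(x, y, bandwidths):
    """Ordinary cross-validation: mean squared leave-one-out prediction error."""
    return np.array([np.mean((y - loo_predictions(x, y, h)) ** 2)
                     for h in bandwidths])

def robust_cv_scores(x, y, bandwidths, c=1.345):
    """Cross-validation with Huber-discounted leave-one-out prediction errors."""
    # Assumed scale estimate: MAD of first differences of y (ordered by x),
    # computed once so that scores stay comparable across bandwidths.
    dy = np.diff(y[np.argsort(x)])
    scale = 1.4826 * np.median(np.abs(dy - np.median(dy))) / np.sqrt(2.0)
    return np.array([np.mean(huber_rho((y - loo_predictions(x, y, h)) / scale, c))
                     for h in bandwidths])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = np.sort(rng.uniform(0.0, 1.0, 200))
    y = np.sin(2 * np.pi * x) + 0.2 * rng.standard_normal(200)
    y[rng.choice(200, size=10, replace=False)] += 8.0  # inject outliers
    grid = np.linspace(0.02, 0.3, 15)
    print("standard CV bandwidth:", grid[np.argmin(standard_cv_scores(x, y, grid))])
    print("robust CV bandwidth:  ", grid[np.argmin(robust_cv_scores(x, y, grid))])
```

Each criterion is minimized over the same bandwidth grid, and the two differ only in how a leave-one-out prediction error enters the score. Squaring lets a few outliers dominate the sum, whereas the Huber loss grows only linearly past the cutoff.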