Online cross‐validation‐based ensemble learning

Bibliographic details
Published in: Statistics in medicine 2018-01, Vol.37 (2), p.249-260
Main authors: Benkeser, David; Ju, Cheng; Lendle, Sam; van der Laan, Mark
Format: Article
Language: English
Online access: Full text
Description
Summary: Online estimators update a current estimate with a new incoming batch of data without having to revisit past data, thereby providing streaming estimates that are scalable to big data. We develop flexible, ensemble‐based online estimators of an infinite‐dimensional target parameter, such as a regression function, in the setting where data are generated sequentially by a common conditional data distribution given summary measures of the past. This setting encompasses a wide range of time‐series models and, as a special case, models for independent and identically distributed data. Our estimator considers a large library of candidate online estimators and uses online cross‐validation to identify the algorithm with the best performance. We show that by basing estimates on the cross‐validation‐selected algorithm, we are asymptotically guaranteed to perform as well as the true, unknown best‐performing algorithm. We provide extensions of this approach, including online estimation of the optimal ensemble of candidate online estimators. We illustrate excellent performance of our methods using simulations and a real data example where we make streaming predictions of infectious disease incidence using data from a large database. Copyright © 2017 John Wiley & Sons, Ltd.
ISSN: 0277-6715, 1097-0258
DOI:10.1002/sim.7320
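
Illustrative sketch (not part of the catalog record or the article): the summary describes scoring a library of candidate online estimators by online cross‐validation and basing predictions on the best performer. One plausible reading is that each incoming batch is first used to evaluate every candidate, before any candidate has trained on it, and is only then used to update them. The Python sketch below follows that reading under stated assumptions. The class names (RunningMean, SGDLinear, OnlineCVSelector), the squared-error loss, the step sizes, and the simulated data are all hypothetical choices made for illustration; this is not the authors' implementation, and it covers only the discrete selection of a single candidate, not the optimal-ensemble extension.

```python
import random


class RunningMean:
    """Hypothetical candidate: ignores x and predicts the running mean of y."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0

    def predict(self, x):
        return self.mean

    def update(self, batch):
        for _, y in batch:
            self.n += 1
            self.mean += (y - self.mean) / self.n


class SGDLinear:
    """Hypothetical candidate: fits y ~ a + b*x by stochastic gradient descent."""

    def __init__(self, step):
        self.step = step
        self.a = 0.0
        self.b = 0.0

    def predict(self, x):
        return self.a + self.b * x

    def update(self, batch):
        for x, y in batch:
            err = self.predict(x) - y  # residual of the current fit
            self.a -= self.step * err
            self.b -= self.step * err * x


class OnlineCVSelector:
    """Scores every candidate on each new batch before training on it."""

    def __init__(self, candidates):
        self.candidates = candidates  # dict: name -> online estimator
        self.cum_loss = {name: 0.0 for name in candidates}
        self.n_seen = 0

    def process_batch(self, batch):
        # Step 1: the batch is still unseen, so it acts as a validation set.
        for name, est in self.candidates.items():
            self.cum_loss[name] += sum((est.predict(x) - y) ** 2 for x, y in batch)
        self.n_seen += len(batch)
        # Step 2: only now update each candidate; past batches are never revisited.
        for est in self.candidates.values():
            est.update(batch)

    def cv_risk(self):
        # Average squared error per observation; call after at least one batch.
        return {name: loss / self.n_seen for name, loss in self.cum_loss.items()}

    def best(self):
        return min(self.cum_loss, key=self.cum_loss.get)

    def predict(self, x):
        # Predict with the candidate that currently has the lowest online CV risk.
        return self.candidates[self.best()].predict(x)


def run_demo(seed=0, n_batches=200, batch_size=10):
    # Simulated i.i.d. stream (an assumption for the demo): y = 1 + 2x + noise.
    rng = random.Random(seed)
    selector = OnlineCVSelector({
        "running_mean": RunningMean(),
        "sgd_step_0.1": SGDLinear(0.1),
        "sgd_step_0.0001": SGDLinear(0.0001),
    })
    for _ in range(n_batches):
        batch = []
        for _ in range(batch_size):
            x = rng.uniform(-1.0, 1.0)
            batch.append((x, 1.0 + 2.0 * x + rng.gauss(0.0, 0.1)))
        selector.process_batch(batch)
    return selector


selector = run_demo()
print(selector.best(), selector.cv_risk())
```

Because the cumulative loss is a running sum, memory use does not grow with the length of the stream, which is the property the summary attributes to online estimators. The selector here picks one candidate; a weighted combination of candidates would need an additional online update for the weights.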