Random forests for functional covariates

Bibliographic details
Published in: Journal of chemometrics 2016-12, Vol.30 (12), p.715-725
Main authors: Möller, Annette; Tutz, Gerhard; Gertheiss, Jan
Format: Article
Language: English
Description
Summary: We propose a form of random forests that is especially suited for functional covariates. The method is based on partitioning the functions' domain into intervals and using the functions' mean values across those intervals as predictors in regression or classification trees. This approach appears to be more intuitive to applied researchers than usual methods for functional data, while also performing very well in terms of prediction accuracy. The intervals are obtained from randomly drawn, exponentially distributed waiting times. We apply our method to data from Raman spectra on boar meat as well as near-infrared absorption spectra. The predictive performance of the proposed functional random forests is compared with commonly used parametric and nonparametric functional methods and with a nonfunctional random forest using the single measurements of the curve as covariates. Further, we present a functional variable importance measure, yielding information about the relevance of the different parts of the predictor curves. Our variable importance curve is much smoother and hence easier to interpret than the one obtained from nonfunctional random forests.

We propose a form of random forests designed for functional covariates. The approach is based on partitioning the domain of the functional predictors into randomly generated intervals and then using the functions' mean values across these intervals as predictors in classification or regression trees. Further, we derive a smooth functional variable importance curve that yields information about the importance of different parts of the predictor curves. Two case studies show that our approach is highly competitive with standard functional models.
ISSN: 0886-9383, 1099-128X
DOI: 10.1002/cem.2849
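
The interval construction described in the summary can be illustrated with a short sketch. It is not the authors' implementation: it assumes that cut points are placed by accumulating exponentially distributed waiting times along the domain, and that each curve is observed on a common grid. The function names `random_intervals` and `interval_means` and the `rate` parameter are hypothetical, chosen only for this illustration. How the random partitions are assigned to the individual trees is not stated in the summary and is left out here.

```python
import random
from statistics import fmean


def random_intervals(lower, upper, rate, rng):
    """Split [lower, upper] at cut points separated by Exp(rate) waiting times.

    Assumption: the cut points are cumulative sums of the waiting times.
    """
    cuts = [lower]
    position = lower + rng.expovariate(rate)
    while position < upper:
        cuts.append(position)
        position += rng.expovariate(rate)
    cuts.append(upper)
    # Consecutive cut points form contiguous intervals covering the domain.
    return list(zip(cuts[:-1], cuts[1:]))


def interval_means(grid, curve, intervals):
    """Mean of the observed curve values over each interval.

    Intervals are half-open [left, right), except the last one, which also
    includes its right end point. An interval containing no grid point
    yields NaN and would have to be merged or dropped before fitting a tree.
    """
    features = []
    for k, (left, right) in enumerate(intervals):
        is_last = k == len(intervals) - 1
        values = [
            y
            for t, y in zip(grid, curve)
            if left <= t < right or (is_last and t == right)
        ]
        features.append(fmean(values) if values else float("nan"))
    return features


# Usage: one synthetic curve on an equidistant grid over [0, 1].
rng = random.Random(0)
grid = [i / 100 for i in range(101)]
curve = [t ** 2 for t in grid]
intervals = random_intervals(0.0, 1.0, rate=5.0, rng=rng)
features = interval_means(grid, curve, intervals)
```

The resulting `features` list has one entry per interval and could be passed as an ordinary predictor vector to any regression or classification tree. A larger `rate` gives shorter intervals on average, so the features follow the curve more closely at the cost of more predictors.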