Random Planted Forest: a directly interpretable tree ensemble
Main authors: | , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | We introduce a novel interpretable tree-based algorithm for prediction in a
regression setting. Our motivation is to estimate the unknown regression
function from a functional decomposition perspective in which the functional
components correspond to lower-order interaction terms. The idea is to modify
the random forest algorithm by keeping certain leaves after they are split
instead of deleting them. This leads to non-binary trees which we refer to as
planted trees. An extension to a forest leads to our random planted forest
algorithm. Additionally, the maximum number of covariates which can interact
within a leaf can be bounded. If we set this interaction bound to one, the
resulting estimator is a sum of one-dimensional functions. In the other extreme
case, if we do not set a limit, the resulting estimator and corresponding model
place no restrictions on the form of the regression function. In a simulation
study we find encouraging prediction and visualisation properties of our random
planted forest method. We also develop theory for an idealized version of
random planted forests in cases where the interaction bound is low. We show
that if it is smaller than three, the idealized version achieves asymptotically
optimal convergence rates up to a logarithmic factor. Code is available on
GitHub https://github.com/PlantedML/randomPlantedForest. |
---|---|
DOI: | 10.48550/arxiv.2012.14563 |
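
The splitting rule described in the summary can be illustrated with a small structural sketch. It is not taken from the randomPlantedForest package and fits no data. It assumes one plausible reading of "keeping certain leaves": a leaf is kept when it is split along a covariate it does not yet depend on, and replaced otherwise. The names `Leaf`, `split_leaf`, `max_interaction` and the unit-interval `domain` are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Leaf:
    """A leaf described by the interval it covers on each covariate it uses."""
    bounds: Dict[int, Tuple[float, float]]  # covariate index -> (low, high)

    @property
    def coords(self) -> frozenset:
        # Set of covariates that interact within this leaf.
        return frozenset(self.bounds)


def split_leaf(leaves: List[Leaf], index: int, k: int, threshold: float,
               max_interaction: int,
               domain: Tuple[float, float] = (0.0, 1.0)) -> List[Leaf]:
    """Split leaves[index] along covariate k and return the new leaf list.

    Assumed rule (illustrative): if k is new to the leaf, the parent is kept
    and two children depending on one more covariate are added; if the leaf
    already depends on k, the parent is replaced as in an ordinary tree.
    """
    parent = leaves[index]
    low, high = parent.bounds.get(k, domain)
    if not low < threshold < high:
        raise ValueError("threshold must lie strictly inside the leaf")

    is_new_coord = k not in parent.bounds
    if is_new_coord and len(parent.bounds) + 1 > max_interaction:
        raise ValueError("split would exceed the interaction bound")

    left = Leaf({**parent.bounds, k: (low, threshold)})
    right = Leaf({**parent.bounds, k: (threshold, high)})

    rest = leaves[:index] + leaves[index + 1:]
    kept = [parent] if is_new_coord else []
    return rest + kept + [left, right]


if __name__ == "__main__":
    root = Leaf({})  # the root covers the whole space and uses no covariate
    # Covariate 0 is new to the root, so the root is kept: three leaves.
    leaves = split_leaf([root], 0, k=0, threshold=0.5, max_interaction=1)
    # The leaf at index 1 already uses covariate 0, so it is replaced.
    leaves = split_leaf(leaves, 1, k=0, threshold=0.25, max_interaction=1)
    for leaf in leaves:
        print(sorted(leaf.coords), leaf.bounds)
```

With `max_interaction=1` every leaf depends on at most one covariate, which matches the summary's statement that the resulting estimator is then a sum of one-dimensional functions. Any attempt to split a leaf along a second covariate raises `ValueError` in this sketch.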