Learning decision trees through Monte Carlo tree search: An empirical evaluation



Bibliographic details
Published in:	Wiley interdisciplinary reviews. Data mining and knowledge discovery, 2020-05, Vol. 10 (3), p. e1348
Main authors:	Nunes, Cecília; De Craene, Mathieu; Langet, Hélène; Camara, Oscar; Jonsson, Anders
Format:	Article
Language:	English
Subjects:	Complexity; Computational efficiency; Computing costs; Confidence; Data mining; Datasets; Decision trees; Hierarchies; Learning; Monte Carlo simulation; Monte Carlo tree search; Performance prediction; Search algorithms; Searching; Software; Software development tools
Online access:	Full text

Description
Summary:	Decision trees (DTs) are a widely used prediction tool, owing to their interpretability. Standard learning methods follow a locally optimal approach that trades off prediction performance for computational efficiency. Such methods can, however, be far from optimal, and it may pay off to spend more computational resources to increase performance. Monte Carlo tree search (MCTS) is an approach to approximating optimal choices in exponentially large search spaces. We propose a DT learning approach based on the Upper Confidence bounds applied to Trees (UCT) algorithm, including procedures to expand and explore the space of DTs. To mitigate the computational cost of our method, we employ search pruning strategies that discard some branches of the search tree. The experiments show that the proposed approach outperformed the C4.5 algorithm in 20 out of 31 datasets, with statistically significant improvements in the trade-off between prediction performance and DT complexity. The approach improved on locally optimal search for datasets with more than 1,000 instances, or for smaller datasets likely arising from complex distributions.
This article is categorized under: Algorithmic Development > Hierarchies and Trees; Application Areas > Data Mining Software Tools; Fundamental Concepts of Data and Knowledge > Data Concepts.
Caption:	One iteration of the UCT–DT algorithm, an adaptation of the Upper Confidence bounds applied to Trees (UCT) algorithm to learn decision trees (DTs) for supervised learning. The method employs Monte Carlo tree search as an alternative to the locally optimal approach followed by standard methods.
ISSN:	1942-4787 1942-4795
DOI:	10.1002/widm.1348
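
Note on UCT: the summary names the UCT algorithm without stating its selection rule. As a rough illustration only, the sketch below shows the generic UCB1-style child selection commonly used in UCT, where each child is scored by its mean reward plus an exploration bonus. It is not the article's UCT-DT procedure; the function names, the (total_reward, visits) tuple layout, and the default exploration constant c = sqrt(2) are all assumptions made for this example.

```python
import math

def uct_score(total_reward, visits, parent_visits, c=math.sqrt(2)):
    """Generic UCB1 score: mean reward plus an exploration bonus.

    Hypothetical helper for illustration; not taken from the article.
    Unvisited children score +inf so that each is tried at least once.
    """
    if visits == 0:
        return math.inf
    exploit = total_reward / visits
    explore = c * math.sqrt(math.log(parent_visits) / visits)
    return exploit + explore

def select_child(children, c=math.sqrt(2)):
    """Return the index of the child with the highest UCB1 score.

    `children` is a list of (total_reward, visits) tuples (assumed layout).
    The parent visit count is taken as the sum of the children's visits.
    """
    parent_visits = sum(visits for _, visits in children)
    scores = [uct_score(r, v, parent_visits, c) for r, v in children]
    return max(range(len(children)), key=scores.__getitem__)
```

In a search over decision trees, a child might correspond to one candidate expansion of the current tree and the reward to an estimate of predictive performance, but that mapping is an assumption here rather than something this record specifies.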