Measuring the Shattering coefficient of Decision Tree models


Bibliographic Details
Published in: Expert Systems with Applications, December 2019, Vol. 137, pp. 443-452
Authors: de Mello, Rodrigo F.; Manapragada, Chaitanya; Bifet, Albert
Format: Article
Language: English
Online Access: Full text
Description
Abstract:
Highlights:
• Statistical Learning Theory is applied to study the learning guarantees of Decision Trees.
• A numerical formulation of the Shattering coefficient for Decision Trees.
• Complexity assessment of Decision-Tree models using the Generalization Bound.

In spite of the relevance of Decision Trees (DTs), there is still a disconnection between their theoretical and practical results when selecting models to address specific learning tasks. A particular criterion is provided by the Shattering coefficient, a growth function formulated in the context of Statistical Learning Theory (SLT), which measures the complexity of the algorithm bias as sample sizes increase. In an attempt to establish the basis for a relative theoretical complexity analysis, this paper introduces a method to compute the Shattering coefficient of DT models using recurrence equations. Next, we assess the bias of the models produced by DT algorithms on practical problems, as well as their overall learning bounds in light of SLT. As the main contribution, our results help other researchers decide on the most adequate DT models for specific supervised learning tasks.
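The paper's own recurrence equations for the Shattering coefficient of DT models are not reproduced in this record. As a general illustration of what a shattering-coefficient (growth-function) bound looks like, the sketch below computes the classical Sauer-Shelah upper bound for a hypothesis class of known VC dimension; the function name `sauer_bound` is ours, and this is not the method introduced in the paper.

```python
from math import comb

def sauer_bound(n: int, d: int) -> int:
    """Sauer-Shelah upper bound on the shattering coefficient
    (growth function) of a hypothesis class with VC dimension d,
    evaluated on a sample of size n: sum_{i=0}^{d} C(n, i).
    For n <= d the class can shatter the whole sample (2^n)."""
    return sum(comb(n, i) for i in range(min(n, d) + 1))

# A single axis-aligned split (a one-feature decision stump with a
# threshold) has VC dimension 1 on the real line, so its growth
# function is bounded by n + 1.
print(sauer_bound(10, 1))  # -> 11
print(sauer_bound(10, 3))  # -> 176
```

Deeper trees correspond to richer hypothesis classes (larger VC dimension), so the bound grows polynomially in n with degree d; the paper's contribution is to make this complexity measurement concrete for specific DT models via recurrences rather than a generic VC-dimension argument.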
ISSN: 0957-4174, 1873-6793
DOI: 10.1016/j.eswa.2019.07.012