Exact Shapley values for local and model-true explanations of decision tree ensembles


Bibliographic Details
Published in: Machine learning with applications 2022-09, Vol. 9, p. 100345, Article 100345
Main authors: Campbell, Thomas W., Roder, Heinrich, Georgantas III, Robert W., Roder, Joanna
Format: Article
Language: English
Online access: Full text
Description
Abstract: Additive feature explanations using Shapley values have become popular for providing transparency into the relative importance of each feature to an individual prediction of a machine learning model. While Shapley values provide a unique additive feature attribution in cooperative game theory, the Shapley values that can be generated for even a single machine learning model are far from unique, with theoretical and implementational decisions affecting the resulting attributions. Here, we consider the application of Shapley values for explaining decision tree ensembles and present a novel approach to Shapley value-based feature attribution that can be applied to random forests and boosted decision trees. This new method provides attributions that accurately reflect details of the model prediction algorithm for individual instances, while being computationally competitive with one of the most widely used current methods. We explain the theoretical differences between the standard and novel approaches and compare their performance using synthetic and real data.

• Introduces a novel approach to calculation of Shapley values for tree-based models.
• Algorithm has same computational complexity as standard treeSHAP implementations.
• Attributes not used in prediction generation for an instance have zero Shapley value.
• Approach is of particular relevance for true-to-model instance-specific explanations.
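To make the underlying attribution concrete, here is a minimal sketch of exact Shapley values for a single prediction, computed by brute-force enumeration of feature coalitions. This is not the paper's algorithm (which achieves treeSHAP-level complexity); it is the textbook definition the paper's method approximates exactly, using an assumed interventional value function where features outside a coalition are replaced by baseline values. The toy `tree` function and baseline are illustrative placeholders.

```python
from itertools import combinations
from math import factorial

def exact_shapley(f, x, baseline):
    """Brute-force Shapley values for one instance of model f.

    v(S) evaluates f on a hybrid input: features in coalition S are
    taken from the instance x, the rest from the baseline. This is
    an interventional value function, one of several choices the
    abstract notes can change the resulting attributions.
    """
    n = len(x)

    def v(S):
        z = [x[i] if i in S else baseline[i] for i in range(n)]
        return f(z)

    phis = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for k in range(n):
            for S in combinations(others, k):
                # Shapley weight |S|! (n - |S| - 1)! / n!
                w = factorial(k) * factorial(n - k - 1) / factorial(n)
                phi += w * (v(set(S) | {i}) - v(set(S)))
        phis.append(phi)
    return phis

# Toy stump-like "tree" prediction on two features (hypothetical model)
def tree(z):
    if z[0] > 0.5:
        return 10.0
    return 3.0 if z[1] > 0.5 else 1.0

x, baseline = [1.0, 1.0], [0.0, 0.0]
phi = exact_shapley(tree, x, baseline)

# Efficiency property: attributions sum to f(x) - f(baseline)
assert abs(sum(phi) - (tree(x) - tree(baseline))) < 1e-9
```

The enumeration is exponential in the number of features, which is why tree-specific algorithms such as treeSHAP (and the variant proposed in this article) exploit the tree structure to reach polynomial cost.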
ISSN: 2666-8270
DOI: 10.1016/j.mlwa.2022.100345