Interpreting Deep Learning Models with Marginal Attribution by Conditioning on Quantiles
Main Authors: | , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online Access: | Order full text |
Summary: | A vastly growing literature on explaining deep learning models has emerged.
This paper contributes to that literature by introducing a global
gradient-based model-agnostic method, which we call Marginal Attribution by
Conditioning on Quantiles (MACQ). Our approach is based on analyzing the
marginal attribution of predictions (outputs) to individual features (inputs).
Specifically, we consider variable importance by mixing (global) output
levels and, thus, explain how features marginally contribute across different
regions of the prediction space. Hence, MACQ can be seen as a marginal
attribution counterpart to approaches such as accumulated local effects (ALE),
which study the sensitivities of outputs by perturbing inputs. Furthermore,
MACQ allows us to separate marginal attribution of individual features from
interaction effects, and to visually illustrate the 3-way relationship between
marginal attribution, output level, and feature value. |
---|---|
DOI: | 10.48550/arxiv.2103.11706 |
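
The summary describes attributing predictions to individual features at different output levels. A minimal sketch of that idea follows, under several assumptions that are not stated in this record: the marginal attribution of feature j is taken to be x_j times the partial derivative of the prediction with respect to x_j, "conditioning on quantiles" is approximated by averaging over samples whose prediction falls in a small window around the empirical quantile, and `model`, `gradient`, and `macq_sketch` are hypothetical names for a toy function with a hand-written gradient rather than a trained network.

```python
import random


def model(x):
    # Hypothetical toy regression function standing in for a deep learning model.
    return x[0] ** 2 + 2.0 * x[1]


def gradient(x):
    # Analytic gradient of the toy model; a real network would use autodiff.
    return [2.0 * x[0], 2.0]


def macq_sketch(samples, alpha, window=0.05):
    """Average x_j * d(model)/d(x_j) over samples whose prediction lies
    near the empirical alpha-quantile of all predictions (assumed definition)."""
    preds = sorted(model(x) for x in samples)
    n = len(preds)
    # Prediction band between the (alpha - window) and (alpha + window) quantiles.
    lo = preds[max(0, int((alpha - window) * n))]
    hi = preds[min(n - 1, int((alpha + window) * n))]
    selected = [x for x in samples if lo <= model(x) <= hi]
    dim = len(samples[0])
    totals = [0.0] * dim
    for x in selected:
        g = gradient(x)
        for j in range(dim):
            totals[j] += x[j] * g[j]
    return [t / len(selected) for t in totals]


# Usage: compare attributions at a low and a high output level.
random.seed(0)
points = [[random.random(), random.random()] for _ in range(2000)]
for level in (0.1, 0.5, 0.9):
    print(level, macq_sketch(points, level))
```

Plotting the returned attributions against the quantile level, one curve per feature, gives a rough picture of how each feature's contribution changes across the prediction space. The windowed average is a crude stand-in for a proper conditional expectation and is chosen here only for brevity.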