Explaining Data-Driven Decisions made by AI Systems: The Counterfactual Approach
We examine counterfactual explanations for explaining the decisions made by model-based AI systems. The counterfactual approach we consider defines an explanation as a set of the system's data inputs that causally drives the decision (i.e., changing the inputs in the set changes the decision) a...
Gespeichert in:
Hauptverfasser: | , , |
---|---|
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | We examine counterfactual explanations for explaining the decisions made by
model-based AI systems. The counterfactual approach we consider defines an
explanation as a set of the system's data inputs that causally drives the
decision (i.e., changing the inputs in the set changes the decision) and is
irreducible (i.e., changing any subset of the inputs does not change the
decision). We (1) demonstrate how this framework may be used to provide
explanations for decisions made by general, data-driven AI systems that may
incorporate features with arbitrary data types and multiple predictive models,
and (2) propose a heuristic procedure to find the most useful explanations
depending on the context. We then contrast counterfactual explanations with
methods that explain model predictions by weighting features according to their
importance (e.g., SHAP, LIME) and present two fundamental reasons why we should
carefully consider whether importance-weight explanations are well-suited to
explain system decisions. Specifically, we show that (i) features that have a
large importance weight for a model prediction may not affect the corresponding
decision, and (ii) importance weights are insufficient to communicate whether
and how features influence decisions. We demonstrate this with several concise
examples and three detailed case studies that compare the counterfactual
approach with SHAP to illustrate various conditions under which counterfactual
explanations explain data-driven decisions better than importance weights. |
---|---|
DOI: | 10.48550/arxiv.2001.07417 |