Keep the Faith: Faithful Explanations in Convolutional Neural Networks for Case-Based Reasoning
Format: Article
Language: English
Abstract: Explaining the predictions of black-box neural networks is crucial when they are applied to decision-critical tasks. Attribution maps are therefore commonly used to identify important image regions, even though prior work shows that humans prefer explanations based on similar examples. To this end, ProtoPNet learns a set of class-representative feature vectors (prototypes) for case-based reasoning. During inference, the similarities of latent features to the prototypes are linearly classified to form predictions, and attribution maps are provided to explain each similarity. In this work, we evaluate whether architectures for case-based reasoning fulfill established axioms required for faithful explanations, using ProtoPNet as an example. We show that such architectures allow the extraction of faithful explanations. However, we prove that the attribution maps used to explain the similarities violate the axioms. We propose a new procedure, named ProtoPFaith, to extract explanations from trained ProtoPNets. Conceptually, these explanations are Shapley values computed on the similarity score of each prototype. They make it possible to faithfully answer which prototypes are present in an unseen image and to quantify each pixel's contribution to that presence, thereby complying with all axioms. The theoretical violations of ProtoPNet manifest in our experiments on three datasets (CUB-200-2011, Stanford Dogs, RSNA) and five architectures (ConvNet, ResNet, ResNet50, WideResNet50, ResNeXt50). Our experiments show a qualitative difference between the explanations given by ProtoPNet and ProtoPFaith. Additionally, we quantify the explanations with the Area Over the Perturbation Curve (AOPC), on which ProtoPFaith outperforms ProtoPNet in all experiments by a factor $>10^3$.
DOI: 10.48550/arxiv.2312.09783