Fast Axiomatic Attribution for Neural Networks
Format: Article
Language: English
Abstract: Mitigating the dependence on spurious correlations present in the training
dataset is a quickly emerging and important topic of deep learning. Recent
approaches include priors on the feature attribution of a deep neural network
(DNN) into the training process to reduce the dependence on unwanted features.
However, until now one needed to trade off high-quality attributions,
satisfying desirable axioms, against the time required to compute them. This in
turn either led to long training times or ineffective attribution priors. In
this work, we break this trade-off by considering a special class of
efficiently axiomatically attributable DNNs for which an axiomatic feature
attribution can be computed with only a single forward/backward pass. We
formally prove that nonnegatively homogeneous DNNs, here termed
$\mathcal{X}$-DNNs, are efficiently axiomatically attributable and show that
they can be effortlessly constructed from a wide range of regular DNNs by
simply removing the bias term of each layer. Various experiments demonstrate
the advantages of $\mathcal{X}$-DNNs, beating state-of-the-art generic
attribution methods on regular DNNs for training with attribution priors.
DOI: 10.48550/arxiv.2111.07668
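The abstract's core claims can be illustrated with a toy sketch (not the authors' implementation): a bias-free two-layer ReLU network in NumPy is nonnegatively homogeneous, so input-times-gradient attribution, computed from a single forward/backward pass, sums exactly to the network output (completeness, by Euler's theorem for homogeneous functions). All weights and shapes here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Bias-free two-layer ReLU network: f(x) = w2 @ relu(W1 @ x).
# Removing the bias terms makes f nonnegatively homogeneous of
# degree 1, i.e. f(a*x) = a*f(x) for a >= 0 (an X-DNN in the
# paper's terminology). Weights are random for illustration only.
W1 = rng.standard_normal((8, 4))
w2 = rng.standard_normal(8)

def f(x):
    """Forward pass of the bias-free network."""
    return w2 @ np.maximum(W1 @ x, 0.0)

def grad_f(x):
    """Gradient of f w.r.t. x via one backward pass.

    d relu(z)/dz is the indicator z > 0, so the gradient is the
    chain of linear maps masked by the active ReLU units.
    """
    z = W1 @ x
    return (w2 * (z > 0)) @ W1

x = rng.standard_normal(4)

# Homogeneity: scaling the input scales the output.
assert np.isclose(f(2.0 * x), 2.0 * f(x))

# Axiomatic attribution from a single forward/backward pass.
attribution = x * grad_f(x)

# Completeness: attributions sum exactly to the output.
assert np.isclose(attribution.sum(), f(x))
```

With bias terms present, the same input-times-gradient quantity generally does not satisfy completeness, which is why generic axiomatic methods such as Integrated Gradients need many function evaluations on regular DNNs.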