A Bayesian Approach To Analysing Training Data Attribution In Deep Learning
Saved in:
Main authors: |  |
---|---|
Format: | Article |
Language: | English |
Subjects: |  |
Online access: | Order full text |
Summary: | Training data attribution (TDA) techniques find influential training data for the model's prediction on the test data of interest. They approximate the impact of down- or up-weighting a particular training sample. While conceptually useful, they are hardly applicable to deep models in practice, particularly because of their sensitivity to different model initialisation. In this paper, we introduce a Bayesian perspective on the TDA task, where the learned model is treated as a Bayesian posterior and the TDA estimates as random variables. From this novel viewpoint, we observe that the influence of an individual training sample is often overshadowed by the noise stemming from model initialisation and SGD batch composition. Based on this observation, we argue that TDA can only be reliably used for explaining deep model predictions that are consistently influenced by certain training data, independent of other noise factors. Our experiments demonstrate the rarity of such noise-independent training-test data pairs but confirm their existence. We recommend that future researchers and practitioners trust TDA estimates only in such cases. Further, we find a disagreement between ground truth and estimated TDA distributions and encourage future work to study this gap. Code is provided at https://github.com/ElisaNguyen/bayesian-tda. |
DOI: | 10.48550/arxiv.2305.19765 |
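
The summary above treats a TDA score as a random variable over retraining noise (parameter initialisation and SGD batch composition). As a loose illustration of that framing, and not the paper's method or the code in the linked repository, the sketch below retrains a small logistic-regression model under several random seeds, computes a leave-one-out influence estimate for one training/test pair per seed, and only calls the attribution noise-independent when its mean dominates the seed-to-seed spread. The toy dataset, model, hyperparameters, and the 2-sigma consistency rule are all illustrative assumptions.

```python
"""Illustrative sketch (not the authors' implementation) of the idea in the
summary: treat a training-data-attribution (TDA) score as a random variable
over retraining noise, and only trust train/test pairs whose influence is
consistent across seeds."""
import numpy as np


def make_toy_data(rng, n_train=200, n_test=20, dim=5):
    """Two-class Gaussian data; a stand-in dataset for illustration."""
    w_true = rng.normal(size=dim)
    X = rng.normal(size=(n_train + n_test, dim))
    y = (X @ w_true + 0.5 * rng.normal(size=n_train + n_test) > 0).astype(float)
    return (X[:n_train], y[:n_train]), (X[n_train:], y[n_train:])


def train_logreg_sgd(X, y, seed, epochs=30, batch_size=16, lr=0.1):
    """Mini-batch SGD logistic regression; the seed controls both the random
    initialisation and the batch composition (the two noise sources named in
    the summary)."""
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.1, size=X.shape[1])
    b = 0.0
    n = X.shape[0]
    for _ in range(epochs):
        order = rng.permutation(n)
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            p = 1.0 / (1.0 + np.exp(-(X[idx] @ w + b)))
            grad = p - y[idx]
            w -= lr * X[idx].T @ grad / len(idx)
            b -= lr * grad.mean()
    return w, b


def test_loss(w, b, x, y):
    """Binary cross-entropy on a single test point."""
    p = np.clip(1.0 / (1.0 + np.exp(-(x @ w + b))), 1e-12, 1 - 1e-12)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))


def loo_influence(train, test_point, drop_idx, seed):
    """Leave-one-out TDA proxy: change in test loss when one training sample
    is removed, for a fixed training seed."""
    (X, y), (x_t, y_t) = train, test_point
    w_full, b_full = train_logreg_sgd(X, y, seed)
    keep = np.arange(len(y)) != drop_idx
    w_loo, b_loo = train_logreg_sgd(X[keep], y[keep], seed)
    return test_loss(w_loo, b_loo, x_t, y_t) - test_loss(w_full, b_full, x_t, y_t)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    train, (X_test, y_test) = make_toy_data(rng)
    test_point = (X_test[0], y_test[0])

    # The TDA score for one train/test pair, sampled over retraining seeds.
    scores = np.array([loo_influence(train, test_point, drop_idx=3, seed=s)
                       for s in range(10)])
    mean, std = scores.mean(), scores.std()
    print(f"influence mean={mean:+.4f}, std={std:.4f}")
    # Heuristic noise-independence check: trust the attribution only if its
    # sign is stable, i.e. the mean dominates the seed-to-seed spread.
    print("consistent sign across seeds:", abs(mean) > 2 * std)
```

In this toy setting, a pair whose score keeps the same sign across seeds plays the role of the rare noise-independent training-test pairs the summary describes, while pairs whose scores straddle zero are the cases the authors advise against interpreting.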