One Wave to Explain Them All: A Unifying Perspective on Post-hoc Explainability
Main authors: , , ,
Format: Article
Language: English
Subjects:
Online access: Order full text
Abstract: Despite the growing use of deep neural networks in safety-critical decision-making, their inherent black-box nature hinders transparency and interpretability. Explainable AI (XAI) methods have thus emerged to understand a model's internal workings, notably attribution methods, also called saliency maps. Conventional attribution methods typically identify the locations -- the where -- of significant regions within an input. However, because they overlook the inherent structure of the input data, these methods often fail to interpret what these regions represent in terms of structural components (e.g., textures in images or transients in sounds). Furthermore, existing methods are usually tailored to a single data modality, limiting their generalizability. In this paper, we propose leveraging the wavelet domain as a robust mathematical foundation for attribution. Our approach, the Wavelet Attribution Method (WAM), extends existing gradient-based feature attributions into the wavelet domain, providing a unified framework for explaining classifiers across images, audio, and 3D shapes. Empirical evaluations demonstrate that WAM matches or surpasses state-of-the-art methods across faithfulness metrics and models in image, audio, and 3D explainability. Finally, we show how our method explains not only the where -- the important parts of the input -- but also the what -- the relevant patterns in terms of structural components.
DOI: 10.48550/arxiv.2410.01482
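
To make the abstract's central mechanism concrete, here is a minimal, hypothetical sketch of gradient-based attribution computed in the wavelet domain, in the spirit of WAM but not the authors' implementation. It assumes a single-level 2D Haar transform written directly in PyTorch so that gradients flow back to the coefficients; the paper's method is more general (multi-level wavelet decompositions and image, audio, and 3D modalities). The helper names (haar_dwt2, haar_idwt2, wavelet_attribution) are illustrative, not from the paper.

```python
# Minimal sketch (assumed reconstruction, not the paper's code): a
# gradient-based attribution taken with respect to wavelet coefficients
# rather than raw pixels.
import torch

def haar_dwt2(x):
    # Single-level 2D Haar transform of a (B, C, H, W) tensor (H, W even).
    # Returns approximation (LL) and detail (LH, HL, HH) coefficient bands.
    a = (x[..., ::2, :] + x[..., 1::2, :]) / 2.0  # row averages
    d = (x[..., ::2, :] - x[..., 1::2, :]) / 2.0  # row differences
    ll = (a[..., ::2] + a[..., 1::2]) / 2.0
    lh = (a[..., ::2] - a[..., 1::2]) / 2.0
    hl = (d[..., ::2] + d[..., 1::2]) / 2.0
    hh = (d[..., ::2] - d[..., 1::2]) / 2.0
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    # Exact inverse of haar_dwt2 (perfect reconstruction).
    B, C, H2, W2 = ll.shape
    a = ll.new_zeros((B, C, H2, 2 * W2))
    d = ll.new_zeros((B, C, H2, 2 * W2))
    a[..., ::2], a[..., 1::2] = ll + lh, ll - lh
    d[..., ::2], d[..., 1::2] = hl + hh, hl - hh
    x = ll.new_zeros((B, C, 2 * H2, 2 * W2))
    x[..., ::2, :], x[..., 1::2, :] = a + d, a - d
    return x

def wavelet_attribution(model, x, target_class):
    # Treat the wavelet coefficients (not the pixels) as the variables of
    # interest: reconstruct the input from them, score it with the model,
    # and read the gradient magnitude per coefficient as its importance.
    coeffs = [c.detach().requires_grad_(True) for c in haar_dwt2(x)]
    score = model(haar_idwt2(*coeffs))[:, target_class].sum()
    score.backward()
    return [c.grad.abs() for c in coeffs]  # one map per scale/orientation band

# Toy usage: any differentiable classifier works in place of this stub.
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
ll_g, lh_g, hl_g, hh_g = wavelet_attribution(model, torch.randn(1, 3, 32, 32), 3)
```

Reading the gradient magnitudes band by band is what would let such a method speak to the what (e.g., fine-scale detail bands versus the coarse approximation), not just the where, matching the distinction the abstract draws.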