GINT: A Generative Interpretability method via perturbation in the latent space


Bibliographic details
Published in:	Expert systems with applications 2023-12, Vol.232, p.120570, Article 120570
Main authors:	Tang, Caizhi; Cui, Qing; Li, Longfei; Zhou, Jun
Format:	Article
Language:	English
Subjects:	Generative model; Interpretability; Neural networks
Online access:	Full text

Description
Abstract:	As neural networks grow deeper, model interpretation becomes increasingly necessary, especially in high-risk fields such as medicine and finance. From the perspective of feature attribution, most existing approaches aim to identify the features that contribute most to a prediction. Among them, perturbation-based methods explore how the model's output changes when the given features are randomly perturbed. When the data are high-dimensional and sparse, perturbing the feature space directly can be inefficient and uninformative, because it ignores the feature correlations in the data distribution. In this paper, we introduce a novel Generative INTerpretability method, named GINT, which generates perturbations in the latent space. We propose a unified framework for perturbation-based methods that describes the characteristics of a perturbation suitable for interpretation. Under this framework, we use a generative model to produce perturbations instead of perturbing at random. We then sample perturbations from the generative model for a given instance and its prediction, and compute feature importance by analyzing those perturbations. We conduct extensive experiments to validate the effectiveness and efficiency of the proposed method.
• Leveraging a generative model to generate perturbations in the latent space.
• The perturbation is more meaningful and efficient.
• The interpretation of GINT is more meaningful and efficient.
ISSN:	0957-4174 1873-6793
DOI:	10.1016/j.eswa.2023.120570
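
The abstract contrasts random perturbation of the feature space with perturbation in a latent space. The sketch below illustrates only that general idea and is not the GINT algorithm: as stated assumptions, it uses PCA as a stand-in for the generative model, Gaussian noise as the latent perturbation, and a least-squares linear surrogate as the importance score. All function names and parameters (`fit_linear_autoencoder`, `latent_perturbation_importance`, `scale`, `n_samples`) are hypothetical.

```python
import numpy as np

def fit_linear_autoencoder(X, k):
    """Fit PCA as a stand-in generative model (an assumption, not GINT's model)."""
    mean = X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    W = Vt[:k]  # shape (k, d)

    def encode(x):
        return (x - mean) @ W.T

    def decode(z):
        return z @ W + mean

    return encode, decode

def latent_perturbation_importance(model, encode, decode, x,
                                   n_samples=500, scale=0.5, seed=0):
    """Score features by perturbing x in latent space (hypothetical scoring rule)."""
    rng = np.random.default_rng(seed)
    z = encode(x)
    # Perturb the latent code, then decode so that the perturbed points
    # follow the feature correlations captured by the stand-in model.
    Z = z + scale * rng.standard_normal((n_samples, z.shape[0]))
    X_pert = decode(Z)
    x_rec = decode(z)
    dX = X_pert - x_rec
    dy = model(X_pert) - model(x_rec[None, :])[0]
    # Linear surrogate dy ~ dX @ w; lstsq returns the minimum-norm solution
    # because dX only spans the k-dimensional decoded subspace.
    w, *_ = np.linalg.lstsq(dX, dy, rcond=None)
    return np.abs(w)

# Toy data: two latent factors drive features 0-2; feature 3 is tiny noise.
rng = np.random.default_rng(42)
factors = rng.standard_normal((1000, 2))
A = np.array([[1.0, 0.5, 0.0, 0.0],
              [0.0, 0.5, 1.0, 0.0]])
X = factors @ A + 0.01 * rng.standard_normal((1000, 4))

def black_box(X_batch):
    # Toy model to be explained: depends only on feature 0.
    return 2.0 * X_batch[:, 0]

encode, decode = fit_linear_autoencoder(X, k=2)
scores = latent_perturbation_importance(black_box, encode, decode, X[0])
print(scores)
```

In this toy setup, feature 3 barely moves under latent perturbation, so its score stays near zero, while the features correlated with feature 0 share part of the attribution. A feature-space perturbation would instead vary all four features independently.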