Density-based reliable and robust explainer for counterfactual explanation
Published in: Expert Systems with Applications, 2023-09, Vol. 226, p. 120214, Article 120214
Authors: , , ,
Format: Article
Language: English
Online access: Full text
Abstract: As an essential post-hoc explanatory method, counterfactual explanation enables people to understand and react to machine learning models. Work on counterfactual explanation generally aims at generating high-quality results, meaning explanations that are close to the input and detailed enough for users. However, counterfactual explainers trained on data are fragile in practice: even a small perturbation to a sample can lead to large differences in the explanation. In this work, we address this issue by analyzing and formalizing the robustness of counterfactual explainers under practical considerations. An explainer is considered robust if it generates relatively stable counterfactuals across various settings. To this end, we propose a robust and reliable explainer that searches for counterfactuals of classifier predictions using density gravity. To evaluate performance, we provide metrics that allow comparison of our proposed explainer with others, and we further demonstrate the importance of density in enhancing robustness. Extensive experiments on real-world datasets show that our method offers a significant improvement in explainer reliability and stability.
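The record gives only this high-level description, so the paper's exact "density gravity" search is not reproduced here. The sketch below illustrates the general idea of density-aware counterfactual search that the abstract points to: candidate counterfactuals are scored both by proximity to the query point and by the estimated data density at the candidate, so the chosen explanation lies in a well-populated region and tends to be more stable under small input perturbations. The function name `find_counterfactual`, the random candidate sampling, and the trade-off weight `lambda_density` are assumptions for illustration, not the authors' algorithm.

```python
# Illustrative density-aware counterfactual search (an assumption, not the
# paper's "density gravity" method, whose details are not in this record).
import numpy as np
from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KernelDensity

def find_counterfactual(x, clf, kde, target, n_candidates=2000,
                        radius=1.5, lambda_density=1.0, seed=0):
    """Sample random perturbations of x, keep those the classifier assigns
    to `target`, and return the one minimizing distance to x minus a
    density bonus, so the counterfactual sits in a dense data region."""
    rng = np.random.default_rng(seed)
    cands = x + rng.normal(scale=radius, size=(n_candidates, x.shape[0]))
    flipped = cands[clf.predict(cands) == target]
    if len(flipped) == 0:
        return None
    dist = np.linalg.norm(flipped - x, axis=1)   # proximity term
    log_dens = kde.score_samples(flipped)        # log data density at candidates
    score = dist - lambda_density * log_dens     # lower is better
    return flipped[np.argmin(score)]

# Toy usage: fit a classifier and a kernel density estimate on two-moons
# data, then explain one prediction with a density-aware counterfactual.
X, y = make_moons(n_samples=500, noise=0.2, random_state=0)
clf = RandomForestClassifier(random_state=0).fit(X, y)
kde = KernelDensity(bandwidth=0.3).fit(X)
x0 = X[0]
cf = find_counterfactual(x0, clf, kde,
                         target=1 - clf.predict(x0.reshape(1, -1))[0])
print("original:", x0, "counterfactual:", cf)
```

Setting `lambda_density` to zero reduces this to a plain closest-counterfactual search; increasing it trades proximity for density, which is the kind of trade-off the abstract credits with improving robustness.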
ISSN: 0957-4174, 1873-6793
DOI: 10.1016/j.eswa.2023.120214