P-Shapley: Shapley Values on Probabilistic Classifiers

Bibliographic details
Published in: Proceedings of the VLDB Endowment, 2024-03, Vol. 17 (7), pp. 1737-1750
Main authors: Xia, Haocheng; Li, Xiang; Pang, Junyuan; Liu, Jinfei; Ren, Kui; Xiong, Li
Format: Article
Language: English
Online access: Full text
Description
Summary: The Shapley value provides a unique approach to equitably gauge each player's contribution within a coalition and has extensive applications with various utility functions. In data valuation for machine learning, particularly for classification tasks, using classification accuracy as the utility function has become a de facto standard. However, accuracy can be an imprecise metric, potentially missing finer details crucial for valuation. In this paper, we propose the probability-based Shapley (P-Shapley) value, which leverages predicted probabilities to heighten utility differentiation. Several convex calibration functions are further incorporated for probability calibration. We prove that the P-Shapley value outperforms Shapley values based on accuracy or other coarse metrics in approximation stability, and that the discrimination of marginal utility change can be further improved by convex calibration functions. Extensive experiments on four real-world datasets demonstrate the effectiveness of our approaches.
ISSN:2150-8097
DOI:10.14778/3654621.3654638
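
Illustration: the summary contrasts an accuracy-based utility with a probability-based one. The sketch below is not the authors' implementation. It computes exact Shapley values for a tiny made-up dataset under both kinds of utility, so the difference in granularity can be inspected. The data, the kernel-vote model in `predict_proba`, and the choice of "mean predicted probability of the true label" as the probability-based utility are all assumptions made for this example; the calibration functions mentioned in the summary are not modelled.

```python
import itertools
import math

# Hypothetical toy data as (feature, label) pairs; not taken from the paper.
TRAIN = [(-2.0, 0), (-1.0, 0), (0.2, 0), (1.0, 1), (2.0, 1)]
VALID = [(-1.5, 0), (-0.5, 0), (0.5, 1), (1.5, 1)]


def predict_proba(coalition, x):
    """P(y=1 | x) from a smoothed, kernel-weighted vote of the coalition's points.

    This stand-in model is an assumption; any probabilistic classifier could be used.
    """
    total = 2.0     # one pseudo-vote per class, so the empty coalition yields 0.5
    positive = 1.0
    for i in coalition:
        xi, yi = TRAIN[i]
        weight = math.exp(-((x - xi) ** 2))
        total += weight
        positive += weight * yi
    return positive / total


def accuracy_utility(coalition):
    """Fraction of validation points whose thresholded prediction is correct."""
    correct = sum((predict_proba(coalition, x) > 0.5) == (y == 1) for x, y in VALID)
    return correct / len(VALID)


def probability_utility(coalition):
    """Mean predicted probability assigned to the true label (assumed utility)."""
    score = 0.0
    for x, y in VALID:
        p1 = predict_proba(coalition, x)
        score += p1 if y == 1 else 1.0 - p1
    return score / len(VALID)


def shapley_values(n, utility):
    """Exact Shapley values for players 0..n-1 by enumerating every coalition."""
    values = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            # Standard Shapley weight for a coalition of size k that excludes i.
            weight = math.factorial(k) * math.factorial(n - k - 1) / math.factorial(n)
            for subset in itertools.combinations(others, k):
                values[i] += weight * (utility(subset + (i,)) - utility(subset))
    return values


acc_values = shapley_values(len(TRAIN), accuracy_utility)
prob_values = shapley_values(len(TRAIN), probability_utility)

for idx, (a, p) in enumerate(zip(acc_values, prob_values)):
    print(f"point {idx}: accuracy-based={a:+.4f}  probability-based={p:+.4f}")
```

With only four validation points, the accuracy utility can take only the values 0, 0.25, 0.5, 0.75 and 1, so many marginal contributions are exactly zero. The probability-based utility varies continuously, which is the kind of finer differentiation the summary refers to. Exhaustive enumeration is feasible only for very small datasets.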