Model Reconstruction Using Counterfactual Explanations: A Perspective From Polytope Theory

Counterfactual explanations provide ways of achieving a favorable model outcome with minimum input perturbation. However, counterfactual explanations can also be leveraged to reconstruct the model by strategically training a surrogate model to give similar predictions as the original (target) model....

Bibliographic Details
Main authors:	Dissanayake, Pasan ; Dutta, Sanghamitra
Format:	Article
Language:	eng
Subjects:	Computer Science - Computers and Society ; Computer Science - Cryptography and Security ; Computer Science - Information Theory ; Computer Science - Learning ; Mathematics - Information Theory ; Statistics - Machine Learning
Online access:	Order full text

container_end_page
container_issue
container_start_page
container_title
container_volume
creator	Dissanayake, Pasan ; Dutta, Sanghamitra
description	Counterfactual explanations provide ways of achieving a favorable model outcome with minimum input perturbation. However, counterfactual explanations can also be leveraged to reconstruct the model by strategically training a surrogate model to give similar predictions as the original (target) model. In this work, we analyze how model reconstruction using counterfactuals can be improved by further leveraging the fact that the counterfactuals also lie quite close to the decision boundary. Our main contribution is to derive novel theoretical relationships between the error in model reconstruction and the number of counterfactual queries required using polytope theory. Our theoretical analysis leads us to propose a strategy for model reconstruction that we call Counterfactual Clamping Attack (CCA) which trains a surrogate model using a unique loss function that treats counterfactuals differently than ordinary instances. Our approach also alleviates the related problem of decision boundary shift that arises in existing model reconstruction approaches when counterfactuals are treated as ordinary instances. Experimental results demonstrate that our strategy improves fidelity between the target and surrogate model predictions on several datasets.
doi_str_mv	10.48550/arxiv.2405.05369
format	Article
fullrecord	<record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2405_05369</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2405_05369</sourcerecordid><originalsourceid>FETCH-arxiv_primary_2405_053693</originalsourceid><addsrcrecordid>eNqFzrEOgjAUheEuDkZ9ACfvC4hVqFE3QyAuJsTg4kIavGiT0jZtIfD2CnF3Ost_ko-Q5ZYG0YExuuG2E22wiygLKAv3xyl5XPUTJdyw1Mp525ReaAV3J9QLYt0oj7bipW-4hKQzkis-BO4EZ8jQOoPfQ4uQWl1DpmXvtUHI36htPyeTikuHi9_OyCpN8viyHhWFsaLmti8GTTFqwv_FBxvkQRM</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Model Reconstruction Using Counterfactual Explanations: A Perspective From Polytope Theory</title><source>arXiv.org</source><creator>Dissanayake, Pasan ; Dutta, Sanghamitra</creator><creatorcontrib>Dissanayake, Pasan ; Dutta, Sanghamitra</creatorcontrib><description>Counterfactual explanations provide ways of achieving a favorable model outcome with minimum input perturbation. However, counterfactual explanations can also be leveraged to reconstruct the model by strategically training a surrogate model to give similar predictions as the original (target) model. In this work, we analyze how model reconstruction using counterfactuals can be improved by further leveraging the fact that the counterfactuals also lie quite close to the decision boundary. Our main contribution is to derive novel theoretical relationships between the error in model reconstruction and the number of counterfactual queries required using polytope theory. Our theoretical analysis leads us to propose a strategy for model reconstruction that we call Counterfactual Clamping Attack (CCA) which trains a surrogate model using a unique loss function that treats counterfactuals differently than ordinary instances. 
Our approach also alleviates the related problem of decision boundary shift that arises in existing model reconstruction approaches when counterfactuals are treated as ordinary instances. Experimental results demonstrate that our strategy improves fidelity between the target and surrogate model predictions on several datasets.</description><identifier>DOI: 10.48550/arxiv.2405.05369</identifier><language>eng</language><subject>Computer Science - Computers and Society ; Computer Science - Cryptography and Security ; Computer Science - Information Theory ; Computer Science - Learning ; Mathematics - Information Theory ; Statistics - Machine Learning</subject><creationdate>2024-05</creationdate><rights>http://creativecommons.org/licenses/by/4.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,776,881</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2405.05369$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2405.05369$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Dissanayake, Pasan</creatorcontrib><creatorcontrib>Dutta, Sanghamitra</creatorcontrib><title>Model Reconstruction Using Counterfactual Explanations: A Perspective From Polytope Theory</title><description>Counterfactual explanations provide ways of achieving a favorable model outcome with minimum input perturbation. However, counterfactual explanations can also be leveraged to reconstruct the model by strategically training a surrogate model to give similar predictions as the original (target) model. 
In this work, we analyze how model reconstruction using counterfactuals can be improved by further leveraging the fact that the counterfactuals also lie quite close to the decision boundary. Our main contribution is to derive novel theoretical relationships between the error in model reconstruction and the number of counterfactual queries required using polytope theory. Our theoretical analysis leads us to propose a strategy for model reconstruction that we call Counterfactual Clamping Attack (CCA) which trains a surrogate model using a unique loss function that treats counterfactuals differently than ordinary instances. Our approach also alleviates the related problem of decision boundary shift that arises in existing model reconstruction approaches when counterfactuals are treated as ordinary instances. Experimental results demonstrate that our strategy improves fidelity between the target and surrogate model predictions on several datasets.</description><subject>Computer Science - Computers and Society</subject><subject>Computer Science - Cryptography and Security</subject><subject>Computer Science - Information Theory</subject><subject>Computer Science - Learning</subject><subject>Mathematics - Information Theory</subject><subject>Statistics - Machine Learning</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNqFzrEOgjAUheEuDkZ9ACfvC4hVqFE3QyAuJsTg4kIavGiT0jZtIfD2CnF3Ost_ko-Q5ZYG0YExuuG2E22wiygLKAv3xyl5XPUTJdyw1Mp525ReaAV3J9QLYt0oj7bipW-4hKQzkis-BO4EZ8jQOoPfQ4uQWl1DpmXvtUHI36htPyeTikuHi9_OyCpN8viyHhWFsaLmti8GTTFqwv_FBxvkQRM</recordid><startdate>20240508</startdate><enddate>20240508</enddate><creator>Dissanayake, Pasan</creator><creator>Dutta, Sanghamitra</creator><scope>AKY</scope><scope>AKZ</scope><scope>EPD</scope><scope>GOX</scope></search><sort><creationdate>20240508</creationdate><title>Model Reconstruction Using Counterfactual Explanations: A 
Perspective From Polytope Theory</title><author>Dissanayake, Pasan ; Dutta, Sanghamitra</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-arxiv_primary_2405_053693</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Computer Science - Computers and Society</topic><topic>Computer Science - Cryptography and Security</topic><topic>Computer Science - Information Theory</topic><topic>Computer Science - Learning</topic><topic>Mathematics - Information Theory</topic><topic>Statistics - Machine Learning</topic><toplevel>online_resources</toplevel><creatorcontrib>Dissanayake, Pasan</creatorcontrib><creatorcontrib>Dutta, Sanghamitra</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv Mathematics</collection><collection>arXiv Statistics</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Dissanayake, Pasan</au><au>Dutta, Sanghamitra</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Model Reconstruction Using Counterfactual Explanations: A Perspective From Polytope Theory</atitle><date>2024-05-08</date><risdate>2024</risdate><abstract>Counterfactual explanations provide ways of achieving a favorable model outcome with minimum input perturbation. However, counterfactual explanations can also be leveraged to reconstruct the model by strategically training a surrogate model to give similar predictions as the original (target) model. In this work, we analyze how model reconstruction using counterfactuals can be improved by further leveraging the fact that the counterfactuals also lie quite close to the decision boundary. 
Our main contribution is to derive novel theoretical relationships between the error in model reconstruction and the number of counterfactual queries required using polytope theory. Our theoretical analysis leads us to propose a strategy for model reconstruction that we call Counterfactual Clamping Attack (CCA) which trains a surrogate model using a unique loss function that treats counterfactuals differently than ordinary instances. Our approach also alleviates the related problem of decision boundary shift that arises in existing model reconstruction approaches when counterfactuals are treated as ordinary instances. Experimental results demonstrate that our strategy improves fidelity between the target and surrogate model predictions on several datasets.</abstract><doi>10.48550/arxiv.2405.05369</doi><oa>free_for_read</oa></addata></record>
fulltext	fulltext_linktorsrc
identifier	DOI: 10.48550/arxiv.2405.05369
ispartof
issn
language	eng
recordid	cdi_arxiv_primary_2405_05369
source	arXiv.org
subjects	Computer Science - Computers and Society ; Computer Science - Cryptography and Security ; Computer Science - Information Theory ; Computer Science - Learning ; Mathematics - Information Theory ; Statistics - Machine Learning
title	Model Reconstruction Using Counterfactual Explanations: A Perspective From Polytope Theory
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-05T15%3A37%3A10IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Model%20Reconstruction%20Using%20Counterfactual%20Explanations:%20A%20Perspective%20From%20Polytope%20Theory&rft.au=Dissanayake,%20Pasan&rft.date=2024-05-08&rft_id=info:doi/10.48550/arxiv.2405.05369&rft_dat=%3Carxiv_GOX%3E2405_05369%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true