CEnt: An Entropy-based Model-agnostic Explainability Framework to Contrast Classifiers' Decisions

Current interpretability methods focus on explaining a particular model's decision through present input features. Such methods do not inform the user of the sufficient conditions that alter these decisions when they are not desirable. Contrastive explanations circumvent this problem by providi...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:arXiv.org 2023-01
Hauptverfasser: Julia El Zini, Mansour, Mohammad, Awad, Mariette
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
container_end_page
container_issue
container_start_page
container_title arXiv.org
container_volume
creator Julia El Zini
Mansour, Mohammad
Awad, Mariette
description Current interpretability methods focus on explaining a particular model's decision through present input features. Such methods do not inform the user of the sufficient conditions that alter these decisions when they are not desirable. Contrastive explanations circumvent this problem by providing explanations of the form "If the feature \(X>x\), the output \(Y\) would be different''. While different approaches are developed to find contrasts; these methods do not all deal with mutability and attainability constraints. In this work, we present a novel approach to locally contrast the prediction of any classifier. Our Contrastive Entropy-based explanation method, CEnt, approximates a model locally by a decision tree to compute entropy information of different feature splits. A graph, G, is then built where contrast nodes are found through a one-to-many shortest path search. Contrastive examples are generated from the shortest path to reflect feature splits that alter model decisions while maintaining lower entropy. We perform local sampling on manifold-like distances computed by variational auto-encoders to reflect data density. CEnt is the first non-gradient-based contrastive method generating diverse counterfactuals that do not necessarily exist in the training data while satisfying immutability (ex. race) and semi-immutability (ex. age can only change in an increasing direction). Empirical evaluation on four real-world numerical datasets demonstrates the ability of CEnt in generating counterfactuals that achieve better proximity rates than existing methods without compromising latency, feasibility, and attainability. We further extend CEnt to imagery data to derive visually appealing and useful contrasts between class labels on MNIST and Fashion MNIST datasets. Finally, we show how CEnt can serve as a tool to detect vulnerabilities of textual classifiers.
format Article
fullrecord <record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_2767353323</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2767353323</sourcerecordid><originalsourceid>FETCH-proquest_journals_27673533233</originalsourceid><addsrcrecordid>eNqNjrEKwjAURYMgKNp_eODgVKiJteImtcXFzV1e21SiMa_mRbR_bwc_wOkM91w4IzGVSq3i7VrKiYiYb0mSyE0m01RNBeaFCzvYOxjoqevjClk3cKJG2xivjjiYGopPZ9E4rIw1oYfS40O_yd8hEOQ0PJED5BaZTWu05yUcdG3YkOO5GLdoWUc_zsSiLM75Me48PV-aw-VGL--G6SKzTaZSpYbe_6wvAudFHA</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2767353323</pqid></control><display><type>article</type><title>CEnt: An Entropy-based Model-agnostic Explainability Framework to Contrast Classifiers' Decisions</title><source>Free E- Journals</source><creator>Julia El Zini ; Mansour, Mohammad ; Awad, Mariette</creator><creatorcontrib>Julia El Zini ; Mansour, Mohammad ; Awad, Mariette</creatorcontrib><description>Current interpretability methods focus on explaining a particular model's decision through present input features. Such methods do not inform the user of the sufficient conditions that alter these decisions when they are not desirable. Contrastive explanations circumvent this problem by providing explanations of the form "If the feature \(X&gt;x\), the output \(Y\) would be different''. While different approaches are developed to find contrasts; these methods do not all deal with mutability and attainability constraints. In this work, we present a novel approach to locally contrast the prediction of any classifier. Our Contrastive Entropy-based explanation method, CEnt, approximates a model locally by a decision tree to compute entropy information of different feature splits. A graph, G, is then built where contrast nodes are found through a one-to-many shortest path search. Contrastive examples are generated from the shortest path to reflect feature splits that alter model decisions while maintaining lower entropy. We perform local sampling on manifold-like distances computed by variational auto-encoders to reflect data density. CEnt is the first non-gradient-based contrastive method generating diverse counterfactuals that do not necessarily exist in the training data while satisfying immutability (ex. race) and semi-immutability (ex. age can only change in an increasing direction). Empirical evaluation on four real-world numerical datasets demonstrates the ability of CEnt in generating counterfactuals that achieve better proximity rates than existing methods without compromising latency, feasibility, and attainability. We further extend CEnt to imagery data to derive visually appealing and useful contrasts between class labels on MNIST and Fashion MNIST datasets. Finally, we show how CEnt can serve as a tool to detect vulnerabilities of textual classifiers.</description><identifier>EISSN: 2331-8422</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Classifiers ; Coders ; Datasets ; Decision trees ; Entropy ; Shortest-path problems</subject><ispartof>arXiv.org, 2023-01</ispartof><rights>2023. This work is published under http://creativecommons.org/licenses/by-nc-sa/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>780,784</link.rule.ids></links><search><creatorcontrib>Julia El Zini</creatorcontrib><creatorcontrib>Mansour, Mohammad</creatorcontrib><creatorcontrib>Awad, Mariette</creatorcontrib><title>CEnt: An Entropy-based Model-agnostic Explainability Framework to Contrast Classifiers' Decisions</title><title>arXiv.org</title><description>Current interpretability methods focus on explaining a particular model's decision through present input features. Such methods do not inform the user of the sufficient conditions that alter these decisions when they are not desirable. Contrastive explanations circumvent this problem by providing explanations of the form "If the feature \(X&gt;x\), the output \(Y\) would be different''. While different approaches are developed to find contrasts; these methods do not all deal with mutability and attainability constraints. In this work, we present a novel approach to locally contrast the prediction of any classifier. Our Contrastive Entropy-based explanation method, CEnt, approximates a model locally by a decision tree to compute entropy information of different feature splits. A graph, G, is then built where contrast nodes are found through a one-to-many shortest path search. Contrastive examples are generated from the shortest path to reflect feature splits that alter model decisions while maintaining lower entropy. We perform local sampling on manifold-like distances computed by variational auto-encoders to reflect data density. CEnt is the first non-gradient-based contrastive method generating diverse counterfactuals that do not necessarily exist in the training data while satisfying immutability (ex. race) and semi-immutability (ex. age can only change in an increasing direction). Empirical evaluation on four real-world numerical datasets demonstrates the ability of CEnt in generating counterfactuals that achieve better proximity rates than existing methods without compromising latency, feasibility, and attainability. We further extend CEnt to imagery data to derive visually appealing and useful contrasts between class labels on MNIST and Fashion MNIST datasets. Finally, we show how CEnt can serve as a tool to detect vulnerabilities of textual classifiers.</description><subject>Classifiers</subject><subject>Coders</subject><subject>Datasets</subject><subject>Decision trees</subject><subject>Entropy</subject><subject>Shortest-path problems</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><recordid>eNqNjrEKwjAURYMgKNp_eODgVKiJteImtcXFzV1e21SiMa_mRbR_bwc_wOkM91w4IzGVSq3i7VrKiYiYb0mSyE0m01RNBeaFCzvYOxjoqevjClk3cKJG2xivjjiYGopPZ9E4rIw1oYfS40O_yd8hEOQ0PJED5BaZTWu05yUcdG3YkOO5GLdoWUc_zsSiLM75Me48PV-aw-VGL--G6SKzTaZSpYbe_6wvAudFHA</recordid><startdate>20230119</startdate><enddate>20230119</enddate><creator>Julia El Zini</creator><creator>Mansour, Mohammad</creator><creator>Awad, Mariette</creator><general>Cornell University Library, arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope></search><sort><creationdate>20230119</creationdate><title>CEnt: An Entropy-based Model-agnostic Explainability Framework to Contrast Classifiers' Decisions</title><author>Julia El Zini ; Mansour, Mohammad ; Awad, Mariette</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-proquest_journals_27673533233</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Classifiers</topic><topic>Coders</topic><topic>Datasets</topic><topic>Decision trees</topic><topic>Entropy</topic><topic>Shortest-path problems</topic><toplevel>online_resources</toplevel><creatorcontrib>Julia El Zini</creatorcontrib><creatorcontrib>Mansour, Mohammad</creatorcontrib><creatorcontrib>Awad, Mariette</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science &amp; Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Julia El Zini</au><au>Mansour, Mohammad</au><au>Awad, Mariette</au><format>book</format><genre>document</genre><ristype>GEN</ristype><atitle>CEnt: An Entropy-based Model-agnostic Explainability Framework to Contrast Classifiers' Decisions</atitle><jtitle>arXiv.org</jtitle><date>2023-01-19</date><risdate>2023</risdate><eissn>2331-8422</eissn><abstract>Current interpretability methods focus on explaining a particular model's decision through present input features. Such methods do not inform the user of the sufficient conditions that alter these decisions when they are not desirable. Contrastive explanations circumvent this problem by providing explanations of the form "If the feature \(X&gt;x\), the output \(Y\) would be different''. While different approaches are developed to find contrasts; these methods do not all deal with mutability and attainability constraints. In this work, we present a novel approach to locally contrast the prediction of any classifier. Our Contrastive Entropy-based explanation method, CEnt, approximates a model locally by a decision tree to compute entropy information of different feature splits. A graph, G, is then built where contrast nodes are found through a one-to-many shortest path search. Contrastive examples are generated from the shortest path to reflect feature splits that alter model decisions while maintaining lower entropy. We perform local sampling on manifold-like distances computed by variational auto-encoders to reflect data density. CEnt is the first non-gradient-based contrastive method generating diverse counterfactuals that do not necessarily exist in the training data while satisfying immutability (ex. race) and semi-immutability (ex. age can only change in an increasing direction). Empirical evaluation on four real-world numerical datasets demonstrates the ability of CEnt in generating counterfactuals that achieve better proximity rates than existing methods without compromising latency, feasibility, and attainability. We further extend CEnt to imagery data to derive visually appealing and useful contrasts between class labels on MNIST and Fashion MNIST datasets. Finally, we show how CEnt can serve as a tool to detect vulnerabilities of textual classifiers.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier EISSN: 2331-8422
ispartof arXiv.org, 2023-01
issn 2331-8422
language eng
recordid cdi_proquest_journals_2767353323
source Free E- Journals
subjects Classifiers
Coders
Datasets
Decision trees
Entropy
Shortest-path problems
title CEnt: An Entropy-based Model-agnostic Explainability Framework to Contrast Classifiers' Decisions
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-12T08%3A56%3A40IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=CEnt:%20An%20Entropy-based%20Model-agnostic%20Explainability%20Framework%20to%20Contrast%20Classifiers'%20Decisions&rft.jtitle=arXiv.org&rft.au=Julia%20El%20Zini&rft.date=2023-01-19&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2767353323%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2767353323&rft_id=info:pmid/&rfr_iscdi=true