CEnt: An Entropy-based Model-agnostic Explainability Framework to Contrast Classifiers' Decisions
Current interpretability methods focus on explaining a particular model's decision through the input features that are present. Such methods do not inform the user of the sufficient conditions that would alter these decisions when they are undesirable. Contrastive explanations circumvent this problem by providing explanations of the form "If the feature \(X>x\), the output \(Y\) would be different." While different approaches have been developed to find contrasts, these methods do not all deal with mutability and attainability constraints. In this work, we present a novel approach to locally contrast the prediction of any classifier. Our Contrastive Entropy-based explanation method, CEnt, approximates a model locally by a decision tree to compute entropy information of different feature splits. A graph, G, is then built in which contrast nodes are found through a one-to-many shortest-path search. Contrastive examples are generated from the shortest path to reflect feature splits that alter model decisions while maintaining lower entropy. We perform local sampling on manifold-like distances computed by variational auto-encoders to reflect data density. CEnt is the first non-gradient-based contrastive method that generates diverse counterfactuals which do not necessarily exist in the training data, while satisfying immutability (e.g., race) and semi-immutability (e.g., age can only change in an increasing direction). Empirical evaluation on four real-world numerical datasets demonstrates the ability of CEnt to generate counterfactuals that achieve better proximity rates than existing methods without compromising latency, feasibility, or attainability. We further extend CEnt to imagery data to derive visually appealing and useful contrasts between class labels on the MNIST and Fashion-MNIST datasets. Finally, we show how CEnt can serve as a tool to detect vulnerabilities of textual classifiers.
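The pipeline described in the abstract (fit a local decision-tree surrogate with entropy-based splits, then run a one-to-many shortest-path search over a graph built from the tree) can be illustrated as follows. This is a minimal sketch, not the authors' implementation: it assumes scikit-learn's DecisionTreeClassifier as the surrogate and networkx for the path search, `black_box` stands for any fitted classifier, and the paper's mutability/attainability constraints and VAE-based distances are omitted.

```python
# Minimal sketch of a CEnt-style contrast search (illustrative, not the
# authors' code). `black_box` is any fitted classifier with .predict();
# mutability constraints and the VAE-based distance are omitted.
import numpy as np
import networkx as nx
from sklearn.tree import DecisionTreeClassifier

def cent_contrast(black_box, x, X_train, n_samples=2000, seed=0):
    rng = np.random.default_rng(seed)
    # 1) Label a sampled neighborhood of x with the black box and fit a
    #    local surrogate; criterion="entropy" favours low-entropy splits.
    Z = x + rng.normal(0.0, 0.3 * X_train.std(axis=0), (n_samples, x.size))
    tree = DecisionTreeClassifier(criterion="entropy", max_depth=5)
    tree.fit(Z, black_box.predict(Z))
    t = tree.tree_

    # 2) Build a graph over the tree's nodes: a path between two leaves
    #    is exactly the chain of feature splits separating them.
    G = nx.Graph()
    for n in range(t.node_count):
        for c in (t.children_left[n], t.children_right[n]):
            if c != -1:
                G.add_edge(n, c)

    # 3) One-to-many shortest-path search from the query's leaf to every
    #    leaf whose majority class differs from the current prediction.
    source = tree.apply(x.reshape(1, -1))[0]
    pred = tree.predict(x.reshape(1, -1))[0]
    dist = nx.single_source_shortest_path_length(G, source)
    targets = [n for n in range(t.node_count)
               if t.children_left[n] == -1
               and tree.classes_[np.argmax(t.value[n])] != pred]
    if not targets:
        return None
    best = min(targets, key=dist.get)

    # 4) Every downward step on the path crosses one split; nudge x just
    #    across each threshold to obtain the contrastive example.
    x_c = x.astype(float).copy()
    path = nx.shortest_path(G, source, best)
    for a, b in zip(path, path[1:]):
        if b == t.children_left[a]:      # need x[feature] <= threshold
            x_c[t.feature[a]] = t.threshold[a] - 1e-3
        elif b == t.children_right[a]:   # need x[feature] >  threshold
            x_c[t.feature[a]] = t.threshold[a] + 1e-3
    return x_c
```

A faithful implementation would additionally weight graph edges by attainability, enforce immutable and semi-immutable features, and verify the returned candidate against the black-box model itself.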
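The abstract's "local sampling on manifold-like distances" can be sketched in the same spirit. The paper computes these distances with variational auto-encoders; purely to keep the example self-contained, a PCA encoder stands in below (an assumption, not the paper's method), so neighbors are selected by latent-space distance and the sampled neighborhood reflects data density.

```python
# Sketch of density-aware neighborhood sampling. The paper uses a VAE
# encoder; PCA stands in here only to keep the example self-contained.
import numpy as np
from sklearn.decomposition import PCA

def manifold_neighbors(x, X_train, k=500, seed=0):
    rng = np.random.default_rng(seed)
    enc = PCA(n_components=min(8, X_train.shape[1])).fit(X_train)
    # Latent-space distance approximates distance along the data manifold
    # better than raw Euclidean distance in the input space.
    z = enc.transform(X_train) - enc.transform(x.reshape(1, -1))
    idx = np.argsort(np.linalg.norm(z, axis=1))[:k]
    # Jitter the nearest on-manifold training points instead of sampling
    # an isotropic ball around x, so the neighborhood respects density.
    noise = 0.05 * X_train.std(axis=0) * rng.standard_normal(X_train[idx].shape)
    return X_train[idx] + noise
```

Swapping `enc` for a trained VAE encoder recovers the density-aware sampling the paper describes.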
Published in: | arXiv.org, 2023-01 |
---|---|
Main authors: | Julia El Zini, Mohammad Mansour, Mariette Awad |
Format: | Article |
Language: | eng |
Subjects: | Classifiers, Coders, Datasets, Decision trees, Entropy, Shortest-path problems |
Online access: | Full text |
container_title | arXiv.org |
creator | Julia El Zini; Mansour, Mohammad; Awad, Mariette |
format | Article |
fulltext | fulltext |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2023-01 |
issn | 2331-8422 |
language | eng |
recordid | cdi_proquest_journals_2767353323 |
source | Free E-Journals |
subjects | Classifiers; Coders; Datasets; Decision trees; Entropy; Shortest-path problems |
title | CEnt: An Entropy-based Model-agnostic Explainability Framework to Contrast Classifiers' Decisions |