A Statistical Test for Probabilistic Fairness

Algorithms are now routinely used to make consequential decisions that affect human lives. Examples include college admissions, medical interventions, and law enforcement. While algorithms empower us to harness all the information hidden in vast amounts of data, they may inadvertently amplify existing biases in the available datasets. This concern has sparked increasing interest in fair machine learning, which aims to quantify and mitigate algorithmic discrimination. Indeed, machine learning models should undergo intensive tests to detect algorithmic biases before being deployed at scale. In this paper, we use ideas from the theory of optimal transport to propose a statistical hypothesis test for detecting unfair classifiers. Leveraging the geometry of the feature space, the test statistic quantifies the distance of the empirical distribution supported on the test samples to the manifold of distributions that render a pre-trained classifier fair. We develop a rigorous hypothesis testing mechanism for assessing the probabilistic fairness of any pre-trained logistic classifier, and we show both theoretically and empirically that the proposed test is asymptotically correct. In addition, the proposed framework offers interpretability by identifying the most favorable perturbation of the data so that the given classifier becomes fair.
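
The abstract describes the test only at a high level. In the notation it suggests (these symbols do not appear in the record itself and are only a paraphrase), the test statistic has the shape

$$ \hat{T}_N = \min_{Q \in \mathcal{F}} W\bigl(\hat{P}_N, Q\bigr), $$

where $\hat{P}_N$ is the empirical distribution supported on the $N$ test samples, $\mathcal{F}$ is the manifold of distributions under which the pre-trained classifier is fair, and $W$ is an optimal-transport (Wasserstein-type) distance. As a rough companion illustration, and emphatically not the paper's statistic, the Python sketch below estimates the simplest probabilistic-fairness quantity for a pre-trained logistic classifier, the gap in mean predicted probability between two protected groups, and bootstraps a confidence interval for it; the classifier weights and the synthetic data are hypothetical.

```python
# Illustrative sketch only: NOT the paper's optimal-transport test.
# It computes the gap in mean predicted probability between two
# protected groups under a pre-trained logistic classifier and
# bootstraps a 95% confidence interval for that gap. The weights
# (theta, b) and the synthetic data are hypothetical.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical pre-trained logistic classifier h(x) = sigmoid(theta @ x + b).
theta = np.array([1.5, -0.7])
b = -0.2

# Synthetic test sample: features X and a binary protected attribute A.
n = 2000
A = rng.integers(0, 2, size=n)
X = rng.normal(loc=A[:, None] * 0.3, scale=1.0, size=(n, 2))

probs = sigmoid(X @ theta + b)

def parity_gap(p, a):
    """Difference in mean predicted probability between the two groups."""
    return p[a == 1].mean() - p[a == 0].mean()

observed = parity_gap(probs, A)

# Bootstrap the sampling distribution of the gap by resampling test points.
boot = np.array([
    parity_gap(probs[idx], A[idx])
    for idx in (rng.integers(0, n, size=n) for _ in range(2000))
])
ci_lo, ci_hi = np.percentile(boot, [2.5, 97.5])
print(f"parity gap = {observed:.3f}, 95% bootstrap CI = [{ci_lo:.3f}, {ci_hi:.3f}]")
```

A gap whose interval excludes zero flags the kind of disparity the paper's optimal-transport test is built to detect with rigorous asymptotic guarantees; the sketch carries no such guarantees and serves only to make the tested quantity concrete.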

Bibliographic details
Main authors: Taskesen, Bahar; Blanchet, Jose; Kuhn, Daniel; Nguyen, Viet Anh
Format: Article
Language: English
Subjects: Computer Science - Computers and Society; Computer Science - Learning; Statistics - Machine Learning
Online access: https://arxiv.org/abs/2012.04800
DOI: 10.48550/arxiv.2012.04800
Published: 2020-12-08
Source: arXiv.org