Identifying and mitigating bias in algorithms used to manage patients in a pandemic
Numerous COVID-19 clinical decision support systems have been developed. However, many of these systems lack validity due to methodological shortcomings, including algorithmic bias. Methods: Logistic regression models were created to predict COVID-19 mortality, ventilator status, and inpatient status...
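The abstract describes the method only at a high level: standard logistic regression classifiers are trained, and decision thresholds are then adjusted per demographic group so that sensitivity is equalized across groups. A minimal sketch of that thresholding idea is given below, assuming scikit-learn, synthetic stand-in data, and an arbitrary target sensitivity of 0.9; the `threshold_for_sensitivity` helper and the group labels are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch (not the authors' code): train a logistic regression
# classifier, then pick a separate decision threshold for each demographic
# group on the training data so that every group reaches the same target
# sensitivity, and apply those group-specific thresholds at prediction time.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in data: features, a binary outcome, and a group label
# (e.g., an age bracket). Real inputs would come from the hospital dataset.
n = 2000
X = rng.normal(size=(n, 5))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n) > 0).astype(int)
group = rng.integers(0, 2, size=n)  # two hypothetical demographic groups

X_tr, X_te, y_tr, y_te, g_tr, g_te = train_test_split(
    X, y, group, test_size=0.3, random_state=0
)

model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

def threshold_for_sensitivity(scores, labels, target=0.9):
    """Smallest score threshold whose sensitivity on `labels` is >= target."""
    pos = np.sort(scores[labels == 1])            # scores of true positives, ascending
    k = int(np.floor((1.0 - target) * pos.size))  # positives allowed below threshold
    return pos[k] if pos.size else 0.5

# Choose one threshold per group on the training data.
train_scores = model.predict_proba(X_tr)[:, 1]
thresholds = {
    g: threshold_for_sensitivity(train_scores[g_tr == g], y_tr[g_tr == g])
    for g in np.unique(g_tr)
}

# Apply the group-specific thresholds to held-out patients.
test_scores = model.predict_proba(X_te)[:, 1]
preds = np.array([test_scores[i] >= thresholds[g_te[i]]
                  for i in range(len(test_scores))])

for g in np.unique(g_te):
    pos_mask = (g_te == g) & (y_te == 1)
    print(f"group {g}: threshold={thresholds[g]:.3f}, "
          f"test sensitivity={preds[pos_mask].mean():.3f}")
```

Because each group gets its own operating point, a group whose predicted risks run systematically low is not forced below a single global cutoff; the AUC is unaffected because the underlying risk scores themselves are unchanged, consistent with the results reported in the abstract.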
Saved in:
Published in: | arXiv.org 2021-10 |
---|---|
Main authors: | Li, Yifan; Yoon, Garrett; Nasir-Moin, Mustafa; Rosenberg, David; Neifert, Sean; Kondziolka, Douglas; Oermann, Eric Karl |
Format: | Article |
Language: | eng |
Subjects: | Algorithms; Bias; Calibration; Coronaviruses; COVID-19; Decision support systems; Machine learning; Performance prediction; Prediction models; Regression models; Training |
Online access: | Full text |
container_title | arXiv.org |
---|---|
creator | Li, Yifan; Yoon, Garrett; Nasir-Moin, Mustafa; Rosenberg, David; Neifert, Sean; Kondziolka, Douglas; Oermann, Eric Karl |
description | Numerous COVID-19 clinical decision support systems have been developed. However, many of these systems lack validity due to methodological shortcomings, including algorithmic bias. Methods: Logistic regression models were created to predict COVID-19 mortality, ventilator status, and inpatient status using a real-world dataset from four hospitals in New York City, and were analyzed for biases with respect to race, gender, and age. Simple thresholding adjustments were applied during the training process to establish more equitable models. Results: Compared with the naively trained models, the calibrated models showed a 57% decrease in the number of biased trials, while predictive performance, measured by the area under the receiver operating characteristic curve (AUC), remained unchanged. After calibration, the average sensitivity of the predictive models increased from 0.527 to 0.955. Conclusion: We demonstrate that naively training and deploying machine learning models on real-world data for predictive analytics of COVID-19 carries a high risk of bias. Simple adjustments or calibrations implemented during model training can lead to substantial and sustained gains in fairness on subsequent deployment. |
format | Article |
publisher | Ithaca: Cornell University Library, arXiv.org |
rights | 2021. This work is published under http://creativecommons.org/licenses/by/4.0/ (the "License"). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2021-10 |
issn | 2331-8422 |
language | eng |
source | Free E-Journals |
subjects | Algorithms; Bias; Calibration; Coronaviruses; COVID-19; Decision support systems; Machine learning; Performance prediction; Prediction models; Regression models; Training |
title | Identifying and mitigating bias in algorithms used to manage patients in a pandemic |