Identifying and mitigating bias in algorithms used to manage patients in a pandemic

Numerous COVID-19 clinical decision support systems have been developed. However, many of these systems lack validity due to methodological shortcomings, including algorithmic bias. Methods: Logistic regression models were created to predict COVID-19 mortality, ventilator status, and inpatient status using a real-world dataset drawn from four hospitals in New York City, and were analyzed for bias with respect to race, gender, and age. Simple thresholding adjustments were applied during training to produce more equitable models. Results: Compared to the naively trained models, the calibrated models showed a 57% decrease in the number of biased trials, while predictive performance, measured by the area under the receiver operating characteristic curve (AUC), remained unchanged. After calibration, the average sensitivity of the predictive models increased from 0.527 to 0.955. Conclusion: We demonstrate that naively training and deploying machine learning models on real-world data for predictive analytics of COVID-19 carries a high risk of bias. Simply implemented adjustments or calibrations during model training can lead to substantial and sustained gains in fairness on subsequent deployment.
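
The abstract does not spell out how the thresholding adjustments were made, so the block below is only a minimal sketch of one common reading of "simple thresholding adjustments": fit a logistic regression, then pick a separate decision threshold for each demographic group so that every group reaches a target sensitivity. The synthetic data, the two-level group variable g, the 0.9 sensitivity target, the helper threshold_for_sensitivity, and the scikit-learn/NumPy usage are all illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch, not the paper's released code: group-specific decision thresholds
# chosen so each group reaches a target sensitivity (true positive rate).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for a clinical dataset: features X, binary outcome y
# (e.g. mortality), and a protected attribute g (e.g. an age bracket).
n = 5000
X = rng.normal(size=(n, 8))
g = rng.integers(0, 2, size=n)                # two illustrative groups
logits = X @ rng.normal(size=8) + 0.5 * g     # group-dependent base rate
y = (rng.random(n) < 1 / (1 + np.exp(-logits))).astype(int)

X_tr, X_te, y_tr, y_te, g_tr, g_te = train_test_split(
    X, y, g, test_size=0.3, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

def threshold_for_sensitivity(scores, labels, target=0.9):
    """Smallest score threshold whose sensitivity on (scores, labels) meets `target`."""
    positives = np.sort(scores[labels == 1])
    # Keep at least `target` of the true positives above the threshold.
    k = int(np.floor((1 - target) * len(positives)))
    return positives[k] if len(positives) else 0.5

# Calibrate one threshold per group on the training split.
thresholds = {
    grp: threshold_for_sensitivity(model.predict_proba(X_tr[g_tr == grp])[:, 1],
                                   y_tr[g_tr == grp])
    for grp in np.unique(g_tr)
}

# Apply the group-specific thresholds on held-out data and report per-group sensitivity.
for grp, thr in thresholds.items():
    mask = g_te == grp
    preds = model.predict_proba(X_te[mask])[:, 1] >= thr
    tp = np.sum(preds & (y_te[mask] == 1))
    sens = tp / max(np.sum(y_te[mask] == 1), 1)
    print(f"group {grp}: threshold={thr:.3f}, sensitivity={sens:.3f}")
```

Because AUC is computed from the risk scores themselves rather than from any particular cutoff, adjusting thresholds in this way leaves AUC untouched, which is consistent with the abstract's report that discrimination was preserved while per-group sensitivity improved.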

Bibliographic Details
Published in: arXiv.org, 2021-10
Main Authors: Li, Yifan; Yoon, Garrett; Nasir-Moin, Mustafa; Rosenberg, David; Neifert, Sean; Kondziolka, Douglas; Oermann, Eric Karl
Format: Article
Language: English
Subjects: Algorithms; Bias; Calibration; Coronaviruses; COVID-19; Decision support systems; Machine learning; Performance prediction; Prediction models; Regression models; Training
Online Access: Full text
EISSN: 2331-8422
Source: Free E-Journals
URL: https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-29T15%3A49%3A20IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Identifying%20and%20mitigating%20bias%20in%20algorithms%20used%20to%20manage%20patients%20in%20a%20pandemic&rft.jtitle=arXiv.org&rft.au=Li,%20Yifan&rft.date=2021-10-30&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2591831747%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2591831747&rft_id=info:pmid/&rfr_iscdi=true