A comparative analysis of machine learning approaches to predict C. difficile infection in hospitalized patients

• Clostridioides difficile is a leading cause of infectious diarrhea in hospitalized patients.
• Machine learning algorithms can predict Clostridioides difficile with excellent discrimination.
• XGBoost maintained predictive performance across a hold-out test set and an external dataset.
Interventions to...

Full description

Bibliographic details
Published in: American journal of infection control 2022-03, Vol.50 (3), p.250-257
Main authors: Panchavati, Saarang; Zelin, Nicole S.; Garikipati, Anurag; Pellegrini, Emily; Iqbal, Zohora; Barnes, Gina; Hoffman, Jana; Calvert, Jacob; Mao, Qingqing; Das, Ritankar
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 257
container_issue 3
container_start_page 250
container_title American journal of infection control
container_volume 50
creator Panchavati, Saarang
Zelin, Nicole S.
Garikipati, Anurag
Pellegrini, Emily
Iqbal, Zohora
Barnes, Gina
Hoffman, Jana
Calvert, Jacob
Mao, Qingqing
Das, Ritankar
description • Clostridioides difficile is a leading cause of infectious diarrhea in hospitalized patients.
• Machine learning algorithms can predict Clostridioides difficile with excellent discrimination.
• XGBoost maintained predictive performance across a hold-out test set and an external dataset.
Interventions to better prevent or manage Clostridioides difficile infection (CDI) may significantly reduce morbidity, mortality, and healthcare spending. We present a retrospective study using electronic health record data from over 700 United States hospitals. A subset of hospitals was used to develop machine learning algorithms (MLAs); the remaining hospitals served as an external test set. Three MLAs were evaluated: gradient-boosted decision trees (XGBoost), a Deep Long Short Term Memory neural network, and a one-dimensional convolutional neural network. MLA performance was evaluated with area under the receiver operating characteristic curve (AUROC), sensitivity, specificity, diagnostic odds ratios, and likelihood ratios. The development dataset contained 13,664,840 inpatient encounters with 80,046 CDI encounters; the external dataset contained 1,149,088 inpatient encounters with 7,107 CDI encounters. The highest AUROCs were achieved for XGBoost without specialized training techniques, for the Deep Long Short Term Memory neural network with resampling alone, and for the one-dimensional convolutional neural network with resampling combined with output bias. XGBoost achieved the highest AUROC. MLAs can predict future CDI in hospitalized patients using just 6 hours of data. In clinical practice, a machine learning-based tool may support prophylactic measures, earlier diagnosis, and more timely implementation of infection control measures.
doi_str_mv 10.1016/j.ajic.2021.11.012
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_2622479552</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S0196655321007574</els_id><sourcerecordid>2622479552</sourcerecordid><originalsourceid>FETCH-LOGICAL-c356t-f9c8d282cac7806d5c08cd34c62b2792d6ff201765a79dc6055c517141065c363</originalsourceid><addsrcrecordid>eNp9kEFv1DAQha0KRLeFP8AB-cglwR6vnUTiUq0KVKrEpZwtdzxuvUriYGcrtb8er7Zw5DSj0XtvZj7GPkrRSiHNl33r9hFbECBbKVsh4YxtpIauUTCYN2wj5GAao7U6Zxel7IUQgzL6HTtXWphO9bBhyxXHNC0uuzU-EXezG59LLDwFPjl8jDPxkVye4_zA3bLkVIdU-Jr4kslHXPmu5T6GEDGOxOMcCNeY5trxx1SWuLoxvpDnS11A81res7fBjYU-vNZL9uvb9d3uR3P78_vN7uq2QaXN2oQBew89oMOuF8ZrFD16tUUD99AN4E0IIGRntOsGj0ZojVp2ciuF0aiMumSfT7n15t8HKqudYkEaRzdTOhQLBmDbDVpDlcJJijmVkinYJcfJ5WcrhT2Stnt7JG2PpK2UtpKupk-v-Yf7ifw_y1-0VfD1JKD65VOkbAtWAlip5crI-hT_l_8HQn6P3Q</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2622479552</pqid></control><display><type>article</type><title>A comparative analysis of machine learning approaches to predict C. difficile infection in hospitalized patients</title><source>MEDLINE</source><source>ScienceDirect Journals (5 years ago - present)</source><creator>Panchavati, Saarang ; Zelin, Nicole S. ; Garikipati, Anurag ; Pellegrini, Emily ; Iqbal, Zohora ; Barnes, Gina ; Hoffman, Jana ; Calvert, Jacob ; Mao, Qingqing ; Das, Ritankar</creator><creatorcontrib>Panchavati, Saarang ; Zelin, Nicole S. 
; Garikipati, Anurag ; Pellegrini, Emily ; Iqbal, Zohora ; Barnes, Gina ; Hoffman, Jana ; Calvert, Jacob ; Mao, Qingqing ; Das, Ritankar</creatorcontrib><description>•Clostridioides difficile is a leading cause of infectious diarrhea in hospitalized patients.•Machine learning algorithms can predict Clostridioides difficile with excellent discrimination.•XGBoost maintained predictive performance across a hold-out test set and an external dataset Interventions to better prevent or manage Clostridioides difficile infection (CDI) may significantly reduce morbidity, mortality, and healthcare spending. We present a retrospective study using electronic health record data from over 700 United States hospitals. A subset of hospitals was used to develop machine learning algorithms (MLAs); the remaining hospitals served as an external test set. Three MLAs were evaluated: gradient-boosted decision trees (XGBoost), Deep Long Short Term Memory neural network, and one-dimensional convolutional neural network. MLA performance was evaluated with area under the receiver operating characteristic curve (AUROC), sensitivity, specificity, diagnostic odds ratios and likelihood ratios. The development dataset contained 13,664,840 inpatient encounters with 80,046 CDI encounters; the external dataset contained 1,149,088 inpatient encounters with 7,107 CDI encounters. The highest AUROCs were achieved for XGB, Deep Long Short Term Memory neural network, and one-dimensional convolutional neural network via abstaining from use of specialized training techniques, resampling in isolation, and resampling and output bias in combination, respectively. XGBoost achieved the highest AUROC. MLAs can predict future CDI in hospitalized patients using just 6 hours of data. 
In clinical practice, a machine-learning based tool may support prophylactic measures, earlier diagnosis, and more timely implementation of infection control measures.</description><identifier>ISSN: 0196-6553</identifier><identifier>EISSN: 1527-3296</identifier><identifier>DOI: 10.1016/j.ajic.2021.11.012</identifier><identifier>PMID: 35067382</identifier><language>eng</language><publisher>United States: Elsevier Inc</publisher><subject>Algorithm ; CDI ; Clostridioides difficile ; Clostridium Infections - diagnosis ; Clostridium Infections - epidemiology ; Electronic health record ; Humans ; Machine Learning ; Prediction ; Retrospective Studies ; ROC Curve ; XGBoost</subject><ispartof>American journal of infection control, 2022-03, Vol.50 (3), p.250-257</ispartof><rights>2021 The Authors</rights><rights>Copyright © 2021 The Authors. Published by Elsevier Inc. All rights reserved.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c356t-f9c8d282cac7806d5c08cd34c62b2792d6ff201765a79dc6055c517141065c363</citedby><cites>FETCH-LOGICAL-c356t-f9c8d282cac7806d5c08cd34c62b2792d6ff201765a79dc6055c517141065c363</cites><orcidid>0000-0001-6001-6723 ; 0000-0002-7745-3900 ; 0000-0001-7065-8367 ; 0000-0002-2230-2187</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://dx.doi.org/10.1016/j.ajic.2021.11.012$$EHTML$$P50$$Gelsevier$$Hfree_for_read</linktohtml><link.rule.ids>314,780,784,3548,27922,27923,45993</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/35067382$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Panchavati, Saarang</creatorcontrib><creatorcontrib>Zelin, Nicole S.</creatorcontrib><creatorcontrib>Garikipati, Anurag</creatorcontrib><creatorcontrib>Pellegrini, 
Emily</creatorcontrib><creatorcontrib>Iqbal, Zohora</creatorcontrib><creatorcontrib>Barnes, Gina</creatorcontrib><creatorcontrib>Hoffman, Jana</creatorcontrib><creatorcontrib>Calvert, Jacob</creatorcontrib><creatorcontrib>Mao, Qingqing</creatorcontrib><creatorcontrib>Das, Ritankar</creatorcontrib><title>A comparative analysis of machine learning approaches to predict C. difficile infection in hospitalized patients</title><title>American journal of infection control</title><addtitle>Am J Infect Control</addtitle><description>•Clostridioides difficile is a leading cause of infectious diarrhea in hospitalized patients.•Machine learning algorithms can predict Clostridioides difficile with excellent discrimination.•XGBoost maintained predictive performance across a hold-out test set and an external dataset Interventions to better prevent or manage Clostridioides difficile infection (CDI) may significantly reduce morbidity, mortality, and healthcare spending. We present a retrospective study using electronic health record data from over 700 United States hospitals. A subset of hospitals was used to develop machine learning algorithms (MLAs); the remaining hospitals served as an external test set. Three MLAs were evaluated: gradient-boosted decision trees (XGBoost), Deep Long Short Term Memory neural network, and one-dimensional convolutional neural network. MLA performance was evaluated with area under the receiver operating characteristic curve (AUROC), sensitivity, specificity, diagnostic odds ratios and likelihood ratios. The development dataset contained 13,664,840 inpatient encounters with 80,046 CDI encounters; the external dataset contained 1,149,088 inpatient encounters with 7,107 CDI encounters. 
The highest AUROCs were achieved for XGB, Deep Long Short Term Memory neural network, and one-dimensional convolutional neural network via abstaining from use of specialized training techniques, resampling in isolation, and resampling and output bias in combination, respectively. XGBoost achieved the highest AUROC. MLAs can predict future CDI in hospitalized patients using just 6 hours of data. In clinical practice, a machine-learning based tool may support prophylactic measures, earlier diagnosis, and more timely implementation of infection control measures.</description><subject>Algorithm</subject><subject>CDI</subject><subject>Clostridioides difficile</subject><subject>Clostridium Infections - diagnosis</subject><subject>Clostridium Infections - epidemiology</subject><subject>Electronic health record</subject><subject>Humans</subject><subject>Machine Learning</subject><subject>Prediction</subject><subject>Retrospective Studies</subject><subject>ROC Curve</subject><subject>XGBoost</subject><issn>0196-6553</issn><issn>1527-3296</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNp9kEFv1DAQha0KRLeFP8AB-cglwR6vnUTiUq0KVKrEpZwtdzxuvUriYGcrtb8er7Zw5DSj0XtvZj7GPkrRSiHNl33r9hFbECBbKVsh4YxtpIauUTCYN2wj5GAao7U6Zxel7IUQgzL6HTtXWphO9bBhyxXHNC0uuzU-EXezG59LLDwFPjl8jDPxkVye4_zA3bLkVIdU-Jr4kslHXPmu5T6GEDGOxOMcCNeY5trxx1SWuLoxvpDnS11A81res7fBjYU-vNZL9uvb9d3uR3P78_vN7uq2QaXN2oQBew89oMOuF8ZrFD16tUUD99AN4E0IIGRntOsGj0ZojVp2ciuF0aiMumSfT7n15t8HKqudYkEaRzdTOhQLBmDbDVpDlcJJijmVkinYJcfJ5WcrhT2Stnt7JG2PpK2UtpKupk-v-Yf7ifw_y1-0VfD1JKD65VOkbAtWAlip5crI-hT_l_8HQn6P3Q</recordid><startdate>202203</startdate><enddate>202203</enddate><creator>Panchavati, Saarang</creator><creator>Zelin, Nicole S.</creator><creator>Garikipati, Anurag</creator><creator>Pellegrini, Emily</creator><creator>Iqbal, Zohora</creator><creator>Barnes, Gina</creator><creator>Hoffman, 
Jana</creator><creator>Calvert, Jacob</creator><creator>Mao, Qingqing</creator><creator>Das, Ritankar</creator><general>Elsevier Inc</general><scope>6I.</scope><scope>AAFTH</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><orcidid>https://orcid.org/0000-0001-6001-6723</orcidid><orcidid>https://orcid.org/0000-0002-7745-3900</orcidid><orcidid>https://orcid.org/0000-0001-7065-8367</orcidid><orcidid>https://orcid.org/0000-0002-2230-2187</orcidid></search><sort><creationdate>202203</creationdate><title>A comparative analysis of machine learning approaches to predict C. difficile infection in hospitalized patients</title><author>Panchavati, Saarang ; Zelin, Nicole S. ; Garikipati, Anurag ; Pellegrini, Emily ; Iqbal, Zohora ; Barnes, Gina ; Hoffman, Jana ; Calvert, Jacob ; Mao, Qingqing ; Das, Ritankar</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c356t-f9c8d282cac7806d5c08cd34c62b2792d6ff201765a79dc6055c517141065c363</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Algorithm</topic><topic>CDI</topic><topic>Clostridioides difficile</topic><topic>Clostridium Infections - diagnosis</topic><topic>Clostridium Infections - epidemiology</topic><topic>Electronic health record</topic><topic>Humans</topic><topic>Machine Learning</topic><topic>Prediction</topic><topic>Retrospective Studies</topic><topic>ROC Curve</topic><topic>XGBoost</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Panchavati, Saarang</creatorcontrib><creatorcontrib>Zelin, Nicole S.</creatorcontrib><creatorcontrib>Garikipati, Anurag</creatorcontrib><creatorcontrib>Pellegrini, Emily</creatorcontrib><creatorcontrib>Iqbal, Zohora</creatorcontrib><creatorcontrib>Barnes, Gina</creatorcontrib><creatorcontrib>Hoffman, 
Jana</creatorcontrib><creatorcontrib>Calvert, Jacob</creatorcontrib><creatorcontrib>Mao, Qingqing</creatorcontrib><creatorcontrib>Das, Ritankar</creatorcontrib><collection>ScienceDirect Open Access Titles</collection><collection>Elsevier:ScienceDirect:Open Access</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><jtitle>American journal of infection control</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Panchavati, Saarang</au><au>Zelin, Nicole S.</au><au>Garikipati, Anurag</au><au>Pellegrini, Emily</au><au>Iqbal, Zohora</au><au>Barnes, Gina</au><au>Hoffman, Jana</au><au>Calvert, Jacob</au><au>Mao, Qingqing</au><au>Das, Ritankar</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A comparative analysis of machine learning approaches to predict C. difficile infection in hospitalized patients</atitle><jtitle>American journal of infection control</jtitle><addtitle>Am J Infect Control</addtitle><date>2022-03</date><risdate>2022</risdate><volume>50</volume><issue>3</issue><spage>250</spage><epage>257</epage><pages>250-257</pages><issn>0196-6553</issn><eissn>1527-3296</eissn><abstract>•Clostridioides difficile is a leading cause of infectious diarrhea in hospitalized patients.•Machine learning algorithms can predict Clostridioides difficile with excellent discrimination.•XGBoost maintained predictive performance across a hold-out test set and an external dataset Interventions to better prevent or manage Clostridioides difficile infection (CDI) may significantly reduce morbidity, mortality, and healthcare spending. We present a retrospective study using electronic health record data from over 700 United States hospitals. 
A subset of hospitals was used to develop machine learning algorithms (MLAs); the remaining hospitals served as an external test set. Three MLAs were evaluated: gradient-boosted decision trees (XGBoost), Deep Long Short Term Memory neural network, and one-dimensional convolutional neural network. MLA performance was evaluated with area under the receiver operating characteristic curve (AUROC), sensitivity, specificity, diagnostic odds ratios and likelihood ratios. The development dataset contained 13,664,840 inpatient encounters with 80,046 CDI encounters; the external dataset contained 1,149,088 inpatient encounters with 7,107 CDI encounters. The highest AUROCs were achieved for XGB, Deep Long Short Term Memory neural network, and one-dimensional convolutional neural network via abstaining from use of specialized training techniques, resampling in isolation, and resampling and output bias in combination, respectively. XGBoost achieved the highest AUROC. MLAs can predict future CDI in hospitalized patients using just 6 hours of data. In clinical practice, a machine-learning based tool may support prophylactic measures, earlier diagnosis, and more timely implementation of infection control measures.</abstract><cop>United States</cop><pub>Elsevier Inc</pub><pmid>35067382</pmid><doi>10.1016/j.ajic.2021.11.012</doi><tpages>8</tpages><orcidid>https://orcid.org/0000-0001-6001-6723</orcidid><orcidid>https://orcid.org/0000-0002-7745-3900</orcidid><orcidid>https://orcid.org/0000-0001-7065-8367</orcidid><orcidid>https://orcid.org/0000-0002-2230-2187</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0196-6553
ispartof American journal of infection control, 2022-03, Vol.50 (3), p.250-257
issn 0196-6553
1527-3296
language eng
recordid cdi_proquest_miscellaneous_2622479552
source MEDLINE; ScienceDirect Journals (5 years ago - present)
subjects Algorithm
CDI
Clostridioides difficile
Clostridium Infections - diagnosis
Clostridium Infections - epidemiology
Electronic health record
Humans
Machine Learning
Prediction
Retrospective Studies
ROC Curve
XGBoost
title A comparative analysis of machine learning approaches to predict C. difficile infection in hospitalized patients
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-10T03%3A11%3A03IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20comparative%20analysis%20of%20machine%20learning%20approaches%20to%20predict%20C.%20difficile%20infection%20in%20hospitalized%20patients&rft.jtitle=American%20journal%20of%20infection%20control&rft.au=Panchavati,%20Saarang&rft.date=2022-03&rft.volume=50&rft.issue=3&rft.spage=250&rft.epage=257&rft.pages=250-257&rft.issn=0196-6553&rft.eissn=1527-3296&rft_id=info:doi/10.1016/j.ajic.2021.11.012&rft_dat=%3Cproquest_cross%3E2622479552%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2622479552&rft_id=info:pmid/35067382&rft_els_id=S0196655321007574&rfr_iscdi=true
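
The description field in this record names several evaluation measures (sensitivity, specificity, likelihood ratios, diagnostic odds ratio) and reports encounter counts that imply a strongly imbalanced outcome. The following is a minimal Python sketch of how those quantities relate, not the authors' code. The encounter counts are taken from the abstract; the confusion-matrix numbers are hypothetical and chosen only for illustration. The `initial_output_bias` formula, log(positives / negatives), is a common way to initialize a classifier's output bias for imbalanced data, and it is an assumption that the "output bias" technique mentioned in the abstract resembles it.

```python
import math


def prevalence(positives: int, total: int) -> float:
    """Fraction of encounters labeled positive (here: CDI encounters)."""
    return positives / total


def initial_output_bias(positives: int, total: int) -> float:
    """Log-odds of the positive class.

    Assumption: a common output-bias initialization for imbalanced data,
    so that sigmoid(bias) equals the class prevalence. The record does not
    say the study used exactly this formula.
    """
    negatives = total - positives
    return math.log(positives / negatives)


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard diagnostic measures from a 2x2 confusion matrix."""
    sensitivity = tp / (tp + fn)           # true positive rate
    specificity = tn / (tn + fp)           # true negative rate
    lr_pos = sensitivity / (1 - specificity)   # positive likelihood ratio
    lr_neg = (1 - sensitivity) / specificity   # negative likelihood ratio
    dor = lr_pos / lr_neg                  # diagnostic odds ratio
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
        "dor": dor,
    }


# Encounter counts as reported in the abstract.
dev_prev = prevalence(80_046, 13_664_840)      # development dataset
ext_prev = prevalence(7_107, 1_149_088)        # external dataset
dev_bias = initial_output_bias(80_046, 13_664_840)

# Hypothetical confusion matrix, for illustration only (not study results).
example = diagnostic_metrics(tp=80, fp=100, fn=20, tn=900)

if __name__ == "__main__":
    print(f"development prevalence: {dev_prev:.4%}")
    print(f"external prevalence:    {ext_prev:.4%}")
    print(f"initial output bias:    {dev_bias:.3f}")
    print(example)
```

Both datasets have a CDI prevalence well under 1%, which is why the abstract discusses resampling and output bias at all: with so few positive encounters, threshold-dependent measures such as sensitivity and specificity need to be read alongside a threshold-free measure such as AUROC.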