Interpretable Machine Learning for COVID-19: An Empirical Study on Severity Prediction Task

Bibliographic Details
Published in: IEEE Transactions on Artificial Intelligence, 2023-08, Vol. 4 (4), p. 764-777
Main authors: Wu, Han, Ruan, Wenjie, Wang, Jiangtao, Zheng, Dingchang, Liu, Bei, Geng, Yayuan, Chai, Xiangfei, Chen, Jian, Li, Kunwei, Li, Shaolin, Helal, Sumi
Format: Article
Language: English
Subjects:
Online access: Order full text
description The black-box nature of machine learning models hinders the deployment of some high-accuracy medical diagnosis algorithms. It is risky to put one's life in the hands of models that medical researchers do not fully understand or trust. However, through model interpretation, black-box models can promptly reveal significant biomarkers that medical practitioners may have overlooked due to the surge of infected patients during the COVID-19 pandemic. This research leverages a database of 92 patients with confirmed SARS-CoV-2 laboratory tests between 18 January 2020 and 5 March 2020 in Zhuhai, China, to identify biomarkers indicative of infection severity. By interpreting four machine learning models (decision trees, random forests, gradient boosted trees, and neural networks) with permutation feature importance, partial dependence plots, individual conditional expectation, accumulated local effects, local interpretable model-agnostic explanations (LIME), and Shapley additive explanations (SHAP), we find that an increase in N-terminal pro-brain natriuretic peptide, C-reactive protein, and lactate dehydrogenase, together with a decrease in lymphocytes, is associated with severe infection and an increased risk of death, consistent with recent medical research on COVID-19 and with other research using dedicated models. We further validate our methods on a large open dataset of 5644 confirmed patients from the Hospital Israelita Albert Einstein in São Paulo, Brazil (obtained via Kaggle), and unveil leukocytes, eosinophils, and platelets as three indicative biomarkers for COVID-19.
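To illustrate one of the interpretation methods named in the abstract, the following is a minimal sketch (not the authors' code) of permutation feature importance applied to a random forest, using scikit-learn on synthetic data. The feature names are illustrative stand-ins for the paper's biomarkers, and the data-generating process is an assumption made purely for the example.

```python
# Minimal sketch of permutation feature importance on synthetic data.
# Feature names are hypothetical stand-ins for the paper's biomarkers;
# only the first two synthetic features actually drive the outcome here.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 400
X = rng.normal(size=(n, 4))
# Severity label driven by features 0 and 1 plus noise.
y = ((1.5 * X[:, 0] - 1.0 * X[:, 1]
      + rng.normal(scale=0.5, size=n)) > 0).astype(int)
feature_names = ["NT-proBNP", "lymphocyte", "CRP", "LDH"]

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_tr, y_tr)

# Permutation importance: the drop in held-out accuracy when one
# feature's column is randomly shuffled, averaged over repeats.
result = permutation_importance(model, X_te, y_te,
                                n_repeats=20, random_state=0)
for name, imp in sorted(zip(feature_names, result.importances_mean),
                        key=lambda t: -t[1]):
    print(f"{name}: {imp:.3f}")
```

Because permutation importance is model-agnostic, the same call works unchanged for the gradient boosted trees and neural networks the paper also interprets.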
doi_str_mv 10.1109/TAI.2021.3092698
identifier ISSN: 2691-4581
source IEEE Electronic Library (IEL)
subjects Artificial intelligence in health
artificial intelligence in medicine
Biological system modeling
COVID-19
interpretable machine learning
Machine learning
Medical diagnostic imaging
Medical services
Pandemics
Predictive models