Explainable LightGBM Approach for Predicting Myocardial Infarction Mortality

Myocardial Infarction is a leading cause of mortality globally, and accurate risk prediction is crucial for improving patient outcomes. Machine Learning techniques have shown promise in identifying high-risk patients and predicting outcomes. However, patient data often contain vast amounts of informati...

Bibliographic Details
Main authors:	Vicente, Ana Letícia Garcez; Junior, Roseval Donisete Malaquias; Romero, Roseli A. F
Format:	Article
Language:	English
Subjects:	Computer Science - Learning

creator	Vicente, Ana Letícia Garcez; Junior, Roseval Donisete Malaquias; Romero, Roseli A. F
description	Myocardial Infarction is a leading cause of mortality globally, and accurate risk prediction is crucial for improving patient outcomes. Machine Learning techniques have shown promise in identifying high-risk patients and predicting outcomes. However, patient data often contain vast amounts of information and missing values, posing challenges for feature selection and imputation methods. In this article, we investigate the impact of data preprocessing and compare three ensemble boosted-tree methods for predicting the risk of mortality in patients with myocardial infarction. Further, we use the Tree Shapley Additive Explanations method to identify relationships among all the features for the performed predictions, leveraging the entirety of the available data in the analysis. Notably, our approach achieved superior performance compared to other existing machine learning approaches, with an F1-score of 91.2% and an accuracy of 91.8% for LightGBM without data preprocessing.
doi_str_mv	10.48550/arxiv.2404.15029
format	Article
date	2024-04-23
rights	http://creativecommons.org/licenses/by/4.0
oa	free_for_read
collection	arXiv Computer Science; arXiv.org
linktorsrc	https://arxiv.org/abs/2404.15029
fulltext	fulltext_linktorsrc
identifier	DOI: 10.48550/arxiv.2404.15029
language	eng
recordid	cdi_arxiv_primary_2404_15029
source	arXiv.org
subjects	Computer Science - Learning
title	Explainable LightGBM Approach for Predicting Myocardial Infarction Mortality
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-20T18%3A59%3A03IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Explainable%20LightGBM%20Approach%20for%20Predicting%20Myocardial%20Infarction%20Mortality&rft.au=Vicente,%20Ana%20Let%C3%ADcia%20Garcez&rft.date=2024-04-23&rft_id=info:doi/10.48550/arxiv.2404.15029&rft_dat=%3Carxiv_GOX%3E2404_15029%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true