Predicting the graft survival for heart–lung transplantation patients: An integrated data mining methodology

Abstract Background Predicting the survival of heart–lung transplant patients has the potential to play a critical role in understanding and improving the matching procedure between the recipient and graft. Although voluminous data related to the transplantation procedures is being collected and sto...

Bibliographic details
Published in: International journal of medical informatics (Shannon, Ireland), 2009-12, Vol.78 (12), p.e84-e96
Main authors: Oztekin, Asil; Delen, Dursun; Kong, Zhenyu (James)
Format: Article
Language: eng
Online access: Full text
container_end_page e96
container_issue 12
container_start_page e84
container_title International journal of medical informatics (Shannon, Ireland)
container_volume 78
creator Oztekin, Asil
Delen, Dursun
Kong, Zhenyu (James)
description Abstract
Background: Predicting the survival of heart–lung transplant patients has the potential to play a critical role in understanding and improving the matching procedure between the recipient and graft. Although voluminous data related to the transplantation procedures is being collected and stored, only a small subset of the predictive factors has been used in modeling heart–lung transplantation outcomes. The previous studies have mainly focused on applying statistical techniques to a small set of factors selected by the domain-experts in order to reveal the simple linear relationships between the factors and survival. The collection of methods known as ‘data mining’ offers significant advantages over conventional statistical techniques in dealing with the latter's limitations such as normality assumption of observations, independence of observations from each other, and linearity of the relationship between the observations and the output measure(s). There are statistical methods that overcome these limitations. Yet, they are computationally more expensive and do not provide fast and flexible solutions as do data mining techniques in large datasets.
Purpose: The main objective of this study is to improve the prediction of outcomes following combined heart–lung transplantation by proposing an integrated data-mining methodology.
Methods: A large and feature-rich dataset (16,604 cases with 283 variables) is used to (1) develop machine learning based predictive models and (2) extract the most important predictive factors. Then, using three different variable selection methods, namely, (i) machine learning methods driven variables (using decision trees, neural networks, and logistic regression), (ii) the literature review-based expert-defined variables, and (iii) common sense-based interaction variables, a consolidated set of factors is generated and used to develop Cox regression models for heart–lung graft survival.
Results: The predictive models’ performance in terms of 10-fold cross-validation accuracy rates for two multi-imputed datasets ranged from 79% to 86% for neural networks, from 78% to 86% for logistic regression, and from 71% to 79% for decision trees. The results indicate that the proposed integrated data mining methodology using Cox hazard models better predicted the graft survival with different variables than the conventional approaches commonly used in the literature. This result is validated by the comparison of the corresponding Gains charts for our proposed methodology and the literature review based Cox results, and by the comparison of Akaike information criteria (AIC) values received from each.
Conclusions: Data mining-based methodology proposed in this study reveals that there are undiscovered relationships (i.e. interactions of the existing variables) among the survival-related variables, which helps better predict the survival of the heart–lung transplants. It also brings a different set of variables into the scene to be evaluated by the domain-experts and be considered prior to the organ transplantation.
doi_str_mv 10.1016/j.ijmedinf.2009.04.007
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_734124995</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S1386505609000707</els_id><sourcerecordid>734124995</sourcerecordid><originalsourceid>FETCH-LOGICAL-c455t-2878875be06cafc505462bac300996c3da28e81c15301cdb84d1c4c505e0a9983</originalsourceid><addsrcrecordid>eNqFkk2O1DAQhSMEYoaBK4yyg01COXESmwViNOJPGgkkYG25nUq3Q2I3ttNS77gDN-QkVNSNkFjAypb81SvXe5Vl1wxKBqx9PpZ2nLG3bigrAFkCLwG6e9klE11ViIrX9-lei7ZooGkvskcxjgCsg4Y_zC6Y5LLrRHWZuY-BVEyybpunHebboIeUxyUc7EFP-eBDvkMd0s_vP6ZlZYJ2cT9pl3Sy3uV7OtCl-CK_cbl1CUkgYZ_3Oul8tm7VnTHtfO8nvz0-zh4Meor45HxeZV_evP58-664-_D2_e3NXWF406SiEp0QXbNBaI0eDI3A22qjTU2jytbUva4ECmZYUwMz_Ubwnhm-cghaSlFfZU9Puvvgvy0Yk5ptNDjRx9EvUXU1ZxWXsiHy2T9Jcpsz4CA5oe0JNcHHGHBQ-2BnHY4ErVyrRvU7FbWmooArSoUKr889lg09_yk7x0DAqxOA5MnBYlDRkK2GpAKapHpv_9_j5V8SZiL7jZ6-4hHj6JfgyHHFVKwUqE_rbqyrARKonAR-ATbmuJk</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1014104094</pqid></control><display><type>article</type><title>Predicting the graft survival for heart–lung transplantation patients: An integrated data mining methodology</title><source>MEDLINE</source><source>Elsevier ScienceDirect Journals</source><creator>Oztekin, Asil ; Delen, Dursun ; Kong, Zhenyu (James)</creator><creatorcontrib>Oztekin, Asil ; Delen, Dursun ; Kong, Zhenyu (James)</creatorcontrib><description>Abstract Background Predicting the survival of heart–lung transplant patients has the potential to play a critical role in understanding and improving the matching procedure between the recipient and graft. Although voluminous data related to the transplantation procedures is being collected and stored, only a small subset of the predictive factors has been used in modeling heart–lung transplantation outcomes. 
The previous studies have mainly focused on applying statistical techniques to a small set of factors selected by the domain-experts in order to reveal the simple linear relationships between the factors and survival. The collection of methods known as ‘data mining’ offers significant advantages over conventional statistical techniques in dealing with the latter's limitations such as normality assumption of observations, independence of observations from each other, and linearity of the relationship between the observations and the output measure(s). There are statistical methods that overcome these limitations. Yet, they are computationally more expensive and do not provide fast and flexible solutions as do data mining techniques in large datasets. Purpose The main objective of this study is to improve the prediction of outcomes following combined heart–lung transplantation by proposing an integrated data-mining methodology. Methods A large and feature-rich dataset (16,604 cases with 283 variables) is used to (1) develop machine learning based predictive models and (2) extract the most important predictive factors. Then, using three different variable selection methods, namely, (i) machine learning methods driven variables—using decision trees, neural networks, logistic regression, (ii) the literature review-based expert-defined variables, and (iii) common sense-based interaction variables, a consolidated set of factors is generated and used to develop Cox regression models for heart–lung graft survival. Results The predictive models’ performance in terms of 10-fold cross-validation accuracy rates for two multi-imputed datasets ranged from 79% to 86% for neural networks, from 78% to 86% for logistic regression, and from 71% to 79% for decision trees. 
The results indicate that the proposed integrated data mining methodology using Cox hazard models better predicted the graft survival with different variables than the conventional approaches commonly used in the literature. This result is validated by the comparison of the corresponding Gains charts for our proposed methodology and the literature review based Cox results, and by the comparison of Akaike information criteria (AIC) values received from each. Conclusions Data mining-based methodology proposed in this study reveals that there are undiscovered relationships (i.e. interactions of the existing variables) among the survival-related variables, which helps better predict the survival of the heart–lung transplants. It also brings a different set of variables into the scene to be evaluated by the domain-experts and be considered prior to the organ transplantation.</description><identifier>ISSN: 1386-5056</identifier><identifier>EISSN: 1872-8243</identifier><identifier>DOI: 10.1016/j.ijmedinf.2009.04.007</identifier><identifier>PMID: 19497782</identifier><language>eng</language><publisher>Ireland: Elsevier Ireland Ltd</publisher><subject>Artificial Intelligence ; Buffers ; Classification ; Combined heart–lung transplantation ; Cox proportional hazards models ; Data Mining ; Decision Trees ; Graft Survival - physiology ; Heart-Lung Transplantation ; Humans ; Internal Medicine ; Middle Aged ; Models, Theoretical ; Other ; Prognosis ; Survival analysis ; Survival Rate</subject><ispartof>International journal of medical informatics (Shannon, Ireland), 2009-12, Vol.78 (12), p.e84-e96</ispartof><rights>Elsevier Ireland Ltd</rights><rights>2009 Elsevier Ireland 
Ltd</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c455t-2878875be06cafc505462bac300996c3da28e81c15301cdb84d1c4c505e0a9983</citedby><cites>FETCH-LOGICAL-c455t-2878875be06cafc505462bac300996c3da28e81c15301cdb84d1c4c505e0a9983</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://www.sciencedirect.com/science/article/pii/S1386505609000707$$EHTML$$P50$$Gelsevier$$H</linktohtml><link.rule.ids>314,776,780,3537,27901,27902,65306</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/19497782$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Oztekin, Asil</creatorcontrib><creatorcontrib>Delen, Dursun</creatorcontrib><creatorcontrib>Kong, Zhenyu (James)</creatorcontrib><title>Predicting the graft survival for heart–lung transplantation patients: An integrated data mining methodology</title><title>International journal of medical informatics (Shannon, Ireland)</title><addtitle>Int J Med Inform</addtitle><description>Abstract Background Predicting the survival of heart–lung transplant patients has the potential to play a critical role in understanding and improving the matching procedure between the recipient and graft. Although voluminous data related to the transplantation procedures is being collected and stored, only a small subset of the predictive factors has been used in modeling heart–lung transplantation outcomes. The previous studies have mainly focused on applying statistical techniques to a small set of factors selected by the domain-experts in order to reveal the simple linear relationships between the factors and survival. 
The collection of methods known as ‘data mining’ offers significant advantages over conventional statistical techniques in dealing with the latter's limitations such as normality assumption of observations, independence of observations from each other, and linearity of the relationship between the observations and the output measure(s). There are statistical methods that overcome these limitations. Yet, they are computationally more expensive and do not provide fast and flexible solutions as do data mining techniques in large datasets. Purpose The main objective of this study is to improve the prediction of outcomes following combined heart–lung transplantation by proposing an integrated data-mining methodology. Methods A large and feature-rich dataset (16,604 cases with 283 variables) is used to (1) develop machine learning based predictive models and (2) extract the most important predictive factors. Then, using three different variable selection methods, namely, (i) machine learning methods driven variables—using decision trees, neural networks, logistic regression, (ii) the literature review-based expert-defined variables, and (iii) common sense-based interaction variables, a consolidated set of factors is generated and used to develop Cox regression models for heart–lung graft survival. Results The predictive models’ performance in terms of 10-fold cross-validation accuracy rates for two multi-imputed datasets ranged from 79% to 86% for neural networks, from 78% to 86% for logistic regression, and from 71% to 79% for decision trees. The results indicate that the proposed integrated data mining methodology using Cox hazard models better predicted the graft survival with different variables than the conventional approaches commonly used in the literature. 
This result is validated by the comparison of the corresponding Gains charts for our proposed methodology and the literature review based Cox results, and by the comparison of Akaike information criteria (AIC) values received from each. Conclusions Data mining-based methodology proposed in this study reveals that there are undiscovered relationships (i.e. interactions of the existing variables) among the survival-related variables, which helps better predict the survival of the heart–lung transplants. It also brings a different set of variables into the scene to be evaluated by the domain-experts and be considered prior to the organ transplantation.</description><subject>Artificial Intelligence</subject><subject>Buffers</subject><subject>Classification</subject><subject>Combined heart–lung transplantation</subject><subject>Cox proportional hazards models</subject><subject>Data Mining</subject><subject>Decision Trees</subject><subject>Graft Survival - physiology</subject><subject>Heart-Lung Transplantation</subject><subject>Humans</subject><subject>Internal Medicine</subject><subject>Middle Aged</subject><subject>Models, Theoretical</subject><subject>Other</subject><subject>Prognosis</subject><subject>Survival analysis</subject><subject>Survival 
Rate</subject><issn>1386-5056</issn><issn>1872-8243</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2009</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNqFkk2O1DAQhSMEYoaBK4yyg01COXESmwViNOJPGgkkYG25nUq3Q2I3ttNS77gDN-QkVNSNkFjAypb81SvXe5Vl1wxKBqx9PpZ2nLG3bigrAFkCLwG6e9klE11ViIrX9-lei7ZooGkvskcxjgCsg4Y_zC6Y5LLrRHWZuY-BVEyybpunHebboIeUxyUc7EFP-eBDvkMd0s_vP6ZlZYJ2cT9pl3Sy3uV7OtCl-CK_cbl1CUkgYZ_3Oul8tm7VnTHtfO8nvz0-zh4Meor45HxeZV_evP58-664-_D2_e3NXWF406SiEp0QXbNBaI0eDI3A22qjTU2jytbUva4ECmZYUwMz_Ubwnhm-cghaSlFfZU9Puvvgvy0Yk5ptNDjRx9EvUXU1ZxWXsiHy2T9Jcpsz4CA5oe0JNcHHGHBQ-2BnHY4ErVyrRvU7FbWmooArSoUKr889lg09_yk7x0DAqxOA5MnBYlDRkK2GpAKapHpv_9_j5V8SZiL7jZ6-4hHj6JfgyHHFVKwUqE_rbqyrARKonAR-ATbmuJk</recordid><startdate>20091201</startdate><enddate>20091201</enddate><creator>Oztekin, Asil</creator><creator>Delen, Dursun</creator><creator>Kong, Zhenyu (James)</creator><general>Elsevier Ireland Ltd</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7QO</scope><scope>8FD</scope><scope>FR3</scope><scope>P64</scope><scope>7X8</scope></search><sort><creationdate>20091201</creationdate><title>Predicting the graft survival for heart–lung transplantation patients: An integrated data mining methodology</title><author>Oztekin, Asil ; Delen, Dursun ; Kong, Zhenyu (James)</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c455t-2878875be06cafc505462bac300996c3da28e81c15301cdb84d1c4c505e0a9983</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2009</creationdate><topic>Artificial Intelligence</topic><topic>Buffers</topic><topic>Classification</topic><topic>Combined heart–lung transplantation</topic><topic>Cox proportional hazards models</topic><topic>Data Mining</topic><topic>Decision Trees</topic><topic>Graft Survival - 
physiology</topic><topic>Heart-Lung Transplantation</topic><topic>Humans</topic><topic>Internal Medicine</topic><topic>Middle Aged</topic><topic>Models, Theoretical</topic><topic>Other</topic><topic>Prognosis</topic><topic>Survival analysis</topic><topic>Survival Rate</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Oztekin, Asil</creatorcontrib><creatorcontrib>Delen, Dursun</creatorcontrib><creatorcontrib>Kong, Zhenyu (James)</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Biotechnology Research Abstracts</collection><collection>Technology Research Database</collection><collection>Engineering Research Database</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>MEDLINE - Academic</collection><jtitle>International journal of medical informatics (Shannon, Ireland)</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Oztekin, Asil</au><au>Delen, Dursun</au><au>Kong, Zhenyu (James)</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Predicting the graft survival for heart–lung transplantation patients: An integrated data mining methodology</atitle><jtitle>International journal of medical informatics (Shannon, Ireland)</jtitle><addtitle>Int J Med Inform</addtitle><date>2009-12-01</date><risdate>2009</risdate><volume>78</volume><issue>12</issue><spage>e84</spage><epage>e96</epage><pages>e84-e96</pages><issn>1386-5056</issn><eissn>1872-8243</eissn><abstract>Abstract Background Predicting the survival of heart–lung transplant patients has the potential to play a critical role in understanding and improving the matching procedure between the recipient and graft. 
Although voluminous data related to the transplantation procedures is being collected and stored, only a small subset of the predictive factors has been used in modeling heart–lung transplantation outcomes. The previous studies have mainly focused on applying statistical techniques to a small set of factors selected by the domain-experts in order to reveal the simple linear relationships between the factors and survival. The collection of methods known as ‘data mining’ offers significant advantages over conventional statistical techniques in dealing with the latter's limitations such as normality assumption of observations, independence of observations from each other, and linearity of the relationship between the observations and the output measure(s). There are statistical methods that overcome these limitations. Yet, they are computationally more expensive and do not provide fast and flexible solutions as do data mining techniques in large datasets. Purpose The main objective of this study is to improve the prediction of outcomes following combined heart–lung transplantation by proposing an integrated data-mining methodology. Methods A large and feature-rich dataset (16,604 cases with 283 variables) is used to (1) develop machine learning based predictive models and (2) extract the most important predictive factors. Then, using three different variable selection methods, namely, (i) machine learning methods driven variables—using decision trees, neural networks, logistic regression, (ii) the literature review-based expert-defined variables, and (iii) common sense-based interaction variables, a consolidated set of factors is generated and used to develop Cox regression models for heart–lung graft survival. Results The predictive models’ performance in terms of 10-fold cross-validation accuracy rates for two multi-imputed datasets ranged from 79% to 86% for neural networks, from 78% to 86% for logistic regression, and from 71% to 79% for decision trees. 
The results indicate that the proposed integrated data mining methodology using Cox hazard models better predicted the graft survival with different variables than the conventional approaches commonly used in the literature. This result is validated by the comparison of the corresponding Gains charts for our proposed methodology and the literature review based Cox results, and by the comparison of Akaike information criteria (AIC) values received from each. Conclusions Data mining-based methodology proposed in this study reveals that there are undiscovered relationships (i.e. interactions of the existing variables) among the survival-related variables, which helps better predict the survival of the heart–lung transplants. It also brings a different set of variables into the scene to be evaluated by the domain-experts and be considered prior to the organ transplantation.</abstract><cop>Ireland</cop><pub>Elsevier Ireland Ltd</pub><pmid>19497782</pmid><doi>10.1016/j.ijmedinf.2009.04.007</doi></addata></record>
fulltext fulltext
identifier ISSN: 1386-5056
ispartof International journal of medical informatics (Shannon, Ireland), 2009-12, Vol.78 (12), p.e84-e96
issn 1386-5056
1872-8243
language eng
recordid cdi_proquest_miscellaneous_734124995
source MEDLINE; Elsevier ScienceDirect Journals
subjects Artificial Intelligence
Buffers
Classification
Combined heart–lung transplantation
Cox proportional hazards models
Data Mining
Decision Trees
Graft Survival - physiology
Heart-Lung Transplantation
Humans
Internal Medicine
Middle Aged
Models, Theoretical
Other
Prognosis
Survival analysis
Survival Rate
title Predicting the graft survival for heart–lung transplantation patients: An integrated data mining methodology
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-05T01%3A36%3A46IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Predicting%20the%20graft%20survival%20for%20heart%E2%80%93lung%20transplantation%20patients:%20An%20integrated%20data%20mining%20methodology&rft.jtitle=International%20journal%20of%20medical%20informatics%20(Shannon,%20Ireland)&rft.au=Oztekin,%20Asil&rft.date=2009-12-01&rft.volume=78&rft.issue=12&rft.spage=e84&rft.epage=e96&rft.pages=e84-e96&rft.issn=1386-5056&rft.eissn=1872-8243&rft_id=info:doi/10.1016/j.ijmedinf.2009.04.007&rft_dat=%3Cproquest_cross%3E734124995%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1014104094&rft_id=info:pmid/19497782&rft_els_id=S1386505609000707&rfr_iscdi=true
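The url field above is an OpenURL link (ctx_ver=Z39.88-2004) whose query string repeats the citation as key/value pairs: rft.* keys describe the article and rft_id keys carry identifiers. As a minimal sketch, assuming only Python's standard library, those pairs could be read back as shown here. The function name `parse_openurl` and the shortened `OPENURL` constant are illustrative assumptions, not part of the record or of any resolver's API.

```python
from urllib.parse import urlsplit, parse_qs

# Shortened copy of the url field (assumption: only some keys kept for brevity).
OPENURL = (
    "https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004"
    "&rft.genre=article"
    "&rft.jtitle=International%20journal%20of%20medical%20informatics%20(Shannon,%20Ireland)"
    "&rft.au=Oztekin,%20Asil"
    "&rft.date=2009-12-01&rft.volume=78&rft.issue=12"
    "&rft.spage=e84&rft.epage=e96"
    "&rft.issn=1386-5056&rft.eissn=1872-8243"
    "&rft_id=info:doi/10.1016/j.ijmedinf.2009.04.007"
    "&rft_id=info:oai/"
    "&rft_id=info:pmid/19497782"
)


def parse_openurl(url):
    """Hypothetical helper: split an OpenURL into referent fields and identifiers."""
    # parse_qs percent-decodes values and collects repeated keys into lists.
    query = parse_qs(urlsplit(url).query)

    # rft.* keys describe the cited item; keep the first value of each key.
    referent = {
        key[len("rft."):]: values[0]
        for key, values in query.items()
        if key.startswith("rft.")
    }

    # rft_id may repeat; each value looks like "info:<scheme>/<value>".
    identifiers = {}
    for raw in query.get("rft_id", []):
        if raw.startswith("info:") and "/" in raw:
            scheme, value = raw[len("info:"):].split("/", 1)
            if value:  # skip empty identifiers such as "info:oai/"
                identifiers[scheme] = value
    return referent, identifiers


referent, identifiers = parse_openurl(OPENURL)
print(referent["jtitle"], referent["volume"], referent["spage"])
print(identifiers)
```

Keeping identifiers in a separate dictionary reflects that rft_id can occur several times in one link, while each rft.* field in this record occurs once.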