Predicting the graft survival for heart–lung transplantation patients: An integrated data mining methodology
Published in: | International journal of medical informatics (Shannon, Ireland), 2009-12, Vol.78 (12), p.e84-e96 |
---|---|
Main authors: | Oztekin, Asil; Delen, Dursun; Kong, Zhenyu (James) |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | e96 |
---|---|
container_issue | 12 |
container_start_page | e84 |
container_title | International journal of medical informatics (Shannon, Ireland) |
container_volume | 78 |
creator | Oztekin, Asil; Delen, Dursun; Kong, Zhenyu (James) |
description | Abstract. Background: Predicting the survival of heart–lung transplant patients has the potential to play a critical role in understanding and improving the matching procedure between the recipient and graft. Although voluminous data related to transplantation procedures are being collected and stored, only a small subset of the predictive factors has been used in modeling heart–lung transplantation outcomes. Previous studies have mainly focused on applying statistical techniques to a small set of factors selected by domain experts in order to reveal simple linear relationships between the factors and survival. The collection of methods known as ‘data mining’ offers significant advantages over conventional statistical techniques in dealing with the latter's limitations, such as the normality assumption of observations, independence of observations from each other, and linearity of the relationship between the observations and the output measure(s). There are statistical methods that overcome these limitations, yet they are computationally more expensive and do not provide solutions as fast and flexible as data mining techniques do on large datasets. Purpose: The main objective of this study is to improve the prediction of outcomes following combined heart–lung transplantation by proposing an integrated data mining methodology. Methods: A large and feature-rich dataset (16,604 cases with 283 variables) is used to (1) develop machine-learning-based predictive models and (2) extract the most important predictive factors. Then, using three different variable selection methods, namely (i) variables selected by machine learning methods (decision trees, neural networks, and logistic regression), (ii) literature review-based, expert-defined variables, and (iii) common sense-based interaction variables, a consolidated set of factors is generated and used to develop Cox regression models for heart–lung graft survival. Results: The predictive models’ performance in terms of 10-fold cross-validation accuracy rates for two multi-imputed datasets ranged from 79% to 86% for neural networks, from 78% to 86% for logistic regression, and from 71% to 79% for decision trees. The results indicate that the proposed integrated data mining methodology using Cox hazard models predicted graft survival better, and with different variables, than the conventional approaches commonly used in the literature. This result is validated by comparing the corresponding Gains charts for the proposed methodology and the literature review-based Cox results, and by comparing the Akaike information criterion (AIC) values obtained from each. Conclusions: The data mining-based methodology proposed in this study reveals that there are undiscovered relationships (i.e. interactions of the existing variables) among the survival-related variables, which help better predict the survival of heart–lung transplants. It also brings a different set of variables into the scene to be evaluated by domain experts and considered prior to organ transplantation. |
doi_str_mv | 10.1016/j.ijmedinf.2009.04.007 |
format | Article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_734124995</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S1386505609000707</els_id><sourcerecordid>734124995</sourcerecordid><originalsourceid>FETCH-LOGICAL-c455t-2878875be06cafc505462bac300996c3da28e81c15301cdb84d1c4c505e0a9983</originalsourceid><addsrcrecordid>eNqFkk2O1DAQhSMEYoaBK4yyg01COXESmwViNOJPGgkkYG25nUq3Q2I3ttNS77gDN-QkVNSNkFjAypb81SvXe5Vl1wxKBqx9PpZ2nLG3bigrAFkCLwG6e9klE11ViIrX9-lei7ZooGkvskcxjgCsg4Y_zC6Y5LLrRHWZuY-BVEyybpunHebboIeUxyUc7EFP-eBDvkMd0s_vP6ZlZYJ2cT9pl3Sy3uV7OtCl-CK_cbl1CUkgYZ_3Oul8tm7VnTHtfO8nvz0-zh4Meor45HxeZV_evP58-664-_D2_e3NXWF406SiEp0QXbNBaI0eDI3A22qjTU2jytbUva4ECmZYUwMz_Ubwnhm-cghaSlFfZU9Puvvgvy0Yk5ptNDjRx9EvUXU1ZxWXsiHy2T9Jcpsz4CA5oe0JNcHHGHBQ-2BnHY4ErVyrRvU7FbWmooArSoUKr889lg09_yk7x0DAqxOA5MnBYlDRkK2GpAKapHpv_9_j5V8SZiL7jZ6-4hHj6JfgyHHFVKwUqE_rbqyrARKonAR-ATbmuJk</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1014104094</pqid></control><display><type>article</type><title>Predicting the graft survival for heart–lung transplantation patients: An integrated data mining methodology</title><source>MEDLINE</source><source>Elsevier ScienceDirect Journals</source><creator>Oztekin, Asil ; Delen, Dursun ; Kong, Zhenyu (James)</creator><creatorcontrib>Oztekin, Asil ; Delen, Dursun ; Kong, Zhenyu (James)</creatorcontrib><description>Abstract Background Predicting the survival of heart–lung transplant patients has the potential to play a critical role in understanding and improving the matching procedure between the recipient and graft. Although voluminous data related to the transplantation procedures is being collected and stored, only a small subset of the predictive factors has been used in modeling heart–lung transplantation outcomes. 
The previous studies have mainly focused on applying statistical techniques to a small set of factors selected by the domain-experts in order to reveal the simple linear relationships between the factors and survival. The collection of methods known as ‘data mining’ offers significant advantages over conventional statistical techniques in dealing with the latter's limitations such as normality assumption of observations, independence of observations from each other, and linearity of the relationship between the observations and the output measure(s). There are statistical methods that overcome these limitations. Yet, they are computationally more expensive and do not provide fast and flexible solutions as do data mining techniques in large datasets. Purpose The main objective of this study is to improve the prediction of outcomes following combined heart–lung transplantation by proposing an integrated data-mining methodology. Methods A large and feature-rich dataset (16,604 cases with 283 variables) is used to (1) develop machine learning based predictive models and (2) extract the most important predictive factors. Then, using three different variable selection methods, namely, (i) machine learning methods driven variables—using decision trees, neural networks, logistic regression, (ii) the literature review-based expert-defined variables, and (iii) common sense-based interaction variables, a consolidated set of factors is generated and used to develop Cox regression models for heart–lung graft survival. Results The predictive models’ performance in terms of 10-fold cross-validation accuracy rates for two multi-imputed datasets ranged from 79% to 86% for neural networks, from 78% to 86% for logistic regression, and from 71% to 79% for decision trees. 
The results indicate that the proposed integrated data mining methodology using Cox hazard models better predicted the graft survival with different variables than the conventional approaches commonly used in the literature. This result is validated by the comparison of the corresponding Gains charts for our proposed methodology and the literature review based Cox results, and by the comparison of Akaike information criteria (AIC) values received from each. Conclusions Data mining-based methodology proposed in this study reveals that there are undiscovered relationships (i.e. interactions of the existing variables) among the survival-related variables, which helps better predict the survival of the heart–lung transplants. It also brings a different set of variables into the scene to be evaluated by the domain-experts and be considered prior to the organ transplantation.</description><identifier>ISSN: 1386-5056</identifier><identifier>EISSN: 1872-8243</identifier><identifier>DOI: 10.1016/j.ijmedinf.2009.04.007</identifier><identifier>PMID: 19497782</identifier><language>eng</language><publisher>Ireland: Elsevier Ireland Ltd</publisher><subject>Artificial Intelligence ; Buffers ; Classification ; Combined heart–lung transplantation ; Cox proportional hazards models ; Data Mining ; Decision Trees ; Graft Survival - physiology ; Heart-Lung Transplantation ; Humans ; Internal Medicine ; Middle Aged ; Models, Theoretical ; Other ; Prognosis ; Survival analysis ; Survival Rate</subject><ispartof>International journal of medical informatics (Shannon, Ireland), 2009-12, Vol.78 (12), p.e84-e96</ispartof><rights>Elsevier Ireland Ltd</rights><rights>2009 Elsevier Ireland 
Ltd</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c455t-2878875be06cafc505462bac300996c3da28e81c15301cdb84d1c4c505e0a9983</citedby><cites>FETCH-LOGICAL-c455t-2878875be06cafc505462bac300996c3da28e81c15301cdb84d1c4c505e0a9983</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://www.sciencedirect.com/science/article/pii/S1386505609000707$$EHTML$$P50$$Gelsevier$$H</linktohtml><link.rule.ids>314,776,780,3537,27901,27902,65306</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/19497782$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Oztekin, Asil</creatorcontrib><creatorcontrib>Delen, Dursun</creatorcontrib><creatorcontrib>Kong, Zhenyu (James)</creatorcontrib><title>Predicting the graft survival for heart–lung transplantation patients: An integrated data mining methodology</title><title>International journal of medical informatics (Shannon, Ireland)</title><addtitle>Int J Med Inform</addtitle><description>Abstract Background Predicting the survival of heart–lung transplant patients has the potential to play a critical role in understanding and improving the matching procedure between the recipient and graft. Although voluminous data related to the transplantation procedures is being collected and stored, only a small subset of the predictive factors has been used in modeling heart–lung transplantation outcomes. The previous studies have mainly focused on applying statistical techniques to a small set of factors selected by the domain-experts in order to reveal the simple linear relationships between the factors and survival. 
The collection of methods known as ‘data mining’ offers significant advantages over conventional statistical techniques in dealing with the latter's limitations such as normality assumption of observations, independence of observations from each other, and linearity of the relationship between the observations and the output measure(s). There are statistical methods that overcome these limitations. Yet, they are computationally more expensive and do not provide fast and flexible solutions as do data mining techniques in large datasets. Purpose The main objective of this study is to improve the prediction of outcomes following combined heart–lung transplantation by proposing an integrated data-mining methodology. Methods A large and feature-rich dataset (16,604 cases with 283 variables) is used to (1) develop machine learning based predictive models and (2) extract the most important predictive factors. Then, using three different variable selection methods, namely, (i) machine learning methods driven variables—using decision trees, neural networks, logistic regression, (ii) the literature review-based expert-defined variables, and (iii) common sense-based interaction variables, a consolidated set of factors is generated and used to develop Cox regression models for heart–lung graft survival. Results The predictive models’ performance in terms of 10-fold cross-validation accuracy rates for two multi-imputed datasets ranged from 79% to 86% for neural networks, from 78% to 86% for logistic regression, and from 71% to 79% for decision trees. The results indicate that the proposed integrated data mining methodology using Cox hazard models better predicted the graft survival with different variables than the conventional approaches commonly used in the literature. 
This result is validated by the comparison of the corresponding Gains charts for our proposed methodology and the literature review based Cox results, and by the comparison of Akaike information criteria (AIC) values received from each. Conclusions Data mining-based methodology proposed in this study reveals that there are undiscovered relationships (i.e. interactions of the existing variables) among the survival-related variables, which helps better predict the survival of the heart–lung transplants. It also brings a different set of variables into the scene to be evaluated by the domain-experts and be considered prior to the organ transplantation.</description><subject>Artificial Intelligence</subject><subject>Buffers</subject><subject>Classification</subject><subject>Combined heart–lung transplantation</subject><subject>Cox proportional hazards models</subject><subject>Data Mining</subject><subject>Decision Trees</subject><subject>Graft Survival - physiology</subject><subject>Heart-Lung Transplantation</subject><subject>Humans</subject><subject>Internal Medicine</subject><subject>Middle Aged</subject><subject>Models, Theoretical</subject><subject>Other</subject><subject>Prognosis</subject><subject>Survival analysis</subject><subject>Survival 
Rate</subject><issn>1386-5056</issn><issn>1872-8243</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2009</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNqFkk2O1DAQhSMEYoaBK4yyg01COXESmwViNOJPGgkkYG25nUq3Q2I3ttNS77gDN-QkVNSNkFjAypb81SvXe5Vl1wxKBqx9PpZ2nLG3bigrAFkCLwG6e9klE11ViIrX9-lei7ZooGkvskcxjgCsg4Y_zC6Y5LLrRHWZuY-BVEyybpunHebboIeUxyUc7EFP-eBDvkMd0s_vP6ZlZYJ2cT9pl3Sy3uV7OtCl-CK_cbl1CUkgYZ_3Oul8tm7VnTHtfO8nvz0-zh4Meor45HxeZV_evP58-664-_D2_e3NXWF406SiEp0QXbNBaI0eDI3A22qjTU2jytbUva4ECmZYUwMz_Ubwnhm-cghaSlFfZU9Puvvgvy0Yk5ptNDjRx9EvUXU1ZxWXsiHy2T9Jcpsz4CA5oe0JNcHHGHBQ-2BnHY4ErVyrRvU7FbWmooArSoUKr889lg09_yk7x0DAqxOA5MnBYlDRkK2GpAKapHpv_9_j5V8SZiL7jZ6-4hHj6JfgyHHFVKwUqE_rbqyrARKonAR-ATbmuJk</recordid><startdate>20091201</startdate><enddate>20091201</enddate><creator>Oztekin, Asil</creator><creator>Delen, Dursun</creator><creator>Kong, Zhenyu (James)</creator><general>Elsevier Ireland Ltd</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7QO</scope><scope>8FD</scope><scope>FR3</scope><scope>P64</scope><scope>7X8</scope></search><sort><creationdate>20091201</creationdate><title>Predicting the graft survival for heart–lung transplantation patients: An integrated data mining methodology</title><author>Oztekin, Asil ; Delen, Dursun ; Kong, Zhenyu (James)</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c455t-2878875be06cafc505462bac300996c3da28e81c15301cdb84d1c4c505e0a9983</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2009</creationdate><topic>Artificial Intelligence</topic><topic>Buffers</topic><topic>Classification</topic><topic>Combined heart–lung transplantation</topic><topic>Cox proportional hazards models</topic><topic>Data Mining</topic><topic>Decision Trees</topic><topic>Graft Survival - 
physiology</topic><topic>Heart-Lung Transplantation</topic><topic>Humans</topic><topic>Internal Medicine</topic><topic>Middle Aged</topic><topic>Models, Theoretical</topic><topic>Other</topic><topic>Prognosis</topic><topic>Survival analysis</topic><topic>Survival Rate</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Oztekin, Asil</creatorcontrib><creatorcontrib>Delen, Dursun</creatorcontrib><creatorcontrib>Kong, Zhenyu (James)</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Biotechnology Research Abstracts</collection><collection>Technology Research Database</collection><collection>Engineering Research Database</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>MEDLINE - Academic</collection><jtitle>International journal of medical informatics (Shannon, Ireland)</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Oztekin, Asil</au><au>Delen, Dursun</au><au>Kong, Zhenyu (James)</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Predicting the graft survival for heart–lung transplantation patients: An integrated data mining methodology</atitle><jtitle>International journal of medical informatics (Shannon, Ireland)</jtitle><addtitle>Int J Med Inform</addtitle><date>2009-12-01</date><risdate>2009</risdate><volume>78</volume><issue>12</issue><spage>e84</spage><epage>e96</epage><pages>e84-e96</pages><issn>1386-5056</issn><eissn>1872-8243</eissn><abstract>Abstract Background Predicting the survival of heart–lung transplant patients has the potential to play a critical role in understanding and improving the matching procedure between the recipient and graft. 
Although voluminous data related to the transplantation procedures is being collected and stored, only a small subset of the predictive factors has been used in modeling heart–lung transplantation outcomes. The previous studies have mainly focused on applying statistical techniques to a small set of factors selected by the domain-experts in order to reveal the simple linear relationships between the factors and survival. The collection of methods known as ‘data mining’ offers significant advantages over conventional statistical techniques in dealing with the latter's limitations such as normality assumption of observations, independence of observations from each other, and linearity of the relationship between the observations and the output measure(s). There are statistical methods that overcome these limitations. Yet, they are computationally more expensive and do not provide fast and flexible solutions as do data mining techniques in large datasets. Purpose The main objective of this study is to improve the prediction of outcomes following combined heart–lung transplantation by proposing an integrated data-mining methodology. Methods A large and feature-rich dataset (16,604 cases with 283 variables) is used to (1) develop machine learning based predictive models and (2) extract the most important predictive factors. Then, using three different variable selection methods, namely, (i) machine learning methods driven variables—using decision trees, neural networks, logistic regression, (ii) the literature review-based expert-defined variables, and (iii) common sense-based interaction variables, a consolidated set of factors is generated and used to develop Cox regression models for heart–lung graft survival. Results The predictive models’ performance in terms of 10-fold cross-validation accuracy rates for two multi-imputed datasets ranged from 79% to 86% for neural networks, from 78% to 86% for logistic regression, and from 71% to 79% for decision trees. 
The results indicate that the proposed integrated data mining methodology using Cox hazard models better predicted the graft survival with different variables than the conventional approaches commonly used in the literature. This result is validated by the comparison of the corresponding Gains charts for our proposed methodology and the literature review based Cox results, and by the comparison of Akaike information criteria (AIC) values received from each. Conclusions Data mining-based methodology proposed in this study reveals that there are undiscovered relationships (i.e. interactions of the existing variables) among the survival-related variables, which helps better predict the survival of the heart–lung transplants. It also brings a different set of variables into the scene to be evaluated by the domain-experts and be considered prior to the organ transplantation.</abstract><cop>Ireland</cop><pub>Elsevier Ireland Ltd</pub><pmid>19497782</pmid><doi>10.1016/j.ijmedinf.2009.04.007</doi></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1386-5056 |
ispartof | International journal of medical informatics (Shannon, Ireland), 2009-12, Vol.78 (12), p.e84-e96 |
issn | 1386-5056; 1872-8243 |
language | eng |
recordid | cdi_proquest_miscellaneous_734124995 |
source | MEDLINE; Elsevier ScienceDirect Journals |
subjects | Artificial Intelligence; Buffers; Classification; Combined heart–lung transplantation; Cox proportional hazards models; Data Mining; Decision Trees; Graft Survival - physiology; Heart-Lung Transplantation; Humans; Internal Medicine; Middle Aged; Models, Theoretical; Other; Prognosis; Survival analysis; Survival Rate |
title | Predicting the graft survival for heart–lung transplantation patients: An integrated data mining methodology |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-05T01%3A36%3A46IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Predicting%20the%20graft%20survival%20for%20heart%E2%80%93lung%20transplantation%20patients:%20An%20integrated%20data%20mining%20methodology&rft.jtitle=International%20journal%20of%20medical%20informatics%20(Shannon,%20Ireland)&rft.au=Oztekin,%20Asil&rft.date=2009-12-01&rft.volume=78&rft.issue=12&rft.spage=e84&rft.epage=e96&rft.pages=e84-e96&rft.issn=1386-5056&rft.eissn=1872-8243&rft_id=info:doi/10.1016/j.ijmedinf.2009.04.007&rft_dat=%3Cproquest_cross%3E734124995%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1014104094&rft_id=info:pmid/19497782&rft_els_id=S1386505609000707&rfr_iscdi=true |
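
The abstract mentions comparing Akaike information criterion (AIC) values between two Cox models. As a rough illustration of what such a comparison involves, the sketch below uses the standard definition AIC = 2k − 2 ln(L), where k is the number of fitted parameters and L is the maximized likelihood (for a Cox model, typically the partial likelihood). The helper names `aic` and `prefer_lower_aic`, the model labels, and all numeric values are hypothetical placeholders; none of them come from the article.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: AIC = 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def prefer_lower_aic(models: dict[str, tuple[float, int]]) -> str:
    """Return the name of the model with the lowest AIC.

    `models` maps a model name to (log-likelihood, number of parameters).
    """
    scores = {name: aic(ll, k) for name, (ll, k) in models.items()}
    return min(scores, key=scores.get)


# Placeholder values for illustration only (NOT results from the article).
candidates = {
    "model_a": (-1250.0, 8),   # fewer variables, lower log-likelihood
    "model_b": (-1235.0, 12),  # more variables, higher log-likelihood
}

best = prefer_lower_aic(candidates)
print(best, {name: aic(ll, k) for name, (ll, k) in candidates.items()})
```

A lower AIC indicates a better trade-off between fit and model size, so a model with more variables is preferred only if its gain in log-likelihood outweighs the penalty of 2 per added parameter.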