507-P: Influence of Past Information to Precision of Diabetic Nephropathy Aggravation Prediction
Background: With large sets of electronic health records (EHRs) available, predicting future disease status is important for decision making in medical treatment. Modern machine learning techniques are making it easier to build complex models to predict the fut...
Published in: | Diabetes (New York, N.Y.), 2019-06, Vol.68 (Supplement_1) |
---|---|
Main authors: | KOSEKI, AKIRA ; ONO, MASAKI ; KUDO, MICHIHARU ; HAIDA, KYOICHI ; MAKINO, MASAKI ; SUZUKI, ATSUSHI |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | |
---|---|
container_issue | Supplement_1 |
container_start_page | |
container_title | Diabetes (New York, N.Y.) |
container_volume | 68 |
creator | KOSEKI, AKIRA ; ONO, MASAKI ; KUDO, MICHIHARU ; HAIDA, KYOICHI ; MAKINO, MASAKI ; SUZUKI, ATSUSHI |
description | Background: With large sets of electronic health records (EHRs) available, predicting future disease status is important for decision making in medical treatment. Modern machine learning techniques are making it easier to build complex models that predict the future. Such models use past information to construct explanatory variables; however, we do not know enough about how far back data should be collected. In some cases, very recent trends influence the future status of a disease, while in other cases older events are the important causes of a change in disease status. Our interest thus lies in how far back data must be processed to build good prediction models.
Method: In this paper, we discuss a set of machine learning algorithms that predict the future diabetic nephropathy stage using sets of input variables collected from different time spans of past records. To compare algorithm performance, we used Logistic Regression, AdaBoost, Gradient Boosting, Decision Tree, Multi-layer Perceptron, and Random Forest. We then prepared different sets of EHR variables covering the past 30, 60, 90, 180, 210, 240, 270, 300, 330, and 360 days, from which we extracted several longitudinal statistics as input variables. Using data from about 65 thousand patients with type 2 diabetes, the models classify whether the nephropathy stage worsens or stays the same in 180 days.
Results: For almost all algorithms, AUC improved as older data were included, and the 360-day data sets gave the best results. Among the algorithms, Gradient Boosting gave the best AUC of 0.77 with the 360-day data set. With the same 360-day data sets, Decision Tree gave the worst AUC of 0.61.
Conclusion: We observed that, when using past data up to 360 days, the data set reaching furthest back gave the best prediction performance. Longitudinal statistics over a rather long span give good explanatory information for future nephropathy development. |
doi_str_mv | 10.2337/db19-507-P |
format | Article |
fullrecord | Publisher: New York: American Diabetes Association ; Date: 2019-06-01 ; EISSN: 1939-327X ; Rights: Copyright American Diabetes Association Jun 1, 2019 ; Peer reviewed |
fulltext | fulltext |
identifier | ISSN: 0012-1797 |
ispartof | Diabetes (New York, N.Y.), 2019-06, Vol.68 (Supplement_1) |
issn | 0012-1797 (ISSN) ; 1939-327X (EISSN) |
language | eng |
recordid | cdi_proquest_journals_2248399561 |
source | EZB-FREE-00999 freely available EZB journals; PubMed Central |
subjects | Algorithms ; Artificial intelligence ; Datasets ; Decision making ; Decision trees ; Diabetes ; Diabetes mellitus ; Diabetes mellitus (non-insulin dependent) ; Diabetic nephropathy ; Electronic health records ; Electronic medical records ; Kidney diseases ; Learning algorithms ; Machine learning ; Nephropathy ; Prediction models ; Statistics |
title | 507-P: Influence of Past Information to Precision of Diabetic Nephropathy Aggravation Prediction |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-22T02%3A06%3A35IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=507-P:%20Influence%20of%20Past%20Information%20to%20Precision%20of%20Diabetic%20Nephropathy%20Aggravation%20Prediction&rft.jtitle=Diabetes%20(New%20York,%20N.Y.)&rft.au=KOSEKI,%20AKIRA&rft.date=2019-06-01&rft.volume=68&rft.issue=Supplement_1&rft.issn=0012-1797&rft.eissn=1939-327X&rft_id=info:doi/10.2337/db19-507-P&rft_dat=%3Cproquest_cross%3E2248399561%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2248399561&rft_id=info:pmid/&rfr_iscdi=true |
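
The Method and Results paragraphs of the abstract describe two steps that a small example may make more concrete: summarising a patient's past records over look-back windows of different lengths, and scoring a classifier by AUC. The sketch below is illustrative only and is not the authors' implementation. The record layout (date, value pairs), the choice of statistics, and the names `window_stats` and `auc` are assumptions; the abstract does not say which variables or statistics were used.

```python
from datetime import date, timedelta
from statistics import mean


def window_stats(records, index_date, window_days):
    """Summarise measurements taken in the `window_days` before `index_date`.

    `records` is a hypothetical list of (measurement_date, value) pairs for
    one patient and one variable. The statistics chosen here are assumptions.
    """
    start = index_date - timedelta(days=window_days)
    values = [v for d, v in sorted(records) if start <= d < index_date]
    if not values:
        return None  # no measurement falls inside this window
    return {
        "n": len(values),
        "mean": mean(values),
        "min": min(values),
        "max": max(values),
        "last_minus_first": values[-1] - values[0],
    }


def auc(labels, scores):
    """Area under the ROC curve: the fraction of (positive, negative) pairs
    in which the positive case gets the higher score; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up measurements for one patient, for illustration only.
measurements = [
    (date(2018, 2, 1), 80.0),
    (date(2018, 9, 1), 70.0),
    (date(2018, 12, 15), 60.0),
]
index_date = date(2019, 1, 1)

for days in (30, 360):
    print(days, window_stats(measurements, index_date, days))

# Made-up labels (1 = stage aggravated) and classifier scores.
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

In this toy data the 30-day window sees a single measurement, so it carries no trend information, while the 360-day window sees all three and captures the decline. This matches the intuition behind the reported result that longer look-back windows gave better AUC, although the example proves nothing about the study itself.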