507-P: Influence of Past Information to Precision of Diabetic Nephropathy Aggravation Prediction

Background: In the presence of large data set of electronic health records (EHRs), predicting the future disease status is of importance for decision making in the medical treatments. Using modern machine learning techniques, it is generally becoming easier to build complex models to predict the fut...

Bibliographic details
Published in: Diabetes (New York, N.Y.), 2019-06, Vol.68 (Supplement_1)
Main authors: KOSEKI, AKIRA, ONO, MASAKI, KUDO, MICHIHARU, HAIDA, KYOICHI, MAKINO, MASAKI, SUZUKI, ATSUSHI
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page
container_issue Supplement_1
container_start_page
container_title Diabetes (New York, N.Y.)
container_volume 68
creator KOSEKI, AKIRA
ONO, MASAKI
KUDO, MICHIHARU
HAIDA, KYOICHI
MAKINO, MASAKI
SUZUKI, ATSUSHI
description Background: With large sets of electronic health records (EHRs) available, predicting future disease status is important for decision making in medical treatment. Modern machine learning techniques are making it easier to build complex predictive models. These models use past information to construct explanatory variables; however, little is known about how far back data should be collected. In some cases very recent trends influence the future disease status, while in others old events are the important causes of a change in disease status. Our interest thus lies in how old the data must be to build good prediction models. Method: We discuss a set of machine learning algorithms that predict the future diabetic nephropathy stage from input variables collected over different time spans of past records. To compare performance, we used Logistic Regression, AdaBoost, Gradient Boosting, Decision Tree, Multi-layer Perceptron, and Random Forest. We prepared different sets of EHR variables covering the past 30, 60, 90, 180, 210, 240, 270, 300, 330, and 360 days, from which we extracted several longitudinal statistics as input variables. Using data from about 65 thousand type 2 diabetes patients, the models classify whether the nephropathy stage worsens or stays the same in 180 days. Results: For almost all algorithms, AUC improved as older data were included, and the 360-day data sets gave the best results. Among the algorithms, Gradient Boosting gave the best AUC of 0.77 with the 360-day data set. With the same 360-day data sets, Decision Tree gave the worst AUC of 0.61. Conclusion: With past data of up to 360 days, the data set reaching furthest back gave the best prediction performance.
Longitudinal statistics over a rather long span give good explanatory information for future nephropathy development.
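The abstract does not say which longitudinal statistics were extracted or how AUC was computed, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each patient has a list of dated lab values, summarizes the values that fall inside a look-back window ending at an index date, and scores predictions with a rank-based AUC. The function names `window_stats` and `auc`, the chosen statistics (count, mean, min, max, change), and the sample values are all hypothetical; only the window lengths come from the abstract.

```python
from datetime import date, timedelta
from statistics import mean

# Look-back window lengths in days, as listed in the abstract.
WINDOWS = [30, 60, 90, 180, 210, 240, 270, 300, 330, 360]


def window_stats(records, index_date, days):
    """Summarize (date, value) records from the `days` days before index_date.

    The statistics returned here are illustrative assumptions; the abstract
    only says "several longitudinal statistics".
    """
    start = index_date - timedelta(days=days)
    # Keep values measured after `start` and no later than the index date,
    # in chronological order.
    values = [v for d, v in sorted(records) if start < d <= index_date]
    if not values:
        return None  # no measurement inside this window
    return {
        "count": len(values),
        "mean": mean(values),
        "min": min(values),
        "max": max(values),
        "change": values[-1] - values[0],  # last value minus first value
    }


def auc(labels, scores):
    """Rank-based AUC: the fraction of (positive, negative) pairs in which
    the positive case has the higher score; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical lab values for one patient (not from the study).
    records = [
        (date(2018, 2, 1), 90.0),
        (date(2018, 9, 1), 80.0),
        (date(2018, 12, 15), 70.0),
    ]
    index_date = date(2019, 1, 1)
    for days in WINDOWS:
        print(days, window_stats(records, index_date, days))
```

In this toy input, the 30-day window sees a single measurement, while the 360-day window sees all three and therefore exposes the downward trend through `change`. That is the kind of extra information a longer look-back can carry, though the sketch makes no claim about the study's actual features.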
doi_str_mv 10.2337/db19-507-P
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2248399561</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2248399561</sourcerecordid><originalsourceid>FETCH-LOGICAL-c631-dd96e2586d073a4c3d1654e9f6015881990aba5e8bbfe1ed0f7c86602963adeb3</originalsourceid><addsrcrecordid>eNotkEtLAzEQgIMoWKsXf0HAmxDNo5vdeCu1aqHoHnrwFrPJpN3SbtZkK_Tfu2tlDvPgmxn4ELpl9IELkT-6iimS0ZyUZ2jElFBE8PzzHI0oZZywXOWX6CqlLaVU9jFCX3_wE140fneAxgIOHpcmdcMkxL3p6tDgLuAygq3T0PTAc20q6GqL36HdxNCabnPE0_U6mp_TQk-72g7lNbrwZpfg5j-P0eplvpq9keXH62I2XRIrBSPOKQk8K6SjuTATKxyT2QSUl5RlRcGUoqYyGRRV5YGBoz63hZSUKymMg0qM0d3pbBvD9wFSp7fhEJv-o-Z8UgilMsl66v5E2RhSiuB1G-u9iUfNqB4E6kGg7p3oUvwCokNjAg</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2248399561</pqid></control><display><type>article</type><title>507-P: Influence of Past Information to Precision of Diabetic Nephropathy Aggravation Prediction</title><source>EZB-FREE-00999 freely available EZB journals</source><source>PubMed Central</source><creator>KOSEKI, AKIRA ; ONO, MASAKI ; KUDO, MICHIHARU ; HAIDA, KYOICHI ; MAKINO, MASAKI ; SUZUKI, ATSUSHI</creator><creatorcontrib>KOSEKI, AKIRA ; ONO, MASAKI ; KUDO, MICHIHARU ; HAIDA, KYOICHI ; MAKINO, MASAKI ; SUZUKI, ATSUSHI</creatorcontrib><description>Background: In the presence of large data set of electronic health records (EHRs), predicting the future disease status is of importance for decision making in the medical treatments. Using modern machine learning techniques, it is generally becoming easier to build complex models to predict the future. For those models, a set of past information are used to make explanatory variables, however, we don't have enough knowledge as to how long we should collect data backward. 
In some cases, very late tendencies are influencing the future status of disease while in the other cases, old events were the importance causes of the change of the disease status. Our interest thus lies in how old data we have to process to make the good prediction models. Method: In this paper, we discuss a set of machine learning algorithms to predict the diabetic nephropathy stage in the future using sets of input variables which were collected from different time span of past records. To compare the performance of algorithms we used Logistic Regression, AdaBoost, Gradient Boosting, Decision tree, Multi-layer Perceptron, and Random Forest. We then provide different set of variables of EHR that include past 30-, 60-, 90-, 180-, 210-, 240-, 270-, 300-, 330-, and 360-day data sets, from which we extracted several longitudinal statistics for input variables. From about 65 thousand type 2 diabetes patients, the models classify whether the nephropathy stage gets aggravated or stay in 180 days. Results: For almost all algorithms, AUC is getting improved when using older data, and 360-day data sets gave the best. Among the algorithms, Gradient Boosting gave the best AUC of 0.77 when using 360-day data set. When using 360-day data sets, Decision Tree gave worst AUC of 0.61. Conclusion: We observed that when using to past data up to 360 days, the oldest data set gave the best prediction performance. 
Longitudinal statistics in rather long span gives good explanatory information for future nephropathy development.</description><identifier>ISSN: 0012-1797</identifier><identifier>EISSN: 1939-327X</identifier><identifier>DOI: 10.2337/db19-507-P</identifier><language>eng</language><publisher>New York: American Diabetes Association</publisher><subject>Algorithms ; Artificial intelligence ; Datasets ; Decision making ; Decision trees ; Diabetes ; Diabetes mellitus ; Diabetes mellitus (non-insulin dependent) ; Diabetic nephropathy ; Electronic health records ; Electronic medical records ; Kidney diseases ; Learning algorithms ; Machine learning ; Nephropathy ; Prediction models ; Statistics</subject><ispartof>Diabetes (New York, N.Y.), 2019-06, Vol.68 (Supplement_1)</ispartof><rights>Copyright American Diabetes Association Jun 1, 2019</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27924,27925</link.rule.ids></links><search><creatorcontrib>KOSEKI, AKIRA</creatorcontrib><creatorcontrib>ONO, MASAKI</creatorcontrib><creatorcontrib>KUDO, MICHIHARU</creatorcontrib><creatorcontrib>HAIDA, KYOICHI</creatorcontrib><creatorcontrib>MAKINO, MASAKI</creatorcontrib><creatorcontrib>SUZUKI, ATSUSHI</creatorcontrib><title>507-P: Influence of Past Information to Precision of Diabetic Nephropathy Aggravation Prediction</title><title>Diabetes (New York, N.Y.)</title><description>Background: In the presence of large data set of electronic health records (EHRs), predicting the future disease status is of importance for decision making in the medical treatments. Using modern machine learning techniques, it is generally becoming easier to build complex models to predict the future. 
For those models, a set of past information are used to make explanatory variables, however, we don't have enough knowledge as to how long we should collect data backward. In some cases, very late tendencies are influencing the future status of disease while in the other cases, old events were the importance causes of the change of the disease status. Our interest thus lies in how old data we have to process to make the good prediction models. Method: In this paper, we discuss a set of machine learning algorithms to predict the diabetic nephropathy stage in the future using sets of input variables which were collected from different time span of past records. To compare the performance of algorithms we used Logistic Regression, AdaBoost, Gradient Boosting, Decision tree, Multi-layer Perceptron, and Random Forest. We then provide different set of variables of EHR that include past 30-, 60-, 90-, 180-, 210-, 240-, 270-, 300-, 330-, and 360-day data sets, from which we extracted several longitudinal statistics for input variables. From about 65 thousand type 2 diabetes patients, the models classify whether the nephropathy stage gets aggravated or stay in 180 days. Results: For almost all algorithms, AUC is getting improved when using older data, and 360-day data sets gave the best. Among the algorithms, Gradient Boosting gave the best AUC of 0.77 when using 360-day data set. When using 360-day data sets, Decision Tree gave worst AUC of 0.61. Conclusion: We observed that when using to past data up to 360 days, the oldest data set gave the best prediction performance. 
Longitudinal statistics in rather long span gives good explanatory information for future nephropathy development.</description><subject>Algorithms</subject><subject>Artificial intelligence</subject><subject>Datasets</subject><subject>Decision making</subject><subject>Decision trees</subject><subject>Diabetes</subject><subject>Diabetes mellitus</subject><subject>Diabetes mellitus (non-insulin dependent)</subject><subject>Diabetic nephropathy</subject><subject>Electronic health records</subject><subject>Electronic medical records</subject><subject>Kidney diseases</subject><subject>Learning algorithms</subject><subject>Machine learning</subject><subject>Nephropathy</subject><subject>Prediction models</subject><subject>Statistics</subject><issn>0012-1797</issn><issn>1939-327X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2019</creationdate><recordtype>article</recordtype><recordid>eNotkEtLAzEQgIMoWKsXf0HAmxDNo5vdeCu1aqHoHnrwFrPJpN3SbtZkK_Tfu2tlDvPgmxn4ELpl9IELkT-6iimS0ZyUZ2jElFBE8PzzHI0oZZywXOWX6CqlLaVU9jFCX3_wE140fneAxgIOHpcmdcMkxL3p6tDgLuAygq3T0PTAc20q6GqL36HdxNCabnPE0_U6mp_TQk-72g7lNbrwZpfg5j-P0eplvpq9keXH62I2XRIrBSPOKQk8K6SjuTATKxyT2QSUl5RlRcGUoqYyGRRV5YGBoz63hZSUKymMg0qM0d3pbBvD9wFSp7fhEJv-o-Z8UgilMsl66v5E2RhSiuB1G-u9iUfNqB4E6kGg7p3oUvwCokNjAg</recordid><startdate>20190601</startdate><enddate>20190601</enddate><creator>KOSEKI, AKIRA</creator><creator>ONO, MASAKI</creator><creator>KUDO, MICHIHARU</creator><creator>HAIDA, KYOICHI</creator><creator>MAKINO, MASAKI</creator><creator>SUZUKI, ATSUSHI</creator><general>American Diabetes Association</general><scope>AAYXX</scope><scope>CITATION</scope><scope>K9.</scope><scope>NAPCQ</scope></search><sort><creationdate>20190601</creationdate><title>507-P: Influence of Past Information to Precision of Diabetic Nephropathy Aggravation Prediction</title><author>KOSEKI, AKIRA ; ONO, MASAKI ; KUDO, MICHIHARU ; HAIDA, KYOICHI ; MAKINO, MASAKI ; SUZUKI, 
ATSUSHI</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c631-dd96e2586d073a4c3d1654e9f6015881990aba5e8bbfe1ed0f7c86602963adeb3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2019</creationdate><topic>Algorithms</topic><topic>Artificial intelligence</topic><topic>Datasets</topic><topic>Decision making</topic><topic>Decision trees</topic><topic>Diabetes</topic><topic>Diabetes mellitus</topic><topic>Diabetes mellitus (non-insulin dependent)</topic><topic>Diabetic nephropathy</topic><topic>Electronic health records</topic><topic>Electronic medical records</topic><topic>Kidney diseases</topic><topic>Learning algorithms</topic><topic>Machine learning</topic><topic>Nephropathy</topic><topic>Prediction models</topic><topic>Statistics</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>KOSEKI, AKIRA</creatorcontrib><creatorcontrib>ONO, MASAKI</creatorcontrib><creatorcontrib>KUDO, MICHIHARU</creatorcontrib><creatorcontrib>HAIDA, KYOICHI</creatorcontrib><creatorcontrib>MAKINO, MASAKI</creatorcontrib><creatorcontrib>SUZUKI, ATSUSHI</creatorcontrib><collection>CrossRef</collection><collection>ProQuest Health &amp; Medical Complete (Alumni)</collection><collection>Nursing &amp; Allied Health Premium</collection><jtitle>Diabetes (New York, N.Y.)</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>KOSEKI, AKIRA</au><au>ONO, MASAKI</au><au>KUDO, MICHIHARU</au><au>HAIDA, KYOICHI</au><au>MAKINO, MASAKI</au><au>SUZUKI, ATSUSHI</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>507-P: Influence of Past Information to Precision of Diabetic Nephropathy Aggravation Prediction</atitle><jtitle>Diabetes (New York, 
N.Y.)</jtitle><date>2019-06-01</date><risdate>2019</risdate><volume>68</volume><issue>Supplement_1</issue><issn>0012-1797</issn><eissn>1939-327X</eissn><abstract>Background: In the presence of large data set of electronic health records (EHRs), predicting the future disease status is of importance for decision making in the medical treatments. Using modern machine learning techniques, it is generally becoming easier to build complex models to predict the future. For those models, a set of past information are used to make explanatory variables, however, we don't have enough knowledge as to how long we should collect data backward. In some cases, very late tendencies are influencing the future status of disease while in the other cases, old events were the importance causes of the change of the disease status. Our interest thus lies in how old data we have to process to make the good prediction models. Method: In this paper, we discuss a set of machine learning algorithms to predict the diabetic nephropathy stage in the future using sets of input variables which were collected from different time span of past records. To compare the performance of algorithms we used Logistic Regression, AdaBoost, Gradient Boosting, Decision tree, Multi-layer Perceptron, and Random Forest. We then provide different set of variables of EHR that include past 30-, 60-, 90-, 180-, 210-, 240-, 270-, 300-, 330-, and 360-day data sets, from which we extracted several longitudinal statistics for input variables. From about 65 thousand type 2 diabetes patients, the models classify whether the nephropathy stage gets aggravated or stay in 180 days. Results: For almost all algorithms, AUC is getting improved when using older data, and 360-day data sets gave the best. Among the algorithms, Gradient Boosting gave the best AUC of 0.77 when using 360-day data set. When using 360-day data sets, Decision Tree gave worst AUC of 0.61. 
Conclusion: We observed that when using to past data up to 360 days, the oldest data set gave the best prediction performance. Longitudinal statistics in rather long span gives good explanatory information for future nephropathy development.</abstract><cop>New York</cop><pub>American Diabetes Association</pub><doi>10.2337/db19-507-P</doi></addata></record>
fulltext fulltext
identifier ISSN: 0012-1797
ispartof Diabetes (New York, N.Y.), 2019-06, Vol.68 (Supplement_1)
issn 0012-1797
1939-327X
language eng
recordid cdi_proquest_journals_2248399561
source EZB-FREE-00999 freely available EZB journals; PubMed Central
subjects Algorithms
Artificial intelligence
Datasets
Decision making
Decision trees
Diabetes
Diabetes mellitus
Diabetes mellitus (non-insulin dependent)
Diabetic nephropathy
Electronic health records
Electronic medical records
Kidney diseases
Learning algorithms
Machine learning
Nephropathy
Prediction models
Statistics
title 507-P: Influence of Past Information to Precision of Diabetic Nephropathy Aggravation Prediction
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-22T02%3A06%3A35IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=507-P:%20Influence%20of%20Past%20Information%20to%20Precision%20of%20Diabetic%20Nephropathy%20Aggravation%20Prediction&rft.jtitle=Diabetes%20(New%20York,%20N.Y.)&rft.au=KOSEKI,%20AKIRA&rft.date=2019-06-01&rft.volume=68&rft.issue=Supplement_1&rft.issn=0012-1797&rft.eissn=1939-327X&rft_id=info:doi/10.2337/db19-507-P&rft_dat=%3Cproquest_cross%3E2248399561%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2248399561&rft_id=info:pmid/&rfr_iscdi=true