Development and validation of a machine learning-based diagnostic model for Parkinson's disease in community-dwelling populations: Evidence from the China health and retirement longitudinal study (CHARLS)

Parkinson's disease (PD) is a major neurodegenerative disorder in middle-aged and elderly people. There is a pressing need for effective predictive models, particularly in the Chinese population. Objective: This study aims to develop and validate a machine learning-based diagnostic model to identify...

Detailed description

Bibliographic details
Published in: Parkinsonism & related disorders 2025-01, Vol.130, p.107182, Article 107182
Main authors: Fan, Hongyang, Li, Sai, Guo, Xin, Chen, Min, Zhang, Honggao, Chen, Yingzhu
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page
container_issue
container_start_page 107182
container_title Parkinsonism & related disorders
container_volume 130
creator Fan, Hongyang
Li, Sai
Guo, Xin
Chen, Min
Zhang, Honggao
Chen, Yingzhu
description Parkinson's disease (PD) is a major neurodegenerative disorder in middle-aged and elderly people. There is a pressing need for effective predictive models, particularly in the Chinese population. Objective: This study aims to develop and validate a machine learning-based diagnostic model to identify individuals with PD in community-dwelling populations using data from the China Health and Retirement Longitudinal Study (CHARLS). We utilized data from 19,134 individuals aged 45 and above from the CHARLS dataset, with 265 adults reported to have PD. The external validation cohort included 1,500 individuals, with 21 (1.4%) having PD. The random forest (RF) algorithm was used to develop an interpretable PD prediction model, which was internally validated using 10-fold cross-validation and externally validated with a dataset from Northern Jiangsu People's Hospital. SHapley Additive exPlanation (SHAP) values were employed to elucidate the model's predictions. The RF model demonstrated robust performance with an Area Under the Curve (AUC) of 0.884 and high sensitivity, specificity, and F1 scores. In the external validation cohort, the model achieved an AUC of 0.82 and an accuracy of 0.99. The model's performance remained consistent across internal and external validation cohorts. SHAP analysis provided insights into the importance and interaction of various predictors, enhancing model interpretability. The study presents a highly accurate and interpretable machine learning-based diagnostic model to identify individuals with PD in middle-aged and older Chinese adults. By combining predictive risk factors with chronic disease information, the model offers valuable insights for early identification and intervention, potentially mitigating PD progression.
• Developed a machine learning model for PD in China using random forest.
• Used SHAP values to explain the importance of predictors in the model, enhancing interpretability.
• Showcased the value of lifestyle and disease data in PD diagnostic models for early detection.
• Created a user-friendly app based on the model for clinical use and acceptance.
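The cohort counts quoted in the description can be put into context with a little arithmetic. The sketch below is not the authors' analysis; it only reuses the four counts given in the abstract (19,134 and 265 for CHARLS, 1,500 and 21 for the external cohort). The function names, the "label everyone as non-PD" baseline, and the assumption that cases are spread evenly over the 10 folds are hypothetical and introduced here purely for illustration.

```python
# Counts quoted in the abstract; everything else in this sketch is illustrative.
CHARLS_TOTAL, CHARLS_PD = 19_134, 265
EXTERNAL_TOTAL, EXTERNAL_PD = 1_500, 21


def prevalence(cases: int, total: int) -> float:
    """Fraction of a cohort with reported PD."""
    return cases / total


def all_negative_accuracy(cases: int, total: int) -> float:
    """Accuracy of a hypothetical baseline that labels every person as non-PD."""
    return (total - cases) / total


def cases_per_fold(cases: int, folds: int = 10) -> tuple[int, int]:
    """Smallest and largest PD count per fold, assuming an even spread of cases.

    The abstract states 10-fold cross-validation but not how folds were drawn,
    so the even spread is an assumption.
    """
    base, extra = divmod(cases, folds)
    return base, base + (1 if extra else 0)


if __name__ == "__main__":
    print(f"CHARLS prevalence:            {prevalence(CHARLS_PD, CHARLS_TOTAL):.3%}")
    print(f"External prevalence:          {prevalence(EXTERNAL_PD, EXTERNAL_TOTAL):.3%}")
    print(f"All-negative accuracy (ext.): {all_negative_accuracy(EXTERNAL_PD, EXTERNAL_TOTAL):.3f}")
    print(f"PD cases per fold (CHARLS):   {cases_per_fold(CHARLS_PD)}")
```

With 21 cases among 1,500 people, the hypothetical all-negative baseline already reaches an accuracy of 0.986, which is why the AUC, sensitivity, and F1 figures carry more information about the model than the accuracy alone.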
doi_str_mv 10.1016/j.parkreldis.2024.107182
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_3128747048</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S1353802024011945</els_id><sourcerecordid>3128747048</sourcerecordid><originalsourceid>FETCH-LOGICAL-c249t-c754f43db94a7134e1f9b716f37ca2ff315c3b56b018433f9bdfdb75a8ed27c3</originalsourceid><addsrcrecordid>eNqFkc9uEzEQxlcIREvhFZBvlMOm_rMbb7iVUGilSK2gd8trjxMHr73Y3qC8Iw-FkxQ4cvLI85v5ZuarKkTwjGAyv9rORhm_R3DaphnFtCnfnHT0WXVOOs7qltD58xKzltUdpvisepXSFmPMW8xeVmds0VLKOn5e_foEO3BhHMBnJL1GO-msltkGj4JBEg1SbawH5EBGb_267mUCjbSVax9StgoNQYNDJkT0UIayPgX_LhUgQSGR9UiFYZi8zfta_wTnShM0hnFyR5X0Ad3srAavAJkYBpQ3gJZFUqINSJc3x6kiZBvhOKQLfm3zpAvhUCrBHl0ub6-_rr69f129MNIlePP0XlSPn28el7f16v7L3fJ6VSvaLHKteNuYhul-0UhOWAPELHpO5oZxJakxjLSK9e28x6RrGCtJbXTPW9mBplyxi-ry1HaM4ccEKYvBJlU2kx7ClAQjtOMNx01X0O6EqhhSimDEGO0g414QLA5Wiq34Z6U4WClOVpbSt08qUz-A_lv4x7sCfDwBUFbdWYgiKXu4oy6nUlnoYP-v8htparpB</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3128747048</pqid></control><display><type>article</type><title>Development and validation of a machine learning-based diagnostic model for Parkinson's disease in community-dwelling populations: Evidence from the China health and retirement longitudinal study (CHARLS)</title><source>ScienceDirect Journals (5 years ago - present)</source><creator>Fan, Hongyang ; Li, Sai ; Guo, Xin ; Chen, Min ; Zhang, Honggao ; Chen, Yingzhu</creator><creatorcontrib>Fan, Hongyang ; Li, Sai ; Guo, Xin ; Chen, Min ; Zhang, Honggao ; Chen, Yingzhu</creatorcontrib><description>Parkinson's disease (PD) is a major neurodegenerative disorder in Middle-aged and elderly people.There is a pressing need for effective predictive models, particularly in chinese population. 
Objective:This study aims to develop and validate a machine learning-based diagnostic model to identify individuals with PD in community-dwelling populations using data from the China Health and Retirement Longitudinal Study (CHARLS). We utilized data from 19,134 individuals aged 45 and above from the CHARLS dataset, with 265 adults reported to have PD. The external validation cohort included 1500 individuals, with 21 (1.4 %) having PD.The random forest (RF) algorithm was used to develop an interpretable PD prediction model, which was internally validated using 10-fold cross-validation and externally validated with a dataset from Northern Jiangsu People's Hospital. SHapley Additive exPlanation (SHAP) values were employed to elucidate the model's predictions. The RF model demonstrated robust performance with an Area Under the Curve (AUC) of 0.884 and high sensitivity, specificity, and F1 scores. The model's performance in external validation cohort, highlighting an AUC of 0.82 and an accuracy of 0.99. The model's performance remained consistent across internal and external validation cohorts. SHAP analysis provided insights into the importance and interaction of various predictors, enhancing model interpretability. The study presents a highly accurate and interpretable machine learning-based diagnostic model to identify individuals with PD in middle-aged and older Chinese adults. By combined with predictive risk factors and chronic disease information, the model offers valuable insights for early identification and intervention, potentially mitigating PD progression. 
•Developed a machine learning model for PD in China using random forest.•Used SHAP values to explain the importance of predictors in the model, enhancing interpretability.•Showcased the value of lifestyle and disease data in PD diagnostic models for early detection.•Created a user-friendly app based on the model for clinical use and acceptance.</description><identifier>ISSN: 1353-8020</identifier><identifier>ISSN: 1873-5126</identifier><identifier>EISSN: 1873-5126</identifier><identifier>DOI: 10.1016/j.parkreldis.2024.107182</identifier><identifier>PMID: 39522387</identifier><language>eng</language><publisher>England: Elsevier Ltd</publisher><subject>CHARLS ; Lifestyle factors ; Machine learning ; Parkinson's disease ; Predictive model ; SHAP analysis</subject><ispartof>Parkinsonism &amp; related disorders, 2025-01, Vol.130, p.107182, Article 107182</ispartof><rights>2024 Elsevier Ltd</rights><rights>Copyright © 2024 Elsevier Ltd. All rights reserved.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c249t-c754f43db94a7134e1f9b716f37ca2ff315c3b56b018433f9bdfdb75a8ed27c3</cites><orcidid>0000-0001-9269-1715</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://dx.doi.org/10.1016/j.parkreldis.2024.107182$$EHTML$$P50$$Gelsevier$$H</linktohtml><link.rule.ids>314,780,784,3548,27922,27923,45993</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/39522387$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Fan, Hongyang</creatorcontrib><creatorcontrib>Li, Sai</creatorcontrib><creatorcontrib>Guo, Xin</creatorcontrib><creatorcontrib>Chen, Min</creatorcontrib><creatorcontrib>Zhang, Honggao</creatorcontrib><creatorcontrib>Chen, Yingzhu</creatorcontrib><title>Development and validation of a machine learning-based 
diagnostic model for Parkinson's disease in community-dwelling populations: Evidence from the China health and retirement longitudinal study (CHARLS)</title><title>Parkinsonism &amp; related disorders</title><addtitle>Parkinsonism Relat Disord</addtitle><description>Parkinson's disease (PD) is a major neurodegenerative disorder in Middle-aged and elderly people.There is a pressing need for effective predictive models, particularly in chinese population. Objective:This study aims to develop and validate a machine learning-based diagnostic model to identify individuals with PD in community-dwelling populations using data from the China Health and Retirement Longitudinal Study (CHARLS). We utilized data from 19,134 individuals aged 45 and above from the CHARLS dataset, with 265 adults reported to have PD. The external validation cohort included 1500 individuals, with 21 (1.4 %) having PD.The random forest (RF) algorithm was used to develop an interpretable PD prediction model, which was internally validated using 10-fold cross-validation and externally validated with a dataset from Northern Jiangsu People's Hospital. SHapley Additive exPlanation (SHAP) values were employed to elucidate the model's predictions. The RF model demonstrated robust performance with an Area Under the Curve (AUC) of 0.884 and high sensitivity, specificity, and F1 scores. The model's performance in external validation cohort, highlighting an AUC of 0.82 and an accuracy of 0.99. The model's performance remained consistent across internal and external validation cohorts. SHAP analysis provided insights into the importance and interaction of various predictors, enhancing model interpretability. The study presents a highly accurate and interpretable machine learning-based diagnostic model to identify individuals with PD in middle-aged and older Chinese adults. 
By combined with predictive risk factors and chronic disease information, the model offers valuable insights for early identification and intervention, potentially mitigating PD progression. •Developed a machine learning model for PD in China using random forest.•Used SHAP values to explain the importance of predictors in the model, enhancing interpretability.•Showcased the value of lifestyle and disease data in PD diagnostic models for early detection.•Created a user-friendly app based on the model for clinical use and acceptance.</description><subject>CHARLS</subject><subject>Lifestyle factors</subject><subject>Machine learning</subject><subject>Parkinson's disease</subject><subject>Predictive model</subject><subject>SHAP analysis</subject><issn>1353-8020</issn><issn>1873-5126</issn><issn>1873-5126</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2025</creationdate><recordtype>article</recordtype><recordid>eNqFkc9uEzEQxlcIREvhFZBvlMOm_rMbb7iVUGilSK2gd8trjxMHr73Y3qC8Iw-FkxQ4cvLI85v5ZuarKkTwjGAyv9rORhm_R3DaphnFtCnfnHT0WXVOOs7qltD58xKzltUdpvisepXSFmPMW8xeVmds0VLKOn5e_foEO3BhHMBnJL1GO-msltkGj4JBEg1SbawH5EBGb_267mUCjbSVax9StgoNQYNDJkT0UIayPgX_LhUgQSGR9UiFYZi8zfta_wTnShM0hnFyR5X0Ad3srAavAJkYBpQ3gJZFUqINSJc3x6kiZBvhOKQLfm3zpAvhUCrBHl0ub6-_rr69f129MNIlePP0XlSPn28el7f16v7L3fJ6VSvaLHKteNuYhul-0UhOWAPELHpO5oZxJakxjLSK9e28x6RrGCtJbXTPW9mBplyxi-ry1HaM4ccEKYvBJlU2kx7ClAQjtOMNx01X0O6EqhhSimDEGO0g414QLA5Wiq34Z6U4WClOVpbSt08qUz-A_lv4x7sCfDwBUFbdWYgiKXu4oy6nUlnoYP-v8htparpB</recordid><startdate>202501</startdate><enddate>202501</enddate><creator>Fan, Hongyang</creator><creator>Li, Sai</creator><creator>Guo, Xin</creator><creator>Chen, Min</creator><creator>Zhang, Honggao</creator><creator>Chen, Yingzhu</creator><general>Elsevier Ltd</general><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><orcidid>https://orcid.org/0000-0001-9269-1715</orcidid></search><sort><creationdate>202501</creationdate><title>Development and validation 
of a machine learning-based diagnostic model for Parkinson's disease in community-dwelling populations: Evidence from the China health and retirement longitudinal study (CHARLS)</title><author>Fan, Hongyang ; Li, Sai ; Guo, Xin ; Chen, Min ; Zhang, Honggao ; Chen, Yingzhu</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c249t-c754f43db94a7134e1f9b716f37ca2ff315c3b56b018433f9bdfdb75a8ed27c3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2025</creationdate><topic>CHARLS</topic><topic>Lifestyle factors</topic><topic>Machine learning</topic><topic>Parkinson's disease</topic><topic>Predictive model</topic><topic>SHAP analysis</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Fan, Hongyang</creatorcontrib><creatorcontrib>Li, Sai</creatorcontrib><creatorcontrib>Guo, Xin</creatorcontrib><creatorcontrib>Chen, Min</creatorcontrib><creatorcontrib>Zhang, Honggao</creatorcontrib><creatorcontrib>Chen, Yingzhu</creatorcontrib><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><jtitle>Parkinsonism &amp; related disorders</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Fan, Hongyang</au><au>Li, Sai</au><au>Guo, Xin</au><au>Chen, Min</au><au>Zhang, Honggao</au><au>Chen, Yingzhu</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Development and validation of a machine learning-based diagnostic model for Parkinson's disease in community-dwelling populations: Evidence from the China health and retirement longitudinal study (CHARLS)</atitle><jtitle>Parkinsonism &amp; related disorders</jtitle><addtitle>Parkinsonism Relat 
Disord</addtitle><date>2025-01</date><risdate>2025</risdate><volume>130</volume><spage>107182</spage><pages>107182-</pages><artnum>107182</artnum><issn>1353-8020</issn><issn>1873-5126</issn><eissn>1873-5126</eissn><abstract>Parkinson's disease (PD) is a major neurodegenerative disorder in Middle-aged and elderly people.There is a pressing need for effective predictive models, particularly in chinese population. Objective:This study aims to develop and validate a machine learning-based diagnostic model to identify individuals with PD in community-dwelling populations using data from the China Health and Retirement Longitudinal Study (CHARLS). We utilized data from 19,134 individuals aged 45 and above from the CHARLS dataset, with 265 adults reported to have PD. The external validation cohort included 1500 individuals, with 21 (1.4 %) having PD.The random forest (RF) algorithm was used to develop an interpretable PD prediction model, which was internally validated using 10-fold cross-validation and externally validated with a dataset from Northern Jiangsu People's Hospital. SHapley Additive exPlanation (SHAP) values were employed to elucidate the model's predictions. The RF model demonstrated robust performance with an Area Under the Curve (AUC) of 0.884 and high sensitivity, specificity, and F1 scores. The model's performance in external validation cohort, highlighting an AUC of 0.82 and an accuracy of 0.99. The model's performance remained consistent across internal and external validation cohorts. SHAP analysis provided insights into the importance and interaction of various predictors, enhancing model interpretability. The study presents a highly accurate and interpretable machine learning-based diagnostic model to identify individuals with PD in middle-aged and older Chinese adults. 
By combined with predictive risk factors and chronic disease information, the model offers valuable insights for early identification and intervention, potentially mitigating PD progression. •Developed a machine learning model for PD in China using random forest.•Used SHAP values to explain the importance of predictors in the model, enhancing interpretability.•Showcased the value of lifestyle and disease data in PD diagnostic models for early detection.•Created a user-friendly app based on the model for clinical use and acceptance.</abstract><cop>England</cop><pub>Elsevier Ltd</pub><pmid>39522387</pmid><doi>10.1016/j.parkreldis.2024.107182</doi><orcidid>https://orcid.org/0000-0001-9269-1715</orcidid></addata></record>
fulltext fulltext
identifier ISSN: 1353-8020
ispartof Parkinsonism & related disorders, 2025-01, Vol.130, p.107182, Article 107182
issn 1353-8020
1873-5126
language eng
recordid cdi_proquest_miscellaneous_3128747048
source ScienceDirect Journals (5 years ago - present)
subjects CHARLS
Lifestyle factors
Machine learning
Parkinson's disease
Predictive model
SHAP analysis
title Development and validation of a machine learning-based diagnostic model for Parkinson's disease in community-dwelling populations: Evidence from the China health and retirement longitudinal study (CHARLS)
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-10T09%3A00%3A22IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Development%20and%20validation%20of%20a%20machine%20learning-based%20diagnostic%20model%20for%20Parkinson's%20disease%20in%20community-dwelling%20populations:%20Evidence%20from%20the%20China%20health%20and%20retirement%20longitudinal%20study%20(CHARLS)&rft.jtitle=Parkinsonism%20&%20related%20disorders&rft.au=Fan,%20Hongyang&rft.date=2025-01&rft.volume=130&rft.spage=107182&rft.pages=107182-&rft.artnum=107182&rft.issn=1353-8020&rft.eissn=1873-5126&rft_id=info:doi/10.1016/j.parkreldis.2024.107182&rft_dat=%3Cproquest_cross%3E3128747048%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3128747048&rft_id=info:pmid/39522387&rft_els_id=S1353802024011945&rfr_iscdi=true