Landmark Models for Optimizing the Use of Repeated Measurements of Risk Factors in Electronic Health Records to Predict Future Disease Risk
Abstract The benefits of using electronic health records (EHRs) for disease risk screening and personalized health-care decisions are being increasingly recognized. Here we present a computationally feasible statistical approach with which to address the methodological challenges involved in utilizi...
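Published in: | American journal of epidemiology 2018-07, Vol.187 (7), p.1530-1538 |
---|---|
Main authors: | Paige, Ellie; Barrett, Jessica; Stevens, David; Keogh, Ruth H; Sweeting, Michael J; Nazareth, Irwin; Petersen, Irene; Wood, Angela M |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |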
container_end_page | 1538 |
---|---|
container_issue | 7 |
container_start_page | 1530 |
container_title | American journal of epidemiology |
container_volume | 187 |
creator | Paige, Ellie Barrett, Jessica Stevens, David Keogh, Ruth H Sweeting, Michael J Nazareth, Irwin Petersen, Irene Wood, Angela M |
description | Abstract
The benefits of using electronic health records (EHRs) for disease risk screening and personalized health-care decisions are being increasingly recognized. Here we present a computationally feasible statistical approach with which to address the methodological challenges involved in utilizing historical repeat measures of multiple risk factors recorded in EHRs to systematically identify patients at high risk of future disease. The approach is principally based on a 2-stage dynamic landmark model. The first stage estimates current risk factor values from all available historical repeat risk factor measurements via landmark-age–specific multivariate linear mixed-effects models with correlated random intercepts, which account for sporadically recorded repeat measures, unobserved data, and measurement errors. The second stage predicts future disease risk from a sex-stratified Cox proportional hazards model, with estimated current risk factor values from the first stage. We exemplify these methods by developing and validating a dynamic 10-year cardiovascular disease risk prediction model using primary-care EHRs for age, diabetes status, hypertension treatment, smoking status, systolic blood pressure, total cholesterol, and high-density lipoprotein cholesterol in 41,373 persons from 10 primary-care practices in England and Wales contributing to The Health Improvement Network (1997–2016). Using cross-validation, the model was well-calibrated (Brier score = 0.041, 95% confidence interval: 0.039, 0.042) and had good discrimination (C-index = 0.768, 95% confidence interval: 0.759, 0.777). |
doi_str_mv | 10.1093/aje/kwy018 |
format | Article |
fullrecord | <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_6030927</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><oup_id>10.1093/aje/kwy018</oup_id><sourcerecordid>2306242708</sourcerecordid><originalsourceid>FETCH-LOGICAL-c436t-ee233db663529fce7ab728d84cf62c9d9cb658744e4a890fb7b1909ab5480163</originalsourceid><addsrcrecordid>eNp9kcFu1DAQhi1ERZfChQdAllAlhBRqO45jX5BQ6bZIWxWhcrYcZ9L1bhKntlNUXoGXxu2WCjhwmsN8_mbGP0KvKHlPiSqPzAaOtt9vCZVP0ILyWhSCVeIpWhBCWKGYYPvoeYwbQihVFXmG9pmqJJeULdDPlRnbwYQtPvct9BF3PuCLKbnB_XDjFU5rwN8iYN_hrzCBSdDiczBxDjDAmOJ9w8UtXhqbfIjYjfikB5uCH53FZ2D6tM5PrQ9txMnjLwFaZxNezik78CcXsw3uHS_QXmf6CC8f6gG6XJ5cHp8Vq4vTz8cfV4XlpUgFACvLthGirJjqLNSmqZlsJbedYFa1yjaikjXnwI1UpGvqhiqiTFNxSagoD9CHnXaamwFam88IptdTcPkfbrU3Tv_dGd1aX_kbLUhJFKuz4O2DIPjrGWLSg4sW-t6M4OeoGaGKi8zezXrzD7rxcxjzdZqVRDDOaiIz9W5H2eBjDNA9LkOJvotY54j1LuIMv_5z_Uf0d6YZONwBfp7-J_oF1VCxWQ</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2306242708</pqid></control><display><type>article</type><title>Landmark Models for Optimizing the Use of Repeated Measurements of Risk Factors in Electronic Health Records to Predict Future Disease Risk</title><source>Oxford University Press Journals All Titles (1996-Current)</source><source>MEDLINE</source><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><source>Alma/SFX Local Collection</source><creator>Paige, Ellie ; Barrett, Jessica ; Stevens, David ; Keogh, Ruth H ; Sweeting, Michael J ; Nazareth, Irwin ; Petersen, Irene ; Wood, Angela M</creator><creatorcontrib>Paige, Ellie ; Barrett, Jessica ; Stevens, David ; Keogh, Ruth H ; Sweeting, Michael J ; Nazareth, Irwin ; Petersen, Irene ; Wood, Angela M</creatorcontrib><description>Abstract
The benefits of using electronic health records (EHRs) for disease risk screening and personalized health-care decisions are being increasingly recognized. Here we present a computationally feasible statistical approach with which to address the methodological challenges involved in utilizing historical repeat measures of multiple risk factors recorded in EHRs to systematically identify patients at high risk of future disease. The approach is principally based on a 2-stage dynamic landmark model. The first stage estimates current risk factor values from all available historical repeat risk factor measurements via landmark-age–specific multivariate linear mixed-effects models with correlated random intercepts, which account for sporadically recorded repeat measures, unobserved data, and measurement errors. The second stage predicts future disease risk from a sex-stratified Cox proportional hazards model, with estimated current risk factor values from the first stage. We exemplify these methods by developing and validating a dynamic 10-year cardiovascular disease risk prediction model using primary-care EHRs for age, diabetes status, hypertension treatment, smoking status, systolic blood pressure, total cholesterol, and high-density lipoprotein cholesterol in 41,373 persons from 10 primary-care practices in England and Wales contributing to The Health Improvement Network (1997–2016). 
Using cross-validation, the model was well-calibrated (Brier score = 0.041, 95% confidence interval: 0.039, 0.042) and had good discrimination (C-index = 0.768, 95% confidence interval: 0.759, 0.777).</description><identifier>ISSN: 0002-9262</identifier><identifier>ISSN: 1476-6256</identifier><identifier>EISSN: 1476-6256</identifier><identifier>DOI: 10.1093/aje/kwy018</identifier><identifier>PMID: 29584812</identifier><language>eng</language><publisher>United States: Oxford University Press</publisher><subject>Adult ; Blood pressure ; Calibration ; Cardiovascular diseases ; Cardiovascular Diseases - epidemiology ; Cardiovascular Diseases - etiology ; Cholesterol ; Confidence intervals ; Diabetes mellitus ; Disease Susceptibility - epidemiology ; Electronic health records ; Electronic Health Records - statistics & numerical data ; Electronic medical records ; England - epidemiology ; Feasibility Studies ; Female ; Forecasting - methods ; Health risk assessment ; Health risks ; Humans ; Hypertension ; Linear Models ; Male ; Middle Aged ; Multivariate Analysis ; Patient-Specific Modeling ; Practice of Epidemiology ; Prediction models ; Primary Health Care - statistics & numerical data ; Proportional Hazards Models ; Risk analysis ; Risk Assessment - methods ; Risk Factors ; Smoking ; Statistical analysis ; Statistical models ; Wales - epidemiology</subject><ispartof>American journal of epidemiology, 2018-07, Vol.187 (7), p.1530-1538</ispartof><rights>The Author(s) 2018. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. 2018</rights><rights>The Author(s) 2018. 
Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c436t-ee233db663529fce7ab728d84cf62c9d9cb658744e4a890fb7b1909ab5480163</citedby><cites>FETCH-LOGICAL-c436t-ee233db663529fce7ab728d84cf62c9d9cb658744e4a890fb7b1909ab5480163</cites><orcidid>0000-0003-0855-9872</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>230,314,776,780,881,1578,27901,27902</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/29584812$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Paige, Ellie</creatorcontrib><creatorcontrib>Barrett, Jessica</creatorcontrib><creatorcontrib>Stevens, David</creatorcontrib><creatorcontrib>Keogh, Ruth H</creatorcontrib><creatorcontrib>Sweeting, Michael J</creatorcontrib><creatorcontrib>Nazareth, Irwin</creatorcontrib><creatorcontrib>Petersen, Irene</creatorcontrib><creatorcontrib>Wood, Angela M</creatorcontrib><title>Landmark Models for Optimizing the Use of Repeated Measurements of Risk Factors in Electronic Health Records to Predict Future Disease Risk</title><title>American journal of epidemiology</title><addtitle>Am J Epidemiol</addtitle><description>Abstract
The benefits of using electronic health records (EHRs) for disease risk screening and personalized health-care decisions are being increasingly recognized. Here we present a computationally feasible statistical approach with which to address the methodological challenges involved in utilizing historical repeat measures of multiple risk factors recorded in EHRs to systematically identify patients at high risk of future disease. The approach is principally based on a 2-stage dynamic landmark model. The first stage estimates current risk factor values from all available historical repeat risk factor measurements via landmark-age–specific multivariate linear mixed-effects models with correlated random intercepts, which account for sporadically recorded repeat measures, unobserved data, and measurement errors. The second stage predicts future disease risk from a sex-stratified Cox proportional hazards model, with estimated current risk factor values from the first stage. We exemplify these methods by developing and validating a dynamic 10-year cardiovascular disease risk prediction model using primary-care EHRs for age, diabetes status, hypertension treatment, smoking status, systolic blood pressure, total cholesterol, and high-density lipoprotein cholesterol in 41,373 persons from 10 primary-care practices in England and Wales contributing to The Health Improvement Network (1997–2016). 
Using cross-validation, the model was well-calibrated (Brier score = 0.041, 95% confidence interval: 0.039, 0.042) and had good discrimination (C-index = 0.768, 95% confidence interval: 0.759, 0.777).</description><subject>Adult</subject><subject>Blood pressure</subject><subject>Calibration</subject><subject>Cardiovascular diseases</subject><subject>Cardiovascular Diseases - epidemiology</subject><subject>Cardiovascular Diseases - etiology</subject><subject>Cholesterol</subject><subject>Confidence intervals</subject><subject>Diabetes mellitus</subject><subject>Disease Susceptibility - epidemiology</subject><subject>Electronic health records</subject><subject>Electronic Health Records - statistics & numerical data</subject><subject>Electronic medical records</subject><subject>England - epidemiology</subject><subject>Feasibility Studies</subject><subject>Female</subject><subject>Forecasting - methods</subject><subject>Health risk assessment</subject><subject>Health risks</subject><subject>Humans</subject><subject>Hypertension</subject><subject>Linear Models</subject><subject>Male</subject><subject>Middle Aged</subject><subject>Multivariate Analysis</subject><subject>Patient-Specific Modeling</subject><subject>Practice of Epidemiology</subject><subject>Prediction models</subject><subject>Primary Health Care - statistics & numerical data</subject><subject>Proportional Hazards Models</subject><subject>Risk analysis</subject><subject>Risk Assessment - methods</subject><subject>Risk Factors</subject><subject>Smoking</subject><subject>Statistical analysis</subject><subject>Statistical models</subject><subject>Wales - 
epidemiology</subject><issn>0002-9262</issn><issn>1476-6256</issn><issn>1476-6256</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2018</creationdate><recordtype>article</recordtype><sourceid>TOX</sourceid><sourceid>EIF</sourceid><recordid>eNp9kcFu1DAQhi1ERZfChQdAllAlhBRqO45jX5BQ6bZIWxWhcrYcZ9L1bhKntlNUXoGXxu2WCjhwmsN8_mbGP0KvKHlPiSqPzAaOtt9vCZVP0ILyWhSCVeIpWhBCWKGYYPvoeYwbQihVFXmG9pmqJJeULdDPlRnbwYQtPvct9BF3PuCLKbnB_XDjFU5rwN8iYN_hrzCBSdDiczBxDjDAmOJ9w8UtXhqbfIjYjfikB5uCH53FZ2D6tM5PrQ9txMnjLwFaZxNezik78CcXsw3uHS_QXmf6CC8f6gG6XJ5cHp8Vq4vTz8cfV4XlpUgFACvLthGirJjqLNSmqZlsJbedYFa1yjaikjXnwI1UpGvqhiqiTFNxSagoD9CHnXaamwFam88IptdTcPkfbrU3Tv_dGd1aX_kbLUhJFKuz4O2DIPjrGWLSg4sW-t6M4OeoGaGKi8zezXrzD7rxcxjzdZqVRDDOaiIz9W5H2eBjDNA9LkOJvotY54j1LuIMv_5z_Uf0d6YZONwBfp7-J_oF1VCxWQ</recordid><startdate>20180701</startdate><enddate>20180701</enddate><creator>Paige, Ellie</creator><creator>Barrett, Jessica</creator><creator>Stevens, David</creator><creator>Keogh, Ruth H</creator><creator>Sweeting, Michael J</creator><creator>Nazareth, Irwin</creator><creator>Petersen, Irene</creator><creator>Wood, Angela M</creator><general>Oxford University Press</general><general>Oxford Publishing Limited (England)</general><scope>TOX</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7QP</scope><scope>7T2</scope><scope>7TK</scope><scope>7U7</scope><scope>7U9</scope><scope>C1K</scope><scope>H94</scope><scope>K9.</scope><scope>NAPCQ</scope><scope>7X8</scope><scope>5PM</scope><orcidid>https://orcid.org/0000-0003-0855-9872</orcidid></search><sort><creationdate>20180701</creationdate><title>Landmark Models for Optimizing the Use of Repeated Measurements of Risk Factors in Electronic Health Records to Predict Future Disease Risk</title><author>Paige, Ellie ; Barrett, Jessica ; Stevens, David ; Keogh, Ruth H ; Sweeting, Michael J ; Nazareth, Irwin ; Petersen, Irene ; 
Wood, Angela M</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c436t-ee233db663529fce7ab728d84cf62c9d9cb658744e4a890fb7b1909ab5480163</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2018</creationdate><topic>Adult</topic><topic>Blood pressure</topic><topic>Calibration</topic><topic>Cardiovascular diseases</topic><topic>Cardiovascular Diseases - epidemiology</topic><topic>Cardiovascular Diseases - etiology</topic><topic>Cholesterol</topic><topic>Confidence intervals</topic><topic>Diabetes mellitus</topic><topic>Disease Susceptibility - epidemiology</topic><topic>Electronic health records</topic><topic>Electronic Health Records - statistics & numerical data</topic><topic>Electronic medical records</topic><topic>England - epidemiology</topic><topic>Feasibility Studies</topic><topic>Female</topic><topic>Forecasting - methods</topic><topic>Health risk assessment</topic><topic>Health risks</topic><topic>Humans</topic><topic>Hypertension</topic><topic>Linear Models</topic><topic>Male</topic><topic>Middle Aged</topic><topic>Multivariate Analysis</topic><topic>Patient-Specific Modeling</topic><topic>Practice of Epidemiology</topic><topic>Prediction models</topic><topic>Primary Health Care - statistics & numerical data</topic><topic>Proportional Hazards Models</topic><topic>Risk analysis</topic><topic>Risk Assessment - methods</topic><topic>Risk Factors</topic><topic>Smoking</topic><topic>Statistical analysis</topic><topic>Statistical models</topic><topic>Wales - epidemiology</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Paige, Ellie</creatorcontrib><creatorcontrib>Barrett, Jessica</creatorcontrib><creatorcontrib>Stevens, David</creatorcontrib><creatorcontrib>Keogh, Ruth H</creatorcontrib><creatorcontrib>Sweeting, Michael J</creatorcontrib><creatorcontrib>Nazareth, Irwin</creatorcontrib><creatorcontrib>Petersen, 
Irene</creatorcontrib><creatorcontrib>Wood, Angela M</creatorcontrib><collection>Oxford Journals Open Access Collection</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Calcium & Calcified Tissue Abstracts</collection><collection>Health and Safety Science Abstracts (Full archive)</collection><collection>Neurosciences Abstracts</collection><collection>Toxicology Abstracts</collection><collection>Virology and AIDS Abstracts</collection><collection>Environmental Sciences and Pollution Management</collection><collection>AIDS and Cancer Research Abstracts</collection><collection>ProQuest Health & Medical Complete (Alumni)</collection><collection>Nursing & Allied Health Premium</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>American journal of epidemiology</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Paige, Ellie</au><au>Barrett, Jessica</au><au>Stevens, David</au><au>Keogh, Ruth H</au><au>Sweeting, Michael J</au><au>Nazareth, Irwin</au><au>Petersen, Irene</au><au>Wood, Angela M</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Landmark Models for Optimizing the Use of Repeated Measurements of Risk Factors in Electronic Health Records to Predict Future Disease Risk</atitle><jtitle>American journal of epidemiology</jtitle><addtitle>Am J Epidemiol</addtitle><date>2018-07-01</date><risdate>2018</risdate><volume>187</volume><issue>7</issue><spage>1530</spage><epage>1538</epage><pages>1530-1538</pages><issn>0002-9262</issn><issn>1476-6256</issn><eissn>1476-6256</eissn><abstract>Abstract
The benefits of using electronic health records (EHRs) for disease risk screening and personalized health-care decisions are being increasingly recognized. Here we present a computationally feasible statistical approach with which to address the methodological challenges involved in utilizing historical repeat measures of multiple risk factors recorded in EHRs to systematically identify patients at high risk of future disease. The approach is principally based on a 2-stage dynamic landmark model. The first stage estimates current risk factor values from all available historical repeat risk factor measurements via landmark-age–specific multivariate linear mixed-effects models with correlated random intercepts, which account for sporadically recorded repeat measures, unobserved data, and measurement errors. The second stage predicts future disease risk from a sex-stratified Cox proportional hazards model, with estimated current risk factor values from the first stage. We exemplify these methods by developing and validating a dynamic 10-year cardiovascular disease risk prediction model using primary-care EHRs for age, diabetes status, hypertension treatment, smoking status, systolic blood pressure, total cholesterol, and high-density lipoprotein cholesterol in 41,373 persons from 10 primary-care practices in England and Wales contributing to The Health Improvement Network (1997–2016). Using cross-validation, the model was well-calibrated (Brier score = 0.041, 95% confidence interval: 0.039, 0.042) and had good discrimination (C-index = 0.768, 95% confidence interval: 0.759, 0.777).</abstract><cop>United States</cop><pub>Oxford University Press</pub><pmid>29584812</pmid><doi>10.1093/aje/kwy018</doi><tpages>9</tpages><orcidid>https://orcid.org/0000-0003-0855-9872</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0002-9262 |
ispartof | American journal of epidemiology, 2018-07, Vol.187 (7), p.1530-1538 |
issn | 0002-9262 1476-6256 1476-6256 |
language | eng |
recordid | cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_6030927 |
source | Oxford University Press Journals All Titles (1996-Current); MEDLINE; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals; Alma/SFX Local Collection |
subjects | Adult Blood pressure Calibration Cardiovascular diseases Cardiovascular Diseases - epidemiology Cardiovascular Diseases - etiology Cholesterol Confidence intervals Diabetes mellitus Disease Susceptibility - epidemiology Electronic health records Electronic Health Records - statistics & numerical data Electronic medical records England - epidemiology Feasibility Studies Female Forecasting - methods Health risk assessment Health risks Humans Hypertension Linear Models Male Middle Aged Multivariate Analysis Patient-Specific Modeling Practice of Epidemiology Prediction models Primary Health Care - statistics & numerical data Proportional Hazards Models Risk analysis Risk Assessment - methods Risk Factors Smoking Statistical analysis Statistical models Wales - epidemiology |
title | Landmark Models for Optimizing the Use of Repeated Measurements of Risk Factors in Electronic Health Records to Predict Future Disease Risk |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-03T20%3A03%3A00IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Landmark%20Models%20for%20Optimizing%20the%20Use%20of%20Repeated%20Measurements%20of%20Risk%20Factors%20in%20Electronic%20Health%20Records%20to%20Predict%20Future%20Disease%20Risk&rft.jtitle=American%20journal%20of%20epidemiology&rft.au=Paige,%20Ellie&rft.date=2018-07-01&rft.volume=187&rft.issue=7&rft.spage=1530&rft.epage=1538&rft.pages=1530-1538&rft.issn=0002-9262&rft.eissn=1476-6256&rft_id=info:doi/10.1093/aje/kwy018&rft_dat=%3Cproquest_pubme%3E2306242708%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2306242708&rft_id=info:pmid/29584812&rft_oup_id=10.1093/aje/kwy018&rfr_iscdi=true |