Predicting the risk of emergency admission with machine learning: Development and validation using linked electronic health records
| Published in: | PLoS medicine, 2018-11, Vol. 15 (11), p. e1002695 |
|---|---|
| Main authors: | , , , , , , , , |
| Format: | Article |
| Language: | English |
| Online access: | Full text |
Abstract:

Emergency admissions are a major source of healthcare spending. We aimed to derive, validate, and compare conventional and machine learning models for prediction of the first emergency admission. Machine learning methods are capable of capturing complex interactions that are likely to be present when predicting less specific outcomes, such as this one.

We used longitudinal data from linked electronic health records of 4.6 million patients aged 18-100 years from 389 practices across England between 1985 and 2015. The population was divided into a derivation cohort (80%, 3.75 million patients from 300 general practices) and a validation cohort (20%, 0.88 million patients from 89 general practices) drawn from geographically distinct regions with different risk levels. We first replicated a previously reported Cox proportional hazards (CPH) model for prediction of the risk of the first emergency admission up to 24 months after baseline. This reference model was then compared with 2 machine learning models, random forest (RF) and gradient boosting classifier (GBC). The initial set of predictors for all models included 43 variables, covering patient demographics, lifestyle factors, laboratory tests, currently prescribed medications, selected morbidities, and previous emergency admissions. We then added 13 more variables (marital status, prior general practice visits, and 11 additional morbidities) and also enriched all variables by incorporating temporal information wherever possible (e.g., time since first diagnosis). We also varied the prediction window to 12, 36, 48, and 60 months after baseline and compared model performance. For internal validation, we used 5-fold cross-validation.

When the initial set of variables was used, GBC outperformed RF and CPH, with an area under the receiver operating characteristic curve (AUC) of 0.779 (95% CI 0.777, 0.781), compared to 0.752 (95% CI 0.751, 0.753) and 0.740 (95% CI 0.739, 0.741), respectively. In external validation, we observed AUCs of 0.796, 0.736, and 0.736 for GBC, RF, and CPH, respectively. The addition of temporal information improved the AUC of all models. In internal validation, the AUC rose to 0.848 (95% CI 0.847, 0.849), 0.825 (95% CI 0.824, 0.826), and 0.805 (95% CI 0.804, 0.806) for GBC, RF, and CPH, respectively, while the AUC in external validation rose to 0.826, 0.810, and 0.788, respectively. This enhancement also resulted in robust predictions for longer time horizons, with AUC values remaining at …
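The abstract describes a three-way comparison: a CPH reference model against RF and GBC, assessed with 5-fold cross-validated AUC. The sketch below illustrates that evaluation loop in Python with scikit-learn and lifelines on synthetic stand-in data; the feature matrix, hyperparameters, and column names are assumptions for illustration, not the authors' published pipeline.

```python
# Minimal sketch of the model comparison, on synthetic stand-in data
# (the real study used 43 EHR-derived predictors); all names and
# hyperparameters here are illustrative assumptions.
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from lifelines import CoxPHFitter

rng = np.random.default_rng(0)
n, p = 5000, 43                      # patients x baseline predictors
X = rng.normal(size=(n, p))
y = rng.integers(0, 2, size=n)       # admission within the prediction window

# Machine learning models, evaluated with 5-fold cross-validated AUC
# (the study's internal-validation scheme).
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for name, model in [
    ("GBC", GradientBoostingClassifier(random_state=0)),
    ("RF", RandomForestClassifier(n_estimators=200, random_state=0)),
]:
    aucs = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
    print(f"{name}: mean AUC = {aucs.mean():.3f} (SD {aucs.std():.3f})")

# Reference CPH model: time to first emergency admission (in months),
# censored at the end of a 24-month prediction window.
df = pd.DataFrame(X[:, :5], columns=[f"x{i}" for i in range(5)])
df["duration"] = np.minimum(rng.exponential(30.0, size=n), 24.0)
df["event"] = (df["duration"] < 24.0).astype(int)
cph = CoxPHFitter().fit(df, duration_col="duration", event_col="event")
print("CPH concordance:", round(cph.concordance_index_, 3))
```

The temporal enrichment the authors mention (e.g., time since first diagnosis) can likewise be sketched, again under assumed column names, as a simple date-difference feature computed at baseline:

```python
# Hypothetical temporal enrichment: months since first diagnosis at baseline.
import pandas as pd

records = pd.DataFrame(
    {"first_dx_date": pd.to_datetime(["2010-03-01", "2012-07-15"])}
)
baseline = pd.Timestamp("2015-01-01")
records["months_since_first_dx"] = (
    (baseline - records["first_dx_date"]).dt.days / 30.44
)
```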
ISSN: 1549-1277 (print); 1549-1676 (electronic)
DOI: 10.1371/journal.pmed.1002695