A Naive Bayes machine learning approach to risk prediction using censored, time-to-event data

Bibliographic Details
Published in:	Statistics in medicine 2015-09, Vol.34 (21), p.2941-2957
Main authors:	Wolfson, Julian; Bandyopadhyay, Sunayan; Elidrisi, Mohamed; Vazquez-Benitez, Gabriela; Vock, David M.; Musgrove, Donald; Adomavicius, Gediminas; Johnson, Paul E.; O'Connor, Patrick J.
Format:	Article
Language:	English
Subjects:	Artificial intelligence; Bayes Theorem; Bayesian analysis; Biometry - methods; Cardiovascular Diseases - epidemiology; Clinical outcomes; Computer Simulation; Databases, Factual; Delivery of Health Care, Integrated; Electronic Health Records; Health risk assessment; Humans; Longitudinal Studies; Machine Learning; Mathematical models; Midwestern United States - epidemiology; Naive Bayes; Predictions; Proportional Hazards Models; Risk; Risk Assessment - methods; risk prediction; Space-Time Clustering; survival analysis
Online access:	Full text
Description
Abstract:	Predicting an individual's risk of experiencing a future clinical outcome is a statistical task with important consequences for both practicing clinicians and public health experts. Modern observational databases such as electronic health records provide an alternative to the longitudinal cohort studies traditionally used to construct risk models, bringing with them both opportunities and challenges. Large sample sizes and detailed covariate histories enable the use of sophisticated machine learning techniques to uncover complex associations and interactions, but observational databases are often ‘messy’, with high levels of missing data and incomplete patient follow‐up. In this paper, we propose an adaptation of the well‐known Naive Bayes machine learning approach to time‐to‐event outcomes subject to censoring. We compare the predictive performance of our method with the Cox proportional hazards model which is commonly used for risk prediction in healthcare populations, and illustrate its application to prediction of cardiovascular risk using an electronic health record dataset from a large Midwest integrated healthcare system. Copyright © 2015 John Wiley & Sons, Ltd.
ISSN:	0277-6715; 1097-0258
DOI:	10.1002/sim.6526