OWL: an optimized and independently validated machine learning prediction model for lung cancer screening based on the UK Biobank, PLCO, and NLST populations

Bibliographic details
Published in:	EBioMedicine 2023-02, Vol.88, p.104443
Main authors:	Zoucheng Pan, Ruyang Zhang, Sipeng Shen, Yunzhi Lin, Longyao Zhang, Xiang Wang, Qian Ye, Xuan Wang, Jiajin Chen, Yang Zhao, David C. Christiani, Yi Li, Feng Chen, Yongyue Wei
Format:	Article
Language:	English
Subjects:	External validation; Lung cancer; Machine learning; Risk prediction; UK Biobank
Online access:	Full text

Description
Summary:	Background: A reliable risk prediction model is critically important for identifying individuals at high risk of developing lung cancer as candidates for low-dose chest computed tomography (LDCT) screening. Leveraging a cutting-edge machine learning technique that accommodates a wide list of questionnaire-based predictors, we sought to optimize and validate a lung cancer prediction model. Methods: We developed an Optimized early Warning model for Lung cancer risk (OWL) using the XGBoost algorithm with 323,344 participants from the England area in UK Biobank (training set), and independently validated it with 93,227 participants from the UKB Scotland and Wales areas (validation set 1), as well as 70,605 and 66,231 participants in the Prostate, Lung, Colorectal, and Ovarian cancer screening trial (PLCO) control and intervention subpopulations, respectively (validation sets 2 & 3), and 23,138 and 18,669 participants in the United States National Lung Screening Trial (NLST) control and intervention subpopulations, respectively (validation sets 4 & 5). By comparing with three competitive prediction models, i.e., PLCO modified 2012 (PLCOm2012), PLCO modified 2014 (PLCOall2014), and the Liverpool Lung cancer Project risk model version 3 (LLPv3), we assessed the discrimination of OWL by the area under the receiver operating characteristic curve (AUC) at the designed time point. We further evaluated the calibration using relative improvement in the ratio of expected to observed lung cancer cases (RIEO), and illustrated the clinical utility by decision curve analysis. Findings: For the general population, with validation set 1, OWL (AUC = 0.855, 95% CI: 0.829–0.880) presented a better discriminative capability than PLCOall2014 (AUC = 0.821, 95% CI: 0.794–0.848) (p
ISSN:	2352-3964
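
The summary names three kinds of evaluation: discrimination (AUC), calibration (the ratio of expected to observed cases), and clinical utility (decision curve analysis). The following is a minimal sketch of the basic quantities behind each, not the authors' implementation. It assumes a binary outcome at a fixed horizon with no censoring, whereas the study reports AUC at a designed time point. The function names and the toy `risk` and `outcome` values are hypothetical, and the RIEO statistic itself is not reproduced because the summary does not define it.

```python
def auc(scores, labels):
    """Share of (case, non-case) pairs in which the case has the higher
    predicted risk; ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def expected_observed_ratio(scores, labels):
    """Expected cases (sum of predicted risks) divided by observed cases.
    A value of 1.0 means the totals agree."""
    return sum(scores) / sum(labels)


def net_benefit(scores, labels, threshold):
    """Net benefit at one risk threshold, the quantity plotted in a
    decision curve: TP/n - FP/n * threshold / (1 - threshold)."""
    n = len(labels)
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


# Hypothetical toy data: predicted risks and observed outcomes (1 = case).
risk = [0.02, 0.10, 0.05, 0.30, 0.01, 0.20]
outcome = [0, 1, 0, 1, 0, 0]

print(round(auc(risk, outcome), 3))
print(round(expected_observed_ratio(risk, outcome), 3))
print(round(net_benefit(risk, outcome, 0.10), 3))
```

A decision curve is obtained by evaluating `net_benefit` over a range of thresholds and comparing the result with the "screen everyone" and "screen no one" strategies.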