Unsupervised Learning Identifies Computed Tomographic Measurements as Primary Drivers of Progression, Exacerbation, and Mortality in Chronic Obstructive Pulmonary Disease

Bibliographic details
Published in: Annals of the American Thoracic Society 2022-12, Vol.19 (12), p.1993-2002
Main authors: Yuan, Nancy F, Hasenstab, Kyle, Retson, Tara, Conrad, Douglas J, Lynch, David A, Hsiao, Albert
Format: Article
Language: English
Description
Abstract: Chronic obstructive pulmonary disease (COPD) is a heterogeneous syndrome with phenotypic manifestations that tend to be distributed along a continuum. Unsupervised machine learning based on a broad selection of imaging and clinical phenotypes may be used to identify primary variables that define disease axes and stratify patients with COPD. The objectives were to identify primary variables driving COPD heterogeneity using principal component analysis, to define disease axes, and to assess the prognostic value of these axes across three outcomes: progression, exacerbation, and mortality.

We included 7,331 patients from the COPDGene (Genetic Epidemiology of COPD) phase I cohort (2008-2011), between 39 and 85 years old, of whom 40.3% were Black and 45.8% were female smokers with a mean of 44.6 pack-years. Out of a total of 916 phenotypes, 147 continuous clinical, spirometric, and computed tomography (CT) features were selected. For each principal component (PC), we computed a PC score based on feature weights. We used PC score distributions to define disease axes along which we divided the patients into quartiles. To assess the prognostic value of these axes, we applied logistic regression analyses to estimate 5-year (n = 4,159) and 10-year (n = 1,487) odds of progression. Cox regression and Kaplan-Meier analyses were performed to estimate 5-year and 10-year risk of exacerbation (n = 6,532) and all-cause mortality (n = 7,331).

The first PC, accounting for 43.7% of variance, was defined by CT measures of air trapping and emphysema. The second PC, accounting for 13.7% of variance, was defined by spirometric and CT measures of vital capacity and lung volume. The third PC, accounting for 7.9% of variance, was defined by CT measures of lung mass, airway thickening, and body habitus. Stratification of patients across each disease axis revealed up to 3.2-fold (95% confidence interval [CI] 2.4, 4.3) greater odds of 5-year progression, 5.4-fold (95% CI 4.6, 6.3) greater risk of 5-year exacerbation, and 5.0-fold (95% CI 4.2, 6.0) greater risk of 10-year mortality between the highest and lowest quartiles.

Unsupervised learning analysis of the COPDGene cohort reveals that CT measurements may bolster patient stratification along the continuum of COPD phenotypes. Each of the disease axes also individually demonstrates prognostic potential, predicting future decline in forced expiratory volume in 1 second, exacerbation, and mortality.
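
The analysis steps named in the abstract (PC score from feature weights, quartile stratification, comparison of highest and lowest quartiles) can be illustrated with a minimal sketch. This is not the authors' pipeline: the data are synthetic, the three features and all function names are hypothetical, the first PC is found by plain power iteration, and an unadjusted 2x2 odds ratio with a Wald interval stands in for the logistic regression reported in the study.

```python
import math
import random
import statistics


def standardize(rows):
    """Z-score each column so features on different scales are comparable."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    sds = [statistics.pstdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def first_pc_weights(z, iters=200):
    """Leading eigenvector of the correlation matrix via power iteration."""
    n, p = len(z), len(z[0])
    corr = [[sum(r[i] * r[j] for r in z) / n for j in range(p)] for i in range(p)]
    w = [1.0] * p
    for _ in range(iters):
        w = [sum(corr[i][j] * w[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        w = [x / norm for x in w]
    return w


def pc_scores(z, w):
    """PC score per patient: weighted sum of standardized features."""
    return [sum(a * b for a, b in zip(row, w)) for row in z]


def quartile_labels(scores):
    """Label each score 1 (lowest quartile) to 4 (highest quartile)."""
    q1, q2, q3 = statistics.quantiles(scores, n=4)
    return [1 + (s > q1) + (s > q2) + (s > q3) for s in scores]


def odds_ratio(events_hi, n_hi, events_lo, n_lo):
    """Unadjusted odds ratio with a Wald 95% CI from a 2x2 table."""
    a, b = events_hi, n_hi - events_hi
    c, d = events_lo, n_lo - events_lo
    estimate = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return estimate, estimate * math.exp(-1.96 * se), estimate * math.exp(1.96 * se)


# Synthetic cohort (assumption): one latent severity drives two features
# and the binary outcome; the third feature is unrelated noise.
random.seed(0)
rows, outcomes = [], []
for _ in range(2000):
    severity = random.gauss(0, 1)
    rows.append([
        severity + random.gauss(0, 0.5),  # hypothetical emphysema-like measure
        severity + random.gauss(0, 0.5),  # hypothetical air-trapping-like measure
        random.gauss(0, 1),               # unrelated feature
    ])
    p_event = 1 / (1 + math.exp(-(severity - 1)))
    outcomes.append(random.random() < p_event)

z = standardize(rows)
w = first_pc_weights(z)
scores = pc_scores(z, w)
quart = quartile_labels(scores)

top = [o for o, q in zip(outcomes, quart) if q == 4]
bottom = [o for o, q in zip(outcomes, quart) if q == 1]
or_est, ci_low, ci_high = odds_ratio(sum(top), len(top), sum(bottom), len(bottom))
print(f"Q4 vs Q1 odds ratio: {or_est:.2f} (95% CI {ci_low:.2f}, {ci_high:.2f})")
```

For time-to-event outcomes such as exacerbation and mortality, the same quartile labels would be passed to a survival model (Cox regression, Kaplan-Meier curves) rather than to a 2x2 table.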
ISSN: 2329-6933, 2325-6621
DOI: 10.1513/AnnalsATS.202110-1127OC