Subject-level spinal osteoporotic fracture prediction combining deep learning vertebral outputs and limited demographic data

Summary Automated screening for vertebral fractures could improve outcomes. We achieved an AUC-ROC = 0.968 for the prediction of moderate to severe fracture using a GAM with age and three maximal vertebral body scores of fracture from a convolutional neural network. Maximal fracture scores resulted...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Archives of osteoporosis 2024-09, Vol.19 (1), p.87, Article 87
Hauptverfasser: Cross, Nathan M., Perry, Jessica, Dong, Qifei, Luo, Gang, Renslo, Jonathan, Chang, Brian C., Lane, Nancy E., Marshall, Lynn, Johnston, Sandra K., Haynor, David R., Jarvik, Jeffrey G., Heagerty, Patrick J.
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Summary Automated screening for vertebral fractures could improve outcomes. We achieved an AUC-ROC = 0.968 for the prediction of moderate to severe fracture using a GAM with age and three maximal vertebral body scores of fracture from a convolutional neural network. Maximal fracture scores resulted in a performant model for subject-level fracture prediction. Combining individual deep learning vertebral body fracture scores and demographic covariates for subject-level classification of osteoporotic fracture achieved excellent performance (AUC-ROC of 0.968) on a large dataset of radiographs with basic demographic data. Purpose Osteoporotic vertebral fractures are common and morbid. Automated opportunistic screening for incidental vertebral fractures from radiographs, the highest volume imaging modality, could improve osteoporosis detection and management. We consider how to form patient-level fracture predictions and summarization to guide management, using our previously developed vertebral fracture classifier on segmented radiographs from a prospective cohort study of US men (MrOS). We compare the performance of logistic regression (LR) and generalized additive models (GAM) with combinations of individual vertebral scores and basic demographic covariates. Methods Subject-level LR and GAM models were created retrospectively using all fracture predictions or summary variables such as order statistics, adjacent vertebral interactions, and demographic covariates (age, race/ethnicity). The classifier outputs for 8663 vertebrae from 1176 thoracic and lumbar radiographs in 669 subjects were divided by subject to perform stratified fivefold cross-validation. Models were assessed using multiple metrics, including receiver operating characteristic (ROC) and precision-recall (PR) curves. Results The best model (AUC-ROC = 0.968) was a GAM using the top three maximum vertebral fracture scores and age. Using top-ranked scores only, rather than all vertebral scores, improved performance for both model classes. Adding age, but not ethnicity, to the GAMs improved performance slightly. Conclusion Maximal vertebral fracture scores resulted in the highest-performing models. While combining multiple vertebral body predictions risks decreasing specificity, our results demonstrate that subject-level models maintain good predictive performance. Thresholding strategies can be used to control sensitivity and specificity as clinically appropriate.
ISSN:1862-3514
1862-3514
DOI:10.1007/s11657-024-01433-z