Development of a Predictive Model of Occult Cancer After a Venous Thromboembolism Event Using Machine Learning: The CLOVER Study

Bibliographic details
Published in: Medicina (Kaunas, Lithuania), 2024-12, Vol. 61 (1), p. 18
Main authors: Franco-Moreno, Anabel, Madroñal-Cerezo, Elena, de Ancos-Aracil, Cristina Lucía, Farfán-Sedano, Ana Isabel, Muñoz-Rivas, Nuria, Bascuñana Morejón-Girón, José, Ruiz-Giardín, José Manuel, Álvarez-Rodríguez, Federico, Prada-Alonso, Jesús, Gala-García, Yvonne, Casado-Suela, Miguel Ángel, Bustamante-Fermosel, Ana, Alfaro-Fernández, Nuria, Torres-Macho, Juan
Format: Article
Language: English
Online access: Full text
Description
Summary: Venous thromboembolism (VTE) can be the first manifestation of an underlying cancer. This study aimed to use machine learning (ML) to develop a predictive model of the risk of occult cancer between 30 days and 24 months after a venous thrombotic event. We designed a case-control study nested in a cohort of patients with VTE included in a prospective registry from two Spanish hospitals between 2005 and 2021. Both clinically driven and ML-driven feature selection were performed to identify predictors of occult cancer. XGBoost, LightGBM, and CatBoost algorithms were used to train different prediction models, which were subsequently validated in a hold-out dataset. A total of 815 patients with VTE were included (51.5% male; median age 59 years). During follow-up, 56 patients (6.9%) were diagnosed with cancer. One hundred and twenty-one variables were explored for the predictive analysis. Among the ML models analyzed, CatBoost obtained the best performance metrics. The top 15 variables in the final CatBoost model for predicting hidden malignancy were age, gender, systolic blood pressure, heart rate, weight, chronic lung disease, D-dimer, alanine aminotransferase, hemoglobin, serum creatinine, cholesterol, platelets, triglycerides, leukocyte count, and previous VTE. The model had an ROC-AUC of 0.86 (95% CI, 0.83-0.87) in the test set. Sensitivity, specificity, negative predictive value, and positive predictive value were 62%, 94%, 93%, and 75%, respectively. This is the first risk score developed with ML tools for identifying patients with VTE who are at increased risk of occult cancer, and it obtained a remarkably high diagnostic accuracy. The study's limitations include potential information bias from electronic health records and a small cancer sample size. In addition, variability in detection protocols and evolving clinical practices may affect model accuracy. The score needs external validation.
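
Note on the reported metrics: the summary gives sensitivity, specificity, and negative and positive predictive values without stating the underlying confusion matrix. The following is a minimal Python sketch of how these four quantities are conventionally derived from true/false positive and negative counts. The function name `diagnostic_metrics` and the example counts are hypothetical and chosen only for illustration; they are not taken from the study.

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard diagnostic accuracy measures from confusion-matrix counts.

    tp/fn: cases (cancer) correctly / incorrectly classified
    tn/fp: controls (no cancer) correctly / incorrectly classified
    """
    return {
        # Share of true cases that the model flags
        "sensitivity": tp / (tp + fn),
        # Share of true non-cases that the model clears
        "specificity": tn / (tn + fp),
        # Share of positive predictions that are true cases
        "ppv": tp / (tp + fp),
        # Share of negative predictions that are true non-cases
        "npv": tn / (tn + fn),
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not the study's data)
    metrics = diagnostic_metrics(tp=8, fp=10, tn=80, fn=2)
    for name, value in metrics.items():
        print(f"{name}: {value:.2f}")
```

Predictive values depend on the proportion of cases in the evaluated sample, so the same sensitivity and specificity can yield different PPV and NPV in a population with a different cancer prevalence.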
ISSN: 1648-9144, 1010-660X
DOI: 10.3390/medicina61010018