Development of prediction models for liver metastasis in colorectal cancer based on machine learning: a population-level study

Bibliographic Details
Published in:	Translational cancer research 2024-11, Vol.13 (11), p.5943-5952
Main Authors:	Xing, Yuncan; Yu, Guanhua; Jiang, Zheng; Wang, Zheng
Format:	Article
Language:	eng
Subjects:	Original
Online Access:	Full text

Description
Summary:	Liver metastasis (LM) is of vital importance in treatment decisions for patients with colorectal cancer (CRC). The aim of our study was to develop and validate prediction models for LM in CRC using machine learning. We selected patients diagnosed with CRC from 2010 to 2015 from the Surveillance, Epidemiology, and End Results (SEER) database. Four machine-learning methods, eXtreme Gradient Boosting (XGB), decision tree (DT), random forest (RF), and support vector machine (SVM), were used to develop predictive models. Receiver operating characteristic (ROC) curves, decision curve analysis (DCA) curves, and calibration curves were used to evaluate model performance. The SHapley Additive exPlanations (SHAP) technique was used for visual analysis to aid interpretation of the model outputs. A total of 51,632 patients with CRC were selected from the SEER database. The ROC curves showed excellent accuracy for the machine-learning models. In both the training and validation cohorts, calibration curves for the likelihood of LM demonstrated a high degree of concordance between model prediction and actual observation. The DCA indicated that each machine-learning model yielded a net benefit relative to both the treat-none and treat-all strategies. Carcinoembryonic antigen (CEA) and N stage were identified as the most significant risk factors for LM based on the SHAP summary plots of the RF and XGB models. XGB and RF were the best-performing of the four algorithms, and CEA and N stage were identified as the most important risk factors for LM.
ISSN:	2218-676X 2219-6803
DOI:	10.21037/tcr-24-1194
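
The summary compares each model's net benefit in decision curve analysis with the treat-all and treat-none strategies. As a rough illustration of that comparison, the sketch below computes net benefit at a given risk threshold using the standard definition (true-positive rate minus false-positive rate weighted by the threshold odds). It is not the authors' code: the function names and the ten-patient toy data are hypothetical and were invented for this example only.

```python
from typing import Sequence


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float], threshold: float) -> float:
    """Net benefit of treating every patient whose predicted risk >= threshold.

    Standard decision-curve definition: TP/n - (FP/n) * pt / (1 - pt),
    where pt is the risk threshold.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    if n == 0 or n != len(y_prob):
        raise ValueError("y_true and y_prob must be non-empty and of equal length")
    # Count treated patients who do (TP) and do not (FP) have the outcome.
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    odds = threshold / (1.0 - threshold)
    return tp / n - (fp / n) * odds


def net_benefit_treat_all(y_true: Sequence[int], threshold: float) -> float:
    """Treat-all reference: every patient is handled as high risk."""
    return net_benefit(y_true, [1.0] * len(y_true), threshold)


# Treat-none reference: nobody is treated, so the net benefit is zero by definition.
NET_BENEFIT_TREAT_NONE = 0.0

# Hypothetical toy data (1 = liver metastasis present), NOT taken from SEER.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]
y_prob = [0.9, 0.7, 0.2, 0.1, 0.4, 0.05, 0.3, 0.15, 0.6, 0.35]

if __name__ == "__main__":
    for pt in (0.1, 0.3, 0.5):
        print(
            f"threshold={pt:.1f}  "
            f"model={net_benefit(y_true, y_prob, pt):.3f}  "
            f"treat-all={net_benefit_treat_all(y_true, pt):.3f}  "
            f"treat-none={NET_BENEFIT_TREAT_NONE:.3f}"
        )
```

A model is useful at a given threshold when its net benefit exceeds both reference values. Plotting these quantities over a range of thresholds gives the decision curve.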