The Use of Machine Learning to Create a Risk Score to Predict Survival in Patients with Hepatocellular Carcinoma: A TCGA Cohort Analysis

Bibliographic Details
Published in: Canadian Journal of Gastroenterology and Hepatology 2021, Vol. 2021, p. 1-8, Article 5212953
Main authors: Tohme, Samer; Yazdani, Hamza O; Rahman, Amaan; Handu, Sanah; Khan, Sidrah; Wilson, Tanner; Geller, David A; Simmons, Richard L; Molinari, Michele; Kaltenmeier, Christof
Format: Article
Language: English
Description
Summary: Introduction. Hepatocellular carcinoma (HCC) accounts for approximately 90% of primary liver malignancies and is currently the fourth most common cause of cancer-related death worldwide. Due to varying underlying etiologies, the prognosis of HCC differs greatly among patients. It is important to develop ways to help stratify patients upon initial diagnosis to provide optimal treatment modalities and follow-up plans. The current study uses an Artificial Neural Network (ANN) and Classification Tree Analysis (CTA) to create a gene signature score that can help predict survival in patients with HCC. Methods. The Cancer Genome Atlas (TCGA-LIHC) was analyzed for differentially expressed genes. Clinicopathological data were obtained from cBioPortal. ANN analysis of the 75 most significant genes predicting disease-free survival (DFS) was performed. Next, CTA results were used to create the scoring system. Cox regression was performed to assess the prognostic value of the scoring system. Results. A total of 363 patients diagnosed with HCC were analyzed in this study. ANN provided 15 genes with normalized importance >50%. CTA resulted in a set of three genes (NRM, STAG3, and SNHG20). Patients were then divided into 4 groups based on the CTA tree cutoff values. The Kaplan–Meier analysis showed significantly reduced DFS in groups 1, 2, and 3 (median DFS: 29.7 months, 16.1 months, and 11.7 months, p 
ISSN: 2291-2789, 2291-2797
DOI: 10.1155/2021/5212953
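
Note on the Kaplan–Meier analysis mentioned in the summary: the record does not include the authors' code or the CTA cutoff values, so the following is only a minimal sketch of how a median disease-free survival per group could be estimated. The function names (`kaplan_meier`, `median_survival`) and the follow-up values are hypothetical toy data, not taken from the TCGA-LIHC cohort.

```python
from typing import List, Optional, Tuple


def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Return the (time, survival probability) steps of a Kaplan-Meier curve.

    times  : follow-up in months (hypothetical toy values, not study data)
    events : 1 = recurrence or death observed, 0 = censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        # Patients with an observed event at time t
        observed = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        # Everyone leaving the risk set at time t (events and censored)
        leaving = sum(1 for ti in times if ti == t)
        if observed:
            survival *= 1.0 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def median_survival(curve: List[Tuple[float, float]]) -> Optional[float]:
    """First time at which the estimated survival drops to 50% or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached during follow-up


# Toy example for one hypothetical risk group (assumed values)
toy_times = [5, 8, 12, 20, 30]
toy_events = [1, 1, 1, 0, 1]
toy_curve = kaplan_meier(toy_times, toy_events)
toy_median = median_survival(toy_curve)
```

In practice one such curve would be computed per risk group and the groups compared with a log-rank test or Cox regression, for which an established survival-analysis library is a safer choice than hand-written code.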