Prediction of HER2 status via random forest in 3257 Chinese patients with gastric cancer

Bibliographic details
Published in: Clinical and experimental medicine 2023-12, Vol.23 (8), p.5015-5024
Main authors: Tian, Shan; Yu, Rong; Zhou, Fangfang; Zhan, Na; Li, Jiao; Wang, Xia; Peng, Xiulan
Format: Article
Language: English
Description
Summary: The accurate evaluation of human epidermal growth factor receptor 2 (HER2) is crucial for successful trastuzumab-based therapy in individuals with gastric cancer (GC). The present study, involving a retrospective cohort (N = 2865) from Wuhan Union Hospital and a prospective cohort (N = 392) from Renmin Hospital of Wuhan University, evaluated the benefits of clinical features using random forest and logistic regression models for the detection of HER2 status in patients with GC. Patients from the Union cohort were randomly assigned to either a training (N = 2005) or an internal validation (N = 860) group. Data processing and feature selection were done in Python, which was also used to build random forest and logistic regression models for the prediction of HER2 overexpression. The Renmin cohort (N = 392) was used as the external validation group. Ten features were closely correlated with HER2 overexpression: age, albumin/globulin ratio, globulin, activated partial thromboplastin time, tumor stage, node stage, tumor node metastasis stage, tumor size, tumor differentiation, and neuron-specific enolase (NSE). Random forest and logistic regression had areas under the curve (AUC) of 0.9995 and 0.6653 in the training group and 0.923 and 0.667 in the internal validation group, respectively. When the two predictive models were validated using data from the Renmin cohort, random forest and logistic regression had AUCs of 0.9994 and 0.627, respectively. This is the first multicenter study to predict HER2 overexpression in individuals with GC based on clinical variables. The random forest model significantly outperformed the logistic regression model.
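
The summary names two steps that are easy to illustrate: the random split of the Union cohort into training and internal validation groups, and the AUC used to compare the two models. The record does not give the authors' code or the libraries they used, so what follows is only a minimal sketch using the Python standard library. The function names `split_cohort` and `roc_auc`, the seed, the placeholder patient IDs, and the toy labels and scores are all assumptions made for illustration; none of them come from the study.

```python
import random


def split_cohort(patient_ids, n_train, seed=0):
    """Randomly assign patients to a training and an internal validation group."""
    shuffled = list(patient_ids)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    return shuffled[:n_train], shuffled[n_train:]


def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Cohort sizes are taken from the summary; the IDs are placeholders.
train_ids, valid_ids = split_cohort(range(2865), n_train=2005)
print(len(train_ids), len(valid_ids))

# Toy labels (1 = HER2 overexpression) and model scores, invented for illustration.
toy_labels = [1, 0, 1, 0, 0]
toy_scores = [0.9, 0.2, 0.4, 0.5, 0.1]
print(roc_auc(toy_labels, toy_scores))
```

In this pairwise reading, an AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking of HER2-positive over HER2-negative cases, which is the scale on which the reported training, internal validation, and external validation figures are compared.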
ISSN: 1591-9528, 1591-8890
DOI: 10.1007/s10238-023-01111-3