Prediction of HER2 status via random forest in 3257 Chinese patients with gastric cancer
Published in: | Clinical and experimental medicine 2023-12, Vol.23 (8), p.5015-5024 |
---|---|
Main authors: | , , , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | The accurate evaluation of human epidermal growth factor receptor 2 (HER2) is crucial for successful trastuzumab-based therapy in individuals with gastric cancer (GC). The present study, involving a retrospective cohort (N = 2865) from Wuhan Union Hospital and a prospective cohort (N = 392) from Renmin Hospital of Wuhan University, evaluated the benefits of clinical features using random forest and logistic regression models for the detection of HER2 status in patients with GC. Patients from the Union cohort were randomly assigned to either a training (N = 2005) or an internal validation (N = 860) group. Data processing and feature selection were done in Python, which was also used to build random forest and logistic regression models for the prediction of HER2 overexpression. The Renmin cohort (N = 392) was used as the external validation group. Ten features were closely correlated with HER2 overexpression: age, albumin/globulin ratio, globulin, activated partial thromboplastin time, tumor stage, node stage, tumor node metastasis stage, tumor size, tumor differentiation, and neuron-specific enolase (NSE). Random forest and logistic regression had areas under the curve (AUC) of 0.9995 and 0.6653 in the training group and 0.923 and 0.667 in the internal validation group, respectively. When the two predictive models were validated using data from the Renmin cohort, random forest and logistic regression had AUCs of 0.9994 and 0.627, respectively. This is the first multicenter study to predict HER2 overexpression in individuals with GC based on clinical variables. The random forest model significantly outperformed the logistic regression model. |
---|---|
ISSN: | 1591-9528 1591-8890 |
DOI: | 10.1007/s10238-023-01111-3 |
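
The abstract reports a random split of the Union cohort (2865 patients into 2005 training and 860 internal validation cases) and compares models by AUC. The following is a minimal sketch of those two steps using only the Python standard library. It is not the authors' code: the function names `split_cohort` and `roc_auc`, the seed, and the toy labels and scores are assumptions made for illustration, and the record does not say how the split or the AUC was actually computed.

```python
import random


def split_cohort(ids, n_train, seed=0):
    """Randomly assign patient IDs to a training and a validation group.

    The seed is an arbitrary illustrative choice, not taken from the study.
    """
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Uses the rank-sum (Mann-Whitney U) formulation; tied scores share
    their average rank, so a tie counts as half a win.
    """
    pairs = sorted(zip(scores, labels))
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = len(pairs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative label")
    rank_sum_pos = 0.0
    i = 0
    while i < len(pairs):
        # Find the run of tied scores starting at position i.
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(1 for k in range(i, j) if pairs[k][1] == 1)
        i = j
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)


# Cohort sizes as stated in the abstract; the IDs themselves are placeholders.
train_ids, valid_ids = split_cohort(range(2865), n_train=2005)
print(len(train_ids), len(valid_ids))  # 2005 860

# Toy data (hypothetical): 1 = HER2 overexpression, 0 = no overexpression.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(toy_labels, toy_scores))  # 0.75
```

In the toy example, three of the four positive/negative pairs are ranked correctly, which gives an AUC of 0.75. An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported values for the two models should be read.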