Deep learning or radiomics based on CT for predicting the response of gastric cancer to neoadjuvant chemotherapy: a meta-analysis and systematic review
Published in: | Frontiers in Oncology 2024-03, Vol.14, p.1363812 |
---|---|
Main authors: | , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | Artificial intelligence (AI) models, clinical models (CM), and the integrated model (IM) are utilized to evaluate the response to neoadjuvant chemotherapy (NACT) in patients diagnosed with gastric cancer.
The objective is to evaluate the diagnostic test accuracy of the AI model and to compare the accuracy of AI, CM, and IM through a comprehensive summary of head-to-head comparative studies.
PubMed, Web of Science, Cochrane Library, and Embase were systematically searched through September 5, 2023, to compile English-language studies without regional restrictions. The quality of the included studies was evaluated using the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) criteria. Forest plots were used to illustrate the findings of diagnostic accuracy, while Hierarchical Summary Receiver Operating Characteristic curves were generated to estimate sensitivity (SEN) and specificity (SPE). Meta-regression was applied to analyze heterogeneity across the studies. To assess the presence of publication bias, Deeks' funnel plot and an asymmetry test were employed.
A total of 9 studies, comprising 3313 patients, were included for the AI model, with 7 head-to-head comparative studies involving 2699 patients. Across the 9 studies, the pooled SEN for the AI model was 0.75 (95% confidence interval (CI): 0.66, 0.82), and the pooled SPE was 0.77 (95% CI: 0.69, 0.84). Meta-regression indicated that the cut-off value, the approach to predicting response, and the gold standard might be sources of heterogeneity. In the head-to-head comparative studies, the pooled SEN for AI was 0.77 (95% CI: 0.69, 0.84) with SPE at 0.79 (95% CI: 0.70, 0.85). For CM, the pooled SEN was 0.67 (95% CI: 0.57, 0.77) with SPE at 0.59 (95% CI: 0.54, 0.64), while for IM, the pooled SEN was 0.83 (95% CI: 0.79, 0.86) with SPE at 0.69 (95% CI: 0.56, 0.79). In pairwise comparisons, no statistically significant differences were found, except that IM exhibited higher SEN than AI while maintaining a similar level of SPE. In the Receiver Operating Characteristic analysis subgroup, the CT-based Deep Learning (DL) subgroup, and the National Comprehensive Cancer Network (NCCN) guideline subgroup, the AI model exhibited higher SEN but lower SPE compared to the IM. Conversely, in the training cohort subgroup and the internal validation cohort subgroup, the AI model demonstrated lower SEN but higher SPE than the IM. The subgroup analysis underscored that factors such as the number of cohorts, cohort type, cut-off value |
---|---|
ISSN: | 2234-943X |
DOI: | 10.3389/fonc.2024.1363812 |
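
The pooled SEN and SPE values reported in the abstract for the head-to-head studies can be turned into other common summary measures of diagnostic accuracy. The sketch below is an illustration only and is not part of the published analysis: it derives point estimates of the positive and negative likelihood ratios, the diagnostic odds ratio, and Youden's index directly from the pooled SEN/SPE pairs. The class name `PooledAccuracy` and the list `HEAD_TO_HEAD` are hypothetical names introduced here. A hierarchical or bivariate model, as used in the study, estimates SEN and SPE jointly, so these naive point estimates carry no confidence intervals and should not be read as results of the meta-analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PooledAccuracy:
    """Pooled sensitivity/specificity pair for one model type (hypothetical helper)."""

    name: str
    sen: float  # pooled sensitivity, as reported in the abstract
    spe: float  # pooled specificity, as reported in the abstract

    def positive_lr(self) -> float:
        # LR+ = SEN / (1 - SPE): how much a positive prediction raises the odds of response
        return self.sen / (1.0 - self.spe)

    def negative_lr(self) -> float:
        # LR- = (1 - SEN) / SPE: how much a negative prediction lowers the odds of response
        return (1.0 - self.sen) / self.spe

    def diagnostic_odds_ratio(self) -> float:
        # DOR = LR+ / LR-
        return self.positive_lr() / self.negative_lr()

    def youden_index(self) -> float:
        # J = SEN + SPE - 1
        return self.sen + self.spe - 1.0


# Point estimates copied from the abstract (head-to-head comparative studies).
HEAD_TO_HEAD = [
    PooledAccuracy("AI", sen=0.77, spe=0.79),
    PooledAccuracy("CM", sen=0.67, spe=0.59),
    PooledAccuracy("IM", sen=0.83, spe=0.69),
]


def summarize(models):
    """Return derived measures per model as a plain dictionary."""
    return {
        m.name: {
            "LR+": m.positive_lr(),
            "LR-": m.negative_lr(),
            "DOR": m.diagnostic_odds_ratio(),
            "Youden": m.youden_index(),
        }
        for m in models
    }


if __name__ == "__main__":
    for name, measures in summarize(HEAD_TO_HEAD).items():
        formatted = ", ".join(f"{k}={v:.2f}" for k, v in measures.items())
        print(f"{name}: {formatted}")
```

Because these quantities are simple functions of two point estimates, they are useful mainly for comparing the three model types on a single scale, not for inference.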