A multi-view prognostic model for diffuse large B-cell lymphoma based on kernel canonical correlation analysis and support vector machine


Bibliographic Details
Published in: BMC cancer 2024-12, Vol.24 (1), p.1495-15
Main Authors: Luo, Yanhong; Li, Yongao; Yang, Zhenhuan; Zhang, Yanbo; Yu, Hongmei; Zhao, Zhiqiang; Yu, Kai; Guo, Yujiao; Wang, Xueman; Yang, Na; Zhang, Yan; Zheng, Tingting; Zhou, Jie
Format: Article
Language: English
Online Access: Full text
Description
Summary: Positron emission tomography/computed tomography (PET/CT) is recommended as the standard imaging modality for diffuse large B-cell lymphoma (DLBCL) staging. However, many studies have neglected the role of patients' prognostic factors in relation to quantitative PET/CT imaging features. In this paper, a multi-view learning (MVL) model is established to make full use of both clinical and imaging data to predict the prognosis of DLBCL patients and thereby assist doctors in decision-making. Feature engineering, comprising feature extraction, feature screening by recursive feature elimination, and dimensionality reduction by principal component analysis, is performed successively on the clinical and imaging data of the study subjects to obtain the study data. After the data are divided into training and test sets, an instance weighting method is applied to the training data. Subsequently, kernel mapping is performed on the imaging features and the clinical features separately, and the mapped features are processed in the new kernel feature space using kernel canonical correlation analysis (KCCA). Lastly, a support vector machine (SVM) is trained on the resulting common kernel subspace. The final overall model, named SVM-2view-KCCA (SVM-2K), was compared with three other multi-view models (Ensemble-SVM, multi-view maximum entropy discrimination, and canonical correlation analysis). The performance of the model was evaluated on the test data with respect to several binary classification metrics: accuracy, sensitivity, F1 score, the area under the curve (AUC), and G-mean. After feature engineering based on dimensionality reduction and instance weighting, the SVM model improved AUC by 10.5%, sensitivity by 11.9%, accuracy by 9.8%, F1 score by 9.2%, and G-mean by 7.8% on the DLBCL test data.
In the comparison of single-view learning models, the SVM-based integration of clinical and imaging features achieved the best overall performance (AUC = 86.3%, accuracy = 91.6%, sensitivity = 83.2%, F1 = 85.7%, and G-mean = 86.1%). In the comparison of MVL models, SVM-2K achieved the best overall performance (AUC = 92.1%, accuracy = 96.9%, sensitivity = 90.9%, F1 = 92.8%, and G-mean = 91.4%), and every MVL model performed better than the best single-view learning model. In conclusion, MVL models outperformed single-view learning models. Of the MVL models, the proposed SVM-2K achieved the best overall performance and could accurately predict
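Note on the KCCA step: the summary does not state the objective that is optimised, so the following is only a sketch of one commonly used regularised formulation of kernel canonical correlation analysis, not necessarily the one used in the article. All symbols are assumptions introduced here for illustration: K_c and K_i are the centred n x n kernel matrices of the clinical and imaging views, alpha and beta are the dual weight vectors, and kappa is a regularisation parameter.

```latex
% Regularised KCCA (illustrative; the symbols K_c, K_i, \alpha, \beta, \kappa are assumptions).
% Find dual directions whose projections of the two views are maximally correlated.
\[
\rho \;=\; \max_{\alpha,\beta}\;
\frac{\alpha^{\top} K_c K_i \,\beta}
     {\sqrt{\bigl(\alpha^{\top} K_c^{2}\,\alpha + \kappa\,\alpha^{\top} K_c\,\alpha\bigr)
            \bigl(\beta^{\top} K_i^{2}\,\beta + \kappa\,\beta^{\top} K_i\,\beta\bigr)}}
\]
% Projection of a patient x onto one pair of correlated directions,
% where k_c and k_i are the kernel functions of the two views and x_j are training patients.
\[
z_c(x) \;=\; \sum_{j=1}^{n} \alpha_j\, k_c(x_j, x),
\qquad
z_i(x) \;=\; \sum_{j=1}^{n} \beta_j\, k_i(x_j, x)
\]
```

Under this reading, the projected coordinates from several leading direction pairs would form the common subspace on which the SVM is trained.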
ISSN:1471-2407
DOI:10.1186/s12885-024-13266-7
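
Note on the evaluation metrics: the summary reports accuracy, sensitivity, F1 score, AUC, and G-mean but does not define them. The sketch below shows how these are conventionally computed for a two-class problem. It assumes that G-mean is the geometric mean of sensitivity and specificity and that AUC is computed by pairwise ranking of scores, which are the usual definitions but are not confirmed by the record. The function names `binary_metrics` and `auc` and the toy labels are hypothetical and are not taken from the article.

```python
import math


def binary_metrics(y_true, y_pred):
    """Threshold-based metrics for 0/1 labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Return 0.0 instead of dividing by zero when a class is absent.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)  # true positive rate (recall)
    specificity = ratio(tn, tn + fp)  # true negative rate
    precision = ratio(tp, tp + fp)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": ratio(2 * precision * sensitivity, precision + sensitivity),
        # Assumed definition: geometric mean of sensitivity and specificity.
        "g_mean": math.sqrt(sensitivity * specificity),
    }


def auc(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy usage with made-up labels and scores (not data from the study).
toy_true = [1, 1, 1, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 0, 0, 0, 0, 0, 1]
toy_metrics = binary_metrics(toy_true, toy_pred)
toy_auc = auc([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.6])
```

G-mean is informative when the classes are imbalanced, because a classifier that ignores the minority class scores high on accuracy but near zero on G-mean.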