A Novel Meta-Analysis-Based Regularized Orthogonal Matching Pursuit Algorithm to Predict Lung Cancer with Selected Biomarkers

Biomarker selection for predictive analytics encounters the problem of identifying a minimal-size subset of genes that is maximally predictive of an outcome of interest. For lung cancer gene expression datasets, it is a great challenge to handle the characteristics of small sample size, high dimensi...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Mathematics (Basel) 2023-10, Vol.11 (19), p.4171
Hauptverfasser: Wang, Sai, Wang, Bin-Yuan, Li, Hai-Fang
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Biomarker selection for predictive analytics encounters the problem of identifying a minimal-size subset of genes that is maximally predictive of an outcome of interest. For lung cancer gene expression datasets, it is a great challenge to handle the characteristics of small sample size, high dimensionality, high noise as well as the low reproducibility of important biomarkers in different studies. In this paper, our proposed meta-analysis-based regularized orthogonal matching pursuit (MA-ROMP) algorithm not only gains strength by using multiple datasets to identify important genomic biomarkers efficiently, but also keeps the selection flexible among datasets to take into account data heterogeneity through a hierarchical decomposition on regression coefficients. For a case study of lung cancer, we downloaded GSE10072, GSE19188 and GSE19804 from the GEO database with inconsistent experimental conditions, sample preparation methods, different study groups, etc. Compared with state-of-the-art methods, our method shows the highest accuracy, of up to 95.63%, with the best discriminative ability (AUC 0.9756) as well as a more than 15-fold decrease in its training time. The experimental results on both simulated data and several lung cancer gene expression datasets demonstrate that MA-ROMP is a more effective tool for biomarker selection and learning cancer prediction.
ISSN:2227-7390
2227-7390
DOI:10.3390/math11194171