Prediction of blood cancer using leukemia gene expression data and sparsity-based gene selection methods
Published in: | Iranian journal of pediatric hematology and oncology 2023-01 |
---|---|
Main authors: | , , , , |
Format: | Article |
Language: | eng |
Online access: | Full text |
Summary: | Background: DNA microarray technology simultaneously assesses the expression of thousands of genes and can be used to detect cancer types and cancer biomarkers. This study aimed to predict blood cancer using leukemia gene expression data and a robust ℓ2,p-norm sparsity-based gene selection method.
Materials and Methods: In this descriptive study, the microarray gene expression data of 72 patients with acute myeloid leukemia (AML) and acute lymphoblastic leukemia (ALL) were used. To remove redundant genes and identify the genes most important for predicting AML and ALL, a robust ℓ2,p-norm (0 < p ≤ 1) sparsity-based gene selection method was applied, with the parameter p set to 1/4, 1/2, 3/4, and 1. The selected genes were then passed to random forest (RF) and support vector machine (SVM) classifiers to predict AML and ALL.
Results: The RF and SVM classifiers correctly classified all AML and ALL samples. The RF classifier achieved a performance of 100% using 10 genes selected by the ℓ2,1/2-norm and ℓ2,1-norm sparsity-based gene selection methods. The SVM classifier achieved a performance of 100% using 10 genes selected by the ℓ2,1/2-norm method. Seven genes were selected under all four values of the parameter p in the ℓ2,p-norm method as the most important genes for classifying AML and ALL, and the gene with the description “PRTN3 Proteinase 3 (serine proteinase, neutrophil, Wegener granulomatosis autoantigen)” was identified as the most important gene.
Conclusion: The results of this study indicate that blood cancer can be predicted from leukemia microarray gene expression data using the robust ℓ2,p-norm sparsity-based gene selection method together with classification algorithms. Examining the expression levels of the genes identified in this study may be useful for predicting leukemia. |
---|---|
ISSN: | 2008-8892 |
DOI: | 10.18502/ijpho.v13i1.11629 |
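
The ℓ2,p-norm named in the summary is commonly defined, for a weight matrix W with one row per gene, as the sum of the p-th powers of the row-wise Euclidean norms, raised to the power 1/p. The record does not give the authors' formulation or solver, so the following is only a minimal sketch of that common definition and of ranking genes by row norm. The matrix `W`, the function names, and the choice of k are hypothetical and are not taken from the study.

```python
import math

def row_l2_norms(W):
    """Euclidean (l2) norm of each row of W, given as a list of rows."""
    return [math.sqrt(sum(v * v for v in row)) for row in W]

def l2p_norm(W, p):
    """l2,p-(quasi-)norm: (sum_i ||w_i||_2 ** p) ** (1 / p), for 0 < p <= 1."""
    if not 0 < p <= 1:
        raise ValueError("p must satisfy 0 < p <= 1")
    return sum(r ** p for r in row_l2_norms(W)) ** (1.0 / p)

def top_k_rows(W, k):
    """Indices of the k rows with the largest l2 norm, largest first."""
    norms = row_l2_norms(W)
    return sorted(range(len(W)), key=lambda i: norms[i], reverse=True)[:k]

# Hypothetical weight matrix: 4 genes (rows) x 2 classes (columns).
# In a sparsity-based selector, W would come from an optimization step
# that is not described in this record; here it is made up for illustration.
W = [[3.0, 4.0], [0.0, 0.0], [0.6, 0.8], [0.0, 2.0]]

# Evaluate the norm for the four values of p listed in the summary.
for p in (0.25, 0.5, 0.75, 1.0):
    print(p, l2p_norm(W, p))

# Genes whose rows carry the most weight would be the ones kept.
print(top_k_rows(W, 2))
```

Smaller values of p penalize many small non-zero rows more heavily relative to a few large ones, which is one plausible reason to compare several values of p as the study does. The selected row indices would then be used to subset the expression matrix before training a classifier such as RF or SVM.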