Identification of informative genes and sub-pathways using Improved Differential Expression Analysis for Pathways (iDEAP) for cancer classification [version 1; peer review: 1 approved, 1 approved with reservations]


Bibliographic details
Published in: F1000 research 2023, Vol. 12, p. 1433
Main authors: Nasarudin, Nurul Athirah; Mohamad, Mohd Saberi; Zakaria, Zalmiyah; Sinnott, Richard O.; Al Jasmi, Fatma; Al Dhaheri, Noura
Format: Article
Language: English
Description
Summary: Background: Pathway-based analysis primarily focuses on sub-pathway-based analysis, which aids in understanding biological reactions. Several studies have found abnormalities in pathways caused by certain regions based on the etiology of diseases. The Differential Expression Analysis for Pathways (DEAP) method is one such sub-pathway-based analysis method; it identifies a local region perturbed by complex diseases within larger pathway data. However, the method performs poorly in identifying informative pathways and sub-pathways. Methods: In this paper, we propose an improved DEAP (iDEAP) method for enhanced identification of perturbed sub-pathways that achieves higher performance in the detection of differentially expressed pathways. Firstly, a search algorithm adapted from the Detect Module from Seed Protein (DMSP) algorithm was implemented as part of the DEAP method to search for informative sub-pathways. Secondly, the relation among sub-pathways was taken into account by averaging the maximum absolute value of the DEAP score for the reaction among sub-pathways to support the efficient identification of informative pathways. Three gene expression data sets were used in this study. Results: The proposed method performs better than the previous methods. When the identified genes were used to classify cancer and assessed for accuracy with 10-fold cross-validation, the improved method achieved higher accuracy for colorectal cancer (90%) and breast cancer (94%). Conclusions: These results indicate that the proposed method effectively identifies informative genes related to the targeted phenotype. A biological validation based on the biological literature was also conducted on the top five significant pathways and the selected genes. The results of this analysis will be useful especially in the medical field for disease detection.
In 10 years and beyond, computational biology will become ever more entwined with biomedical research and medicine.
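The scoring step mentioned in the summary (averaging the maximum absolute DEAP score among sub-pathways) can be illustrated with a minimal Python sketch. This is one possible reading of that sentence and not the authors' implementation: the function names `max_abs_score` and `pathway_score`, the dictionary layout, and the example values are all hypothetical, and the record does not specify how the per-sub-pathway scores are computed.

```python
from statistics import mean


def max_abs_score(path_scores):
    """Return the largest absolute value among one sub-pathway's signed scores."""
    if not path_scores:
        raise ValueError("a sub-pathway needs at least one score")
    return max(abs(s) for s in path_scores)


def pathway_score(subpathway_scores):
    """Average the per-sub-pathway maximum absolute scores.

    Assumption: `subpathway_scores` maps a sub-pathway name to a list of
    signed scores; this layout is illustrative only.
    """
    return mean(max_abs_score(scores) for scores in subpathway_scores.values())


# Hypothetical signed scores for three sub-pathways of a single pathway.
example = {
    "sub_a": [0.5, -2.0, 1.0],
    "sub_b": [-0.5, 0.25],
    "sub_c": [3.0],
}

score = pathway_score(example)
print(score)
```

Taking the absolute value before the maximum treats strong up-regulation and strong down-regulation alike, and averaging across sub-pathways gives a single pathway-level value under the assumptions stated above.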
ISSN:2046-1402
DOI:10.12688/f1000research.132899.1