Two novel models and a parthenogenetic algorithm for detecting common driver pathways from pan-cancer data

Bibliographic details
Published in: Engineering applications of artificial intelligence 2020-11, Vol.96, p.104010, Article 104010
Main authors: Wu, Jingli; Pan, Ke; Li, Gaoshi; Zhu, Kai; Cai, Qirong
Format: Article
Language: English
Description
Abstract: With the rapid development of high-throughput sequencing technologies, the huge volumes of cancer genomics data now being generated make it possible to understand carcinogenic pathogenesis at the molecular level. The study of commonalities among different cancers is believed to be one of the significant problems in understanding cancer, and it will benefit personalized therapy and precision medicine in cancer treatment. The ComMDP method is useful for solving this problem. However, ComMDP accumulates the absolute weight value of every cancer, so when the numbers of samples differ substantially among cancers, it may miss some driver pathways. In this paper, two mathematical models, CDP-V and CDP-H, are presented that replace the absolute weight values with relative ones by using the variance and the harmonic mean, respectively. A parthenogenetic algorithm is proposed for solving these two models, built on a short chromosome code and a greedy-based recombination operator. Extensive experiments were performed on both simulated and real cancer data. The experimental results show that, given several types of cancer, the gene sets identified by the presented models and algorithm are not only mutated in a large proportion of the samples of these cancers, but also have similar proportions of mutated samples in each cancer. In addition, some biologically meaningful gene sets that are missed by ComMDP are detected. Hence, methods based on the presented models and algorithm may become useful complementary tools for identifying cancer pathways.
ISSN: 0952-1976, 1873-6769
DOI: 10.1016/j.engappai.2020.104010
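
The abstract's argument about absolute versus relative weights can be illustrated with a small toy example. The sketch below is not the paper's formulation: the abstract does not give the weight functions of ComMDP, CDP-V, or CDP-H, so a plain count of covered samples stands in for the absolute weight, and the harmonic mean and a "mean minus variance" of per-cancer proportions stand in for the relative scores. The toy data, the gene names `A` and `B`, and all function names are assumptions made for illustration only.

```python
from statistics import harmonic_mean, mean, pvariance


def make_cancer(n_samples, counts):
    """Build toy data: the first counts[g] samples carry a mutation in gene g."""
    return [{g for g, k in counts.items() if i < k} for i in range(n_samples)]


def covered(samples, gene_set):
    """Number of samples with a mutation in at least one gene of gene_set."""
    return sum(1 for mutated in samples if mutated & gene_set)


# Hypothetical cohorts with very different sizes (200 vs. 20 samples).
cancers = {
    "X": make_cancer(200, {"A": 160, "B": 100}),
    "Y": make_cancer(20, {"A": 2, "B": 12}),
}


def absolute_score(gene_set):
    """Stand-in for an accumulated absolute weight: total covered samples."""
    return sum(covered(s, gene_set) for s in cancers.values())


def proportions(gene_set):
    """Per-cancer fraction of covered samples (a relative weight)."""
    return [covered(s, gene_set) / len(s) for s in cancers.values()]


def harmonic_score(gene_set):
    """Assumed relative score: harmonic mean of per-cancer proportions."""
    return harmonic_mean(proportions(gene_set))


def variance_score(gene_set):
    """Assumed relative score: mean proportion penalized by its variance."""
    p = proportions(gene_set)
    return mean(p) - pvariance(p)


for name in ("A", "B"):
    gs = {name}
    print(name, absolute_score(gs), proportions(gs),
          harmonic_score(gs), variance_score(gs))
```

In this toy data, gene `A` covers 162 samples in total and gene `B` covers 112, so the absolute score prefers `A` even though `A` is mutated in only 2 of the 20 samples of the smaller cohort. Both relative scores prefer `B`, whose proportions (0.5 and 0.6) are close across the two cohorts, which is the behaviour the abstract attributes to the relative-weight models.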