LOTUS: a single- and multi-task machine-learning algorithm for the prediction of cancer driver genes


Bibliographic details
Published in: PLoS Computational Biology, 2019
Main authors: Collier, Olivier; Stoven, Véronique; Vert, Jean-Philippe
Format: Article
Language: English
Subjects:
Online access: Full text
Description
Summary: Cancer driver genes, i.e., oncogenes and tumor suppressor genes, are involved in the acquisition of important functions in tumors, providing a selective growth advantage, allowing uncontrolled proliferation, and avoiding apoptosis. It is therefore important to identify these driver genes, both for the fundamental understanding of cancer and to help find new therapeutic targets. Although the most frequently mutated driver genes have been identified, it is believed that many more remain to be discovered, particularly driver genes specific to some cancer types. In this paper we propose a new computational method called LOTUS to predict new driver genes. LOTUS is a machine-learning-based approach that integrates various types of data in a versatile manner, including information about gene mutations and protein-protein interactions. In addition, LOTUS can predict cancer driver genes in a pan-cancer setting as well as for specific cancer types, using a multitask learning strategy to share information across cancer types. We empirically show that LOTUS outperforms three other state-of-the-art driver gene prediction methods, both in terms of intrinsic consistency and prediction accuracy, and provide predictions of new cancer genes across many cancer types.
Author summary: Cancer development is thought to be driven by some important genes that should be targeted by new treatments. Unfortunately, there are only a small number of such genes, so it is of crucial importance to design algorithms capable of finding the genes with the highest oncogenic potential. Our new method analyzes mutation data in particular, but also other sources of information, to establish a list of genes that should be investigated as a priority. Moreover, our algorithm can differentiate between several types of cancer and share information between them to improve the prediction for every disease. We showed that in several contexts our algorithm outperforms its competitors.
ISSN: 1553-734X, 1553-7358