Computational tools for prioritizing candidate genes: boosting disease gene discovery

Bibliographic details
Published in: Nature reviews. Genetics 2012-08, Vol. 13 (8), p. 523-536
Main authors: Moreau, Yves; Tranchevent, Léon-Charles
Format: Article
Language: English
Description
Summary:
Key Points
Gene prioritization aims to integrate complex, heterogeneous data to identify the most promising genes for biological validation among a set of candidates. Its goal is to help biological researchers who face mountains of public and private omics data to maximize the yield of downstream biological validation.
Prioritization methods leverage prior knowledge of the phenotype or biological process of interest, in the form of either keywords describing the phenotype or sets of genes previously associated with the phenotype or process. They then either profile data from candidates against this prior knowledge or diffuse this knowledge across a biological network to identify the most closely associated candidates; methods also exist for the case in which little or no prior knowledge is available.
Gene prioritization has contributed to the discovery of many disease-causing genes. A high ranking of a candidate gene in prioritization for a phenotype is now accepted as contributing evidence that mutations in this gene cause the phenotype.
Numerous prioritization tools are publicly available, often via the Web, and they can easily be used by biologists without specific bioinformatics expertise. Although no tool performs best in all situations, together the different tools cover most experimental situations in which gene prioritization is useful.
Computational validation of prioritization results, using procedures such as cross-validation, appropriate negative controls and functional enrichment, is essential to guarantee the effectiveness of the prioritization. More complex prioritization strategies are available to increase the effectiveness of prioritization methods further.
Although prioritization methods are now firmly established, many refinements that improve their performance and usability by biologists can be expected. Moreover, prioritization of sequencing variants identified by next-generation sequencing is emerging as a major need for the biological community, in which data integration can have an important role and for which new prioritization strategies are needed.

Various studies (such as genetic linkage or 'omics'-based approaches) generate large lists of candidate genes, of which only a minority may be biologically relevant for a phenotype or disease of interest. This Review discusses computational tools for gene prioritization, emphasizing key considerations for how biologists can incorporate these tools
ISSN: 1471-0056, 1471-0064
DOI: 10.1038/nrg3253
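
Illustrative note: the summary mentions diffusing prior knowledge across a biological network to rank candidates. One common way to do this is a random walk with restart, sketched below in Python. This is a toy sketch under stated assumptions, not the method of any specific tool covered by the Review: the gene names, the edge list, the restart probability of 0.3, and the function names `random_walk_with_restart` and `prioritize` are all hypothetical.

```python
# Toy network diffusion for gene prioritization (random walk with restart).
# All gene names, edges, and parameter values are hypothetical.

def random_walk_with_restart(edges, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Return a dict mapping each gene in the network to its diffusion score."""
    # Build an undirected adjacency list from the edge list.
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    genes = sorted(neighbors)

    # Only seeds that are present in the network can carry prior knowledge.
    seeds_in_net = [s for s in seeds if s in neighbors]
    if not seeds_in_net:
        raise ValueError("no seed gene is present in the network")

    # Restart distribution: uniform over the known (seed) genes.
    p0 = {g: 0.0 for g in genes}
    for s in seeds_in_net:
        p0[s] = 1.0 / len(seeds_in_net)

    p = dict(p0)
    for _ in range(max_iter):
        nxt = {}
        for g in genes:
            # Each neighbor passes on its score, split evenly among its own neighbors.
            inflow = sum(p[n] / len(neighbors[n]) for n in neighbors[g])
            nxt[g] = (1 - restart) * inflow + restart * p0[g]
        change = sum(abs(nxt[g] - p[g]) for g in genes)
        p = nxt
        if change < tol:
            break
    return p


def prioritize(edges, seeds, candidates, **kwargs):
    """Rank candidates by diffusion score, highest first."""
    scores = random_walk_with_restart(edges, seeds, **kwargs)
    # Candidates absent from the network receive a score of 0.
    return sorted(candidates, key=lambda g: scores.get(g, 0.0), reverse=True)


# Hypothetical network: two known disease genes and a chain of candidates.
edges = [
    ("SEED1", "SEED2"),
    ("SEED1", "CAND_A"),
    ("SEED2", "CAND_A"),
    ("CAND_A", "CAND_B"),
    ("CAND_B", "CAND_C"),
]
seeds = {"SEED1", "SEED2"}
scores = random_walk_with_restart(edges, seeds)
ranking = prioritize(edges, seeds, ["CAND_C", "CAND_B", "CAND_A"])
print(ranking)  # ['CAND_A', 'CAND_B', 'CAND_C']
```

In this toy network the candidate directly connected to both seed genes ranks first, and the score falls off with distance from the seeds. A real analysis would use a curated interaction network and would check the ranking with the validation procedures the summary names, such as cross-validation and negative controls.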