Assigning functional linkages to proteins using phylogenetic profiles and continuous phenotypes
Motivation: A class of non-homology-based methods for protein function prediction relies on the assumption that genes linked to a phenotypic trait are preferentially conserved among organisms that share the trait. These methods typically compare pairs of binary strings, where one string encodes the...
Gespeichert in:
Veröffentlicht in: | Bioinformatics 2008-05, Vol.24 (10), p.1257-1263 |
---|---|
Hauptverfasser: | , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Motivation: A class of non-homology-based methods for protein function prediction relies on the assumption that genes linked to a phenotypic trait are preferentially conserved among organisms that share the trait. These methods typically compare pairs of binary strings, where one string encodes the phylogenetic distribution of a trait and the other of a protein. In this work, we extended the approach to automatically deal with continuous phenotypes. Results: Rather than use a priori rules, which can be very subjective, to construct binary profiles from continuous phenotypes, we propose to systematically explore thresholds which can meaningfully separate the phenotype values. We illustrate our method by analyzing optimal growth temperatures, and demonstrate its usefulness by automatically retrieving genes which have been associated with thermophilic growth. We also apply the general approach, for the first time, to optimal growth pH, and make novel predictions. Finally, we show that our method can also be applied to other properties which may not be classically considered as phenotypes. Specifically, we studied correlations between genome size and the distribution of genes. Contact: orlandgonzalez@gmail.com Supplementary information: Supplementary data are available at Bioinformatics online. |
---|---|
ISSN: | 1367-4803 0266-7061 1460-2059 1367-4811 |
DOI: | 10.1093/bioinformatics/btn106 |