De-novo protein function prediction using DNA binding and RNA binding proteins as a test case


Bibliographic details
Published in: Nature Communications 2016-11, Vol.7 (1), p.13424-13424, Article 13424
Main authors: Peled, Sapir; Leiderman, Olga; Charar, Rotem; Efroni, Gilat; Shav-Tal, Yaron; Ofran, Yanay
Format: Article
Language: English
Description
Summary: Of the currently identified protein sequences, 99.6% have never been observed in the laboratory as proteins and their molecular function has not been established experimentally. Predicting the function of such proteins relies mostly on annotated homologs. However, this has resulted in some erroneous annotations, and many proteins have no annotated homologs. Here we propose a de-novo function prediction approach based on identifying biophysical features that underlie function. Using our approach, we discover DNA and RNA binding proteins that cannot be identified based on homology and validate these predictions experimentally. For example, FGF14, which belongs to a family of secreted growth factors, was predicted to bind DNA. We verify this experimentally and also show that FGF14 is localized to the nucleus. Mutating the predicted binding site on FGF14 abrogated DNA binding. These results demonstrate the feasibility of automated de-novo function prediction based on identifying function-related biophysical features. Identification of the function of proteins is difficult when there are no structurally or biochemically characterized homologs. Here, the authors present an approach that allows the prediction of nucleic-acid binding proteins based on sequence alone, and they are able to experimentally validate their method.
ISSN: 2041-1723
DOI: 10.1038/ncomms13424