Human protein-protein interaction prediction by a novel sequence-based co-evolution method: co-evolutionary divergence

Protein-protein interaction (PPI) plays an important role in understanding gene functions, and many computational PPI prediction methods have been proposed in recent years. Despite the extensive efforts, PPI prediction still has much room to improve. Sequence-based co-evolution methods include the s...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Bioinformatics 2013-01, Vol.29 (1), p.92-98
Hauptverfasser: Hsin Liu, Chia, Li, Ker-Chau, Yuan, Shinsheng
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Protein-protein interaction (PPI) plays an important role in understanding gene functions, and many computational PPI prediction methods have been proposed in recent years. Despite the extensive efforts, PPI prediction still has much room to improve. Sequence-based co-evolution methods include the substitution rate method and the mirror tree method, which compare sequence substitution rates and topological similarity of phylogenetic trees, respectively. Although they have been used to predict PPI in species with small genomes like Escherichia coli, such methods have not been tested in large scale proteome like Homo sapiens. In this study, we propose a novel sequence-based co-evolution method, co-evolutionary divergence (CD), for human PPI prediction. Built on the basic assumption that protein pairs with similar substitution rates are likely to interact with each other, the CD method converts the evolutionary information from 14 species of vertebrates into likelihood ratios and combined them together to infer PPI. We showed that the CD method outperformed the mirror tree method in three independent human PPI datasets by a large margin. With the arrival of more species genome information generated by next generation sequencing, the performance of the CD method can be further improved. Source code and support are available at http://mib.stat.sinica.edu.tw/LAP/tmp/CD.rar.
ISSN:1367-4803
1367-4811
1460-2059
DOI:10.1093/bioinformatics/bts620