Predicting Protein-Protein Interactions from Protein Domains Using a Set Cover Approach


Bibliographic details
Published in: IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2007-01, Vol. 4 (1), pp. 78-87
Main authors: Huang, Chengbang; Morcos, F.; Kanaan, S.P.; Wuchty, S.; Chen, D.Z.; Izaguirre, J.A.
Format: Article
Language: English
Description
Abstract: One goal of contemporary proteome research is the elucidation of cellular protein interactions. Based on currently available protein-protein interaction and domain data, we introduce a novel method, maximum specificity set cover (MSSC), for the prediction of protein-protein interactions. In our approach, we map the relationship between interactions of proteins and their corresponding domain architectures to a generalized weighted set cover problem. Applying a greedy algorithm yields sets of domain interactions that explain the presence of protein interactions with the highest degree of specificity. Using domain and protein interaction data of S. cerevisiae, MSSC enables the prediction of previously unknown protein interactions, links that are well supported by a high tendency of coexpression and functional homogeneity of the corresponding proteins. Focusing on concrete examples, we show that MSSC reliably predicts protein interactions in well-studied molecular systems, such as the 26S proteasome and RNA polymerase II of S. cerevisiae. We also show that the quality of the predictions is comparable to that of maximum likelihood estimation, while MSSC is faster. The new algorithm and all data sets used are accessible through a Web portal at http://ppi-cse.nd.edu.
ISSN: 1545-5963, 1557-9964
DOI: 10.1109/TCBB.2007.1001
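
The abstract describes mapping protein interactions and domain architectures to a weighted set cover problem solved greedily. The following is a minimal sketch of that general idea, not the authors' MSSC implementation. The record does not give the paper's exact weighting, so the cost used here (1 plus the number of non-interacting protein pairs a domain pair would also imply) is an assumption standing in for "specificity". The toy proteins, domains, and all function names are hypothetical.

```python
from itertools import combinations, product

# Hypothetical toy data (not from the paper): protein -> set of domains.
domains = {
    "P1": {"A"}, "P2": {"B"}, "P3": {"A", "C"}, "P4": {"B"}, "P5": {"C"},
}
# Hypothetical observed interactions, stored as sorted protein-name tuples.
interactions = {("P1", "P2"), ("P2", "P3"), ("P3", "P4")}


def domain_pairs(p, q):
    """Unordered domain pairs that could mediate an interaction between p and q."""
    return {tuple(sorted(dp)) for dp in product(domains[p], domains[q])}


def build_instance():
    """Map each domain pair to the observed interactions it could explain,
    and count the non-interacting protein pairs it would also imply."""
    covers, false_hits = {}, {}
    for p, q in combinations(sorted(domains), 2):
        for dp in domain_pairs(p, q):
            covers.setdefault(dp, set())
            false_hits.setdefault(dp, 0)
            if (p, q) in interactions:
                covers[dp].add((p, q))
            else:
                false_hits[dp] += 1
    # Assumed cost: lower means the domain pair is more specific.
    weights = {dp: 1 + false_hits[dp] for dp in covers}
    return covers, weights


def greedy_weighted_set_cover(universe, covers, weights):
    """Standard greedy heuristic: repeatedly pick the set with the lowest
    cost per newly covered element until everything coverable is covered."""
    uncovered, chosen = set(universe), []
    while uncovered:
        candidates = [dp for dp in sorted(covers) if covers[dp] & uncovered]
        if not candidates:
            break  # remaining interactions cannot be explained by any domain pair
        best = min(candidates,
                   key=lambda dp: weights[dp] / len(covers[dp] & uncovered))
        chosen.append(best)
        uncovered -= covers[best]
    return chosen, uncovered


covers, weights = build_instance()
chosen, uncovered = greedy_weighted_set_cover(interactions, covers, weights)

# Predict new interactions: unobserved protein pairs that share a chosen domain pair.
predicted = {
    (p, q) for p, q in combinations(sorted(domains), 2)
    if (p, q) not in interactions and domain_pairs(p, q) & set(chosen)
}
print(chosen, predicted)
```

On this toy input the sketch selects the single domain pair (A, B), which explains all three observed interactions, and flags P1-P4 as a candidate interaction. A real data set would need the paper's actual specificity measure in place of the assumed cost.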