On the statistical significance of protein complex

Background: Statistical validation of predicted complexes is a fundamental issue in proteomics and bioinformatics. The target is to measure the statistical significance of each predicted complex in terms of p-values. Surprisingly, this issue has not received much attention in the literature. To our...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Quantitative biology 2018-12, Vol.6 (4), p.313-320
Hauptverfasser: Su, Youfu, Zhao, Can, Chen, Zheng, Tian, Bo, He, Zengyou
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Background: Statistical validation of predicted complexes is a fundamental issue in proteomics and bioinformatics. The target is to measure the statistical significance of each predicted complex in terms of p-values. Surprisingly, this issue has not received much attention in the literature. To our knowledge, only a few research efforts have been made towards this direction. Methods: In this article, we propose a novel method for calculating the p-value of a predicted complex. The null hypothesis is that there is no difference between the number of edges in target protein complex and that in the random null model. In addition, we assume that a true protein complex must be a connected subgraph. Based on this null hypothesis, we present an algorithm to compute the p-value of a given predicted complex. Results: We test our method on five benchmark data sets to evaluate its effectiveness. Conclusions: The experimental results show that our method is superior to the state-of-the-art algorithms on assessing the statistical significance of candidate protein complexes.
ISSN:2095-4689
2095-4697
DOI:10.1007/s40484-018-0153-6