On the statistical significance of protein complex

Background: Statistical validation of predicted complexes is a fundamental issue in proteomics and bioinformatics. The target is to measure the statistical significance of each predicted complex in terms of p-values. Surprisingly, this issue has not received much attention in the literature. To our...

Full description

Bibliographic details
Published in: Quantitative biology 2018-12, Vol.6 (4), p.313-320
Main authors: Su, Youfu; Zhao, Can; Chen, Zheng; Tian, Bo; He, Zengyou
Format: Article
Language: eng
Subjects:
Online access: Order full text
container_end_page 320
container_issue 4
container_start_page 313
container_title Quantitative biology
container_volume 6
creator Su, Youfu
Zhao, Can
Chen, Zheng
Tian, Bo
He, Zengyou
description Background: Statistical validation of predicted complexes is a fundamental issue in proteomics and bioinformatics. The target is to measure the statistical significance of each predicted complex in terms of p-values. Surprisingly, this issue has not received much attention in the literature. To our knowledge, only a few research efforts have been made towards this direction. Methods: In this article, we propose a novel method for calculating the p-value of a predicted complex. The null hypothesis is that there is no difference between the number of edges in target protein complex and that in the random null model. In addition, we assume that a true protein complex must be a connected subgraph. Based on this null hypothesis, we present an algorithm to compute the p-value of a given predicted complex. Results: We test our method on five benchmark data sets to evaluate its effectiveness. Conclusions: The experimental results show that our method is superior to the state-of-the-art algorithms on assessing the statistical significance of candidate protein complexes.
doi_str_mv 10.1007/s40484-018-0153-6
format Article
fullrecord <record><control><sourceid>proquest_24P</sourceid><recordid>TN_cdi_proquest_journals_2159743893</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2159743893</sourcerecordid><originalsourceid>FETCH-LOGICAL-c4592-c6d9dd316dcc900d06d7b4ef85d463a468bef2a01f653dfd2e55b6c872c9ea8e3</originalsourceid><addsrcrecordid>eNqFkEFLAzEQhYMoWGp_gLcFz6tJNskm3myxKhSKYM9hN5m0ke1uTbZo_70pK3qrh2Hm8L55j4fQNcG3BOPyLjLMJMsxkWl4kYszNKJY8ZwJVZ7_3lJdokmMvsaMYckoxSNEl23WbyCLfdX72HtTNVn069a7dLYGss5lu9D14NvMdNtdA19X6MJVTYTJzx6j1fzxbfacL5ZPL7OHRW4YVzQ3wiprCyKsMQpji4UtawZOcstEUaU4NThaYeIEL6yzFDivhZElNQoqCcUY3Qx_k__HHmKv37t9aJOlpoSrkhVSFUlFBpUJXYwBnN4Fv63CQROsj-3ooR2d2tHHdrRIzP3AfPoGDv8D-nU1pdM5xoTRBNMBjolr1xD-Yp1ylAO08esNBLC7ADFqF7q29xBOod8Gnooj</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2159743893</pqid></control><display><type>article</type><title>On the statistical significance of protein complex</title><source>Wiley Online Library Open Access</source><creator>Su, Youfu ; Zhao, Can ; Chen, Zheng ; Tian, Bo ; He, Zengyou</creator><creatorcontrib>Su, Youfu ; Zhao, Can ; Chen, Zheng ; Tian, Bo ; He, Zengyou</creatorcontrib><description>Background: Statistical validation of predicted complexes is a fundamental issue in proteomics and bioinformatics. The target is to measure the statistical significance of each predicted complex in terms of p-values. Surprisingly, this issue has not received much attention in the literature. To our knowledge, only a few research efforts have been made towards this direction. Methods: In this article, we propose a novel method for calculating the p-value of a predicted complex. The null hypothesis is that there is no difference between the number of edges in target protein complex and that in the random null model. In addition, we assume that a true protein complex must be a connected subgraph. 
Based on this null hypothesis, we present an algorithm to compute the p-value of a given predicted complex. Results: We test our method on five benchmark data sets to evaluate its effectiveness. Conclusions: The experimental results show that our method is superior to the state-of-the-art algorithms on assessing the statistical significance of candidate protein complexes.</description><identifier>ISSN: 2095-4689</identifier><identifier>EISSN: 2095-4697</identifier><identifier>DOI: 10.1007/s40484-018-0153-6</identifier><language>eng</language><publisher>Beijing: Higher Education Press</publisher><subject>Bioinformatics ; Biomedical and Life Sciences ; community detection ; Computational Biology/Bioinformatics ; Computer Appl. in Life Sciences ; Life Sciences ; Mathematical and Computational Biology ; predicted complex ; Proteins ; Proteomics ; Research Article ; Statistical significance ; statistical significance testing ; Statistics ; subgraph mining</subject><ispartof>Quantitative biology, 2018-12, Vol.6 (4), p.313-320</ispartof><rights>Copyright reserved, 2018, Higher Education Press and Springer-Verlag GmbH Germany, part of Springer Nature</rights><rights>Higher Education Press and Springer-Verlag GmbH Germany, part of Springer Nature 2018</rights><rights>The Author(s) 2018.</rights><rights>Copyright Springer Nature B.V. 
2018</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c4592-c6d9dd316dcc900d06d7b4ef85d463a468bef2a01f653dfd2e55b6c872c9ea8e3</citedby><cites>FETCH-LOGICAL-c4592-c6d9dd316dcc900d06d7b4ef85d463a468bef2a01f653dfd2e55b6c872c9ea8e3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s40484-018-0153-6$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/s40484-018-0153-6$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>314,776,780,11541,27901,27902,41464,42533,46027,46451,51294</link.rule.ids><linktorsrc>$$Uhttps://onlinelibrary.wiley.com/doi/abs/10.1007%2Fs40484-018-0153-6$$EView_record_in_Wiley-Blackwell$$FView_record_in_$$GWiley-Blackwell</linktorsrc></links><search><creatorcontrib>Su, Youfu</creatorcontrib><creatorcontrib>Zhao, Can</creatorcontrib><creatorcontrib>Chen, Zheng</creatorcontrib><creatorcontrib>Tian, Bo</creatorcontrib><creatorcontrib>He, Zengyou</creatorcontrib><title>On the statistical significance of protein complex</title><title>Quantitative biology</title><addtitle>Quant. Biol</addtitle><addtitle>Quant Biol</addtitle><description>Background: Statistical validation of predicted complexes is a fundamental issue in proteomics and bioinformatics. The target is to measure the statistical significance of each predicted complex in terms of p-values. Surprisingly, this issue has not received much attention in the literature. To our knowledge, only a few research efforts have been made towards this direction. Methods: In this article, we propose a novel method for calculating the p-value of a predicted complex. The null hypothesis is that there is no difference between the number of edges in target protein complex and that in the random null model. 
In addition, we assume that a true protein complex must be a connected subgraph. Based on this null hypothesis, we present an algorithm to compute the p-value of a given predicted complex. Results: We test our method on five benchmark data sets to evaluate its effectiveness. Conclusions: The experimental results show that our method is superior to the state-of-the-art algorithms on assessing the statistical significance of candidate protein complexes.</description><subject>Bioinformatics</subject><subject>Biomedical and Life Sciences</subject><subject>community detection</subject><subject>Computational Biology/Bioinformatics</subject><subject>Computer Appl. in Life Sciences</subject><subject>Life Sciences</subject><subject>Mathematical and Computational Biology</subject><subject>predicted complex</subject><subject>Proteins</subject><subject>Proteomics</subject><subject>Research Article</subject><subject>Statistical significance</subject><subject>statistical significance testing</subject><subject>Statistics</subject><subject>subgraph mining</subject><issn>2095-4689</issn><issn>2095-4697</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2018</creationdate><recordtype>article</recordtype><recordid>eNqFkEFLAzEQhYMoWGp_gLcFz6tJNskm3myxKhSKYM9hN5m0ke1uTbZo_70pK3qrh2Hm8L55j4fQNcG3BOPyLjLMJMsxkWl4kYszNKJY8ZwJVZ7_3lJdokmMvsaMYckoxSNEl23WbyCLfdX72HtTNVn069a7dLYGss5lu9D14NvMdNtdA19X6MJVTYTJzx6j1fzxbfacL5ZPL7OHRW4YVzQ3wiprCyKsMQpji4UtawZOcstEUaU4NThaYeIEL6yzFDivhZElNQoqCcUY3Qx_k__HHmKv37t9aJOlpoSrkhVSFUlFBpUJXYwBnN4Fv63CQROsj-3ooR2d2tHHdrRIzP3AfPoGDv8D-nU1pdM5xoTRBNMBjolr1xD-Yp1ylAO08esNBLC7ADFqF7q29xBOod8Gnooj</recordid><startdate>201812</startdate><enddate>201812</enddate><creator>Su, Youfu</creator><creator>Zhao, Can</creator><creator>Chen, Zheng</creator><creator>Tian, Bo</creator><creator>He, Zengyou</creator><general>Higher Education Press</general><general>Higher Education Press and Springer‐Verlag GmbH Germany, part of Springer 
Nature</general><general>John Wiley &amp; Sons, Inc</general><scope>AAYXX</scope><scope>CITATION</scope></search><sort><creationdate>201812</creationdate><title>On the statistical significance of protein complex</title><author>Su, Youfu ; Zhao, Can ; Chen, Zheng ; Tian, Bo ; He, Zengyou</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c4592-c6d9dd316dcc900d06d7b4ef85d463a468bef2a01f653dfd2e55b6c872c9ea8e3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2018</creationdate><topic>Bioinformatics</topic><topic>Biomedical and Life Sciences</topic><topic>community detection</topic><topic>Computational Biology/Bioinformatics</topic><topic>Computer Appl. in Life Sciences</topic><topic>Life Sciences</topic><topic>Mathematical and Computational Biology</topic><topic>predicted complex</topic><topic>Proteins</topic><topic>Proteomics</topic><topic>Research Article</topic><topic>Statistical significance</topic><topic>statistical significance testing</topic><topic>Statistics</topic><topic>subgraph mining</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Su, Youfu</creatorcontrib><creatorcontrib>Zhao, Can</creatorcontrib><creatorcontrib>Chen, Zheng</creatorcontrib><creatorcontrib>Tian, Bo</creatorcontrib><creatorcontrib>He, Zengyou</creatorcontrib><collection>CrossRef</collection><jtitle>Quantitative biology</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Su, Youfu</au><au>Zhao, Can</au><au>Chen, Zheng</au><au>Tian, Bo</au><au>He, Zengyou</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>On the statistical significance of protein complex</atitle><jtitle>Quantitative biology</jtitle><stitle>Quant. 
Biol</stitle><stitle>Quant Biol</stitle><date>2018-12</date><risdate>2018</risdate><volume>6</volume><issue>4</issue><spage>313</spage><epage>320</epage><pages>313-320</pages><issn>2095-4689</issn><eissn>2095-4697</eissn><abstract>Background: Statistical validation of predicted complexes is a fundamental issue in proteomics and bioinformatics. The target is to measure the statistical significance of each predicted complex in terms of p-values. Surprisingly, this issue has not received much attention in the literature. To our knowledge, only a few research efforts have been made towards this direction. Methods: In this article, we propose a novel method for calculating the p-value of a predicted complex. The null hypothesis is that there is no difference between the number of edges in target protein complex and that in the random null model. In addition, we assume that a true protein complex must be a connected subgraph. Based on this null hypothesis, we present an algorithm to compute the p-value of a given predicted complex. Results: We test our method on five benchmark data sets to evaluate its effectiveness. Conclusions: The experimental results show that our method is superior to the state-of-the-art algorithms on assessing the statistical significance of candidate protein complexes.</abstract><cop>Beijing</cop><pub>Higher Education Press</pub><doi>10.1007/s40484-018-0153-6</doi><tpages>8</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 2095-4689
ispartof Quantitative biology, 2018-12, Vol.6 (4), p.313-320
issn 2095-4689
2095-4697
language eng
recordid cdi_proquest_journals_2159743893
source Wiley Online Library Open Access
subjects Bioinformatics
Biomedical and Life Sciences
community detection
Computational Biology/Bioinformatics
Computer Appl. in Life Sciences
Life Sciences
Mathematical and Computational Biology
predicted complex
Proteins
Proteomics
Research Article
Statistical significance
statistical significance testing
Statistics
subgraph mining
title On the statistical significance of protein complex
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-08T08%3A36%3A00IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_24P&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=On%20the%20statistical%20significance%20of%20protein%20complex&rft.jtitle=Quantitative%20biology&rft.au=Su,%20Youfu&rft.date=2018-12&rft.volume=6&rft.issue=4&rft.spage=313&rft.epage=320&rft.pages=313-320&rft.issn=2095-4689&rft.eissn=2095-4697&rft_id=info:doi/10.1007/s40484-018-0153-6&rft_dat=%3Cproquest_24P%3E2159743893%3C/proquest_24P%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2159743893&rft_id=info:pmid/&rfr_iscdi=true
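
The description field of this record states a null hypothesis about the number of edges in a target protein complex compared with a random null model. The record does not give the authors' algorithm, so the following is only a minimal sketch of the general idea under a much simpler, assumed null model: every protein pair inside the candidate complex is an edge independently with probability equal to the overall network density, and the p-value is the binomial upper tail at the observed edge count. The function names, the toy network, and the binomial null are illustrative assumptions. The connectivity constraint mentioned in the description is not modeled here.

```python
from math import comb


def edge_count(edges, nodes):
    """Count edges whose two endpoints both lie in `nodes`."""
    nodes = set(nodes)
    return sum(1 for u, v in edges if u in nodes and v in nodes)


def binomial_tail_p_value(edges, all_nodes, complex_nodes):
    """Return P(X >= observed) for X ~ Binomial(C(k, 2), density).

    Assumed null model (not the method of the catalogued article): each of
    the C(k, 2) pairs inside the candidate complex is an edge independently
    with probability equal to the global edge density of the network.
    `edges` must list each undirected edge once, without self-loops.
    """
    n = len(set(all_nodes))
    k = len(set(complex_nodes))
    density = len(edges) / comb(n, 2)
    pairs_in_complex = comb(k, 2)
    observed = edge_count(edges, complex_nodes)
    return sum(
        comb(pairs_in_complex, x)
        * density ** x
        * (1 - density) ** (pairs_in_complex - x)
        for x in range(observed, pairs_in_complex + 1)
    )


# Hypothetical toy network: a triangle A-B-C followed by a sparse chain.
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"), ("D", "E"), ("E", "F")]
nodes = ["A", "B", "C", "D", "E", "F"]

# Density is 6/15 = 0.4; the triangle has all 3 possible edges,
# so the tail probability is 0.4 ** 3, roughly 0.064.
p_value = binomial_tail_p_value(edges, nodes, ["A", "B", "C"])
```

A smaller p-value means the candidate is denser than this simple null would typically produce. A null model restricted to connected subgraphs, as the description requires, would yield different values.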