Homology-based prediction of interactions between proteins using Averaged One-Dependence Estimators
Identification of protein-protein interactions (PPIs) is essential for a better understanding of biological processes, pathways and functions. However, experimental identification of the complete set of PPIs in a cell/organism ("an interactome") is still a difficult task. To circumvent lim...
Published in: | BMC bioinformatics 2014-06, Vol.15 (1), p.213-213, Article 213 |
---|---|
Main authors: | Murakami, Yoichi; Mizuguchi, Kenji |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 213 |
---|---|
container_issue | 1 |
container_start_page | 213 |
container_title | BMC bioinformatics |
container_volume | 15 |
creator | Murakami, Yoichi ; Mizuguchi, Kenji |
description | Identification of protein-protein interactions (PPIs) is essential for a better understanding of biological processes, pathways and functions. However, experimental identification of the complete set of PPIs in a cell/organism ("an interactome") is still a difficult task. To circumvent limitations of current high-throughput experimental techniques, it is necessary to develop high-performance computational methods for predicting PPIs.
In this article, we propose a new computational method to predict interaction between a given pair of protein sequences using features derived from known homologous PPIs. The proposed method is capable of predicting interaction between two proteins (of unknown structure) using Averaged One-Dependence Estimators (AODE) and three features calculated for the protein pair: (a) sequence similarities to a known interacting protein pair (FSeq), (b) statistical propensities of domain pairs observed in interacting proteins (FDom) and (c) a sum of edge weights along the shortest path between homologous proteins in a PPI network (FNet). Feature vectors were defined to lie in a half-space of the symmetrical high-dimensional feature space to make them independent of the protein order. The predictability of the method was assessed by a 10-fold cross validation on a recently created human PPI dataset with randomly sampled negative data, and the best model achieved an Area Under the Curve of 0.79 (pAUC0.5% = 0.16). In addition, the AODE trained on all three features (named PSOPIA) showed better prediction performance on a separate independent data set than a recently reported homology-based method.
Our results suggest that FNet, a feature representing proximity in a known PPI network between two proteins that are homologous to a target protein pair, contributes to the prediction of whether the target proteins interact or not. PSOPIA will help identify novel PPIs and estimate complete PPI networks. The method proposed in this article is freely available on the web at http://mizuguchilab.org/PSOPIA. |
doi_str_mv | 10.1186/1471-2105-15-213 |
format | Article |
fullrecord | <record><control><sourceid>gale_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_4229973</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><galeid>A539461104</galeid><sourcerecordid>A539461104</sourcerecordid><originalsourceid>FETCH-LOGICAL-c624t-c9a9b6096c2905a13770669ab50d0b9157dad467a1820909375c6fb90337df403</originalsourceid><addsrcrecordid>eNqNkktv1TAQhSMEog_Ys0KR2LSLtHb8ijdIV6UvqVKlAmvLcSbBVWJfbKfQf4_TlksvYoGymHj8nfHR0RTFO4yOMG74MaYCVzVGrMIsV_Ki2N20Xj773yn2YrxFCIsGsdfFTk0lI7jmu4W58JMf_XBftTpCV64DdNYk613p-9K6BEE_HGPZQvoB4DLiE9jcmKN1Q7m6y8iQpdcOqk-wBteBM1CexmQnnXyIb4pXvR4jvH2q-8XXs9MvJxfV1fX55cnqqjK8pqkyUsuWI8lNLRHTmAiBOJe6ZahDrcRMdLqjXGjc1EgiSQQzvG8lIkR0PUVkv_j4OHc9txN0BlwKelTrkH2Ee-W1Vds3zn5Tg79TtK6lFCQPOHgaEPz3GWJSk40GxlE78HNUmFHRMMKI_B-UUZzhxdaHv9BbPweXk3igUMMF53-oQY-grOt9tmiWoWqVH6QcY0QzdfQPKn8dTNZ4B73N_S3B4ZYgMwl-pkHPMarLzzfbLHpkTfAxBug30WGkln1Ty0KpZaGy81yXxN4_j3wj-L1g5Bebs82P</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1545086766</pqid></control><display><type>article</type><title>Homology-based prediction of interactions between proteins using Averaged One-Dependence Estimators</title><source>Open Access: DOAJ - Directory of Open Access Journals</source><source>MEDLINE</source><source>PubMed Central</source><source>EZB Electronic Journals Library</source><source>SpringerLink Journals - AutoHoldings</source><source>PubMed Central Open Access</source><source>Springer Nature OA Free Journals</source><creator>Murakami, Yoichi ; Mizuguchi, Kenji</creator><creatorcontrib>Murakami, Yoichi ; Mizuguchi, Kenji</creatorcontrib><description>Identification of protein-protein interactions (PPIs) is essential for a better understanding of biological processes, pathways and functions. However, experimental identification of the complete set of PPIs in a cell/organism ("an interactome") is still a difficult task. 
To circumvent limitations of current high-throughput experimental techniques, it is necessary to develop high-performance computational methods for predicting PPIs.
In this article, we propose a new computational method to predict interaction between a given pair of protein sequences using features derived from known homologous PPIs. The proposed method is capable of predicting interaction between two proteins (of unknown structure) using Averaged One-Dependence Estimators (AODE) and three features calculated for the protein pair: (a) sequence similarities to a known interacting protein pair (FSeq), (b) statistical propensities of domain pairs observed in interacting proteins (FDom) and (c) a sum of edge weights along the shortest path between homologous proteins in a PPI network (FNet). Feature vectors were defined to lie in a half-space of the symmetrical high-dimensional feature space to make them independent of the protein order. The predictability of the method was assessed by a 10-fold cross validation on a recently created human PPI dataset with randomly sampled negative data, and the best model achieved an Area Under the Curve of 0.79 (pAUC0.5% = 0.16). In addition, the AODE trained on all three features (named PSOPIA) showed better prediction performance on a separate independent data set than a recently reported homology-based method.
Our results suggest that FNet, a feature representing proximity in a known PPI network between two proteins that are homologous to a target protein pair, contributes to the prediction of whether the target proteins interact or not. PSOPIA will help identify novel PPIs and estimate complete PPI networks. The method proposed in this article is freely available on the web at http://mizuguchilab.org/PSOPIA.</description><identifier>ISSN: 1471-2105</identifier><identifier>EISSN: 1471-2105</identifier><identifier>DOI: 10.1186/1471-2105-15-213</identifier><identifier>PMID: 24953126</identifier><language>eng</language><publisher>England: BioMed Central Ltd</publisher><subject>Amino Acid Sequence ; Amino acids ; Computational Biology - methods ; Dependence ; Genetic aspects ; Grants ; Humans ; Machine learning ; Methods ; Physiological aspects ; Protein Interaction Mapping - methods ; Protein Structure, Tertiary ; Protein-protein interactions ; Proteins ; Proteins - chemistry ; Proteins - metabolism ; Science ; Sequence Homology, Amino Acid ; Web sites</subject><ispartof>BMC bioinformatics, 2014-06, Vol.15 (1), p.213-213, Article 213</ispartof><rights>COPYRIGHT 2014 BioMed Central Ltd.</rights><rights>2014 Murakami and Mizuguchi; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.</rights><rights>Copyright © 2014 Murakami and Mizuguchi; licensee BioMed Central Ltd. 
2014 Murakami and Mizuguchi; licensee BioMed Central Ltd.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c624t-c9a9b6096c2905a13770669ab50d0b9157dad467a1820909375c6fb90337df403</citedby><cites>FETCH-LOGICAL-c624t-c9a9b6096c2905a13770669ab50d0b9157dad467a1820909375c6fb90337df403</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC4229973/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC4229973/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,727,780,784,864,885,27924,27925,53791,53793</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/24953126$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Murakami, Yoichi</creatorcontrib><creatorcontrib>Mizuguchi, Kenji</creatorcontrib><title>Homology-based prediction of interactions between proteins using Averaged One-Dependence Estimators</title><title>BMC bioinformatics</title><addtitle>BMC Bioinformatics</addtitle><description>Identification of protein-protein interactions (PPIs) is essential for a better understanding of biological processes, pathways and functions. However, experimental identification of the complete set of PPIs in a cell/organism ("an interactome") is still a difficult task. To circumvent limitations of current high-throughput experimental techniques, it is necessary to develop high-performance computational methods for predicting PPIs.
In this article, we propose a new computational method to predict interaction between a given pair of protein sequences using features derived from known homologous PPIs. The proposed method is capable of predicting interaction between two proteins (of unknown structure) using Averaged One-Dependence Estimators (AODE) and three features calculated for the protein pair: (a) sequence similarities to a known interacting protein pair (FSeq), (b) statistical propensities of domain pairs observed in interacting proteins (FDom) and (c) a sum of edge weights along the shortest path between homologous proteins in a PPI network (FNet). Feature vectors were defined to lie in a half-space of the symmetrical high-dimensional feature space to make them independent of the protein order. The predictability of the method was assessed by a 10-fold cross validation on a recently created human PPI dataset with randomly sampled negative data, and the best model achieved an Area Under the Curve of 0.79 (pAUC0.5% = 0.16). In addition, the AODE trained on all three features (named PSOPIA) showed better prediction performance on a separate independent data set than a recently reported homology-based method.
Our results suggest that FNet, a feature representing proximity in a known PPI network between two proteins that are homologous to a target protein pair, contributes to the prediction of whether the target proteins interact or not. PSOPIA will help identify novel PPIs and estimate complete PPI networks. The method proposed in this article is freely available on the web at http://mizuguchilab.org/PSOPIA.</description><subject>Amino Acid Sequence</subject><subject>Amino acids</subject><subject>Computational Biology - methods</subject><subject>Dependence</subject><subject>Genetic aspects</subject><subject>Grants</subject><subject>Humans</subject><subject>Machine learning</subject><subject>Methods</subject><subject>Physiological aspects</subject><subject>Protein Interaction Mapping - methods</subject><subject>Protein Structure, Tertiary</subject><subject>Protein-protein interactions</subject><subject>Proteins</subject><subject>Proteins - chemistry</subject><subject>Proteins - metabolism</subject><subject>Science</subject><subject>Sequence Homology, Amino Acid</subject><subject>Web 
sites</subject><issn>1471-2105</issn><issn>1471-2105</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2014</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><sourceid>GNUQQ</sourceid><recordid>eNqNkktv1TAQhSMEog_Ys0KR2LSLtHb8ijdIV6UvqVKlAmvLcSbBVWJfbKfQf4_TlksvYoGymHj8nfHR0RTFO4yOMG74MaYCVzVGrMIsV_Ki2N20Xj773yn2YrxFCIsGsdfFTk0lI7jmu4W58JMf_XBftTpCV64DdNYk613p-9K6BEE_HGPZQvoB4DLiE9jcmKN1Q7m6y8iQpdcOqk-wBteBM1CexmQnnXyIb4pXvR4jvH2q-8XXs9MvJxfV1fX55cnqqjK8pqkyUsuWI8lNLRHTmAiBOJe6ZahDrcRMdLqjXGjc1EgiSQQzvG8lIkR0PUVkv_j4OHc9txN0BlwKelTrkH2Ee-W1Vds3zn5Tg79TtK6lFCQPOHgaEPz3GWJSk40GxlE78HNUmFHRMMKI_B-UUZzhxdaHv9BbPweXk3igUMMF53-oQY-grOt9tmiWoWqVH6QcY0QzdfQPKn8dTNZ4B73N_S3B4ZYgMwl-pkHPMarLzzfbLHpkTfAxBug30WGkln1Ty0KpZaGy81yXxN4_j3wj-L1g5Bebs82P</recordid><startdate>20140623</startdate><enddate>20140623</enddate><creator>Murakami, Yoichi</creator><creator>Mizuguchi, Kenji</creator><general>BioMed Central Ltd</general><general>BioMed 
Central</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>ISR</scope><scope>3V.</scope><scope>7QO</scope><scope>7SC</scope><scope>7X7</scope><scope>7XB</scope><scope>88E</scope><scope>8AL</scope><scope>8AO</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>8FH</scope><scope>8FI</scope><scope>8FJ</scope><scope>8FK</scope><scope>ABUWG</scope><scope>AEUYN</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BBNVY</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>BHPHI</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>FR3</scope><scope>FYUFA</scope><scope>GHDGH</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K7-</scope><scope>K9.</scope><scope>L7M</scope><scope>LK8</scope><scope>L~C</scope><scope>L~D</scope><scope>M0N</scope><scope>M0S</scope><scope>M1P</scope><scope>M7P</scope><scope>P5Z</scope><scope>P62</scope><scope>P64</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>Q9U</scope><scope>7X8</scope><scope>5PM</scope></search><sort><creationdate>20140623</creationdate><title>Homology-based prediction of interactions between proteins using Averaged One-Dependence Estimators</title><author>Murakami, Yoichi ; Mizuguchi, Kenji</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c624t-c9a9b6096c2905a13770669ab50d0b9157dad467a1820909375c6fb90337df403</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2014</creationdate><topic>Amino Acid Sequence</topic><topic>Amino acids</topic><topic>Computational Biology - methods</topic><topic>Dependence</topic><topic>Genetic aspects</topic><topic>Grants</topic><topic>Humans</topic><topic>Machine learning</topic><topic>Methods</topic><topic>Physiological aspects</topic><topic>Protein Interaction Mapping - 
methods</topic><topic>Protein Structure, Tertiary</topic><topic>Protein-protein interactions</topic><topic>Proteins</topic><topic>Proteins - chemistry</topic><topic>Proteins - metabolism</topic><topic>Science</topic><topic>Sequence Homology, Amino Acid</topic><topic>Web sites</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Murakami, Yoichi</creatorcontrib><creatorcontrib>Mizuguchi, Kenji</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Gale in Context: Science</collection><collection>ProQuest Central (Corporate)</collection><collection>Biotechnology Research Abstracts</collection><collection>Computer and Information Systems Abstracts</collection><collection>ProQuest_Health & Medical Collection</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Medical Database (Alumni Edition)</collection><collection>Computing Database (Alumni Edition)</collection><collection>ProQuest Pharma Collection</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>Hospital Premium Collection</collection><collection>Hospital Premium Collection (Alumni Edition)</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest One Sustainability</collection><collection>ProQuest Central</collection><collection>Advanced Technologies & Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>Biological Science Collection</collection><collection>AUTh Library subscriptions: ProQuest 
Central</collection><collection>Technology Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central</collection><collection>Engineering Research Database</collection><collection>Health Research Premium Collection</collection><collection>Health Research Premium Collection (Alumni)</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science Collection</collection><collection>Computer Science Database</collection><collection>ProQuest Health & Medical Complete (Alumni)</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Biological Sciences</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Computing Database</collection><collection>Health & Medical Collection (Alumni Edition)</collection><collection>Medical Database</collection><collection>Biological Science Database</collection><collection>Advanced Technologies & Aerospace Database</collection><collection>ProQuest Advanced Technologies & Aerospace Collection</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Publicly Available Content Database (Proquest) (PQ_SDU_P3)</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>ProQuest Central Basic</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>BMC bioinformatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Murakami, 
Yoichi</au><au>Mizuguchi, Kenji</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Homology-based prediction of interactions between proteins using Averaged One-Dependence Estimators</atitle><jtitle>BMC bioinformatics</jtitle><addtitle>BMC Bioinformatics</addtitle><date>2014-06-23</date><risdate>2014</risdate><volume>15</volume><issue>1</issue><spage>213</spage><epage>213</epage><pages>213-213</pages><artnum>213</artnum><issn>1471-2105</issn><eissn>1471-2105</eissn><abstract>Identification of protein-protein interactions (PPIs) is essential for a better understanding of biological processes, pathways and functions. However, experimental identification of the complete set of PPIs in a cell/organism ("an interactome") is still a difficult task. To circumvent limitations of current high-throughput experimental techniques, it is necessary to develop high-performance computational methods for predicting PPIs.
In this article, we propose a new computational method to predict interaction between a given pair of protein sequences using features derived from known homologous PPIs. The proposed method is capable of predicting interaction between two proteins (of unknown structure) using Averaged One-Dependence Estimators (AODE) and three features calculated for the protein pair: (a) sequence similarities to a known interacting protein pair (FSeq), (b) statistical propensities of domain pairs observed in interacting proteins (FDom) and (c) a sum of edge weights along the shortest path between homologous proteins in a PPI network (FNet). Feature vectors were defined to lie in a half-space of the symmetrical high-dimensional feature space to make them independent of the protein order. The predictability of the method was assessed by a 10-fold cross validation on a recently created human PPI dataset with randomly sampled negative data, and the best model achieved an Area Under the Curve of 0.79 (pAUC0.5% = 0.16). In addition, the AODE trained on all three features (named PSOPIA) showed better prediction performance on a separate independent data set than a recently reported homology-based method.
Our results suggest that FNet, a feature representing proximity in a known PPI network between two proteins that are homologous to a target protein pair, contributes to the prediction of whether the target proteins interact or not. PSOPIA will help identify novel PPIs and estimate complete PPI networks. The method proposed in this article is freely available on the web at http://mizuguchilab.org/PSOPIA.</abstract><cop>England</cop><pub>BioMed Central Ltd</pub><pmid>24953126</pmid><doi>10.1186/1471-2105-15-213</doi><tpages>1</tpages><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1471-2105 |
ispartof | BMC bioinformatics, 2014-06, Vol.15 (1), p.213-213, Article 213 |
issn | 1471-2105 (ISSN) ; 1471-2105 (EISSN) |
language | eng |
recordid | cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_4229973 |
source | Open Access: DOAJ - Directory of Open Access Journals; MEDLINE; PubMed Central; EZB Electronic Journals Library; SpringerLink Journals - AutoHoldings; PubMed Central Open Access; Springer Nature OA Free Journals |
subjects | Amino Acid Sequence ; Amino acids ; Computational Biology - methods ; Dependence ; Genetic aspects ; Grants ; Humans ; Machine learning ; Methods ; Physiological aspects ; Protein Interaction Mapping - methods ; Protein Structure, Tertiary ; Protein-protein interactions ; Proteins ; Proteins - chemistry ; Proteins - metabolism ; Science ; Sequence Homology, Amino Acid ; Web sites |
title | Homology-based prediction of interactions between proteins using Averaged One-Dependence Estimators |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-07T15%3A42%3A15IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Homology-based%20prediction%20of%20interactions%20between%20proteins%20using%20Averaged%20One-Dependence%20Estimators&rft.jtitle=BMC%20bioinformatics&rft.au=Murakami,%20Yoichi&rft.date=2014-06-23&rft.volume=15&rft.issue=1&rft.spage=213&rft.epage=213&rft.pages=213-213&rft.artnum=213&rft.issn=1471-2105&rft.eissn=1471-2105&rft_id=info:doi/10.1186/1471-2105-15-213&rft_dat=%3Cgale_pubme%3EA539461104%3C/gale_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1545086766&rft_id=info:pmid/24953126&rft_galeid=A539461104&rfr_iscdi=true |
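
The abstract describes FNet as a sum of edge weights along the shortest path between homologous proteins in a known PPI network. The sketch below illustrates only that idea, using Dijkstra's algorithm on a small undirected graph. It is not the PSOPIA implementation: the function names (`build_undirected`, `shortest_path_weight`), the toy protein identifiers and the edge weights are all hypothetical, and the record does not say how the authors define the weights or map a target pair to its homologs.

```python
import heapq
import math


def build_undirected(edges):
    """Build an adjacency map from (protein_a, protein_b, weight) triples.

    Hypothetical helper: each interaction is stored in both directions so the
    network is undirected.
    """
    graph = {}
    for a, b, weight in edges:
        graph.setdefault(a, {})[b] = weight
        graph.setdefault(b, {})[a] = weight
    return graph


def shortest_path_weight(graph, source, target):
    """Return the smallest sum of edge weights between source and target.

    Uses Dijkstra's algorithm, so all weights are assumed to be non-negative.
    Returns math.inf when either protein is missing or no path exists.
    """
    if source not in graph or target not in graph:
        return math.inf
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry; a shorter route was already found
        for neighbor, weight in graph[node].items():
            candidate = dist + weight
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return math.inf


# Toy network with made-up proteins and weights (assumptions for illustration).
toy_edges = [
    ("P1", "P2", 0.5),
    ("P2", "P3", 0.25),
    ("P1", "P3", 2.0),
    ("P3", "P4", 0.125),
]
toy_network = build_undirected(toy_edges)

fnet_forward = shortest_path_weight(toy_network, "P1", "P4")
fnet_reverse = shortest_path_weight(toy_network, "P4", "P1")
print(fnet_forward, fnet_reverse)
```

Because the toy network is undirected, the value is the same for (P1, P4) and (P4, P1), which matches the abstract's requirement that features be independent of the protein order. In a real setting the two query proteins would first be mapped to homologs that are present in the network, a step this sketch leaves out.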