HITS-PR-HHblits: protein remote homology detection by combining PageRank and Hyperlink-Induced Topic Search

Abstract As one of the most important fundamental problems in protein sequence analysis, protein remote homology detection is critical for both theoretical research (protein structure and function studies) and real world applications (drug design). Although several computational predictors have been...

Bibliographic details
Published in:	Briefings in bioinformatics 2018-11, Vol.21 (1), p.298-308
Main authors:	Liu, Bin; Jiang, Shuangyan; Zou, Quan
Format:	Article
Language:	eng
Subjects:	Algorithms; Amino acid sequence; Bioinformatics; Computer applications; Drug development; Homology; Internet; Performance prediction; Protein structure; Proteins; Proteomes; Protocol (computers); Queries; Search algorithms; Search engines; Sequence analysis; Similarity; Software; Structure-function relationships
Online access:	Order full text

container_end_page	308
container_issue	1
container_start_page	298
container_title	Briefings in bioinformatics
container_volume	21
creator	Liu, Bin; Jiang, Shuangyan; Zou, Quan
description	Abstract As one of the most important fundamental problems in protein sequence analysis, protein remote homology detection is critical for both theoretical research (protein structure and function studies) and real world applications (drug design). Although several computational predictors have been proposed, their detection performance is still limited. In this study, we treat protein remote homology detection as a document retrieval task, where the proteins are considered as documents and its aim is to find the highly related documents with the query documents in a database. A protein similarity network was constructed based on the true labels of proteins in the database, and the query proteins were then connected into the network based on the similarity scores calculated by three ranking methods, including PSI-BLAST, Hmmer and HHblits. The PageRank algorithm and Hyperlink-Induced Topic Search (HITS) algorithm were respectively performed on this network to move the homologous proteins of query proteins to the neighbors of the query proteins in the network. Finally, PageRank and HITS algorithms were combined, and a predictor called HITS-PR-HHblits was proposed to further improve the predictive performance. Tested on the SCOP and SCOPe benchmark datasets, the experimental results showed that the proposed protocols outperformed other state-of-the-art methods. For the convenience of the most experimental scientists, a web server for HITS-PR-HHblits was established at http://bioinformatics.hitsz.edu.cn/HITS-PR-HHblits, by which the users can easily get the results without the need to go through the mathematical details. The HITS-PR-HHblits predictor is a protocol for protein remote homology detection using different sets of programs, which will become a very useful computational tool for proteome analysis.
doi_str_mv	10.1093/bib/bby104
format	Article
fullrecord	<record><control><sourceid>proquest_TOX</sourceid><recordid>TN_cdi_proquest_miscellaneous_2131241137</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><oup_id>10.1093/bib/bby104</oup_id><sourcerecordid>2431014186</sourcerecordid><originalsourceid>FETCH-LOGICAL-c370t-20a569b09a08c9f5f85c9fcfdef9db2dfa4238e027db77114b01f07c8fd10533</originalsourceid><addsrcrecordid>eNp90c1O3DAUBWCrAhUK3fQBkCWEVCEFrmNnnHRXoZaMhASC2Uf-HcwkdrCTRd4eo6FddNHVvYtPR9c-CH0jcEWgodfSyWspFwLsEzomjPOCQcUO3vcVLyq2okfoS0ovACXwmnxGRxQYUM7hGO3a9eapeHgs2lb2bko_8BjDZJzH0Qx5wc9hCH3YLlibyajJBY_lglUYpPPOb_GD2JpH4XdYeI3bZTSxd35XrL2eldF4E0an8JMRUT2fokMr-mS-fswTtPn9a3PTFnf3t-ubn3eFohymogRRrRoJjYBaNbaydZWHstrYRstSW8FKWhsouZacE8IkEAtc1VYTqCg9Qd_3sfklr7NJUze4pEzfC2_CnLqSUFIyQijP9Pwf-hLm6PNxXckoAcJIvcrqcq9UDClFY7sxukHEpSPQvTfQ5Qa6fQMZn31EznIw-i_98-UZXOxBmMf_Bb0B9QSOGA</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2431014186</pqid></control><display><type>article</type><title>HITS-PR-HHblits: protein remote homology detection by combining PageRank and Hyperlink-Induced Topic Search</title><source>Oxford Journals Open Access Collection</source><creator>Liu, Bin ; Jiang, Shuangyan ; Zou, Quan</creator><creatorcontrib>Liu, Bin ; Jiang, Shuangyan ; Zou, Quan</creatorcontrib><description>Abstract As one of the most important fundamental problems in protein sequence analysis, protein remote homology detection is critical for both theoretical research (protein structure and function studies) and real world applications (drug design). Although several computational predictors have been proposed, their detection performance is still limited. In this study, we treat protein remote homology detection as a document retrieval task, where the proteins are considered as documents and its aim is to find the highly related documents with the query documents in a database. 
A protein similarity network was constructed based on the true labels of proteins in the database, and the query proteins were then connected into the network based on the similarity scores calculated by three ranking methods, including PSI-BLAST, Hmmer and HHblits. The PageRank algorithm and Hyperlink-Induced Topic Search (HITS) algorithm were respectively performed on this network to move the homologous proteins of query proteins to the neighbors of the query proteins in the network. Finally, PageRank and HITS algorithms were combined, and a predictor called HITS-PR-HHblits was proposed to further improve the predictive performance. Tested on the SCOP and SCOPe benchmark datasets, the experimental results showed that the proposed protocols outperformed other state-of-the-art methods. For the convenience of the most experimental scientists, a web server for HITS-PR-HHblits was established at http://bioinformatics.hitsz.edu.cn/HITS-PR-HHblits, by which the users can easily get the results without the need to go through the mathematical details. The HITS-PR-HHblits predictor is a protocol for protein remote homology detection using different sets of programs, which will become a very useful computational tool for proteome analysis.</description><identifier>ISSN: 1467-5463</identifier><identifier>EISSN: 1477-4054</identifier><identifier>DOI: 10.1093/bib/bby104</identifier><identifier>PMID: 30403770</identifier><language>eng</language><publisher>England: Oxford University Press</publisher><subject>Algorithms ; Amino acid sequence ; Bioinformatics ; Computer applications ; Drug development ; Homology ; Internet ; Performance prediction ; Protein structure ; Proteins ; Proteomes ; Protocol (computers) ; Queries ; Search algorithms ; Search engines ; Sequence analysis ; Similarity ; Software ; Structure-function relationships</subject><ispartof>Briefings in bioinformatics, 2018-11, Vol.21 (1), p.298-308</ispartof><rights>The Author(s) 2018. 
Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com 2018</rights><rights>The Author(s) 2018. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.</rights><rights>The Author(s) 2018. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c370t-20a569b09a08c9f5f85c9fcfdef9db2dfa4238e027db77114b01f07c8fd10533</cites><orcidid>0000-0001-6406-1142</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,1604,27924,27925</link.rule.ids><linktorsrc>$$Uhttps://dx.doi.org/10.1093/bib/bby104$$EView_record_in_Oxford_University_Press$$FView_record_in_$$GOxford_University_Press</linktorsrc><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/30403770$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Liu, Bin</creatorcontrib><creatorcontrib>Jiang, Shuangyan</creatorcontrib><creatorcontrib>Zou, Quan</creatorcontrib><title>HITS-PR-HHblits: protein remote homology detection by combining PageRank and Hyperlink-Induced Topic Search</title><title>Briefings in bioinformatics</title><addtitle>Brief Bioinform</addtitle><description>Abstract As one of the most important fundamental problems in protein sequence analysis, protein remote homology detection is critical for both theoretical research (protein structure and function studies) and real world applications (drug design). Although several computational predictors have been proposed, their detection performance is still limited. 
In this study, we treat protein remote homology detection as a document retrieval task, where the proteins are considered as documents and its aim is to find the highly related documents with the query documents in a database. A protein similarity network was constructed based on the true labels of proteins in the database, and the query proteins were then connected into the network based on the similarity scores calculated by three ranking methods, including PSI-BLAST, Hmmer and HHblits. The PageRank algorithm and Hyperlink-Induced Topic Search (HITS) algorithm were respectively performed on this network to move the homologous proteins of query proteins to the neighbors of the query proteins in the network. Finally, PageRank and HITS algorithms were combined, and a predictor called HITS-PR-HHblits was proposed to further improve the predictive performance. Tested on the SCOP and SCOPe benchmark datasets, the experimental results showed that the proposed protocols outperformed other state-of-the-art methods. For the convenience of the most experimental scientists, a web server for HITS-PR-HHblits was established at http://bioinformatics.hitsz.edu.cn/HITS-PR-HHblits, by which the users can easily get the results without the need to go through the mathematical details. 
The HITS-PR-HHblits predictor is a protocol for protein remote homology detection using different sets of programs, which will become a very useful computational tool for proteome analysis.</description><subject>Algorithms</subject><subject>Amino acid sequence</subject><subject>Bioinformatics</subject><subject>Computer applications</subject><subject>Drug development</subject><subject>Homology</subject><subject>Internet</subject><subject>Performance prediction</subject><subject>Protein structure</subject><subject>Proteins</subject><subject>Proteomes</subject><subject>Protocol (computers)</subject><subject>Queries</subject><subject>Search algorithms</subject><subject>Search engines</subject><subject>Sequence analysis</subject><subject>Similarity</subject><subject>Software</subject><subject>Structure-function relationships</subject><issn>1467-5463</issn><issn>1477-4054</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2018</creationdate><recordtype>article</recordtype><recordid>eNp90c1O3DAUBWCrAhUK3fQBkCWEVCEFrmNnnHRXoZaMhASC2Uf-HcwkdrCTRd4eo6FddNHVvYtPR9c-CH0jcEWgodfSyWspFwLsEzomjPOCQcUO3vcVLyq2okfoS0ovACXwmnxGRxQYUM7hGO3a9eapeHgs2lb2bko_8BjDZJzH0Qx5wc9hCH3YLlibyajJBY_lglUYpPPOb_GD2JpH4XdYeI3bZTSxd35XrL2eldF4E0an8JMRUT2fokMr-mS-fswTtPn9a3PTFnf3t-ubn3eFohymogRRrRoJjYBaNbaydZWHstrYRstSW8FKWhsouZacE8IkEAtc1VYTqCg9Qd_3sfklr7NJUze4pEzfC2_CnLqSUFIyQijP9Pwf-hLm6PNxXckoAcJIvcrqcq9UDClFY7sxukHEpSPQvTfQ5Qa6fQMZn31EznIw-i_98-UZXOxBmMf_Bb0B9QSOGA</recordid><startdate>20181107</startdate><enddate>20181107</enddate><creator>Liu, Bin</creator><creator>Jiang, Shuangyan</creator><creator>Zou, Quan</creator><general>Oxford University Press</general><general>Oxford Publishing Limited 
(England)</general><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7QO</scope><scope>7SC</scope><scope>8FD</scope><scope>FR3</scope><scope>JQ2</scope><scope>K9.</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>P64</scope><scope>RC3</scope><scope>7X8</scope><orcidid>https://orcid.org/0000-0001-6406-1142</orcidid></search><sort><creationdate>20181107</creationdate><title>HITS-PR-HHblits: protein remote homology detection by combining PageRank and Hyperlink-Induced Topic Search</title><author>Liu, Bin ; Jiang, Shuangyan ; Zou, Quan</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c370t-20a569b09a08c9f5f85c9fcfdef9db2dfa4238e027db77114b01f07c8fd10533</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2018</creationdate><topic>Algorithms</topic><topic>Amino acid sequence</topic><topic>Bioinformatics</topic><topic>Computer applications</topic><topic>Drug development</topic><topic>Homology</topic><topic>Internet</topic><topic>Performance prediction</topic><topic>Protein structure</topic><topic>Proteins</topic><topic>Proteomes</topic><topic>Protocol (computers)</topic><topic>Queries</topic><topic>Search algorithms</topic><topic>Search engines</topic><topic>Sequence analysis</topic><topic>Similarity</topic><topic>Software</topic><topic>Structure-function relationships</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Liu, Bin</creatorcontrib><creatorcontrib>Jiang, Shuangyan</creatorcontrib><creatorcontrib>Zou, Quan</creatorcontrib><collection>PubMed</collection><collection>CrossRef</collection><collection>Biotechnology Research Abstracts</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>Engineering Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>ProQuest Health 
& Medical Complete (Alumni)</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Genetics Abstracts</collection><collection>MEDLINE - Academic</collection><jtitle>Briefings in bioinformatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Liu, Bin</au><au>Jiang, Shuangyan</au><au>Zou, Quan</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>HITS-PR-HHblits: protein remote homology detection by combining PageRank and Hyperlink-Induced Topic Search</atitle><jtitle>Briefings in bioinformatics</jtitle><addtitle>Brief Bioinform</addtitle><date>2018-11-07</date><risdate>2018</risdate><volume>21</volume><issue>1</issue><spage>298</spage><epage>308</epage><pages>298-308</pages><issn>1467-5463</issn><eissn>1477-4054</eissn><abstract>Abstract As one of the most important fundamental problems in protein sequence analysis, protein remote homology detection is critical for both theoretical research (protein structure and function studies) and real world applications (drug design). Although several computational predictors have been proposed, their detection performance is still limited. In this study, we treat protein remote homology detection as a document retrieval task, where the proteins are considered as documents and its aim is to find the highly related documents with the query documents in a database. A protein similarity network was constructed based on the true labels of proteins in the database, and the query proteins were then connected into the network based on the similarity scores calculated by three ranking methods, including PSI-BLAST, Hmmer and HHblits. 
The PageRank algorithm and Hyperlink-Induced Topic Search (HITS) algorithm were respectively performed on this network to move the homologous proteins of query proteins to the neighbors of the query proteins in the network. Finally, PageRank and HITS algorithms were combined, and a predictor called HITS-PR-HHblits was proposed to further improve the predictive performance. Tested on the SCOP and SCOPe benchmark datasets, the experimental results showed that the proposed protocols outperformed other state-of-the-art methods. For the convenience of the most experimental scientists, a web server for HITS-PR-HHblits was established at http://bioinformatics.hitsz.edu.cn/HITS-PR-HHblits, by which the users can easily get the results without the need to go through the mathematical details. The HITS-PR-HHblits predictor is a protocol for protein remote homology detection using different sets of programs, which will become a very useful computational tool for proteome analysis.</abstract><cop>England</cop><pub>Oxford University Press</pub><pmid>30403770</pmid><doi>10.1093/bib/bby104</doi><tpages>11</tpages><orcidid>https://orcid.org/0000-0001-6406-1142</orcidid></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISSN: 1467-5463
ispartof	Briefings in bioinformatics, 2018-11, Vol.21 (1), p.298-308
issn	1467-5463 1477-4054
language	eng
recordid	cdi_proquest_miscellaneous_2131241137
source	Oxford Journals Open Access Collection
subjects	Algorithms; Amino acid sequence; Bioinformatics; Computer applications; Drug development; Homology; Internet; Performance prediction; Protein structure; Proteins; Proteomes; Protocol (computers); Queries; Search algorithms; Search engines; Sequence analysis; Similarity; Software; Structure-function relationships
title	HITS-PR-HHblits: protein remote homology detection by combining PageRank and Hyperlink-Induced Topic Search
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-07T22%3A06%3A54IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_TOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=HITS-PR-HHblits:%20protein%20remote%20homology%20detection%20by%20combining%20PageRank%20and%20Hyperlink-Induced%20Topic%20Search&rft.jtitle=Briefings%20in%20bioinformatics&rft.au=Liu,%20Bin&rft.date=2018-11-07&rft.volume=21&rft.issue=1&rft.spage=298&rft.epage=308&rft.pages=298-308&rft.issn=1467-5463&rft.eissn=1477-4054&rft_id=info:doi/10.1093/bib/bby104&rft_dat=%3Cproquest_TOX%3E2431014186%3C/proquest_TOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2431014186&rft_id=info:pmid/30403770&rft_oup_id=10.1093/bib/bby104&rfr_iscdi=true
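
The description field of this record says that PageRank and HITS were run on a protein similarity network and then combined, but the record does not give the formulas. As a minimal sketch only, the Python below runs textbook PageRank and HITS on a hypothetical five-node toy network and merges the two scores with an assumed weighted average. The node names (`q`, `d1` to `d4`), the edge weights, the `alpha` parameter, and all function names are illustrative assumptions, not the authors' implementation.

```python
import math
from collections import defaultdict


def pagerank(edges, nodes, damping=0.85, iters=100):
    """Textbook power-iteration PageRank on a weighted directed graph."""
    out_weight = defaultdict(float)
    for (u, _v), w in edges.items():
        out_weight[u] += w
    n = len(nodes)
    rank = {x: 1.0 / n for x in nodes}
    for _ in range(iters):
        # Rank held by nodes without outgoing edges is spread uniformly.
        dangling = sum(rank[x] for x in nodes if out_weight[x] == 0.0)
        new = {x: (1.0 - damping) / n + damping * dangling / n for x in nodes}
        for (u, v), w in edges.items():
            new[v] += damping * rank[u] * w / out_weight[u]
        rank = new
    return rank


def hits(edges, nodes, iters=100):
    """Textbook HITS: returns (hub, authority) scores, L2-normalised."""
    hub = {x: 1.0 for x in nodes}
    auth = {x: 1.0 for x in nodes}
    for _ in range(iters):
        auth = {x: 0.0 for x in nodes}
        for (u, v), w in edges.items():
            auth[v] += w * hub[u]
        norm = math.sqrt(sum(a * a for a in auth.values())) or 1.0
        auth = {x: a / norm for x, a in auth.items()}
        hub = {x: 0.0 for x in nodes}
        for (u, v), w in edges.items():
            hub[u] += w * auth[v]
        norm = math.sqrt(sum(h * h for h in hub.values())) or 1.0
        hub = {x: h / norm for x, h in hub.items()}
    return hub, auth


def combined_ranking(edges, nodes, query, alpha=0.5):
    """ASSUMPTION: combine max-scaled PageRank and authority by weighted mean."""
    def scale(scores):
        top = max(scores.values()) or 1.0
        return {k: v / top for k, v in scores.items()}

    pr = scale(pagerank(edges, nodes))
    _, auth = hits(edges, nodes)
    auth = scale(auth)
    score = {x: alpha * pr[x] + (1.0 - alpha) * auth[x]
             for x in nodes if x != query}
    return sorted(score, key=score.get, reverse=True)


def undirected(pairs):
    """Expand (u, v, weight) triples into edges in both directions."""
    edges = {}
    for u, v, w in pairs:
        edges[(u, v)] = w
        edges[(v, u)] = w
    return edges


# Hypothetical toy network: d1/d2 share a label, d3/d4 share another label,
# and the query q is attached using made-up similarity scores.
nodes = ["q", "d1", "d2", "d3", "d4"]
edges = undirected([
    ("d1", "d2", 1.0),
    ("d3", "d4", 1.0),
    ("q", "d1", 0.9),
    ("q", "d3", 0.1),
])
ranking = combined_ranking(edges, nodes, query="q")
print(ranking)
```

On this toy network `d1`, the database protein most strongly linked to the query, comes out first. Changing `alpha` shifts the weight between the PageRank score and the HITS authority score.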