Inter-domain distance prediction based on deep learning for domain assembly
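Published in: | Briefings in bioinformatics 2023-05, Vol.24 (3) |
---|---|
Main authors: | Ge, Fengqi; Peng, Chunxiang; Cui, Xinyue; Xia, Yuhao; Zhang, Guijun |
Format: | Article |
Language: | eng |
Subjects: | Accuracy; Algorithms; Assembly; Computational Biology; Databases, Protein; Deep Learning; Humans; Information processing; Neural networks; Neural Networks, Computer; Predictions; Protein structure; Proteins; Proteins - chemistry |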
container_end_page | |
---|---|
container_issue | 3 |
container_start_page | |
container_title | Briefings in bioinformatics |
container_volume | 24 |
creator | Ge, Fengqi; Peng, Chunxiang; Cui, Xinyue; Xia, Yuhao; Zhang, Guijun |
description | Abstract
AlphaFold2 achieved a breakthrough in protein structure prediction through the end-to-end deep learning method, which can predict nearly all single-domain proteins at experimental resolution. However, the prediction accuracy of full-chain proteins is generally lower than that of single-domain proteins because of the incorrect interactions between domains. In this work, we develop an inter-domain distance prediction method, named DeepIDDP. In DeepIDDP, we design a neural network with attention mechanisms, where two new inter-domain features are used to enhance the ability to capture the interactions between domains. Furthermore, we propose a data enhancement strategy termed DPMSA, which is employed to deal with the absence of co-evolutionary information on targets. We integrate DeepIDDP into our previously developed domain assembly method SADA, termed SADA-DeepIDDP. Tested on a given multi-domain benchmark dataset, the accuracy of SADA-DeepIDDP inter-domain distance prediction is 11.3% and 21.6% higher than trRosettaX and trRosetta, respectively. The accuracy of the domain assembly model is 2.5% higher than that of SADA. Meanwhile, we reassemble 68 human multi-domain protein models with TM-score ≤ 0.80 from the AlphaFold protein structure database, where the average TM-score is improved by 11.8% after the reassembly by our method. The online server is at http://zhanglab-bioinf.com/DeepIDDP/. |
doi_str_mv | 10.1093/bib/bbad100 |
format | Article |
fullrecord | <record><control><sourceid>proquest_TOX</sourceid><recordid>TN_cdi_proquest_miscellaneous_2787212951</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><oup_id>10.1093/bib/bbad100</oup_id><sourcerecordid>3040989309</sourcerecordid><originalsourceid>FETCH-LOGICAL-c385t-a10aa580934b200e29efab7ec5ff78b76290ed39dc70a76d72b713a723f1232d3</originalsourceid><addsrcrecordid>eNp9kE1LxDAURYMozji6ci8FQQSp85K0TbOUwY_BATe6LknzKh3atCbtYv69Gaa6cOHq3cV5l8Ml5JLCPQXJl7rWS62VoQBHZE4TIeIE0uR4nzMRp0nGZ-TM-y0AA5HTUzLjmWQAEubkdW0HdLHpWlXbyNR-ULbEqHdo6nKoOxtp5dFEIRjEPmpQOVvbz6jqXDR9Ke-x1c3unJxUqvF4Md0F-Xh6fF-9xJu35_XqYROXPE-HWFFQKs2DeqKDBTKJldICy7SqRK5FxiSg4dKUApTIjGBaUK4E4xVlnBm-ILeH3t51XyP6oWhrX2LTKIvd6AsmcsEokykN6PUfdNuNzga7gkMCMpc8eCzI3YEqXee9w6roXd0qtysoFPuNi7BxMW0c6Kupc9Qtml_2Z9QA3ByAbuz_bfoGGYKD_w</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3040989309</pqid></control><display><type>article</type><title>Inter-domain distance prediction based on deep learning for domain assembly</title><source>Oxford Journals Open Access Collection</source><creator>Ge, Fengqi ; Peng, Chunxiang ; Cui, Xinyue ; Xia, Yuhao ; Zhang, Guijun</creator><creatorcontrib>Ge, Fengqi ; Peng, Chunxiang ; Cui, Xinyue ; Xia, Yuhao ; Zhang, Guijun</creatorcontrib><description>Abstract
AlphaFold2 achieved a breakthrough in protein structure prediction through the end-to-end deep learning method, which can predict nearly all single-domain proteins at experimental resolution. However, the prediction accuracy of full-chain proteins is generally lower than that of single-domain proteins because of the incorrect interactions between domains. In this work, we develop an inter-domain distance prediction method, named DeepIDDP. In DeepIDDP, we design a neural network with attention mechanisms, where two new inter-domain features are used to enhance the ability to capture the interactions between domains. Furthermore, we propose a data enhancement strategy termed DPMSA, which is employed to deal with the absence of co-evolutionary information on targets. We integrate DeepIDDP into our previously developed domain assembly method SADA, termed SADA-DeepIDDP. Tested on a given multi-domain benchmark dataset, the accuracy of SADA-DeepIDDP inter-domain distance prediction is 11.3% and 21.6% higher than trRosettaX and trRosetta, respectively. The accuracy of the domain assembly model is 2.5% higher than that of SADA. Meanwhile, we reassemble 68 human multi-domain protein models with TM-score ≤ 0.80 from the AlphaFold protein structure database, where the average TM-score is improved by 11.8% after the reassembly by our method. 
The online server is at http://zhanglab-bioinf.com/DeepIDDP/.</description><identifier>ISSN: 1467-5463</identifier><identifier>EISSN: 1477-4054</identifier><identifier>DOI: 10.1093/bib/bbad100</identifier><identifier>PMID: 36920090</identifier><language>eng</language><publisher>England: Oxford University Press</publisher><subject>Accuracy ; Algorithms ; Assembly ; Computational Biology ; Databases, Protein ; Deep Learning ; Humans ; Information processing ; Neural networks ; Neural Networks, Computer ; Predictions ; Protein structure ; Proteins ; Proteins - chemistry</subject><ispartof>Briefings in bioinformatics, 2023-05, Vol.24 (3)</ispartof><rights>The Author(s) 2023. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com 2023</rights><rights>The Author(s) 2023. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.</rights><rights>The Author(s) 2023. Published by Oxford University Press. All rights reserved. 
For Permissions, please email: journals.permissions@oup.com</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c385t-a10aa580934b200e29efab7ec5ff78b76290ed39dc70a76d72b713a723f1232d3</citedby><cites>FETCH-LOGICAL-c385t-a10aa580934b200e29efab7ec5ff78b76290ed39dc70a76d72b713a723f1232d3</cites><orcidid>0000-0002-7815-5884</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,776,780,1598,27903,27904</link.rule.ids><linktorsrc>$$Uhttps://dx.doi.org/10.1093/bib/bbad100$$EView_record_in_Oxford_University_Press$$FView_record_in_$$GOxford_University_Press</linktorsrc><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/36920090$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Ge, Fengqi</creatorcontrib><creatorcontrib>Peng, Chunxiang</creatorcontrib><creatorcontrib>Cui, Xinyue</creatorcontrib><creatorcontrib>Xia, Yuhao</creatorcontrib><creatorcontrib>Zhang, Guijun</creatorcontrib><title>Inter-domain distance prediction based on deep learning for domain assembly</title><title>Briefings in bioinformatics</title><addtitle>Brief Bioinform</addtitle><description>Abstract
AlphaFold2 achieved a breakthrough in protein structure prediction through the end-to-end deep learning method, which can predict nearly all single-domain proteins at experimental resolution. However, the prediction accuracy of full-chain proteins is generally lower than that of single-domain proteins because of the incorrect interactions between domains. In this work, we develop an inter-domain distance prediction method, named DeepIDDP. In DeepIDDP, we design a neural network with attention mechanisms, where two new inter-domain features are used to enhance the ability to capture the interactions between domains. Furthermore, we propose a data enhancement strategy termed DPMSA, which is employed to deal with the absence of co-evolutionary information on targets. We integrate DeepIDDP into our previously developed domain assembly method SADA, termed SADA-DeepIDDP. Tested on a given multi-domain benchmark dataset, the accuracy of SADA-DeepIDDP inter-domain distance prediction is 11.3% and 21.6% higher than trRosettaX and trRosetta, respectively. The accuracy of the domain assembly model is 2.5% higher than that of SADA. Meanwhile, we reassemble 68 human multi-domain protein models with TM-score ≤ 0.80 from the AlphaFold protein structure database, where the average TM-score is improved by 11.8% after the reassembly by our method. 
The online server is at http://zhanglab-bioinf.com/DeepIDDP/.</description><subject>Accuracy</subject><subject>Algorithms</subject><subject>Assembly</subject><subject>Computational Biology</subject><subject>Databases, Protein</subject><subject>Deep Learning</subject><subject>Humans</subject><subject>Information processing</subject><subject>Neural networks</subject><subject>Neural Networks, Computer</subject><subject>Predictions</subject><subject>Protein structure</subject><subject>Proteins</subject><subject>Proteins - chemistry</subject><issn>1467-5463</issn><issn>1477-4054</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNp9kE1LxDAURYMozji6ci8FQQSp85K0TbOUwY_BATe6LknzKh3atCbtYv69Gaa6cOHq3cV5l8Ml5JLCPQXJl7rWS62VoQBHZE4TIeIE0uR4nzMRp0nGZ-TM-y0AA5HTUzLjmWQAEubkdW0HdLHpWlXbyNR-ULbEqHdo6nKoOxtp5dFEIRjEPmpQOVvbz6jqXDR9Ke-x1c3unJxUqvF4Md0F-Xh6fF-9xJu35_XqYROXPE-HWFFQKs2DeqKDBTKJldICy7SqRK5FxiSg4dKUApTIjGBaUK4E4xVlnBm-ILeH3t51XyP6oWhrX2LTKIvd6AsmcsEokykN6PUfdNuNzga7gkMCMpc8eCzI3YEqXee9w6roXd0qtysoFPuNi7BxMW0c6Kupc9Qtml_2Z9QA3ByAbuz_bfoGGYKD_w</recordid><startdate>20230519</startdate><enddate>20230519</enddate><creator>Ge, Fengqi</creator><creator>Peng, Chunxiang</creator><creator>Cui, Xinyue</creator><creator>Xia, Yuhao</creator><creator>Zhang, Guijun</creator><general>Oxford University Press</general><general>Oxford Publishing Limited (England)</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7QO</scope><scope>7SC</scope><scope>8FD</scope><scope>FR3</scope><scope>JQ2</scope><scope>K9.</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>P64</scope><scope>RC3</scope><scope>7X8</scope><orcidid>https://orcid.org/0000-0002-7815-5884</orcidid></search><sort><creationdate>20230519</creationdate><title>Inter-domain distance prediction 
based on deep learning for domain assembly</title><author>Ge, Fengqi ; Peng, Chunxiang ; Cui, Xinyue ; Xia, Yuhao ; Zhang, Guijun</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c385t-a10aa580934b200e29efab7ec5ff78b76290ed39dc70a76d72b713a723f1232d3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Accuracy</topic><topic>Algorithms</topic><topic>Assembly</topic><topic>Computational Biology</topic><topic>Databases, Protein</topic><topic>Deep Learning</topic><topic>Humans</topic><topic>Information processing</topic><topic>Neural networks</topic><topic>Neural Networks, Computer</topic><topic>Predictions</topic><topic>Protein structure</topic><topic>Proteins</topic><topic>Proteins - chemistry</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Ge, Fengqi</creatorcontrib><creatorcontrib>Peng, Chunxiang</creatorcontrib><creatorcontrib>Cui, Xinyue</creatorcontrib><creatorcontrib>Xia, Yuhao</creatorcontrib><creatorcontrib>Zhang, Guijun</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Biotechnology Research Abstracts</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>Engineering Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>ProQuest Health & Medical Complete (Alumni)</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Biotechnology and BioEngineering 
Abstracts</collection><collection>Genetics Abstracts</collection><collection>MEDLINE - Academic</collection><jtitle>Briefings in bioinformatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Ge, Fengqi</au><au>Peng, Chunxiang</au><au>Cui, Xinyue</au><au>Xia, Yuhao</au><au>Zhang, Guijun</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Inter-domain distance prediction based on deep learning for domain assembly</atitle><jtitle>Briefings in bioinformatics</jtitle><addtitle>Brief Bioinform</addtitle><date>2023-05-19</date><risdate>2023</risdate><volume>24</volume><issue>3</issue><issn>1467-5463</issn><eissn>1477-4054</eissn><abstract>Abstract
AlphaFold2 achieved a breakthrough in protein structure prediction through the end-to-end deep learning method, which can predict nearly all single-domain proteins at experimental resolution. However, the prediction accuracy of full-chain proteins is generally lower than that of single-domain proteins because of the incorrect interactions between domains. In this work, we develop an inter-domain distance prediction method, named DeepIDDP. In DeepIDDP, we design a neural network with attention mechanisms, where two new inter-domain features are used to enhance the ability to capture the interactions between domains. Furthermore, we propose a data enhancement strategy termed DPMSA, which is employed to deal with the absence of co-evolutionary information on targets. We integrate DeepIDDP into our previously developed domain assembly method SADA, termed SADA-DeepIDDP. Tested on a given multi-domain benchmark dataset, the accuracy of SADA-DeepIDDP inter-domain distance prediction is 11.3% and 21.6% higher than trRosettaX and trRosetta, respectively. The accuracy of the domain assembly model is 2.5% higher than that of SADA. Meanwhile, we reassemble 68 human multi-domain protein models with TM-score ≤ 0.80 from the AlphaFold protein structure database, where the average TM-score is improved by 11.8% after the reassembly by our method. The online server is at http://zhanglab-bioinf.com/DeepIDDP/.</abstract><cop>England</cop><pub>Oxford University Press</pub><pmid>36920090</pmid><doi>10.1093/bib/bbad100</doi><orcidid>https://orcid.org/0000-0002-7815-5884</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1467-5463 |
ispartof | Briefings in bioinformatics, 2023-05, Vol.24 (3) |
issn | 1467-5463; 1477-4054 |
language | eng |
recordid | cdi_proquest_miscellaneous_2787212951 |
source | Oxford Journals Open Access Collection |
subjects | Accuracy; Algorithms; Assembly; Computational Biology; Databases, Protein; Deep Learning; Humans; Information processing; Neural networks; Neural Networks, Computer; Predictions; Protein structure; Proteins; Proteins - chemistry |
title | Inter-domain distance prediction based on deep learning for domain assembly |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-28T03%3A15%3A54IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_TOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Inter-domain%20distance%20prediction%20based%20on%20deep%20learning%20for%20domain%20assembly&rft.jtitle=Briefings%20in%20bioinformatics&rft.au=Ge,%20Fengqi&rft.date=2023-05-19&rft.volume=24&rft.issue=3&rft.issn=1467-5463&rft.eissn=1477-4054&rft_id=info:doi/10.1093/bib/bbad100&rft_dat=%3Cproquest_TOX%3E3040989309%3C/proquest_TOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3040989309&rft_id=info:pmid/36920090&rft_oup_id=10.1093/bib/bbad100&rfr_iscdi=true |
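
The abstract in this record refers to inter-domain distances without defining them. As a rough illustration only, and not the DeepIDDP method itself, the sketch below computes the block of pairwise distances between residues of two domains from per-residue coordinates. The function name `inter_domain_distances`, the single `boundary` index, and the toy coordinates are all assumptions made for this example; none of them come from the article.

```python
import math


def inter_domain_distances(coords, boundary):
    """Return the inter-domain block of a pairwise distance map.

    coords   -- list of (x, y, z) tuples, one per residue (hypothetical input)
    boundary -- index of the first residue of the second domain (assumed:
                exactly two contiguous domains)
    """
    if not 0 < boundary < len(coords):
        raise ValueError("boundary must split the chain into two non-empty domains")
    first, second = coords[:boundary], coords[boundary:]
    # Row i, column j: distance between residue i of domain 1
    # and residue j of domain 2.
    return [[math.dist(a, b) for b in second] for a in first]


# Toy chain of four residues, split into two domains of two residues each.
toy_coords = [
    (0.0, 0.0, 0.0),
    (3.8, 0.0, 0.0),
    (3.8, 3.0, 4.0),
    (0.0, 3.0, 4.0),
]
block = inter_domain_distances(toy_coords, boundary=2)
```

A predictor of the kind described in the abstract would estimate such a block from sequence-derived features rather than from known coordinates; the sketch only shows which residue pairs the term covers.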