DeepProSite: structure-aware protein binding site prediction using ESMFold and pretrained language model
Published in: | Bioinformatics (Oxford, England), 2023-12, Vol.39 (12) |
---|---|
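Main authors: | Fang, Yitian; Jiang, Yi; Wei, Leyi; Ma, Qin; Ren, Zhixiang; Yuan, Qianmu; Wei, Dong-Qing |
Format: | Article |
Language: | eng |
Subjects: | Binding Sites; Peptides; Protein Binding; Proteins - chemistry; Software |
Online access: | Full text |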
container_end_page | |
---|---|
container_issue | 12 |
container_start_page | |
container_title | Bioinformatics (Oxford, England) |
container_volume | 39 |
creator | Fang, Yitian ; Jiang, Yi ; Wei, Leyi ; Ma, Qin ; Ren, Zhixiang ; Yuan, Qianmu ; Wei, Dong-Qing |
description | Abstract
Motivation
Identifying the functional sites of a protein, such as the binding sites of proteins, peptides, or other biological components, is crucial for understanding related biological processes and drug design. However, existing sequence-based methods have limited predictive accuracy, as they only consider sequence-adjacent contextual features and lack structural information.
Results
In this study, DeepProSite is presented as a new framework for identifying protein binding sites that utilizes both protein structure and sequence information. DeepProSite first generates protein structures with ESMFold and sequence representations with pretrained language models. It then applies a Graph Transformer and formulates binding site prediction as a graph node classification task. In predicting protein–protein/peptide binding sites, DeepProSite outperforms state-of-the-art sequence- and structure-based methods on most metrics. Moreover, DeepProSite maintains its performance when predicting on unbound structures, in contrast to competing structure-based prediction methods. DeepProSite is also extended to the prediction of binding sites for nucleic acids and other ligands, demonstrating its generalization capability. Finally, an online server for predicting multiple types of binding residues is provided as the implementation of DeepProSite.
Availability and implementation
The datasets and source code can be accessed at https://github.com/WeiLab-Biology/DeepProSite. The DeepProSite web server can be accessed at https://inner.wei-group.net/DeepProSite/. |
doi_str_mv | 10.1093/bioinformatics/btad718 |
format | Article |
pmid | 38015872 |
publisher | England: Oxford University Press |
contributor | Cowen, Lenore |
orcidid | https://orcid.org/0000-0002-4644-1464 ; https://orcid.org/0000-0003-1444-190X ; https://orcid.org/0000-0003-4200-7502 |
rights | The Author(s) 2023. Published by Oxford University Press. |
oa | free_for_read |
toplevel | peer_reviewed; online_resources |
date | 2023-12-01 |
fulltext | fulltext |
identifier | ISSN: 1367-4811 |
ispartof | Bioinformatics (Oxford, England), 2023-12, Vol.39 (12) |
issn | 1367-4811 |
language | eng |
recordid | cdi_proquest_miscellaneous_2895261823 |
source | MEDLINE; PubMed Central; Directory of Open Access Journals; Alma/SFX Local Collection; EZB Electronic Journals Library; Oxford Academic Journals (Open Access) |
subjects | Binding Sites; Peptides; Protein Binding; Proteins - chemistry; Software |
title | DeepProSite: structure-aware protein binding site prediction using ESMFold and pretrained language model |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-10T19%3A45%3A02IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=DeepProSite:%20structure-aware%20protein%20binding%20site%20prediction%20using%20ESMFold%20and%20pretrained%20language%20model&rft.jtitle=Bioinformatics%20(Oxford,%20England)&rft.au=Fang,%20Yitian&rft.date=2023-12-01&rft.volume=39&rft.issue=12&rft.issn=1367-4811&rft.eissn=1367-4811&rft_id=info:doi/10.1093/bioinformatics/btad718&rft_dat=%3Cproquest_cross%3E2895261823%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2895261823&rft_id=info:pmid/38015872&rft_oup_id=10.1093/bioinformatics/btad718&rfr_iscdi=true |
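
The description field summarizes the method as three steps: obtain a structure, obtain per-residue sequence representations, and classify each residue as a node in a graph. The Python sketch below illustrates only that general idea and is not the authors' implementation. Everything in it is an assumption made for illustration: the C-alpha coordinates and embeddings are random stand-ins for ESMFold and language-model outputs, the 10 Å contact cutoff, the single attention layer, the untrained random weights, and all function names (`build_residue_graph`, `masked_attention_layer`, `predict_binding_probabilities`) are hypothetical.

```python
import numpy as np


def build_residue_graph(coords, cutoff=10.0):
    """Boolean adjacency: residues i and j are linked when their C-alpha
    atoms lie within `cutoff` angstroms (assumed threshold)."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    return dist <= cutoff  # the diagonal is True, so every node has a self-loop


def masked_attention_layer(h, adj, w_q, w_k, w_v):
    """One self-attention step in which each residue attends only to its
    neighbours in the structure graph."""
    q, k, v = h @ w_q, h @ w_k, h @ w_v
    scores = (q @ k.T) / np.sqrt(q.shape[-1])
    scores = np.where(adj, scores, -np.inf)       # mask out non-neighbours
    scores -= scores.max(axis=-1, keepdims=True)  # numerically stable softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return h + weights @ v                        # residual connection


def predict_binding_probabilities(coords, embeddings, params, cutoff=10.0):
    """Node classification: one binding probability per residue."""
    adj = build_residue_graph(coords, cutoff)
    h = masked_attention_layer(
        embeddings, adj, params["w_q"], params["w_k"], params["w_v"]
    )
    logits = h @ params["w_out"]
    return 1.0 / (1.0 + np.exp(-logits))          # sigmoid


# Random stand-ins (assumptions): 12 residues, 16-dimensional embeddings.
rng = np.random.default_rng(0)
n_res, dim = 12, 16
coords = rng.normal(scale=8.0, size=(n_res, 3))
embeddings = rng.normal(size=(n_res, dim))
params = {
    "w_q": rng.normal(scale=0.1, size=(dim, dim)),
    "w_k": rng.normal(scale=0.1, size=(dim, dim)),
    "w_v": rng.normal(scale=0.1, size=(dim, dim)),
    "w_out": rng.normal(scale=0.1, size=dim),
}
probs = predict_binding_probabilities(coords, embeddings, params)
print(probs.round(3))
```

Because the weights are untrained, the printed probabilities carry no biological meaning; the sketch only shows how structural neighbourhoods can restrict which residues exchange information before a per-residue prediction is made.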