MAHOMES II: A webserver for predicting if a metal binding site is enzymatic

Recent advances have enabled high‐quality computationally generated structures for proteins with no solved crystal structures. However, protein function data remains largely limited to experimental methods and homology mapping. Since structure determines function, it is natural that methods capable...

Detailed description

Bibliographic details
Published in: Protein science 2023-04, Vol.32 (4), p.e4626-n/a
Main authors: Feehan, Ryan; Copeland, Matthew; Franklin, Meghan W.; Slusky, Joanna S. G.
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page n/a
container_issue 4
container_start_page e4626
container_title Protein science
container_volume 32
creator Feehan, Ryan
Copeland, Matthew
Franklin, Meghan W.
Slusky, Joanna S. G.
description Recent advances have enabled high‐quality computationally generated structures for proteins with no solved crystal structures. However, protein function data remains largely limited to experimental methods and homology mapping. Since structure determines function, it is natural that methods capable of using computationally generated structures for functional annotations need to be advanced. Our laboratory recently developed a method to distinguish between metalloenzyme and nonenzyme sites. Here we report improvements to this method by upgrading our physicochemical features to alleviate the need for structures with sub‐angstrom precision and using machine learning to reduce training data labeling error. Our improved classifier identifies protein bound metal sites as enzymatic or nonenzymatic with 94% precision and 92% recall. We demonstrate that both adjustments increased predictive performance and reliability on sites with sub‐angstrom variations. We constructed a set of predicted metalloprotein structures with no solved crystal structures and no detectable homology to our training data. Our model had an accuracy of 90%–97.5% depending on the quality of the predicted structures included in our test. Finally, we found the physicochemical trends that drove this model's successful performance were local protein density, second shell ionizable residue burial, and the pocket's accessibility to the site. We anticipate that our model's ability to correctly identify catalytic metal sites could enable identification of new enzymatic mechanisms and improve de novo metalloenzyme design success rates.
doi_str_mv 10.1002/pro.4626
format Article
fullrecord <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_10044107</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2786812917</sourcerecordid><originalsourceid>FETCH-LOGICAL-c4396-60b1b9417ce5777c4ee1ca2670ea885eeba0528ae87d0d8b0e4e242bee12042c3</originalsourceid><addsrcrecordid>eNp1kctqGzEUQEVoSVwn0C8ogm66GVfSyHpkE0xwG9MEhzwgO6HRXDsyMyNXGju4X99x7aZNICtxdQ_nvhD6SMmAEsK-LmMYcMHEAepRLnSmtHh4h3pEC5qpXKgj9CGlBSGEU5YfoqNcaCqkYD3042p0Mb0a3-LJ5BSP8BMUCeIaIp6FiJcRSu9a38yxn2GLa2hthQvflNuv5FvAPmFofm1q23p3jN7PbJXgZP_20f238d35RXY5_T45H11mjudaZIIUtNCcSgdDKaXjANRZJiQBq9QQoLBkyJQFJUtSqoIAB8ZZ0WGMcObyPjrbeZeroobSQdNGW5ll9LWNGxOsNy8zjX8087A23a44p0R2hi97Qww_V5BaU_vkoKpsA2GVDJNKKMo03aKfX6GLsIpNN19Hacb1MCf0n9DFkFKE2XM3lGzLsi4OZnuiDv30f_fP4N-bdEC2A558BZs3Reb6ZvpH-BvhmJl8</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2792495301</pqid></control><display><type>article</type><title>MAHOMES II: A webserver for predicting if a metal binding site is enzymatic</title><source>MEDLINE</source><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><source>Wiley Free Content</source><source>Wiley Online Library All Journals</source><source>PubMed Central</source><source>Free Full-Text Journals in Chemistry</source><creator>Feehan, Ryan ; Copeland, Matthew ; Franklin, Meghan W. ; Slusky, Joanna S. G.</creator><creatorcontrib>Feehan, Ryan ; Copeland, Matthew ; Franklin, Meghan W. ; Slusky, Joanna S. G.</creatorcontrib><description>Recent advances have enabled high‐quality computationally generated structures for proteins with no solved crystal structures. However, protein function data remains largely limited to experimental methods and homology mapping. 
Since structure determines function, it is natural that methods capable of using computationally generated structures for functional annotations need to be advanced. Our laboratory recently developed a method to distinguish between metalloenzyme and nonenzyme sites. Here we report improvements to this method by upgrading our physicochemical features to alleviate the need for structures with sub‐angstrom precision and using machine learning to reduce training data labeling error. Our improved classifier identifies protein bound metal sites as enzymatic or nonenzymatic with 94% precision and 92% recall. We demonstrate that both adjustments increased predictive performance and reliability on sites with sub‐angstrom variations. We constructed a set of predicted metalloprotein structures with no solved crystal structures and no detectable homology to our training data. Our model had an accuracy of 90%–97.5% depending on the quality of the predicted structures included in our test. Finally, we found the physicochemical trends that drove this model's successful performance were local protein density, second shell ionizable residue burial, and the pocket's accessibility to the site. 
We anticipate that our model's ability to correctly identify catalytic metal sites could enable identification of new enzymatic mechanisms and improve de novo metalloenzyme design success rates.</description><identifier>ISSN: 0961-8368</identifier><identifier>EISSN: 1469-896X</identifier><identifier>DOI: 10.1002/pro.4626</identifier><identifier>PMID: 36916762</identifier><language>eng</language><publisher>Hoboken, USA: John Wiley &amp; Sons, Inc</publisher><subject>Annotations ; Binding Sites ; Catalytic Domain ; Crystal structure ; enzymes ; Experimental methods ; Full‐length Paper ; Full‐length Papers ; Homology ; Machine learning ; metalloenzymes ; metalloproteins ; Metalloproteins - chemistry ; Metals ; Model accuracy ; Peptide mapping ; Performance prediction ; Proteins ; Reproducibility of Results ; Structure-function relationships ; Training</subject><ispartof>Protein science, 2023-04, Vol.32 (4), p.e4626-n/a</ispartof><rights>2023 The Protein Society.</rights><rights>2023 The Protein Society</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c4396-60b1b9417ce5777c4ee1ca2670ea885eeba0528ae87d0d8b0e4e242bee12042c3</citedby><cites>FETCH-LOGICAL-c4396-60b1b9417ce5777c4ee1ca2670ea885eeba0528ae87d0d8b0e4e242bee12042c3</cites><orcidid>0000-0003-0842-6340</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC10044107/pdf/$$EPDF$$P50$$Gpubmedcentral$$H</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC10044107/$$EHTML$$P50$$Gpubmedcentral$$H</linktohtml><link.rule.ids>230,314,727,780,784,885,1417,1433,27923,27924,45573,45574,46408,46832,53790,53792</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/36916762$$D View this record in 
MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Feehan, Ryan</creatorcontrib><creatorcontrib>Copeland, Matthew</creatorcontrib><creatorcontrib>Franklin, Meghan W.</creatorcontrib><creatorcontrib>Slusky, Joanna S. G.</creatorcontrib><title>MAHOMES II: A webserver for predicting if a metal binding site is enzymatic</title><title>Protein science</title><addtitle>Protein Sci</addtitle><description>Recent advances have enabled high‐quality computationally generated structures for proteins with no solved crystal structures. However, protein function data remains largely limited to experimental methods and homology mapping. Since structure determines function, it is natural that methods capable of using computationally generated structures for functional annotations need to be advanced. Our laboratory recently developed a method to distinguish between metalloenzyme and nonenzyme sites. Here we report improvements to this method by upgrading our physicochemical features to alleviate the need for structures with sub‐angstrom precision and using machine learning to reduce training data labeling error. Our improved classifier identifies protein bound metal sites as enzymatic or nonenzymatic with 94% precision and 92% recall. We demonstrate that both adjustments increased predictive performance and reliability on sites with sub‐angstrom variations. We constructed a set of predicted metalloprotein structures with no solved crystal structures and no detectable homology to our training data. Our model had an accuracy of 90%–97.5% depending on the quality of the predicted structures included in our test. Finally, we found the physicochemical trends that drove this model's successful performance were local protein density, second shell ionizable residue burial, and the pocket's accessibility to the site. 
We anticipate that our model's ability to correctly identify catalytic metal sites could enable identification of new enzymatic mechanisms and improve de novo metalloenzyme design success rates.</description><subject>Annotations</subject><subject>Binding Sites</subject><subject>Catalytic Domain</subject><subject>Crystal structure</subject><subject>enzymes</subject><subject>Experimental methods</subject><subject>Full‐length Paper</subject><subject>Full‐length Papers</subject><subject>Homology</subject><subject>Machine learning</subject><subject>metalloenzymes</subject><subject>metalloproteins</subject><subject>Metalloproteins - chemistry</subject><subject>Metals</subject><subject>Model accuracy</subject><subject>Peptide mapping</subject><subject>Performance prediction</subject><subject>Proteins</subject><subject>Reproducibility of Results</subject><subject>Structure-function relationships</subject><subject>Training</subject><issn>0961-8368</issn><issn>1469-896X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNp1kctqGzEUQEVoSVwn0C8ogm66GVfSyHpkE0xwG9MEhzwgO6HRXDsyMyNXGju4X99x7aZNICtxdQ_nvhD6SMmAEsK-LmMYcMHEAepRLnSmtHh4h3pEC5qpXKgj9CGlBSGEU5YfoqNcaCqkYD3042p0Mb0a3-LJ5BSP8BMUCeIaIp6FiJcRSu9a38yxn2GLa2hthQvflNuv5FvAPmFofm1q23p3jN7PbJXgZP_20f238d35RXY5_T45H11mjudaZIIUtNCcSgdDKaXjANRZJiQBq9QQoLBkyJQFJUtSqoIAB8ZZ0WGMcObyPjrbeZeroobSQdNGW5ll9LWNGxOsNy8zjX8087A23a44p0R2hi97Qww_V5BaU_vkoKpsA2GVDJNKKMo03aKfX6GLsIpNN19Hacb1MCf0n9DFkFKE2XM3lGzLsi4OZnuiDv30f_fP4N-bdEC2A558BZs3Reb6ZvpH-BvhmJl8</recordid><startdate>202304</startdate><enddate>202304</enddate><creator>Feehan, Ryan</creator><creator>Copeland, Matthew</creator><creator>Franklin, Meghan W.</creator><creator>Slusky, Joanna S. 
G.</creator><general>John Wiley &amp; Sons, Inc</general><general>Wiley Subscription Services, Inc</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7QO</scope><scope>7T5</scope><scope>7TM</scope><scope>7U9</scope><scope>8FD</scope><scope>FR3</scope><scope>H94</scope><scope>K9.</scope><scope>P64</scope><scope>RC3</scope><scope>7X8</scope><scope>5PM</scope><orcidid>https://orcid.org/0000-0003-0842-6340</orcidid></search><sort><creationdate>202304</creationdate><title>MAHOMES II: A webserver for predicting if a metal binding site is enzymatic</title><author>Feehan, Ryan ; Copeland, Matthew ; Franklin, Meghan W. ; Slusky, Joanna S. G.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c4396-60b1b9417ce5777c4ee1ca2670ea885eeba0528ae87d0d8b0e4e242bee12042c3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Annotations</topic><topic>Binding Sites</topic><topic>Catalytic Domain</topic><topic>Crystal structure</topic><topic>enzymes</topic><topic>Experimental methods</topic><topic>Full‐length Paper</topic><topic>Full‐length Papers</topic><topic>Homology</topic><topic>Machine learning</topic><topic>metalloenzymes</topic><topic>metalloproteins</topic><topic>Metalloproteins - chemistry</topic><topic>Metals</topic><topic>Model accuracy</topic><topic>Peptide mapping</topic><topic>Performance prediction</topic><topic>Proteins</topic><topic>Reproducibility of Results</topic><topic>Structure-function relationships</topic><topic>Training</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Feehan, Ryan</creatorcontrib><creatorcontrib>Copeland, Matthew</creatorcontrib><creatorcontrib>Franklin, Meghan W.</creatorcontrib><creatorcontrib>Slusky, Joanna S. 
G.</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Biotechnology Research Abstracts</collection><collection>Immunology Abstracts</collection><collection>Nucleic Acids Abstracts</collection><collection>Virology and AIDS Abstracts</collection><collection>Technology Research Database</collection><collection>Engineering Research Database</collection><collection>AIDS and Cancer Research Abstracts</collection><collection>ProQuest Health &amp; Medical Complete (Alumni)</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Genetics Abstracts</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Protein science</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Feehan, Ryan</au><au>Copeland, Matthew</au><au>Franklin, Meghan W.</au><au>Slusky, Joanna S. G.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>MAHOMES II: A webserver for predicting if a metal binding site is enzymatic</atitle><jtitle>Protein science</jtitle><addtitle>Protein Sci</addtitle><date>2023-04</date><risdate>2023</risdate><volume>32</volume><issue>4</issue><spage>e4626</spage><epage>n/a</epage><pages>e4626-n/a</pages><issn>0961-8368</issn><eissn>1469-896X</eissn><abstract>Recent advances have enabled high‐quality computationally generated structures for proteins with no solved crystal structures. However, protein function data remains largely limited to experimental methods and homology mapping. Since structure determines function, it is natural that methods capable of using computationally generated structures for functional annotations need to be advanced. 
Our laboratory recently developed a method to distinguish between metalloenzyme and nonenzyme sites. Here we report improvements to this method by upgrading our physicochemical features to alleviate the need for structures with sub‐angstrom precision and using machine learning to reduce training data labeling error. Our improved classifier identifies protein bound metal sites as enzymatic or nonenzymatic with 94% precision and 92% recall. We demonstrate that both adjustments increased predictive performance and reliability on sites with sub‐angstrom variations. We constructed a set of predicted metalloprotein structures with no solved crystal structures and no detectable homology to our training data. Our model had an accuracy of 90%–97.5% depending on the quality of the predicted structures included in our test. Finally, we found the physicochemical trends that drove this model's successful performance were local protein density, second shell ionizable residue burial, and the pocket's accessibility to the site. We anticipate that our model's ability to correctly identify catalytic metal sites could enable identification of new enzymatic mechanisms and improve de novo metalloenzyme design success rates.</abstract><cop>Hoboken, USA</cop><pub>John Wiley &amp; Sons, Inc</pub><pmid>36916762</pmid><doi>10.1002/pro.4626</doi><tpages>18</tpages><orcidid>https://orcid.org/0000-0003-0842-6340</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0961-8368
ispartof Protein science, 2023-04, Vol.32 (4), p.e4626-n/a
issn 0961-8368
1469-896X
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_10044107
source MEDLINE; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals; Wiley Free Content; Wiley Online Library All Journals; PubMed Central; Free Full-Text Journals in Chemistry
subjects Annotations
Binding Sites
Catalytic Domain
Crystal structure
enzymes
Experimental methods
Full‐length Paper
Full‐length Papers
Homology
Machine learning
metalloenzymes
metalloproteins
Metalloproteins - chemistry
Metals
Model accuracy
Peptide mapping
Performance prediction
Proteins
Reproducibility of Results
Structure-function relationships
Training
title MAHOMES II: A webserver for predicting if a metal binding site is enzymatic
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-08T19%3A56%3A18IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=MAHOMES%20II:%20A%20webserver%20for%20predicting%20if%20a%20metal%20binding%20site%20is%20enzymatic&rft.jtitle=Protein%20science&rft.au=Feehan,%20Ryan&rft.date=2023-04&rft.volume=32&rft.issue=4&rft.spage=e4626&rft.epage=n/a&rft.pages=e4626-n/a&rft.issn=0961-8368&rft.eissn=1469-896X&rft_id=info:doi/10.1002/pro.4626&rft_dat=%3Cproquest_pubme%3E2786812917%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2792495301&rft_id=info:pmid/36916762&rfr_iscdi=true