Reduced alphabet for protein folding prediction

ABSTRACT What are the key building blocks that would have been needed to construct complex protein folds? This is an important issue for understanding protein folding mechanism and guiding de novo protein design. Twenty naturally occurring amino acids and eight secondary structures consist of a 28‐l...

Bibliographic Details
Published in: Proteins, structure, function, and bioinformatics, 2015-04, Vol.83 (4), p.631-639
Main authors: Huang, Jitao T.; Wang, Titi; Huang, Shanran R.; Li, Xin
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 639
container_issue 4
container_start_page 631
container_title Proteins, structure, function, and bioinformatics
container_volume 83
creator Huang, Jitao T.
Wang, Titi
Huang, Shanran R.
Li, Xin
description ABSTRACT What are the key building blocks that would have been needed to construct complex protein folds? This is an important issue for understanding protein folding mechanism and guiding de novo protein design. Twenty naturally occurring amino acids and eight secondary structures consist of a 28‐letter alphabet to determine folding kinetics and mechanism. Here we predict folding kinetic rates of proteins from many reduced alphabets. We find that a reduced alphabet of 10 letters achieves good correlation with folding rates, close to the one achieved by full 28‐letter alphabet. Many other reduced alphabets are not significantly correlated to folding rates. The finding suggests that not all amino acids and secondary structures are equally important for protein folding. The foldable sequence of a protein could be designed using at least 10 folding units, which can either promote or inhibit protein folding. Reducing alphabet cardinality without losing key folding kinetic information opens the door to potentially faster machine learning and data mining applications in protein structure prediction, sequence alignment and protein design. Proteins 2015; 83:631–639. © 2015 Wiley Periodicals, Inc.
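The abstract describes re-encoding proteins in a reduced alphabet before relating them to folding rates. As a rough illustration of what such a re-encoding could look like, the sketch below maps an amino acid sequence onto a small set of group letters and computes the fraction of each letter. The six-group table, the function names, and the example sequence are all assumptions made for illustration; this is not the 10-letter alphabet reported in the article, and it leaves out the eight secondary-structure letters entirely.

```python
from collections import Counter

# HYPOTHETICAL grouping of the 20 amino acids, for illustration only.
# It is NOT the reduced alphabet reported in the article.
HYPOTHETICAL_GROUPS = {
    "A": "AVLIMC",  # aliphatic and sulfur-containing residues
    "F": "FWYH",    # aromatic residues
    "S": "STNQ",    # polar uncharged residues
    "K": "KR",      # positively charged residues
    "D": "DE",      # negatively charged residues
    "G": "GP",      # glycine and proline
}


def build_lookup(groups):
    """Invert the group table into a residue -> group-letter mapping."""
    lookup = {}
    for letter, members in groups.items():
        for residue in members:
            lookup[residue] = letter
    return lookup


def reduce_sequence(sequence, groups=HYPOTHETICAL_GROUPS):
    """Rewrite a one-letter amino acid sequence in the reduced alphabet."""
    lookup = build_lookup(groups)
    try:
        return "".join(lookup[residue] for residue in sequence.upper())
    except KeyError as exc:
        raise ValueError(f"unknown residue: {exc.args[0]!r}") from None


def composition(reduced, groups=HYPOTHETICAL_GROUPS):
    """Return the fraction of each reduced letter in the sequence."""
    if not reduced:
        raise ValueError("sequence must not be empty")
    counts = Counter(reduced)
    return {letter: counts.get(letter, 0) / len(reduced) for letter in groups}


if __name__ == "__main__":
    reduced = reduce_sequence("MKVDG")  # made-up example sequence
    print(reduced)
    print(composition(reduced))
```

Composition vectors of this kind could then serve as inputs to a regression against measured folding rates; whether the article proceeds this way is not stated in this record.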
doi_str_mv 10.1002/prot.24762
format Article
fullrecord <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_proquest_miscellaneous_1668259896</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>1665118345</sourcerecordid><originalsourceid>FETCH-LOGICAL-i3522-6ddba96434b8fb30c0852004a608466746277bd3450818422ece96ef1b1ffef53</originalsourceid><addsrcrecordid>eNqNkUtPAjEUhRujEUQ3_gBD4sbN4O27szREUEPEEAzLpjPT0eIwg_OI8u_tALJw5eqem37ntj0XoUsMAwxAbtdlUQ8Ik4IcoS6GUAaAKTtGXVBKBpQr3kFnVbUEABFScYo6hAuGGYEuup3ZpIlt0jfZ-t1Etu6nRdlvJ1qXe50lLn_zvU1cXLsiP0cnqckqe7GvPfQ6up8PH4LJdPw4vJsEjnJCApEkkQkFoyxSaUQhBsUJADMCFBNCMkGkjBLKOCisGCE2tqGwKY5wmtqU0x662c31T_lsbFXrlatim2Umt0VTaSyEIjxUofgPyjFW_i6PXv9Bl0VT5v4jLcUYET4jT13tqSZa2USvS7cy5Ub_puYBvAO-XGY3h3MMut2HbtPT233ol9l0vlXeE-w8rqrt98Fjyg8tJJVcL57Her6Qi6fRjGlMfwB4SImJ</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1664426000</pqid></control><display><type>article</type><title>Reduced alphabet for protein folding prediction</title><source>MEDLINE</source><source>Access via Wiley Online Library</source><creator>Huang, Jitao T. ; Wang, Titi ; Huang, Shanran R. ; Li, Xin</creator><creatorcontrib>Huang, Jitao T. ; Wang, Titi ; Huang, Shanran R. ; Li, Xin</creatorcontrib><description>ABSTRACT What are the key building blocks that would have been needed to construct complex protein folds? This is an important issue for understanding protein folding mechanism and guiding de novo protein design. Twenty naturally occurring amino acids and eight secondary structures consist of a 28‐letter alphabet to determine folding kinetics and mechanism. Here we predict folding kinetic rates of proteins from many reduced alphabets. We find that a reduced alphabet of 10 letters achieves good correlation with folding rates, close to the one achieved by full 28‐letter alphabet. Many other reduced alphabets are not significantly correlated to folding rates. 
The finding suggests that not all amino acids and secondary structures are equally important for protein folding. The foldable sequence of a protein could be designed using at least 10 folding units, which can either promote or inhibit protein folding. Reducing alphabet cardinality without losing key folding kinetic information opens the door to potentially faster machine learning and data mining applications in protein structure prediction, sequence alignment and protein design. Proteins 2015; 83:631–639. © 2015 Wiley Periodicals, Inc.</description><identifier>ISSN: 0887-3585</identifier><identifier>EISSN: 1097-0134</identifier><identifier>DOI: 10.1002/prot.24762</identifier><identifier>PMID: 25641420</identifier><language>eng</language><publisher>United States: Blackwell Publishing Ltd</publisher><subject>Algorithms ; Amino Acids - chemistry ; Amino Acids - metabolism ; Computational Biology - methods ; folding unit ; prediction ; Protein Folding ; Proteins - chemistry ; Proteins - metabolism ; reduced alphabet ; Sequence Analysis, Protein</subject><ispartof>Proteins, structure, function, and bioinformatics, 2015-04, Vol.83 (4), p.631-639</ispartof><rights>2015 Wiley Periodicals, Inc.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://onlinelibrary.wiley.com/doi/pdf/10.1002%2Fprot.24762$$EPDF$$P50$$Gwiley$$H</linktopdf><linktohtml>$$Uhttps://onlinelibrary.wiley.com/doi/full/10.1002%2Fprot.24762$$EHTML$$P50$$Gwiley$$H</linktohtml><link.rule.ids>314,780,784,1417,27924,27925,45574,45575</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/25641420$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Huang, Jitao T.</creatorcontrib><creatorcontrib>Wang, 
Titi</creatorcontrib><creatorcontrib>Huang, Shanran R.</creatorcontrib><creatorcontrib>Li, Xin</creatorcontrib><title>Reduced alphabet for protein folding prediction</title><title>Proteins, structure, function, and bioinformatics</title><addtitle>Proteins</addtitle><description>ABSTRACT What are the key building blocks that would have been needed to construct complex protein folds? This is an important issue for understanding protein folding mechanism and guiding de novo protein design. Twenty naturally occurring amino acids and eight secondary structures consist of a 28‐letter alphabet to determine folding kinetics and mechanism. Here we predict folding kinetic rates of proteins from many reduced alphabets. We find that a reduced alphabet of 10 letters achieves good correlation with folding rates, close to the one achieved by full 28‐letter alphabet. Many other reduced alphabets are not significantly correlated to folding rates. The finding suggests that not all amino acids and secondary structures are equally important for protein folding. The foldable sequence of a protein could be designed using at least 10 folding units, which can either promote or inhibit protein folding. Reducing alphabet cardinality without losing key folding kinetic information opens the door to potentially faster machine learning and data mining applications in protein structure prediction, sequence alignment and protein design. Proteins 2015; 83:631–639. 
© 2015 Wiley Periodicals, Inc.</description><subject>Algorithms</subject><subject>Amino Acids - chemistry</subject><subject>Amino Acids - metabolism</subject><subject>Computational Biology - methods</subject><subject>folding unit</subject><subject>prediction</subject><subject>Protein Folding</subject><subject>Proteins - chemistry</subject><subject>Proteins - metabolism</subject><subject>reduced alphabet</subject><subject>Sequence Analysis, Protein</subject><issn>0887-3585</issn><issn>1097-0134</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2015</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNqNkUtPAjEUhRujEUQ3_gBD4sbN4O27szREUEPEEAzLpjPT0eIwg_OI8u_tALJw5eqem37ntj0XoUsMAwxAbtdlUQ8Ik4IcoS6GUAaAKTtGXVBKBpQr3kFnVbUEABFScYo6hAuGGYEuup3ZpIlt0jfZ-t1Etu6nRdlvJ1qXe50lLn_zvU1cXLsiP0cnqckqe7GvPfQ6up8PH4LJdPw4vJsEjnJCApEkkQkFoyxSaUQhBsUJADMCFBNCMkGkjBLKOCisGCE2tqGwKY5wmtqU0x662c31T_lsbFXrlatim2Umt0VTaSyEIjxUofgPyjFW_i6PXv9Bl0VT5v4jLcUYET4jT13tqSZa2USvS7cy5Ub_puYBvAO-XGY3h3MMut2HbtPT233ol9l0vlXeE-w8rqrt98Fjyg8tJJVcL57Her6Qi6fRjGlMfwB4SImJ</recordid><startdate>201504</startdate><enddate>201504</enddate><creator>Huang, Jitao T.</creator><creator>Wang, Titi</creator><creator>Huang, Shanran R.</creator><creator>Li, Xin</creator><general>Blackwell Publishing Ltd</general><general>Wiley Subscription Services, Inc</general><scope>BSCLL</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>7QL</scope><scope>7QO</scope><scope>7QP</scope><scope>7QR</scope><scope>7TK</scope><scope>7TM</scope><scope>7U9</scope><scope>8FD</scope><scope>C1K</scope><scope>FR3</scope><scope>H94</scope><scope>K9.</scope><scope>M7N</scope><scope>P64</scope><scope>RC3</scope><scope>7X8</scope></search><sort><creationdate>201504</creationdate><title>Reduced alphabet for protein folding prediction</title><author>Huang, Jitao T. ; Wang, Titi ; Huang, Shanran R. 
; Li, Xin</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i3522-6ddba96434b8fb30c0852004a608466746277bd3450818422ece96ef1b1ffef53</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2015</creationdate><topic>Algorithms</topic><topic>Amino Acids - chemistry</topic><topic>Amino Acids - metabolism</topic><topic>Computational Biology - methods</topic><topic>folding unit</topic><topic>prediction</topic><topic>Protein Folding</topic><topic>Proteins - chemistry</topic><topic>Proteins - metabolism</topic><topic>reduced alphabet</topic><topic>Sequence Analysis, Protein</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Huang, Jitao T.</creatorcontrib><creatorcontrib>Wang, Titi</creatorcontrib><creatorcontrib>Huang, Shanran R.</creatorcontrib><creatorcontrib>Li, Xin</creatorcontrib><collection>Istex</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>Bacteriology Abstracts (Microbiology B)</collection><collection>Biotechnology Research Abstracts</collection><collection>Calcium &amp; Calcified Tissue Abstracts</collection><collection>Chemoreception Abstracts</collection><collection>Neurosciences Abstracts</collection><collection>Nucleic Acids Abstracts</collection><collection>Virology and AIDS Abstracts</collection><collection>Technology Research Database</collection><collection>Environmental Sciences and Pollution Management</collection><collection>Engineering Research Database</collection><collection>AIDS and Cancer Research Abstracts</collection><collection>ProQuest Health &amp; Medical Complete (Alumni)</collection><collection>Algology Mycology and Protozoology Abstracts (Microbiology C)</collection><collection>Biotechnology and BioEngineering 
Abstracts</collection><collection>Genetics Abstracts</collection><collection>MEDLINE - Academic</collection><jtitle>Proteins, structure, function, and bioinformatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Huang, Jitao T.</au><au>Wang, Titi</au><au>Huang, Shanran R.</au><au>Li, Xin</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Reduced alphabet for protein folding prediction</atitle><jtitle>Proteins, structure, function, and bioinformatics</jtitle><addtitle>Proteins</addtitle><date>2015-04</date><risdate>2015</risdate><volume>83</volume><issue>4</issue><spage>631</spage><epage>639</epage><pages>631-639</pages><issn>0887-3585</issn><eissn>1097-0134</eissn><abstract>ABSTRACT What are the key building blocks that would have been needed to construct complex protein folds? This is an important issue for understanding protein folding mechanism and guiding de novo protein design. Twenty naturally occurring amino acids and eight secondary structures consist of a 28‐letter alphabet to determine folding kinetics and mechanism. Here we predict folding kinetic rates of proteins from many reduced alphabets. We find that a reduced alphabet of 10 letters achieves good correlation with folding rates, close to the one achieved by full 28‐letter alphabet. Many other reduced alphabets are not significantly correlated to folding rates. The finding suggests that not all amino acids and secondary structures are equally important for protein folding. The foldable sequence of a protein could be designed using at least 10 folding units, which can either promote or inhibit protein folding. Reducing alphabet cardinality without losing key folding kinetic information opens the door to potentially faster machine learning and data mining applications in protein structure prediction, sequence alignment and protein design. Proteins 2015; 83:631–639. 
© 2015 Wiley Periodicals, Inc.</abstract><cop>United States</cop><pub>Blackwell Publishing Ltd</pub><pmid>25641420</pmid><doi>10.1002/prot.24762</doi><tpages>9</tpages></addata></record>
fulltext fulltext
identifier ISSN: 0887-3585
ispartof Proteins, structure, function, and bioinformatics, 2015-04, Vol.83 (4), p.631-639
issn 0887-3585
1097-0134
language eng
recordid cdi_proquest_miscellaneous_1668259896
source MEDLINE; Access via Wiley Online Library
subjects Algorithms
Amino Acids - chemistry
Amino Acids - metabolism
Computational Biology - methods
folding unit
prediction
Protein Folding
Proteins - chemistry
Proteins - metabolism
reduced alphabet
Sequence Analysis, Protein
title Reduced alphabet for protein folding prediction
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-26T19%3A55%3A15IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Reduced%20alphabet%20for%20protein%20folding%20prediction&rft.jtitle=Proteins,%20structure,%20function,%20and%20bioinformatics&rft.au=Huang,%20Jitao%20T.&rft.date=2015-04&rft.volume=83&rft.issue=4&rft.spage=631&rft.epage=639&rft.pages=631-639&rft.issn=0887-3585&rft.eissn=1097-0134&rft_id=info:doi/10.1002/prot.24762&rft_dat=%3Cproquest_pubme%3E1665118345%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1664426000&rft_id=info:pmid/25641420&rfr_iscdi=true