DISOselect: Disorder predictor selection at the protein level
The intense interest in intrinsically disordered proteins in the life science community, together with the remarkable advancements in predictive technologies, has given rise to the development of a large number of computational predictors of intrinsic disorder from protein sequence. While the g...
Published in: | Protein science 2020-01, Vol.29 (1), p.184-200 |
---|---|
Main authors: | Katuwawala, Akila; Oldfield, Christopher J.; Kurgan, Lukasz |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 200 |
---|---|
container_issue | 1 |
container_start_page | 184 |
container_title | Protein science |
container_volume | 29 |
creator | Katuwawala, Akila; Oldfield, Christopher J.; Kurgan, Lukasz |
description | The intense interest in intrinsically disordered proteins in the life science community, together with the remarkable advancements in predictive technologies, has given rise to the development of a large number of computational predictors of intrinsic disorder from protein sequence. While the growing number of predictors is a positive trend, we have observed a considerable difference in predictive quality among predictors for individual proteins. Furthermore, variable predictor performance is often inconsistent between predictors for different proteins, and the predictor that shows the best predictive performance depends on the unique properties of each protein sequence. We propose a computational approach, DISOselect, to estimate the predictive performance of 12 selected predictors for individual proteins based on their unique sequence‐derived properties. This estimation informs the users about the expected predictive quality for a selected disorder predictor and can be used to recommend methods that are likely to provide the best quality predictions. Our solution does not depend on the results of any disorder predictor; the estimations are made based solely on the protein sequence. Our solution significantly improves predictive performance, as judged with a test set of 1,000 proteins, when compared to other alternatives. We have empirically shown that by using the recommended methods the overall predictive performance for a given set of proteins can be improved by a statistically significant margin. DISOselect is freely available for non‐commercial users through the webserver at http://biomine.cs.vcu.edu/servers/DISOselect/. |
doi_str_mv | 10.1002/pro.3756 |
format | Article |
fullrecord | <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_6933862</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2328843003</sourcerecordid><originalsourceid>FETCH-LOGICAL-c4386-adc3c30d15494eecedd42305ead94aeda2d02103d66a6cc3e071c66684f6eae83</originalsourceid><addsrcrecordid>eNp1kVlLAzEUhYMoWhfwF8iAL75MvVkaM4KC1BWEigv4FmJyq5HppCZTpf_eaN3Bp3A5Hx8nHELWKXQpANsex9DlOz05RzpUyKpUlbydJx2oJC0Vl2qJLKf0CACCMr5IljiVglGqOmTv8OxqkLBG2-4Whz6F6DAW44jO2zbEYhb50BSmLdoHzFFo0TdFjc9Yr5KFoakTrn28K-Tm-Oi6f1qeD07O-gfnpRVcydI4yy0HR3uiEogWnROMQw-Nq4RBZ5gDRoE7KY20liPsUCulVGIo0aDiK2R_5h1P7kboLDZtNLUeRz8ycaqD8fp30vgHfR-etax4LsCyYOtDEMPTBFOrRz5ZrGvTYJgkndsoyqDqQUY3_6CPYRKb_L1MMaUEB-DfQhtDShGHX2Uo6LdN8h302yYZ3fhZ_gv8HCED5Qx48TVO_xXpi8vBu_AV7u6WFQ</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2328843003</pqid></control><display><type>article</type><title>DISOselect: Disorder predictor selection at the protein level</title><source>Wiley Free Content</source><source>MEDLINE</source><source>Wiley Online Library Journals Frontfile Complete</source><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><source>PubMed Central</source><source>Free Full-Text Journals in Chemistry</source><creator>Katuwawala, Akila ; Oldfield, Christopher J. ; Kurgan, Lukasz</creator><creatorcontrib>Katuwawala, Akila ; Oldfield, Christopher J. ; Kurgan, Lukasz</creatorcontrib><description>The intense interest in the intrinsically disordered proteins in the life science community, together with the remarkable advancements in predictive technologies, have given rise to the development of a large number of computational predictors of intrinsic disorder from protein sequence. 
While the growing number of predictors is a positive trend, we have observed a considerable difference in predictive quality among predictors for individual proteins. Furthermore, variable predictor performance is often inconsistent between predictors for different proteins, and the predictor that shows the best predictive performance depends on the unique properties of each protein sequence. We propose a computational approach, DISOselect, to estimate the predictive performance of 12 selected predictors for individual proteins based on their unique sequence‐derived properties. This estimation informs the users about the expected predictive quality for a selected disorder predictor and can be used to recommend methods that are likely to provide the best quality predictions. Our solution does not depend on the results of any disorder predictor; the estimations are made based solely on the protein sequence. Our solution significantly improves predictive performance, as judged with a test set of 1,000 proteins, when compared to other alternatives. We have empirically shown that by using the recommended methods the overall predictive performance for a given set of proteins can be improved by a statistically significant margin. 
DISOselect is freely available for non‐commercial users through the webserver at http://biomine.cs.vcu.edu/servers/DISOselect/.</description><identifier>ISSN: 0961-8368</identifier><identifier>EISSN: 1469-896X</identifier><identifier>DOI: 10.1002/pro.3756</identifier><identifier>PMID: 31642118</identifier><language>eng</language><publisher>Hoboken, USA: John Wiley & Sons, Inc</publisher><subject>Algorithms ; Amino Acid Sequence ; Computational Biology - methods ; Computer applications ; Databases, Protein ; intrinsic disorder ; intrinsically disordered proteins ; intrinsically disordered regions ; Performance prediction ; prediction ; predictive performance ; protein properties ; Protein Unfolding ; Proteins ; Proteins - chemistry ; Proteins - genetics ; recommendation ; Sequence Analysis, Protein ; Statistical analysis ; Tools for Protein Science</subject><ispartof>Protein science, 2020-01, Vol.29 (1), p.184-200</ispartof><rights>2019 The Protein Society</rights><rights>2019 The Protein Society.</rights><rights>2020 The Protein Society</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c4386-adc3c30d15494eecedd42305ead94aeda2d02103d66a6cc3e071c66684f6eae83</citedby><cites>FETCH-LOGICAL-c4386-adc3c30d15494eecedd42305ead94aeda2d02103d66a6cc3e071c66684f6eae83</cites><orcidid>0000-0002-3362-2047 ; 0000-0002-7749-0314</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC6933862/pdf/$$EPDF$$P50$$Gpubmedcentral$$H</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC6933862/$$EHTML$$P50$$Gpubmedcentral$$H</linktohtml><link.rule.ids>230,314,723,776,780,881,1411,1427,27901,27902,45550,45551,46384,46808,53766,53768</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/31642118$$D View 
this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Katuwawala, Akila</creatorcontrib><creatorcontrib>Oldfield, Christopher J.</creatorcontrib><creatorcontrib>Kurgan, Lukasz</creatorcontrib><title>DISOselect: Disorder predictor selection at the protein level</title><title>Protein science</title><addtitle>Protein Sci</addtitle><description>The intense interest in the intrinsically disordered proteins in the life science community, together with the remarkable advancements in predictive technologies, have given rise to the development of a large number of computational predictors of intrinsic disorder from protein sequence. While the growing number of predictors is a positive trend, we have observed a considerable difference in predictive quality among predictors for individual proteins. Furthermore, variable predictor performance is often inconsistent between predictors for different proteins, and the predictor that shows the best predictive performance depends on the unique properties of each protein sequence. We propose a computational approach, DISOselect, to estimate the predictive performance of 12 selected predictors for individual proteins based on their unique sequence‐derived properties. This estimation informs the users about the expected predictive quality for a selected disorder predictor and can be used to recommend methods that are likely to provide the best quality predictions. Our solution does not depend on the results of any disorder predictor; the estimations are made based solely on the protein sequence. Our solution significantly improves predictive performance, as judged with a test set of 1,000 proteins, when compared to other alternatives. We have empirically shown that by using the recommended methods the overall predictive performance for a given set of proteins can be improved by a statistically significant margin. 
DISOselect is freely available for non‐commercial users through the webserver at http://biomine.cs.vcu.edu/servers/DISOselect/.</description><subject>Algorithms</subject><subject>Amino Acid Sequence</subject><subject>Computational Biology - methods</subject><subject>Computer applications</subject><subject>Databases, Protein</subject><subject>intrinsic disorder</subject><subject>intrinsically disordered proteins</subject><subject>intrinsically disordered regions</subject><subject>Performance prediction</subject><subject>prediction</subject><subject>predictive performance</subject><subject>protein properties</subject><subject>Protein Unfolding</subject><subject>Proteins</subject><subject>Proteins - chemistry</subject><subject>Proteins - genetics</subject><subject>recommendation</subject><subject>Sequence Analysis, Protein</subject><subject>Statistical analysis</subject><subject>Tools for Protein Science</subject><issn>0961-8368</issn><issn>1469-896X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNp1kVlLAzEUhYMoWhfwF8iAL75MvVkaM4KC1BWEigv4FmJyq5HppCZTpf_eaN3Bp3A5Hx8nHELWKXQpANsex9DlOz05RzpUyKpUlbydJx2oJC0Vl2qJLKf0CACCMr5IljiVglGqOmTv8OxqkLBG2-4Whz6F6DAW44jO2zbEYhb50BSmLdoHzFFo0TdFjc9Yr5KFoakTrn28K-Tm-Oi6f1qeD07O-gfnpRVcydI4yy0HR3uiEogWnROMQw-Nq4RBZ5gDRoE7KY20liPsUCulVGIo0aDiK2R_5h1P7kboLDZtNLUeRz8ycaqD8fp30vgHfR-etax4LsCyYOtDEMPTBFOrRz5ZrGvTYJgkndsoyqDqQUY3_6CPYRKb_L1MMaUEB-DfQhtDShGHX2Uo6LdN8h302yYZ3fhZ_gv8HCED5Qx48TVO_xXpi8vBu_AV7u6WFQ</recordid><startdate>202001</startdate><enddate>202001</enddate><creator>Katuwawala, Akila</creator><creator>Oldfield, Christopher J.</creator><creator>Kurgan, Lukasz</creator><general>John Wiley & Sons, Inc</general><general>Wiley Subscription Services, 
Inc</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7QO</scope><scope>7T5</scope><scope>7TM</scope><scope>7U9</scope><scope>8FD</scope><scope>FR3</scope><scope>H94</scope><scope>K9.</scope><scope>P64</scope><scope>RC3</scope><scope>7X8</scope><scope>5PM</scope><orcidid>https://orcid.org/0000-0002-3362-2047</orcidid><orcidid>https://orcid.org/0000-0002-7749-0314</orcidid></search><sort><creationdate>202001</creationdate><title>DISOselect: Disorder predictor selection at the protein level</title><author>Katuwawala, Akila ; Oldfield, Christopher J. ; Kurgan, Lukasz</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c4386-adc3c30d15494eecedd42305ead94aeda2d02103d66a6cc3e071c66684f6eae83</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Algorithms</topic><topic>Amino Acid Sequence</topic><topic>Computational Biology - methods</topic><topic>Computer applications</topic><topic>Databases, Protein</topic><topic>intrinsic disorder</topic><topic>intrinsically disordered proteins</topic><topic>intrinsically disordered regions</topic><topic>Performance prediction</topic><topic>prediction</topic><topic>predictive performance</topic><topic>protein properties</topic><topic>Protein Unfolding</topic><topic>Proteins</topic><topic>Proteins - chemistry</topic><topic>Proteins - genetics</topic><topic>recommendation</topic><topic>Sequence Analysis, Protein</topic><topic>Statistical analysis</topic><topic>Tools for Protein Science</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Katuwawala, Akila</creatorcontrib><creatorcontrib>Oldfield, Christopher J.</creatorcontrib><creatorcontrib>Kurgan, Lukasz</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE 
(Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Biotechnology Research Abstracts</collection><collection>Immunology Abstracts</collection><collection>Nucleic Acids Abstracts</collection><collection>Virology and AIDS Abstracts</collection><collection>Technology Research Database</collection><collection>Engineering Research Database</collection><collection>AIDS and Cancer Research Abstracts</collection><collection>ProQuest Health & Medical Complete (Alumni)</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Genetics Abstracts</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Protein science</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Katuwawala, Akila</au><au>Oldfield, Christopher J.</au><au>Kurgan, Lukasz</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>DISOselect: Disorder predictor selection at the protein level</atitle><jtitle>Protein science</jtitle><addtitle>Protein Sci</addtitle><date>2020-01</date><risdate>2020</risdate><volume>29</volume><issue>1</issue><spage>184</spage><epage>200</epage><pages>184-200</pages><issn>0961-8368</issn><eissn>1469-896X</eissn><abstract>The intense interest in the intrinsically disordered proteins in the life science community, together with the remarkable advancements in predictive technologies, have given rise to the development of a large number of computational predictors of intrinsic disorder from protein sequence. While the growing number of predictors is a positive trend, we have observed a considerable difference in predictive quality among predictors for individual proteins. 
Furthermore, variable predictor performance is often inconsistent between predictors for different proteins, and the predictor that shows the best predictive performance depends on the unique properties of each protein sequence. We propose a computational approach, DISOselect, to estimate the predictive performance of 12 selected predictors for individual proteins based on their unique sequence‐derived properties. This estimation informs the users about the expected predictive quality for a selected disorder predictor and can be used to recommend methods that are likely to provide the best quality predictions. Our solution does not depend on the results of any disorder predictor; the estimations are made based solely on the protein sequence. Our solution significantly improves predictive performance, as judged with a test set of 1,000 proteins, when compared to other alternatives. We have empirically shown that by using the recommended methods the overall predictive performance for a given set of proteins can be improved by a statistically significant margin. DISOselect is freely available for non‐commercial users through the webserver at http://biomine.cs.vcu.edu/servers/DISOselect/.</abstract><cop>Hoboken, USA</cop><pub>John Wiley & Sons, Inc</pub><pmid>31642118</pmid><doi>10.1002/pro.3756</doi><tpages>17</tpages><orcidid>https://orcid.org/0000-0002-3362-2047</orcidid><orcidid>https://orcid.org/0000-0002-7749-0314</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0961-8368 |
ispartof | Protein science, 2020-01, Vol.29 (1), p.184-200 |
issn | 0961-8368; 1469-896X |
language | eng |
recordid | cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_6933862 |
source | Wiley Free Content; MEDLINE; Wiley Online Library Journals Frontfile Complete; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals; PubMed Central; Free Full-Text Journals in Chemistry |
subjects | Algorithms; Amino Acid Sequence; Computational Biology - methods; Computer applications; Databases, Protein; intrinsic disorder; intrinsically disordered proteins; intrinsically disordered regions; Performance prediction; prediction; predictive performance; protein properties; Protein Unfolding; Proteins; Proteins - chemistry; Proteins - genetics; recommendation; Sequence Analysis, Protein; Statistical analysis; Tools for Protein Science |
title | DISOselect: Disorder predictor selection at the protein level |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-09T22%3A59%3A22IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=DISOselect:%20Disorder%20predictor%20selection%20at%20the%20protein%20level&rft.jtitle=Protein%20science&rft.au=Katuwawala,%20Akila&rft.date=2020-01&rft.volume=29&rft.issue=1&rft.spage=184&rft.epage=200&rft.pages=184-200&rft.issn=0961-8368&rft.eissn=1469-896X&rft_id=info:doi/10.1002/pro.3756&rft_dat=%3Cproquest_pubme%3E2328843003%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2328843003&rft_id=info:pmid/31642118&rfr_iscdi=true |