DISOselect: Disorder predictor selection at the protein level

The intense interest in the intrinsically disordered proteins in the life science community, together with the remarkable advancements in predictive technologies, has given rise to the development of a large number of computational predictors of intrinsic disorder from protein sequence. While the g...

Bibliographic details
Published in: Protein science 2020-01, Vol.29 (1), p.184-200
Main authors: Katuwawala, Akila, Oldfield, Christopher J., Kurgan, Lukasz
Format: Article
Language: eng
Online access: Full text
container_end_page 200
container_issue 1
container_start_page 184
container_title Protein science
container_volume 29
creator Katuwawala, Akila
Oldfield, Christopher J.
Kurgan, Lukasz
description The intense interest in the intrinsically disordered proteins in the life science community, together with the remarkable advancements in predictive technologies, has given rise to the development of a large number of computational predictors of intrinsic disorder from protein sequence. While the growing number of predictors is a positive trend, we have observed a considerable difference in predictive quality among predictors for individual proteins. Furthermore, variable predictor performance is often inconsistent between predictors for different proteins, and the predictor that shows the best predictive performance depends on the unique properties of each protein sequence. We propose a computational approach, DISOselect, to estimate the predictive performance of 12 selected predictors for individual proteins based on their unique sequence-derived properties. This estimation informs users about the expected predictive quality of a selected disorder predictor and can be used to recommend the methods that are likely to provide the best-quality predictions. Our solution does not depend on the results of any disorder predictor; the estimations are made based solely on the protein sequence. Our solution significantly improves predictive performance, as judged with a test set of 1,000 proteins, when compared to other alternatives. We have empirically shown that by using the recommended methods the overall predictive performance for a given set of proteins can be improved by a statistically significant margin. DISOselect is freely available for non-commercial users through the webserver at http://biomine.cs.vcu.edu/servers/DISOselect/.
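The recommendation step described in the abstract (rank the predictors by their estimated per-protein quality and suggest the top ones) can be illustrated with a minimal sketch. This is not the authors' implementation: the function `recommend_predictors`, the predictor names, and the scores are hypothetical placeholders, and the sketch assumes the per-protein quality estimates have already been computed by some other means.

```python
from typing import Dict, List, Tuple


def recommend_predictors(
    estimated_quality: Dict[str, float], top_k: int = 1
) -> List[Tuple[str, float]]:
    """Return the top_k predictors ranked by estimated quality (higher is better).

    Ties are broken alphabetically by predictor name so the result is deterministic.
    """
    if top_k < 1:
        raise ValueError("top_k must be at least 1")
    ranked = sorted(estimated_quality.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_k]


# Hypothetical quality estimates for a single protein (placeholder names and values).
example_estimates = {"predictor_A": 0.71, "predictor_B": 0.84, "predictor_C": 0.79}
best = recommend_predictors(example_estimates, top_k=2)
print(best)
```

In this toy form the selection is a plain sort; the substantive part of the published method, estimating the quality values from the sequence alone, is outside the scope of the sketch.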
doi_str_mv 10.1002/pro.3756
format Article
identifier_pmid 31642118
oa free_for_read
orcid https://orcid.org/0000-0002-3362-2047
https://orcid.org/0000-0002-7749-0314
peer_reviewed true
publisher Hoboken, USA: John Wiley & Sons, Inc
rights 2019 The Protein Society
2020 The Protein Society
tpages 17
fulltext fulltext
identifier ISSN: 0961-8368
ispartof Protein science, 2020-01, Vol.29 (1), p.184-200
issn 0961-8368
1469-896X
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_6933862
source Wiley Free Content; MEDLINE; Wiley Online Library Journals Frontfile Complete; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals; PubMed Central; Free Full-Text Journals in Chemistry
subjects Algorithms
Amino Acid Sequence
Computational Biology - methods
Computer applications
Databases, Protein
intrinsic disorder
intrinsically disordered proteins
intrinsically disordered regions
Performance prediction
prediction
predictive performance
protein properties
Protein Unfolding
Proteins
Proteins - chemistry
Proteins - genetics
recommendation
Sequence Analysis, Protein
Statistical analysis
Tools for Protein Science
title DISOselect: Disorder predictor selection at the protein level
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-09T22%3A59%3A22IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=DISOselect:%20Disorder%20predictor%20selection%20at%20the%20protein%20level&rft.jtitle=Protein%20science&rft.au=Katuwawala,%20Akila&rft.date=2020-01&rft.volume=29&rft.issue=1&rft.spage=184&rft.epage=200&rft.pages=184-200&rft.issn=0961-8368&rft.eissn=1469-896X&rft_id=info:doi/10.1002/pro.3756&rft_dat=%3Cproquest_pubme%3E2328843003%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2328843003&rft_id=info:pmid/31642118&rfr_iscdi=true
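
The resolver link above encodes the citation as URL query parameters (the `rft.*` and `rft_id` keys). As a rough sketch of how such a link can be decoded, the snippet below uses Python's standard `urllib.parse` module. The helper name `openurl_fields` is hypothetical, and `example_url` is an abbreviated copy of the link above that keeps only a subset of its parameters.

```python
from typing import Dict, List
from urllib.parse import parse_qs, urlsplit


def openurl_fields(url: str) -> Dict[str, List[str]]:
    """Extract the referent (rft*) parameters from a resolver URL.

    Values are percent-decoded; repeated keys such as rft_id keep all values.
    """
    params = parse_qs(urlsplit(url).query)
    return {key: values for key, values in params.items() if key.startswith("rft")}


# Abbreviated version of the record's resolver URL (subset of the original parameters).
example_url = (
    "https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004"
    "&rft.genre=article"
    "&rft.atitle=DISOselect:%20Disorder%20predictor%20selection%20at%20the%20protein%20level"
    "&rft.jtitle=Protein%20science&rft.date=2020-01&rft.volume=29&rft.issue=1"
    "&rft.spage=184&rft.epage=200"
    "&rft_id=info:doi/10.1002/pro.3756&rft_id=info:pmid/31642118"
)

fields = openurl_fields(example_url)
for key, values in sorted(fields.items()):
    print(key, values)
```

Because `parse_qs` returns a list for every key, the two `rft_id` entries (DOI and PMID) are both preserved rather than one overwriting the other.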