Characterizing the state of the art in the computational assignment of gene function: lessons from the first critical assessment of functional annotation (CAFA)

The assignment of gene function remains a difficult but important task in computational biology. The establishment of the first Critical Assessment of Functional Annotation (CAFA) was aimed at increasing progress in the field. We present an independent analysis of the results of CAFA, aimed at ident...

Detailed description

Bibliographic details
Published in: BMC bioinformatics 2013-04, Vol.14 Suppl 3 (Suppl 3), p.S15-S15, Article S15
Main authors: Gillis, Jesse; Pavlidis, Paul
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page S15
container_issue Suppl 3
container_start_page S15
container_title BMC bioinformatics
container_volume 14 Suppl 3
creator Gillis, Jesse
Pavlidis, Paul
description The assignment of gene function remains a difficult but important task in computational biology. The establishment of the first Critical Assessment of Functional Annotation (CAFA) was aimed at increasing progress in the field. We present an independent analysis of the results of CAFA, aimed at identifying challenges in assessment and at understanding trends in prediction performance. We found that well-accepted methods based on sequence similarity (i.e., BLAST) have a dominant effect. Many of the most informative predictions turned out to be either recovering existing knowledge about sequence similarity or were "post-dictions" already documented in the literature. These results indicate that deep challenges remain in even defining the task of function assignment, with a particular difficulty posed by the problem of defining function in a way that is not dependent on either flawed gold standards or the input data itself. In particular, we suggest that using the Gene Ontology (or other similar systematizations of function) as a gold standard is unlikely to be the way forward.
doi_str_mv 10.1186/1471-2105-14-s3-s15
format Article
fullrecord <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_3633048</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>1352287384</sourcerecordid><originalsourceid>FETCH-LOGICAL-b5605-4148e2a2ac530cde19af7fcb9ccd93dab3796f116842f07f33f90e17f1dd1603</originalsourceid><addsrcrecordid>eNqNUs2KFDEQbkRx19UnECTgZT20prrSfx6EYXBUWPCwew_pdDKTpTsZk25Bn8ZHNZneGXaFBU-p5PtJ8VVl2Wug7wGa6gOwGvICaJkDywPmAcon2fnp9em9-ix7EcItpVA3tHyenRVYIW0bPM_-rHfCCzkpb34buyXTTpEwiUkRpw8X4Sdi7KGUbtzPETPOioGIEMzWjspOibpVVhE9W5nQj2RQITgbiPZuPGi18WEi0pvJyEUcGUfxUZcAa93yBblcrzardy-zZ1oMQb26Oy-ym83nm_XX_Or7l2_r1VXelVWMgAFrVCEKIUukslfQCl1r2bVS9i32osO6rTRA1bBC01oj6pYqqDX0PVQUL7JPi-1-7kbVy9iZFwPfezMK_4s7YfhDxJod37qfPCaJlDXRYLMYdMY9YvAQiWnyNCCeBhQrfo38GspodHnXiXc_ZhUmPpog1TAIq9wcOGBZFE2NDfsPKqtZhQyT69t_qLdu9jHxxMKqrIqmTSxcWNK7ELzSp_6B8rR0j3T85n52J81xy_AveV3Wiw</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1336562895</pqid></control><display><type>article</type><title>Characterizing the state of the art in the computational assignment of gene function: lessons from the first critical assessment of functional annotation (CAFA)</title><source>SpringerOpen</source><source>MEDLINE</source><source>PubMed Central</source><source>Directory of Open Access Journals</source><source>EZB Electronic Journals Library</source><source>SpringerLink Journals - AutoHoldings</source><source>PubMed Central Open Access</source><creator>Gillis, Jesse ; Pavlidis, Paul</creator><creatorcontrib>Gillis, Jesse ; Pavlidis, Paul</creatorcontrib><description>The assignment of gene function remains a difficult but important task in computational biology. The establishment of the first Critical Assessment of Functional Annotation (CAFA) was aimed at increasing progress in the field. 
We present an independent analysis of the results of CAFA, aimed at identifying challenges in assessment and at understanding trends in prediction performance. We found that well-accepted methods based on sequence similarity (i.e., BLAST) have a dominant effect. Many of the most informative predictions turned out to be either recovering existing knowledge about sequence similarity or were "post-dictions" already documented in the literature. These results indicate that deep challenges remain in even defining the task of function assignment, with a particular difficulty posed by the problem of defining function in a way that is not dependent on either flawed gold standards or the input data itself. In particular, we suggest that using the Gene Ontology (or other similar systematizations of function) as a gold standard is unlikely to be the way forward.</description><identifier>ISSN: 1471-2105</identifier><identifier>EISSN: 1471-2105</identifier><identifier>DOI: 10.1186/1471-2105-14-s3-s15</identifier><identifier>PMID: 23630983</identifier><language>eng</language><publisher>England: BioMed Central</publisher><subject>Algorithms ; Computational Biology - methods ; Databases, Protein ; Genes ; Molecular Sequence Annotation ; Proceedings ; Proteins - chemistry ; Proteins - genetics ; Proteins - physiology ; Vocabulary, Controlled</subject><ispartof>BMC bioinformatics, 2013-04, Vol.14 Suppl 3 (Suppl 3), p.S15-S15, Article S15</ispartof><rights>2013 Gillis and Pavlidis; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</rights><rights>Copyright © 2013 Gillis and Pavlidis; licensee BioMed Central Ltd. 
2013 Gillis and Pavlidis; licensee BioMed Central Ltd.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-b5605-4148e2a2ac530cde19af7fcb9ccd93dab3796f116842f07f33f90e17f1dd1603</citedby><cites>FETCH-LOGICAL-b5605-4148e2a2ac530cde19af7fcb9ccd93dab3796f116842f07f33f90e17f1dd1603</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC3633048/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC3633048/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,727,780,784,864,885,27924,27925,53791,53793</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/23630983$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Gillis, Jesse</creatorcontrib><creatorcontrib>Pavlidis, Paul</creatorcontrib><title>Characterizing the state of the art in the computational assignment of gene function: lessons from the first critical assessment of functional annotation (CAFA)</title><title>BMC bioinformatics</title><addtitle>BMC Bioinformatics</addtitle><description>The assignment of gene function remains a difficult but important task in computational biology. The establishment of the first Critical Assessment of Functional Annotation (CAFA) was aimed at increasing progress in the field. We present an independent analysis of the results of CAFA, aimed at identifying challenges in assessment and at understanding trends in prediction performance. We found that well-accepted methods based on sequence similarity (i.e., BLAST) have a dominant effect. 
Many of the most informative predictions turned out to be either recovering existing knowledge about sequence similarity or were "post-dictions" already documented in the literature. These results indicate that deep challenges remain in even defining the task of function assignment, with a particular difficulty posed by the problem of defining function in a way that is not dependent on either flawed gold standards or the input data itself. In particular, we suggest that using the Gene Ontology (or other similar systematizations of function) as a gold standard is unlikely to be the way forward.</description><subject>Algorithms</subject><subject>Computational Biology - methods</subject><subject>Databases, Protein</subject><subject>Genes</subject><subject>Molecular Sequence Annotation</subject><subject>Proceedings</subject><subject>Proteins - chemistry</subject><subject>Proteins - genetics</subject><subject>Proteins - physiology</subject><subject>Vocabulary, Controlled</subject><issn>1471-2105</issn><issn>1471-2105</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2013</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><sourceid>GNUQQ</sourceid><recordid>eNqNUs2KFDEQbkRx19UnECTgZT20prrSfx6EYXBUWPCwew_pdDKTpTsZk25Bn8ZHNZneGXaFBU-p5PtJ8VVl2Wug7wGa6gOwGvICaJkDywPmAcon2fnp9em9-ix7EcItpVA3tHyenRVYIW0bPM_-rHfCCzkpb34buyXTTpEwiUkRpw8X4Sdi7KGUbtzPETPOioGIEMzWjspOibpVVhE9W5nQj2RQITgbiPZuPGi18WEi0pvJyEUcGUfxUZcAa93yBblcrzardy-zZ1oMQb26Oy-ym83nm_XX_Or7l2_r1VXelVWMgAFrVCEKIUukslfQCl1r2bVS9i32osO6rTRA1bBC01oj6pYqqDX0PVQUL7JPi-1-7kbVy9iZFwPfezMK_4s7YfhDxJod37qfPCaJlDXRYLMYdMY9YvAQiWnyNCCeBhQrfo38GspodHnXiXc_ZhUmPpog1TAIq9wcOGBZFE2NDfsPKqtZhQyT69t_qLdu9jHxxMKqrIqmTSxcWNK7ELzSp_6B8rR0j3T85n52J81xy_AveV3Wiw</recordid><startdate>20130422</startdate><enddate>20130422</enddate><creator>Gillis, 
Jesse</creator><creator>Pavlidis, Paul</creator><general>BioMed Central</general><general>BioMed Central Ltd</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>3V.</scope><scope>7QO</scope><scope>7SC</scope><scope>7X7</scope><scope>7XB</scope><scope>88E</scope><scope>8AL</scope><scope>8AO</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>8FH</scope><scope>8FI</scope><scope>8FJ</scope><scope>8FK</scope><scope>ABUWG</scope><scope>AEUYN</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BBNVY</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>BHPHI</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>FR3</scope><scope>FYUFA</scope><scope>GHDGH</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K7-</scope><scope>K9.</scope><scope>L7M</scope><scope>LK8</scope><scope>L~C</scope><scope>L~D</scope><scope>M0N</scope><scope>M0S</scope><scope>M1P</scope><scope>M7P</scope><scope>P5Z</scope><scope>P62</scope><scope>P64</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>Q9U</scope><scope>7X8</scope><scope>5PM</scope></search><sort><creationdate>20130422</creationdate><title>Characterizing the state of the art in the computational assignment of gene function: lessons from the first critical assessment of functional annotation (CAFA)</title><author>Gillis, Jesse ; Pavlidis, Paul</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-b5605-4148e2a2ac530cde19af7fcb9ccd93dab3796f116842f07f33f90e17f1dd1603</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2013</creationdate><topic>Algorithms</topic><topic>Computational Biology - methods</topic><topic>Databases, Protein</topic><topic>Genes</topic><topic>Molecular Sequence 
Annotation</topic><topic>Proceedings</topic><topic>Proteins - chemistry</topic><topic>Proteins - genetics</topic><topic>Proteins - physiology</topic><topic>Vocabulary, Controlled</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Gillis, Jesse</creatorcontrib><creatorcontrib>Pavlidis, Paul</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>ProQuest Central (Corporate)</collection><collection>Biotechnology Research Abstracts</collection><collection>Computer and Information Systems Abstracts</collection><collection>ProQuest - Health &amp; Medical Complete保健、医学与药学数据库</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Medical Database (Alumni Edition)</collection><collection>Computing Database (Alumni Edition)</collection><collection>ProQuest Pharma Collection</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>Hospital Premium Collection</collection><collection>Hospital Premium Collection (Alumni Edition)</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest One Sustainability</collection><collection>ProQuest Central UK/Ireland</collection><collection>Advanced Technologies &amp; Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>Biological Science Collection</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>ProQuest One 
Community College</collection><collection>ProQuest Central</collection><collection>Engineering Research Database</collection><collection>Health Research Premium Collection</collection><collection>Health Research Premium Collection (Alumni)</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science Collection</collection><collection>Computer Science Database</collection><collection>ProQuest Health &amp; Medical Complete (Alumni)</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Biological Sciences</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Computing Database</collection><collection>Health &amp; Medical Collection (Alumni Edition)</collection><collection>Medical Database</collection><collection>Biological Science Database</collection><collection>Advanced Technologies &amp; Aerospace Database</collection><collection>ProQuest Advanced Technologies &amp; Aerospace Collection</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>ProQuest Central Basic</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>BMC bioinformatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Gillis, Jesse</au><au>Pavlidis, Paul</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Characterizing the state of the art in the 
computational assignment of gene function: lessons from the first critical assessment of functional annotation (CAFA)</atitle><jtitle>BMC bioinformatics</jtitle><addtitle>BMC Bioinformatics</addtitle><date>2013-04-22</date><risdate>2013</risdate><volume>14 Suppl 3</volume><issue>Suppl 3</issue><spage>S15</spage><epage>S15</epage><pages>S15-S15</pages><artnum>S15</artnum><issn>1471-2105</issn><eissn>1471-2105</eissn><abstract>The assignment of gene function remains a difficult but important task in computational biology. The establishment of the first Critical Assessment of Functional Annotation (CAFA) was aimed at increasing progress in the field. We present an independent analysis of the results of CAFA, aimed at identifying challenges in assessment and at understanding trends in prediction performance. We found that well-accepted methods based on sequence similarity (i.e., BLAST) have a dominant effect. Many of the most informative predictions turned out to be either recovering existing knowledge about sequence similarity or were "post-dictions" already documented in the literature. These results indicate that deep challenges remain in even defining the task of function assignment, with a particular difficulty posed by the problem of defining function in a way that is not dependent on either flawed gold standards or the input data itself. In particular, we suggest that using the Gene Ontology (or other similar systematizations of function) as a gold standard is unlikely to be the way forward.</abstract><cop>England</cop><pub>BioMed Central</pub><pmid>23630983</pmid><doi>10.1186/1471-2105-14-s3-s15</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1471-2105
ispartof BMC bioinformatics, 2013-04, Vol.14 Suppl 3 (Suppl 3), p.S15-S15, Article S15
issn 1471-2105
1471-2105
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_3633048
source SpringerOpen; MEDLINE; PubMed Central; Directory of Open Access Journals; EZB Electronic Journals Library; SpringerLink Journals - AutoHoldings; PubMed Central Open Access
subjects Algorithms
Computational Biology - methods
Databases, Protein
Genes
Molecular Sequence Annotation
Proceedings
Proteins - chemistry
Proteins - genetics
Proteins - physiology
Vocabulary, Controlled
title Characterizing the state of the art in the computational assignment of gene function: lessons from the first critical assessment of functional annotation (CAFA)
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-07T16%3A01%3A36IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Characterizing%20the%20state%20of%20the%20art%20in%20the%20computational%20assignment%20of%20gene%20function:%20lessons%20from%20the%20first%20critical%20assessment%20of%20functional%20annotation%20(CAFA)&rft.jtitle=BMC%20bioinformatics&rft.au=Gillis,%20Jesse&rft.date=2013-04-22&rft.volume=14%20Suppl%203&rft.issue=Suppl%203&rft.spage=S15&rft.epage=S15&rft.pages=S15-S15&rft.artnum=S15&rft.issn=1471-2105&rft.eissn=1471-2105&rft_id=info:doi/10.1186/1471-2105-14-s3-s15&rft_dat=%3Cproquest_pubme%3E1352287384%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1336562895&rft_id=info:pmid/23630983&rfr_iscdi=true
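
The url field above is an OpenURL whose query string carries the citation as percent-encoded key/value pairs (for example `rft.jtitle`, `rft.date`, `rft_id`). As a minimal sketch, assuming only Python's standard library, the pairs can be decoded as follows. The helper name `openurl_fields` and the shortened example URL are illustrative and not part of the record.

```python
from urllib.parse import urlsplit, parse_qs


def openurl_fields(url: str) -> dict[str, list[str]]:
    """Decode the query string of an OpenURL into key -> list of values.

    Keys such as rft_id can occur more than once, so every value is a list.
    """
    query = urlsplit(url).query
    return parse_qs(query, keep_blank_values=True)


# Shortened, hypothetical variant of the record's link: only a few of its
# parameters are reproduced here to keep the example readable.
example = (
    "https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004"
    "&rft.jtitle=BMC%20bioinformatics"
    "&rft.date=2013-04-22"
    "&rft_id=info:doi/10.1186/1471-2105-14-s3-s15"
    "&rft_id=info:pmid/23630983"
)

fields = openurl_fields(example)
print(fields["rft.jtitle"])  # journal title, percent-decoded
print(fields["rft_id"])      # both identifiers (DOI and PMID)
```

Returning lists rather than single strings is a deliberate choice here, since the record's link repeats `rft_id` for the DOI, the PMID, and other identifiers.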