Consensus coding sequence (CCDS) database: a standardized set of human and mouse protein-coding regions supported by expert curation

Abstract The Consensus Coding Sequence (CCDS) project provides a dataset of protein-coding regions that are identically annotated on the human and mouse reference genome assembly in genome annotations produced independently by NCBI and the Ensembl group at EMBL-EBI. This dataset is the product of an...


Bibliographic details
Published in: Nucleic acids research 2018-01, Vol.46 (D1), p.D221-D228
Main authors: Pujar, Shashikant, O'Leary, Nuala A, Farrell, Catherine M, Loveland, Jane E, Mudge, Jonathan M, Wallin, Craig, Girón, Carlos G, Diekhans, Mark, Barnes, If, Bennett, Ruth, Berry, Andrew E, Cox, Eric, Davidson, Claire, Goldfarb, Tamara, Gonzalez, Jose M, Hunt, Toby, Jackson, John, Joardar, Vinita, Kay, Mike P, Kodali, Vamsi K, Martin, Fergal J, McAndrews, Monica, McGarvey, Kelly M, Murphy, Michael, Rajput, Bhanu, Rangwala, Sanjida H, Riddick, Lillian D, Seal, Ruth L, Suner, Marie-Marthe, Webb, David, Zhu, Sophia, Aken, Bronwen L, Bruford, Elspeth A, Bult, Carol J, Frankish, Adam, Murphy, Terence, Pruitt, Kim D
Format: Article
Language: eng
container_end_page D228
container_issue D1
container_start_page D221
container_title Nucleic acids research
container_volume 46
creator Pujar, Shashikant
O'Leary, Nuala A
Farrell, Catherine M
Loveland, Jane E
Mudge, Jonathan M
Wallin, Craig
Girón, Carlos G
Diekhans, Mark
Barnes, If
Bennett, Ruth
Berry, Andrew E
Cox, Eric
Davidson, Claire
Goldfarb, Tamara
Gonzalez, Jose M
Hunt, Toby
Jackson, John
Joardar, Vinita
Kay, Mike P
Kodali, Vamsi K
Martin, Fergal J
McAndrews, Monica
McGarvey, Kelly M
Murphy, Michael
Rajput, Bhanu
Rangwala, Sanjida H
Riddick, Lillian D
Seal, Ruth L
Suner, Marie-Marthe
Webb, David
Zhu, Sophia
Aken, Bronwen L
Bruford, Elspeth A
Bult, Carol J
Frankish, Adam
Murphy, Terence
Pruitt, Kim D
description Abstract The Consensus Coding Sequence (CCDS) project provides a dataset of protein-coding regions that are identically annotated on the human and mouse reference genome assembly in genome annotations produced independently by NCBI and the Ensembl group at EMBL-EBI. This dataset is the product of an international collaboration that includes NCBI, Ensembl, HUGO Gene Nomenclature Committee, Mouse Genome Informatics and University of California, Santa Cruz. Identically annotated coding regions, which are generated using an automated pipeline and pass multiple quality assurance checks, are assigned a stable and tracked identifier (CCDS ID). Additionally, coordinated manual review by expert curators from the CCDS collaboration helps in maintaining the integrity and high quality of the dataset. The CCDS data are available through an interactive web page (https://www.ncbi.nlm.nih.gov/CCDS/CcdsBrowse.cgi) and an FTP site (ftp://ftp.ncbi.nlm.nih.gov/pub/CCDS/). In this paper, we outline the ongoing work, growth and stability of the CCDS dataset and provide updates on new collaboration members and new features added to the CCDS user interface. We also present expert curation scenarios, with specific examples highlighting the importance of an accurate reference genome assembly and the crucial role played by input from the research community.
doi_str_mv 10.1093/nar/gkx1031
format Article
fullrecord <record><control><sourceid>proquest_TOX</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_5753299</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><oup_id>10.1093/nar/gkx1031</oup_id><sourcerecordid>1963271996</sourcerecordid><originalsourceid>FETCH-LOGICAL-c454t-1a296d3b3ccd3d665569236450d4d19d2ab0203eaaa51618b2d01e23f8b898533</originalsourceid><addsrcrecordid>eNp9kT1vFDEQhi0EIsdBRY9coSC0xN9ZUyChhUCkSBRAbc2u5y4Lt_Zie1FCnR-O0V0iaKhG8jx-ZuyXkKecveLMypMA6WT7_Yozye-RFZdGNMoacZ-smGS64Uy1R-RRzt8Y44pr9ZAcCcuF4apdkZsuhowhL5kO0Y9hSzP-WDAMSI-77t3nF9RDgR4yvqZAc4HgIfnxF_oKFho39HKZINB6Tqe4ZKRzigXH0Bx0CbdjHUHzMs8xlXqvv6Z4NWMqdFgSlNp9TB5sYJfxyaGuydez91-6j83Fpw_n3duLZlBalYaDsMbLXg6Dl94YrY0V0ijNvPLcegE9E0wiAGhueNsLzzgKuWn71rZayjV5s_fOSz-hHzCUBDs3p3GCdO0ijO7fThgv3Tb-dPpUS2FtFRwfBCnWX8rFTWMecLeDgPXxjlsjxSm3tazJyz06pJhzws3dGM7cn9xczc0dcqv0s783u2Nvg6rA8z0Ql_m_pt-8PaQH</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1963271996</pqid></control><display><type>article</type><title>Consensus coding sequence (CCDS) database: a standardized set of human and mouse protein-coding regions supported by expert curation</title><source>Oxford Journals Open Access Collection</source><creator>Pujar, Shashikant ; O'Leary, Nuala A ; Farrell, Catherine M ; Loveland, Jane E ; Mudge, Jonathan M ; Wallin, Craig ; Girón, Carlos G ; Diekhans, Mark ; Barnes, If ; Bennett, Ruth ; Berry, Andrew E ; Cox, Eric ; Davidson, Claire ; Goldfarb, Tamara ; Gonzalez, Jose M ; Hunt, Toby ; Jackson, John ; Joardar, Vinita ; Kay, Mike P ; Kodali, Vamsi K ; Martin, Fergal J ; McAndrews, Monica ; McGarvey, Kelly M ; Murphy, Michael ; Rajput, Bhanu ; Rangwala, Sanjida H ; Riddick, Lillian D ; Seal, Ruth L ; Suner, Marie-Marthe ; Webb, David ; Zhu, Sophia ; Aken, Bronwen L ; Bruford, Elspeth A ; Bult, Carol J ; Frankish, Adam ; Murphy, Terence ; Pruitt, Kim D</creator><creatorcontrib>Pujar, 
Shashikant ; O'Leary, Nuala A ; Farrell, Catherine M ; Loveland, Jane E ; Mudge, Jonathan M ; Wallin, Craig ; Girón, Carlos G ; Diekhans, Mark ; Barnes, If ; Bennett, Ruth ; Berry, Andrew E ; Cox, Eric ; Davidson, Claire ; Goldfarb, Tamara ; Gonzalez, Jose M ; Hunt, Toby ; Jackson, John ; Joardar, Vinita ; Kay, Mike P ; Kodali, Vamsi K ; Martin, Fergal J ; McAndrews, Monica ; McGarvey, Kelly M ; Murphy, Michael ; Rajput, Bhanu ; Rangwala, Sanjida H ; Riddick, Lillian D ; Seal, Ruth L ; Suner, Marie-Marthe ; Webb, David ; Zhu, Sophia ; Aken, Bronwen L ; Bruford, Elspeth A ; Bult, Carol J ; Frankish, Adam ; Murphy, Terence ; Pruitt, Kim D</creatorcontrib><description>Abstract The Consensus Coding Sequence (CCDS) project provides a dataset of protein-coding regions that are identically annotated on the human and mouse reference genome assembly in genome annotations produced independently by NCBI and the Ensembl group at EMBL-EBI. This dataset is the product of an international collaboration that includes NCBI, Ensembl, HUGO Gene Nomenclature Committee, Mouse Genome Informatics and University of California, Santa Cruz. Identically annotated coding regions, which are generated using an automated pipeline and pass multiple quality assurance checks, are assigned a stable and tracked identifier (CCDS ID). Additionally, coordinated manual review by expert curators from the CCDS collaboration helps in maintaining the integrity and high quality of the dataset. The CCDS data are available through an interactive web page (https://www.ncbi.nlm.nih.gov/CCDS/CcdsBrowse.cgi) and an FTP site (ftp://ftp.ncbi.nlm.nih.gov/pub/CCDS/). In this paper, we outline the ongoing work, growth and stability of the CCDS dataset and provide updates on new collaboration members and new features added to the CCDS user interface. 
We also present expert curation scenarios, with specific examples highlighting the importance of an accurate reference genome assembly and the crucial role played by input from the research community.</description><identifier>ISSN: 0305-1048</identifier><identifier>EISSN: 1362-4962</identifier><identifier>DOI: 10.1093/nar/gkx1031</identifier><identifier>PMID: 29126148</identifier><language>eng</language><publisher>England: Oxford University Press</publisher><subject>Animals ; Consensus Sequence ; Data Curation - methods ; Data Curation - standards ; Database Issue ; Databases, Genetic - standards ; Guidelines as Topic ; Humans ; Mice ; Molecular Sequence Annotation ; National Library of Medicine (U.S.) ; Open Reading Frames ; United States ; User-Computer Interface</subject><ispartof>Nucleic acids research, 2018-01, Vol.46 (D1), p.D221-D228</ispartof><rights>Published by Oxford University Press on behalf of Nucleic Acids Research 2017. 2018</rights><rights>Published by Oxford University Press on behalf of Nucleic Acids Research 2017.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c454t-1a296d3b3ccd3d665569236450d4d19d2ab0203eaaa51618b2d01e23f8b898533</citedby><cites>FETCH-LOGICAL-c454t-1a296d3b3ccd3d665569236450d4d19d2ab0203eaaa51618b2d01e23f8b898533</cites><orcidid>0000-0002-8380-5247 ; 0000-0002-1672-050X ; 0000-0002-0380-7171 ; 0000-0002-7669-2934 ; 
0000-0002-0935-7271</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC5753299/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC5753299/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,723,776,780,860,881,1598,27901,27902,53766,53768</link.rule.ids><linktorsrc>$$Uhttps://dx.doi.org/10.1093/nar/gkx1031$$EView_record_in_Oxford_University_Press$$FView_record_in_$$GOxford_University_Press</linktorsrc><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/29126148$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Pujar, Shashikant</creatorcontrib><creatorcontrib>O'Leary, Nuala A</creatorcontrib><creatorcontrib>Farrell, Catherine M</creatorcontrib><creatorcontrib>Loveland, Jane E</creatorcontrib><creatorcontrib>Mudge, Jonathan M</creatorcontrib><creatorcontrib>Wallin, Craig</creatorcontrib><creatorcontrib>Girón, Carlos G</creatorcontrib><creatorcontrib>Diekhans, Mark</creatorcontrib><creatorcontrib>Barnes, If</creatorcontrib><creatorcontrib>Bennett, Ruth</creatorcontrib><creatorcontrib>Berry, Andrew E</creatorcontrib><creatorcontrib>Cox, Eric</creatorcontrib><creatorcontrib>Davidson, Claire</creatorcontrib><creatorcontrib>Goldfarb, Tamara</creatorcontrib><creatorcontrib>Gonzalez, Jose M</creatorcontrib><creatorcontrib>Hunt, Toby</creatorcontrib><creatorcontrib>Jackson, John</creatorcontrib><creatorcontrib>Joardar, Vinita</creatorcontrib><creatorcontrib>Kay, Mike P</creatorcontrib><creatorcontrib>Kodali, Vamsi K</creatorcontrib><creatorcontrib>Martin, Fergal J</creatorcontrib><creatorcontrib>McAndrews, Monica</creatorcontrib><creatorcontrib>McGarvey, Kelly M</creatorcontrib><creatorcontrib>Murphy, Michael</creatorcontrib><creatorcontrib>Rajput, 
Bhanu</creatorcontrib><creatorcontrib>Rangwala, Sanjida H</creatorcontrib><creatorcontrib>Riddick, Lillian D</creatorcontrib><creatorcontrib>Seal, Ruth L</creatorcontrib><creatorcontrib>Suner, Marie-Marthe</creatorcontrib><creatorcontrib>Webb, David</creatorcontrib><creatorcontrib>Zhu, Sophia</creatorcontrib><creatorcontrib>Aken, Bronwen L</creatorcontrib><creatorcontrib>Bruford, Elspeth A</creatorcontrib><creatorcontrib>Bult, Carol J</creatorcontrib><creatorcontrib>Frankish, Adam</creatorcontrib><creatorcontrib>Murphy, Terence</creatorcontrib><creatorcontrib>Pruitt, Kim D</creatorcontrib><title>Consensus coding sequence (CCDS) database: a standardized set of human and mouse protein-coding regions supported by expert curation</title><title>Nucleic acids research</title><addtitle>Nucleic Acids Res</addtitle><description>Abstract The Consensus Coding Sequence (CCDS) project provides a dataset of protein-coding regions that are identically annotated on the human and mouse reference genome assembly in genome annotations produced independently by NCBI and the Ensembl group at EMBL-EBI. This dataset is the product of an international collaboration that includes NCBI, Ensembl, HUGO Gene Nomenclature Committee, Mouse Genome Informatics and University of California, Santa Cruz. Identically annotated coding regions, which are generated using an automated pipeline and pass multiple quality assurance checks, are assigned a stable and tracked identifier (CCDS ID). Additionally, coordinated manual review by expert curators from the CCDS collaboration helps in maintaining the integrity and high quality of the dataset. The CCDS data are available through an interactive web page (https://www.ncbi.nlm.nih.gov/CCDS/CcdsBrowse.cgi) and an FTP site (ftp://ftp.ncbi.nlm.nih.gov/pub/CCDS/). In this paper, we outline the ongoing work, growth and stability of the CCDS dataset and provide updates on new collaboration members and new features added to the CCDS user interface. 
We also present expert curation scenarios, with specific examples highlighting the importance of an accurate reference genome assembly and the crucial role played by input from the research community.</description><subject>Animals</subject><subject>Consensus Sequence</subject><subject>Data Curation - methods</subject><subject>Data Curation - standards</subject><subject>Database Issue</subject><subject>Databases, Genetic - standards</subject><subject>Guidelines as Topic</subject><subject>Humans</subject><subject>Mice</subject><subject>Molecular Sequence Annotation</subject><subject>National Library of Medicine (U.S.)</subject><subject>Open Reading Frames</subject><subject>United States</subject><subject>User-Computer Interface</subject><issn>0305-1048</issn><issn>1362-4962</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2018</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNp9kT1vFDEQhi0EIsdBRY9coSC0xN9ZUyChhUCkSBRAbc2u5y4Lt_Zie1FCnR-O0V0iaKhG8jx-ZuyXkKecveLMypMA6WT7_Yozye-RFZdGNMoacZ-smGS64Uy1R-RRzt8Y44pr9ZAcCcuF4apdkZsuhowhL5kO0Y9hSzP-WDAMSI-77t3nF9RDgR4yvqZAc4HgIfnxF_oKFho39HKZINB6Tqe4ZKRzigXH0Bx0CbdjHUHzMs8xlXqvv6Z4NWMqdFgSlNp9TB5sYJfxyaGuydez91-6j83Fpw_n3duLZlBalYaDsMbLXg6Dl94YrY0V0ijNvPLcegE9E0wiAGhueNsLzzgKuWn71rZayjV5s_fOSz-hHzCUBDs3p3GCdO0ijO7fThgv3Tb-dPpUS2FtFRwfBCnWX8rFTWMecLeDgPXxjlsjxSm3tazJyz06pJhzws3dGM7cn9xczc0dcqv0s783u2Nvg6rA8z0Ql_m_pt-8PaQH</recordid><startdate>20180104</startdate><enddate>20180104</enddate><creator>Pujar, Shashikant</creator><creator>O'Leary, Nuala A</creator><creator>Farrell, Catherine M</creator><creator>Loveland, Jane E</creator><creator>Mudge, Jonathan M</creator><creator>Wallin, Craig</creator><creator>Girón, Carlos G</creator><creator>Diekhans, Mark</creator><creator>Barnes, If</creator><creator>Bennett, Ruth</creator><creator>Berry, Andrew E</creator><creator>Cox, Eric</creator><creator>Davidson, Claire</creator><creator>Goldfarb, Tamara</creator><creator>Gonzalez, 
Jose M</creator><creator>Hunt, Toby</creator><creator>Jackson, John</creator><creator>Joardar, Vinita</creator><creator>Kay, Mike P</creator><creator>Kodali, Vamsi K</creator><creator>Martin, Fergal J</creator><creator>McAndrews, Monica</creator><creator>McGarvey, Kelly M</creator><creator>Murphy, Michael</creator><creator>Rajput, Bhanu</creator><creator>Rangwala, Sanjida H</creator><creator>Riddick, Lillian D</creator><creator>Seal, Ruth L</creator><creator>Suner, Marie-Marthe</creator><creator>Webb, David</creator><creator>Zhu, Sophia</creator><creator>Aken, Bronwen L</creator><creator>Bruford, Elspeth A</creator><creator>Bult, Carol J</creator><creator>Frankish, Adam</creator><creator>Murphy, Terence</creator><creator>Pruitt, Kim D</creator><general>Oxford University Press</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><scope>5PM</scope><orcidid>https://orcid.org/0000-0002-8380-5247</orcidid><orcidid>https://orcid.org/0000-0002-1672-050X</orcidid><orcidid>https://orcid.org/0000-0002-0380-7171</orcidid><orcidid>https://orcid.org/0000-0002-7669-2934</orcidid><orcidid>https://orcid.org/0000-0002-0935-7271</orcidid></search><sort><creationdate>20180104</creationdate><title>Consensus coding sequence (CCDS) database: a standardized set of human and mouse protein-coding regions supported by expert curation</title><author>Pujar, Shashikant ; O'Leary, Nuala A ; Farrell, Catherine M ; Loveland, Jane E ; Mudge, Jonathan M ; Wallin, Craig ; Girón, Carlos G ; Diekhans, Mark ; Barnes, If ; Bennett, Ruth ; Berry, Andrew E ; Cox, Eric ; Davidson, Claire ; Goldfarb, Tamara ; Gonzalez, Jose M ; Hunt, Toby ; Jackson, John ; Joardar, Vinita ; Kay, Mike P ; Kodali, Vamsi K ; Martin, Fergal J ; McAndrews, Monica ; McGarvey, Kelly M ; Murphy, Michael ; Rajput, Bhanu ; Rangwala, Sanjida H ; Riddick, Lillian D ; Seal, Ruth L ; Suner, Marie-Marthe ; Webb, 
David ; Zhu, Sophia ; Aken, Bronwen L ; Bruford, Elspeth A ; Bult, Carol J ; Frankish, Adam ; Murphy, Terence ; Pruitt, Kim D</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c454t-1a296d3b3ccd3d665569236450d4d19d2ab0203eaaa51618b2d01e23f8b898533</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2018</creationdate><topic>Animals</topic><topic>Consensus Sequence</topic><topic>Data Curation - methods</topic><topic>Data Curation - standards</topic><topic>Database Issue</topic><topic>Databases, Genetic - standards</topic><topic>Guidelines as Topic</topic><topic>Humans</topic><topic>Mice</topic><topic>Molecular Sequence Annotation</topic><topic>National Library of Medicine (U.S.)</topic><topic>Open Reading Frames</topic><topic>United States</topic><topic>User-Computer Interface</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Pujar, Shashikant</creatorcontrib><creatorcontrib>O'Leary, Nuala A</creatorcontrib><creatorcontrib>Farrell, Catherine M</creatorcontrib><creatorcontrib>Loveland, Jane E</creatorcontrib><creatorcontrib>Mudge, Jonathan M</creatorcontrib><creatorcontrib>Wallin, Craig</creatorcontrib><creatorcontrib>Girón, Carlos G</creatorcontrib><creatorcontrib>Diekhans, Mark</creatorcontrib><creatorcontrib>Barnes, If</creatorcontrib><creatorcontrib>Bennett, Ruth</creatorcontrib><creatorcontrib>Berry, Andrew E</creatorcontrib><creatorcontrib>Cox, Eric</creatorcontrib><creatorcontrib>Davidson, Claire</creatorcontrib><creatorcontrib>Goldfarb, Tamara</creatorcontrib><creatorcontrib>Gonzalez, Jose M</creatorcontrib><creatorcontrib>Hunt, Toby</creatorcontrib><creatorcontrib>Jackson, John</creatorcontrib><creatorcontrib>Joardar, Vinita</creatorcontrib><creatorcontrib>Kay, Mike P</creatorcontrib><creatorcontrib>Kodali, Vamsi K</creatorcontrib><creatorcontrib>Martin, Fergal J</creatorcontrib><creatorcontrib>McAndrews, 
Monica</creatorcontrib><creatorcontrib>McGarvey, Kelly M</creatorcontrib><creatorcontrib>Murphy, Michael</creatorcontrib><creatorcontrib>Rajput, Bhanu</creatorcontrib><creatorcontrib>Rangwala, Sanjida H</creatorcontrib><creatorcontrib>Riddick, Lillian D</creatorcontrib><creatorcontrib>Seal, Ruth L</creatorcontrib><creatorcontrib>Suner, Marie-Marthe</creatorcontrib><creatorcontrib>Webb, David</creatorcontrib><creatorcontrib>Zhu, Sophia</creatorcontrib><creatorcontrib>Aken, Bronwen L</creatorcontrib><creatorcontrib>Bruford, Elspeth A</creatorcontrib><creatorcontrib>Bult, Carol J</creatorcontrib><creatorcontrib>Frankish, Adam</creatorcontrib><creatorcontrib>Murphy, Terence</creatorcontrib><creatorcontrib>Pruitt, Kim D</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Nucleic acids research</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Pujar, Shashikant</au><au>O'Leary, Nuala A</au><au>Farrell, Catherine M</au><au>Loveland, Jane E</au><au>Mudge, Jonathan M</au><au>Wallin, Craig</au><au>Girón, Carlos G</au><au>Diekhans, Mark</au><au>Barnes, If</au><au>Bennett, Ruth</au><au>Berry, Andrew E</au><au>Cox, Eric</au><au>Davidson, Claire</au><au>Goldfarb, Tamara</au><au>Gonzalez, Jose M</au><au>Hunt, Toby</au><au>Jackson, John</au><au>Joardar, Vinita</au><au>Kay, Mike P</au><au>Kodali, Vamsi K</au><au>Martin, Fergal J</au><au>McAndrews, Monica</au><au>McGarvey, Kelly M</au><au>Murphy, Michael</au><au>Rajput, Bhanu</au><au>Rangwala, Sanjida H</au><au>Riddick, Lillian D</au><au>Seal, Ruth L</au><au>Suner, Marie-Marthe</au><au>Webb, David</au><au>Zhu, Sophia</au><au>Aken, Bronwen 
L</au><au>Bruford, Elspeth A</au><au>Bult, Carol J</au><au>Frankish, Adam</au><au>Murphy, Terence</au><au>Pruitt, Kim D</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Consensus coding sequence (CCDS) database: a standardized set of human and mouse protein-coding regions supported by expert curation</atitle><jtitle>Nucleic acids research</jtitle><addtitle>Nucleic Acids Res</addtitle><date>2018-01-04</date><risdate>2018</risdate><volume>46</volume><issue>D1</issue><spage>D221</spage><epage>D228</epage><pages>D221-D228</pages><issn>0305-1048</issn><eissn>1362-4962</eissn><abstract>Abstract The Consensus Coding Sequence (CCDS) project provides a dataset of protein-coding regions that are identically annotated on the human and mouse reference genome assembly in genome annotations produced independently by NCBI and the Ensembl group at EMBL-EBI. This dataset is the product of an international collaboration that includes NCBI, Ensembl, HUGO Gene Nomenclature Committee, Mouse Genome Informatics and University of California, Santa Cruz. Identically annotated coding regions, which are generated using an automated pipeline and pass multiple quality assurance checks, are assigned a stable and tracked identifier (CCDS ID). Additionally, coordinated manual review by expert curators from the CCDS collaboration helps in maintaining the integrity and high quality of the dataset. The CCDS data are available through an interactive web page (https://www.ncbi.nlm.nih.gov/CCDS/CcdsBrowse.cgi) and an FTP site (ftp://ftp.ncbi.nlm.nih.gov/pub/CCDS/). In this paper, we outline the ongoing work, growth and stability of the CCDS dataset and provide updates on new collaboration members and new features added to the CCDS user interface. 
We also present expert curation scenarios, with specific examples highlighting the importance of an accurate reference genome assembly and the crucial role played by input from the research community.</abstract><cop>England</cop><pub>Oxford University Press</pub><pmid>29126148</pmid><doi>10.1093/nar/gkx1031</doi><orcidid>https://orcid.org/0000-0002-8380-5247</orcidid><orcidid>https://orcid.org/0000-0002-1672-050X</orcidid><orcidid>https://orcid.org/0000-0002-0380-7171</orcidid><orcidid>https://orcid.org/0000-0002-7669-2934</orcidid><orcidid>https://orcid.org/0000-0002-0935-7271</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 0305-1048
ispartof Nucleic acids research, 2018-01, Vol.46 (D1), p.D221-D228
issn 0305-1048
1362-4962
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_5753299
source Oxford Journals Open Access Collection
subjects Animals
Consensus Sequence
Data Curation - methods
Data Curation - standards
Database Issue
Databases, Genetic - standards
Guidelines as Topic
Humans
Mice
Molecular Sequence Annotation
National Library of Medicine (U.S.)
Open Reading Frames
United States
User-Computer Interface
title Consensus coding sequence (CCDS) database: a standardized set of human and mouse protein-coding regions supported by expert curation
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-08T03%3A17%3A02IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_TOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Consensus%20coding%20sequence%20(CCDS)%20database:%20a%20standardized%20set%20of%20human%20and%20mouse%20protein-coding%20regions%20supported%20by%20expert%20curation&rft.jtitle=Nucleic%20acids%20research&rft.au=Pujar,%20Shashikant&rft.date=2018-01-04&rft.volume=46&rft.issue=D1&rft.spage=D221&rft.epage=D228&rft.pages=D221-D228&rft.issn=0305-1048&rft.eissn=1362-4962&rft_id=info:doi/10.1093/nar/gkx1031&rft_dat=%3Cproquest_TOX%3E1963271996%3C/proquest_TOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1963271996&rft_id=info:pmid/29126148&rft_oup_id=10.1093/nar/gkx1031&rfr_iscdi=true