Assessing the gene space in draft genomes

Genome sequencing projects have been initiated for a wide range of eukaryotes. A few projects have reached completion, but most exist as draft assemblies. As one of the main reasons to sequence a genome is to obtain its catalog of genes, an important question is how complete or completable the catal...

Bibliographic details
Published in: Nucleic acids research 2009-01, Vol.37 (1), p.289-297
Main authors: Parra, Genis; Bradnam, Keith; Ning, Zemin; Keane, Thomas; Korf, Ian
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 297
container_issue 1
container_start_page 289
container_title Nucleic acids research
container_volume 37
creator Parra, Genis
Bradnam, Keith
Ning, Zemin
Keane, Thomas
Korf, Ian
description Genome sequencing projects have been initiated for a wide range of eukaryotes. A few projects have reached completion, but most exist as draft assemblies. As one of the main reasons to sequence a genome is to obtain its catalog of genes, an important question is how complete or completable the catalog is in unfinished genomes. To answer this question, we have identified a set of core eukaryotic genes (CEGs), that are extremely highly conserved and which we believe are present in low copy numbers in higher eukaryotes. From an analysis of a phylogenetically diverse set of eukaryotic genome assemblies, we found that the proportion of CEGs mapped in draft genomes provides a useful metric for describing the gene space, and complements the commonly used N50 length and x-fold coverage values.
doi_str_mv 10.1093/nar/gkn916
format Article
fullrecord <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_2615622</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><oup_id>10.1093/nar/gkn916</oup_id><sourcerecordid>1627955031</sourcerecordid><originalsourceid>FETCH-LOGICAL-c529t-8cf41de8c82c885b1e49e9af8067555a2f8af6fd29919c61e6d3784639f756933</originalsourceid><addsrcrecordid>eNqFkV9rVDEQxYModlt98QPIRbBQ4dr8v8mLUIp2CwURVii-hDR3spt2N3eb5Ip-e1PustWX-jQw8-PMzDkIvSH4I8GanUabTpd3URP5DM0Ik7TlWtLnaIYZFi3BXB2gw5xvMSacCP4SHRCNOdUdn6GTs5wh5xCXTVlBs4QITd5aB02ITZ-sLw-9YQP5FXrh7TrD6109Qt-_fF6cz9urrxeX52dXrRNUl1Y5z0kPyinqlBI3BLgGbb3CshNCWOqV9dL3VGuinSQge9YpLpn2nZCasSP0adLdjjcb6B3EkuzabFPY2PTbDDaYfycxrMxy-GmoJEJSWgXeTQJDLsFkFwq4lRtiBFcMkVRwISt0vNuShvsRcjGbkB2s1zbCMGYjZT24-vVfkGJGBeb8ce0evB3GFKtVlcFCqep9hT5MkEtDzgn8_i-CzUOYpoZppjAr_PZvJx7RXXoVeL_7dNw-LdROXMgFfu1Jm-6M7FgnzPz6h_lGFmoxv6ZGsj-uNLTU</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>200588415</pqid></control><display><type>article</type><title>Assessing the gene space in draft genomes</title><source>Oxford Journals Open Access Collection</source><source>MEDLINE</source><source>DOAJ Directory of Open Access Journals</source><source>PubMed Central</source><source>Free Full-Text Journals in Chemistry</source><creator>Parra, Genis ; Bradnam, Keith ; Ning, Zemin ; Keane, Thomas ; Korf, Ian</creator><creatorcontrib>Parra, Genis ; Bradnam, Keith ; Ning, Zemin ; Keane, Thomas ; Korf, Ian ; Univ. of California, Davis, CA (United States)</creatorcontrib><description>Genome sequencing projects have been initiated for a wide range of eukaryotes. A few projects have reached completion, but most exist as draft assemblies. As one of the main reasons to sequence a genome is to obtain its catalog of genes, an important question is how complete or completable the catalog is in unfinished genomes. 
To answer this question, we have identified a set of core eukaryotic genes (CEGs), that are extremely highly conserved and which we believe are present in low copy numbers in higher eukaryotes. From an analysis of a phylogenetically diverse set of eukaryotic genome assemblies, we found that the proportion of CEGs mapped in draft genomes provides a useful metric for describing the gene space, and complements the commonly used N50 length and x-fold coverage values.</description><identifier>ISSN: 0305-1048</identifier><identifier>EISSN: 1362-4962</identifier><identifier>DOI: 10.1093/nar/gkn916</identifier><identifier>PMID: 19042974</identifier><identifier>CODEN: NARHAD</identifier><language>eng</language><publisher>England: Oxford University Press</publisher><subject>Animals ; BASIC BIOLOGICAL SCIENCES ; Biochemistry &amp; Molecular Biology ; Chromosome Mapping ; Genes ; Genomics ; Humans ; Proteins - genetics</subject><ispartof>Nucleic acids research, 2009-01, Vol.37 (1), p.289-297</ispartof><rights>2008 The Author(s) 2008</rights><rights>2008 The Author(s)</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c529t-8cf41de8c82c885b1e49e9af8067555a2f8af6fd29919c61e6d3784639f756933</citedby><cites>FETCH-LOGICAL-c529t-8cf41de8c82c885b1e49e9af8067555a2f8af6fd29919c61e6d3784639f756933</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC2615622/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC2615622/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,723,776,780,860,881,1598,27901,27902,53766,53768</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/19042974$$D View this record in 
MEDLINE/PubMed$$Hfree_for_read</backlink><backlink>$$Uhttps://www.osti.gov/servlets/purl/1625456$$D View this record in Osti.gov$$Hfree_for_read</backlink></links><search><creatorcontrib>Parra, Genis</creatorcontrib><creatorcontrib>Bradnam, Keith</creatorcontrib><creatorcontrib>Ning, Zemin</creatorcontrib><creatorcontrib>Keane, Thomas</creatorcontrib><creatorcontrib>Korf, Ian</creatorcontrib><creatorcontrib>Univ. of California, Davis, CA (United States)</creatorcontrib><title>Assessing the gene space in draft genomes</title><title>Nucleic acids research</title><addtitle>Nucleic Acids Res</addtitle><description>Genome sequencing projects have been initiated for a wide range of eukaryotes. A few projects have reached completion, but most exist as draft assemblies. As one of the main reasons to sequence a genome is to obtain its catalog of genes, an important question is how complete or completable the catalog is in unfinished genomes. To answer this question, we have identified a set of core eukaryotic genes (CEGs), that are extremely highly conserved and which we believe are present in low copy numbers in higher eukaryotes. 
From an analysis of a phylogenetically diverse set of eukaryotic genome assemblies, we found that the proportion of CEGs mapped in draft genomes provides a useful metric for describing the gene space, and complements the commonly used N50 length and x-fold coverage values.</description><subject>Animals</subject><subject>BASIC BIOLOGICAL SCIENCES</subject><subject>Biochemistry &amp; Molecular Biology</subject><subject>Chromosome Mapping</subject><subject>Genes</subject><subject>Genomics</subject><subject>Humans</subject><subject>Proteins - genetics</subject><issn>0305-1048</issn><issn>1362-4962</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2009</creationdate><recordtype>article</recordtype><sourceid>TOX</sourceid><sourceid>EIF</sourceid><recordid>eNqFkV9rVDEQxYModlt98QPIRbBQ4dr8v8mLUIp2CwURVii-hDR3spt2N3eb5Ip-e1PustWX-jQw8-PMzDkIvSH4I8GanUabTpd3URP5DM0Ik7TlWtLnaIYZFi3BXB2gw5xvMSacCP4SHRCNOdUdn6GTs5wh5xCXTVlBs4QITd5aB02ITZ-sLw-9YQP5FXrh7TrD6109Qt-_fF6cz9urrxeX52dXrRNUl1Y5z0kPyinqlBI3BLgGbb3CshNCWOqV9dL3VGuinSQge9YpLpn2nZCasSP0adLdjjcb6B3EkuzabFPY2PTbDDaYfycxrMxy-GmoJEJSWgXeTQJDLsFkFwq4lRtiBFcMkVRwISt0vNuShvsRcjGbkB2s1zbCMGYjZT24-vVfkGJGBeb8ce0evB3GFKtVlcFCqep9hT5MkEtDzgn8_i-CzUOYpoZppjAr_PZvJx7RXXoVeL_7dNw-LdROXMgFfu1Jm-6M7FgnzPz6h_lGFmoxv6ZGsj-uNLTU</recordid><startdate>20090101</startdate><enddate>20090101</enddate><creator>Parra, Genis</creator><creator>Bradnam, Keith</creator><creator>Ning, Zemin</creator><creator>Keane, Thomas</creator><creator>Korf, Ian</creator><general>Oxford University Press</general><general>Oxford Publishing Limited 
(England)</general><scope>BSCLL</scope><scope>TOX</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7QL</scope><scope>7QO</scope><scope>7QP</scope><scope>7QR</scope><scope>7SS</scope><scope>7TK</scope><scope>7TM</scope><scope>7U9</scope><scope>8FD</scope><scope>C1K</scope><scope>FR3</scope><scope>H94</scope><scope>K9.</scope><scope>M7N</scope><scope>P64</scope><scope>RC3</scope><scope>7X8</scope><scope>OIOZB</scope><scope>OTOTI</scope><scope>5PM</scope></search><sort><creationdate>20090101</creationdate><title>Assessing the gene space in draft genomes</title><author>Parra, Genis ; Bradnam, Keith ; Ning, Zemin ; Keane, Thomas ; Korf, Ian</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c529t-8cf41de8c82c885b1e49e9af8067555a2f8af6fd29919c61e6d3784639f756933</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2009</creationdate><topic>Animals</topic><topic>BASIC BIOLOGICAL SCIENCES</topic><topic>Biochemistry &amp; Molecular Biology</topic><topic>Chromosome Mapping</topic><topic>Genes</topic><topic>Genomics</topic><topic>Humans</topic><topic>Proteins - genetics</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Parra, Genis</creatorcontrib><creatorcontrib>Bradnam, Keith</creatorcontrib><creatorcontrib>Ning, Zemin</creatorcontrib><creatorcontrib>Keane, Thomas</creatorcontrib><creatorcontrib>Korf, Ian</creatorcontrib><creatorcontrib>Univ. 
of California, Davis, CA (United States)</creatorcontrib><collection>Istex</collection><collection>Oxford Journals Open Access Collection</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Bacteriology Abstracts (Microbiology B)</collection><collection>Biotechnology Research Abstracts</collection><collection>Calcium &amp; Calcified Tissue Abstracts</collection><collection>Chemoreception Abstracts</collection><collection>Entomology Abstracts (Full archive)</collection><collection>Neurosciences Abstracts</collection><collection>Nucleic Acids Abstracts</collection><collection>Virology and AIDS Abstracts</collection><collection>Technology Research Database</collection><collection>Environmental Sciences and Pollution Management</collection><collection>Engineering Research Database</collection><collection>AIDS and Cancer Research Abstracts</collection><collection>ProQuest Health &amp; Medical Complete (Alumni)</collection><collection>Algology Mycology and Protozoology Abstracts (Microbiology C)</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Genetics Abstracts</collection><collection>MEDLINE - Academic</collection><collection>OSTI.GOV - Hybrid</collection><collection>OSTI.GOV</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Nucleic acids research</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Parra, Genis</au><au>Bradnam, Keith</au><au>Ning, Zemin</au><au>Keane, Thomas</au><au>Korf, Ian</au><aucorp>Univ. 
of California, Davis, CA (United States)</aucorp><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Assessing the gene space in draft genomes</atitle><jtitle>Nucleic acids research</jtitle><addtitle>Nucleic Acids Res</addtitle><date>2009-01-01</date><risdate>2009</risdate><volume>37</volume><issue>1</issue><spage>289</spage><epage>297</epage><pages>289-297</pages><issn>0305-1048</issn><eissn>1362-4962</eissn><coden>NARHAD</coden><abstract>Genome sequencing projects have been initiated for a wide range of eukaryotes. A few projects have reached completion, but most exist as draft assemblies. As one of the main reasons to sequence a genome is to obtain its catalog of genes, an important question is how complete or completable the catalog is in unfinished genomes. To answer this question, we have identified a set of core eukaryotic genes (CEGs), that are extremely highly conserved and which we believe are present in low copy numbers in higher eukaryotes. From an analysis of a phylogenetically diverse set of eukaryotic genome assemblies, we found that the proportion of CEGs mapped in draft genomes provides a useful metric for describing the gene space, and complements the commonly used N50 length and x-fold coverage values.</abstract><cop>England</cop><pub>Oxford University Press</pub><pmid>19042974</pmid><doi>10.1093/nar/gkn916</doi><tpages>9</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0305-1048
ispartof Nucleic acids research, 2009-01, Vol.37 (1), p.289-297
issn 0305-1048
1362-4962
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_2615622
source Oxford Journals Open Access Collection; MEDLINE; DOAJ Directory of Open Access Journals; PubMed Central; Free Full-Text Journals in Chemistry
subjects Animals
BASIC BIOLOGICAL SCIENCES
Biochemistry & Molecular Biology
Chromosome Mapping
Genes
Genomics
Humans
Proteins - genetics
title Assessing the gene space in draft genomes
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-02T19%3A44%3A05IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Assessing%20the%20gene%20space%20in%20draft%20genomes&rft.jtitle=Nucleic%20acids%20research&rft.au=Parra,%20Genis&rft.aucorp=Univ.%20of%20California,%20Davis,%20CA%20(United%20States)&rft.date=2009-01-01&rft.volume=37&rft.issue=1&rft.spage=289&rft.epage=297&rft.pages=289-297&rft.issn=0305-1048&rft.eissn=1362-4962&rft.coden=NARHAD&rft_id=info:doi/10.1093/nar/gkn916&rft_dat=%3Cproquest_pubme%3E1627955031%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=200588415&rft_id=info:pmid/19042974&rft_oup_id=10.1093/nar/gkn916&rfr_iscdi=true
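
The abstract in this record names two assembly metrics: the N50 length and the proportion of CEGs mapped in a draft genome. As a rough illustration only, the following Python sketch computes both from already-available inputs. The function names `n50` and `ceg_fraction`, the example identifiers, and the input shapes are hypothetical and are not taken from the article; the step that actually maps CEGs onto an assembly is not shown, and this is not presented as the authors' implementation.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the length L such that contigs of length >= L together
    contain at least half of the total assembled bases.
    """
    if not contig_lengths:
        raise ValueError("at least one contig length is required")
    ordered = sorted(contig_lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


def ceg_fraction(mapped_ids, reference_ids):
    """Return the fraction of reference core genes found in the assembly.

    Both arguments are iterables of gene identifiers (hypothetical
    inputs); identifiers outside the reference set are ignored.
    """
    reference = set(reference_ids)
    if not reference:
        raise ValueError("the reference gene set is empty")
    return len(reference & set(mapped_ids)) / len(reference)


if __name__ == "__main__":
    # Toy contig lengths and toy gene identifiers, for illustration only.
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
    print(ceg_fraction(["g1", "g2", "other"], ["g1", "g2", "g3", "g4"]))
```

Reporting the two values side by side reflects the abstract's point that the CEG proportion complements N50: one describes contiguity, the other describes how much of the expected gene set is recoverable.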