Replacing Sanger with Next Generation Sequencing to improve coverage and quality of reference DNA barcodes for plants

We estimate the global BOLD Systems database holds core DNA barcodes (rbcL + matK) for about 15% of land plant species and that comprehensive species coverage is still many decades away. Interim performance of the resource is compromised by variable sequence overlap and modest information conten...

Detailed description

Bibliographic details
Published in: Scientific reports 2017-04, Vol.7 (1), p.46040-46040, Article 46040
Main authors: Wilkinson, Mike J., Szabo, Claudia, Ford, Caroline S., Yarom, Yuval, Croxford, Adam E., Camp, Amanda, Gooding, Paul
Format: Article
Language: eng
Online access: Full text
container_end_page 46040
container_issue 1
container_start_page 46040
container_title Scientific reports
container_volume 7
creator Wilkinson, Mike J.
Szabo, Claudia
Ford, Caroline S.
Yarom, Yuval
Croxford, Adam E.
Camp, Amanda
Gooding, Paul
description We estimate the global BOLD Systems database holds core DNA barcodes (rbcL + matK) for about 15% of land plant species and that comprehensive species coverage is still many decades away. Interim performance of the resource is compromised by variable sequence overlap and modest information content within each barcode. Our model predicts that the proportion of species-unique barcodes reduces as the database grows and that ‘false’ species-unique barcodes remain >5% until the database is almost complete. We conclude the current rbcL + matK barcode is unfit for purpose. Genome skimming and supplementary barcodes could improve diagnostic power but would slow new barcode acquisition. We therefore present two novel Next Generation Sequencing protocols (with freeware) capable of accurate, massively parallel de novo assembly of high quality DNA barcodes of >1400 bp. We explore how these capabilities could enhance species diagnosis in the coming decades.
doi_str_mv 10.1038/srep46040
format Article
fullrecord <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_5388885</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>1903447593</sourcerecordid><originalsourceid>FETCH-LOGICAL-c438t-5027011fa9a1540801306f320ddbfac56d17efdf047374ec778e32b6d21396753</originalsourceid><addsrcrecordid>eNplkV1rFDEUhoMottRe-Ack4I0VVvM5ydwIpWoVSgWr1yGbOdmmzCbTJFPtvzfr1mXVc5EcOA_v-XgRek7JG0q4flsyTKIjgjxCh4wIuWCcscd7-QE6LuWGtJCsF7R_ig6YFoT2Uh-i-StMo3UhrvCVjSvI-Eeo1_gSflZ8DhGyrSFFfAW3M8TfWE04rKec7gC79mS7AmzjgG9nO4Z6j5PHGTzkhgN-f3mKlza7NEDBPmXcmsVanqEn3o4Fjh_-I_T944dvZ58WF1_OP5-dXiyc4LouJGGKUOptb6kURBPKSec5I8Ow9NbJbqAK_OCJUFwJcEpp4GzZDYzyvlOSH6F3W91pXq5hcBBrtqOZcljbfG-SDebvSgzXZpXujOS6xUbg1YNATu0CpZp1KA7GtgWkuRiqtSKyHV839OU_6E2ac2zrGdoTLoSSPW_UyZZyOZVmnd8NQ4nZ-Gl2fjb2xf70O_KPew14vQVKK23M22v5n9ovNrGqWw</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1903447593</pqid></control><display><type>article</type><title>Replacing Sanger with Next Generation Sequencing to improve coverage and quality of reference DNA barcodes for plants</title><source>Nature Open Access</source><source>MEDLINE</source><source>DOAJ Directory of Open Access Journals</source><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><source>PubMed Central</source><source>Alma/SFX Local Collection</source><source>Free Full-Text Journals in Chemistry</source><source>Springer Nature OA Free Journals</source><creator>Wilkinson, Mike J. ; Szabo, Claudia ; Ford, Caroline S. ; Yarom, Yuval ; Croxford, Adam E. ; Camp, Amanda ; Gooding, Paul</creator><creatorcontrib>Wilkinson, Mike J. ; Szabo, Claudia ; Ford, Caroline S. ; Yarom, Yuval ; Croxford, Adam E. 
; Camp, Amanda ; Gooding, Paul</creatorcontrib><description>We estimate the global BOLD Systems database holds core DNA barcodes ( rbcL  +  matK ) for about 15% of land plant species and that comprehensive species coverage is still many decades away. Interim performance of the resource is compromised by variable sequence overlap and modest information content within each barcode. Our model predicts that the proportion of species-unique barcodes reduces as the database grows and that ‘false’ species-unique barcodes remain &gt;5% until the database is almost complete. We conclude the current rbcL  +  matK barcode is unfit for purpose. Genome skimming and supplementary barcodes could improve diagnostic power but would slow new barcode acquisition. We therefore present two novel Next Generation Sequencing protocols (with freeware) capable of accurate, massively parallel de novo assembly of high quality DNA barcodes of &gt;1400 bp. We explore how these capabilities could enhance species diagnosis in the coming decades.</description><identifier>ISSN: 2045-2322</identifier><identifier>EISSN: 2045-2322</identifier><identifier>DOI: 10.1038/srep46040</identifier><identifier>PMID: 28401958</identifier><language>eng</language><publisher>London: Nature Publishing Group UK</publisher><subject>45 ; 45/23 ; 45/77 ; 631/114 ; 631/449 ; Coverage ; Databases, Genetic ; Deoxyribonucleic acid ; DNA ; DNA Barcoding, Taxonomic - methods ; DNA sequencing ; DNA, Plant - genetics ; Genomes ; High-Throughput Nucleotide Sequencing - methods ; Humanities and Social Sciences ; multidisciplinary ; Phylogeny ; Plants - genetics ; Reference Standards ; Science ; Sonication ; Species ; Species Specificity</subject><ispartof>Scientific reports, 2017-04, Vol.7 (1), p.46040-46040, Article 46040</ispartof><rights>The Author(s) 2017</rights><rights>Copyright Nature Publishing Group Apr 2017</rights><rights>Copyright © 2017, The Author(s) 2017 The 
Author(s)</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c438t-5027011fa9a1540801306f320ddbfac56d17efdf047374ec778e32b6d21396753</citedby><cites>FETCH-LOGICAL-c438t-5027011fa9a1540801306f320ddbfac56d17efdf047374ec778e32b6d21396753</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC5388885/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC5388885/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,723,776,780,860,881,27901,27902,41096,42165,51551,53766,53768</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/28401958$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Wilkinson, Mike J.</creatorcontrib><creatorcontrib>Szabo, Claudia</creatorcontrib><creatorcontrib>Ford, Caroline S.</creatorcontrib><creatorcontrib>Yarom, Yuval</creatorcontrib><creatorcontrib>Croxford, Adam E.</creatorcontrib><creatorcontrib>Camp, Amanda</creatorcontrib><creatorcontrib>Gooding, Paul</creatorcontrib><title>Replacing Sanger with Next Generation Sequencing to improve coverage and quality of reference DNA barcodes for plants</title><title>Scientific reports</title><addtitle>Sci Rep</addtitle><addtitle>Sci Rep</addtitle><description>We estimate the global BOLD Systems database holds core DNA barcodes ( rbcL  +  matK ) for about 15% of land plant species and that comprehensive species coverage is still many decades away. Interim performance of the resource is compromised by variable sequence overlap and modest information content within each barcode. 
Our model predicts that the proportion of species-unique barcodes reduces as the database grows and that ‘false’ species-unique barcodes remain &gt;5% until the database is almost complete. We conclude the current rbcL  +  matK barcode is unfit for purpose. Genome skimming and supplementary barcodes could improve diagnostic power but would slow new barcode acquisition. We therefore present two novel Next Generation Sequencing protocols (with freeware) capable of accurate, massively parallel de novo assembly of high quality DNA barcodes of &gt;1400 bp. We explore how these capabilities could enhance species diagnosis in the coming decades.</description><subject>45</subject><subject>45/23</subject><subject>45/77</subject><subject>631/114</subject><subject>631/449</subject><subject>Coverage</subject><subject>Databases, Genetic</subject><subject>Deoxyribonucleic acid</subject><subject>DNA</subject><subject>DNA Barcoding, Taxonomic - methods</subject><subject>DNA sequencing</subject><subject>DNA, Plant - genetics</subject><subject>Genomes</subject><subject>High-Throughput Nucleotide Sequencing - methods</subject><subject>Humanities and Social Sciences</subject><subject>multidisciplinary</subject><subject>Phylogeny</subject><subject>Plants - genetics</subject><subject>Reference Standards</subject><subject>Science</subject><subject>Sonication</subject><subject>Species</subject><subject>Species 
Specificity</subject><issn>2045-2322</issn><issn>2045-2322</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2017</creationdate><recordtype>article</recordtype><sourceid>C6C</sourceid><sourceid>EIF</sourceid><sourceid>BENPR</sourceid><recordid>eNplkV1rFDEUhoMottRe-Ack4I0VVvM5ydwIpWoVSgWr1yGbOdmmzCbTJFPtvzfr1mXVc5EcOA_v-XgRek7JG0q4flsyTKIjgjxCh4wIuWCcscd7-QE6LuWGtJCsF7R_ig6YFoT2Uh-i-StMo3UhrvCVjSvI-Eeo1_gSflZ8DhGyrSFFfAW3M8TfWE04rKec7gC79mS7AmzjgG9nO4Z6j5PHGTzkhgN-f3mKlza7NEDBPmXcmsVanqEn3o4Fjh_-I_T944dvZ58WF1_OP5-dXiyc4LouJGGKUOptb6kURBPKSec5I8Ow9NbJbqAK_OCJUFwJcEpp4GzZDYzyvlOSH6F3W91pXq5hcBBrtqOZcljbfG-SDebvSgzXZpXujOS6xUbg1YNATu0CpZp1KA7GtgWkuRiqtSKyHV839OU_6E2ac2zrGdoTLoSSPW_UyZZyOZVmnd8NQ4nZ-Gl2fjb2xf70O_KPew14vQVKK23M22v5n9ovNrGqWw</recordid><startdate>20170412</startdate><enddate>20170412</enddate><creator>Wilkinson, Mike J.</creator><creator>Szabo, Claudia</creator><creator>Ford, Caroline S.</creator><creator>Yarom, Yuval</creator><creator>Croxford, Adam E.</creator><creator>Camp, Amanda</creator><creator>Gooding, Paul</creator><general>Nature Publishing Group UK</general><general>Nature Publishing 
Group</general><scope>C6C</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>3V.</scope><scope>7X7</scope><scope>7XB</scope><scope>88A</scope><scope>88E</scope><scope>88I</scope><scope>8FE</scope><scope>8FH</scope><scope>8FI</scope><scope>8FJ</scope><scope>8FK</scope><scope>ABUWG</scope><scope>AEUYN</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BBNVY</scope><scope>BENPR</scope><scope>BHPHI</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>FYUFA</scope><scope>GHDGH</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>K9.</scope><scope>LK8</scope><scope>M0S</scope><scope>M1P</scope><scope>M2P</scope><scope>M7P</scope><scope>PHGZM</scope><scope>PHGZT</scope><scope>PIMPY</scope><scope>PJZUB</scope><scope>PKEHL</scope><scope>PPXIY</scope><scope>PQEST</scope><scope>PQGLB</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>Q9U</scope><scope>7X8</scope><scope>5PM</scope></search><sort><creationdate>20170412</creationdate><title>Replacing Sanger with Next Generation Sequencing to improve coverage and quality of reference DNA barcodes for plants</title><author>Wilkinson, Mike J. ; Szabo, Claudia ; Ford, Caroline S. ; Yarom, Yuval ; Croxford, Adam E. 
; Camp, Amanda ; Gooding, Paul</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c438t-5027011fa9a1540801306f320ddbfac56d17efdf047374ec778e32b6d21396753</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2017</creationdate><topic>45</topic><topic>45/23</topic><topic>45/77</topic><topic>631/114</topic><topic>631/449</topic><topic>Coverage</topic><topic>Databases, Genetic</topic><topic>Deoxyribonucleic acid</topic><topic>DNA</topic><topic>DNA Barcoding, Taxonomic - methods</topic><topic>DNA sequencing</topic><topic>DNA, Plant - genetics</topic><topic>Genomes</topic><topic>High-Throughput Nucleotide Sequencing - methods</topic><topic>Humanities and Social Sciences</topic><topic>multidisciplinary</topic><topic>Phylogeny</topic><topic>Plants - genetics</topic><topic>Reference Standards</topic><topic>Science</topic><topic>Sonication</topic><topic>Species</topic><topic>Species Specificity</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Wilkinson, Mike J.</creatorcontrib><creatorcontrib>Szabo, Claudia</creatorcontrib><creatorcontrib>Ford, Caroline S.</creatorcontrib><creatorcontrib>Yarom, Yuval</creatorcontrib><creatorcontrib>Croxford, Adam E.</creatorcontrib><creatorcontrib>Camp, Amanda</creatorcontrib><creatorcontrib>Gooding, Paul</creatorcontrib><collection>Springer Nature OA Free Journals</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>ProQuest Central (Corporate)</collection><collection>Health &amp; Medical Collection</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Biology Database (Alumni Edition)</collection><collection>Medical Database (Alumni Edition)</collection><collection>Science 
Database (Alumni Edition)</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>Hospital Premium Collection</collection><collection>Hospital Premium Collection (Alumni Edition)</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest One Sustainability</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>Biological Science Collection</collection><collection>ProQuest Central</collection><collection>Natural Science Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>Health Research Premium Collection</collection><collection>Health Research Premium Collection (Alumni)</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Health &amp; Medical Complete (Alumni)</collection><collection>ProQuest Biological Science Collection</collection><collection>Health &amp; Medical Collection (Alumni Edition)</collection><collection>Medical Database</collection><collection>Science Database</collection><collection>Biological Science Database</collection><collection>ProQuest Central (New)</collection><collection>ProQuest One Academic (New)</collection><collection>Publicly Available Content Database</collection><collection>ProQuest Health &amp; Medical Research Collection</collection><collection>ProQuest One Academic Middle East (New)</collection><collection>ProQuest One Health &amp; Nursing</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Applied &amp; Life Sciences</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest 
Central China</collection><collection>ProQuest Central Basic</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Scientific reports</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Wilkinson, Mike J.</au><au>Szabo, Claudia</au><au>Ford, Caroline S.</au><au>Yarom, Yuval</au><au>Croxford, Adam E.</au><au>Camp, Amanda</au><au>Gooding, Paul</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Replacing Sanger with Next Generation Sequencing to improve coverage and quality of reference DNA barcodes for plants</atitle><jtitle>Scientific reports</jtitle><stitle>Sci Rep</stitle><addtitle>Sci Rep</addtitle><date>2017-04-12</date><risdate>2017</risdate><volume>7</volume><issue>1</issue><spage>46040</spage><epage>46040</epage><pages>46040-46040</pages><artnum>46040</artnum><issn>2045-2322</issn><eissn>2045-2322</eissn><abstract>We estimate the global BOLD Systems database holds core DNA barcodes ( rbcL  +  matK ) for about 15% of land plant species and that comprehensive species coverage is still many decades away. Interim performance of the resource is compromised by variable sequence overlap and modest information content within each barcode. Our model predicts that the proportion of species-unique barcodes reduces as the database grows and that ‘false’ species-unique barcodes remain &gt;5% until the database is almost complete. We conclude the current rbcL  +  matK barcode is unfit for purpose. Genome skimming and supplementary barcodes could improve diagnostic power but would slow new barcode acquisition. We therefore present two novel Next Generation Sequencing protocols (with freeware) capable of accurate, massively parallel de novo assembly of high quality DNA barcodes of &gt;1400 bp. 
We explore how these capabilities could enhance species diagnosis in the coming decades.</abstract><cop>London</cop><pub>Nature Publishing Group UK</pub><pmid>28401958</pmid><doi>10.1038/srep46040</doi><tpages>1</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 2045-2322
ispartof Scientific reports, 2017-04, Vol.7 (1), p.46040-46040, Article 46040
issn 2045-2322
2045-2322
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_5388885
source Nature Open Access; MEDLINE; DOAJ Directory of Open Access Journals; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals; PubMed Central; Alma/SFX Local Collection; Free Full-Text Journals in Chemistry; Springer Nature OA Free Journals
subjects 45
45/23
45/77
631/114
631/449
Coverage
Databases, Genetic
Deoxyribonucleic acid
DNA
DNA Barcoding, Taxonomic - methods
DNA sequencing
DNA, Plant - genetics
Genomes
High-Throughput Nucleotide Sequencing - methods
Humanities and Social Sciences
multidisciplinary
Phylogeny
Plants - genetics
Reference Standards
Science
Sonication
Species
Species Specificity
title Replacing Sanger with Next Generation Sequencing to improve coverage and quality of reference DNA barcodes for plants
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-15T22%3A10%3A26IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Replacing%20Sanger%20with%20Next%20Generation%20Sequencing%20to%20improve%20coverage%20and%20quality%20of%20reference%20DNA%20barcodes%20for%20plants&rft.jtitle=Scientific%20reports&rft.au=Wilkinson,%20Mike%20J.&rft.date=2017-04-12&rft.volume=7&rft.issue=1&rft.spage=46040&rft.epage=46040&rft.pages=46040-46040&rft.artnum=46040&rft.issn=2045-2322&rft.eissn=2045-2322&rft_id=info:doi/10.1038/srep46040&rft_dat=%3Cproquest_pubme%3E1903447593%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1903447593&rft_id=info:pmid/28401958&rfr_iscdi=true