Unlocking short read sequencing for metagenomics

Bibliographic details
Published in:	PLoS ONE, 2010-07, Vol. 5 (7), p. e11840
Main authors:	Rodrigue, Sébastien; Materna, Arne C; Timberlake, Sonia C; Blackburn, Matthew C; Malmstrom, Rex R; Alm, Eric J; Chisholm, Sallie W
Format:	Article
Language:	English
Subjects:	Analysis; Artificial intelligence; BASIC BIOLOGICAL SCIENCES; Biochemistry/Bioinformatics; Biotechnology; Composite materials; Computational Biology/Metagenomics; Computer applications; Deoxyribonucleic acid; DNA; DNA sequencing; Engineering; Environmental engineering; Error analysis; Genetics and Genomics; Genetics and Genomics/Bioinformatics; Genomes; Genomics; Inserts; Metagenomics - methods; Molecular Biology; Next-generation sequencing; Nucleic acids; Phylogenetics; Polyethylene glycol; Prochlorococcus; Sequence Analysis, DNA - methods; Software
Online access:	Full text

Description
Abstract:	Different high-throughput nucleic acid sequencing platforms are currently available, but there is a trade-off between the cost and number of reads that can be generated and the read length that can be achieved. We describe an experimental and computational pipeline yielding millions of reads that can exceed 200 bp with quality scores approaching those of traditional Sanger sequencing. The method combines an automatable gel-less library construction step with paired-end sequencing on a short-read instrument. With appropriately sized library inserts, mate-pair sequences can overlap, and we describe the SHERA software package that joins them to form a longer composite read. This strategy is broadly applicable to sequencing applications that benefit from low-cost high-throughput sequencing, but require longer read lengths. We demonstrate that our approach enables metagenomic analyses using the Illumina Genome Analyzer, with low error rates, and at a fraction of the cost of pyrosequencing.
ISSN:	1932-6203
DOI:	10.1371/journal.pone.0011840
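
Illustration:	The abstract notes that, with appropriately sized inserts, the two reads of a pair can overlap and be joined into one longer composite read. The Python sketch below shows that general idea only. It is not the SHERA implementation, whose scoring is not described in this record. The function names, the `min_overlap` and `max_mismatch_frac` parameters, and the rule of keeping the higher-quality base at each overlapping position are all assumptions made for illustration. Qualities are assumed to be lists of integer Phred scores, and the insert is assumed to be at least as long as a single read.

```python
# Illustrative sketch only; this is NOT the SHERA algorithm.
# All names and thresholds below are hypothetical.

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def join_pair(read1, qual1, read2, qual2,
              min_overlap=10, max_mismatch_frac=0.1):
    """Join a read and its mate into one composite read.

    read2/qual2 are given as sequenced, i.e. from the opposite strand.
    Returns (sequence, qualities), or None if no acceptable overlap exists.
    """
    # Bring the mate onto the same strand as read1.
    mate = reverse_complement(read2)
    mate_qual = qual2[::-1]

    # Try every overlap length, longest first, and keep the one with the
    # lowest mismatch fraction (ties go to the longer overlap).
    best = None  # (mismatch_fraction, overlap_length)
    for olap in range(min(len(read1), len(mate)), min_overlap - 1, -1):
        tail, head = read1[-olap:], mate[:olap]
        frac = sum(a != b for a, b in zip(tail, head)) / olap
        if frac <= max_mismatch_frac and (best is None or frac < best[0]):
            best = (frac, olap)
    if best is None:
        return None

    olap = best[1]
    start = len(read1) - olap
    seq = list(read1[:start])
    qual = list(qual1[:start])

    # In the overlap, keep whichever base has the higher quality score.
    for i in range(olap):
        j = start + i
        if qual1[j] >= mate_qual[i]:
            seq.append(read1[j])
            qual.append(qual1[j])
        else:
            seq.append(mate[i])
            qual.append(mate_qual[i])

    # Append the part of the mate that extends past read1.
    seq.extend(mate[olap:])
    qual.extend(mate_qual[olap:])
    return "".join(seq), qual
```

For example, two 20-base reads taken from opposite ends of a 30-base fragment overlap by 10 bases, so `join_pair` would return a 30-base composite. A real tool would also need to handle inserts shorter than the read length and would typically combine the quality scores of agreeing bases rather than just picking one.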