Denoising PCR-amplified metagenome data

Bibliographic details
Published in:	BMC Bioinformatics, 2012-10, Vol. 13 (1), p. 283, Article 283
Main authors:	Rosen, Michael J; Callahan, Benjamin J; Fisher, Daniel S; Holmes, Susan P
Format:	Article
Language:	English
Subjects:	Algorithms; Bioinformatics; Biological diversity; Cloning; Computer programs; Data processing; Genomes; Genotypes; Information processing; Metagenome - genetics; Polymerase chain reaction; Polymerase Chain Reaction - statistics & numerical data; Software; Studies; Taxonomy

Description
Abstract:	PCR amplification and high-throughput sequencing theoretically enable the characterization of the finest-scale diversity in natural microbial and viral populations, but each of these methods introduces random errors that are difficult to distinguish from genuine biological diversity. Several approaches have been proposed to denoise these data but lack either speed or accuracy. We introduce a new denoising algorithm that we call DADA (Divisive Amplicon Denoising Algorithm). Without training data, DADA infers both the sample genotypes and error parameters that produced a metagenome data set. We demonstrate performance on control data sequenced on Roche's 454 platform, and compare the results to the most accurate denoising software currently available, AmpliconNoise. DADA is more accurate and over an order of magnitude faster than AmpliconNoise. It eliminates the need for training data to establish error parameters, fully utilizes sequence-abundance information, and enables inclusion of context-dependent PCR error rates. It should be readily extensible to other sequencing platforms such as Illumina.
ISSN:	1471-2105
DOI:	10.1186/1471-2105-13-283