Systematic bias in high-throughput sequencing data and its correction by BEADS

Bibliographic details
Published in:	Nucleic acids research 2011-08, Vol.39 (15), p.e103-e103
Main authors:	Cheung, Ming-Sin; Down, Thomas A; Latorre, Isabel; Ahringer, Julie
Format:	Article
Language:	English
Subjects:	Algorithms; Animals; Base Composition; Caenorhabditis elegans - genetics; Chromatin Immunoprecipitation; DNA, Helminth - chemistry; High-Throughput Nucleotide Sequencing - methods; Methods Online; Sequence Analysis, DNA - methods
Online access:	Full text

Description
Abstract:	Genomic sequences obtained through high-throughput sequencing are not uniformly distributed across the genome. For example, sequencing data of total genomic DNA show significant, yet unexpected enrichments on promoters and exons. This systematic bias is a particular problem for techniques such as chromatin immunoprecipitation, where the signal for a target factor is plotted across genomic features. We have focused on data obtained from Illumina's Genome Analyser platform, where at least three factors contribute to sequence bias: GC content, mappability of sequencing reads, and regional biases that might be generated by local structure. We show that relying on input control as a normalizer is not generally appropriate due to sample-to-sample variation in bias. To correct sequence bias, we present BEADS (bias elimination algorithm for deep sequencing), a simple three-step normalization scheme that successfully unmasks real binding patterns in ChIP-seq data. We suggest that this procedure be done routinely prior to data interpretation and downstream analyses.
ISSN:	0305-1048 1362-4962
DOI:	10.1093/nar/gkr425