High quality SNP calling using Illumina data at shallow coverage

Bibliographic details
Published in: Bioinformatics 2010-04, Vol. 26 (8), p. 1029-1035
Main authors: Malhis, Nawar; Jones, Steven J. M.
Format: Article
Language: English
Online access: Full text
Description
Summary: Motivation: Detection of single nucleotide polymorphisms (SNPs) has been a major application in processing second generation sequencing (SGS) data. In principle, SNPs are called on single base differences between a reference genome and a sequence generated from SGS short reads of a sample genome. However, this exercise is far from trivial; several parameters related to sequencing quality and/or reference genome properties have a substantial effect on the accuracy of called SNPs, especially with shallow-coverage data. In this work, we present Slider II, an alignment and SNP calling approach whose improved algorithms enable a larger number of called SNPs with a lower false positive rate. In addition to regular alignment and SNP calling, Slider II can optionally use information about known SNPs of a target genome as priors in alignment and SNP calling, enhancing its capability to detect these known SNPs as well as novel SNPs and mutations in their vicinity.
Contact: nmalhis@bcgsc.ca
Supplementary information and availability: Supplementary data are available at Bioinformatics online and at http://www.bcgsc.ca/platform/bioinfo/software/SliderII
ISSN: 1367-4803, 1460-2059, 1367-4811
DOI:10.1093/bioinformatics/btq092