SNP discovery in nonmodel organisms: strand bias and base‐substitution errors reduce conversion rates


Bibliographic Details
Published in:	Molecular ecology resources 2015-07, Vol.15 (4), p.723-736
Main Authors:	Gonçalves da Silva, Anders; Barendse, William; Kijas, James W.; Barris, Wes C.; McWilliam, Sean; Bunch, Rowan J.; McCullough, Russell; Harrison, Blair; Hoelzel, A. Rus; England, Phillip R.
Format:	Article
Language:	English
Subjects:	Animals; assembly error; Computational Biology - methods; Genes; Genotyping Techniques - methods; High-Throughput Nucleotide Sequencing - methods; Hoplostethus atlanticus; orange roughy; Organisms; Polymorphism; Polymorphism, Single Nucleotide; sequencing error; single nucleotide polymorphisms; Vertebrates - classification; Vertebrates - genetics
Online Access:	Full text

Description
Summary:	Single nucleotide polymorphisms (SNPs) have become the marker of choice for genetic studies in organisms of conservation, commercial or biological interest. Most SNP discovery projects in nonmodel organisms apply a strategy for identifying putative SNPs based on filtering rules that account for random sequencing errors. Here, we analyse data used to develop 4723 novel SNPs for the commercially important deep‐sea fish, orange roughy (Hoplostethus atlanticus), to assess the impact of not accounting for systematic sequencing errors when filtering polymorphisms identified during SNP discovery. We used SAMtools to identify polymorphisms in a Velvet assembly of genomic DNA sequence data from seven individuals. The resulting set of polymorphisms was filtered to minimize ‘bycatch’, that is, polymorphisms caused by sequencing or assembly error. An Illumina Infinium SNP chip was used to genotype a final set of 7714 polymorphisms across 1734 individuals. Five predictors were examined for their effect on the probability of obtaining an assayable SNP: depth of coverage, number of reads that support a variant, polymorphism type (e.g. A/C), strand‐bias and Illumina SNP probe design score. Our results indicate that filtering out systematic sequencing errors could substantially improve the efficiency of SNP discovery. We show that BLASTX can be used as an efficient tool to identify single‐copy genomic regions in the absence of a reference genome. The results have implications for research aiming to identify assayable SNPs and build SNP genotyping assays for nonmodel organisms.
ISSN:	1755-098X 1755-0998
DOI:	10.1111/1755-0998.12343
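
Illustrative note: the summary names depth of coverage, number of variant-supporting reads and strand bias as predictors of whether a polymorphism yields an assayable SNP. The sketch below shows one way such a filter might look in Python. It is not the authors' pipeline: the `Candidate` record, the `passes_filters` function and all thresholds are hypothetical, and strand bias is approximated here by a simple forward/reverse read fraction rather than by any statistic reported in the article.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical record for one putative SNP (field names are assumptions)."""
    name: str
    depth: int        # total reads covering the site
    alt_forward: int  # variant-supporting reads on the forward strand
    alt_reverse: int  # variant-supporting reads on the reverse strand


def passes_filters(c, min_depth=10, min_alt=3, min_strand_frac=0.1):
    """Return True if the candidate clears all three illustrative filters.

    The thresholds are placeholders chosen for the example, not values
    taken from the article.
    """
    alt = c.alt_forward + c.alt_reverse
    if c.depth < min_depth or alt < min_alt:
        return False
    # Fraction of variant-supporting reads on the less-represented strand;
    # a value near 0 means the variant is seen almost entirely on one strand.
    minor_frac = min(c.alt_forward, c.alt_reverse) / alt
    return minor_frac >= min_strand_frac


candidates = [
    Candidate("snp_a", depth=30, alt_forward=6, alt_reverse=5),   # balanced strands
    Candidate("snp_b", depth=30, alt_forward=10, alt_reverse=0),  # one strand only
    Candidate("snp_c", depth=5, alt_forward=2, alt_reverse=2),    # low coverage
]
kept = [c.name for c in candidates if passes_filters(c)]
print(kept)
```

In this toy input only `snp_a` is kept: `snp_b` is supported by reads from a single strand and `snp_c` falls below the coverage threshold.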