Hypothesis driven single nucleotide polymorphism search (HyDn-SNP-S)

Bibliographic details
Published in: DNA repair 2013-09, Vol.12 (9), p.733-740
Main authors: Swett, Rebecca J., Elias, Angela, Miller, Jeffrey A., Dyson, Gregory E., Andrés Cisneros, G.
Format: Article
Language: English
Description
Summary:
• New method to correlate disease SNPs to genes or protein families.
• Applied to correlate cancer SNPs to DNA polymerases.
• Found 79 new statistically significant SNPs for four cancer phenotypes on DNA polymerases.
• Linkage of Polσ and Polλ to prostate and breast cancers, respectively.

The advent of complete-genome genotyping across phenotype cohorts has provided a rich source of information for bioinformaticians. However, the search for SNPs in these data is generally performed on a study-by-study basis, without any specific hypothesis about the location of SNPs that are predictive for the phenotype. We have designed a method whereby very large SNP lists (several gigabytes in size), combining several genotyping studies at once, can be sorted and traced back to their ultimate consequence in protein structure. Given a working hypothesis, researchers can easily search whole-genome genotyping data for SNPs that link genetic locations to phenotypes. This allows a targeted search for correlations between phenotypes and potentially relevant systems, rather than relying on statistical methods alone. HyDn-SNP-S returns results that are less data-dense, allowing more thorough analysis, including haplotype analysis. We have applied our method to correlate DNA polymerases to cancer phenotypes using four of the available cancer databases in dbGaP. Logistic regression and derived haplotype analysis indicate that ∼80 previously overlooked SNPs are statistically significant. Derived haplotypes from this work link POLL to breast cancer and POLG to prostate cancer, with increases in incidence of 3.01- and 9.6-fold, respectively. Molecular dynamics simulations on the wild type and on one of the SNP mutants from the POLL haplotype provide atomic-level insights into the functional impact of this cancer-related SNP. Furthermore, HyDn-SNP-S has been designed to allow application to any system. The program is available upon request from the authors.
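The hypothesis-driven step described in the summary (keeping only the SNPs that fall within genes of interest before any statistics are run) can be illustrated with a minimal sketch. This is not the authors' program, which is available only on request. The column names (`snp_id`, `chrom`, `pos`), the tab-separated layout, the gene names, and all coordinates below are hypothetical placeholders chosen for illustration.

```python
import csv
import io
from collections import defaultdict

# Hypothetical gene intervals (chromosome, start, end).
# Toy coordinates only; these are not real genome annotations.
GENE_REGIONS = {
    "GENE_A": ("1", 1000, 2000),
    "GENE_B": ("2", 500, 900),
}


def filter_snps(rows, regions):
    """Yield (gene, row) for each SNP row inside a hypothesized region.

    Rows are consumed one at a time, so a very large SNP list can be
    streamed from disk without loading it into memory.
    """
    # Index the regions by chromosome so each row is checked only
    # against the intervals on its own chromosome.
    by_chrom = defaultdict(list)
    for gene, (chrom, start, end) in regions.items():
        by_chrom[chrom].append((start, end, gene))

    for row in rows:
        pos = int(row["pos"])
        for start, end, gene in by_chrom.get(row["chrom"], []):
            if start <= pos <= end:  # interval bounds are inclusive
                yield gene, row


# Toy SNP list standing in for a multi-gigabyte file (assumed format).
TOY_SNPS = """snp_id\tchrom\tpos
rs_toy1\t1\t1500
rs_toy2\t1\t2500
rs_toy3\t2\t500
rs_toy4\t3\t700
"""

reader = csv.DictReader(io.StringIO(TOY_SNPS), delimiter="\t")
hits = [(gene, row["snp_id"]) for gene, row in filter_snps(reader, GENE_REGIONS)]
print(hits)
```

In a real run the `io.StringIO` object would be replaced by an open file handle. The much smaller filtered list could then be passed to a statistics package for the logistic regression and haplotype analysis mentioned in the summary.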
ISSN: 1568-7864, 1568-7856
DOI: 10.1016/j.dnarep.2013.06.001