Wavelet Screening: a novel approach to analyzing GWAS data

Background Traditional methods for single-variant genome-wide association study (GWAS) incur a substantial multiple-testing burden because of the need to test for associations with a vast number of single-nucleotide polymorphisms (SNPs) simultaneously. Further, by ignoring more complex joint effects...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Veröffentlicht in:	BMC bioinformatics 2021-10, Vol.22 (1), p.1-484, Article 484
Hauptverfasser:	Denault, William R. P., Gjessing, Håkon K., Juodakis, Julius, Jacobsson, Bo, Jugessur, Astanand
Format:	Artikel
Sprache:	eng
Schlagworte:	Context Data analysis Etiology Genome-wide association studies Genomes Genomics Genotype & phenotype Genotypes GWAS Mathematical models Methodology Multiple testing Nucleotides Obstetrics, Gynecology and Reproductive Medicine Polygenic association Regions Regression coefficients Regularization methods Reproduktionsmedicin och gynekologi Screening Single-nucleotide polymorphism SNP Software Wavelet analysis Wavelet regression Wavelet transforms
Online-Zugang:	Volltext
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	Background Traditional methods for single-variant genome-wide association study (GWAS) incur a substantial multiple-testing burden because of the need to test for associations with a vast number of single-nucleotide polymorphisms (SNPs) simultaneously. Further, by ignoring more complex joint effects of nearby SNPs within a given region, these methods fail to consider the genomic context of an association with the outcome. Results To address these shortcomings, we present a more powerful method for GWAS, coined ‘Wavelet Screening’ (WS), that greatly reduces the number of tests to be performed. This is achieved through the use of a sliding-window approach based on wavelets to sequentially screen the entire genome for associations. Wavelets are oscillatory functions that are useful for analyzing the local frequency and time behavior of signals. The signals can then be divided into different scale components and analyzed separately. In the current setting, we consider a sequence of SNPs as a genetic signal, and for each screened region, we transform the genetic signal into the wavelet space. The null and alternative hypotheses are modeled using the posterior distribution of the wavelet coefficients. WS is enhanced by using additional information from the regression coefficients and by taking advantage of the pyramidal structure of wavelets. When faced with more complex genetic signals than single-SNP associations, we show via simulations that WS provides a substantial gain in power compared to both the traditional GWAS modeling and another popular regional association test called SNP-set (Sequence) Kernel Association Test (SKAT). To demonstrate feasibility, we applied WS to a large Norwegian cohort (N=8006) with genotypes and information available on gestational duration. Conclusions WS is a powerful and versatile approach to analyzing whole-genome data and lends itself easily to investigating various omics data types. Given its broader focus on the genomic context of an association, WS may provide additional insight into trait etiology by revealing genes and loci that might have been missed by previous efforts.
ISSN:	1471-2105 1471-2105
DOI:	10.1186/s12859-021-04356-5