Efficient testing and effect size estimation for set‐based genetic association inference via semiparametric multilevel mixture modeling


Bibliographic details
Published in: Biometrical journal 2022-08, Vol.64 (6), p.1142-1152
Main authors: Sugasawa, Shonosuke; Noma, Hisashi
Format: Article
Language: English
Description
Summary: In genetic association studies, rare variants with extremely low allele frequencies play a crucial role in complex traits. Set‐based testing methods, which jointly assess the effects of groups of single nucleotide polymorphisms (SNPs), were therefore developed to increase the power of association tests. However, their power is still insufficient, and precise estimation of the effect sizes of individual SNPs is largely impossible. In this article, we provide an efficient set‐based statistical inference framework that addresses both of these issues simultaneously using an empirical Bayes method with semiparametric multilevel mixture modeling. We propose using a hierarchical model that incorporates variation in set‐specific effects and applying the optimal discovery procedure (ODP), which achieves the largest overall power in multiple significance testing. In addition, we provide an optimal “set‐based” estimator of the empirical distribution of effect sizes. The efficiency of the proposed methods is demonstrated through application to a genome‐wide association study of coronary artery disease and through simulation studies. The results revealed numerous rare variants with large effect sizes for coronary artery disease, and the number of significant sets detected by the ODP was much greater than the number identified by existing methods.
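As a rough illustration of the ODP idea mentioned in the summary, the sketch below ranks test statistics by the ratio of an alternative mixture density to the null density. It is not the authors' multilevel, set-based model: it assumes a single level of z-scores with a standard normal null, and the effect-size grid, weights, and function names (`odp_statistic`, `rank_by_odp`) are hypothetical values chosen for the example rather than estimates from data.

```python
import math


def normal_pdf(x, mean=0.0, sd=1.0):
    """Density of a normal distribution evaluated at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def odp_statistic(z, effect_grid, weights):
    """Ratio of the alternative mixture density to the null density at z.

    Assumption: under the alternative, z ~ N(b, 1) with b drawn from a
    discrete effect-size distribution (effect_grid, weights); under the
    null, z ~ N(0, 1).
    """
    alt = sum(w * normal_pdf(z, mean=b) for b, w in zip(effect_grid, weights))
    null = normal_pdf(z)
    return alt / null


def rank_by_odp(z_scores, effect_grid, weights):
    """Return indices sorted from strongest to weakest evidence, plus the statistics."""
    stats = [odp_statistic(z, effect_grid, weights) for z in z_scores]
    order = sorted(range(len(z_scores)), key=lambda i: stats[i], reverse=True)
    return order, stats


# Hypothetical inputs: a symmetric two-point effect distribution and four z-scores.
effect_grid = [-2.0, 2.0]
weights = [0.5, 0.5]
z_scores = [0.1, 2.5, -3.0, 0.8]

order, stats = rank_by_odp(z_scores, effect_grid, weights)
print(order)
```

In an empirical Bayes setting the effect-size distribution would be estimated from all tests jointly rather than fixed in advance, which is how such a statistic borrows strength across tests.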
ISSN: 0323-3847, 1521-4036
DOI: 10.1002/bimj.202100234