GWASBrewer: An R Package for Simulating Realistic GWAS Summary Statistics

Bibliographic details
Published in: Genetic Epidemiology, 2025-01, Vol. 49 (1), p. e22594
Main author: Morrison, Jean
Format: Article
Language: English
Description
Summary: Many statistical genetics analysis methods make use of GWAS summary statistics. Best statistical practice requires evaluating these methods in realistic simulation experiments. However, simulating summary statistics by first simulating individual genotype and phenotype data is extremely computationally demanding. This high cost may force researchers to conduct overly simplistic simulations that fail to accurately measure method performance. Alternatively, summary statistics can be simulated directly from their theoretical distribution. Although this is a common need among statistical genetics researchers, no software packages exist for comprehensive GWAS summary statistic simulation. We present GWASBrewer, an open source R package for direct simulation of GWAS summary statistics. We show that statistics simulated by GWASBrewer have the same distribution as statistics generated from individual level data, and can be produced at a fraction of the computational expense. Additionally, GWASBrewer can simulate standard error estimates, something that is typically not done when sampling summary statistics directly. GWASBrewer is highly flexible, allowing the user to simulate data for multiple traits connected by causal effects and with complex distributions of effect sizes. We demonstrate example uses of GWASBrewer for evaluating Mendelian randomization, polygenic risk score, and heritability estimation methods.
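
The idea of sampling summary statistics directly from their theoretical distribution can be illustrated with a minimal sketch. This is not GWASBrewer's API or its algorithm; it is a simplified illustration under stated assumptions: variants are independent (no linkage disequilibrium), genotypes and the trait are standardized, and per-variant effects are small, so each marginal estimate is approximately normal around the true effect with standard error 1/sqrt(N). The function name `simulate_sumstats` and all of its parameters are hypothetical. Unlike the package described in the summary, the sketch uses a fixed theoretical standard error and does not simulate standard error estimates.

```python
import math
import random


def simulate_sumstats(n_samples, n_variants, h2, causal_frac, seed=0):
    """Sample GWAS-style summary statistics without individual-level data.

    Assumptions (illustrative only): independent variants, standardized
    genotypes and trait, small per-variant effects.
    """
    rng = random.Random(seed)
    # Causal effects are drawn so that the expected total heritability is h2.
    effect_sd = math.sqrt(h2 / (causal_frac * n_variants))
    # Approximate standard error of a marginal effect on the standardized scale.
    se = 1.0 / math.sqrt(n_samples)
    rows = []
    for _ in range(n_variants):
        is_causal = rng.random() < causal_frac
        beta = rng.gauss(0.0, effect_sd) if is_causal else 0.0
        # Direct draw from the sampling distribution of the estimate.
        beta_hat = rng.gauss(beta, se)
        rows.append(
            {"beta": beta, "beta_hat": beta_hat, "se": se, "z": beta_hat / se}
        )
    return rows


stats = simulate_sumstats(n_samples=10000, n_variants=500, h2=0.3, causal_frac=0.1)
print(len(stats), "variants simulated; first row:", stats[0])
```

The cost of this approach grows with the number of variants only, not with the number of individuals, which is the source of the computational saving the summary describes.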
ISSN:0741-0395
1098-2272
DOI:10.1002/gepi.22594