Genotype imputation, integration for genome-wide association studies and genomic prediction of blackleg resistance in Canola

Bibliographic details
Main authors: Zhao, Huanhuan; MacLeod, Iona M; Keeble‐Gagnere, Gabriel; Barbulescu, Denise M; Tibbits, Josquin F.; Kaur, Sukhjiwan; Hayden, Matthew
Format: Dataset
Language: English
Description
Summary:
Background: Integrating germplasm populations genotyped on different platforms via genotype imputation is a way to make use of accumulated genetic resources. In this study, we used 278 canola samples genotyped via whole-genome sequencing (WGS) at 10× coverage to evaluate the accuracy of three imputation approaches. The best-performing methods were then used to impute and integrate two canola genotype datasets: a diverse canola collection genotyped by transcriptome-based genotyping-by-sequencing (GBS-t) and a doubled haploid (DH) line collection genotyped with low-coverage WGS (skim-WGS). For the combined population, we evaluated the accuracy of genomic prediction (GP) and the power of genome-wide association studies (GWAS) to detect marker-trait associations for blackleg resistance.
Results: The empirical imputation accuracy (r²), measured as the squared correlation between observed and imputed genotypes, was moderate for Minimac3 when imputing from the GBS-t density to the WGS density. Accuracy improved from 0.64 to 0.82 after removing SNPs with a poor Minimac3-reported R² quality statistic (R² < 0.2). The r² for GLIMPSE was higher than that for Beagle when imputing from various low-coverage levels to full-coverage WGS. We imputed and integrated the diverse canola collection and the DH lines, and the combined population showed similar or slightly greater prediction accuracy (PA) for blackleg resistance traits than did either single population with ~921K SNPs. The combined population indicated higher power to detect marker-trait associations (MTAs); however, a similar number of MTAs was discovered when the single populations were combined in a meta-GWAS.
Conclusion: It is feasible to impute and integrate germplasm from different sequencing platforms for downstream analyses. However, genetic heterogeneity across populations could add complexity to the analyses. In the present study, increasing the sample size by combining datasets gave slightly greater prediction accuracy and greater detection power in GWAS.
DOI: 10.6084/m9.figshare.25661574
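
The summary defines empirical imputation accuracy as the squared correlation between observed and imputed genotypes and reports that dropping SNPs with a reported R² below 0.2 raised it. The following is a minimal Python sketch of that calculation, not the authors' pipeline. It assumes accuracy is computed per SNP and then averaged (the summary does not say whether averaging is per SNP or per sample), and the record fields `observed`, `imputed`, and `reported_r2`, as well as the toy values, are hypothetical.

```python
from statistics import mean


def squared_correlation(observed, imputed):
    """Squared Pearson correlation between two equal-length genotype vectors."""
    if len(observed) != len(imputed):
        raise ValueError("genotype vectors must have the same length")
    mean_obs, mean_imp = mean(observed), mean(imputed)
    cov = sum((o - mean_obs) * (i - mean_imp) for o, i in zip(observed, imputed))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_imp = sum((i - mean_imp) ** 2 for i in imputed)
    if var_obs == 0 or var_imp == 0:
        return None  # correlation is undefined when either vector is constant
    return cov * cov / (var_obs * var_imp)


def mean_accuracy(snps, min_reported_r2=None):
    """Average per-SNP r2, optionally skipping SNPs below a reported-R2 cutoff."""
    values = []
    for snp in snps:
        if min_reported_r2 is not None and snp["reported_r2"] < min_reported_r2:
            continue  # filtered out as poorly imputed
        r2 = squared_correlation(snp["observed"], snp["imputed"])
        if r2 is not None:
            values.append(r2)
    return mean(values) if values else None


# Toy data (hypothetical): genotypes coded 0/1/2, imputed values as dosages.
toy_snps = [
    {"observed": [0, 1, 2, 0, 2], "imputed": [0.1, 0.9, 1.8, 0.2, 2.0], "reported_r2": 0.95},
    {"observed": [0, 2, 0, 2, 1], "imputed": [1.1, 1.0, 0.9, 1.0, 1.0], "reported_r2": 0.05},
]

unfiltered = mean_accuracy(toy_snps)
filtered = mean_accuracy(toy_snps, min_reported_r2=0.2)  # cutoff from the summary
print(f"unfiltered mean r2: {unfiltered:.2f}")
print(f"filtered mean r2:   {filtered:.2f}")
```

In this toy case the second SNP is essentially uninformative, so removing it raises the average, which mirrors the direction of the change described in the summary without reproducing its numbers.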