HLA imputation in an admixed population: An assessment of the 1000 Genomes data as a training set

Abstract Methods to impute HLA alleles based on dense single nucleotide polymorphism (SNP) data provide a valuable resource to association studies and evolutionary investigation of the MHC region. The availability of appropriate training sets is critical to the accuracy of HLA imputation, and the in...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Human immunology 2016-03, Vol.77 (3), p.307-312
Hauptverfasser: Nunes, Kelly, Zheng, Xiuwen, Torres, Margareth, Moraes, Maria Elisa, Piovezan, Bruno Z, Pontes, Gerlandia N, Kimura, Lilian, Carnavalli, Juliana E.P, Mingroni Netto, Regina C, Meyer, Diogo
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Abstract Methods to impute HLA alleles based on dense single nucleotide polymorphism (SNP) data provide a valuable resource to association studies and evolutionary investigation of the MHC region. The availability of appropriate training sets is critical to the accuracy of HLA imputation, and the inclusion of samples with various ancestries is an important pre-requisite in studies of admixed populations. We assess the accuracy of HLA imputation using 1000 Genomes Project data as a training set, applying it to a highly admixed Brazilian population, the Quilombos from the state of São Paulo. To assess accuracy, we compared imputed and experimentally determined genotypes for 146 samples at 4 HLA classical loci. We found imputation accuracies of 82.9%, 81.8%, 94.8% and 86.6% for HLA - A , - B , -C and - DRB1 respectively (two-field resolution). Accuracies were improved when we included a subset of Quilombo individuals in the training set. We conclude that the 1000 Genomes data is a valuable resource for construction of training sets due to the diversity of ancestries and the potential for a large overlap of SNPs with the target population. We also show that tailoring training sets to features of the target population substantially enhances imputation accuracy.
ISSN:0198-8859
1879-1166
DOI:10.1016/j.humimm.2015.11.004