HLA imputation in an admixed population: An assessment of the 1000 Genomes data as a training set


Bibliographic details
Published in: Human immunology 2016-03, Vol.77 (3), p.307-312
Main authors: Nunes, Kelly; Zheng, Xiuwen; Torres, Margareth; Moraes, Maria Elisa; Piovezan, Bruno Z; Pontes, Gerlandia N; Kimura, Lilian; Carnavalli, Juliana E.P; Mingroni Netto, Regina C; Meyer, Diogo
Format: Article
Language: eng
Online access: Full text
container_end_page 312
container_issue 3
container_start_page 307
container_title Human immunology
container_volume 77
creator Nunes, Kelly
Zheng, Xiuwen
Torres, Margareth
Moraes, Maria Elisa
Piovezan, Bruno Z
Pontes, Gerlandia N
Kimura, Lilian
Carnavalli, Juliana E.P
Mingroni Netto, Regina C
Meyer, Diogo
description Methods to impute HLA alleles based on dense single nucleotide polymorphism (SNP) data provide a valuable resource to association studies and evolutionary investigation of the MHC region. The availability of appropriate training sets is critical to the accuracy of HLA imputation, and the inclusion of samples with various ancestries is an important pre-requisite in studies of admixed populations. We assess the accuracy of HLA imputation using 1000 Genomes Project data as a training set, applying it to a highly admixed Brazilian population, the Quilombos from the state of São Paulo. To assess accuracy, we compared imputed and experimentally determined genotypes for 146 samples at 4 HLA classical loci. We found imputation accuracies of 82.9%, 81.8%, 94.8% and 86.6% for HLA-A, -B, -C and -DRB1, respectively (two-field resolution). Accuracies were improved when we included a subset of Quilombo individuals in the training set. We conclude that the 1000 Genomes data is a valuable resource for construction of training sets due to the diversity of ancestries and the potential for a large overlap of SNPs with the target population. We also show that tailoring training sets to features of the target population substantially enhances imputation accuracy.
doi_str_mv 10.1016/j.humimm.2015.11.004
format Article
fullrecord <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_5609807</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>1_s2_0_S0198885915005571</els_id><sourcerecordid>1780514853</sourcerecordid><originalsourceid>FETCH-LOGICAL-c551t-e27ba73e550f8643a951337e33a1e7135db1e757360fad3ac86ee112d11080523</originalsourceid><addsrcrecordid>eNqNkk1v1DAQhiMEokvhHyDkI5eEmTiOHQ5IqwpapJU4AGfLm0y6XmJ7sZOK_nu8bCkfF5AsjeR5550ZPy6K5wgVArav9tVucda5qgYUFWIF0DwoVqhkVyK27cNiBdipUinRnRVPUtoDgATZPC7O6laoGkCsCnO1WTPrDstsZhs8s56ZfAZnv9HADuGwTD8Sr9k6X6dEKTnyMwsjm3fEMHuyS_LBUWKDmU3WMMPmaKy3_polmp8Wj0YzJXp2F8-Lz-_efrq4KjcfLt9frDdlLwTOJdVyayQnIWBUbcNNJ5BzSZwbJIlcDNscheQtjGbgplctEWI9IIICUfPz4s3J97BsHQ19njKaSR-idSbe6mCs_jPj7U5fhxstWugUyGzw8s4ghq8LpVk7m3qaJuMpLEmjVKJuOP6fFAQ2SvAsbU7SPoaUIo33EyHoI0i91yeQ-ghSI-oMMpe9-H2b-6Kf5H6tS_lNbyxFnXpLvqfBRupnPQT7rw5_G_RThtab6QvdUtqHJfrMS6NOtQb98fiZjn8JRW4vMpHvAzbElg</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1780514853</pqid></control><display><type>article</type><title>HLA imputation in an admixed population: An assessment of the 1000 Genomes data as a training set</title><source>MEDLINE</source><source>Access via ScienceDirect (Elsevier)</source><creator>Nunes, Kelly ; Zheng, Xiuwen ; Torres, Margareth ; Moraes, Maria Elisa ; Piovezan, Bruno Z ; Pontes, Gerlandia N ; Kimura, Lilian ; Carnavalli, Juliana E.P ; Mingroni Netto, Regina C ; Meyer, Diogo</creator><creatorcontrib>Nunes, Kelly ; Zheng, Xiuwen ; Torres, Margareth ; Moraes, Maria Elisa ; Piovezan, Bruno Z ; Pontes, Gerlandia N ; Kimura, Lilian ; Carnavalli, Juliana E.P ; Mingroni Netto, Regina C ; Meyer, Diogo</creatorcontrib><description>Abstract Methods to impute HLA alleles based on dense single nucleotide polymorphism (SNP) data provide a valuable resource to association studies and evolutionary investigation of the MHC 
region. The availability of appropriate training sets is critical to the accuracy of HLA imputation, and the inclusion of samples with various ancestries is an important pre-requisite in studies of admixed populations. We assess the accuracy of HLA imputation using 1000 Genomes Project data as a training set, applying it to a highly admixed Brazilian population, the Quilombos from the state of São Paulo. To assess accuracy, we compared imputed and experimentally determined genotypes for 146 samples at 4 HLA classical loci. We found imputation accuracies of 82.9%, 81.8%, 94.8% and 86.6% for HLA - A , - B , -C and - DRB1 respectively (two-field resolution). Accuracies were improved when we included a subset of Quilombo individuals in the training set. We conclude that the 1000 Genomes data is a valuable resource for construction of training sets due to the diversity of ancestries and the potential for a large overlap of SNPs with the target population. We also show that tailoring training sets to features of the target population substantially enhances imputation accuracy.</description><identifier>ISSN: 0198-8859</identifier><identifier>EISSN: 1879-1166</identifier><identifier>DOI: 10.1016/j.humimm.2015.11.004</identifier><identifier>PMID: 26582005</identifier><language>eng</language><publisher>United States: Elsevier Inc</publisher><subject>1000 Genomes ; Admixed populations ; Alleles ; Allergy and Immunology ; Brazil ; Computational Biology - methods ; Databases, Genetic ; Genetics, Population ; Genome, Human ; Genome-Wide Association Study ; HLA ; HLA Antigens - genetics ; Humans ; Imputation ; Polymorphism, Single Nucleotide ; Relatedness ; Reproducibility of Results ; Software ; Web Browser</subject><ispartof>Human immunology, 2016-03, Vol.77 (3), p.307-312</ispartof><rights>2016 American Society for Histocompatibility and Immunogenetics</rights><rights>Copyright © 2016 American Society for Histocompatibility and Immunogenetics. Published by Elsevier Inc. 
All rights reserved.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c551t-e27ba73e550f8643a951337e33a1e7135db1e757360fad3ac86ee112d11080523</citedby><cites>FETCH-LOGICAL-c551t-e27ba73e550f8643a951337e33a1e7135db1e757360fad3ac86ee112d11080523</cites><orcidid>0000-0002-1390-0708</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://dx.doi.org/10.1016/j.humimm.2015.11.004$$EHTML$$P50$$Gelsevier$$H</linktohtml><link.rule.ids>230,314,780,784,885,3550,27924,27925,45995</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/26582005$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Nunes, Kelly</creatorcontrib><creatorcontrib>Zheng, Xiuwen</creatorcontrib><creatorcontrib>Torres, Margareth</creatorcontrib><creatorcontrib>Moraes, Maria Elisa</creatorcontrib><creatorcontrib>Piovezan, Bruno Z</creatorcontrib><creatorcontrib>Pontes, Gerlandia N</creatorcontrib><creatorcontrib>Kimura, Lilian</creatorcontrib><creatorcontrib>Carnavalli, Juliana E.P</creatorcontrib><creatorcontrib>Mingroni Netto, Regina C</creatorcontrib><creatorcontrib>Meyer, Diogo</creatorcontrib><title>HLA imputation in an admixed population: An assessment of the 1000 Genomes data as a training set</title><title>Human immunology</title><addtitle>Hum Immunol</addtitle><description>Abstract Methods to impute HLA alleles based on dense single nucleotide polymorphism (SNP) data provide a valuable resource to association studies and evolutionary investigation of the MHC region. The availability of appropriate training sets is critical to the accuracy of HLA imputation, and the inclusion of samples with various ancestries is an important pre-requisite in studies of admixed populations. 
We assess the accuracy of HLA imputation using 1000 Genomes Project data as a training set, applying it to a highly admixed Brazilian population, the Quilombos from the state of São Paulo. To assess accuracy, we compared imputed and experimentally determined genotypes for 146 samples at 4 HLA classical loci. We found imputation accuracies of 82.9%, 81.8%, 94.8% and 86.6% for HLA - A , - B , -C and - DRB1 respectively (two-field resolution). Accuracies were improved when we included a subset of Quilombo individuals in the training set. We conclude that the 1000 Genomes data is a valuable resource for construction of training sets due to the diversity of ancestries and the potential for a large overlap of SNPs with the target population. We also show that tailoring training sets to features of the target population substantially enhances imputation accuracy.</description><subject>1000 Genomes</subject><subject>Admixed populations</subject><subject>Alleles</subject><subject>Allergy and Immunology</subject><subject>Brazil</subject><subject>Computational Biology - methods</subject><subject>Databases, Genetic</subject><subject>Genetics, Population</subject><subject>Genome, Human</subject><subject>Genome-Wide Association Study</subject><subject>HLA</subject><subject>HLA Antigens - genetics</subject><subject>Humans</subject><subject>Imputation</subject><subject>Polymorphism, Single Nucleotide</subject><subject>Relatedness</subject><subject>Reproducibility of Results</subject><subject>Software</subject><subject>Web 
Browser</subject><issn>0198-8859</issn><issn>1879-1166</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2016</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNqNkk1v1DAQhiMEokvhHyDkI5eEmTiOHQ5IqwpapJU4AGfLm0y6XmJ7sZOK_nu8bCkfF5AsjeR5550ZPy6K5wgVArav9tVucda5qgYUFWIF0DwoVqhkVyK27cNiBdipUinRnRVPUtoDgATZPC7O6laoGkCsCnO1WTPrDstsZhs8s56ZfAZnv9HADuGwTD8Sr9k6X6dEKTnyMwsjm3fEMHuyS_LBUWKDmU3WMMPmaKy3_polmp8Wj0YzJXp2F8-Lz-_efrq4KjcfLt9frDdlLwTOJdVyayQnIWBUbcNNJ5BzSZwbJIlcDNscheQtjGbgplctEWI9IIICUfPz4s3J97BsHQ19njKaSR-idSbe6mCs_jPj7U5fhxstWugUyGzw8s4ghq8LpVk7m3qaJuMpLEmjVKJuOP6fFAQ2SvAsbU7SPoaUIo33EyHoI0i91yeQ-ghSI-oMMpe9-H2b-6Kf5H6tS_lNbyxFnXpLvqfBRupnPQT7rw5_G_RThtab6QvdUtqHJfrMS6NOtQb98fiZjn8JRW4vMpHvAzbElg</recordid><startdate>20160301</startdate><enddate>20160301</enddate><creator>Nunes, Kelly</creator><creator>Zheng, Xiuwen</creator><creator>Torres, Margareth</creator><creator>Moraes, Maria Elisa</creator><creator>Piovezan, Bruno Z</creator><creator>Pontes, Gerlandia N</creator><creator>Kimura, Lilian</creator><creator>Carnavalli, Juliana E.P</creator><creator>Mingroni Netto, Regina C</creator><creator>Meyer, Diogo</creator><general>Elsevier Inc</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><scope>7T5</scope><scope>H94</scope><scope>5PM</scope><orcidid>https://orcid.org/0000-0002-1390-0708</orcidid></search><sort><creationdate>20160301</creationdate><title>HLA imputation in an admixed population: An assessment of the 1000 Genomes data as a training set</title><author>Nunes, Kelly ; Zheng, Xiuwen ; Torres, Margareth ; Moraes, Maria Elisa ; Piovezan, Bruno Z ; Pontes, Gerlandia N ; Kimura, Lilian ; Carnavalli, Juliana E.P ; Mingroni Netto, Regina C ; Meyer, 
Diogo</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c551t-e27ba73e550f8643a951337e33a1e7135db1e757360fad3ac86ee112d11080523</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2016</creationdate><topic>1000 Genomes</topic><topic>Admixed populations</topic><topic>Alleles</topic><topic>Allergy and Immunology</topic><topic>Brazil</topic><topic>Computational Biology - methods</topic><topic>Databases, Genetic</topic><topic>Genetics, Population</topic><topic>Genome, Human</topic><topic>Genome-Wide Association Study</topic><topic>HLA</topic><topic>HLA Antigens - genetics</topic><topic>Humans</topic><topic>Imputation</topic><topic>Polymorphism, Single Nucleotide</topic><topic>Relatedness</topic><topic>Reproducibility of Results</topic><topic>Software</topic><topic>Web Browser</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Nunes, Kelly</creatorcontrib><creatorcontrib>Zheng, Xiuwen</creatorcontrib><creatorcontrib>Torres, Margareth</creatorcontrib><creatorcontrib>Moraes, Maria Elisa</creatorcontrib><creatorcontrib>Piovezan, Bruno Z</creatorcontrib><creatorcontrib>Pontes, Gerlandia N</creatorcontrib><creatorcontrib>Kimura, Lilian</creatorcontrib><creatorcontrib>Carnavalli, Juliana E.P</creatorcontrib><creatorcontrib>Mingroni Netto, Regina C</creatorcontrib><creatorcontrib>Meyer, Diogo</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><collection>Immunology Abstracts</collection><collection>AIDS and Cancer Research Abstracts</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Human immunology</jtitle></facets><delivery><delcategory>Remote Search 
Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Nunes, Kelly</au><au>Zheng, Xiuwen</au><au>Torres, Margareth</au><au>Moraes, Maria Elisa</au><au>Piovezan, Bruno Z</au><au>Pontes, Gerlandia N</au><au>Kimura, Lilian</au><au>Carnavalli, Juliana E.P</au><au>Mingroni Netto, Regina C</au><au>Meyer, Diogo</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>HLA imputation in an admixed population: An assessment of the 1000 Genomes data as a training set</atitle><jtitle>Human immunology</jtitle><addtitle>Hum Immunol</addtitle><date>2016-03-01</date><risdate>2016</risdate><volume>77</volume><issue>3</issue><spage>307</spage><epage>312</epage><pages>307-312</pages><issn>0198-8859</issn><eissn>1879-1166</eissn><abstract>Abstract Methods to impute HLA alleles based on dense single nucleotide polymorphism (SNP) data provide a valuable resource to association studies and evolutionary investigation of the MHC region. The availability of appropriate training sets is critical to the accuracy of HLA imputation, and the inclusion of samples with various ancestries is an important pre-requisite in studies of admixed populations. We assess the accuracy of HLA imputation using 1000 Genomes Project data as a training set, applying it to a highly admixed Brazilian population, the Quilombos from the state of São Paulo. To assess accuracy, we compared imputed and experimentally determined genotypes for 146 samples at 4 HLA classical loci. We found imputation accuracies of 82.9%, 81.8%, 94.8% and 86.6% for HLA - A , - B , -C and - DRB1 respectively (two-field resolution). Accuracies were improved when we included a subset of Quilombo individuals in the training set. We conclude that the 1000 Genomes data is a valuable resource for construction of training sets due to the diversity of ancestries and the potential for a large overlap of SNPs with the target population. 
We also show that tailoring training sets to features of the target population substantially enhances imputation accuracy.</abstract><cop>United States</cop><pub>Elsevier Inc</pub><pmid>26582005</pmid><doi>10.1016/j.humimm.2015.11.004</doi><tpages>6</tpages><orcidid>https://orcid.org/0000-0002-1390-0708</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0198-8859
ispartof Human immunology, 2016-03, Vol.77 (3), p.307-312
issn 0198-8859
1879-1166
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_5609807
source MEDLINE; Access via ScienceDirect (Elsevier)
subjects 1000 Genomes
Admixed populations
Alleles
Allergy and Immunology
Brazil
Computational Biology - methods
Databases, Genetic
Genetics, Population
Genome, Human
Genome-Wide Association Study
HLA
HLA Antigens - genetics
Humans
Imputation
Polymorphism, Single Nucleotide
Relatedness
Reproducibility of Results
Software
Web Browser
title HLA imputation in an admixed population: An assessment of the 1000 Genomes data as a training set
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-19T16%3A29%3A40IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=HLA%20imputation%20in%20an%20admixed%20population:%20An%20assessment%20of%20the%201000%20Genomes%20data%20as%20a%20training%20set&rft.jtitle=Human%20immunology&rft.au=Nunes,%20Kelly&rft.date=2016-03-01&rft.volume=77&rft.issue=3&rft.spage=307&rft.epage=312&rft.pages=307-312&rft.issn=0198-8859&rft.eissn=1879-1166&rft_id=info:doi/10.1016/j.humimm.2015.11.004&rft_dat=%3Cproquest_pubme%3E1780514853%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1780514853&rft_id=info:pmid/26582005&rft_els_id=1_s2_0_S0198885915005571&rfr_iscdi=true
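
The abstract reports accuracy as the agreement between imputed and experimentally determined genotypes at each locus. The sketch below illustrates one common way such a figure can be computed, assuming accuracy means the proportion of typed alleles that imputation recovers, with genotypes compared as unordered pairs. The abstract does not state which definition the authors used, and the function names and sample genotypes here are hypothetical, not study data.

```python
from collections import Counter


def allele_matches(imputed, typed):
    """Count alleles shared by two unordered genotypes (0, 1 or 2)."""
    # Multiset intersection keeps each allele at its minimum count,
    # so a homozygous call only matches twice if both calls are homozygous.
    shared = Counter(imputed) & Counter(typed)
    return sum(shared.values())


def imputation_accuracy(imputed_calls, typed_calls):
    """Proportion of typed alleles recovered by imputation at one locus."""
    if len(imputed_calls) != len(typed_calls):
        raise ValueError("call lists must cover the same samples")
    if not typed_calls:
        raise ValueError("no samples to compare")
    matched = sum(
        allele_matches(imp, typ) for imp, typ in zip(imputed_calls, typed_calls)
    )
    # Each sample contributes two alleles to the denominator.
    return matched / (2 * len(typed_calls))


# Hypothetical two-field genotypes for three samples (illustration only).
typed = [("02:01", "24:02"), ("01:01", "01:01"), ("30:02", "68:01")]
imputed = [("24:02", "02:01"), ("01:01", "03:01"), ("30:01", "68:01")]

accuracy = imputation_accuracy(imputed, typed)
print(f"{accuracy:.3f}")
```

In this toy input the first sample matches on both alleles regardless of order, while the second and third each match on one allele, so four of six alleles agree. A stricter genotype-level definition (both alleles must match) would give a lower figure for the same calls.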