HLA imputation in an admixed population: An assessment of the 1000 Genomes data as a training set
Abstract Methods to impute HLA alleles based on dense single nucleotide polymorphism (SNP) data provide a valuable resource to association studies and evolutionary investigation of the MHC region. The availability of appropriate training sets is critical to the accuracy of HLA imputation, and the in...
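Published in: | Human immunology 2016-03, Vol.77 (3), p.307-312 |
---|---|
Main authors: | Nunes, Kelly; Zheng, Xiuwen; Torres, Margareth; Moraes, Maria Elisa; Piovezan, Bruno Z; Pontes, Gerlandia N; Kimura, Lilian; Carnavalli, Juliana E.P; Mingroni Netto, Regina C; Meyer, Diogo |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |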
container_end_page | 312 |
---|---|
container_issue | 3 |
container_start_page | 307 |
container_title | Human immunology |
container_volume | 77 |
creator | Nunes, Kelly; Zheng, Xiuwen; Torres, Margareth; Moraes, Maria Elisa; Piovezan, Bruno Z; Pontes, Gerlandia N; Kimura, Lilian; Carnavalli, Juliana E.P; Mingroni Netto, Regina C; Meyer, Diogo |
description | Methods to impute HLA alleles based on dense single nucleotide polymorphism (SNP) data provide a valuable resource to association studies and evolutionary investigation of the MHC region. The availability of appropriate training sets is critical to the accuracy of HLA imputation, and the inclusion of samples with various ancestries is an important pre-requisite in studies of admixed populations. We assess the accuracy of HLA imputation using 1000 Genomes Project data as a training set, applying it to a highly admixed Brazilian population, the Quilombos from the state of São Paulo. To assess accuracy, we compared imputed and experimentally determined genotypes for 146 samples at 4 HLA classical loci. We found imputation accuracies of 82.9%, 81.8%, 94.8% and 86.6% for HLA-A, -B, -C and -DRB1, respectively (two-field resolution). Accuracies were improved when we included a subset of Quilombo individuals in the training set. We conclude that the 1000 Genomes data is a valuable resource for construction of training sets due to the diversity of ancestries and the potential for a large overlap of SNPs with the target population. We also show that tailoring training sets to features of the target population substantially enhances imputation accuracy. |
doi_str_mv | 10.1016/j.humimm.2015.11.004 |
format | Article |
fullrecord | <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_5609807</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>1_s2_0_S0198885915005571</els_id><sourcerecordid>1780514853</sourcerecordid><originalsourceid>FETCH-LOGICAL-c551t-e27ba73e550f8643a951337e33a1e7135db1e757360fad3ac86ee112d11080523</originalsourceid><addsrcrecordid>eNqNkk1v1DAQhiMEokvhHyDkI5eEmTiOHQ5IqwpapJU4AGfLm0y6XmJ7sZOK_nu8bCkfF5AsjeR5550ZPy6K5wgVArav9tVucda5qgYUFWIF0DwoVqhkVyK27cNiBdipUinRnRVPUtoDgATZPC7O6laoGkCsCnO1WTPrDstsZhs8s56ZfAZnv9HADuGwTD8Sr9k6X6dEKTnyMwsjm3fEMHuyS_LBUWKDmU3WMMPmaKy3_polmp8Wj0YzJXp2F8-Lz-_efrq4KjcfLt9frDdlLwTOJdVyayQnIWBUbcNNJ5BzSZwbJIlcDNscheQtjGbgplctEWI9IIICUfPz4s3J97BsHQ19njKaSR-idSbe6mCs_jPj7U5fhxstWugUyGzw8s4ghq8LpVk7m3qaJuMpLEmjVKJuOP6fFAQ2SvAsbU7SPoaUIo33EyHoI0i91yeQ-ghSI-oMMpe9-H2b-6Kf5H6tS_lNbyxFnXpLvqfBRupnPQT7rw5_G_RThtab6QvdUtqHJfrMS6NOtQb98fiZjn8JRW4vMpHvAzbElg</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1780514853</pqid></control><display><type>article</type><title>HLA imputation in an admixed population: An assessment of the 1000 Genomes data as a training set</title><source>MEDLINE</source><source>Access via ScienceDirect (Elsevier)</source><creator>Nunes, Kelly ; Zheng, Xiuwen ; Torres, Margareth ; Moraes, Maria Elisa ; Piovezan, Bruno Z ; Pontes, Gerlandia N ; Kimura, Lilian ; Carnavalli, Juliana E.P ; Mingroni Netto, Regina C ; Meyer, Diogo</creator><creatorcontrib>Nunes, Kelly ; Zheng, Xiuwen ; Torres, Margareth ; Moraes, Maria Elisa ; Piovezan, Bruno Z ; Pontes, Gerlandia N ; Kimura, Lilian ; Carnavalli, Juliana E.P ; Mingroni Netto, Regina C ; Meyer, Diogo</creatorcontrib><description>Abstract Methods to impute HLA alleles based on dense single nucleotide polymorphism (SNP) data provide a valuable resource to association studies and evolutionary investigation of the MHC 
region. The availability of appropriate training sets is critical to the accuracy of HLA imputation, and the inclusion of samples with various ancestries is an important pre-requisite in studies of admixed populations. We assess the accuracy of HLA imputation using 1000 Genomes Project data as a training set, applying it to a highly admixed Brazilian population, the Quilombos from the state of São Paulo. To assess accuracy, we compared imputed and experimentally determined genotypes for 146 samples at 4 HLA classical loci. We found imputation accuracies of 82.9%, 81.8%, 94.8% and 86.6% for HLA - A , - B , -C and - DRB1 respectively (two-field resolution). Accuracies were improved when we included a subset of Quilombo individuals in the training set. We conclude that the 1000 Genomes data is a valuable resource for construction of training sets due to the diversity of ancestries and the potential for a large overlap of SNPs with the target population. We also show that tailoring training sets to features of the target population substantially enhances imputation accuracy.</description><identifier>ISSN: 0198-8859</identifier><identifier>EISSN: 1879-1166</identifier><identifier>DOI: 10.1016/j.humimm.2015.11.004</identifier><identifier>PMID: 26582005</identifier><language>eng</language><publisher>United States: Elsevier Inc</publisher><subject>1000 Genomes ; Admixed populations ; Alleles ; Allergy and Immunology ; Brazil ; Computational Biology - methods ; Databases, Genetic ; Genetics, Population ; Genome, Human ; Genome-Wide Association Study ; HLA ; HLA Antigens - genetics ; Humans ; Imputation ; Polymorphism, Single Nucleotide ; Relatedness ; Reproducibility of Results ; Software ; Web Browser</subject><ispartof>Human immunology, 2016-03, Vol.77 (3), p.307-312</ispartof><rights>2016 American Society for Histocompatibility and Immunogenetics</rights><rights>Copyright © 2016 American Society for Histocompatibility and Immunogenetics. Published by Elsevier Inc. 
All rights reserved.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c551t-e27ba73e550f8643a951337e33a1e7135db1e757360fad3ac86ee112d11080523</citedby><cites>FETCH-LOGICAL-c551t-e27ba73e550f8643a951337e33a1e7135db1e757360fad3ac86ee112d11080523</cites><orcidid>0000-0002-1390-0708</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://dx.doi.org/10.1016/j.humimm.2015.11.004$$EHTML$$P50$$Gelsevier$$H</linktohtml><link.rule.ids>230,314,780,784,885,3550,27924,27925,45995</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/26582005$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Nunes, Kelly</creatorcontrib><creatorcontrib>Zheng, Xiuwen</creatorcontrib><creatorcontrib>Torres, Margareth</creatorcontrib><creatorcontrib>Moraes, Maria Elisa</creatorcontrib><creatorcontrib>Piovezan, Bruno Z</creatorcontrib><creatorcontrib>Pontes, Gerlandia N</creatorcontrib><creatorcontrib>Kimura, Lilian</creatorcontrib><creatorcontrib>Carnavalli, Juliana E.P</creatorcontrib><creatorcontrib>Mingroni Netto, Regina C</creatorcontrib><creatorcontrib>Meyer, Diogo</creatorcontrib><title>HLA imputation in an admixed population: An assessment of the 1000 Genomes data as a training set</title><title>Human immunology</title><addtitle>Hum Immunol</addtitle><description>Abstract Methods to impute HLA alleles based on dense single nucleotide polymorphism (SNP) data provide a valuable resource to association studies and evolutionary investigation of the MHC region. The availability of appropriate training sets is critical to the accuracy of HLA imputation, and the inclusion of samples with various ancestries is an important pre-requisite in studies of admixed populations. 
We assess the accuracy of HLA imputation using 1000 Genomes Project data as a training set, applying it to a highly admixed Brazilian population, the Quilombos from the state of São Paulo. To assess accuracy, we compared imputed and experimentally determined genotypes for 146 samples at 4 HLA classical loci. We found imputation accuracies of 82.9%, 81.8%, 94.8% and 86.6% for HLA - A , - B , -C and - DRB1 respectively (two-field resolution). Accuracies were improved when we included a subset of Quilombo individuals in the training set. We conclude that the 1000 Genomes data is a valuable resource for construction of training sets due to the diversity of ancestries and the potential for a large overlap of SNPs with the target population. We also show that tailoring training sets to features of the target population substantially enhances imputation accuracy.</description><subject>1000 Genomes</subject><subject>Admixed populations</subject><subject>Alleles</subject><subject>Allergy and Immunology</subject><subject>Brazil</subject><subject>Computational Biology - methods</subject><subject>Databases, Genetic</subject><subject>Genetics, Population</subject><subject>Genome, Human</subject><subject>Genome-Wide Association Study</subject><subject>HLA</subject><subject>HLA Antigens - genetics</subject><subject>Humans</subject><subject>Imputation</subject><subject>Polymorphism, Single Nucleotide</subject><subject>Relatedness</subject><subject>Reproducibility of Results</subject><subject>Software</subject><subject>Web 
Browser</subject><issn>0198-8859</issn><issn>1879-1166</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2016</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNqNkk1v1DAQhiMEokvhHyDkI5eEmTiOHQ5IqwpapJU4AGfLm0y6XmJ7sZOK_nu8bCkfF5AsjeR5550ZPy6K5wgVArav9tVucda5qgYUFWIF0DwoVqhkVyK27cNiBdipUinRnRVPUtoDgATZPC7O6laoGkCsCnO1WTPrDstsZhs8s56ZfAZnv9HADuGwTD8Sr9k6X6dEKTnyMwsjm3fEMHuyS_LBUWKDmU3WMMPmaKy3_polmp8Wj0YzJXp2F8-Lz-_efrq4KjcfLt9frDdlLwTOJdVyayQnIWBUbcNNJ5BzSZwbJIlcDNscheQtjGbgplctEWI9IIICUfPz4s3J97BsHQ19njKaSR-idSbe6mCs_jPj7U5fhxstWugUyGzw8s4ghq8LpVk7m3qaJuMpLEmjVKJuOP6fFAQ2SvAsbU7SPoaUIo33EyHoI0i91yeQ-ghSI-oMMpe9-H2b-6Kf5H6tS_lNbyxFnXpLvqfBRupnPQT7rw5_G_RThtab6QvdUtqHJfrMS6NOtQb98fiZjn8JRW4vMpHvAzbElg</recordid><startdate>20160301</startdate><enddate>20160301</enddate><creator>Nunes, Kelly</creator><creator>Zheng, Xiuwen</creator><creator>Torres, Margareth</creator><creator>Moraes, Maria Elisa</creator><creator>Piovezan, Bruno Z</creator><creator>Pontes, Gerlandia N</creator><creator>Kimura, Lilian</creator><creator>Carnavalli, Juliana E.P</creator><creator>Mingroni Netto, Regina C</creator><creator>Meyer, Diogo</creator><general>Elsevier Inc</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><scope>7T5</scope><scope>H94</scope><scope>5PM</scope><orcidid>https://orcid.org/0000-0002-1390-0708</orcidid></search><sort><creationdate>20160301</creationdate><title>HLA imputation in an admixed population: An assessment of the 1000 Genomes data as a training set</title><author>Nunes, Kelly ; Zheng, Xiuwen ; Torres, Margareth ; Moraes, Maria Elisa ; Piovezan, Bruno Z ; Pontes, Gerlandia N ; Kimura, Lilian ; Carnavalli, Juliana E.P ; Mingroni Netto, Regina C ; Meyer, 
Diogo</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c551t-e27ba73e550f8643a951337e33a1e7135db1e757360fad3ac86ee112d11080523</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2016</creationdate><topic>1000 Genomes</topic><topic>Admixed populations</topic><topic>Alleles</topic><topic>Allergy and Immunology</topic><topic>Brazil</topic><topic>Computational Biology - methods</topic><topic>Databases, Genetic</topic><topic>Genetics, Population</topic><topic>Genome, Human</topic><topic>Genome-Wide Association Study</topic><topic>HLA</topic><topic>HLA Antigens - genetics</topic><topic>Humans</topic><topic>Imputation</topic><topic>Polymorphism, Single Nucleotide</topic><topic>Relatedness</topic><topic>Reproducibility of Results</topic><topic>Software</topic><topic>Web Browser</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Nunes, Kelly</creatorcontrib><creatorcontrib>Zheng, Xiuwen</creatorcontrib><creatorcontrib>Torres, Margareth</creatorcontrib><creatorcontrib>Moraes, Maria Elisa</creatorcontrib><creatorcontrib>Piovezan, Bruno Z</creatorcontrib><creatorcontrib>Pontes, Gerlandia N</creatorcontrib><creatorcontrib>Kimura, Lilian</creatorcontrib><creatorcontrib>Carnavalli, Juliana E.P</creatorcontrib><creatorcontrib>Mingroni Netto, Regina C</creatorcontrib><creatorcontrib>Meyer, Diogo</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><collection>Immunology Abstracts</collection><collection>AIDS and Cancer Research Abstracts</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Human immunology</jtitle></facets><delivery><delcategory>Remote Search 
Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Nunes, Kelly</au><au>Zheng, Xiuwen</au><au>Torres, Margareth</au><au>Moraes, Maria Elisa</au><au>Piovezan, Bruno Z</au><au>Pontes, Gerlandia N</au><au>Kimura, Lilian</au><au>Carnavalli, Juliana E.P</au><au>Mingroni Netto, Regina C</au><au>Meyer, Diogo</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>HLA imputation in an admixed population: An assessment of the 1000 Genomes data as a training set</atitle><jtitle>Human immunology</jtitle><addtitle>Hum Immunol</addtitle><date>2016-03-01</date><risdate>2016</risdate><volume>77</volume><issue>3</issue><spage>307</spage><epage>312</epage><pages>307-312</pages><issn>0198-8859</issn><eissn>1879-1166</eissn><abstract>Abstract Methods to impute HLA alleles based on dense single nucleotide polymorphism (SNP) data provide a valuable resource to association studies and evolutionary investigation of the MHC region. The availability of appropriate training sets is critical to the accuracy of HLA imputation, and the inclusion of samples with various ancestries is an important pre-requisite in studies of admixed populations. We assess the accuracy of HLA imputation using 1000 Genomes Project data as a training set, applying it to a highly admixed Brazilian population, the Quilombos from the state of São Paulo. To assess accuracy, we compared imputed and experimentally determined genotypes for 146 samples at 4 HLA classical loci. We found imputation accuracies of 82.9%, 81.8%, 94.8% and 86.6% for HLA - A , - B , -C and - DRB1 respectively (two-field resolution). Accuracies were improved when we included a subset of Quilombo individuals in the training set. We conclude that the 1000 Genomes data is a valuable resource for construction of training sets due to the diversity of ancestries and the potential for a large overlap of SNPs with the target population. 
We also show that tailoring training sets to features of the target population substantially enhances imputation accuracy.</abstract><cop>United States</cop><pub>Elsevier Inc</pub><pmid>26582005</pmid><doi>10.1016/j.humimm.2015.11.004</doi><tpages>6</tpages><orcidid>https://orcid.org/0000-0002-1390-0708</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0198-8859 |
ispartof | Human immunology, 2016-03, Vol.77 (3), p.307-312 |
issn | 0198-8859; 1879-1166 |
language | eng |
recordid | cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_5609807 |
source | MEDLINE; Access via ScienceDirect (Elsevier) |
subjects | 1000 Genomes; Admixed populations; Alleles; Allergy and Immunology; Brazil; Computational Biology - methods; Databases, Genetic; Genetics, Population; Genome, Human; Genome-Wide Association Study; HLA; HLA Antigens - genetics; Humans; Imputation; Polymorphism, Single Nucleotide; Relatedness; Reproducibility of Results; Software; Web Browser |
title | HLA imputation in an admixed population: An assessment of the 1000 Genomes data as a training set |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-19T16%3A29%3A40IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=HLA%20imputation%20in%20an%20admixed%20population:%20An%20assessment%20of%20the%201000%20Genomes%20data%20as%20a%20training%20set&rft.jtitle=Human%20immunology&rft.au=Nunes,%20Kelly&rft.date=2016-03-01&rft.volume=77&rft.issue=3&rft.spage=307&rft.epage=312&rft.pages=307-312&rft.issn=0198-8859&rft.eissn=1879-1166&rft_id=info:doi/10.1016/j.humimm.2015.11.004&rft_dat=%3Cproquest_pubme%3E1780514853%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1780514853&rft_id=info:pmid/26582005&rft_els_id=1_s2_0_S0198885915005571&rfr_iscdi=true |
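
The abstract reports per-locus accuracies obtained by comparing imputed and experimentally determined genotypes at two-field resolution. The record does not say how accuracy was scored, so the following is only a minimal sketch under one common assumption: accuracy is the fraction of alleles (two per individual) that match between the imputed and the typed genotype, ignoring allele order. The function names and the toy genotypes are hypothetical and are not taken from the study.

```python
from collections import Counter


def two_field(allele: str) -> str:
    """Truncate an HLA allele name such as '02:01:01' to two fields ('02:01')."""
    return ":".join(allele.split(":")[:2])


def matching_alleles(imputed, typed) -> int:
    """Count alleles shared by two unordered genotypes (0, 1 or 2)."""
    imputed_counts = Counter(two_field(a) for a in imputed)
    typed_counts = Counter(two_field(a) for a in typed)
    # Multiset intersection keeps the minimum count of each allele.
    return sum((imputed_counts & typed_counts).values())


def allele_accuracy(imputed_genotypes, typed_genotypes) -> float:
    """Fraction of alleles called correctly across all individuals (assumed metric)."""
    total = 2 * len(typed_genotypes)
    correct = sum(
        matching_alleles(imp, typ)
        for imp, typ in zip(imputed_genotypes, typed_genotypes)
    )
    return correct / total


# Hypothetical toy data for one locus; not results from the paper.
imputed = [("02:01", "24:02"), ("01:01", "03:01"), ("02:01", "02:01")]
typed = [("24:02:01", "02:01:01"), ("01:01", "30:01"), ("02:01", "68:01")]

print(f"allele-level accuracy: {allele_accuracy(imputed, typed):.3f}")
```

Counting matches as a multiset intersection avoids crediting a homozygous call twice when the typed genotype carries that allele only once.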