The limits of individual identification from sample allele frequencies: theory and statistical analysis

It was shown recently using experimental data that it is possible under certain conditions to determine whether a person with known genotypes at a number of markers was part of a sample from which only allele frequencies are known. Using population genetic and statistical theory, we show that the po...

Bibliographic details
Published in: PLoS genetics 2009-10, Vol.5 (10), p.e1000628-e1000628
Main authors: Visscher, Peter M; Hill, William G
Format: Article
Language: eng
Online access: Full text
container_end_page e1000628
container_issue 10
container_start_page e1000628
container_title PLoS genetics
container_volume 5
creator Visscher, Peter M
Hill, William G
description It was shown recently using experimental data that it is possible under certain conditions to determine whether a person with known genotypes at a number of markers was part of a sample from which only allele frequencies are known. Using population genetic and statistical theory, we show that the power of such identification is, approximately, proportional to the number of independent SNPs divided by the size of the sample from which the allele frequencies are available. We quantify the limits of identification and propose likelihood and regression analysis methods for the analysis of data. We show that these methods have similar statistical properties and have more desirable properties, in terms of type-I error rate and statistical power, than test statistics suggested in the literature.
doi_str_mv 10.1371/journal.pgen.1000628
format Article
fullrecord <record><control><sourceid>proquest_plos_</sourceid><recordid>TN_cdi_plos_journals_1313511744</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><doaj_id>oai_doaj_org_article_f9da9c5668d146f490cf96c23cd785b5</doaj_id><sourcerecordid>734070225</sourcerecordid><originalsourceid>FETCH-LOGICAL-c563t-543470d546bc4e30c74160016f2cc2526b2f0585f4f88514a8dc2415ffea69903</originalsourceid><addsrcrecordid>eNpVUstuUzEUvEIgWgp_gMA7Vgl-P1ggVRWPSpXYlLXl-JE48rWDfW-l_D0OuUC7OvbxmTnj0QzDWwTXiAj0cV_mmk1aH7Y-rxGEkGP5bLhEjJGVoJA-f3S-GF61toeQMKnEy-ECKaEkJepy2N7vPEhxjFMDJYCYXXyIbjYJROfzFEO0Zoolg1DLCJoZD8kDk5LvJVT_a_bZRt8-gWnnSz0Ckx1oU4e0qSNTv5t0bLG9Hl4Ek5p_s9Sr4efXL_c331d3P77d3lzfrSzjZFoxSqiAjlG-sdQTaAVFHELEA7YWM8w3OEAmWaBBSoaokc5iilgI3nClILka3p95D6k0vXjUNCKIMIQEpX3i9jzhitnrQ42jqUddTNR_GqVutaldfPI6KGdUF8alQ5QHqqANiltMrBOSbVjn-rxsmzejd7Y7Vk16Qvr0Jced3pYHjQXlBKlO8GEhqKV72SY9xmZ9Sib7MjctCIUCYnxaRc-TtpbWqg__tiCoT4H4-1l9CoReAtFh7x4r_A9aEkB-A8L7tgs</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>734070225</pqid></control><display><type>article</type><title>The limits of individual identification from sample allele frequencies: theory and statistical analysis</title><source>MEDLINE</source><source>DOAJ Directory of Open Access Journals</source><source>Public Library of Science (PLoS)</source><source>EZB-FREE-00999 freely available EZB journals</source><source>PubMed Central</source><creator>Visscher, Peter M ; Hill, William G</creator><contributor>Gibson, Greg</contributor><creatorcontrib>Visscher, Peter M ; Hill, William G ; Gibson, Greg</creatorcontrib><description>It was shown recently using experimental data that it is possible under certain conditions to determine whether a person with known genotypes at a number of markers was part of a sample from which only allele frequencies are known. 
Using population genetic and statistical theory, we show that the power of such identification is, approximately, proportional to the number of independent SNPs divided by the size of the sample from which the allele frequencies are available. We quantify the limits of identification and propose likelihood and regression analysis methods for the analysis of data. We show that these methods have similar statistical properties and have more desirable properties, in terms of type-I error rate and statistical power, than test statistics suggested in the literature.</description><identifier>ISSN: 1553-7404</identifier><identifier>ISSN: 1553-7390</identifier><identifier>EISSN: 1553-7404</identifier><identifier>DOI: 10.1371/journal.pgen.1000628</identifier><identifier>PMID: 19798439</identifier><language>eng</language><publisher>United States: Public Library of Science</publisher><subject>Deoxyribonucleic acid ; DNA ; Gene Frequency ; Genetic testing ; Genetics ; Genetics and Genomics/Population Genetics ; Genetics, Population ; Genomes ; Genotype ; Humans ; Models, Genetic ; Models, Statistical ; Polymorphism, Single Nucleotide ; Population ; Statistics ; Studies ; Theory</subject><ispartof>PLoS genetics, 2009-10, Vol.5 (10), p.e1000628-e1000628</ispartof><rights>Visscher, Hill. 2009</rights><rights>2009 Visscher, Hill. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited: Visscher PM, Hill WG (2009) The Limits of Individual Identification from Sample Allele Frequencies: Theory and Statistical Analysis. PLoS Genet 5(10): e1000628. 
doi:10.1371/journal.pgen.1000628</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c563t-543470d546bc4e30c74160016f2cc2526b2f0585f4f88514a8dc2415ffea69903</citedby><cites>FETCH-LOGICAL-c563t-543470d546bc4e30c74160016f2cc2526b2f0585f4f88514a8dc2415ffea69903</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC2746319/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC2746319/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,723,776,780,860,881,2095,2914,23846,27903,27904,53770,53772,79347,79348</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/19798439$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><contributor>Gibson, Greg</contributor><creatorcontrib>Visscher, Peter M</creatorcontrib><creatorcontrib>Hill, William G</creatorcontrib><title>The limits of individual identification from sample allele frequencies: theory and statistical analysis</title><title>PLoS genetics</title><addtitle>PLoS Genet</addtitle><description>It was shown recently using experimental data that it is possible under certain conditions to determine whether a person with known genotypes at a number of markers was part of a sample from which only allele frequencies are known. Using population genetic and statistical theory, we show that the power of such identification is, approximately, proportional to the number of independent SNPs divided by the size of the sample from which the allele frequencies are available. We quantify the limits of identification and propose likelihood and regression analysis methods for the analysis of data. 
We show that these methods have similar statistical properties and have more desirable properties, in terms of type-I error rate and statistical power, than test statistics suggested in the literature.</description><subject>Deoxyribonucleic acid</subject><subject>DNA</subject><subject>Gene Frequency</subject><subject>Genetic testing</subject><subject>Genetics</subject><subject>Genetics and Genomics/Population Genetics</subject><subject>Genetics, Population</subject><subject>Genomes</subject><subject>Genotype</subject><subject>Humans</subject><subject>Models, Genetic</subject><subject>Models, Statistical</subject><subject>Polymorphism, Single Nucleotide</subject><subject>Population</subject><subject>Statistics</subject><subject>Studies</subject><subject>Theory</subject><issn>1553-7404</issn><issn>1553-7390</issn><issn>1553-7404</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2009</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><sourceid>DOA</sourceid><recordid>eNpVUstuUzEUvEIgWgp_gMA7Vgl-P1ggVRWPSpXYlLXl-JE48rWDfW-l_D0OuUC7OvbxmTnj0QzDWwTXiAj0cV_mmk1aH7Y-rxGEkGP5bLhEjJGVoJA-f3S-GF61toeQMKnEy-ECKaEkJepy2N7vPEhxjFMDJYCYXXyIbjYJROfzFEO0Zoolg1DLCJoZD8kDk5LvJVT_a_bZRt8-gWnnSz0Ckx1oU4e0qSNTv5t0bLG9Hl4Ek5p_s9Sr4efXL_c331d3P77d3lzfrSzjZFoxSqiAjlG-sdQTaAVFHELEA7YWM8w3OEAmWaBBSoaokc5iilgI3nClILka3p95D6k0vXjUNCKIMIQEpX3i9jzhitnrQ42jqUddTNR_GqVutaldfPI6KGdUF8alQ5QHqqANiltMrBOSbVjn-rxsmzejd7Y7Vk16Qvr0Jced3pYHjQXlBKlO8GEhqKV72SY9xmZ9Sib7MjctCIUCYnxaRc-TtpbWqg__tiCoT4H4-1l9CoReAtFh7x4r_A9aEkB-A8L7tgs</recordid><startdate>20091001</startdate><enddate>20091001</enddate><creator>Visscher, Peter M</creator><creator>Hill, William G</creator><general>Public Library of Science</general><general>Public Library of Science 
(PLoS)</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><scope>5PM</scope><scope>DOA</scope></search><sort><creationdate>20091001</creationdate><title>The limits of individual identification from sample allele frequencies: theory and statistical analysis</title><author>Visscher, Peter M ; Hill, William G</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c563t-543470d546bc4e30c74160016f2cc2526b2f0585f4f88514a8dc2415ffea69903</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2009</creationdate><topic>Deoxyribonucleic acid</topic><topic>DNA</topic><topic>Gene Frequency</topic><topic>Genetic testing</topic><topic>Genetics</topic><topic>Genetics and Genomics/Population Genetics</topic><topic>Genetics, Population</topic><topic>Genomes</topic><topic>Genotype</topic><topic>Humans</topic><topic>Models, Genetic</topic><topic>Models, Statistical</topic><topic>Polymorphism, Single Nucleotide</topic><topic>Population</topic><topic>Statistics</topic><topic>Studies</topic><topic>Theory</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Visscher, Peter M</creatorcontrib><creatorcontrib>Hill, William G</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><collection>DOAJ Directory of Open Access Journals</collection><jtitle>PLoS genetics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Visscher, Peter M</au><au>Hill, William G</au><au>Gibson, 
Greg</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>The limits of individual identification from sample allele frequencies: theory and statistical analysis</atitle><jtitle>PLoS genetics</jtitle><addtitle>PLoS Genet</addtitle><date>2009-10-01</date><risdate>2009</risdate><volume>5</volume><issue>10</issue><spage>e1000628</spage><epage>e1000628</epage><pages>e1000628-e1000628</pages><issn>1553-7404</issn><issn>1553-7390</issn><eissn>1553-7404</eissn><abstract>It was shown recently using experimental data that it is possible under certain conditions to determine whether a person with known genotypes at a number of markers was part of a sample from which only allele frequencies are known. Using population genetic and statistical theory, we show that the power of such identification is, approximately, proportional to the number of independent SNPs divided by the size of the sample from which the allele frequencies are available. We quantify the limits of identification and propose likelihood and regression analysis methods for the analysis of data. We show that these methods have similar statistical properties and have more desirable properties, in terms of type-I error rate and statistical power, than test statistics suggested in the literature.</abstract><cop>United States</cop><pub>Public Library of Science</pub><pmid>19798439</pmid><doi>10.1371/journal.pgen.1000628</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1553-7404
ispartof PLoS genetics, 2009-10, Vol.5 (10), p.e1000628-e1000628
issn 1553-7404
1553-7390
1553-7404
language eng
recordid cdi_plos_journals_1313511744
source MEDLINE; DOAJ Directory of Open Access Journals; Public Library of Science (PLoS); EZB-FREE-00999 freely available EZB journals; PubMed Central
subjects Deoxyribonucleic acid
DNA
Gene Frequency
Genetic testing
Genetics
Genetics and Genomics/Population Genetics
Genetics, Population
Genomes
Genotype
Humans
Models, Genetic
Models, Statistical
Polymorphism, Single Nucleotide
Population
Statistics
Studies
Theory
title The limits of individual identification from sample allele frequencies: theory and statistical analysis
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-21T23%3A41%3A36IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_plos_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=The%20limits%20of%20individual%20identification%20from%20sample%20allele%20frequencies:%20theory%20and%20statistical%20analysis&rft.jtitle=PLoS%20genetics&rft.au=Visscher,%20Peter%20M&rft.date=2009-10-01&rft.volume=5&rft.issue=10&rft.spage=e1000628&rft.epage=e1000628&rft.pages=e1000628-e1000628&rft.issn=1553-7404&rft.eissn=1553-7404&rft_id=info:doi/10.1371/journal.pgen.1000628&rft_dat=%3Cproquest_plos_%3E734070225%3C/proquest_plos_%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=734070225&rft_id=info:pmid/19798439&rft_doaj_id=oai_doaj_org_article_f9da9c5668d146f490cf96c23cd785b5&rfr_iscdi=true