The limits of individual identification from sample allele frequencies: theory and statistical analysis
It was shown recently using experimental data that it is possible under certain conditions to determine whether a person with known genotypes at a number of markers was part of a sample from which only allele frequencies are known. Using population genetic and statistical theory, we show that the po...
Published in: | PLoS genetics 2009-10, Vol.5 (10), p.e1000628-e1000628 |
---|---|
Main authors: | Visscher, Peter M ; Hill, William G |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | e1000628 |
---|---|
container_issue | 10 |
container_start_page | e1000628 |
container_title | PLoS genetics |
container_volume | 5 |
creator | Visscher, Peter M ; Hill, William G |
description | It was shown recently using experimental data that it is possible under certain conditions to determine whether a person with known genotypes at a number of markers was part of a sample from which only allele frequencies are known. Using population genetic and statistical theory, we show that the power of such identification is, approximately, proportional to the number of independent SNPs divided by the size of the sample from which the allele frequencies are available. We quantify the limits of identification and propose likelihood and regression analysis methods for the analysis of data. We show that these methods have similar statistical properties and have more desirable properties, in terms of type-I error rate and statistical power, than test statistics suggested in the literature. |
doi_str_mv | 10.1371/journal.pgen.1000628 |
format | Article |
fullrecord | <record><control><sourceid>proquest_plos_</sourceid><recordid>TN_cdi_plos_journals_1313511744</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><doaj_id>oai_doaj_org_article_f9da9c5668d146f490cf96c23cd785b5</doaj_id><sourcerecordid>734070225</sourcerecordid><originalsourceid>FETCH-LOGICAL-c563t-543470d546bc4e30c74160016f2cc2526b2f0585f4f88514a8dc2415ffea69903</originalsourceid><addsrcrecordid>eNpVUstuUzEUvEIgWgp_gMA7Vgl-P1ggVRWPSpXYlLXl-JE48rWDfW-l_D0OuUC7OvbxmTnj0QzDWwTXiAj0cV_mmk1aH7Y-rxGEkGP5bLhEjJGVoJA-f3S-GF61toeQMKnEy-ECKaEkJepy2N7vPEhxjFMDJYCYXXyIbjYJROfzFEO0Zoolg1DLCJoZD8kDk5LvJVT_a_bZRt8-gWnnSz0Ckx1oU4e0qSNTv5t0bLG9Hl4Ek5p_s9Sr4efXL_c331d3P77d3lzfrSzjZFoxSqiAjlG-sdQTaAVFHELEA7YWM8w3OEAmWaBBSoaokc5iilgI3nClILka3p95D6k0vXjUNCKIMIQEpX3i9jzhitnrQ42jqUddTNR_GqVutaldfPI6KGdUF8alQ5QHqqANiltMrBOSbVjn-rxsmzejd7Y7Vk16Qvr0Jced3pYHjQXlBKlO8GEhqKV72SY9xmZ9Sib7MjctCIUCYnxaRc-TtpbWqg__tiCoT4H4-1l9CoReAtFh7x4r_A9aEkB-A8L7tgs</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>734070225</pqid></control><display><type>article</type><title>The limits of individual identification from sample allele frequencies: theory and statistical analysis</title><source>MEDLINE</source><source>DOAJ Directory of Open Access Journals</source><source>Public Library of Science (PLoS)</source><source>EZB-FREE-00999 freely available EZB journals</source><source>PubMed Central</source><creator>Visscher, Peter M ; Hill, William G</creator><contributor>Gibson, Greg</contributor><creatorcontrib>Visscher, Peter M ; Hill, William G ; Gibson, Greg</creatorcontrib><description>It was shown recently using experimental data that it is possible under certain conditions to determine whether a person with known genotypes at a number of markers was part of a sample from which only allele frequencies are known. 
Using population genetic and statistical theory, we show that the power of such identification is, approximately, proportional to the number of independent SNPs divided by the size of the sample from which the allele frequencies are available. We quantify the limits of identification and propose likelihood and regression analysis methods for the analysis of data. We show that these methods have similar statistical properties and have more desirable properties, in terms of type-I error rate and statistical power, than test statistics suggested in the literature.</description><identifier>ISSN: 1553-7404</identifier><identifier>ISSN: 1553-7390</identifier><identifier>EISSN: 1553-7404</identifier><identifier>DOI: 10.1371/journal.pgen.1000628</identifier><identifier>PMID: 19798439</identifier><language>eng</language><publisher>United States: Public Library of Science</publisher><subject>Deoxyribonucleic acid ; DNA ; Gene Frequency ; Genetic testing ; Genetics ; Genetics and Genomics/Population Genetics ; Genetics, Population ; Genomes ; Genotype ; Humans ; Models, Genetic ; Models, Statistical ; Polymorphism, Single Nucleotide ; Population ; Statistics ; Studies ; Theory</subject><ispartof>PLoS genetics, 2009-10, Vol.5 (10), p.e1000628-e1000628</ispartof><rights>Visscher, Hill. 2009</rights><rights>2009 Visscher, Hill. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited: Visscher PM, Hill WG (2009) The Limits of Individual Identification from Sample Allele Frequencies: Theory and Statistical Analysis. PLoS Genet 5(10): e1000628. 
doi:10.1371/journal.pgen.1000628</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c563t-543470d546bc4e30c74160016f2cc2526b2f0585f4f88514a8dc2415ffea69903</citedby><cites>FETCH-LOGICAL-c563t-543470d546bc4e30c74160016f2cc2526b2f0585f4f88514a8dc2415ffea69903</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC2746319/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC2746319/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,723,776,780,860,881,2095,2914,23846,27903,27904,53770,53772,79347,79348</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/19798439$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><contributor>Gibson, Greg</contributor><creatorcontrib>Visscher, Peter M</creatorcontrib><creatorcontrib>Hill, William G</creatorcontrib><title>The limits of individual identification from sample allele frequencies: theory and statistical analysis</title><title>PLoS genetics</title><addtitle>PLoS Genet</addtitle><description>It was shown recently using experimental data that it is possible under certain conditions to determine whether a person with known genotypes at a number of markers was part of a sample from which only allele frequencies are known. Using population genetic and statistical theory, we show that the power of such identification is, approximately, proportional to the number of independent SNPs divided by the size of the sample from which the allele frequencies are available. We quantify the limits of identification and propose likelihood and regression analysis methods for the analysis of data. 
We show that these methods have similar statistical properties and have more desirable properties, in terms of type-I error rate and statistical power, than test statistics suggested in the literature.</description><subject>Deoxyribonucleic acid</subject><subject>DNA</subject><subject>Gene Frequency</subject><subject>Genetic testing</subject><subject>Genetics</subject><subject>Genetics and Genomics/Population Genetics</subject><subject>Genetics, Population</subject><subject>Genomes</subject><subject>Genotype</subject><subject>Humans</subject><subject>Models, Genetic</subject><subject>Models, Statistical</subject><subject>Polymorphism, Single Nucleotide</subject><subject>Population</subject><subject>Statistics</subject><subject>Studies</subject><subject>Theory</subject><issn>1553-7404</issn><issn>1553-7390</issn><issn>1553-7404</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2009</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><sourceid>DOA</sourceid><recordid>eNpVUstuUzEUvEIgWgp_gMA7Vgl-P1ggVRWPSpXYlLXl-JE48rWDfW-l_D0OuUC7OvbxmTnj0QzDWwTXiAj0cV_mmk1aH7Y-rxGEkGP5bLhEjJGVoJA-f3S-GF61toeQMKnEy-ECKaEkJepy2N7vPEhxjFMDJYCYXXyIbjYJROfzFEO0Zoolg1DLCJoZD8kDk5LvJVT_a_bZRt8-gWnnSz0Ckx1oU4e0qSNTv5t0bLG9Hl4Ek5p_s9Sr4efXL_c331d3P77d3lzfrSzjZFoxSqiAjlG-sdQTaAVFHELEA7YWM8w3OEAmWaBBSoaokc5iilgI3nClILka3p95D6k0vXjUNCKIMIQEpX3i9jzhitnrQ42jqUddTNR_GqVutaldfPI6KGdUF8alQ5QHqqANiltMrBOSbVjn-rxsmzejd7Y7Vk16Qvr0Jced3pYHjQXlBKlO8GEhqKV72SY9xmZ9Sib7MjctCIUCYnxaRc-TtpbWqg__tiCoT4H4-1l9CoReAtFh7x4r_A9aEkB-A8L7tgs</recordid><startdate>20091001</startdate><enddate>20091001</enddate><creator>Visscher, Peter M</creator><creator>Hill, William G</creator><general>Public Library of Science</general><general>Public Library of Science 
(PLoS)</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><scope>5PM</scope><scope>DOA</scope></search><sort><creationdate>20091001</creationdate><title>The limits of individual identification from sample allele frequencies: theory and statistical analysis</title><author>Visscher, Peter M ; Hill, William G</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c563t-543470d546bc4e30c74160016f2cc2526b2f0585f4f88514a8dc2415ffea69903</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2009</creationdate><topic>Deoxyribonucleic acid</topic><topic>DNA</topic><topic>Gene Frequency</topic><topic>Genetic testing</topic><topic>Genetics</topic><topic>Genetics and Genomics/Population Genetics</topic><topic>Genetics, Population</topic><topic>Genomes</topic><topic>Genotype</topic><topic>Humans</topic><topic>Models, Genetic</topic><topic>Models, Statistical</topic><topic>Polymorphism, Single Nucleotide</topic><topic>Population</topic><topic>Statistics</topic><topic>Studies</topic><topic>Theory</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Visscher, Peter M</creatorcontrib><creatorcontrib>Hill, William G</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><collection>DOAJ Directory of Open Access Journals</collection><jtitle>PLoS genetics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Visscher, Peter M</au><au>Hill, William G</au><au>Gibson, 
Greg</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>The limits of individual identification from sample allele frequencies: theory and statistical analysis</atitle><jtitle>PLoS genetics</jtitle><addtitle>PLoS Genet</addtitle><date>2009-10-01</date><risdate>2009</risdate><volume>5</volume><issue>10</issue><spage>e1000628</spage><epage>e1000628</epage><pages>e1000628-e1000628</pages><issn>1553-7404</issn><issn>1553-7390</issn><eissn>1553-7404</eissn><abstract>It was shown recently using experimental data that it is possible under certain conditions to determine whether a person with known genotypes at a number of markers was part of a sample from which only allele frequencies are known. Using population genetic and statistical theory, we show that the power of such identification is, approximately, proportional to the number of independent SNPs divided by the size of the sample from which the allele frequencies are available. We quantify the limits of identification and propose likelihood and regression analysis methods for the analysis of data. We show that these methods have similar statistical properties and have more desirable properties, in terms of type-I error rate and statistical power, than test statistics suggested in the literature.</abstract><cop>United States</cop><pub>Public Library of Science</pub><pmid>19798439</pmid><doi>10.1371/journal.pgen.1000628</doi><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1553-7404 |
ispartof | PLoS genetics, 2009-10, Vol.5 (10), p.e1000628-e1000628 |
issn | 1553-7404 ; 1553-7390 ; 1553-7404 |
language | eng |
recordid | cdi_plos_journals_1313511744 |
source | MEDLINE; DOAJ Directory of Open Access Journals; Public Library of Science (PLoS); EZB-FREE-00999 freely available EZB journals; PubMed Central |
subjects | Deoxyribonucleic acid ; DNA ; Gene Frequency ; Genetic testing ; Genetics ; Genetics and Genomics/Population Genetics ; Genetics, Population ; Genomes ; Genotype ; Humans ; Models, Genetic ; Models, Statistical ; Polymorphism, Single Nucleotide ; Population ; Statistics ; Studies ; Theory |
title | The limits of individual identification from sample allele frequencies: theory and statistical analysis |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-21T23%3A41%3A36IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_plos_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=The%20limits%20of%20individual%20identification%20from%20sample%20allele%20frequencies:%20theory%20and%20statistical%20analysis&rft.jtitle=PLoS%20genetics&rft.au=Visscher,%20Peter%20M&rft.date=2009-10-01&rft.volume=5&rft.issue=10&rft.spage=e1000628&rft.epage=e1000628&rft.pages=e1000628-e1000628&rft.issn=1553-7404&rft.eissn=1553-7404&rft_id=info:doi/10.1371/journal.pgen.1000628&rft_dat=%3Cproquest_plos_%3E734070225%3C/proquest_plos_%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=734070225&rft_id=info:pmid/19798439&rft_doaj_id=oai_doaj_org_article_f9da9c5668d146f490cf96c23cd785b5&rfr_iscdi=true |
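
The description above states that the power of identification is approximately proportional to the number of independent SNPs divided by the size of the sample. The following is a minimal simulation sketch of how such a ratio can arise. It is not the authors' likelihood or regression method. It assumes, purely for illustration, that the population allele frequency `p` is known exactly and is the same at every SNP, that SNPs are independent, and that a simple product statistic is used; the names `simulate_z` and `mean_z` are hypothetical. Under these assumptions the expected z-score of a true sample member works out to roughly sqrt(m / n), while that of a non-member is roughly zero.

```python
import math
import random


def simulate_z(m, n, member, rng, p=0.3):
    """Return an illustrative z-score for one simulated individual.

    m: number of independent SNPs (assumption: no linkage disequilibrium)
    n: number of diploid people contributing to the sample frequencies
    member: True if the tested individual is one of those n people
    p: population allele frequency, assumed known and equal at every SNP
    """
    total = 0.0
    for _ in range(m):
        # Allele counts (0, 1 or 2) for each sampled person at this SNP.
        counts = [(rng.random() < p) + (rng.random() < p) for _ in range(n)]
        p_hat = sum(counts) / (2 * n)  # sample allele frequency
        if member:
            y = counts[0] / 2  # tested person is the first sampled person
        else:
            # Tested person is drawn independently from the same population.
            y = ((rng.random() < p) + (rng.random() < p)) / 2
        total += (y - p) * (p_hat - p)
    # For a non-member each term has variance
    # Var(y) * Var(p_hat) = [p(1-p)/2] * [p(1-p)/(2n)].
    null_sd = math.sqrt(m * (p * (1 - p)) ** 2 / (4 * n))
    return total / null_sd


def mean_z(m, n, member, reps=20, seed=1):
    """Average the z-score over `reps` independent simulated individuals."""
    rng = random.Random(seed)
    return sum(simulate_z(m, n, member, rng) for _ in range(reps)) / reps


if __name__ == "__main__":
    for m, n in [(200, 25), (200, 50), (800, 50)]:
        print(
            f"m={m} n={n} sqrt(m/n)={math.sqrt(m / n):.2f} "
            f"member={mean_z(m, n, True):.2f} "
            f"non-member={mean_z(m, n, False):.2f}"
        )
```

In this sketch, doubling the sample size at a fixed number of SNPs shrinks the member signal, and quadrupling the number of SNPs restores it, which is the qualitative behaviour the description attributes to the ratio of SNPs to sample size.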