Identifying personal microbiomes using metagenomic codes

Detailed description

Bibliographic details
Published in: Proceedings of the National Academy of Sciences - PNAS, 2015-06, Vol. 112 (22), p. E2930-E2938
Main authors: Franzosa, Eric A., Huang, Katherine, Meadow, James F., Gevers, Dirk, Lemon, Katherine P., Bohannan, Brendan J. M., Huttenhower, Curtis
Format: Article
Language: eng
Online access: Full text
container_end_page E2938
container_issue 22
container_start_page E2930
container_title Proceedings of the National Academy of Sciences - PNAS
container_volume 112
creator Franzosa, Eric A.
Huang, Katherine
Meadow, James F.
Gevers, Dirk
Lemon, Katherine P.
Bohannan, Brendan J. M.
Huttenhower, Curtis
description Significance: Recent surveys of the microbial communities living on and in the human body (the human microbiome) have revealed strong variation in community membership between individuals. Some of this variation is stable over time, leading to speculation that individuals might possess unique microbial “fingerprints” that distinguish them from the population. We rigorously evaluated this idea by combining concepts from microbial ecology and computer science. Our results demonstrated that individuals could be uniquely identified among populations of 100s based on their microbiomes alone. In the case of the gut microbiome, >80% of individuals could still be uniquely identified up to a year later, a result that raises potential privacy concerns for subjects enrolled in human microbiome research projects.
Community composition within the human microbiome varies across individuals, but it remains unknown if this variation is sufficient to uniquely identify individuals within large populations or stable enough to identify them over time. We investigated this by developing a hitting set-based coding algorithm and applying it to the Human Microbiome Project population. Our approach defined body site-specific metagenomic codes: sets of microbial taxa or genes prioritized to uniquely and stably identify individuals. Codes capturing strain variation in clade-specific marker genes were able to distinguish among 100s of individuals at an initial sampling time point. In comparisons with follow-up samples collected 30–300 d later, ∼30% of individuals could still be uniquely pinpointed using metagenomic codes from a typical body site; coincidental (false positive) matches were rare. Codes based on the gut microbiome were exceptionally stable and pinpointed >80% of individuals. The failure of a code to match its owner at a later time point was largely explained by the loss of specific microbial strains (at current limits of detection) and was only weakly associated with the length of the sampling interval. In addition to highlighting patterns of temporal variation in the ecology of the human microbiome, this work demonstrates the feasibility of microbiome-based identifiability, a result with important ethical implications for microbiome study design. The datasets and code used in this work are available for download from huttenhower.sph.harvard.edu/idability.
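The hitting set idea mentioned in the description can be illustrated with a small sketch. This is not the authors' implementation (their code is at the URL given above); it is a simplified greedy variant on presence/absence data, and the names `greedy_code`, `matches`, and the toy `profiles` are hypothetical. A code for one individual is a subset of that individual's features such that every other individual lacks at least one of them.

```python
from typing import Dict, List, Optional, Set

def greedy_code(owner: str, profiles: Dict[str, Set[str]]) -> Optional[List[str]]:
    """Greedy hitting-set sketch: pick features of `owner` until every
    other individual lacks at least one picked feature.
    Returns None when no unique code exists for `owner`."""
    own = profiles[owner]
    # Individuals that still carry every feature chosen so far.
    remaining = {name for name in profiles if name != owner}
    code: List[str] = []
    while remaining:
        # Candidate = unused owner feature absent from the most remaining individuals.
        # sorted() makes tie-breaking deterministic.
        best = max(
            sorted(own - set(code)),
            key=lambda f: sum(f not in profiles[n] for n in remaining),
            default=None,
        )
        if best is None:
            return None  # owner's features are exhausted
        excluded = {n for n in remaining if best not in profiles[n]}
        if not excluded:
            return None  # no remaining feature separates the owner
        code.append(best)
        remaining -= excluded
    return code

def matches(code: List[str], sample: Set[str]) -> bool:
    """A sample matches a code when it contains every feature in the code."""
    return all(f in sample for f in code)

# Toy presence/absence profiles (hypothetical strain labels).
profiles = {
    "A": {"s1", "s2", "s3"},
    "B": {"s1", "s2", "s4"},
    "C": {"s2", "s3", "s4"},
    "D": {"s1", "s3"},
}

code_a = greedy_code("A", profiles)
print("code for A:", code_a)
print("code for D:", greedy_code("D", profiles))
```

In this toy population, individual D has no code because A carries all of D's features. Checking `matches(code_a, {"s1", "s2"})` mimics a follow-up sample in which one strain has dropped below detection: the code no longer matches its owner, which is the failure mode the description attributes to strain loss.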
doi_str_mv 10.1073/pnas.1423854112
format Article
eissn 1091-6490
pmid 25964341
jstor_id 26463519
pqid 1688664195
publisher United States: National Academy of Sciences
addtitle Proc Natl Acad Sci U S A
date 2015-06-02
rights Volumes 1–89 and 106–112, copyright as a collective work only; author(s) retains copyright to individual articles
Copyright National Academy of Sciences Jun 2, 2015
lds50 peer_reviewed
oa free_for_read
linktopdf https://www.jstor.org/stable/pdf/26463519
linktohtml https://www.jstor.org/stable/26463519
backlink https://www.ncbi.nlm.nih.gov/pubmed/25964341
fulltext fulltext
identifier ISSN: 0027-8424
ispartof Proceedings of the National Academy of Sciences - PNAS, 2015-06, Vol.112 (22), p.E2930-E2938
issn 0027-8424
1091-6490
language eng
recordid cdi_proquest_miscellaneous_1803103309
source MEDLINE; Jstor Complete Legacy; PubMed Central; Alma/SFX Local Collection; Free Full-Text Journals in Chemistry
subjects Biological Sciences
computer science
Confidentiality - standards
Confidentiality - trends
digestive system
Genes
Genetic Markers - genetics
Genetic Variation
Genomics
Humans
metagenomics
Metagenomics - methods
microbial communities
microbiome
Microbiota - genetics
Microorganisms
Models, Genetic
PNAS Plus
Precision Medicine - methods
research projects
Sampling
surveys
title Identifying personal microbiomes using metagenomic codes
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-18T03%3A54%3A21IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-jstor_proqu&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Identifying%20personal%20microbiomes%20using%20metagenomic%20codes&rft.jtitle=Proceedings%20of%20the%20National%20Academy%20of%20Sciences%20-%20PNAS&rft.au=Franzosa,%20Eric%20A.&rft.date=2015-06-02&rft.volume=112&rft.issue=22&rft.spage=E2930&rft.epage=E2938&rft.pages=E2930-E2938&rft.issn=0027-8424&rft.eissn=1091-6490&rft_id=info:doi/10.1073/pnas.1423854112&rft_dat=%3Cjstor_proqu%3E26463519%3C/jstor_proqu%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1688664195&rft_id=info:pmid/25964341&rft_jstor_id=26463519&rfr_iscdi=true