Investigating microbial co-occurrence patterns based on metagenomic compositional data
High-throughput sequencing technologies have provided a powerful tool to study the microbial organisms living in various environments. Characterizing microbial interactions can give us insights into how they live and work together as a community. Metagenomic data are usually summarized in a comp...
Published in: | Bioinformatics 2015-10, Vol.31 (20), p.3322-3329 |
---|---|
Main authors: | Ban, Yuguang; An, Lingling; Jiang, Hongmei |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 3329 |
---|---|
container_issue | 20 |
container_start_page | 3322 |
container_title | Bioinformatics |
container_volume | 31 |
creator | Ban, Yuguang; An, Lingling; Jiang, Hongmei |
description | Motivation: High-throughput sequencing technologies provide a powerful tool for studying the microbial organisms living in various environments. Characterizing microbial interactions can give insight into how these organisms live and work together as a community. Metagenomic data are usually summarized in a compositional fashion because sampling/sequencing depths vary from one sample to another. We study the co-occurrence patterns of microbial organisms using their relative abundance information. Analyzing compositional data with conventional correlation methods has been shown to be prone to bias that leads to artifactual correlations.
Results: We propose a novel method, regularized estimation of the basis covariance based on compositional data (REBACCA), to identify significant co-occurrence patterns by finding sparse solutions to a rank-deficient system. Specifically, we construct the system using log ratios of count or proportion data and solve it using the l1-norm shrinkage method. Our comprehensive simulation studies show that REBACCA (i) generally achieves higher accuracy than existing methods when a sparsity condition is satisfied; (ii) controls false positives at a pre-specified level, whereas other methods fail to do so in various cases; and (iii) runs considerably faster than the existing comparable method. REBACCA is also applied to several real metagenomic datasets.
Availability and implementation: The R code for the proposed method is available at http://faculty.wcas.northwestern.edu/~hji403/REBACCA.htm
Contact: hongmei@northwestern.edu
Supplementary information: Supplementary data are available at Bioinformatics online. |
doi_str_mv | 10.1093/bioinformatics/btv364 |
format | Article |
fullrecord | <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_4795632</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>1721347994</sourcerecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1721347994</pqid></control><display><type>article</type><title>Investigating microbial co-occurrence patterns based on metagenomic compositional data</title><creator>Ban, Yuguang ; An, Lingling ; Jiang, Hongmei</creator><identifier>ISSN: 1367-4803</identifier><identifier>EISSN: 1367-4811</identifier><identifier>EISSN: 1460-2059</identifier><identifier>DOI: 10.1093/bioinformatics/btv364</identifier><identifier>PMID: 26079350</identifier><language>eng</language><publisher>England: Oxford University Press</publisher><ispartof>Bioinformatics, 2015-10, Vol.31 (20), p.3322-3329</ispartof><rights>The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa></display><links><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC4795632/pdf/$$EPDF$$P50$$Gpubmedcentral$$H</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC4795632/$$EHTML$$P50$$Gpubmedcentral$$H</linktohtml><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/26079350$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><addata><au>Ban, Yuguang</au><au>An, Lingling</au><au>Jiang, Hongmei</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Investigating microbial co-occurrence patterns based on metagenomic compositional data</atitle><jtitle>Bioinformatics</jtitle><date>2015-10-15</date><risdate>2015</risdate><volume>31</volume><issue>20</issue><spage>3322</spage><epage>3329</epage><pages>3322-3329</pages><issn>1367-4803</issn><eissn>1367-4811</eissn><eissn>1460-2059</eissn><cop>England</cop><pub>Oxford University Press</pub><pmid>26079350</pmid><doi>10.1093/bioinformatics/btv364</doi><tpages>8</tpages><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1367-4803 |
ispartof | Bioinformatics, 2015-10, Vol.31 (20), p.3322-3329 |
issn | 1367-4803; 1367-4811; 1460-2059 |
language | eng |
recordid | cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_4795632 |
source | Oxford Journals Open Access Collection; MEDLINE; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals; PubMed Central; Alma/SFX Local Collection |
subjects | Algorithms; Bioinformatics; Communities; Computational Biology - methods; Correlation analysis; Counting; High-Throughput Nucleotide Sequencing - methods; Humans; Immunization; Metagenomics - methods; Microorganisms; Organisms; Original Papers; Sampling; Sequencing; Skin - immunology; Skin - microbiology; Skin - physiopathology; Species Specificity |
title | Investigating microbial co-occurrence patterns based on metagenomic compositional data |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-20T22%3A11%3A53IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Investigating%20microbial%20co-occurrence%20patterns%20based%20on%20metagenomic%20compositional%20data&rft.jtitle=Bioinformatics&rft.au=Ban,%20Yuguang&rft.date=2015-10-15&rft.volume=31&rft.issue=20&rft.spage=3322&rft.epage=3329&rft.pages=3322-3329&rft.issn=1367-4803&rft.eissn=1367-4811&rft_id=info:doi/10.1093/bioinformatics/btv364&rft_dat=%3Cproquest_pubme%3E1721347994%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1721347994&rft_id=info:pmid/26079350&rfr_iscdi=true |
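
Illustrative note on the abstract: the description says the system is built from log ratios of count or proportion data, but it does not spell out the equations. The sketch below is therefore not the authors' REBACCA code. It only checks, on hypothetical toy data, the standard log-ratio identity such a system can rest on: the sequencing depth cancels in log(x_i / x_j), so the variance of the log ratio of proportions equals w_ii + w_jj - 2 w_ij, where w is the covariance of the log basis (absolute) abundances. With p taxa this gives p(p-1)/2 equations for p(p+1)/2 unknowns, which is one way to see why the system is rank deficient and needs a sparsity assumption such as l1 shrinkage. All names (`simulate`, `sample_cov`, `log_ratio_variance`) and all parameter values are assumptions made for illustration.

```python
import math
import random


def sample_cov(a, b):
    """Unbiased sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)


def simulate(n_samples=200, seed=0):
    """Hypothetical toy data: three taxa with log-normal basis abundances."""
    rng = random.Random(seed)
    basis, props = [], []
    for _ in range(n_samples):
        z = rng.gauss(0.0, 1.0)  # shared factor: taxa 0 and 1 co-occur
        w = [
            math.exp(2.0 + 0.8 * z + rng.gauss(0.0, 0.3)),
            math.exp(1.5 + 0.8 * z + rng.gauss(0.0, 0.3)),
            math.exp(1.0 + rng.gauss(0.0, 0.5)),
        ]
        total = sum(w)
        basis.append(w)
        props.append([v / total for v in w])  # compositional (relative) data
    return basis, props


def log_ratio_variance(rows, i, j):
    """Sample variance of log(row[i] / row[j]) across samples."""
    ratios = [math.log(row[i] / row[j]) for row in rows]
    return sample_cov(ratios, ratios)


def log_column(rows, k):
    """Log-transformed column k of a list of rows."""
    return [math.log(row[k]) for row in rows]


basis, props = simulate()
i, j = 0, 1

# Left-hand side, computed from the observable proportions only.
t_from_props = log_ratio_variance(props, i, j)
# The same quantity computed from the (normally unobserved) basis abundances.
t_from_basis = log_ratio_variance(basis, i, j)
# Right-hand side: w_ii + w_jj - 2 * w_ij from the log basis covariance.
li, lj = log_column(basis, i), log_column(basis, j)
rhs = sample_cov(li, li) + sample_cov(lj, lj) - 2 * sample_cov(li, lj)

print(t_from_props, t_from_basis, rhs)
```

The three printed values agree up to floating-point rounding, which is the point of the identity: the left-hand side is estimable from relative abundances alone, while the right-hand side involves the basis covariance that a method of this kind sets out to recover.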