Independent component analysis recovers consistent regulatory signals from disparate datasets

The availability of bacterial transcriptomes has dramatically increased in recent years. This data deluge could result in detailed inference of underlying regulatory networks, but the diversity of experimental platforms and protocols introduces critical biases that could hinder scalable analysis of...

Detailed description

Bibliographic details
Published in: PLoS computational biology 2021-02, Vol.17 (2), p.e1008647-e1008647
Main authors: Sastry, Anand V, Hu, Alyssa, Heckmann, David, Poudel, Saugat, Kavvas, Erol, Palsson, Bernhard O
Format: Article
Language: eng
Online access: Full text
container_end_page e1008647
container_issue 2
container_start_page e1008647
container_title PLoS computational biology
container_volume 17
creator Sastry, Anand V
Hu, Alyssa
Heckmann, David
Poudel, Saugat
Kavvas, Erol
Palsson, Bernhard O
description The availability of bacterial transcriptomes has dramatically increased in recent years. This data deluge could result in detailed inference of underlying regulatory networks, but the diversity of experimental platforms and protocols introduces critical biases that could hinder scalable analysis of existing data. Here, we show that the underlying structure of the E. coli transcriptome, as determined by Independent Component Analysis (ICA), is conserved across multiple independent datasets, including both RNA-seq and microarray datasets. We subsequently combined five transcriptomics datasets into a large compendium containing over 800 expression profiles and discovered that its underlying ICA-based structure was still comparable to that of the individual datasets. With this understanding, we expanded our analysis to over 3,000 E. coli expression profiles and predicted three high-impact regulons that respond to oxidative stress, anaerobiosis, and antibiotic treatment. ICA thus enables deep analysis of disparate data to uncover new insights that were not visible in the individual datasets.
doi_str_mv 10.1371/journal.pcbi.1008647
format Article
addtitle PLoS Comput Biol
contributor Patil, Kiran Raosaheb
creationdate 2021-02-01
eissn 1553-7358
pmid 33529205
publisher United States: Public Library of Science
orcid https://orcid.org/0000-0002-3732-2463
https://orcid.org/0000-0003-1422-1712
https://orcid.org/0000-0003-2525-0818
https://orcid.org/0000-0003-0152-1580
https://orcid.org/0000-0002-8293-3909
rights 2021 Sastry et al. This is an open access article distributed under the terms of the Creative Commons Attribution License: http://creativecommons.org/licenses/by/4.0/ (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
peer_reviewed true
oa free_for_read
linktohtml https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7888660/
linktopdf https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7888660/pdf/
backlink https://www.ncbi.nlm.nih.gov/pubmed/33529205
fulltext fulltext
identifier ISSN: 1553-7358
ispartof PLoS computational biology, 2021-02, Vol.17 (2), p.e1008647-e1008647
issn 1553-7358
1553-734X
language eng
recordid cdi_plos_journals_2501880178
source DOAJ Directory of Open Access Journals; NCBI_PubMed Central (free); PLoS_OA journals; EZB Electronic Journals Library
subjects Adenosine
Antibiotics
Biology and Life Sciences
Datasets
Decomposition
Dietary supplements
E coli
Escherichia coli
Gene expression
Genes
Genotypes
Independent component analysis
Noise
Ontology
Perturbation
Research and Analysis Methods
Transcription
title Independent component analysis recovers consistent regulatory signals from disparate datasets
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-01T08%3A31%3A14IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_plos_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Independent%20component%20analysis%20recovers%20consistent%20regulatory%20signals%20from%20disparate%20datasets&rft.jtitle=PLoS%20computational%20biology&rft.au=Sastry,%20Anand%20V&rft.date=2021-02-01&rft.volume=17&rft.issue=2&rft.spage=e1008647&rft.epage=e1008647&rft.pages=e1008647-e1008647&rft.issn=1553-7358&rft.eissn=1553-7358&rft_id=info:doi/10.1371/journal.pcbi.1008647&rft_dat=%3Cproquest_plos_%3E2501880178%3C/proquest_plos_%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2501880178&rft_id=info:pmid/33529205&rft_doaj_id=oai_doaj_org_article_7cde387e0a6d4180812e1e476bde05ea&rfr_iscdi=true