DIVERSITY in binding, regulation, and evolution revealed from high-throughput ChIP

Genome-wide in vivo protein-DNA interactions are routinely mapped using high-throughput chromatin immunoprecipitation (ChIP). ChIP-reported regions are typically investigated for enriched sequence-motifs, which are likely to model the DNA-binding specificity of the profiled protein and/or of co-occu...

Bibliographic details
Published in: PLoS computational biology 2018-04, Vol.14 (4), p.e1006090-e1006090
Main authors: Mitra, Sneha; Biswas, Anushua; Narlikar, Leelavati
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page e1006090
container_issue 4
container_start_page e1006090
container_title PLoS computational biology
container_volume 14
creator Mitra, Sneha
Biswas, Anushua
Narlikar, Leelavati
description Genome-wide in vivo protein-DNA interactions are routinely mapped using high-throughput chromatin immunoprecipitation (ChIP). ChIP-reported regions are typically investigated for enriched sequence motifs, which are likely to model the DNA-binding specificity of the profiled protein and/or of co-occurring proteins. However, simple enrichment analyses can miss insights into the binding activity of the protein. Note that ChIP reports regions making direct contact with the protein as well as those binding through intermediaries. For example, consider a ChIP experiment targeting protein X, which binds DNA at its cognate sites but simultaneously interacts with four other proteins. Each of these proteins also binds to its own specific cognate sites along distant parts of the genome, a scenario consistent with the current view of transcriptional hubs and chromatin loops. Since ChIP will pull down all X-associated regions, the final reported data will be a union of five distinct sets of regions, each containing binding sites of one of the five proteins. Characterizing all five motifs and the corresponding sets is important for interpreting the ChIP experiment and, ultimately, the role of X in regulation. We present DIVERSITY, which attempts exactly this: it partitions the data so that each partition can be characterized with its own de novo motif. DIVERSITY uses a Bayesian approach to identify the optimal number of motifs and the associated partitions, which together explain the entire dataset. This is in contrast to standard motif finders, which report motifs individually enriched in the data but do not necessarily explain all reported regions. We show that the different motifs and associated regions identified by DIVERSITY give insights into the various complexes that may be forming along the chromatin, something that has so far not been attempted from ChIP data.
Web server at http://diversity.ncl.res.in/; standalone version (Mac OS X/Linux) from https://github.com/NarlikarLab/DIVERSITY/releases/tag/v1.0.0.
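The partitioning idea in the description can be illustrated with a minimal sketch. This is not the DIVERSITY implementation. It assumes the motifs are already known and supplied as toy position weight matrices, whereas the tool learns motifs de novo and chooses their number with a Bayesian criterion. Each region is simply assigned to the motif whose best window scores highest, so every region ends up in exactly one partition. The function names, the consensus strings, and the example regions are all hypothetical.

```python
import math

BASES = "ACGT"

def pwm_from_consensus(consensus, hit=0.85):
    """Build a toy position weight matrix that favors `consensus` at each position."""
    miss = (1.0 - hit) / 3.0
    return [{b: (hit if b == c else miss) for b in BASES} for c in consensus]

def best_window_score(seq, pwm, background=0.25):
    """Highest log-odds score of `pwm` over all windows of `seq` (forward strand only)."""
    width = len(pwm)
    best = -math.inf  # stays -inf if the region is shorter than the motif
    for start in range(len(seq) - width + 1):
        score = sum(math.log(pwm[i][seq[start + i]] / background) for i in range(width))
        best = max(best, score)
    return best

def partition_regions(regions, motifs):
    """Assign every region to the motif that scores best on it (hard assignment)."""
    partitions = {name: [] for name in motifs}
    for seq in regions:
        winner = max(motifs, key=lambda name: best_window_score(seq, motifs[name]))
        partitions[winner].append(seq)
    return partitions

# Hypothetical motifs: one for the profiled protein X, one for an interacting partner.
motifs = {
    "X": pwm_from_consensus("TGACTCA"),
    "partner": pwm_from_consensus("CACGTG"),
}
# Hypothetical ChIP-reported regions (real regions would be far longer).
regions = ["AATGACTCAGG", "TTCACGTGAA", "GGTGACTCATT"]

for name, members in partition_regions(regions, motifs).items():
    print(name, members)
```

In contrast to reporting each enriched motif on its own, this kind of assignment forces every reported region to be explained by some motif, which is the property the description emphasizes.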
doi_str_mv 10.1371/journal.pcbi.1006090
format Article
fullrecord <record><control><sourceid>gale_plos_</sourceid><recordid>TN_cdi_plos_journals_2039767049</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><galeid>A536809428</galeid><doaj_id>oai_doaj_org_article_6a6ffde3cfaf415b9a4ba7c6f72e7f18</doaj_id><sourcerecordid>A536809428</sourcerecordid><originalsourceid>FETCH-LOGICAL-c633t-4cb5575d93b7748e9dfb69d00c2810b903b68cc0283a709176fc0e10f5a0f6c73</originalsourceid><addsrcrecordid>eNqVkl9v0zAQwCMEYqPwDRBE4gWktZzj2I5fkKYyINIEqBtIPFmOYyeukrjYSQXfHnftphXxgvzg893v_vqS5DmCBcIMvV27yQ-yW2xUZRcIgAKHB8kpIgTPGSbFw3vySfIkhDVAFDl9nJxknBY5QHGarN6X3y9WV-X1j9QOaWWH2g7NWep1M3VytG44S-VQp3rrumn3jJatlp2uU-Ndn7a2aedj693UtJtpTJdt-fVp8sjILuhnh3uWfPtwcb38NL_88rFcnl_OFcV4nOeqIoSRmuOKsbzQvDYV5TWAygoEFQdc0UIpyAosGXDEqFGgERgiwVDF8Cx5uY-76VwQh3EEkQHmjDLIeSTKPVE7uRYbb3vpfwsnrbhRON8I6UerOi2opMbUGisjTY5IxWVeSaaoYZlmBhUx1rtDtqnqda30MHrZHQU9tgy2FY3bCsIxLuLoZ8nrQwDvfk46jKK3Qemuk4N2003dCDJKEIroq7_Qf3e32FNN_A9hB-NiXhVPrXur3KCNjfpzgmkBPM92Lbw5cojMqH-NjZxCEOXV6j_Yz8dsvmeVdyF4be6mgkDsdvW2fLHbVXHY1ej24v5E75xulxP_AZc45Nw</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2039767049</pqid></control><display><type>article</type><title>DIVERSITY in binding, regulation, and evolution revealed from high-throughput ChIP</title><source>MEDLINE</source><source>DOAJ Directory of Open Access Journals</source><source>EZB-FREE-00999 freely available EZB journals</source><source>PubMed Central</source><source>Public Library of Science (PLoS)</source><creator>Mitra, Sneha ; Biswas, Anushua ; Narlikar, Leelavati</creator><creatorcontrib>Mitra, Sneha ; Biswas, Anushua ; Narlikar, Leelavati</creatorcontrib><description>Genome-wide in vivo protein-DNA interactions are routinely mapped using high-throughput chromatin immunoprecipitation (ChIP). 
ChIP-reported regions are typically investigated for enriched sequence-motifs, which are likely to model the DNA-binding specificity of the profiled protein and/or of co-occurring proteins. However, simple enrichment analyses can miss insights into the binding-activity of the protein. Note that ChIP reports regions making direct contact with the protein as well as those binding through intermediaries. For example, consider a ChIP experiment targeting protein X, which binds DNA at its cognate sites, but simultaneously interacts with four other proteins. Each of these proteins also binds to its own specific cognate sites along distant parts of the genome, a scenario consistent with the current view of transcriptional hubs and chromatin loops. Since ChIP will pull down all X-associated regions, the final reported data will be a union of five distinct sets of regions, each containing binding sites of one of the five proteins, respectively. Characterizing all five different motifs and the corresponding sets is important to interpret the ChIP experiment and ultimately, the role of X in regulation. We present diversity which attempts exactly this: it partitions the data so that each partition can be characterized with its own de novo motif. Diversity uses a Bayesian approach to identify the optimal number of motifs and the associated partitions, which together explain the entire dataset. This is in contrast to standard motif finders, which report motifs individually enriched in the data, but do not necessarily explain all reported regions. We show that the different motifs and associated regions identified by diversity give insights into the various complexes that may be forming along the chromatin, something that has so far not been attempted from ChIP data. 
Webserver at http://diversity.ncl.res.in/; standalone (Mac OS X/Linux) from https://github.com/NarlikarLab/DIVERSITY/releases/tag/v1.0.0.</description><identifier>ISSN: 1553-7358</identifier><identifier>ISSN: 1553-734X</identifier><identifier>EISSN: 1553-7358</identifier><identifier>DOI: 10.1371/journal.pcbi.1006090</identifier><identifier>PMID: 29684008</identifier><language>eng</language><publisher>United States: Public Library of Science</publisher><subject>Algorithms ; Animals ; Bayes Theorem ; Bayesian analysis ; Binding Sites ; Binding sites (Biochemistry) ; Bioinformatics ; Biological evolution ; Biology and life sciences ; Chemical engineering ; Chromatin ; Chromatin - genetics ; Chromatin - metabolism ; Chromatin Immunoprecipitation - statistics &amp; numerical data ; Computational Biology ; Deoxyribonucleic acid ; DNA ; DNA - genetics ; DNA - metabolism ; DNA-Binding Proteins - genetics ; DNA-Binding Proteins - metabolism ; Enrichment ; Evolution, Molecular ; Experiments ; Gene expression ; Genome-wide association studies ; Genomes ; High-Throughput Nucleotide Sequencing - statistics &amp; numerical data ; Humans ; Immunoprecipitation ; Laboratories ; Neurons - metabolism ; Nucleotide Motifs ; Nucleotide sequence ; Observations ; Partitions ; Protein Binding ; Protein X ; Proteins ; Research and Analysis Methods ; Sequence Analysis, DNA - statistics &amp; numerical data ; Software ; Transcription ; Transcription (Genetics) ; Transcription factors ; Transcription Factors - genetics ; Transcription Factors - metabolism</subject><ispartof>PLoS computational biology, 2018-04, Vol.14 (4), p.e1006090-e1006090</ispartof><rights>COPYRIGHT 2018 Public Library of Science</rights><rights>2018 Public Library of Science. 
This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited: Mitra S, Biswas A, Narlikar L (2018) DIVERSITY in binding, regulation, and evolution revealed from high-throughput ChIP. PLoS Comput Biol 14(4): e1006090. https://doi.org/10.1371/journal.pcbi.1006090</rights><rights>2018 Mitra et al 2018 Mitra et al</rights><rights>2018 Public Library of Science. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited: Mitra S, Biswas A, Narlikar L (2018) DIVERSITY in binding, regulation, and evolution revealed from high-throughput ChIP. PLoS Comput Biol 14(4): e1006090. https://doi.org/10.1371/journal.pcbi.1006090</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c633t-4cb5575d93b7748e9dfb69d00c2810b903b68cc0283a709176fc0e10f5a0f6c73</citedby><cites>FETCH-LOGICAL-c633t-4cb5575d93b7748e9dfb69d00c2810b903b68cc0283a709176fc0e10f5a0f6c73</cites><orcidid>0000-0001-6820-1878</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC5933800/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC5933800/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,723,776,780,860,881,2096,2915,23845,27901,27902,53766,53768,79342,79343</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/29684008$$D View this record in 
MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Mitra, Sneha</creatorcontrib><creatorcontrib>Biswas, Anushua</creatorcontrib><creatorcontrib>Narlikar, Leelavati</creatorcontrib><title>DIVERSITY in binding, regulation, and evolution revealed from high-throughput ChIP</title><title>PLoS computational biology</title><addtitle>PLoS Comput Biol</addtitle><description>Genome-wide in vivo protein-DNA interactions are routinely mapped using high-throughput chromatin immunoprecipitation (ChIP). ChIP-reported regions are typically investigated for enriched sequence-motifs, which are likely to model the DNA-binding specificity of the profiled protein and/or of co-occurring proteins. However, simple enrichment analyses can miss insights into the binding-activity of the protein. Note that ChIP reports regions making direct contact with the protein as well as those binding through intermediaries. For example, consider a ChIP experiment targeting protein X, which binds DNA at its cognate sites, but simultaneously interacts with four other proteins. Each of these proteins also binds to its own specific cognate sites along distant parts of the genome, a scenario consistent with the current view of transcriptional hubs and chromatin loops. Since ChIP will pull down all X-associated regions, the final reported data will be a union of five distinct sets of regions, each containing binding sites of one of the five proteins, respectively. Characterizing all five different motifs and the corresponding sets is important to interpret the ChIP experiment and ultimately, the role of X in regulation. We present diversity which attempts exactly this: it partitions the data so that each partition can be characterized with its own de novo motif. Diversity uses a Bayesian approach to identify the optimal number of motifs and the associated partitions, which together explain the entire dataset. 
This is in contrast to standard motif finders, which report motifs individually enriched in the data, but do not necessarily explain all reported regions. We show that the different motifs and associated regions identified by diversity give insights into the various complexes that may be forming along the chromatin, something that has so far not been attempted from ChIP data. Webserver at http://diversity.ncl.res.in/; standalone (Mac OS X/Linux) from https://github.com/NarlikarLab/DIVERSITY/releases/tag/v1.0.0.</description><subject>Algorithms</subject><subject>Animals</subject><subject>Bayes Theorem</subject><subject>Bayesian analysis</subject><subject>Binding Sites</subject><subject>Binding sites (Biochemistry)</subject><subject>Bioinformatics</subject><subject>Biological evolution</subject><subject>Biology and life sciences</subject><subject>Chemical engineering</subject><subject>Chromatin</subject><subject>Chromatin - genetics</subject><subject>Chromatin - metabolism</subject><subject>Chromatin Immunoprecipitation - statistics &amp; numerical data</subject><subject>Computational Biology</subject><subject>Deoxyribonucleic acid</subject><subject>DNA</subject><subject>DNA - genetics</subject><subject>DNA - metabolism</subject><subject>DNA-Binding Proteins - genetics</subject><subject>DNA-Binding Proteins - metabolism</subject><subject>Enrichment</subject><subject>Evolution, Molecular</subject><subject>Experiments</subject><subject>Gene expression</subject><subject>Genome-wide association studies</subject><subject>Genomes</subject><subject>High-Throughput Nucleotide Sequencing - statistics &amp; numerical data</subject><subject>Humans</subject><subject>Immunoprecipitation</subject><subject>Laboratories</subject><subject>Neurons - metabolism</subject><subject>Nucleotide Motifs</subject><subject>Nucleotide sequence</subject><subject>Observations</subject><subject>Partitions</subject><subject>Protein Binding</subject><subject>Protein 
X</subject><subject>Proteins</subject><subject>Research and Analysis Methods</subject><subject>Sequence Analysis, DNA - statistics &amp; numerical data</subject><subject>Software</subject><subject>Transcription</subject><subject>Transcription (Genetics)</subject><subject>Transcription factors</subject><subject>Transcription Factors - genetics</subject><subject>Transcription Factors - metabolism</subject><issn>1553-7358</issn><issn>1553-734X</issn><issn>1553-7358</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2018</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><sourceid>BENPR</sourceid><sourceid>DOA</sourceid><recordid>eNqVkl9v0zAQwCMEYqPwDRBE4gWktZzj2I5fkKYyINIEqBtIPFmOYyeukrjYSQXfHnftphXxgvzg893v_vqS5DmCBcIMvV27yQ-yW2xUZRcIgAKHB8kpIgTPGSbFw3vySfIkhDVAFDl9nJxknBY5QHGarN6X3y9WV-X1j9QOaWWH2g7NWep1M3VytG44S-VQp3rrumn3jJatlp2uU-Ndn7a2aedj693UtJtpTJdt-fVp8sjILuhnh3uWfPtwcb38NL_88rFcnl_OFcV4nOeqIoSRmuOKsbzQvDYV5TWAygoEFQdc0UIpyAosGXDEqFGgERgiwVDF8Cx5uY-76VwQh3EEkQHmjDLIeSTKPVE7uRYbb3vpfwsnrbhRON8I6UerOi2opMbUGisjTY5IxWVeSaaoYZlmBhUx1rtDtqnqda30MHrZHQU9tgy2FY3bCsIxLuLoZ8nrQwDvfk46jKK3Qemuk4N2003dCDJKEIroq7_Qf3e32FNN_A9hB-NiXhVPrXur3KCNjfpzgmkBPM92Lbw5cojMqH-NjZxCEOXV6j_Yz8dsvmeVdyF4be6mgkDsdvW2fLHbVXHY1ej24v5E75xulxP_AZc45Nw</recordid><startdate>20180401</startdate><enddate>20180401</enddate><creator>Mitra, Sneha</creator><creator>Biswas, Anushua</creator><creator>Narlikar, Leelavati</creator><general>Public Library of Science</general><general>Public Library of Science 
(PLoS)</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>ISN</scope><scope>ISR</scope><scope>3V.</scope><scope>7QO</scope><scope>7QP</scope><scope>7TK</scope><scope>7TM</scope><scope>7X7</scope><scope>7XB</scope><scope>88E</scope><scope>8AL</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>8FH</scope><scope>8FI</scope><scope>8FJ</scope><scope>8FK</scope><scope>ABUWG</scope><scope>AEUYN</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BBNVY</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>BHPHI</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>FR3</scope><scope>FYUFA</scope><scope>GHDGH</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K7-</scope><scope>K9.</scope><scope>LK8</scope><scope>M0N</scope><scope>M0S</scope><scope>M1P</scope><scope>M7P</scope><scope>P5Z</scope><scope>P62</scope><scope>P64</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>Q9U</scope><scope>RC3</scope><scope>7X8</scope><scope>5PM</scope><scope>DOA</scope><orcidid>https://orcid.org/0000-0001-6820-1878</orcidid></search><sort><creationdate>20180401</creationdate><title>DIVERSITY in binding, regulation, and evolution revealed from high-throughput ChIP</title><author>Mitra, Sneha ; Biswas, Anushua ; Narlikar, Leelavati</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c633t-4cb5575d93b7748e9dfb69d00c2810b903b68cc0283a709176fc0e10f5a0f6c73</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2018</creationdate><topic>Algorithms</topic><topic>Animals</topic><topic>Bayes Theorem</topic><topic>Bayesian analysis</topic><topic>Binding Sites</topic><topic>Binding sites (Biochemistry)</topic><topic>Bioinformatics</topic><topic>Biological evolution</topic><topic>Biology 
and life sciences</topic><topic>Chemical engineering</topic><topic>Chromatin</topic><topic>Chromatin - genetics</topic><topic>Chromatin - metabolism</topic><topic>Chromatin Immunoprecipitation - statistics &amp; numerical data</topic><topic>Computational Biology</topic><topic>Deoxyribonucleic acid</topic><topic>DNA</topic><topic>DNA - genetics</topic><topic>DNA - metabolism</topic><topic>DNA-Binding Proteins - genetics</topic><topic>DNA-Binding Proteins - metabolism</topic><topic>Enrichment</topic><topic>Evolution, Molecular</topic><topic>Experiments</topic><topic>Gene expression</topic><topic>Genome-wide association studies</topic><topic>Genomes</topic><topic>High-Throughput Nucleotide Sequencing - statistics &amp; numerical data</topic><topic>Humans</topic><topic>Immunoprecipitation</topic><topic>Laboratories</topic><topic>Neurons - metabolism</topic><topic>Nucleotide Motifs</topic><topic>Nucleotide sequence</topic><topic>Observations</topic><topic>Partitions</topic><topic>Protein Binding</topic><topic>Protein X</topic><topic>Proteins</topic><topic>Research and Analysis Methods</topic><topic>Sequence Analysis, DNA - statistics &amp; numerical data</topic><topic>Software</topic><topic>Transcription</topic><topic>Transcription (Genetics)</topic><topic>Transcription factors</topic><topic>Transcription Factors - genetics</topic><topic>Transcription Factors - metabolism</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Mitra, Sneha</creatorcontrib><creatorcontrib>Biswas, Anushua</creatorcontrib><creatorcontrib>Narlikar, Leelavati</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Gale In Context: Canada</collection><collection>Gale In Context: Science</collection><collection>ProQuest Central 
(Corporate)</collection><collection>Biotechnology Research Abstracts</collection><collection>Calcium &amp; Calcified Tissue Abstracts</collection><collection>Neurosciences Abstracts</collection><collection>Nucleic Acids Abstracts</collection><collection>Health &amp; Medical Collection</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Medical Database (Alumni Edition)</collection><collection>Computing Database (Alumni Edition)</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>Hospital Premium Collection</collection><collection>Hospital Premium Collection (Alumni Edition)</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest One Sustainability</collection><collection>ProQuest Central UK/Ireland</collection><collection>Advanced Technologies &amp; Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>Biological Science Collection</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>Natural Science Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>Engineering Research Database</collection><collection>Health Research Premium Collection</collection><collection>Health Research Premium Collection (Alumni)</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science Collection</collection><collection>Computer Science Database</collection><collection>ProQuest Health &amp; Medical Complete (Alumni)</collection><collection>ProQuest Biological Science 
Collection</collection><collection>Computing Database</collection><collection>Health &amp; Medical Collection (Alumni Edition)</collection><collection>Medical Database</collection><collection>Biological Science Database</collection><collection>Advanced Technologies &amp; Aerospace Database</collection><collection>ProQuest Advanced Technologies &amp; Aerospace Collection</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>ProQuest Central Basic</collection><collection>Genetics Abstracts</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><collection>DOAJ Directory of Open Access Journals</collection><jtitle>PLoS computational biology</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Mitra, Sneha</au><au>Biswas, Anushua</au><au>Narlikar, Leelavati</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>DIVERSITY in binding, regulation, and evolution revealed from high-throughput ChIP</atitle><jtitle>PLoS computational biology</jtitle><addtitle>PLoS Comput Biol</addtitle><date>2018-04-01</date><risdate>2018</risdate><volume>14</volume><issue>4</issue><spage>e1006090</spage><epage>e1006090</epage><pages>e1006090-e1006090</pages><issn>1553-7358</issn><issn>1553-734X</issn><eissn>1553-7358</eissn><abstract>Genome-wide in vivo protein-DNA interactions are routinely mapped using high-throughput chromatin immunoprecipitation (ChIP). 
ChIP-reported regions are typically investigated for enriched sequence-motifs, which are likely to model the DNA-binding specificity of the profiled protein and/or of co-occurring proteins. However, simple enrichment analyses can miss insights into the binding-activity of the protein. Note that ChIP reports regions making direct contact with the protein as well as those binding through intermediaries. For example, consider a ChIP experiment targeting protein X, which binds DNA at its cognate sites, but simultaneously interacts with four other proteins. Each of these proteins also binds to its own specific cognate sites along distant parts of the genome, a scenario consistent with the current view of transcriptional hubs and chromatin loops. Since ChIP will pull down all X-associated regions, the final reported data will be a union of five distinct sets of regions, each containing binding sites of one of the five proteins, respectively. Characterizing all five different motifs and the corresponding sets is important to interpret the ChIP experiment and ultimately, the role of X in regulation. We present diversity which attempts exactly this: it partitions the data so that each partition can be characterized with its own de novo motif. Diversity uses a Bayesian approach to identify the optimal number of motifs and the associated partitions, which together explain the entire dataset. This is in contrast to standard motif finders, which report motifs individually enriched in the data, but do not necessarily explain all reported regions. We show that the different motifs and associated regions identified by diversity give insights into the various complexes that may be forming along the chromatin, something that has so far not been attempted from ChIP data. 
Webserver at http://diversity.ncl.res.in/; standalone (Mac OS X/Linux) from https://github.com/NarlikarLab/DIVERSITY/releases/tag/v1.0.0.</abstract><cop>United States</cop><pub>Public Library of Science</pub><pmid>29684008</pmid><doi>10.1371/journal.pcbi.1006090</doi><orcidid>https://orcid.org/0000-0001-6820-1878</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1553-7358
ispartof PLoS computational biology, 2018-04, Vol.14 (4), p.e1006090-e1006090
issn 1553-7358
1553-734X
1553-7358
language eng
recordid cdi_plos_journals_2039767049
source MEDLINE; DOAJ Directory of Open Access Journals; EZB-FREE-00999 freely available EZB journals; PubMed Central; Public Library of Science (PLoS)
subjects Algorithms
Animals
Bayes Theorem
Bayesian analysis
Binding Sites
Binding sites (Biochemistry)
Bioinformatics
Biological evolution
Biology and life sciences
Chemical engineering
Chromatin
Chromatin - genetics
Chromatin - metabolism
Chromatin Immunoprecipitation - statistics & numerical data
Computational Biology
Deoxyribonucleic acid
DNA
DNA - genetics
DNA - metabolism
DNA-Binding Proteins - genetics
DNA-Binding Proteins - metabolism
Enrichment
Evolution, Molecular
Experiments
Gene expression
Genome-wide association studies
Genomes
High-Throughput Nucleotide Sequencing - statistics & numerical data
Humans
Immunoprecipitation
Laboratories
Neurons - metabolism
Nucleotide Motifs
Nucleotide sequence
Observations
Partitions
Protein Binding
Protein X
Proteins
Research and Analysis Methods
Sequence Analysis, DNA - statistics & numerical data
Software
Transcription
Transcription (Genetics)
Transcription factors
Transcription Factors - genetics
Transcription Factors - metabolism
title DIVERSITY in binding, regulation, and evolution revealed from high-throughput ChIP
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-10T07%3A24%3A43IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_plos_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=DIVERSITY%20in%20binding,%20regulation,%20and%20evolution%20revealed%20from%20high-throughput%20ChIP&rft.jtitle=PLoS%20computational%20biology&rft.au=Mitra,%20Sneha&rft.date=2018-04-01&rft.volume=14&rft.issue=4&rft.spage=e1006090&rft.epage=e1006090&rft.pages=e1006090-e1006090&rft.issn=1553-7358&rft.eissn=1553-7358&rft_id=info:doi/10.1371/journal.pcbi.1006090&rft_dat=%3Cgale_plos_%3EA536809428%3C/gale_plos_%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2039767049&rft_id=info:pmid/29684008&rft_galeid=A536809428&rft_doaj_id=oai_doaj_org_article_6a6ffde3cfaf415b9a4ba7c6f72e7f18&rfr_iscdi=true