Decoding ChIP-seq with a double-binding signal refines binding peaks to single-nucleotides and predicts cooperative interaction

Bibliographic details
Published in: Genome research 2014-10, Vol.24 (10), p.1686-1697
Main authors: Gomes, Antonio L C, Abeel, Thomas, Peterson, Matthew, Azizi, Elham, Lyubetskaya, Anna, Carvalho, Luís, Galagan, James
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 1697
container_issue 10
container_start_page 1686
container_title Genome research
container_volume 24
creator Gomes, Antonio L C
Abeel, Thomas
Peterson, Matthew
Azizi, Elham
Lyubetskaya, Anna
Carvalho, Luís
Galagan, James
description The comprehension of protein and DNA binding in vivo is essential to understand gene regulation. Chromatin immunoprecipitation followed by sequencing (ChIP-seq) provides a global map of the regulatory binding network. Most ChIP-seq analysis tools focus on identifying binding regions from coverage enrichment. However, less work has been performed to infer the physical and regulatory details inside the enriched regions. This research extends a previous blind-deconvolution approach to develop a post-peak-calling algorithm that improves binding site resolution and predicts cooperative interactions. At the core of our new method is a physically motivated model that characterizes the binding signal as an extreme value distribution. This model suggests a mathematical framework to study physical properties of DNA shearing from the ChIP-seq coverage. The model explains the ChIP-seq coverage with two signals: The first considers DNA fragments with only a single binding event, whereas the second considers fragments with two binding events (a double-binding signal). The model incorporates motif discovery and is able to detect multiple sites in an enriched region with single-nucleotide resolution, high sensitivity, and high specificity. Our method improves peak caller sensitivity, from less than 45% up to 94%, at a false positive rate < 11% for a set of 47 experimentally validated prokaryotic sites. It also improves resolution of highly enriched regions of large-scale eukaryotic data sets. The double-binding signal provides a novel application in ChIP-seq analysis: the identification of cooperative interaction. Predictions of known cooperative binding sites show a 0.85 area under an ROC curve.
doi_str_mv 10.1101/gr.161711.113
format Article
eissn 1549-5469
pmid 25024162
publisher Cold Spring Harbor Laboratory Press
cop United States
rights 2014 Gomes et al.; Published by Cold Spring Harbor Laboratory Press.
addtitle Genome Res
tpages 12
toplevel peer_reviewed; online_resources
oa free_for_read
linktohtml https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4199365/
linktopdf https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4199365/pdf/
backlink https://www.ncbi.nlm.nih.gov/pubmed/25024162
fulltext fulltext
identifier ISSN: 1088-9051
ispartof Genome research, 2014-10, Vol.24 (10), p.1686-1697
issn 1088-9051
1549-5469
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_4199365
source MEDLINE; PubMed Central; Alma/SFX Local Collection
subjects Algorithms
Binding Sites
Chromatin Immunoprecipitation
Computational Biology - methods
DNA-Binding Proteins - metabolism
Method
Models, Genetic
Nucleotides - metabolism
Sequence Analysis, DNA
title Decoding ChIP-seq with a double-binding signal refines binding peaks to single-nucleotides and predicts cooperative interaction
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-25T11%3A05%3A06IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Decoding%20ChIP-seq%20with%20a%20double-binding%20signal%20refines%20binding%20peaks%20to%20single-nucleotides%20and%20predicts%20cooperative%20interaction&rft.jtitle=Genome%20research&rft.au=Gomes,%20Antonio%20L%20C&rft.date=2014-10&rft.volume=24&rft.issue=10&rft.spage=1686&rft.epage=1697&rft.pages=1686-1697&rft.issn=1088-9051&rft.eissn=1549-5469&rft_id=info:doi/10.1101/gr.161711.113&rft_dat=%3Cproquest_pubme%3E1673383265%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1586095296&rft_id=info:pmid/25024162&rfr_iscdi=true