ChIP-GSM: Inferring active transcription factor modules to predict functional regulatory elements
Published in: | PLoS computational biology 2021-07, Vol.17 (7), p.e1009203-e1009203, Article 1009203 |
---|---|
Main authors: | , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | Transcription factors (TFs) often function as a module including both master factors and mediators binding at cis-regulatory regions to modulate nearby gene transcription. ChIP-seq profiling of multiple TFs makes it feasible to infer functional TF modules. However, when inferring TF modules based on co-localization of ChIP-seq peaks, often many weak binding events are missed, especially for mediators, resulting in incomplete identification of modules. To address this problem, we develop a ChIP-seq data-driven Gibbs Sampler to infer Modules (ChIP-GSM) using a Bayesian framework that integrates ChIP-seq profiles of multiple TFs. ChIP-GSM samples read counts of module TFs iteratively to estimate the binding potential of a module to each region and, across all regions, estimates the module abundance. Using inferred module-region probabilistic bindings as feature units, ChIP-GSM then employs logistic regression to predict active regulatory elements. Validation of ChIP-GSM predicted regulatory regions on multiple independent datasets sharing the same context confirms the advantage of using TF modules for predicting regulatory activity. In a case study of K562 cells, we demonstrate that the ChIP-GSM inferred modules form as groups, activate gene expression at different time points, and mediate diverse functional cellular processes. Hence, ChIP-GSM infers biologically meaningful TF modules and improves the prediction accuracy of regulatory region activities.
Author summary: Investigating TF binding to different types of regulatory regions can help reveal underlying activation mechanisms. However, accurately inferring modules among a large set of TFs is challenging due to the existence of weak, noisy, and context-sensitive binding signals. To reliably infer TF modules, here we describe ChIP-GSM, a Gibbs sampler built upon a Bayesian framework, that can further predict active regulatory elements. A comparison with other methods demonstrates ChIP-GSM's improved performance on module identification and active regulatory element prediction. Experimental results demonstrate that TF modules identified by ChIP-GSM are likely mediating distinct cellular functions by activating regulatory regions at different time points. |
---|---|
ISSN: | 1553-734X 1553-7358 |
DOI: | 10.1371/journal.pcbi.1009203 |
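
The abstract states that module-region binding probabilities serve as features for a logistic regression that predicts active regulatory elements. The sketch below illustrates only that general idea and is not the authors' implementation: the data are synthetic, the Gibbs sampling step is omitted entirely, and the names `fit_logistic` and `predict_proba`, the three-module feature layout, and the rule used to label regions as active are all assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(features, labels, lr=0.5, epochs=1000):
    """Fit weights and a bias by batch gradient descent on the log-loss."""
    n, d = len(features), len(features[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(features, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j in range(d):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(w, b, x):
    """Probability that a region with feature vector x is active."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


# Synthetic data (NOT from the paper): each row is one region, each column
# a hypothetical module-region binding probability in [0, 1].
random.seed(0)
regions, active = [], []
for _ in range(200):
    p = [random.random() for _ in range(3)]
    regions.append(p)
    # Assumed labeling rule: a region is active when module 0 binds strongly.
    active.append(1 if p[0] > 0.5 else 0)

weights, bias = fit_logistic(regions, active)
```

Calling `predict_proba(weights, bias, [0.95, 0.5, 0.5])` then scores a hypothetical region strongly bound by the first module; in this toy setup the fitted weight on that module should dominate, since it alone determines the synthetic labels.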