Xi Chen, Andrew F Neuwald, Leena Hilakivi-Clarke, Robert Clarke, Jianhua Xuan.
Abstract
Transcription factors (TFs) often function as a module including both master factors and mediators binding at cis-regulatory regions to modulate nearby gene transcription. ChIP-seq profiling of multiple TFs makes it feasible to infer functional TF modules. However, when TF modules are inferred from co-localization of ChIP-seq peaks, many weak binding events are often missed, especially for mediators, resulting in incomplete identification of modules. To address this problem, we develop a ChIP-seq data-driven Gibbs Sampler to infer Modules (ChIP-GSM) using a Bayesian framework that integrates ChIP-seq profiles of multiple TFs. ChIP-GSM samples read counts of module TFs iteratively to estimate the binding potential of a module to each region and, across all regions, estimates the module abundance. Using inferred module-region probabilistic bindings as feature units, ChIP-GSM then employs logistic regression to predict active regulatory elements. Validation of ChIP-GSM-predicted regulatory regions on multiple independent datasets sharing the same context confirms the advantage of using TF modules for predicting regulatory activity. In a case study of K562 cells, we demonstrate that the ChIP-GSM-inferred modules form groups, activate gene expression at different time points, and mediate diverse functional cellular processes. Hence, ChIP-GSM infers biologically meaningful TF modules and improves the prediction accuracy of regulatory region activities.
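The prediction step described in the abstract, logistic regression over module-region binding probabilities, can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic, the two "modules" are hypothetical, and the plain gradient-descent fit stands in for whatever fitting procedure ChIP-GSM actually uses. The function names `fit_logistic` and `predict_proba` are illustrative only.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the mean log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(X, w, b):
    # Predicted probability that each region is an active regulatory element.
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]

# Synthetic toy data (NOT from the paper): each row is a region, each column is
# the probability that a hypothetical module binds that region.
random.seed(0)
X, y = [], []
for _ in range(200):
    active = random.random() < 0.5
    # Assumption for illustration: module 0 tends to bind active regions,
    # while module 1 carries no information about activity.
    p0 = random.betavariate(5, 2) if active else random.betavariate(2, 5)
    p1 = random.random()
    X.append([p0, p1])
    y.append(1 if active else 0)

w, b = fit_logistic(X, y)
probs = predict_proba(X, w, b)
accuracy = sum((p >= 0.5) == bool(t) for p, t in zip(probs, y)) / len(y)
```

In this toy setting the informative module receives a positive weight, which mirrors the idea that regions with a high probability of being bound by an activity-associated module are more likely to be predicted active.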
Year: 2021 PMID: 34292930 PMCID: PMC8330942 DOI: 10.1371/journal.pcbi.1009203
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475
Fig 3. ChIP-GSM-inferred TF modules for enhancer and promoter regions, respectively.
The number of modules functioning at enhancer or promoter regions in (A) MCF-7 cells or (B) K562 cells. Module abundance reveals that region-specific modules can be as strong as common modules functioning in both enhancer and promoter regions, in (C) MCF-7 cells or in (D) K562 cells.
Fig 5. ChIP-GSM-predicted active regions are significantly enriched with epigenetic markers and significantly correlated with 3D chromatin interactions.
(A) The top 10% of predicted enhancers are significantly enriched with marker ChIP-seq peaks of H3K4me1 and H3K27ac but not H3K4me3. (B) and (C) The ChIP-GSM-predicted enhancer activities are significantly correlated with ChIA-PET 3D chromatin interactions in MCF-7 and K562 cells, respectively. (D) The top 10% of predicted promoters are significantly enriched with marker peaks of H3K4me3 but not H3K4me1 or H3K27ac. (E) and (F) The ChIP-GSM-predicted promoter activities are significantly correlated with ChIA-PET 3D chromatin interactions in MCF-7 and K562 cells, respectively.