Sharmi Banerjee, Hongxiao Zhu, Man Tang, Wu-Chun Feng, Xiaowei Wu, Hehuang Xie.
Abstract
Gene expression regulation is a complex process involving the interplay between transcription factors and chromatin states. Significant progress has been made toward understanding the impact of chromatin states on gene expression. Nevertheless, the mechanism by which transcription factors bind combinatorially in different chromatin states to enable selective regulation of gene expression remains an open research area. We introduce a nonparametric Bayesian clustering method for inhomogeneous Poisson processes to detect heterogeneous binding patterns of multiple proteins, including transcription factors, that form regulatory modules in different chromatin states. We applied this approach to ChIP-seq data for 21 proteins in mouse neural stem cells and observed different groups, or modules, of proteins clustered within different chromatin states. These chromatin-state-specific regulatory modules were found to have a significant influence on gene expression. We also observed different motif preferences for certain TFs between different chromatin states. Our results reveal a degree of interdependency between chromatin states and combinatorial binding of proteins in the complex transcriptional regulatory process. The software package is available on GitHub at https://github.com/BSharmi/DPM-LGCP.
Keywords: Poisson process; chromatin states; neural stem cell; regulatory network; transcription factor
Year: 2019 PMID: 30697231 PMCID: PMC6341026 DOI: 10.3389/fgene.2018.00731
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
Figure 1. A two-step process to identify chromatin-state-specific transcriptional regulatory modules. In the first step, uniquely aligned BAM files of histone marks are used along with the diHMM software to segment the genome and identify distinct chromatin states (illustrated by State X and State Y). In the second step, using the chromatin states identified in the first step and ChIP-seq peak files for different TFs, the proposed Bayesian clustering method is applied to identify transcriptional regulatory modules within each chromatin state. In downstream analyses, proximal genes (± 2 kb from TSS) are used to compare the TPM expression level when regulated by individual TFs to that when regulated combinatorially by the regulatory modules predicted in step 2. Finally, using de-novo motif enrichment analysis, the binding sequences of the TFs are compared across different chromatin states to study the effect of histone marks and co-factors on TF binding sequences.
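The "proximal (± 2 kb from TSS)" criterion in the caption can be illustrated with a small sketch. The function name, the single-chromosome simplification, the use of peak summits, and the coordinates are all hypothetical and chosen only for illustration; they are not taken from the authors' pipeline.

```python
from bisect import bisect_left, bisect_right


def proximal_genes(peak_summits, tss_by_gene, window=2000):
    """Return genes whose TSS lies within +/- `window` bp of any peak summit.

    Assumes all coordinates are on the same chromosome (a simplification).
    """
    summits = sorted(peak_summits)
    hits = []
    for gene, tss in tss_by_gene.items():
        # Indices bounding the summits that fall in [tss - window, tss + window].
        lo = bisect_left(summits, tss - window)
        hi = bisect_right(summits, tss + window)
        if hi > lo:
            hits.append(gene)
    return sorted(hits)


# Hypothetical coordinates, for illustration only.
summits = [10_500, 52_000, 90_100]
tss = {"geneA": 10_000, "geneB": 55_000, "geneC": 88_100}
print(proximal_genes(summits, tss))
```

Sorting the summits once and using binary search keeps the lookup fast when there are many peaks per TF.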
Figure 2. (A) Nucleosome-level emission matrix generated by diHMM. Functional annotations of the nucleosome-level states are shown in the color bar on the left. Scale varies linearly between 0 and 1. (B) Fractional genome coverage for nucleosome- and domain-level states. Scale varies logarithmically between 10⁻⁴ and 1. (C) Combined nucleosome-domain fold change obtained by diHMM. Functional annotations of the states are shown in the color bar on the left. Scale varies logarithmically between 0.5 and 50.
Figure 3. (A) Enrichment (in log scale) of TF peaks in different chromatin states showing binding preference of individual TFs. (B) Comparison of average TPM expression (in log scale) of proximal genes (± 2 kb from TSS) in different domain-level chromatin states. Genes were mapped to the nucleosome-level states for the corresponding domain-level states. (C) Comparison of average TPM expression (in log scale) of proximal genes (± 2 kb from TSS) mapped to individual TFs in the Broad Promoter state and in (D) the Poised Enhancer state.
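The caption does not say how the enrichment in panel (A) is computed. One common definition, assumed here purely for illustration, is the log ratio of the observed fraction of a TF's peaks falling in a state to the fraction of the genome that state covers. The function name, the choice of log base 2, and the numbers below are hypothetical.

```python
import math


def log2_enrichment(peaks_in_state, total_peaks, state_bp, genome_bp):
    """Log2 of (fraction of peaks in the state) / (fraction of genome in the state).

    This is an assumed definition, not necessarily the one used for the figure.
    """
    if peaks_in_state == 0:
        return float("-inf")  # no peaks in the state: maximally depleted
    observed = peaks_in_state / total_peaks
    expected = state_bp / genome_bp
    return math.log2(observed / expected)


# Hypothetical counts: 8% of peaks fall in a state covering 1% of the genome.
value = log2_enrichment(80, 1000, 1_000_000, 100_000_000)
print(round(value, 3))
```

Positive values indicate that a TF binds the state more often than its genome coverage alone would predict; negative values indicate depletion.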
Figure 4. (A,B) Estimated cluster binding intensities along with the individual TF binding intensities in the Broad Promoter and Poised Enhancer states, respectively. In each figure, the estimated binding intensities of the individual TFs are shown as dotted lines and those of the clusters as solid lines. TFs in each cluster are shown in the same color as that of the cluster. The X axis represents the genomic locations mapped onto the real line between 0 and 50. The Y axis represents the estimated binding intensities, both for the individual TFs and for the identified clusters. (C,D) Pairwise protein co-binding probabilities corresponding to (A,B), respectively. (E,F) Comparison of proximal gene expression (TPM) regulated by the clusters in (A) and (B), respectively. Only clusters having (1) multiple TFs and (2) proximal genes for at least two TFs are shown, to illustrate the combinatorial regulation of gene expression by multiple TFs.
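Panels (C,D) report pairwise co-binding probabilities. A common way to obtain such a matrix from a Bayesian clustering sampler, assumed here rather than confirmed by the caption, is to take the fraction of posterior samples in which two proteins receive the same cluster label. The sketch below uses hypothetical label samples for three proteins.

```python
def co_clustering_matrix(label_samples):
    """Fraction of samples in which each pair of items shares a cluster label.

    `label_samples` is a list of label vectors, one per posterior sample;
    entry i of each vector is the cluster label assigned to item i.
    """
    n_items = len(label_samples[0])
    n_samples = len(label_samples)
    counts = [[0] * n_items for _ in range(n_items)]
    for labels in label_samples:
        for i in range(n_items):
            for j in range(n_items):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / n_samples for c in row] for row in counts]


# Hypothetical cluster labels for 3 proteins across 4 posterior samples.
samples = [
    [0, 0, 1],
    [0, 0, 0],
    [1, 0, 0],
    [0, 0, 1],
]
for row in co_clustering_matrix(samples):
    print(row)
```

Because only label equality within a sample matters, the matrix is unaffected by cluster labels being permuted between samples.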
Comparison of clustering results with other methods.
| Broad Promoter (D5) | (1) ASCL1, JMJD3, KDM1A, NPAS3, OLIG2, SMAD3, SMAD4, TCF3; (2) BMI1, POU5F1, RNF2, SMCHD1, SOX21, NUP153; (3) FOXO3, MAX, NFIC, P300, RAD21, SOX2, SOX9 | (1) ASCL1, JMJD3, KDM1A, NFIC, NPAS3, OLIG2, SMAD3, SMAD4, TCF3; (2) BMI1, FOXO3, MAX, P300, POU5F1, RAD21, RNF2, SMCHD1, SOX2, SOX21, SOX9, NUP153 | (1) ASCL1, FOXO3, JMJD3, KDM1A, NFIC, NPAS3, OLIG2, RAD21, SMAD3, SMAD4, SOX2, SOX9; (2) BMI1, MAX, P300, POU5F1, RNF2, SMCHD1, SOX21, NUP153 |
| Poised Enhancer (D13) | (1) ASCL1, JMJD3, KDM1A, NFIC, NPAS3, OLIG2, P300, SMAD3, SOX2, TCF3; (2) BMI1; (3) FOXO3, POU5F1, RAD21, RNF2, SMAD4, SOX21, SOX9, TCF3; (4) MAX, SMCHD1, NUP153 | (1) ASCL1, JMJD3, KDM1A, NFIC, NPAS3, OLIG2, P300, SMAD3, SOX2, SOX9, TCF3; (2) BMI1, FOXO3, MAX, POU5F1, RAD21, RNF2, SMAD4, SMCHD1, SOX21, NUP153 | (1) ASCL1, FOXO3, JMJD3, KDM1A, NFIC, NPAS3, OLIG2, P300, POU5F1, SMAD3, SMAD4, SOX2, SOX9, TCF3; (2) BMI1, MAX, RAD21, RNF2, SMCHD1, SOX21, NUP153 |
For each method the clusters are preceded by the cluster number within parentheses. Further comparisons are shown in Supplementary Table .
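Agreement between two clusterings such as those in the table can be summarized with the Rand index, the fraction of protein pairs that both clusterings treat the same way (together in both, or apart in both). The sketch below is not part of the source analysis. It applies the index to the first two result columns of the Broad Promoter row, whose method names are not given in this excerpt, so the variable names `column_1` and `column_2` are placeholders.

```python
from itertools import combinations


def labels_from_clusters(clusters):
    """Map each member name to the index of the cluster that contains it."""
    return {name: k for k, members in enumerate(clusters) for name in members}


def rand_index(clusters_a, clusters_b):
    """Fraction of item pairs on which the two partitions agree."""
    a = labels_from_clusters(clusters_a)
    b = labels_from_clusters(clusters_b)
    if set(a) != set(b):
        raise ValueError("both partitions must cover the same items")
    agree = total = 0
    for x, y in combinations(sorted(a), 2):
        total += 1
        # Agreement: the pair is together in both or separated in both.
        agree += (a[x] == a[y]) == (b[x] == b[y])
    return agree / total


# Broad Promoter (D5) clusters from the first two result columns of the table.
column_1 = [
    ["ASCL1", "JMJD3", "KDM1A", "NPAS3", "OLIG2", "SMAD3", "SMAD4", "TCF3"],
    ["BMI1", "POU5F1", "RNF2", "SMCHD1", "SOX21", "NUP153"],
    ["FOXO3", "MAX", "NFIC", "P300", "RAD21", "SOX2", "SOX9"],
]
column_2 = [
    ["ASCL1", "JMJD3", "KDM1A", "NFIC", "NPAS3", "OLIG2", "SMAD3", "SMAD4", "TCF3"],
    ["BMI1", "FOXO3", "MAX", "P300", "POU5F1", "RAD21", "RNF2", "SMCHD1",
     "SOX2", "SOX21", "SOX9", "NUP153"],
]
print(rand_index(column_1, column_2))
```

A value of 1 means the two partitions are identical. The plain Rand index is not corrected for chance, so the adjusted Rand index would be the better choice for comparisons across states with different cluster counts.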
Figure 5. Effect of chromatin states and co-binding partners on binding motifs. (A) De-novo motifs obtained using MEME for ASCL1 are similar to the consensus motif in both the Broad Promoter and Polycomb Repressed states, although the co-factors of ASCL1 differ between the two states. (B) De-novo motifs obtained using MEME for TCF3 differ between the two states, which have different co-factors. The motifs in the active state resemble the β-catenin/TCF/LEF motif, whereas the motifs in the repressed state resemble the E-Box consensus motif.