Wuming Gong, Naoko Koyano-Nakagawa, Tongbin Li, Daniel J Garry.
Abstract
BACKGROUND: Decoding the temporal control of gene expression patterns is key to understanding the complex mechanisms that govern developmental decisions during heart development. High-throughput methods have been employed to systematically study the dynamic and coordinated nature of cardiac differentiation at the global level with multiple dimensions. Therefore, there is a pressing need to develop a systems approach to integrate these data from individual studies and infer the dynamic regulatory networks in an unbiased fashion.
Year: 2015 PMID: 25887857 PMCID: PMC4359553 DOI: 10.1186/s12859-015-0460-0
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Overview of our two-step strategy to infer dynamic regulatory networks during cardiac differentiation. (A) Training a general logistic regression model to predict the probability of being bound by any transcription factor (LR score) in 40 kb cis-regions surrounding transcriptional start sites (TSS) of expressed genes. The response variables of the model indicate whether the hit of the PWM of a specific TF coincides with the peak region of the corresponding ChIP-seq data. 15 context-independent (e.g. conservation) and 54 context-dependent (e.g. mean intensity of H3K27ac in 1 kb surrounding any base) features were used to train the logistic regression model. (B) Context-dependent LR scores, temporal expression profiles and perturbation networks were used to infer the dynamic regulatory networks based on a time-varying dynamic Bayesian network model.
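As a rough illustration of what an LR score is, the sketch below applies a logistic function to a weighted sum of per-base features. It is not the authors' implementation: the function name `lr_score`, the three-feature vector, the weights, and the intercept are all hypothetical (the model described above uses 69 features with fitted coefficients).

```python
import math


def lr_score(features, weights, intercept=0.0):
    """Return the logistic-regression probability for one base.

    `features` and `weights` are equal-length sequences of floats.
    All values used with this function here are hypothetical.
    """
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    z = intercept + sum(w * x for w, x in zip(weights, features))
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


# Hypothetical example with three features instead of 69.
example_weights = [0.8, -0.5, 1.2]
example_features = [1.0, 0.0, 0.5]
example_score = lr_score(example_features, example_weights, intercept=-2.0)
```

A score near 1 would indicate a base that the model considers likely to be bound by some transcription factor; the threshold used later in Figure 2 (>0.1) is applied to scores of this kind.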
Transcription factor ChIP-seq datasets used to train the logistic regression model
| Transcription factor | Cell type |
|---|---|
| E2f1 | ESC |
| Esrrb | ESC |
| Gata4 | HL-1 |
| Klf4 | ESC |
| Mef2a | HL-1 |
| Myc | ESC |
| Mycn | ESC |
| Nanog | ESC |
| Nkx2-5 | HL-1 |
| Nr5a2 | ESC |
| Pou5f1 | ESC |
| Sox2 | ESC |
| Srf | HL-1 |
| Stat3 | ESC |
| Tbx5 | HL-1 |
| Tfcp2l1 | ESC |
| Zfx | ESC |
The 19 features based on sequence and nearby gene expression
| # | Feature | Sign | p value | Significance |
|---|---|---|---|---|
| 1 | PhastCons score for 60-way vertebrate alignments; 0 if not available | + | 1.82E-09 | *** |
| 2 | PhastCons score for placental mammal; 0 if not available | + | 7.76E-31 | *** |
| 3 | 1 if PhastCons vertebrate score is available and the score is 0; 0 otherwise | + | 1.55E-02 | * |
| 4 | 1 if PhastCons placental score is available and the score is 0; 0 otherwise | - | 4.53E-04 | *** |
| 5 | 1 if PhastCons vertebrate score is available; 0 otherwise | + | 7.24E-125 | *** |
| 6 | 1 if PhastCons placental score is available; 0 otherwise | + | 8.36E-01 | |
| 7 | 1 if base is in CpG islands; 0 otherwise | + | 1.03E-23 | *** |
| 8 | ln(x + 5), where x is the absolute number of base pairs to nearest RefSeq transcription start site | + | 1.69E-02 | * |
| 9 | 1 if base is part of repeat element based on RepeatMasker; 0 otherwise | - | 6.07E-02 | |
| 10 | 1 if base is part of a transcribed region of a RefSeq gene; 0 otherwise | - | 2.76E-07 | *** |
| 11 | 1 if base is between the start and end of the coding region of the gene; 0 otherwise | - | 3.15E-05 | *** |
| 12 | 1 if base is part of RefSeq exon; 0 otherwise | + | 7.47E-03 | ** |
| 13 | 1 if base is part of a RefSeq exon and within the coding region of the gene; 0 otherwise | - | 2.07E-105 | *** |
| 14 | 1 if base is part of a RefSeq intron; 0 otherwise | + | 2.19E-02 | * |
| 15 | Percentage of G or C base pairs of all bases within 50 bases in either direction | + | 0.00E+00 | *** |
| 16 | ln(x + 1), where x is the FPKM of nearest gene at time t | - | 1.20E-07 | *** |
| 17 | ln(x + 1), where x is the FPKM of nearest gene at time t + 1 | + | 3.81E-05 | *** |
| 18 | 1 if nearest gene is significantly up-regulated from t to t + 1; 0 otherwise | + | 2.05E-03 | ** |
| 19 | 1 if nearest gene is significantly down-regulated from t to t + 1; 0 otherwise | + | 5.77E-02 | |
*: 0.01 ≤ p value < 0.05; **: 0.001 ≤ p value < 0.01; ***: p value < 0.001.
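Three of the tabulated features (8, 15 and 16/17) are simple transforms that can be sketched directly. The function names below are hypothetical, and two details are assumptions not stated in the table: the GC window is taken to include the base itself, and it is truncated at sequence ends.

```python
import math


def tss_distance_feature(pos, tss_positions):
    """Feature 8: ln(x + 5), x = distance in bp to the nearest TSS."""
    x = min(abs(pos - tss) for tss in tss_positions)
    return math.log(x + 5)


def gc_percent_feature(seq, pos, flank=50):
    """Feature 15: percentage of G or C among bases within `flank`
    bases in either direction (assumed to include the base itself)."""
    start = max(0, pos - flank)
    window = seq[start:pos + flank + 1].upper()
    gc = sum(1 for base in window if base in "GC")
    return 100.0 * gc / len(window)


def expression_feature(fpkm):
    """Features 16 and 17: ln(x + 1), x = FPKM of the nearest gene."""
    return math.log(fpkm + 1)
```

The additive constants (5 and 1) keep the logarithm defined when the distance or the FPKM is zero.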
The 50 features based on ChIP-seq intensity of four histone marks (H3K27ac, H3K4me1, H3K4me3 and H3K27me3) and RNA polymerase II phosphorylation at serine 5 (RNAP)
| # | Mark | Feature | Sign | p value | Significance |
|---|---|---|---|---|---|
| 20 | H3K27ac | mean(x_t, 1000) | + | 4.40E-245 | *** |
| 21 | H3K27ac | mean(x_t, 500) | - | 7.63E-03 | ** |
| 22 | H3K27ac | mean(x_t, 100) | - | 1.59E-21 | *** |
| 23 | H3K27ac | mean(x_t, 50) | + | 2.41E-06 | *** |
| 24 | H3K27ac | mean(x_t, 10) | - | 2.09E-01 | |
| 25 | H3K27ac | mean(x_{t+1}, 1000) - mean(x_t, 1000) | - | 1.62E-04 | *** |
| 26 | H3K27ac | mean(x_{t+1}, 500) - mean(x_t, 500) | - | 9.25E-01 | |
| 27 | H3K27ac | mean(x_{t+1}, 100) - mean(x_t, 100) | - | 1.03E-03 | ** |
| 28 | H3K27ac | mean(x_{t+1}, 50) - mean(x_t, 50) | + | 6.40E-04 | *** |
| 29 | H3K27ac | mean(x_{t+1}, 10) - mean(x_t, 10) | - | 1.33E-01 | |
| 30 | H3K4me1 | mean(x_t, 1000) | + | 9.27E-01 | |
| 31 | H3K4me1 | mean(x_t, 500) | + | 2.04E-07 | *** |
| 32 | H3K4me1 | mean(x_t, 100) | + | 1.04E-01 | |
| 33 | H3K4me1 | mean(x_t, 50) | - | 8.89E-01 | |
| 34 | H3K4me1 | mean(x_t, 10) | - | 3.06E-01 | |
| 35 | H3K4me1 | mean(x_{t+1}, 1000) - mean(x_t, 1000) | - | 1.21E-56 | *** |
| 36 | H3K4me1 | mean(x_{t+1}, 500) - mean(x_t, 500) | + | 2.42E-17 | *** |
| 37 | H3K4me1 | mean(x_{t+1}, 100) - mean(x_t, 100) | + | 4.97E-05 | *** |
| 38 | H3K4me1 | mean(x_{t+1}, 50) - mean(x_t, 50) | - | 4.04E-01 | |
| 39 | H3K4me1 | mean(x_{t+1}, 10) - mean(x_t, 10) | + | 8.39E-01 | |
| 40 | H3K4me3 | mean(x_t, 1000) | - | 3.34E-91 | *** |
| 41 | H3K4me3 | mean(x_t, 500) | + | 2.34E-32 | *** |
| 42 | H3K4me3 | mean(x_t, 100) | + | 1.28E-02 | * |
| 43 | H3K4me3 | mean(x_t, 50) | - | 8.41E-01 | |
| 44 | H3K4me3 | mean(x_t, 10) | + | 7.33E-01 | |
| 45 | H3K4me3 | mean(x_{t+1}, 1000) - mean(x_t, 1000) | - | 3.73E-23 | *** |
| 46 | H3K4me3 | mean(x_{t+1}, 500) - mean(x_t, 500) | + | 3.82E-01 | |
| 47 | H3K4me3 | mean(x_{t+1}, 100) - mean(x_t, 100) | + | 3.15E-04 | *** |
| 48 | H3K4me3 | mean(x_{t+1}, 50) - mean(x_t, 50) | + | 9.68E-01 | |
| 49 | H3K4me3 | mean(x_{t+1}, 10) - mean(x_t, 10) | - | 7.65E-01 | |
| 50 | H3K27me3 | mean(x_t, 1000) | + | 1.61E-15 | *** |
| 51 | H3K27me3 | mean(x_t, 500) | - | 1.83E-27 | *** |
| 52 | H3K27me3 | mean(x_t, 100) | + | 7.01E-02 | |
| 53 | H3K27me3 | mean(x_t, 50) | - | 2.04E-01 | |
| 54 | H3K27me3 | mean(x_t, 10) | - | 8.98E-02 | |
| 55 | H3K27me3 | mean(x_{t+1}, 1000) - mean(x_t, 1000) | + | 1.77E-08 | *** |
| 56 | H3K27me3 | mean(x_{t+1}, 500) - mean(x_t, 500) | - | 9.39E-05 | *** |
| 57 | H3K27me3 | mean(x_{t+1}, 100) - mean(x_t, 100) | + | 2.15E-01 | |
| 58 | H3K27me3 | mean(x_{t+1}, 50) - mean(x_t, 50) | - | 3.05E-01 | |
| 59 | H3K27me3 | mean(x_{t+1}, 10) - mean(x_t, 10) | - | 4.10E-01 | |
| 60 | RNAP | mean(x_t, 1000) | - | 0.00E+00 | *** |
| 61 | RNAP | mean(x_t, 500) | + | 1.82E-105 | *** |
| 62 | RNAP | mean(x_t, 100) | + | 1.04E-13 | *** |
| 63 | RNAP | mean(x_t, 50) | - | 1.54E-01 | |
| 64 | RNAP | mean(x_t, 10) | + | 9.46E-01 | |
| 65 | RNAP | mean(x_{t+1}, 1000) - mean(x_t, 1000) | - | 6.44E-166 | *** |
| 66 | RNAP | mean(x_{t+1}, 500) - mean(x_t, 500) | + | 1.98E-31 | *** |
| 67 | RNAP | mean(x_{t+1}, 100) - mean(x_t, 100) | + | 5.17E-03 | ** |
| 68 | RNAP | mean(x_{t+1}, 50) - mean(x_t, 50) | - | 4.77E-01 | |
| 69 | RNAP | mean(x_{t+1}, 10) - mean(x_t, 10) | + | 1.96E-01 | |
*: 0.01 ≤ p value < 0.05; **: 0.001 ≤ p value < 0.01; ***: p value < 0.001.
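Each mark in the table contributes ten features: five windowed mean intensities at stage t and five changes in those means between t and t + 1. A minimal sketch follows, assuming that the second argument of mean(·, w) is the total window width in bp centred on the base, and that windows are truncated at the ends of the signal; neither detail is stated in the table. The names `window_mean`, `mark_features` and `WINDOWS` are hypothetical.

```python
WINDOWS = (1000, 500, 100, 50, 10)  # window widths in bp, as listed in the table


def window_mean(signal, pos, width):
    """Mean intensity in a window of about `width` bp centred on `pos`.

    `signal` is a per-base list of ChIP-seq intensities (hypothetical input).
    """
    half = width // 2
    start = max(0, pos - half)
    end = min(len(signal), pos + half + 1)
    values = signal[start:end]
    return sum(values) / len(values)


def mark_features(signal_t, signal_t1, pos, windows=WINDOWS):
    """Return the ten features for one mark at one base:
    five levels at stage t, then five changes from t to t + 1."""
    levels = [window_mean(signal_t, pos, w) for w in windows]
    changes = [
        window_mean(signal_t1, pos, w) - window_mean(signal_t, pos, w)
        for w in windows
    ]
    return levels + changes
```

Applying `mark_features` to the four histone marks and RNAP would give the 5 × 10 = 50 features numbered 20 to 69.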
Figure 2. Stage-specific transcription factor (TF) binding probability (LR score). (A) Performance of leave-one-TF-out cross-validation in predicting binding sites of 12 ESC TFs and 5 cardiac TFs, as measured by the area under the Receiver Operating Characteristic curve (AUC). (B) Distribution of the mean LR score of ESC transcriptional regulatory modules identified by functional identification of regulatory elements within accessible chromatin (FIREWACh) and the LR score of one million randomly selected bases in the cis-region [35]. (C) Distribution of the mean LR score of heart enhancers and the LR score of one million randomly selected bases in the respective cis-regions [34]. (D) Number of significantly enriched TFs in high LR score regions (>0.1) in the three stage transitions.
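The evaluation scheme in panel (A) can be sketched as follows: hold out all sites of one TF, train on the remaining TFs, and score the held-out sites by AUC. The code below shows only the splitting and the AUC computation, not the model fitting. The function names are hypothetical, and the pairwise AUC is a simple O(n²) formulation chosen for clarity rather than speed.

```python
def auc(scores, labels):
    """Area under the ROC curve as the fraction of (positive, negative)
    pairs ranked correctly; ties count as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def leave_one_tf_out(tf_labels):
    """Yield (held_out_tf, train_indices, test_indices) for each TF.

    `tf_labels[i]` names the TF that sample i belongs to.
    """
    for tf in sorted(set(tf_labels)):
        train = [i for i, t in enumerate(tf_labels) if t != tf]
        test = [i for i, t in enumerate(tf_labels) if t == tf]
        yield tf, train, test
```

With the 17 TFs listed in the ChIP-seq table, this would produce 17 folds, one AUC per held-out TF.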
Figure 3. Predicted transcription factor binding sites around the 40-kb cis-region of the gene. The cardiac regulatory region (−9435/−8922) has been reported by Lien et al. Brown bars indicate the presence of links and associated transcription factors at distinct stage transitions.
Figure 4. Inferring dynamic networks during cardiac differentiation by a time-varying dynamic Bayesian network model. (A) Number of inferred gene-gene regulatory relationships in ESC-MES, MES-CPs, and CPs-CMs transitions. (B) Number of predicted transcription factor binding sites that overlap with known ESC and heart enhancers on genes that have enhancers within the 2 kb, 10 kb or 40 kb regions surrounding their transcriptional start sites. (C) Predicted up- or down-regulated genes on computationally inducing Pou5f1 five-fold in ESCs compared with known up- or down-regulated genes on experimentally inducing Pou5f1. p-values were determined using Fisher's exact test.
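For reference, Fisher's exact test on a 2 × 2 table of predicted versus experimentally observed regulation can be computed from the hypergeometric distribution. The sketch below gives the one-sided (enrichment) p-value; whether the caption's test was one- or two-sided is not stated, so the sidedness here is an assumption, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the table [[a, b], [c, d]].

    Returns P(X >= a) under the hypergeometric null, where X is the
    top-left cell with all row and column totals held fixed.
    """
    row1 = a + b
    col1 = a + c
    n = a + b + c + d
    max_a = min(row1, col1)
    total = comb(n, col1)
    tail = sum(
        comb(row1, k) * comb(n - row1, col1 - k)
        for k in range(a, max_a + 1)
    )
    return tail / total
```

Here `a` would count genes both predicted and observed to be up-regulated, `b` and `c` the two kinds of disagreement, and `d` genes in neither set; a small p-value would indicate that the predicted and observed sets overlap more than expected by chance.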
Figure 5. Inferred dynamic regulatory networks of 94 genes during cardiac differentiation (ESCs-MES, MES-CPs and CPs-CMs). Expression levels for each gene were normalized to a mean of zero and a standard deviation of one.
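The per-gene normalization described in the caption is a z-score transform. A minimal sketch follows, assuming the population standard deviation is used (the caption does not say whether population or sample) and that a gene with constant expression maps to zeros; the function name is hypothetical.

```python
from statistics import fmean, pstdev


def zscore(values):
    """Scale one gene's expression profile to mean 0 and
    (population) standard deviation 1."""
    mu = fmean(values)
    sd = pstdev(values, mu)
    if sd == 0:
        # Constant profile: no variation to scale.
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]
```

For example, `zscore([1.0, 2.0, 3.0, 4.0])` would rescale a hypothetical four-stage expression profile so that genes with very different absolute expression levels can be compared on one colour scale.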