Bum-Kyu Lee, Akshay A Bhinge, Vishwanath R Iyer.
Abstract
The E2F family of transcription factors has important roles in cell cycle progression. E2F4 is an E2F family member that has been proposed to be primarily a repressor of transcription, but the scope of its binding activity and functions in transcriptional regulation is not fully known. We used ChIP sequencing (ChIP-seq) to identify around 16,000 E2F4 binding sites which potentially regulate 7346 downstream target genes with wide-ranging functions in DNA repair, cell cycle regulation, apoptosis, and other processes. While more than half of all E2F4 binding sites (56%) occurred near transcription start sites (TSSs), ∼20% of sites occurred more than 20 kb away from any annotated TSS. These distal sites showed histone modifications suggesting that E2F4 may function as a long-range regulator, which we confirmed by functional assays on a subset. Overexpression of E2F4 and its transcriptional cofactors of the retinoblastoma (Rb) family and its binding partner DP-1 revealed that E2F4 acts as an activator as well as a repressor. E2F4 binding sites also occurred near regulatory elements for miRNAs such as let-7a and mir-17, suggesting that E2F4 may regulate miRNAs. Taken together, our genome-wide analysis provided evidence of versatile roles of E2F4 and insights into its functions.
Year: 2011 PMID: 21247883 PMCID: PMC3089461 DOI: 10.1093/nar/gkq1313
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. E2F4 ChIP-seq reveals genome-wide E2F4 binding sites. (A) An example of a known E2F4 binding site that was identified in our ChIP-seq data. Chromosome coordinates are indicated on top. The plot in the middle shows the density of ChIP-seq reads, with the peak score indicated on the y-axis. The bottom track shows the CDC25C gene with coding regions, exons and introns indicated by thick boxes, thin boxes and lines, respectively. The arrows indicate the direction of transcription, from right to left. (B) An example of strong peaks discovered in both input and ChIP, likely due to copy number differences between the cell genome and the reference sequence. Such sites were removed by input correction (‘Materials and Methods’ section). (C) FDR calculation based on random simulations. The 1% FDR threshold was used for further analysis. (D) qPCR verification of 42 randomly selected targets identified by ChIP-seq. Blue diamonds represent targets which passed the 1% FDR ChIP-seq threshold, and red squares represent targets below this threshold. (E) Capture-recapture analysis to estimate saturation for E2F4 targets (see Methods). The x-axis represents −log10 FDR and the y-axis shows the saturation as a percentage of expected sites that were discovered at each FDR.
Figure 2. The genome-wide distribution pattern of E2F4 binding sites. (A) The correlation between E2F4 binding sites and gene density. Each point on the plot represents a 20 Mb bin. (B) A pie chart representation of the distribution of E2F4 binding sites in five different genomic regions. The genomic regions are defined as follows: core promoters are within ±2 kb of the TSS, upstream is from 2 to 20 kb upstream of the TSS, and intergenic is any region not included as a promoter, upstream region, intron or exon. (C) Distribution of E2F4 binding sites within ±10 kb of the TSS. Inset shows a close-up of a 1 kb region centered on the TSS. (D) A box-plot showing the ChIP-seq peak score distribution across the five genomic regions. Peak scores in core promoters were significantly higher than those in the other genomic regions (P < 5.5 × 10⁻¹⁵, Wilcoxon test with Bonferroni correction). (E) Distribution of E2F4 binding sites as a function of peak score. Even though the number of intergenic sites decreased with increasing score, a substantial proportion of intergenic sites (10%) still remained at a score of 10.
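The region definitions in Figure 2(B) can be sketched in code. The following is a minimal Python illustration, not the authors' pipeline: the `Gene` record, the `classify_site` function and the order in which the regions are tested are assumptions made for this example. Only the ±2 kb and 2 to 20 kb cut-offs come from the caption.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    # Hypothetical gene model used only for this illustration.
    strand: str                    # "+" or "-"
    tss: int                       # transcription start site coordinate
    end: int                       # transcript end coordinate
    exons: list[tuple[int, int]]   # inclusive (start, end) pairs, start <= end


def classify_site(pos: int, gene: Gene) -> str:
    """Assign a binding-site position to one of five genomic regions."""
    # Signed offset from the TSS in the direction of transcription:
    # negative values are upstream, positive values are downstream.
    offset = pos - gene.tss if gene.strand == "+" else gene.tss - pos

    # Core promoter: within +/-2 kb of the TSS.
    if abs(offset) <= 2_000:
        return "core promoter"
    # Upstream: from 2 to 20 kb upstream of the TSS.
    if -20_000 <= offset < -2_000:
        return "upstream"
    # Inside the transcript body: exon if it overlaps an exon, else intron.
    lo, hi = sorted((gene.tss, gene.end))
    if lo <= pos <= hi:
        in_exon = any(start <= pos <= stop for start, stop in gene.exons)
        return "exon" if in_exon else "intron"
    # Everything else is treated as intergenic.
    return "intergenic"
```

A site is checked against the promoter window first, so a position that lies in the first exon but within 2 kb of the TSS is counted as core promoter. That ordering is a choice made here for simplicity, and the paper may resolve such overlaps differently.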
Figure 3. Some E2F4-bound distal sites function as enhancers. (A) Relationship of histone enhancer marks with the 2857 intergenic E2F4 sites (a) or 1560 upstream E2F4 sites (b). Data for histone modifications indicative of enhancers were obtained from previous studies (17,35), assigned to E2F4 binding sites identified here and hierarchically clustered for display. The relative strength of the histone modification signal is indicated in the heat-map according to the color table. Sixty-eight percent of upstream and 36% of intergenic E2F4 sites contained at least one enhancer mark. (B) Luciferase reporter gene assays for 10 randomly chosen distal E2F4 binding sites. The y-axis represents the expression fold change of a luciferase reporter gene normalized to an empty-vector control. P-values were calculated using a t-test from three independent transfections. E1 through E10 represent the 10 enhancer candidates randomly selected from among distal E2F4 binding sites. ‘P’ represents a positive control enhancer selected based on a previously published study (64). Single and double asterisks indicate P < 0.005 and P < 0.001, respectively.
Functional categories of E2F4 target genes
| Biological functions | Count (Percent) | Fold enrichment | P-value | FDR |
|---|---|---|---|---|
| Biopolymer metabolic process | 2259 (34.81) | 1.36 | 6.32 | 1.21 |
| Nucleotide and nucleic acid metabolic process | 1736 (26.75) | 1.38 | 1.86 | 3.57 |
| Cell cycle | 490 (7.55) | 1.77 | 8.05 | 1.54 |
| Gene expression | 1499 (23.10) | 1.31 | 2.09 | 4.00 |
| RNA processing | 277 (4.27) | 1.96 | 1.55 | 2.97 |
| Organelle organization and biogenesis | 579 (8.92) | 1.57 | 3.36 | 6.42 |
| Response to DNA damage stimulus | 208 (3.21) | 2.07 | 7.92 | 1.51 |
| Biopolymer modification | 811 (12.50) | 1.4 | 1.06 | 2.03 |
| mRNA processing | 172 (2.65) | 2.14 | 3.14 | 6.01 |
| RNA splicing | 156 (2.40) | 2.22 | 3.23 | 6.18 |
| Protein modification process | 772 (11.90) | 1.38 | 1.98 | 3.78 |
| DNA repair | 171 (2.64) | 2.07 | 1.48 | 2.83 |
| Ubiquitin cycle | 278 (4.28) | 1.74 | 7.46 | 1.43 |
| Response to endogenous stimulus | 230 (3.54) | 1.84 | 1.38 | 2.64 |
| Macromolecule localization | 401 (6.18) | 1.54 | 2.09 | 4.00 |
| Protein transport | 345 (5.32) | 1.6 | 2.64 | 5.05 |
| Chromosome organization and biogenesis | 221 (3.41) | 1.81 | 3.61 | 6.91 |
| Post-translational protein modification | 653 (10.06) | 1.39 | 4.68 | 8.96 |
| Transcription | 1061 (16.35) | 1.27 | 1.67 | 3.19 |
| Protein localization | 373 (5.75) | 1.53 | 1.34 | 2.56 |
| DNA replication | 145 (2.23) | 1.87 | 1.42 | 2.73 |
| Apoptosis | 356 (5.49) | 1.47 | 3.15 | 6.02 |
| Chromatin modification | 118 (1.82) | 1.9 | 1.11 | 2.12 |
| Establishment and maintenance of chromatin | 165 (2.54) | 1.69 | 4.09 | 7.86 |
| DNA packaging | 166 (2.56) | 1.67 | 1.48 | 2.85 |
| Chromosome segregation | 50 (0.77) | 2.56 | 2.65 | 5.05 |
| Ribonucleoprotein complex biogenesis and assembly | 116 (1.79) | 1.85 | 2.84 | 5.44 |
| Protein targeting | 122 (1.88) | 1.8 | 7.12 | 1.36 |
| Cell development | 489 (7.54) | 1.27 | 1.16 | 2.21 |
| RNA localization | 58 (0.89) | 2.1 | 1.36 | 2.61 |
| Protein modification by small protein conjugation | 55 (0.85) | 2.14 | 1.73 | 3.30 |
| Sister chromatid segregation | 28 (0.43) | 2.82 | 6.29 | 1.20 |
| Protein ubiquitination | 51 (0.79) | 2.11 | 1.76 | 3.37 |
| Ubiquitin-dependent protein catabolic process | 98 (1.51) | 1.68 | 3.04 | 5.81 |
| Ribosome biogenesis and assembly | 55 (0.85) | 1.95 | 2.16 | 4.14 |
| Protein kinase cascade | 174 (2.68) | 1.43 | 2.96 | 5.67 |
| Response to stress | 416 (6.41) | 1.24 | 6.16 | 1.18 |
| Spindle organization and biogenesis | 19 (0.29) | 3.06 | 6.67 | 1.28 |
| Phosphate metabolic process | 399 (6.15) | 1.25 | 1.07 | 2.04 |
| Phosphorus metabolic process | 399 (6.15) | 1.25 | 1.07 | 2.04 |
| Protein folding | 127 (1.96) | 1.49 | 2.01 | 3.85 |
| DNA damage response, signal transduction | 33 (0.51) | 2.22 | 3.98 | 7.62 |
| Protein–RNA complex assembly | 62 (0.96) | 1.73 | 1.07 | 0.002 |
| Regulation of gene expression, epigenetic | 34 (0.52) | 2.11 | 1.43 | 0.002 |
| Chromatin assembly or disassembly | 73 (1.12) | 1.63 | 2.06 | 0.003 |
| Lipid biosynthetic process | 124 (1.91) | 1.44 | 2.09 | 0.004 |
| Microtubule organization and biogenesis | 17 (0.26) | 2.89 | 2.53 | 0.004 |
| Centrosome organization and biogenesis | 17 (0.26) | 2.89 | 2.53 | 0.004 |
| I-kappaB kinase/NF-kappaB cascade | 69 (1.06) | 1.63 | 3.98 | 0.007 |
Count represents the number of genes in the biological function category. Percent shows the proportion of E2F4 targets among the count. FDR is the false discovery rate. Functional categories were as defined by the online database DAVID (37). P-values and FDR were also calculated using the DAVID online tool, based on the list of E2F4 target genes identified in this study.
Figure 4. E2F4 motif analysis. (A) Enrichment of the indicated motifs over background is plotted on the y-axis, as a function of ChIP-seq peak score plotted on the x-axis. (B) Distribution of motifs around E2F4 binding sites identified by ChIP-seq. E2F4 motifs were mapped to E2F4 binding sites and the distance of the identified motif from the maximum of the binding site was plotted as a histogram. The y-axis shows the percentage of peaks that had an E2F4 motif within the specified distance shown on the x-axis. The figure indicates that the majority of E2F4 peaks had an E2F4 motif within 20 bp of the nucleotide that was designated as the binding site. (C) Frequency of motif occurrence in five different genomic regions. The heat-map shows the percentage distribution of E2F4 binding sites found in each genomic region for each of the six different E2F4 motifs used in this study. E2F4 motifs 1–5 were found predominantly in sites that mapped to the core promoter, whereas motif 6 was found at almost equal frequency in sites that mapped to the core promoter and intergenic regions. The color bar indicates the percentage of a motif in a given genomic region. For a given motif, the sum of the percentages across all five genomic regions is 100%. (D) Number of motifs discovered within E2F4 sites segregated by their ChIP-seq score. The density plot shows the relative frequency of sites on the y-axis containing each indicated number of motifs on the x-axis. Sites with stronger ChIP-seq scores had more motifs and, overall, E2F4 sites had approximately two motifs per site on average.
Motif usage of E2F4 within different biological pathways
| KEGG pathway terms | Motifs |
|---|---|
| Cell cycle | 1, 2, 3, 4, 5, 6 |
| Ubiquitin mediated proteolysis | 3, 4, 5 |
| Pyrimidine metabolism | 1, 3, 5 |
| DNA polymerase | 1, 3, 5 |
| p53 signaling pathway | 3, 5 |
| Chronic myeloid leukemia | 4 |
| | 3 |
| Biosynthesis of steroids | 2 |
Each number indicates one of six E2F4 motifs, assigned to a KEGG pathway category.
Figure 5. Overexpression of E2F4 and its cofactors (DP-1 and RBL2). (A) qPCR verification of the increase in mRNA of E2F4 and its cofactors. GAPDH was used as an internal control and the log-scaled y-axis shows the fold increase of the indicated mRNA relative to the empty vector control. (B) Western blotting confirming overexpression of E2F4 and its cofactors at the protein level. Empty vector was used as a control. (C) K-means clustering of E2F4 targets identified by ChIP-seq along with gene expression data obtained in four different overexpression conditions. The data plotted are the expression values relative to those of a vehicle transfection control. The significance value (X) obtained from error model analysis was used for the clustering. A significance value of 3.3 corresponds to a P-value of 0.001. ChIP scores were natural-log transformed.
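The stated correspondence between a significance value of 3.3 and a P-value of 0.001 can be checked with a short calculation. This is a sketch under an assumption that the caption does not confirm: that the significance value X behaves like a standard normal z-score and that the P-value is two-sided. The error model itself may define X differently, and `two_sided_p` is a name chosen for this illustration.

```python
import math


def two_sided_p(x: float) -> float:
    """Two-sided tail probability P(|Z| >= |x|) for a standard normal Z.

    Assumption for illustration: the significance value is z-like.
    """
    # For a standard normal Z, P(|Z| >= x) = erfc(x / sqrt(2)).
    return math.erfc(abs(x) / math.sqrt(2))


p = two_sided_p(3.3)
print(f"two-sided P for X = 3.3: {p:.5f}")
```

Under that assumption the result is close to 0.001, which is consistent with the threshold quoted in the caption.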
Number of up- or down-regulated genes after overexpression of E2F4 and its cofactors
| Overexpression | Up-regulated genes | Down-regulated genes |
|---|---|---|
| E2F4 | 167 | 128 |
| E2F4 + DP-1 | 314 | 341 |
| E2F4 + RBL2 | 105 | 171 |
| E2F4 + DP-1 + RBL2 | 228 | 281 |
P-values were calculated using an error model (24).
Figure 6. E2F4 can regulate miRNAs. (A) ChIP-seq data showing E2F4 binding within 10 kb upstream of the mir-17–92 cluster. The positions of the miRNAs are shown in red. The bottom track shows phylogenetic conservation across vertebrate species (Vertebrate Multiz Alignment & PhastCons Conservation: http://genome.ucsc.edu), with darker vertical bars indicating greater conservation. (B) qPCR verification of E2F4 binding sites upstream of the indicated miRNAs in lymphoblastoid and HeLa cells. (C) TaqMan qPCR data for miRNA expression upon overexpression of E2F4 and its cofactors. Different combinations of E2F4 overexpression with its cofactors caused a modest decrease in the expression of all three miRNAs. The data plotted are the log2 of the expression relative to RNU66, which served as the internal control.
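The quantity plotted in Figure 6(C), log2 expression relative to an internal control, is commonly derived from qPCR threshold cycle (Ct) values. The sketch below assumes the standard comparative Ct approach with perfect doubling per cycle; the caption does not state how the values were computed, and the function name and Ct values are hypothetical.

```python
def log2_relative_expression(ct_target: float, ct_reference: float) -> float:
    """log2 of target abundance relative to a reference, from Ct values.

    Assumes the product doubles every cycle, so that
    relative abundance = 2 ** (ct_reference - ct_target).
    """
    return ct_reference - ct_target


# Hypothetical Ct values, for illustration only (not data from the study).
control = log2_relative_expression(ct_target=26.0, ct_reference=22.0)
overexpressed = log2_relative_expression(ct_target=26.5, ct_reference=22.0)

# A negative difference means lower miRNA expression after overexpression.
change = overexpressed - control
print(control, overexpressed, change)
```

With these made-up values the miRNA needs half a cycle more to reach threshold after overexpression, which gives a log2 change of −0.5, the kind of modest decrease the caption describes.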