Annalaura Vacca, Masayoshi Itoh, Hideya Kawaji, Erik Arner, Timo Lassmann, Carsten O Daub, Piero Carninci, Alistair R R Forrest, Yoshihide Hayashizaki, Stuart Aitken, Colin A Semple.
Abstract
The promoters of immediate early genes (IEGs) are rapidly activated in response to an external stimulus. These genes, also known as primary response genes, have been identified in a range of cell types, under diverse extracellular signals and using varying experimental protocols. Whereas genomic dissection on a case-by-case basis has not resulted in a comprehensive catalogue of IEGs, a rigorous meta-analysis of eight genome-wide FANTOM5 CAGE (cap analysis of gene expression) time course datasets reveals successive waves of promoter activation in IEGs, recapitulating known relationships between cell types and stimuli: we obtain a set of 57 (42 protein-coding) candidate IEGs possessing promoters that consistently drive a rapid but transient increase in expression over time. These genes show significant enrichment for known IEGs reported previously, pathways associated with the immediate early response, and include a number of non-coding RNAs with roles in proliferation and differentiation. Surprisingly, we also find strong conservation of the ordering of activation for these genes, such that 77 pairwise promoter activation orderings are conserved. Using the leverage of comprehensive CAGE time series data across cell types, we also document the extensive alternative promoter usage by such genes, which is likely to have been a barrier to their discovery until now. The common activation ordering of the core set of early-responding genes we identify may indicate conserved underlying regulatory mechanisms. By contrast, the considerably larger number of transiently activated genes that are specific to each cell type and stimulus illustrates the breadth of the primary response.
Keywords: CAGE data; immediate early response; promoter activity; time series analysis
Year: 2018 PMID: 30089658 PMCID: PMC6119861 DOI: 10.1098/rsob.180011
Source DB: PubMed Journal: Open Biol ISSN: 2046-2441 Impact factor: 6.411
Figure 1. Time course datasets demonstrating the immediate early response. (a) Schematic of the eight time course datasets considered. Horizontal lines indicate the time span and symbols show the sampling times. Time zero corresponds to inactivated or quiescent cells in all cases. (b) The time course expression profile of FOS (i) and JUN (ii) in all eight datasets. CAGE cluster expression (mean TPM of three replicates) is plotted against time. (c) The extent to which the classification of a TSS as a peak is unique to one dataset (3515 TSSs) or shared between two or more datasets.
Figure 2. Broad trends in peak expression times across datasets. (a) Identification of the peak time parameter (tp) of FOS estimated from the PMDM_LPS time series (filled symbols indicate the median TPM; unfilled symbols are individual replicates; green lines represent tp and one standard deviation above and below). (b) Heatmap of the times of peak TSS expression (tp) for TSSs in the permissive set for all datasets. Heatmap colours reflect the tp for each CAGE TSS (within 100 min: dark green; 100–150 min: light green; 150–200 min: yellow; beyond 200 min: red). Known IEGs are indicated on the left by black cells.
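The colour scheme in panel (b) amounts to binning each TSS by its peak time. The following is a minimal sketch of that binning, not the authors' code. The function names are hypothetical, the treatment of boundary values (100, 150 and 200 min assigned to the earlier bin) is an assumption, and `naive_peak_time` is only a crude stand-in for the fitted tp parameter: it simply picks the sampling time with the highest median TPM. The expression values are invented for illustration.

```python
from statistics import median


def naive_peak_time(times, replicates):
    """Return the sampling time whose median TPM across replicates is highest.

    A crude stand-in for the fitted peak time parameter tp; the caption
    describes tp as estimated from a model, which this does not reproduce.
    """
    medians = [median(values) for values in replicates]
    return times[medians.index(max(medians))]


def tp_colour(tp_minutes):
    """Map a peak time in minutes to the heatmap colour described in the caption.

    Assumption: boundary values fall into the earlier bin.
    """
    if tp_minutes <= 100:
        return "dark green"
    if tp_minutes <= 150:
        return "light green"
    if tp_minutes <= 200:
        return "yellow"
    return "red"


# Invented TPM values for one TSS: three replicates at each of four time points.
times = [0, 30, 60, 120]
replicates = [[1, 2, 1], [5, 6, 4], [20, 18, 25], [3, 2, 4]]

peak = naive_peak_time(times, replicates)
colour = tp_colour(peak)
```

With these invented values the median TPM is highest at 60 min, so the TSS would fall in the earliest (dark green) bin.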
Figure 3. Promoter usage across time series datasets. For representative genes, bar charts show the number of datasets where each TSS peaks to illustrate the diversity of TSS usage and commonality of the peaking response. Known IEGs are shown in blue, TFs in yellow and other genes in green. FOSB has a single TSS that peaks in eight datasets, JUN has three TSSs each peaking in four or more datasets and XBP1 has six TSSs that peak in between one and six datasets.
Enrichment of known IEGs for genes classified to the peak model in multiple datasets. Enrichment (expressed as odds ratios) and p-values for genes classified across different numbers of time series datasets.
| shared datasets | no. genes | no. IEGs | no. CAGE TSSs (across eight datasets) | no. IEG CAGE TSSs (across eight datasets) | IEG enrichment OR | IEG enrichment p-value | no. CAGE TSSs (median) |
|---|---|---|---|---|---|---|---|
| 1–8 (all peaking genes) | 8785 | 204 | 102 496 | 913 | — | — | 1 |
| 2–8 | 5270 | 171 | 71 384 | 853 | 6.3 | 2.2 × 10⁻¹⁶ | 1 |
| 3–8 | 2882 | 128 | 45 360 | 751 | 5.9 | 2.2 × 10⁻¹⁶ | 2 |
| 4–8 | 1304 | 86 | 24 616 | 590 | 5.9 | 2.2 × 10⁻¹⁶ | 2 |
| 5–8 | 507 | 56 | 11 528 | 433 | 7.4 | 2.2 × 10⁻¹⁶ | 3 |
| 6–8 | 182 | 35 | 4896 | 299 | 10.3 | 2.2 × 10⁻¹⁶ | 3 |
| 7–8 | 42 | 13 | 1376 | 124 | 12.6 | 2.2 × 10⁻¹⁶ | 4 |
| 8 | 5 | 2 | 264 | 18 | 8.3 | 4.6 × 10⁻¹¹ | 5 |
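For readers unfamiliar with odds ratios, the sketch below shows how one is formed from a 2×2 table, using only counts that appear in the table above. It is an illustration, not a reconstruction of the published values: the reference gene set behind the reported ORs is not given in this record, so the comparison used here (genes peaking in two or more datasets versus genes peaking in exactly one) is an assumption and yields a different number from the 6.3 reported for the 2–8 row. The function name `odds_ratio` is hypothetical.

```python
def odds_ratio(hits_in, total_in, hits_out, total_out):
    """Odds of being a known IEG inside a gene set divided by the odds outside it."""
    odds_in = hits_in / (total_in - hits_in)
    odds_out = hits_out / (total_out - hits_out)
    return odds_in / odds_out


# Counts taken from the table rows "1–8 (all peaking genes)" and "2–8".
genes_all, iegs_all = 8785, 204
genes_multi, iegs_multi = 5270, 171

# Assumed comparison group: genes peaking in exactly one dataset,
# obtained by subtraction because the rows are nested.
genes_single = genes_all - genes_multi
iegs_single = iegs_all - iegs_multi

ratio = odds_ratio(iegs_multi, genes_multi, iegs_single, genes_single)
print(f"odds ratio (multi-dataset vs single-dataset genes): {ratio:.2f}")
```

Under this assumed comparison, known IEGs are roughly 3.5 times more likely (in odds terms) to be found among genes that peak in several datasets than among genes that peak in only one.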
Non-coding genes peaking in at least seven out of eight datasets. The short descriptions of molecular function are from the GeneCards database [15].
| gene ID | no. of shared datasets | description (PubMed ref.) |
|---|---|---|
| LINC00478 (MIR99AHG) | 7 | it has a role in cell proliferation and differentiation and it is considered a regulator of oncogenes in leukaemia (PMID: 25027842) |
| LINC00263 | 7 | regulation of oligodendrocyte maturation (PMID: 25575711) |
| LINC-PINT | 8 | putative tumour suppressor (PMID: 24070194) |
| LINC00963 | 7 | involved in the prostate cancer transition from androgen-dependent to androgen-independent and metastasis via the EGFR signalling pathway (PMID: 24691949) |
| LINC00476 | 8 | uncharacterized lincRNA |
| LINC00674 | 7 | uncharacterized lincRNA |
| STX18-AS1 | 7 | uncharacterized lincRNA |
| DLEU2 | 7 | critical host gene of the cell cycle inhibitory microRNAs miR-15a and miR-16-1 (PMID: 19591824) |
| MiR-29A | 7 | the expression of the miR-29 family has antifibrotic effects in heart, kidney and other organs; miR-29s have also been shown to induce apoptosis and regulate cell differentiation (PMID: 22214600) |
| MiR-3654 | 7 | involved in prostate cancer progression (PMID: 27297584) |
| MiR-21 | 7 | oncogenic potential (PMID: 18548003) |
| AL928646 | 7 | uncharacterized ncRNA |
| SCARNA17 | 7 | scaRNA involved in the maturation of other RNA molecules (PMID: 12032087) |
| SNORD65 | 7 | belongs to the small nucleolar RNAs, C/D family; involved in rRNA modification and alternative splicing (PMID: 26957605) |
| SNORD82 | 7 | belongs to the small nucleolar RNAs, C/D family; involved in rRNA modification and alternative splicing (PMID: 26957605) |
Figure 4. Conserved activation network. (a) Schematic profiles of two peaking genes, with temporal precedence indicated by the arrow. (b) Conserved temporal precedence between IEGs (light blue nodes), TFs (yellow nodes), ncRNAs (grey nodes) and other protein-coding genes (green nodes) is shown by directed edges. A subset of IEGs in this network are also TFs (FOS, KLF6, FOSB, BHLHE40, JUN and FOSL1).
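The idea of a conserved precedence edge can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the authors' procedure: here an edge from gene A to gene B is drawn when A peaks strictly before B in every dataset in which both peak, and both peak together in at least `min_datasets` datasets. The function name, the threshold, the placeholder gene names and the peak times are all hypothetical, and the sketch ignores the uncertainty on each peak time estimate.

```python
from itertools import permutations


def conserved_precedence(peak_times, min_datasets=2):
    """Return directed edges (a, b) where a peaks before b in every shared dataset.

    peak_times maps dataset name -> {gene: peak time in minutes}.
    """
    genes = sorted({gene for tps in peak_times.values() for gene in tps})
    edges = []
    for a, b in permutations(genes, 2):
        # Datasets in which both genes are classified as peaking.
        shared = [d for d, tps in peak_times.items() if a in tps and b in tps]
        if len(shared) >= min_datasets and all(
            peak_times[d][a] < peak_times[d][b] for d in shared
        ):
            edges.append((a, b))
    return edges


# Hypothetical peak times (minutes) for placeholder genes; not values from the paper.
peak_times = {
    "dataset_1": {"geneA": 45, "geneB": 60, "geneC": 150},
    "dataset_2": {"geneA": 40, "geneB": 75, "geneC": 120},
    "dataset_3": {"geneA": 50, "geneB": 200, "geneC": 180},
}

edges = conserved_precedence(peak_times)
```

In this toy input geneA precedes both other genes in all three datasets, so two edges leave it, while geneB and geneC swap order in the third dataset and therefore share no edge.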
Figure 5. Transcriptional dynamics of genes classified to the peak model. Scatterplots of log fold change against the time of peaking for selected genes, with conserved temporal precedence indicated by arrows for (a) PMDM_LPS and (b) MCF7_EGF1. FOS peaks earliest and has many conserved temporal relations to later peaking genes, while EHD1 peaks late and has many conserved temporal orderings with earlier peaking genes.