Ying Sheng, Christopher Previti.
Abstract
BACKGROUND: Recent functional studies have demonstrated that many microRNAs (miRNAs) are expressed by RNA polymerase II in a specific spatiotemporal manner during the development of organisms and play a key role in cell-lineage decisions and morphogenesis. They are therefore functionally related to a number of key protein-coding developmental genes that form genomic regulatory blocks (GRBs) with arrays of highly conserved non-coding elements (HCNEs). These HCNEs function as long-range enhancers that collaboratively regulate the expression of their target genes. Given this functional similarity, as well as recent zebrafish transgenesis assays showing that the miR-9 family is indeed regulated by HCNEs with enhancer activity, we hypothesized that this type of miRNA regulation is prevalent. In this paper, we therefore systematically investigate the regulatory landscape around conserved self-transcribed miRNAs (ST miRNAs), which have their own known or computationally inferred promoters, by analyzing the hallmarks of GRB target genes. These include not only the density of HCNEs in their vicinity but also the presence of large CpG islands (CGIs) and distinct patterns of histone modification marks associated with developmental genes.
Year: 2011 PMID: 21619633 PMCID: PMC3123655 DOI: 10.1186/1471-2164-12-270
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Density distribution of distances between predicted TSSs/TESs and human pre-miRNAs. The distance distributions for TSSs and TESs are indicated by the red and blue curves, respectively; the corresponding dashed lines indicate the cutoffs used to define the TSSs and TESs in the analysis.
HCNE density comparison
| Lineage comparison | Human conserved ST miRNAs vs. human random coding regions | | Human conserved ST miRNAs vs. human random non-coding regions | | Human conserved ST miRNAs vs. human GRB target genes | |
|---|---|---|---|---|---|---|
| human: mouse | 0 | 0 | 0 | 0 | 0 | 0 |
| human: dog | 0 | 0 | 0 | 0 | 0 | 0 |
| human: opossum | 0 | 0 | 1.0e-04 | 0 | 0 | 0 |
| human: platypus | 0 | 0 | 0 | 2.0e-04 | 0 | 0 |
| human: chicken | 0 | 0 | 0 | 0 | 0 | 0 |
| human: frog | 0 | 0 | 0 | 0 | 0 | 0 |
| human: zebrafish | 4.0e-03 | 8.0e-03 | 1.8e-03 | 4.0e-04 | 0 | 0 |
The comparisons represented by each column were performed after selecting HCNEs and ST miRNAs conserved between the lineages shown in the left column (see Methods). The p-values were computed using the two-sided bootstrapped version of the Kolmogorov-Smirnov test. All p-values lower than 1.0e-20 were set to 0.
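The testing procedure described in this note can be illustrated with a minimal sketch. It assumes a pooled-resampling scheme (both bootstrap samples are drawn with replacement from the combined data), which is one common way to bootstrap a two-sided Kolmogorov-Smirnov test; the exact resampling scheme, the number of replicates and the function names `ks_statistic`, `bootstrap_ks_pvalue` and `report_pvalue` are assumptions made here for illustration, not details taken from the paper.

```python
import bisect
import random


def ks_statistic(a, b):
    """Two-sided KS statistic: largest gap between the two empirical CDFs."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    d = 0.0
    # The gap between two step functions can only change at observed values.
    for x in a_sorted + b_sorted:
        cdf_a = bisect.bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect.bisect_right(b_sorted, x) / len(b_sorted)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def bootstrap_ks_pvalue(a, b, n_boot=2000, seed=0):
    """Share of pooled bootstrap replicates whose KS statistic is at least
    as large as the observed one (assumed resampling scheme)."""
    rng = random.Random(seed)
    observed = ks_statistic(a, b)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_boot):
        boot_a = rng.choices(pooled, k=len(a))  # sampling with replacement
        boot_b = rng.choices(pooled, k=len(b))
        if ks_statistic(boot_a, boot_b) >= observed:
            hits += 1
    return hits / n_boot


def report_pvalue(p, floor=1.0e-20):
    """Mirror the table convention: p-values below 1.0e-20 are reported as 0."""
    return 0.0 if p < floor else p


if __name__ == "__main__":
    # Hypothetical HCNE densities (percent of window), for illustration only.
    mirna_windows = [2.1, 3.4, 1.8, 4.0, 2.9, 3.3, 2.5, 3.8]
    control_windows = [0.2, 0.5, 0.1, 0.9, 0.4, 0.3, 0.6, 0.7]
    print(report_pvalue(bootstrap_ks_pvalue(mirna_windows, control_windows)))
```

With a resampling-based p-value, the smallest non-zero value that can be observed is 1 divided by the number of replicates, so the reporting floor only matters when the p-value comes from an analytical approximation or a very large number of replicates.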
Figure 2. The enrichment of HCNEs around conserved human ST miRNAs (including ST miRNAs overlapping with GRBs). The figure shows the cumulative curves of HCNE density in five lineage comparisons. The lineages compared are indicated at the top of each panel. The HCNE density was calculated based on a 300 kb window centered on a region of interest, which is either an ST miRNA, a randomly selected coding or non-coding region (control sets) or a GRB target gene. The x-axis shows the percentage of base pairs in HCNEs within the 300 kb window (HCNE density). The y-axis shows the fraction of the 300 kb windows analyzed. The red curve shows the HCNE density of the conserved human ST miRNAs, while the grey, blue and green curves show the HCNE density of the non-coding and protein-coding control sets as well as the set of GRB target genes, respectively. Conserved human ST miRNAs are therefore more likely to be located in regions with higher HCNE density than would be expected by chance.
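The HCNE density measure used in this figure (percentage of base pairs covered by HCNEs within a 300 kb window centered on a region of interest) can be sketched as follows. This is an illustrative sketch only: the half-open `(start, end)` interval convention, the merging of overlapping HCNEs, and the names `hcne_density` and `cumulative_fraction` are assumptions, and chromosome boundaries are ignored for brevity.

```python
WINDOW = 300_000  # window size in base pairs, as stated in the caption


def hcne_density(center, hcnes, window=WINDOW):
    """Percentage of base pairs in the window centered on `center` that lie in HCNEs.

    `hcnes` is a list of half-open (start, end) intervals on the same
    chromosome; overlapping elements are merged so no base is counted twice.
    """
    win_start = center - window // 2
    win_end = win_start + window
    # Keep only HCNEs that intersect the window, clipped to its borders.
    clipped = sorted(
        (max(s, win_start), min(e, win_end))
        for s, e in hcnes
        if s < win_end and e > win_start
    )
    covered = 0
    cur_start = cur_end = None
    for s, e in clipped:
        if cur_end is None or s > cur_end:
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = s, e
        else:
            cur_end = max(cur_end, e)
    if cur_end is not None:
        covered += cur_end - cur_start
    return 100.0 * covered / window


def cumulative_fraction(densities, threshold):
    """One point of a cumulative curve: share of windows with density <= threshold."""
    return sum(d <= threshold for d in densities) / len(densities)


if __name__ == "__main__":
    # Hypothetical coordinates, for illustration only.
    hcnes = [(900_000, 903_000), (902_000, 904_000), (2_000_000, 2_001_000)]
    print(hcne_density(1_000_000, hcnes))
```

Evaluating `cumulative_fraction` over a grid of thresholds for each set of windows (ST miRNAs, control regions, GRB target genes) gives curves of the kind described in the caption.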
Figure 3. The enrichment of p300 binding sites around mouse orthologs of human ST miRNAs. The figure shows the cumulative curves of the enhancer enrichment analysis using all p300 binding sites (A) and using only p300 binding sites that do not overlap HCNEs conserved between human and mouse (percentage of identity ≥ 98% and length of HCNE ≥ 50 bp) (B). These results indicate that the mouse orthologs of human ST miRNAs are more likely to be located in regions with significantly higher p300 binding site density than the control sets of protein-coding and non-coding regions.
Comparison of CpG-to-gene ratios between different gene sets
| | HCNE enriched miRNAs | HCNE poor miRNAs | Known GRB target genes | Known bystander genes | Other transcription factors | Other CpG island genes |
|---|---|---|---|---|---|---|
| Median CpG-to-gene ratio | 0.1703 | 0.0238 | 0.2032 | 0.0100 | 0.0339 | 0.0280 |
| HCNE enriched miRNAs (p-value) | - | 0.2931 | 0.1185 | | | |
| HCNE poor miRNAs (p-value) | 0.2931 | - | 0.1846 | 0.0966 | | |
Significant differences in the median CpG-to-gene ratio between the gene sets were determined using the two-sided bootstrapped version of the Kolmogorov-Smirnov test. Significant p-values (<0.05) are shown in bold. The first row indicates the median CpG-to-gene ratio for each gene set and the second and third rows contain the p-values of the comparisons between the CpG-to-gene ratios of the HCNE enriched or HCNE poor ST miRNAs and other control gene sets.
Figure 4. Case study of the miR-9 family. UCSC Genome Browser screenshots of the miRNAs hsa-mir-9-1 (A), hsa-mir-9-2 (B) and hsa-mir-9-3 (C) as well as their orthologs in the mouse genome. The screenshots of the human genome display information on CGIs, neighboring protein-coding genes and the level of HCNE density in different lineage comparisons. The screenshots of the mouse orthologs display information regarding bivalent domains (marked by rectangles). The color of each rectangle indicates the cell type in which the bivalent domain was detected.
Annotation of ST miRNA candidates under putative long-range developmental regulation
| Name | Intergenic/Intragenic | Bivalent promoter (cell type)¹ | Further GRB target genes² | CpG islands³ |
|---|---|---|---|---|
| | Intergenic | mES, MEF | HOXA cluster | 0 |
| | Intergenic | hES, mES | - | 1 |
| | Intergenic | MEF | HOXC cluster | 2 |
| | Intergenic | hES, mES, MEF | HOXB cluster | 1 |
| | Intergenic | hES | HOXB cluster | 6 |
| | Intergenic | hES | - | 2 |
| | Intergenic | hES, mES | | 2 |
| | Intergenic | hES, mES | - | 3 |
| | Intergenic | hES, mES | - | 3 |
| | Intergenic | hES, mES | - | 1 |
| | Intergenic | hES | - | 1 |
| | Intergenic | hES, mES | - | 3 |
| | Intergenic | hES, mES | - | 3 |
| | Intergenic | hES, mES, MEF | - | 2 |
| | Intergenic | hES, mES, MEF | - | 3 |
| | Intergenic | hES, mES, MEF | - | 3 |
| | Intergenic | hES | | 0 |
| | Intergenic | hES | - | 0 |
| | Intergenic | mES, MEF | - | 3 |
| | Intergenic | MEF | - | 1 |
| | Intergenic | hES, MEF | - | 4 |
| | Intron of | mES | | - |
| | Intron of | mES | | - |
| | Intergenic | mES, MEF | - | 4 |
| | Intron of | mES | - | - |
| | Intergenic | mES | | 3 |
List of conserved human ST miRNAs that have been annotated as targets of putative long-range regulation during development and differentiation. The mir-9 family of miRNAs is highlighted since it contains known examples of GRB target miRNAs that were captured using our two prediction features: 1) localization in regions of high HCNE density and 2) association with a bivalent promoter.
¹ The cell type in which the promoter of the miRNA is predicted to be a bivalent promoter. hES and mES represent embryonic stem cells of human and mouse, respectively. MEFs are mouse embryonic fibroblasts and NPCs are mouse neural progenitor cells.
² Further annotated GRB target genes within the 300 kb region that was analyzed. The annotation of the GRB target genes was performed as described in the Methods section.
³ Number of CpG islands overlapping with pri-miRNAs. The pri-miRNA is defined as the region 50 kb upstream to 20 kb downstream of the pre-miRNA. If the defined pri-miRNA overlapped known protein-coding genes (the gene itself plus 1 kb up- and downstream of it), it was truncated to exclude the overlapping gene (see Methods). We did not count CGIs for intragenic miRNAs (denoted as '-').
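The pri-miRNA definition and CGI count in the last note can be sketched as below. The sketch makes several assumptions that are not stated in the text: intervals are half-open `(start, end)` pairs on one chromosome, "upstream" follows the miRNA strand, truncation cuts the region at the nearest edge of each padded gene, and a pre-miRNA that falls inside a padded gene is treated as intragenic. The function names `pri_mirna_region` and `count_cgis` are hypothetical.

```python
UPSTREAM = 50_000    # bp upstream of the pre-miRNA
DOWNSTREAM = 20_000  # bp downstream of the pre-miRNA
GENE_PAD = 1_000     # bp added on both sides of each protein-coding gene


def pri_mirna_region(pre_start, pre_end, strand, genes):
    """Return the (start, end) of the assumed pri-miRNA region, or None if intragenic."""
    if strand == "+":
        start, end = pre_start - UPSTREAM, pre_end + DOWNSTREAM
    else:
        start, end = pre_start - DOWNSTREAM, pre_end + UPSTREAM
    start = max(start, 0)
    for g_start, g_end in genes:
        g_start, g_end = g_start - GENE_PAD, g_end + GENE_PAD
        if g_end <= pre_start:
            # Padded gene lies to the left: cut the region at its right edge.
            start = max(start, g_end)
        elif g_start >= pre_end:
            # Padded gene lies to the right: cut the region at its left edge.
            end = min(end, g_start)
        else:
            # Pre-miRNA overlaps the padded gene: treated as intragenic here.
            return None
    return start, end


def count_cgis(region, cgis):
    """Number of CpG islands overlapping the region; None mirrors the '-' entries."""
    if region is None:
        return None
    start, end = region
    return sum(1 for c_start, c_end in cgis if c_start < end and c_end > start)


if __name__ == "__main__":
    # Hypothetical coordinates, for illustration only.
    region = pri_mirna_region(100_000, 100_100, "+", [(60_000, 70_000)])
    print(region, count_cgis(region, [(70_000, 71_500), (90_000, 91_000)]))
```

Returning `None` for intragenic miRNAs keeps "not counted" distinct from a genuine count of zero, matching the distinction the table draws between '-' and 0.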