Yuming Zhao1, Fang Wang2, Liran Juan2.
Abstract
A microRNA (miRNA) is a small noncoding RNA molecule that functions in RNA silencing and posttranscriptional regulation of gene expression. Understanding how microRNA genes are activated requires precise annotation of the promoter regions that drive their expression. Only a fraction of microRNA genes have confirmed transcription start sites (TSSs), which hinders our understanding of transcription factor binding events. With the development of next-generation sequencing technology, chromatin states can be inferred precisely from combinations of specific histone modifications. Using genome-wide profiles of nine histone markers (H3K4me2, H3K4me3, H3K9Ac, H3K9me2, H3K18Ac, H3K27me1, H3K27me3, H3K36me2, and H3K36me3), we developed a computational strategy to identify the promoter regions of most microRNA genes in Arabidopsis, based on the assumption that the distribution of histone markers around the TSSs of microRNA genes is similar to that around the TSSs of protein-coding genes. Among 298 miRNA genes, our model identified 42 independent miRNA TSSs and 132 miRNA TSSs that are located in the promoters of upstream genes. Identifying these promoters will provide a better understanding of microRNA regulation and can play an important role in the study of diseases at the genetic level.
Year: 2015 PMID: 26425556 PMCID: PMC4573627 DOI: 10.1155/2015/861402
Source DB: PubMed Journal: Biomed Res Int Impact factor: 3.411
Figure 1. The distribution of histone markers around the TSSs of protein-coding genes. (a) ChIP-Seq-derived histone modification patterns around the TSSs of protein-coding genes in Arabidopsis. The RPM distributions of the nine histone markers (H3K4me2, H3K4me3, H3K9Ac, H3K9me2, H3K18Ac, H3K27me1, H3K27me3, H3K36me2, and H3K36me3) are shown in different colors. (b) The ChIP-Seq-derived H3K4me3 pattern around the TSSs of protein-coding genes (red curve) and around random genomic regions (blue curve) in Arabidopsis.
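The profiles in Figure 1 are read densities expressed in RPM (reads per million mapped reads) in windows around a TSS. The following is a minimal sketch of how such a binned profile could be computed for a single TSS. It is not the authors' pipeline: the function name `rpm_profile`, the 100 bp default bin size, the 20-bin default flank, and the toy read positions are all assumptions made for illustration.

```python
def rpm_profile(read_positions, tss, total_reads, bin_size=100, n_bins=20):
    """Return reads-per-million for 2 * n_bins bins centred on `tss`.

    bin_size and n_bins are illustrative defaults, not values confirmed
    by the source. Bins cover [tss - n_bins*bin_size, tss + n_bins*bin_size).
    """
    counts = [0] * (2 * n_bins)
    window_start = tss - n_bins * bin_size
    for pos in read_positions:
        idx = (pos - window_start) // bin_size
        if 0 <= idx < 2 * n_bins:  # ignore reads outside the window
            counts[idx] += 1
    scale = 1_000_000 / total_reads  # normalise by library size
    return [c * scale for c in counts]


# Toy example (hypothetical reads): two bins on each side of a TSS at 1000.
profile = rpm_profile([900, 1000, 1050, 1150], tss=1000,
                      total_reads=2_000_000, bin_size=100, n_bins=2)
print(profile)
```

Averaging such per-TSS profiles over all protein-coding genes would give one curve per histone marker, which is the kind of summary panel (a) describes.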
Figure 2. ROC curves for TSS prediction of protein-coding genes with different histone markers. (a) to (i) ROC curves showing the sensitivity and specificity of TSS prediction for protein-coding genes with each histone marker. For each histone marker, the ROC curve was calculated within four different ranges around the TSS. For example, the red curve represents the ROC curve calculated within 20 bins upstream and 20 bins downstream of the TSS. The area under the curve (AUC) for each range around the TSS is shown in each graph. (j) The ROC curve for TSS prediction of protein-coding genes with the nine histone markers combined.
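The AUC values in Figure 2 can be interpreted as the probability that a true TSS region receives a higher score than a non-TSS region. The sketch below computes AUC directly from that pairwise definition. It assumes each region has already been reduced to a single numeric score; the function name `roc_auc` and the toy scores are hypothetical, and the source does not say how the authors computed their curves.

```python
def roc_auc(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. This is the pairwise
    (Mann-Whitney) form of the area under the ROC curve.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one positive and one negative score")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical scores: true TSS regions vs. random genomic regions.
tss_scores = [0.9, 0.8, 0.4]
random_scores = [0.5, 0.3, 0.1]
print(roc_auc(tss_scores, random_scores))
```

An AUC of 0.5 corresponds to a marker that cannot separate TSSs from background, and 1.0 to perfect separation. The quadratic loop is adequate for a sketch; a rank-based computation scales better to genome-wide region sets.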
Figure 3. Features of predicted miRNA TSSs. (a) Histogram of the distances between the 298 miRNAs and their upstream genes. (b) The number of identified miRNA TSSs. The blue sector represents the 42 independent miRNA TSSs, the green sector represents the 132 predicted miRNA TSSs located at the same position as the TSS of their upstream genes, and the red sector represents the 124 miRNAs that have no predicted TSS. (c) The distances between the predicted independent miRNA TSSs and their corresponding miRNAs.
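As a quick consistency check on panel (b), the three sectors should sum to the 298 miRNA genes analyzed. The short sketch below verifies this and converts the counts to percentages. The counts come from the caption; the dictionary keys are labels chosen here for illustration.

```python
# Counts taken from the Figure 3(b) caption; key names are illustrative.
sectors = {
    "independent TSS": 42,
    "shared with upstream gene TSS": 132,
    "no predicted TSS": 124,
}

total = sum(sectors.values())
percentages = {name: round(100 * n / total, 1) for name, n in sectors.items()}

print(total)
for name, pct in percentages.items():
    print(f"{name}: {pct}%")
```

The first two sectors together give 174 of 298 miRNA genes with a predicted TSS, which is consistent with the abstract's statement that promoters were identified for most microRNA genes.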
42 independent predicted miRNA transcription start sites.
| Index | miRNA ID | miRNA name | Genome coordinates (strand) | Predicted TSS region |
|---|---|---|---|---|
| 1 | MI0000989 | ath-MIR171b | chr1:3961348–3961464(−) | 3961764–3961864 |
| 2 | MI0005386 | ath-MIR830 | chr1:4820355–4820549(−) | 4820549–4820649 |
| 3 | MI0000218 | ath-MIR159b | chr1:6220646–6220841(+) | 6220446–6220546 |
| 4 | MI0001005 | ath-MIR394a | chr1:7058194–7058310(+) | 7055994–7056094 |
| 5 | MI0019201 | ath-MIR5630a | chr1:12011152–12011223(−) | 12011523–12011623 |
| 6 | MI0019211 | ath-MIR5630b | chr1:12023526–12023597(−) | 12023997–12024097 |
| 7 | MI0000193 | ath-MIR161 | chr1:17825685–17825857(+) | 17825485–17825585 |
| 8 | MI0019208 | ath-MIR5636 | chr1:18549959–18550036(+) | 18549659–18549759 |
| 9 | MI0001078 | ath-MIR406 | chr1:19430078–19430277(−) | 19431177–19431277 |
| 10 | MI0019235 | ath-MIR5652 | chr1:23412989–23413436(−) | 23413636–23413736 |
| 11 | MI0000196 | ath-MIR163 | chr1:24884066–24884396(+) | 24883966–24884066 |
| 12 | MI0001425 | ath-MIR414 | chr1:25137456–25137563(−) | 25137763–25137863 |
| 13 | MI0000189 | ath-MIR159a | chr1:27713233–27713416(−) | 27713616–27713716 |
| 14 | MI0015817 | ath-MIR4228 | chr1:28889375–28889532(+) | 28889175–28889275 |
| 15 | MI0005105 | ath-MIR775 | chr1:29422452–29422574(+) | 29422052–29422152 |
| 16 | MI0001013 | ath-MIR396a | chr2:4142323–4142473(−) | 4142673–4142773 |
| 17 | MI0005109 | ath-MIR779 | chr2:9560761–9560923(+) | 9560161–9560261 |
| 18 | MI0020189 | ath-MIR5995b | chr2:10026910–10027050(+) | 10026310–10026410 |
| 19 | MI0020188 | ath-MIR5595a | chr2:10026910–10027050(−) | 10027050–10027150 |
| 20 | MI0000178 | ath-MIR156a | chr2:10676451–10676573(−) | 10676673–10676773 |
| 21 | MI0000215 | ath-MIR172a | chr2:11942914–11943015(−) | 11943215–11943315 |
| 22 | MI0017889 | ath-MIR5021 | chr2:11974711–11974881(−) | 11975181–11975281 |
| 23 | MI0000201 | ath-MIR166a | chr2:19176108–19176277(+) | 19176008–19176108 |
| 24 | MI0001072 | ath-MIR403 | chr2:19415052–19415186(+) | 19414952–19415052 |
| 25 | MI0000208 | ath-MIR167a | chr3:8108072–8108209(+) | 8107972–8108072 |
| 26 | MI0005383 | ath-MIR827 | chr3:22122760–22122936(−) | 22123036–22123136 |
| 27 | MI0000202 | ath-MIR166b | chr3:22922206–22922325(+) | 22921906–22922006 |
| 28 | MI0002407 | ath-MIR447a | chr4:1528134–1528370(−) | 1529270–1529370 |
| 29 | MI0017896 | ath-MIR5026 | chr4:7844496–7844688(+) | 7842896–7842996 |
| 30 | MI0005405 | ath-MIR850 | chr4:7845707–7845927(+) | 7842907–7843007 |
| 31 | MI0005440 | ath-MIR863 | chr4:7846597–7846899(+) | 7842897–7842997 |
| 32 | MI0015815 | ath-MIR4221 | chr4:8460516–8460662(+) | 8459516–8459616 |
| 33 | MI0000210 | ath-MIR168a | chr4:10578635–10578772(+) | 10578335–10578435 |
| 34 | MI0000180 | ath-MIR156c | chr4:15415418–15415521(−) | 15415821–15415921 |
| 35 | MI0019242 | ath-MIR5658 | chr4:18485438–18485531(−) | 18486431–18486531 |
| 36 | MI0000198 | ath-MIR164b | chr5:287584–287736(+) | 287484–287584 |
| 37 | MI0000216 | ath-MIR172b | chr5:1188207–1188301(−) | 1188501–1188601 |
| 38 | MI0000195 | ath-MIR162b | chr5:7740598–7740708(−) | 7740908–7741008 |
| 39 | MI0019216 | ath-MIR5643a | chr5:11667797–11667879(+) | 11667197–11667297 |
| 40 | MI0001014 | ath-MIR396b | chr5:13611798–13611932(+) | 13611698–13611798 |
| 41 | MI0000211 | ath-MIR168b | chr5:18358788–18358911(−) | 18359011–18359111 |
| 42 | MI0001075 | ath-MIR405b | chr5:20632514–20632637(+) | 20630514–20630614 |
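The table lists each miRNA locus with its strand and a predicted TSS interval, so the upstream distance summarized in Figure 3(c) can be recomputed row by row. The sketch below parses the coordinate strings as they appear in the table, including the en dash and the Unicode minus sign. The distance definition used here, the gap between the 5' end of the miRNA locus and the nearest edge of the TSS interval, is an assumption for illustration and may differ from the authors' definition. The helper names are hypothetical.

```python
import re

# Matches e.g. "chr1:3961348–3961464(−)"; accepts en dash or hyphen as the
# range separator and "+", ASCII "-" or Unicode minus as the strand.
LOCUS = re.compile(r"chr(\w+):(\d+)[–-](\d+)\(([+−-])\)")


def parse_locus(text):
    """Return (chromosome, start, end, strand) from a table locus string."""
    m = LOCUS.fullmatch(text.strip())
    if m is None:
        raise ValueError(f"unrecognised locus: {text!r}")
    chrom, start, end, strand = m.groups()
    return chrom, int(start), int(end), "+" if strand == "+" else "-"


def parse_interval(text):
    """Return (start, end) from a string such as "3961764–3961864"."""
    start, end = re.split(r"[–-]", text.strip())
    return int(start), int(end)


def upstream_gap(locus, tss):
    """Gap in bp between the miRNA 5' end and the nearest TSS-interval edge.

    Assumed definition: on the + strand the 5' end is the locus start,
    on the - strand it is the locus end.
    """
    _, start, end, strand = parse_locus(locus)
    tss_start, tss_end = parse_interval(tss)
    if strand == "+":
        return start - tss_end
    return tss_start - end


# Rows 1 and 3 of the table above.
print(upstream_gap("chr1:3961348–3961464(−)", "3961764–3961864"))
print(upstream_gap("chr1:6220646–6220841(+)", "6220446–6220546"))
```

Under this definition, a gap of 0 (as for ath-MIR830 in row 2) means the predicted TSS interval directly abuts the annotated locus.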
Figure 4. Histone patterns at predicted miRNA TSSs and upstream gene TSSs. (a) An example of an independent predicted miRNA TSS. The first peak marks the position of the upstream gene TSS and the second peak marks the position of the predicted miRNA TSS. (b) An example of a predicted miRNA TSS at the same position as the TSS of its upstream gene. Different histone markers are shown in different colors.
Figure 5. The overlap of 16 microRNA TSSs identified by all three methods.