David L Corcoran, Kusum V Pandit, Ben Gordon, Arindam Bhattacharjee, Naftali Kaminski, Panayiotis V Benos.
Abstract
BACKGROUND: MicroRNAs (miRNAs) are short, non-coding RNA regulators of protein-coding genes. miRNAs play an important role in diverse biological processes and in various diseases. Many algorithms can predict miRNA genes and their targets, but the transcriptional regulation of miRNAs is still under investigation. It is generally believed that intragenic miRNAs (located in introns or exons of protein-coding genes) are co-transcribed with their host genes and that most intergenic miRNAs are transcribed from their own RNA polymerase II (Pol II) promoter. However, the length of the primary transcripts and the organization of the promoters are currently unknown.
Year: 2009 PMID: 19390574 PMCID: PMC2668758 DOI: 10.1371/journal.pone.0005279
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Table 1. Identification of promoters of intergenic miRNA genes.
| miRNA | Chromosomal location | ChIP-chip region | CPPP Model | Predicted TSS (CPPP) | Distance |
| miR-200b∼miR-200a∼miR-429 | Chr1: 1092346 (+) | [1082033, 1083782] | CpG+ | 1083333 | 8763 |
| miR-34a | Chr1: 9134423 (−) | [9162283, 9166532] | CpG+ | 9163733 | 29310 |
| miR-101-1 | Chr1: 65296779 (−) | [65304283, 65307532] | CpG+ | 65305833 | 9054 |
| miR-181a-1∼miR-181b-1 | Chr1: 197094905 (−) | [197125783, 197127032] | | | |
| miR-202 | Chr10: 134911115 (−) | [134919994, 134925743] | CpG+ | 134924844 | 8879 |
| miR-210 | Chr11: 558198 (−) | [559355, 560354] | CpG+ | | |
| miR-194-2∼miR-192 | Chr11: 64415487 (−) | [64416605, 64418104] | CpG− | 64416930 | 1193 |
| miR-200c∼miR-141 | Chr12: 6943122 (+) | [6940546, 6942545] | CpG+ | 6941146 | 1976 |
| let-7i | Chr12: 61283732 (+) | [61279796, 61291045] | CpG+ | 61283796 | 506 |
| miR-379∼miR-411∼…∼miR-410∼miR-656 | Chr14: 100558155 (+) | [100524119, 100525868] | | | |
| miR-193b | Chr16: 14305324 (+) | [14302031, 14310280] | CpG+ | 14304581 | 743 |
| miR-138-2 | Chr16: 55449930 (+) | [55439531, 55441030] | CpG− | 55439856 | 9824 |
| miR-497∼miR-195 | Chr17: 6862065 (−) | [6863309, 6865058] | CpG− | 6864759 | 2444 |
| miR-10a | Chr17: 44012308 (−) | [44017059, 44018808] | CpG+ | 44017709 | 5401 |
| miR-196a-1 | Chr17: 44064920 (−) | [44078809, 44080558] | CpG+ | 44079509 | 14589 |
| miR-122 | Chr18: 54269285 (+) | [54235566, 54236565] | CpG− | 54235891 | 33144 |
| miR-181c∼miR-181d | Chr19: 13846512 (+) | [13832848, 13834847] | | | |
| miR-99b∼let-7e∼miR-125a | Chr19: 56887676 (+) | [56882098, 56886347] | CpG+ | | |
| miR-216a∼miR-217 | Chr2: 56069698 (−) | [56072783, 56074282] | CpG− | 56073933 | 3985 |
| miR-301b∼miR-130b | Chr22: 20337269 (+) | [20335283, 20337282] | CpG+ | 20336583 | 686 |
| let-7a-3∼let-7b | Chr22: 44887292 (+) | [44879283, 44883032] | CpG+ | 44881933 | 5109 |
| miR-206∼miR-133b | Chr6: 52117105 (+) | [52096878, 52098877] | CpG− | 52098453 | 18402 |
| miR-30a | Chr6: 72170045 (−) | [72164628, 72176377] | CpG− | 72174203 | 3908 |
| miR-129-1 | Chr7: 127635160 (+) | [127593752, 127595501] | CpG+ | 127594092 | 41068 |
| miR-183∼miR-96∼miR-182 | Chr7: 129202090 (−) | [129206752, 129207751] | CpG+ | 129207202 | 5112 |
| miR-29b-1∼miR-29a | Chr7: 130212838 (−) | [130219002, 130223501] | CpG− | 130223027 | 9939 |
| miR-30d∼miR-30b | Chr8: 135886370 (−) | [135913283, 135915782] | CpG+ | 135914133 | 27763 |
| let-7a-1∼let-7f-1∼let-7d | Chr9: 95978059 (+) | [95966631, 95971380] | CpG+ | 95969131 | 9928 |
| miR-181a-2∼miR-181b-2 | Chr9: 126494541 (+) | [126459631, 126464380] | CpG− | 126460831 | 33460 |
| miR-222∼miR-221 | ChrX: 45491474 (−) | [45504862, 45507861] | CpG− | 45506782 | 15308 |
| miR-542∼miR-450a-2∼miR-450a-1∼miR-450b | ChrX: 133503133 (−) | [133502362, 133506611] | CpG+ | 133505762 | 2629 |
| miR-505 | ChrX: 138834056 (−) | [138842362, 138844111] | CpG+ | 138843122 | 9066 |
miRNA: miRNA gene symbol (multiple symbols designate a cluster of co-expressed miRNAs); Chromosomal location: the chromosomal position and orientation of the miRNA gene; ChIP-chip region: the nearest region with a statistically significant peak; CPPP Model: the CpG (CpG+) or non-CpG (CpG−) model used for the TSS prediction; Predicted TSS: the TSS predicted by CPPP; Distance: the distance of the predicted TSS from the most 5′ pre-miRNA transcript. Bold letters designate previously verified TSSs.
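The Distance column can be read as a strand-aware subtraction. The sketch below is a minimal illustration, assuming the listed chromosomal position marks the 5′ end that the distance is measured from; the function name `upstream_distance` is hypothetical. Under that assumption it reproduces rows such as miR-10a, miR-34a, miR-200c∼miR-141 and miR-193b, but several other rows in the table differ from this simple subtraction, so the exact coordinate convention should be treated as unknown.

```python
def upstream_distance(mirna_pos: int, strand: str, tss: int) -> int:
    """Return how far the TSS lies upstream of the miRNA position.

    Assumption (not stated in the source): `mirna_pos` is the coordinate
    the Distance column is measured from. The result is positive when the
    TSS is upstream of the miRNA with respect to its strand.
    """
    if strand == "+":
        # On the plus strand, upstream means a smaller coordinate.
        return mirna_pos - tss
    if strand in ("-", "\u2212"):  # accept ASCII hyphen and the Unicode minus sign
        # On the minus strand, upstream means a larger coordinate.
        return tss - mirna_pos
    raise ValueError(f"unknown strand: {strand!r}")


# Rows taken from the table above.
print(upstream_distance(44012308, "-", 44017709))  # miR-10a
print(upstream_distance(6943122, "+", 6941146))    # miR-200c~miR-141
```

A negative return value would indicate a predicted TSS downstream of the listed position, which is one way to spot rows where the listed coordinate is not the one the distance was measured from.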
Figure 1. Pol II ChIP-chip results for miR-10a.
The blue arrow represents the location and transcriptional direction of hsa-miR-10a. The red dashes represent the location and value of the ChIP-chip probes. TSS – transcription start site of this miRNA.
Table 2. Intragenic miRNAs whose nearest ChIP-chip peak overlaps the host gene's TSS.
| miRNA | Host Gene | Chromosomal location | ChIP-chip region |
| miR-30e∼miR-30c-1 | NFYC | Chr1: 40992613 (+) | [40946783, 40950532] |
| miR-186 | ZRANB2 | Chr1: 71305987 (−) | [71316783, 71320532] |
| miR-130a | AK096335 | Chr11: 57165246 (+) | [57161605, 57163604] |
| miR-148b | COPZ1 | Chr12: 53017266 (+) | [53004046, 53006295] |
| miR-26a-2 | CTDSP2 | Chr12: 56504742 (−) | [56524546, 56528295] |
| miR-15a∼miR-16-1 | DLEU2 | Chr13: 49521338 (−) | [49551648, 49555397] |
| miR-423 | CCDC55 | Chr17: 25468222 (+) | [25467059, 25470058] |
| miR-301a∼miR-454 | FAM33A | Chr17: 54583364 (−) | [54583809, 54589308] |
| miR-330 | EML2 | Chr19: 50834185 (−) | [50833598, 50834597] |
| miR-26b | CTDSP1 | Chr2: 218975612 (+) | [218968033, 218974282] |
| miR-103-2 | PANK2 | Chr20: 3846140 (+) | [3816001, 3820000] |
| miR-185 | C22orf25 | Chr22: 18400661 (+) | [18387533, 18389782] |
| miR-191∼miR-425 | DALRD3 | Chr3: 49033146 (−) | [49026104, 49038353] |
| miR-15b∼miR-16-2 | SMC4 | Chr3: 161605069 (+) | [161598354, 161603353] |
| miR-378 | PPARGC1B | Chr5: 149092580 (+) | [149089935, 149091684] |
| miR-103-1 | PANK3 | Chr5: 167920556 (−) | [167938685, 167940184] |
| miR-335 | MEST | Chr7: 129923187 (+) | [129912502, 129914001] |
| miR-31 | LOC554202 | Chr9: 21502184 (−) | [21539381, 21557130] |
| miR-421 | AK125301 | ChrX: 73355021 (−) | [73377862, 73379611] |
| miR-374b∼miR-374a∼miR-545 | AK057701 | ChrX: 73355178 (−) | [73421362, 73431611] |
| miR-361 | CHM | ChrX: 85045368 (−) | [85188362, 85189861] |
| miR-503 | MGC16121 | ChrX: 133508094 (−) | [133506612, 133515611] |
| miR-452∼miR-224 | GABRE | ChrX: 150878840 (−) | [150889112, 150894611] |
| miR-22 | MGC14376 | Chr17: 1564031 (−) | [1563059, 1569558] |
| miR-636 | SFRS2 | Chr17: 72244225 (−) | [72244059, 72246308] |
Host gene: the gene whose intron the miRNA was found in. Other column names as in Table 1. Bold letters designate genes that are known to be co-transcribed with their host genes.
Table 3. Identification of promoters for intragenic miRNA genes.
| miRNA | Host Gene | Chromosomal location | ChIP-chip region | CPPP Model | Predicted TSS (CPPP) | Distance |
| miR-107 | PANK1 | Chr10: 91342564 (−) | [91382494, 91383493] | CpG− | 91382844 | 40030 |
| let-7a-2∼miR-100 | AK091713 | Chr11: 121522511 (−) | [121521855, 121523854] | | | |
| miR-190 | TLN2 | Chr15: 60903208 (+) | [60860703, 60861952] | CpG− | 60861428 | 41530 |
| miR-99a∼let-7c | C21orf34 | Chr21: 16833279 (+) | [16826951, 16832700] | CpG− | 16827826 | 5203 |
| miR-125b-2 | C21orf34 | Chr21: 16884427 (+) | [16880451, 16883950] | CpG− | 16880951 | 3226 |
| miR-26a-1 | CTDSPL | Chr3: 37985898 (+) | [37961854, 37963353] | CpG− | 37962529 | 23119 |
| miR-196b | HOXA9 | Chr7: 27175707 (−) | [27178752, 27180251] | CpG+ | 27178802 | 3095 |
| miR-489∼miR-653 | CALCR | Chr7: 92951267 (−) | [92953002, 92954251] | | | |
| miR-101-2 | RCL1 | Chr9: 4840296 (+) | [4827381, 4828630] | CpG− | 4828281 | 11765 |
| miR-491 | KIAA1797 | Chr9: 20706103 (+) | [20673131, 20677880] | CpG+ | 20677181 | 28922 |
| miR-204 | TRPM3 | Chr9: 72614820 (−) | [72633881, 72634880] | | | |
| miR-7-1 | HNRPK | Chr9: 85774592 (−) | [85774131, 85775630] | CpG− | 85775081 | 239 |
| miR-23b∼miR-27b∼miR-24-1 | C9orf3 | Chr9: 96887310 (+) | [96846381, 96860880] | CpG+ | 96855881 | 31429 |
| miR-448 | HTR2C | ChrX: 113964272 (+) | [113955612, 113956861] | | | |
Column names as in Tables 1 and 2. Bold letters designate genes whose expression was found to be anti-correlated with their host genes.
Figure 2. Performance of the n-mers and FBPs (alone and in combination) in predicting Pol II core promoter regions.
Sn – sensitivity, Sp – specificity.
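As a rough illustration of what an n-mer feature is, the following sketch counts overlapping n-mers in a DNA sequence and converts the counts to frequencies. It is not the feature extraction used in the study: the function names, the single-strand counting, and the skipping of windows that contain ambiguous bases are all assumptions made for the example.

```python
from collections import Counter

DNA_ALPHABET = set("ACGT")


def nmer_counts(seq: str, n: int) -> Counter:
    """Count overlapping n-mers on the given strand only (an assumption)."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - n + 1):
        window = seq[i:i + n]
        # Skip windows with ambiguous bases such as N (an assumption).
        if set(window) <= DNA_ALPHABET:
            counts[window] += 1
    return counts


def nmer_frequencies(seq: str, n: int) -> dict:
    """Normalize n-mer counts so that the frequencies sum to 1."""
    counts = nmer_counts(seq, n)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {nmer: c / total for nmer, c in counts.items()}


print(nmer_counts("ACGCGT", 2))
```

Frequencies of this kind could serve as a fixed-length feature vector for a classifier once a consistent ordering of n-mers is chosen.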
Figure 3. Performance of the SVM models in predicting CpG+ and CpG− promoters.
Two SVM models were evaluated in the prediction of the CpG+ promoters: one with random intergenic background (CpG+/Rnd_bg) and one with intergenic background with similar GC content (CpG+/GC_bg). Sn – sensitivity, Sp – specificity.
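The captions above report Sn and Sp without restating their formulas. The sketch below uses the standard confusion-matrix definitions, Sn = TP/(TP+FN) and Sp = TN/(TN+FP); this is an assumption, since some sequence-analysis papers instead report TP/(TP+FP) under the name "specificity", so that quantity is included as a separate function. All function names are hypothetical.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true promoters that were predicted as promoters."""
    if tp + fn == 0:
        raise ValueError("no positive examples")
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of background sequences correctly rejected."""
    if tn + fp == 0:
        raise ValueError("no negative examples")
    return tn / (tn + fp)


def positive_predictive_value(tp: int, fp: int) -> float:
    """Fraction of predictions that are true promoters.

    Some papers call this quantity 'specificity'; which convention the
    figures use is not stated in this section.
    """
    if tp + fp == 0:
        raise ValueError("no positive predictions")
    return tp / (tp + fp)


# Made-up counts, for illustration only.
print(sensitivity(tp=80, fn=20), specificity(tn=90, fp=10))
```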
Table 4. The top 20 most significant n-mers for each of the two models, with their Fisher scores and the −log10 of their p-values from the Gist package (t-test metric).
| non-CpG | | | CpG | | |
| Feature | −log10(p-value) | Fisher Score | Feature | −log10(p-value) | Fisher Score |
| | 29.7925 | 0.152704 | | 26.8458 | 0.136008 |
| | 26.8574 | 0.136658 | | 23.3817 | 0.11732 |
| | 23.9996 | 0.12122 | | | |
| | 23.6638 | 0.119395 | | 21.3434 | 0.106413 |
| | 23.7021 | 0.119389 | | 18.0213 | 0.0887804 |
| | 23.6248 | 0.119181 | | 17.9756 | 0.088539 |
| | 23.4104 | 0.117827 | | 17.8046 | 0.0876364 |
| | 22.2979 | 0.111908 | | 15.998 | 0.0781331 |
| | 21.1428 | 0.105734 | | 14.4589 | 0.0700828 |
| | 20.8754 | 0.104254 | | 14.448 | 0.0700258 |
| | 19.5344 | 0.0971108 | | 14.0166 | 0.0677778 |
| | 19.1561 | 0.0950535 | | 13.9502 | 0.067432 |
| | 19.1021 | 0.0947959 | | 13.7632 | 0.0664587 |
| | 19.0021 | 0.0942557 | | 13.7331 | 0.0663022 |
| | 18.6868 | 0.0925992 | | 13.2385 | 0.0637315 |
| | 18.6051 | 0.092075 | | 12.9486 | 0.0622268 |
| | 18.0089 | 0.0890836 | | 12.6467 | 0.0606617 |
| | 18.0034 | 0.0890473 | | 12.3375 | 0.0590611 |
| | 16.8798 | 0.0831072 | | 12.3167 | 0.0589534 |
| | | | | 12.2470 | 0.0585926 |
Bold letters indicate the n-mer that appears to be a significant feature in both the CpG+ and CpG− models.
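To make the two reported statistics concrete, the sketch below computes a Fisher score for a single feature and converts a p-value to the −log10 scale. It uses one common form of the Fisher criterion, (μ₊ − μ₋)² / (σ₊² + σ₋²), with population variances; whether the Gist package uses exactly this form is an assumption, and the function names and sample values are made up for illustration.

```python
import math
from statistics import mean, pvariance


def fisher_score(pos_values, neg_values) -> float:
    """Fisher criterion for one feature across two classes.

    Assumed form: squared difference of class means divided by the sum
    of the class variances (population variance is used here).
    """
    numerator = (mean(pos_values) - mean(neg_values)) ** 2
    denominator = pvariance(pos_values) + pvariance(neg_values)
    if denominator == 0:
        raise ValueError("both classes have zero variance for this feature")
    return numerator / denominator


def neg_log10(p_value: float) -> float:
    """Convert a p-value to the -log10 scale used in the table."""
    if not 0 < p_value <= 1:
        raise ValueError("p-value must be in (0, 1]")
    return -math.log10(p_value)


# Hypothetical n-mer frequencies in promoter and background sequences.
promoter_freqs = [0.012, 0.015, 0.011, 0.014]
background_freqs = [0.008, 0.009, 0.010, 0.007]
print(fisher_score(promoter_freqs, background_freqs))
```

Under this form a larger score means the feature separates the two classes more cleanly relative to its within-class spread, which matches the ordering of the table from most to least discriminative.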