Nan Deng, Adriane Puetter, Kun Zhang, Kristen Johnson, Zhiyu Zhao, Christopher Taylor, Erik K Flemington, Dongxiao Zhu.
Abstract
Computational prediction of microRNA targets remains a challenging problem. Existing rule-based, data-driven and expression-profiling approaches to target prediction mostly operate at the gene level. The increasing availability of RNA-seq data provides a new perspective for microRNA target prediction at the isoform level. We hypothesize that the splicing isoform is the ultimate effector in microRNA targeting and that the proposed isoform-level approach can predict non-dominant isoform targets, as well as their targeting regions, that are otherwise invisible to many existing approaches. To test the hypothesis, we used an iterative expectation maximization (EM) algorithm to quantify transcriptomes at the isoform level. The performance of the EM algorithm in transcriptome quantification was examined in simulation studies using FluxSimulator. We used joint evidence from isoform-level down-regulation and seed enrichment to predict microRNA-155 targets. We validated our computational approach using results from 149 in vitro 3'-UTR assays performed in-house. We also augmented the splicing database using exon-exon junction evidence, and applied the EM algorithm to predict and quantify 1572 cell line-specific novel isoforms. Combined with seed enrichment analysis, we predicted 51 novel microRNA-155 isoform targets. Our work is among the first computational studies advocating isoform-level microRNA target prediction.
Year: 2011 PMID: 21317189 PMCID: PMC3089486 DOI: 10.1093/nar/gkr042
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. (a) The workflow of our transcriptome and targetome analysis pipeline. Solid arrows represent annotated transcript quantification and dotted arrows represent novel transcript quantification. (b) Novel transcript discovery illustrated using a splicing graph. One source of exon–exon junctions (solid lines) is the alternative splicing database, and another source (dotted lines) is computational prediction using TopHat.
Figure 2. (a) Illustration of the EM algorithm. Left panel: deriving the observed short-read compatible matrix Y or Y′ from the short-read alignment. Middle panel: applying the EM algorithm to infer the short-read originating matrix Z or Z′. Right panel: calculating relative isoform proportions in case and control. Note that the gene shown is differentially spliced between case and control but not differentially expressed, which is also known as dichotomy of regulation. (b) Simulation studies to evaluate the accuracy of isoform quantification using FluxSimulator. For single-end data (b) and paired-end data (Supplementary Figure S1), we plot predicted isoform abundance scores against true abundance scores. R² values calculated by robust linear regression analysis are shown in each figure. Fifteen million 50-mer single-end short reads and 30 million 50-mer paired-end short reads were generated and used for these simulation studies.
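The three panels of Figure 2(a) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes a binary read-by-isoform compatible matrix Y, ignores isoform length and read-position effects, and uses a hypothetical function name (`em_isoform_proportions`) and toy data.

```python
def em_isoform_proportions(Y, n_iter=200, tol=1e-9):
    """Estimate relative isoform proportions for one gene.

    Y is a list of rows, one per read; Y[r][k] is 1 if read r is
    compatible with isoform k and 0 otherwise (assumed binary here).
    Returns (theta, Z): the proportions and the inferred read-origin matrix.
    """
    n_reads, n_iso = len(Y), len(Y[0])
    theta = [1.0 / n_iso] * n_iso  # start from uniform proportions
    Z = []
    for _ in range(n_iter):
        # E-step: share each read among its compatible isoforms,
        # weighted by the current proportion estimates.
        Z = []
        for row in Y:
            weights = [t * y for t, y in zip(theta, row)]
            total = sum(weights)
            if total == 0:
                raise ValueError("read is compatible with no isoform")
            Z.append([w / total for w in weights])
        # M-step: new proportion = average fractional read assignment.
        new_theta = [sum(Z[r][k] for r in range(n_reads)) / n_reads
                     for k in range(n_iso)]
        converged = max(abs(a - b) for a, b in zip(theta, new_theta)) < tol
        theta = new_theta
        if converged:
            break
    return theta, Z


# Toy data: two reads unique to isoform 0, one shared, one unique to isoform 1.
Y_toy = [[1, 0], [1, 0], [1, 1], [0, 1]]
theta_toy, Z_toy = em_isoform_proportions(Y_toy)
print(theta_toy)  # proportions converge to roughly 2/3 and 1/3
```

Running the same function separately on case and control matrices (Y and Y′) gives the two sets of relative isoform proportions compared in the right panel.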
Table I. Quantitative RT–PCR experiments to verify isoform target TAF5L
| Primer set | Transcript ID | Region | Forward primer | Reverse primer |
|---|---|---|---|---|
| TAF5L LTE | ENST00000366676 | Last two exons | 5′-AGCCCCACCAAGTAGACGTGT-3′ | 5′-TCTCCGTGCCTGCATTATCAT-3′ |
| TAF5L EX | ENST00000366676 | Last two exons | 5′-AGTAGACGTGTCCCGCATCCATTT-3′ | 5′-AACAAGAGAGCAACCCTGAGCTGT-3′ |
| TAF5L Iso | ENST00000366675 | Last exon | 5′-CACAGGAAGTAGAGTTGCCAGCT-3′ | 5′-AACGGTTACAAGCCAAACAAGATT-3′ |
| Iso TAF5L | ENST00000366675 | Last exon | 5′-CCCACAGAAGGTTGTGCCATTTCA-3′ | 5′-ACATGGAGCCACAGGATATGCACT-3′ |
| TBP | Housekeeping | | 5′-GATGGATGTTGAGTTGCAGGGTGT-3′ | 5′-AGCACGGTATGAGCAACTCACAGT-3′ |
Table II. Quantitative RT–PCR experiments to verify novel junctions (9 out of 10 were verified)
| Gene | Reference transcript ID | Junction assessed | Forward primer | Reverse primer | Amplicon size |
|---|---|---|---|---|---|
| PLDN | ENST00000220531 | Exons 1–4 | 5′-CACACGTTTGCTTCTTCCCTGTGT-3′ | 5′-GGCATGATAGTGTTTAGCCTCAGC-3′ | 379 bp |
| TMEM126A | ENST00000304511 | Exons 1–3 | 5′-CCCAGGTAATTTGAGCAAAGGCCA-3′ | 5′-CTATGAGGCCACAAAGAGCAGCAT-3′ | 244 bp |
| PGPEP1 | ENST00000269919 | Exons 3–5 | 5′-TCCGGTTGAGTACCAAACAGTCCA-3′ | 5′-CGTGACTCTGGTACAAAGAGGTGT-3′ | 344 bp |
| NOL9 | ENST00000377705 | Exons 3–7 | 5′-TAACCAGCTATCCGGGTTCATCCT-3′ | 5′-TGTGGAGTCCTCAGGTGAGTGAAA-3′ | 403 bp |
| C15orf17 | ENST00000357635 | Exons 1–3 | 5′-AGATCGGTAATAGAGCCCTCCGTCT-3′ | 5′-ATCTGGACTCTGGCTAAGAGCAGT-3′ | 242 bp |
| YEATS4 | ENST00000247843 | Exons 2–5 | 5′-GGGCACACTCATCAGTGGACAGTAT-3′ | 5′-CCCAGCATTGCATTGGTGTCTGAT-3′ | 266 bp |
| NARS2 | ENST00000281038 | Exons 12–14 | 5′-GCTGTTGATCTTCTGGTTCCTGGAGT-3′ | 5′-AAGATGCACTGCAGGTAGCGTTCA-3′ | 200 bp |
| SLC7A11 | ENST00000280612 | Exons 1–3 | 5′-GCACCATCATTGGAGCAGGAATCT-3′ | 5′-TGTAGCGTCCAAATGCCAGGGATA-3′ | 285 bp |
| ARL1 | ENST00000261636 | Exons 3–5 | 5′-TAGGAGGACAGACAAGTATCAGGCCA-3′ | 5′-TCCTTCAAGGCAGGTAACCCAAGT-3′ | 247 bp |
The novel splice junction tested for each gene spanned the exons indicated in the ‘Junction assessed’ column of the indicated ‘reference transcript ID’.
Figure 3. (a) Venn diagram of the microRNA targets predicted by the three approaches at a relative expression cut-off of 0.8. (b) An example of an isoform target predicted exclusively by the isoform-level approach (gene PHF17). It represents a group of genes with dichotomy-regulated isoforms in which the down-regulated isoform (potential target) was not tested in the 3′-UTR assay. (c) An example of an isoform target predicted jointly by the isoform-level approach and the 3′-UTR assay (gene TAF5L). It represents a group of genes with dichotomy-regulated isoforms in which the down-regulated isoform was also tested in the 3′-UTR assay. (d) Quantitative RT–PCR and 3′-UTR reporter assay of the TAF5L isoform relative expression. (e) An example of a target predicted by both the isoform- and gene-level approaches, but not by the 3′-UTR assay (gene TBRG1). (f) An example of drop-out in the 8-mer seed region of the 3′-UTR (gene CEBPB).
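The joint use of down-regulation and seed evidence behind these panels can be illustrated with a small sketch. It is not the authors' seed enrichment statistic: it only checks for the presence of a canonical 8-mer site and assumes that the 0.8 cut-off applies to the case/control expression ratio. The function names are hypothetical, and the miR-155 sequence below is supplied for illustration and should be checked against miRBase before any real use.

```python
# Assumed mature miR-155-5p sequence (verify against miRBase before relying on it).
MIR155 = "UUAAUGCUAAUCGUGAUAGGGGU"

_COMPLEMENT = {"A": "T", "U": "A", "T": "A", "G": "C", "C": "G"}


def seed_match_site(mirna, kind="8mer"):
    """Return the DNA-alphabet target site matching the miRNA seed.

    The 7mer-m8 site is the reverse complement of miRNA nucleotides 2-8;
    the 8mer site adds an A opposite miRNA nucleotide 1.
    """
    seed = mirna.upper()[1:8]  # nucleotides 2-8 (0-based slice)
    site7 = "".join(_COMPLEMENT[base] for base in reversed(seed))
    return site7 + "A" if kind == "8mer" else site7


def is_candidate_target(rel_expr, utr, mirna, cutoff=0.8):
    """Flag an isoform as a candidate target.

    rel_expr: case/control expression ratio of the isoform (assumed meaning
              of 'relative expression'); values below `cutoff` count as
              down-regulated.
    utr:      3'-UTR sequence of the isoform (DNA or RNA alphabet).
    """
    utr_dna = utr.upper().replace("U", "T")
    return rel_expr < cutoff and seed_match_site(mirna) in utr_dna


print(seed_match_site(MIR155))  # AGCATTAA
print(is_candidate_target(0.6, "GGGAGCATTAACCC", MIR155))  # True
```

Because the test is applied per isoform, two isoforms of the same gene with different 3′-UTRs or different expression changes can receive different calls, which is the situation panels (b) and (c) describe.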
Figure 4. (a) Venn diagram of the microRNA gene targets predicted by the gene- and isoform-level approaches. (b) The percentage of gene targets represented by dominant and non-dominant isoforms predicted by both the gene- and the isoform-level approaches. (c) The percentage of isoform targeting regions predicted by both the gene- and the isoform-level approaches. (d) The percentage of gene targets represented by dominant and non-dominant isoforms predicted exclusively by the isoform-level approach. (e) The percentage of isoform targeting regions predicted exclusively by the isoform-level approach.