William H Majoros, Parawee Lekprasert, Neelanjan Mukherjee, Rebecca L Skalsky, David L Corcoran, Bryan R Cullen, Uwe Ohler.
Abstract
High-throughput sequencing has opened numerous possibilities for the identification of regulatory RNA-binding events. Cross-linking and immunoprecipitation of Argonaute proteins can pinpoint a microRNA (miRNA) target site within tens of bases but leaves the identity of the miRNA unresolved. A flexible computational framework, microMUMMIE, integrates sequence with cross-linking features and reliably identifies the miRNA family involved in each binding event. It considerably outperforms sequence-only approaches and quantifies the prevalence of noncanonical binding modes.
Year: 2013 PMID: 23708386 PMCID: PMC3818907 DOI: 10.1038/nmeth.2489
Source DB: PubMed Journal: Nat Methods ISSN: 1548-7091 Impact factor: 28.547
Figure 1. Identifying miRNA target sites by a joint sequence and interaction model. (a) Example of an AGO-interaction site identified by PARalyzer. Shown are the kernel density estimate of signal versus background T-to-C transitions and the read depth of a 3′ UTR region of SPATS2 (ENST00000395063) profiled in the LCL-BAC library. Two 7mer seed matches lie in close proximity, each at a different distance 3′ of a signal peak; naive target prediction based on read depth alone would have incorrectly treated these two binding events as one. (b) Across all miRNA target candidates, seed matches show the strongest enrichment immediately 3′ of the PAR-CLIP T-to-C transition peak. (c) Conservation of sites, as measured by PhastCons scores[19], shows preferential conservation in the first 8 nt 3′ of peaks. (d) State-transition diagram of the hidden Markov model. Each state represents a joint likelihood of sequence and PAR-CLIP signal in a particular region. (e) State 5 is a metastate that expands into a 41-state submodel for several types of miRNA seed matches. (f) Sensitivity-vs.-SNR tradeoff for MUMMIE, TargetScan, MIRZA, and two baselines, using the top 100 expressed miRNAs in the LCL-BAC cell line. Explicit expression information was not utilized by any predictor (cf. Supplementary Fig. 2 for additional results). (g) Similar to (f) for sensitivity-vs.-specificity.
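The 7mer seed matches mentioned in panel (a) can be illustrated with a minimal sketch. It assumes the common convention that a 7mer-m8 site is the reverse complement of miRNA nucleotides 2-8; the function names, the toy UTR string, and the choice of let-7a as the example miRNA are illustrative assumptions and are not taken from microMUMMIE, which models several seed-match types inside its hidden Markov model.

```python
# Hypothetical helper names; this is a plain string scan, not the HMM from the paper.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA string (A, C, G, U only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def seed_match_7mer_m8(mirna):
    """Target-side 7mer that pairs with miRNA nucleotides 2-8 (assumed convention)."""
    seed = mirna[1:8]  # 0-based slice covering 1-based positions 2..8
    return reverse_complement(seed)


def find_seed_matches(utr, mirna):
    """Return 0-based start positions of every 7mer-m8 match in the UTR."""
    # Accept DNA or RNA input by normalising T to U.
    utr = utr.upper().replace("T", "U")
    mirna = mirna.upper().replace("T", "U")
    site = seed_match_7mer_m8(mirna)
    hits = []
    start = utr.find(site)
    while start != -1:
        hits.append(start)
        start = utr.find(site, start + 1)  # step by 1 to allow overlapping hits
    return hits


if __name__ == "__main__":
    let7a = "UGAGGUAGUAGGUUGUAUAGUU"         # mature let-7a sequence
    toy_utr = "AAACUACCUCAAAGGCUACCUCA"      # made-up UTR fragment for illustration
    print(seed_match_7mer_m8(let7a))          # CUACCUC
    print(find_seed_matches(toy_utr, let7a))  # [3, 15]
```

A sequence-only scan like this reports both nearby matches as separate candidates but cannot say which one is bound; the caption's point is that the position of each match relative to the T-to-C peak supplies that extra evidence.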
Figure 2. Validation of predicted sites and their impact on expression levels. We computed the aggregate PAR-CLIP signals at predicted target sites for specific miRNAs expressed in the LCL-BAC cell line and compared them to the aggregate signals at the same sites in cell lines missing one of the corresponding miRNAs. (a) Loss of PAR-CLIP signal for LCL-BAC predicted target sites of BHRF1-1, BHRF1-2, and BHRF1-3 in the corresponding deletion line (LCL-BAC-D1, -D2, and -D3, respectively) (23 predicted target sites for BHRF1-1, 52 for BHRF1-2, and 10 for BHRF1-3). (b) Control: signal difference between LCL-BAC and the deletion lines at predicted target sites of all miRNAs except the miRNA of interest (2,159 predicted targets for the D1 control, 2,082 for the D2 control, and 2,172 for the D3 control). Binding losses for BHRF1-1 (P = 0.0006) and BHRF1-2 (P = 0.0394) were statistically significant compared to the control; the loss for BHRF1-3 was consistent but not significant due to the small number of sites (P = 0.1879; Wilcoxon rank-sum test). (c) Impact of site loss on steady-state mRNA expression levels, based on RNA-seq data from the LCL-BAC cell lines. The enrichment of BHRF target sites in the top gene sets ranked by differential expression is contrasted with the enrichment of other expressed miRNAs with similar numbers of predicted targets. Predictions were obtained using the top 100 expressed miRNAs, with the MUMMIE PAR-CLIP signal variance parameter set to 0.01. All predicted targets were aligned across the beginning of miRNA seed matches in the mRNAs. Vertical error bars indicate ± one standard deviation.
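The comparison in panels (a) and (b) relies on a Wilcoxon rank-sum test. The sketch below shows one way such a test can be computed, using the large-sample normal approximation without tie or continuity correction. The record does not say which implementation or variant the authors used, so the function names and the toy numbers are assumptions for illustration only and do not reproduce the reported P values.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test via the normal approximation.

    Returns (z, p). No tie or continuity correction is applied (an assumption).
    """
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    w = sum(ranks[:n1])                        # rank sum of the first sample
    mean = n1 * (n1 + n2 + 1) / 2              # expected rank sum under H0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))       # two-sided normal tail probability
    return z, p


if __name__ == "__main__":
    # Toy values only: hypothetical per-site signal differences, not data from the study.
    target_sites = [1, 2, 3]
    control_sites = [4, 5, 6]
    z, p = rank_sum_test(target_sites, control_sites)
    print(f"z = {z:.3f}, p = {p:.4f}")
```

With only a handful of sites the normal approximation is coarse, which is consistent with the caption's remark that the BHRF1-3 comparison (10 sites) did not reach significance.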