Daniel Tabas-Madrid, Ander Muniategui, Ignacio Sánchez-Caballero, Dannys Jorge Martínez-Herrera, Carlos Oscar S Sorzano, Angel Rubio, Alberto Pascual-Montano.
Abstract
BACKGROUND: MicroRNAs are short RNA molecules that post-transcriptionally regulate gene expression. MicroRNA target prediction remains challenging because very few interactions have been experimentally validated and sequence-based predictions yield large numbers of false positives. Furthermore, because each database of predicted interactions uses different measuring rules, selecting the most reliable interactions requires extensive knowledge of each algorithm.
Year: 2014 PMID: 25559987 PMCID: PMC4304206 DOI: 10.1186/1471-2164-15-S10-S2
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Number of experimentally validated interactions.
| Mirtarbase | Tarbase | Mirwalk | Mirecords | |
|---|---|---|---|---|
| 30 | - | - | 17 | |
| 115 | - | - | 81 | |
| 102 | - | - | 32 | |
| 16 | - | - | - | |
| 2860 | 878 | 5668 | 1276 | |
| 537 | 70 | 2749 | 194 | |
| 231 | - | 1514 | 39 |
Summary of the number of validated interactions for each species studied, together with the source database in which the interactions were reported.
Comparison of sequence-based algorithms for miRNA-mRNA target prediction.
| Method (a) | Name | Seed | ΔG | Wobbles | ΔΔG | Other features | Type of classifier | Scoring | DB? | Software? | Website | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| AI | miRanda | ✓ | ✓ | ✓ | matches with the first 11 nt's of the miRNA are rewarded | score | ✓ | ✓ | ||||
| AI | TargetScan | ✓ | ✓ | different seed types and AU content | score | ✓ | ✓ | |||||
| AI | PicTar | ✓ | ✓ | ✓ | score | ✓ | http://pictar.mdc-berlin.de/ | |||||
| AI | RNA22 | ✓ | ✓ | miRNA paired to statistically significant patterns in the mRNA | ||||||||
| AI | RNAhybrid | ✓ | ✓ | MFEs modeled as extreme-value distributed | MDE (energy) | ✓ | http://bibiserv.techfak.uni-bielefeld.de/rnahybrid/ | |||||
| AI | PITA | ✓ | ✓ | ✓ | ✓ | 1) G:U allowed in 7mer seed; 2) G:U, 1 mismatch allowed in 8mer | score | ✓ | ✓ (b) | http://genie.weizmann.ac.il/pubs/mir07/ | ||
| AI | EiMMo | ✓ | ✓ | model the evolution of orthologous target sites in related species | score | ✓ | http://www.mirz.unibas.ch/EIMMo3/ | |||||
| AI | DIANA-microT | ✓ | ✓ | ✓ | ✓ | score | ✓ | http://diana.cslab.ece.ntua.gr/microT/ | ||||
| AI | MicroTar | ✓ | ✓ | p-value | ✓ | |||||||
| AI | FindTar | ✓ | ✓ | ✓ | central loop score to reduce false positives | score and energy | ✓ | |||||
| AI | miRiam | ✓ | ✓ | ✓ | ✓ | ✓ | http://ferrolab.dmi.unict.it/miriam.html | |||||
| AI | microcosm | ✓ | ✓ | ✓ | Uses miRanda. Requires: complete seed complementarity and conservation at the same position and in ≥2 species | score | ✓ | |||||
| AI | miRWalk | ✓ | ✓ | also a database with experimentally validated targets from text mining | p-value | http://mirwalk.uni-hd.de/ | ||||||
| ML | miTarget | ✓ | ✓ | Starting set: miRanda. Radial basis function | SVM | http://cbit.snu.ac.kr/~miTarget/introduction.html | ||||||
| ML | MirTarget2 | ✓ | ✓ | ✓ | Initial set: TargetScan, PicTar, miRanda, MirTarget | SVM | score (from probabilities) | ✓ | ||||
| ML | TargetSpy | ✓ | ✓ | ✓ | Starting set: PicTar. Generates candidate zones of binding and a representative hybrid (1st or 2nd nt of the miRNA is paired) | score | ✓ | ✓ | ||||
| ML | mirSVR | ✓ | ✓ | ✓ | Starting set: miRanda | SVR | mirSVR score (probability for down-regulation) | ✓ | ✓ | |||
| H | NBmiRTar | ✓ | ✓ | ✓ | Naïve Bayes (NB) classifier is applied to the output of the miRanda program | Naïve Bayes | NB score (probability) | |||||
| a | AI = Ab Initio, ML = Machine Learning, H = Hybrid | |||||||||||
| b | Academic use only | |||||||||||
Comparison of miRNA-mRNA target prediction algorithms, including algorithm features, database and software availability, scoring method, type of classifier, and the species for which each algorithm was designed.
Reliability of databases.
| (1) | (2) | (3) | (4) | (5) | (6) | |
|---|---|---|---|---|---|---|
| LRS | -89.27 | 163829 | 4669137 | 4286 | 9.18E-04 | 9.2 |
| WSP | -84.52 | 123589 | 4669137 | 4286 | 9.18E-04 | 6.94 |
| EiMMo | -61.87 | 191582 | 1781671 | 2949 | 1.66E-03 | 10.75 |
| DIANA-microT | -54.51 | 269525 | 2889574 | 3010 | 1.31E-03 | 11.77 |
| -21.2 | 134227 | 737379 | 2685 | 3.64E-03 | 18.2 | |
| microcosm | -17.99 | 6035 | 352016 | 784 | 2.23E-03 | 1.71 |
| PITA | -15.2 | 75683 | 206722 | 1425 | 6.89E-03 | 36.61 |
| TargetSpy | -14 | 178114 | 300000 | 653 | 2.18E-03 | 59.37 |
| miRWalk | -9.92 | 422089 | 780000 | 1243 | 1.59E-03 | 54.11 |
| TargetScan | -9.29 | 19491 | 132809 | 1832 | 1.38E-02 | 14.68 |
| mirTarget | -5.08 | 149088 | 691265 | 234 | 3.39E-04 | 21.57 |
(1) Minimum z-score of a hypergeometric distribution. The lower the z-score the more statistically significant the enrichment in experimentally validated interactions is. (2) Number of interactions for the minimum z-score. (3) Total number of interactions in the database. (4) Number of experimentally validated (EV) interactions in the database. (5) Proportion of EV interactions within the database. (6) Proportion of selected interactions in the database for the minimum z-score.
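Columns (1), (2) and (6) can be illustrated with a minimal Python sketch. It assumes the interactions of one database are already sorted from best to worst score and flagged as experimentally validated or not. The function names, the normal approximation, and the sign convention (expected minus observed, so that enrichment gives a negative z-score, in line with "the lower the better") are assumptions made for illustration, not the authors' implementation.

```python
import math


def hypergeom_z(n_selected, hits_selected, n_total, hits_total):
    """z-score of the validated count in a selection, under a hypergeometric null.

    Sign convention (an assumption): expected minus observed, so enrichment
    in validated interactions yields a negative value.
    """
    p = hits_total / n_total
    mean = n_selected * p
    var = n_selected * p * (1 - p) * (n_total - n_selected) / (n_total - 1)
    if var <= 0:
        return 0.0  # degenerate selection (e.g. the whole database)
    return (mean - hits_selected) / math.sqrt(var)


def min_z_cutoff(validated_sorted):
    """Scan every top-n cutoff of a ranked list and return (min z, n at min).

    validated_sorted: 0/1 flags, ordered from best-scored to worst-scored.
    """
    n_total = len(validated_sorted)
    hits_total = sum(validated_sorted)
    best_z, best_n = 0.0, 0
    hits = 0
    for n, flag in enumerate(validated_sorted, start=1):
        hits += flag
        z = hypergeom_z(n, hits, n_total, hits_total)
        if z < best_z:
            best_z, best_n = z, n
    return best_z, best_n


# Toy ranking: validated interactions concentrated near the top.
toy = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
z_min, n_at_min = min_z_cutoff(toy)
share_selected = 100 * n_at_min / len(toy)  # share of the list kept at the minimum
```

For this toy ranking the minimum falls at the fourth interaction, so 40% of the list would be selected.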
Figure 1. The area under the curve and the number of interactions are also included for every algorithm.
Figure 2. The curve was corrected by subtracting the precision value that corresponds to random interactions. The Y axis shows the precision and the X axis shows the interactions sorted by score in descending order.
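A corrected precision curve of this kind can be sketched as follows. This is a minimal illustration that assumes the precision of random interactions equals the overall fraction of validated interactions in the database; the function name and the toy data are hypothetical.

```python
def corrected_precision_curve(scores, is_validated):
    """Accumulated precision along the ranking, minus the random baseline.

    scores: one prediction score per interaction (higher = more confident).
    is_validated: matching 0/1 flags for experimental validation.
    """
    n = len(scores)
    # Assumed baseline: precision expected when interactions are picked at random.
    random_precision = sum(is_validated) / n
    # Indices of interactions sorted by score in descending order (the X axis).
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    curve = []
    hits = 0
    for rank, i in enumerate(order, start=1):
        hits += is_validated[i]
        curve.append(hits / rank - random_precision)  # the Y axis
    return curve


# Toy database with four interactions, two of them validated.
curve = corrected_precision_curve([0.9, 0.1, 0.5, 0.7], [1, 0, 0, 1])
```

With this baseline the last point of the curve is always zero, because the precision over the whole database is the random precision itself.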
Figure 3. In the WSP method (box a), a new score is calculated for each interaction in each database by weighting its original score with its associated accumulated precision. To this end, the interactions in each database are sorted and their accumulated precisions are calculated. A precision value is considered reliable when it is larger than the randomly expected precision of the database. In the LRS method (box b), each interaction in each database is re-scored with its probability of being experimentally validated, which is calculated separately for each database. The probabilities from the different databases are then combined, taking their possible dependencies into account.
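The per-database re-scoring step of WSP can be sketched roughly as below. The caption does not give the exact weighting formula, so several choices here are assumptions made only for illustration: scores are min-max normalised to [0, 1], the weight is a plain product with the accumulated precision, and interactions whose precision is not above the random expectation receive a score of zero. How the re-scored databases are then merged is not shown.

```python
def wsp_rescore(scores, is_validated):
    """Weight each score by the accumulated precision at its rank (sketch).

    Returns new scores in the original input order.
    """
    n = len(scores)
    random_precision = sum(is_validated) / n  # randomly expected precision
    lowest, highest = min(scores), max(scores)
    span = (highest - lowest) or 1.0  # avoid division by zero for constant scores
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    new_scores = [0.0] * n
    hits = 0
    for rank, i in enumerate(order, start=1):
        hits += is_validated[i]
        precision = hits / rank
        # Only precisions above the random expectation are treated as reliable.
        if precision > random_precision:
            normalised = (scores[i] - lowest) / span  # assumed normalisation
            new_scores[i] = normalised * precision
    return new_scores


# Toy database with four interactions, two of them validated.
rescored = wsp_rescore([0.9, 0.1, 0.5, 0.7], [1, 0, 0, 1])
```

In the toy example the lowest-ranked interaction is reached when the accumulated precision has fallen to the random level, so its new score is zero.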