Jin Gu, Hu Fu, Xuegong Zhang, Yanda Li.
Abstract
BACKGROUND: MicroRNAs (miRNAs) are a class of endogenous small regulatory RNAs that play an important role in post-transcriptional regulation by targeting mRNAs for cleavage or translational repression. Base-pairing between the 5'-end of the miRNA and the 3'-UTR of the target mRNA is essential for miRNA:mRNA recognition. Recent studies show that many seed matches in 3'-UTRs, which are fully complementary to miRNA 5'-ends, are highly conserved. Based on these features, a two-stage strategy can be implemented to achieve the de novo identification of miRNAs by requiring complete base-pairing between the 5'-end of miRNA candidates and the potential seed matches in 3'-UTRs.
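The base-pairing requirement described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, the seed is assumed to be nucleotides 2–8 of the mature miRNA (one of the two windows mentioned in the Figure 3 caption), and the 3'-UTR is a made-up toy sequence. The candidate sequence is the pmir-1 entry from the table of predicted miRNAs.

```python
# Illustrative sketch only; names and the toy 3'-UTR are assumptions.
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq))


def seed_match_site(mirna: str, first: int = 2, length: int = 7) -> str:
    """DNA k-mer that pairs perfectly with miRNA nucleotides first..first+length-1.

    `first` is 1-based; the default window (2-8) is an assumption.
    """
    dna = mirna.upper().replace("U", "T")
    seed = dna[first - 1:first - 1 + length]
    return reverse_complement(seed)


def find_seed_matches(utr: str, mirna: str, first: int = 2, length: int = 7) -> list[int]:
    """0-based start positions of perfect seed matches in a 3'-UTR."""
    site = seed_match_site(mirna, first, length)
    dna = utr.upper().replace("U", "T")
    return [i for i in range(len(dna) - length + 1) if dna[i:i + length] == site]


candidate = "TAAGCGTAtagcttttcccct"  # pmir-1, as listed in the table of predicted miRNAs
toy_utr = "GGGTACGCTTGGG"            # hypothetical 3'-UTR fragment

print(seed_match_site(candidate))             # TACGCTT
print(find_seed_matches(toy_utr, candidate))  # [3]
```

Passing `first=1` switches the window to nucleotides 1–7.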
Year: 2007 PMID: 17996040 PMCID: PMC2241842 DOI: 10.1186/1471-2105-8-432
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. The flowchart of the method. The method consists of two stages: in the first stage, conserved 7-mers are identified by considering the conservation patterns of all 7-mers in six pairs of flies; in the second stage, pre-miRNAs and mature miRNAs are predicted across the whole genome by adding seed-matching information to published miRNA prediction methods.
Performance comparisons of different algorithms
| Organisms selected for analysis | Algorithm | Number of identified reference seed matches¹ |
|---|---|---|
| Dme Dsi Dya Dan Dps Dmo Dvi | Cons-SVM | 63² (33³) |
| Dme Dsi Dya Dan Dps Dmo Dvi | MCS [23] | 58 (29) |
| Dme Dps | FastCompare [24] | 52 (29) |
| Dme Dps | PCS⁴ | 59 (32) |
| Dme Dps | MCS [23] | 47 (26) |
1 Because Cons-SVM identifies 689 candidate seed matches, we evaluate the other algorithms on their 689 highest-ranking 7-mers.
2 This number is obtained by LOOCV (leave-one-out cross-validation). In classification, the number of identified reference seed matches is 65.
3 The numbers in parentheses indicate how many miRNA families are identified from the identified seed matches.
4 We used only the PCSs computed from the Dme-Dps pair and took the 689 highest-scoring 7-mers for the analysis.
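The comparison protocol in the footnotes (take each method's highest-ranking 7-mers, then count recovered reference seed matches and the miRNA families they represent) can be sketched as follows. The scores, 7-mers and family labels are made-up toy values, and the function name is hypothetical; this is not the evaluation code used in the paper.

```python
# Illustrative sketch only; all data below are hypothetical.
def evaluate_top_k(scores: dict[str, float], reference: dict[str, str], k: int) -> tuple[int, int]:
    """Count reference seed matches and distinct families among the k best-scoring 7-mers.

    scores:    7-mer -> conservation score (higher is better)
    reference: reference seed match (7-mer) -> miRNA family name
    """
    top = sorted(scores, key=scores.get, reverse=True)[:k]
    hits = [kmer for kmer in top if kmer in reference]
    families = {reference[kmer] for kmer in hits}
    return len(hits), len(families)


toy_scores = {"AAAAAAA": 0.9, "CCCCCCC": 0.8, "GGGGGGG": 0.7, "TTTTTTT": 0.1}
toy_reference = {"AAAAAAA": "fam1", "GGGGGGG": "fam1", "TTTTTTT": "fam2"}

print(evaluate_top_k(toy_scores, toy_reference, k=3))  # (2, 1)
```

In the table above, the same cutoff (k = 689) is applied to every method so that the counts are comparable.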
Figure 2. Comparison of the results of the three methods. The number in each block indicates the number of 7-mers in that part. The number in parentheses indicates the number of reference miRNA families in that block.
Figure 3. Matches between the 689 conserved 7-mers identified by Cons-SVM and the 59 reference miRNAs. Far more sites match nucleotides 1–7 or 2–8 of the mature miRNAs than other positions.
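One way to tabulate where conserved 7-mers pair along mature miRNAs is to slide each 7-mer's reverse complement along every miRNA and record the 1-based start position of each perfect match. The sketch below assumes exact Watson-Crick pairing (no G:U wobble) and uses a hypothetical function name; the inputs are a single sequence from the table of predicted miRNAs and two hand-built 7-mers, not the 689 7-mers or the 59 reference miRNAs.

```python
# Illustrative sketch only; exact pairing without G:U wobble is an assumption.
from collections import Counter

_COMP = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement, accepting RNA or DNA letters in either case."""
    return seq.upper().replace("U", "T").translate(_COMP)[::-1]


def match_position_histogram(kmers, mirnas) -> Counter:
    """Count, per 1-based miRNA start position, how many k-mer/miRNA pairs match there."""
    counts = Counter()
    for kmer in kmers:
        target = revcomp(kmer)
        for mirna in mirnas:
            seq = mirna.upper().replace("U", "T")
            for start in range(len(seq) - len(target) + 1):
                if seq[start:start + len(target)] == target:
                    counts[start + 1] += 1
    return counts


toy_mirnas = ["TAAGCGTAtagcttttcccct"]  # pmir-1 from the table of predicted miRNAs
toy_kmers = ["TACGCTT", "ACGCTTA"]      # complementary to nt 2-8 and nt 1-7, respectively

print(sorted(match_position_histogram(toy_kmers, toy_mirnas).items()))  # [(1, 1), (2, 1)]
```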
Figure 4. The nucleotide composition of the 59 reference miRNAs. The first nucleotide at the 5' end of mature miRNAs significantly favours "U"; the other positions show no comparable bias. The logo plot was produced with WebLogo [49].
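The per-position composition that underlies a logo plot can be computed with a simple frequency table. The sketch below is an assumption-laden illustration: the function name is hypothetical, and because the 59 reference miRNAs are not listed here, the input is the nine predicted sequences from the table of predicted miRNAs, so the resulting numbers do not reproduce Figure 4.

```python
# Illustrative sketch only; input sequences are the predicted candidates, not the reference set.
from collections import Counter


def position_frequencies(seqs, n_positions):
    """Return, for each of the first n_positions, the fraction of A/C/G/U at that position."""
    table = []
    for pos in range(n_positions):
        counts = Counter(s[pos].upper().replace("T", "U") for s in seqs if len(s) > pos)
        total = sum(counts.values())
        table.append({b: (counts.get(b, 0) / total if total else 0.0) for b in "ACGU"})
    return table


predicted = [
    "TAAGCGTAtagcttttcccct", "TTATTGCTtgagaatacacgt", "GATATGTttgatattcttggt",
    "AATTGACTctagtagggagtc", "TAAGTACtagtgccgcaggag", "ATGCAACgttgctgggaagtg",
    "TGTTAACtgtaagactgtgtc", "TATTGTCCtgtcacagcagta", "TTCGTTGTcgacgaaacctgc",
]

freqs = position_frequencies(predicted, n_positions=8)
print(round(freqs[0]["U"], 3))  # 0.667 (6 of the 9 sequences start with U)
```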
The list of predicted miRNAs that are homologous to known miRNAs or conserved in other insects
| pmir-1 | TAAGCGTAtagcttttcccct | chr2L:Minus:Intron | Rank#197ᵃ | + | + |
| pmir-11 | TTATTGCTtgagaatacacgt | chr2R:Minus:Intergenic | tni-miR-137 | + | + |
| pmir-16 | GATATGTttgatattcttggt | chr3L:Plus:Intron | cbr-miR-50 | + | + |
| pmir-20 | AATTGACTctagtagggagtc | chr3R:Plus:Intron | Rank#5 | + | + |
| pmir-26 | TAAGTACtagtgccgcaggag | chr3R:Minus:Intron | cel-mir-252 | + | |
| pmir-29 | ATGCAACgttgctgggaagtg | chr3R:Plus:Intron | + | ||
| pmir-31 | TGTTAACtgtaagactgtgtc | chr3R:Minus:Intron | + | ||
| pmir-33 | TATTGTCCtgtcacagcagta | chr3R:Minus:Intergenic | Rank#119 | + | + |
| pmir-37 | TTCGTTGTcgacgaaacctgc | chrX:Minus:Intergenic | Rank#15 | + | + |
a miRNA candidates also predicted by Lai et al. [5].
b miRNA candidates also predicted by Chan et al. [24].
The list of predicted miRNAs that have significant GO categories (Bonferroni-corrected P-value less than 0.01)
| Predicted miRNA | GO category | P-value |
|---|---|---|
| pmiR-3-5 | protein binding | 1.59E-03 |
| pmiR-5-3 | receptor activity | 8.77E-03 |
| | cell adhesion molecule binding | 1.45E-04 |
| pmiR-7-5 | transcription factor activity | 5.07E-03 |
| pmiR-8-5 | transcription factor activity | 2.70E-03 |
| | specific RNA polymerase II transcription factor activity | 1.61E-03 |
| | structural constituent of cytoskeleton | 7.38E-04 |
| pmiR-10-3 | DNA binding | 9.60E-03 |
| | SH3 domain binding | 3.19E-03 |
| | specific RNA polymerase II transcription factor activity | 2.14E-03 |
| pmiR-13-3 | cell adhesion molecule binding | 1.78E-03 |
| pmiR-15-5 | transcription factor activity | 4.73E-07 |
| | RNA polymerase II transcription factor activity | 6.85E-07 |
| | protein serine/threonine kinase activity | 6.71E-05 |
| pmiR-24-3 | DNA binding | 2.93E-03 |
| | structural constituent of cytoskeleton | 8.08E-03 |
| pmiR-25-5 | protein binding | 6.78E-03 |
| pmiR-28-3 | guanyl-nucleotide exchange factor activity | 7.01E-03 |
| pmiR-31-3 | potassium channel activity | 7.01E-03 |
| pmiR-32-5 | specific RNA polymerase II transcription factor activity | 3.75E-04 |
| pmiR-36-5 | phosphatidylcholine-sterol O-acyltransferase activity | 1.61E-03 |
| pmiR-39-5 | receptor binding | 5.22E-03 |
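For readers unfamiliar with the Bonferroni correction mentioned in the table title, the adjustment multiplies each raw P-value by the number of tests and caps the result at 1. The sketch below uses hypothetical raw P-values and term names, and the function names are made up for illustration; the number of GO categories actually tested per miRNA is not given here, so the example does not reproduce the table.

```python
# Illustrative sketch only; raw P-values and term names are hypothetical.
def bonferroni(pvalues):
    """Bonferroni-adjust a list of raw P-values: p_adj = min(1, p * number_of_tests)."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def significant_terms(raw: dict[str, float], alpha: float) -> dict[str, float]:
    """Return the terms whose Bonferroni-adjusted P-value is below alpha."""
    names = list(raw)
    adjusted = bonferroni([raw[name] for name in names])
    return {name: adj for name, adj in zip(names, adjusted) if adj < alpha}


toy_raw = {"term A": 1e-6, "term B": 0.004, "term C": 0.2}
print(list(significant_terms(toy_raw, alpha=0.01)))  # ['term A']
```

Here "term B" has a raw P-value below the cutoff but is dropped after adjustment (0.004 × 3 = 0.012), which is the conservative behaviour the correction is meant to provide.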