Anton J Enright, Bino John, Ulrike Gaul, Thomas Tuschl, Chris Sander, Debora S Marks.
Abstract
BACKGROUND: The recent discoveries of microRNA (miRNA) genes and characterization of the first few target genes regulated by miRNAs in Caenorhabditis elegans and Drosophila melanogaster have set the stage for elucidation of a novel network of regulatory control. We present a computational method for whole-genome prediction of miRNA target genes. The method is validated using known examples. For each miRNA, target genes are selected on the basis of three properties: sequence complementarity using a position-weighted local alignment algorithm, free energies of RNA-RNA duplexes, and conservation of target sites in related genomes. Application to the D. melanogaster, Drosophila pseudoobscura and Anopheles gambiae genomes identifies several hundred target genes potentially regulated by one or more known miRNAs.
Year: 2003 PMID: 14709173 PMCID: PMC395733 DOI: 10.1186/gb-2003-5-1-r1
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Figure 1. Algorithm and analysis pipeline. Source data consisting of (a) miRNAs and (b) 3' UTRs are processed initially by (c) the miRanda algorithm, which searches for complementarity matches between miRNAs and 3' UTRs using dynamic programming alignment (Phase 1) and thermodynamic calculation (Phase 2). (d) All results are then post-processed by first filtering out results not consistently conserved according to target sequence similarity with D. pseudoobscura and A. gambiae (Phase 3), then by sorting and ranking all remaining results. (e) Finally, all miRNA target gene predictions are annotated using data from FlyBase and stored for further analysis.
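The idea of position-weighted complementarity in Phase 1 can be illustrated with a minimal sketch. This is not the miRanda implementation: it scores ungapped duplexes only (no dynamic programming, no gaps), and the pair scores, mismatch penalty, and 5' weighting (`PAIR_SCORES`, `MISMATCH`, `WEIGHTED_LENGTH`, `WEIGHT`) are illustrative assumptions that are not given in this record. The thermodynamic step (Phase 2) is not modeled.

```python
from typing import Optional, Tuple

# Assumed, illustrative values; the record above does not state the scoring scheme.
PAIR_SCORES = {
    ("A", "U"): 5, ("U", "A"): 5,
    ("G", "C"): 5, ("C", "G"): 5,
    ("G", "U"): 2, ("U", "G"): 2,  # G:U wobble pair scored lower
}
MISMATCH = -3
WEIGHTED_LENGTH = 11  # assumed number of miRNA 5' positions given extra weight
WEIGHT = 2            # assumed weight applied to those positions


def duplex_score(mirna: str, site: str) -> int:
    """Score an ungapped miRNA:site duplex of equal length.

    Both strings are written 5'->3', so miRNA position i pairs with
    site position len(site) - 1 - i (antiparallel pairing).
    """
    if len(mirna) != len(site):
        raise ValueError("miRNA and site must have the same length")
    total = 0
    for i, base in enumerate(mirna):
        partner = site[len(site) - 1 - i]
        pair = PAIR_SCORES.get((base, partner), MISMATCH)
        total += pair * (WEIGHT if i < WEIGHTED_LENGTH else 1)
    return total


def best_site(mirna: str, utr: str) -> Optional[Tuple[int, int]]:
    """Slide the miRNA along the UTR; return (best score, start offset)."""
    n = len(mirna)
    best = None
    for start in range(len(utr) - n + 1):
        score = duplex_score(mirna, utr[start:start + n])
        if best is None or score > best[0]:
            best = (score, start)
    return best


# Toy sequences (hypothetical): the site is the exact reverse complement.
print(best_site("UGAGGUAG", "AAAA" + "CUACCUCA" + "GGGG"))  # (80, 4)
```

A real local alignment would additionally allow gaps and partial matches, which is what the dynamic programming step in the figure provides.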
Validation of prediction method on experimentally verified miRNA targets
| miRNA | Organism | Target gene (3' UTR) | Number of experimental sites | Number of predicted sites | Rank | Number of predicted sites with conservation | Matches experimental to predicted | Matches experimental to predicted (%) |
| | cel/cbr | | 7 | 1 | | 0 | 0 | 0% |
| | cel/cbr | | 1 | 1 | 4/1,014 | 1 | 1 | 100% |
| | cel/cbr | | 1 | 1 | 5/1,014 | N/A | 1† | 100%† |
| | cel/cbr | | 2 | 6 | 9/1,014 | 2 | 2 | 100% |
| | cel/cbr | | 1 | 1 | 12/1,014 | 1 | 1 | 100% |
| | cel/cbr | | 2 | 6 | 2/1,014 | N/A | 2† | 100%† |
| | cel/cbr | | 3 | 10 | 7/1,014 | 1 | 1 | 33% |
| | cel/cbr | | 8 | 14 | 1/1,014 | 8 | 5 | 63% |
| | dme/dps | | 2 | 2 | 1/11,318 | 2 | 2 | 100% |
| | dme/dps | CG10222 | 1 | 1 | 4/11,318 | 1 | 1 | 100% |
Using intermediate thresholds (S: 80; ΔG: -14 kcal/mol), we list, for each known miRNA and target gene pair (in either C. elegans or D. melanogaster): the number of known experimental target sites; the number of sites detected here, both raw and conserved in C. briggsae or D. pseudoobscura; and the number and percentage of known sites that correspond to computationally detected conserved sites, with larger values indicating more successful (retrospective) prediction. † and 'N/A' indicate that no 3' UTR was available to scan against in C. briggsae, so no conservation analysis was possible; these results assume conservation. cel/cbr: C. elegans/C. briggsae; dme/dps: D. melanogaster/D. pseudoobscura.
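How such thresholds might be applied to candidate sites can be sketched as follows. The `Hit` record, the function names, the inclusive comparisons, and the sort order are all assumptions made for illustration; only the threshold values S: 80 and ΔG: -14 kcal/mol come from the caption above.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Hit:
    """Hypothetical record for one candidate miRNA target site."""
    mirna: str
    gene: str
    score: float      # alignment score S
    delta_g: float    # duplex free energy in kcal/mol (more negative = more stable)
    conserved: bool   # site also detected in the related genome


def passes(hit: Hit, min_score: float = 80.0, max_delta_g: float = -14.0,
           require_conservation: bool = True) -> bool:
    """Keep a hit only if it meets every criterion (inclusive bounds assumed)."""
    if hit.score < min_score or hit.delta_g > max_delta_g:
        return False
    return hit.conserved or not require_conservation


def select_hits(hits: Iterable[Hit], **thresholds) -> List[Hit]:
    """Filter hits, then rank by score (high first) and free energy (low first)."""
    kept = [h for h in hits if passes(h, **thresholds)]
    return sorted(kept, key=lambda h: (-h.score, h.delta_g))


# Made-up example data, not taken from the paper.
candidates = [
    Hit("miR-x", "geneA", 95, -20.0, True),
    Hit("miR-x", "geneB", 85, -10.0, True),    # fails the free-energy threshold
    Hit("miR-y", "geneC", 120, -25.0, False),  # fails the conservation filter
    Hit("miR-y", "geneD", 120, -30.0, True),
]
print([h.gene for h in select_hits(candidates)])  # ['geneD', 'geneA']
```

Passing `require_conservation=False` corresponds to reporting raw sites, as opposed to the conserved sites counted in the table.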
Whole genome comparison of real versus randomized miRNAs against the complete genomes of D. melanogaster and D. pseudoobscura
| | Total hits | Total conserved hits | 1 site | ≥ 2 sites |
| 73 actual miRNAs (A) | 6,864 | 589 | 556 | 33 |
| 73 Random miRNAs (B) | 5,152 | 204 | 201 | 3 |
| Standard deviation (100 experiments) | ± 132 | ± 43 | ± 40 | ± 3 |
| Ratio (A/B) | | 2.9 | 2.8 | 11.0 |
| Estimated false positives (%) | | 35% | 36% | 9% |
Detected conserved hits (especially those with multiple detected sites in the 3' UTR) are significantly over-represented (2.8× and 11× as many cases, on average, respectively) in analyses with actual miRNAs compared to randomly shuffled miRNAs. The thresholds used for this analysis were S: 100; ΔG: -19 kcal/mol; ID: 70%.
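The ratio and false-positive rows can be reproduced from the conserved-hit counts in the table. The sketch below assumes that the false-positive estimate is the random count divided by the actual count (B/A); the record does not state this formula, but it is consistent with the reported percentages. The dictionary keys are labels chosen here for illustration.

```python
# Conserved-hit counts from the table: actual miRNAs (A) and the average
# over shuffled miRNAs (B).
actual = {"total_conserved": 589, "one_site": 556, "two_or_more_sites": 33}
random_mean = {"total_conserved": 204, "one_site": 201, "two_or_more_sites": 3}


def enrichment(a: dict, b: dict) -> dict:
    """Return {category: (A/B ratio, assumed false-positive fraction B/A)}."""
    return {key: (a[key] / b[key], b[key] / a[key]) for key in a}


for category, (ratio, fp) in enrichment(actual, random_mean).items():
    print(f"{category}: ratio {ratio:.1f}, estimated false positives {fp:.0%}")
# total_conserved: ratio 2.9, estimated false positives 35%
# one_site: ratio 2.8, estimated false positives 36%
# two_or_more_sites: ratio 11.0, estimated false positives 9%
```

The drop from roughly 35% to 9% shows why genes with two or more detected sites are treated as the more reliable predictions.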
Functional analysis of actual versus random miRNAs
| | GO molecular function class | | | | | | | | | | | |
| | Transcription regulator | Apoptosis regulator | Cell adhesion molecule | Binding | Enzyme | Transporter | Signal transducer | Translation regulator | Motor | Enzyme regulator | Structural molecule | Chaperone |
| Actual miRNAs | 113 | 6 | 10 | 146 | 128 | 35 | 66 | 4 | 6 | 9 | 6 | 1 |
| Random miRNAs (average) | 43 | 1 | 2 | 98 | 88 | 22 | 41 | 2 | 4 | 9 | 7 | 1 |
| Standard deviation | ± 7.5 | ± 0.9 | ± 1.5 | ± 13.3 | ± 13.7 | ± 4.9 | ± 9.7 | ± 1.3 | ± 2.2 | ± 3.9 | ± 2.6 | ± 1.2 |
| Z-score | 9.3 | 5.9 | 5.0 | 3.6 | 2.9 | 2.7 | 2.6 | 1.4 | 1.0 | 0.01 | -0.3 | -0.3 |
Integers are the numbers of detected conserved cases in each class. The standard deviation is for 50 experiments. The Z-scores for seven functional classes indicate over-representation among actual miRNA target genes. The thresholds used are S: 100; ΔG: -19 kcal/mol; ID: 70%.
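As a worked check, the Z-score of each class can be recomputed as (actual count − random average) / standard deviation. This is the standard definition and is assumed here rather than stated in the record. Because the table reports rounded averages and deviations, recomputed values for classes with small counts (for example, apoptosis regulator) differ slightly from the printed Z-scores; the larger classes below agree to one decimal place.

```python
def z_score(observed: float, mean: float, sd: float) -> float:
    """Number of standard deviations by which observed exceeds the random mean."""
    return (observed - mean) / sd


# (actual count, random average, standard deviation) taken from the table.
classes = {
    "Transcription regulator": (113, 43, 7.5),
    "Binding": (146, 98, 13.3),
    "Enzyme": (128, 88, 13.7),
    "Transporter": (35, 22, 4.9),
    "Signal transducer": (66, 41, 9.7),
}

for name, (observed, mean, sd) in classes.items():
    print(f"{name}: Z = {z_score(observed, mean, sd):.1f}")
# Transcription regulator: Z = 9.3
# Binding: Z = 3.6
# Enzyme: Z = 2.9
# Transporter: Z = 2.7
# Signal transducer: Z = 2.6
```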
Figure 2. Functional map of miRNAs and their target genes. Left axis: selected over-represented FlyBase [49] derived GO [87] classifications from the 'molecular function' hierarchy. Bottom axis: ordered list of the 73 miRNAs. Each cell in the matrix is color-coded according to the degree of over-representation (right axis) for a miRNA hitting a specific functional class. For example, a bright red box indicates that a given miRNA hits six to eight times more targets in a particular class than one would expect by chance. The matrix is built by two-dimensional hierarchical clustering after normalization for classes that are over-represented in FlyBase annotations as a whole.
Potential miRNA targets of Hox cluster genes and their regulators
| Gene name | Gene identifier | miRNA |
| | CG10325* | |
| | CG11648 | |
| | CG1028* | |
| | CG2047 | |
| | CG17117 | |
| | CG32443 | |
| | CG1030 | |
| | CG3848* | |
| | CG8651 | |
| | CG10388 | |
*Target gene based on top 20 hits of each miRNA; all others are based on top ten hits. †Target site also conserved in A. gambiae.
Potential miRNA targets of ecdysone induction
| Gene name | Gene identifier | miRNA |
| | CG6438 | |
| | CG3166 | |
| | CG5206 | |
| | CG11491-RA | |
| | CG11491-RB&RC | |
| | CG11491-RD | |
| | CG11491-RE&RG | |
| | CG11491-RF | |
| | CG14938 | |
| | CG13478 | |
| | CG5400 | |
| | CG1765 | |
| | CG7266 | |
| | CG32180 | |
| | CG18389 | |
| | CG10002 | |
| | CG1864 | |
| | CG33183* | |
| | CG11783† | |
| | CG16973† | |
| | CG7761 | |
| | CG6601 | |
| | CG4319 | |
| | CG1975 | |
| | CG2272† | |
| | CG5123 | |
| | CG5965 | |
*,†Target gene based on the top 20 and top 30 hits of each miRNA, respectively; all others based on top ten hits. ‡The gene br has seven splice variants, RA to RG, five of which have unique UTRs. §Target site also conserved in A. gambiae.
Figure 3. Representation of 3' UTRs for potential miRNA target genes involved in axon guidance. Each individual conserved hit between a miRNA and a target gene is marked by an annotated triangle on a conservation plot (D. melanogaster versus D. pseudoobscura) for that UTR. Red triangles indicate target site locations that are illustrated in more detail (alignment and secondary structure) below. Multiple target sites on a 3' UTR for one or more miRNAs are not uncommon and reflect cooperative regulation of transcription.
Potential miRNA targets of the axon guidance pathway
| Gene name | Gene identifier | miRNA |
| | CG4032* | |
| | CG4846† | |
| | CG3727 | |
| | CG3915 | |
| | CG1511 | |
| | CG10443 | |
| | CG18657* | |
| | CG10521† | |
| | CG2005 | |
| | CG13521* | |
| | CG18405* | |
| | CG6446* | |
| | CG4700* | |
| | CG8355 | |
| | CG18497† | |
| | CG18214 | |
*,†Target gene based on the top 20 and top 30 hits of each miRNA, respectively; all others based on top 10 hits. ‡Target site also conserved in A. gambiae.