Michael Hackenberg, Naiara Rodríguez-Ezpeleta, Ana M Aransay.
Abstract
We present a new version of miRanalyzer, a web server and stand-alone tool for the detection of known and prediction of new microRNAs in high-throughput sequencing experiments. The new version has been notably improved regarding speed, scope and available features. Alignments are now based on the ultrafast short-read aligner Bowtie (which also adds colour-space support, allows mismatches and improves speed), and 31 genomes, including 6 plant genomes, can now be analysed (the previous version contained only 7). Differences between plant and animal microRNAs have been taken into account in the prediction models, and differential expression of both known and predicted microRNAs between two conditions can be calculated. Additionally, consensus sequences of predicted mature and precursor microRNAs can be obtained from multiple samples, which increases the reliability of the predicted microRNAs. Finally, a stand-alone version of miRanalyzer, based on a local and easily customized database, is also available; it gives the user more control over certain parameters and allows the use of specific data, such as unpublished assemblies or other libraries, that are not available in the web server. miRanalyzer is available at http://bioinfo2.ugr.es/miRanalyzer/miRanalyzer.php.
Year: 2011 PMID: 21515631 PMCID: PMC3125730 DOI: 10.1093/nar/gkr247
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. General workflow of miRanalyzer. The fastq file is transformed into a read count file, which is filtered to keep only sequences from 17 to 26 bases. These reads are successively mapped to several databases in order to identify known microRNAs, discard messenger RNA contaminations and select sequences for the microRNA prediction step.
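The first step of this workflow (fastq file to filtered read count file) can be illustrated with a minimal Python sketch. This is not miRanalyzer's own code: the function names `fastq_sequences` and `read_counts` are hypothetical, and the sketch assumes plain four-line FASTQ records with no wrapped sequence lines. It follows the caption in keeping only reads of 17 to 26 bases.

```python
from collections import Counter
from typing import Iterable, Iterator

MIN_LENGTH = 17  # shortest read kept, as stated in the Figure 1 caption
MAX_LENGTH = 26  # longest read kept, as stated in the Figure 1 caption


def fastq_sequences(lines: Iterable[str]) -> Iterator[str]:
    """Yield the sequence line of each record (assumes 4 lines per record)."""
    for i, line in enumerate(lines):
        if i % 4 == 1:  # line 2 of every 4-line record holds the bases
            yield line.strip()


def read_counts(lines: Iterable[str],
                min_length: int = MIN_LENGTH,
                max_length: int = MAX_LENGTH) -> Counter:
    """Collapse identical reads into counts and keep only the allowed lengths."""
    counts = Counter(fastq_sequences(lines))
    return Counter({seq: n for seq, n in counts.items()
                    if min_length <= len(seq) <= max_length})


if __name__ == "__main__":
    # Tiny in-memory example: two copies of an 18-nt read and one 4-nt read.
    example = [
        "@r1", "ACGTACGTACGTACGTAC", "+", "I" * 18,
        "@r2", "ACGTACGTACGTACGTAC", "+", "I" * 18,
        "@r3", "ACGT", "+", "IIII",
    ]
    for seq, n in read_counts(example).items():
        print(seq, n)
```

Collapsing identical reads before mapping means each unique sequence only has to be aligned once, with its count carried along for the later expression analysis.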
The default values of the parameters used in miRanalyzer are shown
| Name | Description | Value |
|---|---|---|
| General parameters | | |
| minLength | The minimum read length; shorter reads are removed | 17 |
| maxLength | The maximum read length; longer reads are trimmed to this length | 26 |
The web server version allows the user to change the ‘−n’ parameter; the stand-alone version allows all of them to be changed. We used −l 17 to detect known microRNAs and to predict new microRNAs (alignment to the genome), as this is the shortest microRNA length in miRBase, but −l 20 for the other libraries.
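As an illustration of how the two seed settings mentioned above map onto a Bowtie call, the following Python sketch assembles a Bowtie 1 command line. It is an assumption-laden example, not the command miRanalyzer actually runs: the function name `bowtie_command`, the index names and the read file name are hypothetical, and the value passed for `-n` is a placeholder because its default is not given here. Only the standard Bowtie 1 options `-f` (FASTA input), `-n` (maximum mismatches in the seed), `-l` (seed length) and `-C` (colour space) are used.

```python
from typing import List


def bowtie_command(index: str, reads_fasta: str, seed_length: int,
                   seed_mismatches: int, colour_space: bool = False) -> List[str]:
    """Build a Bowtie 1 argument list; the caller decides whether to run it."""
    if not 0 <= seed_mismatches <= 3:
        raise ValueError("Bowtie 1 accepts 0 to 3 seed mismatches for -n")
    if seed_length < 5:
        raise ValueError("Bowtie 1 requires a seed length of at least 5 for -l")
    cmd = ["bowtie", "-f", "-n", str(seed_mismatches), "-l", str(seed_length)]
    if colour_space:
        cmd.append("-C")  # requires an index built in colour space
    cmd += [index, reads_fasta]
    return cmd


if __name__ == "__main__":
    # Hypothetical index and file names; -n 1 is a placeholder value.
    known = bowtie_command("mirbase_mature", "reads.fa", seed_length=17, seed_mismatches=1)
    other = bowtie_command("rfam", "reads.fa", seed_length=20, seed_mismatches=1)
    print(" ".join(known))
    print(" ".join(other))
```

The resulting list can be passed to `subprocess.run` on a machine where Bowtie and the corresponding indexes are installed.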
Figure 2. Selection of longest alignments performed by miRanalyzer. The example shows the best alignments for two reads obtained with Bowtie, and the one selected (light grey square). The 17 nt seed is outlined and the longest alignment maintaining the number of observed mismatches within the seed is highlighted in red. Note that for Read2, the chosen alignment is not the one that contains the least total number of mismatches.
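One way to read the selection rule in this caption is sketched below in Python. It is an interpretation, not the published implementation: the function names `extension_length` and `select_alignment` are hypothetical, each candidate alignment is represented simply as the reference string under the read, and ties are resolved in favour of the first candidate. For each candidate, the alignment is extended past the 17 nt seed until the first new mismatch, so that the mismatch count stays at the value observed within the seed; the candidate with the longest such extension is selected.

```python
from typing import List, Tuple

SEED_LENGTH = 17  # seed length given in the Figure 2 caption


def extension_length(read: str, ref: str, seed_len: int = SEED_LENGTH) -> Tuple[int, int]:
    """Return (aligned length, seed mismatches) for one candidate alignment.

    The alignment covers the seed and is extended base by base until the
    first mismatch outside the seed, so no mismatch is added beyond the seed.
    """
    seed_mm = sum(a != b for a, b in zip(read[:seed_len], ref[:seed_len]))
    length = min(seed_len, len(read), len(ref))
    for a, b in zip(read[seed_len:], ref[seed_len:]):
        if a != b:
            break
        length += 1
    return length, seed_mm


def select_alignment(read: str, candidates: List[str],
                     seed_len: int = SEED_LENGTH) -> str:
    """Pick the candidate with the longest extension (first one wins ties)."""
    return max(candidates, key=lambda ref: extension_length(read, ref, seed_len)[0])


if __name__ == "__main__":
    read = "TGAGGTAGTAGGTTGTATAGTT"               # 22-nt example read
    ref_a = read[:18] + "C" + read[19:]            # perfect seed, mismatch at position 19
    ref_b = "A" + read[1:5] + "C" + read[6:]       # two seed mismatches, none afterwards
    print(extension_length(read, ref_a))
    print(extension_length(read, ref_b))
    print(select_alignment(read, [ref_a, ref_b]) == ref_b)
```

In this example the second candidate is chosen even though it carries more mismatches in total, which mirrors the behaviour noted for Read2 in the caption.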
Data sets used to train the prediction models
| Species | Tissues/Conditions | No. of microRNAs | References | GEO references |
|---|---|---|---|---|
| Animal | ||||
| | 16 | 10 321 | ( | GSE19812, GSE20384, GSE21279, GSE20892 |
| | 9 | 6201 | ( | GSE20384, GSE19473 |
| | 9 | 587 | ( | GSE12462, GSE24314, GSE24608, GSE24542, GSE24540 |
| | 12 | 2091 | ( | GSE18634, GSE13339 |
| | 2 | 695 | ( | GSE21503, GSE22068 |
| | 3 | 46 | ( | GSE17965 |
| Plant | ||||
| | 4 | 295 | ( | GSE20448, GSE16971 |
| | 9 | 1302 | ( | GSE23217, GSE20748 |
| | 3 | 193 | ( | GSE17339 |
| | 1 | 28 | ( | GSE18406 |
Features used for the Random forest prediction models
| Feature | Used for kingdom |
|---|---|
| Number of bindings in read cluster sequence | Animal |
| Normalized mean free energy of precursor sequence | Plant and Animal |
| Number of bindings in precursor | Animal |
| Length of read cluster | Plant and Animal |
| The corresponding putative mature-star sequence is also present (binary value: 0 or 1) | Plant and Animal |
| Number of bindings in read cluster divided by the read cluster length | Plant |
| Number of reads in read cluster | Plant and Animal |
| Mean free energy of precursor sequence | Plant and Animal |
| Degree of bulb asymmetry in precursor | Animal |
| The number of bulbs in precursor secondary structure | Plant |