
miRNEST 2.0: a database of plant and animal microRNAs.

Michal W Szczesniak, Izabela Makalowska.

Abstract

Ever-growing interest in microRNAs has greatly increased the number of resources and research papers devoted to the field and, as a result, it has become more and more demanding to find miRNA data of interest. To mitigate this problem, we created the miRNEST database (http://mirnest.amu.edu.pl), an integrative microRNA resource. In its updated version, named miRNEST 2.0, the database is complemented with our extensive miRNA predictions from deep sequencing libraries, data from plant degradome analyses, results of pre-miRNA classification with HuntMi and miRNA splice site information. We also added download and upload options and improved the user interface to make it easier to browse through miRNA records.

Year:  2013        PMID: 24243848      PMCID: PMC3965105          DOI: 10.1093/nar/gkt1156

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

microRNAs (miRNAs) are a class of negative regulators of gene expression, widely identified in animals and plants. In plants, miRNAs participate in different aspects of growth and developmental processes, including lateral root formation and the transition from the juvenile to the adult vegetative phase (1). They are also key players in the response to stress conditions, such as drought, low temperatures or nitrogen deficiency (2). Animal miRNAs are believed to regulate more than half of protein-coding genes and, as in plants, are implicated in a number of biological processes (3). Notably, multiple miRNAs have been associated with diseases, such as cancers or rheumatoid arthritis (4). The fact that miRNAs are key regulators of molecular processes in a cell and that they could find multiple applications in biotechnology, molecular biology or medicine has motivated extensive development of methods for their identification and study. The growing number of miRNA studies has allowed a better understanding of their biology and, consequently, has led to a proliferation of miRNA databases. However, many of them are limited to species of high interest, selected taxa or miRNAs involved in some specific processes. For instance, miRNeye (5) collects data about miRNA expression in the mouse eye, whereas GrapeMiRNA stores sequences from V. vinifera (6). miRBase (7), on the other hand, although it accommodates data from a wide range of species, contains only already published results. As a result, a single universal repository is needed so that users do not have to browse through a number of dispersed data sets to collect information related to a specific species or miRNA type. Previously, we took up this challenge and developed miRNEST, a comprehensive online resource for plant, animal and virus miRNAs. Using a comparative approach, we identified 10 004 miRNA candidates in 221 animal and 199 plant species.
As our goal was not only to identify new miRNAs but also to develop a resource that would integrate miRNA data scattered across the literature and databases, we also incorporated miRNA sequences from three other databases and two publications. Additionally, based on availability, we used data from 12 resources providing further annotation for miRNAs from selected species. Here we present miRNEST 2.0, an updated version of the database. In addition to the 39 122 miRNAs from miRNEST 1.0 (10 004 from our EST analysis and 29 118 from other resources), we predicted 18 043 pre-miRNAs using small RNA deep sequencing data from 21 species. For miRNAs in 10 species, we provided targets inferred from degradome libraries. We also added miRNA splice site information, HuntMi (8) predictions and some database functionalities, including a download option. Taken together, miRNEST 2.0 is a large and comprehensive resource of miRNA data that offers distinct improvements over its previous version.

MATERIALS AND METHODS

miRNA prediction from sRNA deep sequencing data

For miRNA predictions, we downloaded 171 small RNA deep sequencing libraries from 8 plant and 13 animal species from the GEO database (9) (Figure 1, Supplementary Table S1). Reads 19–26 bases long were kept and mapped to the corresponding plant or animal genomes using Bowtie (10). In the mapping step, no mismatches were allowed and reads mapping to >20 distinct locations were discarded. Mapped reads that were 19–22 nt long and had a count ≥ 5 were considered ‘potential mature miRNAs’. We retrieved their sequences from the genomes along with flanking genomic sequences of 150 bases in animals and 250 bases in plants, and then predicted secondary structures using hybrid-ss-min from the UNAFold package (11). We kept only sequences with miRNA-like secondary structures: a stem-loop structure with the ‘potential mature miRNA’ located in a single hairpin arm, and no more than six mismatches and three bulges (animals) or five mismatches and two bulges (plants) between the mature miRNA and the opposite hairpin arm. If a stem-loop structure was surrounded by additional nucleotides, the flanking regions were cut off. Subsequently, we checked similarity to non-coding RNAs from RFAM (12) and proteins from UniProt (UniProtKB/Swiss-Prot protein data set) (13) using BLAST (14). Sequences showing similarity to RFAM non-miRNAs with E < 1e-10 or UniProt proteins with E < 1e-20 were discarded. After that, we searched for low-complexity regions using Dustmasker (14); sequences in which low-complexity regions made up >60% of the sequence were removed. Finally, we made sure that the reads mapped to the hairpin showed a miRNA-like profile. To achieve this, we kept only the hairpins where (i) the ‘potential mature miRNA’ corresponded to the most abundant read in at least one library, (ii) the abundance of the ‘potential mature miRNA’ constituted at least 20% of total read counts in at least one library and (iii) the total count of reads starting at the 5′ position of the ‘potential mature miRNA’ was the maximal one in at least one library.
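The read-level filter and the three read-profile criteria described above can be illustrated with a minimal Python sketch. It is not the code used to build the database: the function names, the (start, sequence, count) tuple layout and the treatment of ties as passing are assumptions made for illustration, and each criterion is evaluated independently, so the three may be satisfied in different libraries.

```python
from collections import Counter


def is_potential_mature_mirna(seq, count):
    """Read-level filter from the text: 19-22 nt long with a count of at least 5."""
    return 19 <= len(seq) <= 22 and count >= 5


def has_mirna_like_profile(cand_start, cand_seq, libraries, min_fraction=0.20):
    """Check criteria (i)-(iii) for one hairpin.

    libraries: list of libraries; each library is a list of
    (start, sequence, count) tuples for reads mapped to the hairpin.
    This layout is hypothetical and chosen only for this sketch.
    """
    most_abundant = dominant = top_start = False
    for reads in libraries:
        by_read = Counter()   # counts per distinct (start, sequence) read
        by_start = Counter()  # counts summed over reads sharing a 5' start
        for start, seq, count in reads:
            by_read[(start, seq)] += count
            by_start[start] += count
        total = sum(by_read.values())
        if total == 0:
            continue
        cand_count = by_read[(cand_start, cand_seq)]
        # (i) the candidate is the most abundant read in this library
        if cand_count > 0 and cand_count == max(by_read.values()):
            most_abundant = True
        # (ii) the candidate makes up at least 20% of the reads on the hairpin
        if cand_count / total >= min_fraction:
            dominant = True
        # (iii) reads starting at the candidate's 5' position are the most numerous
        if by_start[cand_start] > 0 and by_start[cand_start] == max(by_start.values()):
            top_start = True
    return most_abundant and dominant and top_start
```

With a single toy library `[(10, "A" * 21, 50), (11, "C" * 20, 5), (40, "G" * 21, 10)]`, the read starting at position 10 passes all three criteria, whereas the read starting at position 11 fails criterion (i).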
Figure 1. The pipeline used for large-scale miRNA discovery from sRNA deep-sequencing data.

Newly identified miRNA candidates were checked against intronic sequences in the corresponding species; sequences that fully overlapped with introns and had the ‘potential mature miRNA’ located no more than four bases away from the 5′ or 3′ intron end became mirtron candidates. We supplemented these candidates with already published predictions in mouse and human (15).
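The mirtron criterion can be written as a simple interval check. The sketch below is illustrative only: the function name and the inclusive (start, end) coordinate convention are assumptions, and the distance is measured from the nearest end of the mature miRNA to the intron boundary, which is one possible reading of the rule.

```python
def is_mirtron_candidate(hairpin, mature, intron, max_offset=4):
    """Return True if a hairpin qualifies as a mirtron candidate.

    Each argument is an inclusive (start, end) pair on the same sequence
    and strand (a hypothetical convention chosen for this sketch).
    """
    h_start, h_end = hairpin
    m_start, m_end = mature
    i_start, i_end = intron
    # The hairpin must lie entirely within the intron.
    inside = i_start <= h_start and h_end <= i_end
    # The mature miRNA must sit within max_offset bases of either intron end.
    near_edge = (m_start - i_start) <= max_offset or (i_end - m_end) <= max_offset
    return inside and near_edge
```

For example, a hairpin at (100, 170) with a mature miRNA at (102, 123) inside an intron at (100, 172) qualifies, because the mature miRNA starts two bases from the intron boundary.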

Degradome analysis

We downloaded 18 degradome libraries from GEO (9) that corresponded to 10 plant species: Arabidopsis thaliana, Glycine max, Hordeum vulgare, Malus domestica, Medicago truncatula, Physcomitrella patens, Prunus persica, Solanum lycopersicum, Triticum aestivum and V. vinifera (Supplementary Table S2). Transcript sequences (cDNAs) were downloaded from Ensembl Plants (16), and mature miRNA sequences were retrieved from miRNEST (17). Using PAREsnip (18), we searched for miRNA targets evidenced by degradome reads. We adjusted the program settings to look only for category 0, 1 and 2 targets, i.e. only high-confidence candidates. For the candidates obtained, we prepared degradome read alignment files and corresponding plots for graphical representation of read mapping.

HuntMi predictions

HuntMi (8) is a machine learning tool for discrimination between true and false pre-miRNAs in plants, animals and viruses based on properties of pre-miRNA sequence and its secondary structure. We used this tool with default settings to better annotate pre-miRNAs stored in miRNEST. For animal, plant and virus sequences, different taxon-specific classifiers were used.

miRNA splice sites prediction

To infer miRNA splicing events from EST sequences, we applied a strategy previously used in ERISdb (19). In the first step, pre-miRNAs were searched against dbEST (20) using Megablast (14). We required an identity of 97% or higher and that the EST sequence contained at least 90% of the known pre-miRNA sequence. The selected ESTs were subsequently mapped to the corresponding genome using Splign (21) with default settings. Finally, the alignments were checked manually to remove cases where ESTs came from the antisense strand and to improve the alignment wherever a splice site was broken because of imperfections in the EST alignment software. Additionally, gene structures for 45 plant miRNAs were downloaded from ERISdb (19). We also obtained gene structures from RACE experiments in Populus trichocarpa (22) and RNA-Seq-evidenced splice sites in V. vinifera (23).
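The two Megablast thresholds amount to a simple per-hit filter. The following Python sketch assumes that the percent identity and the number of pre-miRNA bases covered by the alignment have already been extracted for each hit; the function and parameter names are hypothetical and are not part of Megablast or Splign.

```python
def passes_est_filter(identity_pct, covered_premirna_bases, premirna_length,
                      min_identity=97.0, min_coverage=0.90):
    """Keep an EST hit only if it meets both thresholds stated in the text.

    identity_pct: percent identity of the alignment.
    covered_premirna_bases: number of pre-miRNA bases included in the alignment.
    premirna_length: full length of the known pre-miRNA.
    """
    if premirna_length <= 0:
        raise ValueError("premirna_length must be positive")
    coverage = covered_premirna_bases / premirna_length
    return identity_pct >= min_identity and coverage >= min_coverage
```

A hit with 98.5% identity covering 95 of 100 pre-miRNA bases would be kept, while one with 99% identity covering only 89 bases would be dropped before the Splign step.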

RESULTS

In the current version, miRNEST has been extensively enlarged by results of small RNA deep sequencing analyses. First of all, we predicted 18 043 pre-miRNAs in 21 plant and animal species, and because miRNAs were often found independently in different sRNA libraries, this corresponds to as many as 36 468 new records in the database. In the search pipeline, we applied a number of strict criteria from the literature (17,24,25). In all, 38.1% of new sequences overlap with miRNAs already stored in miRNEST 1.0, thus providing experimental support for them (Supplementary Table S3). Moreover, as the database encompasses multiple libraries per species, it is possible to investigate isomiRs and changes in small RNA counts in different tissues and conditions. Although a similar functionality is available at miRBase (7), the analyzed species and selected deep sequencing libraries overlap only partly. Furthermore, for all miRNAs stored in miRNEST, including new predictions, we ran a classification analysis using HuntMi, which considerably improved their annotation. Altogether, 91.16% of miRNEST sequences were considered true miRNAs, including miRNEST EST predictions (77.85%), miRNEST deep sequencing predictions (71.9%) and miRNAs from external databases (96.91%). The relatively high fraction of sequences recognized as true miRNAs in the case of external databases [miRBase (7), PMRD (26), microPC (27)] might be due to the fact that this data set largely overlaps with the miRNAs used to train HuntMi. Another aspect of deep sequencing analysis was the identification of degradome-evidenced miRNA targets in 10 plant species. As we wanted to achieve the highest-quality results, only category 0, 1 and 2 candidates, as returned by PAREsnip, were considered. This allowed us to identify 2041 miRNA-target associations (Supplementary Table S4). Splicing in miRNA genes is an underestimated aspect of miRNA biology. So far, there is only one repository that stores miRNA splice site information (19).
We incorporated that data into miRNEST 2.0 and additionally performed splice site search in several species, which allowed us to find 17 miRNAs with introns in 5 plant species. We also complemented that data with miRNA gene structures from the literature (P. trichocarpa, V. vinifera).

CONCLUSIONS

The current version of the miRNEST database contains twice as many miRNA records as version 1.0. Thanks to the small RNA deep sequencing data analysis, almost 40% of previously predicted miRNAs are now supported by experimental data. Moreover, target predictions for miRNAs from 10 species are supported by degradome data. miRNEST 2.0 also has an updated user interface and works faster than its predecessor. We added both bulk data download and a download option on the ‘Browse’ page (for user-selected miRNAs). As we want miRNEST to grow and be a truly comprehensive miRNA resource, we also enabled an upload option for miRNA-associated data.

AVAILABILITY AND REQUIREMENTS

miRNEST is freely available at http://mirnest.amu.edu.pl. Its previous version, miRNEST 1.0, can still be accessed at http://lemur.amu.edu.pl/share/php/mirnest_1.0. The database was constructed using Hypertext Markup Language (HTML), Cascading Style Sheets (CSS), PHP 5.2.11 (http://www.php.net/) and MySQL 4.0.31 (http://www.mysql.com/). Pre-miRNA secondary structures are drawn using the lightweight Java applet VARNA (28), which requires installation of a Java plugin.

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online.

FUNDING

Funding for open access charge: National Science Centre grant [2011/01/N/NZ2/01653 to M.W.S]. Conflict of interest statement. None declared.
  28 in total

1.  Criteria for annotation of plant MicroRNAs.

Authors:  Blake C Meyers; Michael J Axtell; Bonnie Bartel; David P Bartel; David Baulcombe; John L Bowman; Xiaofeng Cao; James C Carrington; Xuemei Chen; Pamela J Green; Sam Griffiths-Jones; Steven E Jacobsen; Allison C Mallory; Robert A Martienssen; R Scott Poethig; Yijun Qi; Herve Vaucheret; Olivier Voinnet; Yuichiro Watanabe; Detlef Weigel; Jian-Kang Zhu
Journal:  Plant Cell       Date:  2008-12-12       Impact factor: 11.277

2.  UNAFold: software for nucleic acid folding and hybridization.

Authors:  Nicholas R Markham; Michael Zuker
Journal:  Methods Mol Biol       Date:  2008

Review 3.  Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.

Authors:  S F Altschul; T L Madden; A A Schäffer; J Zhang; Z Zhang; W Miller; D J Lipman
Journal:  Nucleic Acids Res       Date:  1997-09-01       Impact factor: 16.971

Review 4.  Functions of microRNAs and related small RNAs in plants.

Authors:  Allison C Mallory; Hervé Vaucheret
Journal:  Nat Genet       Date:  2006-06       Impact factor: 38.330

5.  VARNA: Interactive drawing and editing of the RNA secondary structure.

Authors:  Kévin Darty; Alain Denise; Yann Ponty
Journal:  Bioinformatics       Date:  2009-04-27       Impact factor: 6.937

6.  Ontology-oriented retrieval of putative microRNAs in Vitis vinifera via GrapeMiRNA: a web database of de novo predicted grape microRNAs.

Authors:  Barbara Lazzari; Andrea Caprera; Alessandro Cestaro; Ivan Merelli; Marcello Del Corvo; Paolo Fontana; Luciano Milanesi; Riccardo Velasco; Alessandra Stella
Journal:  BMC Plant Biol       Date:  2009-06-29       Impact factor: 4.215

7.  MicroPC (microPC): A comprehensive resource for predicting and comparing plant microRNAs.

Authors:  Wuttichai Mhuantong; Duangdao Wichadakul
Journal:  BMC Genomics       Date:  2009-08-07       Impact factor: 3.969

8.  Ultrafast and memory-efficient alignment of short DNA sequences to the human genome.

Authors:  Ben Langmead; Cole Trapnell; Mihai Pop; Steven L Salzberg
Journal:  Genome Biol       Date:  2009-03-04       Impact factor: 13.583

9.  miR2Disease: a manually curated database for microRNA deregulation in human disease.

Authors:  Qinghua Jiang; Yadong Wang; Yangyang Hao; Liran Juan; Mingxiang Teng; Xinjun Zhang; Meimei Li; Guohua Wang; Yunlong Liu
Journal:  Nucleic Acids Res       Date:  2008-10-15       Impact factor: 16.971

10.  Splign: algorithms for computing spliced alignments with identification of paralogs.

Authors:  Yuri Kapustin; Alexander Souvorov; Tatiana Tatusova; David Lipman
Journal:  Biol Direct       Date:  2008-05-21       Impact factor: 4.540
