Bruno Henrique Ribeiro Da Fonseca1, Douglas Silva Domingues1,2, Alexandre Rossi Paschoal1. 1. Bioinformatics Graduation Program (PPGBIOINFO), Department of Computer Science, Federal University of Technology - Paraná, Cornélio Procópio, Paraná, Brazil. 2. Department of Botany, Institute of Biosciences, São Paulo State University, UNESP, Rio Claro, São Paulo, Brazil.
Abstract
MOTIVATION: Mirtrons arise from short introns with atypical cleavage by using the splicing mechanism. In the current literature, there is no repository that centralizes and organizes the publicly available mirtron data. To fill this gap, we developed mirtronDB, the first knowledge database dedicated to mirtrons, available at http://mirtrondb.cp.utfpr.edu.br/. MirtronDB currently contains a total of 1407 mirtron precursors and 2426 mature mirtron sequences in 18 species. RESULTS: Through a user-friendly interface, users can browse and search mirtrons by organism, organism group, type and name. MirtronDB is a specialized resource that provides free and user-friendly access to knowledge on mirtron data. AVAILABILITY AND IMPLEMENTATION: MirtronDB is available at http://mirtrondb.cp.utfpr.edu.br/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
1 Introduction
Some studies in model organisms identified short hairpin introns that display characteristics similar to miRNAs and use the splicing mechanism as the first cleavage stage of miRNA biogenesis (Ruby ). These noncanonical miRNAs, described as small introns, are collectively called ‘mirtrons’ (Okamura ). Mirtron deregulation was identified as a potential source of several human pathologies (Qu and Adelson, 2012), whereas in plants, research suggests a feedback loop for the autoregulation of miRNA biogenesis (Budak and Akpinar, 2015). Although there are many databases devoted to miRNAs (e.g. Das ), there is no repository for accessing knowledge on mirtron data. Not even miRBase (Griffiths-Jones ), the state-of-the-art miRNA repository, has a specific analysis for mirtrons. Up until November 2017, we identified 22 articles with publicly available mirtron data. However, those datasets are dispersed and lack both standardization and organization.
In this context and to fill this gap, we provide mirtronDB, a central mirtron knowledge data repository. Based on the published literature, we modeled a total of 1407 mirtron precursors and 2426 mature mirtrons from 18 species (chordates, invertebrates and plants). MirtronDB has an online user-friendly interface through which the user can search, browse, visualize and download information. All datasets are publicly available in several formats. The user has access to (i) precursor mirtron similarity analysis; (ii) target gene predictions; and (iii) ceRNA predictions in plants.
2 Materials and methods
MirtronDB was built using HTML 5, PHP 7.0, CSS 4.0, Bootstrap 3.3, Cytoscape.js and PostgreSQL in four steps: (i) Data collection; (ii) Data modeling; (iii) Data analysis; and (iv) Website interface (Supplementary Fig. S1).
2.1 Mirtron data collection
We collected the mirtron data published from June 2007 to November 2017 by searching for the term ‘mirtron OR mirtrons’ in the ‘title/abstract’ field of NCBI PubMed (Supplementary Table S1) and in the papers cited therein. The selected articles were manually analyzed and redundancies were removed. We created a standardized naming scheme: organism name abbreviation + the word ‘mirtron’ + ID; for mature sequences, the arm is appended. We built a database and automatically imported the data (Supplementary Fig. S2). The STATUS column in the search and details pages provides the mirtron functional information (e.g. known, candidate).
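The naming scheme described above can be illustrated with a minimal Python sketch. The hyphen separator, the example abbreviation 'hsa', the arm labels '5p' and '3p', and the function name mirtron_name are assumptions made only for illustration; the exact string format used in mirtronDB is not specified here.

```python
def mirtron_name(organism_abbrev, mirtron_id, arm=None):
    """Compose a standardized mirtron name.

    Assumed layout (hypothetical): <abbrev>-mirtron-<ID>[-<arm>].
    The arm is only appended for mature sequences.
    """
    if arm is not None and arm not in ("5p", "3p"):
        raise ValueError("arm must be '5p', '3p' or None")
    name = f"{organism_abbrev}-mirtron-{mirtron_id}"
    if arm is not None:
        name += f"-{arm}"
    return name


# Precursor name, then the two mature names derived from it
precursor = mirtron_name("hsa", 1)
mature_5p = mirtron_name("hsa", 1, "5p")
mature_3p = mirtron_name("hsa", 1, "3p")
```

Keeping the precursor name as a prefix of the mature names makes it straightforward to group mature sequences under their precursor.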
2.2 Similarity analysis among organisms
We extracted the genomic information from several sources (Supplementary Table S2). We aligned each precursor mirtron against the genomes of all other species using BLASTN. We retained only hits with both query coverage and identity above 95%.
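As an illustration of the filtering step, the following Python sketch keeps only hits with identity and query coverage above 95%. It assumes, hypothetically, that BLASTN was run with the tabular option -outfmt "6 qseqid sseqid pident qcovs"; the column layout, the function name filter_hits and the sample identifiers are assumptions and not part of the original pipeline.

```python
import csv
import io

# Assumed column order of the tabular BLAST output (hypothetical layout)
FIELDS = ("qseqid", "sseqid", "pident", "qcovs")


def filter_hits(handle, min_identity=95.0, min_coverage=95.0):
    """Return hits whose identity and query coverage are both above the cutoffs."""
    kept = []
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        hit = dict(zip(FIELDS, row))
        pident = float(hit["pident"])
        qcovs = float(hit["qcovs"])
        # "above 95%" is read here as a strict inequality
        if pident > min_identity and qcovs > min_coverage:
            kept.append((hit["qseqid"], hit["sseqid"], pident, qcovs))
    return kept


# Toy input: only the first hit passes both thresholds
sample = io.StringIO(
    "hsa-mirtron-1\tchr1\t98.5\t100\n"
    "hsa-mirtron-2\tchr2\t95.0\t100\n"
    "hsa-mirtron-3\tchr3\t99.0\t90\n"
)
hits = filter_hits(sample)
```

Whether the 95% cutoff is inclusive is not stated in the text; the comparison operators can be changed to >= if an inclusive threshold is intended.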
2.3 Mirtrons and miRNAs similarity analysis
The mature mirtrons were aligned to miRNAs from miRBase v22 (Griffiths-Jones ) using CD-HIT-EST-2D (Huang ), with an alignment of 9 nucleotides (nt) at an identity of 0.98.
2.4 Target gene prediction
We predicted target genes for Homo sapiens and plants. For human, we used TargetScan (Agarwal ) with default parameters, and for plants, we used psRNATarget (Dai ) with the seed region parameter set from 2 to 8 nt.
2.5 ceRNA prediction in plants
We used TAPIR (Bonnet ) with default parameters to predict ceRNAs in plants. All mature mirtrons were compared against all lncRNAs from the GreeNC database (Gallart ).
3 Results
3.1 mirtronDB: database content
We found a total of 1407 precursor mirtrons and 2426 mature mirtrons in 18 species, and we extracted functional information when available. All collected mirtrons are detailed in Supplementary Table S3 and Supplementary Figure S3.
3.2 Precursor mirtron similarity analysis
We obtained 944 aligned precursors, of which 896 were aligned in chordates (94.9%), 46 in invertebrates (4.9%) and 2 in plants (0.2%) (Supplementary Table S4).
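The group percentages reported above follow directly from the counts, as the short Python check below shows. The function name group_shares is introduced only for this illustration.

```python
def group_shares(counts):
    """Return the total and each group's share of it, in percent (one decimal)."""
    total = sum(counts.values())
    shares = {group: round(100 * n / total, 1) for group, n in counts.items()}
    return total, shares


# Aligned precursor counts per organism group, as reported in the text
aligned = {"chordates": 896, "invertebrates": 46, "plants": 2}
total, shares = group_shares(aligned)
```

With these counts the total is 944 and the shares round to 94.9, 4.9 and 0.2, matching the reported values.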
3.3 Mature mirtron characterization
In chordates and invertebrates, the most common mature mirtron length is 22 nt (32.1%), and in plants, 28% of mature mirtrons have 21 nt (Supplementary Table S5 and Supplementary Fig. S4). We obtained sequence logos for the mirtron arms, in which chordates present more GC bases than invertebrates and plants (Supplementary Fig. S5).
3.4 Mirtrons availability in miRBase
We investigated whether the mature mirtron sequences were represented in miRBase. Only 966 mature mirtrons (39.8%) appear in miRBase, reinforcing the novelty provided by mirtronDB (Supplementary Table S6).
3.5 Target gene and ceRNA analysis
We identified a total of 512 298 and 3884 potential target gene predictions in human and in plants, respectively (Supplementary Table S3). In plants, we also verified whether the mirtrons could act as ceRNA candidates, and a total of 1738 potential interactions were found (Supplementary Table S7).
3.6 mirtronDB: user interfaces and visualization
The mirtronDB portal provides a user-friendly web interface to access mirtron knowledge. With the ‘Search’ function, users can query mirtrons by organism, group, type, name and article, and use the JBrowser visualization. On the ‘Network’ page, users can build a mirtron network, and the results are displayed graphically.
4 Discussion
MirtronDB is a database that standardizes and provides mirtron data from the literature. We highlight that (i) all collected data are available in several formats; (ii) curated data make this repository a mirtron information reference; (iii) sequence, structure and conservation analyses are provided; and (iv) targets and ceRNAs of mirtrons are also investigated. Data availability facilitates the development of new studies in biology. For example, we identified four mirtrons associated with diseases (Supplementary Material) in a cross-validation of mirtronDB with miRwayDB, a database of experimentally validated miRNA-pathway associations in pathophysiological conditions (Das ).
5 Conclusion
MirtronDB is a comprehensive database about mirtrons that allows users to query data and download it. This repository has the potential to promote advances in bioinformatics, such as what has been done by using data exploration and machine learning (Grzegorz ). We will update mirtronDB every year and the users can submit novel mirtrons to our website.
Conflict of Interest: none declared.