miRandola 2017: a curated knowledge base of non-invasive biomarkers.

Francesco Russo1, Sebastiano Di Bella2, Federica Vannini3, Gabriele Berti3, Flavia Scoyni4, Helen V Cook1, Alberto Santos1,5, Giovanni Nigita6, Vincenzo Bonnici7, Alessandro Laganà8, Filippo Geraci9, Alfredo Pulvirenti10, Rosalba Giugno7, Federico De Masi11, Kirstine Belling1, Lars J Jensen1, Søren Brunak1, Marco Pellegrini9, Alfredo Ferro10.   

Abstract

miRandola (http://mirandola.iit.cnr.it/) is a database of extracellular non-coding RNAs (ncRNAs) that was initially published in 2012, foreseeing the relevance of ncRNAs as non-invasive biomarkers. An increasing amount of experimental evidence shows that ncRNAs are frequently dysregulated in diseases. Further, ncRNAs have been discovered in different extracellular forms, such as exosomes, which circulate in human body fluids. Thus, miRandola 2017 is an effort to update and collect the accumulating information on extracellular ncRNAs that is spread across scientific publications and different databases. Data are manually curated from 314 articles that describe miRNAs, long non-coding RNAs and circular RNAs. The database now includes fourteen organisms, along with associations of ncRNAs with 25 drugs, 47 sample types and 197 diseases. miRandola also classifies extracellular RNAs based on their extracellular form: Argonaute2 protein, exosome, microvesicle, microparticle, membrane vesicle, high density lipoprotein and circulating. We also implemented a new web interface to improve the user experience.
© The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

Year:  2018        PMID: 29036351      PMCID: PMC5753291          DOI: 10.1093/nar/gkx854

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

miRNAs are small non-coding RNAs (ncRNAs) (21–23 nt long) that regulate gene expression at the post-transcriptional level by binding to messenger RNAs (mRNAs) and inhibiting their translation into proteins or by binding to other ncRNAs (1). First discovered in 1993 in Caenorhabditis elegans (2), miRNAs have had a tremendous impact on the study of gene expression regulation and regulatory networks. Since their discovery as post-transcriptional regulators (3), specific links have been discovered between miRNAs and human pathologies (4–7), and further studies indicate the utility of some miRNAs as biomarkers for cancer and other diseases (5). Some miRNA-targeted therapeutics have been tested in clinical trials, including a miRNA mimic of the tumor suppressor miR-34, which reached phase I clinical trials for treating cancer, and antimiRs for miR-122, which reached phase II trials for treating hepatitis (8). Recently, miRNAs were shown to be present in human body fluids (9), resulting in the potential use of these small RNAs as non-invasive biomarkers (9,10). The great potential of extracellular miRNAs as biomarkers stems from their high stability in plasma, serum, saliva, urine and many other fluids (9–12). This stability is due to the packaging of extracellular miRNAs in membrane-bound vesicles such as exosomes, which protects them from RNases (13,14). Moreover, miRNAs can be found complexed with the Argonaute2 (Ago2) protein, part of the RNA-induced silencing complex (RISC) responsible for the RNA silencing mediated by miRNAs (15,16). miRNAs also form complexes with high density lipoprotein (HDL) (17), which provides the mechanism for a new pathway for intercellular communication. In fact, miRNAs transported by HDL can be delivered to recipient cells, where they can alter the expression of their targets (17).
Given the rising interest in miRNAs and more generally other ncRNAs as potential biomarkers, we present an updated version of the miRandola database (11,12), with extensive addition of curated publications, new visualizations and additional functionality in the web interface.

DATA COLLECTION AND CONTENT

The majority of the data were extracted from the literature in PubMed (https://www.ncbi.nlm.nih.gov/pubmed). The extracted information covers the RNA type, sample type, experimental procedures, associated diseases, extracellular RNA forms and other metadata regarding publications, including a summary of the results of the study. Some articles were collected from two publicly available resources, ExoCarta (18) and Vesiclepedia (19), which are manually curated databases specialized in collecting information on extracellular vesicles. The first version of the database relied on human biocurators who searched PubMed using several keywords such as ‘microRNA’, ‘circulating’ and ‘extracellular’ and then manually extracted the relevant information. For this new version, we introduced text-mining-assisted curation to identify and prioritize papers for manual curation. The text mining approach will help our internal curators to increase the update frequency of the database to at least twice a year. To identify terms of interest, the text-mining software uses dictionaries of human ncRNAs (20), diseases (21) and keywords that indicate extracellular RNA forms. We performed the text mining on more than 26 million entries in PubMed using the tagger software (22). We scored pairs of these terms by summing scores for all co-occurrences of the terms in the same sentence, paragraph and abstract, with weights that decrease in that order. We then normalized these scores, and we calculated the geometric mean of RNA-disease and RNA-circulating scores, which we took as the final score for the combined association between circulating RNA and disease. Scientific articles that contain all three types of terms were given the same score as the triple (RNA-extracellular form-disease). Scores were then used to rank articles and facilitate the manual curation.
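The scoring scheme described above can be illustrated with a minimal Python sketch. It is not the tagger software or the authors' implementation: the weight values, the max-based normalization, the input layout (abstracts as nested lists of already-tagged term sets) and all function names are assumptions made for illustration, since the text states only that the weights decrease from sentence to paragraph to abstract and that the final score is a geometric mean.

```python
from collections import defaultdict
from itertools import product
from math import sqrt

# Hypothetical weights: the text only says they decrease from
# sentence to paragraph to abstract level.
WEIGHTS = {"sentence": 1.0, "paragraph": 0.5, "abstract": 0.25}


def cooccurrence_scores(abstracts, left_terms, right_terms):
    """Sum weighted co-occurrences of (left, right) term pairs.

    Each abstract is a list of paragraphs, each paragraph is a list of
    sentences, and each sentence is a set of already-tagged terms.
    """
    scores = defaultdict(float)
    for abstract in abstracts:
        # Collect the term set seen at each level of granularity.
        units = [("abstract", set().union(*(s for p in abstract for s in p)))]
        for paragraph in abstract:
            units.append(("paragraph", set().union(*paragraph)))
            units.extend(("sentence", set(s)) for s in paragraph)
        for level, terms in units:
            for pair in product(terms & left_terms, terms & right_terms):
                scores[pair] += WEIGHTS[level]
    return dict(scores)


def normalize(scores):
    """Scale scores to [0, 1] by the maximum (a placeholder normalization)."""
    top = max(scores.values(), default=0.0)
    return {k: v / top for k, v in scores.items()} if top else {}


def combined_score(rna_disease, rna_form, rna, disease, form):
    """Geometric mean of the normalized RNA-disease and RNA-form scores."""
    return sqrt(rna_disease.get((rna, disease), 0.0)
                * rna_form.get((rna, form), 0.0))


if __name__ == "__main__":
    # Toy input: one abstract, one paragraph, two tagged sentences.
    toy = [[[{"hsa-miR-21", "lung cancer", "exosome"}, {"hsa-miR-155"}]]]
    rnas = {"hsa-miR-21", "hsa-miR-155"}
    rd = normalize(cooccurrence_scores(toy, rnas, {"lung cancer"}))
    rf = normalize(cooccurrence_scores(toy, rnas, {"exosome"}))
    for rna in sorted(rnas):
        print(rna, combined_score(rd, rf, rna, "lung cancer", "exosome"))
```

In this sketch a pair that shares a sentence also collects the paragraph-level and abstract-level weights, so closer co-occurrences always score higher; whether the original software accumulates weights in exactly this way is not stated in the text.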
Altogether, the collected data consisted of 314 articles (see Supplementary Table and website), a notably higher number of papers compared with the first version of the database (n = 89) (12) and the previous short update (n = 119) (11). For details on the database content see Table 1.
Table 1.

Comparison between the latest 2017 version and the previous version of miRandola

                                  Previous version   miRandola 2017
*Papers                           119                314
Entries                           2276               3283
microRNAs                         590                1002
lncRNAs                           0                  12
CircRNAs                          0                  8
Extracellular RNA forms           4                  7
Drugs                             6                  25
Organisms                         1                  14
Sample types                      23                 47
Visualization tool                No                 Yes
External data                     ExoCarta           ExoCarta and Vesiclepedia
**Text-mining-assisted curation   No                 Yes

*See the supplementary table.

**See the manuscript for more details.

The database aims to present a comprehensive list of all known extracellular ncRNAs. Still, the majority of studies included in this version of miRandola focus on extracellular miRNAs, since they are the most investigated type of ncRNA. However, in this new version, we started to collect information on two new RNA classes, namely long ncRNAs (lncRNAs) of more than 200 nt, and circular RNAs (circRNAs), which are transcripts that form a continuous loop. These RNA classes represent a small portion of the entries in the database (counts are reported in Table 1 and Figure 1A) and future updates will introduce additional information. Figure 1B shows the variety of extracellular forms that the ncRNAs are found complexed with, including Ago2, exosomes and HDL. More than 35% of RNAs have been annotated only as ‘circulating’, indicating that authors did not specify whether the RNA was complexed with known extracellular forms (Figure 1B).
Figure 1.

Descriptive statistics of the database. (A) Number of RNAs across RNA classes; (B) Number of RNAs across extracellular RNA forms.

Recently, some well-known extracellular lncRNAs have been used as potential non-invasive biomarkers. Probably the most famous example is PCA3 (also known as DD3), which is highly overexpressed in most types of prostate cancer cells and detectable in urine (23). This new non-invasive biomarker shows great potential to improve patient care by reducing the number of biopsies (23). The function of circRNAs is largely unknown, but some studies have shown that they are able to act as a natural ‘sponge’, by binding and down-regulating miRNAs (24,25). Since circRNAs are stable molecules (26), they have been proposed as novel non-invasive biomarker candidates (26).

DATABASE DEVELOPMENT AND WEB INTERFACE

Data were collected and are maintained in a MySQL database running on an Apache server. The redesigned web interface was implemented using PHP and JavaScript (via the libraries AngularJS and D3.js). Furthermore, Bootstrap is used as the front-end framework for faster and easier web development, allowing compatibility with web browsers. In this new version of the database we implemented new functionalities to explore and visualize data, making the website more dynamic. Starting from the home page of the database (see Supplementary Figure S1A), users can quickly search for the name of the RNA of interest by typing it into a search bar. We have implemented an autocomplete function in order to facilitate the search (see Supplementary Figure S1B). After clicking on the RNA of interest, users will have an overview of the extracellular forms in which the RNA has been found, and are able to click on the specific term to browse the results. The user can browse by the following data types (see Supplementary Figure S2A): ‘miRNAs’, ‘lncRNAs’, ‘circRNAs’, ‘Diseases’, ‘exRNA forms’, ‘Samples’, ‘Drugs’ and ‘Organisms’. For instance, after clicking on ‘miRNAs’, a summary table will be shown (see Supplementary Figure S2B) with miRNA identifiers as reported in the literature, but also with the official miRNA identifiers annotated in the latest version of the miRNA registry miRBase (27). Each table can be filtered on a term of interest (see Supplementary Figure S2B), and can also be sorted. After clicking on a specific RNA of interest (see Supplementary Figure S2C and B), a results table is displayed (Figure 2).
Figure 2.

Results table containing the core information of the database. In this example, we report one result for hsa-miR-21.

The results table contains annotations specific to the selected RNA and other details such as publication identifier, reporting title, publication year, first author and journal. We show associations to diseases, sample types, the extracellular RNA type, the RNA expression level and the drug used in the experiment. We also report the methods used to verify the expression or other relevant techniques, and a short description of the results. The results table contains a field called ‘Potential biomarker role defined in the article’, indicating whether the selected RNA has a potential role as a biomarker, as stated in the published article. Users can also use the ‘Search’ section (see Supplementary Figure S3) to search for pairs of terms such as ‘hsa-miR-21’ and ‘non-small cell lung cancer’. All the data in miRandola are available in the ‘Download’ section of the database.
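A pair search of this kind can also be reproduced offline on downloaded data. The following Python sketch shows one way to do so; the record layout (the `Entry` class and its field names), the matching rules and the example records are all hypothetical and are not taken from the miRandola download files or from the web interface's own search logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One curated association; the field names are hypothetical."""
    rna: str
    disease: str
    sample: str
    form: str


def search_pair(entries, rna, disease):
    """Return entries whose RNA matches exactly and whose disease
    contains the query, ignoring letter case in both comparisons."""
    rna_key, disease_key = rna.casefold(), disease.casefold()
    return [
        e for e in entries
        if e.rna.casefold() == rna_key and disease_key in e.disease.casefold()
    ]


if __name__ == "__main__":
    # Invented records for illustration only, not database content.
    records = [
        Entry("hsa-miR-21", "Non-small cell lung cancer", "serum", "exosome"),
        Entry("hsa-miR-21", "breast cancer", "plasma", "circulating"),
        Entry("hsa-miR-155", "non-small cell lung cancer", "serum", "circulating"),
    ]
    for hit in search_pair(records, "hsa-miR-21", "non-small cell lung cancer"):
        print(hit)
```

Matching the RNA identifier exactly while matching the disease as a substring is a design choice of this sketch: it keeps ‘hsa-miR-21’ from also matching longer identifiers that merely begin with the same characters, while tolerating longer disease labels.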

VISUALIZATION

In this new version of miRandola, we introduce a visualization to show RNA-disease co-occurrences extracted from literature, and a circos plot (Figure 3) that shows how many RNAs are shared between the most representative tumor types in our database. This plot reveals that most tumors share several extracellular RNAs, with the exception of ‘Cervical squamous cell carcinoma’ for which we have no evidence of RNAs that are shared with any of the other tumor types. This common signature can be used to help identify common non-invasive biomarkers for many types of cancer.
Figure 3.

Circos plot of the most representative tumor types in the database. The plot shows how many RNAs are shared by different tumors.
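The counts underlying such a plot are pairwise set intersections. The minimal Python sketch below computes them from a mapping of tumor type to the set of RNAs reported for it; the function names and the toy input are assumptions for illustration, and this is not the code used to produce Figure 3.

```python
from itertools import combinations


def shared_rna_counts(rnas_by_tumor):
    """Count RNAs shared by every pair of tumor types.

    `rnas_by_tumor` maps a tumor name to the set of RNA identifiers
    reported for it. Keys of the result are alphabetically ordered pairs.
    """
    return {
        (a, b): len(rnas_by_tumor[a] & rnas_by_tumor[b])
        for a, b in combinations(sorted(rnas_by_tumor), 2)
    }


def unshared_tumors(rnas_by_tumor):
    """List tumor types that share no RNA with any other tumor type."""
    counts = shared_rna_counts(rnas_by_tumor)
    return [
        tumor for tumor in sorted(rnas_by_tumor)
        if all(n == 0 for pair, n in counts.items() if tumor in pair)
    ]


if __name__ == "__main__":
    # Toy input with placeholder names, not database content.
    toy = {
        "tumor A": {"rna1", "rna2"},
        "tumor B": {"rna2", "rna3"},
        "tumor C": {"rna9"},
    }
    print(shared_rna_counts(toy))
    print(unshared_tumors(toy))
```

Each non-zero count corresponds to one link between two tumor types in a chord-style diagram, and a tumor type returned by `unshared_tumors` would appear without any links.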

FUTURE PERSPECTIVE

When the field of extracellular RNAs was still new, miRandola was started as a small project and it soon became a successful database due to the work of a few biocurators and developers. In this new version, the involvement of additional biocurators and the introduction of assisted curation using text mining both contributed to the collection of many more curated articles and, on an ongoing basis, improve our ability to update the database regularly. We emphasize manual curation as an indispensable step, and define it as the fingerprint of miRandola. For this reason, we will evaluate the participation of the scientific community in the curation process in future updates. We aim to develop a new online tool to achieve this goal, giving users the possibility to curate articles that have been identified and prioritized by text mining. After this step, our internal curators will verify the information introduced by the scientific community. The final goal of miRandola is to be a reference database for all non-invasive biomarkers, and future updates will consider other data such as extracellular DNA, giving a comprehensive panel of disease-specific biomarkers.

DATA AVAILABILITY

miRandola is available at http://mirandola.iit.cnr.it/.
  27 in total

1.  DISEASES: text mining and data integration of disease-gene associations.

Authors:  Sune Pletscher-Frankild; Albert Pallejà; Kalliopi Tsafou; Janos X Binder; Lars Juhl Jensen
Journal:  Methods       Date:  2014-12-05       Impact factor: 3.608

2.  ExoCarta: A Web-Based Compendium of Exosomal Cargo.

Authors:  Shivakumar Keerthikumar; David Chisanga; Dinuka Ariyaratne; Haidar Al Saffar; Sushma Anand; Kening Zhao; Monisha Samuel; Mohashin Pathan; Markandeya Jois; Naveen Chilamkurti; Lahiru Gangoda; Suresh Mathivanan
Journal:  J Mol Biol       Date:  2015-10-03       Impact factor: 5.469

3.  Argonaute2 complexes carry a population of circulating microRNAs independent of vesicles in human plasma.

Authors:  Jason D Arroyo; John R Chevillet; Evan M Kroh; Ingrid K Ruf; Colin C Pritchard; Donald F Gibson; Patrick S Mitchell; Christopher F Bennett; Era L Pogosova-Agadjanyan; Derek L Stirewalt; Jonathan F Tait; Muneesh Tewari
Journal:  Proc Natl Acad Sci U S A       Date:  2011-03-07       Impact factor: 11.205

Review 4.  MicroRNA therapeutics: towards a new era for the management of cancer and other diseases.

Authors:  Rajesha Rupaimoole; Frank J Slack
Journal:  Nat Rev Drug Discov       Date:  2017-02-17       Impact factor: 84.694

5.  Human microRNA genes are frequently located at fragile sites and genomic regions involved in cancers.

Authors:  George Adrian Calin; Cinzia Sevignani; Calin Dan Dumitru; Terry Hyslop; Evan Noch; Sai Yendamuri; Masayoshi Shimizu; Sashi Rattan; Florencia Bullrich; Massimo Negrini; Carlo M Croce
Journal:  Proc Natl Acad Sci U S A       Date:  2004-02-18       Impact factor: 11.205

6.  Variability in the incidence of miRNAs and genes in fragile sites and the role of repeats and CpG islands in the distribution of genetic material.

Authors:  Alessandro Laganà; Francesco Russo; Catarina Sismeiro; Rosalba Giugno; Alfredo Pulvirenti; Alfredo Ferro
Journal:  PLoS One       Date:  2010-06-17       Impact factor: 3.240

7.  Circular RNAs are a large class of animal RNAs with regulatory potency.

Authors:  Sebastian Memczak; Marvin Jens; Antigoni Elefsinioti; Francesca Torti; Janna Krueger; Agnieszka Rybak; Luisa Maier; Sebastian D Mackowiak; Lea H Gregersen; Mathias Munschauer; Alexander Loewer; Ulrike Ziebold; Markus Landthaler; Christine Kocks; Ferdinand le Noble; Nikolaus Rajewsky
Journal:  Nature       Date:  2013-02-27       Impact factor: 49.962

8.  The SPECIES and ORGANISMS Resources for Fast and Accurate Identification of Taxonomic Names in Text.

Authors:  Evangelos Pafilis; Sune P Frankild; Lucia Fanini; Sarah Faulwetter; Christina Pavloudi; Aikaterini Vasileiadou; Christos Arvanitidis; Lars Juhl Jensen
Journal:  PLoS One       Date:  2013-06-18       Impact factor: 3.240

9.  MicroRNAs are transported in plasma and delivered to recipient cells by high-density lipoproteins.

Authors:  Kasey C Vickers; Brian T Palmisano; Bassem M Shoucri; Robert D Shamburek; Alan T Remaley
Journal:  Nat Cell Biol       Date:  2011-03-20       Impact factor: 28.824

10.  A knowledge base for the discovery of function, diagnostic potential and drug effects on cellular and extracellular miRNAs.

Authors:  Francesco Russo; Sebastiano Di Bella; Vincenzo Bonnici; Alessandro Laganà; Giuseppe Rainaldi; Marco Pellegrini; Alfredo Pulvirenti; Rosalba Giugno; Alfredo Ferro
Journal:  BMC Genomics       Date:  2014-05-06       Impact factor: 3.969
