Yalbi Itzel Balderas-Martínez, Fabio Rinaldi, Gabriela Contreras, Hilda Solano-Lira, Mishael Sánchez-Pérez, Julio Collado-Vides, Moisés Selman, Annie Pardo.
Abstract
MicroRNAs (miRNAs) are small non-coding RNA molecules that inhibit gene expression posttranscriptionally. They play important roles in several biological processes, and in recent years there has been growing interest in how they relate to the pathogenesis of diseases. Although some databases already contain information on miRNAs and their relation to diseases, curating them is a significant challenge because of the amount of information generated every day. In particular, respiratory diseases are poorly documented in databases, even though they are of increasing concern in terms of morbidity, mortality and economic impact. In this work, we present the results we obtained in the BioCreative Interactive Track (IAT), using a semiautomatic approach to improve the biocuration of miRNAs related to diseases. Our procedures will be useful for complementing databases that contain this type of information. We adapted the OntoGene text mining pipeline and the ODIN curation system to a full-text corpus of scientific publications concerning one specific respiratory disease: idiopathic pulmonary fibrosis, the most common and aggressive of the idiopathic interstitial pneumonias. Using the OntoGene/ODIN system, we curated 823 miRNA text snippets and found a total of 246 miRNAs related to this disease. Biocuration throughput improved by a factor of 12 compared with traditional manual biocuration. A significant advantage of our semiautomatic pipeline is that it can be applied to obtain the miRNAs of all respiratory diseases and could also be used for other illnesses. Database URL: http://odin.ccg.unam.mx/ODIN/bc2015-miRNA/
Year: 2017 PMID: 28605770 PMCID: PMC5467562 DOI: 10.1093/database/bax030
Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 3.451
Categories of terms used in the annotation dictionary
| Term type | Source | Example(s) | Total terms in the dictionary by category |
|---|---|---|---|
| | miRTarBase | hsa-mir-21-5p, hsa-mir-21 | 32 548 |
| Target | HGNC | SMAD7 | 165 849 |
| | Jaspar | ZEB1 | 1191 |
| | NCBI taxonomy (41) | Human, rat, mouse (different species) | 174 |
| | Terms reviewed by experts | IPF | 26 |
| Level of microRNA under some conditions, or the regulatory | Previous work | Up, down, overexpressed, deleted, induced, repressed | 90 |
| Characteristics of the | RapidMiner pipeline and terms reviewed by experts | Type of sample (lung tissue, alveolar macrophages, fibroblasts); Characteristics of the samples or | 124 |
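To make the role of such a dictionary concrete, the following is a minimal sketch of dictionary-based term tagging. It is not the OntoGene implementation, whose internals are not described here. The example terms come from the table above, while the category labels (`mirna`, `target`, `organism`, `disease`, `effect`, `sample`) and the function names are hypothetical choices made for this illustration.

```python
import re

# Toy dictionary. The example terms are taken from the table above; the
# category labels are hypothetical names chosen for this sketch.
TERM_DICTIONARY = {
    "mirna": ["hsa-mir-21-5p", "hsa-mir-21"],
    "target": ["SMAD7", "ZEB1"],
    "organism": ["human", "rat", "mouse"],
    "disease": ["IPF"],
    "effect": ["up", "down", "overexpressed", "deleted", "induced", "repressed"],
    "sample": ["lung tissue", "alveolar macrophages", "fibroblasts"],
}


def build_pattern(terms):
    """Compile one case-insensitive regex that matches any term in `terms`."""
    # Try longer terms first so "hsa-mir-21-5p" is preferred over "hsa-mir-21".
    ordered = sorted(terms, key=len, reverse=True)
    alternatives = "|".join(re.escape(term) for term in ordered)
    # The lookarounds reject matches embedded in a longer word or identifier,
    # e.g. "up" inside "upregulated".
    return re.compile(r"(?<![\w-])(?:" + alternatives + r")(?![\w-])", re.IGNORECASE)


def tag_terms(text, dictionary=TERM_DICTIONARY):
    """Return (start, end, category, matched_text) tuples sorted by position."""
    hits = []
    for category, terms in dictionary.items():
        for match in build_pattern(terms).finditer(text):
            hits.append((match.start(), match.end(), category, match.group(0)))
    return sorted(hits)


if __name__ == "__main__":
    sentence = "hsa-miR-21-5p is overexpressed in fibroblasts and represses SMAD7 in IPF."
    for start, end, category, term in tag_terms(sentence):
        print(f"{start:>3}-{end:<3} {category:<8} {term}")
```

A curator-facing system would additionally need term normalization, disambiguation and relation extraction; this sketch covers only the lookup step that a categorized dictionary enables.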
Figure 1. ODIN interface. The tagged article is shown on the left, and all the terms found in the dictionary are listed on the right.
Figure 2. Terms related to effects. Effects are important for indicating the level of the miRNA or its relation with the target gene.
Figure 3. Terms used to find miRNAs and their target genes.