Haroon Naeem, Robert Küffner, Gergely Csaba, Ralf Zimmer.
Abstract
BACKGROUND: MicroRNAs have been discovered as important regulators of gene expression. To identify the target genes of microRNAs, several databases and prediction algorithms have been developed. Only a few experimentally confirmed microRNA targets are available in databases. Many of the microRNA targets stored in databases were derived from large-scale experiments that are considered not very reliable. We propose to use text mining of publication abstracts for extracting microRNA-gene associations, including microRNA-target relations, to complement current repositories.
Year: 2010 PMID: 20233441 PMCID: PMC2845581 DOI: 10.1186/1471-2105-11-135
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
miRNA and gene/protein dictionaries
| Species | miRNA: mature (miR/miR*) | miRNA: stem-loop | miRNA: synonyms | Proteins/genes: entities | Proteins/genes: synonyms |
|---|---|---|---|---|---|
| | 1026 | 162 | 43070 | 30120 | 473403 |
| | 767 | 133 | 32448 | 42130 | 460921 |
| | 392 | 63 | 15662 | 39545 | 285483 |
The identifiers and synonyms were extracted from several biological databases (such as miRBase, miRGen, HUGO, MGI, Entrez Gene, Swiss-Prot) and supplemented with miRNA identifiers for human, mouse and rat collected manually from the literature. All dictionaries were processed to add frequently used synonym variants and to remove nonspecific and inappropriate synonyms.
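
To make the idea of adding synonym variants concrete, the following is a minimal sketch of how spelling variants might be generated for a miRBase-style identifier. The function name `mirna_variants` and the variant rules (dropping a three-letter species prefix, swapping `miR` for longer forms, varying the separator) are assumptions chosen for illustration; they are not the rules used to build the dictionaries described here.

```python
import re

def mirna_variants(identifier):
    """Return a set of hypothetical spelling variants for a miRBase-style ID."""
    variants = {identifier}
    # Drop a three-letter species prefix such as "hsa-" (assumed convention).
    prefixed = re.match(r"^[a-z]{3}-(.+)$", identifier)
    core = prefixed.group(1) if prefixed else identifier
    variants.add(core)
    # For "miR-<suffix>" names, vary the prefix spelling and the separator.
    mir = re.match(r"^(?:miR|mir)-(.+)$", core)
    if mir:
        suffix = mir.group(1)
        for prefix in ("miR", "miRNA", "microRNA"):
            for sep in ("-", " ", ""):
                variants.add(f"{prefix}{sep}{suffix}")
    return variants
```

Calling `mirna_variants("hsa-miR-21")` yields forms such as `miR-21`, `miR 21` and `microRNA-21`. A real dictionary would also need the removal step mentioned above, since short or ambiguous variants can match unrelated text.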
Figure 1. The number of miRNA and gene/protein pair matches in human with synonym expansion, strictness and post filters. With no selection, all miRNA-gene co-occurrences found in the publication titles and abstracts are displayed. Counts of miRNA-target pairs in the main text refer to this first column. Organism specificity can be increased with the taxonomy filter, which requires confirmation of the selected organism. The text-mining results can also be restricted to miRNA-gene pairs found within single sentences. The particular type of association in miRNA-gene pairs can be restricted by the relation filter. Additional filters report pairs only if they are confirmed by target prediction algorithms (e.g. PITA) or manually curated databases (e.g. miRecords, miR2Disease, TarBase).
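
The restriction to pairs found within single sentences can be illustrated with a small sketch. It assumes a naive regular-expression sentence splitter and case-insensitive substring matching against two name lists; the function name `sentence_cooccurrences` and both simplifications are illustrative assumptions, not the matching procedure behind the figure.

```python
import re
from itertools import product

def sentence_cooccurrences(text, mirna_names, gene_names):
    """Return (miRNA, gene) pairs that occur together within one sentence."""
    # Naive splitter (assumption): break after '.', '!' or '?' plus whitespace.
    # Abbreviations such as "e.g." would be split incorrectly.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    pairs = set()
    for sentence in sentences:
        lowered = sentence.lower()
        # Substring matching is a simplification: "miR-21" also matches "miR-210".
        found_mirnas = [m for m in mirna_names if m.lower() in lowered]
        found_genes = [g for g in gene_names if g.lower() in lowered]
        pairs.update(product(found_mirnas, found_genes))
    return pairs
```

For a toy abstract such as `"miR-21 represses PTEN in tumour cells. TP53 levels were unchanged."`, only the pair `("miR-21", "PTEN")` is reported, because `TP53` appears in a sentence without any miRNA. Abstract-level co-occurrence would report both genes, which is why the sentence filter trades recall for precision.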
Figure 2. A web-based graphical user interface to the database. miRSel can be queried via different options, including miRNA, target, gene ontology and PubMed keyword queries. If multiple options are selected, the results are AND-combined. Several filters are provided to control recall vs. precision of the mining results. For details see text.
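
AND-combining the results of several query options amounts to intersecting their result sets. The sketch below shows this with plain Python sets; the function name `and_combine` and the toy records are hypothetical and do not reflect how miRSel stores or queries its data.

```python
def and_combine(result_sets):
    """Intersect the pair sets returned by each selected query option."""
    result_sets = [set(results) for results in result_sets]
    if not result_sets:
        return set()
    combined = result_sets[0]
    for results in result_sets[1:]:
        combined &= results
    return combined

# Toy data: pairs returned by a miRNA query and by a keyword query.
by_mirna = {("miR-21", "PTEN"), ("miR-21", "PDCD4")}
by_keyword = {("miR-21", "PTEN"), ("miR-155", "SHIP1")}
shared = and_combine([by_mirna, by_keyword])
```

Here `shared` contains only the pair present in both result sets, so each additional selected option can only narrow the output.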
Figure 3. A schematic workflow of a miRSel search by miRNA ID. After entering a complete or partial search key (e.g. a miRNA) (A), the user can select a subset of the matching miRNAs (B). The corresponding miRNA-target co-occurrences stored in the database are then displayed in tabular format (C). This table enables navigation to the miRNA or gene pages of primary databases (e.g. D = miRBase, E = Entrez Gene), to PubMed abstracts that reference particular co-occurrences (F), or to the database sources from which the pair has been integrated (G). Details related to each miRNA-target pair, e.g. all possible names for a given miRNA or protein in the literature and comparison results from other databases and sequence-based prediction, can also be displayed from the table (H). Finally, a miRNA-target interaction graph (I) can be displayed that also enables navigation to miRNA and gene pages (nodes) or PubMed abstracts (edges).
Evaluation of the detection of miRNAs and miRNA-gene associations.
| Performance evaluation | Abstracts | Sentences | Cases | Recall | Precision | F-measure |
|---|---|---|---|---|---|---|
| | 50 | 89 | 79 | 0.96 | 1.00 | 0.98 |
| | 50 | 89 | 181 | 0.90 | 0.65 | 0.76 |
| | 50 | 89 | 181 | 0.88 | 0.78 | 0.83 |
| | 20 | 29 | 103 | 0.89 | 0.70 | 0.78 |
| | 20 | 29 | 103 | 0.87 | 0.62 | 0.73 |
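
The table does not define its F-measure column. Assuming it is the balanced F-measure (the harmonic mean of precision and recall), the values can be checked with the sketch below; the function name `f_measure` is an illustrative choice.

```python
def f_measure(precision, recall):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# First table row: recall 0.96, precision 1.00.
first_row = f_measure(1.00, 0.96)
```

Under this assumption, `first_row` rounds to the reported 0.98. Because the table lists precision and recall rounded to two decimals, recomputing from those rounded inputs can differ from a reported F-measure by about 0.01 in some rows.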