Alexander Junge, Jan C Refsgaard, Christian Garde, Xiaoyong Pan, Alberto Santos, Ferhat Alkan, Christian Anthon, Christian von Mering, Christopher T Workman, Lars Juhl Jensen, Jan Gorodkin.
Abstract
Protein association networks can be inferred from a range of resources including experimental data, literature mining and computational predictions. These types of evidence are emerging for non-coding RNAs (ncRNAs) as well. However, integration of ncRNAs into protein association networks is challenging due to data heterogeneity. Here, we present a database of ncRNA-RNA and ncRNA-protein interactions and its integration with the STRING database of protein-protein interactions. These ncRNA associations cover four organisms and have been established from curated examples, experimental data, interaction predictions and automatic literature mining. RAIN uses an integrative scoring scheme to assign a confidence score to each interaction. We demonstrate that RAIN outperforms the underlying microRNA-target predictions in inferring ncRNA interactions. RAIN can be operated through an easily accessible web interface and all interaction data can be downloaded.
Database URL: http://rth.dk/resources/rain
Year: 2017 PMID: 28077569 PMCID: PMC5225963 DOI: 10.1093/database/baw167
Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 3.451
Figure 1. Flow chart illustrating the development of the RAIN database, ranging from establishing scoring schemes for the individual sources of evidence, through integration of resources to evidence channels, to finally defining functional molecular networks.
Figure 2. Toy example describing the benchmarking and scoring scheme. (A) A true positive (TP) interaction is depicted as a black dot and represents a miRNA–mRNA pair found in the gold standard; a false positive (FP) interaction is depicted as a white dot and comprises interactions where the miRNA and mRNA constituents are in the gold standard, but their pair is not. Interactions where the miRNA or the mRNA were not part of the gold standard are depicted as gray dots. Only TP and FP interactions are used to establish the transfer function, which subsequently is applied to assign confidence scores to all interactions. (B) A discrete transfer function is established as the fraction of correctly predicted interactions in each of the discrete raw score bins. (C) A continuous transfer function is established based on the TP and FP interactions found in sliding windows. The mean raw interaction score and fraction of correctly predicted interactions were computed for each window, followed by the fitting of a sigmoid transfer function.
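The benchmarking steps in panels (B) and (C) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the RAIN implementation: the function names, the bin width, the window size and the toy data are all hypothetical, and the final sigmoid fit is left out (the points returned by `sliding_window_points` are what such a fit would take as input, for example via a least-squares routine).

```python
from collections import defaultdict


def discrete_transfer(pairs, bin_width):
    """Fraction of TP interactions per raw-score bin (panel B).

    pairs: iterable of (raw_score, label), where label is True for a TP,
    False for an FP and None for an interaction outside the gold standard
    (the gray dots), which is ignored here.
    """
    counts = defaultdict(lambda: [0, 0])  # bin index -> [TP count, TP + FP count]
    for score, label in pairs:
        if label is None:
            continue
        bin_index = int(score // bin_width)
        counts[bin_index][0] += int(label)
        counts[bin_index][1] += 1
    return {b: tp / total for b, (tp, total) in sorted(counts.items())}


def sliding_window_points(pairs, window, step=1):
    """(mean raw score, fraction of TPs) for each sliding window (panel C)."""
    labelled = sorted((s, l) for s, l in pairs if l is not None)
    points = []
    for start in range(0, len(labelled) - window + 1, step):
        chunk = labelled[start:start + window]
        mean_score = sum(s for s, _ in chunk) / window
        frac_tp = sum(1 for _, l in chunk if l) / window
        points.append((mean_score, frac_tp))
    return points


# Hypothetical toy data: (raw score, TP / FP / outside gold standard).
pairs = [
    (0.5, False), (1.2, False), (1.8, True), (2.1, None),
    (2.4, True), (2.9, False), (3.3, True), (3.7, True),
]

discrete = discrete_transfer(pairs, bin_width=1.0)
points = sliding_window_points(pairs, window=4)

print(discrete)  # {0: 0.0, 1: 0.5, 2: 0.5, 3: 1.0}
print(points)
```

In this toy run the fraction of correct predictions rises with the raw score, which is the monotone trend a sigmoid transfer function is meant to capture.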
The number of miRNA–mRNA, ncRNA–protein and ncRNA–ncRNA interactions per organism in RAIN with a combined confidence score higher than 0.15
| Organism | miRNA–mRNA | ncRNA–protein | ncRNA–ncRNA | Total |
|---|---|---|---|---|
| | 174 853 | 11 026 | 2507 | 188 386 |
| | 77 270 | 469 | 35 | 77 774 |
| | 19 985 | 39 | 1 | 20 025 |
| | 0 | 640 | 85 | 725 |
| Total | 272 108 | 12 174 | 2628 | 286 910 |
Figure 3. Receiver-operating characteristics of the RAIN prediction channel and the respective miRNA target prediction tools, benchmarked against an independent validation set of miRNA–mRNA interactions. The integration of the respective prediction tools yields improved predictive performance. Here, specificity $= \mathrm{TN}/N$ and sensitivity $= \mathrm{TP}/P$, where TP and TN are the numbers of true positives and true negatives, $P$ is the number of positive and $N$ the number of negative miRNA–mRNA pairs.
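As a rough illustration of how the points on such a curve are obtained, the sketch below computes sensitivity (TP/P) and specificity (TN/N) at every score threshold, using the standard definitions of both quantities. The function name and the scores and labels are hypothetical and are not taken from the RAIN validation set.

```python
def roc_points(scores, labels):
    """Return (threshold, sensitivity, specificity) for each distinct score.

    A pair is predicted positive when its score is >= the threshold.
    labels[i] is True for a positive miRNA-mRNA pair, False for a negative one.
    """
    n_pos = sum(1 for y in labels if y)   # P
    n_neg = len(labels) - n_pos           # N
    points = []
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and not y)
        tn = n_neg - fp
        points.append((threshold, tp / n_pos, tn / n_neg))
    return points


# Hypothetical confidence scores and validation labels.
scores = [0.9, 0.8, 0.6, 0.4, 0.2]
labels = [True, True, False, True, False]

roc = roc_points(scores, labels)
for threshold, sensitivity, specificity in roc:
    print(f"{threshold:.1f}  sensitivity={sensitivity:.2f}  specificity={specificity:.2f}")
```

Lowering the threshold trades specificity for sensitivity; a method whose curve lies above another's at the same specificity recovers more true pairs.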
Figure 4. RAIN use case. (A) Querying RAIN for human miR-145-5p (miR-145), suggested to act as a tumor suppressor in breast and colon cancer (40, 41), finds multiple oncogenes such as KLF4 and SOX2 (42, 43) as putative targets of miR-145. Evidence channels supporting each interaction are encoded as edge colors. (B) Sources of evidence for each association, e.g. between miR-145 and KLF4, are presented in a pop-up opened by clicking an edge in the network. RAIN confidence scores are collected in the ‘Additional data’ table. Information about KLF4 is provided by STRING. Clicking the ‘Show’ button leads to a website that links to research articles presenting experimental evidence and displays detailed text mining evidence, where available. (C) In contrast to the single identifier search in (A), the RAIN multiple identifier search can be used to specifically view interactions between three ribosomal RNAs (28S_rRNA, 5_8S_rRNA, 5S_rRNA) and a subset of five ribosomal proteins that are part of the large ribosomal subunit. These interactions were extracted from Reactome (10) or found by text mining. (D) Clicking an ncRNA node in the network opens a pop-up with basic information about the ncRNA, e.g. 5.8S rRNA.