Salvatore Alaimo, Rosalba Giugno, Alfredo Pulvirenti.
Abstract
MOTIVATION: Over the past few years, experimental evidence has highlighted the role of microRNAs (miRNAs) in human diseases. miRNAs are critical for the regulation of cellular processes, and their aberration can therefore be among the triggering causes of pathological phenomena. They are just one member of the large class of non-coding RNAs (ncRNAs), which includes transcribed ultra-conserved regions (T-UCRs), small nucleolar RNAs (snoRNAs), PIWI-interacting RNAs (piRNAs), large intergenic non-coding RNAs (lincRNAs), and the heterogeneous group of long non-coding RNAs (lncRNAs). The known associations of these ncRNAs with diseases are few in number, and their reliability is questionable. In the literature, there is only one recent method, proposed by Yang et al. (2014), to predict lncRNA-disease associations. This technique, however, is lacking in prediction quality. All these elements entail the need to investigate new bioinformatics tools for the prediction of high-quality ncRNA-disease associations. Here, we propose a method called ncPred for the inference of novel ncRNA-disease associations based on a recommendation technique. We represent our knowledge through a tripartite network whose nodes are ncRNAs, targets, or diseases. Interactions in such a network associate each ncRNA with a disease through its targets. Starting from such a network, our algorithm computes weights between each ncRNA-disease pair using a multi-level resource transfer technique that, at each step, takes into account the resource transferred in the previous one.
Keywords: lncRNAs functional characterization; ncRNAs-diseases association predictions; network-based inference; resource transfer algorithm; tripartite networks
Year: 2014 PMID: 25566534 PMCID: PMC4264506 DOI: 10.3389/fbioe.2014.00071
Source DB: PubMed Journal: Front Bioeng Biotechnol ISSN: 2296-4185
Figure 1. Operating principle of ncPred in a tripartite network. ncRNAs are shown in blue, targets in orange, and diseases in red. Without loss of generality, and to simplify the reading of the image, both λ parameters are set to 1, so as to obtain a uniform distribution of resources in the network. In the first step, a resource is assigned to each target and disease node (1). Thereafter, two separate transfer processes are launched to compute the resource in target nodes (2a, 2b) and disease nodes (3a, 3b). Finally, resources are combined to obtain the total quantity in each disease node (4). In (4), the literals are used only as examples owing to lack of space; they are to be replaced with the values computed in steps (2b) and (3b).
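The idea of moving a resource through the ncRNA, target, and disease layers can be illustrated with a deliberately simplified sketch. It is not the ncPred weighting scheme itself, whose formulas are not given here: it assumes a single forward pass in which every node splits its resource evenly among its neighbours in the next layer. The function name `two_level_transfer` and the toy identifiers (`nc1`, `t1`, `d1`, ...) are hypothetical.

```python
def two_level_transfer(ncrna_targets, target_diseases):
    """Score ncRNA-disease pairs with a uniform two-hop resource transfer.

    Simplifying assumption (not the published ncPred formula): each ncRNA
    starts with one unit of resource, splits it evenly among its targets,
    and each target splits what it received evenly among its diseases.
    """
    scores = {}
    for ncrna, targets in ncrna_targets.items():
        if not targets:
            continue
        share = 1.0 / len(targets)  # resource reaching each target
        row = scores.setdefault(ncrna, {})
        for target in targets:
            diseases = target_diseases.get(target, ())
            for disease in diseases:
                # each disease receives an equal part of the target's share
                row[disease] = row.get(disease, 0.0) + share / len(diseases)
    return scores


# Toy tripartite network (hypothetical identifiers).
toy_ncrna_targets = {"nc1": ["t1", "t2"], "nc2": ["t2"]}
toy_target_diseases = {"t1": ["d1"], "t2": ["d1", "d2"]}
toy_scores = two_level_transfer(toy_ncrna_targets, toy_target_diseases)
print(toy_scores)
```

In this toy network `nc1` reaches `d1` through both targets, so `d1` accumulates more resource than `d2`; ranking diseases by such scores is the basic recommendation step.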
Description of the datasets: number of ncRNAs, targets and diseases together with the count of interactions, average degree, density, modularity, number of connected components, and average path length.
| Metrics | Chen et al. (2013) | Helwak et al. (2013) |
|---|---|---|
| ncRNAs | 119 | 338 |
| Targets | 110 | 179 |
| Diseases | 514 | 134 |
| ncRNAs–targets interactions | 247 | 1699 |
| Targets–diseases interactions | 1005 | 1572 |
| Average degree | 1.572 | 5.025 |
| Density | 0.002 | 0.008 |
| Modularity | 0.609 | 0.274 |
| Number of connected components | 24 | 1 |
| Average path length | 1.572 | 1.734 |
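As a quick check, the average degree and density can be related to the node and interaction counts in the table. The sketch below assumes the directed-graph convention (average degree = E / N, density = E / (N (N - 1))), which is an assumption and not stated in the source; the function name `directed_metrics` is hypothetical. Under this convention the Helwak et al. (2013) column is reproduced (5.025 and 0.008), while the Chen et al. (2013) counts give an average degree of about 1.685 rather than the tabulated 1.572, so that entry may have been computed differently.

```python
def directed_metrics(n_nodes, n_edges):
    """Average degree and density under the directed-graph convention
    (an assumption; the source does not state which convention was used)."""
    avg_degree = n_edges / n_nodes
    density = n_edges / (n_nodes * (n_nodes - 1))
    return avg_degree, density


# Node and interaction counts taken from the table above.
helwak_nodes = 338 + 179 + 134   # ncRNAs + targets + diseases
helwak_edges = 1699 + 1572       # ncRNA-target + target-disease interactions
chen_nodes = 119 + 110 + 514
chen_edges = 247 + 1005

for name, n, e in [("Chen", chen_nodes, chen_edges),
                   ("Helwak", helwak_nodes, helwak_edges)]:
    avg_degree, density = directed_metrics(n, e)
    print(f"{name}: average degree {avg_degree:.3f}, density {density:.3f}")
```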
Figure 2. Degree distribution of the two networks used as datasets: (A) Chen et al. (2013), (B) Helwak et al. (2013). The two plots are in log-log scale. As can be seen, the degree distribution of the two networks can be approximated by a power law. We can therefore assume that the two networks are scale-free.
Comparison of ncPred and Yang et al. (2014).
| Dataset | Yang et al. (2014) | ncPred | Yang et al. (2014) | ncPred | Yang et al. (2014) | ncPred |
|---|---|---|---|---|---|---|
| Chen et al. (2013) | 5.5113 | 12.3290 | 0.7297 | 1.6636 | 0.6217 ± 0.0178 | 0.7566 ± 0.0218 |
| Helwak et al. (2013) | 1.8654 | 5.8197 | 1.6509 | 5.6572 | 0.7069 ± 0.0084 | 0.7669 ± 0.0093 |
The results were obtained using the optimal values of the two λ parameters.
Figure 3. Comparison between ncPred and Yang et al. (2014). The curves measure the quality of the algorithms in terms of false-positive rate against true-positive rate. Panels (A,B) are independent, since they are computed on two separate datasets. The significance of the difference between ncPred and Yang et al. (2014) was measured by applying the Friedman rank sum test, as reported in Table 4.
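A curve of false-positive rate against true-positive rate can be built by sweeping a threshold over the predicted scores, and the area under it summarises the ranking quality. The following is a minimal sketch of that general procedure, not the evaluation code used for the figure; the function names `roc_points` and `auc`, and the toy scores and labels, are hypothetical.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) points from a threshold sweep over the scores.

    labels are 1 for a true association and 0 otherwise; both classes
    must be present. Tied scores are processed as a single step.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1]:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / neg, tp / pos))
        i = j
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Toy example: four candidate associations, two of them true.
toy_points = roc_points([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
print(toy_points, auc(toy_points))
```

In the toy example three of the four (true, false) pairs are ordered correctly, which corresponds to an area of 0.75.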
Friedman rank sum test applied to establish the statistical significance of the performance improvement of ncPred compared to Yang et al. (2014).
| Dataset | Friedman | p-value |
|---|---|---|
| Chen et al. (2013) | 1026.315 | <2.2 × 10⁻¹⁶ |
| Helwak et al. (2013) | 6537.915 | <2.2 × 10⁻¹⁶ |
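For two methods compared on the same blocks, the Friedman statistic has a simple closed form. The sketch below is a minimal illustration, assuming paired scores with no ties within a block and a chi-square approximation with one degree of freedom; how the blocks were defined for the table above is not stated in the source, and the function name `friedman_two_methods` is hypothetical.

```python
import math


def friedman_two_methods(a, b):
    """Friedman rank sum test for k = 2 paired samples without ties.

    a[i] and b[i] are the scores of the two methods on block i.
    Returns the statistic and its chi-square (1 d.o.f.) p-value.
    """
    if len(a) != len(b) or not a:
        raise ValueError("a and b must be non-empty and of equal length")
    if any(x == y for x, y in zip(a, b)):
        raise ValueError("this sketch does not handle tied blocks")
    n, k = len(a), 2
    # Within each block the smaller value gets rank 1, the larger rank 2.
    rank_sum_a = sum(1 if x < y else 2 for x, y in zip(a, b))
    rank_sum_b = 3 * n - rank_sum_a
    stat = 12.0 * (rank_sum_a ** 2 + rank_sum_b ** 2) / (n * k * (k + 1)) \
        - 3.0 * n * (k + 1)
    stat = max(stat, 0.0)  # guard against tiny negative rounding errors
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Toy example: the second method wins in all ten blocks.
toy_stat, toy_p = friedman_two_methods([0.1 * i for i in range(10)],
                                       [0.1 * i + 0.05 for i in range(10)])
print(toy_stat, toy_p)
```

With only two methods the statistic cannot exceed the number of blocks, so very large values such as those in the table imply a comparison over many blocks.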
Optimal values of the two λ parameters.
| Dataset | | |
|---|---|---|
| Chen et al. (2013) | 0.5 | 1 |
| Helwak et al. (2013) | 0.2 | 0.2 |
List of top 10 predictions computed by ncPred and their rank obtained with Yang et al. (2014).
| ncRNA | ncPred rank | Yang et al. (2014) | ncRNA | ncPred rank | Yang et al. (2014) |
|---|---|---|---|---|---|
| PVT1 | 1 | 3 | B2 SINE RNA | 6 | 28 |
| MEG3 | 2 | 19 | TP53TG1 | 7 | 22 |
| TUG1 | 3 | 32 | WRAP53 | 8 | 23 |
| lincRNA-p21 | 4 | 21 | Kcnq1ot1 | 9 | 48 |
| CDKN2B-AS1 | 5 | 20 | Evf2 | 10 | 35 |
| H19 | 1 | 43 | Kcnq1ot1 | 6 | 23 |
| SRA1 | 2 | 24 | PVT1 | 7 | 47 |
| TUG1 | 3 | 26 | CDKN2B-AS1 | 8 | 25 |
| 7SL | 4 | 29 | B2 SINE RNA | 9 | 17 |
| BDNF-AS1 | 5 | 34 | Airn | 10 | 18 |
| HOTAIR | 1 | 16 | PCAT1 | 6 | 40 |
| LINC00312 | 2 | 15 | ncRNACCND1 | 7 | 9 |
| Kcnq1ot1 | 3 | 25 | Six3OS | 8 | 45 |
| Xist | 4 | 43 | Airn | 9 | 14 |
| TERRA | 5 | 10 | RepA | 10 | 47 |
| PVT1 | 1 | 11 | LINC00312 | 6 | 24 |
| MEG3 | 2 | 16 | TP53TG1 | 7 | 20 |
| TUG1 | 3 | 26 | WRAP53 | 8 | 21 |
| BACE1-AS | 4 | 23 | CDKN2B-AS1 | 9 | 27 |
| lincRNA-p21 | 5 | 19 | B2 SINE RNA | 10 | 40 |
| PTENP1 | 1 | 38 | Evf2 | 6 | 60 |
| LINC00312 | 2 | 15 | Airn | 7 | 13 |
| Xist | 3 | 1 | TERRA | 8 | 18 |
| PCAT1 | 4 | 29 | B2 SINE RNA | 9 | 40 |
| Six3OS | 5 | 39 | RepA | 10 | 37 |