Xiaofang Xiao1, Wen Zhu2, Bo Liao1,2, Junlin Xu1, Changlong Gu1, Binbin Ji2, Yuhua Yao2, Lihong Peng3, Jialiang Yang2,4.
Abstract
In recent years, it has become increasingly clear that long noncoding RNAs (lncRNAs) play critical roles in many biological processes associated with human diseases. Inferring potential lncRNA-disease associations is essential to reveal the mechanisms behind diseases, develop novel drugs, and optimize personalized treatments. However, biological experiments to validate lncRNA-disease associations are time-consuming and costly, so effective computational models are needed. In this study, we propose a method called BPLLDA to predict lncRNA-disease associations based on paths of fixed lengths in a heterogeneous lncRNA-disease association network. Specifically, BPLLDA first constructs a heterogeneous lncRNA-disease network by integrating the lncRNA-disease association network, the lncRNA functional similarity network, and the disease semantic similarity network. It then infers the probability of an lncRNA-disease association based on the paths connecting the lncRNA and the disease in the network and on the lengths of those paths. Compared to existing methods, BPLLDA has a few advantages, including not requiring negative samples and the ability to predict associations related to novel lncRNAs or novel diseases. BPLLDA was applied to a canonical lncRNA-disease association database called LncRNADisease, together with two popular methods, LRLSLDA and GrwLDA. The leave-one-out cross-validation areas under the receiver operating characteristic curve of BPLLDA are 0.87117, 0.82403, and 0.78528, respectively, for predicting overall associations, associations related to novel lncRNAs, and associations related to novel diseases, higher than those of the two compared methods. In addition, cervical cancer, glioma, and non-small-cell lung cancer were selected as case studies, for which the predicted top five lncRNA-disease associations were verified by recently published literature. 
In summary, BPLLDA performs well in predicting novel lncRNA-disease associations and associations related to novel lncRNAs and diseases. It may contribute to the understanding of lncRNA-associated diseases such as certain cancers.
Keywords: Gaussian interaction profile kernel similarity; ROC curve; disease similarity; leave-one-out cross validation; lncRNA similarity; path with limited length
Year: 2018 PMID: 30459803 PMCID: PMC6232683 DOI: 10.3389/fgene.2018.00411
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
The basic characteristics of the lncRNA-disease association dataset.
| 156 | 190 | 352 | 2.3 | 1.9 | 41 | 15 | 1 |
Figure 1. The distributions of disease semantic and lncRNA functional similarity. (A) Disease semantic similarity (SS) distribution. (B) lncRNA functional similarity (FS) distribution. The x-axis indicates the intervals of similarity values and the y-axis indicates the numbers of values in the interval. The actual values are also marked above the histograms.
Figure 2. The distributions of integrated similarities. (A) Distribution of the integrated similarity for diseases (DS). (B) Distribution of the integrated similarity for lncRNAs (LS). The x-axis indicates the intervals of similarity values and the y-axis indicates the numbers of values in the interval. The actual values are also marked above the histograms.
Figure 3. The flowchart of BPLLDA. It consists of three steps: (1) disease similarity measurement, (2) lncRNA similarity measurement, and (3) the BPLLDA algorithm.
Figure 4. Performance evaluation of BPLLDA, LRLSLDA, and GrwLDA in predicting lncRNA-disease associations by global LOOCV.
Figure 5. Performance evaluation of BPLLDA, LRLSLDA, and GrwLDA in predicting novel lncRNA-associated diseases.
Figure 6. Performance evaluation of BPLLDA, LRLSLDA, and GrwLDA in predicting novel disease-associated lncRNAs.
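The global LOOCV evaluation named in the captions above can be illustrated with a minimal sketch. It hides one known association at a time, rescores all pairs, and ranks the hidden pair against every pair that has no recorded association. The record does not state how the ranks are pooled into a single AUC, so averaging a per-fold rank statistic is an assumption here, and `global_loocv_auc` and `score_fn` are hypothetical names rather than part of BPLLDA.

```python
from typing import Callable, List

Matrix = List[List[float]]


def global_loocv_auc(assoc: Matrix, score_fn: Callable[[Matrix], Matrix]) -> float:
    """Average, over all known associations, of the fraction of unlabeled
    pairs that the held-out pair outranks (ties count as one half)."""
    n_rows, n_cols = len(assoc), len(assoc[0])
    pairs = [(i, j) for i in range(n_rows) for j in range(n_cols)]
    known = [(i, j) for i, j in pairs if assoc[i][j] == 1]
    unknown = [(i, j) for i, j in pairs if assoc[i][j] == 0]
    if not known or not unknown:
        raise ValueError("need at least one known and one unknown pair")

    fold_values = []
    for i, j in known:
        # Hide the test association before scoring (the leave-one-out step).
        train = [row[:] for row in assoc]
        train[i][j] = 0
        scores = score_fn(train)
        held_out = scores[i][j]
        wins = 0.0
        for a, b in unknown:
            if held_out > scores[a][b]:
                wins += 1.0
            elif held_out == scores[a][b]:
                wins += 0.5
        fold_values.append(wins / len(unknown))
    return sum(fold_values) / len(fold_values)


# Toy 3 lncRNA x 3 disease association matrix (illustrative data only).
toy_assoc = [[1, 1, 0], [1, 1, 0], [0, 0, 1]]
```

Any predictor that maps a training matrix to a score matrix can be plugged in as `score_fn`. A scorer that always ranks hidden associations above unlabeled pairs yields 1.0, and an uninformative constant scorer yields 0.5.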
Precision of BPLLDA on global LOOCV.
| Precision | ≥ 0.134 | ≥ 0.446 | ≥ 0.933 | 1 |
Tuning two model parameters, the maximum path length L and the weight threshold T, by LOOCV.
| | L = 2 | L = 3 | L = 4 |
| T = 0.2 | 0.83903 | 0.87117 | * |
| T = 0.4 | 0.82043 | 0.85568 | 0.81205 |
| T = 0.5 | 0.81761 | 0.85959 | 0.80830 |
The value in each cell represents LOOCV AUC.
*The AUC for T = 0.2 and L = 4 was not calculated because the computation takes more than 48 h.
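To show where the two tuned parameters could enter a path-based score, here is a minimal sketch on a toy heterogeneous network. The record does not give the BPLLDA scoring formula, so several choices are assumptions: edges with weight below T are dropped, every simple path with at most L edges contributes the product of its edge weights, and that product is raised to `decay * length` to penalize longer paths. The `decay` parameter and all function names are hypothetical.

```python
from typing import Dict, List, Tuple

Graph = Dict[str, Dict[str, float]]


def add_edge(graph: Graph, u: str, v: str, weight: float) -> None:
    """Insert an undirected weighted edge."""
    graph.setdefault(u, {})[v] = weight
    graph.setdefault(v, {})[u] = weight


def prune_edges(graph: Graph, threshold: float) -> Graph:
    """Keep only edges with weight >= threshold (assumed role of T)."""
    return {
        u: {v: w for v, w in nbrs.items() if w >= threshold}
        for u, nbrs in graph.items()
    }


def path_score(graph: Graph, source: str, target: str,
               max_len: int, decay: float = 2.0) -> float:
    """Sum over simple paths with at most max_len edges (assumed role of L)
    of (product of edge weights) ** (decay * path length)."""
    total = 0.0
    stack: List[Tuple[str, float, Tuple[str, ...]]] = [(source, 1.0, (source,))]
    while stack:
        node, product, visited = stack.pop()
        length = len(visited)  # edge count once we step to a neighbor
        for nxt, weight in graph.get(node, {}).items():
            if nxt in visited:
                continue  # simple paths only: never revisit a node
            new_product = product * weight
            if nxt == target:
                total += new_product ** (decay * length)
            elif length < max_len:
                stack.append((nxt, new_product, visited + (nxt,)))
    return total


def build_toy_graph() -> Graph:
    """Two lncRNAs and two diseases with illustrative weights."""
    graph: Graph = {}
    add_edge(graph, "lnc1", "lnc2", 0.8)  # lncRNA similarity
    add_edge(graph, "dis1", "dis2", 0.5)  # disease similarity
    add_edge(graph, "lnc2", "dis1", 1.0)  # known association
    add_edge(graph, "lnc1", "dis2", 1.0)  # known association
    return graph
```

In this toy network, `lnc1` reaches `dis1` through `lnc2` and through `dis2`. Calling `prune_edges` with a threshold of 0.6 removes the disease similarity edge, so only the first path remains. The number of enumerated paths grows quickly with `max_len`, which is consistent with the long run time reported for the largest setting.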
The effects of T on AUC when fixing L = 3.
| AUC | 0.87102 | 0.87117 | 0.86889 | 0.85568 | 0.85959 |
The effects of the Gaussian interaction profile kernel similarity for lncRNAs and diseases on LOOCV.
| 0.78718 | 0.79036 | 0.80924 | 0.87117 |
The value in each cell represents LOOCV AUC.
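For readers unfamiliar with the Gaussian interaction profile kernel similarity, the sketch below computes it in its commonly used form: each lncRNA (or disease) is represented by its row (or column) of the binary association matrix, and similarity decays exponentially with the squared distance between profiles. The record does not state the bandwidth setting, so normalizing by the mean squared profile norm with `gamma_prime = 1.0` is an assumption, and the function name is hypothetical.

```python
import math
from typing import List


def gip_kernel(profiles: List[List[int]], gamma_prime: float = 1.0) -> List[List[float]]:
    """Return K with K[i][j] = exp(-gamma * ||p_i - p_j||^2), where
    gamma = gamma_prime / (mean squared norm of the profiles)."""
    n = len(profiles)
    mean_sq_norm = sum(sum(x * x for x in row) for row in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("all interaction profiles are empty")
    gamma = gamma_prime / mean_sq_norm

    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = math.exp(-gamma * sq_dist)
    return kernel


# Rows are lncRNA interaction profiles over three diseases (toy data).
toy_profiles = [[1, 0, 1], [1, 0, 1], [0, 1, 0]]
toy_kernel = gip_kernel(toy_profiles)
```

Passing the transposed association matrix gives the corresponding disease-side kernel. Identical profiles receive similarity 1, and profiles with no shared associations receive the smallest values.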
The top five lncRNA candidates predicted for cervical cancer, glioma, and non-small-cell lung cancer.
| Cervical cancer | MEG3 | LncRNADisease (Zhang J. et al., |
| Cervical cancer | PVT1 | LncRNADisease (Yang et al., |
| Cervical cancer | CDKN2B-AS1 | LncRNADisease (Zhang D. et al., |
| Cervical cancer | HOTAIR | LncRNADisease (Huang et al., |
| Cervical cancer | GAS5 | LncRNADisease (Cao et al., |
| Glioma | H19 | LncRNADisease (Shi et al., |
| Glioma | MALAT1 | LncRNADisease (Ma et al., |
| Glioma | PVT1 | (Zou et al., |
| Glioma | HOTAIR | LncRNADisease (Ke et al., |
| Glioma | GAS5 | LncRNADisease (Zhao X. et al., |
| Non-small-cell lung cancer | H19 | LncRNADisease (Zhang E. et al., |
| Non-small-cell lung cancer | MEG3 | LncRNADisease (Lu et al., |
| Non-small-cell lung cancer | HOTAIR | LncRNADisease (Liu X. H. et al., |
| Non-small-cell lung cancer | PVT1 | LncRNADisease (Yang et al., |
| Non-small-cell lung cancer | CDKN2B-AS1 | LncRNADisease (Nie et al., |
Figure 7. Network view of the top 10 predicted lncRNAs for cervical cancer, glioma, and non-small-cell lung cancer.
The top five novel disease-correlated lncRNA candidates predicted for colorectal cancer and breast cancer.
| Colorectal cancer | H19 | lncRNADisease (Tsang et al., |
| Colorectal cancer | CDKN2B-AS1 | lncRNADisease (Sun et al., |
| Colorectal cancer | PVT1 | lncRNADisease (Ping et al., |
| Colorectal cancer | MEG3 | lncRNADisease (Zhu et al., |
| Colorectal cancer | MALAT1 | lncRNADisease (Ji et al., |
| Breast cancer | H19 | lncRNADisease (Vennin et al., |
| Breast cancer | CDKN2B-AS1 | lncRNADisease (Xu et al., |
| Breast cancer | PVT1 | lncRNADisease (Guan et al., |
| Breast cancer | MALAT1 | lncRNADisease (Chou et al., |
| Breast cancer | B2 SINE RNA | Unconfirmed |
The top five novel disease-correlated lncRNA candidates predicted for H19 and HOTAIR.
| H19 | Prostate cancer | lncRNADisease (Zhu et al., |
| H19 | Tumor | (Matouk et al., |
| H19 | Cancer | lncRNADisease (DeBaun et al., |
| H19 | Breast cancer | lncRNADisease (Vennin et al., |
| H19 | Decreased myogenesis | Unconfirmed |
| HOTAIR | Cancer | lncRNADisease (Gupta et al., |
| HOTAIR | Breast cancer | lncRNADisease (Xue et al., |
| HOTAIR | Hepatocellular carcinoma | lncRNADisease (Yang et al., |
| HOTAIR | Prostate cancer | lncRNADisease (Zhang et al., |
| HOTAIR | Tumor | Unconfirmed |