Peng Wang1, Qiuyan Guo2, Yue Gao1, Hui Zhi1, Yan Zhang1, Yue Liu1, Jizhou Zhang1, Ming Yue1, Maoni Guo1, Shangwei Ning1,3, Guangmei Zhang2, Xia Li1,3.
Abstract
Although several computational models for predicting disease-associated lncRNAs (long non-coding RNAs) exist, only a limited number of disease-associated lncRNAs are known. In this study, we mapped lncRNAs to their functional genomics context using competing endogenous RNA (ceRNA) theory. Based on the premise that functionally similar lncRNAs are likely to be involved in similar diseases, we propose a disease lncRNA prioritization method, DisLncPri, to identify novel disease-lncRNA associations. Under a leave-one-out cross-validation (LOOCV) strategy, DisLncPri achieved area under the curve (AUC) values of 0.89 and 0.87 on the LncRNADisease and Lnc2Cancer datasets, respectively, which improved to 0.90 and 0.89 after integrating a multiple rank fusion strategy. In case studies of Alzheimer's disease, ovarian cancer, pancreatic cancer and gastric cancer, DisLncPri had the highest rank enrichment score and AUC value compared with several other methods. Several novel lncRNAs in the top ranks for these diseases have since been verified by relevant databases or reported in recent studies. Prioritization of lncRNAs from a microarray dataset (GSE53622) of oesophageal cancer patients highlighted ENSG00000226029 (ranked second), a previously unidentified lncRNA, as a potential prognostic biomarker. Our analysis thus indicates that DisLncPri is a useful tool for identifying lncRNAs that could serve as novel biomarkers and therapeutic targets in a variety of human diseases.
Keywords: competing endogenous RNA; functional genomics; long non-coding RNA; prognostic biomarker
Year: 2017 PMID: 27992375 PMCID: PMC5354861 DOI: 10.18632/oncotarget.13964
Source DB: PubMed Journal: Oncotarget ISSN: 1949-2553
Figure 1. Systematic analysis of the functional similarity for known disease-associated lncRNAs
(A–C) Comparison of FS scores between experimentally validated disease lncRNAs (red points) and randomly selected lncRNAs (green points) based on three orthogonal ontologies of GO. (D–I) Comparison of FS scores between experimentally validated disease lncRNAs (red points) and randomly selected lncRNAs (green points) based on six biological networks. Experimentally validated disease lncRNA groups had significantly higher FS scores than random groups. The horizontal bars indicate the mean FS score.
Figure 2. A flowchart of DisLncPri
There are three major steps in DisLncPri: (A) The candidate lncRNAs are ranked according to their FS scores with known seed disease lncRNAs, based on the three orthogonal function ontologies of GO. (B) Using the same strategy as in step A, the candidate lncRNAs are ranked according to their FS scores in the context of six biological networks. (C) The nine ranked lncRNA lists from steps A and B are combined into a single list using a multiple rank fusion method. LncRNAs are indicated as blue circles and mRNAs as yellow circles.
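The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record does not say which rank fusion rule DisLncPri uses, so the sketch swaps in a plain mean-rank fusion, and the function names, candidate names (`lncA`, `lncB`, `lncC`) and FS scores are all hypothetical.

```python
from statistics import mean


def rank_candidates(scores):
    """Return {candidate: rank}, with rank 1 for the highest FS score."""
    ordered = sorted(scores, key=lambda c: scores[c], reverse=True)
    return {c: i + 1 for i, c in enumerate(ordered)}


def fuse_ranks(score_tables):
    """Combine several FS score tables into one list by mean rank.

    ASSUMPTION: mean rank is a stand-in fusion rule, not necessarily
    the one used by DisLncPri.
    """
    rank_tables = [rank_candidates(t) for t in score_tables]
    candidates = list(rank_tables[0])
    fused = {c: mean(r[c] for r in rank_tables) for c in candidates}
    # Lower mean rank means higher priority.
    return sorted(candidates, key=lambda c: fused[c])


# Hypothetical FS scores from three data sources (the method uses nine).
score_tables = [
    {"lncA": 0.9, "lncB": 0.4, "lncC": 0.1},
    {"lncA": 0.7, "lncB": 0.8, "lncC": 0.2},
    {"lncA": 0.6, "lncB": 0.5, "lncC": 0.9},
]
fused_order = fuse_ranks(score_tables)
```

Ranking within each source before fusing keeps sources with different score scales comparable, which is the usual motivation for fusing ranks rather than raw scores.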
Figure 3. ROC curves for LOOCV analysis
(A–C) Three orthogonal ontologies of GO. (D–I) Six biological networks. DisLncPri achieved AUC values ranging from 0.83 to 0.89.
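To make the LOOCV AUC concrete, here is a minimal sketch of one common way to score a leave-one-out run. It assumes that each fold holds out a single known disease lncRNA and ranks it among a fixed number of candidates; the exact protocol is not given in this record, and the function name and the ranks below are hypothetical.

```python
def loocv_auc(held_out_ranks, n_candidates):
    """Mean per-fold AUC when each fold has exactly one held-out positive.

    A held-out lncRNA ranked r among n candidates outranks (n - r) of the
    (n - 1) negatives, so its per-fold AUC is (n - r) / (n - 1).
    """
    if n_candidates < 2:
        raise ValueError("need at least two candidates per fold")
    per_fold = [(n_candidates - r) / (n_candidates - 1) for r in held_out_ranks]
    return sum(per_fold) / len(per_fold)


# Hypothetical ranks of five held-out seed lncRNAs among 101 candidates.
toy_ranks = [1, 3, 11, 21, 51]
auc = loocv_auc(toy_ranks, 101)
```

An AUC near 1.0 means held-out disease lncRNAs are consistently ranked near the top, while 0.5 corresponds to random ordering.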
Figure 4. ROC curves for DisLncPri after integrating different functional genomics datasets
(A) The overall ROC curve yielded the highest AUC value of 0.90. (B–Y) Case studies for 24 complex diseases in LOOCV analysis after improvement of DisLncPri. HTT: Hereditary Haemorrhagic Telangiectasia.
Figure 5. Comparison of DisLncPri with other methods
(A) DisLncPri had a higher ES score than other similar methods. Error bars indicate 95% confidence intervals. (B) DisLncPri had the highest AUC value in comparison with the other methods.
Novel lncRNA-disease associations confirmed by literature survey in the top 20 ranked list of DisLncPri
| Disease | lncRNA | Ensembl ID | Rank |
|---|---|---|---|
| Alzheimer's disease | MEG3 | ENSG00000214548 | 1 |
| | PVT1 | ENSG00000249859 | 6 |
| | LINC01616 | ENSG00000261340 | 13 |
| Ovarian cancer | GAS5 | ENSG00000234741 | 1 |
| | MALAT1 | ENSG00000251562 | 4 |
| | MEG3 | ENSG00000214548 | 6 |
| | HOTAIR | ENSG00000228630 | 9 |
| Pancreatic cancer | GAS5 | ENSG00000234741 | 4 |
| | AP000221.1 | ENSG00000229962 | 11 |
| | CTC-338M12.5 | ENSG00000250222 | 17 |
| Gastric cancer | FRGCA | ENSG00000236663 | 1 |
| | MALAT1 | ENSG00000251562 | 13 |
| | MEG3 | ENSG00000214548 | 20 |
Univariate Cox regression analysis showing 8 lncRNAs that significantly affect survival of oesophageal squamous cell carcinoma (OSCC) patients (P < 0.05)
| Rank | LncRNA | Ensembl ID | HR (95% CI) | Coefficient | P-value |
|---|---|---|---|---|---|
| 1 | CTB-113D17.1 | ENSG00000272568 | 5.20(2.51–10.75) | 1.65 | 8.89E–06 |
| 2 | RP4-798A10.2 | ENSG00000226029 | 9.75(3.59–26.49) | 2.28 | 7.93E–06 |
| 4 | MIR202HG | ENSG00000166917 | 6.05(2.13–17.14) | 1.80 | 7.08E–04 |
| 6 | TFAP2A-AS1 | ENSG00000229950 | 0.57(0.37–0.88) | −0.56 | 1.10E–02 |
| 12 | SCGB1B2P | ENSG00000268751 | 0.23(0.07–0.79) | −1.47 | 1.96E–02 |
| 13 | RP11-510M2.2 | ENSG00000247324 | 1.91(1.02–3.59) | 0.65 | 4.32E–02 |
| 19 | AL133493.2 | ENSG00000233922 | 0.17(0.06–0.48) | −1.78 | 9.10E–04 |
| 20 | MALAT1 | ENSG00000251562 | 2.90(1.11–7.60) | 1.07 | 3.02E–02 |
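In a Cox model the hazard ratio is the exponential of the regression coefficient, so the HR and Coefficient columns above should agree up to rounding. The short check below uses the eight rows of the table; the 0.05 tolerance is an assumption chosen to absorb the two-decimal rounding of the reported coefficients.

```python
import math

# (lncRNA, reported HR, reported coefficient), copied from the table above.
rows = [
    ("CTB-113D17.1", 5.20, 1.65),
    ("RP4-798A10.2", 9.75, 2.28),
    ("MIR202HG", 6.05, 1.80),
    ("TFAP2A-AS1", 0.57, -0.56),
    ("SCGB1B2P", 0.23, -1.47),
    ("RP11-510M2.2", 1.91, 0.65),
    ("AL133493.2", 0.17, -1.78),
    ("MALAT1", 2.90, 1.07),
]


def hazard_ratio(coefficient):
    """Hazard ratio implied by a Cox regression coefficient."""
    return math.exp(coefficient)


# Largest gap between the reported HR and exp(coefficient).
max_gap = max(abs(hazard_ratio(coef) - hr) for _, hr, coef in rows)

# HR > 1 (positive coefficient) marks a risk factor, HR < 1 a protective one.
risk_lncrnas = [name for name, hr, _ in rows if hr > 1]
```

Five of the eight lncRNAs have HR > 1, which matches the five risk lncRNAs mentioned in the Figure 6 caption.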
Figure 6. Kaplan-Meier survival analysis for lncRNAs predicted by DisLncPri
(A–E) DisLncPri predicted five risk lncRNAs that significantly separated the 60 OSCC patients into two groups with high and low survival rates. (F) The lncRNA ENSG00000226029 (ranked second in the list) also had a significant effect on OSCC patient survival in an independent dataset.
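For readers unfamiliar with Kaplan-Meier curves, the following is a minimal sketch of the product-limit estimator applied to one patient group. The follow-up times and event flags are invented for illustration and are not taken from GSE53622. How patients were split into high and low groups is not stated in this record; in practice the estimator would be run once per group and the two curves compared, for example with a log-rank test.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  : follow-up time for each patient
    events : 1 if death was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set.
        at_risk -= removed
    return curve


# Hypothetical follow-up (months) for five patients in one group.
toy_times = [2, 3, 3, 5, 8]
toy_events = [1, 1, 0, 1, 0]
km_curve = kaplan_meier(toy_times, toy_events)
```

Censored patients contribute to the risk set until their last follow-up but never trigger a drop in the curve, which is why the estimate steps down only at observed deaths.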