Haojiang Tan1, Quanmeng Sun1, Guanghui Li2, Qiu Xiao3, Pingjian Ding4, Jiawei Luo5, Cheng Liang1.
Abstract
Long noncoding RNAs (lncRNAs) are a class of noncoding RNA molecules longer than 200 nucleotides. Recent studies have uncovered their functional roles in diverse cellular processes and in tumorigenesis. Identifying novel disease-related lncRNAs might therefore deepen our understanding of disease etiology. However, because relatively few associations between lncRNAs and diseases have been verified, it remains challenging to predict the lncRNAs associated with a given disease reliably and effectively. In this paper, we propose a novel multiview consensus graph learning method to infer potential disease-related lncRNAs. Specifically, we first construct a set of similarity matrices for lncRNAs and diseases from the known associations. We then iteratively learn a consensus graph from the multiple input matrices and simultaneously optimize the predicted association probability within a multi-label learning framework. To assess our method, we compare it with three state-of-the-art methods on three widely used datasets. The experimental results show that our method achieves the best prediction performance under different cross-validation schemes. A case study on uterine cervical neoplasms further confirms the utility of our method for identifying lncRNAs as potential prognostic biomarkers in practice.
Keywords: consensus graph learning; lncRNA–disease association; multi-label learning; multiple similarity matrices; survival analysis
Year: 2020 PMID: 32153646 PMCID: PMC7047769 DOI: 10.3389/fgene.2020.00089
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
Details of the three datasets used in this study.
| Dataset | No. of lncRNAs | No. of diseases | No. of interactions |
|---|---|---|---|
| Dataset1 | 112 | 150 | 276 |
| Dataset2 | 131 | 169 | 319 |
| Dataset3 | 285 | 226 | 621 |
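
The counts in the table above make the sparsity problem mentioned in the abstract concrete. The following is a minimal sketch that computes, for each dataset, the fraction of all possible lncRNA-disease pairs that are known associations. The names `datasets` and `association_density` are illustrative and do not come from the paper.

```python
# Counts copied from the dataset table above:
# (number of lncRNAs, number of diseases, number of known interactions).
datasets = {
    "Dataset1": (112, 150, 276),
    "Dataset2": (131, 169, 319),
    "Dataset3": (285, 226, 621),
}


def association_density(n_lncrnas, n_diseases, n_known):
    """Fraction of all lncRNA-disease pairs that are known associations."""
    return n_known / (n_lncrnas * n_diseases)


densities = {name: association_density(*counts) for name, counts in datasets.items()}

for name, density in densities.items():
    print(f"{name}: {density:.2%} of all pairs are known associations")
```

In all three datasets fewer than 2% of the possible pairs are known associations, which is why the prediction task is treated as learning from a very sparse association matrix.
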
Figure 1. An overall workflow of our method.
Figure 2. The comparison results between our method and the other three methods in terms of LOOCV using (A) Dataset1; (B) Dataset2; (C) Dataset3.
Figure 3. The comparison results between our method and the other three methods in terms of five-fold CV using (A) Dataset1; (B) Dataset2; (C) Dataset3.
Figure 4. The comparison results between our method and the other three methods in terms of LODOCV using (A) Dataset1; (B) Dataset2; (C) Dataset3.
Comparison of different methods based on LODOCV using the Wilcoxon signed-rank test.
| Dataset | BiwalkLDA | SIMCLDA | KATZLDA |
|---|---|---|---|
| Dataset1 | 8.41e−10 | 1.84e−09 | 3.74e−12 |
| Dataset2 | 4.57e−09 | 1.22e−07 | 8.07e−13 |
| Dataset3 | 5.981e−09 | 7.49e−07 | 5.54e−14 |
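
For readers unfamiliar with the test named in the table above, here is a minimal standard-library sketch of a two-sided Wilcoxon signed-rank test using the normal approximation. It is not the authors' code. The record does not state which paired quantities were compared, so the per-disease scores in the usage lines are hypothetical, as is the function name `wilcoxon_signed_rank`.

```python
import math


def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed-rank test (normal approximation).

    Returns (w_plus, w_minus, p_value). Zero differences are dropped and
    tied absolute differences receive average ranks.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, averaging ranks within tie groups.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_correction = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        t = j - i + 1
        tie_correction += t ** 3 - t
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)

    # Normal approximation to the null distribution of w_plus.
    mean = n * (n + 1) / 4
    var = n * (n + 1) * (2 * n + 1) / 24 - tie_correction / 48
    z = (w_plus - mean) / math.sqrt(var)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided
    return w_plus, w_minus, p_value


# Hypothetical per-disease scores for two methods (not data from the paper).
ours = [0.91, 0.88, 0.95, 0.79, 0.85, 0.93]
baseline = [0.85, 0.80, 0.92, 0.78, 0.87, 0.84]
w_plus, w_minus, p = wilcoxon_signed_rank(ours, baseline)
print(f"W+ = {w_plus}, W- = {w_minus}, p = {p:.4f}")
```

The normal approximation is only reasonable for moderately large samples. For small samples an exact test, such as the one offered by statistical libraries, is the safer choice.
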
Figure 5. The influence of the two parameters α and β on the prediction accuracy of five-fold cross-validation.
Figure 6. The convergence rate of our method.
The top 10 lncRNAs predicted by our method to be associated with uterine cervical neoplasms.
| Rank | lncRNA | Evidence |
|---|---|---|
| 1 | UCA1 | Lnc2Cancer;MNDR |
| 2 | TUG1 | Lnc2Cancer;MNDR |
| 3 | MIR99AHG | MNDR |
| 4 | MIR7-3HG | Unknown |
| 5 | HIF1A-AS1 | MNDR |
| 6 | HOXC-AS1 | MNDR |
| 7 | LINC-ROR | Lnc2Cancer |
| 8 | NEAT1 | Lnc2Cancer;MNDR |
| 9 | GSEC | MNDR |
| 10 | HOTTIP | MNDR |
Figure 7. Kaplan–Meier survival analysis using MIR7-3HG as a prognostic biomarker in uterine cervical neoplasms. Patients are divided into “high” and “low” groups according to their expression level of MIR7-3HG against the mean expression level across all patients.
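
The grouping and survival estimate described in the Figure 7 caption can be sketched as follows. This is a minimal standard-library illustration, not the authors' analysis. The patient values are hypothetical, and assigning a patient whose expression equals the mean to the "low" group is an assumption, since the caption does not cover that case.

```python
def split_by_mean(expression):
    """Label each patient 'high' or 'low' relative to the mean expression.

    Assumption: values exactly equal to the mean are labeled 'low'.
    """
    mean_expr = sum(expression) / len(expression)
    return ["high" if e > mean_expr else "low" for e in expression]


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as [(time, survival_probability), ...].

    events[i] is True if the event (death) was observed at times[i] and
    False if the patient was censored at that time.
    """
    curve = []
    survival = 1.0
    for t in sorted({u for u, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical patients: expression level, follow-up time, event observed.
expression = [2.1, 8.4, 3.3, 9.0, 1.2, 7.7]
times = [30, 12, 45, 8, 60, 20]
events = [True, True, False, True, False, True]

groups = split_by_mean(expression)
for label in ("high", "low"):
    idx = [i for i, g in enumerate(groups) if g == label]
    curve = kaplan_meier([times[i] for i in idx], [events[i] for i in idx])
    print(label, curve)
```

Comparing the two resulting curves, typically with a log-rank test, is what indicates whether the expression level separates patients by prognosis.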