Zhifeng Guo, Yan Hui, Fanlong Kong, Xiaoxi Lin.
Abstract
Lung cancer is one of the leading causes of cancer-related deaths, so it is important to find its biomarkers. A growing number of studies report that long noncoding RNAs (lncRNAs) are densely linked to multiple complex human diseases. Inferring new lncRNA-disease associations (LDAs) helps to identify potential biomarkers for lung cancer, further understand its pathogenesis, design new drugs, and formulate individualized therapeutic options for lung cancer patients. This study developed a computational method (LDA-RLSURW) that integrates Laplacian regularized least squares and unbalanced bi-random walk to discover possible lncRNA biomarkers for lung cancer. First, the lncRNA and disease similarities were computed. Second, unbalanced bi-random walk was applied to the lncRNA and disease networks, respectively, to score associations between diseases and lncRNAs. Third, Laplacian regularized least squares was used to compute the association probability for each lncRNA-disease pair based on the computed random walk scores. LDA-RLSURW was compared with 10 classical LDA prediction methods and obtained the best AUC value of 0.9027 on the lncRNADisease database. We found the top 30 lncRNAs associated with lung cancers and inferred that lncRNAs TUG1, PTENP1, and UCA1 may be biomarkers of lung neoplasms, non-small-cell lung cancer (NSCLC), and lung adenocarcinoma (LUAD), respectively.
Keywords: biomarker; laplacian regularized least squares; lncRNA; lncRNA-disease association; lung cancer; unbalanced bi-random walk
Year: 2022 PMID: 35938010 PMCID: PMC9355720 DOI: 10.3389/fgene.2022.933009
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.772
FIGURE 1. Flowchart of LDA-RLSURW.
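The unbalanced bi-random walk step of the workflow can be sketched as follows. This is a minimal illustration of a generic bi-random walk on two similarity networks, not the authors' implementation: the function names, the decay factor `alpha`, the step limits `steps_lnc` and `steps_dis`, and the symmetric normalization are all assumptions made for the example.

```python
import numpy as np


def sym_normalize(S):
    """Symmetric normalization D^{-1/2} S D^{-1/2} of a similarity matrix."""
    S = np.asarray(S, dtype=float)
    d = S.sum(axis=1)
    d_inv_sqrt = np.zeros_like(d)
    nonzero = d > 0
    d_inv_sqrt[nonzero] = 1.0 / np.sqrt(d[nonzero])
    return S * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def unbalanced_birw(A, S_lnc, S_dis, alpha=0.8, steps_lnc=2, steps_dis=3):
    """Score lncRNA-disease pairs by walking on both similarity networks.

    A      : (n_lnc, n_dis) known association matrix (0/1).
    S_lnc  : (n_lnc, n_lnc) lncRNA similarity matrix.
    S_dis  : (n_dis, n_dis) disease similarity matrix.
    alpha, steps_lnc, steps_dis are hypothetical defaults, not values
    taken from the paper.
    """
    A = np.asarray(A, dtype=float)
    total = A.sum()
    R0 = A / total if total > 0 else A.copy()  # normalized starting scores
    L = sym_normalize(S_lnc)
    D = sym_normalize(S_dis)

    R = R0.copy()
    for t in range(1, max(steps_lnc, steps_dis) + 1):
        updates = []
        if t <= steps_lnc:  # walk on the lncRNA network
            updates.append(alpha * L @ R + (1 - alpha) * R0)
        if t <= steps_dis:  # walk on the disease network
            updates.append(alpha * R @ D + (1 - alpha) * R0)
        R = sum(updates) / len(updates)  # average the walks taken this step
    return R
```

In this sketch the two networks are allowed different step limits, which is the sense in which the walk is treated as "unbalanced" here; once one side reaches its limit, only the other side keeps updating the scores.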
TABLE 1. AUC values of LDA prediction methods on the lncRNADisease dataset.
| Method | AUC (5-fold CV) |
|---|---|
| LNCSIM1/LNCSIM2 | 0.8892/0.8881 |
| ILNCSIM | 0.8866 |
| IDSSIM | 0.8966 |
| RWRlncD | 0.6976 |
| IIRWR | 0.7781 |
| SIMCLDA | 0.7986 |
| LRLSLDA | 0.8174 |
| LLCPLDA | 0.8678 |
| LDA-LNSUBRW | 0.8874 |
| LDA-RLSURW | 0.9027 |
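For readers unfamiliar with the metric, an AUC such as those reported for 5-fold cross-validation can be computed from held-out labels and predicted scores as sketched below. This is a generic rank-based (Mann-Whitney) AUC, not the authors' evaluation code; the function name `auc_from_scores` is hypothetical.

```python
import numpy as np


def auc_from_scores(labels, scores):
    """Probability that a random positive pair outranks a random negative pair.

    labels : iterable of 0/1 flags (1 = held-out known association).
    scores : predicted association scores, same length as labels.
    """
    labels = np.asarray(labels).astype(bool)
    scores = np.asarray(scores, dtype=float)
    pos = scores[labels]
    neg = scores[~labels]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("need at least one positive and one negative sample")
    # Compare every positive score with every negative score.
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (pos.size * neg.size)
```

For example, `auc_from_scores([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.1])` gives 0.75, because three of the four positive-negative pairs are ranked correctly. The pairwise comparison is quadratic in memory, so it suits small illustrations rather than full-size association matrices.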
LNCSIM1, LNCSIM2, LRLSLDA, and LDA-RLSURW are all Laplacian regularized least squares-based LDA methods, and LDA-RLSURW achieves the best AUC among them, which suggests that integrating unbalanced bi-random walk improves performance. In addition, IDSSIM and LDA-RLSURW compute lncRNA similarity and disease similarity with the same method, but IDSSIM uses the weighted K nearest known neighbor method to compute lncRNA-disease association scores. LDA-RLSURW outperforms IDSSIM, which shows that combining Laplacian regularized least squares with unbalanced bi-random walk improves LDA prediction performance compared with the weighted K nearest known neighbor method. Both RWRlncD and IIRWR are random walk with restart-based LDA prediction methods, SIMCLDA is an inductive matrix completion-based method, and LLCPLDA is a locality-constraint linear coding-based method. LDA-RLSURW obtains a higher AUC than RWRlncD, IIRWR, SIMCLDA, and LLCPLDA, which further supports its performance.
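The Laplacian regularized least squares step shared by several of these methods can be sketched as follows. The closed form used here, F = S (S + eta L S)^+ Y, is a commonly cited one for this family of methods and is an assumption about the general technique, not a transcription of LDA-RLSURW; the names `laprls_scores`, `eta`, and `weight` are hypothetical, and `Y` stands for whatever score matrix is fed in (for example, random walk scores).

```python
import numpy as np


def normalized_laplacian(S):
    """Normalized graph Laplacian I - D^{-1/2} S D^{-1/2}."""
    S = np.asarray(S, dtype=float)
    d = S.sum(axis=1)
    d_inv_sqrt = np.zeros_like(d)
    nonzero = d > 0
    d_inv_sqrt[nonzero] = 1.0 / np.sqrt(d[nonzero])
    return np.eye(S.shape[0]) - S * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def laprls_scores(Y, S_lnc, S_dis, eta=0.5, weight=0.5):
    """Combine LapRLS estimates from the lncRNA space and the disease space.

    Y      : (n_lnc, n_dis) input score or association matrix.
    eta    : hypothetical regularization strength (same for both spaces).
    weight : hypothetical mixing weight between the two estimates.
    """
    Y = np.asarray(Y, dtype=float)
    S_lnc = np.asarray(S_lnc, dtype=float)
    S_dis = np.asarray(S_dis, dtype=float)
    L_lnc = normalized_laplacian(S_lnc)
    L_dis = normalized_laplacian(S_dis)

    # Estimate in lncRNA space: S_l (S_l + eta * L_l S_l)^+ Y
    F_lnc = S_lnc @ np.linalg.pinv(S_lnc + eta * L_lnc @ S_lnc) @ Y
    # Estimate in disease space: S_d (S_d + eta * L_d S_d)^+ Y^T
    F_dis = S_dis @ np.linalg.pinv(S_dis + eta * L_dis @ S_dis) @ Y.T
    # Weighted average of the two estimates, both as (n_lnc, n_dis).
    return weight * F_lnc + (1 - weight) * F_dis.T
```

The pseudo-inverse is used instead of a plain inverse so that the sketch still runs when a similarity matrix is singular. With `eta=0` and invertible similarity matrices, the function returns `Y` unchanged, which is a quick sanity check that the regularization term is the only source of smoothing.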
TABLE 2. Inferred top 30 lncRNAs associated with lung neoplasms (LN).
| Rank | lncRNAs | Evidence | Rank | lncRNAs | Evidence |
|---|---|---|---|---|---|
| 1 | MALAT1 | Known | 16 | MINA | the MNDR database |
| 2 | HOTAIR | Known | 17 | PVT1 | the MNDR database |
| 3 | MEG3 | Known | 18 | | |
| 4 | H19 | Known | 19 | | |
| 5 | GAS5 | Known | 20 | XIST | the MNDR database |
| 6 | UCA1 | Known | 21 | | |
| 7 | CCAT2 | Known | 22 | | |
| 8 | SPRY4-IT1 | Known | 23 | | |
| 9 | CCAT1 | Known | 24 | | |
| 10 | CDKN2B-AS1 | Known | 25 | | |
| 11 | BANCR | Known | 26 | | |
| 12 | BCYRN1 | Known | 27 | | |
| 13 | PCAT1 | Known | 28 | | |
| 14 | SOX2-OT | Known | 29 | | |
| 15 | CASC2 | Known | 30 | | |
Bold values in Table 2 denote lncRNAs that were predicted to be associated with LN and need further validation.
FIGURE 2. Associations between the inferred top 30 lncRNAs and lung neoplasms (LN). Black solid lines represent known LDAs in the lncRNADisease database. Blue dotted lines represent LDAs that can be observed in the MNDR database. Red dashed lines represent LDAs predicted to be potential lncRNA biomarkers of LN.
TABLE 3. Inferred top 30 lncRNAs associated with NSCLC.
| Rank | lncRNAs | Evidence | Rank | lncRNAs | Evidence |
|---|---|---|---|---|---|
| 1 | MALAT1 | Known | 16 | PANDAR | Known |
| 2 | HOTAIR | Known | 17 | HIF1A-AS1 | Known |
| 3 | MEG3 | Known | 18 | PCAT1 | the MNDR database |
| 4 | GAS5 | Known | 19 | CASC2 | the MNDR database |
| 5 | H19 | Known | 20 | SOX2-OT | the MNDR database |
| 6 | UCA1 | Known | 21 | HULC | the MNDR database |
| 7 | CCAT2 | Known | 22 | | Unconfirmed |
| 8 | SPRY4-IT1 | Known | 23 | | Unconfirmed |
| 9 | CDKN2B-AS1 | Known | 24 | HIF1A-AS2 | the MNDR database |
| 10 | PVT1 | Known | 25 | HNF1A-AS1 | Known |
| 11 | CCAT1 | Known | 26 | KCNQ1OT1 | the MNDR database |
| 12 | TUG1 | Known | 27 | CRNDE | the MNDR database |
| 13 | BANCR | Known | 28 | DANCR | the MNDR database |
| 14 | BCYRN1 | Known | 29 | MIR31HG | the MNDR database |
| 15 | XIST | Known | 30 | NPTN-IT1 | the MNDR database |
Bold values in Table 3 denote lncRNAs that were predicted to be associated with NSCLC and need further validation.
FIGURE 3. Associations between the inferred top 30 lncRNAs and NSCLC. Black solid lines represent known LDAs in the lncRNADisease database. Blue dotted lines represent LDAs that can be observed in the MNDR database. Red dashed lines represent LDAs predicted to be potential lncRNA biomarkers of NSCLC.
TABLE 4. Inferred top 30 lncRNAs associated with LUAD.
| Rank | lncRNAs | Evidence | Rank | lncRNAs | Evidence |
|---|---|---|---|---|---|
| 1 | MALAT1 | Known | 16 | | Unconfirmed |
| 2 | HOTAIR | Known | 17 | | Unconfirmed |
| 3 | MEG3 | Known | 18 | | Unconfirmed |
| 4 | GAS5 | Known | 19 | | Unconfirmed |
| 5 | CCAT1 | Known | 20 | | Unconfirmed |
| 6 | HNF1A-AS1 | the MNDR database | 21 | | Unconfirmed |
| 7 | MIAT | Known | 22 | | Unconfirmed |
| 8 | H19 | the MNDR database | 23 | | Unconfirmed |
| 9 | | Unconfirmed | 24 | | Unconfirmed |
| 10 | | Unconfirmed | 25 | | Unconfirmed |
| 11 | | Unconfirmed | 26 | | Unconfirmed |
| 12 | | Unconfirmed | 27 | | Unconfirmed |
| 13 | | Unconfirmed | 28 | | Unconfirmed |
| 14 | | Unconfirmed | 29 | | Unconfirmed |
| 15 | | Unconfirmed | 30 | | Unconfirmed |
Bold values in Table 4 denote lncRNAs that were predicted to be associated with LUAD and need further validation.
FIGURE 4. Associations between the inferred top 30 lncRNAs and LUAD. Black solid lines represent known LDAs in the lncRNADisease database. Blue dotted lines represent LDAs that can be observed in the MNDR database. Red dashed lines represent LDAs predicted to be potential lncRNA biomarkers of lung adenocarcinoma (LUAD).