Qing-Wen Wu, Jun-Feng Xia, Jian-Cheng Ni, Chun-Hou Zheng.
Abstract
Predicting disease-related long non-coding RNAs (lncRNAs) helps identify new biomarkers for the prevention, diagnosis and treatment of complex human diseases. In this paper, we propose GAERF, a machine learning-based classification approach that identifies disease-related lncRNAs using a graph auto-encoder (GAE) and a random forest (RF). First, we combined the relationships among lncRNAs, miRNAs and diseases into a heterogeneous network. Then, low-dimensional representation vectors of the nodes were learned from the network by the GAE, which reduces the dimension and heterogeneity of the biological data. Taking these feature vectors as input, we trained an RF classifier to predict new lncRNA-disease associations (LDAs). Experimental results show that the proposed representation characterizes lncRNA-disease pairs accurately. GAERF achieves superior performance owing to the ensemble learning method, outperforming other methods significantly. Moreover, case studies further demonstrated that GAERF is an effective method for predicting LDAs.
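The first step of the pipeline, merging lncRNA, miRNA and disease relationships into one heterogeneous network, can be illustrated with a minimal sketch. The abstract does not give construction details, so everything here is an assumption made for illustration: the function names, the node ordering, the unweighted 0/1 edges, and the toy data are hypothetical. The symmetric normalization with self-loops is the propagation matrix commonly used by graph convolutional encoders; whether GAERF uses exactly this form is not stated in the abstract.

```python
from math import sqrt


def build_heterogeneous_adjacency(n_lnc, n_mi, n_dis, lnc_mi, mi_dis, lnc_dis):
    """Return a symmetric 0/1 adjacency matrix over all nodes.

    Assumed node order: lncRNAs first, then miRNAs, then diseases.
    Each edge list holds index pairs local to its two node types.
    """
    n = n_lnc + n_mi + n_dis
    adj = [[0] * n for _ in range(n)]

    def connect(u, v):
        # Associations are treated as undirected edges.
        adj[u][v] = 1
        adj[v][u] = 1

    for lnc, mi in lnc_mi:
        connect(lnc, n_lnc + mi)
    for mi, dis in mi_dis:
        connect(n_lnc + mi, n_lnc + n_mi + dis)
    for lnc, dis in lnc_dis:
        connect(lnc, n_lnc + n_mi + dis)
    return adj


def normalize_with_self_loops(adj):
    """Compute D^-1/2 (A + I) D^-1/2 for a square adjacency matrix."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]  # always >= 1 because of the self-loop
    return [[a_hat[i][j] / sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


# Toy data, made up for illustration: 2 lncRNAs, 2 miRNAs, 1 disease.
adj = build_heterogeneous_adjacency(
    n_lnc=2, n_mi=2, n_dis=1,
    lnc_mi=[(0, 0), (1, 1)],
    mi_dis=[(0, 0)],
    lnc_dis=[(1, 0)],
)
norm = normalize_with_self_loops(adj)
```

In a full pipeline along the lines the abstract describes, a matrix like `norm` would feed a graph auto-encoder, and the learned node vectors for each lncRNA-disease pair would become the input features of a random forest classifier; those stages are omitted here.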
Keywords: graph auto-encoder; graph convolutional network; graph embedding; lncRNA-disease association; random forest
Year: 2021 PMID: 33415333 DOI: 10.1093/bib/bbaa391
Source DB: PubMed Journal: Brief Bioinform ISSN: 1467-5463 Impact factor: 11.622