Qianlan Yao1, Leilei Wu1, Jia Li2,3, Li Guang Yang2,3, Yidi Sun2,3, Zhen Li1, Sheng He2,3, Fangyoumin Feng2,3, Hong Li2, Yixue Li1,2,4.
Abstract
LncRNAs play pivotal roles in many important biological processes, but research on the functions of lncRNAs in human disease is still in its infancy. Therefore, it is urgent to prioritize lncRNAs that are potentially associated with diseases. In this work, we developed a novel algorithm, LncPriCNet, that uses a multi-level composite network to prioritize candidate lncRNAs associated with diseases. By integrating genes, lncRNAs, phenotypes and their associations, LncPriCNet achieves an overall performance superior to that of previous methods, with high AUC values of up to 0.93. Notably, LncPriCNet still performs well when information on known disease lncRNAs is lacking. When applied to breast cancer, LncPriCNet identified known breast cancer-related lncRNAs, revealed novel lncRNA candidates and inferred their functions via pathway analysis. We further constructed the human disease-lncRNA landscape, revealed the modularity of the disease-lncRNA network and identified several lncRNA hotspots. In summary, LncPriCNet is a useful tool for prioritizing disease-related lncRNAs and may facilitate understanding of the molecular mechanisms of human disease at the lncRNA level.
Year: 2017 PMID: 28051121 PMCID: PMC5209722 DOI: 10.1038/srep39516
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. The flow chart of LncPriCNet.
(a) Construction of the multi-level composite network. This network is composed of six sub-networks. White circles indicate lncRNAs; white squares indicate genes; white triangles indicate phenotypes. The thickness of an edge indicates its weight score. (b) The flow chart by which LncPriCNet prioritizes the candidate lncRNAs. First, the candidate lncRNAs of interest and the seed nodes are mapped to the multi-level composite network. Then, a global extended RWR method is used to score the candidate lncRNAs according to their proximity to the seed nodes. Finally, the candidate lncRNAs are ranked according to their scores. Purple circles represent the candidate lncRNAs of interest; the red triangle indicates the disease phenotype of interest (phenotype seed) from the OMIM database; red squares represent known disease genes (gene seeds) from the OMIM database; and red circles indicate known disease lncRNAs (lncRNA seeds) from the lncRNADisease database.
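The scoring step in panel (b) can be illustrated with a minimal sketch of a plain random walk with restart (RWR) on a single weighted graph. This is not the global extended RWR used by LncPriCNet, which operates on the multi-level composite network; the toy network, the node names (`ph1`, `g1`, `g2`, `lncA`, `lncB`), the restart probability of 0.7 and the function name are all assumptions made for illustration.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Score every node by its steady-state proximity to the seed nodes.

    adjacency: dict mapping node -> dict of neighbour -> edge weight (symmetric).
    seeds: iterable of seed nodes; the restart mass is split evenly among them.
    Assumes every node has at least one edge (isolated nodes would lose mass).
    """
    nodes = sorted(adjacency)
    seeds = set(seeds)
    # Total outgoing weight per node, used to turn weights into step probabilities.
    out_weight = {u: sum(adjacency[u].values()) for u in nodes}
    p0 = {u: (1.0 / len(seeds) if u in seeds else 0.0) for u in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # With probability `restart` the walker jumps back to a seed ...
        nxt = {u: restart * p0[u] for u in nodes}
        # ... otherwise it moves to a neighbour in proportion to edge weight.
        for u in nodes:
            if out_weight[u] == 0:
                continue
            for v, w in adjacency[u].items():
                nxt[v] += (1 - restart) * p[u] * w / out_weight[u]
        change = sum(abs(nxt[u] - p[u]) for u in nodes)
        p = nxt
        if change < tol:
            break
    return p


# Hypothetical toy network: one phenotype, two genes, two candidate lncRNAs.
network = {
    "ph1": {"g1": 1.0},
    "g1": {"ph1": 1.0, "g2": 1.0, "lncA": 1.0},
    "g2": {"g1": 1.0, "lncB": 1.0},
    "lncA": {"g1": 1.0},
    "lncB": {"g2": 1.0},
}
scores = random_walk_with_restart(network, seeds=["ph1", "g1"])
candidates = ["lncA", "lncB"]
ranked = sorted(candidates, key=lambda n: scores[n], reverse=True)
print(ranked)
```

In this toy case the candidate directly linked to a seed gene (`lncA`) ranks above the one that is two steps away (`lncB`), which is the proximity behaviour the caption describes.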
Figure 2. Performance of LncPriCNet and comparison with other methods.
(a) ROC curve for the predicted lncRNAs of 53 phenotypes. (b) ROC curve for the predicted lncRNAs of 20 phenotypes with two known lncRNAs. (c) ROC curve for the predicted lncRNAs of 42 phenotypes with only one known lncRNA. (d) The performance in hypothetical phenotypes without known disease lncRNAs.
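As a reminder of what the AUC values summarized by these ROC curves measure, the following minimal sketch computes AUC as the probability that a randomly chosen positive (a known disease lncRNA) is scored above a randomly chosen negative. The scores and labels are invented for illustration, and the function name `roc_auc` is an assumption; this is not the evaluation code behind the figure.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Counts, over all positive/negative pairs, how often the positive has the
    higher score; ties contribute 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for sp in pos:
        for sn in neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical prioritization scores; label 1 marks a known disease lncRNA.
example_scores = [0.9, 0.8, 0.4, 0.3, 0.1]
example_labels = [1, 0, 1, 0, 0]
auc = roc_auc(example_scores, example_labels)
print(round(auc, 3))
```

An AUC of 1.0 means every known disease lncRNA is ranked above every other candidate, while 0.5 corresponds to a random ordering.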
Predicted breast cancer-related lncRNAs ranked in the top 10 by LncPriCNet, RWRHLD or RlncD.
| lncRNA | Rank (LncPriCNet) | Rank (RWRHLD) | Rank (RlncD) | References |
|---|---|---|---|---|
| 97 | ||||
| 31 | ||||
| 77 | ||||
| 522 | ||||
| 58 | — | |||
| 385 | — | |||
| 4 | — | |||
| 12 | — | |||
| 5 | — | |||
| 266 | ||||
| 3 | — | |||
| 1 | — | |||
| 2 | — | |||
| 8 | — | |||
| 6 | — | |||
| 10 | — | |||
| 7 | — | |||
| 9 | — |
Figure 3. Case study, applying LncPriCNet to breast cancer.
(a) Venn diagram of the top 10 ranked lncRNAs identified by LncPriCNet and two other methods. The numbers in square brackets denote lncRNAs with literature support. (b) The subnetwork of the top three lncRNAs (CBR3-AS1, TINCR and ACTA2-AS1) and seed nodes. (c) The network of the top 10 ranked lncRNAs and enriched pathways of their co-expressed genes.
Figure 4. Global view of the predicted landscape of human disease lncRNAs.
(a) Hierarchical clustering of the LncPriCNet scores between 53 phenotypes and 10082 lncRNAs. The color of each cell represents the LncPriCNet score of a lncRNA (row) for a phenotype (column). Phenotype clusters were annotated with enriched disease categories (bottom), and lncRNA clusters were annotated with the most enriched pathways of their co-expressed genes (right). The red circled region indicates a module composed of lncRNAs involved in the cell cycle process. (b) Zoom-in plot of the red circled region, involving three types of cancer and 156 high-risk lncRNAs. (c) Enriched pathways for the co-expressed genes of the 156 high-risk lncRNAs.
Figure 5. Statistics and analysis of the lncRNA-disease landscape.
(a) The number of disease phenotypes involving each lncRNA, at three different rank cutoffs. (b) Stacked plot of the 20 lncRNAs predicted to be associated with the most phenotypes. (c) High-risk lncRNAs (top 50) overlap significantly with dysregulated lncRNAs in all six tumors (BRCA: breast carcinoma; COAD: colon adenocarcinoma; LUSC: lung squamous cell carcinoma; LUAD: lung adenocarcinoma; PRAD: prostate adenocarcinoma; KIRC: kidney renal clear cell carcinoma).
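The caption does not state which test was used to assess the overlap in panel (c). One common choice for this kind of question is a one-sided hypergeometric test, sketched below under that assumption. The universe size (10082 lncRNAs) and the top-50 cutoff come from the captions; the number of dysregulated lncRNAs (300) and the observed overlap (8) are hypothetical values chosen only to make the example run.

```python
from math import comb


def hypergeom_overlap_pvalue(universe, set_a, set_b, overlap):
    """One-sided hypergeometric p-value P(X >= overlap).

    universe: total number of items (e.g. all scored lncRNAs).
    set_a:    size of the first set (e.g. high-risk lncRNAs).
    set_b:    size of the second set (e.g. dysregulated lncRNAs).
    overlap:  observed number of items shared by both sets.
    """
    upper = min(set_a, set_b)
    total = comb(universe, set_b)
    # Sum the probabilities of seeing `overlap` or more shared items by chance.
    tail = sum(
        comb(set_a, k) * comb(universe - set_a, set_b - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# 10082 lncRNAs and top 50 are taken from the captions; 300 and 8 are hypothetical.
p_value = hypergeom_overlap_pvalue(universe=10082, set_a=50, set_b=300, overlap=8)
print(p_value)
```

With these illustrative numbers about 1.5 shared lncRNAs would be expected by chance, so an overlap of 8 yields a small p-value.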