Bo-Ya Ji, Zhu-Hong You, Li Cheng, Ji-Ren Zhou, Daniyal Alghazzawi, Li-Ping Li.
Abstract
In recent years, accumulating evidence has shown that microRNAs (miRNAs) play an important role in the study and treatment of diseases, so detecting associations between miRNAs and diseases has drawn increasing attention. However, traditional experimental methods are costly and time-consuming, whereas a computational method can predict potential miRNA-disease associations more systematically and efficiently. In this work, we propose a novel network embedding-based heterogeneous information integration method to predict miRNA-disease associations. More specifically, a heterogeneous information network is constructed by combining the known associations among lncRNAs, drugs, proteins, diseases, and miRNAs. The network embedding method Learning Graph Representations with Global Structural Information (GraRep) is then employed to learn embeddings of the nodes in this network. The embedding representations of miRNAs and diseases are integrated with their attribute information (e.g., miRNA sequence information and disease semantic similarity) to represent miRNA-disease association pairs. Finally, a Random Forest (RF) classifier is used to predict potential miRNA-disease associations. Under 5-fold cross validation, our method obtained a prediction accuracy of 85.11% and a sensitivity of 80.41%, with an AUC of 91.25%. In addition, in case studies of three major human diseases, 45 (colon neoplasms), 42 (breast neoplasms), and 44 (esophageal neoplasms) of the top 50 predicted miRNAs were verified by other miRNA-disease association databases. In conclusion, the experimental results suggest that our method can be a powerful and useful tool for predicting potential miRNA-disease associations.
Year: 2020 PMID: 32313121 PMCID: PMC7170854 DOI: 10.1038/s41598-020-63735-9
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. The heterogeneous information network.
Figure 2. Flowchart of our method to predict potential miRNA-disease associations.
The associations in the heterogeneous information network.
| Association type | Database | Number of associations |
|---|---|---|
| miRNA-lncRNA | lncRNASNP2 | 8374 |
| miRNA-protein | miRTarBase (2018 update) | 4944 |
| lncRNA-disease | LncRNADisease; lncRNASNP2 | 1264 |
| drug-disease | CTD (2019 update) | 18416 |
| lncRNA-protein | LncRNA2Target v2.0 | 690 |
| drug-protein | DrugBank v5.0 | 11107 |
| protein-protein | STRING (2017) | 19237 |
| protein-disease | DisGeNET | 25087 |
| Total | N/A | 105546 |
The nodes in the heterogeneous information network.
| Node | Amount |
|---|---|
| Protein | 1649 |
| Disease | 2062 |
| LncRNA | 769 |
| Drug | 1025 |
| MiRNA | 1023 |
| Total | 6528 |
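As a rough illustration of how typed association lists and typed nodes such as those tabulated above can be combined, the sketch below merges edge lists into one undirected graph with a shared node index. The node names and the three example edges are hypothetical placeholders, not records from the listed databases, and `build_network` is an illustrative helper rather than the authors' code.

```python
from collections import defaultdict

def build_network(associations):
    """Merge typed edge lists into one undirected graph.

    associations maps an association type such as "miRNA-disease" to a list
    of (source, target) name pairs. Nodes are keyed by (node_type, name) so
    that identical names in different node types stay distinct.
    """
    index = {}                    # (node_type, name) -> integer id
    neighbors = defaultdict(set)  # integer id -> set of neighbor ids
    for assoc_type, pairs in associations.items():
        src_type, dst_type = assoc_type.split("-")
        for src, dst in pairs:
            u = index.setdefault((src_type, src), len(index))
            v = index.setdefault((dst_type, dst), len(index))
            neighbors[u].add(v)
            neighbors[v].add(u)
    return index, neighbors

# Hypothetical toy input: three nodes joined by three typed associations.
example = {
    "miRNA-disease": [("miR-a", "disease-x")],
    "miRNA-protein": [("miR-a", "protein-p")],
    "protein-disease": [("protein-p", "disease-x")],
}
index, neighbors = build_network(example)
n_edges = sum(len(nbrs) for nbrs in neighbors.values()) // 2
print(len(index), "nodes,", n_edges, "edges")
```

A single integer index over all node types is what lets one embedding method treat miRNAs, diseases, lncRNAs, drugs, and proteins uniformly.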
Figure 3. Construction of gastrointestinal neoplasms’ DAG.
The GraRep Overall Algorithm.
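The algorithm listing itself is not reproduced in this record. As a hedged sketch of the published GraRep procedure (k-step transition matrices, a shifted and clipped log-probability matrix, and a truncated SVD per step), the following NumPy code may help. The function name `grarep_embed`, its default parameters, and the toy path graph are assumptions for illustration, not the settings used in this work.

```python
import numpy as np

def grarep_embed(adj, max_step=2, dim=2, lam=1.0):
    """Sketch of GraRep: one dim-sized embedding per step, concatenated."""
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    deg = adj.sum(axis=1, keepdims=True)
    deg[deg == 0] = 1.0                 # isolated nodes keep an all-zero row
    trans = adj / deg                   # 1-step transition probabilities
    beta = lam / n                      # shift from negative sampling
    power = np.eye(n)
    blocks = []
    for _ in range(max_step):
        power = power @ trans           # k-step transition probabilities
        col_sum = power.sum(axis=0, keepdims=True)
        col_sum[col_sum == 0] = 1.0
        with np.errstate(divide="ignore"):
            x = np.log(power / col_sum) - np.log(beta)
        x = np.maximum(x, 0.0)          # clip negatives (and -inf) to zero
        u, s, _ = np.linalg.svd(x)
        blocks.append(u[:, :dim] * np.sqrt(s[:dim]))
    return np.hstack(blocks)

# Toy 4-node path graph 0-1-2-3 (hypothetical input).
path = [[0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [0, 0, 1, 0]]
emb = grarep_embed(path, max_step=2, dim=2)
print(emb.shape)
```

Each step contributes its own block of columns, so the final vector keeps local (small k) and more global (large k) structure separate rather than mixing them.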
The performance of our method under 5-fold cross validation.
| Fold | Acc. (%) | Prec. (%) | Sen. (%) | MCC (%) | Spec. (%) | AUC (%) |
|---|---|---|---|---|---|---|
| 0 | 85.29 | 89.00 | 80.52 | 70.89 | 90.05 | 91.32 |
| 1 | 85.23 | 89.17 | 80.19 | 70.81 | 90.26 | 91.24 |
| 2 | 84.57 | 88.51 | 79.46 | 69.51 | 89.68 | 90.66 |
| 3 | 85.54 | 88.68 | 81.50 | 71.32 | 89.59 | 91.51 |
| 4 | 84.92 | 88.41 | 80.38 | 70.13 | 89.46 | 91.53 |
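The headline figures in the abstract are the averages over these five folds. A minimal check, using only the values in the table above, is sketched here.

```python
from statistics import mean

# Per-fold values copied from the table above (percentages).
folds = {
    "Acc":  [85.29, 85.23, 84.57, 85.54, 84.92],
    "Prec": [89.00, 89.17, 88.51, 88.68, 88.41],
    "Sen":  [80.52, 80.19, 79.46, 81.50, 80.38],
    "MCC":  [70.89, 70.81, 69.51, 71.32, 70.13],
    "Spec": [90.05, 90.26, 89.68, 89.59, 89.46],
    "AUC":  [91.32, 91.24, 90.66, 91.51, 91.53],
}

# Average each metric over the five folds, rounded to two decimals.
averages = {name: round(mean(values), 2) for name, values in folds.items()}
for name, value in averages.items():
    print(f"{name}: {value:.2f}")
```

The accuracy, sensitivity, and AUC averages come to 85.11, 80.41, and 91.25, matching the values quoted in the abstract.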
Figure 4. The ROC curves of our method in miRNA-disease association prediction under 5-fold cross validation.
Figure 5. The PR curves of our method in miRNA-disease association prediction under 5-fold cross validation.
Comparison of our method with different feature combinations.
| Feature | Acc. (%) | Prec. (%) | Sen. (%) | MCC (%) | Spec. (%) | AUC (%) |
|---|---|---|---|---|---|---|
| Attribute | 79.77 ± 0.42 | 78.77 ± 0.61 | 81.52 ± 0.47 | 59.59 ± 0.83 | 78.03 ± 0.82 | 86.60 ± 0.37 |
| Behavior | 85.00 ± 0.28 | 88.42 ± 0.29 | 80.54 ± 0.86 | 70.27 ± 0.50 | 89.45 ± 0.39 | 91.18 ± 0.32 |
Figure 6. Comparison of our method with different features under 5-fold cross validation.
Comparison of our method with different classifiers.
| Classifier | Acc. (%) | Prec. (%) | Sen. (%) | MCC (%) | Spec. (%) | AUC (%) |
|---|---|---|---|---|---|---|
| DecisionTree | 81.82 ± 0.23 | 83.59 ± 0.41 | 79.18 ± 0.11 | 63.72 ± 0.47 | 84.45 ± 0.47 | 81.82 ± 0.23 |
| KNN | 84.62 ± 0.47 | 84.23 ± 0.37 | 85.18 ± 0.86 | 69.24 ± 0.94 | 84.06 ± 0.42 | 89.90 ± 0.39 |
| Naive Bayes | 81.79 ± 0.67 | 81.02 ± 0.85 | 83.04 ± 0.51 | 63.61 ± 1.34 | 80.54 ± 1.01 | 87.81 ± 0.55 |
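The reported metrics are tightly linked. Assuming the usual confusion-matrix definitions and, as an assumption not stated in this record, an equal number of positive and negative test samples, accuracy, precision, and MCC can all be derived from sensitivity and specificity alone. The sketch below applies this to the DecisionTree row; `derive_metrics` is an illustrative helper.

```python
from math import sqrt

def derive_metrics(sen, spec):
    """Derive Acc., Prec., and MCC (in %) from Sen. and Spec. (in %).

    Assumes a balanced test set, so the counts can be normalized to
    one positive and one negative sample.
    """
    tp, fn = sen / 100, 1 - sen / 100
    tn, fp = spec / 100, 1 - spec / 100
    acc = (tp + tn) / (tp + tn + fp + fn)
    prec = tp / (tp + fp)
    mcc = (tp * tn - fp * fn) / sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {"Acc": 100 * acc, "Prec": 100 * prec, "MCC": 100 * mcc}

# Sen. and Spec. taken from the DecisionTree row above.
decision_tree = derive_metrics(sen=79.18, spec=84.45)
for name, value in decision_tree.items():
    print(f"{name}: {value:.2f}")
```

The derived values agree with the tabulated 81.82, 83.59, and 63.72 to within rounding. Under the same assumption, a classifier that outputs only hard labels has a single-point ROC curve whose area is (Sen. + Spec.)/2, which may explain why the DecisionTree row reports identical accuracy and AUC values.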
Figure 7. Comparison of the Random Forest, DecisionTree, KNN, and Naive Bayes classifiers under 5-fold cross validation.
The top 50 predicted miRNAs associated with colon neoplasms. The first pair of columns lists the miRNAs ranked 1–25; the second pair lists those ranked 26–50.
| miRNA | Evidence | miRNA | Evidence |
|---|---|---|---|
| hsa-mir-186-5p | dbDemc | hsa-mir-129-5p | dbDemc |
| hsa-mir-16-5p | dbDemc | hsa-mir-503-5p | dbDemc |
| hsa-mir-485-5p | dbDemc | hsa-mir-136-5p | dbDemc |
| hsa-mir-497-5p | dbDemc | hsa-mir-324-5p | dbDemc |
| hsa-mir-206 | dbDemc;miR2Disease | hsa-mir-10a-5p | dbDemc |
| hsa-mir-33b-5p | dbDemc | hsa-mir-199a-5p | dbDemc |
| hsa-mir-19b-3p | dbDemc | hsa-mir-199b-5p | dbDemc |
| hsa-mir-198 | dbDemc;miR2Disease | hsa-mir-451a | dbDemc |
| hsa-mir-361-5p | dbDemc | hsa-mir-29c-5p | dbDemc |
| hsa-mir-185-5p | dbDemc | hsa-mir-181a-2-3p | dbDemc |
| hsa-mir-154-5p | dbDemc | hsa-mir-184 | dbDemc;miR2Disease |
| hsa-mir-26b-5p | dbDemc | hsa-mir-99b-5p | dbDemc |
| hsa-mir-638 | dbDemc;miR2Disease | hsa-mir-144-5p | dbDemc |
| hsa-mir-34c-5p | dbDemc | hsa-mir-128-1-5p | dbDemc |
| hsa-mir-122-5p | dbDemc | hsa-mir-92a-2-5p | dbDemc |
| hsa-mir-449b-5p | dbDemc | hsa-mir-337-5p | dbDemc |
| hsa-mir-590-5p | dbDemc | hsa-mir-423-5p | dbDemc |
| hsa-mir-139-5p | dbDemc | hsa-mir-663a | dbDemc |
| hsa-mir-340-5p | dbDemc | hsa-mir-99a-5p | Unconfirmed |
| hsa-mir-542-5p | dbDemc;miR2Disease | hsa-mir-378a-5p | dbDemc |
| hsa-mir-211-5p | dbDemc | hsa-mir-575 | dbDemc |
| hsa-mir-153-3p | Unconfirmed | hsa-mir-373-5p | Unconfirmed |
| hsa-mir-149-5p | dbDemc | hsa-mir-214-5p | dbDemc |
| hsa-mir-499a-5p | Unconfirmed | hsa-mir-217-5p | Unconfirmed |
| hsa-mir-183-5p | dbDemc | hsa-mir-452-5p | dbDemc |
The top 50 predicted miRNAs associated with breast neoplasms. The first pair of columns lists the miRNAs ranked 1–25; the second pair lists those ranked 26–50.
| miRNA | Evidence | miRNA | Evidence |
|---|---|---|---|
| hsa-mir-186-5p | dbDemc | hsa-mir-508-5p | dbDemc |
| hsa-mir-539-5p | dbDemc | hsa-mir-525-5p | Unconfirmed |
| hsa-mir-216a-5p | dbDemc | hsa-mir-431-5p | dbDemc |
| hsa-mir-330-5p | dbDemc | hsa-mir-532-5p | dbDemc |
| hsa-mir-154-5p | dbDemc | hsa-mir-483-5p | dbDemc |
| hsa-mir-543 | dbDemc | hsa-mir-519a-5p | Unconfirmed |
| hsa-mir-181d-5p | dbDemc | hsa-mir-581 | dbDemc |
| hsa-mir-4262 | Unconfirmed | hsa-mir-744-5p | dbDemc |
| hsa-mir-449b-5p | dbDemc | hsa-mir-362-5p | dbDemc |
| hsa-mir-384 | dbDemc | hsa-mir-432-5p | dbDemc |
| hsa-mir-211-5p | dbDemc | hsa-mir-511-5p | dbDemc |
| hsa-mir-4458 | dbDemc | hsa-mir-513b-5p | dbDemc |
| hsa-mir-504-5p | dbDemc | hsa-mir-513c-5p | dbDemc |
| hsa-mir-28-5p | dbDemc | hsa-mir-583 | dbDemc |
| hsa-mir-1271-5p | dbDemc | hsa-mir-628-5p | dbDemc |
| hsa-mir-136-5p | dbDemc | hsa-mir-939-5p | dbDemc |
| hsa-mir-300 | dbDemc | hsa-mir-885-5p | Unconfirmed |
| hsa-mir-99b-5p | dbDemc | hsa-mir-1973 | Unconfirmed |
| hsa-mir-337-5p | dbDemc | hsa-mir-369-5p | dbDemc |
| hsa-mir-518b | Unconfirmed | hsa-mir-612 | Unconfirmed |
| hsa-mir-637 | dbDemc;miR2Disease | hsa-mir-665 | dbDemc |
| hsa-mir-217-5p | Unconfirmed | hsa-mir-943 | dbDemc |
| hsa-mir-517a-3p | dbDemc | hsa-mir-490-5p | dbDemc |
| hsa-mir-646 | dbDemc | hsa-mir-188-5p | dbDemc |
| hsa-mir-671-5p | dbDemc | hsa-mir-942-5p | dbDemc |
The top 50 predicted miRNAs associated with esophageal neoplasms. The first pair of columns lists the miRNAs ranked 1–25; the second pair lists those ranked 26–50.
| miRNA | Evidence | miRNA | Evidence |
|---|---|---|---|
| hsa-mir-182-5p | dbDemc | hsa-mir-181d-5p | dbDemc |
| hsa-mir-186-5p | dbDemc | hsa-mir-449a | dbDemc |
| hsa-mir-30e-5p | dbDemc | hsa-mir-140-5p | dbDemc |
| hsa-mir-107 | dbDemc | hsa-mir-590-5p | dbDemc |
| hsa-mir-16-5p | dbDemc | hsa-mir-29b-3p | dbDemc |
| hsa-mir-195-5p | dbDemc | hsa-mir-134-5p | dbDemc |
| hsa-mir-103a-3p | dbDemc | hsa-mir-24-3p | dbDemc |
| hsa-mir-15b-5p | dbDemc | hsa-let-7e-5p | dbDemc |
| hsa-mir-206 | dbDemc | hsa-mir-125a-5p | dbDemc |
| hsa-mir-30a-5p | dbDemc | hsa-mir-153-3p | dbDemc |
| hsa-mir-18a-5p | dbDemc | hsa-mir-149-5p | dbDemc |
| hsa-mir-135a-5p | dbDemc | hsa-mir-221-5p | Unconfirmed |
| hsa-mir-33a-5p | dbDemc | hsa-mir-152-5p | Unconfirmed |
| hsa-mir-17-5p | dbDemc | hsa-mir-204-5p | dbDemc |
| hsa-mir-19b-3p | dbDemc | hsa-let-7f-5p | dbDemc |
| hsa-mir-20b-5p | dbDemc | hsa-let-7d-5p | dbDemc |
| hsa-mir-106a-5p | dbDemc | hsa-mir-504-5p | dbDemc |
| hsa-mir-7-5p | dbDemc | hsa-mir-129-5p | dbDemc |
| hsa-mir-26a-5p | dbDemc | hsa-mir-144-5p | Unconfirmed |
| hsa-mir-9-5p | dbDemc | hsa-mir-324-5p | dbDemc |
| hsa-mir-181b-5p | dbDemc | hsa-mir-191-5p | dbDemc |
| hsa-mir-181a-5p | dbDemc | hsa-mir-199a-5p | dbDemc |
| hsa-mir-1271-5p | Unconfirmed | hsa-mir-29a-5p | Unconfirmed |
| hsa-mir-122-5p | dbDemc | hsa-mir-125b-2-3p | dbDemc |
| hsa-mir-181c-5p | dbDemc | hsa-mir-127-5p | Unconfirmed |
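The confirmation counts quoted in the abstract can be recovered by tallying the Evidence columns of the case-study tables. The sketch below shows one way to do this for rows in the layout used above; `count_confirmed` is an illustrative helper, and only three rows of the esophageal neoplasms table are included to keep the example short.

```python
def count_confirmed(rows):
    """Count predictions whose Evidence cell is not 'Unconfirmed'.

    rows: iterable of table rows laid out as
    | miRNA | Evidence | miRNA | Evidence |
    """
    confirmed = total = 0
    for row in rows:
        cells = [c.strip() for c in row.strip().strip("|").split("|")]
        for evidence in cells[1::2]:  # every second cell holds the evidence
            total += 1
            confirmed += evidence != "Unconfirmed"
    return confirmed, total

# Three rows excerpted from the esophageal neoplasms table above.
excerpt = [
    "| hsa-mir-181a-5p | dbDemc | hsa-mir-199a-5p | dbDemc |",
    "| hsa-mir-122-5p | dbDemc | hsa-mir-125b-2-3p | dbDemc |",
    "| hsa-mir-181c-5p | dbDemc | hsa-mir-127-5p | Unconfirmed |",
]
confirmed, total = count_confirmed(excerpt)
print(confirmed, "of", total, "confirmed")
```

Applied to all 25 rows of each table, the same tally gives 45, 42, and 44 confirmed miRNAs for colon, breast, and esophageal neoplasms, consistent with the abstract.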