Junyi Li1, Ying Liu1, Zhongqing Zhang1, Bo Liu2, Yadong Wang1,2.
Abstract
Accurate prediction of miRNA-disease associations is important for the diagnosis and prognosis of genetic diseases, and it is a nontrivial task. Many methods exist for predicting miRNA-disease associations, but biological data are numerous and complex, and they often take the form of networks. Accurately using the features of miRNA-related and disease-related biological networks to predict unknown associations has always been a challenge. Here, we propose PmDNE, a method based on network embedding and network similarity analysis, to predict miRNA-disease associations. In PmDNE, the structure of the bipartite network graph is improved, and a random walk generator is designed. The embedded vectors use 128 dimensions, and the accuracy of prediction is significantly improved. Compared with other network embedding methods, PmDNE is competitive with state-of-the-art methods. Our method can solve the problem of feature extraction, reduce the dimension of features, and improve the efficiency of miRNA-disease association prediction. This method can also be extended to other areas of biomedical network prediction.
Year: 2020 PMID: 33354569 PMCID: PMC7735824 DOI: 10.1155/2020/6248686
Source DB: PubMed Journal: Biomed Res Int Impact factor: 3.411
Number of edges between miRNA and disease, miRNA and miRNA, and disease and disease.
| | miRNA | Disease |
|---|---|---|
| miRNA | 644918 | 18733 |
| Disease | 18733 | 414003 |
Number of miRNA nodes and disease nodes.
| Node type | Number of nodes |
|---|---|
| miRNA | 1208 |
| Disease | 894 |
Pseudocode 1: Pseudocode of node sequence.
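The pseudocode itself is not reproduced in this record. As a rough illustration of how node sequences can be produced for network embedding, the following is a minimal sketch of a plain uniform random walk in the style of DeepWalk. It is not the PmDNE walk generator: the function names, the toy edge list, and the parameters `walks_per_node` and `walk_length` are all assumptions made for this example.

```python
import random


def build_adjacency(edges):
    """Build an undirected adjacency list from (node, node) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    return adj


def random_walk(adj, start, walk_length, rng):
    """Return one uniform random walk of at most walk_length nodes."""
    walk = [start]
    while len(walk) < walk_length:
        neighbors = adj.get(walk[-1], [])
        if not neighbors:  # dead end: stop the walk early
            break
        walk.append(rng.choice(neighbors))
    return walk


def generate_walks(adj, walks_per_node, walk_length, seed=0):
    """Start walks_per_node walks from every node, in shuffled order."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    walks = []
    for _ in range(walks_per_node):
        rng.shuffle(nodes)
        for node in nodes:
            walks.append(random_walk(adj, node, walk_length, rng))
    return walks


# Hypothetical toy network mixing miRNA-disease, miRNA-miRNA,
# and disease-disease edges (names are placeholders, not real data).
edges = [
    ("miR-a", "disease-x"),
    ("miR-a", "miR-b"),
    ("miR-b", "disease-y"),
    ("disease-x", "disease-y"),
]
adj = build_adjacency(edges)
walks = generate_walks(adj, walks_per_node=2, walk_length=5)
```

Node sequences of this kind are typically passed to a skip-gram model to learn fixed-length vectors, such as the 128 dimensions mentioned in the abstract.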
Concept of TP, FN, FP, and TN.
| Actual values | Predicted positive | Predicted negative |
|---|---|---|
| Positive | TP | FN |
| Negative | FP | TN |
PR curve: the abscissa is recall and the ordinate is precision; precision = TP/(TP + FP); recall = TP/(total positive samples) = TP/(TP + FN). ROC curve: the abscissa is FPR and the ordinate is TPR; TPR = TP/(TP + FN); FPR = FP/(TN + FP).
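The definitions above can be checked with a small worked example. The sketch below counts TP, FN, FP, and TN at a single score threshold and applies the formulas directly. The labels, scores, and threshold are made-up toy values, and the function names are assumptions for illustration only.

```python
def confusion_counts(labels, scores, threshold):
    """Count TP, FN, FP, TN when scores >= threshold are predicted positive."""
    tp = fn = fp = tn = 0
    for label, score in zip(labels, scores):
        predicted_positive = score >= threshold
        if label == 1 and predicted_positive:
            tp += 1
        elif label == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    return tp, fn, fp, tn


def classification_rates(tp, fn, fp, tn):
    """Apply the precision, recall (TPR), and FPR formulas from the text."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (tn + fp) if (tn + fp) else 0.0
    return {"precision": precision, "recall": recall, "tpr": recall, "fpr": fpr}


# Toy data (hypothetical): 1 = known association, 0 = no association.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.1]

counts = confusion_counts(labels, scores, threshold=0.5)
rates = classification_rates(*counts)
```

Each threshold yields one (recall, precision) point on the PR curve and one (FPR, TPR) point on the ROC curve; sweeping the threshold over all scores traces the full curves.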
Influence of different networks on results.
| Network | ROC_AUC | PR_AUC | PREC | ACC | F1 | Recall |
|---|---|---|---|---|---|---|
| 1 | 0.8952 ± 0.003 | 0.9002 ± 0.002 | 0.6744 ± 0.01 | 0.8153 ± 0.02 | 0.8104 ± 0.004 | 0.7863 ± 0.004 |
| 2 | 0.8833 ± 0.002 | 0.8916 ± 0.0015 | 0.6480 ± 0.015 | 0.8034 ± 0.02 | 0.7986 ± 0.004 | 0.7861 ± 0.005 |
| 3 | 0.8906 ± 0.0015 | 0.8966 ± 0.002 | 0.663 ± 0.001 | 0.8103 ± 0.015 | 0.8054 ± 0.003 | 0.7857 ± 0.004 |
| 4 | 0.8914 ± 0.001 | 0.8968 ± 0.0015 | 0.6634 ± 0.003 | 0.8115 ± 0.02 | 0.8056 ± 0.002 | 0.7813 ± 0.004 |
Figure 1: Influence of parameters on prediction performance. The parameter scores are the values obtained for ROC or PR. The scores for alpha, beta, and gamma fluctuate greatly. These three parameters play an important role in regulating the size of the first similarity and the second similarity.
Comparison of network embedding methods.
| Method | AUC_ROC | AUC_PR |
|---|---|---|
| PmDNE | 0.8954 ± 0.003 | 0.9002 ± 0.002 |
| DeepWalk | 0.8689 ± 0.002 | 0.8780 ± 0.002 |
| LINE | 0.8302 ± 0.003 | 0.8305 ± 0.002 |
| Node2Vec | 0.8807 ± 0.004 | 0.8782 ± 0.004 |
| GraRep | 0.8766 ± 0.002 | 0.8760 ± 0.003 |
| GF | 0.8881 ± 0.004 | 0.8856 ± 0.003 |
| Lap | 0.7706 ± 0.004 | 0.7062 ± 0.002 |
| LLE | 0.8670 ± 0.004 | 0.8673 ± 0.004 |
Figure 2: Experimental results for the PR and ROC curves of each model: (a) ROC curves for all models; (b) PR curves for all models.
Comparison of the effect of different classifiers.
| Classifier | ROC_AUC | PR_AUC | PREC | ACC | F1 | Recall |
|---|---|---|---|---|---|---|
| RF | 0.8954 ± 0.003 | 0.9002 ± 0.002 | 0.6744 ± 0.01 | 0.8153 ± 0.02 | 0.8104 ± 0.004 | 0.7863 ± 0.004 |
| KNN | 0.8746 ± 0.002 | 0.8560 ± 0.0015 | 0.7933 ± 0.015 | 0.8075 ± 0.02 | 0.8014 ± 0.004 | 0.7758 ± 0.005 |
| GBC | 0.8827 ± 0.0015 | 0.8955 ± 0.002 | 0.7747 ± 0.001 | 0.8045 ± 0.015 | 0.7937 ± 0.003 | 0.7532 ± 0.004 |
| SVM | 0.7693 ± 0.001 | 0.8194 ± 0.0015 | 0.7367 ± 0.003 | 0.7042 ± 0.02 | 0.5989 ± 0.002 | 0.4495 ± 0.004 |
| LR | 0.8070 ± 0.02 | 0.8412 ± 0.005 | 0.7541 ± 0.004 | 0.7390 ± 0.015 | 0.7153 ± 0.004 | 0.65618 ± 0.005 |
| ADBC | 0.8330 ± 0.002 | 0.8579 ± 0.002 | 0.7137 ± 0.0015 | 0.7570 ± 0.002 | 0.7348 ± 0.004 | 0.6757 ± 0.005 |