Zihao Li, Xing Huang, Yakun Shi, Xiaoyong Zou, Zhanchao Li, Zong Dai.
Abstract
Accumulating research shows that microRNAs (miRNAs) are involved in many critical biological processes, including cell proliferation, differentiation and apoptosis. Identifying the associations between miRNAs and human diseases is therefore of great significance, because these associations are the basis for finding biomarkers for diagnosis and targets for treatment. To overcome the time-consuming and labor-intensive nature of traditional experiments, a computational method was developed to identify potential associations between miRNAs and diseases based on a graph attention network (GAT) with different meta-path modes and a support vector machine (SVM). Firstly, we constructed a multi-module heterogeneous network based on meta-paths and learned the latent features of the different modules with the GAT. Secondly, we computed a weighted average of the latent features to obtain a final node representation. Finally, we characterized miRNA-disease association pairs with the node representations and trained an SVM to recognize potential associations. Under five-fold cross-validation on benchmark datasets, the proposed method achieved an area under the precision-recall curve (AUPR) of 0.9379 and an area under the receiver-operating characteristic curve (AUC) of 0.9472. The results demonstrate that our method has strong practical performance and can provide a reference for the discovery of new biomarkers and therapeutic targets.
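The second and third steps of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the softmax weighting, the concatenation of the miRNA and disease vectors, and all function names and toy values are assumptions made here for illustration, since the abstract only states that a weighted average is taken and that pairs are characterized by the node representations.

```python
import math

def softmax(scores):
    """Turn raw module scores into positive weights that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_average(module_features, module_scores):
    """Fuse one node's per-module latent vectors into a single vector.

    module_features: one feature vector per module (all the same length).
    module_scores:   one raw score per module (hypothetical; assumed to be
                     learned by the module attention layer).
    """
    weights = softmax(module_scores)
    dim = len(module_features[0])
    return [
        sum(w * feats[i] for w, feats in zip(weights, module_features))
        for i in range(dim)
    ]

def pair_representation(mirna_vec, disease_vec):
    """Assumed pairing scheme: concatenate the two node representations."""
    return list(mirna_vec) + list(disease_vec)

# Toy example with two modules and two-dimensional latent features.
mirna_vec = weighted_average([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
disease_vec = weighted_average([[2.0, 4.0], [4.0, 2.0]], [0.0, 0.0])
pair = pair_representation(mirna_vec, disease_vec)
```

With equal module scores the weights are uniform, so each fused vector is the plain mean of its module vectors. The resulting `pair` vector is what a classifier such as the SVM would receive as input under these assumptions.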
Keywords: MiRNA–disease association; graph neural network; meta-path
Year: 2022 PMID: 35889314 PMCID: PMC9321348 DOI: 10.3390/molecules27144443
Source DB: PubMed Journal: Molecules ISSN: 1420-3049 Impact factor: 4.927
Figure 1. The main performance under different dimensions.
Figure 2. The learning curve of our method. (a) Training and validation accuracy graph; (b) Training and validation error graph.
Figure 3. ROC curves for different classifiers.
Figure 4. PR curves for different classifiers.
Figure 5. The performance comparison of different classifiers.
The performance comparison of different classifiers.
| Classifier | Acc | Auc | Aupr | Sens | Spec | Prec | F1 | Mcc |
|---|---|---|---|---|---|---|---|---|
| MLP | 0.8661 | 0.9460 | 0.9420 | 0.8924 | 0.8398 | 0.8504 | 0.8693 | 0.7364 |
| CNN | 0.8689 | 0.9458 | 0.9411 | 0.8984 | 0.8393 | 0.8490 | 0.8725 | 0.7399 |
| RF | 0.8639 | 0.9398 | 0.9327 | 0.8776 | 0.8502 | 0.8542 | 0.8657 | 0.7281 |
| SVM | 0.8752 | 0.9470 | 0.9374 | 0.9156 | 0.8349 | 0.8473 | 0.8801 | 0.7531 |
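For readers unfamiliar with the column abbreviations, the sketch below shows the standard definitions of the threshold-based metrics in the table (Acc, Sens, Spec, Prec, F1, Mcc) computed from a confusion matrix. The confusion counts are made-up illustration values, not data from the paper, and the function names are hypothetical.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    sens = tp / (tp + fn)                      # sensitivity (recall)
    spec = tn / (tn + fp)                      # specificity
    prec = tp / (tp + fp)                      # precision
    acc = (tp + tn) / (tp + fp + tn + fn)      # accuracy
    f1 = 2 * prec * sens / (prec + sens)       # harmonic mean of Prec and Sens
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )                                          # Matthews correlation coefficient
    return {"acc": acc, "sens": sens, "spec": spec,
            "prec": prec, "f1": f1, "mcc": mcc}

def f1_from(prec, sens):
    """F1 computed directly from precision and sensitivity."""
    return 2 * prec * sens / (prec + sens)

# Illustrative counts only (not from the paper).
toy = binary_metrics(tp=90, fp=20, tn=80, fn=10)

# Consistency check on the SVM row: Prec = 0.8473, Sens = 0.9156.
svm_f1 = f1_from(0.8473, 0.9156)
```

Applied to the SVM row, `f1_from` gives a value that rounds to the reported F1 of 0.8801, so the Prec, Sens and F1 columns are mutually consistent.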
Figure 6. The ROC curves of our method.
Figure 7. The PR curves of our method.
Figure 8. Performance comparison of different methods in 5-CV.
Figure 9. Performance comparison of different ratios of positive and negative samples.
Performance comparison of different ratios of positive and negative samples.
| Ratio | Acc | Auc | Aupr | Sens | Spec | Prec | F1 | Mcc |
|---|---|---|---|---|---|---|---|---|
| 1:1 | 0.8753 | 0.9470 | 0.9375 | 0.9157 | 0.8349 | 0.8473 | 0.8801 | 0.7531 |
| 1:2 | 0.8790 | 0.9481 | 0.8989 | 0.8168 | 0.9101 | 0.8199 | 0.8182 | 0.7277 |
| 1:3 | 0.8901 | 0.9460 | 0.8634 | 0.7210 | 0.9464 | 0.8177 | 0.7662 | 0.6971 |
| 1:4 | 0.9002 | 0.9422 | 0.8321 | 0.6461 | 0.9637 | 0.8167 | 0.7213 | 0.6684 |
| 1:5 | 0.9098 | 0.9325 | 0.7984 | 0.5898 | 0.9738 | 0.8186 | 0.6854 | 0.6461 |
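The table shows accuracy rising while sensitivity falls as negatives are added. A short arithmetic check makes the reason explicit: accuracy is the class-weighted average of sensitivity and specificity, so with more negatives it is dominated by specificity. The sketch assumes the evaluated samples follow the listed positive-to-negative ratio; the function name is hypothetical.

```python
def accuracy_from_rates(sens, spec, neg_per_pos):
    """Accuracy when there are `neg_per_pos` negatives for every positive."""
    return (sens + neg_per_pos * spec) / (1 + neg_per_pos)

# (negatives per positive, Sens, Spec, reported Acc), copied from the table.
rows = [
    (1, 0.9157, 0.8349, 0.8753),
    (2, 0.8168, 0.9101, 0.8790),
    (3, 0.7210, 0.9464, 0.8901),
    (4, 0.6461, 0.9637, 0.9002),
    (5, 0.5898, 0.9738, 0.9098),
]

recomputed = [accuracy_from_rates(sens, spec, r) for r, sens, spec, _ in rows]

# Largest gap between recomputed and reported accuracy across all ratios.
max_gap = max(abs(acc - row[3]) for acc, row in zip(recomputed, rows))
```

The recomputed values agree with the reported Acc column to within rounding, which suggests that the gain in accuracy at 1:5 reflects the larger share of negatives rather than better detection of true associations. F1 and Mcc, which fall across the same rows, are the more informative columns here.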
Figure 10. Performance comparison of different thresholds of negative samples.
Top-30 Predicted Associations of Liver Neoplasms.
| Rank | Score | miRNA | Evidence |
|---|---|---|---|
| 1 | 0.9557 | hsa-miR-21 | HMDD3.0, dbDEMC, PMID: 31037150 |
| 2 | 0.9540 | hsa-miR-155 | dbDEMC, PMID: 29565484 |
| 3 | 0.9477 | hsa-miR-146a | HMDD3.0, dbDEMC, PMID: 29133238 |
| 4 | 0.9345 | hsa-miR-29a | HMDD3.0, dbDEMC, PMID: 33891266 |
| 5 | 0.9326 | hsa-miR-16 | HMDD3.0, dbDEMC, PMID: 30657555 |
| 6 | 0.9323 | hsa-miR-29b | dbDEMC, PMID: 34184070 |
| 7 | 0.9309 | hsa-miR-125b | HMDD3.0, dbDEMC, PMID: 32609900 |
| 8 | 0.9301 | hsa-miR-15a | dbDEMC, PMID: 31099097 |
| 9 | 0.9266 | hsa-miR-1 | dbDEMC, PMID: 31846694 |
| 10 | 0.9242 | hsa-miR-221 | HMDD3.0, dbDEMC, PMID: 31069760 |
| 11 | 0.9220 | hsa-miR-34a | HMDD3.0, dbDEMC, PMID: 32778238 |
| 12 | 0.9203 | hsa-miR-17 | dbDEMC, PMID: 32206115 |
| 13 | 0.9195 | hsa-miR-20a | dbDEMC, PMID: 32206115 |
| 14 | 0.9184 | hsa-miR-199a | HMDD3.0, dbDEMC, PMID: 31144384 |
| 15 | 0.9183 | hsa-miR-133a | dbDEMC, PMID: 30086463 |
| 16 | 0.9150 | hsa-miR-19b | dbDEMC, PMID: 29889802 |
| 17 | 0.9147 | hsa-miR-29c | HMDD3.0, dbDEMC, PMID: 30718452 |
| 18 | 0.9141 | hsa-miR-223 | HMDD3.0, dbDEMC, PMID: 32233593 |
| 19 | 0.9139 | hsa-miR-222 | HMDD3.0, dbDEMC, PMID: 34273068 |
| 20 | 0.9101 | hsa-miR-150 | dbDEMC, PMID: 25549355 |
| 21 | 0.9043 | hsa-miR-92a | dbDEMC, PMID: 32587378 |
| 22 | 0.9040 | hsa-miR-18a | dbDEMC, PMID: 34221105 |
| 23 | 0.9015 | hsa-miR-145 | dbDEMC, PMID: 29658584 |
| 24 | 0.9011 | hsa-miR-106b | dbDEMC, PMID: 29975452 |
| 25 | 0.9009 | hsa-miR-181a | dbDEMC, PMID: 25058462 |
| 26 | 0.9006 | hsa-miR-19a | dbDEMC, PMID: 27012708 |
| 27 | 0.8999 | hsa-miR-210 | HMDD3.0, dbDEMC, PMID: 27666683 |
| 28 | 0.8978 | hsa-miR-31 | HMDD3.0, dbDEMC, PMID: 25797269 |
| 29 | 0.8957 | hsa-miR-122 | HMDD3.0, dbDEMC, PMID: 25537773 |
| 30 | 0.8941 | hsa-miR-142 | HMDD3.0, dbDEMC, PMID: 30092578 |
Top-30 Predicted Associations of Lung Neoplasms.
| Rank | Score | miRNA | Evidence |
|---|---|---|---|
| 1 | 0.9690 | hsa-miR-21 | HMDD3.0, dbDEMC, PMID: 30736829 |
| 2 | 0.9675 | hsa-miR-155 | HMDD3.0, dbDEMC, PMID: 32447486 |
| 3 | 0.9673 | hsa-miR-122 | HMDD3.0, dbDEMC, PMID: 26604787 |
| 4 | 0.9672 | hsa-miR-15a | HMDD3.0, dbDEMC, PMID: 33059020 |
| 5 | 0.9671 | hsa-miR-29a | HMDD3.0, dbDEMC, PMID: 33250420 |
| 6 | 0.9670 | hsa-miR-16 | HMDD3.0, dbDEMC, PMID: 31379227 |
| 7 | 0.9660 | hsa-miR-29b | HMDD3.0, dbDEMC, PMID: 31813135 |
| 8 | 0.9647 | hsa-miR-133a | HMDD3.0, dbDEMC, PMID: 33074595 |
| 9 | 0.9630 | hsa-miR-1 | HMDD3.0, dbDEMC, PMID: 34139980 |
| 10 | 0.9626 | hsa-miR-15b | dbDEMC, PMID: 32220063 |
| 11 | 0.9617 | hsa-miR-199a | HMDD3.0, dbDEMC, PMID: 28363780 |
| 12 | 0.9608 | hsa-miR-146a | HMDD3.0, dbDEMC, PMID: 29127520 |
| 13 | 0.9602 | hsa-miR-29c | HMDD3.0, dbDEMC, PMID: 29512752 |
| 14 | 0.9598 | hsa-miR-26a | HMDD3.0, dbDEMC, PMID: 33407724 |
| 15 | 0.9588 | hsa-miR-126 | HMDD3.0, dbDEMC, PMID: 34107168 |
| 16 | 0.9586 | hsa-miR-192 | HMDD3.0, dbDEMC, PMID: 29571988 |
| 17 | 0.9581 | hsa-miR-30b | HMDD3.0, dbDEMC, PMID: 33779882 |
| 18 | 0.9578 | hsa-miR-106b | dbDEMC, PMID: 34351868 |
| 19 | 0.9575 | hsa-miR-19b | HMDD3.0, dbDEMC, PMID: 29455644 |
| 20 | 0.9569 | hsa-miR-150 | HMDD3.0, dbDEMC, PMID: 24456795 |
| 21 | 0.9575 | hsa-miR-23a | HMDD3.0, dbDEMC, PMID: 28436951 |
| 22 | 0.9567 | hsa-miR-196a | HMDD3.0, dbDEMC, PMID: 33775710 |
| 23 | 0.9561 | hsa-miR-19a | HMDD3.0, dbDEMC, PMID: 28364280 |
| 24 | 0.9558 | hsa-miR-23b | dbDEMC, PMID: 32495614 |
| 25 | 0.9556 | hsa-miR-206 | HMDD3.0, dbDEMC, PMID: 26919096 |
| 26 | 0.9555 | hsa-miR-26b | HMDD3.0, dbDEMC, PMID: 26744864 |
| 27 | 0.9552 | hsa-miR-223 | HMDD3.0, dbDEMC, PMID: 29615147 |
| 28 | 0.9547 | hsa-miR-195 | HMDD3.0, dbDEMC, PMID: 32406336 |
| 29 | 0.9544 | hsa-miR-222 | HMDD3.0, dbDEMC, PMID: 32588752 |
| 30 | 0.9539 | hsa-miR-34a | HMDD3.0, dbDEMC, PMID: 30700696 |
Top-30 Predicted Associations of Leukemia.
| Rank | Score | miRNA | Evidence |
|---|---|---|---|
| 1 | 0.9819 | hsa-miR-21 | HMDD3.0, dbDEMC, PMID: 32911844 |
| 2 | 0.9804 | hsa-miR-155 | HMDD3.0, dbDEMC, PMID: 33357126 |
| 3 | 0.9723 | hsa-miR-146a | HMDD3.0, dbDEMC, PMID: 32798394 |
| 4 | 0.9643 | hsa-miR-17 | HMDD3.0, dbDEMC, PMID: 35536524 |
| 5 | 0.9632 | hsa-miR-29a | HMDD3.0, dbDEMC, PMID: 31870103 |
| 6 | 0.9631 | hsa-miR-125b | HMDD3.0, dbDEMC, PMID: 27637078 |
| 7 | 0.9630 | hsa-miR-34a | HMDD3.0, dbDEMC, PMID: 27424989 |
| 8 | 0.9629 | hsa-miR-20a | HMDD3.0, dbDEMC, PMID: 34587164 |
| 9 | 0.9622 | hsa-miR-16 | HMDD3.0, dbDEMC, PMID: 28599250 |
| 10 | 0.9606 | hsa-miR-221 | HMDD3.0, dbDEMC, PMID: 29172404 |
| 11 | 0.9605 | hsa-miR-29b | dbDEMC, PMID: 29435107 |
| 12 | 0.9568 | hsa-miR-92a | HMDD3.0, dbDEMC, PMID: 31870103 |
| 13 | 0.9556 | hsa-miR-145 | HMDD3.0, dbDEMC, PMID: 32538049 |
| 14 | 0.9552 | hsa-miR-126 | HMDD3.0, dbDEMC, PMID: 34686664 |
| 15 | 0.9546 | hsa-miR-1 | dbDEMC, PMID: 28042875 |
| 16 | 0.9543 | hsa-miR-15a | HMDD3.0, dbDEMC, PMID: 24026141 |
| 17 | 0.9532 | hsa-miR-19b | HMDD3.0, dbDEMC, PMID: 29032147 |
| 18 | 0.9520 | hsa-miR-18a | HMDD3.0, dbDEMC, PMID: 32146479 |
| 19 | 0.9505 | hsa-let-7a | dbDEMC, PMID: 29398802 |
| 20 | 0.9489 | hsa-miR-19a | HMDD3.0, dbDEMC, PMID: 34895042 |
| 21 | 0.9473 | hsa-miR-222 | HMDD3.0, dbDEMC, PMID: 20203269 |
| 22 | 0.9463 | hsa-miR-143 | dbDEMC, PMID: 28890884 |
| 23 | 0.9454 | hsa-miR-31 | HMDD3.0, dbDEMC, PMID: 22511990 |
| 24 | 0.9453 | hsa-miR-29c | dbDEMC, PMID: 31333331 |
| 25 | 0.9445 | hsa-miR-223 | HMDD3.0, dbDEMC, PMID: 27900032 |
| 26 | 0.9443 | hsa-miR-133a | dbDEMC, PMID: 32647415 |
| 27 | 0.9439 | hsa-miR-199a | HMDD3.0, dbDEMC, PMID: 31636666 |
| 28 | 0.9409 | hsa-let-7b | HMDD3.0, dbDEMC, PMID: 33283713 |
| 29 | 0.9398 | hsa-miR-150 | HMDD3.0, dbDEMC, PMID: 27917123 |
| 30 | 0.9386 | hsa-miR-200b | PMID: 30574752 |
The advantages and drawbacks of our method and other methods.
| Method | AUC | Advantages | Drawbacks |
|---|---|---|---|
| PBMDA | 0.9172 | Topological information, complex network | No weighting, imbalance problem |
| WBNPMD | 0.9173 | Weighted edges | No topological information, imbalance problem |
| NIMCGCN | 0.9291 | Topological information, complex network, neural inductive | No weighting, imbalance problem |
| DNRLMF | 0.9357 | Complex network, dynamic regularized weight | No topological information, imbalance problem |
| VGAE-MDA | 0.9394 | Topological information, complex network, variational Bayesian inference | No weighting, imbalance problem |
| Ours | 0.9472 | Topological information, complex network, adaptive weight | Imbalance problem |
Figure 11. The construction of the modules. MPMA and MDMA are miRNA adjacency matrices based on proteins and diseases, respectively. DPDA and DMDA are disease adjacency matrices based on proteins and miRNAs, respectively. MS and DS are the similarity matrices of miRNAs and diseases, respectively.
The framework and parameters of the model.
| Component | Parameters |
|---|---|
| GAT | Input (1, 857, 857) |
| | Node attention layer (1, 857, 32) |
| | Concatenate layer (1, 857, 256) |
| | Module attention layer (857, 256), activation function |
| | Dense layer (857, 256), activation function |
| | Learning rate (0.001) |
| | Epoch (2000) |
| SVM | Kernel function (radial basis function) |
| | C factor (50) |
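The SVM rows name a radial basis function (RBF) kernel and a penalty factor C of 50. As a minimal sketch of what that kernel computes, the function below evaluates the standard RBF similarity between two feature vectors. The kernel width `gamma` is not reported in the table, so the values used here are placeholders, and the function name is hypothetical.

```python
import math

def rbf_kernel(x, y, gamma):
    """Standard RBF kernel: exp(-gamma * squared Euclidean distance).

    gamma is an assumed placeholder; the source does not report it.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

# Identical vectors have similarity 1; similarity decays with distance.
same = rbf_kernel([0.2, 0.7], [0.2, 0.7], gamma=0.5)
near = rbf_kernel([1.0, 0.0], [0.0, 0.0], gamma=1.0)
far = rbf_kernel([3.0, 0.0], [0.0, 0.0], gamma=1.0)
```

The source does not say which library was used. In scikit-learn, for example, the listed settings would correspond to `SVC(kernel="rbf", C=50)`, where a larger C penalizes training errors more heavily.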
Figure 12. The flowchart of our method. (a) Construction of networks; (b) Construction of multi-module; (c) Feature extraction; (d) Model training and prediction.