Ruiyu Guo1, Hailin Chen2, Wengang Wang1, Guangsheng Wu3, Fangliang Lv1.
Abstract
BACKGROUND: A growing number of biomedical studies have shown that the dysfunction of miRNAs is closely related to many human diseases. Identifying disease-associated miRNAs would contribute to the understanding of the pathological mechanisms of diseases. Supervised learning-based computational methods have continuously been developed for miRNA-disease association prediction. These approaches require negative samples, that is, experimentally validated uncorrelated miRNA-disease pairs, but such pairs are not available because they attract little biomedical research interest. Existing methods mainly choose negative samples at random from the unlabelled pairs. Therefore, selecting more reliable negative samples is of great importance for these methods to achieve satisfactory prediction results.
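The random strategy that the abstract attributes to existing methods can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `sample_random_negatives`, the integer indexing of miRNAs and diseases, and the toy association list are all assumptions made for the example.

```python
import random


def sample_random_negatives(n_mirnas, n_diseases, known_pairs, n_samples, seed=0):
    """Draw unlabelled (miRNA, disease) index pairs uniformly at random.

    Hypothetical helper: every pair that is not a known association is
    treated as unlabelled and is equally likely to be picked as a negative.
    """
    known = set(known_pairs)
    unlabelled = [
        (m, d)
        for m in range(n_mirnas)
        for d in range(n_diseases)
        if (m, d) not in known
    ]
    if n_samples > len(unlabelled):
        raise ValueError("not enough unlabelled pairs to sample from")
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return rng.sample(unlabelled, n_samples)  # sampling without replacement


# Toy example: 4 miRNAs, 3 diseases, 3 known associations (made-up indices).
known_pairs = [(0, 0), (1, 2), (3, 1)]
negatives = sample_random_negatives(4, 3, known_pairs, n_samples=len(known_pairs))
```

Because an unlabelled pair may be a true association that has simply not been tested yet, a draw like this can mislabel positives as negatives, which is the weakness the abstract points to.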
Keywords: Negative sample selection; Supervised learning; miRNA-disease association predictions
Year: 2022 PMID: 36253735 PMCID: PMC9575264 DOI: 10.1186/s12859-022-04978-3
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.307
The ablation experimental results based on fivefold cross-validations
| model | AUC | AUPR | Precision | Recall | F1-score | Accuracy |
|---|---|---|---|---|---|---|
| SS-Kmeans | 0.9712 | 0.9792 | 0.9531 | 0.8682 | 0.9081 | 0.9133 |
| | 0.9652 | 0.9712 | 0.9398 | 0.8433 | 0.8886 | 0.8946 |
| KR-NSSM | | | | | | |
The bold value indicates the highest one in each column
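For readers unfamiliar with the threshold-based columns (Precision, Recall, F1-score, Accuracy), the sketch below shows their standard definitions in terms of confusion-matrix counts. The function name `classification_metrics` and the counts in the example are hypothetical and are not taken from the paper; AUC and AUPR are threshold-free and are not covered here.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard threshold-based metrics from confusion-matrix counts.

    tp/fp/tn/fn are the numbers of true positives, false positives,
    true negatives and false negatives for one evaluation fold.
    """
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    accuracy = (tp + tn) / total if total else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
    }


# Made-up counts for a single fold with 100 positives and 100 negatives.
metrics = classification_metrics(tp=87, fp=4, tn=96, fn=13)
```

If the reported values are averages over the five folds, the tabulated F1-score need not equal the harmonic mean of the tabulated Precision and Recall, since F1 is not linear in either quantity.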
Fig. 1 ROC curves of different classifiers based on fivefold cross-validations and different strategies of negative sample selection
Fig. 2 PR curves of different classifiers based on fivefold cross-validations and different strategies of negative sample selection
Performance comparison based on six classical classifiers and fivefold cross-validations
| classifier | AUC | AUPR | Precision | Recall | F1-score | Accuracy |
|---|---|---|---|---|---|---|
| lightGBM | 0.9723 | 0.9787 | 0.9678 | 0.8681 | 0.9150 | 0.9196 |
| SVM | 0.9701 | 0.9788 | 0.9701 | 0.8799 | 0.9225 | 0.9263 |
| RF | 0.9699 | 0.9766 | 0.9731 | 0.8608 | 0.9131 | 0.9185 |
| LR | 0.9763 | 0.9810 | 0.9630 | 0.8751 | 0.9168 | 0.9208 |
| XGBoost | 0.9595 | 0.9698 | 0.9655 | 0.8554 | 0.9069 | 0.9125 |
| MLP | 0.9492 | 0.9642 | 0.9537 | 0.8527 | 0.9001 | 0.9056 |
| lightGBM | 0.8719 | 0.8519 | 0.8099 | 0.6853 | 0.7406 | 0.7629 |
| SVM | 0.8865 | 0.8489 | 0.8376 | 0.8015 | 0.8189 | 0.8230 |
| RF | 0.8106 | 0.7347 | 0.7438 | 0.4573 | 0.5599 | 0.6543 |
| LR | 0.8655 | 0.8451 | 0.8240 | 0.7004 | 0.7570 | 0.7754 |
| XGBoost | 0.7724 | 0.7280 | 0.7363 | 0.4042 | 0.5187 | 0.6315 |
| MLP | 0.7451 | 0.7276 | 0.7575 | 0.4860 | 0.5892 | 0.6669 |
Performance comparison of existing prediction methods based on fivefold cross-validations
| method | AUC | AUPR | Precision | Recall | F1-score | Accuracy |
|---|---|---|---|---|---|---|
| RFMDA | 0.9414 | 0.9606 | 0.9818 | 0.8424 | 0.9064 | 0.9134 |
| IRFMDA | 0.9671 | 0.9739 | 0.9591 | 0.8582 | 0.9054 | 0.9109 |
| ABMDA | 0.9732 | 0.9789 | 0.9971 | 0.8427 | 0.9129 | 0.9201 |
| GBDT-LR | 0.9633 | 0.9730 | 0.9625 | 0.8654 | 0.9111 | 0.9158 |
| SMALF | 0.9913 | 0.9931 | 0.9749 | 0.9507 | 0.9626 | 0.9648 |
| RFMDA | 0.7388 | 0.7034 | 0.6253 | 0.9548 | 0.7453 | 0.6912 |
| IRFMDA | 0.9267 | 0.9222 | 0.8447 | 0.8598 | 0.8521 | 0.8567 |
| ABMDA | 0.8841 | 0.8807 | 0.8152 | 0.7827 | 0.7908 | 0.8027 |
| GBDT-LR | 0.9274 | 0.9014 | 0.8315 | 0.8273 | 0.8302 | 0.8304 |
| SMALF | 0.9503 | 0.9472 | 0.8808 | 0.8931 | 0.8868 | 0.8860 |
Fig. 3 The workflow of our method KR-NSSM