Lei Wang1, Zhu-Hong You1, Xing Chen2, Yang-Ming Li3, Ya-Nan Dong4, Li-Ping Li1, Kai Zheng1.
Abstract
Emerging evidence shows that microRNAs (miRNAs) play an important role in human disease. Identifying potential miRNA-disease associations is important for understanding pathology and for developing diagnosis and therapy. However, only a tiny portion of all miRNA-disease pairs in current datasets has been experimentally validated, which motivates the development of high-precision computational methods for predicting true associations. In this paper, we propose a Logistic Model Tree model for predicting miRNA-Disease Association (LMTRDA) that fuses multi-source information, including miRNA sequences, miRNA functional similarity, disease semantic similarity, and known miRNA-disease associations. In particular, we introduce miRNA sequence information into a miRNA-disease prediction model for the first time and extract its features using a natural language processing technique. In five-fold cross-validation on the HMDD V3.0 dataset, LMTRDA achieved 90.51% prediction accuracy and 92.55% sensitivity, with an AUC of 90.54%. To further evaluate LMTRDA, we compared it with different classifier and feature descriptor models. We also validated its predictive ability on three human diseases: Breast Neoplasms, Colon Neoplasms and Lymphoma. In these case studies, 28, 27 and 26 of the top 30 predicted miRNAs for the respective diseases were verified by experimental evidence. These results demonstrate that LMTRDA is a reliable model for predicting associations between miRNAs and diseases.
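The abstract states only that sequence features are extracted with a natural language processing technique; it does not name the technique. As a minimal sketch, assuming the sequence is treated as a sentence of overlapping k-mer "words" (the choice of k = 3, the function names, and the example sequence are all illustrative assumptions, not details taken from the paper), a simple bag-of-words descriptor could look like this:

```python
from collections import Counter
from itertools import product


def kmer_tokens(sequence: str, k: int = 3) -> list[str]:
    """Split a nucleotide sequence into overlapping k-mers ("words")."""
    seq = sequence.upper().replace("T", "U")  # normalise DNA-style input to RNA
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def kmer_frequency_vector(sequence: str, k: int = 3, alphabet: str = "ACGU") -> list[float]:
    """Return normalised k-mer frequencies over the full 4**k vocabulary."""
    vocab = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = Counter(kmer_tokens(sequence, k))
    total = sum(counts[word] for word in vocab) or 1  # avoid division by zero
    return [counts[word] / total for word in vocab]


# Illustrative input only; not a sequence quoted in the source.
example_sequence = "UGAGGUAGUAGGUUGUAUAGUU"
example_vector = kmer_frequency_vector(example_sequence)
```

A fixed-length vector of this kind can be concatenated with similarity-based features before classification; a learned embedding would be a drop-in alternative under the same tokenisation.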
Year: 2019 PMID: 30917115 PMCID: PMC6464243 DOI: 10.1371/journal.pcbi.1006865
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475
Fig 1. Flowchart of the LMTRDA model for predicting potential miRNA-disease associations.
Five-fold cross-validation results of LMTRDA on the HMDD V3.0 dataset.
| Test set | Accu.(%) | Sen.(%) | Prec. (%) | MCC(%) | AUC(%) |
|---|---|---|---|---|---|
| 1 | 90.99 | 92.32 | 89.92 | 82.00 | 91.03 |
| 2 | 90.29 | 93.98 | 87.44 | 80.81 | 90.51 |
| 3 | 90.74 | 93.37 | 88.53 | 81.60 | 90.69 |
| 4 | 90.22 | 91.72 | 89.31 | 80.47 | 90.22 |
| 5 | 90.30 | 91.35 | 89.47 | 80.63 | 90.27 |
| Average | 90.51 | 92.55 | 88.93 | 81.10 | 90.54 |
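The averages quoted in the abstract (90.51% accuracy, 92.55% sensitivity, 90.54% AUC) can be reproduced from the five fold rows above. The following sketch copies the per-fold values from the table and computes the arithmetic mean and sample standard deviation per metric; the variable names are illustrative, and the use of the sample (rather than population) standard deviation is an assumption.

```python
from statistics import mean, stdev

# Per-fold values copied from the LMTRDA table above (percent).
lmtrda_folds = {
    "accuracy":    [90.99, 90.29, 90.74, 90.22, 90.30],
    "sensitivity": [92.32, 93.98, 93.37, 91.72, 91.35],
    "precision":   [89.92, 87.44, 88.53, 89.31, 89.47],
    "mcc":         [82.00, 80.81, 81.60, 80.47, 80.63],
    "auc":         [91.03, 90.51, 90.69, 90.22, 90.27],
}

# (mean, sample standard deviation) per metric, rounded to two decimals.
lmtrda_summary = {
    name: (round(mean(values), 2), round(stdev(values), 2))
    for name, values in lmtrda_folds.items()
}

for name, (avg, sd) in lmtrda_summary.items():
    print(f"{name:12s} mean={avg:.2f} sd={sd:.2f}")
```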
Fig 2. ROC curves of LMTRDA on the HMDD V3.0 dataset.
Fig 3. PR curves of LMTRDA on the HMDD V3.0 dataset.
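The AUC summarises a ROC curve as the probability that a randomly chosen positive pair is scored higher than a randomly chosen negative pair. A minimal sketch of that pairwise definition follows; the labels and scores are hypothetical toy values, not data from the paper, and the quadratic loop is chosen for clarity rather than speed.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = known association, 0 = negative sample.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)
```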
Five-fold cross-validation results of the SVM classifier combined with the proposed feature descriptors on the HMDD V3.0 dataset.
| Test set | Accu.(%) | Sen.(%) | Prec. (%) | MCC(%) | AUC(%) |
|---|---|---|---|---|---|
| 1 | 86.30 | 76.56 | 95.09 | 74.02 | 86.13 |
| 2 | 86.46 | 77.00 | 94.82 | 74.21 | 86.75 |
| 3 | 86.04 | 75.71 | 95.06 | 73.52 | 85.91 |
| 4 | 85.76 | 75.65 | 95.32 | 73.21 | 85.84 |
| 5 | 85.88 | 75.79 | 94.97 | 73.28 | 85.88 |
| Average | 86.09 | 76.14 | 95.05 | 73.65 | 86.10 |
| LMTRDA | 90.51 | 92.55 | 88.93 | 81.10 | 90.54 |
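The columns in these tables (accuracy, sensitivity, precision, MCC) are standard functions of the confusion-matrix counts. The sketch below uses the usual textbook definitions; the counts in the example are hypothetical and chosen only to illustrate a high-precision, lower-sensitivity profile similar in shape to the SVM rows above.

```python
from math import sqrt


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute accuracy, sensitivity, precision and MCC from confusion counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    precision = tp / (tp + fp)     # positive predictive value
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal count is zero.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "precision": precision,
        "mcc": mcc,
    }


# Hypothetical counts for one balanced test fold of 200 pairs.
example_metrics = classification_metrics(tp=77, fp=4, tn=96, fn=23)
```

Because MCC uses all four counts, it penalises the missed positives that a high precision figure alone would hide.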
Five-fold cross-validation results of the random forest classifier combined with the proposed feature descriptors on the HMDD V3.0 dataset.
| Test set | Accu.(%) | Sen.(%) | Prec. (%) | MCC(%) | AUC(%) |
|---|---|---|---|---|---|
| 1 | 90.12 | 88.35 | 91.60 | 80.30 | 90.32 |
| 2 | 89.88 | 88.69 | 90.76 | 79.78 | 90.14 |
| 3 | 90.02 | 88.33 | 91.21 | 80.06 | 89.95 |
| 4 | 89.32 | 88.16 | 90.55 | 78.67 | 89.27 |
| 5 | 88.95 | 87.18 | 90.39 | 77.94 | 88.97 |
| Average | 89.66 | 88.14 | 90.90 | 79.35 | 89.73 |
| LMTRDA | 90.51 | 92.55 | 88.93 | 81.10 | 90.54 |
Fig 4. Comparison of the results of different classifier models on the HMDD V3.0 dataset.
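Each table above reports five test folds. A minimal sketch of how sample indices can be partitioned for five-fold cross-validation is shown below; the shuffling seed, the function name, and the round-robin fold assignment are illustrative assumptions, since the source does not describe how the folds were drawn.

```python
import random


def k_fold_indices(n_samples: int, k: int = 5, seed: int = 0):
    """Yield (train_indices, test_indices) for each of k disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)         # reproducible shuffle
    folds = [indices[i::k] for i in range(k)]    # round-robin assignment
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Example: 23 hypothetical samples split into five folds.
splits = list(k_fold_indices(23, k=5))
```

Every sample appears in exactly one test fold, so each pair is scored once by a model that did not see it during training.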
Five-fold cross-validation results of the LMT classifier combined with the descriptor DescSeq on the HMDD V3.0 dataset.
| Test set | Accu.(%) | Sen.(%) | Prec. (%) | MCC(%) | AUC(%) |
|---|---|---|---|---|---|
| 1 | 88.08 | 88.83 | 87.51 | 76.16 | 88.13 |
| 2 | 87.55 | 87.68 | 87.34 | 75.10 | 87.67 |
| 3 | 87.55 | 87.85 | 87.09 | 75.10 | 87.55 |
| 4 | 87.54 | 86.43 | 88.73 | 75.11 | 87.73 |
| 5 | 86.84 | 85.49 | 87.87 | 73.70 | 86.97 |
| Average | 87.51 | 87.26 | 87.71 | 75.03 | 87.61 |
| LMTRDA | 90.51 | 92.55 | 88.93 | 81.10 | 90.54 |
Five-fold cross-validation results of the LMT classifier combined with the descriptor DescSim on the HMDD V3.0 dataset.
| Test set | Accu.(%) | Sen.(%) | Prec. (%) | MCC(%) | AUC(%) |
|---|---|---|---|---|---|
| 1 | 90.87 | 92.31 | 89.68 | 81.77 | 90.92 |
| 2 | 89.95 | 92.86 | 87.74 | 80.03 | 90.09 |
| 3 | 90.87 | 92.83 | 89.43 | 81.79 | 90.90 |
| 4 | 87.93 | 92.25 | 84.90 | 76.14 | 88.15 |
| 5 | 87.55 | 92.05 | 84.40 | 75.42 | 87.69 |
| Average | 89.43 | 92.46 | 87.23 | 79.03 | 89.55 |
| LMTRDA | 90.51 | 92.55 | 88.93 | 81.10 | 90.54 |
Fig 5. Comparison of the results of different descriptor models on the HMDD V3.0 dataset.
Top 30 miRNAs predicted by LMTRDA to be associated with Breast Neoplasms, based on known miRNA-disease associations in the HMDD V3.0 database.
| miRNA | Evidence | miRNA | Evidence |
|---|---|---|---|
| hsa-mir-520f | dbDEMC V2.0 | hsa-mir-211 | dbDEMC V2.0 |
| hsa-mir-520e | dbDEMC V2.0 | hsa-mir-19b-2 | |
| hsa-mir-325 | dbDEMC V2.0 | hsa-mir-663 | dbDEMC V2.0 |
| hsa-mir-616 | dbDEMC V2.0 | hsa-mir-362 | dbDEMC V2.0 |
| hsa-mir-634 | dbDEMC V2.0 | hsa-mir-133 | dbDEMC V2.0 |
| hsa-mir-637 | dbDEMC V2.0 | hsa-mir-490 | dbDEMC V2.0 |
| hsa-mir-498 | dbDEMC V2.0 | hsa-mir-483 | dbDEMC V2.0 |
| hsa-mir-885 | dbDEMC V2.0 | hsa-mir-30 | dbDEMC V2.0 |
| hsa-mir-181d | dbDEMC V2.0 | hsa-mir-186 | dbDEMC V2.0 |
| hsa-mir-28 | dbDEMC V2.0 | hsa-mir-95 | dbDEMC V2.0 |
| hsa-mir-216 | dbDEMC V2.0 | hsa-mir-449b | dbDEMC V2.0 |
| hsa-mir-208b | | hsa-mir-330 | dbDEMC V2.0 |
| hsa-mir-455 | dbDEMC V2.0 | hsa-mir-217 | dbDEMC V2.0 |
| hsa-mir-382 | dbDEMC V2.0 | hsa-mir-99b | dbDEMC V2.0 |
| hsa-mir-520f | dbDEMC V2.0 | hsa-mir-365 | dbDEMC V2.0 |
Top 30 miRNAs predicted by LMTRDA to be associated with Colon Neoplasms, based on known miRNA-disease associations in the HMDD V3.0 database.
| miRNA | Evidence | miRNA | Evidence |
|---|---|---|---|
| hsa-mir-526b | dbDEMC V2.0 | hsa-mir-198 | dbDEMC V2.0 |
| hsa-mir-520g | dbDEMC V2.0 | hsa-mir-181d | dbDEMC V2.0 |
| hsa-mir-520f | dbDEMC V2.0 | hsa-mir-181c | dbDEMC V2.0 |
| hsa-mir-520e | dbDEMC V2.0 | hsa-mir-181b-2 | dbDEMC V2.0 |
| hsa-mir-325 | dbDEMC V2.0 | hsa-mir-181b-1 | dbDEMC V2.0 |
| hsa-mir-302f | | hsa-mir-122 | dbDEMC V2.0 |
| hsa-mir-616 | dbDEMC V2.0 | hsa-mir-370 | dbDEMC V2.0 |
| hsa-mir-634 | dbDEMC V2.0 | hsa-mir-302c | dbDEMC V2.0 |
| hsa-mir-637 | dbDEMC V2.0 | hsa-mir-28 | dbDEMC V2.0 |
| hsa-mir-492 | | hsa-mir-26a-2 | dbDEMC V2.0 |
| hsa-mir-520c | | hsa-mir-26a-1 | dbDEMC V2.0 |
| hsa-mir-520b | dbDEMC V2.0 | hsa-mir-216 | dbDEMC V2.0 |
| hsa-mir-885 | dbDEMC V2.0 | hsa-mir-208b | dbDEMC V2.0 |
| hsa-mir-34b | dbDEMC V2.0 | hsa-mir-182 | dbDEMC V2.0 |
| hsa-mir-340 | dbDEMC V2.0 | hsa-mir-103a-2 | dbDEMC V2.0 |
Top 30 miRNAs predicted by LMTRDA to be associated with Lymphoma, based on known miRNA-disease associations in the HMDD V3.0 database.
| miRNA | Evidence | miRNA | Evidence |
|---|---|---|---|
| hsa-mir-526b | dbDEMC V2.0 | hsa-mir-30c-1 | dbDEMC V2.0 |
| hsa-mir-520g | dbDEMC V2.0 | hsa-mir-198 | dbDEMC V2.0 |
| hsa-mir-520f | dbDEMC V2.0 | hsa-mir-181d | dbDEMC V2.0 |
| hsa-mir-520e | dbDEMC V2.0 | hsa-mir-181b-2 | dbDEMC V2.0 |
| hsa-mir-325 | dbDEMC V2.0 | hsa-mir-506 | |
| hsa-mir-302f | | hsa-mir-370 | dbDEMC V2.0 |
| hsa-mir-616 | dbDEMC V2.0 | hsa-mir-30a | dbDEMC V2.0 |
| hsa-mir-634 | dbDEMC V2.0 | hsa-mir-302c | dbDEMC V2.0 |
| hsa-mir-637 | dbDEMC V2.0 | hsa-mir-302b | dbDEMC V2.0 |
| hsa-mir-492 | dbDEMC V2.0 | hsa-mir-216 | dbDEMC V2.0 |
| hsa-mir-520b | dbDEMC V2.0 | hsa-mir-208b | dbDEMC V2.0 |
| hsa-mir-498 | dbDEMC V2.0 | hsa-mir-103a-2 | |
| hsa-mir-885 | dbDEMC V2.0 | hsa-mir-103a-1 | |
| hsa-mir-340 | dbDEMC V2.0 | hsa-mir-1 | dbDEMC V2.0 |
| hsa-mir-30c-2 | dbDEMC V2.0 | hsa-mir-499 | dbDEMC V2.0 |
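The case-study counts quoted in the abstract (28, 27 and 26 of the top 30) can be checked against the three tables by subtracting the miRNAs listed without a dbDEMC V2.0 entry. The sketch below hard-codes those unconfirmed entries as read from the tables; the dictionary layout and variable names are illustrative.

```python
TOP_N = 30

# miRNAs that appear in each top-30 table without a dbDEMC V2.0 evidence entry.
unconfirmed = {
    "Breast Neoplasms": ["hsa-mir-19b-2", "hsa-mir-208b"],
    "Colon Neoplasms": ["hsa-mir-302f", "hsa-mir-492", "hsa-mir-520c"],
    "Lymphoma": ["hsa-mir-506", "hsa-mir-302f", "hsa-mir-103a-2", "hsa-mir-103a-1"],
}

# Number of predictions with supporting evidence, per disease.
confirmed = {disease: TOP_N - len(mirnas) for disease, mirnas in unconfirmed.items()}

for disease, count in confirmed.items():
    print(f"{disease}: {count}/{TOP_N} confirmed")
```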