Hua Yu, Xiaojun Chen, Lu Lu.
Abstract
Identifying associations between microRNA molecules and human diseases from large-scale heterogeneous biological data is an important step toward understanding the pathogenesis of diseases at the microRNA level. However, experimental verification of microRNA-disease associations is expensive and time-consuming. To overcome the drawbacks of conventional experimental methods, we present a combinatorial prioritization algorithm to predict microRNA-disease associations. Importantly, our method can predict microRNAs (diseases) associated with diseases (microRNAs) that have no known associated microRNAs (diseases). The predictive performance of the proposed approach was evaluated by internal cross-validations and external independent validations based on standard association datasets. The results show that the proposed method predicts microRNA-disease associations with an Area Under the receiver operating characteristic Curve (AUC) of 86.93%, which outperforms previous prediction methods. In particular, we observed that an ensemble-based method integrating the predictions of multiple algorithms gives more reliable and robust predictions than any single algorithm, with the AUC score improved to 92.26%. We applied our combinatorial prioritization algorithm to lung neoplasms and breast neoplasms and identified their top 30 microRNA candidates, which are consistent with the published literature and databases.
Year: 2017 PMID: 28317855 PMCID: PMC5357838 DOI: 10.1038/srep43792
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Performance of our combinatorial prioritization algorithm.
(A) The ROC curves for prioritizing the microRNAs related to the specific diseases. (B) The ROC curves for prioritizing the diseases related to the specific microRNAs.
The predictive performance of our method in a series of cross-validation experiments.
| 50 (a) | 100 (a) | 150 (a) | 200 (a) | 250 (a) | 300 (a) | All microRNAs (1) |
|---|---|---|---|---|---|---|
| 84.12% (c) | 85.00% (c) | 85.61% (c) | 86.10% (c) | 86.56% (c) | 86.76% (c) | 86.93% |

| 50 (a) | 100 (a) | 150 (a) | 200 (a) | 250 (a) | 300 (a) | All microRNAs (1) |
|---|---|---|---|---|---|---|
| 69.23% (c) | 68.89% (c) | 69.68% (c) | 70.00% (c) | 70.44% (c) | 71.30% (c) | 70.92% |

| 90 (b) | 100 (b) | 110 (b) | 120 (b) | 130 (b) | 140 (b) | All diseases (2) |
|---|---|---|---|---|---|---|
| 81.53% (c) | 80.66% (c) | 79.44% (c) | 83.43% (c) | 81.52% (c) | 83.47% (c) | 83.50% |

| 90 (b) | 100 (b) | 110 (b) | 120 (b) | 130 (b) | 140 (b) | All diseases (2) |
|---|---|---|---|---|---|---|
| 77.97% (c) | 77.28% (c) | 79.38% (c) | 79.27% (c) | 79.63% (c) | 79.02% (c) | 80.03% |

(1) The candidate microRNAs were obtained after deleting the microRNA(s) related to the query disease.
(2) The candidate diseases were obtained after deleting the disease(s) associated with the query microRNA.
(a) The number of randomly selected candidate microRNAs in microRNA prioritization.
(b) The number of randomly selected candidate diseases in disease prioritization.
(c) The AUC score of the ROC curve.
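For readers unfamiliar with the metric, the following is a minimal sketch of how an AUC score can be computed from a ranked candidate list. It is not the authors' implementation: the function name `auc_from_scores` and the toy scores are hypothetical, and the sketch uses the standard rank-based (Mann-Whitney) formulation, in which the AUC is the probability that a true association is scored above a non-association.

```python
def auc_from_scores(scores, labels):
    """Rank-based AUC: fraction of (positive, negative) pairs in which
    the positive item receives the higher score. Ties count as half.

    scores -- prediction scores, higher means more likely associated
    labels -- 1 for a known (held-out) association, 0 for a candidate
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example (made-up scores, not data from the paper):
# two held-out associations ranked among two unrelated candidates.
toy_scores = [0.9, 0.8, 0.3, 0.1]
toy_labels = [1, 0, 1, 0]
toy_auc = auc_from_scores(toy_scores, toy_labels)
print(f"AUC = {toy_auc:.2%}")
```

In a leave-one-out setting there is a single positive per query, so this quantity reduces to the fraction of candidates ranked below the held-out item; averaging over all queries gives a summary score of the kind reported as a percentage.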
Figure 2. Comparison of the performance of our method with previous methods for prioritizing the microRNAs related to specific diseases using leave-one-out cross-validation experiments.
(A) The ROC curve. (B) The PR curve.
Figure 3. Comparison of the performance of our method with RLSMDA and NCPMDA for prioritizing the diseases associated with specific microRNAs using leave-one-out cross-validation experiments.
(A) The ROC curve. (B) The PR curve.
The top 30 breast neoplasms-related microRNA candidates.
| Rank | MicroRNA Name | Evidence |
|---|---|---|
| 1 | hsa-let-7b | HMDD,dbDEMC |
| 2 | hsa-let-7c | HMDD,dbDEMC |
| 3 | hsa-mir-126 | HMDD,dbDEMC,miR2Disease |
| 4 | hsa-mir-16 | HMDD,dbDEMC |
| 5 | hsa-mir-100 | HMDD,dbDEMC |
| 6 | hsa-let-7e | HMDD,dbDEMC |
| 7 | hsa-mir-135a | HMDD,dbDEMC |
| 8 | hsa-mir-130a | dbDEMC |
| 9 | hsa-let-7i | HMDD,dbDEMC,miR2Disease |
| 10 | hsa-mir-106a | dbDEMC |
| 11 | hsa-mir-150 | dbDEMC |
| 12 | hsa-mir-181a | HMDD,dbDEMC,miR2Disease |
| 13 | hsa-mir-140 | HMDD,dbDEMC |
| 14 | hsa-mir-203 | HMDD,dbDEMC,miR2Disease |
| 15 | hsa-mir-192 | dbDEMC |
| 16 | hsa-mir-138 | dbDEMC |
| 17 | hsa-mir-191 | HMDD,dbDEMC,miR2Disease |
| 18 | hsa-let-7g | HMDD,dbDEMC |
| 19 | hsa-mir-142 | literature |
| 20 | hsa-mir-449a | literature |
| 21 | hsa-mir-101 | dbDEMC,miR2Disease |
| 22 | hsa-mir-449b | G2SBC |
| 23 | hsa-mir-99b | dbDEMC |
| 24 | hsa-mir-186 | dbDEMC |
| 25 | hsa-mir-372 | dbDEMC |
| 26 | hsa-mir-95 | dbDEMC |
| 27 | hsa-mir-371 | dbDEMC |
| 28 | hsa-mir-152 | HMDD,dbDEMC,miR2Disease |
| 29 | hsa-mir-148a | HMDD,dbDEMC,miR2Disease |
| 30 | hsa-mir-208 | dbDEMC |
Figure 4. The flowchart of the whole modeling procedure.