Xing Chen, Chi-Chi Zhu, Jun Yin.
Abstract
In recent years, a growing number of associations between microRNAs (miRNAs) and human diseases have been identified. Drawing on the accumulating biological data, many computational models have been developed to infer potential miRNA-disease associations. These models save the time and cost of experimental studies and contribute to research on the molecular mechanisms of human diseases and to the development of new drugs for disease treatment. In this paper, we proposed a novel computational method named Ensemble of Decision Tree based MiRNA-Disease Association prediction (EDTMDA), a computational framework that integrates ensemble learning and dimensionality reduction. For each miRNA-disease pair, the feature vector was extracted by calculating statistical measures, graph theoretical measures, and matrix factorization results for the miRNA and the disease, respectively. Multiple base learnings were then built to yield many decision trees (DTs), each based on a random selection of negative samples and miRNA/disease features. In particular, Principal Component Analysis (PCA) was applied in each base learning to reduce feature dimensionality and thereby remove noise and redundancy. The scores of these DTs were averaged to obtain the final association scores between miRNAs and diseases. In model performance evaluation, EDTMDA achieved an AUC of 0.9309 in global leave-one-out cross validation (LOOCV) and an AUC of 0.8524 in local LOOCV. In addition, an AUC of 0.9192+/-0.0009 in 5-fold cross validation indicated the model's reliability and stability. Furthermore, three types of case studies were carried out on four human diseases. As a result, 94% (Esophageal Neoplasms), 86% (Kidney Neoplasms), 96% (Breast Neoplasms) and 88% (Carcinoma Hepatocellular) of the top 50 predicted miRNAs were confirmed by experimental evidence in the literature.
Year: 2019 PMID: 31329575 PMCID: PMC6675125 DOI: 10.1371/journal.pcbi.1007209
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475
Fig 1. The flowchart of EDTMDA to predict miRNA-disease associations.
MiRNA/disease features extracted from the integrated miRNA/disease similarity and the known miRNA-disease associations were the inputs of the training model. M DTs were obtained from M base learnings, and the average of the prediction scores from all DTs was calculated as the final prediction result.
Fig 2. The pseudocode of EDTMDA to predict miRNA-disease associations.
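The pseudocode itself is not reproduced in this record, so the following is only a minimal sketch of the procedure described in the abstract: random negative sampling, random feature selection, PCA, one decision tree per base learning, and score averaging. The function name `edt_ensemble_scores`, all parameter values, the use of scikit-learn's `DecisionTreeRegressor`, and the random toy features are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.tree import DecisionTreeRegressor


def edt_ensemble_scores(X_pos, X_unlabeled, X_query, n_learners=10,
                        feature_frac=0.5, n_components=5, seed=0):
    """Average the scores of trees trained on random negatives and features.

    Hypothetical helper: names and defaults are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    n_features = X_pos.shape[1]
    n_keep = max(n_components, int(feature_frac * n_features))
    scores = np.zeros(X_query.shape[0])
    for _ in range(n_learners):
        # Draw as many "negatives" from the unlabeled pairs as there are positives.
        neg_idx = rng.choice(X_unlabeled.shape[0], size=X_pos.shape[0], replace=False)
        # Draw a random subset of feature columns for this base learning.
        feat_idx = rng.choice(n_features, size=n_keep, replace=False)
        X_train = np.vstack([X_pos, X_unlabeled[neg_idx]])[:, feat_idx]
        y_train = np.concatenate([np.ones(X_pos.shape[0]), np.zeros(len(neg_idx))])
        # Reduce the sampled features with PCA before fitting the tree.
        pca = PCA(n_components=n_components).fit(X_train)
        tree = DecisionTreeRegressor(max_depth=4, random_state=0)
        tree.fit(pca.transform(X_train), y_train)
        scores += tree.predict(pca.transform(X_query[:, feat_idx]))
    # Average strategy: the final score is the mean over all trees.
    return scores / n_learners


# Toy data only: random vectors standing in for miRNA-disease pair features.
toy_rng = np.random.default_rng(1)
X_pos = toy_rng.normal(1.0, 1.0, size=(30, 12))    # known associations
X_unl = toy_rng.normal(0.0, 1.0, size=(200, 12))   # unlabeled pairs
X_query = toy_rng.normal(0.5, 1.0, size=(8, 12))   # pairs to score
scores = edt_ensemble_scores(X_pos, X_unl, X_query)
```

Because each tree is fitted on 0/1 labels, every leaf value is a fraction of positives, so the averaged scores stay in the range [0, 1] and can be used directly to rank candidate miRNAs.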
Fig 3. Performance comparisons between EDTMDA and 12 other prediction models (HGIMDA, RLSMDA, HDMP, WBSMDA, RWRMDA, MCMDA, MIDP, PBMDA, MaxFlow, LRSSLMDA, MiRAI and MDHGI) in terms of ROC curve and AUC based on local and global LOOCV, respectively.
EDTMDA obtained AUCs of 0.9309 and 0.8524 in global and local LOOCV, respectively, exceeding all of the compared models.
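To make the difference between the two LOOCV settings concrete, here is a small sketch. It assumes the usual convention in the miRNA-disease literature: in global LOOCV a left-out association is ranked against all unobserved miRNA-disease pairs, while in local LOOCV it is ranked only against unobserved miRNAs for the same disease. The record does not spell this out, and the helper names and toy scores below are hypothetical.

```python
def auc_from_scores(pos_scores, neg_scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def candidate_pool(scores, known, held_out, mode):
    """Collect the scores that the held-out pair is ranked against.

    scores[i][j]: predicted score for miRNA i and disease j.
    known: set of (i, j) known associations, including the held-out one.
    """
    _, disease = held_out
    pool = []
    for i, row in enumerate(scores):
        for j, s in enumerate(row):
            if (i, j) in known:
                continue  # only unobserved pairs act as candidates
            if mode == "local" and j != disease:
                continue  # local: restrict to the held-out pair's disease
            pool.append(s)
    return pool


# Toy scores for 3 miRNAs x 2 diseases; (0, 0) is the left-out association.
toy_scores = [[0.35, 0.2], [0.4, 0.8], [0.1, 0.3]]
toy_known = {(0, 0), (1, 1)}
global_pool = candidate_pool(toy_scores, toy_known, (0, 0), "global")
local_pool = candidate_pool(toy_scores, toy_known, (0, 0), "local")
print(auc_from_scores([0.35], global_pool))  # 0.75
print(auc_from_scores([0.35], local_pool))   # 0.5
```

In a full evaluation the ranking would be repeated for every known association, with the model retrained each time; the single held-out pair here only shows how the candidate pool changes between the two settings.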
Comparison of AUC between EDTMDA and other methods under 5-fold CV.
| Methods | AUC |
|---|---|
| EDTMDA | 0.9192+/-0.0009 |
| LRSSLMDA | 0.9181+/-0.0004 |
| PBMDA | 0.9172+/-0.0007 |
| MDHGI | 0.8794+/-0.0021 |
| MCMDA | 0.8767+/-0.0011 |
| MaxFlow | 0.8579+/-0.001 |
| RLSMDA | 0.8569+/-0.0020 |
| HDMP | 0.8342+/-0.0010 |
| WBSMDA | 0.8185+/-0.0009 |
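Values of the form mean+/-deviation in the table above typically come from repeating the random 5-fold partition several times and summarising the resulting AUCs. The sketch below shows that bookkeeping only; the number of repetitions, the use of the sample standard deviation, the function names and the toy AUC values are all assumptions, since the record does not state them.

```python
import random
import statistics


def five_fold_partition(known_pairs, seed):
    """Randomly split the known associations into 5 folds of near-equal size."""
    pairs = list(known_pairs)
    random.Random(seed).shuffle(pairs)
    return [pairs[k::5] for k in range(5)]


def summarize_aucs(aucs):
    """Format repeated-run AUCs as mean+/-sample standard deviation."""
    return f"{statistics.mean(aucs):.4f}+/-{statistics.stdev(aucs):.4f}"


# Toy example: 12 hypothetical known (miRNA, disease) index pairs.
pairs = [(i, j) for i in range(4) for j in range(3)]
folds = five_fold_partition(pairs, seed=0)

# Made-up AUCs from three repetitions, used only to show the formatting.
summary = summarize_aucs([0.90, 0.92, 0.94])
print(summary)  # 0.9200+/-0.0200
```

Each fold would serve once as the test set while the other four are used for training; a different `seed` per repetition gives a different partition.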
AUC results of EDTMDA with and without dimensionality reduction (PCA) under three cross validation schemes.
| Methods | Global LOOCV | Local LOOCV | 5-fold CV |
|---|---|---|---|
| EDTMDA with PCA | 0.9309 | 0.8524 | 0.9192+/-0.0009 |
| EDTMDA without PCA | 0.9216 | 0.8423 | 0.9076+/-0.0012 |
AUC results of EDTMDA and random forest (RF) under three cross validation schemes.
| Methods | Global LOOCV | Local LOOCV | 5-fold CV |
|---|---|---|---|
| EDTMDA | 0.9309 | 0.8524 | 0.9192+/-0.0009 |
| RF | 0.8464 | 0.7745 | 0.8341+/-0.0035 |
EDTMDA was implemented to predict potential miRNAs related to Esophageal Neoplasms based on known associations in HMDD V2.0.
The top 50 predicted miRNAs were verified in dbDEMC and miR2Disease. The first column records top 1–25 related miRNAs and the third column records the top 26–50 related miRNAs.
| miRNA | evidence | miRNA | evidence |
|---|---|---|---|
| hsa-mir-106b | dbDEMC | hsa-mir-142 | dbDEMC |
| hsa-mir-200b | dbDEMC | hsa-mir-195 | dbDEMC |
| hsa-mir-16 | dbDEMC | hsa-mir-218 | unconfirmed |
| hsa-mir-18a | dbDEMC | hsa-mir-204 | unconfirmed |
| hsa-mir-125b | dbDEMC | hsa-let-7d | dbDEMC |
| hsa-mir-221 | dbDEMC | hsa-mir-29a | dbDEMC |
| hsa-mir-106a | dbDEMC | hsa-mir-146b | dbDEMC |
| hsa-mir-9 | dbDEMC | hsa-mir-181b | dbDEMC |
| hsa-mir-222 | dbDEMC | hsa-mir-199b | dbDEMC |
| hsa-mir-107 | dbDEMC and miR2Disease | hsa-mir-138 | unconfirmed |
| hsa-let-7e | dbDEMC | hsa-let-7i | dbDEMC |
| hsa-mir-125a | dbDEMC | hsa-mir-335 | dbDEMC |
| hsa-mir-7 | dbDEMC | hsa-mir-302c | dbDEMC |
| hsa-mir-182 | dbDEMC | hsa-mir-181a | dbDEMC |
| hsa-mir-429 | dbDEMC | hsa-mir-139 | dbDEMC |
| hsa-mir-29b | dbDEMC | hsa-mir-20b | dbDEMC |
| hsa-mir-302b | dbDEMC | hsa-let-7g | dbDEMC |
| hsa-mir-30a | dbDEMC | hsa-mir-30c | dbDEMC |
| hsa-mir-1 | dbDEMC | hsa-mir-17 | dbDEMC |
| hsa-mir-127 | dbDEMC | hsa-mir-135a | dbDEMC |
| hsa-mir-10b | dbDEMC | hsa-mir-19b | dbDEMC |
| hsa-mir-93 | dbDEMC | hsa-mir-219 | unconfirmed |
| hsa-mir-24 | dbDEMC | hsa-mir-372 | dbDEMC |
| hsa-mir-194 | dbDEMC and miR2Disease | hsa-mir-224 | dbDEMC |
| hsa-mir-32 | dbDEMC | hsa-mir-30d | dbDEMC |
EDTMDA was implemented to predict potential miRNAs related to Kidney Neoplasms based on known associations in HMDD V2.0.
The top 50 predicted miRNAs were verified in dbDEMC and miR2Disease. The first column records top 1–25 related miRNAs and the third column records the top 26–50 related miRNAs.
| miRNA | evidence | miRNA | evidence |
|---|---|---|---|
| hsa-mir-16 | dbDEMC | hsa-mir-1 | dbDEMC |
| hsa-let-7a | dbDEMC | hsa-mir-92a | unconfirmed |
| hsa-mir-150 | dbDEMC and miR2Disease | hsa-let-7i | dbDEMC |
| hsa-mir-200a | dbDEMC | hsa-mir-18a | dbDEMC |
| hsa-mir-155 | dbDEMC | hsa-mir-210 | dbDEMC and miR2Disease |
| hsa-mir-182 | dbDEMC and miR2Disease | hsa-mir-296 | unconfirmed |
| hsa-mir-125b | unconfirmed | hsa-mir-196a | dbDEMC |
| hsa-mir-34a | dbDEMC | hsa-let-7g | dbDEMC |
| hsa-mir-17 | miR2Disease | hsa-mir-19a | dbDEMC |
| hsa-mir-146a | dbDEMC | hsa-mir-199a | dbDEMC and miR2Disease |
| hsa-mir-145 | dbDEMC | hsa-mir-133a | unconfirmed |
| hsa-let-7c | dbDEMC | hsa-mir-29b | dbDEMC and miR2Disease |
| hsa-mir-9 | dbDEMC | hsa-mir-19b | dbDEMC and miR2Disease |
| hsa-mir-367 | unconfirmed | hsa-mir-25 | dbDEMC |
| hsa-let-7b | unconfirmed | hsa-mir-223 | dbDEMC |
| hsa-mir-29a | dbDEMC and miR2Disease | hsa-mir-106b | dbDEMC and miR2Disease |
| hsa-mir-181a | dbDEMC | hsa-mir-146b | dbDEMC |
| hsa-mir-222 | dbDEMC | hsa-mir-193b | dbDEMC |
| hsa-mir-221 | unconfirmed | hsa-mir-302c | unconfirmed |
| hsa-mir-203 | dbDEMC | hsa-mir-99a | dbDEMC |
| hsa-mir-126 | dbDEMC and miR2Disease | hsa-mir-195 | dbDEMC |
| hsa-let-7d | dbDEMC | hsa-mir-205 | unconfirmed |
| hsa-mir-199b | dbDEMC | hsa-mir-148a | dbDEMC |
| hsa-mir-200b | dbDEMC and miR2Disease | hsa-mir-130a | dbDEMC |
| hsa-let-7f | dbDEMC and miR2Disease | hsa-mir-181b | dbDEMC |
EDTMDA was implemented to predict potential miRNAs associated with Breast Neoplasms, treating it as a new disease: all known associations involving Breast Neoplasms in the HMDD V2.0 database were removed before prediction.
The top 50 predicted miRNAs were verified in dbDEMC, miR2Disease and HMDD V2.0. The first column records top 1–25 related miRNAs and the third column records the top 26–50 related miRNAs.
| miRNA | evidence | miRNA | evidence |
|---|---|---|---|
| hsa-mir-210 | dbDEMC;miR2Disease; HMDD V2.0 | hsa-mir-155 | dbDEMC;miR2Disease; HMDD V2.0 |
| hsa-mir-31 | dbDEMC;miR2Disease; HMDD V2.0 | hsa-mir-15a | dbDEMC;HMDD V2.0 |
| hsa-mir-134 | dbDEMC | hsa-mir-132 | dbDEMC;HMDD V2.0 |
| hsa-mir-122 | dbDEMC;HMDD V2.0 | hsa-mir-218 | dbDEMC;HMDD V2.0 |
| hsa-mir-221 | dbDEMC;miR2Disease; HMDD V2.0 | hsa-mir-222 | dbDEMC;miR2Disease; HMDD V2.0 |
| hsa-mir-133a | dbDEMC;HMDD V2.0 | hsa-mir-137 | dbDEMC;HMDD V2.0 |
| hsa-mir-196a | dbDEMC;miR2Disease; HMDD V2.0 | hsa-mir-29b | dbDEMC;miR2Disease; HMDD V2.0 |
| hsa-mir-7 | dbDEMC;miR2Disease; HMDD V2.0 | hsa-mir-15b | dbDEMC |
| hsa-mir-34a | dbDEMC;HMDD V2.0 | hsa-mir-20a | miR2Disease;HMDD V2.0 |
| hsa-mir-125b | miR2Disease;HMDD V2.0 | hsa-mir-96 | dbDEMC;miR2Disease; HMDD V2.0 |
| hsa-mir-16 | dbDEMC;HMDD V2.0 | hsa-mir-205 | dbDEMC;miR2Disease; HMDD V2.0 |
| hsa-mir-1 | dbDEMC;HMDD V2.0 | hsa-mir-200c | dbDEMC;miR2Disease; HMDD V2.0 |
| hsa-mir-26a | dbDEMC;miR2Disease; HMDD V2.0 | hsa-mir-326 | dbDEMC;HMDD V2.0 |
| hsa-mir-146a | dbDEMC;miR2Disease; HMDD V2.0 | hsa-mir-34b | dbDEMC;HMDD V2.0 |
| hsa-mir-29c | dbDEMC;miR2Disease; HMDD V2.0 | hsa-mir-200a | dbDEMC;miR2Disease; HMDD V2.0 |
| hsa-mir-223 | dbDEMC;HMDD V2.0 | hsa-mir-148a | dbDEMC;miR2Disease; HMDD V2.0 |
| hsa-mir-206 | dbDEMC;miR2Disease; HMDD V2.0 | hsa-mir-29a | dbDEMC;HMDD V2.0 |
| hsa-mir-142 | unconfirmed | hsa-mir-302b | dbDEMC;HMDD V2.0 |
| hsa-mir-9 | dbDEMC;miR2Disease; HMDD V2.0 | hsa-mir-34c | dbDEMC;HMDD V2.0 |
| hsa-mir-21 | dbDEMC;miR2Disease; HMDD V2.0 | hsa-mir-30b | dbDEMC;HMDD V2.0 |
| hsa-mir-200b | dbDEMC;miR2Disease; HMDD V2.0 | hsa-mir-182 | dbDEMC;miR2Disease; HMDD V2.0 |
| hsa-mir-199a | dbDEMC;HMDD V2.0 | hsa-mir-1207 | unconfirmed |
| hsa-mir-224 | dbDEMC;HMDD V2.0 | hsa-mir-302a | dbDEMC;HMDD V2.0 |
| hsa-mir-145 | dbDEMC;miR2Disease; HMDD V2.0 | hsa-mir-10b | dbDEMC;miR2Disease; HMDD V2.0 |
| hsa-mir-124 | dbDEMC;HMDD V2.0 | hsa-mir-150 | dbDEMC |
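The "new disease" setting used for the Breast Neoplasms case study above can be sketched as masking one disease column of the known-association matrix before training. This is a minimal illustration on a toy 0/1 matrix; the function name `hide_disease` and the data are hypothetical.

```python
def hide_disease(adjacency, disease_idx):
    """Zero out one disease column of a miRNA x disease 0/1 matrix.

    Returns the masked copy and the miRNA indices whose associations
    with that disease were hidden (useful for later validation).
    """
    masked = [[0 if j == disease_idx else v for j, v in enumerate(row)]
              for row in adjacency]
    hidden = [i for i, row in enumerate(adjacency) if row[disease_idx] == 1]
    return masked, hidden


# Toy matrix: 3 miRNAs (rows) x 3 diseases (columns); disease 2 is "new".
toy = [[1, 0, 1],
       [0, 1, 1],
       [1, 0, 0]]
masked, hidden = hide_disease(toy, 2)
print(masked)  # [[1, 0, 0], [0, 1, 0], [1, 0, 0]]
print(hidden)  # [0, 1]
```

The model is then trained on `masked`, and the hidden associations serve as one source of confirmation for the ranked predictions, which is why HMDD V2.0 appears in the evidence column of this table.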
EDTMDA was implemented to predict potential miRNAs related to Carcinoma Hepatocellular based on known associations in HMDD V1.0 database.
The top 50 predicted miRNAs were verified in dbDEMC, miR2Disease and HMDD V2.0. The first column records top 1–25 related miRNAs and the third column records the top 26–50 related miRNAs.
| miRNA | evidence | miRNA | evidence |
|---|---|---|---|
| hsa-mir-146b | HMDD V2.0 | hsa-mir-29a | dbDEMC;HMDD V2.0 |
| hsa-mir-155 | dbDEMC;miR2Disease; HMDD V2.0 | hsa-mir-194 | dbDEMC;miR2Disease |
| hsa-mir-128b | miR2Disease | hsa-let-7i | dbDEMC;HMDD V2.0 |
| hsa-mir-106b | dbDEMC;miR2Disease; HMDD V2.0 | hsa-mir-93 | dbDEMC;miR2Disease; HMDD V2.0 |
| hsa-mir-126 | dbDEMC;miR2Disease; HMDD V2.0 | hsa-mir-34b | unconfirmed |
| hsa-mir-143 | dbDEMC;miR2Disease | hsa-mir-30c | miR2Disease;HMDD V2.0 |
| hsa-mir-210 | dbDEMC;HMDD V2.0 | hsa-mir-429 | unconfirmed |
| hsa-mir-141 | miR2Disease;HMDD V2.0 | hsa-mir-135b | unconfirmed |
| hsa-let-7a | dbDEMC;miR2Disease; HMDD V2.0 | hsa-mir-15a | dbDEMC;miR2Disease; HMDD V2.0 |
| hsa-mir-132 | miR2Disease | hsa-mir-30d | dbDEMC;HMDD V2.0 |
| hsa-mir-25 | dbDEMC;miR2Disease; HMDD V2.0 | hsa-mir-205 | miR2Disease;HMDD V2.0 |
| hsa-let-7g | miR2Disease;HMDD V2.0 | hsa-mir-153 | unconfirmed |
| hsa-mir-29b | dbDEMC;HMDD V2.0 | hsa-mir-383 | unconfirmed |
| hsa-mir-214 | dbDEMC;miR2Disease; HMDD V2.0 | hsa-mir-196b | unconfirmed |
| hsa-let-7d | miR2Disease;HMDD V2.0 | hsa-mir-200c | HMDD V2.0 |
| hsa-mir-181b | dbDEMC;miR2Disease; HMDD V2.0 | hsa-mir-451 | dbDEMC |
| hsa-mir-24 | miR2Disease;HMDD V2.0 | hsa-mir-219 | miR2Disease;HMDD V2.0 |
| hsa-let-7b | miR2Disease;HMDD V2.0 | hsa-mir-7 | HMDD V2.0 |
| hsa-let-7f | miR2Disease;HMDD V2.0 | hsa-mir-151 | miR2Disease |
| hsa-let-7c | dbDEMC;miR2Disease; HMDD V2.0 | hsa-mir-30e | miR2Disease |
| hsa-mir-9 | miR2Disease | hsa-mir-192 | miR2Disease;HMDD V2.0 |
| hsa-mir-191 | dbDEMC;HMDD V2.0 | hsa-mir-103 | miR2Disease |
| hsa-mir-16 | dbDEMC;miR2Disease; HMDD V2.0 | hsa-mir-26b | dbDEMC;miR2Disease |
| hsa-mir-29c | dbDEMC;HMDD V2.0 | hsa-mir-218 | HMDD V2.0 |
| hsa-mir-34c | HMDD V2.0 | hsa-mir-339 | unconfirmed |
Number of validated miRNAs among the top 10 and top 50 predictions in each case study, under true labels and under label randomization.
| Case study | Top 10 & true labels | Top 10 & label randomization | Top 50 & true labels | Top 50 & label randomization |
|---|---|---|---|---|
| The 1st type of case study for Esophageal Neoplasms | 10 | 4 | 47 | 26 |
| The 1st type of case study for Kidney Neoplasms | 9 | 5 | 43 | 22 |
| The 2nd type of case study for Breast Neoplasms | 10 | 5 | 48 | 36 |
| The 3rd type of case study for Carcinoma Hepatocellular | 10 | 5 | 44 | 33 |
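The label-randomization control in the table above compares predictions from the true associations with predictions from shuffled ones. The record does not describe how the shuffling was done, so the sketch below shows one plausible scheme as an assumption: permuting the entries of the 0/1 association matrix, which keeps the number of positive labels unchanged. The function name and toy matrix are hypothetical.

```python
import random


def randomize_labels(adjacency, seed=0):
    """Shuffle all entries of a 0/1 matrix, preserving its shape and label counts."""
    flat = [v for row in adjacency for v in row]
    random.Random(seed).shuffle(flat)
    n_cols = len(adjacency[0])
    return [flat[k:k + n_cols] for k in range(0, len(flat), n_cols)]


# Toy matrix: 4 miRNAs x 3 diseases with three known associations.
toy_labels = [[1, 0, 0],
              [0, 1, 0],
              [0, 0, 0],
              [1, 0, 0]]
shuffled = randomize_labels(toy_labels, seed=0)
n_true = sum(map(sum, toy_labels))
n_shuffled = sum(map(sum, shuffled))
```

Retraining on `shuffled` and counting how many top-ranked miRNAs are still validated gives the "label randomization" columns; a clear drop relative to the true labels suggests the model is learning from the association structure.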