Xing Chen, Li Huang, Di Xie, Qi Zhao.
Abstract
An increasing number of studies have identified associations between microRNAs (miRNAs) and human diseases, and discovering new ones is an ongoing process in medical laboratories. To improve experimental productivity, researchers computationally infer potential associations from biological data and select the most promising candidates for experimental verification. Predicting potential miRNA-disease associations has therefore become a research area of growing importance. This paper presents a model of Extreme Gradient Boosting Machine for MiRNA-Disease Association (EGBMMDA) prediction that integrates miRNA functional similarity, disease semantic similarity, and known miRNA-disease associations. Statistical measures, graph theoretical measures, and matrix factorization results were calculated for each miRNA-disease pair and used to form an informative feature vector. The vectors for known associated pairs obtained from the HMDD v2.0 database were used to train a regression tree under the gradient boosting framework. EGBMMDA was the first decision tree learning-based model used for predicting miRNA-disease associations. AUCs of 0.9123 and 0.8221 in global and local leave-one-out cross-validation, respectively, demonstrated the model's reliable performance. Moreover, an AUC of 0.9048 ± 0.0012 in fivefold cross-validation confirmed its stability. We carried out three different types of case studies, predicting potential miRNAs related to Colon Neoplasms, Lymphoma, Prostate Neoplasms, Breast Neoplasms, and Esophageal Neoplasms. The results indicated that 98%, 90%, 98%, 100%, and 98% of the top 50 predictions for the five diseases, respectively, were confirmed by experiments. Therefore, EGBMMDA appears to be a useful computational resource for miRNA-disease association prediction.
Year: 2018 PMID: 29305594 PMCID: PMC5849212 DOI: 10.1038/s41419-017-0003-x
Source DB: PubMed Journal: Cell Death Dis Impact factor: 8.469
Fig. 1 Performance comparisons between EGBMMDA and eight previous disease–miRNA association prediction models (RLSMDA, MiRAI, MCMDA, HGIMDA, WBSMDA, MIDP, RWRMDA, and HDMP) in terms of ROC curves and AUCs based on local and global LOOCV. EGBMMDA achieved AUCs of 0.9123 and 0.8221 in global and local LOOCV, respectively, surpassing all the previous models.
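The AUC values quoted in Fig. 1 summarize how highly held-out known associations are ranked against candidate pairs without known evidence. As a minimal sketch of that computation (not the authors' evaluation code), the pairwise-ranking form of the AUC can be written in plain Python. The function name `rank_auc` and the toy scores are illustrative assumptions.

```python
def rank_auc(positive_scores, negative_scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    if not positive_scores or not negative_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Toy scores, invented for illustration only.
held_out = [0.9, 0.3]      # scores given to left-out known associations
candidates = [0.5, 0.1]    # scores given to pairs without known evidence
print(rank_auc(held_out, candidates))  # 0.75 (3 of 4 pairs ranked correctly)
```

The double loop is quadratic in the number of scores, which is acceptable for illustration; a sort-based ranking would be the usual choice for full-size association matrices.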
Prediction of the top 50 predicted miRNAs associated with Colon Neoplasms based on known associations in HMDD database
| miRNA (top 1–25) | Evidence | miRNA (top 26–50) | Evidence |
|---|---|---|---|
| hsa-mir-29a | dbDEMC;miR2Disease | hsa-let-7c | dbDEMC |
| hsa-mir-29b | dbDEMC;miR2Disease | hsa-mir-222 | dbDEMC |
| hsa-let-7a | dbDEMC;miR2Disease | hsa-mir-199a | 23292866 |
| hsa-mir-143 | dbDEMC;miR2Disease | hsa-mir-29c | dbDEMC |
| hsa-mir-150 | 25230975 | hsa-mir-19a | dbDEMC;miR2Disease |
| hsa-mir-15a | dbDEMC | hsa-mir-142 | 23619912 |
| hsa-mir-16 | dbDEMC | hsa-mir-181a | dbDEMC;miR2Disease |
| hsa-mir-21 | dbDEMC;miR2Disease | hsa-mir-125a | dbDEMC;miR2Disease |
| hsa-mir-1 | dbDEMC;miR2Disease | hsa-mir-196a | dbDEMC;miR2Disease |
| hsa-mir-133a | dbDEMC;miR2Disease | hsa-mir-141 | dbDEMC;miR2Disease |
| hsa-mir-146a | dbDEMC | hsa-mir-133b | dbDEMC;miR2Disease |
| hsa-mir-155 | dbDEMC;miR2Disease | hsa-mir-10b | dbDEMC;miR2Disease |
| hsa-mir-200b | dbDEMC | hsa-mir-181b | dbDEMC;miR2Disease |
| hsa-mir-200c | dbDEMC;miR2Disease | hsa-mir-182 | dbDEMC;miR2Disease |
| hsa-mir-20a | dbDEMC;miR2Disease | hsa-mir-183 | dbDEMC;miR2Disease |
| hsa-mir-210 | dbDEMC | hsa-mir-192 | dbDEMC;miR2Disease |
| hsa-mir-221 | dbDEMC;miR2Disease | hsa-mir-195 | dbDEMC;miR2Disease |
| hsa-mir-223 | dbDEMC;miR2Disease | hsa-mir-200a | Unconfirmed |
| hsa-mir-31 | dbDEMC;miR2Disease | hsa-mir-203 | dbDEMC;miR2Disease |
| hsa-mir-92a | 21883694 | hsa-mir-205 | dbDEMC |
| hsa-mir-125b | dbDEMC | hsa-mir-34b | dbDEMC;miR2Disease |
| hsa-mir-18a | dbDEMC;miR2Disease | hsa-mir-93 | dbDEMC;miR2Disease |
| hsa-mir-19b | dbDEMC;miR2Disease | hsa-let-7e | dbDEMC |
| hsa-mir-34a | dbDEMC;miR2Disease | hsa-mir-101 | 22353936 |
| hsa-let-7b | dbDEMC;miR2Disease | hsa-mir-146b | 26178670 |
The first column lists the top 1–25 related miRNAs; the third column lists the top 26–50. Evidence for each association is either a database record or the PMID of an experimental publication.
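The confirmation rate reported for each case study follows directly from the Evidence column: every entry not marked Unconfirmed counts as confirmed. A small sketch of that arithmetic is given here, using a hypothetical helper `confirmed_fraction` and collapsing the evidence labels into a single placeholder string.

```python
def confirmed_fraction(evidence):
    """Fraction of predictions whose evidence entry is anything but 'Unconfirmed'."""
    confirmed = sum(1 for e in evidence if e.strip().lower() != "unconfirmed")
    return confirmed / len(evidence)


# Colon Neoplasms table above: 49 entries carry evidence, 1 is Unconfirmed (hsa-mir-200a).
# The 49 supported entries are collapsed into one placeholder label here.
colon = ["supported"] * 49 + ["Unconfirmed"]
print(f"{confirmed_fraction(colon):.0%}")  # 98%
```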
Prediction of the top 50 predicted miRNAs associated with Lymphoma based on known associations in HMDD database
| miRNA (top 1–25) | Evidence | miRNA (top 26–50) | Evidence |
|---|---|---|---|
| hsa-mir-196a | dbDEMC | hsa-mir-223 | dbDEMC |
| hsa-mir-29a | dbDEMC | hsa-mir-25 | dbDEMC |
| hsa-mir-29b | dbDEMC | hsa-mir-26b | dbDEMC |
| hsa-let-7a | dbDEMC | hsa-mir-31 | dbDEMC |
| hsa-mir-141 | dbDEMC | hsa-mir-34b | dbDEMC |
| hsa-mir-143 | dbDEMC | hsa-mir-429 | Unconfirmed |
| hsa-mir-145 | dbDEMC | hsa-mir-93 | dbDEMC |
| hsa-mir-1 | dbDEMC | hsa-let-7e | dbDEMC |
| hsa-mir-133a | dbDEMC | hsa-mir-125b | 23527180 |
| hsa-mir-103a | Unconfirmed | hsa-mir-146b | 24931464 |
| hsa-mir-106a | dbDEMC | hsa-mir-148a | dbDEMC |
| hsa-mir-10b | dbDEMC | hsa-mir-196b | Unconfirmed |
| hsa-mir-151a | Unconfirmed | hsa-mir-219 | dbDEMC |
| hsa-mir-152 | dbDEMC | hsa-mir-27a | dbDEMC |
| hsa-mir-181b | dbDEMC | hsa-mir-27b | dbDEMC |
| hsa-mir-182 | dbDEMC | hsa-mir-30a | dbDEMC |
| hsa-mir-183 | dbDEMC | hsa-mir-30b | dbDEMC |
| hsa-mir-191 | dbDEMC | hsa-mir-30c | dbDEMC |
| hsa-mir-192 | dbDEMC | hsa-mir-338 | dbDEMC |
| hsa-mir-193b | 22235305 | hsa-mir-34a | dbDEMC |
| hsa-mir-194 | dbDEMC | hsa-mir-378a | Unconfirmed |
| hsa-mir-195 | dbDEMC | hsa-mir-7 | dbDEMC |
| hsa-mir-204 | dbDEMC | hsa-mir-100 | dbDEMC |
| hsa-mir-205 | dbDEMC | hsa-mir-214 | dbDEMC |
| hsa-mir-221 | dbDEMC | hsa-mir-99a | dbDEMC |
The first column lists the top 1–25 related miRNAs; the third column lists the top 26–50. Evidence for each association is either a database record or the PMID of an experimental publication.
Prediction of the top 50 predicted miRNAs associated with Prostate Neoplasms based on known associations in HMDD database
| miRNA (top 1–25) | Evidence | miRNA (top 26–50) | Evidence |
|---|---|---|---|
| hsa-mir-125a | dbDEMC;miR2Disease | hsa-mir-34c | dbDEMC |
| hsa-mir-196a | dbDEMC | hsa-mir-9 | dbDEMC |
| hsa-mir-141 | miR2Disease | hsa-mir-26a | dbDEMC;miR2Disease |
| hsa-mir-133b | dbDEMC | hsa-mir-206 | dbDEMC |
| hsa-mir-181b | dbDEMC;miR2Disease | hsa-let-7f | dbDEMC;miR2Disease |
| hsa-mir-182 | dbDEMC;miR2Disease | hsa-let-7g | dbDEMC;miR2Disease |
| hsa-mir-195 | dbDEMC;miR2Disease | hsa-let-7i | dbDEMC |
| hsa-mir-200a | dbDEMC | hsa-mir-486 | 27877055 |
| hsa-mir-203 | 21159887 | hsa-mir-122 | Unconfirmed |
| hsa-mir-205 | dbDEMC;miR2Disease | hsa-mir-218 | dbDEMC;miR2Disease |
| hsa-mir-34b | dbDEMC | hsa-mir-24 | dbDEMC;miR2Disease |
| hsa-mir-93 | 26124181 | hsa-mir-29a | dbDEMC;miR2Disease |
| hsa-let-7e | dbDEMC | hsa-mir-29b | dbDEMC;miR2Disease |
| hsa-mir-101 | dbDEMC;miR2Disease | hsa-let-7a | dbDEMC;miR2Disease |
| hsa-mir-146b | 21980038 | hsa-mir-143 | dbDEMC;miR2Disease |
| hsa-mir-148a | miR2Disease | hsa-mir-150 | dbDEMC |
| hsa-mir-27a | dbDEMC;miR2Disease | hsa-mir-15a | dbDEMC;miR2Disease |
| hsa-mir-30a | miR2Disease | hsa-mir-16 | dbDEMC;miR2Disease |
| hsa-mir-7 | dbDEMC | hsa-mir-21 | dbDEMC;miR2Disease |
| hsa-mir-100 | dbDEMC;miR2Disease | hsa-mir-1 | dbDEMC |
| hsa-mir-214 | dbDEMC;miR2Disease | hsa-mir-133a | dbDEMC |
| hsa-let-7d | dbDEMC;miR2Disease | hsa-mir-146a | miR2Disease |
| hsa-mir-106b | dbDEMC | hsa-mir-155 | dbDEMC |
| hsa-mir-15b | dbDEMC | hsa-mir-126 | dbDEMC;miR2Disease |
| hsa-mir-124 | dbDEMC | hsa-mir-17 | miR2Disease |
The first column lists the top 1–25 related miRNAs; the third column lists the top 26–50. Evidence for each association is either a database record or the PMID of an experimental publication.
Prediction of the top 50 predicted miRNAs associated with Breast Neoplasms based on known associations in HMDD database
| miRNA (top 1–25) | Evidence | miRNA (top 26–50) | Evidence |
|---|---|---|---|
| hsa-mir-499a | HMDD | hsa-mir-132 | dbDEMC;HMDD |
| hsa-mir-204 | dbDEMC;miR2Disease;HMDD | hsa-mir-137 | dbDEMC;HMDD |
| hsa-mir-26b | dbDEMC;HMDD | hsa-mir-206 | dbDEMC;miR2Disease;HMDD |
| hsa-mir-95 | dbDEMC | hsa-mir-23a | dbDEMC;HMDD |
| hsa-mir-219 | dbDEMC;HMDD | hsa-mir-212 | dbDEMC |
| hsa-mir-342 | dbDEMC;HMDD | hsa-mir-125a | dbDEMC;miR2Disease;HMDD |
| hsa-mir-433 | dbDEMC | hsa-let-7a | dbDEMC;miR2Disease;HMDD |
| hsa-mir-424 | dbDEMC | hsa-mir-141 | dbDEMC;miR2Disease;HMDD |
| hsa-mir-153 | dbDEMC;HMDD | hsa-mir-143 | dbDEMC;miR2Disease;HMDD |
| hsa-mir-181c | dbDEMC | hsa-mir-150 | dbDEMC |
| hsa-mir-140 | dbDEMC;HMDD | hsa-mir-133b | dbDEMC;HMDD |
| hsa-mir-328 | dbDEMC;miR2Disease;HMDD | hsa-mir-106a | dbDEMC |
| hsa-mir-372 | dbDEMC | hsa-mir-10b | dbDEMC;miR2Disease;HMDD |
| hsa-mir-373 | dbDEMC;miR2Disease;HMDD | hsa-mir-126 | dbDEMC;miR2Disease;HMDD |
| hsa-mir-708 | HMDD | hsa-mir-181b | dbDEMC;miR2Disease;HMDD |
| hsa-mir-326 | dbDEMC;HMDD | hsa-mir-182 | dbDEMC;miR2Disease;HMDD |
| hsa-mir-302b | dbDEMC;HMDD | hsa-mir-183 | dbDEMC;HMDD |
| hsa-mir-320a | HMDD | hsa-mir-192 | dbDEMC |
| hsa-mir-506 | HMDD | hsa-mir-195 | dbDEMC;miR2Disease;HMDD |
| hsa-mir-516a | HMDD | hsa-mir-200a | dbDEMC;miR2Disease;HMDD |
| hsa-mir-184 | dbDEMC | hsa-mir-200b | dbDEMC;miR2Disease;HMDD |
| hsa-mir-134 | dbDEMC | hsa-mir-200c | dbDEMC;miR2Disease;HMDD |
| hsa-mir-32 | dbDEMC | hsa-mir-203 | dbDEMC;miR2Disease;HMDD |
| hsa-mir-325 | dbDEMC | hsa-mir-205 | dbDEMC;miR2Disease;HMDD |
| hsa-mir-30b | dbDEMC;HMDD | hsa-mir-223 | dbDEMC;HMDD |
The first column lists the top 1–25 related miRNAs; the third column lists the top 26–50.
Prediction of the top 50 predicted miRNAs associated with Esophageal Neoplasms based on known associations in the older version of the HMDD database
| miRNA (top 1–25) | Evidence | miRNA (top 26–50) | Evidence |
|---|---|---|---|
| hsa-mir-20a | dbDEMC;HMDD | hsa-mir-34a | dbDEMC;HMDD |
| hsa-mir-221 | dbDEMC | hsa-let-7c | dbDEMC;HMDD |
| hsa-mir-155 | dbDEMC;HMDD | hsa-mir-29b | dbDEMC |
| hsa-mir-146a | dbDEMC;HMDD | hsa-mir-19b | dbDEMC |
| hsa-mir-222 | dbDEMC | hsa-mir-126 | dbDEMC;HMDD |
| hsa-mir-150 | dbDEMC;HMDD | hsa-mir-206 | dbDEMC |
| hsa-mir-1 | dbDEMC | hsa-mir-9 | dbDEMC |
| hsa-mir-143 | dbDEMC;HMDD | hsa-mir-96 | dbDEMC |
| hsa-mir-17 | dbDEMC | hsa-mir-141 | dbDEMC;HMDD |
| hsa-mir-125b | dbDEMC | hsa-mir-132 | dbDEMC |
| hsa-mir-16 | dbDEMC | hsa-mir-373 | dbDEMC;miR2Disease |
| hsa-mir-133a | dbDEMC;HMDD | hsa-mir-451 | dbDEMC |
| hsa-mir-181b | dbDEMC | hsa-mir-211 | dbDEMC |
| hsa-mir-92a | HMDD | hsa-mir-142 | dbDEMC |
| hsa-mir-15a | dbDEMC;HMDD | hsa-mir-494 | dbDEMC |
| hsa-mir-18a | dbDEMC | hsa-mir-30c | dbDEMC |
| hsa-let-7d | dbDEMC | hsa-mir-302c | dbDEMC |
| hsa-mir-200b | dbDEMC | hsa-mir-10a | dbDEMC |
| hsa-mir-29a | dbDEMC | hsa-mir-34b | dbDEMC;HMDD |
| hsa-mir-19a | dbDEMC;HMDD | hsa-mir-377 | dbDEMC |
| hsa-mir-145 | dbDEMC;HMDD | hsa-mir-184 | Unconfirmed |
| hsa-let-7b | dbDEMC;HMDD | hsa-mir-23b | dbDEMC |
| hsa-let-7a | dbDEMC;HMDD | hsa-mir-106b | dbDEMC |
| hsa-let-7e | dbDEMC | hsa-mir-199a | dbDEMC;HMDD |
| hsa-mir-223 | dbDEMC;miR2Disease;HMDD | hsa-mir-196a | dbDEMC;miR2Disease;HMDD |
The first column lists the top 1–25 related miRNAs; the third column lists the top 26–50.
Fig. 2 Flowchart of potential miRNA–disease association prediction based on the computational model of EGBMMDA
Feature vector extracted from the miRNA functional similarity matrix, the disease semantic similarity matrix, and the known miRNA–disease association matrix
| Feature type | Feature |
|---|---|
| Type 1 features for each miRNA/disease | For the miRNA |
| | The average of all similarity scores, namely, the average of the |
| | The range of similarity scores [0, 1] was segmented into |
| Type 2 features for each miRNA/disease | Number of neighbors of a node in the unweighted graph version of |
| | The similarity values of the k-nearest neighbors of a node |
| | The average of Type 1 features among the k-nearest neighbors of a node |
| | The average of Type 1 features among the k-nearest neighbors of a node, weighted by the similarity values |
| | Betweenness, closeness, and eigenvector centrality of a node |
| | PageRank score of a node |
| Type 3 features for each miRNA–disease pair | Latent vectors for the miRNA and the disease, obtained by matrix factorization of |
| | The number of associations between an miRNA and a disease's neighbors |
| | The number of associations between a disease and an miRNA's neighbors |
| | Betweenness, closeness, and eigenvector centrality of a node |
| | PageRank score of a node |
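Several of the features listed above are simple statistics over one row of a similarity matrix. The following is a rough sketch of three of them (average similarity, a histogram over the range [0, 1], and the k-nearest-neighbor similarity values), not the authors' implementation. The function names, the number of bins, the value of k, and the toy scores are all assumptions, since the table does not preserve those details.

```python
def average_similarity(scores):
    """Mean of one node's similarity scores to the other nodes."""
    return sum(scores) / len(scores)


def similarity_histogram(scores, n_bins):
    """Proportion of scores falling in each of n_bins equal-width bins over [0, 1]."""
    counts = [0] * n_bins
    for s in scores:
        # A score of exactly 1.0 is placed in the last bin.
        idx = min(int(s * n_bins), n_bins - 1)
        counts[idx] += 1
    return [c / len(scores) for c in counts]


def k_nearest_similarities(scores, k):
    """Similarity values of the k most similar nodes, highest first."""
    return sorted(scores, reverse=True)[:k]


# Toy similarity row for one miRNA (self-similarity excluded); values are invented.
row = [0.1, 0.4, 0.9, 1.0]
print(average_similarity(row))
print(similarity_histogram(row, n_bins=2))   # [0.5, 0.5]
print(k_nearest_similarities(row, k=2))      # [1.0, 0.9]
```

Concatenating such per-miRNA and per-disease values with the pair-level features would give one feature vector per miRNA–disease pair, which is the form a tree learner expects.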
Fig. 3 Tree growing algorithm. The algorithm first grew the tree in a top-down manner to the maximum depth specified by the user, creating 2^depth nodes, and then pruned all the leaves with negative gains in a bottom-up order
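The grow-then-prune procedure in Fig. 3 can be illustrated with a small sketch. It assumes a tree that has already been grown to full depth and whose splits each carry a precomputed gain; the `Node` class, `prune`, and `count_leaves` are hypothetical names, and the gain values are invented. Computing the gains themselves is outside the scope of this sketch.

```python
class Node:
    """Binary tree node; `gain` is the gain of the split made at this node."""

    def __init__(self, gain=0.0, left=None, right=None):
        self.gain = gain
        self.left = left
        self.right = right

    def is_leaf(self):
        return self.left is None and self.right is None


def prune(node):
    """Collapse, bottom-up, every split with negative gain whose children are leaves.

    Assumes each internal node has both a left and a right child.
    """
    if node.is_leaf():
        return node
    prune(node.left)
    prune(node.right)
    if node.left.is_leaf() and node.right.is_leaf() and node.gain < 0:
        node.left = None
        node.right = None
    return node


def count_leaves(node):
    if node.is_leaf():
        return 1
    return count_leaves(node.left) + count_leaves(node.right)


# Depth-2 tree grown to full size: 2**2 = 4 leaves before pruning.
tree = Node(gain=2.0,
            left=Node(gain=-1.0, left=Node(), right=Node()),
            right=Node(gain=0.5, left=Node(), right=Node()))
print(count_leaves(tree))          # 4
print(count_leaves(prune(tree)))   # 3
```

Because pruning proceeds from the leaves upward, a split with negative gain survives in this sketch when a deeper split beneath it is kept, which is one reason to grow to full depth first rather than stop at the first negative gain.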