Junlin Xu, Lijun Cai, Bo Liao, Wen Zhu, Peng Wang, Yajie Meng, Jidong Lang, Geng Tian, Jialiang Yang.
Abstract
In recent years, miRNAs have been verified to play an irreplaceable role in biological processes associated with human disease. Discovering potential disease-related miRNAs helps explain the underlying pathogenesis of a disease at the molecular level. Given the high cost and labor intensity of biological experiments, computational prediction is an indispensable complement. We therefore design a new model based on probabilistic matrix factorization, called PMFMDA. Specifically, we first integrate miRNA similarity and disease similarity. Next, the known association matrix and the integrated similarity matrices are used to build a probabilistic matrix factorization algorithm that identifies potentially relevant miRNAs for each disease. PMFMDA achieves reliable performance under both global leave-one-out cross validation (LOOCV) and 5-fold cross validation (AUCs of 0.9237 and 0.9187, respectively) on the HMDD (V2.0) dataset, outperforming several state-of-the-art methods, including CMFMDA, IMCMDA, NCPMDA, RLSMDA, and RWRMDA. In addition, case studies show that PMFMDA has good predictive performance for new associations, and supporting evidence for them can be found by literature mining.
Keywords: association prediction; diseases; miRNAs; probabilistic matrix factorization; receiver operating characteristic curve (ROC)
Year: 2019 PMID: 31921290 PMCID: PMC6918542 DOI: 10.3389/fgene.2019.01234
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
Figure 1. The workflow of PMFMDA for inferring unknown disease-associated miRNAs.
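The factorization step of this workflow can be illustrated with a minimal sketch. It assumes a plain probabilistic matrix factorization fitted by stochastic gradient descent on the known association matrix alone. The similarity integration that PMFMDA adds is omitted, and the function names (`pmf_factorize`, `predict`), the rank, the learning rate, and the regularization weight are illustrative choices rather than values taken from the paper.

```python
import random


def pmf_factorize(Y, rank=2, lam=0.01, lr=0.05, epochs=500, seed=0):
    """Factorize a binary association matrix Y (list of rows) as U * V^T.

    Assumption: this is the generic PMF objective
        sum_ij (Y_ij - u_i . v_j)^2 + lam * (|u_i|^2 + |v_j|^2),
    i.e. the MAP estimate under Gaussian priors. PMFMDA's similarity
    terms are NOT modeled here.
    """
    rng = random.Random(seed)
    n, m = len(Y), len(Y[0])
    # Small random initialization of the latent factors.
    U = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(n)]
    V = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(m)]
    for _ in range(epochs):
        for i in range(n):
            for j in range(m):
                pred = sum(U[i][k] * V[j][k] for k in range(rank))
                err = Y[i][j] - pred
                for k in range(rank):
                    u, v = U[i][k], V[j][k]  # use the pre-update values
                    U[i][k] += lr * (err * v - lam * u)
                    V[j][k] += lr * (err * u - lam * v)
    return U, V


def predict(U, V):
    """Return the score matrix U * V^T as a list of rows."""
    return [[sum(a * b for a, b in zip(u, v)) for v in V] for u in U]
```

In such a setup, unknown miRNA-disease pairs would be ranked by their reconstructed scores, with higher scores read as more likely associations.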
Figure 2. The ROC curves for PMFMDA and benchmark algorithms for 5-fold CV and global LOOCV.
Figure 3. The PR curves for PMFMDA and benchmark algorithms for 5-fold CV.
Comparison of AUPR values obtained by PMFMDA and benchmark algorithms on novel diseases.
| Disease name | PMFMDA | IMCMDA | CMFMDA | NCPMDA | RLSMDA |
|---|---|---|---|---|---|
| Melanoma | 0.7149 | 0.6757 | 0.4574 | 0.6785 | 0.6940 |
| Breast tumor | 0.7895 | 0.7752 | 0.6135 | 0.7866 | 0.7749 |
| Colorectal tumor | 0.6585 | 0.6333 | 0.4725 | 0.5714 | 0.5315 |
| Glioblastoma | 0.5940 | 0.5076 | 0.4540 | 0.4779 | 0.4028 |
| Heart failure | 0.5956 | 0.6284 | 0.4510 | 0.6182 | 0.5510 |
| Prostatic tumor | 0.6578 | 0.5881 | 0.5963 | 0.5873 | 0.5208 |
| Stomach tumor | 0.6981 | 0.6438 | 0.5231 | 0.6269 | 0.6081 |
| Bladder tumor | 0.6409 | 0.5388 | 0.5051 | 0.5505 | 0.5255 |
| Mean | 0.6687 | 0.6237 | 0.5091 | 0.6121 | 0.5761 |
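For readers unfamiliar with the metric, AUPR values like those in the table can be approximated with the step-wise area under the precision-recall curve (average precision). The sketch below is a generic implementation, not the authors' evaluation code; the function name `average_precision` and the way ties are handled are assumptions.

```python
def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve.

    labels: 1 for a true miRNA-disease association, 0 otherwise.
    scores: predicted association scores, same length as labels.
    Assumption: ties in scores are broken by input order.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    # Rank candidates from highest to lowest score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i]:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / n_pos
```

A perfect ranking yields 1.0, and the value drops as true associations are ranked below false ones.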
Number of known associations correctly recovered by PMFMDA at different ranking thresholds for 8 common diseases.
| Disease | No. of known associated miRNAs | Top 20 | Top 40 | Top 60 | Top 80 | Top 100 |
|---|---|---|---|---|---|---|
| Breast neoplasms | 202 | 20 | 38 | 54 | 74 | 91 |
| Colorectal neoplasms | 147 | 17 | 30 | 45 | 58 | 70 |
| Glioblastoma | 96 | 17 | 30 | 36 | 43 | 53 |
| Heart failure | 120 | 17 | 28 | 39 | 51 | 58 |
| Melanoma | 141 | 19 | 35 | 51 | 63 | 77 |
| Prostatic neoplasms | 118 | 17 | 32 | 43 | 56 | 65 |
| Stomach neoplasms | 173 | 15 | 32 | 49 | 63 | 79 |
| Urinary bladder neoplasms | 92 | 18 | 31 | 42 | 51 | 55 |
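Counts of this kind can be obtained by ranking all candidate miRNAs for a disease by predicted score and counting how many known associations fall within each cutoff. The following sketch shows the idea; the function name `hits_at_thresholds` and the dictionary-based input format are assumptions made for illustration.

```python
def hits_at_thresholds(scores, known, thresholds=(20, 40, 60, 80, 100)):
    """Count known associations among the top-k ranked candidates.

    scores: dict mapping miRNA name -> predicted score for one disease.
    known:  set of miRNA names with a known association to that disease.
    Returns a dict mapping each threshold k to the number of hits in the top k.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    return {k: sum(1 for m in ranked[:k] if m in known) for k in thresholds}
```

Read this way, the breast neoplasms row means that all of the top 20 and 91 of the top 100 ranked miRNAs are known associations.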
The performance of PMFMDA and the baseline methods based on 5-fold CV on the MNDRV2.0 dataset.
| Metric | PMFMDA | CMFMDA | IMCMDA | NCPMDA | RLSMDA | RWRMDA |
|---|---|---|---|---|---|---|
| AUC | 0.9885 | 0.9799 | 0.9171 | 0.9480 | 0.9358 | 0.9055 |
| AUPR | 0.5174 | 0.5047 | 0.3865 | 0.2045 | 0.2818 | 0.1907 |
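A 5-fold CV protocol of this kind is commonly set up by splitting the known associations into five groups and hiding one group at a time during training. The sketch below shows one such split, assuming a binary association matrix stored as a list of rows; the function name `five_fold_masks` and the seeding scheme are hypothetical, and the paper's exact protocol may differ.

```python
import random


def five_fold_masks(Y, n_folds=5, seed=0):
    """Yield (train_matrix, held_out_pairs) for each cross-validation fold.

    Y is a binary association matrix (list of rows). In each fold, the
    held-out known associations are set to 0 in a copy of Y, so a model
    trained on the copy has to recover them. Y itself is not modified.
    """
    known = [(i, j) for i, row in enumerate(Y) for j, y in enumerate(row) if y == 1]
    random.Random(seed).shuffle(known)
    for f in range(n_folds):
        held_out = known[f::n_folds]  # every n_folds-th pair, starting at f
        train = [row[:] for row in Y]
        for i, j in held_out:
            train[i][j] = 0
        yield train, held_out
```

Scores predicted for the held-out pairs are then ranked against the unknown pairs to compute metrics such as AUC and AUPR.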
Parameter tuning for PMFMDA based on 5-fold CV (AUC).
| | | | |
|---|---|---|---|
| | 0.7905 | 0.7728 | 0.7588 |
| | 0.9040 | 0.8507 | 0.8381 |
| | 0.9185 | 0.9032 | 0.8692 |
Figure 4. Performance evaluation of PMFMDA under 5-fold cross validation in two settings: (1) PMFMDA with similarity information; (2) PMFMDA without similarity information.
Top 10 miRNA candidates inferred by PMFMDA for the three selected diseases.
| Disease | Number of miRNAs confirmed by the literature | Rank | miRNA | Evidence | Rank | miRNA | Evidence |
|---|---|---|---|---|---|---|---|
| Esophageal neoplasms | 10 | 1 | mir-17 | dbDEMC | 6 | mir-1 | dbDEMC |
| | | 2 | mir-18a | dbDEMC | 7 | mir-200b | dbDEMC |
| | | 3 | mir-221 | dbDEMC | 8 | mir-222 | dbDEMC |
| | | 4 | mir-16 | dbDEMC | 9 | mir-29a | dbDEMC |
| | | 5 | mir-19b | dbDEMC | 10 | mir-133b | dbDEMC |
| Breast neoplasms | 9 | 1 | mir-142 | miRCancer | 6 | mir-138 | dbDEMC |
| | | 2 | mir-150 | dbDEMC, miRCancer | 7 | mir-15b | dbDEMC |
| | | 3 | mir-106a | dbDEMC | 8 | mir-192 | dbDEMC |
| | | 4 | mir-99a | dbDEMC, miRCancer | 9 | mir-378a | Unconfirmed |
| | | 5 | mir-130a | dbDEMC | 10 | mir-196b | dbDEMC |
| Lung neoplasms | 9 | 1 | mir-16 | dbDEMC | 6 | mir-99a | dbDEMC |
| | | 2 | hsa-mir-15a | dbDEMC | 7 | mir-429 | dbDEMC, miRCancer |
| | | 3 | hsa-mir-106b | dbDEMC | 8 | mir-302b | dbDEMC, miRCancer |
| | | 4 | mir-195 | dbDEMC, miRCancer | 9 | mir-130a | dbDEMC |
| | | 5 | mir-141 | dbDEMC | 10 | mir-296 | Unconfirmed |
Figure 5. The network of the top 20 predicted associations for the three selected diseases via PMFMDA.