Ping Xuan, Yihua Dong, Yahong Guo, Tiangang Zhang, Yong Liu.
Abstract
Identifying disease-related microRNAs (disease miRNAs) helps in understanding and exploring the etiology and pathogenesis of diseases. Most recent methods predict disease miRNAs by integrating the similarities and associations of miRNAs and diseases. However, these methods fail to learn the deep features of the miRNA similarities, the disease similarities, and the miRNA–disease associations. We propose a dual convolutional neural network-based method for predicting candidate disease miRNAs and refer to it as CNNDMP. CNNDMP not only exploits the similarities and associations of miRNAs and diseases, but also captures the topological structures of the miRNA and disease networks. An embedding layer is constructed by incorporating the biological premises about miRNA–disease associations. A new framework based on the dual convolutional neural network is presented for extracting deep feature representations of the associations. The left part of the framework focuses on integrating the original similarities and associations of miRNAs and diseases. Novel miRNA and disease similarities that contain the topological structures are obtained by random walks on the miRNA and disease networks, and their deep features are learned by the right part of the framework. CNNDMP achieves better prediction performance than several state-of-the-art methods in cross-validation. Case studies on breast cancer, colorectal cancer, and lung cancer further demonstrate CNNDMP's ability to discover potential disease miRNAs.
Keywords: convolutional neural network; miRNA–disease association; network topology structure; random walk
Year: 2018 PMID: 30477152 PMCID: PMC6321160 DOI: 10.3390/ijms19123732
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 5.923
ROC-AUCs and PR-AUCs at different values of .
| | 0.1 | 0.2 | 0.3 | 0.4 | 0.5 | 0.6 | 0.7 | 0.8 | 0.9 |
|---|---|---|---|---|---|---|---|---|---|
| ROC-AUC | 0.890 | 0.918 | 0.934 | 0.939 | 0.946 | 0.950 | 0.952 | 0.954 | 0.956 |
| PR-AUC | 0.340 | 0.401 | 0.442 | 0.462 | 0.491 | 0.503 | 0.513 | 0.521 | 0.538 |
Prediction results of CNNDMP and the other four methods for 15 diseases in terms of ROC-AUCs.
| Disease Name | CNNDMP | GSTRW | DMPred | PBMDA | Liu’s Method |
|---|---|---|---|---|---|
| Breast neoplasm | 0.987 | 0.822 | 0.938 | 0.852 | 0.863 |
| Hepatocellular carcinoma | 0.986 | 0.779 | 0.900 | 0.803 | 0.845 |
| Renal cell carcinoma | 0.950 | 0.816 | 0.903 | 0.813 | 0.832 |
| Squamous cell carcinoma | 0.936 | 0.817 | 0.908 | 0.881 | 0.890 |
| Colorectal neoplasm | 0.910 | 0.737 | 0.842 | 0.826 | 0.857 |
| Glioblastoma | 0.926 | 0.814 | 0.904 | 0.803 | 0.842 |
| Heart failure | 0.972 | 0.817 | 0.987 | 0.791 | 0.828 |
| Acute myeloid leukemia | 0.961 | 0.788 | 0.890 | 0.844 | 0.874 |
| Lung neoplasm | 0.962 | 0.791 | 0.948 | 0.905 | 0.920 |
| Melanoma | 0.978 | 0.789 | 0.913 | 0.836 | 0.860 |
| Ovarian neoplasm | 0.958 | 0.830 | 0.929 | 0.889 | 0.897 |
| Pancreatic neoplasm | 0.945 | 0.838 | 0.916 | 0.891 | 0.904 |
| Prostatic neoplasm | 0.964 | 0.822 | 0.951 | 0.843 | 0.855 |
| Stomach neoplasm | 0.954 | 0.762 | 0.908 | 0.821 | 0.836 |
| Urinary bladder neoplasm | 0.956 | 0.816 | 0.919 | 0.854 | 0.865 |
| Average AUC | 0.956 | 0.802 | 0.917 | 0.844 | 0.865 |
Figure 1. Receiver operating characteristic (ROC) curves of CNNDMP and the other four methods. AUC = area under the curve.
Figure 2. Precision–recall (PR) curves of CNNDMP and the other four methods.
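The two summary metrics reported in the tables, ROC-AUC and PR-AUC, can be illustrated on a toy ranking. The sketch below is not the authors' evaluation code: it computes ROC-AUC as the fraction of positive/negative pairs ranked correctly and approximates PR-AUC by average precision, which is one common step-wise estimate. The function names and the toy labels and scores are assumptions made for illustration.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the probability that a random positive outscores a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a correctly ordered pair
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Step-wise estimate of the area under the precision-recall curve."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive example")
    # Rank candidates from highest to lowest score (ties keep input order).
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / rank  # precision at the rank of each positive
    return total / n_pos


# Toy data (hypothetical): 1 = known disease miRNA, 0 = unlabeled candidate.
toy_labels = [1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.1]
print(roc_auc(toy_labels, toy_scores))
print(average_precision(toy_labels, toy_scores))
```

When positives are rare, as with known miRNA–disease associations, average precision is far more sensitive to false positives near the top of the ranking than ROC-AUC. This is consistent with the PR-AUC values in the tables being much lower than the ROC-AUC values.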
Prediction results of CNNDMP and the other four methods for 15 diseases in terms of PR-AUCs.
| Disease Name | CNNDMP | GSTRW | DMPred | PBMDA | Liu’s Method |
|---|---|---|---|---|---|
| Breast neoplasm | 0.894 | 0.322 | 0.699 | 0.574 | 0.573 |
| Hepatocellular carcinoma | 0.893 | 0.279 | 0.501 | 0.454 | 0.498 |
| Renal cell carcinoma | 0.365 | 0.150 | 0.293 | 0.181 | 0.186 |
| Squamous cell carcinoma | 0.287 | 0.109 | 0.213 | 0.211 | 0.208 |
| Colorectal neoplasm | 0.367 | 0.141 | 0.186 | 0.367 | 0.371 |
| Glioblastoma | 0.330 | 0.151 | 0.219 | 0.217 | 0.243 |
| Heart failure | 0.602 | 0.191 | 0.700 | 0.168 | 0.189 |
| Acute myeloid leukemia | 0.368 | 0.140 | 0.211 | 0.191 | 0.236 |
| Lung neoplasm | 0.636 | 0.147 | 0.511 | 0.537 | 0.503 |
| Melanoma | 0.657 | 0.171 | 0.389 | 0.363 | 0.397 |
| Ovarian neoplasm | 0.490 | 0.169 | 0.404 | 0.361 | 0.361 |
| Pancreatic neoplasm | 0.555 | 0.137 | 0.329 | 0.364 | 0.354 |
| Prostatic neoplasm | 0.568 | 0.166 | 0.463 | 0.282 | 0.264 |
| Stomach neoplasm | 0.608 | 0.220 | 0.446 | 0.344 | 0.346 |
| Urinary bladder neoplasm | 0.470 | 0.163 | 0.315 | 0.252 | 0.280 |
| Average AUC | 0.538 | 0.177 | 0.392 | 0.324 | 0.334 |
Figure 3. Recall values at the top k candidates for CNNDMP and the other four methods.
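Recall at the top k candidates measures how many of a disease's known miRNAs appear among the k highest-scoring candidates. A minimal sketch follows; the function name and the toy labels and scores are hypothetical, and the paper's exact protocol may differ.

```python
def recall_at_k(labels, scores, k):
    """Fraction of all positives that appear among the k highest-scoring candidates."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive example")
    if k < 0:
        raise ValueError("k must be non-negative")
    # Rank candidates from highest to lowest score (ties keep input order).
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    found = sum(labels[i] for i in order[:k])
    return found / n_pos


# Toy data (hypothetical): 1 = known disease miRNA, 0 = unlabeled candidate.
toy_labels = [1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.1]
for k in (1, 2, 3):
    print(k, recall_at_k(toy_labels, toy_scores, k))
```

This metric matters when only the highest-ranked candidates will be followed up experimentally.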
Comparison of different methods based on AUCs with a paired t-test.
| | DMPred | GSTRW | PBMDA | Liu’s Method |
|---|---|---|---|---|
| | 6.44998 × 10⁻⁴ | 9.60973 × 10⁻¹⁶ | 2.65553 × 10⁻¹⁰ | 1.25344 × 10⁻¹⁰ |
| | 0.02972 | 1.75747 × 10⁻⁶ | 0.00111 | 0.00151 |
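A paired t-test compares two methods on the same set of diseases by testing whether the mean per-disease difference is zero. The sketch below computes only the t statistic and degrees of freedom, using the 15 per-disease ROC-AUC values listed for CNNDMP and DMPred. Pairing by these 15 diseases is an assumption, so it is not expected to reproduce the p-values in the table, which may have been computed over a different set of diseases. Converting t to a p-value needs a Student's t distribution, which the Python standard library does not provide.

```python
import math
import statistics


def paired_t_statistic(a, b):
    """Return (t, degrees of freedom, mean difference) for paired samples a and b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    mean_diff = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean_diff / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1, mean_diff


# Per-disease ROC-AUCs copied from the 15-disease table, in table order.
cnndmp = [0.987, 0.986, 0.950, 0.936, 0.910, 0.926, 0.972, 0.961,
          0.962, 0.978, 0.958, 0.945, 0.964, 0.954, 0.956]
dmpred = [0.938, 0.900, 0.903, 0.908, 0.842, 0.904, 0.987, 0.890,
          0.948, 0.913, 0.929, 0.916, 0.951, 0.908, 0.919]

t, df, mean_diff = paired_t_statistic(cnndmp, dmpred)
print(f"t = {t:.3f}, df = {df}, mean difference = {mean_diff:.4f}")
```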
The top 50 breast cancer-related candidates.
| Rank | miRNA Name | Evidence | Rank | miRNA Name | Evidence |
|---|---|---|---|---|---|
| 1 | hsa-mir-1266 | dbDEMC | 26 | hsa-mir-663 | dbDEMC |
| 2 | hsa-mir-942 | dbDEMC | 27 | hsa-mir-545 | dbDEMC |
| 3 | hsa-mir-384 | dbDEMC | 28 | hsa-mir-525 | dbDEMC |
| 4 | hsa-mir-374b | dbDEMC | 29 | hsa-mir-520f | dbDEMC |
| 5 | hsa-mir-1293 | dbDEMC | 30 | hsa-mir-520g | dbDEMC |
| 6 | hsa-mir-3148 | Literature [ | 31 | hsa-mir-659 | dbDEMC |
| 7 | hsa-mir-569 | Literature [ | 32 | hsa-mir-150 | miRCancer, PhenomiR |
| 8 | hsa-mir-431 | dbDEMC | 33 | hsa-mir-592 | dbDEMC |
| 9 | hsa-mir-711 | Literature [ | 34 | hsa-mir-1254 | dbDEMC |
| 10 | hsa-mir-325 | dbDEMC | 35 | hsa-mir-548c | dbDEMC |
| 11 | hsa-mir-1302 | Literature [ | 36 | hsa-mir-675 | miRCancer |
| 12 | hsa-mir-33a | dbDEMC | 37 | hsa-mir-3940 | Literature [ |
| 13 | hsa-mir-1246 | dbDEMC | 38 | hsa-mir-1299 | dbDEMC |
| 14 | hsa-mir-376b | dbDEMC | 39 | hsa-mir-377 | dbDEMC |
| 15 | hsa-mir-487a | dbDEMC | 40 | hsa-mir-519a | dbDEMC |
| 16 | hsa-mir-1236 | dbDEMC | 41 | hsa-mir-1180 | dbDEMC |
| 17 | hsa-mir-548a | dbDEMC | 42 | hsa-mir-1184 | dbDEMC |
| 18 | hsa-mir-624 | dbDEMC | 43 | hsa-mir-3151 | dbDEMC |
| 19 | hsa-mir-633 | dbDEMC | 44 | hsa-mir-627 | dbDEMC |
| 20 | hsa-mir-1181 | dbDEMC | 45 | hsa-mir-1273a | dbDEMC |
| 21 | hsa-mir-382 | dbDEMC | 46 | hsa-mir-1972 | dbDEMC |
| 22 | hsa-mir-448 | dbDEMC | 47 | hsa-mir-208a | dbDEMC, PhenomiR |
| 23 | hsa-mir-583 | dbDEMC | 48 | hsa-mir-668 | dbDEMC |
| 24 | hsa-mir-518a | dbDEMC | 49 | hsa-mir-635 | dbDEMC |
| 25 | hsa-mir-433 | dbDEMC | 50 | hsa-mir-619 | dbDEMC |
Figure 4. Construction of a miRNA–disease heterogeneous network and its matrix representation. (a) The miRNA similarity network, in which two miRNAs are connected when their similarity is greater than 0, and its matrix representation. The weighted network encodes both the miRNA network topology and the similarity values between miRNAs: each node represents a miRNA, and each edge weight is the similarity between the two miRNAs it connects. (b) The disease similarity network and its matrix representation. (c) The miRNA–disease association network, constructed from the known associations between miRNAs and diseases, and its matrix representation. A disease and a miRNA are connected by a dotted line when they are associated. (d) The miRNA–disease heterogeneous network, which integrates the miRNA similarities, the disease similarities, and the miRNA–disease associations.
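One common way to represent such a heterogeneous network as a single matrix is a symmetric block layout, with the two similarity matrices on the diagonal and the association matrix (and its transpose) off the diagonal. The sketch below assumes that layout; the function name, argument names, and toy values are hypothetical and are not taken from the paper.

```python
def build_heterogeneous_matrix(mirna_sim, disease_sim, assoc):
    """Stack [[mirna_sim, assoc], [assoc^T, disease_sim]] into one square matrix."""
    n_m, n_d = len(mirna_sim), len(disease_sim)
    if any(len(row) != n_m for row in mirna_sim):
        raise ValueError("mirna_sim must be square")
    if any(len(row) != n_d for row in disease_sim):
        raise ValueError("disease_sim must be square")
    if len(assoc) != n_m or any(len(row) != n_d for row in assoc):
        raise ValueError("assoc must have shape (n_mirnas, n_diseases)")
    # Top block rows: miRNA-miRNA similarities followed by miRNA-disease associations.
    top = [list(mirna_sim[i]) + list(assoc[i]) for i in range(n_m)]
    # Bottom block rows: transposed associations followed by disease-disease similarities.
    bottom = [[assoc[i][j] for i in range(n_m)] + list(disease_sim[j])
              for j in range(n_d)]
    return top + bottom


# Toy example (hypothetical values): 2 miRNAs, 3 diseases.
mirna_sim = [[1.0, 0.4],
             [0.4, 1.0]]
disease_sim = [[1.0, 0.2, 0.0],
               [0.2, 1.0, 0.5],
               [0.0, 0.5, 1.0]]
assoc = [[1, 0, 0],   # miRNA 0 is associated with disease 0
         [0, 1, 1]]   # miRNA 1 is associated with diseases 1 and 2

hetero = build_heterogeneous_matrix(mirna_sim, disease_sim, assoc)
for row in hetero:
    print(row)
```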
Figure 5. Integration of miRNA and disease original features to construct the embedding in the left part.
Figure 6. Integration of miRNA and disease network topological features to construct the embedding in the right part.
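The topological features come from random walks on the miRNA and disease similarity networks. As an illustration only, the sketch below runs a random walk with restart from one node of a small similarity network and returns the stationary visiting probabilities, which can serve as a topology-aware similarity profile for that node. The restart variant, the restart probability of 0.5, and all names and values are assumptions, not details confirmed by this record.

```python
def row_normalize(sim):
    """Turn a non-negative similarity matrix into a row-stochastic transition matrix."""
    out = []
    for row in sim:
        total = sum(row)
        out.append([v / total if total > 0 else 0.0 for v in row])
    return out


def random_walk_with_restart(sim, start, restart=0.5, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - restart) * W^T p + restart * p0 until the update is below tol."""
    n = len(sim)
    w = row_normalize(sim)
    p0 = [1.0 if i == start else 0.0 for i in range(n)]
    p = p0[:]
    for _ in range(max_iter):
        nxt = [(1.0 - restart) * sum(p[i] * w[i][j] for i in range(n))
               + restart * p0[j]
               for j in range(n)]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Toy miRNA similarity network (hypothetical values).
toy_sim = [[1.0, 0.8, 0.1],
           [0.8, 1.0, 0.3],
           [0.1, 0.3, 1.0]]

profile = random_walk_with_restart(toy_sim, start=0)
print([round(v, 4) for v in profile])
```

Running the walk once per node and stacking the resulting profiles gives a new similarity matrix that reflects multi-step neighborhoods, not only direct pairwise similarity.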
Figure 7. miRNA–disease association prediction framework based on dual CNN.