Zhihua Chen, Xinke Wang, Peng Gao, Hongju Liu, Bosheng Song.
Abstract
Many diseases are known to be caused by mutations or abnormalities in microRNAs (miRNAs). The usual approach to predicting miRNA-disease relationships is to build a high-quality similarity network of diseases and miRNAs and to rank all unobserved associations by their similarity scores, so that a higher score indicates a greater probability of a potential association. However, this approach does not use the topological information within the network. In this study, we therefore propose a machine learning method, called STIM, which uses network topology information to predict disease-miRNA associations. In contrast to the conventional approach, STIM constructs features from both the similarity and the topology information in the networks and then uses a machine learning model to predict potential associations. To verify the reliability and accuracy of our method, we compared STIM with other classical algorithms. The results of five-fold cross-validation show that STIM outperforms many existing methods, particularly in terms of the area under the curve (AUC). In addition, the top 30 candidate miRNAs recommended by STIM in a case study of lung neoplasms have been confirmed in previous experiments, which supports the validity of the method.
Keywords: heterogeneous network; link prediction; machine learning; miRNA; network embedding; topology information
Year: 2019; PMID: 31703479; PMCID: PMC6912199; DOI: 10.3390/cells8111405
Source DB: PubMed; Journal: Cells; ISSN: 2073-4409; Impact factor: 6.600
Figure 1. Workflow of STIM. (a) Network construction based on different resources. (b) Two kinds of feature extraction, one of which is passed through an auto-encoder. (c) Deep forest-based association prediction.
Disease–miRNA associations network.
| Node Type | Number |
|---|---|
| Disease | 336 |
| miRNA | 577 |
| Disease–miRNA associations | 6441 |
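The counts in the table determine how sparse the known association matrix is. The following is a small arithmetic sketch; the variable names are illustrative and do not come from the paper.

```python
# Counts taken from the table above.
n_diseases = 336
n_mirnas = 577
n_known = 6441

# Every disease-miRNA pair is a candidate link in the bipartite network.
n_pairs = n_diseases * n_mirnas
n_unobserved = n_pairs - n_known
density = n_known / n_pairs

print(n_pairs, n_unobserved, f"{density:.2%}")  # 193872 187431 3.32%
```

Only about 3.3% of all possible pairs are known associations, so the remaining 187,431 unobserved pairs form the candidate set that a predictor has to rank.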
Figure 2. Schematic diagram of how the two kinds of feature vectors are generated. (a) Upper part: the similarity-based feature vector. An n-dimensional disease-miRNA pair feature vector is obtained by concatenating the m-dimensional disease feature vector with the h-dimensional miRNA feature vector and reducing the dimension of the result with an auto-encoder. (b) Lower part: the DeepWalk-based feature vector. A 2n-dimensional disease-miRNA pair feature vector is obtained by concatenating the n-dimensional disease feature vector with the n-dimensional miRNA feature vector.
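Part (b) of the caption can be illustrated with a minimal sketch of the first stage of DeepWalk (truncated uniform random walks) followed by the concatenation of two node embeddings into a pair feature. The toy graph, the function names `random_walks` and `pair_feature`, and the placeholder embeddings are assumptions made for illustration; the skip-gram training that turns walks into embeddings is omitted, and the network STIM actually walks on may contain additional edge types.

```python
import random


def random_walks(adj, walk_length, walks_per_node, seed=0):
    """Generate truncated uniform random walks, the first stage of DeepWalk."""
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        nodes = list(adj)
        rng.shuffle(nodes)  # visit start nodes in a fresh random order each pass
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adj[walk[-1]]
                if not neighbors:  # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks


def pair_feature(disease_vec, mirna_vec):
    """Concatenate two n-dimensional embeddings into one 2n-dimensional vector."""
    return list(disease_vec) + list(mirna_vec)


# Toy bipartite graph (hypothetical edges, for illustration only).
adj = {
    "d1": ["m1", "m2"],
    "d2": ["m2"],
    "m1": ["d1"],
    "m2": ["d1", "d2"],
}
walks = random_walks(adj, walk_length=5, walks_per_node=2)

# Placeholder 3-dimensional embeddings; in DeepWalk these would be learned
# by a skip-gram model trained on the walks.
emb = {"d1": [0.1, 0.2, 0.3], "m2": [0.4, 0.5, 0.6]}
pair = pair_feature(emb["d1"], emb["m2"])
print(len(walks), len(pair))
```

With n = 3 in this toy example, the pair vector has 2n = 6 entries, matching the dimensions stated in the caption.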
Figure 3. Schematic diagram of the prediction process. (a) Multi-grained scanning is used to preprocess the input features. (b) The resulting feature vectors are fed into the cascade forest for training.
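The scanning step in (a) can be sketched as a sliding window over a feature vector, which is how multi-grained scanning is usually described for deep forest models. This is only the window-extraction part: in a full deep forest each window would then be passed to forests whose class-probability outputs are concatenated. The function name `sliding_windows`, the window size, and the sample vector are assumptions, since this section does not state the settings STIM uses.

```python
def sliding_windows(features, window, stride=1):
    """Return every contiguous sub-vector of length `window`."""
    if window > len(features):
        raise ValueError("window is longer than the feature vector")
    return [
        features[i:i + window]
        for i in range(0, len(features) - window + 1, stride)
    ]


# Hypothetical 6-dimensional pair feature vector.
x = [0.2, 0.5, 0.1, 0.9, 0.4, 0.7]
windows = sliding_windows(x, window=4)
print(len(windows))  # 3
```

A d-dimensional vector scanned with window size w and stride 1 yields d - w + 1 windows, so using several window sizes produces features at several granularities.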
Figure 4. PRE, REC and ROC of different diseases by five-fold cross-validation.
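The quantities in this figure can be computed as in the following sketch, which assumes that PRE and REC denote precision and recall. The helper names, the toy labels and scores, and the unshuffled contiguous folds are all illustrative assumptions; the paper's exact evaluation protocol is not given in this section.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def precision_recall(labels, scores, threshold):
    """Precision and recall when scores >= threshold are predicted positive."""
    tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    fn = sum(1 for y, s in zip(labels, scores) if s < threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def k_fold_indices(n_samples, k=5):
    """Split indices 0..n_samples-1 into k near-equal contiguous folds."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


# Toy labels (1 = known association) and predicted scores.
labels = [1, 0, 1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.65]
auc = roc_auc(labels, scores)
pre, rec = precision_recall(labels, scores, threshold=0.65)
fold_sizes = [len(fold) for fold in k_fold_indices(12, k=5)]
print(auc, pre, rec, fold_sizes)
```

In practice the sample order would be shuffled before splitting, and each fold would serve once as the test set while the other four are used for training.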
AUC comparison of different methods with specific diseases. (The best AUC value is shown in bold).
| Disease | STIM | RWRMDA | HDMP | RLSMDA | MIDP |
|---|---|---|---|---|---|
| Acute myeloid leukemia | 0.887 | 0.839 | 0.858 | 0.853 | 0.913 |
| Breast neoplasm | | 0.785 | 0.801 | 0.832 | 0.838 |
| Colorectal neoplasms | | 0.793 | 0.802 | 0.831 | 0.845 |
| Glioblastoma | | 0.68 | 0.7 | 0.714 | 0.786 |
| Heart failure | | 0.722 | 0.77 | 0.738 | 0.821 |
| Hepatocellular carcinoma | | 0.749 | 0.759 | 0.794 | 0.807 |
| Lung neoplasms | | 0.827 | 0.835 | 0.855 | 0.876 |
| Melanoma | | 0.784 | 0.79 | 0.807 | 0.837 |
| Ovarian neoplasms | 0.89 | 0.882 | | 0.909 | |
| Pancreatic neoplasms | 0.909 | 0.871 | 0.895 | 0.887 | |
| Prostatic neoplasms | 0.872 | 0.823 | 0.854 | 0.841 | |
| Renal cell carcinoma | | 0.815 | 0.833 | 0.839 | 0.862 |
| Squamous cell carcinoma | | 0.819 | 0.82 | 0.849 | 0.87 |
| Stomach neoplasms | | 0.779 | 0.787 | 0.797 | 0.821 |
| Urinary bladder neoplasms | 0.868 | 0.821 | 0.85 | 0.845 | |
| Average AUC | | 0.799 | 0.816 | 0.826 | 0.862 |
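As a consistency check on the table, the per-disease values of the one method column that is fully populated (RWRMDA, the column whose reported average is 0.799) can be averaged directly. The list below simply transcribes those 15 values in row order; the variable names are illustrative.

```python
# RWRMDA AUC values for the 15 diseases, in the order of the table rows.
rwrmda_auc = [
    0.839, 0.785, 0.793, 0.68, 0.722, 0.749, 0.827, 0.784,
    0.882, 0.871, 0.823, 0.815, 0.819, 0.779, 0.821,
]

average = sum(rwrmda_auc) / len(rwrmda_auc)
print(round(average, 3))  # 0.799
```

The result matches the reported average for that column, which suggests the "Average AUC" row is the unweighted mean over the 15 diseases.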
Top 30 lung neoplasm-related candidates.
| Rank | miRNA | Evidence | Rank | miRNA | Evidence |
|---|---|---|---|---|---|
| 1 | hsa-mir-130a | dbDEMC, miR2Disease | 16 | hsa-mir-149 | dbDEMC |
| 2 | hsa-mir-125b-2 | Unconfirmed | 17 | hsa-mir-15a | dbDEMC |
| 3 | hsa-mir-195 | dbDEMC, miR2Disease | 18 | hsa-mir-302a | Ref.[ |
| 4 | hsa-mir-451a | dbDEMC, miR2Disease | 19 | hsa-mir-99a | dbDEMC, miR2Disease |
| 5 | hsa-mir-128-1 | Unconfirmed | 20 | hsa-mir-152 | dbDEMC |
| 6 | hsa-mir-23b | dbDEMC | 21 | hsa-mir-708 | Ref.[ |
| 7 | hsa-mir-151a | Ref. [ | 22 | hsa-mir-378a | Ref.[ |
| 8 | hsa-mir-92a-2 | dbDEMC | 23 | hsa-mir-339 | dbDEMC |
| 9 | hsa-mir-302b | dbDEMC | 24 | hsa-mir-106b | dbDEMC |
| 10 | hsa-mir-193b | dbDEMC | 25 | hsa-mir-215 | dbDEMC |
| 11 | hsa-mir-141 | dbDEMC, miR2Disease | 26 | hsa-mir-130b | dbDEMC |
| 12 | hsa-mir-196b | dbDEMC | 27 | hsa-mir-302c | dbDEMC |
| 13 | hsa-mir-10a | dbDEMC | 28 | hsa-mir-296 | dbDEMC |
| 14 | hsa-mir-429 | dbDEMC, miR2Disease | 29 | hsa-mir-320a | Ref.[ |
| 15 | hsa-mir-328 | dbDEMC | 30 | hsa-mir-20b | dbDEMC |