Zhan-Heng Chen, Zhu-Hong You, Zhen-Hao Guo, Hai-Cheng Yi, Gong-Xu Luo, Yan-Bin Wang.
Abstract
Predicting drug-target interactions (DTIs) is crucial in innovative drug discovery, drug repositioning, and other fields. However, traditional biological experimental methods for predicting DTIs have many shortcomings, such as high cost, long experimental time, and low efficiency, which make them difficult to apply widely. As a supplement, in silico methods can provide helpful information for DTI prediction in a timely manner. In this work, a deep walk embedding method is developed for predicting DTIs from a multi-molecular network. More specifically, a multi-molecular network, also called a molecular associations network, is constructed by integrating the associations among drugs, proteins, diseases, lncRNAs, and miRNAs. Then, each node is represented as a behavior feature vector by using a deep walk embedding method. Finally, we compared behavior features with traditional attribute features on an integrated dataset by using various classifiers. The experimental results revealed that behavior features performed better than attribute features with each classifier tested, especially the random forest classifier (average accuracy of 85.45% versus 81.27%). It is also demonstrated that the use of behavior information is very helpful for addressing the problem of sequences containing both self-interacting and non-interacting pairs of proteins. This work is not only well suited to predicting DTIs, but also provides a new perspective for predicting other biomolecular associations.
Keywords: attribute feature; behavior feature; drug–target interactions; molecular association network; random forest
Year: 2020 PMID: 32582646 PMCID: PMC7283956 DOI: 10.3389/fbioe.2020.00338
Source DB: PubMed Journal: Front Bioeng Biotechnol ISSN: 2296-4185
Performance evaluation with SVM on attribute features.
| 5-folds | Acc (%) | TPR (%) | TNR (%) | PPV (%) | MCC (%) |
| 1 | 66.16 | 66.79 | 65.53 | 65.96 | 32.32 |
| 2 | 66.22 | 66.16 | 66.29 | 66.25 | 32.45 |
| 3 | 66.49 | 67.64 | 65.35 | 66.12 | 33.00 |
| 4 | 67.06 | 67.69 | 66.43 | 66.84 | 34.12 |
| 5 | 66.74 | 67.37 | 66.11 | 66.53 | 33.49 |
| Average | 66.53 ± 0.37 | 67.13 ± 0.65 | 65.94 ± 0.48 | 66.34 ± 0.35 | 33.08 ± 0.75 |
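The columns in these tables are standard confusion-matrix metrics. As a minimal sketch (not the authors' code), the function below computes them from raw counts. The function name `binary_metrics` and the counts in the usage lines are hypothetical: the counts assume a balanced test fold of 10,000 positives and 10,000 negatives and were chosen only so that they match the fold-1 TPR and TNR in the table above.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Return Acc, TPR, TNR, PPV and MCC as fractions in [0, 1]."""
    acc = (tp + tn) / (tp + fn + tn + fp)
    tpr = tp / (tp + fn)  # true positive rate (sensitivity)
    tnr = tn / (tn + fp)  # true negative rate (specificity)
    ppv = tp / (tp + fp)  # positive predictive value (precision)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {"Acc": acc, "TPR": tpr, "TNR": tnr, "PPV": ppv, "MCC": mcc}


# Hypothetical counts (assumed balanced fold), not taken from the source.
m = binary_metrics(tp=6679, fn=3321, tn=6553, fp=3447)
for name, value in m.items():
    print(f"{name}: {100 * value:.2f}%")
```

With these assumed counts, the derived Acc, PPV, and MCC come out at 66.16%, 65.96%, and 32.32%, which agrees with the fold-1 row and suggests the reported metrics are mutually consistent with a roughly balanced test set.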
Performance evaluation with SVM on behavior features.
| 5-folds | Acc (%) | TPR (%) | TNR (%) | PPV (%) | MCC (%) |
| 1 | 74.71 | 71.56 | 77.86 | 76.37 | 49.51 |
| 2 | 77.12 | 72.73 | 81.50 | 79.72 | 54.44 |
| 3 | 75.83 | 75.07 | 76.60 | 76.23 | 51.67 |
| 4 | 75.99 | 75.83 | 76.15 | 76.07 | 51.98 |
| 5 | 75.51 | 73.41 | 77.60 | 76.62 | 51.06 |
| Average | 75.83 ± 0.87 | 73.72 ± 1.73 | 77.94 ± 2.11 | 77.00 ± 1.53 | 51.73 ± 1.79 |
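The Average row appears to report the mean over the five folds followed by the sample standard deviation (n − 1 denominator). This is an inference from the numbers rather than something the source states. The sketch below, with the hypothetical helper `summarize`, reproduces the accuracy entry from the five fold values in the table above.

```python
import statistics


def summarize(values):
    """Format fold scores as 'mean ± sample standard deviation'."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample std (divides by n - 1)
    return f"{mean:.2f} ± {sd:.2f}"


# Acc (%) for folds 1-5, SVM on behavior features (from the table above).
acc = [74.71, 77.12, 75.83, 75.99, 75.51]
print(summarize(acc))  # 75.83 ± 0.87
```

Using the population standard deviation (`statistics.pstdev`) instead would give 0.78, which does not match the reported value.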
FIGURE 1 The ROC curve of SVM on attribute feature.
FIGURE 2 The ROC curve of SVM on behavior feature.
Performance evaluation with RF on attribute features.
| 5-folds | Acc (%) | TPR (%) | TNR (%) | PPV (%) | MCC (%) |
| 1 | 81.37 | 77.59 | 85.15 | 83.93 | 62.92 |
| 2 | 81.98 | 78.62 | 85.33 | 84.27 | 64.10 |
| 3 | 81.80 | 79.16 | 84.43 | 83.56 | 63.68 |
| 4 | 80.49 | 76.78 | 84.20 | 82.94 | 61.15 |
| 5 | 80.71 | 76.30 | 85.13 | 83.69 | 61.67 |
| Average | 81.27 ± 0.66 | 77.69 ± 1.20 | 84.85 ± 0.50 | 83.68 ± 0.49 | 62.70 ± 1.27 |
Performance evaluation with RF on behavior features.
| 5-folds | Acc (%) | TPR (%) | TNR (%) | PPV (%) | MCC (%) |
| 1 | 85.58 | 79.93 | 91.22 | 90.11 | 71.61 |
| 2 | 86.16 | 80.38 | 91.94 | 90.89 | 72.81 |
| 3 | 85.76 | 80.56 | 90.95 | 89.90 | 71.90 |
| 4 | 84.18 | 77.63 | 90.73 | 89.33 | 68.96 |
| 5 | 85.56 | 79.86 | 91.26 | 90.13 | 71.58 |
| Average | 85.45 ± 0.75 | 79.67 ± 1.18 | 91.22 ± 0.46 | 90.07 ± 0.56 | 71.37 ± 1.44 |
FIGURE 3 The ROC curve of random forest on attribute feature.
FIGURE 4 The ROC curve of random forest on behavior feature.
Nine known relationships in the molecular associations network.
| Relationship | Database | Number |
| Drug–target | DrugBank | 11107 |
| Drug–disease | CTD | 18416 |
| Protein–disease | DisGeNET | 25087 |
| lncRNA–target | LncRNA2Target | 690 |
| lncRNA–disease | LncRNADisease, lncRNASNP2 | 1264 |
| miRNA–target | miRTarBase | 4944 |
| miRNA–disease | HMDD | 16427 |
| miRNA–lncRNA | lncRNASNP2 | 8374 |
| Protein–protein | STRING | 19237 |
| Total | N/A | 105546 |
Numbers of the five types of biomolecules in the nine known relationships.
| Biomolecule | Number |
| Drug | 1025 |
| Target/Protein | 1649 |
| miRNA | 1023 |
| lncRNA | 769 |
| Disease | 2062 |
| Total | 6528 |
FIGURE 5 Construction of Multi-molecular Network.
FIGURE 6 Representation of drug molecular fingerprint.
FIGURE 7 Random Walks on Molecular Associations Network.
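The random-walk step behind the behavior features can be illustrated with a minimal sketch. It is not the authors' implementation: the toy edge list, node names, function names, and parameter values (`walks_per_node`, `walk_length`, `seed`) are all hypothetical, and the sketch covers only walk generation on an undirected, unweighted network. In DeepWalk-style methods the resulting walks are then treated as sentences and passed to a skip-gram model to learn one vector per node; that embedding step is omitted here.

```python
import random
from collections import defaultdict


def build_adjacency(edges):
    """Build undirected adjacency lists from (node, node) pairs."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return dict(adj)


def random_walks(adj, walks_per_node=2, walk_length=5, seed=0):
    """Generate fixed-length uniform random walks starting from every node."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    walks = []
    for _ in range(walks_per_node):
        rng.shuffle(nodes)  # visit start nodes in a fresh order each pass
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                # Step to a uniformly chosen neighbor of the current node.
                walk.append(rng.choice(adj[walk[-1]]))
            walks.append(walk)
    return walks


# Hypothetical toy network mixing the five node types (not real data).
edges = [
    ("drug:D1", "protein:P1"),
    ("drug:D1", "disease:X1"),
    ("protein:P1", "disease:X1"),
    ("miRNA:M1", "protein:P1"),
    ("lncRNA:L1", "miRNA:M1"),
]
adj = build_adjacency(edges)
walks = random_walks(adj)
print(len(walks), "walks of length", len(walks[0]))
```

Because every node in an undirected edge list has at least one neighbor, each walk always reaches the requested length. Nodes that co-occur often in these walks end up with similar embeddings, which is the sense in which the vector reflects a node's "behavior" in the network rather than its intrinsic attributes.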