Li-Yue Bai, Hao Dai, Qin Xu, Muhammad Junaid, Shao-Liang Peng, Xiaolei Zhu, Yi Xiong, Dong-Qing Wei.
Abstract
Drug combinatorial therapy is a promising strategy for combating complex diseases due to its fewer side effects, lower toxicity and better efficacy. However, it is not feasible to determine all the effective drug combinations in the vast space of possible combinations given the increasing number of approved drugs on the market, since the experimental methods for identifying effective drug combinations are both labor- and time-consuming. In this study, we conducted a systematic analysis of various types of features to characterize pairs of drugs. These features included information about the targets of the drugs, the pathways in which the target protein of a drug was involved, side effects of drugs, metabolic enzymes of the drugs, and drug transporters. The latter two features (metabolic enzymes and drug transporters) relate to the metabolism and transportation properties of drugs, which were not analyzed or used in previous studies. Then, we devised a novel improved naïve Bayesian algorithm to construct classification models that predict effective drug combinations using the individual types of features mentioned above. Our results indicated that the proposed method performed better than the naïve Bayesian algorithm and other conventional classification algorithms such as support vector machine and K-nearest neighbor.
Keywords: classification and prediction; drug combination; improved naïve Bayesian algorithm; metabolic enzyme
Year: 2018 PMID: 29401735 PMCID: PMC5855689 DOI: 10.3390/ijms19020467
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 5.923
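The abstract names the plain naïve Bayesian algorithm as a baseline and the figure captions report leave-one-out cross validation. As a rough illustration only, the sketch below implements a standard Bernoulli naïve Bayes classifier with Laplace smoothing on binary feature vectors and scores it by leave-one-out cross validation. It is not the authors' improved algorithm, whose details are not given here; the function names, the smoothing constant `alpha`, and the toy drug-pair vectors are all hypothetical.

```python
import math

def train_bernoulli_nb(X, y, alpha=1.0):
    """Fit a plain Bernoulli naive Bayes model (assumed baseline, not the improved variant)."""
    model = {}
    n_features = len(X[0])
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        log_prior = math.log(len(rows) / len(X))
        # Laplace-smoothed probability that each binary feature is 1 in this class.
        probs = [(sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
                 for j in range(n_features)]
        model[label] = (log_prior, probs)
    return model

def predict(model, x):
    """Return the label with the highest log posterior for feature vector x."""
    def score(label):
        log_prior, probs = model[label]
        return log_prior + sum(math.log(p if v else 1 - p)
                               for p, v in zip(probs, x))
    return max(sorted(model), key=score)

def leave_one_out_accuracy(X, y):
    """Hold out each sample once, train on the rest, and count correct predictions."""
    correct = 0
    for i in range(len(X)):
        model = train_bernoulli_nb(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        correct += predict(model, X[i]) == y[i]
    return correct / len(X)

# Hypothetical toy data: each row is a drug pair, each column a binary feature
# (for example, whether the pair shares a given target). Label 1 = effective pair.
X = [[1, 1, 0, 0], [1, 0, 0, 0], [1, 1, 1, 0],
     [0, 0, 1, 1], [0, 0, 0, 1], [0, 1, 1, 1]]
y = [1, 1, 1, 0, 0, 0]
loo_accuracy = leave_one_out_accuracy(X, y)
```

Leave-one-out evaluation retrains the model once per sample, which is affordable at the scale mentioned in this record (946 drug pairs) but grows linearly with data set size.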
Figure 1. The Venn diagram of drug combinations for five types of features, where the numbers show how many drug pairs (out of a total of 946 drug pairs) can be covered by different features or combinations of these feature types.
Classification performance at different positive-to-negative sample ratios, using the drug side-effect feature on the independent test set.
| Positive-to-Negative Ratio | Accuracy | F-Measure | MCC | Recall | Precision |
|---|---|---|---|---|---|
| 1:1 | 0.6800 | 0.6667 | 0.3612 | 0.6400 | 0.6957 |
| 1:2 | 0.6667 | 0.5098 | 0.2638 | 0.5652 | 0.4643 |
| 1:3 | 0.6832 | 0.3043 | 0.0992 | 0.3043 | 0.3043 |
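The five reported metrics can all be derived from one confusion matrix. The sketch below assumes the standard definitions of accuracy, precision, recall, F-measure (the harmonic mean of precision and recall) and the Matthews correlation coefficient (MCC); the source does not spell these out. The counts passed in are hypothetical: they are not reported in the source, but they are one balanced 25:25 split that is consistent with the 1:1 row above when rounded to four decimals.

```python
import math

def classification_metrics(tp, fp, fn, tn):
    """Compute the five metrics from confusion-matrix counts (standard definitions assumed)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    # F-measure is the harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall)
    # MCC uses all four cells, so it stays informative on imbalanced data.
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "accuracy": round(accuracy, 4),
        "f_measure": round(f_measure, 4),
        "mcc": round(mcc, 4),
        "recall": round(recall, 4),
        "precision": round(precision, 4),
    }

# Hypothetical counts chosen to be consistent with the 1:1 row; not taken from the source.
row_1_to_1 = classification_metrics(tp=16, fp=7, fn=9, tn=18)
```

The table also shows why accuracy alone can mislead: at the 1:3 ratio accuracy stays near 0.68 while F-measure and MCC drop sharply, because a classifier can score well on accuracy by favoring the majority negative class.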
Figure 2. Performance comparison of models on two different negative data sets, using leave-one-out cross validation on five types of single features. The performance on the N1 negative data set is shown by the blue line, and the performance on the N2 negative data set by the red line: (A) The enzyme feature; (B) The pathway feature; (C) The side effect feature; (D) The transporter feature; (E) The target feature.
Figure 3. Performance comparison of different machine learning algorithms on the N2 negative data set, using our feature selection method and evaluated by the leave-one-out cross validation test.
Figure 4. Performance comparison of two different feature selection algorithms on the N2 negative data set, using the naïve Bayesian and improved naïve Bayesian methods and evaluated by the leave-one-out cross validation test. The blue and green lines used our new method to select features, while the red and yellow lines used the mRMR algorithm for feature selection.
Figure 5. Performance comparison between the original models (S1, S2, S3) and the models with randomly shuffled data sets (S1_Y_random, S2_Y_random, S3_Y_random).
Figure 6. Performance comparison of five different feature types on the N2 negative data set, using our feature selection and improved naïve Bayesian method in the leave-one-out cross validation test.
Performance comparison of prediction models based on different feature types using the improved naïve Bayesian algorithm on the independent data set.
| Feature Type | Accuracy | F-Measure | MCC | Recall | Precision |
|---|---|---|---|---|---|
| Targets | 0.7034 | 0.6431 | 0.4771 | 0.5000 | 0.9008 |
| Side effect | 0.6800 | 0.6667 | 0.3612 | 0.6400 | 0.6957 |
| Pathways | 0.6238 | 0.6174 | 0.2474 | 0.6216 | 0.6133 |
| Enzymes | 0.6115 | 0.6904 | 0.2144 | 0.8095 | 0.6018 |
| Transporters | 0.5339 | 0.5865 | 0.1216 | 0.7500 | 0.4815 |
Figure 7. Performance comparison between our proposed method and Chen et al.’s method by using the leave-one-out cross validation test [12]. The red line shows the classification performance of our method while the others show the performances of Chen et al.’s models.
The dimensions of different features for the two different negative data sets.
| Category | Feature | Source | Dimension (N1) | Dimension (N2) |
|---|---|---|---|---|
| Pharmacodynamics | Targets | DrugBank | 681 | 787 |
| Pharmacodynamics | Pathways | KEGG | 255 | 263 |
| Pharmacokinetic | Enzymes | DrugBank | 135 | 146 |
| Pharmacokinetic | Transporters | DrugBank | 76 | 86 |
| Phenotypic | Side effect | SIDER | 3005 | 3889 |