Ning-Ning Wang, Xiang-Gui Wang, Guo-Li Xiong, Zi-Yi Yang, Ai-Ping Lu, Xiang Chen, Shao Liu, Ting-Jun Hou, Dong-Sheng Cao.
Abstract
Drug-drug interactions (DDIs) often cause serious adverse reactions and thus result in considerable economic and social losses. Comprehensive DDI evaluation has become a major challenge in pharmaceutical research because experimental assessment is time-consuming and costly, so effective in silico methods are needed to predict and evaluate DDIs accurately and efficiently. In this study, based on a large number of substrates and inhibitors of five important CYP450 isozymes (CYP1A2, CYP2C9, CYP2C19, CYP2D6 and CYP3A4), a series of high-performance predictive models for metabolic DDIs were constructed using two machine learning methods (random forest and XGBoost) and four types of descriptors (MOE_2D, CATS, ECFP4 and MACCS). To reduce the uncertainty of individual models, a consensus method was applied to yield more reliable predictions. A series of evaluations showed that the consensus models were more reliable and robust for predicting DDIs of new drug combinations. In internal validation, the overall prediction accuracy and AUC of the DDI models were around 0.8 and 0.9, respectively. On the external datasets, the model accuracy was 0.793 for multi-level validation and 0.795 for external validation. Furthermore, we compared our model with some recently published tools, then applied the final model to FDA-approved drugs and proposed 54,013 drug pairs with potential DDIs. In summary, we developed a powerful DDI predictive model from the perspective of the CYP450 enzyme family, which should support future drug development and clinical pharmacy research.
Keywords: Adverse drug reactions; CYP450; Drug combination; Machine learning; Metabolic drug interaction
Year: 2022 PMID: 35428354 PMCID: PMC9013037 DOI: 10.1186/s13321-022-00602-x
Source DB: PubMed Journal: J Cheminform ISSN: 1758-2946 Impact factor: 5.514
Changes of the plasma concentration when two drugs are combined
| Drug 2 (rows) / Drug 1 (columns) | Inhibitor | Substrate | Both |
|---|---|---|---|
| Inhibitor | Both drugs affect other drugs metabolized by the isozyme | Cp of Drug 1 will increase | Cp of Drug 1 will increase |
| Substrate | Cp of Drug 2 will increase | If both drugs bind the same active site, Cp of Drug 1, 2 will increase | Cp of Drug 1, 2 will increase |
| Both | Cp of Drug 2 will increase | Cp of Drug 1, 2 will increase | Cp of Drug 1, 2 will increase |
Cp: the drug concentration in plasma
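The table can be read as a lookup from the roles of the two drugs toward one isozyme to the drugs whose plasma concentration is expected to rise. The following is a minimal sketch of that lookup, assuming the columns give the role of Drug 1 and the rows the role of Drug 2 (the reading in which an inhibitor raises the Cp of a co-administered substrate). The names `RAISED_CP`, `raised_cp` and `same_active_site` are illustrative and do not come from the source.

```python
# Lookup of the table above: (role of Drug 1, role of Drug 2) -> drugs whose Cp rises.
# Assumption: table columns are the role of Drug 1, table rows the role of Drug 2.
RAISED_CP = {
    ("inhibitor", "inhibitor"): (),      # only other drugs on this isozyme are affected
    ("substrate", "inhibitor"): (1,),
    ("both", "inhibitor"): (1,),
    ("inhibitor", "substrate"): (2,),
    ("substrate", "substrate"): (1, 2),  # holds only if both bind the same active site
    ("both", "substrate"): (1, 2),
    ("inhibitor", "both"): (2,),
    ("substrate", "both"): (1, 2),
    ("both", "both"): (1, 2),
}


def raised_cp(role_1, role_2, same_active_site=True):
    """Return a tuple of drug indices (1 and/or 2) whose Cp is expected to increase."""
    key = (role_1, role_2)
    if key not in RAISED_CP:
        raise ValueError(f"unknown roles: {key!r}")
    # The substrate/substrate cell is the only conditional entry in the table.
    if key == ("substrate", "substrate") and not same_active_site:
        return ()
    return RAISED_CP[key]
```

For example, `raised_cp("inhibitor", "substrate")` returns `(2,)`: when Drug 1 inhibits the isozyme that metabolizes Drug 2, the Cp of Drug 2 is expected to increase.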
The detailed information of the modeling datasets
| Isozyme | Substrate (active) | Substrate (inactive) | Inhibitor (active) | Inhibitor (inactive) |
|---|---|---|---|---|
| CYP1A2 | 198 | 198 | 6089 | 6974 |
| CYP2C9 | 259 | 259 | 4316 | 8361 |
| CYP2C19 | 314 | 314 | 6027 | 7135 |
| CYP2D6 | 357 | 357 | 2906 | 10,826 |
| CYP3A4 | 792 | 792 | 5544 | 7446 |
Fig. 1 The distribution of the collected drugs and isozymes (S-P, S-V, I-P, and I-V denote the pie diagram of substrates, the UpSet diagram of substrates, the pie diagram of inhibitors, and the UpSet diagram of inhibitors, respectively)
Fig. 2 The overall performance of the substrate and inhibitor models using different descriptors
The accuracy values of the substrate models based on the 10 randomly-generated negative datasets
| Set | Des | 1A2_Sub | 2C9_Sub | 2C19_Sub | 2D6_Sub | 3A4_Sub |
|---|---|---|---|---|---|---|
| 1 | 2D | 0.70 ± 0.049 | 0.72 ± 0.042 | 0.74 ± 0.045 | 0.78 ± 0.031 | 0.75 ± 0.025 |
| 2 | 2D | 0.69 ± 0.041 | 0.74 ± 0.041 | 0.73 ± 0.050 | 0.78 ± 0.028 | |
| 3 | 2D | 0.71 ± 0.045 | 0.73 ± 0.041 | 0.75 ± 0.045 | 0.78 ± 0.031 | 0.76 ± 0.023 |
| 4 | 2D | 0.69 ± 0.045 | 0.72 ± 0.048 | 0.75 ± 0.046 | 0.78 ± 0.031 | 0.74 ± 0.020 |
| 5 | 2D | 0.74 ± 0.047 | 0.77 ± 0.033 | 0.76 ± 0.019 | ||
| 6 | 2D | 0.72 ± 0.042 | 0.71 ± 0.045 | 0.74 ± 0.049 | 0.78 ± 0.035 | 0.75 ± 0.023 |
| 7 | 2D | 0.71 ± 0.036 | 0.74 ± 0.040 | 0.75 ± 0.045 | 0.78 ± 0.031 | 0.75 ± 0.024 |
| 8 | 2D | 0.72 ± 0.050 | 0.73 ± 0.046 | 0.74 ± 0.051 | 0.77 ± 0.033 | 0.75 ± 0.020 |
| 9 | 2D | 0.72 ± 0.041 | 0.73 ± 0.042 | 0.75 ± 0.022 | ||
| 10 | 2D | 0.71 ± 0.046 | 0.72 ± 0.045 | 0.73 ± 0.043 | 0.78 ± 0.034 | 0.74 ± 0.020 |
Bold value represents the accuracy of the best of the 10 randomly generated negative sets
The detailed predictive ability of chosen QSAR models using RF
| Model | Des | SE | SP | F | ACC | AUC |
|---|---|---|---|---|---|---|
| 1A2_Sub | 2D | 0.72 ± 0.067 | 0.73 ± 0.072 | 0.72 ± 0.051 | 0.72 ± 0.047 | 0.78 ± 0.045 |
| 2C9_Sub | 2D | 0.75 ± 0.060 | 0.77 ± 0.063 | 0.75 ± 0.037 | 0.76 ± 0.035 | 0.84 ± 0.038 |
| 2C19_Sub | 2D | 0.76 ± 0.063 | 0.79 ± 0.067 | 0.77 ± 0.043 | 0.77 ± 0.040 | 0.85 ± 0.038 |
| 2D6_Sub | 2D | 0.76 ± 0.046 | 0.82 ± 0.044 | 0.79 ± 0.033 | 0.79 ± 0.032 | 0.86 ± 0.029 |
| 3A4_Sub | 2D | 0.75 ± 0.038 | 0.79 ± 0.035 | 0.76 ± 0.026 | 0.77 ± 0.024 | 0.85 ± 0.020 |
| 1A2_In | 2D | 0.82 ± 0.010 | 0.87 ± 0.009 | 0.84 ± 0.007 | 0.85 ± 0.006 | 0.93 ± 0.005 |
| 2C9_In | 2D | 0.71 ± 0.017 | 0.90 ± 0.008 | 0.74 ± 0.011 | 0.83 ± 0.007 | 0.90 ± 0.005 |
| 2C9_In_B* | 2D | 0.83 ± 0.014 | 0.81 ± 0.013 | 0.82 ± 0.010 | 0.82 ± 0.008 | 0.89 ± 0.007 |
| 2C19_In | 2D | 0.81 ± 0.010 | 0.84 ± 0.008 | 0.81 ± 0.007 | 0.83 ± 0.006 | 0.89 ± 0.005 |
| 2D6_In | 2D | 0.46 ± 0.016 | 0.97 ± 0.017 | 0.59 ± 0.015 | 0.87 ± 0.011 | 0.87 ± 0.009 |
| 2D6_In_B* | 2D | 0.74 ± 0.018 | 0.83 ± 0.018 | 0.77 ± 0.014 | 0.79 ± 0.012 | 0.87 ± 0.011 |
| 3A4_In | MACCS | 0.73 ± 0.015 | 0.86 ± 0.009 | 0.76 ± 0.010 | 0.81 ± 0.008 | 0.89 ± 0.006 |
B* refers to the balanced model using the random sampling method
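The source states only that the B* models were balanced by random sampling. The sketch below assumes this means randomly undersampling the larger class down to the size of the smaller one; the function name `balance_by_undersampling`, the seed, and the integer placeholders standing in for compounds are illustrative. The class sizes are the CYP2D6 inhibitor counts from the dataset table (2906 active, 10,826 inactive).

```python
import random


def balance_by_undersampling(actives, inactives, seed=0):
    """Randomly draw equal-sized subsets from both classes (assumed balancing scheme)."""
    rng = random.Random(seed)              # fixed seed so the draw is reproducible
    n = min(len(actives), len(inactives))  # size of the smaller class
    # Sampling n items from the smaller class keeps all of it (in shuffled order);
    # sampling n items from the larger class discards the surplus at random.
    return rng.sample(actives, n), rng.sample(inactives, n)


# Placeholder identifiers sized like the CYP2D6 inhibitor dataset.
cyp2d6_actives = list(range(2906))
cyp2d6_inactives = list(range(10826))
balanced_actives, balanced_inactives = balance_by_undersampling(
    cyp2d6_actives, cyp2d6_inactives
)
```

Under this assumption, both classes end up with 2906 compounds, which removes the roughly 1:4 imbalance of the original CYP2D6 inhibitor set.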
The detailed predictive ability of the chosen QSAR models using XGBoost
| Model | Des | SE | SP | F | ACC | AUC |
|---|---|---|---|---|---|---|
| 1A2_Sub | 2D | 0.72 ± 0.070 | 0.69 ± 0.073 | 0.71 ± 0.046 | 0.70 ± 0.040 | 0.77 ± 0.041 |
| 2C9_Sub | 2D | 0.79 ± 0.055 | 0.73 ± 0.072 | 0.76 ± 0.042 | 0.76 ± 0.039 | 0.83 ± 0.040 |
| 2C19_Sub | 2D | 0.76 ± 0.060 | 0.74 ± 0.070 | 0.75 ± 0.048 | 0.75 ± 0.043 | 0.82 ± 0.039 |
| 2D6_Sub | 2D | 0.80 ± 0.046 | 0.79 ± 0.044 | 0.79 ± 0.034 | 0.79 ± 0.030 | 0.86 ± 0.029 |
| 3A4_Sub | MACCS | 0.77 ± 0.034 | 0.76 ± 0.037 | 0.77 ± 0.023 | 0.77 ± 0.021 | 0.84 ± 0.018 |
| 1A2_In | 2D | 0.84 ± 0.011 | 0.87 ± 0.009 | 0.85 ± 0.007 | 0.86 ± 0.006 | 0.93 ± 0.004 |
| 2C9_In_B* | 2D | 0.83 ± 0.013 | 0.80 ± 0.014 | 0.82 ± 0.009 | 0.82 ± 0.009 | 0.89 ± 0.008 |
| 2C19_In | 2D | 0.82 ± 0.011 | 0.83 ± 0.010 | 0.81 ± 0.007 | 0.82 ± 0.006 | 0.89 ± 0.005 |
| 2D6_In_B* | 2D | 0.78 ± 0.018 | 0.81 ± 0.018 | 0.79 ± 0.013 | 0.80 ± 0.012 | 0.87 ± 0.009 |
| 3A4_In | MACCS | 0.76 ± 0.011 | 0.83 ± 0.010 | 0.77 ± 0.009 | 0.801 ± 0.007 | 0.88 ± 0.006 |
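The SE, SP, F and ACC columns in the two tables above can be computed from a confusion matrix. The sketch below uses the standard definitions and assumes that F denotes the F1 score, which the source does not state explicitly. The function name and the example counts are hypothetical and are not taken from the paper.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute sensitivity, specificity, F1 and accuracy from confusion-matrix counts."""
    se = tp / (tp + fn)                    # sensitivity: fraction of actives recovered
    sp = tn / (tn + fp)                    # specificity: fraction of inactives recovered
    f1 = 2 * tp / (2 * tp + fp + fn)       # harmonic mean of precision and sensitivity
    acc = (tp + tn) / (tp + fp + tn + fn)  # fraction of all compounds classified correctly
    return {"SE": se, "SP": sp, "F": f1, "ACC": acc}


# Hypothetical counts, for illustration only.
example = classification_metrics(tp=80, fp=10, tn=90, fn=20)
```

Because ACC weights both classes by their size, an imbalanced set can combine a low SE with a high ACC, as in the unbalanced 2D6_In row of the RF table (SE 0.46, ACC 0.87).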
Fig. 3 The statistical results of the predictive RF and XGBoost models and the consensus method
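The text here does not specify how the consensus method combines the individual models. One common scheme, shown below purely as an assumption, averages the predicted probabilities of the individual models (for example, the RF and XGBoost models for one isozyme) and thresholds the mean. The function name, the 0.5 threshold and the example probabilities are hypothetical.

```python
def consensus_predict(model_probabilities, threshold=0.5):
    """Average per-compound probabilities across models and threshold the mean.

    model_probabilities: one list of probabilities per model, all of equal length.
    This is an assumed consensus scheme, not necessarily the one used in the paper.
    """
    if not model_probabilities:
        raise ValueError("at least one model is required")
    n_compounds = len(model_probabilities[0])
    if any(len(p) != n_compounds for p in model_probabilities):
        raise ValueError("all models must score the same compounds")
    n_models = len(model_probabilities)
    # zip(*...) regroups the scores so each tuple holds one compound's scores.
    mean_probs = [sum(scores) / n_models for scores in zip(*model_probabilities)]
    labels = [int(p >= threshold) for p in mean_probs]
    return mean_probs, labels


# Hypothetical probabilities for three compounds from two models.
rf_probs = [0.9, 0.4, 0.2]
xgb_probs = [0.7, 0.7, 0.1]
mean_probs, labels = consensus_predict([rf_probs, xgb_probs])
```

In this example the second compound is rejected by one model (0.4) and accepted by the other (0.7); the averaged score of 0.55 resolves the disagreement in favor of the active class.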
The detailed information of the result for the multi-level datasets
| Level | Total | CYP1A2 | CYP2C9 | CYP2C19 | CYP2D6 | CYP3A4 | Accuracy |
|---|---|---|---|---|---|---|---|
| First level | 1317 (1317) | – | – | – | – | – | 1.000 |
| Second level | 1310 (1308) | 80 (80) | 132 (131) | 27 (27) | 112 (112) | 959 (958) | 0.998 |
| Third level | 1194 (947) | 80 (80) | 128 (112) | 27 (24) | 111 (94) | 848 (637) | 0.793 |
The numbers outside and inside the parentheses are the actual and predicted numbers of DDIs, respectively
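The Accuracy column is consistent with dividing the predicted count by the actual count in the Total column. The short check below reproduces it from the table values and applies the same ratio to the third-level isozyme columns; the variable and function names are illustrative.

```python
# (actual, predicted) DDI counts copied from the table above.
LEVELS = {
    "first": (1317, 1317),
    "second": (1310, 1308),
    "third": (1194, 947),
}
THIRD_LEVEL_BY_ISOZYME = {
    "CYP1A2": (80, 80),
    "CYP2C9": (128, 112),
    "CYP2C19": (27, 24),
    "CYP2D6": (111, 94),
    "CYP3A4": (848, 637),
}


def recovered_fraction(actual, predicted):
    """Fraction of the actual DDIs that the model predicted."""
    return predicted / actual


level_accuracy = {
    name: round(recovered_fraction(actual, predicted), 3)
    for name, (actual, predicted) in LEVELS.items()
}
isozyme_accuracy = {
    name: round(recovered_fraction(actual, predicted), 3)
    for name, (actual, predicted) in THIRD_LEVEL_BY_ISOZYME.items()
}
```

The per-isozyme counts at the third level sum to the Total column (1194 actual, 947 predicted), so the overall 0.793 is a count-weighted average that is dominated by CYP3A4.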
The scaffold analysis results of the substrate and inhibitor datasets
| Isozyme | Dataset | Subset | Number of scaffolds | Number of carbons | Most prominent scaffolds | Common scaffolds |
|---|---|---|---|---|---|---|
| 1A2 | Inhibitor | Positive | 3520 | 1418 | 1526, 3567, 3606 | 4, 1, 168 |
| 1A2 | Inhibitor | Negative | 4257 | 2090 | 54, 7220, 5, 414, 849 | |
| 1A2 | Substrate | Positive | 150 | 94 | 36, 8, 14 | 3, 56 |
| 1A2 | Substrate | Negative | 163 | 113 | 158, 43 | |
| 2C9 | Inhibitor | Positive | 3088 | 1708 | 4866, 4863, 4885 | 4, 3366, 179 |
| 2C9 | Inhibitor | Negative | 4353 | 1793 | 59, 3303, 3413 | |
| 2C9 | Substrate | Positive | 168 | 119 | 66, 36, 70 | 19 |
| 2C9 | Substrate | Negative | 166 | 123 | 173, 199, 254 | |
| 2C19 | Inhibitor | Positive | 4189 | 2015 | 3756, 188, 349, 5176 | 4, 6, 39 |
| 2C19 | Inhibitor | Negative | 3751 | 1614 | 3567, 461, 3570, 7237 | |
| 2C19 | Substrate | Positive | 132 | 89 | 7, 31, 63 | 47, 16 |
| 2C19 | Substrate | Negative | 137 | 105 | 93 | |
| 2D6 | Inhibitor | Positive | 1836 | 1003 | 74, 3742, 3904, 3911 | 4, 36, 1655 |
| 2D6 | Inhibitor | Negative | 6244 | 2580 | 3718, 3744, 56 | |
| 2D6 | Substrate | Positive | 209 | 150 | 40, 42, 45 | 1, 26 |
| 2D6 | Substrate | Negative | 223 | 153 | 212, 62, 38 | |
| 3A4 | Inhibitor | Positive | 3536 | 1912 | 3504, 3539, 3592 | 4, 3576, 168 |
| 3A4 | Inhibitor | Negative | 4227 | 1630 | 58, 429, 7136 | |
| 3A4 | Substrate | Positive | 518 | 385 | 57, 31, 80 | 38, 8, 60 |
| 3A4 | Substrate | Negative | 506 | 301 | 541, 530, 533, 105, 631 | |
The detailed comparison results (Accuracy) between models using different methods
| Dataset | Methods | 1A2_In | 2C9_In | 2C19_In | 2D6_In | 3A4_In | 2C9_Sub | 2D6_Sub | 3A4_Sub |
|---|---|---|---|---|---|---|---|---|---|
| Multi-validation | DNN | 0.293 | 0.345 | 0.471 | 0.712 | 0.506 | – | – | – |
| Multi-validation | Ours | 1.000 | 1.000 | 1.000 | 0.856 | 0.915 | – | – | – |
| NCATS | SB | – | 0.614 | – | 0.550 | 0.653 | 0.618 | 0.607 | 0.663 |
| NCATS | Ours | – | 0.662 | – | 0.629 | 0.664 | 0.632 | 0.608 | 0.547 |
The 10 drugs most frequently predicted to cause DDIs when combined with other drugs
| Drugs | Interact with | No. of predicted DDIs |
|---|---|---|
| Cidofovir | CYP2C9, CYP3A4 | 455 |
| Alendronate | CYP1A2, CYP3A4 | 444 |
| Trifluridine | CYP2D6, CYP3A4 | 442 |
| Promazine | CYP2C19, CYP2D6, CYP3A4 | 439 |
| Vinblastine | CYP3A4 | 424 |
| Chloropyramine | CYP2C19, CYP2D6 | 397 |
| Citalopram | CYP2C19, CYP2D6, CYP3A4 | 396 |
| Clarithromycin | CYP3A4 | 384 |
| Vincristine | CYP3A4 | 384 |
| Nitrendipine | CYP3A4 | 384 |