Yinping Shi1, Yuqing Hua1,2, Baobao Wang3, Ruiqiu Zhang1,2, Xiao Li1,4.
Abstract
Drug-induced nephrotoxicity is a major clinical challenge, and it imposes high costs on the pharmaceutical industry because it is often detected only during the late stages of drug development. Distinguishing nephrotoxic structures at an early stage of drug development is therefore desirable for improving patient health outcomes. In this study, we focused on in silico prediction of drug-induced nephrotoxicity and on insights into its structural basis, using reliable data on human nephrotoxicity. We collected 565 diverse chemical structures, comprising 287 drugs that are nephrotoxic in humans in the real world and 278 non-nephrotoxic approved drugs. Several machine learning and deep learning algorithms were employed for in silico model building. A consensus model was then developed from the three best individual models (RFR_QNPR, XGBOOST_QNPR, and CNF). The consensus model outperformed the individual models on internal validation (Q = 75.88% versus at most 73.90% for any individual model) and achieved a prediction accuracy of 86.24% on external validation. An analysis of molecular property differences between nephrotoxic and non-nephrotoxic structures indicated that several key properties differ significantly, including molecular weight (MW), molecular polar surface area (MPSA), AlogP, number of hydrogen bond acceptors (nHBA), molecular solubility (LogS), number of rotatable bonds (nRotB), and number of aromatic rings (nAR). These molecular properties may play an important part in the identification of nephrotoxic chemicals. Finally, 87 structural alerts for chemical nephrotoxicity were mined by f-score and positive rate analysis of substructures from the Klekota-Roth fingerprint (KRFP). These structural alerts identify nephrotoxic drug structures in the data set well. The in silico models and the structural alerts are freely accessible at https://ochem.eu/article/140251 and http://www.sapredictor.cn, respectively. We hope the results will provide useful tools for early nephrotoxicity estimation in drug development.
Keywords: consensus model; drug induced nephrotoxicity; in silico prediction; structural alert; web-server
Year: 2022 PMID: 35082675 PMCID: PMC8785686 DOI: 10.3389/fphar.2021.793332
Source DB: PubMed Journal: Front Pharmacol ISSN: 1663-9812 Impact factor: 5.810
The number of structures in the data set.
| | Nephrotoxic structures | Non-nephrotoxic structures | Total |
|---|---|---|---|
| Training set | 232 | 224 | 456 |
| Validation set | 55 | 54 | 109 |
| Total | 287 | 278 | 565 |
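
The counts in the table above imply a split of roughly 80/20 with nearly identical class balance in both sets. The following minimal sketch simply redoes that arithmetic; the dictionary layout and the function name `summarize` are illustrative choices, not part of the original study.

```python
# Counts copied from the data set table above.
counts = {
    "training": {"nephrotoxic": 232, "non_nephrotoxic": 224},
    "validation": {"nephrotoxic": 55, "non_nephrotoxic": 54},
}


def summarize(counts):
    """Return the grand total and, per subset, its size, share of the
    whole data set, and fraction of nephrotoxic structures."""
    grand_total = sum(sum(c.values()) for c in counts.values())
    summary = {}
    for name, c in counts.items():
        n = sum(c.values())
        summary[name] = {
            "n": n,
            "share_of_data": n / grand_total,
            "nephrotoxic_fraction": c["nephrotoxic"] / n,
        }
    return grand_total, summary


grand_total, summary = summarize(counts)
for name, row in summary.items():
    print(name, row["n"],
          round(row["share_of_data"], 3),
          round(row["nephrotoxic_fraction"], 3))
```

Both subsets contain about 51% nephrotoxic structures, so accuracy-type metrics on the two sets are not distorted by class imbalance.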
FIGURE 1 Chemical space defined by the first two principal components of CDK descriptors. Red squares stand for the training set, blue circles stand for the validation set.
Performance of the models in 5-fold cross-validation.
| Model | Q (%) | SE (%) | SP (%) | EF | MCC | AUC |
|---|---|---|---|---|---|---|
| XGBOOST_QNPR | 72.81 | 71.98 | 73.66 | 1.46 | 0.46 | 0.80 |
| WEKA_J48_QNPR | 67.11 | 69.83 | 64.29 | 1.37 | 0.34 | 0.67 |
| RFR_QNPR | 72.15 | 71.98 | 72.32 | 1.45 | 0.44 | 0.80 |
| libSVM_QNPR | 68.20 | 66.81 | 69.64 | 1.36 | 0.36 | 0.68 |
| ASNN_QNPR | 69.30 | 68.97 | 69.64 | 1.39 | 0.39 | 0.75 |
| XGBOOST_PyDescriptor | 63.82 | 62.07 | 65.63 | 1.27 | 0.28 | 0.70 |
| WEKA_J48_PyDescriptor | 60.09 | 60.78 | 59.38 | 1.21 | 0.20 | 0.60 |
| RFR_PyDescriptor | 68.86 | 69.83 | 67.86 | 1.39 | 0.38 | 0.74 |
| libSVM_PyDescriptor | 63.38 | 58.19 | 68.75 | 1.25 | 0.27 | 0.63 |
| ASNN_PyDescriptor | 63.38 | 62.07 | 64.73 | 1.27 | 0.27 | 0.68 |
| XGBOOST_MORDRED | 64.04 | 65.09 | 62.95 | 1.29 | 0.28 | 0.69 |
| WEKA_J48_MORDRED | 67.32 | 70.69 | 63.84 | 1.38 | 0.35 | 0.67 |
| RFR_MORDRED | 67.98 | 67.67 | 68.30 | 1.37 | 0.36 | 0.75 |
| libSVM_MORDRED | 67.54 | 66.81 | 68.30 | 1.35 | 0.35 | 0.68 |
| ASNN_MORDRED | 66.01 | 65.09 | 66.96 | 1.32 | 0.32 | 0.71 |
| XGBOOST_GSFrag | 63.86 | 64.32 | 63.39 | 1.28 | 0.28 | 0.68 |
| WEKA_J48_GSFrag | 62.53 | 60.79 | 64.29 | 1.24 | 0.25 | 0.63 |
| RFR_GSFrag | 63.86 | 61.23 | 66.52 | 1.27 | 0.28 | 0.70 |
| libSVM_GSFrag | 64.75 | 57.27 | 72.32 | 1.26 | 0.30 | 0.65 |
| ASNN_GSFrag | 63.41 | 57.27 | 69.64 | 1.24 | 0.27 | 0.67 |
| XGBOOST_Fragmentor | 63.16 | 62.07 | 64.29 | 1.26 | 0.26 | 0.70 |
| WEKA_J48_Fragmentor | 58.77 | 55.17 | 62.50 | 1.17 | 0.18 | 0.59 |
| RFR_Fragmentor | 67.54 | 65.95 | 69.20 | 1.35 | 0.35 | 0.73 |
| libSVM_Fragmentor | 67.76 | 66.38 | 69.20 | 1.35 | 0.36 | 0.68 |
| ASNN_Fragmentor | 64.47 | 63.36 | 65.63 | 1.29 | 0.29 | 0.69 |
| XGBOOST_ECFP4 | 63.82 | 63.36 | 64.29 | 1.28 | 0.28 | 0.70 |
| WEKA_J48_ECFP4 | 59.21 | 56.03 | 62.50 | 1.18 | 0.19 | 0.59 |
| RFR_ECFP4 | 66.45 | 63.79 | 69.20 | 1.32 | 0.33 | 0.73 |
| libSVM_ECFP4 | 65.79 | 63.79 | 67.86 | 1.31 | 0.32 | 0.66 |
| ASNN_ECFP4 | 65.57 | 64.66 | 66.52 | 1.31 | 0.31 | 0.67 |
| XGBOOST_Chemaxon | 62.97 | 64.63 | 61.26 | 1.27 | 0.26 | 0.67 |
| WEKA_J48_Chemaxon | 61.64 | 57.64 | 65.77 | 1.22 | 0.23 | 0.62 |
| RFR_Chemaxon | 64.30 | 61.57 | 67.12 | 1.28 | 0.29 | 0.71 |
| libSVM_Chemaxon | 64.30 | 58.52 | 70.27 | 1.26 | 0.29 | 0.64 |
| ASNN_Chemaxon | 64.97 | 65.07 | 64.86 | 1.31 | 0.30 | 0.70 |
| XGBOOST_alvaDesc | 64.04 | 64.22 | 63.84 | 1.29 | 0.28 | 0.69 |
| WEKA_J48_alvaDesc | 63.38 | 64.66 | 62.05 | 1.28 | 0.27 | 0.63 |
| RFR_alvaDesc | 70.18 | 68.97 | 71.43 | 1.40 | 0.40 | 0.75 |
| libSVM_alvaDesc | 65.13 | 68.53 | 61.61 | 1.33 | 0.30 | 0.65 |
| ASNN_alvaDesc | 70.61 | 68.97 | 72.32 | 1.41 | 0.41 | 0.73 |
| CNF | 73.90 | 69.83 | 78.13 | 1.45 | 0.48 | 0.81 |
| TRANSNNI | 69.45 | 65.80 | 73.21 | 1.37 | 0.39 | 0.74 |
| GNN GIN | 67.11 | 67.24 | 66.96 | 1.35 | 0.34 | 0.74 |
| EAGCNG | 58.54 | 52.42 | 64.73 | 1.15 | 0.17 | 0.62 |
| DEEPCHEM | 70.42 | 53.39 | 80.83 | 1.19 | 0.36 | 0.74 |
| Consensus | 75.88 | 72.84 | 79.02 | 1.50 | 0.52 | 0.83 |
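
As a reading aid for the table above, the sketch below computes Q, SE, SP, and MCC from a confusion matrix, assuming the standard definitions (Q as overall accuracy, SE as sensitivity, SP as specificity, MCC as the Matthews correlation coefficient). EF is left out because its definition is not given here. The confusion-matrix counts in the example are an assumption: they are back-calculated from the XGBOOST_QNPR row and the training set sizes (232 nephrotoxic, 224 non-nephrotoxic), not reported in the source.

```python
from math import sqrt


def classification_metrics(tp, fn, tn, fp):
    """Q, SE, SP (in percent) and MCC from confusion-matrix counts."""
    q = (tp + tn) / (tp + fn + tn + fp)   # overall accuracy
    se = tp / (tp + fn)                   # sensitivity (true positive rate)
    sp = tn / (tn + fp)                   # specificity (true negative rate)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Q": 100 * q, "SE": 100 * se, "SP": 100 * sp, "MCC": mcc}


# Assumed counts, back-calculated from the XGBOOST_QNPR row:
# 167 of 232 nephrotoxic and 165 of 224 non-nephrotoxic structures correct.
m = classification_metrics(tp=167, fn=65, tn=165, fp=59)
print({k: round(v, 2) for k, v in m.items()})
```

With these assumed counts the rounded values agree with the reported row (Q 72.81, SE 71.98, SP 73.66, MCC 0.46), which suggests the standard definitions apply.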
Performance of the models on external validation.
| Model | Q (%) | SE (%) | SP (%) | EF | MCC | AUC |
|---|---|---|---|---|---|---|
| RFR_QNPR | 87.16 | 87.27 | 87.04 | 1.76 | 0.74 | 0.91 |
| XGBOOST_QNPR | 83.49 | 85.45 | 81.48 | 1.71 | 0.67 | 0.90 |
| CNF | 83.49 | 80.00 | 87.04 | 1.64 | 0.67 | 0.89 |
| Consensus | 86.24 | 85.45 | 87.04 | 1.72 | 0.72 | 0.93 |
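
The text here does not state how the three individual models were combined into the consensus model. One common choice, assumed for this sketch, is to average the predicted probabilities of the individual models and apply a 0.5 threshold. The function names, the threshold, and the example probabilities are hypothetical.

```python
def consensus_probability(probabilities):
    """Unweighted mean of the per-model probabilities of the nephrotoxic class."""
    if not probabilities:
        raise ValueError("at least one model prediction is required")
    return sum(probabilities) / len(probabilities)


def consensus_label(probabilities, threshold=0.5):
    """1 (nephrotoxic) if the averaged probability reaches the threshold, else 0."""
    return int(consensus_probability(probabilities) >= threshold)


# Hypothetical predictions for one compound; not taken from the study.
example = {"RFR_QNPR": 0.62, "XGBOOST_QNPR": 0.48, "CNF": 0.55}
p = consensus_probability(list(example.values()))
label = consensus_label(list(example.values()))
print(round(p, 2), label)
```

In this toy case one model votes below the threshold, yet the averaged probability still classifies the compound as nephrotoxic, which illustrates how averaging can smooth out a single dissenting model.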
FIGURE 2 ROC curve of models on external validation. Each color line represents a model.
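
The AUC values reported with the ROC curves can be read as the probability that a randomly chosen nephrotoxic compound receives a higher score than a randomly chosen non-nephrotoxic one. The sketch below computes AUC through this pairwise interpretation; it is a generic illustration with made-up scores, not the authors' implementation.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positive and
    negative scores; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up example: three of the four positive/negative pairs are ranked correctly.
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
print(auc)
```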
FIGURE 3 Distributions of common molecular properties for nephrotoxic and non-nephrotoxic drugs.
Structural alerts present only in nephrotoxic drugs.
| ID | Bit | SMARTS | Positive | Negative |
|---|---|---|---|---|
| 1 | KR413 | [!#1][CH2][CH2]c1[cH][cH][cH][cH][cH]1 | 9 | 0 |
| 2 | KR848 | [!#1][NH]C(=O)[CH]([CH3])[NH]C(=O)[!#1] | 7 | 0 |
| 3 | KR1798 | [!#1]c1[cH][cH]c(F)[cH][cH]1 | 8 | 0 |
| 4 | KR2444 | [!#1]N1[CH2][CH2]N([CH3])[CH2][CH2]1 | 6 | 0 |
| 5 | KR3206 | c1nc2ccccc2[nH]1 | 7 | 0 |
| 6 | KR3280 | CC(=O)c1ccc(N)cc1 | 8 | 0 |
| 7 | KR3540 | Cc1ccc(cc1)c2ccccc2 | 7 | 0 |
| 8 | KR3548 | Cc1ccc(F)cc1 | 8 | 0 |
| 9 | KR3586 | Cc1cccc(F)c1 | 12 | 0 |
| 10 | KR4029 | CS(c1nc2ccccc2[nH]1) | 6 | 0 |
| 11 | KR4064 | Fc1cccc(C=O)c1 | 8 | 0 |
| 12 | KR4065 | Fc1cccc(F)c1 | 8 | 0 |
| 13 | KR4081 | N#Cc1ccccc1 | 6 | 0 |
| 14 | KR4252 | Nc1ccc(F)cc1 | 11 | 0 |
| 15 | KR4556 | O=CNCCCCNC=O | 8 | 0 |
| 16 | KR4651 | OC(=O)C1CCCN1 | 8 | 0 |
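
To illustrate the positive rate analysis mentioned in the abstract, the sketch below computes, for each alert in the table above, the fraction of matching structures that are nephrotoxic, and ranks the alerts by that rate and then by the number of nephrotoxic matches. The definition positive / (positive + negative) is an assumption about how the positive rate is calculated; the f-score step is not reproduced because its exact form is not given here.

```python
# (bit, positive, negative) copied from the structural alert table above.
ALERTS = [
    ("KR413", 9, 0), ("KR848", 7, 0), ("KR1798", 8, 0), ("KR2444", 6, 0),
    ("KR3206", 7, 0), ("KR3280", 8, 0), ("KR3540", 7, 0), ("KR3548", 8, 0),
    ("KR3586", 12, 0), ("KR4029", 6, 0), ("KR4064", 8, 0), ("KR4065", 8, 0),
    ("KR4081", 6, 0), ("KR4252", 11, 0), ("KR4556", 8, 0), ("KR4651", 8, 0),
]


def positive_rate(positive, negative):
    """Assumed definition: share of matching structures that are nephrotoxic."""
    total = positive + negative
    return positive / total if total else 0.0


# Rank by positive rate first, then by the number of nephrotoxic matches.
ranked = sorted(ALERTS, key=lambda a: (-positive_rate(a[1], a[2]), -a[1]))
top3 = [bit for bit, _, _ in ranked[:3]]
print(top3)
```

Because every alert in this table has no non-nephrotoxic matches, all positive rates equal 1.0 and the ranking is decided by support alone.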