Yasunari Matsuzaka, Yoshihiro Uesawa.
Abstract
The constitutive androstane receptor (CAR) plays pivotal roles in drug-induced liver injury through the transcriptional regulation of drug-metabolizing enzymes and transporters. Thus, identifying regulatory factors for CAR activation is important for understanding its mechanisms. Previous studies of CAR activation and its toxicity have relied on in vivo or in vitro analyses, which are expensive, time-consuming, and require many animals. We developed a computational model that predicts CAR agonists using the Toxicology in the 21st Century (Tox21) 10k library. Additionally, we evaluated the prediction performance of a novel deep learning (DL)-based quantitative structure-activity relationship (QSAR) analysis called the DeepSnap-DL approach, which generates omnidirectional snapshots portraying the three-dimensional (3D) structures of chemical compounds. The CAR prediction model built with DeepSnap-DL on chemical structures generated and optimized by the 3D structure generator tool CORINA demonstrated better performance than existing methods using molecular descriptors. These results indicate that preparing suitable 3D chemical structures as input data may be important for achieving high performance in prediction models using the DeepSnap-DL approach and for enabling the identification of CAR modulators.
Keywords: DeepSnap; QSAR; Tox21; chemical structure; constitutive androstane receptor (CAR); deep learning
Year: 2019 PMID: 31574921 PMCID: PMC6801383 DOI: 10.3390/ijms20194855
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 5.923
Figure 1. Contribution of snapshot angles to the performance of prediction models in the DeepSnap approach. The area under the curve (AUC) was calculated for deep learning (DL) prediction models built in GoogLeNet using training, validation, and external test datasets produced by the DeepSnap approach with 92 and 53 different angles, from (360°, 360°, 360°) to (38°, 38°, 38°) and from (360°, 360°, 360°) to (90°, 90°, 90°), with MPS: 100, ZF: 100, AT: 23%, BR: 21.1 mÅ, BMD: 0.4 Å, BT: 0.8 Å, LR: 0.01, and BS: default.
Figure 2. Contribution of different wash conditions, applied when preparing chemical structures with Molecular Operating Environment (MOE) software, to the performance of prediction models. In preparing 3D chemical structures with MOE, combinations of three kinds of protonation (none, dominate, and neutralize) and coordinates (2D, 3D, and CORINA) were used. The images were produced by DeepSnap with the following angles and parameters: (176°, 176°, 176°), MPS: 100, ZF: 100, AT: 23%, BR: 14.5 mÅ, BMD: 0.4 Å, BT: 0.8 Å, using nonoverlapping samples (Tra:Val:Test = 16:16:1). DL-based prediction models were built with GoogLeNet at LRs from 0.001 to 0.0001 (a). The averages of AUCs at each LR were calculated (b).
Prediction performances with different preparations of chemical structures in the DeepSnap.
| | | | | | AUC | | Acc | | MCC | |
|---|---|---|---|---|---|---|---|---|---|---|
| Train:val:test | Protonation | Coordinate | Protonation | Coordinate | Average | SD | Average | SD | Average | SD |
| 1:1:1 | none | 2D | | | 0.930 | 0.006 | 0.967 | 0.007 | 0.821 | 0.035 |
| 1:1:1 | dominate | 2D | | | 0.904 | 0.011 | 0.926 | 0.048 | 0.668 | 0.131 |
| 1:1:1 | neutralize | 2D | | | 0.890 | 0.006 | 0.919 | 0.032 | 0.619 | 0.115 |
| 1:1:1 | none | 3D | | | 0.907 | 0.008 | 0.797 | 0.035 | 0.440 | 0.019 |
| 1:1:1 | dominate | 3D | | | 0.971 | 0.003 | 0.927 | 0.001 | 0.734 | 0.005 |
| 1:1:1 | neutralize | 3D | | | 0.924 | 0.007 | 0.969 | 0.003 | 0.827 | 0.017 |
| 1:1:1 | none | CORINA | | | 0.989 | 0.003 | 0.958 | 0.003 | 0.826 | 0.012 |
| 1:1:1 | dominate | CORINA | | | 0.996 | 0.002 | 0.982 | 0.005 | 0.914 | 0.021 |
| 1:1:1 | neutralize | CORINA | | | **0.998** | 0.002 | 0.991 | 0.006 | 0.954 | 0.026 |
| 1:1:1 | neutralize | 3D | neutralize | CORINA | 0.798 | 0.016 | 0.707 | 0.020 | 0.302 | 0.018 |
| 4:4:1 | none | 2D | | | 0.923 | 0.024 | 0.959 | 0.029 | 0.798 | 0.107 |
| 4:4:1 | dominate | 2D | | | 0.906 | 0.013 | 0.894 | 0.069 | 0.609 | 0.139 |
| 4:4:1 | neutralize | 2D | | | 0.898 | 0.019 | 0.903 | 0.059 | 0.621 | 0.125 |
| 4:4:1 | none | 3D | | | 0.911 | 0.009 | 0.801 | 0.043 | 0.458 | 0.033 |
| 4:4:1 | dominate | 3D | | | 0.972 | 0.003 | 0.928 | 0.012 | 0.739 | 0.030 |
| 4:4:1 | neutralize | 3D | | | 0.927 | 0.011 | 0.971 | 0.002 | 0.839 | 0.010 |
| 4:4:1 | none | CORINA | | | 0.990 | 0.003 | 0.957 | 0.009 | 0.821 | 0.029 |
| 4:4:1 | dominate | CORINA | | | 0.997 | 0.001 | 0.985 | 0.003 | 0.927 | 0.015 |
| 4:4:1 | neutralize | CORINA | | | **0.999** | 0.001 | **0.993** | 0.005 | **0.966** | 0.023 |
| 4:4:1 | neutralize | 3D | neutralize | CORINA | 0.802 | 0.014 | 0.684 | 0.043 | 0.311 | 0.021 |
Parameters (angle: 280°, MPS: 100, ZF: 100, AT: 23%, BR: 14.5 mÅ, BMD: 0.4 Å, BT: 0.8 Å, LR: 0.0008, BS: 108, GoogLeNet); n = 3 or 9 for 1:1:1 or 4:4:1, respectively. Maximum values for AUC, accuracy in the test dataset (Acc), and Matthews correlation coefficient (MCC) in each dataset are indicated in bold.
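Acc and MCC in the table above are both derived from the confusion matrix of binary predictions. The following is a minimal sketch of the standard definitions, not the authors' evaluation code; the function names and the toy labels are hypothetical and chosen only for illustration.

```python
import math

def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = active, 0 = inactive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom

# Hypothetical toy data: 3 actives and 7 inactives (not from the study).
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]

counts = confusion_counts(y_true, y_pred)
acc_value = accuracy(*counts)
mcc_value = mcc(*counts)
print(counts, acc_value, mcc_value)
```

On imbalanced data such as this toy set, Acc can stay high while MCC drops, which is consistent with rows in the table where Acc is near 0.8 but MCC is below 0.5.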
Prediction performances with different angles in the DeepSnap.
| | | | 176° | | 280° | | 360° | | 280°PT | |
|---|---|---|---|---|---|---|---|---|---|---|
| Metric | Train:Val:Test | N | Average | SD | Average | SD | Average | SD | Average | SD |
| AUC | 1:1:1 | 3 | 1.000 | 0.000 | 0.998 | 0.002 | 0.932 | 0.027 | 0.537 | 0.009 |
| | 2:2:1 | 5 | 0.999 | 0.001 | 0.998 | 0.001 | 0.964 | 0.005 | 0.522 | 0.013 |
| | 3:3:1 | 6 | 0.999 | 0.000 | 0.998 | 0.001 | 0.972 | 0.009 | 0.544 | 0.019 |
| | 4:4:1 | 9 | 0.998 | 0.003 | 0.999 | 0.001 | 0.979 | 0.005 | 0.545 | 0.027 |
| | 5:5:1 | 11 | 0.998 | 0.003 | 0.998 | 0.002 | 0.983 | 0.005 | 0.534 | 0.016 |
| | 6:6:1 | 13 | 0.999 | 0.001 | 0.998 | 0.002 | 0.983 | 0.008 | 0.529 | 0.022 |
| | 7:7:1 | 15 | 0.998 | 0.002 | 0.998 | 0.002 | 0.982 | 0.007 | 0.555 | 0.043 |
| | 8:8:1 | 17 | 0.999 | 0.003 | 0.998 | 0.003 | 0.983 | 0.009 | 0.552 | 0.044 |
| Acc | 1:1:1 | 3 | 0.997 | 0.001 | 0.991 | 0.006 | 0.851 | 0.037 | 0.422 | 0.009 |
| | 2:2:1 | 5 | 0.995 | 0.002 | 0.993 | 0.005 | 0.898 | 0.005 | 0.554 | 0.013 |
| | 3:3:1 | 6 | 0.993 | 0.006 | 0.988 | 0.008 | 0.918 | 0.034 | 0.555 | 0.019 |
| | 4:4:1 | 9 | 0.995 | 0.003 | 0.993 | 0.005 | 0.925 | 0.020 | 0.449 | 0.027 |
| | 5:5:1 | 11 | 0.993 | 0.004 | 0.992 | 0.004 | 0.934 | 0.022 | 0.507 | 0.016 |
| | 6:6:1 | 13 | 0.995 | 0.002 | 0.993 | 0.007 | 0.942 | 0.022 | 0.498 | 0.022 |
| | 7:7:1 | 15 | 0.994 | 0.003 | 0.993 | 0.007 | 0.934 | 0.030 | 0.513 | 0.043 |
| | 8:8:1 | 17 | 0.996 | 0.003 | 0.992 | 0.009 | 0.931 | 0.049 | 0.527 | 0.044 |
| MCC | 1:1:1 | 3 | 0.986 | 0.006 | 0.954 | 0.026 | 0.547 | 0.074 | 0.018 | 0.073 |
| | 2:2:1 | 5 | 0.977 | 0.012 | 0.966 | 0.022 | 0.647 | 0.016 | 0.018 | 0.047 |
| | 3:3:1 | 6 | 0.966 | 0.028 | 0.942 | 0.037 | 0.705 | 0.074 | 0.025 | 0.065 |
| | 4:4:1 | 9 | 0.976 | 0.015 | 0.966 | 0.023 | 0.723 | 0.049 | 0.078 | 0.022 |
| | 5:5:1 | 11 | 0.967 | 0.017 | 0.962 | 0.018 | 0.749 | 0.055 | 0.057 | 0.055 |
| | 6:6:1 | 13 | 0.976 | 0.012 | 0.970 | 0.028 | 0.768 | 0.060 | 0.062 | 0.049 |
| | 7:7:1 | 15 | 0.970 | 0.012 | 0.966 | 0.031 | 0.755 | 0.072 | 0.069 | 0.079 |
| | 8:8:1 | 17 | 0.978 | 0.016 | 0.961 | 0.041 | 0.749 | 0.103 | 0.060 | 0.092 |
Parameters (MPS: 100, ZF: 100, AT: 23%, BR: 14.5 mÅ, BMD: 0.4 Å, BT: 0.8 Å, LR: 0.0008, BS: 108, GoogLeNet). Protonation: neutralize; coordinate: CORINA; Train:Val:Test: ratio of training, validation, and test datasets; N: number of external test datasets; Average: means of accuracy in the test dataset (Acc), AUC, and Matthews correlation coefficient (MCC) over N; SD: standard deviations of Acc, AUC, and MCC over N; 280°PT: permutation test for activity scores at the 280° angle.
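The 280°PT column reports AUCs close to 0.5, which is what a permutation test should yield when the link between structures and activity labels is broken. The source does not describe the exact procedure, so the following is only an illustrative sketch: it shuffles labels against fixed, synthetic prediction scores rather than retraining a model on permuted activity scores. All names and data are hypothetical.

```python
import random

def roc_auc(labels, scores):
    """AUC as the probability that a random active outranks a random
    inactive; ties count as 0.5."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

rng = random.Random(0)

# Synthetic data (assumption): 50 actives, 450 inactives, with scores
# drawn so that actives tend to score higher than inactives.
labels = [1] * 50 + [0] * 450
scores = [rng.gauss(2.0, 1.0) if l == 1 else rng.gauss(0.0, 1.0) for l in labels]

auc_true = roc_auc(labels, scores)

# Permutation test: shuffle the labels and recompute the AUC each time.
permuted_aucs = []
for _ in range(100):
    shuffled = labels[:]
    rng.shuffle(shuffled)
    permuted_aucs.append(roc_auc(shuffled, scores))

mean_permuted_auc = sum(permuted_aucs) / len(permuted_aucs)
print(auc_true, mean_permuted_auc)
```

If a model's AUC on permuted labels stayed well above 0.5, that would suggest the reported performance partly reflects an artifact of the data or the split rather than a learned structure-activity relationship.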
Prediction performances with combinations of different angles in the DeepSnap.
| | Angles | | | | AUC | | Acc | | MCC | |
|---|---|---|---|---|---|---|---|---|---|---|
| No. of pictures | Pic1 | Pic2 | Pic3 | Pic4 | Average | SD | Average | SD | Average | SD |
| 4 | 0,0,0 | 280,0,0 | 0,280,0 | 0,0,280 | 0.999 | 0.000 | 0.994 | 0.002 | 0.967 | 0.012 |
| 4 | 280,280,280 | 0,280,280 | 280,0,280 | 280,280,0 | 0.998 | 0.002 | 0.988 | 0.004 | 0.941 | 0.021 |
| 4 | 0,0,0 | 0,280,280 | 280,0,280 | 280,280,0 | 0.998 | 0.001 | 0.990 | 0.003 | 0.952 | 0.014 |
| 4 | 0,0,0 | 280,0,0 | 280,0,280 | 280,280,0 | 0.997 | 0.003 | 0.988 | 0.006 | 0.943 | 0.027 |
| 4 | 0,0,0 | 280,0,0 | 0,280,0 | 280,280,0 | 0.996 | 0.002 | 0.991 | 0.004 | 0.953 | 0.018 |
| 3 | - | 280,0,0 | 0,280,0 | 0,0,280 | 0.995 | 0.004 | 0.984 | 0.006 | 0.921 | 0.027 |
| 3 | 0,0,0 | - | 0,280,0 | 0,0,280 | 0.998 | 0.001 | 0.987 | 0.005 | 0.935 | 0.024 |
| 3 | 0,0,0 | 280,0,0 | - | 0,0,280 | 0.998 | 0.001 | 0.988 | 0.008 | 0.943 | 0.037 |
| 3 | 0,0,0 | 280,0,0 | 0,280,0 | - | 0.995 | 0.002 | 0.984 | 0.007 | 0.921 | 0.032 |
| 2 | 0,0,0 | 280,0,0 | - | - | 0.995 | 0.002 | 0.976 | 0.012 | 0.890 | 0.048 |
| 2 | 0,0,0 | - | 0,280,0 | - | 0.993 | 0.002 | 0.970 | 0.015 | 0.864 | 0.055 |
| 2 | 0,0,0 | - | - | 0,0,280 | 0.996 | 0.000 | 0.978 | 0.009 | 0.896 | 0.034 |
| 2 | - | 280,0,0 | 0,280,0 | - | 0.982 | 0.008 | 0.960 | 0.006 | 0.817 | 0.010 |
| 2 | - | - | 0,280,0 | 0,0,280 | 0.998 | 0.001 | 0.986 | 0.002 | 0.931 | 0.010 |
Parameters (angle: 280°, MPS: 100, ZF: 100, AT: 23%, BR: 14.5 mÅ, BMD: 0.4 Å, BT: 0.8 Å, LR: 0.0008, BS: 108, GoogLeNet). Maximum values for AUC, accuracy in the test dataset (Acc), and Matthews correlation coefficient (MCC) are indicated in bold. Wash in MOE (protonation states: neutralize; coordinating washed species: CORINA).
Prediction performances in extreme gradient boosting (XGB) and random forest (RF).
| | AUC | | Parameters | | |
|---|---|---|---|---|---|
| Model # | Average | SD | Max_Depth | N_estimators | Max_Features |
| XGB_1 | 0.8855 | 0.0071 | 3 | 100 | 29 |
| XGB_2 | 0.8862 | 0.0095 | 3 | 500 | 29 |
| XGB_3 | 0.8854 | 0.0073 | 3 | 1000 | 29 |
| XGB_4 | 0.8885 | 0.0033 | 3 | 5000 | 29 |
| XGB_5 | 0.8883 | 0.0040 | 30 | 1000 | 29 |
| XGB_6 | 0.8872 | 0.0089 | 3 | 5000 | 40 |
| XGB_7 | 0.8851 | 0.0026 | 3 | 5000 | 50 |
| XGB_8 | | 0.0072 | 3 | 5000 | 60 |
| XGB_9 | 0.8873 | 0.0069 | 3 | 5000 | 100 |
| XGB_10 | 0.8835 | 0.0075 | 3 | 5000 | 120 |
| RF_1 | 0.8069 | 0.0193 | 2 | 10 | 29 |
| RF_2 | 0.8314 | 0.0287 | 2 | 100 | 29 |
| RF_3 | 0.8416 | 0.0252 | 2 | 1000 | 29 |
| RF_4 | 0.8803 | 0.0053 | 20 | 1000 | 29 |
| RF_5 | 0.8781 | 0.0104 | 200 | 1000 | 29 |
| RF_6 | 0.8780 | 0.0083 | 20 | 5000 | 29 |
| RF_7 | 0.8702 | 0.0067 | 20 | 1000 | 5 |
| RF_8 | 0.8813 | 0.0032 | 20 | 1000 | 80 |
| RF_9 | 0.8842 | 0.0052 | 20 | 1000 | 120 |
| RF_10 | 0.8807 | 0.0055 | 20 | 1000 | 250 |
Average: means of AUCs for 5 independent tests. SD: standard deviations of AUCs for 5 independent tests. Maximum values for AUC for each model type are indicated in bold.
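The Average and SD columns summarize repeated tests per model configuration. A minimal sketch of that aggregation follows, assuming the sample standard deviation (the source does not state whether sample or population SD was used). The model names and AUC values are hypothetical and are not taken from the table.

```python
from statistics import mean, stdev

# Hypothetical AUCs from 5 independent tests per configuration.
runs = {
    "model_a": [0.88, 0.89, 0.87, 0.90, 0.88],
    "model_b": [0.80, 0.83, 0.78, 0.82, 0.81],
}

# Mean and sample standard deviation for each configuration.
summary = {name: (mean(aucs), stdev(aucs)) for name, aucs in runs.items()}

# The configuration with the highest mean AUC is the one shown in bold.
best = max(summary, key=lambda name: summary[name][0])

for name, (avg, sd) in summary.items():
    marker = " (max)" if name == best else ""
    print(f"{name}: average={avg:.4f}, SD={sd:.4f}{marker}")
```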