Svetoslav H Slavov, Bruce A Pearce, Dan A Buzatu, Jon G Wilkes, Richard D Beger.
Abstract
Multiple validation techniques (Y-scrambling, complete training/test set randomization, determination of the dependence of R2test on the number of randomization cycles, etc.) aimed at improving the reliability of the modeling process were applied, and their effect on the statistical parameters of the models was evaluated. A consensus model combining partial least squares (PLS) and similarity-based k-nearest neighbors (KNN) models, both utilizing 3D-SDAR (three-dimensional spectral data-activity relationship) fingerprint descriptors, was developed to predict the log(1/EC50) values of a dataset of 94 aryl hydrocarbon receptor (AhR) binders. This consensus model was constructed from a PLS model utilizing 10 ppm x 10 ppm x 0.5 Å bins and 7 latent variables (R2test of 0.617) and a KNN model using 2 ppm x 2 ppm x 0.5 Å bins and 6 neighbors (R2test of 0.622). Compared to the individual models, an improvement in predictive performance of approximately 10.5% (R2test of 0.685) was observed. Further experiments indicated that this improvement is likely an outcome of the complementarity of the information contained in 3D-SDAR matrices of different granularity. For similarly sized data sets of AhR binders, the consensus KNN and PLS models compare favorably to earlier reports. The ability of 3D-QSDAR (three-dimensional quantitative spectral data-activity relationship) to provide structural interpretation was illustrated by a projection of the most frequently occurring bins on the standard coordinate space, thus allowing identification of structural features related to toxicity.
Year: 2013 PMID: 24257141 PMCID: PMC3843526 DOI: 10.1186/1758-2946-5-47
Source DB: PubMed Journal: J Cheminform ISSN: 1758-2946 Impact factor: 5.514
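The Y-scrambling check named in the abstract can be illustrated with a minimal sketch. It assumes a one-descriptor least-squares line as a stand-in for the PLS and KNN models, and synthetic data rather than the AhR dataset; all function names are hypothetical. The idea is that a model refit to randomly permuted activities should lose almost all of its R2, whereas the model fit to the real activities should not.

```python
import random

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def r_squared(y_obs, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_obs = sum(y_obs) / len(y_obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(y_obs, y_pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in y_obs)
    return 1.0 - ss_res / ss_tot

def y_scrambling(x, y, n_cycles=200, seed=0):
    """Return (R2 of the real model, mean R2 of models fit to shuffled y)."""
    rng = random.Random(seed)
    a, b = fit_line(x, y)
    r2_real = r_squared(y, [a + b * xi for xi in x])
    r2_scrambled = []
    for _ in range(n_cycles):
        y_perm = y[:]            # copy, then permute the activities
        rng.shuffle(y_perm)
        a, b = fit_line(x, y_perm)
        r2_scrambled.append(r_squared(y_perm, [a + b * xi for xi in x]))
    return r2_real, sum(r2_scrambled) / n_cycles

# Synthetic illustration only (not data from the paper).
data_rng = random.Random(1)
x = [i / 10 for i in range(40)]
y = [2.0 + 1.5 * xi + data_rng.gauss(0, 0.3) for xi in x]
r2_real, r2_mean_scrambled = y_scrambling(x, y)
print(f"real R2 = {r2_real:.3f}, mean scrambled R2 = {r2_mean_scrambled:.3f}")
```

A large gap between the two numbers suggests the fitted relationship is not a chance correlation; the same loop could wrap any regression method in place of `fit_line`.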
Summary of QSARs published since year 2000
| Compounds | Endpoint | N | Method* | Descriptors | Statistics |
| PCBs, PCDDs and PCDFs | logEC50 | 52 | MLR | 13C-NMR | R2 = 0.85; q2 = 0.71 |
| PCBs, PCDDs and PCDFs | logEC50 | 52 | MLR | 13C-NMR, atom-to-atom distances | R2 = 0.85; q2 = 0.52 |
| PCDFs | log(1/EC50) | 33 | MLR | Quantum mechanical; logP | R2 = 0.720; s = 0.723 |
| PCDFs | log(1/EC50) | 34 | MLR | Quantum mechanical | R2 = 0.747; R2adj = 0.669; q2 = 0.572 |
| PCDDs and PCDFs | log(1/EC50) | 90 | PLS | CoMFA - 10 latent variables | R2 = 0.838; q2 = 0.624; SEP = 0.903 |
| PCDFs | log(1/EC50) | 34 | MLR | Quantum mechanical | R2 = 0.863; R2adj = 0.839; q2 = 0.807; SE = 0.558; F = 35.389 |
| PCDDs | log(1/EC50) | 47 | MLR | Quantum mechanical | R2 = 0.729; R2adj = 0.703; SE = 0.797; F = 28.269 |
| PHDDs | log(1/EC50) | 25 | MLR | Quantum mechanical | R2 = 0.768; R2adj = 0.721; q2 = 0.635; SE = 0.762; F = 16.529 |
| PHDDs | log(1/EC50) | 25 | MLR | WHIM | R2 = 0.915; R2adj = 0.902; q2 = 0.880; SE = 0.451; F = 75.032 |
| PCDDs and PCDFs | log(1/EC50) | 60 | MLR | Quantum mechanical | R2 = 0.687; R2adj = 0.686; q2 = 0.603; SE = 0.870 |
*MLR - Multiple Linear Regression; PLS - Partial Least Squares.
AhR binders and their experimental and predicted log(1/EC50) values
| 3,3',4,4'-Tetrachlorobiphenyl | 6.15 | 6.02 | 5.50 | 6.49 | 5.75 | 6.00 |
| 2,3,4,4'-Tetrachlorobiphenyl | 4.55 | 5.27 | 5.35 | 5.15 | 5.28 | 5.25 |
| 3,3',4,4',5-Pentachlorobiphenyl | 6.89 | 5.06 | 5.11 | 5.96 | 5.63 | 5.54 |
| 2',3,4,4',5-Pentachlorobiphenyl | 4.85 | 4.77 | 5.26 | 4.23 | 5.11 | 4.75 |
| 2,3,3',4,4'-Pentachlorobiphenyl | 5.37 | 5.59 | 5.64 | 5.07 | 5.38 | 5.36 |
| 2,3',4,4',5-Pentachlorobiphenyl | 5.04 | 5.47 | 5.47 | 4.74 | 5.29 | 5.11 |
| 2,3,4,4',5-Pentachlorobiphenyl | 5.39 | 4.81 | 4.78 | 5.53 | 5.14 | 5.16 |
| 2,3,3',4,4',5-Hexachlorobiphenyl | 5.15 | 5.33 | 5.22 | 5.61 | 5.19 | 5.42 |
| 2,3',4,4',5,5'-Hexachlorobiphenyl | 4.80 | 5.16 | 5.41 | 4.80 | 5.36 | 5.11 |
| 2,3,3',4,4',5'-Hexachlorobiphenyl | 5.33 | 5.23 | 5.12 | 5.07 | 5.37 | 5.10 |
| 2,2',4,4'-Tetrachlorobiphenyl | 3.89 | 5.22 | 4.83 | 4.49 | 4.94 | 4.66 |
| 2,2',4,4',5,5'-Hexachlorobiphenyl | 4.10 | 4.41 | 5.05 | 3.50 | 4.85 | 4.28 |
| 2,3,4,5-Tetrachlorobiphenyl | 3.85 | 5.55 | 5.20 | 5.35 | 5.29 | 5.28 |
| 2,3',4,4',5',6-Hexachlorobiphenyl | 4.00 | 5.44 | 5.23 | 4.37 | 4.90 | 4.80 |
| 4'-Hydroxy-2,3,4,5-tetrachlorobiphenyl | 4.05 | 5.88 | 5.05 | 5.07 | 4.86 | 5.07 |
| 4'-Methyl-2,3,4,5-tetrachlorobiphenyl | 4.51 | 5.21 | 5.27 | 5.13 | 4.86 | 5.20 |
| 4'-Fluoro-2,3,4,5-tetrachlorobiphenyl | 4.60 | 5.13 | 4.92 | 4.37 | 4.67 | 4.65 |
| 4'-Methoxy-2,3,4,5-tetrachlorobiphenyl | 4.80 | 5.35 | 5.15 | 4.32 | 4.74 | 4.74 |
| 4'-Acetyl-2,3,4,5-tetrachlorobiphenyl | 5.17 | 5.00 | 4.87 | 4.14 | 4.98 | 4.51 |
| 4'-Cyano-2,3,4,5-tetrachlorobiphenyl | 5.27 | 5.48 | 5.05 | 4.29 | 4.78 | 4.67 |
| 4'-Ethyl-2,3,4,5-tetrachlorobiphenyl | 5.46 | 5.13 | 5.06 | 4.50 | 4.82 | 4.78 |
| 4'-Bromo-2,3,4,5-tetrachlorobiphenyl | 5.60 | 5.42 | 5.34 | 5.27 | 5.51 | 5.31 |
| 4'-Iodo-2,3,4,5-tetrachlorobiphenyl | 5.82 | 5.53 | 5.16 | 5.88 | 5.84 | 5.52 |
| 4'-isopropyl-2,3,4,5-tetrachlorobiphenyl | 5.89 | 5.77 | 5.45 | 5.07 | 4.75 | 5.26 |
| 4'-Trifluoromethyl-2,3,4,5-tetrachlorobiphenyl | 6.43 | 5.42 | 5.25 | 4.46 | 4.80 | 4.86 |
| 3'-Nitro-2,3,4,5-tetrachlorobiphenyl | 4.85 | 5.51 | 5.27 | 5.07 | 4.75 | 5.17 |
| 4'-N-Acetylamino-2,3,4,5-tetrachlorobiphenyl | 5.09 | 5.26 | 4.87 | 5.09 | 4.96 | 4.98 |
| 4'-Phenyl-2,3,4,5-tetrachlorobiphenyl | 5.18 | 4.74 | 5.03 | 4.69 | 5.01 | 4.86 |
| 4'-t-Butyl-2,3,4,5-tetrachlorobiphenyl | 5.17 | 5.12 | 5.34 | 4.71 | 4.89 | 5.03 |
| 4'-n-Butyl-2,3,4,5-tetrachlorobiphenyl | 5.13 | 5.12 | 5.13 | 5.44 | 4.93 | 5.29 |
| 2,3,7,8-Tetrachlorodibenzo-p-dioxin | 8.00 | 8.27 | 7.66 | 7.10 | 7.28 | 7.38 |
| 1,2,3,7,8-Pentachlorodibenzo-p-dioxin | 7.10 | 6.10 | 6.73 | 6.43 | 5.99 | 6.58 |
| 2,3,6,7-Tetrachlorodibenzo-p-dioxin | 6.80 | 6.56 | 6.76 | 5.92 | 5.96 | 6.34 |
| 2,3,6-Trichlorodibenzo-p-dioxin | 6.66 | 6.31 | 6.67 | 5.85 | 5.90 | 6.26 |
| 1,2,3,4,7,8-Hexachlorodibenzo-p-dioxin | 6.55 | 5.83 | 6.10 | 5.84 | 5.69 | 5.97 |
| 1,3,7,8-Tetrachlorodibenzo-p-dioxin | 6.10 | 6.22 | 6.68 | 6.03 | 6.12 | 6.36 |
| 1,2,4,7,8-Pentachlorodibenzo-p-dioxin | 5.96 | 5.99 | 6.46 | 5.41 | 5.86 | 5.94 |
| 1,2,3,4-Tetrachlorodibenzo-p-dioxin | 5.89 | 4.39 | 5.44 | 5.96 | 5.87 | 5.70 |
| 2,3,7-Trichlorodibenzo-p-dioxin | 7.15 | 6.72 | 6.84 | 6.69 | 7.37 | 6.77 |
| 2,8-Dichlorodibenzo-p-dioxin | 5.50 | 5.73 | 6.04 | 7.83 | 7.94 | 6.94 |
| 1,2,3,4,7-Pentachlorodibenzo-p-dioxin | 5.19 | 5.68 | 6.02 | 5.69 | 5.95 | 5.86 |
| 1,2,4-Trichlorodibenzo-p-dioxin | 4.89 | 5.46 | 5.90 | 6.12 | 5.99 | 6.01 |
| 1,2,3,4,6,7,8,9-octachlorodibenzo-p-dioxin | 5.00 | 6.78 | 7.76 | 4.77 | 5.74 | 6.27 |
| 1-Chlorodibenzo-p-dioxin | 4.00 | 5.97 | 6.09 | 6.44 | 6.54 | 6.28 |
| 2,3,7,8-Tetrabromodibenzo-p-dioxin | 8.82 | 9.29 | 8.61 | 9.86 | 8.43 | 9.24 |
| 2,3-Dibromo-7,8-dichlorodibenzo-p-dioxin | 8.83 | 8.56 | 8.43 | 8.55 | 8.15 | 8.49 |
| 2,8-Dibromo-3,7-dichlorodibenzo-p-dioxin | 9.35 | 7.54 | 7.86 | 6.87 | 7.06 | 7.37 |
| 2-Bromo-3,7,8-trichlorodibenzo-p-dioxin | 7.94 | 8.31 | 8.05 | 7.26 | 7.40 | 7.66 |
| 1,3,7,8,9-Pentabromodibenzo-p-dioxin | 7.03 | 7.25 | 7.99 | 7.53 | 8.29 | 7.76 |
| 1,3,7,8-Tetrabromodibenzo-p-dioxin | 8.70 | 7.38 | 8.51 | 8.22 | 8.48 | 8.37 |
| 1,2,4,7,8-Pentabromodibenzo-p-dioxin | 7.77 | 7.31 | 8.06 | 9.20 | 8.24 | 8.63 |
| 1,2,3,7,8-Pentabromodibenzo-p-dioxin | 8.18 | 8.31 | 8.65 | 8.40 | 8.57 | 8.53 |
| 2,3,7-Tribromodibenzo-p-dioxin | 8.93 | 8.10 | 8.40 | 8.23 | 8.42 | 8.32 |
| 2,7-Dibromodibenzo-p-dioxin | 7.81 | 7.48 | 7.36 | 7.07 | 8.06 | 7.22 |
| 2-Bromodibenzo-p-dioxin | 6.53 | 6.67 | 7.03 | 8.22 | 7.73 | 7.63 |
| 2-Chlorodibenzofuran | 3.55 | 3.94 | 4.48 | 3.76 | 3.78 | 4.12 |
| 3-Chlorodibenzofuran | 4.38 | 5.13 | 5.01 | 5.75 | 5.89 | 5.38 |
| 4-Chlorodibenzofuran | 3.00 | 5.20 | 4.54 | 4.80 | 5.37 | 4.67 |
| 2,3-Dichlorodibenzofuran | 5.33 | 5.29 | 4.77 | 5.68 | 5.71 | 5.23 |
| 2,6-Dichlorodibenzofuran | 3.61 | 5.03 | 4.85 | 3.50 | 4.14 | 4.18 |
| 2,8-Dichlorodibenzofuran | 3.59 | 4.21 | 4.77 | 3.76 | 3.88 | 4.27 |
| 1,3,6-Trichlorodibenzofuran | 5.36 | 6.28 | 6.21 | 5.70 | 5.57 | 5.96 |
| 1,3,8-Trichlorodibenzofuran | 4.07 | 5.80 | 5.82 | 5.28 | 5.40 | 5.55 |
| 2,3,4-Trichlorodibenzofuran | 4.72 | 6.78 | 5.80 | 5.73 | 5.83 | 5.77 |
| 2,3,8-Trichlorodibenzofuran | 6.00 | 5.58 | 5.07 | 5.63 | 5.59 | 5.35 |
| 2,6,7-Trichlorodibenzofuran | 6.35 | 5.64 | 5.29 | 5.38 | 4.98 | 5.34 |
| 2,3,4,6-Tetrachlorodibenzofuran | 6.46 | 5.95 | 5.86 | 6.68 | 5.56 | 6.27 |
| 2,3,4,8-Tetrachlorodibenzofuran | 6.70 | 6.19 | 5.84 | 5.55 | 5.38 | 5.70 |
| 1,3,6,8-Tetrachlorodibenzofuran | 6.66 | 5.63 | 5.52 | 6.36 | 5.92 | 5.94 |
| 2,3,7,8-Tetrachlorodibenzofuran | 7.39 | 6.96 | 6.54 | 7.18 | 6.84 | 6.86 |
| 1,2,4,8-Tetrachlorodibenzofuran | 5.00 | 5.16 | 5.32 | 4.19 | 4.90 | 4.76 |
| 1,2,4,6,7-Pentachlorodibenzofuran | 7.17 | 5.65 | 5.50 | 5.82 | 5.54 | 5.66 |
| 1,2,4,7,9-Pentachlorodibenzofuran | 4.70 | 6.82 | 6.34 | 5.22 | 5.40 | 5.78 |
| 1,2,3,4,8-Pentachlorodibenzofuran | 6.92 | 6.42 | 5.74 | 5.49 | 5.21 | 5.62 |
| 1,2,3,7,8-Pentachlorodibenzofuran | 7.13 | 7.03 | 6.56 | 6.96 | 7.19 | 6.76 |
| 1,2,4,7,8-Pentachlorodibenzofuran | 5.89 | 5.94 | 5.57 | 6.32 | 5.94 | 5.95 |
| 2,3,4,7,8-Pentachlorodibenzofuran | 7.82 | 6.42 | 6.42 | 7.08 | 6.80 | 6.75 |
| 1,2,3,4,7,8-Hexachlorodibenzofuran | 6.64 | 6.61 | 6.06 | 7.22 | 6.95 | 6.64 |
| 1,2,3,6,7,8-Hexachlorodibenzofuran | 6.57 | 7.22 | 6.78 | 6.67 | 6.47 | 6.73 |
| 1,2,4,6,7,8-Hexachlorodibenzofuran | 5.08 | 6.58 | 5.83 | 6.53 | 5.70 | 6.18 |
| 2,3,4,6,7,8-Hexachlorodibenzofuran | 7.33 | 7.93 | 6.85 | 7.73 | 6.60 | 7.29 |
| 2,3,6,8-Tetrachlorodibenzofuran | 6.66 | 5.39 | 5.23 | 5.58 | 5.42 | 5.41 |
| 1,2,3,6-Tetrachlorodibenzofuran | 6.46 | 4.93 | 5.36 | 6.17 | 5.85 | 5.77 |
| 1,2,3,7-Tetrachlorodibenzofuran | 6.96 | 6.93 | 6.57 | 7.00 | 7.22 | 6.79 |
| 1,3,4,7,8-Pentachlorodibenzofuran | 6.70 | 6.82 | 6.59 | 6.60 | 6.53 | 6.60 |
| 2,3,4,7,9-Pentachlorodibenzofuran | 6.70 | 6.54 | 6.34 | 7.29 | 6.99 | 6.82 |
| 1,2,3,7,9-Pentachlorodibenzofuran | 6.40 | 6.32 | 6.40 | 6.69 | 6.94 | 6.55 |
| H | 3.00 | 3.53 | 4.46 | 3.98 | 3.95 | 4.22 |
| 2,3,4,7-Tetrachlorodibenzofuran | 7.60 | 6.08 | 6.44 | 6.37 | 6.29 | 6.41 |
| 1,2,3,7-Tetrachlorodibenzofuran | 6.96 | 6.97 | 6.59 | 7.00 | 7.17 | 6.80 |
| 1,3,4,7,8-Pentachlorodibenzofuran | 6.70 | 6.84 | 6.58 | 6.62 | 6.52 | 6.60 |
| 2,3,4,7,9-Pentachlorodibenzofuran | 6.70 | 6.52 | 6.36 | 7.23 | 6.96 | 6.80 |
| 1,2,3,7,9-Pentachlorodibenzofuran | 6.40 | 6.38 | 6.41 | 6.68 | 6.94 | 6.55 |
| 1,2,4,6,8-Pentachlorodibenzofuran | 5.51 | 5.81 | 5.61 | 3.30 | 4.80 | 4.46 |
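The column headers of the table above did not survive extraction, so the following is only a consistency check under a stated assumption: that the last numeric column is a consensus prediction formed as the plain average of the third and fourth numeric columns. The four rows are copied from the table; the function name `consensus` is hypothetical.

```python
# Numeric columns copied from four rows of the table above, in table order.
rows = {
    "3,3',4,4'-Tetrachlorobiphenyl": (6.15, 6.02, 5.50, 6.49, 5.75, 6.00),
    "2,3,7,8-Tetrachlorodibenzo-p-dioxin": (8.00, 8.27, 7.66, 7.10, 7.28, 7.38),
    "2,3,7,8-Tetrachlorodibenzofuran": (7.39, 6.96, 6.54, 7.18, 6.84, 6.86),
    "1,2,4,6,8-Pentachlorodibenzofuran": (5.51, 5.81, 5.61, 3.30, 4.80, 4.46),
}

def consensus(pred_a, pred_b):
    """Unweighted mean of two model predictions (assumed consensus rule)."""
    return (pred_a + pred_b) / 2.0

# Largest gap between the recomputed mean and the tabulated last column.
max_gap = max(abs(consensus(v[2], v[3]) - v[5]) for v in rows.values())
print(f"largest deviation from the tabulated value: {max_gap:.4f}")
```

For these rows the deviation stays within two-decimal rounding, which is consistent with the averaging assumption but does not by itself establish which model produced which column.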
Figure 1. (a) Structure of 2,3,7,8-tetrachlorodibenzo-p-dioxin; (b) 13C NMR spectrum of 2,3,7,8-tetrachlorodibenzo-p-dioxin; (c) 3D fingerprint of 2,3,7,8-tetrachlorodibenzo-p-dioxin. The gray circles represent the shadows of the fingerprint elements projected onto a coordinate plane, and the drop lines are shown to better indicate the elements' positions in the 3D-SDAR abstract space.
Figure 2. Average predictive performance of the PLS and KNN models as a function of the number of training/test cycles.
Figure 3. Tanimoto similarity between pairs of compounds for the AhR dataset using bins.
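As an illustration of the similarity measure named in the Figure 3 caption, the sketch below computes the Tanimoto coefficient between two sets of occupied bins and uses it for a simple k-nearest-neighbor prediction. The bin indices and activities are invented, the unweighted neighbor average is an assumption, and the function names are hypothetical; the paper's exact KNN weighting is not given in this extract.

```python
def tanimoto(bins_a, bins_b):
    """Tanimoto coefficient |A & B| / |A | B| of two sets of occupied bins."""
    a, b = set(bins_a), set(bins_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)

def knn_predict(query_bins, training, k):
    """Mean activity of the k training compounds most similar to the query.

    training is a list of (occupied_bins, activity) pairs.
    """
    ranked = sorted(training,
                    key=lambda item: tanimoto(query_bins, item[0]),
                    reverse=True)
    nearest = ranked[:k]
    return sum(activity for _, activity in nearest) / len(nearest)

# Invented bin indices (shift bin, shift bin, distance bin) and activities.
fp_a = {(12, 12, 3), (12, 13, 5), (13, 13, 3)}
fp_b = {(12, 12, 3), (12, 13, 5), (11, 12, 4)}
fp_c = {(1, 1, 1)}
similarity_ab = tanimoto(fp_a, fp_b)   # 2 shared bins out of 4 distinct

training = [(fp_a, 8.0), (fp_b, 7.0), (fp_c, 4.0)]
query = {(12, 12, 3), (12, 13, 5)}
predicted = knn_predict(query, training, k=2)
print(similarity_ab, predicted)
```

Representing a fingerprint as a set of occupied bins keeps the similarity independent of how many fingerprint elements fall into each bin; a count-based variant would be a different design choice.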
Average statistical parameters of the best PLS and KNN models at a given number of LVs and neighbors as a function of the granularity of the 3D-SDAR space
| 2 ppm x 2 ppm x 0.5 Å | 3 | 0.591 | 0.143 | 0.085 | 0.103 | 6 | 0.622* | 0.170 |
| 4 ppm x 4 ppm x 0.5 Å | 3 | 0.604 | 0.142 | 0.088 | 0.109 | 5 | 0.606 | 0.146 |
| 6 ppm x 6 ppm x 0.5 Å | 5 | 0.532 | 0.167 | 0.074 | 0.097 | 7 | 0.453 | 0.178 |
| 8 ppm x 8 ppm x 0.5 Å | 5 | 0.593 | 0.142 | 0.097 | 0.113 | 6 | 0.520 | 0.162 |
| 10 ppm x 10 ppm x 0.5 Å | 7 | 0.617* | 0.147 | 0.085 | 0.113 | 4 | 0.612 | 0.162 |
| 12 ppm x 12 ppm x 0.5 Å | 3 | 0.474 | 0.178 | 0.105 | 0.115 | 9 | 0.432 | 0.181 |
| 14 ppm x 14 ppm x 0.5 Å | 2 | 0.321 | 0.193 | 0.096 | 0.121 | 10 | 0.312 | 0.179 |
| 16 ppm x 16 ppm x 0.5 Å | 3 | 0.383 | 0.154 | 0.073 | 0.090 | 10 | 0.353 | 0.166 |
| 18 ppm x 18 ppm x 0.5 Å | 2 | 0.307 | 0.189 | 0.077 | 0.100 | 10 | 0.307 | 0.186 |
| 20 ppm x 20 ppm x 0.5 Å | 2 | 0.410 | 0.178 | 0.122 | 0.137 | 9 | 0.356 | 0.180 |
| 2 ppm x 2 ppm x 1.0 Å | 3 | 0.567 | 0.149 | 0.082 | 0.095 | 6 | 0.599 | 0.181 |
| 4 ppm x 4 ppm x 1.0 Å | 3 | 0.562 | 0.149 | 0.081 | 0.099 | 3 | 0.558 | 0.179 |
| 6 ppm x 6 ppm x 1.0 Å | 5 | 0.526 | 0.164 | 0.076 | 0.099 | 7 | 0.466 | 0.178 |
| 8 ppm x 8 ppm x 1.0 Å | 4 | 0.542 | 0.161 | 0.095 | 0.116 | 6 | 0.504 | 0.164 |
| 10 ppm x 10 ppm x 1.0 Å | 6 | 0.597 | 0.153 | 0.086 | 0.100 | 4 | 0.593 | 0.162 |
| 12 ppm x 12 ppm x 1.0 Å | 2 | 0.440 | 0.176 | 0.101 | 0.128 | 10 | 0.429 | 0.182 |
| 14 ppm x 14 ppm x 1.0 Å | 2 | 0.315 | 0.195 | 0.100 | 0.125 | 10 | 0.327 | 0.179 |
| 16 ppm x 16 ppm x 1.0 Å | 5 | 0.251 | 0.147 | 0.069 | 0.090 | 10 | 0.357 | 0.168 |
| 18 ppm x 18 ppm x 1.0 Å | 2 | 0.296 | 0.189 | 0.077 | 0.106 | 10 | 0.292 | 0.185 |
| 20 ppm x 20 ppm x 1.0 Å | 2 | 0.405 | 0.176 | 0.128 | 0.137 | 10 | 0.358 | 0.180 |
| 2 ppm x 2 ppm x 1.5 Å | 3 | 0.537 | 0.163 | 0.074 | 0.087 | 5 | 0.603 | 0.178 |
| 4 ppm x 4 ppm x 1.5 Å | 3 | 0.542 | 0.151 | 0.077 | 0.101 | 6 | 0.574 | 0.160 |
| 6 ppm x 6 ppm x 1.5 Å | 5 | 0.536 | 0.164 | 0.073 | 0.112 | 5 | 0.481 | 0.169 |
| 8 ppm x 8 ppm x 1.5 Å | 8 | 0.500 | 0.196 | 0.090 | 0.106 | 9 | 0.498 | 0.164 |
| 10 ppm x 10 ppm x 1.5 Å | 8 | 0.531 | 0.180 | 0.092 | 0.106 | 5 | 0.585 | 0.166 |
| 12 ppm x 12 ppm x 1.5 Å | 2 | 0.440 | 0.174 | 0.104 | 0.132 | 10 | 0.421 | 0.180 |
| 14 ppm x 14 ppm x 1.5 Å | 8 | 0.267 | 0.155 | 0.073 | 0.082 | 10 | 0.316 | 0.181 |
| 16 ppm x 16 ppm x 1.5 Å | 6 | 0.286 | 0.147 | 0.063 | 0.081 | 10 | 0.359 | 0.169 |
| 18 ppm x 18 ppm x 1.5 Å | 2 | 0.302 | 0.188 | 0.079 | 0.111 | 7 | 0.291 | 0.180 |
| 20 ppm x 20 ppm x 1.5 Å | 2 | 0.406 | 0.176 | 0.121 | 0.138 | 10 | 0.365 | 0.182 |
| 2 ppm x 2 ppm x 2.0 Å | 2 | 0.495 | 0.177 | 0.071 | 0.086 | 6 | 0.576 | 0.180 |
| 4 ppm x 4 ppm x 2.0 Å | 3 | 0.504 | 0.158 | 0.080 | 0.102 | 7 | 0.535 | 0.172 |
| 6 ppm x 6 ppm x 2.0 Å | 5 | 0.500 | 0.170 | 0.071 | 0.095 | 6 | 0.467 | 0.173 |
| 8 ppm x 8 ppm x 2.0 Å | 4 | 0.508 | 0.159 | 0.095 | 0.121 | 10 | 0.481 | 0.169 |
| 10 ppm x 10 ppm x 2.0 Å | 4 | 0.498 | 0.174 | 0.088 | 0.105 | 10 | 0.557 | 0.174 |
| 12 ppm x 12 ppm x 2.0 Å | 3 | 0.450 | 0.171 | 0.102 | 0.116 | 10 | 0.430 | 0.181 |
| 14 ppm x 14 ppm x 2.0 Å | 9 | 0.297 | 0.156 | 0.078 | 0.093 | 10 | 0.329 | 0.186 |
| 16 ppm x 16 ppm x 2.0 Å | 7 | 0.207 | 0.142 | 0.057 | 0.075 | 10 | 0.359 | 0.166 |
| 18 ppm x 18 ppm x 2.0 Å | 2 | 0.273 | 0.179 | 0.070 | 0.112 | 10 | 0.308 | 0.188 |
| 20 ppm x 20 ppm x 2.0 Å | 2 | 0.410 | 0.174 | 0.131 | 0.137 | 10 | 0.383 | 0.179 |
| 2 ppm x 2 ppm x 2.5 Å | 2 | 0.481 | 0.18 | 0.076 | 0.087 | 8 | 0.555 | 0.185 |
| 4 ppm x 4 ppm x 2.5 Å | 3 | 0.485 | 0.163 | 0.079 | 0.101 | 7 | 0.522 | 0.182 |
| 6 ppm x 6 ppm x 2.5 Å | 5 | 0.492 | 0.165 | 0.071 | 0.101 | 7 | 0.465 | 0.175 |
| 8 ppm x 8 ppm x 2.5 Å | 3 | 0.422 | 0.173 | 0.097 | 0.122 | 6 | 0.485 | 0.175 |
| 10 ppm x 10 ppm x 2.5 Å | 10 | 0.471 | 0.222 | 0.072 | 0.082 | 3 | 0.568 | 0.172 |
| 12 ppm x 12 ppm x 2.5 Å | 2 | 0.404 | 0.174 | 0.097 | 0.135 | 10 | 0.429 | 0.180 |
| 14 ppm x 14 ppm x 2.5 Å | 8 | 0.286 | 0.158 | 0.073 | 0.094 | 10 | 0.315 | 0.186 |
| 16 ppm x 16 ppm x 2.5 Å | 7 | 0.244 | 0.133 | 0.057 | 0.076 | 10 | 0.339 | 0.167 |
| 18 ppm x 18 ppm x 2.5 Å | 3 | 0.282 | 0.173 | 0.081 | 0.092 | 10 | 0.293 | 0.184 |
| 20 ppm x 20 ppm x 2.5 Å | 1 | 0.397 | 0.176 | 0.137 | 0.152 | 10 | 0.358 | 0.176 |
*indicates the best PLS and KNN models.
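To make the notion of granularity in the table above concrete, here is a minimal sketch of how fingerprint elements might be assigned to bins of a given size. It assumes each element is a triple (chemical shift of atom i in ppm, chemical shift of atom j in ppm, interatomic distance in Å), that the two shifts are ordered so a pair maps to one bin, and that bins start at zero; the element values and function names are hypothetical.

```python
import math

def bin_index(shift_i, shift_j, distance, ppm_width, angstrom_width):
    """Map one fingerprint element to an integer bin index triple."""
    low, high = sorted((shift_i, shift_j))  # order-independent pair (assumption)
    return (math.floor(low / ppm_width),
            math.floor(high / ppm_width),
            math.floor(distance / angstrom_width))

def bin_fingerprint(elements, ppm_width, angstrom_width):
    """Count how many fingerprint elements fall into each occupied bin."""
    counts = {}
    for shift_i, shift_j, distance in elements:
        key = bin_index(shift_i, shift_j, distance, ppm_width, angstrom_width)
        counts[key] = counts.get(key, 0) + 1
    return counts

# Invented (ppm, ppm, Å) triples for illustration only.
elements = [
    (121.3, 123.8, 1.39),
    (124.5, 125.9, 1.38),
    (122.9, 141.7, 2.41),
    (141.7, 142.2, 1.40),
]
fine = bin_fingerprint(elements, ppm_width=2.0, angstrom_width=0.5)
coarse = bin_fingerprint(elements, ppm_width=10.0, angstrom_width=0.5)
print(len(fine), "occupied bins at 2 ppm;", len(coarse), "at 10 ppm")
```

With the coarser 10 ppm grid the first two elements share a bin, which illustrates why matrices of different granularity can carry different information about the same compound.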
Figure 4. Average R2test for the (a) PLS and (b) KNN models as a function of the 3D-bin size.
Figure 5. Plot of the predicted vs. observed log(1/EC50) values for: (a) the composite PLS model using 10 ppm x 10 ppm x 0.5 Å bins and 7 LVs; (b) the composite KNN model using 2 ppm x 2 ppm x 0.5 Å bins and 6 neighbors; and (c) the PLS-KNN consensus model.
Improvement of R2test of consensus models over the average R2test of the individual models (in %)
| 1 | PLS 10 ppm x 10 ppm x 0.5 Å | KNN 2 ppm x 2 ppm x 0.5 Å | 0.685 | 0.620 | 10.5 |
| 2 | PLS 10 ppm x 10 ppm x 0.5 Å | PLS 2 ppm x 2 ppm x 0.5 Å | 0.673 | 0.609 | 10.5 |
| 3 | PLS 2 ppm x 2 ppm x 0.5 Å | KNN 10 ppm x 10 ppm x 0.5 Å | 0.658 | 0.603 | 9.1 |
| 4 | KNN 2 ppm x 2 ppm x 0.5 Å | KNN 10 ppm x 10 ppm x 0.5 Å | 0.654 | 0.614 | 6.5 |
| 5 | PLS 2 ppm x 2 ppm x 0.5 Å | KNN 2 ppm x 2 ppm x 0.5 Å | 0.640 | 0.612 | 4.6 |
| 6 | PLS 10 ppm x 10 ppm x 0.5 Å | KNN 10 ppm x 10 ppm x 0.5 Å | 0.633 | 0.611 | 3.6 |
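The percentages in the last column of the table above can be reproduced from the two preceding columns, assuming the improvement is defined as the relative gain of the consensus value over the average of the individual models. The sketch below recomputes it for all six rows; the function name is hypothetical.

```python
def improvement_percent(r2_consensus, r2_average):
    """Relative gain of the consensus value over the individual average, in %."""
    return 100.0 * (r2_consensus - r2_average) / r2_average

# (consensus value, average of individual models, tabulated improvement in %)
table_rows = [
    (0.685, 0.620, 10.5),
    (0.673, 0.609, 10.5),
    (0.658, 0.603, 9.1),
    (0.654, 0.614, 6.5),
    (0.640, 0.612, 4.6),
    (0.633, 0.611, 3.6),
]

recomputed = [improvement_percent(c, a) for c, a, _ in table_rows]
for (c, a, tabulated), value in zip(table_rows, recomputed):
    print(f"{c:.3f} vs {a:.3f}: {value:.1f}% (tabulated {tabulated}%)")
```

All six recomputed values agree with the tabulated ones to one decimal place, so the relative-gain definition is consistent with the table.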
Figure 6. Ranked (6a) and matched test set pairs (6b) hold-out R2 of the 100 individual PLS and KNN models producing the best composite models. The distribution of the differences in hold-out R2 is shown in 6c.
Figure 7. Orthographic projections onto two coordinate planes (Figures 7a and 7d; Figures 7b and 7e) of the most frequently occurring bins with positive and negative PLS weights, mapped back to the 3D-QSDAR abstract space shown in Figures 7c and 7f.
Figure 8. Frequently occurring positively weighted bins from Figure 7c superimposed over the structures of dioxins. For clarity only a few bins are shown, though many more were present.
Figure 9. Frequently occurring negatively weighted bins from Figure 7f superimposed over the structures of PCBs. For clarity only a few bins are shown, though many more were present.