Sivani Baskaran, Ying Duan Lei, Frank Wania.
Abstract
The octanol-air equilibrium partition ratio (KOA) is frequently used to describe the volatility of organic chemicals, whereby n-octanol serves as a substitute for a variety of organic phases ranging from organic matter in atmospheric particles and soils, to biological tissues such as plant foliage, fat, blood, and milk, and to polymeric sorbents. Because measured KOA values exist for just over 500 compounds, most of which are nonpolar halogenated aromatics, there is a need for tools that can reliably predict this parameter for a wide range of organic molecules, ideally at different temperatures. The ability of five techniques, specifically polyparameter linear free energy relationships (ppLFERs) with either experimental or predicted solute descriptors, EPISuite's KOAWIN, COSMOtherm, and OPERA, to predict the KOA of organic substances, either at 25 °C or at any temperature, was assessed by comparison with all KOA values measured to date. In addition, three different ppLFER equations for KOA were evaluated, and a new modified equation is proposed. A technique's performance was quantified with the mean absolute error (MAE), the root mean square error (RMSE), and the estimated uncertainty of future predicted values, that is, the prediction interval. We also considered each model's applicability domain and accessibility. With an RMSE of 0.37 and a MAE of 0.23 for predictions of log KOA at 25 °C and RMSE of 0.32 and MAE of 0.21 for predictions made at any temperature, the ppLFER equation using experimental solute descriptors predicted the KOA the best. Even if solute descriptors must be predicted in the absence of experimental values, ppLFERs are the preferred method, also because they are easy to use and freely available. Environ Toxicol Chem 2021;40:3166-3180.
Keywords: Environmental partitioning; Organic contaminants; Partitioning coefficient; Partitioning ratio; Quantitative structure-activity relationships
Year: 2021 PMID: 34473856 PMCID: PMC9292506 DOI: 10.1002/etc.5201
Source DB: PubMed Journal: Environ Toxicol Chem ISSN: 0730-7268 Impact factor: 4.218
Reliability score of polyparameter linear free energy relationship (ppLFER) predictions based on the overall error (OE) of the estimate
| Reliability score | Guideline |
|---|---|
| Poor | OE > 1 |
| Fair | OE ≤ 1 |
| Good | OE ≤ 0.75 |
| Excellent | OE ≤ 0.5 |
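The thresholds in this table overlap (an OE of 0.4 satisfies three of the guidelines), so a reader has to pick the strictest band that applies. A minimal Python sketch of that reading follows; the function name `pplfer_reliability` is hypothetical, and the choice to return the best matching label is an assumption, not something the table states.

```python
def pplfer_reliability(overall_error: float) -> str:
    """Map a ppLFER overall error (OE) to a reliability label.

    Assumption: the table's bands are nested, so the strictest
    band that the OE satisfies is the one reported.
    """
    if overall_error <= 0.5:
        return "Excellent"
    if overall_error <= 0.75:
        return "Good"
    if overall_error <= 1.0:
        return "Fair"
    return "Poor"


if __name__ == "__main__":
    # Illustrative OE values only; they are not taken from the study.
    for oe in (0.3, 0.6, 0.9, 1.4):
        print(oe, pplfer_reliability(oe))
```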
Reliability of the EPISuite‐25 and EPISuite‐T predictions determined using the applicability domain (AD) set by the KOWWIN and HENRYWIN models
| EPISuite set | Reliability score | Guideline |
|---|---|---|
| EPISuite‐25 | Poor | Outside all 3 AD limits |
| EPISuite‐25 | Fair | Outside 2 of the AD limits |
| EPISuite‐25 | Good | Outside 1 of the AD limits |
| EPISuite‐25 | Excellent | Inside all AD limits |
| EPISuite‐T | Poor | Outside all 3 AD limits |
| EPISuite‐T | Fair | Outside 2 of the AD limits |
| EPISuite‐T | Good | Outside of the KOWWIN AD or uses slope analogy to obtain the HLC equation |
| EPISuite‐T | Excellent | Inside KOWWIN AD, experimental HLC equation |
HLC = Henry's law constant.
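For EPISuite‐25 the score depends only on how many of the three AD limits a chemical falls outside, which can be sketched as a small lookup. The function name `episuite25_reliability` is hypothetical, and the sketch covers EPISuite‐25 only: the EPISuite‐T guidelines also depend on how the HLC equation was obtained, which this count alone does not capture.

```python
def episuite25_reliability(limits_outside: int) -> str:
    """Reliability label for an EPISuite-25 prediction.

    `limits_outside` is the number of applicability-domain (AD)
    limits (0 to 3) that the chemical falls outside.
    """
    labels = {0: "Excellent", 1: "Good", 2: "Fair", 3: "Poor"}
    if limits_outside not in labels:
        raise ValueError("limits_outside must be 0, 1, 2, or 3")
    return labels[limits_outside]


if __name__ == "__main__":
    for n in range(4):
        print(n, episuite25_reliability(n))
```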
The reliability score for the OPERA model, determined based on the reported information regarding the applicability domain (AD) of the prediction
| Reliability score | Global AD | Local AD level | Confidence level index |
|---|---|---|---|
| Poor | Outside | <0.4 | |
| Fair | Outside | 0.4–0.6 | |
| Fair | Inside | <0.6 | |
| Good | Outside | ≥0.6 | |
| Good | Inside | ≥0.6 | <0.75 |
| Excellent | Inside | ≥0.6 | ≥0.75 |
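The OPERA rules above combine three reported quantities, so they read naturally as a decision function. The sketch below is one hedged reading of the table: the function and parameter names are hypothetical, and a local AD level of exactly 0.6 is assigned to the "≥0.6" band, which is an assumption because the "0.4–0.6" and "≥0.6" bands share that boundary.

```python
def opera_reliability(inside_global_ad: bool,
                      local_ad_level: float,
                      confidence_level: float) -> str:
    """Reliability label for an OPERA prediction, following the table.

    Assumption: a local AD level of exactly 0.6 counts as ">= 0.6".
    The confidence level index only matters inside the global AD
    when the local AD level is at least 0.6.
    """
    if inside_global_ad:
        if local_ad_level < 0.6:
            return "Fair"
        return "Excellent" if confidence_level >= 0.75 else "Good"
    # Outside the global AD: only the local AD level is used.
    if local_ad_level < 0.4:
        return "Poor"
    if local_ad_level < 0.6:
        return "Fair"
    return "Good"


if __name__ == "__main__":
    # Illustrative inputs only; they are not taken from the study.
    print(opera_reliability(True, 0.8, 0.9))
    print(opera_reliability(False, 0.3, 0.9))
```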
Performance of different polyparameter linear free energy relationships (ppLFERs) using experimental solute descriptors
| Estimate | No. (25 °C) | MAE (25 °C) | RMSE (25 °C) | PIwidth (25 °C) | No. (all T) | MAE (all T) | RMSE (all T) | PIwidth (all T) |
|---|---|---|---|---|---|---|---|---|
| Abraham & Acree, | 337 | 0.21 | 0.33 | 1.29 | 1363 | 0.22 | 0.32 | 1.23 |
| Endo & Goss, | 347 | 0.22 | 0.37 | 1.41 | 1395 | 0.20 | 0.32 | 1.25 |
| Modified (Equation | 347 | 0.23 | 0.37 | 1.43 | 1395 | 0.21 | 0.32 | 1.27 |
| Jin et al. | 337 | 0.21 | 0.33 | 1.28 | 1363 | 0.22 | 0.33 | 1.28 |
MAE = mean absolute error; RMSE = root mean square error; PI = prediction interval.
Statistics on the residuals, including the mean absolute error (MAE), standard deviation (SD), root mean square error (RMSE), and upper (PIU) and lower (PIL) prediction intervals, when considering all log KOA estimates at 25 °C from all models
| Prediction tool | No. | Mean | MAE | Median | SD | RMSE | PIU | PIL |
|---|---|---|---|---|---|---|---|---|
| ppLFER, experimental | 347 | –0.07 | 0.23 | –0.01 | 0.37 | 0.37 | 0.65 | –0.79 |
| ppLFER, estimated | 475 | –0.09 | 0.34 | –0.05 | 0.50 | 0.51 | 0.89 | –1.07 |
| EPISuite‐25 | 475 | 0.00 | 0.58 | 0.06 | 0.78 | 0.78 | 1.54 | –1.54 |
| EPISuite‐T | 474 | –0.01 | 0.54 | 0.01 | 0.74 | 0.74 | 1.43 | –1.45 |
| OPERA | 475 | 0.07 | 0.33 | 0.00 | 0.52 | 0.52 | 1.09 | –0.95 |
| COSMOtherm | 475 | 0.02 | 0.41 | –0.04 | 0.56 | 0.56 | 1.12 | –1.08 |
ppLFER = polyparameter linear free energy relationship.
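The columns of this table can be reproduced from a list of residuals with the Python standard library. The sketch below rests on two assumptions that are not stated in this record: residuals are taken as predicted minus measured log KOA, and the prediction interval is approximated as mean ± 1.96 × SD. The function name `residual_summary` is hypothetical.

```python
import math
import statistics


def residual_summary(predicted, measured, z=1.96):
    """Summarize prediction residuals (assumed: predicted - measured).

    Assumption: the prediction interval is approximated as
    mean +/- z * SD with z = 1.96 (about 95% coverage).
    """
    if len(predicted) != len(measured):
        raise ValueError("predicted and measured must have equal length")
    residuals = [p - m for p, m in zip(predicted, measured)]
    mean = statistics.fmean(residuals)
    sd = statistics.stdev(residuals)  # sample SD; needs >= 2 values
    return {
        "n": len(residuals),
        "mean": mean,
        "mae": statistics.fmean([abs(r) for r in residuals]),
        "median": statistics.median(residuals),
        "sd": sd,
        "rmse": math.sqrt(statistics.fmean([r * r for r in residuals])),
        "pi_upper": mean + z * sd,
        "pi_lower": mean - z * sd,
    }
```

The tabulated values appear consistent with this approximation: for OPERA, 0.07 ± 1.96 × 0.52 gives roughly 1.09 and –0.95, which matches the reported PIU and PIL, although the exact formula the authors used is not given here.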
Statistics on the residuals, including the mean absolute error (MAE), standard deviation (SD), root mean square error (RMSE), and upper (PIU) and lower (PIL) prediction intervals, when considering only estimates for chemicals for which all models could make predictions at 25 °C (n = 346)
| Prediction tool | Mean | MAE | Median | SD | RMSE | PIU | PIL |
|---|---|---|---|---|---|---|---|
| ppLFER, experimental | –0.07 | 0.23 | –0.01 | 0.37 | 0.37 | 0.65 | –0.79 |
| ppLFER, estimated | –0.05 | 0.31 | –0.01 | 0.45 | 0.45 | 0.83 | –0.94 |
| EPISuite‐25 | 0.08 | 0.47 | 0.09 | 0.64 | 0.65 | 1.35 | –1.18 |
| EPISuite‐T | 0.02 | 0.44 | 0.05 | 0.60 | 0.60 | 1.19 | –1.15 |
| OPERA | 0.00 | 0.26 | –0.01 | 0.44 | 0.44 | 0.86 | –0.86 |
| COSMOtherm | 0.00 | 0.34 | –0.04 | 0.46 | 0.46 | 0.90 | –0.90 |
ppLFER = polyparameter linear free energy relationship.
Figure 1. Plots of the residual of the log KOA predictions at 25 °C against the measured log KOA value for 346 chemicals for which an estimate could be made by all models. The dashed lines indicate the prediction interval and the mean. The color and shape of each point indicate the reliability score of each prediction as described in the Materials and Methods section. The COSMOtherm model has no applicability domain or reliability score. The corresponding plot for log KOA predictions at 25 °C for all available data is available in the Supporting Information. ppLFER = polyparameter linear free energy relationship.
Statistics on the residuals of log KOA predictions at temperatures between –10 and 110 °C, including the mean absolute error (MAE), standard deviation (SD), root mean square error (RMSE), and upper (PIU) and lower (PIL) prediction intervals
| Prediction tool | No. | Mean | MAE | Median | SD | RMSE | PIU | PIL |
|---|---|---|---|---|---|---|---|---|
| ppLFER, experimental | 1395 | 0.01 | 0.21 | 0.03 | 0.32 | 0.32 | 0.64 | –0.63 |
| ppLFER, estimated | 1676 | –0.02 | 0.29 | 0.01 | 0.43 | 0.43 | 0.82 | –0.86 |
| EPISuite‐T | 1675 | 0.13 | 0.59 | 0.10 | 0.80 | 0.81 | 1.69 | –1.44 |
| COSMOtherm | 1676 | 0.04 | 0.40 | 0.04 | 0.55 | 0.56 | 1.12 | –1.05 |
ppLFER = polyparameter linear free energy relationship.
Statistics on the residuals of log KOA predictions at temperatures between –10 and 110 °C that could be made with all models (n = 1394), including the mean absolute error (MAE), standard deviation (SD), root mean square error (RMSE), and upper (PIU) and lower (PIL) prediction intervals
| Prediction tool | Mean | MAE | Median | SD | RMSE | PIU | PIL |
|---|---|---|---|---|---|---|---|
| ppLFER, experimental | 0.01 | 0.21 | 0.03 | 0.32 | 0.32 | 0.64 | –0.63 |
| ppLFER, estimated | 0.01 | 0.26 | 0.02 | 0.38 | 0.38 | 0.75 | –0.73 |
| EPISuite‐T | 0.11 | 0.51 | 0.10 | 0.69 | 0.69 | 1.46 | –1.23 |
| COSMOtherm | 0.08 | 0.34 | 0.05 | 0.45 | 0.46 | 0.97 | –0.81 |
ppLFER = polyparameter linear free energy relationship.
Figure 2. The measured log KOA is plotted against the residual of the log KOA prediction for chemicals when estimates could be made with all models (n = 1394). The dashed lines indicate the prediction interval and the mean. The color and shape of each point indicate the reliability score of each prediction as described in Materials and Methods. A similar plot for all chemicals with measured data is available in the Supporting Information. ppLFER = polyparameter linear free energy relationship.
Figure 3. Prediction interval for each model for log KOA predictions at 25 °C and at all temperatures. The red points indicate the upper and lower prediction intervals and the bar indicates the mean error. ppLFER = polyparameter linear free energy relationship.
Summary of the mean absolute error (MAE), root mean square error (RMSE), and prediction intervals (PIs) for the prediction models that work best for predicting log KOA at 25 °C and at any temperature
| T | Rank | Prediction tool | No. | MAE | RMSE | PIU | PIL | PIwidth |
|---|---|---|---|---|---|---|---|---|
| 25 °C | 1 | Modified ppLFER, experimental | 347 | 0.23 | 0.37 | 0.65 | –0.79 | 1.43 |
| 25 °C | 2 | Modified ppLFER, estimated | 475 | 0.34 | 0.51 | 0.89 | –1.07 | 1.96 |
| 25 °C | 3 | OPERA | 475 | 0.33 | 0.52 | 1.09 | –0.95 | 2.03 |
| 25 °C | 4 | COSMOtherm | 475 | 0.41 | 0.56 | 1.12 | –1.08 | 2.19 |
| Any T | 1 | Modified ppLFER, experimental | 1395 | 0.21 | 0.32 | 0.64 | –0.63 | 1.27 |
| Any T | 2 | Modified ppLFER, estimated | 1676 | 0.29 | 0.43 | 0.82 | –0.86 | 1.68 |
| Any T | 3 | COSMOtherm | 1676 | 0.40 | 0.56 | 1.12 | –1.05 | 2.17 |
Models are ranked based on their performance and usability within each temperature range.
PIwidth is equal to |PIU| + |PIL|.
ppLFER = polyparameter linear free energy relationship.
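The PIwidth definition above can be checked against the summary table with a few lines of Python. This is a minimal sketch: the function name `pi_width` is hypothetical, and the rows are copied from the rounded table values, so a recomputed width can differ from the reported one by about 0.01 (presumably because the reported widths were derived from unrounded interval bounds).

```python
def pi_width(pi_upper: float, pi_lower: float) -> float:
    """Prediction interval width, defined as |PIU| + |PIL|."""
    return abs(pi_upper) + abs(pi_lower)


# (PIU, PIL, reported PIwidth) copied from the rounded summary table.
SUMMARY_ROWS = {
    "25 C, ppLFER experimental": (0.65, -0.79, 1.43),
    "25 C, ppLFER estimated": (0.89, -1.07, 1.96),
    "25 C, OPERA": (1.09, -0.95, 2.03),
    "25 C, COSMOtherm": (1.12, -1.08, 2.19),
    "Any T, ppLFER experimental": (0.64, -0.63, 1.27),
    "Any T, ppLFER estimated": (0.82, -0.86, 1.68),
    "Any T, COSMOtherm": (1.12, -1.05, 2.17),
}

if __name__ == "__main__":
    for name, (piu, pil, reported) in SUMMARY_ROWS.items():
        print(f"{name}: recomputed {pi_width(piu, pil):.2f}, reported {reported:.2f}")
```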