L Berthod, D C Whitley, G Roberts, A Sharpe, R Greenwood, G A Mills.
Abstract
Knowledge of the sorption of pharmaceuticals to sewage sludge during waste water treatment processes is important for understanding their environmental fate and for risk assessments. The degree of sorption is defined by the sludge/water partition coefficient (Kd). Experimental Kd values (n = 297) for active pharmaceutical ingredients (n = 148) in primary and activated sludge were collected from the literature. The compounds were classified by their charge at pH 7.4 (44 uncharged, 60 positively and 28 negatively charged, and 16 zwitterions). Univariate models relating log Kd to log Kow for each charge class showed weak correlations (maximum R2 = 0.51 for positively charged compounds), with no overall correlation for the combined dataset (R2 = 0.04). Weaker correlations were found when relating log Kd to log Dow. Three sets of molecular descriptors (Molecular Operating Environment, VolSurf and ParaSurf) encoding a range of physico-chemical properties were used to derive multivariate models using stepwise regression, partial least squares and Bayesian artificial neural networks (ANN). The best predictive performance was obtained with ANN, with R2 = 0.62-0.69 for these descriptors using the complete dataset. Use of the more complex Vsurf and ParaSurf descriptors showed little improvement over Molecular Operating Environment descriptors. The most influential descriptors in the ANN models, identified by automatic relevance determination, highlighted the importance of hydrophobicity, charge and molecular shape effects in these sorbate-sorbent interactions. The heterogeneous nature of the different sewage sludges used to measure Kd limited the predictability of sorption from the physico-chemical properties of the pharmaceuticals alone. Standardization of test materials for the measurement of Kd would improve comparability of data from different studies, leading in the long term to better quality environmental risk assessments.
Keywords: Artificial neural networks; Partition coefficient; Pharmaceuticals; Quantitative structure-property relationship (QSPR); Sewage sludge; Sorption
Year: 2016 PMID: 27919554 PMCID: PMC5206221 DOI: 10.1016/j.scitotenv.2016.11.156
Source DB: PubMed Journal: Sci Total Environ ISSN: 0048-9697 Impact factor: 7.963
Fig. 1. Plots of reported log Kd values (L kg−1) against log Kow for APIs: (a) complete dataset (n = 297), linear regression equation log Kd = 0.11 log Kow + 2.15, R2 = 0.04; (b) uncharged compounds (n = 92), linear regression equation log Kd = 0.42 log Kow + 1.07, R2 = 0.46; (c) negatively charged compounds (n = 75), linear regression equation log Kd = 0.04 log Kow + 1.73, R2 = 0.01; (d) positively charged compounds (n = 105), linear regression equation log Kd = 0.45 log Kow + 1.53, R2 = 0.51. Zwitterionic compounds (n = 24) omitted. Ceftazidime (log Kd = −4.55) omitted from the negatively charged dataset.
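The regression lines quoted in the Fig. 1 caption can be evaluated directly. The following is a minimal sketch, assuming the predictor in the caption is log Kow (the octanol-water partition coefficient named in the abstract). The dictionary and function names are illustrative and do not come from the paper. Given the low R2 values, particularly for the combined and negatively charged sets, any estimate obtained this way is rough at best.

```python
# Slope and intercept of log Kd = slope * log Kow + intercept,
# copied from the Fig. 1 caption. All names here are illustrative.
FIG1_FITS = {
    "all": (0.11, 2.15),        # caption reports R2 = 0.04
    "uncharged": (0.42, 1.07),  # caption reports R2 = 0.46
    "negative": (0.04, 1.73),   # caption reports R2 = 0.01
    "positive": (0.45, 1.53),   # caption reports R2 = 0.51
}


def predict_log_kd(log_kow: float, charge_class: str = "all") -> float:
    """Estimate log Kd (Kd in L/kg) from log Kow using one Fig. 1 line."""
    slope, intercept = FIG1_FITS[charge_class]
    return slope * log_kow + intercept


# Example: a hypothetical positively charged compound with log Kow = 3.0.
example_log_kd = predict_log_kd(3.0, "positive")
```

Keeping one coefficient pair per charge class mirrors the panels of the figure, so a caller has to choose the class explicitly rather than silently falling back on the pooled fit.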
Stepwise regression models for MOE, Vsurf and ParaSurf descriptors, individual charge classes.
| Compounds | n | Descriptors | Variables | S | R2 | R2adj | R2pred |
|---|---|---|---|---|---|---|---|
| Uncharged | 92 | MOE | 3 | 0.65 | 0.51 | 0.50 | 0.46 |
| | | Vsurf | 6 | 0.67 | 0.51 | 0.48 | 0.39 |
| | | ParaSurf | 7 | 0.63 | 0.58 | 0.54 | 0.49 |
| Positively charged | 105 | MOE | 8 | 0.50 | 0.77 | 0.75 | 0.69 |
| | | Vsurf | 12 | 0.48 | 0.79 | 0.76 | 0.71 |
| | | ParaSurf | 7 | 0.60 | 0.66 | 0.64 | 0.58 |
| Negatively charged | 76 | MOE | 5 | 0.59 | 0.57 | 0.53 | 0.47 |
| | | Vsurf | 9 | 0.56 | 0.63 | 0.58 | 0.50 |
| | | ParaSurf | 8 | 0.59 | 0.59 | 0.54 | 0.46 |
| Zwitterions | 24 | MOE | 3 | 0.79 | 0.39 | 0.30 | 0.17 |
| | | Vsurf | 2 | 0.72 | 0.46 | 0.41 | 0.32 |
| | | ParaSurf | 3 | 0.77 | 0.42 | 0.33 | 0.27 |
n = number of Kd values; S = standard deviation of the error; R2adj = adjusted R2; R2pred = predicted R2.
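The adjusted R2 column can be checked against the R2, n and Variables columns. The sketch below assumes the standard definition, R2adj = 1 − (1 − R2)(n − 1)/(n − p − 1), where p is the number of variables in the model; the page itself does not state the formula, and the function name is illustrative. Because the tabulated R2 values are rounded to two decimals, some rows reproduce only to within ±0.01.

```python
def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Adjusted R2 for a linear model with p predictors fitted to n points.

    Uses the textbook formula 1 - (1 - R2) * (n - 1) / (n - p - 1).
    """
    if n - p - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# Positively charged compounds, MOE descriptors: n = 105, 8 variables, R2 = 0.77.
positive_moe = round(adjusted_r2(0.77, 105, 8), 2)

# Zwitterions, MOE descriptors: n = 24, 3 variables, R2 = 0.39.
zwitterion_moe = round(adjusted_r2(0.39, 24, 3), 2)
```

The penalty grows as p approaches n, which is why the zwitterion models (n = 24) lose far more from R2 to R2adj than the larger charge classes do.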
Partial least squares models for the MOE, Vsurf and ParaSurf descriptors, individual charge classes.
| Compounds | n | Descriptors | Variables | Components | R2 | R2cv |
|---|---|---|---|---|---|---|
| Uncharged | 92 | MOE | 19 | 1 | 0.41 | 0.35 |
| | | Vsurf | 39 | 2 | 0.45 | 0.28 |
| | | ParaSurf | 41 | 8 | 0.67 | 0.33 |
| Positively charged | 105 | MOE | 21 | 4 | 0.69 | 0.62 |
| | | Vsurf | 43 | 6 | 0.78 | 0.56 |
| | | ParaSurf | 47 | 4 | 0.68 | 0.51 |
| Negatively charged | 76 | MOE | 18 | 4 | 0.57 | 0.41 |
| | | Vsurf | 27 | 1 | 0.20 | 0.01 |
| | | ParaSurf | 25 | 1 | 0.28 | 0.00 |
| Zwitterions | 24 | MOE | 15 | 3 | 0.49 | 0.00 |
| | | Vsurf | 15 | 2 | 0.56 | 0.01 |
| | | ParaSurf | 15 | 2 | 0.39 | 0.16 |
n = number of Kd values; R2cv = leave-one-out cross-validated R2.
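Leave-one-out cross-validation refits the model n times, each time predicting the single point that was held out. A minimal sketch follows, assuming the common definition R2cv = 1 − PRESS/SS_total. For brevity it swaps in a one-predictor ordinary least-squares line in place of the PLS models reported in the table, and the data are synthetic placeholders rather than values from the study.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loo_r2(xs, ys):
    """Leave-one-out cross-validated R2, computed as 1 - PRESS / SS_total."""
    n = len(ys)
    mean_y = sum(ys) / n
    press = 0.0
    for i in range(n):
        # Refit without point i, then predict the held-out point.
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_total


# Synthetic placeholder data (not from the study).
demo_x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
demo_y = [0.1, 0.9, 2.2, 2.8, 4.1, 5.0]
demo_r2cv = loo_r2(demo_x, demo_y)
```

Because every prediction is made for a point the model has not seen, R2cv falls below the fitted R2 whenever the model is fitting noise, which is the pattern visible in the zwitterion rows.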
Performance of PLS models derived using the combined log Kd values with five random choices of training (n = 237) and test (n = 60) sets, for the MOE, Vsurf and ParaSurf descriptors.
| Descriptors | Test Set | Components | Variance | MUEtrain | MUEtest | R2train | R2cv | R2test |
|---|---|---|---|---|---|---|---|---|
| MOE | 1 | 8 | 0.87 | 0.58 | 0.62 | 0.46 | 0.33 | 0.50 |
| | 2 | 6 | 0.81 | 0.58 | 0.68 | 0.47 | 0.37 | 0.36 |
| | 3 | 8 | 0.88 | 0.59 | 0.60 | 0.46 | 0.34 | 0.48 |
| | 4 | 5 | 0.74 | 0.59 | 0.62 | 0.41 | 0.31 | 0.50 |
| | 5 | 6 | 0.82 | 0.60 | 0.57 | 0.44 | 0.34 | 0.43 |
| | Mean | | 0.82 | 0.59 | 0.62 | 0.45 | 0.34 | 0.45 |
| | Sdev | | 0.06 | 0.01 | 0.04 | 0.02 | 0.02 | 0.06 |
| Vsurf | 1 | 7 | 0.71 | 0.56 | 0.61 | 0.49 | 0.33 | 0.50 |
| | 2 | 8 | 0.76 | 0.52 | 0.68 | 0.55 | 0.41 | 0.27 |
| | 3 | 6 | 0.64 | 0.56 | 0.64 | 0.52 | 0.36 | 0.37 |
| | 4 | 7 | 0.67 | 0.53 | 0.72 | 0.53 | 0.38 | 0.33 |
| | 5 | 5 | 0.61 | 0.59 | 0.51 | 0.45 | 0.30 | 0.56 |
| | Mean | | 0.68 | 0.55 | 0.63 | 0.51 | 0.36 | 0.41 |
| | Sdev | | 0.06 | 0.03 | 0.08 | 0.04 | 0.04 | 0.12 |
| ParaSurf | 1 | 7 | 0.78 | 0.53 | 0.56 | 0.54 | 0.38 | 0.51 |
| | 2 | 7 | 0.78 | 0.52 | 0.66 | 0.56 | 0.41 | 0.46 |
| | 3 | 8 | 0.82 | 0.52 | 0.59 | 0.57 | 0.42 | 0.39 |
| | 4 | 7 | 0.78 | 0.52 | 0.64 | 0.54 | 0.41 | 0.45 |
| | 5 | 7 | 0.80 | 0.53 | 0.57 | 0.55 | 0.39 | 0.48 |
| | Mean | | 0.79 | 0.53 | 0.61 | 0.55 | 0.40 | 0.46 |
| | Sdev | | 0.02 | 0.01 | 0.04 | 0.01 | 0.01 | 0.04 |
MUEtrain and MUEtest = mean unsigned error on training and test sets; R2train and R2test = R2 on training and test sets; R2cv = leave-one-out cross-validated R2.
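The evaluation protocol behind this table, a random partition of the 297 log Kd values into 237 training and 60 test values followed by a mean unsigned error on each part, can be sketched as below. The seeds, function names and use of Python's `random` module are assumptions made for illustration; the page does not say how the five splits were generated.

```python
import random


def split_indices(n_total=297, n_test=60, seed=0):
    """Randomly partition indices 0..n_total-1 into (training, test) lists."""
    rng = random.Random(seed)  # seeded so each split is reproducible
    indices = list(range(n_total))
    rng.shuffle(indices)
    return sorted(indices[n_test:]), sorted(indices[:n_test])


def mean_unsigned_error(observed, predicted):
    """Mean absolute difference between observed and predicted values."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


# Five splits with arbitrary, hypothetical seeds.
splits = [split_indices(seed=s) for s in range(1, 6)]
```

Repeating the split several times, as the table does, shows how much of the variation in R2test and MUEtest comes from the choice of test compounds rather than from the descriptor set.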
Mean predictions of committee of 10 two-hidden unit ANNs with highest evidence.
| Descriptors | Training set | MUEtrain | MUEtest | R2train | R2test |
|---|---|---|---|---|---|
| MOE | 1 | 0.42 | 0.54 | 0.71 | 0.55 |
| | 2 | 0.42 | 0.62 | 0.72 | 0.47 |
| | 3 | 0.47 | 0.54 | 0.65 | 0.58 |
| | 4 | 0.45 | 0.53 | 0.65 | 0.65 |
| | 5 | 0.43 | 0.45 | 0.71 | 0.68 |
| | Mean | 0.44 | 0.54 | 0.69 | 0.58 |
| | Sdev | 0.02 | 0.06 | 0.03 | 0.09 |
| Vsurf | 1 | 0.43 | 0.62 | 0.71 | 0.47 |
| | 2 | 0.40 | 0.67 | 0.73 | 0.36 |
| | 3 | 0.41 | 0.64 | 0.73 | 0.43 |
| | 4 | 0.42 | 0.63 | 0.69 | 0.49 |
| | 5 | 0.44 | 0.54 | 0.69 | 0.57 |
| | Mean | 0.42 | 0.62 | 0.71 | 0.46 |
| | Sdev | 0.02 | 0.05 | 0.02 | 0.08 |
| ParaSurf | 1 | 0.39 | 0.53 | 0.75 | 0.58 |
| | 2 | 0.38 | 0.63 | 0.76 | 0.46 |
| | 3 | 0.40 | 0.56 | 0.74 | 0.45 |
| | 4 | 0.38 | 0.58 | 0.73 | 0.56 |
| | 5 | 0.40 | 0.52 | 0.75 | 0.59 |
| | Mean | 0.39 | 0.56 | 0.75 | 0.53 |
| | Sdev | 0.01 | 0.04 | 0.01 | 0.07 |
MUEtrain and MUEtest = mean unsigned error on training and test sets; R2train and R2test = R2 on training and test sets.
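A committee prediction of the kind named in the table title is the per-compound average of the member networks' outputs. The sketch below shows only that averaging step, with placeholder numbers. Training the Bayesian networks and ranking them by evidence are not shown, and the function name is illustrative.

```python
def committee_mean(member_predictions):
    """Average predictions across committee members, compound by compound.

    member_predictions: list of equal-length lists, one list per network.
    """
    n_members = len(member_predictions)
    n_points = len(member_predictions[0])
    if any(len(p) != n_points for p in member_predictions):
        raise ValueError("all members must predict the same compounds")
    return [
        sum(member[i] for member in member_predictions) / n_members
        for i in range(n_points)
    ]


# Placeholder predicted log Kd values from three hypothetical networks.
demo_committee = committee_mean([
    [2.0, 3.0, 1.0],
    [2.5, 2.5, 1.5],
    [1.5, 3.5, 0.5],
])
```

Averaging several networks reduces the influence of any single network's initialization, which is the usual motivation for reporting a committee rather than one model.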
Ten most relevant descriptors in ANNs identified by ARD, ordered by mean rank over the 5 test sets. See Tables S4 and S5 for definition of MOE and Vsurf descriptors. ParaSurf descriptors are: polarizability (molecular electronic polarizability), FNmin (minimum of electrostatic field normal to surface), var*balance (product of total variance in molecular electrostatic potential and the balance parameter), HARDrange (range of local electron hardness), MEPskew (skewness of the molecular electrostatic potential), MEPvar+ (variance of the positive molecular electrostatic potential), meanMEP− (mean of the negative molecular electrostatic potential), POLkurt (kurtosis of the local polarizability), EALkurt (kurtosis of the local electron affinity), FNmax (maximum of electrostatic field normal to surface).
| MOE | Vsurf | ParaSurf |
|---|---|---|
| a_base | vsurf_R | polarizability |
| vsa_hyd | vsurf_G | FNmin |
| PC+ | vsurf_EWmin1 | var*balance |
| a_don | vsurf_CW2 | HARDrange |
| rgyr | vsurf_CW1 | MEPskew |
| vsa_pol | vsurf_HB4 | MEPvar+ |
| log | vsurf_ID7 | meanMEP− |
| rings | vsurf_IW5 | POLkurt |
| a_acc | vsurf_HL2 | EALkurt |
| Weight | vsurf_ID8 | FNmax |
Fig. 2. Predicted log Kd against observed log Kd for MOE descriptors trained on entire dataset for (a) PLS (R2 = 0.47, MUE = 0.58), (b) ANN (R2 = 0.64, MUE = 0.47). Data point for zibotentan (observed log Kd = −0.699) omitted for clarity.