
Relook on the Linear Free Energy Relationships Describing the Partitioning Behavior of Diverse Chemicals for Polyethylene Water Passive Samplers.

Muhammad Irfan Khawar1, Deedar Nabi1,2.   

Abstract

Over the past 3 decades, low-density polyethylene (PE) passive sampling devices have been widely used to scout organic chemicals in air, water, sediments, and biotic phases. Experimental partition coefficient data, required to calculate the concentrations in environmental compartments, are not widely available. In this study, we developed and rigorously evaluated linear free energy relationships (LFERs) to predict the partition coefficient between the PE and the water phase (log Kpe–w). Poly-parameter (pp) LFERs based on Abraham solute parameters performed better (root-mean-square error, rmse = 0.333–0.350 log unit) in predicting log Kpe–w than the two one-parameter (op) LFERs built on n-hexadecane–water and octanol–water partition coefficients (rmse = 0.41–0.42 log unit), indicating that one parameter is not able to account for all types of interactions experienced by a chemical during PE–water exchange. Dimensionality analyses show that the calibration dataset used to train pp-LFERs fulfills all the requirements to obtain a robust model for log Kpe–w. Van der Waals interactions of the molecule tend to favor the PE phase, and polar interactions of the molecule favor the water phase. Compared to other commonly used passive sampling polymeric phases such as polydimethylsiloxane, polyoxymethylene, and polyacrylate, the PE phase is the most sensitive to polarizable chemicals. For op-LFERs, the PE phase is better represented by the hexadecane phase than by the octanol phase. A computational method based on the conductor-like screening model for real solvents theory did a good job in estimating log Kpe–w for chemicals that were neither very hydrophobic nor very hydrophilic in nature. Our models can be used to reliably predict the log Kpe–w values of simple neutral organic chemicals. This study provides insights into the partitioning behavior of PE samplers compared to other commonly used passive samplers.
© 2021 The Authors. Published by American Chemical Society.

Year:  2021        PMID: 33681563      PMCID: PMC7931192          DOI: 10.1021/acsomega.0c05179

Source DB:  PubMed          Journal:  ACS Omega        ISSN: 2470-1343


Introduction

Over the past 3 decades, passive sampling devices (PSDs) have been widely used to scout organic chemicals in air, water, sediments, and biotic phases.[1] The increasing popularity of passive sampling techniques among analytical chemists may be attributed to the fact that passive samplers provide cleaner extracts, improved detection limits, and ease of storage and archiving of samples.[2] To environmental chemists, passive sampling brings value because PSDs bio-mimic the passive uptake of truly dissolved concentrations (Cfree)[3] of chemicals in the environment.[4] Cfree is considered a more accurate endpoint of chemical exposure than the total concentration measured using conventional sampling methods.[5] At environmental levels, measuring Cfree is equivalent to determining the chemical activity of a contaminant.[6] At equilibrium, chemical activities between multiple phases such as a whole organism (or its compositional components such as lipids, proteins, and carbohydrates) and a reference phase (e.g., water and air) are equal; using appropriate partition coefficients, we can calculate concentrations in different compartments of interest and evaluate the bioaccumulation disequilibrium.[6−8] In addition, laboratory experimentalists have started to prefer passive dosing methods, which use the same polymeric phases as passive sampling, to determine exposure,[9] toxicity,[10] bioconcentration,[11] and speciation and fractionation[12] of hydrophobic organic chemicals in complex systems. 
Passive dosing offers tight control over the exposure concentrations of hydrophobic chemicals in laboratory experiments involving multiple phases.[13] Taken together, environmental chemists can gain insights about bioavailability, ecotoxicity, bioaccumulation, and biomagnification of organic contaminants using low-cost and low-tech PSDs.[14] In the field, several types of PSDs have shown promise in monitoring organic pollutants in environmental waters.[1] These PSDs include polydimethylsiloxane (PDMS),[15] polyoxymethylene (POM),[16] polyacrylate (PA),[17] ethylene–vinyl acetate,[18] semipermeable membrane devices,[19] high-density polyethylene (HDPE),[20] and low-density polyethylene (PE).[20] Among these, PE is a cheap and widely available material with proven robustness for long-term monitoring of organic pollutants.[21] These organic pollutants include diverse chemical families such as organochlorine pesticides (OCPs), polycyclic aromatic hydrocarbons (PAHs), nitro-PAHs, polychlorinated biphenyls (PCBs), polybrominated diphenyl ethers (PBDEs), alkyl benzenes, and alkyl phenols.[22−26] The partition coefficient between the PE and water phase (Kpe–w) is required to calculate the concentration (Cfree) of organic pollutants in environmental waters (eq 1).[27]

$$C_{\mathrm{free}} = \frac{C_{\mathrm{pe}}}{K_{\mathrm{pe-w}}} \qquad (1)$$

where Cpe is the equilibrium concentration of contaminants accumulated in the PE passive sampler deployed in water. However, experimental Kpe–w data are available for only a few hundred chemicals. Experimental methods are expensive, laborious, and difficult, especially for hydrophobic chemicals. Consequently, environmental analysts resort to different estimation methods to compute Kpe–w. 
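As a minimal sketch of how the partition coefficient is used in practice, the snippet below converts an equilibrium concentration in the PE sampler into a freely dissolved water concentration, assuming the equilibrium relationship Cfree = Cpe/Kpe–w. The function name, the units (Kpe–w in L of water per kg of PE), and the numerical inputs are illustrative assumptions, not values from this study.

```python
def c_free_from_pe(c_pe, log_k_pe_w):
    """Freely dissolved concentration from the equilibrium PE concentration.

    Assumes C_free = C_pe / K_pe-w, with c_pe in ng per kg PE and
    K_pe-w in L water per kg PE (assumed units), so the result is in ng/L.
    """
    return c_pe / (10.0 ** log_k_pe_w)


# Hypothetical inputs chosen only to illustrate the arithmetic.
example_c_pe = 2.0e6       # ng per kg PE
example_log_k = 5.0        # log K_pe-w
example_c_free = c_free_from_pe(example_c_pe, example_log_k)
print(example_c_free)      # 20.0 (ng/L)
```

Because Kpe–w enters as a power of ten, an error of 0.3 log unit in log Kpe–w changes the calculated Cfree by roughly a factor of 2.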
The estimation approaches mostly include one-parameter (op-) and poly-parameter (pp-) linear free energy relationships (LFERs).[21] Theoretically speaking, log Kpe–w is a free energy related property (eq 2), which may be related to other free energy properties to predict log Kpe–w.

$$\log K_{\mathrm{pe-w}} = -\frac{\Delta_{\mathrm{pe-w}}G}{2.303\,RT} \qquad (2)$$

where R is the universal gas constant, T is the temperature, and Δpe–wG is the Gibbs free energy change for the transfer of solute in the PE–water system. Previously, free energy properties such as subcooled pure liquid solubility and partition coefficients for octanol–water (log Kow) and hexadecane–water (log Khexadecane–w) systems have been used to develop op correlations for the estimation of log Kpe–w.[21] However, such relationships between two partitioning properties work accurately only if the same type of interactions governs both properties.[28] This is because one parameter can capture only one of the many types of intermolecular interactions involved in partitioning.[29] Thus, op-LFERs are limited to specific chemical classes and cannot be applied to estimate a property for chemicals belonging to a different chemical class.[8,29] To explore the entire spectrum of interactions governing the partitioning of diverse chemicals, both specific and nonspecific intermolecular interactions need to be taken into account.[8] In other words, the total partitioning Δpe–wG is a linear combination of free energy changes due to van der Waals (Δpe–wGvdW) and polar interactions (Δpe–wGpolar) (eq 3).[30]

$$\Delta_{\mathrm{pe-w}}G = \Delta_{\mathrm{pe-w}}G^{\mathrm{vdW}} + \Delta_{\mathrm{pe-w}}G^{\mathrm{polar}} \qquad (3)$$

Abraham and co-workers successfully linked the van der Waals and polar interactions of chemicals to their macroscopic partitioning properties. They demonstrated that not more than five to six intermolecular interaction parameters are required to develop LFERs for diverse partitioning properties.[31−35] Such pp-LFERs are also referred to as Abraham solvation models (ASMs). 
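To make the free-energy link concrete, the following sketch converts between a transfer free energy and a log partition coefficient, assuming the standard thermodynamic relation ΔG = −RT ln K (so that log K = −ΔG/(2.303 RT)). The function names and the default temperature of 298.15 K are assumptions for illustration; standard-state and concentration-scale conventions are ignored here.

```python
import math

R_GAS = 8.314462618  # universal gas constant, J mol^-1 K^-1


def log_k_from_delta_g(delta_g, temp_k=298.15):
    """log10 K from a transfer free energy in J/mol, assuming dG = -RT ln K."""
    return -delta_g / (math.log(10.0) * R_GAS * temp_k)


def delta_g_from_log_k(log_k, temp_k=298.15):
    """Transfer free energy in J/mol from log10 K, assuming dG = -RT ln K."""
    return -math.log(10.0) * R_GAS * temp_k * log_k


# One log unit corresponds to roughly 5.7 kJ/mol at 298.15 K.
print(round(delta_g_from_log_k(1.0) / 1000.0, 2))  # -5.71
```

On this scale, an rmse of about 0.35 log unit corresponds to an uncertainty of about 2 kJ/mol in the transfer free energy at 25 °C.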
According to ASM (eqs 4 and 5)[36,37] and other variants proposed by Goss (eq 6)[38] and van Noort (eq 7),[39] equilibrium partition coefficients (log K) of nonionic chemicals for a system of two phases, x and y, can be described by the following general expressions

$$\log K_{xy} = c + eE + sS + aA + bB + vV \qquad (4)$$

$$\log K_{xy} = c + eE + sS + aA + bB + lL \qquad (5)$$

$$\log K_{xy} = c + sS + aA + bB + vV + lL \qquad (6)$$

$$\log K_{xy} = c + sS + aA + bB + lL \qquad (7)$$

where E defines the polarizability of the solute in excess of that of a comparably sized n-alkane; the S parameter blends the electrostatic polarity and polarizability of the solute;[40,52] and A and B are the parameters depicting the hydrogen bond donating and hydrogen bond accepting capacities of the solute, respectively. V is the McGowan molecular volume of solute i and L is the hexadecane–air partition coefficient of the solute at 25 °C. Small letters c, e, s, a, b, l, and v are coefficients specific to each two-phase partitioning system xy. Generally, for liquid–liquid partitioning processes, the lL term is ignored (eq 4, henceforth referred to as the ESABV model), and for gas–liquid partitioning, the vV term is not considered (eq 5, henceforth referred to as the ESABL model).[37] Goss developed a single equation, keeping both terms, vV and lL, and ignoring the eE term in the model to describe both the liquid–liquid and gas–liquid partitioning (eq 6, henceforth referred to as the SABVL model).[38] Van Noort proposed that for air–organic solvent partitioning, both vV and eE terms can be excluded without any loss of statistical quality (eq 7, henceforth referred to as the SABL model).[39] In a recent study, Zhu and co-workers[41] developed a pp-LFER based on three Abraham solute parameters (ASPs), A, B, and V, using a diverse set of 254 chemicals. Their pp-LFER explained 78% of the variance in the experimental log Kpe–w training data (n = 203) and exhibited a root-mean-square error (rmse) of 0.59 log unit compared to the experimental log Kpe–w training data. However, the authors did not discuss whether they evaluated the role of the other ASPs, E, S, and L, while developing their pp-LFER. 
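The four pp-LFER variants named above (ESABV, ESABL, SABVL, and SABL) differ only in which solute descriptors enter the linear sum. The sketch below evaluates any of them from a set of solute descriptors and system coefficients. The `Solute` class, the `log_k` function, and all numerical descriptors and coefficients are hypothetical placeholders chosen to illustrate the arithmetic; they are not the fitted values from this study.

```python
from dataclasses import dataclass

# Descriptors that enter each model variant described in the text.
MODELS = {
    "ESABV": ("E", "S", "A", "B", "V"),
    "ESABL": ("E", "S", "A", "B", "L"),
    "SABVL": ("S", "A", "B", "V", "L"),
    "SABL": ("S", "A", "B", "L"),
}


@dataclass(frozen=True)
class Solute:
    """Abraham solute parameters for one chemical."""
    E: float
    S: float
    A: float
    B: float
    V: float
    L: float


def log_k(solute, coeffs, model):
    """Evaluate log K = c + sum(coefficient * descriptor) for the chosen model.

    `coeffs` maps 'c' and the lowercase descriptor letters to system coefficients.
    """
    if model not in MODELS:
        raise ValueError(f"unknown model: {model}")
    total = coeffs["c"]
    for name in MODELS[model]:
        total += coeffs[name.lower()] * getattr(solute, name)
    return total


# Hypothetical descriptors and coefficients, for illustration only.
example_solute = Solute(E=1.0, S=1.0, A=0.0, B=0.2, V=1.0, L=5.0)
example_coeffs = {"c": 0.0, "e": 1.0, "s": -2.0, "a": -3.0,
                  "b": -4.0, "v": 4.0, "l": 0.5}
print(round(log_k(example_solute, example_coeffs, "ESABV"), 3))  # 2.2
```

In this toy example the positive v and e terms push the solute toward the organic phase, while the negative s and b terms pull it toward water, which mirrors the qualitative pattern the text describes for PE–water partitioning.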
Furthermore, the authors did not report whether they used experimental or calculated data for ASPs to develop the pp-LFER. In another study, Zhu and co-workers[42] developed a pp-LFER based on theoretical parameters such as average molecular polarizability, dipole moment, and the net charge of the most negative atoms as a proxy of hydrogen bonding interaction, using a diverse set of 191 chemicals. The reported rmse for this equation is 0.60 log unit. The authors successfully validated their pp-LFER using an external set of 48 chemicals. However, this model requires quantum-chemically computed theoretical descriptors, which are not quickly accessible to common users of passive samplers. The octanol–water partition coefficient (log Kow) is the most widely used parameter to develop op-LFERs for many passive sampling phases.[15,21,33,43] Octanol–water partition coefficient-based LFERs have been developed for the PE–water system.[21] However, octanol is not considered a good solvent to represent the PE phase due to its semipolar trait.[44,45] Generally, such relationships are good only for a single chemical family and cannot be applied to diverse chemicals.[21,27,44,46] Long-chain n-alkanes such as n-hexadecane are expected to be a more appropriate proxy of the PE phase than n-octanol. The solvation similarity between n-hexadecane–water and PE–water systems has been reported for chlorinated organic chemicals.[47] Hale and co-workers developed an op-LFER between n-hexadecane–water and PE–water systems for 14 OCPs.[44] Sacks and Lohmann reported an op-LFER between the partition coefficients of triclosan and alkyl phenols for the n-hexadecane–water (log Khexadecane–w) and PE–water systems.[45] The accuracy of these op-LFERs was better than that of the ones based on linear relationships between log Kow and log Kpe–w for chemical families having semipolar traits. The LFER approach requires calibration and validation with reliable experimental data. 
Generally, the parameters used in the LFERs are not readily available. For instance, the experimental data for ASPs are available only for 8000 chemicals.[48] In contrast, quantum-chemical approaches generally do not require parameterization. A computational solvation approach based on the COSMO-RS (conductor-like screening model for real solvents) theory[49] has been widely used to predict thermodynamic properties such as activity coefficients, solubility, partition coefficients, vapor pressure, and free energy of solvation.[50] COSMO-RS integrates a continuum solvation model with surface interaction methods to calculate the chemical potential of molecules in a variety of solvents, mixtures, and polymers. It considers the solvent as a dielectric continuum, in which the solute molecule is embedded in a cavity of its own size and shape, to calculate the surface charge density of the molecules. The solute–solvent interaction energies are computed based on the interactions of surface segments.[51] The partition coefficients are computed through fast statistical thermodynamics of interacting molecular surface segments. COSMOtherm (COSMOlogic GmbH & Co. KG) is a predictive tool used to implement COSMO-RS.[52] Goss evaluated the performance of COSMOtherm to predict partition coefficients of neutral organic compounds for several polymers used in analytical chemistry.[53] The author demonstrated the utility of COSMOtherm in selecting polymers for various applications in analytical chemistry. For instance, linear regression between experimental and COSMOtherm-predicted values resulted in R² = 0.95 and rmse = 0.44 log unit for 146 chemicals. In another study, Loschen and Klamt[54] calculated the solubilities and partitioning of gaseous and liquid solutes in different polymers. They demonstrated the importance of free volume correction in improving the prediction of partitioning properties of polymers. 
For PE–water partition coefficients, when regressed against the experimental values, the predicted values calculated without a combinatorial term resulted in R² = 0.87 and rmse = 1.83 log units for 10 chemicals. With incorporation of a free volume term and an estimated crystalline fraction of 0.67 for PE, this comparison improved, yielding R² = 0.86 and rmse = 1.13 log units for 10 chemicals. The objectives of this study were to (i) develop op- and pp-LFERs, (ii) understand what types of intermolecular interactions are relevant to the PE–water system, and (iii) evaluate the performance of the COSMO-RS model compared to LFERs for the prediction of partition coefficients for PE–water samplers.

Materials and Methods

Data Source and Analysis

The experimental values of the low-density PE to water partition coefficient (Kpe–w) for 270 chemicals were taken from compilations given in previous works.[41,44] These data are shown as Table S1 in the Supporting Information. Multiple values reported in the literature were averaged. Experimental ASPs, which were available for 214 chemicals (Dataset-I), were imported from the Helmholtz Centre for Environmental Research-Linear Solvation Energy Relationships (UFZ-LSER) database[48] (Table S2). Dataset-I was used to train and validate the pp-LFER equations. The estimated values of ASPs for the remaining 56 chemicals were calculated using the UFZ-LSER Database tool[48] (Table S3). However, only 26 out of the 56 remaining chemicals (Dataset-II) were retained for additional validation of the pp-LFER equations (Section ) as they were found within the acceptable application domain[48,55] (Table S4). We excluded the estimated values which were outside of the application domain. Experimental values of log Kow, ranging from 2.7 to 8.6 log unit, were taken from the literature[41,56] (Table S2). The log Khexadecane–w values, which spanned about 7 orders of magnitude, were estimated using the ASM equation reported in the literature[57] (Table S2). These partition coefficients were used to develop op-LFERs for the PE–water passive sampler. System coefficients of pp-LFER equations for common passive samplers such as PDMS–water,[58] POM–water,[35] and PA–water,[33] and for technical solvent systems such as octanol–water[59] and hexadecane–water,[57] were taken from the literature to compare the similarities of PE with these systems (Table S5). The experimental Kpe–w dataset, used to develop the pp-LFER equations, was diverse and spanned over 6 orders of magnitude of Kpe–w (Table S1). 
The dataset contains chemicals with diverse structures and comprises compounds from families such as n-alkanes, linear alkyl benzenes, chlorinated benzenes, OCPs, PAHs, nitro-PAHs, PCBs, PBDEs, polyhalogenated dibenzo-p-dioxins (PHDDs), and dibenzofurans (PHDFs). A dataset comprising 238 chemicals was used to evaluate the COSMOtherm model. This dataset (Table S6) traversed wider ranges of ASPs than Dataset-I (Table S1). However, experimental data were available only for 47 chemicals (Table S7) to compare with the COSMOtherm predictions. For the remaining 189 chemicals, Kpe–w values were estimated using the pp-LFER equation developed in this study.

Statistical Analysis

Statistical tests such as correlation analysis, principal component analysis (PCA), multiple linear regression, and cross-validation tests were performed using R Program (3.5.3)[60] and XLSTAT (2018).[61] The significant and optimum number of descriptors for each model was selected using step-wise multiple linear regression based on statistical criteria such as Student’s t-test, Akaike information criterion, adjusted R², and variance inflation factor. Uncertainties around the regression coefficients, which correspond to a 95% probability interval of the fitted values, were estimated using the bootstrap method with 1000 synthetic resamples. Where the intercept was found to be statistically indistinguishable from zero, the regression was repeated with the intercept set equal to zero. To define the domain of applicability, and to find the influential values in the training datasets, regression diagnostics such as Studentized Residuals, Hat Values, and Cook’s Distance were applied to each model (Tables S9–S12). The bootstrap resampling method was used to estimate the standard errors of beta coefficients for all models. Cross-validation tests such as K-fold, repeated K-fold (r = 10), leave-one-out, and bootstrapping (n = 1000) were performed for each model to evaluate the robustness (Sections S1–S4). The PCA test was used to find the contribution of all variables in the principal components. Applicability domains (ADs) of all predictive models were assessed using influence plots, which reflect leverages, studentized residuals, and Cook’s D values. Chemicals that have values for these metrics above the critical thresholds were flagged as outside the AD.
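The influence metrics named above (hat values, studentized residuals, and Cook's distance) can be illustrated for the simplest case of a one-predictor regression, such as an op-LFER. The sketch below is a plain-Python illustration and not the R or XLSTAT workflow used in this study; the function name and the data are hypothetical, and the leverage cutoff of 2p/n mentioned afterward is a common rule of thumb assumed here, not a threshold reported in the text.

```python
import math


def ols_influence(x, y):
    """Fit y = intercept + slope * x and return per-point influence metrics."""
    n = len(x)
    p = 2  # fitted parameters: intercept and slope
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(e * e for e in residuals) / (n - p)  # residual variance

    metrics = []
    for xi, e in zip(x, residuals):
        hat = 1.0 / n + (xi - x_mean) ** 2 / sxx          # leverage
        r_int = e / math.sqrt(s2 * (1.0 - hat))            # internally studentized
        t_ext = r_int * math.sqrt((n - p - 1) / (n - p - r_int ** 2))  # externally studentized
        cook = (r_int ** 2 / p) * hat / (1.0 - hat)        # Cook's distance
        metrics.append({"hat": hat, "r_int": r_int, "t_ext": t_ext, "cook": cook})
    return slope, intercept, metrics


# Hypothetical data: the last point sits far from the others on the x axis.
x_demo = [1.0, 2.0, 3.0, 4.0, 10.0]
y_demo = [1.1, 1.9, 3.2, 3.9, 12.0]
slope_demo, intercept_demo, metrics_demo = ols_influence(x_demo, y_demo)
for i, m in enumerate(metrics_demo):
    print(i, round(m["hat"], 3), round(m["t_ext"], 2), round(m["cook"], 2))
```

In this toy dataset the last point has a hat value of 0.92, above the assumed 2p/n cutoff of 0.8, so it would be flagged for high leverage even though its raw residual is small. This is the same distinction the influence plots draw between problems in the predictors (leverage) and problems in the response (residuals).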

COSMOtherm Calculation

Cosmo files comprising the screening charge densities for the select 47 chemicals, for which the experimental log Kpe–w values were available (Table S7), were generated using the TURBOMOLE package[62] at the B-P86 density functional level with a def-TZVP basis set. Cosmo files for 189 chemicals (Table S8) were taken from the COSMOtherm database. These cosmo files were then inputted in the COSMOtherm software using the BP_TZVP_C30_1701 parametrization to calculate log Kpe–w.

Results and Discussion

Appropriateness of Dataset for the Development of ASM

To begin with, we investigated whether the calibration dataset fulfills the necessary requirements stipulated in the literature[63] for developing a robust ASM equation. First, we note that the calibration dataset approximately follows a normal distribution that traverses more than 6 orders of magnitude ($10^{2.02}$–$10^{8.4}$) for Kpe–w values (Figure a). Second, all ASM parameters in the calibration dataset, except the hydrogen bond donating ability, span a reasonable range of values (Figure b). Only 5 of 214 chemicals in the calibration dataset have nonzero values for the A parameter, which still fulfills the condition of having a minimum of four solutes per parameter[63] required to account for the true dependence of Kpe–w on acidity. Third, the solute set must not exhibit significant covariance among the ASM parameters. As evident from the correlogram (Figure c), there is a moderate correlation between the E, S, V, and L parameters. A and B are fairly uncorrelated parameters. This overlap in information is expected as ASPs do not represent orthogonal information in terms of fundamental intermolecular interactions such as dispersion, Keesom, Debye, and hydrogen bond forces. For example, the S parameter represents a mixture of polarity and polarizability.[40,63] Similarly, the polarizability and induction effects are rooted in the definitions of L and E.[63] In fact, polarizability is linearly correlated with the size of the molecule.[63] As a result, descriptors correlate with each other due to mixing of fundamental intermolecular interactions. In the presence of correlations among the solute parameters, the choice of training set becomes critical to develop meaningful ASM-type equations.
Figure 1

Partitioning variability embedded in the training set (n = 214) in terms of ASPs and PE–water partition coefficient. Top panels show the distribution of (a) PE–water partition coefficients (log KPE–water) and (b) of ASPs. Lower panels show (c) the correlogram of the correlation matrix obtained, respectively, by Pearson correlation analysis and (d) the percent contribution of variables in the first five dimensions obtained by PCA of the 214 × 8 matrix [ESABVL log Kpe–water]. In Panel (c), red and purple color, respectively, show positive and negative correlations between the pair. The value of correlation coefficient for each pair of variables is shown in each square. In panel (d), color intensity and size of the circle are proportional to the percent contribution of a variable. In panel (d), Dim. stands for dimension.

To further investigate the impact of overlap in chemical information among solute parameters and their relationship with log Kpe–w, we performed PCA on the 214 × 7 matrix [ESABVL log Kpe–w]. The first five dimensions account for more than 99% of the information in this matrix. The total variance in the dataset due to ASM parameters is partitioned across almost all the orthogonal dimensions obtained after PCA (Figure d). This is also indicative of the absence of multi-collinearity (i.e., a situation in which one of the solute parameters is a linear function of a combination of other parameters). The lack of multi-collinearity is further corroborated by the variance inflation factors (VIFs) obtained after regression of log Kpe–w against the ASM parameters. Similarly, the contribution of log Kpe–w is significant in all the orthogonal dimensions. This indicates that all ASM parameters are important to explain the variance in the log Kpe–w of the dataset. Taken together, these results show that the calibration dataset fulfills all the requirements to obtain a robust ASM equation for log Kpe–w.

pp-LFERs for PE–Water Partitioning

We calibrated and evaluated four variants of the pp-LFER model based on Abraham solvation parameters. Models based on the ESABL and SABL pp-LFERs are presented in the Supporting Information (Sections S5 and S6). Two models, ESABV and SABVL pp-LFERs, are discussed here in detail.

ESABV Model

The ESABV model, based on the relationship of log Kpe–w with a linear combination of E, S, A, B, and V parameters, successfully described 99% of variation in the log Kpe–w data (eq and Figure a). In this equation, n, R², R²adj, rmse, and F-statistic denote the number of experimental values of log Kpe–w, coefficient of determination, adjusted coefficient of determination, root-mean-squared error, and the overall Fisher statistic, respectively.
Figure 2

Linear regression plot for the (a) ESABV model, (b) SABVL model, and (c) op-LFER model based on hexadecane–water partition coefficient and (d) op-LFER model based on octanol–water partition coefficient. Upper and lower green lines bound 95% confidence interval around the regression line, which is shown as dotted black line in the middle. Blue diamonds (◇) and purple circle (◯)represent the data points in the training set and validation set, respectively.

For external validation, the PPM full dataset (n = 214, Table S2) was split randomly into a training set (n = 174, Table S13) and a validation set (n = 40, Table S14). Equation was derived using the training set of 174 compounds. The fitting coefficients and regression statistics of eq are statistically similar to eq . For eq , the largest VIF value among ASM parameters was 4.8 for the S parameter, which was lower than the cutoff value of 10 for multicollinearity.[64] Predictions of eq compared favorably with the experimental data for the external validation set (R²external = 0.962 and rmse external = 0.296). The results of four types of cross-validation tests for eq were in good agreement with each other, indicating that the model is internally valid for predictive purposes (Section S1). These tests exhibited rmse and R² in the range of 0.33–0.39 log unit and 0.915–0.930, respectively. The values of ASM parameters for 26 chemicals, for which experimental ASM data were not available, were calculated from the UFZ-LSER website (Dataset-II). With the input of these calculated ASM parameters (n = 26), the values predicted by eq were in good agreement with the experimental values of log Kpe–w (rmse = 0.58 log units). In this comparison, the largest residuals were observed for chemicals that were either very hydrophobic or had significant hydrogen bonding interactions (Figure S1). The application domain for the ESABV model was established by using an influence plot (Figure a). 
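The external validation procedure described above, a random split into training and validation sets followed by comparison of predicted and experimental values, can be sketched as follows. The helper names and the seed are assumptions for illustration, and R² is computed here as 1 − SSres/SStot, which may differ from the definition used to obtain the external R² reported in the text.

```python
import math
import random


def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination, assumed here as 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def random_split(n_total, n_validation, seed=0):
    """Randomly split row indices into training and validation index lists."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    indices = list(range(n_total))
    rng.shuffle(indices)
    validation = sorted(indices[:n_validation])
    training = sorted(indices[n_validation:])
    return training, validation


# Split sizes taken from the text: 214 chemicals, 40 held out for validation.
training_idx, validation_idx = random_split(214, 40)
print(len(training_idx), len(validation_idx))  # 174 40

# Hypothetical observed and predicted log K values, for illustration only.
obs_demo = [2.0, 4.0, 6.0, 8.0]
pred_demo = [2.5, 3.5, 6.5, 7.5]
print(rmse(obs_demo, pred_demo), r_squared(obs_demo, pred_demo))  # 0.5 0.95
```

In an actual workflow the model would be refitted on the training indices only, and the two metrics would then be evaluated on the held-out validation indices.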
The following six chemicals were flagged as influential observations in this plot: PCB 209, aldrin, methoxychlor, n-dodecylbenzene, n-octylphenol, and triclosan. These chemicals are either very hydrophobic or have substantial hydrogen bonding interactions. Leverages for n-octylphenol and triclosan were significantly higher than other chemicals in the dataset. Leverage higher than the critical values generally indicates possible issues with predictor variables, which in our case are the ASM parameters for these solutes. The values of ASM parameters for some chemicals, especially for very hydrophobic and complex molecules, might be in considerable error.[65,66] Using different sets of published experimental solute parameters from several different sources, estimated water–air partition coefficient for pesticides exhibited rmse values ranging from 0.54 to 1.39 log units when compared to experimental data.[66] For triclosan, the percent relative standard deviations for the values of S, A, and B reported in the literature are 17, 36, and 33%, respectively (Table S15).
Figure 3

Comparison of experimental values with the predicted values obtained by inputting calculated values of ASPs in the (a) ESABV model and (b) SABVL model for chemicals for which the experimental values of ASPs were not available. The dotted line in the middle shows 1:1 agreement, and upper and lower dotted lines indicate 1:2 agreement between the experimental and predicted values.

Studentized residuals above the critical values for PCB 209, aldrin, methoxychlor, and n-dodecylbenzene may indicate a problem in the measured log Kpe–w values for these compounds. These compounds are considerably hydrophobic, with log Kow ranging from 5.08 to 8.65. Water–solvent partitioning properties of hydrophobic chemicals fall near the extreme limits of analytical techniques and therefore suffer from significant uncertainties in the reported values.[67]

SABVL Model

The model equation based on the relationship of log Kpe–w with a linear combination of S, A, B, V, and L parameters (SABVL model) successfully explained more than 99% of variation in the log Kpe–w data (eq ). For eq , the largest VIF value among ASM parameters was 8.4 for the L parameter, which was acceptable, being below the cutoff value of 10 for multicollinearity.[64] The higher VIF value observed for the SABVL model can be attributed to a higher correlation coefficient of L with the S (r = 0.81) and V (r = 0.87) parameters as compared to the correlation coefficient of E with the S (r = 0.80) and V (r = 0.61) parameters used in the ESABV model (Figure c). The ESABV model performed slightly better than the SABVL model, with an rmse that was 0.017 log unit lower than that of the SABVL model. External validation was performed by splitting the full dataset (n = 214) randomly into a training set (n = 174, Table S16) and a validation set (n = 40, Table S17). Equation was obtained using the training set of 174 compounds. The fitting coefficients and regression statistics of eq are statistically similar to eq . For eq , the largest VIF value among ASM parameters was 8.2. Predictions of eq for the external validation set were in good agreement with the experimental values (R²external = 0.9364, rmse external = 0.456) (Figure b). Cross-validation tests (Section S2) yielded rmse and R² values in the range of 0.35–0.48 log unit and 0.893–0.925, respectively, indicating robustness of the model. With the input of calculated ASM parameters (n = 26), the values predicted by eq agreed with the experimental values of log Kpe–w with an rmse of 0.75 log units for Dataset-II. The higher rmse observed for the SABVL model than for the ESABV model might be attributed to errors in the calculation of the L parameter. 
Inter-laboratory variation in the experimental value of the L parameter for hydrophobic chemicals has been reported to be significant.[31] The application domain of the SABVL model is somewhat similar to that of the ESABV model (Figure b). Four chemicals, PCB 209, endrin, n-octylphenol, and triclosan, were flagged as influential observations in the influence plot. Of these, only endrin was not flagged as influential in the ESABV model (Figure a). These deviations may be rationalized by errors in either the response or the predictor variables used for these chemicals. For example, the percent relative standard deviations found for the values of L reported in different literature sources for endrin and triclosan were more than 9% (Table S18).
Figure 4

Application domains for (a) ESABV model, (b) SABVL model, (c) op-LFER model based on hexadecane–water partition coefficient and (d) op-LFER model based on octanol–water partition coefficient. Studentized residuals are plotted against hat-values, and the size of circle is proportional to Cook’s distance. Hat-value is a measure of leverage. Observation 170, 182, 184, 185, 186, 189, 199, 208, 209, 210, and 212 flagged in the above figures correspond to PCB 209, aldrin, endrin, endrin aldehyde, endrin ketone, methoxychlor, toluene, n-dodecylbenzene, n-octylphenol, pentane, and triclosan, respectively.

Types of Interaction Dominating PE–Water Partitioning

The LSERs shed light on the type and relative importance of the interactions that govern contaminant uptake by passive samplers from the water phase. This is important information for the choice of the polymeric phase for passive sampling. In the PE–water system, solute parameters such as the size of the molecule (V), polarizability (E), and dispersion interactions (L) favor the transport of contaminants toward the PE phase. On the other hand, contaminants that have stronger polar interactions, such as hydrogen bonding interactions (A and B) and polarity/polarizability (S), tend to favor the water phase over the PE phase (Figure a). These relative transport tendencies are somewhat similar to those of the other passive sampling phases, PDMS, POM, and PA.
Figure 5

Intermolecular interactions governing PE–water and other related partitioning systems. (a) Standardized regression coefficients obtained by regressing log KPE–water against all six ASPs indicate the relative contribution of ASPs in controlling the partitioning of chemicals between the water phase and the PE phase. Error bars indicate the 95% confidence interval around the mean. (b) A biplot between the first two orthogonal dimensions obtained by PCA on the system coefficients for PE–water, PDMS–water, POM–water, PA–water, octanol–water, and hexadecane–water partitioning systems. Red lines are the projections of system coefficients on the two-dimensional space. The first principal dimension (Dim 1) and the second principal dimension (Dim 2) account for 65.47 and 19.51%, respectively, of variance for these partitioning systems.

Comparison of pp-LFER coefficients for the PE–water system with those for the PDMS–water, POM–water, and PA–water systems indicates that the PE phase has the largest e coefficient among these passive sampling phases. Consequently, the PE phase shows the highest affinity for chemicals that are highly polarizable. This is also evident from the biplot obtained by the PCA on the system coefficients of these passive samplers (Figure b). The PE–water system stands out in terms of the e coefficient. System coefficients corresponding to the specific interactions (i.e., s, a, and b) for the PA and POM phases are higher than for the PE and PDMS phases. This indicates that PA and POM are more polar than the PDMS and PE passive sampler phases. The PDMS phase occupies an almost central position in the biplot, which indicates that it has good affinity for chemicals with a wide range of polarities.
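A PCA of system coefficients, of the kind behind such a biplot, can be sketched with a singular value decomposition. The example below is deliberately restricted: it uses only the e, s, a, and b coefficients quoted in this article for the PE–water, hexadecane–water, and octanol–water systems, and it omits the remaining coefficients and the PDMS, POM, and PA systems because their values are not given in the text. The columns are centered but not scaled, which is an assumption. The output therefore does not reproduce the published biplot or its explained-variance figures.

```python
import numpy as np

systems = ["PE-water", "hexadecane-water", "octanol-water"]
coef_names = ["e", "s", "a", "b"]

# Rows: systems; columns: e, s, a, b (values as quoted in this article).
C = np.array([
    [1.00, -1.30, -1.82, -4.04],   # PE-water
    [0.67, -1.62, -3.59, -4.87],   # hexadecane-water
    [0.56, -1.05,  0.03, -3.46],   # octanol-water
])

# Center each coefficient column, then decompose.
Cc = C - C.mean(axis=0)
U, sing, Vt = np.linalg.svd(Cc, full_matrices=False)

explained = sing ** 2 / np.sum(sing ** 2)   # variance fraction per component
scores = U * sing                           # system coordinates (biplot points)
loadings = Vt.T                             # coefficient directions (biplot arrows)

for name, row in zip(systems, scores):
    print(f"{name:>18}: Dim1 = {row[0]: .2f}, Dim2 = {row[1]: .2f}")
for name, row in zip(coef_names, loadings):
    print(f"loading {name}: Dim1 = {row[0]: .2f}, Dim2 = {row[1]: .2f}")
```

With only three systems, two dimensions capture all of the variance by construction, so the sketch shows the mechanics of the projection rather than any result.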

op-LFER Model

The op-LFER models based on the linear relationships of log Kpe–w with log Kow and with log Khexadecane–w were re-examined with statistical diagnostics and compared with the pp-LFER model.

Hexadecane–Water LFER

Comparison of the system coefficients of the Abraham solvation equations for the PE–water and hexadecane–water systems (Table S5) reveals that the cost of cavity formation in hexadecane is approximately 1 order of magnitude lower than in the PE phase. This is expected as PE is a rigid polymeric matrix compared to hexadecane, which is a liquid solvent.[21] The hydrogen bond donating trait for PE relative to water (a = −1.82) is about 58 times higher than that for hexadecane (a = −3.59). The hydrogen bond accepting trait for PE relative to water (b = −4.04) is about 6 times higher than that for hexadecane (b = −4.87). The polarity/polarizability trait of PE is s = −1.30 compared to s = −1.62 for hexadecane. The polarizability trait of PE (e = 1.00) is about twice that of hexadecane (e = 0.67). These tendencies can easily be discerned from the biplot (Figure b). The Euclidean distance between the hexadecane–water system and the PE–water system is larger than the distance between the PE–water and PDMS–water systems. This indicates that the solvation trait of PE is more similar to PDMS than to hexadecane. Taken together, hexadecane shows a stronger nonpolar trait than the PE phase. Linear regression of log KPE–water against log Khexadecane–water resulted in the following form of op-LFER (eq ). The higher rmse observed for eq compared to the ESABV model indicates that one parameter is not enough to account for the total variance of the log KPE–water data. The performance of eq is significantly better than that of the previously reported hexadecane–water LFERs,[21,45] which were developed using smaller and less diverse datasets. For external validation, the full dataset (n = 214) was split randomly into a training set (n = 174, Table S19) and a validation set (n = 40, Table S20). Equation was derived using the training set of 174 compounds. The fitting coefficients and regression statistics of eq are statistically similar to eq . 
Predictions of eq compared favorably with the experimental data for the external validation set (external R2 = 0.9336, external rmse = 0.390). The results of four types of cross-validation tests for eq were in good agreement with each other, indicating that the model is internally valid for predictive purposes (Section S3). These tests exhibited rmse and R2 in the range of 0.40–0.41 log unit and 0.898–0.901, respectively. Three chemicals were found to be outside of the application domain for the hexadecane–water LFER, which are PCB 209, aldrin, and n-dodecylbenzene. These chemicals were flagged in the ESABV model as well, and are either very hydrophobic or have substantial hydrogen bonding interactions.
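The "times higher" factors in the hexadecane comparison above appear to follow from the logarithmic scale of the system coefficients: a difference Δ between two coefficients corresponds to a factor of 10^Δ in the partition coefficient per unit of the matching solute descriptor. That reading is an assumption about how the factors were derived, but it reproduces the quoted values, as this short check shows (the function name is hypothetical):

```python
def fold_difference(coef_pe, coef_other):
    """Factor 10**(difference) implied by two system coefficients on a log10 scale."""
    return 10.0 ** (coef_pe - coef_other)

# Coefficients as quoted in the text: PE-water versus hexadecane-water.
a_ratio = fold_difference(-1.82, -3.59)   # hydrogen bond donating trait
b_ratio = fold_difference(-4.04, -4.87)   # hydrogen bond accepting trait
e_ratio = fold_difference(1.00, 0.67)     # polarizability trait

print(f"a: {a_ratio:.1f}x, b: {b_ratio:.1f}x, e: {e_ratio:.1f}x")
```

The three results fall close to the "about 58 times", "about 6 times", and "about twice" statements made for hexadecane.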

Octanol–Water LFER

As evident from the respective ASM equations (Table S5), the solvation character of octanol is significantly different from that of PE. The hydrogen bond acidity coefficient for the PE–water system (a = −1.82) is about 62 times higher than that for the octanol–water system (a = 0.03). The hydrogen bond basicity trait for PE relative to water (b = −4.04) is about 6 times higher than that for octanol (b = −3.46). The polarity/polarizability trait of the PE–water system is more pronounced (s = −1.30) as compared to that for the octanol–water system (s = −1.05). The polarizability trait of PE (e = 1.00) is about twice that of octanol (e = 0.56). The cost of cavity formation in octanol is approximately 0.4 log unit lower than in the PE phase. This is further corroborated by the biplot (Figure b), where the octanol–water system occupies the position in the direction of the a, b, and s coefficients, and the PE–water system is oriented more toward the e coefficient. These differences in the system coefficients for the two systems imply that the fugacity of polar and semipolar chemicals from the water phase to the PE phase is lower than from the water phase to the octanol phase. The linear regression of log KPE–water against log Kow resulted in the following model equation. The performance of eq is considerably lower than that of eq , which further substantiates the notion that the octanol phase is not a good representation of the PE phase. To evaluate the external validity of the octanol–water op-LFER, the full dataset (n = 214) was split randomly into a training set (n = 174, Table S21) and a validation set (n = 40, Table S22). Equation was obtained using the training set of 174 compounds. The fitting coefficients and regression statistics of eq are statistically similar to eq . Predictions of eq compared favorably with the experimental data for the external validation set (external R2 = 0.8716, external rmse = 0.442). 
The results of four types of cross-validation tests for eq were in good agreement with each other, indicating that the model is internally valid for predictive purposes (Section S4). These tests exhibited rmse and R2 in the range of 0.40–0.42 log unit and 0.892–0.896, respectively. The application domain for the octanol–water LFER was established by using an influence plot. The following five chemicals were flagged as influential observations in this plot: PCB 209, endrin aldehyde, endrin ketone, toluene, and pentane. We compared the predictions from our four models with those from the chemical class-specific octanol–water LFER equations reported for the PCB and PAH families in Ghosh et al.[68] For 117 PCBs, the comparison of experimental log KPE–water values with the values predicted by the ESABV model (eq ), hexadecane–water LFER (eq ), octanol–water LFER (eq ), and PCB-specific octanol–water LFER equation (eq S1) resulted in rmse values of 0.30, 0.30, 0.26, and 0.29 log units, respectively (Table S23). The residuals (predicted minus experimental values) observed for superhydrophobic chemicals such as PCB 207 and 209 were more than 1 order of magnitude for all models. For 47 PAHs, the agreement between experimental log KPE–water values and the values predicted by the ESABV model (eq ), hexadecane–water LFER (eq ), octanol–water LFER (eq ), and PAH-specific octanol–water LFER (eq S2) exhibited rmse values of 0.23, 0.41, 0.29, and 0.26 log units, respectively (Table S24). As expected, the op-LFERs trained on datasets comprising specific chemical families (PCBs and PAHs) perform better than the op-LFERs trained using diverse multiclass chemicals.
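The external validation procedure used for both op-LFERs (a random 174/40 split, a fit on the training set, and rmse and R2 on the held-out set) can be sketched as below. All data here are synthetic and the slope, intercept, and noise level are arbitrary assumptions, not the article's fitted values. The external R2 is computed as 1 − SSres/SStot on the validation set, which is one common definition; the article does not state which definition it uses.

```python
import numpy as np

rng = np.random.default_rng(42)
n_total, n_train = 214, 174

# Synthetic stand-ins (assumption): a reference log K and a noisy linear response.
log_k_ref = rng.uniform(0.0, 9.0, n_total)
log_k_pe = 1.0 * log_k_ref - 0.5 + rng.normal(0.0, 0.4, n_total)

# Random split into training and validation indices.
idx = rng.permutation(n_total)
train, valid = idx[:n_train], idx[n_train:]

# One-parameter LFER: log K_pe = slope * log K_ref + intercept, fit on training data only.
slope, intercept = np.polyfit(log_k_ref[train], log_k_pe[train], 1)

# Evaluate on the held-out validation set.
pred = slope * log_k_ref[valid] + intercept
resid = pred - log_k_pe[valid]
rmse_ext = np.sqrt(np.mean(resid ** 2))
ss_tot = np.sum((log_k_pe[valid] - log_k_pe[valid].mean()) ** 2)
r2_ext = 1.0 - np.sum(resid ** 2) / ss_tot

print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
print(f"external rmse = {rmse_ext:.3f}, external R2 = {r2_ext:.3f}")
```

Because the validation compounds play no part in the fit, the external rmse is a more honest estimate of predictive error than the training rmse.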

COSMOtherm Predictions

COSMOtherm did a reasonable job in predicting the PE–water partition coefficients for diverse chemicals. Overall, COSMOtherm predictions were in good agreement with the experimental values of all chemicals (n = 47, rmse = 0.52 log unit) (Figure a and Table S7). A regression line with an intercept set equal to zero between the two sets of values resulted in slope = 0.96 with R2 = 0.99. For chemical families such as PCBs (n = 28), OCPs (n = 7), PBDEs (n = 3), and hydrocarbons (n = 8), the comparison of predicted values with the experimental values yielded rmse values of 0.50, 0.24, 0.22, and 0.82 log unit, respectively. In general, the largest deviations (residual = predicted log Kpe–w – experimental log Kpe–w) were for the compounds that were significantly hydrophobic in nature with the exceptions of n-pentane (residual = 1.47) and n-hexane (residual = 0.99).
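The comparison statistics used above (a regression line forced through the origin, rmse, and residuals defined as predicted minus experimental) can be sketched as follows. The function names and the small input arrays are hypothetical and made up for illustration. Note that for through-origin fits many statistics packages report an uncentered R2 (relative to the sum of squares of y rather than its variance), which tends to be higher than the usual R2; whether the article's R2 uses this definition is not stated.

```python
import numpy as np

def zero_intercept_fit(x, y):
    """Least-squares slope of y = slope * x, plus the uncentered R2 of that fit."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope = (x @ y) / (x @ x)
    resid = y - slope * x
    r2_uncentered = 1.0 - (resid @ resid) / (y @ y)
    return slope, r2_uncentered

def rmse(predicted, experimental):
    """Root-mean-square error between predicted and experimental values."""
    d = np.asarray(predicted, dtype=float) - np.asarray(experimental, dtype=float)
    return float(np.sqrt(np.mean(d ** 2)))

# Made-up log K values (assumption), only to exercise the functions.
experimental = np.array([3.1, 4.0, 5.2, 6.3, 7.1])
predicted = np.array([3.4, 3.8, 5.0, 6.0, 7.6])

slope, r2 = zero_intercept_fit(experimental, predicted)
residuals = predicted - experimental   # same sign convention as in the text
print(f"slope = {slope:.3f}, uncentered R2 = {r2:.3f}, rmse = {rmse(predicted, experimental):.3f}")
```

A slope below 1 together with a high R2 is the signature of a systematic bias rather than random scatter.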
Figure 6

Comparison of COSMOtherm predictions with (a) experimental and (b) ESABV model predicted log Kpe–w values. Lower panel (c) shows the variation of deviations (residual = ESABV model predicted log Kpe–w – COSMOtherm log Kpe–w) as a function of log Kow.

For a set of 192 chemicals for which experimental values of log KPE–water were not available, agreement between the predictions of the ESABV model and those of COSMOtherm was good (R2 = 0.93) but had a systematic bias (slope = 0.77). Overall, comparison of COSMOtherm predictions with those of the ESABV model resulted in rmse = 0.97 log unit (Figure b, Table S8). COSMOtherm systematically underpredicted the log KPE–water values with respect to predictions of the ESABV model for hydrophobic chemicals. For most hydrophilic chemicals, COSMOtherm overpredicted values as compared to the ESABV model. This indicates that COSMOtherm predictions require some sort of adjustment factor to offset these systematic biases. The deviations (residual = ESABV model predicted log Kpe–w – COSMOtherm log Kpe–w) between the two models were plotted against log Kow to inspect whether the deviations depend on the hydrophobicity of the chemicals (Figure c). For hydrophilic compounds, deviations are positive, implying that COSMOtherm underestimates the values as compared to the ESABV model. These hydrophilic compounds have significant hydrogen bonding interaction traits (Table S6). On the other hand, COSMOtherm overestimated log Kpe–w values as compared to the ESABV model for hydrophobic chemicals. A similar pattern was found in the literature[69] for COSMOtherm calculation of PDMS–water partition coefficients, where large deviations were found for compounds that were either very hydrophobic or had significant hydrogen bonding interactions. From a practical standpoint, we recommend that users prefer the ESABV model (eq ) for estimation of log Kpe–w values on account of its better predictive performance compared to the other models developed in this study. 
For this purpose, users can find the experimental values of ASPs in the UFZ-LSER database.[48] Where experimental values are not available, users may input the calculated ASPs from the UFZ-LSER site,[48] as long as the estimated values fall within the acceptable application domain. Where reliable estimates of ASPs are not available, we recommend the use of the hexadecane–water LFER model (eq ) instead of the octanol–water LFER model (eq ). This is due to the closer solvation proximity between the hexadecane–water and PE–water systems than between the octanol–water and PE–water systems. For eq , users can input values of log Khexadecane–water that are either found in an experimental database[57] or can quickly be predicted using the estimation approaches listed elsewhere.[44,70] The COSMO-RS model is of value to users who have access to supercomputers and commercial software such as TURBOMOLE and COSMOtherm to compute log Kpe–w for chemicals for which reliable experimental or estimated solute descriptors are not available for input into the above LFERs. However, the COSMO-RS model requires corrections for the free volume and crystallinity of the PE, which might not always be readily accessible. Finally, users are advised to carefully evaluate the quality of the input data to these models, especially for compounds that are either very hydrophobic or have substantial hydrogen bonding interactions. As discussed above, these compounds might fall outside the application domain of these calibrated models.

In summary, we successfully trained op- and pp-LFER models using datasets that are larger and structurally more diverse than those reported in previous studies. These models were subjected to rigorous validation tests to verify their predictive robustness. Overall, pp-LFERs performed better than op-LFERs in describing the partitioning variability for the PE–water system. 
The COSMOtherm model, which is based on the COSMO-RS theory, predicted log Kpe–w values with reasonable accuracy for chemicals that are moderately hydrophobic in nature. These models also provide insights about the partitioning behavior of neutral organic chemicals during PE–water exchange, which may help evaluate the utility of PE water passive samplers for the contaminants of interest.