Thomas H Miller1, Jose A Baz-Lomba2, Christopher Harman3, Malcolm J Reid2, Stewart F Owen4, Nicolas R Bury5, Kevin V Thomas2, Leon P Barron1. 1. Analytical & Environmental Sciences Division, Faculty of Life Sciences and Medicine, King's College London , 150 Stamford Street, London, SE1 9NH United Kingdom. 2. Norwegian Institute for Water Research (NIVA) , Oslo, NO-0349, Norway. 3. Norwegian Institute for Water Research (NIVA) , Grimstad, NO-4879, Norway. 4. AstraZeneca, Global Environment, Alderley Park, Macclesfield, Cheshire SK10 4TF, United Kingdom. 5. Division of Diabetes and Nutritional Sciences, Faculty of Life Sciences and Medicine, King's College London , Franklin Wilkins Building, 150 Stamford Street, London, SE1 9NH, United Kingdom.
Abstract
Modeling and prediction of polar organic chemical integrative sampler (POCIS) sampling rates (Rs) for 73 compounds using artificial neural networks (ANNs) is presented for the first time. Two models were constructed: the first was developed ab initio using a genetic algorithm (GSD-model) to shortlist 24 descriptors covering constitutional, topological, geometrical and physicochemical properties and the second model was adapted for Rs prediction from a previous chromatographic retention model (RTD-model). Mechanistic evaluation of descriptors showed that models did not require comprehensive a priori information to predict Rs. Average predicted errors for the verification and blind test sets were 0.03 ± 0.02 L d(-1) (RTD-model) and 0.03 ± 0.03 L d(-1) (GSD-model) relative to experimentally determined Rs. Prediction variability in replicated models was the same or less than for measured Rs. Networks were externally validated using a measured Rs data set of six benzodiazepines. The RTD-model performed best in comparison to the GSD-model for these compounds (average absolute errors of 0.0145 ± 0.008 L d(-1) and 0.0437 ± 0.02 L d(-1), respectively). Improvements to generalizability of modeling approaches will be reliant on the need for standardized guidelines for Rs measurement. The use of in silico tools for Rs determination represents a more economical approach than laboratory calibrations.
Contamination
of the aquatic environment with herbicides, pesticides,
pharmaceuticals, and personal care products (PPCPs), among other contaminants,
has been the focus of environmental monitoring campaigns over the
last two decades. Reported concentrations and associated adverse effects
of these contaminants have led to the introduction of legislative procedures
to monitor and assess risk associated with pollutants, such as the
EU water framework directive and the EU registration, evaluation,
authorization, and restriction of chemicals (REACH).[1,2]

High frequency sampling campaigns often involve the use of grab
or composite sampling, but are practically difficult and costly to
manage for monitoring longer-term fluctuations in contaminant concentrations
in the aquatic environment. These methods are also often labor intensive
with respect to sampling and can lead to considerable cost during
instrumental analysis. More recently, however, the development and
use of passive sampling devices (PSDs) is increasing due to their
capability for a time-integrated approach to averaging contaminant
concentrations in surface waters as well as influent and effluent
wastewater over extended periods.[3] PSDs
minimize sample preparation and allow in situ enrichment of analytes
which may potentially reduce limits of quantification in comparison
to those achieved by point sampling.[4] Passive
sampling devices in some fields are well-established, such as use
of semipermeable membrane devices (SPMD) for organochlorines[5] and other similarly hydrophobic compounds.[6−8] However, one type of PSD which is emerging currently is the polar
organic chemical integrative sampler (POCIS). These samplers have
been used to determine the occurrence of a range of chemically diverse,
and comparatively polar to moderately nonpolar compounds.[9−13] However, for quantitative studies, POCIS suffer from some limitations,
mainly relating to the reliability of derived estimations of the sampling
rates (Rs) from experimental measurements,
as well as the lack of a well-developed performance reference compound
(PRC) exposure correction method.[14−16] One further hindrance
is that reported sampling rate data are few and methods for their
estimation vary which leads to limited transferability across other
locations or studies.[17]

Given the time-intensive nature of determining Rs experimentally, it is possible that computational modeling
approaches could offer a solution that would enable prediction of
sampling rate data for compounds without the need for experimental
determination. A previous investigation by Stephens et al.[18] evaluated the use of an empirical method (the Sherwood
correlation) to determine PSD kinetic parameters for a limited number
of compounds showing maximum errors of +40 and −20% for the
estimation of the aqueous boundary layer mass transfer coefficient
(kf). In contrast to empirical methods
for estimating specific parameters, quantitative structure–property
relationship (QSPR) models are becoming more frequently used in ecotoxicology
where a set of x variables are used to predict a
response, y.[19] The variables
are often molecular descriptors that cover constitutional, topological,
geometrical, and physicochemical properties which can then be used
to model a desired output. Models can vary from simple linear regression
approaches to complex nonlinear functions where such models are often
designed by machine learning methods. Two well-known machine learning
methods are support vector machines (SVMs) and artificial neural networks
(ANNs), both of which have been used successfully in related areas such as the
prediction of bioconcentration factors (BCFs), octanol–water
partition coefficients (logP) and biosolid/water
partition coefficients (Kd), as well as
for suspect compound screening via prediction of chromatographic retention
time.[20−27] The use of SVMs for environmental applications is still in its infancy
and substantial programming capability is required for routine application.
On the other hand, ANNs are well-known and more user-friendly software
has been available for many years. ANNs comprise a layered structure
(normally three), each with a different purpose. The input layer contains
the molecular descriptor data for each compound for training, verification
and blind testing and the output layer is the response. The hidden
layer sits in between and contains several nodes, and often multiple
sublayers of such nodes, where linear or nonlinear functions are used
to relate the descriptors to the output layer. The residual errors
are monitored and reduced by using iterative algorithms which adjust
weights associated with the nodes in the hidden layer. Thus, such
modeling approaches could greatly increase the applicability of POCIS
in environmental monitoring studies through bypassing the need for
laboratory and in situ calibrations.The aim of this work was
to investigate the potential of ANNs to
model and predict Rs for POCIS devices
for a range of pharmaceuticals, endocrine disrupting chemicals, pesticides,
herbicides and drugs of abuse. The objectives were to identify suitable
analyte molecular descriptors to build, train and test a range of
suitable model types and architectures and then finally to externally
validate the approach for predicting Rs for several compounds which were, for comparison, determined in
parallel by laboratory calibration. To the authors’ knowledge,
this represents the first study to draw together, harmonize and predict
the published Rs data for ionizable pharmaceutical
compounds on POCIS. Ultimately, where such tools can provide adequate
predictions using new data generated in the future, this approach
could reduce the analytical burden of laboratory estimations of Rs.
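The layered ANN structure described above (descriptor inputs, hidden nodes applying nonlinear functions, and a single response output) can be illustrated with a minimal forward-pass sketch. This is not the software or the architecture used in this work: the layer sizes, the tanh activation, the random weights and the helper names (`init_layer`, `forward`, `count_parameters`) are all assumptions chosen for illustration, and no training step is shown.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Create one fully connected layer as (weights, biases) with small random values."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-0.5, 0.5) for _ in range(n_out)]
    return weights, biases


def forward(x, layers):
    """Propagate one descriptor vector through all layers and return the output list."""
    activation = x
    for i, (weights, biases) in enumerate(layers):
        z = [sum(w * a for w, a in zip(row, activation)) + b
             for row, b in zip(weights, biases)]
        is_output = i == len(layers) - 1
        # Hidden nodes use a nonlinear function (tanh, an assumption);
        # the output node is left linear so it can represent a rate.
        activation = z if is_output else [math.tanh(v) for v in z]
    return activation


def count_parameters(sizes):
    """Number of adjustable weights and biases for the given layer sizes."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


rng = random.Random(0)
sizes = [4, 3, 1]  # hypothetical three-layer network: 4 inputs, 3 hidden nodes, 1 output
layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
descriptors = [rng.random() for _ in range(sizes[0])]  # placeholder values, not real descriptors
prediction = forward(descriptors, layers)
```

Training would adjust the weights iteratively to reduce the residual error; that step is omitted here. Changing `sizes` to, for example, `[16, 14, 9, 1]` gives the layer layout reported for the RTD-model in the Methods.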
Materials and Methods
Selection of Data Sets, Molecular Descriptors and ANN Models
A working data set derived
from the literature (2007-present) was
used to build, train and optimize models for Rs prediction on POCIS. A total of n = 73 compound Rs data were derived from Fauvelle et al.[28] and Morin et al.,[24] which were
generated using similar experimental conditions to give the largest
combined data set of all studies. Compounds included herbicides, pesticides,
endocrine disrupting compounds and pharmaceuticals. Where duplicate
compound Rs data existed, both values
were removed entirely from the data set (six compounds). Generally,
in these cases Rs differed and it was
uncertain which value was correct or whether an average was appropriate
for modeling. Simplified molecular input line entry system (SMILES)
strings were generated from Chemspider (Royal Society of Chemistry,
UK). Using these, n = 185 molecular descriptors were generated from
Parameter Client freeware (Virtual Computational Chemistry Laboratory,
Munich, Germany) and an additional n = 16 descriptors were from ACD
laboratories Percepta software (Advanced Chemistry Development Laboratories,
ON, Canada).

Two models were generated using two separate sets
of descriptors covering constitutional, topological, geometrical and
physicochemical properties that were investigated for their comparative
prediction performance. The first subset of 24 descriptors (see Supporting Information (SI), Table S1) was generated
using a genetic feature selection algorithm to produce the genetically
selected descriptor model (GSD-model). Genetic feature selection algorithms
follow evolutionary concepts to convert input descriptors into binary
strings, in this case to prioritise descriptors for Rs prediction. Using a process similar to natural selection,
prioritised strings are crossed to form a new population of strings.
The generational “breeding” of strings produced an optimized
selection of input variables for application to prediction. The parameters
for the GA were as follows: population = 100, generations = 100, mutation
rate = 0.1 and crossover rate = 1. In an alternative approach, a much
simpler descriptor data set previously used to model elution from
reversed-phase liquid chromatography (RPLC) stationary phases was
investigated to assess any improvement (see SI Table S2).[23,25] This model is referred to as
the retention time descriptor model (RTD-model). POCIS devices contain
a divinylbenzene and N-vinylpyrrolidone copolymer,
which enables dual polar and nonpolar interactions for retention.
As retention on reversed-phase chromatographic columns is governed
predominantly by hydrophobic interactions too, it is possible that
these same descriptors will also be important in passive sampling.
No retention data were available for the studies by Fauvelle et al.[28] and Morin et al.[24,29] However, the correlation
between Rs and 21 corresponding retention
times (tR) gathered on a C18 stationary phase in a study by Bade et al.[30] showed a weak relationship (R = 0.472).

For both descriptor subsets, several network types were tested
for predictive ability using Trajan 6.0 neural network software (Trajan
Software Ltd., Lincolnshire, UK) and these included radial basis function
(RBF), generalized regression neural networks (GRNNs) and multilayer
perceptrons (MLPs). Following training and optimization using both
data sets, the GSD- and RTD-models were produced. The GSD-model architecture
was a four-layer MLP with 24 descriptors in the input layer (independent
variables); two hidden layers containing 17 and 14 nodes and the dependent
variable output layer (Rs). Training involved
two types of algorithms, the first was back-propagation (BP) and the
second was conjugate gradient descent (CGD). The data set was split
into 45:14:14 cases for the training, verification and test subsets
(optimized). The RTD-model architecture was also a four-layer MLP
using both BP and CGD. The first and fourth layers were the inputs
(using the set of descriptors previously used for chromatographic
retention modeling) and outputs (Rs),
respectively, and the second and third layers (hidden layers) contained
14 and 9 nodes, respectively. The division of cases included 51 compounds
for training, 11 compounds for verification and 11 compounds for blind
testing (optimized).[27] All cases were randomly
selected to avoid bias. The verification data set was used to characterize
network predictive performance during training and also to allow regularisation
to prevent overfitting. The test set was then used to validate the
model to ensure that the model generalized well to new cases. The
optimized models were selected based on the lowest errors and consistency
across the training, verification and test subsets.
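The genetic feature selection step described in this section can be sketched as follows. The population size, number of generations, mutation rate and crossover rate are those stated above; everything else is an assumption made for illustration, including truncation selection, one-point crossover, applying the mutation rate per string rather than per bit, and the toy fitness function. In practice the fitness of a binary string would be derived from the prediction error of a model built on the selected descriptors.

```python
import random


def genetic_feature_selection(n_features, fitness, population=100, generations=100,
                              mutation_rate=0.1, crossover_rate=1.0, seed=0):
    """Return the best binary mask found (1 = descriptor kept, 0 = dropped)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(population)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        # Keep the fitter half as parents (truncation selection, an assumption).
        parents = sorted(pop, key=fitness, reverse=True)[: population // 2]
        children = []
        while len(children) < population:
            a, b = rng.sample(parents, 2)
            if rng.random() < crossover_rate:
                cut = rng.randrange(1, n_features)  # one-point crossover
                child = a[:cut] + b[cut:]
            else:
                child = a[:]
            if rng.random() < mutation_rate:
                # Flip one randomly chosen bit (rate applied per string, an assumption).
                i = rng.randrange(n_features)
                child[i] = 1 - child[i]
            children.append(child)
        pop = children
        best = max(pop + [best], key=fitness)  # never lose the best string seen so far
    return best


# Toy fitness (hypothetical): reward three "informative" descriptors, penalise extras.
informative = {0, 3, 7}


def toy_fitness(mask):
    hits = sum(mask[i] for i in informative)
    extras = sum(mask) - hits
    return hits - 0.1 * extras


mask = genetic_feature_selection(12, toy_fitness, population=30, generations=40)
```

The smaller population and generation count in the usage line only keep the toy run short; the defaults match the values reported above.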
Laboratory Calibration of Sampling Rates to Test Model Generalizability
Sampling rates (L d–1) were determined using
a static renewal method over a 14 day exposure period and in a similar
manner to data in the literature which were used for modeling here.[28] Briefly, 3 L high-density polyethylene vessels
were filled with ultrapure water, the pH adjusted to 7.6 with 20 mg
L–1 NaHCO3 and spiked with the mixture
of respective compounds to expose the POCIS. Each vessel contained
three POCIS devices for exposure to an aqueous-based standard mixture
of 200 ng L–1 of each target compound (solvent <0.001%).
This standard solution was prepared and replaced daily in 3 L volumetric
flasks to maintain the nominal concentration. Following this, all
three POCIS were removed from each vessel at day 4, 7, and 14, rinsed
with ultrapure water and frozen at −20 °C. Extraction
of POCIS sorbents was performed using a wash phase of 5 mL of ultrapure
water and then elution using 5 mL of MeOH. Eluate was dried under
nitrogen at 35 °C for 40 min. The dried residue was then reconstituted
in 0.5 mL of starting mobile phase. The analysis of the benzodiazepines
was performed on an Acquity UPLC system coupled to a Xevo G2 S QTOF
mass analyzer (Milford, MA) with an online Oasis HLB Direct Connect
HP loading column. Analyte separation was performed on an Acquity
UPLC BEH C18 column (1.7 μm, 50 × 2.1 mm) from
Waters (Milford, MA) at 50 °C. Gradient elution (0.6 mL min–1) for analyte separation was with 0.1% (v/v) formic
acid in water (phase A) and 0.1% formic acid in methanol (phase B).
Full method details for the laboratory calibration experiments and
analysis are given in the SI.
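As a rough illustration of how a sampling rate could be derived from such a calibration, the sketch below assumes the commonly used integrative (linear) uptake relation, in which the mass accumulated on the sorbent equals Rs × Cw × t. This relation, the through-origin least-squares fit and the function name `sampling_rate` are assumptions and are not taken from this section; the calculation actually used is described in the SI. The exposure days and the nominal water concentration follow the text, whereas the accumulated masses are invented.

```python
def sampling_rate(days, mass_ng, water_conc_ng_per_l):
    """Estimate Rs (L per day) from accumulated mass versus exposure time.

    Assumes linear uptake, mass = Rs * Cw * t, so Rs is the least-squares
    slope through the origin divided by the water concentration Cw.
    """
    slope = sum(t * m for t, m in zip(days, mass_ng)) / sum(t * t for t in days)
    return slope / water_conc_ng_per_l


days = [4, 7, 14]                # retrieval days used in the calibration
mass_ng = [40.0, 70.0, 140.0]    # hypothetical accumulated masses (ng per sampler)
rs = sampling_rate(days, mass_ng, water_conc_ng_per_l=200.0)  # 200 ng/L nominal
```

With these invented masses the estimate is close to 0.05 L per day; real data would scatter around the fitted line.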
Results and Discussion
Rs Prediction Using a GSD-Model
Following genetic feature selection, a
24–17–14–1
MLP yielded the best performance using 24 input descriptors with R2 = 0.8800, 0.8694, and 0.8050 for training,
verification and blind test sets respectively (sum of squared residual
errors were 0.084, 0.062, and 0.116, respectively). Therefore, this
model initially seemed quite promising for application to prediction
of Rs for new compounds (Figure 1a). Many shortlisted descriptors
were derived from topological indices, but some others were expected
to have more importance for this application, such as those that describe
molecular hydrophobicity. These include the octanol–water partition
coefficient (logP) and the distribution ratio between
octanol and water (logDow). The latter
takes into account the ionised proportion of a compound at a particular
pH and is dependent on the logP and the pKa of all ionizable functional groups in a molecule.
An investigation by Booij et al.[31] demonstrated
that uptake rates in SPMDs correlated well with logP where Rs ≈ P^(−0.044). Correlations between logP and Rs have also been observed for POCIS
devices.[32−34] Assessment of the collinearity with Rs (SI Table S7) showed, rather
unsurprisingly for so many ionizable compounds, that logDow had by far the highest correlation (R = 0.59), but it was insufficient by itself to describe sorption to
POCIS sorbents. To the authors' knowledge, no previous investigations
have used logDow to model Rs, although it has been weakly correlated with Rs.[35] Furthermore,
interinput descriptor collinearity also existed and especially for
constitutional descriptors such as the number of non-H bonds (nBO)
and the sum of conventional bond orders (SCBO); as well as topological
descriptors such as the log Narumi simple topological index (Snar) and the second
Zagreb index (ZM2). Pearson’s coefficients were ≥0.8
for these descriptors with at least eight other descriptors. Therefore,
though genetic algorithms shortlisted useful descriptors for potential Rs modeling here, back-interpretation of model
sensitivity to descriptor data for derivation of mechanistic understanding
of physicochemical POCIS uptake mechanisms would be limited. However,
as a tool to predict Rs, the training
set overall displayed good accuracy within 22% of the measured value
on average. In comparison, the verification and test subsets were
predicted on average within 19% of their measured values showing consistency
across all subsets (SI, Figure S1). For
particular blind test cases, however, some notably large inaccuracies
were observed such as for sotalol (80% inaccuracy) where the lower
hydrophobicity of this molecule may explain poorer correlation with Rs.[31] Larger errors
were also recorded for acetochlor ethanesulfonic acid (40%), diclofenac
(39%) and sulcotrione (38%). The verification subset contained two
largely inaccurate predictions (2,4-dichlorophenoxyacetic acid at
59% and timolol at 31%), but all remaining compounds were within 20%
of measured Rs. Larger inaccuracies may
be related to poor learning from selected training data. For example,
mesotrione had a 59% inaccuracy to the measured value in the training
set which may explain the poor prediction of another structurally
similar compound, sulcotrione, in the test set. Overall, inaccuracy
was most prevalent for sulfonate-containing compounds where genetic
selection did not sufficiently prioritise descriptors for this portion
of cases for reliable Rs prediction. As
the number of available cases expands, genetic selection of descriptors
may improve for such compounds in the future. It is also unclear whether
sulfonate bearing molecules are subject to steric and/or repulsive
forces arising from the PES membrane. Furthermore, larger inaccuracies
(>30%) in the full data set generally corresponded to compounds
with Rs < 0.1 such as 2,4-dichlorophenoxyacetic
acid, sotalol, sulcotrione and nicosulfuron. However, when predictive
accuracy was plotted against Rs for all
compounds, no correlation was observed for other compounds with Rs < 0.1 (SI, Figures S2 and S3).
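The dependence of logDow on logP, pKa and pH mentioned above can be made concrete with the standard expressions for a compound with a single ionizable group, assuming that only the neutral form partitions into octanol. This is a simplified sketch: compounds with several ionizable groups need a more general treatment, the descriptor software used in this work may calculate logD differently, and the example logP and pKa values are hypothetical.

```python
import math


def logd_monoprotic_acid(logp, pka, ph):
    """logD of a monoprotic acid, assuming only the neutral species partitions."""
    return logp - math.log10(1.0 + 10.0 ** (ph - pka))


def logd_monoprotic_base(logp, pka, ph):
    """logD of a monoprotic base (pka of the conjugate acid), same assumption."""
    return logp - math.log10(1.0 + 10.0 ** (pka - ph))


# Hypothetical acid with logP = 3.0 and pKa = 4.6 at the calibration pH of 7.6.
logd_acid = logd_monoprotic_acid(logp=3.0, pka=4.6, ph=7.6)

# Hypothetical base with logP = 1.5 and pKa = 9.6 at the same pH.
logd_base = logd_monoprotic_base(logp=1.5, pka=9.6, ph=7.6)
```

In both examples the compound is about 99% ionised at pH 7.6, so logD falls roughly two to three log units below logP.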
Figure 1
Measured Rs against predicted Rs for (a) the GSD-model and (b) the RTD-model.
Crosses, circles and triangles are the training, verification and
test subsets, respectively. Open circles and triangles indicate predicted
inaccuracies of >30% of the measured value.
Rs Prediction Using a RTD-Model
The correlation of predicted versus measured Rs for the RTD-model is shown in Figure 1b. The errors (sum squared) for the subsets
were 0.092, 0.062, and 0.121 for the training, verification and test
sets, respectively. The model was, again, a four-layered MLP with
a 16:14:9:1 architecture. Generally, acceptable correlations were
achieved for the training, verification and blind test sets (R2 = 0.8511, 0.9085, and 0.6425, respectively)
though this model performed slightly worse (training and test) than
the GSD-model. The training subset showed several larger errors which
corresponded to the compounds t-butylphenol (149%),
2,4-dichlorophenol (41%), and simazine (41%). The compound sulfamethoxazole
showed an 81% overestimation of its experimentally determined Rs. As discussed earlier, this large inaccuracy
was also reflected in the GSD-model which showed an overestimation
of 230% for sulfamethoxazole which also bears a sulfonate group. Overall,
however, the model showed relatively good predictions of Rs (mean absolute error for training set = 15%; and for
both verification and test subsets = 22%). The average error ±
standard deviation across the verification and blind test subsets
was 0.03 ± 0.02 L day–1 showing acceptable
overall predictive accuracy for Rs. Atenolol,
the compound with the lowest Rs, yielded
poor prediction accuracy (predicted Rs = 0.067, measured Rs = 0.025) which
was initially thought to be due to its higher polarity in comparison
to others selected for this study. However, no correlation was observed
between predictive accuracy and logDow (Figure 2). The mean
predictive error of the verification and blind test sets both
reduced to ∼15% upon removal of the atenolol data-point. Importantly,
as very polar compounds are generally not retained well by N-vinylpyrrolidone-co-divinylbenzene-based polymer sorbents, inaccuracy in
measured Rs may be compound specific as
a result, which in turn may contribute to RTD-model prediction errors.
This highlights the lack of consistent measurements available for
training of such models for predictive purposes. Nonetheless, considering
this performance alongside the potential for inaccuracy in Rs data from different laboratory calibrations,
predictions using these models were considered reasonable.
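The percentage inaccuracies quoted in this section can be reproduced with a short calculation, assuming they are absolute errors expressed relative to the measured value (the exact definition is not spelled out in the text). The atenolol values are those given above; the two extra pairs in the mean are hypothetical.

```python
def percent_inaccuracy(predicted, measured):
    """Absolute prediction error as a percentage of the measured value."""
    return abs(predicted - measured) / measured * 100.0


def mean_percent_inaccuracy(pairs):
    """Mean of percent_inaccuracy over (predicted, measured) pairs."""
    return sum(percent_inaccuracy(p, m) for p, m in pairs) / len(pairs)


# Atenolol: predicted Rs = 0.067, measured Rs = 0.025 (values from the text).
atenolol = percent_inaccuracy(predicted=0.067, measured=0.025)

# Hypothetical pairs, only to show the averaging step.
subset_mean = mean_percent_inaccuracy([(0.11, 0.10), (0.18, 0.20)])
```

Under this definition atenolol is overestimated by about 168%, which shows how a single low-Rs compound can dominate the mean error of a small subset.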
Figure 2
RTD-model residual
plot of predicted Rs values for the verification
and test subset only, ordered in parentheses
by their ascending distribution ratio values between octanol and water
(logDow). Circles and triangles represent
the verification and test subset, respectively. The measured Rs values are displayed in parentheses on the x-axis. 2,4-D (2,4-dichlorophenoxyacetic acid), ESA (ethanesulfonic
acid), OA (oxanilic acid), and IPPMU (isoproturon-monodemethyl).
Model Interpretation and Descriptor Contribution to Rs Prediction
Given the level of multicollinearity
observed for GSD-model descriptors, a sensitivity analysis could only
be performed to identify the relative contribution of each descriptor
to predictions in the RTD-model. This was represented as the error
ratio, i.e. the ratio between the model error using all descriptors
and the model error when one descriptor was removed. However, as
with the GSD-model, the use of sensitivity analysis to further mechanistic
understanding of sorption processes should be approached with caution
if some individual descriptors display multicollinearity (please refer
to SI Tables S1–S3 for full descriptor
details and data). The logDow, the Moriguchi
octanol–water partition coefficient (MlogP), the Ghose-Crippen octanol–water partition coefficient (AlogP) and the number of benzene rings (nBnz) were the top four descriptors used by the RTD-model (Figure 3). This is in agreement with
Bäuerlein et al., who showed that hydrophobicity and pi-pi interactions (e.g., via benzene rings) were important
for adsorption to HLB sorbents in batch experiments[36] and which can also affect diffusion. Other important descriptors
were the number of triple bonds (nTB; error ratio = 1.2165), number
of five-membered rings (nR05; error ratio = 1.2041) and number of
nine-membered rings (nR09; error ratio = 1.4544). The importance of
the n-membered ring descriptors could be attributed
to molecular size and flexibility thus affecting the diffusivity of
molecules through the water boundary layer (WBL), PES membrane (pore
= 0.1 μm) or pores of the HLB copolymer (80 Å).[37−39] A previous investigation showed that size descriptors were also
important for predicting soil sorption coefficients for pesticides.[39] We also previously showed these descriptors
were important for ANN-based predictions of pharmaceutical sorption
to soils and sludge.[40] In addition to those
mentioned above, the number of carbons (nC), number
of oxygens (nO) and hydrophilic factor (Hy) were also important to the RTD-model. Hy relates to the number of hydrophilic groups in the molecule
such as hydroxyls, thiols and sulfonates. As polar surface area has
been previously shown to influence interactions with HLB sorbents,
it is logical that hydrophilicity/polarity related descriptors would
have some importance.[36] Several authors
have suggested that diffusion is the main factor governing uptake
rates in PSDs.[41] We have attributed the
importance of the descriptors mainly to sorbent interactions so far,
but it is also possible that these same descriptors could relate to
diffusion processes due to the number of molecular properties that
will affect it including dipole moments, polarizability, molecular
size (including hydration radius) and electrostatic charge.[42] The genetic feature selection algorithm did
not select some recognized diffusion-related descriptors, such as
molecular weight as a simple example. However, it did select other
descriptors that showed interdependencies on factors affecting diffusion
such as number of atoms, number of rotatable bonds, and electrotopological
states. Rs has been attributed mainly
in the past to diffusion processes in partition samplers such as silicone
rubbers.[41] The portion of Rs governed by diffusion in adsorption samplers using HLB-type
sorbents in POCIS remains unclear especially whether sorption of analytes
via hydrogen bonding, dipole–dipole, dipole–induced
dipole, van der Waals and pi-pi interactions plays
a more significant role. It is also possible that the models presented
here for Rs prediction could be developed
and improved further with additional or alternative descriptors, such
as diffusion coefficients. However, adding such descriptors may introduce
a greater uncertainty into the model as estimates can be based on
several different approaches.[43−45] In addition, diffusion coefficients
will be affected by numerous environmental factors and hydrodynamic
conditions that would be difficult to replicate or control in situ.
Inclusion of larger numbers of descriptors to cover all the processes
involved will likely inhibit model generalizability. Indeed, ANNs
learn more holistically, making predictions possible without the need
for such comprehensive a priori information. However, such a holistic
approach obviously limits deeper understanding of the precise contribution
of individual mechanisms involved in POCIS.
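The error-ratio sensitivity analysis used in this section can be sketched generically as below. Two assumptions are made that the text does not confirm: a descriptor is "removed" by replacing it with its mean value across all cases, and the ratio is taken as the error without the descriptor divided by the error with all descriptors, so that influential descriptors give ratios above 1 (consistent with the reported values of about 1.2 to 1.5). The toy model and data are hypothetical.

```python
def sum_squared_error(model, X, y):
    """Sum of squared residuals of model predictions over all cases."""
    return sum((model(row) - target) ** 2 for row, target in zip(X, y))


def error_ratios(model, X, y):
    """Error ratio for each descriptor: error with it removed / error with all."""
    baseline = sum_squared_error(model, X, y)
    ratios = []
    for j in range(len(X[0])):
        mean_j = sum(row[j] for row in X) / len(X)
        # "Remove" descriptor j by fixing it at its mean (an assumption).
        X_removed = [row[:j] + [mean_j] + row[j + 1:] for row in X]
        ratios.append(sum_squared_error(model, X_removed, y) / baseline)
    return ratios


def toy_model(row):
    """Hypothetical fitted model that depends only on the first descriptor."""
    return 2.0 * row[0]


X = [[1.0, 5.0], [2.0, 3.0], [3.0, 8.0], [4.0, 1.0]]  # two hypothetical descriptors
y = [2.1, 3.9, 6.1, 7.9]                               # hypothetical responses
ratios = error_ratios(toy_model, X, y)
```

Here the first descriptor gives a ratio far above 1 and the second gives exactly 1, because the toy model ignores it. When descriptors are collinear, as discussed above, such ratios become difficult to interpret.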
Figure 3
Sensitivity analysis
of the optimized RTD-model. Acronyms: nDB/nTB
= number of double/triple bonds; nC/nO = number of carbon/oxygen atoms;
nR04-nR09= number of 4–9 membered rings; Ui = unsaturation
index; Hy = hydrophilic factor; nBnz = number of benzene-like rings;
MlogP/AlogP = Moriguchi/Ghose-Crippen logarithm of octanol–water
partition coefficient; logD7.6 = logarithm of distribution ratio between
octanol and water at pH 7.6.
By comparison, the GSD-model featured many more topological and
geometrical descriptors than in the RTD-model. These descriptors showed
multicollinearity and therefore the sensitivity analysis could not
be performed reliably (Table S7). Simply
adding noncollinear descriptors to the RTD-model is also disadvantageous
at this point. As the number of descriptors increases, overfitting
of data is more likely to occur and would require significantly more
case examples for valid application.[46] Model
complexity will also limit the ability of the network to generalize
when predicting unknown compounds; therefore, a smaller number of descriptors
(and nodes in the hidden layer(s)) is ultimately more beneficial.
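A simple way to screen for the kind of multicollinearity discussed here is to count, for each descriptor, how many others have an absolute Pearson coefficient at or above a threshold (0.8 is the value used in the text). The sketch below is illustrative only: the descriptor names are taken from the text, but the numerical values are invented and the helper names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def collinear_counts(descriptors, threshold=0.8):
    """For each descriptor, count the others with |r| >= threshold."""
    names = list(descriptors)
    return {
        name: sum(
            1
            for other in names
            if other != name
            and abs(pearson(descriptors[name], descriptors[other])) >= threshold
        )
        for name in names
    }


# Invented values for five hypothetical compounds.
descriptors = {
    "nBO": [10, 12, 15, 20, 22],
    "SCBO": [12, 15, 18, 25, 27],
    "Hy": [0.5, -0.2, 0.9, -0.4, 0.1],
}
counts = collinear_counts(descriptors)
```

With these invented values nBO and SCBO flag each other, while Hy is not flagged against either.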
Reproducibility of Predicted and Experimentally Determined Rs
Model performance and generalizability
are limited by the quality of input data. Measured Rs can differ considerably even within calibration studies
performed in the same laboratory. The largest variance in measured Rs used corresponded to diclofop which had a
60% relative standard deviation (RSD), n = 15.[28] Many of the reported Rs values vary by more than 2-fold depending on the methodology
used for their estimation.[29] Although pH
and temperature during collection of both sets of data used herein
were similar, the type of calibration experiment applied was slightly
different (flow-through and static renewal); therefore, the resulting
differences in the Rs estimates from each
investigation could have affected the performance of a model. For
the six compounds common to both calibration methods that were removed
from the original data set used for model optimization, the absolute
difference in Rs was 0.088 ± 0.072
L day–1 between measurements.

The average % RSD of measured Rs data used herein
was 11% (mean deviation was ±0.017 L day–1)
(Figure 4). In 45%
of all cases, the % RSDs of predicted Rs across triplicate networks trained ab initio were better than the
% RSDs of the measured data. For several specific cases, such as DET,
DIA, diclofop and ioxynil, the experimental variation was relatively
large when compared to the variation in predicted Rs. Such deviation in experimentally derived sampling rates
can be attributed to several similar factors to those already discussed
above (e.g., temperature, pH, flow rate etc.). Figure 4 shows that for cases which had poor predictive
accuracy with respect to the mean true value, such as for acetochlor
ESA, the standard deviation of the predicted Rs overlapped with the reported experimental variance. A review
by Harman et al. suggests that literature reported Rs data should only be considered as an approximation.[47] However, in the absence of a standardized method
for POCIS calibration, either in the laboratory or in the field, it
would seem that Rs modeling in this way
offers similar accuracy and precision without being labor or resource
intensive. Calibration experiments for each compound can take several
weeks, requiring a large mass of reference material for static renewal
and flow through experiments, or very frequent and accurate water
sampling for in situ experiments. Furthermore, given that models developed
herein are derived from a very limited number of training cases, any
new reported Rs data generated by similar
methods to those used herein will likely enable better generalizability
in the future, as was observed with retention time predictions in
reversed-phase liquid chromatography.[27]
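The precision comparison above rests on the percent relative standard deviation (% RSD) of replicate values. A minimal sketch of that calculation follows, assuming the sample standard deviation (n − 1) is used. The function name and the replicate values are hypothetical and are not data from this study.

```python
from statistics import mean, stdev

def percent_rsd(values):
    """Percent relative standard deviation: 100 * sample SD / mean."""
    return 100.0 * stdev(values) / mean(values)

# Hypothetical replicate values in L/day, for illustration only (not study data).
measured_rs = [0.14, 0.17, 0.20]   # replicate calibration measurements
predicted_rs = [0.16, 0.17, 0.18]  # predictions from triplicate networks

rsd_measured = percent_rsd(measured_rs)
rsd_predicted = percent_rsd(predicted_rs)

# The prediction is counted as more precise when its % RSD is the lower of the two.
prediction_more_precise = rsd_predicted < rsd_measured
print(f"measured: {rsd_measured:.1f}% RSD, predicted: {rsd_predicted:.1f}% RSD")
```

Applying the same comparison to each compound gives the fraction of cases in which replicate predictions were less variable than the measurements.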
Figure 4
Comparison
of the measured and predicted Rs values
and their respective variances against the variance
in predicted Rs (n =
73) from replicate RTD-models (n = 3). Inset: Optimized
16–14–9–1 model architecture. Compounds in bold
represent the verification and blind test cases. All others were used
for model training.
External Application to Rs Prediction
To further support the
application of the optimized modeling approach, Rs data for several additional benzodiazepines
were experimentally determined in our laboratory using a similar approach.
In the previous sections, blind test compounds were structurally diverse,
which is logical for testing model accuracy.[48] However, for this experiment, structurally similar compounds were deliberately
chosen to externally test the discriminative power of the models. Despite this similarity,
it was expected that measured Rs could
be different on POCIS given their slight differences in chromatographic
retention on C18 phases. The retention order of the benzodiazepines
was as follows: oxazepam (3.26 min), nitrazepam (3.26 min), clonazepam
(3.29 min), lorazepam (3.29 min), alprazolam (3.31 min), midazolam
(3.32 min), flunitrazepam (3.37 min) and diazepam (3.58 min). As discussed
previously, measurement of Rs often suffers
from some imprecision. The calibration experiment performed here was
not exempt from this either. Two compounds, lorazepam (Rs: 0.205 L d–1) and oxazepam (Rs: 0.226 L d–1), were originally
present in the training set and verification set respectively during
model development. The Rs values for these
compounds were experimentally determined again here to characterize
the variance between the selected calibration method used here and
the method by Morin et al.[24] The Rs determined here varied by approximately 0.1
L d–1 for both compounds (lorazepam: 0.302 L d–1 and oxazepam: 0.327 L d–1). This
observation showed again that the difference in calibrations between
flow-through and static renewals is not negligible and was an unavoidable
limitation of the calibration experiment used here. Standard deviations
for the six compounds ranged from ±0.024 to ±0.055 L day–1 (n = 9). Overall, the average RSD
for all compounds was 20 ± 6% (flunitrazepam: 19%; clonazepam:
13%; nitrazepam: 13%; midazolam: 23%; diazepam: 23%; and alprazolam:
29%) and this variance was consistent with other studies.[29]
As shown in Figure 5, both the GSD- and RTD-models predicted Rs reasonably close to the measured value for all
six compounds. The two largest errors in the RTD-model (16 and 17%) corresponded
to the substances with the highest Rs variance (diazepam and alprazolam, respectively),
but the four remaining compounds showed little inaccuracy (≤5%).
In terms of absolute error relative to the measured Rs, however, examination of the RTD-model residual errors
showed that predictions were slightly overestimated for all compounds
except nitrazepam. The GSD-model performed worse by comparison
(Figure 5). The two largest errors corresponded to nitrazepam and midazolam,
which were 37% and 43% inaccurate, respectively. The remaining compound
inaccuracies were alprazolam (28%), clonazepam (18%), diazepam (10%)
and flunitrazepam (19%). The average absolute error for the GSD-model
predictions was 0.0437 L d–1 and all compounds were
predicted within ±0.075 L d–1. By contrast,
the RTD-model had an average absolute error of 0.0145 L d–1 for these benzodiazepines (and Rs for
all compounds was predicted within 0.03 L day–1). These results again demonstrated that predicted Rs values were close enough to those determined
experimentally to be of practical use.
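The summary statistics quoted above can be re-derived from the values reported in this section. The sketch below uses only numbers given in the text. The variable names are illustrative, and the use of the sample standard deviation for the spread of the RSDs is an assumption.

```python
from statistics import mean, stdev

# Rs (L/day) used during model development (method of Morin et al.)
# and Rs re-measured with the calibration method used here.
rs_model_development = {"lorazepam": 0.205, "oxazepam": 0.226}
rs_this_study = {"lorazepam": 0.302, "oxazepam": 0.327}

# Difference between the two calibration methods for each compound.
method_difference = {
    name: round(rs_this_study[name] - rs_model_development[name], 3)
    for name in rs_model_development
}

# Reported % RSD (n = 9) for the six externally tested benzodiazepines.
rsd_percent = {
    "flunitrazepam": 19, "clonazepam": 13, "nitrazepam": 13,
    "midazolam": 23, "diazepam": 23, "alprazolam": 29,
}
rsd_values = list(rsd_percent.values())
mean_rsd = mean(rsd_values)
sd_rsd = stdev(rsd_values)  # sample SD (n - 1), an assumption

print(method_difference)
print(f"average RSD = {mean_rsd:.0f} ± {sd_rsd:.0f}%")
```

Both differences come out close to 0.1 L d–1, and the six RSDs average to 20 ± 6%, in agreement with the figures stated in the text.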
Figure 5
Residual plot of the predicted Rs values
for the GSD-model (cross) and RTD-model (diamond) for external prediction
validation (as a blind test application) using six additional benzodiazepines.
Measured Rs values are displayed in parentheses.
Passive sampling for nonhydrophobic
compounds is mainly used for
screening purposes and as a semiquantitative technique. Furthermore,
in situ exposures are difficult to quantify accurately as laboratory
calibrations may not translate well into field Rs due to factors such as biofouling and other matrix-
or environment-related effects on diffusion. In addition,
for reliable quantification the performance reference compound approach
has limited availability and application for polar passive sampling
due to the strong retention of analytes on HLB sorbents.[49] However, modeling approaches could potentially
overcome these limitations if models were built from in situ calibration
data. It is also possible that estimation of Rs by in silico approaches may offer a viable alternative for
compounds where Rs data cannot be estimated
by field studies due to poor correlation of concentrations in water
to sample mass on the PSD.[35] Lastly, the
two different approaches to the molecular descriptor selection presented
show acceptable predictive accuracy for polar compound passive sampling.
However, the use of descriptors derived for tR prediction in a model for Rs prediction
holds significant potential for application to new compounds based
solely on their SMILES strings by simultaneously allowing preliminary
identification (by tR and high resolution m/z, for example) and estimation of Rs using the same descriptors.