Gabriel Sigmund1,2,3, Mehdi Gharasoo4, Thorsten Hüffer1, Thilo Hofmann1. 1. Department of Environmental Geosciences, Centre for Microbiology and Environmental Systems Science, University of Vienna, Althanstrasse 14, 1090 Wien, Austria. 2. Agroscope, Environmental Analytics, Reckenholzstrasse 191, CH-8046 Zurich, Switzerland. 3. Ithaka Institute, Ancienne Eglise 9, 1974 Arbaz, Switzerland. 4. Department of Earth and Environmental Sciences, Ecohydrology, University of Waterloo, 200 University Avenue West, Waterloo, Ontario N2L 3G1, Canada.
Abstract
Most contaminants of emerging concern are polar and/or ionizable organic compounds, whose removal from engineered and environmental systems is difficult. Carbonaceous sorbents include activated carbon, biochar, fullerenes, and carbon nanotubes, with applications such as drinking water filtration, wastewater treatment, and contaminant remediation. Tools for predicting sorption of many emerging contaminants to these sorbents are lacking because existing models were developed for neutral compounds. A method to select the appropriate sorbent for a given contaminant based on the ability to predict sorption is required by researchers and practitioners alike. Here, we present a widely applicable deep learning neural network approach that excellently predicted the conventionally used Freundlich isotherm fitting parameters log KF and n (R2 > 0.98 for log KF, and R2 > 0.91 for n). The neural network models are based on parameters generally available for carbonaceous sorbents and/or parameters freely available from online databases. A freely accessible graphical user interface is provided.
Persistent organic pollutants (POPs) are hydrophobic organic
compounds that include the original 12 compounds regulated in the
Stockholm Convention (the “dirty dozen”). These toxic
compounds have been of special concern because of their longevity
(“persistence”) and their potential long-range atmospheric
transport. Over the past several decades, POPs and their environmental
fate have been widely studied and approaches to elucidate their fate
in the natural environment have been developed.[1] Today, many contaminants of emerging concern are polar
and/or ionizable organic compounds, including pesticides, pharmaceuticals,
and personal care products. For example, in 2010, approximately 50%
of the industrial chemicals falling under European chemicals regulation
(REACH) were ionizable organic compounds; of these, 27% were acids,
14% were bases, and 8% were zwitterions.[2]
A state-of-the-art approach to predicting the sorption of neutral
hydrophobic organic contaminants to a given material (sorbent) is the
poly-parameter linear free-energy relationship (ppLFER).[3−6] The ppLFER concept for neutral compounds is based on the Abraham
parameters E (excess molar refraction), S (dipolarity/polarizability), A (H-bond acidity), B (H-bond basicity), V (McGowan molar volume,
cm3 mol–1/100), and L (log of the hexadecane–air partition coefficient). In addition,
the sorption of organic compounds to carbonaceous sorbents is concentration-dependent
(non-linear), a factor that was recently introduced into ppLFER for
predicting the sorption of neutral organic compounds to activated
carbon[4] and to soot.[5]
Carbonaceous sorbent materials, such as activated
carbon, soot,
biochar, and carbon nanotubes (CNTs), have a wide range of applications,
including drinking water filtration systems, wastewater treatment
plants, and soil and sediment remediation. There are major limitations
to the use of conventional ppLFER in the characterization of these
systems. Specifically, the development of each ppLFER requires a substantial
number of experiments with a wide range of compounds, and every
model is limited to the single sorbent for which it was developed.
Thus, compared to existing models, the ability to
predict contaminant sorption as a function of the properties of the
sorbent would be of great advantage, as it would facilitate selection
of both the appropriate sorbent and its quantity for a given application.
Moreover, methods developed to predict the sorption of neutral
compounds, such as the ppLFER, are not applicable to charged compounds
because the occurrence of additional interactions, including electrostatic
repulsion and attraction, charge-assisted H-bonding, cation bridging,
cation−π bonding, and anion−π bonding, will
depend on the speciation/dissociation of a given ionizable organic
compound.[7−10] These interactions cannot be accounted for in existing ppLFER concepts,
and the prediction of the environmental fate of charged compounds
is accordingly hindered.
A simple sorbent-dependent model based on experimental data to
predict the sorption of organic acids was recently proposed.[11] The model predicted sorption distribution coefficients
using the pH-dependent lipophilicity parameter log DOW and the specific surface area (SSA) of the carbonaceous
sorbent. However, (i) acids are only one of the three types of ionizable
organic compounds; (ii) the model could not satisfactorily predict
the literature data likely because of differences in measurement protocols,
including the measurement of the SSA,[12] and (iii) the concentration-dependent nonlinearity of sorption was
not predicted.
To address these issues, we collected published data from over
10 years of experimental research. As the Freundlich isotherm fitting
model, first published in 1907,[13] was the
most widely applied model (66% of 210 papers collected), it was chosen
as a target for prediction. To predict the Freundlich fitting parameters
for the sorption of ionizable organic compounds to carbonaceous sorbents,
a deep learning approach was developed and tested on independent literature
data to validate the model’s performance. The results showed
that this newly developed method is able to predict the sorption of
anionic, cationic, and zwitterionic ionizable organic compounds to
carbonaceous sorbents and is therefore widely applicable. Moreover,
it is based on parameters generally available for carbonaceous sorbents
and additional compound descriptors that are freely available from
online databases. A freely accessible graphical user interface is
provided by the authors.
Methods
Data Mining from the Literature
The literature from 2005 to 2019 was searched using three-word Scopus
searches, including one keyword for the sorbent (CNT, activated carbon,
biochar, graphene, carbonaceous, or graphite) and one keyword for
the sorbate (polar or ionizable) and “sorption.” Reviews
and nonrelevant papers were excluded from the database, which resulted
in a list of 210 papers. Thereafter, only papers including the Freundlich
isotherm fit and reporting the SSA as well as the C, H, and O contents
were selected for further analysis, which resulted in a core database
sourced from 47 publications.[11,14−59] Every sorbent–sorbate combination used in these publications
received a separate line in the database, which resulted in 328 lines
for negatively charged and polar compounds (Table S1 in the Supporting Information) and 139 lines for compounds
with a positive charge (Table S2 in the Supporting Information). Each line contained information on the combination
of one single sorbate and one single sorbent under one specific pH
condition. If pH was not reported, a pH of 7 was assumed for the sorbate
property calculations (i.e., log DOW).
For highly carbonized materials with a carbon content >90% and no
measurable H content, an H content of 0.01% was assumed for H/C calculations.
The validity of this approach was tested by running the neural network
with and without these data. The predictions based on the smaller
data set did not differ substantially but were associated with larger
prediction errors and a smaller working range because of the decreased
number of available training items (data not shown).
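The H/C and O/C inputs are molar ratios, so mass percentages from elemental analysis have to be divided by atomic masses before taking the ratio. The following is a minimal sketch of that conversion, including the 0.01% H fallback described above. The function name `molar_ratios` and the example values are hypothetical and not taken from the authors' code.

```python
# Standard atomic masses in g/mol (rounded).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}

def molar_ratios(c_pct, h_pct, o_pct, h_floor_pct=0.01):
    """Return molar (H/C, O/C) from mass percentages of C, H and O."""
    # Assumed fallback: highly carbonized material (>90% C) without measurable H.
    if c_pct > 90 and not h_pct:
        h_pct = h_floor_pct
    if h_pct is None:
        raise ValueError("H content is required for materials with <=90% C")
    mol_c = c_pct / ATOMIC_MASS["C"]
    mol_h = h_pct / ATOMIC_MASS["H"]
    mol_o = o_pct / ATOMIC_MASS["O"]
    return mol_h / mol_c, mol_o / mol_c

# Hypothetical highly carbonized sorbent with no measurable H.
hc, oc = molar_ratios(c_pct=95.0, h_pct=0.0, o_pct=2.0)
print(f"H/C = {hc:.4f}, O/C = {oc:.4f}")
```

A low H/C obtained this way indicates a highly aromatic material, which is how the ratio is used as a sorbent descriptor in this study.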
Deep Learning Neural Network Setup
The feed-forward neural network was trained using an automated Bayesian
regularization technique[60,61] in which the weights
and biases of the network are assumed to be random variables with
specified distributions. The regularization parameters are then connected
to the variances associated with these distributions and are estimated
using statistical techniques. The Bayesian regularization algorithm
generally works best when the inputs and outputs are approximately
scaled in the range between −1 and 1. Because KF values are orders of magnitude greater than this range
and vary widely, using log KF instead of KF as a target parameter
significantly improved the quality of the training. To further improve
the model, outliers were excluded from the training data sets. To
this end, data lines containing n and log KF values smaller than the 5th percentile or
larger than the 95th percentile were excluded from the training set.
Overfitting is a common problem during neural network training.
In an over-fitted neural network, although the error of the training
set is driven to a small value, the presentation of new data to the
same network can result in large errors. This is because the trained
network has memorized only the training examples and has not learned
to generalize to new situations. The number of parameters in this
study was reasonably smaller than the total number of data in the
training set such that the chance of overfitting was small. In addition,
network generalization was improved by training the neural network
on the same data set multiple (50) times. The same Bayesian regularization
back-propagation training technique was used in all multitraining
sessions. Each training session started with different initial weights
and biases as well as different divisions of data for training (70%),
validation (15%), and test (15%) sets. Because different conditions
led to different solutions, the final estimations were obtained by
averaging between the outputs from all 50 trained networks. As a result,
in the majority of cases, the mean squared error for the average output
was lower than that for the individual sessions. The employed multitraining
technique thus led to a better network generalization, which improved
the network forecasting capability. This was particularly helpful
for the small and noisy data set of compounds containing a positive
charge.
Because the computational costs of multiple training can be high,
we implemented a parallel computation scheme to greatly reduce the
training times. The computational time for a complete multitraining
session on an Intel Core i7-9700K CPU with 32 GB RAM was under a minute
using all eight CPU cores.
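The data preparation and multitraining steps described in this section can be sketched in a few lines. The example below trims values outside the 5th to 95th percentile band, draws a fresh 70/15/15 split for each of 50 members, and averages the member outputs. It is only an illustration under stated assumptions: a one-dimensional least-squares line stands in for the Bayesian-regularized network, the data are synthetic, and all function names are hypothetical.

```python
import random

def percentile(values, q):
    """Linear-interpolation percentile (q in 0..100) of a list of numbers."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

def trim_outliers(rows, key, low=5, high=95):
    """Keep rows whose `key` value lies within the [low, high] percentile band."""
    vals = [r[key] for r in rows]
    lo_v, hi_v = percentile(vals, low), percentile(vals, high)
    return [r for r in rows if lo_v <= r[key] <= hi_v]

def split_70_15_15(rows, rng):
    """Shuffle a copy of the rows and split it into training, validation and test parts."""
    rows = rows[:]
    rng.shuffle(rows)
    n_train = int(0.70 * len(rows))
    n_val = int(0.15 * len(rows))
    return rows[:n_train], rows[n_train:n_train + n_val], rows[n_train + n_val:]

def fit_line(rows):
    """Stand-in learner: least-squares line y = a*x + b (NOT the network used in the study)."""
    n = len(rows)
    mx = sum(r["x"] for r in rows) / n
    my = sum(r["y"] for r in rows) / n
    sxx = sum((r["x"] - mx) ** 2 for r in rows)
    sxy = sum((r["x"] - mx) * (r["y"] - my) for r in rows)
    a = sxy / sxx
    return a, my - a * mx

def train_ensemble(rows, n_members=50, seed=0):
    """Train each member on a different random split of the same data set."""
    rng = random.Random(seed)
    members = []
    for _ in range(n_members):
        train, _val, _test = split_70_15_15(rows, rng)
        members.append(fit_line(train))
    return members

def predict(members, x):
    """Final estimate: average of the outputs of all ensemble members."""
    return sum(a * x + b for a, b in members) / len(members)

# Synthetic data (y = 2x + 1 plus noise), purely for illustration.
noise = random.Random(42)
data = [{"x": i / 10, "y": 2 * i / 10 + 1 + noise.gauss(0, 0.1)} for i in range(200)]
kept = trim_outliers(data, "y")
ensemble = train_ensemble(kept)
print(len(kept), round(predict(ensemble, 5.0), 2))
```

In the study the trimming was applied to both n and log KF; the sketch trims a single target for brevity, and `trim_outliers` can simply be called once per target.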
Sensitivity Analysis
The variance-based
global sensitivity analysis (GSA) of Sobol (2001)[62] was used to determine the importance of individual input
parameters for the outcome of neural network predictions of log KF and n. A GSA, in contrast
to a local sensitivity analysis (LSA), considers variabilities in
the full range of values for all input parameters simultaneously.
It is thus superior to LSA, in which the focus is the variability
of a single parameter value at a time. As such, GSA offers a more
rigorous solution for elucidating the impact of each input parameter’s
variability considering that all other parameters are also variable.
We used a Latin hypercube sampler to generate 200,000 sample
scenarios that uniformly covered the space of the input parameters.
For each realization, input variables were perturbed at random within
the range of each parameter variability in the full training data
set. The fully trained model was solved 200,000 times for each randomly
generated realization, for which the abovementioned computational
setup took about 20 min. The spatial variability of each input parameter
was assumed to follow a normal distribution defined by the standard
deviation and mean value of that parameter alone; no correlation was
assumed between the spatial variabilities of different input parameters.
The first-order Sobol indices (Si) were
then calculated from the GSA, as described in Gharasoo et al. (2019)[63] and in Sobol and Levitan (1999).[64]
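To illustrate what a first-order Sobol index measures, the sketch below estimates Si for a toy model with independent, normally distributed inputs. It is a simplified stand-in rather than the procedure used in this study: plain random sampling replaces the Latin hypercube sampler, a pick-freeze (Saltelli-type) estimator is assumed, and `toy_model` and `first_order_sobol` are hypothetical names.

```python
import random

def first_order_sobol(model, means, sds, n=20000, seed=1):
    """Estimate first-order Sobol indices for independent normal inputs."""
    rng = random.Random(seed)
    k = len(means)

    def draw():
        return [rng.gauss(m, s) for m, s in zip(means, sds)]

    # Two independent sample matrices A and B.
    A = [draw() for _ in range(n)]
    B = [draw() for _ in range(n)]
    fA = [model(x) for x in A]
    fB = [model(x) for x in B]

    # Total output variance from all model evaluations.
    all_f = fA + fB
    mean = sum(all_f) / len(all_f)
    var = sum((f - mean) ** 2 for f in all_f) / (len(all_f) - 1)

    indices = []
    for i in range(k):
        acc = 0.0
        for a_row, b_row, fa, fb in zip(A, B, fA, fB):
            ab = a_row[:]        # row of A ...
            ab[i] = b_row[i]     # ... with input i taken from B
            acc += fb * (model(ab) - fa)
        indices.append(acc / n / var)
    return indices

def toy_model(x):
    # Additive toy model: the analytical indices are 0.8 and 0.2.
    return 2.0 * x[0] + x[1]

indices = first_order_sobol(toy_model, means=[0.0, 0.0], sds=[1.0, 1.0])
print([round(s, 2) for s in indices])
```

For the additive toy model the indices sum to approximately one. For a trained network, a sum clearly below one would point to interactions between input parameters that first-order indices do not capture.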
Graphical User Interface
To maximize their range of applicability, the models for the graphical user interface
were built on the complete data set (including the data lines previously
excluded for validation). The interface is conceptually similar to
previously developed graphical user interfaces.[65,66] The “CFreuPred” graphical user interface is capable
of importing and exporting data from/to Excel or Open Office to ease
data transfer to the widely used file format “.xls”.
Results and Discussion
Selection of Parameters
The classical presentation of the Freundlich equation is
$$q = K_F \cdot c_{aq}^{\,n}$$
where q [μg/kg] describes
sorbate loading onto the sorbent, caq [μg/L]
is the aqueous concentration of the sorbate, KF is the Freundlich constant, and n [-] is the Freundlich
exponent representing isotherm nonlinearity.
The units of KF change as a function of
the units used with q and caq. Several researchers have reported KF without units and/or “1/n”
instead of “n”, such that care must
be taken when comparing literature data. For this study, the Freundlich
parameters KF and n sourced
from the literature were transformed to the form shown here and defined
as the target parameters for prediction.
Four sorbent property
parameters commonly reported in the literature and previously linked
to sorption behavior were selected.[7,11,54,67,68] The sorbent content of carbon (C, %), hydrogen (H, %), and oxygen
(O, %) as well as the SSA (m2/g) were sourced from the
literature, and the molar ratios H/C and O/C were calculated. Among
the >200 screened publications, 47 reported all of the above parameters.[11,14−59] In addition, the experimental pH was
used as a fifth parameter. For a given material, C is a proxy
for homogeneity, SSA is a proxy for porosity and accessible sorption
sites, H/C is a proxy for aromaticity, O/C is a proxy for polarity,
and the experimental pH is linked to the material’s surface
charge (negative charge increasing with pH).
Eight sorbate properties were selected to describe the molecular
properties of ionizable and polar compounds: The five Abraham solute
parameters (E, S, A, B, and V) were obtained from
the freely accessible UFZ-LSER database.[69] The sixth Abraham parameter, describing hexadecane–air distribution
(L), was not used because a pH-independent hydrophobicity
parameter is conceptually not applicable to ionizable organic compounds,
which dissociate depending on the surrounding pH and whose hydrophobicity
thereby changes. Instead of L, the pH-dependent hydrophobicity
parameter log DOW was calculated at the
experimental pH, using the freely accessible ChemAxon online platform
(chemicalize.com). When P (octanol–water partition coefficient
for the neutral species) and Pi (octanol–water
partition coefficient for the ionized species) are known, DOW for acidic (anionic) compounds can be calculated as
$$D_{OW} = \frac{P + P_i \cdot 10^{(pH - pK_a)}}{1 + 10^{(pH - pK_a)}}$$
and DOW for basic (cationic) compounds can be calculated as
$$D_{OW} = \frac{P + P_i \cdot 10^{(pK_a - pH)}}{1 + 10^{(pK_a - pH)}}$$
where pKa is the acid dissociation constant of the compound (for bases, that of the conjugate acid).
In addition, we used the experimental pH and the dissociation constants
of the ionizable organic compounds to calculate the abundance of ionized
species present under a given condition using the Henderson–Hasselbalch
equation.
Several attempts to train the neural network for all types of compounds
combined did not yield meaningful results (data not shown),
most likely because compounds containing a positive charge behave
differently from polar and anionic compounds. For example, the hydrophobicity
of acidic and polar compounds is generally positively linked to sorption.
As the hydrophobicity of acidic compounds decreases with dissociation,
sorption decreases as well. This can be explained in part by the electrostatic
repulsion of the anions from the generally negatively charged surface
functional groups on the carbonaceous sorbents. In contrast, when
cationic ionizable organic compounds dissociate and their hydrophobicity
decreases, their positive charge can be electrostatically attractive
to the negatively charged functional groups on the sorbent surface,
thereby increasing sorption.
We therefore subdivided the data set into (i) negatively charged
and polar compounds and (ii) compounds containing a positive charge.
Zwitterions, which can have both charges, were grouped according to
their speciation, with 0.001% of the compound being positively charged
set as the threshold to place the compound in the second group. At
<0.001% of the compound being positively charged, the contribution
of the positive charge to overall sorption is most likely negligible.
The two databases are presented in searchable xls Tables S1 and S2
of the Supporting Information.
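The speciation-dependent sorbate descriptors from this section can be illustrated with a short calculation. The sketch below assumes monoprotic compounds, uses the Henderson–Hasselbalch relationship for the ionized fraction, and computes log DOW as the speciation-weighted combination of P and Pi. The function names and the example compound values are hypothetical. The 0.001% threshold for assigning a compound to the cation and zwitterion model is the one stated above.

```python
import math

def fraction_anionic(pH, pKa):
    """Ionized (anionic) fraction of a monoprotic acid, Henderson-Hasselbalch."""
    return 1.0 / (1.0 + 10 ** (pKa - pH))

def fraction_cationic(pH, pKa):
    """Ionized (cationic) fraction of a monoprotic base; pKa of the conjugate acid."""
    return 1.0 / (1.0 + 10 ** (pH - pKa))

def log_dow(log_p, log_p_ion, ionized_fraction):
    """log D_OW as the speciation-weighted mix of neutral and ionized partitioning."""
    d_ow = (1.0 - ionized_fraction) * 10 ** log_p + ionized_fraction * 10 ** log_p_ion
    return math.log10(d_ow)

def model_group(percent_cationic, threshold_pct=0.001):
    """Assign a compound to one of the two data sets based on its cationic abundance."""
    if percent_cationic >= threshold_pct:
        return "cations/zwitterions"
    return "anions/polar"

# Hypothetical acid: pKa 4.2, log P 1.9, log P_i -1.0, measured at pH 7.
f_minus = fraction_anionic(pH=7.0, pKa=4.2)
print(f"A- = {100 * f_minus:.2f} %, log D_OW = {log_dow(1.9, -1.0, f_minus):.2f}")
print(model_group(percent_cationic=0.0))
```

Because the ionized fraction enters both the abundance descriptor and log DOW, the same compound yields different input vectors at different experimental pH values.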
Predicting Sorption of Anions and Polar Compounds
The model was constructed on the basis of a feed-forward deep learning
neural network (also known as a multi-layered network of neurons)
with 20 hidden layers between the input and output layers. These hidden
layers process the complex nonlinear relationships between the 12
input parameters (the sorbent and sorbate descriptors introduced under Selection of Parameters) and the two
output parameters (log KF and n). The neural-network-based predictions of log KF and n reproduced
the data from the training set very accurately and covered
a wide range of input parameters, as shown in Table 1 and Figure 1.
Table 1. Training Range of the Individual Input Parameters for Predicting the Freundlich Parameters log KF and n of Negatively Charged and Polar Compounds (a)

|     | C [%] | H/C | O/C | SSA [m2/g] | pH | log DOW | A– [%] | E [cm3 mol–1/100] | S | A | B | V [cm3 mol–1/100] |
|-----|-------|-----|-----|------------|----|---------|--------|-------------------|---|---|---|-------------------|
| min | 10 | 0.001 | 0.0002 | 1 | 3.3 | –9.64 | 0.00 | 0.39 | 0.57 | 0.00 | 0.15 | 0.61 |
| max | 98 | 2.883 | 1.2002 | 1100 | 11.8 | 5.74 | 100.00 | 3.50 | 3.60 | 1.35 | 3.29 | 3.10 |

(a) Abbreviations: carbon content (C, %), molar ratios H/C and O/C, specific surface area (SSA, m2/g), abundance of negatively charged species (A–, %), E (excess molar refraction), S (dipolarity/polarizability), A (H-bond acidity), B (H-bond basicity), and V (molar volume).
Figure 1. Measured Freundlich parameters log KF and n (“target”) from the training set of polar and negatively charged compounds plotted against log KF and n, as predicted by the neural network model. (A) Shows the model for log KF (grey ▽) and the 95% confidence interval for the prediction (dashed red lines). (B) Shows the model for the exponent n (blue △) and the 95% confidence interval for the prediction (dashed red lines). (C) Shows the normalized error frequency associated with the predictions of KF and n (sample size = 313).
The 95% confidence interval for the prediction of log KF shows that predictions of KF are associated with errors below one order of magnitude.
This is in the same or a lower range than the errors of state-of-the-art prediction
models for single carbonaceous sorbents and neutral compounds,[4,5] which demonstrates the excellent performance of our model in predicting
the log KF for polar and anionic compounds
as a function of sorbent properties. Typically, for carbonaceous sorbents,
the concentration dependence of sorption (nonlinearity) increases
at high concentrations (i.e., n decreases). Thus,
the slightly larger errors associated with the prediction of n (Figure 1) can partially be explained by the strong dependence of the nonlinearity
of sorption on the concentration range of interest during the measurement
of a sorption isotherm. The values obtained from the literature were
calculated based on widely varying concentration ranges (ng/L range
to mg/L range for caq in the aqueous solution).
Given this variability, the performance of the model in predicting the exponent n can be considered to be very good.
To validate the predictions, we randomly excluded 15 data lines
(equal to 5% of the total dataset, see Table S3 in the Supporting Information) from the training data
set prior to neural network training. The prediction results for these
independent data are shown in Figure 2 and confirm the good model performance obtained for
the KF and n of negatively
charged and polar compounds.
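A predicted pair (log KF, n) can be turned back into sorbed loadings with the Freundlich equation, which also makes the meaning of the prediction errors concrete. The sketch below is illustrative only: the parameter values are hypothetical, the units follow those of the chosen KF, and the helper names are not part of the published interface.

```python
def freundlich_loading(log_kf, n, c_aq):
    """Sorbed loading q = K_F * c_aq**n (units follow those of K_F)."""
    return 10 ** log_kf * c_aq ** n

def distribution_coefficient(log_kf, n, c_aq):
    """Concentration-dependent distribution coefficient K_d = q / c_aq."""
    return freundlich_loading(log_kf, n, c_aq) / c_aq

# Hypothetical predicted parameters for one sorbent-sorbate pair.
log_kf, n = 5.0, 0.7

for c_aq in (1.0, 10.0, 100.0):  # aqueous concentration, e.g. in ug/L
    q = freundlich_loading(log_kf, n, c_aq)
    kd = distribution_coefficient(log_kf, n, c_aq)
    print(f"c_aq = {c_aq:6.1f}  q = {q:12.1f}  K_d = {kd:10.1f}")

# An error of one unit in log K_F shifts every predicted loading by a factor of ten.
factor = freundlich_loading(log_kf + 1.0, n, 10.0) / freundlich_loading(log_kf, n, 10.0)
print(f"factor for +1 in log K_F: {factor:.1f}")
```

With n below one, the distribution coefficient falls as the aqueous concentration rises, which is why the concentration range covered by an isotherm matters for the fitted exponent.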
Figure 2. Measured Freundlich fit parameters log KF (grey ▼) and n (blue ▲) from the independent data set for negatively charged and polar compounds plotted against parameters predicted by the neural network model (sample size = 15).
The variance-based GSA of Sobol
(2001)[62] was used to determine the importance
of individual input parameters
for the outcome of the neural network predictions of log KF and n. The first-order Sobol indices
for the global sensitivity of log KF and n to the 12 input parameters are displayed in Figure 3. The SSA, the sorbent aromaticity,
and polarity as approximated by H/C and O/C were the most important
sorbent parameters for the prediction of log KF and n. The most important compound properties
to predict log KF were the degree of dissociation
(A– %), the pH-dependent hydrophobicity
parameter log DOW, and the Abraham parameters
for polarizability (S), H-bond basicity (B), and molar volume (V). The sensitivity
of the predictions of log KF to sorbent
and sorbate properties was similar, whereas the prediction of n was largely driven (>80%) by the properties of the
sorbent.
Thus, sorption was driven by interactions with specific sorption sites,
which were consumed with increasing sorbent loading. Furthermore,
the importance of SSA, H/C, and S indicated that
π–π electron donor–acceptor interactions
are a driving mechanism of sorption for negatively charged and polar
compounds, as also reported in the literature.[7,9]
Figure 3. GSA first-order indices (Si) for the prediction of the Freundlich parameters for negatively charged and polar compounds. Abbreviations: carbon content (C, %), molar ratios H/C and O/C, SSA (m2/g), abundance of ionized negatively charged species (A–, %), E (excess molar refraction), S (dipolarity/polarizability), A (H-bond acidity), B (H-bond basicity), and V (molar volume).
Predicting Sorption of Cations and Zwitterions
The same deep learning approach used for negatively charged and
polar compounds was applied to compounds containing a positive charge.
To this end, the input layer was extended by an additional parameter
that accounted for the abundance (%) of positively charged species,
resulting in a total of 13 input parameters. The neural-network-based
predictions of log KF and n were again very accurate over a wide working range
of input parameters, as shown in Table 2 and Figure 4. Because of the smaller size of the training data set, the
model’s working range was smaller than for anions and polar
compounds (see Tables 1 and 2). Similar to the predictions for anions
and polar compounds, n was associated with higher
prediction errors, likewise explained by the high concentration dependence
of sorption nonlinearity (see the preceding section on anions and polar compounds).
Table 2. Training Range of the Individual Input Parameters for Predicting the Freundlich Parameters log KF and n of Cations and Zwitterions (a)

|     | C [%] | H/C | O/C | SSA [m2/g] | pH | log DOW | A– [%] | B+ [%] | E [cm3 mol–1/100] | S | A | B | V [cm3 mol–1/100] |
|-----|-------|-----|-----|------------|----|---------|--------|--------|-------------------|---|---|---|-------------------|
| min | 20 | 0.001 | 0.0002 | 1 | 3.0 | –8.55 | 0 | 0.001 | 0.63 | 0.84 | 0.00 | 0.25 | 0.68 |
| max | 99 | 1.808 | 0.8882 | 2000 | 10.0 | 9.50 | 100 | 100 | 3.50 | 4.07 | 1.65 | 6.52 | 7.04 |

(a) Abbreviations: carbon content (C, %), molar ratios H/C and O/C, SSA (m2/g), abundance of negatively charged species (A–, %), abundance of positively charged species (B+, %), E (excess molar refraction), S (dipolarity/polarizability), A (H-bond acidity), B (H-bond basicity), and V (molar volume).
Figure 4. Measured Freundlich fit parameters log KF and n (“target”) from the training set of cations and zwitterions plotted against log KF and n predicted by the neural network model. (A) Shows the model for log KF (▽) and the 95% confidence interval for the prediction (dashed red lines). (B) Shows the model for the exponent n (blue △) and the 95% confidence interval (dashed red lines). (C) Shows the normalized errors associated with the predictions of KF and n (sample size = 133).
To validate the predictions, we
again randomly excluded 5% of the data lines (6 lines, see Table S4
in the Supporting Information) from the
data set prior to neural network training. The results for these independent
data confirmed the good predictions for both KF and n for compounds containing a positive
charge (Figure 5).
However, the training data set was much smaller for this model than
for the model for anions and polar compounds, and a larger body of literature data will likely further
increase the accuracy and applicability range of the model.
Figure 5. Measured Freundlich fit parameters log KF (grey ▼) and n (blue ▲) from the independent data set for cations and zwitterions plotted against the parameters predicted by the neural network model (sample size = 6).
The calculation of the variance-based GSA was similar to that for
anions and polar compounds. The importance of single input parameters
for the prediction of log KF and n was more evenly distributed (Figure 6), indicating that no single sorption process
describable by these parameters was responsible for driving the sorption
of compounds containing a positive charge. This is in good agreement
with the literature, in which prediction of the sorption of compounds
with a positive charge is often viewed as more challenging than is
the case for negatively charged compounds.[7,8] The
dipolarity/polarizability S was the only sorbate
parameter with little to no significance for the sorption of compounds
with a positive charge. In contrast, S was a sorbate
property of high importance for predicting the sorption of polar and
negatively charged compounds. This indicates that π electron
donor–acceptor interactions are generally not the drivers of
the sorption of positively charged compounds. Instead, the abundance of ionized positively
charged species was an important sorbate parameter for prediction,
indicating that electrostatic attraction contributed substantially
to sorption. Compounds containing a positive charge therefore exhibit
a very distinct sorption behavior that is in stark contrast to the
behavior of other organic compounds. Untangling of this distinct behavior
to further improve predictive models and enable the production of
sorbents tailored for cations and zwitterions is an important challenge
for future research. In these studies, additional sorbent parameters
such as cation exchange capacity should be considered during sorbent
characterization because most of the published studies on the sorption
of ionizable organic compounds have not reported cation- or anion-exchange
capacities.
Figure 6. GSA first-order indices (Si) for the prediction of the Freundlich parameters for cations and zwitterions. Abbreviations: carbon content (C, %), molar ratios H/C and O/C, SSA (m2/g), amount of ionized negatively charged species (A–, %), amount of ionized positively charged species (B+, %), E (excess molar refraction), S (dipolarity/polarizability), A (H-bond acidity), B (H-bond basicity), and V (molar volume).
Potential Model Applications and Environmental Implications
Prerequisites for the design of efficient water
purification systems or remediation strategies are easily accessible
tools able to predict the sorption of emerging contaminants, which
are often ionizable and polar compounds. To address this need, we
made use of the available literature to develop two neural network-based
models. Both performed excellently in predicting the sorption of organic
anions, cations, and zwitterions as well as polar compounds to a wide
range of carbonaceous materials. The first model was tailored to predict
the sorption of polar and negatively charged contaminants and the
second model that of compounds containing a positive charge, including
zwitterions. To account for the concentration dependence of organic
contaminant sorption to carbonaceous sorbent materials, both models
can predict the Freundlich coefficient KF and the exponent n that accounts for the concentration
dependency of sorption. The provided models are able to cover a very
wide range of sorption scenarios and will thus be useful for scientists
and practitioners in the fields of water purification and remediation.
To increase the accessibility of the models to those who are not familiar
with computational environments, we provide a graphical user interface
as Supporting Information. To predict compounds
and sorbent combinations with properties outside the range of the
current version, the model can be trained with additional data, which
will further improve its generalization and forecasting capabilities.