The effect of molecular cluster formation on the estimation of kinetic parameters for primary nucleation and growth in different systems has been studied using computationally generated data and three sets of experimental data in the literature. It is shown that the formation of molecular clusters decreases the concentration of monomers and hence the thermodynamic driving force for crystallization, which consequently affects the crystallization kinetics. For a system exhibiting a strong tendency to form molecular clusters, accounting for cluster formation in a kinetic model is critical to interpret kinetic data accurately, for instance, to estimate the specific surface energy γ from a set of primary nucleation rates. On the contrary, for a system with negligible cluster formation, a consideration of cluster formation does not affect parameter estimation outcomes. Moreover, it is demonstrated that using a growth kinetic model that accounts for cluster formation allows the estimation of γ from typical growth kinetic data (i.e., de-supersaturation profiles of seeded batch crystallization), which is a novel method of estimating γ developed in this work. The applicability of the novel method to different systems is proven by showing that the estimated values of γ are closely comparable to the actual values used for generating the kinetic data or the corresponding estimates reported in the literature.
The effect of molecular cluster formation on the estimation of kinetic parameters for primary nucleation and growth in different systems has been studied using computationally generated data and three sets of experimental data in the literature. It is shown that the formation of molecular clusters decreases the concentration of monomers and hence the thermodynamic driving force for crystallization, which consequently affects the crystallization kinetics. For a system exhibiting a strong tendency to form molecular clusters, accounting for cluster formation in a kinetic model is critical to interpret kinetic data accurately, for instance, to estimate the specific surface energy γ from a set of primary nucleation rates. On the contrary, for a system with negligible cluster formation, a consideration of cluster formation does not affect parameter estimation outcomes. Moreover, it is demonstrated that using a growth kinetic model that accounts for cluster formation allows the estimation of γ from typical growth kinetic data (i.e., de-supersaturation profiles of seeded batch crystallization), which is a novel method of estimating γ developed in this work. The applicability of the novel method to different systems is proven by showing that the estimated values of γ are closely comparable to the actual values used for generating the kinetic data or the corresponding estimates reported in the literature.
Nucleation
and growth are fundamental crystallization phenomena
in which solute molecules arrange into a crystal lattice. These two
phenomena are typically driven by the volume and surface diffusion
of solute molecules and their integration into the crystal lattice,[1,2] although in some nonclassical cases the crystal growth can be driven
by the incorporation of dimers.[3]Solute molecules in a solution comprise a mixture of monomers and
molecular clusters, whose existence has been supported experimentally[4−7] by applying various methods and rationalized theoretically by using
the classical nucleation theory (CNT)[1] and
the two-step nucleation theory.[8] Accordingly,
two types of solute concentrations can be used in defining the driving
force for crystallization, namely, the bulk solute concentration, c, accounting for all solute molecules in the mixture, i.e.,
present both as monomers and in molecular clusters, and the monomer
concentration, Z1, considering only monomers.
The bulk supersaturation S is defined aswhere ce is the
bulk solubility (i.e., the bulk solute concentration at equilibrium).
This supersaturation is often used, partly because it is easier to
characterize experimentally.[9] The monomer-based
supersaturation s is defined as[10]where C1,e is
the monomer solubility (i.e., the monomer concentration at equilibrium).These two types of supersaturations are equivalent under the assumption
that all solute molecules in the liquid phase exist only as monomers,
which overlooks the fact that part of the molecules can form molecular
clusters. Consequently, to describe crystallization kinetics in the
presence of the molecular clusters, s should be used
instead of S. Note that crystallization kinetics
described through these two supersaturations of reference can be closely
comparable to each other; a similar discussion has also been presented
elsewhere.[9,11]Despite the convenience of using the
bulk supersaturation S, it is beneficial to use the
monomer supersaturation s when inferring actual properties
of a system, such as
the specific surface energy, γ, from experimental data of primary
nucleation rates, provided that the crystallization kinetics of the
system is limited by the transport of monomers rather than that of
oligomers. Given that γ is a key quantity in the study of several
phenomena,[12−19] an accurate estimation of γ is essential. Moreover, by employing s, more information can be extracted from the same experimental
data. For instance, by utilizing a growth kinetic model formulated
on s, γ can be estimated from growth kinetic
data; this is a novel method of determining γ developed in this
work. Note that a more profound understanding of primary nucleation
and growth kinetics can offer a deeper insight into other crystallization
phenomena, e.g., secondary nucleation, which is the dominant nucleation
mechanism in industrial crystallization, particularly when operated
continuously.[20]This work has two
objectives. The first is to demonstrate, for
the first time, the effect of cluster formation on the estimation
of crystallization kinetic parameters in different systems, with a
focus on primary nucleation and growth. The second is to present a
novel method of estimating γ exploiting the information embedded
in growth kinetic data.To achieve these objectives, it is crucial
to acquire a set of
comprehensive and unbiased data describing the kinetics of primary
nucleation and growth. However, acquiring all data experimentally
can be highly challenging. The nucleation rate can be easily under-
or overestimated because of other concomitant mechanisms (e.g., aggregation)
and/or technical challenges (e.g., an inadequate mixing of solutions
and inherent limitations in available particle characterization devices).[13] A recent experimental investigation has also
revealed that nucleation rates characterized by different methods
can differ by several orders of magnitude.[15] Likewise, several issues can negatively affect the experimental
characterization of growth kinetics. Seeded batch de-supersaturation
experiments, which is a standard method of characterizing growth kinetics
for process design,[9,21−23] can be accompanied
by secondary nucleation.[9,24] In addition, some intrinsic
properties of a compound (e.g., crystal morphology) can make measured
data nonrepresentative. For instance, the particle size data of needle-like
crystals measured by some 1D sizing techniques (e.g., laser diffraction
or focused beam reflectance measurement) poorly represent the actual
particle size distribution (PSD).[25]To overcome these difficulties and to address the two objectives,
measurement data are computationally generated using a first-principle
mathematical model. This in silico approach prevents
any potential systematic bias in data from confounding the present
work; in silico approaches have been widely adopted in the literature,
for instance, to study particle characterization techniques[25−27] and growth kinetic estimation.[28] In this
contribution, data are generated using a kinetic rate equation (KRE)
model. The generated data are first perturbed by Gaussian white noise
to reflect measurement errors and then used to estimate the parameters
of kinetic models for primary nucleation or for growth that are based
on either s or S. Note that these
kinetic models used for interpreting the data are completely independent
of the KRE model used to generate the data. Estimated parameters and
corresponding confidence intervals are compared with a set of the
actual parameters characterizing a system described by the KRE model,
which reveals the conditions in which applying an s-based model (instead of a S-based one) is advantageous.
Moreover, to prove the applicability of this analysis to real experimental
data, three sets of experimental data in the literature are analyzed
in the same manner.This article is structured as follows. In section , the model and
the methods for generating
kinetic data are presented, and the three sets of reference experimental
data are explained. In section , the methods for primary nucleation and growth kinetic modeling
and parameter estimation are discussed. Finally, in section , the parameter estimation
results and the novel method of estimating γ are presented and
discussed.
Acquisition of Kinetic Measurement Data
The KRE model (see section ) was used to computationally generate kinetic data
for primary nucleation and growth individually (see section ). Besides the generated
data, three sets of experimental data in the literature (see section ) were used
to estimate the parameters of the kinetic models for primary nucleation
and growth.
Kinetic Rate Equation Model
The conventional
KRE model of nucleation (i.e., the Szilard model[29]) simulates primary nucleation, growth, and Ostwald ripening
in a self-consistent manner in that all phenomena are explained by
a unifying principle; all of them are described as a result of a series
of molecule attachments to and detachments from molecular clusters.[1,30] This self-consistency is absent in the population balance equation
(PBE) model where each phenomenon is explained by a distinctive kinetic/thermodynamic
description. It is important to note that the description of the Szilard
model is consistent with the CNT.[1] Although
other nucleation mechanisms (e.g., the two-step nucleation theory)
could be considered by adopting a corresponding kinetic description,[31] doing this is nevertheless beyond the scope
of this work.The KRE modeling framework has been successfully
applied to describe several systems, e.g., the polymorphic system
of silica,[32] that of l-glutamic
acid,[10] and a system for the synthesis
of semiconductor nanocrystals.[33] Moreover,
the model has been extended to describe secondary nucleation caused
by interparticle energies between seed crystals and molecular clusters,[34] with its description of underlying interparticle
interactions built on a thermodynamic analysis.[35] In this work, the Szilard model was used for generating
kinetic data. The model equations are[1]where Z(t) is the
concentration of molecular clusters
consisting of n solute molecules (i.e., the n-sized clusters), t is the time, ka and kd are the rate
constants of molecule attachment to and detachment from the n-sized clusters, respectively, and Z(t) ≡
0, with nmax being the upper bound of
the cluster size; this bound should be large enough to avoid any effect
on numerical solutions.[33] The n-sized clusters represent crystals when the size n is larger than the critical nucleus size n* defined
in the CNT,[1] whereas they account for part
of the liquid phase when n is smaller than or equal
to n*, thus determining the bulk solute concentration c. Equation is the mass balance for the solute molecules. The initial condition
for eqs 3 iswhere Z is the initial concentration of the n-sized
clusters. The expression for ka is determined
by the rate-limiting mechanism of the monomer transport to the n-sized clusters,[1] e.g., the
volume-diffusion limited or surface-integration limited mechanism.
In this work, the surface-integration limited mechanism is considered
as an example since this mechanism is frequently encountered in practice.[9,10,25,36] A corresponding expression for ka is[1]where
β is the sticking coefficient, ks is the surface shape factor, d1 = (V1/kv)1/3 is the molecular diameter defined with
the volume shape factor kv and the molecular
volume V1, and D is the
diffusion coefficient of the solute molecules in the solution. According
to the assumption on which eq is based, the clusters grow through the mechanism of rough
growth.[1,37−39] The expression for kd is[1,34]where ΔG (Δμ, γ) is the
Gibbs free energy for the formation of a n-sized
cluster at a given driving force for crystallization, Δμ, kB is the Boltzmann constant, and T is the absolute temperature of the system. With the initial
condition (eq ) and
the rate constants (eqs and 6), the system of the governing equations
(eqs 3) can be solved to describe the temporal evolution of Z. To do that, ΔG should be defined explicitly. Under
the capillary approximation,[1] the Gibbs
free energy ΔG (in eq ) is defined
aswhere b = ks(V1/kv)2/3 is the surface area of a solute molecule.
The driving force Δμ is the difference
in the chemical potential between the solute molecule in the liquid
phase and that in the solid phase, and it is given by[10]where s can be evaluated
using eq once the monomer
solubility C1,e (in eq ) is known.The monomer solubility C1,e can be
determined using the mass balance of the solute in the liquid phase,
and the constraint that the bulk solute concentration c is equal to the bulk solubility ce at
phase equilibrium (i.e., Δμ = 0). The
mass balance in the liquid phase can be written aswhere n* = (2γb/3Δμ)3 is the critical nucleus
size in the CNT[1] and the concentration Z is approximated by the equilibrium
concentration of the n-sized clusters, C, considering that a population of subcritical
and critical clusters reaches equilibrium with a characteristic time
shorter than those of nucleation and growth.[30] The equilibrium cluster concentration C is defined as[1]where C0 is the
concentration of nucleation sites in the system on which the clusters
of the solid phase can form.[1] The value
of C0 can be determined by using eqs , 10, and 11, and the aforementioned constraint
(i.e., c = ce when Δμ = 0): C0 = ce/∑∞m exp(−Ωm2/3) where Ω
is the dimensionless surface energy defined asOne can determine C1,e using eq since C1,e is by definition equal
to C1 at Δμ = 0.The relationship between S and s can be derived in two steps: first, substituting eqs and 8 into eq yieldsThen, accounting for eqs , 12, and 13 in eq leads
to the following relationship between s and S:where Ω is the only parameter.The KRE model (eqs , 3–11) was
solved by applying numerical methods widely used in the literature;[30,32−34] for small n (e.g., n < 50), a system of ordinary differential equations (ODEs) (eqs
3) can be integrated in time using an ODE solver (e.g., the MATLAB’s
ode15s), whereas, for large n (e.g., n ≥ 50), the system of the ODEs can be adequately approximated
by the Fokker–Planck partial differential equation, which can
be solved efficiently using the weighted finite difference scheme
proposed by Chang and Cooper.[40]As
shown in section S1 in the Supporting Information, nondimensionalization of the KRE model reveals that the general
behavior of the model depends only on the initial condition and on
the system-dependent property Ω, thus suggesting that, for a
given initial condition, the behavior of different systems can be
studied by varying Ω. To this aim, two values of Ω = {3,
10} are used in this work. Concerning the other properties, those
of the paracetamol–ethanol system at 20 °C (summarized
in Table ) are used
as a reference.
Table 1
Physicochemical Properties of the
Paracetamol–Ethanol System at 20 °C Used in the KRE Model
properties
value
description
sticking coefficient, β
3.2 × 10–6
assumeda
surface-area
shape factor, ks
π
The spherical shape is assumed.
volume
shape factor, kv
π/6
The spherical shape is assumed.
molecular weight of solute, Mw
0.151 kg mol–1
solid density,
ρs
1263 kg m–3
molar solubility, cM
0.826 mol L–1
from ref (41)
dynamic viscosity of solvent, η
1.184 × 10–3 Pa s
molecular
volume of solute, V1
1.99 × 10–28 m3
≡ Mw/ρsNAb
bulk solubility, ce
4.97 × 1026 m–3
≡ 103cMNAb
molecular diameter, d1
7.24 × 10–10 m
≡ (V1/kv)1/3
diffusion coefficient, D
5.01 × 10–10 m2 s–1
≡ kBT/3πηd1c
This scales
a nucleation rate and
a growth rate such that their values are experimentally more relevant.
NA is
the Avogadro number.
The
Stokes–Einstein equation
for diffusion in fluid.
This scales
a nucleation rate and
a growth rate such that their values are experimentally more relevant.NA is
the Avogadro number.The
Stokes–Einstein equation
for diffusion in fluid.
Data Generation Methods
Primary nucleation
kinetics are often determined based on the data of nucleation rates J characterized at different values of S.[12,14−17,42] To generate this type of data in silico, unseeded batch crystallization
was simulated at different initial supersaturations using the KRE
model, where the initial concentrations of subcritical clusters were
set to zero; hence, only monomers were present at the start, which
corresponds to a typical initial condition in the literature.[30,33] For each simulation, a nucleation rate, J, was
calculated using the following definition:[1]Here, ts is the
time for the occurrence of stationary nucleation, when the monomer
supersaturation s, the nucleation rate J, and the concentrations of subcritical clusters Z are practically
time-independent;[1] after the phase of stationary
nucleation, all the three quantities decline due to an appreciable
consumption of monomers by a large population of growing clusters.[1] The bulk supersaturation S corresponding
to J was calculated using eqs and 10.Growth
kinetics is typically determined using the data of S profiles obtained from seeded batch de-supersaturation experiments
with varying initial supersaturations.[9,21−23] To generate this type of data, seeded batch crystallization was
simulated with different initial supersaturations using the KRE model,
where the initial crystal size distribution follows a Gaussian distribution
with a mean size of 50 μm and a standard deviation of 5 μm.
The time evolution of S was calculated using eqs and 10.
Reference Experimental Data
To show
the general validity of this analysis, three sets of experimental
data in the literature were analyzed in the same manner as the in
silico data (see their main features in Table ). The first set (E1) concerns homogeneous
nucleation of benzoic acid from an aqueous sodium chloride solution
at 30 °C,[14] the second set (E2) corresponds
to homogeneous nucleation of α l-glutamic acid from
an aqueous solution at 25 °C,[17] and
the third set (E3) provides the growth kinetics of the same system
at 25 °C.[21]
Table 2
List of
Reference Experiments
data
phenomenon
system
T (K)
ref
E1
homogeneous nucleation
benzoic acid/sodium chloride/water
303.15
ref (14)
E2
homogeneous nucleation
α-l-glutamic acid/water
298.15
ref (17)
E3
growth
α-l-glutamic acid/water
298.15
ref (21)
Kinetic
Parameter Estimation for Primary Nucleation
and Growth
Kinetic models for primary nucleation and growth
(see sections and 3.2, respectively) were used to estimate
model parameters
in accordance with the procedure explained in section . This allowed for the interpretation of
experimental data with and without consideration of cluster formation
and its impact on crystallization kinetics. Note that the cluster
formation can be considered in a kinetic model by using the monomer
supersaturation s instead of the bulk supersaturation S as the supersaturation value of reference.
Primary Nucleation Kinetics
The primary
nucleation rate J is typically described by using
the following expression:[1]where x ∈ {s, S} is a generic supersaturation and
θPN (i = 1, 2) is the parameter to be estimated
from a given set of primary nucleation rate data. Using the one-to-one
correspondence between θ2PN and Ω according to the CNT (i.e., θ2PN = 4Ω3/27),[1] corresponding values of
Ω, s (eq ), and γ (eq ) can be determined from the estimated value of θ2PN.Equation adequately approximates J obtained from the KRE model using eq , when x = s, and θ1PN and θ2PN are set as follows (see the relevant derivation in section S2 in
the Supporting Information):These parameters correspond to the actual
nucleation kinetic parameters of a system described by the KRE model.
Growth Kinetics
Population Balance Model
The PBE model for a well-mixed
batch reactor can be written asHere, L is the characteristic
length of the crystal, G is the crystal growth rate
assumed to be size-independent, and f(t, L) is the PSD, i.e., f(t, L) dL being the number
of crystals with length L ∈ [L; L + dL] per unit suspension volume.[39]Equation is the initial condition, with f0(L) being the initial PSD, while eq is the boundary condition establishing
that no nucleation occurs in the system. Note that neither agglomeration
nor breakage is considered in the model. The PBE is coupled with the
solute material balancewhere c(t) is the
solute concentration, m = ∫0∞Lf(t, L) dL (i = 0, 1, 2, ...) is the ith
moment of f, c0 is the
initial solute concentration, and v1 is
the molecular volume.The PBE was solved numerically by applying
a fully discrete, high-resolution finite volume method with the van
Leer Flux limiter,[43] while satisfying the
convergence condition by Courant–Friedrichs–Lewy.[44] At the upper bound of the size domain, a numerical
outflow boundary condition was imposed by using a zero-order extrapolation
method.[43] For the discretization of f along L, a regular grid with a sufficiently
small size was used to obtain a numerical solution independent of
the grid size; in this work, the corresponding size ranged between
0.025 and 1 μm depending on the data set used for fitting and
a chosen growth model.
Growth Rate Models
Two growth rate
models are considered
in this work, namely, the rough growth (RG) model and the birth and
spread (BS) model. The RG model was used for analyzing the data obtained
from the KRE model whose description follows the rough growth mechanism,
and a corresponding rate expression is[1,37,39]where θRG is the parameter
to be estimated. When x = s and
a specific value of θRG is assigned, the numerical
solutions of the PBE model (eqs –24) approximate those
of the KRE model that simulates seeded batch crystallization where
nucleation is negligible, and the corresponding value of θRG is as follows (see the relevant derivation in section S3
in the Supporting Information):This equation gives the actual growth
kinetic
parameter of a system described by the KRE model.The birth
and spread growth model[2] was used when
interpreting the experimental data E3 (see Table ), considering that the growth kinetics of
the corresponding system has been adequately described by the BS model
in the literature.[21,45] The corresponding growth rate
expression is[2]where θBS (i = 1, 2) is the parameter to be estimated.It is worth noting
that using a s-based growth
rate model allows the simultaneous estimation of the growth kinetic
parameters and of the specific surface energy γ, which is a
novel method of determining γ. The determination of γ
is possible because, at a given solute concentration c, the growth rate G given by a s-based model depends not only on the kinetic parameters but also
on γ, which affects the monomer supersaturation s (see eqs , 12, and 14). An application
of this method to the generated data and to the experimental data
E3 is presented in section .
Fitting Procedure
As detailed in
section S4 in the Supporting Information, the parameter(s) of a kinetic model were estimated from a set of
data points using the method of least-squares, and corresponding confidence
intervals were estimated by applying a linear approximation of the
model. Prior to parameter estimation, suitable model transformation
and parameter transformation were applied, as explained in section
S5 in the Supporting Information.For nucleation kinetics, the logarithm of the nucleation rate (ln J) was chosen as a response variable in parameter estimation
because it is the most common choice in the literature[11,16,17,42] and also because it linearizes the S-based model
(eq with x = S) such that a simple linear regression
can be applied. Moreover, the variance of ln J in
a set of experimental data is approximately constant,[16,17] suggesting that the logarithmic transformation is helpful to satisfy
the constant-variance assumption in parameter estimation.[46] To reflect noise in experimental data, white
Gaussian noise was added to the generated data of ln J before using them for fitting.For growth kinetics, the time
evolution of bulk supersaturation S(t) was directly used as a response variable
for parameter fitting, as it is common in the literature.[21−23] To give equal weight to each de-supersaturation profile regardless
of its time duration, the profiles with a longer time duration were
down-sampled such that the number of the data points of each profile
used in parameter estimation is equal, which is a data preprocessing
technique used also elsewhere.[47] To consider
measurement noise in experimental data, white Gaussian noise was added
to the generated S profiles before using them in
parameter estimation, as done elsewhere.[25,28]
Results and Discussion
Comparison
of Bulk and Monomer Supersaturation
Figure illustrates
the relationship, given by eq , between the bulk supersaturation S and
the monomer supersaturation s at different values
of the dimensionless surface energy Ω. The monomer supersaturation s is always smaller than or equal to the bulk supersaturation S, and their difference increases with an increase in S (or s) and a decrease in Ω. This
is because both increasing s and decreasing Ω
lower the Gibbs free energy for cluster formation (see eqs , 8, and 12), which facilitates cluster formation while consuming
the monomer concentration Z1 or, equivalently,
the monomer supersaturation s. Accordingly, a system
featuring a strong tendency to form molecular clusters exhibits a
considerable deviation between s and S (e.g., see the region (a) in Figure ). This deviation can cause different outcomes in kinetic
parameter estimations for s-based and S-based models, whereas a small deviation (e.g., the region (b) in Figure ) would yield similar
results, as demonstrated in sections and 4.3 for primary
nucleation kinetics and growth kinetics, respectively.
Figure 1
Relationship between
the bulk supersaturation S and the monomer supersaturation s at different
values of the dimensionless surface energy Ω = {3, 4.5, 6, 10}.
The region (a) represents a part of the Ω = 3 system with a
distinctive deviation between S and s, whereas the region (b) represents a part of the Ω = 10 system
with a negligible difference between S and s. Gray area represents an unattainable region. Note that,
for illustration purposes, the critical size n* is
treated as a continuously variable in this plot, thus yielding smooth
curves for wide ranges of Ω, S, and s.
Relationship between
the bulk supersaturation S and the monomer supersaturation s at different
values of the dimensionless surface energy Ω = {3, 4.5, 6, 10}.
The region (a) represents a part of the Ω = 3 system with a
distinctive deviation between S and s, whereas the region (b) represents a part of the Ω = 10 system
with a negligible difference between S and s. Gray area represents an unattainable region. Note that,
for illustration purposes, the critical size n* is
treated as a continuously variable in this plot, thus yielding smooth
curves for wide ranges of Ω, S, and s.
Primary
Nucleation Kinetics
The effect
of cluster formation on the estimation of primary nucleation kinetic
parameters for the s-based and S-based models (i.e., eq with x = s and x = S, respectively) is demonstrated in
the following. Parameter estimation results based on a set of generated
data corresponding to either region (a) or (b) in Figure , each representing a Ω
= 3 or Ω = 10 system, respectively, are presented, followed
by a discussion on the parameter estimation outcomes obtained from
the experimental data E1 and E2.
Generated Data
Parameter estimation
outcomes based
on the Ω = 3 and Ω = 10 data are presented in Figure , panels a and b,
respectively, where the nucleation rate data (S, J), the corresponding (s, J) data, the outputs of fitted models, and the corresponding 95% confidence
intervals are reported.
Figure 2
Nucleation rate data generated with (a) Ω
= 3 and (b) Ω
= 10 used in parameter estimation (circle markers) for the s-based and S-based models, the corresponding
(s, J) data (diamond markers) where s is calculated using the fitted θ2PN and eqs and 18, the outputs
of the fitted models (lines), and the corresponding 95% confidence
intervals (shades).
Nucleation rate data generated with (a) Ω
= 3 and (b) Ω
= 10 used in parameter estimation (circle markers) for the s-based and S-based models, the corresponding
(s, J) data (diamond markers) where s is calculated using the fitted θ2PN and eqs and 18, the outputs
of the fitted models (lines), and the corresponding 95% confidence
intervals (shades).In the case of Ω
= 3, the outcomes of the s-based and S-based models are distinctly different,
thus clearly indicating that the interpretation of identical data
can be strikingly different depending on the choice of the supersaturation
of reference. Table summarizes the actual value of primary nucleation parameters θPN,KRE used to generate the Ω = 3
and Ω = 10 data, the fitted parameters for the s-based and the S-based model, denoted by θ̂PN (x = s) and θ̂PN (x = S), respectively, and the corresponding 95% confidence intervals.
For instance, the estimated value of θ1PN for the s-based model
is 1 × 1030 with a 95% confidence interval ranging
from 8 × 1026 to 2 × 1033. It can
be seen that, in the case of Ω = 3, the actual parameter values
can be satisfactorily estimated only when the s-based
model is employed. On the contrary, using the S-based
model underestimates θ1PN and overestimates θ2PN, which can be explained by considering
the functional form of eq : since S is noticeably larger than s in the case of Ω = 3 (see Figure ), a lower value of θ1PN and a higher value of θ2PN are necessary
to obtain the same nucleation rate value. The overestimated θ2PN yields an overestimation
of γ (see eqs and 18), which highlights that using the s-based model is necessary to estimate γ accurately
for a system where cluster formation is significant.
Table 3
Actual Parameters Used to Generate
Two Data Sets (θPN,KRE) and Corresponding
Parameter Estimation Results (θ̂PN)a
data
parameters
θ1PN(m–3 s–1)
θ2PN(−)
γ × 103(J m–2)
Ω = 3
θPN,KRE
2 × 1030
4.0
7.4
θ̂PN (x = s)
1 × 1030 [8 × 1026, 2 × 1033]
3.8 [1.6, 5.9]
7.2 [5.5, 8.4]
θ̂PN (x = S)
4 × 1027 [2 × 1024, 1 × 1031]
10.1 [7.9, 12.2]
10.0 [9.3, 10.7]
Ω = 10
θPN,KRE
1 × 1034
148
24.6
θ̂PN (x = s)
4 × 1033 [4 × 1030, 5 × 1038]
145 [111, 179]
24.4 [22.3, 26.2]
θ̂PN (x = S)
2 × 1033 [1 × 1028, 2 × 1038]
147 [113, 180]
24.5 [22.5, 26.2]
[ωmin, ωmax]: a 95% confidence interval of an estimate
is from ωmin to ωmax.
[ωmin, ωmax]: a 95% confidence interval of an estimate
is from ωmin to ωmax.Parameter estimation results obtained
from the Ω = 10 data
contrast sharply with those from the Ω = 3 data. In Figure b, the outputs of
the two models fitted to the Ω = 10 data are closely comparable
to each other, to the point that their 95% confidence intervals overlap.
Moreover, as presented in Table , the fitted parameters for both models are extremely
similar to the actual values, thus indicating that the two models
are virtually equivalent in parameter estimation for a system where
cluster formation is negligible.
Experimental Data
Parameter estimation outcomes for
the data E1 and E2 are illustrated in Figure , panels a and b, respectively. In the case
of E1 (corresponding to the benzoic acid/sodium chloride/water system
as listed in Table ), the behaviors of the two models are noticeably different, as previously
observed for the Ω = 3 case. As summarized in Table , using the S-based model in parameter estimation yields a lower value of θ1PN and a higher
value of θ2PN than those obtained using the s-based model, which
is the same trend identified in the Ω = 3 case. According to
the noticeable difference between two parameter estimation results
based on the two models, it can be inferred that the system of E1
exhibits a strong tendency to form molecular clusters.
Figure 3
Nucleation rate data
(a) E1 and (b) E2 used in parameter estimation
(circle markers) for the s-based and S-based models, the corresponding (s, J) data (diamond markers) where where s is calculated
using the fitted θ2PN and eqs and 18, the outputs of the fitted models (lines),
and the corresponding 95% confidence intervals (shade). The data E1
are reprinted with permission from ref (14). Copyright 2001 American Institute of Chemical
Engineers (AIChE). The data E2 are reprinted with permission from
ref (17). Copyright
2006 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Table 4
Parameter Estimation Results Using
Experimental Data E1 and E2 (See Table )a
data
parameters
θ1PN (m–3 s–1)
θ2PN (−)
γ × 103 (J m–2)
E1
θ̂PN (x = s)
3 × 1019, [1 × 1019 ,1 × 1020]
5.0 [1.6, 8.5]
9.7 [6.5, 11.5]
θ̂PN (x = S)
8 × 1016, [4 × 1016, 2 × 1017]
10.5 [8.7, 12.3]
12.4 [11.6, 13.0]
E2
θ̂PN (x = s)
2 × 1026, [7 × 1019, 5 × 1032]
158 [98, 217]
29.7 [25.3, 33.0]
θ̂PN (x = S)
7 × 1025, [2 × 1019, 3 × 1032]
159 [100, 218]
29.7 [25.5, 33.0]
[ωmin, ωmax]: a 95% confidence interval of an estimate
is from ωmin to ωmax.
Nucleation rate data
(a) E1 and (b) E2 used in parameter estimation
(circle markers) for the s-based and S-based models, the corresponding (s, J) data (diamond markers) where where s is calculated
using the fitted θ2PN and eqs and 18, the outputs of the fitted models (lines),
and the corresponding 95% confidence intervals (shade). The data E1
are reprinted with permission from ref (14). Copyright 2001 American Institute of Chemical
Engineers (AIChE). The data E2 are reprinted with permission from
ref (17). Copyright
2006 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.[ωmin, ωmax]: a 95% confidence interval of an estimate
is from ωmin to ωmax.In Figure b, the
results of parameter estimation using the data E2 (corresponding to
the α l-glutamic acid/water system as listed in Table ) shows characteristics
observed in the Ω = 10 case. The 95% confidence intervals for
the outputs of the two models overlap and the fitted parameter values
for both models are extremely similar (see Table ), thus suggesting that the formation of
molecular clusters and its influence on parameter estimation can be
regarded as insignificant in the system of E2.The aim of this section
is to demonstrate the effect of cluster formation on the estimation
of growth kinetic parameters and to present the novel method of estimating
the specific surface energy γ from growth kinetic data. To this
aim, parameter estimation outcomes based on two sets of data generated
with Ω = 3 and Ω = 10 are analyzed, followed by a discussion
on parameter estimation results using the experimental data E3.
Generated
Data
In Figure , the concatenated time series of the Ω
= 3 data and the corresponding predictions of the fitted, s-based, and S-based model (explained in section ) are shown.
Figure 4
Concatenated
time series of the data generated with Ω = 3
and the outputs of the fitted, s-based, and S-based models with respect to the bulk supersaturation S and the volume-weighted mean size L̅v. The alternating background color is used to distinguish
a subset of the whole data, each generated from a different KRE simulation.
Concatenated
time series of the data generated with Ω = 3
and the outputs of the fitted, s-based, and S-based models with respect to the bulk supersaturation S and the volume-weighted mean size L̅v. The alternating background color is used to distinguish
a subset of the whole data, each generated from a different KRE simulation.For both models, the only fitted output was the
bulk supersaturation S, as explained in section , while the
volume-weighted mean size L̅v was
calculated according to its definition: L̅v = m4/m3. As shown in Figure , the outputs of both models agree with the
corresponding data points satisfactorily, and, furthermore, the 95%
confidence intervals for the outputs are narrower than the thickness
of the lines corresponding to the model outputs, which means that
the statistical goodness of fit of each model is adequate. It can
be seen in Figure that the time series of the mean size L̅v predicted by the two models overlap with the actual values
obtained from the KRE simulations, thus suggesting that the descriptions
of the two models are practically identical and these descriptions
conform to that of the KRE model.Although the outputs of the
two models are remarkably similar,
a striking difference is observed in the fitted values of the parameter
(θ̂RG), as summarized in Table . The value obtained
using the S-based model is noticeably smaller than
the actual one (θRG,KRE), whereas the
one from the s-based model is closely comparable
to the actual one. Recall that the difference between the two models
originates from the difference between s and S as illustrated in Figure . Moreover, by using the s-based model,
the specific surface energy γ was simultaneously estimated as
explained in section , which is the novel method of determining γ. Remarkably,
the estimated value of γ (8.2 mJ m–2) is very
close to the actual one (7.4 mJ m–2), which falls
within the 95% confidence interval of the estimate ([6.0, 11.2] mJ
m–2), thus proving the applicability of the novel
method to a system featuring a relatively small Ω.
Table 5
Actual Parameters in a Simulated Environment
(θRG,KRE) and Parameter Estimation Results
Using Two Sets of Generated Data (θ̂RG)a
data
parameters
θRG (μm s–1)
γ × 103 (J m–2)
Ω = 3
θRG,KRE
0.286
7.4
θ̂RG (x = s)
0.261 [0.189, 0.334]
8.2 [6.0, 11.2]
θ̂RG (x = S)
0.170 [0.167, 0.172]
Ω = 10
θRG,KRE
0.435
24.6
θ̂RG (x = s)
0.436
[0.432, 0.440]
26.0 [23.2, 29.1]
θ̂RG (x = S)
0.430 [0.429, 0.431]
[ωmin, ωmax]: a 95% confidence
interval of an estimate is from ωmin to ωmax.
[ωmin, ωmax]: a 95% confidence
interval of an estimate is from ωmin to ωmax.In Figure , the
parameter estimation results based on the Ω = 10 data are shown.
On the one hand, the results are similar to those for the Ω
= 3 data in that the s-based and S-based models fit the data satisfactorily: the outputs of the two
models coincide with the data, and the corresponding 95% confidence
intervals are narrower than the thickness of the lines representing
the model outputs. On the other hand, the results differ from the
ones obtained for the Ω = 3 data because, in the case of Ω
= 10, the estimated θRG for the two models is practically
the same, as summarized in Table . This is mainly because s and S are almost equivalent in the corresponding conditions,
as shown in Figure . As reported in Table , the value of γ estimated from the Ω = 10 data (26.0
mJ m–2) is closely comparable to the actual one
(24.6 mJ m–2), thus showing that the novel method
of estimating γ is also applicable to a system exhibiting a
relatively high Ω.
Figure 5
Concatenated time series of the data generated
with Ω = 10
and the outputs of the fitted, s-based, and S-based models with respect to the bulk supersaturation S and the volume-weighted mean size L̅v. The alternating background color is used to distinguish
a subset of the whole data, each generated from a different KRE simulation.
Concatenated time series of the data generated
with Ω = 10
and the outputs of the fitted, s-based, and S-based models with respect to the bulk supersaturation S and the volume-weighted mean size L̅v. The alternating background color is used to distinguish
a subset of the whole data, each generated from a different KRE simulation.Finally, the parameter
estimation
results obtained using the experimental data E3 (corresponding to
the α l-glutamic acid/water system as listed in Table ) are illustrated
in Figure . As explained
in section , the
birth and spread (BS) growth model (eq with x = S or x = s) was utilized to fit the E3 data.
In Figure , both s-based and S-based models fit data adequately,
and the 95% confidence intervals for the predictions of the models
are narrower than the width of the lines representing the predictions.
Analogous to the results illustrated in Figures and 5, the behavior
of the two models illustrated in Figure is essentially equivalent. Note that, unlike
the BS model, the rough growth (RG) model could not describe the data
E3 adequately (see section S6 in the Supporting Information). This clearly suggests that the growth of the
α l-glutamic acid crystals in aqueous solution is not
governed by the RG mechanism, and thus the estimated Ω value
from the RG model cannot be trusted.
Figure 6
Concatenated time series of the E3 data
(see Table ) and the
outputs of the fitted, s-based, and S-based models with respect
to the bulk supersaturation S and the volume-weighted
mean size L̅v. The alternating background
color is used to distinguish a subset of the whole data, each corresponding
to a different experiment. From left to right, the fitted experiments
correspond to Runs 1, 4, 5, 7, 10, and 12 in ref (21), which are all reported
data sets measured at 25 °C, and they are reproduced with permission
from the Royal Society of Chemistry.
Concatenated time series of the E3 data
(see Table ) and the
outputs of the fitted, s-based, and S-based models with respect
to the bulk supersaturation S and the volume-weighted
mean size L̅v. The alternating background
color is used to distinguish a subset of the whole data, each corresponding
to a different experiment. From left to right, the fitted experiments
correspond to Runs 1, 4, 5, 7, 10, and 12 in ref (21), which are all reported
data sets measured at 25 °C, and they are reproduced with permission
from the Royal Society of Chemistry.Table summarizes
the estimated parameters for the s-based and S-based models as well as those obtained from the reference
literature (ref (21)) for the E3 data, denoted by θ̂BS (x = s), θ̂BS (x = S), and θ̂BS,ref, respectively. There is a reasonable agreement
between the estimated parameters for the S-based
model θ̂BS (x = S) and the reference values θ̂BS,ref with a small difference in the value of
θ2BS.
Note that it is not possible to obtain parameter values for the S-based model θ̂BS (x = S) identical to the reference values θ̂BS,ref because the applied parameter
estimation methods are different, and the reference values θ̂BS,ref were fitted using the whole data sets measured
at 25, 35, and 45 °C, whereas only a subset measured at 25 °C
was employed for fitting the S-based model. The main
motivation for using the 25 °C data only is to determine γ,
a temperature-dependent quantity,[10,16] by applying
the novel method and to compare its value with that estimated from
the data E2, which is the primary nucleation kinetic data for the
same system at the same temperature. It is worth mentioning that a
comparison between the estimated parameters for the s-based and the S-based model shows that the estimated
values of θ1BS are noticeably different, and those of θ2BS are highly comparable.
The contrasting difference in the estimated θ1BS suggests that the cluster formation
in this system (i.e., an aqueous solution of l-glutamic acid)
is likely to be significant.
Table 6
Parameter Estimation
Results Using
Experimental Data E3 (See Table )a
data
parameters
θ1BS (μm s–1)
θ2BS × 10–4 (K2)
γ × 103 (J m–2)
E3
θ̂BS,ref
0.42
5.42
θ̂BS (x = s)
0.57
[0.40, 0.74]
4.72 [4.44, 5.01]
14.2 [10.4,
19.4]
θ̂BS (x = S)
0.41 [0.40, 0.43]
4.97 [4.79, 5.14]
[ωmin, ωmax]: a 95% confidence interval of an
estimate is from ωmin to ωmax.
[ωmin, ωmax]: a 95% confidence interval of an
estimate is from ωmin to ωmax.The specific surface energy
γ of α l-glutamic
acid crystal in an aqueous solution at 25 °C was estimated to
be 29.7 mJ m–2 in section where the data E2 were used for fitting.
However, this estimate of γ differs substantially from other
four estimates reported in the literature,[15,16] namely, 6.8, 8.4, 10.4, and 17.1 mJ m–2, where
two values of γ are given in ref (16), and the other two are calculated from the estimated
values of θ2PN reported in ref (15) using eqs and 18. A comparison of all the estimates
based on five sets of nucleation rate data of the same system at the
same temperature suggests that estimating γ from a set of nucleation
rates is not highly reliable due to inherent difficulties in acquiring
nucleation rates accurately (see ref (48) for a detailed discussion about inherent uncertainty
in primary nucleation rate measurement due to its intrinsic stochasticity).
On the contrary, the presented method of determining γ from
the temporal evolution of a solute concentration (or bulk supersaturation)
is intrinsically more robust, because the concentration can be monitored
quite accurately using various techniques such as spectroscopy,[49−52] chromatography,[53] and particle imaging.[52] For instance, a concentration estimate obtained
from attenuated total reflectance Fourier transform infrared (ATR-FTIR)
spectroscopy can have an error of 3% with respect to the equilibrium
concentration,[54] or even less.[55] As listed in Table , the value of γ estimated using our
method is 14.2 mJ m–2 (corresponding to Ω
= 4.9), which is closely comparable to the two literature estimates
(10.4 and 17.1 mJ m–2) mentioned above. Although
validating the presented method further by applying it to other systems
is highly recommended, this is nevertheless outside the scope of this
contribution.The optimal method for estimating γ depends
on a number of
factors. If nucleation rates can be measured accurately, the standard
method of fitting a nucleation rate model is more effective due to
the relative simplicity of the fitting procedure. Moreover, the novel
method can be inaccurate, when a quantitative estimation of solute
concentration is difficult (e.g., due to a very low solubility[54]), when other concomitant mechanisms (e.g., breakage
and secondary nucleation) occur uncontrollably during seeded batch
de-supersaturation experiments, and when a system exhibits a very
large value of Ω such that s and S are very similar. In the last case, Ω cannot be accurately
estimated because Ω does not affect the growth rate through eq . When eq reduces to S = s, then Ω cannot be estimated from growth experiments.
Conclusions
The effect of molecular
cluster formation on supersaturation and,
consequently, on the estimation of kinetic parameters for primary
nucleation and growth has been investigated in different systems using
computationally generated data and three sets of benchmark experimental
data. Utilizing the generated data has allowed us to study different
systems under different conditions more extensively, to prevent any
systematic error in experimental measurements from confounding our
analysis, and to identify a general, system-dependent trend in cluster
formation and its influence on the estimation of kinetic parameters.
The generated data and the reference experimental data were used for
fitting primary nucleation and growth kinetic models, while each model
was formulated on the basis of supersaturation defined in two different
manners. One supersaturation definition, denoted by S, was based on bulk concentrations, whereas another definition of
supersaturation, denoted by s, was formulated based
on monomer concentrations to consider the impact of cluster formation
on kinetics. As a result, two kinetic models were formulated for each
phenomenon.Parameter estimation results using s-based and S-based models have demonstrated that
considerable cluster
formation in a system leads to strikingly different interpretations
of identical data depending on the choice of the supersaturation of
reference. In this case, an accurate estimation of model parameters
can be achieved only when the s-based model is employed.
On the contrary, when cluster formation is insignificant, its influence
on parameter estimation outcomes is negligible.A novel method
of estimating the specific surface energy γ
from concentration profiles of seeded batch crystallization has been
developed. This method gives a more reliable estimate of γ than
standard methods that utilize nucleation rate data, mainly because
the concentration can be quantified more accurately than the nucleation
rate. The applicability of the novel method to different systems has
been proven by demonstrating that the estimates of γ agree with
the actual values used to generate data or with the values reported
in the literature.It is worth noting that the present analysis
is valid only within
the framework of classical nucleation theory, where crystallization
kinetics are limited by the transport of monomers instead of that
of oligomers; molecular clusters have the same structure and specific
surface energy as crystals and consist of molecules having the same
bulk chemical potential as crystals independent of their cluster size.
Although a generalization of this work is in principle possible, for
instance, to be applicable to two-step nucleation theory, this is
nonetheless beyond the scope of this contribution.
Authors: Monika Warzecha; Lakshmanji Verma; Blair F Johnston; Jeremy C Palmer; Alastair J Florence; Peter G Vekilov Journal: Nat Chem Date: 2020-09-23 Impact factor: 24.427
Authors: Dominique Maes; Maria A Vorontsova; Marco A C Potenza; Tiziano Sanvito; Mike Sleutel; Marzio Giglio; Peter G Vekilov Journal: Acta Crystallogr F Struct Biol Commun Date: 2015-06-27 Impact factor: 1.056
Authors: Mike Sleutel; Jim Lutsko; Alexander E S Van Driessche; Miguel A Durán-Olivencia; Dominique Maes Journal: Nat Commun Date: 2014-12-03 Impact factor: 14.919