Enzhi Liang1, Song Zhang2, Bin Liu2, Bujin Qi2, Yanpei Nie2, Zhihong Yuan1. 1. State Key Laboratory of Chemical Engineering, Department of Chemical Engineering, Tsinghua University, Beijing 100084, China. 2. China Sunsine Chemical Holdings Ltd., Shanxian City, Shandong 274300, China.
Abstract
As an important chemical intermediate, 2-mercaptobenzothiazole (MBT) is widely used in various processes, especially in the rubber industry. However, there is no first-principles model that describes the synthetic process of MBT. This paper focuses on the formulation of a reliable mathematical model represented by a series of differential and algebraic equations for the industrial batch MBT production process. It is difficult to estimate all of the unknown parameters in the model because of the lack of sufficient industrial/experimental data. Thus, a reduced estimable parameter set is derived by performing estimability analysis on the original estimation problem. A multiple-starting-point strategy is then applied to numerically solve the non-convex parameter estimation problem with the weighted least-squares approach. Subsequently, a cross-validation technique is employed to evaluate the generalizability of the proposed model. Finally, it is confirmed that the proposed model produces encouraging prediction performance with regard to independent test data.
As an important chemical intermediate, 2-mercaptobenzothiazole (MBT) is widely used in various processes, especially in the rubber industry. However, there is no first-principles model that describes the synthetic process of MBT. This paper focuses on the formulation of a reliable mathematical model represented by a series of differential and algebraic equations for the industrial batch MBT production process. It is difficult to estimate all of the unknown parameters in the model because of the lack of sufficient industrial/experimental data. Thus, a reduced estimable parameter set is derived by performing estimability analysis on the original estimation problem. A multiple-starting-point strategy is then applied to numerically solve the non-convex parameter estimation problem with the weighted least-squares approach. Subsequently, a cross-validation technique is employed to evaluate the generalizability of the proposed model. Finally, it is confirmed that the proposed model produces encouraging prediction performance with regard to independent test data.
The chemical intermediate 2-mercaptobenzothiazole (MBT) is a crucial
product in the rubber and automobile manufacturing industries.[1] Traditionally, MBT was extensively used as an
accelerator in the rubber vulcanization process.[2] Later, the capability of MBT for inhibiting metal corrosion
was discovered, and biological applications of MBT as a fungicide
were also investigated.[3,4] In recent research studies, new
functions of MBT and its derivatives have been disclosed that suggest
it can be used in the modified Julia olefination reaction, which is
an effective tool for advance fragment linkage.[5,6] MBT
is also an essential precursor of many other popular rubber additives,
such as 2,2′-dibenzothiazyl disulfide (MBTS) and N-cyclohexylbenzothiazole-2-sulphenamide (CBS). Furthermore, several
widely used antibiotics, such as cephalosporins, can also be synthesized
from the derivatives of MBT.[7]As
the batch process is more flexible and closer to the market
requirements than its continuous counterpart,[8] batch production of MBT with aniline as a reactant, usually referred
to as the Kelly process, remains a popular choice.[9] The Kelly process charges the reactants, i.e., aniline,
sulfur, and carbon disulfide, in a closed autoclave. The temperature
is then raised to 220–250 °C and maintained for about
4 h.[10] During the reaction, the pressure
of the autoclave increases to around 10 MPa as a result of the production
of H2S. Under these tough reaction conditions, major manipulations
of the industrial MBT production process still rely on the prior experiences
of workers, which may lead to significant reductions in profits and
could even cause safety problems. Therefore, it is essential to build
a reliable process model for MBT and estimate the relevant process
parameters for the industrial production process in preparation for
further optimization, automatic control, and monitoring.Existing
approaches for process modeling can be categorized into
various levels according to the capabilities of the model in describing
the relevant process; common examples include white-box models, gray-box
models, and black-box models.[11] Among the
various types of process models, the first-principles models, or fundamental
mathematical models, theoretically provide the best representations
of the dynamic characteristics and nonlinear behaviors of the chemical
process.[12−14] Unfortunately, the rigorous reaction mechanism of
MBT synthesis is very complicated, with at least 50 species separated
and identified from the generated mixture.[5] The inter-reactions between the minor components have not yet been
fully identified. Therefore, the establishment of a full and rigorous
first-principles model that includes all reaction components is a
huge challenge. Fortunately, the major reaction steps of MBT synthesis
have been disclosed in recent research studies.[5,15] Thereby,
kinetics of these key reactions can be mathematically formulated.
Additionally, the less important reactions can be represented by a
lump reaction. In this way, a complete theoretical model describing
the kinetics, mass/energy balances, and equations of state can be
established.Parameter estimation is one of the key components
in establishing
any model. In certain linear systems, where the output variables can
be easily interpreted as explicit functions of the input variables
and model parameters, parameter estimation is often referred to as
explicit estimation.[16,17] In contrast, implicit estimation
refers to estimation problems in which the outputs are implicitly
dependent on the inputs and parameters.[18] Implicit parameter estimation is usually performed by solving an
optimization problem that minimizes the mismatches between the model
predictions and real datasets, with the parameters serving as the
decision variables.[19,20] The implicit estimation strategy
is applied in this study due to the nonlinearity of the reaction kinetics
of the MBT production process.In recent decades, a great number
of methods have been developed
for implicit parameter estimation. Among these, the most popular one
is the least-squares (LS) approach, in which the objective function
is generally defined as a summation of the 2-norm residuals of output
variables between model prediction and measurement.[13,17,21] However, the standard LS approach assumes
that different output data have same contributions to the estimation
of unknown parameters.[22] Therefore, the
weighted least-squares (WLS) method was developed, whereby suitable
weights are assigned to the relevant residuals.[19,23] The relatively compact form and high computational efficiency of
the WLS approach have led to its widespread application to parameter
estimation problems in various chemical processes.[23−25] The WLS approach
assumes that the input variables can be precisely measured and that
there are no errors in the input data. Nevertheless, errors in the
input measurements cannot be ignored in a few industrial processes,
such as the measurement errors of jacket temperature in the solid–liquid
reactions.[26] In such cases, the errors-in-variables
(EIV) method can better describe these processes, because all of the
errors in the measured data, including the input data, are considered
in the EIV formulation.[27,28] However, the EIV approach
requires far more computational resources than WLS.[29] For example, in the parameter estimation problem of the
batch MBT process where 18 experiments are available in the training
dataset and each experiment records around 60 temperature points as
input variables, the EIV approach has to consider the extra 18 ×
60 measured temperature values as additional decision variables, which
leads to unnecessary use of computational resources. Moreover, treating
all the inputs as decision variables can lead to an ill-condition
problem especially when these inputs are not informative enough.[29]Industrial batch MBT production is essentially
a dynamic process
that has no stable operation point. Hence, the corresponding mathematical
model should incorporate ordinary differential equations and algebraic
equations, giving a differential and algebraic equation (DAE) system.
The parameter estimation for such a nonlinear DAE system will often
encounter challenges such as non-convexity and ill-conditioning[30,31] where the solution is very sensitive to the variations in the data.
The non-convexity of an optimization problem can lead to multiple
local optima. Therefore, the multiple-starting-point (MSP) approach
is applied to the estimation problem to improve global optimality.
On the other hand, the difficulty of optimizing an ill-conditioned
and over-fitted problem can be properly addressed by performing estimability
analysis and reducing the number of parameters to be estimated.[32,33] In general, parameters that are coupled with each other and parameters
that have no significant influence on the output data cannot be estimated,
and are therefore removed from the estimation problem to avoid numerical
intractability and unreliable solutions.Several effective techniques
for improving ill-conditioned estimation
problems and selecting the most informative parameters have been developed.
For instance, the ridge regression approach regularizes the parameter
estimation problem by adding a specific term to the LS term of the
objective function.[34] Nevertheless, ridge
regression cannot provide enough extra constructive information. Another
approach is principal component analysis (PCA), a model-reduction
technique that can identify and choose the most influential parameters
to be estimated.[35] Specifically, PCA applies
eigenvalue decomposition to the information matrix of the corresponding
process and selects the largest eigenvalue that makes the greatest
contribution to the system equations.[36] The original matrix is then projected into a space spanned by the
selected eigenvectors. After repeating the eigenvalue decomposition
and projection steps, the full space of the original matrix is replaced
by the reduced space spanned by the eigenvectors with the largest
eigenvalues. Hence, the computational burden of the original problem
can be significantly reduced. Nevertheless, PCA can only be applied
to linear or almost-linear systems, while the estimation problem for
the MBT synthesis process is highly nonlinear. A third technique is
global sensitivity analysis (GSA). GSA can naturally incorporate the
parameter information into the sensitivity analysis to describe the
parameter behaviors over a large range.[33,37] Hence, GSA
can evaluate the parameter estimability over the whole space. However,
the evaluation of the global sensitivity matrix requires the computation
of a complicated multi-dimensional integral term.[37] Furthermore, the experimental data cannot be used directly
and need to be reformulated before applying the global sensitivity
technique.[33] Another technique is parameter
estimability analysis based on the local sensitivity matrix, which
has been widely studied and applied in the past decades[26,32,38] and is used in this work.The overall research methodology of this study is displayed in Figure . Based on the prior
experiments and existing literature, the reaction mechanism for parameter
estimation is disclosed. Then, the dynamics of the process and corresponding
parameter estimation problem are formulated. The estimability analysis
is conducted to improve the condition number of the estimation problem.
Subsequently, the estimation problem is solved and the selected unknown
parameters are revealed. Finally, the established model is validated
by independent experiment data and industrial data.
Figure 1
Flowchart of overall
research methodology.
Flowchart of overall
research methodology.We concentrate on developing
a reliable first-principles mathematical
model and producing optimal estimates of the relevant parameters for
the industrial MBT batch production process. The main scientific contributions
can be described as follows:The mathematical model is formulated
for the industrial MBT batch production process.Parameter estimability analysis is
conducted to select the estimable parameters and improve the condition
number of the parameter estimation model for MBT production process.The nonconvex parameter
estimation
problem is solved by the MSP approach to get the optimal parameters
for the model of MBT production process.The quality of the proposed model
is evaluated using independent industrial/experimental datasets.The remainder of this paper is arranged
as follows. In Section , the reaction mechanism
and the model equations are introduced. Section then discusses the parameter estimation
model and estimability analysis approach. In Section , the unknown parameters are estimated and
the model is validated. Finally, a brief conclusion to this study
is presented in Section .
Reaction Mechanism and Model Development
The overall reaction mechanism of MBT synthesis can be expressed
as follows:The MBT yield in the current industrial batch process is usually
80–90% based on aniline. At present, the complete steps of
the elementary reactions are still under exploration.[5] In general, the intermediates associated with MBT synthesis
can be categorized into two groups, namely, the desired group that
generates MBT and the undesired group. Specifically, the most crucial
intermediates in the desired group are N,N-diphenylthiourea (DPTU) and N-phenyl-2-benzothiazolamine
(ABT). Both intermediates take part in the major reaction steps for
synthesizing MBT.[15,39] The undesired group includes
those byproducts that finally and irreversibly form tar. As the real
composition of tar is very complex, a virtual component is used to
represent it in our work, the formula for which is derived based on
element conservation. Note that the most abundant byproduct, 1,3-benzothiazole
(BT) can also be categorized as an undesired product because the generation
of more BT usually leads to less production of MBT.Based on
previous studies on the major reaction steps of MBT synthesis,[5,15,39] the dominant reaction path can
be described as shown in Figure . For brevity, the “smaller” components,
such as sulfur, CS2, and H2S, are not displayed
in Figure .
Figure 2
Main pathway
of the MBT synthesis reaction.
Main pathway
of the MBT synthesis reaction.According to Figure , the MBT synthesis pathway can be discomposed into the following
three cascade reactions:The byproduct BT can be generated
from the following side reaction,
which is much slower than the main reactions:The lump reaction ( represents the complex
inter-reactions between the undesired byproducts,
which will irreversibly generate tar:where the molar weight of
the virtual tar component and the stoichiometric coefficients a, b, c, and d in reaction ( can
be derived by conservation: a = 0.41, b = 0.37, c = 0.82, and d = −0.168.The traditional apparent Arrhenius equation is defined as follows:where i =
1, ..., 5 indicate the indexes of the reaction, j = 1, ..., 9 represent the indexes of the considered components, A0 denotes the Arrhenius coefficient, E represents
the activation energy, and α is
the reaction order of component j. In this work,
the equivalent re-parameterized form of the apparent Arrhenius equation
is given in eq 2. This is used to reduce the computational complexity
because the orders of magnitude of the kinetic constants A0 and E in the traditional expressions are unbalanced.[40]where T is the reference temperature
and a and b are the re-parameterized kinetic
parameters that can be defined as:Note that b > 0 holds in most cases
because E> 0, while the sign of a is
often
dependent on the choice of reference temperature T. In this work, the reference temperature is selected as
273.15 K to let a < 0 hold and avoid
the change of a from plus to minus. In
this way, the sensitivities of model variables with respect to a can be evaluated without extra consideration
of the sign change. Moreover, an increase in b will lead to a faster reaction rate because T is much lower than the reaction temperature T. In addition, a and b are dimensionless parameters.Importantly,
notwithstanding that the batch reactor has vapor and
liquid phases, reactions can only take place in the liquid phase.
Based on the empirical equations of state, the average density of
the liquid reactant shows no significant change before and after the
reaction. As massive amounts of CS2 and H2S
dissolve in the liquid phase under high pressure, the change of mass
in the liquid phase is negligible compared with the overall mass.
As a consequence, the liquid volume V can also be seen as a constant during the batch process. Therefore,
the concentrations of the heavy components, including aniline, sulfur,
BT, MBT, ABT, DPTU, and tar, can be expressed as follows:where n is the molar amount
of the heavy component.The light components, i.e., CS2 and H2S,
exist in both the liquid and vapor phases. Hence, the effective molar
amounts of CS2 and H2S in the liquid phase that
can participate in the reactions are lower than their total abundances.
In general, the molar fractions of CS2 and H2S in a liquid–vapor system change with different compositions,
reactor pressures, and temperatures. Thus, the relationships between
the fractions of light components in the vapor phase, x*, , and the related variables
can be described aswhere x∈
[0, 1] and x∈ [0, 1] represent
the molar fractions of CS2 and H2S in the vapor
phase, respectively. The molar amounts of CS2 and H2S in the vapor phase can be computed from n = nx and n = nx. Note that x + x = 1 and x + x = 1 always hold during the reaction process.As the overall
compositions, pressures, and temperatures change
with time, the molar fractions of CS2 and H2S in the vapor phase also change over time. Hence, the rigorous form
of eq is difficult
to measure. Fortunately, the change in the overall composition is
related with the reactor pressure, because an increase in the reactor
pressure indicates the generation of the terminal product H2S. A higher pressure is correlated with more products and less remaining
reactants. In this way, the compositions of all the components can
be expressed as an implicit function related to pressure. Hence, eq can be simplified asFurthermore, eq can
be approximated by the following linear expressions:where x1 and x2 indicate the intercepts
of the affine approximation, while k, k, k, and k denote the slopes. Note that the values
of the slopes and intercepts can be obtained offline via the equations
of state and the empirical formula. Therefore, the concentrations
of CS2 and H2S in the liquid phase are defined
byAs mentioned above, the concentrations
of the components change
over the reaction process. Taking a mass balance, the change of concentration
with respect to time can be defined aswhere γ are the stoichiometric coefficients and n denotes the molar amount of component j. In addition, the total reactor pressure is equal to the
pressure of the vapor phase, which can be calculated by the pressure–volume–temperature
(PVT) relationship aswhere Z is the compression factor and V denotes the volume of the vapor phase. As the total volume of the
reactor is fixed and the volume of the liquid phase can be regarded
as a constant, the vapor volume can also be assumed to be constant.The mathematical model of the industrial MBT batch production process
is summarized in Table . The model contains 9 ordinary differential equations with nine
initial conditions and 21 algebraic equations. The model has 31 time-varying
variables, as listed in Table . Note that n is a sub-vector of n excluding CS2 and H2S. As a result, the mathematical
model has one degree of freedom, which is equal to the number of independent
variables to subtract the number of independent equations.[12] For example, once the temperature profile is
given, the whole reaction process should reach the same termination
point under the same initial conditions. Importantly, the energy balance
is not considered in the mathematical model because the batch reactor
is a non-adiabatic reactor and no measured data relative to energy
balance is available. In addition, the parameters of the proposed
model are summarized in Table . To be specific, the reactor temperature is controlled by
the power of an electrical heating belt and the heat transfer between
the reactor and environment cannot be neglected since the reactions
happen at around 250 °C.
Table 1
Mathematical Model
of the MBT Batch
Production Process
re-parameterized kinetic
parameters to be estimated
5
bi
/
re-parameterized
kinetic parameters to be estimated
5
αij
/
apparent reaction orders
10
γij
/
stoichiometric numbers
24
kP*, kT*, x*
/
parameters for computing molar fraction of
CS2 and
H2S in the vapor phase
6
Zc
/
compression factor in the PVT relationship
1
R
mol·K–1·L–1
ideal gas constant
1
Vvap
L
volume of vapor phase
1
Vliq
L
volume of
liquid phase
1
tf
min
final time
1
Parameter Estimation Model and Estimability
Analysis
Parameter Estimation Model
The overall
parameter estimation problem of the DAE system representing the MBT
production process can be defined aswhere y* denotes the measured value of output variable y at sampling time t in the mth experiment, y(t) denotes the model prediction for y*, Σ–1 is the inverse
of the covariance matrix, and x and u represent the state and input variables in
the mth experiment, respectively. denotes unknown parameters whose lower
and upper bounds are defined as θLB and θUB. The whole DAE system contains n differential equations and n algebraic equations, where n denotes the number of batch experiments in dataset I.Solving the parameter estimation problem often
gives point estimation. To further evaluate the accuracy of estimation,
the confidence interval (CI) can also be evaluated. The CI of the nth parameter can be derived by the following definition[17,41] for a given significance level α:where θ* denotes the estimated
value, F–1(P/v ) represents the value of
the Student’s cumulative distribution function, P = 1 – α/2 is the probability, v is
the number of degrees of freedom, and d is the nth diagonal element of (Z)−1 where Z is the sensitivity
matrix.
Estimability Analysis via a Local Sensitivity
Matrix
The estimability analysis reflects whether the fitted
model with the selected parameter subset can properly depict the correlations
between the output and input data.[42] To
illustrate the estimability analysis procedure, an explicit DAE system
is defined here aswhere denotes the state variable, denotes the output variable, is the input variable, and denotes the unknown parameter. The initial
conditions are defined as x(0) = x0 and y(0) = y0.The sensitivity of the kth state variable x with respect to the pth unknown
parameter θ at sampling time t is defined asThe analytical form of the sensitivity coefficient can be
given
as adjoint sensitivity[43] or direct sensitivity
formulation.[44,45] In this work, the direct sensitivity
is chosen:Based on the chain rule, the sensitivity coefficient that
reveals
the relationship between the lth output variable
and the nth unknown parameter can be derived asThe sensitivity coefficients
are usually scaled because the magnitudes
of the parameters may be different. The scaled sensitivity can be
expressed aswhere θ̅ denotes the parameter point that is used
for sensitivity evaluation, while y̅(t) represents the measured data of
the output variable at time t. The absolute
value function | · | is used in eq because the scale factor might
be negative.Thus, the scaled sensitivity matrix of output variable
vectors
with respect to parameter vectors at t can be expressed aswhere represents the scaled sensitivity. Note
that the outputs may not be measurable at every sampling moment. In
this case, the corresponding sensitivity should be fixed to zero.
The entire sensitivity matrix Z at every sampling
moment t ∈ {t0, ..., t} can be defined
asThe full sensitivity matrix Z has nθ columns and (N + 1)n rows. Importantly, the column rank of matrix Z indicates the maximum number of parameters that can theoretically
be identified.[46] Furthermore, the well-conditioned
property of the parameter estimation problem can be ascertained by
the condition number of Z. If the condition
number of Z is very large, the estimation
problem is ill-conditioned and will usually not be estimable.The estimability of the parameters can be sorted based on their
influence on the measured data and whether these parameters are correlated
with each other.[26] Therefore, the parameter
ranking policy mainly concentrates on the magnitudes and linear dependencies
of the columns of Z. In the following, two parameter
ranking strategies are introduced. Both ranking methods will be used
in the estimability analysis in this study.The first approach
iteratively regresses the full sensitivity matrix Z with the columns of the most predictable parameters to
determine the estimable rankings of the parameters.[32] The parameter rankings interpret the estimability of the
corresponding parameters, from that with the greatest influence on
the model outputs to that with the least influence. The detailed procedure
is displayed in Figure , and the corresponding steps are described as follows:[32]
Figure 3
Flowchart of the first
estimability analysis approach (iterative
regression).
Initialize the loop with k = 1, residual matrix R0 ≔ Z, and define the
initial selected column X0 as an empty
matrix.Calculate the
2-norm for every column
of the residual matrix R.Search for the column
with the largest
2-norm and remark the column as C. The
corresponding parameter is selected and ranked after the parameter
chosen in the previous iteration.The matrix denoting the already selected
columns is augmented as X = [XC]Calculate the prediction matrix Ẑ of the original matrix Z, such that .Calculate the updated residual matrix R ≔ Z – Ẑ.Output
of the order of columns in X if the norm
of residual matrix R is below the stop
criteria ϵ. Else, repeat
the steps (3)–(6).Flowchart of the first
estimability analysis approach (iterative
regression).The second approach is to apply
individual variance contribution
analysis based on QR decomposition.[38] In
detail, the sensitivity matrix Z is decomposed by
QR decomposition with column permutation aswhere P is
the column permutation matrix, R is an upper triangular
matrix, and Q is an orthogonal matrix. The matrix R is further decomposed aswhere D = diag [d1, ..., d] and R̃ is a unit
triangular matrix in which all diagonal elements are 1.
With the supposition that all the measurement errors obey Gaussian
distribution, the individual variance contributions of the unknown
parameters can be written aswhere r is the nth column of the inverse matrix
of R̃. The workflow of this approach is displayed
in Figure .
Figure 4
Flowchart of
the second estimability analysis approach (QR decomposition).
Flowchart of
the second estimability analysis approach (QR decomposition).
Industrial Case Studies
The numerical problem was programmed on a personal computer with
16 GB RAM and an i7-8750H CPU @2.20 GHz. The estimation problem was
implemented on the MATLAB 2018a platform. The control vector parameterization
approach[47] was used to discretize the DAE
system with a MATLAB ode45 solver for integration, as this approach
generates a relatively small nonlinear programming (NLP) problem when
incorporating experimental data. The resulting NLP was then solved
by the IPOPT solver[48] with a MATLAB interface.
The absolute and relative function evaluation tolerances of the IPOPT
solver are set as 1 × 10–7, while the other
settings are set by default. The optimization problem with multiple
starting points was solved by the parallel computing toolbox of MATLAB
with a parallel pool (12 workers) on local computers.
Estimation Model of the MBT Batch Production
Process
The experiments were conducted in a medium-sized
autoclave (volume 4.0 L) with the experimental devices shown in Figure . Industrial data
were collected from a larger industrial autoclave (volume = 1.0 m3). The overall dataset contains 38 independent batches, including
28 batches of experimental data and 10 batches of industrial data.
The whole dataset was divided into three subsets. The first subset,
which served as the training dataset, contained 18 batches of experimental
data for determining the values of the unknown parameters. The second
subset, named test dataset I, contained 10 batches
of experimental data, and the third subset, named test dataset II, contained 10 batches of industrial data.
Figure 5
Sketch of experimental
equipment.
Sketch of experimental
equipment.In both the experimental and industrial
batches, the charged molar
amounts of aniline and sulfur/CS2 solution in the autoclave
were fixed. The reactor temperature and pressure were recorded every
5 min. The weight fraction of MBT in the liquid crude product was
only measured after the whole batch process had terminated. Online
sampling of the reactant is highly dangerous because of the toxic
H2S in a high-pressure environment (∼10 MPa). Therefore,
sampling for the outlet crude product could only be conducted after
the vapor phase was completely exhausted. The weight fraction of MBT
in the crude product was detected by high-performance liquid chromatography
(HPLC). Note that the total mass of crude product from the industrial
autoclave could not be measured. Additionally, because the vapor phase
is exhausted in the industrial plant for treatment, the composition
and molar amount of vapor could not be detected.The types of
accessible data are summarized in Table . Because the temperature of
the autoclave can be manipulated by adjusting the electric heating
power directly, the whole temperature curve is regarded as an input
variable. The other measured data should be regarded as output variables
because they cannot be manipulated directly. Hence, the system has
only one degree of freedom.
Table 4
Types of Overall
Accessible Data
variable
experiment data
industrial data
charged amounts of reactants
input (fixed)
input (fixed)
temperature
input (varying)
input (varying)
pressure
output
output
weight fraction of MBT in the crude product
output
output
total mass of crude product
output
not measured
composition of exhaust
not measured
not
measured
The
measured outputs can be related with the state variables in
the DAE system. Based on mass balance, the total mass of the charged
raw material should equal the summation of the liquid mass and vapor
mass during the reaction process. Moreover, no CS2 and
H2S exists in the crude product. Therefore, the mass of
the discharged crude product can be calculated bywhere m(t) denotes the
mass of crude
product, M denotes the molar
weight of raw materials, and M denotes the vapor components.Moreover, the weight fraction
of MBT in the crude product can be
defined aswhere x(t) denotes the weight fraction
of MBT.Finally, the objective function was derived using the
WLS approach,
with the gap between the model predictions and real output values
minimized. The objective function is defined asIn the MBT model, the molar fractions
of CS2 and H2S in the liquid–vapor phases
and the compression factor Z in the PVT
equation can be obtained in advance
from empirical formulas. In addition, the stoichiometric coefficients
of the reactions are known. Furthermore, the apparent reaction orders
are assumed to be one based on the results of prior experiments. Therefore,
only the unknown kinetic parameters, i.e., a and b, have to be estimated in the
full estimation problem.In the MBT production process, the
average reaction rates of the
main reactions are around 0.001–0.1 mol/L/min because the reaction
duration is around 250 min and the MBT produced in the experiment
is around 20 mol. In addition, the temperature should be higher than
200 °C; otherwise, the reaction cannot take place. Therefore,
the apparent activation energy of the main reactions should be within
[80, 160] kJ/mol. Correspondingly, the pre-exponential Arrhenius factor
should be bounded within [108, 1012]. The reference
temperature is chosen as 273.15 K. As a result, the re-parameterized
kinetic parameters should have the following ranges: a∈ [−50, −10] and b∈ [35, 70]. In summary, the whole parameter
estimation model for the MBT batch production process under the whole
unknown parameter set is defined aswhere the index i denotes the reactions (i–v) and the decision
variables are the
unknown kinetic parameters a and b. The estimability analysis would be performed
later to decide which unknown parameters could be estimated with higher
accuracy.The parameter estimation problem for a nonlinear model
usually
has multiple local optima, although only the global optimum can guarantee
the most precise approximation to the true values of the unknown parameters.
Therefore, the MSP method is applied to solve the estimation problem
for better global performance. Each start point can reach an independent
extremum on the local convex region under a gradient-based approach.[30] Although the global optimum cannot be rigorously
guaranteed, this approach is often sufficient to reach a satisfying
solution within an acceptable computation budget.[49] Moreover, compared with meta-heuristic approaches, the
MSP approach is much easier to program and parallelize to further
enhance computational efficiency.[50]
Estimability Analysis for the MBT Batch Production
Process
The local sensitivities were evaluated at the predefined
parameter points θ̅, namely, the estimates of the unknown
parameters a and b. In this work, a feasible guess θ̅1 was carefully chosen from within the full parameter set, while the
second guess θ̅2 is a local optimum parameter
point obtained by solving the estimation problem eq with one batch of experiment data
as a prior. The ranks and condition numbers of the scaled local sensitivity
matrixes evaluated at θ̅1 and θ̅2 are provided in Table . The condition number of matrix Z is defined as cond(Z) = ∥Z∥ · ∥(Z)−1∥. If the condition number
is higher than 1 × 1010, then the relative matrix
is seen as ill-conditioned.[36] In the ideal
case, it is assumed that the concentration profiles of all components
can be measured during the reaction. The ideal case can identify the
model quality, i.e., whether the proposed model is over-parameterized
or not. On the other hand, in real cases, only the final composition
of MBT in the crude product can be obtained, representing the actual
estimability under the full unknown parameter set.
Table 5
Ranks and Condition Numbers of the
Local Sensitivity Matrix under a Full Parameter Set
ideal
case
real
case
parameter point θ̅1
parameter point θ̅2
parameter point θ̅1
parameter point θ̅2
rank(Z)
10
10
10
10
cond(ZTZ)
1.2 × 1015
1.1 × 1015
3.2 × 105
4.2 × 105
The full column ranks
of the sensitivity matrixes show that all
the unknown kinetic parameters are linearly independent. Hence, a and b can be
uniquely determined in theory. Moreover, in the ideal case, the condition
number of Z is less than 1 × 1010, demonstrating that the original parameter estimation problem
is well-conditioned, which proves that the proposed model is not over-parameterized
when sufficient concentration data are available. Nevertheless, in
the real case, we only have the final weight fraction of MBT. Thereby,
the condition number of Z increases
drastically to 1.1 × 1015. In the real case, it is
not possible to determine all unknown a and b. As a consequence, the full unknown
parameter set should be reduced to effectively exploit the available
data and regain a well-conditioned optimization problem.The
parameter rankings and individual variance contributions given
by the estimability analysis are presented in Table . The parameter rankings were produced by
iterative regression, while the individual variances were calculated
by QR decomposition. The parameter ranking is consistent with the
ascending order of individual variance contributions because a higher
individual variance contribution implies lower estimability.[26]
Table 6
Estimability Analysis
with Local Sensitivity
under the Full Parameter Set
parameter
point θ̅1
parameter
point θ̅2
parameter
value
variance contribution
parameter ranking
value
variance contribution
parameter ranking
a1
–40
1.6 × 10–4
1
–32.6
3.2 × 10–4
1
a2
–25
2.5 × 100
5
–28.8
5.9 × 10–1
4
a3
–25
8.0 × 10–1
4
–25.8
1.5 × 10–3
2
a4
–25
2.1 × 10–3
2
–31.2
5.4 × 10–1
3
a5
–50
4.0 × 10–1
3
–43.2
9.8 × 10–1
5
b1
70
6.8 ×
101
6
57.3
6.9
× 101
6
b2
45
2.7 ×
103
7
51.9
7.8
× 102
7
b3
45
7.9 ×
105
8
44.9
2.3
× 105
8
b4
45
4.6 ×
108
9
53.6
7.6
× 107
9
b5
70
1.2 × 1011
10
68.1
1.1 × 1011
10
According to Table , the re-parameterized pre-exponential factors a are more estimable than the re-parameterized
activation
energies b. More importantly, the individual
variance contributions of b3, b4, and b5 are significantly
larger than for the other activation energies. This is because b3, b4, and b5 are the re-parameterized activation energies
of reaction (,
side reaction (,
and the lump tar reaction (, which have less influence on the output data than the other
unknown parameters. Thus, b3, b4, and b5 were marked
as the least-estimable parameters and excluded from the full parameter
set (i.e., they will not be determined in the estimation problem).
Based on the values of parameter point θ̅2,
they were fixed as b3 = 45, b4 = 54, and b5 = 68, equivalent
to apparent activation energies of 100, 120, and 155 kJ/mol, respectively.After removing b3, b4, and b5, the full parameter
set becomes the estimable parameter set. The parameter points θ̅1′ and θ̅2′ are produced
by dropping b3, b4, and b5 from θ̅1 and θ̅2. The ranks and condition numbers
of the local sensitivity matrixes Z′ evaluated at θ̅1′ and θ̅2′ in the real case are given
in Table . The condition
numbers of the sensitivity matrixes are less than 5 × 106, representing a significant improvement of several orders
of magnitude. This implies that a well-conditioned parameter estimation
problem is formulated under the estimable parameter set.
Table 7
Ranks and Condition Numbers of the
Local Sensitivity Matrix under an Estimable Parameter Set
parameter point θ̅1′
parameter point θ̅2′
rank(Z′)
7
7
cond((Z′)TZ′)
4.9 × 106
4.6 × 106
Local
Sensitivity Analysis for the MBT Batch
Production Process
Local sensitivity analysis can disclose
some important characteristics of the system. The sensitivities of
the output data with respect to the estimable parameters at θ̅1′ are listed
in Table . Note that
higher values of a and b indicate a higher reaction rate. Therefore, the
values of , , ,
and are positive because higher reaction rates
in reactions ( and
(ii) lead to more ABT and produce more MBT.
Similarly, >
0 and is significantly larger than other
derivatives because MBT is directly produced in reaction (iii). In
contrast, 0 because reaction ( is competing against reaction (, and increasing the rate
of reaction ( leads
to lower MBT production. Analogously, 0 as tar reaction ( consumes MBT. However, the sensitivity of m(t) with respect
to all the kinetic parameters is positive because the mass of crude
product does not include the mass of CS2, which is gradually
consumed during the reaction.
Table 8
Scaled Local Sensitivity
Matrix of x(t) and m(t) with Respect
to Parameters at θ̅1′
output
variables
parameter
xMBT(tf)
mcp(tf)
a1
2.0 × 10–1
2.0 × 10–1
a2
1.7 × 10–1
3.5
× 10–2
a3
1.4 × 100
4.7 × 10–2
a4
–1.3 × 100
6.8 × 10–3
a5
–8.0 × 10–1
1.3 × 10–1
b1
1.7 × 10–1
1.7 × 10–1
b2
1.5
× 10–1
3.0 × 10–2
The profiles
of local pressure sensitivities with respect to the
estimable parameters at θ̅2′ are displayed in Figure . Obviously, the vapor pressure
is primarily determined by the total amount of H2S and
secondarily by the volume of CS2. Therefore, and are
significantly higher than the other
values, because reaction ( is the rate-determining step that dominates the total generation
rate of H2S. Moreover, and are
positive because H2S is
also produced by reaction (. Furthermore, indicates that the massive consumption
of CS2 in reaction ( contributes to a reduction in the vapor pressure
based on the PVT relationship. Finally, and are
negative because side reaction ( and tar reaction (v) directly
consume H2S.
Figure 6
Scaled sensitivities at
parameter point θ̅2′.
Scaled sensitivities at
parameter point θ̅2′.
Estimation Results
The problem scale
and computational performance are listed in Table . The WLS approach was applied for parameter
estimation with 18 batches of experimental data. The weights of objective
functions are set as w1 = 20, w2= 10, and w3 =
1. There were seven estimable parameters to be solved. A total of
200 random starting points were used to achieve better global optimality,
which are generated under uniform distribution within the predefined
bounds. The maximum computation time for the optimization on each
starting point was set to be 3.6 × 103 s. The average
solving time for each starting point that successfully converged to
a local extremum was around 800 s. Because the DAE system is highly
nonlinear, some starting points may not return a solution within the
maximum computation time or may even converge to an infeasible solution.
In detail, 85 points converged to local optima, while the other 115
points resulted in non-convergence or infeasible solutions. Among
the 85 converged solutions, the smallest objective value was 4.47
× 101, which is regarded as the approximated global
optimum.
Table 9
Problem Scale and Computational Performance
parameters
value
starting points
200
time intervals
60
decision
variables
7
total CPU time
(s)
1.03 × 104
objective value
4.47 ×
101
The local extremes and global minimum given by the 85 converged
solutions are portrayed in Figure . The blue points indicate local optima, while the
red dashed line represents the value of the global optimum. Most of
the feasible local solutions guarantee weighted residuals of less
than 50, which is close to the global optimum. Furthermore, the objective
values of several local optimums are close to the global optimum as
shown in Figure ,
which implies that the MSP strategy can indeed find the global optimum
of this problem.
Figure 7
Local optima and global optimum given by the MSP approach.
Local optima and global optimum given by the MSP approach.The estimated results for the re-parameterized
kinetic parameters
are provided in Table . The reliability factor can be computed by the
length of the CI
and the estimates.[26] Small reliability
factors indicate narrow CIs, which further illustrate that the estimated
result is close to the true value under a given confidence level of
90%. The activation energies and pre-exponential factors can be obtained
via the parameters listed in Table . It is indicated that the reaction
( is the limiting step, which is consistent
with the sensitivity analysis shown in Figure . Note that b3, b4, and b5 are fixed.
Table 10
Overall Estimated Results for the
MBT Production Process
parameter θn
confidence interval
reliability
βn
a1
–32.49 ∓ 0.37
0.0113
a2
–20.98
∓ 1.80
0.0852
a3
–22.54 ∓ 0.23
0.0077
a4
–34.96 ∓ 0.20
0.0080
a5
–43.13
∓ 0.16
0.0038
b1
57.18 ∓ 0.44
0.0078
b2
44.03 ∓ 2.20
0.0494
b3#
45
/
b4#
54
/
b5#
68
/
The average relative error
between the model prediction and the
data was used to evaluate the overall fitting performance of the proposed
model. This average error can be defined aswhere Δy denotes the average
relative mismatch between
the lth fitted output y and the lth measured output y* recorded in the mth experiment in dataset I, which contains n batches
of data.Analogously, the largest mismatch is defined aswhere (Δy)max is the
maximum relative gap between
the fitted and measured values of lth output y.The overall data fitting performance
is summarized in Table . The fitting of x(t) is very
accurate, with an average mismatch of just 0.38% and the largest relative
error of only 1.28%. The prediction of m(t) and the reactor pressure is also
satisfactory, with average biases of 0.95 and 1.66% and maximum deviations
of 2.50 and 3.23%, respectively. It is clear that the proposed model
is highly accurate and the estimated parameters are in good agreement
with the known training data.
Table 11
Prediction Deviations
of the Training
Dataset under Estimated Parameters
outputs
average
error Δyl®
maximum error (Δyl)max
xMBT(tf)
0.38%
1.28%
mcp(tf)
0.95%
2.50%
pressure
1.66%
3.23%
The fitting results of all experiments in the training set are
plotted in Figures and 9. In Figure , the blue dotted line denotes the fitted
values of x(t), while the red dashed line represents the experimental data. The
gray area represents the associated confidence region of x(t) under the parameter’s
confidence interval. Obviously, the actual measurements are very close
to the predicted x(t) within the corresponding confidence interval. In Figure , the blue dashed
line with squares indicates the predictions of m(t), and the red line denotes
the actual values. Notwithstanding that the mismatches between the
predictions and measurements of m(t) are slightly higher than those for x(t), the fitting
performance is still very encouraging since all the measured results
are within the 90% confidence interval.
Figure 8
Data fitting of weight
fractions of MBT x(t) in 18 training batches.
Figure 9
Data fitting
of the scaled mass of crude product m(t) in 18 training
batches.
Data fitting of weight
fractions of MBT x(t) in 18 training batches.Data fitting
of the scaled mass of crude product m(t) in 18 training
batches.The pressure profiles of the fitted
and experimental data in the
training set are plotted in Figure . The red line denotes the pressure data and the blue
line represents the pressure predictions. Note that the reaction duration
varies in the different experiments. As shown in Figure , the model predictions match
the real data in most experiments. Even in the batch with the largest
deviation, the gap between the prediction and the measured value is
still less than 0.3 MPa. Therefore, the proposed model for MBT production
accurately predicts the output measurements.
Figure 10
Data fitting of pressure
curves in 18 training batches.
Data fitting of pressure
curves in 18 training batches.
Model Validation with Experimental Data
Although our model has proven its ability to fit the data in the
training dataset, it is essential to assess the prediction quality
with independent data that have not been seen by the model so as to
prove the generality of the model. Therefore, a test set containing
10 new batches of experimental data was used in a further simulation.
The overall results are presented in Table . The average and largest mismatches of x(t) between
the model prediction and experimental data in the independent test
set are just 0.46 and 0.87%, respectively. This result is very satisfying
because the prediction mismatches are only slightly higher than with
the training data. Indeed, the maximum prediction errors of m(t), x(t), and the
reactor pressure in the test dataset are slightly smaller than with
the training dataset. It is obvious that the overall results using
the test dataset are consistent with the results using the training
dataset, confirming the generality of the established model.
Table 12
Prediction Deviations of the Experimental
Batches in Test Dataset I
outputs
average error Δyl®
maximum error (Δyl)max
xMBT(tf)
0.46%
1.13%
mcp(tf)
0.87%
2.46%
pressure
1.75%
3.23%
Detailed information about the model predictions and
measurements
of x(t) is presented in Figure . The measurements (red dotted line) are highly consistent
with the predictions (blue dashed line) and within the confidence
interval (gray area). The estimation quality of the total weight of
the crude product can be seen in Figure . The predicted mass is very close to the
experimental data in most cases within the relative 90% confidence
interval.
Figure 11
Prediction of the weight fraction of MBT of 10 experimental batches
in test dataset I.
Figure 12
Prediction
of the scaled mass of crude product of 10 experimental
batches in test dataset I.
Prediction of the weight fraction of MBT of 10 experimental batches
in test dataset I.Prediction
of the scaled mass of crude product of 10 experimental
batches in test dataset I.Furthermore, the predicted pressure curves are compared with the
experimental data in Figure . The predicted pressures are highly coherent with the experimental
data in most batches. Even in the case with the largest pressure deviation,
namely, experiment no. 6, the gap between the prediction and the measurement
is less than 0.33 MPa.
Figure 13
Prediction of pressure curves of 10 experimental
batches in test
dataset I.
Prediction of pressure curves of 10 experimental
batches in test
dataset I.
Model Validation with Industrial Data
The
model established in this study was also applied to data from
a real industrial batch process, and the gap between the predictions
and measurements was examined. Note that the mass of crude product
was not recorded in the industrial data.The overall results
are listed in Table . Detailed comparisons of the model predictions and industrial data
are displayed in Figures and 15. The average prediction errors
of x(t) and the reactor pressure are slightly higher than with the experimental
data, which might be caused by the effect of scale-up[51] and process uncertainties. For example, the real composition
of S/CS2 might derivate from the nominal value as the S/CS2 solution is mixed in an upstream mixing tank without an auto
control system. Therefore, such mismatches are still acceptable in
real production. Moreover, even in the industrial batch that produced
the maximum prediction error, the difference in the predicted value
of x(t) does not exceed 1.58%, and the mismatch in the pressure is within
3.49%. Therefore, it can be concluded that the proposed model with
estimated parameters shows high prediction accuracy for both experimental
and industrial MBT batch production processes.
Table 13
Prediction Deviations of the Industrial
Batches in Test Dataset II
outputs
average error Δyl®
maximum error (Δyl)max
xMBT(tf)
0.74%
1.58%
pressure
2.05%
3.49%
Figure 14
Prediction of the weight
fraction of MBT of 10 industrial batches
in test dataset II.
Figure 15
Prediction
of pressure curves of 10 industrial batches in test
dataset II.
Prediction of the weight
fraction of MBT of 10 industrial batches
in test dataset II.Prediction
of pressure curves of 10 industrial batches in test
dataset II.
Conclusions
In this work, a first-principles model of the MBT batch production
process has been established based on the existing mechanism of the
main reaction pathway and the lump treatment of minor side products.
Because of the lack of information regarding the concentration profile,
the estimable parameter set was selected through estimability analysis
using the local sensitivity matrix. The MSP technique was applied
to the estimation problem to improve the global performance and ensure
precise model predictions. Moreover, independent unseen data from
experimental and industrial batches were used to confirm the prediction
performance of the proposed model. In general, the established model
illustrated that the reaction ( is the rate-determining step. In other words, more aniline
in the discharged reactants can lead to a higher yield ratio of MBT.
Such a conclusion is consistent with a real industrial process.Currently, it is extremely important to increase profits and improve
safety by automatically controlling the chemical production process
based on a reliable model. The proposed model will be applied for
operation optimization and closed-loop control in real industrial
plants. Although the established model can describe the MBT production
process with high accuracy, uncertainty cannot be avoided in process
modeling and model reduction. Furthermore, other disturbances and
measurement errors might also affect the real processes. Therefore,
future work will focus on the dynamic optimization of the MBT production
process under uncertainties. In addition, it is widely accepted in
the chemical industry that MBT synthesis should be transformed from
a batch process to a continuous process to enhance production efficiency
and achieve better dynamic performance. Hence, formulating the continuous
process model is an important task for future work. Optimization of
the start-up manipulation and auto-control of the continuous process
are other essential issues that should be considered in the near future.
Authors: Alejandro F Villaverde; Fabian Fröhlich; Daniel Weindl; Jan Hasenauer; Julio R Banga Journal: Bioinformatics Date: 2019-03-01 Impact factor: 6.937