Literature DB >> 35252688

Mathematical Modeling for the Industrial 2-Mercaptobenzothiazole Batch Production Process.

Enzhi Liang¹, Song Zhang², Bin Liu², Bujin Qi², Yanpei Nie², Zhihong Yuan¹.

Abstract

As an important chemical intermediate, 2-mercaptobenzothiazole (MBT) is widely used in various processes, especially in the rubber industry. However, there is no first-principles model that describes the synthetic process of MBT. This paper focuses on the formulation of a reliable mathematical model represented by a series of differential and algebraic equations for the industrial batch MBT production process. It is difficult to estimate all of the unknown parameters in the model because of the lack of sufficient industrial/experimental data. Thus, a reduced estimable parameter set is derived by performing estimability analysis on the original estimation problem. A multiple-starting-point strategy is then applied to numerically solve the non-convex parameter estimation problem with the weighted least-squares approach. Subsequently, a cross-validation technique is employed to evaluate the generalizability of the proposed model. Finally, it is confirmed that the proposed model produces encouraging prediction performance with regard to independent test data.

Entities: Chemical

Year: 2022 PMID： 35252688 PMCID： PMC8892915 DOI： 10.1021/acsomega.1c06646

Source DB: PubMed Journal: ACS Omega ISSN： 2470-1343

Introduction

The chemical intermediate 2-mercaptobenzothiazole (MBT) is a crucial product in the rubber and automobile manufacturing industries.[1] Traditionally, MBT was extensively used as an accelerator in the rubber vulcanization process.[2] Later, the capability of MBT for inhibiting metal corrosion was discovered, and biological applications of MBT as a fungicide were also investigated.[3,4] In recent research studies, new functions of MBT and its derivatives have been disclosed that suggest it can be used in the modified Julia olefination reaction, which is an effective tool for advance fragment linkage.[5,6] MBT is also an essential precursor of many other popular rubber additives, such as 2,2′-dibenzothiazyl disulfide (MBTS) and N-cyclohexylbenzothiazole-2-sulphenamide (CBS). Furthermore, several widely used antibiotics, such as cephalosporins, can also be synthesized from the derivatives of MBT.[7] As the batch process is more flexible and closer to the market requirements than its continuous counterpart,[8] batch production of MBT with aniline as a reactant, usually referred to as the Kelly process, remains a popular choice.[9] The Kelly process charges the reactants, i.e., aniline, sulfur, and carbon disulfide, in a closed autoclave. The temperature is then raised to 220–250 °C and maintained for about 4 h.[10] During the reaction, the pressure of the autoclave increases to around 10 MPa as a result of the production of H2S. Under these tough reaction conditions, major manipulations of the industrial MBT production process still rely on the prior experiences of workers, which may lead to significant reductions in profits and could even cause safety problems. Therefore, it is essential to build a reliable process model for MBT and estimate the relevant process parameters for the industrial production process in preparation for further optimization, automatic control, and monitoring. Existing approaches for process modeling can be categorized into various levels according to the capabilities of the model in describing the relevant process; common examples include white-box models, gray-box models, and black-box models.[11] Among the various types of process models, the first-principles models, or fundamental mathematical models, theoretically provide the best representations of the dynamic characteristics and nonlinear behaviors of the chemical process.[12−14] Unfortunately, the rigorous reaction mechanism of MBT synthesis is very complicated, with at least 50 species separated and identified from the generated mixture.[5] The inter-reactions between the minor components have not yet been fully identified. Therefore, the establishment of a full and rigorous first-principles model that includes all reaction components is a huge challenge. Fortunately, the major reaction steps of MBT synthesis have been disclosed in recent research studies.[5,15] Thereby, kinetics of these key reactions can be mathematically formulated. Additionally, the less important reactions can be represented by a lump reaction. In this way, a complete theoretical model describing the kinetics, mass/energy balances, and equations of state can be established. Parameter estimation is one of the key components in establishing any model. In certain linear systems, where the output variables can be easily interpreted as explicit functions of the input variables and model parameters, parameter estimation is often referred to as explicit estimation.[16,17] In contrast, implicit estimation refers to estimation problems in which the outputs are implicitly dependent on the inputs and parameters.[18] Implicit parameter estimation is usually performed by solving an optimization problem that minimizes the mismatches between the model predictions and real datasets, with the parameters serving as the decision variables.[19,20] The implicit estimation strategy is applied in this study due to the nonlinearity of the reaction kinetics of the MBT production process. In recent decades, a great number of methods have been developed for implicit parameter estimation. Among these, the most popular one is the least-squares (LS) approach, in which the objective function is generally defined as a summation of the 2-norm residuals of output variables between model prediction and measurement.[13,17,21] However, the standard LS approach assumes that different output data have same contributions to the estimation of unknown parameters.[22] Therefore, the weighted least-squares (WLS) method was developed, whereby suitable weights are assigned to the relevant residuals.[19,23] The relatively compact form and high computational efficiency of the WLS approach have led to its widespread application to parameter estimation problems in various chemical processes.[23−25] The WLS approach assumes that the input variables can be precisely measured and that there are no errors in the input data. Nevertheless, errors in the input measurements cannot be ignored in a few industrial processes, such as the measurement errors of jacket temperature in the solid–liquid reactions.[26] In such cases, the errors-in-variables (EIV) method can better describe these processes, because all of the errors in the measured data, including the input data, are considered in the EIV formulation.[27,28] However, the EIV approach requires far more computational resources than WLS.[29] For example, in the parameter estimation problem of the batch MBT process where 18 experiments are available in the training dataset and each experiment records around 60 temperature points as input variables, the EIV approach has to consider the extra 18 × 60 measured temperature values as additional decision variables, which leads to unnecessary use of computational resources. Moreover, treating all the inputs as decision variables can lead to an ill-condition problem especially when these inputs are not informative enough.[29] Industrial batch MBT production is essentially a dynamic process that has no stable operation point. Hence, the corresponding mathematical model should incorporate ordinary differential equations and algebraic equations, giving a differential and algebraic equation (DAE) system. The parameter estimation for such a nonlinear DAE system will often encounter challenges such as non-convexity and ill-conditioning[30,31] where the solution is very sensitive to the variations in the data. The non-convexity of an optimization problem can lead to multiple local optima. Therefore, the multiple-starting-point (MSP) approach is applied to the estimation problem to improve global optimality. On the other hand, the difficulty of optimizing an ill-conditioned and over-fitted problem can be properly addressed by performing estimability analysis and reducing the number of parameters to be estimated.[32,33] In general, parameters that are coupled with each other and parameters that have no significant influence on the output data cannot be estimated, and are therefore removed from the estimation problem to avoid numerical intractability and unreliable solutions. Several effective techniques for improving ill-conditioned estimation problems and selecting the most informative parameters have been developed. For instance, the ridge regression approach regularizes the parameter estimation problem by adding a specific term to the LS term of the objective function.[34] Nevertheless, ridge regression cannot provide enough extra constructive information. Another approach is principal component analysis (PCA), a model-reduction technique that can identify and choose the most influential parameters to be estimated.[35] Specifically, PCA applies eigenvalue decomposition to the information matrix of the corresponding process and selects the largest eigenvalue that makes the greatest contribution to the system equations.[36] The original matrix is then projected into a space spanned by the selected eigenvectors. After repeating the eigenvalue decomposition and projection steps, the full space of the original matrix is replaced by the reduced space spanned by the eigenvectors with the largest eigenvalues. Hence, the computational burden of the original problem can be significantly reduced. Nevertheless, PCA can only be applied to linear or almost-linear systems, while the estimation problem for the MBT synthesis process is highly nonlinear. A third technique is global sensitivity analysis (GSA). GSA can naturally incorporate the parameter information into the sensitivity analysis to describe the parameter behaviors over a large range.[33,37] Hence, GSA can evaluate the parameter estimability over the whole space. However, the evaluation of the global sensitivity matrix requires the computation of a complicated multi-dimensional integral term.[37] Furthermore, the experimental data cannot be used directly and need to be reformulated before applying the global sensitivity technique.[33] Another technique is parameter estimability analysis based on the local sensitivity matrix, which has been widely studied and applied in the past decades[26,32,38] and is used in this work. The overall research methodology of this study is displayed in Figure . Based on the prior experiments and existing literature, the reaction mechanism for parameter estimation is disclosed. Then, the dynamics of the process and corresponding parameter estimation problem are formulated. The estimability analysis is conducted to improve the condition number of the estimation problem. Subsequently, the estimation problem is solved and the selected unknown parameters are revealed. Finally, the established model is validated by independent experiment data and industrial data.

Figure 1

Flowchart of overall research methodology.

Flowchart of overall research methodology. We concentrate on developing a reliable first-principles mathematical model and producing optimal estimates of the relevant parameters for the industrial MBT batch production process. The main scientific contributions can be described as follows: The mathematical model is formulated for the industrial MBT batch production process. Parameter estimability analysis is conducted to select the estimable parameters and improve the condition number of the parameter estimation model for MBT production process. The nonconvex parameter estimation problem is solved by the MSP approach to get the optimal parameters for the model of MBT production process. The quality of the proposed model is evaluated using independent industrial/experimental datasets. The remainder of this paper is arranged as follows. In Section , the reaction mechanism and the model equations are introduced. Section then discusses the parameter estimation model and estimability analysis approach. In Section , the unknown parameters are estimated and the model is validated. Finally, a brief conclusion to this study is presented in Section .

Reaction Mechanism and Model Development

The overall reaction mechanism of MBT synthesis can be expressed as follows: The MBT yield in the current industrial batch process is usually 80–90% based on aniline. At present, the complete steps of the elementary reactions are still under exploration.[5] In general, the intermediates associated with MBT synthesis can be categorized into two groups, namely, the desired group that generates MBT and the undesired group. Specifically, the most crucial intermediates in the desired group are N,N-diphenylthiourea (DPTU) and N-phenyl-2-benzothiazolamine (ABT). Both intermediates take part in the major reaction steps for synthesizing MBT.[15,39] The undesired group includes those byproducts that finally and irreversibly form tar. As the real composition of tar is very complex, a virtual component is used to represent it in our work, the formula for which is derived based on element conservation. Note that the most abundant byproduct, 1,3-benzothiazole (BT) can also be categorized as an undesired product because the generation of more BT usually leads to less production of MBT. Based on previous studies on the major reaction steps of MBT synthesis,[5,15,39] the dominant reaction path can be described as shown in Figure . For brevity, the “smaller” components, such as sulfur, CS2, and H2S, are not displayed in Figure .

Figure 2

Main pathway of the MBT synthesis reaction.

Main pathway of the MBT synthesis reaction. According to Figure , the MBT synthesis pathway can be discomposed into the following three cascade reactions: The byproduct BT can be generated from the following side reaction, which is much slower than the main reactions: The lump reaction ( represents the complex inter-reactions between the undesired byproducts, which will irreversibly generate tar:where the molar weight of the virtual tar component and the stoichiometric coefficients a, b, c, and d in reaction ( can be derived by conservation: a = 0.41, b = 0.37, c = 0.82, and d = −0.168. The traditional apparent Arrhenius equation is defined as follows:where i = 1, ..., 5 indicate the indexes of the reaction, j = 1, ..., 9 represent the indexes of the considered components, A0 denotes the Arrhenius coefficient, E represents the activation energy, and α is the reaction order of component j. In this work, the equivalent re-parameterized form of the apparent Arrhenius equation is given in eq 2. This is used to reduce the computational complexity because the orders of magnitude of the kinetic constants A0 and E in the traditional expressions are unbalanced.[40]where T is the reference temperature and a and b are the re-parameterized kinetic parameters that can be defined as: Note that b > 0 holds in most cases because E> 0, while the sign of a is often dependent on the choice of reference temperature T. In this work, the reference temperature is selected as 273.15 K to let a < 0 hold and avoid the change of a from plus to minus. In this way, the sensitivities of model variables with respect to a can be evaluated without extra consideration of the sign change. Moreover, an increase in b will lead to a faster reaction rate because T is much lower than the reaction temperature T. In addition, a and b are dimensionless parameters. Importantly, notwithstanding that the batch reactor has vapor and liquid phases, reactions can only take place in the liquid phase. Based on the empirical equations of state, the average density of the liquid reactant shows no significant change before and after the reaction. As massive amounts of CS2 and H2S dissolve in the liquid phase under high pressure, the change of mass in the liquid phase is negligible compared with the overall mass. As a consequence, the liquid volume V can also be seen as a constant during the batch process. Therefore, the concentrations of the heavy components, including aniline, sulfur, BT, MBT, ABT, DPTU, and tar, can be expressed as follows:where n is the molar amount of the heavy component. The light components, i.e., CS2 and H2S, exist in both the liquid and vapor phases. Hence, the effective molar amounts of CS2 and H2S in the liquid phase that can participate in the reactions are lower than their total abundances. In general, the molar fractions of CS2 and H2S in a liquid–vapor system change with different compositions, reactor pressures, and temperatures. Thus, the relationships between the fractions of light components in the vapor phase, x*, , and the related variables can be described aswhere x∈ [0, 1] and x∈ [0, 1] represent the molar fractions of CS2 and H2S in the vapor phase, respectively. The molar amounts of CS2 and H2S in the vapor phase can be computed from n = nx and n = nx. Note that x + x = 1 and x + x = 1 always hold during the reaction process. As the overall compositions, pressures, and temperatures change with time, the molar fractions of CS2 and H2S in the vapor phase also change over time. Hence, the rigorous form of eq is difficult to measure. Fortunately, the change in the overall composition is related with the reactor pressure, because an increase in the reactor pressure indicates the generation of the terminal product H2S. A higher pressure is correlated with more products and less remaining reactants. In this way, the compositions of all the components can be expressed as an implicit function related to pressure. Hence, eq can be simplified as Furthermore, eq can be approximated by the following linear expressions:where x1 and x2 indicate the intercepts of the affine approximation, while k, k, k, and k denote the slopes. Note that the values of the slopes and intercepts can be obtained offline via the equations of state and the empirical formula. Therefore, the concentrations of CS2 and H2S in the liquid phase are defined by As mentioned above, the concentrations of the components change over the reaction process. Taking a mass balance, the change of concentration with respect to time can be defined aswhere γ are the stoichiometric coefficients and n denotes the molar amount of component j. In addition, the total reactor pressure is equal to the pressure of the vapor phase, which can be calculated by the pressure–volume–temperature (PVT) relationship aswhere Z is the compression factor and V denotes the volume of the vapor phase. As the total volume of the reactor is fixed and the volume of the liquid phase can be regarded as a constant, the vapor volume can also be assumed to be constant. The mathematical model of the industrial MBT batch production process is summarized in Table . The model contains 9 ordinary differential equations with nine initial conditions and 21 algebraic equations. The model has 31 time-varying variables, as listed in Table . Note that n is a sub-vector of n excluding CS2 and H2S. As a result, the mathematical model has one degree of freedom, which is equal to the number of independent variables to subtract the number of independent equations.[12] For example, once the temperature profile is given, the whole reaction process should reach the same termination point under the same initial conditions. Importantly, the energy balance is not considered in the mathematical model because the batch reactor is a non-adiabatic reactor and no measured data relative to energy balance is available. In addition, the parameters of the proposed model are summarized in Table . To be specific, the reactor temperature is controlled by the power of an electrical heating belt and the heat transfer between the reactor and environment cannot be neglected since the reactions happen at around 250 °C.

Table 1

Mathematical Model of the MBT Batch Production Process

physical meaning	equations
kinetics
mass balances
concentrations of heavy components	(except CS₂ and H₂S)
concentrations of light components
PVT relationship
molar fractions of CS₂ and H₂S in the vapor phase	x_CS₂, vap = k_P1P + k_T1T + x₁, x_H₂S, vap = k_P2P + k_T2T + x₂
molar amounts of CS₂ and H₂S in the vapor phase	n_CS₂, vap = n_CS₂x_CS₂, vap, n_H₂S, vap = n_H₂Sx_H₂S, vap
total molar amounts of CS₂ and H₂S	n_CS₂ = n_CS₂, vap + n_CS₂, liq, n_H₂S = n_H₂S, vap + n_H₂S, liq

Table 2

Variables in the Proposed Model

variables	units	description	number of variables
r_i	mol·min^–1	reaction rates	5
n_j	mol	molar amounts of reaction component j	9
n_*, liq	mol	molar amounts of CS₂ and H₂S in the liquid phase	2
n_*, vap	mol	molar amounts of CS₂ and H₂S in the vapor phase	2
c_j	mol·L^–1	concentrations of components in the liquid phase	9
x_*, vap	/	molar fractions of CS₂ and H₂S in the vapor phase	2
P	MPa	reactor pressure	1
T	°C	reactor temperature	1
n_j, h	mol	molar amounts of the heavy component	7

Table 3

Parameters in the Proposed Model

parameter	units	description	number of parameters
a_i	/	re-parameterized kinetic parameters to be estimated	5
b_i	/	re-parameterized kinetic parameters to be estimated	5
α_ij	/	apparent reaction orders	10
γ_ij	/	stoichiometric numbers	24
k_P, k_T, x_*	/	parameters for computing molar fraction of CS₂ and H₂S in the vapor phase	6
Z_c	/	compression factor in the PVT relationship	1
R	mol·K^–1·L^–1	ideal gas constant	1
V_vap	L	volume of vapor phase	1
V_liq	L	volume of liquid phase	1
t_f	min	final time	1

Parameter Estimation Model and Estimability Analysis

Parameter Estimation Model

The overall parameter estimation problem of the DAE system representing the MBT production process can be defined aswhere y* denotes the measured value of output variable y at sampling time t in the mth experiment, y(t) denotes the model prediction for y*, Σ–1 is the inverse of the covariance matrix, and x and u represent the state and input variables in the mth experiment, respectively. denotes unknown parameters whose lower and upper bounds are defined as θLB and θUB. The whole DAE system contains n differential equations and n algebraic equations, where n denotes the number of batch experiments in dataset I. Solving the parameter estimation problem often gives point estimation. To further evaluate the accuracy of estimation, the confidence interval (CI) can also be evaluated. The CI of the nth parameter can be derived by the following definition[17,41] for a given significance level α:where θ* denotes the estimated value, F–1(P/v ) represents the value of the Student’s cumulative distribution function, P = 1 – α/2 is the probability, v is the number of degrees of freedom, and d is the nth diagonal element of (Z)−1 where Z is the sensitivity matrix.

Estimability Analysis via a Local Sensitivity Matrix

The estimability analysis reflects whether the fitted model with the selected parameter subset can properly depict the correlations between the output and input data.[42] To illustrate the estimability analysis procedure, an explicit DAE system is defined here aswhere denotes the state variable, denotes the output variable, is the input variable, and denotes the unknown parameter. The initial conditions are defined as x(0) = x0 and y(0) = y0. The sensitivity of the kth state variable x with respect to the pth unknown parameter θ at sampling time t is defined as The analytical form of the sensitivity coefficient can be given as adjoint sensitivity[43] or direct sensitivity formulation.[44,45] In this work, the direct sensitivity is chosen: Based on the chain rule, the sensitivity coefficient that reveals the relationship between the lth output variable and the nth unknown parameter can be derived as The sensitivity coefficients are usually scaled because the magnitudes of the parameters may be different. The scaled sensitivity can be expressed aswhere θ̅ denotes the parameter point that is used for sensitivity evaluation, while y̅(t) represents the measured data of the output variable at time t. The absolute value function | · | is used in eq because the scale factor might be negative. Thus, the scaled sensitivity matrix of output variable vectors with respect to parameter vectors at t can be expressed aswhere represents the scaled sensitivity. Note that the outputs may not be measurable at every sampling moment. In this case, the corresponding sensitivity should be fixed to zero. The entire sensitivity matrix Z at every sampling moment t ∈ {t0, ..., t} can be defined as The full sensitivity matrix Z has nθ columns and (N + 1)n rows. Importantly, the column rank of matrix Z indicates the maximum number of parameters that can theoretically be identified.[46] Furthermore, the well-conditioned property of the parameter estimation problem can be ascertained by the condition number of Z. If the condition number of Z is very large, the estimation problem is ill-conditioned and will usually not be estimable. The estimability of the parameters can be sorted based on their influence on the measured data and whether these parameters are correlated with each other.[26] Therefore, the parameter ranking policy mainly concentrates on the magnitudes and linear dependencies of the columns of Z. In the following, two parameter ranking strategies are introduced. Both ranking methods will be used in the estimability analysis in this study. The first approach iteratively regresses the full sensitivity matrix Z with the columns of the most predictable parameters to determine the estimable rankings of the parameters.[32] The parameter rankings interpret the estimability of the corresponding parameters, from that with the greatest influence on the model outputs to that with the least influence. The detailed procedure is displayed in Figure , and the corresponding steps are described as follows:[32]

Figure 3

Flowchart of the first estimability analysis approach (iterative regression).

Initialize the loop with k = 1, residual matrix R0 ≔ Z, and define the initial selected column X0 as an empty matrix. Calculate the 2-norm for every column of the residual matrix R. Search for the column with the largest 2-norm and remark the column as C. The corresponding parameter is selected and ranked after the parameter chosen in the previous iteration. The matrix denoting the already selected columns is augmented as X = [XC] Calculate the prediction matrix Ẑ of the original matrix Z, such that . Calculate the updated residual matrix R ≔ Z – Ẑ. Output of the order of columns in X if the norm of residual matrix R is below the stop criteria ϵ. Else, repeat the steps (3)–(6). Flowchart of the first estimability analysis approach (iterative regression). The second approach is to apply individual variance contribution analysis based on QR decomposition.[38] In detail, the sensitivity matrix Z is decomposed by QR decomposition with column permutation aswhere P is the column permutation matrix, R is an upper triangular matrix, and Q is an orthogonal matrix. The matrix R is further decomposed aswhere D = diag [d1, ..., d] and R̃ is a unit triangular matrix in which all diagonal elements are 1. With the supposition that all the measurement errors obey Gaussian distribution, the individual variance contributions of the unknown parameters can be written aswhere r is the nth column of the inverse matrix of R̃. The workflow of this approach is displayed in Figure .

Figure 4

Flowchart of the second estimability analysis approach (QR decomposition).

Industrial Case Studies

The numerical problem was programmed on a personal computer with 16 GB RAM and an i7-8750H CPU @2.20 GHz. The estimation problem was implemented on the MATLAB 2018a platform. The control vector parameterization approach[47] was used to discretize the DAE system with a MATLAB ode45 solver for integration, as this approach generates a relatively small nonlinear programming (NLP) problem when incorporating experimental data. The resulting NLP was then solved by the IPOPT solver[48] with a MATLAB interface. The absolute and relative function evaluation tolerances of the IPOPT solver are set as 1 × 10–7, while the other settings are set by default. The optimization problem with multiple starting points was solved by the parallel computing toolbox of MATLAB with a parallel pool (12 workers) on local computers.

Estimation Model of the MBT Batch Production Process

The experiments were conducted in a medium-sized autoclave (volume 4.0 L) with the experimental devices shown in Figure . Industrial data were collected from a larger industrial autoclave (volume = 1.0 m3). The overall dataset contains 38 independent batches, including 28 batches of experimental data and 10 batches of industrial data. The whole dataset was divided into three subsets. The first subset, which served as the training dataset, contained 18 batches of experimental data for determining the values of the unknown parameters. The second subset, named test dataset I, contained 10 batches of experimental data, and the third subset, named test dataset II, contained 10 batches of industrial data.

Figure 5

Sketch of experimental equipment.

Sketch of experimental equipment. In both the experimental and industrial batches, the charged molar amounts of aniline and sulfur/CS2 solution in the autoclave were fixed. The reactor temperature and pressure were recorded every 5 min. The weight fraction of MBT in the liquid crude product was only measured after the whole batch process had terminated. Online sampling of the reactant is highly dangerous because of the toxic H2S in a high-pressure environment (∼10 MPa). Therefore, sampling for the outlet crude product could only be conducted after the vapor phase was completely exhausted. The weight fraction of MBT in the crude product was detected by high-performance liquid chromatography (HPLC). Note that the total mass of crude product from the industrial autoclave could not be measured. Additionally, because the vapor phase is exhausted in the industrial plant for treatment, the composition and molar amount of vapor could not be detected. The types of accessible data are summarized in Table . Because the temperature of the autoclave can be manipulated by adjusting the electric heating power directly, the whole temperature curve is regarded as an input variable. The other measured data should be regarded as output variables because they cannot be manipulated directly. Hence, the system has only one degree of freedom.

Table 4

Types of Overall Accessible Data

variable	experiment data	industrial data
charged amounts of reactants	input (fixed)	input (fixed)
temperature	input (varying)	input (varying)
pressure	output	output
weight fraction of MBT in the crude product	output	output
total mass of crude product	output	not measured
composition of exhaust	not measured	not measured

The measured outputs can be related with the state variables in the DAE system. Based on mass balance, the total mass of the charged raw material should equal the summation of the liquid mass and vapor mass during the reaction process. Moreover, no CS2 and H2S exists in the crude product. Therefore, the mass of the discharged crude product can be calculated bywhere m(t) denotes the mass of crude product, M denotes the molar weight of raw materials, and M denotes the vapor components. Moreover, the weight fraction of MBT in the crude product can be defined aswhere x(t) denotes the weight fraction of MBT. Finally, the objective function was derived using the WLS approach, with the gap between the model predictions and real output values minimized. The objective function is defined as In the MBT model, the molar fractions of CS2 and H2S in the liquid–vapor phases and the compression factor Z in the PVT equation can be obtained in advance from empirical formulas. In addition, the stoichiometric coefficients of the reactions are known. Furthermore, the apparent reaction orders are assumed to be one based on the results of prior experiments. Therefore, only the unknown kinetic parameters, i.e., a and b, have to be estimated in the full estimation problem. In the MBT production process, the average reaction rates of the main reactions are around 0.001–0.1 mol/L/min because the reaction duration is around 250 min and the MBT produced in the experiment is around 20 mol. In addition, the temperature should be higher than 200 °C; otherwise, the reaction cannot take place. Therefore, the apparent activation energy of the main reactions should be within [80, 160] kJ/mol. Correspondingly, the pre-exponential Arrhenius factor should be bounded within [108, 1012]. The reference temperature is chosen as 273.15 K. As a result, the re-parameterized kinetic parameters should have the following ranges: a∈ [−50, −10] and b∈ [35, 70]. In summary, the whole parameter estimation model for the MBT batch production process under the whole unknown parameter set is defined aswhere the index i denotes the reactions (i–v) and the decision variables are the unknown kinetic parameters a and b. The estimability analysis would be performed later to decide which unknown parameters could be estimated with higher accuracy. The parameter estimation problem for a nonlinear model usually has multiple local optima, although only the global optimum can guarantee the most precise approximation to the true values of the unknown parameters. Therefore, the MSP method is applied to solve the estimation problem for better global performance. Each start point can reach an independent extremum on the local convex region under a gradient-based approach.[30] Although the global optimum cannot be rigorously guaranteed, this approach is often sufficient to reach a satisfying solution within an acceptable computation budget.[49] Moreover, compared with meta-heuristic approaches, the MSP approach is much easier to program and parallelize to further enhance computational efficiency.[50]

Estimability Analysis for the MBT Batch Production Process

The local sensitivities were evaluated at the predefined parameter points θ̅, namely, the estimates of the unknown parameters a and b. In this work, a feasible guess θ̅1 was carefully chosen from within the full parameter set, while the second guess θ̅2 is a local optimum parameter point obtained by solving the estimation problem eq with one batch of experiment data as a prior. The ranks and condition numbers of the scaled local sensitivity matrixes evaluated at θ̅1 and θ̅2 are provided in Table . The condition number of matrix Z is defined as cond(Z) = ∥Z∥ · ∥(Z)−1∥. If the condition number is higher than 1 × 1010, then the relative matrix is seen as ill-conditioned.[36] In the ideal case, it is assumed that the concentration profiles of all components can be measured during the reaction. The ideal case can identify the model quality, i.e., whether the proposed model is over-parameterized or not. On the other hand, in real cases, only the final composition of MBT in the crude product can be obtained, representing the actual estimability under the full unknown parameter set.

Table 5

Ranks and Condition Numbers of the Local Sensitivity Matrix under a Full Parameter Set

	ideal case		real case
	parameter point θ̅₁	parameter point θ̅₂	parameter point θ̅₁	parameter point θ̅₂
rank(Z)	10	10	10	10
cond(Z^TZ)	1.2 × 10¹⁵	1.1 × 10¹⁵	3.2 × 10⁵	4.2 × 10⁵

The full column ranks of the sensitivity matrixes show that all the unknown kinetic parameters are linearly independent. Hence, a and b can be uniquely determined in theory. Moreover, in the ideal case, the condition number of Z is less than 1 × 1010, demonstrating that the original parameter estimation problem is well-conditioned, which proves that the proposed model is not over-parameterized when sufficient concentration data are available. Nevertheless, in the real case, we only have the final weight fraction of MBT. Thereby, the condition number of Z increases drastically to 1.1 × 1015. In the real case, it is not possible to determine all unknown a and b. As a consequence, the full unknown parameter set should be reduced to effectively exploit the available data and regain a well-conditioned optimization problem. The parameter rankings and individual variance contributions given by the estimability analysis are presented in Table . The parameter rankings were produced by iterative regression, while the individual variances were calculated by QR decomposition. The parameter ranking is consistent with the ascending order of individual variance contributions because a higher individual variance contribution implies lower estimability.[26]

Table 6

Estimability Analysis with Local Sensitivity under the Full Parameter Set

	parameter point θ̅₁			parameter point θ̅₂
parameter	value	variance contribution	parameter ranking	value	variance contribution	parameter ranking
a₁	–40	1.6 × 10^–4	1	–32.6	3.2 × 10^–4	1
a₂	–25	2.5 × 10⁰	5	–28.8	5.9 × 10^–1	4
a₃	–25	8.0 × 10^–1	4	–25.8	1.5 × 10^–3	2
a₄	–25	2.1 × 10^–3	2	–31.2	5.4 × 10^–1	3
a₅	–50	4.0 × 10^–1	3	–43.2	9.8 × 10^–1	5
b₁	70	6.8 × 10¹	6	57.3	6.9 × 10¹	6
b₂	45	2.7 × 10³	7	51.9	7.8 × 10²	7
b₃	45	7.9 × 10⁵	8	44.9	2.3 × 10⁵	8
b₄	45	4.6 × 10⁸	9	53.6	7.6 × 10⁷	9
b₅	70	1.2 × 10¹¹	10	68.1	1.1 × 10¹¹	10

According to Table , the re-parameterized pre-exponential factors a are more estimable than the re-parameterized activation energies b. More importantly, the individual variance contributions of b3, b4, and b5 are significantly larger than for the other activation energies. This is because b3, b4, and b5 are the re-parameterized activation energies of reaction (, side reaction (, and the lump tar reaction (, which have less influence on the output data than the other unknown parameters. Thus, b3, b4, and b5 were marked as the least-estimable parameters and excluded from the full parameter set (i.e., they will not be determined in the estimation problem). Based on the values of parameter point θ̅2, they were fixed as b3 = 45, b4 = 54, and b5 = 68, equivalent to apparent activation energies of 100, 120, and 155 kJ/mol, respectively. After removing b3, b4, and b5, the full parameter set becomes the estimable parameter set. The parameter points θ̅1′ and θ̅2′ are produced by dropping b3, b4, and b5 from θ̅1 and θ̅2. The ranks and condition numbers of the local sensitivity matrixes Z′ evaluated at θ̅1′ and θ̅2′ in the real case are given in Table . The condition numbers of the sensitivity matrixes are less than 5 × 106, representing a significant improvement of several orders of magnitude. This implies that a well-conditioned parameter estimation problem is formulated under the estimable parameter set.

Table 7

Ranks and Condition Numbers of the Local Sensitivity Matrix under an Estimable Parameter Set

	parameter point θ̅₁^′	parameter point θ̅₂^′
rank(Z^′)	7	7
cond((Z^′)^TZ^′)	4.9 × 10⁶	4.6 × 10⁶

Local Sensitivity Analysis for the MBT Batch Production Process

Local sensitivity analysis can disclose some important characteristics of the system. The sensitivities of the output data with respect to the estimable parameters at θ̅1′ are listed in Table . Note that higher values of a and b indicate a higher reaction rate. Therefore, the values of , , , and are positive because higher reaction rates in reactions ( and (ii) lead to more ABT and produce more MBT. Similarly, > 0 and is significantly larger than other derivatives because MBT is directly produced in reaction (iii). In contrast, 0 because reaction ( is competing against reaction (, and increasing the rate of reaction ( leads to lower MBT production. Analogously, 0 as tar reaction ( consumes MBT. However, the sensitivity of m(t) with respect to all the kinetic parameters is positive because the mass of crude product does not include the mass of CS2, which is gradually consumed during the reaction.

Table 8

Scaled Local Sensitivity Matrix of x(t) and m(t) with Respect to Parameters at θ̅1′

	output variables
parameter	x_MBT(t_f)	m_cp(t_f)
a₁	2.0 × 10^–1	2.0 × 10^–1
a₂	1.7 × 10^–1	3.5 × 10^–2
a₃	1.4 × 10⁰	4.7 × 10^–2
a₄	–1.3 × 10⁰	6.8 × 10^–3
a₅	–8.0 × 10^–1	1.3 × 10^–1
b₁	1.7 × 10^–1	1.7 × 10^–1
b₂	1.5 × 10^–1	3.0 × 10^–2

The profiles of local pressure sensitivities with respect to the estimable parameters at θ̅2′ are displayed in Figure . Obviously, the vapor pressure is primarily determined by the total amount of H2S and secondarily by the volume of CS2. Therefore, and are significantly higher than the other values, because reaction ( is the rate-determining step that dominates the total generation rate of H2S. Moreover, and are positive because H2S is also produced by reaction (. Furthermore, indicates that the massive consumption of CS2 in reaction ( contributes to a reduction in the vapor pressure based on the PVT relationship. Finally, and are negative because side reaction ( and tar reaction (v) directly consume H2S.

Figure 6

Scaled sensitivities at parameter point θ̅2′.

Estimation Results

The problem scale and computational performance are listed in Table . The WLS approach was applied for parameter estimation with 18 batches of experimental data. The weights of objective functions are set as w1 = 20, w2= 10, and w3 = 1. There were seven estimable parameters to be solved. A total of 200 random starting points were used to achieve better global optimality, which are generated under uniform distribution within the predefined bounds. The maximum computation time for the optimization on each starting point was set to be 3.6 × 103 s. The average solving time for each starting point that successfully converged to a local extremum was around 800 s. Because the DAE system is highly nonlinear, some starting points may not return a solution within the maximum computation time or may even converge to an infeasible solution. In detail, 85 points converged to local optima, while the other 115 points resulted in non-convergence or infeasible solutions. Among the 85 converged solutions, the smallest objective value was 4.47 × 101, which is regarded as the approximated global optimum.

Table 9

Problem Scale and Computational Performance

parameters	value
starting points	200
time intervals	60
decision variables	7
total CPU time (s)	1.03 × 10⁴
objective value	4.47 × 10¹

The local extremes and global minimum given by the 85 converged solutions are portrayed in Figure . The blue points indicate local optima, while the red dashed line represents the value of the global optimum. Most of the feasible local solutions guarantee weighted residuals of less than 50, which is close to the global optimum. Furthermore, the objective values of several local optimums are close to the global optimum as shown in Figure , which implies that the MSP strategy can indeed find the global optimum of this problem.

Figure 7

Local optima and global optimum given by the MSP approach.

Local optima and global optimum given by the MSP approach. The estimated results for the re-parameterized kinetic parameters are provided in Table . The reliability factor can be computed by the length of the CI and the estimates.[26] Small reliability factors indicate narrow CIs, which further illustrate that the estimated result is close to the true value under a given confidence level of 90%. The activation energies and pre-exponential factors can be obtained via the parameters listed in Table . It is indicated that the reaction ( is the limiting step, which is consistent with the sensitivity analysis shown in Figure . Note that b3, b4, and b5 are fixed.

Table 10

Overall Estimated Results for the MBT Production Process

parameter θ_n	confidence interval	reliability β_n
a₁	–32.49 ∓ 0.37	0.0113
a₂	–20.98 ∓ 1.80	0.0852
a₃	–22.54 ∓ 0.23	0.0077
a₄	–34.96 ∓ 0.20	0.0080
a₅	–43.13 ∓ 0.16	0.0038
b₁	57.18 ∓ 0.44	0.0078
b₂	44.03 ∓ 2.20	0.0494
b₃^#	45	/
b₄^#	54	/
b₅^#	68	/

The average relative error between the model prediction and the data was used to evaluate the overall fitting performance of the proposed model. This average error can be defined aswhere Δy denotes the average relative mismatch between the lth fitted output y and the lth measured output y* recorded in the mth experiment in dataset I, which contains n batches of data. Analogously, the largest mismatch is defined aswhere (Δy)max is the maximum relative gap between the fitted and measured values of lth output y. The overall data fitting performance is summarized in Table . The fitting of x(t) is very accurate, with an average mismatch of just 0.38% and the largest relative error of only 1.28%. The prediction of m(t) and the reactor pressure is also satisfactory, with average biases of 0.95 and 1.66% and maximum deviations of 2.50 and 3.23%, respectively. It is clear that the proposed model is highly accurate and the estimated parameters are in good agreement with the known training data.

Table 11

Prediction Deviations of the Training Dataset under Estimated Parameters

outputs	average error Δy_l®	maximum error (Δy_l)_max
x_MBT(t_f)	0.38%	1.28%
m_cp(t_f)	0.95%	2.50%
pressure	1.66%	3.23%

The fitting results of all experiments in the training set are plotted in Figures and 9. In Figure , the blue dotted line denotes the fitted values of x(t), while the red dashed line represents the experimental data. The gray area represents the associated confidence region of x(t) under the parameter’s confidence interval. Obviously, the actual measurements are very close to the predicted x(t) within the corresponding confidence interval. In Figure , the blue dashed line with squares indicates the predictions of m(t), and the red line denotes the actual values. Notwithstanding that the mismatches between the predictions and measurements of m(t) are slightly higher than those for x(t), the fitting performance is still very encouraging since all the measured results are within the 90% confidence interval.

Figure 8

Data fitting of weight fractions of MBT x(t) in 18 training batches.

Figure 9

Data fitting of the scaled mass of crude product m(t) in 18 training batches.

Data fitting of weight fractions of MBT x(t) in 18 training batches. Data fitting of the scaled mass of crude product m(t) in 18 training batches. The pressure profiles of the fitted and experimental data in the training set are plotted in Figure . The red line denotes the pressure data and the blue line represents the pressure predictions. Note that the reaction duration varies in the different experiments. As shown in Figure , the model predictions match the real data in most experiments. Even in the batch with the largest deviation, the gap between the prediction and the measured value is still less than 0.3 MPa. Therefore, the proposed model for MBT production accurately predicts the output measurements.

Figure 10

Data fitting of pressure curves in 18 training batches.

Model Validation with Experimental Data

Although our model has proven its ability to fit the data in the training dataset, it is essential to assess the prediction quality with independent data that have not been seen by the model so as to prove the generality of the model. Therefore, a test set containing 10 new batches of experimental data was used in a further simulation. The overall results are presented in Table . The average and largest mismatches of x(t) between the model prediction and experimental data in the independent test set are just 0.46 and 0.87%, respectively. This result is very satisfying because the prediction mismatches are only slightly higher than with the training data. Indeed, the maximum prediction errors of m(t), x(t), and the reactor pressure in the test dataset are slightly smaller than with the training dataset. It is obvious that the overall results using the test dataset are consistent with the results using the training dataset, confirming the generality of the established model.

Table 12

Prediction Deviations of the Experimental Batches in Test Dataset I

outputs	average error Δy_l®	maximum error (Δy_l)_max
x_MBT(t_f)	0.46%	1.13%
m_cp(t_f)	0.87%	2.46%
pressure	1.75%	3.23%

Detailed information about the model predictions and measurements of x(t) is presented in Figure . The measurements (red dotted line) are highly consistent with the predictions (blue dashed line) and within the confidence interval (gray area). The estimation quality of the total weight of the crude product can be seen in Figure . The predicted mass is very close to the experimental data in most cases within the relative 90% confidence interval.

Figure 11

Prediction of the weight fraction of MBT of 10 experimental batches in test dataset I.

Figure 12

Prediction of the scaled mass of crude product of 10 experimental batches in test dataset I.

Prediction of the weight fraction of MBT of 10 experimental batches in test dataset I. Prediction of the scaled mass of crude product of 10 experimental batches in test dataset I. Furthermore, the predicted pressure curves are compared with the experimental data in Figure . The predicted pressures are highly coherent with the experimental data in most batches. Even in the case with the largest pressure deviation, namely, experiment no. 6, the gap between the prediction and the measurement is less than 0.33 MPa.

Figure 13

Prediction of pressure curves of 10 experimental batches in test dataset I.

Model Validation with Industrial Data

The model established in this study was also applied to data from a real industrial batch process, and the gap between the predictions and measurements was examined. Note that the mass of crude product was not recorded in the industrial data. The overall results are listed in Table . Detailed comparisons of the model predictions and industrial data are displayed in Figures and 15. The average prediction errors of x(t) and the reactor pressure are slightly higher than with the experimental data, which might be caused by the effect of scale-up[51] and process uncertainties. For example, the real composition of S/CS2 might derivate from the nominal value as the S/CS2 solution is mixed in an upstream mixing tank without an auto control system. Therefore, such mismatches are still acceptable in real production. Moreover, even in the industrial batch that produced the maximum prediction error, the difference in the predicted value of x(t) does not exceed 1.58%, and the mismatch in the pressure is within 3.49%. Therefore, it can be concluded that the proposed model with estimated parameters shows high prediction accuracy for both experimental and industrial MBT batch production processes.

Table 13

Prediction Deviations of the Industrial Batches in Test Dataset II

outputs	average error Δy_l®	maximum error (Δy_l)_max
x_MBT(t_f)	0.74%	1.58%
pressure	2.05%	3.49%

Figure 14

Prediction of the weight fraction of MBT of 10 industrial batches in test dataset II.

Figure 15

Prediction of pressure curves of 10 industrial batches in test dataset II.

Prediction of the weight fraction of MBT of 10 industrial batches in test dataset II. Prediction of pressure curves of 10 industrial batches in test dataset II.

Conclusions

In this work, a first-principles model of the MBT batch production process has been established based on the existing mechanism of the main reaction pathway and the lump treatment of minor side products. Because of the lack of information regarding the concentration profile, the estimable parameter set was selected through estimability analysis using the local sensitivity matrix. The MSP technique was applied to the estimation problem to improve the global performance and ensure precise model predictions. Moreover, independent unseen data from experimental and industrial batches were used to confirm the prediction performance of the proposed model. In general, the established model illustrated that the reaction ( is the rate-determining step. In other words, more aniline in the discharged reactants can lead to a higher yield ratio of MBT. Such a conclusion is consistent with a real industrial process. Currently, it is extremely important to increase profits and improve safety by automatically controlling the chemical production process based on a reliable model. The proposed model will be applied for operation optimization and closed-loop control in real industrial plants. Although the established model can describe the MBT production process with high accuracy, uncertainty cannot be avoided in process modeling and model reduction. Furthermore, other disturbances and measurement errors might also affect the real processes. Therefore, future work will focus on the dynamic optimization of the MBT production process under uncertainties. In addition, it is widely accepted in the chemical industry that MBT synthesis should be transformed from a batch process to a continuous process to enhance production efficiency and achieve better dynamic performance. Hence, formulating the continuous process model is an important task for future work. Optimization of the start-up manipulation and auto-control of the continuous process are other essential issues that should be considered in the near future.

7 in total

1. Estimating the kinetic parameters of activated sludge storage using weighted non-linear least-squares and accelerating genetic algorithm.

Authors: Fang Fang; Bing-Jie Ni; Han-Qing Yu
Journal: Water Res Date: 2009-01-18 Impact factor: 11.236

Review 2. Parameter estimation and optimal experimental design.

Authors: Julio R Banga; Eva Balsa-Canto
Journal: Essays Biochem Date: 2008 Impact factor: 8.000

3. Spreadsheet method for evaluation of biochemical reaction rate coefficients and their uncertainties by weighted nonlinear least-squares analysis of the integrated monod equation

Authors:
Journal: Appl Environ Microbiol Date: 1998-06 Impact factor: 4.792