
Modelling of amorphous cellulose depolymerisation by cellulases, parametric studies and optimisation.

Hongxing Niu¹, Nilay Shah¹, Cleo Kontoravdi¹.

Abstract

Improved understanding of heterogeneous cellulose hydrolysis by cellulases is the basis for optimising enzymatic catalysis-based cellulosic biorefineries. A detailed mechanistic model is developed to describe the dynamic adsorption/desorption and synergistic chain-end scissions of cellulases (endoglucanase, exoglucanase, and β-glucosidase) upon amorphous cellulose. The model can predict evolutions of the chain lengths of insoluble cellulose polymers and production of soluble sugars during hydrolysis. Simultaneously, a modelling framework for uncertainty analysis is built based on a quasi-Monte-Carlo method and global sensitivity analysis, which can systematically identify key parameters, help refine the model and improve its identifiability. The model, initially comprising 27 parameters, is found to be over-parameterized with structural and practical identification problems under usual operating conditions (low enzyme loadings). The parameter estimation problem is therefore mathematically ill posed. The framework allows us, on the one hand, to identify a subset of 13 crucial parameters, of which more accurate confidence intervals are estimated using a given experimental dataset, and, on the other hand, to overcome the identification problems. The model's predictive capability is checked against an independent set of experimental data. Finally, the optimal composition of cellulases cocktail is obtained by model-based optimisation both for enzymatic hydrolysis and for the process of simultaneous saccharification and fermentation.


Keywords: Cellulase; Cellulose; Kinetic parameters; Modelling; Optimisation; Uncertainty

Year: 2016 PMID: 26865832 PMCID: PMC4705870 DOI: 10.1016/j.bej.2015.10.017

Source DB: PubMed Journal: Biochem Eng J ISSN: 1369-703X Impact factor: 3.978

Nomenclature

accessible binding sites on cellulose unoccupied by enzymes (mmol sites/L)
covariance matrix of estimated parameters
variances in model outputs associated with simultaneous changes in the parameters θ1,…,p and in all the parameters, respectively
diagonal elements of a matrix
initial polymerization degree of cellulose substrate
E = endoglucanase, exoglucanase, β-glucosidase, respectively
E′ = endoglucanase, exoglucanase
activation energy (kJ/mol)
enzyme loading (g/L)
concentration of free enzyme in liquid phase (g/L)
concentration of total enzyme in liquid phase (g/L)
concentration of free enzyme in solid phase (g/L)
concentration of total enzyme in solid phase (g/L)
substrate-enzyme complex (g enzyme/L)
fraction of accessible β-glucosidic bonds
concentration of glucose, cellobiose, cellotriose (mmol/L), respectively
concentration of cellulose polymer with polymerization degree i (mmol/L)
inhibition constant of glucose, cellobiose, cellotriose (g/L), respectively
sum of squares of residuals between simulations and measurements
adsorption equilibrium constant (L/mmol sites)
equilibrium constant of enzymatic hydrolysis (mmol β-glucosidic bonds/L)
“apparent” reaction constant (10³ mmol/(g enzyme × h))
“intrinsic” reaction constant (10³ mmol/(g enzyme × h))
adsorption rate constant (L/(mmol sites × h))
desorption rate constant (h−1)
enzyme molecular weight (g/mmol)
number of model variables, data points, parameters, respectively
relative values between first-order and total effect sensitivity indices
correlation coefficient between the estimated parameters θi and θj
production rate of Gi (mmol/(L × h))
first-order, total effect sensitivity indices with respect to parameter i, respectively
weighting matrix
conversion of cellulose GN to soluble sugars (% C/C)
measured variables, estimated variable values, respectively
significance level for t-test
number of cellobiose lattice occupied by one molecule of enzyme (mmol sites/mmol enzyme)
parameter vector, lower and upper bounds, initial guesses, respectively
confidence intervals of parameters at αt significance level
reference values for corresponding parameters, cited from literature
first-order deactivation constant of enzyme (1/h)
binding capacity of substrate (mmol sites/mmol β-glucosidic bonds)
error variance of jth measurement

Introduction

Enzymatic hydrolysis of cellulosic materials to produce reducing sugars has long been pursued for its potential to provide abundant food and energy resources. It is a multi-step process that takes place in a heterogeneous reaction system [1], in which insoluble cellulose is initially broken down at the solid-liquid interface (with enzyme adsorption/desorption) via the synergistic actions of endoglucanases (EC 3.2.1.4) and exoglucanases (EC 3.2.1.91). This initial degradation is accompanied by further liquid-phase hydrolysis of soluble intermediate products, i.e., short cellulose oligosaccharides and cellobiose, which are catalytically cleaved to produce glucose by the action of β-glucosidase (EC 3.2.1.21). Mechanistic understanding of the overall hydrolysis system is therefore valuable for designing rational approaches to enzymatic hydrolysis and subsequent fermentation processes. However, the complexity of the system, which arises from the concerted action of several enzymes on a solid substrate/mixture in a heterogeneous system, makes experimental kinetic studies very difficult. Accordingly, although many models of enzymatic hydrolysis have been developed over the past decades, most of them are empirical, data-driven correlations and as a result are only applicable to specific cases/conditions [2], [3], [4], [5], [6], [7], [8]. Generally, they: (1) lump the different cellulolytic enzymes together as a single catalyst; (2) treat the cellulose mixture as a single bulk concentration; (3) simplify the reaction system as a homogeneous one, i.e., without considering enzyme adsorption onto and desorption from solid particles; and (4) lack analysis of model identifiability and parameter uncertainty [9]. These approaches are summarised in recent reviews of enzymatic hydrolysis of (ligno)cellulose [10], [11].
Efforts to propose mechanistic models have been made to enhance understanding of the enzymatic hydrolysis of cellulose [12], [13], [14], [15], [16], [17], [18], [19], [20], [21], [22], [23], [24]. However, they mainly lack thorough parametric studies and experimental validation. Consequently, the predictability of the models, especially for extrapolations, is still in doubt. At the same time, studies to investigate fundamental mechanisms of random hydrolysis (random chain scission) and processive hydrolysis (chain-end scission) of polymers have been carried out extensively using population balance modelling [25], [26], [27], [28], [29], [30]. Population balance modelling involves tracking the numbers of entities and behaviour of a population of particles based on the analysis of the behaviour of single particles in local conditions [31], [32]. The results provide clues to the underlying mechanisms of the enzymatic hydrolysis process of cellulose. Following the above advances, this work develops a mechanistic depolymerisation (scission) model of enzymatic hydrolysis, taking into account the enzyme adsorption/desorption processes in this heterogeneous system. Furthermore, a systematic sensitivity analysis-based method is proposed for parametric studies, model reduction and verification with published experimental data. Finally, the model’s predictive capability is checked against an independent set of data and model-based optimisation studies are presented.

Model development

Model assumptions

Model development is based on the following assumptions:

There are several studies about how to modify and improve the various physical properties of cellulose as a substrate, such as particle size, fibre structure, accessibility, crystallinity index, and amorphicity index [24], [33], [34], [35], [36], [37], [38], [39]. In this work, for simplification, the substrate is assumed to be completely amorphous (non-crystalline) pure cellulose that is well ground into a very fine powder.

The enzymatic hydrolysis takes place in a well-stirred tank reactor. As a result, there is no mass transfer limitation during enzymatic hydrolysis.

The binding probability of enzyme to one polymer molecule is proportional to the molecule’s polymerization degree. In other words, the enzyme has equal accessibility to every β-glucosidic bond of the polymer [21], [22].

The quasi-steady state approximation holds for any intermediate complex, i.e., the net rate of change of each enzyme-substrate complex concentration is taken to be zero.

The loading of cellulases (i.e., total loading of the three core enzymes) is low. More specifically, the loading is no more than 15 mg/g-glucan and 750 mg/L. Accordingly, interaction/crowding effects between different kinds of enzymes are negligible during adsorption/desorption. The adsorption/desorption of each kind of enzyme is described separately and considered reversible.

Cellulosic polymers with chain lengths of four or more are deemed to exist in the solid phase and cannot dissolve into the liquid. As shown in Fig. 1, solid particles (DP ≥ 4) are depolymerized by endoglucanases and exoglucanases, while soluble shorter polymers (namely cellotriose and cellobiose) are exclusively cleaved by β-glucosidases. This is a reasonable assumption if one compares the enzyme specific activities on insoluble and soluble substrates. It has been found experimentally that both endoglucanases and exoglucanases have relatively low catalytic activities upon soluble sugars compared with those upon solid cellulose [11].

Fig. 1

(a) Schematic representation of the concerted action of three cellulolytic enzymes, en (endoglucanase), ex (cellobiohydrolase) and bg (β-glucosidase), and of hydrolysis producing different cellulose chain lengths. Gi, starting substrate with DP i; Gi−j, Gj, and Gi−2, hydrolysed cellulose segments of DP i−j, DP j, and DP i−2 (insoluble intermediate products); G3, cellotriose; G2, cellobiose; G1, glucose. Action of en is represented by a full arrow (→), of ex by a dot-dash arrow, and of bg by a dashed arrow. Feedback inhibition by cellotriose, cellobiose and glucose is shown. (b) Action modes of the three enzymes during cellulolysis: random scission by endocellulase (endoglucanase); chain-end processive scission by exocellulase (exoglucanase) releasing cellobiose; processive scission by β-glucosidase on cellotriose and cellobiose producing glucose.

The enzymes are inhibited non-competitively by soluble sugars (Fig. 1); the inhibition by glucose increases in proportion to its concentration raised to the power of 3 [6]; additionally, enzyme activity decreases exponentially with time [40].

Dynamic adsorption/desorption of cellulases

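The reaction of the enzyme-substrate complex is initiated upon physical contact of endoglucanases and exoglucanases with the surface of an insoluble substrate. Equilibrium of enzyme adsorption/desorption is represented by the Langmuir isotherm model. The time required for equilibrium to be reached is relatively short compared to the hydrolysis time [20], [41], [42], [43]. Therefore, the adsorption/desorption is decoupled from the formation of the enzyme-substrate complex. The adsorption/desorption is represented by

$$C_S + E_L \underset{k_{de}}{\overset{k_{ad}}{\rightleftharpoons}} E_S, \qquad K_{ad} = \frac{k_{ad}}{k_{de}} = \frac{E_S}{C_S\,E_L} \tag{1}$$

where $C_S$ is the concentration of accessible binding sites on the β-glucosidic bonds of cellulose not covered by enzymes, $E_L$ and $E_S$ are the total concentrations of one type of enzyme in the liquid phase and adsorbed onto the solid phase, respectively, $K_{ad}$ is the adsorption equilibrium constant, and $k_{ad}$ and $k_{de}$ are the adsorption and desorption rate constants, respectively. The material balances of enzymes and accessible binding sites on cellulose are represented by Eq. (2) and Eq. (3), respectively,

$$E_{load} = E_L + E_S \tag{2}$$

$$\sigma \sum_i (i-1)\,G_i = C_S + \frac{2\alpha}{M}\,E_S \tag{3}$$

where $E_{load}$ is the loading of any type of enzyme, σ is the binding capacity of β-glucosidic bonds, $G_i$ represents the cellulose with polymerization degree i (so the total concentration in terms of “mmol β-glucosidic bonds/L” is $\sum_i (i-1)\,G_i$), 2α is the number of cellobiose lattice occupied by one molecule of enzyme, and M is the enzyme molecular weight. By eliminating the term $C_S$ from Eq. (1) and Eq. (3), it is deduced that

$$E_S = K_{ad}\,E_L \left( \sigma \sum_i (i-1)\,G_i - \frac{2\alpha}{M}\,E_S \right) \tag{4}$$

Furthermore, by eliminating the term $E_L$ using Eq. (2), Eq. (4) is transformed into

$$a\,E_S^2 - b\,E_S + c = 0, \quad a = K_{ad}\frac{2\alpha}{M}, \quad b = 1 + K_{ad}\,\sigma \sum_i (i-1)\,G_i + K_{ad}\frac{2\alpha}{M}E_{load}, \quad c = K_{ad}\,\sigma \sum_i (i-1)\,G_i\,E_{load} \tag{5}$$

By solving the above quadratic equation, we obtain the real root

$$E_S = \frac{b - \sqrt{b^2 - 4ac}}{2a} \tag{6}$$

taking into account $0 \le E_S \le \min\left(E_{load},\ \frac{M}{2\alpha}\,\sigma \sum_i (i-1)\,G_i\right)$. (The enzyme in the solid phase, $E_S$, should be non-negative and no bigger than either the total enzyme loading $E_{load}$ or the total binding capacity of the substrate.) $E_{load}$ and $G_i$ may change over time, so

$$E_S(t) = \frac{b(t) - \sqrt{b(t)^2 - 4a\,c(t)}}{2a} \tag{7}$$

where $b(t)$ and $c(t)$ are the coefficients of Eq. (5) evaluated with $E_{load}(t)$ and $G_i(t)$.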
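The adsorption balance above can be sketched in a few lines of Python. This is a minimal illustration, assuming the balance E_S = K_ad (E_load − E_S)(σ·B − (2α/M)·E_S), where B stands for the total β-glucosidic bond concentration Σ(i − 1)G_i. The function name, the argument names and the numerically stable way of writing the smaller root are choices made here for illustration, not taken from the paper.

```python
import math


def adsorbed_enzyme(e_load, bonds, k_ad, sigma, two_alpha, mol_weight):
    """Adsorbed enzyme E_S (g/L) from the Langmuir balance (hypothetical helper).

    e_load      total enzyme loading (g/L)
    bonds       sum of (i - 1) * G_i (mmol beta-glucosidic bonds/L)
    k_ad        adsorption equilibrium constant (L/mmol sites)
    sigma       binding capacity (mmol sites/mmol bonds)
    two_alpha   sites occupied per enzyme molecule (mmol sites/mmol enzyme)
    mol_weight  enzyme molecular weight (g/mmol)
    """
    sites = sigma * bonds               # accessible sites (mmol sites/L)
    per_gram = two_alpha / mol_weight   # sites covered per gram of enzyme
    a = k_ad * per_gram
    b = 1.0 + k_ad * sites + k_ad * per_gram * e_load
    c = k_ad * sites * e_load
    disc = b * b - 4.0 * a * c          # positive for non-negative inputs
    # Smaller root of a*E^2 - b*E + c = 0, written as 2c / (b + sqrt(disc))
    # to avoid cancellation; it also covers the limit a -> 0.
    return 2.0 * c / (b + math.sqrt(disc))
```

Writing the root as 2c/(b + √(b² − 4ac)) is algebraically the same as (b − √(b² − 4ac))/(2a) but avoids subtracting two nearly equal numbers when the loading is low.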

Enzymatic hydrolysis of cellulose

By endoglucanases (en)

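The random hydrolysis action by endoglucanases in the solid phase is described by Michaelis–Menten kinetics

$$E_{S,f}^{en} + G_i \rightleftharpoons E^{en}G_i \xrightarrow{k^{en}} E_{S,f}^{en} + G_{i-j} + G_j$$

where $G_i$ represents the cellulose with polymerization degree i, $E_{S,f}^{en}$ is the “not complexed” endoglucanases in the solid phase, $E^{en}G_i$ symbolizes the substrate-enzyme complex (in terms of “g endoglucanases/L”), and $k^{en}$ is the reaction constant. Endoglucanases break β-glucosidic bonds of cellulose randomly, so the total solid substrate quantity is $\sum_i (i-1)\,G_i$. The equilibrium constant is defined by [24],

$$K_{dis}^{en} = \frac{E_{S,f}^{en}\,F_a \sum_i (i-1)\,G_i}{\sum_i E^{en}G_i} \tag{8}$$

where $F_a$ is the fraction of accessible β-glucosidic bonds. Accordingly, Eq. (9) is obtained by writing Eq. (8) for a single polymer $G_i$

$$K_{dis}^{en} = \frac{E_{S,f}^{en}\,F_a\,(i-1)\,G_i}{E^{en}G_i} \tag{9}$$

Using the enzyme mass balance (Eq. (10)), we can eliminate the term $E_{S,f}^{en}$ in Eq. (8) and obtain Eq. (11)

$$E_S^{en} = E_{S,f}^{en} + \sum_i E^{en}G_i \tag{10}$$

$$\sum_i E^{en}G_i = \frac{E_S^{en}\,F_a \sum_i (i-1)\,G_i}{K_{dis}^{en} + F_a \sum_i (i-1)\,G_i} \tag{11}$$

From Eqs. (9) and (11), we further obtain

$$E^{en}G_i = \frac{E_S^{en}\,F_a\,(i-1)\,G_i}{K_{dis}^{en} + F_a \sum_j (j-1)\,G_j} \tag{12}$$

As shown in Eq. (12), the formation of one polymer-enzyme complex is proportional to the product of the total adsorbed enzyme and the total accessible β-glucosidic bonds of the polymer $G_i$. The overall production rate of $G_i$ equals the contributions of $G_k$ (k = i + 1, i + 2, …, N) minus the loss of $G_i$, where $G_k$ is cleaved by endoglucanases into a fragment of length i with a probability of $2/(k-1)$ [21], [24], i.e.,

$$r_{G_i}^{en} = k^{en} \left( \sum_{k=i+1}^{N} \frac{2}{k-1}\,E^{en}G_k - E^{en}G_i \right) \tag{13}$$

Additionally, the formation rate of soluble sugars dissolving into the liquid phase is obtained by

$$r_{G_i}^{en} = k^{en} \sum_{k=4}^{N} \frac{2}{k-1}\,E^{en}G_k, \qquad i = 1, 2, 3 \tag{14}$$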
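The random-scission bookkeeping described above can be sketched as follows. This is an illustrative reading, not the authors' code: it assumes that the complex with chain length i equals E_S·F_a·(i − 1)·G_i / (K_dis + F_a·Σ(j − 1)·G_j), and that a cut of a chain of length k yields a fragment of any shorter length with probability 2/(k − 1). The function name, the list layout (index = polymerization degree, index 0 unused) and the `min_solid_dp` argument are hypothetical.

```python
def endo_rates(G, e_s, k_cat, k_dis, f_a, min_solid_dp=4):
    """Rate of change of each chain length under random scission.

    G[i] is the concentration (mmol/L) of chains with polymerization
    degree i; G[0] is unused. Only chains with i >= min_solid_dp are
    treated as insoluble substrate. Units follow those chosen for k_cat.
    """
    n = len(G) - 1
    # Accessible bonds carried by each insoluble chain length.
    bonds = [0.0] * (n + 1)
    for i in range(min_solid_dp, n + 1):
        bonds[i] = f_a * (i - 1) * G[i]
    total_bonds = sum(bonds)
    # Assumed distribution of adsorbed enzyme over the chain lengths.
    complexes = [e_s * b / (k_dis + total_bonds) for b in bonds]

    rates = [0.0] * (n + 1)
    gain = 0.0  # running sum over k > i of 2 / (k - 1) * complex_k
    for i in range(n, 0, -1):
        rates[i] = k_cat * (gain - complexes[i])
        if i >= min_solid_dp:
            gain += 2.0 / (i - 1) * complexes[i]
    return rates
```

A quick sanity check on any such implementation is that scission conserves anhydroglucose units, so the sum of i × rate over all chain lengths should vanish, while the total number of chains grows by one per cut.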

By exoglucanases (ex)

Processive hydrolysis by exoglucanases is represented by

$$E_{S,f}^{ex} + G_i \rightleftharpoons E^{ex}G_i \xrightarrow{k^{ex}} E_{S,f}^{ex} + G_{i-2} + G_2$$

where $E_{S,f}^{ex}$ is the “not complexed” exoglucanases in the solid phase, and $E^{ex}G_i$ symbolizes the substrate-enzyme complex (g exoglucanases/L). There are generally two functionally different kinds of exoglucanases, namely CBH-I and CBH-II, which processively cut cellulosic polymers from the reducing and non-reducing chain ends, respectively [44], [45]. The total enzyme concentration in this case is the sum of CBH-I and CBH-II, assuming they have the same specific activity. Similar to Eq. (8), the equilibrium constant $K_{dis}^{ex}$ is defined by Eq. (15). Since exoglucanases act on cellulose by means of chain-end scission, the total substrate quantity in the solid phase is represented by $\sum_i G_i$ (compared with $\sum_i (i-1)\,G_i$ in Eq. (8)). The enzyme mass balance and substrate-enzyme complex concentration are obtained by Eq. (16) and Eq. (17), respectively. The formation rate of $G_i$ equals the depolymerisation of $G_{i+2}$ minus that of $G_i$ (Eq. (18)), and the production rates of cellotriose and cellobiose are obtained by Eq. (19) and Eq. (20), respectively.
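A hedged sketch of the chain-end scission bookkeeping is given below. The stoichiometry (each cut turns G_i into G_(i−2) plus one cellobiose) follows the description above. The expression used for the complex, E_S·G_i / (K_dis + ΣG_j), is only an assumption made here by analogy with the endoglucanase case, and the function and argument names are hypothetical.

```python
def exo_rates(G, e_s, k_cat, k_dis, min_solid_dp=4):
    """Rate of change of each chain length under chain-end scission.

    G[i] is the concentration (mmol/L) of chains with polymerization
    degree i; G[0] is unused. Each catalytic event removes one cellobiose
    (G_2) from an insoluble chain of length i >= min_solid_dp.
    """
    n = len(G) - 1
    chains = sum(G[min_solid_dp:])  # insoluble chains, i.e. chain ends on offer
    rates = [0.0] * (n + 1)
    for i in range(min_solid_dp, n + 1):
        # ASSUMPTION: adsorbed enzyme is shared in proportion to G_i.
        complex_i = e_s * G[i] / (k_dis + chains)
        flux = k_cat * complex_i    # G_i -> G_(i-2) + G_2
        rates[i] -= flux
        rates[i - 2] += flux
        rates[2] += flux
    return rates
```

With this bookkeeping, cellotriose arises only from G_5, and G_4 yields two cellobiose molecules, which matches the description of soluble products in the text.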

By β-glucosidases (bg)

Processive hydrolysis of soluble sugars (i.e., cellotriose and cellobiose) by β-glucosidase is described by

$$E_{L,f}^{bg} + G_i \rightleftharpoons E^{bg}G_i \xrightarrow{k^{bg}} E_{L,f}^{bg} + G_{i-1} + G_1, \qquad i = 2, 3$$

where $E_{L,f}^{bg}$ is the “not complexed” β-glucosidases in the liquid phase, and $E^{bg}G_i$ symbolizes the substrate-enzyme complex (g β-glucosidases/L). Like exoglucanases, β-glucosidases cleave glucose off cellotriose and cellobiose by processive hydrolysis. Following the analysis in Section 2.3.2, the relevant equations are obtained analogously. The production/consumption rates of cellotriose, cellobiose and glucose are calculated by Eq. (24), Eq. (25), and Eq. (26), respectively.

Decrease in enzyme activity

It is often observed that the effectiveness of cellulases is drastically reduced during hydrolysis. This phenomenon is still not well understood and remains an open research question that merits further study, such as [46], [47]. In this work, the possible reasons for the rate reduction are hypothesised to be inhibition by soluble sugars (Fig. 1(a)) and loss of enzyme activity. Accordingly, the reaction constants change into Eqs. (27) and (28), where $I_{G1}^{E'}$ (and $I_{G1}^{bg}$), $I_{G2}^{E'}$, and $I_{G3}^{E'}$ are respectively the inhibition constants of glucose, cellobiose, and cellotriose, and $k^{*E'}$ and $k^{*bg}$ represent the “intrinsic” reaction constants. The inhibition is considered to be non-competitive and the effect of glucose is taken to increase proportionally to its concentration raised to the power of 3 [6]. Additionally, assuming first-order kinetics of enzyme deactivation [40], the remaining relative enzyme specific activity after time t is $e^{-\lambda^{E} t}$, where $\lambda^{E}$ represents the first-order deactivation constant.
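To make the two rate-reduction mechanisms concrete, the sketch below combines first-order deactivation with a non-competitive inhibition factor. The deactivation term exp(−λt) follows directly from the first-order assumption. The exact algebraic form of the inhibition factor is an assumption made here (a standard non-competitive form with the glucose term cubed), and all function names are hypothetical. Sugar concentrations are taken in the same units as the inhibition constants (g/L).

```python
import math


def remaining_activity(t_hours, lam=0.0122):
    """Fraction of enzyme activity left after t_hours (first-order decay)."""
    return math.exp(-lam * t_hours)


def half_life(lam=0.0122):
    """Time (h) after which half of the activity remains."""
    return math.log(2.0) / lam


def apparent_constant(k_intrinsic, g1, g2, g3, i_g1, i_g2, i_g3,
                      t_hours, lam=0.0122):
    """Apparent reaction constant after inhibition and deactivation.

    ASSUMED form: k* x exp(-lam t) / (1 + (G1/I_G1)^3 + G2/I_G2 + G3/I_G3).
    """
    inhibition = 1.0 + (g1 / i_g1) ** 3 + g2 / i_g2 + g3 / i_g3
    return k_intrinsic * remaining_activity(t_hours, lam) / inhibition
```

With the deactivation constant of 0.0122 h−1 used later in the paper, the implied activity half-life is a little under 57 h, which gives a feel for how much of the late-stage slowdown deactivation alone can explain.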

Summary of mass balances

The overall formation rate of each species is obtained by summing the contributions of the three enzymes derived above.

Computational methods

Parameter estimation

One set of data for non-crystalline cellulose hydrolysis is used for parameter estimation [6], in which SpezymeCP (Genencor, lot no. 301-00348-257) had an average activity of 31.2 filter paper units (FPU)/mL and was diluted to 1 and 3 FPU by adding buffer solutions. Another set of data for phosphoric acid swollen cellulose hydrolysis is used to check the model’s predictive ability, in which mixtures of different mole percentages of Humicola insolens endoglucanase and exoglucanase (CBH II), and Penicillium brasilianum β-glucosidase were used to hydrolyze phosphoric acid swollen cellulose [66]. The values of parameters are iteratively adjusted to obtain the closest agreement between predictions and measurements, i.e., by minimizing the weighted sum of squares of residuals:

$$\min_{\theta^{lb} \le \theta \le \theta^{ub}} J(\theta) = \sum_{j} \left( y_j - \hat{y}_j(\theta) \right)^{T} W \left( y_j - \hat{y}_j(\theta) \right) \tag{34}$$

where $y_j$ and $\hat{y}_j$ are the measured and estimated variable values at the jth data point and W is the weighting matrix. In order to tackle some drawbacks of multimodal optimisation problems, such as dependency on initial guesses, a hybrid solution strategy is employed, combining the genetic algorithm with the Nelder–Mead simplex search method [48]. Assuming that the error terms for each experiment are additive, independent (uncorrelated) and identically normally distributed with zero mean and variance $\sigma_j^2$, the weighting matrix W is the inverse of the error covariance matrix, i.e., a diagonal matrix with elements $1/\sigma_j^2$ [49]. When maximum likelihood estimation is used, the covariance matrix C of θ is obtained by linear approximation (Eq. (36)) [50]. Then, the confidence intervals of θ at the αt significance level are $\hat{\theta}_i \pm t\,\sqrt{C_{ii}}$ (Eq. (37)), where t is the corresponding Student’s t value and $C_{ii}$ are the diagonal elements of C. Furthermore, the approximate correlation matrix of θ can be obtained, the ijth element of which is given by

$$r_{ij} = \frac{C_{ij}}{\sqrt{C_{ii}\,C_{jj}}} \tag{38}$$

The diagonal elements of the matrix are all unity and the off-diagonal elements are in the interval [−1,1].
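The statistical quantities described above reduce to a few lines once a parameter covariance matrix is available. The sketch below assumes uncorrelated measurement errors (so the weights are reciprocal variances) and takes the Student's t value as an input rather than computing it. The function names are hypothetical, and the covariance matrix in the usage note is a made-up example, not a result from the paper.

```python
import math


def weighted_ssr(measured, predicted, variances):
    """Weighted sum of squared residuals with weights 1 / variance."""
    return sum((y - y_hat) ** 2 / var
               for y, y_hat, var in zip(measured, predicted, variances))


def confidence_half_widths(cov, t_value):
    """Half-widths t * sqrt(C_ii) of the parameter confidence intervals."""
    return [t_value * math.sqrt(cov[i][i]) for i in range(len(cov))]


def correlation_matrix(cov):
    """Correlation coefficients r_ij = C_ij / sqrt(C_ii * C_jj)."""
    p = len(cov)
    return [[cov[i][j] / math.sqrt(cov[i][i] * cov[j][j]) for j in range(p)]
            for i in range(p)]
```

For example, `correlation_matrix([[4.0, 1.2], [1.2, 1.0]])` has unit diagonal and a single off-diagonal coefficient, which is the number one would compare against the 0.7 threshold used later in the paper.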

Global sensitivity analysis

Global sensitivity analysis (GSA), specifically the Sobol’ GSA method, is used to further evaluate the identifiability of parameters, i.e., sensitivities of model outputs (y) to variations of model parameters θ [51], [52]. The temporal profile of sensitivities of outputs with respect to parameter variation is calculated in terms of the sensitivity indices below

$$S_{i_1 \dots i_s} = \frac{D_{i_1 \dots i_s}}{D} \tag{39}$$

where $D_{i_1 \dots i_s}$ and D are the variances in model outputs associated with simultaneous changes in the parameters $\theta_{i_1}, \dots, \theta_{i_s}$ and in all parameters, respectively. The first-order indices, $S_i$, indicate the sensitivity with respect to one individual parameter without interactions, while higher order indices, $S_{i_1 \dots i_s}$ with s ≥ 2, represent the effect of interactions between the parameters. Total effect indices with respect to each parameter, $S_{Ti}$, are defined as the sum of all indices that involve parameter i

$$S_{Ti} = \sum_{\{i_1, \dots, i_s\} \ni i} S_{i_1 \dots i_s} \tag{40}$$

Additionally, a matrix of relative values calculated by Eq. (41)

$$\frac{S_i}{S_{Ti}} \tag{41}$$

indicates levels of parameter interactions, i.e., the closer the values are to one, the lower the interactions.
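For readers who want to see how first-order and total effect indices can be estimated in practice, here is a self-contained sketch. It is not the authors' implementation: it uses a Halton sequence as a simple stand-in for Sobol' sequences, the widely used Saltelli and Jansen sampling estimators, and parameters scaled to the unit interval. All names are hypothetical.

```python
PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]


def halton(index, base):
    """index-th element (index >= 1) of the van der Corput sequence."""
    result, fraction = 0.0, 1.0
    while index > 0:
        fraction /= base
        result += fraction * (index % base)
        index //= base
    return result


def sobol_indices(model, n_params, n_samples=4096):
    """Estimate first-order and total effect indices of model(x), x in [0, 1]^p."""
    if 2 * n_params > len(PRIMES):
        raise ValueError("not enough Halton bases for this many parameters")
    first = [0.0] * n_params
    total = [0.0] * n_params
    sum_y = sum_y2 = 0.0
    for k in range(1, n_samples + 1):
        point = [halton(k, base) for base in PRIMES[:2 * n_params]]
        xa, xb = point[:n_params], point[n_params:]
        ya, yb = model(xa), model(xb)
        sum_y += ya
        sum_y2 += ya * ya
        for i in range(n_params):
            mixed = xa[:i] + [xb[i]] + xa[i + 1:]    # xa with column i from xb
            ym = model(mixed)
            first[i] += yb * (ym - ya)               # Saltelli-type estimator
            total[i] += (ya - ym) ** 2               # Jansen estimator
    variance = sum_y2 / n_samples - (sum_y / n_samples) ** 2
    s_first = [v / n_samples / variance for v in first]
    s_total = [v / (2.0 * n_samples) / variance for v in total]
    return s_first, s_total
```

For an additive test function such as y = x₁ + 2x₂ the first-order and total indices should coincide (ratio close to one), which is the "low interaction" situation the relative values in Eq. (41) are meant to detect.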

Model-based optimisation

As an optimisation criterion for cellulose hydrolysis, the percent conversion to soluble sugars in terms of % C/C is defined by Eq. (42) as the carbon recovered in soluble sugars relative to the carbon in the initial cellulose. Process performance is compared with that of a simultaneous saccharification and fermentation (SSF) process [53], [54], [55], [56], [57], [58]. For this purpose, we consider an engineered Geobacillus thermoglucosidasius strain TM242 which can produce ethanol, with an inoculum of 0.15 g DCW/L. Through systematic metabolic network reduction [59], a fermentation model (based on five main macro-metabolic reactions) is integrated with the above hydrolysis model to describe the SSF process [60]. By calculating the percent conversion for a batch enzymatic hydrolysis process and for a batch SSF process under different compositions of the three core enzymes, we can determine which compositions are optimal in either case with regard to cellulose conversion to soluble sugars. In both processes, initial cellulose concentrations are set to 50 g/L.
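As a rough illustration of the optimisation criterion, the sketch below computes a carbon-based conversion and enumerates candidate enzyme compositions. It assumes that carbon is proportional to the number of anhydroglucose units, so that glucose, cellobiose and cellotriose count 1, 2 and 3 units respectively. The exact form of the paper's Eq. (42) is not reproduced here, and the grid search over compositions is a simple stand-in for whatever optimiser the authors used. All names are hypothetical.

```python
def percent_conversion(g1, g2, g3, substrate0, dp0):
    """Percent of initial cellulose carbon found in soluble sugars.

    g1, g2, g3   glucose, cellobiose, cellotriose (mmol/L)
    substrate0   initial cellulose chain concentration (mmol/L)
    dp0          initial polymerization degree
    """
    soluble_units = g1 + 2.0 * g2 + 3.0 * g3
    return 100.0 * soluble_units / (dp0 * substrate0)


def composition_grid(steps=10):
    """Mass fractions (en, ex, bg) on a regular grid that sum to one."""
    grid = []
    for n_en in range(steps + 1):
        for n_ex in range(steps + 1 - n_en):
            n_bg = steps - n_en - n_ex
            grid.append((n_en / steps, n_ex / steps, n_bg / steps))
    return grid
```

Each composition from `composition_grid` would be passed to a simulation of the batch process, and the composition with the highest `percent_conversion` at the end time kept.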

Results and discussion

Parameter uncertainty and preliminary estimation

Uncertainty analysis assesses the confidence in modelling results (including parameter estimates and model outputs/predictions) by quantifying the propagation of various sources of error in the model input and design. The errors may originate from two sources: one is the quality and amount of data used to develop the model (each measurement is associated with measurement noise and the system is usually only partially observed); the other is the model structure, which may not be a perfect representation of the real system. As shown in Eqs. (36)–(38), parameter uncertainty in terms of confidence intervals and the correlation matrix can be estimated from the calculation of the Fisher information matrix based on local sensitivity analysis. Additionally, model prediction uncertainty is investigated in Section 4.2 by two related methods: quasi-Monte-Carlo (QMC) simulation using Sobol’ sequences and global sensitivity analysis using the Sobol’ method (GSA). Fig. 2 depicts the framework used in this study.

Fig. 2

Flow chart of parameter estimation, uncertainty analysis, and model refinement. DOE: design of experiments; GSA: global sensitivity analysis; QMC: quasi-Monte-Carlo simulations.

One set of data for non-crystalline cellulose hydrolysis was used for assessing model fits and parameter estimation [6]. The activity of an enzyme in solution is proportional to its protein concentration [61] and 1 FPU is approximated to 2 mg protein [62], [63], [64], [65]. The components of SpezymeCP are assumed to be 12% w/w endoglucanases, 80% w/w exoglucanases, and 4% w/w β-glucosidases, with the remainder being other enzymes [1], [66], [67]. The molecular weights are 43 kDa, 65 kDa, and 110 kDa for endoglucanases, exoglucanases, and β-glucosidases, respectively [1], [66], [67]. The model includes 27 parameters in total, i.e., k*E, KdisE, σE, KadE, IG1E, IG2E′, IG3E′, λE, 2αE (E = en, ex, bg; E′ = en, ex), F and DP of cellulose. For detailed definitions of the parameters, please refer to the nomenclature section. Initially, after GSA was performed, eight non-significant parameters were set at their nominal values: reported values of F and DP were used, i.e., 0.12 and 152, respectively [11]; λE values were fixed at 0.0122 h−1 [40], [67], [68]; and 2αE were set to 43, 65, and 110 for endoglucanases, exoglucanases, and β-glucosidases, respectively [69], [70]. As a result, 19 parameters were left for estimation. As shown in Table 1, lower and upper bounds for constrained minimization (Eq. (34)) are set with reference to the literature [1], [5], [6], [11], [24], [45], [62], [63], [71], [72], [73], [74], [75]. θ0 are the initial guesses for the parameters. Preliminary parameter values estimated using this methodology and their 95% confidence intervals are summarized in Table 1. As illustrated in Fig. 3, which depicts the dynamic concentration profiles of non-crystalline cellulose, glucose, cellobiose, and oligosaccharides for hydrolysis processes under different conditions [6], the model simulation results are in adequate agreement with the experimental data. Generally, a typical biphasic pattern is observed in every process, i.e., a faster initial rate of hydrolysis that progressively slows down, possibly due to product inhibition and enzyme deactivation [10], [11], [45].
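The unit conversions in this paragraph are easy to get wrong, so a small worked sketch may help. It uses only the approximations stated above (1 FPU ≈ 2 mg protein and the assumed 12/80/4% w/w composition); the function name, the dictionary layout and the example loading are illustrative choices.

```python
# Assumed SpezymeCP composition (w/w) as stated in the text.
FRACTIONS = {"en": 0.12, "ex": 0.80, "bg": 0.04}


def enzyme_loadings(fpu_per_g_glucan, glucan_g_per_l, mg_protein_per_fpu=2.0):
    """Loading of each core enzyme (g/L) for a given cellulase dose.

    fpu_per_g_glucan   cellulase dose (FPU per g glucan)
    glucan_g_per_l     substrate concentration (g/L)
    """
    total_g_per_l = (fpu_per_g_glucan * mg_protein_per_fpu
                     * glucan_g_per_l / 1000.0)
    return {name: fraction * total_g_per_l
            for name, fraction in FRACTIONS.items()}
```

For instance, 6 FPU/g-glucan at 50 g/L cellulose corresponds to 12 mg protein per g glucan and 600 mg protein per litre, which stays inside the low-loading range (no more than 15 mg/g-glucan and 750 mg/L) assumed by the model.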

Table 1

Preliminary estimates of parameter values (dimθ = 19) with 95% confidence intervals at 50 °C.

θ	θ_ref	θ^lb	θ^ub	θ₀	θ̂	References
k*^en	4.0	1.0	7.0	4.00	2.32 ± 0.16	[24], [75]
k*^ex	0.80 ∼ 1.60	0.4	3.2	1.8	1.94 ± 0.23	[24], [75]
k*^bg	0.193 ∼ 115	0.15	120	60.1	17.6 ± 1.20	[11], [73]
K_dis^en	0.333	0.1	2	1.05	1.22 ± 0.05	[24]
K_dis^ex	0.25	0.1	2	1.05	0.41 ± 0.03	[11], [24]
K_dis^bg	0.057 ∼ 4.81	0.03	10	5.02	8.46 ± 1.02	[71], [73], [74]
σ^en	0.58 ∼ 1.9	0.2	10	5.10	6.18 ± 2.51	[45]
σ^ex	0.16 ∼ 0.75	0.05	10	5.03	5.81 ± 7.25	[45]
σ^bg	0.269	0.1	5	2.55	0.47 ± 1.26	[61]
K_ad^en	15 ∼ 280	10	350	180	62.5 ± 3.2	[45], [61], [62], [72]
K_ad^ex	15 ∼ 280	10	350	180	44.2 ± 0.9	[45], [61], [62], [72]
K_ad^bg	15 ∼ 280	10	350	180	15.4 ± 2.7	[45], [61], [62], [72]
I_G2^en	0.015, 5.20	0.01	50	25	51.4 ± 6.48	[5], [6]
I_G2^ex	132, 5.20	0.5	150	75.3	2.15 ± 0.58	[5], [6]
I_G1^bg	0.07 ∼ 3.9	0.04	6	3	0.12 ± 0.09	[5], [6], [73], [74]
I_G1^en	0.1, 0.08	0.05	3	1.5	0.47 ± 0.21	[5], [6]
I_G1^ex	0.04, 0.08	0.02	3	1.5	0.82 ± 0.08	[5], [6]
I_G3^en	8.69	0.5	500	250	182 ± 32	[6]
I_G3^ex	8.69	0.5	500	250	295 ± 41	[6]
F_a	0.12	0.09	0.16	0.12	Fixed at 0.12	[24]
DP	152	100	200	152	Fixed at 152	[24]
λ^en, λ^ex, λ^bg	0.0122	0.006	0.0244	0.0122	Fixed at 0.0122	[37]
2α^en/M^en, 2α^ex/M^ex, 2α^bg/M^bg	1	0.5	1.5	1	Fixed at 1	[69], [70]

Note: some parameters were fixed at (1) λ^en = λ^ex = λ^bg = 0.0122 h−1; (2) DP = 152, F_a = 0.12; (3) 2α^en = 43, 2α^ex = 65, 2α^bg = 110. Refer to Nomenclature for units.

Fig. 3

Experimental data and model predictions of non-crystalline cellulose hydrolysis, including the time profiles of insoluble cellulose, glucose, cellobiose, and oligosaccharides (G3–G6). Experimental data are represented as discrete points while model simulations are depicted as continuous lines.

Cellulase loadings: 1 FPU/g-glucan in processes (a)–(d); 3 FPU/g-glucan in processes (e)–(h). Starting substrates: only non-crystalline cellulose in (a) and (e); non-crystalline cellulose with 5% (w/w) cello-oligosaccharides in (b) and (f); non-crystalline cellulose with 5% (w/w) glucose in (c) and (g); non-crystalline cellulose with 5% (w/w) cellobiose in (d) and (h). (Experimental data from Peri et al. [6])

Moreover, the covariance matrix obtained by Eq. (36) allows analysis of the extent of correlation between the parameters (Eq. (38)). The correlation coefficients are shown in Fig. 4(a). High correlation coefficients (absolute values of off-diagonal elements >0.7) are found between some parameters. In addition, the ratios of confidence intervals to means implied by Table 1 are rather high for certain parameters, especially σE. Together, these observations indicate that the estimates are highly correlated and may be inaccurate. One reason is, to a large extent, the model structure, in particular over-parameterization, which is often encountered in the modelling of biorefinery processes [9], [76], [77]. Another possible reason is poor experimental design/information content of the data. In the following section, the model structure and prediction uncertainty are analysed by means of QMC and GSA to improve identifiability and reliability.
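The "ratio of confidence interval to mean" check mentioned above can be reproduced directly from Table 1. The sketch below copies a handful of the preliminary estimates from that table; the 50% cut-off and all names are arbitrary illustrative choices, not criteria from the paper.

```python
# (estimate, 95% confidence half-width) copied from Table 1 for a few parameters.
ESTIMATES = {
    "k*_en": (2.32, 0.16),
    "K_ad_ex": (44.2, 0.9),
    "sigma_en": (6.18, 2.51),
    "sigma_ex": (5.81, 7.25),
    "sigma_bg": (0.47, 1.26),
    "I_G1_bg": (0.12, 0.09),
}


def poorly_determined(estimates, max_relative_width=0.5):
    """Parameters whose confidence half-width exceeds a fraction of the estimate."""
    flagged = {}
    for name, (value, half_width) in estimates.items():
        ratio = half_width / abs(value)
        if ratio > max_relative_width:
            flagged[name] = ratio
    return flagged
```

Applied to these rows, the check singles out σ^ex and σ^bg, whose intervals are wider than the estimates themselves, in line with the remark that the σE estimates are the least reliable.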

Fig. 4

Correlation matrix of the identified parameters. (a), preliminary estimation, dim θ = 19; (b), refined estimation, dim θ = 13.

Prediction uncertainty, parameter reduction and refined estimation

As shown by the θ in Table 1, the model comprises 27 parameters in total, whose input uncertainty is sampled and propagated using QMC and GSA. As mentioned above, eight parameters were initially found non-significant; their subjective input uncertainties are θlb ≤ θ ≤ θub, as shown in Table 1. In addition, the subjective input uncertainties of the remaining 19 parameters are defined by the preliminary estimates θ̂ and, at the same time, the values of σex and σbg are restricted to be non-negative. First, QMC simulation of 1000 parameter samples with the model gives the 1000 time series of dynamic profiles shown in Fig. 5 for a group of typical cases, i.e., starting cellulose concentrations of 10, 25, 50, 100 g/L and initial SpezymeCP cellulase loadings of 0.5, 1.5, 3, 6 FPU/g-glucan. The model prediction uncertainty is represented using the mean and the 10th and 90th percentiles of the distribution of each model output at each time instant in Fig. 5. The prediction uncertainty of model outputs is illustrated by the spread of the prediction distribution, and the following conclusions can be drawn: (1) the aforementioned biphasic phenomenon inevitably occurs in hydrolysis processes in batch mode; (2) hydrolysis performance (i.e., the percent conversion of cellulose to soluble sugars) does not increase linearly but rather asymptotically with increasing initial cellulase loadings; (3) interestingly, the major accumulated product is glucose in all cases, while the 90th percentile of cellobiose and cellotriose concentrations peaks around the end of the first (fast) hydrolysis phase (<10 g/L) and decreases monotonically thereafter.
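The summary statistics used for the uncertainty bands can be computed as in the sketch below, assuming each simulated trajectory is stored as a list of output values on a common time grid. The percentile uses linear interpolation between order statistics, which is one common convention and not necessarily the one used by the authors. Function names are hypothetical.

```python
def percentile(sorted_values, q):
    """q-th percentile (0-100) of pre-sorted values, linear interpolation."""
    rank = q / 100.0 * (len(sorted_values) - 1)
    low = int(rank)
    high = min(low + 1, len(sorted_values) - 1)
    weight = rank - low
    return sorted_values[low] + weight * (sorted_values[high] - sorted_values[low])


def uncertainty_band(trajectories):
    """Mean, 10th and 90th percentile of the samples at each time point."""
    band = []
    for values in zip(*trajectories):   # all samples at one time point
        ordered = sorted(values)
        mean = sum(ordered) / len(ordered)
        band.append((mean, percentile(ordered, 10), percentile(ordered, 90)))
    return band
```

With 1000 sampled parameter sets, `trajectories` would hold 1000 lists, and the returned tuples give the three curves drawn on top of the individual simulations.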

Fig. 5

Representation of uncertainty of the model predictions for cellulose, glucose, cellobiose, and cellotriose over hydrolysis with all the parameters (dim θ = 27). Quasi-Monte-Carlo simulations (1000 samples, light blue lines), mean, and 10th and 90th percentiles of the predictions.

Typical cases: (a), starting non-crystalline cellulose concentrations of 10, 25, 50, and 100 g/L, and 6 FPU/g-glucan SpezymeCP cellulases loading in all processes; (b), SpezymeCP cellulases loadings of 0.5, 1.5, 3, and 6 FPU/g-glucan, and 50 g/L starting non-crystalline cellulose concentration in all processes.

Meanwhile, to understand what underlies this prediction uncertainty and which parameters are most critical to the model prediction, GSA is performed. The GSA results (Fig. 6(a)) illustrate that model outputs are insensitive to F, DP, λE and 2αE over time in the given but reasonable spaces noted in Table 1, based on the quantitative criterion used in this work that the maximum sensitivity index is below 0.05. This means they can be fixed at the pre-selected values. In addition, it is found that the other non-significant parameters are KadE, IG2en, IG3en, and IG3ex. This can be explained as follows. Firstly, only low cellulase loadings (≤750 mg/L), i.e., $E_{load} \le 0.75$ g/L, were implemented experimentally and usually most of the enzyme is bound (bound enzymes >75% w/w observed by [62], [63], [72], [78]). As a result, according to the corresponding parameter values in Table 1, it can be predicted that $\frac{2\alpha}{M} E_S \ll \sigma \sum_i (i-1)\,G_i$ under usual operating conditions. Consequently, Eq. (4) is transformed into $E_S \approx K_{ad}\,\sigma \sum_i (i-1)\,G_i\,E_L$, i.e., only the products $K_{ad}^{E}\sigma^{E}$ tend to be structurally identifiable under usual experimental conditions. Therefore, KadE can be set, leaving just σE to be estimated. Secondly, owing to the form of Michaelis–Menten kinetics and the fact that the preliminary estimates of IG2en, IG3en, and IG3ex (shown in Table 1) are all much bigger than the corresponding product concentrations (the above-mentioned <10 g/L obtained by the QMC simulations), these three parameters can be practically eliminated from the model. As a result, a subset of 13 crucial parameters is identified.
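The structural identifiability argument can be checked numerically. The sketch below solves the Langmuir balance E_S = K_ad (E_load − E_S)(σ·B − p·E_S), with B the bond concentration and p = 2α/M, and compares parameter pairs. The balance is an assumed reading of the adsorption model, and the numbers in the usage note are illustrative values of the same order as those in the tables, not fitted results. All names are hypothetical.

```python
import math


def bound_enzyme(e_load, bonds, k_ad, sigma, sites_per_gram=1.0):
    """Adsorbed enzyme (g/L) from the assumed Langmuir balance."""
    sites = sigma * bonds
    a = k_ad * sites_per_gram
    b = 1.0 + k_ad * sites + k_ad * sites_per_gram * e_load
    c = k_ad * sites * e_load
    # Smaller root of a*E^2 - b*E + c = 0 in a cancellation-free form.
    return 2.0 * c / (b + math.sqrt(b * b - 4.0 * a * c))


def relative_gap(first, second):
    """Relative difference between two predictions."""
    return abs(first - second) / max(abs(first), abs(second))
```

As a usage example, at a low loading (`e_load = 0.072`, `bonds = 306.0`) the pairs (K_ad, σ) = (50, 5.24e-3) and (100, 2.62e-3) share the same product and give almost the same bound enzyme, whereas halving σ at fixed K_ad shifts the prediction noticeably more. Data of this kind therefore constrain the product K_ad·σ far better than either factor alone.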

Fig. 6

Sensitivities of model outputs (namely glucose, cellobiose, cellotriose from top to bottom) to 27 parameters (the column on the left hand, i.e., (a)) and 13 critical parameters (the column on the right hand, i.e., (b)) during the course of an example hydrolysis process (corresponding to Fig. 3(a)). Color axis scaling is given in every subplot. Only the values of total sensitivity indices are shown.

The refined estimates for these parameters are summarized in Table 2 with the corresponding 95% confidence intervals. The corresponding correlation coefficients are shown in Fig. 4(b). The absolute values of all off-diagonal elements of the correlation matrix are less than 0.6. From the ratios of first-order to total effect sensitivity indices (Eq. (41)) shown in Fig. S2, we can also conclude that the 13 parameters interact only weakly, unlike the very high interactions between some of the 27 parameters shown in Fig. S1. Additionally, as shown in Fig. 6 and Fig. S2, model predictions are generally more sensitive in the first 10 h of hydrolysis (i.e., the first phase), which means that more frequent sampling in this phase can yield more informative data for parameter estimation.

Table 2

Refined estimates of the parameter values for the reduced model (dimθ = 13) with 95% confidence intervals at 50 °C.

θ	θ_lb	θ_ub	θ₀	θ^	Unit
k^*en	1.0	7.0	4.0	2.12 ± 0.04	10³ mmol/(g enzyme × h)
k^*ex	0.4	3.2	1.8	1.52 ± 0.03	10³ mmol/(g enzyme × h)
k^*bg	0.15	120	60.1	14.8 ± 0.5	10³ mmol/(g enzyme × h)
K_dis^en	0.10	2.0	1.05	1.28 ± 0.04	mmol β-glucosidic bonds/L
K_dis^ex	0.1	5.0	1.05	0.17 ± 0.02	mmol β-glucosidic bonds/L
K_dis^bg	0.03	10	5.02	6.58 ± 0.43	mmol β-glucosidic bonds/L
σ^en	0.20	10	5.10	5.24 ± 0.68	10⁻³ mmol sites/mmol β-glucosidic bonds
σ^ex	0.05	10	5.03	6.25 ± 0.72	10⁻³ mmol sites/mmol β-glucosidic bonds
σ^bg	0.10	5.0	2.55	0.32 ± 0.21	10⁻³ mmol sites/mmol β-glucosidic bonds
I_G2^ex	0.50	150	75.3	0.95 ± 0.24	g/L
I_G1^bg	0.04	6	3	0.11 ± 0.07	g/L
I_G1^en	0.05	3	1.5	0.78 ± 0.23	g/L
I_G1^ex	0.02	3	1.5	0.61 ± 0.14	g/L

Note: the pre-set values for the remaining insensitive parameters are (1) λ = λ = λ = 0.0122 h⁻¹; (2) DP = 152, F = 0.12; (3) 2α = 43, 2α = 65, 2α = 110; (4) K_ad^en = 50 L/mmol sites, K_ad^ex = 450 L/mmol sites, K_ad^bg = 15 L/mmol sites.
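One way to read Table 2 is to compare the width of each confidence interval with its estimate. The sketch below does this using the tabulated values, assuming that the figure after "±" is the half-width of the 95% confidence interval; the plain-text parameter labels are shorthand introduced here for illustration.

```python
# (estimate, half-width) pairs copied from Table 2; labels are ad hoc shorthand.
estimates = {
    "k_en": (2.12, 0.04),
    "k_ex": (1.52, 0.03),
    "k_bg": (14.8, 0.5),
    "Kdis_en": (1.28, 0.04),
    "Kdis_ex": (0.17, 0.02),
    "Kdis_bg": (6.58, 0.43),
    "sigma_en": (5.24, 0.68),
    "sigma_ex": (6.25, 0.72),
    "sigma_bg": (0.32, 0.21),
    "I_G2_ex": (0.95, 0.24),
    "I_G1_bg": (0.11, 0.07),
    "I_G1_en": (0.78, 0.23),
    "I_G1_ex": (0.61, 0.14),
}

# Relative half-width = half-width / estimate; larger means less well determined.
ranked = sorted(
    ((name, half / value) for name, (value, half) in estimates.items()),
    key=lambda item: item[1],
    reverse=True,
)

for name, rel in ranked:
    print(f"{name:10s} {100 * rel:5.1f} %")
```

On this reading, the rate constants are the most tightly determined, while the β-glucosidase site density and glucose inhibition constant carry the widest relative intervals.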

No significant difference is observed between the simulations with the reduced model of 13 crucial parameters (using the means) and with 27 parameters (using the means) (results not shown). By comparison, the values change from 52.3 to 46.1 for the overall model and refined model, respectively. In addition, the uncertainty of model outputs is investigated by QMC simulations with the estimates of 13 parameters (Fig. 7 and Fig. S3). Compared to Fig. 5, a smaller distribution spread of the outputs is observed in Fig. 7. Overall, as shown by the validation results (Fig. 6 and Fig. S3), the estimates are refined and the identifiability of the refined model is improved.

Fig. 7

Representation of uncertainty of model predictions for cellulose, glucose, cellobiose, and cellotriose over hydrolysis with the reduced set of parameters (dim θ = 13). Quasi-Monte Carlo simulations (1000 samples, white blue lines), mean (→), and 10th () and 90th () percentiles of the predictions.

Two hydrolysis processes: (a), SpezymeCP cellulases loading = 1 FPU/g-glucan; (b), SpezymeCP cellulases loading = 3 FPU/g-glucan.
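The kind of uncertainty propagation summarised in Fig. 7 (1000 quasi-Monte Carlo samples, then the mean and the 10th and 90th percentiles of the predictions) can be sketched with the standard library alone. Everything model-specific here is an assumption: a Halton sequence stands in for whichever low-discrepancy sequence was actually used, the parameter ranges are hypothetical, and `toy_output` is a placeholder response, not the mechanistic model of this work.

```python
import math
import statistics


def halton(index, base):
    """Radical-inverse (van der Corput) value of `index` in the given base."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def qmc_samples(bounds, n):
    """n Halton points scaled to the given (low, high) bounds, one prime per dimension."""
    primes = [2, 3, 5, 7, 11, 13]
    return [
        [lo + (hi - lo) * halton(i, primes[d]) for d, (lo, hi) in enumerate(bounds)]
        for i in range(1, n + 1)
    ]


def toy_output(k, K, t=24.0, s0=50.0):
    """Placeholder saturating response; NOT the hydrolysis model of the paper."""
    return s0 * (1.0 - math.exp(-k * t / (K + s0)))


bounds = [(1.0, 7.0), (0.1, 2.0)]  # hypothetical parameter ranges
outputs = [toy_output(k, K) for k, K in qmc_samples(bounds, 1000)]

deciles = statistics.quantiles(outputs, n=10)  # nine cut points
mean, p10, p90 = statistics.fmean(outputs), deciles[0], deciles[-1]
print(f"mean={mean:.2f}, 10th={p10:.2f}, 90th={p90:.2f}")
```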

Model predictability and model-based optimisation

According to the results of the prediction uncertainty analysis, the model can be reduced to 13 crucial parameters, with the other parameters eliminated or fixed at literature values under usual operating conditions. Note that the model should be extrapolated with care to hydrolysis processes with high enzyme loadings (>15 mg/g-glucan or >750 mg/L). Fortunately, high loadings are usually not used in industrial applications because of their high cost. Besides direct validation with the data of Peri et al. [6], another set of data for the hydrolysis of phosphoric acid swollen cellulose (PASC, which was regarded as amorphous cellulose) was used for cross-validation to check the model's predictive ability [44], [79]. As shown in Table 3, the model is indeed capable of predicting the experimental measurements with sufficient accuracy. This further suggests that the model can be successfully adapted to new processes.

Table 3

Comparison between predictions and experimental measurements (shown in brackets and in red) of enzymatic hydrolysis products during degradation of PASC.

Note: n.d., not detected; d.l.a., detected in low amount; Comb., experimental combination number. (Experimental data from Andersen et al. [44], [79]).

This detailed mechanistic model provides a tool for optimising the cellulose hydrolysis process. An obvious target is to determine the optimal composition of the cellulase cocktail comprising the three core enzymes. The optimised cocktail can, in theory, reduce the enzyme loading without sacrificing the hydrolysis yield, or increase the sugar yield within a shorter incubation time. The optimisation procedure is as follows. Firstly, in silico simulations are performed for 1000 different make-ups of the three enzymes. Percent conversion is calculated for each composition. Then, a ternary plot of the results illustrates which compositions are optimal with respect to conversion to soluble sugars. Here, the “compositions” correspond to the samples that are taken for simulations. The optimisation is carried out for batch processes of both enzymatic hydrolysis and SSF. Initial cellulose concentrations are set to 50 g/L for both processes. As shown by subplots (a) and (b) of Fig. 8, the optimal cocktail with the highest synergism should contain 60–85% endoglucanases, 10–30% exoglucanases, and 3–5% beta-glucosidase (w/w/w) for both processes.

By comparison, the cocktail was experimentally optimized by Andersen et al. for the hydrolysis of PASC in batch mode [44], [79]. The optimal cocktail was found to be en:ex:bg = (50–100):(0–40):(10–40) (w/w/w). Gao et al. also experimentally optimized six core enzymes for saccharification of ammonia fiber expansion (AFEX) pretreated corn stover [80]. The optimal composition was found to be endo: CBH I: CBH II: beta-glucosidase: endo-xylanase: beta-xylosidase = 31.0:28.4:18.0:4.7:14.1:3.8 (w/w/w/w/w/w).
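The composition screen described above (simulate many three-enzyme make-ups, compute conversion for each, then pick the best region) can be sketched as a search over the composition simplex. This is a sketch under stated assumptions: a regular grid of 946 compositions stands in for the roughly 1000 sampled make-ups, and `toy_conversion` is an arbitrary placeholder with an interior maximum. In practice that function would be replaced by a simulation of the kinetic model, so the optimum printed here has no bearing on the reported result.

```python
def simplex_grid(steps):
    """All (en, ex, bg) mass fractions on a regular grid that sum to 1."""
    return [
        (i / steps, j / steps, (steps - i - j) / steps)
        for i in range(steps + 1)
        for j in range(steps + 1 - i)
    ]


def toy_conversion(en, ex, bg):
    """Arbitrary placeholder objective; NOT the mechanistic hydrolysis model."""
    return en**3 * ex**2 * bg


compositions = simplex_grid(42)  # 43 * 44 / 2 = 946 candidate make-ups
best = max(compositions, key=lambda c: toy_conversion(*c))

print("number of compositions:", len(compositions))
print("best en:ex:bg (w/w/w):", tuple(round(100 * x, 1) for x in best))
```

Evaluating every grid point, rather than running a gradient-based optimiser, matches the ternary-plot presentation: the whole response surface is available for plotting, not only the maximiser.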

Fig. 8

(a) and (b): Ternary plots of conversion rate (% C/C) as a function of varying total cellulase cocktail loadings and time during only hydrolysis (subplot (a)) and SSF (subplot (b)) of non-crystalline cellulose (both processes are of batch mode). In each subplot, from left to right, hydrolysis time of 24 h, 48 h, and 96 h; from top to bottom, total enzyme loadings of 2.5, 7.5, and 15 mg/g-glucan. (c) and (d): Contour plots of against time and cellulases loading at an optimum of en:ex:bg = 70:25:5(w/w/w) during only hydrolysis (subplot (c)) and SSF (subplot (d)) of non-crystalline cellulose (both processes are of batch mode).

In all processes, the starting cellulose concentration is 50 g/L. The reaction temperature is 50 °C or 60 °C in only hydrolysis or in SSF, respectively. The values of k* are predicted by Arrhenius/van't Hoff equation , where Ea is 47.6 kJ/mol [12] and ideal gas constant R is 8.314 J/K/mol.
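A temperature correction of this kind can be illustrated with the standard two-temperature Arrhenius relation, k(T) = k(T_ref) · exp[−(Ea/R)(1/T − 1/T_ref)]. The exact expression used in this work is not reproduced here, so this form is an assumption, as are the function name and the 40 °C target temperature, which is chosen arbitrarily for illustration. Ea and R are the values quoted above, and the reference rate constant is the 50 °C estimate of k*en from Table 2.

```python
import math

R = 8.314    # ideal gas constant, J/(mol K), as quoted in the text
EA = 47.6e3  # activation energy, J/mol (47.6 kJ/mol, as quoted in the text)


def arrhenius_scale(k_ref, t_ref_c, t_c, ea=EA):
    """Scale a rate constant from t_ref_c to t_c (both in degrees Celsius).

    Assumes the standard Arrhenius form with a temperature-independent Ea.
    """
    t_ref = t_ref_c + 273.15
    t = t_c + 273.15
    return k_ref * math.exp(-ea / R * (1.0 / t - 1.0 / t_ref))


k_en_50 = 2.12  # k*en estimate at 50 °C from Table 2, 10^3 mmol/(g enzyme x h)
k_en_40 = arrhenius_scale(k_en_50, 50.0, 40.0)  # 40 °C is an arbitrary example

print(f"k*en at 40 °C / k*en at 50 °C = {k_en_40 / k_en_50:.3f}")
```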

A reasonable enzyme loading would lie between 2.5 and 7.5 mg/g-glucan as an activity-cost trade-off (the optimal loading could be determined by weighing enzyme cost against the increase in hydrolysis rate). According to the simulation results, in batch enzymatic hydrolysis only around 40% of 50 g/L cellulose can be converted into soluble sugars in 96 h by the optimal cellulase cocktail at a loading of 7.5 mg/g-glucan. By comparison, about 90% of the cellulose is hydrolysed in SSF with the same cocktail in 96 h, thanks to reduced product inhibition, because the sugars are simultaneously fermented into ethanol. According to subplots (c) and (d) of Fig. 8, the conversion is around 40% C/C and around 90% C/C after 96 h in hydrolysis and SSF, respectively, using 375 mg/L (i.e., 7.5 mg/g-glucan) of the optimised cellulase cocktail. As mentioned in the introduction, experimental validation using empirical models and pure mechanistic analysis of cellulose enzymatic hydrolysis (with high complexity, such as heterogeneous reaction, random chain and chain-end processive scission by different enzymes, and depolymerisation of polymer mixture) has typically been performed separately [12], [13], [14], [15], [16], [17], [18], [19], [20], [21], [22], [23], [24]. This work is an early-stage effort to bridge the gap between detailed mechanistic modelling and experimental parametric studies. In the future, more process-level factors, such as the diverse physical properties of substrates resulting from various pretreatment methods, can be included in the model to make it more applicable to real processes. Following the systematic framework of parameter estimation and identifiability analysis presented herein (Fig. 2), model parameters can be identified and refined with the experimental data available and/or using optimal experimental design.
We aim to employ the developed model to guide the improvement of the hydrolysis and SSF processes in terms of operating conditions, initial cellulose concentrations, feeds of enzymes and cellulose, and separation of soluble sugars or ethanol. Additionally, the model can direct future rational approaches to the development of multi-gene expression systems to directly produce optimised enzyme mixtures of exoglucanase, endoglucanase, and beta-glucosidase [81], especially in the context of consolidated bioprocessing [82], [83], [84], [85].

Conclusions

In this work we developed a detailed mechanistic model of depolymerisation (chain-end scission) of amorphous cellulose by three core cellulolytic enzymes in a heterogeneous catalytic system. Separate steps for enzyme adsorption, formation of a catalytically active complex with cellulosic chains/ends, and enzyme desorption have been included. Through parametric analysis and experimental validation, we show that the resulting model with the reduced parameter set (13 out of an initial set of 27) is capable of predicting the evolution of the distribution of insoluble cellulose chain lengths (from four to DP) as well as the production of soluble sugars during synergistic enzymatic hydrolysis. Finally, model-based optimisation for hydrolysis and SSF indicates that the optimal composition of the cellulase cocktail is en:ex:bg ≅ 60–85:10–30:3–5 (w/w/w), both for enzymatic hydrolysis and for the more advantageous SSF process.

56 in total

1. Metabolic flux analysis of Escherichia coli K12 grown on 13C-labeled acetate and glucose using GC-MS and powerful flux calculation method.

Authors: Jiao Zhao; Kazuyuki Shimizu
Journal: J Biotechnol Date: 2003-03-06 Impact factor: 3.307

2. Mechanism of initial rapid rate retardation in cellobiohydrolase catalyzed cellulose hydrolysis.

Authors: Jürgen Jalak; Priit Väljamäe
Journal: Biotechnol Bioeng Date: 2010-08-15 Impact factor: 4.530

Review 3. Toward an aggregated understanding of enzymatic hydrolysis of cellulose: noncomplexed cellulase systems.

Authors: Yi-Heng Percival Zhang; Lee R Lynd
Journal: Biotechnol Bioeng Date: 2004-12-30 Impact factor: 4.530

4. Kinetic studies of enzymatic hydrolysis of insoluble cellulose: (II). Analysis of extended hydrolysis times.

Authors: Y H Lee; L T Fan
Journal: Biotechnol Bioeng Date: 1983-04 Impact factor: 4.530

5. Multiscale modelling of hydrothermal biomass pretreatment for chip size optimization.

Authors: Seyed Ali Hosseini; Nilay Shah
Journal: Bioresour Technol Date: 2009-01-10 Impact factor: 9.642

6. Kinetic modeling of cellulosic biomass to ethanol via simultaneous saccharification and fermentation: Part I. Accommodation of intermittent feeding and analysis of staged reactors.

Authors: Xiongjun Shao; Lee Lynd; Charles Wyman; André Bakker
Journal: Biotechnol Bioeng Date: 2009-01-01 Impact factor: 4.530

7. Adsorption of cellulase from Trichoderma reesei on cellulose and lignacious residue in wood pretreated by dilute sulfuric acid with explosive decompression.

Authors: H Ooshima; D S Burns; A O Converse
Journal: Biotechnol Bioeng Date: 1990-08-20 Impact factor: 4.530

8. Mixture optimization of six core glycosyl hydrolases for maximizing saccharification of ammonia fiber expansion (AFEX) pretreated corn stover.

Authors: Dahai Gao; Shishir P S Chundawat; Chandraraj Krishnan; Venkatesh Balan; Bruce E Dale
Journal: Bioresour Technol Date: 2009-11-30 Impact factor: 9.642

9. A mechanistic model for enzymatic saccharification of cellulose using continuous distribution kinetics I: depolymerization by EGI and CBHI.

Authors: Andrew J Griggs; Jonathan J Stickel; James J Lischeske
Journal: Biotechnol Bioeng Date: 2011-10-27 Impact factor: 4.530

10. Fingerprinting Trichoderma reesei hydrolases in a commercial cellulase preparation.

Authors: T B Vinzant; W S Adney; S R Decker; J O Baker; M T Kinter; N E Sherman; J W Fox; M E Himmel
Journal: Appl Biochem Biotechnol Date: 2001 Impact factor: 2.926
