Literature DB >> 28952495

Macroscopic Dynamic Modeling of Sequential Batch Cultures of Hybridoma Cells: An Experimental Validation.

Laurent Dewasme¹, François Côte², Patrice Filee³, Anne-Lise Hantson⁴, Alain Vande Wouwer⁵.

Abstract

Hybridoma cells are commonly grown for the production of monoclonal antibodies (MAb). For monitoring and control purposes of the bioreactors, dynamic models of the cultures are required. However these models are difficult to infer from the usually limited amount of available experimental data and do not focus on target protein production optimization. This paper explores an experimental case study where hybridoma cells are grown in a sequential batch reactor. The simplest macroscopic reaction scheme translating the data is first derived using a maximum likelihood principal component analysis. Subsequently, nonlinear least-squares estimation is used to determine the kinetic laws. The resulting dynamic model reproduces quite satisfactorily the experimental data, as evidenced in direct and cross-validation tests. Furthermore, model predictions can also be used to predict optimal medium renewal time and composition.

Entities: Chemical Disease Species

Keywords: Mathematical modelling; bioprocess optimization; maximum likelihood principal component analysis; monoclonal antibody

Year: 2017 PMID： 28952495 PMCID： PMC5590448 DOI： 10.3390/bioengineering4010017

Source DB: PubMed Journal: Bioengineering (Basel) ISSN： 2306-5354

1. Introduction

Therapeutic products (vaccines, antibodies, etc.) are subject to exponential demands and cost-lowering process improvements, leading to the intensification of growth conditions in the bio-pharmaceutical industry and the sharp increase of the related market economy. For instance, monoclonal antibody (MAb) market amounts to several billion dollars and is still increasing. To improve bioprocess yield and repeatability, monitoring and control tools are required. The latter implies the availability of dynamic models, which can predict the process trajectory and support the design of software sensors or control strategies. Previous optimization studies of hybridoma cell cultures for MAb production were usually conducted using simple mathematical models based on macroscopic reaction schemes such as in [1,2]. More recently, a macroscopic model with kinetics accounting for overflow metabolism, where glucose and glutamine are the main substrates, was proposed in [3]. Indeed, cell respiratory capacity is limited [4]. Therefore, depending on the substrate concentrations, cell metabolism follows two possible pathways: the respirative regime if the respiratory capacity is sufficient to oxidize the whole substrate amount or the respiro-fermentative regime if this substrate amount is in excess with respect to the available oxidative capacity, thus leading to the formation of growth-inhibiting by-products (respectively lactate from glucose and ammonia from glutamine). The main drawback of such complex kinetic models is however the relatively large number of parameters and the necessity to achieve experiments in specific conditions so as to trigger the phenomena (activation, saturation or inhibition) reflected in the model. A more practical alternative is to directly deduce the model structure from the available data, assuming that the data sufficiently cover the process operating range. In [5], Principal Component Analysis (PCA) is used to determine the minimum number of reactions required to interpret the data. This methodology was further extended in [6], where an insightful geometric interpretation is provided, and maximum likelihood principal component analysis (MLPCA) is used to estimate the reaction number and stoichiometric matrix. In this study, the latter approach is applied to the culture of hybridoma cells in sequential batch reactors (SBR). This mode of operation is common in industrial practice, and poses the question of the information content of data sets collected during the several batches. Before embarking in the analysis of actual experimental data, synthetic data are first generated with a process simulator, with the possibility of varying the initial conditions of the batches and the data sampling rate within each single batch. The methodology is subsequently applied to real-life experimental data, with a better grasp of the data information content. The main contribution of this study is threefold: A simple dynamic model of cultures of hybridoma cells in SBR is developed and validated with experimental data. Confidence intervals for the parameters and the estimated trajectories are provided. A systematic model identification procedure, based on rigorous yet simple to use tools—MLPCA to determine the stoichiometry, nonlinear least squares to identify the parameters of the kinetic laws, sensitivity analysis and Monte Carlo analysis to infer the confidence intervals—is assessed in a real case study, showing good performance and promise for future applications. The simple dynamic model is further exploited to optimize the medium renewal strategy in the sequential batches. This paper is organized as follows. Section 2 reviews the basic concepts of verflow metabolism and mathematical modeling using principal component analysis. Section 3 presents the experimental case study and process operating conditions. The dynamic model of hybridoma sequential batch cultures is derived in Section 4 and parameters are identified from experimental data in Section 5. Subsequently, Section 6 develops a parametric sensitivity analysis and proposes further model simplifications. The simplified model is identified and cross-validated with two different data sets in Section 7. Finally, the dynamic model is used to optimize the culture medium renewal time and composition in Section 8 while conclusions are drawn in Section 9.

2. Dynamic Modeling of Hybridoma Cultures

This section first reviews the basic concept of overflow metabolism and then presents a systematic procedure to infer candidate macroscopic models from principal component analysis of the data at hand.

2.1. Overflow Metabolism

The main physiological feature of hybridoma resides in their primary metabolism or, more precisely, in their catabolism, presenting the following main pathways: The glycolysis which is a series of degradation reactions of glucose (the main substrate) taking place in the cytoplasm and leading to a final product, i.e., pyruvate. The Krebs cycle, also called the tricarboxylic acids (TCA) cycle or citric acids cycle, which takes place inside the mitochondrions and uses pyruvate to product the cells energy units (Adenosine triphosphate or ATP) and reduced cofactors (typically NADH and FADH). The electron transport, still taking place in mitochondrions and producing ATP from the reduced cofactors. The fermentative pathway which, in oxygen limitation, produces typical products like lactate from pyruvate in the cytoplasm. Cell catabolism is characterized by a limited energy production (i.e., the Krebs cycle has a limited capacity) principally used for cell growth and division. This limitation comes from the capacity to oxidize the main nutrients: glucose (main carbon source) and glutamine (main nitrogen source). The excess amounts of these nutrients are assumed to follow other metabolic pathways more commonly known as “fermentation”, producing a side byproduct. This “Overflow Metabolism” or “short-term Crabtree effect” [4,7,8,9,10,11], is typically observed with yeast, bacteria and animal cell cultures. Depending on the case, it leads to the production of ethanol, acetate and lactate/ammonium as side byproducts. Several descriptions of this switching mechanism have been proposed in the literature (for instance in [10]) but this phenomenon remains not well understood. The byproduct formation usually inhibits the oxidative capacity of the cells, slowing down growth for increasing concentrations. In turn, it depends on the oxidative capacity of the cells and on the medium composition. A generic mechanistic model that would, in principle, allow the representation of the culture of different strains presenting overflow metabolism, can be described through the following main reactions: jth substrate consumption: jth substrate overflow: jth byproduct consumption: where X, S and P are the concentrations of cell biomass, jth substrate and jth side byproduct, respectively. The k coefficients represent the yield (or pseudo-stoichiometric) coefficients of component n in reaction i. Overflow metabolism assumption involves that, for each concerned substrate, these reactions take place in pairs (1a and 1b) or triplets (1a–1c) if the considered byproduct can be reconsumed by the biomass as a substitute substrate source when the oxidative capacity is not fully exploited. Indeed, Sonnleitner and Käppeli [12] assume that the cell oxidative capacity rules the general metabolism, following a bottleneck effect. During a culture, the cells are likely to change their metabolism depending on the exploitation of the respiratory capacity. At low substrate uptake rate (substrate concentrations are below critical levels S < S and φ < φ), substrate is consumed with biomass growth and a relatively low metabolite byproduct production (1a) without overflow, which is defined as respiratory metabolism and the consequent remaining respiratory capacity can be used to oxidize byproduct as substitute carbon source as in (1c). At high substrate uptake rate (substrate concentration is above critical level S > S and φ > φ), the respiratory capacity is saturated, resulting in overflow metabolism towards excess metabolite production (reactions (1a) and (1b)). The state at which overflow metabolism is initiated (S = S and φ = φ) is referred to as critical metabolism. For instance, yeast metabolism is described by the bottleneck assumption of Sonnleitner and Käppeli [12], as illustrated in Figure 1.

Figure 1

Schematic representation of the bottleneck assumption of Sonnleitner and Käppeli [12].

This model was exploited in [13,14,15] for robust control purposes. Based on a similar model, references [16,17] suggested practical ways to estimate state variables such as biomass, glucose or acetate in bacteria cultures using software sensors. Recently, reference [3] proposed a dynamic model of hybridoma fed-batch cultures based on a double overflow mechanism, one linked to glucose, and the other to glutamine, with good prediction capabilities.

2.2. Systematic Modeling Procedure

In contrast with the previous modeling approach which is based on past experience and a priori knowledge of the metabolic network, it is now suggested to derive a model based mostly on the information content of available data sets. This can be particularly relevant when the model structure is uncertain and experimental data sets are available that can be analyzed to extract information on the reaction stoichiometry and kinetics. First, we recall that bioprocesses can be represented by macroscopic reaction schemes involving M reactions between N components under the following generic form [18]: where (respectively, ) denotes the set of reactants (or products) in the “jth” reaction. The parameters k are pseudo-stoichiometric coefficients while is the corresponding reaction rate. Applying mass balances to (2), the following ordinary differential equation system is obtained: where K is the pseudo-stoichiometric matrix and represents the transport term taking dilutions, input feeds and gaseous outflows into account. is the vector containing all the kinetic parameters. The number of components N is generally larger than the number of reactions M so that the rank of the stoichiometric matrix K is assumed to be M. For instance, in [3], M = 5 and N = 6. Defining the transport-free state evolution and integrating (3) between two consecutive measurement times lead to the following expression: where is the differential transport-free state vector. As discussed in [6], Equation (4) expresses that is contained in a M-dimensional linear subspace, and MLPCA allows to determine subspaces of increasing dimensions p explaining a noisy data set (and therefore reaction schemes of increasing detail explaining the experimental data). A systematic procedure can therefore be developed, which selects the smallest value of p that allows a thorough interpretation of the data up to a given confidence level, minimizing a log-likelihood cost: where n is the number of measured vector samples and is the noisy measurement of , with an error covariance matrix and is its maximum-likelihood (ML) estimate by the reduced p-dimensional linear model [6]. Jp is a decreasing function of p which is always smaller or equal to the log-likelihood cost J* of the true nonlinear model. Since J* is known to have a chi-square distribution with degrees of freedom [19]. The number of reaction is just chosen as the smallest p such that the log-likelihood cost Jp is smaller or equal to the range of a χ-distributed random variable. Once the number of reactions is determined, the resulting N by p affine subspace basis can be used to estimate a stoichiometric matrix , which is a linear combination of the basis vectors, i.e., with G a p by p regular matrix. For a complete estimation of the stoichiometry, p biological constraints have to be imposed in each column of (for instance the fact that a specific reactant or product is involved in only one reaction).

3. Experimental Case Study—Materials and Methods

3.1. Operating Conditions

In the framework of this study, six sequential suspended hybridoma batch cultures of 2 hybridoma strains (called, for the sake of confidentiality, HB1 and HB2) were performed in two series of three 200 mL T-flasks. In this protocol, at the initial time of each batch, biomass is kept in the reactor, while the metabolites (lactate, ammonia and monoclonal antibodies) are withdrawn and the substrate concentrations (glucose and glutamine) are set to prescribed values (respectively ranging between 6 and 7 g/L, and 0.3 and 0.4 g/L). The end-of-batch viable and dead biomass concentrations are considered as the initial conditions of the next batch (the initial biomass concentration of the first batch is 0.1 × 106 cells/mL). The culture time is approximately 15 days and one medium renewal is performed approximately after one week. Measurements are taken once every day. The culture medium is based on 10% FBS (ThermoFischer, Waltham, MA, USA) added to DMEM (Lonza, Belgium) with 6 g/L of glucose and 4 mM of l-glutamine, and is replaced at a specific time (approximately after one week) when one of the substrates (glucose and glutamine) is exhausted, in order to avoid starving. Most of the times, due to the selected medium composition, glutamine is the limiting substrate. As glucose measurement can be performed relatively quickly with respect to the other analytical methods, medium refreshments are achieved based on glucose concentration evolution. Indeed, when glutamine vanishes, glycolysis stops and glucose is not oxidized anymore. Once this phenomenon is observed, the medium is replaced within the day. Concerning the culture basic parameters, pH medium is set between 7.2–7.6 at the beginning of the batch and decreases to a minimum of pH between 6.7 and 7.0. The temperature is regulated at 37 °C in a 5% CO2 incubator.

3.2. Measurements and Data Sets

Measurements are collected off-line following different methods with respect to the component/analyte: Biomass: Living and dead biomasses are measured by cell-counting using Trypan blue and a Neubauer chamber. Glucose concentration is measured by using a Roche glycemic analytical device called Accu-Chek allowing a fast calculus of the glucose concentration within a few seconds. Lactate concentration is also measured using a Roche device called Accutrend delivering fast concentration measurements using dipsticks. A “Mega-Calc” enzymatic kit from Megazyme is used to obtain the glutamine and ammonium concentrations. This method is based on absorbance measurements. Antibody concentration is obtained using an ELISA dosage of murine IgG designed by the CER group from Aye (Belgium) based on reactants from Bethyl Laboratories (ref A90-131A for coatage antibodies and A90-131P for revelation). The resulting data are shown in Figure 2 and Figure 3. As apparent, cell viability decreases significantly after four days but is maintained around 30% thanks to the medium replacement. The level of ammonium concentration is very low and ranges below the sensitivity level of the measurement method. Therefore, ammonium is not considered in the modeling study since concentrations are far below the growth-inhibiting level. Only the glucose overflow, producing lactate, will be taken into account.

Figure 2

HB1 sequential batch culture data sets. Medium refreshment times are indicated by the arrows.

Figure 3

HB2 sequential batch culture data sets. Medium refreshment times are indicated by the arrows.

4. Data-Driven Model Derivation

4.1. Data Processing

Before applying MLPCA to the data sets, elimination of data outliers should be achieved in order to reject measurement inconsistencies. For instance, the last increasing glutamine concentration measurements of HB1 third experiment should not be used in identification (and direct validation) since glutamine production is not possible. Even if part of the data is discarded for identification, all the measurements can be considered in cross-validation. In the next sections, the first two sets of HB1 data are selected for identification, and the rest of data for cross-validation.

4.2. MLPCA-Based Systematic Procedure

The methodology presented in Section 2.2 is now applied to the first two data sets of HB1. As shown in Figure 4, a 3-dimensional subspace (i.e., p = 3 reactions) is sufficient to interpret the data.

Figure 4

Log-likelihood costs of the p-dimensional subspaces of the first two HB1 data sets. The upper dashed line represents the chi-square quantile χ at 99.9% and the lower one the chi-square quantile χ at 0.1%.

The following matrix represents the maximum likelihood principal components defining the subspace basis related to Figure 4: To obtain a biologically-consistent stoichiometric matrix, reaction constraints have to be expressed so as define a matrix G as introduced in Equation (6): The existence of a glycolysis pathway where biomass grows on substrates, producing no lactate and without mortality (, , ); A sole glucose overflow pathway, according to the absence of ammonium (i.e., of glutamine overflow), where no dead biomass nor antibody is produced (, , ); A biomass death pathway (, ) theoretically with no substrate or metabolite concentration variations. The latter would represent too many constraints with respect to the available degree of freedom and arbitrarily, only the lactate coefficient is set to zero (). Indeed, due to the size of G, which is a 3 by 3 matrix, only 3 constraints can be expressed per reaction. The general constrained problem can be summarized as: Find solving (6), s.t. In contrast with [5,6], this case study offers the possibility to explore the scenario where biomass is produced out of several macro-reactions. A specific matrix related to the constrained problem (8) is provided by (9): Apparently, the glucose and glutamine stoichiometric coefficients in the third reaction (i.e., and ) are small compared to the sum of their respective values in reactions 1 and 2. A possible scenario is therefore to consider that and could be set to zero (the coefficient deviations is partly explained by the lack of information in the data and the measurement noise). The corresponding reaction scheme becomes: Substrate oxidation: Substrate overflow: Biomass death: where , and are the reaction rates introduced in Section 5.1. The corresponding mass balance equations are: Compared to published models such as [3], the number of reactions is reduced. This can be explained by the absence of ammonium and the related overflow mechanism. As our procedure is data-driven, it leads to the identification of the sole phenomena visible in the collected experimental data. Moreover, our strategy allows to decouple the identification of the stoichiometry from the kinetics or, at least, to get a first estimate of the stoichiometric parameters, independently of the kinetics. This can be an important asset when identifying bioprocess complex models with numerous parameters.

5. Parameter Identification

5.1. Reaction Rates

Since the double bottleneck glucose-glutamine is reduced to a simple bottleneck depending on both substrates, a reaction rate combining Monod factors is selected where while the death rate is given by

5.2. Initial Conditions and Identification Criterion

Starting with the previously obtained values of the stoichiometric matrix in (10a–10c) as stoichiometric parameter initial conditions, the whole parameter set (i.e., stoichiometric and kinetic parameters) can be identified minimizing a least-squares criterion measuring the distance between model simulated data and experimental measurements as in: where is the parameter vector initialized with . The initial state is a vector of length with being the number of experiments used in identification. is set using the experimental measurements at time t = 0. Q is the measurement error covariance matrix. As measurement error standard deviations are a priori unknown, it is common choice to set Q to a diagonal matrix with the squares of the maximum respective concentration levels. This allows to normalize the distances calculated in (16) and give equal importance to states with different orders of magnitude. Parameter identification is performed with the MATLAB library-optimizer ”fmincon”. This algoritm allows to set box constraints on the parameters so as to limit the search space, and is typically used here in three successive calls. The first call starts from the initial guess (the MLPCA estimates of the stoichiometry, and an “inspired guess” for the kinetics), and the next are initialized with the parameter values resulting from of the previous minimization. Clever initialization is essential in reducing the computational cost and in increasing the chance of capturing the global minimum.

5.3. Minimization and Multi-Start Strategy

A multi-start strategy is applied in order to check if convergence is achieved when starting from different locations in the 7-dimensional kinetic parameter space polytope bounded by vertices defined in Table 1, and to identify the best parameter set corresponding to the cost Function (16) global minimum.

Table 1

Vertices of the multi-start parameter polytope.

Kinetic Parameters/Min–Max Values	Minimum Initial Value	Maximum Initial Value
μmax1	0.1 h⁻¹	1 h⁻¹
μmax2	0.1 h⁻¹	1 h⁻¹
KG	0.01 g/L	1 g/L
KGn	0.01 g/L	1 g/L
KGd	0.1 g/L	10 g/L
KGnd	0.1 g/L	10 g/L
μdmax	0.1 h⁻¹	1 h⁻¹

25 runs were achieved, leading to the results shown in Table A1 in Appendix A. From a quantitative point of view, the minimization process is achieved efficiently in most of the cases since the cost function residuals are comprised in the interval in 22 out of the 25 runs (the initial order of magnitude of J is typically between 20 and 100). Runs 11, 16 and 21 lead respectively to , and , which are highlighted in Table A1 in Appendix A (large deviations in the value of the growth rate are observed). We can conclude that the neighborhood of the optimum should be reached in almost 90% of the runs based on random initialization inside the polytope.

Table A1

Parameter identification following the 25 multi-start runs (orange lines indicate clear local minima).

μmax1	μmax2	KG	KGn	KGd	KGnd	μdmax	k31	k32	k41	k42	k52	k61	k63
0.5559	0.1410	0.1456	0.0224	2.1895	0.1922	0.7028	2.4175	6.0250	1.2867	0.5061	10.6769	119.7005	7.6599
0.5880	0.2670	0.0451	0.0162	1.9315	1.1321	0.7073	2.3867	6.6606	0.7739	0.6705	14.6022	68.0054	7.7126
0.5547	0.1500	0.0115	0.0094	1.2845	1.2627	0.9401	4.2541	4.4612	0.4948	0.8604	9.6828	114.3043	9.5609
0.5473	0.1429	0.0280	0.0145	1.9152	0.3041	0.7235	2.8453	6.0009	0.7705	0.7854	10.3002	122.0962	10.2667
0.6058	0.0611	0.0526	0.0413	0.8423	0.9630	1.2922	4.2558	5.7751	2.0796	0.5338	8.1830	234.9037	14.3253
0.5626	0.1100	0.0521	0.0286	0.9322	0.3854	1.2512	3.2340	5.6626	1.5257	0.4498	9.6220	161.7081	4.3626
0.5887	0.2774	0.0273	0.0170	2.3108	1.7476	0.6160	2.7446	8.1550	0.8308	0.5936	15.0889	62.0973	10.5890
0.6303	0.1737	0.0988	0.0229	2.0640	2.0451	0.6517	2.7454	6.1184	1.0883	0.5328	10.1684	94.7350	13.3752
0.6139	0.2062	0.1491	0.0195	1.0761	1.6163	1.0645	2.8275	6.4991	0.9660	0.5718	11.4731	78.7716	13.9319
0.5195	0.2916	0.0223	0.0061	2.5413	0.3398	0.6242	3.9897	5.4858	0.3507	1.5180	18.3348	65.0805	5.7034
0.2676	0.5515	0.0170	0.0123	1.8476	1.9618	0.1667	3.5684	13.6956	1.4072	1.3057	15.4667	107.3452	17.9771
0.6048	0.1698	0.0305	0.0232	2.0572	0.7868	0.6622	3.5670	5.3198	1.1648	0.5179	10.4758	104.5675	9.7748
0.5868	0.1808	0.0329	0.0219	2.4127	1.7841	0.6059	3.3243	5.3469	1.0981	0.4273	10.7470	94.6231	7.8496
0.5865	0.1951	0.0285	0.0185	1.1563	0.6525	1.0732	2.5100	4.3891	0.8773	0.6274	11.3024	98.2351	3.4642
0.5287	0.2558	0.0744	0.0099	2.4822	2.1080	0.6520	5.1365	4.3431	0.6706	0.8310	16.4441	59.0160	12.3303
0.9859	0.6572	0.0100	0.1800	5.7454	2.3415	0.3338	7.0345	0.2045	0.8892	0.0953	0.0064	50.4414	0.0511
0.5609	0.2443	0.0260	0.0116	0.8086	0.6322	1.3976	2.7069	6.8815	0.5485	0.9639	13.8020	76.5734	6.3556
0.6099	0.0984	0.1118	0.0176	2.1902	1.6137	0.6266	2.4999	5.1055	0.8327	0.7344	8.4616	146.6756	18.1001
0.6311	0.1634	0.0090	0.0452	1.2904	1.3282	0.8407	0.5745	9.7854	1.9713	0.1358	10.8829	126.9890	0.2194
0.5854	0.1725	0.1799	0.0251	1.7937	0.2995	0.7516	2.8422	6.3408	1.2724	0.4102	11.3594	101.7783	8.0611
0.3523	0.1682	0.2644	0.0147	0.2683	0.2306	0.3314	3.5112	17.5043	0.9900	1.5072	27.5654	98.4703	16.7248
0.5577	0.3183	0.0328	0.0073	1.6475	1.3849	0.8834	2.8735	5.5942	0.4435	1.1310	17.5563	6.7974	47.0452
0.6057	0.2115	0.0757	0.0163	1.5704	1.5814	0.8071	2.4129	5.6236	0.7583	0.6688	11.0633	87.1816	7.4376
0.5954	0.1964	0.0197	0.0171	1.2385	0.7389	0.9992	2.3813	5.4283	0.7576	0.7362	11.0359	100.4704	3.6202
0.5482	0.2715	0.0400	0.0182	1.5790	0.2627	0.8576	6.9704	1.5800	1.0154	0.3502	17.0623	67.7835	6.2548

Interestingly, all the 22 runs lead to similar direct validation results shown in Figure 5. For the sake of space in this article, both experiments are graphed on the same figure and the second experiment starts after 15 days, i.e. when the first one is over. Overall, the model predicts well the experimental measurements. However, the prediction of antibody concentration is less accurate after the medium renewal at day 8, probably due to inaccurate biomass concentration measurements.

Figure 5

Direct validation of Model (11) on the first two data sets of the HB1 strain. The second experiment starts after 15 days. The solid line represents the model prediction while the circles represent the experimental measurements with a confidence interval at 99%.

6. Parametric Sensitivity Analysis

The evaluation of parametric sensitivities, i.e., the relative influence of the parameters on the model outputs, is useful to assess potential identifiability problems and confidence intervals. Identifiability depends on the model structure and parametrization as well as on the information content of the data. In unfavorable situations a lack of sensitivity could appear or correlation among parameters. When the model is identifiable with the data at hand, sensitivity information can be used to evaluate the Fisher Information Matrix (FIM) and in turn confidence intervals for the several parameters [20].

6.1. Parameter Error Covariance

The sensitivity of the ith state with respect to kth parameter at time t is theoretically defined by: Parametric sensitivities can be computed by integration of the following ordinary differential equations: with the model state equation. Parameter identifiability can be assessed using the Fisher Information Matrix (FIM), which can be computed as follows where is the sampling time and is the number of samples. An optimistic estimate of the parameter estimation error covariance matrix can be estimated based on the inverse of the FIM: with being the posterior estimate of the measurement error variance obtained from the residual cost function at the optimum: where is the total number of measurements and is the number of estimated parameters.

6.2. Application to the Case Study

The relative standard deviations (the square root of the diagonal of (20)) are shown in Table A2 in Appendix A for the 22 optimization runs under consideration. It is apparent that the error on is very large, which is a sign that model (11) is over-parameterized. Indeed, glucose concentration levels are low so that .

Table A2

Parameter estimation error relative standard deviations for the 22 selected runs.

μmax1	μmax2	KG	KGn	KGd	KGnd	μdmax	k31	k32	k41	k42	k52	k61	k63
3.6662	22.6437	130.2128	17.8640	49.5212	13.2501	31.2599	120.2970	24.7702	21.8237	16.9119	10.8063	22.3173	22.0970
3.1545	11.4168	361.2676	18.2503	55.3056	40.4276	36.8463	57.3265	26.6477	11.5497	15.0769	12.9871	11.3732	22.5188
2.7529	27.1667	1181.2541	23.8344	46.7243	43.2328	34.3950	73.5869	32.0299	30.7479	11.1012	12.1905	27.1262	20.1369
3.6238	24.3889	720.1689	20.7425	59.3874	18.0344	39.2357	103.7392	24.0107	24.4068	11.3759	11.0635	24.1557	17.9644
5.1282	46.0975	628.3497	18.5780	114.7653	46.7917	93.0397	126.2801	16.7504	42.9210	10.6211	8.0060	45.4639	15.6957
3.6607	27.9484	379.5510	17.5197	73.0553	20.8103	57.9583	107.1231	22.6208	26.9998	16.0283	9.7616	27.6345	44.1384
3.0754	11.0214	574.9346	18.7735	51.3645	61.7058	31.3976	48.8369	23.7644	10.9953	17.3372	13.4790	10.9730	16.2965
3.7369	19.7116	211.5824	17.6190	54.5606	71.9202	34.9982	83.7420	21.9111	18.2301	14.1168	10.5920	19.4313	13.2615
3.8635	16.5377	145.3122	18.6602	84.8642	60.1199	65.9035	69.0624	23.7561	15.6449	15.0802	11.7848	16.3232	12.8728
2.5121	10.8171	546.0726	24.6565	49.0097	18.0168	29.2449	32.5452	39.8743	17.6116	15.2724	16.7534	10.9220	30.5888
3.5450	19.6658	640.1403	17.3780	54.9197	32.1188	35.3471	67.2044	25.6640	18.4603	15.0928	10.6327	19.3963	17.6762
3.9618	18.6565	667.0940	18.3143	57.4314	68.1850	34.5779	65.4403	26.6676	17.9300	18.1388	11.5034	18.5005	24.8874
3.4707	16.9881	644.5080	17.9580	71.5894	26.5994	55.1697	79.9299	32.5613	16.7107	13.7367	11.4200	16.8977	52.9654
4.2695	14.5552	280.6340	24.0336	59.1986	67.5103	36.0739	34.9784	50.1691	15.7086	18.0133	17.5370	14.6961	16.8627
3.0381	13.3080	609.3884	20.3656	87.6372	27.4640	72.0226	57.3727	25.6625	15.4098	12.4396	13.1777	13.3089	27.3362
3.7332	36.3680	186.6357	20.3580	49.4555	58.1809	31.1673	159.1482	21.3178	32.7821	9.1231	9.4398	35.9587	11.4881
3.3887	16.9274	1933.8346	15.1813	69.0932	49.6417	50.7343	401.5180	14.7875	16.6962	52.5206	9.3112	16.6627	722.5528
3.5640	17.3184	110.6870	17.5278	61.0394	17.2553	41.0509	77.7436	23.3178	16.9096	19.8023	10.6291	17.0292	21.2591
3.2513	12.6435	484.7580	27.1346	58.2372	46.5384	40.9441	46.2395	44.1952	15.8306	17.7920	20.7195	25.5029	4.2209
3.4716	16.2920	247.1369	18.5861	60.9806	55.8643	43.1792	76.6727	26.1026	16.0291	12.9251	11.8084	16.2252	24.4574
3.2776	16.8715	871.4772	17.9683	61.5156	28.8860	46.3341	82.1495	25.7187	16.8224	11.7135	11.0761	16.8066	49.3217
3.0620	10.1596	397.7225	18.8584	77.1272	16.8420	54.2770	20.9435	121.3063	10.1811	31.2955	13.8143	9.9632	24.4122

7. Reduced Model Identification

7.1. Model Reduction

Expression (13) is simplified to: Since glutamine is the main nitrogen source dedicated to cell viability, it is not surprising that glutamine becomes responsible of cell growth, i.e. glycolysis and overflow.

7.2. Re-Identification

With the exception of a few local minima, multi-start identification again leads the minima in the range and direct validation is shown in Figure 6.

Figure 6

Direct validation of the reduced model with the first two data sets of the HB1 strain. The second experiment starts after 15 days. The solid line represents the model while the circles represent the experimental measurements with a confidence interval at 99%.

Relative standard deviations are much improved as shown in Table 2, only for the best run (i.e., presenting the best cost function and relative error standard deviations).

Table 2

Parameter estimates and relative estimation error for the best identification run of the reduced model.

Parameter		Relative Error Standard Deviation
μmax1	0.4849	1.7652
μmax2	0.3198	8.1283
KGn	0.0089	23.2540
KGd	1.5899	15.4666
KGnd	1.3359	49.2893
μdmax	0.8667	66.5216
k31	3.1207	34.1265
k32	15.2090	26.3604
k41	0.6245	9.5667
k42	1.2221	19.2377
k52	23.9586	20.3273
k61	43.5907	8.3527
k63	14.2221	10.9999

7.3. Reduced Model Cross-Validation

The third data set of HB1 is now used to cross-validate the identified model. During this cross-validation, initial states are re-estimated since initial measurement noise can be a critical source of result degradation. Results shown in Figure 7 are quite satisfactory even though the antibody concentration still suffers from discrepancies after medium renewal. It is worth noticing that the last 3 measurements of glutamine concentration are probably outliers following wrong analytical manipulations (glutamine is only consumed and cannot be produced).

Figure 7

Cross-validation of the reduced model on the third data set of the HB1 strain. The solid line represents the model while the circles represent the experimental measurements with a confidence interval at 99%.

The residual deviation between the model and the experimental data is given by . Interestingly, the model is also able of a relatively good prediction of the experimental data collected with the second hybridoma strain shown in Figure 8. The main discrepancy is in the prediction of the biomass (and consequently of the antibody concentration). However, residuals are still relatively low (), confirming the satisfactory results. These observations allow the perspective that macroscopic models could be adapted from one application to another at relatively little extra costs, just recalibrating the model based on some new available data, starting from the parameter estimates obtained in earlier applications.

Figure 8

Cross-validation with the second data set of the HB2 strain of the reduced model identified with HB1 data sets. The solid line represents the model while the circles represent the experimental measurements with a confidence interval at 99%.

7.4. Robustness to Parameter Uncertainty

Since the identified parameters of Section 7.2 show some uncertainties represented by their estimation error standard deviations (see Table 2), a Monte-Carlo analysis is developed, where each parameter is subject to normally distributed variations. 100 runs of the HB1 model are performed for the HB1 cross-validation data sets and are shown in Figure 9. The trajectory envelope is most of the time contained within the measurement confidence intervals with the exception of the MAb measurements following the medium renewal.

Figure 9

Monte-Carlo analysis of the cross validation of the HB1 model: the solid lines represent 100 model trajectories corresponding to normally distributed parameters characterized by the standard deviations of Table 3. Circles represent the experimental measurements with a confidence interval of 99%.

Results of the Monte-Carlo analysis are presented in Table 3.

Table 3

Results of the Monte-Carlo analysis: number of runs, minimum, maximum and mean values of the residual cost function J and standard deviation.

Runs	Min J	Max J	J Mean	J Std Deviation
100	1.3412	1.6225	1.4050	0.0576

Parameter variations can have a slight positive effect on cross validation (since the residual cost function was initially 1.3573 and the best Monte-Carlo case is 1.3412) but usually a negative effect, the worst case corresponding to a residual cost of 1.6225. Since all the runs provide satisfactory results, the identified model is quite acceptable for prediction and control purposes.

8. Optimization of the Monoclonal Antibody Production

This section intends to provide the best medium renewal time and composition to maximize the monoclonal antibody production and substrate savings. Using the validated model of Section 7, it is possible to express these targets in a mathematical objective function of the form: where α represents a weighting coefficient penalizing substrate savings with respect to MAb production, i.e. defining the degree of predominance of one target over the other. Minimization of (25) is achieved using the optimizer fmincon from the Matlab platform, in order to find the best values of θ = [trenewal Grenewal Gnrefresh], i.e., is the medium renewal time trenewal and the glucose and glutamine concentrations in the medium Grenewal and Gnrenewal. Fmincon also allows to specify box constraints for trenewal Є [3 14] days. These values are selected in accordance with the previous experimental results: medium renewal should on the one hand not occur too soon and on the other hand before the end of the experiment set at day 14. Grenewal Є [1 15] g/L and Gnrenewal Є [0.1 1] g/L so as to avoid cell starvation or growth inhibition through the accumulation of byproduct. Figure 10 shows the optimization results when α is set to zero and Grenewal and Gnrenewal are respectively set to 6 and 0.4 g/L (which is similar to the concentrations used in the experiments dedicated to model identification described in Section 7). The best time at which medium renewal should be achieved is found at trenewal = 4.54 days (in the previous experiments, renewal had been achieved after approximately 7 days). Moreover, the MAb production, defined as the sum of the final batch concentrations, amounts to 60.92 µg/mL which represents a production gain of 30% with respect to the experiments of Figure 1 (where the production can be estimated to 40 to 45 µg/mL).

Figure 10

Optimization of the renewal time with medium substrate concentrations respectively set to G = 6 g/L and Gn = 0.4 g/L (α = 0), while maximizing monoclonal antibody production.

Figure 11 shows new results when a strong emphasis is placed on substrate savings with α = 10. The optimizer converges to θ = [6.95 4.55 1], which leads to the following observations:

Figure 11

Optimization of the renewal time when an important substrate saving (α = 10) to maximize the produced monoclonal antibody amount.

Even when considering substrate savings, the upper bound of Gnrefresh is reached. Indeed, when G is depleted, Gn still limits biomass decay and therefore maintains an efficient MAb production rate. However, since ammonium production (byproduct formed by glutamine overflow) is not considered in the model obtained in Section 7, higher values of Gnrefresh are not recommended. Interestingly, approximately the same renewal time as in Figure 2 is obtained, which means that these experiments could be “economically” optimized only by revising the medium composition. Since MAb production clearly appears as a function of substrate saving penalization, new optimizations considering α in the range 0 to 500 with incremental steps of 50 are achieved in order to assess the impact of α on MAb production and select a good compromise. Results displayed in Figure 12 show that specific operating conditions can be chosen to reach a target MAb production. For instance, approximately 3 g/L of glucose are sufficient, with a renewal after 4 days, to harvest 75 µg/mL of MAb within 14 days. Moreover, operating conditions of Figure 2 seem to be a good economic compromise as 100 µg/mL can be harvested starting with a glucose concentration of 6 g/L and a renewal after 7 days. Concerning the glutamine concentration, the observations from Figure 10 are confirmed: since no glutamine overflow is considered, very high glutamine concentrations are unrealistically tolerated.

Figure 12

Evolution of the culture operating conditions and MAb production as functions of α.

9. Conclusions

In this work, a simple dynamic model of hybridoma sequential batch cultures is developed, which can be used to optimize the production of monoclonal antibodies. Maximum likelihood principal component analysis allows assessing the information content of the experimental data, providing the minimum number of reactions and the corresponding stoichiometry by solving an optimization problem under a few a priori biological constraints. An original formulation of the method is presented, allowing biomass to occur in several reactions. Advantages of the method are (a) to limit the number of reactions, i.e. to avoid a useless complication of the model with respect to the experimental field and the involved biological phenomena (activation, saturation, inhibition, etc.); (b) to offer the possibility to proceed to a quick first estimation of the stoichiometry independently of the kinetics, in turn reducing the number of unknown parameters (for the current model, stoichiometry represents half the parameter set); (c) to provide a “divide and conquer” approach where the stoichiometry and kinetics can be estimated separately or simultaneously, in an iterative way, starting from estimates obtained at the previous step. The procedure can be supplemented by parametric sensitivity analysis, which allows further model simplification, whenever needed, by isolating parameters with low sensitivities. A Monte-Carlo study, where parameter variations are considered in accordance with the resulting estimation error variances, shows that model trajectories are globally kept inside a corridor defined by measurement confidence intervals (i.e., parameter discrepancies do not cause critical model misevaluations). As a practical illustrative outcome of the present study, the obtained dynamic model is used for a two-sequential batch process optimization (determination of the best sequence and composition of medium renewals). The results show that the importance of substrate savings drives the location of the optimum. A renewal time scheduling can therefore be established based on the user will to save medium components such as substrates. Further experimental validations of the optimization method are important perspectives of this still on-going work as well as estimation and on-line control issues.

10 in total

1. Dynamic optimization of hybridoma growth in a fed-batch bioreactor.

Authors: S Dhir; K J Morrow; R R Rhinehart; T Wiesner
Journal: Biotechnol Bioeng Date: 2000-01-20 Impact factor: 4.530

2. On the estimation of the pseudo-stoichiometric matrix for macroscopic mass balance modelling of biotechnological processes.

Authors: Olivier Bernard; Georges Bastin
Journal: Math Biosci Date: 2004-12-19 Impact factor: 2.144

3. Observations on the carbohydrate metabolism of tumours.

Authors: H G Crabtree
Journal: Biochem J Date: 1929 Impact factor: 3.857

4. Stoichiometric identification with maximum likelihood principal component analysis.

Authors: Johan Mailier; Marcel Remy; Alain Vande Wouwer
Journal: J Math Biol Date: 2012-07-21 Impact factor: 2.259

5. The Crabtree effect: a regulatory system in yeast.

Authors: R H De Deken
Journal: J Gen Microbiol Date: 1966-08

6. Overflow metabolism in Escherichia coli during steady-state growth: transcriptional regulation and effect of the redox ratio.

Authors: G N Vemuri; E Altman; D P Sangurdekar; A B Khodursky; M A Eiteman
Journal: Appl Environ Microbiol Date: 2006-05 Impact factor: 4.792

7. Increasing NADH oxidation reduces overflow metabolism in Saccharomyces cerevisiae.

Authors: G N Vemuri; M A Eiteman; J E McEwen; L Olsson; J Nielsen
Journal: Proc Natl Acad Sci U S A Date: 2007-02-07 Impact factor: 11.205

8. A dynamic model for diauxic growth, overflow metabolism, and AI-2-mediated cell-cell communication of Salmonella Typhimurium based on systems biology concepts.

Authors: Astrid M Cappuyns; Kristel Bernaerts; Jos Vanderleyden; Jan F Van Impe
Journal: Biotechnol Bioeng Date: 2009-01-01 Impact factor: 4.530

9. Monitoring of fed-batch E. coli fermentations with software sensors.

Authors: A C A Veloso; I Rocha; E C Ferreira
Journal: Bioprocess Biosyst Eng Date: 2008-08-26 Impact factor: 3.210

10. Growth of Saccharomyces cerevisiae is controlled by its limited respiratory capacity: Formulation and verification of a hypothesis.

Authors: B Sonnleitner; O Käppeli
Journal: Biotechnol Bioeng Date: 1986-06 Impact factor: 4.530

10 in total

1 in total

1. Model-Based Optimization of a Fed-Batch Bioreactor for mAb Production Using a Hybridoma Cell Culture.

Authors: Gheorghe Maria
Journal: Molecules Date: 2020-11-30 Impact factor: 4.411

1 in total