Beyond the Michaelis-Menten equation: Accurate and efficient estimation of enzyme kinetic parameters.

Boseung Choi¹, Grzegorz A Rempala², Jae Kyoung Kim³.

Abstract

Examining enzyme kinetics is critical for understanding cellular systems and for using enzymes in industry. The Michaelis-Menten equation has been widely used for over a century to estimate the enzyme kinetic parameters from reaction progress curves of substrates, which is known as the progress curve assay. However, this canonical approach works in limited conditions, such as when there is a large excess of substrate over enzyme. Even when this condition is satisfied, the identifiability of parameters is not always guaranteed, and often not verifiable in practice. To overcome such limitations of the canonical approach for the progress curve assay, here we propose a Bayesian approach based on an equation derived with the total quasi-steady-state approximation. In contrast to the canonical approach, estimates obtained with this proposed approach exhibit little bias for any combination of enzyme and substrate concentrations. Importantly, unlike the canonical approach, an optimal experiment to identify parameters with certainty can be easily designed without any prior information. Indeed, with this proposed design, the kinetic parameters of diverse enzymes with disparate catalytic efficiencies, such as chymotrypsin, fumarase, and urease, can be accurately and precisely estimated from a minimal amount of timecourse data. A publicly accessible computational package performing such accurate and efficient Bayesian inference for enzyme kinetics is provided.

Year: 2017 PMID: 29208922 PMCID: PMC5717222 DOI: 10.1038/s41598-017-17072-z

Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379

Introduction

Because enzymes can modulate biochemical reaction rates by selectively catalyzing specific substrates[1], they play fundamental roles in metabolism, signal transduction, and cell regulation, and their malfunction can cause serious diseases[2,3]. Furthermore, enzymes have been used as extremely specific catalysts in diverse industrial fields such as drug development, biofuel production, and food processing[4]. A canonical approach used to understand enzyme kinetics for a century has been based on the Michaelis-Menten equation (MM equation), which was developed by Michaelis and Menten[5] and then more rigorously derived by Briggs and Haldane[6] using the standard quasi-steady-state approximation (sQSSA)[7]. The equation describes the dependence of enzyme-catalyzed reaction rates on the concentration of substrate by using two parameters: the catalytic constant, k_cat, and the Michaelis-Menten constant, K_M (see below for details). The k_cat determines the maximum rate of the reaction at saturating substrate concentrations, V_max = k_cat E_T, where E_T is the total enzyme concentration, and K_M is the substrate concentration at which the reaction rate is half of V_max. There are two major assays to estimate k_cat and K_M from a measured accumulation of product over time (i.e. progress curve): the initial velocity assay (initial rate analysis) and the reaction progress curve assay (progress curve analysis)[8-12]. For the initial velocity assay, initial rates of the reaction are measured for a range of substrate concentrations. Then, by using a linear transform of these data, such as Lineweaver-Burk plots, the two parameters can be easily estimated without use of any computational tools[8,9]. Recent advances in computational tools have led to an alternative approach: the reaction progress curve assay. In this assay, the entire timecourse (i.e. progress curve) is fitted to the solution of a differential equation or integrated rate equation, and thus the data are used more efficiently than in the initial velocity assay[10,11,13]. Albeit more technically challenging, the progress curve assay requires less data to estimate parameters than the initial velocity assay does. Since both assays are based on the MM equation, they should be performed only when the MM equation is valid, that is, when the enzyme concentration is much lower than the sum of the substrate concentration and K_M[7,14] (see below for more details). Because the value of K_M is usually not known a priori, to ensure the validity of the MM equation, in vitro experiments are typically performed with a much lower enzyme concentration than substrate concentration[15]. However, such conditions cannot be guaranteed in vivo, because endogenous enzyme concentrations are much higher than those used in a typical in vitro assay[16,17]. It is therefore risky to use the MM equation to analyze in vivo data and to predict in vivo enzyme activity by using parameters estimated from an in vitro assay[15,18]. Furthermore, even when the MM equation is valid, precise estimation is not guaranteed, because of the highly correlated structure and unidentifiability of the parameters[19-23]. That is, even though estimated parameters can fit the data accurately, the estimates can differ greatly from the actual values of k_cat and K_M. Because of the identifiability issue, experimental designs to infer the maximum possible information about the parameters have been investigated[12,13,20-24]. For instance, to ensure that the parameters can be identified from the initial velocity assay, the initial concentration of substrate needs to be increased from a low level to a higher level until the reaction velocity is saturated. For the saturation, the initial substrate concentration generally needs to be larger than 10 K_M, but often such high concentrations cannot be achieved[24].
For the progress curve assay, the initial substrate concentration is recommended to be at a similar level to K [23,25]. Note that both assays require prior knowledge of K , which gives rise to the conundrum that, in order to estimate K , the approximate value of K needs to be known. To overcome such limits on the inference using the model based on the MM equation, which is referred to as the sQ (standard QSSA) model, here we propose an alternative approach. In our approach, we use a different approximate model that is derived with the total QSSA and is referred to as the tQ (total QSSA) model[26-29]. By applying the Bayesian inference based on either the sQ or the tQ model to the product progress curve, we found that the estimates obtained with the sQ model were considerably biased when the enzyme concentration was not low. On the other hand, the estimates obtained with the tQ model were not biased for any combination of enzyme and substrate concentrations. Thus, with the tQ model, the experimental data from various conditions can be pooled without any restrictions to improve the accuracy and precision of the estimation. For instance, when two sets of timecourse data obtained under low and high enzyme concentrations are used together, the tQ model, but not the sQ model, leads to accurate and precise estimation. Another advantage of our approach is that, by analyzing the scatter plots of current estimates, the next optimal experiment to ensure the parameter identifiability can be easily designed without requiring any prior knowledge of the k and K values. The proposed optimized design yields accurate and precise estimation from a minimal amount of data simulated based on the kinetics of various enzymes: chymotrypsin, fumarase and urease, which have disparate catalytic efficiencies (k /K ). 
We provide a publicly accessible computational package that performs the Bayesian inference based on the tQ model, thus leading to accurate and efficient estimation of enzyme kinetics.
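The linear transform used in the initial velocity assay can be illustrated with a minimal sketch. It assumes the standard Lineweaver-Burk form 1/v = (K_M/V_max)(1/S) + 1/V_max and uses noise-free, made-up data; the function name and all numerical values are hypothetical and are not taken from the paper.

```python
def lineweaver_burk_fit(substrate, velocity):
    """Least-squares line through (1/S, 1/v); returns (v_max, k_m)."""
    xs = [1.0 / s for s in substrate]
    ys = [1.0 / v for v in velocity]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # equals K_M / V_max
    intercept = mean_y - slope * mean_x    # equals 1 / V_max
    v_max = 1.0 / intercept
    k_m = slope * v_max
    return v_max, k_m


# Hypothetical, noise-free initial rates generated from the MM equation.
V_MAX_TRUE, K_M_TRUE = 10.0, 2.0
substrate = [0.5, 1.0, 2.0, 5.0, 10.0, 20.0]
velocity = [V_MAX_TRUE * s / (K_M_TRUE + s) for s in substrate]

v_max_hat, k_m_hat = lineweaver_burk_fit(substrate, velocity)
print(v_max_hat, k_m_hat)
```

With noisy rates the reciprocal transform distorts the error structure, which is one reason fitting the whole progress curve can use the data more efficiently.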

Results

Two types of models describing enzyme kinetics: The sQ and tQ models

A fundamental enzyme reaction consists of a single enzyme and a single substrate, where the free enzyme (E) reversibly binds with the substrate (S) to form the complex (C), and the complex irreversibly dissociates into the product (P) and the free enzyme:

$$E + S \underset{k_b}{\overset{k_f}{\rightleftharpoons}} C \overset{k_{cat}}{\longrightarrow} P + E,$$

where the total enzyme concentration (E_T ≡ C + E) and the total substrate and product concentration (S_T ≡ S + C + P) are conserved. A popular model describing the accumulation of the product over time is based on the MM equation, as follows (see Supplementary Method for detailed derivation):

$$\frac{dP}{dt} = k_{cat}\frac{E_T(S_T - P)}{K_M + S_T - P}, \tag{1}$$

where K_M = (k_b + k_cat)/k_f is the Michaelis-Menten constant and k_cat is the catalytic constant. This sQ model derived with the standard QSSA has been widely used to estimate the kinetic parameters, K_M and k_cat, from the progress curve of the product[8-11,23,25]. Another model describing the accumulation of the product is derived with the total QSSA; it was developed later than the sQ model and thus has received less attention for parameter estimation[26-29]:

$$\frac{dP}{dt} = k_{cat}\frac{E_T + K_M + S_T - P - \sqrt{(E_T + K_M + S_T - P)^2 - 4E_T(S_T - P)}}{2}. \tag{2}$$

Although this tQ model is more complicated than the sQ model, it is accurate over wider ranges than the sQ model. Specifically, the sQ model is accurate when

$$\frac{E_T}{K_M + S_T} \ll 1, \tag{3}$$

which requires a low enzyme concentration[7,14]. On the other hand, the validity condition of the tQ model (Eq. 4) is expressed in terms of the dissociation constant K = k_b/k_f[27-29]. Importantly, this condition is generally valid and thus the tQ model, unlike the sQ model, is accurate even when the enzyme is in excess. See[14,30] for more details.

Next, we investigated the accuracy of the stochastic simulations performed with both models. Specifically, we compared the stochastic simulations using the Gillespie algorithm based on the propensity functions from either the original full model (described in Table S1), the sQ model (Table S2), or the tQ model (Table S3) for 9 different conditions[31-36]: E_T is either lower than, similar to, or higher than K_M, and S_T is also either lower than, similar to, or higher than K_M (Fig. 1). The stochastic simulations of the sQ model fail to approximate those of the original full model when E_T is not low (i.e., E_T is lower than neither S_T nor K_M). On the other hand, stochastic simulations using the tQ model are accurate for all conditions (Fig. 1), which is consistent with a recent study showing that stochastic simulations with the sQ and the tQ models are accurate when their deterministic validity conditions hold (Eqs (3) and (4))[37,38]. Taken together, the tQ model is valid for a wider range of conditions than the sQ model is, in both the deterministic and the stochastic sense.
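The deterministic difference between the two models can be sketched numerically. The example assumes the commonly cited rate laws, dP/dt = k_cat E_T (S_T − P)/(K_M + S_T − P) for the sQ model and dP/dt = k_cat [E_T + K_M + S_T − P − sqrt((E_T + K_M + S_T − P)^2 − 4 E_T (S_T − P))]/2 for the tQ model. The function names, the fixed-step Runge-Kutta integrator, and the parameter values (k_cat = 1, K_M = 2, S_T = 2) are illustrative assumptions, not the authors' implementation.

```python
import math


def sq_rate(p, k_cat, k_m, e_t, s_t):
    """sQ (Michaelis-Menten) rate of product formation."""
    return k_cat * e_t * (s_t - p) / (k_m + s_t - p)


def tq_rate(p, k_cat, k_m, e_t, s_t):
    """tQ rate of product formation."""
    a = e_t + k_m + s_t - p
    disc = a * a - 4.0 * e_t * (s_t - p)
    return k_cat * (a - math.sqrt(max(disc, 0.0))) / 2.0


def product_at(rate, k_cat, k_m, e_t, s_t, t_end, steps=2000):
    """Integrate dP/dt = rate(P) from P(0) = 0 with classical RK4."""
    h = t_end / steps
    p = 0.0
    for _ in range(steps):
        k1 = rate(p, k_cat, k_m, e_t, s_t)
        k2 = rate(p + 0.5 * h * k1, k_cat, k_m, e_t, s_t)
        k3 = rate(p + 0.5 * h * k2, k_cat, k_m, e_t, s_t)
        k4 = rate(p + h * k3, k_cat, k_m, e_t, s_t)
        p += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return p


# Hypothetical parameter values chosen only for illustration.
K_CAT, K_M, S_T, T_END = 1.0, 2.0, 2.0, 0.5
for e_t in (0.2, 40.0):  # low and high total enzyme
    p_sq = product_at(sq_rate, K_CAT, K_M, e_t, S_T, T_END)
    p_tq = product_at(tq_rate, K_CAT, K_M, e_t, S_T, T_END)
    print(f"E_T={e_t}: sQ P={p_sq:.4f}, tQ P={p_tq:.4f}")
```

At low E_T the two curves nearly coincide, while at high E_T the sQ model predicts much faster product accumulation than the tQ model.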

Figure 1

Whereas the sQ model fails to approximate the original full model as E_T increases, the tQ model is accurate regardless of E_T. Stochastic simulations of the original full model (Table S1), the sQ model (Table S2), and the tQ model (Table S3) were performed with S_T = 0.2, 2, or 80 nM, and E_T = 0.2, 2, or 40 nM. Note that these concentrations are either lower than, similar to, or higher than K_M ≈ 2 nM. Here, the lines and colored ranges represent a mean trajectory and fluctuation range (±2σ from the mean) of 10^4 stochastic simulations.

Estimation with the tQ model is unbiased for any combination of enzyme and substrate concentrations

Because the tQ model is accurate for a wider range of conditions than the sQ model is (Fig. 1), we hypothesized that the parameter estimation based on the tQ model is also accurate for more general conditions. To investigate this hypothesis, we first generated 10^2 noisy progress curves of P from the stochastic simulations of the original full model (Fig. S1). Then, we inferred parameters (k_cat and K_M) from these simulated data sets by applying the Bayesian inference with the likelihood functions based on either the sQ or the tQ model, under weakly informative gamma priors (Fig. S2) (see Methods for details). Note that throughout this study, we have used the simulated product progress curves (e.g. Fig. S1) because we need to know the true values of parameters for the accurate comparison of the estimations based on the sQ model and the tQ model.

We first focused on the estimation of k_cat under the assumption that the value of K_M is known. When E_T is low, so that both the sQ and the tQ models are accurate (Fig. 1 left), posterior samples obtained with both models are similar and successfully capture the true value of k_cat (Fig. 2a left). The posterior samples obtained with the two models are similar because, when E_T is low and thus E_T ≪ K_M + S_T, both models (Eqs 1 and 2) are approximately equivalent as follows:

$$\frac{dP}{dt} = k_{cat}\frac{E_T + K_M + S_T - P - \sqrt{(E_T + K_M + S_T - P)^2 - 4E_T(S_T - P)}}{2} \approx k_{cat}\frac{E_T(S_T - P)}{E_T + K_M + S_T - P} \approx k_{cat}\frac{E_T(S_T - P)}{K_M + S_T - P}, \tag{5}$$

where the first approximation comes from the Taylor expansion in terms of 4E_T(S_T − P)/(E_T + K_M + S_T − P)^2 (see[27-29] for details). Therefore, when E_T ≪ K_M + S_T and thus the sQ model is accurate, estimations with the sQ and the tQ models should be similar. On the other hand, when E_T is high, they show clear differences (Fig. 2a right): the posterior samples obtained with the sQ model show large errors, while those obtained with the tQ model accurately capture the true value of k_cat.
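The Taylor-expansion step behind the equivalence of the two models at low enzyme concentration can be written out explicitly. The shorthand a, b, and r below is introduced here only for illustration and assumes the usual tQ rate law with total enzyme E_T, total substrate S_T, and product P.

```latex
% Shorthand (illustrative): a = E_T + K_M + S_T - P,  b = E_T (S_T - P),  r = 4b / a^2.
\begin{align*}
\frac{a - \sqrt{a^{2} - 4b}}{2}
  &= \frac{a}{2}\left(1 - \sqrt{1 - r}\right)
   = \frac{a}{2}\left(\frac{r}{2} + \frac{r^{2}}{8} + O(r^{3})\right) \\
  &= \frac{b}{a} + O\!\left(a r^{2}\right)
   \approx \frac{E_T (S_T - P)}{E_T + K_M + S_T - P}.
\end{align*}
% If, in addition, E_T \ll K_M + S_T - P, dropping E_T from the denominator
% gives the sQ form E_T (S_T - P) / (K_M + S_T - P).
```

The first-order term is accurate whenever r is small, and the final step additionally needs the enzyme concentration to be negligible in the denominator, which is where the sQ model loses validity.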

Figure 2

The estimation of a single parameter (k_cat or K_M) with either the sQ or the tQ model. For each condition (S_T = 0.2, 2, or 80 nM, and E_T = 0.2, 2, or 40 nM), 10^5 posterior samples of either k_cat (a) or K_M (b) were obtained by applying the Bayesian inference to 10^2 noisy data sets (Fig. S1) (see Methods for details). When k_cat is sampled, K_M is fixed at its true value (a) and vice versa (b). Here, green triangles indicate the true values of the parameters. Whereas the estimates of k_cat and K_M obtained with the sQ model are biased as E_T increases, those obtained with the tQ model have negligible bias regardless of conditions (see Fig. S3 for box plots of estimates). As E_T or S_T increases, the posterior variance of K_M increases when the tQ model is used.

Similar results are also observed in the box plots of posterior means and posterior coefficients of variation (CVs) (Fig. S3a,b). Whereas posterior means obtained with the sQ model are biased when E_T is high, those obtained with the tQ model are accurate for all conditions (Fig. S3a). In particular, narrow distributions of posterior means indicate that the estimation of k_cat with the tQ model is robust against the noise in the data (Fig. S1). Furthermore, posterior CVs are much smaller than prior CVs (Fig. S3b), indicating precise estimation of k_cat with the tQ model.

Next, K_M was estimated under the assumption that the value of k_cat is known (Fig. 2b). Posterior samples of K_M obtained with the sQ model again show errors that grow with increasing E_T. Note that the estimates of K_M are biased upward, which implies that using the posterior estimates of K_M to check the validity condition of the MM equation (E_T ≪ K_M + S_T) can be misleading. On the other hand, the estimates of K_M obtained with the tQ model show little bias for all conditions. However, unlike the narrow posterior distributions of k_cat (Fig. 2a), those of K_M obtained with the tQ model become wider, so precision decreases as E_T or S_T increases (Fig. 2b). These patterns are also observed in the box plots of posterior means and posterior CVs (Fig. S3c,d). The identifiability problem arises because, when E_T ≫ K_M or S_T ≫ K_M and thus K_M ≪ E_T + S_T, K_M is negligible in the tQ model (Eq. 2), as follows:

$$\frac{dP}{dt} \approx k_{cat}\frac{E_T + S_T - P - \sqrt{(E_T + S_T - P)^2 - 4E_T(S_T - P)}}{2}. \tag{6}$$

Specifically, when K_M is much lower than E_T + S_T, the value of K_M has little effect on the dynamics of the tQ model and thus K_M is structurally unidentifiable. Taken together, the estimations of K_M with both the sQ and the tQ models are not satisfactory, although for different reasons: estimations with the sQ model can be biased and those with the tQ model can be structurally unidentifiable (Fig. 2b). Similar patterns were also observed when a more informative prior was given (Fig. S4). In particular, even with the informative prior, estimates obtained with the sQ model still show considerable error as E_T increases.
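How weakly the tQ rate depends on K_M once the enzyme or substrate concentration is large can be checked with a few lines of arithmetic. The sketch assumes the usual tQ rate law; the helper names, the candidate K_M values, and the two concentration settings are hypothetical choices for illustration.

```python
import math


def tq_rate(p, k_cat, k_m, e_t, s_t):
    """tQ rate of product formation."""
    a = e_t + k_m + s_t - p
    disc = a * a - 4.0 * e_t * (s_t - p)
    return k_cat * (a - math.sqrt(max(disc, 0.0))) / 2.0


def km_sensitivity(e_t, s_t, k_m_values, k_cat=1.0, p=0.0):
    """Relative spread of the initial tQ rate across candidate K_M values."""
    rates = [tq_rate(p, k_cat, k_m, e_t, s_t) for k_m in k_m_values]
    return (max(rates) - min(rates)) / max(rates)


K_M_VALUES = [0.5, 1.0, 2.0, 4.0]  # hypothetical candidates spanning 8-fold

spread_low = km_sensitivity(e_t=0.2, s_t=0.2, k_m_values=K_M_VALUES)
spread_high = km_sensitivity(e_t=40.0, s_t=80.0, k_m_values=K_M_VALUES)
print(f"low E_T, S_T:  {spread_low:.3f}")
print(f"high E_T, S_T: {spread_high:.3f}")
```

A small spread at high concentrations means very different K_M values produce nearly the same progress curve, which is the structural unidentifiability described above.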

Simultaneous estimation of k and K suffers from the lack of identifiability

Next, we considered simultaneous estimation of two parameters, k_cat and K_M, which is the typical goal of enzyme kinetics. For the same gamma priors used in the single-parameter estimation (Fig. 2), the distributions of posterior samples obtained with both models became wider overall (Fig. 3). To find the reason for such imprecise estimation, we analysed the scatter plots of posterior k_cat and K_M samples (Fig. 4). When S_T ≪ K_M (Fig. 4a–c), the posterior samples of k_cat and K_M obtained with the sQ model exhibited a strong correlation, because the dynamics of the sQ model depend only on the ratio k_cat/K_M, as seen in the following approximation:

$$\frac{dP}{dt} = k_{cat}\frac{E_T(S_T - P)}{K_M + S_T - P} \approx \frac{k_{cat}}{K_M}E_T(S_T - P),$$

where K_M + S_T − P ≈ K_M is used. On the other hand, when S_T ≫ K_M (Fig. 4g–i), the scatter plot of the sQ model becomes horizontal, indicating the structural unidentifiability of K_M. Indeed, the value of K_M has nearly no effect on the dynamics of the sQ model, as seen in the following approximation:

$$\frac{dP}{dt} = k_{cat}\frac{E_T(S_T - P)}{K_M + S_T - P} \approx k_{cat}E_T,$$

where K_M + S_T ≈ S_T is used as S_T ≫ K_M. Such lack of parameter identifiability when S_T ≪ K_M or S_T ≫ K_M is consistent with previous studies, which recommend using S_T ≈ K_M for more precise estimation[22,23]. However, even when S_T ≈ K_M, estimates are still imprecise (Fig. 3a and b middle). Furthermore, as E_T increases, the estimates obtained with the sQ model are biased (Fig. 3), as in the single-parameter estimation (Fig. 2). Based on this analysis, it appears that the simultaneous estimation of k_cat and K_M with the sQ model is challenging because of both identifiability and bias problems.
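The correlation between the two parameters at low substrate concentration can be made concrete with a small numerical sketch. It assumes the sQ rate law dP/dt = k_cat E_T (S_T − P)/(K_M + S_T − P) and compares two hypothetical parameter pairs that share the same ratio k_cat/K_M; all names and values are illustrative.

```python
def sq_product(k_cat, k_m, e_t, s_t, t_end, steps=4000):
    """Integrate the sQ model from P(0) = 0 with classical RK4."""
    def rate(p):
        return k_cat * e_t * (s_t - p) / (k_m + s_t - p)

    h = t_end / steps
    p = 0.0
    for _ in range(steps):
        k1 = rate(p)
        k2 = rate(p + 0.5 * h * k1)
        k3 = rate(p + 0.5 * h * k2)
        k4 = rate(p + h * k3)
        p += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return p


def relative_gap(s_t, e_t=0.2, t_end=10.0):
    """Relative difference in P(t_end) between two pairs with equal k_cat/K_M."""
    p_a = sq_product(1.0, 2.0, e_t, s_t, t_end)  # (k_cat, K_M) = (1, 2)
    p_b = sq_product(2.0, 4.0, e_t, s_t, t_end)  # (k_cat, K_M) = (2, 4)
    return abs(p_a - p_b) / p_a


gap_low_s = relative_gap(0.02)  # S_T far below both K_M values
gap_mid_s = relative_gap(2.0)   # S_T comparable to K_M
print(f"S_T=0.02: {gap_low_s:.4f}, S_T=2: {gap_mid_s:.4f}")
```

When S_T is far below K_M the two parameter pairs give almost the same amount of product, so a single progress curve cannot separate k_cat from K_M; the gap only opens up once S_T approaches K_M.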

Figure 3

Simultaneous estimation of two parameters (k_cat and K_M) with either the sQ or the tQ model. From the same 10^2 data sets (Fig. S1) used in the single-parameter estimation (Fig. 2), 10^5 posterior samples of k_cat (a) and K_M (b) were obtained together. Although the same prior is given, the posterior distributions become wider than in the single-parameter estimation (Fig. 2). Here, green triangles indicate the true values of k_cat or K_M.

Figure 4

The scatter plots of posterior samples obtained with the two-parameter estimation (Fig. 3). The scatter plots imply two types of structural unidentifiability: strong correlation between k_cat and K_M, and unidentifiability of K_M, which is represented as a horizontal plot. Positively correlated scatter plots of the tQ model are changed to horizontal ones when the sampled K_M is much lower than S_T + E_T (dashed gray lines). Here, green triangles represent the true values of parameters.

When E_T ≫ K_M or S_T ≫ K_M, K_M has a negligible effect on the dynamics of the tQ model (Eq. 6), and thus only k_cat was identifiable in the single-parameter estimation (Fig. 2a and b right or bottom). Similarly, when both k_cat and K_M are inferred simultaneously with the tQ model, estimation of only k_cat is accurate and precise (Fig. 3a and b right or bottom), as is shown by the horizontal scatter plots along the true value of k_cat (Fig. 4c,f,g–i). In other cases (when neither E_T ≫ K_M nor S_T ≫ K_M), posterior variance of both parameters dramatically increases compared to the single-parameter estimation (Figs 2 and 3 left and top). Such imprecise estimation stems from two sources, according to the scatter plots (Fig. 4a,b,d,e). When k_cat and K_M decrease together, the behavior of the tQ model changes little, as with the sQ model (Eq. 5), which leads to the strong correlation between posterior samples of k_cat and K_M. As the estimates of K_M keep decreasing together with those of k_cat, so that they become much less than E_T + S_T (dashed vertical line of Fig. 4), the tQ model no longer depends on the value of K_M, as shown in Eq. 6, and thus the scatter plots become horizontal.

Combined data from different experiments allow accurate and precise estimation with the tQ model

As shown above, the estimation of both k and K using a single progress curve suffers from considerable bias and lack of identifiability (Figs 3 and 4), which is consistent with previous studies reporting that a progress curve obtained from a single experiment is not enough to identify both parameters simultaneously[19]. Thus, here, we investigate whether using multiple timecourse data sets obtained under different experimental conditions can improve the estimation. In typical in vitro assays, progress curves are measured with either a fixed S and varied E or a fixed E and varied S [8-11,39]. We first consider the case when progress curves are measured with a fixed S and a varied E . Specifically, progress curves from both low and high E are used to estimate parameters for a fixed S at different levels (Fig. S1 top and bottom). In this case, posterior samples obtained with the sQ model show considerable errors as the data from high E is used (Figs 5a and S5). On the other hand, the posterior samples obtained with the tQ model accurately capture the true values of both k and K with low variance (Figs 5a and S5). Such improvement stems from the fact that data obtained under the low and high E provide different types of information for parameter estimation. Specifically, from the high E data, although the K is not identifiable, the k can be accurately estimated with the tQ model (Fig. 4c,f,i). Such accurate estimation of k from the high E data can prevent the correlation between the k and the K when they are estimated from the low E data (Fig. 4a,d). Indeed, the narrow scatter plots of the tQ model (Fig. 5b left and middle) are the intersection of two scatter plots, a horizontal one obtained with the high E data (Fig. 4c,f) and a nonhorizontal one obtained with the low E data (Fig. 4a,d). However, when S is high, the scatter plot from the low E also becomes horizontal (Fig. 4c), and thus the synergistic effect of using combined data decreases (Fig. 5a,b right). 
Taken together, the tQ model can accurately estimate both parameters from the combination of low E and high E data when S is not much larger than K . Note that such low S is preferred for in vitro experiments[24,39-41] and is the case for most physiological conditions[24].

Figure 5

When data obtained under low E_T and high E_T are used together, the accuracy and precision of estimates obtained with the tQ model, but not with the sQ model, are enhanced. (a) Posterior samples are inferred using data sets from E_T = 0.2 nM (Fig. S1 top) and E_T = 40 nM (Fig. S1 bottom) together for either S_T = 0.2, 2, or 80 nM. The posterior variance of the tQ model dramatically decreases to the level of the single-parameter estimation (Fig. 2). However, the estimates of the sQ model show considerable bias. Here, green triangles represent the true values of k_cat or K_M. (b) The scatter plots of the posterior samples. Here green triangles, blue circles, and red squares represent true values, posterior means of the sQ model, and those of the tQ model, respectively.

Next, we consider the case when progress curves are measured with a fixed E_T and a varied S_T. Specifically, the combination of two progress curves from low and high S_T is used to infer parameters for a fixed E_T at different levels (Fig. S1 left and right). When E_T is low, and thus the sQ and the tQ models behave similarly (Eq. 5), posterior samples obtained with both models accurately capture the true values of k_cat and K_M (Figs 6a left and S6). Again, the narrow scatter plot (Fig. 6b left) is obtained as the intersection of a nonhorizontal scatter plot of low S_T (Fig. 4a) and a horizontal scatter plot of high S_T (Fig. 4g). However, as E_T increases, and thus the sQ model becomes less accurate, the posterior samples obtained with the sQ model are biased, as expected (Figs 6a right and S6). Whereas such biases are not observed in those obtained with the tQ model, the precision of K_M estimates decreases as E_T increases, as in the single-parameter estimation (Fig. 2 and Eq. 6).

Figure 6

Estimation using the data obtained under low S and high S together. (a) Posterior samples are inferred using data sets from S = 0.2 nM (Fig. S1 left) and S = 80 nM (Fig. S1 right) together for either E = 0.2, 2, or 40 nM. When E is low, both the sQ and the tQ models allow accurate and precise estimation. As E increases, the estimates obtained with the sQ model become inaccurate, and the estimates of K obtained with the tQ model become less precise, similar to the single-parameter estimation (Fig. 2). Here, green triangles represent the true values of k or K . (b) The scatter plots of the posterior samples. Here green triangles, blue circles, and red squares represent true values, posterior means of the sQ model, and those of the tQ model, respectively.

Optimal design of experiments for accurate and efficient estimation with the tQ model

When a progress curve obtained from a single experiment is used, the posterior scatter plots of the tQ model can be categorized as a correlated type (Fig. 4a,b,d,e) and a horizontal type (Fig. 4c,f,g–i). The intersections of these two different types of scatter plots tend to be narrowly distributed near the true value (Figs 5b and 6b). Thus, combining two such data sets allows accurate estimation of both k_cat and K_M (Figs 5a and 6a). Specifically, a progress curve measured when neither E_T nor S_T is much larger than K_M (Fig. 4a,b,d,e) and one measured when E_T ≫ K_M or S_T ≫ K_M (Fig. 4c,f,g–i) provide different types of information for parameter estimation, so using both data sets leads to successful estimation. However, it is hard to compare the values of S_T, E_T, and K_M in practice, because the value of K_M is usually unknown a priori. This problem can be easily resolved by using the scatter plot. That is, if the posterior scatter plot obtained from the first experiment is horizontal, then both E_T and S_T should be decreased for the next experiment, so that a nonhorizontal scatter plot can be obtained (Fig. 7a). On the other hand, if the scatter plot from the first experiment shows a strong correlation between K_M and k_cat, then either S_T or E_T should be increased in the next experiment (Fig. 7b). In short, without any prior information about the values of K_M and k_cat, the shape of the scatter plot of the current estimates determines the next optimal experimental design, which ensures accurate and precise estimation. However, this approach cannot be used with the sQ model, because estimation with the sQ model can be biased, depending on the relationship between E_T or S_T and K_M, which is unknown a priori. That is, unlike with the tQ model, precise estimation does not always guarantee accurate estimation with the sQ model, as seen above (e.g. Fig. 5a right).
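The two-step design rule can be sketched as a simple decision function. This is only an illustration: the paper describes judging the shape of the scatter plot, whereas the code below substitutes a Pearson correlation with a hypothetical threshold of 0.5 as a crude stand-in, and the posterior samples are synthetic rather than produced by any inference.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for degenerate input."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def next_design(km_samples, kcat_samples, threshold=0.5):
    """Suggest the next experiment from first-experiment posterior samples.

    threshold is a hypothetical cut-off, not a value from the paper.
    """
    if abs(pearson(km_samples, kcat_samples)) >= threshold:
        return "increase E_T or S_T"   # correlated scatter plot
    return "decrease E_T and S_T"      # horizontal scatter plot


rng = random.Random(0)
# Synthetic "correlated" posterior: k_cat tracks K_M.
corr_km = [rng.uniform(0.5, 4.0) for _ in range(1000)]
corr_kcat = [0.5 * km + rng.gauss(0.0, 0.05) for km in corr_km]
# Synthetic "horizontal" posterior: k_cat pinned down, K_M spread out.
flat_km = [rng.uniform(0.01, 2.0) for _ in range(1000)]
flat_kcat = [1.0 + rng.gauss(0.0, 0.02) for _ in range(1000)]

print(next_design(corr_km, corr_kcat))
print(next_design(flat_km, flat_kcat))
```

The rule is meant for the first experiment only: once both data sets are pooled, a narrow and uncorrelated posterior indicates success rather than a need to lower the concentrations.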

Figure 7

The optimal experimental design for accurate and precise estimation with the tQ model. (a) When the scatter plot of posterior samples from the first experiment is horizontal, E_T and S_T need to be decreased to obtain a nonhorizontal scatter plot in the next experiment. Then, using the combination of the two experiments leads to accurate and precise estimation (red scatter plots). (b) When the scatter plot from the first experiment is nonhorizontal, E_T or S_T needs to be increased in the next experiment to obtain a horizontal scatter plot. (c) Inference with a single progress curve from the low E_T (0.1 K_M) and the high E_T (10 K_M) leads to nonhorizontal and horizontal scatter plots, respectively, for chymotrypsin, urease, and fumarase (gray scatter plots). When both data sets were used together, accurate estimates were obtained for all enzymes (red scatter plots). Here, low S_T (0.1 K_M) is used, and green triangles represent the true values of the parameters.

We test whether the proposed approach with the tQ model can accurately estimate k_cat and K_M for catalysis of N-acetylglycine ethyl ester, urea, and fumarate by the enzymes chymotrypsin, urease, and fumarase, respectively (Fig. 7c). These three enzymes were chosen because they have disparate catalytic efficiencies (k_cat/K_M)[1]: 0.12, 4 × 10^5, and 1.6 × 10^8 s^-1 M^-1, respectively. For each enzyme, 10^2 noisy timecourse data sets were generated using stochastic simulations based on known enzyme kinetic parameters[1]. When progress curves obtained with low E_T and low S_T are used, as expected, nonhorizontal scatter plots of posterior samples were obtained for all three enzymes (Fig. 7c). This indicates that either E_T or S_T should be increased in the next experiment to obtain a horizontal scatter plot. Indeed, when a progress curve with a 100-fold increase of E_T was used, horizontal scatter plots were obtained for all enzymes (Fig. 7c). Therefore, when these two progress curves are used together, both k_cat and K_M can be accurately estimated (Fig. 7c red dots). These results support the conclusion that such a two-step optimized experimental design (Fig. 7a,b), which yields two different types of scatter plots, allows accurate and efficient estimation of enzyme kinetics with the tQ model. The computational package performing such estimation is provided (see Methods for details).

Discussion

The standard approach for estimating enzyme kinetic parameters even today continues to be based on the 100-year old MM equation (Eq. 1)[5,6]. However, when enzyme concentration is high, this approach can lead to biased estimation (Fig. 2). Even when enzyme concentration is relatively low, it may not be possible to identify kinetic parameters (Figs 3 and 4). To overcome the limitations of the canonical approach, we proposed an estimation method based on an alternative to the MM equation: the tQ model (Eq. 2), which is derived with the total QSSA[26-29]. Because the estimation procedure with the tQ model is not biased regardless of enzyme or substrate concentrations (Fig. 2), more accurate and precise estimations can be made when pooled data from different experimental conditions are used, unlike the canonical approach (Figs 5 and 6). It appears thus that the tQ model is especially appropriate for creating a consistent Bayesian inferential framework, which becomes more accurate as more data is used. The canonical enzyme kinetic assay based on the MM equation generally requires a large excess of substrate over enzyme[42]. However, such conditions impose experimental limitations and cannot be always guaranteed and verified[15]. For instance, it is hard to generate a high concentration of barely soluble substrate[24], and a low concentration of substrate is required for sensitive kinetic analysis, e.g., in the case of QD-FRET-based probes[39-41]. Importantly, to analyze in vivo enzyme kinetics, where enzyme concentration is often high[16-18], our approach, but not the canonical approach, can be used. For example, one needs to estimate the kinetic parameters underlying drug metabolism by CYP enzymes in the liver in order to predict the effects of drugs, as is essential for drug development[43]. Because of dosing requirements for potent drugs, the amount of CYP enzyme can greatly exceed the drug amount in the liver[44,45]. 
Another large area where our estimation method can be applied is in the development of nanobiosensors, which measure in vivo activity of a specific enzyme for precise diagnostics, because such enzymes are often in large excess over biosensors[46,47].

KinTek Explorer has been widely used to estimate enzyme kinetic parameters from progress curves[48-50]. This software provides confidence contours, which reveal the relationships between the estimated parameters. This approach recommends using multiple data sets to narrow down the confidence contours and thus improve the precision of estimates and resolve the unidentifiability issue. Our finding (Fig. 7a,b) can provide the specific type of data sets required for the identifiability of k_cat and K_M, so that KinTek Explorer could perform parameter estimation more efficiently for Michaelis-Menten type enzyme reactions.

Since the initial velocity estimation with the MM equation is not accurate when enzyme concentration is high (Fig. 1), the standard initial velocity assay based on the MM equation would also be inaccurate[15]. On the other hand, the tQ model accurately captures the initial velocity for all conditions, and thus a modified initial velocity assay based on the tQ model is likely to be accurate over a wider range of conditions. To simplify such estimation procedures, an interesting future study could derive an analog of the Lineweaver-Burk plot or the Hanes-Woolf plot[8,9,15,42] for the tQ model. Even with relatively large noise in the data (Fig. S1), our proposed method leads to accurate estimation (Figs 5, 6 and 7), indicating its robustness against experimental noise and against some minor inaccuracy of the tQ model in certain parameter ranges observed in[14,30]. Furthermore, if there are departures from simple non-inhibitory enzyme kinetics (e.g. inhibition of enzyme by product)[51,52], our method can be easily adjusted by modifying the tQ model (see[53,54] for the tQ model for other enzyme kinetics).
Our work can also be used to improve the estimation of the kinetics underlying diverse biological functions, such as gene regulation[55,56], cellular rhythms[57-59], quorum sensing[60,61], signaling cascades[62,63], and membrane transport[64,65], where the MM equation has been widely used.
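
The contrast drawn in this Discussion between the MM rate and the tQ rate at high enzyme concentration can be illustrated numerically. The following is a minimal sketch, assuming that Eq. 1 and Eq. 2 take the standard textbook forms of the sQSSA and tQSSA rate laws (the equations themselves are not reproduced in this section). The function names and all parameter values are illustrative assumptions, not taken from the paper.

```python
import math


def sq_rate(s, e_total, k_cat, k_m):
    """Michaelis-Menten (sQ) rate; s is the remaining substrate S_T - P."""
    return k_cat * e_total * s / (k_m + s)


def tq_rate(s, e_total, k_cat, k_m):
    """tQ rate: k_cat times the quasi-steady-state complex concentration."""
    a = e_total + k_m + s
    return k_cat * (a - math.sqrt(a * a - 4.0 * e_total * s)) / 2.0


# Illustrative values only (arbitrary units), not taken from the paper.
k_cat, k_m = 1.0, 1.0

# Enzyme far below substrate: s = 10, E_T = 0.01.
low_enzyme = (sq_rate(10.0, 0.01, k_cat, k_m), tq_rate(10.0, 0.01, k_cat, k_m))

# Enzyme far above substrate: s = 1, E_T = 10.
high_enzyme = (sq_rate(1.0, 10.0, k_cat, k_m), tq_rate(1.0, 10.0, k_cat, k_m))

print("E_T << S_T (sQ, tQ):", low_enzyme)
print("E_T >> S_T (sQ, tQ):", high_enzyme)
```

With little enzyme the two rates nearly coincide. With excess enzyme the sQ rate exceeds k_cat times the remaining substrate, which is impossible because the complex concentration cannot exceed the substrate concentration, whereas the tQ rate stays below that bound.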

Methods

Simulated Data

To obtain timecourse data (Fig. S1) for Bayesian inference, stochastic simulations of the original full model (Table S1) were performed with the Gillespie algorithm[66]. The initial conditions E(0) = E_T, S(0) = S_T, C(0) = 0, and P(0) = 0 were used, following the typical in vitro enzyme kinetics protocol.
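
The simulation step can be sketched as follows. This is a minimal sketch of Gillespie's direct method, assuming that the full model of Table S1 is the standard mass-action scheme E + S ⇌ C → E + P with species given as molecule counts. The rate-constant names k_f and k_b, the function name, and all numerical values are assumptions for illustration, not values from the paper.

```python
import random


def gillespie_full_model(E_T, S_T, k_f, k_b, k_cat, obs_times, seed=0):
    """Simulate E + S <-> C -> E + P and return P at each observation time.

    Assumed scheme (hypothetical names): binding k_f, unbinding k_b,
    catalysis k_cat, with species given as molecule counts.
    """
    rng = random.Random(seed)
    E, S, C, P = E_T, S_T, 0, 0  # E(0) = E_T, S(0) = S_T, C(0) = P(0) = 0
    t = 0.0
    observed = []
    for t_obs in sorted(obs_times):
        while True:
            a_bind = k_f * E * S      # propensity of E + S -> C
            a_unbind = k_b * C        # propensity of C -> E + S
            a_cat = k_cat * C         # propensity of C -> E + P
            a_total = a_bind + a_unbind + a_cat
            if a_total == 0.0:
                break                 # nothing left to react
            dt = rng.expovariate(a_total)
            if t + dt > t_obs:
                # The next event falls after the observation time. The
                # waiting time is memoryless, so it can be redrawn from t_obs.
                break
            t += dt
            u = rng.random() * a_total
            if u < a_bind:
                E, S, C = E - 1, S - 1, C + 1
            elif u < a_bind + a_unbind:
                E, S, C = E + 1, S + 1, C - 1
            else:
                E, C, P = E + 1, C - 1, P + 1
        t = t_obs
        observed.append(P)
    return observed


# Illustrative run; the last observation time is long enough for completion.
progress = gillespie_full_model(E_T=10, S_T=20, k_f=1.0, k_b=1.0, k_cat=1.0,
                                obs_times=[0.5, 1.0, 2.0, 1e4])
print(progress)
```

The returned list is a product progress curve sampled at the requested times, which is the kind of data the inference step takes as input.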

Description of the Bayesian inference approach

The Bayesian inference approach is used to estimate the catalytic constant k_cat and the Michaelis-Menten constant K_M, based on the hazard function, with respective rates described in Eq. 1 for the sQ model and in Eq. 2 for the tQ model (Fig. S2). The likelihood functions are constructed based on an approximation to the underlying Markov model[66]. In these likelihoods, the rate λ is given by Eq. 1 (sQ model) or by Eq. 2 (tQ model), P_i is the scaled number of product molecules observed at time point t_i over the observation window [0, T], which runs from the first time point t_0 to the last, and n_i = P_i − P_{i−1} is the observed increment of P. With these likelihood functions, the usual independent gamma priors[67] are assigned to k_cat and K_M, and their posterior distributions are obtained with the Markov chain Monte Carlo (MCMC) method. Weakly informative gamma priors are used for both k_cat and K_M: their prior means are the same as their true values, and their prior variance is 10 times larger than the prior mean, which covers orders of magnitude (e.g. Fig. 2). A single parameter, i.e., either k_cat or K_M, is estimated conditionally on the other parameter. To estimate the two parameters simultaneously, the Gibbs sampler is used. To draw samples for K_M, we also use the Metropolis-Hastings algorithm within the Gibbs sampler step. See Supplementary material for further details.
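
The sampling scheme described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes that each increment n_i is approximately Poisson with mean λ(P_{i−1})(t_i − t_{i−1}), that λ is the standard tQ rate, and that gamma priors are parameterized by (shape, rate). Under these assumptions the conditional posterior of k_cat is again a gamma distribution, so it can be drawn directly, while K_M is updated with a random-walk Metropolis-Hastings step on the log scale. All function names, the step size, and the example data are hypothetical.

```python
import math
import random


def tq_shape(P, K_M, E_T, S_T):
    """Assumed tQ rate divided by k_cat, evaluated at product level P."""
    s = S_T - P
    a = E_T + K_M + s
    return 0.5 * (a - math.sqrt(a * a - 4.0 * E_T * s))


def log_lik(k_cat, K_M, times, P, E_T, S_T):
    """Poisson-increment log-likelihood (assumption), up to a constant."""
    total = 0.0
    for i in range(1, len(times)):
        n = P[i] - P[i - 1]
        mean = k_cat * tq_shape(P[i - 1], K_M, E_T, S_T) * (times[i] - times[i - 1])
        if mean <= 0.0:
            if n > 0:
                return -math.inf  # product cannot increase at zero rate
            continue
        total += n * math.log(mean) - mean
    return total


def sample_posterior(times, P, E_T, S_T, prior_k, prior_K,
                     n_iter=2000, step=0.3, seed=1):
    """Metropolis-within-Gibbs sketch; priors are (shape, rate) pairs."""
    rng = random.Random(seed)
    (a_k, b_k), (a_K, b_K) = prior_k, prior_K
    k_cat, K_M = a_k / b_k, a_K / b_K  # start at the prior means
    total_n = P[-1] - P[0]

    def log_target_K(K, k):
        # Log conditional posterior of K_M given k_cat, up to a constant.
        return log_lik(k, K, times, P, E_T, S_T) + (a_K - 1.0) * math.log(K) - b_K * K

    samples = []
    for _ in range(n_iter):
        # Gibbs step: conjugate gamma update for k_cat given K_M.
        exposure = sum(tq_shape(P[i - 1], K_M, E_T, S_T) * (times[i] - times[i - 1])
                       for i in range(1, len(times)))
        k_cat = rng.gammavariate(a_k + total_n, 1.0 / (b_k + exposure))

        # Metropolis-Hastings step for K_M: log-normal random walk, where
        # the extra step * z term is the Jacobian of the log-scale proposal.
        z = rng.gauss(0.0, 1.0)
        K_new = K_M * math.exp(step * z)
        log_acc = log_target_K(K_new, k_cat) - log_target_K(K_M, k_cat) + step * z
        if rng.random() < math.exp(min(0.0, log_acc)):
            K_M = K_new
        samples.append((k_cat, K_M))
    return samples
```

As a usage example, one reading of the prior described above (prior mean m and prior variance 10m) corresponds to shape m/10 and rate 1/10, so a call might look like `sample_posterior(times, P, E_T, S_T, prior_k=(m_k / 10, 0.1), prior_K=(m_K / 10, 0.1))`, where `m_k` and `m_K` are the assumed prior means.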

Computational code

The R package that performs the Bayesian inference based on the tQ model is available on the CRAN repository (https://cran.r-project.org/web/packages/EKMCMC).

