Validation-based model selection for 13C metabolic flux analysis with uncertain measurement errors.

Nicolas Sundqvist1, Nina Grankvist2,3,4, Jeramie Watrous5, Mohit Jain5, Roland Nilsson2,3,4, Gunnar Cedersund1,6.

Abstract

Accurate measurements of metabolic fluxes in living cells are central to metabolism research and metabolic engineering. The gold standard method is model-based metabolic flux analysis (MFA), where fluxes are estimated indirectly from mass isotopomer data with the use of a mathematical model of the metabolic network. A critical step in MFA is model selection: choosing what compartments, metabolites, and reactions to include in the metabolic network model. Model selection is often done informally during the modelling process, based on the same data that is used for model fitting (estimation data). This can lead to either overly complex models (overfitting) or too simple ones (underfitting), in both cases resulting in poor flux estimates. Here, we propose a method for model selection based on independent validation data. We demonstrate in simulation studies that this method consistently chooses the correct model in a way that is independent of errors in measurement uncertainty. This independence is beneficial, since estimating the true magnitude of these errors can be difficult. In contrast, commonly used model selection methods based on the χ2-test choose different model structures depending on the believed measurement uncertainty; this can lead to errors in flux estimates, especially when the assumed error magnitude is substantially off. We present a new approach for quantification of prediction uncertainty of mass isotopomer distributions in other labelling experiments, to check for problems with too much or too little novelty in the validation data. Finally, in an isotope tracing study on human mammary epithelial cells, the validation-based model selection method identified pyruvate carboxylase as a key model component. Our results argue that validation-based model selection should be an integral part of MFA model development.

Year:  2022        PMID: 35404953      PMCID: PMC9022838          DOI: 10.1371/journal.pcbi.1009999

Source DB:  PubMed          Journal:  PLoS Comput Biol        ISSN: 1553-734X            Impact factor:   4.779


1. Introduction

Cellular metabolism is fundamental for all living organisms, involving thousands of metabolites and metabolic reactions that together form large interconnected metabolic networks [1,2]. While a substantial part of the human metabolic network has been reconstructed [2], measuring fluxes through individual reactions and metabolic pathways in living cells and tissues remains a challenge. This problem is central to a variety of medically relevant processes, including T-cell differentiation [3], caloric restriction and aging [4], cancer [5,6], the metabolic syndrome [7], and neurodegenerative diseases such as Parkinson’s disease [8]. The gold standard method for measuring metabolic fluxes in a given system is model-based metabolic flux analysis (MFA) [9]. In this technique, cells or tissues are fed “labelled” substrates containing stable isotopes such as 13C (Fig 1A). These substrates are metabolized to products containing various isotopic isomers (isotopomers) (Fig 1B). By measuring the abundance of these isotopomers, mass isotopomer distributions (MIDs, Fig 1C) are obtained for each metabolite [10]. Fluxes are then inferred by fitting a mathematical model to the observed MID data D (Fig 1D).
Fig 1

The basic steps in 13C MFA and the model selection problem.

(A) New substrates, containing 13C (dark circles), are fed to the cells. (B) These substrates are consumed and converted to end products in the cells, according to their biochemical reactions. (C) The labelled 13C molecules appear in various proportions in each of the mass isotopomers, and these proportions are summarized in distribution bar charts for each detected metabolite. (D) The iterative modelling cycle in which a hypothesized model structure is fitted to MID data. The model fit is evaluated, usually with a χ2-test, and either rejected or not. If the model structure is rejected, it is revised and evaluated again. If the model structure is not rejected, it is used for flux determination. (E) The iterative model development in (D) results in a model selection problem. Different approaches for solving this model selection problem might result in different model structures being selected. This paper evaluates how the uncertainty in measurement data affects uncertainty in model selection.

While the above methodology is well established for assessing the fit of a given MFA model, several problems arise when it is used for model selection. In practice, MFA models are usually developed iteratively (Fig 1D), by repeatedly fitting a sequence of successively modified models (adding or removing reactions, metabolites, and so on) to the same data, until a model is found acceptable, i.e. not statistically rejected. In practice, this means that the model passes the χ2-test for goodness-of-fit [11]. Given the iterative nature of modifying the model structures, model development thus turns into a model selection problem. Depending on the approach used to solve this model selection problem, different model structures might be selected given the same data set (Fig 1E). For instance, if the traditional iterative modelling cycle is used, the first model that passes the χ2-test might be selected and used for flux estimation.
On the other hand, there might be multiple model structures that pass the χ2-test. In this case, the model structure that passes the χ2-test with the biggest margin may be a better option. Generally, model selection approaches that rely solely on the χ2-test to select a model can be problematic. First, correctness of the χ2-test depends on knowing the number of identifiable parameters, which is needed to properly account for overfitting by adjusting the degrees of freedom of the χ2 distribution [12], but can be difficult to determine for nonlinear models [13]. Second, the χ2-test can be unreliable in practice since the underlying error model is often not accurate. Typically, the MID errors σ are estimated by sample standard deviations s from biological replicates, which for mass spectrometry data are often below 0.01 and can be as low as 0.001 (Fig 2A). However, such low estimates may not reflect all error sources. For example, MI fractions obtained from orbitrap instruments can be biased so that minor isotopomers are underestimated [14,15]. Also, s does not account for experimental bias, such as deviations from metabolic steady-state that always occur in batch cultures. Some such problems can be detected by repeating experiments, but others cannot. The normal distribution assumption itself is also questionable for MIDs, which are constrained to the n-simplex [16]. For these reasons, s can severely underestimate the actual errors, making it exceedingly difficult to find a model that passes a χ2-test. In this situation, one is left with two bad choices: either arbitrarily increase s to some "reasonable" value to pass the χ2-test (Fig 2B), or introduce more or less well-motivated extra fluxes into the model. The former alternative, increasing s, may lead to high uncertainty in the estimated fluxes and does not necessarily reflect the experimental bias one tries to account for. The latter approach, introducing additional fluxes, increases model complexity and can lead to overfitting.
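The χ2-test for goodness-of-fit referred to above can be sketched in a few lines. This is an illustrative implementation of the standard test, not the authors' code; the function and variable names are ours:

```python
from scipy.stats import chi2

def passes_chi2_test(ssr, n_residuals, n_params, alpha=0.05):
    """Goodness-of-fit test for a weighted sum of squared residuals.

    Under the null hypothesis (correct model, correct error model),
    the weighted SSR is chi2-distributed with n_residuals - n_params
    degrees of freedom; the model passes if the SSR does not exceed
    the (1 - alpha) quantile of that distribution.
    """
    dof = n_residuals - n_params
    threshold = chi2.ppf(1.0 - alpha, dof)
    return ssr <= threshold
```

Note that if the believed standard deviation underestimates the true one, every residual is divided by too small a number, the weighted SSR is inflated, and no model passes the threshold, which is exactly the failure mode described above.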
Fig 2

Example of MID sample standard deviations. (A) Estimated mass isotopomer distribution (MID) of citrate from epithelial cells, as described in Section 2.5.

M+i indicates the fractional abundance of the i:th mass isotopomer. (B) Difference between the assumed magnitude of the standard deviations and the measured magnitudes.

While these issues with model selection are well known, they have, to our knowledge, not been treated systematically in the 13C MFA field. Indeed, MFA model selection is typically done informally by trial and error, and the underlying procedure is rarely reported [17]. However, in other contexts where model fitting is central, such as systems biology, the problem of model selection has been treated extensively [13,18-27]. In these areas, a widely accepted solution is to perform model selection on a separate "validation" data set, which is not used for model fitting. Intuitively, this protects against overfitting by choosing the model that best predicts new, independent data. In this paper, we propose a formalized version of such a validation-based model selection approach for MFA. In a series of simulated examples, we demonstrate that this method consistently selects the correct metabolic network model despite uncertainty in measurement errors, whereas "traditional" χ2-testing on the estimation data does not. By quantifying prediction uncertainty using the prediction profile likelihood, we can avoid cases where the validation data is too similar, or too dissimilar, to the estimation data. Finally, in an application to flux analysis on our own new data from human epithelial cells, we find that the same robustness to errors in the assumed measurement uncertainty holds, and that the validation-based model selection method identifies reactions known to be active in this cell type.

2. Results

To systematically examine the effects of the model selection procedure on MFA, we adopted a scheme where a sequence of models with increasing complexity (increasing number of parameters) is tested by each model selection method, simulating typical iterative model development. We considered five possible model selection methods that use all available data for both parameter estimation and model evaluation (Table 1). Method "SSR" selects the model with the smallest weighted summed squared residuals (SSR) on the data, and is included as a baseline. Method "First χ2" selects the model with fewest parameters (the "simplest" model) that passes a χ2-test, while accounting for overfitting by subtracting the number of free parameters p from the degrees of freedom of the χ2-distribution (see Section 4.3). Method "Best χ2" selects the model that passes the χ2-threshold with the greatest margin. Methods "AIC" and "BIC" select the model that minimizes the Akaike Information Criterion or the Bayesian Information Criterion, respectively [28,29]. These five methods all depend on the noise model Eq (5), and all except "SSR" also require knowing the number of free parameters p. Considering common practices in the field, it is probable that some combination of the "First χ2" and "Best χ2" methods is the prevailing approach in MFA modelling [17,30], although this is not entirely clear since the model selection process is often not described.
Table 1

A summary of the different model selection approaches considered in this paper.

Method of model selection    Model selection criterion
Estimation SSR               Selects the model with the lowest SSR given Dest
First χ2                     Selects the first Mk that passes the χ2-test
Best χ2                      Selects the Mk that passes the χ2-test with the greatest margin
AIC                          Selects the Mk that minimizes the Akaike Information Criterion
BIC                          Selects the Mk that minimizes the Bayesian Information Criterion
Validation                   Selects the Mk with the smallest SSR with respect to Dval
In addition to these methods, we propose a validation-based model selection method ("Validation") that divides the data D into estimation data Dest and validation data Dval. For each model, parameter estimation (model fitting) is then done using Dest, and the model achieving the smallest SSR with respect to Dval is selected. The division into estimation and validation data must be done so that qualitatively new information is present in the validation data. This can be done by reserving data from distinct model inputs or new model outputs for validation. For all examples herein, data from distinct model inputs is used for validation. For the 13C MFA examples, this means that the data used for validation comes from a different tracer. Note that this proposed method allows for the selection of the most suitable model from a given set, but that it does not guarantee that the selected model is acceptable according to e.g. a χ2-test. In other words, the model selected with our new Validation method would still need to be subjected to some form of final model testing. A detailed description of the "SSR", "First χ2", "Best χ2", and "Validation" methods can be found in S1 Algorithm: A-D respectively (S1 Algorithm).
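The criteria in Table 1 can be summarized in code. The sketch below is our own illustration, not the paper's implementation; in particular, the AIC/BIC expressions assume that the weighted SSR plays the role of -2 log-likelihood up to an additive constant, a common convention that may differ in detail from the paper's Section 4.3:

```python
import numpy as np
from scipy.stats import chi2

def select_model(fits, alpha=0.05):
    """Apply the Table 1 criteria to a list of candidate-model fits.

    Each fit is a dict with keys (names are ours):
      'ssr_est' : weighted SSR on the estimation data Dest
      'ssr_val' : weighted SSR on the validation data Dval
      'n'       : number of residuals in Dest
      'p'       : number of free parameters
    Models are assumed ordered by increasing complexity.
    Returns the index of the chosen model for each method.
    """
    choice = {}
    choice['SSR'] = int(np.argmin([f['ssr_est'] for f in fits]))

    # chi2-based methods: degrees of freedom adjusted by p
    margins = [chi2.ppf(1 - alpha, f['n'] - f['p']) - f['ssr_est'] for f in fits]
    passing = [i for i, m in enumerate(margins) if m >= 0]
    choice['First chi2'] = passing[0] if passing else None
    choice['Best chi2'] = int(np.argmax(margins))  # greatest margin to the threshold

    # Information criteria: weighted SSR stands in for -2 log-likelihood
    choice['AIC'] = int(np.argmin([f['ssr_est'] + 2 * f['p'] for f in fits]))
    choice['BIC'] = int(np.argmin([f['ssr_est'] + f['p'] * np.log(f['n']) for f in fits]))

    # Validation: best prediction of held-out data
    choice['Validation'] = int(np.argmin([f['ssr_val'] for f in fits]))
    return choice
```

Note that 'First chi2' returns None when no model passes, mirroring the situation discussed above where a too-small believed standard deviation inflates every SSR.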

2.1 A motivating example

Before examining the behavior of the different model selection methods on metabolic network models, it may be helpful to illustrate their properties on a simple univariate example. For this purpose, we considered a model with a single input x and a single output y = hn(x, u), where hn is the n-th order polynomial with parameter vector u. We assume that h7 is the correct model, with true parameters u0, and sampled 20 measurements y = h7(x, u0)+ϵ for different values of x, where ϵ was drawn from N(0, σr) with standard deviation σr = 0.2. To simulate uncertainty about the error model, we considered σr to be unknown, and let the various model selection methods choose among h1,…,h14 with a "believed" standard deviation, denoted σb, in the range [0.1 σr, 10 σr]. For the "Validation" method, we reserved 4 of the 20 measurements for Dval (Fig 3, red error bars).
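This polynomial experiment can be reproduced in a few lines. The code below is our own minimal sketch, with arbitrary choices of x-grid, random seed, and validation split; it applies only the "Validation" criterion:

```python
import numpy as np

rng = np.random.default_rng(1)
sigma_r = 0.2                        # true noise level
u0 = rng.normal(size=8)              # true 7th-order polynomial coefficients
x = np.linspace(-1, 1, 20)
y = np.polyval(u0, x) + rng.normal(0.0, sigma_r, size=x.size)

# Reserve 4 of the 20 points for validation, fit on the remaining 16
val = np.zeros(20, dtype=bool)
val[[3, 8, 13, 18]] = True

ssr_val = {}
for n in range(1, 15):               # candidate polynomial orders h_1..h_14
    coeffs = np.polyfit(x[~val], y[~val], deg=n)
    resid = (y[val] - np.polyval(coeffs, x[val])) / sigma_r  # weighted residuals
    ssr_val[n] = float(np.sum(resid ** 2))

best = min(ssr_val, key=ssr_val.get)  # the "Validation" method's choice
```

For a single noise realization the chosen order scatters around the true order 7; the paper's fractions in Fig 4 come from repeating this over many resampled data sets.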
Fig 3

Example of how model selection is affected by σb, for the polynomial model. Error bars indicate data sampled from a 7th order polynomial y = h7(x, u0)+ϵ where ϵ is N(0, σr), σr = 0.2. Colours indicate estimation data Dest (blue) and validation data Dval (red) used by the "Validation" method. Solid curves in (A–C) indicate polynomials chosen by an estimation-based method with different "believed" standard deviation σb. (A) σb = 2, chosen model h1. (B) σb = 0.2 (the true value), chosen model h7 (the correct model). (C) σb = 0.02, chosen model h14.

An illustration of the dependency on σb for a model selection method that does not use validation data is shown in Fig 3. When Dval is not considered, we would expect larger values of σb to result in a simpler model, since almost all of the variation in the data is interpreted as noise (Fig 3A). Conversely, at very small values of σb, an overly complex model is required to obtain an acceptable fit to Dest (Fig 3C). Applying the five model selection methods to data from this polynomial model gave different results (Figs 4 and S1). Since the model selection process is somewhat stochastic, we resampled the data 10,000 times, each time with a new error ϵ drawn from N(0, σr), and report results as the fraction of times a particular model was chosen. As expected, "SSR" mostly selected the most complex polynomial regardless of σb, as the most complex model always gives the lowest SSR (Fig 4A). In contrast, "First χ2" and "Best χ2" gave different results depending on σb. "First χ2" selected the correct model only when σb ≈ σr. At σb ≈ 10σr, only low-degree polynomials were chosen by the "First χ2" method, while at σb ≈ 0.1σr, an overly complex polynomial was chosen (Fig 4B). The "Best χ2" method selected the correct model for σb ≥ σr, but selected overly complex models for smaller σb (Fig 4C). If σb were to increase further, "Best χ2" would choose a lower-degree polynomial. This is because, for these χ2-based methods, the tradeoff between model complexity and goodness-of-fit is based on σb, and this tradeoff is thus correct only if we happen to have σb ≈ σr. Similar results are seen with the "AIC" and "BIC" methods, which also depend on σb (S1 Fig).
Fig 4

Model selection results for the polynomial model example.

(A–D) Heatmaps represent results from the indicated selection methods, where rows represent different values of σb and columns represent the polynomial models h1,…,h14. For each row, color indicates the fraction of times a model is selected for the given σb, out of 10,000 samples, as indicated by the color scale (right).

In contrast, the "Validation" method predominantly selected the correct model, h7, regardless of σb (Fig 4D). This happens because, even though a polynomial of the wrong degree may fit Dest well, it fails to predict independent validation data, resulting in a large SSR on Dval. Since the correct model structure best predicts new data, agreement with validation data helps identify the right model, even when the error model is inaccurate. Again, it should be recalled that the "Validation" method is only applicable to the task of selecting the best model, and that this selection should be followed by a final step that tests the quality of the model, e.g. using a χ2-test.

2.2 Model selection on multivariate linear models

To investigate model selection in a setting more relevant to metabolic networks, we next considered a multivariate linear model, where the output vector y is a linear combination of the inputs x weighted by the model parameters. In this case, each model structure is fully specified by a matrix A such that y = Ax, where the free parameters are the elements of A (Fig 5). This type of model is roughly analogous to a simple metabolic network, where x corresponds to labelled substrates and y corresponds to metabolic products. We constructed six such models (A1−A6) of increasing complexity, nested so that the parameter space of each Ak contains the parameter space of all models Al with l < k.
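To make this model class concrete, a given structure Ak can be encoded as a sparsity mask (True where a free parameter is allowed, zero elsewhere) and fitted by ordinary least squares. The function below is our own sketch, not the paper's code:

```python
import numpy as np

def fit_structure(mask, X, Y):
    """Fit y = A x by least squares, with A constrained to the sparsity
    pattern `mask` (boolean array of shape (n_outputs, n_inputs)).

    X: (n_samples, n_inputs), Y: (n_samples, n_outputs).
    Returns the fitted A and the summed squared residuals.
    """
    A = np.zeros(mask.shape)
    # Each output row of A can be fitted independently of the others
    for i in range(mask.shape[0]):
        cols = np.flatnonzero(mask[i])
        if cols.size:
            coef, *_ = np.linalg.lstsq(X[:, cols], Y[:, i], rcond=None)
            A[i, cols] = coef
    resid = Y - X @ A.T
    return A, float(np.sum(resid ** 2))
```

Because a richer mask can only reduce the residuals, nested structures fitted this way reproduce the basic overfitting dilemma: the "SSR" criterion always favors the densest A.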
Fig 5

Six different model structures for the linear model.

This example is chosen as a simple representation of a mass flow model. The top row shows the model names A1,…,A6. The second row shows the matrices that constitute the model structures. The third row gives visual illustrations of how the corresponding matrices connect the inputs xi and the outputs yi via the parameters a1,…,a6.

We then tested each of the six model selection methods on the generated data. As before, the "SSR" method chose the most complex models (A5 and A6, Fig 6A). For the other four methods that use only estimation data, the selected model again depended on σb. Method "First χ2" selected one of the simpler models, A2, at σb ≈ 10σr, and the correct model A3 only when σb ≈ σr, while at σb ≈ 0.1σr no model passed the χ2-test (Fig 6B). The "Best χ2" method selected the correct model, A3, for σb ≈ 10σr and σb ≈ σr, and model A6 for σb ≈ 0.1σr (Fig 6C). Methods "AIC" and "BIC" behaved similarly but chose somewhat different models (S2 Fig). Thus, which model is considered "best" depends on assumptions about the measurement noise, and established model selection methods give different results depending on what assumption is made.
Fig 6

Model selection results for the linear model example.

(A–D) Heatmaps represent results from the indicated selection methods, where rows represent different values of σb and columns represent the linear models A1,…,A6. For each row, color indicates the fraction of times a model is selected for the given σb, out of 1000 samples, as indicated by the color scale (right).

For the "Validation" method, simulated data from 2 of the 6 distinct inputs x were reserved as validation data Dval (S3 Fig). Again, this method predominantly selected the true model structure A3, regardless of σb (Fig 6D), and the selection results were consistent across all σb.

2.3 Model selection for simulated 13C MFA models

Let us now turn to model selection for multivariate, nonlinear MFA models. To simulate the process of MFA model development, we designed seven stoichiometric models of the tricarboxylic acid (TCA) cycle and related reactions, with increasing model complexity (S1 Table and Fig 7). The full, atom-level models were generated using the EMU decomposition method (see Methods). For all seven models, 51 MI fractions across nine metabolites (present in all models) were considered as measurement data. The data was simulated (Section 4.5) using the true model (model 4 in Fig 7), with four different tracers separately used as inputs x in order to generate four separate sets of MID data. For this example, we resampled the data 100 times. Note that, unlike the previous examples, this model is nonlinear in the parameters.
Fig 7

Seven different model structures included in the simulated EMU 13C MFA example with simulated data.

The component added to each model structure, relative to the previous, slightly simpler model, is marked by the red circle. The true model used to simulate the data is model nr 4. Detailed descriptions of each model can be found in the supplementary material (S1 Table).

As in the previous examples, the six methods of model selection were evaluated. As before, the "SSR" method always selected the most complex model (Fig 8A). Method "First χ2" selected different models depending on the value of σb (Fig 8B): too-simple models at σb ≈ 10σr and σb ≈ 3σr, the true model and a neighbouring model about 50% of the time each at σb ≈ σr, and overly complex models at σb ≈ 0.1σr and 0.3σr. In this example, the "Best χ2" method selected the correct model for σb ≈ 0.3σr and σb ≈ σr (Fig 8C), although at σb ≈ σr a fraction of the samples selected a different model rather than the true one (Fig 8C), and for σb ≈ 10σr "Best χ2" shifted towards selecting simpler model structures. Compared to the previous examples, the AIC and BIC methods appear somewhat more robust towards an unknown σr, selecting the true model for σb ≈ 0.1σr, 0.3σr, σr, and 3σr. Nevertheless, for σb ≈ 0.1σr both AIC and BIC show a tendency to prefer more complex models, and for σb ≈ 10σr both select too-simple model structures (S4 Fig).
Fig 8

Model selection results for the simulated 13C MFA model example.

(A–D) Heatmaps represent results from the indicated selection methods, where rows represent different values of σb and columns represent the MFA models . For each row, color indicates the fraction of times a model is selected for the given σb, out of 100 samples, as indicated by the color scale (right).

For the "Validation" method, parameters were estimated using MID data from 3 of the 4 tracers, while the fourth set of MID data was used as validation data Dval; the exact division is described in Section 4.5. The "Validation" method selected the correct model in 60–70% of cases, for all tested values of σb (Fig 8D). The key observation here is that the validation-based method obtains the same results independently of σb, although it should be noted that the "Best χ2" method does appear more robust in identifying the correct model when σb is correct. By selecting the wrong model structure, methods that depend on σb can lead to poor estimates of metabolic fluxes. For instance, when investigating the estimated flux values for the model selected by "First χ2" at σb ≈ 3σr, it becomes clear that this approach does not always capture the correct flux value within a 95% confidence interval (Fig 9). For the fluxes of mitochondrial aconitase 1 (ACONT1m) and acetyl-CoA synthetase (ASCm), for example, the true value lies several standard deviations outside the 95% confidence interval, indicating that the confidence intervals are not reliable. In contrast, the 95% confidence interval for the model selected by the "Validation" method does contain the true flux value (Fig 9). These results show that selecting the wrong model structure leads to errors in flux estimation, and that the "Validation" method is therefore advantageous for both model selection and flux estimation.
Fig 9

Comparison of estimated flux solutions for the simulated 13C MFA example.

The resulting flux values with 95% confidence intervals for seven of the fluxes that overlap between all model structures in the simulated 13C MFA example. The confidence intervals correspond to the estimated fluxes for the model selected by "First χ2" (blue), the true model fitted with all data available (green), and the true model fitted with the data split into Dest and Dval (red). The figure illustrates that selecting the wrong model structure may result in incorrect flux estimates.


2.4 Assessing the novelty of a validation experiment using prediction uncertainty

As shown in the previous examples, validation data can be used for the purpose of model selection. However, one important aspect to consider for this new method in 13C MFA is the degree of novelty of the validation data. There are essentially two pitfalls to avoid: 1) the validation data may be too similar to the estimation data (in which case it provides no new information), and 2) the validation data may be too dissimilar from the estimation data (in which case no model is able to predict it). Both pitfalls can be avoided by examining the uncertainty of the model predictions for the chosen validation experiment. The model's prediction uncertainty is essentially a confidence interval for the predictions; in the case of 13C MFA models, this is an interval for the predicted MIDs. The prediction uncertainty depends on the uncertainty of the estimated fluxes, the model structure, and the connection between the validation data and the estimation data. If the validation data is not novel enough (pitfall 1 above), the models will produce near-identical predictions that do not differ from the estimation data (Fig 10A). On the other hand, if the validation data is too novel (pitfall 2), the estimation data contains no information about the predicted MIDs, and the uncertainty will be very large (Fig 10B). Together, this means that the degree of uncertainty in the model predictions, compared to the difference in predictions between estimation and validation data, can be used to assess the novelty of the validation data. The desired scenario is validation data such that the predictions are well-determined and differ between estimation and validation data (Fig 10C). A general approach for determining prediction uncertainty has been outlined in previous work [31] and is implemented here for 13C MFA models in the EMU framework. A detailed description of this implementation is provided in Materials and methods, Section 4.4.
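The paper determines prediction uncertainty via the prediction profile likelihood (Section 4.4). As a simpler, conceptually related substitute (our own sketch, explicitly not the authors' method), a parametric bootstrap can give a rough prediction interval for the output of a validation experiment:

```python
import numpy as np

def bootstrap_prediction_interval(fit, predict, x_est, y_est, sigma, x_val,
                                  n_boot=200, level=0.95, rng=None):
    """Rough prediction interval at a validation input x_val.

    fit(x, y) -> parameters; predict(params, x) -> model output.
    The estimation data is resampled under the assumed Gaussian noise
    model around the fitted output, the model is refitted each time,
    and the spread of the resulting validation predictions is reported.
    """
    rng = rng or np.random.default_rng()
    y_hat = predict(fit(x_est, y_est), x_est)  # fitted estimation-data output
    preds = []
    for _ in range(n_boot):
        y_boot = y_hat + rng.normal(0.0, sigma, size=np.shape(y_hat))
        preds.append(predict(fit(x_est, y_boot), x_val))
    lo, hi = np.quantile(preds, [(1 - level) / 2, (1 + level) / 2], axis=0)
    return lo, hi
```

In the MFA setting, intervals spanning much of [0, 1] would signal pitfall 2 (too much novelty), while intervals indistinguishable from the estimation-data fit would signal pitfall 1 (too little novelty).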
Fig 10

How prediction uncertainty can be used to assess the novelty in the validation data.

(A) If there is too little novelty in the validation data, differences between estimation data and validation data will typically be smaller than the prediction and measurement uncertainty. (B) If there is too much novelty in the validation data, there is no information about the corresponding MIDs, and the prediction uncertainty will be large, approaching [0,1]. (C) An ideal design of validation data is thus to have well-determined predictions that are different compared to the estimation data. To be sure that there really is new information, one should also check that the new fluxes generate linearly independent EMU basis vectors (Section 2.4).

How prediction uncertainty can be used to assess the novelty in the validation data.

(A) If there is too little novelty in the validation data, differences between estimation data and validation data will typically be smaller than the prediction and measurement uncertainty. (B) If there is too much novelty in the validation data, there is no information about the corresponding MIDs, and the prediction uncertainty will be large, approaching [0,1]. (C) An ideal design of validation data is thus to have well-determined predictions that are different compared to the estimation data. To be sure that there really is new information, one should also check that the new fluxes generate linearly independent EMU basis vectors (Section 2.4). Another aspect that is important to consider for the case of 13C MFA, if the validation data consists of MIDs from a new tracer experiment, is that the new tracer is suitable to be used for validation. One approach to ensure that the new tracer generates data that is truly independent of the estimation data is to perform an EMU basis vector analysis [32]. This approach ensures that the tracers for the estimation and validation data produce linearly independent EMU basis vectors, which guarantees that the experiments give complementary information. This also ensures that one avoids the pitfall of having the validation data containing the same information as the estimation data, i.e. that the validation data is too similar to the estimation data. To demonstrate that the validation data used in the simulated 13C MFA example above does not fall into these pitfalls, the prediction uncertainties for the chosen model structure has been determined (Fig 11). As can be seen, the prediction uncertainties (light blue bars’ error bars) are well determined for all MIDs. We have thus avoided pitfall 2 above, i.e. the validation data is not too novel. 
We have also avoided pitfall 1, since i) for many of the MIDs, the predicted MIDs (light blue) do not overlap with the estimation-data MIDs (red bars), and ii) the EMU basis vectors are linearly independent.
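The linear-independence check mentioned above can be illustrated with a simple matrix rank test: stacking the validation-tracer basis vectors next to the estimation-tracer ones and asking whether the rank increases. The sketch below is in Python (the authors worked in MATLAB), and the basis vectors are hypothetical stand-ins, not actual EMU basis vectors from the study.

```python
import numpy as np

def adds_new_information(basis_est, basis_val, tol=1e-10):
    """Rank test: do the validation-tracer EMU basis vectors span
    directions not already covered by the estimation tracers?"""
    rank_est = np.linalg.matrix_rank(basis_est, tol=tol)
    rank_all = np.linalg.matrix_rank(np.hstack([basis_est, basis_val]), tol=tol)
    return rank_all > rank_est

# Toy 3-dimensional example with hypothetical basis vectors (columns):
basis_est = np.array([[1.0, 0.0],
                      [0.0, 1.0],
                      [0.0, 0.0]])
novel     = np.array([[0.0], [0.0], [1.0]])  # linearly independent of basis_est
redundant = np.array([[0.5], [0.5], [0.0]])  # already spanned by basis_est
```

A validation tracer whose basis vectors are like `redundant` would fail the check, signalling that the proposed validation data repeats information already present in the estimation data.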
Fig 11

Usage of prediction uncertainty to demonstrate that the validation data has neither too little, nor too much, novelty, compared to the estimation data.

This analysis shows the result from the simulated 13C MFA example (Fig 7–9). The model was trained on estimation data corresponding to three tracers: Tracer 1 = 1,2-13C-glutamine (dark red), Tracer 2 = 3-13C-pyruvate (red), and Tracer 3 = U-13C-glutamine (light red). The validation data (dark blue) came from usage of tracer U-13C-pyruvate. For the experimental data, the error bars represent standard deviation, and for the model predictions (light blue), the error bars represent model uncertainty (Section 4.4).


2.5 Model selection on cultured epithelial cells

Finally, we applied validation-based model selection on data from batch cultures of human cells. We performed two isotope labelling experiments with immortalized human mammary epithelial cells (HMECs), cultured with either U-13C-glucose or U-13C-glutamine for 6 cell doublings to achieve isotopic and metabolic steady-state (Materials and Methods, Section 4.6). MID data for nine metabolites were used as measurement data for this example, and the model structures used were the same as those presented previously in Section 2.3. The sample standard deviations from biological replicates were very small, around s = 0.005. In contrast to the previous theoretical examples for this system, the true σ and the true model structure are now unknown. However, by evaluating the six model selection approaches for a range of different believed σ, it is clear that the results are consistent even for these data (Fig 12). The “SSR” method always chose the most complex model (Fig 12A). The “First χ2” method selected a model for σ≈0.03 (Fig 12B), while for the smaller σ, no model passed the χ2-test. The “Best χ2” method selected different models for σ≈0.3, 0.015, and 0.003 (Fig 12C). Similarly, the BIC approach selected the same model for all values of σ, while the AIC selected one model for σ≈0.03 and 0.015, and another for σ = 0.003 (S5 Fig).
Fig 12

Model selection results for the cultured epithelial cell example.

(A–D) Heatmaps represent results from the indicated selection methods, where rows represent different values of σb and columns represent the MFA models. For each row, color indicates the fraction of times a model is selected for the given σb, out of 1000 samples, as indicated by the color scale (right).

Similarly, the validation-based approach selected the same model structure regardless of σ (Fig 12D). The selected model excludes reactions for unlabeled acetyl group entry into acetyl-CoA (which other candidate models include), representing catabolism of pre-existing fatty acids or acetate. Hence, this choice suggests that such entry does not occur in these cultures. This seems reasonable since the culture medium did not contain acetate, and was also free from serum and therefore contained very little fat. On the other hand, the selected model includes the pyruvate carboxylase reaction, which simpler candidates lack, suggesting that this reaction was necessary to explain the data. Also, the pyruvate carboxylase flux was nonzero (95% confidence interval [0.08 0.98]). Interestingly, pyruvate carboxylase has been shown to be present in mammary epithelium in vivo, where it is important for de novo fatty acid synthesis [33] by replenishing the TCA cycle carbon that is consumed by citrate export (“anaplerosis”). To investigate if fatty acid synthesis also occurred in the cultured HMEC cells, we measured the MID of cellular lysophosphatidylcholine (LPC) 16:0 as a proxy for palmitate (which was not detectable with the methods used) after 7 days of 13C labeling (Fig 13). The observed MID indicated that LPC 16:0 was a mixture of 13C-labeled and unlabeled species, with higher mass isotopomers indicating that fatty acid synthesis indeed occurred. To further test the selected model structure, we used the estimated MID for cytosolic acetate (Fig 13B) from the fitted model to predict the MID of palmitate and LPC 16:0, assuming a linear mixture of pre-existing (unlabeled) and newly synthesized (labeled) species (Fig 13A).
We found a reasonably good fit to the observed MID data at 82% newly synthesized LPC 16:0, indicating that the selected model reflects actual lipid metabolism in this model system (Fig 13D).
Fig 13

Validation of lipid synthesis in HMEC cultures.

(A) Schematic of the model for lysophosphatidylcholine (LPC) 16:0 synthesis from acetate (ac). (B) Predicted MID of ac from the model selected by the “Validation” method. (C) Measured MID of glycerol-3-phosphocholine (g3pc). (D) Fitted (gray) and measured (black) MID of LPC 16:0. Mean values of biological triplicates are shown in (C, D). Error bars indicate standard deviation.


3. Discussion

Since estimation of metabolic fluxes using 13C MFA critically depends on the metabolic network model used, a systematic approach to model selection is of great importance. As we have demonstrated, commonly used model selection criteria such as the χ2-test can give unpredictable results if the measurement error model is not accurate. Generally, we find that standard model selection methods that rely on a compensation for model complexity will choose different models depending on the “believed” standard deviation σ, both for polynomial (Fig 4), linear (Fig 6) and non-linear MFA models (Fig 8). Hence, when σ is inaccurate, these methods will over- or underfit the data, which naturally leads to errors in the estimated fluxes (Fig 9). Herein, we suggest remedying this problem by performing model selection on independent validation data, which is not used for estimating model parameters (“Validation” method). From our simulation studies, it is clear that this validation-based selection method is indeed more robust and selects the correct model in a way that is independent of errors in the size of σ (Figs 4, 6 and 8). Further, we demonstrate the importance of analysing the model’s prediction uncertainty in order to generate confidence that the selected model accurately approximates the true metabolic system (Figs 10–11). Finally, to illustrate the potential of validation-based model development in MFA, we also applied it to new experimental data. For these data, the validation-based method consistently identifies a single model structure, whereas “traditional” methods that exclusively rely on estimation data again select different models depending on σ (Fig 12). Furthermore, we also support the choice of model by predicting, with reasonable accuracy, the MID of LPC 16:0 (Fig 13), which is synthesized through precisely the reactions that differentiate the selected model structure from the other alternatives.
This illustrates yet another usage of validation data for model testing and model selection. In summary, validation-based model selection offers a more reliable approach to MFA model development when measurement errors are uncertain. There are several reasons why the model (Eq (5)) of normal-distributed, independent errors may not be accurate for MID data. First, since mass isotopomer fractions are constrained within [0,1] and sum to 1, strictly speaking they cannot be normal-distributed, nor independent. The normal assumption is particularly inaccurate for values close to 0 or 1, where the variance becomes very small. A better noise model for MIDs might be log-normal or other distributions on the n-simplex [16]. Moreover, MI fractions obtained from mass spectrometry can be biased for technical reasons: peak integration methods can affect MID accuracy [34], and minor isotopomers may be underestimated due to limited sensitivity [15]. Finally, there are biological sources of error that are difficult to avoid. For example, in batch cultures, cells can never attain perfect metabolic steady-state, and there may be unforeseen kinds of compartmentalization, such as cell subpopulations, organelles, or reaction channeling [35]. Taken together, the result of these “hidden” factors is that observed standard deviations s will be artificially small compared to the residuals. While it could be argued that such biases constitute model error, and that the χ2-test is correct in rejecting such models, it may be unrealistic to expect a perfect model fit in every scenario. Indeed, in many cases the estimated s is so small that it is exceedingly difficult to find a model that passes a χ2-test, even for minor deviations from the error model. An important topic for future research is to address these issues by developing more suitable error models for MFA. However, in the meantime, validation-based model selection could offer a pragmatic way forward.
The rationale for why the new validation-based method is robust with respect to errors in the magnitude of σ comes from general theory in the field of system identification. This theory says that if the data has been generated by a “true” model structure for some “true” parameters θ0, the estimated parameters will converge to θ0 as the number of data points goes to infinity. This theory assumes that the data used for model training is informative, i.e. that one would not gain any information regarding model distinction by exciting the system further, and that there is no redundancy of parameters, as is the case e.g. in structurally unidentifiable models. The convergence of the parameters to θ0 holds for a large class of nonlinear model structures, which includes all examples considered herein [36]. Furthermore, the overall magnitude of σ can be broken out from the cost function, and will thus not impact the location of any minima. These two facts together mean that if the magnitude is the only thing that is wrong with σ, the true model structure will still converge to the true parameters, while a too simple or too complex model structure will converge to the wrong parameters. This simple observation is the underlying motivation behind the validation-based method. Finally, note that in practice the magnitude or scaling error of σ is not necessarily exactly the same for all data points, and the results herein indicate that the new validation-based method is a good choice also in such situations. In other words, the validation-based method is useful also in cases when the error in the believed σ is not homogeneously scaled for all data points.
Furthermore, note that if one knows that the magnitude of the uncertainty for one metabolite is different from that of another metabolite, the believed σ can still reflect this difference, and the presented results will still hold as long as the same difference in magnitude between metabolite measurement uncertainties is also present in the true measurement uncertainty σ. Finally, we believe that a validation-based approach is beneficial also in situations where σ is completely unknown, since a model that successfully predicts independent validation data is probably a decent description of reality. Note, however, that if the believed σ is wrongly scaled for some data points but not for others, the parameter estimation will be biased towards those data points and will converge to the wrong parameters. In this case, the predictions will be wrong, and the validation method may select the wrong model structure. A key issue with the new validation method concerns how one divides data into an estimation and a validation data set. Clearly, the validation data must contain truly “novel” data: it is not sufficient to merely divide up replicate measurements y from the same experiment, which only differ by random noise. Herein, we have always used independent experiments with different inputs (tracers) x for validation data. In this form, validation-based model selection for MFA requires parallel experiments with distinct tracers, which naturally increases the experimental effort. An alternative might be to reserve certain measurement components y for the validation data set. In principle, the same methods for calculation of prediction uncertainty and validation-based model selection (Section 2.4) should be applicable also then, and preliminary analysis shows that this is indeed the case (S6 Fig).
The issue of which data points to reserve for the validation set is more difficult in our setting than in traditional cross-validation over statistically independent samples from a fixed data distribution. On one hand, highly dissimilar data points will be more difficult to predict, and should therefore provide a more stringent test for model selection. On the other hand, too dissimilar validation data (“extrapolation”) may not be predictable by any model. To judge this tradeoff, the prediction uncertainty method in Section 2.4 is useful. This topic is also relevant for experimental design, which could be adapted to generate informative validation data. Finally, an interesting aspect of these results is that models that are too simple also have too large a prediction uncertainty (S7 Fig), which is contrary to the traditional principle of the bias-variance tradeoff, which says that only too complex models have too high a variance. This further emphasizes the fundamental differences between statistical methods based on sampling from the same distribution and methods for mechanistic modelling, where data from different distributions can be used for the validation analysis. These results argue for a revision of such previously established truths coming from statistics [13,31,37]. It is important to distinguish between model selection and model testing. As mentioned earlier, while our method allows selecting the best model from a given set, it does not guarantee that the best model is indeed acceptable. While a goodness-of-fit test could be performed on the validation data for the selected best model, such a test will in general be optimistic due to multiple testing over models. For proper model testing, a third “test” data set should be used, which is not used for either parameter estimation or model selection.
The analysis in Fig 8 is only meant to illustrate that errors in model structure can lead to errors in flux estimation, and is not an exhaustive analysis of what these errors might look like. The fluxes depicted in Fig 8 are considered a representative selection of the fluxes shared between all model structures. The overall conclusion, that errors in model structure may lead to smaller or larger errors in flux estimates, should hold true. Based on our results, we suggest that validation-based model selection should always be considered when developing MFA models. Nevertheless, for small errors in σ, less computationally expensive methods such as AIC and BIC may give the same results. The problem, however, is that one does not know when the dependency on errors in σ makes those methods unreliable in cases where the magnitude of the experimental error is uncertain, and in such cases it is therefore safer to use a validation-based approach. Validation-based approaches also have important advantages related to interpretation, and are therefore commonplace in other fields. We believe that the field of MFA modelling should take inspiration from such other fields of computational biology, where the ability to correctly predict independent data not used for parameter estimation is a standard criterion for model quality, and where such validation tests often are a requirement for publication [20,21,23,24,26,27]. Given that models are always simplifications of reality, such independent validation is important both for the modelling process and for communicating results to non-expert users. In other words, while it is almost always wrong to assume steady-state metabolism occurring in a single average cell, such a model may still be a good enough approximation of reality to produce realistic fluxes. Importantly, one way to demonstrate the realism and general predictive power of the chosen model is to show that it can predict new independent validation data.
Notably, in guidelines issued by the US Food and Drug Administration (FDA), testing on independent validation data is a necessary condition for a model to be considered trustworthy [38]. All in all, we believe that validation-based model selection provides sound and reliable checking of metabolic models, which we hope will be of value also to the 13C MFA field.

4. Materials and methods

4.1 13C Metabolic flux analysis

As stated previously, the gold standard method for measuring metabolic fluxes in a given system is model-based metabolic flux analysis with isotopically labelled tracers. The model includes the stoichiometry and atom mappings for each reaction, and is parameterized by the metabolic fluxes v, or more precisely, by the independent fluxes u. At steady state, the model-predicted MIDs ŷ are uniquely determined by u together with the known isotope distributions x of the network substrates [9,39],

ŷ = h(u, x)

where the function h is determined by the model. Model fitting is done by seeking the vector u that minimizes the sum of the squared weighted residuals (SSR) between the model-predicted and measured MIDs [40],

f(u) = Σi (yi − hi(u, x))² / σi²

Here each measurement yi is assumed to derive from the model prediction at the true flux vector u0, plus a normal-distributed noise ϵ with standard deviation σ. If there are several experiments with different tracers x, the sum is taken over all resulting measurement vectors y [41]. Under these assumptions, f(u) follows a χ2-distribution, and so the χ2-test can be used to assess model fit [42]. If this test does not reject, the model and the inferred fluxes u are considered valid.
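The weighted SSR is straightforward to compute; the following is an illustrative sketch in Python (the authors worked in MATLAB), with made-up measured and predicted MID values.

```python
import numpy as np

def weighted_ssr(y_meas, y_pred, sigma):
    """Sum of squared weighted residuals:
    f(u) = sum_i (y_i - yhat_i)^2 / sigma_i^2."""
    y_meas, y_pred, sigma = (np.asarray(a, dtype=float)
                             for a in (y_meas, y_pred, sigma))
    return float(np.sum(((y_meas - y_pred) / sigma) ** 2))

# Hypothetical MID: a perfect fit gives SSR = 0
perfect = weighted_ssr([0.5, 0.3, 0.2], [0.5, 0.3, 0.2], [0.01, 0.01, 0.01])
# A residual of 2 standard deviations contributes 4 to the SSR
off_by_two = weighted_ssr([0.52], [0.50], [0.01])
```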

4.1 Construction of mathematical models: Predictors and the EMU framework

The mathematical models presented herein are formulated such that a mathematical structure describes one or more predictors ŷ, given a set of model parameters θ. These mathematical structures or models can exist in different forms, such as polynomial models or ordinary differential equation (ODE) models. For MFA, a common approach for model formulation is the elementary metabolite units (EMU) framework [34]. In short, the EMU framework allows for a decomposition of the model such that only the information necessary to calculate a desired set of MIDs remains. The metabolic network is broken down into EMU subnetworks that are used to formulate equations of the form in Eq (6) below [43]:

A(n,k) X(n,k) = B(n,k) Y(n,k)    (6)

where index n indicates the size of the EMU network and index k indexes several networks of the same size; where matrices A and B contain the model structure for the fluxes v, which can be parameterized using a smaller set of independent fluxes u; where matrices X and Y contain the unknown and known EMU variables, respectively; and where x are the EMU variables that correspond to the system tracer [43]. In other words, for EMU models, θ = u.
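At steady state, each EMU balance of this form is a linear system that can be solved directly. The sketch below (Python rather than the MATLAB/OpenFLUX2 setup used in the paper) uses a hypothetical one-EMU subnetwork: a metabolite c produced from substrates a and b with fluxes v1 and v2.

```python
import numpy as np

def solve_emu_network(A, B, Y):
    """Solve one EMU subnetwork balance A @ X = B @ Y for the unknown
    EMU mass-isotopomer matrix X (the form of Eq (6)).
    A and B hold flux-weighted connectivity; Y holds known MIDs."""
    return np.linalg.solve(A, B @ Y)

# Hypothetical fluxes: c is produced from a (v1 = 1) and b (v2 = 3),
# and consumed at v1 + v2 = 4.
A = np.array([[-4.0]])            # -(v1 + v2) on the diagonal
B = np.array([[-1.0, -3.0]])      # -v1, -v2 toward the known EMUs
Y = np.array([[1.0, 0.0],         # MID of a: fully unlabeled
              [0.0, 1.0]])        # MID of b: fully labeled
X = solve_emu_network(A, B, Y)    # MID of c: flux-weighted mixture
```

Here the solved MID of c is the flux-weighted average of its precursors, which is exactly the mixing behaviour the EMU equations encode.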

4.2 Optimization of model parameters to fit the data

The objective of the parameter estimation step in any modelling problem is to minimize an objective function f(θ), which determines the agreement with data for a given set of parameters θ. The general optimisation problem, which determines the optimal parameters θ*, is formulated as

θ* = arg min f(θ), subject to g(θ) ≤ 0

where g(θ) are functions describing constraints applied to the optimization problem. Again, for 13C MFA modelling, the parameters θ are the independent fluxes, herein denoted u. Also, for 13C MFA modelling, two constraints are usually placed on these independent fluxes:

null(S) · u ≥ 0
lb ≤ u ≤ ub

where null(S) is a null space matrix of the network’s stoichiometry matrix S; and where ub and lb are the upper and lower bounds of u, respectively. The first condition ensures that all fluxes, which are given by the product of null(S) and u, are positive. The second condition ensures that the independent fluxes are constrained within a predetermined interval. As for the detailed form of the objective function f, it can vary depending on the specific analysis conducted, but will generally be some variant of the weighted SSR function, since this objective function has sound theoretical properties [36]. The SSR used herein is given by Eq (4). For the EMU model, the predictors ŷ are read out directly from the state variables X: the predictor corresponding to a measured value y is the m:th mass fraction of the l:th EMU in X, i.e. a specific bar in Fig 1C. If a data point is a mean value of multiple original data points, then the standard error of the mean (SEM), rather than σ, should be used to weight the residuals in Eq (4). The relation between the two is given by

SEM = σ / √N

where N is the number of sample points. However, all of these are theoretical truths; which denominator to use in Eq (4) is an unresolved issue for 13C MFA models (see Section 3.2).
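The constrained fit can be sketched as follows in Python (the authors used MATLAB). The stoichiometry and the quadratic objective below are hypothetical toy stand-ins for a real network and the weighted SSR; the point is the structure: parameterize v = null(S)·u, and constrain null(S)·u ≥ 0 during optimization.

```python
import numpy as np
from scipy.linalg import null_space
from scipy.optimize import minimize

# Toy stoichiometry: one balanced metabolite, v1 + v2 - v3 = 0
S = np.array([[1.0, 1.0, -1.0]])
K = null_space(S)                  # all fluxes v = K @ u satisfy S @ v = 0

target = np.array([1.0, 2.0, 3.0])  # hypothetical "data" flux vector

def f(u):
    # Stand-in quadratic objective playing the role of the weighted SSR
    return float(np.sum((K @ u - target) ** 2))

res = minimize(
    f,
    x0=np.zeros(K.shape[1]),
    constraints=[{"type": "ineq", "fun": lambda u: K @ u}],  # null(S)·u >= 0
)
v_opt = K @ res.x   # optimal full flux vector
```

Because the target vector already satisfies the balance (1 + 2 − 3 = 0) and positivity, the fit recovers it; in a real MFA problem the objective would be the weighted SSR over simulated and measured MIDs, with bounds lb ≤ u ≤ ub added via the `bounds` argument.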

4.3 χ2-test

In the 13C MFA modelling field, the χ2-test is the statistical hypothesis test most commonly employed to evaluate whether the SSR is small enough, i.e. whether the model can be considered an accurate representation of the target system. In practice, the SSR is compared with the inverse cumulative χ2-distribution, where the degrees of freedom is given by the number of datapoints, adjusted for the fact that some independence between the datapoints and the model is lost by estimating parameters from the same data that is used for testing. This compensation can be done in different ways; the naive way is to do no compensation at all, and the most conservative way is to compensate for all parameters (free fluxes) in the model. The most accurate version is to instead use the number of practically identifiable parameters. In reality, this adjustment is done in different ways, often without justification, and these differences may be the reason why a model is, or is not, rejected. This ambiguity is one of the arguments against the usage of this test. Its dependency on the value of σ is another such argument. With the most conservative choice, the algorithm becomes:

Input: model structure, parameters θ, data D with N datapoints.
1. Calculate the combined SSR for all N data points, using Eq (4).
2. If SSR ≤ χ2inv(1 − α, N − p), i.e. the inverse of the cumulative χ2-distribution with N − p degrees of freedom (where p is the number of parameters), then the model structure is accepted; otherwise it is rejected.
Output: FAIL if the model structure is rejected, or PASS if the model structure is acceptable with respect to D.
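The PASS/FAIL procedure above can be sketched in a few lines (Python rather than MATLAB); `scipy.stats.chi2.ppf` provides the inverse cumulative χ2-distribution.

```python
from scipy.stats import chi2

def chi2_model_test(ssr, n_datapoints, n_params, alpha=0.05):
    """Most conservative degrees-of-freedom choice: N minus all free
    parameters (free fluxes). Returns PASS if the model cannot be rejected."""
    dof = n_datapoints - n_params
    threshold = chi2.ppf(1.0 - alpha, dof)  # inverse cumulative chi2
    return "PASS" if ssr <= threshold else "FAIL"

# Hypothetical numbers: 20 datapoints, 10 free fluxes, alpha = 0.05
verdict_small_ssr = chi2_model_test(5.0, 20, 10)
verdict_large_ssr = chi2_model_test(30.0, 20, 10)
```

Note that with the naive (no compensation) or practical-identifiability choices, only `dof` changes; the rest of the test is identical, which is precisely why the ambiguity discussed above can flip a verdict.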

4.4 Determining model prediction uncertainty

In this work, two main approaches were used to determine the prediction uncertainties. The first and primary approach was a prediction profile likelihood (PPL) analysis. A prediction profile likelihood analysis is used to determine the uncertainty of a predicted model value or property [44-46]. The PPL analysis was implemented by modifying the function for the SSR, seen in Eq (4), such that it contained an additional term:

fPPL(θ) = f(θ) + W · (ŷω(θ) − y*ω)²

where ŷω is the simulated relative abundance of mass fraction ω, determined by the parameters θ; y*ω is a set target value for mass fraction ω; and W is an integer with an arbitrarily large value. By assigning a very large value to W, any difference between ŷω and y*ω will be magnified. Thus, the optimization process will select parameters that minimize this difference. Then, y*ω was gradually stepped away from the optimal simulated value of the mass fraction ω, until a cutoff value was reached. This stepping process is repeated for all mass fractions that are included in the prediction. The second approach used for determining the model prediction uncertainty was estimation through Markov chain Monte Carlo (MCMC) sampling. For this analysis, a posterior distribution of parameter values is generated, and all parameter sets that are acceptable with respect to the estimation data are collected. The model prediction uncertainty is then determined by the interval spanned by all parameter sets θ satisfying

f(θ) ≤ f(θ*) + Δχ2(α, DoF)

where f(θ) is the generalised form of the SSR objective function described in Eq (4); f(θ*) is the SSR function value for the optimal parameters; α is the confidence level; Δχ2 is the quantile of the χ2-statistic; DoF is the degrees of freedom; and θ* are the optimal parameters. In this work, the DoF is equal to the number of model parameters, i.e. in the 13C MFA examples the number of free fluxes, and 10^5 samples were used for the sampling.
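The PPL stepping scheme can be sketched as below, using a hypothetical one-parameter model in place of a real MFA model (Python rather than the authors' MATLAB). The penalty weight W, step size, and cutoff are illustrative choices, not the values used in the study.

```python
import numpy as np
from scipy.optimize import minimize

def ppl_bounds(f_ssr, predict, theta_opt, cutoff, step=0.01, max_steps=50):
    """Sketch of the PPL stepping scheme for one predicted mass fraction:
    step the target y* away from the optimum, re-fit with the penalty
    W*(yhat(theta) - y*)^2, stop when the re-optimized SSR exceeds cutoff."""
    W = 1e6  # arbitrarily large penalty weight, as in the text
    bounds = []
    for direction in (+1.0, -1.0):
        y_star = predict(theta_opt)
        theta = np.array(theta_opt, dtype=float)
        for _ in range(max_steps):
            trial = y_star + direction * step
            res = minimize(lambda th: f_ssr(th) + W * (predict(th) - trial) ** 2,
                           theta)
            if f_ssr(res.x) > cutoff:
                break  # stepping further would exceed the cutoff
            y_star, theta = trial, res.x
        bounds.append(y_star)
    return min(bounds), max(bounds)

# Hypothetical 1-parameter "model": SSR = ((theta - 0.5)/0.1)^2,
# prediction = theta; chi2 cutoff 3.84 gives bounds near 0.5 +/- 0.19.
lo, hi = ppl_bounds(lambda th: ((th[0] - 0.5) / 0.1) ** 2,
                    lambda th: th[0],
                    np.array([0.5]), cutoff=3.84)
```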

4.5 Simulated data generation

For the examples presented in this paper, simulated data is utilized to create scenarios for model estimation in which the ground truth is known. In each of these examples, a given model with parameters θ0 has been used to generate values for selected variables of interest, given a predetermined set of model parameters and inputs. To these values, a normally distributed noise was added, with a given true standard deviation σ. For the polynomial example, a seventh-order polynomial was used as the true model (see S2 Table for true parameter values), and a normally distributed noise was added with σ = 0.2. The model was then fitted to data corresponding to one realisation of this noise, i.e. i = 1, using Eq (4). For the linear model example, the model A3 (Fig 5) was used as the true model (see S2 Table for true parameter values). A normally distributed noise was added with a true sigma of σ = 5. The model was then fitted to data corresponding to five realisations of this noise, i.e. i = 5, using Eq (4). For the EMU-model example, model 4 (Fig 7) was used as the true model (see S2 Table for true parameter values). A normally distributed noise was added with a true sigma of σ = 0.03. The model was then fitted to data corresponding to three realisations of this noise, i.e. i = 3, using Eq (4). The data was generated from four different tracers: U-13C-pyruvate, U-13C-glutamine, 3-13C-pyruvate, and 1,2-13C-glutamine. For the “Validation” method, the data from tracers U-13C-glutamine, 3-13C-pyruvate, and 1,2-13C-glutamine was used as estimation data, while the data from U-13C-pyruvate was used as validation data. To generate the different values of the believed σb, the sample standard deviation for each observation was scaled by a number drawn from a uniform random distribution. For σb≈10·σ the distribution range was [8, 12], for σb≈σ the range was [0.8, 1.2], and for σb≈0.1·σ the range was [0.08, 0.12].
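The noise generation and σb scaling described above can be sketched as follows (Python rather than MATLAB; the specific values are hypothetical placeholders):

```python
import numpy as np

rng = np.random.default_rng(seed=1)

def simulate_measurements(y_true, sigma_true, n_realizations):
    """Add normal-distributed noise with the true sigma to the model output,
    one row per noise realisation."""
    y_true = np.asarray(y_true, dtype=float)
    noise = rng.normal(0.0, sigma_true, size=(n_realizations, y_true.size))
    return y_true + noise

def believed_sigma(s_sample, scale_range):
    """Scale each observation's sample standard deviation by a factor drawn
    from a uniform distribution, e.g. (8, 12) for sigma_b ~ 10*sigma or
    (0.08, 0.12) for sigma_b ~ 0.1*sigma."""
    s_sample = np.asarray(s_sample, dtype=float)
    factors = rng.uniform(scale_range[0], scale_range[1], size=s_sample.shape)
    return s_sample * factors

# Hypothetical usage: three noise realisations of a 3-bin MID, and a
# believed sigma roughly 10x the true value of 0.03
data = simulate_measurements([0.5, 0.3, 0.2], 0.03, 3)
sigma_b = believed_sigma(np.full(5, 0.03), (8, 12))
```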

4.6 Cell culture and isotope tracing

Human Mammary Epithelial Cells (HMECs) were obtained from the laboratory of William C. Hahn (Dana-Farber Cancer Institute, Boston, USA) and have been previously described [47]. HMECs were grown in custom-synthesized Mammary Epithelial Basal Medium (MCDB) 170 [48] supplemented with 1% Mammary Epithelial Growth Supplement (MEGS) (S0155, Gibco), 100 units/ml penicillin and 100 μg/ml streptomycin (15140122, Thermo Fisher Scientific). Cells were kept in a humidified atmosphere of 5% CO2/95% air at 37°C and washed and detached using ReagentPack Subculture Reagents (CC-5034, Lonza). For isotope tracing experiments, 400,000 cells were seeded at day 0 in a T25 flask in 5mL medium and incubated overnight. On day 1, medium was changed to an MCDB 170 medium of the same molar composition, but with glucose or glutamine exchanged for U-13C-glucose or U-13C-glutamine (Cambridge Isotope Laboratories), respectively. On day 2, each T25 flask culture was detached and seeded into two T25 flasks, using the same medium. On day 4, cells were detached and seeded into 6-well plates at 250,000 cells/well in 2mL of medium, in triplicate for each tracer. On day 7 (after roughly 6 cell divisions in the presence of each 13C tracer), each multi-well plate was placed on ice, medium was aspirated and cells were washed twice with 1 mL of cold PBS. Then, 1 mL cold (–80°C) methanol (JT Baker, BAKR8402.2500, VWR) was added to each well, cells were scraped using a 17mm cell scraper (83.1830, Sarstedt), and the extracts were carefully transferred to a new tube, vortexed for 30 seconds to break up aggregated cell material, and stored at –80°C until analysis. LCMS analysis of cell extracts was performed using a pHILIC LC column coupled to a Thermo QExactive orbitrap mass spectrometer, as previously described [49]. All metabolite peaks reported were confirmed against pure standards.
Peak areas were integrated directly from instrument data using the mzAccess data access framework [50] and Mathematica v.11.1 (Wolfram Research). Mass isotopomer distributions were calculated as the areas of each mass isotopomer peak, divided by the total peak area for all mass isotopomers.

4.7 Model for the LPC 16:0 MID

The MID x_ac for acetate (ac) was obtained from the fitted model as described above in Section 2.5. The MID x_pmt of the total cellular palmitate pool was modeled as a linear mixture

x_pmt = α · x_new + (1 − α) · x_nat

where x_new is the MID of newly synthesized palmitate (pmt), computed by convolving x_ac with itself eight times, modeling the condensation of eight ac molecules by fatty acid synthase; x_nat is the natural MID, representing pre-existing palmitate; and α is the unknown mixture coefficient. The lysophosphatidylcholine MID x_lpc was modeled as a convolution of x_pmt and the measured glycerol-3-phosphocholine (g3pc) MID x_g3pc. This can be written as

x_lpc = C(x_g3pc) · x_pmt

where C(x) is a matrix whose elements depend on x. This yields an equation system linear in the unknown α, which was solved using the least-squares method.
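The convolution and mixture fit can be sketched as below (Python rather than MATLAB), with a hypothetical acetyl MID standing in for the model-estimated one. For a single unknown α, the least-squares solution has the closed form used in `fit_mixture_fraction`.

```python
import numpy as np

def convolve_n(mid, n):
    """MID of a molecule condensed from n identical units: n-fold convolution."""
    out = np.array([1.0])  # delta distribution (zero carbons)
    for _ in range(n):
        out = np.convolve(out, mid)
    return out

def fit_mixture_fraction(mid_obs, mid_new, mid_nat):
    """Closed-form least-squares estimate of alpha in
    mid_obs = alpha*mid_new + (1 - alpha)*mid_nat (linear in alpha)."""
    d = mid_new - mid_nat
    return float(d @ (mid_obs - mid_nat) / (d @ d))

# Hypothetical 2-carbon acetyl MID, condensed 8 times into C16 palmitate:
x_ac = np.array([0.3, 0.5, 0.2])   # M+0, M+1, M+2
mid_new = convolve_n(x_ac, 8)      # 17 mass isotopomers, M+0..M+16
mid_nat = np.zeros(17)
mid_nat[0] = 1.0                   # simplified "unlabeled" natural MID
mid_obs = 0.82 * mid_new + 0.18 * mid_nat   # synthetic observation
```

On this synthetic observation, the fit recovers the mixing fraction exactly; with real data one would additionally convolve with the measured g3pc MID to model LPC 16:0.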

4.8 Software

All analyses presented here were performed in MATLAB (The MathWorks Inc., release 2020b). For the models presented in this paper, the open source MATLAB toolbox OpenFLUX2 [51] was employed to transform the network structure to the EMU equation systems.

Heatmaps represent results from the AIC (left) and BIC (right) methods, where rows represent different values of σb and columns represent the polynomial models h1,…,h14. For each row, colour indicates the fraction of times a model is selected for the given σb, out of 10,000 samples, as indicated by the colour scale (right).

Heatmaps represent results from the AIC (left) and BIC (right) methods, where rows represent different values of σb and columns represent the linear models A1,…,A6. For each row, colour indicates the fraction of times a model is selected for the given σb, out of 1,000 samples, as indicated by the colour scale (right).

Illustration of the data and simulation for the Linear example.

The simulated data for the linear example plotted with the different input vectors x along the x-axis and the model output y on the y-axis. The three different output variables are indicated by the different colours: y1 − purple, y2 − green, and y3 − orange. The division into estimation data (left) and validation data (right) is indicated by the vertical line.

Heatmaps represent results from the AIC (left) and BIC (right) methods, where rows represent different values of σb and columns represent the MFA models. For each row, colour indicates the fraction of times a model is selected for the given σb, out of 100 samples, as indicated by the colour scale (right).

Model selection results for the epithelial cell example.

Heatmaps represent results from the AIC (left) and BIC (right) methods, where rows represent different values of σb and columns represent the candidate model structures. (EPS)

Usage of validation data with prediction uncertainty, where a subset of a single data set has been reserved for validation.

This preliminary analysis shows the result of using validation with prediction uncertainty in a scenario where a portion of a complete data set has been reserved as validation data. In this example, the model was trained on estimation data (dark red) consisting of MIDs from 8 metabolites, from two tracers; the model fit to the estimation data (light red) is illustrated for each MID. The validation data consisted of the MID for α-ketoglutarate (dark blue), and the model prediction (light blue) shows good agreement with the validation data. For the experimental data, the error bars represent standard deviation, and for the model predictions, the error bars represent model uncertainty (Section 4.4). The tracers are Tracer 1 = U-13C-glutamine and Tracer 2 = U-13C-pyruvate. (EPS)

A comparison of model predictions with uncertainties against the validation data for the simulated 13C MFA example.

The validation data for the simulated 13C MFA example consisted of MIDs for nine metabolites from a [U-13C] pyruvate tracer. The metabolites are, from top-left to bottom-right: pyruvate, citrate, cis-aconitic acid, alpha-ketoglutarate, L-glutamate, L-glutamine, fumarate, L-malate, and L-aspartate. The purple bars indicate the simulated MIDs, and the corresponding error bars indicate the experimental uncertainty σr. The red and green bars indicate the predicted MIDs of the two compared model structures, and the corresponding error bars indicate their respective prediction uncertainties. (EPS)

Algorithm used for the model selection problem.

The model selection algorithm takes a set of model structures and a set of data as inputs and selects the most appropriate model structure based on the subtype (A-D) and the data. Subtype A selects the model structure that yields the smallest summed squared residuals (SSR) given the entire data set. Subtype B selects the first/simplest model structure that passes a χ2-test. Subtype C selects the model structure that passes a χ2-test with the largest margin. Finally, subtype D selects the model structure that yields the lowest SSR with respect to a validation subset of the data. (DOCX)
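The four subtypes can be sketched compactly. In this illustration the model names, SSR values, and χ2 thresholds are hypothetical placeholders (the thresholds stand in for the 95% χ2 quantile at each model's degrees of freedom), not values from the paper:

```python
# Each entry: (name, SSR on all data, 95% chi2 threshold, SSR on validation subset)
# Models are ordered from simplest to most complex.
models = [
    ("M1", 250.0, 76.8, 40.0),   # simplest structure
    ("M2", 60.0, 73.3, 12.0),
    ("M3", 50.0, 67.5, 15.0),    # most complex structure
]

def select(models, subtype):
    if subtype == "A":   # smallest SSR over the entire data set
        return min(models, key=lambda m: m[1])[0]
    if subtype == "B":   # first/simplest structure passing the chi2-test
        for name, ssr, threshold, _ in models:
            if ssr < threshold:
                return name
        return None      # no structure passes
    if subtype == "C":   # largest margin to the chi2 threshold
        return max(models, key=lambda m: m[2] - m[1])[0]
    if subtype == "D":   # smallest SSR on the validation subset
        return min(models, key=lambda m: m[3])[0]

# The four subtypes need not agree on the same model structure:
print([select(models, s) for s in "ABCD"])  # ['M3', 'M2', 'M3', 'M2']
```

Note that subtype A always favors the most complex structure (more parameters can only reduce the SSR on the estimation data), whereas subtype D penalizes overfitting through the reserved validation subset.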

Breakdown of reactions for the TCA-cycle models.

The table contains a detailed breakdown of the reactions that are included in the TCA models. In general, the model structures are derived from the most complex model structure by successively removing and combining reactions, thus making each successive model structure simpler. (DOCX)

A summary of the parameter values that were used to generate the simulated data for the three different examples that are used in the manuscript.

The full parameter vectors for the polynomial, linear, and metabolic flux analysis model examples are given by the respective columns. (DOCX)

6 Sep 2021

Dear Mr Sundqvist,

Thank you very much for submitting your manuscript "Validation-based model selection for 13C metabolic flux analysis with uncertain measurement errors" for consideration at PLOS Computational Biology. As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. In light of the reviews (below this email), we would like to invite the resubmission of a significantly-revised version that takes into account the reviewers' comments. We cannot make any decision about publication until we have seen the revised manuscript and your response to the reviewers' comments. Your revised manuscript is also likely to be sent to reviewers for further evaluation.

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments. Please prepare and submit your revised manuscript within 60 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email.
Please note that revised manuscripts received after the 60-day due date may require evaluation and peer review similar to newly submitted manuscripts. Thank you again for your submission. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,
Vassily Hatzimanikatis, Associate Editor, PLOS Computational Biology
Kiran Patil, Deputy Editor, PLOS Computational Biology

***********************

Reviewer's Responses to Questions

Comments to the Authors: Please note here if the review is uploaded as an attachment.

Reviewer #1: The authors propose an approach to select which metabolic model to use for 13C-metabolic flux analysis (13C-MFA). The basic idea is to perform multiple tracer experiments with different isotopic tracers and then set one of these data sets aside to be used as validation data. The remaining data sets are then used to estimate fluxes using different metabolic models with increasing complexity. The hope is that the validation data will allow proper model selection, that is, selection of a metabolic model that is not too simple or too complex. The authors are correct that model selection can have a significant impact on the fluxes that 13C-MFA produces; however, I disagree with the proposed approach for several fundamental reasons.

Major concerns:

1) The authors completely ignore the most important step in 13C-MFA, which is the design of tracer experiments. In this step, the key objective is to select the most informative isotopic tracers and isotopic labeling measurements for a given metabolic network model. In this step, the metabolic model must represent the best current knowledge of the biochemistry of the cell that is studied, i.e. the model should be as comprehensive as possible to avoid any biases in the 13C-MFA studies. The approach proposed by the authors ignores the importance of proper tracer selection.
If poor tracers are selected, then it doesn't matter what method is used for model selection; the flux results will be poor. The authors must evaluate what impact good and poor tracer selection has on their proposed approach.

2) The proposed approach only really works if the validation data contains truly "novel" data that is independent of the other data. In reality, this can never be satisfied. How do you determine to what extent the validation data is independent? This is an important question that is never quantitatively addressed by the authors. The fact that a different tracer is used doesn't automatically mean that the validation data is independent.

3) On the other extreme, the validation data must not be too dissimilar from the other data sets, or the validation data may not be predictable by any model. While this problem is briefly mentioned by the authors, it is not fully addressed. This is a serious problem of the proposed approach. How do you ensure that the validation data is not too similar but also not too dissimilar compared to the other data? Unless the authors can provide some quantitative metrics that can be followed to quantify the degree of similarity and dissimilarity of the validation data, I am afraid that I cannot recommend the use of the proposed approach.

Reviewer #2: The paper addresses the topic of model selection in metabolic flux analysis with isotope labeling data gathered under metabolic steady state conditions (termed MFA) when the labeling data error is largely unknown, but consistent across the data set. The authors compare different existing model selection approaches (SSR, first Chi2, best Chi2, AIC, BIC, validation) by means of simulated examples (polynomials, linear toy and nonlinear MFA systems) and apply them for the evaluation of a real data set in human epithelial cells. In the latter case, they found evidence for pyruvate carboxylase as a necessary model component.
The authors argue that model selection based on validation is a more reliable approach than SSR- and Chi2-based metrics in case of unknown data standard deviations. A conclusion about AIC/BIC is not drawn. Overall, the work addresses a relevant, up to now underrepresented topic in the field of MFA. Some major issues concerning the precise aims/scope/results of the work need to be addressed. If done, the manuscript has the potential to become fit for publication in PLOS Comput Biol after some major revisions.

**Major Comments**

1. Model selection is always connected to a goal. The authors show examples for flux estimation, pathway inference and label prediction. Is the intention of model selection in this work directed towards all of these three categories? That this is not clearly stated is a weak spot of the manuscript, and it makes it somewhat hard to assess whether there is substantial evidence for the conclusions.

2. The results achieved across the examples indicate that the performance of comparably cheap criteria (AIC, BIC) appears to be on par with the validation-based (VB) approach for all cases for which they are reported (why do the authors calculate AIC/BIC model selection criteria for all but the epithelial cell example?). Interestingly, this is not discussed. In addition, VB is computationally expensive. An overall comparison of criteria performances should be added, incl. a discussion of the cost/benefit aspect.

3. Following up Comment #2, the AIC is an approximation of leave-one-out cross-validation according to Gelman's Bayesian Data Analysis. Could then the AIC in theory not even be better than the proposed VB scheme?

4. The Best Chi2 approach performs in some cases better than VB (in terms of fractions), in particular for the case of approx. correct and larger errors. Can the authors comment on this behavior?
Together with Comments #1 and #2 above, this may find its way into more fine-grained advice to modelers that apply model selection in cases of known vs. grossly known vs. unknown model errors, rather than suggesting the consideration of VB per se (P19).

5. Please add some references on the origins of the epithelial network model and on which basis the model variants are constructed. A network description directly attached to this manuscript would aid understanding and bridge between the thumbnail-sized pictures and the reaction names discussed in the text (see also below).

**Minor Comments**

1. Abstract: "These errors are often not known in practical examples." This statement suggests that it is standard that nothing is known about data errors in the field of MFA. This is certainly a disputable claim that may snub analytical chemists working in this field. Arguably, there is lots of work where approx. errors are formulated on well-grounded experience. Therefore, this reviewer suggests that the authors revise their statement appropriately.

2. P4: "s does not account for experimental bias, such as deviations from metabolic steady-state that always occur in batch cultures." This statement is difficult because if the MSS is considerably questioned, applying MFA is doubtful. Moreover, wouldn't repetition of the experiment help to (un)cover such deviations at the metabolic level?

3. The authors speculate about the approaches that are applied in the MFA domain for model selection. Indeed the procedure described in (Antoniewicz 2018) may be interpreted as a First Chi2 approach. The iterative scheme sketched in (Dalman 2016) points more to the use of the SSR/Chi2 as a means for model validation/testing and in this sense would have been correctly applied. Generally, it would improve clarity if the authors introduce the differences between model validation/testing and model selection in the introduction, rather than mentioning it on the last page, and take care to distinguish these two traits when commenting on the literature.

4. P6: "For all examples herein, data from distinct model inputs is used for validation." Does this mean data from tracer experiments with different tracer species? Please clarify.

5. Sec. 2.1: The motivating example, although being a simple illustration to unfamiliar readers, is clearly outside the main story line. Thus, this reviewer suggests considering shifting the example to the supplement, and keeping the focus on labeling systems. Also, here the difference between model validation/testing and selection becomes evidently mixed (see also #3): the most simple and most complex polynomials will certainly fail in a model validation/testing step.

4. P11: Please report the number of resamplings in the main text, rather than just in the caption of Fig. 8.

6. P13/Fig. 9: Please report the full names of the reactions discussed to "link" the results better to the metabolic network context (see also Major Comment #5).

7. Sec. 2.4: Please report also AIC/BIC numbers (see also Major Comment #2).

8. P15: "suggesting that fatty acid synthesis indeed occurred". Please clarify whether there is potentially another route than FA synthesis for label entering this compound (see also Major Comment #5).

9. Fig. 11: Please add error bars to the mean MIDs.

10. P17: "correctly predicting" brings up the question of how correct, correct is. The more qualitative wording used on P15, "reasonably good fit", appears to be better suited here.

11. P18: "In practice, the magnitude or scaling-error of is not necessarily the same for all data points, and the results herein indicate that the new validation-based is a good choice also in such situations". To the understanding of the reviewer, this has not been tested. Please comment.

12.
For clarity, please use consistent names for the different test cases throughout text and figures, e.g. "simulated 13C-MFA model example" in Fig 7+8+9, or "cultured epithelial cells" in Fig. 10.

**Typos/Wording**

P2: This problem is central
P3: Correct system hypothesis --> System hypothesis
P4: approached --> approach; passes --> pass; are often below --> are often reported to be below (add reference?); former alternative leads to high uncertainty --> former alternative may lead to high uncertainty
P9: simulate data form (--> from) 6 distinct input vectors
P14: tested --> applied
P23: check notation: M0 model with theta0 parameters is selected for data generation, with M0=M7 in case of the polynomial example, M0=A3 in case of ... etc.

**********

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code, e.g. participant privacy or use of data from a third party, those must be specified.

Reviewer #1: Yes
Reviewer #2: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files. If you choose "no", your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review?
For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No
Reviewer #2: No

Figure Files: While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us.

Data Requirements: Please note that, as a condition of publication, PLOS' data policy requires that you make available all data used to draw the conclusions outlined in your manuscript. Data must be deposited in an appropriate repository, included within the body of the manuscript, or uploaded as supporting information. This includes all numerical values that were used to generate graphs, histograms, etc. For an example in PLOS Biology see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5.

Reproducibility: To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

15 Nov 2021
Submitted filename: Response_to_reviewers.docx

5 Jan 2022

Dear Mr Sundqvist,

Thank you very much for submitting your manuscript "Validation-based model selection for 13C metabolic flux analysis with uncertain measurement errors" for consideration at PLOS Computational Biology.
As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. The reviewers appreciated the attention to an important topic. Based on the reviews, we are likely to accept this manuscript for publication, providing that you modify the manuscript according to the review recommendations. Please prepare and submit your revised manuscript within 30 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email.

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to all review comments, and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments. Thank you again for your submission to our journal. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,
Vassily Hatzimanikatis, Associate Editor, PLOS Computational Biology
Kiran Patil, Deputy Editor, PLOS Computational Biology

***********************

A link appears below if there are any accompanying review attachments.
If you believe any reviews to be missing, please contact ploscompbiol@plos.org immediately: [LINK]

Reviewer's Responses to Questions

Comments to the Authors: Please note here if the review is uploaded as an attachment.

Reviewer #2: The minor comments below are mainly related to sharpening statements that otherwise risk being misunderstood by the target community. Since line numbers are missing, page numbers are given.

Minor Comments

1. Reply to Response to Major Comment 5: While the list given in the SI about model variants is clearly useful as an overview, this way is clearly insufficient for reproducing results or even a simple model simulation step. Please take care to adhere to the reporting standards established in the field.

2. Reply to Response to Major Comment 4 and Minor Comment 1: Abstract, Summary, P15: This reviewer agrees that estimating the precise true error is hardly possible. Indeed, results presented indicate that gross mis-characterization of these errors leads to non-robust model selections, which in turn CAN lead to errors in flux estimates. But, in standard use-cases, such as E. coli and GC-MS, the order of the errors in the labeling data is indeed well characterized. Here, the results presented do indicate that if the true error is approximately correctly known, traditional methods work fairly well *and* are computationally efficient. Thus, validation-based model selection is useful in cases for which the magnitude of the errors is unknown or a gross mis-characterization is suspected. Please be precise here and adequate in the conclusions.

3. Reply to Response to Minor Comment 2: Certainly, modeling assumptions are always idealizing the truth. The fundamental principles and experimental requirements for 13C MFA are well-known and -established, and it was not intended to scrutinize them here. The point is that it is dangerous to think about MID error models as means to cover deviation from metabolic stationarity. Labeling error models are, as the authors specify, assumed to be normally distributed without bias. But whether errors introduced by metabolic instationarity are also of this type is unclear. Please clarify, to circumvent the naive conclusion that an increase in error could "heal" metabolic instationarity.

4. Reply to Response to Minor Comment 3: Since the underlying principles of incremental model updating used in Dalman 2016 seem to be not perfectly clear, this reviewer suggests replacing or amending this citation on P3 with 10.1038/s41596-019-0204-0, where the procedure is outlined.

5. Abstract: The notion of prediction uncertainty should be formulated more precisely, i.e. labeling prediction uncertainty, since the constraint-based modelling community often uses the term "flux prediction".

6. P4: "By quantifying ... core prediction," is hard to understand without further background. It may be better to state why prediction uncertainty is important and validation data should be chosen wisely, instead of explaining how this is done here. Later, when the prediction and inference uncertainties are discussed, a statement is warranted about what such "core predictions" are. Also, the authors may comment why MCMC is preferred over the standard PL-based approach for flux confidence bounds, whereas for prediction uncertainty the PL-based approach and not a standard MC approach is taken.

7. P13: "This theory says ... goes to infinity." This is not generally true. An important assumption is stated later in the paragraph, but in a rather disconnected manner. Please put it in direct context. For instance: in case the data is informative and no structural non-identifiabilities are present, theory says ...

8. P13: "and the results will still hold as long as the same difference between metabolites holds also for sigma_r". Unclear, please revise.

9. P14: "compared to methods for mechanistic modeling, where new distributions can be used for the analysis". Unclear, please revise.
Typos:
P3: model structures that pass the Chi2 test.; the underlying errors
P6: First Chi2 (without ")
P9: can be used for; for this new method in 13C MFA; does not contain any new information (instead provide)
P13: offer a pragmatic way forward
P14: of these results is that also
P15: on errors in sigma_b
P18: Sect. 4.4 check tense
References: Check duplicates (e.g. Aitchison)

**********

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code, e.g. participant privacy or use of data from a third party, those must be specified.

Reviewer #2: No: see comments to the authors

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files. If you choose "no", your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #2: No

Figure Files: While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user.
Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

Data Requirements: Please note that, as a condition of publication, PLOS' data policy requires that you make available all data used to draw the conclusions outlined in your manuscript. Data must be deposited in an appropriate repository, included within the body of the manuscript, or uploaded as supporting information. This includes all numerical values that were used to generate graphs, histograms, etc. For an example in PLOS Biology see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5.

Reproducibility: To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

References: Review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article's retracted status in the References list and also include a citation and full reference for the retraction notice.

21 Jan 2022
Submitted filename: Response_to_reviewers_2022_01.docx
7 Mar 2022

Dear Mr Sundqvist,

We are pleased to inform you that your manuscript 'Validation-based model selection for 13C metabolic flux analysis with uncertain measurement errors' has been provisionally accepted for publication in PLOS Computational Biology. Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow-up email. A member of our team will be in touch with a set of requests. Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript. Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Computational Biology.

Best regards,
Vassily Hatzimanikatis, Associate Editor, PLOS Computational Biology
Kiran Patil, Deputy Editor, PLOS Computational Biology

***********************************************************

Reviewer's Responses to Questions

Comments to the Authors: Please note here if the review is uploaded as an attachment.

Reviewer #2: The authors have revised their manuscript, and responded to all reviewer criticisms. I am satisfied with the clarifications to the paper.

**********

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?
The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code, e.g. participant privacy or use of data from a third party, those must be specified.

Reviewer #2: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files. If you choose "no", your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #2: No

6 Apr 2022

PCOMPBIOL-D-21-01170R2
Validation-based model selection for 13C metabolic flux analysis with uncertain measurement errors

Dear Dr Sundqvist,

I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course. The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.
Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Olena Szabo
PLOS Computational Biology | Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom
ploscompbiol@plos.org | Phone +44 (0) 1223-442824 | ploscompbiol.org | @PLOSCompBiol
References (42 in total):

Review 1.  13C metabolic flux analysis.

Authors:  W Wiechert
Journal:  Metab Eng       Date:  2001-07       Impact factor: 9.783

2.  A scientific workflow framework for (13)C metabolic flux analysis.

Authors:  Tolga Dalman; Wolfgang Wiechert; Katharina Nöh
Journal:  J Biotechnol       Date:  2015-12-22       Impact factor: 3.307

Review 3.  Publishing 13C metabolic flux analysis studies: a review and future perspectives.

Authors:  Scott B Crown; Maciek R Antoniewicz
Journal:  Metab Eng       Date:  2013-09-08       Impact factor: 9.783

4.  Human breast cancer cells generated by oncogenic transformation of primary mammary epithelial cells.

Authors:  B Elenbaas; L Spirio; F Koerner; M D Fleming; D B Zimonjic; J L Donaher; N C Popescu; W C Hahn; R A Weinberg
Journal:  Genes Dev       Date:  2001-01-01       Impact factor: 11.361

5.  Credibility of In Silico Trial Technologies-A Theoretical Framing.

Authors:  Marco Viceconti; Miguel A Juarez; Cristina Curreli; Marzio Pennisi; Giulia Russo; Francesco Pappalardo
Journal:  IEEE J Biomed Health Inform       Date:  2019-10-28       Impact factor: 5.772

6.  Mass and information feedbacks through receptor endocytosis govern insulin signaling as revealed using a parameter-free modeling framework.

Authors:  Cecilia Brännmark; Robert Palmér; S Torkel Glad; Gunnar Cedersund; Peter Strålfors
Journal:  J Biol Chem       Date:  2010-04-26       Impact factor: 5.157

7.  Simultaneous tracing of carbon and nitrogen isotopes in human cells.

Authors:  Roland Nilsson; Mohit Jain
Journal:  Mol Biosyst       Date:  2016-05-24

8.  Serum-free growth of human mammary epithelial cells: rapid clonal growth in defined medium and extended serial passage with pituitary extract.

Authors:  S L Hammond; R G Ham; M R Stampfer
Journal:  Proc Natl Acad Sci U S A       Date:  1984-09       Impact factor: 11.205

Review 9.  A guide to 13C metabolic flux analysis for the cancer biologist.

Authors:  Maciek R Antoniewicz
Journal:  Exp Mol Med       Date:  2018-04-16       Impact factor: 8.718

10.  Likelihood based observability analysis and confidence intervals for predictions of dynamic models.

Authors:  Clemens Kreutz; Andreas Raue; Jens Timmer
Journal:  BMC Syst Biol       Date:  2012-09-05
