Literature DB >> 36124805

Design centering enables robustness screening of pattern formation models.

Anastasia Solomatina^1,2,3, Alice Cezanne², Yannis Kalaidzidis², Marino Zerial^2,3,4, Ivo F Sbalzarini^1,2,3,4.

Abstract

MOTIVATION: Access to unprecedented amounts of quantitative biological data allows us to build and test biochemically accurate reaction-diffusion models of intracellular processes. However, any increase in model complexity increases the number of unknown parameters and, thus, the computational cost of model analysis. To efficiently characterize the behavior and robustness of models with many unknown parameters remains, therefore, a key challenge in systems biology.
RESULTS: We propose a novel computational framework for efficient high-dimensional parameter space characterization of reaction-diffusion models in systems biology. The method leverages the Lp-Adaptation algorithm, an adaptive-proposal statistical method for approximate design centering and robustness estimation. Our approach is based on an oracle function, which predicts for any given point in parameter space whether the model fulfills given specifications. We propose specific oracles to efficiently predict four characteristics of Turing-type reaction-diffusion models: bistability, instability, capability of spontaneous pattern formation and capability of pattern maintenance. We benchmark the method and demonstrate that it enables global exploration of a model's ability to undergo pattern-forming instabilities and to quantify robustness for model selection in polynomial time with dimensionality. We present an application of the framework to pattern formation on the endosomal membrane by the small GTPase Rab5 and its effectors, and we propose molecular mechanisms underlying this system.
AVAILABILITY AND IMPLEMENTATION: Our code is implemented in MATLAB and is available as open source under https://git.mpi-cbg.de/mosaic/software/black-box-optimization/rd-parameter-space-screening. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Entities: Chemical

Mesh：

Substances：

Year: 2022 PMID： 36124805 PMCID： PMC9486588 DOI： 10.1093/bioinformatics/btac480

Source DB: PubMed Journal: Bioinformatics ISSN： 1367-4803 Impact factor: 6.931

1 Introduction

In systems biology, mathematical models provide a framework to connect experimental observations with theory. This enables additional insights into the underlying biological mechanisms and allows comparing alternative molecular hypotheses. Besides offering a compact representation of data and of biological mechanisms, mathematical models allow us to explore how system behavior depends on experimentally non-controllable parameters, and how it varies over experimentally non-accessible time scales. Therefore, models provide a context in which data can be mechanistically interpreted and the function of biological systems dissected (Liepe ; Sbalzarini, 2013). Models in biology, however, usually include unknown parameters. The more molecular interactions are included in a model, the higher the dimensionality of its parameter space. Although data from the literature or independent measurements can be used to estimate some of the parameters, and therefore, to reduce the dimensionality of the parameter space, substantial uncertainty often remains about several parameters or about the precise values of estimated parameters. This constitutes a major challenge for the characterization of the behavior and robustness of biological systems. The difficulty of model characterization increases exponentially with the dimensionality of the model’s parameter space. For biochemical reaction–diffusion models, the dimensionality of the parameter space depends on the chemical reaction network topology of the considered molecules and on the type of interactions assumed between them. Typical questions addressed using such models are: ‘Are there any parameter values compatible with the generation of Turing patterns?’, ‘How large is the portion of parameter space in which the model fits given data?’, ‘Which model parameters are correlated?’ or ‘Which model parameters are most important for the pattern-forming capacity of the model?’. For low-dimensional parameter spaces (typically less than five-dimensional), these questions can be addressed by brute-force sampling (Barkai and Leibler, 1997). While this is easy to implement, it quickly becomes infeasible in higher-dimensional parameter spaces. Exploring models with higher-dimensional parameter spaces, however, is important in systems biology. For example, it has been shown that models including three or more molecular species (parameter space dimensionality > 5) have different pattern-formation requirements and reveal novel biological network designs (Haas and Goldstein, 2021; May, 1972; Satnoianu ). However, due to the computational complexity of identifying and characterizing such models, extending biochemical reaction–diffusion models to more realistic network topologies has been challenging. These challenges have been addressed in glocal methods, which combine the global and local search for analyzing high-dimensional parameter spaces (Hafner ; Zamora-Sillero ). Notwithstanding their importance and success, glocal methods remain limited by their exponential algorithmic complexity with parameter-space dimensionality and by their inability to explore non-convex regions in parameter space. Moreover, glocal methods rely on a continuous cost function, which is appropriate only when one is able to define a scale for the response of a model, such as a frequency of oscillations. Here, we propose a computational framework to globally explore potentially non-convex parameter spaces of reaction–diffusion models in polynomial time and without having to define a cost function. Our method leverages L-Adaptation, an adaptive statistical method for design centering and volume estimation in high-dimensional spaces (Asmus ). Instead of a continuous cost function, L-Adaptation uses binary oracles. We propose four specific oracles that capture the main characteristics of pattern-forming reaction–diffusion models: bistability, instability, pattern formation and pattern maintenance (Trong ). The resulting algorithm is capable of exploring these characteristics in parameter spaces of arbitrary shape: connected or disconnected and convex or non-convex. Moreover, we empirically show in benchmarks that the computational cost required to do so scales polynomially with the dimension of the parameter space. In the remainder of this article, we present in detail our framework for the efficient characterization of high-dimensional parameter spaces of reaction–diffusion models. The strength of the proposed framework lies in its ability to explore pattern-forming instabilities of models in polynomial time with the dimensionality. We validate the proposed oracle functions on the example of the partitioning-defective proteins (PAR) polarity system (Trong ) and provide an estimate for the computational complexity of the algorithm using synthetic data. This article concludes with a demonstration of the framework on the real-world problem of membrane pattern formation by the small GTPase Rab5, where we compare three hypotheses that potentially explain the molecular mechanism that drives domain formation on early endosomes.

2 Materials and methods

We briefly review the main concepts of the L-Adaptation algorithm (Asmus ) and then present the proposed oracles for pattern-forming reaction–diffusion models. Finally, we present the overall workflow of our framework and explain its use in model selection.

2.1 L-Adaptation for high-dimensional design centering and volume estimation

L-Adaptation is a versatile statistical method for design centering and volume estimation (Asmus ). It allows identifying the most robust parameter set, the design center, which enables a model to fulfill certain function under the largest possible perturbations. In addition, L-Adaptation estimates the shape and the volume of the portion of parameter space in which the model does so, the feasible region. It therefore enables model selection based on the robustness and quantification of parameter correlations. L-Adaptation is an iterative algorithm that is inspired by Gaussian Adaptation (Kjellström and Taxén, 1981; Müller and Sbalzarini, 2011) and by the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) (Hansen and Ostermeier, 1996). The important difference is that L-Adaptation does not use Gaussian proposals, but sampling is done uniformly from L-balls. Similarly to CMA-ES, however, L-Adaptation uses an adaptive multi-sample strategy (Hansen, 2008), which is state-of-the-art in bio-inspired optimization (Hansen and Ostermeier, 2001). In each iteration, L-Adaptation draws samples uniformly from an n-dimensional L-ball of radius r > 0: where is the p-norm of the vector : Every such sample represents one point in the parameter space of the model, i.e. one specific set of values for the unknown model parameters. For every sampled parameter vector, a binary function called an oracle is evaluated, which tests for the conditions, or specifications, that one wishes the model to fulfill. Based on the binary results of the oracle for every sampled parameter set, the position, orientation and aspect ratio of the L-ball are adapted. For efficient design centering and volume approximation, this adaptation of the L-ball proposal distribution is combined with an adaptive schedule for changing the target hitting probability of the sampler, which is the desired probability for a randomly drawn point from the proposal to be feasible (i.e. to satisfy the oracle). This is repeated iteratively until the chain of samples has reached stationarity (according to a statistical test), where averaging over samples becomes meaningful. The hallmark of a stationary state is that the statistics of the proposal distribution, such as mean volume and hitting probability, no longer change on average. The n-dimensional volume of the feasible region A is then approximated as: where P is the empirical hitting probability (averaged over iterations at stationarity) and is the exact analytical volume of the n-dimensional L-ball with radius r, where is the Gamma function. An important advantage of L-Adaptation is that it is effectively parameter free. All parameters required for the computation have default settings that need to be changed only in exceptional cases (Asmus ). In all our computations below, we always use the default parameter settings in conjunction with L2-balls, which have been suggested to be a good choice when no prior knowledge about the shape of the feasible region is available (Asmus ). We also use a decreasing schedule for the target hitting probability, which leads to better volume approximation as recommended by Asmus .

2.2 Oracles for reaction–diffusion models

L-Adaptation is based on an implicit definition of the feasible region through a binary oracle. An oracle f is a black-box function (i.e. point-wise computable but not necessarily known in closed form) that performs the computations required to determine if a given parameter vector lies within the feasible region, i.e. leads to the model fulfilling the specifications. Given an oracle function that returns 1 iff and 0 otherwise, the n-dimensional volume of the feasible region A can be defined as: We propose four oracles that allow to efficiently test for different parameter space characteristics of reaction–diffusion models: bistability, instability, pattern formation and pattern maintenance.

2.2.1 Bistability

The bistability oracle tests whether a parameter vector lies within a region of homogeneous bistability. To determine this, we find stable states of the non-spatial (i.e. reactions only) part of the model by numerically solving the corresponding system of ordinary differential equations (ODEs). The system is solved until time t = 1000 in non-dimensionalized characteristic time units using the MATLAB ODE solver ode15s. To account for the possibility of multiple stable steady states, similar to Scholes , we generate a grid of initial conditions with three concentrations per species within a defined interval of biologically relevant concentrations. The endpoints of the simulated concentration trajectories are clustered using k-means clustering. The cluster centers are further validated to be steady-state concentrations when used as initial concentrations for the original system of ODEs. The system is then solved a second time until t = 2000. The resulting steady-state concentration vectors are then tested for stability using linear stability analysis. The oracle returns 1 iff the number of homogeneous stable steady states is larger than one.

2.2.2 Instability

The instability oracle tests whether a parameter vector lies within an instability region of the model’s parameter space. For this, every homogeneous stable steady state identified as described above is tested for its response to small non-uniform perturbations. The oracle returns 1 iff there exists at least one homogeneous steady state that is stable with respect to uniform perturbation and unstable with respect to non-uniform perturbation.

2.2.3 Pattern formation

The pattern formation oracle simulates the full partial differential equation (PDE) model to test whether a spatial pattern forms for a given parameter vector. Pattern formation implies the emergence of stable spatial structures from homogeneous initial conditions, which are taken to be the (possibly multiple) solutions of the corresponding system of ODEs. To mimic molecular noise, and to trigger potential pattern-forming instabilities, we add 1% Gaussian noise to each initial condition. The simulations are performed in a 1D spatial domain with periodic boundary conditions. The diffusion operator is discretized using second-order central differences, and time is evolved using the explicit Runge–Kutta scheme of order 4. The time step δt is automatically defined from the Courant-Friedrichs-Lewy (CFL) condition of the PDE. We run the simulation until convergence of the solution to a steady state, or until 3000 time steps, whatever comes first. Convergence is defined by an absolute tolerance ϵ: where s(x, t) is the concentration of a molecular species at location x and time t. At the end of a simulation, we test whether the steady state of any species is an inhomogeneous pattern, by checking if the diffusion term is larger than a threshold d. In this work, we fix and (Lo ). The oracle returns 1 iff Equation (7) holds and for any species s in any of the simulations (for multiple initial conditions).

2.2.4 Pattern maintenance

The pattern maintenance oracle tests whether a model preserves or maintains a pre-patterned state for a given parameter vector, or whether it requires a finite perturbation to undergo pattern formation. Unlike the previous two oracles, which study the behavior of a model with respect to infinitesimal perturbations, the pattern maintenance oracle explores the response of a model to a finite perturbation. This oracle calls the pattern formation oracle with an initial condition containing a pre-patterned concentration field, introducing a finite initial perturbation, for example, with a single Gaussian peak where σ specifies the width of the peak and S0 its amplitude. The initial perturbation is typically chosen to recapitulate empirically observed peaks in images. The oracle returns 1 iff the pattern formation oracle returns 1 for this finite initial perturbation.

2.3 Workflow

To start the analysis, L-Adaptation requires two inputs: (i) an oracle f that checks if any given point in the parameter space Ω fulfills the specifications and (ii) a feasible starting point . For the application considered here, we also require bounds of the parameter space, which can be large, but finite. To identify a feasible starting point for L-Adaptation, we first sample a random point and check its feasibility by evaluating the oracle . If the point is feasible (i.e. ), we use it to start L-Adaptation. If the point is not feasible, which happens in most cases, the oracle f is negated to , such that non-feasible points of f are feasible for , i.e. . This enables using the randomly sampled point x as a starting point for L-Adaptation over the negated oracle and estimate the volume of the negated feasible region. If is comparable to (as defined in Equation (10)) the total volume of the parameter space , which can easily be computed since Ω is a hypercube, then the model is globally incapable of fulfilling the specifications encoded in f, and therefore, . If is smaller than the total volume of the parameter space , a non-feasible points of will at some point be sampled. As soon as that happens, L-Adaptation over is aborted and that point is used as a feasible point to start L-Adaptation over f and compute the volume estimate of the feasible region A. A schematic representation of this workflow is shown in Figure 1.

Fig. 1.

Flowchart of the workflow for high-dimensional parameter space screening using design centering. The process starts with initialization, where the user defines the oracle function f and the bounds of the parameter space Ω. Next, a random point is sampled and checked for feasibility. If the point is feasible, i.e. , then it is used to start L-Adaptation and estimate the volume of the feasible region . If the randomly chosen point x is not feasible, i.e. , then the oracle is negated such that . This allows to start L-Adaptation for the negated oracle and estimate the volume where the model is not capable of fulfilling the specifications defined by the oracle f. If the volume estimate is comparable to the volume of the whole parameter space, then the model is globally incapable of satisfying the oracle f, and we set . If is smaller than , then L-Adaptation is started for the original oracle f with a feasible starting point that has been sampled during L-Adaptation over the negated oracle function . In all cases, the final output is the volume estimate of the feasible region of the original oracle f. The inset panels illustrate the volume of the whole parameter space on the top right, the volume of the feasible region for the negated oracle on the bottom right and the volume of the feasible region for the original oracle on the left

2.4 Model selection

High-dimensional parameter space exploration is closely linked to model robustness estimation. The robustness of a system to a specific class of perturbations can be defined as the ability of the system to perform its function under these perturbations (Stelling ). It has been shown that the robustness of a model is associated with the volume of its feasible region (Dayarian ): a small feasible region requires fine-tuning of the parameter values, while a large feasible region implies the ability of the model to robustly perform a desired function under significant parameter fluctuations. This can be used for model selection in systems biology, because many biochemical systems are robust to a variety of perturbations, including temperature (Ruoff, 1992), mutations (Wagner, 2000) and molecular noise (Gonze ). Biological systems need to be robust against parameter perturbations to persist in a changing environment. Therefore, robustness can be used to select between competing hypothetical mechanisms (Hafner ). When comparing two models, both of which are capable of reproducing the function of a biological system, the more robust one should be preferred. We therefore use the estimated volume of the feasible region as provided by L-Adaptation to perform model selection. To compare models of different complexity, or parameter space dimensionality, robustness has to be defined in a dimension-independent way. A good choice is to use the linear volume of the feasible region, normalized as (Hafner ), where n is the dimension of the parameter space. It can be interpreted as the average tolerable variation per parameter dimension that still permits the model to perform its function.

3 Results

We first validate our proposed oracle function in a low-dimensional model where the stability properties are analytically known. Then, we empirically quantify the computational cost of our workflow for increasing parameter space dimensionality. Finally, we apply our method to model the selection of small GTPase domain formation mechanisms.

3.1 Oracle validation

We validate the oracles proposed above on the partitioning-defective proteins (PAR) model (Goehring ), which describes the establishment of uniaxial anterior–posterior polarity in the zygote of the roundworm Caenorhabditis elegans shortly after fertilization. This model has a two-dimensional parameter space for which the ground-truth characteristic regions and their volumes are known analytically (Trong ). We compare these ground-truth regions of the four parameter-space characteristics with the feasible regions identified by our proposed oracles. Furthermore, we validate the volume estimates computed by L-Adaptation. The spatiotemporal dynamics of the membrane-bound anterior PAR (aPAR) complexes with concentration A and posterior PAR (pPAR) complexes with concentration P is described by the reaction–diffusion model (Goehring ): where A and P are the cytoplasmic concentrations of aPAR and pPAR, respectively, and are their respective diffusion constants on the membrane and and ( and ) are the reaction rate constants of binding/unbinding to/from the membrane. The mutual inhibition of the protein complexes on the membrane is defined by the reaction rate constants and . All parameters except and were fixed for the model analysis, reducing the parameter space to two dimensions (Trong ). Figure 2 shows the four characteristic regions in this two-dimensional ( and ) parameter space: (i) bistability, (ii) instability, (iii) pattern formation and (iv) pattern maintenance. The feasible points sampled by L-Adaptation are shown as blue dots, while the black outlines delimit the ground-truth regions (Trong ). All feasible points sampled by L-Adaptation lie inside the boundaries of the ground-truth regions, validating the proposed oracles.

Fig. 2.

Bistability region (a), instability region (b), spontaneous pattern-formation region (c), and pattern-maintenance region (d) of the PAR model in the parameter space. Dots show the feasible points sampled by L-Adaptation over the respective oracle. The crosses mark the estimated design centers. The solid outlines show the ground-truth region boundaries from Trong We also compare the ground-truth volumes of the four feasible regions from Figure 2 with the respective L-Adaptation estimates. The ground-truth volumes are computed by exhaustive grid search using a resolution of 0.05. For L-Adaptation, we start volume estimation from five different initial feasible points until stationarity. We average the volume estimates over the last 200 iterations. The resulting volume estimates are 1.73 for the bistability oracle (ground truth 1.69), 1.04 for the instability oracle (ground truth 1.07), 1.01 for the pattern-formation oracle (ground truth 1.03) and 5.83 for the pattern-maintenance oracle (ground truth 5.87). All these are within the resolution of the ground-truth volume computation (0.05). Therefore, we conclude that the volume estimates of the four regions defined by the proposed oracles agree well with the ground-truth volumes.

3.2 Computational cost and scaling with dimension

L-Adaptation is expected to scale polynomially with the dimensionality of the parameter space, and not exponentially as is the case for exhaustive sampling (Asmus ). However, this is an empirical result because the algorithmic complexity of volume estimation cannot be rigorously bounded for general, non-convex regions (Lovász and Vempala, 2006; Simonovits, 2003). For the empirical quantification of the algorithmic complexity of L-Adaptation, Asmus used n-dimensional L1 and L2 balls as ground-truth feasible regions. They observed that the number of evaluations required for volume approximation to within a given tolerance scales as n3 (see Fig. 3).

Fig. 3.

Number of samples (oracle evaluations) required with increasing parameter space dimension n to reach a relative volume approximation error of <10% for an -ball using L2-ball proposal distributions. The solid line without symbols shows the best least-squares fit of a cubic scaling. The number of evaluations needed for volume approximation of L1- and L2-balls is shown as a reference (inset legend); the respective pre-factors for the cubic fitting are 10.2767 and 1.9724 (Asmus ) Here, we confirm this result for an n-dimensional hypercube, an -ball, as the ground-truth feasible region. This is motivated by the need of our framework to also estimate the volume of the entire parameter space Ω, which is an n-dimensional hypercube (see Fig. 1). Figure 3 shows the number of oracle evaluations required to reach a relative volume-approximation error below 10% scales cubically with the parameter space dimensionality n also in this case. Therefore, we expect the overall workflow in Figure 1 to scale polynomially with parameter space dimensionality. The accuracy of the final volume estimate depends on the shape of the feasible region, the starting point, and the dimensionality of the parameter space. Asmus have shown that L-Adaptation with L2-ball proposals can underestimate the volume of a 20-dimensional -ball by 50% even for decreasing the target hitting probability. Here, we measure the relative error of hypercube volume approximation using an L2-ball proposal, which we find to always be below 10% for the dimension of the parameter space going up to 30. This allows us to define a threshold on the approximated feasible volume for the negated oracle before concluding that the model is incapable of robustly meeting the specifications given by the oracle anywhere in parameter space. This threshold is: with the volume of the whole parameter space.

3.3 Application to small GTPase domain formation

Small GTPase proteins are known to play an important role in numerous cellular processes, including cell proliferation and growth (Militello and Colombo, 2013; Park and Bi, 2007), cytoskeletal regulation and cell motility (Charest and Firtel, 2007) and intracellular trafficking (Murphy ; Ridley, 2006; Zerial and McBride, 2001). Some of these processes are hypothesized to depend on the emergence of spatial patterns formed as a result of nonlinear interactions of small GTPases with their effectors on membranes (Horiuchi ; Zerial and McBride, 2001) (see Supplementary Section S1 for details). Although the key molecular players and the interactions between them have been discovered a while ago and have been included in a mechanistic model of pattern formation for small GTPases (Goryachev and Pokhilko, 2008), the recent reconstitution of domain formation of the small GTPase Rab5 by the GDP dissociation inhibitor (GDI) and the guanine nucleotide exchange factor (GEF)/effector Rabex5/Rabaptin5 complex by Cezanne motivates alternative models for the small GTPase Rab5. The known molecular interactions in this system are: Here, soluble species are shown in light color, while membrane-bound species are in black. In the first reaction, Rabex5/Rabaptin5 (XY) is recruited to the membrane via interaction of Rabaptin5 with membrane-bound active Rab5 (R). The GEF/effector complex can dissociate from the membrane; therefore, the reaction is reversible. In the second reaction, the Rabex5/Rabaptin5 complex activates inactive Rab5 (R). The third reaction corresponds to Rab5 activation by Rabex5 in complex with active Rab5 (XYR). The shuttling of inactive Rab5 via GDI (I) is described by the fourth reaction. The last reaction corresponds to spontaneous GTP hydrolysis by Rab5. See Supplementary Section S2 for details on the physical domain of the models, parameter value bounds and definition of the parameter spaces. Supplementary Section S3 contains the resulting model PDEs and their conservation laws. This defines the basic interactions in the model ‘Rab5’ in Table 1. We complement this basic model with seven alternative models, each including a different combination of additional, hypothetical molecular interactions as given in Table 1: GEF/effector spontaneous membrane binding, handover of newly activated Rab5 from the GEF to the effector and presence of the GEF complex in dimeric form with cooperative Rab5 binding. See Supplementary Sections S4 and S5 for detailed equations, parameter values and explanations on each of these alternative models. We compare the capability of these eight models to robustly form Turing patterns. Therefore, we use the present framework with the proposed instability oracle.

Table 1.

Eight alternative Rab5 models (columns) including different combinations (crosses) of four molecular interactions (rows) and the resulting dimensionalities n of the parameter spaces Ω

Interactions/model	Rab5	mRab5	Rab5-H	2xRab5	mRab5-H	2xmRab5	2xRab5-H	2xmRab5-H
Basic model (Equation (11))	×	×	×	×	×	×	×	×
RbRx on membrane		×			×	×		×
Handover			×		×		×	×
RbRx is a dimer				×		×	×	×
n=dim(Ω)	6	8	6	6	8	8	6	8

Eight alternative Rab5 models (columns) including different combinations (crosses) of four molecular interactions (rows) and the resulting dimensionalities n of the parameter spaces Ω We perform five independent runs of L-Adaptation for every model and every condition with 10 000 evaluations for six-dimensional parameter spaces and 15 000 evaluations for eight-dimensional parameter spaces. This required between 30 min and 2 h of computer time per model using MATLAB on a standard office laptop. Table 2 reports the mean ± standard deviation (over all runs) normalized volumes of the instability regions in comparison to the entire volume of the parameter space. In cases where the model is globally incapable of forming Tuing patterns, the estimated normalized volume of the negated oracle is given instead.

Table 2.

Estimated normalized volumes and for the negated and original oracle functions and f, respectively, for all models from Table 1, compared with the normalized volume of the entire parameter space Ω

Model	Vest*n	Vestn	VΩn
Rab5	6.039 ± 0.057	—	6.000
mRab5	—	3.874 ± 0.022	6.000
Rab5-H	6.070 ± 0.010	—	6.000
2xRab5	—	3.818 ± 0.040	6.000
mRab5-H	—	3.743 ± 0.021	6.000
2xmRab5	—	3.880 ± 0.033	6.000
2xRab5-H	—	3.677 ± 0.065	6.000
2xmRab5-H	—	3.836 ± 0.032	6.000

Estimated normalized volumes and for the negated and original oracle functions and f, respectively, for all models from Table 1, compared with the normalized volume of the entire parameter space Ω The results show that the basic model is globally incapable of producing robust patterns (Table 2, Rab5 model). However, any one of two additional hypothetical interactions is each sufficient for diffusion-driven instability: GEF/effector membrane binding in a Rab5-independent manner (mRab5 model) or a dimerized form of the GEF/effector complex (2xRab5 model). All model variants containing any one of these hypothetical additional interactions can form Turing patterns with similar robustness (about 60% of the entire parameter space). While the model with spontaneous GEF/effector membrane binding is analogous to the classic Cdc42 model (Goryachev and Pokhilko, 2008), our screen thus identifies an alternative model for small GTPase pattern formation, which includes a dimerized form of the GEF/effector complex. This demonstrates the applicability of the proposed method to inferring molecular mechanisms of pattern formation by selecting between alternative hypotheses based on robustness.

4 Conclusions

We have presented a computational framework for efficient high-dimensional parameter space screening of reaction–diffusion models in systems biology. Our framework is based on L-Adaptation, an adaptive statistical method for approximate design centering and volume estimation. L-Adaptation uses L-balls as proposals for parameter-space sampling, which are dynamically adapted based on the sampling history to efficiently explore the feasible region of a model. The feasible region is defined by specifications, which are jointly included in a binary black-box oracle function. The estimated volume of the feasible region relates to the robustness with which a model is able to fulfill the specifications under parameter perturbations and, therefore, provides a solid basis for model selection in biochemistry. We have proposed four oracles to characterize the parameter space of pattern-forming reaction–diffusion systems in terms of bistability, instability, spontaneous pattern formation and pattern maintenance. We have benchmarked the proposed oracles in a real biological example with analytically known parameter space regions and have demonstrated that L-Adaptation using these proposed oracles produces volume estimates within 10% of ground truth in polynomial computational time. In addition to the four proposed oracle functions, we have introduced negated oracles and have formulated a criterion on the volume estimate for the negated oracles that suggests the inability of a model to fulfill specifications, defined in the original oracle, robustly anywhere in its parameter space. This is useful in cases where a model lacks the feasible region and, therefore, L-Adaptation cannot be initiated on the original oracle. It is also useful to rule out biochemical hypotheses. An important feature of the presented algorithm is the polynomial scaling of its computational cost with parameter-space dimensionality, which, to our knowledge, outperforms the previous state-of-the-art glocal methods for high-dimensional parameter space characterization in systems biology. Moreover, the present approach does not require formulating a real-valued objective function of known dynamics. Although the present framework is capable of characterizing high-dimensional parameter spaces efficiently, it has several limitations. First, our framework does not exploit dependencies or correlations between parameters and cannot identify the causes of low or high robustness. One way to explore causes of robustness would be to explicitly include them in the model. Comparing volume estimates of the feasible regions with and without a specific cause included may uncover contributions to the total robustness. Parameter correlations could be explored by analyzing the shape of the approximated feasible region or its main axes (eigenvectors). Second, the presented oracles are formulated in a general way without a strict definition of the output, e.g. pattern formation without specifying a certain pattern type. While such general oracles are convenient, they could be refined to more specific outputs, for example by restricting to wavenumbers that are characteristic for certain pattern types. Third, for very high-dimensional parameter spaces (n > 30), the relative error of volume approximation can increase above 10%. Further adjustment of the target hitting probability is then necessary to improve the accuracy at the expense of a higher computational cost. Taken together, we believe that the algorithmic framework presented here can be useful in exploring the global behavior of reaction–diffusion models with high-dimensional parameter spaces. It enables characterizing the capability (or incapability) of a model to exhibit bistability, form Turing-type patterns, or maintain a patterned spatial distribution robustly. It does so without requiring specific values or fits of the parameters, using robustness as a principled criterion for model comparison. Click here for additional data file.

26 in total

Design centering enables robustness screening of pattern formation models.

1 Introduction

2 Materials and methods

2.1 L-Adaptation for high-dimensional design centering and volume estimation

2.2 Oracles for reaction–diffusion models

2.2.1 Bistability

2.2.2 Instability

2.2.3 Pattern formation

2.2.4 Pattern maintenance

2.3 Workflow

2.4 Model selection

3 Results

3.1 Oracle validation

3.2 Computational cost and scaling with dimension

3.3 Application to small GTPase domain formation

4 Conclusions

1. Robustness of circadian rhythms with respect to molecular noise.

Review 2. Completely derandomized self-adaptation in evolution strategies.

3. A Robust and Efficient Method for Steady State Patterns in Reaction-Diffusion Systems.

4. Robustness in simple biochemical networks.

5. Endosome dynamics regulated by a Rho protein.

6. Will a large complex system be stable?

7. Turing's Diffusive Threshold in Random Reaction-Diffusion Systems.

8. 'Glocal' robustness analysis and model discrimination for circadian oscillators.

Review 9. Central roles of small GTPases in the development of cell polarity in yeast and beyond.

Review 10. Small GTPases as regulators of cell division.