Xin Kang1,2, Chunhe Li2,3. 1. School of Mathematical Sciences, Fudan University, Shanghai, China. 2. Shanghai Center for Mathematical Sciences, Fudan University, Shanghai, China. 3. Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, China.
Abstract
Embryonic stem cells (ESCs) can differentiate into diverse cell types and have the ability of self-renewal. Therefore, the study of cell fate decisions on embryonic stem cells has far-reaching significance for regenerative medicine and other biomedical fields. Mathematical models have been used to study emryonic stem cell differentiation. However, the underlying mechanisms of cell differentiation and lineage reprogramming remain to be elucidated. Especially, how to integrate the computational models with quantitative experimental data is still challenging. In this work, we developed a data-constrained modelling approach, and established a model of mouse embryonic stem cells. We used the truncated moment equations (TME) method to quantify the potential landscape of the ESC network. We identified two attractors on the landscape, which represent the embryonic stem cell (ESC) state and differentiated cell (DC) state, respectively, and quantified high dimensional biological paths for differentiation and reprogramming process. Through identifying the optimal combinations of gene targets based on a landscape control strategy, we offered some predictions about the key regulatory factors that govern the differentiation and reprogramming in ESCs.
Embryonic stem cells (ESCs) can differentiate into diverse cell types and have the ability of self-renewal. Therefore, the study of cell fate decisions on embryonic stem cells has far-reaching significance for regenerative medicine and other biomedical fields. Mathematical models have been used to study emryonic stem cell differentiation. However, the underlying mechanisms of cell differentiation and lineage reprogramming remain to be elucidated. Especially, how to integrate the computational models with quantitative experimental data is still challenging. In this work, we developed a data-constrained modelling approach, and established a model of mouse embryonic stem cells. We used the truncated moment equations (TME) method to quantify the potential landscape of the ESC network. We identified two attractors on the landscape, which represent the embryonic stem cell (ESC) state and differentiated cell (DC) state, respectively, and quantified high dimensional biological paths for differentiation and reprogramming process. Through identifying the optimal combinations of gene targets based on a landscape control strategy, we offered some predictions about the key regulatory factors that govern the differentiation and reprogramming in ESCs.
Embryonic stem cells have the capacity of self-renewal, high proliferation and multilineage differentiation. Therefore, they have an important and far-reaching influence on new drug discovery and development, disease models, organ transplantation and tissue repair [1], [2]. Studies have shown that the reprogramming of human somatic cells and lineage reprogramming can be achieved by regulating several key genes [3], [4], [5], [6], [7], [8]. Some key genes that determine whether embryonic stem cells exit or remain in a pluripotent state have been extensively studied [9], [10], [11], [12], [13], [14]. However, the underlying mechanisms of cell differentiation and lineage reprogramming have not been fully clarified. It remains challenging to control whether cells exit or remain in the pluripotent state, which needs further explorations.The pluripotency of mouse embryonic stem cells has been considered to be controlled by underlying gene regulatory networks [15], [16], [17], [18], [19]. Dunn et al. developed a data-constrained, computational approach to construct a simplified gene regulation network of embryonic stem cells, which comprises 12 gene components and three input signals. This model can explain the known phenotypes, predict some results of combined genetic perturbations and test the relative stability of embryonic stem cells under different culture conditions [20]. Recently, a simulation approach has been proposed to investigate the stochastic dynamics of pluripotency in mouse embryonic stem cells based on the network of Dunn et al. [21]. However, the global stability of pluripotency network of mouse stem cells has yet to be explored. More importantly, how to identify the optimal combinations of interventions from gene networks to induce reprogramming remains challenging.Here, we aim to apply the landscape theory to study the dynamical mechanisms of the underlying gene regulatory network of embryonic stem cells. The classic Waddington landscape has been proposed as a metaphor for the development and differentiation of cells [22]. Recently, the epigenetic landscapes for biological networks have been quantified from various approaches [23], [24], [25], [26], [27], [28], [29], and employed to study the stochastic dynamics of embryonic development [23], [30], [31], [32], [33], [34]. From the perspective of landscape, different cell types are depicted as attractors on a potential surface. The cell differentiation process is viewed as a ball rolling from one basin to another on the landscape surface by crossing certain barriers. The barrier heights among the attractors quantify the feasibility of the cell transformation from one cell type to the other.In this work, we try to identify the important factors affecting the differentiation and reprogramming of mouse embryonic stem cells from the perspective of dynamics. Based on an embryonic stem cell network [20], we built models of ordinary differential equations (ODEs) to describe the time evolution of expression levels of different components. One critical issue in constructing a gene regulation network model is how to determine the parameters of the system. Here, we adopted a data constrained modelling strategy, and estimated the parameters in the models to fit the binary experimental data from Dunn et al. [20]. Then we obtained the potential landscape of the network by truncated moment equations (TME) method [35], [36]. We identified two attractors on the landscape, which represent the embryonic stem cell (ESC) state and the differentiated cell (DC) state, respectively. We further calculated the minimum action paths (MAPs) to quantify the transition processes of the differentiation and reprogramming. By single factor sensitivity analysis of parameters, we predicted some critical regulations in cell differentiation and reprogramming. From the perspective of gene networks, an effective intervention strategy should be targeting multiple factors in the network, rather than one. By optimizing the transition action from DC state to ESC state, we identified some optimal combinations of key regulations among genes, which can promote reprogramming. These analyses on the landscape and the transition actions will promote our understanding on the mechanisms of differentiation and reprogramming, and help to identify key genes and interactions for inducing reprogramming.
Results
Mathematical model of embryonic stem cell network
Our model is based on the embryonic stem cell network inferred by Dunn et al. [20]. This network includes 26 interactions (regulations) and 15 components (12 genes and 3 signals). The three extra-cellular signals are cytokine leukemia inhibitory factor (LIF) and two selective inhibitors, including glycogen synthase kinase 3 (CH) and mitogen-activated protein kinase (PD) [37]. Some direct transcriptional targets of LIF, CH, and PD have been identified, including Stat3, Gbx2, Klf4, Tcf3, Tfcp2l1, Esrrb, MEKERK and Nanog [38], [39], [40], [41], [42], [43] (see Supplemental information Fig. S1 for direct transcriptional targets). The other regulations among genes are predicted from experimental data of mouse embryonic stem cells [20]. Based on this network, we constructed models of ODEs to describe the time evolution of expression levels of each component, which can be written as the following form:where, represents the expression level of gene i. and k represent the basal production and degradation rates of , respectively.We used Hill functions to describe the regulations (activations and inhibitions) in the network (links in Fig. 1). The Hill function is a sigmoidal function. It has been commonly used to describe gene regulations in gene network models [44], [45], [46]. S in Eq. (1) represents the threshold of a sigmoidal function. n is the Hill coefficient, which determines the steepness of the sigmoidal function. and are the scale factors for the activation and inhibition, respectively. When the scale factor is equal to zero, it means that there is no activation from gene j to gene i. When the scale factor is equal to zero, it means that there is no inhibition from gene j to gene i.
Fig. 1
The network of the core circuit of an embryonic stem cell system including 15 nodes and 26 interaction links (18 activations and 8 inhibitions). Black arrows represent activation and red bars represent inhibition. Orange nodes represent input signals, and pink nodes represent genes. The width of the links represents the sensitivity of the links calculated from the optimization of transition actions (Table 1). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
The network of the core circuit of an embryonic stem cell system including 15 nodes and 26 interaction links (18 activations and 8 inhibitions). Black arrows represent activation and red bars represent inhibition. Orange nodes represent input signals, and pink nodes represent genes. The width of the links represents the sensitivity of the links calculated from the optimization of transition actions (Table 1). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
Table 1
Cellular reprogramming-promoting interventions identified from optimization of transition actions for the model. Direction represents the direction of parameter changes for the corresponding interventions. Here, the inhibition constant represents the maximal synthesis rate when the inhibition regulation is completely not working. So, the increase of represents the decrease of the corresponding inhibition strength. Sensitivity is defined as the average of the changes in the parameters (strength of regulations) caused by each intervention across the six parameter sets.
Direction
Regulation
Sensitivity
References
↑
LIF →Stat3
0.123
[60]
↑
Tcf3 ⊣ Tfcp2l1
0.114
Prediction
↑
Esrrb ⊣ Oct4
0.104
[61]
↑
PD ⊣ MEKERK
0.089
[62]
↑
CH ⊣ Tcf3
0.089
Prediction
↑
MEKERK ⊣ Tcf3
0.089
Prediction
↑
MEKERK ⊣ Nanog
0.086
[63]
↑
Oct4 ⊣ Tfcp2l1
0.083
[11]
↑
Tcf3 ⊣ Esrrb
0.077
Prediction
Estimate parameters of the model from experimental data
We built the models of ODEs based on the embryonic stem cell network. Then, we used experimental data of Dunn et al. to estimate the parameters in our model [20]. The experimental data are binary gene expression data of 12 genes from PCR under five various combinations of external signals (LIF, CH, PD) (Fig. 2C). In the work of Dunn et al., there are totally five different combinations of signals, including 2i + LIF, 2i, LIF + CH, LIF + PD, and no signals. Therefore, we used gene expression data from these five various combinations of external signals to estimate the parameters in our model. In order to reduce the dimension of parameters space, we assumed that all the nonzero are equal to undetermined constant a and nonzero are equal to undetermined constant b. Moreover, LIF, CH and PD are regarded as signals of the network. Therefore, the unknown parameters in our model include and . Next, we aimed to estimate these parameters from the binary experimental results of Dunn et al. [20]. Since the experimental data are dimensionless, we specified the parameter range based on previous works [23], [45]. Here, the Hill coefficient n determines the steepness of the sigmoidal function, i.e. the degree of nonlinearity (the cooperativity for gene regulations). Following previous works [30], [46], we assumed or 4 to represent certain degree of cooperativity for gene regulations. We set the basal production rate and degradation rate , and the ranges of other parameters are and .
Fig. 2
Comparison of gene expression profiles between modeling and experiments. (A) The gene expression profiles from the model using the picked parameter set. (B) The binary expression profiles corresponding to (A). (C) The gene expression profiles from experiments [20]. Gene expression is discretized to high (blue) or low (white). The input signal is discretized to high (green) or low (white). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
Comparison of gene expression profiles between modeling and experiments. (A) The gene expression profiles from the model using the picked parameter set. (B) The binary expression profiles corresponding to (A). (C) The gene expression profiles from experiments [20]. Gene expression is discretized to high (blue) or low (white). The input signal is discretized to high (green) or low (white). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)Then, we followed three steps to estimate the parameters of the model:First, we reduced the searching range of parameters based on the biological significance of stable states. A major biological constraint is multistability, because in an embryonic developmental system there should be at least two cell states, i.e. ESC state and DC state. In our model, it means that there will be multiple stable states when for different initial conditions. Therefore, we performed a search in the parameter space: and . For each parameter set, we numerically solved the ODEs under different initial values to acquire stable fixed points of the ODEs, i.e. stable steady states. We recorded the parameter sets that generate bistable states and corresponding stable states. Then, we binarized the stable states by the mean value of stable states, i.e., if the expression level of a gene is larger than the mean value, it is denoted as 1 and if the expression level of a gene is less than the mean value, it is denoted as 0. Further, we picked the parameter sets that can generate biologically significant bistable states (Fig. 3), i.e., one stable state should represent pluripotent ESC state, which has higher expression level (with the expression level of 1 after binarization) of OCT4, Nanog, Sox2, Esrrb and Tfcp2l1 (ESC markers), and the other stable state should represent DC state, which has lower expression level (with the expression level of 0 after binarization) of OCT4, Nanog, Sox2, Esrrb and Tfcp2l1.
Fig. 3
The points in parameter space which can generate two stable states (ESC state and DC state), from parameter searching. The color of the points denotes the Hamming distance between the benchmark data and 12 5 real matrix generated from the model using corresponding parameter set. Blue points represent smaller Hamming distance and yellow points represent larger Hamming distance. The triangular points represent the parameters used for optimization. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
The points in parameter space which can generate two stable states (ESC state and DC state), from parameter searching. The color of the points denotes the Hamming distance between the benchmark data and 12 5 real matrix generated from the model using corresponding parameter set. Blue points represent smaller Hamming distance and yellow points represent larger Hamming distance. The triangular points represent the parameters used for optimization. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)For each parameter set including eight parameters: obtained in the previous step, we performed simulations under five experimental conditions to find parameter sets that mostly fit the five groups of experimental data. Since embryonic stem cells can be stably cultured as a substantially homogeneous population in different environments [20], if the model has multiple stable states under a certain experimental condition, we choose the steady state with high probability as the expression level under this signal. Here our fitting procedure is based on deterministic ODEs. To consider the heterogeneity of populations, future work can develop methods to fit models based on the distribution of expression data among populations. In this way, we can obtain a (expression levels of 12 different gene variables in five different conditions) real-valued matrix. Then we took the average of gene expression levels under different signals as a threshold to acquire a binary matrix.To find parameters that mostly fit the experimental gene expression profile (benchmark data) (Fig. 2C), we calculated the Hamming distance (the discrepancy between two binary matrices) between the experimental gene expression profile and discretized gene expression profile from the model (step 2), and sorted them by hamming distance to find the most suitable parameter set. Fig. 2A shows the gene expression profiles generated from the selected parameter set, and Fig. 2B shows the binary expression profile corresponding to Fig. 2A. Fig. 3 shows the points in parameter space which can generate ESC state and DC state.
Potential landscape and the minimum action path for the embryonic stem cell model
Lately, several studies have employed the landscape approach to investigate the stochastic dynamics of biological systems [23], [25], [44]. Here we used the TME method (see Methods) to calculate the potential landscape of the embryonic stem cell model. Since it is hard to visualize the landscape in 12-dimensional space, we selected two gene variables as coordinates for landscape and projected the 12-dimensional landscape into two dimensions. Here, we chose two marker genes of embryonic stem cells, OCT4 and Nanog, as the coordinates of landscape. Oct4 is a mammalian POU transcription factor expressed by early embryo cells and germ cells [47]. Nanog’s overexpression enables self-renewal of embryonic stem cells, and Nanog is accordingly considered as a core element of the pluripotent transcriptional network [48]. Of note, our main conclusions do not depend on the specific choice of the coordinates because the transition actions among different attractors are based on the 12-dimensional space (see Supplemental information Fig. S2 for landscapes using other pairs of variables as coordinates). In this way, two stable states emerge on the landscape for the embryonic stem cell system (Fig. 4). Here, the ESC state has a high OCT4, Nanog, Sox2, Esrrb and Tfcp2l1 expression level, while the DC state has a low OCT4, Nanog, Sox2, Esrrb and Tfcp2l1 expression level.
Fig. 4
The landscape and paths for the ESC model shown in Oct4 and Nanog coordinates. The blue region represents higher probability or lower potential and the yellow region indicates lower probability or higher potential. The white line represents the transition path from ESC state to DC state, and the magenta line represents the transition path from DC state to ESC state. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
The landscape and paths for the ESC model shown in Oct4 and Nanog coordinates. The blue region represents higher probability or lower potential and the yellow region indicates lower probability or higher potential. The white line represents the transition path from ESC state to DC state, and the magenta line represents the transition path from DC state to ESC state. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)To study the transition processes between ESC state and DC state, we calculated the kinetic transition paths between two cell states by minimizing the transition actions, which are called MAPs (see Methods for how to acquire the MAPs). Fig. 4 shows the landscape and MAPs shown in Oct4 and Nanog coordinates. The white line represents the transition path from ESC state to DC state (differentiation), while the magenta line represents the transition path from DC state to ESC state (reprogramming). The transition paths for cell differentiation process and cell reprogramming process are not identical, which is reflected by the disparity between the forward and backward kinetic transition paths. The kinetic paths of the system deviating from the conventionally expected potential gradient paths is caused by the non-gradient force, i.e. curl flux [26].
Single factor sensitivity analysis on model parameters
To further study the effect of interactions among the genes on the dynamics of the embryonic stem cell model, we performed single factor sensitivity analysis on the 26 parameters (18 activation constants and 8 inhibition constants) of embryonic stem cell model. Specifically, we increased or decreased each parameter by to modulate the regulation strengths among genes, and calculated how the transition actions between ESC state and DC state change after these perturbations. In this way, we can identify some critical regulations that govern the transition between DC state and ESC state. From the results of sensitivity analysis (Fig. 5), we conclude that the increase of all activation constants and inhibition constants (the increase of activation strength or the decrease of inhibition strength) will increase the transition actions from ESC state to DC state, and make ESC state more stable. These key regulations are sorted according to their sensitivities (defined as the difference between and ). The top nine sensitive regulations are shown in Table S1. The sensitivity analysis predicts some key links including LIFStat3, Tfcp2l1Sall4 and EsrrbTfcp2l1. Here represents activation regulation. The increase of these link strengths can critically enhance the stability of ESC state.
Fig. 5
Sensitivity analysis for the 26 parameters (regulatory strengths, including 18 activation constants and 8 inhibition constants) on the transition actions. Y axis represents the 26 parameters. X axis represents the percentage of the change of the transition action (S) relative to S with default parameters. Here, represents the transition action from attractor ESC to attractor DC (cyan bars), and represents the transition action from attractor DC to attractor ESC (magenta bars). (A) Each parameter is increased by 10%, individually. (B) Each parameter is decreased by 10%, individually. ESC: Embryonic stem cell; DC: Differentiated cell.
Sensitivity analysis for the 26 parameters (regulatory strengths, including 18 activation constants and 8 inhibition constants) on the transition actions. Y axis represents the 26 parameters. X axis represents the percentage of the change of the transition action (S) relative to S with default parameters. Here, represents the transition action from attractor ESC to attractor DC (cyan bars), and represents the transition action from attractor DC to attractor ESC (magenta bars). (A) Each parameter is increased by 10%, individually. (B) Each parameter is decreased by 10%, individually. ESC: Embryonic stem cell; DC: Differentiated cell.
Cellular reprogramming-promoting interventions
To predict the combination of target regulations among genes which can maintain the totipotency of embryonic stem cells, we used an approach of optimization to predict the critical combinations of parameters [44], [49]. The optimization goal is to make ESC state more stable and DC state less stable by tuning the 26 parameters (18 activation constants and 8 inhibition constants) while keeping the number of steady states unchanged (see Methods for details).In this work, we selected six different initial parameter sets to perform optimization (the triangular points in Fig. 3). These six parameter sets can generate two stable states (ESC state and DC state), where the DC state is more stable. Due to the heterogeneity of biological individuals, these six initial parameter sets can be regarded as the corresponding cells of six different individuals. For each initial parameter set, we found an optimal intervention to induce the transition from DC state to ESC state. The results of optimizing transition actions are shown in Fig. 6 (also marked in Fig. 1).
Fig. 6
Predictions for reprogramming-promoting interventions based on the optimization of transition actions. X axis shows 6 different individual model results (different initial parameter sets), reflecting the heterogeneity in stem cell populations, and Y axis shows 26 parameters (link strengths, 18 activation links and 8 inhibition links). Different colors indicate the changes in the parameters (strength of links) caused by each intervention, with purple grids representing the increase of targets (link strength), and cyan grids representing the decrease of targets.
Predictions for reprogramming-promoting interventions based on the optimization of transition actions. X axis shows 6 different individual model results (different initial parameter sets), reflecting the heterogeneity in stem cell populations, and Y axis shows 26 parameters (link strengths, 18 activation links and 8 inhibition links). Different colors indicate the changes in the parameters (strength of links) caused by each intervention, with purple grids representing the increase of targets (link strength), and cyan grids representing the decrease of targets.Specifically, we identified nine critical regulations from 26 regulatory parameters. Table 1 shows the top nine sensitive regulations and the corresponding sensitivity (sensitivity is defined as the average of the changes in the parameters (strength of regulations) caused by each intervention across the six parameter sets). From Table 1 and Fig. 6, we found that the important intervention for the transition action includes LIFStat3, Tcf3Tfcp2l1 and EsrrbOct4, which correspond to the first three rows of Table 1. Therefore, our results suggest a combinatory strategy for reprogramming by increasing the activation of LIF on Stat3, decreasing the inhibition of Tcf3 to Tcfp2l1 and the inhibition of Esrrb to Oct4, simultaneously.Cellular reprogramming-promoting interventions identified from optimization of transition actions for the model. Direction represents the direction of parameter changes for the corresponding interventions. Here, the inhibition constant represents the maximal synthesis rate when the inhibition regulation is completely not working. So, the increase of represents the decrease of the corresponding inhibition strength. Sensitivity is defined as the average of the changes in the parameters (strength of regulations) caused by each intervention across the six parameter sets.To see the difference between single factor sensitivity analysis and the optimization approach, we compare the results of optimization and single factor sensitivity analysis. From the sensitivity analysis, the top three targets are LIFStat3, Tfcp2l1Sall4 and EsrrbTfcp2l1. The increase of the strengths of these links can critically enhance the stability of the ESC state. It can be seen from the network that the increase of these link strengths will eventually increase the expression of Sall4. This indicates that these regulations have the similar functions. By contrast, the top three target regulations from optimization are not identical to those from single factor sensitivity analysis. The optimization result shows that we can simultaneously increase the strengths of LIFStat3, Tcf3Tfcp2l1 and EsrrbOct4 to inhibit differentiation. Both single factor sensitivity analysis and optimization indicate that an efficient reprogramming-promoting intervention should include LIFStat3, while the optimization result includes two other target links, and the function of these two other links are different from LIFStat3. Therefore, the optimization result seems to suggests that a better strategy should target links with different functions instead of links with the same functions.
Discussion
Embryonic stem cells have the potential to differentiate to any tissue in the body, which provides the motivation for the application of stem cell in regenerative medicine and other fields. However, the mechanisms of cellular differentiation and reprogramming remain to be elucidated. In this study, we established a data-constrained theoretical approach to model the gene regulatory network of mouse embryonic stem cells described by Dunn et al. [20]. Here, we built a model of ODEs for ESC, and proposed a data constrained approach to estimate the parameters of the models from the binary experimental data from Dunn et al. [20]. We identified a parameter set which can fit the experimental data well, and the models using this parameter set can generate biologically significant bistable states. We calculated the potential landscape of the ESC network and identified two attractors, which represent the pluripotent ESC state and DC state, respectively. We further quantified the MAPs between ESC state and DC state. The results show that the differentiation path and reprogramming path are irreversible. This irreversibility of MAPs is a consequence of non-gradient force, i.e. curl flux [26].By single factor sensitivity analysis of parameters, we provided some predictions about the key links affecting differentiation and reprogramming. The results from sensitivity analysis indicate that LIF/Stat3 signalling and gene Tfcp2l1 play important roles in maintaining the stability of the ESC state. These predictions agree well with previous experimental studies. For example, LIF is suggested to inhibit differentiation for mouse embryonic stem cells that maintains embryonic stem cells in a totipotent state and stimulates self-renewal [10], [50]. Stat3 is seen to be the most important signal transducer following stimulation by LIF and the one which mediates most of the cellular effects. Overexpression of a dominant-negative Stat3 construct in ES cells also leads to both differentiation and a loss of self-renewal [9]. Additionally, the CP2 family transcription factor Tfcp2l1 is a common target in LIF/Stat3- and 2i-mediated self-renewal, and forced expression of Tfcp2l1 can recapitulate the self-renewal-promoting effect of LIF or either of the 2i components. This means that Tfcp2l1 is at the intersection of LIF- and 2i-mediated self-renewal pathways and plays a critical role in maintaining ESC identity [11].Although single factor sensitivity analysis can uncover some single key parameters in the network, the cell fate decision in biological system is often governed by multiple factors rather than one. In the mean time, the multi-factor sensitivity analysis is not easy to implement due to the large computational cost. Therefore, it is necessary to develop an approach to predict the optimal combinations of multiple regulations that are critical to the dynamics of the embryonic stem cell model. Here, we used an approach of optimization to predict the critical combinations of parameters [44], [49]. We identified an optimal combinatory strategy by increasing the activation of LIF on Stat3, decreasing the inhibition of Tcf3 to Tcfp2l1 and the inhibition of Esrrb to Oct4, to induce reprogramming. It is now well established that LIF/Stat3 signaling plays an indispensable role in the self-renewal of mouse embryonic stem cells [9], [10], [12], [50]. Recently, activation of LIF-JAK1-STAT3 signaling to delay contact inhibition has been put forward as a strategy to facilitate engineering of HCEC (human corneal endothelial cells) grafts to solve the unmet global shortage of corneal grafts [12]. In addition, Esrrb and Tcf3 have been demonstrated to be required for efficient self-renewal of embryonic stem cells in vitro [13], [14]. According to our model predictions, we propose that relevant experiments can be designed, e.g., to test the reprogramming effects by targeting multiple gene regulations simultaneously (including LIFStat3, Tcf3Tfcp2l1 and EsrrbOct4). We expect that these predictions can be tested with the further development of experimental techniques and will provide the guidance for designing the strategies of inducing reprogramming. These analyses on the landscape and transition actions will promote our mechanistic understanding on differentiation and reprogramming in embryonic stem cells.In the field of systems biology, it remains challenging to combine “model driven” approach (such as gene network models) and “data driven” approach (such as single cell data analysis) in systems biology [51]. In this work, we used binary gene expression data to estimate the parameters in the model. We performed a search in the parameter space based on specific biological constraints. In principle, this approach can be applied to other data sets, such as single cell data. Recently, single cell data of gene expression have been extensively explored from different approaches [52], [53]. One limitation of current methods for single cell analysis is that the molecular mechanisms and kinetic transitions are hard to be studied. Our approach is based on dynamical models, and we use experimental data to infer the parameter regime for the models. This provides a potential way to combine “model-driven” approach and “data-driven” approach. It will be of great interest to develop a single cell data-constrained dynamical modelling approach, and construct more accurate energy landscapes for ESC and other biological networks.
Methods
Truncated Moment Equations (TME)
The time evolution of the gene expression level can be studied as a complex dynamical system. The gene expression level can be regarded as a stochastic process, and represents the driving force of the system. Then the Langevin equations describing the dynamics of the gene expression level take the form:Here, is the noise term of the system, and coefficient matrix is a matrix-valued function of , while is n-dimensional independent Gaussian white noise, which means:where D is constant diffusion coefficient andThe time evolution of a dynamical system is determined by probabilistic diffusion equations (Fokker–Planck equation). However, when the drift part of Fokker–Planck equation is nonlinear, it is difficult to find an analytic solution. Nonetheless, the solutions of nonlinear system and linear system have similar form when the diffusion coefficient . Therefore, when , the solution could be approximated by Gaussian distribution and the mean and covariance satisfy the following equations [54], [55]:denotes the covariance matrix in time t, and is the jacobian matrix of when is equal to , which means . Furthermore, is matrix which is equal to . Then the time evolution of density function for this system can be expressed by Eq. (8).In this work, we assume that the noise is homogeneous and only consider the external noise. Therefore, is equal to , which is an identity matrix in n-dimensional space. Then Eq. (7) is modified to:When , we can get the density function of the system at the steady state by solving Eqs. (6) and (9). The probability distribution acquired above corresponds to one steady state. If the system has multiple steady states, there should be several probability distributions localized at each basin with different covariance. Therefore, the total probability is the weighted sum of all these probability distributions.If we only focus on the distribution of two variables such as and , we can get the marginal distribution by integrating the other variables in . Finally, we can construct the potential landscape U by
[26], [30], [44]. Here is steady state probability distribution, and is dimensionless potential.
Minimum action paths
The differential equation is an autonomous system, which does not explicitly depend on the independent variable t. Assuming that the initial time is 0 and the terminal time is T, we define the path between ith attractor and jth attractor as for , where the path should satisfy following terminal conditions:The transition action between and is defined as:Here, is the Lagrangian function that denotes the distance between the driving force and the velocity along the path:Following the approaches [56], [57] based on Wentzell-Freidlin theory [58], the most probable transition path from ith attractor to jth attractor at time T can be acquired through minimizing the transition action functional over all possible paths:We calculated MAPs numerically by applying minimum action methods used in [44], [57], and treated the MAPs as the biological paths in our models. In this work, T is set to be 5 and we verified that larger values of T would not improve accuracy of simulations significantly.
Transition action optimization
For a stochastic differential equation:the corresponding deterministic components are described by , and the time-independent stable states can be defined by . When there are more than one stable states in this system, the transitions between two stable states i and j occur as a Poisson process with a certain transition rate . The transition rate represents the transition probability per unit of time along the most likely transition path(s) between the stable states i and j. They can be obtained by an asymptotic formula [59]:where, is minimum action computed from MAPs. In order to achieve the goal of making ESC state more stable and DC state less stable, we minimized the ratio of to by tuning the 26 parameters (18 activation constants and 8 inhibition constants). We used the fmincon function of the optimization toolbox of MATLAB to implement the optimization.
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Authors: Ian Chambers; Jose Silva; Douglas Colby; Jennifer Nichols; Bianca Nijmeijer; Morag Robertson; Jan Vrana; Ken Jones; Lars Grotewold; Austin Smith Journal: Nature Date: 2007-12-20 Impact factor: 49.962
Authors: J Nichols; B Zevnik; K Anastassiadis; H Niwa; D Klewe-Nebenius; I Chambers; H Schöler; A Smith Journal: Cell Date: 1998-10-30 Impact factor: 41.582
Authors: Debbie L C van den Berg; Tim Snoek; Nick P Mullin; Adam Yates; Karel Bezstarosti; Jeroen Demmers; Ian Chambers; Raymond A Poot Journal: Cell Stem Cell Date: 2010-04-02 Impact factor: 24.633
Authors: Fredrik Lanner; Kian Leong Lee; Marcus Sohl; Katarina Holmborn; Henry Yang; Johannes Wilbertz; Lorenz Poellinger; Janet Rossant; Filip Farnebo Journal: Stem Cells Date: 2010-02 Impact factor: 6.277
Authors: Anselme L Perrier; Viviane Tabar; Tiziano Barberi; Maria E Rubio; Juan Bruses; Norbert Topf; Neil L Harrison; Lorenz Studer Journal: Proc Natl Acad Sci U S A Date: 2004-08-13 Impact factor: 11.205