Ataur Katebi1, Vivek Kohar1, Mingyang Lu2. 1. The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA. 2. The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA. Electronic address: mingyang.lu@jax.org.
Abstract
Many biological processes involve precise cellular state transitions controlled by complex gene regulation. Here, we use budding yeast cell cycle as a model system and explore how a gene regulatory circuit encodes essential information of state transitions. We present a generalized random circuit perturbation method for circuits containing heterogeneous regulation types and its usage to analyze both steady and oscillatory states from an ensemble of circuit models with random kinetic parameters. The stable steady states form robust clusters with a circular structure that are associated with cell cycle phases. This circular structure in the clusters is consistent with single-cell RNA sequencing data. The oscillatory states specify the irreversible state transitions along cell cycle progression. Furthermore, we identify possible mechanisms to understand the irreversible state transitions from the steady states. We expect this approach to be robust and generally applicable to unbiasedly predict dynamical transitions of a gene regulatory circuit.
Many biological processes involve precise cellular state transitions controlled by complex gene regulation. Here, we use budding yeast cell cycle as a model system and explore how a gene regulatory circuit encodes essential information of state transitions. We present a generalized random circuit perturbation method for circuits containing heterogeneous regulation types and its usage to analyze both steady and oscillatory states from an ensemble of circuit models with random kinetic parameters. The stable steady states form robust clusters with a circular structure that are associated with cell cycle phases. This circular structure in the clusters is consistent with single-cell RNA sequencing data. The oscillatory states specify the irreversible state transitions along cell cycle progression. Furthermore, we identify possible mechanisms to understand the irreversible state transitions from the steady states. We expect this approach to be robust and generally applicable to unbiasedly predict dynamical transitions of a gene regulatory circuit.
One of the most significant challenges in biology is to elucidate the function and control mechanism of gene regulatory circuits (GRCs) that govern the decision-making of biological processes (Gerstein et al., 2012, Schwikowski et al., 2000). The rapid development of new genomics technology has allowed us to address this question by both experimental and computational systems biology approaches (Farrell et al., 2018, Moignard et al., 2015, Saelens et al., 2019). One of the critical components among them is to use mathematical modeling to simulate the dynamical behavior of GRCs (Balleza et al., 2008, Ciliberti et al., 2007, Sanchez-Osorio et al., 2013). A good model not only provides a mechanistic view of the system but also makes new predictions on the effect of a genetic perturbation like the knockdown and overexpression of one or more genes. Although mathematical modeling approaches have been successfully applied to analyze gene circuits in select cases (Csikász-Nagy et al., 2006, Goldbeter et al., 2001, Hong et al., 2012, Li and Wang, 2014, Li et al., 2004), they typically suffer from a long-standing problem inherent in almost every systems biology study—it is difficult to directly measure most circuit kinetic parameters, especially in vivo. Although some parameter values can be learned from published results, many others are often based on educated guesses, therefore, limiting the predictive power of mathematical modeling. The problem is amplified when circuit modeling is applied to large gene networks because such systems typically have many parameters, making conventional methods prone to over-fitting.To address this issue, we have recently developed a mathematical modeling algorithm, named random circuit perturbation (RACIPE) (Huang et al., 2018a), which captures the dynamics of a GRC without the need for precise kinetic parameters (Huang et al., 2017). Instead of fine-tuning a model with a specific set of parameters, RACIPE generates an ensemble of models based on the associated chemical rate equations with distinct random kinetic parameter sets. The random models can represent the GRC in different signaling states and/or microenvironments. The simulated expressions from the ensemble of models are then subjected to statistical analysis to identify robust features of the GRC. Particularly, in several applications to simple circuit motifs and biological circuits (Chen et al., 2004, Csikász-Nagy et al., 2006; Huang et al., 2017, Jia et al., 2019a, Kohar and Lu, 2018), we found that even though the kinetic parameters are largely randomized, the stable steady-state solutions from these models form robust clusters of states that can be associated with distinct cellular states. This finding suggests that we can use the circuit topology (i.e., the connectivity and the types of regulatory interactions) as the only input and unbiasedly predict the cellular states that are allowed by the circuit. Without the need to refine a particular set of parameters, RACIPE can be extended to simulate a large gene network. In essence, RACIPE uniquely converts the nonlinear dynamics problem to a statistical data analysis problem, an approach which we believe is critical for analyzing large network systems.So far, RACIPE has been applied, in several cases (Bocci et al., 2019, Huang et al., 2017, Jia et al., 2019b, Kohar and Lu, 2018), to understand the possible gene expression steady states and the roles of genes in the behavior of the GRCs. It remains unclear how efficient the ensemble-based approach is in dissecting the dynamical properties of a circuit, e.g., transitions between the cellular states. An archetypal system to assess RACIPE is the gene regulation of cell cycle progression because of the following reasons. First, cell cycle circuit is evolutionarily robust, as it is required by almost every cell type and organism for survival. With RACIPE, we can perturb the system to evaluate the robustness of a cell cycle GRC. Second, cell cycle dynamics involves transitions across a series of well-studied cell cycle states/checkpoints, which allows us to test whether RACIPE can capture these non-trivial dynamics using only random models. Third, there are many existing experimental evidences and data in the studies of cell cycle, which will help to validate the computational models. Fourth, a fundamental understanding of the regulatory mechanism of cell cycle will facilitate researches in areas such as cell differentiation during normal development (Dalton and Coverdell, 2015, Jakoby and Schnittger, 2004, Li and Kirschner, 2014, Myster and Duronio, 2000, Ruijtenberg and van den Heuvel, 2016, Soufi and Dalton, 2016) and tumorigenesis in cancer (Collins et al., 1997, Icard et al., 2019, Sandal, 2002). Fifth, although there are many existing computational models for studying cell cycle networks, different models usually emphasize different aspects of the system. For example, when studying a specific checkpoint, Tyson et al. proposed a two-state switching model (Barik et al., 2010, Tyson and Novak, 2014); when studying a full cycle of the progression, multiple models considered its dynamics as oscillations (Csikász-Nagy et al., 2006, Gerard and Goldbeter, 2009); yet another popular model suggested the cell cycle progression as a downhill process toward the G1 global attractor (Li et al., 2004). A RACIPE analysis could provide a holistic view of the dynamical properties of a cell cycle circuit.In this study, we analyzed a cell cycle GRC of budding yeast with RACIPE, from which we elucidated the gene regulation during cell cycle progression and the associated state transitions. To achieve this, we first generalized RACIPE for GRCs with multiple regulatory types, including transcription, signaling (such as phosphorylation), and degradation. To carefully examine the oscillatory dynamics, which were considered as an essential component of the cell cycle dynamics (Goldbeter et al., 2001, Novák and Tyson, 2008, Tyson et al., 2003), we developed a numerical algorithm to systematically detect and characterize limit cycles for a large number of models. The generalized RACIPE allowed us to examine the sequential change of circuit dynamics in cell cycle progression. We focused on discovering the features of network dynamics unbiasedly from this statistical analysis of random models. We found that the RACIPE models of the GRC allow both the stable steady states, which form clusters associated with the cell cycle phases, and the oscillatory states, which direct the irreversible state transitions along the cell cycle progression. The model predictions are largely consistent with gene perturbation data in the literature and recent single-cell RNA sequencing (RNA-seq) data. Furthermore, we explored possible mechanisms to understand the irreversible state transitions directly from the steady states of the random models. We hope that RACIPE will facilitate systems biology network modeling and generate unbiased predictions for complex systems.
Results
There are several existing GRC models for the budding yeast cell cycle in the literature (Battogtokh and Tyson, 2016, Chen et al., 2004, Gerard and Goldbeter, 2009, Li and Wang, 2014, Li et al., 2004). Based on these data, we constructed a GRC of 15 genes and 38 regulatory links (Figure 1A). The majority of the components of the circuit were reported in previous studies (Barik et al., 2016, Li et al., 2004). In addition, we made a few modifications to the circuit, as explained in the following. First, instead of including a node representing the cell size, we modeled Cln3 as the input node and the effect of the cell size by the production rate of Cln3. Furthermore, we added Whi5 and its interactions, as Whi5 was found to play a critical role as the signal integrator for cell cycle commitment (G1/S transition) (Liu et al., 2015, Palumbo et al., 2016, Qu et al., 2019) and was included in subsequent cell cycle modeling (Barik et al., 2016). Last, we added the inhibition of Cdc20 by Cdh1 as supported by multiple evidences (Hong et al., 2012, Mangla et al., 2010, Okabe and Sasai, 2007, Rudner and Murray, 2000). The final circuit contains transcription factors (SBF, Whi5, MBF, Mcm1, SFF, and Swi5), major cyclins (Cln3, Cln1,2, Clb5,6, and Clb1,2), other regulatory signaling proteins (Sic1, Cdh1, Cdc20, Pds1, and Cdc14), and a node named DNA Synthesis (rectangular shape, denoted as DNAS) representing the DNA synthesis process. These nodes regulate one another by transcription (black lines), degradation (red lines), or signaling (green lines) (Figure 1A). Each interaction can be either excitatory (arrow head) or inhibitory (circle head).
Figure 1
Stable Steady States of the Gene Regulatory Circuit Capture Cell Cycle Phases
(A) The core gene regulatory circuit of the yeast cell cycle. The nodes correspond to 14 genes and an abstract node representing DNA synthesis process, and the edges correspond to 38 regulatory interactions. These interactions can be any of the following three types: transcription (black), degradation (red), and signaling (green), where each type can be either excitatory (line and arrow head) or inhibitory (line and circle head). Nodes colored green are regulated by signaling interactions only. The node DNA sythesis is also denoted as DNAS.
(B) Hierarchical clustering analysis was performed using 12,758 activity profiles from simulated stable steady states. Six clusters of simulated stable steady states are mapped to specific cell cycle phases.
(C) The distribution of the simulated gene activity profiles of each cell cycle phase (red) and that of all the models (blue).
(D) Projection of the simulated activity profiles onto the first two principal components (PCs, PC1sim: ~32%, PC2sim: ~21%). Each projected cluster that is associated with a cell cycle phase is presented by an ellipse whose center is illustrated by a dot.
(E) Projection of the single-cell RNA-seq data onto the first two PCs (PC1data: 0.49%, PC2data: 0.46%) obtained from the sequencing data. Cln3 expression peaks at the G1 phase (blue), Rfa1 peaks (green) across the late G1 and S phases, HTA1 peaks (orange) at S, and Clb1 peaks (red) at M.
(F) Projection of the single-cell expression data of 15 representative genes (after KNN smoothing) onto the first two PCs from the simulated data (same as D). See Figures S6 and S7 for the application of another five smoothing/inputation methods.
Stable Steady States of the Gene Regulatory Circuit Capture Cell Cycle Phases(A) The core gene regulatory circuit of the yeast cell cycle. The nodes correspond to 14 genes and an abstract node representing DNA synthesis process, and the edges correspond to 38 regulatory interactions. These interactions can be any of the following three types: transcription (black), degradation (red), and signaling (green), where each type can be either excitatory (line and arrow head) or inhibitory (line and circle head). Nodes colored green are regulated by signaling interactions only. The node DNA sythesis is also denoted as DNAS.(B) Hierarchical clustering analysis was performed using 12,758 activity profiles from simulated stable steady states. Six clusters of simulated stable steady states are mapped to specific cell cycle phases.(C) The distribution of the simulated gene activity profiles of each cell cycle phase (red) and that of all the models (blue).(D) Projection of the simulated activity profiles onto the first two principal components (PCs, PC1sim: ~32%, PC2sim: ~21%). Each projected cluster that is associated with a cell cycle phase is presented by an ellipse whose center is illustrated by a dot.(E) Projection of the single-cell RNA-seq data onto the first two PCs (PC1data: 0.49%, PC2data: 0.46%) obtained from the sequencing data. Cln3 expression peaks at the G1 phase (blue), Rfa1 peaks (green) across the late G1 and S phases, HTA1 peaks (orange) at S, and Clb1 peaks (red) at M.(F) Projection of the single-cell expression data of 15 representative genes (after KNN smoothing) onto the first two PCs from the simulated data (same as D). See Figures S6 and S7 for the application of another five smoothing/inputation methods.To elucidate the dynamics of gene regulation during cell cycle progression, we applied a generalized version of the mathematical modeling algorithm RACIPE to the cell cycle gene circuit (see Section 1-2 and 13 in Transparent Methods for the modeling details). In a previous study, the Boolean network model was applied to a similar circuit (Li et al., 2004). In comparison, our approach (1) models the activity of each gene as a continuous variable, (2) considers the differences in various regulatory types, and (3) systematically characterizes both the stable steady states and the oscillatory states (see Section 3 in Transparent Methods, Figure S8). We generated 10,000 RACIPE models, from which we found 12,758 stable steady states (~84% models) and 3,070 stable oscillatory states (~24% models). About 9% models allow the coexistence of both dynamical modes, and another 1% not reaching stable states. The convergent tests of the RACIPE analyses are shown in Section 7 in Transparent Methods, Figures S2, and S3.
The Stable Steady States of the Circuit Capture Cell Cycle Phases
We first investigated the stable steady states from the RACIPE models. Here, the simulated activity levels of 15 genes from the 12,758 states were first log-transformed and standardized. The data are then analyzed by unsupervised hierarchical clustering analysis (R function hclust with distance metric as Euclidean and method ward.D2), from which we found six robust clusters, labeled as C1 ~ C6 (Figure 1B). To characterize the biological meaning of these clusters, we compared the expression patterns from the simulations with the literature data as follows. First, we computed, for each gene, the distributions of the simulated activity levels for each cluster (Figure 1B, red histograms) and those of all the data (Figure 1B, blue histograms). Second, we assigned the states of each gene to high, medium, or low by comparing the red and blue histograms (Table S1). Third, we compared the expression pattern of each cluster with the experimental evidence in the literature and associated each cluster to the best matching cell cycle phase (Table S2). The assignment of cell cycle phases for early G1, mid G1, late G1, S, M, and M/G1 is illustrated in Figures 1B and 1C.Next, we projected the simulated steady states to the first two principal components (PCs) (~32% contribution from PC1 and ~21% from PC2). As shown in Figure 1D, the six clusters of the steady states are arranged circularly. Strikingly, the directionality of the cell cycle progression follows a series of consecutive state transitions from a cluster to the neighboring cluster along the counterclockwise direction. We then compared the simulation results with recently published single-cell RNA-seq data of budding yeast (Jackson et al., 2020). When principal-component analysis (PCA) was performed on the genome-wide gene expression data, we also observed a similar circular structure of gene expression for individual cells (Figure 1E). Representative marker genes Cln3 (Early G1), Rfa1 (Late G1), HTA1 (S), and Clb1 (M) have peak expression in the corresponding cell cycle phases. Subsequently, we projected the expression data of genes associated with the cell cycle GRC to the PCs obtained from the RACIPE simulations. We found that one of the genes (Whi5) in the GRC is not expressed at all and another gene (Cdc14) is not highly expressed in the phases it is supposed to be expressed. In those cases, we replaced the gene with another gene that (1) directly interacts with the original gene and (2) has expression consistent with previously reported microarray data (Spellman et al., 1998). See Table S3 for the detailed mapping. As shown in Figure 1F, the circular structure, although much noisier, can still be discerned. Here, we applied KNN smoothing (Wagner et al., 2018) to the single-cell RNA-seq data to alleviate the dropout effects. We also explored several imputation methods, including MAGIC (van Dijk et al., 2018), scImpute (Li and Li, 2018), SAVER (Huang et al., 2018b), and RESCUE (Tracy et al., 2019). We found that although these methods can capture the circular structure of the genome-wide transcriptomics data, there is no evident improvement over KNN in capturing the structure of the data when projecting using the GRC genes (see Section 8 in Transparent Methods and Figures S4–S7 for details). Furthermore, our finding that cell cycle progression corresponds to the state transitions in a circular structure has also been observed in budding yeast time series microarray data (Ferrezuelo et al., 2010, Spellman et al., 1998) and mouse embryonic stem cell single-cell RNA-seq data (Liu et al., 2017).The aforementioned mapping of the cell cycle phases helps to identify the dynamical change of the gene activities along cell cycle progression from the RACIPE models. As shown in Figure 1C, the histogram of the activity of some genes (e.g., SBF) is bimodal and the gene switches the level of its activity discretely, whereas some other genes (e.g., Cdh1) have more continuous changes in the activity during the state transitions. Markedly, the three cyclin genes Cln1,2, Clb5,6, and Clb1,2 belong to the former type. Our model predicts that Cln1,2 switches from low to high activity in the mid G1 phase and remains high till the S phase, and then switches back to low activity in the M phase. By constrast Clb5,6 switches to high in the late G1 phase, whereas Clb1,2 switches to high in the S phase. The model predictions on the discrete changes of the cyclin expressions and their specific order are consistent with known experimental observations (Rustici et al., 2004, Spellman et al., 1998).
Perturbation Analysis Identifies the Role of the Genes in Cell Cycle Progression
To further validate the modeling results of the cell cycle circuit, we performed in silico perturbation analysis to evaluate the effects of gene knockdown (KD) on the behavior of the circuit. The outcomes of the gene perturbations were then compared with experimental evidence in the literature. Figure 2 shows the modeling results for the untreated (UT), 15 single gene KD, and 12 double gene KD. Here, the effect of gene KD was simulated by reducing the maximum production rate of the corresponding genes by 95%. For each case, we simulated 10,000 random models by RACIPE and evaluated the distribution of the stable steady states in each cell cycle phase. To rapidly assign the cell cycle phase to every stable steady state, we trained a fast-forward neural network model using the original 10,000 RACIPE models (i.e., the UT condition, the data from Figure 1B). From our benchmark, the neural network model has over 90% accuracy in the state assignment (see Section 5 in Transparent Methods and Figure S26). The results for the KD simulations at different KD levels are shown in Figure S9.
Figure 2
Perturbation Analysis
Distribution of the fraction of steady states in each cell cycle phase from the simulations of the cell cycle circuit under untreated (UT), single knockdown (KD), and double KD conditions. To assign new phases after a perturbation, we trained a neural net model using the activity profiles of the untreated condition and the classification of cell cycle phases from the previous hierarchical clustering analysis (see Section 5 in Transparent Methods for details), and then we infer the phase for any new activity profile using the trained model.
Perturbation AnalysisDistribution of the fraction of steady states in each cell cycle phase from the simulations of the cell cycle circuit under untreated (UT), single knockdown (KD), and double KD conditions. To assign new phases after a perturbation, we trained a neural net model using the activity profiles of the untreated condition and the classification of cell cycle phases from the previous hierarchical clustering analysis (see Section 5 in Transparent Methods for details), and then we infer the phase for any new activity profile using the trained model.The perturbation analysis identifies the role of each gene in stabilizing or destabilizing various cell cycle phases. The KD conditions wherein one or multiple cell cycle phases are mainly affected are summarized in Table S4. The outcomes of the gene perturbation are well supported by experimental evidence. For example, (1) the modeling results show that Cln3 KD increases the fraction of states in early G1 and M/G1 and decreases that of S. This is consistent with the experimental data where the inactivation of Cln3 was found to cause the cell size to increase, elongating early G1 (Dirick et al., 1995, Stuart and Wittenberg, 1995, Tyers et al., 1993, Wijnen et al., 2002). For the cell cycle START to take place, activation of Cdc28 by G1-cyclins Cln1, Cln2, or Cln3 is necessary (Dirick et al., 1995), and Cdc28 mutants cause a cell to stall at the START of early G1 phase (Hartwell et al., 1974). (2) Whi5 KD increases the fraction of states in mid G1 and S and decreases that in M/G1 and early G1, consistent with the finding that the inactivation of Whi5 dramatically reduces the cell size by shortening early G1 (de Bruin et al., 2004). (3) Interestingly, Whi5 KD seems to have the opposite effect of Cln3 KD. Double perturbation of Cln3 and Whi5 appears to mitigate the impact of each other by restoring the proportions in early G1 and mid G1, which is consistent with experimental findings (de Bruin et al., 2004). (4) Cln1 and Clb5 double KD reduces the proportion of stable states in mid G1 to M and increases that in M/G1 and early G1. This prediction is also supported by the experimental finding that quadruple KD of Cln1,2 and Clb5,6 arrests cells in G1 (Schwob and Nasmyth, 1993). (5) Cdc20 and Pds1 double KD decreases the proportion of states in M/G1, consistent with the findings that double mutant (Cdc20 and Pds1) cells get arrested in telophase (Lim et al., 1998). These experimental evidences are listed in Table S5.
Dynamical Changes of Circuit States along Cell Cycle Progression
As shown in previous sections, stable steady states from RACIPE models can capture the gene activity patterns of various cell cycle phases. Next, we show, from these steady states altogether, how we can recapitulate the dynamic behavior of the cell cycle circuit. We explored several existing pseudotime inference methods, including Angle (Saelens et al., 2019), ElPiGraph Cycle (Albergante et al., 2020), reCAT (Liu et al., 2017), and SoptSC (Wang et al., 2019), to assign pseudotime index to each activity profile of a steady state. However, these methods failed to infer the correct ordering of cell cycle phases, presumably because of the special circular structure and the high variation of the simulated data. Thus, we devised approaches to assign pseudotime using either our results from hierarchical clustering analysis shown in Figure 1B or a custom use of a recent pseudotime inference method named psupertime (Macnair and Claassen, 2019) shown in Figure S12; see Section 9 in Transparent Methods and Figures S10–S14 for details of the pseudotime analyses.The change in gene activity for Whi5 and three cyclins (Cln1,2, the G1 phase cyclin; Clb5,6, the S phase cyclin; and Clb1,2, the M phase cyclin) along the pseudotime (obtained using the custom-use of psupertime) are shown in Figure 3A (see Figure S12A for the other genes). The profiles in Figure 3A are illustrated with two repeats to represent two cycles, and they capture the overall gene activity dynamics well. Interestingly, genes like Cln1,2 exhibit binary activity—Cln1,2 level is high during G1 and low during M and M/G1. However, genes like Whi5 and Clb1,2 exhibit more gradual changes in activity along cell cycle progression. Moreover, Whi5 and Cln1,2 have opposite phases, with Whi5 switching activity slightly ahead of Cln1,2, which is captured by their peaked negative cross-correlation at lag 0 (Figure S12C). Noticeably, Clb5,6 level increases when the Cln1,2 level is still high. When Clb5,6 level decreases, Clb1,2 level starts to rise and continues to stay elevated during M/G1, as its coordination with the other genes (specifically Sic1, Cdh1, Cdc14, and Swi5) is needed for mitotic exit (Prinz Susanne, 1999, Shirayama et al., 1999, Visintin et al., 1998). The synchrony of Sic1 and Cdh1 with Clb1,2 is further observed in their autocorrelations (Figure S12B), emphasizing the role of Clb1,2 in mitotic exit.
Figure 3
Dynamic Change of Gene Expressions during Cell Cycle Progression
(A) The activity profiles of Whi5, Cln1,2, Clb5,6, and Clb1,2 along the inferred pseudotime (with two cycles) obtained from the custom use of psupertime method. Colors annotate different cell cycle phases. The activity profiles of all nodes in the cell cycle GRC are shown in Figure S12A.
(B) A dynamic network view of cell cycle progression. The size of each node represents the level of gene activity, and the thickness of the each edge represents the activity of the regulation, defined as the proportion of models wherein the level of the regulator is higher than the threshold level of the edge.
Dynamic Change of Gene Expressions during Cell Cycle Progression(A) The activity profiles of Whi5, Cln1,2, Clb5,6, and Clb1,2 along the inferred pseudotime (with two cycles) obtained from the custom use of psupertime method. Colors annotate different cell cycle phases. The activity profiles of all nodes in the cell cycle GRC are shown in Figure S12A.(B) A dynamic network view of cell cycle progression. The size of each node represents the level of gene activity, and the thickness of the each edge represents the activity of the regulation, defined as the proportion of models wherein the level of the regulator is higher than the threshold level of the edge.Coordinated activation of a subset of genes and their interactions drive the cell cycle from one phase to the next along the cell cycle progression. Using the simulated data from the stable steady states of the 10,000 RACIPE models, we can visualize the dynamic changes of the cell cycle gene circuit (Figure 3B). For each cell cycle phase, the size of a node is scaled to the average activity level of the corresponding gene from all the models in that phase, whereas the thickness of an edge is scaled to the fraction of models where the activity of the regulator is larger than the threshold of the Hill function (see Section 2 in Transparent Methods for the mathematical equations). Remarkably, this representation provides a dynamic view of gene regulation along cell cycle progression. In early G1, Whi5 has high activity, which inhibits SBF, and consequently keeps Cln1,2 activity suppressed. An increased level of Cln3 suppresses Whi5 and enables SBF release. The increased level of SBF activates Cln1,2, which induces the transition to mid G1. Furthermore, high level of Cln1,2 inhibits Whi5 and Sic1, which transitions the cell to enter late G1. This positive feedback of Cln1,2 behaves like a switch. This feedback mechanism is known as the START checkpoint, whose activation makes the cells commit to the S phase (DNA synthesis) (Johnston et al., 1977, Skotheim et al., 2008). In contrast to models wherein one network is associated with one state, here we show that a dynamic network model can describe the sequential changes of the gene regulation in all the cell cycle phases.
Oscillatory Dynamics Specify the Directionality of the Cell Cycle State Transitions
In the previous sections, we analyzed the steady states of RACIPE models and associated them with various cell cycle phases. However, the analysis so far does not explain the direction of transitions between these phases. Here, we explore how the state transitions can be understood from the oscillatory states of RACIPE models.To facilitate the statistical analysis on oscillations, we obtained 47,637 oscillatory states from 1,60,000 RACIPE models (10,000 models from the prior analysis plus 1,50,000 models generated anew). For each oscillation, we projected the trajectory to the first two PCs of the steady-state data (same PCs as in Figure 1D). Some of the projected trajectories reside in one or a very few cell cycle phases, whereas some others can span many phases (Figure 4A). To characterize the transitions through various cell cycle phases along an oscillatory trajectory, we used the previously trained neural network model to predict the cell cycle phase for any given gene activity profile along an oscillatory trajectory. This allows us to identify the patterns of state transitions for each oscillation. For example, two oscillations are shown in Figure 4A: the one on the left shuffles back and forth between two phases (M → M/G1 → M), whereas the one on the right passes through all the phases (early G1 → mid G1 → late G1 → S → M → M/G1 → early G1). See Section 3 in Transparent Methods for details on detecting oscillatory states.
Figure 4
Statistical Analysis of Oscillatory Trajectories Elucidates Cell Cycle Dynamics
(A) Examples of the oscillatory trajectories projected on the first two principal components (PC1 and PC2) obtained from the stable states from 10,000 models. Left panel: Trajectory traveling only two cell cycle phases; right panel: Trajectory traveling all six cell cycle phases. An arrow on the projected trajectory marks the direction of oscillation.
(B) The histogram of the polarity of oscillatory trajectories of different sizes—all trajectories (n = 47,327) and trajectories spanning 3–6 phases (n = 3954), 4–6 phases (n = 686), and 5–6 phases (n = 147). The histograms are fitted by either two Gaussian distributions (first two columns, where the green curve is for the distribution with a positive mean and pink curve is for the distribution with a near zero mean) or a single Gaussian distribution (the last two columns). The horizontal dotted red line shows the zero polarity.
(C) Top panel shows eight major state transition patterns of the oscillatory trajectories. The number below each pattern indicates the number of observed oscillations. Bottom panel shows the histogram of polarity of each pattern.
Statistical Analysis of Oscillatory Trajectories Elucidates Cell Cycle Dynamics(A) Examples of the oscillatory trajectories projected on the first two principal components (PC1 and PC2) obtained from the stable states from 10,000 models. Left panel: Trajectory traveling only two cell cycle phases; right panel: Trajectory traveling all six cell cycle phases. An arrow on the projected trajectory marks the direction of oscillation.(B) The histogram of the polarity of oscillatory trajectories of different sizes—all trajectories (n = 47,327) and trajectories spanning 3–6 phases (n = 3954), 4–6 phases (n = 686), and 5–6 phases (n = 147). The histograms are fitted by either two Gaussian distributions (first two columns, where the green curve is for the distribution with a positive mean and pink curve is for the distribution with a near zero mean) or a single Gaussian distribution (the last two columns). The horizontal dotted red line shows the zero polarity.(C) Top panel shows eight major state transition patterns of the oscillatory trajectories. The number below each pattern indicates the number of observed oscillations. Bottom panel shows the histogram of polarity of each pattern.To quantify the directions of the oscillatory trajectories, we defined an angular-velocity-based metric, named polarity, to measure whether a projected oscillatory trajectory is clockwise or counterclockwise (see Section 4 in Transparent Methods). A positive polarity corresponds to a counterclockwise trajectory. Interestingly, the distribution of the polarity for all the projected trajectories is bimodal (Figure 4B, leftmost panel), which can be fit nicely by two Gaussian distributions, one with mean zero (pink curve) and the other with positive values for almost the whole distribution (green curve). However, if we considered the trajectories spanning four or more cell cycle phases, the distribution of the polarity can be fit by a single Gaussian distribution with only positive values (Figure 4B, two rightmost panels). This result suggests that the polarity for the trajectories of small size (i.e., spanning no more than three phases) is random, presumably because there is no restraint for the oscillations to travel. However, the polarity for the trajectories of large size (i.e., spanning four or more phases) is mostly positive. These oscillations only travel counterclockwise, consistent with the known sequence of cell cycle progression. Note that we found this consistent pattern of state transitions from randomly generated models because the gene circuit topology restrains the possible routes of state transitions. Figure 4C lists eight major recurring transition patterns involving four or more phases with positive polarity, i.e., all the corresponding oscillations travel in a counterclockwise direction. The polarity distributions of the other transition patterns are shown in Figure S15 and Table S7.
So far, we have shown that the cell cycle phases can be described by the stable steady states from the RACIPE models, whereas the directionality of the cell cycle progression can be deduced from the oscillatory states. Nevertheless, there is a remaining issue in this theory of directionality of cell cycle progression as follows. RACIPE models with random kinetic parameters (intrinsic factors for the regulatory events associated with the GRC genes) can be regarded as heterogeneous cells under different signaling or environmental conditions (extrinsic factors for regulatory events that are important to cellular functions but are not described explicitly in the small GRC, e.g., the copy number of RNA polymerase). This view is consistent with the following previous studies: (1) extrinsic factors, such as the copy number of RNA polymerase, have been shown to contribute to cell-to-cell variability and gene expression noise (Jones et al., 2014); (2) cell-to-cell heterogeneity has also been described by a population of single-cell models with different parameter values (production and degradation rates, in Llamosi et al., 2016). Thus, it makes sense that different models can correspond to different cell cycle phases. However, if the signaling conditions can be altered reversibly, the state transitions should also be reversible, which is not consistent with the fact that the cell cycle progression is irreversible. To address this issue, we propose the following two analyses to explain the irreversible state transitions in the cell cycle using the stable steady-state solutions from the RACIPE models.First, we show in the following that the irreversible state transitions can be inferred using the steady-state gene activity profiles and the topology of the cell cycle gene circuit. The approach is based on the fact that the activity of the targets lags behind the activity of the regulators. Thus, for a given transition from state 1 to state 2, the activity correlation of the regulators in state 1 and the targets in state 2 should be higher than the activity correlation of the regulators in state 2 and the targets in state 1. Therefore, we evaluate a so-called delayed correlation for the forward and backward transitions between state 1 and state 2 (see Section 6 in Transparent Methods). If the median delayed correlation is significantly different between the forward and backward directions, the state transition is likely to be irreversible. As shown in Figures 5A, 5B, and S16, the delayed correlations for the forward transitions of consecutive cell cycle phases are usually higher than those for the backward transitions, except for the M to M/G1 and M/G1 to early G1 transitions where both forward and backward correlations are comparable. Based on these delayed correlations, we derive a state transition model for cell cycle progression (Figure 5C). The transition paths are overall irreversible along the direction of cell cycle progression, and the two bidirectional paths could be related to the START point of cell cycle (Kraikivski et al., 2015). Note that this approach is generally applicable to infer the paths of state transitions for a GRC, examples of which are shown in Figure 6 for a repressilator circuit (see Section 10 in Transparent Methods) and an induced toggle switch circuit (see Section 11 in Transparent Methods).
Figure 5
The Irreversible Direction of the Cell Cycle Inferred from Stable Steady States
(A–C) A model for cell cycle progression inferred from the delayed correlations. (A) Densities of the forward (top) and backward (bottom) delayed correlations of gene expressions between successive cell cycle phases. Densities for all possible pairs are shown in Figure S16. The vertical dotted red line along zero correlation helps identify the skewness of the distributions. (B) Median delayed correlations for every two cell cycle phases. Medians along the direction of the cell cycle are predominantly larger (green), except for M to M/G1 and M/G1 to early G1 where the backward correlations are also comparable (red). (C) A model for cell cycle progression derived from the delayed correlations.
(D and E) A scheme for cell cycle progression inferred from parameter perturbation. (D) Left panel shows the parameter vectors of every one-state model in each cell cycle phase and oscillatory state projected on the first two principal components in the parameter space. Right panel shows the averages of the parameter vectors of all the one-state models in each cell cycle phase and oscillatory state. (E) Mean and standard deviation (SD) of the percent models in each cell cycle phase when simulated with the parameters perturbed toward the global parameter average (shown as mean ± SD). Perturbed models from one cell cycle phase diffuse to the next cell cycle phase with one exception, early G1. In early G1, the perturbed models diffuse to both mid G1 and M/G1.
Figure 6
A Method to Predict State Transitions Using the Steady-State Gene Activity Profiles of RACIPE Models and the Circuit Topology
(A) Schematic of the method. Left panel shows three expression/activity vectors of three states i, j, and k, respectively. A black line with an arrow head indicates an excitatory link, whereas a red line with a circle head indicates an inhibitory link, and s(links) is the sign vector expressing each interaction type (+1: excitatory, −1: inhibitory). Right panel illustrates the definition of the delayed correlation for forward transitions (thick black line pointed right) and backward transitions (black dotted line pointed left). Panels B to D show three applications of the delayed correlations to infer state transitions. (B) The cell cycle circuit (left) and delayed correlation-based transitions between six cell cycle phases (right). (C) A repressilator circuit (left) and predicted transitions between six clusters (CL1 ~ CL6) obtained from the RACIPE models. Details are in Figure S22. (D) A three-node induced toggle switch circuit (left) and predicted transitions between four clusters (CL1 ~ CL4). Details are in Figure S25.
The Irreversible Direction of the Cell Cycle Inferred from Stable Steady States(A–C) A model for cell cycle progression inferred from the delayed correlations. (A) Densities of the forward (top) and backward (bottom) delayed correlations of gene expressions between successive cell cycle phases. Densities for all possible pairs are shown in Figure S16. The vertical dotted red line along zero correlation helps identify the skewness of the distributions. (B) Median delayed correlations for every two cell cycle phases. Medians along the direction of the cell cycle are predominantly larger (green), except for M to M/G1 and M/G1 to early G1 where the backward correlations are also comparable (red). (C) A model for cell cycle progression derived from the delayed correlations.(D and E) A scheme for cell cycle progression inferred from parameter perturbation. (D) Left panel shows the parameter vectors of every one-state model in each cell cycle phase and oscillatory state projected on the first two principal components in the parameter space. Right panel shows the averages of the parameter vectors of all the one-state models in each cell cycle phase and oscillatory state. (E) Mean and standard deviation (SD) of the percent models in each cell cycle phase when simulated with the parameters perturbed toward the global parameter average (shown as mean ± SD). Perturbed models from one cell cycle phase diffuse to the next cell cycle phase with one exception, early G1. In early G1, the perturbed models diffuse to both mid G1 and M/G1.A Method to Predict State Transitions Using the Steady-State Gene Activity Profiles of RACIPE Models and the Circuit Topology(A) Schematic of the method. Left panel shows three expression/activity vectors of three states i, j, and k, respectively. A black line with an arrow head indicates an excitatory link, whereas a red line with a circle head indicates an inhibitory link, and s(links) is the sign vector expressing each interaction type (+1: excitatory, −1: inhibitory). Right panel illustrates the definition of the delayed correlation for forward transitions (thick black line pointed right) and backward transitions (black dotted line pointed left). Panels B to D show three applications of the delayed correlations to infer state transitions. (B) The cell cycle circuit (left) and delayed correlation-based transitions between six cell cycle phases (right). (C) A repressilator circuit (left) and predicted transitions between six clusters (CL1 ~ CL6) obtained from the RACIPE models. Details are in Figure S22. (D) A three-node induced toggle switch circuit (left) and predicted transitions between four clusters (CL1 ~ CL4). Details are in Figure S25.Second, we propose a putative model to explain how irreversible state transitions occur by changing the kinetic parameters. In the analysis, we consider seven groups of models—each of the first six groups corresponds to the models that allow only one stable steady state of the corresponding cell cycle phase and the seventh group corresponds to the models that generate oscillatory state spanning three or more cell cycle phases. We first performed PCA on the high-dimensional kinetic parameter profiles from the selected RACIPE models (# models—early G1: 635, mid G1: 689, late G1: 365, S: 721, M: 1,304, M/G1: 805, and limit cycles: 3,899). Unlike the gene activity profiles, the parameters from the different cell cycle phases and the oscillatory states overlap, and the magnitude of the parameters is not significantly different from each other (Figure 5D, left panel). However, if PCA is performed on the average parameter profiles of each group, the projected parameter profiles follow a similar spatial arrangement as the projected gene activity profiles (Figure 5D right panel). Interestingly, the average projected parameter profile for the oscillatory states is near the center and at the same time, close to the average projected parameter profile of all the models.This pattern persists when we considered only individual type of parameters (Figure S17). Next, starting from the models of a cell cycle phase, we perturbed the kinetic parameters and evaluated the changes in the model proportions across the cell cycle phases. We found that if the parameters are perturbed toward the global average, the models shift toward the next cell cycle phase for most cases, except for early G1 (Figures 5E and S18), in which models are increased in both mid G1 and M/G1. Similar observations were found if we perturbed the parameters toward the average of the oscillatory states (Figure S19), or toward the parameters of any randomly selected oscillatory model (Figure S20). Note that the pattern of diffusion of the models along the cell cycle direction is not observed when the kinetic parameters are perturbed randomly (Figure S21). For details of the parameter perturbation, see Section 12 in Transparent Methods. Our modeling results demonstrate that the irreversible state transition during the cell cycle progression can be achieved by relaxing the kinetic parameters toward the global average, the average of the oscillatory states, or a random model of the oscillatory states. This approach has also been successfully applied to a repressilator (details are in Section 10 in Transparent Methods, Figures S22–S24).In short, we have shown through two different approaches how irreversible state transitions can be derived from the steady states of random models. Our simulation approach also sheds light on possible mechanisms to achieve irreversible state transitions by changing the signaling states (represented here by the kinetic parameters) of the circuit.
Discussion
In this study, we used the budding yeast cell cycle as a model system to elucidate dynamic behaviors of GRCs using a computational systems biology approach. We generalized and applied our recently developed mathematical modeling algorithm, named random circuit perturbation (RACIPE), to analyze the dynamic features of a budding yeast cell cycle gene circuit using an ensemble of chemical rate equation models with random kinetic parameters. From an extensive statistical analysis on a large set of simulated data, we found that the clusters of steady-state gene activities can be associated with specific cell cycle phases, whereas the oscillatory states specify the path of the state transitions during the cell cycle. These cell cycle-specific clusters from circuit simulations are consistent with budding yeast single-cell RNA-seq data. Finally, we elucidate the mechanism of the irreversible state transitions using RACIPE-generated models. Our analyses demonstrate the role of the circuit topology in determining the cell cycle dynamics and its function. The unique combination of modeling and statistical data analysis can be a powerful approach to analyzing the behavior of complex dynamical systems in biology. Note that, because of the evolutionary conservation, most of the genes and interactions in the circuit of the budding yeast cell cycle have counterparts in mammalian systems (Table S6) (Bähler, 2005, Csikász-Nagy et al., 2006). Thus, the analysis, although focused on budding yeast, can be extended to study the mammalian cell cycle.Cell cycle is an ideal testing ground for mathematical modeling, because (1) there are rich datasets and literature on gene regulation in cell cycle, (2) substantial efforts have been made to model the dynamics of cell cycle GRCs, and (3) cell cycle involves transitions of multiple cell cycle phases. Thus, the dynamics are non-trivial. Elucidating cell cycle gene regulation can contribute to a better understanding of the role of the cell cycle in sundry biological processes, such as cell differentiation (Dalton and Coverdell, 2015, Jakoby and Schnittger, 2004, Li and Kirschner, 2014, Myster and Duronio, 2000, Ruijtenberg and van den Heuvel, 2016, Soufi and Dalton, 2016), DNA damage (Branzei and Foiani, 2008, Hustedt and Durocher, 2017, Murray and Carr, 2018, Shaltiel et al., 2015) and metabolism (Kalucka et al., 2015, Kaplon et al., 2015, Roy et al., 2017), and in diseases, such as cancer (Collins et al., 1997, Icard et al., 2019, Sandal, 2002). Compared with previous modeling studies (Chen et al., 2004, Chen et al., 2000, Csikász-Nagy et al., 2006, Gerard and Goldbeter, 2009, Li and Wang, 2014, Li et al., 2004, Lv et al., 2015), we have generated a large number of models and performed extensive data analysis on all these models. Our results illustrate that models with different sets of parameters can be associated with either one of the cell cycle phases and checkpoints, or the oscillatory dynamics across multiple cell cycle phases. Strikingly, we show that the cell cycle dynamics can be well explained even without carefully selected kinetic parameters, again demonstrating the crucial role of the cell cycle gene circuit topology, other than the detailed kinetic parameters, in determining the circuit's function.To study the cell cycle gene circuit, we have improved the original RACIPE method (Huang et al., 2017) and further developed statistical methods to analyze the simulated data from a large set of models. Primarily, RACIPE is generalized to model three additional types of regulation (phosphorylation, signaling, and degradation) in addition to transcriptional regulation. Furthermore, the method is extended for high throughput analysis of not only stable steady states but also oscillatory dynamics. To carefully characterize different dynamical features from many models, we developed numerical algorithms to automatically detect oscillatory trajectories and characterize the state transition patterns using machine learning. The advances in RACIPE enable us to identify robust dynamical features of the cell cycle directly from randomly generated models.An important question we focused on is to understand how the cell cycle gene circuit controls the irreversible state transitions. As shown in our simulated data, we have investigated it from three different aspects. First, we systematically mapped the patterns of state transitions from all the RACIPE-generated oscillatory trajectories. From this, we identified not only a consistent directionality of the state transitions but also recurring transition patterns. The statistical analysis indicates the crucial role of the circuit topology, instead of a specific set of kinetic parameters, in determining the irreversible nature of the state transitions during the cell cycle. Second, we developed a numerical method to infer the directions of state transitions directly from the steady-state gene expression/activity profiles and the circuit topology. Here we achieved this by considering that the expression of the targets occurs later than the activation of the regulators and employed a metric based on delayed correlations. We have applied this approach on several classic GRCs, including the cell cycle circuit, and successfully reproduced the patterns of state transitions. The success of this method demonstrates again how GRC drives state transitions. Third, we examined the relationship between the cell cycle progression and the signaling states of the GRC by perturbing the model parameters belonging to each cell cycle phase and assessing how the models diffuse to the other phases. We perturbed the parameters either toward the global parameter average (i.e., making the parameters more balanced) or relaxed them toward the limit cycle parameters. We found that the perturbed models transition to the next cell cycle phase in both cases. This finding suggests that an irreversible state transition of the cell cycle would occur by systematically changing the signaling state of the circuit.Overall, we have shown the power of RACIPE in gene circuit modeling. The paths of the cell cycle state transitions that we inferred from either the delayed correlations or the parameter perturbations suggest that an ensemble of random models provides a holistic view of the dynamical behaviors identified from previous cell cycle models, including state switching (Chen et al., 2004, Chen et al., 2000, Csikász-Nagy et al., 2006), oscillations (Csikász-Nagy et al., 2006, Gerard and Goldbeter, 2009, Li and Wang, 2014, Lv et al., 2015), and G1 global attractor (Li et al., 2004). In particular, irreversible state transitions during cell cycle progression have been elucidated in previous studies by various approaches, including those using the directionality of an oscillatory trajectory (Chen et al., 2004, Chen et al., 2000, Csikász-Nagy et al., 2006), phase transitions in a bifurcation from steady states to oscillatory states (Csikász-Nagy et al., 2006, Gerard and Goldbeter, 2009), and the directionality of flux in a landscape view of a detailed cell cycle circuit (Li and Wang, 2014) and a reduced state network model (Lv et al., 2015). In contrast, our approach does not require specifying the kinetic parameters of the GRC model but derives the state transition patterns purely from statistical analysis of an ensemble of ordinary differential equation (ODE) models with randomly generated kinetic parameters. Using an extensive statistical analysis of the oscillatory states from a large number of random models, we found that the irreversible directionality of cell cycle state transitions was only observed in large oscillatory trajectories, whereas the directionality is random for small ones. Moreover, using the two new analyses based on delayed correlations and parameter perturbations, our approach explains not only the global irreversible state transitions along cell cycle progression but also the local reversible transitions before the start checkpoint. Our combined mathematical modeling and statistical analysis strengthens the view that cell cycle progression is driven by interplay between steady states and oscillatory dynamics.
Limitations of the Study
We propose a generalized mathematical modeling approach to computing the dynamics of a GRC and an analytical framework to find the states of the GRC and the direction of the state transitions. Although we did not use a detailed description in the mathematical formula for modeling the dynamics of a GRC (see Equation S1 ~ Equation S7 in Section 2 in Transparent Methods), the modeling still works reasonably well in finding the states of cell cycle GRC and a few other circuits. There are still several issues that need to be addressed in the future. First, the state assignment is currently based on the hierarchical clustering analysis of the steady state activity profiles from the random models. The quality and robustness of the clustering might affect the performance of the subsequent statistical analysis. For instance, we have noticed a few distinct features for the M/G1 phase, which might suggest a hybrid nature of the phase. From the performance of neural net model in assigning phases, we observed that the M/G1 phase was incorrectly assigned to the early G1 and M phases, each being at 7.59% and 6.62%, respectively, whereas the early G1 phase was incorrectly assigned to M/G1 by 11.99%. This state assignment problem for M/G1 suggests that the gene activities of M/G1 are similar to those of both the early G1 and M phases. In addition, we know that the mitotic exit network gets activated during the M/G1 phase (Prinz Susanne, 1999, Shirayama et al., 1999, Visintin et al., 1998), which makes the activity profile of the dividing cells very similar to that of early G1 (shown in Figure 3B). Second, although we have tested several cell cycle circuits with minor variations, a systematic approach for circuit refinement is needed for better modeling. One potential approach is to apply RACIPE on different versions of a circuit by adding/removing genes/links and analyzing the circuits iteratively. During the process, the modeling results from different circuits can be compared against literature or genomics data (such as gene expression data). Thus, RACIPE can not only be used for modeling existing gene circuits but also can be utilized in circuit refinement.
Resource Availability
Lead Contact
Mingyang Lu, Assistant Professor, The Jackson Laboratory, Bar Harbor, ME 04609, Email: mingyang.lu@jax.org
Materials Availability
This study does not generate any new unique reagents.
Data and Code Availability
The generalized RACIPE is available on GitHub at https://github.com/arkatebi/cellcycle.
Methods
All methods can be found in the accompanying Transparent Methods supplemental file.
Authors: Christopher A Jackson; Dayanne M Castro; Richard Bonneau; David Gresham; Giuseppe-Antonio Saldi Journal: Elife Date: 2020-01-27 Impact factor: 8.140