Feng Jiao1, Chunjuan Zhu2. 1. Center for Applied Mathematics, Guangzhou University, Guangzhou, P. R. China; College of Mathematics and Information Sciences, Guangzhou University, Guangzhou, P.R. China. 2. Modern Business and Management Department, Guangdong Construction Polytechnic, Guangzhou, P. R. China; College of Mathematics and Information Sciences, Guangzhou University, Guangzhou, P.R. China. Electronic address: cjzhuhappy@sina.com.
Abstract
The phenomenon of a gene expression burst has been attributed to random transitions between the active and inactive states of a gene. However, the mechanisms underlying regulation of the activation process in response to environmental changes remain unclear. Here, we model gene activation as a consequence of the competitive cross talk between a weak basal pathway and an inducible signaling pathway and reveal rich expression dynamics along with intricate dependence of noise and Fano factor on mean expression levels. These theoretical results are in good agreement with a large experimental data set in Escherichia coli, yeast, and mammalian cells. Furthermore, both theoretical analyses and supporting biological evidence converge to demonstrate the existence of a tradeoff that governs the sharp up- and downregulation of gene expression, suggesting an ordered scenario that activates a gene under varying conditions. These regulation modes, together with cross talk pathways, may provide new guidance for the analysis and interpretation of genetic data in various applications, ranging from genetic engineering to therapeutic targets of disease.
The phenomenon of a gene expression burst has been attributed to random transitions between the active and inactive states of a gene. However, the mechanisms underlying regulation of the activation process in response to environmental changes remain unclear. Here, we model gene activation as a consequence of the competitive cross talk between a weak basal pathway and an inducible signaling pathway and reveal rich expression dynamics along with intricate dependence of noise and Fano factor on mean expression levels. These theoretical results are in good agreement with a large experimental data set in Escherichia coli, yeast, and mammalian cells. Furthermore, both theoretical analyses and supporting biological evidence converge to demonstrate the existence of a tradeoff that governs the sharp up- and downregulation of gene expression, suggesting an ordered scenario that activates a gene under varying conditions. These regulation modes, together with cross talk pathways, may provide new guidance for the analysis and interpretation of genetic data in various applications, ranging from genetic engineering to therapeutic targets of disease.
A central question in biology has been to understand how genes in single cells respond to environmental changes. Recent technological advances have generated massive amounts of data on expression dynamics and noise. Notoriously, achieving a theoretical fit to these data requires synchronous modulation of gene activation and inactivation, messenger RNA synthesis, and subsequent feedback in mathematical models. We integrated a typically neglected weak basal pathway into the classical two-state model, which competes with the signaling pathway to activate a gene. We revealed rich cross talk regulations of the two pathways that show good agreement with experimental data. This simple framework provides a new, to our knowledge, perspective on the factors that modulate gene activation, which may open new avenues for future biophysical studies.
Introduction
Gene expression activation in single cells can be considered a random process. In virtually all genomic loci, messenger RNA (mRNA) and protein molecules in active genes are synthesized in pulsatile bursts in which each episode of expression activation is interrupted by a period of inactivation of the gene, or an “off” period (1, 2, 3). In the classical two-state model (3, 4, 5, 6), such bursting behavior is suggested to originate from random transitions between a gene’s on (active) and off (inactive) states. As shown in the diagram,the system is determined by the first-order kinetic rates λ for gene activation, γ for inactivation, v for mRNA synthesis in the active state, δ for mRNA degradation, and v and δ for the birth and death of protein molecules, respectively.This two-state model (Eq. 1) implicitly assumes a single pathway to direct gene activation under environmental changes. However, for a large class of inducible genes, activation is usually mediated by two scenarios. First, the weak and spontaneous basal pathway maintains basal expression under normal cellular growth conditions (7, 8, 9). Second, when cells receive external cues, the downstream transcription factors (TFs) are activated by specific signal transduction pathways (10, 11, 12). In turn, these specific TFs are forced to compete with other TFs in the basal pathway for binding at target DNA sites to direct the assembly of the active transcription complex (13, 14, 15). Therefore, cross talk between these two competitive scenarios could be generated to initiate gene expression.Here, we propose a model that integrates these two cross talking scenarios into the classical two-state model (Eq. 1; 16), based on the multiple-pathway framework (17).Recent studies have shown that two cross talk pathways are more likely to generate a bimodal distribution (16) and high noise (18) in gene transcription levels compared with a single signaling pathway in the classical two-state model (Eq. 1). However, deeper understanding of the mechanisms regulating the cross talk between pathways is lacking. For instance, it remains unclear whether the cross talk of pathways directs more intricate mRNA and protein production than considered thus far according to intuitive perceptions, and the mechanism by which parameter rates in pathways are regulated in response to different external signals has yet to be clarified. Similar questions regarding the two-state model have previously been addressed by combining the model with comprehensive statistical data on mean gene expression levels, noise (variance over mean squared), and Fano factor (variance over mean) (19, 20, 21, 22, 23, 24, 25). Here, we highlight the importance of the neglected basal pathway in the classical two-state model, which helps to uncover a large spectrum of cross talking regulation modes. This new, to our knowledge, perspective can shed light on interpretations of large sets of experimentally obtained gene expression data under varying conditions.
Methods
As shown by the diagram,genes can be activated by two competitive pathways: the weak and stable basal pathway with strength rate λ1 or the stronger and inducible signaling pathway with strength rate λ2, satisfyingWe refer to a gene’s off state as O1 if it is transformed to the on state by the basal pathway and O2 otherwise. Then, the pathway selection probabilities are denoted bysatisfyingThe basal pathway is regulated independently by a default or spontaneous mechanism, so its strength rate λ1 does not vary significantly under a changing signal (10,26). The inducible activation rate λ2 of the signaling pathway is governed by accessibility of the DNA-binding sites, which can be modified based on chromatin structure and the binding strength between the TFs and DNA-binding sites (12,17).The activation of each target gene is ultimately mediated through the binding of downstream TFs in the basal pathway or signaling pathway at the cognate DNA sites in the gene promoter or enhancer regions (13,17). Accordingly, the selection probabilities q1 and q2 quantify the degree of competition between the two pathways to form the corresponding TF-DNA binding configurations. Therefore, the values of q1 and q2 may be approximated by the concentration and availability of activated TFs in each pathway (10,14). These cellular parameters are signal related in general, suggesting that a given gene may vary greatly in q2 (or q1) and λ2 depending on the type and level of signals received. Our calculation and analysis were performed in two main steps, which are outlined in detail in the Supporting Materials and Methods. First, we focus on the time-dependent average levels of mRNA and protein molecules generated by the cross talking pathways model described above (Eq. 2). Their exact forms, under arbitrary initial values, can be computed by solving a system of differential equations derived from master equations corresponding to the cross talking pathways model. Use of these formulas enabled detailed mathematical analysis of the modulation of temporal profiles of mean expression levels, along with the relationship of these profiles from the mRNA to protein level.
Second, we derive analytical formulas for determining the stationary expression mean, noise, and Fano factor (i.e., noise strength) generated by cross talking pathways. These formulas reveal two types of relationships: 1) different theoretical curves of noise and Fano factors against the mean expression level generated by varying single or multiple parameter rates and 2) synchronous variation of two parameters against the mean expression level derived through the reverse calculation of the formulas. Together with steady-state expression data under varying cellular conditions, these dynamics may help to uncover a large spectrum of regulation modes that cells utilize in response to environmental changes.
Results and Discussion
We first explore how the cross talking pathways model (Eq. 2) generates rich dynamics for average gene expression at both the mRNA and protein levels. We further reveal a cross talking regulation scenario with a good theoretical fit to a large data set of steady-state expression under varying conditions. Finally, we compare our cross talking pathways model (Eq. 2) with the classical two-state model (Eq. 1).
Three dynamical modes of mean expression
Under the classical framework, Hao and Baltimore (27) quantified gene expression in cultured mouse fibroblasts in response to stimulation with the cytokine tumor necrosis factor (TNF), and divided the 180 activated genes into three dynamical categories characterized by the peak and corresponding time mRNA accumulation level. They found that the stability of the mRNAs encoded by the three groups of genes played an unexpectedly key role in the overall expression dynamics, whereas other transcriptional control factors were determinants of the observed dynamics. Therefore, we were motivated to clarify the modulation of cross talking pathways on the dynamics of mean expression levels and to determine the interplay of this cross talk with mRNA stability to generate three distinct temporal gene-induction modes in mouse fibroblasts.We denote the probabilities of gene on state and gene off states O, j = 1, 2 as P(t) and P(t) at time t ≥ 0, respectively. The three probability functions are related by the conservation lawIn addition, we denote the time-dependent average levels of mRNA and protein molecules as m(t) and a(t), respectively. To characterize the dynamic profiles of m(t) and a(t) with almost no expression products at the initial time t = 0 (27,28), we assume that transcription starts from the gene off state and count only the newly produced mRNA and protein molecules. This gives the initial valuesUnder this condition, the exact forms of P(t), m(t), and a(t) are given as described by formulas 2.1–2.5 of the Supporting Materials and Methods. It can be rigorously shown that both m(t) and a(t) with the initial condition Eq. 3 either increase at all times or increase initially until reaching a peak uniquely at finite times and decrease thereafter.More precisely, we introduce a real number β > 0 given by Theorem 2.1 in the Supporting Materials and Methods and a threshold value Λ given byin which m(t) satisfies (18)and a(t) satisfies Theorem 2.1 in the Supporting Materials and Methods,The theoretical result Eq. 5 suggests a plausible regulation mode for m(t) dynamics by varying the selection probability q2. We first noticed that gene inactivation events occur more frequently than mRNA degradation events among thousands of eukaryotic genes evaluated from yeast, mouse, and human cells (29). Therefore, γ > δ, so δ = min{δ, γ} in Eq. 5. Note that Λ, given by Eq. 4, decreases with respect to q2. The equation Λ = δ is then solved to determine the critical value q2 = Λ such that Λ ≥ δ if q2 ≤ Λ and Λ < δ if q2 > Λ. Therefore, under the assumption γ > δ, Eq. 5 can be rewritten asWhen δ is very small (δ < λ1), Eq. 7 indicates that Λ > 1, and m(t) takes on a gradual growth pattern. Conversely, if δ is relatively large (δ > λ2), then Λ < 0, and m(t) takes on an up-and-down dynamical profile. For the median scenario in which δ falls into the region (λ1, λ2), then Λ ∈ (0, 1) so that q2 is enhanced across Λ, resulting in a switch of m(t) dynamics from a monotonic increase to nonmonotonic behavior.Here, we use numerical examples to demonstrate how cross talking pathways cooperate with mRNA stability to generate distinct expression dynamics in three groups of mouse fibroblast genes after TNF stimulation based on the data of Hao and Baltimore (27). Group I genes are those that peaked quickly within 0.5 h after stimulation, and then their transcript expression levels sharply decreased, often reaching the baseline value or lower, by 2 h after stimulation. The half-life of group I mRNA transcripts varied from 0.2 to 0.9 h. Thus, we set a median half-life of 0.57 h and the corresponding degradation rate as δ = log(2)/0.57 ≈ 1.22 h−1. Group II genes are those that showed a continual increase in transcription levels up to 2 h after induction and did not decrease sharply thereafter. These mRNAs were relatively more stable, with half-lives varying from 1 to 8 h. Thus, we set δ = 0.14 h−1 so that the half-life has a median value of 4.85 h. Group III mRNAs are those that also gradually increased in abundance but did not reach peak levels during the observation period. These mRNAs were very stable, with half-lives longer than 8 h. Therefore, we set the half-life of this group to 12 h, corresponding to δ = 0.0578 h−1.We established a protocol for estimating parameters in the simulation to capture three main features of transcription dynamics. To guarantee the up-and-down dynamics in groups I and II, we take λ1 < δ < γ and maintain q2 above the critical value Λ, according to Eq. 7. To simulate the sharp transcriptional variation observed in group I, we take a large δ to reach the steady state within a short time period, use large γ and small λ1 to push the stationary value below the baseline, and set large v- and q2λ2-values to generate a steep up-and-down curve that is balanced with the large δ- and γ-values to maintain a high peak (see Remark 2.1 in the Supporting Materials and Methods). Similarly, to fit the more gradual dynamic curves in group II, we set δ, γ, v, and q2λ2 to be relatively small compared with those in group I. To generate the growth dynamics in group III, we use λ2 > δ and set q2 below the critical value Λ given in Eq. 7. In addition, we use v and q2λ2 to modulate the growth rate in the early time period. In general, the selection of parameters for group III can be less restricted compared with that used for parameters in groups I and II owing to the disappearance of the transcription level peak in group III.The three curves of the average mRNA level m(t), based on the protocol above to fit representative data in the three groups, are shown in Fig. 1
a. Integrating the birth rate v and death rate δ of a protein naturally gives rise to an interesting question: how does m(t) transit its dynamical profile to the protein level a(t)? Using Eqs. 5 and 6, we analytically show that δ dominates this transition (see Remark 2.2 in the Supporting Materials and Methods). The monotonic growing pattern of m(t) is always maintained in a(t); the up-and-down dynamics of m(t) remains in a(t) when δ > Λ but switches to the growing pattern once δ ≤ Λ. To illustrate this conclusion, we utilize parameter rates for generating the three dynamical modes of m(t) in Fig. 1
a and further compute a(t) under different δ-values. As shown in Fig. 1
b, when δ is relatively large, the variation rates for m(t) and a(t) approach the direct ratio (see Remark 2.2 in the Supporting Materials and Methods), indicating that dynamical features remain intact from the mRNA to protein level. By contrast, when δ decreases, the growth curve in group III is maintained, whereas the up-and-down dynamics in groups I and II is gradually weakened and finally switches to the growing pattern at the protein level only if δ ≤ Λ.
Figure 1
Three dynamical modes at the mRNA and protein levels. (a) Cross talk pathways modulate transcription dynamics in three groups of fibroblast genes after TNF stimulation (27). The red line represents the fit of data in group I (red diamonds) with v = 400, δ = 1.22, λ1 = 0.03, λ2 = 300, γ = 90 (h−1), and q2 = 0.925. The green line represents the fit of data in group II (green squares) with v = 45, δ = 0.14, λ1 = 0.03, λ2 = 15, γ = 1.9 (h−1), and q2 = 0.65. The blue line represents the fit of data in group III (blue triangles) with v = 50, δ = 0.0578, λ1 = 0.01, λ2 = 1, γ = 0.116 (h−1), and q2 = 0.1. (b) Protein degradation rate δ dominates the transition of dynamical modes from the mRNA to protein level. Left: when δ is relatively large, the mRNA dynamic features remain intact at the protein level. Right: when δ decreases, the growth curve in group III always maintains its expression, whereas the up-and-down curves in groups I and II switch to the growing pattern at the protein level once δ < Λ. To see this figure in color, go online.
Three dynamical modes at the mRNA and protein levels. (a) Cross talk pathways modulate transcription dynamics in three groups of fibroblast genes after TNF stimulation (27). The red line represents the fit of data in group I (red diamonds) with v = 400, δ = 1.22, λ1 = 0.03, λ2 = 300, γ = 90 (h−1), and q2 = 0.925. The green line represents the fit of data in group II (green squares) with v = 45, δ = 0.14, λ1 = 0.03, λ2 = 15, γ = 1.9 (h−1), and q2 = 0.65. The blue line represents the fit of data in group III (blue triangles) with v = 50, δ = 0.0578, λ1 = 0.01, λ2 = 1, γ = 0.116 (h−1), and q2 = 0.1. (b) Protein degradation rate δ dominates the transition of dynamical modes from the mRNA to protein level. Left: when δ is relatively large, the mRNA dynamic features remain intact at the protein level. Right: when δ decreases, the growth curve in group III always maintains its expression, whereas the up-and-down curves in groups I and II switch to the growing pattern at the protein level once δ < Λ. To see this figure in color, go online.
Regulation of signaling pathways under varying conditions
There has been tremendous effort put forward to understand how genes respond to changing signals at the single-cell level (5,25), resulting in massive steady-state data on average expression levels E, the Fano factor (or noise strength) ϕ, and the noise η for both mRNA and protein copy numbers (19, 20, 21, 22, 23, 24). These data, when mapped as scattered points onto an E-ϕ or E-η plane, provide a statistical basis for fitting trend lines of ϕ and η against E through mathematical models (25). Motivated by these pioneering studies, we were interested in understanding how cross talking pathways influence the curves of ϕ and η vs. E. More importantly, by fitting experimental data, we sought to reveal the cross talking modulation mode of pathways in response to environmental changes.We first obtained analytical forms for the stationary mean level E, Fano factor ϕ, and noise η generated by our cross talk pathways model (Eq. 2) (see formulas 2.15, 2.16, and 3.6–3.8 in the Supporting Materials and Methods). The results showed that all first-order kinetic rates can be nondimensionalized by dividing by the mRNA degradation rate δ. Thus, we maintain and assume dimensionless parameters in further discussions of gene expression at the steady state. Here, we focus on the impact of varying the selection probability q2 and the strength rate λ2 in the signaling pathway to evaluate gene expression regulation specifically in response to external signals.When q2 varies, η always decays for E, as shown in Fig. 2
a. To reveal the nonlinear dependence of ϕ on E as observed in (18), the theoretical framework suggests the need for either a large inactivation rate γ or weak strength λ1 in the basal pathway (see Remark 3.1 in the Supporting Materials and Methods). As shown in Fig. 2
a, the variation of q2 indeed forces ϕ to increase for small E and to decay downward for large E, in a similar fashion as observed by varying γ in the two-state model (Eq. 1) (see Theorem 3.1 in the Supporting Materials and Methods). However, this mode changes when λ2 is varied while λ1 is kept relatively small or γ is relatively large. As shown in Fig. 2
b, variation of λ2 modulates both ϕ and η to first decay downward and then to increase for E, which cannot occur by varying only any single parameter in the two-state model (Eq. 1) (see Theorem 3.1 in the Supporting Materials and Methods). Furthermore, when λ1 continues to become smaller or γ continues to increase, this down-and-up curve is maintained for η vs. E but switches to the growing curve for ϕ vs. E (Fig. 2
b).
Figure 2
Cross talking pathways modulate the dependence of the Fano factor and noise on the mean expression level. (a) Variation of q2 generates an up-and-down Fano factor against the expression level when γ is relatively large or λ1 is relatively small and always generates decaying noise against the expression level. (b) Variation of λ2 generates down-and-up curves for both the Fano factor and noise against the expression level when γ is relatively large or λ1 is relatively small. The dimensionless parameters γ and λ1 are embedded in the figures; δ = v = δ = 1, v = 500, and λ2 = 15 in (a), and q2 = 0.8 in (b). To see this figure in color, go online.
Cross talking pathways modulate the dependence of the Fano factor and noise on the mean expression level. (a) Variation of q2 generates an up-and-down Fano factor against the expression level when γ is relatively large or λ1 is relatively small and always generates decaying noise against the expression level. (b) Variation of λ2 generates down-and-up curves for both the Fano factor and noise against the expression level when γ is relatively large or λ1 is relatively small. The dimensionless parameters γ and λ1 are embedded in the figures; δ = v = δ = 1, v = 500, and λ2 = 15 in (a), and q2 = 0.8 in (b). To see this figure in color, go online.The changes in decaying noise and up-and-down Fano factor against the mean expression level are consistent with data from experiments in actual cells as reported by (19, 20, 21). In a recent study in yeast, Carey et al. (19) inserted the ZRT1 promoter, which is a target of the TF Zap1, upstream of the YFP-coding sequence and further enhanced the concentration and binding kinetics of Zap1 in the signaling pathway by decreasing zinc induction levels. As shown in Fig. 3
a, the mean YFP expression level and noise clustered into a decaying trend line. Similarly, So et al. (20) and Jones et al. (21) quantified transcription levels for a large number of Escherichia coli promoters under varying conditions and revealed general up-and-down trend lines for the Fano factor versus mean level (Fig. 3
b). The activity of these promoters was demonstrated to be regulated by competitive DNA binding of the repressor LacI and several activators such as CRP (20,21). Therefore, we used repressors as the downstream TFs in the weak basal pathway and activators as TFs for modulating the signaling pathway. Indeed, our theoretical curves generated by changing the selection probability q2 of the signaling pathway are in good agreement with the empirical data (Fig. 3, a and b; Table S1). This indicates that the modulation of q2, controlled by the corresponding TF concentration and binding kinetics, plays a pivotal role in gene expression regulation. To further understand the extent to which the other parameter λ2 in the signaling pathway is modulated, we reversed each data point in Fig. 3
b to a pair of q2 and λ2 while keeping the remaining parameters stable (see Remark 3.2 in the Supporting Materials and Methods; Table S2). This revealed an interesting regulation mode: q2 is modulated when the expression level is low, whereas λ2 is modulated when the expression level is higher (Fig. 3
c).
Figure 3
Modulation of the signaling pathway in yeast and E. coli. (a) Fit of data for noise against mean YFP expression in yeast is shown. Blue triangles are the steady-state data from the ZRT1 promoter under different zinc concentrations (19). Red line is the theoretical curve, generated by varying the selected probability q2 of the signaling pathway. (b) Fit of the data for the Fano factor versus mRNA mean level in E. coli is shown. Markers represent steady-state data from the promoter P under different IPTG and cAMP concentrations (green circles) (20) and two constructed promoters under different repressor concentrations (red squares and blue triangles) (21). Solid lines (green, red, and blue) show theoretical curves generated by varying q2. (c) Markers represent extracted probability q2 and strength rate λ2 of the signaling pathway as functions of mean mRNA expression level from the data in (b). Solid lines show trend lines of markers characterized by variation of q2 at low mRNA levels and of λ2 at high mRNA levels. The parameter values are shown in Tables S1 and S2. To see this figure in color, go online.
Modulation of the signaling pathway in yeast and E. coli. (a) Fit of data for noise against mean YFP expression in yeast is shown. Blue triangles are the steady-state data from the ZRT1 promoter under different zinc concentrations (19). Red line is the theoretical curve, generated by varying the selected probability q2 of the signaling pathway. (b) Fit of the data for the Fano factor versus mRNA mean level in E. coli is shown. Markers represent steady-state data from the promoter P under different IPTG and cAMP concentrations (green circles) (20) and two constructed promoters under different repressor concentrations (red squares and blue triangles) (21). Solid lines (green, red, and blue) show theoretical curves generated by varying q2. (c) Markers represent extracted probability q2 and strength rate λ2 of the signaling pathway as functions of mean mRNA expression level from the data in (b). Solid lines show trend lines of markers characterized by variation of q2 at low mRNA levels and of λ2 at high mRNA levels. The parameter values are shown in Tables S1 and S2. To see this figure in color, go online.Down-and-up noise dynamics as a function of the mean expression level has also been observed in recent human immunodeficiency virus (HIV) studies, in which Dar et al. (22) and Dey et al. (23) inserted the HIV LTR promoter at random genomic locations of Jurkat T cells and obtained a large set of expression data. As shown in Fig. 4, a and b, a good theoretical fit to these data was achieved by simply changing the activation rate λ2 of the signaling pathway. The invariant rate λ1 reinforces the basal promoter activity as determined by HIV integration sites (9). However, variation of λ2 suggests that HIV gene expression can be strongly influenced by the local chromatin environment (9,22), which might be governed by different levels of chromatin accessibility across the genome (23). Thus, to further understand the modulating role of the chromatin environment, we reversed the data in Fig. 4
b with respect to variations of q2 and λ2 against mean expression levels (see Remark 3.2 in the Supporting Materials and Methods). In addition, we repeated this procedure under different values of the other parameter triplet (v, γ, λ1), as shown in Fig. 4
c. This procedure resulted in two main observations. First, the selection probability q2 maintains relatively high values, consistent with the estimated large q2 for the fitting data in Fig. 4, a and b (also see Table S1). This might be caused by the effect of TNF stimulation that induces related TFs (e.g., NFκB) to become nearly saturated in the nucleus of sorted GFP-positive cells (23,24). Second, variations of q2 and λ2 across the genome show robust behavior against different triplets (v, γ, λ1), with modulation of q2 at low expression and of λ2 when the expression level is higher.
Figure 4
Modulation of signaling pathways in mammalian cells. (a and b) Fit of data for noise and Fano factor against mean expression levels is given. Red circles represent steady-state data from the HIV LTR promoter inserted at random locations in the genome of Jurkat T cells (22,23). Blue lines show the theoretical fit generated by changing the strength rate λ2 of the signaling pathway. (c) Red circles represent extracted λ2 and selection probability q2 of the signaling pathway from the data in (b) under different values of the other parameters (v, γ, λ1). Blue lines show robust trend lines of red circles, indicating that q2 is modulated at low mRNA expression levels and λ2 is modulated at high expression levels. The parameter values are shown in Table S1. To see this figure in color, go online.
Modulation of signaling pathways in mammalian cells. (a and b) Fit of data for noise and Fano factor against mean expression levels is given. Red circles represent steady-state data from the HIV LTR promoter inserted at random locations in the genome of Jurkat T cells (22,23). Blue lines show the theoretical fit generated by changing the strength rate λ2 of the signaling pathway. (c) Red circles represent extracted λ2 and selection probability q2 of the signaling pathway from the data in (b) under different values of the other parameters (v, γ, λ1). Blue lines show robust trend lines of red circles, indicating that q2 is modulated at low mRNA expression levels and λ2 is modulated at high expression levels. The parameter values are shown in Table S1. To see this figure in color, go online.
Comparison with the classical two-state model
The proposed cross talking pathways model (Eq. 2) is, in principle, a generalized two-state model (Eq. 1) that assumes an additional basal pathway that triggers the process of gene activation. The basal pathway has been generally ignored in the traditional mathematical framework because of its very weak and stable strength rate λ1, leading to the assumption that a stronger signaling pathway induced by extracellular signals may dominate gene activation (10). However, this intuitive assumption may not be justified in consideration of the cross talk regulation of the two pathways. Consider the gene activation frequency Pon, which takes the form of (16,18)When the signaling pathway is very strong and frequently selected, the large λ2 gives q2/λ2 ≪ 1, whereas the balance of the small q1 and weak strength of λ1 gives q1/λ1 ≫ q2/λ2. This leads to Pon ∼λ1/q1, suggesting that the gene activation event may be dominated by the basal pathway even when the signaling pathway is efficiently induced. This observation shows that the basal pathway can play a non-negligible role in gene activation and thus may cooperate with the signaling pathway to produce expression features that cannot be captured in the model with a single pathway.For genes that are silenced under normal growth conditions (27,28,30), the average mRNA and protein levels derived from the two-state model increase for all time points t > 0 in response to external stresses and approach steady-state values as t → ∞ (see Remark 2.3 in the Supporting Materials and Methods). This suggests that the application of the two-state model is inadequate to capture the typical up-and-down expression dynamics for mouse fibroblast genes after TNF treatment (27); see Fig. 5
a. The monotonic increasing dynamic pattern of gene expression thus seems to be a robust feature induced by a single activation pathway. For instance, we can extend the two-state model by decomposing the gene activation into two independent sequential processes (31). This extension maintains a single pathway, and the mean expression level either increases monotonically or displays damped oscillatory dynamics (32). However, such oscillation behavior is almost invisible because of the rapid exponential decay and only slightly slows down the increase of the average level to an amplified steady-state value (32).
Figure 5
Fit of data through the classical two-state model (Eq. 1). (a) Dynamical curves of mean mRNA levels that continuously increase at all time points and cannot fit the nonmonotonic transcription dynamics of mouse fibroblast genes (27) are shown. (b) Decaying noise against the mean expression levels of the ZRT1 promoter in yeast can be best described by an increase in the activation rate λ (19). (c) Up-and-down Fano factor dynamics versus mean expression levels for E. coli promoters can be generated by decreasing the inactivation rate γ (20,21). (d) Down-and-up noise dynamics against mean expression levels for the HIV LTR promoter reveals the modulation of λ at low expression levels and of burst size v/γ when expression is higher (22,23). The embedded diagrams suggest that γ and the mRNA synthesis rate v increase synchronously to modulate the burst size and ultimately upregulate expression. Markers represent expression levels, as shown in Figs. 1, 3, and 4. Solid lines indicate the theoretical curves generated by the two-state model. The parameter values are shown in Table S3. To see this figure in color, go online.
Fit of data through the classical two-state model (Eq. 1). (a) Dynamical curves of mean mRNA levels that continuously increase at all time points and cannot fit the nonmonotonic transcription dynamics of mouse fibroblast genes (27) are shown. (b) Decaying noise against the mean expression levels of the ZRT1 promoter in yeast can be best described by an increase in the activation rate λ (19). (c) Up-and-down Fano factor dynamics versus mean expression levels for E. coli promoters can be generated by decreasing the inactivation rate γ (20,21). (d) Down-and-up noise dynamics against mean expression levels for the HIV LTR promoter reveals the modulation of λ at low expression levels and of burst size v/γ when expression is higher (22,23). The embedded diagrams suggest that γ and the mRNA synthesis rate v increase synchronously to modulate the burst size and ultimately upregulate expression. Markers represent expression levels, as shown in Figs. 1, 3, and 4. Solid lines indicate the theoretical curves generated by the two-state model. The parameter values are shown in Table S3. To see this figure in color, go online.The two-state model (Eq. 1) still provides a good fit to experimental data in the case of monotonically increasing dynamics, as shown in Fig. 5
a. In this case, we can rule out the cross talk of two pathways and conclude that gene activation is directed by a single pathway. However, this interpretation may require quantifying the stability of mRNA and protein molecules. From Eq. 7, a lower mRNA death rate δ results in a higher probability of increasing dynamics at the average level. To illustrate this more clearly, we introduceand rewrite Eq. 7 in the formThis suggests that encoded stable transcripts usually demonstrate a gradual increase in the average level, whereas less stable transcripts are more likely to exhibit nonmonotonic mean dynamics, as observed in experimentally with the TNF-induced transcription of cultured mouse fibroblasts (27). When the transcript is extremely stable at δ < λ1, its average level always presents increasing dynamics independently of how the signaling pathway is modulated. Such extreme cases may be achieved by tagging mRNA with the MS2-GFP fusion protein, which increases the transcript lifetime (30). Application of this approach produces increasing dynamics of transcript numbers in living E. coli cells, even if the upstream Plac,/ara promoter is highly activated through the cross talk of two pathways that separately converge on the repressor LacI and activator AraC (30). In contrast, other experimental approaches such as single-molecule fluorescent in situ hybridization and reverse transcription-quantitative polymerase chain reaction have demonstrated up-and-down dynamics for the mean transcription level of c-Fos induced through the MAPK signaling pathway (28). Given the significant impact of protein stability on the mean expression dynamics (Fig. 1
b), the discussion above suggests that the impact of cross talking pathways may be ruled out only when the gene is highly activated and the average level of encoded unstable transcripts displays monotonic growing dynamics.Nevertheless, the two-state model (Eq. 1) generates theoretical curves to fit the stationary data of noise and Fano factor versus mean expression level. To explain this phenomenon, consider the case of YFP expression from the ZRT1 promoter in yeast, in which the decaying trend of noise against the mean level can best be described by an increase in the gene activation rate λ (19); see Fig. 5
b. For the transcription of E. coli promoters, the general up-and-down Fano factor dynamics as a function of mean expression level can be achieved by decreasing the inactivation rate γ (Fig. 5
c; (20,21)). However, variation in any single parameter rate cannot generate the typical down-and-up noise dynamic against the mean expression level for the HIV LTR promoter across integration positions (22,23) (see Theorem 3.1 in the Supporting Materials and Methods). Two approaches of parameter estimation have been independently applied to identify the modulation mode: increasing λ accounts for suppressing noise at weaker expression loci, whereas increasing the burst size v/γ accounts for enhancing the noise at stronger expression loci (22,23); see Fig. 5
d. To separate the mRNA synthesis rate v from γ, we reversed the data points at stronger expression loci to variations of v and γ against the mean level (see Remark 3.2 in the Supporting Materials and Methods). This reveals a synchronous increase in both v and γ that upregulates gene expression; see Fig. 5
d.The good fit of both the two-state model (Eq. 1) and the cross talking pathway model (Eq. 2) to the same stationary data sets supports two alternative views. On the one hand, using the two-state model to fit the data from different promoters requires different regulation modes that alter mRNA synthesis, gene activation, and inactivation, as shown in Fig. 5. This implies that gene regulation in response to environmental changes is promoter specific because it may act on some genes individually, but not on others. On the other hand, the theoretical fit of all data through the cross talking pathway model (Figs. 3, a and b and 4, a and b) can be achieved only by modulating the signaling pathway in the gene activation process. Such a general regulation mode, as opposed to the promoter-specific scenario that may be dependent on promoter architecture, reflects global constraints such as the binding and unbinding of regulatory elements on the corresponding DNA-binding sites. This contrast between promoter nonspecific or promoter-specific regulation is maintained when considering the synchronous variation of the two system parameters (see Remark 3.2 in the Supporting Materials and Methods). Figs. 3
c and 4
c show that all data points can be reversed to an ordered regulation on the selection probability q1 and the strength rate λ2 of the signaling pathway. In contrast, using a two-state model to reverse these data results in different variation patterns for the combination of the gene activation rate λ and inactivation rate γ or the combination of γ and the mRNA synthesis rate v, which determines the burst size v/γ (Fig. 6).
Figure 6
Different regulation modes revealed by varying two parameters in the two-state model (Eq. 1). (a) Synchronous variation of the activation rate λ and inactivation rate γ is shown. To upregulate average mRNA expression, γ decreases for all promoters, whereas λ increases for the promoter P, decreases initially and then increases for promoters P and P5DL1, and increases initially and then decreases for the HIV LTR promoter. (b) Synchronous variation of γ and the mRNA synthesis rate v is shown. As the average mRNA level is increased, v initially decreases and then increases for P, P, and P5DL1 and increases for the HIV LTR promoter, and γ varies differently for all promoters. Markers are extracted parameters from transcription data obtained in (20,21,23). Solid lines are the trend lines of markers; other parameter values, δ = 1, P, P, P5DL1, and LTR promoters are set to v = 61, 14, 8, and 200 in (a), and λ = 0.01, 0.1, 0.05, and 0.2 in (b), respectively. To see this figure in color, go online.
Different regulation modes revealed by varying two parameters in the two-state model (Eq. 1). (a) Synchronous variation of the activation rate λ and inactivation rate γ is shown. To upregulate average mRNA expression, γ decreases for all promoters, whereas λ increases for the promoter P, decreases initially and then increases for promoters P and P5DL1, and increases initially and then decreases for the HIV LTR promoter. (b) Synchronous variation of γ and the mRNA synthesis rate v is shown. As the average mRNA level is increased, v initially decreases and then increases for P, P, and P5DL1 and increases for the HIV LTR promoter, and γ varies differently for all promoters. Markers are extracted parameters from transcription data obtained in (20,21,23). Solid lines are the trend lines of markers; other parameter values, δ = 1, P, P, P5DL1, and LTR promoters are set to v = 61, 14, 8, and 200 in (a), and λ = 0.01, 0.1, 0.05, and 0.2 in (b), respectively. To see this figure in color, go online.
Conclusions
The phenomenon of a burst of gene expression has been attributed to the random transition between active and inactive states of a gene (3,5). Here, we propose a new, to our knowledge, model for considering regulation of gene activation in response to environmental changes and external stimuli. Activation of a target gene is ultimately mediated through competitive DNA binding of different TFs in the promoter region, which may result in a large number of TF-DNA binding configurations (14,17). We have classified downstream TFs into two parallel pathways: the basal pathway regulated by a default or spontaneous mechanism (7, 8, 9) and the signaling pathway that is induced in response to external signals (10, 11, 12). These two pathways competitively converge onto their corresponding TF-DNA binding patterns to induce gene activation. This observation motivated us to extend the classical two-state model (Eq. 1) by assuming that a gene can be activated either by the basal pathway with strength rate λ1 and selection probability q1 or by the inducible signaling pathway with strength rate λ2 and selection probability q2 = 1 − q1. Our ultimate aim is to better understand how potential cross talk between these two competitive pathways modulates downstream gene expression.We first theoretically demonstrated that the dynamical profile of the mean expression level of a gene takes on either a monotonic growth or up-and-down pattern. We further revealed rich cross talk regulation modes on the mean expression dynamics and found a good theoretical fit to the three groups of genes classified according to their expression dynamics based on experimental observations for 180 TNF-induced mouse fibroblast genes (27). In particular, sharp up-and-down dynamics can be achieved by the tradeoff between a strong and frequently selected signaling pathway that upregulates expression to rapidly reach a high level and a rarely selected weak basal pathway that downregulates expression from its peak to the baseline level. To more explicitly demonstrate the nonmonotonic expression dynamics, we assumed that the gene inactivation rate is larger than the mRNA degradation rate, as manifested by empirical data for a large number of genes in several different types of eukaryotic cells (29). We derived a threshold λ given by Eq. 7 for the selection probability q2 of the signaling pathway. The mean expression level exhibited a growing pattern when q2 ≤ Λ and switched to the up-and-down dynamics once q2 was enhanced above Λ. Note that the model with a single pathway could not generate such nonmonotonic expression dynamics. Thus, our results suggest that natural selection may favor the cross talk of two parallel pathways for fulfilling two contradictory requirements: the rapid upregulation of gene expression to contribute to the adaptation to acute external stresses and subsequent repression to avoid overexuberant expression, which may have detrimental side effects, as observed in the innate immune system of insects when fighting pathogen invasions (33).Moreover, the stabilities of mRNA and protein molecules have a significant influence on the dynamic monotonicity of their average levels. First, our analytical results showed that the monotonic growing dynamics are consistently maintained from the mRNA to protein level. However, the up-and-down dynamics at the mRNA level can switch to the growing pattern at the protein level if, and only if, the protein stability remains above a certain threshold characterized by Eq. 4. Taking into account the long timescale of protein molecules for a large number of genes (29,34,35), this suggests that the average protein level may increase monotonically even if the transcription can be dynamically downregulated. Second, our theoretical results and supporting experimental evidence show that stable transcripts may mask downregulation governed by cross talking pathways and ultimately exhibit a gradually increasing average level (27,30). In contrast, less stable transcripts are more likely to exhibit nonmonotonic mean dynamics (27,28). This suggests that, for some efficiently induced genes, we may consider ruling out the cross talking regulation mechanism when the encoded transcripts are unstable while displaying a dynamically growing average level.We further revealed a general cross talking regulation scenario that may explain the stationary expression data of distinct promoters under varying conditions. That is, we have demonstrated different stationary curves of noise and Fano factor against the mean expression level simply by changing the selection probability q2 or strength rate λ2 of the signaling pathway. As an encouraging observation, our theoretical results were in good agreement with a large data set obtained from E. coli, yeast, and mammalian cells (19, 20, 21, 22, 23). This reveals that the regulation of q2 is the main contributor in the response to external signals and λ2 is strongly modulated across different chromatin environments. To gain deeper insight into the pivotal roles of q2 and λ2 in gene regulation, we focused on the transcription data for three E. coli promoters (20,21) and the HIV LTR promoter (23) and then reversed the data set to the variations of q2 and λ2 against mean expression. This process revealed a robust regulation mode in which q2 is modulated when the expression level is low, whereas λ2 is modulated when the expression level becomes higher. Such ordered regulation shows that the accumulation of signal-induced TFs primarily navigates gene activation more frequently through the stronger signaling pathway. Once the TF concentration and retention time reach the ceiling, the scenario switches to increasing the strength rate of the signaling pathway, such as by enhancing accessibility of DNA-binding sites hampered by the chromatin structure.The general regulation mode of the signaling pathway described above reflects a promoter nonspecific scenario governed by global cellular constraints such as the binding of regulatory elements on DNA sites. By contrast, using the two-state model to fit the steady-state data revealed a promoter-specific scenario, as different types of promoters corresponded to distinct regulation modes. Both scenarios may coexist to produce the final observed expression features (25); that is, the general regulation mechanism based on cross talk of pathways for gene activation may cooperate with other promoter-specific regulation mechanisms in response to external changes. The question then remains as to which regulation mode plays a dominant role and which has only a secondary effect on a given gene of interest. A plausible way to address this question may require measuring the expression dynamics with respect to the average level or overall distribution under different external induction levels. This could then allow for estimating which of the two regulation scenarios can better characterize the variation of dynamical profiles with respect to the induction strength. In particular, some typical dynamical features such as the up-and-down mean expression pattern (27,28) or long-lasting bimodal expression distribution (10,30) may point to the presence of cross talk regulation (see Fig. 1
a; (36,37)).The cross talk regulation mode of the two pathways presented herein cannot generate different stationary noise for the same average level of expression (19). In addition, this model cannot reveal mean expression dynamics characterized by more than one peak or with oscillation behavior (11,38). To capture these features, future work is required to implement two extensions of the cross talking pathway model (Eq. 2): 1) integrating a suitable number of parallel signaling pathways (14,17) or 2) decomposing the signaling pathway into several sequential processes (31,32).
Author Contributions
F.J. and C.Z. designed and performed the study. F.J. wrote the manuscript.
Authors: Ian B Dodd; Keith E Shearwin; Alison J Perkins; Tom Burr; Ann Hochschild; J Barry Egan Journal: Genes Dev Date: 2004-02-01 Impact factor: 11.361