Literature DB >> 35143483

Slow nucleosome dynamics set the transcriptional speed limit and induce RNA polymerase II traffic jams and bursts.

Robert C Mines1, Tomasz Lipniacki2, Xiling Shen1,3,4.   

Abstract

Nucleosomes are recognized as key regulators of transcription. However, the relationship between slow nucleosome unwrapping dynamics and bulk transcriptional properties has not been thoroughly explored. Here, an agent-based model that we call the dynamic defect Totally Asymmetric Simple Exclusion Process (ddTASEP) was constructed to investigate the effects of nucleosome-induced pausing on transcriptional dynamics. Pausing due to slow nucleosome dynamics induced RNAPII convoy formation, which would cooperatively prevent nucleosome rebinding leading to bursts of transcription. The mean first passage time (MFPT) and the variance of first passage time (VFPT) were analytically expressed in terms of the nucleosome rate constants, allowing for the direct quantification of the effects of nucleosome-induced pausing on pioneering polymerase dynamics. The mean first passage elongation rate γ(hc, ho) is inversely proportional to the MFPT and can be considered to be a new axis of the ddTASEP phase diagram, orthogonal to the classical αβ-plane (where α and β are the initiation and termination rates). Subsequently, we showed that, for β = 1, there is a novel jamming transition in the αγ-plane that separates the ddTASEP dynamics into initiation-limited and nucleosome pausing-limited regions. We propose analytical estimates for the RNAPII density ρ, average elongation rate v, and transcription flux J and verified them numerically. We demonstrate that the intra-burst RNAPII waiting times tin follow the time-headway distribution of a max flux TASEP and that the average inter-burst interval [Formula: see text] correlates with the index of dispersion De. In the limit γ→0, the average burst size reaches a maximum set by the closing rate hc. When α≪1, the burst sizes are geometrically distributed, allowing large bursts even while the average burst size [Formula: see text] is small. Last, preliminary results on the relative effects of static and dynamic defects are presented to show that dynamic defects can induce equal or greater pausing than static bottle necks.

Entities:  

Mesh:

Substances:

Year:  2022        PMID: 35143483      PMCID: PMC8865691          DOI: 10.1371/journal.pcbi.1009811

Source DB:  PubMed          Journal:  PLoS Comput Biol        ISSN: 1553-734X            Impact factor:   4.475


Introduction

Over the last 20 years, there has been a significant effort to explain stochasticity in molecular pathways focusing especially on the regulation of transcription and translation [1-6]. Most of the theoretical studies were concentrated on regulation of gene activity by means of binding and dissociation of transcription factors. However, a wide variety of assays have been developed to investigate epigenetic features that affect transcription by means of chromatin accessibility (ATAC-seq [7], DNASE-seq [8], FAIRE-seq [9]), histone modifications (Mint-ChIP [10] and ChIP-mentation [11]), or DNA methylation (bisulfite sequencing [12]). All of these epigenetic features exert their influence on transcription through the dynamics of nucleosomes, the histone octamers on which DNA is wrapped. Further, these epigenetic features exert measurable kinetic effects on the wrapping and unwrapping of DNA from the nucleosomes that may provide unexplored regulatory mechanisms for eukaryotic transcription [3,13]. At the molecular level, the process of transcription past a nucleosome is complex, but it has several key features: stalling of RNAPII on approach to a bound nucleosome, temporary nucleosome unwrapping/dissociation, and rebinding of the nucleosome if the site is not occupied by a polymerase. Histone modifications alter the duration and frequency of transcriptional pausing by influencing the fluctuation of nucleosomes between wrapped and unwrapped states (commonly referred to as “nucleosome breathing”) [14,15]. The nucleosome’s wrapping equilibrium and the absolute time scale of this process can potentially shape the transcriptional dynamics. Bintu et al. showed that transcription of bare DNA had an average of 4 pausing events per kbp with an average pause length of 4.4 seconds while transcription passing an unmodified nucleosome had an average of 14 pausing events per kbp with an average pause length of 10.2 seconds [16]. However, histone modification via acetylation significantly reduced the stalling to an average of 11 pausing events per kbp with an average pause length of 9.6 seconds [16]. In cases where the DNA is completely depleted of histones (as occurs with loss of stem-loop-binding protein), cells show a global increase in transcriptional elongation rates, loss of almost all transcriptional pausing as if the polymerase were transcribing bare DNA, and altered patterns of co-transcriptional splicing [17]. Taken together, this data suggests that nucleosomes are the primary regulator of transcriptional elongation. However, nucleosomes are also part of a large array of genetic and epigenetic features that control transcriptional elongation rates and the overall transcriptional flux [18,19]. In contrast to the dynamic nature of nucleosomes, there are also static defects associated with transcriptional pausing. The most critical static features that predict elongation slowdowns are high GC content, high exon density, and highly methylated CpG islands [18,19]. It is unintuitive whether these genetic and epigenetic features can best be described as static site defects or if they modify the dynamics of nucleosomes. Wang, Stein, and Ware demonstrated that nucleosomes have higher occupancy on regions of higher GC content and on exons [20]. Jonkers, Kwak, and Lis demonstrated that increased exon density increased RNAPII pausing due to co-transcriptional splicing at intron-exon junctions [18]. Collings, Waddel, and Anderson showed that methylated CpG sites (associated with transcriptional silencing) are more highly occupied by nucleosomes while unmethylated CpG islands are associated with polymerase recruitment and nucleosome depletion [21,22]. Therefore, these pausing events may be considered the result of either a static or dynamic defects. While nucleosomes stall polymerases, polymerases also exert an effect on nucleosomes. At low rates, each polymerase must move past the nucleosome which rebinds between polymerase crossings. However, at high transcription rates, the lead polymerase pioneers its way through the nucleosomes, while holding the DNA open for its followers. This process keeps the nucleosome in the destabilized state longer, making it more likely to be totally displaced from the chromatin [23,24]. Thus, the nucleosomes induce convoy formation through pausing, but the resulting convoys obstruct nucleosome rebinding. In spite of the numerous simulations and extensive modeling of initiation-rate limited transcription and transcriptional bursting due to multi-state promoter dynamics, few computational and mathematical modeling studies have been performed to investigate the possibility of a transcriptional regime limited by nucleosome-induced pausing or the unique pattern of transcriptional bursting that this would induce [25-28]. Therefore, we developed a stochastic, agent-based model of transcription through nucleosomes arranged in the canonical beads on a string geometry. The nucleosomes undergo stochastic transitions from the wrapped to unwrapped state consistent with the thermodynamic equilibrium set by the microscopic rate constants. Encountering a nucleosome in the closed state causes polymerases to stall until the nucleosome reopens. In turn, the presence of polymerases prevents nucleosomes from closing. From a theoretical perspective, the proposed model is an extension of a classical model in stochastic transport theory known as the Totally Asymmetric Simple Exclusion Process (TASEP) [29,30]. In the open boundary TASEP, particles can be injected through the first boundary (if the first lattice site is unoccupied) with propensity α, removed through the other boundary (if the last lattice site is occupied) with propensity β, and advance with propensity q (typically set to one to define the time scale) if there is no particle obstructing it on the next, non-boundary lattice site as shown in Fig 1A. Even though the TASEP only has nearest neighbor interactions, the TASEP exhibits a non-trivial kinetic phase diagram (Fig 1B) with the bulk density ρ, average hop rate v, and total particle flux J = v×ρ on the lattice completely defined by the initiation and termination propensities as shown in Eqs 1–3 [31-33].
Fig 1

Extension of the canonical TASEP to include nucleosomes as dynamic defects.

(A) Depiction of the canonical TASEP. Particles are injected into the first lattice site at rate α and advance with rate q = 1 when unobstructed by other particles. Particles are removed from the final lattice site at rate ß. (B) The canonical TASEP phase diagram in the αß-plane with the fundamental relationships for J, ρ, and v summarized for each phase. (C) Simulated density profiles for each phase are shown by black lines, and dashed red lines indicate the mean-field theory prediction obtained for an infinitely long lattice. (D) Depiction of the proposed dynamic defect TASEP (ddTASEP) where periodically spaced nucleosomes function as extended-body dynamic defects. RNAPII bind the first lattice site (transcription start site, TSS) at rate α, RNAPII advances at rate q when unobstructed by other polymerases or nucleosomes. RNAPII is then released from the gene at the end of transcription at rate ß, which is set equal to q (ß = q). DNA in the nucleosome associated regions (containing 3 subsequent sites) unwraps from the nucleosome to the open conformation at rate h and returns to the wrapped/closed state at rate h when there are no polymerase present on the nucleosome associated sites. All lattice sites are 50 bp long.

Additionally, the boundary defects give rise to a variety of density profiles as shown in Fig 1C.

Extension of the canonical TASEP to include nucleosomes as dynamic defects.

(A) Depiction of the canonical TASEP. Particles are injected into the first lattice site at rate α and advance with rate q = 1 when unobstructed by other particles. Particles are removed from the final lattice site at rate ß. (B) The canonical TASEP phase diagram in the αß-plane with the fundamental relationships for J, ρ, and v summarized for each phase. (C) Simulated density profiles for each phase are shown by black lines, and dashed red lines indicate the mean-field theory prediction obtained for an infinitely long lattice. (D) Depiction of the proposed dynamic defect TASEP (ddTASEP) where periodically spaced nucleosomes function as extended-body dynamic defects. RNAPII bind the first lattice site (transcription start site, TSS) at rate α, RNAPII advances at rate q when unobstructed by other polymerases or nucleosomes. RNAPII is then released from the gene at the end of transcription at rate ß, which is set equal to q (ß = q). DNA in the nucleosome associated regions (containing 3 subsequent sites) unwraps from the nucleosome to the open conformation at rate h and returns to the wrapped/closed state at rate h when there are no polymerase present on the nucleosome associated sites. All lattice sites are 50 bp long. The theory of the dynamic defect TASEP (ddTASEP), which is relevant to transcription, has not been fully developed. Van den Berg and Depken constructed a ddTASEP of transcription through single-site nucleosomes that would automatically be displaced from the gene after direct contact with polymerases [25]. Their model showed the first proof of concept of polymerase convoy organization and nucleosome depletion. However, their model treated polymerase advancement past nucleosomes as a single concerted step (i.e. RNAPII automatically moves through nucleosomes at a manually specified, slower rate rather than letting the wrapping dynamics dictate the advancement), the representative cases focused on low nucleosome density with slow rebinding, and they interpreted the results by mapping them on to a hypothetical two state model [25]. Waclaw et al. investigated a general dynamic defect TASEP with single site hopping particles and single site defects focusing primarily on closed loop cases with soft exclusion effects that allow nucleosome and polymerase co-occupancy with some coverage of the open boundary case [34]. Previous studies investigating the flux-density-hop rate (J-ρ-v) relationship of TASEPs with dynamic defects have focused primarily on weak or fast fluctuating perturbations and have utilized mean field approaches that cannot account for the long range particle-particle and particle-defect correlations [25,34,35-36]. Most notably, Waclaw et al. studied a variety of single site dynamic defect TASEPs with periodic and open boundaries with constrained and unconstrained defect binding [34]. However, most of their theoretical results were derived for the mathematically simpler case with periodic boundaries in which the bulk particle density remains constant. In this study, we propose a transcription-dedicated, TASEP model with elongated, dynamic defects, and hard polymerase-nucleosome exclusion constraints (Fig 1D) that accounts for key features of RNAPII transcription through nucleosomes including nucleosome depletion, polymerase convoy organization, and transcriptional bursting. Additionally, we abandon the mean field approximation to focus on the regime in which transcription proceeds via propagation of polymerase convoys of arbitrary length. We derive the mean first passage time (MFPT) and variance of the first passage time (VFPT) of the pioneering polymerases that lead the convoys using a Markov Chain approach [37-39]. We quantified the strength of the nucleosome perturbation via the mean first passage rate γ, which is inversely proportional to the MFPT and depends on the microscopic nucleosome wrapping and unwrapping rate constants. The initiation rate α and the mean first passage rate γ were then used to construct the ddTASEP phase diagram (orthogonal to the canonical TASEP αβ-phase diagram at β = 1) and to calculate the location of the nucleosome-induced jamming phase transition [29,31,32]. We identified two regimes in the novel αγ-plane: initiation limited and nucleosome pausing (defect) limited. In both regions, we postulated approximate expressions for the elongation rate v, the bulk density ρ, and the bulk transcription flux J. Additionally, the inter-burst intervals and burst sizes generated by the slow nucleosome dynamics were investigated over a wide range of conditions spanning the initiation-limited and nucleosome-limited regimes. Finally, preliminary results on the effects of variability in the RNAPII advance rate and in the nucleosome dynamic rate constants along DNA (e.g. caused by presence of exons or DNA methylation on CpG islands) are presented.

Methods

Numerical simulations

The model is simulated using a Gillespie algorithm with a variable number of reaction channels based on the set of all possible legal moves [40]. During each Monte Carlo step, the list of all legal polymerase and nucleosome moves is compiled. The propensities for each possible move are used to construct a categorical distribution, and a legal move is randomly selected from this distribution with probability proportional to its propensity. Next, the waiting time to the selected move is drawn from an exponential distribution with a mean waiting time given by the total propensity (sum of all legal propensities). The move is performed, and the system time is updated.

Determination of ddTASEP properties from the simulation output

The canonical properties of the TASEP are the flux J, average hop rate v, and the average site occupancy/density ρ [31,32]. The considered ddTASEP is ergodic and converges to a steady state probability density. For each Monte Carlo step, the state of the system is described by two vectors and composed of binary variables s and h indicating the polymerase occupancy of the i site and the nucleosome occupancy of the m nucleosome associated region, respectively. The averages of v, ρ, and J are estimated within the time interval between the first passage time t (Monte Carlo step N) and final time point t (at least eight times greater than t). All simulations were run for a fixed number of Monte Carlo steps (N = 1.2×106 for qualitative trends or N = 2×106 for quantitative results). The time averaged density profile ρ is estimated in Eq 4. As Derrida proved for the classic TASEP, far from the boundaries, the three primary properties of the TASEP converge exactly to values set by the microscopic rate constants for the initiation and termination rates (α, ß) [31,32]. For an infinitely long lattice, the boundary effects become negligible. For genes of 20 kbp (400 lattice sites) or longer, the boundary effects are weak, allowing the system to be characterized in terms of these three non-spatial, non-temporal metrics. Taking N to be the total number of mRNA produced, N to be the number of lattice sites on the gene, and to be the time between the k RNAPII binding and terminating its transcription, the bulk averages of the three primary properties are defined in Eqs 5–7. The bulk nucleosome densities ρ and bulk nucleosome density profiles ρ are calculated analogously to ρ in Eq 1 and in Eq 2. However, the bulk nucleosome density ρ is scaled by a factor of ¾ or more generally to account for the linker regions.

Classifying burst events and measuring burst properties

When the pioneering polymerase reaches the end of the gene, the subsequent polymerases in the convoy exit with the flux predicted by the classical TASEP since a sufficiently long convoy functions as a shorter TASEP within the larger ddTASEP lattice. Given that the advance rate into the convoy and the termination rate are both greater than ½ (i.e. q = 1), polymerases exit with the highest flux predicted by the classical TASEP J = ¼. This implies that the average intra-burst waiting time () is equal to 4 time units. However, there is no way to systematically predict what set of parameters (α, h, h) will give rise to bursting or what the optimal threshold is to classify an unknown waiting time t as either an intra-burst waiting time t or an inter-burst interval t. Therefore, we take the following probabilistic approach. First, kernel density estimation is performed on the set of log-transformed waiting times (log10 t) weighted by (log10 t) so that the rare bursting events are more prominent. Next, an additional smoothing step is performed to reduce the noise from the probability density function estimation of the rare burst events. A local maxima search is performed to attempt to identify the intra-burst interval peak and the inter-burst interval peak. If only one peak is found, the ddTASEP is considered initiation limited, and no burst properties are recorded. If the system is bimodal, a threshold T is calculated by taking the geometric mean of the peaks (which is also the arithmetic mean of the log transformed values). Burst events are defined by t>T. This threshold might filter out some intermediate cases, but it guarantees that those parameter sets (α, h, h) that generate robust bursts will be accurately classified while systematically eliminating false positives due to intrinsic stochasticity. The burst size N is defined as the number of mRNA produced between these two breaks in transcription. Burst size and inter-burst intervals are averaged across simulations since they constitute rare events.

Using correlations to quantify defect perturbation strength

The inter-site Pearson r correlation is defined below as Eq 8 can be used to define a matrix R = [r] that contains the inter-site correlations for all site pairs (i,j). Each row or column of the matrix R is a vector of Pearson correlations that we denote as . Given the periodic nature of the lattice when a site is far from the boundaries, the inter-site correlations r should only be a function of the relative distance between the sites (Δ = j−i). Further, each central row vector should exhibit the same qualitative features around the i site as some hypothetical bulk correlation vector (defined only in terms of inter-site distance) would around its central site at Δ = 0. To construct the bulk correlation profile , we first chose a maximum distance (Δ = d) to evaluate the correlations over. Then, for each i∈{d, N−d}, a new vector can be constructed that truncates each and centers it around the i site. Finally, we defined the bulk correlation profile as the sliding window average over the vectors as shown in Eq 9. Each entry rΔ of the vector gives the average inter-site correlation based on relative distance Δ. This metric allows for direct assessment of the strength and length scale of polymerase cooperativity and also allows for more direct, visual comparison of the correlation strengths from different parameter sets than can be obtained from the correlation matrices themselves.

Code implementation and code/data availability

Code was written in matlab R2018a (Math Works, Natick MA). High throughput parallel simulations were performed on a remote server (HARDAC at Duke Center for Genomic and Computational Biology) utilizing a SLURM job handler and an Lmod Environmental Module System that provided matlab R2017b. Given that matlab’s internal parallelization could not be used due to conflict between it and the job scheduler, each instance of the parameter sweep was evaluated on a separate node initialized into the same environment. Since the environmental handler initialized with the same default random number seed for each iteration, random number seeds were specified manually. Code for the core algorithms and the calculation of simulation observables is available at SimTK () along with the associated data sets. Data sets are provided both in a.mat format that can be directly loaded into the matlab workspace. Alternatively, the files can be downloaded as Excel files in a.xlsx format and reprocessed by the reader.

Results

ddTASEP transcription model formulation

In the proposed model (Fig 1D), the ddTASEP lattice is segmented into 50 base pair (bp) sites. A single nucleosome occupies three sites, and there is one linker site between each nucleosome-associated region. This discretization was chosen since the RNAPII footprint on DNA is generally estimated to be roughly 40 bp [41,42]. Further, the nucleosome linkers range in size from 10–70 bp [43]. Additionally, DNA wrapped around nucleosomes is known to be exactly 145–147 bp [43]. Last, assays such as BruDRB-seq have suggested that the maximum velocity of polymerases is roughly 50 bp/s even for extreme outliers across multiple cells types [19], and other more common assays such as MS2-tagging have suggested transcription speeds of 26–86 bp/s [44]. Thus, 50 bp sites will be occupied by one RNAPII enzyme which will take on average one second to clear the site on bare DNA at 50 bp/s. There are three possible actions for polymerases: initiation, advancement, and termination. If the transcription start site (TSS) is open, polymerases can be initiated with propensity α. Once on the gene, RNAPII advances with propensity q while it is not obstructed by another RNAPII or a nucleosome in the wrapped/closed state. It must pause until the nucleosome or other polymerase moves out of its way. Nucleosomes transition from the open (unwrapped) to the closed (wrapped) position with propensity h when all three nucleosome associated sites are empty. The nucleosome unwraps to the open position with propensity h. At the end of the gene, the RNAPII will terminate with propensity ß = q. Since the lattice is segmented into 50 bp sites and the characteristic RNAPII velocity is of order of 50 bp/s, the RNAPII advance rate (from one 50 bp site to the next) q is on the order of 1 site/s. For the sake of simplicity, we will use time units such that q = 1 in the figures. However, q is left in the equations associated with the figures for generality. In the last section of this manuscript, the role of non-uniformity of DNA (due to the presence of exons, methylated CpG islands, and high GC content) will be incorporated by assigning each lattice site a new nominal advance rate q or by assigning each nucleosome a unique set of rate constants h and h. The model can be described formally as follows. Let s be a binary variable representing the polymerase occupancy of the i lattice site and let h be the nucleosome occupancy of the nucleosome associated with sites i, (i+1), (i+2), and (i+3). Site i is a linker site, and the others are the nucleosome associated sites. When any of (i+1), (i+2), and (i+3) are occupied, nucleosome m cannot rebind. These rules are summarized as follows:

Nucleosome dynamics control polymerase convoy formation and transcriptional bursting

The dynamical defect TASEP exhibits a wider array of dynamical behavior than the classical TASEP [31,32] or static defect TASEP [45-48]. The pausing induced by nucleosome binding leads to organization of RNAPII into convoys. In Fig 2, we analyze the behavior of the considered ddTASEP.
Fig 2

Qualitative behavior of the ddTASEP over a range of representative parameters.

All sub-figures are composed of 9 subplots. For all plots, the transcription initiation rate α was set to 0.05, while h is held constant along rows (increasing from 0.001→0.01→0.1 from top to bottom) and while h is held constant along columns (increasing from 0.001→0.01→0.1 from left to right). The top right subplot deviates the furthest from the canonical TASEP behavior while the bottom left most closely resembles ideal TASEP behavior. Simulations shown in panels A, C, D and F utilized 4.5×105 Monte Carlo Steps with 50 replicates. (A) Number of completed mRNAs as a function of time. Notice that the simulation times for the top, middle, and bottom rows are 60, 8, and 2 hours, respectively. (B) Kymographs plotting the position of each RNAPII on the gene (x-axis) as time advances (y-axis). (C) Time averaged RNAPII site density ρ as a function of position. The wavy density profiles are caused by the fact that the linker sites have a higher average RNAPII occupancy than the neighboring nucleosome sites. (D) Time averaged nucleosome site density ρ (black line) as a function of position (ignoring linker sites). The bulk nucleosome density is anti-correlated with the polymerase density, reaching a maximum at the end points where polymerases are rapidly ejected. The dotted red lines indicate the resting-state nucleosome density prior to transcriptional initiation. (E) Inter-site correlation heatmaps with colors bars ranging from -0.2 to 1. (F) Average inter-site correlation profile (rΔ, black line) as a function of relative inter-site distance Δ with the dotted red line indicating zero.

Qualitative behavior of the ddTASEP over a range of representative parameters.

All sub-figures are composed of 9 subplots. For all plots, the transcription initiation rate α was set to 0.05, while h is held constant along rows (increasing from 0.001→0.01→0.1 from top to bottom) and while h is held constant along columns (increasing from 0.001→0.01→0.1 from left to right). The top right subplot deviates the furthest from the canonical TASEP behavior while the bottom left most closely resembles ideal TASEP behavior. Simulations shown in panels A, C, D and F utilized 4.5×105 Monte Carlo Steps with 50 replicates. (A) Number of completed mRNAs as a function of time. Notice that the simulation times for the top, middle, and bottom rows are 60, 8, and 2 hours, respectively. (B) Kymographs plotting the position of each RNAPII on the gene (x-axis) as time advances (y-axis). (C) Time averaged RNAPII site density ρ as a function of position. The wavy density profiles are caused by the fact that the linker sites have a higher average RNAPII occupancy than the neighboring nucleosome sites. (D) Time averaged nucleosome site density ρ (black line) as a function of position (ignoring linker sites). The bulk nucleosome density is anti-correlated with the polymerase density, reaching a maximum at the end points where polymerases are rapidly ejected. The dotted red lines indicate the resting-state nucleosome density prior to transcriptional initiation. (E) Inter-site correlation heatmaps with colors bars ranging from -0.2 to 1. (F) Average inter-site correlation profile (rΔ, black line) as a function of relative inter-site distance Δ with the dotted red line indicating zero. The mRNA accumulation curves shown in Fig 2A all exhibit characteristic features. First, there is a delay due to the first passage time for the first RNAPII to clear the length of the gene as it interacts with the nucleosomes. Second, there is an approximately linear increase in the number of transcriptional events after this initial period. In the weakly perturbed cases (bottom row), the polymerases advance quasi-deterministically giving rise to uninterrupted, linear in time mRNA synthesis. As the nucleosome dynamics slow down (top row), two patterns of nucleosome-induced pausing are observed. In the slow opening and closing case (h = h = 0.001, top left), periods of sustained transcription are disrupted by nucleosome binding resulting in large transcriptional pauses. Therefore, the nearly vertical sections of mRNA production in these cases represent mRNA bursts observed when a convoy of RNAPII reaches end of the gene. In the fast closing and slow opening cases (h = 0.001, h = 0.1, top right), fast nucleosome re-wrapping leads to the formation of smaller, tight burst groups without the long, uninterrupted periods of continuous transcription seen in the previous case. The self-organization of polymerases into convoys that give rise to the bursts in Fig 2A is also readily apparent in the kymographs in Fig 2B. In the less nucleosome perturbed cases (bottom row), the kymographs show the polymerases moving quasi-deterministically along the length of the gene with few collisions and limited convoy formation as would be expected in a canonical TASEP. As the nucleosome-induced pausing becomes stronger in the upper subplots of Fig 2B, the polymerases organize into well separated convoys, most notably in the fast binding and slow opening case (h = 0.001, h = 0.1, top right). Finally, in the extremely slow opening and closing case (h = h = 0.001, top left), both the completely empty gene and completely polymerase-occupied configurations of the lattice are observed. Based on Eq 2, a canonical TASEP with α = 0.05 would be in the initiation limited region and accordingly have ρ = 0.05. However, while all the ddTASEP density profiles shown in Fig 2C resemble the initiation limited region from Fig 1C, the values for the bulk density ρ are always equal or larger than those of the classical TASEP. Further, the density profiles ρ showed a sawtooth wave pattern along the spatial axis since the linker sites always had higher polymerase occupancy due to nucleosome induced pausing. As expected, higher levels of polymerase occupancy lead to noticeable nucleosome depletion as demonstrated by the difference between resting probabilities of nucleosome occupancy (dotted red line) and the steady state nucleosome density profiles in Fig 2D. The nucleosome occupancy profiles presented here look like inverted versions of the polymerase density profiles consistent with the hard exclusion constraint that allows polymerases to hold nucleosomes in the open conformation. Last, the central sites in the inter-site correlation heatmap Fig 2E confirm our expectation that the inter-site correlations only depend on the relative distance between sites. Using Eq 9, for inter-site correlation rΔ in Fig 2F we directly quantify the strength of cooperativity between polymerases at distance Δ. In the highly nucleosome perturbed case (with slow opening and closing, h = h = 0.001, top left), the positive correlation spans over a distance comparable with the gene length suggesting that polymerase convoys for a given set of h and h could be longer than the length of the gene. The intermediate cases (most notably the central case where h = h = 0.01) exhibit both a short range positive correlation associated with RNAPII convoys and a longer range negative correlation showing that convoys are separated by regions of rebinding nucleosomes. In the case of the weak nucleosome binding (h = 0.1, h = 0.001 and h = h = 0.1), no inter-site correlation is observed showing that bursts are not formed.

Mean and variance of first passage time

As demonstrated in Fig 2A and 2B, our model implies that the time needed to produce first complete mRNA depends on the nucleosome-induced pausing. Using Markov chain theory, the expected passage time through a nucleosome unit (consisting of one linker site and three nucleosome sites) for the first polymerase can be calculated [49,50]. Due to the periodic spacing of nucleosome binding sites, the MFPT to produce an mRNA after gene reactivation is simply the sum of the expected times for the first polymerase to clear each nucleosome unit. The interaction between a polymerase on a linker site and a nucleosome can be treated as a three state Markov Chain (Fig 3A), where State 1 (polymerase on linker with nucleosome open) and State 2 (polymerase on linker with nucleosome closed) can reversibly communicate until the absorbing State 3 (polymerase enters the first nucleosome site) is reached as shown in Fig 3A. The evolution of the state probability distribution of this Markov Chain can be described as a system of Forward Kolmogorov Equations with generator matrix Q [37,39,49]:
Fig 3

Deriving and validating the mean and variance of first passage time equations.

(A) A state transition diagram utilized to calculate the mean (E) and variance (V) of first passage time to enter a nucleosome. Each arrow is labeled with the rate constant for the transition between the neighboring states. (B) Log-log plot of simulated mean first passage time (Eq 22) against the predicted analytical expression for mean first passage time. The black line represents the theoretical prediction. The dashed red line represents the Bare DNA limit obtained by simulations with h = 0. (C) Log-log plot of simulated variance of first passage time against the predicted analytical expression for VFPT (Eq 26) with the Bare DNA limit represented by the dashed red line. (D) Plot of index of dispersion of the waiting time to enter a nucleosome D = V/E against the resting state binding probability of the nucleosomes (). h is set to be 0.001 (blue), 0.01 (green), 0.1 (orange), and 1 (red). h was adjusted to achieve γ∈{0.002, 0.005, 0.01, 0.02, 0.05, 0.1, 0.2, 0.5, 0.99}. Solid lines indicate the theoretical prediction given by Eq 27. (E) Log-log plot of the analytical expression for the index of dispersion of the waiting time to enter a nucleosome D (Eq 27, analytical) as a function of the expected waiting time for nucleosome opening (1/h). The dashed black line shows the index of dispersion in the limit of h→0.

Deriving and validating the mean and variance of first passage time equations.

(A) A state transition diagram utilized to calculate the mean (E) and variance (V) of first passage time to enter a nucleosome. Each arrow is labeled with the rate constant for the transition between the neighboring states. (B) Log-log plot of simulated mean first passage time (Eq 22) against the predicted analytical expression for mean first passage time. The black line represents the theoretical prediction. The dashed red line represents the Bare DNA limit obtained by simulations with h = 0. (C) Log-log plot of simulated variance of first passage time against the predicted analytical expression for VFPT (Eq 26) with the Bare DNA limit represented by the dashed red line. (D) Plot of index of dispersion of the waiting time to enter a nucleosome D = V/E against the resting state binding probability of the nucleosomes (). h is set to be 0.001 (blue), 0.01 (green), 0.1 (orange), and 1 (red). h was adjusted to achieve γ∈{0.002, 0.005, 0.01, 0.02, 0.05, 0.1, 0.2, 0.5, 0.99}. Solid lines indicate the theoretical prediction given by Eq 27. (E) Log-log plot of the analytical expression for the index of dispersion of the waiting time to enter a nucleosome D (Eq 27, analytical) as a function of the expected waiting time for nucleosome opening (1/h). The dashed black line shows the index of dispersion in the limit of h→0. All of the moments of the first passage time of any absorbing Markov chain are uniquely determined by the submatrix of transient states in Q which will be referred to as R. The fundamental Matrix N is then given by [38,49] The expected waiting times for the polymerase to enter the nucleosome site from the open state (E, State 1) and closed state (E, State 2) can be expressed in terms of the fundamental matrix N as The overall expected waiting time to enter the nucleosome associated site (E) is given by the weighted average of E and E with respect to the probability of finding the nucleosome in the open or closed state upon the polymerase’s initial approach . The expected passage time to transcribe past a nucleosome and its three associated sites (E) is the sum of E and the expected time to pass through 3 empty sites (3/q) Consequently, the overall MFPT to transcribe the gene is given by where α, β and N are the initiation rate, termination rate, and the number of nucleosome units, respectively. Fig 3B shows that MFPT formula (Eq 22) accurately predicts the simulated MFPTs (R = 0.99995), generated by varying h to attain various values of γ across a log scale from 0.001 to 1 while holding h constant. We use E also to construct a new parameter γ called the mean first passage elongation rate. By definition γ = 1 when the nucleosomes are constantly open h = 0. The reader should note the distinction between the bulk elongation rate v (the average velocity of all polymerases to transcribe a gene) and the mean first passage rate γ (the expected/ensemble-averaged velocity of the first polymerase to transcribe a gene). Another useful metric to understand the effects of the nucleosomes on transcription is the variance of the first passage time (VFPT). The variance of the nucleosome entrance waiting time (V) was calculated with the following equation from [38]. As with the mean first passage time, the variance of the time to pass through a nucleosome unit (V) and the overall variance of the first passage time (VFPT) are obtained from V. Fig 3C demonstrates that the analytical expression for VFPT (Eq 26) accurately predicts the simulated VFPTs (R2 = 0.991), over three orders of magnitude of h and h. When nucleosome binding was turned completely off (h = 0), the analytical expression and simulation both provided the expected result for bare, nucleosome-free DNA (i.e. a canonical TASEP). Finally, we introduce the index of dispersion , that can be used to measure the deviation of the ddTASEP’s behavior from a classical TASEP on nucleosome-free DNA for which . The index of dispersion of the time to enter a single nucleosome D was found to be In the limits as h = 0 or h→∞, the index of dispersion converges to (1/q) which is consistent with the canonical TASEP limit since this is simply the index of dispersion for an exponential distribution or a Poisson process [49]. In contrast, when h≪h, the index of dispersion becomes dominated by the expected waiting time for the nucleosome to reopen. Fig 3D and 3E investigate whether the index of dispersion is more strongly controlled by the resting nucleosome density (Fig 3D) or the absolute time scale of the reopening time (Fig 3E). As shown in Fig 3D, there is a linear increase in index of dispersion with respect to increasing resting nucleosome density () followed by an abrupt increase as the resting density goes to one. It is also possible that this change in the index of dispersion likely arises from the associated decrease in h needed to achieve the desired initial nucleosome density for fixed h. Fig 3E shows a smooth transition from the quasi-deterministic random walk of the classical TASEP (D≈1/q in the limit h = 0) to the nucleosome limited cases where the unwrapping waiting time associated with nucleosome breathing dominates the index of dispersion D≈(1+h)/h in the limit h→0. Thus, these large deviations from canonical TASEP behavior leading to polymerase self-organization and bursting are driven primarily by infrequent interactions with nucleosomes with slow re-opening (low h) but can be marginally enhanced by more frequent interactions with nucleosomes (higher resting nucleosome density) due to higher rates of re-wrapping (high h).

Validity of Coarse-Graining the Lattice

A core assumption of this model is that the lattice can be treated in terms of 50 bp sites instead of at 1 bp resolution (consistent with the fact that RNAPII advance one nucleotide at a time). It is a well-known result that, for fixed lattice size, a TASEP with particles of length L>1 will have reduced flux through the lattice relative to a TASEP with particles of length L = 1 [51,52]. However, the effect of simultaneously changing both particle length and lattice size on transcription flux is not intuitive. Therefore, we considered the possibility of using non-exponential waiting times. The true waiting time distribution for this system is likely an Erlang(k = n, λ = n) distribution with n∈[1, 50]. Two special cases of this exist: When n = 1, the canonical exponential waiting time distribution is reobtained. When n = 50, the interpretation is that all 50 nucleotide polymerization steps occur with waiting times given by independent and identically distributed exponential variables with a uniform rate of 50 bp/s. (Both cases represent extremes that are not likely to be biologically realistic.) The variance of the advancement waiting time under the exponential distribution is 1s2 while it is 0.02s2 under the Erlang(k = 50, λ = 50) distribution. In a non-Markovian TASEP without defects, the reduced variance in the hop time leads to fewer collisions allowing the average (bulk) particle hop rate v to approach the nominal hop rate q. This leads to substantially enhanced flux J relative to a Markovian TASEP with exponentially distributed waiting times [53]. However, in the proposed ddTASEP model, we hypothesize that the contribution from pausing induced by the slow nucleosome dynamics will dominate the first passage time eliminating the entrainment effect that is observed on bare DNA allowing us to coarse-grain the system and use a simpler, exponential waiting time distribution without introducing significant error. To demonstrate this, we first prove that the entry waiting time statistics are not affected by considering a larger number of intermediate entry steps. Second, we then demonstrate that the nucleosome induced pausing dominates the mean and the variance of the first passage time relative to the time to transcribe the bare DNA after entry into the nucleosome associated region. First, we consider a more general transient state matrix R′ (analogous to Eq 17) which considers the entry of a polymerase of length L that travels at rate Lq (between subsequent base-pairs) into the nucleosome associated region that can be temporarily arrested by nucleosome rebinding at rate h at any step of the process. The transient states of the new Markov Chain R′ will have a block structure with each two rows indicating an open (unarrested) and closed (arrested) state for each step of the extended polymerase’s entry into the nucleosome associated region as given by As a representative example, we can consider the two-site case where the advance rate is 2q. Interestingly, the nucleosome entry time for the two-site case E was identical to that the of the one-site case (E = E). Additionally, the variance of the entry waiting time V was found to be Further, we expect that V≈V. Evaluating the ratio of the two gives While analytical results become unwieldy as L→50, this approach can be performed numerically in MATLAB. The numerical values of E and V are plotted against the analytical results for E and V in S1 Fig for the same parameter sets shown in Fig 3. As hypothesized, S1A Fig shows that the expected entry waiting times are equal to each other. Likewise, S1B Fig shows that the variance of this waiting time for the fifty-step model is approximately equal to the variance for a single-entry step except for the data series when h = 1 (since the highest power term of (h+h) can no longer vanish) and in cases where h≪h, which correspond to effectively bare DNA (as demonstrated by the final data points of the h = 0.001 and h = 0.01 data series). Next, in S1C Fig, we establish that the nucleosome entry time dominates the first passage time through a nucleosome associated region. From S1C Fig (left), for biologically interesting conditions when h Therefore, while a more accurate waiting time distribution would be critical in the absence of nucleosomes, the nucleosome induced pausing and subsequent polymerase-polymerase collisions within a burst group likely cancel out any of the entrainment effects that would be observed on bare DNA if a more realistic waiting time distribution were used. Thus, we believe that the use of a single site polymerase with 50 bp lattice sites and exponentially distributed waiting times is still an appropriate simplification to model the system.

Constructing the ddTASEP αγ-plane

The canonical TASEP phase diagram exhibits three jamming transitions in the αβ-plane when the initiation rate becomes limiting (α<β and α<½), the termination rate becomes limiting (ß<α and ß<½), or when the channel capacity is reached in the max flux limit (α, ß≥1/2) as shown in Fig 1B. Analogously, the ddTASEP should exhibit a jamming transition due to the limitation on the maximum transcription flux imposed by the nucleosome-induced pausing. Therefore, we expected that the transcription efficiency γ could be used to define a third axis of the phase diagram orthogonal to the canonical αß-plane. By assuming ß = q = 1, we restrict our analysis to the αγ-plane as shown in Fig 4. The reader should note that, by the selection of q = 1, we are implicitly rescaling the values of α and γ with respect to the advance rate q.
Fig 4

The ddTASEP phase diagram in the αγ-plane.

(A) Phase diagram in the αγ-plane (orthogonal to the original αß-plane in Fig 1B at ß = q = 1). The critical line (Eq 37) divides the phase diagram into two regions: the initiation-limited and the nucleosome-limited For γ→1, the ddTASEP’s dynamics converge to the classical Max Flux Limit shown in Fig 1B. (B) The theoretical predictions (top row) for the bulk elongation rate v (Eq 38), bulk density ρ (Eq 39), and bulk transcription flux J (Eq 40) are qualitatively compared to simulated results for h = 0.1, 0.01, and 0.001 (2nd through 4th rows) Mean first passage rate, γ was varied from between 0 and 1 by specifying h (for given h) to achieve the desired value. 1.2 million simulation steps were used for each heat map point.

The ddTASEP phase diagram in the αγ-plane.

(A) Phase diagram in the αγ-plane (orthogonal to the original αß-plane in Fig 1B at ß = q = 1). The critical line (Eq 37) divides the phase diagram into two regions: the initiation-limited and the nucleosome-limited For γ→1, the ddTASEP’s dynamics converge to the classical Max Flux Limit shown in Fig 1B. (B) The theoretical predictions (top row) for the bulk elongation rate v (Eq 38), bulk density ρ (Eq 39), and bulk transcription flux J (Eq 40) are qualitatively compared to simulated results for h = 0.1, 0.01, and 0.001 (2nd through 4th rows) Mean first passage rate, γ was varied from between 0 and 1 by specifying h (for given h) to achieve the desired value. 1.2 million simulation steps were used for each heat map point. Classically, the ideal method to determine the value of the transcription observables J, ρ, and v and to locate the phase transitions in parameter space would be to utilize a mean field approach [31,32]. However, this approach has significant limitations. Based on the model formulation in Eqs 1–6, it is clear that the system’s chemical master equation would involve the products of up to four site occupancies. Therefore, when taking the ensemble average of the site occupancies, quadratic (e.g. 〈ss〉) and cubic mixed moments (e.g. 〈sss〉) appear in the master equation. (Note that quartic moments such as 〈hsss〉 vanish due to the exclusionary binding constraint.) Under the mean-field approximation (also referred to as the random phase approximation for the TASEP), these mixed moments can be factored as which implies that each site of the classical TASEP comes to a steady state value and that no site occupancies are correlated with each other. The correlation-free assumption clearly fails for the proposed ddTASEP with extended dynamic defects and exclusionary defect binding as shown in Fig 2E and 2F. Therefore, we decided to abandon this approach entirely.
Fig 6

Evaluation of the effects of gene length and nucleosome spacing on bulk transcriptional properties.

Panels A (v), B (ρ), and C (J) show the results of a parameter sweep with respect to gene length from 2,000 to 20,000 bp with α = 0.1 (top left), 0.25 (top right), 0.5 (bottom left), and 1 (bottom right), and γ = 0.1 (blue), 0.2, (green), 0.4 (orange), and 0.75 (red). Panels D (v), E (ρ), and F (J) show the results of a parameter sweep with respect to the number of evenly spaced nucleosomes (by varying number of linker sites) giving N∈{1, 4, 8, 16, 20, 25, 40, 50, 80, 100} with α = 0.1 (top left), 0.25 (top right), 0.5 (bottom left), and 1 (bottom right). Since γ depends on geometry, h and h were set equal to 1 (blue), 0.1 (green), and 0.01 (orange). Both sweeps were performed with 2×106 Monte Carlo steps and 50 replicates.

In order to derive approximate expressions for J, ρ, and v, we propose a phenomenological model inspired by saturation kinetics [54,55] that will interpolate between the canonical TASEP regime (γ = 1, α∈[0, 1]) and the strongly perturbed ddTASEP regime with low initiation rates (γ<1, α≪1). Specifically, appropriate functional forms for v and ρ were selected to satisfy the assumed asymptotic behavior. The bulk elongation rate v and the bulk density ρ were then used to calculate J, and all three expressions were then compared against the simulated results. First, we identified an appropriate functional form for the bulk polymerase hop rate v in the regime with . In the canonical TASEP limit γ→1, the bulk hop rate v is equal to (1−ρ) = (1−α) for α<½ as shown in Eqs 1 and 2. Since the density in the initiation limited regime is ρ = α (Eq 2), the bulk hop rate v must equal (1−α). As the initiation rate α→0, the gene effectively resets to its resting state between each passing polymerase. Therefore, the bulk hop rate v should approach γ. To satisfy both limits, we propose that In order to estimate the bulk density ρ, we noticed that, in the limit γ→1, the bulk density ρ must also converge to the classical TASEP bulk density in Eq 2. Additionally, in the limit γ→0 with a sufficiently large initiation rate α, ρ should be close to one in this limit since a whole gene spanning convoy may form (Fig 2B). Gene spanning convoys are able to form because the most likely binding site for nucleosomes is at the end of the gene (Fig 2D, top left), where the polymerase density drops to nearly zero. Last, in the highly perturbed cases, switching from an almost completely full gene to an almost empty gene is observed as shown in Fig 2B (top left). Under these conditions, the limit ρ→1 is not reached even though the gene may transiently be fully occupied. From the canonical TASEP model, the density has a linear relationship with the initiation rate in the low-density region. For sufficiently high first passage elongation rates γ, the density should increase almost linearly with the initiation rate for small values of α. However, the density will eventually saturate to some limit influenced by γ since convoys will back up on the transcription start site preventing new initiation. We postulate that ρ has a linear dependence on the initiation rate α (in the limit γ→1) and tends to one as γ→0. Therefore, we propose that Finally, we obtain the transcription flux J as In the limit γ→1, the value of J converges to the unperturbed TASEP flux for the initiation limited regime (Eq 3) scaled by the advance rate q. In the unperturbed TASEP, when , the flux (as well as the density and the hop rate) become independent of the injection rate, and the flux J reaches its maximum value for a given β equal to By analogy, it is reasonable to expect that, for a given γ, the flux of the ddTASEP will also become independent of the initiation rate α after some saturating condition is reached since the nucleosomes will cause the polymerases to back up onto the TSS. We propose that for a given γ the jamming transition will occur when J(α,γ) reaches its maximum value with respect the initiation rate α, as given by Therefore, the critical value of the initiation rate α* is For α<α*(γ), both α and γ limit the rate of transcription. In the nucleosome-limited regime where α≥α*(γ), the density, flux, and hop rate no longer depend on the initiation rate but now depend exclusively on nucleosome-induced pausing. Substituting α* into Eqs 32, 33 and 34 gives the proposed estimates for v, ρ, and J. These results are summarized in the phase diagram for the αγ-plane as shown in Fig 4A. The phase diagram is divided into initiation-limited and nucleosome-limited or jammed regions at the critical line defined Eq 37. The simulations shown in Fig 4B provide evidence that the analytical estimates (Eqs 38–40) reproduce qualitatively dependence of v, ρ, and J on the model parameters.

The initiation rate α and the mean first passage rate γ control transcription dynamics

The results shown in Fig 4B suggest that Eqs 38–40 predict the transcriptional dynamics of the system regardless of the particular values of h and h. This is confirmed by Fig 5A–5C which show the dependence of the bulk elongation rate v (Fig 5A), bulk density ρ (Fig 5B), and the transcription flux J (Fig 5C) on the mean first passage rate γ (which was obtained by varying h while holding h fixed across multiple orders of magnitude). The plots confirm that the values of h and h typically have relatively small effect on the transcription dynamics, which is nearly fully governed by γ.
Fig 5

Dependence of the elongation rate v, polymerase density ρ, and transcription flux J on the initiation rate α and the mean first passage rate γ.

The left panels A, B and C show the dependence of v (Eq 38), ρ (Eq 39), and J (Eq 40) on γ. γ is varied from 0.001 to 0.99 by adjusting h for fixed values of h equal to 0.001 (blue), 0.01 (green), 0.1 (orange), and 1 (red). For each subpanel of A, B, and C, a different α values is assumed: 0.1 (top left), 0.25 (top right), 0.5 (bottom left), 1 (bottom right). The right panels D, E and F show the dependence of v (Eq 38), ρ (Eq 39), and J (Eq 40) on α. α is varied from 0.001 to 1 for fixed values of γ equal to 0.1 (blue), 0.2 (green), 0.4 (orange), and 0.7 (red) under the constraint that h = h. Data points represent the average simulation results from 50 simulations with 2×106 Monte Carlo steps each, and solid lines represent theoretical estimates.

Dependence of the elongation rate v, polymerase density ρ, and transcription flux J on the initiation rate α and the mean first passage rate γ.

The left panels A, B and C show the dependence of v (Eq 38), ρ (Eq 39), and J (Eq 40) on γ. γ is varied from 0.001 to 0.99 by adjusting h for fixed values of h equal to 0.001 (blue), 0.01 (green), 0.1 (orange), and 1 (red). For each subpanel of A, B, and C, a different α values is assumed: 0.1 (top left), 0.25 (top right), 0.5 (bottom left), 1 (bottom right). The right panels D, E and F show the dependence of v (Eq 38), ρ (Eq 39), and J (Eq 40) on α. α is varied from 0.001 to 1 for fixed values of γ equal to 0.1 (blue), 0.2 (green), 0.4 (orange), and 0.7 (red) under the constraint that h = h. Data points represent the average simulation results from 50 simulations with 2×106 Monte Carlo steps each, and solid lines represent theoretical estimates. The highest deviations exist for the lowest and highest values h = 0.001 (blue) and h = 1 (red). However, the predictions for the bulk elongation rate v (Fig 5A), bulk density ρ (Fig 5B), and transcription flux J (Fig 5C) are the most accurate for the biologically relevant cases with h = 0.01 (green) and h = 0.1 (orange) across all three metrics. As a point of reference, if , then the nucleosome dynamics associated with 0.01≤h<0.1 s−1 would be on the order of 10–100 seconds, which is biologically relevant as shown by Bintu et al [13]. Fig 5D–5F show the dependence of the bulk elongation rate v (Fig 5D), bulk density ρ (Fig 5E), and transcription flux J (Fig 5F) on α. Because Fig 5A–5C show that α and γ control the transcription dynamics with minimal effects from h and h under biologically relevant conditions, we decided to omit explicitly varying h in Fig 5D–5F. However, the reader should note that h = h that the nucleosome rate constants vary from 0.0143 to 0.7 to achieve values of γ from 0.1 to 0.7 in Fig 5D–5F. The sharp jamming transition that arises when initiation rate passes the critical value α*(γ) is readily visible in these figures. Fig 5F shows that, as predicted by the model, the transcription flux increases linearly with α until α reaches the critical value α*(γ) at which J(α,γ) reaches its maximum value J(γ) (Eq 40). Similarly, the bulk density ρ(α,γ) reaches its maximum value ρ(γ) (Eq 39) in Fig 5E while the bulk elongation rate v(α,γ) reaches its minimum value v(γ) (Eq 38) in Fig 5D. It is worth nothing that our theoretical predictions slightly overestimate the bulk elongation rate and slightly underestimate the bulk density. However, since the transcription flux is the product of these two quantities, the errors nearly cancel out leading to quantitative agreement between the simulated results and theoretical estimate for J(α,γ) as shown in Fig 5F. For J(α,γ) the transition between the initiation-limited and nucleosome-limited regimes is smooth (in contrast to the sharp transitions for ρ and v) in agreement with assumption that J(α,γ) has zero derivative with respect to α at α*(γ) (Eq 37).

ddTASEP transcription dynamics are not affected by gene length or geometry for short convoy lengths

Fig 6 investigates the effects of varying gene length and the length of linker regions between nucleosomes. Fig 6A–6C show the dependence of elongation rate, bulk density, and transcription flux on the length of the gene, ranging from 2,000 to 20,000 base pairs. On short genes, the convoys start to span the length of the gene, giving rise to some interesting effects. Most notably, the observed transcription flux is higher than that predicted by our proposed model, and the observed density is lower than our proposed model on short genes. We hypothesize that this is connected to the existence of gene spanning convoys. When convoys span the length of the gene, nucleosomes can only infrequently re-bind, and the gene transiently approximates classical TASEP behavior. In the initiation limited regime, the transcription flux of a classical TASEP is the upper-bound of transcription flux for all other related ddTASEPs (with γ<1), and the density of a classical TASEP is the lower-bound of density for all other analogous ddTASEPs. Therefore, the previously mentioned elevated transcription flux and decreased bulk density are consistent with more classical TASEP behavior that may occur in a gene-spanning convoy. Accordingly, when the gene length increased to over 10,000 base pairs, the transcription flux, density, and elongation rate converge to the theoretical predictions for the ddTASEP since the formation of a gene-spanning convoy (that would induce quasi-classical TASEP dynamics) becomes too unlikely for the given rate of nucleosome dynamics.

Evaluation of the effects of gene length and nucleosome spacing on bulk transcriptional properties.

Panels A (v), B (ρ), and C (J) show the results of a parameter sweep with respect to gene length from 2,000 to 20,000 bp with α = 0.1 (top left), 0.25 (top right), 0.5 (bottom left), and 1 (bottom right), and γ = 0.1 (blue), 0.2, (green), 0.4 (orange), and 0.75 (red). Panels D (v), E (ρ), and F (J) show the results of a parameter sweep with respect to the number of evenly spaced nucleosomes (by varying number of linker sites) giving N∈{1, 4, 8, 16, 20, 25, 40, 50, 80, 100} with α = 0.1 (top left), 0.25 (top right), 0.5 (bottom left), and 1 (bottom right). Since γ depends on geometry, h and h were set equal to 1 (blue), 0.1 (green), and 0.01 (orange). Both sweeps were performed with 2×106 Monte Carlo steps and 50 replicates. Fig 6D–6F show the dependence of the bulk ddTASEP properties on the linker spacing between nucleosomes on a 20,000 base pair gene. The reader should note that the number of nucleosomes decrease in the simulation as the number of linker sites increase. Even when γ must be calculated assuming different linker geometries, the analytical results for the bulk elongation rate (Fig 6D) and transcription flux (Fig 6F) adapt almost perfectly to the new geometry and perform reasonably well for the bulk density (Fig 6E).

Features of nucleosome-induced convoy formation and transcriptional bursting

While classical models of transcription attribute bursting to promoter dynamics where the promoter switches between an ON and OFF state [26,27,56-59], the ddTASEP model exhibits bursting even with a constitutively active promoter because of nucleosome-induced pausing. In order to identify bursts, we utilized the difference in time scales between the average intra-convoy waiting time and the average inter-burst interval . As can be seen in Fig 2A and 2B, nucleosome induced pausing introduces non-trivial time delays between the last polymerase of a convoy and the pioneering polymerase of the convoy following it. To briefly summarize our approach (described in full in the methods), we utilized a kernel density estimate of log10 t weighted by itself to determine whether the system exhibited bursting and (in the presence of bursts) to identify the approximate location of the average inter- and intra-burst intervals. The threshold for a burst waiting time was set as the geometric average of the two peaks of the kernel density estimate. Representative kernel density estimates for the parameter sets used in Fig 2 are shown in Fig 7A. Excluding the cases without strong correlations (Fig 2F, bottom left and bottom middle), all of the remaining cases were strongly bimodal indicating that non-trivial bursting is occurring.
Fig 7

Investigation of the ddTASEP Inter-Burst Interval and Average Burst Size.

(A) Representative smoothed kernel density estimates of log10(t) weighted by log10(t) using same parameter sets as Fig 2. (B) Empirical cumulative distribution functions of waiting times (black line) from ddTASEPs with the same parameters as in Fig 2. The time-headway cumulative distribution function is overlaid for each plot (Eq 42). The red dashed time headway distributions have robust bursting with an effective density ρ = 0.5. In contrast, the green dashed lines indicate initiation limited cases where unimodality was observed in Fig 7A, and ρ is given by Eq 39. (C) shows the average inter-burst interval plotted as a function of E = 4/γ. (D) shows the average inter-burst interval plotted as a function of index of dispersion of the first passage time to enter a nucleosome D. (E) shows burst size plotted as a function of mean first passage rate γ with the solid lines proportional to . Panels (C), (D), and (E) were generated with Fig 5A–5C. γ is varied from 0.001 to 0.99 by adjusting h for fixed values of h equal to 0.001 (blue), 0.01 (green), and 0.1 (orange) omitting cases that failed to show bimodality.

Investigation of the ddTASEP Inter-Burst Interval and Average Burst Size.

(A) Representative smoothed kernel density estimates of log10(t) weighted by log10(t) using same parameter sets as Fig 2. (B) Empirical cumulative distribution functions of waiting times (black line) from ddTASEPs with the same parameters as in Fig 2. The time-headway cumulative distribution function is overlaid for each plot (Eq 42). The red dashed time headway distributions have robust bursting with an effective density ρ = 0.5. In contrast, the green dashed lines indicate initiation limited cases where unimodality was observed in Fig 7A, and ρ is given by Eq 39. (C) shows the average inter-burst interval plotted as a function of E = 4/γ. (D) shows the average inter-burst interval plotted as a function of index of dispersion of the first passage time to enter a nucleosome D. (E) shows burst size plotted as a function of mean first passage rate γ with the solid lines proportional to . Panels (C), (D), and (E) were generated with Fig 5A–5C. γ is varied from 0.001 to 0.99 by adjusting h for fixed values of h equal to 0.001 (blue), 0.01 (green), and 0.1 (orange) omitting cases that failed to show bimodality. Next, we investigated the internal transcription dynamics of bursting ddTASEPs versus non-bursting ddTASEPs in Fig 7B. In the classic TASEP, the waiting time between particles is given by the time headway distribution f(t;ρ). The probability density function f(t;ρ) of this distribution is given by [60] and its cumulative distribution function F(t;ρ) is given by The mean of this distribution is known to be . For a burst group at the end of a gene, we hypothesized that it would exhibit max flux limit TASEP dynamics since the entry rate into the burst group is given by q and the termination rate is given by β = q. Therefore, the effective density of a burst group would be ρ = 0.5. In contrast, for a case without bursting, the density of the system would be given by our initiation limited estimate in Eq 39. The empirical cumulative distribution functions in Fig 7B directly confirm this result. In Fig 7C, the relationship between the average inter-burst interval and the first passage time to clear a nucleosome E = 4/γ was considered. While there is a clear correlation between E and , there is clearly dependence on the magnitude of h that is not appropriately accounted for here. Therefore, we investigated the hypothesis that the inter-burst interval should scale with index of dispersion of the time to enter a nucleosome since D quantifies the deviation from classical TASEP behavior. The variance V will be directly related to the observed inter-burst intervals t since these contribute most strongly to the variance out of all of the observed waiting times t since the intra-burst intervals are always approximately . However, a system with a long MFPT with frequent, short inter-burst intervals t and a system with a short MFPT with infrequent, long inter-burst intervals t could have comparable variances. Therefore, to better account for the relative perturbation strength and to maintain dimensional consistency, the index of dispersion D is preferable to the variance V. Fig 7D shows that all three data series collapse onto a single curve and that the average inter-burst intervals exhibit a power law scaling with D that could be slightly improved with additional correction for α and h. However, the correlation is still strong with r = 0.82 even though the relationship is not completely linear. Fig 7E investigates the limiting values of burst size. Under saturation conditions, the maximum average burst size is approximately This result is surprising since it initially seems like burst size should be controlled by the flux and the absolute time scale of the nucleosome unwrapping (1/h) because this should be connected to the number of polymerases that can accumulate behind a closed nucleosome. However, when the gene is saturated, a constant number of polymerases accumulate behind each closed nucleosome. The limiting factor under these conditions is how fast the nucleosome can close in such a way that it splits burst groups to finite size as shown in the anti-correlated regions observed in the Fig 2F.

The existence of bursts under initiation limited conditions

While robust bursting is clearly evident in the nucleosome-limited (saturated) region, it is unintuitive whether or not robust bursting occurs in the initiation limited region. In Fig 8A, we investigate the average inter-burst interval as α→0. The predictions given by the time headway distributions (1/J) are given by the dashed lines. Remarkably, the inter-burst intervals are substantially longer than the waiting times between randomly initiated polymerases. In other words, the now represents the sum of the waiting time between rare initiation events plus the sum of the time to clear the nucleosomes which have all returned to resting nucleosome occupancy since the polymerase density is approaching zero. Fig 8B directly confirms the existence of burst groups. In the limit α→1, the max burst size converges to the value given by Eq 43. In contrast, in the limit as α→0, the burst size approaches one. However, our proposed burst identification procedure still identifies bursts for these parameter sets. We hypothesize that the burst sizes are geometrically distributed and that most “apparent bursts” under extreme initiation limited conditions are single polymerases while a smaller subset constitute non-trivial polymerase convoys/burst events.
Fig 8

Bursting in the low initiation rate regime.

(A) shows the average inter-burst interval as a function of α with the headway distribution prediction for the average waiting time given by a dashed line. (B) shows the average burst size plotted as a function of α. Panels (A) and (B) were generated with Fig 5D–5F with α∈[0.001, 1] with h = h∈{0.0143, 0.033} to achieve γ∈{0.1, 0.2}. (C) shows that the variance of burst size (simulated) is equal to the variance of a geometric distribution given by Eq 46 and is colored by −log10 α. (D) compares the mean inter-burst interval and the variance of the inter-burst intervals to establish that the waiting time distribution is exponentially distributed using the data from Fig 5A–5C and is colored by −log10 γ. (E) and (F) show representative burst size probability mass functions generated from geometric distributions with means given by = 3.62 and 1.49 associated with the parameter sets (α = 0.01, γ = 0.2) and (α = 0.002, γ = 0.2).

Bursting in the low initiation rate regime.

(A) shows the average inter-burst interval as a function of α with the headway distribution prediction for the average waiting time given by a dashed line. (B) shows the average burst size plotted as a function of α. Panels (A) and (B) were generated with Fig 5D–5F with α∈[0.001, 1] with h = h∈{0.0143, 0.033} to achieve γ∈{0.1, 0.2}. (C) shows that the variance of burst size (simulated) is equal to the variance of a geometric distribution given by Eq 46 and is colored by −log10 α. (D) compares the mean inter-burst interval and the variance of the inter-burst intervals to establish that the waiting time distribution is exponentially distributed using the data from Fig 5A–5C and is colored by −log10 γ. (E) and (F) show representative burst size probability mass functions generated from geometric distributions with means given by = 3.62 and 1.49 associated with the parameter sets (α = 0.01, γ = 0.2) and (α = 0.002, γ = 0.2). We hypothesize that the burst sizes are geometrically distributed [27] such that where k is the number of polymerases in the burst, and p is defined by The variance of the burst size geometric distribution is given by Fig 8C verifies that this relationship holds for the data set plotted in Fig 8A and 8B, so the assumption that the burst sizes are geometrically distributed is reasonable. Additionally, Fig 8D shows that the variance of the inter-burst intervals is equal to their mean squared, () indicating that burst waiting times are exponentially distributed. Interestingly, these findings bare some similarity to that of a two-state random telegraph model. In this classical model, mRNA are produced continuously for brief periods of time while the promoter of a gene is active, and then, mRNA production stops when the promoter is inactivated [61,62]. This model can be interpreted through the framework of the classical TASEP (without nucleosomes) by considering the association and dissociation of a transcription factor with rates f and f respectively. In this new framework, the promoter would be active for a time period of 1/f (i.e. when the transcription factor dissociates) and the inter-burst interval would be exponentially distributed with mean 1/f (i.e. the waiting time for transcription factor association). Burst groups would move deterministically along the length of the gene with the same bulk velocity, but the density and flux would be scaled by the fraction of time the promoter was active . The burst size distribution would then be geometrically distributed with a burst termination probability of and a mean of (i.e. the flux times the expected time the promoter will remain active). As in the two-state telegraph model, the burst sizes of our proposed ddTASEP also follow a geometric distribution, and the inter-burst intervals are exponentially distributed [27]. However, there are a few key distinctions. First, the burst sizes in our proposed model have an upper bound given by Eq 43 that is proportional to and is completely independent of the flux. Second, if J Last, we will verify the ddTASEP’s capacity to induce bursting for two cases in the extreme initiation limited regime for the parameter sets (α = 0.01, γ = 0.2) and (α = 0.002, γ = 0.2) which produced average burst sizes of 3.62 and 1.49, respectively. The probability mass functions given by Eq 44 are plotted for both cases in Fig 8D and 8E. For the case with the higher initiation rate (α = 0.01) in Fig 8D, bursts of up to 15 polymerases can be observed in some rare events. More interestingly for the case with (α = 0.002) in Fig 8E, the convoys can regularly include up to six polymerases even though the bulk density ρ for this case is 0.01 and the flux J is 0.002 (which would correspond to a 500 second average waiting time in a classical TASEP). In short, non-trivial bursting is still possible in the extreme initiation limited region. From a biological standpoint, these parameter sets are obtainable. In Fig 8, cases with γ equal to 0.1 and 0.2 are considered. Veloso et al. utilized BruDRB-seq to demonstrate that the range of first passage elongation rates γ in K562 Leukemia cells ranged from 0.0083 to 1 site/s [19]. Bintu et al. showed via optical trap experiments that nucleosome unwrapping took between 10–100 seconds [13], and the cases presented here assume h = h = 0.0143 or 0.0333 which correspond to roughly 30–70 seconds. Tantale et al.’s study of HIV1 gene transcription (under high initiation rates) via MS2-tagging suggested that polymerases could move in convoys of 10–25 with an intra-burst waiting time of 4.3 seconds which is consistent with the predictions from the proposed geometric distribution for burst size and the TASEP time headway distribution [63]. Therefore, nucleosome-induced pausing serves a viable mechanism for transcriptional bursting on its own.

Comparing the biological effects of static and dynamic defects

In this section, we numerically analyze RNAPII elongation on non-uniform DNA where the advance rate q or the nucleosome rate constants h and h may vary along the gene. We assume that the gene contains a 1600 bp (8 nucleosomes/32 site), partially methylated CpG Island at the promoter and seven 400 bp (2 nucleosomes/8 site) regions that correspond to exons and the adjacent intron-exon boundary regions where co-transcriptional splicing occurs [18,64]. (The reader should note most genes contain one long exon of median length 1,000–1,500 bp and many smaller exons of median length 150–200 bp [65]). For the cases where we treated these features as static defects, we assigned a unique value of q to each site i. First, the effect of variable GC content was incorporated into each site’s advance rate by the following equation where the average advance rate is set by q, and N represents a binomial-distributed sample of the number of guanine and cytosine base-pairs present in the 50 bp interval. Then, the values of q (obtained from Eq 47) were reduced by an additional 50% on the sites corresponding to the CpG Island and the exons. The minimum advancement rate obtained from this procedure (i.e. static defect slowdown of q = 0.5q plus additional GC content penalty) used to generate the lattice for the gene in Fig 9A and 9B was q = 0.43.
Fig 9

Effects of static defects and nucleosome rate constant variability on RNAPII density profiles.

All subpanels have h increasing from 0.001 to 0.1 from left to right and h increasing from 0.001 to 1 from top to bottom as in Fig 2. 50 replicates of N = 2×106 Monte Carlo steps are performed for all cases. All plots list the simulated flux J, the theory prediction for the unperturbed system J, and the theory prediction based on the slowest site J. The left column (A, C) contains simulations at a low initiation rate of α = 0.05 while the right column (B, D) contains simulations at α = 0.25 which is saturating for the slow sites. The top row (A, B) shows density profiles from a gene with static defects representing a CpG Island (32 sites) on the promoter and 7 exons (8 sites) with a static defect advance rate of q = 0.5q on these sites with additional GC content heterogeneity given by Eq 48 leading to a minimum advance rate q = 0.43. The bottom row (C, D) shows density profiles from the same gene, but the variability is introduced through the rate constants h and h. GC content heterogeneity is introduced via Eqs 49 and 50. On the CpG Island and exon sites, h is increased by 25%, and h is decreased by 25%. With the addition of GC content heterogeneity, the maximum observed max(h) was 27% higher than the average, and the minimum observed min(h) was 26% lower than the average.

Effects of static defects and nucleosome rate constant variability on RNAPII density profiles.

All subpanels have h increasing from 0.001 to 0.1 from left to right and h increasing from 0.001 to 1 from top to bottom as in Fig 2. 50 replicates of N = 2×106 Monte Carlo steps are performed for all cases. All plots list the simulated flux J, the theory prediction for the unperturbed system J, and the theory prediction based on the slowest site J. The left column (A, C) contains simulations at a low initiation rate of α = 0.05 while the right column (B, D) contains simulations at α = 0.25 which is saturating for the slow sites. The top row (A, B) shows density profiles from a gene with static defects representing a CpG Island (32 sites) on the promoter and 7 exons (8 sites) with a static defect advance rate of q = 0.5q on these sites with additional GC content heterogeneity given by Eq 48 leading to a minimum advance rate q = 0.43. The bottom row (C, D) shows density profiles from the same gene, but the variability is introduced through the rate constants h and h. GC content heterogeneity is introduced via Eqs 49 and 50. On the CpG Island and exon sites, h is increased by 25%, and h is decreased by 25%. With the addition of GC content heterogeneity, the maximum observed max(h) was 27% higher than the average, and the minimum observed min(h) was 26% lower than the average. For the cases where the pause inducing features were assumed to modulate the nucleosome dynamics, we assigned a unique pair of rate constants h and h to the m nucleosome. The effect of variable GC content was incorporated into each pair of rate constants as follows where h and h represent the average wrapping and unwrapping rate and N now considers 150 base pairs instead of 50 to reflect the size of the nucleosome associated region. On the CpG island and on the exons, the values of h were increased by an additional 25% while the values of h were decreased an additional 25% from the values obtained from Eqs 48 and 49. Our random sampling procedure (i.e. h = 1.25h + additional GC content enhancement and h + GC content penalty) used to generate the gene in Fig 9C and 9D led to the max close rate being 27% higher than the average h and the minimum open rate being 26% lower than the average h. In Fig 9A and 9B, the effects of static defects on advance rate parameter variability is considered under initiation limited (α = 0.05, 9A) and saturating conditions (α = 0.25, 9B). The reader should note that all of our prior definitions for v, ρ, and J given in Eqs 39–41 assume that α and γ are scaled relative to q. Therefore, while α = 0.25 is not saturating for q = 1, the rescaled is saturating for the slow sites. A few features become immediately obvious. First, regions of elevated polymerase density have appeared at the sites of the static defects due to the increased dwell time on these sites, and the reduced flux through these sites has non-trivially reduced the transcription flux through the system (which must be constant across all sites). As the nucleosome dynamics slow down, the localized regions of elevated polymerase density smear out until the density profiles begin to qualitatively look like those in Fig 2C in the absence of perturbations. In Fig 9A (barring the top left two plots), the flux through the gene is roughly equal to or slightly less than the flux predicted by the slowest bottleneck. The most significant discrepancies arise for cases with h≫h (top and middle right) where the simulated flux is significantly lower than the predicted flux through the slow sites. We hypothesize that this results from a synergistic effect where slow downs on the static defects increase the likelihood of nucleosome rebinding immediately after them. In contrast, for Fig 9B (barring the top left two cases), the fluxes mostly lie between the upper and lower bounds set by the unperturbed and most perturbed predictions. However, in the case of fast nucleosome dynamics in Fig 9B (bottom row), the simulated values converge to the predictions for the slowest site. Last, the discrepancies between the predicted and simulated fluxes that existed for the cases with h≫h when α = 0.05 (Fig 9A, top and middle left) have disappeared when the initiation rate was increased to α = 0.25. We suspect that saturating the gene has led to the formation of robust convoys that overcome the nucleosome rebinding effect that occurs after slow sites. While Fig 9A and 9B demonstrate that static defects can throttle the flux, the resulting density profiles in Fig 9B have sharp discontinuities. Next, in Fig 9C and 9D, we investigate the alternative hypothesis that these genetic and epigenetic features could be inducing pausing by causing local variation in the nucleosome rate constants without inducing the same discontinuities. In Fig 9C and 9D, it is clear that this alternative mechanism of action produces a similar throttling effect but with a smoother density profile. In Fig 9C, the simulated flux is dramatically reduced even relative to the fluxes in the analogous subplots in Fig 9A suggesting that small variations in the nucleosome rate constants can have a substantially stronger effect than even adjusting the nominal advance rate. Second, in many cases, the observed flux is substantially lower than the prediction for slow sites. We suspect that this effect is associated with the sharp decay in density at the left-hand boundaries in Fig 9C that is not observed in the static defect case. We hypothesize that the faster nucleosome rebinding on the promoter disrupts the initial formation of convoys until later on the gene causing a reduction in flux due to increased polymerase-nucleosome collisions. While Fig 9C shows that local modification of the nucleosome dynamics can affect the flux at low initiation rates, Fig 9D shows the limitations of this hypothesis at higher initiation rates. For higher initiation rates, the exclusionary binding rule leads to nucleosome unwrapping/destabilization even on the faster binding sites. Thus, when the system starts to saturate, the observed fluxes increase from at or slightly below the lower bound set by J to an intermediate value between the bounds or even the upper bound set by J.

Discussion

In this study, we present a stochastic, agent-based model of transcription with nucleosome induced pausing that maps onto the ddTASEP. The model reproduces key features of transcription including cooperative nucleosome destabilization [23,24,66], polymerase convoy formation [63], and transcriptional bursting [56]. Further, the model provides significant insight into the physics of jamming transitions in the presence of dynamic defects [34]. In lieu of a traditional mean-field approach to solving the ddTASEP dynamics, we calculated the moments of first passage time (MFPT and VFPT) using a Markov chain approach [37,39], yielding exact results for both. Our approach allowed us to directly quantify the effects of the microscopic nucleosome rate constants on the first passage elongation rate γ. Through our analysis of the index of dispersion of the waiting time to enter a nucleosome (D), we observed that significant deviation from classical TASEP behavior can occur as the resting nucleosome density increases, but these deviations from classical behavior are more dramatic as the unwrapping kinetics slow down (h→0) even at fixed nucleosome density. This is potentially consistent with observations such as those in Bintu et al. that showed that multiple mechanisms to reduce transcriptional pausing exist. In acetylation, the thermodynamic equilibrium of the resting nucleosome density (wrapping state) is altered with minimal effect on the rates of wrapping and unwrapping [13]. In contrast, tail-less nucleosomes (which are a minimally explored epigenetic regulatory mechanism [67]) had substantially faster rates of unwrapping and re-wrapping with no change in the resting nucleosome density (wrapping state) [13]. Both mechanisms contributed to fewer transcriptional pauses of shorter duration relative to transcription through unmodified nucleosomes. Using the mean first passage elongation rate γ, we constructed a new axis to the fundamental TASEP αβ-phase diagram. Subsequently, we developed analytical approximations for the core transcriptional properties (transcription flux J, bulk density ρ, and bulk elongation rate v) based on assumptions about the asymptotic behavior of the system for low initiation rates and near the max flux limit. We identified a novel jamming transition in the αγ-plane that separated the transcriptional dynamics into initiation limited and nucleosome (dynamic defect) limited regions. Additionally, these estimates were robust to changes in gene length and to changes in the geometry of nucleosome spacing. Further research is merited into the physical properties of this system. For instance, the effects of varying β in the αβγ-volume and the effects of varying the dynamic defect length should be explored. While the burst size and waiting time distributions shared similarities with two-state models, the model provided insight into a novel mechanism for transcriptional bursting that does not intrinsically rely on standard mechanisms at the promoter (such as two-state models) [25,26,56]. ddTASEP burst groups were shown to be quantitatively (via the time headway distribution [60]) similar to shorter TASEPs nested within the ddTASEP lattice. Further, the average inter-burst intervals were shown to correlate strongly with the index of dispersion D, and the max burst size was observed to be proportional to . In eukaryotic systems, many transcriptional initiation rates are slow with most genes showing one RNAPII on a loci on average at a time [68]. However, even in the extreme low initiation rate region, non-trivial transcriptional bursts were observed on average. Further, given that the burst sizes were geometrically distributed, non-trivial bursting can occur even in extreme initiation limited region. While the nucleosome-mediated bursting hypothesis is interesting, our model more importantly demonstrates both how powerfully and how versatilely nucleosomes dynamics can modulate transcriptional elongation and flux. An emerging body of evidence is forming in the literature that implicates the role of enzymes such as BRD4 and DNMT1/DNMT3B as epigenetic drivers of cancer and epithelial to mesenchymal transition. BRD4 is a histone acetyltransferase associated with histone displacement, chromatin de-compactification, and cell cycle progression. BRD4’s upregulation has been associated with EMT in hepatocellular carcinoma and in salivary adenoid cystic carcinoma [69-71]. DNMT’s are DNA methyltransferases that maintain gene silencing by hypermethylating CpG islands which encourages tighter nucleosome occupancy. Loss of DNMT activity leads to loss of nucleosomes on the promoter of tumor suppressor genes leading to their transcriptional activation [72]. The phase diagrams of the proposed ddTASEP confirm that even minor changes of the nucleosome rate constants will significantly adjust γ giving rise to a large dynamic range of behavior even for fixed initiation rates α consistent with these patterns of transcriptional upregulation. Further, unlike transcription factor dynamics or other shorter time-scale processes traditionally considered in two state models of transcription regulation, these nucleosome mediated methods of transcriptional control can span timescales from days to weeks giving long term control [73]. Last, in the final section of the manuscript, the proposed ddTASEP was extended to consider both static site defects that introduce variability in the RNAPII nominal advance rate and localized variability in the nucleosome rate constants that also induced pausing. Our results demonstrated that the nucleosome dynamics could be tuned individually to induce localized pausing effects that may occur at genetic points of interest like CpG islands or exons without introducing discontinuous shock profiles that would arise from treating these as static defects. Further, the resulting throttling effects on the flux were comparable if not stronger than those obtained in the static defect case. Additionally, the capacity to modify the nucleosome dynamics along the gene body provides an interesting capability in this model that is not attainable in a two-state transcription model. Utilizing our model, it may be possible to tune the nucleosome dynamics in different regions to serve different functions. For instance, a unique set of wrapping dynamic rate constants could be assigned to the promoter-proximal region to simulate promoter associated histone modifications like H3K9me3 silencing or H3K27ac activation while the gene body could have a separate set of modifications such as H3K79me3 or H4K20me1 promoting transcriptional elongation on the gene body leading to locally elevated polymerase density near the transcription start site with a homogeneous gene body (e.g. see Akhtar et al.) comparable to our density profiles in Fig 9C [19,74,75].

Validity of lattice coarse-graining.

(A) compares the analytical results for the expected waiting time to enter a nucleosome between the single site and fifty site model showing that the results are mathematically identical. (B) compares the analytical results for the variance of the waiting time to enter a nucleosome between the single site and 50 site models confirming that they converge to each other in the limit as , →0. (A) and (B) are log-log plots with set to {0.001, 0.01, 0.1, 1} corresponding to red, blue, yellow, and purple markers respectively with adjusted to achieve the desired values of and . (C) Pseudo-color plots of the percent contribution of the bare DNA passage time to the mean and variance of first passage time to clear a nucleosome unit (left) and (right) for the single site (top) and 50 site (bottom) models. (TIF) Click here for additional data file. 25 Aug 2021 Dear Prof. Shen, Thank you very much for submitting your manuscript "Slow Nucleosome Dynamics set the Transcriptional Speed Limit and Induce RNA Polymerase II Traffic Jams and Bursts" for consideration at PLOS Computational Biology. As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. In light of the reviews (below this email), we would like to invite the resubmission of a significantly-revised version that takes into account the reviewers' comments. We cannot make any decision about publication until we have seen the revised manuscript and your response to the reviewers' comments. Your revised manuscript is also likely to be sent to reviewers for further evaluation. When you are ready to resubmit, please upload the following: [1] A letter containing a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out. [2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file). Important additional instructions are given below your reviewer comments. Please prepare and submit your revised manuscript within 60 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email. Please note that revised manuscripts received after the 60-day due date may require evaluation and peer review similar to newly submitted manuscripts. Thank you again for your submission. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments. Sincerely, Alexandre V. Morozov, Ph.D. Associate Editor PLOS Computational Biology Mark Alber Deputy Editor PLOS Computational Biology *********************** Reviewer's Responses to Questions Comments to the Authors: Please note here if the review is uploaded as an attachment. Reviewer #1: The article explains an interesting phenomenon of appearance of traffic jams on molecular highways as a result of slowing down of nucleosome transcriptional rates, which is an undesirable situation. The model used is ddTASEP which is one of the recent agent based models for studying gene transcription process. Overall the work is nice. I recommend the article for publication in this journal. Reviewer #2: Mines et al. study the impact of nucleosome unwrapping on transcriptional dynamics. They introduce a stochastic model of particles (i.e. the RNA polymerase) moving on a lattice (i.e. the gene sequence), where the local speed also stochastically vary between two values (accounting for nucleosome wrapping/unwrapping). This model can be generically formulated as a variant of the TASEP model on an open lattice with dynamic defect (ddTASEP). Mines et al. mainly use simulations to simulate the properties of the model and study the impact of the parameters on the observed dynamics. They notably show how transcriptional bursting can be controlled by the nucleosome dynamics under this model, analytically study the first moments of the first passage time a particle to complete transcription and upon introducing a parameter (gamma) that accounts for the average rate of elongation without particle obstruction, they estimate the role of gamma on explaining the flux and density at equilibrium with a partial phase diagram, using simple heuristics and numerical simulations. To my knowledge, the theoretical study of the role of nucleosomes in transcriptional pausing and bursting is new and interesting, as an alternative to other models proposed that notably focus on initiation control mechanism. However, I have major concerns about the definition of the model and technical aspects of the paper, which make me doubtful about its relevance in the study of transcription dynamics. Major comments 1. The authors map the particle occupancy to 1 site, by scaling the transcription bubble to the unit length of the lattice in the TASEP. In reality, RNA polymerases move site by site with a rule of excluding the move if another polymerase is located L nt further, where L is the size of the bubble. These dynamics obey that of a TASEP with particles of extended size that has been studied in the context of translation and transcription (see Zia et al. for a review https://doi.org/10.1007/s10955-011-0183-1 and many other papers, included those cited below). It is also to be noted that at least two recent papers which I am aware of on this topic, e.g. Zuo and Chou arXiv:2103.05151 , and Turowski et al. https://doi.org/10.1016/j.molcel.2020.06.002, use a TASEP model with extended particle size to model transcription. The approximation that consists of scaling a sum of moves with exponential rates to a single exponential jump is a priori wrong (a sum of independent exponential r.v.s is not an exponential r.v.), so the authors need to clarify why this simplification should be still relevant. 2. The aforementioned papers of Zuo and Chou and Turowski et al. especially focus on some important aspects of RNA Pol dynamics, including backtracking and DNA torque. While these studies don’t require or assume a nucleosome free DNA in the modelling, I believe that the present paper is strongly limited in the biological interpretation, as there is no backtracking or heterogeneity in transcription (for example GC content biases the elongation rate), which are known to affect transcription dynamics. With the orders of magnitude involved, is it reasonable to consider a TASEP model that does not take these aspects into account? 3. While the authors investigate the size and frequency of transcriptional bursts, I find surprising that they don’t relate their study to the widely used model of transcriptional burst induced by two state promoter (see review by Sanchez and Golding DOI: 10.1126/science.1242975 , or this more recent review doi: 10.1111/brv.12452 ), with many experimental papers working with this framework (including recent papers such as DOI: 10.1126/sciadv.aaz6699 or doi : 10.1038/s41467-021-24461-6) to explain burst size and frequency. While this mechanism controls the burst at initiation, does an elongation-controlled burst yields any difference, or could be better explaining some experimental observations? 4. In the section where the parameter gamma is introduced and studied, as the modified elongation rate due to histone dynamics, I find the derivation of the mean and variance for the FPT not very interesting (it is a quite basic calculation using continuous time Markov chain); the relevance of this part is stated at the very end of the paper, but with no evidence or reference that would support the authors claim, so I think this should either be a supplement, or revised to make it more relevant with actual experiments and applications. 5. Technically, the authors provide some heuristics to infer the flux and densities that yield a partial phase diagram in gamma. While the results are sound, I also suspect they can also be quite naturally derived from a mean-field approximation to solve the master equation associated with the process (some formulas seem quite close to the TASEP with extended particles, with some correspondence between the parameters gamma and the particle size DOI:10.1088/0305-4470/36/8/302). This would confirm the authors result and help to interpret the limitations that seem to numerically appear in a certain range of the parameters. Minor comments 1. There should be a rigorous mathematical definition of the ddTASEP model in the methods sections 2. Fig 4: In the IL region, typo on J ********** Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code —e.g. participant privacy or use of data from a third party—those must be specified. Reviewer #1: Yes Reviewer #2: Yes ********** PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy. Reviewer #1: Yes: Isha Dhiman Reviewer #2: No Figure Files: While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, . PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at . Data Requirements: Please note that, as a condition of publication, PLOS' data policy requires that you make available all data used to draw the conclusions outlined in your manuscript. Data must be deposited in an appropriate repository, included within the body of the manuscript, or uploaded as supporting information. This includes all numerical values that were used to generate graphs, histograms etc.. For an example in PLOS Biology see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5. Reproducibility: To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols 7 Dec 2021 Submitted filename: ResponseToReviewers.docx Click here for additional data file. 6 Jan 2022 Dear Prof. Shen, We are pleased to inform you that your manuscript 'Slow Nucleosome Dynamics set the Transcriptional Speed Limit and Induce RNA Polymerase II Traffic Jams and Bursts' has been provisionally accepted for publication in PLOS Computational Biology. Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests. Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated. IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript. Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS. Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Computational Biology. Best regards, Alexandre V. Morozov, Ph.D. Associate Editor PLOS Computational Biology Mark Alber Deputy Editor PLOS Computational Biology *********************************************************** Reviewer's Responses to Questions Comments to the Authors: Please note here if the review is uploaded as an attachment. Reviewer #1: I recommend this paper for publication in this journal. Reviewer #2: The authors thoroughly answered the points raised in my review and strengthened their manuscript while opening some very interesting lines for future work. I am happy to recommend it for publication in PLoS Comp Bio ********** Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code —e.g. participant privacy or use of data from a third party—those must be specified. Reviewer #1: Yes Reviewer #2: Yes ********** PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy. Reviewer #1: Yes: ISHA DHIMAN Reviewer #2: No 2 Feb 2022 PCOMPBIOL-D-21-01235R1 Slow Nucleosome Dynamics set the Transcriptional Speed Limit and Induce RNA Polymerase II Traffic Jams and Bursts Dear Dr Shen, I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course. The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript. Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers. Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work! With kind regards, Anita Estes PLOS Computational Biology | Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom ploscompbiol@plos.org | Phone +44 (0) 1223-442824 | ploscompbiol.org | @PLOSCompBiol
  52 in total

1.  Fast transcription rates of RNA polymerase II in human cells.

Authors:  Paolo Maiuri; Anna Knezevich; Alex De Marco; Davide Mazza; Anna Kula; Jim G McNally; Alessandro Marcello
Journal:  EMBO Rep       Date:  2011-12-01       Impact factor: 8.807

2.  RNA polymerase II stalled at a thymine dimer: footprint and effect on excision repair.

Authors:  C P Selby; R Drapkin; D Reinberg; A Sancar
Journal:  Nucleic Acids Res       Date:  1997-02-15       Impact factor: 16.971

3.  Crowding-induced transcriptional bursts dictate polymerase and nucleosome density profiles along genes.

Authors:  Aafke A van den Berg; Martin Depken
Journal:  Nucleic Acids Res       Date:  2017-07-27       Impact factor: 16.971

4.  Dynamics of epigenetic regulation at the single-cell level.

Authors:  Lacramioara Bintu; John Yong; Yaron E Antebi; Kayla McCue; Yasuhiro Kazuki; Narumi Uno; Mitsuo Oshimura; Michael B Elowitz
Journal:  Science       Date:  2016-02-12       Impact factor: 47.728

5.  ATAC-seq: A Method for Assaying Chromatin Accessibility Genome-Wide.

Authors:  Jason D Buenrostro; Beijing Wu; Howard Y Chang; William J Greenleaf
Journal:  Curr Protoc Mol Biol       Date:  2015-01-05

6.  Gene transcription in bursting: a unified mode for realizing accuracy and stochasticity.

Authors:  Yaolai Wang; Tengfei Ni; Wei Wang; Feng Liu
Journal:  Biol Rev Camb Philos Soc       Date:  2018-07-19

7.  The generation of promoter-mediated transcriptional noise in bacteria.

Authors:  Namiko Mitarai; Ian B Dodd; Michael T Crooks; Kim Sneppen
Journal:  PLoS Comput Biol       Date:  2008-07-11       Impact factor: 4.475

8.  DNA methylation determines nucleosome occupancy in the 5'-CpG islands of tumor suppressor genes.

Authors:  A Portela; J Liz; V Nogales; F Setién; A Villanueva; M Esteller
Journal:  Oncogene       Date:  2013-05-20       Impact factor: 9.867

9.  The impact of ribosomal interference, codon usage, and exit tunnel interactions on translation elongation rate variation.

Authors:  Khanh Dao Duc; Yun S Song
Journal:  PLoS Genet       Date:  2018-01-16       Impact factor: 5.917

10.  Nucleosomal fluctuations govern the transcription dynamics of RNA polymerase II.

Authors:  Courtney Hodges; Lacramioara Bintu; Lucyna Lubkowska; Mikhail Kashlev; Carlos Bustamante
Journal:  Science       Date:  2009-07-31       Impact factor: 47.728

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.