A guiding principle in self-assembly is that, for high production yield, nucleation of structures must be significantly slower than their growth. However, details of the mechanism that impedes nucleation are broadly considered irrelevant. Here, we analyze self-assembly into finite-sized target structures employing mathematical modeling. We investigate two key scenarios to delay nucleation: (i) by introducing a slow activation step for the assembling constituents and, (ii) by decreasing the dimerization rate. These scenarios have widely different characteristics. While the dimerization scenario exhibits robust behavior, the activation scenario is highly sensitive to demographic fluctuations. These demographic fluctuations ultimately disfavor growth compared to nucleation and can suppress yield completely. The occurrence of this stochastic yield catastrophe does not depend on model details but is generic as soon as number fluctuations between constituents are taken into account. On a broader perspective, our results reveal that stochasticity is an important limiting factor for self-assembly and that the specific implementation of the nucleation process plays a significant role in determining the yield.
A guiding principle in self-assembly is that, for high production yield, nucleation of structures must be significantly slower than their growth. However, details of the mechanism that impedes nucleation are broadly considered irrelevant. Here, we analyze self-assembly into finite-sized target structures employing mathematical modeling. We investigate two key scenarios to delay nucleation: (i) by introducing a slow activation step for the assembling constituents and, (ii) by decreasing the dimerization rate. These scenarios have widely different characteristics. While the dimerization scenario exhibits robust behavior, the activation scenario is highly sensitive to demographic fluctuations. These demographic fluctuations ultimately disfavor growth compared to nucleation and can suppress yield completely. The occurrence of this stochastic yield catastrophe does not depend on model details but is generic as soon as number fluctuations between constituents are taken into account. On a broader perspective, our results reveal that stochasticity is an important limiting factor for self-assembly and that the specific implementation of the nucleation process plays a significant role in determining the yield.
Efficient and accurate assembly of macromolecular structures is vital for living organisms. Not only must resource use be carefully controlled, but malfunctioning aggregates can also pose a substantial threat to the organism itself (Jucker and Walker, 2013; Drummond and Wilke, 2009). Furthermore, artificial self-assembly processes have important applications in a variety of research areas like nanotechnology, biology, and medicine (Zhang, 2003; Whitesides and Grzybowski, 2002; Whitesides et al., 1991). In these areas, we find a broad range of assembly schemes. For example, while a large number of viruses assemble capsids from identical protein subunits, some others, like the Escherichia virus T4, form highly complex and heterogeneous virions encompassing many different types of constituents (Zlotnick et al., 1999; Zlotnick, 2003; Hagan, 2014; Leiman et al., 2010). Furthermore, artificially built DNA structures can reach up to Gigadalton sizes and can, in principle, comprise an unlimited number of different subunits (Ke et al., 2012; Reinhardt and Frenkel, 2014; Gerling et al., 2015; Wagenbauer et al., 2017). Notwithstanding these differences, a generic self-assembly process always includes three key steps: First, subunits must be made available, for example by gene expression, or rendered competent for binding, for example by nucleotide exchange (Alberts and Johnson, 2015; Chen et al., 2008; Whitelam, 2015) (‘activation’). Second, the formation of a structure must be initiated by a nucleation event (‘nucleation’). Due to cooperative or allosteric effects in binding, there might be a significant nucleation barrier (Chen et al., 2008; Jacobs and Frenkel, 2015; Sear, 2007; Lazaro and Hagan, 2016; Hagan and Elrad, 2010). Third, following nucleation, structures grow via aggregation of substructures (‘growth’). To avoid kinetic traps that may occur due to irreversibility or very slow disassembly of substructures (Hagan et al., 2011; Grant et al., 2011), structure nucleation must be significantly slower than growth (Zlotnick et al., 1999; Ke et al., 2012; Reinhardt and Frenkel, 2014; Wei et al., 2012; Jacobs et al., 2015; Hagan and Elrad, 2010). Physically speaking, there are no irreversible reactions. However, in the biological context, self-assembly describes the (relatively fast) formation of long-lasting, stable structures. Therefore, at least part of the assembly reactions are often considered to be irreversible on the time scale of the assembly process. In this manuscript we investigate, for a given target structure, whether the nature of the specific mechanism employed in order to slow down nucleation influences the yield of assembled product. To address this question, we examine a generic model that incorporates the key elements of self-assembly outlined above.
Model definition
We model the assembly of a fixed number of well-defined target structures from limited resources. Specifically, we consider a set of different species of constituents denoted by which assemble into rings of size . The cases and () are denoted as homogeneous and partially (fully) heterogeneous, respectively. The homogeneous model builds on previous work on virus capsid (Chen et al., 2008; Hagan et al., 2011), linear protein filament assembly (Michaels et al., 2016; Michaels et al., 2017; D'Orsogna et al., 2012) and aggregation and polymerization models (Krapivsky et al., 2010). The heterogeneous model in turn links to previous model systems used to study, for example, DNA-brick-based assembly of heterogeneous structures (Murugan et al., 2015; Hedges et al., 2014; D'Orsogna et al., 2013). We emphasize that, even though strikingly similar experimental realizations of our model exist (Gerling et al., 2015; Wagenbauer et al., 2017; Praetorius and Dietz, 2017), it is not intended to describe any particular system. The ring structure represents a general linear assembly process involving building blocks with equivalent binding properties and resulting in a target of finite size. The main assumption in the ring model is that the different constituents assemble linearly in a sequential order. In many biological self-assembling systems like bacterial flagellum assembly or biogenesis of the ribosome subunits the assumption of a linear binding sequence appears to be justified (Peña et al., 2017; Chevance and Hughes, 2008). In order to test the validity of our results beyond these constraints we also perform stochastic simulations of generalized self-assembling systems that do not obey a sequential binding order: i) by explicitly allowing for polymer-polymer bindings and ii) by considering the assembly of finite sized squares that grow independently in two dimensions (see Figures 6 and 7).The assembly process starts with inactive monomers of each species. We use to denote the initial concentration of each monomer species, where is the reaction volume. Monomers are activated independently at the same per capita rate , and, once active, are available for binding. Binding takes place only between constituents of species with periodically consecutive indices, for example 1 and 2 or and 1 (leading to structures such as for ); see Figure 1. To avoid ambiguity, we restrict ring sizes to integer multiples of the number of species . Furthermore, we neglect the possibility of incorrect binding, for example species 1 binding to 3 or . Polymers, that is incomplete ring structures, grow via consecutive attachment of monomers. For simplicity, polymer-polymer binding is disregarded at first, as it is typically assumed to be of minor importance (Zlotnick et al., 1999; Chen et al., 2008; Murugan et al., 2015; Haxton and Whitelam, 2013). To probe the robustness of the model, later we consider an extended model including polymer-polymer binding for which the results are qualitatively the same (see Figure 6 and the discussion). Furthermore, it has been observed that nucleation phenomena play a critical role for self-assembly processes (Ke et al., 2012; Wei et al., 2012; Reinhardt and Frenkel, 2014; Chen et al., 2008). So it is in general necessary to take into account a critical nucleation size, which marks the transition between slow particle nucleation and the faster subsequent structure growth (Michaels et al., 2016; Lazaro and Hagan, 2016; Morozov et al., 2009; Murugan et al., 2015). We denote this critical nucleation size by , which in terms of classical nucleation theory corresponds to the structure size at which the free energy barrier has its maximum. For attachment of monomers to existing structures and decay of structures (reversible binding) into monomers take place at size-dependent reaction rates and , respectively (Figure 1). Here, we focus on identical rates and . A discussion of the general case is given in Appendix 4. Above the nucleation size, polymers grow by attachment of monomers with reaction rate per binding site. As we consider successfully nucleated structures to be stable on the observational time scales, monomer detachment from structures above the critical nucelation size is neglected (irreversible binding) (Murugan et al., 2015; Chen et al., 2008). Complete rings neither grow nor decay (absorbing state).
Figure 1.
Schematic description of the model.
(a) Rings of size are assembled from different particle species. monomers of each species are initially in an inactive state (blue) and are activated at the same per-capita rate . Once active (green), species with periodically consecutive index can bind to each other. Structures grow by attachment of single monomers. Below a critical nucleation size (), structures of size (light yellow) grow and decay into monomers at size-dependent rates and , respectively. Above the critical size, polymers (dark yellow) are stable and grow at size-independent rate until the ring is complete (the absorbing state; red). (b) Illustration of depletion traps. If nucleation is slow compared to growth, initiated structures are likely to be completed. Otherwise, many stable nuclei will form that cannot be completed before resources run out.
Schematic description of the model.
(a) Rings of size are assembled from different particle species. monomers of each species are initially in an inactive state (blue) and are activated at the same per-capita rate . Once active (green), species with periodically consecutive index can bind to each other. Structures grow by attachment of single monomers. Below a critical nucleation size (), structures of size (light yellow) grow and decay into monomers at size-dependent rates and , respectively. Above the critical size, polymers (dark yellow) are stable and grow at size-independent rate until the ring is complete (the absorbing state; red). (b) Illustration of depletion traps. If nucleation is slow compared to growth, initiated structures are likely to be completed. Otherwise, many stable nuclei will form that cannot be completed before resources run out.We investigate two scenarios for the control of nucleation speed, first separately and then in combination. For the ‘activation scenario’ we set (all binding rates are equal) and control the assembly process by varying the activation rate . For the ‘dimerization scenario’ all particles are inherently active () and we control the assembly process by varying the dimerization rate (we focus on ). It has been demonstrated previously in Chen et al. (2008) and (Endres and Zlotnick, 2002; Hagan and Elrad, 2010; Morozov et al., 2009) that either a slow activation or a slow dimerization step are suitable in principle to retard nucleation and favour growth of the structures over the initiation of new ones. We quantify the quality of the assembly process in terms of the assembly yield, defined as the number of successfully assembled ring structures relative to the maximal possible number . Yield is measured when all resources have been used up and the system has reached its final state. We do not discuss the assembly time in this manuscript, however, in Appendix 5 we show typical trajectories for the time evolution of the yield in the activation and dimerization scenario. If the assembly product is stable (absorbing state), the yield can only increase with time. Consequently, the final yield constitutes the upper limit for the yield irrespective of additional time constraints. Therefore, the final yield is an informative and unambiguous observable to describe the efficiency of the assembly reaction.We simulated our system both stochastically via Gillespie’s algorithm (Gillespie, 2007) and deterministically as a set of ordinary differential equations corresponding to chemical rate equations (see Appendix 1).
Results
Deterministic behavior in the macroscopic limit
First, we consider the macroscopic limit, , and investigate how assembly yield depends on the activation rate (activation scenario) and the dimerization rate (dimerization scenario) for . Here, the deterministic description coincides with the stochastic simulations (Figure 2a and b). For both high activation and high dimerization rates, yield is very poor. Upon decreasing either the activation rate (Figure 2a) or the dimerization rate (Figure 2b), however, we find a threshold value, or , below which a rapid transition to the perfect yield of 1 is observed both in the deterministic and stochastic simulation. By exploiting the symmetries of the system with respect to relabeling of species, one can show that, in the deterministic limit, the behavior is independent of the number of species (for fixed and , see Appendix 1). Consequently, all systems behave equivalently to the homogeneous system and yield becomes independent of in this limit. Note, however, that equivalent systems with differing have different total numbers of particles and hence assemble different total numbers of rings.
Figure 2.
Deterministic behavior in the macroscopic limit .
(a, b) Yield for different particle numbers (symbols) and ring sizes (colors) for . Decreasing either (a) the activation rate (‘activation scenario’: ) or (b) the dimerization rate (‘dimerization scenario’: ) achieves perfect yield. The stochastic simulation results (symbols) average over 16 realizations and agree exactly with the integration of the chemical rate equations (lines). The threshold values (Equation 1) are indicated by the vertical dashed lines. Plotting yield against the dimensionless quantity causes the curves for different to collapse into a single master curve (inset in a). For both scenarios there is no dependency on the number of species in the deterministic limit. (c, d) Illustration showing how depletion traps are avoided by either slow activation (c) or slow dimerization (d). If the activation or the dimerization rate is small (large) compared to the growth rate, assembly paths leading to complete rings are favored (disfavored). The color scheme is the same as in Figure 1. (e) Deterministically, the size distribution of polymers behaves like a wave, and is shown for large and small activation rate for , , and . The distributions are obtained from a numerical integration of the deterministic mean-field dynamics, Equation 6, and are plotted for early, intermediate and final simulation times. The respective percentage of inactive monomers and complete rings is indicated by the symbols in the scale bar on the left or right.
Deterministic behavior in the macroscopic limit .
(a, b) Yield for different particle numbers (symbols) and ring sizes (colors) for . Decreasing either (a) the activation rate (‘activation scenario’: ) or (b) the dimerization rate (‘dimerization scenario’: ) achieves perfect yield. The stochastic simulation results (symbols) average over 16 realizations and agree exactly with the integration of the chemical rate equations (lines). The threshold values (Equation 1) are indicated by the vertical dashed lines. Plotting yield against the dimensionless quantity causes the curves for different to collapse into a single master curve (inset in a). For both scenarios there is no dependency on the number of species in the deterministic limit. (c, d) Illustration showing how depletion traps are avoided by either slow activation (c) or slow dimerization (d). If the activation or the dimerization rate is small (large) compared to the growth rate, assembly paths leading to complete rings are favored (disfavored). The color scheme is the same as in Figure 1. (e) Deterministically, the size distribution of polymers behaves like a wave, and is shown for large and small activation rate for , , and . The distributions are obtained from a numerical integration of the deterministic mean-field dynamics, Equation 6, and are plotted for early, intermediate and final simulation times. The respective percentage of inactive monomers and complete rings is indicated by the symbols in the scale bar on the left or right.Decreasing the activation rate reduces the concentration of active monomers in the system. Hence growth of the polymers is favored over nucleation, because growth depends linearly on the concentration of active monomers while nucleation shows a quadratic dependence. Likewise, lower dimerization rates slow down nucleation relative to growth. Both mechanisms therefore restrict the number of nucleation events, and ensure that initiated structures can be completed before resources become depleted (see Figure 2c and d).Mathematically, the deterministic time evolution of the polymer size distribution is described by an advection-diffusion equation (Endres and Zlotnick, 2002; Yvinec et al., 2012) with advection and diffusion coefficients depending on the instantaneous concentration of active monomers (see Appendix 2). Solving this equation results in the wavefront of the size distribution advancing from small to large polymer sizes (Figure 2e). Yield production sets in as soon as the distance travelled by this wavefront reaches the maximal ring size . Exploiting this condition, we find that in the deterministic system for , a non-zero yield is obtained if either the activation rate or the dimerization rate remains below a corresponding threshold value, that is if or , where(see Appendix 3) with proportionality constants and . These relations generalize previous results (Morozov et al., 2009) to finite activation rates and for heterogeneous systems. A comparison between the threshold values given by Equation 1 and the simulated yield curves is shown in Figure 2a,b. The relations highlight important differences between the two scenarios (where and , respectively): While decreases cubically with the ring size , does so only quadratically. Furthermore, the threshold activation rate increases with the initial monomer concentration . Consequently, for fixed activation rate, the yield can be optimized by increasing . In contrast, the threshold dimerization rate is independent of and the yield curves coincide for . Finally, if is finite and , the interplay between the two slow-nucleation scenarios may lead to enhanced yield. This is reflected by the factor in , and we will come back to this point later when we discuss the stochastic effects.In summary, for large particle numbers (), perfect yield can be achieved in two different ways, independently of the heterogeneity of the system - by decreasing either the activation rate (activation scenario) or the dimerization rate (dimerization scenario) below its respective threshold value.
Stochastic effects in the case of reduced resources
Next, we consider the limit where the particle number becomes relevant for the physics of the system. In the activation scenario, we find a markedly different phenomenology if resources are sparse. Figure 3a shows the dependence of the average yield on the activation rate for different, low particle numbers in the completely heterogeneous case (). Here, we restrict our discussion to the average yield. The error of the mean is negligible due to the large number of simulations used to calculate the average yield. Still, due to the randomness in binding and activation, the yield can differ between simulations. A figure with the average yield and its standard deviation is shown in Appendix 6. For very low and very high average yield, the standard deviation has to be small due to the boundedness of the yield. For intermediate values of the average, the standard deviation is highest but still small compared to the average yield. Thus, the average yield is meaningful for the essential understanding of the assembly process. Whereas the deterministic theory predicts perfect yield for small activation rates, in the stochastic simulation yield saturates at an imperfect value . Reducing the particle number decreases this saturation value until no finished structures are produced (). The magnitude of this effect strongly depends on the size of the target structure if the system is heterogeneous. Figure 3c shows a diagram characterizing different regimes for the saturation value of the yield, , in dependence of the particle number and the size of the target structure for fully heterogeneous systems . We find that the threshold particle number necessary to obtain a fixed yield increases nonlinearly with the target size . For the depicted range of , the dependence of the threshold for nonzero yield, , on can approximately be described by a power-law: , with exponent for . Consequently, for already more than 105 rings must be assembled in order to obtain a yield larger than zero. In Appendix 8 we included two additional plots that show the dependence of on for fixed and the dependence on for fixed , respectively. The suppression of the yield is caused by fluctuations (see explanation below) and is not captured by a deterministic description. Because these stochastic effects can decrease the yield from a perfect value in a deterministic description to zero (see Figure 3a), we term this effect ‘stochastic yield catastrophe’. For fixed target size and fixed maximum number of target structures , increases with decreasing number of species, see Figure 3d. In the fully homogeneous case, , a perfect yield of 1 is always achieved for . The decrease of the maximal yield with the number of species thus suggests that, in order to obtain high yield, it is beneficial to design structures with as few different species as possible. In large part this effect is due to the constraint , whereby the more homogeneous systems (small ) require larger numbers of particles per species and, correspondingly, exhibit less stochasticity. If is fixed instead of , the yield still initially decreases with increasing number of species but then quickly reaches a stationary plateau and gets independent of for , see Appendix 7. Moreover, increasing the nucleation size , and with it the reversibility of binding, also increases , see Figure 3(d). This indicates that, beside heterogeneity of the target structure, irreversibility of binding on the relevant time scale makes the system susceptible to stochastic effects.
Figure 3.
Stochastic effects in the case of reduced resources.
(a, b) Yield of the fully heterogeneous system () for reduced number of particles (symbols) for and averaged over 1024 ensembles. In the activation scenario, at low activation rates the yield saturates at an imperfect value , which decreases with the number of particles (a). This finding disagrees with the deterministic prediction (black line) of perfect yield for . In contrast, the dimerization scenario robustly exhibits the maximal yield of 1 for small , in agreement with the deterministic prediction (black line) (b). (c) Diagram showing different regimes of in dependence of the particle number and target size (for the fully heterogeneous system ) as obtained from stochastic simulations in the limit . The minimal number of particles necessary to obtain a fixed yield increases in a strongly nonlinear way with the target size. The symbols along the line represent the saturation values of the yield curves in (a). (d) Dependence of on the number of species for fixed and fixed number of ring structures . Symbols indicate different values of the critical nucleation size . The impact of stochastic effects strongly depends on the number of species under the constraint of a fixed total number of particles and fixed target size . The homogeneous system is not subject to stochastic effects at all. Higher reversibility for larger also mitigates stochastic effects.
Stochastic effects in the case of reduced resources.
(a, b) Yield of the fully heterogeneous system () for reduced number of particles (symbols) for and averaged over 1024 ensembles. In the activation scenario, at low activation rates the yield saturates at an imperfect value , which decreases with the number of particles (a). This finding disagrees with the deterministic prediction (black line) of perfect yield for . In contrast, the dimerization scenario robustly exhibits the maximal yield of 1 for small , in agreement with the deterministic prediction (black line) (b). (c) Diagram showing different regimes of in dependence of the particle number and target size (for the fully heterogeneous system ) as obtained from stochastic simulations in the limit . The minimal number of particles necessary to obtain a fixed yield increases in a strongly nonlinear way with the target size. The symbols along the line represent the saturation values of the yield curves in (a). (d) Dependence of on the number of species for fixed and fixed number of ring structures . Symbols indicate different values of the critical nucleation size . The impact of stochastic effects strongly depends on the number of species under the constraint of a fixed total number of particles and fixed target size . The homogeneous system is not subject to stochastic effects at all. Higher reversibility for larger also mitigates stochastic effects.The stochastic yield catastrophe is mainly attributable to fluctuations in the number of active monomers. In the deterministic (mean-field) equation the different particle species evolve in balanced stoichiometric concentrations. However, if activation is much slower than binding, the number of active monomers present at any given time is small, and the mean-field assumption of equal concentrations is violated due to fluctuations (for ). Activated monomers then might not fit any of the existing larger structures and would instead initiate new structures. Figure 4a illustrates this effect and shows how fluctuations in the availability of active particles lead to an enhanced nucleation and, correspondingly, to a decrease in yield. Due to the effective enhancement of the nucleation rate, the resulting polymer size distribution has a higher amplitude than that predicted deterministically (Figure 4b) and the system is prone to depletion traps. A similar broadening of the size distribution has been reported in the context of stochastic coagulation-fragmentation of identical particles (D'Orsogna et al., 2015).
Figure 4.
Cause and effect of stochasticity in the activation scenario.
(a) Illustration of the significance of stochastic effects when resources are sparse. Arrows indicate possible transitions and the probabilities in the depicted situation for sufficiently small activation rate . For small , the random order of activation alone determines the availability of monomers and therefore the order of binding. In the depicted situation, the complete structure is assembled only with probability 1/2. In all other cases, only fragments of the structure are assembled such that the final yield is decreased. (b) Polymer size distribution for the activation scenario (symbols) as obtained from stochastic simulations, in comparison with its deterministic prediction (lines) for , and . Due to the enhanced number of nucleation events, the stochastic wave encompasses far more structures and moves more slowly. As a result, it does not quite reach the absorbing boundary.
Cause and effect of stochasticity in the activation scenario.
(a) Illustration of the significance of stochastic effects when resources are sparse. Arrows indicate possible transitions and the probabilities in the depicted situation for sufficiently small activation rate . For small , the random order of activation alone determines the availability of monomers and therefore the order of binding. In the depicted situation, the complete structure is assembled only with probability 1/2. In all other cases, only fragments of the structure are assembled such that the final yield is decreased. (b) Polymer size distribution for the activation scenario (symbols) as obtained from stochastic simulations, in comparison with its deterministic prediction (lines) for , and . Due to the enhanced number of nucleation events, the stochastic wave encompasses far more structures and moves more slowly. As a result, it does not quite reach the absorbing boundary.In the dimerization scenario, in contrast, there is no stochastic activation step. All particles are available for binding from the outset. Consequently, stochastic effects do not play an essential role in the dimerization scenario and perfect yield can be reached robustly for all system sizes, regardless of the number of species (Figure 3(b)).
Non-monotonic yield curves for a combination of slow dimerization and activation
So far, the two implementations of the ‘slow nucleation principle’ have been investigated separately. Surprisingly, we observe counter-intuitive behavior in a mixed scenario in which both dimerization and activation occur slowly (i.e., , ). Figure 5 shows that, depending on the ratio , the yield can become a non-monotonic function of . In the regime where is large, nucleation is dimerization-limited; therefore activation is irrelevant and the system behaves as in the dimerization scenario for . Upon decreasing we then encounter a second regime, where activation and dimerization jointly limit nucleation. The yield increases due to synergism between slow dimerization and activation (see dependence of , Equation 1), whilst the average number of active monomers is still high and fluctuations are negligible. Finally, a stochastic yield catastrophe occurs if is further reduced and activation becomes the limiting step. This decline is caused by an increase in nucleation events due to relative fluctuations in the availability of the different species (‘fluctuations between species’). This contrasts the deterministic description where nucleation is always slower for smaller activation rate. Depending on the ratio , the ring size and the particle number , maximal yield is obtained either in the dimerization-limited (red curves, Figure 5), activation-limited (blue curve, Figure 5b) or intermediate regime (green and orange curves, Figure 5).
Figure 5.
Yield for a combination of slow dimerization and activation.
(a, b) Dependence of the yield of the fully heterogeneous system on the activation rate for and different values of the dimerization rate (colors/symbols) for (a) and (b) (averaged over 1024 ensembles). For large activation rates the yield behaves deterministically (lines). In contrast, for small activation rates, stochastic effects (blue shading) lead to a decrease in yield. Depending on the parameters, the yield maximum is attained in either the deterministic, stochastic or intermediate regime. (c) Table summarizing the qualitative behavior of the yield (poor/intermediate/perfect) for a combination of dimerization and activation rates for both the deterministic and the stochastic limit. The columns correspond to low and high values of the dimerization rate, as indicated by the marker in the corresponding deterministic yield curve at the top of the column. Similarly, the rows correspond to low, intermediate and high activation rates. Arrows and colors indicate where and for which curve this behavior can be observed in (a) and (b). Deviations between the deterministic and stochastic limits are most prominent for low activation rates.
Yield for a combination of slow dimerization and activation.
(a, b) Dependence of the yield of the fully heterogeneous system on the activation rate for and different values of the dimerization rate (colors/symbols) for (a) and (b) (averaged over 1024 ensembles). For large activation rates the yield behaves deterministically (lines). In contrast, for small activation rates, stochastic effects (blue shading) lead to a decrease in yield. Depending on the parameters, the yield maximum is attained in either the deterministic, stochastic or intermediate regime. (c) Table summarizing the qualitative behavior of the yield (poor/intermediate/perfect) for a combination of dimerization and activation rates for both the deterministic and the stochastic limit. The columns correspond to low and high values of the dimerization rate, as indicated by the marker in the corresponding deterministic yield curve at the top of the column. Similarly, the rows correspond to low, intermediate and high activation rates. Arrows and colors indicate where and for which curve this behavior can be observed in (a) and (b). Deviations between the deterministic and stochastic limits are most prominent for low activation rates.
Robustness of the results to model modifications
In our model, the reason for the stochastic yield catastrophe is that - due to fluctuations between species - the effective nucleation rate is strongly enhanced. Hence, if binding to a larger structure is temporarily impossible, activated monomers tend to initiate new structures, causing an excess of structures that ultimately cannot be completed. Natural questions that arise are whether (i) relaxing the constraint that polymers cannot bind other polymers or (ii) abandoning the assumption of a linear assembly path, will resolve the stochastic yield catastrophe. To answer these questions, we performed stochastic simulations for extensions of our model system showing that the stochastic yield catastrophe indeed persists. We start by considering the ring model from the previous section but take polymer-polymer binding into account in addition to growth via monomer attachment (Figure 6). In detail, we assume that two structures of arbitrary size (and with combined length ) bind at rate if they fit together, that is if the left (right) end of the first structure is periodically continued by the right (left) end of the second one. Realistically, the rate of binding between two structures is expected to decrease with the motility and thus the sizes of the structures. In order to assess the effect of polymer-polymer binding, we focus on the worst case where the rate for binding is independent of the size of both structures. If a stochastic yield catastrophe occurs for this choice of parameters, we expect it to be even more pronounced in all the ‘intermediate cases’. Figure 6 shows the dependence of the yield on the activation rate in the polymer-polymer model. As before, yield increases below a critical activation rate and then saturates at an imperfect value for small activation rates. Decreasing the number of particles per species, decreases this saturation value. Compared to the original model, the stochastic yield catastrophe is mitigated but still significant: For structures of size , yield saturates at around 0.87 for particles per species and at around 0.33 for particles per species. We thus conclude that polymer-polymer binding indeed alleviates the stochastic yield catastrophe but does not resolve it. Since binding only happens between consecutive species, structures with overlapping parts intrinsically can not bind together and depletion traps continue to occur. Taken together, also in the extended model, fluctuations in the availability of the different species lead to an excess of intermediate-sized structures that get kinetically trapped due to structural mismatches. Note that in the extreme case of , incomplete polymers can always combine into one final ring structure so that in this case the yield is always 1. Analogously, for high activation rates yield is improved for compared to (Figure 6b).
Figure 6.
Extended model including polymer-polymer binding.
(a) In the extended model, structures not only grow by monomer attachment but also by binding with another polymer (colored arrow). As before, binding only happens between periodically consecutive species with rate per binding site. So, the reaction rate for two polymers is identical to the one for monomer-polymer binding, . Furthermore, only polymers with combined length can bind. All other processes and rules are the same as in the original model described in Figure 1. (b) The yield of the extended model as obtained from stochastic simulations is shown in dependence of the activation rate for , , and different values of the number of particles per species, (averaged over 1024 ensembles). The qualitative behavior is the same as for the original model. In particular, yield saturates (in the stochastic limit) at an imperfect value for slow activation rates. Note that for small particle numbers polymer-polymer binding results in an increase of the minimal yield (here for large activation rates). This is due to the fact that even in the case where a priori too many nucleation events happen, polymers can combine into final structures.
Extended model including polymer-polymer binding.
(a) In the extended model, structures not only grow by monomer attachment but also by binding with another polymer (colored arrow). As before, binding only happens between periodically consecutive species with rate per binding site. So, the reaction rate for two polymers is identical to the one for monomer-polymer binding, . Furthermore, only polymers with combined length can bind. All other processes and rules are the same as in the original model described in Figure 1. (b) The yield of the extended model as obtained from stochastic simulations is shown in dependence of the activation rate for , , and different values of the number of particles per species, (averaged over 1024 ensembles). The qualitative behavior is the same as for the original model. In particular, yield saturates (in the stochastic limit) at an imperfect value for slow activation rates. Note that for small particle numbers polymer-polymer binding results in an increase of the minimal yield (here for large activation rates). This is due to the fact that even in the case where a priori too many nucleation events happen, polymers can combine into final structures.Kinetic trapping due to structural mismatches can occur in every (partially) irreversible heterogeneous assembly process with finite-sized target structure and limited resources. From our results, we thus expect a stochastic yield catastrophe to be common to such systems. In order to further test this hypothesis, we simulated another variant of our model where finite sized squares assemble via monomer attachment from a pool of initially inactive particles, see Figure 7. In contrast to the original model, the assembled structures are non-periodic and exhibit a non-linear assembly path where structures can grow independently in two dimensions. While the ring model assumes a sequential order of binding of the monomers, the square allows for a variety of distinct assembly paths that all lead to the same final structure. Note that, because of the absence of periodicity, the square model is only well defined for the completely heterogeneous case. Figure 7 depicts the dependence of the yield on the activation rate for a square of size . Also in this case, we find that the yield saturates at an imperfect value for small activation rates. Hence, we showed that the stochastic yield catastrophe is not resolved neither by accounting for polymer-polymer combination nor by considering more general assembly processes with multiple parallel assembly paths. This observation supports the general validity of our findings and indicates that stochastic yield catastrophes are a general phenomenon of (partially) irreversible and heterogeneous self-assembling systems that occur if particle number fluctuations are non-negligible.
Figure 7.
Assembly of squares of size from different particle species.
(a) As in the ring models, there are monomers of each species in the system. All particles are initially in an inactive state (blue) and are activated at the same per-capita rate . Once active (green), species with neighboring position within the square (left/right, up/down) can bind to each other. Structures grow by attachment of single monomers until the square is complete (absorbing state). Depending on the number of contacts between the monomer and the structure, the corresponding rate is . For simplicity, all polymers (yellow) are stable () and we do not consider polymer-polymer binding. (b) The yield of the square model as obtained from stochastic simulations is shown in dependence of the activation rate for , and different values of the number of particles per species, (averaged over 256 ensembles). The qualitative behavior is the same as for the previous models: Whereas the yield is poor for large activation rates, it strongly increases below a threshold value and saturates (in the stochastic limit) at an imperfect value < 1 for small activation rates. The saturation value decreases with decreasing number of particles in the system.
Assembly of squares of size from different particle species.
(a) As in the ring models, there are monomers of each species in the system. All particles are initially in an inactive state (blue) and are activated at the same per-capita rate . Once active (green), species with neighboring position within the square (left/right, up/down) can bind to each other. Structures grow by attachment of single monomers until the square is complete (absorbing state). Depending on the number of contacts between the monomer and the structure, the corresponding rate is . For simplicity, all polymers (yellow) are stable () and we do not consider polymer-polymer binding. (b) The yield of the square model as obtained from stochastic simulations is shown in dependence of the activation rate for , and different values of the number of particles per species, (averaged over 256 ensembles). The qualitative behavior is the same as for the previous models: Whereas the yield is poor for large activation rates, it strongly increases below a threshold value and saturates (in the stochastic limit) at an imperfect value < 1 for small activation rates. The saturation value decreases with decreasing number of particles in the system.
Discussion
Our results show that different ways to slow down nucleation are indeed not equivalent, and that the explicit implementation is crucial for assembly efficiency. Susceptibility to stochastic effects is highly dependent on the specific scenario. Whereas systems for which dimerization limits nucleation are robust against stochastic effects, stochastic yield catastrophes can occur in heterogeneous systems when resource supply limits nucleation. The occurrence of stochastic yield catastrophes is not captured by the deterministic rate equations, for which the qualitative behavior of both scenarios is the same. Therefore, a stochastic description of the self-assembly process, which includes fluctuations in the availability of the different species, is required. The interplay between stochastic and deterministic dynamics can lead to a plethora of interesting behaviors. For example, the combination of slow activation and slow nucleation may result in a non-monotonic dependence of the yield on the activation rate. While deterministically, yield is always improved by decreasing the activation rate, stochastic fluctuations between species strongly suppress the yield for small activation rate by effectively enhancing the nucleation speed. This observation clearly demonstrates that a deterministically slow nucleation speed is not sufficient in order to obtain good yield in heterogeneous self-assembly. For example, a slow activation step does not necessarily result in few nucleation events although deterministically this behavior is expected. Thus, our results indicate that the slow nucleation principle has to be interpreted in terms of the stochastic framework and have important implications for yield optimization.We showed that demographic noise can cause stochastic yield catastrophes in heterogeneous self-assembly. However, other types of noise, such as spatiotemporal fluctuations induced by diffusion, are also expected to trigger stochastic yield catastrophes. Hence, our results have broad implications for complex biological and artificial systems, which typically exhibit various sources of noise. We characterize conditions under which stochastic yield catastrophes occur, and demonstrate how they can be mitigated. These insights could usefully inform the design of experiments to circumvent yield catastrophes: In particular, while slow provision of constituents is a feasible strategy for experiments, it is highly susceptible to stochastic effects. On the other hand, irrespective of its robustness to stochastic effects, the experimental realization of the dimerization scenario relies on cooperative or allosteric effects in binding, and may therefore require more sophisticated design of the constituents (Sacanna et al., 2010; Zeravcic et al., 2017). Our theoretical analysis shows that stochasticity can be alleviated either by decreasing heterogeneity (presumably lowering realizable complexity) or by increasing reversibility (potentially requiring fine-tuning of bond strengths and reducing the stability of the assembly product). Alternative approaches to control stochasticity include the promotion of specific assembly paths (Murugan et al., 2015; Gartner, Graf and Frey, in preparation) and the control of fluctuations (Graf, Gartner and Frey, in preparation). One possibility to test these ideas and the ensuing control strategies could be via experiments based on DNA origami. Instead of building homogeneous ring structures as in Wagenbauer et al. (2017), one would have to design heterogeneous ring structures made from several different types of constituents with specified binding properties. By varying the opening angle of the ‘wedges’ (and thus the preferred number of building blocks in the ring) and/or the number of constituents, both the target structure size as well as the heterogeneity of the target structure could be controlled.Moreover, the ideas presented in this manuscript are relevant for the understanding of intracellular self-assembly. In cells, provision of building blocks is typically a gradual process, as synthesis is either inherently slow or an explicit activation step, such as phosphorylation, is required. In addition, the constituents of the complex structures assembled in cells are usually present in small numbers and subject to diffusion. Hence, stochastic yield catastrophes would be expected to have devastating consequences for self-assembly, unless the relevant cellular processes use elaborate control mechanisms to circumvent stochastic effects. Further exploration of these control mechanisms should enhance the understanding of self-assembly processes in cells and help improve synthesis of complex nanostructures.
Materials and methods
All our simulation data was generated with either C++ or MATLAB. The source code is available at the eLife website.Here we show the derivation of Equation 1 in the main text, giving the threshold values for the rate constants below which finite yield is obtained. The details can be found in Appendices 1–3.
Master equation and chemical rate equations
We start with the general Master equation and derive the chemical rate equations (deterministic/mean-field equations) for the heterogeneous self-assembly process. We renounce to show the full Master equation here but instead state the system that describes the evolution of the first moments. To this end, we denote the random variable that describes the number of polymers of size and species in the system at time by with and . The species of a polymer is defined by the species of the respective monomer at its left end. Furthermore, and denote the number of inactive and active monomers of species , respectively, and the number of complete rings. We signify the reaction rate for binding of a monomer to a polymer of size by . denotes the activation rate and the decay rate of a polymer of size . By we indicate (ensemble) averages. The system governing the evolution of the first moments (the averages) of the is then given by:The different terms of this equation are illustrated graphically in Figure 8. The first equation describes loss of inactive particles due to activation at rate . Equation 2b gives the temporal change of the number of active monomers that is governed by the following processes: activation of inactive monomers at rate , binding of active monomers to the left or to the right end of an existing structure of size at rate , and decay of below-critical polymers of size into monomers at rate (disassembly). Equations 2c and 2d describe the dynamics of dimers and larger polymers of size , respectively. The terms account for reactions of polymers with active monomers (polymerization) as well as decay in the case of below-critical polymers (disassembly). The indicator function equals 1 if the condition is satisfied and 0 otherwise. Note that a polymer of size can grow by attaching a monomer to its left or to its right end whereas the formation of a dimer of a specific species is only possible via one reaction pathway (dimerization reaction). Finally, polymers of length – the complete ring structures – form an absorbing state and, therefore, include only the respective gain terms (cf Equation 2e).
Figure 8.
Graphical illustration of Equations 2 and 6.
(a) Visualization of the gain and loss terms in the dynamics of the active monomers in Equation 2b. Gain of active monomers is due to activation of inactive monomers as well as decay of unstable polymers. Loss of active monomers is due to dimerization and attachment of monomers to larger polymers. (b) Visualization of the transitions between clusters of different sizes (without distinction of species). The first and second box represent the inactive and active monomers in the system, the subsequent boxes each represent the ensemble of polymers of a certain size. The arrows between the boxes show possible reactions and transitions with the reaction rates indicated accordingly. Each arrow starting from or leading to a box is associated with a corresponding loss or gain term on the right hand side of Equation 2 and Equation 6.
Graphical illustration of Equations 2 and 6.
(a) Visualization of the gain and loss terms in the dynamics of the active monomers in Equation 2b. Gain of active monomers is due to activation of inactive monomers as well as decay of unstable polymers. Loss of active monomers is due to dimerization and attachment of monomers to larger polymers. (b) Visualization of the transitions between clusters of different sizes (without distinction of species). The first and second box represent the inactive and active monomers in the system, the subsequent boxes each represent the ensemble of polymers of a certain size. The arrows between the boxes show possible reactions and transitions with the reaction rates indicated accordingly. Each arrow starting from or leading to a box is associated with a corresponding loss or gain term on the right hand side of Equation 2 and Equation 6.We simulated the Master equation underlying Equation 2 stochastically using Gillespie’s algorithm. For the following deterministic analysis, we neglect correlations between particle numbers , which is valid assumption for large particle numbers. Then the two-point correlator can be approximated as the product of the corresponding mean values (mean-field approximation)Furthermore, for the expectation values it must holdbecause all species have equivalent properties (there is no distinct species) and hence the system is invariant under relabelling of the upper index. Bywe denote the concentration of any monomer or polymer species of size , where is the reaction volume. Due to the symmetry formulated in Equation 4, the heterogeneous assembly process decouples into a set of identical and independent homogeneous assembly processes in the deterministic limit. The corresponding homogeneous system then is described by the following set of equations that is obtained by applying (Equation 3, Equation 4) and (Equation 5) to (Equation 2)The rate constants in Equations 6 and 2 differ by a factor of . For convenience, we use however the same symbol in both cases. The rate constants in Equation 6 can be interpreted in the usual units . Due to the symmetry, the yield, which is given by the quotient of the number of completely assembled rings and the maximum number of complete rings, becomes independent of the number of speciesHence, it is enough to study the dynamics of the homogeneous system, Equation 6, to identify the condition under which non zero yield is obtained.
Effective description by an advection-diffusion equation
The dynamical properties of the evolution of the polymer-size distribution become evident if the set of ODEs, Equation 6, is rewritten as a partial differential equation. This approach was previously described in the context of virus capsid assembly (Zlotnick et al., 1999; Morozov et al., 2009). For simplicity, we restrict ourselves to the case and let and . Then, for the polymers with we haveAs a next step, we approximate the index indicating the length of the polymeras a continuous variable and define . By we denote the concentration of active monomers in the following to emphasize their special role. Formally expanding the right-hand side of Equation 8 in a Taylor series up to second orderone arrives at the advection-diffusion equation with both advection and diffusion coefficients depending on the concentration of active monomersEquation 10 can be written in the form of a continuity equation with flux . The flux at the left boundary equals the influx of polymers due to dimerization of free monomers . This enforces a Robin boundary condition atAt we set an absorbing boundary so that completed structures are removed from the system. The time evolution of the concentration of active monomers is given byThe terms on the right-hand side account for activation of inactive particles, dimerization, and binding of active particles to polymers (polymerization).Qualitatively, Equation 10 describes a profile that emerges at from the boundary condition Equation 11, moves to the right with time-dependent velocity due to the advection term, and broadens with a time-dependent diffusion coefficient . In Appendices 2–3 we show how the full solution of Equations 10 and 11 can be found assuming knowledge of . Here, we focus only on the derivation of the threshold activation rate and threshold dimerization rate that mark the onset of non-zero yield. Yield production starts as soon as the density wave reaches the absorbing boundary at . Therefore, finite yield is obtained if the sum of the advectively travelled distance and the diffusively travelled distance exceeds the system sizeAccording to Equation 10, and , giving as condition for the onset of finite yieldwhere the last approximation is valid for large .In order to obtain we derive an effective two-component system that governs the evolution of . To this end, we denote the total number of polymers in Equation 12 by (as long as yield is zero the upper boundary is irrelevant and we can consider ). Equation 12 then readsand the dynamics of is determined from the boundary condition, Equation 11Measuring and in units of the initial monomer concentration and time in units of the equations are rewritten in dimensionless units aswhere and . Equation 17 describes a closed two-component system for the concentration of active monomers and the total concentration of polymers . It describes the dynamics exactly as long as yield is zero. In order to evaluate the condition (14) we need to determine the integral over as a function of andTo that end, we proceed by looking at both scenarios separately. The numerical analysis, confirming our analytic results, is given in Appendix 3.
Dimerization scenario
The activation rate in the dimerization scenario is , and instead of the term in , we set the initial condition (and ). Furthermore, and we can neglect the term proportional to in . As a result,Solving this equation for as a function of using the initial condition , the totally travelled distance of the wave is determined to bewhere for the evaluation of the integral we used the substitution .
Activation scenario
In the activation scenario, yield sets in only if the activation rate and thus the effective nucleation rate is slow. As a result, in addition to , we can again neglect the term proportional to in . This time, however, we have to keep the term . As a next step, we assume that is much smaller than the remaining terms on the right-hand side, and . This assumption might seem crude at first sight but is justified a posteriori by the solution of the equation (see Appendix 3). Hence, we get the algebraic equation . Using it to solve for , and then to determine , the totally travelled distance of the wave is deduced asTaken together, we therefore obtain two conditions out of which one must be fulfilled in order to obtain finite yieldwhere and are numerical factors, and and . This verifies Equation 1 in the main text.In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.Acceptance summary:The authors study the role of fluctuations for the yield of self-assembly of heterogenous molecular complexes in specific arrangements. The approach captures the combination of subunit activation, nucleation of complex formation and complex assembly, starting from a set of different subunits in solution. This study shows in particular that when nucleation is limited by activation of subunits, yield can drop catastrophically due to stochasticity of activation and nucleation. This occurs in regimes where yield would be high when fluctuations are neglected. This work provides deep new insights in the role of stochasticity for the reliable self-assembly of molecular complexes and a framework for the study of molecular assembly in cells.Decision letter after peer review:Thank you for submitting your article "Stochastic yield catastrophes and robustness in self-assembly" for consideration by eLife. Your article has been reviewed by three peer reviewers, one of whom is a member of our Board of Reviewing Editors, and the evaluation has been overseen by Naama Barkai as the Senior Editor. The following individuals involved in review of your submission have agreed to reveal their identity: Pablo Sartori (Reviewer #2).The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.Summary:The authors study the role of fluctuations for the yield of self-assembly of heterogenous molecular structures. An elegant and simple model is introduced which captures fluctuations of both activation of monomers and stochastic nucleation. The paper first recapitulates known results that slow nucleation is key to high yield. It is shown that in the deterministic or mean field limit slow nucleation by activation or by cooperative assembly (dimerization) play a similar role. However, in the presence of fluctuations, both scenarios can show very different behaviors. In particular the yield can drop in the activation scenario due to fluctuations which the authors term stochastic yield catastrophe. Interesting is the general case where activation and dimerization occur together. The authors show that in this case there can be an optimal activation rate at which yield is maximized. The work provides fundamental new insights in the self-assembly of macromolecular structures in cells.However, the referees also raised several points that need to be addressed in a revision. In particular the depth and relevance of some of the results is not fully clear. The authors should address the specific points given below.Essential revisions:The relevance and strength of the results remains partly unclear. Several points need clarification:1) The authors use the term "Stochastic yield catastrophe" to describe the drop of yield for increased fluctuations. In contrast to the catastrophic drop of yield in the deterministic case, it is unclear whether for these stochastic effects the term stochastic yield catastrophe is meaningful and appropriate as the drop is gradual and seems soft rather than representing a transition.2) In Figure 3D the effect of increasing the number of species for a fixed size L is studied. It is shown that more species results in lower maximal yield. This could simply be an effect of the constraint NS = constant, which results in a decreasing number of components per species, N. The authors should check whether, when keeping N = constant, changing S still has an effect on the yield.3) Still in Figure 3, the characterization of the "stochastic yield catastrophe" seems incomplete in view of its relevance to the authors. It seems that there is a certain dependence ymax(N,L,S) (forgetting about L for now). Immediate questions that are not addressed are: how does ymax scale with N, for fixed L and S? How does it then scale with L? And how with the heterogeneity, L/S? One step in this direction is Figure 3D, in which the authors show ymax(S) for several values of L, but variable N (why then the label N = 1000 in the panel?). More such plots should be presented to clarify the key properties of the system.4) There are additional questions which relate to the relevance of the results in realistic situations. The model is elegant but very much simplified. For example, growth is fully reversible which may not be the case in many systems of macromolecular assembly. It is unclear how general and robust the results of the simplified model are if the simplifications are relaxed.5) The time needed for assembly is not discussed. In a biological context it is essential that assembly occurs sufficiently fast. High yield in the long term alone is not enough for assembly to happen. I understand that only looking at yield at long times is an elegant simplification of the problem, but it remains unclear how useful these insights are for real situations.Essential revisions:The relevance and strength of the results remains partly unclear. Several points need clarification:1) The authors use the term "Stochastic yield catastrophe" to describe the drop of yield for increased fluctuations. In contrast to the catastrophic drop of yield in the deterministic case, it is unclear whether for these stochastic effects the term stochastic yield catastrophe is meaningful and appropriate as the drop is gradual and seems soft rather than representing a transition.The question alerted us to the fact that we have to be more explicit about the reason why we use the term’ catastrophe’. The stochastic yield catastrophe is not the change in yield with respect to the parameter α. The term is intended to refer to situations in which we deterministically expect a perfect yield but due to stochastic effects we may end up with zero yield instead. This drop in yield of up to 100% compared to the deterministically expected value is what we refer to as a ’stochastic catastrophe’. We have revised the definition of the term in the subsection “Stochastic effects in the case of reduced resources” of the main text to make this point clearer. Furthermore, we extended our discussion of the nonlinearities in the stochastic effects with respect to different model parameters (especially the ring size L) to give a more complete picture of their significance as well. This directly links to answer 3 in which those aspects are discussed in detail.– We have adapted the introduction of the term “Stochastic yield catastrophe” in subsection “Stochastic effects in the case of reduced resources”.2) In Figure 3D the effect of increasing the number of species for a fixed size L is studied. It is shown that more species results in lower maximal yield. This could simply be an effect of the constraint NS = constant, which results in a decreasing number of components per species, N. The authors should check whether, when keeping N = constant, changing S still has an effect on the yield.Indeed, if N (instead of NS) is fixed in Figure 3D, the maximal yield becomes independent of S for large S >> 1. However, this saturation value of the yield differs from the yield for the homogeneous system S = 1, where the maximal yield is always 1 as the system behaves deterministically. So, the yield (sharply) drops from a perfect value of 1 (for S = 1) to a non-perfect value for S > 1 and then levels off. Intriguingly, for some cases, one even observes non-monotonic behavior of the yield with respect to the number of species: Yield decreases from 1 (for S = 1) to a lower value and then increases again before it saturates for S >> 1. This rather non-intuitive behavior needs a more thorough discussion which we provide in detail in a follow-up manuscript (in preparation). In short, we believe that similar to the deterministic description also the stochastic limit shows equivalent behavior for different number of species S, as long as N and L are fixed and S >> 1.In this manuscript, we decided to focus on the case where NS = const since this ensures that all compared systems can build the same number of rings, NS/L. In this context, Figure 3D suggests that in order to achieve high yield, it is beneficial to build structures that are as homogeneous as possible. So, we believe that the relevance of our findings does not change. Nonetheless, in order to pick up the important point made by the referees, we added a remark to the manuscript and a figure that shows the flattening of the maximal yield for S >> 1.– We have added a remark to subsection “Stochastic effects in the case of reduced resources”.– We have added subsection “Influence of the heterogeneity of the target structure for fixed number of particles per species” and a figure to the Appendix.3) Still in Figure 3, the characterization of the "stochastic yield catastrophe" seems incomplete in view of its relevance to the authors. It seems that there is a certain dependence ymax(N,L,S) (forgetting about Lnuc for now). Immediate questions that are not addressed are: how does ymax scale with N, for fixed L and S? How does it then scale with L? And how with the heterogeneity, L/S? One step in this direction is Figure 3D, in which the authors show ymax(S) for several values of L, but variable N (why then the label N=1000 in the panel?). More such plots should be presented to clarify the key properties of the system.We are happy to take this opportunity to clarify the dependence of ymax(N,L,S) on N,L and S. As already explained in point (2) and shown by the added plot, ymax quickly becomes independent of S when S > 1. Hence, what remains to be described is the dependence of ymax on N and L. Phenomenologically, by increasing the particle number N (for fixed L), we encounter a threshold value Nth where the yield starts to increase sigmoidally from 0 to 1 (roughly over three orders of magnitude in N), quite similar to the behavior of the deterministic yield with respect to α and µ in Figure 1. Analogously, for fixed particle number N, by decreasing L, ymax exhibits a rapid transition from 0 to 1 at some threshold L. We included two additional plots in the Appendix that show the dependences of ymax on N and L, respectively. In the main text, in Figure 3C, we characterize the dependence of ymax on N and L in combined form as a “phase diagram” that shows the four regimes ymax = 0, 0 < ymax
< 0.5, 0.5 < ymax
< 0.99 and 0.99 < ymax in N−L phase space (we now include the regime 0.99 < ymax to show that the transitions from 0 to almost perfect yield extend over a finite range in N or L, respectively).In order to assess the relevance of stochastic effects for a particular system it is, therefore, crucial to know Nth(L), the threshold particle number as a function of L, whose inverse gives Lth(N). The function Nth(L) is exactly described by the line that separates the two regimes ymax = 0 and ymax
> 0 in the “phase diagram” in Figure 3C. On the double logarithmic scale, we find that there is an approximate power-law dependence of ymax on L with an exponent of around 2.8. It is not totally clear to us why there is this dependence with such a high exponent. Simple scaling arguments that we tried could only explain an exponent of maximally 1.5. We investigate the underlying reasons for this dependence in a technical follow-up project (manuscript in preparation). The strongly nonlinear dependence of Nth on L partly explains why we argue that the expression “stochastic yield catastrophe” is justified, as the relevance of this effect strongly increases with the size of the target structure. We extended the caption of Figure 3C and the description in the main text in order to clarify the relevance of the stochastic yield catastrophe and the dependence of ymax on the system parameters.The label “N = 1000” in Figure 3D was, of course, incorrect and misleading. We thank the attentive referee for this remark and have corrected the label.– We have included the regime ymax
> 0.99 in Figure 3C and extended the description in the Figure caption and the main text.– We have added subsection “Dependence of the maximal yield ymax in the activation scenario on N and L” and two additional figures to the Appendix.– We have corrected the false label in Figure 3D.4) There are additional questions which relate to the relevance of the results in realistic situations. The model is elegant but very much simplified. For example, growth is fully reversible which may not be the case in many systems of macromolecular assembly. It is unclear how general and robust the results of the simplified model are if the simplifications are relaxed.In our model two different modes of binding exist. Below a polymer size Lnuc binding is fully reversible whereas above Lnuc irreversible growths takes place. This irreversibility is meant to represent a separation of time scales between assembly and disassembly of a stable structure. Introducing a very small dissociation rate would not affect our results significantly on the time scale of observation (see answer to question 5) but makes the definition of the yield ambiguous. If binding is reversible for all unfinished structures (Lnuc = L − 1), trivially, only a yield of 1 is possible at the end of the assembly process. However, the process then may take an unphysical amount of time. The relevance of the size of Lnuc for the stochastic yield catastrophe is studied in Figure 3D. The robustness of all other effects with respect to the size of Lnuc (which corresponds to the degree of reversibility) is discussed in Appendix 4. Furthermore, the two additional models illustrated in Figure 6A and Figure 7A were introduced to test for robustness against other model assumptions like monomer binding and linear growth. In all cases, we still find stochastic yield reduction, which leads us to the conclusion that the observed effects do not rely on model specifics. We added remarks to the main text to clarify the relation between Lnuc and the existence of reversible and irreversible binding. Finally, we want to remark that the assumption of a finite Lnuc
< L and irreversible growth above this threshold have been applied successfully in experimental studies, as for example of the Brome Mosaic Virus Capsids (Chevance and Hughes, 2008).– We made the relation between reversible and irreversible binding and Lnuc more explicit in the main text.5) The time needed for assembly is not discussed. In a biological context it is essential that assembly occurs sufficiently fast. High yield in the long term alone is not enough for assembly to happen. I understand that only looking at yield at long times is an elegant simplification of the problem, but it remains unclear how useful these insights are for real situations.The importance of time in the biological context is a very pertinent observation and we fully agree with the referee that this question arises naturally: time efficiency plays a significant role in biological and also artificial self-assembly. Due to its significance and due to the following three reasons we believe the question of assembly time and time efficiency is beyond the scope of the current study. First, a thorough discussion of time in the context of self-assembly turned out to be extensive and hence would have considerably increased the length of the manuscript. Second, a simultaneous discussion of both factors, stochasticity and time efficiency, would complicate the analysis (if not treated separately) and obscure the significance and relevance of each one of these factors. Third, the final yield still is a most significant observable that characterizes the efficiency of the assembly process in an explicit way: Since the yield monotonically increases with time, the final yield represents an upper bound for the yield at any finite time. In other words, the final yield is the maximum yield that can be achieved in the assembly process irrespective of time constraints. As such, the final yield is an important and informative observable that can be defined and analyzed in an unambiguous way. For those reasons, we decided to focus here entirely on the analysis of the final yield and how it is affected by stochastic effects and to treat the time aspect separately.However, in order to give the reader an impression of what the time evolution of the yield typically looks like, we now included a corresponding plot and a new section in the Appendix. The plot exhibits typical time trajectories of the yield in both scenarios and shows, in particular, that the yield increases monotonically with time. It also shows that typically, after some delay time, the yield increases rather quickly to a value that lies within 10% of the final yield and then continues to grow more slowly. This underlines that the final yield is a meaningful observable that describes the typical outcome of the assembly reaction well, under appropriate time constraints that are not too restrictive.– We have added a paragraph to clarify the significance of the observable of the final yield and to state more explicitly that the time efficiency is not considered here.– We have added subsection “Time evolution of the yield in the dimerization and activation scenario” and a figure to the Appendix.
Authors: Petr G Leiman; Fumio Arisaka; Mark J van Raaij; Victor A Kostyuchenko; Anastasia A Aksyuk; Shuji Kanamaru; Michael G Rossmann Journal: Virol J Date: 2010-12-03 Impact factor: 4.099