Role of Proteome Physical Chemistry in Cell Behavior.

Kingshuk Ghosh¹, Adam M R de Graff², Lucas Sawle¹, Ken A Dill².

Abstract

We review how major cell behaviors, such as bacterial growth laws, are derived from the physical chemistry of the cell's proteins. On one hand, cell actions depend on the individual biological functionalities of their many genes and proteins. On the other hand, the common physics among proteins can be as important as the unique biology that distinguishes them. For example, bacterial growth rates depend strongly on temperature. This dependence can be explained by the folding stabilities across a cell's proteome. Such modeling explains how thermophilic and mesophilic organisms differ, and how oxidative damage of highly charged proteins can lead to unfolding and aggregation in aging cells. Cells have characteristic time scales. For example, E. coli can duplicate as fast as 2–3 times per hour. These time scales can be explained by protein dynamics (the rates of synthesis and degradation, folding, and diffusional transport). This modeling also rationalizes how bacterial growth is slowed down by added salt. In the same way that the behaviors of inanimate materials can be expressed in terms of the statistical distributions of atoms and molecules, some cell behaviors can be expressed in terms of distributions of protein properties, giving insights into the microscopic basis of growth laws in simple cells.

Year: 2016 PMID: 27513457 PMCID: PMC5034766 DOI: 10.1021/acs.jpcb.6b04886

Source DB: PubMed Journal: J Phys Chem B ISSN: 1520-5207 Impact factor: 2.991

Cellular Growth Laws Are Related to Cellular Fitness

Consider the simplest cells, such as bacteria or yeast. Cells grow at different rates, depending on their environment. A cell’s growth rate depends on how much food is present, on the temperature and salt concentration of the external medium, and on its internal biochemical health. Because a cell’s duplication speed is often the single most important determinant of its ability to propagate its progeny, growth rate could have evolved to be a complicated function of many biochemical details of a cell. However, we review here recent efforts toward a different view. Modeling shows how the growth laws of simple cells are encoded within the physical properties of a cell’s proteome (i.e., its full complement of proteins). That is, some cell behaviors are attributable to large fractions of the proteome, not just a single protein or gene or pathway. And, some behaviors are physical (due to protein folding, aggregation, or diffusion, applicable in some universal or general way across different proteins), rather than biological (due to the protein’s particular biological action). Of course, at best, simple models of the physical proteome are only a first approximation. But, in the spirit of other physical chemistry, they may provide useful conceptual insights and can make testable predictions. First, we make a general point: growth laws are related to, and manifestations of, evolutionary fitness landscapes. Define a cellular growth rate, λ, as the number of new cells produced per unit time from each existing parent. If c(t) is the cell population at time t, then under appropriate conditions, populations grow as

$$\frac{dc}{dt} = \lambda c.$$

The growth rate λ can depend, often strongly, on various quantities; these dependences are called growth laws. Perhaps the best known growth law,[1] λ = λ(sugar), indicates that cells grow faster with increasing concentrations of food, such as sugar, up to a point at which the growth rate saturates.
Bacterial growth rates also depend strongly on temperature and external salt concentrations. For practical bacteriology, these are important. To kill bacteria, you remove a food source, or you heat the cells to high temperatures (as when you cook food), or you introduce high external salt concentrations (in pickling fish or in making jerky or salting meats, for example). In general, such growth laws can be expressed as λ = λ(e), where e indicates a vector of environmental variables, such as sugar, temperature, or salt. A growth law is a function that describes how “today’s cell” can respond to variations in today’s conditions. But, cells can change those functions, through evolutionary modifications over longer time scales. This can be expressed in terms of their genotype, a vector of genes, g. We use the term genotype here in a very general way: It can describe either a set of discrete options, such as the presence or absence of genes or amino acids in proteins, or a continuum of options. It can express some property of a gene directly or it can be a surrogate for that, representing some rate coefficients or equilibrium constants in the biochemical workings of the cell. In general, we can express the growth rate of a cell as

$$\lambda = \lambda(\mathbf{e}, \mathbf{g}).$$

This expression captures both today’s growth law, λ = λ(e), for fixed evolutionary properties g, and the fact that growth rates can be modulated by evolution, λ = λ(g), for fixed conditions e. The latter property, λ = λ(g), is the fitness landscape for cells for which duplication speed is their primary measure of fitness. Hence, the expression relates, albeit in only a general abstract way, the evolutionary fitness landscape to the growth laws of cells.
For cells that have been under a fixed selection pressure for a long time, and have evolved to maximize their fitness, we can study their peak-fitness points by finding the genotype at which the fitness f is stationary,

$$\left.\frac{\partial f(\mathbf{e}, \mathbf{g})}{\partial \mathbf{g}}\right|_{\mathbf{e}} = 0.$$

Note that, in general, cellular fitness f is not always equal to just λ, the growth rate. Many types of cells live in multicellular organisms. They contribute to the fitness of the whole organism. Their own particular fitness objectives are rarely known. Here, we describe some models of fitness f(e, g) in simple cells as a function of properties of the cell’s proteins. We focus on proteins because more than half of a cell’s biomass is its proteins. Hence, where physical behaviors matter, proteins are likely to be predominant players. We distinguish between a protein’s generic physicochemical properties and its specialized sequence-structure actions. By “general physical” properties, we mean the following. First, we are referring to a protein’s health (also called proteostasis[2]): the balance between folded and unfolded states, the balance between folding and degradation, and the states of protein oxidation. Second, we are also referring to biophysical properties that can matter to the cell, such as protein movement, transport, crowding, sticking, and localization. Thanks to enzymatic assays, genome sequencing, and tens of thousands of atomically detailed protein structures in the Protein Data Bank, the special functions of many proteins are now known. Less is known about the generic, physical, and health behaviors of proteomes. While the biological actions are often distinct from one protein to the next, the physical behaviors can involve commonalities among proteins, often arising more from statistical properties than from the singular native states. These properties include a proteome’s distribution of stabilities, folding rates, and sensitivity to perturbations (such as side-chain charge modification), as shown in Figure 1.
The physical properties of proteins are important because the cell commits major resources in energy and biomass toward managing them, in its struggle against stresses, disease, and death. Just like the specialized jobs of proteins, the generic actions can be changed through evolutionary processes such as natural selection.

Figure 1

(a) Folding stability (Δ) varies across a proteome, with longer proteins tending to have higher stability. (b) Mean folding rate decreases with increasing protein size (N). (c) Stability loss from a single side-chain charge modification (for example from oxidative damage) scales linearly with the net charge (Q) of the protein and affects small proteins more greatly than large proteins. While two-thirds of the human proteome lies within one standard deviation of neutrality (left of dotted boundary), and is relatively robust to charge modification, the high-charge outliers are at risk of large stability loss.

Here, we describe how simple physicochemical models, combined with data from in vitro experiments, can predict some cell behaviors, rationalize observed growth laws, and generate hypotheses about diseases, aging, and evolutionary tendencies. The concepts being sought here, and the models being developed, are coarse-grained, not atomically detailed. Yet, despite their simplicity, they are often sufficient to generate testable hypotheses. The first example below shows how a coarse-grained model of protein folding stability can explain the high sensitivities of cells to temperature, rationalize thermal growth laws, predict proteome stability distribution functions, and give insight into how thermophilic organisms may have evolved to deal with higher environmental temperatures.

Thermal Properties of Cells Arise from the Folding Stabilities of their Proteomes

Cells are highly sensitive to temperature. It is not uncommon that the temperatures at which cells die are only a few degrees higher than the temperatures at which their growth is optimal.[3,4] Small shifts of environmental temperature can drive biological migrations, extinctions, genetic divergence, and speciation.[5−7] By what mechanism are cells so sensitive to temperature? Here, we review a polymer folding model (polymer-collapse theory) that indicates that the thermal sensitivities of cells arise because proteomes have evolved to have denaturation temperatures that are only marginally higher than the cell’s growth temperature.[8−11] Despite its simplicity, this mechanism gives an approximate quantitative description of bacterial growth rates versus temperature.

Cells Are Sensitive to Temperature Because Their Proteomes Are Poised Near Their Denaturation Temperatures

This protein–denaturation–catastrophe mechanism[8,12] has been made quantitative by combining thermodynamic measurements of 59 mesophilic proteins in vitro with polymer-collapse theory. Such theory reckons that reversible protein folding is driven by the small average tendency of amino acids to prefer sticking to other amino acids inside a compact native structure, rather than to be exposed and solvated in an expanded unfolded state in water. It also reckons that the principal force opposing folding is the chain entropy, which favors the unfolded state. A version of that simple idea also accounts for electrostatic interactions and the effects of temperature, salts, and denaturants, giving the folding free-energy Δunfold = Gunfolded – Gfolded as a function of chain length, temperature, and solution conditions.[10,13,14] In that expression, g0 represents the free-energy when amino acids desolvate and come into contact, z is the average conformational freedom loss per backbone bond, and Δcp is the change in heat capacity per amino acid upon folding. Qd and Qn are the total net charge on the denatured and native structures, respectively, and Rd and Rn are the radii of denatured and native protein. N denotes the number of amino acids (or chain length) in the protein, c is the denaturant concentration, κ is the inverse Debye length, lb is the Bjerrum length, k is Boltzmann’s constant, T is the temperature, Th = 373.5 K, Ts = 385 K,[13,14] and T0 = 300 K; for details, see refs (10) and (14). The expression gives the stability for a single average protein of length N. Thus, the probability distribution p(Δ) of stabilities of all the proteins in a proteome (Figure 1a) can be computed from P(N), the distribution of chain lengths of proteins in a cell.[8] P(N) is available for different cell types from proteomic or genomic data. We conclude that proteomes tend to be marginally stable at their physiological temperatures; see Figure 2. This marginal stability is not because the average stability is low, but because of the distribution of stabilities.
The average protein in E. coli is estimated to be reasonably stable, Δunfold = 6.8 kcal/mol at 37 °C. However, many proteins populate the “unstable” side of the distribution: approximately 550 of the roughly 4300 proteins in the E. coli proteome are less stable than 3 kcal/mol. In the absence of much data, we can estimate how stability is affected by protein domain structure,[15] and this estimate indicates that proteins may be even less stable than the values above.[8] Furthermore, while these estimates are based on stabilities measured in vitro, experiments and simulations show that protein stabilities in vivo or in the reconstituted cytosol are comparable to, or even slightly lower than, those in vitro.[16−20] The polymer folding model predicts that this marginally stable subset of the proteome is responsible for the high thermal sensitivity of the cell, as seen in Figure 2 for a small shift in temperature from 37 to 41 °C.
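To see why a few kcal/mol matters, the stabilities quoted above can be converted into folded populations. The sketch below assumes a simple two-state equilibrium, f = 1/(1 + exp(−Δunfold/RT)); this is a standard approximation, not a formula quoted from this section, and the function name and the 1 kcal/mol example are illustrative.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def fraction_folded(dg_unfold_kcal, temp_c):
    """Two-state folded fraction for an unfolding free energy in kcal/mol.

    Assumption: folded and unfolded states are in simple two-state equilibrium.
    """
    rt = R_KCAL * (temp_c + 273.15)
    return 1.0 / (1.0 + math.exp(-dg_unfold_kcal / rt))


# 6.8 kcal/mol is the average E. coli stability quoted in the text;
# 3.0 kcal/mol is the "unstable" threshold; 1.0 kcal/mol is a made-up example.
for dg in (6.8, 3.0, 1.0):
    print(f"dG_unfold = {dg:3.1f} kcal/mol -> folded fraction at 37 C: "
          f"{fraction_folded(dg, 37.0):.5f}")
```

Under this assumption the average protein is essentially fully folded, while a protein at the 3 kcal/mol threshold already carries a small unfolded population, and that population grows quickly as stability drops further.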

Figure 2

Distribution of unfolding free-energy (Δunfold = Gunfolded – Gfolded) of all the proteins present in the E. coli proteome at 37 °C (in blue) and at 41 °C (in red). The bin width for the free-energy is 1 kT. The total area under the curve equals the number (4300) of proteins present in the E. coli proteome. Adapted with permission from ref (8). Copyright 2010 Elsevier.

A similar stability distribution is predicted by an evolutionary kinetics model.[9] In that treatment, random mutations occur through evolution that can alter the folding stabilities of proteins. Evolutionary changes occur by a random walk with a drift on the folding free-energy landscape.[9,21] That work envisions two limiting states. Proteins have a maximum stability, Δmax, because it becomes increasingly harder for evolution to find sequences having arbitrarily high stabilities. Proteins also have a minimum stability, Δmin, because otherwise they will aggregate or not fold. Within these two limits, it is assumed that the fitness landscape is flat. The protein stability distribution that evolves through this evolutionary model is the same as that given by the polymer folding model.[8] Both the polymer folding model and the evolutionary kinetics model give a basis for rationalizing the functional form of cellular thermal growth laws.[8,10−12] We suppose that the cell’s growth rate, r(T), is a product of two terms: (i) a factor that describes Arrhenius-activation of one or more activated metabolic process(es) that govern how the cell’s growth rate increases with temperature at low temperatures,[8,12,22,23] and (ii) a factor that accounts for the fraction of the proteome that is folded at any temperature (capturing the denaturation catastrophe of the proteome at high temperatures[8,11,12]):

$$r(T) = r_0 \exp\!\left(-\frac{\Delta^{\ddagger}}{kT}\right) \prod_{i=1}^{\Gamma} \frac{1}{1 + \exp\!\left(-\Delta_{\mathrm{unfold}}(N_i, T)/kT\right)}.$$

Here, r0 is some reference growth rate, Δ⧧ is the activation barrier of some critical growth-limiting metabolic rate, and Γ is the number of essential proteins that are needed for growth.
The product denotes multiplication over the probability that the ith essential protein (with Ni amino acids) is in the folded state, which is written in terms of Δunfold (computed from the folding free-energy model described above; typical temperature dependence shown in Figure 3a). The expression above is simplified by assuming lethal proteins are drawn from the same distribution as the proteome,[8,12] thus enabling the calculation over all the proteins in the proteome, with Γ being a fit parameter. The details of the calculation can be found in previous work.[8,10] Similar arguments[22,24] have been made but using only a single effective value for Δunfold. The model described here, based on the whole proteome stability distribution, fits well the experimentally measured growth rates for mesophilic organisms (Figure 3b). The corresponding best-fit value of the cell’s activation barrier for growth, Δ⧧, for E. coli is found to be 16.3 kcal/mol. This happens to be approximately equal to the barrier for peptide bond formation by the ribosome,[25] and is consistent with estimates from other studies.[12,22−24] Moreover, this activation energy is in the same range as typical values for various enzymatic reactions, including the barrier (13 kcal/mol) that is associated with the elongation of RNA by transcription.[26] This model also fits the growth rates of thermophilic organisms (Figure 3c) well when using thermodynamic parameters for thermophilic proteins obtained from analyzing in vitro data sets.[10] A detailed systems-level model has been applied to understand how mutations in metabolic networks change thermal growth rates.[27,28] Those studies also indicate that the thermostabilities of metabolic enzymes are rate-limiting at superoptimal temperatures.[28] These models and arguments suggest that fundamental physicochemical properties of proteomes help to define a cell’s evolutionary fitness landscape (Figure 3d).
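The two-factor structure of the growth law (Arrhenius activation times the probability that every essential protein is folded) can be sketched numerically. This is a toy illustration only: the 16.3 kcal/mol barrier is the value quoted above, but the list of 50 stabilities, the linear loss of stability with temperature (`SLOPE`), and the use of molar units (R instead of k) are assumptions made for this sketch and stand in for the polymer-collapse expression, which is not reproduced here.

```python
import math

R_KCAL = 1.987204e-3   # gas constant, kcal/(mol*K)
T_REF = 310.15         # 37 C in kelvin
ACTIVATION = 16.3      # kcal/mol, best-fit E. coli barrier quoted in the text
SLOPE = 0.3            # kcal/(mol*K); hypothetical loss of stability per kelvin

# Hypothetical stabilities (kcal/mol at 37 C) of 50 essential proteins.
STABILITIES_37C = [2.0 + 0.2 * i for i in range(50)]


def folded_fraction(dg_unfold, temp_k):
    """Two-state probability that a protein is folded."""
    return 1.0 / (1.0 + math.exp(-dg_unfold / (R_KCAL * temp_k)))


def relative_growth_rate(temp_c):
    """Growth rate divided by the reference rate r0, for the toy proteome."""
    temp_k = temp_c + 273.15
    arrhenius = math.exp(-ACTIVATION / (R_KCAL * temp_k))
    all_folded = 1.0
    for dg_ref in STABILITIES_37C:
        # Toy assumption: stability falls linearly as temperature rises.
        dg = dg_ref - SLOPE * (temp_k - T_REF)
        all_folded *= folded_fraction(dg, temp_k)
    return arrhenius * all_folded


for temp_c in (25, 31, 37, 41, 46):
    print(f"{temp_c:2d} C  r/r0 = {relative_growth_rate(temp_c):.3e}")
```

Even with these made-up inputs, the curve rises with temperature while the Arrhenius factor dominates and then collapses once the least stable proteins begin to unfold, which is the qualitative shape the text describes.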

Figure 3

(a) Protein folding stability across temperatures (Δunfold) for an ideal mesophilic (blue) and thermophilic (red) protein based on thermodynamic data.[10] (b) The growth rate model (blue) captures the experimental growth rate of mesophiles like E. coli (●) and (c) thermophiles (red).[10] (d) Temperature–growth curves in parts b and c can be seen as slices through a high-dimensional fitness landscape. Some dimensions can be traversed rapidly (like temperature), while others (ξ) change over evolutionary time scales. Reprinted in part with permission from ref (10). Copyright 2011 Elsevier.

Proteomes of Thermophilic Organisms Are More Stable Than Those of Mesophilic Organisms

The polymer-collapse model also gives insight into how mesophilic cells differ from thermophiles. Mesophilic organisms mostly live at moderate temperatures (25–40 °C) while thermophilic organisms grow at higher temperatures. How do their proteomes differ? A global analysis of 57 thermophilic proteins and 59 mesophilic proteins shows an average systematic difference:[10] thermophilic proteins denature at higher temperatures than mesophilic proteins, as they are more stable, on average, at all temperatures[10] (see Figure 3a). It also indicates that denatured states of thermophilic proteins may have less chain entropy than mesophilic proteins.[10] This implies that the denatured states are, on average, more compact in thermophiles;[29−31] see Figure 4. In principle, the difference in stabilities between thermophiles and mesophiles could arise from any of the types of driving forces, including electrostatics, hydrophobic interactions, proline substitution, disulfide bonds,[32−57] the presence of amino acids having different flexibilities,[58−61] or loop deletions.[62]

Figure 4

Denatured states are more compact in thermophilic proteins than in their mesophilic counterparts. Among other things, this can result from less net charge on thermophilic proteins or from more subtle differences in charge patterning (see ref (57) for details).

However, it seems likely that electrostatics may be a key contributor to these differences.[33−36,39−48,57,63,64] Electrostatic stability of folded proteins can depend both on a protein’s net charge and on its charge patterning. For example, Sawle and Ghosh have shown that a good predictor of the relative compactness of the denatured structures between thermophilic and mesophilic sequences is the sequence–charge–decoration (SCD) metric:[57]

$$\mathrm{SCD} = \frac{1}{N} \sum_{m=2}^{N} \sum_{n=1}^{m-1} q_m q_n \, |m - n|^{1/2}.$$

Here, qm and qn are the charges (1 for basic, −1 for acidic, and 0 otherwise) on two amino acids m and n, with |m – n| being their sequence separation. SCD expresses the degree of charge mixing;[57] a similar metric has been given by Das and Pappu.[65] Figure 5 gives the SCD values for two sequences of charge. A more compact denatured state is predicted by a more negative value of SCD. In this case, a “blockier” sequence of charges gives the more compact denatured state. Sawle and Ghosh have applied this metric to a set of 540 orthologous pairs of thermophilic and mesophilic proteins, and found that thermophiles, in general, have a more compact denatured state than mesophiles.[57] While this comparison was made without corresponding 3D protein structures, a comparison has also been made of a smaller set of 55 well-aligned mesophile–thermophile pairs, for which structures are known.[66] This too shows that thermophilic domains are, on average and with high statistical significance, more compact than their mesophilic counterparts. Charge patterning and segregation also contribute to the sizes of intrinsically disordered proteins[65] and to the degree of ribosome–protein complexation.[67]
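The charge-patterning metric can be computed directly from a sequence. The sketch below assumes the published form of SCD, a sum of q_m q_n |m − n|^(1/2) over all residue pairs divided by the chain length N, and assigns +1 to K and R, −1 to D and E, and 0 to everything else (treating histidine as neutral is an assumption of this sketch). The two 50-residue test sequences are illustrative, not taken from the data set discussed above.

```python
import math

# Charge assignment: +1 for basic, -1 for acidic, 0 otherwise.
# Assumption: histidine is treated as neutral.
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def scd(sequence):
    """Sequence-charge-decoration of a one-letter amino acid sequence."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    q = [CHARGE.get(aa, 0) for aa in sequence.upper()]
    total = 0.0
    for m in range(1, len(q)):
        if q[m] == 0:
            continue  # neutral residues contribute nothing
        for n in range(m):
            total += q[m] * q[n] * math.sqrt(m - n)
    return total / len(q)


well_mixed = "EK" * 25             # strictly alternating charges
blocky = "E" * 25 + "K" * 25       # fully segregated charges

print(f"well-mixed SCD = {scd(well_mixed):.2f}")
print(f"blocky     SCD = {scd(blocky):.2f}")
```

Both sequences have the same composition and zero net charge, so any difference in SCD comes from patterning alone; the segregated sequence gives the more negative value, which the metric associates with a more compact denatured state.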

Figure 5

Sequence–charge–decoration (SCD) is a measure of charge patterning discrimination and a predictor of the compactness of a denatured state. The blockier sequence has the more negative SCD, predicting the more compact denatured state. A key distinction between mesophilic and thermophilic proteins appears to be the net charge and charge patterning of the protein sequences (see ref (57) for details).

Highly Charged Proteins Are in Greater Danger of Unfolding from Random Oxidative Damage, Such as in Aging

Here is another way that protein folding stability appears to manifest as a phenotype of the cell. Cells sustain increasing oxidative damage with age.[68−71] Protein damage with age follows a fairly universal behavior, independent of organism (Figure 6). We describe here a hypothesis about how oxidative damage can lower the folding stability of some of the proteome’s proteins.[72] A few things are clear. First, proteins are key targets of oxidative damage.[73−75] As many as half of the proteins in an average 80-year-old person are estimated to have oxidative damage.[68,74] Second, amino acid side-chains are the principal site of damage,[75−78] estimated to be at least 10 times more common than other types of damage.[75] Third, oxidative damage is a random “loose cannon” event in the cell, hitting proteins across the spectrum of the whole proteome. So, random side-chain damage may be an important consequence of oxidation. But, one additional fact poses a challenge for modeling: the level of oxidative damage in old cells amounts to only about one amino acid alteration per protein,[68,74] a relatively small effect. How might single charge changes in some proteins be sufficient to contribute to the aging phenotype?

Figure 6

A diverse range of organisms share a common age-dependent increase of oxidative damage. The amount of protein damage with age is shown for worms[69] (purple ◆), flies[70] (green ▲), rats[68] (cyan ■), and humans[71] (blue ▼). The black curve is the fit to the data, while the blue shaded region is the range of curves obtained if the fit parameters are changed by 15%. The pink stripes show the damage levels reached at the end of life in people with the premature aging diseases progeria and Werner syndrome.[71] Reprinted with permission from ref (72). Copyright 2016 Elsevier.

Here, we review the following mechanism:[72] (i) oxidation damages amino acid sites on random proteins across the proteome; (ii) some damage events will alter the charges on some side-chains;[77] (iii) for a small subset of the proteome, a small change in net charge (as small as +1 or −1 charges) can denature or destabilize its folded state. How can changing a protein’s charge by only +1 or −1 units unfold a protein? The folding free-energy model described above contains an electrostatic contribution written in terms of Qn² and Qd², the squares of the net charges on the native and denatured protein, respectively.[10,11] These terms capture the principle that it is unfavorable to bring a protein’s net charge from the larger volume of the unfolded state to the smaller confines of the native state[79,80] (see Figure 7a). This model has been demonstrated to predict the following: (i) the experimentally measured pH–salt phase diagrams for the unfolding of myoglobin, lysozyme, and RNase A,[14] and (ii) the experimental dependence of the folding free-energy on the square of the net charge.[79−82] The same model shows that changing a protein’s charge from Q to Q ± 1, for example from a single oxidative damage event, will change an average protein’s folding stability by ΔΔ(Q) = Δ(Q ± 1) – Δ(Q).

Figure 7

(a) For highly charged proteins, folding leads to the confinement of many charges into a small space. So, high net charge tends to destabilize the native fold. (b) This figure shows two points. First, the black line shows how one standard deviation of charge increases as a function of chain length in the human proteome. The color shading indicates the stability change predicted from a single destabilizing charge modification. The fact that the one standard deviation line coincides with the boundary between the blue and red regions indicates that most proteins in the human proteome are relatively long, neutral, and low-risk, yet there exists a significant number of outliers that are short, highly charged, and high-risk. Second, the points on this figure indicate 20 proteins that are important to aging and aging-related diseases and predicted to be in greater danger of large stability loss from a single oxidative charge modification. Some are among the most highly charged proteins in the proteome. Adapted with permission from ref (72). Copyright 2016 Elsevier.

The expression for ΔΔ(Q) is in quantitative agreement with charge-perturbation experiments.[81,82] It can be computed using only a protein’s sequence. It predicts a proteome-wide distribution of stability changes that is similar to that observed experimentally in point mutations of charged residues, which are reasonable proxies for oxidation.[83] A key conclusion is that the change in folding free-energy, ΔΔ, from a damage event will be proportional to the net charge already on the native protein before the damage event. So, any proteins in the proteome that are highly charged and/or relatively unstable to begin with are in greater danger of being destabilized by a single oxidative damage event; see Figure 7b. Figure 7b shows an interesting implication of the model.[72] First, the black curve shows the one standard deviation line for the human proteome.
It shows that most human proteins are sufficiently neutral to be safe from unfolding by single charge-modification events. Only a few of the proteins in the proteome have a sufficiently high net charge (of either sign) for the destabilization of their native state to be comparable to the stability of some entire proteins (roughly 2–4 kT; see Figure 2). Now, notice the data points on Figure 7b. These are 20 human proteins known from the literature to be relevant to aging.[84] These 20 proteins all lie in the high-risk region, and thus, the model predicts that these proteins can be unfolded by a single oxidative charge-modification event. So, changing a single side-chain charge by a random oxidation event could contribute to how aging cells lose protein stability and function.[85] Figure 8 compares a typical charge distribution found on the majority of proteins, which are nearly neutral (Figure 8c) and not at risk of unfolding from random oxidation events, with those of highly charged proteins (Figure 8a,b) at high risk of unfolding from single oxidation events.
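The linear dependence on net charge follows from the quadratic form of the electrostatic term. As a minimal worked step, suppose the electrostatic contribution lowers the unfolding free energy by A Q², where A > 0 is a hypothetical lumped coefficient (absorbing the radii, screening, and Bjerrum-length factors) and the same net charge Q is assumed for the native and denatured states. Neither simplification is taken from the original equation, which is not reproduced here.

```latex
\Delta_{\mathrm{unfold}}(Q) = \Delta_{0} - A\,Q^{2}
\quad\Longrightarrow\quad
\Delta\Delta(Q) = \Delta_{\mathrm{unfold}}(Q \pm 1) - \Delta_{\mathrm{unfold}}(Q)
               = -A\left[(Q \pm 1)^{2} - Q^{2}\right]
               = -A\,(1 \pm 2Q)
```

Under these assumptions, a unit charge change on a neutral protein shifts the stability by only A, whereas the same change on a protein with |Q| = 30 shifts it by about 60 A, destabilizing when the change moves the protein further from neutrality.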

Figure 8

The electrostatic surface potentials of (a) telomerase reverse transcriptase (1132 residues and +98 net charge in Figure 7b; PDB: 3KYL) and (b) nucleosome-remodeling factor subunit RbAp48 (425 residues and −29 net charge; PDB: 2XU7) are substantially different from the smaller, more speckled potential at the surface of (c) ubiquitin (76 residues and zero net charge; PDB: 1UBQ).[72] Reprinted with permission from ref (72). Copyright 2016 Elsevier.

Additional observations support this mechanism: high net charge is known to predict disorder-prone, unstable proteins;[86] disorder and low stability increase the chance of becoming oxidatively damaged;[87] protein aggregates of old organisms are enriched in damaged proteins;[88] and in budding yeast[89] and worms,[90,91] aggregates are known to be enriched in highly charged proteins such as ribosomal and DNA-binding proteins.[72] Interestingly, low net charge is also a signature of thermophilic proteins,[57] which face greater stability challenges, as discussed earlier.

Dynamical Properties of Cells Arise from the Folding, Synthesis, Degradation, and Transport Rates of Proteins

Below, we review some of the time scales and dynamical processes of proteomes that are important to rapidly duplicating cells.

Protein Folding Happens Fast Enough To Escape the “Grim Reaper” of Proteome Degradation

First, consider the distribution of protein folding times. Experiments show that single-domain proteins fold in vitro over time scales that span about 8 orders of magnitude.[11,92−96] Thirumalai developed an early model,[97] predicting that folding rates would scale as k = k0 exp(−N^{1/2}) with chain length N. It was remarkably prescient, given the almost complete absence of data at that time. It successfully describes folding rates of proteins[98] and RNA molecules.[99] Recently, a microscopic folding mechanism has been proposed, called the Foldon Funnel Model; see Figure 9. The model asserts a simple folding mechanism, namely, that local structures form first and rapidly, followed by larger nonlocal structures that assemble more slowly because they have to wait for smaller pieces to form first.[95] The model gives good predictions of folding rates for 93 single-domain proteins from sensible values of helix–coil and hydrophobic interaction parameters[95] (Figure 9a). The model predicts a median nonabundance-weighted folding time of 5 s for the E. coli proteome.[95]
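The square-root scaling can be made concrete by comparing chains of different length. The sketch below takes the scaling exactly as written above, k = k0 exp(−N^{1/2}), and reports only rates relative to k0, since no value of the prefactor is given here; the function name and the example chain lengths are illustrative.

```python
import math


def relative_folding_rate(n_residues):
    """Folding rate divided by the prefactor k0, assuming k = k0 * exp(-sqrt(N))."""
    if n_residues <= 0:
        raise ValueError("chain length must be positive")
    return math.exp(-math.sqrt(n_residues))


# Example chain lengths (illustrative): a small domain and a chain four times longer.
short, long_chain = 100, 400
slowdown = relative_folding_rate(short) / relative_folding_rate(long_chain)
print(f"A {long_chain}-residue chain folds about {slowdown:.0f}x more slowly "
      f"than a {short}-residue chain under this scaling.")
```

Because the exponent grows as the square root of N rather than N itself, quadrupling the chain length only doubles the exponent, which keeps predicted folding times for larger proteins within a biologically workable range.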

Figure 9

(a) Foldon Funnel Model predictions for protein folding rates vs number of secondary structure units (Ns), compared to data on 93 small single-domain proteins. The inset shows the funnel landscape for this model. (b) Mechanism for how local structures form first and then assemble toward the native state.[95] Reprinted with permission from ref (95). Copyright 2014 American Chemical Society.

Another model of folding rates is the Topology Polymer Model.[94] It treats the chain conformations more explicitly than the Foldon Funnel Model, fully accounting for entropic costs of chain topological restrictions (see polymer diagrams in ref (94) for details). The Topology Polymer Model also differs by (i) using structure-based domain assignments to predict folding rates and (ii) weighting the folding rates by protein abundance when predicting the proteome folding rate distribution.[96] The Topology Polymer Model gives good predictions for the dependence of folding speed on native topology[94,100] and unifies different models of folding kinetics. It predicts an average abundance-weighted folding time of 100 ms for the E. coli proteome, and it predicts an average of 170 ms for the yeast proteome.[96] The role of topological constraints in nucleic acids, proteins, and folding kinetics has also been recently revisited using simple folding models.[101,102] A question for the future remains: What are the folding rates of large single-domain or multidomain proteins? There are not yet many experiments for those types of proteins.[15,103] Figure 10a compares the protein folding times for the yeast proteome (from the Topology Polymer Model) with other key rates in the cell.[96] The rate distribution is broad. The most remarkable prediction is that folding speeds seem nearly optimal for outrunning the “grim reaper” of protein degradation,[96] with the slowest-folding proteins just barely out-pacing the fastest protein degradation.
This case is made by the black curve in Figure 10a, which is the result of an evolutionary diffusion-drift model of folding rates,[96] resembling the diffusion-drift model of protein stabilities[9] described earlier. The model is based on asserting two physical principles of evolution, namely, that (i) no protein can fold faster than known ultrafast folders, due to conformational speed limits,[105] and (ii) no protein should fold more slowly than the fastest degradation time. Within this interval, the only selection pressure on folding kinetics is simply to “beat the clock” against degradation.[96] When fitted with only one parameter against the folding time distribution derived from the Topology Polymer Model, the model predicts the slowest folding time to be around 10 s. This provides a cushion of an order of magnitude in time separation relative to the fastest degradation times (a few minutes). So, even a protein that degrades at the fastest rate, if not folded off the ribosome by cotranslational folding, has at least a 90% chance of folding before being degraded.[96] For yeast, almost 99% of the proteome’s proteins fold faster than the degradation time (see Figure 10b and ref (96) for details). Among the four outliers, the only protein that folds significantly more slowly has 18 chaperone interaction partners,[96] indicating the important role of chaperones in helping slow folders.[106]
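The "at least 90%" figure is what a simple kinetic competition gives for a tenfold time separation. The sketch below assumes folding and degradation are independent first-order processes with mean times τ_fold and τ_deg, so that the probability of folding first is k_fold / (k_fold + k_deg); this competing-rates picture and the 100 s degradation time (ten times the 10 s slowest folding time) are assumptions of the sketch, not values taken from ref (96).

```python
def prob_fold_before_degradation(fold_time_s, degradation_time_s):
    """Chance that folding wins a race between two first-order processes.

    Assumption: folding and degradation are independent, each with a single
    exponential waiting time whose mean is the supplied time scale.
    """
    if fold_time_s <= 0 or degradation_time_s <= 0:
        raise ValueError("time scales must be positive")
    k_fold = 1.0 / fold_time_s
    k_deg = 1.0 / degradation_time_s
    return k_fold / (k_fold + k_deg)


# Slowest folder (about 10 s) against a degradation time ten times longer.
p = prob_fold_before_degradation(10.0, 100.0)
print(f"P(fold before degradation) = {p:.3f}")
```

With a tenfold separation the folding process wins ten times out of eleven, and any larger separation, such as a degradation time of a few minutes, pushes the probability higher still.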

Figure 10

(a) Abundance-weighted folding time (t in seconds) distribution across the yeast proteome (blue) using the Topology Polymer Model,[94] which is in good agreement with the diffusion-drift model (black) with a flat fitness landscape.[96] The experimentally measured half-life distribution of the yeast proteome (green)[104] shows that folding kinetics are faster than protein degradation.[96] Median synthesis time is shown in red. (b) The distribution of the ratio of protein half-life to protein folding time.[96] Adapted with permission from ref (96). Copyright 2014 Zou et al.

Speed of Cell Duplication Is Limited by the Rate of Protein Translation

What is the speed limit for cell duplication? In rapidly growing E. coli bacteria, DNA replication takes 1–2 ms/base,[107] transcription by RNA polymerase takes 10–40 ms/base,[108,109] and translation by the ribosome takes 50 ms/amino acid.[110] The ribosome’s slower rate of elongation, combined with its enormous size (since the ribosome itself needs to get copied) and the 10-fold greater cellular abundance of polymerized amino acids relative to nucleotides, makes protein translation the largest bottleneck to cellular growth. In fast-growing E. coli, about a third of the cell’s dry weight is ribosome (including rRNA).[111,112] What is the maximum rate of protein synthesis? First, cell duplication requires that each ribosome must make a copy of its own proteins. The fastest that a ribosome can copy itself is 6 min, assuming a ribosome’s 7336 amino acids[113] are translated at a rate of 20 per second.[114] Second, each ribosome must duplicate a corresponding complement of other proteins too. At fast growth rates, an E. coli ribosome must make roughly three times its own mass of nonribosomal proteins.[111,115] These nearly 30 000 amino acids must be duplicated in series, one amino acid at a time, by each ribosome, predicting a minimum doubling time of 24 min, which is close to the fastest doubling time observed in E. coli.[111] Interestingly, this 1:3 ratio of ribosomal to nonribosomal proteins also appears to hold in budding yeast, a fast-growing eukaryote.[116] So, the minimum cell division time t can be estimated as

$$ t \approx \frac{4L}{r} $$

where r is the rate at which one ribosome adds amino acids to a growing protein chain, and L is the number of amino acids in a ribosome; the factor of 4 counts the ribosome’s own proteins plus roughly three times that mass of nonribosomal proteins. A ribosome of budding yeast contains 1.6-fold more amino acids than E. coli’s[113,117] and elongates proteins at half the latter’s speed.[110,116] So, if protein translation is indeed the limiting factor in the rate of cell duplication, it implies a minimum doubling time of 2 × 1.6 × 24 min = 77 min. This is close to experimental values.[118]
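The arithmetic behind these doubling-time estimates can be checked with a short sketch. The numbers are the ones quoted in the text (7336 amino acids per E. coli ribosome, 20 amino acids per second, a 1:3 ribosomal to nonribosomal mass ratio, and the 1.6-fold and 2-fold yeast factors); the function and constant names are hypothetical, introduced here only for illustration.

```python
E_COLI_RIBOSOME_AA = 7336       # amino acids per E. coli ribosome (from the text)
E_COLI_ELONGATION_RATE = 20.0   # amino acids per second per ribosome (from the text)
NONRIBOSOMAL_MASS_RATIO = 3.0   # nonribosomal protein mass per unit ribosome mass


def min_doubling_time_min(ribosome_aa, elongation_rate,
                          mass_ratio=NONRIBOSOMAL_MASS_RATIO):
    """Minutes for one ribosome to translate itself plus its share of
    nonribosomal protein, one amino acid at a time."""
    total_aa = (1.0 + mass_ratio) * ribosome_aa
    return total_aa / elongation_rate / 60.0


e_coli = min_doubling_time_min(E_COLI_RIBOSOME_AA, E_COLI_ELONGATION_RATE)
# Budding yeast: 1.6-fold more ribosomal amino acids, half the elongation speed.
yeast = min_doubling_time_min(1.6 * E_COLI_RIBOSOME_AA,
                              E_COLI_ELONGATION_RATE / 2.0)
print(f"E. coli: {e_coli:.1f} min")
print(f"yeast:   {yeast:.1f} min")
```

Without intermediate rounding this gives roughly 24.5 min for E. coli and 78 min for yeast; the 77 min quoted in the text follows from rounding the E. coli value to 24 min before scaling.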

Protein Translation Speeds Are Limited by Diffusion and Binding

So, why can an amino acid not be added to a growing peptide chain in less than 50 ms in E. coli? Translation is known to require several actions:[119,120] (i) tRNA needs to diffuse to the ribosomal binding site; (ii) the tRNA must settle and bind in the appropriate orientation at this site, with proofreading to verify that it is the correct tRNA;[119] (iii) the peptide is chemically elongated. It is thought that the peptide elongation reaction (iii) is faster than the accommodation step, but this is still debated.[119] The rate of tRNA accommodation (ii) has been found experimentally to occur on the same time scale as the tRNA diffusion step (i) and thus could account for a non-negligible fraction of the total 50 ms. The diffusion step (i) depends on tRNA concentration. Evidence for its role in a diffusion bottleneck is that cellular tRNA concentrations are roughly the same as those needed to saturate ribosomal kinetics.[121] Furthermore, E. coli devotes a significant fraction of its dry weight to tRNA (up to 2%[121]) that could have been spent on more ribosomes, suggesting tRNA plays an important role in protein synthesis speed. Consistent with this, a tRNA diffusion model correctly accounts for how the abundance of tRNA varies with growth rate.[121] In short, it appears that the physical processes of tRNA diffusion (i) and the binding and proofreading (ii) are limits to the speed of ribosomal translation.

Cellular Actions May Be Broadly Rate-Limited by Protein Motions

Of course, there are very many metabolic rates in the cell. Figure 11a summarizes a broad range of enzyme actions, indicating a predominant time scale around 10–1000 ms.[122] What limits their rates? Typical enzyme reactions are often parsed into a sequence of steps: binding of the substrate, conformational opening and closing of the enzyme, the chemical reaction itself, and product release.

Figure 11

(a) Distribution of protein and ribosomal catalytic rates in prokaryotes and eukaryotes.[122] Ribosomal catalytic rates are remarkably similar to the proteome-wide averages. (b) Catalytic rates often closely follow those of the functional low-frequency motions of proteins. Mesophilic adenylate kinase (●),[123] thermophilic adenylate kinase (○),[123] T4 lysozyme (■),[133] triosephosphate isomerase (◀),[134] ribonuclease binase (▶),[135] RNase A (▼),[136] and cyclophilin A (◆).[137] (c) Enzyme catalysis slows down with increasing solvent viscosity in different concentrations of trehalose (○).[131] Part a adapted with permission from ref (122). Copyright 2011 American Chemical Society. Part c reprinted with permission from ref (131). Copyright 2004 Springer.

Among these steps, the chemical reaction step itself is often fast. The rate of collision between proteins and small diffusing ligands is on the order of 10^8 M^−1 s^−1, implying a time scale of 0.1 ms for typical ligand concentrations of 0.1 mM.[122] Hence, the rate-limiting steps for enzyme actions appear to be the other steps of the reaction scheme; namely, the opening and closing, binding, and product release steps.[123−127] These steps can be limited by protein dynamics. Evidence for this view comes from the close correspondence between catalytic rates and the rates of functional motions observed across many proteins, as shown in Figure 11b. However, enzymatic efficiency can be enhanced by other subtle mechanisms as well. For example, binding of allosteric effectors can induce fluctuations[128] and alter the conformational landscape, either by facilitating conformational transitions or by altering the width of the free-energy basin[129] and site-specific local flexibility.[130] Partitioning of flux between different pathways can also enhance turnover rates.[128] In spite of these subtleties, the overall role of protein dynamics in enzymatic turnover is clear (Figure 11b).
Furthermore, enzyme actions often slow down with increased solvent viscosity[131] (Figure 11c). This is consistent with the observed effect of solvent viscosity on loop closure, which is rate-limiting for catalysis in some enzymes.[132] So, if cell duplication speeds are ultimately limited by protein motions, why can proteins not wiggle any faster than they do? First, protein conformational energy landscapes are naturally rugged, even along directions of large-amplitude motions.[138] Second, large motions require moving against friction (“wet” friction of the solvent and “dry” friction from internal motions[139−142]). Third, some motions require local unfolding of secondary structures,[138] and that depends on protein folding stability, which is usually marginal.[127,138] Fourth, the protein conformation that binds the substrate is often little populated, and requires waiting for the right fluctuation. Lastly, there are trade-offs between high affinity for the substrate and stabilization of the transition state conformation.[127] In summary, the evidence compiled here indicates that cell duplication speeds are limited by ribosomal and enzyme actions, which are in turn limited typically by the diffusion of substrate and the motions of protein molecules as they slosh and contort in the solvent.
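The encounter time scale quoted in this section (a collision rate constant of order 10^8 M^−1 s^−1 at a ligand concentration of 0.1 mM) can be reproduced with a small sketch. It treats the encounter as a pseudo-first-order process, which is a simplifying assumption; the function name is hypothetical.

```python
def encounter_time_s(rate_constant_per_M_per_s, ligand_conc_M):
    """Mean waiting time for a ligand encounter, assuming pseudo-first-order
    kinetics with the ligand in excess (a simplifying assumption)."""
    return 1.0 / (rate_constant_per_M_per_s * ligand_conc_M)


K_COLLISION = 1e8    # M^-1 s^-1, order of magnitude quoted in the text
LIGAND_CONC = 1e-4   # M, i.e. 0.1 mM as quoted in the text

tau = encounter_time_s(K_COLLISION, LIGAND_CONC)
print(f"encounter time: {tau * 1e3:.2f} ms")
```

The result, about 0.1 ms, sits two to four orders of magnitude below the 10–1000 ms catalytic time scales discussed above, which is the basis for attributing the rate limit to the slower conformational and release steps.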

Salts Can Slow Down Cell Growth by Slowing the Rates of Movement of Proteins inside Cells

High salt concentrations slow the growth of bacteria, which is why salts are used to pickle foods and to preserve meats. Here, we describe a mechanism for bacterial salt growth laws: Adding external salt contributes an osmotic pressure that draws water out of the cell, causing the density of proteins inside the cell to increase, leading to more sluggish transport of the proteins throughout the cell’s cytoplasm, and reducing the cell’s growth rate. Experimental data show a correlation between cellular growth rate and specific reactions such as translation speed[110,143] and other key metabolic reactions.[144] To obtain the salt growth law, we suppose that growth rates of cells are proportional to protein–protein collision rates (r_d) inside the cell, resulting from protein diffusional transport. We hypothesize that biomolecular crowding has two opposing effects on reactions: (i) it increases the concentration of interacting species, but (ii) it hinders and slows the diffusion rate of the reactants. The combination of these two effects predicts a protein diffusional rate r_d that is proportional to ϕD(ϕ), where ϕ is the protein volume fraction and D(ϕ) is the diffusion constant, which depends on the crowding fraction.
The reduction of diffusion due to volume-excluding monodisperse hard-sphere crowders can be approximated by a simple formula: D(ϕ) ∼ D_0 (1 − ϕ/ϕ_c)^2, where D_0 is the diffusion constant in the limit of no crowding, and ϕ_c denotes the volume fraction at which diffusion critically slows down, estimated to be ϕ_c ≈ 0.58.[11,145,146] The protein–protein collision rate is

$$ r_d \propto \phi D(\phi) = \phi D_0 \left(1 - \frac{\phi}{\phi_c}\right)^2 $$

Maximizing r_d with respect to ϕ yields the optimal volume fraction of ϕ_opt ≈ ϕ_c/3 ≈ 0.19, close to the typical protein volume fraction (around 0.2) inside a cell.[11] We can compare this model’s predictions to experiments on bacterial growth rate as a function of salt and crowding volume fraction.[147] To account for heterogeneous protein sizes, two ingredients are needed. First, we have used the hard-particle theory of Minton,[148] and its parameters, to estimate how D(ϕ) varies with protein size. This model correctly captures the observed decrease in diffusion with increasing particle size.[148] Second, we need to know which particular protein or proteins are responsible for the diffusion limit to cell growth. Figure 12a shows two different assumptions regarding which proteins are rate-limiting. First, the red curve supposes that all the proteins in the proteome participate in growth, taken by averaging the reaction flux over the molecular weight distribution of the whole E. coli proteome. Second, an argument has been made[143] that one particular type of biomolecule may have an outsized influence on cell dynamics, namely, the tRNA-EF-Tu complex, the 70 kDa particle that brings the tRNA molecules to the ribosome in order to elongate the growing peptide chain. As we have argued in the previous section, protein translation, which depends on the rates of amino acid incorporation, may be rate-limiting for cell growth. The basic translation speed of incorporating one amino acid at a time can be further slowed in the presence of crowding due to compromised diffusion.
Might the diffusion of the tRNA-EF-Tu complex be growth-limiting? This is a large complex. It will diffuse slowly to the ribosome in the crowded cell environment. This diffusion-bottleneck hypothesis is supported by a recent study showing that ribosomes and tRNA are maintained close to the ratios predicted from diffusion arguments to optimize cell-wide translation rates.[143] The black curve in Figure 12a shows the model prediction when the diffusion of tRNA-EF-Tu complexes is considered to be rate-limiting.
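The optimum of the simple monodisperse crowding model described above can be checked numerically. The sketch below covers only the hard-sphere approximation r_d ∝ ϕ(1 − ϕ/ϕ_c)^2 with ϕ_c = 0.58; it does not implement Minton's size-dependent hard-particle theory or the curves of Figure 12a. The function names and the grid-search approach are illustrative choices, not the authors' method.

```python
PHI_C = 0.58  # critical volume fraction quoted in the text


def relative_collision_rate(phi, phi_c=PHI_C):
    """r_d / D_0 = phi * (1 - phi/phi_c)**2, valid for 0 <= phi <= phi_c."""
    return phi * (1.0 - phi / phi_c) ** 2


def optimal_volume_fraction(phi_c=PHI_C, steps=100_000):
    """Grid search for the volume fraction that maximizes the collision rate."""
    grid = (phi_c * i / steps for i in range(steps + 1))
    return max(grid, key=lambda phi: relative_collision_rate(phi, phi_c))


phi_opt = optimal_volume_fraction()
print(f"numerical optimum: {phi_opt:.3f}, analytic phi_c/3: {PHI_C / 3:.3f}")
```

The grid search agrees with the analytic result: setting the derivative (1 − ϕ/ϕ_c)(1 − 3ϕ/ϕ_c) to zero gives ϕ_opt = ϕ_c/3. In this picture, salt-induced water loss pushes ϕ above the optimum, so the collision rate, and with it the modeled growth rate, falls.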

Figure 12

(a) Growth rate as a function of crowding volume fraction is well-captured by the hard-particle model of Minton.[148] (b) Cell crowding has similar consequences on the rate of gene expression.[149] (c) A high-dimensional fitness landscape (as a function of volume fraction (ϕ) and arbitrary reaction coordinate ξ) on which part a represents a single slice. Part b is reprinted with permission from ref (149). Copyright 2013 Macmillan Publishers Ltd.

Of course, other factors will matter too in the balance of salt and volumes of the cell, including ion fluxes, their regulation, and the balance of ATP.[150] The model described above only aims to give a simple estimate of the protein diffusional factor. Cellular crowding is known to affect many physiological processes.[151] Crowding can also affect gene expression levels (Figure 12b), which reach a maximum before decreasing at higher densities.[149,152] Recent work has also shown that cytoplasm can exhibit glassy properties.[153,154] The nature of the cytoplasmic environment depends on the size of the cellular objects; for example, small objects experience cytoplasm as a liquid background while large macromolecules experience a solid-like environment.[154] Interestingly, metabolism can also tune the fluidity of the cytoplasm, allowing transport of large cellular components that would otherwise be severely constrained in their mobility. Thus, switching between different metabolic states under varying environmental conditions can alter dynamics, cell physiology, and ultimately cellular fitness.[154] Figure 12c shows how such relationships represent single slices through a high-dimensional fitness landscape that we are only beginning to understand.

Summary

While many behaviors of cells emerge from their unique biology, they are fundamentally constrained by the common physics that unites them. Here, we review simple arguments about how these fundamental limits are encoded within the collective physical properties of proteins and proteomes. We describe the role of proteome physics in cell growth laws, providing mechanisms for how cell growth speeds up with temperature and how high salt concentrations slow it down. Electrostatics models give mechanistic insight into the stability gain in thermophiles and the oxidative stability loss in aging and disease. Furthermore, kinetic models of protein folding applied on a global scale show how folding times may be limited by the rate of degradation. And, we note that cell growth appears to be rate-limited by the ribosomal action of adding amino acids to growing protein chains, and by protein motions responsible for enzyme actions. In short, physics can give qualitative and quantitative insights into the growth properties of cells through the use of simple physical models. We believe such global scale models, guided by physicochemical principles, will be increasingly sought after to understand cellular phenotypes and evolution.

147 in total

1. An electrostatic basis for the stability of thermophilic proteins.

Authors: Brian N Dominy; Hervé Minoux; Charles L Brooks
Journal: Proteins Date: 2004-10-01

2. Protein stability and surface electrostatics: a charged relationship.

Authors: Samantha S Strickler; Alexey V Gribenko; Alexander V Gribenko; Timothy R Keiffer; Jessica Tomlinson; Tracey Reihle; Vakhtang V Loladze; George I Makhatadze
Journal: Biochemistry Date: 2006-03-07 Impact factor: 3.162

Review 3. Why are proteins charged? Networks of charge-charge interactions in proteins measured by charge ladders and capillary electrophoresis.

Authors: Irina Gitlin; Jeffrey D Carbeck; George M Whitesides
Journal: Angew Chem Int Ed Engl Date: 2006-05-05 Impact factor: 15.336

4. How do thermophilic proteins and proteomes withstand high temperature?

Authors: Lucas Sawle; Kingshuk Ghosh
Journal: Biophys J Date: 2011-07-06 Impact factor: 4.033

Review 5. Connecting the dots: the effects of macromolecular crowding on cell physiology.

Authors: Márcio A Mourão; Joe B Hakim; Santiago Schnell
Journal: Biophys J Date: 2014-12-16 Impact factor: 4.033

6. Dependency of size of Saccharomyces cerevisiae cells on growth rate.

Authors: C B Tyson; P G Lord; A E Wheals
Journal: J Bacteriol Date: 1979-04 Impact factor: 3.490

Review 7. The folding of single domain proteins--have we reached a consensus?

Authors: Tobin R Sosnick; Doug Barrick
Journal: Curr Opin Struct Biol Date: 2010-12-06 Impact factor: 6.809

8. Relationship between ion pair geometries and electrostatic strengths in proteins.

Authors: Sandeep Kumar; Ruth Nussinov
Journal: Biophys J Date: 2002-09 Impact factor: 4.033

9. ProTherm and ProNIT: thermodynamic databases for proteins and protein-nucleic acid interactions.

Authors: M D Shaji Kumar; K Abdulla Bava; M Michael Gromiha; Ponraj Prabakaran; Koji Kitajima; Hatsuho Uedaira; Akinori Sarai
Journal: Nucleic Acids Res Date: 2006-01-01 Impact factor: 16.971

10. The energy landscape of adenylate kinase during catalysis.

Authors: S Jordan Kerns; Roman V Agafonov; Young-Jin Cho; Francesco Pontiggia; Renee Otten; Dimitar V Pachov; Steffen Kutter; Lien A Phung; Padraig N Murphy; Vu Thai; Tom Alber; Michael F Hagan; Dorothee Kern
Journal: Nat Struct Mol Biol Date: 2015-01-12 Impact factor: 15.369
