Proteins' Evolution upon Point Mutations.

Abstract

State-of-the-art protein structure prediction methods have been strikingly successful at accurately determining the three-dimensional structures of proteins. Yet seemingly simple problems, such as the effects of point mutations on a protein's (i) native conformation; (ii) marginal stability; (iii) ensemble of high-energy nativelike conformations; and (iv) metamorphism propensity and, hence, its evolvability, remain unsolved. As a plausible solution to the latter, some properties of the amide hydrogen-deuterium exchange, a highly sensitive probe of the structure, stability, and folding of proteins, are assessed from a new perspective. The preliminary results indicate that the protein marginal stability change upon point mutations provides the necessary and sufficient information to estimate, through a Boltzmann factor, the evolution of the amide hydrogen exchange protection factors and, consequently, that of the ensemble of folded conformations coexisting with the native state. This work contributes to our general understanding of the effects of point mutations on proteins and may spur significant progress in our efforts to develop methods to determine the appearance of new folds and functions accurately.

Year: 2022 PMID: 35573218 PMCID: PMC9089682 DOI: 10.1021/acsomega.2c01407

Source DB: PubMed Journal: ACS Omega ISSN: 2470-1343

Introduction

How point mutations in proteins impact their evolvability is of paramount importance in biology, from molecular evolution to structural biology; for example, nuclear magnetic resonance results reveal that “proteins adopt unstable, high-energy states that exist for fractions of a second but can have key biological roles”.[1] The long-standing question is how to resolve this problem without considering epistasis effects explicitly. Before we embark on unraveling a possible solution to this enigma, it is worth noting that the term epistasis has been used with various analogous meanings, although it is commonly defined as a phenomenon that “... occurs when the combined effect of two or more mutations differs from the sum of their individual effects...”.[2] We should point out here a remarkable equivalence between this definition of epistasis and Leibniz and Kant’s notion of space (and time) devised as “analytic wholes”, i.e., the one where “...its priority makes it impossible to obtain it by the additive synthesis of previously existing entities...”.[3] Beyond this philosophical thought, the following question arises: why should we be interested in epistasis? The main reason is that epistasis could have a remarkable impact on the evolution of proteins by either restricting their trajectories or opening new paths to sequences that would otherwise have been inaccessible.[4−6] However, despite the simplicity of the definition of epistasis and the colossal progress in the prediction of protein structures, the answers to simple questions remain elusive because “...there is currently no means to predict specific epistasis from a protein sequence or structure...”.[2] What if we analyze a far simpler problem? For example, can point mutation effects be forecast accurately? Unfortunately, the answer is still no.
To determine the nature of this problem, which includes that of epistasis, we should start by identifying the molecular origin and the main factors affecting point mutations. In this regard, it is not enough to consider the protein sequence or the mutation type (which could also be a post-translational modification[7]); a precise determination of the “field” between and around amino acids is also needed. The relevance of the “field” for an accurate description of any physical problem was highlighted by Einstein and Infeld[8] in the following terms: “A new concept appears in physics, the most important invention since Newton’s time: the field. It needed great scientific imagination to realize that it is not the charges nor the particles, but the field in the space between the charges and the particles which is essential for the description of physical phenomena.” Application of this concept to structural biology started with the pioneering development of all-atom “force-fields”[9−13] aimed at predicting the three-dimensional structure of proteins from knowledge of the amino-acid sequence alone, i.e., the protein folding problem.[14] However, a definitive solution to this problem has been elusive since then, e.g., how a sequence encodes the protein fold remains unknown,[15] even though the protein’s three-dimensional structure can be accurately determined.[16] Consequently, an accurate determination of point mutation effects is still an unsolved problem.[17,18] Indeed, a large number of methods and approaches used to predict protein stability upon point mutation, e.g., by using physical, statistical, or empirical “force-fields” or machine learning methods,[19,20] show limited performance and suffer from caveats.[19,21−25] Therefore, we should focus on the global rather than on the specific mutation effects.
In contrast to the mutation-specific effects, the global distribution of the proteins’ stability upon mutations[26,27] can be forecast with acceptable accuracy by a bi-Gaussian function.[21] Such changes in the proteins’ stability due to point mutations can be determined experimentally from the unfolding Gibbs free energy (ΔGU) between the wild-type (wt) and the mutant (m) protein, viz., as ΔΔGU = (ΔGUm – ΔGUwt).[28] Consequently, considering that point mutations mainly affect the native-state stability,[29] the observed change on the unfolding Gibbs free energy (ΔΔGU) should represent, fundamentally, the change (ΔΔG) in the protein marginal stability (ΔG), which refers to the Gibbs free-energy gap between the native state and the first unfolded state.[7,30,31] Let us provide some pieces of evidence that support this important conjecture. The proteins’ free energy of unfolding (ΔGU) spans a wide range of variations, viz., between 5 and 25 kcal/mol,[32] revealing the complexity of the “protein folding problem”.[14,33−35] However, its range of variation upon point mutations (ΔΔGU) is small and well-defined, revealing the validity of the thermodynamic hypothesis or Anfinsen dogma,[14] as explained next. The absolute values of the unfolding Gibbs free energy changes (|ΔΔGU|)—from the histogram of more than 5200 point mutation data obtained by using urea and thermal unfolding experiments[29]—are within the following narrow range of variation: |ΔΔGU| ≤ ∼7.5 kcal/mol. 
Notably, this boundary value for |ΔΔGU| conforms with the proteins’ marginal stability upper bound limit, namely, ∼7.4 kcal/mol,[31] which (i) is a universal feature of proteins, i.e., obtained regardless of the fold-class or its amino-acid sequence;[31] (ii) is a consequence of Anfinsen’s dogma validity;[31,36] and (iii) represents a threshold beyond which a conformation will unfold and become nonfunctional.[7,36,37] The latter means that changes in the Gibbs free-energy gap size (ΔΔG) between the native state and the first unfolded state cannot be larger than ∼7.4 kcal/mol, e.g., as it occurs for the single mutants of the green fluorescent protein from Aequorea victoria that loses ∼100% of the log fluorescence (native function) if ΔΔGU > ∼7.5 kcal/mol.[38] Consequently, assuming ΔΔG ∼ ΔΔGU—with the latter being experimentally determined—is a reasonable strategy to obtain a reliable assessment of the change on the protein marginal stability upon point mutations and, from here, their effects on the ensemble of folded conformations coexisting with the native state,[36] as shown later. The gained knowledge on protein (i) stability;[21,29,31,37,39−41] (ii) metamorphism, characterized by the existence of two or more folds with a significant structural difference between them;[37,42−51] and (iii) evolvability, the ability of a biological system to provide, by mutation and selection, phenotypic variation, has been enormous.[50,52−58] This will enable us to examine below, in light of evolution, how point mutations could impact each of those issues.
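The bookkeeping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the ΔGU values are hypothetical (not taken from the cited data sets), the sign convention is the one used later in the text (positive values are stabilizing), and the ∼7.4 kcal/mol bound is applied to the absolute value of ΔΔGU.

```python
# Upper bound of the marginal stability quoted in the text (~7.4 kcal/mol).
MARGINAL_STABILITY_BOUND = 7.4  # kcal/mol


def ddg_unfolding(dg_u_mutant, dg_u_wild_type):
    """Return ddG_U = dG_U(mutant) - dG_U(wild type), in kcal/mol."""
    return dg_u_mutant - dg_u_wild_type


def classify(ddg_u, bound=MARGINAL_STABILITY_BOUND):
    """Label a point mutation from its ddG_U (positive = stabilizing)."""
    if abs(ddg_u) > bound:
        return "beyond marginal-stability bound"
    if ddg_u > 0:
        return "stabilizing"
    if ddg_u < 0:
        return "destabilizing"
    return "neutral"


# Hypothetical unfolding free energies (kcal/mol), for illustration only.
dg_u_wt = 8.0
for name, dg_u_m in [("mutant A", 8.6), ("mutant B", 5.1), ("mutant C", 0.2)]:
    ddg = ddg_unfolding(dg_u_m, dg_u_wt)
    print(f"{name}: ddG_U = {ddg:+.1f} kcal/mol -> {classify(ddg)}")
```

Treating the bound as a hard cutoff is a simplification; in the text it is an approximate threshold beyond which a conformation is expected to unfold.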

Results and Discussion

Single-Point Mutation Effects

The unfolding Gibbs free energy changes upon mutation (ΔΔGU), in kcal/mol, immediately tell us whether a mutation makes a positive (stabilizing) or negative (destabilizing) contribution to the protein’s marginal stability. However, their impact on either the ensemble of folded conformations in equilibrium with the native state or the metamorphism propensity cannot be straightforwardly inferred. To solve this issue, the amide hydrogen exchange (HX) may be used because it is a sensitive probe to assess changes in the protein native-state structure.[59−66] Indeed, its use could bring precise information on the structural changes that could occur upon mutations and, consequently, on their impact on the protein marginal stability.[36] This is possible because the intra- and intermolecular hydrogen bonds are dependent on the protein native-state structure and the milieu. A simple example will be enough to illustrate this methodology. Shirley et al.[67] accurately determined the average free-energy change (ΔΔGU), from urea and thermal unfolding, on ribonuclease T1 (Rnase T1) for 12 point mutations of the Tyr → Phe, Ser → Ala, and Asn → Ala types. The observed destabilizing average ΔΔGU values ranged from ≈−0.5 kcal/mol (for Tyr57 → Phe) to ∼−2.9 kcal/mol (for Asn81 → Ala). Before we proceed, let us remember the following: first, ΔΔG ∼ ΔΔGU will provide us with the Gibbs free energy change in the protein’s marginal stability upon point mutation; second, the amide HX protection factor (P) for a protein in its native state, i.e., in the EX2 limit,[68] is given by the equation ΔGHX = RT ln P,[68] where ΔGHX represents the Gibbs free-energy change for the opening/closing equilibrium,[64,68] R is the gas constant, and T is the absolute temperature.
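As a small numerical aid, the relation ΔGHX = RT ln P can be evaluated in both directions. The sketch below assumes a temperature of 298.15 K, which the text does not specify, and uses hypothetical function names; the protection factors in the loop are arbitrary illustrative values, not measurements.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)
TEMPERATURE_K = 298.15  # assumed temperature; not stated in the text


def dg_hx_from_protection(protection_factor, temperature_k=TEMPERATURE_K):
    """dG_HX = RT ln P (EX2 limit), in kcal/mol."""
    return R_KCAL * temperature_k * math.log(protection_factor)


def protection_from_dg_hx(dg_hx, temperature_k=TEMPERATURE_K):
    """Inverse relation: P = exp(dG_HX / RT)."""
    return math.exp(dg_hx / (R_KCAL * temperature_k))


# Arbitrary protection factors, for illustration only.
for p in (1e2, 1e4, 1e6):
    print(f"P = {p:.0e} -> dG_HX = {dg_hx_from_protection(p):.2f} kcal/mol")
```

Because the dependence is logarithmic, each factor of 10 in P corresponds to roughly 1.4 kcal/mol at this assumed temperature.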
Because our interest focuses on a particular region of the conformational space, namely, the Boltzmann ensemble of folded states in equilibrium with the native state, the relation ΔGHX ∼ ΔG should hold.[36] Consequently, upon a point mutation, the relations $\Delta\Delta G_{HX} \sim \Delta\Delta G = (\Delta G_{m} - \Delta G_{wt}) = RT \ln (P_{m}/P_{wt})$, where $P_{m}$ and $P_{wt}$ are the corresponding protection factors for the mutant (m) and the wild-type (wt) protein, respectively, should also hold. Therefore, if ΔΔG ∼ −2.9 kcal/mol, then $P_{m} \sim P_{wt} \times 10^{-2}$ (taking RT ≈ 0.6 kcal/mol near room temperature), where $P_{wt}$ represents the resistance of the amide HX in the wild-type native state relative to that of the highest free-energy conformation in the ensemble of folded states.[36] In other words, because Asn81 → Ala is a destabilizing mutation, it will leave a native state for the mutant (m) that is ∼100 times less resistant to the amide HX than that of the wild-type protein (wt). Therefore, point mutations change not only the stability of the native state[29] but also the structural dispersion in the ensemble of folded conformations coexisting with it. This conjecture is in line with convincing theoretical simulations of the HX mechanism on proteins.[69,70] Indeed, such simulations show that sizeable structural differences, in the ensemble of folded conformations relative to the native state, are not only likely but necessary for accurately analyzing the observed amide HX. From this point of view, it is reasonable to assume that a point mutation will introduce structural/energetic fluctuations in the ensemble of native folds in equilibrium with the native state (see Figure 1). The latter would be of great impact on protein evolution since the existing evidence shows that small changes in protein structure are essential to their function.[1] If the protein were a metamorphic one (an attribute encoded in its amino-acid sequence), a point mutation could modify, a priori, its metamorphism propensity.
Regardless of this, the appearance of fluctuations in the ensemble of native folds, a change in the milieu, or both could allow a redistribution of the folded-state ratio[36,37] (determined by the corresponding Boltzmann factors) and, hence, could benefit or impair the thermodynamic equilibrium between highly dissimilar (metamorphic) folded states. This could be of paramount importance to identify the amino acids critical for the emergence (or disappearance) of metamorphism in proteins, as in the study of the appearance of new folds and functions upon a mutation.[49,50] In addition to all the above, the protein evolvability should also be affected by point mutations because it is well known that stability promotes it.[53,54,71,72]
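The Rnase T1 estimate discussed in this section can be checked numerically. The following sketch evaluates the Boltzmann factor exp(ΔΔG/RT) for the two ends of the range reported by Shirley et al.; the temperature of 298.15 K and the function name are assumptions made here, not values given in the text.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)
TEMPERATURE_K = 298.15  # assumed temperature; not stated in the text


def protection_ratio(ddg, temperature_k=TEMPERATURE_K):
    """Ratio P_m / P_wt = exp(ddG / RT) implied by ddG = RT ln(P_m / P_wt)."""
    return math.exp(ddg / (R_KCAL * temperature_k))


# Average ddG_U values (kcal/mol) quoted in the text for Rnase T1.
for mutation, ddg in [("Tyr57 -> Phe", -0.5), ("Asn81 -> Ala", -2.9)]:
    ratio = protection_ratio(ddg)
    print(
        f"{mutation}: ddG = {ddg:+.1f} kcal/mol, "
        f"P_m/P_wt = {ratio:.2e} (about 10^{math.log10(ratio):.1f})"
    )
```

Under this assumed temperature, the ratio for Asn81 → Ala comes out slightly below 10^-2, consistent with the order-of-magnitude statement that the mutant native state is about 100 times less protected.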

Figure 1

Simple charts of the single-point mutation effects in terms of the protection factor (P), representing the resistance of the amide HX in the native state relative to that of the highest free-energy conformation in the ensemble of folded states, and the protein marginal-stability change (ΔΔG). (A) The protection factor of the mutant ($P_{m}$) differs from that of the wild type ($P_{wt}$) by a Boltzmann factor that is a function of ΔΔG. (B) Changes in the ensemble of conformations coexisting with the native state enable, e.g., upon a slight change in the milieu, the appearance of alternate native states, as for the metamorphic proteins and, hence, the occurrence of new functions.

Mutation Effects in Light of Evolution

We are now in a position to determine how a series of point mutations will affect the protein’s marginal-stability change. To solve this, let us frame the problem within the protein space model where “...if evolution by natural selection is to occur, functional proteins must form a continuous network, which can be traversed by unit mutational steps without passing through nonfunctional intermediates...”.[73] Implicit in this modeling is that any functional protein that pertains to the protein space obeys Anfinsen’s dogma.[36,37] Then, a walk in that protein space enables us to determine, after j consecutive point mutation steps, the following relations of interest:

$$\sum_{s=1}^{j} \Delta\Delta G_{U}^{s} = (\Delta G_{U}^{j} - \Delta G_{U}^{wt}) = \Delta\Delta G_{U} \qquad (1)$$

where $\Delta\Delta G_{U}^{s} = (\Delta G_{U}^{s} - \Delta G_{U}^{s-1})$ for s > 1, and wt represents the wild type, which is the reference state for the first step (s = 1). Then, considering

$$\Delta\Delta G \sim \Delta\Delta G_{U} \qquad (2)$$

the following relationship, in terms of the observable ΔΔG, can be obtained:

$$P_{j} \sim P_{wt}\, e^{\beta \Delta\Delta G} \qquad (3)$$

with β = 1/RT, and $P_{j}$ and $P_{wt}$ are the corresponding protection factors for the protein after j point mutation steps and the wild-type (wt) native state, respectively (as shown in Figure 1 for j = 1). In other words, eq 3 represents, after navigating the protein space as an abstract model of evolution, the changes in both the wild-type protein native-state stability (ΔΔGU) and the ensemble of folded conformations coexisting with it (P).
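The walk in protein space described above can be sketched numerically. The code assumes that the per-step stability changes add up to the total ΔΔG and that the protection factor after j steps is the wild-type one multiplied by exp(βΔΔG) with β = 1/RT, in line with the single-mutation relation; the step values, the wild-type protection factor, the temperature, and the function name are hypothetical.

```python
import math
from itertools import accumulate

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)
TEMPERATURE_K = 298.15  # assumed temperature; not stated in the text


def walk(p_wt, step_ddgs, temperature_k=TEMPERATURE_K):
    """Return (cumulative ddG, P_j) after each mutational step j.

    Assumes P_j = P_wt * exp(beta * cumulative ddG), with beta = 1/RT.
    """
    beta = 1.0 / (R_KCAL * temperature_k)
    return [
        (total, p_wt * math.exp(beta * total))
        for total in accumulate(step_ddgs)
    ]


# Hypothetical per-step ddG values (kcal/mol) and wild-type protection factor.
steps = [-1.2, 0.4, -0.8]
p_wt = 1.0e5
for j, (total, p_j) in enumerate(walk(p_wt, steps), start=1):
    print(f"step {j}: cumulative ddG = {total:+.1f} kcal/mol, P_j = {p_j:.3e}")
```

Only the accumulated ΔΔG enters the final protection factor, so a stabilizing step can partly offset earlier destabilizing ones.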
It is worth noting that the results in eqs 1−3 are valid even if there are k out of j mutations (with k < j – 1) leading to nonfunctional proteins, e.g., when a mutation leads to a free-energy change (ΔΔG) larger than the marginal stability upper bound threshold of ∼7.4 kcal/mol.[31,36] Consideration of this problem is relevant for two reasons: first, because most point mutations are destabilizing[29,55,72,74] and, second, because the evolutionary trajectories in the protein sequence space are assumed to be reversible, although genotypic irreversibility should not be dismissed.[75] Let us assume that the ancestor and target protein sequences are kept fixed during the evolutionary process, as in the word game.[73] In this game, one word changes into another by replacing one letter at a time, e.g., transforming the word “NCSI” into “IRNL”,[76] where each letter represents an amino acid in the single-letter code. Then, according to eq 3, turning “NCSI” into “IRNL” will lead to the same result, in terms of P, whatever the mutational trajectories linking these two words are. That is feasible because ΔΔG, as ΔΔGU, is a state function. Thus, nature can follow any evolutionary path if there is no penalty for doing so. In actual applications, the latter means that some trajectories could have a small (or null) chance of realization.[77] The word game also enables us to rationalize the complexity of the protein evolution analysis, e.g., in terms of either evolutionary trajectories or marginal stability changes. Let us briefly discuss the pros and cons of each of these approaches. On the one hand, an analysis of protein evolution in terms of the evolutionary trajectories implies knowing with precision, at each step, the letter to be mutated (amino-acid identity), the background where the mutation occurs (sequence and milieu), and the epistasis effects that could take place.
An accurate solution to this problem is a daunting task,[78] although of great practical relevance, e.g., how to turn a protein to exhibit the desired function, as happens in directed evolution applications.[54,72] The difficulty is exacerbated by the fact that neutral mutations, aside from epistasis effects, also need to be considered because they may play a critical role in the transition from one amino acid to another.[53,71,79−81] In other words, neutral mutations (which are invisible to natural selection) may compensate for the effects of mutations that are destabilizing though beneficial from the functional point of view.[72,80,81] On the other hand, if protein stability is an essential factor for evolution,[74,82] then it is reasonable to think about the whole (marginal-stability evolution) rather than the parts (evolutionary trajectories) because the former involves the latter and, consequently, all factors affecting protein evolution, including specific, nonspecific, or “high-order” epistasis effects.[4,5,78,83] Yet, we should keep in mind that Leibniz and Kant’s notion of space (and time),[3] devised as “analytic wholes”, highlights that the whole is more than the sum of the parts, although this does not imply the irrelevance of the latter. Indeed, as noted above for the directed evolution applications, if our interest focuses on understanding why nature adopts one among all possible evolutionary trajectories,[78] detailed consideration of the epistasis effects would be unavoidable.
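The path independence invoked in the word game can be illustrated with a toy calculation. The sketch below enumerates every direct trajectory from “NCSI” to “IRNL” (each position mutated once, an assumption made here) and sums the per-step changes of a stability value assigned to each intermediate word. The function `toy_dg_u` is entirely hypothetical: it maps a sequence to an arbitrary, non-additive number and is not a stability predictor; it only stands in for a state function.

```python
import math
import zlib
from itertools import permutations


def toy_dg_u(sequence):
    """Hypothetical dG_U (kcal/mol) for a sequence; NOT real data.

    Any function of the sequence alone behaves as a state function; the
    checksum makes it non-additive, loosely mimicking epistasis.
    """
    return 5.0 + (zlib.crc32(sequence.encode("ascii")) % 1000) / 100.0


def trajectory(ancestor, target, order):
    """List the words visited when positions are mutated in the given order."""
    letters = list(ancestor)
    path = [ancestor]
    for position in order:
        letters[position] = target[position]
        path.append("".join(letters))
    return path


def total_ddg(path):
    """Sum of per-step changes along a trajectory (a telescoping sum)."""
    return sum(toy_dg_u(b) - toy_dg_u(a) for a, b in zip(path, path[1:]))


ancestor, target = "NCSI", "IRNL"
positions = [i for i, (a, b) in enumerate(zip(ancestor, target)) if a != b]
totals = [
    total_ddg(trajectory(ancestor, target, order))
    for order in permutations(positions)
]
expected = toy_dg_u(target) - toy_dg_u(ancestor)
same = all(math.isclose(t, expected, abs_tol=1e-9) for t in totals)
print(f"{len(totals)} trajectories, identical total ddG: {same}")
```

The individual steps differ from one trajectory to another, which is where epistasis and nonfunctional intermediates would matter; only the end-to-end total is fixed by the two end states. The sketch does not model whether a given intermediate is functional.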

Conclusions

The above analysis stresses the importance of the protein marginal-stability change analysis upon point mutations not solely because it appears as one of the main factors governing protein evolution but also because it provides a straightforward path to estimate—without considering epistasis effects explicitly—the fluctuations that may occur in the Boltzmann ensemble of folded conformations in equilibrium with its native state and, hence, their possible impact on the progress toward new architectures and functions.

75 in total

1. Why are proteins marginally stable?

Authors: Darin M Taverna; Richard A Goldstein
Journal: Proteins Date: 2002-01-01

2. Robustness, evolvability, and neutrality.

Authors: Andreas Wagner
Journal: FEBS Lett Date: 2005-03-21 Impact factor: 4.124

3. Darwinian evolution can follow only very few mutational paths to fitter proteins.

Authors: Daniel M Weinreich; Nigel F Delaney; Mark A Depristo; Daniel L Hartl
Journal: Science Date: 2006-04-07 Impact factor: 47.728

4. Exchange of hydrogen atoms in insulin with deuterium atoms in aqueous solutions.

Authors: A HVIDT; K LINDERSTRØM-LANG
Journal: Biochim Biophys Acta Date: 1954-08

5. Applications of Protein Thermodynamic Database for Understanding Protein Mutant Stability and Designing Stable Mutants.

Authors: M Michael Gromiha; P Anoosha; Liang-Tsung Huang
Journal: Methods Mol Biol Date: 2016

Review 6. Computational approaches for predicting mutant protein stability.

7. Prediction of water and metal binding sites and their affinities by using the Fold-X force field.

8. Interconversion between two unrelated protein folds in the lymphotactin native state.

9. Can AlphaFold2 predict the impact of missense mutations on structure?

Review 10. Unfolding the Mysteries of Protein Metamorphosis.