Literature DB >> 30704856

A Dynamic Hydrophobic Core and Surface Salt Bridges Thermostabilize a Designed Three-Helix Bundle.

Catrina Nguyen1, Jennifer T Young1, Gabriel G Slade2, Ronaldo J Oliveira2, Michelle E McCully3.   

Abstract

Thermostable proteins are advantageous in industrial applications, as pharmaceuticals or biosensors, and as templates for directed evolution. As protein-design methodologies improve, bioengineers are able to design proteins to perform a desired function. Although many rationally designed proteins end up being thermostable, how to intentionally design de novo, thermostable proteins is less clear. UVF is a de novo-designed protein based on the backbone structure of the Engrailed homeodomain (EnHD) and is highly thermostable (Tm > 99°C vs. 52°C for EnHD). Although most proteins generally have polar amino acids on their surfaces and hydrophobic amino acids buried in their cores, protein engineers followed this rule exactly when designing UVF. To investigate the contributions of the fully hydrophobic core versus the fully polar surface to UVF's thermostability, we built two hybrid, chimeric proteins combining the sets of buried and surface residues from UVF and EnHD. Here, we determined a structural, dynamic, and thermodynamic explanation for UVF's thermostability by performing 4 μs of all-atom, explicit-solvent molecular dynamics simulations at 25 and 100°C, Tanford-Kirkwood solvent accessibility Monte Carlo electrostatic calculations, and a thermodynamic analysis of 40 temperature runs by the weighted-histogram analysis method of heavy-atom, structure-based models of UVF, EnHD, and both chimeric proteins. Our models showed that UVF was highly dynamic because of its fully hydrophobic core, leading to a smaller loss of entropy upon folding. The charged residues on its surface made favorable electrostatic interactions that contributed enthalpically to its thermostability. In the chimeric proteins, both the hydrophobic core and charged surface independently imparted thermostability.
Copyright © 2019 Biophysical Society. Published by Elsevier Inc. All rights reserved.

Entities:  

Mesh:

Substances:

Year:  2019        PMID: 30704856      PMCID: PMC6382955          DOI: 10.1016/j.bpj.2019.01.012

Source DB:  PubMed          Journal:  Biophys J        ISSN: 0006-3495            Impact factor:   4.033


Introduction

Proteins in thermophilic organisms often have more salt bridges on their surface, helix-capping residues, and more-rigid loops relative to mesophilic proteins (1, 2, 3, 4). Proteins constructed from ancestral or consensus sequences are generally more thermostable than their modern-day counterparts (5, 6, 7). Protein designers have taken advantage of these characteristics to intentionally thermostabilize otherwise-thermolabile proteins. Curiously, de novo-designed proteins often end up being thermostable without any direct effort on the designers’ part to make them so (8, 9, 10). As protein-design methods improve, it will become possible to rationally design proteins that are not only functional but also thermostable. Here, we identify and investigate the contribution of two strategies to increase a protein’s thermostability. In the early 2000s, the Mayo Lab developed an algorithm to redesign the amino acid sequence of a protein based on a backbone template (11, 12, 13). As part of their algorithm, they classified all residue positions as “buried” or “surface” and allowed only hydrophobic residues to be placed at buried positions and only hydrophilic residues on the surface. As a proof of principle for their algorithm, they redesigned the sequence of the Engrailed homeodomain (EnHD), a three-helix bundle transcription factor from Drosophila melanogaster. The resulting protein, UVF, folded as expected and was unexpectedly, extremely thermostable (T > 99°C vs. 52°C for EnHD) (14, 15, 16, 17). UVF and EnHD were the subject of a 2013 molecular dynamics study that investigated the structural and dynamic basis for thermostability in UVF (18). In those simulations, EnHD, as expected, is stable at 25°C and begins to unfold at 100°C. On the other hand, UVF is more dynamic than EnHD at room temperature and maintains these heightened dynamics at 100°C without unfolding. The authors predicted that the fully hydrophobic core of UVF contributes to its increased dynamics by allowing many favorable hydrophobic interactions, supporting more backbone conformations. Furthermore, they predicted that the more dynamic nature of UVF would be entropically favorable—UVF would lose less entropy when it folded relative to EnHD—contributing to UVF’s thermostability. Here, we test the prediction put forth in the 2013 study that UVF’s hydrophobic core imparts an entropic thermostability by building two hybrid, chimeric proteins. Protein models of these chimeras were built by combining the sets of buried residues in UVF and surface residues in EnHD (UbEs) and vice versa (EbUs) (Fig. 1 a). We performed molecular dynamics simulations of all four proteins at 25 and 100°C to investigate their stability and dynamics. To test whether the hydrophobic core imparts an entropic stabilization, we performed coarse-grained simulations and calculated thermodynamic quantities (specific heat, enthalpy, entropy, free energy) using the weighted-histogram analysis method (WHAM) for the four proteins. UVF and both chimeras were more dynamic than EnHD at 25°C, and all three remained folded at 100°C. The observed thermostability of the chimeric proteins suggests that the hydrophobic core and hydrophilic surface of UVF both independently impart thermostability. Based on these results, as well as TKSA Monte Carlo electrostatic calculations, we propose that UVF’s hydrophobic core provides entropic stabilization, and its hydrophilic surface provides enthalpic stabilization.
Figure 1

Structure and sequence of EnHD, UVF, and chimeric proteins. (a) Models depicting the design of EbUs and UbEs chimeras combining the surface and buried residue sets of EnHD and UVF. (b) Minimized starting structure (white) and representative final simulation structures at 25°C (light) and 100°C (dark) aligned on the helical core (residues 10–54). (c) Protein sequences with three helices (10–22, 28–38, 42–54) are noted below, and buried residues (8, 12, 16, 19, 20, 26, 31, 34, 35, 38, 40, 44, 45, 48, 49, 52) are noted in bold. To see this figure in color, go online.

Structure and sequence of EnHD, UVF, and chimeric proteins. (a) Models depicting the design of EbUs and UbEs chimeras combining the surface and buried residue sets of EnHD and UVF. (b) Minimized starting structure (white) and representative final simulation structures at 25°C (light) and 100°C (dark) aligned on the helical core (residues 10–54). (c) Protein sequences with three helices (10–22, 28–38, 42–54) are noted below, and buried residues (8, 12, 16, 19, 20, 26, 31, 34, 35, 38, 40, 44, 45, 48, 49, 52) are noted in bold. To see this figure in color, go online.

Materials and Methods

Chimeric protein models

The chimeric protein models for EnHD buried UVF surface (EbUs) and UVF buried EnHD surface (UbEs) were built combining the buried (EnHD residue numbers 8, 12, 16, 19, 20, 26, 31, 34, 35, 38, 40, 44, 45, 48, 49, and 52) and surface (all other residues) residue sets of EnHD and UVF using MODELLER (Fig. 1) (19). The protein that provided the buried set of residues was used as the structure template in the homology model. An alignment file between the chimera and template was generated using align2d, and the automodel command was used to generate 6 (EbUs) or 50 (UbEs) models. The model with the lowest score by the MODELLER objective function was selected.

Molecular dynamics simulations

We performed all-atom, explicit-solvent molecular dynamics (AA-MD) simulations of EnHD, UVF, and both chimeric proteins. The starting structures for EnHD and UVF were the experimental crystal structure (Protein Data Bank (PDB): 1enh (20)) and NMR structure (PDB: 2p6j (14)), respectively. To assess the inherent stability of a crystal structure versus an NMR or model structure, we also performed simulations beginning from residues 3 to 56 of the NMR structure of EnHD (PDB: 2jwt (21)). All simulations were performed using NAMD 2.11 (22) with the CHARMM36m force field (23, 24, 25). Structure files were generated for each of the five starting structures using autopsf in VMD (26), and 1000 steps of conjugate gradient minimization were performed. Each protein was solvated in a cubic box of TIP3P water molecules (27) of 50 Å/side (EnHD, UVF, chimeras) or 54 Å/side (EnHD NMR), and KCl was added at a concentration of 0.15 M to neutralize the system (28). The solvated system was minimized for an additional 100 steps and heated to 25 or 100°C. A Langevin thermostat and barostat were used to achieve an NPT system (constant number of particles, pressure, and temperature). The van der Waals forces were truncated with a smooth switching function at a cutoff distance of 8 Å, and full-system periodic electrostatics were performed using the particle mesh Ewald method. All bonds that included hydrogen were treated as rigid. An integration step size of 2 fs was applied, and structures were saved every 1 ps. Five 100-ns production runs were performed at both 25 and 100°C for EnHD (crystal), EnHD (NMR), UVF, EbUs, and UbEs for a total of 5 μs of simulation time.

Molecular dynamics simulation analysis

We performed data analysis using the molecular dynamics software, in lucem molecular mechanics (ilmm) (29). All protein sequences were numbered with respect to EnHD’s residue numbering (Fig. 1 c). The root mean-square deviation (RMSD) of the Cα atoms relative to the minimized starting structure was calculated for the helical core (residues 10–54) of all proteins. These residues were selected to discount the flexibility of the disordered N- and C-termini and to allow direct comparison of Cα RMSDs between proteins. The core Cα RMSD was also calculated pairwise between structures across the simulation at 10-ps granularity, rather than comparing only to the starting structure, and reported as the all-versus-all core Cα RMSD. Experimental NOEs for EnHD (MR block ID: 434304) (21) and UVF (MR block IDs: 449656 and 449657) (14) were downloaded from the Biological Magnetic Resonance Data Bank NMR Restraints Grid (30). 654 restraints were found for residues 3–56 of EnHD, and 1266 restraints were found for UVF. Distances were calculated as the 1/r6-weighted average distance between relevant pairs of hydrogen atoms, and an NOE was considered satisfied if the weighted distance was less than 5 Å or the rfar provided in the restraint file, whichever was longer. Distances were averaged across each simulation individually, over the five simulations at each temperature, and for both temperatures together. Long-order NOEs were defined as those between pairs of residues with a contact order of at least six. Several different contact analyses were performed. Contact maps were calculated based on the amount of time each pair of residues had at least one pair of atoms in contact. Two atoms were considered in contact when pairs of carbon atoms were closer than 5.4 Å and when any other nonhydrogen atoms were within 4.6 Å. Residue pairs that were in contact in the minimized starting structure were defined as native contacts and plotted below the diagonal on contact maps, and non-native contacts were plotted above. Contacts between side chains were further classified as hydrogen bonds, hydrophobic interactions, or other. Hydrogen bonds were defined as sets of three atoms in which the donor and acceptor atoms were <2.6 Å apart and the donor-hydrogen-acceptor angle was within 45° of linearity. Hydrophobic contacts were defined as pairs of carbon atoms that were bound to at least one hydrogen atom and were <5.4 Å apart. Other contacts were defined as all pairs of nonhydrogen atoms that were <4.6 Å apart. Side-chain salt bridges between ionizable residues were classified as hydrogen bonds or other contacts, as defined previously, that occurred between the carboxyl oxygen atoms of glutamic acid (none of the four proteins contains an aspartic acid) and the guanidinium, amine, or imidazole nitrogen and hydrogen atoms of arginine, lysine, or histidine. The total number of residue pairs that made any type of side-chain contact >10% of the simulation time were reported as “unique contacts.” More unique contacts are indicative of a more dynamic protein.

Electrostatic free-energy calculations

Electrostatic free-energy calculations were performed using the Tanford-Kirkwood solvent accessibility (TKSA) model. In this model, the protein is treated as a spherical cavity with dielectric constant (ε) and radius (b) surrounded by an electrolyte solution modeled according to Debye-Hückel theory (31). Shire et al. (32) modified this model by incorporating a solvent accessibility term for each ionizable residue that upgraded the TKSA model. Once the electrostatic energy between the titratable residues is determined, it is possible to calculate the free energy (ΔG(χ)) for the protein native state in a given protonation state χ (33, 34). It is assumed in the method that the unfolded state does not contribute to electrostatic interactions (35). The protonation states χ were calculated via Metropolis Monte Carlo algorithm, resulting in the TKSA-MC model (36).

Thermodynamic analysis of coarse-grained models

Structure-based models (SBMs), also known as Gō-models, are widely employed to study the dynamics of protein ensembles, especially protein-folding mechanisms, because of the success of the energy landscape theory (37). SBMs of proteins can be simplified using different levels of coarse graining by including only Cα (Cα-SBM) or all heavy (non-hydrogen) atoms (HA-SBM) (38). The Hamiltonian that gives the protein interaction energy is based on the geometry of its native state, such that the potential energy surface reaches its minimum at this reference state (39). Simulations were performed following a similar protocol from the previous work (38) with GROMACS 4.6.7 (40). Input files for the simulations were prepared with SMOG2 version 2.1 beta with standard options (41). Thermodynamic analysis was performed over 40 temperature runs from lower (fully folded states) to higher temperatures (unfolded states), including the transition temperature from the folded to unfolded state (T), by the WHAM with an in-house Python script (42). T was defined as the temperature of the specific heat peak between the folded and unfolded states for each protein. The folded state was chosen to be 0.95 T, and the unfolded state was chosen to be 1.05 T to calculate ΔH, ΔS, and ΔF.

Results

NOE satisfaction

To validate our AA-MD simulations, we compared them with experimentally determined nuclear Overhauser effect (NOE) distances. NOEs were obtained from the Biological Magnetic Resonance Data Bank, and 1/r6 weighted distances were calculated for all hydrogen atoms involved in NOE pairs. Table 1 shows the percentage of NOEs satisfied when the distances were averaged over each simulation, all five simulations at a given temperature, and all 10 simulations. EnHD satisfied 93% of long-range (contact order ≥6) and 94% of all experimental NOEs in simulations performed at room temperature. EnHD satisfied fewer long-range NOEs, only 91%, when the temperature was raised to 100°C because it began unfolding. UVF had lower long-range NOE satisfaction at 25°C than at 100°C, satisfying 88 vs. 92% long-range NOEs, respectively, because of faster sampling at the higher temperature. However, UVF satisfied 96 and 97% of all NOEs at 25 and 100°C, respectively. The number of long-range NOEs satisfied by the proteins in simulation was higher than the number satisfied by the starting structures, with 92% for EnHD and 81% for UVF, similarly because of increased sampling.
Table 1

Fraction of NOEs Satisfied

Temperature (°C)RunEnHD
UVF
Long RangeaAll NOEsLong RangeaAll NOEs
Minimized starting structureb0.920.930.810.94
25Averagec0.93 ± 0.010.94 ± 0.000.83 ± 0.030.94 ± 0.01
All simulations0.930.940.880.96
100Averagec0.89 ± 0.030.94 ± 0.010.91 ± 0.020.96 ± 0.01
All simulations0.910.950.920.97
All simulations0.950.960.910.97

i → i + 6 or greater.

PDB: 1enh for EnHD, PDB: 2p6j for UVF.

Average ± SD.

Fraction of NOEs Satisfied i → i + 6 or greater. PDB: 1enh for EnHD, PDB: 2p6j for UVF. Average ± SD.

Backbone motion

The core Cα RMSD (residues 10–54) to the minimized starting structure was reported for a representative simulation for EnHD, UVF, EbUs, and UbEs and averaged across the five AA-MD simulations for each protein at each temperature (Fig. 1, a and b). RMSD data for all simulations are provided in Fig. S1. Higher RMSDs indicate more deviation from the starting structure, and increasing RMSD over time indicates that the protein is unstable and beginning to unfold. EnHD was the only protein that denatured at 100°C, on average reaching a core Cα RMSD of ∼3.5 Å by 100 ns, compared with an average of 0.7 Å at 25°C. UVF and the two chimeras all had higher average core Cα RMSDs than EnHD at 25°C, but their RMSD did not increase much at 100°C. This backbone motion is evident in representative final structures from simulations at 25 and 100°C for all four proteins (Fig. 1 b). EnHD showed more movement from the starting structure (white) at 100°C (dark blue) than at 25°C (light blue), mostly in HIII, and the other three proteins show minimal differences between the two final structures. Measuring RMSD to the starting structure to quantify stability assumes that the starting structure is not only accurate but represents a stable, native conformation. Because both starting structures for the chimeras were models, we wanted to ensure that the elevated RMSD we observed for the native state was not due to the proteins finding a different conformation and fluctuating stably in it. Therefore, we calculated the core Cα RMSD pairwise across all time points for all 10 simulations, an all-versus-all core Cα RMSD. These results are presented as heat maps and as averages of the five simulations at each temperature (Fig. 2, c and d). Larger, higher-resolution versions of these heat maps are provided in Fig. S1. At room temperature, EnHD had an average all-versus-all core Cα RMSD of 0.7 Å, whereas UVF, EbUs, and UbEs had more highly dynamic native states measuring 1.3, 1.3, and 1.7 Å RMSD, respectively. Although EnHD was much more dynamic at 100°C relative to 25°C, with a jump in RMSD of 2.2 Å, the other three proteins had modest increases in RMSD, all less than 0.8 Å. The all-versus-all RMSD heat maps additionally illustrate the variation in RMSD between temperatures. For UVF, EbUs, and UbEs, all the pairwise RMSDs between simulations at 25 and 100°C were not notably different than the pairwise RMSDs within simulations at the same temperature. Therefore, the conformational space sampled at 25 and 100°C was similar for UVF and the chimeras. EnHD, on the other hand, was very stable at 25°C, adopting fewer varied native conformations than the other three proteins, but it began exploring the denatured state at 100°C.
Figure 2

Core Cα RMSD. Cα RMSD of the helical core (residues 10–54) at 25 and 100°C is shown. (a) RMSD from one representative simulation of EnHD, UVF, EbUs, and UbEs at 25°C (light) and 100°C (dark). (b) All-versus-all core Cα RMSD matrices comparing structures at each time point pairwise across five simulations at both 25 and 100°C. (c) Average core Cα RMSD of the five simulations of at 25 and 100°C with error bars indicating SD, n = 5. (d) Average pairwise core Cα RMSD for all pairs of structures in each simulation, averaged over all runs, n = 5. For larger matrix images and RSMD versus time plots for all simulations, see Fig. S1. To see this figure in color, go online.

Core Cα RMSD. Cα RMSD of the helical core (residues 10–54) at 25 and 100°C is shown. (a) RMSD from one representative simulation of EnHD, UVF, EbUs, and UbEs at 25°C (light) and 100°C (dark). (b) All-versus-all core Cα RMSD matrices comparing structures at each time point pairwise across five simulations at both 25 and 100°C. (c) Average core Cα RMSD of the five simulations of at 25 and 100°C with error bars indicating SD, n = 5. (d) Average pairwise core Cα RMSD for all pairs of structures in each simulation, averaged over all runs, n = 5. For larger matrix images and RSMD versus time plots for all simulations, see Fig. S1. To see this figure in color, go online.

Native and non-native contacts

The average time two residues spent in contact over the five AA-MD simulations at both temperatures was plotted as a contact map to compare how dynamic each protein was (Fig. S2). The contacts were plotted above or below the diagonal based on whether they were present in the starting structure (native, below the diagonal) or gained during the simulation (non-native, above the diagonal). At 25°C, EnHD had few non-native contacts, with most occurring in the same regions of the map as the native contacts. This pattern indicates slight movements of the helices without large conformational changes. However, at 100°C, EnHD gained many non-native contacts that were smeared across the helices, and its native contacts had a lower percent time in contact compared to 25°C. Together, these observations indicate more drastic changes in conformation at 100°C than at 25°C, consistent with the early stages of unfolding. At 25 and 100°C, UVF had more non-native contacts than EnHD did at 25°C. However, the non-native contacts were present in the same regions of the contact map as the native contacts, indicating that UVF made more unique residue-residue contacts than EnHD did at 25°C, whereas it maintained its folded structure at both temperatures. The contact patterns observed for EbUs and UbEs were most similar to those of UVF, suggesting that all three proteins were highly dynamic at both temperatures, with residues coming into contact with the residues next to those they made contacts with in the starting structures.

Side-chain contacts

To investigate what types of interactions were giving rise to the heightened dynamics observed in UVF and the two chimeric proteins in the AA-MD simulations, we investigated what types of interactions were made between side chains in different regions of the protein. Side-chain contacts were analyzed by type of contact (hydrogen bond, hydrophobic interaction, or other) and by the region to which both of the residues in the contact belonged (buried, surface, or buried ↔ surface) (Fig. 3).
Figure 3

Side-chain contacts by region and contact type. Side-chain contacts are categorized as between residues that are (a–d) both buried (residues: 8, 12, 16, 19, 20, 26, 31, 34, 35, 38, 40, 44, 45, 48, 49, 52), (e–h) both surface (residues: 3–7,9-11, 13–15, 17, 18, 21–25, 27–30, 32, 33, 36, 37, 39, 41–43, 46, 47, 50, 51, 53–56), or (i–l) one buried and one surface. Total side-chain contacts (m–p) are also displayed. Contacts are further typed as (from left to right) hydrogen bonds, hydrophobic contacts, other interactions, and the sum of those three groups. Unique residue side-chain pairs in contact >10% of the simulation time are plotted for (q) buried and (r) surface residues as well. Number of contacts was averaged across five simulations at 25°C (light) and 100°C (dark), and error bars indicate SD. To see this figure in color, go online.

Side-chain contacts by region and contact type. Side-chain contacts are categorized as between residues that are (a–d) both buried (residues: 8, 12, 16, 19, 20, 26, 31, 34, 35, 38, 40, 44, 45, 48, 49, 52), (e–h) both surface (residues: 3–7,9-11, 13–15, 17, 18, 21–25, 27–30, 32, 33, 36, 37, 39, 41–43, 46, 47, 50, 51, 53–56), or (i–l) one buried and one surface. Total side-chain contacts (m–p) are also displayed. Contacts are further typed as (from left to right) hydrogen bonds, hydrophobic contacts, other interactions, and the sum of those three groups. Unique residue side-chain pairs in contact >10% of the simulation time are plotted for (q) buried and (r) surface residues as well. Number of contacts was averaged across five simulations at 25°C (light) and 100°C (dark), and error bars indicate SD. To see this figure in color, go online. EnHD lost more total side-chain contacts than UVF, EbUs, and UbEs at 100°C relative to 25°C (Fig. 3 p). For EnHD, 66 of the 117 total contacts lost at 100°C relative to 25°C were buried hydrophobic interactions (Fig. 3, b and p), which is consistent with the early events of unfolding. The other three proteins lost closer to 30 hydrophobic interactions when the temperature was raised and only lost 11–46 contacts overall. UVF and EbUs both had more surface hydrogen bonds than EnHD and UbEs (Fig. 3 e), and both gained an average of about one hydrogen bond when the temperature was raised. On the other hand, EnHD and EbUs both had three to four hydrogen bonds between surface and buried residues, whereas UVF and EbUs had none (Fig. 3 i). UVF and UbEs had fewer buried “other” contacts than EnHD or EbUs (Fig. 3 c). This type of contact includes interactions between charged groups that do not have the proper geometry of a hydrogen bond as well as interactions between polar and hydrophobic groups. The number of unique contacts that a protein makes is related to how dynamic it is. We totaled the number of unique buried and surface residue pairs whose side chains were in contact for during at least 10% of the simulation time to quantify the dynamics of the hydrophobic core and hydrophilic surface in each of the four proteins (Fig. 3, q and r). EnHD was the most rigid, with 49 unique buried contacts at 25°C, whereas the other proteins had 53–59. Again, the number of unique buried contacts increased by 13 at 100°C relative to 25°C, as EnHD began denaturing, whereas there was an increase of only one to four contacts for UVF and the two chimeric proteins. EnHD and UbEs both have two more surface residues than UVF and EbUs (Fig. 1 c), in part accounting for their higher number of unique surface contacts. However, both chimeras have five to six more unique surface contacts at 25°C than the protein from which their surface was derived.

EnHD crystal versus NMR structure

EnHD was more stable than UVF and both chimeras, but its simulation starting structure was a crystal structure, whereas UVF’s was an NMR structure, and the two chimeras’ structures were homology models. To assess the contribution of structure determination method to protein stability as observed in our simulations, we compared simulations of EnHD beginning from the crystal (PDB: 1enh) or NMR (PDB: 2jwt) structure. 2jwt was as stable as 1enh based on core Cα RMSD to the starting structure as well as pairwise across simulations (Figs. 2 and S3, a–c). 2jwt and 1enh satisfied the same number of NOEs at both temperatures (Tables 1 and S1). Like 1enh, 2jwt began making non-native contacts in regions outside of the three-helix bundle topology in simulations at 100°C but maintained contacts within the topology at 25°C (Figs. S2 and S3 d). 2jwt made slightly fewer hydrogen bonds and “other” contacts than 1enh at 25°C, but their contacts were identical at 100°C (Figs. 3 and S3 e). 2jwt made more unique buried contacts at 25°C than 1enh, but not as many as UVF, and the same number of unique surface contacts (Figs. 3, r and q and S3 f).

Electrostatic interactions

EnHD and UVF have about the same number of charged residues (Arg, Lys, Glu; 19 and 21, respectively; neither protein has an Asp), but UVF has a more equal balance between positive and negative residues (EnHD net charge +7, UVF −1, Fig. 1 c). We investigated how this change in charge distribution affected the contacts that could be formed as well as the energetic contribution of electrostatic interactions. All five Glu residues in EnHD formed a salt bridge with an Arg residue 100% of the time in AA-MD at 25°C (Table 2). UVF, on the other hand, had only two contacts present 100% of the simulation time at 25°C. UVF made eight salt bridges during ≥80% of the AA-MD simulation time, involving 12 of 23 distinct ionizable residues, whereas EnHD made five salt bridges involving 7 of 19 ionizable residues, two of which were in the buried position. For the residues making salt bridges, they were equally efficient in UVF and EnHD (0.67 vs. 0.71 salt bridges/residue, respectively), but UVF involved a higher proportion of its ionizable residues in salt bridges than EnHD (52 vs. 37%).
Table 2

Side-Chain Salt Bridges between Ionizable Residues

Contact PairFraction Time in Contacta
EnHDUbEs
Glu28-Arg31Leub1.000.00
Glu19Phe-Arg301.000.00
Arg31Leu-Glu421.000.00
Arg15-Glu19Phe1.000.00
Arg15-Glu371.000.06
Arg18-Glu220.640.90
Glu28-Arg530.000.95

UVFEbUs

Glu51-Arg550.960.90
His32-Glu420.921.00
Glu14-Lys170.910.99
Lys15-Glu180.880.92
Glu18-Arg220.860.87
Lys6-Glu131.000.59
Lys6-Glu101.000.28
Glu13-Lys170.800.57
Glu18-Lys210.730.89
Glu43-Arg460.570.81
Arg46-Glu500.470.88
His23-Glu300.180.93
Arg25-Glu530.080.98
Phe19Glu-His230.000.96
Phe19Glu-Arg370.000.97
Phe19Glu-Arg220.000.99

For contacts present at least 80% of the time in at least one of the two proteins in five simulations at 25°C.

Mutation notation gives EnHD or UVF residue first and chimera residue second.

Side-Chain Salt Bridges between Ionizable Residues For contacts present at least 80% of the time in at least one of the two proteins in five simulations at 25°C. Mutation notation gives EnHD or UVF residue first and chimera residue second. The EbUs chimera made 13 salt bridges ≥80% of the simulation time at 25°C, but only one 100% of the time. Five of these salt bridges were between residue pairs that also made salt bridges ≥80% of the time in UVF. EbUs was as efficient as UVF and EnHD at using residues for multiple salt bridges (13 salt bridges among 19 ionizable residues, 0.71 salt bridges/residue), but it had 19 of 22 ionizable residues making salt bridges (86%). UbEs, on the other hand, formed only two salt bridges ≥80% of the simulation time, both of which were observed <80% of the time in EnHD. The TKSA-MC analysis allows the calculation of the charge-charge interaction free energy (ΔG) of each ionizable residue. A residue with a positive ΔG tends to destabilize the protein native state, whereas a negative ΔG indicates that the residue is favorable to stabilize the protein. Fig. 4 shows TKSA-MC analysis for the EnHD, UVF, and the two chimeric proteins. In the EnHD, almost 50% of the ionizable residues were not favorable to the protein stability. However, UVF had a greater number of ionizable residues, and almost all of them were electrostatically favorable. This analysis corroborates with the increase of the thermostability of UVF when compared with EnHD. Regarding the two chimeras, the free energies of UbEs’s ionizable residues were similar to those of EnHD’s, and EbUs’s ΔG distribution was more similar to UVF. The sum of each residue’s ΔG contribution is shown in Fig. 4 e for all four proteins, and the total ΔG was lowest for UVF, followed by EbUs, EnHD, and UbEs.
Figure 4

Electrostatic free energy. (a–d) Charge-charge interaction energy (ΔG) calculated by the TKSA-MC model for each ionizable residue. Negative values indicate that the residue is electrostatically favorable, whereas positive values indicate that the residue is unfavorable to the electrostatic free energy. (e) Total ΔG energy for each protein. To see this figure in color, go online.

Electrostatic free energy. (a–d) Charge-charge interaction energy (ΔG) calculated by the TKSA-MC model for each ionizable residue. Negative values indicate that the residue is electrostatically favorable, whereas positive values indicate that the residue is unfavorable to the electrostatic free energy. (e) Total ΔG energy for each protein. To see this figure in color, go online.

Thermodynamic analysis

SBMs define the energy minimum of the Hamiltonian function as the protein’s native conformation; thus, it is a topology-based model with fast sampling over a span of temperatures. SBM simulations were performed for the four proteins using two levels of coarse graining, Cα-SBM or HA-SBM, to quantify the thermodynamics of the folding process (Figs. 5 and 6). Fig. 5 compares the specific heat as a function of the simulated temperature range (c(T)) for the four proteins in the Cα-SBM (Fig. 5 a) and HA-SBM (Fig. 5 b). Only the HA-SBM simulations were able to show differences in the folding or melting temperature (T) of the studied proteins, given by the temperature of the peak in the cv(T) curve. This discrepancy is a consequence of the side-chain interactions included in the HA-SBM, which are not included in the Cα-SBM. UVF and UbEs had a higher T than EnHD and EbUs (Fig. 5 b).
Figure 5

Specific heat profiles. Specific heat as a function of temperature (c(T)) for the SBM simulations in (a) Cα-SBM and (b) HA-SBM is shown. Structures of the folded native state (left) and one unfolded conformation (right) are shown for each model in (a) and (b). Energy is in units of kT, and temperature is in GROMACS reduced units. To see this figure in color, go online.

Figure 6

Enthalpy, entropy, and free-energy profiles. (a) Enthalpy (H(T)), (b) entropy (S(T)), and (c) free-energy (F(T)) profiles as a function of temperature for the HA-SBM simulations. Free energy is given by the thermodynamic relationship, F = H − TS. Energy is in units of kT, and temperature is rescaled by the folding temperature (T) of each protein. To see this figure in color, go online.

Specific heat profiles. Specific heat as a function of temperature (c(T)) for the SBM simulations in (a) Cα-SBM and (b) HA-SBM is shown. Structures of the folded native state (left) and one unfolded conformation (right) are shown for each model in (a) and (b). Energy is in units of kT, and temperature is in GROMACS reduced units. To see this figure in color, go online. Enthalpy, entropy, and free-energy profiles. (a) Enthalpy (H(T)), (b) entropy (S(T)), and (c) free-energy (F(T)) profiles as a function of temperature for the HA-SBM simulations. Free energy is given by the thermodynamic relationship, F = H − TS. Energy is in units of kT, and temperature is rescaled by the folding temperature (T) of each protein. To see this figure in color, go online. Fig. 6 shows the enthalpy (H, Fig. 6 a), entropy (S, Fig. 6 b), and free-energy (F, Fig. 6 c) profiles as a function of temperature obtained by the heavy-atom SBM simulations. As in Fig. 5 b, the higher UVF and UbEs enthalpy and entropy profiles culminate in the lower free-energy profile (F = H − TS) over the simulated temperature range (Fig. 6 c). These profiles were used to calculate the relative changes in enthalpy and entropy upon folding (Table 3). UVF and UbEs both lost less entropy than EnHD when they folded (smaller ΔS), and UVF and EbUs both gained more enthalpy than EnHD when they folded (larger ΔH). EnHD was the least stable with the smallest ΔF.
Table 3

Relative Thermodynamic Parameters for Folding

ΔHΔSΔF
EnHD229248−31
UVF231229−40
EbUs238254−34
UbEs229225−41

Units are k; folded and unfolded states from WHAM were defined as 0.95 and 1.05 T, respectively.

Relative Thermodynamic Parameters for Folding Units are k; folded and unfolded states from WHAM were defined as 0.95 and 1.05 T, respectively.

Discussion

Consistent with experimental results, EnHD was stable in AA-MD simulations at 25°C and began unfolding at 100°C, whereas UVF was stable at both temperatures (Fig. 2). In coarse-grained SBMs, side-chain interactions, which were only included in the HA-SBM, were critical to reproducing the differences in melting temperatures observed for EnHD and UVF experimentally and in the AA-MD simulations (Fig. 5 b). The Cα-SBM is purely native-topology based, and because both proteins fold to nearly identical topologies, the level of theory was not detailed enough to reproduce relative differences in melting temperatures. UVF has an entirely hydrophobic core and entirely polar surface. When we spliced together the cores and surfaces of UVF and EnHD (Fig. 1 a), both resulting chimeric proteins were stable over 100 ns of AA-MD simulation (Figs. 2 and S2). The melting temperatures derived from the HA-SBM for both chimeras were between those of EnHD and UVF, with EbUs more similar to EnHD and UbEs more similar to UVF (Fig. 5 b), and both chimeras had a larger ΔF than EnHD (Table 3). UbEs had heightened thermostability at high temperature in both the AA-MD and HA-SBM simulations. However, for EbUs, the HA-SBM gave a melting temperature only slightly higher than EnHD’s, whereas the events of early unfolding were not observed by AA-MD at 100°C. It is possible that EbUs’s folding kinetics are not fast enough for unfolding to be observed on the 100-ns timescale, though those of its parent proteins are (t1/2 = 15 μs for EnHD and 9 μs for UVF (15, 17)).

Entropic stabilization by a dynamic hydrophobic core

The fully hydrophobic core of UVF contributed to the heightened stability of UbEs. Like UVF, UbEs was more dynamic than EnHD at room temperature and maintained these dynamics at 100°C, as evident in the higher core Cα RMSD relative to the starting structure (Fig. 2, a and c). The conformational spaces sampled by UVF and UbEs were very similar at both temperatures, as quantified by all-versus-all core Cα RMSD and contact maps (Figs. 2 b and S2). The heightened backbone dynamics could be tolerated without unfolding because of the molten, fully hydrophobic core, as measured by increased unique buried side-chain contact pairs relative to EnHD (Fig. 3 q). Replacing EnHD’s buried polar residues with hydrophobic ones in UVF and UbEs did not decrease the number of buried hydrophobic contacts, but it did remove eight unfavorable, buried “other” contacts of the 14 in EnHD (Fig. 3, b and c). Creating tightly packed hydrophobic cores is a known method of stabilizing proteins (43), but although the cores of UVF and UbEs are fully hydrophobic, the packing was loose and constantly rearranged. UbEs had the fewest surface hydrogen bonds and the fewest hydrogen bonds overall among all four proteins. Curiously, addition of surface hydrogen bonds is known to stabilize proteins (4, 43, 44), yet UbEs achieved heightened thermostability relative to EnHD with one fewer surface hydrogen bond and five fewer total side-chain hydrogen bonds (Fig. 3, e and m). UbEs had a highly dynamic surface, similar to what was observed for its core, resulting in five more unique surface contacts than UVF and as many as EnHD despite having two fewer surface residues (Fig. 3 r). EnHD had five salt bridges present 100% of the simulation time (Table 2). UbEs had only two present ≥80% of the simulation time, one of which was not observed in EnHD. The stabilization of UbEs was not due to increased salt bridges, on the surface or otherwise. The molten, fully hydrophobic core allowed both UVF and UbEs to be more dynamic at 25°C and maintain these dynamics without unfolding at 100°C. These heightened dynamics are observed in the HA-SBM thermodynamic analysis. The temperature-dependent enthalpy and entropy profiles of UbEs are similar to those of UVF and higher than those of EnHD (Fig. 6, a and b). Higher enthalpy indicates the system accesses higher energy states due to heat being absorbed, and entropy is a measure of the unavailability of a system’s energy to do work. Thus, the absorbed heat of UVF and UbEs increased the proteins’ entropy, which is reflected in the increased sampling of conformational space observed in the AA-MD simulations. UVF and UbEs were stabilized entropically by their highly heterogeneous native states. UVF and UbEs lost less entropy upon folding (ΔS = 229 and 225 kT, respectively; Table 3) than EnHD and EbUs (ΔS = 248 and 255 kT).

Enthalpic stabilization by surface electrostatic interactions

The fully polar surface of UVF contributed to the heightened stability of EbUs. UVF and EbUs both had more surface hydrogen bonds and overall more contacts on the surface than EnHD (Fig. 3, e and h). UbEs’s surface contacts were especially dynamic; it had the most unique surface contact pairs as well as more and shorter-lived side-chain salt bridges (Fig. 3 r; Table 2). Both EnHD and EbUs have hydrogen bonds between buried and surface residues as well as overall more “other” contacts than UVF (Fig. 3, i and o). We predicted that these interactions could be destabilizing to EnHD; however, EbUs was more stable than EnHD, so they could not have been too unfavorable. EbUs was also more dynamic that EnHD, with higher core and all-versus-all Cα RMSDs and more unique buried contact pairs (Figs. 2 and 3 q). These heightened dynamics could not have been due to the fully hydrophobic core as in UVF and UbEs, yet they seem to be similarly stabilizing. The TKSA-MC model results showed similar electrostatic behavior for UVF and EbUs (Fig. 4, b and d), for which nearly all ionizable residues contribute favorably to their stability. The total ΔG for these two proteins was more favorable than for EnHD and UbEs (Fig. 4 e). EnHD and UbEs both contained several ionizable residues that were not favorable to their stability, including Lys17, Arg24, Arg30, Lys53, and Lys55 (Fig. 4, a and c). This fact strengthens the argument that UVF’s polar surface contributed to the stability of EbUs. The majority of ionizable residues in UVF and EbUs were involved in salt bridges in AA-MD simulations, unlike in EnHD (52, 86, and 37%, respectively) (Table 2). UVF and EbUs also had overall more salt bridges (Table 2) and more stability imparted by electrostatic interactions (Fig. 4) than EnHD. UVF and EbUs gained stability from using their ionizable residues more efficiently, making more salt bridges than EnHD and including more ionizable residues in salt bridges. The thermodynamic quantities calculated for EbUs with the HA-SBM were between those of UVF and EnHD (Fig. 6), although EbUs presented heightened stability in the AA-MD simulations (Fig. 1) and low charge-charge electrostatic free energy calculated by the TKSA-MC model (Fig. 4). UVF and EbUs had a greater change in enthalpy upon folding (ΔH = 231 and 238 kT, respectively) than EnHD or UbEs (ΔH = 229 kT for both), which may be due to their increased number of salt bridges and fraction of ionizable residues with satisfied electrostatic interactions (Table 3). The HA-SBM simulations employed here do not account for electrostatics; they are a purely topology-based Hamiltonian. As discussed previously, we observed that side-chain salt bridges and electrostatic interactions contributed to EbUs’s thermostability. Thus, topology-based side-chain interactions seem to play an important role in protein stability at high temperatures for UVF and UbEs, whereas EbUs was more affected by electrostatics than topology-based contacts interactions. It is also possible that EbUs is less stable than UVF and UbEs, which would indicate that the fully hydrophobic core is more stabilizing than the electrostatic surface.

Implications for protein engineering

Protein engineers typically design proteins by placing hydrophobic residues at buried positions and polar residues on the surface of proteins. In protein-design algorithms, tightly packed hydrophobic cores are rewarded in scoring functions (12, 45). Proteins evolve naturally following the same, general scheme, but it is unusual, both in nature and in protein design, to end up with proteins that have exclusively hydrophobic residues at buried positions and polar residues on the surface, as is the case with UVF (46). Our AA-MD simulations showed that the fully hydrophobic core in both UVF and UbEs resulted in a more dynamic protein at 25°C that maintained the heightened dynamics without unfolding at 100°C. The dynamics of core residues observed previously in AA-MD simulations for EnHD and other proteins agree quantitatively with order parameters measured by NMR (18, 47). Similarly, hydrogen bonds and salt bridges are intentionally placed on the surface of proteins by rational protein designers, and scoring functions reward such interactions (13, 45). Proteins that evolve naturally in thermophilic organisms tend to have more surface salt bridges than those in mesophiles (4, 44). Salt bridges in UVF and EbUs were highly dynamic and more numerous compared to those in EnHD; indeed, few were present for the full simulation time. Designed proteins typically are more stable than naturally evolved proteins (8, 14, 48). It remains to be seen whether designed proteins are typically more dynamic or whether there is a relationship between heightened dynamics and stability. Thermodynamically, a highly dynamic, hydrophobic core that leads to a more heterogeneous native state should manifest as a smaller loss of entropy upon folding. The role of entropy in stabilizing thermophilic proteins has been investigated previously. For example, RNase H∗ from Thermus thermophilus is stabilized at higher temperatures than its Escherichia coli counterpart because of a smaller TΔS, likely resulting from a reduced ΔC and smaller hydrophobic effect (49). The cold shock protein from Thermotoga maritima unfolds slower than the Bacillus subtilis cold shock protein, indicating that it is stabilized entropically, most likely because of increased degrees of freedom in the native state (50). Designing proteins with highly dynamic, hydrophobic cores seems consistently to produce thermostable proteins. Thermostable proteins are popular starting points for directed evolution and designing proteins to perform a desired function. Thermostable proteins can withstand mutations that are required for adopting a new function but may be destabilizing to the overall structure. It remains to be seen whether highly dynamic proteins such as UVF can serve as modifiable scaffolds. It is possible that this stabilization strategy is not compatible with function, especially if the protein might need to adopt a specific conformation to carry out its function.

Conclusions

Here, we determine a structural, dynamic, and thermodynamic explanation for the thermostability of a de novo-designed variant of EnHD using AA-MD simulations, TKSA Monte Carlo electrostatic calculations, and heavy-atom SBMs. Our models suggest that UVF is more dynamic because of its fully hydrophobic core, which results in a smaller loss of entropy upon folding. It is further stabilized by its favorable surface electrostatics, resulting in a larger gain of enthalpy upon folding. Based on our models of chimeric proteins that combine the cores and surfaces of EnHD and UVF, both the dynamics of UVF’s fully hydrophobic core and the increased electrostatic interactions in its fully polar surface independently impart thermostability to the chimeras. All three computational models were necessary to identify and quantify the contribution of different structural patterns to the thermostability of the studied proteins, and together they provide, to our knowledge, a novel workflow for dissecting the contributions to stability in other thermostable proteins. Our results suggest a mechanism of protein thermostabilization that has not been thoroughly explored in the protein-design community, but it remains to be seen whether it is compatible with protein function.

Author Contributions

M.E.M., C.N., and J.T.Y. designed the chimeric proteins, and they designed and conducted the molecular dynamics simulations. R.J.O. and G.G.S. designed and conducted the electrostatics calculations and structure-based model simulations. All authors analyzed the data and wrote the manuscript.
  44 in total

1.  Electrostatics significantly affect the stability of designed homeodomain variants.

Authors:  Shannon A Marshall; Chantal S Morgan; Stephen L Mayo
Journal:  J Mol Biol       Date:  2002-02-08       Impact factor: 5.469

Review 2.  Coordinate and time-dependent diffusion dynamics in protein folding.

Authors:  Ronaldo J Oliveira; Paul C Whitford; Jorge Chahine; Vitor B P Leite; Jin Wang
Journal:  Methods       Date:  2010-05-11       Impact factor: 3.608

3.  GROMACS 4:  Algorithms for Highly Efficient, Load-Balanced, and Scalable Molecular Simulation.

Authors:  Berk Hess; Carsten Kutzner; David van der Spoel; Erik Lindahl
Journal:  J Chem Theory Comput       Date:  2008-03       Impact factor: 6.006

4.  Comparison of multiple crystal structures with NMR data for engrailed homeodomain.

Authors:  Tomasz L Religa
Journal:  J Biomol NMR       Date:  2008-02-15       Impact factor: 2.835

5.  A large scale test of computational protein design: folding and stability of nine completely redesigned globular proteins.

Authors:  Gautam Dantas; Brian Kuhlman; David Callender; Michelle Wong; David Baker
Journal:  J Mol Biol       Date:  2003-09-12       Impact factor: 5.469

6.  Topography of funneled landscapes determines the thermodynamics and kinetics of protein folding.

Authors:  Jin Wang; Ronaldo J Oliveira; Xiakun Chu; Paul C Whitford; Jorge Chahine; Wei Han; Erkang Wang; José N Onuchic; Vitor B P Leite
Journal:  Proc Natl Acad Sci U S A       Date:  2012-09-10       Impact factor: 11.205

7.  VMD: visual molecular dynamics.

Authors:  W Humphrey; A Dalke; K Schulten
Journal:  J Mol Graph       Date:  1996-02

8.  TKSA-MC: A web server for rational mutation through the optimization of protein charge interactions.

Authors:  Vinícius G Contessoto; Vinícius M de Oliveira; Bruno R Fernandes; Gabriel G Slade; Vitor B P Leite
Journal:  Proteins       Date:  2018-09-28

9.  Thermal versus guanidine-induced unfolding of ubiquitin. An analysis in terms of the contributions from charge-charge interactions to protein stability.

Authors:  B Ibarra-Molero; V V Loladze; G I Makhatadze; J M Sanchez-Ruiz
Journal:  Biochemistry       Date:  1999-06-22       Impact factor: 3.162

10.  All-atom empirical potential for molecular modeling and dynamics studies of proteins.

Authors:  A D MacKerell; D Bashford; M Bellott; R L Dunbrack; J D Evanseck; M J Field; S Fischer; J Gao; H Guo; S Ha; D Joseph-McCarthy; L Kuchnir; K Kuczera; F T Lau; C Mattos; S Michnick; T Ngo; D T Nguyen; B Prodhom; W E Reiher; B Roux; M Schlenkrich; J C Smith; R Stote; J Straub; M Watanabe; J Wiórkiewicz-Kuczera; D Yin; M Karplus
Journal:  J Phys Chem B       Date:  1998-04-30       Impact factor: 2.991

View more
  8 in total

1.  The Role of Electrostatics and Folding Kinetics on the Thermostability of Homologous Cold Shock Proteins.

Authors:  Paulo Henrique Borges Ferreira; Frederico Campos Freitas; Michelle E McCully; Gabriel Gouvêa Slade; Ronaldo Junio de Oliveira
Journal:  J Chem Inf Model       Date:  2020-01-17       Impact factor: 4.956

Review 2.  Thermostability engineering of industrial enzymes through structure modification.

Authors:  Nima Ghahremani Nezhad; Raja Noor Zaliha Raja Abd Rahman; Yahaya M Normi; Siti Nurbaya Oslan; Fairolniza Mohd Shariff; Thean Chor Leow
Journal:  Appl Microbiol Biotechnol       Date:  2022-07-09       Impact factor: 5.560

Review 3.  The stability and dynamics of computationally designed proteins.

Authors:  Natali A Gonzalez; Brigitte A Li; Michelle E McCully
Journal:  Protein Eng Des Sel       Date:  2022-02-17       Impact factor: 1.952

4.  Electrostatic interaction optimization improves catalytic rates and thermotolerance on xylanases.

Authors:  Vinícius de Godoi Contessoto; Felipe Cardoso Ramos; Ricardo Rodrigues de Melo; Vinícius Martins de Oliveira; Josiane Aniele Scarpassa; Amanda Silva de Sousa; Letıcia Maria Zanphorlin; Gabriel Gouvea Slade; Vitor Barbanti Pereira Leite; Roberto Ruller
Journal:  Biophys J       Date:  2021-04-05       Impact factor: 3.699

5.  Thermostabilization mechanisms in thermophilic versus mesophilic three-helix bundle proteins.

Authors:  Catrina Nguyen; Lauren M Yearwood; Michelle E McCully
Journal:  J Comput Chem       Date:  2021-11-05       Impact factor: 3.672

6.  Structural basis for the hyperthermostability of an archaeal enzyme induced by succinimide formation.

Authors:  Aparna Vilas Dongre; Sudip Das; Asutosh Bellur; Sanjeev Kumar; Anusha Chandrashekarmath; Tarak Karmakar; Padmanabhan Balaram; Sundaram Balasubramanian; Hemalatha Balaram
Journal:  Biophys J       Date:  2021-07-22       Impact factor: 3.699

7.  Molecular dynamics simulations suggest stabilizing mutations in a de novo designed α/β protein.

Authors:  Matthew Gill; Michelle E McCully
Journal:  Protein Eng Des Sel       Date:  2019-12-31       Impact factor: 1.650

Review 8.  Recent Progress Using De Novo Design to Study Protein Structure, Design and Binding Interactions.

Authors:  Juan Ferrando; Lee A Solomon
Journal:  Life (Basel)       Date:  2021-03-10
  8 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.