Differential scanning calorimetry, circular dichroism spectroscopy, nuclear magnetic resonance spectroscopy, and numerical simulations were used to study the thermostability of the N-terminal RNA-binding domain (RBD) of the SARS-CoV nucleocapsid protein. The transition temperature of the RBD in a mixing buffer, composed of glycine, sodium acetate, and sodium phosphate with 100 mM sodium chloride, at pH 6.8, determined by differential scanning calorimetry and circular dichroism, is 48.74 degrees C. Experimental results showed that the thermal-induced unfolding-folding transition of the RBD follows a two-state model with a reversibility >90%. Using a simple Gō-like model and Langevin dynamics we have shown that, in agreement with our experiments, the folding of the RBD is two-state. Theoretical estimates of thermodynamic quantities are in reasonable agreement with the experiments. Folding and thermal unfolding pathways of the RBD also were experimentally and numerically studied in detail. It was shown that the strand beta(1) from the N-terminal folds last and unfolds first, while the remaining beta-strands fold/unfold cooperatively.
Differential scanning calorimetry, circular dichroism spectroscopy, nuclear magnetic resonance spectroscopy, and numerical simulations were used to study the thermostability of the N-terminal RNA-binding domain (RBD) of the SARS-CoV nucleocapsid protein. The transition temperature of the RBD in a mixing buffer, composed of glycine, sodium acetate, and sodium phosphate with 100 mM sodium chloride, at pH 6.8, determined by differential scanning calorimetry and circular dichroism, is 48.74 degrees C. Experimental results showed that the thermal-induced unfolding-folding transition of the RBD follows a two-state model with a reversibility >90%. Using a simple Gō-like model and Langevin dynamics we have shown that, in agreement with our experiments, the folding of the RBD is two-state. Theoretical estimates of thermodynamic quantities are in reasonable agreement with the experiments. Folding and thermal unfolding pathways of the RBD also were experimentally and numerically studied in detail. It was shown that the strand beta(1) from the N-terminal folds last and unfolds first, while the remaining beta-strands fold/unfold cooperatively.
The SARS-CoV nucleocapsid protein (NP) is a major virion structural protein in severe acute respiratory syndrome (SARS) (1,2). It has 422 amino-acid residues, a molecular weight of 46 kDa, and shares 20–30% homology with the NP of other coronaviruses (1,2). Its major function is to assemble the viral RNA genome of coronavirus to form the helical ribonucleoprotein core. In vitro, the NPs, in the absence of genomic nucleic acid, exist as dimers at a low concentration, and may form higher oligomers at higher concentrations (3). The NP does not have any cysteine residues, thus it contains no disulfide bond, and may be less stable than other structural proteins of SARS-CoV. Experiments showed that it is most stable at pH 9.0 (3). Thermal and chemical denaturant-induced denaturation analysis (3) showed that the unfolding and refolding of oligomeric NP can be described by a two-state model. Chemical unfolding is found to be reversible (3,4), while both irreversibility (3) and reversibility (4) of the thermal-induced unfolding in different buffers have been reported. Recently, Wang et al. (4) reported that the NP in phosphate-buffered saline buffer starts to unfold at 35°C and is completely denatured at 55°C. The stability of the NP is critical for the function of the SARS virus, which is completely noninfectious at ∼56°C (5).Quite recently, Chang et al. (6) found that the NP consists of two independent structural domains—the N-terminal RNA-binding domain (RBD) (residues 45–181) and the C-terminal dimerization domain (DD) (residues 248–365), surrounded by flexible linkers. Fig. 1 shows a model for the overall structure of SARS-CoV NP (6). The two structural domains carry out two distinct functions. The N-terminal domain is able to bind RNA, while the C-terminal domain acts as a dimerization domain. Hsieh et al. (7) showed that the DD could also bind to an RNA packing signal, and Chen et al. (8) proposed an octamer model (note that the structure of the octamer can be accessed at the Protein Data Bank, PDB code: 2CJR) to demonstrate a possible framework for helical packing of viral RNA. Whereas the two structural domains do not interact with each other, they could act in concert to carry out biological functions with the aid of the long flexible linker between the two domains (6). In this context, two structural domains are logical candidates for further studies. We then investigated the thermostability of the two domains separately, instead of using the full-length of the NP. In this article, we focus on studying the RBD, and the study of the DD domain will be reported elsewhere.
Figure 1
A model of the overall structure of SARS-CoV NP. The RBD and DD are structural domains, and lines represents disordered segments.
The ability of the RBD to bind RNA is closely related to its structure. Fig. 2 shows the cartoon representation of the structure of the RBD (PDB code: 1SSK) (10). An additional 21 residues have been added to the N-terminal in the structure determination, so there is a total of 158 residues in this PDB structure, in which residues 22–158 correspond to residues 45–181 of the SARS-CoV NP. Nuclear magnetic resonance (NMR) studies indicate that the RBD has a five-stranded β-sheet, and all residues are solvent-exposed (6), which implies that it is essentially a loosely packed protein. We studied the thermostability of the RBD by differential scanning calorimetry (DSC) (11,12), circular dichroism (CD) spectroscopy, and NMR spectroscopy. Our experiments show that the transition midpoint of thermal denaturation of the RBD at pH 6.8 is 48.74°C, and the transition curve of the RBD can be fitted nicely with a two-state model. Further, NMR spectroscopy experiments conducted under equilibrium condition suggests that the N-terminus folds last at low temperature and unfolds first at high temperature. The remaining four β-sheets in the (un)folding process are highly cooperative.
Figure 2
(a) Cartoon representation of the structure of the RBD. The Protein Data Bank accession code is 1SSK. There are five β-strands: β1(aa 56–59, numbered according to the full length of NP), β2(aa 108–114), β3(aa 85–91), β4(aa 131–134), and β5(aa 172–173). In the native state, these strands have 30, 49, 48, 24, and 11 contacts, respectively. (b) A schematic plot for four pairs of strands P13, P15, P23, and P24. Here P13, for example, means the pair between strands 1 and 3. In the native state, there exists only the interactions between strands of these pairs. The numbers of native contacts of these pairs are equal to 10, 6, 22, and 13, respectively.
We have also performed numerical studies of the folding thermodynamics and kinetics of RBD using the simple Gō modeling approach (13,14). Using the Langevin dynamics and the reweighting technique (15), relevant thermodynamic quantities were computed and compared with the experiments. From the double-minimum structure of the free energy, plotted as a function of the number of native contacts, we conclude that the RBD is indeed a two-state folder. The reversibility of thermal folding of RBD was theoretically investigated by considering folding and unfolding pathways of five β-strands. The results are substantially congruent with the experimental results, while the folding of β-strands is less cooperative.The rest of the article is organized as follows. In Materials and Methods, we briefly present sample preparation, experimental procedures, and simulation details. The experimental and simulation results are summarized in the Results and, finally, our results are addressed in the Discussion.
Materials and Methods
Sample preparation
The SARS-CoVTW1 strain cDNA sequencing clones were provided by Dr. P.-J. Chen of National Taiwan University Hospital (16). The RBD of the SARS CoV NP (residues 45–181) was cloned into the pET6H vector as described in Chang et al. (17). The fragment was expressed in Escherichia coli BL21(DE3) cells overnight at 37°C in Luria-Broth media without inducing agents. For purification, 10 amino-acid residues (MHHHHHHAMG) have been added to the N-terminal of RBD as His-tag, such that the final RBD sample has a total of 147 amino-acid residues, in which residues 11–147 correspond to residues 45–181 of the SARS-CoV NP. The protein was purified with Ni-NTA affinity column (Qiagen, Valencia, CA) in buffer (50 mM sodium phosphate, 150 mM NaCl, pH 7.4) containing 7 M urea. The protein was then allowed to refold by gradually lowering the denaturant concentration through dialysis in liquid chromatography buffer (50 mM sodium phosphate, 150 mM NaCl, 1 mM EDTA, 0.01% NaN3, pH 7.4). Renatured protein was loaded onto an AKTA-EXPLORER fast performance liquid chromatography system equipped with a HiLoad 16/60 Superdex 75 column (Amersham Pharmacia Biotech, Uppsala, Sweden). Complete Protease Inhibitor cocktail (Roche, Mannheim, Germany) was added to the purified protein. Protein concentration was determined with the Bio-Rad Protein Assay kit as per instructions from the manufacturer (Bio-Rad, Hercules, CA). The correct molecular weights of the expressed proteins were confirmed by mass spectroscopy.
Differential scanning calorimetry
The conformational stability of the RBD is investigated by analyzing the unfolding transition induced by elevating temperature. The experiments were carried out with a VP-DSC calorimeter from MicroCal (Northampton, MA) at a scan rate of 1.0°C/min. The samples were degassed for 15 min at room temperature before the calorimetric experiments. The thermograms were measured in the temperature range 10–80°C. The sample was in a mixing buffer, which was composed of glycine, sodium acetate, and sodium phosphate with 100 mM sodium chloride, at pH 6.8. Fitting of the thermal unfolding/refolding transitions were carried out by the computer software MicroCal Origin Ver. 7.0.
Circular dichroism
The CD spectra were measured on a Jasco (Tokyo, Japan) J-715 spectropolarimeter at different temperatures with wavelength scanning taken from 250 to 200 nm. All experiments were carried out at 2-nm bandwidth in a 1-mm quartz square cuvette, thermostated to ±0.1°C. Protein concentration was ∼20 μM and was determined by the BCA Protein Assay Kit (Pierce, Rockford, IL). Protein was dissolved in 30 mM mixing buffer containing glycine, sodium acetate, and sodium phosphate with 100 mM sodium chloride at pH 7.0. Temperature control was achieved by using a circulating water bath system and the equilibrium time was 10 min for each temperature point.
NMR spectroscopy
13C, 15N-labeled samples were prepared as previously described. Samples were exchanged to NMR buffer (10 mM sodium phosphate, pH 6.0, 50 mM NaCl, 1 mM EDTA, 0.01% NaN3, 1 mM DSS, 10% D2O) with an Amicon Ultra-15 concentrator (Millipore, Billerica, MA). Final sample concentration was ∼1.5 mM. Variable temperature 15N-edited heteronuclear single-quantum coherence (HSQC) spectra were acquired on a Bruker AV600 spectrometer (Bruker, Rheinstetten, Germany) equipped with a normal temperature probe. Spectra were acquired using spectral widths of 12 ppm × 34 ppm and 1024 × 128 complex points for the 1H and 15N dimensions, respectively. The sample was allowed to equilibrate at the target temperature for at least 30 min before experiments were conducted. These spectra were processed with iNMR (Nucleomatica, Molfetta, Italy), which was also used to extract peak intensities. Graphs were generated with Microsoft Excel (Microsoft, Redmond, WA). Projection-reconstruction three-dimensional HNCO spectra were acquired on a Bruker AV500 spectrometer equipped with a cryoprobe. Spectra were acquired using spectral widths of 12 ppm × 34 ppm × 12 ppm and 1024 × 64 × 64 complex points on the 1H, 15N, and 13C dimensions, respectively. Spectra were processed with TopSpin 2.1 (Bruker) and the reconstructed spectra were analyzed with CARA (18). Digital resolution on the 13C dimension after processing was 0.046 ppm. Secondary shifts were calculated from the random coil shifts tabulated by Schwarzinger et al. (19,20), and corrected for sequence composition (19,20). The secondary shifts were graphed with GraphPad Prism (GraphPad Software, La Jolla, CA).
Simulation details
To simulate the thermal folding and unfolding of the RBD, we utilize the Gō-like model (14), in which amino-acid residues are represented by point particles located at positions of C atoms. This model is defined through the experimentally determined native structure, and it captures essential aspects of the important role played by the native structure (14,21). In our simulations, the lowest energy native structure was taken from the PDB (PDB ID: 1SSK), and only the positions of 158 C-carbons are considered. The energy of the Gō-like model has the following form (14):Here Δri = ri, i+1 – r0i, i+1, Δθi = θi – θ0i, Δϕi = ϕi – ϕ0i, and ri, i+1 is the distance between residues i and i + 1, θi is the bond angle between bonds (i − 1) and i, ϕi is the dihedral angle around the ith bond, and rij is the distance between the ith and jth residues. Subscripts 0, NC, and NNC refer to the native conformation, native contacts, and nonnative contacts, respectively. Residues i and j are in native contact if r0ij is less than a cutoff distance dc = 7.5 Å, where r0ij is the distance between the residues in the native conformation. The first harmonic term in Eq. 1 accounts for chain connectivity and the second term represents the bond-angle potential. The potential for the dihedral angle degrees of freedom is given by the third term in Eq. 1. The interaction energy between residues that are separated by at least three residues is given by 10-12 Lennard-Jones potential, given by the fourth term. A soft sphere repulsive potential, represented by the fifth term in Eq. 1, disfavors the formation of nonnative contacts. We choose Kr = 100 ɛH/Å2, K = 20ɛH/rad2, K(1) = ɛH, and K(3) = 0.5ɛH, where ɛH is the characteristic hydrogen-bond energy (14) and C = 4 Å.We assume that the dynamics of the polypeptide chain obeys the Langevin equation (22). To compute thermodynamic quantities, we performed simulations at the friction , where the folding is fast. Here τL = (ma2/ɛH)1/2 ≈ 3 ps, m is the typical mass of amino acids, and a is the distance between two neighboring residues. For this value of friction, the equations of motion (see (23–25) for more details) were integrated using the velocity form of the Verlet algorithm (26) with the time step Δt = 0.005τL. The equilibrium simulations have been carried out at T = 0.45, 0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, and 1.1 ɛH/kB. For each temperature, 15 independent trajectories of 1.2 × 106τL were generated to have good statistics. The first period of 1.2 × 105τL was spent on equilibration and it was discarded from the data collection. Histograms for the energy and native contacts have been collected. Then the reweighting technique (15) was used to obtain the thermodynamic quantities at arbitrary temperature.Since folding and unfolding pathways are kinetic in nature, they may depend on the friction. Thus, to mimic real experiments, kinetic simulations were carried out for the friction , which is believed to correspond to the viscosity of water (27). In this overdamped limit, we use the Euler method for integration and the time step Δt = 0.1τL. To probe folding pathways, for ith trajectory we introduce the progressive variable δi = t/τF, where τF is the folding time (28). One can then average the fraction of native contacts over many trajectories in a unique time window 0 ≤ δi ≤ 1, and monitor the folding sequence with the help of the progressive variable δ. The thermal unfolding pathways may be probed using a similar progressive variable, but instead of folding times we use unfolding ones, i.e., δi = t/τU, where τU values are unfolding times. We define τU as an average of first passage times to reach a conformation, which has no native contacts. Different trajectories start from the same native conformation but with different random number seeds.
Results
DSC experiments
A typical curve for the specific heat Cp of the RBD is shown in Fig. 3. The transition temperature at pH 6.8 in the presence of 100 mM NaCl is TM = 48.74°C. The reversibility of thermal denaturation was estimated by the ratio of the maximum value of Cpmax of subsequent scans and found to be >90%, suggesting that the thermal transition of the RBD is reversible. The sharp peak of the specific heat curve suggests that RBD is a two-state folder. Whereas the fitting of the transition may be carried out on the basis of the two-state model, here we assume a generalized situation instead of specific fitting functions (of a two-state folding model) and calculate thermodynamic quantities directly from definitions of such quantities.
Figure 3
Temperature dependence of the RBD partial heat capacity in a mixing buffer of glycine, sodium acetate, and sodium phosphate, with 100 mM sodium chloride, at pH 6.8 in the presence of 100 mM NaCl. The transition temperature TM is determined to be 48.74°C. The dashed line indicates the baseline curve and ΔCp ≈ 1.01 kcal/mol/K is the difference between extrapolations of the baseline at TM.
Table 1 presents the thermodynamic parameters obtained from the DSC measurements. The calorimetric enthalpy ΔHM was calculated by integrating the area enclosed by the specific heat curve and a proper baseline curve (dashed line in Fig. 3). The increment of specific heat ΔCp was measured by calculating the difference between the extrapolations of the specific heat curves at two ends. The entropy ΔSM at TM, the temperature of maximum stability TS, the temperature with zero enthalpy TH, and the conformational stability ΔGS at TS were determined by
Table 1
Thermodynamic parameters obtained from the DSC measurements and the Gō model simulations
Domain
TM (K)
ΔHM (kcal/mol)
ΔCp (kcal/mol/K)
ΔSM (cal/mol/K)
TS (K)
TH (K)
ΔGS (kcal/mol)
RBD (exp)
321.9 ± 0.27
81.99
1.01
254.71
250.15
240.72
9.52
RBD (sim)
321.9
69.02
0.92
214.41
254.98
246.88
7.45
CD spectra
The CD spectra for the RBD at pH 7.0 in the presence of 100 mM NaCl are presented in Fig. 4. The profiles of the CD curves peak at wavelength at ∼225 nm. We determine the folding fraction from the CD curves at wavelengths 208 nm, 212 nm, and 225 nm, using the equationwhere θ represents the observed mean residue ellipticity under the given conditions, and θN and θD are the corresponding ellipticities in the native (N) and denatured (D) states, respectively. The results are shown in Fig. 5. The melting temperature determined by the CD spectra is 48.5°C, consistent with the DSC measurement.
Figure 4
CD spectra of the RBD in a mixing buffer of glycine, sodium acetate, and sodium phosphate, at pH 7.0 in the presence of 100 mM NaCl.
Figure 5
Folding fractions of the RBD at pH 7.0 with respect to different temperature, determined by the CD spectra of Fig. 4 at λ = 208 nm, 212 nm, and 225 nm.
The peak intensity of the backbone amides of a protein in a 15N-edited HSQC spectrum is a function of the exchange rate with the solvent, the correlation time of the molecule, and the population of the state. Increased tumbling of the molecule results in stronger peak intensities, whereas increased exchange rate lowers the intensities. When the temperature is raised, both the tumbling rate and exchange rates increase. Residues that are involved in hydrogen bonding, e.g., secondary structure formation, usually have slower exchange rates and are less prone to lose intensity when the temperature is raised. Thus, a plot of peak intensities versus temperature provides a qualitative view of the (un)folding process under equilibrium conditions. Such a plot for the RBD is shown in Fig. 6 for the β-sheet residues. The residues comprising the β-sheet fall into three categories: Q59 (numbered according to the full length of NP), which shows extreme loss of intensity even at moderate temperatures; T58 and Y110, which lose intensity at an intermediate rate; and the rest, which lose intensity only at higher temperatures. Our results suggest that the β1 strand, which includes T58 and Q59, has a lower thermal stability compared to the other strands in the β-sheet. Thus, the strand β1 likely unfolds first and folds last in the (un)folding process. The fact that most residues exhibit a transition at approximately the same temperature implies a high degree of cooperativity of the (un)folding process.
Figure 6
The peak intensity of the backbone amides in a 15N-edited HSQC spectrum as a function of temperature for β-sheet residues, L57, T58, Q59 (β1), I85, G86, Y87, Y88, R89, R90, A91 (β2), R108, W109, Y110, F111, Y112, Y113 (β3), I131, V132, W133, V134 (β4), and F172, Y173 (β5).
Another way to analyze secondary structures is through the use of secondary shifts, which is defined as the difference between the observed chemical shift and the average random-coil chemical shift. This approach requires triple-resonance NMR experiments. We opted to observe only the carbonyl shifts because it employs a highly sensitive experiment and is sensitive to both the α- and β-structures. Positive values imply α-propensity, whereas negative values imply β-propensity. We carried out this experiment at three different temperatures with the results summarized in Fig. 6. Attempts to conduct the experiments at higher temperature were not feasible due to low signal/noise ratio. In this experiment, Y110 shows marginal β-propensity across all temperatures tested. Coupled with the results shown in Fig. 7, it seems that Y110 could serve as a starting point for (un)folding of the β2/β3/β4 sheet. Surprisingly, a number of residues located at the β2 (Y87, Y88, and R89) and β4 (I131 and W133) strands showed non-beta propensities at the temperatures tested. Higher temperatures led to a more random coil-like propensity for these residues. Whether there is a correlation between the secondary structure propensity of these residues and their thermal stability requires further investigation.
Figure 7
Carbonyl shifts at three different temperatures, 298 K, 303 K, and 308 K, for β-sheet residues L57, T58, Q59 (β1), I85, G86, Y87, Y88, R89, R90, A91 (β2), R108, W109, Y110, F111, Y112, Y113 (β3), I131, V132, W133, V134 (β4), and F172, Y173 (β5).
Simulation results
Thermodynamic properties
The specific heat Cp was calculated using the formula Cp(〈E2〉 – 〈E〉2)/T2, where E is the total energy of the system, and 〈···〉 stands for thermodynamic averaging. The averaging procedure was carried with the help of the reweighting technique (see Materials and Methods). As followed from our simulations, the maximum of the specific heat locates at the melting temperature TM = 0.8ɛH/kB. Identifying this value with the experimental value TM = 321.9 K we obtained the characteristic energy in our model ɛH = 0.73 kcal/mol. The simulated temperature dependence of Cp (Fig. 8 a) is shown in the same temperature interval as for the experimental curve in Fig. 3. The maximum value Cpmax ≈ 6.68 kcal/mol/K is not far from the experimental result (Fig. 3).
Figure 8
(a) Temperature dependence of the specific heat obtained in the Gō model. We use the same temperature interval as in the experiment (Fig. 3). We have ΔCp ≈ 0.92 kcal/mol/K. (b) The temperature dependence of dfN/dT obtained from the experiments and simulations. This quantity was calculated using where ΔHM and TM are taken from Table 1.
The jump of specific heat at the melting temperature ΔCp was defined as the difference between the base lines from two ends (Fig. 8 a). We obtain ΔCp ≈ 0.92 kcal/mol/K, which is also closed to the experimental estimate 1.01 kcal/mol/K. Due to the uncertainty in determination of baselines (Fig. 3 and Fig. 8 a), both the theoretical and experimental value of ΔCp should be considered as rough estimates. Therefore, it is not clear whether the agreement between the Gō modeling and experiments is just fortuitous for the RBD or robust for other proteins. Elucidating this issue requires further investigation.Defining the calorimetric enthalpy ΔHM as the area enclosed by the specific heat curve and a proper baseline curve (Fig. 8 a), we obtained ΔHM = 69.02 kcal/mol. Using TM = 321.9 K, Eq. 2 gives the jump of entropy at the denaturation transition ΔSM ≈ 214.41 cal/mol/K, which is a bit lower than the experimental one (Table 1). Substituting ΔHM, TM, ΔSM, and ΔCp to thermodynamic relations given by Eqs. 3–5, TS, TH, and ΔGS were also calculated and listed in Table 1. Clearly, the agreement between the theory and experiments is reasonable.The sharpness of the folded→unfolded transition might be quantitatively characterized via the cooperativity index Ωc defined by (29,30,31)where ΔT is the transition width, defined as the full width at half-maximum of |dfN/dT|. Using the experimental and theoretical values of ΔHM (Table 1), and the two-state formula for fN, we obtain the temperature dependence of |dfN/dT| shown in Fig. 8 b. Then, Eq. 7 gives Ωc = 4456 and 3195 for DSC experiments and the Gō model simulation, respectively. We can also estimate Ωc from its dependence on the number of amino acids N (30,32). From the data set collected for 34 proteins (32), we have Ωc ≈ 0.0456 × N, where exponent μ is universal and expressed via the random walk susceptibility exponent γ as μ = 1 + γ. Using μ ≈ 2.22 (γ ≈ 1.22) and N = 158, we obtain Ωc ≈ 3467, which is in reasonable agreement with the simulation and experimental estimate. Thus, one can use the scaling law to roughly estimate the cooperativity index of RBD.
RBD folds without intermediates
Assuming the number of native contacts as a good reaction coordinate, we constructed the free energy landscape (Fig. 9 a). Since two minima, which correspond to the denaturated and folded states, are separated by only one transition state, the RBD is a two-state folder. This is in qualitative agreement with our experimental finding. However, some remarks are in order. First, the two-state behavior is obvious for the β-protein RBD. Second, the all-or-none folding of the RBD is not a mere consequence of Gō modeling, because this model correctly captures not only the two-state folding of proteins CI2 and SH3 (more two-state Gō folders may be found in (33)) but also intermediates of the three-state folder barnase, RNase H, and CheY (14). The reason for this is that the simple Gō model ignores the energetic frustration but it still takes the topological frustration into account. Therefore, it can capture intermediates that occur due to topological constraints but not those emerging from the frustration of the contact interactions. The fact that the simple Gō model correctly captures the two-state behavior as was observed in our experiments, suggests that the energetic frustration ignored in this model plays a minor role compared to the topological frustration (14). Even though the Gō model by itself cannot be employed to ascertain the two-state behavior of proteins, one can use it in conjunction with experimental methods.
Figure 9
(a) Free energy as a function of the number of native contacts at T = TM. Typical snapshots are shown. (b) The distribution of RMSD for the denatured state (DS), transition state (TS), and folded state (FS) at T = TM. The average values of RMSD are 〈RMSD〉DS = 11.9 Å, 〈RMSD〉TS = 7.0 Å, and 〈RMSD〉FS = 3.3 Å.
As evident from Fig. 9 a, at the denaturation transition point the folding/unfolding barrier is rather low (≈1 kcal/mol). This presumably reflects shortcomings of the Gō modeling. Account of nonnative contacts would make a protein energetically frustrated (in addition to the topological frustration), leading to the barrier enhancement.Using Fig. 9 a, we define transition state (TS) conformations as those that have the number of native contacts 130 ≤ Q ≤ 180. Then one can sort out all of these conformations for a structural analysis. Fig. 9 b shows the distribution of the root-mean-square deviation (RMSD) for the denatured state (DS), TS, and folded state (FS). Representative snapshots of these states are shown in Fig. 9 a. The average RMSD of TS 〈RMSD〉TS ≈ 7 Å, and the average gyration radius 〈Rg〉TS = 21.67 Å. For the PDB structure the gyration radius is RgPDB = 19.92 Å. Since 〈Rg〉TS/RgPDB = 1.088 is close to unity, TS structures are relatively compact compared to the folded ones. It is also clear from snapshots shown in Fig. 9 a.
Reversibility of thermal folding
As discussed above, our experiments showed that the thermal folding of the RBD is highly reversible. Here, we present simulation results obtained at the water viscosity , to check this experimental observation. To see whether the thermal unfolding is the reverse of folding, we study the folding and unfolding pathways using the progressive variable δ as a reaction coordinate (see Simulation Details). Fig. 10 a shows the dependence of native contact fractions on δ for five β-strands at T = 285 K, which is
Figure 10
(a) Time dependence of the fraction of native contacts of five β-strands at T = 285 K. The definition of these strands is given in the caption of Fig. 2. (b) The same as in panel a, but for four pairs of β-strands (see Fig. 2 for definitions).
Thus, the folding initiates from the core by forming native contacts between strands β2 and β3. The native contacts, which belong to strands β3 and β4, form almost at the same timescales. The N-terminus is expected to fold last, when all contacts between strands β1 and β3 occur.The time dependence of native contact fractions at T = 450 K for the β-strands and their pairs is shown in Fig. 11. In this case, we have the following unfolding pathways:
Figure 11
(a) Time dependence of the fraction of native contacts of five β-strands at T = 450 K. The definition of these strands is given in the caption of Fig. 2. Results were averaged over 100 trajectories. (b) The same as in panel a, but for four pairs of β-strands (see Fig. 2 for definitions).
It should be noted that the pathways given by Eqs. 8 and 9 are valid in the statistical sense. Our detailed analysis shows that strand β1, e.g., unfolds first in 48% of events (Eq. 9), while β5 and β4 unfold first in 42% and 10% of events, respectively. Comparing Eq. 8 and Eq. 9, one can see that in terms of strand pairs, the unfolding is the reverse of folding. In this sense, our simulations capture the experimental fact that the folding of the RBD is highly reversible. However, it is not entirely true if one considers individual strands. In fact, assuming that the thermal unfolding is reverse of folding, then from Eq. 8 it would follow that strand β2 unfolds last, but β4 is supposed to unfold last according to Eq. 9. Notably, Eq. 8 and Eq. 9 unambiguously show that statistically β1 folds last at low temperatures, and it unfolds first at high temperatures. These results are congruent with the experimental results.
Discussion
We have carried out the DSC, CD, and NMR spectroscopy, and Gō-like model simulation to investigate the thermostability of the RBD. We find that even though the thermal-induced unfolding of the full-length NP is irreversible in the same buffer condition in our DSC measurements (results not shown), the RBD is a highly reversible segment. The transition midpoint is determined to be 48.74°C located in the range of the unfolding temperature of the NP (35–55°C with a TM of 43°C (4)). The thermal unfolding of the full-length NP was studied by fluorescence technique that monitors primarily the structural changes at the vicinities of the five tryptophan residues, three located in the RBD domain and two in the DD domain. Thus, the results are the composite effects of the melting of the two domains and as such, the observed melting is broad. The observed lower TM for the NP also suggests that the DD appears to be less stable and more crucial to the properties of the NP, and predominantly determines the stability of the framework of the DD octamer (8) in forming helical ribonucleoprotein core. The RBD and DD do not interact with each other. It is unclear how the two domains contribute to the high irreversibility. This issue requires further investigation.Our DSC and CD spectra analyses and the free-energy landscape constructed by the Gō-like model suggest that the folding-unfolding of the RBD can be treated as two-state within the experimental timescale. The Gō model by itself can not be employed to ascertain the two-state behavior of proteins. However, one can use it in conjunction with experiments to probe the two-state folding because this model does not always provide the two-state behavior as have been clearly shown in the work of Clementi et al. (14). It can capture intermediates that occur due to topological constraints but not those emerging from the frustration of the contact interactions. Although the folding process in the Gō model is less cooperative compared to what observed experimentally, we have obtained good estimates for other thermodynamic quantities. Given the simplicity of the Gō modeling, our theoretical results should be considered as reasonable. In the future, it is of interest to simulate RBD with all-atom models (34–38) to see whether such simulated results can be more cooperative. In addition, our present simulations show that the folding of the RBD initiates from the core and the N-terminus folds last. Our NMR spectroscopy experiment conducted under equilibrium conditions suggests that the N-terminus unfolds first and folds last, while the (un)folding of remaining β-strands is highly cooperative. Due to the sensitivity of the solution pH and hydrogen exchange rate with respect to temperature variations, our analysis based on the NMR data for the folding sequence of the remaining four β-strands has a limited resolution.As evident from our study that the unfolding-folding process of the RBD is highly reversible, it is appealing to investigate the pH dependence of the folding of the RBD. Furthermore, our experiments show that a number of residues located in the β2 and β4 strands showed non-beta propensities at the tested temperatures (298–308 K) and a more random coil-like propensity for higher temperatures. Whether there is a correlation between the secondary structure propensity of these residues and their thermal stability implied by the simulations remains to be seen. We are working on these problems, and related studies will be reported in a future article.