Quantitative Interpretation of Protein Diffusion Coefficients in Mixed Protiated-Deuteriated Aqueous Solvents.

Bridget Tang¹, Katie Chong², Walter Massefski³, Robert Evans¹.

Abstract

Diffusion-ordered nuclear magnetic resonance (NMR) spectroscopy is widely used for the analysis of mixtures, dispersing the signals of different species in a two-dimensional spectrum according to their diffusion coefficients. However, interpretation of these diffusion coefficients is typically purely qualitative, for example, to deduce which species are bigger or smaller. In studies of proteins in solution, important questions concern the molecular weight of the proteins, the presence or absence of aggregation, and the degree of folding. The Stokes-Einstein Gierer-Wirtz estimation (SEGWE) method has been previously developed to simplify the complex relationship between diffusion coefficient and molecular mass, allowing the prediction of a species' diffusion coefficient in a pure solvent based on its molecular weight. Here, we show that SEGWE can be extended to successfully predict both peptide and protein diffusion coefficients in mixed protiated-deuteriated water samples and, hence, distinguish effectively between globular and disordered proteins.

Substances：
Proteins
Solvents
Water

Year: 2022 PMID: 35917500 PMCID: PMC9376945 DOI: 10.1021/acs.jpcb.2c03554

Source DB: PubMed Journal: J Phys Chem B ISSN: 1520-5207 Impact factor: 3.466

Introduction

Molecular self-diffusion in a liquid originates from the random, thermal motion of the molecules present. Diffusion coefficients, such as those acquired in diffusion-ordered nuclear magnetic resonance (NMR) spectroscopy,[1] provide information on the size, shape, and local environment of molecules, both small and large. This, in turn, yields chemical information, such as the molecular weight of an unknown species or its aggregation or association with other species, and can reveal changes in structure, such as when proteins denature. However, while there is a rough inverse correlation between molecular mass and the speed at which a species moves through a solution, the wide range of possible molecular shapes, solute–solvent interactions, and some fundamental problems with diffusion theories make quantitative interpretation of diffusion coefficient data difficult. One approach is to use power laws, such as eq 1, $D = K M^{-1/\delta}$, to derive correlations between diffusion coefficient, D, and molecular mass, M, for chemically cognate systems, for example, a homologous series in a particular solvent at a given temperature. A plot of log D against log M, following the logarithmic form of eq 1, $\log D = \log K - (1/\delta)\log M$, for a series of structurally similar compounds can be used to infer the molecular weight of an unknown compound of the same class from an experimentally acquired D. This approach has been very successful, particularly in organometallic chemistry, where diffusion NMR has been used to identify reactive intermediates and organometallics.[2,3] Such empirically obtained power laws have also found wide use in the study of macromolecules, in particular polymers, peptides, and proteins.[4,5] Each power law must be parametrized for the distinct class of compounds studied in a given solvent, producing a pair of parameters, log K and, more importantly, the constant of proportionality between log D and log M. In this work, this constant has been expressed as (1/δ) throughout for two reasons.
First, this highlights the similarities between power laws, such as eq 1, and Flory theory, and, second, it avoids duplication with parameters used later. The constant of proportionality between log D and log M indicates the relationship between the molecular weight of a species and its hydrodynamic radius in solution. Its value depends not only on the molecular structure of the species but also on experimental conditions such as solvent choice and temperature. Globular proteins are an example of chemical species where values of (1/δ) typically tend towards 0.33. Two studies, one by Augé et al., using diffusion NMR,[4] and another by Enright and Leitner, computing fractal indices based on structures found in the Protein Data Bank,[6] both obtained values of 0.39 for a range of proteins spanning several orders of magnitude in size. These studies proved similar to diffusion NMR studies by Jones and Wilkins[7] and Whitehead et al.,[8] which relate the protein gyration and hydrodynamic radii, respectively, to the number of residues present.[8] Both reported that while values of (1/δ) for globular proteins tended towards 0.33, measurements in strongly denaturing solutions gave increased values approaching 0.6. Under these conditions, the exponent is similar to that expected for a polymer in a solvent with energetically favorable interactions between polymer segments and solvent molecules.[9] Therefore, differences in (1/δ) can be used to distinguish between folded, disordered, and denatured proteins. To demonstrate this, Dudás and Bodor acquired diffusion coefficients of 12 globular proteins and 10 intrinsically disordered proteins with sizes of up to 65 000 g mol⁻¹.[10] A value for (1/δ) of 0.381 was obtained for globular proteins, consistent with previous work and near-spherical molecules. Intrinsically disordered proteins exhibited an average exponent of 0.507, commensurate with their more extended, loosely packed structures.
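As a minimal sketch of the power-law approach described above, the following Python fragment fits log D against log M by ordinary least squares and then inverts the fitted law to estimate a mass from a measured D. The function names, the prefactor, and the synthetic diffusion coefficients are illustrative assumptions; only the protein masses and the exponent 0.381 are taken from the text.

```python
import math


def fit_power_law(masses, diffusion_coeffs):
    """Least-squares fit of log10(D) = log10(K) - (1/delta) * log10(M).

    Returns (log10_K, inv_delta), where inv_delta is the positive exponent 1/delta.
    """
    xs = [math.log10(m) for m in masses]
    ys = [math.log10(d) for d in diffusion_coeffs]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    log10_k = y_mean - slope * x_mean
    return log10_k, -slope


def mass_from_diffusion(d, log10_k, inv_delta):
    """Invert the fitted power law to estimate M from a measured D."""
    return 10 ** ((log10_k - math.log10(d)) / inv_delta)


# Synthetic data only: masses from Table 1, D generated from a hypothetical
# power law (log10 K = -8.3 is an assumed value, not a fitted one).
masses = [6500.0, 8579.0, 14307.0, 16700.0, 66463.0]
true_log10_k, true_inv_delta = -8.3, 0.381
ds = [10 ** (true_log10_k - true_inv_delta * math.log10(m)) for m in masses]

log10_k, inv_delta = fit_power_law(masses, ds)
estimated_mass = mass_from_diffusion(ds[2], log10_k, inv_delta)
print(round(inv_delta, 3), round(estimated_mass))
```

Because both fitted parameters are specific to one compound class, solvent, and temperature, a fit of this kind has to be repeated whenever any of those change.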
An alternative approach is to start with the Stokes–Einstein equation (eq 2),[11] $D = k_\mathrm{B}T/(6\pi\eta r_\mathrm{H})$, where the diffusion coefficient, D, of a particle or molecule is estimated by balancing the thermal energy of the system, defined as kBT, where kB is the Boltzmann constant and T is the temperature, with the friction acting on the particle, assuming that the particle is a hard sphere with the hydrodynamic radius rH, at infinite dilution in a continuum fluid with the viscosity η. The Stokes–Einstein equation works well for nanometer and larger-sized species. However, for smaller molecules, the equation works less well for two well-established reasons. The first recognizes that solvents are not continuous but consist of molecules moving randomly. These solvent molecules have a finite size. This breakdown of the continuum model significantly affects predicted diffusion coefficients. The effect of non-negligible solvent particle size is to increase the friction acting on the solute molecules. This increase can be included in the Stokes–Einstein equation by introducing a variable friction factor, f, to the denominator, leading to eq 3, $D = k_\mathrm{B}T/(6\pi\eta f r_\mathrm{H})$. While several expressions for f have been proposed, all changing the friction as a function of the ratio of solvent to solute radii, here, the Gierer–Wirtz function (eq 4)[12] is used, where α is the ratio of the radius of the solute to that of the solvent. Equation 4 was derived directly from microfrictional theory. The other approaches use adjustable parameters, determined empirically.[13] The second reason for the failure of the Stokes–Einstein equation to accurately predict molecular diffusion coefficients is that most molecules are not hard spheres but can exhibit different molecular shapes, are flexible, interact with solvents to different degrees, and can have very different effective densities.
Molecular shapes can be approximated as ellipsoids and, while analytical equations do exist for the effect on molecular diffusion of increasing aspect ratios in ellipsoidal shapes,[14] for molecules that are not long thin rods or wide thin disks, the effects are typically much less than 10% and can be safely ignored in most cases. The remaining factors, flexibility, solvation, and composition, cannot be adequately handled without prior information. One further simplification is to limit the method to species that do not contain any heavy atoms and may be assumed to have an effective density typical of organic molecules containing carbon, hydrogen, oxygen, and nitrogen only. Therefore, all solutes can be assumed to be hard spheres with a single adjustable parameter, the effective density, ρeff. These modifications produce the Stokes–Einstein–Gierer–Wirtz estimation (SEGWE) method for the prediction of molecular diffusion coefficients (eqs 5a and 5b).[15,16] This approach links the diffusion coefficient, D, expected in a solvent with a given viscosity η at a given temperature T to the solute and solvent molecular weights MW and MWS and NA, the Avogadro number, through a single adjustable parameter, ρeff. Using the Gierer–Wirtz function (eq 4) does require knowledge of α, the ratio of the solute and the solvent radii, but since the solute radius is being estimated using the hard-sphere approximation with an effective density, the same logic can be applied to estimating the solvent radius. The value of the single adjustable parameter, ρeff, can then be obtained by finding an optimum value from a test set of molecules. The original study[15] used a training set of experimental diffusion coefficients, D, all measured at 298.15 K, for 108 combinations of 44 test compounds and 5 common deuteriated NMR solvents. Numerical optimization was used to estimate the effective density ρeff = 627 kg m⁻³.
This empirical effective density is lower than would be predicted from a consideration of only molecular mass and geometry because the effects of solvation and flexibility will typically increase the solute hydrodynamic radius. The SEGWE method has been further tested using 558 additional measurements of small molecules in dilute systems drawn from literature studies of small molecule diffusion as an additional training set. This larger data set spans a wider range of chemical space than the initial training set, increases the range of compound masses up to ca. 1.5 kDa, allows for measurements at variable temperatures, and increases the number of pure solvents covered from 5 to 23.[16] SEGWE has been demonstrated to be effective in analyzing small organic molecules, identifying natural products,[17] and confirming the presence or absence of aggregation.[18] While the SEGWE method was explicitly designed for small molecules containing only lighter atoms such as C, H, and O, this has not stopped its use in the analysis of compounds and complexes containing heavier atoms, such as coinage metals.[19−22] Neither general power law nor SEGWE methods are designed to handle samples containing mixed solvents. Mixed solvents are commonly used in NMR experiments, particularly in studies of proteins where deuteriated water is required for deuterium lock, but protiated water is necessary to preserve any exchangeable protons, particularly backbone and sidechain amide resonances. For power law-based models, any change in the system, whether the nature of the compounds studied or the solvent composition, necessitates generating a new power law and estimating new values for both parameters. 
A power law method for mixed solvents, albeit those containing chaotropic agents such as DMSO and urea, has recently been published.[23] In the case of SEGWE, the use of mixed solvents will affect both the Gierer–Wirtz function, as the different solvents may have different sizes, and also the solvent viscosity, as different compositions of mixed solvents will have different viscosities. While the effect of deuteriation on solvents may sometimes be overlooked, it can affect solvent viscosity depending on two factors: first, the number of protons per molecule replaced by the heavier isotope and, second, the role hydrogen bonding has in the liquid.[24] While, for solvents such as chloroform, the differences in solvent viscosity between protiated and deuteriated solvents can be small, Figure 1 illustrates the differences in viscosity between H2O and D2O as a function of temperature.
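Before turning to mixed solvents, the single-solvent SEGWE estimate described above can be sketched in Python. This is a sketch under stated assumptions rather than the authors' implementation: it uses the form of the Gierer–Wirtz factor given in the published SEGWE literature, with α taken as the solvent-to-solute radius ratio and both radii estimated from the same effective density, so that α = (MWS/MW)^(1/3). The D2O viscosity in the example is an approximate literature value assumed for illustration.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant, J K^-1
N_A = 6.02214076e23   # Avogadro constant, mol^-1
RHO_EFF = 627.0       # effective density quoted in the text, kg m^-3


def hard_sphere_radius(mw_g_mol, rho_eff=RHO_EFF):
    """Radius (m) of a hard sphere of molar mass mw_g_mol at density rho_eff."""
    mw_kg_mol = mw_g_mol * 1e-3
    return (3.0 * mw_kg_mol / (4.0 * math.pi * rho_eff * N_A)) ** (1.0 / 3.0)


def gierer_wirtz_factor(alpha):
    """Friction factor f; alpha is ASSUMED to be r_solvent / r_solute."""
    return 1.0 / (1.5 * alpha + 1.0 / (1.0 + alpha))


def segwe_diffusion(mw, mw_solvent, eta, temperature, rho_eff=RHO_EFF):
    """Predicted D (m^2 s^-1) for a solute in one pure solvent.

    mw, mw_solvent in g mol^-1; eta in Pa s; temperature in K.
    """
    alpha = (mw_solvent / mw) ** (1.0 / 3.0)  # same rho_eff for both radii
    r_h = hard_sphere_radius(mw, rho_eff)
    f = gierer_wirtz_factor(alpha)
    return K_B * temperature / (6.0 * math.pi * eta * f * r_h)


# Example: lysozyme (14 307 g mol^-1) in D2O at 298.15 K.
# eta = 1.095e-3 Pa s is an approximate, assumed viscosity for D2O.
d_lysozyme = segwe_diffusion(14307.0, 20.028, 1.095e-3, 298.15)
print(f"{d_lysozyme:.3e} m^2 s^-1")
```

With this convention the factor f tends to 1 for very large solutes, recovering the plain Stokes–Einstein estimate.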

Figure 1

Viscosities of H2O (blue) and D2O (black) at a range of temperatures from 273 to 313 K, calculated using Andrade’s equation and parameters obtained from ref (16).

Figure 1 reveals that the differences in viscosity between protiated solvents and their deuteriated counterparts can be large, reaching 25% at low temperatures for aqueous solvents. The two solvents also exhibit different temperature dependencies. The dependence of fluid viscosity, η, on temperature, T, can be described by an Arrhenius-like equation known as Andrade’s equation (eq 6).[25,26] The parameters a and b can be obtained for a given liquid by plotting the logarithm of measured fluid viscosity against the reciprocal of its temperature. These Arrhenius-like parameters have previously been collated for common deuteriated and protiated solvents in Evans et al.[16] Supporting Information 1 contains figures similar to Figure 1 (Figures S1–S5) for other common deuteriated solvents, CDCl3, MeOH-d4, DMSO-d6, and toluene-d8, and their protiated counterparts, as well as a summary (Figure S6), and all relevant Arrhenius parameters for their viscosities. There is surprisingly little consensus on the question of predicting the viscosities of mixed solvents. A number of empirical equations have been derived to estimate the viscosity of a mixed solvent based on its composition and the viscosities of the pure components. One of the most commonly used is the Kendall–Monroe equation (eq 7),[27,28] which predicts the viscosity of the mixed solvent η1,2 as the weighted average of the cube-root viscosities of the pure component fluids, $\eta_{1,2}^{1/3} = x_1\eta_1^{1/3} + x_2\eta_2^{1/3}$, where x1 is the molar fraction of component one, η1 is the viscosity of component one, x2 is the molar fraction of component two, and η2 is the viscosity of component two.
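The two viscosity relations above can be sketched as follows. The Andrade form used here, η = a·exp(b/T), is an assumption about how the parameters a and b are defined; the parametrization tabulated in ref (16) may differ (for example, by using ln a as the intercept). All numerical values in the example are synthetic.

```python
import math


def andrade_viscosity(a, b, temperature):
    """Arrhenius-like viscosity, ASSUMING the form eta = a * exp(b / T)."""
    return a * math.exp(b / temperature)


def fit_andrade(temperatures, viscosities):
    """Recover (a, b) from a straight-line fit of ln(eta) against 1/T."""
    xs = [1.0 / t for t in temperatures]
    ys = [math.log(eta) for eta in viscosities]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b = sxy / sxx                       # slope of ln(eta) vs 1/T
    a = math.exp(y_mean - b * x_mean)   # exp(intercept)
    return a, b


def kendall_monroe(x1, eta1, eta2):
    """Mixture viscosity from mole-fraction-weighted cube-root viscosities."""
    x2 = 1.0 - x1
    return (x1 * eta1 ** (1.0 / 3.0) + x2 * eta2 ** (1.0 / 3.0)) ** 3


# Synthetic check: generate viscosities from hypothetical parameters,
# then recover them from the linearized plot.
a_true, b_true = 1.0e-6, 2000.0   # hypothetical values, not from ref (16)
temps = [278.15, 288.15, 298.15, 310.15]
etas = [andrade_viscosity(a_true, b_true, t) for t in temps]
a_fit, b_fit = fit_andrade(temps, etas)
eta_mix = kendall_monroe(0.5, etas[0], 1.3 * etas[0])
print(a_fit, b_fit, eta_mix)
```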
The Kendall–Monroe equation was proposed based on it being the least inaccurate of several models using different functions of the pure component viscosities.[27,28] Other models used to predict the viscosity of mixed solvents include physical quantities such as the densities of the pure components.[29,30] In the work presented here, a mixing rule for viscosity initially proposed by Eyring[31] and subsequently updated by Grunberg and Nissan[32] (eq 8) has been used to extend SEGWE for use with mixed solvents. Equation 8, $\ln\eta_{1,2} = x_1\ln\eta_1 + x_2\ln\eta_2$, predicts the viscosity of a mixed solvent as the weighted average of the logarithms of the viscosities of the pure component fluids. Equation 8 is also functionally similar to a very early mixing rule derived by Arrhenius.[33] While the differences between the models in predicting viscosities of different compositions of protiated and deuteriated water are small, a clear advantage of eq 8 is its synergy with eq 6 to give eq 9. Equation 9 provides a single exponential capable of predicting the viscosity of a mixed solvent based on the known values of a and b for both pure solvents used, their compositions in the mixed solvent, x1 and x2, and the sample temperature, T.

In this work, the SEGWE method, extended for use with mixed-solvent solutions, is used to predict the diffusion coefficients of both globular and denatured proteins in mixed protiated–deuteriated solvents. Equation 9 is used to estimate the viscosity of the mixed solvent. A weighted average of the Gierer–Wirtz predictions, eq 4, for the two components of the solution handles the breakdown of the continuum model. In mixed protiated–deuteriated solvents, the solvent radii are practically the same and additional friction will be very similar for the two components. As mixed protiated–deuteriated solvents are widely used in the NMR studies of proteins, a set of five proteins is used here to test the extended SEGWE method.
Diffusion coefficients were acquired for solvent compositions between 10 and 100% D2O and temperatures from 278.15 to 310.15 K. Ascertaining whether diffusion coefficients are over- or under-predicted is an important part of assessing the effectiveness of the extended SEGWE method. Convection, common in liquid-phase NMR experiments, will lead to experimentally acquired diffusion coefficients larger than predicted, as would more compact structures. Conversely, aggregation or less effectively packed structures would have the opposite effect, as the larger species would move more slowly in solution. A summary of this extended SEGWE method and the interpretation of its predictions is shown in Scheme 1.
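The extended method summarized above can be sketched in Python as follows. This is an illustration under several assumptions, not the authors' spreadsheet: compositions are treated as mole fractions, the mixture viscosity uses the ideal logarithmic mixing rule, the Gierer–Wirtz factors of the two solvents are averaged with mole-fraction weights, and α is taken as the solvent-to-solute radius ratio. The pure-solvent viscosities at 278.15 K are approximate literature values assumed for the example.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant, J K^-1
N_A = 6.02214076e23   # Avogadro constant, mol^-1
RHO_EFF = 627.0       # effective density quoted in the text, kg m^-3
MW_H2O, MW_D2O = 18.015, 20.028  # g mol^-1


def log_mixing_viscosity(x1, eta1, eta2):
    """Ideal logarithmic mixing: ln(eta) = x1 ln(eta1) + x2 ln(eta2)."""
    return math.exp(x1 * math.log(eta1) + (1.0 - x1) * math.log(eta2))


def gierer_wirtz_factor(mw, mw_solvent):
    """Friction factor f with alpha ASSUMED to be (MW_solvent / MW)^(1/3)."""
    alpha = (mw_solvent / mw) ** (1.0 / 3.0)
    return 1.0 / (1.5 * alpha + 1.0 / (1.0 + alpha))


def extended_segwe(mw, x_h2o, eta_h2o, eta_d2o, temperature, rho_eff=RHO_EFF):
    """Predicted D (m^2 s^-1) in an H2O/D2O mixture; x_h2o is a mole fraction."""
    eta_mix = log_mixing_viscosity(x_h2o, eta_h2o, eta_d2o)
    # Mole-fraction-weighted average of the two solvent friction factors.
    f_mix = (x_h2o * gierer_wirtz_factor(mw, MW_H2O)
             + (1.0 - x_h2o) * gierer_wirtz_factor(mw, MW_D2O))
    r_h = (3.0 * mw * 1e-3 / (4.0 * math.pi * rho_eff * N_A)) ** (1.0 / 3.0)
    return K_B * temperature / (6.0 * math.pi * eta_mix * f_mix * r_h)


# Example: lysozyme in 10:90 H2O/D2O at 278.15 K.
# Viscosities (Pa s) are approximate, assumed values for 278.15 K.
d_pred = extended_segwe(14307.0, 0.10, 1.52e-3, 1.99e-3, 278.15)
print(f"{d_pred:.2e} m^2 s^-1")
```

Because the two solvent radii are nearly identical, the averaging of the friction factors changes the prediction far less than the change in mixture viscosity does.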

Scheme 1

Construction of the Extended SEGWE Equation and Infographic Illustrating the Format of SEGWE Predictions in Figures 4–6

Experimental Section

All data, unless otherwise specified, was acquired at the Department of Chemistry Instrumentation Facility (DCIF) at the Massachusetts Institute of Technology. All DOSY measurements were carried out on a 600 MHz AVANCE NEO Bruker spectrometer, using a 5 mm helium-cooled QCI-F cryoprobe equipped with a z-gradient coil producing a calibrated maximum gradient of 55.37 G cm⁻¹. The gradients were calibrated using the standards and method of Holz and Weingartner.[34] The temperature was calibrated using methanol-d4 and ethylene glycol NMR thermometers.[35,36] DOSY data was acquired using a stimulated echo NMR pulse sequence with bipolar pulsed-field echoes and longitudinal eddy current delay,[37] with additional excitation sculpting[38,39] used to suppress the solvent signals. Full experimental parameters are described in Supporting Information 2, with experiment timing parameters, such as Δ and δ, summarized in Table S3. All data was processed using GNAT,[40] using 10 Hz of line broadening. The peaks between 0.5 and 1.5 ppm, corresponding to methyl groups in the proteins, were used to obtain the diffusion coefficients. In total, diffusion coefficients of five different globular, monomeric proteins with molecular weights ranging from 6500 to ca. 66 500 g mol⁻¹ were acquired. Table 1 summarizes all proteins studied in this work and their molecular weights. All DOSY spectra for all protein samples, at all sample temperatures and for all sample compositions, can be found in Supporting Information 3 (Figures S9–S44), Supporting Information 4 (Figures S46–S69), and Supporting Information 5 (Figures S70–S74).

Table 1

Summary of Proteins Studied and Their Molecular Weights

protein	molecular weight (g mol⁻¹)
aprotinin	6500
ubiquitin	8579
lysozyme	14 307
myoglobin	16 700
bovine serum albumin (BSA)	66 463

Results

Figure 2 shows a typical DOSY spectrum of a protein, in this case lysozyme, in an aqueous solution. As all signals correspond to protons on the same macromolecule, all have the same diffusion coefficient. Therefore, the peaks in the DOSY spectrum align on or around the same horizontal line, indicated by a blue dashed line in the figure. Peaks significantly below the line may be due to smaller species, diffusing faster, also present in the sample. The residual solvent signal has been suppressed experimentally and also excluded from the DOSY processing. DOSY spectra similar to Figure 2 were acquired for 0.4 mM lysozyme samples at temperatures ranging from 278.15 to 310.15 K in a range of different aqueous solvent compositions.

Figure 2

DOSY spectrum of 0.4 mM lysozyme in 100% D2O solution at 278.15 K. Insert depicts protein methyl peaks, estimate of diffusion coefficient, D, and associated error estimate.

Figure 3 is a summary of diffusion coefficients acquired for lysozyme. Dashed lines, color coded for the different sample temperatures, highlight trends within sets of data acquired at a given temperature. All DOSY spectra of lysozyme, corresponding to the data in Figure 3, can be found in Supporting Information 3, with all diffusion coefficients summarized in Table S4.

Figure 3

Measured diffusion coefficients of 0.4 mM lysozyme samples at different temperatures and in different mixed aqueous solutions. The inset indicates color coding for different temperatures. Colored dashed lines are used to illustrate the trends in the data.

As expected, the experimentally acquired diffusion coefficients of lysozyme increase as the percentage of protiated water in the solvent increases. For example, at 278.15 K, the diffusion coefficient of lysozyme was found to be 0.54 × 10⁻¹⁰ m² s⁻¹ in 10:90 H2O/D2O compared to 0.70 × 10⁻¹⁰ m² s⁻¹ in 90:10 H2O/D2O. As the temperature increases, the thermal energy of the system increases and the solvent becomes less viscous. Therefore, the diffusion coefficients also increase. In 10:90 H2O/D2O solution, the diffusion coefficient of lysozyme increases to 1.43 × 10⁻¹⁰ m² s⁻¹ at 310.15 K. This set of experimentally acquired diffusion coefficients can be compared with extended SEGWE predictions for lysozyme (MW = 14 307 g mol⁻¹), using eqs 5 and 9. Figure 4 shows the results of plotting experimental versus predicted diffusion coefficients for the set of experimentally acquired diffusion coefficients summarized in Figure 3.

Figure 4

Measured diffusion coefficients plotted against diffusion coefficients calculated using the extended SEGWE method described in Scheme 1 for 30 measurements of lysozyme at different temperatures and in different mixed aqueous solutions, with a dashed line of unit slope. Insets indicate color and shading coding for different temperatures and sample compositions, respectively.

The same color coding as in Figure 3, from blue to red, is used to indicate measurements at different temperatures, while an additional shading, from dark to light, is used to indicate measurements in different solvent compositions. All diffusion coefficients predicted by the extended SEGWE method are summarized in Table S6. The extended SEGWE method performs well for this data set, with an RMS error of ca. 1.5%. Gratifyingly, there appears to be no decrease in accuracy either with increasing temperature, indicating that convection was not affecting these samples, or with changing sample composition, indicating that eq 9 handles the prediction of different sample viscosities well. To further test the extended SEGWE method, diffusion coefficients were acquired for a wider set of five proteins, described in the Experimental Section, at the same temperature (298.15 K) and in different aqueous solvent compositions. Figure 5 shows the plots of these experimental diffusion coefficients versus those predicted by the extended SEGWE method. All DOSY spectra for all four additional proteins, all acquired at 298.15 K in different solvent compositions, can be found in Supporting Information 4, supported by Table S7, summarizing both experimentally acquired diffusion coefficients and diffusion coefficients predicted by the extended SEGWE method.
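The RMS errors quoted for these comparisons can be computed along the following lines. The text does not spell out the definition, so this sketch assumes a root-mean-square of the relative deviations of predicted from measured values, expressed as a percentage; the numbers in the example are hypothetical and are not data from this work.

```python
import math


def rms_percent_error(measured, predicted):
    """RMS of (predicted - measured) / measured, as a percentage.

    ASSUMPTION: deviations are taken relative to the measured values.
    """
    if len(measured) != len(predicted) or not measured:
        raise ValueError("need two non-empty sequences of equal length")
    rel = [(p - m) / m for m, p in zip(measured, predicted)]
    return 100.0 * math.sqrt(sum(r * r for r in rel) / len(rel))


# Hypothetical diffusion coefficients in units of 1e-10 m^2 s^-1.
measured = [0.54, 0.70, 1.43]
predicted = [0.55, 0.69, 1.45]
print(f"{rms_percent_error(measured, predicted):.2f} %")
```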

Figure 5

Measured diffusion coefficients plotted against diffusion coefficients calculated using the extended SEGWE method described in Scheme 1 for measurements of five different proteins at 298.15 K, in different mixed aqueous solutions, with a dashed line of unit slope. Different shapes indicate different proteins, while shading indicates different sample compositions, with darker colors containing a higher concentration of D2O.

The extended SEGWE method performs well here, with an RMS error for the whole data set of 4.4%. As with the data in Figure 4, there are no deviations as the solvent composition changes. While the proteins were chosen as a representative set of monomeric, globular proteins, two, aprotinin and BSA, lie below the line of unit slope for all solvent compositions. This may result from the shape adopted by the proteins in solution, with any deviation from a spherical, globular protein resulting in greater friction and a lower measured diffusion coefficient. Two additional factors, both concentration dependent, will also reduce experimentally acquired diffusion coefficients. First, proteins are known to aggregate in solution, forming larger species. Diffusion NMR techniques are used in the study of protein aggregation,[41−43] and expressions exist for relating the decreases in apparent diffusion coefficient to the degree of aggregation and equilibria involved.[44,45] The sample concentrations in this work were chosen to limit the amount of aggregation present. Second, at high enough concentrations, the proteins present an inaccessible volume fraction of the sample and obstruct each other as they diffuse. Obstruction effects for a solution of a species with molecular weight MW at a molar concentration c in a solvent with density ρ can be estimated by calculating the volume fraction of solute using eq 10 and were hence ruled out for the samples studied in this work.
This calculation and data depicting the influence of obstruction effects on experimentally acquired protein diffusion coefficients can be found in Supporting Information 5. A final assessment of the extended SEGWE method is how well it can answer common chemical questions. The measurement of protein diffusion coefficients provides an important insight into their folding state in solution and function. Globular proteins, such as the set of five proteins depicted in Figure 5, possess well-defined, compact 3D structures. On the other hand, disordered proteins can be described as worm-like chains, similar to polymers adopting a “random coil” configuration.[46] It is possible to denature proteins using either high concentrations of chaotropic agents such as urea or through heating. Intrinsically disordered proteins (IDPs), whether partially structured or fully unstructured, offer an alternative without uncertainty in how effective the denaturing process has been.[47] Figure 6 shows experimentally acquired diffusion coefficients for a wider selection of proteins containing both globular proteins (black circles) and IDPs (blue circles) compared with values estimated using the extended SEGWE method.

Figure 6

Measured diffusion coefficients plotted against diffusion coefficients calculated using the extended SEGWE method described in Scheme 1 for measurements of 12 folded proteins (black circles) and 10 IDPs (blue circles) at 287.0 K, with dashed lines of unit slope for both sets of proteins, offset to highlight the differences between the two sets of proteins. Data are drawn from ref (10).

As expected, the IDPs form larger structures in solution, with correspondingly lower diffusion coefficients than expected for the protein molecular weight. This difference is immediately visible: not one of the globular proteins in this set deviates enough to be misclassified as an IDP. SEGWE also predicts the diffusion coefficients of the globular proteins, indicated by black circles in Figure 6, reasonably accurately. The RMS error for the data set in Figure 6 is ca. 10%, comparable to that found when SEGWE was applied to a broad set of many small molecules.[15,16] With the wider range of protein diffusion data depicted in Figure 6, the importance of structural features, such as numbers of charged residues and net charge, can also be assessed. The net charges of the proteins in Figure 6 range from +7 to −24, but these appear to have no effect on the accuracy of the extended SEGWE method. This additional information is summarized in Supporting Information 6, supported by Figure S76, where Figure 6 has been redrawn to indicate the net charges on the proteins. The raw diffusion coefficient data was previously published by Dudás and Bodor in ref (10). In the original SEGWE method, small organic molecules are assumed to have approximately the same density, ρeff = 627 kg m⁻³, with this single, optimized parameter containing all of the effects of shape, composition, flexibility, and solvation on small molecule diffusion. Proteins are made up of amino acids containing C, H, O, N, and S only, so are likely to have a composition similar to the compounds used to initially generate the SEGWE method.
The secondary structure elements that proteins adopt, such as α-helices and β-sheets, produce well-defined structures, densities, and regions with well-packed atoms, buried away from the solvent. The overall density of a protein will be dependent on the packing of these structures and on the amino acid composition of the protein. Diffusion NMR data, depicted in Figures 5 and 6, indicates that the assumption that proteins have a single effective density similar to that of small organic molecules remains valid. Deviations from the expected values will give important information about the structures adopted by the proteins studied. The IDPs depicted in Figure 6 fall below the dashed line of unit slope because the extended structures they form have lower densities than the folded, globular proteins.

Discussion

Convection

Any discussion of experimentally acquired diffusion coefficients must address the likely presence of convection. Any convective flow in a sample will lead to higher experimentally acquired diffusion coefficients than expected. Convection is conventionally seen as a critical phenomenon. If a large enough negative vertical temperature gradient forms between the two ends of the NMR tube, then Rayleigh–Bénard convection will spontaneously form, with the warmer fluid flowing upward, displacing the colder fluid above. However, studies of convective flow in NMR experiments have revealed that some convective flow is almost always present in typical diffusion NMR experiments.[48,49] In a temperature-regulated NMR probe, the airflow around the sample is disrupted by the highly asymmetric space around the tube, and vertical and horizontal temperature gradients form. Horizontal temperature gradients can drive convection through Hadley flow.[50] Importantly, this convective flow is not a critical phenomenon. As a result, the effects of convection on diffusion measurements have been historically underestimated. Any experimental measurement of diffusion coefficients needs to take into account the likely presence of convection and its effect on the data acquired. While the experimental protein diffusion coefficients were not acquired with convection-compensated sequences, the onset and magnitude of convective flow depend on the density, viscosity, and volumetric thermal expansion coefficient of the fluid. For the two solvents used here, H2O and D2O, these quantities are such that convection is unlikely to form and, if it does, it will only have a small effect.[49] Figures 4–6, all depicting proteins in aqueous solution, confirm this analysis and indicate that there is no evidence of convection in the diffusion data presented here.
If diffusion NMR data shows any evidence of convection, the effects of convection can be reduced using narrower bore tubes, convection-compensated diffusion NMR pulse sequences,[51] or both if signal-to-noise is sufficient.

Extending SEGWE Further

This work is the first step in extending SEGWE to more general mixed-solvent systems. To achieve this goal, at least two more uncertainties need addressing.

Nonideal Mixing

The mixing rules discussed in the Introduction section assume that the fluids mix ideally. This is valid for the solvent system used in this work, which mixes nearly ideally. Other mixed-solvent systems will not. Even where there are only small differences in viscosity or chemical structure, nonideal mixing behavior can be observed. Nonideal mixing necessitates the addition of an extra term to eq 8 to give eq 11,[32] $\ln\eta_{1,2} = x_1\ln\eta_1 + x_2\ln\eta_2 + \Delta$, where Δ represents the effects of nonideal mixing. In the Eyring equation, this additional term is described as a minor correction for the excess free energy of mixing. In the Grunberg–Nissan equation, it is further specified as x1x2G12, where G12 is an interaction parameter that depends on the mixture components and temperature. In both cases, the effect of this term on eq 9 is to add an additional exponential governing only the ideality, or otherwise, of mixing. Therefore, different mixed-solvent systems will need to be studied, particularly those known to exhibit nonideal mixing behavior. Attempts have been made to rationalize the viscosities of mixed solvents for many decades, resulting in a large resource of historic literature data on the topic,[52−54] which will support this investigation.
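The nonideal correction described above can be sketched as a small extension of the logarithmic mixing rule, using the Grunberg–Nissan form of the extra term, x1x2G12. The function names and all numerical values are illustrative assumptions; the second function simply rearranges the same relation to back out G12 from one measured mixture viscosity.

```python
import math


def grunberg_nissan(x1, eta1, eta2, g12=0.0):
    """ln(eta_mix) = x1 ln(eta1) + x2 ln(eta2) + x1 x2 G12.

    g12 = 0 recovers the ideal logarithmic mixing rule.
    """
    x2 = 1.0 - x1
    return math.exp(x1 * math.log(eta1) + x2 * math.log(eta2) + x1 * x2 * g12)


def interaction_parameter(x1, eta1, eta2, eta_mix_measured):
    """Back out G12 from a single measured mixture viscosity (0 < x1 < 1)."""
    x2 = 1.0 - x1
    ideal_log = x1 * math.log(eta1) + x2 * math.log(eta2)
    return (math.log(eta_mix_measured) - ideal_log) / (x1 * x2)


# Hypothetical pure-component viscosities (Pa s) and interaction parameter.
eta_ideal = grunberg_nissan(0.4, 0.9e-3, 1.1e-3)
eta_nonideal = grunberg_nissan(0.4, 0.9e-3, 1.1e-3, g12=0.8)
g12_back = interaction_parameter(0.4, 0.9e-3, 1.1e-3, eta_nonideal)
print(eta_ideal, eta_nonideal, g12_back)
```

The correction vanishes at both pure-solvent limits, so it only reshapes the viscosity curve between them.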

Gierer–Wirtz Function

The use of proteins as test molecules in this work has an advantage in minimizing the importance of the Gierer–Wirtz term. For the five proteins studied here, the effect of fGW ranges from ca. 4 to 9%. However, a failure to include eq in SEGWE calculations will lead to systematically higher-than-expected predicted diffusion coefficients, as observed in Figure S45a. More relevant for this study is the observation that α and, hence, fGW do not change significantly upon deuteriation of the solvent. In this work, a weighted average of the two fGW values has been used, even though the differences between them ultimately proved small. For smaller solutes and different solvent mixtures, the effect of changing the solvent radius will be greater. The Gierer–Wirtz function was originally derived entirely theoretically, and this approach could be revisited using different solvent radii in the derivation. The effect of mixed solvents can also be experimentally investigated using, e.g., solvent mixtures consisting of two solvents with very similar bulk viscosities but with different radii.
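
The weak dependence of fGW on deuteriation can be illustrated with a short sketch. It assumes the forms used in the earlier SEGWE literature, α = (MW_solvent/MW_solute)^(1/3) and fGW = 1/(3α/2 + 1/(1 + α)). The function names, the 14.3 kDa solute mass, and the mole-fraction weighting of the two pure-solvent values are illustrative assumptions rather than details taken from this work.

```python
MW_H2O = 18.015  # g/mol
MW_D2O = 20.028  # g/mol


def alpha(mw_solvent, mw_solute):
    """Solvent-to-solute radius ratio, assuming equal effective densities."""
    return (mw_solvent / mw_solute) ** (1.0 / 3.0)


def f_gw(a):
    """Gierer-Wirtz factor for radius ratio a (assumed functional form)."""
    return 1.0 / (1.5 * a + 1.0 / (1.0 + a))


def f_gw_mixed(x_d2o, mw_solute):
    """Average of the two pure-solvent factors, weighted by mole fraction.

    The weighting scheme is an assumption made for this sketch.
    """
    f_h = f_gw(alpha(MW_H2O, mw_solute))
    f_d = f_gw(alpha(MW_D2O, mw_solute))
    return (1.0 - x_d2o) * f_h + x_d2o * f_d


mw_protein = 14300.0  # g/mol, hypothetical globular protein
for x in (0.0, 0.5, 1.0):
    print(f"x(D2O) = {x:.1f}: fGW = {f_gw_mixed(x, mw_protein):.4f}")
```

Under these assumptions, the two pure-solvent values for a solute of this size differ by well under 1%, so the choice of weighting scheme has little influence on the predicted diffusion coefficient. For a much smaller solute, α is larger and the difference grows.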

Conclusions

In this work, the SEGWE method is extended and shown to successfully estimate the diffusion coefficients of both globular proteins and IDPs, in a wide range of mixed protiated–deuteriated aqueous solvents at a range of temperatures. This allows for confirmation or estimation of protein molecular mass and proves capable of distinguishing unstructured proteins from their globular counterparts. The original SEGWE method was developed by making pragmatic decisions about assumptions underpinning the Stokes–Einstein equation. The successful extension of SEGWE to ideally mixed aqueous solvents makes similarly pragmatic decisions about the mixing rules for liquid viscosities. The extension to mixed solvents started in this work will also provide a firm foundation for further extensions of SEGWE to handle the more general question of mixed solvents of all types, not just mixed protiated–deuteriated solvents. Successfully demonstrated on proteins for the first time here, the original SEGWE model has found application in a wide range of chemical sciences, from simple organic molecules and natural products to organometallics and clusters. The extension to mixed solvents will only further increase the scope and range of use of the method. To aid its wider use, the extended SEGWE method has been implemented as an Excel spreadsheet, as detailed in Supporting Information 7, and has been made available for free download from doi: http://dx.doi.org/10.17632/fn64x6vpn4.1.

25 in total

1. Mass fractal dimension and the compactness of proteins.

Authors: Matthew B Enright; David M Leitner
Journal: Phys Rev E Stat Nonlin Soft Matter Phys Date: 2005-01-27

2. Unfolding process of rusticyanin: evidence of protein aggregation.

Authors: Luis A Alcaraz; Antonio Donaire
Journal: Eur J Biochem Date: 2004-11

3. A 1H-NMR thermometer suitable for cryoprobes.

Authors: M Findeisen; T Brand; S Berger
Journal: Magn Reson Chem Date: 2007-02 Impact factor: 2.447

4. Sample convection in liquid-state NMR: why it is always with us, and what we can do about it.

Authors: I Swan; M Reid; P W A Howe; M A Connell; M Nilsson; M A Moore; G A Morris
Journal: J Magn Reson Date: 2014-12-27 Impact factor: 2.229

5. Size Determination of Protein Oligomers/Aggregates Using Diffusion NMR Spectroscopy.

Authors: Pancham S Kandiyal; Ji Yoon Kim; Daniel L Fortunati; K H Mok
Journal: Methods Mol Biol Date: 2019

6. Mechanistic Studies on a Cu-Catalyzed Asymmetric Allylic Alkylation with Cyclic Racemic Starting Materials.

Authors: Emeline Rideau; Hengzhi You; Mireia Sidera; Timothy D W Claridge; Stephen P Fletcher
Journal: J Am Chem Soc Date: 2017-04-10 Impact factor: 15.419

7. Monitoring Protein Global and Local Parameters in Unfolding and Binding Studies: The Extended Applicability of the Diffusion Coefficient─Molecular Size Empirical Relations.

Authors: Csenge Lilla Szabó; Fanni Sebák; Andrea Bodor
Journal: Anal Chem Date: 2022-05-26 Impact factor: 6.986

8. Self-assembly of singlet-emitting double-helical silver dimers: the curious coordination chemistry and fluorescence of bisquinolylpyridone.

Authors: Charlotte M A Farrow; Geoffrey R Akien; Nathan R Halcovitch; James A Platts; M P Coogan
Journal: Dalton Trans Date: 2018-03-12 Impact factor: 4.390

9. Characterization of reactive intermediates by multinuclear diffusion-ordered NMR spectroscopy (DOSY).

Authors: Deyu Li; Ivan Keresztes; Russell Hopson; Paul G Williard
Journal: Acc Chem Res Date: 2009-02-17 Impact factor: 22.384

10. A novel fully water-soluble Cu(I) probe for fluorescence live cell imaging.

Authors: Maria Laura Giuffrida; Enrico Rizzarelli; Gaetano A Tomaselli; Cristina Satriano; Giuseppe Trusso Sfrazzetto
Journal: Chem Commun (Camb) Date: 2014-09-07 Impact factor: 6.222