
Connecting Longitudinal and Transverse Relaxation Rates in Live-Cell NMR.

Sarah Leeb¹, Fan Yang¹, Mikael Oliveberg¹, Jens Danielsson¹.

Abstract

In the cytosolic environment, protein crowding and Brownian motions result in numerous transient encounters. Each such encounter event increases the apparent size of the interacting molecules, leading to slower rotational tumbling. The extent of transient protein complexes formed in live cells can conveniently be quantified by an apparent viscosity, based on NMR-detected spin-relaxation measurements, that is, the longitudinal (T1) and transverse (T2) relaxation. From combined analysis of three different proteins and surface mutations thereof, we find that T2 implies significantly higher apparent viscosity than T1. At first sight, the effects on T1 and T2 thus seem nonunifiable, consistent with previous reports on other proteins. We show here that the T1 and T2 deviation is actually not an inconsistency but an expected feature of a system with fast exchange between free monomers and transient complexes. In this case, the deviation is basically reconciled by a model with fast exchange between the free-tumbling reporter protein and a transient complex with a uniform 143 kDa partner. The analysis is then taken one step further by accounting for the fact that the cytosolic content is by no means uniform but comprises a wide range of molecular sizes. Integrating over the complete size distribution of the cytosolic interaction ensemble enables us to predict both T1 and T2 from a single binding model. The result yields a bound population for each protein variant and provides a quantification of the transient interactions. We finally extend the approach to obtain a correction term for the shape of a database-derived mass distribution of the interactome in the mammalian cytosol, in good accord with the existing data of the cellular composition.

Substances：
Proteins

Year: 2020 PMID： 33179918 PMCID： PMC7735724 DOI： 10.1021/acs.jpcb.0c08274

Source DB: PubMed Journal: J Phys Chem B ISSN： 1520-5207 Impact factor: 2.991

Introduction

The cell interior is an immensely complex and crowded environment, both in prokaryotic and eukaryotic cells. Hence, a freely diffusing protein is bound to undergo countless collisions and transient interactions with the surrounding macromolecules.[1−3] Most frequently, the collisions are nonelastic, that is, a transient, so-called, “encounter complex” is formed, allowing for Brownian-surface diffusion and a close-range search for putative specific recognition and binding. Preferably, such weak interactions may result in colocalization of functionally related proteins, often called the quinary structure,[4−6] which can be described as a functional subset of all transient interactions. The duration of this surface search has been postulated to be under evolutionary optimization,[1,7,8] based on the argument that if the “hand shake” is too brief, the proteins may fail to recognize their partner and, conversely, if it is too long, they will simply waste their precious time. Consistent with this idea, the repulsive net-charge between cytosolic proteins across evolutionarily divergent organisms seems optimized for marginal colloidal stability.[9] For a given protein, each encounter-complex formation results in a transient reduction in rotational and translational diffusion.[10−13] The rotational component (Drot) of this retardation is well-suited to be quantified by in-cell nuclear magnetic resonance (in-cell NMR). 
In essence, the rotational motions are here coupled to the longitudinal (T1) and transverse (T2) relaxation times[14] that, in turn, provide a measure of a particular protein’s propensity to interact with the neighboring intracellular components.[6,10,11,15] The quantitative link between the molecular motions and the relaxation time is described by the overall correlation time (τc), which, in the general case, is given by the reciprocal sum of the times for global rotation (τr), local motions (τloc), and additional components such as chemical exchange (τex): $\tau_c^{-1} = \tau_r^{-1} + \tau_{loc}^{-1} + \tau_{ex}^{-1}$. The effects of the in-cell encounters on the molecular motions are expected to span several timescales and can be both global and local, but for folded proteins, the local motions are of less importance, that is, T1 and T2 mainly report on the rotational correlation time, τr.[16] Previous studies have shown that the extent of Drot retardation upon cell internalization depends not only on the protein’s physicochemical surface properties[6,10,11,17] but also on the type of host cell.[10,11,13] This in-cell retardation is manifested as increased T1 and decreased T2 relaxation times[11,14,18,19] (Figure , Supporting Information 1), where most commonly T2 has been exploited to quantify the retardation effect, both directly by determination of the relaxation time[10,11,19] and indirectly through line-broadening analysis.[6,20]
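To make the link between rotational tumbling and the two relaxation rates concrete, the sketch below evaluates 15N R1 = 1/T1 and R2 = 1/T2 for a rigid, isotropically tumbling amide using the standard dipolar plus chemical-shift-anisotropy expressions. It is an illustration under stated assumptions and not necessarily the closed-form expression of Supporting Information 1: the N–H bond length (1.02 Å), the 15N chemical shift anisotropy (−160 ppm), and the function names are choices made here.

```python
import math

# Physical constants (SI units)
MU0_OVER_4PI = 1e-7          # T m / A
HBAR = 1.054571817e-34       # J s
GAMMA_H = 2.6752218744e8     # rad s^-1 T^-1, 1H
GAMMA_N = -2.71261804e7      # rad s^-1 T^-1, 15N

# ASSUMED geometry and shielding parameters (typical literature values)
R_NH = 1.02e-10              # m, N-H bond length
CSA_N = -160e-6              # 15N chemical shift anisotropy (dimensionless)


def spectral_density(omega, tau_c):
    """Lorentzian spectral density for isotropic rigid tumbling."""
    return 0.4 * tau_c / (1.0 + (omega * tau_c) ** 2)


def relaxation_rates_15n(tau_c, proton_freq_hz=700e6):
    """Return (R1, R2) in s^-1 for an amide 15N with correlation time tau_c (s)."""
    omega_h = 2.0 * math.pi * proton_freq_hz
    omega_n = omega_h * abs(GAMMA_N) / GAMMA_H
    # Squared dipolar and CSA coupling constants
    d2 = (MU0_OVER_4PI * HBAR * GAMMA_H * GAMMA_N / R_NH ** 3) ** 2
    c2 = (omega_n * CSA_N) ** 2 / 3.0

    def j(omega):
        return spectral_density(omega, tau_c)

    r1 = (d2 / 4.0) * (j(omega_h - omega_n) + 3.0 * j(omega_n)
                       + 6.0 * j(omega_h + omega_n)) + c2 * j(omega_n)
    r2 = ((d2 / 8.0) * (4.0 * j(0.0) + j(omega_h - omega_n) + 3.0 * j(omega_n)
                        + 6.0 * j(omega_h) + 6.0 * j(omega_h + omega_n))
          + (c2 / 6.0) * (4.0 * j(0.0) + 3.0 * j(omega_n)))
    return r1, r2


for tau_ns in (2, 4, 10, 50):
    r1, r2 = relaxation_rates_15n(tau_ns * 1e-9)
    print(f"tau_c = {tau_ns:>2} ns  R1 = {r1:.2f} s^-1  R2 = {r2:.2f} s^-1")
```

In this rigid-rotor picture, slower tumbling raises R2 almost linearly, whereas R1 passes through a maximum and then falls, so the two rates respond differently to the same retardation.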

Figure 1

Probe proteins in-cell NMR properties and relaxation. (A) shows the proteins TTHApwt (blue), HAH1pwt (red), and SOD1barrel (green) including secondary structure elements. The electrostatic surface of each protein is displayed, with blue-colored patches belonging to the basic, positively charged residues arginine or lysine and the red-colored patches belonging to the acidic, negatively charged residues glutamate or aspartate. The surface charge mutations are highlighted as black spheres. (B) depicts HMQC spectra of the reporter protein electroporated into live A2780 cells. (C) In-cell NMR relaxation data of the three basis reporter proteins are shown. Signal intensity attenuations obtained from the R1 (dark gray) and R2 (light gray) experiments are shown as filled circles, and the corresponding single-exponential fits are shown as solid lines. The error bars in the relaxation rates are estimated from the signal-to-noise ratio in each experiment (Supporting Information Methods).

To enable comparison of proteins with different mass, we translate here the observed relaxation times into “apparent viscosities” (ηapp),[11,13,20,21] that is, the microscopic viscosities that yield the same relaxation parameters as observed in cells.[10] The ηapp values are derived from the reference curves, in which T1 and T2 are determined for each protein in increasing amounts of glycerol with well-defined viscosities. Even so, this mean-field approach results in an apparent disagreement between T1 and T2 in-cell relaxation times, where transverse relaxation reports on much higher ηapp than longitudinal relaxation. To pinpoint this T1 and T2 inconsistency, we expand here the mean-field retardation approximation to a binding model, where the reporter protein is set to be in fast exchange between a free monomeric state and a bound state with a cellular partner. This partner is either, in the simplest case, a protein of uniform size or a distribution of proteins with different sizes. 
As a result, the observed T1 and T2 values for our three different proteins and their mutants can be accounted for by a single model. Our basic observation is that the relaxation data features can be accurately described by the minimal assumption of a single partner of uniform size. This accuracy is retained upon accounting for more realistic partner-mass distributions. The best partner-mass distribution is that obtained from the naturally occurring cytosolic proteins but with an increased tail of higher mass species. Consistent with the cellular composition, this tail is suggested to report on transient interactions with higher order complexes and larger biomolecular structures, such as membranes and ribosomes. Thus, by accounting for the full distribution of putative interaction partners, we are allowed to connect the rotational retardation to the fraction of transiently bound reporter proteins, pB. This bound population constitutes also a quantitative link between the reporter protein properties and their interaction pattern in cells. A final implication of this result is that information about the particular partner-size distributions under a given set of conditions can be obtained directly from T1 and T2, providing a new handle for exploring the macromolecular machinery at work in live cells.

Experimental Procedures

Protein Mutagenesis, Expression, and Purification

Plasmids were transformed into Escherichia coli BL21(DE3) expression strains, and point mutations were introduced through site-directed mutagenesis. For 15N-isotope-enriched protein production, minimal medium [0.02 M KH2PO4, 0.04 M Na2HPO4, 0.1 M NaCl, 2 mM MgSO4, 0.4% (w/v) glucose, and trace metals (1 mM MgCl2, 15 μM CaCl2, 1 μM FeSO4, 32 nM AlCl3, 33 nM CoCl2, 12 nM CuSO4, 120 nM KI, 100 nM MnSO4, 4 nM NiSO4, 16 nM Na2MoO4, 15 nM ZnSO4, 5 nM KCr(SO4)2, 15 nM H3BO4, and 100 μM citric acid), pH 7.0] supplemented with carbenicillin and 0.1% (w/v) 15NH4Cl was inoculated and grown at 37 °C until OD600 = 0.6–0.8. Isopropyl β-d-1-thiogalactopyranoside (IPTG) (0.5 mM) was added for 4 h overexpression. Cells were harvested at 5000g for 10 min at 4 °C (Supporting Information Methods). Protein purification of SOD1barrel is described in detail by Danielsson et al.,[22] while detailed protocols for TTHApwt and HAH1pwt are described by Mu et al.[6] and Leeb et al.[10]

Protein Transfer into Mammalian Cells for In-Cell NMR

Cell growth and protein transfer by electroporation were as described in the literature.[10] In short, human ovary adenocarcinoma A2780 cells were grown to 70–90% confluence. Approximately 60 × 10⁶ cells were suspended and supplemented with the respective reporter proteins to 1.5 mM final protein concentration. Electroporation was conducted by 115 V, 14–16 ms poring pulses followed by 5 × 50 ms transfer pulses at 20 V. After electroporation, the cells were washed, plated, and left for 5 h recovery. Then, the live cells were transferred to a 4 mm flat-bottomed NMR tube (BMS-004B, Shigemi Inc., Tokyo, Japan) (Supporting Information Methods).

In-Cell Relaxation Measurements

All NMR data were acquired on a Bruker AVANCE III 700 MHz spectrometer with a cryogenically cooled triple-resonance probe. All experiments were performed at 37 °C using an “interleaved” acquisition method.[10] Both R2 and R1 measurements were carried out using one-dimensional 15N-filtered heteronuclear single quantum coherence (HSQC)-based pulse sequences with three relaxation delays each, ranging between 0 and 68 ms in the case of R2 and 10 and 500 ms in the case of R1.

Quantification of Protein Leakage

Leakage of the reporter protein into the interstitial fluid surrounding the cells was quantified by carefully removing the cell slurry from the NMR tube. After spinning 5 min at 200 g, the supernatant was transferred to a fresh NMR tube and 1D 1H-SOFAST–HMQC was recorded (Figure S2). Integrating over the same spectral regions in both samples and then calculating the ratio after correcting for dilution in the supernatant samples showed that protein leakage typically is less than 10% (Figure S2). In addition, there is evidence that most of the leakage is introduced during sample preparation of the supernatant.[6]

Relaxation Measurements of In Vitro Glycerol Series

200 μM protein, 10 mM MES pH 6.5, 10% (v/v) D2O, and increasing amounts of deuterated glycerol-d8 (98% D) from 0 to 50% (v/v) were used. Both R1 and R2 were determined with 6–10 relaxation delays (Supporting Information Methods). Data was finally analyzed using MATLAB (MathWorks, MA, USA) scripts as described in the literature.[10]

Lysozyme Crowding

100 μM TTHApwt, 20 mM MES pH 6.5, and 10% (v/v) D2O together with 50 or 150 mg/mL human lysozyme (Sigma-Aldrich) were used. Both R1 and R2 were determined using 5–6 relaxation delay times. Data were analyzed with in-house MATLAB (MathWorks, MA, USA) scripts. Since lysozyme is pH-active, the sample pH had decreased to 6.0 and 5.6 for 50 and 150 mg/mL, respectively. Fluorescence spectroscopy was conducted under similar conditions, where 0–20 mM TTHApwt was titrated onto 100 mM human lysozyme, upon which the induced change in intrinsic tryptophan fluorescence from lysozyme was detected (Figure S5).

Curating and Analyzing the Cytosolic Proteome Database

The proteomic composition of the mammalian cytosol was estimated from the database by Geiger et al.,[23] where lysate proteins of eleven human cancer cell lines were quantified by mass spectrometry. Of the 11,731 proteins in the Geiger list, 4.2% were given ambiguous UniProt-IDs, and these sequences were omitted. The calculations of charge density of the proteins are described in the Supporting Information Methods.

Results and Discussion

Model System

Quantification of NMR relaxation in live mammalian cells requires data with a good signal-to-noise ratio. This means that the proteins used must exhibit only weak interactions with the surroundings as extensive interactions lead to severe line-broadening.[5] We use here three well-characterized model proteins: (i) the nonmetal binding variant of the putative heavy-metal binding protein TTHA1718 from Thermus thermophilus (TTHApwt),[6] (ii) the corresponding de-metalated variant of the copper chaperone HAH1 (HAH1pwt),[6] and (iii) the loop-truncated variant of the human superoxide dismutase SOD1 (SOD1barrel)[22] (Figure ). TTHApwt and HAH1pwt are structural homologues, but with distinct surface properties, making them well suited as reporter proteins for surface-mediated interactions in cells. All three proteins exhibit good NMR-relaxation properties in live mammalian cells,[10] where well-resolved HMQC spectra (Figure ) indicate that they only diffusely interact with the cell interior.[10,24] That is, they mainly probe nonspecific transient encounters. Nonetheless, in previous studies, we have shown that these three proteins exhibit different degrees of transient interactions related to their surface properties. Of particular importance here are the surface net-charge density and surface hydrophobicity.[6,10] To expand the surface-property space, we included three further surface variants of the model proteins above, all of which alter the surface net charge by 2 units: a Glu to Lys substitution on the TTHApwt background (TTHAE32K), a Lys to Glu substitution on the HAH1pwt background (HAH1K57E), and an Arg to Glu substitution on SOD1barrel (SOD1R100E) (Figure ). This set of six proteins then serves as reporters on the transient interactions between the protein surfaces and the cellular surroundings. 
For the in-cell NMR measurement, the proteins were electroporated into A2780 cells, obtaining approximately 20 μM intracellular concentration,[10] corresponding to approximately 10 μM total sample concentration.

Longitudinal and Transverse Relaxation Report Different Apparent Viscosity in Human Cells

To quantify the mean-field parameter, ηapp, we determined the global 15N R1 = T1⁻¹ and 15N R2 = T2⁻¹ of all six proteins in live A2780 cells at 37 °C[10] (Figure , Table , Figures S1 and S2). Hereinafter, we will use the relaxation rate (R) in parallel to the relaxation time (T), for clarity in the analysis. Due to the relatively low total concentration of isotope-labeled reporter protein in the cell sample and the limited lifetime of the cells in the NMR tube,[25] we were only able to record three relaxation delay times (Figure ), where we integrate over a set of amide signals (Supporting Information Methods). The interscan delay was set to 1 s, which is bordering on being too short for the R1 relaxation rates of 1.5–2 s⁻¹. As a possible consequence of this trade-off between interscan relaxation delay and the total experimental time, the R1 data show some deviation from ideal monoexponential behavior, most evidently for the shortest delay (Figure ). In addition, a contributing factor to the deviation from single exponentiality could stem from the presence of two populations of the reporter protein: cytosolic and leaked protein. However, the leakage is in most cases negligible, no systematic deviations are found in the R2 attenuation curves (Figure ), and no correlation between the magnitude of deviation and the amount of leakage can be observed. Furthermore, fitting the data to a biexponential yielded nonphysiological relaxation values (Figure S3). Taken together, we conclude that leakage is not a major contributing factor to the deviation from monoexponentiality.
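As a minimal sketch of how a rate can be extracted from only three relaxation delays, the example below fits a monoexponential decay by linear regression on the logarithm of the signal. The data are synthetic and noise-free, the middle delay (34 ms) is hypothetical, and the function name is introduced here for illustration; the actual fitting procedure is described in the Supporting Information Methods and may differ.

```python
import numpy as np


def fit_monoexponential_rate(delays_s, intensities):
    """Fit I(t) = I0 * exp(-R * t) by linear regression on log(I); return (R, I0)."""
    delays = np.asarray(delays_s, dtype=float)
    log_intensity = np.log(np.asarray(intensities, dtype=float))
    slope, intercept = np.polyfit(delays, log_intensity, 1)
    return float(-slope), float(np.exp(intercept))


# Synthetic example: three delays spanning 0-68 ms (middle value is hypothetical)
true_rate = 20.0  # s^-1, illustrative only
delays = [0.0, 0.034, 0.068]
signal = [np.exp(-true_rate * t) for t in delays]

rate, i0 = fit_monoexponential_rate(delays, signal)
print(f"fitted rate = {rate:.3f} s^-1, I0 = {i0:.3f}")
```

With three points, a two-parameter fit leaves a single degree of freedom, which is why deviations from monoexponential behavior are hard to assign to a specific cause from the fit alone.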

Table 1

Collected Physicochemical Properties and In-Cell Relaxation Rates of the Reporter Proteins in Human A2780 Cells

Protein	M_w (Da)	net charge^a	R₁ (s⁻¹)	R₂ (s⁻¹)^b
TTHA^pwt	7009	–1.47	2.08 ± 0.14	12.93 ± 0.02
TTHA^E32K	7008	0.50	1.96 ± 0.50	22.62 ± 2.26
HAH1^pwt	7353	0.94	2.12 ± 0.27	21.08 ± 4.41
HAH1^K57E	7354	–1.08	2.04 ± 0.29	11.76 ± 1.56
SOD1^barrel	10962	–0.70	1.55 ± 0.37	24.95 ± 2.67
SOD1^R100E	10949	–2.50	1.74 ± 0.31	16.63 ± 2.26

a Net charge calculated using propKa 3.0.[50]

b Data from Leeb et al.[10]

To test the reproducibility, we determined R1 of HAH1pwt twice, with R1 = 2.12 ± 0.27 and 2.09 ± 0.18 s⁻¹, indicating the precision of the relaxation measurements. R2 precision has previously been shown to be similar.[10] We compared the R1 and R2 values to reference values obtained in water–glycerol mixtures to get ηapp (Supporting Information 2, Figure S4, Table S1). As expected from previous work,[6,10,11] the results show that the R2-derived ηapp values increase with increased protein net-charge (Figure , Table , Figure S4). This complies with the notion that the more positively charged the proteins are, the more strongly they interact with the intracellular environment.[6,10,11,17,26−29] It is further apparent that the R1-derived ηapp values are much less affected (Figure ), showing lower apparent viscosity as well as nearly no charge dependence. Using the R2-derived ηapp to predict the corresponding R1 values thus results in poor agreement with the observed values (Figure ), with high rmsd = 4.86 and low Pearson correlation coefficient r² = 0.23 between the measured and calculated relaxation rates. Nonetheless, the observation that R1 and R2 yield different ηapp is in good agreement with previous findings.[11,19] From 19F NMR relaxation analysis, this discrepancy has been suggested to stem from 19F R2 being more sensitive to transient interactions, while 19F R1 reports mainly on local motions.[19] Basically, this shows that the mean-field approach at some level fails to fully describe the in-cell effect. For comparison, 15N R1 and 15N R2 of the same proteins in glycerol–water mixtures fall well within the theoretical predictions (Figure , Supporting Information 1, Table S1).
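The conversion of an in-cell rate to ηapp via a glycerol reference curve can be sketched as a one-dimensional interpolation. The reference numbers below are hypothetical placeholders, not the measured glycerol series, and the function name is introduced here; only R2 is treated because it rises monotonically with viscosity, whereas R1 is in general not monotonic in the correlation time and would need a more careful inversion.

```python
import numpy as np

# HYPOTHETICAL glycerol reference curve for one protein (illustrative numbers only)
ETA_REF_CP = np.array([0.7, 1.0, 1.5, 2.5, 4.0])   # viscosity, cP
R2_REF = np.array([6.0, 8.5, 12.5, 20.5, 32.0])    # s^-1, increasing with viscosity


def apparent_viscosity_from_r2(r2_obs, r2_ref=R2_REF, eta_ref=ETA_REF_CP):
    """Linearly interpolate the reference curve to the viscosity that gives r2_obs."""
    if not (r2_ref[0] <= r2_obs <= r2_ref[-1]):
        raise ValueError("observed R2 lies outside the reference curve")
    return float(np.interp(r2_obs, r2_ref, eta_ref))


print(apparent_viscosity_from_r2(16.5))
```

Refusing to extrapolate beyond the reference series is a deliberate choice in this sketch, since a mean-field viscosity outside the calibrated range would not be supported by the reference data.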

Figure 2

NMR relaxation data and apparent viscosity derived therefrom. (A,B) NMR relaxation rates as functions of rotational correlation time, τr, and molecular weight Mw, at 700 MHz (16.5 T) field strength. The colored circles are the measured relaxation rates [(A) R1, (B) R2] for the three proteins (HAH1pwt: red, SOD1barrel: green and TTHApwt: blue) in increasingly viscous glycerol solutions. While τr values calculated from the two relaxation rates (Supporting Information 1) fit the predicted theoretical values (black line) in the case of the in vitro glycerol data, clear deviations from the theory are found for the in-cell relaxation data (triangles), where the brighter colors correspond to the respective surface mutation. (C) Apparent viscosities ηapp derived from transverse, R2 (squares), and from longitudinal in-cell relaxation, R1 (circles), plotted against protein net charge, where the colors are as in (A,B). The lines are empirically fitted exponential curves, with an offset corresponding to the intrinsic viscosity of water.[10] The marked discrepancy between the obtained ηapp values suggests that the mean-field ηapp model is insufficient for explaining changes in NMR relaxation due to intracellular encounters.

Table 2

Apparent Viscosities ηapp Derived from R1 and R2

protein	η^app,R₁ (cP)^a	η^app,R₂ (cP)
TTHA^pwt	0.86 (−0.12, +0.10)	2.02 ± 0.00
TTHA^E32K	0.96 (−0.48, +0.54)	3.64 ± 0.38
HAH1^pwt	0.78 (−0.26, +0.21)	3.20 ± 0.70
HAH1^K57E	0.84 (−0.24, +0.26)	1.72 ± 0.25
SOD1^barrel	0.81 (−0.23, +0.29)	2.48 ± 0.26
SOD1^R100E	0.68 (−0.16, +0.21)	1.56 ± 0.22

a In the case of R1-derived ηapp, the errors are asymmetric and are shown in parentheses.

Figure 3

Agreement between observed and calculated reduced relaxation rates for the different models. The calculated R1 = T1⁻¹ values in (A) are from the mean-field ηapp approach; note the different axis scale in this panel. (B) Corresponds to fast exchange to a binding partner with an optimized mass of 143 kDa. In (C, D), lognormal mass distributions are used, where the shape factors are optimized in (D). The dashed line corresponds to a 1:1 correlation. Reduced R1 values are shown as circles, while reduced R2 values are depicted as squares. The color coding is the same as in Figures and 2.
One explanation for the observed discrepancy is that the intracellular environment, in contrast to glycerol, adds chemical-exchange contributions from local surface interactions. R2 is here known to be most affected through exchange broadening (Rex),[30,31] while R1 is left relatively unaffected. Challenging this idea, we have previously found that with our reporter proteins, the cytosolic enhancement of R2 is largely due to changes in global rotation, τr.[10] In essence, our conclusion is based on the finding that the line-broadening effect in the in-cell NMR spectra is uniform over all residues,[10] suggesting that the effect is global retardation rather than localized exchange effects. The effect on R2 is, moreover, independent of the nuclei type, which would not be the case if Rex would be the dominating factor. Finally, removal of the refocusing CPMG train in the pulse sequence of the R2 experiment[10] renders very similar relaxation rates, which suggests relatively small Rex contributions in the ms regime. Taken together, these observations indicate that exchange processes are not the primary cause for discrepancy in the observed in-cell relaxation effects and that an alternative explanation is to be found.

Transient Binding to a Single Large Partner Reconciles R1 and R2 Data

To seek a model that predicts both R1 and R2, we first extended the mean-field assumption, where the tumbling of the entire protein ensemble is affected homogeneously, to a model where free monomers, with mass M, are in rapid exchange with transient clusters of an average molecular weight Mav (Supporting Information 3). This extension yields a population-weighted average of the relaxation rates of the bound and free state according to

$$R = (1 - p_B)\,F(M) + p_B\,F(M + M_{av}) \qquad (1)$$

where R is either R1 or R2, pB is the fraction of reporter protein bound to the cluster at any given time, and F is the closed-form expression of the relaxation rate as a function of mass (Supporting Information 1). The good in-cell NMR properties of the proteins[10] indicate that the fast-exchange criterion holds since long-lived binding would yield line-broadening beyond the detection limit.[5] This model constitutes a simplified description of the transiently bound state: the encounter complex is pictured as a rigid body with mass M + Mav. However, a formed encounter complex involves Brownian motion along the complex surface, which also affects relaxation. The assumption of a rigid complex may thus result in a slight underestimation of pB. In this analysis, we assume that the correlation time is independent of the direction and that the diffusion tensor is symmetric. Although the former simplification breaks down in highly concentrated heterogeneous environments, the relatively small effects on the relaxation properties of the reporter proteins in A2780[10] suggest a comparably diluted environment. Use of a set of relaxation rates of individual spins located over the structure would allow the determination of the diffusion tensor.[32] However, here, we determine an average relaxation rate from many amide spins, spread out over the structure, justifying the use of an isotropic diffusion tensor. 
The near-linear dependence between R2 and molecular mass (Figure ) means that any encounter complex mass can be accounted for by adjusting pB, while, in contrast, the nonlinear relationship between R1 and mass (Figure ) strongly confines the possible masses of the complex, fixing pB. To benchmark this approach for quantifying weak transient interactions, we determined the effect on the relaxation rates of TTHApwt in the presence of 50 and 150 mg/mL human lysozyme, concentrations comparable with the total protein levels in human cells.[10,33] Here, we find that R1 and R2 agree well with a 1:1 weak transient complex for pB = 0.25 and 0.46, respectively, in good accordance with fluorescence-detected binding affinity (Supporting Information 4, Figure S5). As shown in Figure , the introduction of a bound species with a common interaction partner of Mav = 143 kDa in addition to the free monomer accounts well for the in-cell R1 and R2 values of all six protein variants (Table S2). To ensure that we get the same statistical weight for both R1 and R2 data in the fitting procedure, we use reduced relaxation parameters according to

$$R_{i,j}^{red} = \frac{R_{i,j} - \langle R_i \rangle}{\sigma_i} \qquad (2)$$

where j denotes reporter protein j, i = 1 or 2, brackets denote the average over the full dataset (e.g., all R2 values), and σ is the standard deviation of the dataset. The correlation between the observed and calculated R1 now approaches a unit line with r² = 0.93 and rmsd = 0.26 (Figure ). That is, the in-cell effect can indeed be described by alterations of τr alone, without the need to invoke additional chemical-exchange contributions. Furthermore, this shows that both R1 and R2 can be reconciled in a model where all proteins “feel” the same interaction environment and where the sole difference is the bound fraction. Intriguingly, and somewhat surprisingly, all reporter proteins show relatively low pB values (Table S2) despite being in the crowded cytosol. 
Still, the inert, soluble, and, mainly, negatively charged reporter proteins are expected to be kept soluble by charge repulsion with the negative surroundings, reducing the amount of potential complex-forming contacts.[9] Further, also in the presence of high concentration of the positively charged lysozyme, the negatively charged TTHApwt shows low pB, confirmed by fluorescence experiments (Figure S5), which underlines the low interactivity of this protein. Furthermore, the excellent quality of all the reporter proteins’ in-cell NMR spectra indicates a low pB, as a highly populated complex of high molecular weight would yield a significant increase in R2 if substantially populated (Figure ). To further test this result, we examined next how well the value of Mav = 143 kDa actually agrees with the various sizes of transient complexes expected to be formed in the cell. As a base for comparison, we used all human proteins in the UniProt database[34] annotated as cytosolic (Supporting Information Methods). The protein-mass distribution of this data set complies with both Γ- and lognormal distributions,[35] where the latter gives a somewhat better fit (Figure ). Notably, the employed protein-data set yields no information of abundance, which might bias the distribution. To test for such bias, we used an alternative data set from Geiger et al.,[23] listing the individual sequences and relative abundance of lysate proteins from several mammalian cancer cell lines. The result shows that the abundance-weighted distribution agrees well with the original cytosolic subset from the UniProt database (Supporting Information Methods, Figure S6). On this basis, we stick to the lognormal distribution from the cytosolic subset as an estimate for the interaction partner protein mass distribution, yielding an average protein mass of Mav = 73 kDa. 
Upon lowering the mass in eq from Mav = 143 kDa to Mav = 73 kDa, however, the agreement with the observed relaxation data becomes compromised (Supporting Information 3, Table S2). The 143 to 73 kDa decrease cannot simply be compensated for by a higher population of bound species because R1 and R2 depend differently on mass (Figure ): for a given molecular mass of the transient complex, a change in bound population to fit one type of in-cell relaxation will inevitably lead to a coupled change in the other and, in most cases, to a mismatch.
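The two-state fast-exchange picture of this section can be sketched numerically as follows. This is an illustration under stated assumptions, not the fitting code used in the study: the rigid-rotor dipolar/CSA expressions stand in for the closed-form F of Supporting Information 1, the linear mass-to-correlation-time factor `TAU_PER_KDA` is an assumed rule of thumb, and all function names are hypothetical. Because the model is linear in pB, the least-squares bound fraction has a closed form.

```python
import math

GAMMA_H, GAMMA_N = 2.6752218744e8, -2.71261804e7   # rad s^-1 T^-1
HBAR, MU0_OVER_4PI = 1.054571817e-34, 1e-7          # SI units
R_NH, CSA_N = 1.02e-10, -160e-6                     # ASSUMED bond length (m), 15N CSA
OMEGA_H = 2.0 * math.pi * 700e6                     # 700 MHz 1H frequency
OMEGA_N = OMEGA_H * abs(GAMMA_N) / GAMMA_H
TAU_PER_KDA = 0.45e-9                               # ASSUMED tumbling time per kDa (s)


def rates_from_mass(mass_kda):
    """Stand-in for F(M): (R1, R2) in s^-1 of a rigid isotropic rotor of given mass."""
    tau = TAU_PER_KDA * mass_kda

    def j(omega):
        return 0.4 * tau / (1.0 + (omega * tau) ** 2)

    d2 = (MU0_OVER_4PI * HBAR * GAMMA_H * GAMMA_N / R_NH ** 3) ** 2
    c2 = (OMEGA_N * CSA_N) ** 2 / 3.0
    r1 = d2 / 4 * (j(OMEGA_H - OMEGA_N) + 3 * j(OMEGA_N)
                   + 6 * j(OMEGA_H + OMEGA_N)) + c2 * j(OMEGA_N)
    r2 = (d2 / 8 * (4 * j(0.0) + j(OMEGA_H - OMEGA_N) + 3 * j(OMEGA_N)
                    + 6 * j(OMEGA_H) + 6 * j(OMEGA_H + OMEGA_N))
          + c2 / 6 * (4 * j(0.0) + 3 * j(OMEGA_N)))
    return r1, r2


def two_state_rates(p_bound, mass_kda, partner_kda):
    """Population-weighted (R1, R2) for fast exchange between free and bound states."""
    free = rates_from_mass(mass_kda)
    bound = rates_from_mass(mass_kda + partner_kda)
    return tuple((1.0 - p_bound) * f + p_bound * b for f, b in zip(free, bound))


def fit_bound_fraction(r1_obs, r2_obs, mass_kda, partner_kda):
    """Closed-form least-squares p_B using relative residuals of R1 and R2."""
    free = rates_from_mass(mass_kda)
    bound = rates_from_mass(mass_kda + partner_kda)
    obs = (r1_obs, r2_obs)
    a = [(o - f) / o for o, f in zip(obs, free)]
    b = [(bd - f) / o for o, f, bd in zip(obs, free, bound)]
    p = sum(x * y for x, y in zip(a, b)) / sum(y * y for y in b)
    return min(1.0, max(0.0, p))


# Round trip on synthetic data: 7 kDa reporter, 143 kDa partner, 5% bound
r1_syn, r2_syn = two_state_rates(0.05, 7.0, 143.0)
p_fit = fit_bound_fraction(r1_syn, r2_syn, 7.0, 143.0)
print(f"R1 = {r1_syn:.2f} s^-1, R2 = {r2_syn:.2f} s^-1, fitted p_B = {p_fit:.3f}")
```

Weighting by relative residuals is one simple way to keep R2, which is numerically much larger, from dominating the fit; the reduced relaxation parameters described in the text serve the same purpose across the full dataset.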

Approaching a Physiologically Relevant Situation

Even if we can reconcile both types of relaxation with a two-state binding model, a uniform protein mass does not realistically describe the in-cell situation. To better account for the natural protein-mass heterogeneity, we developed the model to include not only the encounters with a single average partner but with a distribution of species matching the cytosolic protein sizes. A set of putative binding partners was obtained by integrating over the full mass distribution of cytosolic proteins. This allows us to obtain the mass-weighted relaxation rates of the protein-encounter complexes (eq ). Notably, the fast-exchange model in eq relies on the assumption that the reporter protein is equally likely to collide and interact with all molecules in the distribution. Hence, as the surface properties are crucial for interaction and encounter formation,[6,10,17] the net-charge density of different intervals in the mass distribution must overlap. As a control, we divided the distribution into three size regimes, that is, <70, 70–140, and >140 kDa, and calculated the net-charge density[9] (Supporting Information 5). The result confirms that the surface-charge distribution of the three subensembles indeed overlaps nicely (Figure S7). This uniform mass–charge relation allows us to include the full distribution of binding partners in the fast exchange model described in eq , where the relaxation rate of the bound state now is given by

$$R^{B} = \int \rho(M_w)\,F(M + M_w)\,\mathrm{d}M_w \qquad (3)$$

where ρ(Mw) is the size distribution of interaction partners and F(M + Mw) is the closed-form expression for the relaxation rate for a complex between protein j and a binding partner with mass Mw (Supporting Information 3). Optimization of the bound population (pB) for each reporter protein using eqs and 3 and integration over the database-derived lognormal ρ(Mw) (Figure , Table S2) improve the agreement between observed and calculated R1 values, compared to the single average mass of the cytosolic proteins. 
Yet, the lower r² = 0.87 and rmsd = 0.60, together with the systematic underestimate of calculated R1, indicate that the mass distribution of the cytosolic proteins remains shifted toward proteins that are too small to account for the observed in-cell retardation.
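The averaging of the bound-state relaxation rate over a partner-mass distribution can be sketched by numerical integration. The example is a hedged illustration: the lognormal width `SIGMA` is hypothetical (only the 73 kDa mean is taken from the text), the rigid-rotor expressions and the linear factor `TAU_PER_KDA` are assumptions standing in for the closed-form F, and the integration limits are arbitrary truncation choices.

```python
import numpy as np

GAMMA_H, GAMMA_N = 2.6752218744e8, -2.71261804e7   # rad s^-1 T^-1
HBAR, MU0_OVER_4PI = 1.054571817e-34, 1e-7          # SI units
R_NH, CSA_N = 1.02e-10, -160e-6                     # ASSUMED bond length (m), 15N CSA
OMEGA_H = 2.0 * np.pi * 700e6                       # 700 MHz 1H frequency
OMEGA_N = OMEGA_H * abs(GAMMA_N) / GAMMA_H
TAU_PER_KDA = 0.45e-9                               # ASSUMED tumbling time per kDa (s)


def rates_from_mass(mass_kda):
    """Stand-in for F(M): (R1, R2) in s^-1; accepts scalars or NumPy arrays."""
    tau = TAU_PER_KDA * np.asarray(mass_kda, dtype=float)

    def j(omega):
        return 0.4 * tau / (1.0 + (omega * tau) ** 2)

    d2 = (MU0_OVER_4PI * HBAR * GAMMA_H * GAMMA_N / R_NH ** 3) ** 2
    c2 = (OMEGA_N * CSA_N) ** 2 / 3.0
    r1 = d2 / 4 * (j(OMEGA_H - OMEGA_N) + 3 * j(OMEGA_N)
                   + 6 * j(OMEGA_H + OMEGA_N)) + c2 * j(OMEGA_N)
    r2 = (d2 / 8 * (4 * j(0.0) + j(OMEGA_H - OMEGA_N) + 3 * j(OMEGA_N)
                    + 6 * j(OMEGA_H) + 6 * j(OMEGA_H + OMEGA_N))
          + c2 / 6 * (4 * j(0.0) + 3 * j(OMEGA_N)))
    return r1, r2


def lognormal_pdf(m, mu, sigma):
    """Lognormal probability density in the mass m (kDa)."""
    return np.exp(-(np.log(m) - mu) ** 2 / (2 * sigma ** 2)) / (m * sigma * np.sqrt(2 * np.pi))


def trapezoid(y, x):
    """Plain trapezoidal rule, written out to avoid NumPy version differences."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def bound_state_rates(mass_kda, mu, sigma, m_min=1.0, m_max=5000.0, n=4000):
    """Average F(M + Mw) over the partner-mass distribution rho(Mw)."""
    m_w = np.geomspace(m_min, m_max, n)
    rho = lognormal_pdf(m_w, mu, sigma)
    rho = rho / trapezoid(rho, m_w)          # renormalize on the truncated grid
    r1, r2 = rates_from_mass(mass_kda + m_w)
    return trapezoid(rho * r1, m_w), trapezoid(rho * r2, m_w)


SIGMA = 0.8                                   # HYPOTHETICAL shape parameter
MU = np.log(73.0) - SIGMA ** 2 / 2            # chosen so that the mean is 73 kDa
r1_dist, r2_dist = bound_state_rates(7.0, MU, SIGMA)
r1_mean, r2_mean = rates_from_mass(7.0 + 73.0)
print(f"distribution: R1 = {r1_dist:.2f}, R2 = {r2_dist:.2f} s^-1")
print(f"single mean mass: R1 = {float(r1_mean):.2f}, R2 = {float(r2_mean):.2f} s^-1")
```

Under these assumed parameters, the distribution-averaged R2 stays close to R2 at the mean mass because R2 is nearly linear in mass, whereas the averaged R1 comes out larger than R1 at the mean mass, which illustrates why a distribution and a single average partner are not interchangeable for R1.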

Accounting for Larger Cellular Complexes in the Size Distribution Finally Reconnects the T1 and T2 Relaxations

Internalized proteins encounter not only other monomeric proteins but interact also with larger cellular components. Although a diffusing protein is most likely to collide with other proteins, simply because of their large proportion of the cellular dry weight,[36,37] it will also encounter other macromolecular structures and surfaces of larger dimensions. Examples of such structures are the cytoskeleton, membranes, and ribosomes. The amount of transient interactions is mainly determined by the overall macromolecular concentration and surface properties of the interacting molecules,[6,10] where the surface charge seems to be a key determinant. To a first approximation, the surface-charge density of the larger cellular structures can be assumed to follow the same distribution as the cytosolic proteins (Figure ). Most clearly, the cytoskeleton is primarily composed of tubulin, actin, and lamin, all of which show similar net negative surface charge density as soluble proteins. The surface architectures of membranes and ribosomes, however, are partly distinct from proteins and need extra consideration. Mammalian membranes are dynamic bilayers, where the fraction of anionic lipids is between 10 and 30%.[38,39] This translates to a surface net charge of −0.15 to −0.45 e/nm², which is somewhat more negative than for the average protein (−0.065 e/nm²). Considering also the presence of ∼1/10 nm² integral membrane proteins[40] with a positive-inside orientation[41,42] and that the lipid bilayer also consists of nonlipid, noncharged alcohols such as cholesterol, the effective surface charge density can still be assumed to fall in the same range as protein surfaces. Ribosomes are highly negative, abundant megadalton entities,[43,44] where the high surface charge density places ribosomes in the negative tail of the net charge distribution (Figure ). 
Nonetheless, even in this case transient interactions seem to follow a net charge dependence similar to that of protein–protein interactions,[6,10,12] where positively charged proteins even form semistable complexes with ribosomes.[12,45] Indeed, for several proteins, the line broadening observed in in-cell NMR experiments has been attributed mainly to transient ribosome interactions.[46,47]
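The conversion from anionic-lipid fraction to membrane surface charge density quoted above can be checked with a few lines of arithmetic. The sketch below assumes one elementary charge per anionic lipid and a mean area per lipid of about 0.67 nm2; both are assumptions chosen here for illustration and are not values stated in the text.

```python
# Assumptions (not from the text): each anionic lipid carries one negative
# elementary charge, and the mean area per lipid is ~0.67 nm^2.
AREA_PER_LIPID_NM2 = 0.67


def membrane_charge_density(anionic_fraction, area_per_lipid=AREA_PER_LIPID_NM2):
    """Net surface charge density (e/nm^2) for a given anionic-lipid fraction."""
    return -anionic_fraction / area_per_lipid


low = membrane_charge_density(0.10)   # 10% anionic lipids
high = membrane_charge_density(0.30)  # 30% anionic lipids
print(f"{low:.2f} to {high:.2f} e/nm^2")
```

Under these assumptions the 10 to 30% range reproduces the quoted interval; a different assumed area per lipid shifts both limits proportionally.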

Figure 4

Size distribution estimated from in-cell relaxation data agrees well with the database-derived size distribution of cytosolic proteins. (A) Normalized histogram representation of the mass distribution of the cytosolic proteome of human proteins from the UniProt database,[34] with 5217 proteins. A lognormal probability density function was fitted to the histogram (orange line). (B) Net charge density shows an approximately normal distribution centred at −0.065 e/nm2. The dashed line corresponds to zero charge. (C) Optimized size distributions combining R1 and R2 data from the six reporter proteins into a single binding model. The optimized lognormal distribution (black) is compared to the database-derived distribution (orange), corresponding to the fitted distribution in (A). The gray shaded area depicts the variation upon random removal of one (dark gray) or two (bright gray) relaxation pairs. The dashed gray line is the maximum entropy distribution from the family of solutions (Supporting Information 6, Figures S8, S9).

An additional high-mass contribution is the simultaneous interaction with multiple partners,[48] although this possibility seems disfavored here by the low population of complexes formed by our current reporter proteins. Since the relaxation effect upon interaction with more than one partner is, nevertheless, indistinguishable from that with a single large entity, it is reasonable to include all transient complexes as part of the same size and net-charge density distribution. To allow such larger complexes to be included, we optimized pB for each reporter protein while simultaneously optimizing a common mass distribution of interaction partners. For this purpose, we employed a generalized lognormal distribution with free optimization of the distribution parameters (Supporting Information 3). 
The fit optimizes n + 2 parameters for n pairs of relaxation rates (pB for each reporter protein and 2 global distribution parameters). Since n pairs supply 2n observables, the requirement 2n ≥ n + 2 means that at least two different data pairs are needed for a robust fit. We tested this approach by analyzing transient binding of TTHApwt to lysozyme in vitro, using the two R1/R2 pairs from the 50 and 150 mg/mL lysozyme samples to fit a distribution of Mw. Reassuringly, a narrow distribution centered around 15 kDa was obtained (Figure S5). Next, the six pairs of in-cell relaxation rates were used in a global fit. Here, R1 or R2 alone is not enough to determine the distribution because, for a single relaxation rate, any distribution can be accommodated by a shift in pB. We find that the distribution that best reproduces the relaxation data indeed resembles the one predicted from the natural intracellular environment: the distribution not only fits the sizes of the soluble proteins but also includes a high-mass tail that accounts for the larger cytosolic components (Figure , Table S2). The fitted relaxation rates are virtually identical to those from the fast-exchange model with a uniform partner of Mav = 143 kDa (Figure ), with good agreement between calculated and observed R1 values, where r2 = 0.93 and rmsd = 0.26 (Figure ). The distribution optimization from in-cell data yields a family of solutions for a given pB (Figure S8), which, in turn, allows the maximum-entropy distribution to be deduced (Figure S9, Supporting Information 6). This is a distribution that explains the data while still carrying minimal “a priori” information. As the relaxation data can be explained by a single average binding partner with mass Mav, one might expect a symmetric distribution centered at Mav to be the solution carrying the least information. However, the obtained maximum entropy solution resembles the asymmetrical database-derived distribution (Figure ) with the characteristic high-mass tail. 
Although our extended distribution analysis does not lead to higher precision, it strengthens the approach by demonstrating that the NMR data are also consistent with a physiologically realistic cell composition. Moreover, it validates the ansatz of fast exchange in transient encounters and provides a method for estimating the size distribution of macromolecules at work in live cells. To determine the robustness of the obtained distribution, we randomly removed first one and then two data pairs and repeated the fit. The test shows, reassuringly, that the distribution exhibits only small changes (Figure ). As an additional control, we examined whether the obtained distribution still reflects the intrinsic physical coupling between longitudinal and transverse relaxation. In other words, could any set of “inconsistent” R1/R2 pairs be accommodated by some distribution, which would render our conclusions about transient encounters inconclusive? To test this, we prepared a set of relaxation pairs with R1 values systematically offset to lower values, while keeping R2 unchanged. The results show that we cannot obtain any distribution that reproduces these relaxation pairs with the same precision as the real data (Supporting Information 7, Figure S10).
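The forward calculation behind such a fit can be sketched as follows: for a given bound population and partner-mass distribution, R1 and R2 are computed as population-weighted averages over the free state and the ensemble of complexes. This is a minimal illustration under stated assumptions, not the procedure of Supporting Information 3. It assumes standard 15N dipolar and CSA relaxation of a rigid sphere, a 14.1 T field, a rule of thumb of roughly 0.6 ns of rotational correlation time per kDa, an ordinary (not generalized) lognormal distribution, and hypothetical values for the reporter mass, pB, and the distribution parameters.

```python
import math

# Constants in SI units; field strength and the tau_c scaling are assumptions.
MU0_OVER_4PI = 1e-7
HBAR = 1.054571817e-34
GAMMA_H = 2.6752218744e8   # 1H gyromagnetic ratio, rad s^-1 T^-1
GAMMA_N = 2.7116e7         # magnitude of the 15N gyromagnetic ratio
R_NH = 1.02e-10            # N-H bond length, m
CSA = 160e-6               # magnitude of the 15N chemical shift anisotropy
B0 = 14.1                  # assumed field (600 MHz 1H), T
NS_PER_KDA = 0.6           # assumed rule of thumb: tau_c ~ 0.6 ns per kDa


def r1_r2(mass_kda):
    """15N R1 and R2 (s^-1) for a rigid sphere of the given mass."""
    tau_c = NS_PER_KDA * mass_kda * 1e-9
    w_h, w_n = GAMMA_H * B0, GAMMA_N * B0
    d2 = (MU0_OVER_4PI * HBAR * GAMMA_H * GAMMA_N / R_NH ** 3) ** 2
    c2 = (w_n * CSA) ** 2 / 3.0

    def j(w):
        # Spectral density for isotropic tumbling.
        return 0.4 * tau_c / (1.0 + (w * tau_c) ** 2)

    high = j(w_h - w_n) + 3 * j(w_n) + 6 * j(w_h + w_n)
    r1 = d2 / 4 * high + c2 * j(w_n)
    r2 = (d2 / 8 * (4 * j(0.0) + high + 6 * j(w_h))
          + c2 / 6 * (4 * j(0.0) + 3 * j(w_n)))
    return r1, r2


def lognormal_weights(median_kda, sigma, n=400, lo=1.0, hi=1e5):
    """Discretized lognormal mass distribution; weights sum to 1."""
    masses = [lo * (hi / lo) ** (i / (n - 1)) for i in range(n)]
    # A lognormal is Gaussian in ln(mass); the grid is evenly spaced in ln(mass).
    dens = [math.exp(-0.5 * ((math.log(m) - math.log(median_kda)) / sigma) ** 2)
            for m in masses]
    total = sum(dens)
    return masses, [d / total for d in dens]


def observed_rates(reporter_kda, p_bound, median_kda, sigma):
    """Fast-exchange R1, R2: population-weighted average of free and bound rates."""
    r1_free, r2_free = r1_r2(reporter_kda)
    masses, weights = lognormal_weights(median_kda, sigma)
    r1_bound = r2_bound = 0.0
    for m, w in zip(masses, weights):
        r1, r2 = r1_r2(reporter_kda + m)  # complex mass = reporter + partner
        r1_bound += w * r1
        r2_bound += w * r2
    return ((1 - p_bound) * r1_free + p_bound * r1_bound,
            (1 - p_bound) * r2_free + p_bound * r2_bound)


# Hypothetical inputs: 10 kDa reporter, 10% bound, partner median 40 kDa.
r1_free, r2_free = r1_r2(10.0)
r1_obs, r2_obs = observed_rates(10.0, p_bound=0.1, median_kda=40.0, sigma=1.0)
print(f"free:      R1 = {r1_free:.2f} s^-1, R2 = {r2_free:.1f} s^-1")
print(f"10% bound: R1 = {r1_obs:.2f} s^-1, R2 = {r2_obs:.1f} s^-1")
```

In a fit along the lines described in the text, a function like `observed_rates` would be evaluated for every reporter protein, with one pB per protein and the two distribution parameters shared globally.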

Bound Population as Quantification of Transient Interactions in the Cell

Thus, using the full distribution of possible binding partners explains both relaxation parameters with a single physically relevant model: the assumption of fast exchange between a bound and a free state (eq ). Additionally, it provides an intuitive parameter in the form of the population of bound protein species (pB). The latter is here the sole parameter specific to the reporter protein and constitutes a direct measure of the actual interactivity of the protein inside the studied cell type. Comparing the obtained pB values with the apparent viscosity (determined from R2 alone) reveals a common relationship with surface charge: less negative net charge results in both higher ηapp and higher pB. This relationship is further underlined by the clear response to surface-charge mutations (Figure , Table S2). Notably, the response to surface-charge perturbations seems similar for all three reporter proteins, where pB approximately doubles for a net-charge change of two units. The reporter proteins, however, follow distinct trajectories in the net charge-pB plane (Figure ), probably reflecting differences in general interactivity. At the same net charge, SOD1barrel stands out as the most interactive of the reporter proteins, while HAH1pwt tumbles most freely (Figure ). This result is in contrast to the more rudimentary ηapp analysis, where SOD1barrel emerges as less retarded (Figure ). The reason for this underestimate of SOD1barrel interactivity in the ηapp analysis is that the relative change in apparent size upon transient interactions is smaller for a larger protein than for a smaller one,[10] whereas the pB analysis takes this explicitly into account. These results serve as a good example of the advantage of using pB determined from both longitudinal and transverse relaxation when characterizing transient interactions in live cells. 
Since the other readout from the global analysis is the effective size distribution, this method can also be used to observe cell-type differences and perturbations of the macromolecular size distribution in the cytosol. Accordingly, the approach can relatively simply provide new insights into cell function, cellular composition, and the formation of higher-order interactomes in live cells.
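The size dependence that biases the ηapp analysis can be illustrated with a short calculation. The sketch assumes that the rotational correlation time scales linearly with mass and that the apparent retardation is the population-weighted average over the free and bound states. The partner mass of 143 kDa is the uniform-partner value from the text, whereas the reporter masses (7 and 16 kDa) and pB = 0.05 are hypothetical.

```python
M_AV = 143.0  # kDa, uniform partner mass of the fast-exchange model in the text


def apparent_retardation(reporter_kda, p_bound, partner_kda=M_AV):
    """Approximate fold increase in tau_c, assuming tau_c is proportional to mass."""
    bound_over_free = (reporter_kda + partner_kda) / reporter_kda
    return (1 - p_bound) + p_bound * bound_over_free


# Hypothetical reporter masses at an identical bound population.
for mass in (7.0, 16.0):
    factor = apparent_retardation(mass, p_bound=0.05)
    print(f"{mass:4.0f} kDa reporter: apparent retardation x{factor:.2f}")
```

At equal pB the smaller reporter appears more strongly retarded, which is why a ranking based on apparent viscosity alone can underestimate the interactivity of a larger protein.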

Figure 5

Bound population as a function of net charge. Optimized populations from a model in which the reporter proteins interact transiently with a distribution of interaction partners. Blue markers: TTHApwt; red markers: HAH1pwt; green markers: SOD1barrel. The brighter markers correspond to the surface-mutation variants of the reporter proteins and highlight the importance of surface net charge for the formation of transient encounters in cells.

Concluding Remarks

To summarize, our study shows that mean-field approaches based on apparent viscosity fail to accurately describe the in-cell effect on NMR relaxation data, since they do not explicitly account for the fast exchange between the monomer and monomer–partner complexes that is bound to accompany transient-binding events (Figure ). Taking this fast exchange into account not only reconciles the previous issue of seemingly inconsistent relaxation parameters but also sheds light on the intracellular interactions by pinpointing the population of bound species (pB). Of particular interest, the results show that the disparate effect of transient in-cell interactions on T1 and T2 can be explained by mass-altering binding alone, without introducing chemical-exchange contributions (Figure ). Although this simplifying result by no means rules out contributions from chemical-exchange effects, it is notable that they are not required to explain the observed data.
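The mismatch produced by a mean-field description can be reproduced numerically. The sketch below mixes a free and a bound state in fast exchange and then asks which single correlation time would reproduce the averaged R1 and, separately, the averaged R2. It assumes rigid-sphere 15N dipolar and CSA relaxation at 14.1 T, about 0.6 ns of correlation time per kDa, a hypothetical 10 kDa reporter, and pB = 0.1; only the 143 kDa partner mass is taken from the text.

```python
import math

GAMMA_H, GAMMA_N = 2.6752218744e8, 2.7116e7   # magnitudes, rad s^-1 T^-1
B0 = 14.1                                      # assumed field, T
W_H, W_N = GAMMA_H * B0, GAMMA_N * B0
D2 = (1e-7 * 1.054571817e-34 * GAMMA_H * GAMMA_N / (1.02e-10) ** 3) ** 2
C2 = (W_N * 160e-6) ** 2 / 3.0                 # CSA magnitude of 160 ppm


def rates(tau_c):
    """15N (R1, R2) in s^-1 for isotropic tumbling with correlation time tau_c."""
    def j(w):
        return 0.4 * tau_c / (1.0 + (w * tau_c) ** 2)

    high = j(W_H - W_N) + 3 * j(W_N) + 6 * j(W_H + W_N)
    r1 = D2 / 4 * high + C2 * j(W_N)
    r2 = (D2 / 8 * (4 * j(0.0) + high + 6 * j(W_H))
          + C2 / 6 * (4 * j(0.0) + 3 * j(W_N)))
    return r1, r2


def apparent_tau(target, index, lo=5e-9, hi=1e-6):
    """Bisect for the single tau_c giving the target rate (index 0: R1, 1: R2).

    On [lo, hi] R1 decreases and R2 increases with tau_c at this field.
    """
    for _ in range(100):
        mid = math.sqrt(lo * hi)
        value = rates(mid)[index]
        mid_too_large = value < target if index == 0 else value > target
        if mid_too_large:
            hi = mid
        else:
            lo = mid
    return math.sqrt(lo * hi)


# Hypothetical 10 kDa reporter; partner mass 143 kDa; assumed 0.6 ns per kDa.
tau_free = 0.6e-9 * 10.0
tau_bound = 0.6e-9 * (10.0 + 143.0)
p_bound = 0.1
r1_obs = (1 - p_bound) * rates(tau_free)[0] + p_bound * rates(tau_bound)[0]
r2_obs = (1 - p_bound) * rates(tau_free)[1] + p_bound * rates(tau_bound)[1]

tau_from_r1 = apparent_tau(r1_obs, 0)
tau_from_r2 = apparent_tau(r2_obs, 1)
print(f"apparent tau_c from R1: {tau_from_r1 * 1e9:.1f} ns")
print(f"apparent tau_c from R2: {tau_from_r2 * 1e9:.1f} ns")
```

Because R2 grows almost linearly with correlation time while R1 falls off nonlinearly, the averaged R2 maps to a longer apparent correlation time than the averaged R1, although both derive from the same two-state mixture.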

Figure 6

Comparing the models. The mean-field approach, where a reporter protein (orange) is assigned an apparent mass (blue panel), cannot simultaneously describe in-cell R1 and R2 data. However, a model with the reporter protein in a single free state and a population in a distribution of bound states fully reconciles the R1 and R2 data. The distribution of monomeric cytosolic proteins is not sufficient by itself; transient interactions with larger components, such as protein assemblies, membranes, ribosomes, and the cytoskeleton, have to be accounted for.

Another interesting detail is that, to accurately reproduce the observed relaxation rates, interaction partners larger than the soluble cytosolic proteins need to be included (Figure ). This observation agrees with the view that both soluble proteins and higher-order complexes, such as ribosomes and the cytoskeleton, play important roles in modulating rotational diffusion in live cells.[5,12,28,46,47,49] The influence of the full distribution of intracellular interaction partners on rotational diffusion also suggests that the NMR relaxation analysis can be used for exploring more intricate aspects of cellular function. One such example is how the interplay between the intracellular components responds to physiological or genetic perturbations, where the interaction-size distributions for some key players are expected to undergo significant changes. Although it remains to be established how far this type of analysis can be taken given the signal-to-noise ratio and the many degrees of freedom, the results in this study show that, in principle, it is doable.

48 in total

1. Ribosome Mediated Quinary Interactions Modulate In-Cell Protein Activities.

Authors: Christopher M DeMott; Subhabrata Majumder; David S Burz; Sergey Reverdatto; Alexander Shekhtman
Journal: Biochemistry Date: 2017-08-03 Impact factor: 3.162

Review 2. Chemical exchange in biomacromolecules: past, present, and future.

Authors: Arthur G Palmer
Journal: J Magn Reson Date: 2014-04 Impact factor: 2.229

3. Stability Effect of Quinary Interactions Reversed by Single Point Mutations.

Authors: David Gnutt; Stepan Timr; Jonas Ahlers; Benedikt König; Emily Manderfeld; Matthias Heyden; Fabio Sterpone; Simon Ebbinghaus
Journal: J Am Chem Soc Date: 2019-02-21 Impact factor: 15.419

4. Positively Charged Tags Impede Protein Mobility in Cells as Quantified by ¹⁹F NMR.

Authors: Yansheng Ye; Qiong Wu; Wenwen Zheng; Bin Jiang; Gary J Pielak; Maili Liu; Conggang Li
Journal: J Phys Chem B Date: 2019-05-15 Impact factor: 2.991

5. Physicochemical code for quinary protein interactions in Escherichia coli.

Authors: Xin Mu; Seongil Choi; Lisa Lang; David Mowray; Nikolay V Dokholyan; Jens Danielsson; Mikael Oliveberg
Journal: Proc Natl Acad Sci U S A Date: 2017-05-23 Impact factor: 11.205

6. Topogenic signals in integral membrane proteins.

Authors: G von Heijne; Y Gavel
Journal: Eur J Biochem Date: 1988-07-01

7. The distribution of positively charged residues in bacterial inner membrane proteins correlates with the trans-membrane topology.

Authors: G Heijne
Journal: EMBO J Date: 1986-11 Impact factor: 11.598

8. Mathematical modeling and comparison of protein size distribution in different plant, animal, fungal and microbial species reveals a negative correlation between protein size and protein number, thus providing insight into the evolution of proteomes.

Authors: Axel Tiessen; Paulino Pérez-Rodríguez; Luis José Delaye-Arredondo
Journal: BMC Res Notes Date: 2012-02-01

Review 9. The Inescapable Effects of Ribosomes on In-Cell NMR Spectroscopy and the Implications for Regulation of Biological Activity.

Authors: David S Burz; Leonard Breindel; Alexander Shekhtman
Journal: Int J Mol Sci Date: 2019-03-14 Impact factor: 5.923

10. Ribosome surface properties may impose limits on the nature of the cytoplasmic proteome.

Authors: Paul E Schavemaker; Wojciech M Śmigiel; Bert Poolman
Journal: Elife Date: 2017-11-20 Impact factor: 8.140
