Inga Jarmoskaite¹, Ishraq AlSadhan¹, Pavanapuresan P Vaidyanathan¹, Daniel Herschlag¹,²,³. ¹Department of Biochemistry, Stanford University, Stanford, United States. ²Department of Chemical Engineering, Stanford University, Stanford, United States. ³Stanford ChEM-H, Stanford University, Stanford, United States.
Abstract
Quantitative measurements of biomolecule associations are central to biological understanding and are needed to build and test predictive and mechanistic models. Given the advances in high-throughput technologies and the projected increase in the availability of binding data, we found it especially timely to evaluate the current standards for performing and reporting binding measurements. A review of 100 studies revealed that in most cases essential controls for establishing the appropriate incubation time and concentration regime were not documented, making it impossible to determine measurement reliability. Moreover, several reported affinities could be concluded to be incorrect, thereby impacting biological interpretations. Given these challenges, we provide a framework for a broad range of researchers to evaluate, teach about, perform, and clearly document high-quality equilibrium binding measurements. We apply this framework and explain underlying fundamental concepts through experimental examples with the RNA-binding protein Puf4.
Molecular associations lie at the heart of biology. Their thermodynamics provides information critical for deriving a fundamental understanding of molecular functions. In a broader biological context, these associations are linked and interconnected in complex networks that allow sensitive and precise developmental programs and responses to environmental cues, and that are altered in disease states. The outputs of pathways and networks are determined by the quantitative interplay of their many constituent molecules and interactions. Thus, equilibrium constants for association between network components are needed to define, model, predict, and ultimately precisely manipulate biology.
A limitation of traditional biochemical measurements is their low throughput, especially in relation to the large number of cellular interactions. Excitingly, several strategies have recently emerged to obtain high-throughput, quantitative information for intermolecular associations (e.g. Buenrostro et al., 2014; Tome et al., 2014; Lambert et al., 2014; Nutiu et al., 2011; Maerkl and Quake, 2007; Adams et al., 2016; Jain et al., 2017). Given these potentially transformative advances, it is especially timely to assess the accuracy of equilibrium binding measurements. We wanted to know whether current practices are sufficient to ensure reliable and accurate measurements, and whether the reliability of these measurements can be readily ascertained from the information provided in published work.
Our survey of 100 literature binding measurements, presented below, uncovered recurring problems with a large majority of studies. Fortunately, there are straightforward procedures, laid out here, that can be followed to ensure that published binding measurements are reliable. The principles underlying these procedures have been discussed previously, and we build on these reports (Pollard, 2010; Hulme and Trevethick, 2010; Sanders, 2010).
We focus on a minimal set of critical actionable steps and controls that biologists of any background should be able to implement in their binding measurements. We apply these procedures with experimental examples and also demonstrate the pitfalls of omitting essential controls. To further streamline application of these standard procedures, we provide a convenient checklist that can organize and guide experiments and can be used as an aid in summarizing and presenting results for publication.
Results
Assessing the current state of binding measurements
We evaluated published binding measurements using RNA-protein interactions as an illustrative example. We surveyed 100 studies that reported equilibrium dissociation constants (KD values) and scored them based on two key criteria for reliable binding measurements: sufficient time to equilibration and proper concentration regime (Figure 1).
Figure 1.
Assessment of published KD values for RNA-binding proteins.
We analyzed 100 papers reporting KD or ‘apparent KD’ values of RNA/protein interactions. Measurements were evaluated based on two criteria: demonstrating equilibration (horizontal axis) and controlling for titration (vertical axis). Detailed criteria are described in Materials and methods, and the source data are provided in Supplementary file 1. The right column includes predominantly studies that used ITC and SPR, techniques that inherently record binding progress over time (24/30 in this column). The fraction of studies that varied time to demonstrate equilibration in non-ITC/SPR experiments is considerably smaller (6 of the 76 papers that did not exclusively use ITC or SPR, or <10%).
First, we asked if equilibration was demonstrated. By definition, an equilibrium state is invariant with time. So, determining a binding equilibrium constant requires showing that there is no change in the amount of bound complex over time.
Of the 100 studies surveyed, 70 did not report varying time for reported equilibrium measurements (Figure 1; Supplementary file 1). Of the 30 studies that did vary time, 24 exclusively used techniques with built-in monitoring of progress over time (isothermal titration calorimetry [ITC] and surface plasmon resonance [SPR]). Of the remaining 76 studies, which used approaches such as native gel shifts, nitrocellulose filter binding, and fluorescence anisotropy, fewer than 10% reported varying time (Figure 1, Figure 1—figure supplement 1).
Figure 1—figure supplement 1.
Survey of incubation times for published equilibrium dissociation constants.
(A) Percentages of publications that did or did not report and vary the incubation time. The light gray portion of the first column indicates the studies using SPR and ITC, techniques in which time is varied by default. (B) Incubation times in papers that reported a single time.
We know from individual discussions that some researchers carry out these controls, as we advocate below, but do not report them. Unfortunately, the published record then cannot distinguish between these studies and others that have not demonstrated equilibration.
A second critical control entails demonstrating that the KD is not affected by titration, as artifacts can arise when the concentration of the constant limiting component is too high relative to the dissociation constant (KD). Similar to varying time to establish equilibration, systematically varying the concentration of the limiting component provides a definitive control for effects of titration. In our survey, only 5% of studies reported performing this or an equivalent control (Figure 1—figure supplement 2). Nevertheless, most authors appeared to be aware of the need to avoid titration, as the majority of studies (~70%) reported using appropriately low concentrations of the limiting component or employed advanced analysis methods. We consider these examples as reasonably titration-controlled for the purpose of the survey, but emphasize the importance of empirical controls in the sections below. Importantly, this leaves, at a minimum, one-fourth of studies at risk for titration (Figure 1, Figure 1—figure supplement 2).
Figure 1—figure supplement 2.
Survey of titration controls in published binding studies.
(A) Percentages of publications that did (blue) or did not (red) control for titration effects. The first category includes studies that systematically varied the limiting component concentration to rule out titration. Studies that reported using an appropriate concentration regime or analysis methods to minimize the effects of titration (second and third column, respectively) were considered titration controlled; nevertheless, we emphasize the importance of performing and reporting the control experiments described herein, instead of relying on concentrations alone (see section 'Avoid the titration regime'). The ‘Other’ category (n = 7) includes a study that reported KD values as upper limits, recognizing possible titration (n = 1), and studies that only used SPR (n = 6), where the concentration of the immobilized species is difficult to estimate, but mass transport is typically controlled for or accounted for during analysis, as indicated in most surveyed studies. (B) Breakdown of studies that did not report controlling for titration. The first three columns denote studies that assumed negligible concentration of the limiting component in their analysis; however, the reported concentrations and KD values were inconsistent with this assumption, with the ratio of the lowest measured KD value to the limiting component concentration indicated. The ‘Not reported/Other’ category includes studies that did not report the limiting component concentration (n = 4), or used the quadratic equation in a titration regime (limiting component concentration in >1000-fold excess over the KD), incompatible with reliable KD determination (n = 1; see below).
To what extent do these limitations affect the reported equilibrium binding constants in practice? As an example, for Puf4 binding (see below), not controlling for the factors above gave apparent KD values that were up to seven-fold higher than the actual KD values. A more extreme literature example is discussed in the next section, with discrepancies reaching 1000-fold, and other examples have been previously noted (Hulme and Trevethick, 2010; Strohkendl et al., 2018). There is a tendency to be less careful about controls in pursuit of relative affinities (specificity) rather than absolute affinity. However, failing to account for the factors noted above can also underestimate specificity by orders of magnitude (see Figure 4—figure supplement 1 and Figure 5—figure supplement 4 below).
Figure 4—figure supplement 1.
Insufficient equilibration times can lead to incorrect determination of relative affinities.
(A) Binding parameters for protein (P) interactions with two ligands, L1 and L2. The dissociation rate constant (koff) for L1 is 100-fold lower than for L2, such that L1 requires much longer to equilibrate than L2 (Equation 2). (B) Simulated binding data for L1 and L2 with varying incubation times (t1). The binding to each ligand is measured individually with trace amounts of L1 (blue) or L2 (red). Solid lines are fits to an equilibrium binding equation (Equation 4b), with dashed lines indicating the protein concentration at which half of the ligand is bound. Because equilibration of L1 binding is not complete until t1 = 10 hr (while L2 equilibration only takes ~5 min), the observed relative affinity (KD(rel), the ratio of the apparent KD values for the two ligands) is time-dependent and underestimates the true specificity if the incubation time is shorter than ~10 hr. Arrows and numbers indicate KD(rel) values at each time point. Note the systematic deviations of the simulated data from the fit curve in cases where equilibrium has not been reached. The presence of such deviations in experimental data indicates the need for additional controls to establish equilibration and rule out titration.
Figure 5—figure supplement 4.
Effects of trace binding partner concentration on apparent relative affinities.
(A) Affinities of protein P for ligands L1 and L2. (B) Simulated equilibrium binding curves. Binding to each ligand is measured individually with different concentrations of labeled ligand (L1* or L2*). Solid lines are fits to Equation 4b, with dashed lines indicating the protein concentration at which half of the ligand is bound (corresponding to KD in Equation 4b). Arrows and numbers indicate apparent KD(rel) values at each concentration of L (KD(rel) is the ratio of the apparent KD values for the two ligands, each derived using Equation 4b). There is a pronounced dependence of apparent relative affinity on ligand concentration if [L] is not much lower than the KD for the most tightly bound ligand among the ligands being compared. If sufficiently low ligand concentrations are not accessible, Equation 5 should be used and results may be less reliable (see section 'Avoid the titration regime' of main text).
These observations highlight an urgent need to revisit the criteria for reliable binding measurements. There is a parallel need to render these criteria accessible to a broad range of biologists, regardless of background or training, in the form of clear and readily actionable guidelines. To meet these needs, we provide simple, concrete strategies so that any practitioner can carry out reliable binding measurements, clearly communicate their results, and evaluate results from others.
Fortunately, the key requirements for binding measurements can be broken down into a small number of steps. We present two required steps for equilibrium binding measurements: varying the incubation time (see section 'Vary incubation time to test for equilibration') and controlling for titration (see section 'Avoid the titration regime'). We illustrate these steps for the example of RNA binding to the Saccharomyces cerevisiae Puf4 protein (Gerber et al., 2004; Miller et al., 2008). We also present additional steps that can be taken to further increase confidence in KD values and to obtain kinetic information about the binding event under investigation (see sections 'Test KD by an independent approach' and 'Determine the fraction of active protein'). Finally, we describe strategies to address cases where no binding is initially detected and explain why it is often premature to conclude an absence of binding (see section 'The case of no observed binding').
Practical considerations
In principle, one would like to have well-behaved and perfectly controlled measurements in all cases, but biology and biochemistry can be messy. There are many situations, such as work with extracts and partially purified systems, where protein concentrations cannot be accurately determined, where proteases and nucleases may limit achievable equilibration times, and where there may be additional interacting components. Regardless of these potential complications, the simple steps indicated below can establish the robustness of measured affinities and can diagnose and help overcome issues like loss of activity over time. Moreover, these controls (and quantitative measurements more generally) can help uncover new features and regulatory mechanisms, based on deviations from ‘ideal’ behavior of simple binding equilibria.
Vary incubation time to test for equilibration
The most basic test for whether a binding reaction has reached equilibrium is that the fraction of complex formed between two molecules does not change over time. Nevertheless, the majority of papers we surveyed that present binding measurements and report apparent affinities or equilibrium dissociation constants do not report that time has been varied (Figure 1). We first describe two related concepts that will help readers develop an intuition for the time scales of binding processes and we then apply these concepts to Puf4 binding.
Half-life
Binding and other simple kinetic processes, in general, follow exponential curves (Figure 2). The key property of an exponential curve is that it has a constant half-life (t1/2); that is, the time it takes for the reaction to proceed from 0% to 50% complete, 50% to 75% complete, 75% to 87.5% complete, etc. is the same (Figure 2). After three half-lives, an exponential process is almost 90% complete (3t1/2 = 87.5%; Figure 2), which is close enough to equilibration for most applications. Below we adopt the more common standard of taking reactions to five half-lives, or 96.9% completion; this more conservative standard is safer given that there are multiple sources of potential error in practice.
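The half-life arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch; the function name `fraction_complete` is ours and does not come from the original analysis.

```python
def fraction_complete(n_half_lives: float) -> float:
    """Fraction of a single-exponential process completed after n half-lives."""
    # Each half-life removes half of whatever remains, so (1/2)**n is left.
    return 1.0 - 0.5 ** n_half_lives


for n in (1, 2, 3, 5):
    print(f"{n} half-lives: {100 * fraction_complete(n):.1f}% complete")
```

Three half-lives give 87.5% completion and five give about 96.9%, which is why the five-half-life standard leaves a margin for experimental error.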
Figure 2.
Exponential kinetics used to estimate the time needed for binding equilibration.
Arrows indicate reaction half-life t1/2. Fraction bound is defined by the equation $\text{Fraction bound} = 1 - e^{-k_{\mathrm{equil}} t}$.
Equilibration rate constant
The equilibration rate constant is effectively the inverse of the binding half-life ($k_{\mathrm{equil}} = \ln 2 / t_{1/2} \approx 0.693 / t_{1/2}$) and, importantly, is concentration-dependent. For the binding equilibrium shown in Figure 3, under conditions where one binding partner (here, the protein, P) is in large excess over the other (RNA), the rate constant for approach to equilibrium, kequil, is described as:
$$k_{\mathrm{equil}} = k_{\mathrm{on}}[\mathrm{P}] + k_{\mathrm{off}} \qquad (1)$$
where kon is the association rate constant, [P] is the concentration of protein, or the binding partner in excess, and koff is the dissociation rate constant (Pollard, 2010). According to Equation 1, equilibration is the slowest at the lowest protein concentrations. For this reason, equilibration times need to be established from the low end of the concentration range. In practice, it is useful to consider the limiting case with the protein concentration approaching zero ([P] ~ 0), such that Equation 1 simplifies to Equation 2 (Hulme and Trevethick, 2010):
$$\lim_{[\mathrm{P}] \to 0} k_{\mathrm{equil}} = k_{\mathrm{off}} \qquad (2)$$
Figure 3.
Model for one-step, non-cooperative, 1:1 binding between two molecules.
Protein (P) binding to an RNA (R) molecule is shown for illustrative purposes.
Thus, the more long-lived the complex (i.e. the lower its dissociation rate constant), the longer the incubation time required to reach equilibrium.
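To make the concentration dependence concrete, the sketch below evaluates the pseudo-first-order expression kequil = kon[P] + koff over a range of protein concentrations and converts it to a five-half-life equilibration time. The rate constants (`K_ON`, `K_OFF`) are arbitrary illustrative values, not measured Puf4 parameters, and the function names are ours.

```python
import math

K_ON = 1e6    # association rate constant, M^-1 s^-1 (assumed for illustration)
K_OFF = 1e-3  # dissociation rate constant, s^-1 (assumed for illustration)


def k_equil(k_on: float, protein_conc: float, k_off: float) -> float:
    """Rate constant for approach to equilibrium with protein in large excess."""
    return k_on * protein_conc + k_off


def t_equil(k_on: float, protein_conc: float, k_off: float,
            n_half_lives: float = 5.0) -> float:
    """Time (s) needed to complete n half-lives at a given protein concentration."""
    return n_half_lives * math.log(2) / k_equil(k_on, protein_conc, k_off)


# Equilibration is slowest as [P] approaches zero, where k_equil reduces to k_off.
for p in (0.0, 1e-10, 1e-9, 1e-8, 1e-7):
    print(f"[P] = {p:.1e} M -> t_equil = {t_equil(K_ON, p, K_OFF):.0f} s")
```

The longest time in the printout is the [P] = 0 limit, which is the value that should set the incubation time for the whole titration.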
What is the range of equilibration times for typical biomolecular interactions? While koff measurements (and, consequently, kequil) are less common in the literature than KD measurements, equilibration times can be readily estimated (Sanders, 2010). Given that $K_D = k_{\mathrm{off}}/k_{\mathrm{on}}$ (Figure 3) and assuming that the binding of molecules occurs as fast as diffusional collisions ($k_{\mathrm{on}} = 10^8\ \mathrm{M^{-1}\,s^{-1}}$), we can calculate that an interaction with a KD value of 1 pM would require a 10 hr incubation to reach equilibrium, whereas a 1 µM KD interaction would only require 40 ms (Table 1). Notably, binding rate constants for processes involving macromolecules are often smaller than the diffusion-driven limit of ~$10^8\ \mathrm{M^{-1}\,s^{-1}}$, for example when additional conformational rearrangements are required for stabilizing binding after two molecules collide (Karbstein and Herschlag, 2003; Peluso et al., 2000; Wu et al., 2002). As a result, equilibration can take much longer. Thus, equilibration times for two interactions with the same KD value can vary by orders of magnitude, and some reactions in the biologically relevant affinity range can require equilibration times of tens of hours or even longer in vitro (Table 1; Hulme and Trevethick, 2010; Sanders, 2010). These long times underscore that biology has developed mechanisms to circumvent or utilize such slow processes; for example, rapid association may be facilitated by high intracellular concentrations of binding partners, and cellular factors such as molecular chaperones, helicases, chromatin remodelers, or translation can speed up binding and dissociation.
Table 1.
Equilibration times (tequil) for different affinities and association rate constants.

| KD | kon (M^−1 s^−1) | tequil* |
|---|---|---|
| 1 µM | 10^8 | 0.04 sec |
| 1 µM | 10^6 | 4 sec |
| 1 µM | 10^3 | 1 hr |
| 1 nM | 10^8 | 40 sec |
| 1 nM | 10^6 | 1 hr |
| 1 nM | 10^3 | 1000 hr |
| 1 pM | 10^8 | 10 hr |
| 1 pM | 10^6 | 1000 hr |
| 1 pM | 10^3 | 1,000,000 hr |

*tequil was calculated as five half-lives: tequil = 5t1/2 = 5 × 0.693/kequil, where kequil = koff = KD × kon (Equation 2 and Figure 3).
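The entries in Table 1 follow directly from the footnote formula and can be regenerated with a short script. This is a sketch of that arithmetic only; the function name `equilibration_time_s` is ours, and the table values are rounded versions of what it prints.

```python
import math


def equilibration_time_s(kd_molar: float, k_on: float,
                         n_half_lives: float = 5.0) -> float:
    """Five-half-life equilibration time (s) in the [P] -> 0 limit.

    Uses k_equil = k_off = KD * k_on (Equation 2) and t_1/2 = ln(2) / k_equil.
    """
    k_off = kd_molar * k_on
    return n_half_lives * math.log(2) / k_off


for kd in (1e-6, 1e-9, 1e-12):        # 1 uM, 1 nM, 1 pM
    for k_on in (1e8, 1e6, 1e3):      # M^-1 s^-1
        t = equilibration_time_s(kd, k_on)
        print(f"KD = {kd:.0e} M, kon = {k_on:.0e}: "
              f"{t:.3g} s ({t / 3600:.3g} hr)")
```

For a new system, substituting an estimated KD and a plausible range of kon values gives a first guess at the incubation times worth testing experimentally.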
Implications of insufficient equilibration
Despite the realistic possibility of long equilibration times for biological association events, nearly 90% of the reported incubation times were 1 hr or less (Figure 1—figure supplement 1B). As a concrete example, several ‘equilibrium’ dissociation constants reported for CRISPR nucleases, which are well known for tight RNA and/or DNA binding, were determined from incubations of 1 hr or less (e.g. Semenova et al., 2011; Westra et al., 2012; Westra et al., 2013; Sternberg et al., 2014; O'Connell et al., 2014; Wright et al., 2015; Ma et al., 2015; Jiang et al., 2015; Sternberg et al., 2015; Beloglazova et al., 2015; Rutkauskas et al., 2015; Abudayyeh et al., 2016; Supplementary file 2). But when target dissociation of these proteins was measured over time, it took many hours (Strohkendl et al., 2018; Richardson et al., 2016; Boyle et al., 2017; Raper et al., 2018), suggesting that equilibration takes much longer than an hour and that the reported KD values based on these short incubation times underestimate the true binding strength. In one striking example, kinetic measurements revealed an equilibration time of >100 hr for the Cas12a complex and an equilibrium constant that was 1000-fold lower than previously reported for the same enzyme under similar conditions after a much shorter incubation time (Strohkendl et al., 2018). Insufficient incubation times for tight binders may have also led to underestimation of specificity, a topic of central concern for CRISPR targeting (and for much of biology). Figure 4—figure supplement 1 illustrates how target affinities that differ by two orders of magnitude may appear identical if the incubation time is too short.
An example in which extending the incubation time changed the mechanistic interpretation comes from studies of the signal recognition particle (SRP). Originally, the observation that 4.5S RNA enhanced the assembly of the SRP and SRP receptor led to a proposed mechanism in which the 4.5S RNA stabilized the complex. Subsequently, binding studies extended to longer times revealed that the 4.5S RNA accelerated the otherwise slow SRP/receptor binding and dissociation without affecting the binding affinity (Peluso et al., 2000). Exploring the time dependence of the assembly process changed the mechanistic conclusions: 4.5S RNA could be shown to play a catalytic, rather than stabilizing, role in SRP/receptor assembly.
Figure 4—figure supplement 1 illustrates how incubation times that are very far from equilibrium can lead to systematic deviations of the data from the fit to an equilibrium binding equation. While a poor fit is not sufficient to diagnose insufficient equilibration (and, conversely, a good fit does not prove complete equilibration), an inability to fit the data well to a simple binding model provides an important indicator that additional controls are required. Only after simple controls for equilibration and titration (see below) have been performed should more complex binding models, such as cooperativity, be considered, unless such models are independently supported. Indeed, among the studies in our literature survey omitting one or both key controls, several included poorly fit binding curves. Importantly, graphs of fits of the data to a clearly defined equilibrium binding model should be published along with the KD values when possible, and the quality of the fit over the entire concentration range should always be carefully assessed.
In summary, the incubation time must be varied to ensure equilibration, ideally across a range of at least 10-fold. Below we illustrate this control, and the need for it, with experimental results for Puf4 binding to its consensus RNA.
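The effect of a too-short incubation on the apparent affinity can also be sketched numerically. The example below assumes simple 1:1 binding with trace ligand and protein in large excess, so that complex formation from time zero follows a single exponential with kequil = kon[P] + koff (Equation 1). The rate constants are hypothetical, and the "apparent KD" here is simply the protein concentration giving half-maximal binding at a given time, found by bisection, rather than the result of a curve fit.

```python
import math

K_ON = 1e6    # M^-1 s^-1 (hypothetical)
K_OFF = 1e-4  # s^-1 (hypothetical); true KD = K_OFF / K_ON = 0.1 nM
KD_TRUE = K_OFF / K_ON


def fraction_bound(protein_conc: float, t_s: float) -> float:
    """Fraction of trace ligand bound after incubating for t_s seconds."""
    equilibrium = protein_conc / (protein_conc + KD_TRUE)
    progress = 1.0 - math.exp(-(K_ON * protein_conc + K_OFF) * t_s)
    return equilibrium * progress


def apparent_kd(t_s: float) -> float:
    """Protein concentration (M) giving 50% bound at time t_s (log bisection)."""
    lo, hi = 1e-15, 1.0
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if fraction_bound(mid, t_s) < 0.5:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


for hours in (0.5, 1.5, 4.5, 24.0):
    ratio = apparent_kd(hours * 3600) / KD_TRUE
    print(f"t1 = {hours:>4} hr: apparent KD / true KD = {ratio:.2f}")
```

With these assumed constants the apparent KD is several-fold too high at 30 min and converges on the true value only at incubation times approaching five half-lives (about 10 hr), mirroring the time dependence described above.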
Time dependence of Puf4 binding at 25°C and 0°C
To establish the equilibration time for Puf4 binding to its cognate RNA sequence, Puf4 was mixed, over a series of concentrations, with a trace amount of labeled RNA (in this case, 32P-labeled; 0.002–0.016 nM) and incubated for a specified time (t1) (Figure 4A). The fraction of bound RNA was subsequently determined by non-denaturing gel electrophoresis (see Materials and methods).
Figure 4.
Establishing equilibration in affinity measurements.
(A) Mixing scheme. RNA*: labeled RNA (here—5´-terminally labeled with 32P). In addition to varying equilibration time t1 (main text), the time and conditions between adding the loading buffer and loading (t2) are controlled (see Appendix 2—note 2). (B, C) Concentration dependence of Puf4 binding at 25°C (B) and at 0°C (C) after different incubation times. Data were collected at protein concentrations greater than or equal to the concentration of labeled RNA (0.002–0.016 nM, indicating the lower and upper limit of labeled RNA concentration; see section 'Avoid the titration regime' and Appendix 2—note 4).
(A) Binding parameters for protein (P) interactions with two ligands, L1 and L2. The dissociation rate constant (koff) for L1 is 100-fold lower than for L2, such that L1 requires much longer to equilibrate than L2 (Equation 2). (B) Simulated binding data for L1 and L2 with varying incubation times (t1). The binding to each ligand is measured individually with trace amounts of L1 (blue) or L2 (red). Solid lines are fits to an equilibrium binding equation (Equation 4b), with dashed lines indicating the protein concentration at which half of the ligand is bound. Because equilibration of L1 binding is not complete until t1 = 10 hr (while L2 equilibration only takes ~5 min), the observed relative affinity ((rel) = /) is time-dependent and underestimates the true specificity if the incubation time is shorter than ~10 hr. Arrows and numbers indicate (rel) values at each time point. Note the systematic deviations of the simulated data from the fit curve in cases where equilibrium has not been reached. The presence of such deviations in experimental data indicates the need for additional controls to establish equilibration and rule out titration.
Establishing equilibration in affinity measurements.
(A) Mixing scheme. RNA*: labeled RNA (here—5´-terminally labeled with 32P). In addition to varying equilibration time t1 (main text), the time and conditions between adding the loading buffer and loading (t2) are controlled (see Appendix 2—note 2). (B, C) Concentration dependence of Puf4 binding at 25°C (B) and at 0°C (C) after different incubation times. Data were collected at protein concentrations greater than or equal to the concentration of labeled RNA (0.002–0.016 nM, indicating the lower and upper limit of labeled RNA concentration; see section 'Avoid the titration regime' and Appendix 2—note 4).
Insufficient equilibration times can lead to incorrect determination of relative affinities.
(A) Binding parameters for protein (P) interactions with two ligands, L1 and L2. The dissociation rate constant (koff) for L1 is 100-fold lower than for L2, such that L1 requires much longer to equilibrate than L2 (Equation 2). (B) Simulated binding data for L1 and L2 with varying incubation times (t1). The binding to each ligand is measured individually with trace amounts of L1 (blue) or L2 (red). Solid lines are fits to an equilibrium binding equation (Equation 4b), with dashed lines indicating the protein concentration at which half of the ligand is bound. Because equilibration of L1 binding is not complete until t1 = 10 hr (while L2 equilibration only takes ~5 min), the observed relative affinity (KD(rel), the ratio of the apparent KD values for the two ligands) is time-dependent and underestimates the true specificity if the incubation time is shorter than ~10 hr. Arrows and numbers indicate KD(rel) values at each time point. Note the systematic deviations of the simulated data from the fit curve in cases where equilibrium has not been reached. The presence of such deviations in experimental data indicates the need for additional controls to establish equilibration and rule out titration.
At 25°C, we observed the same amount of binding with incubations of t1 = 30 min, 1.5 hr, and 4.5 hr at each protein concentration, providing strong evidence for equilibration even at the shortest time (Figure 4B). Consequently, we can proceed to the next key control at this condition, using an incubation time of ≥30 min.
We also present Puf4 binding results at 0°C, as these data provide an example of slow equilibration and because many binding studies report incubations on ice to stabilize binding. Indeed, the results at 0°C were very different from those at 25°C. As shown in Figure 4C, Puf4 bound different amounts of RNA in the 30 min, 1.5 hr, and longer incubations. Not until the incubation was extended to 4.5 hr did the extent of binding level off at the lowest Puf4 concentrations; that is, the amount bound was the same after 4.5 and 24 hr. Consequently, equilibration of Puf4–RNA binding on ice requires at least 4.5 hr, and incubation for only 30 min would give an apparent KD value that is seven times higher than after a 24 hr incubation. Moreover, binding at 0°C was so tight that we were only able to obtain part of the binding curve while maintaining the protein concentrations in excess of labeled RNA (Figure 4C). The importance of this excess to obtain reliable KD values is described in the next section. In the 0°C case and more generally, it is important to re-assess the equilibration time after establishing that binding is in an appropriate concentration regime, as we demonstrate in later sections. Similarly, changes in conditions, such as salt concentration, temperature or pH, can affect both the affinity and the equilibration time and therefore should be accompanied by confirming that equilibration has occurred.
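The dependence of the apparent KD on incubation time can be illustrated with a short simulation. The following is a minimal sketch, assuming simple one-step binding with protein in large excess over labeled RNA, so that binding approaches equilibrium as a single exponential with an observed rate constant of kon[P] + koff. The rate constants and function names are hypothetical choices for illustration and are not the measured Puf4 parameters.

```python
import math

def fraction_bound(p_total, t, kon, koff):
    """Fraction of trace RNA bound after time t (s), with protein (nM) in large excess."""
    kd = koff / kon                      # equilibrium dissociation constant (nM)
    k_obs = kon * p_total + koff         # observed equilibration rate constant (1/s)
    f_eq = p_total / (p_total + kd)      # equilibrium fraction bound (hyperbolic form)
    return f_eq * (1.0 - math.exp(-k_obs * t))

def apparent_k_half(t, kon, koff, lo=1e-6, hi=1e6):
    """Protein concentration (nM) giving half binding at time t, by bisection on a log scale."""
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if fraction_bound(mid, t, kon, koff) < 0.5:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)

# Hypothetical rate constants: kon = 1e-3 per nM per s, koff = 1e-4 per s (true KD = 0.1 nM).
KON, KOFF = 1e-3, 1e-4
for hours in (0.5, 1.5, 4.5, 24.0):
    k_half = apparent_k_half(hours * 3600.0, KON, KOFF)
    print(f"t1 = {hours:4.1f} hr: apparent K1/2 = {k_half:.3f} nM (true KD = {KOFF / KON:.3f} nM)")
```

In this sketch the apparent midpoint lies above the true KD at short incubation times and converges to it only once the incubation is long relative to 1/koff, which is why comparing at least two sufficiently different incubation times is informative.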
Avoid the titration regime
The most common approach to measuring affinity is to vary the concentration of one component while keeping the concentration of the other binding partner constant. However, this experimental design is not always sufficient, as there are two limiting regimes, determined by the concentration of the constant component, and only one of them allows the KD to be reliably determined.
In the first, ‘binding’ regime, the concentration of the constant (‘trace’) component, R, is well below the dissociation constant ([R]total << KD for the example in Figure 3). In this case, the concentration of the variable component (P in Figure 3) that gives half binding is equal to the KD (Figure 5A). In the other, ‘titration’ regime, the concentration of the constant component is much greater than the KD ([R]total >> KD), so that essentially all added P is depleted from solution due to binding to R, until there is no more free R left to bind. In this case, the concentration of P that gives half binding does not equal or even approximate the KD. Rather, at high excess of R over the KD, the concentration of P that gives half binding is simply half of the concentration of (active) R molecules, a value that can differ from the sought-after KD by orders of magnitude (Figure 5B; Figure 5—figure supplement 1).
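The two regimes can be connected by a simple mass balance, sketched here under the assumption of 1:1 binding and fully active components. At the midpoint of the binding curve, half of R is bound, so [R]free equals the concentration of complex; from the definition of KD it follows that [P]free = KD, and the total protein at the midpoint is the free protein plus the bound protein, KD + [R]total/2. The function name below is a hypothetical helper, and the numbers are those given for the simulated examples of Figure 5.

```python
def midpoint_total_protein(kd, r_total):
    """Total protein concentration at half binding for 1:1 binding.

    At the midpoint [P]free = KD and the complex concentration is [R]total / 2,
    so [P]total = KD + [R]total / 2 (all values in the same concentration units).
    """
    return kd + r_total / 2.0

# Binding regime: trace R (assumed here to be 0.001 nM) and KD = 1 nM.
print(midpoint_total_protein(kd=1.0, r_total=0.001))

# Titration regime: KD = 0.01 nM and [R]total = 2 nM.
print(midpoint_total_protein(kd=0.01, r_total=2.0))
```

Both cases give a midpoint close to 1 nM, but only in the first does the midpoint approximate the KD; in the second it is set almost entirely by [R]total/2.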
Figure 5.
Two concentration regimes.
(A) Binding curve for the model in Figure 3 in the ‘binding’ regime—that is, the trace binding partner concentration ([R]total) is much lower than KD and much lower than [P]total (Equation 4b). Here, the KD is simply the protein concentration at which half of the RNA is bound (K1/2, here corresponding to 1 nM). The same simulated binding curve is shown in linear (top) and log (bottom) plots, as both are useful and common in the literature. (B) Binding curve in the ‘titration’ regime, simulated for an interaction with a KD value of 0.01 nM and an [R]total of 2 nM. Although the K1/2 value in this example is identical to the example in Part A, here it does not equal KD, instead exceeding the real KD value by 100-fold.
Figure 5—figure supplement 1.
The effects of RNA (ligand) concentration on observed binding.
(A) Circles indicate simulated data for an interaction with a KD = 10 pM in the presence of RNA concentrations ranging from 100-fold below to 100-fold above the KD. Curves indicate fits of the simulated data to a hyperbolic equation (Equation 4b). For RNA concentrations at least 10-fold below the KD, the data are well explained by a hyperbolic fit, and the protein concentration at which half-saturation occurs (K1/2; indicated with dashed lines for the 0.1 pM RNA curve) is consistent with the KD. Higher RNA concentrations lead to increasing deviations from a hyperbolic fit and to increasing K1/2 values. (B) The relationship between the observed K1/2 enhancement over the true KD (‘K1/2/KD’) and the total RNA concentration relative to KD (‘[R]total/KD’). K1/2 values were derived from the simulated data in part A using Equation 4b.
Figure 5—figure supplement 4.
Effects of trace binding partner concentration on apparent relative affinities.
(A) Affinities of protein P for ligands L1 and L2. (B) Simulated equilibrium binding curves. Binding to each ligand is measured individually with different concentrations of labeled ligand (L1* or L2*). Solid lines are fits to Equation 4b, with dashed lines indicating the protein concentration at which half of the ligand is bound (corresponding to KD in Equation 4b). Arrows and numbers indicate apparent KD(rel) values at each concentration of L (KD(rel) is the ratio of the apparent KD values for the two ligands, each derived using Equation 4b). There is a pronounced dependence of apparent relative affinity on ligand concentration if [L] is not much lower than the KD for the most tightly bound ligand among the ligands being compared. If sufficiently low ligand concentrations are not accessible, Equation 5 should be used and results may be less reliable (see section 'Avoid the titration regime' of main text).
A potentially useful intermediate regime exists between the two extremes, with limiting component concentrations similar to or in modest excess over the KD. The KD can be determined in this regime by using an appropriate binding equation, although with potential pitfalls (see below).
Distinguishing between concentration regimes
The challenge is that distinguishing between the regimes requires knowledge of the KD, and consequently it is impossible to know a priori which regime holds. A useful rule of thumb for avoiding the titration regime is to always maintain the concentration of the excess binding partner significantly above that of the trace limiting partner. The reason for this can be gleaned from the equation that describes the fraction of bound RNA for the simple binding scheme of Figure 3:

$$\text{Fraction bound} = \frac{[P]_{free}}{[P]_{free} + K_D} \qquad \text{(Equation 4a)}$$

Here [P]free is the unbound protein concentration and KD is simply the free protein concentration at which half of the RNA is bound. But while Equation 4a holds universally, in practice we only know the total concentration of P, [P]total (how much we added to the solution), not the free concentration ([P]free). Therefore, we want to operate under simplifying conditions where [P]free ≈ [P]total so that we can substitute [P]total into Equation 4a to give Equation 4b:

$$\text{Fraction bound} = \frac{[P]_{total}}{[P]_{total} + K_D} \qquad \text{(Equation 4b)}$$

The condition [P]free ≈ [P]total holds true if P is in large excess of RNA across the entire experiment, meaning that only a small fraction of total protein is used up by binding to RNA. Most importantly, this condition must hold for the protein concentration that gives half-saturation to determine the KD; hence the requirement for the binding regime that the concentration of the limiting component must be << KD.
In principle, a more complex quadratic binding equation provides an alternative to working under the [P]free ≈ [P]total assumption, as it explicitly accounts for bound protein:

$$\text{Fraction bound} = \frac{[R]_{total} + [P]_{total} + K_D - \sqrt{\left([R]_{total} + [P]_{total} + K_D\right)^2 - 4[R]_{total}[P]_{total}}}{2[R]_{total}} \qquad \text{(Equation 5)}$$

Indeed, several techniques (most notably ITC) commonly operate outside the binding regime and rely on Equation 5 (or equivalent formulations) for data fitting. Importantly, the quadratic equation is only applicable to the intermediate and binding regimes, but not the titration regime. The reason for this is that at very high concentrations of the limiting component relative to the KD, the contribution of KD in determining the fraction bound (Equation 5) becomes negligible, and as a result a meaningful KD value cannot be extracted from the fit to the binding data. Simulated data in Figure 5—figure supplements 2 and 3 illustrate this limitation. Consequently, even when using Equation 5, the concentration of the limiting component should be kept to a minimum to avoid the titration regime.
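The loss of sensitivity to KD at high limiting-component concentrations can be checked numerically. The sketch below is illustrative only: it implements the standard hyperbolic and quadratic forms for 1:1 binding, and then compares simulated curves for two KD values that differ 10-fold, once with 1 nM labeled RNA and once with trace RNA. The function names, the protein concentration grid, and the trace RNA concentration are assumptions made for this example.

```python
import math

def fraction_bound_hyperbolic(p_total, kd):
    """Hyperbolic form: assumes [P]free is approximately [P]total (binding regime)."""
    return p_total / (p_total + kd)

def fraction_bound_quadratic(p_total, r_total, kd):
    """Quadratic form: accounts explicitly for protein depleted by binding to R."""
    b = r_total + p_total + kd
    return (b - math.sqrt(b * b - 4.0 * r_total * p_total)) / (2.0 * r_total)

def max_curve_difference(kd_a, kd_b, r_total, points_per_decade=40):
    """Largest difference in fraction bound between two KD values, 1e-5 to 1e2 nM protein."""
    largest = 0.0
    for i in range(7 * points_per_decade + 1):
        p = 10.0 ** (-5.0 + i / points_per_decade)
        diff = abs(fraction_bound_quadratic(p, r_total, kd_a)
                   - fraction_bound_quadratic(p, r_total, kd_b))
        largest = max(largest, diff)
    return largest

# KD = 1 pM versus 0.1 pM (all concentrations in nM).
print("1 nM labeled RNA:", max_curve_difference(1e-3, 1e-4, r_total=1.0))
print("trace labeled RNA:", max_curve_difference(1e-3, 1e-4, r_total=1e-6))
```

With 1 nM labeled RNA the two curves differ by only a few percent of the total signal at any protein concentration, whereas with trace RNA they are clearly separated, so typical experimental noise would obscure the difference in the first case but not the second.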
Figure 5—figure supplement 2.
Fit to the quadratic binding equation becomes less sensitive to differences in KD when the RNA concentration is in large excess over the KD.
Simulated binding curves for RNA/protein interactions of varying affinities are shown in the presence of 1 nM labeled RNA. In this example, KD = 1 pM (1000-fold lower than [R]total) would be essentially impossible to distinguish from KD = 0.1 pM (10,000-fold lower than [R]total) and from even lower KD values because of the nearly identical binding curves. To accurately measure KD = 10 pM (100-fold lower than [RNA]) it would be critical to have a large number of data points in the narrow protein concentration range that distinguishes this curve from weaker and especially from stronger binders (inset).
Figure 5—figure supplement 3.
Application of the hyperbolic (Equation 4b) and quadratic (Equation 5) binding equations to simulated binding data with increasing noise levels.
All binding curves are for an RNA-protein interaction with a KD of 0.1 nM, measured in the presence of different RNA concentrations (0.001–100 nM) and with increasing levels of random noise in the fraction bound (standard deviation of 0.01–0.2). Ten datasets were simulated per condition and noise level and were individually fit to Equation 4b (leftmost column) or Equation 5 (the remaining columns) to determine the KD. The binding curves are shown as black lines, and the overlaid white circles indicate the expected fractions bound if the data were not affected by noise, with error bars indicating the standard deviation. The fit KD values for each of the 10 simulated datasets are shown below each set of binding curves, and the error bars indicate the 95% confidence intervals (CIs) of the KD. Gray bars indicate that the KD could not be determined from a quadratic fit. CIs that extend beyond the axis limits indicate that the lower limit of the KD was not defined. Note that with increasing noise and increasing RNA concentration the KD values derived from the quadratic fits become increasingly poorly constrained, particularly the lower CIs. By contrast, using the binding regime and Equation 4b to fit the data (leftmost column) consistently yields well-defined KD values, even with substantial noise.
Where does the intermediate regime end and titration begin? The answer depends on the technique and the quality of the data. For ITC measurements, which provide highly precise information for each added binding aliquot, up to 1000-fold excess of the limiting species over the measured KD can be acceptable (Velázquez-Campoy et al., 2004). However, in most other cases, this limit is much lower. Simulations in Figure 5—figure supplement 3 suggest that up to ~10-fold excess consistently allows for reasonably well-defined KD values in the presence of typical binding data, and up to 100-fold excess can be useful for data with minimal noise. In contrast, performing the experiments in the binding regime (fit with Equation 4b) yields well-defined KD values even with substantial noise in the data (Figure 5—figure supplement 3).
Implications of the titration regime
Of the 100 literature studies we surveyed, most (65%) determined KD values under the assumption of the binding regime, by using Equation 4b or equivalent analysis. Nevertheless, the required condition that the limiting species concentration be << KD […]
The implication in all these cases is that the reported KD values may underestimate the real affinities. Unfortunately, it is difficult to determine the extent of this underestimation after the fact without further experimental controls. To understand why, recall from the example in Figure 5B that in the titration regime the midpoint of the binding curve only reflects ~half the concentration of the limiting species, which sets a lower limit to the apparent KD derived from Equation 4b, even if the real KD is much lower. Conversely, if the midpoint of the binding curve (and the reported KD in the above cases) is approximately the same as the limiting component concentration (allowing for some uncertainty in the concentration), the real KD could be anything below this value, from several-fold to many orders of magnitude less. As with insufficient incubation, systematic deviations of the data from the fit to Equation 4b can be a clear indicator that the apparent KD is limited by titration, but a good fit should not be considered sufficient to prove the binding regime, as experimental uncertainties and other causes can mask deviations.
High-affinity interactions are most susceptible to titration, a corollary of the simple fact that for very low KD values it becomes increasingly difficult to maintain concentrations much lower than KD while still allowing for detection. Since CRISPR nucleases represent some of the most widely studied high-affinity binders, we surveyed a sample of studies to determine the concentration regime under which the reported KD values were measured (Supplementary file 2). Of the 15 studies, the majority (13, or 87%) assumed the binding regime in their analysis, indicated by the use of Equation 4b or equivalent. However, only two of these studies (15%) reported using labeled DNA or RNA concentrations considerably below the apparent KD, and in five cases the lowest reported KD was essentially identical to the labeled RNA or DNA concentration (within ~2-fold), consistent with possible titration.
Importantly, because relative affinities are typically based on the tightest binders, titration effects on the ‘wild-type’ substrate measurements can distort all specificity (relative affinity) values that are based on them. Figure 5—figure supplement 4 illustrates an example, in which two substrates with a 100-fold difference in affinity appear to have identical or near-identical affinities when titration is not controlled for.
Given the impossibility of designing experiments for the binding regime a priori, without knowing the affinity, it is important to rule out titration empirically. Thus, analogously to varying time to establish equilibration, we strongly recommend systematically varying the concentration of the limiting species to establish the binding regime (or, with use of Equation 5, the intermediate regime). The hallmark of a valid KD is that it is not affected by varying the concentration of the limiting component, whereas a titration regime would result in concentration-dependent apparent KD values. At a minimum, this control should always be performed when the measured KD value is comparable to the concentration of the limiting component (Equation 4b), or when Equation 5 yields poorly defined apparent KD values or values much lower than the limiting concentration. Below we demonstrate the titration control for Puf4 affinity measurements.
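The distortion of relative affinities and the logic of the titration control can be sketched in a few lines. This example assumes 1:1 binding and approximates the apparent KD from a hyperbolic fit by the midpoint of the binding curve, which by mass balance equals the true KD plus half the labeled ligand concentration. The KD values, ligand concentrations, function names, and the 1.5-fold tolerance are hypothetical and chosen only for illustration.

```python
def apparent_kd_midpoint(kd, ligand_total):
    """Approximate apparent KD as the curve midpoint: true KD + half the labeled ligand."""
    return kd + ligand_total / 2.0

def apparent_relative_affinity(kd_weak, kd_tight, ligand_total):
    """Apparent KD ratio (weak over tight) at a given labeled ligand concentration."""
    return (apparent_kd_midpoint(kd_weak, ligand_total)
            / apparent_kd_midpoint(kd_tight, ligand_total))

def is_concentration_dependent(apparent_kds, max_fold_change=1.5):
    """Flag a titration problem if apparent KDs vary more than an assumed tolerance."""
    return max(apparent_kds) / min(apparent_kds) > max_fold_change

# Hypothetical ligands with a true 100-fold affinity difference (nM).
KD_TIGHT, KD_WEAK = 0.01, 1.0
ligand_series = (0.001, 0.1, 10.0)
for ligand in ligand_series:
    rel = apparent_relative_affinity(KD_WEAK, KD_TIGHT, ligand)
    print(f"[L*] = {ligand:g} nM: apparent relative affinity = {rel:.1f}")

tight_kds = [apparent_kd_midpoint(KD_TIGHT, ligand) for ligand in ligand_series]
print("Apparent KD of tight binder depends on [L*]:", is_concentration_dependent(tight_kds))
```

In this sketch the apparent relative affinity approaches the true 100-fold value only at the lowest ligand concentration and collapses toward 1 at the highest, and the concentration dependence of the apparent KD for the tight binder is what the control is designed to reveal.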
RNA concentration dependence of Puf4 binding at 25°C and 0°C
We systematically varied the labeled RNA concentration in Puf4 binding experiments at 25°C and 0°C, to illustrate the binding and intermediate regimes, respectively. Figure 5—figure supplement 5 provides a schematic description of the two regimes to help build the reader’s intuition.
Figure 5—figure supplement 5.
Concentration regimes that do not (A) and do (B) affect the determination of equilibrium binding constants.
(A) Labeled RNA concentration is much lower than KD ([R*]total << KD; binding regime). (B) Labeled RNA concentration is greater than KD ([R*]total > KD; intermediate regime). In parts (A) and (B), concentrations are indicated schematically by the number of RNA (R*, red), protein (P, light blue) molecules and RNA-protein complexes (P●R*) shown. In each case, protein concentration is varied (6, 18, 54, 400 arbitrary units), and KD equals 18 (in the same units). The total RNA concentration is 4 (A) and 36 (B). (C) Protein concentration dependence of binding in each of the above regimes. In the binding regime (green, [R*]total << KD from part A), the protein concentration at which half of the RNA is bound corresponds to the KD. In contrast, in the intermediate regime (purple, [R*]total > KD from part B), a greater protein concentration is required to achieve half-saturation (40 vs. 18 arbitrary units). The discrepancy would further increase with higher RNA concentrations, as shown in Figure 5—figure supplement 1. We can understand the origin of this discrepancy as follows. In part (A), the RNA concentration (red) is below the KD value and below the protein concentration (blue), such that the free concentration of the protein is essentially unchanged after RNA binding at both saturating (complete binding of RNA) and sub-saturating protein concentrations. Changing the RNA concentration in this regime would not change the fraction of RNA bound at a given total protein concentration, as long as the [R*]total << KD condition remains met. On the contrary, in part (B), the RNA concentration exceeds the dissociation constant (KD) and is high enough that a large fraction of the total protein is bound by RNA. Thus, the free protein concentration, which determines the extent of binding according to Equation 4a, is depleted and can no longer be approximated by the total protein concentration in Equation 4b to obtain an accurate KD value. 
On the molecular scale, the lowered free protein results in less binding. Consequently, for a given KD, more protein is required to achieve half-saturation at higher RNA concentration than with a trace concentration of RNA. Intuitively, at a concentration of RNA that is greater than KD there simply isn’t enough protein to occupy half the RNA when the total protein concentration is equal to KD.
At 25°C, the Puf4 binding curves were identical across a nine-fold range of RNA concentrations (Figure 6A,B), and the data were well described by Equation 4b. From the constancy of the binding curves in Figure 6B, we can conclude that the binding regime holds for Puf4 at 25°C, and thus that the observed KD value of 120 pM obtained from Equation 4b represents a true equilibrium constant. As expected for the binding regime, the measured KD is higher than the RNA concentrations (120 pM vs 2–18 pM).
Figure 6.
Varying the concentration of the 'trace' binding partner.
(A) Mixing scheme, as in Figure 4A but now with a series of labeled RNA concentrations. (B) Puf4 binding to different concentrations of 32P-labeled RNA at 25°C. For simplicity, only the lower limits of RNA concentration are indicated; the corresponding upper limits were 15–140 pM RNA (see Materials and methods and Appendix 2—note 4). Incubation time t1 was 0.5 hr, as established in Figure 4B. (C) Puf4 binding to different concentrations of 32P-labeled RNA at 0°C. Lower limits of labeled RNA concentration are indicated. Incubation time t1 was 40 hr. Note that these data are not fit well by Equation 4b, which assumes [R*]total << KD (solid lines). Quadratic fits, which do not assume negligible RNA concentration, are shown in dashed lines (Equation 5). (D) Effect of RNA concentration on apparent KD at 0°C. Red symbols indicate values from a hyperbolic fit (Equation 4b and solid lines in C) and grey symbols indicate values from fits to the quadratic equation (Equation 5). The error bars denote 95% confidence intervals, as determined by fitting the data to the indicated equation in Prism 8.
The situation is different at 0°C (Figure 6C). Here, varying the labeled RNA concentration revealed divergent binding curves and a pronounced dependence of apparent affinity (determined by fitting the data to Equation 4b) on the concentration of RNA, the constant component (Figure 6C,D). Moreover, the fits of the data to Equation 4b (solid lines in Figure 6C), which assumes [P]free ≈ [P]total, were poor, increasingly so for higher RNA concentrations. These data are indicative of protein depletion due to binding to labeled RNA. The apparent KD values vary by five-fold across the 30-fold range of RNA concentrations used (Figure 6D, red circles), and even greater discrepancies would arise at higher RNA concentrations (Figure 5—figure supplement 1).
Consequently, only an upper limit of the real affinity can be extracted from these data (KD ≤ 2.3 pM, based on the fit value at the lowest RNA concentration used).
To address the limitation in our 0°C data we could, in principle, lower the concentration of labeled RNA even further, until the labeled RNA concentration is well below the KD. However, with 32P-labeled RNA we are already near the limit of reliable detection. If the concentration of the trace component cannot be lowered further, a more sensitive approach can sometimes be found. Kinetic approaches are particularly suitable for tight binders (see Appendix 1), or one can report an upper limit of the KD. In some cases, increasing the salt concentration or other changes to the solution or binding partners can be used to weaken binding to make it easier to obtain affinities at higher concentrations of the labeled species; this approach can be especially valuable if one is primarily interested in the relative affinities of multiple ligands (Altschuler et al., 2013).
As noted earlier, the quadratic binding equation enables KD determination for binding reactions in the intermediate regime. The quadratic equation provides a good fit to the 0°C data (Figure 6C, dashed lines) and yields uniform and well-defined KD values of ~1.9 pM across the different RNA concentrations, consistent with an intermediate (rather than titration) regime (Figure 6D, grey circles). The same KD value was obtained from kinetic experiments, providing independent support for and confidence in this determination (Appendix 1).
In summary, we want to use the binding regime whenever possible, as it allows for the most straightforward and reliable KD measurements. It is necessary to avoid the titration regime and caution is required in the intermediate regime. In practice, varying the concentration of both components is an essential control for ruling out titration, ruling out other potential artifacts, and ensuring the measurement of valid dissociation constants.
Re-evaluating the equilibration time at 0°C
In the previous section, we mentioned the need for re-evaluating the equilibration time for Puf4 binding at 0°C after a binding regime was established. In principle, after determining sufficiently low RNA concentration for the binding regime, one could vary the incubation time again, as done in Figure 4. In our case, we used the shortcut defined in Equation 2 and instead determined the upper limit of the equilibration time by measuring the koff at 0°C (Appendix 1; see also Appendix 2—note 1 for precautions when applying this shortcut). These measurements revealed an equilibration time of 30 hr (five half-lives), far above the typical incubation times of 1 hr or less (Figure 1—figure supplement 1).
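As a rough illustration of this shortcut, the sketch below converts a dissociation rate constant into an upper limit for the equilibration time. It assumes that approach to equilibrium is a single exponential whose rate constant is at least koff, so that five dissociation half-lives suffice; the function names are hypothetical, and the koff value is the 0°C entry of Table 2.

```python
import math


def half_life_s(k_off):
    """Dissociation half-life in seconds for a first-order process."""
    return math.log(2.0) / k_off


def equilibration_time_hr(k_off, n_half_lives=5.0):
    """Upper limit of the equilibration time, in hours."""
    return n_half_lives * half_life_s(k_off) / 3600.0


def fraction_equilibrated(n_half_lives):
    """Fraction of the approach to equilibrium completed after n half-lives."""
    return 1.0 - 0.5 ** n_half_lives


if __name__ == "__main__":
    k_off_0C = 2.92e-5  # s^-1, from Table 2
    print(equilibration_time_hr(k_off_0C), fraction_equilibrated(5.0))
```

With this koff the estimate is on the order of 30 hr, and five half-lives correspond to about 97% completion.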
Dependence of binding affinity on conditions
The 100-fold difference in Puf4 affinity between 0°C and 25°C underscores the important point that the equilibrium dissociation constant is only a constant value at a given set of conditions, and that the affinity can change dramatically when the conditions (temperature, salt, pH) are changed. This dependence on conditions should always be considered when comparing literature values or when applying in vitro results to biology.
Test KD by an independent approach
Even when no challenges are encountered, as in the case of Puf4 binding at 25°C, it is a good idea to determine the KD by a second approach to ensure that the measurement is not biased by experimental artifacts or idiosyncrasies of a particular technique. This is especially important when using a secondary readout (vs. a direct approach) such as native gel shift or nitrocellulose filter binding, where major loss (or gain) of bound complex can potentially occur between the equilibration and detection steps (see below and Appendix 2—note 2).
Of course, there are many approaches to carrying out equilibrium binding measurements one can choose from (e.g. Velázquez-Campoy et al., 2004; Wong and Lohman, 1993; Eftink, 1997; McDonnell, 2001). Here, we used a kinetic approach for independent KD determination for Puf4 at 25°C and 0°C, as described in Appendix 1. Kinetic measurements provide an information-rich alternative and complement to the equilibrium measurements and are often simple to carry out provided they fall within a measurable time range (Pollard, 2010; Hulme and Trevethick, 2010; Sanders, 2010; Pollard and De La Cruz, 2013). In the case of Puf4, the affinities determined by kinetic measurements were within two-fold of those from equilibrium determinations, strongly supporting their accuracy.
Determine the fraction of active protein
The amount of bound ligand is determined not by the total protein concentration but by the concentration of total active protein. If 90% of the protein is damaged due to misfolding, aggregation, degradation or, for example, inactivated by phosphorylation at the binding interface, then the observed affinity will be that for only 10% of the total protein present, and the apparent KD will be ten-fold higher than the actual KD value. Moreover, if the binding-competent protein concentration is much lower than the total and therefore much closer to the limiting component concentration than expected, the binding regime may not be maintained, leading to even greater discrepancies between the real and observed KD. As a common cause of non-active or less active protein is aggregation, determining the monodispersity of the protein following purification is advisable (Altschuler et al., 2013).
In addition, we recommend, when possible, a titration experiment to determine the fraction of binding-competent protein (Altschuler et al., 2013). Here, a concentration of ligand that is much greater than the measured KD is intentionally used and the protein concentration is varied by approximately an order of magnitude above and below the ligand concentration. To ensure accurate ligand concentration and to prevent excessive signal (if labeled ligand is used), the trace labeled ligand should be mixed with a large excess of identical unlabeled molecule at a known concentration. Assuming that the stoichiometry of the bound complex is known and that the ligand is 100% active, the breakpoint in fraction bound versus the ratio of protein to ligand indicates the amount of active protein (Figure 7). For example, for a 1:1 complex, a breakpoint at a protein:RNA ratio of 2.0 suggests that half of the protein is active. In Figure 7, the breakpoint ratio of ~1.3 suggests that the Puf4 preparation is roughly 75% active (1/1.3 ≈ 0.77).
Consequently, the apparent KD values determined in the previous sections should be multiplied by the active protein fraction (which ranged from 0.75 to 0.90 for Puf4) to determine the final KD value. In an alternative approach, the titration data could be fit to a quadratic equation, with a coefficient used to represent the active protein fraction (Figure 7—figure supplement 1).
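The correction described above amounts to two short calculations, sketched here in Python. The function names and the `stoichiometry` parameter are illustrative assumptions; the numerical example uses the 0°C upper limit of 2.3 pM and an active fraction of 0.75 quoted in the text.

```python
def active_fraction_from_breakpoint(breakpoint_ratio, stoichiometry=1.0):
    """Active fraction = expected protein:ligand ratio / observed breakpoint ratio."""
    if breakpoint_ratio <= 0:
        raise ValueError("breakpoint ratio must be positive")
    return stoichiometry / breakpoint_ratio


def corrected_kd(apparent_kd, active_fraction):
    """Scale the apparent KD by the active fraction to obtain the final KD."""
    if not 0.0 < active_fraction <= 1.0:
        raise ValueError("active fraction must be in (0, 1]")
    return apparent_kd * active_fraction


if __name__ == "__main__":
    print(active_fraction_from_breakpoint(2.0))  # 1:1 complex, breakpoint at 2.0
    print(corrected_kd(2.3, 0.75))               # pM
```

The second call reproduces, to rounding, the normalized upper limit listed for 0°C in Table 2.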
Figure 7.
Measuring the fraction of active protein by titration.
The fraction of active protein is derived from the breakpoint, that is, the intersection of linear fits to the low and high-Puf4 concentration data. See Figure 7—figure supplement 1 for an alternative strategy using Equation 5.
Figure 7—figure supplement 1.
Determination of the fraction of active protein from a quadratic fit.
Fits of titration data at 100 nM (A) and 10 nM (B) RNA to the quadratic equation are shown. The quadratic equilibrium-binding equation (Equation 5) was modified to include a term for the active protein fraction.
A and O correspond to the amplitude and Y axis offset, respectively; F is the fraction of active protein; [R*]total was constrained to the known RNA concentration (10 or 100 nM); here, the KD value was constrained to the known affinity (Table 2). The last constraint is optional, as the KD value contributes minimally to the fit at these high RNA concentrations and because the exact value may not yet be known at the time of measuring the active protein fraction. The fit fractions of active protein (F) are almost identical to those determined from linear fits of the same data in Figure 7 (~0.75).
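One plausible form of such a modified quadratic model is sketched below in Python. This is an assumption rather than the authors' exact parameterization: the total protein concentration is scaled by the active fraction F before entering the standard 1:1 quadratic solution, and the result is scaled by an amplitude A and shifted by an offset O. The function and argument names are hypothetical.

```python
import math


def titration_signal(p_total, r_total, kd, f_active, amplitude=1.0, offset=0.0):
    """Observed fraction bound with only f_active * p_total competent to bind.

    All concentrations must share the same units (for example nM).
    """
    p_active = f_active * p_total
    b = p_active + r_total + kd
    fraction = (b - math.sqrt(b * b - 4.0 * p_active * r_total)) / (2.0 * r_total)
    return amplitude * fraction + offset


if __name__ == "__main__":
    # Illustrative values: 100 nM RNA, KD = 0.12 nM, 75% active protein
    for p in (25.0, 50.0, 100.0, 133.0, 200.0, 1000.0):
        print(p, titration_signal(p, 100.0, 0.12, 0.75))
```

Because the RNA concentration is far above KD, the curve rises almost linearly with slope F/[R*]total until the active protein matches the RNA concentration, which is why KD contributes little to the fit.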
Table 2.
Summary of equilibrium and kinetic measurements of Puf4 affinity.
| Temperature, °C | Equilibrium*: KD(hyperbolic), pM | Equilibrium*: KD(quadratic), pM | Kinetic: kon, M−1 s−1* | Kinetic: koff, s−1 | Kinetic: KD (=koff/kon), pM |
|---|---|---|---|---|---|
| 0 | ≤1.7 | 1.39 ± 0.09 | (2.85 ± 0.14)×10^7 | (2.92 ± 0.17)×10^−5 | 1.02 ± 0.08 |
| 25 | 120 ± 30 | 120 ± 30 | (1.04 ± 0.14)×10^8 | 0.014 ± 0.003 | 130 ± 30 |
*The values have been normalized by active protein fraction (75–90%). KD(hyperbolic) and KD(quadratic) refer to values derived from fits to Equation 4b and Equation 5, respectively. Errors are defined in Materials and methods.
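The kinetic KD values in Table 2 follow from KD = koff/kon. The short Python sketch below recomputes them; the error estimate uses standard propagation for a quotient of independent quantities, which is an assumption here and not necessarily how the published errors were obtained. Function names are illustrative.

```python
import math


def kd_from_rates_pM(k_off, k_on):
    """KD in pM from koff (s^-1) and kon (M^-1 s^-1)."""
    return k_off / k_on * 1e12


def kd_error_pM(k_off, k_off_err, k_on, k_on_err):
    """Propagated error of koff/kon, assuming independent errors."""
    relative = math.sqrt((k_off_err / k_off) ** 2 + (k_on_err / k_on) ** 2)
    return kd_from_rates_pM(k_off, k_on) * relative


if __name__ == "__main__":
    # Values taken from Table 2
    print(kd_from_rates_pM(2.92e-5, 2.85e7), kd_error_pM(2.92e-5, 0.17e-5, 2.85e7, 0.14e7))
    print(kd_from_rates_pM(0.014, 1.04e8), kd_error_pM(0.014, 0.003, 1.04e8, 0.14e8))
```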
A limitation of the titration experiment is that it assumes the constant component to be 100% active, which may not always be the case, especially in the case of protein-protein interactions. Therefore, one should ensure, to the extent possible, maximum purity of both binding components. Importantly, one should always make clear whether experiments were carried out to determine ‘fraction active’.
The case of no observed binding
Researchers often conclude that there is ‘no binding’, that is, that ‘X does not bind to Y’. Typically, the underlying experimental observation is an absence of observed binding up to a certain protein (or ligand) concentration. Therefore, one should report a lower limit for the dissociation constant (KD), rather than draw an absolute conclusion of ‘no binding’. But even an accurate lower limit often requires additional experiments, because the absence of observed binding (say, in a gel shift, filter binding, or pull-down experiment) can arise either because there is no significant binding or because the complex does not withstand the assay conditions (Pollard, 2010). While this objection may seem like a technicality, there are many instances where known binders do not give a gel shift or filter binding signal.
Immuno-precipitation and pull-down assays are pervasive in current biological investigations and are often interpreted in terms of ‘binding’ or ‘no binding’. But the reality of the interpretation of these experiments, and the reality of molecular interactions, is more nuanced (Pollard, 2010). A ligand with the same affinity, slightly lower affinity, or even higher affinity than another ligand with demonstrated binding can incorrectly be concluded to ‘not bind’.
Consider, for example, an RNA pull-down with an RNA binding protein with KD = 10^−9 M and kon = 10^8 M−1 s−1; this gives koff = 0.1 s−1 or a half-life for dissociation of ~7 s. If the washing steps following a pull-down take 30 s, only ~5% of the complex is expected to remain. If the affinity is 10-fold weaker (koff = 1 s−1), then no detectable complex is likely to remain after 30 s of washing (~10^−13 of the starting amount). Further, if another RNA ligand binds with the same affinity, but 10-fold slower (and thus also dissociating 10-fold slower; koff = 0.01 s−1, half-life of ~70 s), most (~75%) of the complex will remain after the 30 s washing steps despite an identical KD to the first ligand.
In addition, the limited dynamic range of visual readouts of gels that are often used to evaluate pull-down experiments increases the danger of misinterpretation or overinterpretation of these experiments.
Overall, observing binding in pull-downs and related experiments is a complex function of the experimental components and conditions. This doesn’t at all mean these experiments should not be done; they often provide critical clues and insights into biology. But, for these and all experiments, we need to keep in mind the nature of the assay, and thus what can and cannot be concluded from the experiment.
Whether binding is absent or not detected can be tested by using approaches that directly report on the equilibrium between bound and unbound components in solution (e.g. ITC, fluorescence anisotropy, and other fluorescence-based techniques), as opposed to indirect approaches like native gel shift and pull-downs that are based on physically separating bound and unbound components, so that unstable complexes may fall apart prior to the detection step. Nevertheless, direct approaches also have limitations. For example, fluorescence intensity or FRET (Förster resonance energy transfer) is limited at high concentrations by inner filter effects, and ITC will miss binding events when the release (or uptake) of heat upon binding is too small (i.e. the binding enthalpy is too small).
A simple way to test whether binding occurs when there is no binding signal is to carry out a competition experiment. If the ligand is bound but not detected in an approach such as native gel shift or filter binding, it will still lessen binding of another ligand for which there is an established signal. The amount lessened depends quantitatively on the KD values and concentrations of each ligand, given sufficient time for equilibration.
A competition experiment to obtain the KD value for a weak RNA substrate of Puf4 is shown in Appendix 3, along with the binding scheme and equation to determine the KD value.
Competition binding measurements can also have a practical benefit; after an initial KD is determined for a labeled substrate, KD values for additional substrates can be determined by competition without labeling each substrate (Hulme and Trevethick, 2010; Sanders, 2010; Ryder et al., 2008).
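The loss of complex during the washing steps of a pull-down, discussed earlier in this section, can be estimated with a few lines of Python. This is a minimal sketch assuming simple first-order dissociation with no rebinding during the wash; the function names are illustrative.

```python
import math


def k_off_from_kd(kd_M, k_on):
    """Dissociation rate constant (s^-1) from KD (M) and kon (M^-1 s^-1)."""
    return kd_M * k_on


def dissociation_half_life_s(k_off):
    """Half-life of the complex in seconds."""
    return math.log(2.0) / k_off


def fraction_remaining(k_off, wash_time_s):
    """Fraction of complex surviving a wash of the given duration."""
    return math.exp(-k_off * wash_time_s)


if __name__ == "__main__":
    wash = 30.0  # seconds
    for kd, k_on in ((1e-9, 1e8), (1e-8, 1e8), (1e-9, 1e7)):
        k_off = k_off_from_kd(kd, k_on)
        print(kd, k_on, dissociation_half_life_s(k_off), fraction_remaining(k_off, wash))
```

The first and third cases share the same KD yet differ greatly in the amount of complex that survives, because survival depends on koff rather than on affinity.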
Discussion
Given the increasingly multi-disciplinary nature of research, scientists are increasingly venturing into disciplines outside their expertise. Our goal is to support these valuable efforts by enabling both experts and non-experts in thermodynamics to get the most out of their binding experiments, and to help them evaluate work by others, published or under review for publication.
While the number of steps described to obtain reliable equilibrium data may initially seem daunting, the accompanying experimental illustrations and guides can transform an opaque process into one that is readily understandable and can be carried out in a straightforward, stepwise fashion by researchers from varied backgrounds.
We found it useful to develop and use an Equilibrium Binding Checklist to organize our approach and findings. We provide a template of such a checklist, along with completed examples in Appendix 4 (Appendix 4—figure 1, 2, 3). We expect that many readers will find these valuable.
Appendix 4—figure 1.
Equilibrium binding checklist template.
Appendix 4—figure 2.
Example of a completed equilibrium binding checklist based on Puf4/RNA binding at 25°C.
Appendix 4—figure 3.
Example of a completed equilibrium binding checklist based on Puf4/RNA binding at 0°C.
There has been much discussion about problems with reproducibility and rigor in the scientific literature (Landis et al., 2012; Plant et al., 2014; Nature, 2013; Nosek and Errington, 2017; Koroshetz et al., 2020). Historically, a powerful means to ensure reliability of published data has been to develop community standards. Reporting guidelines have been successfully adopted by journals in a variety of fields, including structural biology (Berman et al., 2000), enzymology (http://www.beilstein-institut.de/en/projects/strenda/guidelines), organic synthesis (e.g. http://pubs.acs.org/page/joceah/submission/ccc.html), and many others, and new standards, guidelines and databases are continually being devised (see https://fairsharing.org/ for a curated list). We encourage journals to adopt analogous standards for reporting binding measurements. Contingent on implementation of such standards, we ultimately envision a well-curated and well-documented quantitative database that is routinely used to build and test models for individual molecular interactions and for cellular and molecular networks.
Materials and methods
Survey of published equilibrium binding measurements
We surveyed 100 papers, including 66 papers from the list of quantitative RNA/protein studies assembled by the Liu lab (Yang et al., 2013) and 34 additional studies reporting KD and apparent KD values for RNA/protein interactions (Supplementary file 1). To confirm that our survey was not biased, we also scored 20 publications from a single PubMed search for ‘RNA protein binding dissociation constant’, after confirming that they reported KD values for RNA/protein binding. Four of the 20 papers also appeared in the above list. The fractions of papers controlling for equilibration and/or titration were similar to those in the main survey (Figure 1): 30% of the 20 papers controlled both for equilibration and titration, 15% controlled for neither, 50% only controlled for titration and 5% only controlled for equilibration.
Equilibration was evaluated as follows. If a study reported systematically varying the incubation time, it was counted as controlled for equilibration. If dissociation kinetics were measured in addition to performing equilibrium measurements (n = 3), the study was scored as equilibration-controlled, but only if the reported incubation time was at least three half-lives based on the reported koff, and only if the kinetic and equilibrium experiments were performed at the same conditions (n = 1). Studies exclusively using approaches that intrinsically monitor the binding progress (ITC, SPR, biolayer interferometry [BLI]) also were counted as equilibration controlled. However, if several approaches were used in a given study to determine affinities for distinct binding interactions and/or conditions, and if for at least one approach time was not varied, the study was scored as not equilibration controlled. Some exceptions where equilibration can be reasonably assumed are noted in Supplementary file 1.
To generate Figure 1—figure supplement 1, we used the incubation times reported for non-equilibration controlled binding experiments.
If a narrow range of times (e.g. 15–20 min, 45–60 min; n = 2) was indicated, this was not counted as systematically varying time and the longer time was used for Figure 1—figure supplement 1. If only a lower limit of the incubation time was reported (e.g. ‘at least 30 min’; n = 1), this lower limit was used for Figure 1—figure supplement 1. If two sequential incubations were performed at different temperatures (e.g. ‘10 min at room temperature and 10 min at 4°C’, n = 4), the total incubation time was used for the purposes of the survey. However, since affinity is condition-specific, only equilibration at a constant temperature can yield meaningful KD values, and two-temperature incubations should be avoided.
To evaluate if titration was controlled for, first, we confirmed if the concentration of the limiting species was systematically varied to determine effects on KD (n = 5); these studies were counted as titration controlled. If a study reported a range of concentrations of the limiting species, without stating that the effects on KD were assessed, we did not count this as a titration control, as in practice such a range typically only indicates optimization of radioactive/fluorescent signal to account for radioactive decay and/or varying labeling efficiencies. For the remaining studies, we asked if Equation 4b (which assumes the binding regime) or Equation 5 (which also allows for the intermediate regime) was used to fit the data. If no equation was indicated, or if the midpoint of the binding curve/gel signal was used to determine the KD, or if linear transformation was used in lieu of the hyperbolic fit, we counted the study as using Equation 4b. For studies using Equation 4b, we asked if the lowest apparent KD value was in at least 10-fold excess over the limiting component concentration, in which case we counted the study as titration controlled. If a range of limiting component concentrations was reported, we used the lowest value.
If only the amount (not concentration) of the limiting species was reported, the concentration was calculated based on the provided volume or, if not indicated, based on a 10 µL reaction volume; nevertheless, binding equilibria depend on concentrations, not amounts, and concentrations, in units of ‘M’, should always be indicated. If Equation 5 was used (incl. all ITC measurements), we counted the study as titration controlled, unless the reported KD was more than 1000-fold below the limiting species concentration (corresponding to a cutoff typically used in ITC [Velázquez-Campoy et al., 2004]). For simplicity, we assumed that all SPR/BLI measurements (where the concentration of the immobilized species is difficult to estimate and not reported) were titration controlled; nevertheless, we emphasize the importance of explicitly reporting controls for mass transport in SPR measurements (Myszka, 1999). If multiple approaches were used, but at least in one approach titration was not controlled for according to the above criteria, the study was scored as not titration controlled, unless the affected values were corroborated by a titration-controlled approach in the same study.
If no details on the incubation time and/or the concentration of the limiting reagent were provided, but instead a previous study was cited (‘as described’, n = 4), the information for the above evaluation was obtained from the cited study. This included two cases in which the authors had performed rigorous equilibration and titration controls in their previous referenced work.
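The core of the titration scoring rules above can be written as a short decision function. The Python sketch below is a simplified illustration, not the procedure actually used for the survey: it covers only the concentration-based criteria (systematic variation of the limiting species, the 10-fold rule for Equation 4b, and the 1000-fold cutoff for Equation 5), and the function name and argument labels are hypothetical.

```python
def titration_controlled(equation, kd, limiting_conc, varied_limiting=False):
    """Score one measurement for titration control.

    equation: "hyperbolic" (Equation 4b) or "quadratic" (Equation 5)
    kd, limiting_conc: same concentration units
    varied_limiting: True if the limiting species was systematically varied
    """
    if varied_limiting:
        return True
    if equation == "hyperbolic":
        # KD must be in at least 10-fold excess over the limiting component
        return kd >= 10.0 * limiting_conc
    if equation == "quadratic":
        # Not controlled if KD is more than 1000-fold below the limiting component
        return kd >= limiting_conc / 1000.0
    raise ValueError("equation must be 'hyperbolic' or 'quadratic'")


if __name__ == "__main__":
    print(titration_controlled("hyperbolic", 120e-12, 2e-12))
    print(titration_controlled("hyperbolic", 2.3e-12, 1e-12))
```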
Puf4 purification
The RNA-binding domain (residues 537–888) of S. cerevisiae Puf4 was cloned into a custom pET28a-based expression vector in frame with an N-terminal 6X His-tag and a C-terminal SNAP tag (New England Biolabs, Ipswich, MA). The construct was transformed into E. coli protein expression strain BL21 (DE3) and protein expression was induced at an OD600 of 0.6 with 1 mM IPTG at 20°C for ~20 hr. Induced cells were harvested by centrifugation at 4500 × g for 20 min. Cell pellets were re-suspended in Buffer A (20 mM HEPES-sodium (HEPES-Na), pH 7.4, 500 mM potassium acetate (KOAc), 5% glycerol, 0.2% Tween-20, 10 mM imidazole, 2 mM dithiothreitol (DTT), 1 mM phenylmethylsulfonyl fluoride (PMSF) and cOmplete, Mini, protease inhibitor cocktail (Roche Diagnostics GmbH, Mannheim, Germany)) and lysed four times using an Emulsiflex (Avestin, Inc, Ottawa, ON, Canada). The lysate was clarified by centrifugation at 20,000 × g for 20 min, nucleic acids were precipitated with polyethylene imine (0.21% final concentration) at 4°C for 30 min with constant stirring and pelleted by centrifugation at 20,000 × g for 20 min. The supernatant was loaded on a Nickel-chelating HisTrap HP column (GE Healthcare, Pittsburgh, PA). Bound protein was washed extensively over a shallow 10–25 mM imidazole gradient and eluted over a linear 25–500 mM gradient of imidazole. Peak Puf4 protein fractions were pooled and desalted into Buffer B (20 mM HEPES-Na, pH 7.4, 50 mM KOAc, 5% glycerol, 0.1% Tween-20, 2 mM DTT) using a desalting column. The His-tag was cleaved by overnight incubation with His-tagged TEV protease at 4°C, and the protein was purified on a HisTrap HP column. The flow-through was desalted into Buffer B and loaded on a HiTrap Q HP column (GE Healthcare) and washed extensively with Buffer B to remove any bound RNA. Protein was eluted over a linear gradient of potassium acetate from 50 to 1000 mM.
Protein fractions were pooled and desalted into Buffer C (20 mM HEPES-Na, pH 7.4, 100 mM KOAc, 5% glycerol, 0.1% Tween-20 and 2 mM DTT), concentrated and diluted two-fold with Buffer C containing 80% glycerol for final storage at −20°C. UV absorbance spectra indicated that the protein was free from significant RNA contamination (<1 RNA base per protein).
RNA 5´-end labeling
Puf4_HO RNA (AUGUGUAUAUUAGU; Integrated DNA Technologies (IDT), Coralville, IA; 5 µM) was labeled with equimolar [γ-32P] ATP (Perkin Elmer, Inc, Boston, MA) using T4 polynucleotide kinase (Thermo Fisher Scientific, Vilnius, Lithuania) and purified by non-denaturing gel electrophoresis (20% acrylamide). The RNA was eluted into TE buffer (10 mM Tris-HCl, pH 8.0; 1 mM EDTA) at 4°C overnight, and the lower limit of eluted RNA concentration, assuming no unlabeled RNA, was determined by scintillation counting and calibration against the specific activity of the [γ-32P] ATP stock used for labeling. The upper limit of RNA concentration was calculated from total RNA input and the elution buffer volume, assuming a 100% yield.
Equilibrium binding measurements
All reactions were performed in a binding buffer containing 20 mM HEPES-sodium or HEPES-potassium buffer, pH 7.4, 2 mM magnesium chloride (MgCl2), 100 mM KOAc, 2 mM DTT, 0.2% Tween 20, 5% glycerol, 0.1 mg/ml BSA, at 25 or 0°C, as indicated. The protein and labeled RNA dilutions were prepared in binding buffer at two-times the indicated concentration and were kept on ice until the binding reactions were initiated by mixing 10 µL of protein with 10 µL of labeled RNA. The pipette tips used for mixing and aliquoting the 0°C reactions were kept on ice. The labeled RNA concentrations and incubation times are indicated in the individual figure legends. Following the incubation, 7.5 µL aliquots were moved to 5 µL of ice-cold loading buffer containing 6.25% Ficoll PM 400 (Sigma-Aldrich, Saint Louis, MO), 0.075% bromophenol blue (BPB), and 2.5 µM unlabeled Puf4_HO RNA. The unlabeled RNA in the loading buffer prevented additional association to the labeled RNA from occurring during sample loading (Appendix 2—note 2). Control experiments indicated negligible re-equilibration in loading buffer (t1/2 ≥ 3 hr in three independent measurements), consistent with the slow dissociation rate constant measured in binding buffer at 0°C (Appendix 1). All samples were loaded on the gel within 20 min from mixing with the loading buffer. Non-denaturing acrylamide gels (20%) were pre-run for at least 1 hr at 42 V/cm constant voltage, 4–6°C with 0.5x TBE buffer (50 mM Tris, 42 mM boric acid, 0.5 mM EDTA•Na2, pH 8.5–8.6 final) using a circulating cooling system. Aliquots (7.5 µL) were carefully loaded on continuously running gels and separated for 45–90 min. (Extreme caution must be exercised at this step; see, e.g. https://ehs.stanford.edu/reference/electrophoresis-safety for electrical safety hazards.) The gels were dried and exposed to phosphorimager screens, scanned with a Typhoon 9400 Imager and quantified with TotalLab Quant software (TotalLab, Newcastle-Upon-Tyne, UK). 
Fitting was performed with KaleidaGraph 4.1 (Synergy Software, Reading, PA; RRID:SCR_014980).
The KD values in Table 2 indicate the average and standard error from five independent equilibrium experiments (25°C). For 0°C measurements, KD(hyperbolic) indicates the upper limit determined using Equation 4b at the lowest RNA concentration (Figure 6C,D); KD(quadratic) indicates the average and standard error of KD values determined with Equation 5 at the four RNA concentrations shown in Figure 6C,D.
Kinetic measurements
Measurements of koff (Appendix 1) were performed by incubating the indicated concentrations of Puf4 with a trace concentration of labeled Puf4_HO RNA for 10 min at 25°C or 0°C in the binding buffer described in Equilibrium binding measurements. Labeled RNA concentrations were 0.04–0.5 nM, corresponding to the lower and upper limits, as defined in RNA 5´-end labeling. Dissociation was initiated by transferring the binding reaction to a 2.5x volume of unlabeled chase in binding buffer. The chase RNA concentrations in the final reaction were 250 nM and 1000 nM. At various times, 7.5 µL aliquots were moved to 5 µL of ice-cold loading buffer containing 6.25% Ficoll PM 400 and 0.075% BPB, and 7.5 µL aliquots were loaded on a pre-run, continuously running 20% non-denaturing gel at 4–6°C. All pipette tip boxes and solutions used for the 0°C reactions were kept on ice. The chase solution for the 25°C reaction was pre-warmed in a 25°C water bath for 10 min before initiating the dissociation reaction. All time courses were fit to single exponentials using KaleidaGraph 4.1.
The effectiveness of unlabeled Puf4_HO RNA chase was tested by pre-incubating 10 nM Puf4 with 100–1000 nM unlabeled RNA (final concentrations) for 12 min at 25°C before adding a trace amount of labeled Puf4_HO RNA (0.04–0.4 nM). The fractions of bound labeled RNA ranged from 0.01 (1000 nM) to 0.1 (100 nM), compared to 0.95 fraction bound in the absence of chase, confirming the effectiveness of the chase.
The koff values reported in Table 2 indicate the average and standard error from two replicate experiments (25°C) or the average and standard error across different concentrations in a single experiment (0°C).
Values of kon were determined by mixing 40 µL each of trace labeled RNA solution (0.004–0.05 nM) and varying dilutions of Puf4. 
At varying times, 7.5 µL aliquots were transferred to 5 µL of ice-cold loading buffer containing 6.25% Ficoll PM 400, 0.075% BPB, and 2.5 µM unlabeled Puf4_HO RNA and loaded on a 20% gel as above. The protein and RNA solutions were pre-incubated at the reaction temperature (0°C or 25°C) before mixing, and ice-cold tips were used for the 0°C reactions. To control for titration by labeled RNA at the low protein concentrations used, at 0°C, the equilibration rate constants were also measured at three-fold higher labeled RNA concentration, giving consistent rate constants within 1.1–1.3-fold (Appendix 1).
The kon values reported in Table 2 are the slopes and standard errors of linear fits to observed rate constants from two replicate experiments (25°C) or a single experiment (0°C). The kon values were corrected for the active protein fraction.
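As an illustration of the kon determination described above, the following is a minimal sketch, not the authors' KaleidaGraph procedure. It assumes the standard pseudo-first-order relationship k_obs = kon[P] + koff, so that kon is the slope of observed rate constants plotted against protein concentration. The function name `fit_kon`, the data values, and the active fraction of 0.8 are hypothetical.

```python
import numpy as np

def fit_kon(protein_nM, k_obs, active_fraction=1.0):
    """Fit k_obs = kon*[P] + koff by linear regression.

    protein_nM is the nominal protein concentration. Dividing the slope by
    the active fraction converts it to a kon per active protein
    (assumption: inactive protein does not bind at all).
    Returns (kon, intercept); the intercept estimates koff.
    """
    slope, intercept = np.polyfit(protein_nM, k_obs, 1)
    return slope / active_fraction, intercept

# Hypothetical, noise-free observed rate constants (min^-1) at four protein concentrations (nM).
protein_nM = np.array([0.5, 1.0, 2.0, 4.0])
k_obs = 0.02 * protein_nM + 0.01

kon, koff_from_intercept = fit_kon(protein_nM, k_obs, active_fraction=0.8)
print(f"kon = {kon:.4f} nM^-1 min^-1, intercept (koff) = {koff_from_intercept:.4f} min^-1")
```

With real data the intercept is often poorly determined, which is one reason to measure koff directly in a chase experiment as described in this section.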
Measuring the fraction of active protein by titration
Unlabeled Puf4_HO RNA (10 or 100 nM) was incubated for 30 min with varying Puf4 concentrations in the presence of trace labeled Puf4_HO RNA (0.06–0.4 nM); the labeled and unlabeled RNAs were pre-mixed before adding Puf4. The fraction of bound RNA was determined as described in Equilibrium binding measurements.
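The logic of such a titration can be sketched as follows. This is an illustrative example under stated assumptions, not the authors' analysis: it assumes the standard quadratic binding equation for 1:1 binding, an RNA concentration far above KD, and inactive protein that does not bind. The function names, the KD of 0.1 nM, and the true active fraction of 0.8 are hypothetical. In this regime nearly every active protein molecule is bound, so the slope of bound RNA against nominal protein concentration approximates the active fraction.

```python
import numpy as np

def fraction_bound_quadratic(p_active, r_total, kd):
    """Fraction of RNA bound for 1:1 binding, accounting for protein depletion."""
    b = p_active + r_total + kd
    return (b - np.sqrt(b**2 - 4.0 * p_active * r_total)) / (2.0 * r_total)

def estimate_active_fraction(p_nominal, frac_bound, r_total):
    """Slope through the origin of bound RNA (nM) vs nominal protein (nM).

    Only points well below the equivalence point should be passed in.
    """
    p_nominal = np.asarray(p_nominal, dtype=float)
    bound_rna = np.asarray(frac_bound, dtype=float) * r_total
    return np.sum(p_nominal * bound_rna) / np.sum(p_nominal**2)

# Hypothetical titration: 100 nM RNA, KD = 0.1 nM, 80% of the protein active.
true_active = 0.8
r_total, kd = 100.0, 0.1
p_nominal = np.array([10.0, 20.0, 40.0, 60.0])  # nM, all below the equivalence point

frac_bound = fraction_bound_quadratic(true_active * p_nominal, r_total, kd)
estimated = estimate_active_fraction(p_nominal, frac_bound, r_total)
print(f"estimated active fraction = {estimated:.3f}")
```

The estimate falls slightly below the true value because a small amount of active protein remains free even at RNA concentrations 1000-fold above KD.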
Competition measurements
Trace labeled Puf4_HO RNA (0.02–0.19 nM) was equilibrated with 0.4 nM or 1.2 nM Puf4 and diluted two-fold into solutions containing varying concentrations of unlabeled competitor RNA (CGUAUAUUA; IDT). The reactions were incubated at 25°C for the indicated time, followed by transfer of 7.5 µL aliquots to 5 µL ice-cold loading buffer (6.25% Ficoll PM 400, 0.075% BPB, and 2.5 µM unlabeled Puf4_HO RNA). The samples were loaded immediately on a continuously running native acrylamide gel (4–5°C). The curves were fit to Equation 9, as described in Appendix 3.
Simulations
The simulated data in Figure 5 were generated by using Equation 4b (panel A) and Equation 5 (panel B) to calculate the fraction of bound RNA at each total protein concentration. In Figure 5—figure supplements 1, 2, 4 and 5, Equation 5 was used to calculate fractions bound at each protein and ligand concentration. In Figure 4—figure supplement 1, Equation 4b was used to determine the fraction of ligand bound at each protein concentration at equilibrium, assuming [P] = [P]total. This equilibrium value was then used as an amplitude (A) term in the single-exponential equation shown in Figure 2 to determine the fraction of bound ligand at each time point t: $\text{Fraction bound}(t) = A\,(1 - e^{-k_{\text{equil}}t}) = A\,(1 - e^{-(k_{\text{on}}[\text{P}] + k_{\text{off}})t})$.
The simulated data in Figure 5—figure supplement 3 were generated as follows. First, Equation 5 was used to calculate the expected fraction of bound RNA at equilibrium for each [R]total and [P]total indicated in the figure. Two-fold serial dilution of protein was chosen as representative of a typical equilibrium binding experiment. In the case of 0.001 nM Rtotal, Equation 4b was used instead to calculate the expected fraction bound, as this condition satisfies the [P]free = [P]total assumption. Random noise in fraction bound was then generated around each predicted data point by sampling from a normal distribution with the indicated standard deviation, using the scipy and random packages in Python. Ten binding series were generated this way for each condition and each noise level. 
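The data generation described above can be sketched as follows. This is a minimal illustration rather than the authors' script: it uses NumPy's random generator in place of the scipy and random packages, and it assumes that Equation 4b and Equation 5 correspond to the standard hyperbolic and quadratic 1:1 binding equations. The function names, the protein concentration range, and the noise level are hypothetical.

```python
import numpy as np

def fraction_bound_hyperbolic(p_total, kd):
    """Hyperbolic binding; valid when [P]free ~ [P]total (assumed form of Equation 4b)."""
    p_total = np.asarray(p_total, dtype=float)
    return p_total / (p_total + kd)

def fraction_bound_quadratic(p_total, r_total, kd):
    """Quadratic binding; accounts for protein depletion (assumed form of Equation 5)."""
    p_total = np.asarray(p_total, dtype=float)
    b = p_total + r_total + kd
    return (b - np.sqrt(b**2 - 4.0 * p_total * r_total)) / (2.0 * r_total)

def simulate_series(p_total, r_total, kd, noise_sd, n_series=10, seed=0):
    """Return n_series noisy binding curves (rows) around the ideal quadratic curve."""
    rng = np.random.default_rng(seed)
    ideal = fraction_bound_quadratic(p_total, r_total, kd)
    noise = rng.normal(0.0, noise_sd, size=(n_series, ideal.size))
    return ideal + noise

# Two-fold serial dilution of protein from a hypothetical top concentration of 100 nM.
p_total = 100.0 / 2.0 ** np.arange(12)

# Ten noisy series for one condition: 1 nM RNA, KD = 0.1 nM, SD = 0.05 in fraction bound.
series = simulate_series(p_total, r_total=1.0, kd=0.1, noise_sd=0.05)
print(series.shape)
```

At [P]total = KD = 0.1 nM the quadratic form gives about 0.09 fraction bound with 1 nM RNA but about 0.5 with 0.001 nM RNA, which is why the hyperbolic form is adequate only for the lowest RNA concentration.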
These datasets were then individually fit to Equation 5 (or Equation 4b in the case of 0.001 nM Rtotal) in Prism 8 (GraphPad Software, LLC, San Diego, CA; RRID:SCR_002798), with the equations modified to include amplitude (A) and y axis offset (O) terms: $\text{Fraction bound} = A \times F + O$, where $F$ is the fraction bound given by the unmodified equation. To facilitate fitting to Equation 6, [R]total was constrained to the known value, and the KD was constrained to positive values only, with the real affinity (0.1 nM) used as an initial estimate.
In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.
Acceptance summary:
Given the ubiquitous nature of binding measurements in the literature, including newly emerging high-throughput approaches, this manuscript addresses an important and timely topic. This manuscript is particularly compelling in providing an easy-to-follow set of practical guidelines exemplified with relevant binding data. The authors' approach to this important topic is highly pedagogical and should be a must-read for anyone with the ambition to quantitatively characterize binding equilibria.
Decision letter after peer review:
Thank you for submitting your article "How to measure and evaluate binding affinities" for consideration by eLife. Your article has been favorably reviewed by two peer reviewers, one of whom is a member of our Board of Reviewing Editors, and the evaluation has been overseen by John Kuriyan as the Senior Editor. The reviewers have opted to remain anonymous.
The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.
We would like to draw your attention to changes in our revision policy that we have made in response to COVID-19 (https://elifesciences.org/articles/57162). 
Specifically, we are asking editors to accept without delay manuscripts, like yours, that they judge can stand as eLife papers without additional data, even if they feel that they would make the manuscript stronger. Thus the revisions requested below only address clarity and presentation.
Summary:
In this manuscript, a review of 100 studies reporting on binding measurements is presented, allowing the authors to identify and illustrate a number of pitfalls and issues that often adversely affect the reliability and meaningful biological interpretation of binding equilibrium measurements. Using example binding measurements to illustrate the most relevant points, the authors provide a straightforward, practical set of guidelines in terms of a step-by-step “checklist” that can be followed to ensure acquisition of high-quality data best suited to quantitatively describe simple binding equilibria. Given the ubiquitous nature of binding measurements in the literature, including newly emerging high-throughput approaches, this manuscript addresses an important and timely topic. While there may be at least partial overlap with previously published literature on this topic, this manuscript is particularly compelling in providing an easy-to-follow set of practical guidelines exemplified with relevant binding data. The manuscript is well written and accompanied with a number of high-quality and clear illustrations. As such, the authors' approach to this important topic is highly pedagogical and should be a must-read for anyone with the ambition to quantitatively characterize binding equilibria.
Revisions:
The authors should address the following points to further improve clarity of the manuscript:
1) While the need for the equilibration time control is clear, the requirement for changing the concentration of the second species to probe whether ligand depletion could affect Kd measurements seems to be less universal. 
If a Kd value, as obtained via binding experiment, is substantially larger than the concentration of the labeled species, it would not be strictly necessary to test for potential ligand depletion. It would be important to take this notion into account in the literature survey so as to indicate those cases where a titration regime could indeed be plausible. It would be useful to provide examples from the literature of Kd values that were underestimated due to ligand depletion. The authors should consider emphasizing the specific conditions where ligand depletion might be overlooked (e.g. high-affinity interactions requiring the use of particularly low concentrations of the labeled binding partner, where concentration uncertainty could play a significant role, and a very low active concentration of the protein). That said, varying the equilibration time is an extremely useful control that should be recommended, but perhaps the authors could be more specific as described above. The authors should also consider emphasizing that the inability to obtain a good fit with a hyperbolic function should be considered a serious warning sign that could indicate insufficient equilibration or ligand depletion. The requirement to rigorously report binding curves and fits could be an important part of a binding data reporting standard. In particular for indirect methods such as EMSAs, binding curves and fits are often omitted.
2) Cpf1 is discussed as an example where affinity is substantially underestimated. 
However, this particular example appears more complicated and likely requires factors other than insufficient incubation time to be considered: First, in one of the studies reporting a 1000-fold lower affinity, a koff of 1/(several seconds) was directly measured using a smFRET assay; Second, the experimental conditions that are known to affect binding were different in all three studies being compared, including temperature, buffer composition (specifically, divalent ions), as well as RNA and DNA sequences; Moreover, in some cases, the Cpf1 proteins were from different species (Strohkendl et al., 2018, and the study reporting the lowest affinity).
Indeed, this illustrates another common mistake when reporting or using binding affinities: treating them as constant values rather than functions of many variables, and ignoring experimental conditions and other important details when comparing the values. It would be important (and educational) to emphasize this in the manuscript, see also minor points for more details. This all being said, Cpf1 still makes a good case for the authors' main point regarding the need to prove that binding is at equilibrium, since the lack of this proof in the study reporting the lowest affinity creates a lot of confusion.
3) In Appendix 3, Weeks and Crothers, 1992 is cited for a precise competitive binding equation for the case of Kd,comp close to total concentration of P, but the solution to the quadratic equation in Weeks and Crothers does not represent a general equation for competitive binding. Instead, the same approximation as in Lin and Riggs (Kd,comp>>total concentration of P) is assumed and the equation is solved for theta to obtain a binding curve rather than a single point for theta=0.5. This approach should still be considered superior when compared to determining Kd,comp from a single data point since it takes all the other data of the curve into account. However, this approach still cannot be used in case of comparable affinities for competitor and labeled ligand. Instead, a general competitive binding curve should be used that represents a correct solution to the cubic equation (see, for example, PMID: 7875313).
Revisions:
The authors should address the following points to further improve clarity of the manuscript:
1) While the need for the equilibration time control is clear, the requirement for changing the concentration of the second species to probe whether ligand depletion could affect Kd measurements seems to be less universal. If a Kd value, as obtained via binding experiment, is substantially larger than the concentration of the labeled species, it would not be strictly necessary to test for potential ligand depletion. It would be important to take this notion into account in the literature survey so as to indicate those cases where a titration regime could indeed be plausible.
We updated our literature survey to distinguish between studies in which titration is and is not plausible, and have included additional analysis of literature data based on the concentration of the limiting species relative to the reported KD values (see revised Figure 1 and newly added Figure 1—figure supplement 2, as well as accompanying text changes). While most authors appear to be aware of the need to avoid titration, approximately one fifth of all studies used the hyperbolic equation at concentrations that violate the [P]total ≈ [P]free condition. This included several studies in which the reported KD was equal to or even lower than the stated limiting species concentration, putting these studies at high risk for inaccuracies that accompany KD values determined in the titration regime. In these and other cases varying the concentration of the limiting species would provide a definitive control for titration (and for other concentration-related artifacts). Further, while it becomes less likely that there are titration artifacts when reported KD values are much greater than the concentration of the second species, one typically does not know for certain if all of the added material is active or if other possible artifacts, such as contamination with another species, are present. 
Finally, we emphasize that our primary purpose is to make future measurements reliable, so we have tried to make the case that changes are needed in how measurements are typically or often made, without calling out individual studies.
It would be useful to provide examples from the literature of Kd values that were underestimated due to ligand depletion. The authors should consider emphasizing the specific conditions where ligand depletion might be overlooked (e.g., high-affinity interactions requiring the use of particularly low concentrations of the labeled binding partner, where concentration uncertainty could play a significant role, and a very low active concentration of the protein).
We have added a new “Implications of the titration regime” section, analogous to the “Implications of insufficient equilibration” section, where we address the above points.
That said, varying the equilibration time is an extremely useful control that should be recommended, but perhaps the authors could be more specific as described above. The authors should also consider emphasizing that the inability to obtain a good fit with a hyperbolic function should be considered a serious warning sign that could indicate insufficient equilibration or ligand depletion. The requirement to rigorously report binding curves and fits could be an important part of a binding data reporting standard. In particular for indirect methods such as EMSAs, binding curves and fits are often omitted.
We agree that poor fits present an important red flag and now discuss this point in both the equilibration and titration sections. We also amended the checklist (Appendix 4) to include questions about systematic deviations from the fit and whether binding curves are displayed. Nevertheless, it is important not to rely on the quality of fits alone (given potential noise, often sparse data points, subtle and difficult to recognize systematic deviations, or real differences from simple 1:1 binding model). 
In addition, it is not uncommon for authors to vary the Hill coefficient to compensate for poor fits or to show gel images and derived K1/2 values without plots or fits of the data. We chose not to elaborate on these practices in order to focus on guidelines for correct measurements rather than on criticisms of unreliable ones.
2) Cpf1 is discussed as an example where affinity is substantially underestimated. However, this particular example appears more complicated and likely requires factors other than insufficient incubation time to be considered: First, in one of the studies reporting a 1000-fold lower affinity, a koff of 1/(several seconds) was directly measured using a smFRET assay;
We have carefully reviewed the paper in question and have consulted with an expert in Cpf1 kinetics. While we recognize the potential for additional complexities, the data strongly support a dominant role of insufficient equilibration in underestimating the affinity. To respond to the first point, the kinetics data in PMID: 29735714 that show rapid dissociation on the scale of seconds refer to targets with 9 mismatches (Figure 3E) or 16 mismatches (Figure S3E). For the fully complementary target, Figure 3A indicates virtually no dissociation after >1 h, consistent with extremely slow equilibration. Figure 3A also suggests slow equilibration (>1 h) for targets with up to 7 mismatches.
Second, the experimental conditions that are known to affect binding were different in all three studies being compared, including temperature, buffer composition (specifically, divalent ions), as well as RNA and DNA sequences; Moreover, in some cases, the Cpf1 proteins were from different species (Strohkendl et al., 2018 and the study reporting the lowest affinity).
We recognize that affinity can vary dramatically across conditions and now address this general point in the manuscript. 
Nevertheless, in this particular case the conditions used to study Acidaminococcus sp. Cpf1 were near-identical between PMID: 29735714 and PMID: 30078724, despite a 1000-fold difference between reported KD values. We now state “for the same enzyme at similar conditions” in the text and we include details of the conditions below. Moreover, the consistency between the kinetic data in both studies (see the previous point) argues against major acceleration of equilibration in response to the small differences in conditions.
We now only mention the 1000-fold difference between the affinities reported for the same Cpf1 ortholog (AsCpf1) and do not mention the 100,000-fold difference from a study that investigated a different Cpf1 ortholog.
Indeed, this illustrates another common mistake when reporting or using binding affinities: treating them as constant values rather than functions of many variables, and ignoring experimental conditions and other important details when comparing the values. It would be important (and educational) to emphasize this in the manuscript, see also minor points for more details.
We completely agree that the dependence of affinity on specific conditions is an important point and now discuss it in the manuscript.
This all being said, Cpf1 still makes a good case for the authors' main point regarding the need to prove that binding is at equilibrium, since the lack of this proof in the study reporting the lowest affinity creates a lot of confusion.
We agree.
3) In Appendix 3, Weeks and Crothers, 1992 is cited for a precise competitive binding equation for the case of Kd,comp close to total concentration of P, but the solution to the quadratic equation in Weeks and Crothers does not represent a general equation for competitive binding. Instead, the same approximation as in Lin and Riggs (Kd,comp>>total concentration of P) is assumed and the equation is solved for theta to obtain a binding curve rather than a single point for theta=0.5. 
This approach should still be considered superior when compared to determining Kd,comp from a single data point since it takes all the other data of the curve into account. However, this approach still cannot be used in case of comparable affinities for competitor and labeled ligand. Instead, a general competitive binding curve should be used that represents a correct solution to the cubic equation (see, for example, PMID: 7875313).
We are grateful to the reviewers for pointing us to the general formulation of the competitive binding equation. We have updated Appendix 3 with the new equation and we have added the reference to the cubic equation.