Demonstration of protein cooperativity mediated by RNA structure using the human protein PUM2.

Winston R Becker^1, Inga Jarmoskaite^2, Pavanapuresan P Vaidyanathan^2, William J Greenleaf^3,4,5, Daniel Herschlag^2,6.

Abstract

Posttranscriptional gene regulation requires a complex network of RNA-protein interactions. Cooperativity, which tunes response sensitivities, originates from protein-protein interactions in many systems. For RNA-binding proteins, cooperativity can also be mediated through RNA structure. RNA structural cooperativity (RSC) arises when binding of one protein induces a redistribution of RNA conformational states that enhances access (positive cooperativity) or blocks access (negative cooperativity) to additional binding sites. As RSC does not require direct protein-protein interactions, it allows cooperativity to be tuned for individual RNAs via sequence changes that alter structural stability. Given the potential importance of this mechanism of control and our desire to quantitatively dissect features that underlie physiological regulation, we developed a statistical mechanical framework for RSC and tested this model by performing equilibrium binding measurements of the human PUF family protein PUM2. Using 68 RNAs that contain two to five PUM2-binding sites and RNA structures of varying stabilities, we observed a range of structure-dependent cooperative behaviors. To test our ability to account for this cooperativity with known physical constants, we used PUM2 affinity and nearest-neighbor RNA secondary structure predictions. Our model gave qualitative agreement for our disparate set of 68 RNAs across two temperatures, but quantitative deviations arose from overestimation of RNA structural stability. Our results demonstrate cooperativity mediated by RNA structure and underscore the power of quantitative stepwise experimental evaluation of mechanisms and computational tools.

Keywords: RNA; RNA secondary structure; cooperativity; posttranscriptional regulation; statistical mechanics

Year: 2019 PMID: 30914482 PMCID: PMC6521599 DOI: 10.1261/rna.068585.118

Source DB: PubMed Journal: RNA ISSN: 1355-8382 Impact factor: 4.942

INTRODUCTION

RNA-binding proteins (RBPs) regulate gene expression through the control of RNA stability, splicing, modification, localization, and translation (Glisovic et al. 2008; Singh et al. 2015). Since each RNA contains many protein-binding sites, there is the potential for cooperative interactions, which are responsible for sharpening and broadening regulatory responses throughout biology (Monod et al. 1965; Weiss 1997; Ha and Ferrell 2016). In many cases, RBPs exhibit cooperative binding to RNA, and most cooperative binding to RNA that has been studied is generated via direct protein–protein interactions (Fig. 1A; Samuels et al. 1994; Cartegni et al. 1996; Wei et al. 1998; Fierro-Monti and Mathews 2000; Rakitina et al. 2006; Lunde et al. 2007; Daugherty et al. 2010; Campbell et al. 2012; Cieniková et al. 2015; Arvola et al. 2017).

FIGURE 1.

Schematics of cooperative mechanisms. (A) Direct protein–protein cooperativity, where physical contacts between two proteins increase RNA affinity. (B) Indirect cooperativity, where changes in RNA secondary structure upon binding of one protein increase the binding affinity for the next protein. (RSC) RNA structural cooperativity.

In addition to protein–protein cooperativity, cooperative protein binding to RNA can also be generated through RNA secondary structure, when binding of one protein results in a structural rearrangement that exposes or blocks additional binding sites (Fig. 1B; Lin and Bundschuh 2015). An analogous mechanism has been proposed for cooperativity between proteins and miRNAs (Kedde et al. 2010; Xue et al. 2013; Hafezqorani et al. 2016). RNA structural cooperativity (RSC) is particularly intriguing as a biological regulatory mechanism because it allows cooperativity and site occupancy to be adjusted individually for each RNA target via sequence.

In this work, we use high-throughput quantitative binding data for the human PUF family protein PUM2 to demonstrate such cooperative binding in vitro. An RSC model was previously developed for two sites and applied to assess potential cooperativity in physiological sequences (Lin and Bundschuh 2015). Here we developed a generalizable statistical mechanical framework for RSC that calculates expected PUM2 binding from PUM2 affinity for fully accessible RNA sequences along with site accessibility as calculated from nearest neighbor models. We test whether our current understanding of RNA secondary structure and protein affinity can accurately predict RSC via quantitative, controlled equilibrium binding experiments. We found cooperativity, but it was weaker than predicted because of limitations in RNA structure predictions, and intrinsic properties of RNA structure may generally limit cooperativity. Additionally, we identified factors critical for understanding and engineering RSC, including temperature effects, the number of sites, the specific arrangement of sites, and RNA structure. Finally, we showed that RNA structure can, paradoxically, both enhance and suppress cooperativity. Ultimately, analogous models, taking into account cellular complexities and conditions, will be required to predict in vivo RNA/protein interactions and cooperativity, and their downstream consequences.

RESULTS

A statistical mechanical model for structure-mediated cooperativity

To quantitatively describe RSC, we developed a general model for the binding of multiple proteins to a structured RNA, starting with two binding sites (Fig. 1B). Briefly, in our thermodynamic model, a single-stranded RBP requires fully accessible RNA sites that are unoccluded by secondary structure. As a result, the formation of RNA structure involving the RBP sites destabilizes protein binding to an extent determined by the stability of RNA structure. Conversely, as the fraction of bound RNA increases with increasing protein concentration and/or affinity, the RNA folding equilibrium is shifted toward the unfolded, binding-competent state. The destabilization of RNA structure by RBP binding to one of the sites in turn makes the second site more accessible, resulting in apparent cooperativity (Fig. 1B). This cooperative behavior can be modeled using Kd values for single accessible sites, measured independently (Jarmoskaite et al. 2018), and free energies of RNA folding, computed from nearest neighbor rules (e.g., by Vienna RNAfold) (Lorenz et al. 2011). For simplicity, the complete model for two binding sites is presented here and the extension to account for additional binding sites is described in Materials and Methods. To construct our model, we began by defining substates for the RNA and RNA–protein complexes. Each RNA has two protein-binding sites, and each site can either be structured and unavailable for RBP binding or unstructured and thus available for RBP binding. This unstructured state can occur either with or without the protein bound, corresponding to a total of nine possible states (Fig. 2A).

FIGURE 2.

Statistical mechanical model for RNA structural cooperativity (RSC). (A) Framework for protein binding to a structured RNA with two binding sites. Transitions between different folded RNA states are specified by unfolding equilibrium constants where superscripts “u1” and “u2” correspond to unfolding of binding sites 1 and 2, respectively; similarly, transitions for protein binding are described with equilibrium dissociation constants specified with subscripts “d1” and “d2.” Brackets are placed around terms to specify that a folding transition or protein binding has already occurred (e.g., a [u1] superscript indicates that site 1 is accessible and a [d1] subscript indicates that a protein is bound to site 1). (B) Representative ensemble of RNA secondary structures for a single RNA molecule at equilibrium. (C) Structural substates with varying numbers of accessible protein-binding sites. The free energies of the ensemble of structures in each substate are computed relative to an unstructured reference. The relative free energies of each substate determine the relative occupancies of the substates, and the corresponding equilibrium unfolding constants (K).

As RNA molecules exist in an ensemble of RNA structures at equilibrium (Fig. 2B), each of the distinct substates in our model represents an ensemble of underlying structures that share the specified structural properties (Fig. 2B,C). For each RNA with two protein-binding sites, we define four structural substates (Fig. 2C): a substate where both binding sites are structurally inaccessible ([A0]), two substates where one binding site is accessible and the other is not ([A1] and [A2]), and the substate of structures where both sites are available for binding ([A12]).

To make quantitative predictions with this model, the relative populations of each substate were calculated. The RNAfold partition function was used to assign free energies to each ensemble of RNA molecules that occupy each substate relative to an unstructured substate (Fig. 2B; Lorenz et al. 2011). For example, ΔGA2 represents the free energy of the ensemble of RNA states where only site 2 is uninvolved in secondary structure, relative to the unstructured RNA alone; analogous descriptions apply to ΔGA0, ΔGA1, and ΔGA12. The relative free energies of the different subensembles were translated into equilibrium unfolding constants for the different substates (Fig. 2C; K1, K2, K[2],1, K[1],2), using the standard relationship K = exp(−ΔG/RT). Thus, K1 denotes the equilibrium constant for site 1 being accessible and site 2 being inaccessible, relative to both sites being involved in structure and inaccessible (i.e., large values mean the unfolded state is favored).

The dissociation constants (Fig. 2A, Kd1, Kd2, etc.), defined from PUM2 binding to single unstructured binding sites, were used to determine the relative populations of bound and unbound RNA substates. Kd1 and Kd2 refer to the global dissociation constants for PUM2 binding to sites 1 and 2, respectively, and were used to define the microscopic equilibrium between the different substates, for example:

Next we defined the fraction of sites bound, fbound, as the sum of the sites with a protein bound divided by the sum of all binding sites.

In our experiments, we are studying the binding of a single protein species to multiple binding sites, such that P1 and P2 in the above equation can be replaced by a single protein term, P0. In the process of deriving the model, we assumed that dissociation constants were identical for the binding of a protein to a given site regardless of RNA structure outside of the binding site or whether another PUM2 was bound to an adjacent site in the same RNA (i.e., Kd1 = K[d2],d1). Substituting the microscopic equilibrium relationships, as in Equation 1, into Equation 3 allows us to derive the following equation for structure-mediated cooperative binding of two identical proteins to an RNA:

Tests and simulations of this equation allowed us to confirm that it predicts the expected behavior in limiting scenarios. For example, if one of the dissociation constants is set to be infinite (i.e., no binding), then, as expected, Equation 4 predicts binding to an RNA with a single binding site.
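The two-site scheme described above can be sketched numerically. The following is a minimal illustration, not the published Equation 4 (which is not reproduced in this text): it assumes statistical weights of 1, K1, K2, and K12 for substates A0, A1, A2, and A12, assumes that a site can be occupied only when accessible, and uses the same Kd for a site regardless of the state of the other site. The function names and the combined constant `k12` (the weight of the fully accessible substate relative to A0) are hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def unfolding_constant(dg_kcal, temp_c=25.0):
    """Convert a free energy (kcal/mol) into an equilibrium constant, K = exp(-dG/RT)."""
    return math.exp(-dg_kcal / (R_KCAL * (temp_c + 273.15)))


def fraction_bound(p, kd1, kd2, k1, k2, k12):
    """Fraction of sites occupied on a two-site RNA (assumed nine-state scheme).

    p        : free protein concentration (same units as kd1, kd2)
    kd1, kd2 : dissociation constants for fully accessible sites 1 and 2
    k1, k2   : weights of substates A1 and A2 relative to A0 (both sites buried)
    k12      : weight of substate A12 (both sites accessible) relative to A0
    """
    x1 = p / kd1  # binding weight for an accessible site 1
    x2 = p / kd2  # binding weight for an accessible site 2
    # Partition function over 1 + 2 + 2 + 4 = 9 states.
    z = 1.0 + k1 * (1 + x1) + k2 * (1 + x2) + k12 * (1 + x1) * (1 + x2)
    # Each state's weight multiplied by the number of proteins bound in it.
    bound = k1 * x1 + k2 * x2 + k12 * (x1 * (1 + x2) + x2 * (1 + x1))
    return bound / (2.0 * z)


if __name__ == "__main__":
    # Both sites buried in one structure that is disfavored to open by 4 kcal/mol.
    k_open = unfolding_constant(4.0)
    for conc in (0.1, 1.0, 10.0, 100.0):
        print(conc, fraction_bound(conc, 1.0, 1.0, 0.0, 0.0, k_open))
```

Under these assumptions, setting one Kd to infinity reduces the expression to a single-site isotherm with an apparent Kd of Kd1(1 + K1 + K2 + K12)/(K1 + K12), which mirrors the limiting-case check mentioned in the text.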

Application of the RSC model to experimental data

This statistical mechanical model allows us to make quantitative predictions of RSC that can be tested experimentally. Using known PUM2 affinity to accessible sites and predicted equilibrium unfolding constants, we can define all equilibrium constants in the binding scheme of Figure 2A, resulting in a predictive model with no free parameters. We applied the RNA-MaP method (Fig. 3A; Buenrostro et al. 2014; She et al. 2017; Denny et al. 2018; Jarmoskaite et al. 2018) to make precise equilibrium measurements for binding of the PUF-domain of human PUM2 to 68 RNA constructs with two to five PUF binding sites designed to probe RSC. These oligonucleotides were tested as a subset of a larger library of >30,000 constructs used to investigate independent questions, and the results for other constructs are reported elsewhere (Jarmoskaite et al. 2018; Becker et al. 2019). PUM2, which binds to a UGUA[ACU]AUA consensus sequence in mRNA transcripts to promote decay and repress translation, was chosen for its well-defined sequence specificity and apparent lack of oligomerization (Fig. 3B; Galgano et al. 2008; Hafner et al. 2010; Miller and Olivas 2011; Van Etten et al. 2012; Bohn et al. 2018).

FIGURE 3.

PUM2/RNA-binding measurements. (A) RNA-MaP method to quantitatively determine PUM2-binding affinity to RNA constructs on a sequencing chip (Buenrostro et al. 2014). (B) PUM2 consensus binding motif based on PAR-CLIP-derived motif from Hafner et al. (2010). (C) PUM2 binding to an RNA with a single unstructured binding site (UGUAUAUU; blue, left axis) and an RNA with two identical binding sites but little intramolecular structure (red, right axis) giving dissociation constants of 2.3 nM and 3.3 nM, respectively.

Our model makes multiple assumptions that must hold for its application to experimental data. These assumptions include that there is no direct protein cooperativity, that the system is at equilibrium, and that PUM2 does not bind structured RNA. We assess the validity of each of these assumptions below.

First, we tested our assumption of no direct protein cooperativity. PUM2 binding to an RNA with two accessible binding sites was fit well by a model with two independent binding sites and with an affinity within twofold of that for an RNA containing a single unstructured binding site of the same sequence (Kd = 3.3 and 2.3 nM, respectively, for binding to two and one UGUAUAUU sites; Fig. 3C). This result is consistent with lack of direct cooperativity. (We note that throughout RNA-MaP experiments, reported here and elsewhere [Jarmoskaite et al. 2018], it was necessary to incorporate a constant nonspecific binding term to account for continued increase in fluorescence after saturation [Fig. 3C], suggesting nonspecific binding of an additional PUM2 monomer to the RNA/protein complex at high concentrations. This term had a negligible effect on measured affinities for >10,000 single-site variants [Jarmoskaite et al. 2018], though it was incorporated throughout the fits presented herein to maximize precision [Materials and Methods].)
Another prediction for structure-mediated cooperativity versus direct cooperativity is that there should be no change in the dissociation rate for constructs with any number of identical binding sites, as the RNA structure should slow protein association rates but not alter dissociation rates. Indeed, we observed similar dissociation rate constants for RNAs with two to five identical binding sites (Supplemental Fig. S1). Second, we confirmed that the system was at equilibrium, as equilibrium between all of the folded states and protein-bound states is required for application of the statistical mechanical model. At 25°C, PUM2 binding equilibrates within minutes (Vaidyanathan et al. 2017), and our observation times were sufficiently long to ensure equilibration of PUM2 binding (Materials and Methods). If stable RNA structures did not interconvert on the time scale of the experiment, amplitudes would decrease at apparent saturation, as a fraction of the RNA would remain inaccessible for protein binding. However, we observe consistent amplitudes regardless of predicted structural stability across RNAs with the same number of PUM2-binding sites (Supplemental Fig. S2). Third, we confirmed that PUM2 showed no appreciable binding to structured RNA. Structural and biochemical evidence indicates that PUM2 binds the Watson–Crick face of the RNA, which is incompatible with base-pairing. Indeed, we observe no binding to RNAs containing single PUM2-binding sites involved in stable RNA structure (Supplemental Fig. S3; Becker et al. 2019). Finally, the application of our model depends on correctly being able to assign energies for the ensembles of each folded state, which are predicted using a nearest-neighbor algorithm (Lorenz et al. 2011). 
However, independent tests of nearest neighbor based RNA structure prediction algorithms revealed that the RNA structural ensembles are less stable than predicted at our conditions (110 mM monovalent cations [K+, Na+] and 2 mM magnesium ions [Mg2+]), which differ from the conditions used to collect the parameters for nearest neighbor models (Becker et al. 2019). To account for the discrepancy in predicted structural stability, we introduced a stability correction parameter, Sunfold. This scaling factor allowed the unfolding energies to float while still taking advantage of nearest neighbor rules and only introducing one free parameter. This parameter altered each K term in Equation 4 as follows:

RNA structure produces cooperative interactions

Measurements of PUM2 binding to oligonucleotides containing two to five sites indicated varying degrees of cooperativity (as indicated by Hill coefficients up to 2.11); nevertheless, the data showed systematic deviations from the predictions without the structure correction term, with binding consistently occurring at lower PUM2 concentrations than predicted (Fig. 4A–F [circles vs. black solid lines] and Supplemental Fig. S4). By including a single structure correction term for each RNA, we capture the shape and offset of binding observed in RSC constructs (Supplemental Fig. S4), and representative fits are shown for RNA constructs with two identical binding sites (Fig. 4A), two different binding sites (Fig. 4B), three identical binding sites (Fig. 4C,D), and four identical binding sites (Fig. 4E,F); the remainder of the 203 total fits can be seen in Supplemental Figure S4.

FIGURE 4.

PUM2 binding to a series of RNAs. PUM2 binding at 25°C (left) and 37°C (right) to RNAs with two identical binding sites (A), two different binding sites (B), three identical binding sites (C,D), and four identical binding sites (E,F). The binding sites are indicated with different coloring in the most stable predicted secondary structure for each RNA.

To compare our best-fit values for RNA structural stability to those predicted by RNAfold, the multiple equilibrium unfolding constants and the single adjustment factor applied to each unfolding equilibrium constant were combined to define the equilibrium between the state with all sites involved in structure and the state with no sites involved in structure as follows:

We then defined a corresponding free energy change for this process, ΔGfold.

To test the reproducibility of these experimentally determined values, we compared the ΔGfold values computed from replicate experiments (Fig. 5A) and found that they were highly correlated (R = 0.95 for RNAs with ΔGfold < 1 kcal/mol). We also found that PUM1, the second human Pumilio protein with identical sequence specificity (Galgano et al. 2008), gave fit ΔGfold values in good agreement with those based on PUM2 data (Supplemental Fig. S6). Together, these results demonstrate that these measurements are highly reproducible across different experiments and proteins.

FIGURE 5.

Comparisons of ΔGfold values determined from fits with one free parameter, Sunfold. (A) Replicate values for ΔGfold for independent PUM2-binding experiments (25°C) to RNA constructs with two identical binding sites (red), two different binding sites (blue), three binding sites (purple), four binding sites (green), and five binding sites (black). The deviations were the greatest for positive ΔGfold values because the shapes of binding curves of constructs with ΔGfold values greater than zero are minimally affected by even large changes in ΔGfold (Supplemental Fig. S5). (B) RNA stability (ΔGfold) predicted by RNAfold compared to the ΔGfold computed when fitting the statistical mechanical model with one free parameter. (C) Relationship between ΔGfold at 25°C (x-axis) and ΔGfold at 37°C (y-axis) as predicted by RNAfold and fit with the model.

Next, we compared our corrected ΔGfold values derived from our single parameter fits with the original values from RNAfold (Fig. 5B). Overall, less structure than predicted was consistently observed for all RNA constructs tested, consistent with the need for a structure correction factor. The need for this factor may arise from differences in salt conditions or errors in secondary structure prediction (Becker et al. 2019). Despite these quantitative differences, the overall strong correlation (R = 0.85 for constructs with ΔGfold < 1 kcal/mol) between predicted structural stability and observed structural stability supports the RSC model.

At higher temperature, our model predicts lower RSC due to a weaker structure. The model also predicts that, for structured PUM2 sites, binding will be tighter than expected from enthalpic effects on binding alone, as weakened PUM2/RNA interactions will be offset by structured sites becoming more accessible. 
Indeed, while increased temperature weakens PUM2 binding to unstructured UGUAUAUA RNA-binding sites, from a Kd of 0.40 nM to 3.9 nM (ΔΔG = 0.89 kcal/mol), and to unstructured UGUAUAUU RNA-binding sites, from a Kd of 2.11 nM to 15.5 nM (ΔΔG = 0.75 kcal/mol), binding to our structured RNAs was reduced by less. When we carried out the fitting procedure described above for the 25°C data, we found that the values of ΔGfold at 37°C were offset from the values at 25°C essentially as predicted by RNAfold (Fig. 5C).
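The ΔΔG values quoted for the unstructured sites can be checked with a short calculation. One reading that reproduces them, which is an assumption and is not stated in the text, is that each binding free energy is computed as ΔG = RT ln Kd at its own temperature (1 M standard state) and the two are then subtracted. The function names below are hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar, temp_c):
    """Binding free energy in kcal/mol from a Kd in molar units (1 M standard state)."""
    return R_KCAL * (temp_c + 273.15) * math.log(kd_molar)


def delta_delta_g(kd_25_molar, kd_37_molar):
    """Change in binding free energy on going from 25 C to 37 C (assumed definition)."""
    return binding_free_energy(kd_37_molar, 37.0) - binding_free_energy(kd_25_molar, 25.0)


if __name__ == "__main__":
    # Kd values quoted in the text, converted from nM to M.
    print("UGUAUAUA:", round(delta_delta_g(0.40e-9, 3.9e-9), 2), "kcal/mol")
    print("UGUAUAUU:", round(delta_delta_g(2.11e-9, 15.5e-9), 2), "kcal/mol")
```

Note that taking RT ln of the Kd ratio at a single temperature would give larger values (roughly 1.2 to 1.4 kcal/mol), so the definition matters when comparing numbers across temperatures.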

Extension of model to three or more sites

As RNA secondary structure can generate cooperative behavior between many proteins at once, we extended the model to predict the binding behavior of RBPs to RNAs with an arbitrary number of binding sites. When modeling these constructs, a single factor was introduced to adjust all folding equilibrium constants predicted by RNAfold, as described above. The model predicts that having greater than two binding sites can result in either shallow, multiphasic binding curves or steep binding curves, depending on the structural arrangement of the RNA, and this is what we observed experimentally. RNAs in which subsets of PUM2 sites have different accessibility (such as in Fig. 4D, where two sites are predicted to pair with each other and are inaccessible, while a third one is fully accessible) give multiphasic PUM2 concentration dependence. Analogous behavior is observed in the four-site RNA in Figure 4E, where two pairs of sites each form distinct hairpins, yielding a shallow, biphasic response to changes in PUM2 concentration. In contrast, in Figure 4F, when four binding sites are involved in the same secondary structure, there is a very steep binding response to protein concentration. At higher temperature, we observed, as predicted by a model where cooperativity arises from the secondary structure, that cooperativity decreases and that more of the binding curves can be explained as a simple sum of three (or more) independent, unstructured binding sites (Fig. 4C–F; Supplemental Fig. S4). The ability of the model to capture these complex binding curves demonstrates the value of considering the ensemble of RNA secondary structures, as shown in Figure 2, to analyze cooperative binding to complex structured RNAs.

Apparent Hill slopes and RSC

To place these behaviors in familiar terms and to further explore the range of observed behaviors, we calculated best-fit Hill slopes (nHill) for each construct. While in principle, it is possible to achieve a Hill slope of two with two binding sites, and our five constructs with two identical sites were predicted to exhibit Hill slopes of up to 1.8 (1.67–1.80 at 25°C), in practice the maximum Hill slope we obtained for a two-site construct was 1.21. As the number of binding sites is increased, our model predicts that it should be possible to obtain a higher Hill slope. Indeed, our four-site RNA construct that had all four binding sites involved in an apparently single, highly stable structure gave the highest Hill slope (nHill = 2.11) (Fig. 4F). The observation of Hill slopes lower than the theoretical maximum and lower than predicted may indicate that engineering high cooperativity in RNA structure is more difficult than it appears, perhaps due to weaker than expected structure formation and due to the ability of RNA to readily form multiple, partially folded, quasistable states, often involving noncanonical interactions. Applying our statistical mechanical model, we can further illustrate that, perhaps counterintuitively, RNA secondary structure can both increase and decrease the sharpness of protein binding. In panel A of Figure 6, two proteins bind independently to an unstructured construct, leading to a response that is simply the sum of two individual binding curves, with the same shape but twice the amplitude of a binding curve to a single site. In panel B, one site is tied up in structure (ΔGfold = –2.0 kcal/mol), which prevents protein binding at that site until the protein concentration is high enough so that the protein-bound state becomes more stable than the structured RNA. 
As a result, the protein binds to one site with a high effective affinity (Kd2) and to the other site with a much lower effective affinity (Kd1), giving an observed binding response across a wide range of protein concentrations and apparent negative cooperativity (nHill = 0.34). Finally, panel C of Figure 6 shows an example with both sites strongly sequestered in the same structure (ΔGfold = –8.0 kcal/mol). Once the concentration of protein is high enough that the bound state is energetically favorable relative to the unbound structured state, both protein-binding sites become available, leading to a sharp response to protein concentration with a calculated Hill slope of 1.98.

FIGURE 6.

Modes of binding of two proteins to RNA: (A) Binding of an unstructured RNA leads to independent protein binding (nHill = 1). (B) Binding of a partially structured RNA where one protein binds with lower affinity because its binding site is involved in structure (nHill = 0.34). (C) Binding of a structured RNA where both proteins bind to sites involved in the same structure leading to cooperative binding (nHill = 1.98).

We see evidence for these disparate behaviors in our data. For example, whereas the four-site RNA described above has a steep response (Fig. 4F, nHill = 2.11), the RNA shown in Figure 4E has its four binding sites involved in two different hairpins and a broader response with an apparent Hill slope of 0.62. Thus, in addition to creating sharp, ultrasensitive responses, RNA structure can lead to the generation of a steady increase in binding across a wide range of concentrations.
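The three modes in Figure 6 can be illustrated with a small simulation. This is a sketch under stated assumptions, not the fitting procedure used for the reported nHill values: it uses an assumed two-site partition function with statistical weights w0, w1, w2, w12 for the four structural substates, and it estimates an apparent Hill slope from the concentrations giving 10% and 90% saturation, nHill ≈ ln(81)/ln(EC90/EC10), rather than from a least-squares Hill fit. The numbers it produces therefore need not match the caption values; only the ordering (below 1, equal to 1, near 2) is expected to carry over.

```python
import math

RT_25 = 1.987e-3 * 298.15  # kcal/mol at 25 C


def fraction_bound(p, kd, w0, w1, w2, w12):
    """Fraction of sites bound for two identical sites (assumed weights per substate)."""
    x = p / kd
    z = w0 + (w1 + w2) * (1 + x) + w12 * (1 + x) ** 2
    bound = (w1 + w2) * x + w12 * 2 * x * (1 + x)
    return bound / (2.0 * z)


def conc_at(target, curve, lo=1e-9, hi=1e12):
    """Bisection on a log scale for the concentration where curve(p) == target."""
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if curve(mid) < target:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


def apparent_hill(curve):
    """Hill slope estimated from the 10%-90% width of a monotonic binding curve."""
    return math.log(81.0) / math.log(conc_at(0.9, curve) / conc_at(0.1, curve))


def mode_curve(w0, w1, w2, w12, kd=1.0):
    return lambda p: fraction_bound(p, kd, w0, w1, w2, w12)


# (A) unstructured RNA: only the fully accessible substate is populated.
n_a = apparent_hill(mode_curve(0.0, 0.0, 0.0, 1.0))
# (B) site 2 always open; opening site 1 costs 2.0 kcal/mol.
n_b = apparent_hill(mode_curve(0.0, 0.0, 1.0, math.exp(-2.0 / RT_25)))
# (C) both sites buried in one structure; opening both costs 8.0 kcal/mol.
n_c = apparent_hill(mode_curve(1.0, 0.0, 0.0, math.exp(-8.0 / RT_25)))

if __name__ == "__main__":
    print("A:", n_a, "B:", n_b, "C:", n_c)
```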

DISCUSSION

RNA structure has been widely implicated in cellular regulation through structured elements such as riboswitches, internal ribosome entry sites (IRES), and the iron-responsive element (Piccinelli and Samuelsson 2007; Garst et al. 2011; Serganov and Nudler 2013; Sherwood and Henkin 2016; Yamamoto et al. 2017). Here we describe a predictive model for and quantitatively test another potentially widespread structure-dependent regulatory mechanism, RSC. The ability of RNA to readily form local stable secondary structures generates a panoply of opportunities for crafting and tuning cooperative control throughout RNA processing. For example, certain mammalian miRNA-binding sites co-occur with PUF-binding sites and contain seed sequences that are the reverse complement of the PUF-binding site. For these combinatorial sites, the binding of PUF proteins has been proposed to release the miRNA site from the secondary structure, rendering it accessible to the RNA induced silencing complex and leading to accelerated transcript decay (Kedde et al. 2010; Jiang and Coller 2012; Miles et al. 2012; Jiang et al. 2013). PUM sites also often exist in clusters in vivo, suggesting that cooperative regulation via RNA structure may occur (Galgano et al. 2008). RNA structure-mediated cooperativity has also been described for other RBPs and miRNAs (Xue et al. 2013). Our model studies demonstrate that RNA secondary structure can engender cooperative RBP binding, a process we refer to as RSC. A key advantage of RSC over direct cooperativity is that RSC can be tuned via changes in the RNA sequence alone, and thus can be tailored individually for each RBP target. Furthermore, the RSC thermodynamic framework can be applied to any sequence-specific, single-stranded RBP, provided there is no direct cooperativity. Our studies reveal several features that will inform future RSC design and applications. First, as we have shown, the precise configuration of neighboring sites can lead to variable RSC. 
Indeed, depending on the arrangement of RBP sites, RNA structure can either enhance or suppress cooperativity. We find that cooperativity is generally weaker than might be expected for a given number of RBP sites, likely reflecting the intrinsic complexity of RNA folding and its ability to form complex structural ensembles that are not fully accounted for by nearest-neighbor predictions. Finally, we demonstrate temperature as a way of modulating RSC. In vivo, RSC may be further tuned temporally by RNA modifications that can increase or decrease secondary structure stability or alter protein affinities (Liu et al. 2015; Lewis et al. 2017; Vaidyanathan et al. 2017). Ultimately, these structural effects along with cellular conditions, such as temperature, salt, and helicase activity, must be incorporated into a statistical mechanical model to predict cooperativity in Nature. While we do not know the extent or dynamics of RNA secondary structure in cells, and there are some indications that there may be different extents of structure in different organisms, with more evidence for control via structural elements in plants and bacteria (Breaker 2012; Serganov and Nudler 2013; Vandivier et al. 2016), it would be surprising if Nature has not taken advantage of RNA structure to generate RSC, given the opportunistic and creative qualities of natural selection and evolution. Rather than requiring specific cooperative interactions to evolve between proteins, RSC can be developed from sequence-level changes that alter the secondary structure and thus may be more evolvable than new protein–protein interactions. It is possible that secondary structure conservation in the 3′-UTRs of some RNA transcripts is partly a result of regulatory functions of the RNA secondary structure in RSC. Our experimental tests of the RSC model also underscore limitations in the accuracy of RNA secondary structure prediction algorithms under our experimental conditions (Becker et al. 2019). 
Because of the central role of RNA structure and stability in RSC, the overestimation of RNA secondary structure stability represents a key limitation to quantitative applications of the RSC model. The observed discrepancies emphasize the need for further development of structural models, and we anticipate that improved in vitro modeling will be foundational for probing and better understanding RNA structure and manifestations such as RSC in vivo.

MATERIALS AND METHODS

Collection and quantification of experimental data

Experimental data were collected as a subset of a large PUM2 target library on an Illumina MiSeq flow cell using the RNA-MaP method (Fig. 3A; Buenrostro et al. 2014; She et al. 2017). The binding experiments were performed in 20 mM Na-HEPES (pH 7.4), 100 mM KOAc, 0.1% Tween-20, 5% glycerol, 0.1 mg/ml BSA, 2 mM MgCl2, and 2 mM DTT, at 25°C or 37°C, as indicated. At 25°C, PUM2 binding equilibrates within minutes (halftime ≤ 5.3 min; see also Vaidyanathan et al. 2017), and our observation times ranged from 33 min for the lowest concentrations to 19 min for the highest protein concentrations (25°C; 15–23 min at 37°C) to ensure equilibration of PUM2. The full details of the PUM2 protein expression and purification and the RNA-MaP experiments are reported elsewhere (Jarmoskaite et al. 2018).

Here, a set of RNA sequences containing two to five UGUAUAUA- or UGUAUAUU-binding sites was analyzed. These constructs were probed as a subset of a larger RNA library, with other sets used to address independent questions. The RNA constructs studied here contained either two to five identical binding sites (UGUAUAUA or UGUAUAUU) or two different sites (UGUAUAUA and UGUAUAUU), which varied in affinity by approximately fivefold (full list of sequences in Supplemental Tables S1, S2). Following the experiments, fluorescence intensity for each RNA cluster and protein concentration was computed and normalized as previously described (Buenrostro et al. 2014; She et al. 2017; Jarmoskaite et al. 2018). The affinity for unstructured single binding sites was rigorously determined via 408 measurements of unstructured UGUAUAUA sequence in varying sequence contexts. All Kd values and concentrations reported are based on experiments performed with 57% active fraction of PUM2.

To measure PUM2 dissociation, the RNA array was incubated with 12.8 nM PUM2 for 18 min (by flowing 320 µL of 12.8 nM PUM2 solution). 
Dissociation was initiated by rapidly (150 µL/min) flowing 150 µL of 3 µM of an RNA oligonucleotide chase containing the PUM2 consensus sequence (UCUUGUAUAUAUA) in the binding buffer, with simultaneous imaging. 170 µL of the chase solution was then flowed at 15 µL/min with continuous imaging. Because of the rapid dissociation and the time required to image all 16 tiles of the RNA array, only one tile (6% of the library) was imaged.
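The relationship between a dissociation rate constant and the halftime quoted above (halftime = ln 2 / k) can be illustrated with a minimal Python sketch. It assumes a signal that decays to a zero baseline and estimates the rate by linear regression on the logarithm of the signal; this is a substitute technique chosen for brevity, not the nonlinear lmfit procedure used for the published fits. The function names and the synthetic data are hypothetical.

```python
import math

def fit_single_exponential(times, signals):
    """Estimate (k_off, F0) for F(t) = F0 * exp(-k_off * t).

    Uses least-squares regression of ln(F) on t, which assumes all
    signals are positive and decay to a zero baseline (an assumption
    of this sketch, not a statement about the published analysis).
    """
    logs = [math.log(s) for s in signals]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return -slope, math.exp(intercept)

def half_time(k_off):
    """Halftime of a first-order decay with rate constant k_off."""
    return math.log(2) / k_off

# Synthetic, noise-free example (times in minutes, arbitrary signal units).
times = [0.0, 1.0, 2.0, 4.0, 8.0]
signals = [100.0 * math.exp(-0.25 * t) for t in times]
k_off, f0 = fit_single_exponential(times, signals)
t_half = half_time(k_off)
```

With real data that plateau above zero, a baseline term would need to be added and the fit done by nonlinear least squares instead.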

Structure predictions

Structure prediction was performed using RNAfold version 2.3.3. For cases with no constraints, the following command was used:

    RNAfold -T 37 C -p0 --noPS -i inputfile.fa

and for cases with constraints, the following command was used:

    RNAfold -T 37 C -p0 --noPS -C -i inputfile.fa

For the second command, constraints were provided as follows:

    UCUCUUUGUAGAUAUCUCUU
    ......xxxxxxxx......

where the x's represent positions that were not allowed to form structure. Each command calculates the minimum free energy (mfe), the free energy of the ensemble, which is computed from the partition function, and the frequency of the mfe structure in the ensemble. As indicated in Figure 2, the ensemble stabilities were used in all modeling and figures in the paper.
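A constraint line of the kind shown above can be generated programmatically. The following Python sketch builds the dot/x string for a site at a given position; the helper names, the 0-based indexing, and the three-line record layout (name, sequence, constraint) are assumptions made for illustration rather than a description of the scripts actually used.

```python
def constraint_for_site(sequence, site_start, site_length):
    """Return a constraint string for one binding site.

    'x' marks positions forced to be unpaired and '.' marks
    unconstrained positions. site_start is 0-based (an assumption
    of this sketch).
    """
    site_end = site_start + site_length
    if site_start < 0 or site_end > len(sequence):
        raise ValueError("site lies outside the sequence")
    return "." * site_start + "x" * site_length + "." * (len(sequence) - site_end)

def constrained_record(name, sequence, constraint):
    """Format one input record as name, sequence, then constraint line."""
    return ">{}\n{}\n{}\n".format(name, sequence, constraint)

# Reproduce the example constraint from the text: an 8-nt site
# starting at the seventh nucleotide of a 20-nt sequence.
seq = "UCUCUUUGUAGAUAUCUCUU"
constraint = constraint_for_site(seq, 6, 8)
record = constrained_record("example", seq, constraint)
```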

Fitting of normalized fluorescence data

For all fits, the median fluorescence values of all clusters representing a specific sequence at a given concentration were used. Before the medians were determined, the data were filtered to remove unreasonable values at each concentration or time point. The filter was deliberately conservative and excluded only points that clearly reflected experimental artifacts: (i) fluorescence values for a cluster that were greater than what would be expected for 10 bound proteins, or (ii) fluorescence values that were 6 or more median standard deviations from the median fluorescence. After filtering, the median fluorescence and bootstrapped 95% confidence intervals on the median (shown as error bars) were computed for each concentration or time point. In addition to fitting the true medians, we fit the bounds of the bootstrapped 95% confidence interval on the median to estimate the error on the fit parameters. Fitting required a constrained parameter to account for variability in the maximum fluorescence, as well as a constant nonspecific parameter to account for nonspecific binding of the protein to RNA–protein complexes (Jarmoskaite et al. 2018). In our experiments, the maximum fluorescence increased linearly with the total number of proteins expected to bind to an RNA (Supplemental Fig. S2), indicating that we can track the number of proteins bound with this experimental system. Based on this observation, when fitting different models to the experimental data, we constrained the maximum fluorescence to be within 25% of the number of binding sites times the median maximum fluorescence for single binding sites. The observed minimum fluorescence was tightly constrained to values between zero and the greater of the minimum fluorescence value observed and 1% of the maximum fluorescence value observed.
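The filtering and bootstrapping steps described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the "median standard deviation" is interpreted here as the median absolute deviation, the confidence interval is a simple percentile bootstrap, and all function names, default values other than the two stated thresholds, and the example data are hypothetical.

```python
import random
import statistics

def filter_clusters(values, single_site_fmax, max_proteins=10, n_dev=6.0):
    """Drop clusters that look like experimental artifacts.

    Criterion (i): signal above that expected for max_proteins bound.
    Criterion (ii): signal n_dev or more deviations from the median,
    with the deviation taken as the median absolute deviation (an
    assumption of this sketch).
    """
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    ceiling = max_proteins * single_site_fmax
    kept = []
    for v in values:
        if v > ceiling:
            continue
        if mad > 0 and abs(v - med) >= n_dev * mad:
            continue
        kept.append(v)
    return kept

def bootstrap_median_ci(values, n_boot=2000, alpha=0.05, seed=0):
    """Return (median, lower, upper) using a percentile bootstrap."""
    rng = random.Random(seed)
    n = len(values)
    medians = sorted(
        statistics.median(rng.choices(values, k=n)) for _ in range(n_boot)
    )
    lower = medians[int((alpha / 2) * n_boot)]
    upper = medians[int((1 - alpha / 2) * n_boot) - 1]
    return statistics.median(values), lower, upper

# Hypothetical cluster intensities with one obvious artifact (50.0).
raw = [1.0, 1.1, 0.9, 1.2, 0.8, 1.05, 0.95, 50.0]
kept = filter_clusters(raw, single_site_fmax=1.0)
med, lower, upper = bootstrap_median_ci(kept)
```

Fitting the model separately to the median, the lower bound, and the upper bound then gives a spread of fit parameters, which is the error estimate described in the text.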
The nonspecific binding contribution was fit as a constant multiplied by the concentration of protein and the fraction of RNA molecules in a cluster bound by the protein. The value of this nonspecific binding term was determined by fitting the RNA constructs with two sites over a range of values for the term. The values that maximized the R² values of the fits were determined for each temperature and then applied for all fits at that temperature (Supplemental Fig. S7A,B). At 25°C, a nonspecific value of 0.001 was used and at 37°C, a value of 0.0005 was used. We compared the sensitivity of the folding energies obtained to the value of the nonspecific term and found that, for ΔGfold values less than zero, the value chosen for the nonspecific term had a minimal effect on the final ΔGfold values determined by fitting the data to the model (Supplemental Fig. S7C–E), indicating that our fit values were not sensitive to the values chosen for the nonspecific contribution. Incorporating the fluorescence and nonspecific parameters gives an equation for the observed fluorescence as a function of the fraction of protein-binding sites bound by the protein, fbound, where fbound is defined from the binding model being fit to the data. For the two-site case, fbound is given by Equation 4. For the Hill equation and the higher-order models, fbound is described below. All equations were fit in Python 2.7 using the differential evolution algorithm implemented in the lmfit package. The fmax, fmin, and nonspecific (q) variables were constrained as described above. All unfolding constants used in fitting were defined from RNAfold. Other structure prediction algorithms gave agreement similar to that observed with RNAfold (data not shown). Dissociation data were fit to a single exponential to determine the dissociation rate.
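The parameter constraints described in this section translate directly into numerical bounds, sketched below in Python. The two bound functions follow the stated rules (fmax within 25% of the number of sites times the single-site maximum; fmin between zero and the greater of the observed minimum and 1% of the observed maximum). The `observed_fluorescence` function is only one plausible functional form that is consistent with the verbal description; it is an assumption of this sketch and should not be read as the published equation. All names are hypothetical.

```python
def fmax_bounds(n_sites, single_site_fmax, tolerance=0.25):
    """Bounds on fmax: within 25% of n_sites times the single-site maximum."""
    center = n_sites * single_site_fmax
    return (1.0 - tolerance) * center, (1.0 + tolerance) * center

def fmin_bounds(observed):
    """Bounds on fmin: zero up to the greater of the observed minimum
    and 1% of the observed maximum."""
    return 0.0, max(min(observed), 0.01 * max(observed))

def observed_fluorescence(f_bound, protein_conc, fmax, fmin, q):
    """ASSUMED form: specific signal plus a nonspecific term q * [P] * f_bound.

    f_bound stands in for the fraction of bound RNA in the nonspecific
    term; the exact published expression is not reproduced here.
    """
    return fmin + (fmax - fmin) * f_bound + q * protein_conc * f_bound

# Example: a two-site construct with a normalized single-site maximum of 1.0.
lo_fmax, hi_fmax = fmax_bounds(2, 1.0)
lo_fmin, hi_fmin = fmin_bounds([0.05, 0.5, 2.0])
signal_at_zero = observed_fluorescence(0.0, 10.0, 2.0, 0.05, 0.001)
```

Bounds of this kind can be passed to a bounded global optimizer such as differential evolution, which requires finite limits on every varied parameter.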

Derivation of fbound equations for three or more binding sites

As for the two-site model above, we derived analogous statistical mechanical equations for the fraction of sites bound to fit the data collected for RNAs with three to five protein-binding sites. To derive these equations, we first enumerated all of the possible states that could be occupied by a given RNA molecule (e.g., 27 states for three sites). We then defined the microscopic equilibria between the different substates. Next, we defined the fraction of sites bound, fbound, as the number of sites with a protein bound, summed over all states, divided by the total number of binding sites. The microscopic equilibrium equations were then substituted into this expression to give an equation for the fraction of sites bound as a function of protein concentration, dissociation constants, and equilibrium unfolding constants. Dissociation constants were defined from PUM2 binding to unstructured RNA-binding sites, and unfolding constants were defined with RNAfold following the procedure outlined for two binding sites.
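The bookkeeping behind such a derivation (enumerate states, assign each a statistical weight, and average the number of bound sites) can be sketched in Python. The sketch makes a strong simplifying assumption: each site is independently occluded by structure, open, or bound, which yields 3^n states (27 for three sites) but does not couple sites through shared structures. It therefore illustrates the enumeration only and does not reproduce the cooperative equations used in the paper. All names are hypothetical.

```python
import itertools
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)

def fold_weight_from_dg(dg_fold, temperature_k):
    """Weight of the folded (occluded) state relative to the open state."""
    return math.exp(-dg_fold / (R_KCAL * temperature_k))

def enumerate_states(n_sites):
    """All combinations of per-site conditions (3**n_sites states)."""
    return list(itertools.product(("occluded", "open", "bound"), repeat=n_sites))

def fraction_bound(protein_conc, kds, fold_weights):
    """Fraction of sites bound, from explicit state enumeration.

    Weights are relative to the fully open, unbound RNA: an occluded
    site contributes fold_weights[i], a bound site protein_conc / kds[i].
    Sites are treated as independent (assumption of this sketch).
    """
    n_sites = len(kds)
    partition = 0.0
    bound_sum = 0.0
    for state in enumerate_states(n_sites):
        weight = 1.0
        for i, condition in enumerate(state):
            if condition == "occluded":
                weight *= fold_weights[i]
            elif condition == "bound":
                weight *= protein_conc / kds[i]
        partition += weight
        bound_sum += weight * state.count("bound")
    return bound_sum / (n_sites * partition)

# Three identical sites, Kd = 0.1 nM, protein at 0.1 nM.
n_states = len(enumerate_states(3))
f_unstructured = fraction_bound(0.1, [0.1] * 3, [0.0] * 3)
f_structured = fraction_bound(0.1, [0.1] * 3, [9.0] * 3)
```

In this independent-site limit the result reduces to a simple closed form for each site, which makes the enumeration easy to check before site-coupling terms are added.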

Hill equation

For Hill fits, fbound was described by the Hill equation as a function of protein concentration.

Parameters for models in Figure 6

For Figure 6, panel A, the following parameters were used: Kd1 = Kd2 = 0.1 nM, ΔG01 = ΔG02 = 0 kcal/mol, and ΔG12 = ΔG21 = 10 kcal/mol. For Figure 6, panel B, the following parameters were used: Kd1 = Kd2 = 0.1 nM, ΔG01 = –2.0 kcal/mol, ΔG02 = 0 kcal/mol, ΔG12 = 0 kcal/mol, and ΔG21 = –2.0 kcal/mol. For Figure 6, panel C, the following parameters were used: Kd1 = Kd2 = 0.1 nM, ΔG01 = ΔG02 = –8.0 kcal/mol, and ΔG12 = ΔG21 = 0 kcal/mol. After the sample data were generated with these parameters, the binding curves were fit to the Hill equation as described above.
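For orientation, the standard form of the Hill equation, fbound = [P]^n / (K^n + [P]^n), is sketched below in Python together with a Hill-plot estimate of the coefficient n. The parameterization by a half-saturation concentration (`k_half`) is an assumption, since the exact form used for the published fits is not shown here, and the log-linear slope estimate is a simple stand-in for the nonlinear lmfit procedure. All names are hypothetical.

```python
import math

def hill_fraction_bound(protein_conc, k_half, n_hill):
    """Standard Hill equation: fraction bound at a given protein concentration."""
    pn = protein_conc ** n_hill
    return pn / (k_half ** n_hill + pn)

def hill_slope(concs, fractions):
    """Least-squares slope of ln(f / (1 - f)) versus ln[P] (a Hill plot).

    Requires 0 < f < 1 for every point; the slope estimates the
    Hill coefficient.
    """
    xs = [math.log(c) for c in concs]
    ys = [math.log(f / (1.0 - f)) for f in fractions]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Synthetic curve with k_half = 0.1 nM and a Hill coefficient of 2.
concs = [0.01, 0.03, 0.1, 0.3, 1.0]
fractions = [hill_fraction_bound(c, 0.1, 2.0) for c in concs]
n_est = hill_slope(concs, fractions)
```

A coefficient above 1 indicates positive cooperativity and a coefficient below 1 indicates negative cooperativity or site heterogeneity, which is why Hill fits are a convenient summary of simulated binding curves.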

SUPPLEMENTAL MATERIAL

Supplemental material is available for this article.

