Bing Wang1,2, Vladimir Svetlov3, Yuri I Wolf4, Eugene V Koonin4, Evgeny Nudler3,5, Irina Artsimovitch1,2. 1. Department of Microbiology, The Ohio State University, Columbus, Ohio, USA. 2. The Center for RNA Biology, The Ohio State University, Columbus, Ohio, USA. 3. Department of Biochemistry and Molecular Pharmacology, New York University School of Medicine, New York, New York, USA. 4. National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA. 5. Howard Hughes Medical Institute, New York University School of Medicine, New York, New York, USA.
Abstract
The catalytic subunit of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) RNA-dependent RNA polymerase (RdRp) Nsp12 has a unique nidovirus RdRp-associated nucleotidyltransferase (NiRAN) domain that transfers nucleoside monophosphates to the Nsp9 protein and the nascent RNA. The NiRAN and RdRp modules form a dynamic interface distant from their catalytic sites, and both activities are essential for viral replication. We report that codon-optimized (for the pause-free translation in bacterial cells) Nsp12 exists in an inactive state in which NiRAN-RdRp interactions are broken, whereas translation by slow ribosomes and incubation with accessory Nsp7/8 subunits or nucleoside triphosphates (NTPs) partially rescue RdRp activity. Our data show that adenosine and remdesivir triphosphates promote the synthesis of A-less RNAs, as does ppGpp, while amino acid substitutions at the NiRAN-RdRp interface augment activation, suggesting that ligand binding to the NiRAN catalytic site modulates RdRp activity. The existence of allosterically linked nucleotidyl transferase sites that utilize the same substrates has important implications for understanding the mechanism of SARS-CoV-2 replication and the design of its inhibitors. IMPORTANCE In vitro interrogations of the central replicative complex of SARS-CoV-2, RNA-dependent RNA polymerase (RdRp), by structural, biochemical, and biophysical methods yielded an unprecedented windfall of information that, in turn, instructs drug development and administration, genomic surveillance, and other aspects of the evolving pandemic response. They also illuminated the vast disparity in the methods used to produce RdRp for experimental work and the hidden impact that this has on enzyme activity and research outcomes. 
In this report, we elucidate the positive and negative effects of codon optimization on the activity and folding of the recombinant RdRp and detail the design of a highly sensitive in vitro assay of RdRp-dependent RNA synthesis. Using this assay, we demonstrate that RdRp is allosterically activated by nontemplating phosphorylated nucleotides, including naturally occurring alarmone ppGpp and synthetic remdesivir triphosphate.
Respiratory RNA viruses pose a major threat to humankind and have proved to be extremely refractory to modern disease control measures, which have limited the spread of water-, food-, and blood-borne epidemics, such as cholera, plague, and AIDS (1). In the 21st century, the H1N1pdm09, severe acute respiratory syndrome (SARS), Middle East respiratory syndrome (MERS), and coronavirus (CoV) disease 2019 (COVID-19) epidemics have been caused by respiratory RNA viruses, the last three by betacoronaviruses (betaCoVs), which belong to the family Coronaviridae in the order Nidovirales (2). The ongoing COVID-19 pandemic has led to a dramatic loss of human life and devastating economic and social disruptions around the world. The zoonotic origin of its causative agent, the SARS-CoV-2 clade of Severe acute respiratory syndrome-related coronavirus species, the rapid rise of mutant strains within the infected human population, and numerous instances of retransmission to zoonotic hosts speak to its resilience as a persistent human pathogen and the likelihood of the emergence of new betaCoV variants with pandemic potential (3–6). Adequate pandemic response measures, from the development of effective antivirals to genomic surveillance, require a detailed understanding of SARS-CoV-2’s molecular and structural biology. However, CoVs have been studied less thoroughly than other viral pathogens, in part owing to their extraordinarily large genome size (by far the largest among the known RNA viruses) and complex biology (7).

Upon infection of human cells, the CoV plus-strand RNA genome is translated to produce a long polyprotein that is cleaved by CoV-encoded protease into several nonstructural proteins (Nsps), which are required for viral replication and gene expression (8). Among these, SARS-CoV-2 Nsp12 plays a central role as the catalytic subunit of RNA-dependent RNA polymerase (RdRp). The RdRp is the only protein that is universally conserved among RNA viruses (9) and therefore is an attractive target for broad-spectrum antivirals. Many nucleoside analogs identified as RNA synthesis inhibitors in other viruses have been actively pursued for retargeting against SARS-CoV-2 (10).

The transcription machinery of CoV is unique among RNA viruses in its complexity; the transcribing RdRp associates with the replicative helicase Nsp13, proofreading exonuclease Nsp14/10, and several other viral proteins in a large membrane-bound replication-transcription complex (RTC) (11). The RTC components are highly conserved among CoVs (Fig. 1). Furthermore, unlike many well-studied single-subunit viral RdRps (9), a minimally active SARS-CoV-2 RdRp consists of Nsp12 and three accessory subunits: Nsp7 and two copies of Nsp8 (7·8₂·12) (12–14) (Fig. 2A).
FIG 1
Conservation of amino acid residues in genomes of alpha-, beta-, gamma-, and deltacoronavirus genera; only those proteins that are present in all Coronaviridae are shown (see Data Set S1 in the supplemental material). The Nsps are indicated by numbers; Nsp7 to -16 (shown in gray), which comprise the replication-transcription complex (RTC), are more conserved than structural (E, M, N, S) proteins and other Nsps.
FIG 2
Activities of Nsp12A and Nsp12R. (A) Transcription of RdRp of SARS-CoV-2 Wuhan-Hu-1. Nsp7 and Nsp8 are shown at the surface, and Nsp12 is shown as a cartoon, with individual domains highlighted (PDB accession no. 6YYT). (B) The 29-nt RNA hairpin scaffold is extended by RdRp to produce a 40-nt product; additional extension is thought to be mediated by Nsp8 after the completion of RNA synthesis (40). Cy 5.5, cyanine 5.5. (C) RNA extension by RdRp at 37°C under the indicated conditions; 15 mM KCl is a permissive condition. Removal of the His tag (ΔHis) does not increase Nsp12R activity, but Nsp12A expressed from an mRNA that retains rare codons is more active. Fractions of the extended RNA (% Ext.) at 10 min are shown (means ± SEM; n = 3). (D) Interactions with the RNA hairpin scaffold analyzed by electrophoretic mobility shift assays. RdRps at the indicated concentrations were incubated with 100 nM RNA.
DATA SET S1
Conservation of amino acid residues in the CoV genomes. Homogeneity (h) and the weighted fraction of nongap characters (g) were calculated. The product of these two values was used as the conservation index (h · g; ranging from 0 to 1). One sharp, single-residue profile (kernel f = 8). The Gaussian kernel smoothing procedure was applied. The overall position (column A), mature protein (B), position in the mature protein (C), and consensus amino acid (D) are shown. (A) Whole-genome pan-coronavirus conservation scores. Only consensus positions (fraction of gaps below 0.5) were used. (B) Conservation scores of Nsp12 only, which are mapped to the NCBI accession no. YP_009725307.1 protein of the SARS-CoV-2 reference genome (NCBI accession no. NC_045512.2). Download Data Set S1, XLSX file, 0.7 MB.

Nsp12 is a large (932-residue) multidomain protein. In addition to containing the RdRp module, composed of finger, palm, and thumb domains, Nsp12 contains a large nidovirus RdRp-associated nucleotidyl transferase (NiRAN) domain, which is connected to the finger domain through an interface domain (Fig. 2A). The NiRAN domain is unique to Nidovirales and has been suggested to perform a range of activities, from RNA capping to protein-primed initiation of RNA synthesis (15). A recent report identified the accessory RNA-binding protein Nsp9 as the physiological target of NiRAN NMPylase and showed that this activity is critical for viral replication (16). Consistent with these findings, the NiRAN domain active site was observed to bind Nsp9 in a single-particle cryogenic electron microscopy (cryoEM) study (17), apparently in a catalytically inactive arrangement. Thus, Nsp12 is a bifunctional enzyme with two active sites, one of which transfers an NMP moiety to the 3′ end of the nascent RNA (active site 1 [AS1] in the RdRp domain) and the other one, to the N terminus of Nsp9 (AS2, in the NiRAN domain). AS1 and AS2 utilize standard nucleoside triphosphates (NTPs) as substrates but can also accommodate a variety of nucleotide derivatives as ligands. In particular, AS1 readily incorporates remdesivir (18) and favipiravir (19) monophosphates into RNA, and structural evidence suggests that AS2 is similarly promiscuous (14, 17, 20). Thus, the effects of NTPs and nucleoside analogs on both catalytic activities must be taken into account when interpreting experimental data and evaluating the antiviral potential of lead molecules.

SARS-CoV-2 RdRp contains intrinsically disordered regions (IDRs) that undergo large context-dependent conformational changes, e.g., upon interaction with the product RNA (13) or upon binding to ligands in the AS2 of NiRAN (14, 17, 20). These inherent dynamic properties suggest that RdRp’s activity can be modulated, positively or negatively, by factors that control the folding of the enzyme. Here, we report that the RdRp used in several structural and functional studies is largely inactive because synonymous codon substitutions in Nsp12 designed to maximize its expression instead trigger its misfolding. We identify a region containing a cluster of rare codons that plays a critical role in the proper folding of Nsp12 and show that Nsp12 expression in a bacterial strain with slow ribosomes and/or incubation with the accessory Nsp7/8 subunits increases RdRp activity. We further show that nucleoside analogs that cannot be incorporated into RNA can nonetheless activate RNA chain extension, presumably through binding to AS2. Our findings have immediate implications for functional studies and identification of novel inhibitors of SARS-CoV-2 RdRp and highlight the need for improved mRNA-recoding algorithms during the rational design of other biotechnologically and medically important expression systems.
RESULTS
Nsp12s expressed from different coding sequences differ in activity and conformation.
A rapidly growing collection of cryoEM structures of RdRp bound to different partners provides an excellent framework for understanding the mechanism of RNA synthesis and for the identification of novel RdRp inhibitors (12–14, 17, 21, 22). As is the case with other systems, structural models require validation by functional studies that critically depend on the availability of robust expression systems and highly active RdRp preparations. Given that the structures obtained for RdRp produced in Escherichia coli (12, 14) and in insect cells (13) are closely similar, we used the E. coli expression platform (see Fig. S1 in the supplemental material) to initiate mechanistic studies of SARS-CoV-2 RdRp. For the sake of expediency, we used an Nsp12 expression vector described in reference 12 (we refer to Nsp12 produced from this vector as Nsp12R, where R indicates the laboratory where this plasmid was constructed) and Nsp7- and Nsp8-producing vectors that we constructed for this study. Nsp12R contains a noncleavable C-terminal His10 tag, is soluble when produced in E. coli, and is easily purified under “native” (nondenaturing) conditions.

FIG S1
Expression constructs used in this study. The pET22a-Nsp12 plasmid has been described by Gao and colleagues (Y. Gao, L. Yan, Y. Huang, F. Liu, et al., Science 368:779–782, 2020, https://doi.org/10.1126/science.abb7498); we refer to this variant as Nsp12R. In our group, genes encoding SARS-CoV-2 Nsp7, Nsp8, and Nsp12 proteins were obtained from GenScript; the standard GenScript codon optimization algorithm was used, and additional silent restriction sites were designed to facilitate subsequent mutagenesis; this variant is referred to as Nsp12A. In addition, solubility and purification tags were added, together with the proteolytic cleavage sites. The synthetic cassettes were cloned into pET-based expression vectors under the control of the T7 gene 10 promoter and lac repressor. The relevant features of the expression vectors are shown. In vectors with N-terminally His-tagged Nsp12 variants (Nsp12A and chimeras), the Ulp1-mediated cleavage generates the “authentic” N terminus of the protein, a feature that may be critical for RdRp activity (D. W. Gohara, C. S. Ha, S. Kumar, B. Ghosh, et al., Protein Expr Purif 17:128–138, 1999, https://doi.org/10.1006/prep.1999.1100). Our data indicate, however, that additional sequences at the N and C termini do not impair Nsp12 activity. We found that Nsp12 with the N-terminal His-SUMO tag is active, although we have not rigorously compared the activities of RdRps assembled with Nsp12 before and after cleavage by Ulp1, and the removal of the C-terminal His tag does not increase the activity (Fig. 2C). We note that the TEV-generated C terminus in this and other cases has additional residues comprising the TEV recognition sequence. We also found that Nsp12T, expressed from a plasmid similar to pIA1385 (pRSFDuet-sumo-NSP12; Addgene no. 159107), which was constructed in the Tuschl lab for studies of the RdRp-Nsp13 helicase complexes (J. Chen, B. Malone, E. Llewellyn, M. Grasso, et al., Cell 182:1560–1573 e13, 2020, https://doi.org/10.1016/j.cell.2020.07.033), was as active as Nsp12A under our assay conditions (Fig. S3). To produce Nsp12T, the viral nsp12 RNA was reverse transcribed and expressed in the BL21 RIL strain, which contains extra copies of the rare argU, ileY, and leuW tRNA genes but lacks other rare tRNAs, suggesting that the nsp12 codon usage is suboptimal. Plasmids for three mutants of Nsp12A were also made for this study: (i) pIA1401 for Nsp12S709R, (ii) pIA1402 for Nsp12Y129A, and (iii) pIA1405 for Nsp12D218A. Download FIG S1, PDF file, 0.1 MB.

We found that the 7·8₂·12R enzyme exhibited negligible activity on a number of different templates, including the optimal hairpin scaffold (Fig. 2B) used by Hillen et al. (13), which could be extended only at a very low concentration of salt. An extensive experimental survey of different combinations of purification schemes, RNA scaffolds, and reaction conditions failed to identify conditions that would support efficient primer extension, and the removal of the His tag, which has been proposed to interfere with RdRp activity (23), did not increase activity under permissive (15 mM KCl) conditions (Fig. 2C). In their follow-up study, Wang et al. reported similar results (24), prompting us to conclude that further attempts to boost the activity of the 7·8₂·12R enzyme produced under these conditions would be futile.

Our survey of published reports failed to reveal an obvious reason for the observed low activity of the 7·8₂·12R enzyme. Under similar reaction conditions, some RdRps were able to completely extend the RNA primer in minutes (13, 19, 23, 25), whereas others failed to do so in an hour (21, 24), regardless of the expression system. Idiosyncratic but reproducible variations in activity can arise from recombinant protein misfolding; indeed, coexpression of Nsp12 with cellular chaperones has been shown to enhance its activity (23, 26). A likely source of this variability may lie in the coding mRNA itself; whereas all recombinant Nsp12s have the same amino acid sequence (ignoring the tags), their coding sequences (CDSs) have been altered to match the codon usage of their respective hosts to maximize protein expression. Codon optimization is routinely used for protein expression in heterologous systems (27), yet protein function can be compromised even by a single synonymous codon substitution (28, 29). Furthermore, protein expression in a BL21 RIL strain, which alleviates codon imbalance by supplying a subset of rare tRNAs and is thus commonly used to express heterologous proteins in E. coli, can hinder proper folding (30). The abrogation of ribosome pausing at rare codons is thought to uncouple nascent peptide synthesis from its folding, giving rise to misfolded proteins (29, 31, 32).

Although robust viral gene expression may promote host takeover, SARS-CoV-2 mRNAs, including nsp12, are not efficiently translated in human cells (33). The viral nsp12 mRNA contains clusters of rare codons (in humans) (Fig. S2), yet the resulting enzyme is active and able to sustain efficient infection. This suggests that pause-prone translation may facilitate the proper folding of Nsp12. In contrast, the nsp12R codon usage matches that of highly expressed E. coli genes, raising the possibility that the nsp12R CDS has been optimized for the maximum expression of soluble protein in a bacterial host but not for enzymatic activity and/or acquisition of a native structure. There is no a priori reason to believe that an overexpressed soluble protein retains all of its activity, and the abundance of Nsp12 in the heterologous host may be made possible by its diminished NTP binding and/or condensation activity, defects in RNA binding, or other functionalities. To evaluate this possibility, we designed an nsp12A variant (where A indicates that it was constructed by I. Artsimovitch) that contains more rare codons (Fig. S2), including in the regions that bear rare codons in the viral mRNA. Interestingly, the nsp12T expression vector (constructed in T. Tuschl’s lab) also gives rise to active RdRp (14); in that study, the viral nsp12 mRNA was reverse transcribed and expressed in the E. coli BL21 RIL strain, which contains extra copies of the argU, ileY, and leuW rare tRNA genes. While comparable codon frequency measurements are not available for the RIL strain, it does not carry all rare tRNAs required for the efficient translation of the viral nsp12 mRNA, suggesting that nsp12T codon usage is suboptimal. We found that RdRp assembled with Nsp12A had a much higher activity on the hairpin scaffold (Fig. 2C). We also noted that Nsp12A and Nsp12T copurified with nucleic acids (Fig. S3A); subsequent gel shift assays revealed that 7·8₂·12A readily bound the RNA hairpin, whereas 7·8₂·12R did not (Fig. 2D). We found that the 7·8₂·12T enzyme behaved similarly to 7·8₂·12A (Fig. S3B), but since the nsp12T expression vector lacks restriction sites required for protein engineering, we used nsp12A in all subsequent experiments.

FIG S2
Differences in codon frequencies between nsp12 (blue) and nsp12 (red) CDSs. A dashed line indicates a 10% frequency cutoff. In the nsp12R mRNA, 2 codons fall below this threshold, compared to 22 in the nsp12A mRNA. Download FIG S2, PDF file, 0.3 MB.

FIG S3
Nsp12R is defective in interactions with nucleic acids. (A) Nsp12A and Nsp12T proteins copurify with nucleic acids during the first step, metal affinity chromatography. The nucleic acids are separated during chromatography on Q Sepharose, as is commonly observed with other RNA polymerases purified in E. coli, giving rise to a prominent peak detected by absorbance at 280 nm (indicated by a red box). Fractions comprising this peak contain no protein detectable by staining of denaturing SDS gels and by UV spectroscopy, which yields a 260/280 nm absorbance ratio around 1.8. We also observed differences in the elution profiles of Nsp12A and Nsp12R (gray boxes), consistent with presumed differences in their conformations. We note that since Nsp12 alone does not form productive complexes with RNA (B), the copurified nucleic acid must be bound nonspecifically. (B) Interactions with the RNA hairpin scaffold analyzed by electrophoretic mobility shift assay. (C) RNA extension by 7·8₂·12 RdRp complexes formed with Nsp12A, Nsp12R, and Nsp12T. Download FIG S3, PDF file, 0.3 MB.

To test whether Nsp12R was misfolded, we used several approaches. First, we assessed the Nsp12 thermal stability using differential scanning fluorimetry (34). We recorded melting temperatures (Tm) of 41.3°C for Nsp12R and 47.3°C for Nsp12A (Fig. S4A); for another E. coli-expressed Nsp12, a Tm of 43.6°C was reported (35). Second, we compared the intrinsic fluorescence spectra of Nsp12, which contains nine tryptophan residues that are expected to be sensitive to the microenvironment (36). Nsp12A and Nsp12R exhibited similar emission peaks, but the Nsp12A intensity was 2-fold higher (Fig. 3A), suggesting that at least one Trp was more buried; the derivative spectra (Fig. S4B) did not reveal any additional differences. These results show that Nsp12A and Nsp12R are structurally distinct but do not identify the regions of altered structure.
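In differential scanning fluorimetry, the melting temperature is read as the temperature at which the first derivative of the fluorescence signal reaches its maximum. The following Python sketch illustrates that readout only; the function name, the central-difference derivative, and the synthetic sigmoidal curve are illustrative assumptions and do not reproduce the analysis software or the data used in this study.

```python
import math


def melting_temperature(temps, fluorescence):
    """Return the temperature at which dF/dT is largest.

    The derivative is approximated by central differences, so the two
    end points of the curve are never returned.
    """
    if len(temps) != len(fluorescence) or len(temps) < 3:
        raise ValueError("need at least three paired readings")
    best_temp, best_slope = None, float("-inf")
    for i in range(1, len(temps) - 1):
        slope = (fluorescence[i + 1] - fluorescence[i - 1]) / (
            temps[i + 1] - temps[i - 1]
        )
        if slope > best_slope:
            best_temp, best_slope = temps[i], slope
    return best_temp


# Synthetic unfolding curve (assumption): a sigmoid centred at 47.3 degrees C,
# sampled every 0.5 degrees from 25 to 70.
temps = [25.0 + 0.5 * i for i in range(91)]
curve = [1.0 / (1.0 + math.exp(-(t - 47.3) / 1.5)) for t in temps]

tm = melting_temperature(temps, curve)
print(f"estimated melting temperature: {tm:.1f} C")
```

Because the estimate is restricted to sampled temperatures, its resolution is limited by the temperature step; real traces would usually be smoothed before differentiation.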
FIG 3
Differences between Nsp12A and Nsp12R. (A) Intrinsic tryptophan fluorescence of Nsp12 proteins. The spectra of denatured proteins confirm that their concentrations are identical. The means and SEM of triplicate measurements are shown as lines and shaded bands, respectively, in this and Fig. S4 and S6B. A.U., arbitrary units. (B) Map of the EDC modifications. Lines show the positions of monolinks (outside) and cross-links (inside) mapped onto the Nsp12 schematic, with the domains colored as in panel A. Colors indicate differences in reactivity; residues in red were reactive only in Nsp12R, those in blue were reactive in Nsp12A, and those in black were reactive in both proteins. Only high-confidence monolinks (<10−5) and cross-links (<10−3) are shown (see Data Set S2). (C) Conservation of the NiRAN-RdRp interaction surfaces mapped on the transcription complex structure (PDB accession no. 6XEZ). Amino acid residues are colored according to their conservation. Key residues in AS1 (D760), in AS2 (D218), and at the NiRAN-palm interface (Y129 and S709) are shown as spheres; ADP bound to AS2 is shown as sticks and the Mg2+ ion as a purple sphere.
FIG S4
Structural differences between Nsp12A and Nsp12R proteins. (A) Differential scanning fluorimetry (DSF) was conducted as reported previously (C. L. Schwebach, E. Kudryashova, W. Zheng, M. Orchard, et al., Bone Res 8:21, 2020, https://doi.org/10.1038/s41413-020-0095-2). Proteins were diluted to 3 μM in 20 mM HEPES, pH 7.5, 150 mM KCl, 40% glycerol, 1 mM DTT, 2 mM MgCl2 in the presence of a 1:5,000 dilution of Sypro Orange dye (Invitrogen). A change in the fluorescence of the dye, which preferentially binds to hydrophobic protein regions exposed upon heat-induced unfolding, was measured in triplicate at a rate of 2°C/min using a CFX real-time PCR detection system (Bio-Rad). The data are plotted as the first derivatives of the fluorescence signal versus temperature; three individual spectra are shown for each Nsp12 variant. The melting temperatures (Tm) were determined as the maximum of the first derivative of each normalized experimental curve and are expressed as means ± SEM. The P value was calculated by an unpaired two-tailed t test. **, P < 0.01. (B) Second derivatives of Trp emission spectra shown in Fig. 3A and S6B. Download FIG S4, PDF file, 0.2 MB.

FIG S6
Nsp7-Nsp12 interactions. (A) Magnified view at the Nsp7-12 interface, with positions of differential EDC modifications indicated as in Fig. 3B. In Nsp7, a unique Trp29 residue is exposed but becomes buried at the interface with Nsp12. (B) Nsp7-Nsp12 interactions assayed by Trp fluorescence. As expected, Nsp7 alone (with solvent-exposed Trp29) did not produce detectable emission. Upon addition of Nsp7 to Nsp12A, a modest decrease in intensity (relative to that of free Nsp12A) was observed, indicative of changes in both partners upon complex formation, as docking of Nsp7 without concomitant changes in Nsp12 is expected to increase fluorescence when Trp29 is buried. In contrast, an increase was observed when Nsp7 was added to Nsp12R. Most strikingly, although the emission spectra of the isolated Nsp12R and Nsp12A were very different, the spectra of Nsp7-Nsp12 complexes tended to be similar. While we cannot identify specific Trp residues that become rearranged upon complex formation, our results suggest that Nsp7 binds to both Nsp12 subunits. Download FIG S6, PDF file, 0.2 MB.

DATA SET S2
EDC modification sites identified by mass spectrometry. The sites shown in Fig. 3B are highlighted in bold italic. Residues are numbered according to the Nsp12 sequence from the SARS-CoV-2 genome (NCBI accession no. NC_045512.2). The Gly residue of Nsp12R before the first residue S1 is numbered G0. AA, amino acid. (A) Crosslinks in Nsp12A; (B) crosslinks in Nsp12R; (C) monolinks in Nsp12A; (D) monolinks in Nsp12R. Download Data Set S2, XLSX file, 0.03 MB.

We next used a carboxyl- and amine-reactive reagent, EDC [1-ethyl-3-(3-dimethylaminopropyl)carbodiimide], to map solvent-accessible (surface) residues and intraprotein cross-links by mass spectrometry. We observed substantial differences in accessibility of several regions centered at residues 150 (NiRAN domain), 415 (fingers), 600 (palm), and 850 (thumb) and in cross-linking, particularly of the NiRAN domain (Fig. 3B).
Attenuated translation and accessory subunits promote an active Nsp12 conformation.
Although an overall excess of underrepresented codons can slow down translation, in many cases, the ribosome has to pause at one or more specific rare codons to ensure proper protein folding at key junctures (37, 38). The differences in codon frequencies between the 2.8-kb mRNAs encoding Nsp12A and Nsp12R are extensive (Fig. S2). The produced proteins also differ in their N and C termini (Fig. S1), but based on available structural data, an extra N-terminal glycine would not be expected to account for the observed dramatic differences in EDC reactivity (Fig. 3B and Data Set S2). Comparative analysis identified two regions that contained rare codon clusters in the native SARS-CoV-2 RNA and in Nsp12A mRNAs, but not in Nsp12R (Fig. 3A and Fig. S5). We constructed chimeric proteins in which these Nsp12A segments were replaced with corresponding segments from Nsp12R, generating proteins with identical amino acid sequences (Fig. 4A). We found that whereas swapping of codons 143 to 346 between the mRNAs producing active and inactive Nsp12 variants did not alter the RdRp activity, a chimeric protein containing codons 350 to 435 derived from the Nsp12R CDS was defective (Fig. 4B).
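The comparison of coding sequences rests on flagging codons whose usage frequency falls below a cutoff (10% in the supplemental figures). The Python sketch below shows how two synonymous coding sequences can be scanned for such codons. The usage table is a small set of placeholder values chosen for illustration, not output from the E. coli Codon Usage Analysis tool, and the function name and example sequences are hypothetical.

```python
# Placeholder relative-usage values (assumption): the fraction of each codon
# among the synonymous codons for its amino acid. Not real E. coli data.
ILLUSTRATIVE_USAGE = {
    "ATG": 1.00,  # Met
    "CTG": 0.50,  # Leu, common
    "CTA": 0.04,  # Leu, rare
    "CGT": 0.38,  # Arg, common
    "AGG": 0.03,  # Arg, rare
    "AAA": 0.76,  # Lys
}


def rare_codon_positions(cds, usage, cutoff=0.10):
    """Return 1-based codon positions whose usage frequency is below cutoff."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length must be a multiple of 3")
    positions = []
    for start in range(0, len(cds), 3):
        codon = cds[start:start + 3]
        # A KeyError here means the table lacks this codon.
        if usage[codon] < cutoff:
            positions.append(start // 3 + 1)
    return positions


# Two hypothetical sequences encoding the same peptide (Met-Leu-Leu-Arg-Lys).
with_rare_codons = "ATGCTGCTAAGGAAA"
fully_optimized = "ATGCTGCTGCGTAAA"

print(rare_codon_positions(with_rare_codons, ILLUSTRATIVE_USAGE))
print(rare_codon_positions(fully_optimized, ILLUSTRATIVE_USAGE))
```

Identical amino acid sequences can therefore differ in where, and whether, rare codons occur, which is the property exploited by the chimeric constructs.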
FIG 4
Determinants of Nsp12 activity. (A) The translational context around residue 400 is critical for the correct folding of Nsp12. The SARS-CoV-2 genomic nsp12 RNA (with domain boundaries shown on the top) contains clusters of rare codons (purple bars); only the Nsp12A CDS has rare codons (cyan bars) at the corresponding positions. (B) A chimeric Nsp12-AR2 protein is defective in RNA synthesis. (C) Reactivation of Nsp12R via 37°C preincubation with the accessory Nsp7 and Nsp8 subunits to form the RdRp holoenzyme. (D) Translation by slow ribosomes yields a more active Nsp12. RNA extension is shown as means ± SEM, and the P value was calculated by an unpaired two-tailed t test. n.s., not significant; **, P < 0.01.
FIG S5
Rare codons in the nsp12 CDS, compared to codons present in highly expressed genes. This comparison is appropriate for proteins expressed in E. coli from plasmids that carry a very strong T7 promoter and a canonical ribosome binding site. (A) Linear maps of nsp12 variants. Magenta bars on top show rare codons in the SARS-CoV-2 genomic RNA; the cyan bars indicate regions in which rare codons at similar positions were also present in the nsp12A but not in the nsp12R coding sequence. (B and C) Rare codon clusters in regions 1 (B) and 2 (C). The rare codons in the viral genome are highlighted in the sequence in magenta. The bar graph shows codon frequencies calculated using an online tool, E. coli Codon Usage Analysis 2.0, developed by Morris Maduro (https://faculty.ucr.edu/~mmaduro/codonusage/usage.htm). A dashed line indicates the 10% cutoff. Rare codons (below 10%) present in nsp12A and the viral CDS at identical or adjacent positions are shown in cyan.
Together with the EDC modification patterns (Fig. 3B), this result suggests that controlled translation of the codon 350–435 region is important for Nsp12 folding and that changes in contacts with Nsp7 (Fig. S6), which are critical for RdRp activity (13), may be partially responsible for the low activity of Nsp12R. During expression of the viral genome, Nsp7, Nsp8, and Nsp12 are cotranslated with other Nsps as giant precursors that are later processed into individual polypeptides, and the RdRp may be assembled concurrently with protein synthesis. Analysis of Nsp7/Nsp12 interactions by Trp fluorescence reveals that Nsp7 binds to both Nsp12 variants and might favor a similar, Nsp12A-like state (Fig. S6). In support of the “scaffolding” function of the accessory subunits (35), we found that preincubation of Nsp12R with Nsp7 and Nsp8 increased its activity (Fig. 4C).
We next tested if slowing translation during protein expression would promote Nsp12 folding. We constructed a BL21 strain with a K42T mutant of the ribosomal protein S12, which causes an approximately 2-fold reduction in the translation rate (39), and compared the Nsp12R protein purified from this “slow” BL21 variant to the protein purified from wild-type BL21. We found that Nsp12R purified from the mutant BL21 was approximately 2-fold more active (Fig. 4D), consistent with the favorable effect of attenuated translation.
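The rare-codon comparison described for FIG S5 (codons below a 10% cutoff, grouped into clusters) can be illustrated with a short sketch. This is not the authors' pipeline or the online tool they used: the function names, the clustering parameters (`max_gap`, `min_size`), and the interpretation of the cutoff as a usage fraction between 0 and 1 are assumptions, and the usage table in the example contains made-up placeholder numbers rather than real E. coli values.

```python
from typing import Dict, List, Tuple


def split_codons(cds: str) -> List[str]:
    """Split a coding sequence into codons, accepting RNA or DNA letters."""
    cds = cds.upper().replace("U", "T")
    if len(cds) % 3 != 0:
        raise ValueError("CDS length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def rare_positions(cds: str, usage: Dict[str, float], cutoff: float = 0.10) -> List[int]:
    """Return 1-based codon positions whose usage fraction is below the cutoff."""
    return [pos for pos, codon in enumerate(split_codons(cds), start=1)
            if usage[codon] < cutoff]


def rare_clusters(positions: List[int], max_gap: int = 5,
                  min_size: int = 2) -> List[Tuple[int, int]]:
    """Group rare positions that lie within max_gap codons of each other.

    Returns (first, last) codon positions of each group with at least
    min_size rare codons. Both parameters are illustrative assumptions.
    """
    found: List[Tuple[int, int]] = []
    run: List[int] = []
    for pos in positions:
        if run and pos - run[-1] > max_gap:
            if len(run) >= min_size:
                found.append((run[0], run[-1]))
            run = []
        run.append(pos)
    if len(run) >= min_size:
        found.append((run[0], run[-1]))
    return found


# Hypothetical placeholder fractions, NOT real E. coli codon usage values.
TOY_USAGE = {"AAA": 0.70, "CTG": 0.50, "AGG": 0.03, "CTA": 0.04}
toy_cds = "".join(["AAA", "AGG", "CTA", "CTG"] + ["AAA"] * 8 + ["AGG"])
toy_rare = rare_positions(toy_cds, TOY_USAGE)
toy_clusters = rare_clusters(toy_rare)
```

Running the same functions on two synonymous coding sequences and comparing the resulting cluster lists would show where one sequence retains rare-codon clusters that the other has lost, which is the kind of difference the chimeric constructs were designed to probe.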
Allosteric RdRp activation by nucleotides.
Our results show that Nsp12A and Nsp12R differ dramatically in the conformations and interactions of their NiRAN domains (Fig. 3B). Although the NiRAN domain is not known to affect RNA chain extension directly, it interacts with the catalytic palm domain (13, 14, 21) and may modulate catalysis allosterically. The NiRAN domain is partially disordered in most unliganded structures of RdRp and transcription complexes but becomes ordered upon binding of ADP-Mg2+, GDP-Mg2+, and PPi-Mg2+ to AS2 (Fig. 5A) (14, 17, 20). We hypothesized that, upon binding to nucleotides, the NiRAN domain would become more rigid, favoring an active RdRp conformation, thus leading to more efficient RNA elongation. First, we compared rates of RNA synthesis under standard conditions in which RdRp is bound to the RNA scaffold prior to the addition of the NTP substrates to the “NTP-primed” reaction mixture, in which the order of reagent addition was reversed (Fig. 5B). The results show that preincubation with NTPs strongly potentiates Nsp12R activity, an effect that may be mediated by the NiRAN domain.
FIG 5
Allosteric activation of RdRp. (A) An overlay of NiRAN domain structures in the absence (wheat [PDB accession no. 6YYT]) and in the presence (gray [PDB accession no. 6XEZ]) of the bound ADP-Mg2+. The interface of the NiRAN domain and the RdRp domains is shown. (B) Activation of the Nsp12R holoenzyme by preincubation with NTPs. (C) Activation of RNA synthesis by purine nucleotides (at 1 mM) on the CU (left) and 4N (right) templates. (D) Effects of Nsp12 substitutions on activation of RNA synthesis by 0.5 mM GTP; fold activation is shown above each set of bars. (B and D) RNA extension is shown as means ± SEM, and the P value was calculated by an unpaired two-tailed t test. n.s., not significant; **, P < 0.01.
Given that nucleotide binding to AS2 has been shown to remodel the NiRAN domain in the active Nsp12 (14), we surmised that NTP-mediated activation should also occur in Nsp12A. To separate the direct and allosteric effects of NTPs, we used a CU template, which contains only purines in the transcribed region; as expected, the RNA was extended in the presence of CTP and UTP (Fig. 5C). In addition to detecting the runoff RNA (40 nucleotides [nt]), we detected a longer product that likely results from the terminal transferase activity of Nsp8, an activity that prefers blunt over 3′ recessed ends and ATP as a substrate (40). To assay the hypothetical allosteric activation of the RdRp by NTP bound to the NiRAN domain, we chose conditions under which less than 50% of the scaffold was extended.
Consistent with the allosteric effects of nontemplated nucleotides, transcription was activated in the presence of 1 mM ATP (>4-fold) or GTP (>10-fold) (Fig. 5C). An apparent promiscuity of AS2 suggests that other nucleotides might be able to substitute for ATP and GTP. To test this idea, we used a pause-promoting ATP analog, remdesivir triphosphate (RTP), and guanosine tetraphosphate (ppGpp), an allosteric effector of E. coli RNAP. We found that the effects of RTP and ppGpp mimicked those of ATP and GTP, respectively, on the CU template (see Materials and Methods). Activation was also observed with the 4N template, on which ATP, GTP, and RTP but not ppGpp can be utilized as the substrates; as expected, RMP incorporation led to RdRp stalling before reaching the end of the template (18, 22). Consistent with the reported preference of the Nsp8 terminal transferase activity for ATP (40), the fraction of the extended RNA is reduced in the presence of GTP and ppGpp compared to that in the presence of ATP (Fig. 5C).
We hypothesized that RdRp-activating nucleotides act via binding to AS2 and stabilizing the RdRp-NiRAN interface. To test this hypothesis, we replaced two conserved residues at the interface. Tyr129 in the NiRAN domain is nearly invariant among all CoVs, whereas only small residues (Ser, Ala, Gly) are found at position 709 in the palm domain (Fig. 3C). Given this evolutionary conservation, we suspected that replacements of these amino acids might compromise the interdomain contacts, making RNA synthesis more dependent on the state of the NiRAN domain. Consistent with this idea, we found that Y129A and S709R substitutions reduced RNA synthesis activity while potentiating activation by 0.5 mM GTP (Fig. 5D).
The catalytic activity of the NiRAN domain has been shown to be independent of the RdRp function; Nsp9 modification occurs normally in an enzyme with substitutions in AS1 that inactivate the RdRp (16). To determine if the converse is true, we replaced Asp218, which coordinates the Mg2+ ion in the AS2 (14) and is critical for Nsp9 NMPylation and viral replication (16), with Ala. This substitution did not compromise RNA synthesis, confirming that AS1 and AS2 are functionally independent, and only modestly reduced GTP-dependent activation (Fig. 5D), suggesting that, if the allosteric GTP binds to the NiRAN AS2, Asp218 makes at most a minor contribution to nucleotide affinity. This observation is not entirely surprising because Asp residues are critical for substrate positioning but make lesser contributions to substrate binding in other viral polymerases (41).
DISCUSSION
Our results lead to two principal conclusions. First, SARS-CoV-2 Nsp12 depends on cotranslational folding, facilitated by ribosome pausing, and on interactions with the accessory subunits to attain the active conformation. Second, the two nucleotidyl transfer catalytic sites in Nsp12, a unique property of Nidovirales, appear to be connected allosterically: nucleotides, including various analogs, that bind to NiRAN AS2 activate RNA chain extension in RdRp AS1.
Pause-free translation yields inactive RdRp.
Our results demonstrate that overoptimized Nsp12R mRNA produces a soluble but misfolded protein in which RNA binding and catalytic activity (Fig. 2C and D) are compromised. Notably, despite the dramatic differences in the activities of RdRp preparations, all structures of SARS-CoV-2 transcription complexes reported so far are closely similar (12–14), reflecting the bias introduced during cryoEM analysis, in which only a small fraction of “good” particles is selected based on image analysis (e.g., about 1% in a study of RdRp inhibition by remdesivir [42]). A preparation composed of largely inactive enzymes remains amenable to the cryoEM analysis but would compromise biochemical experiments; to rephrase the fourth commandment of enzymology (43), thou shalt not waste clean thinking on dead enzymes. For example, a conclusion that SARS RdRp is more active than the SARS-CoV-2 enzyme (35) is predicated on the assumption that both RdRps are properly folded. Even more critically, inactive RdRps cannot be used to screen potential inhibitors.
While recoding is routinely used to optimize heterologous protein expression (27), the existence and frequent clustering of rare codons in mRNAs encoding many essential proteins, especially large, multidomain ones, indicate their crucial role as regulators of protein folding. For example, native nonoptimal codons in intrinsically disordered regions (IDRs) are essential for the function of circadian clock oscillators (44, 45). IDRs often serve as platforms for protein-protein interactions (46) but can become trapped in unproductive states in the absence of their interaction partners. Our analysis supports this scenario by showing that an unstructured region that binds Nsp7 displays substantial differential sensitivity to EDC (Fig. 3B) and that interaction with Nsp7 may favor an active-like Nsp12 conformation (Fig. S6). When added to misfolded Nsp12, Nsp7/8 only modestly increases its activity (Fig. 4C). However, because all Nsps are produced as a giant precursor in coronavirus-infected cells (8), the accessory subunits may aid Nsp12 folding cotranslationally, as apparently happens during their coexpression in E. coli (23). Likewise, coexpression of E. coli RNA polymerase subunits suppresses assembly defects conferred by deletions in the catalytic subunits (47).
More broadly, our findings have implications for the heterologous expression of countless other proteins. Although examples of deleterious synonymous substitutions have been reported, these cases have been generally perceived as outliers. In retrospect, optimization-induced misfolding is likely to be far more prevalent than previously thought, with different recoding approaches impacting the structure and activity of the resulting protein in substantially different ways. The importance of cotranslational folding, particularly for large and dynamic proteins that contain essential mobile regions, emphasizes a need for the integration of diverse approaches, from ribosome profiling to machine learning, during the rational design of coding sequences to avoid misfolding traps.
Another important implication of the codon usage impact on SARS-CoV-2 protein folding, structure, and activity lies in the interpretation of genomic surveillance data. So far, the analysis of the genetic variability of SARS-CoV-2, the characterization of variants of concern, and the designation of its evolutionary lineages have focused on nonsynonymous changes, i.e., amino acid substitutions. Many of those amino acid substitutions show little-to-no impact on properties of proteins in which they appear (48, 49). We posit that many synonymous mutations, and even some nonsynonymous ones, may manifest their effects primarily at the level of cotranslational folding, rather than in the properties of the folded protein in vitro, or impact those through altering the ratio of folded to misfolded proteins during the infection.
Crosstalk between two catalytic sites of SARS-CoV-2 Nsp12.
Decades of studies of viral RdRps focused on the mechanism of RNA synthesis and identification of nucleoside analogs that inhibit viral replication. During the COVID-19 pandemic, repurposing of the existing drugs targeting RdRp, justified by structural similarity among RdRp active sites (9), became an urgent priority. Among these drugs, remdesivir received the most attention, even though the estimates of its clinical effectiveness range from moderate to insignificant (50, 51). The CoV RdRp readily uses RTP as a substrate in place of ATP and temporarily stalls downstream of the site of RMP incorporation (18, 22). However, the proposed mechanisms of the inhibitory effect of RTP vary widely, from RdRp stalling to RNA chain termination to disassembly of the RdRp (52–54). It is presently unclear whether antiviral effects of remdesivir are due to delays in RNA synthesis or to errors in the product RNAs, as is the case with another purine analog, favipiravir (19).
Although efforts aimed at the identification of nucleoside analog inhibitors of RdRp are focused on AS1, it is clear that effects of nucleoside analogs on SARS-CoV-2 replication may be multifaceted. Nsp12 contains two active sites separated by more than 80 Å (Fig. 6), both of which can bind NTPs and nucleoside analogs. The functions of AS1 and AS2 are largely independent; Nsp12 containing double substitutions in AS1 that abolish elongation is fully competent for Nsp9 NMPylation (16), whereas the D218A substitution in Nsp12 that abolishes NMPylation blocks viral replication (16) but does not compromise RNA extension (Fig. 5D). In addition, each subunit of the Nsp8 dimer can also bind NTPs (40), and although there is no structural evidence of nucleotide binding to Nsp8 and its terminal transferase activity might be posttranscriptional, SARS-CoV-2 RdRp contains, all together, four nucleotide-binding sites.
FIG 6
SARS-CoV-2 replication critically depends on two active sites in Nsp12 that mediate NMP transfer to RNA (AS1) and Nsp9 protein (AS2). Substrate (or inhibitor) binding to one site may be communicated to the other site through a highly conserved domain interface.
Thus, one cannot assume that the observed effect of a nucleotide is mediated via the “primary” nucleotide binding to AS1; indeed, we show here that RTP promotes RNA synthesis when it cannot be incorporated into RNA, and this effect is even more pronounced with ppGpp (Fig. 5C). Competitive inhibitors binding in AS2 or transferring noncognate ligands to Nsp9 are likely to inhibit replication. In the latter case, the erroneous modification of Nsp9 may have more lasting effects than misincorporation into RNA, because errors in the nascent RNA can be corrected by the SARS-CoV-2 proofreading exonuclease Nsp14 (26).
We hypothesize that AS1 and AS2 are allosterically linked, enabling coordinated control of the RdRp activity. The NiRAN and palm domains form an extensive interface composed of highly conserved residues, including Tyr129 (Fig. 3C). Upon binding to AS2, nucleotides induce NiRAN folding and lead to subtle changes at the domain interface (14, 17, 20). We show that binding of nucleotides that cannot be incorporated into RNA potentiates RdRp activity (Fig. 5C). There is currently no direct evidence that this effect is triggered through their binding to AS2, but the effects of substitutions in the NiRAN domain (Fig. 5D) and structural data (14, 17, 20) support this model. Although rigorous computational, structural, and biochemical analyses will be required to test this hypothesis, it is already clear that, when considering the effects of various nucleotide analogs on viral RNA synthesis, their binding to AS2 (and, perhaps, AS3 and AS4 as well) cannot be ignored.
The open active sites of viral RdRp can accommodate highly diverse substrates, some of which have been developed into therapeutics (10). Our finding that ppGpp activates RdRp similarly to GTP (Fig. 5C) suggests that other nucleotide-binding sites in Nsp12 are also promiscuous. Furthermore, the interplay between the binding of RTP to both catalytic and allosteric (relative to RNA synthesis) sites and competition therein with cellular NTPs call for a nuanced interpretation of the remdesivir inhibition mechanism, as does potential competition with ppGpp binding to the allosteric site (AS2). The biological activity of ppGpp has long been considered to be limited to bacteria and plastids, but its action as an alarmone has recently been demonstrated in human cells (55), raising the possibility that ppGpp and other nontemplating nucleotides impact SARS-CoV-2 replication in host cells.
Allosteric control of SARS-CoV-2 RdRp invites interesting parallels with E. coli Qβ replicase, which also consists of four subunits: the phage-encoded RdRp (β-subunit) and three host RNA-binding proteins, the translation elongation factors EF-Tu and EF-Ts and a ribosomal protein, S1 (56). Similarly to Nsp7/8, EF-Tu and EF-Ts aid in the cotranslational assembly of Qβ RdRp (57); EF-Tu also forms a part of the single-stranded RNA exit channel, assisting in RNA strand separation during elongation, whereas S1 acts as an initiation factor (56). EF-Tu and EF-Ts binding to ppGpp modulates host translation (58) and RNA synthesis by Qβ (59), suggesting that RNA viruses from bacteria to humans may employ nucleotide analogs as sensors of cellular metabolism.
The SARS-CoV-2 RdRp subunit composition and dynamics resemble those of structurally unrelated bacterial RNA polymerases (RNAPs). Bacterial enzymes are composed of 4 to 7 subunits and are elaborately controlled by regulatory nucleic acid signals and proteins that induce conformational changes in the transcription complex, as revealed by many recent cryoEM studies (60–62). Notably, most natural and synthetic products that inhibit bacterial RNAPs bind to many different sites and alter protein interfaces or trap transient intermediates rather than block nucleotide addition (63). Compared to simpler RdRps, the SARS-CoV-2 enzyme, with several active sites and many conserved protein interfaces, may be an easier target for diverse small molecules that inhibit subunit or domain interactions or interrupt allosteric signals. Given the outsized importance of coronaviruses to human health, efforts to identify diverse inhibitors of RdRp, beyond nucleotide analogs, should be prioritized. Because host enzymes such as polymerases, kinases, and lyases use nucleotides broadly, therapeutic administration of nucleotide analogs carries a greatly elevated risk of off-target side effects (64), whereas a druggable target unique to the pathogen, such as the SARS-CoV-2 NiRAN domain, bears inherently lower risks of nonspecific interactions. The potential for the allosteric regulation of NTP condensation by SARS-CoV-2 RdRp has recently been highlighted by a computational study that identified several motifs under allosteric control (65). Notably, the NiRAN domain makes extensive contacts with allosteric motif D and fewer contacts with motifs A and B (65), consistent with our finding that binding of nontemplating phosphorylated nucleotides activates RdRp and supporting the potential of the NiRAN domain as a drug target.
MATERIALS AND METHODS
Construction of expression vectors.
Plasmids used in this study are shown in Fig. S1 in the supplemental material. The SARS-CoV-2 nsp7/8/12 genes were codon optimized for expression in E. coli, synthesized by GenScript, and subcloned into standard pET-derived expression vectors under the control of the T7 gene 10 promoter and lac repressor. The derivative plasmids were constructed by standard molecular biology approaches with restriction and modification enzymes from New England Biolabs, taking advantage of the existing or silent restriction sites engineered into the Nsp12 coding sequence. DNA oligonucleotides for vector construction and sequencing were obtained from Millipore Sigma. The sequences of all plasmids, including pET22a-Nsp12, were confirmed by Sanger sequencing at the Genomics Shared Resource Facility (The Ohio State University) and are available upon request.
Protein expression and purification.
Nsp7/8 were overexpressed in E. coli XJB(DE3) cells (Zymo Research; catalog no. T5051). Nsp12 variants were overexpressed in E. coli BL21(DE3) cells (Novagen; catalog no. 69450). Strains were grown in lysogeny broth (LB) with appropriate antibiotics: kanamycin (50 μg/ml), carbenicillin (100 μg/ml), and chloramphenicol (25 μg/ml). All protein purification steps were carried out at 4°C.
For Nsp7/8, cells were cultured at 37°C to an optical density at 600 nm (OD600) of 0.6 to 0.8, and the temperature was lowered to 16°C. Expression was induced with 0.2 mM isopropyl-1-thio-β-d-galactopyranoside (IPTG; GoldBio; catalog no. I2481C25) for 18 h. Induced cells were harvested by centrifugation (6,000 × g), resuspended in lysis buffer A (100 mM HEPES, pH 7.5, 300 mM NaCl, 5% glycerol [vol/vol], 1 mM phenylmethylsulfonyl fluoride [PMSF; ACROS Organics; catalog no. 329-98-6], 5 mM β-mercaptoethanol [β-ME], 10 mM imidazole), and lysed by sonication. The lysate was cleared by centrifugation (10,000 × g). The soluble protein was purified by adsorption to Ni2+-nitrilotriacetic acid (NTA) resin (Cytiva; catalog no. 17531801), washed with Ni-buffer A (20 mM HEPES, pH 7.5, 300 mM NaCl, 5% glycerol, 5 mM β-ME, 50 mM imidazole), and eluted with Ni-buffer B (20 mM HEPES, pH 7.5, 50 mM NaCl, 5% glycerol, 5 mM β-ME, 300 mM imidazole). The eluted protein was further loaded onto a Resource Q ion-exchange column (Cytiva; catalog no. 17117701) in Q buffer A (20 mM HEPES, pH 7.5, 5% glycerol, 5 mM β-ME) and eluted with a gradient of Q buffer B (20 mM HEPES, pH 7.5, 1 M NaCl, 5% glycerol, 5 mM β-ME). The fusion protein was treated with tobacco etch virus (TEV) protease at 4°C overnight, supplemented with 20 mM imidazole, and passed through Ni2+-NTA resin. The untagged, cleaved protein was loaded onto a Superdex 75 10/300 GL column (Cytiva; catalog no. 29148721) in Ni-buffer A. Peak fractions were assessed by SDS-PAGE and Coomassie blue staining. Purified protein was dialyzed into storage buffer A (20 mM HEPES, pH 7.5, 150 mM NaCl, 45% glycerol, 2.5 mM β-ME), aliquoted, and stored at −80°C.
For Nsp12, cells were cultured at 37°C to an OD600 of 0.6 to 0.8, and the temperature was lowered to 16°C. Expression was induced with 0.1 mM IPTG for 18 h. Induced cells were harvested by centrifugation, resuspended in lysis buffer B (100 mM HEPES, pH 7.5, 300 mM KCl, 5% glycerol, 2 mM MgCl2, protease inhibitor cocktail [cOmplete, EDTA-free; Roche Diagnostics; catalog no. 11836170001], 1 mM PMSF, 10 mM imidazole, 5 mM β-ME), and lysed by sonication. The cleared lysate was applied to Ni2+-NTA resin (Cytiva), washed with Ni-buffer C (20 mM HEPES, pH 7.5, 300 mM KCl, 5% glycerol, 2 mM MgCl2, 5 mM β-ME, 0.1 mM PMSF) supplemented with 30 mM imidazole, and eluted with Ni-buffer D (20 mM HEPES, pH 7.5, 50 mM KCl, 5% glycerol, 2 mM MgCl2, 5 mM β-ME, 0.1 mM PMSF, 300 mM imidazole). The eluted protein was further purified by Resource Q (Cytiva) with Q buffer C (20 mM HEPES, pH 7.5, 5% glycerol, 2 mM MgCl2, 1 mM dithiothreitol [DTT]) and Q buffer D (20 mM HEPES, pH 7.5, 1 M KCl, 5% glycerol, 2 mM MgCl2, 1 mM DTT). Then the fusion protein was treated with an appropriate protease (TEV or SUMO protease) at 4°C. After an overnight treatment, protein was supplemented with 20 mM imidazole and passed through Ni2+-NTA resin. The untagged protein was applied to the Superdex 200 increase 10/300 GL column (Cytiva; catalog no. 28990944) in SEC buffer (20 mM HEPES, pH 7.5, 300 mM KCl, 5% glycerol, 2 mM MgCl2, 1 mM DTT). Peak fractions were assessed by SDS-PAGE and Coomassie blue staining. Purified protein was dialyzed into storage buffer B (20 mM HEPES, pH 7.5, 150 mM KCl, 45% glycerol, 1 mM MgCl2, 1 mM DTT), aliquoted, and stored at −80°C.
Expression by slow ribosomes.
To test the effect of slow translation on Nsp12 activity, a derivative of BL21 (IA659) containing a K42T substitution in the ribosomal protein S12 was constructed by P1 transduction from the DEV3 E. coli strain (KL16 lac5 strA2; obtained from Kurt Fredrick, The Ohio State University) and selection on streptomycin (50 mg/liter). This substitution reduces the translation rate ∼2-fold (39). Following sequencing of the rpsL gene to confirm the substitution, the slow BL21 strain was transformed with the plasmid encoding Nsp12R. The protein was purified as described above.
RNA extension assays.
An RNA oligonucleotide (5′-UUUUCAUGCUACGCGUAGUUUUCUACGCG-3′; 4N) with cyanine 5.5 at the 5′ end was obtained from Millipore Sigma (USA). Prior to the reaction, the RNA was annealed in 20 mM HEPES, pH 7.5, 50 mM KCl by heating the mixture to 75°C and then gradually cooling it to 4°C. Reactions were carried out at 37°C with 500 nM Nsp12 variant, 1 μM Nsp7, 1.5 μM Nsp8, 200 nM RNA, and 250 μM NTPs (Cytiva; catalog no. 27202501) in the transcription buffer (20 mM HEPES, pH 7.5, 15 mM KCl, 5% glycerol, 2 mM MgCl2, 1 mM DTT). RNA extension reactions were stopped at the desired times by adding 2× stop buffer (8 M urea, 20 mM EDTA, 1× Tris-borate-EDTA [TBE], 0.2% bromophenol blue). Samples were heated for 2 min at 95°C and separated by electrophoresis in denaturing 9% acrylamide (19:1) gels (7 M urea, 0.5× TBE). The RNA products were visualized and quantified using Typhoon FLA9000 (GE Healthcare) and ImageQuant software. RNA extension assays were carried out in triplicate. Means and standard errors of the means (SEM) were calculated with OriginPro 2021 (OriginLab), and an unpaired two-tailed t test was performed using Excel (Microsoft).
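For readers who wish to reproduce the summary statistics without OriginPro or Excel, the calculation can be sketched with the Python standard library. This is an illustration only: it assumes the equal-variance (Student) form of the unpaired t test, since the variance assumption chosen in Excel is not stated, and the replicate values in the example are hypothetical numbers rather than measurements from this study. The P value is obtained by numerically integrating the t density with Simpson's rule.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple


def sem(values: Sequence[float]) -> float:
    """Standard error of the mean, using the sample standard deviation."""
    return stdev(values) / math.sqrt(len(values))


def student_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, int]:
    """Unpaired t statistic and degrees of freedom (equal variances assumed)."""
    n1, n2 = len(a), len(b)
    pooled_var = ((n1 - 1) * stdev(a) ** 2 + (n2 - 1) * stdev(b) ** 2) / (n1 + n2 - 2)
    t = (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


def two_tailed_p(t: float, df: int, steps: int = 2000) -> float:
    """Two-tailed P value from Simpson integration of the t density (steps must be even)."""
    norm = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x: float) -> float:
        return norm * (1 + x * x / df) ** (-(df + 1) / 2)

    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    central_area = 2 * total * h / 3  # probability mass between -|t| and +|t|
    return max(0.0, 1.0 - central_area)


# Hypothetical triplicate values (e.g., fraction of RNA extended), for illustration only.
condition_a = [1.0, 2.0, 3.0]
condition_b = [2.0, 3.0, 4.0]
t_stat, dof = student_t(condition_a, condition_b)
p_value = two_tailed_p(t_stat, dof)
```

With triplicates in each group, the test has only four degrees of freedom, so the sketch is best read as a way to check reported values rather than as a replacement for a statistics package.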
Electrophoretic mobility shift assays.
RdRp (Nsp12:Nsp7:Nsp8 = 1:2:3; indicated concentrations in Fig. 2D represent the Nsp12 concentration) in 20 mM HEPES, pH 7.5, 65/15 mM KCl, 5% glycerol, 2 mM MgCl2, 1 mM DTT was incubated with 100 nM 4N RNA at 37°C for 5 min. Reactions were then mixed with 10× loading buffer (30% glycerol, 0.2% Orange G) and run on a 3% agarose gel in 1× TBE on ice. The gel was visualized by Typhoon FLA9000.
Activation of Nsp12R.
To test the effect of holo RdRp formation, 5 μM Nsp12R mixed with Nsp7/8 (10 μM/15 μM) in storage buffer B was incubated at 0°C or 37°C for 15 min and then stored at −20°C. RNA extension was performed as described above, and the reaction was stopped at 8 min. To test the effect of NTPs, RdRp was first incubated with NTPs for 10 min at 37°C, and then 200 nM RNA was added to initiate the reaction; the final concentrations of RdRp (500 nM), RNA (200 nM), and NTPs (250 μM) were identical to those used in assays with the simultaneous addition of the RNA scaffold and substrates. The reaction was stopped by adding 2× stop buffer at the indicated times.
Allosteric activation by nucleotides.
A CU RNA hairpin (5′-AAAAGAAAAGACGCGUAGUUUUCUACGCG-3′; CU) labeled with cyanine 5.5 at the 5′ end (Millipore Sigma) was annealed in 20 mM HEPES, pH 7.5, 50 mM KCl by heating the mixture to 75°C and then gradually cooling it to 4°C. RdRp holoenzymes (500 nM wild-type or mutant Nsp12A, 1 μM Nsp7, 1.5 μM Nsp8 [final concentrations]) were mixed with ATP, GTP, remdesivir triphosphate (RTP; MedChemExpress; catalog no. GS443902), or ppGpp (TriLink BioTechnologies; catalog no. N-6001) at concentrations indicated in Fig. 5 in 20 mM HEPES, pH 7.5, 15 mM KCl, 5% glycerol, 1 mM DTT and either 2 or 1 mM MgCl2 (with 1 or 0.5 mM activating nucleotide, respectively). Reaction mixtures were incubated for 5 min at 37°C, and RNA chain extension was initiated by the addition of 200 nM RNA and 100 μM CTP and UTP. After 15 min of incubation at 37°C, reactions were stopped by adding 2× stop buffer (8 M urea, 20 mM EDTA, 1× TBE, 0.2% bromophenol blue).
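The design logic of the two self-priming hairpin scaffolds can be checked with a short script. The sketch below takes the CU and 4N sequences given in this section, locates the self-complementary 3′ end, and derives the single-stranded 5′ template overhang, the sequence that full extension would add, the NTPs that extension requires, and the expected runoff length. The function names and the Watson-Crick-only stem search are illustrative assumptions, not a description of how the scaffolds were designed.

```python
from typing import Set, Tuple

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Scaffold sequences as given in Materials and Methods (5' to 3').
CU_RNA = "AAAAGAAAAGACGCGUAGUUUUCUACGCG"
N4_RNA = "UUUUCAUGCUACGCGUAGUUUUCUACGCG"


def reverse_complement(rna: str) -> str:
    """Watson-Crick reverse complement of an RNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def split_hairpin(rna: str, min_loop: int = 3) -> Tuple[str, int]:
    """Return (5' overhang, stem length) for a hairpin whose 3' end is base paired.

    The longest 3'-terminal segment whose reverse complement occurs upstream,
    leaving at least min_loop unpaired nucleotides, is taken as the stem.
    """
    n = len(rna)
    for k in range((n - min_loop) // 2, 0, -1):
        five_prime_arm = reverse_complement(rna[n - k:])
        start = rna.rfind(five_prime_arm, 0, n - k - min_loop)
        if start != -1:
            return rna[:start], k
    raise ValueError("no self-complementary 3' end found")


def extension_product(rna: str) -> str:
    """Sequence added to the 3' end when the overhang is copied to its 5' end."""
    overhang, _ = split_hairpin(rna)
    return reverse_complement(overhang)


def required_ntps(rna: str) -> Set[str]:
    """Set of nucleotides needed for templated extension to the runoff."""
    return set(extension_product(rna))


def runoff_length(rna: str) -> int:
    """Length of the fully extended (runoff) RNA."""
    return len(rna) + len(extension_product(rna))
```

Applied to the CU scaffold, the templated product contains only C and U, so ATP, GTP, RTP, or ppGpp added to the reaction cannot be incorporated by templated synthesis, whereas the 4N scaffold requires all four NTPs. The computed runoff length for the CU scaffold is 40 nt, matching the runoff product reported in Results.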
Tryptophan fluorescence.
Tryptophan fluorescence spectroscopy was performed using a model F-7000 fluorescence spectrophotometer (Hitachi). The excitation wavelength was set at 280 nm, and the emission spectra were recorded from 310 to 370 nm, with a 5-nm slit width of excitation and emission. The scan speed was 240 nm/min. The temperature was maintained at 37°C by a thermostatic water circulator (NESLAB RTE-7; Thermo Scientific). The samples were prepared in 20 mM HEPES, pH 7.5, 65 mM KCl, 5% glycerol, 2 mM MgCl2, 1 mM DTT. One micromolar Nsp12 and 2 μM Nsp7 were used to record the spectra of Nsp12 and Nsp7, respectively. To record the spectra of Nsp7·12, 1 μM Nsp12 was incubated with 2 μM Nsp7 at 37°C for 15 min. To collect the spectra of denatured proteins, 1 μM Nsp12 was incubated in 8 M urea at room temperature for 1 h. Three independent measurements, each in three technical replicates, were performed. The same results were obtained with proteins purified 3 months apart. Means, SEM, and second derivatives of the emission spectra were calculated by OriginPro 2021.
Conservation analysis.
To assess the relative conservation of coronavirus proteins, 3,309 diverse coronavirus genomes, representing alpha-, beta-, gamma-, and deltacoronavirus genera were downloaded from GenBank in May 2020. High-quality CDSs (containing no more than 32 contiguous codons with ambiguous bases) were translated into five (poly)proteins, conserved across all coronaviruses: orf1ab, S, E, M, and N. Alignments of five open reading frames (ORFs) were produced using the MUSCLE program (66). For each alignment, column homogeneity and the weighted fraction of nongap characters (both ranging from 0 to 1) were calculated as described previously (67). The product of these two values was used as the conservation index (ranging from 0 to 1). For the whole-genome pan-Coronaviridae conservation map, only consensus positions (those with a fraction of gaps below 0.5) were used.
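The conservation index described above is the product of two per-column quantities, and its structure can be shown in a few lines. The sketch below is not the published procedure: the exact homogeneity measure and sequence weighting follow reference 67 and are not reproduced here. As a loudly labeled stand-in, homogeneity is taken as the weighted share of the most common residue among nongap characters, and all sequence weights default to 1. Only the product rule and the gap-fraction threshold of 0.5 come from the text.

```python
from collections import Counter
from typing import List, Optional, Tuple


def conservation_profile(alignment: List[str],
                         weights: Optional[List[float]] = None
                         ) -> List[Tuple[float, float]]:
    """Return (conservation index, gap fraction) for every alignment column.

    ASSUMPTION: homogeneity is approximated by the weighted share of the most
    common residue; the published measure (reference 67) may differ.
    """
    if weights is None:
        weights = [1.0] * len(alignment)
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    total_weight = sum(weights)
    profile = []
    for col in range(length):
        residue_weight: Counter = Counter()
        for seq, w in zip(alignment, weights):
            if seq[col] != "-":
                residue_weight[seq[col]] += w
        nongap_weight = sum(residue_weight.values())
        nongap_fraction = nongap_weight / total_weight
        homogeneity = (max(residue_weight.values()) / nongap_weight
                       if nongap_weight else 0.0)
        # Index = homogeneity x weighted fraction of nongap characters (0 to 1).
        profile.append((homogeneity * nongap_fraction, 1.0 - nongap_fraction))
    return profile


def consensus_columns(profile: List[Tuple[float, float]],
                      max_gap_fraction: float = 0.5) -> List[int]:
    """0-based indices of columns whose gap fraction is below the threshold."""
    return [i for i, (_, gaps) in enumerate(profile) if gaps < max_gap_fraction]


# Tiny made-up alignment used only to exercise the functions.
toy_alignment = ["ACD-", "ACE-", "AC-G", "A-EG"]
toy_profile = conservation_profile(toy_alignment)
toy_consensus = consensus_columns(toy_profile)
```

In the toy alignment, the last column is excluded from the consensus set because exactly half of its characters are gaps, which does not satisfy the "below 0.5" criterion.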
EDC modification and mass spectrometry.
Approximately 0.5 mg/ml of Nsp12 in 20 mM HEPES, pH 7.5, 50 mM KCl, 2 mM MgCl2, 1 mM DTT was mixed with freshly prepared EDC [N-(3-dimethylaminopropyl)-N′-ethyl carbodiimide hydrochloride; Sigma, catalog no. 03449]. EDC was added to a final concentration of 2 mM, and the reaction was performed at room temperature for 30 min. The reaction was quenched with a 50× molar excess of Tris-HCl (pH 7.5) for 5 min. Cross-linked protein samples were separated using SDS-PAGE, the protein bands were stained with GelCode Blue, and tryptic peptides were generated using an in-gel tryptic digestion kit (Thermo Scientific; catalog no. 89871); peptides were purified using Pierce 10-μl C18 tips (Thermo Fisher; catalog no. PI87782). Peptides were analyzed in the Orbitrap Fusion Lumos mass spectrometer (Thermo Scientific) coupled to an EASY-nLC (Thermo Scientific) liquid chromatography system, with a 2-μm, 500-mm EASY-Spray column. The peptides were eluted over a 180-min linear gradient from 96% buffer A (water) to 40% buffer B (acetonitrile) and then continued to 98% buffer B over 20 min with a flow rate of 200 nl/min. Each full mass spectrometry (MS) scan (R = 60,000) was followed by 20 data-dependent MS2 (R = 15,000) with high-energy collisional dissociation (HCD) and an isolation window of 2.0 m/z. The normalized collision energy was set to 35. Precursors of charge states 2 to 6 and 4 to 6 were collected for MS2 scans; monoisotopic precursor selection was enabled, and a dynamic exclusion window was set to 30.0 s. The resulting raw files were searched in enumerative mode with pFind3 (68) in open search mode against the Nsp12 sequence; the inferred modifications over a 1% cutoff were used as “variable” modifications in the subsequent pLink2 search. 
The same files were then searched in cross-link discovery mode using pLink2 (69) against the Nsp12 sequence, using [EDC] as the cross-linking reagent, trypsin as the enzyme generating the peptides, and variable modifications set as inferred by pFind3.
Data availability.
Mass spectrometry data sets have been deposited into MassIVE (accession no. MSV000086827, available for download at ftp://massive.ucsd.edu/MSV000086827/), and processed data are presented in Data Set S2.