Tamar Schlick, Qiyao Zhu, Abhishek Dey, Swati Jain, Shuting Yan, Alain Laederach.
Abstract
The SARS-CoV-2 frameshifting RNA element (FSE) is an excellent target for therapeutic intervention against Covid-19. This small gene element employs a shifting mechanism to pause and backtrack the ribosome during translation between Open Reading Frames 1a and 1b, which code for viral polyproteins. Any interference with this process has a profound effect on viral replication and propagation. Pinpointing the structures adopted by the FSE and associated structural transformations involved in frameshifting has been a challenge. Using our graph-theory-based modeling tools for representing RNA secondary structures, "RAG" (RNA-As-Graphs), and chemical structure probing experiments, we show that the 3-stem H-type pseudoknot (3_6 dual graph), long assumed to be the dominant structure, has a viable alternative, an HL-type 3-stem pseudoknot (3_3) for longer constructs. In addition, an unknotted 3-way junction RNA (3_5) emerges as a minor conformation. These three conformations share Stems 1 and 3, while the different Stem 2 may be involved in a conformational switch and possibly associations with the ribosome during translation. For full-length genomes, a stem-loop motif (2_2) may compete with these forms. These structural and mechanistic insights advance our understanding of the SARS-CoV-2 frameshifting process and concomitant virus life cycle, and point to three avenues of therapeutic intervention.
Year: 2021 PMID: 34283611 PMCID: PMC8315264 DOI: 10.1021/jacs.1c03003
Source DB: PubMed Journal: J Am Chem Soc ISSN: 0002-7863 Impact factor: 15.419
Figure 1. FSE sequence and three relevant 2D structures for the SARS-CoV-2 84 nt frameshifting element (residues 13462–13545) emerging from this work that combines 2D structure prediction, SHAPE structural probing, and thermodynamic ensemble modeling. The −1 frameshifting alters the transcript UUU-UU*A(Leu)-AAC(Asn)-GGG at the second codon (asterisk) to backtrack by one nucleotide and start as AAA-CGG(Arg) instead, so that translation resumes at CGG. At the top is the FSE sequence, with the attenuator hairpin region and the 7 nt slippery site highlighted and A13533 labeled (C in SARS). The ORF1a end and ORF1b start codons for the overlapping regions are marked. For each 2D structure, H-type 3_6 pseudoknot, HL-type 3_3 pseudoknot, and three-way junction 3_5 (unknotted RNA), corresponding dual graphs, 2D structures, and corresponding arc plots are shown, with color coded stems and loops labeled.
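The −1 frameshift described in this caption can be illustrated with a few lines of Python. This is a minimal sketch, not code from the paper: the function names `read_codons` and `minus_one_frameshift` are hypothetical, and the 12 nt string is simply the UUU-UUA-AAC-GGG stretch quoted in the caption.

```python
def read_codons(rna, start=0):
    """Split an RNA string into complete codons, reading from index `start`."""
    return [rna[i:i + 3] for i in range(start, len(rna) - 2, 3)]


def minus_one_frameshift(rna, codons_before_slip):
    """Read `codons_before_slip` codons in frame 0, step back 1 nt, keep reading."""
    slip_point = 3 * codons_before_slip
    upstream = read_codons(rna[:slip_point])
    # Backtracking by one nucleotide re-reads the last base of the previous codon.
    downstream = read_codons(rna, start=slip_point - 1)
    return upstream + downstream


transcript = "UUUUUAAACGGG"  # UUU-UUA-AAC-GGG, as quoted in the caption
in_frame = read_codons(transcript)
shifted = minus_one_frameshift(transcript, codons_before_slip=2)
print(in_frame)
print(shifted)
```

Without the shift the reading frame gives UUU, UUA, AAC, GGG; with the shift after the second codon, the final A of UUA is read again and the frame continues as AAA, CGG, matching the caption.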
Figure 2. Multiple sequence alignments (MSA) of coronavirus frameshifting elements found by the Infernal covariance model[42] shown for 16 top-scored sequences among 182 unique homologues. Arrows at top and bottom illustrate the FSE expansion from 77 to 84 nt (+7 slippery site nt), 144 nt (+30 nt on both ends), 156 nt (+12 upstream nt), and 222 nt (+66 upstream nt). Sixteen top scored coronaviruses are aligned with the SARS-CoV-2 222 nt FSE region (insertions are hidden), with sequence similarities shown for both the whole genome and the FSE region. Nucleotides are colored based on sequence conservation. The consensus sequence is written below with a sequence logo (at each position, the overall stack height indicates sequence conservation level, and the height of an individual letter within indicates the relative frequency of that nucleotide). Stems are marked based on our analysis here: black for Stems 1 and 3, red/green/purple for Stem 2 of 3_6/3_3/3_5, consistent with Figure , and gray for Alternative Stem 1 (AS1) and upstream stems. The covarying base pairs detected by R-scape[43] are colored by nucleotide identity: green A, blue U, orange C, and red G.
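The stack and letter heights of a sequence logo can be computed per alignment column. The sketch below assumes the common information-content convention (stack height = log2(4) minus the Shannon entropy of the column, letter height = frequency times stack height) and omits any small-sample correction. The caption does not say which logo tool or convention was used, so this is an illustration only, and `logo_heights` is a hypothetical helper name.

```python
import math
from collections import Counter

ALPHABET = "ACGU"


def logo_heights(column):
    """Return (stack height in bits, {nucleotide: letter height in bits})."""
    residues = [c for c in column.upper() if c in ALPHABET]  # gaps are ignored
    total = len(residues)
    if total == 0:
        return 0.0, {}
    counts = Counter(residues)
    freqs = {nt: counts[nt] / total for nt in ALPHABET if counts[nt]}
    entropy = -sum(f * math.log2(f) for f in freqs.values())
    stack = math.log2(len(ALPHABET)) - entropy  # conservation of the column
    return stack, {nt: f * stack for nt, f in freqs.items()}


conserved_stack, conserved_letters = logo_heights("AAAA")
mixed_stack, mixed_letters = logo_heights("AAGG")
random_stack, _ = logo_heights("ACGU")
```

A fully conserved column reaches the 2-bit maximum, a column split evenly between two nucleotides gives a 1-bit stack with two letters of 0.5 bits each, and a column with all four nucleotides at equal frequency has zero height.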
Figure 3. Predicted optimal structures for the frameshifting element using PKNOTS, NUPACK, IPknot, and ProbKnot (see text). For each program, 4 different sequence lengths are used: 77, 84, 144, and 156 nt. The common 77 nt subsequence is aligned, the slippery site is colored orange, and the attenuator hairpin AH is magenta. The predicted structures are shown as arc plots, with Stems 1 and 3 in black, and Stem 2 of 3_6, 3_3, and 3_5 in red, green, and purple, respectively. An upstream hairpin that blocks 3_3 Stem 2 and Alternative Stem 1[28–31] are labeled UH and AS1, respectively. Corresponding dual graphs 3_6 (red), 3_3 (green), 3_5 (purple), and 2_1 (black) are highlighted as graphs or subgraphs of larger motifs.
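In an arc plot, a pseudoknot shows up as arcs that cross, whereas an unknotted fold has only nested or side-by-side arcs. The following sketch tests for crossing base pairs in a dot-bracket string that uses `()` and `[]`. The two example strings are toy structures chosen for illustration, not the FSE folds, and the function names are hypothetical.

```python
def parse_pairs(dot_bracket):
    """Return sorted 0-based base pairs (i, j) from a string using () and []."""
    openers = {"(": ")", "[": "]"}
    closers = {close: open_ for open_, close in openers.items()}
    stacks = {open_: [] for open_ in openers}
    pairs = []
    for j, ch in enumerate(dot_bracket):
        if ch in openers:
            stacks[ch].append(j)
        elif ch in closers:
            pairs.append((stacks[closers[ch]].pop(), j))
    return sorted(pairs)


def has_pseudoknot(pairs):
    """True if any two base pairs cross, i.e. i < k < j < l."""
    return any(i < k < j < l for i, j in pairs for k, l in pairs)


toy_knotted = "(((..[[[..)))..]]]"    # second stem pairs across the first
toy_unknotted = "(((...)))..((...))"  # two independent hairpins
knotted = has_pseudoknot(parse_pairs(toy_knotted))
unknotted = has_pseudoknot(parse_pairs(toy_unknotted))
```

The same crossing test separates the knotted motifs in the caption (3_6, 3_3) from the unknotted ones (3_5, 2_1), although assigning the specific dual graph label requires the full RAG procedure, which is not reproduced here.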
Figure 4. SHAPE reactivity analysis for SARS-CoV-2 frameshifting element for 77 and 144 nt (Replicate 1). (A) The SHAPE reactivity for the 77 nt construct is plotted by bars, with red/yellow/black representing high/medium/low reactivity. The arc plot at top shows the dominant 3_6 pseudoknot predicted by the ShapeKnots energy landscape (98% of conformational space), and at bottom is the minor 3_5 (2% of landscape). Stems are labeled, and the Gibbs free energy (kcal/mol) and Boltzmann distribution probabilities are given. (B) SHAPE reactivity and ShapeKnots predictions for 144 nt construct. (C) Reactivity differences between the two constructs are shown for two enlarged key regions highlighted in A,B, with positive/negative differences indicating less flexibility in the 144/77 nt construct. Base pairs in the 144 nt 3_3 conformation are plotted by arcs at top, and 77 nt 3_6 or 3_5 at bottom. Critical residues for reactivity comparisons are highlighted.
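Comparing reactivities between constructs requires putting their residues on a common numbering. The sketch below uses the construct start residues given in the table footnote of this record (13469, 13459, and 13432 for the 77, 87, and 144 nt constructs). It assumes the difference in panel C is taken as 77 nt minus 144 nt, which is the reading consistent with "positive means less flexibility in the 144 nt construct". The function names and the reactivity values in the usage lines are made up for illustration.

```python
# Genome residue number of position 1 in each construct (from the table footnote).
CONSTRUCT_START = {77: 13469, 87: 13459, 144: 13432}


def convert_position(pos, from_len, to_len):
    """Map a 1-based position in one construct onto another construct."""
    return pos + CONSTRUCT_START[from_len] - CONSTRUCT_START[to_len]


def reactivity_difference(shorter, longer, short_len=77, long_len=144):
    """Per-residue (shorter - longer) reactivity, keyed by short-construct position.

    Positive values mean the residue is less reactive in the longer construct.
    """
    diffs = {}
    for pos, value in shorter.items():
        partner = convert_position(pos, short_len, long_len)
        if partner in longer:
            diffs[pos] = value - longer[partner]
    return diffs


# Toy reactivities (hypothetical numbers, not measured data).
toy_77 = {3: 0.9, 4: 0.1}
toy_144 = {40: 0.2, 41: 0.4}
toy_diff = reactivity_difference(toy_77, toy_144)
```

The 37 nt offset between the 77 and 144 nt constructs is the same shift seen in the mutant annotations, where G3U in the 77 nt numbering corresponds to G40U in the 144 nt numbering.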
Summary of ShapeKnots Prediction Results for Wildtype Frameshifting Element and Mutants Developed and Tested in This Work
| construct | replicate | 3_6 prob. (energy) | 3_5 prob. (energy) | 3_3 prob. (energy) |
|---|---|---|---|---|
| 77 nt wildtype | 1 | 97.67% (−55.9) | 2.33% (−53.6) | none |
| | 2 | 97.26% (−60.0) | 2.74% (−57.8) | none |
| 87 nt wildtype | 1 | 0.13% (−60.1) | none | 99.87% (−64.2) |
| | 2 | 0.06% (−57.8) | none | 99.94% (−62.4) |
| 144 nt wildtype | 1 | 4.38% (−80.9) | none | 95.62% (−82.8) |
| | 2 | 2.26% (−74.2) | none | 97.74% (−76.2) |
| 77 nt 3_6 PSM [G3U, U4A, G18A, C19A, C68A, A69C] | 1 | 100% (−63.1) | none | none |
| | 2 | 100% (−51.6) | none | none |
| 144 nt 3_6 PSM [G40U, U41A, G55A, C56A, C105A, A106C, C137A] | 1 | 100% (−101.0) | none | none |
| | 2 | 100% (−106.2) | none | none |
| 77 nt 3_3 PSM [U4C, G71A, G72U] | 1 | none | none | 100% (−59.7) |
| | 2 | none | none | 100% (−61.0) |
| 77 nt 3_5 mutant [G72C, U74C] | 1 | none | 100% (−66.8) | none |
| | 2 | none | 100% (−69.6) | none |
For each construct, the probability and free energy (kcal/mol) predicted for the 3_6 pseudoknot, 3_5 junction, and 3_3 pseudoknot are shown. The mutations are annotated by their positions in the respective constructs; the 77 nt construct covers residues 13469–13545, 87 nt covers 13459–13545, and 144 nt covers 13432–13575. PSM: Pseudoknot-strengthening mutant; see next section.
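The probabilities and free energies in the table are linked by the Boltzmann distribution. As a rough check, the sketch below treats the two conformations listed in the first data row as the whole ensemble and assumes a temperature of 37 °C (310.15 K). Neither assumption is stated in this record, and `boltzmann_probabilities` is a hypothetical helper rather than a ShapeKnots function.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def boltzmann_probabilities(energies, temperature_k=310.15):
    """Map {name: free energy in kcal/mol} to {name: Boltzmann probability}."""
    rt = R_KCAL * temperature_k
    lowest = min(energies.values())  # shift energies for numerical stability
    weights = {name: math.exp(-(g - lowest) / rt) for name, g in energies.items()}
    partition = sum(weights.values())
    return {name: w / partition for name, w in weights.items()}


# Free energies from the first data row (replicate 1: 3_6 and 3_5 conformations).
probs = boltzmann_probabilities({"3_6": -55.9, "3_5": -53.6})
print({name: round(100 * p, 2) for name, p in probs.items()})
```

Under these assumptions the 2.3 kcal/mol gap yields roughly 97.7% for 3_6 and 2.3% for 3_5, in line with the tabulated values. Small deviations are expected because the energies are rounded to one decimal.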
Figure 5. Design and SHAPE analysis for (A) 77 and 144 nt 3_6 pseudoknot-strengthening mutant (PSM), (B) 77 nt 3_3 PSM, and (C) 77 nt 3_5 Mutant. For each mutant, we show the design flow, where we use RAG-IF and multiple 2D structure prediction program screening to determine mutations that stabilize the 3_6 pseudoknot, 3_3 pseudoknot, or 3_5 junction. Mutations are highlighted in blue. The SHAPE reactivity bar plots, and arc plots of the structure predicted by ShapeKnots, with alternative Stem 2 positions are shown. See Figure S8 for reactivity differences between the mutants and the wildtype for two boxed key regions.
Figure 6. Conformational landscape of the frameshifting element for different sequence lengths predicted by ShapeKnots using reactivities from the 144 nt construct. For each length, probabilities of all structures containing independently folded 3_6 or 3_3 pseudoknots are individually summed. The compositions are colored red (3_6) and green (3_3), respectively. (Top) Landscape for adding upstream nucleotides only to the 77 nt FSE (asymmetric expansion). The optimal sequence length of 87 nt for the 3_3 pseudoknot is in dashed black. (Bottom) Landscape for adding both upstream and downstream nucleotides to the 77 nt FSE (symmetric approach). At 90 nt (87 + 3 downstream nt) the landscape is almost all 3_3.
FSE Structure Prediction in the Literature, Ordered by Date of First Archived Version of the Paper
| reference | 2D (computational modeling) | 3D (computational modeling) | technique | length | major (structure dual graph) | minor (structure dual graph) | main findings |
|---|---|---|---|---|---|---|---|
| Kelly et al. | NA | NA | small-angle X-ray scattering | 85 nt | Pseudoknot 3_6 | NA | same conformation as SARS-CoV |
| Rangan et al. | homology model | NA | NA | 88 nt | Pseudoknot 3_6 | NA | |
| Andrews et al. | ScanFold (RNAfold) | NA | NA | 123 nt | Unknotted 2_1 | NA | only S1 and S3 predicted, but 3_6 S2 regions available for pairing; AS1 predicted. Four covarying base pairs in S1, two in 3_6 S2, and one in AS1 detected |
| Omar et al. | literature SARS-CoV-1 | SimRNA, FARFAR2, RNAComposer, RNAvista, RNA2D3D, Vfold; All-atom MD | NA | 68 nt | Pseudoknot 3_6 | NA | possible conformations include 5′ or 3′ end threading, and nonthreading structures |
| Manfredonia et al. | ShapeKnots | SimRNA | genome-wide SHAPE (NAI) in vivo and in vitro, DMS in vitro | 88 nt | Unknotted 2_2 (2D), Pseudoknot 3_6 (3D) | NA | 2D in vivo SHAPE probing predicts 2_2 (3_6 S2 and S3) for the 88 nt segment, but 3D SimRNA built from this 2_2 generates a 3_6 pseudoknot |
| Sanders et al. | SuperFold | NA | genome-wide SHAPE (1M7) in vivo | 123 nt | Unknotted 2_2 | NA | the 2_2 contains 3_6 S2 and S3, and AS1 |
| Lan et al. | Fold, ShapeKnots, DREEM | NA | genome-wide DMS in vivo and in vitro | 85 nt, 283 nt | Pseudoknot 3_6 (85 nt in vitro), Unknotted 2_2 (283 nt in vivo) | NA | for 283 nt in vivo, only S3, many residues base pair with up/downstream nucleotides, have an extended AS1 that excludes 3_6 S1 and S2 |
| Sun et al. | partition, MaxExpect | NA | genome-wide SHAPE (NAI) in vivo and in vitro | 5000 nt | Unknotted 2_2 (in vivo) | NA | the 2_2 contains 3_6 S2 and S3, and AS1 |
| Huston et al. | ShapeKnots, SuperFold | NA | genome-wide SHAPE (NAI) in vivo | 126 nt | Pseudoknot 3_8 | Pseudoknot 3_6 | original S3 replaced by a downstream stem, and 3_6 S2 base pairs formed by loop regions of S1 and this new stem; have AS1 upstream |
| Ziv et al. | NA | NA | cross-linking of matched RNAs and deep sequencing | 1475 nt | Pseudoknot 3_6 | NA | long-range RNA–RNA interactions around FSE |
| Zhang et al. | ShapeKnots, Fold | autoDRRAFTER | Cryo-EM; SHAPE (1M7), DMS, M2-seq | 88 nt, 198 nt | Pseudoknot 3_6 (88 nt), Unknotted 2_2 (198 nt) | Pseudoknot 3_3, 3-way junction 3_5 (88 nt) | 198 nt contains 2_2 (3_6 S2 and S3) and AS1; 88 nt 3_3 predicted by ShapeKnots using DMS, 3_5 by Fold using SHAPE and DMS |
| Schlick et al. | NUPACK, PKNOTS | RNAComposer, SimRNA, iFoldRNA, Vfold3D; All-atom MD | Graph theory based structure transforming mutation (RAG-IF) | 77 nt, 84 nt | Pseudoknot 3_6 | NA | FSE structure is highly fragile to mutations. Double mutants transform 3_6 to 3_5, 3_2, 2_1, and 3_3 |
| Trinity et al. | Iterative HFold | NA | NA | 68 nt | Pseudoknot 3_6 | Pseudoknot 3_8, 3_3 | 3_8: S2, S3 of 3_6, and a pseudoknot by 5′ end and S3 loop. 3_3: S1, S3, and a pseudoknot by S3 loop and 3′ end |
| Bhatt et al. | NA | NA | Cryo-EM | 118 nt | Pseudoknot 3_6 | NA | Pseudoknot start shifted 2 nt relative to literature prediction. Loop 3 shifted and expanded |
| Wacker et al. | pKiss | NA | NMR, DMS footprinting | 68 nt | Pseudoknot 3_6 | NA | homodimerization with Mg²⁺; prediction with DMS consistent with NMR structure |
| Iserman et al. | SuperFold | NA | SHAPE (5NIA) | 1000 nt | Unknotted 2_2 | NA | only S3, extended AS1 |
| Ahmed et al. | RNAfold | RNAComposer | NA | 81 nt | Unknotted 2_1 | NA | only S1 and S3 |
| Morandi et al. | DRACO | NA | genome-wide DMS in vitro | 174 nt | not independently folded, unknotted | Unknotted 2_2 | the 2_2 contains 3_6 S2 and S3; both conformations contain AS1 |
| Schlick et al. | ShapeKnots; PKNOTS, NUPACK, IPknot, ProbKnot | RNAComposer, SimRNA, iFoldRNA, Vfold3D | RAG-IF; SHAPE (5NIA) | 77, 87, 144, 156, 222 nt | Pseudoknot 3_6 (77 nt), 3_3 (87, 144 nt), Unknotted 2_2 (156, 222 nt) | 3-way junction 3_5 (77 nt), Pseudoknot 3_6 (87, 144, 222 nt) | 3_6 pseudoknot is dominant at 77 nt with a minor 3_5 junction, and 3_3 is dominant at 87, 144 nt with a minor 3_6. For 156 and 222 nt, stem-loop 2_2 is predominant |
Figure 7. Effects of 5 mutations tested for frameshifting efficiency by Bhatt et al.[36] on 2D structure predictions of the 77 nt and 87 nt frameshifting element. (A) (Left) The 77 nt FSE 3_6 pseudoknot with mutation regions labeled in blue. Two weak base pairs for Stem 2 are indicated using dotted lines. (Right) A table showing 2D prediction results for the wildtype and the mutants. The upper half is for 4 programs: 3_6, 3_5, and 3_3 predictions are in red, purple, and green, respectively, with their corresponding Stem 2 lengths (in bold if structure change). The lower half is for ShapeKnots, showing probabilities of 3_6, 3_5, and 3_3. (B) (Left) The 87 nt FSE 2D structure by ShapeKnots, a 4_21 structure made of the 3_3 pseudoknot and a flanking Stem SF, with mutation regions in blue. (Right) A table showing prediction results using 4 programs and ShapeKnots.
Figure 8. Three-dimensional models of the 87 nt 3_6 and 3_3 pseudoknot. Initial 3D structures are predicted by RNAComposer,[63] Vfold3D,[64] SimRNA,[65] and iFoldRNA,[66] and subjected to 1–1.5 μs MD using GROMACS.[67] The last 500 ns are used for clustering analysis, and the most populated cluster center by RNAComposer (red)/iFoldRNA (green) is shown here for each system. The 88 nt cryo-EM 3_6 structure derived by Zhang et al. (blue)[29] (PDB: 6XRZ) is aligned using Rclick[68] for comparison. The three shaded stems of the cryo-EM structure align well with our 3_6 model.
Figure 9. Three avenues for frameshifting interference, with cartoon models for the tertiary systems as modeled by molecular dynamics[62] (created with BioRender.com).
Primers Used for the SuperScript II Error Prone Reverse Transcriptase PCR and Library Generation
| primer | sequence |
|---|---|
| 3′ Cassette-RT | GAACCGGACCGAAGCCCG |
| 5′ Cassette-Fwd | CCCTACACGACGCTCTTCCGATCTNNNNNGCCTTCGGGCCAA |
| 3′ Cassette-Rev | GACTGGAGTTCAGACGTGTGCTCTTCCGATCTNNNNNGAACCGGACCGAAGCCCG |
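The primer layout can be checked programmatically. The sketch below takes the 3′ Cassette-Rev entry as a single continuous sequence, splits each primer at its run of N (the IUPAC code for any base), and confirms that the part after the N run equals the 3′ Cassette-RT primer. The function name `split_primer` and the neutral labels "5′ part" and "3′ part" are choices made here, not terminology from the paper.

```python
import re

PRIMERS = {
    "3′ Cassette-RT": "GAACCGGACCGAAGCCCG",
    "5′ Cassette-Fwd": "CCCTACACGACGCTCTTCCGATCTNNNNNGCCTTCGGGCCAA",
    "3′ Cassette-Rev": "GACTGGAGTTCAGACGTGTGCTCTTCCGATCTNNNNNGAACCGGACCGAAGCCCG",
}


def split_primer(seq):
    """Split a primer into (5′ part, N run, 3′ part) at its single run of N."""
    match = re.fullmatch(r"([ACGT]*)(N+)([ACGT]*)", seq)
    if match is None:
        return "", "", seq  # no degenerate run: the whole primer is the 3′ part
    return match.groups()


rev_five, rev_n, rev_three = split_primer(PRIMERS["3′ Cassette-Rev"])
fwd_five, fwd_n, fwd_three = split_primer(PRIMERS["5′ Cassette-Fwd"])
n_variants = 4 ** len(rev_n)  # number of distinct sequences the N run can take
```

Both library primers carry a five-base N run, so each can take 4^5 = 1024 distinct sequences at that position, and the forward and reverse primers share the same 11 nt immediately upstream of the N run (CTCTTCCGATC followed by T).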