Conner J Langeberg1, Madeline E Sherlock1, Andrea MacFadden1, Jeffrey S Kieft1,2. 1. Department of Biochemistry and Molecular Genetics, University of Colorado Denver School of Medicine, Aurora, Colorado 80045, USA. 2. RNA BioScience Initiative, University of Colorado Denver School of Medicine, Aurora, Colorado 80045, USA.
Abstract
Structured RNA elements are common in the genomes of RNA viruses, often playing critical roles during viral infection. Some viral RNA elements use forms of tRNA mimicry, but the diverse ways this mimicry can be achieved are poorly understood. Histidine-accepting tRNA-like structures (TLSHis) are examples found at the 3' termini of some positive-sense single-stranded RNA (+ssRNA) viruses where they interact with several host proteins, induce histidylation of the RNA genome, and facilitate processes important for infection, including genome replication. As only five TLSHis examples had been reported, we explored the possible larger phylogenetic distribution and diversity of this TLS class using bioinformatic approaches. We identified many new examples of TLSHis, yielding a rigorous consensus sequence and secondary structure model that we validated by chemical probing of representative TLSHis RNAs. We confirmed new examples as authentic TLSHis by demonstrating their ability to be histidylated in vitro, then used mutational analyses to infer a tertiary interaction that is likely analogous to the D- and T-loop interaction found in canonical tRNAs. These results expand our understanding of how diverse RNA sequences achieve tRNA-like structure and function in the context of viral RNA genomes and lay the groundwork for high-resolution structural studies of tRNA mimicry by histidine-accepting TLSs.
Viruses are obligate cellular parasites that must subvert and coopt host cellular machinery to proliferate. In single-stranded RNA (ssRNA) viruses, structured regions within the viral genomic RNA can directly manipulate cellular machinery, a ubiquitous part of many viruses’ overall infection strategy. Such RNA structures affect pathways and processes as diverse as translation, replication, packaging, viral RNA stability, immune evasion, and others (Pathak et al. 2011; Tuplin 2015; Garcia-Blanco et al. 2016; Hogg 2016; Jaafar and Kieft 2019). RNA structural elements frequently coordinate different processes occurring on the genomic RNA, often using conformational changes (Gamarnik and Andino 1998). Understanding the structural diversity and distribution of different viral RNA elements is essential to define their mechanisms of action and helps us understand roles for RNA in cellular processes by revealing the fundamental rules for RNA structure-driven function. In particular, because ssRNA viruses evolve relatively rapidly, exploring conservation of sequence and structure of an RNA class within different viruses reveals how dissimilar RNA sequences achieve a similar structure and function (Mans et al. 1991; Roth and Breaker 2009; Webb et al. 2009; Perreault et al. 2011; Pisareva et al. 2018; Steckelberg et al. 2018; Jones et al. 2021).
A class of structured RNAs with roles in viral infection are the transfer RNA (tRNA)-like structures (TLSs), found in the 3′ terminal sequences of certain positive-sense single-stranded RNA (+ssRNA) viruses where they mimic the structure of tRNAs to varying degrees (Fig. 1A; Rietveld et al. 1984; Mans et al. 1991; Dreher 2010; Felden et al. 1994a, 1996; Hammond 2009). These TLSs were first identified by their ability to be aminoacylated on their 3′ ends by host cell aminoacyl tRNA synthetases (AARSs) (Pinck et al. 1970; Öberg and Philipson 1972; Kohl and Hall 1974; Salomon et al. 1976; Joshi et al. 1985; Goodwin and Dreher 1998).
Furthermore, they interact with host proteins associated with tRNAs, specifically eukaryotic elongation factor 1A (eEF1A) (Joshi et al. 1986; Dreher et al. 1999; Zeenko et al. 2002; Hwang et al. 2013; Li et al. 2013), and the CCA-adding enzyme (Litvak et al. 1973; Dreher and Goodwin 1998; Hema et al. 2005). However, although TLS RNAs have a terminal CCA that matches the CCA in tRNA and that is aminoacylated, they have secondary structures and sequence conservation that differ dramatically from authentic cellular tRNAs. This is mandated in part by the fact that they are part of the viral genome and are connected to it at their 5′ end; TLSs use a pseudoknot in place of the acceptor stem used in tRNA (Fig. 1A). TLSs also have other secondary structure and sequence differences compared to tRNAs, which may relate to the fact that they can play several roles during infection.
FIGURE 1.
Histidine-accepting tRNA-like structures. (A) Cartoon diagram of a TLS-containing +ssRNA viral RNA genome, with the 3′ TLS indicated with a dashed box. Gray shaded boxes indicate the ORFs in the capped viral genome. (B) Cartoon representations of tRNA and the three classes of TLS. (C) Phylogenetic distribution of tRNA-like structures in several +ssRNA plant virus genera. Tree is based on the concatenated viral methyl transferase, replicative RNA helicase, and RNA dependent RNA polymerase (Mtr-Hel-RdRp) sequence, adapted from King et al. (2012). Asterisks denote the presence of one or more tRNA-like structure classes in each viral genus. (D) Consensus sequence and secondary structural model of the 157 identified unique histidine-accepting tRNA-like structure sequences. Regions are labeled relative to their homology in a canonical tRNA, if present. PK2: pseudoknot 2 region, D*: putative D-loop analog, AC: anticodon arm, T: T-arm, PK1: pseudoknot 1 region, DN: discriminator nucleotide. The location of an A base speculated to substitute for the N−1 G in authentic tRNAHis is indicated.
TLSs can alter the translation efficiency of an upstream open reading frame (ORF), but the mechanisms for this remain unknown (Barends et al. 2004; Matsuda and Dreher 2004; Rudinger-Thirion et al. 2006; Dreher 2010; Chujo et al. 2015; Hartwick et al. 2018). Viral proteins also interact with TLSs, which contain part or all of the viral negative-strand promoter site required for RNA-dependent RNA polymerase (RdRp) binding (Singh and Dreher 1997; Deiman et al. 1998; Osman et al. 2000; Olsthoorn et al. 2004; Yamaji et al. 2006; Rao and Cheng Kao 2015). In addition, in some viruses the TLS is essential for proper packaging of the virion, acting as a nucleation site for capsid assembly (Choi and Rao 2000; Choi et al. 2002).
Thus, TLSs are multifunctional RNAs, a feature likely conferred by their three-dimensional structure, in particular the features that differ from authentic tRNA.
Studying TLSs may give insight into other types of tRNA mimics proposed to exist in viral and cellular RNAs, including the tRNA-miRNA-encoded RNAs (TMERs) in Gammaherpesvirus (Diebel et al. 2015) and those in some 3′ cap-independent translational enhancer (3′-CITE) elements (McCormack et al. 2008; Simon and Miller 2013). Nonviral tRNA mimics include transfer-messenger RNAs (tmRNAs) (Williams and Bartel 1995; Weis et al. 2010) and the MALAT1-derived mascRNA (Wilusz et al. 2008; Sun and Ma 2019; Lu et al. 2020). Notably, while some of these enhance translation, none are known to be aminoacylated and while some remain within the genome, others are processed out of the primary transcript. Thus, tRNA mimicry may be both useful and diverse, motivating efforts to understand how different types of tRNA mimicry are formed.
Three classes of TLSs are known: valine-accepting TLSs (TLSVal), tyrosine-accepting TLSs (TLSTyr), and histidine-accepting TLSs (TLSHis) (Mans et al. 1991; Dreher 2010). Each class is distinct in the identity of the amino acid added to its 3′ end, secondary structure, and presumably higher-order folding (Fig. 1B). All contain a pseudoknotted acceptor stem mimic as well as a 3′ CCA that is maintained on the viral genome by cellular processing machinery (Litvak et al. 1973; Dreher and Goodwin 1998; Osman et al. 2000). The TLSVal class most resembles a canonical tRNA structure with discernible acceptor stem, D-arm, T-arm, and anticodon (AC)-arm elements (Pinck et al. 1970; Fukai et al. 2000; Hammond et al. 2009), confirmed by two X-ray crystal structures of the TYMV TLS (Colussi et al. 2014; Hartwick et al. 2018). Conversely, the TLSTyr class appears to be the most divergent from tRNAs, with several additional stem–loops, no clear T-arm or AC-arm, and thus no obvious way to mimic tRNA (Haenni et al.
1982; Dreher and Hall 1988; Felden et al. 1994a). However, a recent cryo-EM structure of the Brome mosaic virus (BMV) TLSTyr revealed tRNA mimicry is embedded in a more complex fold that may require programmed conformational changes to fully mimic tRNA (Bonilla et al. 2020).
The third class of TLSs, TLSHis, visually appears to lie between the other two in terms of structural similarity to canonical tRNA. The proposed TLSHis secondary structure has putative analogs to the AC-arm, T-arm, and acceptor stem, though it lacks an obvious D-arm analog (Öberg and Philipson 1972; Salomon et al. 1976; Rietveld et al. 1984; Felden et al. 1996). High-resolution structural information on this class has remained elusive, likely due in part to structural heterogeneity as observed in the prototypical TLSHis RNA from the tobacco mosaic virus (TMV) (Hammond et al. 2009). Although structural modeling of the RNA provided insight into the possible TMV TLS three-dimensional fold (Rietveld et al. 1984; Felden et al. 1996), this model remains untested. Furthermore, only five TLSHis sequences have been identified, which makes analysis of this class challenging compared to the TLSVal and TLSTyr classes (Dreher 2010; Sherlock et al. 2021; Bonilla et al. 2020). Hence, the TLSHis class represents a novel form of tRNA mimicry that can provide insight into diverse ways such mimicry can be achieved.
Previous studies identified 108 unique TLSVal sequences (Sherlock et al. 2021) and 512 unique TLSTyr sequences (Bonilla et al. 2020), but only five examples of the TLSHis were known (Dreher 2010). This relative scarcity of sequence and structural information available for the TLSHis class motivated us to better understand how primary sequence, secondary structure, and tertiary contacts achieve the functionally required fold.
Using the few previously reported TLSHis sequences, we performed bioinformatic searches based on primary sequence and secondary structure conservation to identify many additional putative TLSHis sequences. We used chemical probing to query the proposed secondary structures of some new TLSHis and used an in vitro aminoacylation assay to verify that they are functional, histidine-accepting TLSs. Finally, we interrogated a proposed D-loop mimic, implicating this region in a long-range interaction with the T-loop that may be analogous to the D-loop/T-loop interaction present in canonical tRNAs. Together, our findings uncover an expanded phylogenetic diversity of the TLSHis class and provide insight into how the structural conservation of these RNAs correlates with their tRNA mimicry.
RESULTS AND DISCUSSION
Bioinformatic searches reveal additional TLSHis
Identifying new TLSHis promised to reveal conserved regions required for achieving the structure, locations of possible protein interactions, and variations such as the insertions found in the divergent members of the TLSVal and TLSTyr (Bonilla et al. 2020; Sherlock et al. 2021). To identify additional putative TLSHis RNAs, we used the program Infernal to perform homology-based searches (Nawrocki and Eddy 2013). We started with an initial seed alignment from the four TLSHis sequences deposited in the Rfam database (Rfam ID: RF01077) (Kalvari et al. 2018). This database seed alignment was incomplete, so we adjusted it to add the entire 3′ end, including the T-loop and acceptor stem pseudoknot. A search of all +ssRNA virus genomes deposited in the NCBI virus database identified 158 unique sequences from 36 unique viruses with substantial secondary structure conservation (Supplemental Files 1, 2). The difference between the number of unique sequences and unique viruses results from the presence of both genomic and subgenomic RNAs within a single viral species, and some sequence variations between isolates and strains (Adams et al. 2017). Of the 36 viruses containing a putative TLSHis, 33 belong to the Tobamovirus genus, two to the Tymovirus genus, and one to the Furovirus genus (Fig. 1C; Supplemental File 2), which are the genera previously known to contain TLSHis. Although the tobamoviruses and furoviruses are both in the Virgaviridae family and are closely related based on RdRp sequence, the tobamovirus TLSHis are more similar to those from the Tymovirus genus in the Tymoviridae family. The single TLS identified from Furovirus had a high E value of 0.019, making it an unlikely TLSHis candidate; rather, it demonstrates features of the TLSVal class and was excluded from further analyses. Interestingly, the tymoviruses mostly contain TLSVal, and most examples of TLSVal are within Tymoviridae, with a few in Virgaviridae (Sherlock et al. 2021). 
While we cannot propose a specific evolutionary history of these viral lineages, this distribution suggests exchange of TLS elements between viruses during coinfections.
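The bookkeeping behind these counts (unique sequences versus unique viruses, and the exclusion of weak hits) can be sketched in Python. This is a minimal illustration rather than the search pipeline itself: the `Hit` record, the `summarize_hits` helper, the toy sequences, and the E-value cutoff of 0.01 are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    virus: str       # virus name (hypothetical label)
    sequence: str    # matched 3' terminal sequence (toy data)
    e_value: float   # E value reported by the homology search


def summarize_hits(hits, max_e_value=0.01):
    """Keep hits at or below the cutoff, then count unique sequences and viruses.

    The cutoff value is an assumption for illustration only.
    """
    kept = [h for h in hits if h.e_value <= max_e_value]
    unique_sequences = {h.sequence for h in kept}
    unique_viruses = {h.virus for h in kept}
    return len(unique_sequences), len(unique_viruses)


hits = [
    Hit("virus A", "GGGUUCGAAUCCC", 1e-12),
    Hit("virus A", "GGGUUCGAAUCCU", 3e-11),  # isolate variant of virus A
    Hit("virus A", "GGGUUCGAAUCCC", 1e-12),  # same sequence seen twice
    Hit("virus B", "GCGUUCGAAUCGC", 5e-9),
    Hit("virus C", "AAAUUCGAAUAAA", 0.019),  # weak hit, above the cutoff
]

n_sequences, n_viruses = summarize_hits(hits)
print(n_sequences, n_viruses)
```

In this toy set, one virus contributes two distinct sequences, which is why the number of unique sequences can exceed the number of unique viruses.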
TLSHis RNAs adopt a conserved secondary structure
We calculated a consensus sequence and secondary structure model using CaCoFold with the 157 putative unique TLSHis sequences (Fig. 1D; Rivas et al. 2016, 2020). The resultant covariation patterns strongly support the putative secondary structure of the prototypical TMV TLSHis. Specifically, both pseudoknots and analogs for the T-arm and AC-arm were present in our model and in good agreement with the observed base-pairing and covariation. Sequence conservation differs markedly among regions of the secondary structure. Specifically, the sequence conservation is high in the acceptor stem analog PK1 and the T-arm analog (3′ region). The combined length of the T-arm plus PK1 is always 11 bp, 1 bp shorter than in the TLSVal class (Sherlock et al. 2021). Within the T-arm, the T-loop is nearly perfectly conserved in TLSHis sequences, containing the 5′-UUCGAAU-3′ sequence common in tRNA T-loops. In fact, the T-loop of the TLSHis is more conserved than the T-loop of canonical tRNAHis (Westhof and Auffinger 2001). The conservation in this region likely reflects how these TLSs are recognized by proteins, including host HisRS, CCA-adding enzyme, and eEF1A, and viral RdRp (Hegg et al. 1990; Singh and Dreher 1997; Deiman et al. 1998; Osman et al. 2000; Zeenko et al. 2002; Olsthoorn et al. 2004; Yamaji et al. 2006; Hwang et al. 2013; Li et al. 2013). Indeed, key bases for recognition by host HisRS are in this region (Crothers et al. 1972; Hou 1997; Rudinger et al. 1997; Tian et al. 2015) and both the CCA-adding enzyme and eEF1A bind to this part of tRNAs (Nissen et al. 1995; Xiong and Steitz 2004).
In contrast to the 3′ region, the region comprising PK2 and the AC-arm analog (5′ region) exhibits substantial base-pair covariation but little primary sequence conservation; thus, a specific secondary structure is required, largely independent of nucleotide identity.
The exceptions are two isolated motifs that are conserved in sequence: the histidine AC (GUG) and a GG dinucleotide adjacent to PK2. Finally, we did not find any new TLSHis with substantial insertions or deletions as is seen in both the TLSVal and TLSTyr classes (Bonilla et al. 2020; Sherlock et al. 2021). Cumulatively, these patterns suggest a conserved secondary structure present in histidine-accepting TLSs that matches the proposed structure of the archetypal TMV TLSHis and less global variation (insertions or deletions) than in the other TLS classes.
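The distinction drawn above between covarying and sequence-conserved base pairs can be illustrated with a short Python sketch. It is a simplified count of canonical pair types at two alignment columns, not the statistical covariation test performed by CaCoFold; the `pair_summary` helper, the toy alignment, and the column indices are assumptions for illustration.

```python
# Watson-Crick pairs plus G-U wobble pairs
CANONICAL = {"AU", "UA", "GC", "CG", "GU", "UG"}


def pair_summary(alignment, i, j):
    """Return (fraction of canonical pairs, distinct canonical pair types)
    for alignment columns i and j (0-based)."""
    pairs = [seq[i] + seq[j] for seq in alignment]
    canonical = [p for p in pairs if p in CANONICAL]
    return len(canonical) / len(pairs), set(canonical)


# Toy alignment (hypothetical sequences of equal length)
alignment = [
    "GGCAUUCGAAUGCC",
    "GACAUUCGAAUGUC",
    "GCCAUUCGAAUGGC",
    "GUCAUUCGAAUGAC",
]

# Columns 1 and 12 change identity but always remain paired (covariation).
print(pair_summary(alignment, 1, 12))
# Columns 0 and 13 are paired and invariant (sequence conservation).
print(pair_summary(alignment, 0, 13))
```

A pair that stays canonical while its nucleotide identities change supports a structural requirement, whereas an invariant pair by itself cannot distinguish structural from sequence-specific constraints.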
Chemical probing of individual TLSHis structures
While the high degree of covariation present in all base-pairing regions supports a common TLSHis secondary structure, experimental interrogation of representative RNAs is useful to test this and find patterns across the RNAs. We applied selective 2′-hydroxyl acylation analyzed by primer extension (SHAPE) in vitro chemical probing, which queries the conformational flexibility at each nucleotide position in an RNA (Yoon et al. 2011; Kim et al. 2013; Cordero et al. 2014; Kladwang et al. 2014; Lee et al. 2015). Locations in the RNA that are more conformationally dynamic, such as in unpaired bases, react more readily with the SHAPE reagent N-methyl isatoic anhydride (NMIA) than those in interactions that restrict motion, such as in base pairs. We applied this method to the known TLSHis from Tobacco Mosaic Virus (TMV) and putative TLSHis RNAs from Odontoglossum Ringspot Virus (ORSV), Ribgrass Mosaic Virus (RMV), Cucumber Mottle Virus (CMoV), Maracuja Mosaic Virus (MarMV), Hibiscus Latent Fort Pierce Virus (HLFPV), Zucchini Green Mottle Mosaic Virus (ZGMMV), and Diascia Yellow Mottle Virus (DiaYMV). Mapping reactivities onto the secondary structure models, we observed patterns consistent with each proposed secondary structure (Fig. 2; Supplemental Fig. S1). Specifically, low reactivities were observed in regions proposed to be base-paired, namely both pseudoknot regions, the AC stem, and the T-arm stem. Conversely, most regions predicted to lack canonical base-pairing in the model contained elevated levels of reactivity: the AC loop, the linker regions following PK2, the T-loop, and the CCA trinucleotide. Additionally, the internal loop or bulge present in all AC stems was highly reactive, consistent with conformational dynamics in this region. This loop often contains 5 nt but varies in size (Fig. 2; Supplemental Fig. S1). Notably, some base-paired regions exhibited some chemical reactivity, such as part of the AC stem and the 3 bp stem in PK1.
This likely indicates some degree of local conformational dynamics or specific reactive structural features that allow modification by the reagent.
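As a rough guide to how per-nucleotide reactivities of this kind are derived, the following Python sketch subtracts a no-reagent background and scales to reference positions such as those in a normalization hairpin. It is a simplified illustration under stated assumptions, not the analysis procedure used for these data: the `normalize_shape` helper, the clamping of negative values to zero, the choice of reference positions, and the numbers are all hypothetical.

```python
def normalize_shape(plus, minus, reference_positions):
    """Background-subtract and normalize SHAPE signal.

    plus: per-nucleotide signal with reagent
    minus: per-nucleotide signal without reagent (background)
    reference_positions: 0-based indices of reference nucleotides whose
        mean background-subtracted signal is set to 1.0 (an assumption)
    """
    if len(plus) != len(minus):
        raise ValueError("plus and minus traces must have the same length")
    # Clamp negative differences to zero (assumed convention for this sketch)
    subtracted = [max(p - m, 0.0) for p, m in zip(plus, minus)]
    ref_mean = sum(subtracted[k] for k in reference_positions) / len(reference_positions)
    if ref_mean <= 0:
        raise ValueError("reference positions have no signal above background")
    return [s / ref_mean for s in subtracted]


plus = [5.0, 1.0, 5.0, 3.0]    # toy signal with reagent
minus = [1.0, 1.5, 1.0, 1.0]   # toy signal without reagent
reactivity = normalize_shape(plus, minus, reference_positions=[0, 2])
print(reactivity)
```

On this scale, values near zero correspond to constrained (for example, base-paired) nucleotides and values near or above one to flexible nucleotides.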
FIGURE 2.
Histidine-accepting tRNA-like structures adopt a conserved secondary structure. Chemical probing of four representative TLSHis RNAs using the SHAPE reagent NMIA: (A) Tobacco Mosaic virus, (B) Ribgrass Mosaic virus, (C) Hibiscus Latent Fort Pierce virus, and (D) Diascia Yellow Mottle virus. Reactivity was background subtracted and normalized to flanking 5′ and 3′ normalization hairpins (not depicted; see Supplemental File 2 for sequence details). Coloring represents degree of normalized modification according to the inset legend.
Representative putative new TLSHis are histidylated in vitro
While all new putative TLSHis conformed to the consensus secondary structure, the sequence diversity motivated us to qualitatively test their ability to be aminoacylated by histidyl-tRNA synthetase (HisRS). We used in vitro aminoacylation assays with purified HisRS from S. cerevisiae and N. tabacum (Supplemental Fig. S2), and 3H-2,5-L-histidine as substrate. We first tested the specificity of these enzymes with yeast histidine tRNA (tRNAHis) and a yeast leucine tRNA (tRNALeu) as positive and negative controls, respectively. With both HisRS enzymes, the yeast tRNAHis was histidylated to a high level while yeast tRNALeu showed levels similar to a reaction with no RNA (Fig. 3A,B). We then tested both enzymes using the TYMV TLSVal, as previous studies show that the TYMV TLSVal can be histidylated, likely due to a nucleotide within the acceptor stem pseudoknot that mimics the −1 base in a tRNAHis (Dreher and Goodwin 1998). These experiments recapitulate the finding that the TYMV TLSVal can be histidylated to some degree by the S. cerevisiae HisRS but not N. tabacum HisRS (Fig. 3A,B).
FIGURE 3.
Identified putative viral TLSHis sequences are histidylated in vitro. 3H-L-histidine incorporation of eight representative TLSHis RNAs identified through bioinformatic searches. Histidylation of each RNA, as measured by covalent incorporation of 3H-L-histidine by (A) S. cerevisiae histidine tRNA-synthetase and (B) N. tabacum histidine tRNA-synthetase at the 3′ adenosine, is normalized to yeast tRNAHis RNA. The dashed line and shaded region indicate the background of a reaction containing no RNA. Each reaction was performed in triplicate. Error bars represent one standard error from the mean.
Under these conditions, the prototypical TLSHis from TMV was histidylated at levels matching or exceeding tRNAHis with either enzyme (Fig. 3A,B; Felden et al. 1994b). We then tested seven representative newly identified putative TLSHis RNAs, chosen to contain variable features including diverse anticodon sequences, discriminator nucleotide identity, AC stem length, and AC bulge size. All of these putative TLSHis RNAs were histidylated well above the negative controls by both the yeast and tobacco HisRS (Fig. 3A,B). In addition to the endpoint experiments described here using substrate levels of enzyme, we performed an enzyme titration with ORSV demonstrating this histidylation was not an artifact of the high enzyme concentration (Supplemental Fig. S2C). Although we did not test every new putative TLSHis, the fact that all those that were tested were aminoacylated, and the robust conservation within the class, suggests that most are likely functional substrates for HisRS.
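The normalization used to report these endpoint measurements (triplicate values expressed relative to yeast tRNAHis, with one standard error of the mean) can be sketched as follows. This is a minimal illustration with hypothetical counts; the `normalized_incorporation` helper is an assumed name, and for simplicity the sketch does not propagate the uncertainty of the tRNAHis reference itself.

```python
from math import sqrt
from statistics import mean, stdev


def normalized_incorporation(sample_counts, reference_counts):
    """Return (mean, standard error) of sample replicates, each divided by
    the mean of the reference (tRNAHis) replicates."""
    ref = mean(reference_counts)
    values = [c / ref for c in sample_counts]
    sem = stdev(values) / sqrt(len(values))  # sample SD / sqrt(n)
    return mean(values), sem


# Hypothetical triplicate counts of incorporated 3H-L-histidine
trna_his = [1000.0, 1000.0, 1000.0]
tls_sample = [900.0, 1000.0, 1100.0]

avg, sem = normalized_incorporation(tls_sample, trna_his)
print(round(avg, 3), round(sem, 3))
```

A value near 1.0 on this scale indicates incorporation comparable to the tRNAHis control; the no-RNA background would be normalized in the same way for comparison.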
TLSHis aminoacylation bypasses features required in tRNAs
Several important identity nucleotides that facilitate recognition by HisRS appear in TLSHis RNAs (Rudinger et al. 1997; Dreher 2010), but several TLSHis that lack these features were also aminoacylated (Fig. 3A,B). Specifically, two important tRNAHis identity elements are the N−1 and N73 bases at the end of the acceptor stem (Rudinger et al. 1997; Tian et al. 2015). These are nearly invariant across tRNAHis as G−1 and A73. However, in TLSHis the homologous bases are often an A at N−1 and a C at N73 (Fig. 1D), with some variation. Our results suggest the discriminator nucleotide is not always essential for histidylation for TLSHis, as there was no significant decrease in histidine incorporation for DiaYMV, which contains an A at position 73 in place of the C typical of most TLSHis and many tRNAHis; experiments measuring kcat/KM will be useful to fully understand quantitative effects of this nucleotide on aminoacylation levels. Notably, in most members of the Virgaviridae family the discriminator nucleotide analog N73 is a C and in all members of the Tymoviridae family it is an A. It appears that rather than reflecting the identity of the canonical discriminator nucleotide, this base reflects the viral RdRp lineage (Deiman et al. 1998; Osman et al. 2000), which appears to also be true in the TLSVal class (Sherlock et al. 2021). Of similar note is the lack of the −1 G found in histidine tRNAs. During processing of the immature tRNAHis, the enzyme tRNAHis guanylyltransferase (THG1) and homologs covalently attach a guanine base to the 5′ end of the tRNA that is recognized by the host HisRS during aminoacylation. Because TLSs are found at the 3′ end of the viral genome, there is no available 5′ end for THG1 to modify. Previous studies have suggested an A within the acceptor stem pseudoknot may accomplish a similar function (Fig. 1D; labeled as N−1; Rudinger et al.
1994), making TLSHis THG1-independent.
The other important identity element within tRNAHis is the AC loop; in the TLSHis it is most often a GUG as in canonical tRNAHis (Rudinger et al. 1997). However, exceptions to this, namely TMV (GUU), RMV (CGG), and MarMV (AGA) TLSs, were readily aminoacylated (Fig. 3A,B). Of note, the TLSHis AC loop is not as well conserved as in the TLSVal class where the sequence is critical for aminoacylation (Sherlock et al. 2021). Additionally, the TLSHis AC loop is predicted to contain 5 nt with covariation in the stem's terminal base pair, contrasting with the canonical 7 nt tRNAHis AC loop (Rudinger et al. 1997). Finally, in the TLSHis, the AC stem always contains an internal loop or bulge and is significantly longer than the AC stem of the TLSVal class. This difference likely relates to the three-dimensional structure of the TLSHis RNAs, though further studies are needed. Perhaps the flexibility in this region allows the structure to switch between different functional states, as proposed for the TLSTyr (Bonilla et al. 2020); this could explain previous data suggesting conformational heterogeneity in TLSHis samples (Hammond 2009).
Evidence for a conserved D-loop/T-loop-like interaction in the TLSHis
Chemical probing showed that in several TLSHis, the conserved GG dinucleotides present between PK2 and the AC stem had decreased reactivity compared to adjacent nucleotides and other unpaired regions, despite no obvious base-pairing partners (Fig. 2). Similarly, the T-loop region displayed decreased levels of SHAPE reactivity when compared with other loop and linker regions. Within the T-loop, the pattern is similar to what is observed in canonical tRNAs; the second U was consistently more reactive than the rest of the loop (Kladwang et al. 2011). In canonical tRNAs this elevated reactivity is due to the unique local backbone geometry (Kladwang et al. 2011; Tian et al. 2015), which facilitates an interaction with the D-loop, forming the tRNA “elbow” (Levitt 1969; Rould et al. 1989; Tian et al. 2015). The similar reactivity pattern in TLSHis suggests a similar structure in the TLSHis T-loop, which could then interact with the conserved GG dinucleotide, as previously proposed but not directly tested (Felden et al. 1996).
To test if the TLSHis class contains an interaction between the conserved GG dinucleotide and the T-loop, we assessed the functional and structural effects of mutating these elements. First, we individually introduced a G → A mutation in the GG dinucleotide proposed to interact with the T-loop, or a C → U mutation in the T-loop, in three representative RNAs: TMV, RMV, and DiaYMV TLSHis (Fig. 4A,B). Analogous mutations in tRNAs, G19A and C56U, disrupt the D-loop/T-loop interaction, resulting in decreased aminoacylation (Du and Wang 2003). When the G → A and C → U mutants were tested in several TLSHis, all exhibited decreased levels of aminoacylation relative to WT (Fig. 4C,D). Although a double mutant of the TMV TLS containing both the G13A and C77U mutations might be expected to restore activity, in fact similar compensatory mutations do not restore activity in canonical tRNAs (Du and Wang 2003).
This is not surprising given the structural context of the T-loop and the nature of the long-range interactions. Consistent with this, the G → A + C → U double mutants also did not restore aminoacylation in the TMV TLSHis (Supplemental Fig. S3A,B). Overall, the near total loss of histidine incorporation resulting from these mutations is similar to what has been observed for canonical tRNAs when the T-loop/D-loop interaction is disrupted (Du and Wang 2003).
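For readers designing comparable single and double mutants, the following Python sketch applies point mutations written in the notation used above (for example, G13A) to an RNA sequence and checks that the stated wild-type base is actually present at that position. The `apply_point_mutation` helper and the toy sequence are hypothetical; the positions in the example do not correspond to the TMV numbering.

```python
import re


def apply_point_mutation(sequence, mutation):
    """Apply a mutation such as 'G13A' (1-based position) to an RNA sequence.

    Raises ValueError if the notation is malformed, the position is out of
    range, or the wild-type base does not match the sequence.
    """
    match = re.fullmatch(r"([ACGU])(\d+)([ACGU])", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation: {mutation}")
    wild_type, position, mutant = match.group(1), int(match.group(2)), match.group(3)
    index = position - 1
    if not 0 <= index < len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[index] != wild_type:
        raise ValueError(f"expected {wild_type} at {position}, found {sequence[index]}")
    return sequence[:index] + mutant + sequence[index + 1:]


wild_type_rna = "AUGGCAUUCGAAUC"  # toy sequence, not a real TLS
g_to_a = apply_point_mutation(wild_type_rna, "G4A")
c_to_u = apply_point_mutation(wild_type_rna, "C9U")
double = apply_point_mutation(g_to_a, "C9U")
print(g_to_a, c_to_u, double)
```

Checking the wild-type base before substitution guards against off-by-one errors when construct numbering differs from genome numbering.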
FIGURE 4.
A putative D-loop/T-loop mimic is required for structure and efficient histidylation. (A,B) SHAPE chemical probing of the wild-type TMV (A) and RMV (B) TLSs and two D-loop/T-loop mutants (boxed in figure). Reactivity was background subtracted and normalized to flanking 5′ and 3′ normalization hairpins (not shown). (C) 3H-L-histidine incorporation using S. cerevisiae HisRS of three representative TLSHis RNAs; TMV, RMV, and DiaYMV, each with the wild-type sequence as well as two D-loop/T-loop mutants. Histidylation of each RNA, as measured by covalent incorporation of 3H-L-histidine by histidine tRNA-synthetase (HisRS), is normalized to yeast tRNAHis. The dashed line and shaded region indicate the background of a reaction containing no RNA. (D) As in C, but with N. tabacum HisRS. Each reaction was performed in triplicate. Error bars represent one standard error from the mean.
The aminoacylation assays reveal the functional importance of the T-loop and GG nucleotides but do not show if they interact, so we used chemical probing to determine if mutation to either induced changes in other parts of the RNA. SHAPE probing of the C → U RNAs revealed substantial increases in reactivity in the T-loop, consistent with destabilization of the loop's structure (Fig. 4A,B). In addition, the C → U mutation increased reactivity of the GG dinucleotides, while changes in the rest of the RNA were minimal (Supplemental Fig. S4). Thus, disruption of the T-loop in these TLSHis causes local structural changes in the GG dinucleotide, consistent with a long-range interaction between these elements. The G → A mutation in the GG dinucleotide induced increased reactivity in the GG dinucleotide and some subtle increases in the T-loop (Fig. 4A,B). This is consistent with the proposed interaction and likely reflects the fact that the T-loop comprises a preformed structural motif whose structure is less dependent on tertiary interactions (Chan et al.
2013).Our data suggest that the GG dinucleotide between PK2 and the AC stem of TLSHis acts analogously to the D-loop in tRNAs, making a specific long-range contact to the T-loop as previously proposed (Felden et al. 1996). Although high-resolution structural data will be needed to verify this interaction, we speculate there is a Watson–Crick base pair between the first G of the dinucleotide, G12 in our TMV construct, and the third base in the T-loop, C78, as well as the reverse Hoogsteen base pair between G13 and U76, as seen in tRNAs (Supplemental Fig. S3C,D). In tRNAs this interaction is critical to create the functional global fold; the GG dinucleotide “D-loop”/T-loop interaction in TLSHis may play a similar structural and functional role, positioning the AC stem and acceptor pseudoknot such that the HisRS can productively recognize these elements. Additionally, within the T-loop several nucleotides are modified in tRNAs and have been shown to be modified in the analogous region of Brome Mosaic Virus (Baumstark and Ahlquist 2001); whether similar modifications are present within members of the TLSHis class remains to be explored.
Comparisons between the TLSHis and other TLS classes
In some ways, characteristics of the TLSHis class lie in the middle of the three known TLS classes. For example, the identity and size of the AC loop are strictly conserved for TLSVal, and examples that lack a valine anticodon are not aminoacylated by valyl-tRNA synthetase (Dreher et al. 1992; Sherlock et al. 2021). In contrast, the AC loop is not conserved for the TLSTyr class, to the point where there are disagreements about its identity (Perret et al. 1989; Felden et al. 1994a; Bonilla et al. 2020). While many TLSHis representatives contain a GUG histidine AC, there are examples that do not, and the AC loop length for TLSHis is typically only five instead of seven nucleotides. Additionally, the TLSHis class contains more secondary structure elements than TLSVal and fewer than TLSTyr and also falls in the middle in average length. Perhaps this intermediate status could help elucidate the evolutionary relationship and trajectory of the different TLS classes. While our current studies do not address these evolutionary relationships, it is noteworthy that the TLSHis and TLSVal structures have enough similarity, especially in the 3′ regions, that bioinformatic searches based on the TLSHis class can find TLSVal and vice versa, albeit mostly at E values above the inclusion threshold (Sherlock et al. 2021). It also remains to be determined how the TLSs achieve specificity for a particular amino acid given the observed diversity in identity elements. It is intriguing to speculate that other classes of TLSs exist that are structurally unique compared to the three known classes. There is certainly precedent to support this, given the constantly expanding list of classes of riboswitches, ribozymes, and xrRNAs as well as the existence of other tRNA-mimicking structures such as TMERs, tmRNAs, and mascRNAs.
Concluding remarks
In this study, we expanded the list of known TLSHis, confirmed their shared secondary structure, showed that they are histidylated in vitro, and provided evidence for a proposed D-loop/T-loop interaction analog. These discoveries present new questions regarding how this class of TLSs achieves specific interactions required for recognition by both host and viral proteins while demonstrating significant sequence and secondary structure variation from canonical tRNAs. While the experiments and analyses herein build on and confirm previous observations and hypotheses regarding TLSHis, high-resolution structural information on TLSHis RNAs will be necessary to address lingering questions regarding their structure and function. For example, in this class the AC-arm is particularly long compared to canonical tRNAHis and always has an internal loop. How this additional length is accommodated, whether the loop affects orientation of the anticodon relative to a bound HisRS, and whether these elements provide conformational dynamics or reconfigurations remain to be understood. Overall, this study moves us closer to understanding the molecular interactions underlying tRNA mimicry for the TLSHis class in terms of structure, while the overall topology, intermolecular interactions with host and viral proteins, and ultimately the function of these RNA elements during infection, remain to be elucidated.
MATERIALS AND METHODS
TLSHis bioinformatic searches and consensus model generation
An alignment of four known histidine-accepting tRNA-like structures (TLSHis) (Rfam ID: RF01077) was obtained from the Rfam database (Kalvari et al. 2018) and extended to include the entire TLS, as the existing alignment ended prior to the T-loop and PK1. Using Infernal version 1.1 (Nawrocki and Eddy 2013), a database consisting of all +ssRNA virus sequences deposited in the National Center for Biotechnology Information (NCBI, retrieved 01/22/2019) was queried to identify additional instances of this motif. Sequences identified by Infernal were added to the initial four sequences to generate an updated covariance model for subsequent iterative searching. Only sequences below the Infernal E-value threshold of 0.05 were considered. Duplicate sequences were removed, yielding 158 sequences from 36 unique viruses. A TLS identified from Furovirus had a comparatively high E value of 0.019, demonstrated features of the TLSVal class, and was previously identified with high confidence as a member of that class (Sherlock et al. 2021), making it an unlikely TLSHis candidate. It was therefore excluded from further analyses, resulting in 157 sequences from 35 unique viruses. These sequences were used to generate a consensus sequence and secondary structure model, including an analysis of covariation, using RNA Covariation Above Phylogenetic Expectation (R-scape v1.5.16) with CaCoFold (Rivas et al. 2016, 2020), then rendered in R2R (Weinberg and Breaker 2011).
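The filtering steps described above (E-value threshold, duplicate removal, manual exclusion) can be illustrated with a minimal Python sketch. This is not the pipeline used in this study: the `Hit` record, the `filter_hits` and `summarize` helpers, and the toy sequences and virus labels are hypothetical, and duplicates are assumed here to mean exact sequence matches. Only the 0.05 threshold is taken from the text.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """Hypothetical record for one search hit (not an Infernal data type)."""
    virus: str       # label for the source virus
    sequence: str    # sequence of the hit
    e_value: float   # E value reported by the search


E_VALUE_THRESHOLD = 0.05  # threshold stated in the text


def filter_hits(hits, threshold=E_VALUE_THRESHOLD, excluded_viruses=frozenset()):
    """Keep hits below the threshold, drop manually excluded viruses,
    and drop exact duplicate sequences (assumed definition of duplicate)."""
    seen = set()
    kept = []
    for hit in hits:
        if hit.e_value >= threshold:
            continue  # not below the E-value threshold
        if hit.virus in excluded_viruses:
            continue  # manually excluded, e.g. assigned to another TLS class
        key = hit.sequence.upper()
        if key in seen:
            continue  # exact duplicate of a sequence already kept
        seen.add(key)
        kept.append(hit)
    return kept


def summarize(hits):
    """Return (number of sequences, number of unique viruses)."""
    return len(hits), len({hit.virus for hit in hits})


# Toy data for illustration only.
toy_hits = [
    Hit("virus_A", "GGCUAGC", 1e-10),
    Hit("virus_A", "GGCUAGC", 1e-10),  # exact duplicate
    Hit("virus_B", "GGCUAGA", 3e-4),
    Hit("virus_C", "GGAUAGA", 0.019),  # below threshold but manually excluded
    Hit("virus_D", "GGAUCGA", 0.2),    # above threshold
]
kept = filter_hits(toy_hits, excluded_viruses={"virus_C"})
print(summarize(kept))
```

Keeping the manual exclusion as an explicit argument, rather than lowering the threshold, mirrors the text: the excluded hit passed the E-value cutoff and was removed on the basis of class assignment.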
Expression of Saccharomyces cerevisiae HisRS
The DNA sequence encoding the HisRS enzyme from S. cerevisiae (GenBank: AJW07132.1) was purchased as a dsDNA gBlock (IDT) and cloned into a pET15b(+) vector containing an in-frame amino-terminal hexahistidine affinity tag. The protein was recombinantly expressed in BL21 (DE3) cells. Cells were grown in LB to an OD600 of 0.3, then protein expression was induced using 250 µM isopropyl β-d-1-thiogalactopyranoside (IPTG) overnight at 18°C. Pelleted cells were resuspended in lysis buffer containing 20 mM Tris-HCl (pH 7.0), 500 mM NaCl, 2 mM β-mercaptoethanol (BME), 10% (v/v) glycerol, and cOmplete EDTA-free protease inhibitor cocktail tablets (Roche). Cells were then lysed by sonication on ice for 2 min in cycles of 20 sec on, 40 sec off at 75 W. The lysate was clarified by centrifugation at 30,000g for 30 min at 4°C. The soluble fraction was purified by nickel affinity chromatography in a buffer containing 150 mM NaCl, 20 mM Tris-HCl (pH 7.0), 200 mM imidazole, 10% glycerol, and 2 mM β-mercaptoethanol. The protein was exchanged into a storage buffer containing 50 mM Tris-HCl (pH 8.0), 100 mM NaCl, 5 mM MgCl2, and 5% glycerol using a spin concentrator (Amicon) and stored at 0.3 mg mL−1 at −80°C with working stocks stored at −20°C.
Expression of Nicotiana tabacum HisRS
The DNA sequence encoding the HisRS enzyme from N. tabacum (NCBI Reference Sequence: XP_016505768.1) in a pET15b(+) vector was purchased (Gene Universal). The protein was purified in the same manner as above, with a subsequent size exclusion step in a buffer containing 50 mM Tris-HCl (pH 8.0), 100 mM NaCl, and 5 mM MgCl2. The protein was exchanged into a storage buffer containing 50 mM Tris-HCl (pH 8.0), 100 mM NaCl, 5 mM MgCl2, and 5% glycerol using a spin concentrator (Amicon) and stored at 1.3 mg mL−1 at −80°C with working stocks stored at −20°C.
In vitro RNA transcription
DNA templates were ordered as gBlock DNA fragments (IDT) and cloned into pUC19. PCR reactions (200 µL) using primers containing an upstream T7 promoter were used to generate dsDNA templates for transcription. Typical PCR conditions: 100 ng plasmid DNA, 0.5 µM forward and reverse DNA primers (Supplemental File 2), 500 µM dNTPs, 25 mM TAPS-HCl (pH 9.3), 50 mM KCl, 2 mM MgCl2, 1 mM β-mercaptoethanol, and Phusion DNA polymerase (New England BioLabs). Templates for RNA used in aminoacylation assays were amplified using reverse primers containing two 5′-terminal 2′-O-methyl modified bases to ensure the correct 3′ end of the RNA. dsDNA amplification was confirmed by 1.5% agarose gel electrophoresis. Transcriptions were performed in 1 mL volume using 200 µL of PCR product (∼0.1 µM template DNA) and 10 mM NTPs, 75 mM MgCl2, 30 mM Tris-HCl (pH 8.0), 10 mM DTT, 0.1% spermidine, 0.1% Triton X-100, and T7 RNA polymerase. Reactions were incubated at 37°C overnight. After transcription, insoluble inorganic pyrophosphate was removed by centrifugation at 5000g for 5 min. The RNA-containing supernatant was then ethanol precipitated with three volumes of 100% ethanol at −80°C for a minimum of 1 h and centrifuged at 21,000g for 30 min at 4°C to pellet the RNA, and the ethanolic fraction was decanted. The RNA was resuspended in 9 M urea loading buffer then purified by denaturing 10% PAGE. Bands were visualized by UV shadowing, excised, and crush-soaked in diethylpyrocarbonate (DEPC)-treated Milli-Q water at 4°C overnight. The RNA-containing supernatant was then concentrated using spin concentrators (Amicon) to the appropriate concentration in DEPC-treated water. RNAs were stored at −80°C with working stocks stored at −20°C.
In vitro chemical probing of RNAs
Structure probing experiments using the SHAPE reagent NMIA were performed as described previously (Cordero et al. 2014). Briefly, 240 µM RNA was refolded by heating to 90°C for 5 min, cooled to ambient temperature, then incubated at ambient temperature with MgCl2 for 20 min. Subsequently, the refolded RNA was modified by incubating with NMIA for 15 min at ambient temperature. NMIA modification conditions: 120 nM RNA, 6 mg/mL NMIA or DMSO, 50 mM HEPES-KOH (pH 8.0), 10 mM MgCl2, 3 nM 6-fluorescein amidite 5′-labeled FAM-RT primer (Supplemental File 2). Modification was quenched by the addition of NaCl to 500 mM, Na-MES buffer (pH 6.0) to 50 mM, and oligo(dT) magnetic beads [Invitrogen Poly(A) Purist MAG Kit]. Modified RNAs were recovered using the magnetic beads, washed twice with 70% ethanol, then resuspended in water. Reverse transcription was carried out using SuperScript III (Invitrogen) at 48°C for 1 h per the manufacturer's instructions. The RNA was then degraded by the addition of NaOH to 200 mM and heating to 90°C for 5 min. An acid-quench solution (final concentration: 250 mM NaOAc [pH 5.2], 250 mM HCl, 500 mM NaCl) was added and DNA was recovered using the magnetic beads. The DNA was washed twice with 70% ethanol and eluted in GeneScan 350 ROX Dye Size Standard (ThermoFisher) containing HiDi formamide solution (ThermoFisher). 5′-FAM-labeled reverse-strand DNA products were resolved by capillary electrophoresis using an Applied Biosystems 3500 XL instrument. Data workup was performed using HiTRACE from RiboKit (https://ribokit.github.io/HiTRACE/) (Yoon et al. 2011; Kim et al. 2013; Kladwang et al. 2014; Lee et al. 2015) in MATLAB (MathWorks), and figures were rendered using RiboPaint (https://ribokit.github.io/RiboPaint/) in MATLAB, then labeled in Adobe Illustrator. SHAPE reactivity was superimposed on the predicted secondary structure and used to make adjustments to the secondary structure model.
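The background subtraction and normalization to flanking hairpins can be illustrated with a short Python sketch. It is not the HiTRACE workflow, whose procedure may differ. It assumes that per-nucleotide band intensities are already quantified, that the DMSO lane serves as background, and that reactivities are scaled by the mean background-subtracted signal at designated reference positions. The function name `normalize_shape` and the toy values are hypothetical.

```python
def normalize_shape(modified, control, reference_positions):
    """Background-subtract per-nucleotide signal and scale it to reference positions.

    modified: intensities from the NMIA-treated sample
    control: intensities from the DMSO (no-reagent) sample
    reference_positions: indices of nucleotides in the normalization hairpins
    Assumption: scaling uses the mean background-subtracted reference signal.
    """
    if len(modified) != len(control):
        raise ValueError("modified and control must have the same length")
    if not reference_positions:
        raise ValueError("at least one reference position is required")
    subtracted = [m - c for m, c in zip(modified, control)]
    reference = [subtracted[i] for i in reference_positions]
    scale = sum(reference) / len(reference)
    if scale <= 0:
        raise ValueError("reference signal must be positive after subtraction")
    return [value / scale for value in subtracted]


# Toy intensities for four nucleotides; positions 0 and 3 act as references.
toy_modified = [3.0, 9.0, 2.0, 7.0]
toy_control = [1.0, 1.0, 1.0, 1.0]
reactivity = normalize_shape(toy_modified, toy_control, [0, 3])
print(reactivity)
```

Scaling to internal reference hairpins makes reactivities comparable between constructs, such as a wild-type RNA and its mutants, because each is expressed relative to the same flanking elements.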
In vitro aminoacylation assays
Aminoacylation constructs were refolded by heating to 90°C for 5 min then cooling to ambient temperature, then incubated with 10 mM MgCl2 for 20 min. Aminoacylation reactions were set up as follows: 1 µL of 1 µM RNA or water, 1 µL of freshly prepared aminoacylation buffer (10×: 200 mM HEPES-KOH [pH 7.5], 20 mM ATP, 300 mM KCl, 50 mM MgCl2, 50 mM DTT), 1 µL of 3H-2,5-L-histidine, 6 µL of DEPC-treated water, and 1 µL of HisRS (3 µM). Aminoacylation reactions were incubated at 30°C for 2 h. Reactions were quenched with 100 µL of wash buffer (20 mM Bis-Tris [pH 6.5], 10 mM NaCl, 1 mM MgCl2) with trace xylene cyanol for visualization. Quenched reactions were immediately loaded onto a vacuum filter blotting apparatus. Filter stack in order from top to bottom: 0.45 µm Tuffryn membrane filter paper (PALL Life Sciences), HyBond positively charged membrane (GE Healthcare), thick filter paper (Bio-Rad gel dryer filter paper). Prior to filter blotting apparatus assembly, each layer was equilibrated in wash buffer. After application of the reaction solution, each blot was immediately washed with 3 × 300 µL of wash buffer containing trace xylene cyanol. The filters were subsequently dried and the blots from the HyBond membrane were excised and measured for 3H incorporation by liquid scintillation counting (Perkin-Elmer Tri-Carb 2910 TR). Data processing was performed in Excel.
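The reaction setup and data processing can be illustrated with a brief Python sketch. The first part derives final concentrations from the stated volumes, which total 10 µL. The second part computes a triplicate mean and standard error and expresses them relative to a tRNAHis reference. The second part rests on assumptions: no background subtraction is applied, the reference mean is treated as exact, and the counts are invented toy values. The helper names are hypothetical, and this is not the Excel workflow used in this study.

```python
from math import sqrt
from statistics import mean, stdev

# Component volumes (µL) as listed in the text.
VOLUMES_UL = {"RNA": 1.0, "10x buffer": 1.0, "3H-histidine": 1.0,
              "water": 6.0, "HisRS": 1.0}
TOTAL_UL = sum(VOLUMES_UL.values())  # 10 µL reaction


def final_concentration(stock, volume_ul, total_ul=TOTAL_UL):
    """Dilute a stock concentration into the reaction (same units as stock)."""
    return stock * volume_ul / total_ul


def mean_and_sem(replicates):
    """Mean and standard error of the mean (sample standard deviation / sqrt(n))."""
    return mean(replicates), stdev(replicates) / sqrt(len(replicates))


def normalize_to_reference(sample_counts, reference_counts):
    """Express the sample mean and SEM as fractions of the reference mean.

    Simplification: uncertainty in the reference itself is ignored.
    """
    reference_mean = mean(reference_counts)
    sample_mean, sample_sem = mean_and_sem(sample_counts)
    return sample_mean / reference_mean, sample_sem / reference_mean


rna_final_um = final_concentration(1.0, VOLUMES_UL["RNA"])      # 1 µM stock
hisrs_final_um = final_concentration(3.0, VOLUMES_UL["HisRS"])  # 3 µM stock
atp_final_mm = final_concentration(20.0, VOLUMES_UL["10x buffer"])  # 20 mM in 10x

# Toy scintillation counts (invented), three replicates each.
toy_tls = [400.0, 500.0, 600.0]
toy_trna_his = [1000.0, 1000.0, 1000.0]
relative, relative_sem = normalize_to_reference(toy_tls, toy_trna_his)
print(rna_final_um, hisrs_final_um, atp_final_mm, relative, relative_sem)
```

Under these volumes the RNA is at 0.1 µM and HisRS at 0.3 µM in the reaction, and the 10× buffer is diluted to 1× (for example, 2 mM ATP).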
SUPPLEMENTAL MATERIAL
Supplemental material is available for this article.