Zasha Weinberg1, Peter B Kim2, Tony H Chen3, Sanshu Li1, Kimberly A Harris1, Christina E Lünse2, Ronald R Breaker4
1. [1] Howard Hughes Medical Institute, Yale University, New Haven, Connecticut, USA. [2] Department of Molecular, Cellular and Developmental Biology, Yale University, New Haven, Connecticut, USA.
2. Department of Molecular, Cellular and Developmental Biology, Yale University, New Haven, Connecticut, USA.
3. Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut, USA.
4. [1] Howard Hughes Medical Institute, Yale University, New Haven, Connecticut, USA. [2] Department of Molecular, Cellular and Developmental Biology, Yale University, New Haven, Connecticut, USA. [3] Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut, USA.
Abstract
Enzymes made of RNA catalyze reactions that are essential for protein synthesis and RNA processing. However, such natural ribozymes are exceedingly rare, as evidenced by the fact that the discovery rate for new classes has dropped to one per decade from about one per year during the 1980s. Indeed, only 11 distinct ribozyme classes have been experimentally validated to date. Recently, we recognized that self-cleaving ribozymes frequently associate with certain types of genes from bacteria. Herein we exploited this association to identify divergent architectures for two previously known ribozyme classes and to discover additional noncoding RNA motifs that are self-cleaving RNA candidates. We identified three new self-cleaving classes, which we named twister sister, pistol and hatchet, from this collection, suggesting that even more ribozymes remain hidden in modern cells.
The RNA World theory[1] is based on the notion that contemporary life is derived from organisms that exploited numerous and functionally diverse ribozymes before the emergence of proteins. Many of the roles once played by catalytic RNAs presumably diminished over time due to competition from protein enzymes. A few ribozyme classes that perform critical biochemical transformations such as ribosome-mediated peptide bond formation[2] and various RNA processing reactions[3-5] persisted either because their roles could not easily be replaced or because RNA is particularly well suited to perform these tasks[1]. However, modern natural ribozymes are exceedingly rare, as evidenced by the fact that the discovery rate for new classes has dropped to one per decade from about one per year during the 1980s.
Of the 11 previously validated ribozyme classes, six are self-cleaving. Three self-cleaving classes, hammerhead[6], HDV[7] and twister[8], have thousands of representatives in living systems. Interestingly, only a few of these representatives have been linked to biological roles, such as rolling-circle replication of RNA pathogens[9], processing of repetitive RNA sequences[10], and metabolite-dependent gene regulation[11]. Therefore, most self-cleaving ribozyme representatives have unknown utility, and much remains to be learned about the involvement of self-cleaving ribozymes in cellular function. Given the relative importance of known ribozymes to information processing, gene expression, and genomic integrity, the discovery of additional classes provides opportunities to advance our understanding of modern biochemical processes, to gain further insights into RNA structures, and to shed light on the possible diversity of RNA World functions.
Unfortunately, the pace of novel ribozyme class discovery has been exceedingly slow over the last 25 years, and all confirmed discoveries were made serendipitously and not while searching for ribozymes.
One example of a purposeful search for self-cleaving ribozymes in humans using a biochemical selection method[12] yielded a variant of the HDV self-cleaving ribozyme class and three other natural ribozyme candidates. Recently, we noted that many hammerhead and twister ribozymes commonly reside within a few kilobases of each other and likewise near certain protein-coding genes[8]. Although the biological basis for this association remains mysterious, we hypothesized that other self-cleaving ribozyme classes might also appear in the vicinity of these genetic elements and that a computational search strategy might reveal new ribozyme classes.
In the current study, a search for conserved RNA structures near these genetic elements yielded a ribozyme class that we called “twister sister” because it has vague similarities in sequence and secondary structure to twister ribozymes. However, the two ribozyme classes cleave at different sites, and therefore the significance of the sequence and structural similarities will require further investigation. The search also revealed variants of hammerhead and HDV ribozymes, as well as additional conserved RNA structures that did not self-cleave in vitro. Similarly, we performed another search using an expanded set of genetic elements and genomic sequences, and detected additional novel ribozyme classes, which we named “hatchet” and “pistol”.
RESULTS
Identification of ribozyme candidates
To select promising genomic locations for our search, we first enumerated genetic elements that are often located within 6 kilobases of twister or hammerhead self-cleaving ribozymes. Protein domains encoded by these nearby genes were identified using the Conserved Domain Database (CDD)[13], and by using the JackHMMER program[14] (see Online Methods, Supplementary Results, Supplementary Table 1). We calculated the frequency of these ribozymes in the vicinity of each conserved protein domain or RNA class (Supplementary Table 2), and chose a total of six RNA classes (three permutations of twister and hammerhead ribozymes) and 16 conserved protein domains (Supplementary Table 3). We speculated that the intergenic regions (IGRs) near these elements (available at http://breaker.research.yale.edu/ts), totaling ~7 million base pairs, are likely to be enriched for self-cleaving ribozymes and therefore we used them in our search.
Conserved RNA structures were then identified with a computational pipeline[15,16] that first employed BLAST to identify IGR groups that are presumably homologous, and then employed the CMfinder program[15] to predict conserved secondary structures. We manually analyzed the resulting predictions to refine the secondary structure model for each group of homologous sequences. Predictions were evaluated based on support from covarying mutations, which are known to provide powerful evidence for base pairing in natural RNAs[17-19].
We predicted fifteen distinct RNA motifs with conserved secondary structures (Fig. 1, Supplementary Fig. 1). Alignments and information on nearby genes are provided for all motifs (Supplementary Data Sets 1 and 2). Among these were unusual variants of hammerhead and HDV ribozymes (Fig. 1a, Fig. 1b, Supplementary Fig. 2), and a motif we named “twister sister” because it has vague sequence and structural similarities with a recently discovered self-cleaving ribozyme called twister[8] (Fig. 1c).
The genetic elements actually involved in detecting each confirmed ribozyme class are available in Supplementary Table 4. General properties of motifs that did not self-cleave in our experiments are listed in Supplementary Table 5.
Figure 1
Self-cleaving ribozyme candidates
(a) Consensus sequence and secondary structure model for distinct variants of hammerhead ribozymes. R and Y represent purine and pyrimidine nucleotides, respectively. The site of ribozyme-mediated RNA cleavage (Clv) is identified by an arrowhead. (b) Consensus sequence and secondary structure model for newly found variants of HDV ribozymes. (c) Consensus models for twister and twister sister ribozymes. Noncanonical base pairs and other additional structural interactions for twister ribozymes recently revealed by biophysical studies are not included.
Three ribozyme candidates self-cleave in vitro
To assess self-cleaving ribozyme activity, we conducted in vitro transcriptions using wild-type (WT) or various mutant DNA templates of each candidate. Representatives of all three new motifs undergo efficient self-cleavage during transcription (Supplementary Fig. 2). Twelve additional motifs (Supplementary Fig. 1) did not cleave in vitro (data not shown), and were not further pursued.
The hammerhead ribozymes uncovered in our study are variants of the type I architecture, wherein stem I is not covalently closed. These unusual variants are predicted to form stem II with only a single base pair, which is a feature seen in only one hammerhead ribozyme representative reported previously[6]. Moreover, the loop of stem II forms a long-distance pseudoknot with the loop of an additional hairpin formed by the 5′ region of the motif. Despite these distinct structural features, the variant hammerhead ribozymes retain the normal site of RNA cleavage used by all other hammerhead ribozymes examined to date (Fig. 1a, Supplementary Fig. 3).
The HDV ribozyme variants, which had not previously been detected in bacterial metagenomes, often carry an E-loop structure[20] at the base of P4. This arrangement is present in only one previously reported HDV ribozyme[21]. A bacterial representative carrying the E-loop sub-structure was found to undergo self-cleavage at the same location as other more-typical HDV ribozymes (Supplementary Fig. 4), suggesting that the newly-found variants form the same active site using an alternative structural feature. Eukaryotic HDV variants with similar sequence and structure features were found by homology search, and also by an independent bioinformatics analysis of fungal genomes (S.L. and R.R.B., unpublished data).
Biochemical properties of twister sister ribozymes
The “twister sister” motif, newly revealed by our search, remotely resembles twister ribozymes chiefly because some twister RNAs have P1 through P5 stems in an arrangement similar to twister sister, and because of similarities in the nucleotides in the P4 terminal loop. However, we find no evidence of pseudoknot formation via Watson-Crick base pairing, which occurs in all known twister ribozymes. In addition, there is poor correspondence among many of the most highly conserved nucleotides in each of the two motifs. Given these observations, it was not immediately clear if twister sister RNAs would self-cleave. Even if they function as ribozymes, twister sister RNAs could either represent a distinct ribozyme class or simply be a highly divergent form of twister ribozymes.
A twister sister construct, called TS-1 and based on a microbial metagenomic DNA source, was engineered to function as a bimolecular complex with separate enzyme and substrate strands (Fig. 2a). The substrate strand of this complex was cleaved in the presence of Mg2+ only when the enzyme strand was present (Fig. 2b). Surprisingly, the cleavage site was on the right side of the internal loop linking P1 and P2 (Fig. 2c). Likewise, another twister sister ribozyme construct called TS-2, derived from a different microbial metagenomic DNA, was cleaved in this same location (Supplementary Fig. 5). This cleavage site between nucleotides C13 and A14 is on the opposite side of the internal loop relative to the site cleaved by twister. The distinct cleavage site, along with the sequence and structural differences noted above, provided initial evidence that twister sister RNAs might represent a separate ribozyme class from twister.
Figure 2
Activity of a bimolecular twister sister ribozyme
(a) Sequence and secondary structure model of a bimolecular construct derived from TS-1 (Supplementary Fig. 1). Highly conserved nucleotides are depicted in red and Clv designates the cleavage site. Lowercase letters identify non-native guanosine residues that were added to facilitate in vitro transcription. (b) PAGE separation of bimolecular TS-1 ribozyme assay products demonstrating the requirement for the enzyme strand and for divalent metal ions. S designates the 5′ 32P-labeled 23-nucleotide RNA substrate and 5′ Clv identifies the cleavage product. Reactions were conducted as described (see Online Methods) with variations noted. (c) PAGE separation of ribozyme cleavage products. S designates the 5′ 32P-labeled 23-nucleotide RNA substrate, and C13 identifies the 5′ 32P-labeled fragment band produced by incubation with excess unlabeled ribozyme for 30 min (Rxn lane). NR, T1 and −OH lanes designate no reaction, RNase T1 partial digestion (cleaves after G nucleotides), or partial alkaline digestion (cleaves all internucleotide linkages), respectively. (d) Bimolecular TS-1 cleavage reaction products examined by mass spectrometry (see Online Methods). The second largest peak near the 5′ Clv annotation is the spontaneously formed opened version (observed mass, 4304.576) of the initial 2′,3′-cyclic phosphate product. Intensity is abbreviated int. (e) The dependence of twister sister rate constants on pH. (f) The dependence of twister sister rate constants on Mg2+ concentration. A version of this figure containing full-length gel images is shown in Supplementary Figure 11.
The TS-1 cleavage site was confirmed by mass spectrometry (Fig. 2d), which also revealed that the ribozyme reaction yields a 5′ cleavage product with a terminal 2′,3′-cyclic phosphate group, and a 3′ cleavage product with a 5′ hydroxyl group. These reaction products result via a mechanism wherein the 2′ oxygen of C13 attacks the adjacent phosphorus atom, with subsequent departure of the 5′ oxygen of A14 (Supplementary Fig. 6). As expected for a ribozyme that uses this general phosphoester transfer mechanism, the cleavage reaction cannot proceed when the nucleotide corresponding to C13 of the substrate lacks the 2′ oxygen nucleophile (Supplementary Fig. 7). This mechanism for RNA strand scission is identical to that of all six previously characterized self-cleaving ribozyme classes[5].
To further assess the characteristics of twister sister ribozymes, we established additional biochemical properties of the TS-1 bimolecular complex. With each unit increase in pH, there was a ~10-fold increase in TS-1 cleavage activity, with a plateau near pH 7 (Fig. 2e). The simplest explanation for this pH-activity profile is that the ribozyme shifts the pKa of the 2′ hydroxyl group of C13 to approximately 7, which otherwise has a pKa value of ~13.7 (ref. 21). An oxyanion at the 2′ position will function as a far more powerful nucleophile compared to the 2′ hydroxyl group, and therefore the rate constant should increase in proportion to the probability that the 2′ position is deprotonated, but only if the chemical step of the reaction is rate limiting. If this explanation is true, then ribozyme activity does not continue to increase beyond pH 7 because the 2′ oxygen of C13 is already nearly fully deprotonated at values approaching and exceeding physiological pH.
TS-1 activity was also highly dependent on the concentration of divalent metal ion, and the steep increase in ribozyme rate constant only plateaued at Mg2+ concentrations above 1 mM (Fig. 2f).
Divalent metal ions could strongly induce ribozyme activity either by serving important roles in the formation of the global structure of the ribozyme or by directly promoting catalysis at the active site. Both twister sister and twister ribozymes displayed similar dependencies on pH and Mg2+ (see ref. 8 for kinetic details of twister), suggesting that they might share some catalytic strategies (Supplementary Fig. 6) despite their differences in conserved sequences, structural features, and cleavage site. Although twister ribozymes also are strongly activated by the addition of Mg2+ or by a number of other divalent metal ions, initial findings suggest that these ribozymes do not employ metal ions in their active site[8,24]. However, a recent x-ray structure model has been proposed wherein a Mg2+ ion interacts with the phosphate moiety at the cleavage site[25].
The maximum observed rate constant (kobs) value for TS-1 was ~5 min−1 when both pH and Mg2+ were optimal. This is considerably slower than the twister ribozymes examined previously[8], which have kobs values estimated by extrapolation to be more than 1,000 min−1. Interestingly, the pH profile and maximum rate constant for TS-1 are nearly identical to those of a collection[22] of ribozymes and deoxyribozymes created previously by using directed evolution. It has been suggested that this previous collection of ribozymes and deoxyribozymes might use only two catalytic strategies to promote catalysis: orienting the labile linkage for in-line nucleophilic attack (called α, Supplementary Fig. 6) and promoting deprotonation of the 2′ hydroxyl group (called γ). It is known that the maximum rate constant generated by γ catalysis is ~0.02 min−1, whereas α catalysis is estimated to provide an additional ~100-fold increase in the rate constant[22].
If these two catalytic strategies are independent, and if their rate constant enhancements are multiplicative, this permits an estimate for a “speed limit” for catalysts that employ only α and γ catalytic strategies of ~2 min−1.
Given that the maximum kobs value measured for the bimolecular TS-1 construct is only modestly greater than the estimated maximum kobs for αγ ribozymes, we wondered if other twister sister ribozymes might achieve greater speeds. We tested two additional constructs called TS-3 and TS-4 (Supplementary Fig. 8). Both constructs exhibited much higher rate constants under suboptimal pH and Mg2+ ion concentrations, indicating that their rate constants would be far in excess of the TS-1 construct when tested under optimal reaction conditions (predicted as >100 min−1 for TS-3). These results demonstrate that at least some twister sister ribozymes can combine catalytic strategies to exceed the αγ speed limit.
To allow further comparisons of twister and twister sister, we consulted three recently reported atomic-resolution structures[23-25]. These structures largely agree with the previously proposed secondary structure model for twister ribozymes[8], and reveal the importance of highly conserved nucleotide positions. In one model, twister ribozymes appear to employ three catalytic strategies to achieve their high speeds[24]: α and γ as described above, as well as protonation of a non-bridging phosphate oxygen (β). However, the active sites in other structure models for twister ribozymes are either unclear[23] or different[25]. As a result, additional studies will be needed to confirm the precise nature of the active site and the catalytic strategies used by twister ribozymes.
As noted above, the overall secondary structures of twister and twister sister exhibit vague similarities. However, there are differences in two places with important roles.
First, the conserved nucleotide predicted to be responsible for β catalysis in twister ribozymes is part of a pseudoknot[24], which is presumed to be absent in twister sister ribozymes. If twister sister ribozymes employ β catalysis, it is not clear how they promote this catalytic strategy using a structure similar to twister ribozymes. Second, although the nucleotides in the loop of P4 are notably similar, key conserved nucleotides in the P1 and P2 stems of twister are not found in twister sister. Both motifs have highly, but not invariantly, conserved A nucleotides immediately 3′ to the cleavage site. However, the modest similarities in nucleotide sequence near the cleavage site could also be explained by chance. At least two explanations for the differences in conserved nucleotides are possible: either twister sister and twister ribozymes have distinct active sites that use a similar scaffold, or the different nucleotides in twister sister actually form the same geometry and tertiary contacts as those in twister ribozymes. An atomic-resolution structure of a twister sister ribozyme will help address the extent to which these ribozymes use similar structures and catalytic strategies to accelerate RNA phosphoester transfer.
As with many other self-cleaving ribozyme classes, twister sister catalytic activity was supported by divalent metal ions other than Mg2+ (Supplementary Fig. 9a). However, the TS-1 RNA responded differently to Sr2+ and Ni2+ compared to twister[11]. Moreover, of five Group 1 monovalent cations tested at 1 M, only Li+ resulted in observable ribozyme activity (Supplementary Fig. 9b), whereas all five cations support twister activity. Finally, cobalt hexammine, an analog of hydrated Mg2+, did not induce twister sister activity (Supplementary Fig. 9c), but did support twister activity[8].
Together, these observations further suggest that there might be differences at the active sites formed by twister sister and other self-cleaving ribozyme classes that cause the distinct responses to these metal ions.
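The two kinetic arguments made in this section, the pH-rate profile and the αγ "speed limit", can be illustrated with a short numerical sketch. It assumes the simplest possible model, in which the observed rate constant is proportional to the deprotonated fraction of a single titratable group; the function names are hypothetical, and the values of 5 min−1 and pKa 7 are taken from the text purely for illustration, not from a fit to the data.

```python
def fraction_deprotonated(pH: float, pKa: float) -> float:
    """Fraction of a single titratable group that is deprotonated at a given pH."""
    return 1.0 / (1.0 + 10.0 ** (pKa - pH))


def k_obs(pH: float, k_max: float, pKa: float) -> float:
    """Rate constant if the chemical step is rate limiting and scales with deprotonation."""
    return k_max * fraction_deprotonated(pH, pKa)


# Illustrative values only (assumptions): maximum k_obs of ~5 min^-1 and an
# apparent pKa of ~7 for the 2' hydroxyl group, as discussed in the text.
K_MAX = 5.0
APPARENT_PKA = 7.0

for pH in (4.0, 5.0, 6.0, 7.0, 8.0, 9.0):
    print(f"pH {pH:.0f}: k_obs ~ {k_obs(pH, K_MAX, APPARENT_PKA):.3g} min^-1")

# "Speed limit" for catalysts using only alpha and gamma strategies, assuming
# the two contributions are independent and multiplicative.
GAMMA_MAX_RATE = 0.02   # min^-1, maximum rate constant from gamma catalysis
ALPHA_FOLD = 100.0      # additional fold enhancement estimated for alpha catalysis
alpha_gamma_speed_limit = GAMMA_MAX_RATE * ALPHA_FOLD  # ~2 min^-1
```

Under this model the rate constant rises roughly 10-fold per pH unit well below the apparent pKa and levels off above it, which is the qualitative behavior described for TS-1.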
Identification of additional novel ribozymes
After completing the analyses described above, we reapplied our bioinformatics search strategy using a larger collection of ribozyme-associated gene classes and using additional bacterial DNA sequences from the rapidly accumulating genomic databases. Several novel motifs were found whose representatives did not cleave when assayed in vitro, and these will be detailed in a future report. However, this analysis also uncovered two additional novel self-cleaving ribozyme classes which we named pistol (Fig. 3) and hatchet (Fig. 4). Initial computational and biochemical analyses confirmed that these RNAs use distinct conserved sequences and structural features to accelerate RNA cleavage by more than 10 million fold (Supplementary Fig. 10) via an internal phosphoester transfer mechanism. Therefore, nine of the 14 known natural ribozyme classes promote rapid self-cleavage, which indicates that modern organisms make extensive use of RNA’s capacity to form diverse structures and accelerate RNA strand scission[26,27].
Figure 3
Structure and activity of pistol self-cleaving ribozymes
(a) Consensus sequence and secondary structure model for pistol self-cleaving ribozymes based on 449 unique examples. Annotations are as described in Fig. 1a. (b) A bimolecular pistol ribozyme construct based on a representative from the bacterium Alistipes putredinis. Annotations are as described in the legend for Fig. 2a. (c) Pistol ribozyme activity and cleavage site mapping of the A. putredinis bimolecular construct wherein the substrate RNA (S) was 5′-labeled with 32P. Other annotations are as described in the legend to Fig. 2c. A trace amount of substrate was incubated with excess WT or M10 enzyme strand either with (+) or without (−) 20 mM MgCl2. RNA cleavage products were separated by denaturing 20% PAGE. A version of this figure containing full-length gel images is shown in Supplementary Figure 12.
Figure 4
Structure and activity of hatchet self-cleaving ribozymes
(a) Consensus sequence and secondary structure model for hatchet self-cleaving ribozymes based on 159 unique examples. (b) A bimolecular hatchet ribozyme construct based on a representative from a metagenomic DNA sample. (c) Hatchet ribozyme activity and cleavage site mapping of the bimolecular hatchet ribozyme construct. A version of this figure containing full-length gel images is shown in Supplementary Figure 13.
DISCUSSION
Our discovery of three novel ribozyme classes in one study demonstrates that regions near ribozyme-associated genes in bacteria and bacteriophages are fruitful places to search for novel classes. It remains unclear, however, why these genes frequently associate with self-cleaving ribozymes. In most instances, the functions of the protein products of these ribozyme-associated genes are unknown. However, it was previously noted that many genetic elements associated with self-cleaving ribozymes are typically carried by Mu-like phages[8]. Therefore, certain phages might have particular need for RNA processing by self-cleaving ribozymes.
Twelve of 15 motifs identified in our original search did not exhibit self-cleavage activity during preparation by in vitro transcription. The reasons for the lack of ribozyme activity might be different for each motif. Some might function as ribozymes, but lack some critical component (e.g., an important domain was missing from the RNA molecule) or condition (e.g., a required metal ion or other cofactor) when tested outside of their natural environment. Some candidate motifs lack some of the structural features that are typical of self-cleaving ribozymes. For example, some candidates are unusually large compared to all known self-cleaving ribozyme classes, or they consist of only simple hairpin loops, whereas known natural self-cleaving ribozymes have more complex structures.
Perhaps some of the inactive motifs act as targets for endoribonuclease proteins, providing an alternate mechanism for cleavage. Some motifs may not even function as structured RNAs, despite evidence for structure formation via sequence covariation. The biologically relevant nucleic acid structure could be formed by single-stranded DNA, rather than its corresponding RNA transcript.
Since many genes associated with self-cleaving ribozymes are typical of Mu-like phages, it is possible that Mu-like phages have a preference for RNA-based solutions to biological challenges such as gene regulation and RNA processing. If true, some of the 12 candidate motifs might serve other functions rather than promoting site-specific RNA cleavage.
The mystery regarding the biological utility of self-cleaving ribozymes is likely to deepen as more classes probably remain to be discovered in nature, and many more representatives of the known classes are certain to exist in organisms that have yet to have their genomes sequenced. Targeted computational searches should enable these ribozyme discoveries to be made more rapidly in the future. Moreover, each new ribozyme class provides another type of catalytic RNA for detailed dissection. Given their biological sources, these natural ribozymes will have had to compete effectively with protein-based enzymes to persist through evolution. Consequently, the catalytic activities of these RNAs should be highly refined compared to the many classes that have been created by directed evolution methods. A combination of biochemical and biophysical analyses on these natural ribozymes therefore should reveal much about how RNA molecules can promote rapid chemical transformations without the aid of protein factors.
ONLINE METHODS
Genome sequences and annotation
Our analysis used sequences in the bacterial and archaeal section of RefSeq[28] version 56, and various environmental sequences collected from IMG/M[29], the Human Microbiome Project[30], MG-RAST[31], CAMERA[32] or GenBank[33]. The locations of genes were retrieved from RefSeq or IMG/M annotations, or predicted using MetaGene[34] or MetaGeneMark[35]. Conserved domains were annotated using the Conserved Domain Database[13] version 2.25, using a previously described procedure[36]. The locations of twister ribozymes were predicted based on homology searches we conducted with Infernal[37] version 1.1, and these RNAs were used to generate the twister ribozyme consensus diagram. Other RNAs were predicted using Rfam[38], tRNAscan-SE[39], CRT[40] and rRNA prediction[41]. Rho-independent transcription terminators were predicted using RNie[42]. Consensus diagrams were generated using R2R[43].
Genetic elements associated with self-cleaving ribozymes
As in a previous study[8], we collected genes located within 6 kilobases (Kb) of a twister or hammerhead ribozyme where each gene is at least 200 base pairs from the end of a sequence fragment. As before, HDV ribozymes and glmS ribozymes were not considered because their gene associations did not resemble that of twister or hammerhead ribozymes, while other self-cleaving ribozyme classes are not known in bacteria[8].
Some of the genes in the above collection encoded proteins that did not contain a predicted domain in the Conserved Domain Database, yet were clearly homologous with other ribozyme-associated proteins in the collection. To group together related proteins, we adopted a strategy based on the JackHMMER program[14]. JackHMMER uses a single query protein to find homologs, and then uses the alignment of the homologs to the original query to conduct a more refined search for additional homologs, repeating this process in multiple rounds.
We ran JackHMMER searches using each of the ribozyme-associated proteins collected above as a query, initially searching against the set of ribozyme-associated proteins. Proteins whose JackHMMER searches predicted a set of fewer than 20 homologous proteins were discarded. When the sets of homologs output from the JackHMMER searches of two query proteins overlapped, the protein with the smaller set was discarded, or an arbitrary protein was discarded if the homolog sets were equal in size. We performed an additional JackHMMER search for each of the remaining proteins against the full set of predicted proteins in our sequence database. The sets of predicted homologs arising from each of these searches were treated as a conserved domain and called SCRAP1-67 (SCRAP is Self-Cleaving Ribozyme Associated Protein) (Supplementary Table 1).
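The filtering of query proteins described above (a minimum of 20 homologs, then removal of queries whose homolog sets overlap) can be sketched as follows. This is one possible reading of the procedure, not the authors' code: the function name and input layout are hypothetical, homolog sets are assumed to have already been parsed from JackHMMER output, and ties are broken by identifier to make the "arbitrary" choice reproducible.

```python
from typing import Dict, List, Set


def select_representative_queries(
    homolog_sets: Dict[str, Set[str]], min_size: int = 20
) -> List[str]:
    """Pick query proteins whose homolog sets are large enough and mutually disjoint.

    homolog_sets maps a query protein ID to the set of homolog IDs found for it.
    """
    # Step 1: discard queries whose search returned fewer than min_size homologs.
    kept = {q: s for q, s in homolog_sets.items() if len(s) >= min_size}

    # Step 2: visit queries from largest to smallest homolog set, so that when
    # two sets overlap it is the smaller one that is dropped. Equal-sized sets
    # are ordered by ID, which stands in for the arbitrary choice in the text.
    ordered = sorted(kept, key=lambda q: (-len(kept[q]), q))

    selected: List[str] = []
    for query in ordered:
        if all(kept[query].isdisjoint(kept[other]) for other in selected):
            selected.append(query)
    return selected
```

With toy data and a lowered threshold, `select_representative_queries({"A": {"p1", "p2", "p3"}, "B": {"p3", "p4"}}, min_size=2)` keeps only `"A"`, because the smaller overlapping set `"B"` is discarded. A greedy largest-first pass is a simplification; a strictly pairwise application of the rule could give a different result when three or more sets overlap in a chain.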
Detection of conserved RNAs
We used the conserved ribozyme-associated elements to identify putative non-coding sequences that were expected to be enriched for self-cleaving ribozymes. To evaluate the utility of each conserved element, we calculated two frequencies. Separately for hammerhead ribozymes and for twister ribozymes, we calculated the frequency with which a ribozyme is within 6 Kb of representatives of the conserved element (Supplementary Table 2). These two frequencies can be viewed as estimates of how likely it is that a self-cleaving ribozyme will be present around examples of the conserved element.
We did not calculate p-values, primarily because of the difficulty in modeling correlations between metagenomic sequence fragments (caused by evolutionary relationships) that would otherwise distort p-value statistics. Moreover, the frequencies calculated are well suited to address the main goal of selecting the most promising genomic locations for discovering new ribozymes.
We manually selected conserved elements that had high frequencies for both hammerhead and twister ribozymes, as we anticipated that high numbers with both ribozyme classes would indicate a more robust association. We also observed that conserved elements and their associated self-cleaving ribozymes were almost always encoded on the same DNA template strand, and therefore we did not consider the complementary strands to be enriched for self-cleaving ribozymes. In the case of some conserved elements, it appeared that the ribozymes were usually located closer than 6 Kb to the element, so we often selected a shorter distance. In all, 22 conserved elements were selected (Supplementary Table 3).
The non-coding regions near the selected conserved elements totaled 6.7 Mb. To identify conserved RNA motifs within these non-coding regions, we used a previously published method[16]. After manual analysis of the results, we identified 15 putative conserved RNA structures.
By looking at consensus structures, we determined that that three of these motifs corresponded to or had some aspects in common with known structural classes of self-cleaving ribozymes, i.e., hammerhead, HDV and twister ribozymes. Eukaryotic homologs of motifs were found only for the HDV-like motif. Although we observed that many likely homologs of the HDV-like motif seemed defective, a similar phenomenon was observed with some eukaryotic hammerhead ribozymes[6]. We generally avoided these defective sequences, and did not attempt to find a comprehensive set of HDV ribozymes. However, we did include some HDV ribozymes that appeared to be truncated on their 5′ ends, as has previously been observed in some cases[42].
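The per-element frequencies described earlier in this section can be sketched as follows. This is a simplified illustration, not the pipeline actually used: it assumes each conserved-element hit and each ribozyme is reduced to a single coordinate on a named sequence fragment, ignores strand, and treats "within 6 Kb" as inclusive. The function name `ribozyme_frequency` and the tuple layout are hypothetical.

```python
from bisect import bisect_left
from collections import defaultdict


def ribozyme_frequency(element_hits, ribozyme_hits, max_distance=6000):
    """Fraction of conserved-element hits with a ribozyme within max_distance.

    element_hits, ribozyme_hits: lists of (fragment_id, position) tuples
    (hypothetical layout; one coordinate per feature, strand ignored).
    """
    if not element_hits:
        return 0.0

    # Index ribozyme positions by sequence fragment, sorted for binary search.
    by_fragment = defaultdict(list)
    for fragment, pos in ribozyme_hits:
        by_fragment[fragment].append(pos)
    for positions in by_fragment.values():
        positions.sort()

    near = 0
    for fragment, pos in element_hits:
        positions = by_fragment.get(fragment, [])
        # First ribozyme at or beyond the left edge of the window.
        i = bisect_left(positions, pos - max_distance)
        if i < len(positions) and positions[i] <= pos + max_distance:
            near += 1
    return near / len(element_hits)
```

Running the function once with hammerhead coordinates and once with twister coordinates would give the two frequencies per conserved element; passing a smaller `max_distance` mirrors the shorter distances chosen for some elements.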
Evaluation of ribozyme self-cleavage during transcription
Experiments were conducted on two twister sister representatives (TS-1 and TS-2) identified in a human gut metagenome[45], found respectively in nucleotides 361 to 522 in sequence accession “scaffold1830_2_V1.CD-8” and the reverse complement of nucleotides 574 to 689 in sequence “scaffold909_4_MH0022.” Double-stranded DNA templates encoding these RNAs were generated by extending the synthetic DNA 5′-TAATACGACTCACTATAGG (containing the T7 RNA polymerase promoter sequence) on the appropriate template DNA by using Taq DNA polymerase. In vitro transcription reactions were performed as previously described[8]. Internally 32P-labeled products were separated using denaturing (8 M urea) 15% polyacrylamide gel electrophoresis (PAGE) and detected with a Typhoon Trio+ Variable Mode Imager (GE Healthcare).
Cleavage assays for bimolecular ribozyme constructs
Substrate and enzyme RNAs for bimolecular complexes as designated were individually synthesized, either by in vitro transcription or by solid-phase chemical synthesis (Sigma-Aldrich), resulting in the elimination of loop sequences that otherwise join P5. Substrate RNAs, including those containing the 2′-deoxycytosine modification, were purchased from Sigma-Aldrich and 5′ radiolabeled using [γ-32P]ATP and T4 polynucleotide kinase (New England Biolabs) according to the manufacturer’s instructions. Double-stranded DNAs encoding enzyme RNAs were prepared as described above and used as templates for in vitro transcription reactions, also as described above. Before being added to cleavage reactions, all RNAs were purified by denaturing PAGE; the appropriate product bands were eluted from gel slices using 10 mM Tris-HCl (pH 7.5 at 23°C), 200 mM NaCl and 1 mM EDTA, and concentrated by precipitation with ethanol.
Bimolecular cleavage reactions were incubated at room temperature in standard reaction conditions: 30 mM HEPES (pH 7.5 at 23°C), 100 mM KCl, 20 mM MgCl2. Reactions contained 50 nM 32P-labeled substrate and 1000 nM unlabeled enzyme RNA, unless otherwise indicated. Substrate and enzyme RNAs were combined in reaction buffer lacking magnesium, heated to 80°C for 1 min, and cooled to 23°C before cleavage reactions were initiated by the addition of MgCl2. Unless otherwise indicated, reactions were halted by adding an equal volume of stop solution (90% formamide, 50 mM EDTA, 0.05% xylene cyanol and 0.05% bromophenol blue). Reaction products were separated using denaturing 20% PAGE, and then were imaged as described above and quantified by using ImageQuaNT software (GE Healthcare Life Sciences).
Cleavage site mapping
The reaction for TS-1 ribozyme cleavage site mapping contained ~100 nM 5′ 32P-labeled substrate RNA and ~500 nM enzyme RNA, and was incubated at 23°C for 30 min in the presence of 5 mM MgCl2 under otherwise standard conditions. To prepare the RNA marker lanes, the radiolabeled substrate was partially digested with RNase T1 nuclease [25 mM sodium citrate (pH 5.0 at 23°C), 4 M urea, 0.6 mM EDTA, and 0.2 U per L RNase T1 (from Aspergillus oryzae; Boehringer Mannheim) for 11 min at 55°C] or with alkali [50 mM Na2CO3 (pH 9.0 at 23°C) and 1 mM EDTA for 7 min at 90°C]. Samples were mixed with equal volumes of a urea-containing gel loading buffer and analyzed by denaturing 20% PAGE.
Similarly, the bimolecular pistol ribozyme construct was based on a representative from A. putredinis (reverse complement of nucleotides 466281 to 466361 in RefSeq accession NZ_ABFK02000017.1). The enzyme strand was obtained by in vitro transcription from synthetic DNA oligonucleotides (Sigma-Aldrich) that were made double stranded as described above. The substrate strand RNA was chemically synthesized (Sigma-Aldrich), 5′ 32P-labeled and purified also as described above. The same methods were used to generate marker lanes and to conduct the ribozyme assay as noted above.
Likewise, the bimolecular hatchet ribozyme construct was based on a representative from unclassified environmental sequences (reverse complement of nucleotides 23144 to 23308 in the previously mentioned human gut metagenome accession scaffold115765_3_MH0070). Both the enzyme and substrate strands were chemically synthesized (Sigma-Aldrich). The substrate strand RNA was 5′ 32P-labeled and purified also as described above. The same methods were used to generate marker lanes and to conduct the ribozyme assay as noted above.
Mass spectrum analysis of ribozyme cleavage products
A 50 μL reaction containing 30 mM HEPES (pH 7.5 at 23°C), 100 mM KCl, 5 mM MgCl2, 4 μM substrate and 4 μM enzyme RNAs (the bimolecular TS-1 construct, Fig. 2a) was incubated for 60 min at 23°C. Following precipitation with ethanol and sedimentation by centrifugation, the reaction products were resuspended in 20 μL deionized H2O and subjected to monoisotopic (exact-mass) spectrometry (Novatia LLC).
Measurements of observed rate constants (kobs)
Ribozyme cleavage assays for determining kobs values were performed using the bimolecular TS-1 construct depicted in Fig. 2a. It is not known whether the kobs values reflect only the chemical step of the RNA cleavage reaction, or whether slower structural transitions limit the rate constants observed. All kobs values were established under standard reaction conditions (see above), except that Mg2+ concentrations and pH conditions were varied as noted for each experiment. Cleavage reactions were terminated using a stop solution containing 90% formamide, 50 mM EDTA, 0.05% xylene cyanol and 0.05% bromophenol blue that was added in a volume equal to the ribozyme reaction mixture. The fraction of 5′ 32P-labeled substrate RNA cleaved over time was quantified after separation by denaturing PAGE as described above. Apparent first-order rate constants were determined by nonlinear curve fitting using GraphPad Prism[8] (GraphPad Software).
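As an illustration of how an apparent first-order rate constant can be extracted from such time courses, the sketch below fits fraction cleaved versus time. It is not the GraphPad Prism procedure used here; it assumes the common single-exponential model f(t) = Fmax(1 − e^(−kobs·t)), which the text does not state explicitly, and uses only the Python standard library. The function name `fit_kobs` and the search bounds are hypothetical choices.

```python
import math


def fit_kobs(times, fractions, k_min=1e-4, k_max=1e3, grid=200, refine=60):
    """Least-squares fit of f(t) = Fmax * (1 - exp(-k * t)); returns (k, Fmax).

    The model and the search bounds are assumptions made for illustration.
    """
    def amplitude(k):
        # For a fixed k the best Fmax has a closed form (linear least squares).
        basis = [1.0 - math.exp(-k * t) for t in times]
        denom = sum(b * b for b in basis)
        if denom == 0.0:
            return 0.0
        return sum(b * f for b, f in zip(basis, fractions)) / denom

    def sse(log_k):
        k = math.exp(log_k)
        a = amplitude(k)
        return sum((f - a * (1.0 - math.exp(-k * t))) ** 2
                   for t, f in zip(times, fractions))

    # Coarse scan over log(k) to locate the neighborhood of the best fit.
    lo_all, hi_all = math.log(k_min), math.log(k_max)
    step = (hi_all - lo_all) / (grid - 1)
    points = [lo_all + i * step for i in range(grid)]
    best = min(points, key=sse)

    # Golden-section refinement inside the bracketing grid interval.
    lo, hi = max(lo_all, best - step), min(hi_all, best + step)
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(refine):
        c = hi - inv_phi * (hi - lo)
        d = lo + inv_phi * (hi - lo)
        if sse(c) < sse(d):
            hi = d
        else:
            lo = c
    k = math.exp((lo + hi) / 2.0)
    return k, amplitude(k)
```

The returned rate constant has the reciprocal of whatever time unit is supplied (for example min⁻¹ if times are in minutes). A dedicated fitting package would also report confidence intervals, which this sketch omits.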