Melanie A Preston1,2, Douglas F Porter1,3, Fan Chen4, Natascha Buter1,2, Christopher P Lapointe1,5, Sunduz Keles4,6, Judith Kimble1,7, Marvin Wickens8. 1. Department of Biochemistry, University of Wisconsin-Madison, Madison, WI, USA. 2. Promega Corporation, Madison, WI, USA. 3. Program in Epithelial Biology, Stanford University Medical School, Stanford, CA, USA. 4. Department of Statistics, University of Wisconsin-Madison, Madison, WI, USA. 5. Department of Structural Biology, Stanford University, Stanford, CA, USA. 6. Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, USA. 7. Howard Hughes Medical Institute, University of Wisconsin-Madison, Madison, WI, USA. 8. Department of Biochemistry, University of Wisconsin-Madison, Madison, WI, USA. wickens@biochem.wisc.edu.
Abstract
Ribonucleotidyl transferases (rNTases) add untemplated ribonucleotides to diverse RNAs. We have developed TRAID-seq, a screening strategy in Saccharomyces cerevisiae to identify sequences added to a reporter RNA at single-nucleotide resolution by overexpressed candidate enzymes from different organisms. The rNTase activities of 22 previously unexplored enzymes were determined. In addition to poly(A)- and poly(U)-adding enzymes, we identified a cytidine-adding enzyme that is likely to be part of a two-enzyme system that adds CCA to tRNAs in a eukaryote; a nucleotidyl transferase that adds nucleotides to RNA without apparent nucleotide preference; and a poly(UG) polymerase, Caenorhabditis elegans MUT-2, that adds alternating uridine and guanosine nucleotides to form poly(UG) tails. MUT-2 is known to be required for certain forms of RNA silencing, and mutants of the enzyme that result in defective silencing did not add poly(UG) tails in our assay. We propose that MUT-2 poly(UG) polymerase activity is required to promote genome integrity and RNA silencing.
Covalent modifications pervade biological regulation. RNAs are extensively
modified: 5′ termini often are capped, internal positions are altered both on
ribose rings and bases, and 3′ termini receive untemplated nucleotides,
referred to as “tails”. In eukaryotes, tails occur on most classes of
RNA and control their processing, transport, stability, and function. Tails and
enzymes that add them are critical in biology. Uridylation is implicated in
tumorigenesis, proliferation, stem cell maintenance, and immunity[1-3], and polyadenylation in early development, cancer, and
memory[4-6]. Global approaches are needed to uncover
previously undetected tailing systems.

The DNA polymerase β-like superfamily of nucleotidyl transferases
appends nucleotides to divergent substrates, including RNAs, nucleotides, and
antibiotics[7,8]. Ribonucleotidyl transferases (rNTases) add
nucleotides to RNAs without using a template, and include poly(A) polymerases
(PAPs), poly(U) polymerases (PUPs, aka TUTases), and CCA-adding enzymes that act on
tRNAs[9]. PAPs and PUPs
cannot be distinguished definitively by their protein sequences.

We suspected that other types of rNTases and tails exist but have escaped
detection. Studies in vitro and in Xenopus oocytes
identified many rNTase activities[10,11], but are
incompatible with genome-wide analyses. Similarly, powerful sequencing methods can
identify tails on cellular RNAs without bias[12,13], but tails could be
missed if they are added only at specific times or in certain cell types, exist only
transiently, or reside on blocked termini. The challenge is to uncover all forms of
tails, and identify enzymes responsible, at a genome-wide scale.

We developed an approach to identify enzymes that add non-templated
nucleotides to RNAs. Candidate rNTases were tethered to a reporter RNA in S.
cerevisiae, and the number and identity of nucleotides added were
determined at single-nucleotide resolution. The approach revealed previously
undetected enzymes and tails, including a eukaryotic system with separate enzymes
that add CC and A to tRNAs and a poly(UG) polymerase that adds alternating U and G
residues. Mutations in the gene encoding this poly(UG) polymerase elevate
transposition frequency[14,15], disrupt RNA silencing[16-18], and impair RNA interference[19-22]. The poly(UG) polymerase and poly(UG) tails likely are required
for these events.
RESULTS
An in cellulo tethering assay identifies rNTase activities
To identify rNTase activities we developed TRAID-Seq (tethered rNTase
activity identified by high-throughput sequencing). Enzymes were fused to MS2
coat protein (MS2), and co-expressed in yeast with a reporter RNA bearing
high-affinity MS2 binding sites. The interaction of MS2 with its binding sites
tethered the fusion protein to the RNA[23], and circumvented proteins that might bring the rNTase
to its endogenous substrates.

To develop a reporter RNA, we first expressed an RNase P-derived RNA
bearing two MS2 binding sites in cells containing MS2-PUP fusions (C.
elegans PUP-2 or S. pombe Cid1, Supplementary Fig. 1a). RT-PCR
analysis designed to detect U or A tails revealed that U tails were added and
required a functional active site; however, high levels of endogenous
polyadenylation in the absence of expressed rNTases complicated analysis (Supplementary Fig. 1b,
c). We constructed
an alternative RNA substrate based on S. cerevisiae
tRNASer(AGA), in which its four-base pair variable arm was
replaced with one MS2 binding site (Fig.
1a). This reporter tRNA had virtually undetectable background
polyadenylation, judged by gel analysis of reaction products (Fig. 1b), and enabled unambiguous classification of
rNTase activities.
Figure 1.
TRAID-Seq assay measures nucleotide addition activity in
vivo.
(a, b) TRAID-Seq strategy. (a)
tRNASer(AGA) variable arm (gray) is mutated to an MS2 stem loop
(cyan) to form the tRNA reporter. (b) Left, tRNA reporter is
co-expressed with an MS2 coat protein-rNTase fusion in S.
cerevisiae. The tethered rNTase adds nucleotides to the 3′
end of the tRNA. Right, RT-PCR analysis to detect A tails or U tails added by
control rNTases, relative to empty vector or a no-reporter control. Lanes marked
with a dash indicate reactions performed without reverse transcriptase.
Representative gel image from four independent experiments. (c)
Schematic of sample processing. (d, e) Tail-o-grams of nucleotides
added by control rNTases, C. elegans PUP-2 (d) and C.
elegans GLD-2 (e). Percent of each nucleotide at each tail length
is color-coded and plotted on the left y-axis; U (green), C (yellow), G
(purple), A (brown). Tail lengths of five nucleotides or greater are shown for
clarity (see Online Methods). The number of tails detected per million heptamers
(TPMH) is indicated by black diamonds and corresponds to the log scale on the
right y-axis.
The assay accurately recapitulated activities of well-characterized
rNTases. As proof-of-principle, we analyzed C. elegans
PUP-2[11]
(CePUP-2), S. pombe Cid1[11,24] (SpCid1), and a known PAP, C.
elegans GLD-2[25]
(CeGLD-2). In RT-PCR assays, a U-specific primer yielded
products with CePUP-2 and SpCid1, while an
A-specific primer yielded products with CeGLD-2 (Fig. 1b). Tails were not detected using
catalytically-inactive mutants of CePUP-2 and
SpCid1, nor in cells that lacked the proteins or reporter
RNA (Fig. 1b).

To identify tails of any nucleotide composition and length, we used
high-throughput sequencing (Fig. 1c). Total
RNA from each sample was ligated to a DNA adapter to attach a known sequence to
the 3′ ends of all RNAs. The adapter enabled sequencing of added tails
and introduced a seven-nucleotide randomized sequence (random heptamer) to
facilitate computational removal of PCR duplicates. Following reverse
transcription, samples were PCR-amplified with primers specific for the reporter
tRNA and 3′ adapter. Gel-purified products were subjected to Illumina
paired-end sequencing.

We computationally extracted added tails, defined as nucleotides between
the reporter tRNA 3′ end (including the CCA) and the random heptamer.
After removing PCR duplicates, we quantitated and plotted the number of unique
tails, tail length, and the nucleotide composition at each detected tail length
in “tail-o-grams”. In these plots, each tail length was assessed
as a population to determine the percent of each nucleotide added among all
tails of that length, and proportions were color-coded by nucleotide. Numbers of
reads at each tail length were normalized to the number of unique random
heptamers (TPMH, tails per million heptamers) and displayed on a log scale.

The assay was accurate and sensitive. CePUP-2 and
SpCid1 added tails primarily of uridines, and
CeGLD-2 added tails of adenosines (Fig. 1d–e,
Supplementary Fig.
2a), consistent with their known specificities. Furthermore, the high
sensitivity enabled detection of secondary nucleotide addition preferences. For
example, SpCid1 added uridine tails with 8.6% adenosine (Supplementary Fig. 2a),
consistent with its ability to add both A and U in
vitro[24,26].

TRAID-Seq circumvents the need for purified enzymes and precisely
identifies thousands of independently added tails, enabling sensitive
determination of their sequences and relative abundances.
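As an illustration only, the tail extraction, PCR-duplicate removal, and per-length summary described above might be sketched as follows. The read layout (reporter 3′ end, then tail, then random heptamer, then constant adapter sequence), the function names, and the interpretation of TPMH as unique tails of a given length per million unique heptamers are assumptions made for this sketch; they are not taken from the published pipeline. Sequencing reads are DNA, so uridines appear as T.

```python
from collections import Counter, defaultdict

HEPTAMER_LEN = 7  # length of the random barcode described in the text


def extract_tail(read, reporter_3p_end, adapter_const):
    """Return (tail, heptamer) for one read, or None if the layout is absent.

    Assumed (hypothetical) layout:
    ...reporter_3p_end + tail + heptamer + adapter_const...
    """
    start = read.find(reporter_3p_end)
    if start == -1:
        return None
    tail_start = start + len(reporter_3p_end)
    # The constant adapter must leave room for a full heptamer after the tail.
    adapter_pos = read.find(adapter_const, tail_start + HEPTAMER_LEN)
    if adapter_pos == -1:
        return None
    heptamer_start = adapter_pos - HEPTAMER_LEN
    return read[tail_start:heptamer_start], read[heptamer_start:adapter_pos]


def summarize_tails(reads, reporter_3p_end, adapter_const):
    """Collapse duplicates, then report composition and TPMH per tail length."""
    unique = set()
    for read in reads:
        hit = extract_tail(read, reporter_3p_end, adapter_const)
        if hit is not None:
            # Identical (tail, heptamer) pairs are treated as PCR duplicates.
            unique.add(hit)

    # Assumption: normalize to all unique heptamers seen, tailed or not.
    n_heptamers = len({heptamer for _, heptamer in unique})

    by_length = defaultdict(list)
    for tail, _ in unique:
        if tail:
            by_length[len(tail)].append(tail)

    summary = {}
    for length, tails in by_length.items():
        counts = Counter("".join(tails))
        total = sum(counts.values())
        summary[length] = {
            "percent": {nt: 100 * counts[nt] / total for nt in "ACGT"},
            "tpmh": 1e6 * len(tails) / n_heptamers,
        }
    return summary
```

Each entry of the returned dictionary corresponds to one column of a tail-o-gram: the nucleotide percentages for the stacked bar and the TPMH value for the diamond.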
PUPs, PAPs, and CCA-adding enzymes
We used TRAID-Seq to analyze nucleotide specificities of characterized
and previously untested rNTases. We tested 40 proteins from seven species:
Homo sapiens (Hs, Fig. 2a), Caenorhabditis elegans
(Ce, Fig. 2b),
Aspergillus nidulans (An), Candida
albicans (Ca), Neurospora crassa
(Nc), Schizosaccharomyces pombe
(Sp), and Saccharomyces cerevisiae
(Sc) (Fig. 2c).
Candidate rNTases were identified by the presence of a characteristic G(G/S)
X7–13 DhDh motif and a downstream third
aspartate[9]. To focus on
noncanonical rNTases, we included putative rNTases with at least a partial type
II nucleotide recognition motif (NRM)[8,9], and excluded
canonical PAPs, which are distinguished by a type I NRM[9].
Figure 2.
Analyses of nucleotide addition activities of 40 noncanonical rNTases from
seven species.
Overall percentages of each nucleotide added by (a)
H. sapiens, (b)
C. elegans, and (c) fungal rNTases.
(d) Categorization of rNTases as PUPs, PAPs, CCA-adding
enzymes, or those with unique activities. rNTases are color-coded by organism.
Gray boxes (top) indicate previously characterized (known) enzymes, and black
boxes (bottom) indicate enzyme activities identified in this study (new).
Nucleotide addition activities were classified by the nucleotide
composition of all tails added to the reporter tRNA (Fig. 2, Table
1). For example, if added tails consisted of primarily uridines, then the
rNTase was classified as a PUP. By this criterion, we uncovered two PUPs and 12
previously uncharacterized PAPs. We also identified likely CCA-adding enzymes in
N. crassa, C. albicans and C. elegans,
consistent with homology predictions. Tails added by these enzymes,
CeHPO-31, CaCca1, and
NcNCU08022, are primarily composed of C and A (Fig. 2b, c) and
show an enrichment for the repeating CCA pattern. The p-values
of CCA occurrence among tails added by each enzyme, determined using a one-sided
Wald’s test, are highly significant (adjusted p-values
less than 1.6 × 10−22).
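The composition-based classification described above can be illustrated with a minimal sketch. The 50% cutoff for "primarily", the labels, and the function names are hypothetical choices made for this example; the text does not state a numerical threshold.

```python
from collections import Counter

# Hypothetical labels for the two homopolymer-adding classes named in the text.
LABELS = {"U": "PUP-like (mostly U)", "A": "PAP-like (mostly A)"}


def overall_composition(tails):
    """Fraction of each nucleotide across all added tails (RNA alphabet)."""
    counts = Counter("".join(tails))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {nt: counts[nt] / total for nt in "ACGU"}


def classify(tails, threshold=0.5):
    """Label an enzyme by its dominant nucleotide; threshold is an assumption."""
    composition = overall_composition(tails)
    if not composition:
        return "no tails"
    top = max(composition, key=composition.get)
    if composition[top] < threshold:
        return "mixed"
    return LABELS.get(top, f"mostly {top}")
```

A composition summary of this kind cannot by itself distinguish a CCA-adding enzyme or a poly(UG) polymerase from an indiscriminate enzyme, which is why the sequence patterns within tails were also examined.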
Table 1:
Summary of nucleotide addition preferences of tested rNTases and NRM
analysis
rNTase          | Species       | Nucleotide Preference       | Histidine in NRM
Cid1            | S. pombe      | U                           | Yes
PUP-1           | C. elegans    | U                           | Yes
PUP-2           | C. elegans    | U                           | Yes
TUT4            | H. sapiens    | U                           | Yes
TUT7            | H. sapiens    | U                           | Yes
NCU04364.7      | N. crassa     | U                           | Yes
Cid16           | S. pombe      | U                           | No (Lys)
PUP-3           | C. elegans    | U                           | No (Arg)
F43E2.1         | C. elegans    | U                           | No (Arg)
F31C3.2         | C. elegans    | A, U, C, G (indiscriminate) | Yes
MUT-2           | C. elegans    | U, G                        | Yes
TUT1 (Star-PAP) | H. sapiens    | A                           | Yes
ZK863.4         | C. elegans    | A                           | Yes
cutB            | A. nidulans   | A                           | No (Asn)
Trf4            | S. cerevisiae | A                           | No (Asn)
Trf5            | S. cerevisiae | A                           | No (Asn)
Trf4            | C. albicans   | A                           | No (Asn)
CR_03940W_A     | C. albicans   | A                           | No (Asn)
GLD-2           | C. elegans    | A                           | No (Asn)
GLD-4           | C. elegans    | A                           | No (Asn)
TENT2 (GLD2)    | H. sapiens    | A                           | No (Asn)
TENT4B (PAPD5)  | H. sapiens    | A                           | No (Asn)
TENT4A          | H. sapiens    | A                           | No (Asn)
TRF4            | N. crassa     | A                           | No (Asn)
Cid12           | S. pombe      | A                           | No (Asn)
Cid14           | S. pombe      | A                           | No (Asn)
cutA            | A. nidulans   | A                           | No (Arg)
NCU00538.7      | N. crassa     | A                           | No (Arg)
Cid11           | S. pombe      | A                           | No (Arg)
Cid13           | S. pombe      | A                           | No (Arg)
MTPAP           | H. sapiens    | A                           | No (Leu)
NCU11050.7      | N. crassa     | A                           | No (Leu)
C53A5.16        | C. elegans    | A                           | No (Pro)
F43H9.3         | C. elegans    | A                           | No (not aligned)
RPN1            | C. albicans   | A                           | No (Glu)
“Histidine in NRM” indicates the amino acid in each
rNTase that corresponds to histidine 336 of SpCid1. Bold text indicates PUPs
that do not have histidine in the NRM, and rNTases that can add nucleotides
other than uridine and have histidine in the NRM.
Enabled by the sensitivity of TRAID-Seq, we confirmed nucleotide
specificities of previously characterized rNTases[10,11,24,25,27-32] and
identified surprising secondary preferences in certain enzymes.
SpCid13 and SpCid14 were both previously
identified as PAPs[27] but also
added other nucleotides in TRAID-Seq. SpCid13 added 90.3%
adenosine (s.d. 0.3%; n=4), yet also added 6.0% cytosine (s.d. 0.3%; n=4; Fig. 2c, Supplementary Fig. 2b).
SpCid14 added 77.9% adenosine (s.d. 1.2%; n=3) and 19.7%
guanosine (s.d. 0.8%; n=3; Fig. 2c, Supplementary Fig. 2c).
Analysis of the patterns of nucleotides added by enzymes with secondary
preferences revealed no specific sequence motifs within the tails.

Application of TRAID-Seq enabled us to identify new PAPs, PUPs, and
CCA-adding enzymes (Fig. 2d) and reveal
enzymes with previously unknown activities, as described below.
C tails and a eukaryotic two-enzyme CCA-adding system
We identified a S. pombe rNTase that adds primarily
cytidines. Based on sequence similarity, S. pombe SPAC1093.04
(SpSPAC1093.04) is predicted to be a CCA-adding enzyme, a
highly conserved rNTase subfamily that adds CCA to tRNA 3′ termini. In
TRAID-Seq, SpSPAC1093.04 yielded tails predominantly of
oligo(C) or oligo(A) on reporter tRNAs with a CCA 3′ end (Fig. 2c, Fig. 3a;
cytosine=46.0%, s.d. 6.0%; adenosine = 52.8%, s.d. 5.9%; n=5). The oligo(A) may
have been added by endogenous PAPs in the TRAMP complex[33], perhaps in competition with the
tethered enzyme. Furthermore, reporters with CC 3′ termini received
almost exclusively oligo(C) (Fig. 3a).
Tails added by SpSPAC1093.04 and the S.
cerevisiae CCA-adding enzyme (ScCca1) were
distinct (Fig. 3a, b). Most tails added by ScCca1
contained multiple CCA motifs. In contrast, SpSPAC1093.04 added
long cytosine stretches of up to 19 cytosines. Computational analyses of
sequence motifs in tails added by SpSPAC1093.04 and
ScCca1 confirmed differences in their activities: the
trinucleotide CCA was highly enriched with ScCca1 but not
SpSPAC1093.04 (Fig.
3c). Both enzymes added tails significantly enriched for CC
dinucleotides, as expected (Fig. 3c; Supplementary Fig. 3). We
conclude that SpSPAC1093.04 possesses a distinctive C-addition
activity.
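The two features that distinguish these enzymes, long cytosine stretches versus repeated CCA motifs, can be measured per tail with a short sketch such as the one below. The function names are hypothetical, and this descriptive count is not the motif-effect model used for Fig. 3c.

```python
import re


def longest_run(tail, nucleotide):
    """Length of the longest uninterrupted stretch of one nucleotide."""
    runs = re.findall(f"{re.escape(nucleotide)}+", tail)
    return max((len(run) for run in runs), default=0)


def count_motif(tail, motif):
    """Number of (possibly overlapping) occurrences of motif in tail."""
    count = 0
    start = tail.find(motif)
    while start != -1:
        count += 1
        start = tail.find(motif, start + 1)
    return count
```

Under this sketch, a tail such as CCACCACCA scores three CCA motifs with a longest C run of two, whereas an oligo(C) tail scores zero CCA motifs and a C run equal to its length.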
Figure 3.
Nucleotide addition activity of S. pombe SPAC1093.04 and
S. cerevisiae Cca1.
(a) Left, tail-o-gram depicting nucleotide composition at
each tail length added by S. pombe SPAC1093.04 and number
of tails normalized to unique heptamer sequences. Right, most abundant tail
sequences added to tRNA reporter containing a 3′ CC, or 3′ CCA
end. (b) Left, tail-o-gram depicting nucleotide composition at each
tail length added by S. cerevisiae Cca1 and number of
tails normalized to unique heptamer sequences. Right, most abundant tail
sequences added to tRNA reporter containing a 3′ CCA end.
(c) Sequence motif effect analysis of tails added by
SpSPAC1093.04 (red, n=5) and ScCca1
(black, n=3). Each adjusted p-value quantifies the significance
of contribution of the indicated oligonucleotide to the variation in tail
sequence read counts. Significances for dinucleotide (CC) and trinucleotides
(CCA) after multiplicity correction with the Bonferroni procedure are shown. A
dashed line indicates significance level 0.05. The -log10
p-values from left to right in the figure are 300, 148, 0.87,
and 313. (d) cca1-1 mutant strains containing
CEN plasmids expressing the indicated genes were serially
diluted, spotted on SD-Ura-Leu media and grown at 37°C for 3 days or
23°C for 4 days. This experiment was repeated twice with similar
results.
The S. pombe genome encodes a second enzyme
(SpSPCC645.10) in the CCA-adding enzyme subfamily, which
yielded tails almost entirely of adenosines (Fig.
2c, 96.3%, s.d. 0.7%). Thus, we thought that
SpSPAC1093.04 and SpSPCC645.10 might act
sequentially to add CCA to tRNAs, with SpSPAC1093.04 adding two
C’s and SpSPCC645.10 adding the terminal A.

To test this idea, we asked whether the two S. pombe
genes rescued lethality due to loss of CCA-adding activity in
S. cerevisiae. We used a cca1–1
mutant strain bearing a temperature-sensitive (ts) allele of
the essential CCA1 gene, which encodes the single protein that
adds CCA to tRNAs in S. cerevisiae[34]. SPAC1093.04 and SPCC645.10 were
expressed in the cca1–1 strain using the
CCA1 promoter and terminator sequences on single-copy
plasmids. Effects on temperature sensitivity were assessed in strains expressing
the S. pombe proteins either individually or together, and with
empty vector controls (Fig. 3d, Supplementary Fig.
4).

Coexpression of both S. pombe enzymes rescued absence
of endogenous CCA-addition activity in S. cerevisiae.
cca1–1 temperature sensitivity at 37°C was
fully rescued by co-expression of SPAC1093.04 and SPCC645.10, and by the
wild-type CCA1 positive control. Expression of SPAC1093.04
alone partially suppressed the cca1–1 ts phenotype.
Expression of SPCC645.10 alone or catalytically inactive versions of SPAC1093.04 and
SPCC645.10 failed to rescue temperature sensitivity. We suggest that SPAC1093.04
and SPCC645.10 cooperate to add CCA to tRNAs to rescue the cca1–1
ts phenotype, and that this collaboration is also necessary for CCA
addition to tRNAs in S. pombe. Both enzymes are essential in
S. pombe[35] and members of the CCA-adding enzyme subfamily. We propose
this is the first identified dual-enzyme CCA-addition system in a eukaryote; our
data are supported by a recent report[36,37].
An enzyme with broad specificity
C. elegans F31C3.2 displayed a uniquely broad
nucleotide specificity (Fig. 2b, Supplementary Fig. 5a).
The majority of nucleotides added were adenosines and uridines, but guanosines
and cytosines also were prominent. The nucleotide composition of tails
paralleled intracellular ribonucleotide concentrations in S.
cerevisiae (Supplementary Fig. 5b). The added tails yielded no discernible
pattern or sequence motif, and computational analysis of all 16 possible
dinucleotide sequences revealed no statistically significant enrichment among
the added tails (Supplementary
Fig. 5c). We suggest that CeF31C3.2 is relatively
indiscriminate in nucleotide preference and so provisionally refer to it as
“nucleotide polymerase-1” (CeNPOL-1).
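For orientation, the frequencies of all 16 dinucleotides within a set of tails can be tabulated as in the sketch below. This is a descriptive count only, with hypothetical function names; it does not reproduce the statistical model behind the enrichment p-values reported here.

```python
from collections import Counter
from itertools import product

# All 16 possible dinucleotides in the RNA alphabet.
DINUCLEOTIDES = ["".join(pair) for pair in product("ACGU", repeat=2)]


def dinucleotide_fractions(tails):
    """Fraction of each dinucleotide among all overlapping pairs in the tails."""
    counts = Counter()
    for tail in tails:
        for i in range(len(tail) - 1):
            counts[tail[i:i + 2]] += 1
    total = sum(counts[d] for d in DINUCLEOTIDES)
    if total == 0:
        return {d: 0.0 for d in DINUCLEOTIDES}
    return {d: counts[d] / total for d in DINUCLEOTIDES}
```

For an enzyme without sequence preference, these fractions would be expected to track the products of the single-nucleotide frequencies rather than favor particular pairs.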
A poly(UG) polymerase required for RNA silencing
C. elegans MUT-2 added tails with a 1:1 ratio of
uridines and guanosines (Fig. 2b, Fig. 4a, b). CeMUT-2 tails consisted of striking, polymeric
sequences of alternating U and G (Fig. 4b).
Computational analysis confirmed repetitive UG addition, and revealed that tails
began with either uridine or guanosine. We refer to CeMUT-2 as a
poly(UG) polymerase.
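A simple way to flag tails with this alternating pattern, allowing either U or G at the first position, is sketched below. The function names and the minimum length of two nucleotides are assumptions for this illustration, not the criteria used in the computational analysis.

```python
import re


def is_alternating_ug(tail):
    """True if the tail contains only U and G and no two neighbors are equal."""
    if len(tail) < 2 or set(tail) - {"U", "G"}:
        return False
    return all(a != b for a, b in zip(tail, tail[1:]))


def longest_ug_repeat(tail):
    """Largest number of consecutive UG units found anywhere in the tail."""
    blocks = re.findall(r"(?:UG)+", tail)
    return max((len(block) // 2 for block in blocks), default=0)
```

The first function accepts both UGUGUG and GUGUGU, whereas the second reports the length of the longest UG block, which is useful for tails that carry a few other nucleotides at either end.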
Figure 4.
CeMUT-2 is a poly(UG) polymerase.
(a) Tail-o-gram depicting CeMUT-2
nucleotide addition activity in TRAID-Seq. (b) The most abundant
tail sequences identified in two biological replicates of
CeMUT-2 TRAID-Seq assays. (c) UG tail sequences
from two biological replicates of CeMUT-2 activity in
X. laevis oocytes. (d) Analysis of all
possible dinucleotides in the tails added by CeMUT-2 from 8
independent biological replicates. A heatmap of -log10
p-values for individual dinucleotides is shown. Each
p-value quantifies the significance of adjusted
contribution of each dinucleotide to the variation in tail sequence read counts.
Dinucleotides with a significant effect after multiplicity correction at
significance level 0.05 are marked with an asterisk (*). Details of statistical
analyses performed are provided in the Online Methods (Computational Analyses of
Sequence Motifs).
To determine whether this unusual activity was influenced by the
reporter tRNA, we used a different reporter RNA, derived from RNase P RNA (Supplementary Fig. 6a).
This RNA had neither a CCA 3′ end nor similarity to tRNAs.
CeMUT-2 again added tandem UG repeats, as demonstrated by
representative sequences from three biological replicates (Supplementary Fig. 6b), and did so
on multiple termini formed on RNase P reporter RNA prior to tailing.

We tested CeMUT-2 in a different organism and cell type
to further examine whether UG addition was intrinsic to the protein.
CeMUT-2-MS2 protein was expressed in Xenopus
laevis oocytes via mRNA injection. We then injected a reporter RNA
with MS2 binding sites and a sequence distinct from the yeast reporters. Tails
were detected on 35–37% of reporter RNA molecules from two biological
replicates. Replicate 1 resulted in 43 independently cloned reporter sequences,
16 of which had added tails. Replicate 2 resulted in 31 independently cloned
reporter sequences, 11 of which had added tails. All sequenced tails contained
tandem UG repeats (Fig. 4c). Short uridine
stretches also were observed, perhaps due to Xenopus TUT4/TUT7
activity.

We evaluated sequences of CeMUT-2-catalyzed tails from
all TRAID-Seq experiments to quantify enrichment of each of the possible 16
dinucleotide pairs (Fig. 4d).
5′-GU-3′ and 5′-UG-3′ were highly enriched, with
-log10 (p-values) of 7.3 and 6.2. The UG repeats
are essentially perfectly repeated throughout the tails added, a remarkable
pattern not observed in tails added by known nucleotidyl transferases.

We also assayed a construct corresponding to another predicted splicing
isoform of CeMUT-2 (mut-2b; https://wormbase.org/species/c_elegans/gene/WBGene00003499#0-1-3).
Only CeMUT-2a exhibited UG-addition activity (Fig. 5a, b).
Figure 5.
CeMUT-2 mutants defective for RNAi lack poly(UG) polymerase
activity.
(a) Schematic of CeMUT-2 isoforms and
tested mutations, known catalytic mutants (pink), and mutants identified in
forward genetic screen[19]
(blue). NTD, Nucleotidyl transferase domain; PAPd, Poly(A) polymerase-associated
domain. * indicates that a truncated version of CeMUT-2 was
made to recapitulate this nonsense mutant. (b) Percent of
nucleotides added by each CeMUT-2 enzyme variant. Percent of
tails containing UG repeats, standard deviation, and number of biological
replicates are indicated. (c) Model depicting potential roles of
poly(UG) tails in small RNA amplification in C. elegans.
Poly(UG) tails could directly recruit RNA-dependent RNA polymerase (RdRP)
(left). Alternatively, poly(UG) tails could be identified by a poly(UG) binding
protein (UG-BP), which then recruits RdRP (right). In both cases, the UG tails
could be single-stranded or form a higher-order structure.
The poly(UG) polymerase activity of CeMUT-2 (aka RDE-3)
likely is critical for RNAi. CeMUT-2 was first identified
genetically, in screens for elevated Tc1 transposition in C.
elegans[14] (hence
MUT-2 for “mutator”). Later, the same gene
emerged from a screen for genes critical for RNAi (hence RDE-3 for
“RNAi-defective”)[19]. The RNAi-defective screen
yielded six independent mut-2 alleles with mutations in regions
likely important for catalytic activity (Fig.
5a). We assayed CeMUT-2 proteins bearing each of
these mutations, and a CeMUT-2 protein engineered to be
inactive (DADA, Fig. 5b). All mutant
CeMUT-2 proteins lacked UG-addition activity, and the
nucleotide compositions of the few tails detected resembled the catalytically
inactive DADA enzyme and vector controls. Since C. elegans
mutants harboring these alleles are defective for suppression of transposition,
RNA interference, and RNA silencing, we propose that poly(UG) polymerase
activity is important in those events.
DISCUSSION
With TRAID-Seq, we tested proteins identified through sequence similarity to
rNTases, although the approach could be applied to enzymes that catalyze any RNA
modification detectable through sequencing, including certain base modifications.
Despite its sensitivity and ability to assay activity without purification of the
modified RNA or candidate rNTase, limitations of TRAID-Seq arise from the use of an
artificial substrate to which the enzyme is tethered, and from measuring activity in
a foreign cell.

The active site regions of the 17 PAPs and PUPs we identified bear on how U
and A are distinguished by rNTases. A histidine in rNTase active sites can dictate
nucleotide preference for U[38-41]. Yet three
PUPs we uncovered (SpCid16, CePUP-3 and
CeF43E2.1) lack that histidine (Table 1). Similarly, CeNPOL-1 and
CeMUT-2 can add purines, yet possess the active-site histidine.
These findings emphasize the complexity of nucleotide preferences among rNTases and
the need for further structural analyses.

TRAID-Seq may miss effects of the natural RNA substrates, co-factors or
protein partners of rNTases. For example, in mammalian cells,
HsTUT1/Star-PAP adds U’s to U6-snRNA[29], but adds A’s to a variety of
mRNAs[30]. In TRAID-Seq, we
detected a strong preference for A (adenosine 89.5%, s.d. 1.4%), and only low levels
of incorporation of other nucleotides (U=3.2%, s.d. 0.7%; C= 2.5%, s.d. 0.4%;
G=4.8%, s.d. 0.6%). A specific phosphoinositide enhances
HsTUT1/Star-PAP A addition activity in
vitro[30] and may
underlie these differences. In addition, Aspergillus AnCutA adds
CU-rich tails to RNAs in vivo and prefers CTP in
vitro[42,43]. In TRAID-Seq, AnCutA
predominantly added A (91.8%, s.d. 0.4%) vs. C (5.9%, s.d. 0.2%) or U (1.9%, s.d.
0.3%). In Aspergillus, AnCutB collaborates with
AnCutA to form CU-rich tails[42] but also added A’s in TRAID-Seq (98.7%, s.d. 0.2%)
vs. C (0.4%, s.d. 0.05%) or U (0.3%, s.d. 0.07%). These findings emphasize TRAID-Seq
as a starting point for further studies.

The sensitivity of TRAID-Seq revealed previously undetected nucleotide
addition capabilities that may underlie the addition of in vivo
tails that have been enigmatic. For example, three human PAPs
(HsTENT2, HsTENT4b, and HsTUT1)
are capable of G addition, albeit at a low level in our system (Fig. 2a), and could contribute to G-addition on mRNAs in
human cells[12]. Indeed,
HsTENT4a and HsTENT4b were recently implicated
in G-addition to mRNAs, which protects them from deadenylation[44]. The discovery of other human PAPs that add
mixed tails might indicate that other classes of RNAs are subject to regulation by
G-addition. The abilities of SpCid13 and SpCid14
to add C and G, respectively, in addition to A, might suggest an alternate mechanism
of RNA regulation in S. pombe. We suspect that the nature and roles
of tails are more varied than previously realized.

C. elegans NPOL-1 added tails composed of random
combinations of all four nucleotides. The levels of incorporation mirror
intracellular ribonucleotide concentrations, which may influence the proportions of
nucleotides added. CeNPOL-1 diverges in sequence from other enzymes
that can catalyze template-independent addition of any nucleotide tail[45], and belongs to a different
subfamily of nucleotidyl transferases[8]. Addition of random nucleotides within, or at the end of,
homopolymeric tails could interfere with RNA function[44]. It will be of interest to test the roles of
CeNPOL-1 in vivo.

We propose that SpSPAC1093.04 and
SpSPCC645.10 constitute a two-enzyme system that catalyzes CCA
addition to tRNAs in S. pombe. This is strongly suggested by their
nucleotide specificities and ability to jointly complement a yeast strain lacking a
functional CCA-adding enzyme, consistent with a recent report[36,37].
These studies are the first observations of a two-enzyme system in a eukaryote, and
await verification in S. pombe.

CeMUT-2, the poly(UG) polymerase, is remarkable both in
enzyme activity and its roles in RNA biology. We detected tails of up to 18 perfect
UG repeats; indeed, longer tails were likely added but went undetected due to
sequencing read limitations. The number of UG’s added in
vivo is not yet known. Alternating U and G addition bears comparison to
CCA addition via the single active site of CCA-adding enzyme[46,47].
Perhaps CeMUT-2 promotes consecutive rounds of UG-addition by
similarly repositioning the 3′-most UG relative to the active site.

The diverse roles of CeMUT-2 – preserving genome
integrity by suppressing transposition[14,15], silencing
transgenes[16-18] and promoting RNAi due to
exogenous dsRNA[19-22] – all likely reflect the same
underlying molecular mechanism. CeMUT-2 biological functions likely
hinge on its poly(UG) polymerase activity, since mutations identified in mutator and
RNAi-defective mut-2 mutants abrogate its enzymatic activity (Fig. 5a, b).
CeMUT-2 increases the abundance of secondary small RNAs during
RNAi[19,21], suggesting that UG tails are important in
RdRP-based secondary small RNA synthesis or stabilization. In one model,
CeMUT-2 adds poly(UG) to the 3′ end of sliced RNAs
generated in an Ago-dependent process. The poly(UG) tails would provide a
distinctive mark to recruit RdRP, either directly or via a separate UG-binding
protein (Fig. 5c). In either scenario, the tail
could be single-stranded, or form a more complex structure (depicted simply as UG
pairing in Fig. 5c). By recruiting RdRPs to
amplify small RNA pools, and perhaps by directly stabilizing sliced RNAs, poly(UG)
tails could promote long-term gene silencing known to occur in C.
elegans[48-50]. Suppression of transposition by
CeMUT-2 implies that it acts on an RNA vital in that process.
Identification of the RNA targets of CeMUT-2 should provide an
entree into roles of poly(UG) polymerases and the tails they add.
ONLINE METHODS
Plasmid Construction
To enable overexpression of rNTases as MS2 coat protein fusions in
S. cerevisiae, the MAP72 MS2 cassette vector was
constructed. YEplac 181 (LEU2
2μ)[51] was digested with HindIII and
XhoI. Then each portion of the MS2 cassette was subcloned
with unique restriction sites, resulting in the following insert: S.
cerevisiae TEF1 promoter, MS2 coat protein, a multiple cloning site
to insert the rNTase to test (consisting of BamHI,
XmaI/SmaI, NotI,
XbaI, PstI, and KpnI
sites), SV40 nuclear localization signal, RGS(H6) sequence to verify
rNTase expression by Western blotting, and S. cerevisiae ADH1
terminator sequence.

Each rNTase tested was cloned into MAP72 by amplifying the genes
indicated in Supplementary
Table 1 using the primers listed. All inserts were sequenced to
confirm identity and lack of mutations. Site-directed mutations were made using
standard methods with oligomers corresponding to the mutated sequences.

The MAP80A tRNA reporter vector was constructed using a
tRNAHis expression cassette, MAB812A[52]. tRNAHis sequence was removed
by digestion with XhoI and BglII. Then DNA
corresponding to the tRNA reporter sequence was inserted by annealing
overlapping oligomers to construct both strands of the DNA sequence. The tRNA
reporter is S. cerevisiae tRNASer(AGA) altered to
contain one MS2 stem loop sequence (underlined) in place of the endogenous
tRNASer(AGA) variable arm
(5′-GGCAACTTGGCCGAGTGGTTAAGGCGAAAGATTAGAAATCTTTACATGAGGATCACCCATGTCGCAGGTTCGAGTCCTGCAGTTGTCG-3′).

A CCA1 cassette vector was constructed using YCplac 111
(LEU2 CEN)[51] in order to express CCA1, SPAC1093.04, or
SPCC645.10 with the same promoter and C-terminal epitope tag
[RGS(H)6]. BY4741 yeast genomic DNA was used as a template to
generate an amplicon consisting of LEU2 CEN vector sequence at
the 5′ end, the CCA1 promoter sequence, and a 3′
terminal sequence corresponding to the multiple cloning site of MAP72 using
5′-GAAACAGCTATGACCATGATTACGCCAAGCTTACTAGTAGCTACTTCAGGGACAAGCAAC-3′,
and
5′-ACCCTGCAGTCTAGAAGGCGGCCGCGTGGATCCACACAAAAAAAGCCCTTATAACCCACG-3′.
MAP72 was used as a template to generate an amplicon consisting of the multiple
cloning site, RGS(H6) sequence, ADH1 terminator
sequence of MAP72, and LEU2 CEN vector sequence at the
3′ end using
5′-GGATCCACGCGGCCGCCTTCTAGACTGCAGGGTACCAGAGGTTCTCACCACCACCACCAC-3′
and
5′-CCAGTCACGACGTTGTAAAACGACGGCCAGTGAATTCCTCGAGCGGTAGAGGTGTGGTCA-3′.
These two amplicons were combined with LEU2 CEN vector
(YCplac111) linearized with PstI/SacI and
assembled by Gibson cloning[53].
The CCA1 cassette sequence was confirmed by Sanger sequencing.
CCA1, SPAC1093.04, or SPCC645.10 sequences were subcloned
from their respective MAP72-based constructs into the CCA1
cassette for expression in cca1-1 yeast.

To construct the MAP136 MUT-2 oocyte expression vector (pCS2 3HA
MS2-MUT-2 WT), MUT-2a was PCR-amplified from its MAP72-based vector using
5′-CTACCATGGATGGCTTCTAACTTTACTCAGTTCGTTCTCGTCGAC-3′ and
5′-ACTCTCGAGTTAGTGGTGGTGGTGGTGGTGAGAACCTCTGGTACCCTGCAGTACAAATGA-3′
and then cloned into the NcoI/XhoI site of
pCS2 3HA MS2. MUT-2 DNA sequence was verified prior to oocyte injections.

All constructs used in this study are available from the authors upon
request.
Yeast Growth
BY4741 yeast were co-transformed using standard methods[54] with a plasmid expressing the
reporter RNA and a plasmid expressing the rNTase of interest, or vector
controls, and selected on synthetic yeast medium lacking uracil and leucine
(SD-Ura-Leu). Cultures were inoculated independently with single colonies to
produce biological replicates, grown to saturation, and then diluted to 0.1
OD600/mL and grown to log phase (0.8–1
OD600/mL). Cells were spun down in pellets of 25 OD600
(approximately 5 × 10^8 cells) and stored at
−80°C until RNA extraction or protein expression analysis. We
performed Western blotting with mouse anti-RGS-His Antibody (1:2500 dilution,
5PRIME/Qiagen). Only those samples with clear expression of the rNTase fusion
protein were analyzed by high-throughput sequencing.

cca1-1 yeast were co-transformed with vectors as
listed in Fig. 3 and Supplementary Fig. 4 using standard
methods[54], and
selected on SD-Ura-Leu plates at room temperature. Colonies were selected and
grown to saturation in SD-Ura-Leu liquid media. Cultures were diluted to 0.5
OD/mL followed by three 10-fold serial dilutions, spotted on SD-Ura-Leu plates,
and incubated at room temperature (23°C) for 4 days or 37°C for 3
days.
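The dilution steps above reduce to simple arithmetic, sketched below. This is only an illustration: the function names are hypothetical, and the OD600 of the saturated culture used in the usage note is an assumed value that the protocol does not state.

```python
def dilution_volumes(stock_od, target_od, final_volume_ml):
    """Return (culture_ml, medium_ml) for diluting a culture, using C1*V1 = C2*V2."""
    if not 0 < target_od <= stock_od:
        raise ValueError("target OD must be positive and no greater than the stock OD")
    culture_ml = target_od * final_volume_ml / stock_od
    return culture_ml, final_volume_ml - culture_ml


def volume_for_od_units(od_units, culture_od):
    """Return the culture volume (mL) that contains the requested number of OD600 units."""
    return od_units / culture_od


def spot_series(start_od=0.5, steps=3, factor=10):
    """Return the densities spotted: the starting dilution plus `steps` serial dilutions."""
    return [start_od / factor**i for i in range(steps + 1)]
```

For example, `dilution_volumes(6.0, 0.1, 60)` gives the volumes needed to bring an assumed saturated culture of 6.0 OD600/mL to 0.1 OD600/mL in 60 mL, `volume_for_od_units(25, 0.8)` gives the volume of a log-phase culture needed for a 25 OD600 pellet, and `spot_series()` lists the four densities spotted in the growth assay.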
RNA Extraction
RNA was extracted from 25 OD of yeast corresponding to each sample by
modification of a previously described method[55]. To each sample, 0.5 g of 0.5 mm acid
washed beads (Sigma-Aldrich), 0.5 mL of RNA ISO buffer (500 mM NaCl, 200 mM
Tris-Cl pH 7.5, 10 mM EDTA, 1 % SDS) and 0.5 mL of phenol-chloroform-isoamyl
alcohol pH 6.7 (PCA, Fisher Scientific) were added. Samples were lysed with 10
cycles that each consisted of vortexing for 20 seconds and incubation on ice for
30 seconds. 1.5 volumes (relative to starting amount of RNA ISO Buffer) of RNA
ISO Buffer and of PCA were added, and samples were centrifuged at 4°C to
separate phases. The aqueous layer was transferred to a pre-spun phase-lock gel
(heavy) tube (5PRIME/Quantabio); an equal volume of PCA was added and mixed
prior to centrifugation at room temperature to separate phases. The aqueous
layer was transferred to 2 new tubes for ethanol precipitation with 2 volumes of
100% ethanol followed by incubation at −80°C for 1 hour to overnight.
Precipitated RNA was pelleted by centrifugation at 4°C. Each pellet was
dissolved in 25 μL nuclease-free water and combined into 1 tube per
sample. Co-purifying DNA was digested with 20 U of Turbo DNase (Invitrogen) at
37°C for 4 hours. RNA was cleaned up with the GeneJET RNA Purification
Kit (Thermo Fisher Scientific) and eluted with 50 μL of DEPC-treated
water.
RT-PCR Experiments
RT-PCR experiments to detect A tails or U tails on an RNase P RNA
reporter[56,57] (see Supplementary Fig. 1) were
performed by using a tail-specific reverse transcription step with 5 pmol of a
T33 or A33 DNA primer and 100 ng of total RNA using
ImProm-II Reverse Transcriptase (Promega Corporation). Then the resulting
reactions were PCR-amplified using reporter-specific primers
(5′-TCGAGCCCGGGCAGCTTGCATGC-3′ and 5′-
GGGAATTCCGATCCTCTAGAGTC-3′). If a tail was added to the RNase P RNA
reporter, then the RT reaction would produce cDNA, and the PCR would result in
an amplicon.

RT-PCR experiments to detect tails added to the tRNA reporter were
performed as described with the RNase P RNA reporter but with the following
modifications. PCR amplification was performed with a forward primer specific to
the 5′ end of the tRNA (5′-GGCAACTTGGCCGAGTGGTTAAGG-3′) and
a reverse primer specific to the 3′ end of the tRNA with an A tail or U
tail, respectively:
5′-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAATGGCGACAACTGC-3′ or
5′-TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTGGCGACAACTGC-3′. If a tail
was added to the tRNA reporter, then the RT reaction would produce cDNA, and the
tail-specific PCR would result in an amplicon.
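The design of the two tail-specific reverse primers can be checked directly from the sequences given above: each consists of a homopolymer run followed by the reverse complement of the 3′ end of the tRNA reporter plus CCA. The sketch below verifies this; the helper names are hypothetical, and the 10-nucleotide window taken from the reporter is chosen only to match the primer length.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Last 10 nt of the tRNA reporter sequence, followed by the CCA end.
TEMPLATE_3P = "GCAGTTGTCG" + "CCA"

# Tail-specific reverse primers, copied from the text above.
PRIMER_A_RUN = "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAATGGCGACAACTGC"
PRIMER_T_RUN = "TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTGGCGACAACTGC"


def split_primer(primer, template_3p=TEMPLATE_3P):
    """Split a primer into (homopolymer run, reporter-specific anchor).

    Returns None if the primer does not end with the reverse complement
    of the reporter 3' end.
    """
    anchor = reverse_complement(template_3p)
    if not primer.endswith(anchor):
        return None
    return primer[: -len(anchor)], anchor
```

Both primers end in `TGGCGACAACTGC`, so the reporter-specific portion is shared and only the homopolymer run distinguishes them.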
TRAID-Seq Library Preparation
Total RNA (100 ng) was ligated with 20 pmol of a PAGE gel-purified
5′ adenylated primer containing a 7-nucleotide random DNA sequence
(random heptamer), Illumina TruSeq adapter sequence and a 3′
dideoxycytidine (5′-A(pp) NNNNNNN TGGAATTCTCGGGTGCCAAGG ddC-3′)
using 200 U of T4 RNA ligase 2, truncated KQ (New England BioLabs) in a
20 μL reaction incubated overnight at 16°C. This ligation added
the random heptamer and Illumina TruSeq adapter sequence to the 3′ end of
the RNAs in the sample.

Half of the ligation reaction (10 μL) was reverse transcribed
using 5 pmol of Illumina RNA RT primer
(5′-GCCTTGGCACCCGAGAATTCCA-3′) and ImProm-II Reverse Transcriptase
(Promega Corporation) with 1.5 mM MgCl2 and 0.5 mM dNTPs, according
to manufacturer’s instructions.

Samples were PCR-amplified with a forward primer consisting of
Illumina-specific sequences and tRNA reporter-specific sequences
(underlined)(5′-AATGATACGGCGACCACCGAGATCTACACGTTCAGAGTTCTACAGTCCGACGATCGAGGATCACCCATGTCGCAG-3′)
and a reverse Illumina RNA PCR Primer with various indices used for
multiplexing, using GoTaq Green PCR Master Mix (Promega Corporation). PCR
products were run on an 8% polyacrylamide 8M urea gel and gel extracted.
Resulting samples for each sequencing run were combined in equimolar amounts and
run on an Illumina HiSeq2000 or HiSeq2500 (2×50 bp or 2×100 bp),
to produce approximately 1 × 10^6 reads per sample.

Experiments with the RNase P RNA reporter were performed essentially as
described above but with one modification. For TRAID-Seq, the 5′ primer
used for PCR amplification was specific for the RNase P RNA reporter (5′-
AATGATACGGCGACCACCGAGATCTACACGTTCAGAGTTCTACAGTCCGACGATCGTCTGCAGGTCGACTCTAGAAA-3′).
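Given the library design above, a read from the tRNA reporter is expected to contain, in order, the reporter 3′ end, the CCA end, any added tail, the random heptamer, and the adapter. The sketch below shows one way such a read could be split into its parts. It is not the authors' code: the function name and return format are hypothetical, exact string matching is assumed (no sequencing errors), and a leading CCA is always treated as the tRNA CCA end.

```python
REPORTER_3P = "GTTCGAGTCCTGCAGTTGTCG"  # 3' end of the tRNA reporter (DNA alphabet)
ADAPTER = "TGGAATTCTCGGGTGCCAAGG"      # adapter portion of the ligated primer
HEPTAMER_LEN = 7                       # random heptamer directly upstream of the adapter


def parse_read1(read):
    """Split a read into CCA end, added tail, and random heptamer.

    Returns None when the reporter 3' end or the adapter cannot be located,
    or when too little sequence lies between them to hold a heptamer.
    """
    start = read.find(REPORTER_3P)
    if start == -1:
        return None
    insert_start = start + len(REPORTER_3P)
    end = read.find(ADAPTER, insert_start)
    if end == -1:
        return None
    insert = read[insert_start:end]
    if len(insert) < HEPTAMER_LEN:
        return None
    body, heptamer = insert[:-HEPTAMER_LEN], insert[-HEPTAMER_LEN:]
    has_cca = body.startswith("CCA")
    tail = body[3:] if has_cca else body
    return {"has_cca": has_cca, "tail": tail, "heptamer": heptamer}
```

Tails are reported in the DNA alphabet, so a poly(UG) tail appears as alternating T and G. In practice the adapter may be truncated by read length, which this sketch does not handle.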
TRAID-Seq Data Analysis
Reads resulting from sequencing of TRAID-Seq samples were analyzed using
a group of Python scripts that we call the “PuppyTails” program.
Briefly, PuppyTails identifies sequences corresponding to the tRNA reporter, CCA
end of the tRNA, and added tail in read 1. In read 2, the program identifies the
random heptamer sequence, added tail sequence, and, if read length allows, the
CCA end and tRNA reporter sequence. Reads were collapsed into unique ligation
events using the random heptamer and then compared to identify and remove
sequences resulting from PCR amplification (PCR duplicates). The number of
unique times that each tail sequence is observed is counted. Tail sequences are
sorted by length to calculate the nucleotide composition at each tail length and
the number of tails per million heptamers (TPMH) measured for each tail length;
these data are plotted as tail-o-grams (for example, Fig. 1d, e).
Tails of all lengths were used for analyses of nucleotide composition, but for
all tail-o-grams, tails of 5 nucleotides or greater are shown for clarity. Tails
of 1–4 nucleotides included poly(A) sequences (likely added by endogenous
PAPs in S. cerevisiae) and also nucleotides not explained by
the activity of the rNTase. The sequences at these tail lengths were the same in
the absence of expressed enzyme and with catalytically inactive versions of the
enzymes tested (see Fig. 5b).

A Perl script was used to calculate the overall nucleotide composition
of tails added by a given rNTase for each of its biological replicates. All tail
lengths were assessed as one population. The abundance of observed tail
sequences was factored into calculations of nucleotide composition. Nucleotide
addition percentages reported in this study were generated using this analysis
approach. Data shown in Fig. 2a–c were generated using this type of analysis.
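The counting steps described in this section can be sketched as follows. This is an illustration under stated assumptions, not the PuppyTails or Perl code: the function names are hypothetical, reads with identical tail and heptamer are treated as PCR duplicates, and TPMH is computed against the total number of unique ligation events in the sample.

```python
from collections import Counter, defaultdict


def collapse_ligation_events(records):
    """Collapse (tail, heptamer) pairs into unique ligation events.

    Returns a Counter mapping each tail sequence to the number of
    distinct heptamers it was observed with.
    """
    unique_events = set(records)
    return Counter(tail for tail, _ in unique_events)


def tailogram(tail_counts, total_heptamers, min_len=5):
    """Per tail length, compute TPMH and the abundance-weighted nucleotide composition."""
    tails_at_len = Counter()
    nt_at_len = defaultdict(Counter)
    for tail, n in tail_counts.items():
        if len(tail) < min_len:
            continue  # short tails are omitted from tail-o-grams for clarity
        tails_at_len[len(tail)] += n
        for nt in tail:
            nt_at_len[len(tail)][nt] += n
    rows = {}
    for length in sorted(tails_at_len):
        total_nt = sum(nt_at_len[length].values())
        rows[length] = {
            "tpmh": tails_at_len[length] * 1e6 / total_heptamers,
            "composition": {nt: nt_at_len[length][nt] / total_nt for nt in "ACGT"},
        }
    return rows


def overall_composition(tail_counts):
    """Nucleotide composition over tails of all lengths, weighted by tail abundance."""
    counts = Counter()
    for tail, n in tail_counts.items():
        for nt in tail:
            counts[nt] += n
    total = sum(counts.values())
    return {nt: counts[nt] / total for nt in "ACGT"} if total else {}
```

A typical use would pass the (tail, heptamer) pairs parsed from one replicate to `collapse_ligation_events`, then pass the result and its total count to `tailogram`.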
Protocol Accessibility
A step-by-step version of the TRAID-Seq workflow is available online on
Protocol Exchange (DOI: 10.1038/protex.2019.016).
Computational Analyses of Sequence Motifs
To analyze tail sequences, a general feature screening with a random
forest application[58] was
performed at the replicate level. We first quantified the number of occurrences of
all oligonucleotides (k=1, 2, 3, 4) within each tail sequence and utilized the
resulting set of 340 features, as well as the length of the tail. The variable
importance, defined as the percent mean decrease in accuracy (with 500 trees,
113 candidate variables at each split, minimum node size of 5), was estimated
for all features. We define the selected features as those whose importance
measures are greater than 4% across replicates. We fitted a Poisson regression
model in which the response variable was tail sequence counts.

Tails added by S. cerevisiae Cca1, S. pombe SPAC1093.04, and
predicted CCA-adding enzymes. The above selected features were used
as covariates. P-values from individual replicates, calculated
from one-sided Wald’s test, were aggregated using Fisher’s
(n<4) or Wilkinson’s (n>=4) method, followed by
multiplicity correction with the Bonferroni procedure. This process identified
oligonucleotides that differ between tails added by S.
cerevisiae Cca1 and S. pombe SPAC1093.04 at level
0.05.

Tails added by C. elegans MUT-2. We evaluated the
impacts of 16 dinucleotides by formally testing for their effects by a
comparison of a null model without each dinucleotide and the alternative model
deduced from random forest filtered set of features plus other dinucleotides.
This procedure identified UG and GU as the most significant dinucleotides.
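Two of the steps in this section lend themselves to a short sketch: building the oligonucleotide count features (4 + 16 + 64 + 256 = 340 for k = 1 to 4, plus tail length) and combining per-replicate P-values by Fisher's method with a Bonferroni correction. The code below is illustrative and is not the authors' implementation. Function names are hypothetical, overlapping occurrences are assumed to be counted, Wilkinson's method is not shown, and the random forest and Poisson regression steps are omitted.

```python
from itertools import product
from math import exp, log

ALPHABET = "ACGU"


def kmer_features(tail, ks=(1, 2, 3, 4)):
    """Count every k-mer (overlapping occurrences) in a tail and append the tail length."""
    tail = tail.upper().replace("T", "U")  # sequencing reads report U as T
    feats = {"".join(p): 0 for k in ks for p in product(ALPHABET, repeat=k)}
    for k in ks:
        for i in range(len(tail) - k + 1):
            kmer = tail[i:i + k]
            if kmer in feats:  # skips k-mers containing ambiguous bases such as N
                feats[kmer] += 1
    feats["length"] = len(tail)
    return feats


def fisher_combine(pvalues):
    """Combine independent P-values (each > 0) with Fisher's method.

    The statistic X = -2 * sum(ln p) follows a chi-square distribution with
    2k degrees of freedom; for even degrees of freedom the upper tail has the
    closed form exp(-X/2) * sum_{i<k} (X/2)**i / i!.
    """
    k = len(pvalues)
    half_x = -sum(log(p) for p in pvalues)
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return exp(-half_x) * total


def bonferroni(pvalues):
    """Bonferroni-adjust a list of P-values, capping each at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]
```

For a tail such as UGUGUG, `kmer_features` returns 341 entries, with the UG and GU dinucleotide counts among them; an oligonucleotide would then be called at level 0.05 if its adjusted combined P-value falls below that threshold.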
In vitro Transcription
pCS2 3HA MS2-MUT-2 (MAP136) was linearized with SacII,
and 3 μg of linearized plasmid was transcribed with Ampliscribe SP6 High
Yield Transcription Kit (Epicentre), according to manufacturer’s
instructions. pLGMS2-luc (RNA with three MS2-binding sites)[10,59] was linearized with BglII, and 1 μg
of linearized plasmid was transcribed with T7 Flash In Vitro Transcription Kit
(Epicentre), according to manufacturer’s instructions. Transcription
reactions included m7G(5′)ppp(5′)G RNA Cap Structure
Analog (New England Biolabs).
Tethered Function Assays and Oocyte RNA Extraction
Xenopus laevis oocyte manipulations and injections were
performed as in previous studies[10,59,60].

Tethered function assays were conducted essentially as previously
described[11]. Briefly,
Stage VI oocytes were injected with 50 nL of 600 ng/µL capped mRNA
encoding MS2-HA-MUT-2 protein. After 6 hours, the same oocytes were injected
with 50 nL of 3 ng/µL pLGMS2-luc reporter mRNA. After 16 hours, oocytes
were collected, lysed, and assayed. Three oocytes were used to confirm protein
expression. Total RNA was extracted from oocytes using TRI reagent
(Sigma-Aldrich), as described previously[11], then treated with 8 U of Turbo DNase (Invitrogen) at
37°C for 1 hour, and cleaned up with the GeneJET RNA Purification Kit
(Thermo Fisher Scientific).
Oocyte RNA Analysis and Tail Sequencing
Oocyte total RNA (100 ng) was ligated with 20 pmol of the 5′
adenylated primer as described above. This ligation added the random heptamer
sequence and a known sequence to the 3′ ends of RNAs in the sample for
tail sequence-independent analyses. Half of the ligation reaction (10 μL)
was reverse transcribed as described above.

Samples were PCR-amplified with a forward primer specific to the RNA
reporter (5′- CTCTGCAGTCGATAAAGAAAACATGAG-3′) and a reverse primer
specific to the known sequence added to the 3′ end of the RNA (5′-
GCCTTGGCACCCGAGAATTCCA-3′), using GoTaq Green PCR Master Mix (Promega
Corporation). PCR products were run on a 1.5% agarose gel, and purified with the
GeneJET Gel Extraction Kit (Thermo Fisher Scientific). Non-templated A overhangs
were added by treating the purified PCR products with 10 U of TaqPlus Precision
Polymerase Mixture (Agilent Genomics) in TaqPlus Precision buffer supplemented
with 0.2 mM dATP at 70°C for 30 minutes. PCR products were then cloned
with the TOPO TA Cloning Kit for Subcloning (Thermo Fisher Scientific) as
follows: 6% of the A addition reaction volume (2.4 μL) was combined with
0.6 μL of Salt Solution and 0.7 μL of TOPO Vector and incubated at
room temperature for 30 minutes. Reactions were diluted 1 in 4 with water,
transformed into DH5α competent cells, and selected on LB agar with 100
μg/mL ampicillin and 75 μg/mL X-Gal for blue/white screening.
White colonies were selected, plasmids were extracted, and inserts were
sequenced to identify tails added to the reporter. All reporter sequences with
added tails are reported in Fig. 4c.
NRM sequence analysis
Protein sequences of known[10,11,24,25,27-33,61-63] and
new rNTases tested (excluding CCA-adding enzymes) were aligned using ClustalX
2.1 software to identify the nucleotide recognition motif (NRM). The amino acid
for each rNTase reported in Table 1
corresponds to histidine 336 of S. pombe Cid1.