Ribosome-inactivating proteins (RIPs) are RNA:adenosine glycosidases that inactivate eukaryotic ribosomes by depurinating the sarcin-ricin loop (SRL) in 28S rRNA. The GAGA sequence at the top of the SRL or at the top of a hairpin loop is assumed to be their target motif. Saporin is a RIP widely used to develop immunotoxins for research and medical applications, but its sequence specificity has not been investigated. Here, we combine the conventional aniline cleavage assay for depurinated nucleic acids with high-throughput sequencing to study sequence-specific depurination of oligonucleotides caused by saporin. Our data reveal the sequence preference of saporin for different substrates and show that the GAGA motif is not efficiently targeted by this protein in either RNA or DNA. Instead, a preference of saporin for certain hairpin DNAs was observed. The observed sequence-specific activity of saporin may be relevant to antiviral or apoptosis-inducing effects of RIPs. The developed method could also be useful for studying the sequence specificity of depurination by other RIPs or enzymes.
RNA:adenosine glycosidase activity is
thought to be the defining
feature of all ribosome-inactivating proteins (RIPs).[1] This activity targets the sarcin-ricin loop (SRL), a highly
conserved, compact structure of ribosomal RNA (rRNA) that is important for
the function of ribosomes.[2] RIPs cleave
off one adenine base at the top of the SRL.[3] The loss of this adenine base inhibits translocation and inactivates
the ribosome.[4] Some RIPs, for example,
Shiga toxin and ricin, have been shown to specifically interact with
ribosomal stalk proteins to gain access to the SRL.[5,6] This
interaction is essential for specificity toward the intact ribosomes
and accelerates the depurination reaction,[7] enabling ricin to inactivate approximately 1500 ribosomes per minute
in vitro.[8]

Experiments with ricin
indicated that a hairpin loop with the GAGA
sequence at the top is the minimal element required for RIP activity.[7,9] It has also been shown that RIPs can release adenine from a range
of different substrates, such as mRNA or DNA.[10,11] Until now, it has been unclear if and how the specificity of RIPs
on other substrates differs from the specificity for ribosomes or
rRNA. Saporin, a type I ribosome-inactivating protein produced by
the plant Saponaria officinalis, was particularly
effective at liberating adenine from a range of different nucleic
acid species.[12] Compared to ricin, which
is highly specific for ribosomes, saporin is much more promiscuous
and can target oligonucleotides without the aid of other (protein)
factors. Thus far, it is unknown whether the activity on substrates other
than rRNA is also influenced by the sequence of the nucleic acid substrates.

RIPs are generally classified into type I or II depending on their
composition. Type I RIPs consist of one protein domain (the A chain)
that is catalytically active, while type II RIPs consist of the A
chain and a lectin-like domain (the B chain) that can mediate binding
and uptake into eukaryotic cells.[1] While
type II RIPs (like ricin and Shiga toxin) are usually highly toxic,
type I RIPs like saporin are basically nontoxic to humans and can
be found in many plants widely consumed as food.[13] The uptake of RIPs by eukaryotic cells is a major determinant
of their toxicity.[14] RIP toxicity thus
depends on the presence of a specific receptor on the cell surface.[15] RIPs coupled to cancer cell-specific antibodies
(so-called immunotoxins) have therefore been investigated for cancer-targeting
therapies. Saporin immunotoxins in particular have shown promise for
the treatment of cancer or as a conditioning agent for the treatment
of immune malfunction due to RAG deficiency.[16−19] Saporin-based immunotoxins have
also been used to kill specific types of neurons and to study their
effect on the brain.[20,21] Still, only a few immunotoxins
have been approved for treatment because of serious side effects and
immunogenicity.[22] Gaining a better understanding
of the catalytic activities of saporin might thus aid in the development
of targeted therapies and the understanding of the side effects of
saporin-containing immunotoxins.

High-throughput or deep sequencing is a versatile approach for
characterizing the effects enzymes have on nucleic acids. This approach
has been used to characterize the sequence specificity of C5, the
RNA binding subunit of Escherichia coli RNase P,
and its RNA processing kinetics.[23] The
detailed affinity distributions that could be obtained by sequencing
make it possible to gain a much more comprehensive picture of the
protein’s intrinsic activity.[24] To
the best of our knowledge, no such studies have been performed for
any RIP.

Here, we present a high-throughput, DNA sequencing-based approach
to study the depurination activity of saporin against different nucleic
acids. While most RIPs work efficiently on intact ribosomes, our method
combines synthetic oligonucleotides with a central randomized region,
depurination with saporin, the aniline cleavage assay for depurinated
nucleic acids, and high-throughput sequencing. Due to the minimal
requirements of this method, it can be easily adapted to study depurination
of promiscuous as well as more specific enzymes.

Using this approach, the sequence preference of saporin for DNA
and RNA sequences containing six randomized positions was determined.
Surprisingly, the wild-type SRL sequence in isolation is not an efficient
target for saporin. Instead, small hairpin DNAs containing the motif
ACG at the beginning of the loop were found to be most susceptible
to depurination. The results from sequencing could be reproduced for
selected oligos using the conventional aniline cleavage assay and
visualization by polyacrylamide gel electrophoresis (PAGE), showing
that saporin preferentially targets these hairpin DNAs. We believe
that our method can also be used to determine the sequence preference
of other depurinating proteins, for example, the important RIP toxins
ricin, abrin, and Shiga toxin. The results presented here might also
help other groups with the design of RIP-based immunotoxins, RIP inhibitors,
and sensors or help explain RIP antiviral activity and their roles
in nature.
Results
Method Overview
Our method uses synthetic DNA oligos
with a central randomized region. The sequences selected for the study
are based on structures known to be targeted by RIPs such as the SRL[25] or short hairpin loop structures[9] (Figure 1B). For the SRL, the “ACGAGA” sequence at the top of
the loop was replaced by six random nucleotides (N) shown in red.
The resulting molecules still contain the rest of the original loop,
including three conserved adenine residues (bold). Because RIPs act
specifically on adenine, the randomized region spanned three adenine
bases and no additional adenine residues were included in the stem
or the flanking sequences. The short hairpin loop oligos do not contain
any adenine residues outside of the randomized region. The DNA oligos
are transcribed into RNA using T7 RNA polymerase before treatment
with saporin. Alternatively, the DNA oligos are directly incubated
with saporin to induce depurination. The resulting abasic sites are
cleaved by aniline under acidic conditions[26] (Figure 1A). This
step separates the constant regions at the 5′ and 3′
ends of the oligo. The libraries are then loaded on a denaturing polyacrylamide
gel to separate cleaved and uncleaved fragments. The uncleaved fragments
(Figure 1C, boxed in
red) are extracted from the gel and processed into sequencing libraries.
The gel picture of the second replicate is shown in Figure S5. The constant regions of the oligos are used to
anneal primers for reverse transcription and second-strand synthesis.
The primers used also introduce the sequencing adapters necessary
for Illumina sequencing (Figure 1D). A final polymerase chain reaction (PCR) using Illumina
UDI primers is performed to amplify the library. The number of PCR
cycles was optimized to minimize PCR-introduced bias. The use of unique
molecular identifiers (UMIs) could further reduce such bias if more
precise quantification is desired.[27] To
find the sequences that are depurinated and cleaved, the abundance
of sequence variants in a library before and after saporin treatment
(here, 0 min meaning immediate quenching and 30 min treatment) is
compared.
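The counting step behind this comparison can be sketched in a few lines of Python. This is a minimal illustration and not the pipeline used in the study: the flanking sequences `LEFT_FLANK` and `RIGHT_FLANK`, the function names, and the example reads are hypothetical placeholders, and real reads would additionally need adapter trimming and quality filtering.

```python
from collections import Counter
from itertools import product

# Hypothetical constant flanks; the real stem and flank sequences are not reproduced here.
LEFT_FLANK = "CGTCC"
RIGHT_FLANK = "GGTCG"
N_RANDOM = 6  # six randomized positions, as in the libraries described above


def all_variants(n=N_RANDOM, alphabet="ACGT"):
    """Enumerate every possible randomized region (4**n sequences)."""
    return ["".join(p) for p in product(alphabet, repeat=n)]


def extract_variant(read, left=LEFT_FLANK, right=RIGHT_FLANK, n=N_RANDOM):
    """Return the n bases between the two flanks, or None if the read does not match."""
    start = read.find(left)
    if start == -1:
        return None
    begin = start + len(left)
    end = begin + n
    if read[end:end + len(right)] != right:
        return None
    variant = read[begin:end]
    if len(variant) != n or set(variant) - set("ACGT"):
        return None
    return variant


def count_variants(reads):
    """Count reads per variant; variants that were never observed keep a count of 0."""
    counts = Counter({v: 0 for v in all_variants()})
    for read in reads:
        variant = extract_variant(read)
        if variant is not None:
            counts[variant] += 1
    return counts


# Invented example reads (two matching reads, one other variant, one unusable read).
example_reads = [
    "TTCGTCCACGAGAGGTCGTT",
    "TTCGTCCACGAGAGGTCGTT",
    "CGTCCTTTTTTGGTCG",
    "junk",
]
example_counts = count_variants(example_reads)
```

Running `count_variants` once on the 0 min library and once on the 30 min library gives the two count tables whose per-variant abundances are compared.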
Figure 1
Schematic illustration of the process used to treat the DNA and
RNA oligos and to construct sequencing libraries. (A) The randomized
library may contain sequences that are susceptible or not susceptible
to depurination by the RIP being studied. After RIP treatment, some
sequences are depurinated (top). These can be cleaved by aniline at
the abasic site. Sequences that are resistant to depurination remain
intact (bottom). (B) Sequences and secondary structure of both DNA
libraries with six randomized bases (N) in the loop. One was based
on the rRNA sequence of the SRL with the top ACGAGA sequence
randomized. The second contained only the six randomized bases in
the loop. Both have a stable 11 bp stem and additional flanking 5′
and 3′ constant regions for reverse transcription and PCR.
(C) Uncleaved fragments (boxed in red) can be recovered from a denaturing
polyacrylamide gel after different treatment times. (D) RNA oligos
were reverse transcribed with a primer annealed to the 3′ end
(blue). The second strand of cDNA was synthesized with a primer annealed
to the other end of the SRL RNA (orange). The cDNA was amplified in
a PCR using Illumina UDI primers annealed to sequence newly added
in the previous two steps (dark orange and dark blue), yielding the
final sequencing library.
Depurination Activity of Saporin on SRL-Based DNA and RNA Substrates
We began our study by determining the conditions necessary to achieve
measurable depurination and cleavage of RNA and DNA substrates with
saporin using the conventional aniline cleavage assay. For this purpose,
we initially chose E73,[25] an oligo mimic
of the SRL, as the substrate. DNA and RNA versions of this sequence
were generated by solid phase synthesis and treated with different
amounts of saporin using a published protocol.[3] The resulting abasic sites were cleaved by aniline, and the oligos
were loaded onto a denaturing polyacrylamide gel. The RNA yielded
cleavage bands whose appearance strictly depended on aniline cleavage
(Figure 2A). For DNA,
faint cleavage bands were also observed without aniline treatment.
These are most likely the result of spontaneous hydrolysis at the
abasic site, because compared to abasic RNA, abasic DNA is very unstable.[26]
Figure 2
Determination of saporin treatment conditions. (A) Cleavage
of
RNA (OSaH151) and DNA (OSaH152) variants of E73 after treatment with
different amounts of saporin for 30 min. The saporin concentrations
were 0, 5, 17, 50, 167, and 500 nM. The last two lanes were both incubated
with 500 nM saporin, but aniline treatment was omitted in the last
lane. The sequence of the RNA oligo is shown on the top with the targeted
adenine marked with an asterisk. (B) Similar to panel A, but using
oligos with a more stable (longer) stem (OSaH229 and OSaH190). Arrows
indicate the condition chosen for kinetic experiments in panel C.
(C) The oligos were treated with the amount of saporin indicated by
the arrow in panel B for different lengths of time. The arrows indicate
the time points deemed appropriate for sequencing. At these points,
cleavage bands are visible, but the bulk of the oligo is still unreacted.
(D) Nonspecific depurination of 600 nM DNA oligos (OSaH190, -195,
-196, -201, -244, -243, -199, and -207–209 from top to bottom,
respectively) by 150 nM saporin. All adenine residues in the wild-type
SRL sequence (top) are depurinated to some extent. The GAGA pattern
is not required for depurination. In panels A–C, canonical
and noncanonical base pairs found in the native SRL sequence are not
depicted.
Interestingly, DNA seemed to be more sensitive
to depurination
by saporin than RNA. Visible cleavage bands were observed after treatment
with 500 fmol (17 nM) of saporin, while 5 pmol (167 nM) of saporin
was required for the RNA substrate (Figure 2A). Similar findings have already been reported
for other RIPs. For example, the ricin A chain had higher activity
on short linear DNA oligos than on RNA variants[10] and Shiga toxin has also been found to be highly active
on DNA.[28]

If the E73 oligos were
depurinated and cleaved at the first A in
the GAGA motif, cleavage should yield two fragments of identical size.
Instead, our experiment yielded two fragments for the RNA oligo and
one intense band as well as some fainter bands for the DNA oligo.
We assumed that the short (6 bp) stem of E73 might not be stable and
allows different structures to form. These might be depurinated at
different sites. Therefore, new oligos that had a more stable 10 bp
stem with the same loop region as E73 were designed and treated with
saporin under the same conditions. Native PAGE was performed to confirm
that, under the experimental conditions, the oligos used do not form
dimers that disrupt the stem–loop structure (Figure S1). The new versions of E73 also did not yield the
expected cleavage pattern of two fragments with nearly identical sizes
(Figure 2B). Instead,
three intense bands and some fainter bands were observed, especially
for the DNA oligo (Figure 2B). To test the effect of single adenine bases on the cleavage
pattern of the DNA oligo, all adenine bases in the loop region were
mutated to thymidine.
Nonspecific DNA:Adenosine Glycosidase Activity
Figure 2D shows the cleavage
bands observed for variants of SRL-based DNA substrates. The results
indicate that the GAGA sequence is not required for depurination.
All adenine residues seem to be targeted by saporin at the concentration
used (150 nM saporin and 600 nM DNA), although the adenines at the
3′ end of the loop are depurinated less efficiently. As expected,
a substrate without adenine in the loop region was not depurinated
and therefore not cleaved by aniline. The introduction of a single
adenine residue was enough for cleavage bands to appear. A control
experiment showing untreated oligos can be found in Figure S2. Interestingly, adenine residues in the stem of
these oligos did not seem to contribute to the cleavage pattern. These
residues form Watson–Crick base pairs and are most likely not
accessible to saporin. Saporin thus seems to be a nonspecific single-strand
polynucleotide:adenosine glycosidase. This is in contrast to results
obtained with ricin, which seems to be more specific, at least for
RNA oligos.[9,10]
Optimizing the Reaction Time for Depurination
Using
the minimal saporin concentrations necessary to induce significant
cleavage of the SRL mimic (Figure 2B, indicated by the arrows), we next performed depurination
kinetic experiments to estimate the time required for optimal cleavage
using the same aniline cleavage assay. Fifteen picomoles (500 nM)
of the SRL mimic was treated with 5 pmol (167 nM) of saporin in the
case of RNA and 1.5 pmol (50 nM) of saporin in the case of DNA. According
to the results, we decided to treat the RNA libraries for 60 min and
the DNA libraries for 30 min (Figure 2C, indicated by the arrows). At these time points, there
should be measurable cleavage because cleavage bands were clearly
visible while most of the oligo (around 80%) was still uncleaved.
If the reaction were to proceed too far, it would not be possible
to differentiate between sequences with strong and moderate substrate preferences.

While treating the randomized libraries with saporin using the
conditions determined so far, we found that the DNA library behaved
as expected but the RNA library was cleaved more than expected (data
not shown). This suggested that the RNA form of the SRL mimic might
be more resistant to saporin depurination than most variants in the
randomized library. Therefore, we decided to treat the RNA library
using the same conditions as determined for DNA [15 pmol (500 nM)
of library with 1.5 pmol (50 nM) of saporin for 0 and 30 min]. In
this case, we could indeed observe cleavage bands for the randomized
RNA and DNA libraries, while the bulk of each library was still uncleaved
after 30 min at 37 °C. The uncleaved fractions were purified
from a polyacrylamide gel (Figure C) and processed for sequencing. The conditions established
thus far (50 nM saporin, 500 nM library, 30 min, and 37 °C) are
likely substrate limiting. With 4096 total variants in a library,
the concentration of each variant averages 122 pM, which is substantially
lower than the previously reported KM values
for SRL-based substrates targeted by saporin (9–95 μM).[29,30]
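The per-variant concentration quoted above follows directly from the library concentration and the number of possible variants. The short sketch below only reproduces that arithmetic; it assumes every variant is equally represented, and the function and variable names are illustrative.

```python
def per_variant_concentration_pM(library_nM: float, n_random: int = 6) -> float:
    """Average concentration of one variant in pM, assuming equal representation."""
    n_variants = 4 ** n_random  # 4096 variants for six randomized positions
    return library_nM * 1000.0 / n_variants  # 1 nM = 1000 pM


conc_pM = per_variant_concentration_pM(500.0)  # 500 nM library, about 122 pM per variant

# Lower end of the reported KM range (9 µM) expressed in pM for comparison.
km_low_pM = 9e6
fold_below_km = km_low_pM / conc_pM
```

Even against the lowest reported KM, each variant is present at a concentration several orders of magnitude below KM, which is consistent with the statement that the conditions are likely substrate limiting.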
Reproducibility of the Sequencing Assay
We performed
two independent saporin treatments of the randomized libraries to
evaluate the reproducibility of the method. The relative abundance
(count of one variant divided by the sum of the counts of all of the
variants) of each variant in the library is highly reproducible even
after saporin treatment (Figure 3A, top, and Figure S3).
The method-induced error was estimated by computing the ratio of the
relative abundances of each variant in replicates 1 and 2, termed
the “abundance ratio”. As expected, across the replicates,
the abundance ratio is near 100% and the error seems (except for some
outliers) to be randomly distributed (Figure 3A, middle). A histogram representation of
the abundance ratio shows that the distribution of the error is symmetric
(Figure 3A, bottom)
and the kernel density estimate (blue) of the distribution estimated
from the data traces a normal distribution (red) fairly well. The
method therefore does not seem to introduce any systematic bias, but
it introduces a certain amount of normally distributed random error.
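The two quantities defined in this section, relative abundance and abundance ratio, can be written down compactly. The following is a sketch under the definitions given above; the function names and the toy read counts are invented for illustration and are not data from the study.

```python
from statistics import mean, stdev


def relative_abundance(counts):
    """Count of one variant divided by the sum of the counts of all variants."""
    total = sum(counts.values())
    return {variant: n / total for variant, n in counts.items()}


def abundance_ratios(counts_rep1, counts_rep2):
    """Per-variant ratio (in percent) of relative abundance in replicate 1 vs replicate 2."""
    ra1 = relative_abundance(counts_rep1)
    ra2 = relative_abundance(counts_rep2)
    # Variants missing from, or absent in, replicate 2 are skipped to avoid division by zero.
    return {v: 100.0 * ra1[v] / ra2[v] for v in ra1 if ra2.get(v, 0) > 0}


# Invented toy counts for three variants in two replicates.
rep1 = {"ACGTAC": 120, "CCGTTG": 80, "TTTTTT": 200}
rep2 = {"ACGTAC": 110, "CCGTTG": 90, "TTTTTT": 200}

ratios = abundance_ratios(rep1, rep2)
center = mean(ratios.values())   # expected to lie near 100% for reproducible replicates
spread = stdev(ratios.values())  # size of the random error
```

A mean near 100% with a symmetric spread is what the scatter and histogram plots described above are checking for.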
Figure 3
Results
of the saporin substrate sequencing assays. (A) Two independent
treatments of the libraries with saporin were highly reproducible.
All variants correlated linearly as shown for the DNA loop library.
Judging from the scatter and histogram plots of the abundance ratio
across both replicates, the error caused by the method is randomly
distributed. The density of the data (blue) closely follows that of
a normal distribution (red). (B) Sequence representation of the 100
most efficiently depurinated variants from each library with a sequence
logo based on the top 20 variants. Bases are colored green for A,
blue for C, orange for G, and red for T (or U). The last column, which
is present for only the loop libraries, indicates the possibility
of base pairing between randomized positions 1 and 6. Complementary
bases at these positions are colored black, while mismatched bases
are colored white.
Sequence Preference of Saporin
All variants were ranked
according to the ratio of relative abundance in the library after
a 30 min saporin treatment in comparison to a 0 min treatment from
lowest to highest. We call this measure the recovery ratio of the
variants. This ranking reflects the preference of saporin for all
variants in the library. Figure 3B shows the 100 most depleted variants and a sequence
logo for the top 20 variants.[31] The complete
preference distributions can be found in Supporting Data 1.

Although an enrichment of certain bases can be
observed in the top 20 variants of all libraries (see sequence logos
in Figure 3B), only
the N6 loop DNA library clearly shows a preferred pattern. Here, variants
that contained the ACGN or AAGN sequence motif in a tetraloop (colored
black in the last column) were highly depleted. The same pattern in
a hexaloop was much less sensitive to depurination by saporin. This
result indicates the specificity of saporin for DNA tetraloops that
start with an ACG or AAG sequence. No patterns were obvious in the
other libraries.

Interestingly, the ACGAGA wild-type SRL sequence
(RNA) in the randomized
region is not efficiently targeted by saporin. It ranks 3284th of
4096 in replicate 1 and 3379th in replicate 2. Therefore, >80%
of
the other variants are depurinated more efficiently by saporin than
the wild-type SRL sequence. This was already suspected from the initial
experiments, because the SRL-mimic RNA was depurinated and cleaved
far less efficiently than the randomized library. Thus, in contrast
to ricin, which was shown to specifically depurinate the native SRL
sequence,[9,32] saporin depurinates nucleic acids with different
sequence requirements.

During the depurination of ribosomes,
RIPs need to interact with
ribosomal proteins to gain access to the SRL.[5,6] Depurination
of isolated nucleic acids as demonstrated in our experiments might
thus yield different specificities, because only direct saporin–nucleic
acid interactions are possible.
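The recovery ratio and the resulting ranking described in this section can be sketched as follows. This is an illustrative reading of the definitions in the text, not the analysis code of the study; function names and the toy counts are hypothetical.

```python
def relative_abundance(counts):
    """Count of one variant divided by the sum of the counts of all variants."""
    total = sum(counts.values())
    return {v: n / total for v, n in counts.items()}


def recovery_ratios(counts_0min, counts_30min):
    """Relative abundance after 30 min of treatment divided by that at 0 min."""
    ra0 = relative_abundance(counts_0min)
    ra30 = relative_abundance(counts_30min)
    return {v: ra30.get(v, 0.0) / ra0[v] for v in ra0 if ra0[v] > 0}


def rank_variants(recovery):
    """Variants sorted from lowest (most depleted) to highest recovery ratio."""
    return sorted(recovery, key=recovery.get)


def fraction_ranked_ahead(rank, n_variants):
    """Fraction of the other variants that are more depleted than a variant at this 1-based rank."""
    return (rank - 1) / (n_variants - 1)


# Invented toy counts for three variants.
counts_0 = {"CACGTG": 100, "ACGAGA": 100, "CCCCCC": 100}
counts_30 = {"CACGTG": 20, "ACGAGA": 90, "CCCCCC": 110}
ranking = rank_variants(recovery_ratios(counts_0, counts_30))

# Ranks reported above for the wild-type SRL sequence in the two replicates.
wt_rep1 = fraction_ranked_ahead(3284, 4096)
wt_rep2 = fraction_ranked_ahead(3379, 4096)
```

With the ranks reported for the wild-type sequence, both fractions come out above 0.80, matching the statement that more than 80% of the other variants are depurinated more efficiently.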
Search for Other Preferred Sequence Patterns
To find
any potentially hidden patterns in the other libraries, we tried the
following approaches. First, all pairwise base interactions were estimated
for contributions to saporin specificity. Second, enrichment of longer
(three- and four-nucleotide) patterns among the 100 most depurinated
sequences was investigated.

On the basis of the work of Guenther
et al.,[23] we constructed a linear regression
model that considers the pairwise coupling of bases. The linear coefficients
of the models allow us to deduce which pairs of bases significantly
affect saporin specificity and are easier to interpret than complete interaction
matrices. These significant interactions and
the predicted (model) versus experimental comparison of the recovery
ratios are shown in Figure 4.
Figure 4
Matrices for the pairwise coupling model. The recovery ratio was
used to fit linear regression models that consider pairwise interactions
between bases at different positions (see Materials
and Methods). The insets show the R2 values (coefficients of determination) between the predicted and
observed recovery ratios for each model. Each square shows the linear
coefficient of the model that indicates how a particular combination of two
bases affects the model prediction of the recovery ratio. Blue squares
(low linear coefficient) indicate that this base combination promotes
depurination by saporin, while red squares (high linear coefficient)
indicate that the base combination inhibits depurination. Gray squares
show pairs of bases that do not significantly affect depurination
by saporin.
We observed that the pairwise models for the different
libraries
could explain ∼60% of the variance in the data set obtained for RNA, but only
∼50% for DNA. Therefore, these models are unable to fit the
data well. A likely explanation is the promiscuous nature of the catalytic
activity of saporin. As shown above, saporin can target
all adenine residues in a DNA version of the SRL (see Figure 2D). When the enzyme is active
on any adenine residue, the pairwise coupling with other bases should
have minor effects. Complicating the interpretation is the fact that
multiple adenine bases might be present in a given variant. Once a
variant containing multiple adenine bases is depurinated and cleaved,
it is impossible to determine which position was targeted by saporin
or if multiple positions were depurinated. Therefore, highly promiscuous
activity cannot be captured accurately with this kind of model. Nevertheless,
pairwise coupling of bases seems to determine the sequence preference
of saporin for RNA to a higher degree than for DNA. Higher-order interactions
than pairwise, for example, the previously observed ACG pattern in
a tetraloop, could contribute more to how saporin discriminates DNA
sequences.
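To make the pairwise coupling idea more concrete, the sketch below shows one way to encode a six-base variant as pairwise indicator features and to compute the coefficient of determination between observed and predicted recovery ratios. The exact parameterization used in the study is described in its Materials and Methods and may differ (for example, in whether single-position terms are included); the feature layout and all function names here are assumptions. The regression fit itself is omitted and could be performed with any least-squares solver.

```python
from itertools import combinations, product

BASES = "ACGT"


def pairwise_features(n_positions=6):
    """One indicator per (position i, position j, base at i, base at j) with i < j."""
    return [
        (i, j, a, b)
        for i, j in combinations(range(n_positions), 2)
        for a, b in product(BASES, repeat=2)
    ]


def encode(variant, features):
    """Design-matrix row: 1 where the variant carries that base combination, else 0."""
    seq = variant.upper().replace("U", "T")  # treat RNA and DNA variants alike
    return [1 if seq[i] == a and seq[j] == b else 0 for i, j, a, b in features]


def r_squared(observed, predicted):
    """Coefficient of determination between observed and model-predicted values."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


features = pairwise_features()    # 15 position pairs x 16 base combinations
row = encode("ACGAGA", features)  # exactly one active feature per position pair
```

In this encoding each fitted coefficient corresponds to one base combination at one pair of positions, which is what allows individual squares of a coefficient matrix to be read as promoting or inhibiting depurination.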
Promiscuous Activity or Specificity?
The data analyzed
thus far suggest highly promiscuous activity of saporin while also
showing specificity for certain DNA tetraloops. The data obtained
by sequencing cover the complete sequence space, so it is possible
to also inspect the overall distribution of recovery ratios. These
distributions show that there is a clear difference between the libraries
based on the SRL and the shorter loop libraries (see Figure 5A). The distributions of the
SRL libraries are more spread out, indicating promiscuous activity
across a broad range of variants, while distributions of the loop
libraries are more compact, suggesting that saporin is less active
on most variants.
Figure 5
Distribution of recovery ratios. (A) Shape of the distribution
of recovery ratios for all variants. The distributions of the SRL-based
libraries are flatter than those of the loop libraries indicative
of a more promiscuous activity. (B) Dependence of recovery on the
number of adenine residues. An increasing number of adenine residues
reduces the recovery ratios of variants, indicating that adenine is
targeted nonspecifically. Interestingly, there are some variants in
both DNA libraries that exhibit much lower recovery ratios compared
to those of the rest of the population. This indicates that some specific
sequence pattern is targeted. (C) Minimum free energy structures of
two sensitive variants from the SRL DNA library as predicted by NUPACK.[33] These variants form secondary hairpins with
small loops. The small loops closely resemble the sensitive ACG pattern
identified in the DNA loop library.
The higher activity of saporin on SRL libraries
might be caused
by the additional adenine bases outside of the randomized region.
Although it is not possible to determine which adenine in a specific
sequence is targeted by saporin, it is possible to investigate the
effect of the number of adenines in a specific sequence on its reactivity.
Nonspecific activity of saporin toward adenine should result in
reactivity that increases with the number of adenines present in the sequence.
The box plots in Figure 5B show an increasing degree of depurination as the number
of adenines in a sequence increases. However, the trend is much less
pronounced in the DNA tetraloop library. Other bases (C, G, and U/T)
do not show similar trends (Figure S4).

Other interesting features of the plots in Figure 5B are differences in the tails of the distributions.
For the two RNA libraries, there are only a few variants in the tails
of the distributions, while there are a significant number of variants
with one or two adenines with very low recovery ratios in the DNA
libraries. This suggests that saporin targets these variants in a
sequence-specific rather than nonspecific manner.
As mentioned above, the variants with the lowest recovery ratios
in the DNA loop library contain the ACG motif in a tetraloop. The
predicted secondary structures of the highly depleted variants containing
one or two adenines from the SRL-based library suggest that they may
form smaller hairpins with a tetraloop resembling the ACG motif (Figure 5C).
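The grouping behind this comparison can be sketched in a few lines of Python. This is a minimal illustration and not the analysis code used in the study: the function names, the example recovery ratios, and the `cutoff` value are hypothetical, and medians stand in for the full box plots.

```python
from collections import defaultdict
from statistics import median


def median_recovery_by_adenine_count(recovery):
    """Group variants by number of adenines; return the median recovery ratio per group.

    `recovery` maps a variant sequence to its recovery ratio.
    """
    groups = defaultdict(list)
    for seq, ratio in recovery.items():
        groups[seq.upper().count("A")].append(ratio)
    return {n: median(vals) for n, vals in sorted(groups.items())}


def sensitive_outliers(recovery, max_adenines=2, cutoff=0.4):
    """List variants with few adenines but a very low recovery ratio.

    `cutoff` is an arbitrary illustrative threshold, not a value from the study.
    """
    return sorted(
        seq for seq, ratio in recovery.items()
        if 1 <= seq.upper().count("A") <= max_adenines and ratio < cutoff
    )


# Hypothetical recovery ratios, for illustration only (not measured data).
example = {"GGCC": 1.10, "GACC": 0.95, "ACGC": 0.30, "GAGA": 0.80, "AACA": 0.60}
print(median_recovery_by_adenine_count(example))
print(sensitive_outliers(example))
```

A downward trend in the per-group medians would correspond to nonspecific targeting of adenine, whereas variants returned by `sensitive_outliers` would be candidates for sequence-specific targeting.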
Confirmation of Sequencing Results
Variants from the
top, middle, and bottom of the preference distribution were selected
and treated with saporin to confirm the different preferences inferred
by sequencing. After saporin treatment, the oligos were cleaved by
aniline and analyzed by PAGE. Variants from the loop libraries were
selected because they exhibited the clearest preference patterns.
According to the sequencing results for RNA, two variants sensitive
to depurination by saporin (AUAGAC and AACAGA), one variant with intermediate
sensitivity (CAUGCC), and one variant resistant to depurination (GGAGAC)
were picked for analysis. Panels A and B of Figure 6 show the results for the RNA substrates.
The bands corresponding to cleavage products are more intense for
the sensitive variants than for the other two, indicating that these
are indeed more readily depurinated by saporin.
Figure 6
Confirmation of the sequencing
results using PAGE. (A) Representative
PAGE gels. RNA (500 nM) with randomized AUAGAC (lanes 1, 5, and 9),
AACAGA (lanes 2, 6, and 10), CAUGCC (lanes 3, 7, and 11), and GGAGAC
(lanes 4, 8, and 12) sequences were treated with the indicated amounts
of saporin for 30 min at 37 °C before aniline treatment. DNA
oligos OSaH455 (G-ACGA-C in lanes 1, 5, and 9), OSaH456 (G-ACGC-C
in lanes 2, 6, and 10), OSaH457 (G-GAGA-C in lanes 3, 7, and 11),
and OSaH458 (G-GACC-C in lanes 4, 8, and 12) at 500 nM were incubated
with the indicated amounts of saporin for 30 min at 37 °C before
aniline treatment. Only the sequence in the loop is shown in the figure.
(B) Quantification of the fraction cleaved (FL) from PAGE gels using
three separate experiments. Shown are the average and standard deviation.
Asterisks indicate statistically significant differences at the **p < 0.01 or ****p < 0.0001 level.
(C) Kinetic analysis of the most sensitive ACGA tetraloop variant
and a tetraloop variant containing the GAGA SRL motif. The indicated
amounts of 5′FAM-labeled substrates OSaH525 and OSaH526 were
incubated with 20 nM saporin at 37 °C. The gel pictures show
cleavage after (1) 0, (2) 5, (3) 10, (4) 15, and (5) 30 min for the
sensitive ACGA variant and (1) 0, (2) 30, (3) 50, (4) 90, and (5)
180 min for the GAGA tetraloop variant. Note that the contrast was
adjusted in both pictures to better show the cleavage products. The
experiment was repeated three times, and the initial reaction velocity
was estimated after 5 min for the ACGA variant and after 180 min for the
GAGA variant. A plot of the average initial velocity against the substrate
concentration is shown at the right. Error bars indicate the standard
deviation from three experiments. Nonlinear regression was performed
to estimate the KM for the sensitive ACGA
tetraloop.
From the DNA loop library, we also selected two
sensitive variants
(G-ACGA-C and G-ACGC-C), one intermediate
variant (G-GAGA-C), and one resistant variant (G-GACC-C). All of these variants are predicted to form a tetraloop
and contain at least one adenine base in the loop that could theoretically
be targeted by saporin. Panels A and B of Figure 6 show that the variants that should be sensitive
according to the sequencing results were efficiently cleaved even
at low saporin concentrations. The oligo with the GAGA loop (intermediate
sensitivity according to sequencing) showed visible cleavage only
at the high saporin concentration, while the supposedly resistant
G-GACC-C variant did not show clear cleavage bands. Treatment of these
selected oligos with saporin thus yielded the cleavage band intensities
expected from the sequencing results.
To further validate the
results obtained by sequencing, the kinetics
of cleavage of the most sensitive DNA variant containing the ACGA
tetraloop was investigated. Varying concentrations of the tetraloop
variant were reacted with 20 nM saporin for 5, 10, 15, or 30 min,
and the amount of product produced was estimated by a PAGE assay.
The oligo was fluorescently labeled with a FAM fluorophore at the
5′ end to facilitate quantification. From the initial reaction
velocities, the Michaelis constant KM was
estimated to be 197 nM [95% confidence interval (CI) from 133.2 to
287.6 nM] and vmax to be 24.3 nM/min (95%
CI from 21.8 to 27.4 nM/min). We also tried to estimate the kinetic
parameters for the GAGA tetraloop substrate that mimics the native
SRL substrate. Cleavage of this substrate was too slow to accurately
estimate the kinetic parameters using our assay (Figure 6C). Previously reported KM values for native substrates range from 9
to 95 μM.[29,30] Therefore, our sequencing assay
identified DNA substrates with a higher affinity (lower KM) for saporin than native SRL sequences.
Discussion
The main findings of this study are (1)
the wild-type SRL sequence
is not efficiently targeted by saporin, (2) saporin depurinates all
adenine residues in single-stranded regions of DNA at high saporin
concentrations, and (3) saporin most efficiently targets DNA hairpins
containing tetraloops starting with the bases ACG.
Although
the SRL sequence is generally assumed to be the natural
target of RIPs, most sequences in the randomized RNA library based
on the SRL are depurinated more efficiently than the SRL. It has been
suggested that for specific nucleic acid binding proteins the natural
substrate sequences are located at the top of the affinity distribution
while for nonspecific proteins the natural substrates are located
in the middle of the distribution.[23] This
interpretation would suggest that the RNA:adenosine glycosidase
activity of saporin is nonspecific, because the natural SRL substrate
is not preferentially depurinated, but located closer to the end of
our preference distribution. In contrast to the nonspecific action
toward RNA, we found that saporin preferentially targets DNA hairpins
at a tetraloop containing the ACG sequence. This seems to be achieved
by a higher affinity for variants containing a tetraloop with the
ACGN motif.
Our sequencing assay provides information about
the target specificity
(or lack thereof) of saporin and other depurinating enzymes. However,
there are several limitations that need to be considered. As presented,
this assay is based on single-time-point measurements of the substrates
reacted with saporin. Therefore, it can provide relative reactivities
of the substrates only under a specific condition (reaction time,
enzyme concentration, etc.), which significantly affects the outcome.
A parallel assay of multiple time points should provide more information
about the substrate specificity and kinetics at the cost of additional
labor and sequencing capacity. Another limitation is that the assay
provides no information about the site of reaction when there are
multiple adenines in the substrate.
This is the first time that
a highly specific DNA motif has been
reported for a RIP. The result might encourage more investigations
into the effect of RIPs on DNA. However, ssDNA and DNA hairpins occur
rarely in living cells. Some examples where ssDNA occurs are during
DNA replication,[34] DNA repair,[35] and transcription,[36] in R loops,[37] or in viral genomes and
during their replication.[38] A feature of
many RIPs that is difficult to explain on the basis of their rRNA:adenosine
glycosidase activity is their antiviral property.[39,40] More general RNA- and DNA-targeting activity could in part explain
the observed antiviral effects. It also seems possible that some DNA
viruses that infect S. officinalis could carry sequences
that form hairpins with the recognition motif in the loop. The same
could be true for other RIPs that have been suggested to protect plants
from viruses.[41,42] RIPs could thus be part of the
plant immune defenses to target and destroy foreign DNA (or RNA) sequences
that contain the targeted motif.
In light of our results, saporin-based
immunotoxins might also
act primarily by attacking DNA rather than by inhibiting protein synthesis.
This could lead to side effects different from those that would be expected after
ribosome inactivation. RIPs are also known to cause nuclear damage
and apoptosis, but so far, no direct link between ribosome inactivation
and apoptosis activation has been proven.[43,44] Preferential depurination of DNA over RNA could explain such observations.
In this case, nucleus to mitochondria (NM) signaling[45] might be the missing link between RIP activity and apoptosis.
Some RIPs have even been shown to enter the nucleus and enrich the
nucleoli,[46,47] a place with strong transcriptional activity
where ssDNA segments might be present.
Finally, a more general
observation is that the most sensitive
variants of the DNA libraries were depurinated more than the most
sensitive RNA variants (recovery ratios of >70% for RNA and <40%
for DNA). Interestingly, other RIPs also preferentially target DNA
over RNA,[10,28] so the DNA:adenosine glycosidase activity
may be a general feature of RIPs that has not received sufficient
attention. Using our method, it is now possible to study the sequence
preference of other RIPs, including important toxins such as ricin
and Shiga toxin. It is also possible to sequence samples from multiple
time points to obtain detailed kinetic data on the depurination of
all variants in the library similar to the results obtained for C5.[23]
Materials and Methods
Oligonucleotides
All oligonucleotides used in this
study were ordered from Eurofins Genomics, IDT, or Sigma and are summarized
in Table . The oligos
were stored as 100 μM stocks in water. Before use in depurination
assays, the oligos were diluted to 2 or 20 μM in water, heated
for 3 min at 98 °C and rapidly cooled on ice. These stocks were
stored at −20 °C until use.
Primers OSaH268 and OSaH269
were used to amplify constructs of randomized oligos OSaH378 and OSaH266
(33 nM) containing the T7 promoter. Q5 High-Fidelity 2X Master Mix
[New England Biolabs (NEB)] was used for PCR with 33 nM template and
1.6 μM primers in 60 μL reaction mixtures. The thermal
cycling conditions for all PCRs can be found in the Supporting Information.
The dsDNA products were purified
from a 2.5% agarose gel (25 min at 135 V, stained with ethidium bromide)
using the Zymoclean Gel DNA Recovery Kit (Zymo Research) with two
additional washing steps to ensure complete removal of agarose. The
libraries were eluted with 16 μL of water, and the concentration
was measured using NanoDrop One (Thermo Fisher). RNA was transcribed
from DNA templates using T7 RNA polymerase (NEB) according to the
manufacturer’s instructions for 12–16 h. After digestion
of the DNA template for 30 min at 37 °C with DNaseI (Takara Bio),
RNA was purified using the RNA Clean and Concentrator-5 Kit (Zymo
Research). The RNA concentration was determined using NanoDrop One.
For the confirmation of the sequencing results, DNA templates were
generated as described above using oligo OSaH466, OSaH467, OSaH468,
or OSaH469 as the template. The DNA was purified using the DNA Clean
and Concentrator-5 Kit (Zymo Research). RNA was transcribed as described
above and PAGE purified after DNase I digestion.
PAGE
Polyacrylamide gels were cast using a premixed
40% acrylamide/bisacrylamide solution at the final concentrations
indicated with 7 M urea. Gels were run at 200 V for 60 min (8%) or
100 min (16%). RNA and DNA were stained in polyacrylamide gels in
40 mL of TBE with 1× SYBR Gold for 10 min and then visualized
using a GE Typhoon FLA 9000 scanner.
For PAGE purification,
the bands of interest were cut from the gel, finely crushed inside
a microcentrifuge tube, and extracted with 300 μL of extraction
buffer [30 mM Tris (pH 7.5) and 30 mM NaCl] at 4 °C overnight.
The gel was then pelleted at 21000g for 2 min, and
the supernatant was carefully removed. Nucleic acids were recovered
from the supernatant by ethanol precipitation [with 0.1 volume of
3 M sodium acetate (pH 5.2) and 3 volumes of 99% ethanol] and dissolved
in water at the volume required for further processing.
Saporin Treatment and Aniline Cleavage
Saporin treatment
and aniline cleavage were performed according to a previous report.[3] Briefly, DNA oligos or in vitro-transcribed RNA
was incubated with the indicated amounts of saporin (S9896, Sigma-Aldrich)
in 30 μL of reaction buffer [25 mM Tris (pH 7.5), 25 mM KCl,
and 5 mM MgCl2] at 37 °C for 30 min or as long as
indicated. The reactions were quenched by adding SDS to a final concentration
of 2% from a 10% stock solution. After saporin treatment, the oligos
were precipitated by addition of 0.1 volume of 3 M sodium acetate
and 3 volumes of 99% ethanol with centrifugation at 18000g and 4 °C for 15 min. The pellets were resuspended in 20 μL
of acetic acid/aniline (2.5 and 1 M) and incubated at 60 °C for
5 min. Aniline was removed by extraction with 500 μL of diethyl
ether, and acetic acid was removed by an additional round of ethanol precipitation.
Finally, the pellets were resuspended in 3 μL of water and 3
μL of 2× RNA loading buffer for PAGE analysis or purification.
For the confirmation experiment with RNA, the reaction was scaled
down using half of the amounts of all components.
Reverse Transcription
For reverse transcription, a
mixture of saporin-treated RNA (3 μL), reverse transcription
primer (1.5 μL, 10 μM), 2.5 mM dNTPs (2.5 μL), and
water (0.5 μL) was denatured at 98 °C for 3 min and cooled
to 4 °C. Then, 5× buffer (2 μL), RNase inhibitor,
murine (0.25 μL, NEB), and Maxima H Minus Reverse Transcriptase
(0.25 μL, Thermo Fisher) were added. The reaction mixture was
incubated at 50 °C for 30 min and then at 85 °C for 5 min.
The cDNA was separated and purified on a denaturing polyacrylamide
gel (8%). The reverse transcription primers were OSaH247 and OSaH249.
These contained different barcodes for different treatment times (0
and 30 min, respectively).
The second strand of cDNA was synthesized
in a PCR using NEB Q5 Master Mix and primers OSaH130 and OSaH382 (final
concentration of 50 nM) with 2 μL of the cDNA (resuspended in
10 μL of water after gel extraction) as the template in a total
volume of 20 μL. In the case of DNA oligos, OSaH382 and OSaH247
or OSaH249 were used in the first PCR (also using 2 μL of a
saporin-treated library recovered by PAGE and resuspended in 10 μL
of water) to add the sequencing adapters.
Preparation of the Sequencing Library
A second PCR
using Illumina UDI primers was performed using 1.5 μL of the
first PCR as the template. This PCR was performed twice. First, 5
μL of PCR mix was loaded on a 3% agarose gel after 5, 10, and
15 cycles to estimate the number of cycles necessary to obtain a sufficient
amount of the final library. The gels were stained using ethidium
bromide, and the amounts of the products were estimated by comparison
with the band intensity of the DNA size marker (50 bp DNA ladder,
NEB). The second repeat of this PCR was performed for the number of
cycles judged necessary (five to eight cycles) using 30 μL reaction
mixtures. After the addition of 6× loading dye, the complete
PCR mix was loaded into two wells of a 3% agarose gel, run for 25
min at 135 V, stained with ethidium bromide, and cut from the gel.
The DNA was recovered using the gel extraction kit with two additional
washing steps to ensure complete removal of agarose. The libraries
were eluted with 8 μL of water, and the concentration was measured
using NanoDrop One.
Sequencing and Data Analysis
Sequencing was performed
using Illumina NovaSeq by OIST DNA Sequencing Section. The reads corresponding
to different treatment times were selected on the basis of the barcode
added during the first PCR or reverse transcription. From these reads,
the randomized section was extracted and the occurrence of each sequence
was counted using a bash script.
From the raw sequencing count
for each variant, the relative abundance in the library was calculated
by division by the sum of the counts of all variants in the respective
library. The recovery ratio of each variant was then calculated by
dividing the relative abundance after treatment by the relative abundance
before treatment. This measure was used to rank variants from sensitive
(low recovery ratio) to resistant (high recovery ratio). To quantify
the cleavage rate for different variants from PAGE data, the average
intensities of the cleaved and uncleaved bands were measured using
ImageJ. The fraction cleaved (FL) was calculated by dividing the intensity
of the cleaved bands by the sum of the intensities of the cleaved
and uncleaved bands after subtracting the gel background and normalizing
for the measured area. Three independent experiments were performed,
and statistically significant differences were determined using a two-way
analysis of variance (ANOVA) in GraphPad Prism 9 with the FDR method
of Benjamini and Hochberg.
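The counting and recovery-ratio calculation described in this section can be sketched as follows. The study used a bash script for counting, so this Python version is a substitute for illustration only. The flank sequences, the example reads, and all function names are hypothetical, and reads are assumed to have already been split by barcode.

```python
from collections import Counter


def count_variants(reads, left_flank, right_flank, length):
    """Count randomized regions of `length` nt located between two constant flanks."""
    counts = Counter()
    for read in reads:
        start = read.find(left_flank)
        if start < 0:
            continue  # left flank not found, skip this read
        start += len(left_flank)
        region = read[start:start + length]
        # Keep the region only if it is complete and followed by the right flank.
        if len(region) == length and read.startswith(right_flank, start + length):
            counts[region] += 1
    return counts


def relative_abundance(counts):
    """Divide each variant count by the total count of the library."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("library contains no counted reads")
    return {seq: n / total for seq, n in counts.items()}


def recovery_ratios(counts_before, counts_after):
    """Relative abundance after treatment divided by relative abundance before.

    Variants missing after treatment get a ratio of 0; variants missing
    before treatment are skipped because the ratio is undefined.
    """
    before = relative_abundance(counts_before)
    after = relative_abundance(counts_after)
    return {seq: after.get(seq, 0.0) / ab for seq, ab in before.items()}


# Hypothetical flanks and reads, for illustration only.
LEFT, RIGHT = "TTG", "CAA"
reads_0min = ["TTGACGACAA", "TTGACGACAA", "TTGGAGACAA", "TTGGAGACAA"]
reads_30min = ["TTGACGACAA", "TTGGAGACAA", "TTGGAGACAA", "TTGGAGACAA"]

ratios = recovery_ratios(
    count_variants(reads_0min, LEFT, RIGHT, 4),
    count_variants(reads_30min, LEFT, RIGHT, 4),
)
# Rank from sensitive (low recovery ratio) to resistant (high recovery ratio).
ranking = sorted(ratios, key=ratios.get)
print(ratios, ranking)
```

Because both libraries are normalized to their own totals, the ratio reflects a change in relative abundance rather than in absolute read depth.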
Native PAGE
Each oligo (500 nM) was incubated in 10
μL of reaction buffer [25 mM Tris (pH 7.5), 25 mM KCl, and 5
mM MgCl2] at 98 °C for 3 min and then immediately
cooled on ice. Slow cooling was performed in a Bio-Rad T100 cycler
at 0.1 °C/s until 4 °C was reached. Samples were then mixed
with an equal volume of loading dye (2× TBE, 50% glycerol, and
0.1% bromophenol blue), loaded onto a 15% native polyacrylamide gel,
and run at 200 V for 60 min, before being stained with SYBR Gold.
Imaging was performed with a GE Typhoon FLA 9500 imager.
Kinetic Analysis
For kinetic analysis, different concentrations
of 5′FAM-labeled oligos OSaH525 and OSaH526 were incubated
with 20 nM saporin in 40 μL of reaction buffer at 37 °C
(heat block) for the indicated times. At each time point, the reaction
was quenched by addition of SDS to a final concentration of 2%. The
samples were precipitated with ethanol and sodium acetate, treated
with aniline, washed with ether, and precipitated again, as described
in Saporin Treatment and Aniline Cleavage. Each lane of a 16% denaturing urea–polyacrylamide gel
was loaded with 2.5 pmol of the oligo, and the gel was run at 200 V for
60 min. The gels were imaged using a GE Typhoon FLA 9500 instrument,
and the bands corresponding to full length and cleaved oligos were
quantified with ImageJ. The fraction cleaved was calculated as described
above. This was multiplied by the initial substrate concentration
to estimate the product concentration at each time point. The
initial reaction velocity was estimated from the product concentration
at 5 min for the sensitive substrate and at 180 min for the resistant
oligo. The experiments were performed three times independently. A
nonlinear fit to the Michaelis–Menten equation was performed
using GraphPad Prism 9 to estimate the kinetic parameters.
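The calculation chain above (fraction cleaved, product concentration, initial velocity, Michaelis–Menten fit) can be sketched in plain Python. The fit in the study was done in GraphPad Prism 9; the routine below is a substitute, not the authors' procedure. It uses the fact that, for a fixed KM, the least-squares vmax has a closed form, and then searches over KM with a golden-section search. All function names, the search bounds, and the example numbers are assumptions, band intensities are assumed to be area-normalized already, and confidence intervals are not computed.

```python
import math


def fraction_cleaved(cleaved, uncleaved, background=0.0):
    """Cleaved intensity over total intensity, after background subtraction."""
    c = cleaved - background
    u = uncleaved - background
    return c / (c + u)


def initial_velocity(frac, substrate_nM, minutes):
    """Product concentration (fraction cleaved x initial substrate) per minute."""
    return frac * substrate_nM / minutes


def _vmax_for_km(km, s, v):
    # For fixed KM the model is linear in vmax, so least squares is closed form.
    f = [si / (km + si) for si in s]
    return sum(vi * fi for vi, fi in zip(v, f)) / sum(fi * fi for fi in f)


def _sse(km, s, v):
    vmax = _vmax_for_km(km, s, v)
    return sum((vi - vmax * si / (km + si)) ** 2 for si, vi in zip(s, v))


def fit_michaelis_menten(s, v, km_lo=1e-3, km_hi=1e5, iters=200):
    """Least-squares fit of v = vmax * S / (KM + S).

    Golden-section search over log(KM); assumes a single minimum within
    the (hypothetical) bounds km_lo..km_hi.
    """
    phi = (math.sqrt(5) - 1) / 2
    a, b = math.log(km_lo), math.log(km_hi)
    c, d = b - phi * (b - a), a + phi * (b - a)
    for _ in range(iters):
        if _sse(math.exp(c), s, v) < _sse(math.exp(d), s, v):
            b = d
        else:
            a = c
        c, d = b - phi * (b - a), a + phi * (b - a)
    km = math.exp((a + b) / 2)
    return km, _vmax_for_km(km, s, v)


# Synthetic, noise-free velocities generated from KM = 200 nM, vmax = 24 nM/min.
substrate = [50.0, 100.0, 200.0, 400.0, 800.0]
velocity = [24.0 * s / (200.0 + s) for s in substrate]
km_fit, vmax_fit = fit_michaelis_menten(substrate, velocity)
print(round(km_fit, 3), round(vmax_fit, 3))
```

With real gel data, each velocity would come from `initial_velocity(fraction_cleaved(...), substrate_nM, minutes)` averaged over the three independent experiments.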
Pairwise Coupling Model
We used a model similar to
that of Guenther et al.[23] The model consists
of position terms x that are described by the linear
coefficients β. x is equal to 1 if there is
a specific base at a specific position or 0 otherwise. N equals 24 for four nucleotides at six positions.
Next, pairwise terms were created, where each
variable is 1 if the sequence has a given pair of nucleotides at specific
positions (e.g., A3U6) and 0 otherwise. This creates a total of 240 pairwise terms (15 position pairs × 16 nucleotide pairs).
The position model presented above was then fitted with a separate
pairwise term 240 times using the ordinary least-squares linear regression
model from the statsmodels Python package. The pairwise terms with
a p value of >0.000005, indicating linear coefficients
not statistically different from zero, were discarded. Next, all statistically
significant terms were added to the position model. The model was
then trimmed with stepwise regression. We used backward elimination,
where starting from the full set of variables, the model was iteratively
fitted as described above, but at each iteration, the least statistically
significant term (highest p value) was removed until
all terms were significant. This led to a position model with significant
pairwise terms added. The model was then used to predict the values
of all variables. The R2 score (coefficient
of determination) between predicted and observed values was calculated
using the r2_score function from the Python scikit-learn package.
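The encoding of position and pairwise terms can be illustrated with a short Python sketch. It covers only the construction of the 0/1 variables and a plain implementation of the coefficient of determination. The regression, the p value filtering, and the backward elimination are not reproduced here. The function names and the 1-based position convention are assumptions made for this example.

```python
from itertools import combinations, product

BASES = "ACGU"  # use "ACGT" for the DNA libraries
N_POS = 6       # six randomized positions


def position_terms():
    """All (position, base) terms: 6 positions x 4 bases = 24."""
    return [(pos, base) for pos in range(1, N_POS + 1) for base in BASES]


def pairwise_terms():
    """All pairs of bases at two different positions: 15 x 16 = 240."""
    return [
        ((p1, b1), (p2, b2))
        for p1, p2 in combinations(range(1, N_POS + 1), 2)
        for b1, b2 in product(BASES, repeat=2)
    ]


def encode(seq, pos_terms, pair_terms):
    """Return the 0/1 indicator vectors of a sequence for both term sets."""
    x_pos = [1 if seq[pos - 1] == base else 0 for pos, base in pos_terms]
    x_pair = [
        1 if seq[p1 - 1] == b1 and seq[p2 - 1] == b2 else 0
        for (p1, b1), (p2, b2) in pair_terms
    ]
    return x_pos, x_pair


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


pos_terms, pair_terms = position_terms(), pairwise_terms()
x_pos, x_pair = encode("GGAGAU", pos_terms, pair_terms)
a3u6 = pair_terms.index(((3, "A"), (6, "U")))
print(len(pos_terms), len(pair_terms), x_pair[a3u6])
```

Each sequence sets exactly one position term per position and one pairwise term per position pair, so every encoded variant has 6 active position terms and 15 active pairwise terms.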