Praneeth Bommisetti1, Vahe Bandarian1. 1. Department of Chemistry, University of Utah, 315 South 1400 East, Salt Lake City, Utah 84112, United States.
Abstract
The transfer RNA (tRNA) modification 4-thiouridine (s4U) acts as a near-ultraviolet (UVA) radiation sensor in Escherichia coli (E. coli), where it induces a growth delay upon exposure to UVA radiation (∼310-400 nm). Herein, we report a sequencing methodology for site-specific profiling of the s4U modification in E. coli tRNAs. Upon the addition of iodoacetamide (IA) or iodoacetyl-PEG2-biotin (BIA), the nucleophilic sulfur of s4U forms a reaction product that is extensively characterized by liquid chromatography-mass spectrometry (LC-MS/MS) analysis. This method is readily applied to the alkylation of natively occurring s4U on E. coli tRNA. Next-generation sequencing of BIA-treated tRNA from E. coli revealed misincorporations at position 8 in 19 of the 20 amino acid tRNA species. In contrast, tRNA from the ΔthiI strain, which cannot introduce the s4U modification, does not exhibit any misincorporation at the corresponding positions, directly linking the base transitions to the tRNA modification. Independently, the s4U modification on E. coli tRNA was further validated by LC-MS/MS sequencing. Digestion of tRNA from wild-type and deletion strains of E. coli with RNase T1 generated smaller s4U- or U-containing fragments that could be analyzed by MS/MS for modification assignment. Furthermore, RNase T1 digestion of tRNAs treated with either IA or BIA showed the specificity of iodoacetamide reagents toward s4U in the context of complex tRNA modifications. Overall, these results demonstrate the utility of the alkylation of s4U in the site-specific profiling of the modified base in native cellular tRNA.
A common characteristic
among cellular ribonucleic acids, such
as ribosomal RNA (rRNA), transfer RNA (tRNA), and messenger RNA (mRNA),
is the presence of post-transcriptional modifications. While these
RNA species possess numerous post-transcriptional modifications, the
numbers and diversity of modifications are the most significant in
tRNA.[1] To date, a total of 108 modifications
have been reported in tRNA, and the presence of these modifications
across all domains of life highlights their importance.[2] The anticodon stem-loop region on tRNA contains
most of these modifications, presumably because of its role in codon–anticodon
interactions during protein synthesis, with modifications at positions
34 and 37 being most prevalent.[1,3]
The discovery of many nucleic acid modifications coincided with
the efforts in the late 1960s and early 1970s to determine nucleic
acid sequences, and the roles of quite a few remain enigmatic to this
day. The emerging consensus is that many are not limited to RNA. For
example, 7-deazapurine-based modified bases, which were initially
discovered in the wobble positions of tRNA Asp, Asn, Tyr, and His,
have recently been identified in bacteriophage DNA. Their presence
is thought to protect the phage DNA from the host restriction endonuclease
response.[4,5] There is also cross-talk between DNA and
RNA modifications in eukaryotes, where a class of DNA methylating
enzymes alkylate certain tRNA.[6] The dynamics
of cellular insertion, detection, and removal of these modifications
represent an important dimension in understanding their physiological
roles. While detecting the modified bases is relatively trivial using
high-resolution analytical separation and mass spectrometry, site-specific
profiling methods remain challenging.
The advent of commercially
available high-throughput next-generation
sequencing (NGS) technologies during the last 2 decades has opened
new research avenues for profiling post-transcriptional RNA modifications.[7] In these experiments, the conversion of RNA to
complementary DNA (cDNA) in the reverse transcription (RT) step serves
as a readout where a non-canonical nucleoside in the RNA template
can result in the misincorporation of deoxynucleotides, premature
termination, or both during the RT.[7] While
some non-canonical bases such as inosine (I) may induce these directly,
others require treatment with a chemical reagent before they lead
to a distinct readout.[7] For example, the
use of chemical reagents specific to pseudouridine (ψ) and 5-methylcytosine
(m5C) facilitates transcriptome-wide detection of these
modifications.[8−10] Studies from different laboratories also show that
NGS conditions are responsive to many tRNA modifications, where empirical
analysis of misincorporations and terminations (collectively known
as RT events) can predict the presence of base modifications in uncharacterized
bacterial tRNA genes.[11,12] However, in many cases, uncovering these
events requires extensive statistical analysis because the probabilities
of misincorporation or termination are low. Ideally, chemical
reagents that target specific modifications and increase RT event
probabilities could overcome these limitations. However, the diversity
of the modifications requires the development of a new procedure that
leverages the unique reactivity of the modified base in each case.
Mass spectrometry can be used as an alternative technique for
detecting RNA modifications, presenting a more straightforward approach
to the sequence-specific analysis of RNAs.[13] The principles used in such methods are adapted from proteomics,
where digestion of a protein with a specific protease generates a library of smaller
peptide fragments that can subsequently be analyzed by
liquid chromatography–tandem mass spectrometry (LC–MS/MS).[7] Similarly, the cleavage of RNA into smaller fragments
with a specific nuclease generates a library of smaller RNAs that can be
separated by LC. The MS/MS fragmentation of eluting peaks can be compared
against the theoretically generated fragmentation data from the input
sequences to detect the modifications on the RNA.[14−17] While these methods are much
more direct than NGS, they are limited by the need for specialized
instrumentation and the lack of robust tools for analyzing MS fragmentation
data. In the current report, using the tRNA as a model system for
4-thiouridine (s4U) containing RNA species, we developed
and validated a methodology to site specifically profile the base
in the stationary growth phase of Escherichia coli (see Figure for
the s4U structure).
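The nuclease-digestion step described above can be illustrated in silico. The following is a minimal Python sketch, not the analysis code used in this work. It assumes only the well-known specificity of RNase T1 (cleavage on the 3′ side of guanosine) and ignores modified bases, missed cleavages, and terminal phosphate chemistry. The function name and the example sequence are hypothetical.

```python
def rnase_t1_digest(rna_sequence):
    """Split an RNA sequence after every G, mimicking complete RNase T1 digestion.

    Simplifying assumptions: every G is cleaved, no modified residues block
    cleavage, and fragment end groups are not tracked.
    """
    fragments = []
    current = []
    for base in rna_sequence.upper():
        current.append(base)
        if base == "G":
            # RNase T1 cuts on the 3' side of G, so G ends the fragment.
            fragments.append("".join(current))
            current = []
    if current:
        # Trailing residues after the last G form the 3'-terminal fragment.
        fragments.append("".join(current))
    return fragments


# Hypothetical 8-nt sequence, used only to show the splitting rule.
example_fragments = rnase_t1_digest("AUGGCUAG")
print(example_fragments)  # ['AUG', 'G', 'CUAG']
```

In a real workflow, each predicted fragment would be paired with a calculated mass (with and without the modification of interest) before comparison with the observed MS/MS data.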
Figure 2
UV–visible spectrophotometric analysis of IA-modified s4U. The spectra are of 20 μM authentic s4U
and modified s4U after treatment with IA or BIA. The spectra
were obtained in 0.05 M NaPi (pH 8.0).
Native profiling of tRNA modifications
shows that RNA-Seq conditions
are sensitive to the presence of the s4U, where it causes
low rates of mismatches.[11,12,18] The s4U alone displays empirical mismatch rates between
0.1 and 0.3, which may not be conclusive when applied in a context
where the existence of s4U is unknown. For example, while
s4U is commonly present at the 8th or 9th position of bacterial
tRNA, in the case of archaeal tRNA or uncharacterized bacterial tRNA,
limitations may occur in confidently assigning the s4U
position on the gene based on the statistical probabilities alone.
The nucleophilicity of the thiouridine base and its reactivity
with thiol modification reagents offer a viable route for introducing
a bulky modification that could potentially report its presence in
NGS experiments.[19−22] In the SLAM-seq method, s4U is metabolically
incorporated into the eukaryotic mRNA pool in pulse-chase experiments,
which is subjected to NGS after iodoacetamide (IA) treatment. The
dynamics are monitored by exploring the frequency of mismatches observed
in 3′-untranslated regions of mRNA. However, under these conditions,
∼0.5% of the total cellular RNA is labeled with s4U after 6 h of incubation with 0.1 mM s4U.[23]
We envisioned a methodology, starting
from E. coli cells using RNA-Seq, to
detect s4U modification in a
site-specific manner (Figure ). To achieve this, the documented susceptibility of s4U to electrophilic reagents, such as IA, was explored.[21,22] We reasoned that the presence of a large adduct, such as biotin-IA,
would lead to a robust mismatch response in NGS, allowing single-nucleotide
resolution s4U detection. This article reports the characterization
of s4U-IA or s4U-iodoacetyl-PEG2-biotin (BIA)
products by high-resolution MS methods to establish conditions that
maximized the yield of labeled s4U. Next, these methods
were extended to bulk RNA from E. coli and NGS experiments using size-selected RNA to enrich tRNA. Remarkably,
the base transitions observed in the NGS data clearly show that most
detectable tRNAs in E. coli have the
modification at position 8 or 9 with robust probabilities. These base
transitions are absent in identically treated samples from cells where
a key tRNA sulfur insertion enzyme, ThiI, was absent. Finally, LC–MS/MS
analysis of untreated and IA- or BIA-treated tRNA from wild-type and deletion E. coli strains digested with RNase T1 corroborates
the positional modification data from the high-throughput NGS studies.
Together, these data provide a workflow for single-base resolution
sequencing for s4U on bacterial or archaeal tRNAs.
Figure 1
Overview of
the workflow for the detection of s4U using
RNA-Seq and mass spectrometry. In this workflow, extracted RNA is
treated with IA or BIA prior to RNA-Seq to identify misincorporation
or coverage changes. Likewise, tRNA is digested with RNase
T1, followed by LC–MS/MS analysis of the tRNA fragments, which independently
establishes the assignment. RNase T1 digestion followed by LC–MS/MS
analysis of either IA- or BIA-treated tRNA establishes the specificity
of iodoacetamide reagents toward s4U in the context of
complex tRNA modifications.
Results and Discussion
The overall outline of the studies presented
in this article is
shown in Figure .
To ensure that the sequencing data reflect the s4U modification, we first carried out several control experiments
to characterize the BIA modification reaction.
Characterization of Reaction between s4U and IA (or BIA)
When combined with IA, s4U carries out a
nucleophilic attack via the sulfur atom at the C-2 of IA (or BIA),
as shown in Figure (or Figure S1). Therefore, s4U was incubated with excess IA or BIA,
and the reaction mixture was subsequently analyzed by UV–visible
spectroscopy. Figure shows the absorbance spectra of s4U (black trace) with
an absorption maximum at ∼330 nm, s4U treated with
IA (red trace), and s4U treated with BIA (blue trace),
both of which have an absorption maximum at ∼303 nm. The ∼27
nm blue shift is consistent with a previous report.[23]
To confirm that the incubation of s4U with IA leads
to the formation of the expected product, reactions were analyzed
by LC–MS (Figure S2). Under these
conditions, s4U elutes at 11.8 min. The mass spectrum of
the species eluting at 11.8 min contains two species, corresponding
to the m/z values of [s4U + Na+] and [s4U + K+], within
six ppm of the theoretical m/z values
(Figure S2C). In the presence of IA, a
new peak at 12.3 min appears. The mass spectrum of the species eluting
in this peak is consistent with the m/z of [s4U-IA + H+], [s4U-IA + Na+], and [s4U-IA + K+], which are within
seven ppm of the theoretical values (Figure S2B).
To assess the specificity of IA for s4U versus
the four canonical nucleosides, s4U was treated with IA in the presence
of equimolar quantities of A, U, G, and C. The reaction mixtures were
subsequently analyzed by HPLC-MS (Figure S3). Under these conditions, in the presence of IA, the s4U peak at 10.9 min is replaced with one at 11.5 min. The canonical
nucleosides remained unaffected. The UV–visible spectra corresponding
to each of s4U and s4U-IA peaks (Figure S3B,C) are consistent with those shown
in Figure . As is
apparent, no other changes are visible in the chromatograms. In passing,
we note that a new peak at 16 min is also observed in these experiments,
which has the same UV–visible features as s4U-IA.
We do not know its identity, and because it was a minor component,
it was not investigated further.
To determine if BIA reacted
with s4U in a similar manner,
s4U was reacted with BIA and analyzed by LC–MS.
The extracted ion chromatogram (EIC) shows that s4U (m/z 261.05) and
BIA (m/z 542.10) elute at
13.5 and 34.3 min, respectively (Figure A). The mass spectrum corresponding to the
species eluting at 13.5 min is a mixture of [s4U + H+], [s4U + Na+], and [s4U
+ K+], as observed previously (compare Figures C and S2C). The observed m/z values
for these species are within three ppm of theoretical. The reaction
product elutes at 33.03 min, and the mass spectrum of the peak exhibits
a [s4U-BIA + H+] species with m/z of 675.25 (Figure B), which is within 0.3 ppm of the expected mass.
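The ppm comparisons quoted throughout this section follow the standard definition of relative mass error. The short Python sketch below illustrates that arithmetic for the [s4U + H+] ion. It is an illustration only and not the software used for the analysis: the elemental composition C9H12N2O5S for s4U and the isotope masses are standard values, whereas the function names and the "observed" value are hypothetical and chosen purely to exercise the formula.

```python
# Monoisotopic masses (u) of the most abundant isotopes, and the proton mass.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "S": 31.97207100,
}
PROTON = 1.00727647


def monoisotopic_mass(composition):
    """Sum isotope masses for a composition given as {element: count}."""
    return sum(MONOISOTOPIC[element] * count for element, count in composition.items())


def mz_protonated(composition):
    """Theoretical m/z of the singly protonated ion [M + H]+."""
    return monoisotopic_mass(composition) + PROTON


def ppm_error(observed_mz, theoretical_mz):
    """Relative mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


# 4-Thiouridine: uridine (C9H12N2O6) with one oxygen replaced by sulfur.
S4U = {"C": 9, "H": 12, "N": 2, "O": 5, "S": 1}
theoretical = mz_protonated(S4U)
print(round(theoretical, 4))  # about 261.054, in line with the m/z 261.05 quoted above

# Hypothetical observed value placed 3 ppm above theory, for illustration only.
hypothetical_observed = theoretical * (1 + 3e-6)
print(round(ppm_error(hypothetical_observed, theoretical), 2))
```

The same functions apply to sodium or potassium adducts if the proton mass is replaced by the mass of the corresponding cation.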
Figure 3
LC–MS
analysis of the in vitro modification of s4U with BIA.
(A) EIC of s4U (m/z range
260.55–261.55), s4U-BIA (m/z range 674.75–675.75), and BIA
(m/z range 541.60–542.60)
samples show that upon the reaction with BIA, a new peak at 33 min
appears in the chromatogram, which is distinct from that observed
for unreacted s4U or BIA. (B) Mass spectrum of s4U-BIA. (C) Mass spectrum observed for s4U.
The peaks corresponding to s4U, s4U-IA,
and s4U-BIA were subjected to high-resolution MS/MS analysis
to characterize the species further. A representative MS/MS
spectrum for the s4U-BIA adduct is shown in Figure S4. Fragmentation of s4U-BIA
via CID at 12–15 eV results in daughter ions with an m/z of 565.19 arising from the loss of
ribose sugar (peak (a), Figure S4). Further
fragmentation of the peak (a) leads to a loss of the nucleobase with
an m/z of 397.19 [peak (b), Figure S4]. Experimental m/z values for the daughter ions are within eight ppm of theoretical m/z values calculated for those ions.
To characterize the conversion of s4U to s4U-BIA unambiguously, s4U was modified with BIA on a large scale,
and the product was purified by preparative HPLC
and analyzed by 1H NMR (see additional
methods in the Supporting Information).
The proton NMR spectra of commercially sourced s4U and
purified s4U-BIA are shown in Figures S5 and S6, respectively. The spectral assignments for s4U and s4U-BIA were made by reference to the 1H NMR spectra of uridine,
biotin, and the ethylene glycol linker available
in the public Biological Magnetic Resonance Data Bank (BMRB).[24] In the s4U-BIA proton spectrum
(Figure S6), resonance peaks for the uracil
aromatic protons, ribose protons, biotin protons, and polyethylene glycol
protons can be seen clearly. These data and the MS/MS fragmentation data
collectively establish that the incubation of s4U with
IA and BIA leads to the covalent modification of the nucleoside, establishing
the feasibility of using these as chemical biology tools.
IA or BIA Modification of RNA
Encouraged by the efficiency
of the in vitro modification data with standards, we attempted to
extend the studies to RNA. In these experiments, total RNA extract
from wild-type E. coli was treated
either with IA or BIA, and the resulting sample was digested to nucleosides
by the sequential action of P1 nuclease, phosphodiesterase, and alkaline
phosphatase.[25] The UV–visible
traces from the HPLC analysis of wild-type E. coli RNA with or without IA treatment are shown in Figure S7. In addition to the C, U, G, and A, the traces reveal
a clear peak for s4U at 10.5 min, with the expected UV–visible
spectrum for the nucleoside. A time course for the modification of
s4U to s4U-IA from total RNA is shown in Figure S8. After 2 h of incubation with IA, the
signal for s4U is replaced with one at 11.3 min, corresponding
to the s4U-IA adduct. The data show that the modification
is essentially complete by 4 h. To ensure complete conversion, the reactions were allowed
to proceed overnight (12–14 h). The identity of the s4U-IA adduct is further confirmed by examining the mass spectrum of
the species eluting in the 11.3 min peak (Figure S7B). The observed m/z values
are within 10–11 ppm of theoretical and identical to those
observed with the standards (compare mass spectrum insets from Figures S2 and S7). The studies described for
the modification of s4U with IA were extended to BIA. As
with the above, the BIA-treated RNA was analyzed by LC–MS analysis.
The UV–vis (190–800 nm) absorbance traces show a peak
for s4U-BIA [trace (a), Figure A], though the peak for its precursor s4U co-elutes with G [trace (b), Figure A]. Because of the hydrophobic nature of
BIA, a modified elution method was used, which did not separate G
and s4U as well. However, when we examine the data at 330
nm, which corresponds to the absorbance maximum for s4U
[Figure B, trace (f),
see Figure ], a peak
corresponding to the nucleoside is visible at 13.4 min in the untreated
samples. By contrast, when the traces are examined at 305 nm, which
corresponds to the absorbance maximum for the alkylated nucleoside
[Figure B, trace (c),
see Figure ], the
peak corresponding to the BIA modified nucleoside at 33 min is prominent
in treated samples. We note that the retention times observed for
the adduct and unmodified nucleoside are consistent with those observed
with the commercially obtained s4U (Figure A). Finally, the mass spectrum of the s4U-BIA corresponds (within 3 ppm) to that expected for the
adduct (Figure C).
Figure 4
LC–MS
analysis of nucleosides in WT E. coli total RNA treated with BIA. E. coli RNA was digested to nucleosides and analyzed by LC–MS. (A)
UV–visible traces (190–800 nm) of nucleosides from BIA-treated E. coli RNA [trace (a)] and from untreated E. coli RNA [trace (b)]. Under these conditions,
s4U-BIA elutes at 33 min while its precursor s4U elutes with G at 13 min. (B) UV–visible traces were
examined at 305 nm for s4U-BIA and at 330 nm for s4U in E. coli RNA treated with
BIA [traces (c) and (d), respectively]. The chromatograms show a greatly
diminished peak for s4U, as compared to untreated E. coli RNA samples at 305 and 330 nm [traces (e)
and (f), respectively]. The y-axis scale for traces
(a) and (b) differs from that for traces (c–f);
the RNA inputs were identical. (C) Mass spectrum observed for
s4U-BIA.
The s4U and
s4U-BIA originating from biological
RNA samples showed similar retention times to commercially sourced
s4U and purified s4U-BIA characterized by 1H NMR (Figure S9; see also Figures S5 and S6). Additionally, the degradation of
RNA treated by this protocol was examined using denaturing polyacrylamide
gel electrophoresis (PAGE) as described in additional methods in the Supporting Information. RNA samples taken at
various time points were analyzed by PAGE, and the results are shown in Figure S10. The gel image shows minimal (if any)
degradation of the total RNA treated in the protocol. Together, these
results validate the use of these conditions in an NGS workflow.
BIA Modification of RNA from Control Strains
Before
applying the IA methodology in sequencing experiments, we conducted
experiments to specifically implicate the cellular sulfur insertion
machinery in the modification (Figure S11).[26] In the biosynthetic pathway of s4U, the PLP-dependent enzyme IscS mobilizes S from Cys as a
persulfide attached to a Cys residue in the protein. The persulfide
donates the sulfur to ThiI, which in an ATP-dependent reaction incorporates
the sulfur into tRNA. In these experiments, total RNA from wild type
and ΔiscS and ΔthiI strains
of E. coli was isolated, reacted with
BIA, and digested to nucleosides. The resulting mixtures were analyzed
by HPLC, as described in Materials and Methods (Figure ). Interestingly,
the sample from the ΔiscS variant appears to
contain s4U [trace (c), Figure ], which elutes similarly to the nucleoside
as in the wild-type samples [trace (a), Figure ] and exhibits UV–visible features
that are characteristic of s4U. Perhaps this is not surprising,
as E. coli has several overlapping
systems for sulfur mobilization from cysteine.[27] By contrast, the ΔthiI variant lacks
s4U [trace (e), Figure ]. When compared to wild-type-treated samples [trace
(b), Figure ], the
corresponding BIA-treated samples from the ΔiscS variant clearly show the appearance of s4U-BIA [trace
(d), Figure ], whereas
the ΔthiI sample shows no evidence for the
presence of s4U-BIA [trace (f), Figure ]. These data establish that the BIA-modification
specifically highlights a ThiI-dependent process, which previous studies
have shown to be the incorporation of S to form s4U.[26] Finally, the data shown in Figure allow one to estimate that
under the conditions of the experiments, the modification is nearly
quantitative (95 ± 4%) in several biological and technical replicates
(representative data are shown in Figure ). Therefore, the reaction is sufficiently
robust for s4U sequencing.
Figure 5
Representative analysis of RNA from ΔthiI and ΔiscS E. coli deletion strains for the presence of s4U. The HPLC traces
extracted at 330 nm (corresponding to the maximum absorbance for s4U) of samples containing total RNA from WT and ΔiscS strains [traces (a) and (c), respectively] show that the deletion
of iscS does not eliminate s4U but that
the deletion of thiI does [trace (e)]. The traces
at 305 nm (corresponding to the maximum absorbance for s4U-BIA) of samples containing total RNA from WT and ΔiscS strains [traces (b) and (d), respectively] clearly show the presence
of s4U-BIA. By contrast, RNA from the thiI deletion strain does not show any s4U-BIA [trace (f)],
which is consistent with the loss of s4U in the corresponding
control [see trace (e)]. To facilitate a direct comparison of intensities
at 330 and 305 nm, all the y-axes are on the same
scale.
We considered using BIA to
enrich samples for the s4U-containing species before the
sequencing (discussed below) using
affinity pulldown (see additional methods in the Supporting Information for experimental details). To this
end, BIA-treated RNA from the wild-type, ΔiscS, and ΔthiI strains were incubated with streptavidin
beads and eluted with excess biotin under denaturing conditions, as
described in the Materials and Methods section.
The resulting samples were then analyzed by denaturing PAGE (Figure S12). Samples lacking the BIA treatment
served as controls. In each set of samples, we see an RNA fragment
(∼80 nt) being enriched only in the BIA-containing samples.
However, we see a similar band in the ΔthiI sample as well. At this point, it is not known whether the sample
is identical to that observed with the wild type and ΔiscS RNA. We cannot rule out the possibility of a hitherto unidentified
nucleoside, which is sensitive to treatment with IA-based reagents.
Considering these results, we did not attempt any further enrichment
experiments and moved forward with sequencing both unenriched and
enriched samples resulting from the treatment of RNA with BIA, as
described below.
Small RNA Sequencing
RNA sequencing
reactions were
set up with the sample groups shown in Table S1. In these experiments, untreated RNA served as a control for false
positives. The small RNA samples for sequencing were isolated by preparative
denaturing PAGE experiments[28] after the
indicated treatments described in the Materials and
Methods section (see Table S2 for
yields). In each case, a duplicate sample was included resulting in
a total of 12 samples. Sequencing libraries were prepared with a NEBNext
multiplex small RNA library prep set for Illumina from 1.5 μg
of purified RNA and sequenced on an Illumina NextSeq instrument in
paired-end mode (2 × 75 bp).
Data Analysis
The Materials and Methods section describes
the detailed data analysis workflow. Initially,
raw reads were analyzed by FastQC. Adapter sequences were trimmed with
Trimmomatic,[29] and the resulting reads
were mapped to the E. coli K12 genome
with STAR.[30] Next, the Cufflinks suite of
tools was used to count the transcripts and analyze gene expression
across the sample groups.[31] The alignments
were visually assessed for the presence of mismatches, and misincorporation
analysis was performed using the data generated by the BCFtools program.[32] The results of the analyses are described below.
Gene Expression Analysis
The Cufflinks suite of tools
was employed to assemble and merge transcripts and to perform several RNA-Seq data
analyses. An in-depth analysis of each sample group was performed,
and the gene expression profiles were compared against each other.
Traditionally, differential expression analysis is performed to identify
the upregulated or downregulated genes across different sample groups.
However, in our case, such an analysis was only used to look at the
libraries’ profiles and ensure no significant biological variation
between the sample groups.
The aggregated read count for genes
obtained from the expression analysis was grouped according to the
gene biotypes, and percentages for each biotype were calculated (Table ). Unexpectedly, the
profiles for each sample group (except dTE) are dominated by protein-coding
mRNAs. Because we prepared libraries from 60 to 100 nt fragments isolated
from PAGE, this may indicate that larger RNA transcripts are cleaved
to shorter ones during labeling. Although Figure S10 shows minimal degradation of RNA under the incubation
conditions, RNA-Seq is a sensitive technique that picks up minute
quantities of RNA fragments that may still be present and co-migrate with the
tRNA. Nevertheless, because we are interested in tRNAs in this analysis,
subsequent visualization and variant analysis were limited to tRNA
transcripts only.
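The grouping of read counts by gene biotype described above amounts to a simple aggregation. The following Python sketch shows one way such percentages could be computed. It is not the script used in this study: the function name, the gene identifiers, and the counts are hypothetical placeholders, and the input is assumed to be a per-gene read count together with a gene-to-biotype annotation.

```python
from collections import defaultdict


def biotype_percentages(gene_counts, gene_biotype):
    """Aggregate per-gene read counts by biotype and express them as percentages.

    gene_counts:  {gene_id: read_count}
    gene_biotype: {gene_id: biotype label such as 'mRNA' or 'tRNA'}
    """
    totals = defaultdict(float)
    for gene, count in gene_counts.items():
        totals[gene_biotype[gene]] += count
    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("no reads were counted")
    # Round to one decimal place, matching the precision of a summary table.
    return {biotype: round(100.0 * total / grand_total, 1) for biotype, total in totals.items()}


# Hypothetical counts and annotations, for illustration only.
counts = {"gene_a": 600, "gene_b": 150, "gene_c": 50, "gene_d": 200}
biotypes = {"gene_a": "mRNA", "gene_b": "mRNA", "gene_c": "tRNA", "gene_d": "rRNA"}
percentages = biotype_percentages(counts, biotypes)
print(percentages)  # {'mRNA': 75.0, 'tRNA': 5.0, 'rRNA': 20.0}
```

Because each value is rounded independently, the percentages for one sample group may sum to slightly more or less than 100.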
Table 1
Read Composition Percentages by Gene Biotype^a
class         WTU    WTT    WTE    dTU    dTT    dTE
ncRNA         14.1   28.3   26.4   28.9   31.1   48.0
mRNA          76.3   62.8   57.3   63.8   61.0   39.3
pseudo-gene    1.0    0.9    1.0    1.7    1.6    1.5
rRNA           4.5    4.1    4.1    1.6    1.7    3.8
tmRNA          0.4    0.3    0.8    0.6    0.6    0.5
tRNA           3.8    3.7   10.4    3.6    4.0    6.9
^a ncRNA: non-coding RNA, mRNA: messenger
RNA, rRNA: ribosomal RNA, tmRNA: transfer messenger RNA, tRNA: transfer
RNA. WTU: wild-type untreated, WTT: wild-type treated, WTE: wild-type
treated and enriched, dTU: ΔthiI untreated,
dTT: ΔthiI treated, dTE: ΔthiI treated and enriched.
Visualization
In the next stage of analysis, the aligned
reads were visualized in the Integrative Genomics Viewer (IGV)[33] to identify any changes to the sequence readout
that would result from the presence of BIA. Figure S13 depicts a representative example of tRNA transcripts, highlighting hisR (sense strand) and glnV (antisense
strand). In both cases, the reads are identical to the genome sequence,
except at a single position in the BIA-treated samples. In the glnV data, a misincorporation occurs at position 8. Indeed, glnV has been shown previously to contain s4U
in this position.[34] By contrast, the misincorporation
in the hisR data occurs at position 9 of the tRNA
(which aligns with position 8 of the sequence of all other tRNA species),
and this tRNA has been demonstrated to contain s4U at this position.[35] A close inspection of similarly constructed
alignments of all the reads corresponding to tRNA genes shows misincorporations
at position 8. We note that the misincorporations occur only in the
BIA-treated samples with wild-type RNA but not in the thiI deletion strain, which our studies show lacks s4U (see Figure S11). Analysis of the misincorporations
and mapping to the DNA shows that the BIA-modified s4U
is decoded as C. Depending on whether the gene occurs on the sense
or antisense strand, the base observed in the alignments is either
C or G, respectively. This transition is not observed in samples that
were not treated with BIA (wild type untreated), nor is it observed
in RNA from the thiI deletion strain (ΔthiI untreated and treated). This latter observation further validates
the assignment of this transition to the presence of s4U.
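The strand dependence described above can be illustrated with a minimal sketch. It assumes only what the text states, namely that BIA-modified s4U is decoded as C, and it reports the change as it would appear on the forward genome strand. The function name and the "+"/"-" strand labels are illustrative choices, not part of the analysis pipeline used in the study.

```python
# Complement table for DNA bases as they appear in a genome alignment.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def expected_genomic_change(strand: str) -> tuple[str, str]:
    """Return (reference_base, observed_base) on the forward genome strand
    for a BIA-modified s4U that is read as C during reverse transcription.

    strand: "+" for a gene on the sense strand, "-" for the antisense strand
    (hypothetical labels used only for this sketch).
    """
    ref, alt = "T", "C"  # U in the tRNA is T in the gene; s4U-BIA is decoded as C
    if strand == "+":
        return ref, alt
    if strand == "-":
        # A gene on the antisense strand shows the complementary bases.
        return COMPLEMENT[ref], COMPLEMENT[alt]
    raise ValueError("strand must be '+' or '-'")


if __name__ == "__main__":
    for strand in ("+", "-"):
        ref, alt = expected_genomic_change(strand)
        print(f"strand {strand}: {ref} > {alt}")
```

For a sense-strand gene this gives a T > C change, and for an antisense-strand gene an A > G change, which matches the C or G observed in the alignments.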
Misincorporation Analysis
The BCFtools program was
used to carry out a detailed analysis of all the sequences obtained
in these studies incorporating the corresponding biological replicates
in each case. In this analysis, genotype likelihoods are determined
initially by estimating the most likely base at a position by examining
all the reads that have been aligned at that position. Possible misincorporations
are identified by comparing the most likely base at a given position
to the reference genome. The frequency of misincorporation was calculated
from the number of correct and misincorporated bases at each position
as described in the Materials and Methods section.
The misincorporation frequency heat maps for the wild-type untreated
(WTU) and wild-type-treated (WTT) sample groups are depicted in Figure 6. As seen in previous
studies,[11,12] the presence of s4U modification
by itself at position 8 led to the misincorporation of deoxynucleotides
during the RT reaction but at a lower frequency in the untreated sample
(Figure 6A). The values
observed for WTU at position 8 are in the range of 0.003–0.35
with a median of 0.04, in contrast to the WTT sample (Figure 6B), which displayed much higher
values in the range of 0.002–0.98 with a median of 0.75. Likewise,
the wild-type-treated and enriched (WTE) sample group displayed similar
values at position 8 in the range of 0.02–0.94, with a median
of 0.69 (Figure S14C). At this position,
either T > C or A > G transitions are seen depending on whether
the
gene is on the sense strand or the antisense strand. The control ΔthiI untreated (dTU), ΔthiI treated (dTT), and ΔthiI treated and enriched
(dTE) samples depicted in Figure S14 show
that position 8 does not display misincorporations compared to wild-type
samples. The median values of misincorporation at position 8 for dTU,
dTT, and dTE are in the range of 0.00–0.01. In addition,
the tRNAs corresponding to selenocysteine and isoleucine (Ile) have
values close to zero at position 8. The misincorporation frequency
values are shown in Table S3. It is also
notable that the misincorporation frequencies were not uniform across
all genes, which could result from either incomplete conversion of s4U to s4U-BIA or non-uniform cellular s4U incorporation.
Nevertheless, it is evident from the heat maps that the treatment
of BIA leads to increased misincorporation at position 8, which is
diagnostic for the presence of s4U.
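As a rough illustration of how a per-position misincorporation frequency and its median across genes can be obtained from base counts, consider the sketch below. It assumes the frequency is the number of misincorporated bases divided by the sum of correct and misincorporated bases; the exact calculation used in this work is the one given in the Materials and Methods section. The gene names and read counts are hypothetical and are not data from this study.

```python
from statistics import median


def misincorporation_frequency(correct: int, misincorporated: int) -> float:
    """Fraction of aligned bases at one position that differ from the reference."""
    total = correct + misincorporated
    if total == 0:
        return float("nan")  # no coverage at this position
    return misincorporated / total


# Hypothetical (correct, misincorporated) read counts at position 8.
counts = {"geneA": (250, 750), "geneB": (900, 100), "geneC": (400, 600)}

freqs = {gene: misincorporation_frequency(c, m) for gene, (c, m) in counts.items()}
median_freq = median(freqs.values())

if __name__ == "__main__":
    for gene, freq in freqs.items():
        print(f"{gene}: {freq:.2f}")
    print(f"median: {median_freq:.2f}")
```

Reporting the median alongside the range, as done in the text, keeps a few weakly modified genes from dominating the summary.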
Figure 6
Misincorporation profiles
of tRNAs derived from WT E. coli. Heat
maps summarizing misincorporation frequency
for tRNAs from (A) wild-type untreated (WTU) and (B) wild-type-treated
(WTT) sample groups. The modifications I and s4U (before
treatment in WTU, after treatment in WTT) are indicated by a black
line and labeled in white. Positions labeled as N* contain unknown
post-transcriptional modifications. Alignment of tRNA genes of E. coli is carried out based on the alignments derived
from tRNAdb. For clarity, the x-axis tick marks denote
every fourth base. The y-axes show tRNA species corresponding
to each amino acid. The height of the bar correlates to the individual
tRNA coding the amino acids. Gapped regions within the alignments
were set to background color. The secondary structure attributes below
the x-axis are labeled based on the alignments listed
on tRNAdb. Acc-stem: acceptor stem, Ac-arm: anticodon arm, CCA: terminal
nucleosides of tRNA genes. Each of the images was generated by considering
both replicates in the respective sample groups. The raw data used
to generate the heat maps are reproduced in Figure S14.
Close inspection of all sample
groups (Figures 6 and S14) reveals
the presence of additional misincorporation hotspots. However, unlike
with s4U, these are present in all samples, including the thiI deletion strain. For example, the I present at position
34 on the anticodon loop of tRNAArg (ACG) consistently
leads to a misincorporation of C instead of G.[7] Intriguingly, additional misincorporations observed in all sample
groups (denoted by N* in Figures 6 and S14) are, to our knowledge,
not correlated with any known modifications and may be novel. Because
all of these are present with or without thiI, ThiI is not the source of
any sulfur they may contain. We note that misincorporation at the bulky post-transcriptional
modification 3-(3-amino-3-carboxypropyl)uridine (acp3U), which has been reported previously,[11,12] was not seen in our studies. However, this difference may be due
to the selectivity of the polymerase employed.
Sequencing by MS
The RNA-Seq experiments clearly show
a measurable increase in the transition frequencies related to IA
modification and the presence of the thiI gene. However,
a direct demonstration was sought to independently verify the presence
of s4U on E. coli tRNAs
to validate the RNA-Seq workflow. The presence of s4U on
various E. coli tRNAs was established
by digesting E. coli total tRNA with
the RNase T1 enzyme and analyzing the resulting fragments by MS/MS.
The nuclease cleaves single-stranded RNA on the 3′ side of G residues
to generate smaller fragments. Typical amounts of size-selected
tRNA used for RNase T1 experiments are shown in Table S4. Figure 7 shows HPLC chromatograms of RNase T1-digested tRNA samples
from wild-type and deletion E. coli strains. As shown in Figure , s4U absorbs at 330 nm, which allows the UV detection
of s4U-containing fragments. The chromatograms at 330
nm show numerous peaks, which are consistent with the presence of
s4U. Control experiments with tRNA from ΔiscS also show peaks, whereas the ΔthiI chromatograms
show no features, consistent with the loss of the modification (Figure 7). These results
are in harmony with the nucleoside analysis of tRNA from wild-type
and deletion E. coli strains (see Figures and 5) that show a requirement of thiI in the
modification pathway.
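The cleavage rule described above (cutting on the 3′ side of every G) can be sketched as a simple in silico digest. This is a minimal illustration that assumes complete digestion, one character per nucleoside, and no effect of modified residues on cleavage; it is not the fragment generator used by NASE. The input sequence is a toy example.

```python
def rnase_t1_digest(sequence: str, min_length: int = 1) -> list[str]:
    """Split an RNA sequence after every G, mimicking complete RNase T1 digestion.

    Fragments that end in G are written with a trailing 'p' for the
    3'-phosphate, following the notation used in the text (e.g. CUAUAGp).
    min_length is the minimum number of nucleotides a fragment must have.
    """
    fragments = []
    current = ""
    for base in sequence.upper():
        current += base
        if base == "G":
            fragments.append(current + "p")
            current = ""
    if current:
        # 3'-terminal piece that does not end in G; no phosphate is added.
        fragments.append(current)
    # The 'p' suffix is lowercase, so stripping it leaves only nucleotides.
    return [f for f in fragments if len(f.rstrip("p")) >= min_length]


if __name__ == "__main__":
    toy = "GGGCUAUAGCUCAG"  # toy sequence, not a curated tRNA entry
    print(rnase_t1_digest(toy))
    print(rnase_t1_digest(toy, min_length=2))
```

Restricting the output to fragments of two or more nucleotides mirrors the search setting described in the Materials and Methods section.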
Figure 7
Representative LC analysis of RNase T1 digests of tRNA from WT and deletion E. coli strains. The HPLC
traces extracted at 330 nm (corresponding to the maximum absorbance
for s4U) of RNase T1 digestion samples are shown. Fragments
containing s4U display absorbance at 330 nm, and distinct
peaks are visible in the samples. The corresponding strains are labeled.
To facilitate comparison, the intensities on the y-axes
are normalized per μg of sample and depicted on similar
scales.
The samples were further analyzed
by subjecting the eluent to MS/MS
analysis in the negative ion mode. In these experiments, the five
highest intensity species at each time were subjected to MS fragmentation
as described in the Materials and Methods section.
The resulting data were analyzed by the NucleicAcidSearchEngine (NASE)[17] using a database containing all the fully modified E. coli tRNA sequences. The resulting output was examined for s4U-containing
fragments and visualized in TOPPView from the OpenMS toolset.[36] Some of the tRNA species generated smaller fragments
such as s4UGp and s4UAGp, which could not be
assigned unambiguously to a specific tRNA, while others generated
unique fragments that were readily mapped. Likewise, tRNA from ΔthiI E. coli was used
as a control sample, where the corresponding unmodified U8 fragment
for each tRNA was identified to confidently assign the position of s4U via mass spectrometry. A representative MS/MS analysis
for s4U containing fragment CUA[s4U]AGp from
tRNAAla is shown in Figure S15. Data with the ΔthiI tRNA sample shows that
the corresponding CUAUAGp is not modified. The results of these analyses
are summarized in the supplementary extended Table.
The data from LC–MS/MS analyses are compared to the sequencing
analyses and to the tRNAdb[37] and MODOMICS[38] databases in Table 2. The tRNA sequences used for the analysis
are provided in the supplementary extended Table. In a few cases, there are minor differences between the reported
modification data at position 8 in MODOMICS and/or tRNAdb and assignments
from the RNA-Seq and MS/MS analyses. However, there is a general agreement
between the RNA-Seq and mass spectrometry analyses. For example, the
observed enhancement of the misincorporation frequency for Ala tRNA
genes is corroborated with the presence of s4U in the MS/MS
analysis of all five Ala tRNA genes (see Tables S3 and 2). In the MODOMICS and tRNAdb
databases, both modified and unmodified U are reported (Table 2).
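The distinction drawn above between fragments that map to a single tRNA and short fragments that cannot be assigned unambiguously can be sketched as a lookup over theoretical digest products. The sketch assumes that each tRNA is represented by a set of fragment strings; the three entries in the example are position-8 fragments taken from Table 2, whereas a full analysis would consider every fragment of every tRNA. The function name and data layout are illustrative only.

```python
from collections import defaultdict


def classify_fragments(fragments_by_trna: dict[str, set[str]]) -> dict[str, str]:
    """Map each fragment to the single tRNA that produces it, or to 'Amb'
    when more than one tRNA yields the same fragment."""
    owners: dict[str, set[str]] = defaultdict(set)
    for trna, fragments in fragments_by_trna.items():
        for fragment in fragments:
            owners[fragment].add(trna)
    return {
        fragment: (next(iter(trnas)) if len(trnas) == 1 else "Amb")
        for fragment, trnas in owners.items()
    }


if __name__ == "__main__":
    # Position-8 theoretical fragments for three tRNAs, as listed in Table 2.
    example = {
        "Phe": {"A[s4U]AGp"},
        "Trp": {"[s4U]AGp"},
        "Asn": {"[s4U]AGp"},
    }
    for fragment, assignment in classify_fragments(example).items():
        print(fragment, "->", assignment)
```

Longer fragments are more likely to be unique, which is why the shortest products such as [s4U]Gp and [s4U]AGp account for most of the ambiguous assignments.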
Table 2
Comparison of tRNA Modification at Position 8 from Databases against the RNA-Seq and MS/MS Analyses
| gene | local DB | anticodon | MODOMICS | tRNAdb | RNA-Seq | RNase T1-MS | RNase T1 theoretical fragment |
|---|---|---|---|---|---|---|---|
| alaT, alaU, alaV | Ala1 | VGC | U | s4U/U | s4U | s4U | CUA[s4U]AGp |
| alaW, alaX | Ala2 | GGC | U | N | s4U | s4U | CUA[s4U]AGp |
| argQ, argV, argY, argZ | Arg1 | ICG | s4U | s4U | s4U | Amb | [s4U]AGp |
| argX | Arg2 | CCG | U | U | n.d. | Amb | [s4U]AGp |
| argU | Arg3 | {CU | U | U | s4U | n.d. | CCCU[s4U]AGp |
| argW | Arg4_NA# | {CU | | | n.d. | n.d. | UCCUCU[s4U]AGp |
| asnT, asnU, asnV, asnW | Asn | QUU | s4U | s4U | s4U | Amb | [s4U]AGp |
| aspT, aspU, aspV | Asp | QUC | s4U | s4U | s4U | Amb | [s4U]AGp |
| cysT | Cys | GCA | s4U | s4U | s4U | U/s4U | U[s4U]AACAAAGp |
| glnU, glnW | Gln1 | NUG | s4U | s4U | s4U | s4U | UA[s4U]CGp |
| glnV, glnX | Gln2 | CUG | s4U | s4U | s4U | s4U | UA[s4U]CGp |
| gltT, gltU, gltV, gltW | Glu | SUC | U | U | s4U | U | UCCCCUUCGp |
| glyU | Gly1 | CCC | s4U | s4U | s4U | Amb | [s4U]AGp |
| glyT | Gly2 | NCC | U | U | s4U | U/s4U | CA[s4U]CGp |
| glyV, glyW, glyX, glyY | Gly3 | GCC | s4U | U | s4U | U/s4U | AA[s4U]AGp |
| hisR | His | QUG | s4U | s4U | s4U | s4U | CUA[s4U]AGp |
| ileT, ileU, ileV | Ile1 | GAU | U | U | U | Amb | UAGp |
| ileX | Ile2 | }AU | s4U | s4U | n.d. | n.d. | CCCCU[s4U]AGp |
| ileY# | Ile3_NA | }AU | | | n.d. | n.d. | CCCUU[s4U]AGp/CCCUUUAGp |
| metV, metW, metZ* | Ini1 | CAU | s4U | s4U | s4U | Amb | [s4U]Gp |
| metY* | Ini2 | CAU | s4U | s4U | s4U | Amb | [s4U]Gp |
| leuP, leuQ, leuT, leuV | Leu1 | CAG | U | U | s4U | Amb | UGp |
| leuZ | Leu3 | )AA | s4U | s4U | s4U | Amb | A[s4U]Gp |
| leuU | Leu2 | GAG | U | U | s4U | Amb | [s4U]Gp |
| leuX | Leu4 | BAA | s4U | s4U | s4U | Amb | [s4U]Gp |
| leuW# | Leu5_NA | VAG | | | s4U | Amb | [s4U]Gp |
| lysQ, lysT, lysV, lysW, lysY, lysZ | Lys | SUU | U | s4U | s4U | U/s4U | U[s4U]AGp |
| metT, metU | Met | MAU | s4U | s4U | n.d. | Amb | [s4U]AGp |
| pheU, pheV | Phe | GAA | s4U | s4U | s4U | s4U | A[s4U]AGp |
| proK | Pro | CGG | U | s4U | n.d. | s4U/U | AU[s4U]Gp |
| proL# | Pro2_NA | | | | n.d. | Amb | [s4U]AGp/UAGp |
| proM# | Pro3_NA | | | | s4U | Amb | [s4U]AGp/UAGp |
| selC | Sec | UCA | s4U | U | U | U | UCGp |
| serT | Ser1 | VGA | s4U | s4U | U | Amb | [s4U]Gp/UGp |
| serU | Ser2 | CGA | U | U | s4U | s4U | A[s4U]Gp |
| serV | Ser3 | GCU | s4U | s4U | s4U | Amb | [s4U]Gp |
| serW, serX | Ser4 | GGA | s4U | s4U/U | s4U | Amb | [s4U]Gp |
| thrV | Thr1 | GGU | U | U | s4U | s4U | AUA[s4U]Gp |
| thrT | Thr2 | GGU | U | U | s4U | U/s4U | AUA[s4U]AGp |
| thrU | Thr3_NA | VGU | NA | NA | s4U | s4U | ACU[s4U]AGp |
| thrW | Thr4_NA | NA | NA | NA | s4U | s4U | AUA[s4U]AGp |
| trpT | Trp | CCA | s4U | s4U | s4U | Amb | [s4U]AGp |
| tyrT, tyrU | Tyr1 | QUA | [s4U8][s4U9] | U8[s4U9] | s4U8 | [s4U8][s4U9] | [s4U][s4U]CCCGp |
| tyrV | Tyr2 | QUA | [s4U8][s4U9] | U8[s4U9] | s4U8 | [s4U8][s4U9] | [s4U][s4U]CCCGp |
| valT, valU, valX, valY, valZ | Val1 | VAC | s4U | s4U | s4U | s4U | AU[s4U]AGp |
| valW | Val2 | GAC | s4U | s4U | s4U | Amb | [s4U]AGp |
| valV | Val3 | GAC | s4U | s4U | s4U | U/s4U | UUCA[s4U]AGp |
Local DB: name of the tRNA in the local
sequence database; refer to the supplementary extended Table. Anticodon: anticodon
of the tRNA. MODOMICS: the modification at position 8 on the MODOMICS[38] website. tRNAdb: the modification at position
8 on tRNAdb.[37] RNA-Seq: the modification
at position 8 from the RNA-Seq experiments. RNase T1-MS: the modification
at position 8 from the LC–MS/MS analysis. RNase T1 theoretical
fragment: theoretical oligonucleotide generated upon RNase T1 digestion
of a tRNA. Amb: ambiguous; the theoretical fragment generated is detected
but could not be assigned to one tRNA species unambiguously. n.d.:
not detected; N: unknown modified uridine; V: uridine 5-oxyacetic
acid; I: inosine; {: 5-methylaminomethyluridine; Q: queuosine; $:
5-carboxymethylaminomethyl-2-thiouridine; S: 5-methylaminomethyl-2-thiouridine;
B: 2′-O-methylcytidine; ): 5-carboxymethylaminomethyl-2′-O-methyluridine; *: initiator methionine tRNAs; #: these tRNAs have a gene, but the RNA sequences are not represented
in the MODOMICS and/or tRNAdb databases.
The MS/MS data do not just confirm sequencing results
but reveal
additional modification data. The MS/MS analysis, for example, reveals
the presence of s4U at two adjacent positions in Tyr tRNA.
The MS/MS data with the wild-type sample show [s4U][s4U]CCCGp, which spans positions 8 and 9 in the RNA. The corresponding
positions in the ΔthiI tRNA sample contain
U. In contrast, the enhanced misincorporation frequency
is reflected only at position 8 of the Tyr tRNA genes in the RNA-Seq analysis
(see Figure 6).
In a few cases, the data are consistent with the presence of both
modified and unmodified U, which could be attributed to the incomplete
sulfuration of U8 in those tRNAs. This observation is in line with
the RNA-Seq analysis, where, upon BIA treatment, a few genes showed close
to 80–90% misincorporation at position 8, while others were
in the 30–70% range (Table S3). Therefore,
there are differential modification frequencies of certain tRNA species
in the stationary phase of growth, though the data preclude the quantification
of the dynamics of s4U installation on individual tRNA species.
In addition to the untreated samples, RNase T1 digestion was carried
out on IA- and BIA-treated tRNA samples to supplement the sequencing
analysis. The NASE utilized in this study allows the use of custom
modifications by utilizing the monoisotopic mass of the modified nucleoside.
Therefore, the data from IA or BIA-treated tRNA samples were analyzed
against sequences containing either s4U-IA or s4U-BIA in place of s4U on E. coli tRNA sequences as described in the Materials and
Methods section. Treatment of the sample with IA or BIA leads
to loss of the UV–vis detection of peaks in LC, also consistent
with the change of the UV–visible absorbance of treated s4U (Figures S16 and S17). A representative
MS/MS analysis of CUA[s4U-IA]AGp and CUA[s4U-BIA]AGp
fragment originating from tRNAAla after IA and BIA treatments,
respectively, is depicted in Figure S18. The observed dissociation pattern for those fragments allows for
the confident assignment of these sequences to a precursor ion with
observed mass. Interestingly, in Tyr tRNA, fragments containing s4U-IA and s4U-BIA at positions 8 and 9 are found,
corroborating the presence of s4U at both positions 8 and
9.
Overall, the NASE results for all the RNase T1-digested samples are tabulated in the supplementary extended Table. These observations indicate that the iodoacetamide reaction
toward s4U is very specific in the context of complex tRNA
modifications present on E. coli tRNA.
In conclusion, the iodoacetamide reaction largely modifies the s4U both specifically and efficiently in the context of tRNA.
Overall, our method demonstrates the utility of the chemical treatment
of s4U with iodoacetamide in site-specific profiling of
natively occurring s4U on tRNAs.
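Because a custom modification is specified to the search engine by its monoisotopic mass, it can be helpful to see how such a mass follows from an elemental composition. The sketch below assumes that S-alkylation of s4U by IA replaces one hydrogen with a CH2C(O)NH2 group (a net gain of C2H3NO) and uses standard monoisotopic atomic masses. The helper names are illustrative and are not part of NASE; the BIA adduct could be handled in the same way once the elemental composition of the reagent is supplied.

```python
# Monoisotopic masses (Da) of the most abundant isotope of each element.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "S": 31.97207117,
}


def monoisotopic_mass(formula: dict[str, int]) -> float:
    """Sum the monoisotopic masses for an elemental composition."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


def add_formulas(a: dict[str, int], b: dict[str, int]) -> dict[str, int]:
    """Combine two elemental compositions element by element."""
    combined = dict(a)
    for element, count in b.items():
        combined[element] = combined.get(element, 0) + count
    return combined


S4U = {"C": 9, "H": 12, "N": 2, "O": 5, "S": 1}     # 4-thiouridine nucleoside
CARBAMIDOMETHYL = {"C": 2, "H": 3, "N": 1, "O": 1}  # assumed net gain on IA alkylation

S4U_IA = add_formulas(S4U, CARBAMIDOMETHYL)
ia_shift = monoisotopic_mass(S4U_IA) - monoisotopic_mass(S4U)

if __name__ == "__main__":
    print(f"s4U:      {monoisotopic_mass(S4U):.4f} Da")     # about 260.0467 Da
    print(f"s4U-IA:   {monoisotopic_mass(S4U_IA):.4f} Da")
    print(f"IA shift: {ia_shift:.4f} Da")                   # about 57.0215 Da
```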
Discussion
The s4U nucleoside was first prepared in 1958,[39] long
before it was discovered in a soluble RNA extract from E. coli in 1965.[40] The
presence of 2-thiouridine derivatives in RNA was also uncovered around
the same time.[41] These revelations sparked
considerable interest in the roles of these non-canonical nucleosides.
Analytical studies by Favre et al. revealed that the irradiation of E. coli tRNAVal with 334 nm light led
to adduct formation between s4U at position 8 and C at
position 13.[42] Efforts were also made to
understand the structure of the photoadduct.[43,44] Additional studies[45,46] showed similar reactivity in
vivo, leading to the suggestion that the modified base acts as a UVA
radiation sensor to induce the growth delay and promote photoprotection
during the stress.[47,48] Experiments focused on understanding
the mechanism of growth delay indicated the dependency on the levels
of the molecule ppGpp, which inhibits rRNA synthesis similarly to
that observed in amino acid starvation.[48−50] The hypothesis of growth
delay by s4U was also supported by the observation of a
minimal growth delay in nuv E. coli variants lacking s4U following irradiation.[47,49] Favre and colleagues tried to examine the fate of cross-linked tRNA
in E. coli after removing the stress.[51,52] Intriguingly, the adduct appears to be resolved upon restoring favorable
growth conditions, but there is no established mechanism for such
a process.[51,52] While analytical studies based
on fluorescence measurements indicate that s4U is present in
70% of E. coli tRNA,[47] understanding the role of the modification and its dynamics
under other common stress conditions has not been possible because
of the lack of site-specific profiling methods.
The reactivity
and photochemistry of s4U have also been leveraged into
chemical biology tools to explore RNA dynamics.[18,53] For example, incorporating externally supplied s4U during
growth allows labeling of newly synthesized RNA.[23,54−56] Additionally, the nucleophilicity of the sulfur atom
has been exploited to attach desired reagents/fluorescent probes for
analytical or drug development purposes.[19,20] While these studies underscore the unique reactivity of the modified
base, to date, it has not been exploited for the site-specific detection
of natively occurring s4U in E. coli.
While methylation has received substantial contemporary attention,
it is not the only modification that is known to occur in RNA. Indeed,
pioneering work from several groups starting from the 1970s has brought
to light many chemically diverse bases.[2,38] Current estimates
are that there are 105 unique modifications in tRNA.[2] Some are made by simple one-step routes, but many
hypermodifications require an army of enzymes and cofactors to introduce.
In this regard, Q,[57−60] Y,[61,62] and s4U[26,63,64] are notable for their biosynthetic complexity.
In this article, we have leveraged the nucleophilic reactivity
of the s4U to carry out single nucleotide level profiling
of tRNA of E. coli. The data show that
the modification is present in 19 of 20 amino acid coding tRNAs under
these conditions. In all but one case, the modification occurs at
position 8 of the tRNA. In hisR, the modification
occurs at position 9, which is consistent with previously reported
sequencing results.[35] It is notable that
under the conditions used in these experiments, the “readout”
is a base transition in all cases, providing a concise method for
identifying the modification in tRNA. It is also important to point
out that while tRNA is known to undergo many modifications, our data
show increased misincorporation frequency for s4U-BIA in
the treated samples, highlighting the fact that the s4U
modification can be observed in the background of all other possible
modifications. The extensive analytical studies that underpin the
work provide confidence that the reactivity of IA or BIA is unique
to the nucleoside, as we observed no modification of the canonical
nucleosides, even under conditions where high concentrations of both
nucleoside and the reagent were present. Finally, the observation
that the transition is not observed in the thiI deletion
strain provides confidence in the assignment. Therefore, the method
could directly be applied to the profiling of natively occurring s4U on tRNA, for example, under different physiological stress
conditions or from bacterial species where the presence of s4U is not known.
Materials and Methods
Reaction of IA or BIA with s4U
Commercially
available s4U was treated with a tenfold excess of either
iodoacetamide (IA, TCI Chemicals) or iodoacetyl-PEG2-biotin (BIA, Thermo Scientific) as follows. s4U (0.1 mM)
was incubated with 1 mM of either IA or BIA in 0.05 M sodium phosphate
(NaPi) buffer (pH 8) at 50 °C overnight. The reaction mixtures
were quenched with a tenfold excess of DTT (10 mM). UV–visible
absorption spectra were recorded on an Agilent 8454 diode-array spectrophotometer
between wavelengths from 190 to 800 nm. The reaction mixtures were
analyzed further on an Ultimate 3000 HPLC instrument with a photodiode
array detector coupled to an LTQ Orbitrap XL mass spectrometer (Thermo
Fisher Scientific) as follows. The mixtures were injected onto a Hypersil
Gold C-18 column [particle size: 1.9 μm; dimensions: 2.1
mm (D) × 150 mm (L), Thermo
Fisher] pre-equilibrated with 0.05 M ammonium acetate (pH 5.3) at
a flow rate of 0.2 mL/min. Separation was carried out with a gradient
of 40% acetonitrile (buffer B, optima grade from Thermo Fisher) in
water (optima grade from Thermo Fisher) over 17 min for IA-treated
samples (time: 0–3.5 min, % B: 0–0.8; time 3.5–3.75
min, % B: 0.8–3.2; time: 3.75–4.0, % B: 3.2–5.0;
time: 4–12 min, % B 5.0–25.0; time: 12–15 min,
% B: 25–50; time: 15–17 min, % B: 50–75; time:
17–17.1 min, % B: 75–100; time: 17.1–20 min,
% B: 100) and over 45 min for the BIA-treated samples (time: 0–4.4
min, % B: 0–0.2; time: 4.4–5.8 min, % B: 0.2–0.8;
time: 5.8–7.2, % B: 0.8–1.8; time: 7.2–8.6 min,
% B: 1.8–3.2; time: 8.6–10 min, % B: 3.2–5.0;
time: 10–25 min, % B: 5–25; time: 25–30 min,
% B: 25–50; time: 30–34 min, % B: 50–75; time:
34–37 min, % B: 75; time: 37–45 min, % B: 75–100;
time: 45–48 min, % B: 100). Variations of the above LC method
were used to analyze a few samples. Those experimental details
are listed in the additional methods in the Supporting Information.
The method employed for BIA-treated samples
was longer to accommodate the hydrophobicity of the BIA moiety. All
MS data were recorded on an LTQ Orbitrap XL mass spectrometer in the
positive ion mode with an FT analyzer at a resolution setting of 100,000
and m/z range of 50–1400.
The instrument used for recording the mass spectra was maintained
at a capillary temperature of 275 °C, with sheath gas flow 35,
auxiliary gas flow 12, and a source voltage of 3 kV. MS/MS fragmentation
was achieved by collision-induced dissociation at energy settings,
as noted in the results. Xcalibur software was used to analyze all
the LC–MS data (Thermo Fisher), and the deconvoluted spectra
were generated from the raw data by exporting with the use of Xcalibur
(Thermo Fisher).
Total RNA Extraction from E. coli
Total RNA was extracted from either wild-type or deletion
strains of E. coli by employing
guanidine-based methods with some modifications as detailed below.[65] The cell pellet from 1 L of the growth was resuspended
in 6 mL of denaturation buffer consisting of 4 M guanidine thiocyanate,
0.025 M sodium citrate, 0.5% sarkosyl, and 0.1 M 2-mercaptoethanol.
The resulting suspension was equally distributed into 10–12
Eppendorf tubes (2 mL), each of which contained 0.7 mL of the suspension.
Next, 2 M sodium acetate at pH 4 (0.1 mL), phenol (1 mL), and 1-bromo-3-chloropropane
(0.2 mL) were added in succession, vortexed, and incubated on ice
for 15 min. The tubes were centrifuged at 15,000g, and the resulting top transparent layer (0.3–0.4 mL) was
transferred to a new 1.5 mL Eppendorf tube. An equivalent volume of
isopropanol was added before placing the tubes at −20 °C
for at least 30 min. The precipitated RNA was collected by centrifugation,
and the supernatant was discarded. The RNA pellet obtained was further
purified using a Qiagen miRNeasy mini kit following the manufacturer’s
directions for the purification of RNA >18 nt as follows.
In brief, the pellet was resuspended in 0.05 mL of RNase-free water (Invitrogen)
and vortexed. The Qiazol reagent (0.25 mL) and chloroform (0.05 mL)
were added in succession, vortexed, and centrifuged at 21,000g for 1 min. The top transparent layer was transferred to
a new 1.5 mL Eppendorf tube containing 1.5 volumes of cold ethanol
and mixed by pipetting. The entire content of the tube was loaded
onto an Omega BioTek HiBind RNA spin column fitted with a collection
tube. The column was then washed once with 0.7 mL RWT (Qiagen) and
twice with 0.5 mL RPE (Qiagen) buffers in succession, with centrifugation
to permit the removal of the wash solutions before eluting the RNA
with 40 μL of RNase-free water (Invitrogen). RNA prepared from
the same starting cell pellets was pooled and stored in smaller aliquots at −80
°C until further use.
Reaction of IA or BIA with Total RNA from E. coli
The RNA extracted was treated with either
IA or BIA as follows. Purified RNA (0.04 mL) was incubated with 1
mM of either IA or BIA in 0.05 mM NaPi buffer (pH 8) overnight at
50 °C. The reaction mixtures were quenched with a tenfold excess
of DTT (10 mM) over the reagent, and RNA was purified using the Qiagen
miRNeasy mini kit (see above). The eluted RNA was digested to the
nucleoside level by the successive action of P1 nuclease, phosphodiesterase,
and alkaline phosphatase, as reported previously.[25] The reaction mixtures were filtered on VWR centrifugal
filters (PES membrane, MWCO 10 kDa) to remove the protein components
prior to analysis. LC–MS analyses of these mixtures were carried
out as described above for the nucleoside samples.
Purification of RNA for Sequencing Experiments
The
desired small RNA (∼60–100 nt) for sequencing experiments
was purified by employing preparative PAGE (28 cm H × 16.5 cm W) prepared with 6% acrylamide,
8 M urea, 1× TBE buffer, TEMED, and 30% APS.[28] The RNA samples were mixed with an equivalent volume of
2× loading buffer.[28] The RNA was separated
at 25 W for 2 h or until the lower dye band was two-thirds through
the gel. The bands were detected by UV-shadowing and excised from
the gel. RNA in the bands was eluted into solution by incubating the
gel pieces overnight at 4 °C in the crush soak buffer, which
was prepared by mixing 1 mL of 1 M Tris·HCl pH 7.5, 4 mL of 5
M NaCl, and 0.2 mL of 0.5 M EDTA pH 8 (diluted to 100 mL by adding
water and autoclaved). Following centrifugation at 21,000g for 5 min, the supernatant was transferred to a new tube containing
0.7 mL of cold ethanol. RNA was precipitated by placing the tubes
at −20 °C for 2 h. The RNA pellet was collected by centrifugation
and dried under vacuum conditions in a Savant speed-vac at room temperature.
The resulting white residue was resuspended in 0.02 mL of Tris–EDTA
buffer (made by mixing 0.1 mL of Tris·HCl pH 7.5 and 0.02 mL
of 0.5 M EDTA pH 8.0, made to 10 mL by adding water and sterile filtered)
and quantified using a Nanodrop. The amount of tRNA obtained for sequencing
experiments is listed in Table S2.
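The final composition of the crush soak buffer follows from the stock volumes given above by simple dilution (C1 × V1 = C2 × V2). The short sketch below works through that arithmetic; the function name is illustrative, and the stock concentrations and volumes are those stated in the text.

```python
def final_concentration(stock_molar: float, stock_volume_ml: float,
                        final_volume_ml: float) -> float:
    """Concentration (M) after diluting a stock to the final volume."""
    return stock_molar * stock_volume_ml / final_volume_ml


# Stocks diluted to 100 mL, as described for the crush soak buffer.
crush_soak = {
    "Tris-HCl pH 7.5": final_concentration(1.0, 1.0, 100.0),  # 1 mL of 1 M
    "NaCl": final_concentration(5.0, 4.0, 100.0),             # 4 mL of 5 M
    "EDTA pH 8": final_concentration(0.5, 0.2, 100.0),        # 0.2 mL of 0.5 M
}

if __name__ == "__main__":
    for component, molar in crush_soak.items():
        print(f"{component}: {molar * 1000:.0f} mM")
```

Under these assumptions, the buffer contains 10 mM Tris·HCl, 200 mM NaCl, and 1 mM EDTA.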
RNase T1 Digestion and LC–MS/MS Analysis
The
RNA samples used for RNase T1 digestion were obtained with a slight
variation of the procedure described above. RNA from either WT or deletion strains
was obtained by phenol extraction[25] followed
by gel purification as described above, prior to digestion with the
RNase T1 enzyme. The cell pellet from 1 L of the growth was resuspended
in 1 mL of resuspension buffer (10 mM Tris–HCl (pH 8), 10 mM
MgCl2 and 0.15 M NaCl) per mg of the cell pellet, and mixed
by rotation at 4 °C. After resuspension, 1 mL of saturated phenol
buffered at pH 4.3 (Thermo Scientific) was added to 1 mg of the starting
cell pellet and mixed by inversion for 1 h at 4 °C. The suspension
was centrifuged at 8000 rpm at 4 °C for 30 min, and the top semi-transparent
layer was removed, mixed with saturated phenol (∼1 mL/mg of
starting cell pellet), and centrifuged at 8000 rpm at 4 °C for
30 min. After centrifugation, the upper transparent layer was removed
and mixed with 0.1 volumes of 3 M sodium acetate pH 5.5 and 3 volumes
of ethanol. The sample was incubated for 2 h at −20 °C,
and the precipitated RNA was collected by centrifugation for 30 min
at 12,000 rpm and at 4 °C. The resulting pellet was redissolved
in water, and total tRNA was isolated using the preparative PAGE purification
method as described above.
The RNase T1 digestion reaction was
set up as follows. The purified tRNA (49 μL) was incubated with
1000 U of RNase T1 (1000 U/μL, Thermo Scientific) in buffer
containing 0.05 M Tris–HCl, 10 mM EDTA (pH 7.5) for 2 h at
37 °C. For IA-treated samples, the purified tRNA was first incubated
with IA as described above and digested with RNase T1 after the reaction
cleanup with a Qiagen miRNeasy mini kit as described in the Total
RNA extraction section. The concentrations of tRNA used for each of
the replicates in RNase T1 digestion are listed in Table S4. Likewise, ∼50 μg of each
of the sequenced WTU and WTT samples was similarly treated with RNase
T1. The reaction mixtures were analyzed on a Vanquish UHPLC instrument
(Thermo Fisher Scientific) with a photodiode array detector and a
Q-exactive mass spectrometer. The mixture was injected onto a Hypersil
Gold C-18 column [particle size: 1.9 μm; 2.1 mm (D) × 150 mm (L), Thermo Fisher] pre-equilibrated
with 97.5% of buffer A containing 0.2 M hexafluoroisopropanol (HFIP)
and 0.085 M triethylamine (TEA) in water (optima grade from Thermo Fisher) and 2.5% of
buffer B containing 0.1 M HFIP and 0.042 M TEA in methanol
(optima grade from Thermo Fisher) at a flow rate of 0.05 mL/min. Separation
was carried out with a gradient of buffer B over 70 min (time: 0–13.1
min, % B: 2.5; time 13.1–52.4 min, % B: 2.5–35; time:
52.5–71 min, % B: 100). The UV–vis data are plotted
as described in the above sections. However, the UV–vis chromatograms
depicted under RNase T1 analysis are normalized to per μg of
the input sample.
Data-dependent MS/MS analysis in the negative
ion mode was carried
out on the eluent from the column described above. Full MS acquisition
at a resolution of 140,000 was followed by data-dependent fragmentation
of the 5 most intense peaks at a resolution of 70,000. An intensity
threshold of 1 × 10^5 and a normalized collision energy
of 20 were employed. The acquired Thermo RAW files were converted to
mzML using MSConvert with vendor peak picking and MS subset filters.[66,67] The mzML files were processed with the nucleic acids search engine (NASE)
software, which generates theoretical fragments and compares them against
the experimental data.[17] A FASTA file containing
modified tRNA sequences was used as the input. Precursor and fragment
mass tolerances were set to 5 ppm. The program was set to consider
fragments of 2 or more nucleotides, and the number of missed cleavages
by RNase T1 was set to zero. The tRNA sequence used in the analysis,
along with the output from each sample, is tabulated in supplementary extended Table.
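As an illustration of the search settings described above, the following minimal Python sketch performs an in silico RNase T1 digestion (cleavage 3′ of every G, no missed cleavages) and keeps fragments of 2 or more nucleotides. It is not part of NASE; the function name and the example sequences are hypothetical, and the sketch treats every G as cleavable and does not model modified nucleotides.

```python
from typing import List


def rnase_t1_digest(sequence: str, min_length: int = 2) -> List[str]:
    """Cleave after every G (zero missed cleavages) and keep
    fragments containing at least `min_length` nucleotides.

    Assumption: plain one-letter RNA sequence; modified bases are not modeled.
    """
    fragments = []
    start = 0
    for i, base in enumerate(sequence):
        if base == "G":
            # RNase T1 cuts on the 3' side of G, so G ends the fragment.
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # Trailing fragment that does not end in G (3' end of the RNA).
        fragments.append(sequence[start:])
    return [f for f in fragments if len(f) >= min_length]


# Hypothetical example sequences, not taken from any E. coli tRNA.
print(rnase_t1_digest("AUGCCGUAG"))
print(rnase_t1_digest("GGAUCGA"))
```

With the defaults, single-nucleotide products (for example, the G released between two adjacent G residues) are discarded, which mirrors the minimum fragment length of 2 nucleotides used in the search.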
Library Preparation and Data Accumulation
cDNA libraries were prepared from 1.5 μg of isolated RNA for each of the sample
conditions (Table S1) using a NEBNext Multiplex
Small RNA library prep kit for Illumina. The libraries were made following
the manufacturer’s instructions. Size selection of libraries
was carried out using the native PAGE conditions described in the
manufacturer’s protocol. The amount of size-selected libraries
submitted to the sequencing facility is listed in Table S2.
RNA-Seq Data Collection and Analysis
The sequencing was carried out on an Illumina NextSeq instrument in paired-end
mode with a read length of 75 bp by the Advanced Genomics Core at
the University of Michigan. The data obtained were analyzed in house.
All analyses were performed using open-source Linux tools following
the guidelines of each program.
Quality Assessment and Trimming
Raw reads were analyzed
by the program FastQC,[68] which reports
adaptor content, duplication levels, and GC content. Following the
quality check, Trimmomatic[29] was used to
remove the adaptor content (supplied with the TruSeq3 adaptor FASTA file)
and to filter out reads below 35 nt in length. The resulting sequences
were then mapped to the E. coli K-12
MG1655 genome.
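The length filter applied during trimming can be illustrated with a short Python sketch. This is not Trimmomatic itself: `parse_fastq` and `filter_min_length` are hypothetical helpers, adaptor removal is not modeled, and the two reads of a pair are not handled jointly.

```python
from typing import Iterable, Iterator, Tuple

# (header, sequence, quality) for one read
FastqRecord = Tuple[str, str, str]


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield records from well-formed 4-line FASTQ text."""
    it = iter(lines)
    for header in it:
        if not header.strip():
            continue  # tolerate blank lines between records
        seq = next(it).strip()
        next(it)  # '+' separator line, ignored
        qual = next(it).strip()
        yield header.strip(), seq, qual


def filter_min_length(records: Iterable[FastqRecord],
                      min_len: int = 35) -> Iterator[FastqRecord]:
    """Keep only reads whose sequence has at least `min_len` nucleotides."""
    for rec in records:
        if len(rec[1]) >= min_len:
            yield rec


# Hypothetical reads: one of 40 nt (kept) and one of 20 nt (dropped).
example = [
    "@read1", "A" * 40, "+", "I" * 40,
    "@read2", "C" * 20, "+", "I" * 20,
]
kept = list(filter_min_length(parse_fastq(example)))
print([header for header, _, _ in kept])
```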
Mapping
FASTA sequence and annotation files required
for mapping reads to the genome were obtained from the Ensembl Genomes website
(ftp://ftp.ensemblgenomes.org/pub/bacteria/release-44).
Spliced Transcripts Alignment to a Reference (STAR 2.7), a short-read
alignment program, was used for mapping reads to the genome.[30] The alignments were carried out in a two-step
process in which the genome was first indexed by the program and the index was
then used in the mapping step (default parameters were used
except for the sjdbOverhang parameter, which was set to 75 following
the instructions). The alignments were written to a BAM format file
(sorted by coordinate).
Gene expression analysis was
carried out using Cufflinks, Cuffmerge, Cuffdiff, and cummeRbund.
The Cufflinks program within the suite takes a GTF annotation file and
BAM files as input (library-type = fr-firststrand, read length
= 76 nt) and generates a set of tracking files for each sample group.[31] Transcripts (information in tracking files)
from each assembly were merged into a single master transcriptome
assembly by the program Cuffmerge, taking the biological/technical
replicates into account. Cuffdiff (input = BAM files, reference FASTA
sequence, and master assembly GTF) processes the master transcriptome
assembly to perform rigorous statistical analysis on the count values
obtained. The data produced by Cuffdiff were explored with
the R package cummeRbund. A count matrix for all
the genes was extracted from the expression analysis. The counts were
summed according to the gene biotype in R by matching the Cufflinks
ids (obtained from the master transcriptome assembly) with gene information
present in the GFF3 annotation file (obtained from the Ensembl Genomes website).
The read coverages for the tRNA genes
were visualized in the program IGV.[33] GTF,
GFF3, and BED files corresponding to the E. coli K-12 genome and the alignment files were loaded in the program.
Screenshots corresponding to the respective alignments were generated
in scalable vector graphics (SVG) format and modified appropriately in
Adobe Illustrator CC 2019.
Misincorporation Frequency
The misincorporation frequency
was calculated from the genotype likelihoods generated by the BCFtools
program[32] as follows. In the first step, the
BCFtools “mpileup” command [maximum depth at a position
100,000, FASTA file, ignored duplicates, output in variant
call format (VCF)] calculates the genotype likelihoods at each
genomic position along with coverage. Replicates in the same sample
group were considered together for obtaining the VCF files. The output
VCF files were then processed with the BCFtools “call” command
to discover the mutations (multi-allelic caller, p = 0.99, BED feature file containing tRNA information). Several variant
analysis attributes corresponding to each position were written to the
VCF files. The number of high-quality reference and alternate bases
(DP4 attribute) at each position was extracted from the VCF file using the
VariantsToTable utility from the GATK suite of tools.[69] The misincorporation frequency at each of the tRNA positions
was calculated from these counts in a Microsoft Excel spreadsheet.
The obtained data file was then processed
in R to generate a
matrix file suitable for plotting heat maps. The matrix file
was then manually adjusted according to the alignments listed in the
transfer RNA database (tRNAdb) and summarized in supplementary extended Table.[37] The MODOMICS database was also consulted.[38] The heat maps were generated with GraphPad Prism
plotting software. Positions with no values were given the same
color as the background of the plot. The genes argX, argW, ileX, ileY, lysV, metT, metU, proK, and proL were excluded from
the heat maps owing to low coverage.
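The calculation from the DP4 counts can be sketched in Python. The DP4 attribute holds four counts of high-quality bases (reference forward, reference reverse, alternate forward, alternate reverse). The sketch assumes that the misincorporation frequency is the fraction of alternate bases among all high-quality bases at a position; this definition and the function names are assumptions for illustration, not the exact spreadsheet formula used in this work.

```python
from typing import Optional, Tuple


def parse_dp4(field: str) -> Tuple[int, int, int, int]:
    """Parse a DP4 string such as '60,40,50,50' into four integer counts."""
    ref_fwd, ref_rev, alt_fwd, alt_rev = (int(x) for x in field.split(","))
    return ref_fwd, ref_rev, alt_fwd, alt_rev


def misincorporation_frequency(dp4: Tuple[int, int, int, int]) -> Optional[float]:
    """Assumed definition: alternate bases / (reference + alternate bases).

    Returns None for positions without coverage so that they can be left
    blank when plotting.
    """
    ref_fwd, ref_rev, alt_fwd, alt_rev = dp4
    ref = ref_fwd + ref_rev
    alt = alt_fwd + alt_rev
    total = ref + alt
    if total == 0:
        return None
    return alt / total


# Hypothetical DP4 values, not taken from the sequencing data.
print(misincorporation_frequency(parse_dp4("60,40,50,50")))
print(misincorporation_frequency(parse_dp4("0,0,0,0")))
```

Returning `None` for uncovered positions keeps them distinct from positions with a true frequency of zero, which matches the treatment of positions with no values in the heat maps.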