Literature DB >> 35155896

Site-Specific Profiling of 4-Thiouridine Across Transfer RNA Genes in Escherichia coli.

Abstract

The transfer RNA (tRNA) modification 4-thiouridine (s4U) acts as a near-ultraviolet (UVA) radiation sensor in Escherichia coli (E. coli), where it induces a growth delay upon exposure to UVA radiation (∼310-400 nm). Herein, we report a sequencing methodology for site-specific profiling of the s4U modification in E. coli tRNAs. Upon the addition of iodoacetamide (IA) or iodoacetyl-PEG2-biotin (BIA), the nucleophilic sulfur of s4U forms a reaction product that is extensively characterized by liquid chromatography-mass spectrometry (LC-MS/MS) analysis. This method is readily applied to the alkylation of natively occurring s4U on E. coli tRNA. Next-generation sequencing of BIA-treated tRNA from E. coli revealed misincorporations at position 8 in 19 of the 20 amino acid tRNA species. By contrast, tRNA from the ΔthiI strain, which cannot introduce the s4U modification, does not exhibit any misincorporation at the corresponding positions, directly linking the base transitions and the tRNA modification. Independently, the s4U modification on E. coli tRNA was further validated by LC-MS/MS sequencing. Digestion of tRNA from wild-type and deletion strains of E. coli with RNase T1 generated smaller s4U/U-containing fragments that could be analyzed by MS/MS for modification assignment. Furthermore, RNase T1 digestion of tRNAs treated with either IA or BIA showed the specificity of iodoacetamide reagents toward s4U in the context of complex tRNA modifications. Overall, these results demonstrate the utility of the alkylation of s4U in the site-specific profiling of the modified base in native cellular tRNA.

Entities: Chemical

Year: 2022 PMID: 35155896 PMCID: PMC8829951 DOI: 10.1021/acsomega.1c05071

Source DB: PubMed Journal: ACS Omega ISSN: 2470-1343

Introduction

A common characteristic among cellular ribonucleic acids, such as ribosomal RNA (rRNA), transfer RNA (tRNA), and messenger RNA (mRNA), is the presence of post-transcriptional modifications. While these RNA species possess numerous post-transcriptional modifications, the numbers and diversity of modifications are the most significant in tRNA.[1] To date, a total of 108 modifications have been reported in tRNA, and the presence of these modifications across all domains of life highlights their importance.[2] The anticodon stem-loop region on tRNA contains most of these modifications, presumably because of its role in codon–anticodon interactions during protein synthesis, with modifications at positions 34 and 37 being most prevalent.[1,3] The discovery of many nucleic acid modifications coincided with the efforts in the late 1960s and early 1970s to determine the nucleic acid sequence, the roles of quite a few remaining enigmatic to this day. The emerging consensus is that many are not limited to RNA. For example, 7-deazapurine-based modified bases, which were initially discovered in the wobble positions of tRNA Asp, Asn, Tyr, and His, have recently been identified in bacteriophage DNA. Their presence is thought to protect the phage DNA from the host restriction endonuclease response.[4,5] There is also cross-talk between DNA and RNA modifications in eukaryotes, where a class of DNA methylating enzymes alkylate certain tRNA.[6] The dynamics of cellular insertion, detection, and removal of these modifications represent an important dimension in understanding their physiological roles. While detecting the modified bases is relatively trivial using high-resolution analytical separation and mass spectrometry, site-specific profiling methods remain challenging. 
The advent of commercially available high-throughput next-generation sequencing (NGS) technologies during the last two decades has opened new research avenues for profiling post-transcriptional RNA modifications.[7] In these experiments, the conversion of RNA to complementary DNA (cDNA) in the reverse transcription (RT) step serves as a readout where a non-canonical nucleoside in the RNA template can result in the misincorporation of deoxynucleotides, premature termination, or both during the RT.[7] While some non-canonical bases such as inosine (I) may induce these directly, others require treatment with a chemical reagent before they lead to a distinct readout.[7] For example, the use of chemical reagents specific to pseudouridine (ψ) and 5-methylcytosine (m5C) facilitates transcriptome-wide detection of these modifications.[8−10] Studies from different laboratories also show that NGS conditions are responsive to many tRNA modifications, where empirical analysis of misincorporations and terminations (collectively known as RT events) can predict the presence of base modifications in uncharacterized bacterial tRNA genes.[11,12] However, in many cases, uncovering these events requires extensive statistical analysis because the probabilities of misincorporation or termination are low. Ideally, chemical reagents that target specific modifications and increase RT event probabilities could overcome these limitations. However, the diversity of the modifications requires the development of a new procedure that leverages the unique reactivity of the modified base in each case.
Mass spectrometry can be used as an alternative technique for detecting RNA modifications, presenting a more direct approach to the sequence-specific analysis of RNAs.[13] The principles used in such methods are adapted from proteomics, where digestion of a protein with a specific protease generates a library of smaller peptide fragments that can subsequently be analyzed by liquid chromatography (LC)–mass spectrometry (MS/MS).[7] Similarly, the cleavage of RNA with a specific nuclease generates a library of smaller RNA fragments that can be separated by LC. The MS/MS fragmentation of eluting peaks can be compared against the theoretically generated fragmentation data from the input sequences to detect the modifications on the RNA.[14−17] While these methods are much more direct than NGS, they are limited by the need for specialized instrumentation and the lack of robust tools for analyzing MS fragmentation data. In the current report, using tRNA as a model system for 4-thiouridine (s4U)-containing RNA species, we developed and validated a methodology to site-specifically profile the base in the stationary growth phase of Escherichia coli (see Figure for the s4U structure).

Figure 2

UV–visible spectrophotometric analysis of IA-modified s4U. The spectra are of 20 μM authentic s4U and modified s4U after treatment with IA or BIA. The spectra were obtained in 0.05 M NaPi (pH 8.0).

Native profiling of tRNA modifications shows that RNA-Seq conditions are sensitive to the presence of s4U, which causes low rates of mismatches.[11,12,18] s4U alone displays empirical mismatch rates between 0.1 and 0.3, which may not be conclusive when applied in a context where the existence of s4U is unknown. For example, while s4U is commonly present at the 8th or 9th position of bacterial tRNA, in the case of archaeal tRNA or uncharacterized bacterial tRNA, it may be difficult to confidently assign the s4U position on the gene based on the statistical probabilities alone. The nucleophilicity of the thiouridine base and its reactivity with thiol modification reagents offer a viable route for introducing a bulky modification that could potentially report its presence in NGS experiments.[19−22] In the SLAM-seq method, s4U is metabolically incorporated into the eukaryotic mRNA pool in pulse-chase experiments, and the RNA is subjected to NGS after iodoacetamide (IA) treatment. The dynamics are monitored by exploring the frequency of mismatches observed in 3′-untranslated regions of mRNA. However, under these conditions, ∼0.5% of the total cellular RNA is labeled with s4U after 6 h of incubation with 0.1 mM s4U.[23] We envisioned a methodology, starting from E. coli cells using RNA-Seq, to detect s4U modification in a site-specific manner (Figure ). To achieve this, the documented susceptibility of s4U to electrophilic reagents, such as IA, was explored.[21,22] We reasoned that the presence of a large adduct, such as biotin-IA, would lead to a robust mismatch response in NGS, allowing single-nucleotide resolution s4U detection. This article reports the characterization of s4U-IA or s4U-iodoacetyl-PEG2-biotin (BIA) products by high-resolution MS methods to establish conditions that maximize the yield of labeled s4U. Next, these methods were extended to bulk RNA from E. coli and NGS experiments using size-selected RNA to enrich tRNA.
Remarkably, the base transitions observed in the NGS data clearly show that most detectable tRNAs in E. coli have the modification at position 8 or 9 with robust probabilities. These base transitions are absent in identically treated samples from cells where a key tRNA sulfur insertion enzyme, ThiI, was absent. Finally, LC–MS/MS analysis of untreated, IA-treated, or BIA-treated tRNA from wild-type and deletion E. coli strains digested with RNase T1 nuclease corroborates the positional modification data from the high-throughput NGS studies. Together, these data provide a workflow for single-base resolution sequencing of s4U on bacterial or archaeal tRNAs.

Figure 1

Overview of the workflow for the detection of s4U using RNA-Seq and mass spectrometry. In this workflow, extracted RNA is treated with IA or BIA prior to RNA-Seq to identify misincorporation or coverage changes. Likewise, tRNA is digested with RNase T1 followed by LC–MS/MS analysis of the tRNA fragments, which independently establishes the assignment. RNase T1 digestion followed by LC–MS/MS analysis of either IA- or BIA-treated tRNA establishes the specificity of iodoacetamide reagents toward s4U in the context of complex tRNA modifications.

Results and Discussion

The overall outline of the studies presented in this article is shown in Figure . To ensure that the sequencing analyses’ data reflect the s4U modification, we first carried out several control experiments to characterize the BIA modification reaction.

Characterization of Reaction between s4U and IA (or BIA)

When combined with IA, s4U carries out a nucleophilic attack via the sulfur atom at the C-2 of IA (or BIA), as shown in Figure (or Figure S1). To test this, s4U was incubated with excess IA or BIA, and the reaction mixture was subsequently analyzed by UV–visible spectroscopy. Figure shows the absorbance spectra of s4U (black trace) with an absorption maximum at ∼330 nm, s4U treated with IA (red trace), and s4U treated with BIA (blue trace), both of which have an absorption maximum at ∼303 nm. The ∼27 nm blue shift is consistent with a previous report.[23]
To confirm that the incubation of s4U with IA leads to the formation of the expected product, reactions were analyzed by LC–MS (Figure S2). Under these conditions, s4U elutes at 11.8 min. The mass spectrum of the species eluting at 11.8 min contains two species, corresponding to the m/z values of [s4U + Na+] and [s4U + K+], within six ppm of the theoretical m/z values (Figure S2C). In the presence of IA, a new peak at 12.3 min appears. The mass spectrum of the species eluting in this peak is consistent with the m/z of [s4U-IA + H+], [s4U-IA + Na+], and [s4U-IA + K+], which are within seven ppm of the theoretical values (Figure S2B). To assess the specificity of IA for s4U versus the four canonical nucleosides, s4U was treated with IA in the presence of equimolar quantities of A, U, G, and C. The reaction mixtures were subsequently analyzed by HPLC-MS (Figure S3). Under these conditions, in the presence of IA, the s4U peak at 10.9 min is replaced with one at 11.5 min. The canonical nucleosides remained unaffected. The UV–visible spectra corresponding to each of the s4U and s4U-IA peaks (Figure S3B,C) are consistent with those shown in Figure . As is apparent, no other changes are visible in the chromatograms.
In passing, we note that a new peak at 16 min is also observed in these experiments, which has the same UV–visible features as s4U-IA. We do not know its identity, and because it was a minor component, it was not investigated further. To determine if BIA reacted with s4U in a similar manner, s4U was reacted with BIA and analyzed via LC–MS experiments. The extracted ion chromatogram (EIC) shows that s4U (m/z 261.05) and BIA (m/z 542.10) elute at 13.5 and 34.3 min, respectively (Figure A). The mass spectrum corresponding to the species eluting at 13.5 min is a mixture of [s4U + H+], [s4U + Na+], and [s4U + K+], as observed previously (compare Figures C and S2C). The observed m/z values for these species are within three ppm of theoretical. The reaction product elutes at 33.03 min, and the mass spectrum of the peak exhibits a [s4U-BIA + H+] species with m/z of 675.25 (Figure B), which is within 0.3 ppm of the expected mass.

Figure 3

LC–MS analysis of the in vitro modification of s4U with BIA. (A) EIC of s4U (m/z range 260.55–261.55), s4U-BIA (m/z range 674.75–675.75), and BIA (m/z range 541.60–542.60) samples show that upon the reaction with BIA, a new peak at 33 min appears in the chromatogram, which is distinct from that observed for unreacted s4U or BIA. (B) Mass spectrum of s4U-BIA. (C) Mass spectrum observed for s4U.

The peaks corresponding to s4U, s4U-IA, and s4U-BIA were subjected to high-resolution MS/MS analysis to characterize the species further. A representative MS/MS spectrum for the s4U-BIA adduct is shown in Figure S4. Fragmentation of s4U-BIA via CID at 12–15 eV results in daughter ions with an m/z of 565.19 arising from the loss of ribose sugar (peak (a), Figure S4). Further fragmentation of peak (a) leads to a loss of the nucleobase, giving an ion with an m/z of 397.19 [peak (b), Figure S4]. Experimental m/z values for the daughter ions are within eight ppm of theoretical m/z values calculated for those ions. To characterize the conversion of s4U to s4U-BIA unambiguously, a large-scale modification of s4U with BIA was carried out, and the product was purified by preparative HPLC and subjected to 1H NMR experiments (see additional methods in the Supporting Information). The proton NMR spectra of commercially sourced s4U and purified s4U-BIA are shown in Figures S5 and S6, respectively. The spectral assignments for s4U and s4U-BIA were made by reference to the uridine, biotin, and linker ethylene glycol 1H NMR spectra found in the publicly available Biological Magnetic Resonance Data Bank (BMRB).[24] As seen from the s4U-BIA proton spectrum (Figure S6), resonance peaks for uracil aromatic protons, ribose protons, biotin protons, and polyethylene glycol protons can be seen clearly.
These data and MS/MS fragmentation data collectively establish that the incubation of s4U with IA and BIA leads to the covalent modification of the nucleoside, establishing the feasibility of using these as chemical biology tools.
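The ppm comparisons quoted in this section can be reproduced with a few lines of arithmetic. The sketch below is illustrative only and is not taken from the article: the elemental compositions (C9H12N2O5S for s4U, and a net gain of C2H3NO upon S-alkylation with IA) are standard chemistry assumed here, and the function names are hypothetical.

```python
# Monoisotopic masses (u) of the elements needed for these nucleosides.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740,
        "O": 15.9949146, "S": 31.9720707}
PROTON = 1.00727646  # mass of H+ (u)

# Assumed elemental compositions (not stated in the article).
S4U = {"C": 9, "H": 12, "N": 2, "O": 5, "S": 1}
# S-alkylation by iodoacetamide replaces the thiol H with CH2CONH2 (net +C2H3NO).
S4U_IA = {"C": 11, "H": 15, "N": 3, "O": 6, "S": 1}


def mono_mass(formula):
    """Monoisotopic mass of a neutral molecule given element counts."""
    return sum(MONO[element] * count for element, count in formula.items())


def mz_protonated(formula):
    """m/z of the singly charged [M + H+] ion."""
    return mono_mass(formula) + PROTON


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


if __name__ == "__main__":
    print(f"[s4U + H+]    m/z = {mz_protonated(S4U):.4f}")
    print(f"[s4U-IA + H+] m/z = {mz_protonated(S4U_IA):.4f}")
    # The 'observed' value here is a made-up number for illustration.
    print(f"ppm error: {ppm_error(261.0550, mz_protonated(S4U)):.1f}")
```

The computed [s4U + H+] value rounds to the m/z of 261.05 used for the extracted ion chromatogram above; sodium and potassium adducts would need the corresponding cation masses in place of the proton.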

IA or BIA Modification of RNA

Encouraged by the efficiency of the in vitro modification with standards, we extended the studies to RNA. In these experiments, total RNA extract from wild-type E. coli was treated with either IA or BIA, and the resulting sample was digested to nucleosides by the sequential action of P1 nuclease, phosphodiesterase, and alkaline phosphatase.[25] The UV–visible traces from the HPLC analysis of wild-type E. coli RNA with or without IA treatment are shown in Figure S7. In addition to C, U, G, and A, the traces reveal a clear peak for s4U at 10.5 min, with the expected UV–visible spectrum for the nucleoside. A time course for the modification of s4U to s4U-IA from total RNA is shown in Figure S8. After 2 h of incubation with IA, the signal for s4U is replaced with one at 11.3 min, corresponding to the s4U-IA adduct. The data show that the modification is essentially complete by 4 h. To ensure complete conversion, the reactions were allowed to proceed overnight (12–14 h). The identity of the s4U-IA adduct is further confirmed by examining the mass spectrum of the species eluting in the 11.3 min peak (Figure S7B). The observed m/z values are within 10–11 ppm of theoretical and identical to those observed with the standards (compare mass spectrum insets from Figures S2 and S7). The studies described for the modification of s4U with IA were extended to BIA. As above, the BIA-treated RNA was analyzed by LC–MS. The UV–vis (190–800 nm) absorbance traces show a peak for s4U-BIA [trace (a), Figure A], though the peak for its precursor s4U co-elutes with G [trace (b), Figure A]. Because of the hydrophobic nature of BIA, a modified elution method was used, which did not separate G and s4U as well. However, when we examine the data at 330 nm, which corresponds to the absorbance maximum for s4U [Figure B, trace (f), see Figure ], a peak corresponding to the nucleoside is visible at 13.4 min in the untreated samples.
By contrast, when the traces are examined at 305 nm, which corresponds to the absorbance maximum for the alkylated nucleoside [Figure B, trace (c), see Figure ], the peak corresponding to the BIA modified nucleoside at 33 min is prominent in treated samples. We note that the retention times observed for the adduct and unmodified nucleoside are consistent with those observed with the commercially obtained s4U (Figure A). Finally, the mass spectrum of the s4U-BIA corresponds (within 3 ppm) to that expected for the adduct (Figure C).

Figure 4

LC–MS analysis of nucleosides in WT E. coli total RNA treated with BIA. E. coli RNA was digested to nucleosides and analyzed by LC–MS. (A) UV–visible traces (190–800 nm) of nucleosides from BIA-treated E. coli RNA [trace (a)] and from untreated E. coli RNA [trace (b)]. Under these conditions, s4U-BIA elutes at 33 min while its precursor s4U elutes with G at 13 min. (B) UV–visible traces were examined at 305 nm for s4U-BIA and at 330 nm for s4U in E. coli RNA treated with BIA [traces (c) and (d), respectively]. The chromatograms show a very diminished peak for s4U, as compared to untreated E. coli RNA samples at 305 and 330 nm [traces (e) and (f), respectively]. The y-axis scale for traces (a) and (b) differs from that for traces (c–f); the RNA inputs were identical. (C) Mass spectrum observed for s4U-BIA.

The s4U and s4U-BIA originating from biological RNA samples showed similar retention times to commercially sourced s4U and purified s4U-BIA characterized by 1H NMR (Figure S9; see Figures S5 and S6). Additionally, the degradation of RNA treated by this protocol was examined using denaturing polyacrylamide gel electrophoresis (PAGE) as described in additional methods in the Supporting Information. RNA samples taken at various time points were analyzed by PAGE and are shown in Figure S10. The gel image shows minimal (if any) degradation of the total RNA treated in the protocol. Together, these results validate the use of these conditions in an NGS workflow.

BIA Modification of RNA from Control Strains

Before applying the IA methodology in sequencing experiments, we conducted experiments to specifically implicate the cellular sulfur insertion machinery in the modification (Figure S11).[26] In the biosynthetic pathway of s4U, the PLP-dependent enzyme IscS mobilizes S from Cys as a persulfide attached to a Cys residue in the protein. The persulfide donates the sulfur to ThiI, which in an ATP-dependent reaction incorporates the sulfur into tRNA. In these experiments, total RNA from wild-type, ΔiscS, and ΔthiI strains of E. coli was isolated, reacted with BIA, and digested to nucleosides. The resulting mixtures were analyzed by HPLC, as described in Materials and Methods (Figure ). Interestingly, the sample from the ΔiscS variant appears to contain s4U [trace (c), Figure ], which elutes similarly to the nucleoside in the wild-type samples [trace (a), Figure ] and exhibits UV–visible features that are characteristic of s4U. Perhaps this is not surprising, as E. coli has several overlapping systems for sulfur mobilization from cysteine.[27] By contrast, the ΔthiI variant lacks s4U [trace (e), Figure ]. As in the wild-type-treated samples [trace (b), Figure ], the corresponding BIA-treated samples from the ΔiscS variant clearly show the appearance of s4U-BIA [trace (d), Figure ], whereas the ΔthiI sample shows no evidence for the presence of s4U-BIA [trace (f), Figure ]. These data establish that the BIA modification specifically highlights a ThiI-dependent process, which previous studies have shown to be the incorporation of S to form s4U.[26] Finally, the data shown in Figure allow one to estimate that under the conditions of the experiments, the modification is nearly quantitative (95 ± 4%) in several biological and technical replicates (representative data are shown in Figure ). Therefore, the reaction is sufficiently robust for s4U sequencing.

Figure 5

Representative analysis of RNA from ΔthiI and ΔiscS E. coli deletion strains for the presence of s4U. The HPLC traces extracted at 330 nm (corresponding to the maximum absorbance for s4U) of samples containing total RNA from WT and ΔiscS strains [traces (a) and (c), respectively] show that the deletion of iscS does not eliminate s4U but that the deletion of thiI does [trace (e)]. The traces at 305 nm (corresponding to the maximum absorbance for s4U-BIA) of samples containing total RNA from WT and ΔiscS strains [traces (b) and (d), respectively] clearly show the presence of s4U-BIA. By contrast, RNA from the thiI deletion strain does not show any s4U-BIA [trace (f)], which is consistent with the loss of s4U in the corresponding control [see trace (e)]. To facilitate a direct comparison of intensities at 330 and 305 nm, all the y-axes are on the same scale.

We considered using BIA to enrich samples for s4U-containing species by affinity pulldown before sequencing (discussed below; see additional methods in the Supporting Information for experimental details). To this end, BIA-treated RNA from the wild-type, ΔiscS, and ΔthiI strains was incubated with streptavidin beads and eluted with excess biotin under denaturing conditions, as described in the Materials and Methods section. The resulting samples were then analyzed by denaturing PAGE (Figure S12). Samples lacking the BIA treatment served as controls. In each set of samples, we see an RNA fragment (∼80 nt) being enriched only in the BIA-containing samples. However, we see a similar band in the ΔthiI sample as well. At this point, it is not known whether this species is identical to that observed with the wild-type and ΔiscS RNA. We cannot rule out the possibility of a hitherto unidentified nucleoside that is sensitive to treatment with IA-based reagents.
Considering these results, we did not attempt any further enrichment experiments and moved forward with sequencing both unenriched and enriched samples resulting from the treatment of RNA with BIA, as described below.

Small RNA Sequencing

RNA sequencing reactions were set up with the sample groups shown in Table S1. In these experiments, untreated RNA served as a control for false positives. The small RNA samples for sequencing were isolated by preparative denaturing PAGE[28] after the indicated treatments described in the Materials and Methods section (see Table S2 for yields). In each case, a duplicate sample was included, resulting in a total of 12 samples. Sequencing libraries were prepared with a NEBNext Multiplex Small RNA Library Prep Set for Illumina from 1.5 μg of purified RNA and sequenced on an Illumina NextSeq instrument in paired-end mode (2 × 75 bp).

Data Analysis

The Materials and Methods section describes the detailed data analysis workflow. Initially, raw reads were analyzed by FastQC. Adaptor sequences were trimmed by Trimmomatic,[29] and the resulting reads were mapped to the E. coli K12 genome by STAR.[30] Next, the Cufflinks suite of tools was used to count the transcripts and analyze gene expression across the sample groups.[31] The alignments were visually assessed for the presence of mismatches, and misincorporation analysis was performed using the data generated by the BCFtools program.[32] The results of the analyses are described below.

Gene Expression Analysis

The Cufflinks suite of tools was employed to assemble, merge, and perform several RNA-Seq data analyses. An in-depth analysis of each sample group was performed, and the gene expression profiles were compared against each other. Traditionally, differential expression analysis is performed to identify the upregulated or downregulated genes across different sample groups. However, in our case, such an analysis was only used to look at the libraries’ profiles and ensure no significant biological variation between the sample groups. The aggregated read counts for genes obtained from the expression analysis were grouped according to the gene biotypes, and percentages for each biotype were calculated (Table ). Unexpectedly, the profiles for each sample group (except dTE) are dominated by protein-coding mRNAs. Because we prepared libraries from 60 to 100 nt fragments isolated from PAGE, this may indicate that larger RNA transcripts are cleaved to shorter ones during labeling. Although Figure S10 shows minimal degradation of RNA under the incubation conditions, RNA-Seq is a sensitive experiment that picks up minute quantities of RNA that may still be present and co-migrate with the tRNA. Nevertheless, because we are interested in tRNAs in this analysis, subsequent visualization and variant analysis were limited to tRNA transcripts only.

Table 1

Read Composition Percentages by Gene Biotype

class	WTU	WTT	WTE	dTU	dTT	dTE
ncRNA	14.1	28.3	26.4	28.9	31.1	48.0
mRNA	76.3	62.8	57.3	63.8	61.0	39.3
pseudo-gene	1.0	0.9	1.0	1.7	1.6	1.5
rRNA	4.5	4.1	4.1	1.6	1.7	3.8
tmRNA	0.4	0.3	0.8	0.6	0.6	0.5
tRNA	3.8	3.7	10.4	3.6	4.0	6.9

ncRNA: non-coding RNA, mRNA: messenger RNA, rRNA: ribosomal RNA, tmRNA: transfer messenger RNA, tRNA: transfer RNA. WTU: wild-type untreated, WTT: wild-type treated, WTE: wild-type treated and enriched, dTU: ΔthiI untreated, dTT: ΔthiI treated, dTE: ΔthiI treated and enriched.
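As a minimal sketch of how the biotype percentages in Table 1 can be handled, the snippet below re-enters the table values, checks that each column sums to roughly 100%, and compares the tRNA share between sample groups. The helper names (`percentages`, `column_totals`, `fold_change`) are hypothetical and not part of the published workflow; the example counts passed to `percentages` are invented.

```python
GROUPS = ["WTU", "WTT", "WTE", "dTU", "dTT", "dTE"]

# Values copied from Table 1 (percent of reads per biotype).
TABLE1 = {
    "ncRNA":       [14.1, 28.3, 26.4, 28.9, 31.1, 48.0],
    "mRNA":        [76.3, 62.8, 57.3, 63.8, 61.0, 39.3],
    "pseudo-gene": [1.0, 0.9, 1.0, 1.7, 1.6, 1.5],
    "rRNA":        [4.5, 4.1, 4.1, 1.6, 1.7, 3.8],
    "tmRNA":       [0.4, 0.3, 0.8, 0.6, 0.6, 0.5],
    "tRNA":        [3.8, 3.7, 10.4, 3.6, 4.0, 6.9],
}


def percentages(counts):
    """Convert aggregated read counts per biotype into percentages."""
    total = sum(counts.values())
    return {biotype: 100.0 * n / total for biotype, n in counts.items()}


def column_totals(table):
    """Sum the percentages of every biotype for each sample group."""
    return {g: sum(row[i] for row in table.values())
            for i, g in enumerate(GROUPS)}


def fold_change(table, biotype, numerator, denominator):
    """Ratio of a biotype's read share between two sample groups."""
    row = table[biotype]
    return row[GROUPS.index(numerator)] / row[GROUPS.index(denominator)]


if __name__ == "__main__":
    print(column_totals(TABLE1))
    print(f"tRNA share, WTE vs WTT: {fold_change(TABLE1, 'tRNA', 'WTE', 'WTT'):.2f}x")
    print(percentages({"tRNA": 50, "mRNA": 150}))  # invented counts
```

The column totals deviate from 100 by at most a few tenths, which is consistent with rounding of the individual entries.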

Visualization

In the next stage of analysis, the aligned reads were visualized in the Integrative Genomics Viewer (IGV)[33] to identify any changes to the sequence readout that would result from the presence of BIA. Figure S13 depicts a representative example of tRNA transcripts, highlighting hisR (sense strand) and glnV (antisense strand). In both cases, the reads are identical to the genome sequence, except at a single position in the BIA-treated samples. In the glnV data, a misincorporation occurs at position 8. Indeed, glnV has been shown previously to contain s4U at this position.[34] By contrast, the misincorporation in the hisR data occurs at position 9 of the tRNA (which aligns with position 8 of the sequence of all other tRNA species), which has also been demonstrated to contain s4U.[35] A close inspection of similarly constructed alignments of all the reads corresponding to tRNA genes shows misincorporations at position 8. We note that the misincorporations occur only in the BIA-treated samples with wild-type RNA but not in the thiI deletion strain, which our studies show lacks s4U (see Figure S11). Analysis of the misincorporations and mapping to the DNA shows that the BIA-modified s4U is decoded as C. Depending on whether the gene occurs on the sense or antisense strand, the base observed in the alignments is either C or G, respectively. This transition is not observed in samples that were not treated with BIA (wild-type untreated), nor is it observed in RNA from the thiI deletion strain (ΔthiI untreated and treated). This latter observation further validates the assignment of this transition to the presence of s4U.
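The strand dependence described above (C observed for sense-strand genes, G for antisense-strand genes) follows from complementing the genome-oriented substitution. The sketch below illustrates that bookkeeping; the function names are hypothetical, and the rule that a T-to-C change in transcript orientation marks alkylated s4U is taken from the observations reported here rather than from any general-purpose tool.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def transcript_substitution(ref_base, alt_base, strand):
    """Express a substitution seen on the genome forward strand
    in the orientation of the transcribed gene ('+' or '-')."""
    if strand == "+":
        return ref_base, alt_base
    if strand == "-":
        # A single position needs only complementing, not reversing.
        return COMPLEMENT[ref_base], COMPLEMENT[alt_base]
    raise ValueError(f"strand must be '+' or '-', got {strand!r}")


def is_s4u_signature(ref_base, alt_base, strand):
    """True if the substitution reads as T>C in transcript orientation,
    the transition attributed to BIA-modified s4U in this study."""
    return transcript_substitution(ref_base, alt_base, strand) == ("T", "C")


if __name__ == "__main__":
    print(is_s4u_signature("T", "C", "+"))  # sense-strand gene
    print(is_s4u_signature("A", "G", "-"))  # antisense-strand gene
    print(is_s4u_signature("A", "G", "+"))  # not the s4U transition
```

Normalizing every call to transcript orientation in this way lets sense- and antisense-strand tRNA genes be summarized in a single heat map.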

Misincorporation Analysis

The BCFtools program was used to carry out a detailed analysis of all the sequences obtained in these studies, incorporating the corresponding biological replicates in each case. In this analysis, genotype likelihoods are determined initially by estimating the most likely base at a position by examining all the reads that have been aligned at that position. Possible misincorporations are identified by comparing the most likely base at a given position to the reference genome. The frequency of misincorporation was calculated from the number of correct and misincorporated bases at each position as described in the Materials and Methods section. The misincorporation frequency heat maps for the wild-type untreated (WTU) and wild-type-treated (WTT) sample groups are depicted in Figure . As seen in previous studies,[11,12] the presence of the s4U modification by itself at position 8 led to the misincorporation of deoxynucleotides during the RT reaction, but at a lower frequency, in the untreated sample (Figure A). The values observed for WTU at position 8 are in the range of 0.003–0.35 with a median of 0.04, in contrast to the WTT sample (Figure B), which displayed much higher values in the range of 0.002–0.98 with a median of 0.75. Likewise, the wild-type-treated and enriched (WTE) sample group displayed similar values at position 8 in the range of 0.02–0.94, with a median of 0.69 (Figure S14C). At this position, either T > C or A > G transitions are seen depending on whether the gene is on the sense strand or the antisense strand. The control ΔthiI untreated (dTU), ΔthiI treated (dTT), and ΔthiI treated and enriched (dTE) samples depicted in Figure S14 show that position 8 does not display the misincorporations seen in wild-type samples. The median values of misincorporation at position 8 for dTU, dTT, and dTE are in the range of 0.00–0.01. Notably, the tRNAs corresponding to selenocysteine and isoleucine (Ile) have values close to zero at position 8.
The misincorporation frequency values are shown in Table S3. It is also notable that the misincorporation frequencies were not uniform across all genes, resulting from the incomplete conversion of s4U to s4U-BIA or non-uniform cellular s4U incorporation. Nevertheless, it is evident from the heat maps that the treatment of BIA leads to increased misincorporation at position 8, which is diagnostic for the presence of s4U.
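A minimal sketch of the frequency calculation is given below. The exact formula is described in the Materials and Methods section, which is not reproduced here, so the definition used (mismatched bases divided by all bases observed at the position) is an assumption; the base counts and the function name are invented for illustration.

```python
from statistics import median


def misincorporation_frequency(base_counts, reference_base):
    """Fraction of reads at one position whose base differs from the
    reference. Assumed definition: mismatches / (matches + mismatches).
    Returns None when the position has no coverage."""
    total = sum(base_counts.values())
    if total == 0:
        return None
    mismatches = total - base_counts.get(reference_base, 0)
    return mismatches / total


if __name__ == "__main__":
    # Invented position-8 base counts for three tRNA genes (reference T).
    position8 = [
        {"T": 25, "C": 75},
        {"T": 90, "C": 10},
        {"T": 10, "C": 88, "A": 2},
    ]
    freqs = [misincorporation_frequency(c, "T") for c in position8]
    print(freqs)
    print("median:", median(freqs))
```

Summarizing position 8 by its median across genes, as done in the text for each sample group, keeps a few low-coverage or unmodified tRNAs from dominating the comparison.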

Figure 6

Misincorporation profiles of tRNAs derived from WT E. coli. Heat maps summarizing misincorporation frequency for tRNAs from (A) wild-type untreated (WTU) and (B) wild-type-treated (WTT) sample groups. The modifications I and s4U (before treatment in WTU, after treatment in WTT) are indicated by a black line and labeled in white. Positions labeled as N* contain unknown post-transcriptional modifications. Alignment of tRNA genes of E. coli is carried out based on the alignments derived from tRNAdb. For clarity, the x-axis tick marks denote every fourth base. The y-axes show tRNA species corresponding to each amino acid. The height of each bar correlates with the individual tRNAs coding for the amino acid. Gapped regions within the alignments were set to background color. The secondary structure attributes below the x-axis are labeled based on the alignments listed on tRNAdb. Acc-stem: acceptor stem, Ac-arm: anticodon arm, CCA: terminal nucleosides of tRNA genes. Each of the images is generated by considering both replicates in the respective sample groups. The raw data used to generate the heat map is reproduced in Figure S14.

Close inspection of all sample groups (Figures and S14) reveals the presence of additional misincorporation hotspots. However, unlike with s4U, these are present in all samples, including the thiI deletion strain. For example, the I present at position 34 on the anticodon loop of tRNAArg (ACG) consistently leads to a misincorporation of C instead of G.[7] Intriguingly, additional misincorporations observed in all sample groups (denoted by N* in Figures and S14) are, to our knowledge, not correlated with any known modifications and may be novel. Because all of these are present ±thiI, if they contain an S, ThiI is not the source. We note that the bulky post-transcriptional modification 3-(3-amino-3-carboxypropyl)uridine (acp3U), which has previously been shown to induce misincorporation,[11,12] was not detected in our studies.
However, this difference may be due to the selectivity of the polymerase employed.

Sequencing by MS

The RNA-Seq experiments clearly show a measurable increase in the transition frequencies related to IA modification and the presence of the thiI gene. However, a direct demonstration was sought to independently verify the presence of s4U on E. coli tRNAs and thereby validate the RNA-Seq workflow. The presence of s4U on various E. coli tRNAs was established by digesting E. coli total tRNA with the RNase T1 enzyme and analyzing the resulting fragments by MS/MS. The nuclease cleaves single-stranded RNA on the 3′ side of G residues to generate smaller fragments. Typical amounts of size-selected tRNA used for RNase T1 experiments are shown in Table S4. Figure shows HPLC chromatograms of RNase T1-digested tRNA samples from wild-type and deletion E. coli strains. As shown in Figure , s4U absorbs at 330 nm, which allows the UV detection of s4U-containing fragments. The chromatograms at 330 nm show numerous peaks, which are consistent with the presence of s4U. Control experiments with tRNA from ΔiscS also show peaks, whereas the ΔthiI chromatograms show no features, consistent with the loss of the modification (Figure ). These results are in harmony with the nucleoside analysis of tRNA from wild-type and deletion E. coli strains (see Figures and 5), which shows a requirement for thiI in the modification pathway.
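The cleavage rule described above (RNase T1 cuts on the 3′ side of G) can be illustrated with a minimal in-silico digest. This is a sketch only, not the software used in the study: the function name `rnase_t1_digest` is hypothetical, the sequence is treated as an unmodified single-letter string, complete digestion with no missed cleavages is assumed, and the input is a short illustrative stretch rather than a curated tRNA sequence. The default minimum fragment length of 2 mirrors the search setting reported in the Materials and Methods section.

```python
def rnase_t1_digest(sequence: str, min_length: int = 2) -> list[str]:
    """Split an RNA sequence after every G (idealized, complete RNase T1 digest).

    Fragments ending in G are written with a trailing 'p' for the 3' phosphate;
    a 3'-terminal fragment that does not end in G is kept without it.
    Assumption: one letter per nucleotide, no modified bases.
    """
    fragments = []
    current = ""
    for base in sequence.upper():
        current += base
        if base == "G":
            fragments.append(current + "p")
            current = ""
    if current:
        fragments.append(current)
    # Count nucleotides only (the lowercase 'p' is not a nucleotide).
    return [f for f in fragments if len(f.rstrip("p")) >= min_length]


# Illustrative 5' stretch (hypothetical input): single G fragments are dropped.
print(rnase_t1_digest("GGGGCUAUAGCUCAG"))
```

In this toy input, the fragment CUAUAGp contains the U at position 8, which corresponds to the unmodified counterpart of the CUA[s4U]AGp fragment discussed in the text.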

Figure 7

Representative LC analysis of RNase T1 digest of tRNA from WT and deletion E. coli strains. The HPLC traces extracted at 330 nm (corresponding to the maximum absorbance for s4U) of RNase T1 digestion samples are shown. Fragments containing s4U display absorbance at 330 nm and distinct peaks are visible in the samples. The corresponding strains are labeled. To facilitate comparison, the intensities on the y-axes are normalized per μg of the sample and depicted on similar scales.

The samples were further analyzed by subjecting the eluent to MS/MS analysis in the negative ion mode. In these experiments, the five highest intensity species at each time were subjected to MS fragmentation as described in the Materials and Methods section. The resulting data were analyzed by the NucleicAcidSearchEngine (NASE)[17] using a database containing all the fully modified E. coli tRNA sequences. The resulting output was examined for the s4U-containing fragments and visualized in TOPPView from the OpenMS toolset.[36] Some of the tRNA species generated smaller fragments such as s4UGp and s4UAGp, which could not be assigned unambiguously to a specific tRNA, while others generated unique fragments that were readily mapped. Likewise, tRNA from ΔthiI E. coli was used as a control sample in which the corresponding unmodified U8 fragment for each tRNA was identified, so that the position of s4U could be assigned confidently by mass spectrometry. A representative MS/MS analysis for the s4U-containing fragment CUA[s4U]AGp from tRNAAla is shown in Figure S15. Data with the ΔthiI tRNA sample show that the corresponding CUAUAGp is not modified. The results of these analyses are summarized in supplementary extended Table. The data from LC–MS/MS analyses are compared to sequencing analyses and to the tRNAdb[37] and MODOMICS[38] databases in Table . The tRNA sequences used for the analysis are provided in supplementary extended Table.
In a few cases, there are minor differences between the reported modification data at position 8 in MODOMICS and/or tRNAdb and assignments from the RNA-Seq and MS/MS analyses. However, there is general agreement between the RNA-Seq and mass spectrometry analyses. For example, the observed enhancement of the misincorporation frequency for Ala tRNA genes is corroborated by the presence of s4U in the MS/MS analysis of all five Ala tRNA genes (see Tables S3 and 2). In the MODOMICS and tRNAdb databases, both modified and unmodified U are observed (Table ).

Table 2

Comparison of tRNA Modification at Position 8 from Databases against the RNA-Seq and MS/MS Analyses^a

gene	local DB	anticodon	MODOMICS	tRNAdb	RNA-Seq	RNase T1-MS	RNaseT1—theoretical fragment
alaT, alaU, alaV	Ala1	VGC	U	s⁴U/U	s⁴U	s⁴U	CUA[s⁴U]AGp
alaW, alaX	Ala2	GGC	U	N	s⁴U	s⁴U	CUA[s⁴U]AGp
argQ, argV, argY, argZ	Arg1	ICG	s⁴U	s⁴U	s⁴U	Amb	[s⁴U]AGp
argX	Arg2	CCG	U	U	n.d.	Amb	[s⁴U]AGp
argU	Arg3	{CU	U	U	s⁴U	n.d.	CCCU[s⁴U]AGp
argW	Arg4_NA^#	{CU			n.d.	n.d.	UCCUCU[s⁴U]AGp
asnT, asnU, asnV, asnW	Asn	QUU	s⁴U	s⁴U	s⁴U	Amb	[s⁴U]AGp
aspT, aspU, aspV	Asp	QUC	s⁴U	s⁴U	s⁴U	Amb	[s⁴U]AGp
cysT	Cys	GCA	s⁴U	s⁴U	s⁴U	U/s⁴U	U[s⁴U]AACAAAGp
glnU, glnW	Gln1	NUG	s⁴U	s⁴U	s⁴U	s⁴U	UA[s⁴U]CGp
glnV, glnX	Gln2	CUG	s⁴U	s⁴U	s⁴U	s⁴U	UA[s⁴U]CGp
gltT, gltU, gltV, gltW	Glu	SUC	U	U	s⁴U	U	UCCCCUUCGp
glyU	Gly1	CCC	s⁴U	s⁴U	s⁴U	Amb	[s⁴U]AGp
glyT	Gly2	NCC	U	U	s⁴U	U/s⁴U	CA[s⁴U]CGp
glyV, glyW, glyX, glyY	Gly3	GCC	s⁴U	U	s⁴U	U/s⁴U	AA[s⁴U]AGp
hisR	His	QUG	s⁴U	s⁴U	s⁴U	s⁴U	CUA[s⁴U]AGp
ileT, ileU, ileV	Ile1	GAU	U	U	U	Amb	UAGp
ileX	Ile2	}AU	s⁴U	s⁴U	n.d.	n.d.	CCCCU[s⁴U]AGp
ileY^#	Ile3_NA	}AU			n.d.	n.d.	CCCUU[s⁴U]AGp/CCCUUUAGp
metV, metW, metZ*	Ini1	CAU	s⁴U	s⁴U	s⁴U	Amb	[s⁴U]Gp
metY*	Ini2	CAU	s⁴U	s⁴U	s⁴U	Amb	[s⁴U]Gp
leuP, leuQ, leuT, leuV	Leu1	CAG	U	U	s⁴U	Amb	UGp
leuZ	Leu3	)AA	s⁴U	s⁴U	s⁴U	Amb	A[s⁴U]Gp
leuU	Leu2	GAG	U	U	s⁴U	Amb	[s⁴U]Gp
leuX	Leu4	BAA	s⁴U	s⁴U	s⁴U	Amb	[s⁴U]Gp
leuW^#	Leu5_NA	VAG			s⁴U	Amb	[s⁴U]Gp
lysQ, lysT, lysV, lysW, lysY, lysZ	Lys	SUU	U	s⁴U	s⁴U	U/s⁴U	U[s⁴U]AGp
metT, metU	Met	MAU	s⁴U	s⁴U	n.d.	Amb	[s⁴U]AGp
pheU, pheV	Phe	GAA	s⁴U	s⁴U	s⁴U	s⁴U	A[s⁴U]AGp
proK	Pro	CGG	U	s⁴U	n.d.	s⁴U/U	AU[s⁴U]Gp
proL^#	Pro2_NA				n.d.	Amb	[s⁴U]AGp/UAGp
proM^#	Pro3_NA				s⁴U	Amb	[s⁴U]AGp/UAGp
selC	Sec	UCA	s⁴U	U	U	U	UCGp
serT	Ser1	VGA	s⁴U	s⁴U	U	Amb	[s⁴U]Gp/UGp
serU	Ser2	CGA	U	U	s⁴U	s⁴U	A[s⁴U]Gp
serV	Ser3	GCU	s⁴U	s⁴U	s⁴U	Amb	[s⁴U]Gp
serW, serX	Ser4	GGA	s⁴U	s⁴U/U	s⁴U	Amb	[s⁴U]Gp
thrV	Thr1	GGU	U	U	s⁴U	s⁴U	AUA[s⁴U]Gp
thrT	Thr2	GGU	U	U	s⁴U	U/s⁴U	AUA[s⁴U]AGp
thrU	Thr3_NA	VGU	NA	NA	s⁴U	s⁴U	ACU[s⁴U]AGp
thrW	Thr4_NA	NA	NA	NA	s⁴U	s⁴U	AUA[s⁴U]AGp
trpT	Trp	CCA	s⁴U	s⁴U	s⁴U	Amb	[s⁴U]AGp
tyrT, tyrU	Tyr1	QUA	[s⁴U8][s⁴U9]	U8[s⁴U9]	s⁴U8	[s⁴U8][s⁴U9]	[s⁴U][s⁴U]CCCGp
tyrV	Tyr2	QUA	[s⁴U8][s⁴U9]	U8[s⁴U9]	s⁴U8	[s⁴U8][s⁴U9]	[s⁴U][s⁴U]CCCGp
valT, valU, valX, valY, valZ	Val1	VAC	s⁴U	s⁴U	s⁴U	s⁴U	AU[s⁴U]AGp
valW	Val2	GAC	s⁴U	s⁴U	s⁴U	Amb	[s⁴U]AGp
valV	Val3	GAC	s⁴U	s⁴U	s⁴U	U/s⁴U	UUCA[s⁴U]AGp

Local DB: name of tRNA in the local sequence database; refer to supplementary extended Table. Anticodon: anticodon on the tRNA. MODOMICS: the modification at position 8 on the MODOMICS[38] website. tRNAdb: the modification at position 8 on tRNAdb.[37] RNA-Seq: the modification at position 8 using RNA-Seq experiments. RNase T1-MS: the modification at position 8 using LC–MS/MS analysis. RNaseT1 theoretical fragment: theoretical oligonucleotide generated upon RNase T1 digestion of a tRNA. Amb: ambiguous; the theoretical fragment generated is detected but could not be assigned to one tRNA species unambiguously. n.d.: not detected; N: unknown modified uridine; V: uridine 5-oxyacetic acid; I: inosine; {: 5-methylaminomethyluridine; Q: queuosine; $: 5-carboxymethylaminomethyl-2-thiouridine; S: 5-methylaminomethyl-2-thiouridine; B: 2′-O-methylcytidine; ): 5-carboxymethylaminomethyl-2′-O-methyluridine; *initiator methionine tRNAs. #These tRNAs have a gene, but the RNA sequences are not represented in the MODOMICS and/or tRNAdb databases.

The MS/MS data do not just confirm sequencing results but reveal additional modification data. The MS/MS analysis, for example, reveals the presence of s4U at two adjacent positions in Tyr tRNA. The MS/MS data with the wild-type sample show [s4U][s4U]CCCGp, which spans positions 8 and 9 in the RNA. The corresponding positions in the ΔthiI tRNA sample contain U. In contrast, the enhanced misincorporation frequency is only reflected at position 8 of Tyr tRNA genes in the RNA-Seq analysis (see Figure ). In a few cases, the data are consistent with the presence of both modified and unmodified U, which could be attributed to the incomplete sulfuration of U8 in those tRNAs. This observation is in line with the RNA-Seq analysis where, upon BIA treatment, a few genes showed close to 80–90% misincorporation at position 8, while others were in the 30–70% range (Table S3).
Therefore, there are differential modification frequencies of certain tRNA species in the stationary phase of growth, though the data preclude the quantification of the dynamics of s4U installation on individual tRNA species. In addition to the untreated samples, RNase T1 digestion was carried out on IA- and BIA-treated tRNA samples to supplement the sequencing analysis. The NASE utilized in this study allows the use of custom modifications by utilizing the monoisotopic mass of the modified nucleoside. Therefore, the data from IA- or BIA-treated tRNA samples were analyzed against sequences containing either s4U-IA or s4U-BIA in place of s4U on E. coli tRNA sequences as described in the Materials and Methods section. Treatment of the sample with IA or BIA leads to loss of the UV–vis detection of peaks in LC, consistent with the change in the UV–visible absorbance of treated s4U (Figures S16 and S17). A representative MS/MS analysis of the CUA[s4U-IA]AGp and CUA[s4U-BIA]AGp fragments originating from tRNAAla after IA and BIA treatments, respectively, is depicted in Figure S18. The observed dissociation pattern for those fragments allows for the confident assignment of these sequences to a precursor ion with the observed mass. Interestingly, in Tyr tRNA, fragments containing s4U-IA and s4U-BIA at positions 8 and 9 are found, corroborating the presence of s4U at both positions. All the MS/MS analysis data of RNase T1-digested samples by NASE are tabulated in extended supplementary Table. These observations indicate that the iodoacetamide reaction toward s4U is very specific in the context of the complex tRNA modifications present on E. coli tRNA. In conclusion, the iodoacetamide reaction modifies s4U both specifically and efficiently in the context of tRNA. Overall, our method demonstrates the utility of the chemical treatment of s4U with iodoacetamide in site-specific profiling of natively occurring s4U on tRNAs.
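Because the search relies on the monoisotopic mass of the modified nucleoside, it may help to see how such a mass can be derived for a custom modification. The sketch below is illustrative and makes two assumptions that are not stated in this section: that IA reacts by simple S-alkylation with loss of HI (a net gain of C2H3NO on s4U), and that the s4U nucleoside has the elemental formula C9H12N2O5S. The helper names are hypothetical, and the BIA adduct is not computed here.

```python
# Monoisotopic masses of the most abundant isotopes (Da).
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "S": 31.97207117,
}


def monoisotopic_mass(formula: dict[str, int]) -> float:
    """Sum isotope masses for an elemental formula given as {element: count}."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


def add_formulas(a: dict[str, int], b: dict[str, int]) -> dict[str, int]:
    """Combine two elemental formulas element by element."""
    combined = dict(a)
    for element, count in b.items():
        combined[element] = combined.get(element, 0) + count
    return combined


S4U = {"C": 9, "H": 12, "N": 2, "O": 5, "S": 1}      # assumed formula of 4-thiouridine
CARBAMIDOMETHYL = {"C": 2, "H": 3, "N": 1, "O": 1}   # assumed net gain: +CH2CONH2, -H
S4U_IA = add_formulas(S4U, CARBAMIDOMETHYL)

print(f"s4U    : {monoisotopic_mass(S4U):.4f} Da")
print(f"s4U-IA : {monoisotopic_mass(S4U_IA):.4f} Da")
```

Under these assumptions the alkylated nucleoside is about 57.02 Da heavier than s4U, which is the kind of value that would be supplied as a custom modification mass.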

Discussion

The s4U nucleoside was first prepared in 1958,[39] long before it was discovered in a soluble RNA extract from E. coli in 1965.[40] The presence of 2-thiouridine derivatives in RNA was also uncovered around the same time.[41] These revelations sparked considerable interest in the roles of these non-canonical nucleosides. Analytical studies by Favre et al. revealed that the irradiation of E. coli tRNAVal with 334 nm light led to adduct formation between s4U at position 8 and C at position 13.[42] Efforts were also made to understand the structure of the photoadduct.[43,44] Additional studies[45,46] showed similar reactivity in vivo, leading to the suggestion that the modified base acts as a UVA radiation sensor to induce the growth delay and promote photoprotection during the stress.[47,48] Experiments focused on understanding the mechanism of growth delay indicated a dependency on the levels of ppGpp, which inhibits rRNA synthesis in a manner similar to that observed during amino acid starvation.[48−50] The hypothesis of growth delay by s4U was also supported by the observation of a minimal growth delay following irradiation in nuv E. coli variants lacking s4U.[47,49] Favre and colleagues examined the fate of cross-linked tRNA in E. coli after removing the stress.[51,52] Intriguingly, the adduct appears to be resolved upon restoring favorable growth conditions, but there is no established mechanism for such a process.[51,52] While analytical studies based on fluorescence measurements indicate the presence of s4U in 70% of E. coli tRNA,[47] understanding the role of the modification and its dynamics under other common stress conditions has not been possible because of the lack of site-specific profiling methods.
The reactivity and photochemistry of s4U have also been leveraged into chemical biology tools to explore RNA dynamics.[18,53] For example, incorporating externally supplied s4U during growth allows labeling of newly synthesized RNA.[23,54−56] Additionally, the nucleophilicity of the sulfur atom has been exploited to attach desired reagents/fluorescent probes for analytical or drug development purposes.[19,20] While these studies underscore the unique reactivity of the modified base, to date, it has not been exploited for the site-specific detection of natively occurring s4U in E. coli. While methylation has received substantial contemporary attention, it is not the only modification that is known to occur in RNA. Indeed, pioneering work from several groups starting from the 1970s has brought to light many chemically diverse bases.[2,38] Current estimates put the number of unique modifications in tRNA at 105.[2] Some are made by simple one-step routes, but many hypermodifications require an army of enzymes and cofactors to introduce. In this regard, Q,[57−60] Y,[61,62] and s4U[26,63,64] are notable for their biosynthetic complexity. In this article, we have leveraged the nucleophilic reactivity of s4U to carry out single-nucleotide-level profiling of E. coli tRNA. The data show that the modification is present in 19 of 20 amino acid coding tRNAs under these conditions. In all but one case, the modification occurs at position 8 of the tRNA. In hisR, the modification occurs at position 9, which is consistent with previously reported sequencing results.[35] It is notable that under the conditions used in these experiments, the “readout” is a base transition in all cases, providing a concise method for identifying the modification in tRNA.
It is also important to point out that while tRNA is known to undergo many modifications, our data show increased misincorporation frequency for s4U-BIA in the treated samples, highlighting the fact that the s4U modification can be observed against the background of all other possible modifications. The extensive analytical studies that underpin the work provide confidence that the reactivity of IA or BIA is unique to the nucleoside, as we observed no modification of the canonical nucleosides, even under conditions where high concentrations of both nucleoside and reagent were present. Finally, the observation that the transition is not observed in the thiI deletion strain provides confidence in the assignment. Therefore, the method could be applied directly to the profiling of natively occurring s4U on tRNA, for example, under different physiological stress conditions or in bacterial species where the presence of s4U is not known.

Materials and Methods

Reaction of IA or BIA with s4U

Commercially available s4U was treated with a tenfold excess of either iodoacetamide (IA, TCI Chemicals) or iodoacetyl-PEG2-biotin (BIA, Thermo Scientific) as follows. s4U (0.1 mM) was incubated with 1 mM of either IA or BIA in 0.05 M sodium phosphate (NaPi) buffer (pH 8) at 50 °C overnight. The reaction mixtures were quenched with a tenfold excess of DTT (10 mM). UV–visible absorption spectra were recorded on an Agilent 8454 diode-array spectrophotometer from 190 to 800 nm. The reaction mixtures were analyzed further on an Ultimate 3000 HPLC instrument with a photodiode array detector coupled to an LTQ Orbitrap XL mass spectrometer (Thermo Fisher Scientific) as follows. The mixtures were injected onto a Hypersil Gold C-18 column [particle size: 1.9 μm, dimensions: 2.1 mm (D) × 150 mm (L), Thermo Fisher] pre-equilibrated with 0.05 M ammonium acetate (pH 5.3) at a flow rate of 0.2 mL/min. Separation was carried out with a gradient of 40% acetonitrile (buffer B, optima grade from Thermo Fisher) in water (optima grade from Thermo Fisher) over 17 min for IA-treated samples (time: 0–3.5 min, % B: 0–0.8; time: 3.5–3.75 min, % B: 0.8–3.2; time: 3.75–4.0 min, % B: 3.2–5.0; time: 4–12 min, % B: 5.0–25.0; time: 12–15 min, % B: 25–50; time: 15–17 min, % B: 50–75; time: 17–17.1 min, % B: 75–100; time: 17.1–20 min, % B: 100) and over 45 min for the BIA-treated samples (time: 0–4.4 min, % B: 0–0.2; time: 4.4–5.8 min, % B: 0.2–0.8; time: 5.8–7.2 min, % B: 0.8–1.8; time: 7.2–8.6 min, % B: 1.8–3.2; time: 8.6–10 min, % B: 3.2–5.0; time: 10–25 min, % B: 5–25; time: 25–30 min, % B: 25–50; time: 30–34 min, % B: 50–75; time: 34–37 min, % B: 75; time: 37–45 min, % B: 75–100; time: 45–48 min, % B: 100). Variations of the above LC method were used to analyze a few samples; those experimental details are listed in the additional methods in the Supporting Information. The method employed for BIA-treated samples was longer to accommodate the hydrophobicity of the BIA moiety.
All MS data were recorded on an LTQ Orbitrap XL mass spectrometer in the positive ion mode with an FT analyzer at a resolution setting of 100,000 and an m/z range of 50–1400. The instrument was maintained at a capillary temperature of 275 °C, with sheath gas flow 35, auxiliary gas flow 12, and a source voltage of 3 kV. MS/MS fragmentation was achieved by collision-induced dissociation at energy settings as noted in the results. Xcalibur software (Thermo Fisher) was used to analyze all the LC–MS data, and the deconvoluted spectra were generated by exporting from the raw data in Xcalibur.
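The segmented gradient for the IA-treated samples can be written as a small lookup table, which makes it easy to check the programmed % B at any time point. The sketch below transcribes the segments listed above and assumes that each segment is a linear ramp between its stated start and end values, which the text does not state explicitly; the names `IA_GRADIENT` and `percent_b` are hypothetical.

```python
# (start_min, end_min, %B_start, %B_end), transcribed from the IA-sample gradient.
IA_GRADIENT = [
    (0.0, 3.5, 0.0, 0.8),
    (3.5, 3.75, 0.8, 3.2),
    (3.75, 4.0, 3.2, 5.0),
    (4.0, 12.0, 5.0, 25.0),
    (12.0, 15.0, 25.0, 50.0),
    (15.0, 17.0, 50.0, 75.0),
    (17.0, 17.1, 75.0, 100.0),
    (17.1, 20.0, 100.0, 100.0),
]


def percent_b(time_min: float, gradient=IA_GRADIENT) -> float:
    """Return the programmed % B at a given time, assuming linear ramps."""
    for start, end, b_start, b_end in gradient:
        if start <= time_min <= end:
            fraction = (time_min - start) / (end - start)
            return b_start + (b_end - b_start) * fraction
    raise ValueError(f"time {time_min} min is outside the gradient program")


# Halfway through the 4-12 min segment (5 -> 25 % B).
print(percent_b(8.0))
```

The same structure would hold for the longer BIA program by substituting its segment list.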

Total RNA Extraction from E. coli

Total RNA was extracted from either wild-type or deletion strains of E. coli by employing guanidine-based methods with some modifications as detailed below.[65] The cell pellet from 1 L of the growth was resuspended in 6 mL of denaturation buffer consisting of 4 M guanidine thiocyanate, 0.025 M sodium citrate, 0.5% sarkosyl, and 0.1 M 2-mercaptoethanol. The resulting suspension was equally distributed into 10–12 two mL Eppendorf tubes, each of which contained 0.7 mL of the suspension. Next, 2 M sodium acetate at pH 4 (0.1 mL), phenol (1 mL), and 1-bromo-3-chloropropane (0.2 mL) were added in succession, vortexed, and incubated on ice for 15 min. The tubes were centrifuged at 15,000g, and the resulting top transparent layer (0.3–0.4 mL) was transferred to a new 1.5 mL Eppendorf tube. An equivalent volume of isopropanol was added before placing the tubes at −20 °C for at least 30 min. The precipitated RNA was collected by centrifugation, and the supernatant was discarded. The RNA pellet obtained was further purified using a Qiagen miRNeasy mini kit following the manufacturer’s directions for the purification of RNA >18 nt. In brief, the pellet was resuspended in 0.05 mL of RNase-free water (Invitrogen) and vortexed. The Qiazol reagent (0.25 mL) and chloroform (0.05 mL) were added in succession, vortexed, and centrifuged at 21,000g for 1 min. The top transparent layer was transferred to a new 1.5 mL Eppendorf tube containing 1.5 volumes of cold ethanol and mixed by pipetting. The entire content of the tube was loaded onto an Omega BioTek HiBind RNA spin column fitted with a collection tube. The column was then washed once with 0.7 mL RWT (Qiagen) and twice with 0.5 mL RPE (Qiagen) buffers in succession, with centrifugation to permit the removal of the wash solutions, before eluting the RNA with 40 μL of RNase-free water (Invitrogen).
RNA prepared from the same starting cell pellets was pooled and stored in smaller aliquots at −80 °C until further use.

Reaction of IA or BIA with Total RNA from E. coli

The RNA extracted was treated with either IA or BIA as follows. Purified RNA (0.04 mL) was incubated with 1 mM of either IA or BIA in 0.05 mM NaPi buffer (pH 8) overnight at 50 °C. The reaction mixtures were quenched with a tenfold excess of DTT (10 mM) over the reagent, and RNA was purified using the Qiagen miRNeasy mini kit (see above). The eluted RNA was digested to the nucleoside level by the successive action of P1 nuclease, phosphodiesterase, and alkaline phosphatase, as reported previously.[25] The reaction mixtures were filtered on VWR centrifugal filters (PES membrane, MWCO 10 kDa) to remove the protein components prior to analysis. LC–MS analyses of these mixtures were carried out as described above for the nucleoside samples.

Purification of RNA for Sequencing Experiments

The desired small RNA (∼60–100 nt) for sequencing experiments was purified by employing preparative PAGE (28 cm H × 16.5 cm W) prepared with 6% acrylamide, 8 M urea, 1× TBE buffer, TEMED, and 30% APS.[28] The RNA samples were mixed with an equivalent volume of 2× loading buffer.[28] The RNA was separated at 25 W for 2 h or until the lower dye band was two-thirds through the gel. The bands were detected by UV-shadowing and excised from the gel. RNA in the bands was eluted into solution by incubating the gel pieces overnight at 4 °C in the crush soak buffer, which was prepared by mixing 1 mL of 1 M Tris·HCl pH 7.5, 4 mL of 5 M NaCl, and 0.2 mL of 0.5 M EDTA pH 8 (diluted to 100 mL by adding water and autoclaved). Following centrifugation at 21,000g for 5 min, the supernatant was transferred to a new tube containing 0.7 mL of cold ethanol. RNA was precipitated by placing the tubes at −20 °C for 2 h. The RNA pellet was collected by centrifugation and dried under vacuum conditions in a Savant speed-vac at room temperature. The resulting white residue was resuspended in 0.02 mL of Tris–EDTA buffer (made by mixing 0.1 mL of Tris·HCl pH 7.5 and 0.02 mL of 0.5 M EDTA pH 8.0, made to 10 mL by adding water and sterile filtered) and quantified using a Nanodrop. The amount of tRNA obtained for sequencing experiments is listed in Table S2.
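The crush soak buffer is specified by stock volumes rather than final concentrations. A short dilution calculation (C1·V1 = C2·V2) recovers the final concentrations implied by the recipe. This is a sketch only: the function name is hypothetical, and the calculation assumes the stated final volume of 100 mL.

```python
def final_concentration_mM(stock_molar: float, stock_volume_mL: float,
                           final_volume_mL: float) -> float:
    """Dilution rule C1*V1 = C2*V2, with the result reported in mM."""
    return stock_molar * 1000.0 * stock_volume_mL / final_volume_mL


FINAL_VOLUME_ML = 100.0  # from the recipe in the text

# (stock concentration in M, stock volume in mL), as given in the text.
crush_soak_recipe = {
    "Tris·HCl pH 7.5": (1.0, 1.0),
    "NaCl": (5.0, 4.0),
    "EDTA pH 8": (0.5, 0.2),
}

for component, (stock_molar, volume_mL) in crush_soak_recipe.items():
    conc = final_concentration_mM(stock_molar, volume_mL, FINAL_VOLUME_ML)
    print(f"{component}: {conc:.0f} mM")
```

By this arithmetic the buffer works out to roughly 10 mM Tris·HCl, 200 mM NaCl, and 1 mM EDTA.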

RNase T1 Digestion and LC–MS/MS Analysis

The RNA samples used for RNase T1 digestion were obtained with a slight variation of the procedure described above. RNA from either WT or deletion strains was obtained by phenol extraction[25] followed by gel purification as described above, prior to digestion with the RNase T1 enzyme. The cell pellet from 1 L of the growth was resuspended in 1 mL of resuspension buffer (10 mM Tris–HCl (pH 8), 10 mM MgCl2 and 0.15 M NaCl) per mg of the cell pellet, and mixed by rotation at 4 °C. After resuspension, 1 mL of saturated phenol buffered at pH 4.3 (Thermo Scientific) was added per mg of the starting cell pellet and mixed by inversion for 1 h at 4 °C. The suspension was centrifuged at 8000 rpm at 4 °C for 30 min, and the top semi-transparent layer was removed, mixed with saturated phenol (∼1 mL/mg of starting cell pellet), and centrifuged at 8000 rpm at 4 °C for 30 min. After centrifugation, the upper transparent layer was removed and mixed with 0.1 volumes of 3 M sodium acetate pH 5.5 and 3 volumes of ethanol. The sample was incubated for 2 h at −20 °C, and the precipitated RNA was collected by centrifugation for 30 min at 12,000 rpm and 4 °C. The resulting pellet was redissolved in water, and total tRNA was isolated using the preparative PAGE purification method as described above. The RNase T1 digestion reaction was set up as follows. The purified tRNA (49 μL) was incubated with 1000 U of RNase T1 (1000 U/μL, Thermo Scientific) in buffer containing 0.05 M Tris–HCl, 10 mM EDTA (pH 7.5) for 2 h at 37 °C. For IA-treated samples, the purified tRNA was first incubated with IA as described above and digested with RNase T1 after the reaction cleanup with a Qiagen miRNeasy mini kit as described in the Total RNA extraction section. The concentrations of tRNA used for each of the replicates in RNase T1 digestion are given in Table S4. Likewise, ∼50 μg of each of the sequenced WTU and WTT samples was similarly treated with RNase T1.
The reaction mixtures were analyzed on a Vanquish UHPLC instrument (Thermo Fisher Scientific) with a photodiode array detector and a Q-exactive mass spectrometer. The mixture was injected onto a Hypersil Gold C-18 column [particle size: 1.9 μm, 2.1 mm (D) × 150 mm (L), Thermo Fisher] pre-equilibrated with 97.5% buffer A [0.2 M hexafluoroisopropanol (HFIP) and 0.085 M triethylamine (TEA) in water (optima grade from Thermo Fisher)] and 2.5% buffer B [0.1 M HFIP and 0.042 M TEA in methanol (optima grade from Thermo Fisher)] at a flow rate of 0.05 mL/min. Separation was carried out with a gradient of buffer B over 70 min (time: 0–13.1 min, % B: 2.5; time: 13.1–52.4 min, % B: 2.5–35; time: 52.5–71 min, % B: 100). The UV–vis data are plotted as described in the above sections. However, the UV–vis chromatograms depicted under RNase T1 analysis are normalized per μg of the input sample. Data-dependent MS/MS analysis in the negative ion mode was carried out on the eluent from the column. Full MS acquisition at a resolution of 140,000 was followed by data-dependent fragmentation of the 5 most intense peaks at a resolution of 70,000. An intensity threshold of 1 × 10⁵ and a normalized collision energy of 20 were employed. The acquired Thermo RAW files were converted to mzML using MSConvert with vendor peak picking and MS subset filters.[66,67] The mzML files were processed with the nucleic acid search engine (NASE) software, which generates theoretical fragments and compares them against the experimental data.[17] A FASTA file containing modified tRNA sequences was used as the input. Precursor and fragment mass tolerances were set to 5 ppm. The program was set to consider fragments of 2 or more nucleotides, and the number of missed cleavages by RNase T1 was set to zero. The tRNA sequences used in the analysis, along with the output from each sample, are tabulated in supplementary extended Table.
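Two of the settings above, the 5 ppm mass tolerance and the selection of the 5 most intense peaks above an intensity threshold, can be made concrete with a short sketch. This is not the NASE or instrument implementation: the function names are hypothetical, the peak list is invented for illustration, and real data-dependent acquisition applies further rules (such as dynamic exclusion) that are not modeled here.

```python
def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz: float, theoretical_mz: float,
                     tol_ppm: float = 5.0) -> bool:
    """True if the observed m/z matches the theoretical m/z within tol_ppm."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


def top_n_precursors(peaks, n=5, min_intensity=1e5):
    """Pick the n most intense (m/z, intensity) peaks at or above the threshold."""
    eligible = [peak for peak in peaks if peak[1] >= min_intensity]
    return sorted(eligible, key=lambda peak: peak[1], reverse=True)[:n]


# Invented survey-scan peaks: (m/z, intensity).
survey_scan = [(500.1, 2e5), (600.2, 5e4), (700.3, 9e5), (800.4, 3e5)]
print(top_n_precursors(survey_scan, n=2))
print(within_tolerance(1000.004, 1000.0))  # about 4 ppm away
```

At m/z 1000, a 5 ppm window corresponds to ±0.005, which gives a sense of how tightly observed fragments must match the theoretical values.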

Library Preparation and Data Accumulation

cDNA libraries were prepared from 1.5 μg of isolated RNA for each of the sample conditions (Table S1) using a NEBNext Multiplex Small RNA library prep kit for Illumina. The libraries were made following the manufacturer’s instructions. Size selection of libraries was carried out using the native PAGE conditions described in the manufacturer’s protocol. The amount of size-selected libraries submitted to the sequencing facility is listed in Table S2.

RNA-Seq Data Collection and Analysis

The sequencing was carried out on an Illumina NextSeq instrument in paired-end mode with a read length of 75 bp by the Advanced Genomics Core at the University of Michigan. The data obtained were analyzed in house. All analyses were performed using open-source Linux tools following the guidelines of each program.

Quality Assessment and Trimming

Raw reads were analyzed by the program FastQC[68] which looks for adaptor contents, duplication levels, and GC content. Following the quality check, Trimmomatic[29] was used to remove the adaptor content (provided with truseq3 adaptors FASTA file) and to filter out reads below 35 nt in length. The resulting sequences were then mapped to the E. coli K-12 MG1655 genome.

Mapping

The FASTA sequence file and annotation files required for mapping reads to the genome were obtained from the Ensembl website (ftp://ftp.ensemblgenomes.org/pub/bacteria/release-44). Spliced transcript alignment to a reference (STAR 2.7), a short-read alignment program, was used for mapping reads to the genome.[30] The alignments were carried out in a two-step process in which the genome is first indexed by the program and the index is then used in the mapping step (default parameters were used except for the sjdb-overhang parameter, which was set to 75 following the instructions). The alignments were output to a BAM format file (sorted by coordinate). Gene expression analysis was carried out using cufflinks, cuffmerge, cuffdiff, and cummeRbund. The cufflinks program within the suite takes a GTF annotation file and BAM files as input (library-type = fr-first-strand, read length = 76 nt) and generates a set of tracking files for each sample group.[31] Transcripts (information in tracking files) from each assembly were merged into a single master transcriptome assembly by the program cuffmerge, taking the biological/technical replicates into account. Cuffdiff (input = BAM files, reference FASTA sequence, and master assembly GTF) processes the master transcript assembly to perform rigorous statistical analysis on the count values obtained. The data produced by cuffdiff were navigated with the R library package cummeRbund. A count matrix for all the genes was extracted from the expression analysis. The counts were summed according to the gene biotype in R by matching the cufflinks ids (obtained from the master transcriptome assembly) with gene information present in the GFF3 annotation file (obtained from the Ensembl website). The read coverages for the tRNA genes were visualized in the program IGV.[33] GTF, GFF3, and BED files corresponding to the E. coli K-12 genome and the alignment files were loaded in the program.
Screenshots corresponding to the respective alignments were generated in scalable vector graphics (SVG) format and modified appropriately in Adobe Illustrator CC 2019.
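The biotype summation step described above amounts to a grouped sum over a gene-to-biotype lookup. The authors performed this step in R; the sketch below shows the same idea in Python for illustration only. The function name, the gene counts, and the biotype labels are all invented for the example and do not come from the study's data.

```python
from collections import defaultdict


def sum_counts_by_biotype(gene_counts: dict[str, float],
                          gene_biotype: dict[str, str]) -> dict[str, float]:
    """Add up per-gene counts within each biotype.

    Genes missing from the annotation lookup are grouped under 'unknown'.
    """
    totals = defaultdict(float)
    for gene, count in gene_counts.items():
        totals[gene_biotype.get(gene, "unknown")] += count
    return dict(totals)


# Invented counts and annotation, for illustration only.
counts = {"alaT": 120.0, "alaU": 80.0, "rrsA": 1000.0, "geneX": 5.0}
biotypes = {"alaT": "tRNA", "alaU": "tRNA", "rrsA": "rRNA"}

print(sum_counts_by_biotype(counts, biotypes))
```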

Misincorporation Frequency

The misincorporation frequency was calculated from the genotype likelihoods generated by the bcftools program[32] as follows. In the first step, the BCFtools “mpileup” command [maximum depth at a position 100,000, FASTA file, ignored duplicates, output in variant call format (VCF)] calculates the genotype likelihoods at each genomic position along with coverage. Replicates in the same sample group were considered together for obtaining the VCF files. The output VCF files were then processed with the BCFtools “call” command to discover the mutations (multi-allelic caller, p = 0.99, BED feature file containing tRNA information). Several variant analysis attributes corresponding to each position were written to the VCF files. The number of high-quality reference and alternate bases (DP4 attribute) at each position was extracted from the VCF file using the VariantsToTable utility from the GATK suite of tools.[69] The misincorporation frequency at each of the tRNA positions was then calculated from these counts in a Microsoft Excel spreadsheet. The obtained data file was processed in the R software for statistical computing and graphics to generate a matrix file suitable for plotting heat maps. The matrix file was then manually adjusted according to the alignments listed on the transfer RNA database (tRNAdb) and summarized in supplementary extended Table.[37] The MODOMICS database was also consulted.[38] The heat maps were generated with GraphPad Prism plotting software. The positions with no values were given the same color as the background of the plot. The genes argX, argW, ileX, ileY, lysV, metT, metU, proK, and proL were ignored while plotting heat maps owing to low coverage.
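The exact spreadsheet formula is not reproduced in this text. A natural reading, given that the DP4 attribute holds the high-quality reference and alternate base counts on each strand, is the fraction of alternate bases among all counted bases: frequency = alt / (ref + alt). The sketch below implements that assumed definition; the function name is hypothetical, and the authors' actual calculation may differ in detail.

```python
from typing import Optional


def misincorporation_frequency(dp4: str) -> Optional[float]:
    """Fraction of alternate bases at a position, from a DP4 string.

    DP4 lists high-quality base counts as
    'ref-forward,ref-reverse,alt-forward,alt-reverse'.
    Assumed definition: alt / (ref + alt). Returns None when nothing was counted.
    """
    ref_fwd, ref_rev, alt_fwd, alt_rev = (int(x) for x in dp4.split(","))
    ref = ref_fwd + ref_rev
    alt = alt_fwd + alt_rev
    total = ref + alt
    if total == 0:
        return None
    return alt / total


# Invented counts: 30 reference and 70 alternate bases.
print(misincorporation_frequency("20,10,50,20"))
```

Returning None for positions without coverage keeps them distinguishable from positions with zero misincorporation, in line with the handling of positions with no values in the heat maps.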
