ARM-seq: AlkB-facilitated RNA methylation sequencing reveals a complex landscape of modified tRNA fragments.

Aaron E Cozen¹, Erin Quartley², Andrew D Holmes¹, Eva Hrabeta-Robinson¹, Eric M Phizicky²,³, Todd M Lowe¹.

Abstract

High-throughput RNA sequencing has accelerated discovery of the complex regulatory roles of small RNAs, but RNAs containing modified nucleosides may escape detection when those modifications interfere with reverse transcription during RNA-seq library preparation. Here we describe AlkB-facilitated RNA methylation sequencing (ARM-seq), which uses pretreatment with Escherichia coli AlkB to demethylate N(1)-methyladenosine (m(1)A), N(3)-methylcytidine (m(3)C) and N(1)-methylguanosine (m(1)G), all commonly found in tRNAs. Comparative methylation analysis using ARM-seq provides the first detailed, transcriptome-scale map of these modifications and reveals an abundance of previously undetected, methylated small RNAs derived from tRNAs. ARM-seq demonstrates that tRNA fragments accurately recapitulate the m(1)A modification state for well-characterized yeast tRNAs and generates new predictions for a large number of human tRNAs, including tRNA precursors and mitochondrial tRNAs. Thus, ARM-seq provides broad utility for identifying previously overlooked methyl-modified RNAs, can efficiently monitor methylation state and may reveal new roles for tRNA fragments as biomarkers or signaling molecules.

Year: 2015 PMID: 26237225 PMCID: PMC4553111 DOI: 10.1038/nmeth.3508

Source DB: PubMed Journal: Nat Methods ISSN: 1548-7091 Impact factor: 28.547

Introduction

Next-generation RNA-sequencing has provided insight into the importance of small RNAs in a wide range of biological contexts. Transfer RNAs (tRNAs) are among the most abundant RNAs in all organisms, so it is perhaps unsurprising that tRNA fragments and half molecules are often abundant constituents of small RNA sequencing libraries[1-3]. There is increasing evidence that these tRNA-derived RNAs can have important functions distinct from those of mature tRNAs[4-8], including potential roles in disease[4, 5, 9]. However, tRNA-derived fragments are likely to escape sequencing-based detection when they contain nucleoside modifications similar to those in mature tRNAs. Many tRNA modifications cause pauses or stops during reverse transcription[10], a critical step in most RNA-seq protocols. These so-called “hard-stop” modifications, including 1-methyladenosine (m1A), 1-methylguanosine (m1G), 2,2-dimethylguanosine (m2,2G), and 3-methylcytidine (m3C), are more prevalent in tRNAs than other classes of RNAs, and likely play important roles in the biogenesis, stability, and functional activities of tRNA-derived small RNAs, much as they do for mature tRNAs[11]. For example, specific modifications can target specific tRNAs for cleavage into half-molecules[12], protect tRNAs from cleavage[13, 14], or alter the interaction of tRNA fragments with proteins such as Dicer or Piwi[2, 3, 8].

We developed AlkB-facilitated RNA Methylation sequencing (ARM-Seq) to provide sensitive and specific detection of methyl-modified RNAs using RNA-seq. In ARM-Seq, RNA is treated with a de-alkylating enzyme, Escherichia coli AlkB, prior to the reverse transcription step in library preparation. Differential abundance analysis comparing treated to untreated samples efficiently identifies RNAs sequenced more frequently after demethylation. The known substrates of E. coli AlkB in RNA are m1A, documented in approximately half of all well-characterized tRNAs, and m3C, a less common modification documented primarily in tRNAs[15-17]. There is also evidence that E. coli AlkB can demethylate m1G, which is nearly as prevalent as m1A in tRNAs, although by a different mechanism[18].

Analyses of budding yeast (Saccharomyces cerevisiae) and human cell lines show that ARM-Seq greatly increases the abundance and diversity of reads for small RNAs derived from tRNAs in widely divergent model organisms. ARM-Seq can be used to predict the identity and position of modified residues when compared to previous documentation[17], demonstrating that most tRNA-derived fragments contain modifications found in corresponding mature tRNAs. This approach, corroborated by primer extension experiments, correctly predicts the m1A modification state for the complete set of known yeast tRNAs with 94% accuracy, including several where modifications were verified to differ from previous documentation. Furthermore, ARM-Seq provides compelling evidence for m1A modifications in a large proportion of human tRNAs where modification patterns were unknown or not documented. Thus, ARM-Seq facilitates sequencing of methyl-modified RNAs that otherwise escape detection in standard sequencing protocols, and can be used to rapidly characterize methylation patterns across diverse transcriptomes.

Results

ARM-Seq enables detection of methylated small RNAs derived from tRNAs

We first tested the ARM-Seq methodology (Fig.1) on S. cerevisiae, where tRNAs and their modifications[17] have been most extensively characterized. Initial experiments showed that demethylation conditions used for ARM-Seq specifically removed m1A and m3C modifications from target RNAs (Supplementary Figure 1). ARM-Seq more than doubled the proportion of small RNA sequencing reads from tRNA genes from 6.9% to 15.1% (Fig.2a, Supplementary Table 1). These increases corresponded almost entirely to tRNA-derived small RNAs rather than full-length mature tRNAs (Supplementary Table 2), indicating that a large proportion of tRNA-derived small RNAs in yeast contain AlkB-sensitive modifications. In contrast, the share of reads mapping to other major classes of small RNAs diminished slightly (Supplementary Table 1).

Figure 1

ARM-Seq facilitates sequencing of m1A, m3C, or m1G modified RNAs

AlkB-facilitated RNA Methylation sequencing (ARM-Seq) uses enzymatic demethylation of RNA samples prior to RNA-seq library preparation to reveal RNAs containing m1A, m3C, or m1G. Widely used protocols for small RNA sequencing, including NEBNext (New England Biolabs) and TruSeq (Illumina), require ligation of sequencing adapters to both the 5′ and 3′ ends of each RNA prior to reverse transcription for library preparation. Without any additional treatments, sequencing output from these protocols will therefore represent only RNAs with appropriate end chemistry for sequencing adapter ligations (5′-monophosphate and 3′-OH, the expected end chemistry of mature tRNAs, some classes of tRNA-derived fragments, microRNAs, and snoRNAs) that produce full-length cDNAs. “Hard-stop” modifications such as m1A, m3C or m1G, which commonly occur in tRNAs, cause premature termination of cDNA synthesis, preventing PCR amplification and subsequent sequencing. Typical positions for these modifications are indicated in the schematic showing tRNA secondary structure in canonical cloverleaf form. In ARM-Seq, removal of m1A, m3C, or m1G modifications by AlkB treatment facilitates the production of full-length cDNAs from previously modified templates, producing a ratio of reads in treated versus untreated samples that can be used to identify methylated RNAs.

Figure 2

ARM-Seq reveals m1A-modified tRNA fragments in S. cerevisiae

(a) ARM-Seq more than doubled the fraction of yeast small RNA sequencing reads mapping to tRNAs, revealing a diversity of methylated small RNAs derived from tRNAs. The majority of these were 3′-fragments and half-molecules of tRNAs, where m1A at position 58 (m1A58) is the most prevalent hard-stop modification. Full-length tRNAs comprised less than 1% of tRNA reads in both AlkB-treated and untreated samples, consistent with a known bias in sequencing library preparation where 5′ linker ligation is impeded by recessed 5′ ends of mature tRNAs. (b) ARM-Seq read profiles show increases in 3′-fragment reads relative to untreated samples that predict the presence of m1A58 in Thr-AGT, Leu-GAG and Gln-TTG (indicated by *). By contrast, ARM-Seq profiles for Arg-CCG, Gly-CCC and His-GTG show comparable or diminished 3’ reads for untreated samples, predicting un-modified A58 in these tRNAs. (c) Primer extensions targeting the corresponding mature tRNAs demonstrate that these ARM-Seq results reflect the modification patterns of mature tRNAs, confirming the A58 modification state documented in Modomics for Thr-AGT and His-GTG, providing new information on the m1A58 modification state of Arg-CCG, Gly-CCC and Leu-GAG tRNAs, and presenting new evidence that Gln-TTG tRNAs contain m1A58. (d) As a genome-scale screen, ARM-Seq correctly predicts m1A58 modification state for yeast tRNAs with accuracy of 94% as corroborated by documentation in Modomics, or verification by primer extension (for tRNAs indicated in red), based on increases of two-fold or more (dotted red line) and P < 0.01 (indicated by *).

ARM-Seq predicts the m1A58 modification state of mature tRNAs

Next, we showed that ARM-Seq abundance ratios (RNA-seq read counts from AlkB-treated versus untreated RNA) and read profiles detected known m1A tRNA modifications as effectively as traditional primer extension experiments. Thr-AGT tRNA, which is known to contain m1A58, showed a 16-fold increase in normalized read count corresponding to fragments that include A58 (Fig.2b). Primer extensions targeting mature Thr-AGT tRNA revealed a hard-stop band corresponding to m1A58 in an untreated sample, versus much reduced band intensity in the corresponding AlkB-treated sample, consistent with demethylation of the expected m1A58 modification (Fig.2c). By contrast, ARM-Seq produced no significant effect for His-GTG (Fig.2b), a true negative where an expected unmodified A58 was also confirmed by primer extension (Fig.2c). Similar comparisons confirmed ARM-Seq predictions for three isodecoder groups with no previous modification data (Leu-GAG, Arg-CCG, Gly-CCC), and one isodecoder group (Gln-TTG) where A58 was previously documented as unmodified[7], but shown to be methylated by both ARM-Seq and primer extension (Fig.2b–c). Since ARM-Seq read profiles of tRNA-derived small RNAs correctly predicted the m1A58 modification state for the mature tRNAs tested, we examined ARM-Seq results for the complete set of yeast tRNAs. Based on our initial verified test data, we used a two-fold increase in read abundance and a DESeq2 P-value <0.01 (see Online Methods) as our threshold for identifying all significant ARM-Seq responses. ARM-Seq correctly predicted the modification state for 22 of 26 yeast tRNAs with documented[17] m1A58 modifications (Fig.2d, Supplementary Figures 2–3, Supplementary Table 2). Among the other four tRNAs, ARM-Seq predicted unmodified A58 in two (Leu-TAA-1 & Lys-CTT-1), and these were confirmed by primer extension (Fig.2d, Supplementary Figure 4). 
The last two tRNAs expected to contain m1A58 (Ile-TAT-1, Val-CAC-1) showed visible increases in read count but were not quite significant by our cutoff criteria (Fig.2d, Supplementary Figure 2b). Conversely, ARM-Seq produced profiles consistent with unmodified A58 for 15 of 19 tRNAs in isodecoder groups expected to lack m1A58 (Supplementary Figure 2), and correctly identified three others (Gln-TTG isodecoders) where unexpected m1A58 modifications were confirmed by primer extension (Fig.2b–c). ARM-Seq profiles for the last tRNA in this group, Ser-CGA, showed evidence for demethylation of both an expected m3C32 modification, and an unexpected m1A58. ARM-Seq also predicted m1A58 modifications for five yeast tRNAs in isodecoder groups not represented in Modomics and unmodified A58 for three others, with primer extensions confirming m1A58 for Leu-GAG and unmodified A58 for Arg-CCG and Gly-CCC (Fig.2c–d, Supplementary Figure 2d). The final tRNA not represented in Modomics, Pro-AGG, showed evidence for partial AlkB sensitivity that was also confirmed by primer extension (Supplementary Figure 4). Summarizing for all yeast tRNAs where m1A58 modification state was either corroborated by documentation in Modomics or verified by primer extensions, ARM-Seq correctly predicted 26 of 28 that contain m1A58 (93% sensitivity) and 18 of 19 that contain unmodified A58 (95% specificity), demonstrating a combined accuracy of 94% overall.
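The summary percentages above follow directly from the stated counts. The short Python sketch below reproduces that arithmetic; the function name `screen_metrics` is illustrative only and is not part of the published pipeline.

```python
def screen_metrics(true_pos, total_pos, true_neg, total_neg):
    """Return (sensitivity, specificity, accuracy) for a binary screen."""
    sensitivity = true_pos / total_pos
    specificity = true_neg / total_neg
    accuracy = (true_pos + true_neg) / (total_pos + total_neg)
    return sensitivity, specificity, accuracy


# Counts taken from the text: 26 of 28 tRNAs with m1A58 and
# 18 of 19 tRNAs with unmodified A58 were predicted correctly.
sens, spec, acc = screen_metrics(26, 28, 18, 19)
print(f"sensitivity={sens:.0%} specificity={spec:.0%} accuracy={acc:.0%}")
# prints: sensitivity=93% specificity=95% accuracy=94%
```

The combined accuracy is computed over all 47 tRNAs (44 correct calls), not as the mean of sensitivity and specificity, although the two happen to agree after rounding here.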

ARM-Seq reveals abundant methylated RNAs derived from human tRNAs

The tRNA repertoire in humans is substantially more complex. Of 414 unique human mature tRNA sequences identified by tRNAscan-SE[19, 20], just 43 match entries in Modomics. ARM-Seq demethylation increased the proportion of RNA-seq reads mapping to tRNAs from 2.9% to 10.1% in an Epstein-Barr virus transformed B-cell line (GM12878), and from 3.9% to 13.2% in a B-cell lymphoma-derived cell line (GM05372), about 3.5-fold in each case (Supplementary Figure 5, Supplementary Table 1). These increases again corresponded to detection of modified tRNA-derived small RNAs, rather than full-length mature tRNAs (Supplementary Table 3). The tRNA 3′-fragments only detectable with ARM-Seq all included A58, positively predicting 15 of the 17 (88%) human isodecoder groups expected to contain m1A58 modifications (Fig.3a–b). ARM-Seq also correctly identified the only isodecoder group expected to contain unmodified A58 (Glu-CTC; Supplementary Figure 6d). Examining all isotypes, ARM-Seq produced an unprecedented set of methylation predictions encompassing the full spectrum of human isodecoder groups (Supplementary Figure 6a–b; Supplementary Data 1).

Figure 3

ARM-Seq reveals methylated RNAs derived from human cytosolic tRNAs, tRNA precursors, and mitochondrial tRNAs

(a) Transcriptome-scale screening using ARM-Seq provides evidence for m1A58 modification in a majority of human tRNA isotypes, showing a consistent profile of modification in two B-cell derived human cell lines (with * indicating significant responders). (b) Profiles for many tRNA-derived small RNAs revealed by ARM-Seq show little, if any detection in untreated samples, indicating high levels of modification. (c) ARM-Seq also provides the first evidence that many human pre-tRNAs are m1A58 modified at an early stage prior to removal of 5’ leader and 3’ trailer sequences from primary transcripts (demarcated by dashed lines), demonstrating the ability to resolve sequential modification and processing steps involved in tRNA maturation. The 5′-leader sequences of these precursor-derived RNAs were typically short (4–5 nt) when present, which might reflect either nucleolytic processing or dephosphorylation of triphosphorylated primary transcripts to generate 5’-monophosphate ends (required for RNA-seq library inclusion). By contrast, the 3′-trailers were often 9–10 nt or longer, frequently ending with a poly-U sequence, suggesting that these represent the 3′-ends of primary RNA polymerase III transcripts. Reads for full-length and fragmentary pre-tRNAs revealed by ARM-Seq included the T-loop region, consistent with m1A58 modifications. (d) Fragments of human mitochondrial tRNAs revealed by ARM-Seq demonstrate a capacity to also demethylate m1A9 (in mito-Asp-GTC, mito-Lys-TTT), m1G9 (mito-Ile-GAT), and m1G37 (mito-Leu-TAG, mito-Pro-TGG), enabling investigation of mitochondrial diseases related to tRNA modification and processing. tRNAs for which ARM-Seq predictions were verified by primer extension are indicated in red.

ARM-Seq identifies methyl-modified pre-tRNAs and mitochondrial tRNAs

A subset of transcripts revealed by ARM-Seq in the human samples preferentially mapped to tRNA genes rather than mature tRNA transcripts because they included genomically-encoded sequences found only in tRNA precursors (Fig.3c, Supplementary Figures S7–S11). Most tRNA base modifications are thought to occur after cleavage of 5′-leader and 3′-trailer sequences from tRNA-precursor transcripts[21]. Evidence demonstrating m1A58 modification of initiator methionine pre-tRNAs in yeast and exogenous pre-tRNAs in Xenopus laevis oocytes established a limited precedent for this particular modification at an earlier stage in pre-tRNA processing[22, 23], but direct evidence for early m1A58 modification has been lacking for most organisms, including humans. Surprisingly, ARM-Seq identified modified precursors for most human acceptor types (Supplementary Figure 12, Supplementary Table 3), even though pre-tRNAs are less abundant and more challenging to detect than mature tRNAs. Overall, pre-tRNAs in 33 different isodecoder families from 86 different human tRNA gene loci showed significant ARM-Seq responses in at least one of the two cell lines. A large subset of these, 38 loci, showed significant ARM-Seq responses in both cell lines. Primer extensions confirmed an AlkB-sensitive block corresponding to m1A58 in a human Leu-CAA pre-tRNA (Supplementary Figure 8b). Thus, ARM-Seq provides the first evidence that many human pre-tRNAs are m1A58-modified prior to 5′-leader and 3′-trailer removal, suggesting this pattern occurs broadly among eukaryotes. ARM-Seq also efficiently revealed modifications in human mitochondrial tRNAs. Eight of 22 human mitochondrial tRNAs are currently documented[17], showing m1A9, m1G9, m1G37, and m1A58 as the most frequent hard-stop modifications. 
More extensively characterized bovine mitochondrial tRNAs show at least one difference in modification relative to humans for seven of these (all except initiator methionine), underscoring the need for specific characterization of human mitochondrial tRNAs[17, 24]. ARM-Seq produced significant increases identifying modified RNAs derived from 12 mitochondrial tRNAs in GM12878 cells, eight of which also showed significant responses in the GM05372 samples (Fig.3d, Supplementary Figure 7, Supplementary Table 3). In contrast to human cytosolic tRNAs, where ARM-Seq responses were attributable exclusively to m1A58 modification state, ARM-Seq profiles for human mitochondrial tRNAs provide evidence for m1A9 (in mito-Asp-GTC, mito-Lys-TTT, and mito-Pro-TGG), m1G9 (in mito-Ile-GAT and mito-Tyr-GTA), m1G37 (in mito-Leu-TAG and mito-Pro-TGG), and m1A58 (in mito-Leu-TAA). Primer extensions confirmed AlkB-mediated demethylation of m1A9 for mito-Pro-TGG, m1G9 in mito-Ile-GAT, and a previously undocumented m1G9 in mito-Tyr-GTA (Supplementary Figure 8b).

Discussion

ARM-Seq results presented here show that a large fraction of small RNAs in both budding yeast and human cells contain base modifications that reflect their biogenesis from modified tRNAs. Recently developed protocols provide tools to profile 6-methyladenosine (m6A), pseudouridine, and 5-methylcytidine (m5C) modified RNAs using high-throughput sequencing, revealing new and unexpected targets for these modifications[25-29]. ARM-Seq adds the capacity to profile m1A, m3C or m1G modified RNAs, which are otherwise recalcitrant to sequencing, revealing a complex landscape of modified tRNA fragments in two evolutionarily divergent organisms. Sequences of the most abundant of these are listed (Supplementary Table 4), with all 1634 read profiles available for individual examination (Supplementary Data 1). The power of ARM-Seq as a screen for m1A, m3C and m1G modified RNAs can be maximized by leveraging prior knowledge from databases such as Modomics, and complementary experimental approaches such as primer extension and mass-spectrometry to identify the specific nature and location of modified residues. ARM-Seq demonstrates remarkable accuracy in predicting previously documented tRNA modification patterns, and perfect agreement with corresponding primer extensions for unexpected modifications. Furthermore, results showing that many human pre-tRNAs are m1A-modified demonstrate that ARM-Seq can dissect complex sequential steps of RNA processing and modification, with potential application for identifying modification-based regulatory checkpoints. ARM-Seq profiles revealing m1A and m1G-modified mitochondrial tRNAs also suggest uses investigating mitochondrial genetic diseases, where defects in mitochondrial tRNAs often play central roles[30]. Our results, including untreated samples, do not show the same evidence for nucleotide misincorporation at expected hard-stop modifications that has been reported in other studies[31-34]. 
Although signature mismatches in sequencing data can identify modified or edited residues, ARM-Seq is almost certainly more sensitive and quantitative for detection of modified RNAs because it does not depend on low-frequency reverse transcription errors that are poorly understood, and possibly context-dependent. ARM-Seq should facilitate the study of tRNA processing and modification in a wide range of biological settings, including investigation of novel model organisms, as well as comparative analyses of different developmental stages, tissue types, and disease states. Such studies may illuminate new facets of tRNA biology, for example by revealing tissue-specific functions for distinct tRNA variants[35], or important regulatory functions for novel tRNA-derived small RNAs[5]. These typically overlooked small RNAs outnumbered microRNAs by four-fold or more (Supplementary Figure 5), which underscores their potential involvement in cellular signaling and regulation, as well as in disease states including neurodegeneration, cancer and viral infections[4, 5, 9]. Whether base modifications play central roles in these activities, and whether modifications have obscured detection of members of other classes of RNAs, such as mRNAs or long non-coding RNAs, are among the many potential lines of research now accessible with this methodology.

Methods

Purification of E. coli AlkB

AlkB was purified after growth of E. coli BL21(DE3)pLysS (12 liters) bearing plasmid JEE1167-B in the AVA421 vector[36, 37], and 2 hours IPTG induction at 37 °C to express His6-3C-AlkB fusion protein. Crude lysates were made by sonication, and protein was purified by batch treatment on TALON resin, tag cleavage with His6-3C protease, and re-application to TALON resin. Unbound protein was concentrated (Amicon Ultra-15 centrifugal filter unit), purified using a Hi-Load 16/60 Superdex 200 gel filtration column, and then stored as concentrated protein (15.4 mg/ml, 0.77 ml) in buffer containing 20 mM Tris-HCl pH 8.0, 50% glycerol, 0.2 M NaCl, and 2 mM dithiothreitol at −20 °C or at −80 °C. Freezing the enzyme did not impair activity.

Growth of yeast cells and RNA isolation

S. cerevisiae cells (strain BY4741) were grown in liquid YPD medium at 30°C to OD600 1–2, and 300 OD-ml cells were harvested and quick frozen at −80 °C. Bulk RNA was prepared from cell pellets using hot phenol[36], typically yielding 2 mg RNA. Bulk RNA from three independently inoculated cultures was processed separately in subsequent treatments.

Growth of human cell lines and RNA isolation

Cell pellets of human B-lymphocyte derived cell lines GM05372 and GM12878 were purchased from Coriell Institute and shipped frozen after a PBS wash. Cell lines were authenticated using microsatellite analysis, and verified as free of mycoplasma infection by Coriell Institute. Upon arrival, cells were immediately placed at −80 °C for storage prior to RNA extraction. Isolation of total RNA from 10⁸ human cells was performed using Direct-Zol™ RNA MiniPrep Kit (Zymo Research) with TRI Reagent (Molecular Research Center, Inc.), typically yielding 400–450 µg of total RNA. Total RNA samples from each of the two human cell lines were then split into three technical replicates for subsequent treatments.

Treatment of RNA with AlkB

AlkB treatment of RNA was performed in 200 µl reaction mixtures containing 50 mM HEPES KOH, pH 8, 75 µM ferrous ammonium sulfate pH 5, 1 mM α-ketoglutarate, 2 mM sodium ascorbate, 50 µg/ml BSA, 50 µg AlkB, and 50 µg bulk RNA at 37 °C for 100 minutes. AlkB reaction buffer was prepared fresh prior to each use. Reactions were stopped by addition of 200 µl buffer containing 11 mM EDTA and 200 mM ammonium acetate, followed by phenol extraction, ethanol precipitation, and resuspension of the washed pellet in water. Control reactions for untreated samples were performed similarly, using AlkB storage buffer instead of AlkB.
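As a rough bench aid, the quantities above can be turned into pipetting volumes and final concentrations. The Python sketch below is a worked calculation only. It assumes that AlkB is added directly from the 15.4 mg/ml stock described in the purification section, which the text does not state explicitly, and the helper names are hypothetical.

```python
REACTION_VOLUME_UL = 200.0    # reaction volume stated in the text
STOP_BUFFER_UL = 200.0        # stop buffer volume stated in the text
ALKB_MASS_UG = 50.0
RNA_MASS_UG = 50.0
ALKB_STOCK_MG_PER_ML = 15.4   # assumption: undiluted purified stock is used


def stock_volume_ul(mass_ug, stock_mg_per_ml):
    """Volume of stock needed; 1 mg/ml is the same as 1 ug/ul."""
    return mass_ug / stock_mg_per_ml


def final_concentration(stock_conc, stock_volume, total_volume):
    """Concentration after diluting stock_volume into total_volume."""
    return stock_conc * stock_volume / total_volume


alkb_ul = stock_volume_ul(ALKB_MASS_UG, ALKB_STOCK_MG_PER_ML)
rna_ug_per_ul = RNA_MASS_UG / REACTION_VOLUME_UL
total_after_stop = REACTION_VOLUME_UL + STOP_BUFFER_UL
edta_final_mm = final_concentration(11.0, STOP_BUFFER_UL, total_after_stop)
acetate_final_mm = final_concentration(200.0, STOP_BUFFER_UL, total_after_stop)

print(f"AlkB stock per reaction: {alkb_ul:.2f} ul")
print(f"RNA in reaction: {rna_ug_per_ul:.2f} ug/ul")
print(f"After stopping: {edta_final_mm:.1f} mM EDTA, "
      f"{acetate_final_mm:.0f} mM ammonium acetate")
```

Under these assumptions the enzyme and RNA are present at equal mass (1:1 w/w), and adding an equal volume of stop buffer halves the EDTA and ammonium acetate concentrations.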

Primer extension

For primer extension, ~0.7 pmol 5′-32P-phosphorylated primer was annealed to 0.2 µg bulk RNA in 5 µl H2O by heating for 3 min at 95 °C, followed by cooling to 50 °C and incubation for 1 h. Annealed primer was extended using 64 U Superscript III (Invitrogen) in a 10 µl reaction containing first strand buffer (50 mM Tris-HCl (pH 8.3, 25 °C), 75 mM KCl, 3 mM MgCl2) and 1 mM each dNTP for 1 h at 50 °C, stopped by addition of 10 µl formamide loading dye and freezing on dry ice. Primer extension products were resolved by electrophoresis on a 15% polyacrylamide gel containing 4 M urea, followed by visualization of the dried gel on a phosphorimager cassette. Sequences of oligonucleotides used for primer extension are listed in Supplementary Table 5.

Size selection and preparation of RNA sequencing libraries

50 µg of control or AlkB-treated RNA was processed using the MirVana miRNA Isolation Kit (Life Technologies), according to manufacturer’s instructions, to select for RNA < 200nt. RNA was concentrated to 25 µg using RNA Clean and Concentrate-25 (Zymo Research), and 10 µg was treated with DNase I (New England BioLabs). Following column cleanup of the RNA, 1 µg was used as input for NEBNext Small RNA Library Prep Kit for Illumina (New England BioLabs). Libraries were size selected on 2% SizeSelect agarose E-Gels using the 50 bp E-gel ladder (Life Technologies Corporation) as a marker to select for bands corresponding to libraries of RNA between 18–120 nt. Dilutions from column cleaned and concentrated libraries were assessed by BioAnalyzer traces using Agilent High Sensitivity DNA kit (Agilent Technologies). Sequencing of the libraries was performed at the University of California, Davis DNA Technologies and Expression Analysis Core using Illumina MiSeq paired-end sequencing. Fastq files for all sequencing runs are deposited in the NCBI Sequence Read Archive under project number SRP056032.

Mapping of sequencing reads

Reads were trimmed, removing barcoding indices and adapter sequences, and paired-end reads were merged using a custom Python script (SeqPrep, J. St. John, http://github.com/jstjohn/SeqPrep). Only merged reads corresponding to RNAs at least 15 nucleotides long were analyzed further. Reads were mapped to reference genomes (Homo sapiens 2009 assembly hg19, GRCh37 or S. cerevisiae April 2011 assembly sacCer3) plus the set of mature tRNA sequences from tRNAscan-SE tRNA gene predictions for each of these genomes[19]. Mature tRNA sequences were generated to account for post-transcriptional processing steps: predicted introns were removed, a CCA sequence was added to the 3′ ends of all tRNAs, and a G nucleotide was added to the 5′-end of histidine tRNAs. Each of these mature tRNA sequences was padded on both ends with 20 “N” bases to allow mapping of reads with additional end sequences. Reads were mapped to the reference genomes plus the non-redundant set of predicted mature tRNA sequences using Bowtie 2[38], returning up to 100 alignments per read with default parameters. For analyses summarizing the composition of RNA-seq reads by RNA class, multiple mapping was not allowed and only the Bowtie 2 primary alignment was used (selected arbitrarily by the program when multiple features produced equal mapping scores). Each sample produced approximately one million mappable reads using this procedure. The proportional composition of these reads by RNA class was relatively uniform across technical replicates for the human samples, and somewhat more variable between biological replicates of the yeast samples that were derived from independently expanded cultures (Supplementary Table 1). For differential expression analysis of reads mapped to either individual gene loci or mature tRNA sequences using DESeq2 analyses (described below), all best matches according to the Bowtie 2 scoring function were used. 
Reads showing equal mapping scores to tRNA gene loci (which represent unprocessed pre-tRNA transcripts) and predicted mature tRNA sequences were mapped exclusively to mature tRNAs. Thus, reads with equivalent mapping scores to multiple gene loci (encoding tRNAs that are identical after maturation) were mapped instead to a single mature tRNA sequence. In addition, reads mapped by this procedure to tRNA gene loci all contain features of tRNA precursors that are not found in mature tRNAs (e.g., intronic sequences, 3′-trailers, or 5′-leaders). These pre-tRNA features often distinguish one tRNA gene locus from another even when the mature tRNA encoded is identical. Plots of read coverage profiles for tRNAs were produced using read counts that were normalized according to size factors calculated from DESeq2 analyses (see below).
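The construction of mature tRNA reference sequences described in this section (intron removal, 3′ CCA addition, 5′ G addition for histidine tRNAs, and padding with 20 N bases) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline code: the function name, the 0-based end-exclusive intron coordinates, and the toy sequence are all hypothetical.

```python
def build_mature_trna(gene_seq, isotype, intron=None, pad=20):
    """Derive a padded mature tRNA reference from a genomic tRNA sequence.

    gene_seq: genomic tRNA sequence without leader or trailer (DNA alphabet)
    isotype:  three-letter amino acid code, e.g. "His"
    intron:   optional (start, end) tuple, 0-based and end-exclusive (assumed)
    pad:      number of N bases added to each end
    """
    seq = gene_seq.upper()
    if intron is not None:
        start, end = intron
        seq = seq[:start] + seq[end:]   # splice out the predicted intron
    if isotype == "His":
        seq = "G" + seq                 # extra 5' G found on histidine tRNAs
    seq = seq + "CCA"                   # 3' CCA tail added to all tRNAs
    return "N" * pad + seq + "N" * pad  # padding tolerates extra end bases


# Toy example (not a real tRNA): the four T bases at positions 4-7
# stand in for an intron.
toy = build_mature_trna("ACGTTTTTGCA", "His", intron=(4, 8), pad=2)
print(toy)  # prints: NNGACGTGCACCANN
```

Padding with N bases lets reads that extend slightly beyond the annotated ends still align to the mature reference rather than being lost or forced onto a genomic locus.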

Differential expression analysis

Read counts were tabulated for all reads and assigned to mature tRNAs or genomic features where mapping produced at least 10 nucleotides of sequence overlap. Non-overlapping RNA sequences mapped to the same annotated genomic features were labeled and counted separately (for example non-overlapping RNAs mapped to a genomic feature annotated as HERVH-int were labeled HERVH-int.1, HERVH-int.2, …). Read counts for all features that exceeded a minimum threshold of 20 reads were used as input to the DESeq2 R package with default parameters[39]. DESeq2 takes into account variability between replicates, and normalizes read counts to account for differences in sequencing depth between samples, reporting ARM-Seq fold changes relative to untreated samples along with associated P-values that are adjusted for multiple hypothesis testing. We used a two-fold increase in read abundance with a DESeq2 P-value <0.01 as our threshold for identifying all significant ARM-Seq responses. A doubling of read counts in ARM-Seq versus untreated samples indicates the presence of AlkB-sensitive modifications in at least half of the detected RNA molecules derived from a given tRNA, while larger increases indicate an even greater proportion of modified molecules. With the exception of Supplementary Table 1, which presents raw read counts and a proportional breakdown of read mappings by RNA class that is unaffected by normalization, all read counts reported in the Results and figures reflect normalization using DESeq2 size factors.
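The reasoning behind the two-fold threshold can be made explicit. If a fraction f of molecules carries a hard-stop modification and is invisible without treatment, the untreated count is proportional to 1 − f and the treated count to at most 1, so the fold change is at most 1/(1 − f) and therefore f ≥ 1 − 1/(fold change). The Python sketch below applies the stated cutoffs and this bound; the class and function names are hypothetical, and it assumes a linear (not log2) fold change and an adjusted P-value as inputs.

```python
from dataclasses import dataclass


@dataclass
class ArmSeqResult:
    name: str
    fold_change: float  # AlkB-treated / untreated normalized counts (linear)
    padj: float         # P-value adjusted for multiple testing


def is_significant(result, min_fold=2.0, max_p=0.01):
    """Apply the cutoffs stated in the text: >= 2-fold and P < 0.01."""
    return result.fold_change >= min_fold and result.padj < max_p


def min_fraction_modified(fold_change):
    """Lower bound on the modified fraction implied by a fold change.

    Incomplete demethylation or partial read-through in untreated samples
    would make the true fraction higher, so this is only a lower bound.
    """
    if fold_change <= 1.0:
        return 0.0
    return 1.0 - 1.0 / fold_change


# The 16-fold value is the Thr-AGT increase reported in the Results;
# the P-value here is a made-up placeholder for illustration.
example = ArmSeqResult("Thr-AGT", fold_change=16.0, padj=1e-6)
print(is_significant(example), min_fraction_modified(example.fold_change))
# prints: True 0.9375
```

A two-fold change gives a bound of exactly 0.5, matching the statement that doubling implies modification of at least half of the detected molecules.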

New tRNA naming convention

tRNA transcripts and individual gene loci are labeled using a new systematic naming convention that is designed to be more stable and informative (T. Lowe and P. Chan, unpublished data). The new tRNA naming convention echoes the systematic naming adopted for microRNAs in miRBase[40]. In brief, each unique mature tRNA transcript is named by isotype and codon (i.e. isodecoder), numbered in ascending order (e.g., tRNA-Ala-AGC-1, tRNA-Ala-AGC-2, etc.), from most "canonical" to least canonical (canonical is objectively defined by the bit score given to each tRNA by tRNAscan-SE using the default general tRNA model[19]). As with microRNAs, there are often multiple genome loci encoding identical mature tRNAs, so a secondary index number is assigned to denote specific tRNA gene loci (i.e., tRNA-Ala-AGC-1-1, tRNA-Ala-AGC-1-2, tRNA-Ala-AGC-1-3 describe different gene loci, but produce identical mature tRNA transcripts). Thus, labels for mature tRNA transcripts include only the first index number, which refers to the specific unique tRNA (e.g., tRNA-Ala-AGC-2), whereas labels for tRNA genes also include a second index, which refers to the locus number (i.e., tRNA-Ala-AGC-2-1). The new naming convention has been applied to all tRNAs in the Genomic tRNA Database[20], and has been adopted by the HUGO Gene Nomenclature Committee, and by RNAcentral[41]. For convenience in cross-referencing, Tables S1 and S2 also include legacy labels from the genomic tRNA database, where tRNA genes were labeled by chromosome number and order of occurrence[20]. By this new naming convention, we count 414 possible unique mature tRNAs in the GRCh37/hg19 assembly of the human genome (not including the 10 tRNA predictions with undetermined anticodons).
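Because the naming convention is systematic, labels can be split into their components mechanically. The following Python sketch parses the label forms shown in the examples above; the regular expression and the `TrnaName` type are illustrative assumptions based only on those examples, and they do not cover other prefixes such as the mito- labels used in the figures.

```python
import re
from typing import NamedTuple, Optional


class TrnaName(NamedTuple):
    isotype: str                 # e.g. "Ala"
    anticodon: str               # e.g. "AGC"
    transcript_index: int        # unique mature tRNA within the isodecoder
    locus_index: Optional[int]   # gene locus; None for transcript labels


# Pattern inferred from the examples in the text (an assumption).
_NAME_RE = re.compile(r"^tRNA-([A-Za-z]+)-([ACGTN]{3})-(\d+)(?:-(\d+))?$")


def parse_trna_name(label):
    """Split a label such as 'tRNA-Ala-AGC-1-2' into its fields."""
    match = _NAME_RE.match(label)
    if match is None:
        raise ValueError(f"unrecognized tRNA label: {label!r}")
    isotype, anticodon, transcript, locus = match.groups()
    return TrnaName(isotype, anticodon, int(transcript),
                    int(locus) if locus is not None else None)


print(parse_trna_name("tRNA-Ala-AGC-2"))
print(parse_trna_name("tRNA-Ala-AGC-2-1"))
```

A label with one index refers to a mature transcript, while a label with two indices refers to a specific gene locus encoding that transcript.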

Correspondence to modifications annotated in Modomics

Predicted mature tRNA sequences were compared to those from the Modomics database (downloaded January 2015) to annotate modifications. A tRNA was labeled with annotated modifications from Modomics when the Modomics entry had a matching anticodon and its sequence of originating (unmodified) bases matched that of the genomically encoded tRNA with three or fewer nucleotide mismatches. tRNAs that did not match any Modomics tRNA sequence using these criteria were labeled as “not documented.”
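The matching rule can be sketched in Python as follows. This is an illustration under stated assumptions rather than the procedure used here: sequences are compared position by position without gaps (so sequences of different length never match), the `(entry_id, anticodon, unmodified_sequence)` tuple layout is hypothetical, and choosing the entry with the fewest mismatches when several qualify is a design choice that the text does not specify.

```python
from typing import Iterable, Optional, Tuple

MAX_MISMATCHES = 3  # "three or fewer nucleotide mismatches"
NOT_DOCUMENTED = "not documented"


def count_mismatches(a: str, b: str) -> Optional[int]:
    """Ungapped mismatch count; None when the lengths differ (assumption)."""
    if len(a) != len(b):
        return None
    return sum(x != y for x, y in zip(a, b))


def find_modomics_match(
    anticodon: str,
    sequence: str,
    entries: Iterable[Tuple[str, str, str]],
) -> str:
    """Return the id of the best-matching entry, or NOT_DOCUMENTED.

    entries: iterable of (entry_id, anticodon, unmodified_sequence).
    """
    best_id = None
    best_mismatches = None
    for entry_id, entry_anticodon, entry_sequence in entries:
        if entry_anticodon != anticodon:
            continue  # anticodons must match exactly
        mismatches = count_mismatches(sequence, entry_sequence)
        if mismatches is None or mismatches > MAX_MISMATCHES:
            continue
        if best_mismatches is None or mismatches < best_mismatches:
            best_id, best_mismatches = entry_id, mismatches
    return best_id if best_id is not None else NOT_DOCUMENTED
```

A real comparison would likely need an alignment step to handle length differences between database entries and genomic predictions; the ungapped comparison is kept here only to make the two criteria explicit.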

Code availability

The software pipeline developed for this study includes components for trimming of raw sequencing reads, merging of paired-end reads, read mapping of small RNAs (including pre-tRNAs and mature tRNAs), abundance estimation, and differential expression analysis (current version available at http://lowelab.ucsc.edu/software/).

41 in total

Review 1. Identification of modified residues in RNAs by reverse transcription-based methods.

Authors: Yuri Motorin; Sébastien Muller; Isabelle Behm-Ansmant; Christiane Branlant
Journal: Methods Enzymol Date: 2007 Impact factor: 1.600

2. Transcriptome-wide mapping reveals widespread dynamic-regulated pseudouridylation of ncRNA and mRNA.

Authors: Schraga Schwartz; Douglas A Bernstein; Maxwell R Mumbach; Marko Jovanovic; Rebecca H Herbst; Brian X León-Ricardo; Jesse M Engreitz; Mitchell Guttman; Rahul Satija; Eric S Lander; Gerald Fink; Aviv Regev
Journal: Cell Date: 2014-09-11 Impact factor: 41.582

3. RNA function. Ribosome stalling induced by mutation of a CNS-specific tRNA causes neurodegeneration.

Authors: Ryuta Ishimura; Gabor Nagy; Ivan Dotu; Huihao Zhou; Xiang-Lei Yang; Paul Schimmel; Satoru Senju; Yasuharu Nishimura; Jeffrey H Chuang; Susan L Ackerman
Journal: Science Date: 2014-07-25 Impact factor: 47.728

4. Aberrant methylation of tRNAs links cellular stress to neuro-developmental disorders.

Authors: Sandra Blanco; Sabine Dietmann; Joana V Flores; Shobbir Hussain; Claudia Kutter; Peter Humphreys; Margus Lukk; Patrick Lombard; Lucas Treps; Martyna Popis; Stefanie Kellner; Sabine M Hölter; Lillian Garrett; Wolfgang Wurst; Lore Becker; Thomas Klopstock; Helmut Fuchs; Valerie Gailus-Durner; Martin Hrabĕ de Angelis; Ragnhildur T Káradóttir; Mark Helm; Jernej Ule; Joseph G Gleeson; Duncan T Odom; Michaela Frye
Journal: EMBO J Date: 2014-07-25 Impact factor: 11.598

5. RNAcentral: an international database of ncRNA sequences.

Authors: Anton I Petrov; Simon J E Kay; Richard Gibson; Eugene Kulesha; Dan Staines; Elspeth A Bruford; Mathew W Wright; Sarah Burge; Robert D Finn; Paul J Kersey; Guy Cochrane; Alex Bateman; Sam Griffiths-Jones; Jennifer Harrow; Patricia P Chan; Todd M Lowe; Christian W Zwieb; Jacek Wower; Kelly P Williams; Corey M Hudson; Robin Gutell; Michael B Clark; Marcel Dinger; Xiu Cheng Quek; Janusz M Bujnicki; Nam-Hai Chua; Jun Liu; Huan Wang; Geir Skogerbø; Yi Zhao; Runsheng Chen; Weimin Zhu; James R Cole; Benli Chai; Hsien-Da Huang; His-Yuan Huang; J Michael Cherry; Artemis Hatzigeorgiou; Kim D Pruitt
Journal: Nucleic Acids Res Date: 2014-10-28 Impact factor: 16.971

6. Small tRNA-derived RNAs are increased and more abundant than microRNAs in chronic hepatitis B and C.

Authors: Sara R Selitsky; Jeanette Baran-Gale; Masao Honda; Daisuke Yamane; Takahiro Masaki; Emily E Fannin; Bernadette Guerra; Takayoshi Shirasaki; Tetsuro Shimakami; Shuichi Kaneko; Robert E Lanford; Stanley M Lemon; Praveen Sethupathy
Journal: Sci Rep Date: 2015-01-08 Impact factor: 4.379

7. Hidden layers of human small RNAs.

Authors: Hideya Kawaji; Mari Nakamura; Yukari Takahashi; Albin Sandelin; Shintaro Katayama; Shiro Fukuda; Carsten O Daub; Chikatoshi Kai; Jun Kawai; Jun Yasuda; Piero Carninci; Yoshihide Hayashizaki
Journal: BMC Genomics Date: 2008-04-10 Impact factor: 3.969

8. HAMR: high-throughput annotation of modified ribonucleotides.

Authors: Paul Ryvkin; Yuk Yee Leung; Ian M Silverman; Micah Childress; Otto Valladares; Isabelle Dragomir; Brian D Gregory; Li-San Wang
Journal: RNA Date: 2013-10-22 Impact factor: 4.942

9. Pseudouridine profiling reveals regulated mRNA pseudouridylation in yeast and human cells.

Authors: Thomas M Carlile; Maria F Rojas-Duran; Boris Zinshteyn; Hakyung Shin; Kristen M Bartoli; Wendy V Gilbert
Journal: Nature Date: 2014-09-05 Impact factor: 49.962

10. A complete landscape of post-transcriptional modifications in mammalian mitochondrial tRNAs.

Authors: Takeo Suzuki; Tsutomu Suzuki
Journal: Nucleic Acids Res Date: 2014-05-15 Impact factor: 16.971
