Peter H Culviner, Chantal K Guegler, Michael T Laub.
Abstract
The profiling of gene expression by RNA sequencing (RNA-seq) has enabled powerful studies of global transcriptional patterns in all organisms, including bacteria. Because the vast majority of RNA in bacteria is rRNA, it is standard practice to deplete the rRNA from a total RNA sample such that the reads in an RNA-seq experiment derive predominantly from mRNA. One of the most commonly used commercial kits for rRNA depletion, the Ribo-Zero kit from Illumina, was recently discontinued abruptly and for an extended period of time. Here, we report the development of a simple, cost-effective, and robust method for depleting rRNA that can be easily implemented by any lab or facility. We first developed an algorithm for designing biotinylated oligonucleotides that will hybridize tightly and specifically to the 23S, 16S, and 5S rRNAs from any species of interest. Precipitation of these oligonucleotides bound to rRNA by magnetic streptavidin-coated beads then depletes rRNA from a complex, total RNA sample such that ∼75 to 80% of reads in a typical RNA-seq experiment derive from mRNA. Importantly, we demonstrate a high correlation of RNA abundance or fold change measurements in RNA-seq experiments between our method and the Ribo-Zero kit. Complete details on the methodology are provided, including open-source software for designing oligonucleotides optimized for any bacterial species or community of interest.
IMPORTANCE
The ability to examine global patterns of gene expression in microbes through RNA sequencing has fundamentally transformed microbiology. However, RNA-seq depends critically on the removal of rRNA from total RNA samples. Otherwise, rRNA would comprise upward of 90% of the reads in a typical RNA-seq experiment, limiting the reads coming from mRNA or requiring high total read depth. A commonly used kit for rRNA subtraction from Illumina was recently unavailable for an extended period of time, disrupting routine rRNA depletion. Here, we report the development of a "do-it-yourself" kit for rapid, cost-effective, and robust depletion of rRNA from total RNA. We present an algorithm for designing biotinylated oligonucleotides that will hybridize to the rRNAs from a target set of species. We then demonstrate that the designed oligonucleotides enable sufficient rRNA depletion to produce RNA-seq data with 75 to 80% of reads coming from mRNA. The methodology presented should enable RNA-seq studies on any species or metagenomic sample of interest.
Keywords: RNA sequencing; rRNA depletion; subtractive hybridization
Year: 2020 PMID: 32317317 PMCID: PMC7175087 DOI: 10.1128/mBio.00010-20
Source DB: PubMed Journal: mBio Impact factor: 7.867
FIG 1 Oligonucleotide selection for 16S rRNA. (A) Alignment of 16S sequences from 8 bacterial species (Ec, E. coli; Pa, P. aeruginosa; Rp, R. parkeri; Cc, C. crescentus; Bs, B. subtilis; Ms, M. smegmatis; Mtb, M. tuberculosis; Sa, S. aureus). Alignment gaps are shown as red lines in the species containing the gap. Regions with a gap in any species are highlighted in pink; these regions were not considered when designing oligonucleotides. (B) The position, length, and minimum melting temperature (Tm) of all oligonucleotides plotted against the 16S alignment after the indicated number of optimization cycles (top). The information content at each nucleotide position of aligned regions is also shown (bottom, points). To highlight conserved regions, a sliding average information content is also plotted (bottom, line). (C) Oligonucleotide Tm statistics after multiple cycles of the Tm optimization algorithm. For each oligonucleotide (n = 250), we calculated the minimum Tm across the 8 species considered and then plotted the mean of this value across all oligonucleotides (black). The Tm cannot be accurately estimated for oligonucleotides with multiple sequential mismatches; the number of oligonucleotides with an undefined Tm is also plotted (blue). (D) Histograms of minimum Tm for oligonucleotides at the indicated number of optimization cycles. Data were generated as in panel C, but oligonucleotide Tm minima were used to generate histograms rather than taking the mean across all oligonucleotides. Oligonucleotides with an undefined Tm were not included in the histograms. (E) Distribution of Tm values for each 16S-targeting oligonucleotide (n = 8) for each individual species indicated. The mean Tm of oligonucleotides for each species is also shown (red lines). Note that the same oligonucleotides are used for each species, but because of 16S sequence variability, the Tm can vary, as illustrated for one particular oligonucleotide (blue).
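The gap masking and per-position information content described in panels A and B can be illustrated with a minimal Python sketch. It is not the authors' software: it assumes information content is defined as 2 bits minus the Shannon entropy of the base frequencies in a column (a common convention that the legend does not confirm), and the toy alignment, the window size, and the function names `gap_free_columns`, `information_content`, and `sliding_average` are all hypothetical.

```python
import math
from collections import Counter


def gap_free_columns(alignment):
    """Return indices of alignment columns with no gap ('-') in any sequence."""
    length = len(alignment[0])
    return [i for i in range(length) if all(seq[i] != "-" for seq in alignment)]


def information_content(column):
    """Information content in bits, assumed here to be 2 - Shannon entropy."""
    counts = Counter(column)
    total = len(column)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return 2.0 - entropy


def sliding_average(values, window):
    """Centered moving average; the window is truncated at both ends."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Toy alignment (made up for illustration; not real 16S sequence).
alignment = [
    "ACGU-ACGUA",
    "ACGUUACGAA",
    "ACGA-ACGCA",
    "ACGC-ACGGA",
]

cols = gap_free_columns(alignment)  # column 4 is dropped because of the gap
ic = [information_content([seq[i] for seq in alignment]) for i in cols]
smoothed = sliding_average(ic, window=3)

for col, raw, avg in zip(cols, ic, smoothed):
    print(f"column {col}: information = {raw:.2f} bits, smoothed = {avg:.2f}")
```

Fully conserved columns score 2 bits and a column with all four bases equally represented scores 0, so peaks in the smoothed track mark conserved stretches where a single oligonucleotide is most likely to match every species.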
FIG 2 rRNA depletion by oligonucleotide-based hybridization. (A) Cartoon of the rRNA depletion process. (B) Polyacrylamide gel showing total RNA from E. coli, B. subtilis, and C. crescentus before and after rRNA depletion using indicated probe sets. The first lane is a ladder. Approximate positions of abundant RNAs, including rRNAs, are indicated on the right. Note that a lower contrast is shown for the top portion of the gel to resolve 16S and 23S bands. B. subtilis RNA extraction partially depleted the 5S and small ncRNAs (see Materials and Methods). (C) Fraction of total reads aligning to rRNA for rRNA-undepleted and -depleted samples of E. coli, B. subtilis, and C. crescentus total RNA. (D) Summed read counts across the E. coli 16S, 23S, and 5S rRNAs before (red) and after (blue) depletion. The positions of oligonucleotides used for depletion are shown below.
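The quantity plotted in panel C, the fraction of total reads aligning to rRNA, is a simple ratio. The Python sketch below shows one way to compute it from per-feature read counts. The function name `rrna_fraction`, the feature labels, and the counts are hypothetical placeholders chosen for illustration; they are not data from the study.

```python
def rrna_fraction(read_counts, rrna_features):
    """Fraction of all aligned reads that map to the given rRNA features."""
    total = sum(read_counts.values())
    if total == 0:
        raise ValueError("no aligned reads")
    rrna = sum(n for name, n in read_counts.items() if name in rrna_features)
    return rrna / total


RRNA = {"16S", "23S", "5S"}

# Made-up toy counts, not measurements.
undepleted = {"16S": 300, "23S": 600, "5S": 50, "mRNA": 50}
depleted = {"16S": 80, "23S": 100, "5S": 20, "mRNA": 800}

frac_before = rrna_fraction(undepleted, RRNA)
frac_after = rrna_fraction(depleted, RRNA)
print(f"rRNA fraction before depletion: {frac_before:.2f}")
print(f"rRNA fraction after depletion:  {frac_after:.2f}")
```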
FIG 3 Our rRNA depletion strategy performs comparably to Ribo-Zero for RNA-seq. (A) Scatterplot showing correlation between log2 fold change for E. coli coding regions following rifampicin treatment, comparing rRNA depletion via Ribo-Zero with our depletion strategy. Fold changes were calculated as the ratio of RPKM between rifampicin-treated and untreated samples. All coding regions with at least 64 RPKM in both untreated samples (n = 1,294) were considered in the analysis. (B) Scatterplot showing correlation between log2 fold change for E. coli coding regions following chloramphenicol treatment, comparing rRNA depletion via Ribo-Zero with our depletion strategy. Fold changes were calculated as the ratio of RPKM between chloramphenicol-treated and untreated samples. All coding regions with at least 64 RPKM in both untreated samples (n = 1,294) were considered in the analysis. (C) Scatterplot showing correlation between read counts (RPKM) for E. coli coding regions treated with Ribo-Zero and our do-it-yourself (DIY) depletion strategy. All coding regions with at least 64 RPKM in both samples (n = 1,294) were considered in the analysis. Eleven outliers preferentially depleted by our method are highlighted (black); also see Fig. S2D.
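The comparison in panels A and B (RPKM ratio, 64-RPKM cutoff on both untreated samples, log2 fold change, correlation between depletion methods) can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' analysis code: the function names, the input layout (gene mapped to an untreated and a treated RPKM), the use of Pearson correlation, and the choice to skip genes with zero treated RPKM are assumptions, and the RPKM values are invented.

```python
import math


def compare_fold_changes(ribo_zero, diy, min_rpkm=64.0):
    """Return genes and their log2(treated/untreated) under each method.

    Inputs map gene -> (untreated_rpkm, treated_rpkm). A gene is kept only
    if its untreated RPKM meets the cutoff under both depletion methods.
    Genes with zero treated RPKM are skipped (an assumption made here so
    that log2 stays defined).
    """
    genes = [
        g for g in ribo_zero
        if g in diy
        and ribo_zero[g][0] >= min_rpkm and diy[g][0] >= min_rpkm
        and ribo_zero[g][1] > 0 and diy[g][1] > 0
    ]
    x = [math.log2(ribo_zero[g][1] / ribo_zero[g][0]) for g in genes]
    y = [math.log2(diy[g][1] / diy[g][0]) for g in genes]
    return genes, x, y


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Invented RPKM values: gene -> (untreated, treated).
ribo_zero = {"geneA": (128.0, 32.0), "geneB": (256.0, 256.0),
             "geneC": (64.0, 128.0), "geneD": (10.0, 40.0)}
diy = {"geneA": (100.0, 25.0), "geneB": (200.0, 200.0),
       "geneC": (80.0, 160.0), "geneD": (12.0, 48.0)}

genes, x, y = compare_fold_changes(ribo_zero, diy)
r = pearson(x, y)
print(genes, x, y, round(r, 3))
```

In this toy input, geneD falls below the 64-RPKM cutoff and is excluded, which mirrors how the legend restricts the analysis to well-expressed coding regions, where ratios are less dominated by counting noise.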