Literature DB >> 21880599

Duplex-specific nuclease efficiently removes rRNA for prokaryotic RNA-seq.

Hana Yi1, Yong-Joon Cho, Sungho Won, Jong-Eun Lee, Hyung Jin Yu, Sujin Kim, Gary P Schroth, Shujun Luo, Jongsik Chun.   

Abstract

Next-generation sequencing has great potential for application in bacterial transcriptomics. However, unlike eukaryotes, bacteria have no clear mechanism to select mRNAs over rRNAs; therefore, rRNA removal is a critical step in sequencing-based transcriptomics. Duplex-specific nuclease (DSN) is an enzyme that, at high temperatures, degrades duplex DNA in preference to single-stranded DNA. DSN treatment has been successfully used to normalize the relative transcript abundance in mRNA-enriched cDNA libraries from eukaryotic organisms. In this study, we demonstrate the utility of this method to remove rRNA from prokaryotic total RNA. We evaluated the efficacy of DSN to remove rRNA by comparing it with the conventional subtractive hybridization (Hyb) method. Illumina deep sequencing was performed to obtain transcriptomes from Escherichia coli grown under four growth conditions. The results clearly showed that our DSN treatment was more efficient at removing rRNA than the Hyb method was, while preserving the original relative abundance of mRNA species in bacterial cells. Therefore, we propose that, for bacterial mRNA-seq experiments, DSN treatment should be preferred to Hyb-based methods.

Entities:  

Mesh:

Substances:

Year:  2011        PMID: 21880599      PMCID: PMC3203590          DOI: 10.1093/nar/gkr617

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

RNA-seq is a novel method for elucidating the transcriptome of cells. This method uses high-throughput next-generation sequencing technology and has revolutionized the way in which gene expression profiles are examined (1). The RNA molecules present in prokaryotic cells are mostly rRNA species, whereas mRNA constitutes only 1–5% of total RNA. Therefore, efficient enrichment of mRNA is a critical step for successful mRNA-seq experiments. Because mRNA molecules extracted from bacterial cells mostly lack poly-A tails, the methods developed so far have focused on removing non-mRNAs rather than selecting mRNAs. Several techniques (2–6) have been applied to deplete rRNA from the total bacterial RNA population, but the efficiency and robustness of these methods have not been objectively compared. Recently, He et al. (7) compared the two most popular rRNA removal methods, namely subtractive hybridization (Hyb) and exonuclease digestion, using Illumina-based RNA-seq of synthetic microbial metatranscriptomes. Their results suggested that the Hyb method introduced less bias in the relative proportion of the mRNA population compared to exonuclease digestion. However, no study has been conducted to evaluate these rRNA removal methods in mRNA-seq based on pure cultured strains. Zhulidov et al. (8) introduced a simple cDNA normalization method based on duplex-specific nuclease (DSN) aimed at enhancing the detection of rare transcripts in eukaryotic cDNA libraries by decreasing the prevalence of highly abundant transcripts. This DSN method includes the denaturation of cDNA, its subsequent reassociation and enzymatic degradation of the double-stranded (ds) DNA fraction using DSN isolated from the Kamchatka crab (9). Because the Hyb rate for each transcript is proportional to the square of its concentration (10), abundant transcripts form ds DNA more effectively during the reassociation step and are subjected to DSN-mediated degradation. DSN has a strong preference for cleaving dsDNA, and there is no significant cleavage of single-stranded (ss) DNA under the directed working conditions of the enzyme (9). The DSN method has been successfully applied to normalize transcripts in cDNA libraries from various eukaryotes (11). The treatment was usually performed on cDNA libraries enriched with mRNAs using either mRNA-specific poly(A) tail selection in the RNA state (12) or an oligo(dT) primer approach for reverse transcription from total RNA (13,14). However, the application of the DSN method for the purpose of rRNA removal from total RNA has not been reported in prokaryotic or eukaryotic transcriptome studies. Here, the use of DSN normalization as an rRNA removal method was evaluated and compared to the conventional subtractive Hyb method. Illumina deep sequencing of the transcriptomes of Escherichia coli grown under four conditions demonstrated that the DSN method is suitable for rRNA removal while preserving the original relative abundance of each mRNA transcript.

MATERIALS AND METHODS

Bacterial cultures

Escherichia coli K-12/MG1655 was grown in LB medium (Difco) at 37°C with continuous shaking. The freshly grown cells were inoculated into two culture flasks containing LB medium and incubated under anaerobic or aerobic conditions. At exponential phase (OD0.6), the aerobic culture was evenly distributed into three sterile culture flasks under aseptic conditions. Three aliquots were then subjected to three different conditions as follows. The first aliquot was subjected to instant RNA extraction at exponential phase, and the second was extracted at stationary phase (OD2.0). The third aliquot was subjected to heat shock stress by incubating the culture at 42°C for 30 min. The anaerobic cells were also grown to exponential phase (OD0.6), and the cells were subjected to instant RNA extraction.

RNA extraction, rRNA removal and sequencing library construction

Total RNA was independently extracted from the four culture conditions (aerobic exponential, aerobic stationary, aerobic heat shock and anaerobic exponential) using the hot phenol method with additional purification using an RNeasy Mini kit (Qiagen) following the manufacturer's instructions. The quantity and quality of the RNA were evaluated before and after the rRNA removal processes using RNA electropherograms (Agilent 2100 Bioanalyzer) and the RNA integrity number (RIN) (15). Total RNAs from cultures treated under the four conditions were aliquoted into three portions. The first aliquot of total RNA (200 ng) was used to generate a sequencing library using an mRNA-seq library prep kit (Illumina) without any other treatment. The RNA was directly subjected to fragmentation without the mRNA purification step (poly-A selection). The second aliquot was subjected to a subtractive Hyb-based rRNA removal process using the MICROBExpress Bacterial mRNA Enrichment Kit (Ambion). The resultant RNA (100 ng) was used for sequencing library construction using the mRNA-seq library prep kit, omitting the poly-A selection step. The last aliquot (200 ng) was used to generate a sequencing library using an mRNA-seq library prep kit with some modifications. The RNA was directly subjected to fragmentation without the mRNA purification step (poly-A selection). The first- and second-strand cDNA was synthesized from the fragmented RNA using random hexamer primers. End repair, A-tailing, adaptor ligation, cDNA template purification and enrichment of the purified cDNA templates using PCR were then performed. The resulting sample libraries were subjected to DSN treatment using the Trimmer-Direct cDNA Normalization Kit (Evrogen) as follows. The sample library mixed with Hyb buffer was denatured at 98°C for 2 min and incubated at 68°C for 5 h. DSN buffer and 2 μl of the DSN enzyme were added to the mixture and incubated at 68°C for 25 min followed by the addition of stop solution. After purification of the DSN-treated library using SPRI beads, the library was enriched by PCR using PE1.0 and PE2.0 primers. The library construction was completed by final purification of the PCR product using SPRI beads. Because the commercial Hyb and DSN kits already employed highly optimized reagents and conditions for E. coli RNA, we adopted the procedures suggested by the respective manufacturers. The summary of the experimental procedures of the control, DSN and Hyb methods is given in Supplementary Figure S1.

Sequencing and alignment of the transcriptome

RNA deep sequencing was performed using two runs of the Illumina Genome Analyzer IIx to generate single-ended 36-bp reads. The genome sequence and functional annotation information of this strain were obtained from the NCBI database (accession number NC_000913.2). Quality-filtered reads were aligned to the reference genome sequence using CLC Genomics Workbench 4.0 (CLC bio). Mapping was based on the minimal length of 32 bp with an allowance of up to two mismatches. The relative transcript abundance was measured in reads per kilobase of exon per million mapped sequence reads (16) (RPKM) using the following formula:

Determination of detection threshold

The library-size normalization was performed by dividing the raw read count of each mRNA by the number of total mapped reads in each Illumina lane and then multiplying by the average total mapped read numbers of four control samples. The detection threshold was determined by calculating the number of reads of an mRNA that was significantly different from zero read considering the experimental errors. For this calculation, duplicate control RNA data were analyzed under the assumption that controls 1 and 2 should have an identical true expression level, but measurement errors may cause different observed expression levels. The sample standard deviation was used to calculate the confidence interval that distinguished the observed expression level of an mRNA from an undetected mRNA. The detailed calculations are as follows. It was assumed that x(x) was an observed expression level for control 1 (control 2) and that μ was an unknown true expression level for both x and x. If x was 0, it was removed from the analysis, and it was assumed that x was larger than 0. For the measurement error, the following log-normal distributions for x and x were assumed, respectively: where x and x are independent. The variance in x and x depends on μ because the proportion of the measurement error got smaller for higher μ, and results in Supplementary Figure S3a also confirmed that the inverse of their variance was proportional to the true expression level. Because the mean parameters for x and x were assumed to be equal, the following could be obtained: As a result, the confidence interval for x given μ could be empirically calculated using . To minimize the dependence of the variance on μ, only the observed intensities for which was <1 were considered. If the variance was denoted as , the normality of the observed expression level resulted in the confidence interval at a 0.001 significance level.

Statistical analyses

Statistical analyses were performed using the RPKM values of mRNAs detected from all experimental conditions after detection threshold filtering using library size-normalized data. The statistical significance (P-value) of differences in rRNA removal efficiency was obtained using the likelihood ratio test based on a generalized linear model based on Poisson regression. The robustness of mRNA relative abundance conservation was analyzed using general linear regression and Lowess nonlinear regression models. Hierarchical clustering was performed using the unweighted pair group method with arithmetic mean (UPGMA) clustering algorithm using Pearson's product-moment correlation coefficient. All statistical analyses were performed using the R package, version 2.11.0 (www.r-project.org).

RESULTS

Illumina deep sequencing

Four RNA samples prepared from E. coli K12/MG1655 cells grown under four conditions (aerobic exponential, aerobic stationary, aerobic heat shock and anaerobic exponential) were aliquoted and subjected to three different protocols. The first aliquot (control) was processed without any rRNA removal treatment; the second aliquot was treated using DSN normalization (DSN; Trimmer-Direct cDNA Normalization Kit, Evrogen); and the third aliquot was treated using subtractive Hyb (MICROBExpress Bacterial mRNA Enrichment kit, Ambion). The RNA quality measured using RNA electropherograms showed that the extracted total RNA was of good quality, with an average RNA integrity number (RIN) of 9.2 (Supplementary Figure S2). The disappearance of rRNA peaks after the rRNA removal process using Hyb was also visualized in the electropherograms. DNA sequencing was performed using two eight-lane flow cells of the Illumina Genome Analyzer IIx to generate single-ended 36-bp reads. The first run contained eight lanes of untreated control samples, which consisted of duplicate lanes of each condition to allow an increased number of mRNA reads for samples. The second run contained four lanes of DSN-treated samples and four Hyb-treated samples. The number of quality-filtered reads for each sample ranged from 32 to 39 million, and >99.0% of the reads (average 99.4%) were mapped to the reference genome sequence, indicating good sequencing quality and negligible contamination (Table 1). The Illumina reads of duplicate control sample lanes were combined for further analyses.
Table 1.

Sequencing and alignment statistics

DescriptionRun IDTotal readsMapped readsmRNArRNAtRNAmisc_RNAintergenicUnmapped reads
Control
    ExponentialEx-C (Ex-C1, Ex-C2)73 501 02973 146 1862 752 70069 216 30910 917116 6281 049 632354 843
    StationarySt-C (St-C1, St-C2)65 848 25265 554 0242 482 82862 197 0771389195 905676 825294 228
    Heat shockHe-C (He-C1, He-C2)76 958 66276 537 8992 904 83471 652 9456490183 4211 790 209420 763
    AnaerobicAn-C (An-C1, An-C2)73 860 61173 469 5392 120 61169 437 5821435274 2461 635 665391 072
DSN
    ExponentialEx-D32 461 46432 346 69918 516 3309 593 62055 732425 7623 755 255114 765
    StationarySt-D34 429 13834 274 83324 816 3214 566 20611 204747 6764 133 426154 305
    Heat shockHe-D33 583 55733 470 90021 926 9117 125 45538 092604 1163 776 326112 657
    AnaerobicAn-D32 309 94132 179 27716 188 51012 459 8437358727 9962 795 570130 664
Hyb
    ExponentialEx-H39 013 86238 765 8579 461 60025 217 66029 688253 4293 803 480248 005
    StationarySt-H35 007 17234 706 7677 897 90722 624 5875341406 0123 772 920300 405
    Heat shockHe-H39 419 31039 030 25112 374 01423 788 88132 205620 1682 214 983389 059
    AnaerobicAn-H37 470 51637 106 5015 635 77329 788 7407606519 7161 154 666364 015

The statistics of untreated control samples were obtained from data combined from two lanes.

Sequencing and alignment statistics The statistics of untreated control samples were obtained from data combined from two lanes. The sequence coverage is defined as the proportion of mRNAs that have one or more mapped reads with respect to all annotated genes (4493 genes in the reference E. coli genome). The average coverage of the control samples was 98.6%, which is not significantly different from those treated with DSN (99.1%) or Hyb (99.2%), suggesting that the sequencing depths in this study were sufficient (Supplementary Table S1). The resultant expression profile of E. coli is shown in Supplementary Table S2, and the top 10 highly expressed genes in each growth and rRNA treatment conditions are summarized in Supplementary Table S3. The composition of major RNA types, namely rRNA, tRNA, mRNA and other RNA (miscRNA), was similar to that of prokaryotes (7).

rRNA removal efficiency

Illumina deep sequencing of the total RNA in E. coli revealed that rRNA was indeed the major component in all four growth conditions, ranging from 93.1% to 94.5%, whereas a substantially smaller proportion (2.9–3.8%) was identified as mRNA (Table 1). After the rRNA removal treatments, the proportion of rRNA in the cDNA libraries was reduced to 13.3–38.6% using DSN and 60.4–79.5% using Hyb (Figure 1). The mapped mRNAs increased 17.3-fold using DSN and 6.5-fold using Hyb compared to the untreated controls. The difference in the efficiency of rRNA removal treatments was statistically significant (P = 0.00007). The rRNA removal efficiency of DSN was 2.5 times higher than Hyb. However, the ratio of unmapped reads was higher in samples that underwent Hyb (0.86%) than those treated with DSN (0.39%) or than the control (0.50%). The efficiency of rRNA removal in DSN method increased in the order of rRNA size (23 S > 16 S > 5 S), but the ratio of 5 S:16 S:23 S was pretty conserved in DSN (0.1:35:65) compared to control (0.02:32:68), while the ratio was significantly shifted in Hyb (0.4:9:91).
Figure 1.

rRNA removal efficiency of DSN and Hyb methods. The proportion of Illumina reads assigned to mRNA or rRNA indicates that DSN is more effective at rRNA removal than Hyb.

rRNA removal efficiency of DSN and Hyb methods. The proportion of Illumina reads assigned to mRNA or rRNA indicates that DSN is more effective at rRNA removal than Hyb.

RNA-seq detection threshold

To improve the accuracy of the RNA-seq analysis, the mRNA detection threshold was determined to exclude Illumina reads in which the expression level was severely affected by measurement errors. It was assumed that the biological duplicates, namely controls 1 and 2 under the same conditions, had unknown identical means and that the measurement errors caused the differences in the Illumina read counts. To determine the detection threshold value, read count 1 was used as the one-tailed upper limit of minimum read at the 0.001 significance level. Because the variance of the measurement decreased with the level of the unknown true expression level (Supplementary Figure S3a), the area in which the variance showed normal distribution was determined first (Supplementary Figure S3b) to minimize the dependence of the variance on expression values. For the selected area, the variance was estimated, and the detection threshold of an mRNA was determined to be six Illumina reads in our library size-normalized data at the 0.001 significance level. Consequently, 94.58% of mRNAs that had more than six mapped reads in all samples that passed the filtering. The 234 failed ORFs (5.42%) were omitted from further statistical analyses. By removing the insignificant low read-count genes, the linearity of regression between the two duplicate controls was improved (Supplementary Figure S3c and d).

Robustness of mRNA relative abundance

The correlation of mRNA expression patterns between control samples and the two rRNA removal treatments is summarized in Figure 2a and Supplementary Figure S4. The average slope of the linear regression line between the controls and DSN-treated samples was 0.99 (r = 0.99, Pearson's correlation coefficient), whereas the corresponding value between the controls and Hyb-treated samples was significantly lower (0.75 on average, r = 0.93). Similarly, the Lowess fit of DSN-treated samples converged to linear regression, whereas Hyb-treated samples showed a departure from linearity, with a skewed shape in the area of the low to middle ‘expressers’. Hierarchical clustering analysis of all samples using the Pearson's product-moment correlation coefficient indicated that the mRNA expression profiles of the untreated controls were more similar to that of DSN-treated samples than that of Hyb-treated samples (Figure 2b).
Figure 2.

Comparison of robustness of mRNA relative abundance using DSN and Hyb. (a) Correlation of relative gene expression between the control and rRNA removal treatments. The exponential phase libraries (Ex) are shown as an example. Left, control (Ex-C) versus DSN (Ex-D); right, control (Ex-C) versus Hyb (Ex-H). The points in the plot indicate the RPKM value of each individual mRNA transcript. Red indicates the linear regression line, and blue indicates non-linear regression (Lowess) fit. Green is a straight line with a slope of 1. (b) Hierarchical clustering of all samples tested in this study using UPGMA based on Pearson's correlation. The regression plot and dendrogram represent the superior conservation of mRNA relative abundance in DSN treatment compared to Hyb treatment.

Comparison of robustness of mRNA relative abundance using DSN and Hyb. (a) Correlation of relative gene expression between the control and rRNA removal treatments. The exponential phase libraries (Ex) are shown as an example. Left, control (Ex-C) versus DSN (Ex-D); right, control (Ex-C) versus Hyb (Ex-H). The points in the plot indicate the RPKM value of each individual mRNA transcript. Red indicates the linear regression line, and blue indicates non-linear regression (Lowess) fit. Green is a straight line with a slope of 1. (b) Hierarchical clustering of all samples tested in this study using UPGMA based on Pearson's correlation. The regression plot and dendrogram represent the superior conservation of mRNA relative abundance in DSN treatment compared to Hyb treatment.

Robustness of mRNA profiles

The fold-change in expression levels of each mRNA was calculated by dividing the RPKM of stationary (St), heat shock (He) or anaerobic (An) samples by the RPKM of the exponential (Ex) sample and these fold-change values were represented as St/Ex, He/Ex or An/Ex, respectively. The fold difference between the control and the two rRNA removal treatments was calculated by subtracting the log-scale fold value of the corresponding control sample from that of DSN- or Hyb-treated sample. The regression analyses (Supplementary Figure S5) and boxplots (Figure 3) indicated a smaller fold difference in samples treated with DSN than Hyb. The slopes of the linear regression lines of DSN- and Hyb-treated samples were 0.96 (r = 0.98) and 0.74 (r = 0.90), respectively. The Lowess fit analysis of Hyb-treated samples also showed a departure from linearity, whereas the fit of DSN-treated samples converged to linear regression. The boxplots (Figure 3) demonstrated no fold differences in DSN-treated samples, with average values close to 0, whereas the corresponding values of Hyb-treated samples ranged between 0.03 and 0.10, implying significant expression level differences caused by the Hyb treatment.
Figure 3.

Boxplots showing fold difference profiles. The fold-change of each mRNA was calculated by dividing the RPKM of St, He or An samples by the RPKM of the Ex sample and is represented as St/Ex, He/Ex or An/Ex, respectively. The fold difference profile caused by rRNA removal treatment was calculated by subtracting the log-scale fold-change of the corresponding control sample from the DSN- or Hyb-treated samples. The average and standard deviation are shown at the top of the plots. No fold difference was observed using DSN, with the average values close to 0, whereas the corresponding values of Hyb ranged between 0.10 and 0.03, implying a significant expression level difference caused by the Hyb treatment.

Boxplots showing fold difference profiles. The fold-change of each mRNA was calculated by dividing the RPKM of St, He or An samples by the RPKM of the Ex sample and is represented as St/Ex, He/Ex or An/Ex, respectively. The fold difference profile caused by rRNA removal treatment was calculated by subtracting the log-scale fold-change of the corresponding control sample from the DSN- or Hyb-treated samples. The average and standard deviation are shown at the top of the plots. No fold difference was observed using DSN, with the average values close to 0, whereas the corresponding values of Hyb ranged between 0.10 and 0.03, implying a significant expression level difference caused by the Hyb treatment. The transcripts for which relative abundance was severely biased using Hyb treatment were identified using a threshold of 0.5 at Δlog (fold value) because the majority of transcripts of DSN-treated samples were within the threshold (Figure 3). A total of 127 ORFs were identified from Hyb-treated samples (Supplementary Table S4), and their functional categories are provided in Supplementary Figure S6. No correlation between the functional categories of the transcripts and bias was detected. No selective loss or gain of mRNAs depending on GC content was either observed in this data set.

DISCUSSION

Given the prevalence of rRNA in the total transcriptome of prokaryotes, it is essential to enrich mRNA prior to sequencing-based genome-wide gene expression studies. However, it is equally important to preserve the overall gene expression patterns while enriching the mRNA. In this study, we evaluated two commercially available methods for performing such a task using E. coli, the most widely used model bacterium. In our study, both the Hyb and DSN methods removed a substantial proportion of rRNA species, with DSN being 2.5-fold more efficient than Hyb. Our results suggest that researchers can reduce the sequencing cost of RNA-seq by 2.5-fold by using DSN method, compared to the commercial Hyb kit. The rRNA removal efficiency of Hyb method obtained in this study was comparable with the previous studies. It has been known that the rRNA removal efficiency of Hyb method varies widely for community RNA samples (17,18), as well as for single-species analyses (19). For example, the amount of rRNA remained after the commercial Hyb treatment ranged from 43.6% to 98.6% depending on microbial species (7). This is because the Hyb method is based on Hyb between rRNA and oligonucleotides that target conserved regions of bacterial rRNA; therefore, the removal efficiency is largely dependent on the selected oligonucleotide sequences. In contrast, the DSN method does not depend on particular rRNA sequences; therefore, in theory, it can be used for any organism, including archaea. However, the variation of rRNA removal efficiency between samples was also observed in DSN treatment, but the reason is unclear at this stage. The regression analyses demonstrated the lower robustness of the Hyb method compared to DSN, especially in the low- to middle-expression range (approximate RPKM < 300). Many transcripts in this range seemed to be expressed at higher levels than those in the untreated controls. The depletion of high expressers may result in the relative overrepresentation of lower expressers. A number of severely biased transcripts were found in samples that underwent Hyb treatment, although no correlation between the functional categories of the transcripts and bias was found. Undesired binding between mRNAs and the capture oligonucleotides may cause this phenomenon. Indeed, in the case of ribosomal proteins (rps, rpl and rpm) that have obvious homology with rRNAs, the average RPKM value was reduced to 76% of the untreated control after Hyb treatment, whereas it was conserved at 105% in DSN-treated samples. An increased proportion of unmapped reads after Hyb treatment compared to the control or DSN treatment was also noted, although the reason for this is unclear. Because of the function of the DSN enzyme, which preferentially degrades highly abundant transcripts over transcripts of low abundance, we expected that abundant transcripts could be affected by DSN treatment. Because the absolute levels of gene expression in bacteria vary over as much as six orders of magnitude (20–22), the concentration of highly abundant mRNA classes could be affected by DSN degradation. However, as far as we surveyed, even the most abundant mRNA typically comprise 1–7% of mRNAs (6,16,23), and as much as 10% in severe cases (24). Thus, the amount of rRNA in a cell significantly outnumbers even the most abundant mRNA transcripts. In fact, the 10 most expressed genes in each condition showed a little bit declined expression level in DSN (7.9% on average) compared to untreated control as shown in Supplementary Table S3, while the amount of decline was much more severe in Hyb (20.3%). Moreover, the log scale regression analysis in this study proved that the amount of decline observed in the abundant mRNAs did not hamper the overall robustness. The typical ratio of rRNA:non-rRNA revealed by RNA-seq is 95–99% in a wide taxonomic range of pure cultured bacteria and archaea (7). Thus, our DSN method is, at least, applicable to a broad range of prokaryotes in laboratory culturing condition. In addition, it has been known that the ratio of rRNA compared to mRNA is higher in resting cells than actively growing cells (25,26). Thus, we think that the overwhelming abundance of rRNA compared to non-rRNA would be the case for even slowly growing cells or metabolically inactive cells, though experimental evidence will be required for DSN applicability for these conditions in future. Because the rRNA ratios reported in a metatranscriptomic study are also as high as 74–97% (6), it is fair to say that DSN method would be applicable to mixed population samples. Several rRNA removal methods are known, but all of these methods have some limitations. The methods based on Hyb between rRNA and DNA targeting rRNA may generate bias depending on the taxonomy of bacteria. These methods include the RNase H digestion method based on reverse transcription with rRNA-specific primers (4) and the Hyb method evaluated in this study. Though recently developed subtractive Hyb method has solved the bias by generating customized oligonucleotide probes targeting sample specific rRNAs (6), it is not still free from non-specific binding of rRNA probe to mRNAs. The size selection method using gel electrophoresis (5) has an apparent limitation because some mRNAs can co-migrate with rRNA and are subsequently omitted. The poly(A) tail addition methods (3,27), which use preferential poly(A) adenylation of mRNA in crude RNA, also have the possible limitation of uneven poly(A) adenylation efficiency among mRNAs, which may generate expression-level bias. Because poly(A) tail addition method have not been tested in pure cultures using RNA-seq, further evaluation is required. In the case of 5′-phosphate-dependent exonuclease digestion of rRNA with 5′ monophosphate, it has already been demonstrated that this method compromises the relative proportion of the mRNA population more severely than the Hyb method (7). This change in the relative proportion may occur because exonuclease can also eliminate 5′-monophosphorylated mRNA species, which are produced during mRNA processing by endoribonucleases (28,29) and RppH (30). The only drawback of DSN over the Hyb method is that it requires more experimental steps and, therefore, a longer time to prepare cDNA libraries for deep sequencing. However, because the rRNA removal step is performed after cDNA library construction, the possibility of unintentional RNA degradation is greatly reduced compared to other methods involving pre-treatments. In addition, the amount of total input RNA used for deep sequencing library construction was much less in the DSN method (200 ng) than in the conventional mRNA-seq accompanying poly(A) selection (1–10 μg). Considering the loss of RNAs during the mRNA enrichment step using other pre-treatments, the amount of total RNA to be extracted is even smaller using the DSN method. Another advantage of the DSN method is that it works well even with partially degraded total RNA, whereas the Hyb method requires intact rRNA for the successful binding of the oligonucleotides to targeted conserved sites. Indeed, a well-supported positive correlation between RIN values, which represent the degree of rRNA integrity, and rRNA removal efficiency (r = 0.88) was observed in our Hyb experiments, as well as in a previous report (7); however, this type of correlation was not observed in our DNS treatments (r = 0.27). Although the number of samples (four total RNAs) and the range of RNA degradation (RIN value 8.8–10.0) were not strong enough for statistical analyses, the lower importance of the RNA fragmentation status for effective DNS treatment compared to the Hyb method was clear. Because the conventional RNA-seq of eukaryotic organisms relies on poly(A)+ capture in the first step of sequencing library preparation, the outcome generally contains only information on poly(A)-tailed RNAs. If the DSN method used in this study is applied to eukaryotic RNA, it probably will provide information on both non-poly(A) RNA sequences and poly(A) RNA. Further study of the feasibility of this method for eukaryotic RNA-seq without poly(A) selection is therefore needed. In this study, we performed deep sequencing of total RNA of E. coli. Although E. coli is a well studied, widely used model organism, its precise genome-wide expression profile has not been documented previously. To our knowledge, this is the first report on the intact genome-wide RNA profile of E. coli without any treatment and selection. This information clearly demonstrates the overall abundance of different RNA species in a bacterial cell. DSN-based normalization showed a higher efficiency of rRNA removal than the Hyb method, while preserving the relative abundance of mRNA. The thermodynamic principle of this technique allows its application to any kind of eukaryotic or prokaryotic organism. Therefore, DSN-based mRNA enrichment can be readily used in bacterial mRNA-seq experiments.

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online.

FUNDING

Priority Research Centers Program (grant number 2010-0094020); National Research Foundation grant (2011-0016498) through the National Research Foundation of Korea funded by the Ministry of Education, Science and Technology, Republic of Korea. Funding for open access charge: National Science Foundation. Conflict of interest statement. None declared.
  28 in total

1.  Transcription profiling-based identification of Staphylococcus aureus genes regulated by the agr and/or sarA loci.

Authors:  P M Dunman; E Murphy; S Haney; D Palacios; G Tucker-Kellogg; S Wu; E L Brown; R J Zagursky; D Shlaes; S J Projan
Journal:  J Bacteriol       Date:  2001-12       Impact factor: 3.490

2.  A novel method for SNP detection using a new duplex-specific nuclease from crab hepatopancreas.

Authors:  Dmitry A Shagin; Denis V Rebrikov; Valery B Kozhemyako; Ilia M Altshuler; Alex S Shcheglov; Pavel A Zhulidov; Ekaterina A Bogdanova; Dmitry B Staroverov; Valery A Rasskazov; Sergey Lukyanov
Journal:  Genome Res       Date:  2002-12       Impact factor: 9.043

3.  Validation of two ribosomal RNA removal methods for microbial metatranscriptomics.

Authors:  Shaomei He; Omri Wurtzel; Kanwar Singh; Jeff L Froula; Suzan Yilmaz; Susannah G Tringe; Zhong Wang; Feng Chen; Erika A Lindquist; Rotem Sorek; Philip Hugenholtz
Journal:  Nat Methods       Date:  2010-09-19       Impact factor: 28.547

4.  Microbial community gene expression within colonies of the diazotroph, Trichodesmium, from the Southwest Pacific Ocean.

Authors:  Ian Hewson; Rachel S Poretsky; Sonya T Dyhrman; Brian Zielinski; Angelicque E White; H James Tripp; Joseph P Montoya; Jonathan P Zehr
Journal:  ISME J       Date:  2009-07-02       Impact factor: 10.302

5.  Development and quantitative analyses of a universal rRNA-subtraction protocol for microbial metatranscriptomics.

Authors:  Frank J Stewart; Elizabeth A Ottesen; Edward F DeLong
Journal:  ISME J       Date:  2010-03-11       Impact factor: 10.302

6.  Quantifying E. coli proteome and transcriptome with single-molecule sensitivity in single cells.

Authors:  Yuichi Taniguchi; Paul J Choi; Gene-Wei Li; Huiyi Chen; Mohan Babu; Jeremy Hearn; Andrew Emili; X Sunney Xie
Journal:  Science       Date:  2010-07-30       Impact factor: 47.728

7.  Reverse transcriptase template switching: a SMART approach for full-length cDNA library construction.

Authors:  Y Y Zhu; E M Machleder; A Chenchik; R Li; P D Siebert
Journal:  Biotechniques       Date:  2001-04       Impact factor: 1.993

8.  Isolation of Escherichia coli mRNA and comparison of expression using mRNA and total RNA on DNA microarrays.

Authors:  V F Wendisch; D P Zimmer; A Khodursky; B Peter; N Cozzarelli; S Kustu
Journal:  Anal Biochem       Date:  2001-03       Impact factor: 3.365

9.  Mapping the Burkholderia cenocepacia niche response via high-throughput sequencing.

Authors:  D R Yoder-Himes; P S G Chain; Y Zhu; O Wurtzel; E M Rubin; James M Tiedje; R Sorek
Journal:  Proc Natl Acad Sci U S A       Date:  2009-02-20       Impact factor: 11.205

10.  Simple cDNA normalization using kamchatka crab duplex-specific nuclease.

Authors:  Pavel A Zhulidov; Ekaterina A Bogdanova; Alex S Shcheglov; Laura L Vagner; George L Khaspekov; Valery B Kozhemyako; Mikhail V Matz; Ella Meleshkevitch; Leonid L Moroz; Sergey A Lukyanov; Dmitry A Shagin
Journal:  Nucleic Acids Res       Date:  2004-02-18       Impact factor: 16.971

View more
  44 in total

1.  Scalable and cost-effective ribonuclease-based rRNA depletion for transcriptomics.

Authors:  Yiming Huang; Ravi U Sheth; Andrew Kaufman; Harris H Wang
Journal:  Nucleic Acids Res       Date:  2020-02-28       Impact factor: 16.971

2.  Lineage-specific SoxR-mediated Regulation of an Endoribonuclease Protects Non-enteric Bacteria from Redox-active Compounds.

Authors:  Jisun Kim; Chulwoo Park; James A Imlay; Woojun Park
Journal:  J Biol Chem       Date:  2016-11-28       Impact factor: 5.157

3.  RNA sequencing: platform selection, experimental design, and data interpretation.

Authors:  Yongjun Chu; David R Corey
Journal:  Nucleic Acid Ther       Date:  2012-07-25       Impact factor: 5.486

Review 4.  rRNA Binding Sites and the Molecular Mechanism of Action of the Tetracyclines.

Authors:  Chinwe U Chukwudi
Journal:  Antimicrob Agents Chemother       Date:  2016-07-22       Impact factor: 5.191

5.  Metatranscriptomic analysis of arctic peat soil microbiota.

Authors:  Alexander T Tveit; Tim Urich; Mette M Svenning
Journal:  Appl Environ Microbiol       Date:  2014-07-11       Impact factor: 4.792

6.  Rapid identification of novel antigens of Salmonella Enteritidis by microarray-based immunoscreening.

Authors:  Lena Danckert; Sebastian Hoppe; Frank F Bier; Markus von Nickisch-Rosenegk
Journal:  Mikrochim Acta       Date:  2014-02-14       Impact factor: 5.833

Review 7.  Cancer transcriptome profiling at the juncture of clinical translation.

Authors:  Marcin Cieślik; Arul M Chinnaiyan
Journal:  Nat Rev Genet       Date:  2017-12-27       Impact factor: 53.242

8.  Terminator oligo blocking efficiently eliminates rRNA from Drosophila small RNA sequencing libraries.

Authors:  Michelle L Wickersheim; Justin P Blumenstiel
Journal:  Biotechniques       Date:  2013-11       Impact factor: 1.993

9.  Comparative metatranscriptomics reveals kingdom level changes in the rhizosphere microbiome of plants.

Authors:  Thomas R Turner; Karunakaran Ramakrishnan; John Walshaw; Darren Heavens; Mark Alston; David Swarbreck; Anne Osbourn; Alastair Grant; Philip S Poole
Journal:  ISME J       Date:  2013-07-18       Impact factor: 10.302

10.  Torn apart and reunited: impact of a heterotroph on the transcriptome of Prochlorococcus.

Authors:  Steven J Biller; Allison Coe; Sallie W Chisholm
Journal:  ISME J       Date:  2016-06-03       Impact factor: 10.302

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.