Yan Guo, Jie Wu, Shilin Zhao, Fei Ye, Yinghao Su, Travis Clark, Quanhu Sheng, Brian Lehmann, Xiao-Ou Shu, Qiuyin Cai.
Abstract
Background. Proper rRNA depletion is crucial for the successful use of FFPE specimens in gene expression studies. We evaluated two major rRNA depletion methods: Ribo-Zero and RNase H. RNAs extracted from 4 samples were treated with each of the two rRNA depletion methods in duplicate and sequenced (N = 16). We evaluated the methods' reproducibility, ability to detect RNA, and ability to molecularly subtype these triple negative breast cancer specimens.
Results. Both rRNA depletion methods produced consistent data between technical replicates. The RNase H method produced higher quality RNAseq data than the Ribo-Zero method. In addition, we evaluated the RNAseq data generated from the FFPE tissue samples for noncoding RNA (including lncRNA and enhancer/super enhancer RNA) and for single nucleotide variation (SNV). RNase H was more suitable than Ribo-Zero for detecting high-quality noncoding RNAs and provided more consistent molecular subtype identification between replicates. Unfortunately, neither method produced reliable SNV data.
Conclusions. For FFPE specimens, the RNase H rRNA depletion method performed better than Ribo-Zero. Neither method generated data sufficient for SNV detection.
Year: 2016 PMID: 27774452 PMCID: PMC5059559 DOI: 10.1155/2016/9837310
Source DB: PubMed Journal: Int J Genomics ISSN: 2314-436X Impact factor: 2.326
Sample description and alignment statistics.
| ID | Library | Raw data: total reads | Raw data: BQ | Raw data: GC | Alignment: CR | Alignment: non-CR | Alignment: CR MQ | Alignment: non-CR MQ |
|---|---|---|---|---|---|---|---|---|
| 1 | Ribo-Zero | 17.8 M | 31 | 71.4% | 27.7% | 72.3% | 32 | 47 |
| 2 | Ribo-Zero | 16.1 M | 30 | 76.8% | 31.4% | 68.6% | 23 | 47 |
| 3 | Ribo-Zero | 16.8 M | 31 | 70.7% | 31.6% | 68.4% | 28 | 46 |
| 4 | Ribo-Zero | 14.0 M | 31 | 72.3% | 46.0% | 54.0% | 34 | 47 |
| 1 | Ribo-Zero | 16.1 M | 31 | 70.4% | 27.4% | 72.6% | 31 | 47 |
| 2 | Ribo-Zero | 14.9 M | 31 | 75.4% | 37.9% | 62.1% | 21 | 47 |
| 3 | Ribo-Zero | 17.6 M | 31 | 71.1% | 29.8% | 70.2% | 29 | 47 |
| 4 | Ribo-Zero | 15.6 M | 31 | 70.0% | 46.1% | 53.9% | 36 | 47 |
| 1 | RNase H | 20.2 M | 35 | 51.6% | 79.9% | 20.1% | 46 | 33 |
| 2 | RNase H | 20.5 M | 36 | 39.8% | 42.6% | 57.4% | 45 | 41 |
| 3 | RNase H | 20.4 M | 35 | 51.6% | 78.9% | 21.1% | 45 | 33 |
| 4 | RNase H | 21.4 M | 35 | 48.6% | 58.0% | 42.0% | 45 | 37 |
| 1 | RNase H | 22.1 M | 35 | 52.5% | 80.3% | 19.7% | 46 | 35 |
| 2 | RNase H | 22.4 M | 34 | 55.1% | 74.7% | 25.3% | 44 | 33 |
| 3 | RNase H | 20.6 M | 35 | 52.0% | 78.5% | 21.5% | 45 | 31 |
| 4 | RNase H | 24.0 M | 34 | 53.6% | 80.1% | 19.9% | 45 | 30 |
CR: coding region; BQ: base quality; MQ: mapping quality; GC: GC content.
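The contrast between the two libraries is easier to see when the table is averaged per depletion method. The following is a minimal sketch that does this for the coding-region (CR) read percentage. The values are copied from the table above; the `ROWS` layout and the `mean_cr_by_library` helper are illustrative names, not part of the original study.

```python
from collections import defaultdict
from statistics import mean

# (sample ID, library, total reads in millions, CR %, non-CR %), copied from the table.
ROWS = [
    (1, "Ribo-Zero", 17.8, 27.7, 72.3),
    (2, "Ribo-Zero", 16.1, 31.4, 68.6),
    (3, "Ribo-Zero", 16.8, 31.6, 68.4),
    (4, "Ribo-Zero", 14.0, 46.0, 54.0),
    (1, "Ribo-Zero", 16.1, 27.4, 72.6),
    (2, "Ribo-Zero", 14.9, 37.9, 62.1),
    (3, "Ribo-Zero", 17.6, 29.8, 70.2),
    (4, "Ribo-Zero", 15.6, 46.1, 53.9),
    (1, "RNase H", 20.2, 79.9, 20.1),
    (2, "RNase H", 20.5, 42.6, 57.4),
    (3, "RNase H", 20.4, 78.9, 21.1),
    (4, "RNase H", 21.4, 58.0, 42.0),
    (1, "RNase H", 22.1, 80.3, 19.7),
    (2, "RNase H", 22.4, 74.7, 25.3),
    (3, "RNase H", 20.6, 78.5, 21.5),
    (4, "RNase H", 24.0, 80.1, 19.9),
]


def mean_cr_by_library(rows):
    """Return {library: mean CR percentage} (hypothetical helper)."""
    grouped = defaultdict(list)
    for _sample_id, library, _reads, cr_pct, _non_cr_pct in rows:
        grouped[library].append(cr_pct)
    return {library: mean(values) for library, values in grouped.items()}


if __name__ == "__main__":
    for library, value in mean_cr_by_library(ROWS).items():
        print(f"{library}: mean CR = {value:.1f}%")
```

On these values the RNase H libraries average about 71.6% coding-region reads, against about 34.7% for Ribo-Zero.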
Figure 1. (a) Unsupervised clustering using all detected RNAs. Samples clustered first by replicate, then by rRNA depletion method. (b) Pairwise Spearman correlation heatmap between all samples. Ribo-Zero produced higher correlation between replicates than RNase H. Samples RN4 and RN4r showed lower correlations with the other samples than other random pairs did. This could result from variation in the sample or from variation introduced by the RNase H kit.
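As an illustration of the pairwise Spearman correlation behind panel (b), the sketch below ranks each expression profile (averaging tied ranks) and computes the Pearson correlation of the ranks. It is not the authors' pipeline: the sample names and values are made up, and `average_ranks`, `spearman`, and `pairwise_spearman` are hypothetical helpers.

```python
import math
from itertools import combinations


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed inputs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def pairwise_spearman(profiles):
    """Return {(sample_a, sample_b): rho} for every pair of samples."""
    return {
        (a, b): spearman(profiles[a], profiles[b])
        for a, b in combinations(profiles, 2)
    }


if __name__ == "__main__":
    # Made-up expression values for four genes in three hypothetical samples.
    toy_profiles = {
        "S1": [5.0, 1.0, 3.0, 8.0],
        "S1r": [6.0, 2.0, 2.5, 9.0],
        "S2": [1.0, 7.0, 4.0, 2.0],
    }
    for pair, rho in pairwise_spearman(toy_profiles).items():
        print(pair, round(rho, 3))
```

Because the statistic depends only on ranks, it is insensitive to monotone differences in scale between libraries, which is one reason it suits comparisons across depletion kits.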
Figure 2. TNBC subtype results from TNBCtype. RNase H samples produced better TNBC subtype consistency than Ribo-Zero samples.
Figure 3. Spearman's correlation coefficients between RNAseq data and NanoString data. The Ribo-Zero samples produced slightly higher correlation with NanoString data than the RNase H samples.
Figure 4. RNAs detected at normalized read count thresholds of > 0, 2, 5, and 10. (a) Protein coding RNA. (b) lncRNA. (c) Enhancer RNA. (d) Super enhancer RNA. At lower thresholds (more noise), Ribo-Zero samples detected more RNAs. At higher thresholds (more reliability), the RNase H method detected more RNAs.
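The threshold-based detection count in this figure can be sketched as follows. This is a minimal illustration that assumes an RNA counts as detected when its normalized read count is strictly greater than the threshold. The gene names and counts are invented, and `count_detected` is a hypothetical helper.

```python
def count_detected(normalized_counts, thresholds=(0, 2, 5, 10)):
    """Return {threshold: number of RNAs with normalized count > threshold}."""
    return {
        threshold: sum(1 for value in normalized_counts.values() if value > threshold)
        for threshold in thresholds
    }


if __name__ == "__main__":
    # Invented normalized read counts for one hypothetical sample.
    toy_counts = {"geneA": 0.0, "geneB": 1.5, "geneC": 3.0, "geneD": 7.2, "geneE": 12.0}
    for threshold, detected in count_detected(toy_counts).items():
        print(f"count > {threshold}: {detected} RNAs detected")
```

Raising the threshold can only shrink the detected set, so a library that leads at low thresholds but trails at high ones is detecting many RNAs at low, noisier counts.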
Figure 5. A callable site is defined as a genomic position with depth of coverage ≥ 20. The number of callable sites indicates the number of genomic positions that are suitable for SNV inference. RNase H had substantially more callable sites than Ribo-Zero. The percentage difference in callable sites is substantially larger than the percentage difference in the number of total reads sequenced by the two kits. The y-axis is plotted on a log10 scale.
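The callable-site definition and the percentage comparison in this caption can be sketched as below. The per-position depths are invented, and `count_callable` and `relative_difference` are hypothetical helpers. The only figures taken from the source are the depth cutoff of 20 and the mean total reads per library (16.1125 M for Ribo-Zero and 21.45 M for RNase H, computed here from the table's total reads column).

```python
def count_callable(depths, min_depth=20):
    """Count positions whose depth of coverage is at least min_depth."""
    return sum(1 for depth in depths if depth >= min_depth)


def relative_difference(value, baseline):
    """Percentage by which value exceeds baseline."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (value - baseline) / baseline * 100.0


if __name__ == "__main__":
    # Invented per-position depths for two hypothetical libraries.
    ribo_zero_depths = [3, 8, 19, 20, 25, 4, 0, 12]
    rnase_h_depths = [22, 40, 19, 31, 60, 27, 5, 20]

    ribo_zero_sites = count_callable(ribo_zero_depths)
    rnase_h_sites = count_callable(rnase_h_depths)
    site_gap = relative_difference(rnase_h_sites, ribo_zero_sites)

    # Mean total reads per library (millions), computed from the table.
    read_gap = relative_difference(21.45, 16.1125)

    print(f"callable sites: {ribo_zero_sites} vs {rnase_h_sites} ({site_gap:.0f}% more)")
    print(f"total reads: {read_gap:.1f}% more")
```

Comparing the two percentages is what separates a sequencing-depth effect from a library-quality effect: the table's read totals differ by roughly 33%, so a much larger gap in callable sites cannot be explained by read count alone.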