Wei Zhao, Xiaping He, Katherine A Hoadley, Joel S Parker, David Neil Hayes, Charles M Perou.
Abstract
BACKGROUND: RNA sequencing (RNA-Seq) is often used for transcriptome profiling as well as for the identification of novel transcripts and alternative splicing events. Typically, RNA-Seq libraries are prepared from total RNA using poly(A) enrichment of the mRNA (mRNA-Seq) to remove ribosomal RNA (rRNA); however, this method fails to capture non-poly(A) transcripts or partially degraded mRNAs. Hence, an mRNA-Seq protocol is not compatible with RNA from formalin-fixed, paraffin-embedded (FFPE) samples.
Year: 2014 PMID: 24888378 PMCID: PMC4070569 DOI: 10.1186/1471-2164-15-419
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Schematic overview of the rRNA removal protocols and list of samples tested. (A) mRNA-Seq, Ribo-Zero-Seq and DSN-Seq library preparation protocols are shown, with the key steps that remove rRNA from the library shown in italics. The full protocol was applied to the fresh-frozen (FF) samples, and a similar alternative protocol was applied to FFPE samples (omitting steps marked with *). (B) The list of samples tested by each RNA-Seq library protocol and their source.
Analysis of performance for multiple RNA-Seq methods
| Metric | mRNA-Seq | RiboZero-Seq | DSN-Seq | RiboZero-FFPE | DSN-FFPE |
|---|---|---|---|---|---|
| UNC dataset | | | | | |
| Sample size | 11 | 11 | 10 | 8 | 4 |
| % rRNA relative to mRNA-Seq | 1 (1-1) | 5.04 (1.42-8.66) | 116 (78.9-154) | 7.14 (3.48-10.8) | 585 (-347 to 1,517) |
| % Aligned bases | 94 (91.5-96.5) | 93.8 (92-95.5) | 85.5 (82.6-88.4) | 81.5 (71-92) | 93.5 (92.2-94.8) |
| Median CV coverage | 0.533 (0.506-0.56) | 0.525 (0.505-0.545) | 0.56 (0.549-0.57) | 0.744 (0.713-0.775) | 0.929 (0.814-1.04) |
| Median 5′ to 3′ bias | 0.27 (0.189-0.35) | 0.64 (0.493-0.788) | 0.209 (0.143-0.275) | 0.356 (0.285-0.427) | 0.242 (0.0329-0.451) |
| Pearson correlation to microarray | 0.851 (0.825-0.878) | 0.832 (0.809-0.854) | 0.855 (0.84-0.871) | 0.636 (0.601-0.671) | 0.7 (0.628-0.771) |
| TCGA dataset | | | | | |
| Sample size | 10 | 6 | NA | 18 | 10 |
| % rRNA relative to mRNA-Seq | 1 (1-1) | 11.2 (1.51-20.9) | NA | 0.935 (0.631-1.24) | 41.7 (22.1-61.3) |
| % Aligned bases | 96.4 (95.4-97.5) | 95.0 (93.9-96.2) | NA | 93.4 (91.6-95.2) | 93.2 (90.7-95.8) |
| Median CV coverage | 0.534 (0.517-0.551) | 0.478 (0.458-0.499) | NA | 0.83 (0.791-0.869) | 0.953 (0.896-1.01) |
| Median 5′ to 3′ bias | 0.309 (0.244-0.374) | 0.46 (0.37-0.551) | NA | 0.417 (0.253-0.581) | 0.157 (0.0856-0.229) |
Five analyses were performed to assess the capabilities of the different RNA-Seq protocols: 1) % rRNA relative to mRNA-Seq; 2) % aligned bases; 3) median CV coverage; 4) median 5′ to 3′ bias; 5) the Pearson correlation coefficient between each RNA-Seq library method and the same samples assayed by DNA microarray (UNC dataset only).
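As an illustration of two of the metrics listed above, the following Python sketch computes a median coefficient of variation (CV) of coverage and a Pearson correlation coefficient. It assumes the common definition of CV (standard deviation divided by mean of per-base coverage within a transcript, with the median taken across transcripts); this record does not spell out the exact computation used in the study, and the function names and toy inputs are hypothetical.

```python
import math
from statistics import mean, median, pstdev


def coverage_cv(per_base_coverage):
    """CV of coverage along one transcript: population std / mean.

    Assumption: this is the usual definition; the study's own pipeline
    may differ in detail (e.g. which transcripts are included).
    """
    mu = mean(per_base_coverage)
    if mu == 0:
        raise ValueError("mean coverage is zero; CV is undefined")
    return pstdev(per_base_coverage) / mu


def median_cv_coverage(transcripts):
    """Median of the per-transcript CVs for one library."""
    return median(coverage_cv(cov) for cov in transcripts)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length vectors."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant vector has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical per-base coverage for three transcripts in one library.
    library = [[10, 12, 11, 9], [5, 20, 5, 20], [8, 8, 8, 8]]
    print("median CV coverage:", median_cv_coverage(library))

    # Hypothetical log-scale expression for the same genes on two platforms.
    rnaseq = [2.1, 5.3, 7.8, 4.4, 9.0]
    array = [2.5, 5.0, 7.1, 4.9, 8.6]
    print("Pearson r:", pearson(rnaseq, array))
```

A lower median CV indicates more even coverage along transcripts, which is how the table values for the FF and FFPE libraries can be read.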
Figure 2. Genome alignment profiles. The percentage of nucleotide bases mapping to three different regions of the genome: exonic/protein coding and UTR (green), intronic (yellow), intergenic (red), and the percentage of unmapped bases (purple). The data are shown separately for the UNC (A) and TCGA (B) datasets.
Figure 3. Comparison of gene quantification concordance across RNA-Seq library protocols. Pearson correlation coefficients of RNA-Seq library pairs in the (A) UNC and (B) TCGA datasets. (C) Scatter plots of libraries for each pair of protocols for breast tumor sample 020578B. (D) Deming regression slope for pairs of RNA-Seq libraries in the UNC dataset. A slope of 1 indicates equivalent sensitivity of the two libraries, whereas a smaller value indicates higher sensitivity of the first method in the pair.
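The Deming regression slope in panel (D) can be sketched as follows. Unlike ordinary least squares, Deming regression allows measurement error in both variables. This is a minimal sketch of the standard closed-form slope, assuming an error-variance ratio `delta` of 1 (orthogonal regression); the ratio used in the study, and which protocol it placed on which axis, are not stated in this record, and the function name and example values are hypothetical.

```python
import math


def deming_slope(xs, ys, delta=1.0):
    """Closed-form Deming regression slope of y on x.

    delta is the ratio of the y-error variance to the x-error variance.
    delta=1.0 (an assumption here) gives orthogonal regression.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two vectors of equal length >= 2")
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    # Sample variances and covariance.
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    if sxy == 0:
        raise ValueError("zero covariance; slope is undefined")
    diff = syy - delta * sxx
    return (diff + math.sqrt(diff * diff + 4 * delta * sxy * sxy)) / (2 * sxy)


if __name__ == "__main__":
    # Hypothetical log-scale gene expression from two library protocols.
    protocol_a = [1.0, 2.0, 3.0, 4.0, 5.0]
    protocol_b = [1.1, 1.9, 3.2, 3.8, 5.1]
    print("Deming slope:", deming_slope(protocol_a, protocol_b))
```

With `delta=1.0` the fit is symmetric: swapping the two libraries gives the reciprocal slope, so a value near 1 means neither protocol systematically compresses or expands expression values relative to the other.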
Figure 4. Determination of the number of reads needed for each RNA-Seq protocol to equal a DNA microarray. The number of detected genes at different levels of sequencing depth is displayed relative to the number of genes detected via DNA microarray (dashed horizontal line).