Danyi Wang1, P Alexander Rolfe2, Dorothee Foernzler1, Dennis O'Rourke1, Sheng Zhao3, Juergen Scheuenpflug4, Zheng Feng1.
Abstract
Objective: RNA extraction and library preparation from formalin-fixed, paraffin-embedded (FFPE) samples are crucial pre-analytical steps towards achieving optimal downstream RNA sequencing (RNASeq) results. In this study, we assessed 2 Illumina library preparation methods for RNA-Seq analysis using archived FFPE samples from human cancer indications at 2 independent vendors.
Keywords: Gene expression; RNASeq; breast cancer; colorectal cancer; hepatocellular carcinoma; non-small cell lung cancer; renal cancer
Year: 2022 PMID: 35138205 PMCID: PMC8832632 DOI: 10.1177/15330338221076304
Source DB: PubMed Journal: Technol Cancer Res Treat ISSN: 1533-0338
Figure 1. Study design & workflow. Schematic of the sample flow through the 2 vendors and 2 protocols. The number of samples processed at each step is noted.
Illumina TruSeq RNA Access Versus TruSeq Stranded Total RNA: Overall QC and Alignment Stats.
| Step | QC measure | Vendor A: Total Stranded | Vendor A: RNA Access | Vendor B: Total Stranded | Vendor B: RNA Access |
|---|---|---|---|---|---|
| Library Prep | Average Fragment size: Mean (Range) | 318 | 309.83 | 296.63 | 324.5 |
| | Concentration (nM): Mean (Range) | 57.43 | 51.24 | 224.57 | 205.81 |
| Sequencing | Total paired end reads: Mean (Range) | 137 M | 141 M | 64.7 M | 184 M |
| | %Aligned reads rate: Mean ± SD | 89.6 ± 10.8 | 95.2 ± 1.0 | 86.3 ± 7.8 | 92.7 ± 1.5 |
| | %Exonic rate: Mean ± SD | 18.5 ± 4.7 | 81.0 ± 2.3 | 41.6 ± 13.2 | 84 ± 2.4 |
| | %Intragenic rate: Mean ± SD | 83.1 ± 6.8 | 89.1 ± 2.3 | 81.5 ± 17.6 | 92.1 ± 1.9 |
| | %rRNA rate: Mean ± SD | 2.0 ± 1.9 | 2.2 ± 1.6 | 9.9 ± 1.83 | 1.8 ± 2.2 |
| | % Correct strand reads rate: Mean ± SD | 94.1 ± 3.3 | 95.9 ± 2.3 | 97.5 ± 1.3 | 97.8 ± 1.5 |
2 × 50 bp paired end sequencing at minimum 100 million reads per sample.
2 × 50 bp paired end sequencing at minimum 50 million reads per sample.
Figure 3. Q–Q plots. The Q–Q plots help visualize how the TPM distributions from the 2 kits compare. Here, the data for sample FFPE_766 are shown from both vendors. At both vendors, most of the plot follows a straight diagonal line, indicating an identical distribution of TPMs for most percentiles. However, the highest percentiles diverge, and the TruSeq Stranded Total RNA kit shows higher levels than the TruSeq RNA Access kit. The plot should not be interpreted to mean that either kit is necessarily correct; only that the highest expressed genes in the TruSeq Stranded Total RNA kit yield higher TPM values than the highest expressed genes in the TruSeq RNA Access kit. The plots also show divergence at very low expression values, potentially reflecting genes that are not present in the Access probe set and thus generate no signal in the TruSeq RNA Access results while generating some signal in the TruSeq Stranded Total RNA results.
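The points of a Q–Q plot like the one described above can be sketched as matched percentiles of two TPM vectors. This is only an illustration under stated assumptions: `qq_points` is a hypothetical helper, the input lists are synthetic placeholders rather than study data, and the source does not say how its plots were actually generated.

```python
from statistics import quantiles

def qq_points(tpm_kit_1, tpm_kit_2, n=100):
    """Return (kit_1, kit_2) pairs of matched quantiles.

    Hypothetical helper: each pair holds the same percentile cut point
    taken from each kit's TPM distribution. Points on the diagonal mean
    the two distributions agree at that percentile.
    """
    q1 = quantiles(tpm_kit_1, n=n, method="inclusive")
    q2 = quantiles(tpm_kit_2, n=n, method="inclusive")
    return list(zip(q1, q2))

# Synthetic placeholder values (not study data): the same set of values
# in a different gene order, so every point lies on the diagonal.
kit_1 = [float(v) for v in range(1, 1001)]
kit_2 = list(reversed(kit_1))
points = qq_points(kit_1, kit_2)
```

Because only sorted values are compared, a Q–Q plot says nothing about whether the same gene has similar TPMs in both kits; that question is addressed by the per-gene correlations.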
Figure 2. Example cross-vendor scatterplots. The overall correlation between vendors ranged from excellent (eg, A: FFPE_1582 in both TruSeq Stranded Total RNA [R = 0.873, rho = 0.927] and TruSeq RNA Access kit [R = 0.858, rho = 0.927]) to moderate (eg, B: FFPE_1579 in both TruSeq Stranded Total RNA [R = 0.012, rho = 0.760] and TruSeq RNA Access kit [R = 0.131, rho = 0.869]).
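The caption reports both Pearson's R and Spearman's rho, and for FFPE_1579 the two differ widely. A minimal sketch of why that can happen: Pearson's R is computed on raw values and can be dominated by a single extreme gene, while Spearman's rho is Pearson's R computed on ranks. The helper names and the data below are assumptions for illustration (synthetic numbers, not study data), and the source does not state which software computed its correlations.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))

# Synthetic expression values (not study data): the two "vendors" agree
# on gene ordering almost everywhere, but one gene is extreme at vendor B.
vendor_a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
vendor_b = [1, 2, 3, 4, 5, 6, 7, 8, 10000, 9]
r = pearson(vendor_a, vendor_b)
rho = spearman(vendor_a, vendor_b)
```

In this toy case the rank-based rho stays close to 1 while R drops well below it, which is the same direction of disagreement seen in the moderate example.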
Spearman Correlations Between Library QC Factors (Total RNA Extracted, Library Concentration) and the Spearman Correlation Between Vendors of the Eventual Gene-Level Quantification.
| Kit | Predictive factor | Spearman's rho | P-value | Bonferroni adjusted P-value |
|---|---|---|---|---|
| RNA Access | Vendor A, µg RNA | 0.421 | 1.192×10⁻¹ | 3.577×10⁻¹ |
| RNA Access | Vendor B, µg RNA | 0.481 | 6.965×10⁻² | 2.786×10⁻¹ |
| RNA Access | Vendor A, library conc. | 0.732 | 2.733×10⁻³ | 1.367×10⁻² |
| RNA Access | Vendor B, library conc. | 0.812 | 2.329×10⁻⁴ | 1.630×10⁻³ |
| TruSeq Total Stranded | Vendor A, µg RNA | 0.264 | 3.401×10⁻¹ | 4.010×10⁻¹ |
| TruSeq Total Stranded | Vendor B, µg RNA | 0.35 | 2.005×10⁻¹ | 4.010×10⁻¹ |
| TruSeq Total Stranded | Vendor A, library conc. | 0.964 | <1×10⁻¹⁰ | <1×10⁻¹⁰ |
| TruSeq Total Stranded | Vendor B, library conc. | 0.821 | 2.578×10⁻⁴ | 1.630×10⁻³ |
This analysis uses cross-vendor correlation as a proxy for the quality of the result and examines which QC factors might predict that quality. For each row, a P-value and a Bonferroni adjusted P-value (adjusted over all rows) are provided. While the quantity of RNA is never significantly associated with the eventual data quality, the library concentration is significantly associated with downstream data quality for both kits, regardless of which vendor's library concentration is used.
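As a rough check on the arithmetic, the adjusted values in the table match what a step-down (Holm) form of the Bonferroni correction gives over the 8 rows. This is an inference from the numbers, not something the text states. In the sketch below, `holm_adjust` is a hypothetical helper, and `1e-10` stands in for the entry reported only as an upper bound.

```python
def holm_adjust(p_values):
    """Holm step-down adjustment of a list of P-values.

    The k-th smallest P-value (k starting at 0) is multiplied by (m - k),
    capped at 1, and forced to be at least as large as the previous
    adjusted value so the adjusted sequence stays monotone.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted

# Raw P-values in table row order. The seventh entry is reported only as
# "<1e-10", so 1e-10 is an assumed placeholder, not a reported value.
raw_p = [1.192e-1, 6.965e-2, 2.733e-3, 2.329e-4,
         3.401e-1, 2.005e-1, 1e-10, 2.578e-4]
adjusted_p = holm_adjust(raw_p)
```

Under this scheme the two library-concentration rows with raw P-values near 2×10⁻⁴ end up sharing one adjusted value, and the two largest raw P-values also share one, which is the pattern the table shows.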
Figure 4. Cross-vendor correlation versus library concentration. Examination of the cross-vendor correlations compared to various common QC statistics suggested that the library concentration was most informative in predicting the cross-vendor correlation. Plotted here are the cross-vendor correlation values versus library concentration. For the TruSeq Stranded Total RNA kit, there is a trend of increasing (though perhaps nonlinear) correlation as library concentration increases. For the TruSeq RNA Access kit, results appear notably better at library concentrations above 50 nM.