Arran K Turnbull, Cigdem Selli, Carlos Martinez-Perez, Anu Fernando, Lorna Renshaw, Jane Keys, Jonine D Figueroa, Xiaping He, Maki Tanioka, Alison F Munro, Lee Murphy, Angie Fawkes, Richard Clark, Audrey Coutts, Charles M Perou, Lisa A Carey, J Michael Dixon, Andrew H Sims.
Abstract
BACKGROUND: High-throughput transcriptomics has matured into a well-established and widely used research tool over the last two decades. Clinical datasets generated on a range of different platforms continue to be deposited in public repositories, providing an ever-growing, valuable resource for reanalysis. Cost and tissue availability normally preclude processing samples across multiple technologies, making it challenging to directly evaluate performance and to determine whether data from different platforms can be reliably compared or integrated.
Keywords: FFPE; Fresh-frozen; Gene expression; Microarray; Sequencing; Transcriptomics
Year: 2020 PMID: 31992186 PMCID: PMC6988223 DOI: 10.1186/s12859-020-3365-5
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Comparison of traditional and new microarray platforms with sequencing approaches
| Technology | Platform | Biochemistry | Approx. throughput | Max. no. probes/primer pairs | No. of mapped ENSG IDs | Read depth | Input FFPE RNA (ng)* | Approx. cost per sample (£)** | Success rate of FF samples (n) | Success rate of FFPE samples (n) |
|---|---|---|---|---|---|---|---|---|---|---|
| 3′ RNA sequencing | Lexogen QuantSeq | RNA → RT, oligodT priming from 3′ end, random priming towards 3′ end → amplification and barcoding → sequencing | 96 samples per 5 days | 55,765 | 25,610 | 10 M | 500 | 90 | N/A | 98% (318) |
| 3′ RNA sequencing | QiaSeq UPX 3′ Transcriptome | RNA → RT, oligodT priming for cDNA synthesis → template switching for 2nd strand synthesis priming → fragmentation → end repair addition, adapter ligation → PCR to add indices → sequencing | 96 samples per 5 days | 42,553 | 20,000 | 15 M | 10 | 50 | N/A | 94% (48) |
| Specific Targeted Sequencing | BioSpyder TempO-Seq | RNA → annealed 50 bp detector oligos are ligated then amplified and barcoded → sequencing | 192 samples per 4 days | 19,300 | 19,300 | 12 M | 20 μm FFPE section | 160 | N/A | 95% (38) |
| Specific Targeted Sequencing | Ion Ampliseq Transcriptome | RNA → RT, multiplex PCR → sequence barcoding → emulsion PCR → sequencing of ~150 bp targets | 96 samples per 5 days | 20,802 | 19,059 | 8 M | 10 | 160 | 100% (108) | 76% (76) |
| Targeted Probes | Nanostring | RNA → hybridisation to fluorescent barcoded probes in solution → immobilised in nCounter cartridge → scan | 12 samples per day (800 genes) | 800 | 800 | N/A | 50 | 250 | N/A | 100% (12) |
| Newer Microarray | Affymetrix Clariom S | RNA → cRNA amplification → hybridisation to GeneChip → scan | 192 samples per 4 days | 211,300 | > 20,000 | N/A | 50 | 100 | 100% (3) | 100% (8) |
| Traditional Microarray | Affymetrix U133A | RNA → cRNA amplification → hybridisation to GeneChip → scan | 192 samples per day | 250,833 | 11,827 | N/A | 50 | 360 | 100% (178) | 100% (286) |
| Traditional Microarray | Illumina BeadChip HT-12 v3 / v4 | RNA → RT, amplification, biotinylation (NuGEN WT Ovation kit) → hybridisation to 50 bp probes on chip → scan | 96 samples per 1.5 days | 47,323 | 22,571 | N/A | 1500 | 195 | 91% (348) | 21% (206) |
| Full RNA Sequencing | RNA-seq | RNA → fragmentation → RT → barcoded library construction → genome-wide full RNA sequencing | 8 samples per 5 days | 20,025 | 18,571 | 136 M paired reads | 2000 | 250–500 | 100% (52) | 100% (87) |
*Input RNA reflects quantities used in this study – for input ranges refer to the manufacturer’s guidelines
**Estimated costs (£, UK December 2019) include library preparation and sequencing. Costs can vary by sample numbers and sequencing infrastructure
Fig. 1 Comparison of gene expression profiling approaches. a Schematic of probe/primer designs for each technology. A table showing which samples were processed on each technology is provided in Additional file 1: Table S1. b Number of overlapping Ensembl gene identifiers detected in each dataset (Nanostring and Affymetrix U133A were omitted as they do not represent the whole transcriptome, and the Clariom S was excluded as only three samples were processed). c Summary of FFPE sample processing success rates by sample age using whole-transcriptome platforms
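The identifier overlap summarised in Fig. 1b amounts to intersecting the sets of Ensembl gene IDs detected on each platform. The following is a minimal sketch of that idea only; the function name `common_gene_ids`, the platform labels and the identifiers are hypothetical placeholders, not values from the study.

```python
from functools import reduce


def common_gene_ids(platform_to_ids):
    """Return the sorted gene IDs that appear in every platform's ID list."""
    id_sets = [set(ids) for ids in platform_to_ids.values()]
    if not id_sets:
        return []
    return sorted(reduce(set.intersection, id_sets))


# Made-up identifiers for illustration; real inputs would be the Ensembl
# gene IDs mapped from each platform's probes or primer pairs.
toy_platforms = {
    "platform_A": ["ENSG_toy_1", "ENSG_toy_2", "ENSG_toy_3"],
    "platform_B": ["ENSG_toy_2", "ENSG_toy_3", "ENSG_toy_4"],
    "platform_C": ["ENSG_toy_3", "ENSG_toy_2"],
}

shared = common_gene_ids(toy_platforms)
print(len(shared), shared)
```

Restricting every dataset to the shared identifiers, in the same order, is what makes the later cross-platform correlations comparable gene by gene.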
Fig. 2 Batch correction allows robust direct integration of transcriptomic data across platforms. a Dissimilarity heatmaps based upon Pearson correlations ranging from 0.4 (red) through shades of orange and yellow to 1.0 (white). Left triangle shows the combined dataset of 6844 genes across 7 gene expression platforms. Right triangle shows the same data following batch correction with ComBat. Coloured bars below dendrograms denote the platform. b Enlargement of the dendrogram to demonstrate that the majority of the same time-point patient samples processed on different platforms cluster together following batch correction. c Scatter plots before (grey) and after (pink) batch correction of the same sample, either FF or FFPE, processed on different platforms. In each case the Pearson correlation increases substantially following batch correction. Patient samples are denoted -1 for pre-treatment, -2 for early on-treatment
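The before/after comparison in Fig. 2c can be illustrated with a toy calculation. The sketch below is not ComBat, which uses an empirical Bayes adjustment of per-gene location and scale; it substitutes the simplest possible stand-in, centring each gene within each platform, purely to show why removing a platform-specific offset raises the Pearson correlation of the same sample measured twice. All names and numbers are invented for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def center_genes_within_platform(samples):
    """Subtract each gene's platform-wide mean (a crude stand-in for ComBat).

    samples maps sample id -> list of expression values in a fixed gene order.
    """
    n_genes = len(next(iter(samples.values())))
    gene_means = [
        sum(values[g] for values in samples.values()) / len(samples)
        for g in range(n_genes)
    ]
    return {
        sid: [values[g] - gene_means[g] for g in range(n_genes)]
        for sid, values in samples.items()
    }


# Toy data: the same two patients on two hypothetical platforms. Platform B
# equals platform A plus a per-gene offset of [3, -2, 1, -3].
platform_a = {"P1": [5.0, 7.0, 6.0, 9.0], "P2": [6.0, 5.0, 8.0, 7.0]}
platform_b = {"P1": [8.0, 5.0, 7.0, 6.0], "P2": [9.0, 3.0, 9.0, 4.0]}

r_before = pearson(platform_a["P1"], platform_b["P1"])
r_after = pearson(
    center_genes_within_platform(platform_a)["P1"],
    center_genes_within_platform(platform_b)["P1"],
)
print(f"P1 across platforms: before={r_before:.3f}, after={r_after:.3f}")
```

In this constructed example the offset is purely additive and both platforms contain the same patients, so centring removes it completely. Real data have unbalanced batches and scale differences, which is the situation a dedicated method such as ComBat is designed for.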
Fig. 3 Robust gene expression measurement across platforms following batch correction. Correction of systematic platform bias and integration of data from fresh frozen and FFPE tissues. a 3D multi-dimensional scaling (MDS) before (left) and after (right) batch correction of 6844 common genes. Samples are coloured by platform and shapes indicate time point. b MDS plot of the batch-corrected data with samples coloured by time point clearly demonstrates a consistent treatment effect seen across sequential patient-matched samples. c Ultrasound measurements of the eleven breast tumours from which the sequential patient-matched samples were taken, indicating consistent reductions in tumour volume over time across the patients. d Ranking patient samples by the expression of 42 common proliferation genes (listed in Additional file 2: Table S2) illustrates consistent changes resulting from endocrine therapy, which appear to be independent of profiling platform. Pre-treatment samples tend to have relatively high proliferation, whilst, as expected, early and particularly late on-treatment samples have lower proliferation. Heatmap colours: red = high, green = low
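The ranking described for Fig. 3d can be sketched as scoring each sample by its average expression over a gene set and sorting by that score. The use of a simple mean, the gene names, the sample labels and the values below are all assumptions made for illustration; the study's 42 genes are listed in its Additional file 2 and are not reproduced here.

```python
def gene_set_score(expression, gene_set):
    """Mean expression over the genes in gene_set that the sample contains."""
    values = [expression[g] for g in gene_set if g in expression]
    if not values:
        raise ValueError("sample contains none of the genes in the set")
    return sum(values) / len(values)


def rank_samples(samples, gene_set):
    """Return sample ids ordered from highest to lowest gene-set score."""
    scores = {sid: gene_set_score(expr, gene_set) for sid, expr in samples.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical gene set and batch-corrected values for one patient sampled
# at three time points (pre-treatment, early and late on-treatment).
toy_gene_set = ["GENE_A", "GENE_B"]
toy_samples = {
    "pt1_pre": {"GENE_A": 2.0, "GENE_B": 1.0, "OTHER": 5.0},
    "pt1_early": {"GENE_A": 0.5, "GENE_B": 0.0, "OTHER": 4.0},
    "pt1_late": {"GENE_A": -1.0, "GENE_B": -0.5, "OTHER": 6.0},
}

ranking = rank_samples(toy_samples, toy_gene_set)
print(ranking)
```

Because the score uses only genes in the set, unrelated genes such as the placeholder `OTHER` do not affect the ordering.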