Fabiola Curion1, Adam E Handel2,3, Moustafa Attar1,4, Giuseppe Gallone5, Rory Bowden1, M Zameel Cader2,3, Michael B Clark6,7.
Abstract
RNA-seq is the standard method for profiling gene expression in many biological systems. Due to the wide dynamic range and complex nature of the transcriptome, RNA-seq provides an incomplete characterization, especially of lowly expressed genes and transcripts. Targeted RNA sequencing (RNA CaptureSeq) focuses sequencing on genes of interest, providing exquisite sensitivity for transcript detection and quantification. However, uses of CaptureSeq have focused on bulk samples and its performance on very small populations of cells is unknown. Here we show CaptureSeq greatly enhances transcriptomic profiling of target genes in ultra-low-input samples and provides equivalent performance to that on bulk samples. We validate the performance of CaptureSeq using multiple probe sets on samples of iPSC-derived cortical neurons. We demonstrate up to 275-fold enrichment for target genes, the detection of 10% additional genes and a greater than 5-fold increase in identified gene isoforms. Analysis of spike-in controls demonstrated CaptureSeq improved both detection sensitivity and expression quantification. Comparison to the CORTECON database of cerebral cortex development revealed CaptureSeq enhanced the identification of sample differentiation stage. CaptureSeq provides sensitive, reliable and quantitative expression measurements on hundreds-to-thousands of target genes from ultra-low-input samples and has the potential to greatly enhance transcriptomic profiling when samples are limiting.
Keywords: CaptureSeq; RNA-seq; gene expression; low-input sequencing; method; stem-cell-derived neurons; targeted RNA sequencing
Year: 2020 PMID: 32597303 PMCID: PMC7746246 DOI: 10.1080/15476286.2020.1777768
Source DB: PubMed Journal: RNA Biol ISSN: 1547-6286 Impact factor: 4.652
Figure 1. Mini-bulk CaptureSeq targets sequencing to genes of interest. (A) Schematic overview of project. Human induced pluripotent stem cells (hiPSCs) were differentiated into cortical neurons as described in Volpato et al. [23]. Neural progenitor cells were separated into bulk and mini-bulk plates at the final plating stage. Sequencing libraries were utilized for standard (pre-capture) sequencing and sequence capture. Sequence capture with the NG and TF probe sets was performed on both bulk and mini-bulk samples, with an additional low cDNA (150 ng) hybridization capture performed on a mini-bulk pool with the TF probe set. Post-capture samples were sequenced to evaluate capture performance between bulk and mini-bulk samples. (B) Proportion of on-target reads pre- and post-capture. On-target reads are those overlapped by capture probes. Proportions shown as box plots; error bars span minimum to maximum values. (C) Enrichment of each library by capture. Enrichment factor (EF) is the ratio of reads overlapping probe positions pre- and post-capture. Median EF reported. Box plot overlaid with violin plot displaying the EF ratio for each sample. Error bars span minimum to maximum values. (D) Genewise enrichment of coding genes targeted by NG capture in mini-bulk sample pool. Eighty-two genes were detected pre-capture; all were enriched post-capture. Dashed red line represents no enrichment. (E) Between-gene enrichment variability in bulk and mini-bulk captures. Enrichment CV shown for genes above and below the SLR expression cut-off. CV, coefficient of variation; SLR, segmental linear regression.
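The quantities named in the Figure 1 legend (on-target proportion, enrichment factor and coefficient of variation) can be illustrated with a minimal Python sketch. The legend does not give the exact formulas, so two assumptions are made here: EF is taken as the post-capture on-target proportion divided by the pre-capture on-target proportion, and CV as the sample standard deviation divided by the mean. All function names and read counts are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev


def on_target_fraction(on_target_reads, total_reads):
    """Proportion of reads that overlap capture probe positions."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return on_target_reads / total_reads


def enrichment_factor(pre_on, pre_total, post_on, post_total):
    """Assumed definition: post-capture on-target proportion divided by
    the pre-capture on-target proportion."""
    pre = on_target_fraction(pre_on, pre_total)
    post = on_target_fraction(post_on, post_total)
    if pre == 0:
        raise ValueError("no on-target reads pre-capture; EF is undefined")
    return post / pre


def coefficient_of_variation(values):
    """Assumed definition: sample standard deviation divided by the mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("mean is zero; CV is undefined")
    return stdev(values) / m


# Hypothetical read counts for one library, before and after capture.
ef = enrichment_factor(pre_on=10, pre_total=1000, post_on=500, post_total=1000)
# Hypothetical per-gene enrichment values.
cv = coefficient_of_variation([1.0, 2.0, 3.0])
```

Under these assumptions an EF of 1 corresponds to no enrichment, which matches the role of the dashed line in panel D.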
Figure 2. Improved gene detection and expression quantification with mini-bulk CaptureSeq. (A) Percentage of targeted genes enriched after capture. Black: genes detected pre-capture and enriched post-capture. Green: genes only detected post-capture and hence enriched above detection threshold by capture. Blue: genes detected pre-capture but not enriched post-capture. Grey: genes not detected pre- or post-capture. TF mini-bulk capture is TF850. (B) Expression variability of targeted genes between replicate pre- and post-capture NG mini-bulk samples. CV, coefficient of variation. (C and D) Quantification of ERCCs targeted for capture in pre- and post-capture mini-bulk libraries. TF850 capture (C), NG capture (D). Pre-capture quantification of ERCC shown in red, post-capture in teal. Compares known ERCC abundance (original concentration in attomoles/µl) to measured expression in CPKM. Mean and standard deviation plotted, n = 8 NG capture, n = 6 TF capture. Trend line, non-linear regression with a straight line fit.
Figure 3. Mini-bulk CaptureSeq allows comprehensive profiling of expressed gene isoforms. (A) Number of known and novel isoforms detected pre- and post-capture. (B) Percentage of detected isoforms in different classification classes pre- and post-capture. Isoform classes described in methods.
Figure 4. CaptureSeq enhances identification of sample differentiation stage. (A) Principal component analysis of bulk samples from different timepoints (D25, D55) and genotype status (CON and PS1). Panels show PC1 and PC2 of all genes expressed pre-capture (left, replicating [23]); pre-capture expression of genes targeted by TF capture (middle); and post-capture expression of TF capture targeted genes (right). (B) Overlap between genes associated with different developmental stages in CORTECON dataset and DE genes with higher expression at D25 or D55 in bulk TF post-capture samples. From earlier: Pluripotency (PP), Neural Differentiation (ND), Cortical Specification (CS); to later: Deep Layer Generation (DL), Upper Layer Generation (UL) stages. Left: All 597 TF DE genes, including genes which associate with multiple CORTECON stages. Right: TF DE genes (212) exclusive to one CORTECON stage. (C) Pearson correlations (all significant at p ≤ 0.001) between mini-bulk pre- (top) and post- (bottom) TF850 capture gene expression profiles and matched gene expression levels from each CORTECON timepoint. Colours represent the CORTECON stages with which genes are associated.
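The comparison in panel C, correlating a sample expression profile with matched gene expression at each reference timepoint, can be sketched in Python as follows. This is an illustration only: the gene names, timepoint labels and expression values are hypothetical, the function names are not from the study, and the legend does not say whether expression values were transformed (for example log-scaled) before correlation, so none is applied here.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(vx * vy)


def correlate_with_timepoints(sample, reference):
    """Correlate a sample profile (gene -> expression) with each reference
    timepoint (timepoint -> gene -> expression), using only shared genes."""
    result = {}
    for timepoint, profile in reference.items():
        genes = sorted(set(sample) & set(profile))
        result[timepoint] = pearson(
            [sample[g] for g in genes],
            [profile[g] for g in genes],
        )
    return result


# Hypothetical toy data: three genes and two reference timepoints.
sample = {"GENE_A": 1.0, "GENE_B": 2.0, "GENE_C": 3.0}
reference = {
    "early": {"GENE_A": 3.0, "GENE_B": 2.0, "GENE_C": 1.0},
    "late": {"GENE_A": 2.0, "GENE_B": 4.0, "GENE_C": 6.0},
}
correlations = correlate_with_timepoints(sample, reference)
best_match = max(correlations, key=correlations.get)
```

In this toy case the sample profile rises across the three genes, as the "late" reference does, so that timepoint has the highest correlation.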