Ravi Kumar, Yasunori Ichihashi, Seisuke Kimura, Daniel H Chitwood, Lauren R Headland, Jie Peng, Julin N Maloof, Neelima R Sinha.
Abstract
With the introduction of cost-effective, rapid, and superior-quality next-generation sequencing techniques, gene expression analysis has become viable for labs conducting small projects as well as large-scale gene expression analysis experiments. However, the available protocols for construction of RNA-sequencing (RNA-Seq) libraries are expensive and/or difficult to scale for high-throughput applications. Also, most protocols require isolated total RNA as a starting point. We provide a cost-effective RNA-Seq library synthesis protocol that is fast, starts with tissue, and is high-throughput from tissue to synthesized library. We have also designed and report a set of 96 unique barcodes for library adapters that are amenable to high-throughput sequencing by a large combination of multiplexing strategies. Our protocol has more power to detect differentially expressed genes than the standard Illumina protocol, probably owing to less technical variation amongst replicates. We also address the problem of gene-length biases affecting differential gene expression calls and demonstrate that such biases can be efficiently minimized during mRNA isolation for library preparation.
Keywords: Illumina; RNA-Seq; cDNA fragmentation; high-throughput; mRNA isolation; multiplexing; sequencing
Year: 2012 PMID: 22973283 PMCID: PMC3428589 DOI: 10.3389/fpls.2012.00202
Source DB: PubMed Journal: Front Plant Sci ISSN: 1664-462X Impact factor: 5.753
Figure 1. Outline of the high-throughput RNA-seq (HTR) library preparation. In short, frozen tissue samples are ground in the lysis buffer and mRNA is isolated from the lysate using oligo dT beads (1). The mRNA is used to make first and second strands of cDNA (2), and these double-stranded cDNA molecules are subsequently enzymatically fragmented (3). The ends of these molecules are repaired and an A nucleotide is added (4) to facilitate TA ligation of the barcoded adapters (5). The ligated samples are then enriched by amplification using adapter-specific primers (6) and purified for sequencing.
Figure 2. Quality control analysis for Illumina (IL) and high-throughput RNA-seq (HTR) library preparations. The quality control data from IL and HTR protocols using S. lycopersicum (SLY) and S. pennellii (SPE) are shown. (A) Per base sequence quality. The average of the four replicates is plotted. Error bars represent SD. (B) Sequence duplication levels. (C) Per sequence GC content. (D) Per base sequence content. In (C) and (D), the SPE and SLY of the HTR protocol are plotted in the top panel and the SPE and SLY of the IL protocol are plotted in the bottom panel. Graphs were made in R using ggplot2.
Figure 3. Read mapping for Illumina (IL) and high-throughput RNA-seq (HTR) library preparations. (A) Total number of reads. (B) Adapter contamination. (C) rRNA contamination. (D) Percentage of reads mapped. (E) Number of detected genes. The read mapping data from IL and HTR protocols using S. lycopersicum (SLY) and S. pennellii (SPE) are shown. Graphs were made in R using ggplot2. Error bars are ±SEM.
Figure 4. Detection of gene expression for Illumina (IL) and high-throughput RNA-seq (HTR) library preparations. (A) Read coverage is shown along the whole gene length. (B) Multidimensional scaling (MDS) plot for assessing the variation amongst samples. The graph was made using the edgeR package in R. (C) Venn diagram comparing IL and HTR protocols for differentially expressed genes (BH adjusted p-value < 0.01) between S. lycopersicum (SLY) and S. pennellii (SPE). The categories (a–h) are described in Table S5 in Supplementary Material. (D–G) Gene counts by gene length for IL and HTR protocols (D), for each category in (C) (E), for IL and HTR protocols using Sera-Mag beads (F), and for IL and HTR protocols with increasing amounts of Dynabeads (G). 0–25, 25–50, 50–75, and 75–100 are the four gene-length quartiles (genes were separated into quartiles based on percentile gene length). Graphs were made in R using ggplot2.
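The gene-length quartile grouping described for Figure 4 (D–G) can be illustrated with a minimal Python sketch. This is not the authors' code (their graphs were made in R); the function names `length_quartiles` and `count_by_quartile`, the input format, and the tie-breaking by gene name are assumptions made for illustration only. Genes are ranked by length, each is assigned to one of the four percentile bins, and a chosen gene set (for example, the genes called differentially expressed under one protocol) is then counted per bin.

```python
from collections import Counter

# Labels follow the four gene-length quartiles named in the figure caption.
QUARTILE_LABELS = ("0-25", "25-50", "50-75", "75-100")


def length_quartiles(gene_lengths):
    """Map each gene ID to a quartile label based on its length rank.

    gene_lengths: dict of gene ID -> length (hypothetical input format).
    Ties in length are broken by gene ID so the result is deterministic.
    """
    ranked = sorted(gene_lengths, key=lambda g: (gene_lengths[g], g))
    n = len(ranked)
    quartile_of = {}
    for rank, gene in enumerate(ranked):
        # rank / n is the percentile position in [0, 1); scale to a bin index 0..3.
        quartile_of[gene] = QUARTILE_LABELS[rank * 4 // n]
    return quartile_of


def count_by_quartile(genes, quartile_of):
    """Count how many of the given genes fall into each length quartile."""
    counts = Counter(quartile_of[g] for g in genes if g in quartile_of)
    return {label: counts.get(label, 0) for label in QUARTILE_LABELS}
```

If a protocol calls differentially expressed genes without length bias, the counts should be roughly even across the four bins. A count skewed toward the 75–100 bin would suggest that longer genes are being called preferentially.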