Adriana Alberti, Caroline Belser, Stéfan Engelen, Laurie Bertrand, Céline Orvain, Laura Brinas, Corinne Cruaud, Laurène Giraut, Corinne Da Silva, Cyril Firmo, Jean-Marc Aury, Patrick Wincker.
Abstract
BACKGROUND: Metatranscriptomics is rapidly expanding our knowledge of gene expression patterns and pathway dynamics in natural microbial communities. However, to cope with the challenges of environmental sampling, various rRNA removal and cDNA synthesis methods have been applied in published microbial metatranscriptomic studies, making comparisons arduous. Whereas efficiency and biases introduced by rRNA removal methods have been relatively well explored, the impact of cDNA synthesis and library preparation on transcript abundance remains poorly characterized. The evaluation of potential biases introduced at this step is challenging for metatranscriptomic samples, where data analyses are complex, for example because of the lack of reference genomes.
Year: 2014 PMID: 25331572 PMCID: PMC4213505 DOI: 10.1186/1471-2164-15-912
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Summary of the principal characteristics of the RNA-Seq library preparation kits evaluated in this study
| TruSeq stranded | Encore complete | Ovation RNA-Seq V2 | SMARTer stranded |
|---|---|---|---|
| 100 ng depleted RNA | 100 ng total RNA | 0.5 ng depleted RNA | 1 ng depleted RNA |
| FFPE | RIN >7 | FFPE | FFPE |
| Yes | No | Yes | Yes |
| Random primers | Selective priming | Random and oligo(dT) primers | Random primers |
| RNA by divalent cations + heat | cDNA by Covaris shearing | cDNA by Covaris shearing | RNA by heat |
| Yes | Yes | No | Yes |
| Included | Included | Not included | Included |
| 96-plex | 16-plex | According to the library preparation method chosen | 12-plex |
| 6 hours | 7 hours | 4.5 hours for cDNA synthesis + time for library preparation | 4.5 hours |
FFPE: Formalin-fixed, paraffin-embedded tissue.
RIN: RNA integrity number.
Sequences and mapping statistics
| Library name | Raw reads (millions), rep. 1 | Raw reads (millions), rep. 2 | % rRNA (a), rep. 1 | % rRNA (a), rep. 2 | Cleaned reads (b) (millions), rep. 1 | Cleaned reads (b) (millions), rep. 2 | % mapped reads (c), rep. 1 | % mapped reads (c), rep. 2 | % duplication rate (d), rep. 1 | % duplication rate (d), rep. 2 |
|---|---|---|---|---|---|---|---|---|---|---|
| | 0.724 | 0.747 | 0.14 | 0.23 | 0.720 | 0.743 | 77.2 | 72.3 | 2.19 | 5.84 |
| | 0.764 | 0.739 | 36.6 | 37.96 | 0.481 | 0.454 | 67.1 | 51.2 | 5.52 | 13 |
| | 0.809 | 0.792 | 0.2 | 0.39 | 0.802 | 0.784 | 65.3 | 61 | 2.3 | 2.45 |
| | 0.904 | 0.697 | 2.9 | 2.88 | 0.837 | 0.650 | 62.9 | 63.7 | 8.76 | 7.62 |
| | 22.161 | 19.845 | 95.6 | 95.4 | 0.960 | 0.885 | 71.2 | 71.4 | 8.47 | 6.92 |
| | 21.989 | 17.513 | 71 | 78.5 | 6.302 | 3.676 | 73 | 74.4 | 1.01 | 0.57 |
| | 24.806 | 18.798 | 95 | 95.1 | 1.146 | 0.831 | 51.5 | 51.9 | 54 | 61.6 |
| | 0.754 | 0.769 | 0.21 | 2.27 | 0.748 | 0.747 | 73.1 | 72.1 | 1.23 | 1.59 |
| | 0.749 | 0.732 | 37 | 36.61 | 0.470 | 0.461 | 66.5 | 66 | 7.4 | 5.58 |
| | 0.864 | 1.177 | 0.9 | 0.6 | 0.849 | 1.161 | 55.1 | 54.7 | 0.93 | 0.85 |
| | 0.824 | 0.726 | 0.77 | 0.96 | 0.809 | 0.702 | 65.7 | 66.9 | 6.5 | 7.03 |
| | 20.257 | 30.693 | 93 | 94.1 | 1.393 | 1.775 | 70.4 | 71.1 | 3.24 | 3.28 |
| | 19.510 | 17.984 | 84 | 82 | 2.983 | 3.215 | 75.2 | 70.4 | 1.78 | 0.39 |
| | 14.699 | 18.603 | 93.5 | 93.5 | 0.915 | 1.160 | 63.9 | 61.9 | 23 | 39 |
_L.L: library prepared from L. lactis depleted RNA.
_L.L control: library prepared from L. lactis total RNA.
_MIX: library prepared from the MIX depleted RNA.
_MIX control: library prepared from the MIX total RNA.
a: proportion of rRNA reads detected in the raw reads.
b: number of sequences remaining after the data quality control pipeline was applied to the raw reads.
c: proportion of cleaned reads uniquely mapped on CDS sequences.
d: duplication rate estimated on 100,000 cleaned reads.
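The duplication rate in the last footnote is estimated on a fixed-size sample of cleaned reads. A minimal Python sketch of such an estimate is given here; it assumes that a duplicate means an exact repeat of a read sequence already present in the sample, which this record does not state. The function name `duplication_rate` and its parameters are hypothetical and are not taken from the authors' pipeline.

```python
import random


def duplication_rate(reads, sample_size=100_000, seed=0):
    """Percentage of sampled reads that repeat a sequence already in the sample.

    Assumption: duplicates are exact sequence matches. The study may have
    used a different definition (for example, identical mapping positions).
    """
    rng = random.Random(seed)
    # Subsample only when more reads are available than requested.
    if len(reads) > sample_size:
        sample = rng.sample(reads, sample_size)
    else:
        sample = list(reads)
    if not sample:
        return 0.0
    unique = len(set(sample))
    return 100.0 * (len(sample) - unique) / len(sample)


if __name__ == "__main__":
    toy_reads = ["ACGT", "ACGT", "TTTT", "GGGG"]
    print(duplication_rate(toy_reads))
```

Fixing the sample size keeps the estimate comparable across libraries of very different depths, such as the depleted and total RNA libraries in this table.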
Figure 1. CDS coverage and GC content. (a), (b), (c): box plot distributions of the CDS read counts normalized by the total read count for three categories of GC content (<40%, 40-50% and >50% GC) and for each MIX library. (d): distribution of the cumulative coverage along the length of the annotated CDS for MIX libraries (in 5′ to 3′ orientation).
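The grouping behind panels (a) to (c) can be sketched in Python as follows. This is only an illustration under stated assumptions: the handling of the exact 40% and 50% boundaries is a guess, "total read count" is taken to be the sum of the per-CDS counts, and all function names are hypothetical.

```python
def gc_percent(seq):
    """GC content of a nucleotide sequence, as a percentage."""
    seq = seq.upper()
    if not seq:
        return 0.0
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


def gc_category(seq):
    """Assign a CDS to one of the three GC classes used in the figure.

    Assumption: 40% and 50% exactly both fall in the middle class.
    """
    pct = gc_percent(seq)
    if pct < 40:
        return "<40%"
    if pct <= 50:
        return "40-50%"
    return ">50%"


def normalized_counts(read_counts):
    """Divide each CDS read count by the total read count of the library.

    Assumption: the total is the sum of the counts passed in.
    """
    total = sum(read_counts.values())
    return {cds: n / total for cds, n in read_counts.items()}


if __name__ == "__main__":
    print(gc_category("ATGCATGC"))
    print(normalized_counts({"cds1": 1, "cds2": 3}))
```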
Pearson correlation coefficients
|
| Replicatesb | Other methods vs TSc | Depleted RNA vs total RNAd | ||
|---|---|---|---|---|---|
|
| TS | 0.992 | 0.886 | 0.913 | |
| ENC | 0.985 | 0.736 | 0.570 | ||
| OV | 0.974 | 0.995 | 0.696 | 0.971 | |
| SMART | 0.753 | 0.995 | 0.805 | 0.9 | |
|
| TS | 0.958 | 0.884 | ||
| ENC | 0.980 | 0.498 | |||
| OV | 0.998 | 0.589 | 0.911 | ||
| SMART | 0.989 | 0.772 | 0.924 |
Pearson correlation coefficients between:
a: the L. lactis library and the MIX library for each method.
b: the two replicates for each experiment.
c: ENC, OV or SMART L. lactis libraries and TS L. lactis libraries.
d: the depleted RNA library and the total RNA library for each method.
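For reference, a Pearson correlation coefficient between two expression vectors (for example, per-CDS values from two replicates) can be computed as in the Python sketch below. Which expression measure and transformation the authors used for this table is not stated in this record, so the inputs here are simply two equal-length lists of numbers.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of centered cross-products and squares.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    replicate_1 = [10.0, 200.0, 35.0, 0.5]
    replicate_2 = [12.0, 180.0, 40.0, 0.7]
    print(round(pearson(replicate_1, replicate_2), 3))
```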
Proportions of genes detected as differentially expressed
| Method | L.L vs TS_L.L (a) | MIX vs TS_MIX (b) | L.L depleted RNA vs L.L total RNA (c) | MIX depleted RNA vs MIX total RNA (d) |
|---|---|---|---|---|
| TS | | | 6.2 | 4.7 |
| ENC | 30.8 | 32.6 | | |
| OV | 27.6 | 35.4 | 8 | 14.5 |
| SMART | 2.2 | 9.2 | 21.3 | 1.1 |
Proportions of genes detected as differentially expressed between:
a: ENC, OV or SMART L. lactis library and TS L. lactis library.
b: ENC, OV or SMART MIX library and TS MIX library.
c: TS, OV or SMART L. lactis depleted RNA library and TS, OV or SMART L. lactis total RNA library.
d: TS, OV or SMART MIX depleted RNA library and TS, OV or SMART MIX total RNA library.
Figure 2. Gene expression profile of TS versus OV, SMART and ENC methods in MIX samples. Gene expression profile from TS_MIX versus (a) ENC_MIX, (b) SMART_MIX and (c) OV_MIX samples. This figure shows the log scatter plots and the coefficients of determination (R²) obtained by comparing FPKM values for 14,602 annotated CDS in the mix of bacteria.
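The two quantities named in this caption can be illustrated with a short Python sketch. FPKM follows its standard definition (fragments per kilobase of CDS per million mapped fragments). The R² here is the squared Pearson correlation of log10-transformed values; the pseudocount added before taking logs is an assumption, since the record does not say how zero values were handled. Function names are hypothetical.

```python
import math


def fpkm(fragments, cds_length_bp, total_mapped_fragments):
    """Fragments per kilobase of CDS per million mapped fragments."""
    return fragments * 1e9 / (cds_length_bp * total_mapped_fragments)


def log_r_squared(x, y, pseudocount=1.0):
    """Coefficient of determination between log10-transformed values.

    Assumption: a pseudocount is added so that zero FPKM values can be
    log-transformed; set pseudocount=0 if all values are positive.
    """
    lx = [math.log10(v + pseudocount) for v in x]
    ly = [math.log10(v + pseudocount) for v in y]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    var_x = sum((a - mean_x) ** 2 for a in lx)
    var_y = sum((b - mean_y) ** 2 for b in ly)
    r = cov / math.sqrt(var_x * var_y)
    return r * r


if __name__ == "__main__":
    print(fpkm(100, 1000, 1_000_000))
    print(log_r_squared([1, 10, 100], [2, 20, 200], pseudocount=0))
```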
Figure 3. Evaluation of the GC bias in the differential expression analysis. The most differentially expressed genes (at least 5-fold higher or lower) were extracted for ENC and OV MIX libraries. The proportions of CDS for different GC content intervals were plotted to evaluate the bias introduced by the cDNA synthesis method.
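The selection described in this caption can be sketched in Python as a fold-change filter followed by a tally of GC intervals. This is a rough illustration only: it treats the 5-fold criterion as a plain ratio of two expression values, skips genes with a zero value in either library, and uses 10-point GC intervals. None of these choices is confirmed by this record, and the function names are hypothetical.

```python
def strongly_changed(expr_a, expr_b, fold=5.0):
    """Gene IDs whose expression differs by at least `fold` in either direction.

    Assumption: genes with a non-positive value in either library are skipped.
    """
    selected = []
    for gene in expr_a.keys() & expr_b.keys():
        a, b = expr_a[gene], expr_b[gene]
        if a <= 0 or b <= 0:
            continue
        ratio = a / b
        if ratio >= fold or ratio <= 1 / fold:
            selected.append(gene)
    return sorted(selected)


def gc_interval_proportions(gc_by_gene, genes, width=10):
    """Share of the selected genes falling in each GC-content interval."""
    counts = {}
    for gene in genes:
        lower = int(gc_by_gene[gene] // width) * width
        label = f"{lower}-{lower + width}%"
        counts[label] = counts.get(label, 0) + 1
    total = len(genes)
    return {label: n / total for label, n in counts.items()}


if __name__ == "__main__":
    method = {"g1": 50, "g2": 10, "g3": 1, "g4": 0}
    reference = {"g1": 10, "g2": 10, "g3": 10, "g4": 5}
    hits = strongly_changed(method, reference)
    print(hits)
    print(gc_interval_proportions({"g1": 35.0, "g3": 62.5}, hits))
```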