Lorena Azevedo de Lima, Kristina Reinmets, Lars Keld Nielsen, Esteban Marcellin, Audrey Harris, Michael Köpke, Kaspar Valgepea.
Abstract
Transcriptome analysis via RNA sequencing (RNA-seq) has become a standard technique employed across various biological fields of study. The rapid adoption of the RNA-seq approach has been mediated, in part, by the development of different commercial RNA-seq library preparation kits compatible with standard next-generation sequencing (NGS) platforms. Generally, the essential steps of library preparation, such as rRNA depletion and first-strand cDNA synthesis, are tailored to a specific group of organisms (e.g., eukaryotes versus prokaryotes) or genomic GC content. Therefore, the selection of appropriate commercial products is crucial for capturing the transcriptome of interest as close to its native state as possible without introducing technical bias. However, researchers rarely have the resources and time to test various commercial RNA-seq kits for their samples. This work reports a side-by-side comparison of RNA-seq data from Clostridium autoethanogenum obtained using three commercial rRNA removal and strand-specific library construction products from NuGEN Technologies, Qiagen, and Zymo Research and assesses their performance relative to published data. While all three vendors advertise their products as suitable for prokaryotes, we found significant differences in their performance regarding rRNA removal, strand specificity, and, most importantly, transcript abundance distribution profiles. Notably, RNA-seq data obtained with Qiagen products were most similar to published data and delivered the best results in terms of library strandedness and transcript abundance distribution range. Our results highlight the importance of finding appropriate organism-specific workflows and library preparation products for RNA-seq studies.
IMPORTANCE RNA-seq is a powerful technique for transcriptome profiling, but it involves elaborate sample processing before library sequencing. We show that RNA-seq library preparation kits can strongly affect the outcome of an RNA-seq experiment. Although library preparation benefits from the availability of various commercial kits, choosing appropriate products for the specific samples can be challenging for new users or for users working with unconventional organisms. Evaluating the performance of different commercial products requires investments of money and time that are infeasible for most researchers. Therefore, users are often guided in their choice of kits by published data involving similar input samples. We conclude that careful consideration should be given to selecting sample processing workflows for any given organism.
Keywords: RNA sequencing; RNA-seq; acetogen; transcriptome profiling; transcriptomics
Year: 2022 PMID: 35894617 PMCID: PMC9431689 DOI: 10.1128/spectrum.02303-22
Source DB: PubMed Journal: Microbiol Spectr ISSN: 2165-0497
FIG 1 RNA-seq results were strongly affected by rRNA removal and library construction kits. (A) Experimental design of the work. (B) PCA of transcript abundances. (C) Hierarchical clustering of individual transcript abundances. (D) Spearman’s correlation analysis of transcript abundances. (E) Probability density plots of transcript abundances. The reference data set refers to high-BC samples in GEO accession number GSE90792. REF, reference data set. rRNA transcript abundances were removed prior to data analysis to avoid bias from varied efficiencies of rRNA removal between kits.
General statistics of RNA-seq results of the three tested kits for rRNA removal and library construction
| Kit | Sample name | No. of raw reads | Reads mapped (RPKM) | Coverage (fold) | Feature counts/mapped counts | Strandedness (sense) | Strandedness (antisense) | rRNA RPKM/total RPKM |
|---|---|---|---|---|---|---|---|---|
| NuGEN | NuGEN_S1 | 4,268,524 | 98% | 72 | 59% | 64% | 36% | 7% |
| NuGEN | NuGEN_S2 | 6,004,018 | 99% | 103 | 47% | 50% | 50% | 2% |
| NuGEN | NuGEN_S3 | 3,615,864 | 99% | 62 | 52% | 56% | 44% | 4% |
| NuGEN | NuGEN_S4 | 4,248,288 | 98% | 72 | 63% | 61% | 39% | 15% |
| Qiagen | Qiagen_S1 | 4,911,456 | 98% | 81 | 87% | 95% | 5% | 15% |
| Qiagen | Qiagen_S2 | 3,559,522 | 95% | 57 | 79% | 84% | 16% | 9% |
| Qiagen | Qiagen_S3 | 3,289,702 | 93% | 50 | 82% | 88% | 12% | 6% |
| Qiagen | Qiagen_S4 | 4,954,218 | 88% | 74 | 88% | 96% | 4% | 17% |
| Zymo | Zymo_S1 | 4,355,956 | 99% | 71 | 82% | 13% | 87% | 0.8% |
| Zymo | Zymo_S2 | 4,840,762 | 99% | 79 | 70% | 26% | 74% | 0.6% |
| Zymo | Zymo_S3 | 5,691,744 | 99% | 93 | 78% | 18% | 82% | 0.6% |
| Zymo | Zymo_S4 | 4,651,118 | 99% | 76 | 87% | 7% | 93% | 0.8% |
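
The per-sample values in the table above can be condensed into per-kit averages to make the differences in strandedness and residual rRNA easier to compare. The following is a minimal sketch in Python, not part of the original analysis. The function name `summarize` and the "dominant strand" metric (the larger of the sense and antisense percentages, used here so that a library whose reads map mostly antisense is still scored on how strongly it favors one strand) are assumptions introduced for illustration only.

```python
from statistics import mean

# Values copied from the table above:
# (kit, sample, sense %, antisense %, rRNA RPKM as % of total RPKM)
ROWS = [
    ("NuGEN", "NuGEN_S1", 64, 36, 7.0),
    ("NuGEN", "NuGEN_S2", 50, 50, 2.0),
    ("NuGEN", "NuGEN_S3", 56, 44, 4.0),
    ("NuGEN", "NuGEN_S4", 61, 39, 15.0),
    ("Qiagen", "Qiagen_S1", 95, 5, 15.0),
    ("Qiagen", "Qiagen_S2", 84, 16, 9.0),
    ("Qiagen", "Qiagen_S3", 88, 12, 6.0),
    ("Qiagen", "Qiagen_S4", 96, 4, 17.0),
    ("Zymo", "Zymo_S1", 13, 87, 0.8),
    ("Zymo", "Zymo_S2", 26, 74, 0.6),
    ("Zymo", "Zymo_S3", 18, 82, 0.6),
    ("Zymo", "Zymo_S4", 7, 93, 0.8),
]


def summarize(rows):
    """Return per-kit means of sense %, dominant-strand %, and rRNA %.

    'Dominant strand' is a hypothetical helper metric (max of sense and
    antisense); it is an assumption of this sketch, not the authors' measure.
    """
    by_kit = {}
    for kit, _sample, sense, antisense, rrna in rows:
        by_kit.setdefault(kit, []).append((sense, antisense, rrna))

    summary = {}
    for kit, values in by_kit.items():
        summary[kit] = {
            "mean_sense": mean(v[0] for v in values),
            "mean_dominant_strand": mean(max(v[0], v[1]) for v in values),
            "mean_rrna": mean(v[2] for v in values),
        }
    return summary


if __name__ == "__main__":
    for kit, stats in summarize(ROWS).items():
        print(
            f"{kit}: sense {stats['mean_sense']:.2f}%, "
            f"dominant strand {stats['mean_dominant_strand']:.2f}%, "
            f"rRNA {stats['mean_rrna']:.2f}%"
        )
```

Averages over four samples per kit are only a rough summary and say nothing about the transcript abundance distributions shown in FIG 1. They should be read alongside the individual rows rather than in place of them.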