Huajuan Shi, Ying Zhou, Erteng Jia, Min Pan, Yunfei Bai, Qinyu Ge.
Abstract
Although RNA sequencing (RNA-seq) has become the most advanced technology for transcriptome analysis, it still faces various challenges. The RNA-seq workflow is complex, and bias can be introduced at many of its steps. Such bias can degrade the quality of an RNA-seq dataset and lead to incorrect interpretation of the sequencing results. A detailed understanding of the source and nature of these biases is therefore essential for interpreting RNA-seq data, for finding methods that improve the quality of RNA-seq experiments, and for developing bioinformatics tools that compensate for these biases. Here, we discuss the sources of experimental bias in RNA-seq. For each type of bias, we discuss methods for improvement, in order to provide useful suggestions for researchers conducting RNA-seq experiments.
Year: 2021 PMID: 33987443 PMCID: PMC8079181 DOI: 10.1155/2021/6647597
Source DB: PubMed Journal: Biomed Res Int Impact factor: 3.411
Main sources of bias in RNA-seq.
| Bias sources |
|---|
| Sample preservation |
| Library preparation |
| Sequencing and imaging |
Figure 1. Simplified protocol of an RNA-seq experiment and sources of bias. (a) Sample preservation and isolation. Biases here can include sample degradation and DNA contamination. (b) Strategies for cDNA library construction. ①: RNA is converted directly to cDNA, which is then fragmented and used for library preparation. ②: a classical protocol. One method performs reverse transcription (RT) with random primers first, followed by adapter ligation and sequencing (left). The other method first ligates the 3′ and 5′ adapters sequentially, then synthesizes cDNA with a primer complementary to the adapter (RT-primer), followed by sequencing (right). When an RT-primer with a specific sequence is used, mispriming can occur because the RT-primer anneals to transcript sequences with partial complementarity (RT mispriming). (c) RNA-seq platforms (including pyrosequencing, sequencing-by-synthesis, and single-molecule sequencing). Biases here can be introduced by insertions and deletions, raw single-pass data, etc.
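
The RT mispriming described in panel (b) can be illustrated with a minimal Python sketch. It scans a transcript for sites where the 3′ end of an RT-primer could anneal. The function names, the example sequences, and the `min_match` threshold of 6 bases are assumptions made for illustration; they do not come from the article, and real annealing also depends on temperature and mismatches, which this sketch ignores.

```python
from typing import List


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_mispriming_sites(transcript: str, rt_primer: str, min_match: int = 6) -> List[int]:
    """Return 0-based transcript positions where the primer's 3' end could anneal.

    The primer is written 5'->3'. Its last `min_match` bases can anneal wherever
    the transcript contains their exact reverse complement (hypothetical rule;
    mismatches and thermodynamics are not modeled).
    """
    transcript = transcript.upper().replace("U", "T")  # accept RNA or DNA letters
    target = reverse_complement(rt_primer[-min_match:])
    sites = []
    start = transcript.find(target)
    while start != -1:
        sites.append(start)
        start = transcript.find(target, start + 1)  # allow overlapping hits
    return sites


# Hypothetical primer and transcript, chosen only to show one hit.
example_sites = find_mispriming_sites("ACGTTTGGCCACGT", "TTTTGGCCAA")
```

Transcripts with many hits for a given RT-primer would be candidates for closer inspection, since reads starting at those positions may come from mispriming rather than from the ligated adapter.
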
Sources of bias in RNA-seq sample preservation and suggestions for improvement.
| Description | Suggestion for improvement |
|---|---|
| Sample preservation | |
| FFPE methods: cause modifications of biomolecules, such as cross-linking of nucleic acids with proteins | Use non-cross-linking organic fixatives and methacarn solution [
| RNA extraction | |
| Using TRIzol: small RNA loss at low concentrations | Use high concentrations of RNA samples or avoid TRIzol extraction altogether [ |
Sources of bias in RNA-seq library preparation and suggestions for improvement.
| Description | Suggestion for improvement |
|---|---|
| mRNA enrichment | |
| 3′-end capture bias introduced during poly(A) enrichment | Use rRNA depletion [
| Fragmentation | |
| RNA fragmentation by RNase III: not completely random, leading to reduced complexity | Use chemical treatment (e.g., zinc) rather than RNase III for RNA fragmentation [ |
| Priming bias | |
| Random hexamer priming bias | Avoid converting RNA to ds-cDNA by random priming; instead, ligate sequencing adapters directly onto the RNA fragments [
| Adapter ligation | |
| Adapter ligation bias: due to substrate preferences of T4 RNA ligases | Use adapters with random nucleotides at the extremities to be ligated [ |
| PCR | |
| (1) Bias due to preferential amplification of sequences with neutral GC content | Use Kapa HiFi rather than Phusion polymerase [
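
The PCR row above refers to preferential amplification by GC content. One way to look for such an effect in a dataset is to group fragments by GC fraction and compare their mean read counts. The sketch below is only an illustration: the function names, the number of bins, and the example counts are assumptions, not data or methods from the article.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a sequence (0.0 for an empty one)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def mean_count_by_gc_bin(fragments: Iterable[Tuple[str, int]], n_bins: int = 5) -> Dict[int, float]:
    """Average the read counts of fragments that fall in the same GC bin.

    `fragments` holds (sequence, read_count) pairs. Bin 0 covers the lowest GC
    fractions and bin n_bins - 1 the highest (a GC fraction of 1.0 is clamped
    into the last bin).
    """
    totals = defaultdict(lambda: [0, 0])  # bin index -> [sum of counts, number of fragments]
    for seq, count in fragments:
        bin_index = min(int(gc_fraction(seq) * n_bins), n_bins - 1)
        totals[bin_index][0] += count
        totals[bin_index][1] += 1
    return {b: total / n for b, (total, n) in sorted(totals.items())}


# Made-up fragments and counts, used only to exercise the function.
example = mean_count_by_gc_bin(
    [("ATATATAT", 10), ("ATGCATGC", 50), ("GCGCGCGC", 5), ("AATTAATT", 20)]
)
```

If the middle bins consistently show higher mean counts than the extreme bins across many fragments of similar expected abundance, that pattern would be compatible with the GC-dependent amplification bias listed in the table.
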
Sources of bias in the major sequencing platforms.
| Company | Platforms | Sequencing | Dominant bias type | Suggestion for improvement |
|---|---|---|---|---|
| Roche/454 Life Sciences | GS FLX Titanium XL+ | Pyrosequencing | Sequencing bias is introduced by PCR amplification prior to sequencing. | Reduce the number of PCR cycles and use DNA polymerases with higher fidelity [ |
| | GS FLX Titanium XLR70 | | | |
| | GS Junior | | | |
| Illumina | HiSeq 2000 | Sequencing-by-synthesis with reversible terminator | Substitution-type miscalls are the major source of bias. | Quality trimming (sickle) combined with error correction (BayesHammer), followed by read overlapping (PANDAseq), is the most suitable approach for reducing substitution biases [ |
| | Genome Analyzer IIx | | | |
| | MiSeq | | | |
| Life Technologies | SOLiD™ 4 system | | | |
| | Ion PGM™ sequencer (318 chip) | | | |
| Helicos BioSciences | HeliScope™ single molecule sequencer | Single-molecule sequencing | Biases are introduced by insertions and deletions. | If low sequencing bias is needed, Illumina or SOLiD is often the best choice [ |
| Pacific Biosciences | PacBio RS | Single-molecule sequencing | High bias of raw single-pass data | |
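
The quality-trimming step recommended for Illumina reads can be sketched as a sliding-window trimmer. This is a simplified illustration, not sickle's actual algorithm: the window size, the mean-quality threshold, the Phred+33 offset, and the function name are all assumptions chosen for the example.

```python
from typing import Tuple


def sliding_window_trim(
    seq: str,
    quality: str,
    window: int = 4,
    min_mean_q: float = 20.0,
    offset: int = 33,
) -> Tuple[str, str]:
    """Cut a read at the first window whose mean Phred quality is too low.

    `quality` is a FASTQ quality string (assumed Phred+33 encoding). Everything
    from the start of the first failing window onward is discarded.
    """
    if len(seq) != len(quality):
        raise ValueError("sequence and quality string must have the same length")
    scores = [ord(char) - offset for char in quality]
    window = min(window, len(scores))  # shrink the window for very short reads
    if window == 0:
        return seq, quality
    for start in range(len(scores) - window + 1):
        mean_q = sum(scores[start:start + window]) / window
        if mean_q < min_mean_q:
            return seq[:start], quality[:start]
    return seq, quality


# 'I' encodes Phred 40 and '#' encodes Phred 2 under the Phred+33 assumption.
trimmed = sliding_window_trim("ACGTACGTAC", "IIIIII####")
```

Note that this simple rule cuts at the start of the failing window, so it can discard a good base or two just before the low-quality tail. Dedicated trimmers handle such edge cases more carefully and should be preferred for real data.
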