Dominik Buschmann, Anna Haberberger, Benedikt Kirchner, Melanie Spornraft, Irmgard Riedmaier, Gustav Schelling, Michael W Pfaffl.
Abstract
Small RNA-Seq has emerged as a powerful tool in transcriptomics, gene expression profiling and biomarker discovery. Sequencing cell-free nucleic acids, particularly microRNA (miRNA), from liquid biopsies additionally provides exciting possibilities for molecular diagnostics, and might help establish disease-specific biomarker signatures. The complexity of the small RNA-Seq workflow, however, bears challenges and biases that researchers need to be aware of in order to generate high-quality data. Rigorous standardization and extensive validation are required to guarantee reliability, reproducibility and comparability of research findings. Hypotheses based on flawed experimental conditions can be inconsistent and even misleading. Comparable to the well-established MIQE guidelines for qPCR experiments, this work aims at establishing guidelines for experimental design and pre-analytical sample processing, standardization of library preparation and sequencing reactions, as well as facilitating data analysis. We highlight bottlenecks in small RNA-Seq experiments, point out the importance of stringent quality control and validation, and provide a primer for differential expression analysis and biomarker discovery. Following our recommendations will encourage better sequencing practice, increase experimental transparency and lead to more reproducible small RNA-Seq results. This will ultimately enhance the validity of biomarker signatures, and allow reliable and robust clinical predictions.
Year: 2016 PMID: 27317696 PMCID: PMC5291277 DOI: 10.1093/nar/gkw545
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. An overview of the small RNA library preparation workflow.
Figure 2. An overview of the small RNA-Seq data analysis workflow.
Crucial steps and recommendations for small RNA sampling and library preparation
| Step | To consider | Recommendation |
|---|---|---|
| Experimental design and replication | Type and number of samples; Outcome of interest; Variance within samples | Employ sufficient replication for question at hand; Favor biological replicates over technical ones |
| Sequencing depth | Outcome of interest; Replication | For a rough snapshot of gene expression or analysis of high-level transcripts, lower coverage is sufficient; Sequencing depth needs to be increased for analysis of rare transcripts |
| Sampling and storage | Sampling environment; Sample type; Embedding/fixation; Freezing/storage | Keep sampling conditions as clean as possible; Choose an appropriate sampling system for the particular sample type; Use agents to preserve and stabilize RNA; Freeze samples as quickly as possible and store at appropriate temperature |
| RNA extraction | Quantity of input material; Type of extraction kit; Use of a carrier | Carefully optimize the method of extraction for the particular type and quantity of starting material; Carrier material might be considered to increase small RNA yield |
| Total RNA | Expected yield and quantification system; Quality of extracted RNA | Opt for fluorescence-based quantification of extracted RNA; Check RNA quality and integrity by capillary electrophoresis |
| Addition of adapter | Type of RNA (e.g. miRNA, piRNA); Modified ends | Be aware of ligation biases; For small RNAs with modified 3′-ends avoid poly(A) or poly(C)-based approaches or modify protocol accordingly |
| Reverse transcription | Type of enzyme; Introduction of barcodes | Choose appropriate enzyme for given experimental conditions; Introduce barcodes during PCR |
| PCR amplification | Necessity; Type of enzyme; Number of cycles | Choose pre-amplification strategy based on the quantity of starting material; Opt for high fidelity polymerases with low error rates; Perform as few PCR cycles as possible |
| Size selection | Appropriate size range; Precision of selection system | Select for cDNA fragments that reflect the size of the RNA of interest; High-resolution gel electrophoresis to effectively separate small RNA species |
| Library purity and quantification | Contamination with adapter dimers; Accurate quantification for precise flow cell loading | Assess library purity by capillary electrophoresis; Quantify library by fluorimetric assays or qPCR/dPCR |
| Quality control | Quality and purity of samples at each step of the workflow | Control for sample quality throughout workflow: purity and integrity of initial sample, extracted RNA, cDNA library before and after size selection |
Crucial steps and recommendations for small RNA-Seq data analysis
| Step | To consider | Recommended tools or algorithms |
|---|---|---|
| Data pre-processing | Trimming adapters; Removing short reads | Btrim, FASTX-Toolkit |
| Quality control | Library size and read distribution across samples; Per base/sequence Phred score; Read length distribution; Assess degradation; Check for over-represented sequences | Btrim, FASTX-Toolkit, FaQCs |
| Read alignment (Filtering) | Reference database or genome; Annotation; Mismatch rate; Handling of multi-reads | Bowtie, BWA, HTSeq, SAMtools, SOAP2 |
| Normalization | Library sizes and sequencing depth; Batch effects; Read distribution; Replication level | DESeq2, edgeR, svaseq |
| DGE analysis | Data distribution; Replication level; False discovery rate | DESeq2, edgeR, SAMSeq, voom (limma) |
| | Target prediction of miRNAs / siRNAs; Canonical and non-canonical target regulation | miRanda, miRTarBase, TarBase |
| Biomarker identification | Sensitivity; Specificity; Classification rate | DESeq2, Simca-Q, numerous R packages (base, pcaMethods, mixOmics) |
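
The data pre-processing step listed in the table (trimming adapters, then removing short reads) can be illustrated with a minimal Python sketch. It is not the implementation used by Btrim or FASTX-Toolkit: the adapter sequence, the 16 nt length cutoff, the 5 nt minimum overlap and the function names `trim_adapter` and `preprocess` are all assumptions chosen for illustration, and the matching is exact (no mismatches or quality scores are considered).

```python
# Illustrative 3' adapter sequence; an assumption, not taken from the source.
ADAPTER = "TGGAATTCTCGGGTGCCAAGG"
# Assumed cutoff below which a trimmed read is treated as "short".
MIN_LENGTH = 16


def trim_adapter(read: str, adapter: str = ADAPTER, min_overlap: int = 5) -> str:
    """Remove a 3' adapter (complete, or partial at the read end) by exact matching."""
    # Case 1: the complete adapter occurs in the read; keep what precedes it.
    idx = read.find(adapter)
    if idx != -1:
        return read[:idx]
    # Case 2: the read ends with a prefix of the adapter.
    # Try the longest candidate prefix first, down to min_overlap bases.
    longest = min(len(adapter), len(read)) - 1
    for k in range(longest, min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:-k]
    # No adapter found: return the read unchanged.
    return read


def preprocess(reads, min_length: int = MIN_LENGTH):
    """Trim adapters from all reads, then drop reads shorter than min_length."""
    trimmed = (trim_adapter(r) for r in reads)
    return [r for r in trimmed if len(r) >= min_length]


if __name__ == "__main__":
    insert = "TGAGGTAGTAGGTTGTATAGTT"  # 22 nt example insert
    reads = [
        insert + ADAPTER,      # complete adapter present
        insert + ADAPTER[:8],  # adapter truncated at the read end
        "ACGT" + ADAPTER,      # insert too short after trimming, dropped
    ]
    print(preprocess(reads))
```

In practice a dedicated trimmer that tolerates sequencing errors and uses base qualities would be preferable; the sketch only shows why trimming has to precede the length filter, since read length is meaningful only once the adapter has been removed.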