Huei-Chung Huang, Yi Niu, Li-Xuan Qin.
Abstract
Deep sequencing has recently emerged as a powerful alternative to microarrays for the high-throughput profiling of gene expression. To account for the discrete nature of RNA sequencing data, new statistical methods and computational tools have been developed for the analysis of differential expression to identify genes that are relevant to a disease such as cancer. It is thus timely to provide an overview of these analysis methods and tools, which is the aim of this paper. For readers with a statistical background, we also review the parameter estimation algorithms and hypothesis testing strategies used in these methods.
Keywords: RNA sequencing; differential expression analysis; overview; software; statistical methods
Year: 2015 PMID: 26688660 PMCID: PMC4678998 DOI: 10.4137/CIN.S21631
Source DB: PubMed Journal: Cancer Inform ISSN: 1176-9351
Statistical methods and software for differential expression analysis (DEA) of RNA-seq count data.
| MODEL | SOFTWARE | REFERENCES | DATA TYPE | TESTING STRATEGY | NOTES | LIMITATIONS | DATA USED |
|---|---|---|---|---|---|---|---|
| Poisson | DEGseq | Wang et al. | RNA-seq data | Fisher’s exact test | Supports raw read counts or normalized gene expression values; identifies DE of exons or transcripts | Ignores biological variation | Marioni RNA-seq data |
| Poisson | Myrna | Langmead et al. | RNA-seq data | Likelihood ratio test; parallelized permutation test | Handles datasets with over 1 billion rows; computationally efficient | Ignores biological variation; signal loss due to junction or repetitive reads; inconvenient cloud data transfer | HapMap expression data |
| Poisson | PoissonSeq | Li et al. | RNA-seq data | Score test | Accommodates multiple covariate types; computationally efficient | Transformation power depends only on gene expression; libraries are totally exchangeable | Marioni RNA-seq data |
| Negative binomial | edgeR | Robinson et al. | SAGE data | Exact test | Separates biological from technical variation | Limited to pairwise comparison | SAGE data |
| Negative binomial | DESeq | Anders and Huber | Tag-seq data | Exact test | Extends edgeR by allowing a more general, data-driven relationship between mean and variance | Limited to pairwise comparison | Neural stem cell Tag-seq data |
| Negative binomial | DESeq2 | Love et al. | Tag-seq data | Wald test | Improves upon DESeq for better gene ranking; allows hypothesis tests above and below a threshold | Limited to pairwise comparison | Fly RNA-seq data |
| Negative binomial | NBPSeq | Di et al. | RNA-seq data | Adapted exact test | Introduces an additional parameter to allow the dispersion to depend on the mean | Assumes all library sizes are equal | Arabidopsis RNA-seq data |
| Beta binomial | BBSeq | Zhou et al. | RNA-seq data | Wald test | Handles outlier detection automatically | Sensitive to outliers of shrinkage or penalization methods | HapMap RNA-seq data |
| Bayesian and empirical Bayesian | ShrinkSeq | Van de Wiel et al. | RNA-seq data | Evaluating posterior probability for inference | Jointly shrinks multiple parameters; allows for random effects; addresses multiplicity problems | Computationally intensive but allows parallelization | HapMap RNA-seq data |
| Bayesian and empirical Bayesian | baySeq | Hardcastle and Kelly | Small RNA data | Evaluating posterior probability for inference | Involves multiple comparison; accommodates different sample sizes | Computationally intensive but allows parallelization | Trans-acting small RNAs |
| Non-parametric | SAMseq | Li and Tibshirani | RNA-seq data | Wilcoxon test | Robust to outliers; removes experimental effect; simplifies the test for feature effect; accommodates quantitative, survival, and multiple-group comparisons | Overestimates FDR in some cases; relatively low power for data with small sample size | Marioni RNA-seq data; ’t Hoen Tag-seq data; Witten miRNA-seq data |
| Non-parametric | NOISeq | Tarazona et al. | RNA-seq data | Wilcoxon test | Robust; maintains a high true-positive rate | Not easy to identify true differential expression at a low count range; limited to pairwise comparison | Marioni RNA-seq data |
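To make the simplest testing strategy in the table concrete, the following is a minimal sketch of a two-sided Fisher's exact test for a single gene compared between two libraries. It treats the counts as having technical variation only, which is the limitation the table notes for Poisson-type methods. It is not the DEGseq implementation; the function name `fisher_exact_two_sided` and the toy counts are hypothetical and chosen only for illustration.

```python
import math


def _log_comb(n, k):
    """Log of the binomial coefficient C(n, k), via log-gamma to avoid overflow."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def fisher_exact_two_sided(gene_a, rest_a, gene_b, rest_b):
    """Two-sided Fisher's exact test on the 2x2 table

        [[gene_a, rest_a],    reads from the gene / from all other genes, library A
         [gene_b, rest_b]]    same for library B

    Returns the sum of the probabilities of all tables (with the same margins)
    that are no more likely than the observed one.
    """
    lib_a = gene_a + rest_a                 # size of library A
    gene_total = gene_a + gene_b            # reads from this gene in both libraries
    total = lib_a + gene_b + rest_b         # all reads

    def log_prob(x):
        # Hypergeometric probability of x gene reads falling in library A.
        return (_log_comb(gene_total, x)
                + _log_comb(total - gene_total, lib_a - x)
                - _log_comb(total, lib_a))

    lowest = max(0, lib_a - (total - gene_total))
    highest = min(gene_total, lib_a)
    observed = log_prob(gene_a)

    p_value = 0.0
    for x in range(lowest, highest + 1):
        lp = log_prob(x)
        if lp <= observed + 1e-9:           # small tolerance for floating-point ties
            p_value += math.exp(lp)
    return min(1.0, p_value)


# Toy example (hypothetical counts): 1 of 10 reads in A versus 11 of 14 reads in B.
print(fisher_exact_two_sided(1, 9, 11, 3))  # about 0.00276
```

The loop enumerates every possible table, so this sketch is only practical for small counts; real library sizes would call for an optimized routine.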
List of sequencing depth normalization methods and reference papers.
| METHODS | RELEVANT REFERENCES |
|---|---|
| | Mortazavi et al. |
| | Bullard et al. |
| | Robinson et al. |
| | Anders and Huber |
| | Bolstad et al. |
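Two of the references listed above are commonly associated with well-known normalization ideas: scaling by total library size (the basis of the RPKM measure of Mortazavi et al.) and the median-of-ratios size factors of Anders and Huber. The sketch below illustrates both on a toy count matrix. It is an illustrative assumption of how these ideas can be coded, not the reference implementations; the function names and the data are hypothetical, and gene-length adjustment is left out.

```python
import math
import statistics


def counts_per_million(counts):
    """Total-count scaling. `counts` is a list of rows (genes), one value per sample."""
    n_samples = len(counts[0])
    lib_sizes = [sum(row[j] for row in counts) for j in range(n_samples)]
    return [[row[j] / lib_sizes[j] * 1e6 for j in range(n_samples)]
            for row in counts]


def median_of_ratios_size_factors(counts):
    """Size factor per sample: median over genes of count / geometric mean of the gene."""
    n_samples = len(counts[0])
    # Keep only genes observed in every sample so the log geometric mean is finite.
    kept = [row for row in counts if all(c > 0 for c in row)]
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in kept]
    factors = []
    for j in range(n_samples):
        log_ratios = [math.log(row[j]) - lgm
                      for row, lgm in zip(kept, log_geo_means)]
        factors.append(math.exp(statistics.median(log_ratios)))
    return factors


# Toy data (hypothetical): sample 2 was sequenced roughly twice as deeply as sample 1.
toy_counts = [[10, 20], [100, 200], [50, 100], [0, 3]]
print(median_of_ratios_size_factors(toy_counts))
print(counts_per_million(toy_counts))
```

Dividing each sample's counts by its size factor puts the samples on a common scale. Because the median is taken over genes, a handful of very highly expressed genes affects it less than it affects the library total.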
Algorithm Overview 1: Li et al.’s PoissonSeq [6]
| Li and others proposed |
Algorithm Overview 2: Overdispersion
| Negative binomial can be derived as a gamma–Poisson mixture model (subscripts |
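The gamma–Poisson construction can be written out to show where overdispersion comes from. The symbols $\mu$ (mean) and $\phi$ (dispersion) and the shape/scale parameterization are notation assumed here for illustration; gene and sample subscripts are dropped, and the source's own notation may differ.

```latex
\begin{align*}
Y \mid \lambda &\sim \mathrm{Poisson}(\lambda), \qquad
\lambda \sim \mathrm{Gamma}\!\left(\text{shape} = 1/\phi,\ \text{scale} = \mu\phi\right), \\
E[Y] &= E[\lambda] = \mu, \\
\mathrm{Var}(Y) &= E[\mathrm{Var}(Y \mid \lambda)] + \mathrm{Var}(E[Y \mid \lambda])
  = E[\lambda] + \mathrm{Var}(\lambda) = \mu + \phi\mu^{2}.
\end{align*}
```

Marginally, $Y$ follows a negative binomial distribution. The extra term $\phi\mu^{2}$ is the variance in excess of the Poisson variance $\mu$, and letting $\phi \to 0$ recovers the Poisson model.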
Algorithm Overview 3: Robinson and Smyth’s edgeR [33,36]
| In |
Algorithm Overview 4: Anders and Huber’s DESeq [8]
| The read count |
Algorithm Overview 5: Love et al.’s DESeq2 [9]
Algorithm Overview 6: Van de Wiel et al.’s ShrinkSeq [13]
Initiate Use Obtain
Iterate from step 2 until convergence. |
Algorithm Overview 7: Hardcastle and Kelly’s baySeq [14]
| The tuple system in |
Algorithm Overview 8: Li and Tibshirani’s SAMseq [15]
| To use |
Algorithm Overview 9: Tarazona et al.’s NOISeq [16]
| In |