Kenong Su (1), Zhijin Wu (2), Hao Wu (3)
(1) Department of Computer Science, Emory University, Atlanta, GA 30329, USA.
(2) Department of Biostatistics, Brown University, Providence, RI 02912, USA.
(3) Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA 30322, USA.
Abstract
MOTIVATION: Determining the sample size needed for adequate power to detect statistical significance is a crucial step at the design stage of high-throughput experiments. Although a number of methods and tools are available for sample size calculation for microarray and RNA-seq in the context of differential expression (DE), the topic is understudied for single-cell RNA sequencing (scRNA-seq). Moreover, the unique data characteristics of scRNA-seq, such as sparsity and heterogeneity, add to the challenge.

RESULTS: We propose POWSC, a simulation-based method that provides power evaluation and sample size recommendation for scRNA-seq DE analysis. POWSC consists of a data simulator that creates realistic expression data and a power assessor that provides a comprehensive evaluation and visualization of the relationship between power and sample size. The data simulator in POWSC outperforms two other state-of-the-art simulators in capturing key characteristics of real datasets. The power assessor in POWSC provides a variety of power evaluations, including stratified and marginal power analyses, for DE of two forms (phase transition or magnitude tuning) under different comparison scenarios. In addition, POWSC offers information for optimizing the tradeoff between sample size and sequencing depth for a fixed total number of reads.

AVAILABILITY AND IMPLEMENTATION: POWSC is an open-source R package available online at https://github.com/suke18/POWSC.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
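The general idea of simulation-based power evaluation described in the abstract can be illustrated with a minimal sketch. This is not POWSC's simulator, model, or API: the zero-inflated Gaussian model on a log scale, the Welch statistic with a normal approximation, and every function name and default parameter below are assumptions chosen only to keep the example short and self-contained. Power is estimated as the fraction of simulated datasets in which a truly differentially expressed gene is called significant.

```python
import random
from statistics import NormalDist, mean, variance


def simulate_cells(n_cells, mu, sigma, p_zero, rng):
    """Simulate log-scale expression of one gene across n_cells cells.

    Assumed toy model: a cell reports 0 with probability p_zero (dropout),
    otherwise a Gaussian value with mean mu and standard deviation sigma.
    """
    return [0.0 if rng.random() < p_zero else rng.gauss(mu, sigma)
            for _ in range(n_cells)]


def welch_p_value(x, y):
    """Two-sided p-value from a Welch statistic, using a normal approximation."""
    se = (variance(x) / len(x) + variance(y) / len(y)) ** 0.5
    if se == 0.0:
        return 1.0  # no variation in either group: nothing to detect
    z = (mean(x) - mean(y)) / se
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def estimate_power(n_cells, effect, n_sim=500, alpha=0.05,
                   mu=2.0, sigma=1.0, p_zero=0.3, seed=0):
    """Fraction of simulated two-group comparisons with p < alpha.

    `effect` is the shift in the non-zero mean between the two groups
    (hypothetical parameterization, for illustration only).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        x = simulate_cells(n_cells, mu, sigma, p_zero, rng)
        y = simulate_cells(n_cells, mu + effect, sigma, p_zero, rng)
        if welch_p_value(x, y) < alpha:
            hits += 1
    return hits / n_sim


if __name__ == "__main__":
    # Power as a function of the number of cells per group.
    for n in (20, 50, 100, 200):
        print(n, estimate_power(n, effect=0.5))
```

Running the sketch over a grid of cell counts gives a crude power curve for this toy model; setting `effect=0` gives the empirical false positive rate instead, which should stay close to `alpha`.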