Alemu Takele Assefa (1), Jo Vandesompele (2,3,4), Olivier Thas (1,3,5,6). 1. Data Analysis and Mathematical Modeling. 2. Biomolecular Medicine. 3. Cancer Research Institute Ghent. 4. Center for Medical Genetics, Ghent University, Ghent, Belgium. 5. National Institute for Applied Statistics Research Australia (NIASRA), University of Wollongong, Wollongong, Australia. 6. Data Science Institute, I-BioStat, Hasselt University, Hasselt, Belgium.
Abstract
SUMMARY: SPsimSeq is a semi-parametric simulation method for generating bulk and single-cell RNA-sequencing data. It is designed to simulate gene expression data that retain as many characteristics of the real data as possible. It is flexible enough to accommodate a wide range of experimental scenarios, including different sample sizes, biological signals (differential expression) and confounding batch effects. AVAILABILITY AND IMPLEMENTATION: The R package and associated documentation are available from https://github.com/CenterForStatistics-UGent/SPsimSeq. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.