Federico Agostinis1, Chiara Romualdi1, Gabriele Sales1, Davide Risso2.
Abstract
SUMMARY: We present NewWave, a scalable R/Bioconductor package for the dimensionality reduction and batch effect removal of single-cell RNA sequencing data. To achieve scalability, NewWave uses mini-batch optimization and can work with out-of-memory data, enabling users to analyze datasets with millions of cells.
Year: 2022 PMID: 35266509 PMCID: PMC9048694 DOI: 10.1093/bioinformatics/btac149
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.931
Fig. 1. Implementation and performance of NewWave. Unless otherwise noted, we used 10% of the observations as the size of the mini-batches and 10 cores. (A) Schema of the NewWave model, indicating which matrices are in shared memory (see Supplementary Information for more details). (B) Speed (top) and ARI (bottom) of NewWave (in-memory data) with different choices of the parameters and of ZINB-WaVE, applied to the BICCN dataset (Yao et al.) with a maximum of 312 000 cells and after selecting the 1000 most variable genes. The reported ARI is the mean ARI of 100 k-means clustering procedures with the number of centroids set to the known number of labels (k = 20). (C) Speed and RAM usage of NewWave (gene-wise dispersion + mini-batch) and ZINB-WaVE on a subset of 100 000 cells, varying the number of cores used for computation. (D) RAM usage (top) and speed (bottom) of NewWave on the 10X 1.3 M cell dataset with the 1000 most variable genes.
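The ARI reported in panel (B) is an average over repeated k-means runs. As a rough illustration of that scoring step only, the sketch below computes the adjusted Rand index from its standard pair-counting definition and averages it over several clusterings. It is written in plain Python rather than R and is not the authors' code: the function names `adjusted_rand_index` and `mean_ari` are hypothetical, and the k-means runs themselves are assumed to have been produced elsewhere.

```python
from collections import Counter
from math import comb
from statistics import mean


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same cells."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")

    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        # Fewer than two cells: the two partitions agree trivially.
        return 1.0

    # Contingency counts and their marginals.
    cells = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)

    # Number of pairs placed together in both, in the first, in the second.
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())

    expected = sum_rows * sum_cols / total_pairs
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        # Degenerate case, e.g. both labelings put every cell in one cluster.
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


def mean_ari(labels_true, clustering_runs):
    """Average ARI over several clusterings (hypothetical helper).

    In the setting of the caption, clustering_runs would hold the labels
    from 100 k-means runs with k = 20; here it is any iterable of labelings.
    """
    return mean(adjusted_rand_index(labels_true, run) for run in clustering_runs)
```

For example, `mean_ari(known_labels, runs)` with `runs` holding one label vector per k-means restart returns a single score, where 1 means perfect agreement with the known labels and values near 0 mean agreement no better than chance.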