Massimo Andreatta, Santiago J Carmona.
Abstract
SUMMARY: STACAS is a computational method for the identification of integration anchors in the Seurat environment, optimized for the integration of single-cell (sc) RNA-seq datasets that share only a subset of cell types. We demonstrate that, by (i) correcting batch effects while preserving relevant biological variability across datasets, (ii) filtering aberrant integration anchors with a quantitative distance measure and (iii) constructing optimal guide trees for integration, STACAS can accurately align scRNA-seq datasets composed of only partially overlapping cell populations.
Year: 2021 PMID: 32845323 PMCID: PMC8098019 DOI: 10.1093/bioinformatics/btaa755
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Anchor finding and dataset integration using STACAS. (A) Expression level (log[normalized UMI counts + 1]) of Cd8a and Cd4 after integration with Seurat CCA (top) or STACAS (bottom); important biological differences between the samples are lost by data rescaling and sub-optimal anchoring by Seurat 3 CCA. (B) Anchor distance distribution between pairs of samples prior to anchor filtering by STACAS; poor anchors with distance higher than threshold (represented with a vertical dashed line) are filtered out by STACAS. (C–E) Low-dimensionality UMAP visualization of scRNA-seq data, colored by sample, without batch correction (C), using Seurat CCA anchors (D) and using STACAS anchors (E) for dataset alignment. (F–H) UMAP visualization of scRNA-seq data, colored by TILPRED state prediction, without batch correction (F), using Seurat CCA anchors (G) and using STACAS anchors (H) for dataset alignment.
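
The anchor filtering described for panel B can be illustrated with a minimal sketch. This is not the STACAS implementation (STACAS is an R package built on Seurat); it only shows the general idea of discarding anchors whose distance exceeds a threshold. The `Anchor` class, the `filter_anchors` function, the distance values and the threshold of 0.8 are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Anchor:
    """A candidate anchor linking one cell in each of two datasets (hypothetical type)."""
    cell_a: int      # index of the cell in the first dataset
    cell_b: int      # index of the cell in the second dataset
    distance: float  # illustrative pairwise distance; smaller means more similar


def filter_anchors(anchors: List[Anchor], threshold: float) -> List[Anchor]:
    """Keep only anchors whose distance does not exceed the threshold."""
    return [a for a in anchors if a.distance <= threshold]


# Toy anchors with made-up distances (not taken from the paper).
anchors = [
    Anchor(0, 5, 0.12),
    Anchor(1, 7, 0.95),
    Anchor(2, 3, 0.40),
    Anchor(3, 9, 0.81),
]

# Assumed threshold; it plays the role of the dashed vertical line in panel B.
kept = filter_anchors(anchors, threshold=0.8)
print([(a.cell_a, a.cell_b) for a in kept])
```

In this toy example the two anchors with distances above 0.8 are dropped, and only the remaining pairs would be passed on to dataset alignment.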