Shengquan Chen, Kui Hua, Hongfei Cui, Rui Jiang.
Abstract
BACKGROUND: Single-cell RNA-sequencing (scRNA-seq) technologies have advanced rapidly in recent years and now enable quantitative characterization of gene expression at single-cell resolution. As the number of cells profiled in individual scRNA-seq experiments grows exponentially, identifying putative cell types from the data has become a major challenge that calls for novel computational methods. Although a variety of algorithms have recently been proposed for single-cell clustering, limitations such as low accuracy, poor robustness, and inadequate stability greatly restrict the scope of application of these methods.
Keywords: Cell subtypes; Clustering; Dimensionality reduction; Dropout; Multi-scale; Single-cell; Variational projection; scRNA-seq
Year: 2019 PMID: 31074382 PMCID: PMC6509870 DOI: 10.1186/s12859-019-2742-4
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 Representation of VPAC as a probabilistic graphical model. The observed variable x is shown by the shaded node, while the plate notation comprises a dataset of N independent observations together with the corresponding latent variables
Performance comparison on three datasets of different scales
| Model | S-Set ARI | S-Set NMI | M-Set ARI | M-Set NMI | L-Set ARI | L-Set NMI |
|---|---|---|---|---|---|---|
| VPAC | 0.769 ± 0.000 | 0.759 ± 0.000 | 0.765 ± 0.000 | 0.765 ± 0.000 | 0.779 ± 0.000 | 0.769 ± 0.000 |
| Para_DPMM | 0.675 ± 0.008 | 0.696 ± 0.008 | 0.704 ± 0.000 | 0.713 ± 0.002 | 0.700 ± 0.002 | 0.711 ± 0.002 |
| pcaReduce | 0.290 ± 0.019 | 0.289 ± 0.010 | 0.279 ± 0.007 | 0.252 ± 0.007 | 0.311 ± 0.026 | 0.309 ± 0.044 |
| Seurat | 0.378 ± 0.000 | 0.379 ± 0.000 | 0.379 ± 0.000 | 0.383 ± 0.000 | 0.383 ± 0.000 | 0.383 ± 0.000 |
| SC3 | 0.724 ± 0.005 | 0.722 ± 0.010 | 0.724 ± 0.009 | 0.717 ± 0.007 | 0.523 ± 0.053 | 0.501 ± 0.030 |
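
The ARI (adjusted Rand index) and NMI (normalized mutual information) columns score how well predicted clusters agree with reference cell-type labels. A minimal sketch of how such scores can be computed is given here, assuming the standard Hubert-Arabie form of ARI and arithmetic-mean normalization for NMI (the record does not state which NMI normalization was used). The function names and the toy labels are hypothetical and are not taken from the study.

```python
from collections import Counter
from math import comb, log


def contingency(labels_true, labels_pred):
    """Count how many cells fall into each (true label, predicted label) pair."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label lists must have the same length")
    return Counter(zip(labels_true, labels_pred))


def adjusted_rand_index(labels_true, labels_pred):
    """ARI: 1.0 for identical partitions, close to 0 for chance-level agreement."""
    n = len(labels_true)
    pairs = contingency(labels_true, labels_pred)
    if n < 2:
        return 1.0  # fewer than two cells: no pairs to disagree on
    index = sum(comb(c, 2) for c in pairs.values())
    sum_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = sum_true * sum_pred / comb(n, 2)
    maximum = (sum_true + sum_pred) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. one cluster on both sides
    return (index - expected) / (maximum - expected)


def entropy(labels):
    """Shannon entropy (natural log) of a labeling."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def normalized_mutual_info(labels_true, labels_pred):
    """NMI with arithmetic-mean normalization (an assumption, see lead-in)."""
    n = len(labels_true)
    count_true = Counter(labels_true)
    count_pred = Counter(labels_pred)
    mutual_info = sum(
        (c / n) * log(c * n / (count_true[t] * count_pred[p]))
        for (t, p), c in contingency(labels_true, labels_pred).items()
    )
    mean_entropy = (entropy(labels_true) + entropy(labels_pred)) / 2
    if mean_entropy == 0:
        return 1.0  # both labelings are a single cluster
    return mutual_info / mean_entropy


# Toy example: reference labels versus a hypothetical clustering result.
reference = ["B", "B", "T", "T", "NK", "NK"]
predicted = [0, 0, 1, 1, 1, 2]
print("ARI:", adjusted_rand_index(reference, predicted))
print("NMI:", normalized_mutual_info(reference, predicted))
```

Both scores are invariant to how cluster IDs are named, so a clustering that recovers the reference partition under different labels still scores 1.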
Fig. 2 Performance comparison on datasets of various dimensionality. a The performance on discrete UMI counts. b The performance on continuous FPKM-normalized data
Fig. 3 Performance comparison on datasets of different size. a The performance on discrete UMI counts. b The performance on continuous TPM-normalized data
Fig. 4 Performance comparison on datasets of different sparsity
Fig. 5 Visualization of the projection matrix W of VPAC (left) and that of classical PCA (right)
Fig. 6 Visualization of the dendritic cells in the latent space inferred by VPAC using t-SNE. The dendritic cells are colored by the cell-type labels provided by the original study, different point shapes represent different experimental batches, and the dashed circles represent potential clusters inferred by VPAC with the number of clusters set to (a) 6 and (b) 5
Fig. 7 The co-expression network of the DC2/3 cluster inferred by VPAC
Fig. 8 The values of the fifth element of the latent vectors inferred by VPAC