Mo Huang, Jingshu Wang, Eduardo Torre, Hannah Dueck, Sydney Shaffer, Roberto Bonasio, John I Murray, Arjun Raj, Mingyao Li, Nancy R Zhang.
Abstract
In single-cell RNA sequencing (scRNA-seq) studies, only a small fraction of the transcripts present in each cell are sequenced. This leads to unreliable quantification of genes with low or moderate expression, which hinders downstream analysis. To address this challenge, we developed SAVER (single-cell analysis via expression recovery), an expression recovery method for unique molecule index (UMI)-based scRNA-seq data that borrows information across genes and cells to provide accurate expression estimates for all genes.
Year: 2018 PMID: 29941873 PMCID: PMC6030502 DOI: 10.1038/s41592-018-0033-z
Source DB: PubMed Journal: Nat Methods ISSN: 1548-7091 Impact factor: 28.547
Figure 1. RNA FISH validation of SAVER results on Drop-seq data. (a) Overview of SAVER procedure. (b) Comparison of Gini coefficient for each gene between FISH and Drop-seq (left) and between FISH and SAVER recovered values (right) for n = 15 genes. (c) Kernel density estimates of cross-cell expression distribution of LMNA (upper) and CCNA2 (lower). (d) Scatterplots of expression levels between BABAM1 and LMNA. Pearson correlations were calculated across n = 17,095 cells for FISH and n = 8,498 cells for Drop-seq and SAVER.
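The Gini coefficient in panel (b) summarizes how unevenly a gene's expression is spread across cells. The following is a minimal sketch of the standard rank-based formula, not the authors' code; the function name `gini` and the toy inputs are assumptions made for illustration only.

```python
def gini(values):
    """Gini coefficient of non-negative values (0 = perfectly even)."""
    x = sorted(float(v) for v in values)
    n = len(x)
    total = sum(x)
    if n == 0 or total == 0:
        # Assumption: treat empty or all-zero input as perfectly even.
        return 0.0
    # Rank-based formula on ascending values, ranks i = 1..n:
    # G = sum_i (2i - n - 1) * x_i / (n * sum_i x_i)
    weighted = sum((2 * i - n - 1) * v for i, v in enumerate(x, start=1))
    return weighted / (n * total)


# Toy per-cell counts for one hypothetical gene.
even = gini([3, 3, 3, 3])      # every cell expresses equally
skewed = gini([0, 0, 0, 12])   # one cell carries all the expression
```

Under this formula, a gene expressed equally in every cell scores 0, and a gene detected in only one of n cells scores (n - 1)/n.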
Figure 2. Evaluation of SAVER by down-sampling and cell clustering. (a) Performance of algorithms measured by correlation with reference, on the gene level (left) and on the cell level (right). Number of genes and cells can be found in Supplementary Table 3. Box plots show the median (center line), interquartile range (hinges), and 1.5 times the interquartile range (whiskers); outlier data beyond this range are not shown. (b) Comparison of gene-to-gene (left) and cell-to-cell (right) correlation matrices of recovered values with the true correlation matrices, as measured by correlation matrix distance (CMD). (c) Differential expression (DE) analysis between CA1Pyr1 cells (n = 351) and CA1Pyr2 cells (n = 389) showing significant genes detected at FDR = 0.01 (left) and estimated FDR (right). (d) Cell clustering and t-SNE visualization of the Zeisel dataset (n = 1,799). Jaccard index of the down-sampled observed dataset and recovery methods as compared to the reference classification is shown. (e) t-SNE visualization of 7,387 mouse cortex cells for the observed data (left) and SAVER (right) colored by cell types determined by Hrvatin et al.
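The correlation matrix distance (CMD) in panel (b) can be illustrated with a short sketch. It assumes the commonly used definition CMD(R1, R2) = 1 - tr(R1 R2) / (||R1||_F ||R2||_F); the caption does not spell out the formula, so treat this as an illustration rather than the authors' implementation. The function name and the toy matrices are hypothetical.

```python
import math


def correlation_matrix_distance(r1, r2):
    """CMD between two square matrices given as lists of lists.

    Assumed definition: 1 - tr(R1 R2) / (||R1||_F * ||R2||_F).
    """
    n = len(r1)
    if len(r2) != n or any(len(row) != n for row in list(r1) + list(r2)):
        raise ValueError("both matrices must be square and the same size")
    # tr(R1 R2) = sum over i, j of R1[i][j] * R2[j][i]
    trace = sum(r1[i][j] * r2[j][i] for i in range(n) for j in range(n))
    norm1 = math.sqrt(sum(v * v for row in r1 for v in row))
    norm2 = math.sqrt(sum(v * v for row in r2 for v in row))
    if norm1 == 0 or norm2 == 0:
        raise ValueError("CMD is undefined for an all-zero matrix")
    return 1.0 - trace / (norm1 * norm2)


# Toy 2x2 correlation matrices (hypothetical values).
reference = [[1.0, 0.8], [0.8, 1.0]]
recovered = [[1.0, 0.5], [0.5, 1.0]]
same = correlation_matrix_distance(reference, reference)
different = correlation_matrix_distance(reference, recovered)
```

Under this definition, identical matrices give a distance of 0, and larger values indicate that the recovered correlation structure differs more from the reference.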