Gennady Gorin, Valentine Svensson, Lior Pachter.
Abstract
The simultaneous quantification of protein and RNA makes possible the inference of past, present, and future cell states from single experimental snapshots. To enable such temporal analysis from multimodal single-cell experiments, we introduce an extension of the RNA velocity method that leverages estimates of unprocessed transcript and protein abundances to extrapolate cell states. We apply the model to six datasets and demonstrate consistency among cell landscapes and phase portraits. The analysis software is available as the protaccel Python package.
Keywords: Bioinformatics; Computational biology; Multiomics; Protein acceleration; Protein velocity; RNA velocity; Transcriptomics
Year: 2020 PMID: 32070398 PMCID: PMC7029606 DOI: 10.1186/s13059-020-1945-3
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Fig. 1 Model structure and parameter inference. a A single gene’s information transfer through transcription, splicing, and translation, and the ordinary differential equations governing the spliced mRNA and protein populations. b Conceptual framework for extrapolation from snapshot sequencing data. c Protein acceleration workflow: estimation of equilibrium states u = γs and s = γp (black dashed lines) from imputed gene-specific population data (light brown), gene-specific extrapolation to calculate Δs and Δp, identification of nearest neighbors (dark gray: cell i, intermediate gray: n neighboring cells j, light gray: non-neighbor cells, circle: neighborhood), calculation of transition probabilities and embedded velocities (red: RNA velocity, blue: protein velocity, T: transition probability from cell i to neighbor j, u: unit vector from cell i to neighbor j), and visualization of acceleration (blue arrow: protein velocity, red arrow: RNA velocity, combined curvature: gray Bézier curve)
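The first two steps of the workflow in panel c (fitting the equilibrium lines, then extrapolating each cell) can be illustrated with a minimal Python sketch. It is not the protaccel API. The function names, the separate symbols `gamma_s` and `gamma_p` for the two slopes, the unit splicing and translation rates, the time step `dt`, and the use of an ordinary least-squares fit through the origin over all cells are assumptions made for illustration only.

```python
def fit_slope_through_origin(x, y):
    """Least-squares slope g of the line y = g * x (no intercept)."""
    sxx = sum(xi * xi for xi in x)
    if sxx == 0:
        raise ValueError("x must contain at least one nonzero value")
    return sum(xi * yi for xi, yi in zip(x, y)) / sxx


def extrapolate_gene(u, s, p, dt=1.0):
    """Per-cell extrapolation for one gene (hypothetical helper).

    u, s, p: per-cell unspliced RNA, spliced RNA, and protein abundances.
    Assumes rates are rescaled so that ds/dt = u - gamma_s * s and
    dp/dt = s - gamma_p * p.
    """
    gamma_s = fit_slope_through_origin(s, u)  # equilibrium line u = gamma_s * s
    gamma_p = fit_slope_through_origin(p, s)  # equilibrium line s = gamma_p * p
    # Displacement from the equilibrium line, scaled by the time step.
    delta_s = [(ui - gamma_s * si) * dt for ui, si in zip(u, s)]
    delta_p = [(si - gamma_p * pi) * dt for si, pi in zip(s, p)]
    return gamma_s, gamma_p, delta_s, delta_p


# Toy data for three cells of a single gene (made-up numbers).
gamma_s, gamma_p, delta_s, delta_p = extrapolate_gene(
    u=[1.0, 2.0, 3.0], s=[2.0, 4.0, 6.0], p=[4.0, 8.0, 12.0]
)
```

In this toy input every cell lies exactly on both equilibrium lines, so both extrapolated displacements are zero. Cells above a line receive a positive Δs or Δp (predicted increase), and cells below it a negative one.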
Protein acceleration datasets and parameters
| Dataset | CITE-seq | REAP-seq | ECCITE-seq ctrl | ECCITE-seq CTCL | 10X 1k | 10X 10k |
|---|---|---|---|---|---|---|
| RNA data | GSM2695381 | GSM2685238 | GSM3596095 | GSM3596100 | See Methods | See Methods |
| Protein data | GSM2695382 | GSM2685243 | GSM3596096 | GSM3596101 | See Methods | See Methods |
| Alignment software | | | | | | |
| Counting software | | | | | | |
| Reference genome | GRCh38 | hg19 | hg19 | hg19 | GRCh38 | GRCh38 |
| Cell count | 1780 | 3158 | 5084 | 5317 | 709 | 7855 |
| Velocity genes | 1172 | 1338 | 591 | 667 | 1114 | 920 |
| Antibodies | 10 | 41 | 49 | 49 | 17 | 17 |
| Velocity proteins | 7 | 16 | 11 | 12 | 7 | 8 |
| Cell types found | 5 | 4 | 4 | 3 | 5 | 5 |
| Imputation | 400 | 800 | 800 | 800 | 50 | 50 |
| Clustering method | MVP | RVP | RVP | MVP | MVP | MVP |
| Embedding | PC2/3 and t-SNE | t-SNE | t-SNE | t-SNE | t-SNE | t-SNE |
MVP: ModularityVertexPartition; RVP: RBERVertexPartition; PC: principal component; t-SNE: t-distributed stochastic neighbor embedding
Fig. 2 Protein acceleration visualization. a CITE-seq PBMC protein acceleration, visualized on a grid in principal component space. b Spliced RNA/protein phase portraits of CD4 in six PBMC datasets. Dot color identifies cell type (blue: CD4+ T, red: B, yellow: monocytes, green: CD8+ T, purple: natural killer, pink: not identifiable unambiguously)