Zhaoyan Liu, Wei Zhu, Dmitri V Gnatenko, Natasha M Nesbitt, Wadie F Bahou.
Abstract
Genetic pathways regulating hematopoietic lineage commitment at critical stages of development remain incompletely characterized. To better delineate genetic sources of variability regulating cellular speciation during steady-state hematopoiesis, we applied a factorial single-cell latent variable model (f-scLVM) to decompose single-cell transcriptome heterogeneity into interpretable biological factors (refined pathway annotations or gene sets without annotation) dynamically regulating cell fate. Raw single-cell transcriptomic sequencing data from 1,920 hematopoietic stem and progenitor cells (HSPCs) derived from 12-week-old female mice were used for data analysis and model development. The top biological factors underlying basal hematopoiesis were then identified for the aggregate population and for lineage-restricted (myeloid, megakaryocyte, erythroid) progenitor cells. For a subset of factors, data were independently verified experimentally in a companion research paper [1]. These data facilitate the identification of novel subpopulations and the adjustment of gene sets to discover new marker genes and hidden confounding factors driving basal hematopoiesis.
Keywords: Factor analysis; Pathway annotation; Single-cell RNA sequencing analysis; Spatial reconstruction
Year: 2021 PMID: 34026977 PMCID: PMC8131567 DOI: 10.1016/j.dib.2021.107080
Source DB: PubMed Journal: Data Brief ISSN: 2352-3409
Fig. 1. Weights for the most important genes in the top 2 factors of Myeloid.
Fig. 2. Weights for the most important genes in the top 2 factors of Megakaryocyte.
Fig. 3. Weights for the most important genes in the top 2 factors of Erythroid.
Fig. 4. t-SNE visualization of the single-cell variation captured by annotated factors (left) and unannotated factors (right).
Fig. 5. Weights for the most important genes in the top 2 PCs.
Fig. 6. PCA plot for cells.
Fig. 7. t-SNE visualization of the single-cell variation captured by the first 30 principal components.
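
Several of the figure captions refer to the weights of the "most important genes" in a factor or principal component. The record does not say how importance is ranked. The sketch below assumes ranking by absolute weight, which is a common convention but an assumption here. The function name `top_genes`, the gene names, and the weight values are hypothetical placeholders, not values from the dataset.

```python
def top_genes(weights, k=30):
    """Return the k (gene, weight) pairs with the largest absolute weight.

    `weights` maps gene name -> loading on one factor or PC.
    Ranking by |weight| is an assumption, not the authors' stated rule.
    """
    ranked = sorted(weights.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return ranked[:k]


# Toy loadings for a single factor (hypothetical values for illustration only).
toy_weights = {"GeneA": 0.8, "GeneB": -1.2, "GeneC": 0.05, "GeneD": -0.4}
top2 = top_genes(toy_weights, k=2)
```

Sorting on the absolute value keeps strongly negative loadings in the ranking, since a gene can be important to a factor in either direction.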
| Subject | Biological sciences |
| Specific subject area | Single cell transcriptomics, Hematopoiesis |
| Type of data | Tables |
| How data were acquired | Raw data was acquired from GEO (accession number GSE81682, |
| Data format | Analyzed Secondary Data |
| Parameters for data collection | Quality control: removed cells with (i) fewer than 4,000 detected genes, (ii) fewer than 200,000 reads mapped to nuclear genes, or (iii) more than 10% of mapped reads mapping to the mitochondrial genome. |
| Description of data collection | The secondary data were obtained by performing quality control, normalization, selection of highly variable genes (HVGs), and scaling on the raw count data. PCA was conducted on the HVGs, and clustering was then performed using the top 30 PCs. The f-scLVM model was trained using all genes in the secondary data and the hallmark gene sets from MSigDB version 7.0. The loadings of the top 30 genes in the top 2 annotated factors for the clusters were printed out. |
| Data source location | Institution: Department of Applied Mathematics and Statistics, Stony Brook University |
| Data accessibility | Liu, Zhaoyan (2021), “HSPC_FSCLVM_DATA&CODE”, Mendeley Data, V1, doi: 10.17632/3cxw2s7jw5.1 |
| Related research article | Natasha M. Nesbitt, Lisa E. Malone, Zhaoyan Liu, Alexander Jares, Dmitri V. Gnatenko, Yupo Ma, Wei Zhu, and Wadie F. Bahou. Divergent erythroid megakaryocyte fates in Blvrb-deficient mice establish non-overlapping cytoprotective functions during stress hematopoiesis. |
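
The three quality-control thresholds listed in the table can be read as a per-cell filter. The following is a minimal sketch of that filter, not the authors' code. The `CellQC` record, its field names, and the example cells are hypothetical. It also assumes that "mapped reads" means nuclear plus mitochondrial reads, which the record does not state.

```python
from dataclasses import dataclass


@dataclass
class CellQC:
    """Per-cell QC summary (hypothetical field names)."""
    cell_id: str
    n_genes_detected: int
    nuclear_reads: int
    mito_reads: int


MIN_GENES = 4_000            # (i) minimum detected genes
MIN_NUCLEAR_READS = 200_000  # (ii) minimum reads mapped to nuclear genes
MAX_MITO_FRACTION = 0.10     # (iii) maximum mitochondrial share of mapped reads


def passes_qc(cell):
    """Return True if the cell satisfies all three thresholds."""
    # Assumption: total mapped reads = nuclear + mitochondrial reads.
    total_mapped = cell.nuclear_reads + cell.mito_reads
    if total_mapped == 0:
        return False
    mito_fraction = cell.mito_reads / total_mapped
    return (
        cell.n_genes_detected >= MIN_GENES
        and cell.nuclear_reads >= MIN_NUCLEAR_READS
        and mito_fraction <= MAX_MITO_FRACTION
    )


# Hypothetical cells: only "a" satisfies every threshold.
cells = [
    CellQC("a", 5_000, 300_000, 10_000),
    CellQC("b", 3_500, 300_000, 10_000),  # too few detected genes
    CellQC("c", 5_000, 150_000, 5_000),   # too few nuclear reads
    CellQC("d", 5_000, 300_000, 60_000),  # mitochondrial fraction above 10%
]
kept_ids = [c.cell_id for c in cells if passes_qc(c)]
```

Keeping the thresholds as named constants makes it easy to check them against the table and to adjust them for a different dataset.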