Douglas Abrams, Parveen Kumar, R Krishna Murthy Karuturi, Joshy George.
Abstract
BACKGROUND: The advent of single cell RNA sequencing (scRNA-seq) has enabled researchers to study transcriptomic activity within individual cells and to identify the inherent cell types in a sample. Although numerous computational tools have been developed to analyze single cell transcriptomes, there are no published studies or analytical packages available to guide experimental design and to devise a suitable analysis procedure for cell type identification.
Keywords: Analysis design; Cell-type identification; Clustering; Experimental design; Single cell RNA-seq
Year: 2019 PMID: 31167661 PMCID: PMC6551246 DOI: 10.1186/s12859-019-2817-2
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 Schematic representation of the SCEED pipeline. (Left to right) First, a simulated dataset is generated using the SCEED “generateDataset” function with the input parameters listed under “Data simulation”. Next, the simulated dataset is analyzed using different single cell analysis procedures. To test the performance of each single cell algorithm, the F1 score, a measure of a test’s accuracy, is computed. Finally, based on the F1 score cutoff chosen by the user, the best analysis procedure and the number of cells required to perform the single cell experiment are selected.
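The last two steps of the pipeline (scoring each procedure, then choosing a design from a user-defined cutoff) can be sketched in Python as follows. This is a minimal illustration, not the SCEED implementation: the function names `f1_score` and `select_design` are hypothetical, the cluster labels are assumed to have already been matched to cell types, and the rule "fewest cells that meet the cutoff, ties broken by higher F1" is an assumption about how a design might be chosen.

```python
def f1_score(true_labels, predicted_labels, target):
    """F1 score for one cell type, treating `target` as the positive class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == target and p == target)
    fp = sum(1 for t, p in pairs if t != target and p == target)
    fn = sum(1 for t, p in pairs if t == target and p != target)
    if tp == 0:
        return 0.0  # no correct calls for the target type
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


def select_design(scores, cutoff):
    """Pick the (procedure, n_cells) pair with the fewest cells meeting the cutoff.

    `scores` maps (procedure_name, n_cells) -> F1 score (hypothetical layout).
    Ties on n_cells are broken by the higher F1 score.
    Returns None if no combination reaches the cutoff.
    """
    passing = [(n, -f1, proc) for (proc, n), f1 in scores.items() if f1 >= cutoff]
    if not passing:
        return None
    n, _, proc = min(passing)
    return proc, n
```

Scoring a single target type is a natural fit when the question is whether a rare population can be recovered; averaging over all cell types would be an equally reasonable variant.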
Fig. 2 Schematic representation showing the generation of a simulated dataset using SCEED. (Left to right) A blank matrix is provided as input. First, (1) the mean expression of all genes and (2) the number of marker genes at a desired fold change cutoff are simulated, followed by adjustment for (3) biological and (4) technical noise. Finally, (5) single cell counts are simulated and returned as an output matrix.
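The five steps in this schematic can be mimicked with a small NumPy sketch. Everything distributional here is an assumption made for illustration, since the caption does not state the models used: gamma-distributed baseline means, a disjoint marker set per cell type, multiplicative log-normal biological noise, random dropout as technical noise, and Poisson counts. The function `simulate_counts` and its parameters `bio_sd` and `dropout` are hypothetical and do not correspond to the SCEED “generateDataset” interface.

```python
import numpy as np


def simulate_counts(n_cells, proportions, n_genes=10_000, n_markers=50,
                    fold_change=2.0, bio_sd=0.3, dropout=0.1, seed=0):
    """Return (counts, labels): an n_cells x n_genes count matrix and cell type labels."""
    rng = np.random.default_rng(seed)
    p = np.asarray(proportions, dtype=float)
    p = p / p.sum()  # guard against floating point drift in the proportions
    n_types = len(p)

    # (1) Baseline mean expression for every gene (assumed gamma distribution).
    base_mean = rng.gamma(shape=2.0, scale=1.0, size=n_genes)

    # (2) Marker genes: a disjoint set per cell type, upregulated by fold_change
    #     (assumes n_markers is per cell type).
    type_means = np.tile(base_mean, (n_types, 1))
    marker_idx = rng.permutation(n_genes)[: n_types * n_markers]
    marker_idx = marker_idx.reshape(n_types, n_markers)
    for k in range(n_types):
        type_means[k, marker_idx[k]] *= fold_change

    # Assign each cell to a type according to the requested proportions.
    labels = rng.choice(n_types, size=n_cells, p=p)
    mu = type_means[labels]

    # (3) Biological noise: multiplicative log-normal variation per cell and gene.
    mu = mu * rng.lognormal(mean=0.0, sigma=bio_sd, size=mu.shape)

    # (4) Technical noise: each entry is dropped out independently.
    keep = rng.random(mu.shape) >= dropout

    # (5) Counts: Poisson sampling around the noisy means, zeroed where dropped.
    counts = rng.poisson(mu) * keep
    return counts, labels
```

With the default 10,000 genes the intermediate matrices are dense, so a smaller `n_genes` is convenient for quick experiments.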
Properties of the different simulated single cell datasets generated
| Cell type proportions | No. of cell types (m) | No. of Genes | No. of Marker genes | No. of cells simulated (n) | Fold change (fC) of marker genes |
|---|---|---|---|---|---|
| 0.1, 0.2, 0.2, 0.2, 0.3 | 5 | 10,000 | 50 | 1000, 2000 and 3000 | 2, 4 and 8 |
| 0.05, 0.2, 0.2, 0.2, 0.35 | 5 | 10,000 | 50 | 1000, 2000 and 3000 | 2, 4 and 8 |
| 0.02, 0.2, 0.2, 0.2, 0.38 | 5 | 10,000 | 50 | 1000, 2000 and 3000 | 2, 4 and 8 |
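
Assuming the settings in each row are fully crossed (every cell count paired with every fold change, which the table suggests but does not state outright), the simulation grid can be enumerated as below. The variable names and dictionary keys are hypothetical and chosen only for this sketch.

```python
from itertools import product

# Values copied from the table above.
proportions = [
    (0.1, 0.2, 0.2, 0.2, 0.3),
    (0.05, 0.2, 0.2, 0.2, 0.35),
    (0.02, 0.2, 0.2, 0.2, 0.38),
]
n_cells_options = (1000, 2000, 3000)
fold_changes = (2, 4, 8)

# One entry per simulated dataset, assuming a full cross of the three factors.
designs = [
    {
        "proportions": p,
        "rarest": min(p),      # proportion of the rarest cell type
        "n_cells": n,
        "fold_change": fc,
        "n_genes": 10_000,
        "n_markers": 50,
    }
    for p, n, fc in product(proportions, n_cells_options, fold_changes)
]
```

Under that assumption the grid contains 3 × 3 × 3 = 27 simulated datasets.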
Fig. 3 Performance of different single cell algorithms at different cell proportions. The F1 score was calculated at rare cell proportions of 0.02, 0.05 and 0.1, with marker genes upregulated 2-fold, for datasets of 1000, 2000 and 3000 single cells. The x-axis represents the rare cell proportion and the y-axis represents the F1 score.