Chung-Chau Hon, Jay W Shin, Piero Carninci, Michael J T Stubbington.
Abstract
The Human Cell Atlas is a large, international consortium that aims to identify and describe every cell type in the human body. The comprehensive cellular maps that arise from this ambitious effort have the potential to transform many aspects of fundamental biology and clinical practice. Here, we discuss the technical approaches that could be used today to generate such a resource and also the technical challenges that will be encountered.
Year: 2018 PMID: 29092000 PMCID: PMC6063304 DOI: 10.1093/bfgp/elx029
Source DB: PubMed Journal: Brief Funct Genomics ISSN: 2041-2649 Impact factor: 4.241
Figure 1. Overview of the paths from tissue acquisition to data dissemination in the HCA. scRNAseq protocols act on disaggregated suspensions of cells from human organs with optional stages at which samples may be fixed or otherwise preserved. Spatially resolved methods analyse sections of fixed tissues. The data that are generated must be stored, analysed and disseminated.
Tools for estimation of expression levels
| Goals | Methods/features | Tools |
|---|---|---|
| Quality control | Visualizing various quality control metrics | Scater |
| | Data-driven identification of low-quality cells | SinQC |
| UMI processing | General processing of UMIs | umis |
| | Systematic correction of UMI sequencing errors | UMI-tools |
| Normalization with spike-in | Simple statistical models | SAMstrt |
| | Bayesian approaches to normalize cell-specific noise | BASiCS |
| Normalization without spike-in | Estimating cell-specific factors by learning the properties of clusters of similar cells | scran |
| | Gene-specific scaling | SCnorm |
| | Imputation with gene-specific dropout models | SCONE |
| Batch effect removal | Originally developed for microarrays or bulk RNA-seq but used in scRNAseq | ComBat |
| | Specifically developed for scRNAseq | scPLS |
| Cell cycle effect removal | Remove the cell cycle components from the expression values | scLVM |
| | Identify and remove the genes that are affected by cell cycle stages | ccRemover |
| Simulation | Simulation of scRNAseq data sets for benchmarking methods | Splatter |
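
To make the "UMI processing" goal in the table above concrete, here is a minimal Python sketch of the basic idea behind UMI counting: reads that share a cell barcode, gene and UMI are treated as PCR duplicates of one molecule and counted once. It is an illustration only, not the algorithm of umis or UMI-tools; in particular it does no sequencing-error correction. The function name `count_umis` and the toy reads are hypothetical.

```python
from collections import defaultdict


def count_umis(reads):
    """Collapse reads into molecule counts per (cell barcode, gene).

    `reads` is an iterable of (cell_barcode, gene, umi) tuples, one per
    aligned read. Reads sharing all three values count as one molecule.
    Assumption: UMIs are compared exactly, with no error correction.
    """
    molecules = defaultdict(set)
    for cell, gene, umi in reads:
        molecules[(cell, gene)].add(umi)
    return {key: len(umis) for key, umis in molecules.items()}


# Hypothetical toy input, not real data.
reads = [
    ("AAAC", "GAPDH", "TTGCA"),
    ("AAAC", "GAPDH", "TTGCA"),  # same cell, gene and UMI: a duplicate
    ("AAAC", "GAPDH", "CCGTA"),  # different UMI: a second molecule
    ("AAAC", "ACTB", "TTGCA"),
    ("GGGT", "GAPDH", "TTGCA"),
]
counts = count_umis(reads)
```

Exact matching overcounts molecules whenever a UMI picks up a sequencing error, which is the problem the error-correcting tools in the table are listed as addressing.
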
Tools for definition of cell identity
| Goals | Methods/features | Tools |
|---|---|---|
| Dimensionality reduction | Linear, PCA | PCA |
| | Nonlinear, t-SNE embedding | t-SNE |
| | Nonlinear, diffusion map | destiny |
| | Nonlinear, non-negative matrix factorization | Nimfa |
| | Linear, specifically designed to model, or to impute, dropouts | ZIFA |
| | Machine learning for a custom distance metric | SIMLR |
| Classification of cell types | Graph theory-based clustering methods | SNN-Cliq |
| | Combinations of standard dimension reduction and clustering algorithms | pcaReduce |
| | Bi-clustering of cells and genes | BackSPIN |
| | Hierarchical clustering on centred Pearson’s correlation | SINCERA |
| | Grade of membership models | CountClust |
| | Distinguish rare cell types from background noise | RaceID |
| Trajectory inference | Linear trajectory inference | DeLorean |
| | Branched trajectory inference | BEAM |
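
As an illustration of the "centred Pearson’s correlation" entry in the table above, the following Python sketch computes a correlation distance (1 minus Pearson's r) between every pair of cells. A hierarchical clustering step would take such distances as input. This is a sketch under stated assumptions, not SINCERA's implementation: the function names and the toy expression profiles are hypothetical, and each profile is assumed to hold the same genes in the same order.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [a - mean_x for a in x]  # centre each profile on its own mean
    dy = [b - mean_y for b in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    norm = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if norm == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / norm


def correlation_distances(profiles):
    """Return {(cell_a, cell_b): 1 - r} for every unordered pair of cells."""
    names = list(profiles)
    return {
        (a, b): 1.0 - pearson(profiles[a], profiles[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


# Hypothetical toy profiles (four genes per cell), not real data.
profiles = {
    "cell_1": [5.0, 0.0, 3.0, 1.0],
    "cell_2": [10.0, 0.0, 6.0, 2.0],  # same pattern as cell_1 at twice the depth
    "cell_3": [0.0, 4.0, 1.0, 6.0],
}
distances = correlation_distances(profiles)
```

Because the correlation is centred and scaled, `cell_1` and `cell_2` end up at a distance of approximately zero despite their different sequencing depths, while `cell_3`, whose pattern is roughly opposite, lies at a distance greater than 1.
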
Tools for identification of gene signatures
| Goals | Methods/features | Tools |
|---|---|---|
| Identification of differentially expressed genes | Detect differences in the mean of expression levels by modelling the bimodal distribution of expression levels | MAST |
| | Detect differences in the distribution, instead of the mean, of expression levels | SCPattern |
| | Identify variations in expression attributable to sets of genes | f-scLVM |
| | Incorporate pseudotime information to identify genes that change significantly along the inferred cell trajectory | switchde |
| Identification of cell-type-specific genes | Signature genes co-identified during clustering of cells | BackSPIN |
| | Regression-based approaches | SINCERA |
| | Machine learning approaches | SVM-RFE |
| Inference of GRN | Originally developed for microarrays or bulk RNA-seq but used in scRNAseq | WGCNA |
| | Boolean network models specifically designed for single-cell data sets | SingCellNet |
| | Incorporate pseudotime information to identify co-expressed genes | LEAP |