Asif Adil, Vijay Kumar, Arif Tasleem Jan, Mohammed Asger.
Abstract
Rapid cost reductions and advances in next-generation sequencing have made profiling of individual cells a routine practice in scientific laboratories worldwide. Single-cell transcriptomics [single-cell RNA sequencing (SC-RNA-seq)] has immense potential for uncovering novel aspects of the basis of human life. The well-known heterogeneity among individual cells can be studied more effectively with single-cell transcriptomics. Proper downstream analysis of these data will provide new insights to the scientific community. However, because of the low amount of starting material, SC-RNA-seq data pose various computational challenges, including normalization, differential gene expression analysis, and dimensionality reduction. Additionally, new methods such as 10× Chromium can profile millions of cells in parallel, which generates a considerable amount of data. Handling single-cell data is therefore another major challenge. This paper reviews single-cell sequencing methods, library preparation, and data generation. We highlight some of the main computational challenges that need to be addressed by new bioinformatics algorithms and analysis tools. We also present single-cell transcriptomics data as a big data problem.
Keywords: Sc-RNA-seq; big data; downstream analysis; normalization; single-cell analysis; single-cell big data; single-cell transcriptomics
Year: 2021 PMID: 33967674 PMCID: PMC8100238 DOI: 10.3389/fnins.2021.591122
Source DB: PubMed Journal: Front Neurosci ISSN: 1662-453X Impact factor: 4.677
Current SC-RNA-seq profiling techniques, based on transcript coverage and UMI insertion possibility.
| Method | Transcript coverage | UMI possibility |
|---|---|---|
| ScNaUmi-seq | Full length | Yes |
| MATQ-seq | Full length | Yes |
| 10× Chromium | 3′ end | Yes |
| CEL-seq2 | 3′ end | Yes |
| Drop-seq | 3′ end | Yes |
| InDrop | 3′ end | Yes |
| Smart-seq2 | Full length | No |
| STRT-seq | 5′ end | Yes |
| MARS-seq | 3′ end | Yes |
| Smart-seq | Full length | No |
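For the UMI-capable methods listed above, reads that share the same cell, gene, and UMI are usually treated as amplification copies of a single original molecule. The following is a minimal sketch of that collapsing step, assuming a hypothetical input of `(cell_barcode, umi, gene)` tuples; the function name `count_umis` and the record layout are illustrative only, and the sketch does not correct sequencing errors within UMIs as dedicated tools do.

```python
from collections import defaultdict


def count_umis(reads):
    """Collapse reads into per-cell, per-gene counts of distinct UMIs.

    `reads` is an iterable of (cell_barcode, umi, gene) tuples. This layout
    is a hypothetical simplification, not the output format of any aligner.
    """
    seen = defaultdict(set)  # (cell, gene) -> set of UMIs observed
    for cell, umi, gene in reads:
        seen[(cell, gene)].add(umi)
    # The molecule count is the number of distinct UMIs, not the number of reads.
    return {key: len(umis) for key, umis in seen.items()}


reads = [
    ("cellA", "AACG", "GeneX"),
    ("cellA", "AACG", "GeneX"),  # duplicate: same cell, gene, and UMI
    ("cellA", "TTGC", "GeneX"),
    ("cellB", "AACG", "GeneX"),
    ("cellB", "GGTA", "GeneY"),
]
umi_counts = count_umis(reads)
```

Here the two identical `cellA`/`GeneX`/`AACG` reads contribute a single molecule, so `cellA` ends up with two `GeneX` molecules rather than three reads.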
Commonly used methods for cell isolation based on biological characteristics.
| Method | Mode | Throughput | Advantages | Disadvantages |
|---|---|---|---|---|
| Fluorescence-activated cell sorting | Automatic | High | High rate of rare cell sorting, high purity | Cost-intensive, high skills required |
| Magnetic-activated cell separation | Automatic | High | High purity, cost-efficient | Cell capture is non-specific |
Commonly used methods for cell isolation based on physical characteristics.
| Method | Mode | Throughput | Advantages | Disadvantages |
|---|---|---|---|---|
| Microfluidic cell separation | Automatic | High | Works with low starting materials, amplification integration | High skills required, dissociated cells |
| Micromanipulation (manual cell picking) | Manual | Low | More control over cell, live and intact cell separation | Laborious, high skills needed |
| Laser-capture microdissection | Manual | Low | Undamaged live cell capture, highly advanced | Too complex to operate, threat of contamination by neighboring cells |
| Density gradient centrifugation | Manual | Low | Cost-efficient | Too slow and laborious, low yield |
FIGURE 1. Single-cell analysis in disease and health. The workflow starts with dissociation of the target cells from the target tissue or organ, followed by their isolation using fluorescence-activated cell sorting (FACS) or microfluidic techniques, and then RNA extraction. RNA extraction is followed by cDNA synthesis by reverse transcriptase, amplification, and sequencing. The sequencing reads are then aligned and quantified, which yields a quantification matrix, also called a gene expression matrix.
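The final step of this workflow, turning per-cell gene counts into a gene expression matrix, can be sketched as follows. This is an illustrative sketch under stated assumptions: the input dictionary, the helper names `build_matrix` and `log_normalize`, and the scale factor of 10,000 are not taken from the paper. The normalization shown (scaling each cell to a common total, then applying log(1 + x)) is one common library-size approach, not the only one.

```python
import math


def build_matrix(counts):
    """Arrange {(cell, gene): count} into a genes x cells matrix (list of rows)."""
    genes = sorted({gene for _, gene in counts})
    cells = sorted({cell for cell, _ in counts})
    # Missing (cell, gene) pairs are zeros, which is why these matrices are sparse.
    matrix = [[counts.get((cell, gene), 0) for cell in cells] for gene in genes]
    return genes, cells, matrix


def log_normalize(matrix, scale=10_000):
    """Scale each cell (column) to `scale` total counts, then apply log(1 + x).

    The default scale factor is an assumed convention, not a value from the source.
    """
    n_cells = len(matrix[0]) if matrix else 0
    totals = [sum(row[j] for row in matrix) for j in range(n_cells)]
    return [
        [math.log1p(row[j] * scale / totals[j]) if totals[j] else 0.0
         for j in range(n_cells)]
        for row in matrix
    ]


# Hypothetical per-cell, per-gene molecule counts.
counts = {("cellA", "GeneX"): 2, ("cellB", "GeneX"): 1, ("cellB", "GeneY"): 1}
genes, cells, matrix = build_matrix(counts)
normalized = log_normalize(matrix)
```

Rows correspond to genes and columns to cells, so `matrix[0][1]` is the count of the first gene in the second cell.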
Widely used tools for read alignment and expression quantification.
| Tool | Category | Approach |
|---|---|---|
| Salmon | Expression quantification | k-mer-based read quantification |
| Kallisto | Expression quantification | Pseudoalignment-based rapid read determination |
| StringTie | Expression quantification | Alignment based, splice aware |
| HISAT2 | Read alignment | Alignment based, splice aware |
| Sailfish | Expression quantification | k-mer-based read quantification |
| RNA-Skim | Expression quantification | |
| TopHat2 | Read alignment | Alignment based, splice aware |
| STAR | Read alignment | Alignment based, splice aware |
| Bowtie | Read alignment | Maintains a quality threshold, hence fewer mismatches |
| Cufflinks | Expression quantification | Alignment based, splice aware |
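To give a rough sense of what k-mer-based and pseudoalignment-based quantification means, the toy sketch below assigns each read to the set of transcripts that contain all of its indexed k-mers, without computing a base-level alignment. It illustrates the general idea only and is not how Salmon, Sailfish, or Kallisto are implemented; the transcript sequences, function names, and the choice of `k = 4` are hypothetical, and real tools also resolve reads that are compatible with several transcripts.

```python
from collections import Counter, defaultdict


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(transcripts, k):
    """Map each k-mer to the set of transcript names that contain it."""
    index = defaultdict(set)
    for name, seq in transcripts.items():
        for kmer in kmers(seq, k):
            index[kmer].add(name)
    return index


def compatible_transcripts(read, index, k):
    """Intersect the transcript sets of every indexed k-mer in the read."""
    compatible = None
    for kmer in kmers(read, k):
        hits = index.get(kmer)
        if not hits:
            continue  # k-mer absent from the index; skipped in this toy version
        compatible = set(hits) if compatible is None else compatible & hits
    return compatible or set()


# Hypothetical toy transcripts and reads.
transcripts = {"tx1": "ACGTACGTGA", "tx2": "TTGCAAGCTA"}
k = 4
index = build_index(transcripts, k)

read_counts = Counter()
for read in ["CGTACG", "GCAAGC", "GGGGGG"]:
    hits = compatible_transcripts(read, index, k)
    if len(hits) == 1:  # count only unambiguously assigned reads
        read_counts[next(iter(hits))] += 1
```

In this example the first read is compatible only with `tx1`, the second only with `tx2`, and the third matches no transcript and is left uncounted.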
FIGURE 2. (A) The number of published studies addressing big data and SC-RNA-seq rises steeply every year. For big data papers on PubMed, we used the query “[big data (All Fields) AND MapReduce (All Fields) AND Hadoop (All fields)].” For SC-RNA-seq and big data papers on PubMed, we used “[(scRNA-seq OR Big Data) OR (Single-cell AND big data)].” (B,C) Numbers for some exemplary projects were collected from the Human Cell Atlas Data Portal.