Michael A Ortega, Olivier Poirion, Xun Zhu, Sijia Huang, Thomas K Wolfgruber, Robert Sebra, Lana X Garmire.
Abstract
It has become increasingly clear that both normal and cancer tissues are composed of heterogeneous populations. Genetic variation can be attributed to the downstream effects of inherited mutations, environmental factors, or inaccurately resolved errors in transcription and replication. When lesions occur in regions that confer a proliferative advantage, they can support clonal expansion, subclonal variation, and neoplastic progression. In this manner, the complex heterogeneous microenvironment of a tumour increases the likelihood of angiogenesis and metastasis. Recent advances in next-generation sequencing and computational biology have utilized single-cell applications to build deep profiles of individual cells that are otherwise masked in bulk profiling. In addition, the development of new techniques for combining single-cell multi-omic strategies is providing a more precise understanding of factors contributing to cellular identity, function, and growth. Continuing advancements in single-cell technology and computational deconvolution of data will be critical for reconstructing patient-specific intra-tumour features and developing more personalized cancer treatments.
Keywords: Cancer; Gene expression; Heterogeneity; Methylation; Multi-omics; Mutation; Single-cell sequencing
Year: 2017 PMID: 29285690 PMCID: PMC5746494 DOI: 10.1186/s40169-017-0177-y
Source DB: PubMed Journal: Clin Transl Med ISSN: 2001-1326
Fig. 1 Heterogeneity and metastasis. a Normal healthy tissues have a naturally occurring degree of somatic heterogeneity. These mutations can arise due to environmental factors and inaccurately resolved errors in transcription or replication. b As mutations stochastically arise, some will be neutral, having no apparent effect on the phenotype, while others may occur in ‘driver’ gene regions and produce more immediately observable traits. For example, mutated DNA damage response (DDR) genes can drive tumorigenesis because they leave the cell without the pathways needed to resolve lesions. c Driver gene mutations can confer an advantage on the founder clone and promote subsequent expansion. d Secondary mutations that occur in subclones further drive heterogeneity and can lead to metastasis. Additionally, recent research suggests that metastases may also derive from early disseminated cancer cells
Notable advancements in single-cell techniques
| Year introduced | Notable technology advancements | Method cell range (a) |
|---|---|---|
| 2009 | Tang et al. | 1 (b) |
| 2011 | STRT-seq | < 100 |
| 2012 | SMART-seq | < 100 |
| 2012 | CEL-Seq | < 100 |
| 2013 | Fluidigm C1 (IFC) | < 800 |
| 2013 | Smart-seq 2 | < 1000 |
| 2014 | MARS-seq | 10,000s |
| 2015 | Drop-seq | 10,000s |
| 2015 | inDrop | 10,000s |
| 2016 | Chromium (10× Genomics) | 10,000s |
| 2017 | ddSeq (Bio-Rad) | 10,000s |
| 2017 | SPLiT-seq | 10,000s |
| 2017 | Seq-well | 10,000s |
This is a non-comprehensive list of peer-reviewed studies that advanced single-cell isolation and preparation techniques
(a) The “range” lists the largest relative population that can be or has been studied using this technique
(b) This method involves mechanical separation and isolation of individual blastomeres into single wells
Fig. 2 Clonal phylogeny in cancer and resistance. A Darwinian tree model best describes clonal evolution. a Multiregion biopsies have been used to investigate intra-tumour heterogeneity. This involves taking biopsies from different regions of the same tumour and then preparing high-throughput sequencing libraries. b Phylogenetic reconstruction of clonal evolution gives a detailed understanding of heterogeneity in the tumour. A mutation occurring at the ‘trunk’ of the tree promotes clonal expansion. Subclones arise due to subsequent mutations that diversify the population. Driver mutations can also occur later in clonal evolution and confer resistance properties that were not present with the initial driver mutation. If chemotherapy fails to knock out unique trunks, a drug-resistant population will remain and serve as the dominant feature during relapse
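The trunk and subclone distinction in Fig. 2 can be illustrated with a small sketch. It assumes a hypothetical presence/absence table of mutations across multiregion biopsies; the function name, labels, and toy data are illustrative and not taken from the paper, and real phylogenetic reconstruction also has to model sequencing errors and missing calls.

```python
from typing import Dict, List


def classify_mutations(presence: Dict[str, List[int]]) -> Dict[str, str]:
    """Label each mutation by how many biopsy regions carry it.

    presence maps a mutation name to a list of 0/1 calls, one per region.
    """
    labels = {}
    for mutation, calls in presence.items():
        n_regions = len(calls)
        n_present = sum(calls)
        if n_present == n_regions:
            labels[mutation] = "trunk"    # found in every sampled region
        elif n_present > 1:
            labels[mutation] = "branch"   # shared by a subset of regions
        elif n_present == 1:
            labels[mutation] = "private"  # restricted to a single region
        else:
            labels[mutation] = "absent"   # not detected anywhere
    return labels


# Hypothetical toy data: three mutations called in four biopsy regions.
toy_presence = {
    "mut_A": [1, 1, 1, 1],
    "mut_B": [1, 1, 0, 0],
    "mut_C": [0, 0, 0, 1],
}
labels = classify_mutations(toy_presence)
```

Under these assumptions, a mutation seen in every region is a candidate trunk event, while mutations seen in only some regions mark later subclonal branches.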
Fig. 3 Single-cell isolation and preparation. a One method for isolating single cells is droplet microfluidics. In the first channel, individual cells are coupled with uniquely bar-coded beads that continue down the pipe until they are captured by an oil droplet. The oil droplets are then pooled in high quantities, and PCR is performed on the population. b A second approach for isolating single cells is to pre-enrich cells by FACS and then pass them through an IFC chip, which collects them into individual wells. IFC chips are available in different cell size ranges, which helps limit the capture of more than one cell per well. Unlike the droplet approach, PCR is performed on individual cells. This method yields lower overall cell counts than the droplet approach; however, there is a reported trade-off in sensitivity. Overall, droplet-based methods yield higher numbers of cells per experiment but sparser data, whereas microfluidic chip methods provide a deeper cellular profile of fewer cells. Researchers have to weigh single-cell read depth against single-cell population breadth
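Because the droplet approach in Fig. 3a pools bar-coded material from many cells before PCR, reads must later be assigned back to their cell of origin by barcode. The following is a minimal sketch, assuming each read has already been reduced to a hypothetical `(cell_barcode, gene)` pair. The function name and toy reads are illustrative and not from the paper, and real pipelines additionally handle barcode errors and duplicate molecules.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def count_reads_per_cell(
    reads: Iterable[Tuple[str, str]]
) -> Dict[str, Dict[str, int]]:
    """Group pooled reads by cell barcode and count reads per gene."""
    counts = defaultdict(lambda: defaultdict(int))
    for barcode, gene in reads:
        counts[barcode][gene] += 1
    # Convert the nested defaultdicts to plain dicts for easier inspection.
    return {barcode: dict(genes) for barcode, genes in counts.items()}


# Hypothetical pooled reads from two droplets (barcodes AAAC and GGTT).
toy_reads = [
    ("AAAC", "GAPDH"),
    ("AAAC", "GAPDH"),
    ("AAAC", "TP53"),
    ("GGTT", "TP53"),
]
cell_counts = count_reads_per_cell(toy_reads)
```

Each barcode then stands in for one captured cell, and its gene counts form one row of an expression matrix.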
Tools for investigating heterogeneity
| Name | Description | Link | Input | Specific to single-cell | References | Accessibility |
|---|---|---|---|---|---|---|
| Databases | | | | | | |
| scRNASeqDB | A database for gene expression profiling in human single cell by RNA-seq | | N/A | Yes | Cao et al. | *** |
| The Human Protein Atlas | Spatial distribution of protein expressions | | Protein name | No | Uhlén et al. | ***** |
| Enrichr | Very complete meta-database | | List of genes/proteins; BED file | No | Kuleshov et al. | ***** |
| CIViC | Clinical interpretation of cancer variants | | Gene or variant ID | No | Griffith et al. | **** |
| MyGene2 | A portal for sharing health and genetic information | | Genetic information | No | Xin et al. | **** |
| Genome sequencing | | | | | | |
| SCITE | Pseudo-temporal clonal tree construction | | Presence/absence/unknown mutation matrix | Yes | Jahn et al. | * |
| oncoNEM | Pseudo-temporal clonal tree construction | | Binary matrix + estimation of FPR and FNR for each SNV | Yes | Ross and Markowetz | * |
| BWA | DNA reads aligner | | fastq file + reference genome | No | Li | * |
| Methylation | | | | | | |
| Bismark | Aligner for bisulfite treated sequencing reads | | fastq files | No | Krueger and Andrews | ** |
| RNA-seq | | | | | | |
| Granatum | Graphical pipeline for scRNA-seq analysis | | Expression matrix and sample metadata | Yes | Zhu et al. | ***** |
| Monocle2 | Pseudo-time construction using DDRTree | | Expression matrix and sample metadata | Yes | Trapnell et al. | * |
| scLVM | Subpopulation detection | | Expression matrix | Yes | Buettner et al. | * |
| PseudoGP | Probabilistic pseudotime for single-cell RNA-seq data | | Expression matrix | Yes | Campbell et al. | * |
| SPADE | Cell hierarchy inference | | Expression matrix | Yes | Anchang et al. | * |
| STAR | RNA-seq reads aligner | | Fast | No | Dobin et al. | ** |
| CNV | | | | | | |
| InferCNV | Average gene expression on large genomic regions | | Gene expression matrix | Yes | Patel et al. | ** |
| ECdetect | Detection of extrachromosomal DNA | | BAM file | Yes | Turner et al. | * |
| CNVkit | Detection of CNV from DNA sequencing | | BAM file + target regions (BED files) | No | Talevich et al. | ** |
| SynthEx | Detection of copy number alteration and tumour heterogeneity profiling for whole genome and exome sequencing | | Count data (BED files) + optional VCF files for tumor samples | No | Silva et al. | ** |
| Ginkgo | Web platform for visualization and clustering | | BED files | Yes | | ***** |
| MutSigCV 2.0 | Eliminate false positive mutations in large datasets | | Mutations for each sample + sequencing coverage | No | Lawrence et al. | ** |
| HotNet2 | Identify mutated subnetworks across pathways and protein complexes | | Mutation data + protein–protein interaction network | No | Leiserson et al. | * |
| Proteomics | | | | | | |
| Wishbone | Reconstructing bifurcating developmental trajectories of single cells | | tsv expression files | Yes | Setty et al. | *** |
| Multi-omics | | | | | | |
| DeepCpG | Infer missing methylation states and expressive DNA motifs linked to methylation | | Methylation position file + ref genomes + fastq files | Yes | Angermueller et al. | * |
| SSrGE | Link gene expression with SNVs. Provide a pipeline to extract SNVs from scRNA-seq | | Expression matrix + binary matrix; fastq files | Yes | Poirion et al. | * |
| Genetic architecture | | | | | | |
| combinatorialHiC | Processing single cell combinatorial indexed Hi-C | | fastq files + barcodes | Yes | Ramani et al. | * |
| Others | | | | | | |
| Integrate-neo | Gene fusion neoantigen discovery tool | | fastq files (or tsv) + bedpe files + reference genomics | No | Zhang et al. | * |
| awesome-single-cell | Exhaustive community-driven list of single-cell analytical tools | | N/A | Yes | N/A | ***** |
Software, computational packages, and databases mentioned in the paper
The accessibility of a tool is our evaluation of its user-friendliness towards bench scientists who are not necessarily computationally trained
The accessibility ranges from “*” (least accessible) to “*****” (most accessible)
Fig. 4 Single-cell multi-omics analysis workflow. a Multi-omic technologies can produce reads from the transcriptome (RNA-seq), the genome (exome sequencing), and/or the methylome, from the same cells. b Read alignment, quality control (QC), and specific processing steps create “feature expression” matrices, in which cells are represented as row vectors and genomic features (e.g. gene expression, methylation) as columns. c The different omic matrices can then be analyzed independently, for example to detect cell subpopulations and rank the genomic features. d Finally, multi-omics integration can be performed to identify coherent features from different omics that separate different subpopulations
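The matrix layout in Fig. 4b and the integration step in Fig. 4d can be sketched with a deliberately naive approach: z-score each feature column, then concatenate the omic matrices cell by cell. This is only one simple form of integration and is not the method of the paper or of any tool listed above; the function names and toy values are hypothetical.

```python
from statistics import mean, pstdev
from typing import List

Matrix = List[List[float]]  # rows = cells, columns = features


def zscore_columns(matrix: Matrix) -> Matrix:
    """Scale every feature column to zero mean and unit variance."""
    scaled_columns = []
    for column in zip(*matrix):
        mu = mean(column)
        sigma = pstdev(column)
        if sigma == 0:
            # A constant feature carries no signal; map it to zeros.
            scaled_columns.append([0.0] * len(column))
        else:
            scaled_columns.append([(v - mu) / sigma for v in column])
    # Transpose back so that rows are cells again.
    return [list(row) for row in zip(*scaled_columns)]


def concatenate_omics(*matrices: Matrix) -> Matrix:
    """Join scaled omic matrices side by side, one combined row per cell."""
    n_cells = len(matrices[0])
    if any(len(m) != n_cells for m in matrices):
        raise ValueError("all omic matrices must describe the same cells")
    scaled = [zscore_columns(m) for m in matrices]
    return [sum((m[i] for m in scaled), []) for i in range(n_cells)]


# Hypothetical toy data: 4 cells, 2 expression features, 1 methylation feature.
expression = [[10.0, 0.0], [12.0, 1.0], [0.0, 9.0], [1.0, 11.0]]
methylation = [[0.9], [0.8], [0.1], [0.2]]
combined = concatenate_omics(expression, methylation)
```

Scaling before concatenation keeps an omic with large raw values, such as read counts, from dominating one measured on a 0 to 1 scale, such as methylation fractions. The combined matrix could then be passed to any clustering method to look for subpopulations.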