Maxime Caron, Pascal St-Onge, Thomas Sontag, Yu Chang Wang, Chantal Richer, Ioannis Ragoussis, Daniel Sinnett, Guillaume Bourque.
Abstract
Childhood acute lymphoblastic leukemia (cALL) is the most common pediatric cancer. It is characterized by bone marrow lymphoid precursors that acquire genetic alterations, resulting in disrupted maturation and uncontrollable proliferation. More than a dozen molecular subtypes of variable severity can be used to classify cALL cases. Modern therapy protocols currently cure 85-90% of cases, but other patients are refractory or will relapse and eventually succumb to their disease. To better understand intratumor heterogeneity in cALL patients, we investigated the nature and extent of transcriptional heterogeneity at the cellular level by sequencing the transcriptomes of 39,375 individual cells in eight patients (six B-ALL and two T-ALL) and three healthy pediatric controls. We observed intra-individual transcriptional clusters in five out of the eight patients. Using pseudotime maturation trajectories of healthy B and T cells, we obtained the predicted developmental state of each leukemia cell and observed distribution shifts within patients. We showed that the predicted developmental states of these cancer cells are inversely correlated with ribosomal protein expression levels, which could be a common contributor to intra-individual heterogeneity in cALL patients.
Year: 2020 PMID: 32415257 PMCID: PMC7228968 DOI: 10.1038/s41598-020-64929-x
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Metadata and single cell statistics of samples used in the study.
| Sample | Number of cells | Mean Reads per Cell | Median Genes per Cell | Number of Reads | Sequencing Saturation | Reads Mapped Confidently to Genome | Fraction Reads in Cells | Total Genes Detected | Median UMI per Cell |
|---|---|---|---|---|---|---|---|---|---|
| ETV6-RUNX1_1 | 2,776 | 76,003 | 1,830 | 210,987,037 | 84.0% | 92.0% | 89.8% | 18,397 | 5,312 |
| ETV6-RUNX1_2 | 6,274 | 52,662 | 1,239 | 330,404,706 | 87.9% | 90.5% | 85.1% | 20,005 | 2,435 |
| ETV6-RUNX1_3 | 3,862 | 85,994 | 1,181 | 332,110,056 | 90.3% | 85.3% | 80.8% | 19,909 | 2,815 |
| ETV6-RUNX1_4 | 5,069 | 59,411 | 1,304 | 301,155,525 | 84.3% | 83.5% | 88.3% | 19,414 | 3,940 |
| HHD_1 | 3,728 | 83,388 | 1,579 | 310,871,911 | 78.3% | 89.1% | 77.1% | 19,061 | 4,712 |
| HHD_2 | 5,013 | 64,741 | 2,059 | 324,551,184 | 73.4% | 87.1% | 93.3% | 20,798 | 5,643 |
| PRE-T_1 | 2,959 | 109,972 | 2,114 | 325,408,071 | 80.5% | 85.3% | 74.9% | 19,775 | 6,309 |
| PRE-T_2 | 2,748 | 118,053 | 1,685 | 324,412,280 | 86.1% | 86.7% | 81.8% | 20,568 | 5,037 |
| PBMMC_1* | 1,612 | 303,228 | 880 | 488,803,883 | NA | NA | 78.5% | NA | 2,573 |
| PBMMC_2 | 3,105 | 101,124 | 1,091 | 313,990,607 | 81.0% | 79.2% | 79.9% | 19,927 | 6,408 |
| PBMMC_3 | 2,229 | 145,472 | 1,085 | 324,258,554 | 91.4% | 87.1% | 61.6% | 18,793 | 3,008 |
*Pediatric control PBMMC_1 was generated from two independent libraries to increase the total number of cells, and the sequencing runs were aggregated using Cell Ranger.
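The table can be cross-checked against the abstract with a few lines of arithmetic. The sketch below copies the cell counts, mean reads per cell and total reads from the table; the dictionary layout and the function names are illustrative choices, not part of the study's pipeline.

```python
# Values copied from the sample table: (cells, mean reads per cell, total reads).
samples = {
    "ETV6-RUNX1_1": (2776, 76003, 210987037),
    "ETV6-RUNX1_2": (6274, 52662, 330404706),
    "ETV6-RUNX1_3": (3862, 85994, 332110056),
    "ETV6-RUNX1_4": (5069, 59411, 301155525),
    "HHD_1": (3728, 83388, 310871911),
    "HHD_2": (5013, 64741, 324551184),
    "PRE-T_1": (2959, 109972, 325408071),
    "PRE-T_2": (2748, 118053, 324412280),
    "PBMMC_1": (1612, 303228, 488803883),
    "PBMMC_2": (3105, 101124, 313990607),
    "PBMMC_3": (2229, 145472, 324258554),
}


def total_cells(table, prefix=""):
    """Sum the cell counts of all samples whose name starts with `prefix`."""
    return sum(cells for name, (cells, _, _) in table.items()
               if name.startswith(prefix))


def mean_reads_consistent(table):
    """Check that each reported mean equals total reads // cells (rounded down)."""
    return all(reads // cells == mean for cells, mean, reads in table.values())


if __name__ == "__main__":
    print(total_cells(samples))           # 39375, the figure quoted in the abstract
    print(total_cells(samples, "PBMMC"))  # 6946 cells from healthy pediatric controls
    print(mean_reads_consistent(samples)) # True
```

The per-sample counts add up to the 39,375 cells reported in the abstract, and every "Mean Reads per Cell" entry matches the total reads divided by the number of cells, rounded down.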
Figure 1. Cell types identified in healthy pediatric and adult bone marrow mononuclear cells (BMMCs) using single cell RNA-seq. (A) UMAP representation of healthy BMMCs from three pediatric (PBMMC; n = 6,836 cells) and two adult (ABMMC; n = 3,467 cells) donors. (B) Expression of cell surface marker genes used to assign cell types: CD79A (B cells), CST3 (monocytes), CD3D (T cells) and HBA1 (erythrocytes). (C) Cell types identified in healthy pediatric and adult BMMCs. (D) Proportion of cells of a given cell type in pediatric and adult BMMCs. (E) Proportion of healthy cells in predicted cell cycle phases per cell type (G1, S and G2/M).
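Marker-based assignment of the kind described in panel B can be sketched as follows. Only the gene-to-cell-type mapping comes from the caption; the decision rule (pick the most highly expressed marker), the `min_expr` threshold and the function name are assumptions made for illustration, and the authors' actual procedure may differ.

```python
# Marker genes and cell types as listed in the Figure 1 caption.
MARKERS = {
    "CD79A": "B cells",
    "CST3": "monocytes",
    "CD3D": "T cells",
    "HBA1": "erythrocytes",
}


def assign_cell_type(expression, min_expr=1.0):
    """Label a cell (or cluster) by its most highly expressed marker gene.

    expression: dict mapping gene name -> normalized expression value.
    min_expr:   hypothetical cutoff below which no label is assigned.
    """
    best_gene = max(MARKERS, key=lambda gene: expression.get(gene, 0.0))
    if expression.get(best_gene, 0.0) < min_expr:
        return "unassigned"
    return MARKERS[best_gene]


if __name__ == "__main__":
    print(assign_cell_type({"CD79A": 3.2, "CD3D": 0.1}))  # B cells
    print(assign_cell_type({"CST3": 0.2}))                # unassigned
```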
Figure 2. Transcriptional landscape of cALL cancer cells. (A) UMAP representation of BMMCs from three healthy pediatric donors (n = 6,836 cells) and eight cALL patients (n = 32,086 cells). (B) UMAP representation of predicted cell cycle phases for healthy and cancer BMMCs. (C) Proportion of cells clustering with healthy (PBMMC) cell clusters. (D) Proportion of cancer cells in predicted cell cycle phases (G1, S and G2/M). (E) Heatmap and unsupervised clustering of normalized and scaled expression of the top 100 most variable genes in leukemia cells.
Figure 3. Intra-individual transcriptional heterogeneity reveals deregulated genes and pathways within cALL samples. (A) UMAP representation of cALL cells in G1 phase not clustering with healthy cell clusters (n = 16,731). (B) Mean Adjusted Rand Index (ARI) of clustering solutions over a range of resolutions (highest mean ARI at a resolution of 1.3). (C) Clusters of cells identified in cALL samples using the highest mean ARI resolution. (D) Proportion of cells belonging to each intra-individual cluster after removing clusters containing fewer than 10% of cells (n = 16,162). (E) Differentially expressed genes between the two clusters of cells within the HHD.1 sample (log fold-change > 0.75 = green, > 1 = orange). (F) Heatmap and unsupervised clustering of enriched GO biological pathways obtained using the top 100 most significant differentially expressed genes of cALL samples.
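For readers unfamiliar with the Adjusted Rand Index used in panel B, the sketch below computes it for two cluster labelings using the standard pair-counting formula. The caption does not say which clustering solutions were compared or how the mean was taken, so `mean_pairwise_ari` (averaging over all pairs of solutions at one resolution) is only one plausible reading and should be treated as an assumption.

```python
from collections import Counter
from itertools import combinations
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index between two labelings of the same cells."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must cover the same cells")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two cells: nothing to disagree on
    # Pairs of cells placed together in both labelings, in A only, in B only.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = together_a * together_b / total_pairs
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both labelings have a single cluster
    return (together_both - expected) / (maximum - expected)


def mean_pairwise_ari(solutions):
    """Hypothetical stability score: mean ARI over all pairs of solutions."""
    scores = [adjusted_rand_index(a, b) for a, b in combinations(solutions, 2)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Same partition with swapped cluster names: ARI is 1.
    print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
```

An ARI of 1 means two solutions define the same partition regardless of cluster names, while values near 0 indicate agreement no better than chance.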
Figure 4. Somatic alterations and intra-individual expression variability. (A) Copy number profiles of samples and intra-individual transcriptional clusters using equal size metacells and healthy pediatric BMMCs as control cells. (B) Number of somatic mutations in cALL samples. (C) Genomic annotations of somatic mutations in cALL samples. (D) Gene names and variant allele frequencies of exonic non-synonymous somatic mutations in cALL samples. (E) Number of somatic mutations used as input to obtain allele calls from single cell RNA-seq alignment files using vartrix (all = number of GRCh38 mutations lifted from hg19; covered = number of mutations covered in at least one cell; with alt = number of mutations with at least one mutant allele call). (F) QQ-plot of Fisher’s exact test p-values of somatic mutations tested for enrichment in intra-individual transcriptional clusters (>0.1% of cells with mutant allele call, n = 31 mutations).
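The enrichment test in panel F can be illustrated with a small, dependency-free implementation of the two-sided Fisher's exact test on a 2x2 table. The table layout (cluster membership by mutant allele call) is an assumed reading of the caption, and the counts in the usage example are made up for illustration; they are not data from the study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]].

    Assumed layout: rows = cells in the cluster / cells outside it,
    columns = cells with a mutant allele call / cells without one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x mutant calls falling in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum all tables at least as unlikely as the observed one
    # (small relative tolerance guards against floating-point ties).
    p_value = sum(prob(x) for x in range(lowest, highest + 1)
                  if prob(x) <= p_observed * (1 + 1e-7))
    return min(1.0, p_value)


def qq_points(p_values):
    """Pair expected uniform quantiles with sorted observed p-values."""
    n = len(p_values)
    return [((i + 0.5) / n, p) for i, p in enumerate(sorted(p_values))]


if __name__ == "__main__":
    # Hypothetical counts: 8 of 10 cluster cells vs 1 of 6 other cells carry the call.
    print(round(fisher_exact_two_sided(8, 2, 1, 5), 4))  # 0.035
```

In a QQ-plot, points lying on the diagonal indicate p-values consistent with the null hypothesis of no enrichment.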
Figure 5. Predicted developmental state is inversely correlated with ribosomal protein expression in cALL cells and is a major source of intra-individual transcriptional heterogeneity. (A) Left: UMAP representation of the maturation spectrum of healthy pediatric and adult B cells (CD34+ → B cells → CD20+ B cells) used for the B cell developmental state classifier. Right: UMAP representation of the maturation spectrum of healthy pediatric and adult T cells (CD34+ → immature hematopoietic → T cells) used for the T cell developmental state classifier. Cells were projected onto the loess fit of the spectrum and assigned a pseudotime value of 0 to 1 from the first stem-like cell to the last mature cell. (B) Observed vs predicted pseudotime of healthy B and T cells using a hundred 70/30 cross-validation splits; mean RMSE was computed over all splits. (C) Density of predicted developmental state pseudotime distributions of leukemia cells per sample and intra-individual transcriptional cluster. (D) Boxplot of ribosomal protein (RP) expression percentage in leukemia cells per sample and intra-individual transcriptional cluster. (E) Pseudotime vs ribosomal protein expression in samples showing strong (HHD.1) vs weak (ETV6.RUNX1.2) intra-individual pseudotime and ribosomal protein expression shifts. (F) Heatmap of normalized and scaled expression of ribosomal protein genes per cell sorted from high to low. An expression gradient correlated to cluster assignment can be observed in sample HHD.1 but not ETV6.RUNX1.2.
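The quantities plotted in panels A, D and E can be sketched in a few lines: rescaling positions along a trajectory to a 0 to 1 pseudotime, computing the percentage of a cell's counts that come from ribosomal protein genes, and correlating the two. Identifying RP genes by the `RPL`/`RPS` name prefix (while skipping the `RPS6K` kinase family) is an assumption, as are all function names and the toy values in the usage example; the study may define these differently.

```python
from math import sqrt


def rescale_to_unit(positions):
    """Map positions along a trajectory to pseudotime values in [0, 1]."""
    low, high = min(positions), max(positions)
    if high == low:
        return [0.0 for _ in positions]
    return [(p - low) / (high - low) for p in positions]


def ribosomal_percentage(counts):
    """Percentage of a cell's counts assigned to ribosomal protein genes.

    counts: dict mapping gene name -> count for one cell.
    Assumption: RP genes are those named RPL*/RPS*, excluding RPS6K* kinases.
    """
    total = sum(counts.values())
    if total == 0:
        return 0.0
    rp = sum(n for gene, n in counts.items()
             if gene.startswith(("RPL", "RPS")) and not gene.startswith("RPS6K"))
    return 100.0 * rp / total


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Toy cells (not study data): RP share falls as pseudotime rises.
    pseudotime = rescale_to_unit([2.0, 4.0, 6.0])
    rp_share = [40.0, 30.0, 20.0]
    print(pearson(pseudotime, rp_share))  # close to -1: inverse correlation
```

A negative coefficient corresponds to the inverse relationship described in the caption: cells predicted to be less mature show a higher share of ribosomal protein expression.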