Daniel Sobral1,2, Marta Martins3, Shannon Kaplan4, Mahdi Golkaram4, Michael Salmans4, Nafeesa Khan4, Raakhee Vijayaraghavan4, Sandra Casimiro3, Afonso Fernandes3, Paula Borralho3, Cristina Ferreira5, Rui Pinto1,2, Catarina Abreu5, Ana Lúcia Costa5, Shile Zhang4, Traci Pawlowski4, Jim Godsey4, André Mansinho3,5, Daniela Macedo5, Soraia Lobo-Martins5, Pedro Filipe5, Rui Esteves5, João Coutinho5, Paulo Matos Costa5, Afonso Ramires5, Fernando Aldeia5, António Quintela5, Alex So4, Li Liu6, Ana Rita Grosso7,8, Luis Costa9,10.
Abstract
Colorectal cancer (CRC) is a highly diverse disease, where different genomic instability pathways shape genetic clonal diversity and tumor microenvironment. Although intra-tumor heterogeneity has been characterized in primary tumors, its origin and consequences for CRC outcome are not fully understood. Therefore, we assessed intra- and inter-tumor heterogeneity of a prospective cohort of 136 CRC samples. We demonstrate that CRC diversity is forged by asynchronous forms of molecular alterations, where mutational and chromosomal instability collectively boost CRC genetic and microenvironment intra-tumor heterogeneity. We were able to depict predictor signatures of cancer-related genes that can foresee heterogeneity levels across the different tumor consensus molecular subtypes (CMS) and primary tumor location. Finally, we show that high genetic and microenvironment heterogeneity are associated with lower metastatic potential, whereas late-emerging copy number variations favor metastasis development and polyclonal seeding. This study provides an exhaustive portrait of the interplay between genetic and microenvironment intra-tumor heterogeneity across CMS subtypes, depicting molecular events with predictive value of CRC progression and metastasis development.
Year: 2022 PMID: 36085309 PMCID: PMC9463147 DOI: 10.1038/s42003-022-03884-x
Source DB: PubMed Journal: Commun Biol ISSN: 2399-3642
Fig. 1 Genetic Inter- and Intra-Tumor Heterogeneity of CRC.
a Clonal composition of CRC samples grouped by CMS subtype, showing the number of subclones and respective relative frequency inferred with Expands (based on SSMs) or with PhyloWGS (using both SSMs and CNVs). b Distribution of the Shannon-Index values based on Expands clonal composition segregated by CMS subtype. c Distribution of the Shannon-Index values based on PhyloWGS clonal composition segregated by CMS subtype. d–g Distribution of the number of clonal SSMs (d), subclonal SSMs (e), clonal CNVs (f), subclonal CNVs (g) for each CMS group. h Comparison between number of clonal/subclonal SSMs and clonal CNVs. Numbers of SSMs are shown on a log10 scale. Estimate and statistical significance of the Pearson correlation are presented. Numbers of samples per CMS subtype: CMS1 n = 27; CMS2 n = 36; CMS3 n = 12; CMS4 n = 19; Unknown n = 18. *Wilcoxon signed-rank test p value < 0.05.
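As an illustration of the Shannon-Index used in panels b and c, the following minimal sketch computes it from the relative frequencies of the subclones in one sample. The function name, the example frequencies and the use of the natural logarithm are assumptions made for illustration; the record does not state the logarithm base or the exact implementation used by the authors.

```python
import math


def shannon_index(frequencies):
    """Shannon index H = -sum(p_i * ln p_i) over subclone proportions.

    `frequencies` are non-negative subclone frequencies for one sample;
    they are renormalized to sum to 1 (an assumption of this sketch).
    """
    total = sum(frequencies)
    if total <= 0:
        raise ValueError("at least one frequency must be positive")
    proportions = [f / total for f in frequencies if f > 0]
    return -sum(p * math.log(p) for p in proportions)


# Hypothetical clonal compositions (not taken from the study).
monoclonal = [1.0]
three_subclones = [0.6, 0.3, 0.1]
print(shannon_index(monoclonal), shannon_index(three_subclones))
```

A sample with a single clone scores zero, and the index grows as subclones become more numerous and more evenly represented.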
Fig. 2 Genetic biomarker signatures for genetic ITH.
a Heatmap of cancer-related genes affected by SSMs or CNVs that are associated with genetic ITH levels (only SSMs) in CRC samples depicted by a LASSO penalized model. Each line represents an independent analysis applied to the CRC samples segregated according to CMS subtypes (all samples n = 112; CMS1 n = 27; CMS2 n = 36; CMS3 n = 12; CMS4 n = 19), MSI status (MSS n = 59; MSI-H n = 53) or primary tumor location (right n = 68; left n = 33). LASSO-selected coefficients are colored according to the effect of each standardized covariate in the optimal model. The numbers on each tile denote the order in which variables are included indicating their relative importance. The top bar plot indicates the frequency at which each driver-gene mutation occurs in the ITH fitted model. The right bar plot shows the explained variance. b Comparison between observed and predicted genetic ITH levels (only SSMs) for all samples in our cohort and in TCGA. c Heatmap of genes with SSMs or CNVs that are associated with genetic ITH levels (SSMs and CNVs) in CRC samples depicted by a LASSO penalized model. Graphical representation similar to (a). d Comparison between observed and predicted genetic ITH levels (SSMs and CNVs) for all samples in our cohort and in TCGA. Colors indicate CMS subtype for each CRC sample. Estimate and statistical significance of the Pearson correlation are presented. R² represents the explained variance of the model in our cohort.
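To make the idea of a LASSO penalized model on standardized covariates concrete, here is a minimal pure-Python sketch using cyclic coordinate descent. It is not the authors' pipeline: the function names, the penalty value and the toy data are hypothetical, and the sketch assumes every covariate column is non-constant. It minimizes (1/2n)·||y − Xb||² + lam·||b||₁ and also returns the explained variance.

```python
def standardize(column):
    """Center a covariate and scale it to unit (population) variance."""
    n = len(column)
    mean = sum(column) / n
    sd = (sum((v - mean) ** 2 for v in column) / n) ** 0.5
    return [(v - mean) / sd for v in column]


def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_fit(columns, y, lam, n_sweeps=200):
    """Cyclic coordinate descent for the LASSO on standardized covariates.

    Returns (coefficients, explained variance R^2).
    """
    n = len(y)
    y_mean = sum(y) / n
    residual = [v - y_mean for v in y]          # response is centered
    z_cols = [standardize(c) for c in columns]
    beta = [0.0] * len(z_cols)
    for _ in range(n_sweeps):
        for j, col in enumerate(z_cols):
            # Correlation of covariate j with the partial residual.
            rho = sum(c * (r + c * beta[j]) for c, r in zip(col, residual)) / n
            new_beta = soft_threshold(rho, lam)
            delta = new_beta - beta[j]
            if delta != 0.0:
                residual = [r - c * delta for r, c in zip(residual, col)]
                beta[j] = new_beta
    ss_tot = sum((v - y_mean) ** 2 for v in y)
    ss_res = sum(r * r for r in residual)
    return beta, 1.0 - ss_res / ss_tot


# Toy data: three hypothetical binary alteration indicators in 8 samples.
gene_a = [1, 0, 1, 0, 1, 0, 1, 0]
gene_b = [1, 1, 0, 0, 1, 1, 0, 0]
gene_c = [1, 1, 1, 1, 0, 0, 0, 0]
ith = [2 * a - c for a, c in zip(gene_a, gene_c)]   # synthetic response
coefficients, r_squared = lasso_fit([gene_a, gene_b, gene_c], ith, lam=0.2)
print(coefficients, r_squared)
```

Covariates whose coefficient is shrunk exactly to zero drop out of the signature. Refitting over a decreasing sequence of `lam` values and recording when each coefficient first becomes non-zero is one way to obtain an inclusion order like the one shown on the heatmap tiles.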
Fig. 3 Microenvironment Inter- and Intra-Tumor Heterogeneity of CRC.
a Principal Component Analysis of cell frequencies from RNA-based deconvolution approach. CRC samples are colored according to CMS subtype. b Microenvironment composition of CRC samples grouped by CMS subtype, showing the microenvironmental Shannon-Index and cell frequency for RNA-based signatures of: epithelial cells (where CMS1, CMS2, CMS3-like are individually represented); stromal cells; myeloid cells; B-cells and T-cells. c, d Distribution of the microenvironment Shannon-Index (based on expression signatures) segregated by CMS subtype, considering the cell-frequency of: only epithelial cells (c) or all cell signatures (d). Numbers of samples per CMS subtype: CMS1 n = 27; CMS2 n = 36; CMS3 n = 12; CMS4 n = 19; Unknown n = 18. e Heatmap of Spearman correlation coefficients between: genetic Shannon-Index (only SSMs); immune score from Estimate; cell frequencies of CMS1/CMS2-like signatures; and number of SSMs and CNVs (total, clonal and subclonal). f Heatmap of Spearman correlation coefficients between genetic/microenvironmental Shannon-Indices and cell frequencies of the RNA-based signatures. Only significant correlations are represented (adj. p value < 0.05).
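The Spearman correlation coefficients in panels e and f can be illustrated with a short sketch: values are converted to ranks (ties receive the average rank) and the Pearson correlation of the ranks is computed. The function names and the example values are hypothetical, and the sketch does not cover the significance testing or multiple-testing adjustment mentioned in the caption, whose method is not stated in this record.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-sample values (not taken from the study).
genetic_shannon = [0.2, 0.9, 0.5, 1.3, 0.7]
t_cell_frequency = [0.05, 0.20, 0.10, 0.18, 0.12]
print(spearman(genetic_shannon, t_cell_frequency))
```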
Fig. 4 Transcriptional biomarker signatures for genetic ITH.
a Heatmap of cancer-related genes whose expression levels are associated with genetic ITH levels (only SSMs) in CRC samples depicted by a LASSO penalized model. Each line represents an independent analysis applied to the CRC samples segregated according to CMS subtypes (all samples n = 112; CMS1 n = 27; CMS2 n = 36; CMS3 n = 12; CMS4 n = 19), MSI status (MSS n = 59; MSI-H n = 53) or primary tumor location (right n = 68; left n = 33). LASSO-selected coefficients are colored according to the effect of each standardized covariate in the optimal model. The numbers on each tile denote the order in which variables are included indicating their relative importance. The top bar plot indicates the frequency at which each driver-gene mutation occurs in the ITH fitted model. The right bar plot shows the explained variance. b Comparison between observed and predicted genetic ITH levels (only SSMs) for all samples in our cohort and in TCGA.
Fig. 5 Tumor heterogeneity and Metastatic potential.
a Coefficients (log-odds ratios) of generalized linear models for metastatic potential including intra-tumor heterogeneity (genetic and microenvironment); number of genomic alterations (SSMs and CNVs). Each color represents an independent model fitted to the CRC samples segregated according to CMS subtypes (all samples n = 112; CMS1 n = 27; CMS2 n = 36; CMS3 n = 12; CMS4 n = 19), MSI status (MSS n = 59; MSI-H n = 53) or primary tumor location (right n = 68; left n = 33). Detailed results and significance levels are represented in Supplementary Fig. 13. b Frequencies of copy number events (separated into amplifications and deletions) affecting different regions of the genome (binned by chromosomal bands), in early primaries that do not metastasize (nmCRC, n = 92), and in early primaries that metastasize (mCRC, n = 20). The top panel shows the enrichment of event frequency in the metastatic versus non-metastatic primaries (values represent the −log10 of the uncorrected p value of a Fisher test for each genomic bin). Positions of known cancer-related genes are also displayed. c Coefficients (log-odds ratios) of generalized linear models for metastatic potential including clonal and subclonal genomic alterations (SSMs and CNVs). Each color represents an independent model fitted to the CRC samples segregated according to CMS subtypes, MSI status or primary tumor location. Detailed results and significance levels are represented in Supplementary Figs. 15 and 16.
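The enrichment score in the top panel of b (−log10 of a Fisher test p value per genomic bin) can be sketched as follows for a single bin. The group sizes (20 mCRC, 92 nmCRC) come from the caption; the event counts are invented for illustration, and the two-sided form of the test is an assumption, since the record does not say whether a one- or two-sided test was used.

```python
from math import comb, log10


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed one.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = comb(n, col1)

    def prob(x):
        return comb(row1, x) * comb(n - row1, col1 - x) / denom

    lo = max(0, col1 - (n - row1))
    hi = min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


def enrichment_score(events_m, n_m, events_nm, n_nm):
    """-log10 of the uncorrected Fisher p value for one genomic bin."""
    p = fisher_exact_two_sided(events_m, n_m - events_m,
                               events_nm, n_nm - events_nm)
    return -log10(p)


# Hypothetical bin: event seen in 10 of 20 mCRC and 10 of 92 nmCRC primaries.
print(enrichment_score(10, 20, 10, 92))
```

Because the p values are uncorrected, a high score in one bin is only suggestive when many bins are tested in parallel.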
Fig. 6 Clonal Diversity and Metastasis Development.
a Distribution of the Shannon-Index values based on Expands clonal composition for early primary tumors (n = 112) and metastases (n = 12). b–d Amount of total CNVs (b), clonal CNVs (c) and DNA ploidy (d) for early primary tumors and metastases. e, f Phylogenetic trees depicting subclonal expansion for a primary-metastasis pair with monoclonal (e) and polyclonal metastatic seeding (f). The subclones are identified with the respective number and the containing sample: normal (N), primary tumor (P) and metastasis (M). g Frequencies of copy number events (separated into amplifications and deletions) affecting different regions of the genome, in the paired primary-metastasis samples from our study (n = 12) and the study of Lim et al. (n = 19). Only events found in both the primary and its paired metastasis are displayed. Positions of known cancer-related genes are also displayed.