Jodi M Saunus1,2, Xavier M De Luca3, Korinne Northwood3, Ashwini Raghavendra3, Alexander Hasson4, Amy E McCart Reed3, Malcolm Lim3, Samir Lal3, A Cristina Vargas3, Jamie R Kutasovic3, Andrew J Dalley3, Mariska Miranda5, Emarene Kalaw3, Priyakshi Kalita-de Croft3, Irma Gresshoff3, Fares Al-Ejeh5, Julia M W Gee6, Chris Ormandy7, Kum Kum Khanna5, Jonathan Beesley5, Georgia Chenevix-Trench5, Andrew R Green8, Emad A Rakha8, Ian O Ellis8, Dan V Nicolau4,9, Peter T Simpson3, Sunil R Lakhani10,11.
Abstract
Intratumoral heterogeneity is caused by genomic instability and phenotypic plasticity, but how these features co-evolve remains unclear. SOX10 is a neural crest stem cell (NCSC) specifier and candidate mediator of phenotypic plasticity in cancer. We investigated its relevance in breast cancer by immunophenotyping 21 normal breast and 1860 tumour samples. Nuclear SOX10 was detected in normal mammary luminal progenitor cells, the histogenic origin of most TNBCs. In tumours, nuclear SOX10 was almost exclusive to TNBC, and predicted poorer outcome amongst cross-sectional (p = 0.0015, hazard ratio 2.02, n = 224) and metaplastic (p = 0.04, n = 66) cases. To understand SOX10's influence over the transcriptome during the transition from normal to malignant states, we performed a systems-level analysis of co-expression data, de-noising the networks with an eigen-decomposition method. This identified a core module in SOX10's normal mammary epithelial network that becomes rewired to NCSC genes in TNBC. Crucially, this reprogramming was proportional to genome-wide promoter methylation loss, particularly at lineage-specifying CpG-island shores. We propose that the progressive, genome-wide methylation loss in TNBC simulates more primitive epigenome architecture, making cells vulnerable to SOX10-driven reprogramming. This study demonstrates potential utility for SOX10 as a prognostic biomarker in TNBC and provides new insights about developmental phenotypic mimicry, a major contributor to intratumoral heterogeneity.
Year: 2022 PMID: 35501337 PMCID: PMC9061835 DOI: 10.1038/s41523-022-00425-x
Source DB: PubMed Journal: NPJ Breast Cancer ISSN: 2374-4677
Fig. 1 SOX10 is expressed in basal and luminal progenitor cells of the human mammary gland.
a Representative SOX10 IHC analysis of reduction mammoplasty (RM) samples. Some terminal ducto-lobular units (TDLUs) had exclusive basal compartment expression (i) while others had expression in both basal and luminal compartments (ii). b (i) Analysis of SOX10 expression in ducts vs lobules of RM samples from 19 donors (whole sections). (ii) SOX10 expression in lobules was heterogeneous and more likely to occur in the luminal compartment (Mann–Whitney p = 0.011; n = 102 ducts and 102 lobules; median ± 95% confidence interval shown). c Representative immunofluorescent staining of SOX10 and CK8/18. Circled lobules and isolated cells (arrows) exhibited reciprocal expression of SOX10 (green) and CK8/18 (red) in structures with either (i) dual compartment or (ii) basal-restricted SOX10 expression. d IHC analysis of SOX10, c-kit, ER and Ki67 in serial RM sections. The three magnified regions represent major SOX10 staining patterns: (i) dual compartment, heterogeneous; (ii) dual compartment, homogeneous; and (iii) basal-restricted. Luminal SOX10 expression was directly associated with c-kit and inversely associated with ER, with no obvious relationship to Ki67 (e.g., cell cluster indicated with an arrow). e SOX10 mRNA levels in FACS-sorted human mammary epithelial cell (hMEC) subtypes[15]. Differentiation markers were analysed for comparison: basal markers CK14 and CK5; luminal progenitor (LP) markers KIT and ELF5; and markers enriched in mature luminal (ML) cells: CK18 and ESR1 (isolates with significantly different marker levels according to paired ANOVA tests are indicated and colour-coded: ****p < 0.00001; ***p < 0.0001; **p < 0.001). Data shown were means ± standard error of the mean from three donors. f Average methylation beta-values of SOX10 probes in FACS-sorted hMEC samples (DNAme), aligned with histone modification signals in a published ChIP-seq dataset[42]: H3K4me3, H3K27ac (activating) and H3K27me3 (repressive). Data were represented to scale on human chromosome 22. TSS transcription start site, UTR untranslated region. Indistinct = negative for CD45 (hematopoietic cells), CD31 (endothelia), CD140b (fibroblasts), EpCAM and CD49f (epithelia).
Fig. 2 Expression of SOX10 in human breast cancer.
a Bimodal expression of SOX10 in TNBC compared to other breast cancers (nonTNBC) in the METABRIC cohort. b Frequency of copy-number alterations (CNAs) and DNA hypomethylation affecting SOX10 in TNBC and nonTNBC compared to the archetypal SOX10+ malignancy, melanoma (SKCM; TCGA datasets). c Correlation between SOX10 methylation and expression (normalised RNAseq counts) in SKCM, TNBC and nonTNBC (Spearman correlation coefficients (r) and p values are shown; derived from TCGA data). d Proportions of TNBC and nonTNBC cases with hypomethylation at each probe across the SOX10 locus (as defined in (b)). e Representative IHC showing SOX10-negative, heterogeneous and nuclear-positive (+) TNBCs. Tumours with absent or very weak nuclear staining in ≥50% of tumour cells were classified as SOX10-negative, while those with any one of replicate TMA cores exhibiting moderate-strong nuclear staining in <50% OR weak-moderate nuclear staining in ≥50% of tumour cells were classified as heterogeneous (see also Supplementary Fig. 2h). Survival curves of heterogeneous and negative categories overlapped (Supplementary Fig. 2j) and hence are grouped together here. f Kaplan–Meier analysis of the relationship between SOX10 nuclear positivity and breast cancer-specific survival (BCSS) in cross-sectional TNBCs. Log-rank test p value and hazard ratio (HR) are shown (95% confidence interval). g Kaplan–Meier analysis of the relationship between SOX10 nuclear positivity and BCSS in TNBCs classified as metaplastic breast cancers. Gehan–Breslow–Wilcoxon test p value shown. h SOX10 expression in brain-metastatic TNBC and matching brain metastases (BrM), compared to the frequency in cross-sectional TNBCs (Chi-square p value shown).
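Panel c reports Spearman correlation coefficients between SOX10 methylation and expression. As a reading aid, the following is a minimal sketch of how a Spearman coefficient can be computed: values are converted to ranks (ties share the mean rank) and a Pearson correlation is taken on the ranks. The function names and the toy values are hypothetical and are not taken from the TCGA data or from the authors' analysis code.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy values: methylation falls as expression rises.
methylation_beta = [0.91, 0.85, 0.40, 0.22, 0.10]
expression_counts = [3.0, 10.0, 150.0, 900.0, 4000.0]
rho = spearman(methylation_beta, expression_counts)
```

Because the coefficient depends only on rank order, a perfectly monotonic inverse relationship such as the toy data above gives a coefficient of -1 regardless of the scale of the expression counts.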
Key features of eight predominant gene co-expression modules extracted by WGCNA.
| Module group | Module | Major functional ontologies^a | Signalling pathways^a/intrinsic activators^b | Size (no. genes) | Top ten hub genes (highest kWithin; see Supplementary Table) |
|---|---|---|---|---|---|
| Tumour-centric | Blue | Mitotic instability | FOXM1, MYBL2 | 1239 | TPX2, BUB1, CEP55, HJURP, NCAPH, KIF4A, KIF2C, CCNB2, NCAPG, FOXM1 |
| | Green | Multipotency (SOXE) | Wnt signalling | 487 | ROPN1, SFRP1, FOXC1, RGMA, GABRP, CHST3, MAML2, APCN, ROPN1B, SOX10 |
| | Brown | Primary cilium | ER, FOXA1 | 1008 | FOXA1, MLPH, ESR1, AGR3, XBP1, THSD4, GATA3, CA12, PRR15, ZMYND10 |
| Tumour-stromal | Magenta | ECM-1 (structural) | FBN1, RUNX2 | 186 | COL5A2, COL1A2, COL3A1, COL5A1, COL6A3, FAP, THBS2, COL1A1, LUM, VCAN |
| | Black | ECM2 (regulatory) | – | 207 | OLFML1, RECK, FSTL1, DCN, MSRB3, ECM2, CCDC80, TCF4, ZEB1, GLT8D2 |
| | Red | Fatty acid metabolism | PPARγ | 274 | DIA1R, PDE2A, LHFP, LDB2, ARHGEF15, S1PR1, SDPR, EBF1, CD34, ERG |
| | Tan | Type-I IFN response | STAT1, IRF9 | 33 | IFIT3, OAS2, CMPK2, IFI44L, IFI44, IFIT1, MX1, OASL, IFIT2, RSAD2 |
| Stromal | Yellow | Adaptive immunity (TILs) | CD40L, CD40, IFNγ, IRF1 | 712 | SASH3, IL2RG, CD53, PTPN7, CD48, CD2, CD3E, ARHGAP9, CD5, CD3D, SIT1, SH2D1A |
ECM extracellular matrix.
^a Gene set enrichment analysis (GSEA) of all BRCA genes ranked according to module eigengene correlation (Supplementary Table 9).
^b Ingenuity pathways analysis upstream regulator prediction (p ≤ 1.0E-07) based on kWithin values for module genes.
Fig. 3 SOX10’s regulatory network is associated with multipotency, cell migration and poor prognosis in TNBC.
a Relative expression of eight predominant transcription modules in human breast tumours, according to the PAM50 subtype (TCGA dataset). b SOXE-module co-expression profile similarity matrix, clustered to highlight genes with very highly coordinated expression. The similarity is based on cosine distance and has a maximum value of 1. SOX10 mapped to one of six module sub-clusters, the members of which are shown to the right of the matrix. See also Supplementary Fig. 4a, b. c Summary of results from unsupervised gene set enrichment analysis of the breast cancer transcriptome after ordering transcripts according to their correlations with SOXE-module expression (denoted by the ME value, TCGA dataset). d Tile plot showing overlapping expression of SOXE-module representatives. For each protein, significant co-expression with ≥2 other module members is indicated by a Fisher’s exact test result (*p < 0.05; ***p < 0.001; ****p < 0.0001). Refer to Supplementary Table 1 for scoring criteria. e IHC staining of representative SOXE-module nodes in serial sections from the same tumour. f Proportional expression of all eight modules (coloured as for (a)) in TNBCs annotated with PAM50 and TNBC subtypes (METABRIC dataset; LAR luminal androgen receptor-like, MES mesenchymal, BLIS basal-like immune-suppressed, BLIA basal-like immune-activated[39]). g Kaplan–Meier analysis of METABRIC TNBCs expressing different proportions of the three predominant TNBC modules. BCSS breast cancer-specific survival. ME fraction thresholds for classifying cases as high or low were 0.33 for SOXE/blue and 0.1 for yellow.
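Panel b describes a similarity matrix based on cosine distance, with a maximum value of 1. The sketch below illustrates that measure on toy data: each gene is represented by a numeric profile, and the similarity of two genes is the cosine of the angle between their profiles. The gene labels and profile values are hypothetical placeholders, and the clustering step used to order the published matrix is not reproduced here.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_matrix(profiles):
    """Pairwise cosine similarity for a dict mapping gene label -> profile."""
    genes = list(profiles)
    return {
        g: {h: cosine_similarity(profiles[g], profiles[h]) for h in genes}
        for g in genes
    }


# Hypothetical co-expression profiles (placeholder labels, not real genes).
profiles = {
    "GENE_A": [0.9, 0.8, 0.1],
    "GENE_B": [0.45, 0.4, 0.05],  # same direction as GENE_A, half the magnitude
    "GENE_C": [0.1, 0.0, 0.9],
}
matrix = similarity_matrix(profiles)
```

Cosine similarity ignores vector magnitude, so `GENE_A` and `GENE_B` score approximately 1 (the maximum), while `GENE_C`, which points in a different direction, scores much lower against both.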
Fig. 4 The SOXE-module drives the transition from normal mammary epithelial stem/progenitor to NCSC-like phenotypic states.
a Influence of SOXE-module genes over network architecture and information flow. kWithin: intramodular ‘connectivity’ based on weighted correlations with all other module genes; Eigencentrality: considers the connectivity of each node’s nearest neighbours as an indicator of ‘local influence’; Betweenness centrality: ‘conductivity’ based on each node’s position along the shortest paths between other nodes (genes with high betweenness are information conduits). Key hub genes are indicated (see Supplementary Table 10 for the full dataset). b Chick (ch.)NCSC and neural crest (NC) terms genesets are largely independent of each other and from the SOXE-module. c Correlations between SOXE-ME values and NCSC genesets (singscore values) in TNBC (n = 106 TCGA cases with tumour cellularity ≥0.6). Correlation coefficients (r) and p values are shown. d GSEA using three TNBC gene expression datasets (ICGC, METABRIC, TCGA). Normalised enrichment scores (NES) and corrected p values (q) shown. e Overlap between members of the SOXE-module and SOX10’s normal breast module (from de novo module identification on n = 97 TCGA normal breast samples; Supplementary Table 12). Generic ontology enrichment results are summarised (full GO term lists in Supplementary Table 13). f Comparison of network structure and information flow metrics (as for (a)) between shared and SOXE-module-exclusive genes. Groups were compared using Mann–Whitney tests (**p = 2.4E-03; ***p = 5.6E-04). Boxes show the 10–90th percentiles and median, with whiskers extending to the minimum and maximum values. Mean is indicated with ‘+’. g Model depicting the mammary epithelial progenitor gene regulatory network core being sustained through transformation and rewired as the SOXE-module in TNBC. Shared hub genes are listed.
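Two of the network metrics defined in panel a can be illustrated on a small weighted graph. The sketch below computes intramodular connectivity (the sum of a node's edge weights to all other module nodes) and eigencentrality (via power iteration, so that a node scores highly when its neighbours are themselves well connected). It is an illustration under stated assumptions only: the node labels and weights are hypothetical, the adjacency is assumed symmetric and connected, and betweenness centrality is omitted for brevity. It is not the authors' MATLAB or WGCNA code.

```python
def k_within(adjacency):
    """Intramodular connectivity: sum of edge weights to all other nodes."""
    return {
        g: sum(w for h, w in row.items() if h != g)
        for g, row in adjacency.items()
    }


def eigencentrality(adjacency, iterations=200, tol=1e-10):
    """Eigenvector centrality by power iteration, scaled so the top node is 1."""
    genes = list(adjacency)
    score = {g: 1.0 for g in genes}
    for _ in range(iterations):
        # Each node's new score is the weighted sum of its neighbours' scores.
        new = {
            g: sum(adjacency[g].get(h, 0.0) * score[h] for h in genes if h != g)
            for g in genes
        }
        top = max(new.values())
        new = {g: v / top for g, v in new.items()}
        if max(abs(new[g] - score[g]) for g in genes) < tol:
            return new
        score = new
    return score


# Hypothetical symmetric weighted adjacency for a four-node module.
adjacency = {
    "HUB": {"N1": 0.9, "N2": 0.8, "N3": 0.7},
    "N1": {"HUB": 0.9, "N2": 0.5},
    "N2": {"HUB": 0.8, "N1": 0.5},
    "N3": {"HUB": 0.7},
}
connectivity = k_within(adjacency)
centrality = eigencentrality(adjacency)
```

In this toy module the node labelled `HUB` ranks first on both metrics, while `N3` ranks last because its only neighbour is the hub.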
Fig. 5 The SOXE-module is driven by the erosion of lineage-specific epigenetic marks.
a Decision tree for identifying candidate copy-number alteration (CNA) drivers of the SOXE-module. Of 17,694 genes with case-matched GISTIC, RNAseq and WGCNA data, the CN and expression of 130 correlated with the SOXE-module in TNBC, including 25 SOXE-module nodes. b Network influence metrics for SOXE-module nodes coloured according to candidate CN driver status (intramodular connectivity (kWithin), local influence (Eigencentrality) and conductivity (betweenness centrality) defined in Fig. 4a). Boxes show the 10–90th percentiles and median, with whiskers extending to the minimum and maximum values. Mean is indicated with ‘+’. No significant differences by ordinary ANOVA test. c Relationship between SOXE-module levels and mutation signatures in ICGC TNBCs (COSMIC v2 SigProfiler and HRDetect on n = 74 ICGC TNBCs)[45]. Associations are depicted according to the correlation between SOXE-ME values and signature event count (y-axis); and by the significance of average SOXE-ME differences between ICGC TNBCs with low (quartile-1) vs higher (quartile 2–4) signature burden. d t-Distributed stochastic neighbour embedding (t-SNE) visualisation of genome methylation profile similarities amongst cases in the BRCA-TCGA 450k methylation array dataset. Panels are coloured according to PAM50 intrinsic subtype, SOXE-ME values or global median methylation β-values. Circled cases are epigenetically divergent, basal-like TNBCs that express high levels of the SOXE-module and have eroded methylomes. e Correlation analysis summary showing relationships between SOXE-ME values and region-specific methylation (n = 75 TCGA TNBCs, tumour cellularity ≥0.6; n = 215,323 probes after quality filtering); ****p < 1.0E-07. CGI CpG island, IGR intergenic region, TSS transcription start site, UTR untranslated region. Solo-WCpGW: consensus sequence for late-replicating loci demethylated via replicative senescence. f Unsupervised clustering of the BRCA-TCGA 450k methylation dataset according to ME correlation. Data shown were minimum correlation coefficients of ME values versus gene-averaged methylation β-value data from promoter region probes (TSS1500, TSS200 and 5′UTR). Of three clusters inversely correlated with SOXE-module expression, two (a, b) were enriched with developmental ontologies (Supplementary Table 14). g Network influence metrics for SOXE-module genes in the hypomethylated clusters versus other SOXE-module genes, as for (b). Ordinary ANOVA p values: *p < 0.05; **p < 0.01; ***p < 0.001; ns not significant.
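Panel c compares average SOXE-ME values between cases with low (quartile-1) and higher (quartile 2–4) signature burden. The grouping step can be sketched as follows. The sketch assumes the first-quartile cut point is computed with Python's default `statistics.quantiles` method and that cases at or below the cut are treated as low; the caption does not state how boundaries were handled, and the event counts and ME values below are hypothetical. No significance test is implemented.

```python
from statistics import mean, quantiles


def split_by_first_quartile(burden, me_values):
    """Split ME values into low (<= Q1 of burden) and higher (> Q1) groups."""
    q1_cut = quantiles(burden, n=4)[0]  # first of the three quartile cut points
    low = [me for b, me in zip(burden, me_values) if b <= q1_cut]
    higher = [me for b, me in zip(burden, me_values) if b > q1_cut]
    return low, higher


# Hypothetical per-case signature event counts and module eigengene values.
signature_events = [1, 2, 3, 4, 5, 6, 7, 8]
soxe_me = [0.10, 0.12, 0.20, 0.22, 0.25, 0.30, 0.31, 0.35]

low, higher = split_by_first_quartile(signature_events, soxe_me)
mean_difference = mean(higher) - mean(low)
```

With eight toy cases, two fall in the low group and six in the higher group; a positive `mean_difference` would indicate higher average module expression in cases with greater signature burden.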
Fig. 6 Model summarising the study findings.
Proposed links between established drivers of TNBC progression, epigenome erosion and the emergence of a neural crest-like transcriptional programme in de-differentiated TNBCs.
Biological resources.
| Resource | Source, identifier and relevant citations | Related figure(s) |
|---|---|---|
| Tissue samples | ||
| Histologically normal breast FFPE whole sections | The Brisbane breast bank[ | 1a–e |
| Fresh RM surgical samples | The Brisbane breast bank[ | 1f, Supp-1b, Supp-6a |
| Australian BC series, FFPE TMA sections & clinical data | Pathology Qld & The Brisbane breast bank[ | 2e, f, 3d–e, Supp-2h-k |
| UK breast cancer series, FFPE TMA sections & clinical data | Nottingham Breast Cancer Research Centre[ | 2e, f, Supp-2h-k |
| Metaplastic tumour series, FFPE sections & clinical data | Asia-Pacific MBC consortium[ | 2g |
| Patient-matched primary TNBCs and brain metastases | Pathology Qld & The Brisbane breast bank[ | 2h |
| Cancer cell lines | ||
| 293T | ATCC® CRL-3216™ | Supp-1a, Supp-2g |
| MDA-MB-435S | ATCC® HTB-129™ | Supp-1a, Supp-2e, Supp-2g |
| HCC38 | ATCC® CRL-2314™ | Supp-2e |
| HCC1569 | ATCC® CRL-2330™ | Supp-2e, Supp-2g |
| Primary melanoma cells (D41, D05) | Dr. Chris Schmidt, QIMR Berghofer[ | Supp-2e |
| TaqMan gene expression assays | ||
| SOX10 | ThermoFisher, Hs00366918_m1 | Supp-2e |
| RPL13A | ThermoFisher, Hs03043885_g1 | Supp-2e |
| shRNA sequences | ||
| SOX10_1 | Sigma-Aldrich TRCN0000018984 | Supp-1a, Supp-2g |
| SOX10_2 | Sigma-Aldrich TRCN0000018987 | Supp-1a, Supp-2g |
| SOX10_3 | Sigma-Aldrich TRCN0000018988 | Supp-1a, Supp-2g |
| Non-targeted negative control (NTNC) | Sigma-Aldrich SHC002 | Supp-1a, Supp-2g |
Supp supplementary.
Software, code and published datasets.
| Resource | Source, identifier and relevant citations | Related figure(s) | Related table(s) |
|---|---|---|---|
| Software packages and code | |||
| ChAMP | | 5d–f | – |
| Clustergrammer | | 3b | Supp-10 |
| Community detection algorithms | Refs. [ | Supp-4a | – |
| Epifactors database | | Supp-5e | – |
| FACSDiva™ | BD Biosciences, licensed | 1f, Supp-6a | – |
| FCS Express (v7) | De Novo Software, licensed | 1f, Supp-6a | – |
| GSEAPreranked | | 3c, 4d, 5f, Supp-3 | 1, Supp-4, Supp-9 |
| Ingenuity Pathways Analysis (IPA) | Ingenuity, licensed | – | 1 |
| MATLAB | Mathworks, licensed | Supp-4a | Supp-10 |
| Princeton Generic GO term finder | | 5a | Supp-13, 14 |
| Prism (v8.4.3) | GraphPad, licensed | Multiple | Supp-2 |
| R package, Cluster | | 5f | – |
| R package, FlashClust | | 5f, g | Supp-14 |
| R package, Limma | | Supp-3 | Supp-3 |
| R package, t-SNE | | 5d | – |
| R package, WGCNA | | Multiple | Multiple |
| REVIGO | | Supp-3 | Supp-4 |
| Singscore | | 4c | – |
| SPSS | IBM, licensed | – | Supp-2 |
| Tableau desktop (2020.4) | Tableau, licensed | 4a | – |
| Published datasets | |||
| Cell line expression data | | Supp-2e, f | – |
| Cell line expression, CNA and methylation datasets | | Supp-2e, f | – |
| Chicken embryo neural crest gene set | Ref. [ | 4b–d | Supp-11 |
| Gene ontology resource | | – | Supp-11 |
| Genomic locations of solo-WCpGW sites | Ref. [ | Supp-5c | – |
| hMEC ChIP-seq data | | 1f | – |
| hMEC gene expression array data | Gene expression omnibus, | 1e | – |
| Human reference genome NCBI build 37 (GRCh37/hg19) | UCSC Genome Browser | 2d, Supp-5a | – |
| ICGC gene expression data | Ref. [ | – | Supp-8 |
| ICGC HRDetect scores | Ref. [ | 5c | – |
| ICGC mutational signatures (COSMIC, v2 SigProfiler) | Ref. [ | 5c | – |
| Illumina Infinium Omni2.5 array data | | 1f, Supp-5b | – |
| METABRIC gene expression & clinical data | EGAD00010000210, EGAD00010000211, EGAS00000000083; EGA portal, via data access committee[ | 2a, 3f, g, Supp-3, Supp-4c, d | Supp-4, Supp-7 |
| MetaCore | | Supp-3 | Supp-4 |
| SOXE-module network metrics | This paper | 4a, f, 5b, g | Supp-10 |
| TCGA clinicopathologic annotation | Ref. [ | 2a–d, 3a | – |
| TCGA gene copy-number data | Gistic2.Level_4; TCGA Data Analysis Center Firehose[ | 2b, 5a, b, Supp-5a | – |
| TCGA gene-level methylation data | Preprocess/meth.by_min_expr_corr; TCGA Data Analysis Center Firehose[ | 2b, c | – |
| TCGA Illumina HiSeq RNASeq-v2 RSEM level-3 normalised datasets | illuminahiseq_rnaseqv2-RSEM_genes_normalized (MD5); TCGA Data Analysis Center Firehose[ | 2a, c | Supp-4 |
| TCGA Illumina HiSeq RNASeq-v2 RSEM level-3 raw counts | TCGA Data Analysis Center Firehose[ | 3a, Supp-3 | Supp-3, 5, 6, 9, 10, 12, 13 |
| TCGA probe-level methylation data | Humanmethylation_450; TCGA Data Analysis Center Firehose[ | 5d–f, Supp-5b–d | – |
| Triple-negative breast cancer subtypes (Burstein et al) | Ref. [ | 3f, Supp-2b | – |
| Tumour purity for TCGA cases | Supp data-1 (CPE metric) & infinium metric, refs. [ | Multiple | – |
| WGCNA ME dataset, ICGC cases | This paper | Multiple | Supp-8 |
| WGCNA ME dataset, METABRIC cases | This paper | Multiple | Supp-7 |
| WGCNA ME dataset, TCGA normal cases | This paper | Multiple | Supp-12 |
| WGCNA ME dataset, TCGA tumour cases | This paper | Multiple | Supp-6 |
| WGCNA mod membership dataset (TCGA cohort) | This paper | Multiple | Supp-5 |
Supp supplementary.