Denise M Wolf1, Marc E Lenburg2, Christina Yau3, Aaron Boudreau1, Laura J van 't Veer1.
Abstract
Co-expression modules are groups of genes with highly correlated expression patterns. In cancer, differences in module activity potentially represent the heterogeneity of phenotypes important in carcinogenesis, progression, or treatment response. To find gene expression modules active in breast cancer subpopulations, we assembled 72 breast cancer-related gene expression datasets containing ∼5,700 samples altogether. Per dataset, we identified genes with bimodal expression and used mixture-model clustering to ultimately define 11 modules of genes that are consistently co-regulated across multiple datasets. Functionally, these modules reflected estrogen signaling, development/differentiation, immune signaling, histone modification, ERBB2 signaling, the extracellular matrix (ECM) and stroma, and cell proliferation. The T cell/B cell immune modules appeared tumor-extrinsic, with coherent expression in tumors but not cell lines, whereas most other modules, interferon and ECM included, appeared intrinsic. Only four of the eleven modules were represented in the PAM50 intrinsic subtype classifier and other well-established prognostic signatures, although the immune modules were highly correlated with previously published immune signatures. As expected, the proliferation module was highly associated with decreased recurrence-free survival (RFS). Interestingly, the immune modules appeared associated with RFS even after adjustment for receptor subtype and proliferation, and in a multivariate analysis, the combination of T cell/B cell immune module downregulation and proliferation module upregulation was strongly associated with decreased RFS. Immune modules are unusual in that their upregulation is associated with both a good prognosis without chemotherapy and a good response to chemotherapy, suggesting the paradox of high-immune patients who respond to chemotherapy but would do well without it.
Other findings concern the ECM/stromal modules, which, despite common themes, were associated with different sites of metastasis, possibly relating to the "seed and soil" hypothesis of cancer dissemination. Overall, co-expression modules provide a high-level functional view of breast cancer that complements the "cancer hallmarks" and may form the basis for improved predictors and treatments.
Year: 2014 PMID: 24516633 PMCID: PMC3917875 DOI: 10.1371/journal.pone.0088309
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Clustered heat map used to define breast cancer co-expression modules.
In this cross-correlation heat map, the 136 robust signatures derived from 72 datasets cluster into 11 co-expression modules.
Functional enrichment of modules.
| Module | Size (# genes) | Pathway/functional enrichment (%FDR) | Representative genes |
| 1-ER | 135 | Estrogen signaling, cell-cell signaling (0.045), hormone estrogen/stimulus (2.24) | ESR1, PGR, FOXA1, GATA3, TFF1, TFF3; TFAP2B; SLCA1; AGTR1, MAPT, MUC1, AR, ERBB4 |
| 2-Dev/Basal | 247 | Ectoderm development (9.15E-04), epidermis development (0.055), cell adhesion (0.063), vitamin metabolic process (0.63) | KRT5, KRT7, KRT14; DSC2, DSC3; ITGB6, ITGB8; COL2A, DKK1, ELF5, EN1, FOXC1, GATA6, GJB5 |
| 3 | 28 | Response to type 1 interferon, cytokine mediated signaling, immune response (1.93E-06), response to virus (3.37E-06), RIG-I like receptor signaling (0.015), DNA replication and repair (0.2) | STAT1; IFI27, IFI44, IFI44L, IFI6, IFIH1, IFIT1, IFIT2, IFIT3 |
| 4-Immune | 82 | Immune response (1.15E-18), lymphocyte activation (3.3E-13), leukocyte activation (7.31E-12), positive regulation of immune system (1.76E-09), T cell activation (2.7E-05) | CD2, CD8A, CD7, CD3G; GZMA, GZMB, GZMK; TNFRSF17, CD27; IL21R, IL2R; CCR2, CCR7, CCL19 |
| 5-Immune | 80 | Immune response (4E-24), defense response (1.2E-14), inflammatory response (1.01E-07), chemokine signaling (9.23E-06), cytokine-cytokine receptor interaction (4.33E-06) | CCL13, CCL18, CCL19, CCL5, CCL8, CXCL10, CXCL12, CXCL13, CXCL9, CXCR6, IRF1, IL32 |
| 6-Hist | 12 | Nucleosome assembly (1.27E-15), chromatin assembly (1.71E-15), protein-DNA complex assembly (2.48E-15), nucleosome organization (2.98E-15), telomere maintenance (1.7E-10), lupus | HIST1H1C, HIST1H2AE, HIST1H3D, HIST1H4H, HIST2H2BE |
| 7-ERBB2 | 4 | ERBB2 amplicon, EGFR signaling pathway, EGFR activity, ErbB3 class receptor binding, oncogenomic recombination hotspot (0.001) | ERBB2, GRB7, STARD3, PGAP3 |
| 8 | 82 | Regulation of cell proliferation (0.001), regulation of signal transduction (0.019), hemostasis (0.02), blood vessel morphogenesis/development (0.11), response to wounding (0.36) | FBN2, PLAT, SERPINE1, L1CAM; TGFB2, VIM, LYN, BONF, CAV1, CAV2, DKK1, FOXF2, IGFBP6 |
| 9-ECM/Dev/Immune | 110 | Extracellular region/part/space (6.41E-10), response to hormone stimulus (3.5E-05), extracellular matrix (3.89E-05), regulation of inflammatory response (0.001), muscle organ development (0.0027) | RBP4, TF, CXCL2, TIMP4, ADIPOQ, CHRDL1, EDNRB, FABP4, FIGF, GPC3, IL6; HOXA10, HOXA5, HOXA7; WIF1, GHR, IGF1, PPARG |
| 10-ECM | 58 | Proteinaceous ECM (2.7E-18), ECM (1.3E-17), collagen (6.1E-10), ECM-receptor interaction (2.2E-09), cell adhesion (4.7E-10), signaling by PDGF (2.3E-07), focal adhesion (2.1E-07) | ASPN, COL1A1, COL3A1, COL5A1, COL6A1-3; FBN1, FN1, LOX, LUM, NID1; FAP; CDH13, GPR124, PDGFRA |
| 11-Prolif | 120 | Cell cycle mitotic (5.87E-30), cell cycle (1E-16), cell cycle checkpoints (1.94E-09), oocyte meiosis (4.84E-06), KISc (1.04E-05), p53 signaling (0.0091), DNA replication (0.12) | CCNB1, CCNB2, CCNE2, MKI67, AURKA, AURKB, E2F8, CENPA, KIF11, TOP2A, HJURP, RAD21, RAD51AP |
This table summarizes the pathway and functional themes of the genes in each module, obtained by applying the DAVID and g:Profiler tools. FDR = false discovery rate (reported for DAVID results).
Figure 2. Module correlation patterns.
A) A clustered heatmap of Pearson correlation coefficients over all module pairs (using Pearson distance and average linkage). Dark red denotes high correlation (r → 1), dark blue high anti-correlation (r → −1), and white a lack of correlation (r ≅ 0). B) This network representation of (A) illustrates the correlation and anti-correlation topology of module expression; red links denote module pairs with Pearson correlation coefficients r > 0.25, whereas blue links denote module pairs with r < −0.25. These figures represent the covariance of ∼3700 samples from 24 datasets listed in File S1.
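The thresholding described in panel B can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the function names `pearson` and `correlation_links` and the five-sample module scores are hypothetical, and only the ±0.25 cut-offs come from the caption.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def correlation_links(scores, threshold=0.25):
    """Return (module_a, module_b, color) for module pairs with |r| > threshold.

    "red" marks r > threshold and "blue" marks r < -threshold, following
    the color convention given in the Figure 2 caption.
    """
    links = []
    for a, b in combinations(sorted(scores), 2):
        r = pearson(scores[a], scores[b])
        if r > threshold:
            links.append((a, b, "red"))
        elif r < -threshold:
            links.append((a, b, "blue"))
    return links


# Hypothetical module scores for five samples (illustrative values only).
scores = {
    "1-ER": [2.0, 1.5, -1.0, -2.0, 0.5],
    "2-Dev/Basal": [-1.8, -1.2, 1.1, 2.2, -0.4],
    "11-Prolif": [-1.0, -0.5, 1.5, 1.0, 0.0],
}
links = correlation_links(scores)
```

Pairs whose correlation falls between −0.25 and 0.25 produce no link, which matches the white (uncorrelated) region of the heatmap in panel A.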
Overlap between module genes and established signatures.
| Module | PAM50 (Intrinsic subtypes) | 70-gene prognosis signature (MammaPrint™) | 21-gene recurrence score (Oncotype DX) |
| 2-Dev/Basal | FOXC1, MIA, SFRP1, KRT14, KRT5, CDH3 | NMU, HRASLS, TSPYL5 | |
| 11-Prolif | CEP55, | CENPA, NMU, ECT2, NUSAP1, GPR126, RFC4, PRC1, | |
| | 30/48 (62.5%) | 12/54 (22%) | 10/16 (62.5%) |
Modules 1-ER, 11-Prolif, 7-ERBB2 and 2-Dev/Basal share genes with the PAM50 gene set used to evaluate intrinsic subtype, the NKI70 prognostic classifier used in MammaPrint, or the 21-gene prognostic signature used in OncotypeDX. Immune modules 3–5, histone module 6-Hist, the stromal modules 8 and 9-ECM/Dev/Immune, and the ECM module 10-ECM have no genes in common with these signatures. Listed genes are present in both the specified module and the labeled signature. Genes in bold face are present in the module and multiple signatures.
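The overlap counts in this table amount to set intersections between a module's gene list and a signature's gene list. The sketch below illustrates the idea under stated assumptions: `signature_overlap` is a hypothetical helper, and the two gene sets are small illustrative subsets, not the full module or signature lists.

```python
def signature_overlap(module_genes, signature_genes):
    """Return the shared genes (sorted) and the fraction of the signature covered."""
    module_set, signature_set = set(module_genes), set(signature_genes)
    shared = sorted(module_set & signature_set)
    return shared, len(shared) / len(signature_set)


# Illustrative subsets only: a few module 2-Dev/Basal genes and a few
# signature genes, chosen to show the mechanics rather than real totals.
module_subset = {"KRT5", "KRT14", "FOXC1", "DSC2", "ELF5"}
signature_subset = {"FOXC1", "KRT5", "KRT14", "ESR1", "ERBB2", "MKI67"}

shared, fraction = signature_overlap(module_subset, signature_subset)

# The bottom row of the table reports coverage in the same way,
# for example 30 of 48 signature genes:
pam50_coverage_percent = 100 * 30 / 48
```

Here `shared` holds the genes that would be listed in a table cell, and `fraction` is the per-signature coverage that the final row summarizes across all modules.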
Figure 3. Modules vs. intrinsic subtype heatmap.
This heatmap shows hierarchically clustered AUC scores summarizing how well each intrinsic subtype can be predicted by each co-expression module score. Red denotes high positive predictive value (AUC → 1), green high negative predictive value (AUC → 0), and black a non-informative relationship (AUC ≈ 0.5). This figure represents GSE1456, with AUCs clustered using Euclidean distance and complete linkage. (Heatmaps using other datasets can be found in Figure S2 in File S2.)
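As a reminder of what each heatmap cell encodes, the AUC for one module and one subtype can be computed as the probability that a randomly chosen sample of that subtype has a higher module score than a randomly chosen sample of any other subtype. The sketch below uses this standard rank-based definition; the function name and the example values are hypothetical, and the source does not state which implementation the authors used.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5).

    scores: module scores, one per sample.
    labels: truthy where the sample belongs to the subtype of interest.
    """
    pos = [s for s, is_pos in zip(scores, labels) if is_pos]
    neg = [s for s, is_pos in zip(scores, labels) if not is_pos]
    if not pos or not neg:
        raise ValueError("need at least one sample in each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical module scores and subtype membership for six samples.
module_scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
in_subtype = [1, 1, 1, 0, 0, 0]
high_auc = auc(module_scores, in_subtype)                 # module high in subtype
low_auc = auc(module_scores, [0, 0, 0, 1, 1, 1])          # module low in subtype
```

An AUC near 1 corresponds to a red cell, near 0 to a green cell, and near 0.5 to a black cell in the heatmap.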
Figure 4. Diversity and coherence of module expression in breast cancer cell lines compared to whole tumors.
A) This bar plot compares standard deviations of module scores in representative breast cancer cell lines (BCCL; a composite of data from the Sanger, GSK, and Neve et al. datasets, see Methods) and a breast tumor biopsy dataset (GSE21653). *** p<1E-10 (F-test for difference in variance in module score). B) This box plot shows, for the BCCL and tumor datasets separately, the distributions of Pearson correlation coefficients for all pairs of genes in each module. *Modules 4-Immune, 5-Immune, and 9-ECM/Dev/Immune can be considered tumor-extrinsic, as their constituent genes are uncorrelated in cell lines but highly correlated in patient tumor biopsies (median r>0.35).
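The coherence measure in panel B, the median Pearson correlation over all gene pairs in a module, can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the helper names and the toy expression values are hypothetical, and only the median r > 0.35 criterion is taken from the caption.

```python
from itertools import combinations
from math import sqrt
from statistics import median


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def median_pairwise_r(expression):
    """Median Pearson r over all gene pairs; expression maps gene -> values."""
    genes = sorted(expression)
    return median(
        pearson(expression[a], expression[b]) for a, b in combinations(genes, 2)
    )


# Toy data (hypothetical): the same three genes measured in four tumors
# and in four cell lines.
tumor = {"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8], "g3": [1, 3, 2, 4]}
cell_lines = {"g1": [1, 2, 3, 4], "g2": [1, -1, -1, 1], "g3": [1, -1, 1, -1]}

tumor_r = median_pairwise_r(tumor)
cell_line_r = median_pairwise_r(cell_lines)

# Caption criterion: coherent in tumors (median r > 0.35) but not in cell lines.
looks_extrinsic = tumor_r > 0.35 and not cell_line_r > 0.35
```

In this toy case the genes co-vary across tumors but not across cell lines, which is the pattern the caption associates with tumor-extrinsic modules.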
Figure 5. Recurrence-free survival of chemotherapy-naïve patients with highly proliferating tumors depends on immune module activation.
Kaplan-Meier analysis shows that patients with high 11-Prolif expression AND low 4-Immune expression have poorer outcomes than patients with low 11-Prolif OR high 4-Immune expression (C). To demonstrate how dividing the patients according to the activity of both modules increases sensitivity to detect patients with poorer outcome, we include K-M plots of RFS as a univariate function of 4-Immune (A) and 11-Prolif (B). Module activity was dichotomized using the median for 11-Prolif and the lower tertile for 4-Immune. D) Immune modules are an exception to a Simpson’s paradox in breast cancer: some of the same features that are associated with poor outcome are also associated with superior response to chemotherapy (high pCR rate). Modules 11-Prolif and 1-ER both conform to this paradox, as high 11-Prolif is associated with a good response to chemotherapy but a poor outcome, whereas high 1-ER is associated with good outcome but a poor response to chemotherapy. Immune modules 4/5-Immune are an exception to this paradox, as they are associated with a good outcome without chemotherapy and a good response to chemotherapy in treated populations.
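The grouping used for panel C can be illustrated with a short sketch: module scores are dichotomized at the median (11-Prolif) and at the lower tertile (4-Immune), and the poor-outcome group is the intersection of high proliferation and low immune activity. The function name, the strict inequalities at the cut points, and the scores below are assumptions for illustration; the caption does not specify how samples exactly at a cut point were handled.

```python
from statistics import median, quantiles


def high_prolif_low_immune(prolif, immune):
    """Flag samples with 11-Prolif above its median AND 4-Immune below its lower tertile."""
    prolif_cut = median(prolif)
    immune_cut = quantiles(immune, n=3)[0]  # first of the two tertile cut points
    return [p > prolif_cut and i < immune_cut for p, i in zip(prolif, immune)]


# Hypothetical module scores for six patients (illustrative values only).
prolif_scores = [0.1, 0.2, 0.3, 0.8, 0.9, 1.0]
immune_scores = [0.9, 0.1, 0.8, 0.05, 0.7, 0.6]

flags = high_prolif_low_immune(prolif_scores, immune_scores)
```

Only the fourth patient is flagged here: the second patient also has low immune activity but falls below the proliferation median, which is why combining both modules isolates a narrower group than either module alone.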
Figure 6. Different organ sites of metastasis are associated with different ECM/stromal modules.
A) Boxplot of ECM/stromal module expression in primary tumors that metastasized to bone only vs. lung or brain. Also included are the non-stromal subtype-associated modules with the strongest associations, 1-ER (preferential to bone), and 2-Dev/Basal and 11-Prolif (preferential to viscera). Upregulation of 10-ECM was associated with decreased bone-specific RFS (C), whereas downregulation of 9-ECM/Dev/Immune was associated with decreased lung/brain-specific RFS (D). Upregulation of the proliferation module 11-Prolif was associated with a shorter time to recurrence in bone (B) and lung (Table S6 in File S2), as opposed to 1-ER, which was associated with longer times to recurrence at either site (also Table S6). Asterisks in (A) denote statistically significant differences (see Table S5 in File S2 for p-values).
Figure 7. Co-expression module hallmarks of breast cancer.
This figure is a model of possible correspondence between co-expression modules and cancer hallmarks, annotated by clinical associations and coherence in cell lines vs. tumors. The inner annotation ring is colored to represent association with chemotherapy response; the 2nd ring from center represents association with RFS in adjuvant untreated patients and site-specific RFS in mixed treated/untreated patients; and the 3rd ring from center represents classification as intrinsic or extrinsic.