A clinically relevant gene signature in triple negative and basal-like breast cancer.

Achim Rody, Thomas Karn, Cornelia Liedtke, Lajos Pusztai, Eugen Ruckhaeberle, Lars Hanker, Regine Gaetje, Christine Solbach, Andre Ahr, Dirk Metzler, Marcus Schmidt, Volkmar Müller, Uwe Holtrich, Manfred Kaufmann.

Abstract

INTRODUCTION: Current prognostic gene expression profiles for breast cancer mainly reflect proliferation status and are most useful in ER-positive cancers. Triple negative breast cancers (TNBC) are clinically heterogeneous and prognostic markers and biology-based therapies are needed to better treat this disease.
METHODS: We assembled Affymetrix gene expression data for 579 TNBC and performed unsupervised analysis to define metagenes that distinguish molecular subsets within TNBC. We used n = 394 cases for discovery and n = 185 cases for validation. Sixteen metagenes emerged that identified basal-like, apocrine and claudin-low molecular subtypes, or reflected various non-neoplastic cell populations, including immune cells, blood, adipocytes, stroma, angiogenesis and inflammation within the cancer. The expressions of these metagenes were correlated with survival and multivariate analysis was performed, including routine clinical and pathological variables.
RESULTS: Seventy-three percent of TNBC displayed a basal-like molecular subtype that correlated with high histological grade and younger age. Survival of basal-like TNBC did not differ from that of non-basal-like TNBC. High expression of immune cell metagenes was associated with good prognosis, whereas high expression of inflammation and angiogenesis-related metagenes was associated with poor prognosis. A ratio of high B-cell and low IL-8 metagenes identified 32% of TNBC with good prognosis (hazard ratio (HR) 0.37, 95% CI 0.22 to 0.61; P < 0.001) and was the only significant predictor in multivariate analysis including routine clinicopathological variables.
CONCLUSIONS: We describe a ratio of high B-cell presence and low IL-8 activity as a powerful new prognostic marker for TNBC. The IL-8 pathway also represents an attractive novel therapeutic target for this disease.

Year:  2011        PMID: 21978456      PMCID: PMC3262210          DOI: 10.1186/bcr3035

Source DB:  PubMed          Journal:  Breast Cancer Res        ISSN: 1465-5411            Impact factor:   6.466


Introduction

Different molecular subtypes of breast cancer have been described [1]. The most profound effects on gene expression profiles in breast cancer are related to estrogen receptor (ER) and proliferation status, and to a lesser extent to Human Epidermal Growth Factor Receptor 2 (HER2) status. Not surprisingly, molecular classification and current prognostic signatures mainly reflect these molecular features [2]. However, substantial clinical and molecular heterogeneity remains within current molecular subsets, particularly among ER, progesterone receptor (PgR) and HER2 negative cancers (that is, triple negative breast cancers, TNBC [3]). Furthermore, the relationship between clinically defined TNBC and the gene expression profile-based basal-like breast cancer subtype (BLBC) [4] is not fully defined [5]. Some authors use these two terms synonymously given the substantial overlap between the two definitions [6,7]. However, immunohistochemical and molecular profiling studies have shown that only a subset of TNBC express the combination of basal cell markers (for example, CK5 and CK14) that is required for the molecular definition of this disease [5]. The prognostic significance and therapeutic implications of molecular heterogeneity within TNBC remain to be established. From a clinical point of view, further understanding of TNBC is important because better prognostic markers and new treatments are needed [8]. The goal of this analysis was to assemble all currently available TNBC gene expression datasets generated on Affymetrix gene chips and search for molecular structures in the data to define gene expression-based subsets within TNBC. We defined metagenes as the average expression of groups of highly co-expressed genes in the data without considering any clinical outcome variable. These metagenes identified several molecular subsets within TNBC, some with good prognosis even in the absence of systemic therapy. Our results also suggest possible new therapeutic strategies for TNBC.
This study represents the largest attempt to define clinically important molecular subsets within TNBC [9].

Materials and methods

All analyses were performed according to the REporting recommendations for tumour MARKer prognostic studies (REMARK) [10,11] and the corresponding guidelines for microarray-based studies of clinical outcomes [12]. A diagram of the complete analytical strategy and the flow of patients through the study, including the number of patients included in each stage of the analysis, is given in Additional file 1, Supplementary Figure S1. Tissue samples of invasive breast cancer cases (dataset Frankfurt) were obtained with IRB approval and informed consent from consecutive patients undergoing surgical resection between December 1996 and July 2007 at the Department of Gynecology and Obstetrics at the Goethe-University in Frankfurt. Gene expression data have been deposited into the GEO database (accession number GSE31519).

Assembly of TNBC microarray data and definition of metagenes

In order to facilitate pooling of data sets from different laboratories we only used data from a single platform (Affymetrix U133A and U133 Plus 2.0 chips) and included only samples that were defined as triple negative based on the mRNA expression of ER, PgR, and HER2 as previously described [13-15]. To obtain a large enough sample size for discovery it was necessary to pool several datasets. A major concern during this exercise is the possible confounding effect of systematic technical differences that exist between individual datasets. These could lead to false discovery during metagene definition and could also weaken the power of validation. We applied two different strategies to minimize this problem. First, we selected only highly comparable datasets for discovery. We initially identified 579 TNBC from a total of 3,488 publicly available primary breast cancer gene expression profiles representing 28 individual datasets (Additional file 2, Supplementary Table S1). We excluded 13 datasets contributing 185 TNBC cases from the discovery cohort because they did not fulfill our criteria of comparability of the microarray data (for details see Additional file 4, Supplementary Methods Section 1 and Additional file 1, Supplementary Figure S2). The final discovery cohort to identify metagenes included 394 TNBC from 15 datasets (cohort-A). The 185 samples excluded from discovery were retained as a validation set (cohort-B) to assess correlations between various metagenes and between metagenes and clinical outcome (Additional file 1, Supplementary Figure S1). This strategy maximized the integrity of metagene discovery at the cost of possibly reducing the power of the validation study. The two cohorts did not significantly differ with respect to age, tumor size and histological grade. However, the validation cohort-B contained a larger number of lymph node positive patients and a higher proportion of fine needle aspiration (FNA) samples. 
Follow-up data were available for 2,348 of the total 3,488 samples and 327 of the 579 TNBC samples. Since the number of patients with follow-up in validation cohort-B was too small (n = 30 of 185), an additional independent validation cohort-C [16] (n = 76) was included to assess the prognostic value of the metagenes (Additional file 1, Supplementary Figure S1). The patient characteristics of the discovery and validation cohorts are given in Table 1. For analysis of normal tissue, a dataset of benign breast tissue was used (Additional file 2, Supplementary Table S1).
Table 1

Clinical data of TNBC patients from the finding-cohort-A and the validation cohorts-B and -C

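| Parameter | Status | Finding cohort-A (n = 394) | Validation cohort-B (n = 185) | P-value (Chi2) B vs A | Validation cohort-C (n = 76) | P-value (Chi2) C vs A |
| --- | --- | --- | --- | --- | --- | --- |
| Lymph node status | LNN | 240 | 36 | | 44 | |
| | Node pos. | 68 | 60 | < 0.001 | 32 | 0.001 |
| | n.a. | 86 | 89 | | 0 | |
| Age | ≤ 40 yr | 63 | 25 | | 10 | |
| | 41 to 50 yr | 91 | 41 | | 17 | |
| | 51 to 60 yr | 76 | 39 | | 13 | |
| | > 60 yr | 79 | 35 | 0.87 | 36 | 0.003 |
| | n.a. | 85 | 45 | | 0 | |
| Tumor size | ≤ 2 cm | 85 | 29 | | 11 | |
| | > 2 cm | 224 | 122 | 0.068 | 62 | 0.035 |
| | n.a. | 85 | 34 | | 3 | |
| Histological grade | grade 3 | 227 | 110 | | 62 | |
| | grade 1 and 2 | 82 | 46 | 0.57 | 14 | 0.18 |
| | n.a. | 85 | 29 | | 0 | |
| Biopsy method | surgical | 346 | 130 | | 76 | |
| | core | 19 | 22 | | 0 | |
| | FNA | 29 | 33 | < 0.001 | 0 | 0.009 |
| Five-year DFS | no event | 202 | 24 | | 49 | |
| | event | 95 | 6 | 0.25 | 26 | 0.69 |
| | n.a. | 97 | 155 | | 1 | |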
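The chi-square P-values in Table 1 can be illustrated with a minimal sketch. It assumes, and the text does not state this, that cases without information (n.a.) are excluded and that a Yates continuity correction is applied to 2x2 comparisons. The function name `chi2_yates_2x2` is hypothetical. Only the one-degree-of-freedom case is handled, so multi-category rows such as age are out of scope.

```python
import math

def chi2_yates_2x2(a, b, c, d):
    """Chi-square test with Yates continuity correction for the table [[a, b], [c, d]].

    Returns (chi2, p) for one degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # Continuity-corrected numerator term, floored at zero
    num = max(abs(a * d - b * c) - n / 2.0, 0.0)
    chi2 = n * num ** 2 / (row1 * row2 * col1 * col2)
    # Upper tail of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p

# Tumor size, cohort-A vs cohort-B, n.a. excluded (assumption):
# cohort-A: 85 (<= 2 cm), 224 (> 2 cm); cohort-B: 29, 122
chi2_size, p_size = chi2_yates_2x2(85, 224, 29, 122)
print(f"tumor size B vs A: chi2 = {chi2_size:.2f}, p = {p_size:.4f}")

# Lymph node status, cohort-A vs cohort-B, n.a. excluded (assumption)
chi2_ln, p_ln = chi2_yates_2x2(240, 68, 36, 60)
print(f"lymph node status B vs A: chi2 = {chi2_ln:.2f}, p = {p_ln:.2g}")
```

Under these assumptions the tumor size comparison lands close to the tabulated value of 0.068, which suggests, but does not prove, that a continuity-corrected test was used.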
Unsupervised analysis, without input of clinical variables, was performed to identify metagenes, which were defined as the arithmetic average expression of highly correlated genes. Gene clusters were selected with either a minimal membership of 10 genes and a minimal correlation threshold of 0.7, or a minimum of 25 genes and a correlation of 0.6 (for details see Additional file 4, Supplementary Methods Section 2). We also employed a screen to remove genes that showed dataset bias. The dependence of the expression levels of the metagene probesets on the dataset vector was analyzed using the Kruskal-Wallis statistic (Additional file 4, Supplementary Methods Section 3). Only the Stroma and Hemoglobin metagenes displayed a bias for FNA samples, reflecting the frequent contamination of these types of samples with blood and their lack of stromal elements compared to core needle or surgical biopsies (Additional file 1, Supplementary Figure S3 and Additional file 4, Supplementary Methods). Therefore, these two metagenes were analyzed only in surgical biopsies. No systematic bias was observed between the U133A and U133 Plus2.0 arrays, which differ only in the spatial feature size of the probesets (for details see Additional file 4, Supplementary Methods Section 4). Both metagene distributions and "Centroid methods" were used to classify subtypes of TNBC, as given in Additional file 4, Supplementary Methods Sections 8 and 9.
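The metagene definition can be sketched in a few lines. This is an illustration on synthetic data, not the authors' pipeline; the exact clustering procedure is given only in the Supplementary Methods. The function names are hypothetical, and reading the "correlation threshold" as the mean pairwise Pearson correlation within a cluster is an assumption.

```python
import numpy as np

def metagene_score(expr, probe_idx):
    """Arithmetic mean expression of the selected probesets, one value per sample.

    expr: array of shape (n_probesets, n_samples); probe_idx: row indices of the cluster.
    """
    return expr[probe_idx].mean(axis=0)

def mean_pairwise_correlation(expr, probe_idx):
    """Average Pearson correlation over all distinct probeset pairs in the cluster."""
    corr = np.corrcoef(expr[probe_idx])          # rows are treated as variables
    upper = np.triu_indices(len(probe_idx), k=1)  # each pair once, no diagonal
    return corr[upper].mean()

def passes_size_correlation_rule(n_genes, mean_corr):
    """Rule quoted in the text: >= 10 genes at r >= 0.7, or >= 25 genes at r >= 0.6."""
    return (n_genes >= 10 and mean_corr >= 0.7) or (n_genes >= 25 and mean_corr >= 0.6)

# Synthetic demo: 12 probesets sharing one latent signal plus noise, 50 samples.
rng = np.random.default_rng(0)
latent = rng.normal(size=50)
expr = latent + 0.3 * rng.normal(size=(12, 50))
idx = np.arange(12)

score = metagene_score(expr, idx)
r = mean_pairwise_correlation(expr, idx)
print(score.shape, round(float(r), 2), bool(passes_size_correlation_rule(len(idx), r)))
```

Averaging co-expressed probesets reduces probe-level noise, which is one reason a metagene can be more stable across datasets than any single marker.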

Survival analysis

Relapse free survival (RFS) was preferentially used as the clinical endpoint for event free survival (EFS). Where RFS was not available in a dataset, distant metastasis free survival (DMFS) was used instead. Details on the endpoints used and on Kaplan-Meier and Cox regression analysis are given in Additional file 4, Supplementary Methods Section 5. Optimized cutoffs for dichotomizing metagene scores to plot survival curves were derived from the discovery cohort and were applied without modification to the validation cohorts (Additional file 4, Supplementary Methods Section 6). All P-values are two-sided and a P-value of 0.05 was used as the significance threshold. Analyses were performed using the R software [17] and SPSS version 17.0 (SPSS Inc. Chicago, IL).

Results

Identification of subsets of TNBC based on metagene expression profile

In our discovery cohort we identified 16 clusters of correlated genes by unsupervised methods, and the expression values of each cluster were averaged as a metagene (Figure 1). As expected, no cluster of genes correlated with ER, PgR, and HER2 status [4] was identified. In contrast, the identified metagenes presented in Table 2 included the basal-like phenotype [4], an apocrine/androgen receptor signaling signature [18,19], five signatures related to different types of immune cells [4,20-25], a stromal signature [26,27], the claudin-CD24 signature [28,29], markers of blood [30] and adipocytes [4], as well as an inflammatory signature [31-33] and an angiogenesis signature [23,34]. These phenotypes corresponded to previously described gene signatures that have also been used to define subsets of TNBC in a recent smaller study [9]. The angiogenesis signature (VEGF metagene) has been described very recently as a "hypoxia signature" associated with poor outcome and expressed in distant metastases [34]. As shown in Figure 1, we observed the highest correlation between different types of immune cell metagenes. Similar relationships between the metagenes were detected in the validation cohort-B (Figure 1) and -C (Additional file 1, Supplementary Figure S4). The presence of B-lymphocytes in the tumor is the primary source of the expression of the B-Cell metagene, which is largely composed of immunoglobulin genes [20,22]. In contrast, immunohistochemical analyses of IL-8 expression and analysis of gene expression data of breast cancer cell lines indicate that carcinoma cells are the main source of the IL-8 metagene (Figure 2).
Figure 1

Principal biological phenotypes identified as metagenes among TNBC. Heatmaps of expression values of the 16 metagenes (upper panels) and the 355 individual Affymetrix probe sets (lower panels) are shown for the finding cohort (left panels, n = 394) and validation cohort (right panels, n = 185). The dendrogram at the left presents the results from hierarchical clustering of the metagenes. Three major clusters were observed representing (i) basal-like, apocrine, CLDN-CD24, proliferation, and adipocyte metagenes (ii) all five immune cell metagenes, and (iii) the IL-8 and VEGF metagenes, when the hemoglobin and stroma metagenes were left out which display some dataset-bias (see methods). In keeping with these three major phenotypes the samples were sorted according to (1.) Basal-like phenotype, (2.) low vs. high B-Cell metagene, and (3.) the expression value of the IL-8 metagene. (The 355 individual Affymetrix probesets and the respective metagenes are listed in the Additional file 4, Supplementary Methods).

Table 2

Principal biological phenotypes identified as metagenes among TNBC

| Biological component | Metagene name | Correlation within metagene cluster | # of probesets in metagene cluster | Key markers | Reference |
| --- | --- | --- | --- | --- | --- |
| Basal-like phenotype | Basal-like | 0.61 | 37 | KRT-5, -6, -14, -17, SOX10, SFRP1, ELF5, EPHB3, GABRP | [4] |
| Apocrine/androgen receptor signalling | Apocrine | 0.67 | 27 | AR, FOXA1 | [18,19] |
| Immune system: | | | | | [4,20,21,23-25] |
| B-Cell | B-Cell | 0.87 | 48 | IgG | |
| T-Cell | T-Cell | 0.84 | 27 | TCR, LCK, ITK | |
| MHC class II | MHC-2 | 0.83 | 14 | HLA-DR, -DM, -DP, -DQ | |
| MHC class I | MHC-1 | 0.84 | 17 | HLA-A, -B, -C, -E, -F, -G | |
| Interferon response | IFN | 0.76 | 14 | OAS1, OAS2, OAS3, MX1 | |
| Stroma* | Stroma | 0.83 | 47 | Decorin, Osteonectin, Fibronectin, COL5A1 | [26,27] |
| Claudin-CD24 signature | Claudin-CD24 | 0.70 | 19 | CLDN3, CLDN4, CD24, ELF3 | [28,29] |
| Proliferation | Proliferation | 0.74 | 47 | BUB1, CDC2, STK6, BIRC5, TOP2A | [35] |
| Blood * | Hemoglobin * | 0.63 | 17 | HBA1, HBA2, HBB | [30] |
| Adipocytes | Adipocyte | 0.74 | 8 | FABP4, PLIN, ADIPOQ, ADH1B | [4] |
| Angiogenesis | VEGF | 0.57 | 7 | VEGF, adrenomedullin, ANGPTL4 | [34] |
| Inflammation | IL-8 | 0.52 | 4 | IL-8, CXCL1, CXCL2 | [31,32] |
| HOXA gene cluster | HOXA | 0.52 | 8 | HOXA-4, -5, -7, -9, -10, -11 | [64] |
| Histone gene cluster | Histone | 0.69 | 19 | Histones H2A, H2B | [65] |

* The Stroma and Hemoglobin metagenes displayed a bias between datasets related to different biopsy techniques (see Methods).

Figure 2

Immunohistochemical analyses of the cellular source of expression of the B-Cell and IL-8 metagenes in TNBC. A) Detection of B-lymphocytes by a CD20 antibody (red staining) in a triple negative breast cancer from the Frankfurt cohort with high expression of B-Cell and IL-8 metagenes. B) An adjacent section of the same tumor as in (A) is stained with an IL-8 antibody demonstrating that carcinoma cells are the source of IL-8 expression (red staining). Note the strong IL-8 staining in rod-like structures in the carcinoma cells. Further analyses using antibodies specific for macrophages (CD68) also demonstrated that macrophages are not the cellular source of IL-8 expression in the tumor (Additional file 1, Supplementary Figure S15).

Relationship between TNBC and basal-like breast cancer (BLBC)

We observed a clear bimodal distribution of the basal-like metagene score among TNBC (Figure 3). This bimodal distribution allowed us to derive a cutoff that separates cases into high and low expression groups by fitting two normal distributions to the data (Figure 3). According to this cutoff, 72.8%, 73.0% and 69.7% of TNBC were defined as BLBC in the discovery cohort-A, validation cohort-B, and validation cohort-C, respectively. Table 3 compares the clinical characteristics of BLBC and non-BLBC triple negative cancers in the discovery cohort-A. The positive associations of BLBC with high histological grade (G3, P < 0.001) and younger age (P = 0.004) were also observed in the validation cohort-C and validation cohort-B, respectively (Additional file 2, Supplementary Table S2).
Figure 3

Distribution of the expression of the basal-like metagene among TNBC of cohort-A. The bimodal distribution of the expression of the basal-like metagene among the 394 TNBC samples in the finding cohort-A is shown. A mixture (black line) of two Gaussian distributions (blue and red lines) was fitted to these data. The intersection of the two Gaussians was used as the cutoff (0.0014) for the definition of basal-like tumors. Similar results were obtained for the validation cohorts-B and -C, as well as for all samples combined.
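The cutoff derivation described for Figure 3 can be sketched as follows. The data are synthetic and the function names hypothetical. Using a plain expectation-maximisation fit and taking the intersection of the weighted component densities are both assumptions, since the text does not state which fitting routine was used or whether the mixing weights entered the intersection.

```python
import numpy as np

def fit_two_gaussians(x, n_iter=200):
    """EM for a 1-D mixture of two Gaussians. Returns (weights, means, sds)."""
    x = np.asarray(x, dtype=float)
    xs = np.sort(x)
    half = len(xs) // 2
    # Initialise from the lower and upper halves of the sorted data
    mu = np.array([xs[:half].mean(), xs[half:].mean()])
    sd = np.array([xs[:half].std(), xs[half:].std()]) + 1e-6
    w = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample
        dens = w * np.exp(-0.5 * ((x[:, None] - mu) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: update weights, means and standard deviations
        nk = resp.sum(axis=0)
        w = nk / len(x)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        sd = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk) + 1e-6
    return w, mu, sd

def intersection_between_means(w, mu, sd):
    """Point between the two means where the weighted component densities are equal."""
    # Equating the log densities gives a*x^2 + b*x + c = 0
    a = 1.0 / (2 * sd[1] ** 2) - 1.0 / (2 * sd[0] ** 2)
    b = mu[0] / sd[0] ** 2 - mu[1] / sd[1] ** 2
    c = (mu[1] ** 2 / (2 * sd[1] ** 2) - mu[0] ** 2 / (2 * sd[0] ** 2)
         + np.log(w[0] * sd[1] / (w[1] * sd[0])))
    roots = np.roots([a, b, c])
    real = roots[np.isreal(roots)].real
    lo, hi = min(mu), max(mu)
    inside = real[(real > lo) & (real < hi)]
    if inside.size == 0:
        raise ValueError("no intersection between the two component means")
    return float(inside[0])

# Synthetic bimodal scores: 110 "low" and 290 "high" samples (illustrative only)
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-1.5, 0.5, 110), rng.normal(1.0, 0.5, 290)])
w, mu, sd = fit_two_gaussians(x)
cutoff = intersection_between_means(w, mu, sd)
high_fraction = float(np.mean(x > cutoff))
print(f"cutoff = {cutoff:.3f}, fraction above cutoff = {high_fraction:.3f}")
```

With well-separated modes the intersection sits in the trough between the two peaks, so few samples fall close to the cutoff and the resulting grouping is not sensitive to small shifts in it.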

Table 3

Clinical parameters of TNBC with basal-like breast cancer (BLBC) or non-BLBC phenotype

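| Parameter | Information available* | Status | Non-BLBC (n = 107, 27.2%) | BLBC (n = 287, 72.8%) | Total (n = 394) | P-value |
| --- | --- | --- | --- | --- | --- | --- |
| Lymph node status | n = 308 | LNN | 50 (64.9%) | 190 (82.3%) | 240 | |
| | | N1 | 27 (35.1%) | 41 (17.7%) | 68 | 0.002 |
| Age (cutoff 50 yrs) | n = 309 | ≤ 50 yr | 27 (34.6%) | 124 (53.7%) | 151 | |
| | | > 50 yr | 51 (65.4%) | 107 (46.3%) | 158 | 0.004 |
| Tumor size | n = 309 | ≤ 2 cm | 16 (20.5%) | 69 (29.9%) | 85 | |
| | | > 2 cm | 62 (79.5%) | 162 (70.1%) | 224 | 0.14 |
| Histological grade | n = 309 | G3 | 45 (57.0%) | 182 (79.1%) | 227 | |
| | | G1&2 | 34 (43.0%) | 48 (20.9%) | 82 | < 0.001 |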

* Number of cases with available information on the specific parameter in the finding cohort-A

In unsupervised clustering of the metagenes, the basal-like metagene clustered next to the apocrine metagene but showed a strong inverse correlation with it (Figure 1). To quantify the correlation between the basal-like metagene and all other metagenes from Table 2 we used quartiles of the respective metagenes. Additional file 2, Supplementary Table S3 presents the six metagenes that displayed significant correlations with the BLBC phenotype in both the discovery and validation cohorts. A positive correlation was found between the BLBC phenotype and the proliferation and angiogenesis (VEGF) metagenes. A negative correlation was observed for the apocrine/androgen receptor signaling and two immune system related metagenes (MHC-2 and T-Cell metagenes), as well as an adipocyte related signature. Since we observed a negative correlation between the basal-like metagene and potential markers of normal breast tissue, such as the adipocyte metagene, we had to exclude the possibility that we were only distinguishing stroma-rich from stroma-poor samples.
As shown in Additional file 1, Supplementary Figure S5, a comparison of the metagenes for proliferation, adipocytes and histones between BLBC, non-BLBC, and normal breast samples demonstrates that the non-BLBC subtype is distinct from normal breast tissue in the expression of several metagenes. Proliferation genes have previously been shown to be the most important determinant of cancer vs normal signatures [35]. Furthermore, the strong bimodal distribution of the basal-like metagene argues against the possibility that this metagene inversely describes the degree of contamination with normal tissue, which should rather result in a continuous distribution. The non-BLBC tumors in our TNBC dataset mainly represent samples of the "molecular apocrine" type (16.5%), whose metagene shows a bimodal distribution inverse to that of the basal-like metagene, and a relatively small group of "claudin-low" tumors (6.3%). The mutual relationship of these three metagenes is shown in Additional file 1, Supplementary Figure S6.

Prognostic value of the different biological phenotypes in TNBC

To assess the prognostic value of the metagenes, we analyzed the event free survival of patients as a function of metagene expression. The basal-like metagene had no significant effect on survival (Additional file 1, Supplementary Figure S7). In contrast, five other metagenes, namely the IL-8, Histone, VEGF, B-Cell, and T-Cell metagenes, showed significant prognostic value when considered as continuous variables in univariate analysis (Additional file 2, Supplementary Table S4). In a stepwise multivariate Cox regression analysis only three of these, the IL-8, Histone, and B-Cell metagenes, remained significant (Additional file 2, Supplementary Table S5). The IL-8 and Histone metagenes were positively correlated with one another in all data sets (see Figure 1). The B-Cell and IL-8 metagenes were both associated with prognosis, but in opposite directions. Based on these observations, we derived a B-Cell/IL-8 metagene ratio as a prognostic index for TNBC. Figure 4A demonstrates that patients with high expression of the B-Cell metagene and low expression of the IL-8 metagene have a significantly better prognosis than other TNBC patients (HR 0.37, 95% CI 0.22 to 0.61; P < 0.001). The five-year event-free survival was 84 ± 4% for the good prognosis group (n = 95) compared to 59 ± 4% for the rest of the patients. In validation cohort-B (n = 30), there was a non-significant trend towards better survival for patients with high B-Cell and low IL-8 metagene expression (P = 0.3, Figure 4B). Since this cohort has limited power due to its small sample size, we also tested the prognostic value on a separate and larger (n = 75) validation cohort of TNBC samples [16]. The B-Cell/IL-8 metagene ratio had significant prognostic value in this second validation cohort-C: the hazard ratio (HR) was 0.26 (95% CI 0.10 to 0.68) and the five-year DFS was 78 ± 9% vs. 45 ± 8% (P = 0.003) (Figure 4C).
The prognostic value was independent of histological grade; Figure 4D, E shows pooled data from all three cohorts to increase sample size, (see also Additional file 1, Supplementary Figure S8 for the individual cohorts). Moreover, the prognostic value of the B-cell/IL8 metagene ratio was observed both in BLBC and non-BLBC TNBCs (P = 0.001 and P = 0.006, respectively; Additional file 1, Supplementary Figure S9). The proportion of BLBC cases was similar in the Good and Poor prognosis groups defined by the B-cell/IL8 metagene ratio (75.2% and 71.8%, respectively; P = 0.54).
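The stratification and the five-year survival estimates can be illustrated with a small sketch. The cutoff values and toy data are invented for illustration; the optimized cutoffs derived from the discovery cohort are described only in the Supplementary Methods. The function names are hypothetical, and the "Good" rule follows the Figure 4 definition (high B-Cell together with low IL-8).

```python
import numpy as np

def classify_good_poor(b_cell, il8, b_cut, il8_cut):
    """'Good' = B-Cell metagene above its cutoff AND IL-8 metagene below its cutoff."""
    b_cell, il8 = np.asarray(b_cell), np.asarray(il8)
    return np.where((b_cell > b_cut) & (il8 < il8_cut), "Good", "Poor")

def kaplan_meier_at(time, event, t):
    """Kaplan-Meier estimate of the event-free fraction at time t.

    time: follow-up per patient; event: 1 if an event occurred, 0 if censored.
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    surv = 1.0
    # Step through each distinct event time up to t
    for ti in np.unique(time[(event == 1) & (time <= t)]):
        at_risk = np.sum(time >= ti)
        d = np.sum((time == ti) & (event == 1))
        surv *= 1.0 - d / at_risk
    return surv

# Toy example with hypothetical cutoffs (b_cut = 0.5, il8_cut = 0.0)
groups = classify_good_poor([1.0, 0.2, 1.5], [-0.5, -0.5, 0.8], b_cut=0.5, il8_cut=0.0)
print(list(groups))

# Toy follow-up in years: events at 1, 3 and 7; censoring at 2, 4 and 6
efs_5yr = kaplan_meier_at([1, 2, 3, 4, 6, 7], [1, 0, 1, 0, 0, 1], t=5)
print(f"five-year event-free fraction = {efs_5yr:.3f}")
```

Deriving the cutoffs once in the discovery cohort and applying them unchanged to the validation cohorts, as the Methods describe, is what keeps the validation estimates free of cutoff optimization bias.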
Figure 4

Prognostic value of the combined B-Cell/IL-8 metagenes among TNBC. A) Kaplan Meier analysis of event free survival of 297 TNBC patients with follow up from the finding cohort A. Samples were stratified according to the prognostic predictor of the combined B-Cell/IL-8 metagenes. "Good" refers to 95 samples with both high B-Cell and low IL-8 metagene expression whereas all other samples (n = 202) are referred to as "Poor". B) Prognostic value of the B-Cell/IL8-metagene prognostic predictor in the 30 TNBC patients with follow up from the validation cohort-B. Samples were stratified as in (A). C) Prognostic value of the B-Cell/IL8-metagene prognostic predictor in the 75 TNBC patients with follow-up from the independent validation cohort-C. Samples were stratified as in (A). D) Prognostic value of the combined B-Cell/IL-8 metagenes among the subset of high grade (G3) TNBC tumors from all three cohorts -A, -B, and -C (n = 186). Samples were stratified as in (A). (Results from the individual cohorts are given in Additional file 1, Supplemental Figure S8). E) Prognostic value of the combined B-Cell/IL-8 metagenes among the subset of low to medium grade (G1 and G2) TNBC tumors from all three cohorts -A, -B, and -C (n = 77). Samples were stratified as in (A). (Results from the individual cohorts are given in Additional file 1, Supplemental Figure S8).

To assess a potential predictive value for sensitivity to systemic adjuvant chemotherapy, the patients were stratified by adjuvant treatment. In the discovery cohort, 186 patients received no adjuvant systemic treatment and 81 patients received chemotherapy (mostly Cyclophosphamide Methotrexate Fluorouracil; CMF). Better prognosis was observed for the high B-cell/low IL8 group in both untreated (P = 0.001) and chemotherapy treated patients (P = 0.05; not shown). A potential predictive value of the B-cell and IL8 metagenes was also analyzed in 191 patients with TNBC who received neoadjuvant chemotherapy.
We assembled this cohort of samples with information on pathologically complete response (pCR) from seven datasets. As shown in Additional file 1, Supplementary Figure S10, the B-cell metagene had a modest predictive value with an area under the curve (AUC) of 0.606, consistent with our previous results [22]. The predictive value for the IL8 metagene was smaller (AUC -0.552). Combining both metagenes increased the AUC to 0.612 (95% CI 0.519 to 0.704; P = 0.018). In multivariate Cox regression analysis, including lymph node status, age, tumor size, and histological grade, only the combined B-Cell/IL8-metagene score showed strong independent prognostic value in both the discovery cohort (HR 0.38, 95% CI 0.22 to 0.67, P = 0.001) and the second, larger validation cohort-C (HR 0.21, 95% CI 0.07 to 0.62, P = 0.005). The only other variable with borderline statistical significance (HR 0.40; 95% CI 0.17 to 0.99, P = 0.046) was lymph node status in validation cohort-C (Table 4). However, even in univariate analyses the remaining clinical variables did not show a significant prognostic value in the analyzed cohorts. This might be attributed to the fact that most TNBC are highly proliferating and grading is not as important for prognosis in this subtype as it is in ER positive disease; in addition, the power of our analysis may be too limited to detect the modest effect of age and tumor size on prognosis within this sample set. The inclusion of a term for chemotherapeutic treatment in the multivariate analysis further reduced the sample size to 213 patients in cohort-A (no treatment information was available for patients from validation cohort-B). Of these 213 patients only 37 were treated with chemotherapy. The combined B-Cell/IL8-metagene score remained significant (P = 0.001) in the corresponding multivariate analysis (Additional file 2, Supplementary Table S9A).
Unexpectedly, chemotherapy treatment was associated with a worse prognosis, probably due to chance or to some form of selection bias towards including higher risk patients in these public data sets (Additional file 2, Supplementary Table S9A). Such a selection bias is consistent with the significantly higher proportion of node positive patients in the chemotherapy group (P = 0.001) and a trend towards higher histological grade (P = 0.074; Additional file 2, Supplementary Table S9B).
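The AUC values quoted above for prediction of pCR can be read as the probability that a randomly chosen responder has a higher marker score than a randomly chosen non-responder. The sketch below computes this rank-based AUC on toy data. The function name and the numbers are hypothetical, and the text does not specify how the two metagenes were combined into one score.

```python
import numpy as np

def roc_auc(scores, labels):
    """Rank-based AUC: P(score of a responder > score of a non-responder), ties count 1/2."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    # Compare every responder with every non-responder
    diff = pos[:, None] - neg[None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / (len(pos) * len(neg))

# Toy data: label 1 = pCR, 0 = residual disease
scores = np.array([0.1, 0.4, 0.35, 0.8])
labels = np.array([0, 0, 1, 1])
auc = roc_auc(scores, labels)
auc_inverse = roc_auc(-scores, labels)
print(f"AUC = {auc:.2f}, AUC of the inverted marker = {auc_inverse:.2f}")
```

An AUC of 0.5 corresponds to no discrimination, so values near 0.6 indicate only modest predictive value. A marker for which low values predict response yields an AUC below 0.5 unless its sign is flipped.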
Table 4

Multivariate analysis of EFS according to standard parameters and the combined B-Cell/IL8-metagene in TNBC

Finding cohort A*Validation cohort C*


VariableNo. of patientsHazard ratio95% CIP-valueNo. of patients§Hazard ratio95% CIP-value
Lymph node statusLNN vs N1210 vs 270.590.31 to 1.120.1043 vs 290.400.17 to 0.990.046
Age> 50 vs ≤ 50113 vs 1240.750.48 to 1.170.2148 vs 241.680.65 to 4.380.29
Tumor size≤ 2 cm vs > 2 cm71 vs 1660.730.44 to 1.210.2211 vs 610.990.28 to 3.420.98
Histological gradingG3 vs G1 and 2166 vs 711.110.68 to 1.810.6859 vs 130.530.22 to 1.290.16
B-Cell/IL8-SignatureGood vs Poor||78 vs 1590.380.22 to 0.670.00129 vs 430.210.07 to 0.620.005

* Results from multivariate Cox analysis of event free survival in the TNBC finding cohort A and validation cohort C are presented.

† information on all parameters was available for 237 of the 297 TNBC samples with follow up data from the finding cohort A.

‡ Significant P-values are given in bold

§ information on all parameters was available for 72 of the 76 TNBC samples with follow up data from the validation cohort C.

|| "Good" refers to high B-Cell metagene together with low IL8 metagene expression compared to all the remaining samples referred as "Poor".
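As a plausibility check on the reported values for the B-Cell/IL8 signature, a hazard ratio and its 95% CI can be converted back to a standard error on the log scale, a Wald z statistic and a two-sided P-value. The sketch below assumes a symmetric Wald interval on the log hazard ratio, which is the usual default of Cox regression software but is not stated in the text; the helper name `wald_from_ci` is hypothetical.

```python
import math

def wald_from_ci(hr, lower, upper, z_crit=1.959964):
    """Recover the log-scale standard error, Wald z and two-sided P-value
    from a hazard ratio and its 95% CI (assumes a Wald interval)."""
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail area
    return se, z, p

# Values for the B-Cell/IL8 signature as reported in Table 4.
se_a, z_a, p_a = wald_from_ci(0.38, 0.22, 0.67)  # finding cohort A
se_c, z_c, p_c = wald_from_ci(0.21, 0.07, 0.62)  # validation cohort C

print(round(p_a, 4), round(p_c, 4))
```

Under this assumption both recomputed P-values are of the same order as the reported 0.001 and 0.005, so the hazard ratios, intervals and P-values in the table are mutually consistent.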


Relationship of the identified metagenes to known prognostic signatures

The correlation of several published prognostic gene signatures to the metagenes discovered within the pure TNBC cohort was analyzed by hierarchical clustering using the gene expression data from cohort-A (Additional file 4, Supplementary Methods Section 13). As shown in Additional file 1, Supplementary Figure S11, the "recurrence score" [36], "genomic grading index" (GGI) [37], and the "wound response signature" [38] display high correlation to the proliferation metagene. On the other hand, the "7-gene immune response (IR) signature" [39], the "stroma derived prognostic predictor" (SDPP) [40], and the "368 gene medullary breast cancer signature" [16] were all highly correlated to immune cell metagenes. The magnitude of the correlation (R² = 0.4 to approximately 0.7) between the different immune metagenes and the related signatures is at the same high level as the correlation between genes within other metagene clusters (R² = 0.5 to approximately 0.7; Table 2). We demonstrated previously [22] that even if the different immune metagenes can discriminate between distinct types of immune cells, the actual infiltration of tumors generally represents a mixture of these different immune cells. In most cases, the differences in the proportions in this mixture are smaller than the global differences in lymphocyte infiltration between individual tumors. Therefore, different immune signatures often carry redundant prognostic information and can replace each other. In contrast to the immune cell metagenes, no correlation between the IL8 metagene and other signatures was observed.

Discussion

It has been suggested that TNBC represent a group of several molecularly [3] and clinically [41,42] distinct disease subtypes. We used gene expression data of a cohort of 394 TNBC to identify molecular subsets within this tumor type. The definition of TNBC was based on gene expression data which is not the standard definition used in the clinic. This might be a caveat but holds the promise that samples erroneously characterized as receptor-negative by immunohistochemistry do not introduce noise into our analysis. We identified 16 metagenes associated with several distinct biological processes that showed variable expression across TNBC (Table 2). Some of the metagenes seem to point to the distinct origins of these cancers [43,44]. These include the basal-like [4], the apocrine [18,19], and the claudin-low [28,29] subtypes of TNBC. Other metagenes were related to non-neoplastic cellular constituents of the tumor microenvironment including stroma [26,27], blood cell [30] and adipocytes [4], as well as signatures for angiogenesis [23,34] and inflammation [31-33]. Five metagenes appear to reflect the variable presence of immune cells and may contribute to the clinical behavior of the cancer [4,20-25,27,45] (Table 2). Kreike et al. [9] detected similar metagenes among 97 TNBC analysed with a different microarray platform. That study suggested that the TNBC clinical phenotype can be equated to the BLBC molecular class determined by the centroid method [46] since 95% of the TNBCs were assigned basal-like molecular class [47]. However, the centroid method is highly susceptible to the composition of the dataset that is used to define the reference centroids [48] and variants of the method can lead to different results [49]. Bertucci et al. [50] identified only 71% of their 172 TNBC cases as basal-like when using a slightly different version of the centroid method for molecular classification. 
When we applied different versions of the centroid method to 1,364 breast cancers, 65% to 90% of the TNBC samples (n = 172) were assigned to the basal-like class depending on the method used (Additional file 2, Supplementary Table S6). In this paper we took a different approach and first identified metagenes and used these metagenes to define molecular subsets among TNBC. One of our metagenes corresponded closely to the gene signatures that are used to define BLBC in the centroid based methods. Our results indicate that BLBC defined based on the basal-like metagene expression represent around 73% of TNBC (Table 3 and Additional file 2, Supplementary Table S2). The proportion of BLBC among TNBC in our study is similar to results from an immunohistochemical study by Rakha et al. [7] that defined BLBC by the expression of CK5/6, CK14, CK17 or EGFR. These authors observed a worse survival of the 165 patients with BLBC compared to the remaining 67 TNBC cases, which expressed none of these markers. However, we did not detect differences in the prognosis of BLBC and non-BLBC type triple negative cancers (Additional file 1, Supplementary Figure S7). In the study by Rakha et al. the prognostic effect was mainly confined to 103 untreated patients. Still, even when we analyzed untreated patients (n = 186) separately, we detected no prognostic value of the BLBC phenotype (not shown). Our results are also contrary to the immunohistochemical study of Cheang et al. [51], which used CK5/6 and EGFR antibodies for TNBC stratification. They also observed a worse prognosis of 336 BLBC TNBC compared to 303 non-BLBC TNBC. However, our study is not directly comparable to these prior reports because our definition of BLBC is fundamentally different from the IHC-based methods. Our results are in line with several other genomic profiling studies that reported limited prognostic value for the BLBC molecular class among clinically triple negative cancers [18,19,50]. 
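The dependence of centroid-based classification on dataset composition can be illustrated with a toy example. The sketch below assumes the common scheme in which each gene is centered on the mean of the cohort under study before the sample is correlated with fixed reference centroids; the three-gene centroids, the expression values and the function names are all hypothetical and do not reproduce the specific methods of [46-49].

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    dx = [a - mx for a in x]
    dy = [b - my for b in y]
    num = sum(a * b for a, b in zip(dx, dy))
    den = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return num / den

def assign_subtype(raw_sample, cohort_gene_means, centroids):
    """Center the sample on the cohort's gene means, then return the
    name of the centroid with the highest correlation."""
    centered = [v - m for v, m in zip(raw_sample, cohort_gene_means)]
    return max(centroids, key=lambda name: pearson(centered, centroids[name]))

# Hypothetical reference centroids over three genes (illustrative only).
centroids = {
    "basal-like": [1.0, -1.0, 0.2],
    "luminal": [-1.0, 1.0, -0.2],
    "normal-like": [0.0, 0.2, 1.0],
}

sample = [8.0, 5.0, 6.0]           # one tumor, raw log expression
mixed_means = [6.0, 7.0, 6.0]      # gene means in a mixed ER+/ER- cohort
tnbc_only_means = [9.0, 4.5, 5.0]  # gene means in a TNBC-only cohort

label_mixed = assign_subtype(sample, mixed_means, centroids)
label_tnbc_only = assign_subtype(sample, tnbc_only_means, centroids)
print(label_mixed, label_tnbc_only)
```

In this toy setting the same tumor is called basal-like when centered on the mixed cohort but not when centered on the TNBC-only cohort, which mirrors the variation in basal-like proportions described above.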
We observed strong prognostic value for several of the other metagenes (Additional file 2, Supplementary Table S4). An improved prognosis was observed for patients with tumors displaying high expression of immune system related metagenes, which supports recent reports [20,23-25,27,39,40,52,53]. An association with decreased survival was observed for high expression of inflammation (IL-8), an angiogenesis/hypoxia signature (VEGF) [34], and histone-related metagenes (Additional file 2, Supplementary Table S4 and Figure 1). A simple combination of high B-Cell and low IL8 metagene expression identifies a subset of TNBC patients (32% of all) with a favorable prognosis and a five-year event-free survival of 84%. In multivariate analysis, only this metagene ratio and lymph node status were significant prognostic predictors in our cohort of TNBC patients (Table 4 and Figure 4D, E). Other known prognostic factors in breast cancer, such as age, tumor size and histological grade, were not significant in our cohorts, even in univariate analysis. Most TNBC are high grade and, therefore, grade is not as important for prognosis in this subtype as it is in ER positive disease. TNBCs are also often associated with younger age but the impact of age and tumor size on prognosis within this subtype is not yet fully clear. Still, it cannot be excluded that a bias in our cohort is the reason for the lack of significance of these factors. Our analyses of neoadjuvant treated TNBC samples suggest modest predictive value of the B-cell/IL8 metagene ratio for currently used chemotherapies [22,54] (Additional file 1, Supplementary Figure S10). We also observed a pure prognostic value in untreated patients of the finding cohort, in line with other reports on the B-cell metagene [24,27]. Treatment information on the samples from the validation cohort was not available.
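The grouping into "Good" (high B-Cell together with low IL8 metagene expression) versus all remaining samples can be sketched as follows. This sketch assumes that a metagene score is the mean expression of its probesets and that each metagene is dichotomized at the cohort median; the actual cutoffs and probesets are given in the supplementary material and are not reproduced here, so the probeset names, sample values and function names below are purely hypothetical.

```python
from statistics import mean, median

def metagene_score(expr, probesets):
    """Mean expression of a sample over the probesets of one metagene."""
    return mean(expr[p] for p in probesets)

def classify(samples, bcell_probes, il8_probes):
    """Label a sample 'Good' if its B-cell score is above the cohort
    median and its IL8 score is at or below the cohort median."""
    b = {sid: metagene_score(e, bcell_probes) for sid, e in samples.items()}
    i = {sid: metagene_score(e, il8_probes) for sid, e in samples.items()}
    b_cut, i_cut = median(b.values()), median(i.values())  # assumed cutoffs
    return {
        sid: "Good" if b[sid] > b_cut and i[sid] <= i_cut else "Poor"
        for sid in samples
    }

# Hypothetical probeset names and log expression values.
samples = {
    "s1": {"B1": 9.0, "B2": 8.0, "I1": 3.0},
    "s2": {"B1": 4.0, "B2": 5.0, "I1": 7.0},
    "s3": {"B1": 8.0, "B2": 8.0, "I1": 8.0},
    "s4": {"B1": 3.0, "B2": 4.0, "I1": 2.0},
}

groups = classify(samples, ["B1", "B2"], ["I1"])
print(groups)
```

Only samples that satisfy both conditions are labeled "Good", which is why the favorable group is a minority of the cohort.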
Our observation is important since every currently available genomic prognostic signature (for example, the 70-gene profile [55], Recurrence Score [36], Genomic Grading Index [37]) assigns poor prognostic risk status to all TNBC samples despite their variable outcome [56-58]. One of these signatures, the Rotterdam-76-gene prognostic signature [59], was developed in a way to allow prognostic stratification of ER-negative cancers. However, similar to other reports [9], we were not able to demonstrate a prognostic value for this signature (Additional file 1, Supplementary Figure S12). We used an unsupervised class discovery approach to first identify the main molecular subtypes within the data and then assess the prognostic differences between the molecular subsets. Interestingly, when we performed an independent supervised analysis that compared TNBC cases with or without recurrence, we also identified IL-8 as the top ranked gene associated with poor prognosis (Additional file 1, Supplementary Figure S13 and Additional file 2, Supplementary Table S8). However, gene signatures obtained through supervised analysis were not superior to the molecular structure based prognostic predictions in validation (Additional file 1, Supplementary Figure S14). In addition, the biological interpretation of the empirically derived prognostic signature is more difficult than the interpretation of metagenes. In summary, we performed the largest unsupervised analysis of pooled gene expression data from TNBC. We describe a new prognostic signature for these cancers that identifies about one-third of TNBC as relatively low risk for recurrence. These cancers are characterized by high B-cell and low IL-8 metagene expression and have about 84% recurrence-free survival at five years. Whereas this may not be sufficiently high to forego adjuvant chemotherapy, these observations pave the way to develop a clinically useful multivariate prognostic model for TNBC.
A combined prognostic score, including clinical variables, such as nodal status and perhaps tumor size, and molecular variables, such as optimized B-cell and IL-8 metagenes (measured by an RT-PCR or array-based method), may identify patients with very low risk of recurrence even with ER-, PgR- and HER2-negative breast cancer. Equally important, the prognostic importance of B-cells and the negative impact of IL-8 suggest potential novel therapeutic strategies for TNBC that can be tested in the clinic [31,32]. The signature could allow the selection of those patients who could profit most from novel immune stimulating drugs like anti-CTLA-4 antibodies that have shown promise in melanoma [60,61]. IL8 could also directly increase the survival of breast cancer stem cells after chemotherapy [62], an effect that can be blocked with IL8 directed drugs [63]. Such an effect might explain the triple negative paradox of high relapse rates despite a good initial response to chemotherapy.

Conclusions

In the largest and most comprehensive analysis of all available gene expression data in TNBC, we first identified structures in the molecular data without considering any clinical outcome. Subsequently, these molecular phenotypes were correlated with survival in multivariate analysis, including routine clinical and pathological variables. Our most important observation is that a high B-cell presence and low IL-8 activity identifies a good prognosis group, even in the absence of systemic therapy, among TNBC. These observations directly point to therapeutic interventions, such as the inhibition of the IL-8 pathway and activation of the immune system in the tumor microenvironment that could benefit patients with this disease.

Abbreviations

AUC: area under the curve; BLBC: basal-like breast cancer; CK: cytokeratin; DMFS: distant metastasis free survival; EFS: event free survival; EGFR: epidermal growth factor receptor; ER: estrogen receptor; FNA: fine needle aspiration; GGI: genomic grading index; HER2: human epidermal growth factor receptor 2; HR: hazard ratio; IL: interleukin; IR: immune response; MHC: major histocompatibility complex; PgR: progesterone receptor; REMARK: recommendations for prognostic and tumor marker studies; RFS: relapse free survival; SDPP: stroma derived prognostic predictor; TNBC: triple negative breast cancer; VEGF: vascular endothelial growth factor.

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

AR, TK and UH conceived the study, carried out the analyses and wrote the manuscript. CL and LP added experimental data, participated in the interpretation of the data and in writing the manuscript. ER, LH, RG, CS, AA, MS and VM provided patients and samples, obtained follow-up data and helped to draft the manuscript. DM and TK performed the statistical analysis. MK initiated the study and participated in the design and writing of the manuscript. All authors read and approved the final manuscript.

Additional file 1

Supplementary Figures S1 to S15. An Adobe file containing 15 supplementary figures (S1 to S15).

Additional file 2

Supplementary Tables S1 to S7. An Adobe file containing seven supplementary tables (S1 to S7).

Additional file 3

Supplementary Table S8. An Excel file containing a supplementary table (S8) with lists of probesets and corresponding information from the supervised analysis by SAM.

Additional file 4

Supplementary Methods. An Adobe file containing supplementary information on methodology and six additional supplementary figures (S16 to S21), which are referred to within the supplementary methods.

Additional file 5

Supplementary R files. A zipped package containing an R script file of the analysis with respective links to the complete dataset files in GEO and a text file of the metagene probesets used in the R analysis.
  63 in total

1.  Mammary development meets cancer genomics.

Authors:  Aleix Prat; Charles M Perou
Journal:  Nat Med       Date:  2009-08       Impact factor: 53.440

2.  Use of archived specimens in evaluation of prognostic and predictive biomarkers.

Authors:  Richard M Simon; Soonmyung Paik; Daniel F Hayes
Journal:  J Natl Cancer Inst       Date:  2009-10-08       Impact factor: 13.506

3.  Residual breast cancers after conventional therapy display mesenchymal as well as tumor-initiating features.

Authors:  Chad J Creighton; Xiaoxian Li; Melissa Landis; J Michael Dixon; Veronique M Neumeister; Ashley Sjolund; David L Rimm; Helen Wong; Angel Rodriguez; Jason I Herschkowitz; Cheng Fan; Xiaomei Zhang; Xiaping He; Anne Pavlick; M Carolina Gutierrez; Lorna Renshaw; Alexey A Larionov; Dana Faratian; Susan G Hilsenbeck; Charles M Perou; Michael T Lewis; Jeffrey M Rosen; Jenny C Chang
Journal:  Proc Natl Acad Sci U S A       Date:  2009-08-03       Impact factor: 11.205

Review 4.  Triple-negative breast cancer--current status and future directions.

Authors:  O Gluz; C Liedtke; N Gottschalk; L Pusztai; U Nitz; N Harbeck
Journal:  Ann Oncol       Date:  2009-11-09       Impact factor: 32.976

Review 5.  Histological and molecular types of breast cancer: is there a unifying taxonomy?

Authors:  Britta Weigelt; Jorge S Reis-Filho
Journal:  Nat Rev Clin Oncol       Date:  2009-12       Impact factor: 66.675

6.  Effects of infiltrating lymphocytes and estrogen receptor on gene expression and prognosis in breast cancer.

Authors:  Alberto Calabrò; Tim Beissbarth; Ruprecht Kuner; Michael Stojanov; Axel Benner; Martin Asslaber; Ferdinand Ploner; Kurt Zatloukal; Hellmut Samonigg; Annemarie Poustka; Holger Sültmann
Journal:  Breast Cancer Res Treat       Date:  2008-07-01       Impact factor: 4.872

Review 7.  Do 'basal-like' breast cancers really exist?

Authors:  Barry Gusterson
Journal:  Nat Rev Cancer       Date:  2008-12-29       Impact factor: 60.716

8.  Genomic grade index is associated with response to chemotherapy in patients with breast cancer.

Authors:  Cornelia Liedtke; Christos Hatzis; William Fraser Symmans; Christine Desmedt; Benjamin Haibe-Kains; Vicente Valero; Henry Kuerer; Gabriel N Hortobagyi; Martine Piccart-Gebhart; Christos Sotiriou; Lajos Pusztai
Journal:  J Clin Oncol       Date:  2009-04-13       Impact factor: 44.544

9.  T-cell metagene predicts a favorable prognosis in estrogen receptor-negative and HER2-positive breast cancers.

Authors:  Achim Rody; Uwe Holtrich; Lajos Pusztai; Cornelia Liedtke; Regine Gaetje; Eugen Ruckhaeberle; Christine Solbach; Lars Hanker; Andre Ahr; Dirk Metzler; Knut Engels; Thomas Karn; Manfred Kaufmann
Journal:  Breast Cancer Res       Date:  2009-03-09       Impact factor: 6.466

10.  A compact VEGF signature associated with distant metastases and poor outcomes.

Authors:  Zhiyuan Hu; Cheng Fan; Chad Livasy; Xiaping He; Daniel S Oh; Matthew G Ewend; Lisa A Carey; Subbaya Subramanian; Robert West; Francis Ikpatt; Olufunmilayo I Olopade; Matt van de Rijn; Charles M Perou
Journal:  BMC Med       Date:  2009-03-16       Impact factor: 8.775
