Laetitia Marisa, Aurélien de Reyniès, Alex Duval, Janick Selves, Marie Pierre Gaub, Laure Vescovo, Marie-Christine Etienne-Grimaldi, Renaud Schiappa, Dominique Guenot, Mira Ayadi, Sylvain Kirzin, Maurice Chazal, Jean-François Fléjou, Daniel Benchimol, Anne Berger, Arnaud Lagarde, Erwan Pencreach, Françoise Piard, Dominique Elias, Yann Parc, Sylviane Olschwang, Gérard Milano, Pierre Laurent-Puig, Valérie Boige.
Abstract
BACKGROUND: Colon cancer (CC) pathological staging fails to accurately predict recurrence, and to date, no gene expression signature has proven reliable for prognosis stratification in clinical practice, perhaps because CC is a heterogeneous disease. The aim of this study was to establish a comprehensive molecular classification of CC based on mRNA expression profile analyses.
Year: 2013 PMID: 23700391 PMCID: PMC3660251 DOI: 10.1371/journal.pmed.1001453
Source DB: PubMed Journal: PLoS Med ISSN: 1549-1277 Impact factor: 11.069
Patient and tumor characteristics of the different sets.
| Characteristics | CIT Cohort Patients | CIT Discovery Dataset | Validation Dataset: CIT | Validation Dataset: Public | p-Value: CIT Cohort | p-Value: All Cohorts |
| | 67 (14, 19–97) | 67 (14, 22–97) | 68 (12, 42–90) | 68 (13, 23–95) | 0.21 | 0.25 |
| | 429/321 (57/43) | 237/206 (53/47) | 73/50 (59/41) | 347/330 (51/49) | 0.24 | 0.24 |
| TNM stage | | | | | | |
| I | 52 (7) | 27 (6) | 10 (8) | 48 (11) | 0.058 | <0.001 |
| II | 351 (47) | 198 (45) | 66 (54) | 205 (46) | | |
| III | 265 (35) | 164 (37) | 41 (33) | 113 (25) | | |
| IV | 82 (11) | 54 (12) | 6 (5) | 83 (18) | | |
| NA | 0 | 0 | 0 | 457 | | |
| Tumor location | | | | | | |
| Proximal | 305 (41) | 176 (40) | 48 (39) | 125 (51) | 0.97 | 0.014 |
| Distal | 445 (59) | 267 (60) | 75 (61) | 122 (49) | | |
| NA | 0 | 0 | 0 | 659 | | |
| | | | | | | |
| Yes | 257 (42) | 161 (45) | 42 (40) | 91 (51) | 0.42 | 0.31 |
| No | 357 (58) | 200 (55) | 64 (60) | 87 (49) | | |
| NA | 2 | 1 | 6 | 140 | | |
| | 51.5 (37, 0–201) | 50 (39, 0–201) | 58 (37, 0–146) | 48 (26, 0–143) | 0.33 | <0.001 |
| | | | | | | |
| Yes | 179 (29) | 109 (30) | 30 (29) | 75 (24) | 0.81 | 0.08 |
| Distant/locoregional/both | 149/23/7 | 83/22/4 | 29/0/1 | — | | |
| No | 428 (71) | 250 (70) | 72 (71) | 239 (76) | | |
| NA | 9 | 3 | 5 | 4 | | |
| | 118/701 (17) | 61/409 (15) | 14/110 (13) | 126/418 (30) | 0.67 | <0.001 |
| | 102/555 (18) | 74/380 (19) | 17/116 (15) | — | 0.3 | — |
| | 261/680 (38) | 172/392 (41) | 45/121 (37) | — | 0.57 | — |
| | 70/634 (11) | 44/424 (11) | 7/120 (6) | — | 0.12 | — |
| | 226/451 (50) | 135/245 (55) | 55/106 (52) | — | 0.66 | — |
p-Values are Chi-squared test p-values comparing the discovery and validation sets in the CIT cohort only and in all cohorts (excluding samples for which data were not available).
Among patients with stage II–III CC.
Only fluorouracil and folinic acid.
NA, not available; sd, standard deviation.
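The chi-squared comparison described in the table footnote can be illustrated on the stage I–IV counts of the CIT discovery and CIT validation columns. The following is a minimal sketch using only the Python standard library; the function names `chi2_sf` and `chi2_independence` are illustrative, and the original analysis software and settings are not stated in this record.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variable with integer df."""
    half = x / 2.0
    if df % 2 == 0:
        # Even df: finite sum of Poisson-type terms.
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc term plus a finite sum over half-integer powers.
    total = math.erfc(math.sqrt(half))
    for k in range(1, (df - 1) // 2 + 1):
        total += math.exp(-half) * half ** (k - 0.5) / math.gamma(k + 0.5)
    return total

def chi2_independence(table):
    """Pearson chi-squared test of independence (no continuity correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df, chi2_sf(stat, df)

# Stage I, II, III, IV counts taken from the table above.
stage_counts = [
    [27, 198, 164, 54],  # CIT discovery dataset
    [10, 66, 41, 6],     # CIT validation dataset
]
stat, df, p = chi2_independence(stage_counts)
print(f"chi2 = {stat:.2f}, df = {df}, p = {p:.3f}")
```

Under these assumptions the p-value should come out close to the 0.058 reported for stage in the CIT cohort.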
Figure 1. Unsupervised gene expression analysis of the discovery set of 443 colon cancers.
(A) Consensus matrix heatmap defining six clusters of samples for which consensus values range from 0 (in white, samples never clustered together) to 1 (dark blue, samples always clustered together). (B) Distance between clusters according to the hierarchical clustering of the 1,459 probe sets based on the centroids of each cluster. (C) GEP heatmap of the 1,459 probe sets ordered by subtype, with annotations associated with each subtype.
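The consensus values in panel A can be read as co-clustering frequencies across repeated clustering runs. The sketch below shows one common way to compute such a matrix; the function name `consensus_matrix`, the toy runs, and the rule of counting only runs in which both samples were drawn are assumptions for illustration, not the authors' pipeline.

```python
from itertools import combinations

def consensus_matrix(runs, n_samples):
    """Fraction of runs in which each pair of samples shared a cluster.

    runs: list of dicts mapping sample index -> cluster label; a sample
    missing from a dict was not drawn in that resampling run.
    """
    together = [[0] * n_samples for _ in range(n_samples)]
    sampled = [[0] * n_samples for _ in range(n_samples)]
    for labels in runs:
        for i, j in combinations(sorted(labels), 2):
            sampled[i][j] += 1
            sampled[j][i] += 1
            if labels[i] == labels[j]:
                together[i][j] += 1
                together[j][i] += 1
    matrix = [[0.0] * n_samples for _ in range(n_samples)]
    for i in range(n_samples):
        for j in range(n_samples):
            if i == j:
                matrix[i][j] = 1.0  # a sample always clusters with itself
            elif sampled[i][j]:
                matrix[i][j] = together[i][j] / sampled[i][j]
    return matrix

# Hypothetical toy input: three resampling runs over four samples.
toy_runs = [
    {0: 0, 1: 0, 2: 1, 3: 1},
    {0: 0, 1: 0, 2: 0, 3: 1},
    {0: 0, 1: 1, 2: 1},
]
for row in consensus_matrix(toy_runs, 4):
    print([round(value, 2) for value in row])
```

A value of 0 then corresponds to a pair that never clustered together and 1 to a pair that always did, matching the color scale described in the caption.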
Figure 2. Signaling pathways associated with each molecular subtype.
The enrichment of Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO) pathways and of gene sets related to cancer hallmarks was tested in each subtype signature (the 1,000 most differentially up-regulated and down-regulated genes, tested separately). The hypergeometric test p-values for enrichment in up- and down-regulated signatures are indicated in red and green, respectively. ECM, extracellular matrix.
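A hypergeometric enrichment p-value of the kind mentioned in this caption is the probability of seeing at least the observed overlap between a signature and a pathway when the signature is drawn at random from the gene universe. This is a minimal sketch; the function name `hypergeom_enrichment_p` and all counts in the usage line are hypothetical and do not come from the study.

```python
from math import comb

def hypergeom_enrichment_p(universe, pathway, signature, overlap):
    """P(X >= overlap) for a hypergeometric draw.

    universe:  number of genes that could be selected
    pathway:   number of universe genes annotated to the pathway
    signature: number of genes in the signature (the draw size)
    overlap:   number of signature genes that belong to the pathway
    """
    total = comb(universe, signature)
    upper = min(pathway, signature)
    tail = sum(
        comb(pathway, k) * comb(universe - pathway, signature - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

# Hypothetical counts: 20,000 genes, a 150-gene pathway,
# a 1,000-gene signature, and 20 genes in common.
print(hypergeom_enrichment_p(20000, 150, 1000, 20))
```

The sum uses exact integer arithmetic, so it stays accurate for small tail probabilities at the cost of some speed for very large pathways.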
Figure 3. Summary of the main characteristics of the six subtypes.
Symbols correspond to the relative frequency within the subtype (o: very low frequencies [∼0%]; +++: very high frequencies; +/++: intermediate frequencies), and arrows indicate significant enrichment of subtype up- and down-regulated genes in most of the pathways of the given category. EMT, epithelial–mesenchymal transition; SC, stem cell; Wnt pw, Wnt pathway.
Figure 4. Kaplan-Meier relapse-free survival.
Relapse-free survival (RFS) in (A) the discovery dataset, (B) the validation dataset, (C) the overall dataset, and (D) the overall dataset for the C4 and C6 subtypes combined versus the other subtypes. Numbers at risk are given along the time axis.
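For readers unfamiliar with how a Kaplan-Meier curve and its numbers at risk are obtained, the estimator can be sketched in a few lines. The function name `kaplan_meier` and the follow-up times below are hypothetical toy values, not patient data from the study.

```python
def kaplan_meier(times, events):
    """Return (time, n_at_risk, survival) at each distinct event time.

    times:  follow-up time per patient
    events: 1 if a relapse was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        relapses = removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            relapses += data[i][1]
            removed += 1
            i += 1
        if relapses:
            survival *= 1 - relapses / n_at_risk
            curve.append((t, n_at_risk, survival))
        # Relapsed and censored patients both leave the risk set.
        n_at_risk -= removed
    return curve

# Hypothetical toy follow-up (months) and relapse indicators.
months = [6, 12, 12, 20, 30, 45]
relapsed = [1, 1, 0, 0, 1, 0]
for t, at_risk, s in kaplan_meier(months, relapsed):
    print(t, at_risk, round(s, 3))
```

Censored patients lower the number at risk without lowering the survival estimate, which is why the at-risk counts under each panel shrink faster than the curves drop.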
Univariate and multivariate analyses of relapse-free survival according to clinical annotations, the six-subtype classification, and the Oncotype DX prognostic classifier in the overall dataset.
| Variables | Value | Univariate Analysis | | | | | | Multivariate Model 1 | | | | | Multivariate Model 2 | | | | |
| | | | | HR | 95% CI | Modality | Model | | HR | 95% CI | Modality | Model | | HR | 95% CI | Modality | Model |
| TNM stage | III | 775 | 202 | 2 | 1.5–2.6 | 0.0000011 | 0.00000064 | 775 | 1.9 | 1.4–2.5 | 0.0000077 | 1.2×10−09 | 775 | 1.8 | 1.4–2.4 | 0.000022 | 6.0×10−11 |
| CIT classification recoded | C4C6 | 775 | 202 | 2 | 1.5–2.7 | 0.000011 | 0.0000071 | | 1.8 | 1.3–2.5 | 0.00011 | | | 1.5 | 1.1–2.1 | 0.0097 | |
| CIT classification | C2 | 775 | 202 | 1.1 | 0.66–1.7 | 0.83 | 0.0003 | | | | | | | | | | |
| | C3 | 775 | 202 | 0.94 | 0.54–1.6 | 0.83 | | | | | | | | | | | |
| | C4 | 775 | 202 | 2.3 | 1.4–3.8 | 0.00063 | | | | | | | | | | | |
| | C5 | 775 | 202 | 1.4 | 0.89–2.1 | 0.15 | | | | | | | | | | | |
| | C6 | 775 | 202 | 2.1 | 1.3–3.4 | 0.0031 | | | | | | | | | | | |
| Tumor location (ref = distal) | Proximal | 623 | 173 | 0.85 | 0.62–1.1 | 0.29 | 0.29 | | | | | | | | | | |
| Sex | Male | 775 | 202 | 1.2 | 0.88–1.5 | 0.3 | 0.3 | | | | | | | | | | |
| Age | — | 774 | 202 | 1 | 0.99–1 | 0.84 | 0.84 | | | | | | | | | | |
| Oncotype DX recurrence score | High risk | 775 | 202 | 1.9 | 1.4–2.5 | 0.0000050 | 0.0000034 | | | | | | | 1.6 | 1.2–2.1 | 0.0027 | |
Analyses of RFS were performed using Cox regression. The multivariate models reported are the best models obtained with a backward–forward selection procedure (R function step). Multivariate model 1 included all the clinical annotations listed (except tumor location, which had more missing values) and the classification. Multivariate model 2 included only variables of prognostic interest, i.e., TNM stage, the recoded CIT classification, and the Oncotype DX classifier. Only samples for which all variables were available were included in the multivariate models.
Value indicates the modality of the annotation associated with the HR.
Variables used in multivariate model 1.
Variables used in multivariate model 2.
ref, reference.
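The hazard ratios, confidence intervals, and p-values in this table are linked through the standard error of the log hazard ratio. As a rough consistency check, and assuming Wald-type 95% intervals (an assumption, since the record does not say how the intervals were computed), the standard error and an approximate p-value can be recovered from a reported HR and CI. The function name `wald_from_ci` is illustrative.

```python
import math

def wald_from_ci(hr, lower, upper, z_crit=1.959964):
    """Approximate SE, z statistic and two-sided p from an HR and its 95% CI.

    Assumes the interval is exp(log(HR) +/- z_crit * SE).
    """
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return se, z, p

# Univariate TNM stage III row: HR 2, 95% CI 1.5-2.6.
se, z, p = wald_from_ci(2.0, 1.5, 2.6)
print(f"SE = {se:.3f}, z = {z:.2f}, p = {p:.1e}")
```

Because the table rounds the HR and CI to two significant figures, the recovered p-value can only be expected to match the reported one in order of magnitude.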