Rebeca Sanz-Pamplona, Antoni Berenguer, David Cordero, Samantha Riccadonna, Xavier Solé, Marta Crous-Bou, Elisabet Guinó, Xavier Sanjuan, Sebastiano Biondo, Antonio Soriano, Giuseppe Jurman, Gabriel Capella, Cesare Furlanello, Victor Moreno.
Abstract
INTRODUCTION: The traditional staging system is inadequate to identify those patients with stage II colorectal cancer (CRC) at high risk of recurrence or with stage III CRC at low risk. A number of gene expression signatures to predict CRC prognosis have been proposed, but none is routinely used in the clinic. The aim of this work was to assess the prediction ability and potential clinical usefulness of these signatures in a series of independent datasets.
Year: 2012 PMID: 23145004 PMCID: PMC3492249 DOI: 10.1371/journal.pone.0048877
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. PRISMA diagram depicting the flow of information through the different phases of the systematic review of prognosis signature studies.
Description of signatures used in this work.
| Signature | Signature size | Training sample size (good + poor) | Training outcome | Training platform | Signature validation | Independent validation outcome | Size of independent validation sample (good + poor) | Validation results | Reference |
|---|---|---|---|---|---|---|---|---|---|
| | 13 | 65 (56+9) | Recurrence in colorectal tumor samples | Affymetrix | No validation | – | – | – | 19156145 |
| | 3 | 195 (173+22) | Recurrence in colorectal tumor samples | Affymetrix | Independent expression dataset | Recurrence | 50 (25+25) | Association | 19156145 |
| | 36 | 20 (10+10) | A proliferation signature was derived from cell lines and a data set containing physiological expression of human colon crypts | MWG 30 K Oligo Set | Two independent expression datasets | Recurrence | 108 (84+24) | Association | 19238634 |
| | 72 | 25 (15+10) | Recurrence in colorectal tumor samples | Affymetrix | Leave-one-out cv/Independent TMA (1 protein) | Recurrence | 137* | 88% accuracy/Association | 16143127 |
| | 8 | 16 (10+6) | Recurrence in colorectal tumor samples | Human 19 K Oligo Array | No validation | – | – | – | 17390049 |
| | 30 | 18 (9+9) | Recurrence in colorectal tumor samples | Affymetrix | 3-fold cv | – | – | 78% accuracy | 16091735 |
| | 30 | 50 (25+25) | Recurrence in colorectal tumor samples | Affymetrix | 2-fold cv/k-fold Monte Carlo cv | – | – | 80% accuracy/76% accuracy | 16966692 |
| | 244 | 24 (13+9) | Metastasis in colorectal tumor samples | cDNA | Leave-one-out cv | – | – | 82% accuracy | 14973550 |
| | 43 | 78 (32+46) | Overall survival in colorectal tumor samples | cDNA | Leave-one-out cv/Independent expression dataset | Prognosis not specified | 95* | 90% accuracy/Association | 15908663 |
| | 32 | 6 (3+3) | Cell lines were used to build a metastatic potential profile. | Affymetrix | Independent expression dataset (5 genes) | Overall survival | 181* | Association | 20077526 |
| | 7 | 73 (42+31) | Recurrence in colorectal tumor samples | Affymetrix | Independent expression dataset/Independent RT-PCR | Recurrence | 123 (105+18)/110 (86+24) | 68% accuracy/82% accuracy | 18556775 |
| | 163 | 209 (86+123) | Dukes' A vs D colorectal tumor samples. Data set included 30 distant metastases. | Affymetrix | 2-fold cv | Recurrence and overall survival | 99* | Association | 19996206 |
| | 36 | 100 (69+31) | Recurrence in rectum samples | Illumina Sentrix Human-6 Expression BeadChip | 5-fold cv/6-fold Monte Carlo cv | – | – | 71% accuracy/80% accuracy | 20670856 |
| | 634 | 215 (142+73) | Metastasis in colon tumor samples | cDNA | 5-fold cv/Independent expression dataset | Metastasis | 144 (85+59) | 0.68 AUC/0.68 AUC | 22067406 |
| | 54 | 23 (9+14) | Only rectum samples used to build a chemoradiotherapy response signature | cDNA | No validation | – | – | – | 19380020 |
| | 19 | 55 (29+26) | Recurrence in colorectal tumor samples | Affymetrix | Independent expression dataset | Recurrence | 149 (102+47) | 67% accuracy | 17255271 |
| | 22 | 149 (102+47) | Recurrence in colorectal tumor samples | MWG 30 K Oligo Set | Independent expression dataset | Recurrence | 55 (29+26) | 71% accuracy | 17255271 |
| | 12 | 48 (32+16) | Breast tumor samples were used to derive an instability profile. | cDNA | Three independent expression datasets | Metastasis | 50 (25+25)/24 (14+10) | 69–72% accuracy | 21161944 |
| | 7 | 1851* | Recurrence in colorectal tumor samples | RT-PCR | Independent RT-PCR | Recurrence | 1436 (1158+278) | Association | 20679606 |
| | 8 | 95 (58+37) | Recurrence in colorectal tumor samples | DASL Illumina Cancer Panel | No validation | – | – | – | 20706727 |
| | 7 | 74 (54+20) | Overall survival in colorectal tumor samples | RT-PCR | No validation | – | – | – | 19901968 |
| | 18 | 188 (137+51) | Metastasis in colorectal tumor samples | Agilent WG oligo hd | Four independent expression datasets | Recurrence | 206*/100 (62+38) | Association | 21098318 |
| | 6 | 57* | Cancer specific survival in colorectal tumor samples | RT-PCR | 2-fold cv/Independent RT-PCR | Cancer specific survival | 83* | Association | 19737943 |
| | 34 | 55 (35+20) | Profile genes searched as indicated DNA replication. | Affymetrix | Independent expression dataset | Recurrence, cancer specific survival and overall survival | 177 (103+71)/(122+55)/(104+73) | 63–70% accuracy | 19914252 |
| | 113 | 159* | Profile derived as co-expressed with WIPF1. | Affymetrix | Independent expression dataset | Recurrence and overall survival | 62 (47+13)/(50+12) | Association | 19399471 |
| | 163 | 232 (177+55) | Recurrence in colorectal tumor samples | Affymetrix | Leave-one-out cv/Independent expression dataset | Recurrence | 60 (44+16) | Association | 21119668 |
| | 28 | 96 (59+37) | Prognostic profile derived from breast tumor samples | cDNA | Two independent expression datasets | Metastasis | 50 (25+25)/24 (14+10) | 94% accuracy/75% accuracy | 20596637 |
| | 23 | 38 (25+13) | Recurrence in colorectal tumor samples | Affymetrix | 2-fold cv | – | – | 78% accuracy | 15051756 |
| | 45 | 36 (23+13) | Recurrence in colorectal tumor samples | Affymetrix | 9-fold Monte Carlo cv | – | – | 92% accuracy | 19016304 |
| | 10 | 160 (115+45) | Liver metastasis in colorectal tumor samples | Affymetrix | 2-fold cv | – | – | 86% accuracy | 20570135 |
| | 119 | 92 (32+60) | Metastasis in colorectal tumor samples. Data set included 34 liver metastases. | Colonochip | Independent expression dataset | Metastasis | 28 (18+10) | 93% accuracy | 17143521 |
Signature: signature name; Training dataset: public training data set, if used in this work; Validation dataset: public test data set, if used in this work; Signature size: signature size reported in the original paper (genes or features); Training sample size (good + poor): sample size of the training data set, separating good and poor prognosis when reported; Training outcome: outcome used to derive the signature; Training platform: platform used for the training data set; Signature validation: type of validation of the signature, if performed; Independent validation outcome: outcome used for independent validation, if performed; Validation results: for each validation performed, classification accuracy measures or assessment of association, if provided; Reference: PMID and reference for the publishing paper. * Frequencies of subgroups were not available. Abbreviations: TMA: tissue microarray; cv: cross-validation; ns: not specified.
Datasets description.
| Dataset | Trained signatures | Validation signatures | Outcome | Minimum follow up | Number of samples (no event + event) | Clinical info* | Platform |
|---|---|---|---|---|---|---|---|
| GSE5206 | ST09 | SL10 | Recurrence | Not available | 100 (62+38) | Stage 0–4, MSI no info | Affymetrix |
| GSE17537 | SM09, VL10a | – | Recurrence completed with specific survival | 3 years | 47 (27+20) | Stage 1–4, MSI (NA) | Affymetrix |
| GSE17536 | VL10a | SM09 | Recurrence completed with specific survival | 3 years | 141 (68+73) | Stage 1–4, MSI (NA) | Affymetrix |
| GSE2630 | BD07 | – | Recurrence | 5 years | 16 (10+6) | Stage 1–2, MSI no info | H 19K Oligo |
| E-MEXP-1245 | LN07NZ | SL10 | Recurrence | 5 years | 149 (102+47) | Stage 1–4 (NA), MSI no info | MWG H 30K |
| GSE12945 | – | ST09 | Recurrence | 3 years | 55 (42+13) | Stage 1–4, MSI no info | Affymetrix |
| GSE10402 | – | SL10 | Recurrence | Not available | 73 (63+10) | Stage 1–3 (NA), MSI no info | Hs OperonV2 |
| GSE14333 | JS09b | VL10, JS09b | Recurrence | 3 years | 227 (116+111) | Stage 1–4, MSI no info | Affymetrix |
| GSE13294 | – | – | Recurrence | 3 years | 146 (110+36) | Stage 1–4, MSI | Affymetrix |
| GSE28722 | – | – | Metastasis | 3 years | 86 (51+35) | Stage 1–4, MSI no info | Rosetta 23K |
| GSE18088 | – | – | Recurrence | 5 years | 53 (40+13) | Stage 2, MSI | Affymetrix |
Dataset: GEO or ArrayExpress dataset identifier; Trained signatures: signatures that used the dataset as training sample, if any; Validation signatures: signatures that used the dataset as independent validation sample; Outcome: type of relapse used for the dataset; Minimum follow up: minimum follow-up required for the dataset, when this information was available; Number of samples: number of samples in the dataset, with good and poor prognosis counts shown separately in brackets; Clinical info: range of stages and microsatellite status of the samples, when this information was available; Platform: hybridization platform of the dataset. * NA: the authors do not provide clinical information about MSI and/or stage. No info: although the authors provide clinical information in the paper, samples are not labelled with this information in GEO or ArrayExpress. a. Stage II and III samples from data sets GSE17536 and GSE17537 were jointly used to derive signature VL10, but the latter did not include enough events in these stage subgroups. b. Signature JS09 was built with Dukes' A and D samples and validated with Dukes' B and C samples.
Clinical characteristics of datasets.
| GSE5206 | GSE17537 | GSE17536 | GSE2630 | E-MEXP-1245 | GSE12945 | GSE10402 | GSE14333 | GSE13294 | GSE28722 | GSE18088 | Total | ||
|
| 62 (62.0%) | 27 (57.4%) | 68 (48.2%) | 10 (62.5%) | 102 (68.5%) | 42 (76.4%) | 63 (86.3%) | 116 (51.1%) | 110 (75.3%) | 51 (59.3%) | 40 (75.5%) | 691 (63.2%) | |
|
|
| 38 (38.0%) | 20 (42.6%) | 73 (51.8%) | 6 (37.5%) | 47 (31.5%) | 13 (23.6%) | 10 (13.7%) | 111 (48.9%) | 36 (24.7%) | 35 (40.7%) | 13 (24.5%) | 402 (36.8%) |
|
| 100 (100.0%) | 47 (100.0%) | 141 (100.0%) | 16 (100.0%) | 149 (100.0%) | 55 (100.0%) | 73 (100.0%) | 227 (100.0%) | 146 (100.0%) | 86 (100.0%) | 53 (100.0%) | 1093 (100.0%) | |
|
| 46 (46.0%) | 21 (44.7%) | 79 (56.0%) | 11 (68.8%) | 70 (47.0%) | 30 (54.5%) | 133 (58.6%) | 71 (50.4%) | 26 (49.1%) | 487 (52.4%) | |||
|
|
| 54 (54.0%) | 26 (55.3%) | 62 (44.0%) | 5 (31.2%) | 79 (53.0%) | 25 (45.5%) | 94 (41.4%) | 70 (49.6%) | 27 (50.9%) | 442 (47.6%) | ||
|
| 100 (100.0%) | 47 (100.0%) | 141 (100.0%) | 16 (100.0%) | 149 (100.0%) | 55 (100.0%) | 227 (100.0%) | 141 (100.0%) | 53 (100.0%) | 929 (100.0%) | |||
|
| 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 5 (3.4%) | 0 (0.0%) | 164 (15.0%) | |||
|
|
| 64 (14.2) | 61 (13.5) | 65 (13.0) | 64 (11.2) | 64 (11.6) | 67 (12.8) | 65 (12.5) | 63 (12.5) | 65 (12.2) | 65 (12.7) | ||
|
| 15 (15.5%) | 4 (8.5%) | 18 (12.8%) | 6 (37.5%) | 11 (20.0%) | 31 (13.7%) | 5 (3.4%) | 13 (15.3%) | 0 (0.0%) | 103 (11.9%) | |||
|
| 29 (29.9%) | 9 (19.1%) | 38 (27.0%) | 10 (62.5%) | 23 (41.8%) | 64 (28.2%) | 123 (84.2%) | 44 (51.8%) | 53 (100.0%) | 393 (45.3%) | |||
|
|
| 33 (34.0%) | 17 (36.2%) | 46 (32.6%) | 0 (0.0%) | 16 (29.1%) | 71 (31.3%) | 10 (6.8%) | 23 (27.1%) | 0 (0.0%) | 216 (24.9%) | ||
|
| 20 (20.6%) | 17 (36.2%) | 39 (27.7%) | 0 (0.0%) | 5 (9.1%) | 61 (26.9%) | 8 (5.5%) | 5 (5.9%) | 0 (0.0%) | 155 (17.9%) | |||
|
| 97 (100.0%) | 47 (100.0%) | 141 (100.0%) | 16 (100.0%) | 55 (100.0%) | 227 (100.0%) | 146 (100.0%) | 85 (100.0%) | 53 (100.0%) | 867 (100.0%) | |||
|
| 3 (3.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 1 (1.2%) | 0 (0.0%) | 226 (20.7%) | |||
|
| 75 (75.0%) | 16 (100.0%) | 149 (100.0%) | 26 (47.3%) | 73 (100.0%) | 193 (85.4%) | 121 (82.9%) | 72 (83.7%) | 53 (100.0%) | 778 (86.1%) | |||
|
|
| 25 (25.0%) | 0 (0.0%) | 0 (0.0%) | 29 (52.7%) | 0 (0.0%) | 33 (14.6%) | 25 (17.1%) | 14 (16.3%) | 0 (0.0%) | 126 (13.9%) | ||
|
| 100 (100.0%) | 16 (100.0%) | 149 (100.0%) | 55 (100.0%) | 73 (100.0%) | 226 (100.0%) | 146 (100.0%) | 86 (100.0%) | 53 (100.0%) | 904 (100.0%) | |||
|
| 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 1 (0.4%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 189 (17.3%) | |||
|
| 73 (50.0%) | 34 (64.2%) | 107 (53.8%) | ||||||||||
|
|
| 73 (50.0%) | 19 (35.8%) | 92 (46.2%) | |||||||||
|
|
| 146 (100.0%) | 53 (100.0%) | 199 (100.0%) | |||||||||
|
| 0 (0.0%) | 0 (0.0%) | 894 (81.8%) | ||||||||||
|
| 8 (8.3%) | 1 (3.1%) | 12 (8.5%) | 0 (0.0%) | 2 (3.8%) | 23 (7.1%) | |||||||
|
| 78 (81.2%) | 24 (75.0%) | 105 (74.5%) | 28 (50.9%) | 35 (66.0%) | 242 (75.2%) | |||||||
|
|
| 10 (10.4%) | 7 (21.9%) | 24 (17.0%) | 27 (49.1%) | 16 (30.2%) | 57 (17.7%) | ||||||
|
| 96 (100.0%) | 32 (100.0%) | 141 (100.0%) | 55 (100.0%) | 53 (100.0%) | 322 (100.0%) | |||||||
|
| 4 (4.0%) | 15 (31.9%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 771 (70.5%) |
Figure 2. Heatmap showing Matthews Correlation Coefficient (MCC) values for each signature in each dataset, as obtained from analyses with Random Forest.
Rows correspond to signatures and columns to datasets. The last column shows a pooled MCC across datasets using sample size as weights. Black lines delimit the first five signatures, for which training datasets were available (cells highlighted in black). Cells representing signatures and the datasets used to validate them are highlighted in blue. The color scale represents the MCC values: the darker the color, the higher the MCC (see the legend). Negative values were collapsed to zero.
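The MCC and the sample-size-weighted pooling mentioned in the Figure 2 caption can be illustrated with a minimal Python sketch. The function names and the example numbers are hypothetical, and the choice to collapse negative values to zero before pooling (rather than only for display) is an assumption, since the caption does not state the order of these steps.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # MCC is undefined when a row or column of the matrix is empty;
        # returning 0 is a common convention (assumption, not from the source).
        return 0.0
    return (tp * tn - fp * fn) / denom


def pooled_mcc(values, sizes, collapse_negative=True):
    """Weighted mean of per-dataset MCC values, using sample sizes as weights."""
    if collapse_negative:
        # Assumption: negative values are set to zero before pooling.
        values = [max(v, 0.0) for v in values]
    return sum(v * n for v, n in zip(values, sizes)) / sum(sizes)


# Hypothetical per-dataset MCC values and sample sizes, for illustration only.
example = pooled_mcc([0.5, -0.2], [100, 100])
```

With these made-up inputs the negative value contributes zero, so the pooled value is the weighted mean of 0.5 and 0.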
Global performance of top 10 signatures for all, stage II and stage III samples.
| All samples: Signature | All samples: MCC | All samples: Accuracy (Sensitivity, Specificity) | Stage II: Signature | Stage II: MCC | Stage II: Accuracy (Sensitivity, Specificity) | Stage II: Positive Post-Test Probability | Stage II: Negative Post-Test Probability | Stage III: Signature | Stage III: MCC | Stage III: Accuracy (Sensitivity, Specificity) | Stage III: Positive Post-Test Probability | Stage III: Negative Post-Test Probability |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | 63% (65%, 61%) | | | 58% (69%, 55%) | 28% | 12% | | | 71% (69%, 72%) | 56% | 18% |
| | | 61% (65%, 60%) | | | 59% (66%, 57%) | 28% | 13% | | | 70% (68%, 72%) | 55% | 19% |
| | | 60% (63%, 59%) | | | 58% (68%, 56%) | 28% | 13% | | | 69% (70%, 67%) | 53% | 18% |
| | | 59% (61%, 58%) | | | 57% (68%, 54%) | 27% | 13% | | | 69% (70%, 67%) | 52% | 19% |
| | | 59% (63%, 57%) | | | 58% (66%, 55%) | 27% | 13% | | | 66% (69%, 63%) | 49% | 20% |
| | | 58% (61%, 56%) | | | 59% (64%, 58%) | 27% | 14% | | | 66% (65%, 66%) | 50% | 21% |
| | | 59% (61%, 57%) | | | 59% (62%, 58%) | 27% | 14% | | | 65% (68%, 63%) | 49% | 21% |
| | | 58% (60%, 57%) | | | 57% (65%, 54%) | 26% | 14% | | | 64% (67%, 62%) | 47% | 22% |
| | | 58% (60%, 56%) | | | 56% (66%, 52%) | 26% | 14% | | | 64% (64%, 64%) | 48% | 22% |
| | | 57% (62%, 55%) | | | 53% (69%, 49%) | 25% | 14% | | | 64% (68%, 60%) | 47% | 22% |
Abbreviations: MCC: Matthews Correlation Coefficient.
Figure 3. Differences between positive and negative post-test probabilities of recurrence and their 95% confidence intervals for stage II (A) and stage III (B).
The prevalence of recurrence for stage II and stage III was assumed to be 20% and 34%, respectively. Signatures are listed in decreasing order of post-test probability difference.
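The relation between sensitivity, specificity, the assumed prevalence and the post-test probabilities can be sketched with Bayes' rule. This is an illustrative Python sketch rather than the authors' code: the function name is hypothetical, and the inputs are the rounded percentages reported in the table of top 10 signatures, so agreement with the published values is only to the nearest percentage point.

```python
def post_test_probabilities(sensitivity, specificity, prevalence):
    """Return (positive, negative) post-test probabilities of recurrence.

    positive: probability of recurrence given a poor-prognosis prediction.
    negative: probability of recurrence given a good-prognosis prediction.
    """
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    positive = true_pos / (true_pos + false_pos)
    negative = false_neg / (false_neg + true_neg)
    return positive, negative


# First stage II row: sensitivity 69%, specificity 55%, prevalence 20%.
pos_ii, neg_ii = post_test_probabilities(0.69, 0.55, 0.20)
print(round(pos_ii * 100), round(neg_ii * 100))  # 28 12

# First stage III row: sensitivity 69%, specificity 72%, prevalence 34%.
pos_iii, neg_iii = post_test_probabilities(0.69, 0.72, 0.34)
print(round(pos_iii * 100), round(neg_iii * 100))  # 56 18
```

Both results match the reported 28%/12% and 56%/18%. In stage II, for example, a poor-prognosis call raises the estimated recurrence risk from the assumed 20% to about 28%, and a good-prognosis call lowers it to about 12%.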