Jack X Yu, Anieta M Sieuwerts, Yi Zhang, John W M Martens, Marcel Smid, Jan G M Klijn, Yixin Wang, John A Foekens.
Abstract
BACKGROUND: Published prognostic gene signatures in breast cancer have few genes in common. Here we provide a rationale for this observation by studying the prognostic power and the underlying biological pathways of different gene signatures.
Year: 2007 PMID: 17894856 PMCID: PMC2077336 DOI: 10.1186/1471-2407-7-182
Source DB: PubMed Journal: BMC Cancer ISSN: 1471-2407 Impact factor: 4.430
Figure 1. Workflow of data analysis for deriving core genes and over-represented pathways.
Figure 2. Evaluation of the 500 gene signatures. Each of the 100-gene signatures derived from 80 randomly selected tumors in the training set was used to predict relapsed patients in the corresponding test set. Its performance was measured by the AUC of the ROC analysis. (A) Performance of the gene signatures for ER-positive patients in test sets. (B) Performance of the gene signatures for ER-negative patients in test sets. (Left) Frequency of AUC values in the 500 prognostic signatures derived following the flow chart presented in Figure 1. (Right) Frequency of AUC values in 500 random gene lists. To generate a gene list as a control, the survival data for the ER-positive or ER-negative patients were permuted randomly and reassigned to the chip data.
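The evaluation described in this caption can be illustrated with a minimal sketch. It is not the authors' code: `auc` computes the area under the ROC curve through the rank (Mann-Whitney) formulation, and `permuted_labels` shuffles the outcome labels as a stand-in for the random reassignment of survival data to chip data. The function names, scores, and relapse labels are hypothetical toy values.

```python
import random

def auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney formulation.

    scores: higher means a higher predicted risk of relapse.
    labels: 1 = relapsed, 0 = relapse-free.
    A tie between a relapsed and a relapse-free patient counts as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one relapsed and one relapse-free patient")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def permuted_labels(labels, rng):
    """Return a shuffled copy of the labels (the random control)."""
    shuffled = list(labels)
    rng.shuffle(shuffled)
    return shuffled

# Toy data (hypothetical): signature scores and observed relapse status.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 0, 1, 0]

observed_auc = auc(scores, labels)  # 0.75 for this toy data

# Control distribution: AUC after randomly reassigning the outcomes.
rng = random.Random(0)
control_aucs = [auc(scores, permuted_labels(labels, rng)) for _ in range(500)]
mean_control_auc = sum(control_aucs) / len(control_aucs)
```

With informative scores the observed AUC lies above the bulk of the control values, which scatter around 0.5.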
Genes with highest frequencies in 500 signatures
| Gene title | Gene symbol | Frequency |
| Top 20 core genes from ER-positive tumors | ||
| KIAA0241 protein | KIAA0241 | 321 |
| CD44 antigen (homing function and Indian blood group system) | CD44 | 286 |
| ATP-binding cassette, sub-family C (CFTR/MRP), member 5 | ABCC5 | 251 |
| serine/threonine kinase 6 | STK6 | 245 |
| cytochrome c, somatic | CYCS | 235 |
| KIAA0406 gene product | KIAA0406 | 212 |
| uridine-cytidine kinase 1-like 1 | UCKL1 | 201 |
| zinc finger, CCHC domain containing 8 | ZCCHC8 | 188 |
| Rac GTPase activating protein 1 | RACGAP1 | 186 |
| staufen, RNA binding protein (Drosophila) | STAU | 176 |
| lactamase, beta 2 | LACTB2 | 175 |
| eukaryotic translation elongation factor 1 alpha 2 | EEF1A2 | 172 |
| RAE1 RNA export 1 homolog (S. pombe) | RAE1 | 153 |
| tuftelin 1 | TUFT1 | 150 |
| zinc finger protein 36, C3H type-like 2 | ZFP36L2 | 150 |
| origin recognition complex, subunit 6 homolog-like (yeast) | ORC6L | 143 |
| zinc finger protein 623 | ZNF623 | 140 |
| extra spindle poles like 1 | ESPL1 | 139 |
| transcription elongation factor B (SIII), polypeptide 1 | TCEB1 | 138 |
| ribosomal protein S6 kinase, 70 kDa, polypeptide 1 | RPS6KB1 | 127 |
| Top 20 core genes from ER-negative tumors | ||
| zinc finger protein, multitype 2 | ZFPM2 | 445 |
| ribosomal protein L26-like 1 | RPL26L1 | 372 |
| hypothetical protein FLJ14346 | FLJ14346 | 372 |
| mitogen-activated protein kinase-activated protein kinase 2 | MAPKAPK2 | 347 |
| collagen, type II, alpha 1 | COL2A1 | 340 |
| muscleblind-like 2 (Drosophila) | MBNL2 | 320 |
| G protein-coupled receptor 124 | GPR124 | 314 |
| splicing factor, arginine/serine-rich 11 | SFRS11 | 300 |
| heterogeneous nuclear ribonucleoprotein A1 | HNRPA1 | 297 |
| CDC42 binding protein kinase alpha (DMPK-like) | CDC42BPA | 296 |
| regulator of G-protein signalling 4 | RGS4 | 276 |
| transient receptor potential cation channel, subfamily C, member 1 | TRPC1 | 265 |
| transcription factor 8 (represses interleukin 2 expression) | TCF8 | 263 |
| chromosome 6 open reading frame 210 | C6orf210 | 262 |
| dynamin 3 | DNM3 | 260 |
| centrosome protein Cep63 | Cep63 | 251 |
| tumor necrosis factor (ligand) superfamily, member 13 | TNFSF13 | 251 |
| dapper, antagonist of beta-catenin, homolog 1 (Xenopus laevis) | DACT1 | 248 |
| heterogeneous nuclear ribonucleoprotein A1 | HNRPA1 | 245 |
| reversion-inducing-cysteine-rich protein with kazal motifs | RECK | 243 |
The top 20 genes are ranked by their frequency in the 500 signatures of 100 genes for ER-positive and ER-negative tumors (for details see Figure 1).
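Ranking genes by how often they occur across resampled signatures amounts to a simple count. The sketch below is an illustration under stated assumptions, not the authors' pipeline: the helper names are hypothetical, and the four toy signatures borrow gene symbols from the table above but their memberships are made up.

```python
from collections import Counter

def core_gene_frequencies(signatures):
    """Count in how many signatures each gene occurs (at most once per signature)."""
    counts = Counter()
    for signature in signatures:
        counts.update(set(signature))
    return counts

def top_genes(signatures, n):
    """Return the n most frequent genes as (gene, frequency) pairs."""
    counts = core_gene_frequencies(signatures)
    # Sort by descending frequency, then alphabetically so ties are deterministic.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]

# Toy signatures (hypothetical memberships).
toy_signatures = [
    ["CD44", "ABCC5", "CYCS"],
    ["CD44", "CYCS", "STK6"],
    ["CD44", "RACGAP1", "ABCC5"],
    ["CYCS", "CD44", "UCKL1"],
]

ranking = top_genes(toy_signatures, 3)
# ranking == [("CD44", 4), ("CYCS", 3), ("ABCC5", 2)]
```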
Top 20 pathways in the 500 signatures of ER-positive and ER-negative tumors evaluated by Global Test
| Pathways | GO_ID | P | Frequency |
| ER-positive tumors | |||
| Apoptosis | 6915 | 3.06E-7 | 250 |
| Regulation of cell cycle | 74 | 2.46E-5 | 203 |
| Protein amino acid phosphorylation | 6468 | 2.48E-5 | 114 |
| Cytokinesis | 910 | 6.13E-5 | 165 |
| Cell motility | 6928 | 0.00015 | 93 |
| Cell cycle | 7049 | 0.00028 | 138 |
| Cell surface receptor-linked signal transd. | 7166 | 0.00033 | 172 |
| Mitosis | 7067 | 0.00036 | 256 |
| Intracellular protein transport | 6886 | 0.00054 | 141 |
| Mitotic chromosome segregation | 70 | 0.00057 | 98 |
| Ubiquitin-dependent protein catabolism | 6511 | 0.00074 | 158 |
| DNA repair | 6281 | 0.00079 | 156 |
| Induction of apoptosis | 6917 | 0.00083 | 115 |
| Immune response | 6955 | 0.00094 | 167 |
| Protein biosynthesis | 6412 | 0.0010 | 145 |
| DNA replication | 6260 | 0.0015 | 92 |
| Oncogenesis | 7048 | 0.0020 | 228 |
| Metabolism | 8152 | 0.0021 | 83 |
| Cellular defense response | 6968 | 0.0025 | 131 |
| Chemotaxis | 6935 | 0.0027 | 89 |
| ER-negative tumors | |||
| Regulation of cell growth | 1558 | 0.00012 | 136 |
| Regul. of G-coupled receptor signaling | 8277 | 0.00013 | 153 |
| Skeletal development | 1501 | 0.00024 | 160 |
| Protein amino acid phosphorylation | 6468 | 0.0051 | 151 |
| Cell adhesion | 7155 | 0.0065 | 110 |
| Carbohydrate metabolism | 5975 | 0.0066 | 86 |
| Nuclear mRNA splicing, via spliceosome | 398 | 0.0067 | 203 |
| Signal transduction | 7165 | 0.0078 | 160 |
| Cation transport | 6812 | 0.0098 | 160 |
| Calcium ion transport | 6816 | 0.010 | 93 |
| Protein modification | 6464 | 0.011 | 132 |
| Intracellular signaling cascade | 7242 | 0.012 | 135 |
| mRNA processing | 6397 | 0.012 | 81 |
| RNA splicing | 8380 | 0.014 | 192 |
| Endocytosis | 6897 | 0.026 | 166 |
| Regul. of transcription from PolII promoter | 6357 | 0.031 | 109 |
| Regulation of cell cycle | 74 | 0.043 | 88 |
| Protein complex assembly | 6461 | 0.048 | 183 |
| Protein biosynthesis | 6412 | 0.063 | 99 |
| Cell cycle | 7049 | 0.084 | 72 |
Each of the top 20 over-represented pathways with the highest frequencies in the 500 signatures of ER-positive and ER-negative tumors was subjected to the Global Test program [12, 14]. The Global Test examines the association of a group of genes as a whole with a specific clinical parameter, in this case DMFS, and generates an asymptotic theory p value for the pathway. The pathways are ranked by their p value in the respective ER-subgroup of tumors.
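The idea of testing a gene set as a whole can be illustrated with a deliberately simplified sketch. This is not the Global Test program cited above: it assumes a continuous toy outcome instead of censored DMFS, a plain average of squared gene-by-outcome scores as the statistic, and a permutation p value instead of the asymptotic one. All function names and data are hypothetical.

```python
import random

def pathway_statistic(expression, outcome):
    """Average squared score between each gene and the centered outcome.

    expression: list of genes, each a list of per-sample expression values.
    outcome: per-sample clinical values, in the same sample order.
    """
    mean_y = sum(outcome) / len(outcome)
    residuals = [y - mean_y for y in outcome]
    total = 0.0
    for gene in expression:
        score = sum(x * r for x, r in zip(gene, residuals))
        total += score * score
    return total / len(expression)

def permutation_p_value(expression, outcome, n_perm=999, seed=0):
    """Share of outcome permutations whose statistic reaches the observed one."""
    rng = random.Random(seed)
    observed = pathway_statistic(expression, outcome)
    shuffled = list(outcome)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pathway_statistic(expression, shuffled) >= observed:
            hits += 1
    # Add one to numerator and denominator so the p value is never zero.
    return (hits + 1) / (n_perm + 1)

# Toy pathway (hypothetical): one gene rising and one falling with the outcome.
outcome = [float(i) for i in range(1, 11)]
pathway = [list(outcome), list(reversed(outcome))]

statistic = pathway_statistic(pathway, outcome)
p_value = permutation_p_value(pathway, outcome)
```

Squaring the per-gene scores lets genes with opposite directions of association both contribute, which is why a set can be significant as a whole even when it mixes positively and negatively associated genes.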
Figure 3. Association of the expression of individual genes with DMFS time for selected over-represented pathways. The Geneplot function in the Global Test program [12, 14] was applied, and the contribution of the individual genes in each selected pathway is plotted. The numbers on the X-axis represent the number of genes in the respective pathway in ER-positive (Left) or ER-negative tumors (Right). The values on the Y-axis represent the contribution (influence) of each individual gene in the selected pathway to the association with DMFS. Negative values indicate that there is no association between the gene expression and DMFS. Each horizontal marker in a bar indicates one standard deviation away from the reference point; two or more horizontal markers in a bar indicate that the association of the corresponding gene with DMFS is statistically significant. The green bars reflect genes that are positively associated with DMFS, indicating a higher expression in tumors without metastatic capability. The red bars reflect genes that are negatively associated with DMFS, indicating a higher expression in tumors with metastatic capability. (A) ER-positive tumors, from top to bottom: "apoptosis" pathway consisting of 282 genes, "regulation of cell cycle" pathway consisting of 228 genes, "immune response" pathway consisting of 379 genes, and "mitosis" pathway consisting of 100 genes. (B) ER-negative tumors, from top to bottom: "regulation of cell growth" pathway consisting of 58 genes, "cell adhesion" pathway consisting of 327 genes, "regulation of G-coupled receptor signaling" pathway consisting of 20 genes, and "skeletal development" pathway consisting of 105 genes.
Significant genes in the Apoptosis pathway in ER-positive tumors
| Probe Set | z-score | DMFS | Gene Symbol | Gene Title |
| 208905_at | 4.29 | - | CYCS | cytochrome c, somatic |
| 204817_at | 3.73 | - | ESPL1 | extra spindle poles like 1 |
| 38158_at | 3.41 | - | ESPL1 | extra spindle poles like 1 |
| 204947_at | 3.04 | - | E2F1 | E2F transcription factor 1 |
| 201111_at | 3.04 | - | CSE1L | CSE1 chromosome segregation 1-like |
| 201636_at | 2.97 | - | FXR1 | fragile X mental retardation, autosomal homolog 1 |
| 220048_at | 2.82 | - | EDAR | ectodysplasin A receptor |
| 210766_s_at | 2.75 | - | CSE1L | CSE1 chromosome segregation 1-like |
| 221567_at | 2.66 | - | NOL3 | nucleolar protein 3 (apoptosis repressor with CARD domain) |
| 213829_x_at | 2.65 | - | TNFRSF6B | tumor necrosis factor receptor superfamily, member 6b, decoy |
| 201112_s_at | 2.57 | - | CSE1L | CSE1 chromosome segregation 1-like |
| 212353_at | 2.51 | - | SULF1 | sulfatase 1 |
| 208822_s_at | 2.47 | - | DAP3 | death associated protein 3 |
| 209462_at | 2.37 | - | APLP1 | amyloid beta (A4) precursor-like protein 1 |
| 203005_at | 2.29 | - | LTBR | lymphotoxin beta receptor (TNFR superfamily, member 3) |
| 202731_at | 4.01 | + | PDCD4 | programmed cell death 4 |
| 206150_at | 3.57 | + | TNFRSF7 | tumor necrosis factor receptor superfamily, member 7 |
| 202730_s_at | 3.18 | + | PDCD4 | programmed cell death 4 |
| 209539_at | 3.14 | + | ARHGEF6 | Rac/Cdc42 guanine nucleotide exchange factor (GEF) 6 |
| 212593_s_at | 3.07 | + | PDCD4 | programmed cell death 4 |
| 204933_s_at | 2.96 | + | TNFRSF11B | tumor necrosis factor receptor superfamily, member 11b |
| 209831_x_at | 2.43 | + | DNASE2 | deoxyribonuclease II, lysosomal |
| 203187_at | 2.38 | + | DOCK1 | dedicator of cytokinesis 1 |
| 210164_at | 2.34 | + | GZMB | granzyme B |
Genes were sorted based on their "z-score" (significance), reflecting their association with distant metastasis-free survival ("DMFS") time.
Significant genes in the Regulation of cell cycle pathway in ER-positive tumors
| Probe Set | z-score | DMFS | Gene Symbol | Gene Title |
| 204817_at | 3.73 | - | ESPL1 | extra spindle poles like 1 (S. cerevisiae) |
| 38158_at | 3.41 | - | ESPL1 | extra spindle poles like 1 (S. cerevisiae) |
| 214710_s_at | 3.10 | - | CCNB1 | cyclin B1 |
| 212426_s_at | 3.08 | - | YWHAQ | tyrosine 3-/tryptophan 5-monooxygenase activation protein |
| 204009_s_at | 3.08 | - | KRAS | v-Ki-ras2 Kirsten rat sarcoma viral oncogene homolog |
| 204947_at | 3.04 | - | E2F1 | E2F transcription factor 1 |
| 201947_s_at | 3.04 | - | CCT2 | chaperonin containing TCP1, subunit 2 (beta) |
| 204822_at | 2.91 | - | TTK | TTK protein kinase |
| 209096_at | 2.57 | - | UBE2V2 | ubiquitin-conjugating enzyme E2 variant 2 |
| 204826_at | 2.53 | - | CCNF | cyclin F |
| 212022_s_at | 2.46 | - | MKI67 | antigen identified by monoclonal antibody Ki-67 |
| 202647_s_at | 2.42 | - | NRAS | neuroblastoma RAS viral (v-ras) oncogene homolog |
| 201076_at | 3.09 | + | NHP2L1 | NHP2 non-histone chromosome protein 2-like 1 (S. cerevisiae) |
| 201601_x_at | 3.00 | + | IFITM1 | interferon induced transmembrane protein 1 (9–27) |
| 204015_s_at | 2.90 | + | DUSP4 | dual specificity phosphatase 4 |
| 220407_s_at | 2.68 | + | TGFB2 | transforming growth factor, beta 2 |
| 206404_at | 2.38 | + | FGF9 | fibroblast growth factor 9 (glia-activating factor) |
Genes were sorted based on their "z-score" (significance), reflecting their association with distant metastasis-free survival ("DMFS") time.
Significant genes in the Regulation of cell growth pathway in ER-negative tumors
| Probe Set | z-score | DMFS | Gene Symbol | Gene Title |
| 209648_x_at | 4.01 | - | SOCS5 | suppressor of cytokine signaling 5 |
| 208127_s_at | 3.75 | - | SOCS5 | suppressor of cytokine signaling 5 |
| 209550_at | 3.18 | - | NDN | necdin homolog (mouse) |
| 201162_at | 3.14 | - | IGFBP7 | insulin-like growth factor binding protein 7 |
| 213910_at | 2.87 | - | IGFBP7 | insulin-like growth factor binding protein 7 |
| 212279_at | 2.91 | + | MAC30 | hypothetical protein MAC30 |
| 213337_s_at | 2.88 | + | SOCS1 | suppressor of cytokine signaling 1 |
Genes were sorted based on their "z-score" (significance), reflecting their association with distant metastasis-free survival ("DMFS") time.
Significant genes in the Regulation of G-protein coupled receptor signaling pathway in ER-negative tumors
| Probe Set | z-score | DMFS | Gene Symbol | Gene Title |
| 204337_at | 3.99 | - | RGS4 | regulator of G-protein signalling 4 |
| 209324_s_at | 3.73 | - | RGS16 | regulator of G-protein signalling 16 |
| 220300_at | 2.61 | - | RGS3 | regulator of G-protein signalling 3 |
| 202388_at | 2.61 | - | RGS2 | regulator of G-protein signalling 2, 24 kDa |
| 204396_s_at | 2.34 | - | GRK5 | G protein-coupled receptor kinase 5 |
Genes were sorted based on their "z-score" (significance), reflecting their association with distant metastasis-free survival ("DMFS") time.
Figure 4. Validation of pathway-based breast cancer classifiers constructed from the optimal significant genes. To find the optimal number of genes for a signature, ROC analyses with 5-year DMFS as the defining point were conducted with an increasing number of genes in the training set of ER-positive or ER-negative tumors. For ER-positive tumors, 24 genes (reaching an AUC of 0.784) were considered optimal for the "apoptosis" pathway (Table 3), and 17 genes (AUC of 0.777) were considered optimal for the "regulation of cell cycle" pathway (Table 4). For ER-negative tumors, the optimal number of genes was 7 (AUC of 0.790) for the "regulation of cell growth" pathway (Table 5) and 5 (AUC of 0.788) for the "regulation of G-protein coupled receptor signaling" pathway (Table 6). The selected genes for the top 2 pathways for ER-positive and ER-negative tumors were subsequently used to construct prognostic gene signatures separately for the 2 ER-subgroups of tumors. The 152-patient test set [23] consisted of 125 ER-positive tumors and 27 ER-negative tumors based on the expression level of the ER gene on the chip. (A) ROC (Left) and Kaplan-Meier (Right) analysis of the 38-gene signature for ER-positive tumors. Thirteen patients with less than 5-year follow-up were excluded from ROC analysis. (B) ROC (Left) and Kaplan-Meier (Right) analysis of the 12-gene signature for ER-negative tumors. One patient with less than 5-year follow-up was excluded from ROC analysis. (C) ROC (Left) and Kaplan-Meier (Right) analysis of a combined 50-gene signature for ER-positive and ER-negative tumors. Fourteen patients with less than 5-year follow-up were excluded from ROC analysis.
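Using 5-year DMFS as the defining point turns censored follow-up data into a binary endpoint, and patients whose follow-up ends earlier cannot be classified. The sketch below shows one plausible reading of that rule. It is an assumption, not the authors' stated procedure: a patient counts as a case if distant metastasis occurred within 5 years, as a control if metastasis-free at 5 years, and is excluded if censored before 5 years. The function name and patient records are hypothetical.

```python
from typing import List, Optional, Tuple

def five_year_label(time_years: float, event: bool, horizon: float = 5.0) -> Optional[int]:
    """Binary endpoint at a fixed horizon.

    Returns 1 if distant metastasis occurred within the horizon,
    0 if the patient was metastasis-free at the horizon,
    None if follow-up ended before the horizon without an event (excluded).
    """
    if event and time_years <= horizon:
        return 1
    if time_years >= horizon:
        return 0
    return None

# Toy follow-up records (hypothetical): (time in years, metastasis observed).
patients: List[Tuple[float, bool]] = [
    (2.1, True),   # metastasis at 2.1 years: case
    (7.5, False),  # metastasis-free for 7.5 years: control
    (3.0, False),  # censored at 3 years: excluded
    (6.2, True),   # metastasis after the horizon: control at 5 years
    (5.0, False),  # exactly 5 years of follow-up: control
]

labels = [five_year_label(t, e) for t, e in patients]
n_excluded = sum(1 for label in labels if label is None)
# labels == [1, 0, None, 0, 0]; n_excluded == 1
```

Only the patients with a non-`None` label would enter an ROC analysis, while a Kaplan-Meier analysis can still use all records because it handles censoring directly.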
Number of common genes between different gene signatures for breast cancer prognosis
| | Wang's 76 genes | van 't Veer's 70 genes | Paik's 16 genes | Yu's 62 genes |
| Wang's 76 genes (a) | | CCNE2 | No genes | No genes |
| van 't Veer's 70 genes (b) | CCNE2 | | SCUBE2 | AA962149 |
| Paik's 16 genes (c) | No genes | SCUBE2 | | BIRC5 |
| Yu's 62 genes (a) | No genes | AA962149 | BIRC5 | |
| Sotiriou's 97 genes (a) | PLK1, FEN1, CCNE2, GTSE1, KPNA2, MLF1IP, POLQ | MELK, CENPA, CCNE2, GMPS, DC13, PRC1, NUSAP1, KNTC2 | MYBL2, BIRC5, STK6, MKI67, CCNB1 | URCC6, FOXM1, DLG7, DKFZp686L20222, DC13, FLJ32241, HSP1CDC21, CDC2, KIF11, EXO1 |
(a) Affymetrix HG-U133A GeneChip
(b) Agilent Hu25K microarray
(c) No genome-wide assessment; RT-PCR
To compare genes from various prognostic signatures for breast cancer, five gene signatures were selected: the 76-gene signature [8], the 70-gene signature [3], the 16-gene signature [25], the 62-gene signature [26], and the 97-gene signature [23]. When gene signatures were derived from different platforms, the identity of genes was determined with the BLAST program. Except for the 97-gene expression grade index [23], which showed an overlap of 5 to 16 genes with the other gene signatures, a maximum overlap of only 1 identical gene was found between the other gene signatures. The initially reported 3-gene overlap between the 76-gene and the 70-gene prognostic signatures [8] included genes with high similarity in sequence. In this study, only genes with an identical sequence in two signatures, based on results from the BLAST program, are considered to overlap. Therefore, CCNE2 is the only common gene between the two signatures.
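Once gene identities have been reconciled across platforms, the pairwise comparison reduces to set intersections. The sketch below is a minimal illustration that assumes identifiers are already harmonized. It does not perform the BLAST-based sequence matching described above. The signature names and the `GENE_*` placeholders are hypothetical, and only a few symbols are borrowed from the table for readability.

```python
from itertools import combinations

def pairwise_overlap(signatures):
    """Map each unordered pair of signature names to their sorted shared genes."""
    overlaps = {}
    for (name_a, genes_a), (name_b, genes_b) in combinations(signatures.items(), 2):
        overlaps[(name_a, name_b)] = sorted(set(genes_a) & set(genes_b))
    return overlaps

# Toy signatures (hypothetical, heavily truncated gene lists).
toy_signatures = {
    "A": ["CCNE2", "GENE_A1", "GENE_A2"],
    "B": ["CCNE2", "SCUBE2", "GENE_B1"],
    "C": ["SCUBE2", "BIRC5", "GENE_C1"],
}

overlaps = pairwise_overlap(toy_signatures)
for (name_a, name_b), shared in overlaps.items():
    print(name_a, name_b, ", ".join(shared) if shared else "No genes")
```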
Mapping various gene signatures to core pathways
| Published gene signatures (a) | ||||||
| Pathways | GO_ID | Wang | Van 't Veer | Paik | Yu | Sotiriou |
| ER-positive tumors | ||||||
| Apoptosis | 6915 | X | X | X | X | X |
| Regulation of cell cycle | 74 | X | X | X | X | X |
| Protein amino acid phosphorylation | 6468 | X | X | X | X | X |
| Cytokinesis | 910 | X | X | X | X | |
| Cell motility | 6928 | X | X | |||
| Cell cycle | 7049 | X | X | X | X | X |
| Cell surface receptor-linked signal transd. | 7166 | X | ||||
| Mitosis | 7067 | X | X | X | X | X |
| Intracellular protein transport | 6886 | X | X | X | ||
| Mitotic chromosome segregation | 70 | X | X | X | ||
| Ubiquitin-dependent protein catabolism | 6511 | X | X | X | ||
| DNA repair | 6281 | X | X | X | X | |
| Induction of apoptosis | 6917 | X | ||||
| Immune response | 6955 | X | X | X | ||
| Protein biosynthesis | 6412 | X | X | X | ||
| DNA replication | 6260 | X | X | X | X | |
| Oncogenesis | 7048 | X | X | X | ||
| Metabolism | 8152 | X | X | |||
| Cellular defense response | 6968 | X | X | X | ||
| Chemotaxis | 6935 | X | X | |||
| ER-negative tumors | ||||||
| Regulation of cell growth | 1558 | X | ||||
| Regul. of G-coupled receptor signaling | 8277 | |||||
| Skeletal development | 1501 | X | X | |||
| Protein amino acid phosphorylation | 6468 | X | X | X | X | X |
| Cell adhesion | 7155 | X | X | X | X | |
| Carbohydrate metabolism | 5975 | X | X | |||
| Nuclear mRNA splicing, via spliceosome | 398 | |||||
| Signal transduction | 7165 | X | X | X | X | |
| Cation transport | 6812 | |||||
| Calcium ion transport | 6816 | |||||
| Protein modification | 6464 | |||||
| Intracellular signaling cascade | 7242 | X | X | X | X | |
| mRNA processing | 6397 | |||||
| RNA splicing | 8380 | |||||
| Endocytosis | 6897 | |||||
| Regul. of transcription from PolII promoter | 6357 | X | ||||
| Regulation of cell cycle | 74 | X | X | X | ||
| Protein complex assembly | 6461 | X | X | |||
| Protein biosynthesis | 6412 | X | X | |||
| Cell cycle | 7049 | X | X | X | X | X |
(a) Published gene signatures that were studied include the 76-gene signature [8], the 70-gene signature [3], the 16-gene signature [25], the 62-gene signature [26], and the 97-gene signature [23]. Individual genes in each signature were mapped to the top 20 core pathways for ER-positive and ER-negative tumors; a cross (X) indicates a match.
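The mapping summarized in this table can be sketched as a membership check between each signature and each pathway's annotated gene set. The example is illustrative only: the pathway annotations, signature names, and `GENE_*` identifiers are hypothetical placeholders, and a match is assumed to mean that at least one signature gene is annotated to the pathway.

```python
def map_signatures_to_pathways(pathways, signatures):
    """For each pathway, record whether each signature has a gene annotated to it."""
    return {
        pathway: {
            name: bool(set(genes) & set(members))
            for name, genes in signatures.items()
        }
        for pathway, members in pathways.items()
    }

# Toy annotation and signatures (hypothetical).
toy_pathways = {
    "Pathway 1": {"GENE_1", "GENE_2"},
    "Pathway 2": {"GENE_3", "GENE_4"},
}
toy_signatures = {
    "Signature A": ["GENE_1", "GENE_3"],
    "Signature B": ["GENE_5"],
}

matches = map_signatures_to_pathways(toy_pathways, toy_signatures)

# Print one row per pathway, with X marking a match.
for pathway, row in matches.items():
    cells = ["X" if row[name] else "" for name in toy_signatures]
    print("| " + " | ".join([pathway] + cells) + " |")
```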