Marla H Daves, Susan G Hilsenbeck, Ching C Lau, Tsz-Kwong Man.
Abstract
BACKGROUND: Metastasis is the number one cause of cancer deaths. Expression microarrays have been widely used to study metastasis in various types of cancer. We hypothesize that a meta-analysis of publicly available gene expression datasets in various tumor types can identify a signature of metastasis that is common to multiple tumor types. This common signature of metastasis may help us to understand the shared steps in the metastatic process and identify useful biomarkers that could predict metastatic risk.
Year: 2011 PMID: 21736749 PMCID: PMC3212952 DOI: 10.1186/1755-8794-4-56
Source DB: PubMed Journal: BMC Med Genomics ISSN: 1755-8794 Impact factor: 3.063
Figure 1. Example of the identification of $O_i$. Each of the circles represents a hypothetical dataset ($S_1$ to $S_4$). The numerals are the number of genes differentially expressed in the datasets represented by that area of overlap of the circles. The value $O_i$ is defined as the number of genes differentially expressed in exactly $i$ datasets. In the example, $O_1$ is 85 since that is the number of genes differentially expressed in exactly 1 study, whereas $O_4$ is 1 since only one gene is present in the area overlapping all 4 studies.
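The overlap counts described in the Figure 1 caption, that is, the number of genes differentially expressed in exactly i datasets, can be tallied directly from per-dataset gene lists. The following is a minimal sketch, not the authors' code; the function name `overlap_counts` and the toy gene labels are hypothetical.

```python
from collections import Counter


def overlap_counts(datasets):
    """Return {i: O_i}: the number of genes significant in exactly i datasets."""
    per_gene = Counter()
    for genes in datasets:
        # set() guards against a gene being listed twice in one dataset
        per_gene.update(set(genes))
    tally = Counter(per_gene.values())
    return {i: tally.get(i, 0) for i in range(1, len(datasets) + 1)}


# Hypothetical toy input: four datasets with made-up gene labels.
s1 = {"A", "B", "C"}
s2 = {"B", "C", "D"}
s3 = {"C", "E"}
s4 = {"C", "F"}
counts = overlap_counts([s1, s2, s3, s4])
print(counts)
```

In this toy input only gene C is shared by all four datasets, so the count for i = 4 is 1, mirroring the single gene in the central overlap of the figure.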
Figure 2. Flow diagram of the selection of datasets included in the meta-analysis. After initial screening and identification of potential datasets in the Oncomine database, the process of elimination of ineligible studies is outlined. n: number of datasets in a specific category; numbers in brackets: reference for the dataset.
Figure 3. Flow diagram of the selection of studies included for validation. The process of selection of possible validation datasets is outlined. n: number of datasets in a specific category; numbers in brackets: reference for the dataset described; GEO: Gene Expression Omnibus [49]; SMD: Stanford Microarray Database [50].
Expression microarray studies used in the meta-analysis
| # | Study [Reference] | Platform | Unique Genes Tested | Genes Sig Up | Genes Sig Down | Primary tumors | Distant Mets | GEO Accession |
|---|---|---|---|---|---|---|---|---|
| 1 | Bittner Breast [ | HG U133 Plus 2.0 | 19079 | 33 | 0 | 327 | 9 | GSE2109 |
| 2 | Bittner Colon [ | HG U133 Plus 2.0 | 19079 | 656 | 3938 | 330 | 43 | GSE2109 |
| 3 | Bittner Lung [ | HG U133 Plus 2.0 | 19079 | 127 | 15 | 101 | 8 | GSE2109 |
| 4 | Bittner Ovarian [ | HG U133 Plus 2.0 | 19079 | 494 | 131 | 166 | 75 | GSE2109 |
| 5 | Bittner Sarcoma [ | HG U133 Plus 2.0 | 19079 | 4 | 1 | 42 | 10 | GSE2109 |
| 6 | Garber Lung [ | Institutional cDNA microarray | 10723 | 9 | 57 | 61 | 6 | GSE3398 |
| 7 | Graudens Colon [ | Institutional cDNA microarray | 6242 | 145 | 80 | 18 | 30 | GSE3964 |
| 8 | Haqq Melanoma [ | Research Genetics cDNA microarray | 7344 | 420 | 639 | 6 | 19 | N/A |
| 9 | Holzbeierlein Prostate [ | HG U95A-Av2 | 7820 | 11 | 295 | 40 | 9 | N/A |
| 10 | Jain Endocrine [ | HG U95A-Av2 | 7820 | 14 | 229 | 8 | 17 | N/A |
| 11 | Lapointe Prostate [ | Institutional cDNA microarray | 10021 | 1081 | 1219 | 62 | 9 | GSE3933 |
| 12 | LaTulippe Prostate [ | HG U95A-Av2 | 7820 | 265 | 245 | 23 | 9 | N/A |
| 13 | Magee Prostate [ | HG FL | 4564 | 35 | 18 | 8 | 3 | N/A |
| 14 | O'Donnell Oral [ | HG U133A | 12427 | 1 | 28 | 22 | 5 | GSE2280 |
| 15 | Radvanyi Breast [ | Custom cDNA microarray | 16133 | 548 | 85 | 47 | 7 | GSE1477 |
| 16 | Ramaswamy Multicancer [ | HG FL, Hu35KsubA | 9064 | 556 | 301 | 10 | 4 | N/A |
| 17 | Segal Sarcoma [ | HG U95A-Av2 | 7820 | 168 | 164 | 29 | 4 | N/A |
| 18 | Vanaja Prostate [ | HG U133A, HG U133B | 17358 | 4 | 208 | 27 | 5 | N/A |
The 18 datasets used in the meta-analysis are described with regard to the platform used in the original experiment, the number of unique genes represented in the platform, the number of genes significantly (sig) dysregulated in metastases compared with primaries with a Q-value < 0.1, the number of samples that are primary tumors or distant metastases (mets), and the Gene Expression Omnibus (GEO) Accession number. HG U133 Plus 2.0: Affymetrix Human Genome U133 Plus 2.0 Array; HG U95A-Av2: Affymetrix Human Genome U95A-Av2 Array; HG FL: Affymetrix HumanGeneFL Array; HG U133A: Affymetrix Human Genome U133A Array; HG U133B: Affymetrix Human Genome U133B Array; N/A: Not applicable.
Figure 4. Average observed by permutation versus observed dysregulated genes found by the meta-analysis method. The number of genes expected to be repeated, as calculated by our permutation method, and the number of repeated genes observed in our datasets are plotted against the number of studies (x) in which they are repeated (4a shows the results for the up-regulated genes; 4c shows the results for the down-regulated genes). The actual numbers are presented in the tables below the corresponding charts (4b corresponds to the up-regulated genes, and 4d to the down-regulated genes). The observed repeated genes are greater than the expected number when significant in 2 datasets. The difference was considered significant when the FDP < 0.1.
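The comparison in Figure 4 between permutation-expected and observed repeat counts can be illustrated with a small simulation. This is a sketch under stated assumptions, not the authors' procedure: the null model (drawing the reported number of significant genes at random from each dataset's tested genes) and the reading of FDP as expected count divided by observed count are both assumptions, and all function names and toy inputs are hypothetical.

```python
import random
from collections import Counter


def exact_overlap(gene_lists):
    """Return [O_1, ..., O_k]: genes present in exactly i of the k lists."""
    per_gene = Counter()
    for genes in gene_lists:
        per_gene.update(set(genes))
    tally = Counter(per_gene.values())
    return [tally.get(i, 0) for i in range(1, len(gene_lists) + 1)]


def expected_overlap(tested, n_sig, n_perm=200, seed=0):
    """Average O_i over random draws (assumed null model).

    tested: one list of tested genes per dataset
    n_sig:  number of significant genes reported for each dataset
    """
    rng = random.Random(seed)
    totals = [0] * len(tested)
    for _ in range(n_perm):
        draws = [rng.sample(genes, n) for genes, n in zip(tested, n_sig)]
        for i, count in enumerate(exact_overlap(draws)):
            totals[i] += count
    return [t / n_perm for t in totals]


def fdp(expected, observed):
    """Assumed definition: expected / observed; None where nothing was observed."""
    return [e / o if o else None for e, o in zip(expected, observed)]


# Hypothetical toy input: three datasets that each tested the same 50 genes.
genes = [f"g{i}" for i in range(50)]
expected = expected_overlap([genes, genes, genes], n_sig=[5, 5, 5])
print(expected)
```

Under this reading, a repeat level would be called significant when the corresponding `fdp` entry falls below 0.1, matching the threshold quoted in the caption.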
The Common Metastatic Signature
| Number of studies | Up-regulated | Down-regulated | | | |
|---|---|---|---|---|---|
| 4 studies | Not significant | ACTG2 | GJA1 | NBL1 | RARRES1 |
| | | CASP7 | GNG12 | PAGE4 | SELE |
| | | CSRP1 | GSN | PAM | SLC12A4 |
| | | CYR61 | IER2 | PCP4 | SMTN |
| | | DPT | ISL1 | PDE4D | SORBS1 |
| | | DSTN | JMJD3 | PIGB | SYNPO2 |
| | | FILIP1L | JUNB | PKIG | TCF21 |
| | | FLNC | KRT15 | PLA2G2A | TMEM49 |
| | | FOSB | LUM | PLEKHC1 | TPM1 |
| | | FUCA1 | MAPK1 | PPP1R12A | TSC22D1 |
| | | GADD45B | MFAP4 | RAP1A | VCL |
| 5 studies | EZH2 | ACTA2 | DKFZP564O0823 | LMOD1 | RBPMS |
| | | BMPR1A | DMN | MCL1 | SPARCL1 |
| | | CAMK2G | FBLN1 | MGP | SPG20 |
| | | CCND2 | FHL1 | NR4A1 | TACC1 |
| | | CNN1 | FXYD3 | NR4A3 | TAGLN |
| | | CTGF | HBEGF | PPP1R12B | ZFP36 |
| | | DIO2 | KCNMA1 | PYROXD1 | |
| 6 studies | None | BTG2 | KCNMB1 | MYLK | |
| | | JUND | MYH11 | SOD3 | |
| 8 studies | None | TPM2 | | | |
The gene symbols of the genes in the metastatic signature are given along with the number of studies in which each gene was significant.
Figure 5. Number of genes in the common metastatic signature significant (Q-value < 0.1) in each dataset. N: Number of unique genes differentially expressed with a Q-value < 0.1.
Datasets in which genes in the common metastatic signature are significant with a Q-value < 0.1
| Genes differentially expressed | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | Num studies |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| EZH2 | X | X | X | X | X | 5 | |||||||||||||
| ACTG2 | X | X | X | X | 4 | ||||||||||||||
| FUCA1 | X | X | X | X | 4 | ||||||||||||||
| JUNB | X | X | X | X | 4 | ||||||||||||||
| PDE4D | X | X | X | X | 4 | ||||||||||||||
| RAP1A | X | X | X | X | 4 | ||||||||||||||
| SYNPO2 | X | X | X | X | 4 | ||||||||||||||
| CASP7 | X | X | X | X | 4 | ||||||||||||||
| GADD45B | X | X | X | X | 4 | ||||||||||||||
| KRT15 | X | X | X | X | 4 | ||||||||||||||
| PIGB | X | X | X | X | 4 | ||||||||||||||
| RARRES1 | X | X | X | X | 4 | ||||||||||||||
| TCF21 | X | X | X | X | 4 | ||||||||||||||
| CSRP1 | X | X | X | X | 4 | ||||||||||||||
| GJA1 | X | X | X | X | 4 | ||||||||||||||
| LUM | X | X | X | X | 4 | ||||||||||||||
| PKIG | X | X | X | X | 4 | ||||||||||||||
| SELE | X | X | X | X | 4 | ||||||||||||||
| TMEM49 | X | X | X | X | 4 | ||||||||||||||
| CYR61 | X | X | X | X | 4 | ||||||||||||||
| GNG12 | X | X | X | X | 4 | ||||||||||||||
| MAPK1 | X | X | X | X | 4 | ||||||||||||||
| PLA2G2A | X | X | X | X | 4 | ||||||||||||||
| SLC12A4 | X | X | X | X | 4 | ||||||||||||||
| TPM1 | X | X | X | X | 4 | ||||||||||||||
| DPT | X | X | X | X | 4 | ||||||||||||||
| GSN | X | X | X | X | 4 | ||||||||||||||
| MFAP4 | X | X | X | X | 4 | ||||||||||||||
| PLEKHC1 | X | X | X | X | 4 | ||||||||||||||
| SMTN | X | X | X | X | 4 | ||||||||||||||
| TSC22D1 | X | X | X | X | 4 | ||||||||||||||
| DSTN | X | X | X | X | 4 | ||||||||||||||
| IER2 | X | X | X | X | 4 | ||||||||||||||
| NBL1 | X | X | X | X | 4 | ||||||||||||||
| PPP1R12A | X | X | X | X | 4 | ||||||||||||||
| SORBS1 | X | X | X | X | 4 | ||||||||||||||
| VCL | X | X | X | X | 4 | ||||||||||||||
| FILIP1L | X | X | X | X | 4 | ||||||||||||||
| ISL1 | X | X | X | X | 4 | ||||||||||||||
| PAGE4 | X | X | X | X | 4 | ||||||||||||||
| FLNC | X | X | X | X | 4 | ||||||||||||||
| JMJD3 | X | X | X | X | 4 | ||||||||||||||
| PAM | X | X | X | X | 4 | ||||||||||||||
| FOSB | X | X | X | X | 4 | ||||||||||||||
| PCP4 | X | X | X | X | 4 | ||||||||||||||
| ACTA2 | X | X | X | X | X | 5 | |||||||||||||
| CNN1 | X | X | X | X | X | 5 | |||||||||||||
| DMN | X | X | X | X | X | 5 | |||||||||||||
| HBEGF | X | X | X | X | X | 5 | |||||||||||||
| MGP | X | X | X | X | X | 5 | |||||||||||||
| PYROXD1 | X | X | X | X | X | 5 | |||||||||||||
| TACC1 | X | X | X | X | X | 5 | |||||||||||||
| BMPR1A | X | X | X | X | X | 5 | |||||||||||||
| CTGF | X | X | X | X | X | 5 | |||||||||||||
| FBLN1 | X | X | X | X | X | 5 | |||||||||||||
| KCNMA1 | X | X | X | X | X | 5 | |||||||||||||
| NR4A1 | X | X | X | X | X | 5 | |||||||||||||
| RBPMS | X | X | X | X | X | 5 | |||||||||||||
| TAGLN | X | X | X | X | X | 5 | |||||||||||||
| CAMK2G | X | X | X | X | X | 5 | |||||||||||||
| DIO2 | X | X | X | X | X | 5 | |||||||||||||
| FHL1 | X | X | X | X | X | 5 | |||||||||||||
| LMOD1 | X | X | X | X | X | 5 | |||||||||||||
| NR4A3 | X | X | X | X | X | 5 | |||||||||||||
| SPARCL1 | X | X | X | X | X | 5 | |||||||||||||
| ZFP36 | X | X | X | X | X | 5 | |||||||||||||
| CCND2 | X | X | X | X | X | 5 | |||||||||||||
| DKFZP564O0823 | X | X | X | X | X | 5 | |||||||||||||
| FXYD3 | X | X | X | X | X | 5 | |||||||||||||
| MCL1 | X | X | X | X | X | 5 | |||||||||||||
| PPP1R12B | X | X | X | X | X | 5 | |||||||||||||
| SPG20 | X | X | X | X | X | 5 | |||||||||||||
| BTG2 | X | X | X | X | X | X | 6 | ||||||||||||
| JUND | X | X | X | X | X | X | 6 | ||||||||||||
| KCNMB1 | X | X | X | X | X | X | 6 | ||||||||||||
| MYH11 | X | X | X | X | X | X | 6 | ||||||||||||
| MYLK | X | X | X | X | X | X | 6 | ||||||||||||
| SOD3 | X | X | X | X | X | X | 6 | ||||||||||||
| TPM2 | X | X | X | X | X | X | X | X | 8 |
The studies in which each gene is differentially expressed are marked with an X under the study number, as given in Table 1. "Num studies" refers to the total number of studies in which the gene is significantly differentially expressed with a Q-value < 0.1.
Ingenuity canonical pathways significantly (B-H p-value < 0.05) represented by the genes down-regulated in the common metastatic signature
| Ingenuity Canonical Pathways | Fisher Exact p-value | B-H p-value | Ratio |
|---|---|---|---|
| Actin Cytoskeleton Signaling | 7.94E-08 | 1.51E-05 | 4.26E-02 |
| Regulation of Actin-based Motility by Rho | 3.89E-06 | 2.14E-04 | 6.52E-02 |
| Integrin Signaling | 4.57E-06 | 2.14E-04 | 3.96E-02 |
| Calcium Signaling | 2.24E-05 | 7.08E-04 | 3.41E-02 |
| Protein Kinase A Signaling | 1.12E-04 | 3.02E-03 | 2.51E-02 |
| RhoA Signaling | 1.86E-04 | 4.27E-03 | 4.55E-02 |
| NRF2-mediated Oxidative Stress Response | 2.00E-04 | 4.27E-03 | 3.28E-02 |
| ILK Signaling | 2.51E-04 | 4.68E-03 | 3.23E-02 |
| Thrombin Signaling | 3.39E-04 | 5.75E-03 | 2.94E-02 |
| Chemokine Signaling | 4.17E-04 | 6.61E-03 | 5.33E-02 |
| VEGF Signaling | 8.32E-04 | 1.07E-02 | 4.12E-02 |
| FAK Signaling | 8.71E-04 | 1.07E-02 | 4.00E-02 |
| Phospholipase C Signaling | 9.12E-04 | 1.07E-02 | 2.34E-02 |
| cAMP-mediated Signaling | 1.02E-03 | 1.15E-02 | 3.11E-02 |
| Tight Junction Signaling | 1.15E-03 | 1.20E-02 | 2.99E-02 |
| Relaxin Signaling | 3.47E-03 | 2.95E-02 | 2.68E-02 |
| CDK5 Signaling | 9.12E-03 | 6.17E-02 | 3.19E-02 |
| IL-8 Signaling | 9.33E-03 | 6.17E-02 | 2.15E-02 |
B-H: Benjamini-Hochberg method for correcting for the multiple testing problem; Ratio: The number of genes from the metastatic signature that map to the pathway divided by the total number of genes that map to the canonical pathway.
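The three quantities reported in this table (an enrichment p-value, its Benjamini-Hochberg adjustment, and the pathway ratio) can be sketched in a few lines of standard-library Python. This is an illustration only: the right-tailed form of the Fisher exact test is an assumption about how the pathway tool computes it, and the function names and toy numbers are hypothetical rather than taken from the paper.

```python
from math import comb


def enrichment_pvalue(k, n_sig, n_pathway, n_total):
    """Right-tailed Fisher exact p-value: P(overlap >= k) under the
    hypergeometric distribution (assumed form of the test)."""
    upper = min(n_sig, n_pathway)
    numerator = sum(
        comb(n_pathway, x) * comb(n_total - n_pathway, n_sig - x)
        for x in range(k, upper + 1)
    )
    return numerator / comb(n_total, n_sig)


def benjamini_hochberg(pvalues):
    """Return B-H adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def pathway_ratio(signature_genes, pathway_genes):
    """Signature genes mapping to the pathway / all genes in the pathway."""
    pathway = set(pathway_genes)
    return len(set(signature_genes) & pathway) / len(pathway)


# Hypothetical toy numbers, not values from the table.
p = enrichment_pvalue(k=2, n_sig=3, n_pathway=4, n_total=10)
adj = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
ratio = pathway_ratio({"A", "B", "C"}, {"B", "C", "D", "E"})
print(p, adj, ratio)
```

Because the B-H adjustment depends on the total number of pathways tested, the adjusted values in the table cannot be reproduced from the listed rows alone.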
Figure 6. Mapping of common pathways represented by gene list to metastatic cascade. Ingenuity pathways significantly enriched by the common metastatic signature with a p-value < 0.01 were mapped to the metastatic cascade after a literature review. The figure is based on the metastatic cascade as published by Isaiah Fidler in 2003 [3].
Enrichment of the common metastatic-signature (CMS)
| Study [Reference] | Platform | Unique Genes Tested | Primary | Distant Mets | Unique CMS Genes in dataset | Significant CMS Genes | LS Statistic p-value | GEO Acc # or SMD Pub # |
|---|---|---|---|---|---|---|---|---|
| Varambally Prostate [ | HG U133 Plus 2.0 | 19079 | 7 | 6 | 65 | 57 | < 0.00001 | GSE3325 |
| Chen Gastric [ | Undefined cDNA microarray | 10568 | 89 | 14 | 61 | 26 | < 0.00001 | SMD Pub # 232 |
| Ki Colon [ | CMRC-GT | 9078 | 52 | 28 | 55 | 25 | 0.00011 | GSE6988 |
| Riker Melanoma [ | HG U133 Plus 2.0 | 19079 | 16 | 40 | 71 | 17 | 0.011 | GSE7553 |
| Linn Sarcoma [ | Undefined cDNA microarray | 14437 | 47 | 10 | 61 | 5 | 0.50 | SMD Pub # 287 |
| Tothill Ovarian [ | HG U133 Plus 2.0 | 19079 | 189 | 54 | 65 | 7 | 0.93 | GSE9899 |
The 6 validation datasets are described with regard to the platform used in the original experiment, the number of unique genes represented in the platform, the number of samples that are primary tumors or distant metastases (mets), the number of genes in the common metastatic signature (CMS) represented in the platform, the number of CMS genes that were significant with a Q-value < 0.1, the LS statistic p-value, and the Gene Expression Omnibus Accession number (GEO Acc #) or SMD Publication number (SMD Pub #). HG U133 Plus 2.0: Affymetrix Human Genome U133 Plus 2.0 Array; CMRC-GT: Cancer Metastasis Research Center-Genomic Tree array, Yonsei Cancer Center, Seoul, Korea.
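The LS statistic p-value in this table can be illustrated with a gene-set test. This is a rough sketch under assumptions the table does not confirm: the LS statistic is taken to be the mean negative natural log of the per-gene p-values in the gene set, and its significance is estimated by comparing against random gene sets of the same size. The function names and the toy p-values are hypothetical.

```python
import math
import random


def ls_statistic(pvalues):
    """Mean of -ln(p) over the genes in a set (assumed LS definition)."""
    return sum(-math.log(p) for p in pvalues) / len(pvalues)


def ls_pvalue(gene_pvalues, gene_set, n_random=1000, seed=0):
    """Estimate how often a random gene set of equal size scores as high.

    gene_pvalues: dict mapping gene -> univariate p-value in one dataset
    gene_set:     collection of signature genes
    """
    rng = random.Random(seed)
    genes = sorted(gene_pvalues)
    members = [g for g in genes if g in gene_set]
    if not members:
        raise ValueError("no signature genes are present in this dataset")
    observed = ls_statistic([gene_pvalues[g] for g in members])
    hits = 0
    for _ in range(n_random):
        sample = rng.sample(genes, len(members))
        if ls_statistic([gene_pvalues[g] for g in sample]) >= observed:
            hits += 1
    return observed, hits / n_random


# Hypothetical toy dataset: 20 genes, three of which are strongly significant.
pvals = {f"g{i}": 0.5 for i in range(20)}
for g in ("g0", "g1", "g2"):
    pvals[g] = 1e-6
observed, p = ls_pvalue(pvals, {"g0", "g1", "g2"})
print(observed, p)
```

Only genes present on a dataset's platform enter the statistic, which is why the table reports the number of unique CMS genes available in each validation dataset.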