Gul S Dalgin, Dustin T Holloway, Louis S Liou, Charles DeLisi.
Abstract
Microarray gene expression profiling has been used to distinguish histological subtypes of renal cell carcinoma (RCC), and consequently to identify specific tumor markers. The analytical procedures currently in use find sets of genes whose average differential expression across the two categories differs significantly. In general, each of the markers thus identified does not distinguish tumor from normal with 100% accuracy, although the group as a whole might be able to do so. For the purpose of developing a widely used, economically viable diagnostic signature, however, large groups of genes are not likely to be useful. Here we use two different methods, one a support vector machine variant and the other an exhaustive search, to reanalyze data previously generated in our lab (Lenburg et al. 2003). We identify 158 genes, each having an expression level that is higher (lower) in every tumor sample than in any normal sample, and each having a minimum differential expression across the two categories at a significance of 0.01. The set is highly enriched in cancer-related genes (p = 1.6 × 10⁻¹²), containing 43 genes previously associated with either RCC or other types of cancer. Many of the biomarkers appear to be associated with the central alterations known to be required for cancer transformation. These include the oncogenes JAZF1, AXL, ABL2; tumor suppressors RASD1, PTPRO, TFAP2A, CDKN1C; and genes involved in proteolysis or cell adhesion such as WASF2 and PAPPA.
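The selection criterion described in the abstract (expression higher, or lower, in every tumor sample than in any normal sample) can be illustrated with a minimal Python sketch. The function name `separation_gap` and the toy values are hypothetical. Taking the gap between the closest tumor and normal values as the minimum differential expression is an assumption, since the abstract does not spell out the exact distance measure.

```python
def separation_gap(tumor, normal):
    """Return (direction, gap) for one gene.

    direction is "up" if every tumor value exceeds every normal value,
    "down" if every tumor value is below every normal value, else None.
    gap is the distance between the two closest values of the two classes
    (assumed here to stand in for the minimum differential expression).
    """
    if min(tumor) > max(normal):
        return "up", min(tumor) - max(normal)
    if max(tumor) < min(normal):
        return "down", min(normal) - max(tumor)
    return None, 0.0


# Toy expression values, not taken from the study.
print(separation_gap([8.0, 9.5, 7.2], [3.1, 4.0, 5.0]))  # classes do not overlap
print(separation_gap([2.0, 6.0, 7.2], [3.1, 4.0, 5.0]))  # classes overlap
```

A gene for which the function returns `None` would not qualify as a single-gene marker under this reading, however large its average differential expression.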
Keywords: Cancer diagnosis; Renal cell carcinoma; biomarker identification; microarray analysis
Year: 2007 PMID: 19455236 PMCID: PMC2675843
Source DB: PubMed Journal: Cancer Inform ISSN: 1176-9351
Figure 1. (a) The minimum distances of randomly formed expression profiles in four simulations are shown as representative of other simulations. The x-axis is the minimum distance and the y-axis is the number of genes having that distance. (b) The distribution of minimum distances for 466 genes. 158 of these genes have minimum distances with p-values ≤0.01 and are hence identified as significant single-gene biomarkers.
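One way to attach a p-value to a minimum distance is to compare it with distances obtained from randomly formed profiles. The sketch below is only an illustration under stated assumptions: the null model (shuffling the sample labels of a single gene), the helper names `class_gap` and `empirical_p_value`, the number of random profiles, and the toy data are all hypothetical and are not taken from the paper.

```python
import random


def class_gap(tumor, normal):
    """Gap between the two classes when they do not overlap, else 0.0."""
    return max(min(tumor) - max(normal), min(normal) - max(tumor), 0.0)


def empirical_p_value(tumor, normal, n_random=10000, seed=0):
    """Fraction of randomly relabelled profiles whose gap is at least
    as large as the observed gap (assumed null model)."""
    rng = random.Random(seed)
    observed = class_gap(tumor, normal)
    pooled = list(tumor) + list(normal)
    hits = 0
    for _ in range(n_random):
        rng.shuffle(pooled)
        random_tumor = pooled[:len(tumor)]
        random_normal = pooled[len(tumor):]
        if class_gap(random_tumor, random_normal) >= observed:
            hits += 1
    return hits / n_random


# Toy values for one gene, not taken from the study.
tumor = [8.0, 9.5, 7.2, 8.8, 9.1]
normal = [3.1, 4.0, 5.0, 4.4, 3.8]
p = empirical_p_value(tumor, normal)
print(p)
```

With five samples per class, only 2 of the 252 possible label assignments separate the toy values completely, so the estimate should land near 0.008, below a 0.01 threshold.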
Top ranked pathways with percentage of significant classifier genes.
| Pathway | Number of genes | Percentage |
|---|---|---|
| Glycolysis | 5 | 12.5 |
| Antigen processing | 4 | 11.76 |
| Oxidative phosphorylation | 5 | 3.36 |
| Calcium signaling pathway | 4 | 3.13 |
| G-Protein coupled signaling | 4 | 2.99 |
| MAPK signaling pathway | 4 | 2.06 |
| Immune response | 14 | 1.71 |
| Fatty acid metabolism | 4 | 1.53 |
| Cation transport | 6 | 1.19 |
| Apoptosis | 6 | 0.98 |
| Intracellular transport | 6 | 0.95 |
| Regulation of transcription | 12 | 0.58 |
| Cell adhesion | 4 | 0.56 |
Figure 2. The relationship between minimum distance and average fold change (log2(C/N)). Average fold change was previously calculated by Lenburg et al. and the significance was found by t-test. Here, C and N denote the average expression values in tumor and normal samples, respectively. Significant markers (p-value ≤0.01, 158 genes) are shown in red. 64 genes lie between the vertical arrows. These genes have an average fold change of less than 3 and hence were not previously identified as differentially expressed. Yet they have been identified as new potential biomarkers by the current algorithm.
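The caption's point that a gene can separate the classes completely while having a modest average fold change can be checked with a small Python sketch. The function name and the toy values are hypothetical; only the formula log2(C/N) comes from the caption.

```python
import math


def average_log2_fold_change(tumor, normal):
    """log2(C/N), with C and N the mean tumor and mean normal expression."""
    c = sum(tumor) / len(tumor)
    n = sum(normal) / len(normal)
    return math.log2(c / n)


# Toy values, not taken from the study: every tumor value exceeds every
# normal value, yet the average fold change C/N is only 2.
tumor = [190.0, 200.0, 210.0]
normal = [95.0, 100.0, 105.0]
print(average_log2_fold_change(tumor, normal))  # 1.0
print(min(tumor) > max(normal))  # True
```

A filter based only on a fold-change cutoff of 3 would discard this toy gene even though it classifies every sample correctly.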
Top 20 ranked genes by SVM and their significance as classifier by exhaustive search.
| SVM rank | Gene | p-value | Exhaustive search rank | Disease association |
|---|---|---|---|---|
| 1 | ALDOB | 6.00E-05 | 23 | RCC |
| 2 | NDUFA4 | 2.00E-05 | 8 | |
| 3 | TFAP2B | 0 | 1 | tumor suppressor candidate, melanoma |
| 4 | TCTE1L | 0.00275 | 93 | |
| 5 | LOC57821 | 2.00E-05 | 11 | |
| 6 | GABARAPL3 | 0.02892 | 239 | |
| 7 | SLC38A1 | 7.00E-05 | 27 | |
| 8 | POLDIP2 | 0.00456 | 124 | |
| 9 | DACH1 | 0.00033 | 43 | |
| 10 | GABARAPL1 | 2.00E-05 | 9 | RCC |
| 11 | HLA-DPA1 | 5.00E-05 | 16 | Melanoma |
| 12 | ERBB4 | 0 | 2 | Breast, ovarian cancer |
| 13 | HIG1 | 0.00362 | 108 | hypoxia induced |
| 14 | EHD2 | 0.00169 | 74 | |
| 15 | CD81 | 0.00013 | 38 | Hepatoma |
| 16 | PRG-3 | 0 | 3 | |
| 17 | NPHS1 | 6.00E-05 | 19 | Non cancerous kidney diseases |
| 18 | C1QA | 0.00379 | 112 | |
| 19 | ZNF697 | 0.01543 | 191 | |
| 20 | PIGR | 0.38576 | 432 | |
†: minimum distance p-value >0.01
identified by four or more RCC studies (Young et al. 2001; Boer et al. 2001; Takahashi et al. 2001; Gieseg et al. 2002; Lenburg et al. 2003)
identified by Lenburg et al.
Performance of classifiers on the test samples in breast cancer dataset with 18 initial samples.
| % correctly classified samples | 88% | 82% |
| % misclassified samples | 11% | 18% |
| Sensitivity | 0.87 | 0.8 |
| Specificity | 0.92 | 0.85 |
| PPV | 0.96 | 0.91 |
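For reference, the metrics in the table follow from the counts of a confusion matrix. The sketch below uses hypothetical counts, not the study's data, and the function name `classifier_metrics` is an assumption made for illustration.

```python
def classifier_metrics(tp, fp, tn, fn):
    """Standard metrics from true/false positive and negative counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,    # fraction correctly classified
        "sensitivity": tp / (tp + fn),    # true positive rate
        "specificity": tn / (tn + fp),    # true negative rate
        "ppv": tp / (tp + fp),            # positive predictive value
    }


# Hypothetical counts for illustration only.
metrics = classifier_metrics(tp=8, fp=1, tn=9, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```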
Figure 3. (a) Clustering of samples and 158 significant markers using hierarchical clustering. The expression values are normalized for each gene by dividing every expression value by the mean of normal values for that gene and then transforming those values to logarithmic values (log2) to emphasize up- or down-regulation with respect to normal expression values. Black represents the mean of normal values, green represents down-regulation and red represents up-regulation with respect to the mean. Clustering of genes reveals two big clusters: genes down-regulated in RCC (upper half) and genes up-regulated in RCC (bottom half) with respect to the normal samples, together with subclusters within these groups. The markers cluster the samples perfectly into two major groups (Fig 3a, upper dendrogram). Tumor samples are separated into two major sub-clusters according to grade. The only exception is sample T005 (grade I), which clusters with high-grade samples. Within the normal samples, N032 and N035 have expression profiles most similar to tumor samples. (b) The projection of the samples onto the first three principal components (PC). The eigenvectors of the first three eigenvalues accounted for 86.9% of the variation (81.5, 5.4 and 2.2, respectively) in the data. Tumor samples are represented by open circles; normal samples are shown by filled circles. The first principal component separates normal samples from tumor samples, while the second principal component separates low-grade tumor samples (T3, T023, T001, T2 and T4) from high-grade ones (T011, T032, T035), again with the exception of sample T005. The third principal component separates normal samples N032 and N035 from the rest of the normal samples, as was observed with hierarchical clustering (HCL).
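The per-gene normalization described in the caption (divide by the mean of the normal samples, then take log2) can be sketched as follows. The function name and the toy values are hypothetical.

```python
import math


def log2_ratio_to_normal_mean(values, normal_values):
    """Divide each value by the mean of the normal samples, then take log2."""
    normal_mean = sum(normal_values) / len(normal_values)
    return [math.log2(v / normal_mean) for v in values]


# Toy values for one gene, not taken from the study.
normal = [90.0, 100.0, 110.0]
tumor = [400.0, 25.0]
print(log2_ratio_to_normal_mean(tumor, normal))  # [2.0, -2.0]
```

On this scale a value of 0 corresponds to the mean of the normal samples (black in the heatmap), positive values to up-regulation (red) and negative values to down-regulation (green).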
158 significant (p-value ≤0.01) markers.
| Accession | Gene | p-value |
|---|---|---|
| NM_003221 | TFAP2B | 0 |
| NM_005235 | ERBB4 | 0 |
| NM_017753 | PRG-3 | 0 |
| NM_003714 | STC2 | 1x10−5 |
| AI655467 | | 2x10−5 |
| BF478120 | | 2x10−5 |
| NM_001692 | ATP6V1B1 | 2x10−5 |
| NM_002489 | NDUFA4 | 2x10−5 |
| NM_012232 | PTRF | 2x10−5 |
| NM_021179 | LOC57821 | 2x10−5 |
| NM_031412 | GABARAPL1 | 2x10−5 |
| AI733359 | | 3x10−5 |
| NM_005950 | MT1G | 3x10−5 |
| NM_006990 | WASF2 | 3x10−5 |
| NM_172369 | C1QG | 3x10−5 |
| BF541967 | | 5x10−5 |
| NM_002010 | FGF9 | 5x10−5 |
| NM_033554 | HLA-DPA1 | 5x10−5 |
| AI589190 | | 6x10−5 |
| BC005314.1 | | 6x10−5 |
| NM_004646 | NPHS1 | 6x10−5 |
| NM_133262 | ATP6V1G3 | 6x10−5 |
| NM_174896 | MGC24133 | 6x10−5 |
| NM_002848 | PTPRO | 7x10−5 |
| NM_003113 | SP100 | 7x10−5 |
| NM_014625 | NPHS2 | 7x10−5 |
| NM_000339 | SLC12A3 | 8x10−5 |
| NM_000491 | C1QB | 9x10−5 |
| NM_001009 | RPS5 | 9x10−5 |
| NM_000767 | CYP2B6 | 0.00011 |
| NM_003012 | SFRP1 | 0.00011 |
| NM_004894 | C14orf2 | 0.00011 |
| NM_016929 | CLIC5 | 0.00011 |
| NM_022073 | EGLN3 | 0.00011 |
| NM_033201 | BC008967 | 0.00011 |
| NM_004356 | CD81 | 0.00013 |
| NM_138799 | OACT2 | 0.00013 |
| NM_000211 | ITGB2 | 0.00014 |
| AK026764.1 | | 0.00015 |
| NM_152522 | MGC33864 | 0.0003 |
| AV691491 | | 0.00033 |
| NM_004392 | DACH1 | 0.00033 |
| NM_005565 | LCP2 | 0.00033 |
| NM_014463 | LSM3 | 0.00033 |
| NM_015474 | SAMHD1 | 0.00036 |
| BG251556 | | 0.00037 |
| NM_018162 | HELAD1 | 0.0004 |
| NM_000376 | VDR | 0.00043 |
| NM_001819 | CHGB | 0.00047 |
| NM_020142 | NUOMS | 0.00047 |
| NM_004578 | RAB4A | 0.00055 |
| AI962367 | | 0.00058 |
| NM_021800 | DNAJC12 | 0.0006 |
| NM_021199 | SQRDL | 0.00061 |
| NM_153233 | FLJ36445 | 0.00073 |
| NM_017606 | NM_017606 | 0.0008 |
| NM_015488 | MR-1 | 0.00084 |
| BF439449 | | 0.00093 |
| NM_000342 | SLC4A1 | 0.00093 |
| NM_006120 | HLA-DMA | 0.00098 |
| NM_000918 | P4HB | 0.00108 |
| NM_001099 | ACPP | 0.00113 |
| NM_021151 | CROT | 0.00113 |
| BG434272 | | 0.00121 |
| NM_001216 | CA9 | 0.00122 |
| NM_198991 | KCTD1 | 0.00125 |
| NM_006312 | NCOR2 | 0.00126 |
| NM_016582 | SLC15A3 | 0.00142 |
| NM_020632 | ATP6V0A4 | 0.00142 |
| NM_003220 | TFAP2A | 0.00166 |
| NM_005158 | ABL2 | 0.00169 |
| NM_014601 | EHD2 | 0.00169 |
| NM_003116 | SPAG4 | 0.00185 |
| AW771565 | AIM1 | 0.00187 |
| NM_003946 | NOL3 | 0.00192 |
| NM_000076 | CDKN1C | 0.00195 |
| NM_006058 | TNIP1 | 0.00197 |
| NM_000336 | SCNN1B | 0.00198 |
| NM_000035 | ALDOB | 0.00202 |
| NM_015103 | PLXND1 | 0.00206 |
| BF130943 | | 0.00217 |
| BE552097 | | 0.00222 |
| NM_000672 | ADH6 | 0.00248 |
| BE739519 | | 0.00259 |
| NM_198446 | FLJ45459 | 0.00259 |
| NM_021958 | HLX1 | 0.0026 |
| NM_001395 | DUSP9 | 0.00262 |
| NM_018023 | YEATS2 | 0.00267 |
| NM_001004196 | CD200 | 0.00269 |
| NM_006520 | TCTE1L | 0.00275 |
| NM_001152 | SLC25A5 | 0.00276 |
| NM_002193 | INHBB | 0.00277 |
| NM_006922 | SCN3A | 0.00277 |
| NM_000159 | GCDH | 0.0029 |
| NM_002800 | PSMB9 | 0.00314 |
| NM_004051 | BDH | 0.00314 |
| NM_145040 | PRKCDBP | 0.0032 |
| N58278 | | 0.00325 |
| NM_024006 | VKORC1 | 0.00335 |
| NM_004710 | SYNGR2 | 0.00339 |
| AI796222 | | 0.00342 |
| NM_000161 | GCH1 | 0.00348 |
| NM_000544 | TAP2 | 0.00357 |
| NM_014706 | SART3 | 0.00357 |
| NM_014056 | HIG1 | 0.00362 |
| NM_001645 | APOC1 | 0.00364 |
| NM_012153 | EHF | 0.00364 |
| NM_175061 | JAZF1 | 0.00368 |
| NM_015991 | C1QA | 0.00379 |
| NM_145648 | SLC15A4 | 0.00384 |
| NM_178014 | TUBB | 0.00391 |
| NM_000405 | GM2A | 0.00392 |
| AW242899 | | 0.00407 |
| NM_000582 | SPP1 | 0.00408 |
| NM_002610 | PDK1 | 0.00412 |
| NM_007021 | C10orf10 | 0.00413 |
| NM_016084 | RASD1 | 0.00423 |
| NM_016184 | CLECSF6 | 0.00433 |
| NM_017923 | MARCH-I | 0.00438 |
| NM_015584 | POLDIP2 | 0.00456 |
| NM_006406 | PRDX4 | 0.00468 |
| NM_020991 | CSH2 | 0.0047 |
| NM_000677 | ADORA3 | 0.00478 |
| NM_002223 | ITPR2 | 0.00483 |
| BF590528 | | 0.00485 |
| NM_005949 | MT1F | 0.00489 |
| NM_003038 | SLC1A4 | 0.0049 |
| NM_001465 | FYB | 0.00503 |
| NM_004790 | SLC22A6 | 0.00514 |
| NM_024027 | COLEC11 | 0.00514 |
| AI769774 | | 0.00525 |
| NM_016653 | ZAK | 0.00525 |
| NM_014629 | ARHGEF10 | 0.00527 |
| NM_000253 | MTP | 0.00571 |
| NM_003361 | UMOD | 0.00576 |
| BF510426 | | 0.00583 |
| NM_005531 | IFI16 | 0.006 |
| AI282982 | LOC120224 | 0.00629 |
| NM_004247 | U5-116KD | 0.00634 |
| NM_032118 | FLJ12953 | 0.00641 |
| NM_004414 | DSCR1 | 0.00655 |
| NM_032866 | CNGLN | 0.00665 |
| NM_002118 | HLA-DMB | 0.00719 |
| NM_004483 | GCSH | 0.00742 |
| NM_000316 | PTHR1 | 0.00743 |
| T90295 | | 0.0076 |
| NM_030674 | SLC38A1 | 0.00765 |
| NM_001699 | AXL | 0.00773 |
| AW242836 | LOC120224 | 0.0078 |
| NM_205848 | SYT6 | 0.00895 |
| NM_000034 | ALDOA | 0.00896 |
| NM_032717 | MGC11324 | 0.00942 |
| NM_020139 | DHRS6 | 0.00945 |
| AA148534 | PAPPA | 0.00951 |
| NM_016321 | RHCG | 0.00956 |
| H99792 | | 0.00983 |
| NM_053000 | TIGA1 | 0.00983 |
| NM_018660 | ZNF395 | 0.00994 |
Biological roles (from KEGG and GO databases) and disease associations of 115 annotated gene markers.
| Gene | Disease association | Weinberg category | Biological role |
|---|---|---|---|
| DUSP9 | | | MAPK Signaling/JNK cascade |
| FGF9 | Prostate, ovarian cancer | Self-sufficiency in growth signals | MAPK Signaling/Regulation of actin cytoskeleton/growth factor |
| | | Tissue invasion and metastasis | Calcium Signaling Pathway/Gap junction |
| ERBB4 | Breast, ovarian cancer | Insensitivity to anti-growth signals | Calcium Signaling Pathway |
| | RCC | Self-sufficiency (Loss of cancer cell dependence on OXPHOS) | Calcium Signaling Pathway/Intracellular transport |
| DSCR1 | | | Calcium Signaling Pathway |
| PTHR1 | Chronic kidney failure | | G-Protein coupled signaling |
| RASD1 | Suppresses cell growth in human breast cancer and lung cancer cell lines | Insensitivity to anti-growth signals | G-Protein coupled signaling |
| SFRP1 | RCC, bladder cancer, cervical cancer | Evasion of apoptosis | Wnt Signaling Pathway/Apoptosis |
| | Neuroendocrine tumors | | Signaling/hormone |
| PTPRO | Lung cancer | Insensitivity to anti-growth signals | Signaling/tumor suppressor candidate |
| GABARAPL1 | RCC | | Signaling |
| ATP6V1G3 | | Self-sufficiency (Loss of cancer cell dependence on OXPHOS) | Oxidative phosphorylation |
| ATP6V0A4 | | Self-sufficiency (Loss of cancer cell dependence on OXPHOS) | Oxidative phosphorylation |
| ATP6V1B1 | RCC/renal tubular acidosis | Self-sufficiency (Loss of cancer cell dependence on OXPHOS) | Oxidative phosphorylation |
| KCTD1 | | | Cation transport |
| SCNN1B | | | Cation transport |
| SLC12A3 | | | Cation transport |
| SCN3A | | | Cation transport |
| EHO1 | | | Cation transport |
| RHCG | | | Cation transport |
| SLC4A1 | | | Anion transport |
| SLC12A3 | | | Anion transport |
| SLC22A6 | | | Anion transport |
| COLEC11 | | | Anion transport/Immune response |
| CLIC5 | | | Anion transport |
| MTP | | | Intracellular transport |
| | | | Transcription regulation |
| TFAP2B | Char syndrome | | Transcription regulation |
| | Melanoma | Insensitivity to anti-growth signals | Transcription regulation/tumor suppressor candidate |
| VDR | RCC | Insensitivity to anti-growth signals | Transcription regulation |
| EHF | Prostate, breast, and lung carcinomas | Insensitivity to anti-growth signals | Transcriptional repressor |
| | | | RNA splicing |
| RCQ5 | | | DNA repair |
| | Up in colorectal cancer | Limitless replicative potential | DNA replication |
| PAPPA | | Tissue invasion and metastasis | Proteolysis |
| DNAJC12 | | | Protein folding |
| CNGLN | RCC | Tissue invasion and metastasis | Regulation of actin cytoskeleton |
| AIM1 | Melanoma | Insensitivity to anti-growth signals + Tissue invasion and metastasis | Cell adhesion/tumor suppressor candidate |
| NPHS1 | Kidney diseases | Tissue invasion and metastasis | Cell adhesion |
| | down-regulated in RCC and intrahepatic cholangiocarcinoma; up-regulated in breast, prostate, colon (and others) carcinomas | Insensitivity to anti-growth signals | Cell adhesion/Apoptosis/Immune response |
| CDKN1C | Breast, pancreatic, thyroid cancer | Insensitivity to anti-growth signals | Cell cycle/tumor suppressor gene |
| UMOD | | | Immune response |
| ADH6 | RCC | | Glycolysis/Fatty acid metabolism |
| ALDOB | RCC, hepatocellular carcinoma | | Glycolysis |
| G3P2 | | | Glycolysis |
| CYP2B6 | Breast cancer | | Fatty acid metabolism |
| | | | Fatty acid metabolism |
| | | | Fatty acid metabolism |
| MGC11324 | | | Membrane lipid metabolism |
| BDH | | | Synthesis and degradation of ketone bodies |
| | | Self-sufficiency (Loss of cancer cell dependence on OXPHOS) | Oxidative phosphorylation |
| NPHS2 | Kidney diseases | | Energy metabolism |
| GCSH | | | Amino Acid metabolism |
| GCH1 | | | Amino Acid metabolism |
| ACPP | Reduces cell growth in prostate cancer upon induction by Vitamin D receptor agonists | Insensitivity to anti-growth signals | Metabolism |
| DHRS6 | | | Metabolism |
| | | | Hypoxia induced gene |
| MT1G | RCC, papillary thyroid carcinoma, prostate cancer | Insensitivity to anti-growth signals | Metallothionein gene/tumor suppressor candidate |
| MT1F | Breast, liver cancer. Suppresses growth of liver cell line HepG2 | Insensitivity to anti-growth signals | Metallothionein gene |
Genes in italic: 64 genes previously not reported by Lenburg et al.
genes previously reported by 4 or more RCC studies
Figure 5. Genes related to critical processes underlying kidney cell transformation. Marker genes were placed into six Weinberg categories, which are essential for tumor development. Genes previously found by at least four other RCC studies are indicated with **, genes implicated in other cancers with *, and markers not previously identified by Lenburg et al. are given in red. Since the exact order of these steps is not known, the processes are given here in no particular order.
Figure 4 (a-f). Heatmaps of some of the important pathways in which the 158 markers are involved. The expression values for each gene are transformed (Section 3.4.1) and color coded as in Figure 3. Black represents the mean of normal values, green represents down-regulation and red represents up-regulation with respect to the mean.
25 genes that were not identified by Lenburg et al. but were identified by other RCC studies, including this study.
Comparison with other RCC studies.
| More than two-fold changed in two or more tumor samples | 7 tumor (4 cc-RCC), 7 normal/cDNA 7,075 genes | 189 | 8 | 1 | |
| Three-fold or more changed in 75% or more of the tumor samples | 29 cc-RCC and 29 normal/cDNA 21,632 genes | 109 | 7 | - | |
| Changed genes in cc-RCC with Wilcoxon test p-value ≤0.001 and fold change ≥1.1 | 13 RCC (9cc-RCC), 9 normal/ Affymetrix 5600 genes | 355 genes, 85 reported | 4 | 1 | |
| adapted sign test by counting for each gene the number of times that its measured intensity in the set of repeated pair-wise comparisons is higher in T and N | 37 cc-RCC, 37 normal | 1738 cDNAs | 45 | 11 | |
| No reported gene set; selected genes with avg fold change >3 and t-test p-value >0.03. | 41 RCC (23 cc-RCC), 3 normal/cDNA 22,648 genes | 182 genes | 8 | 1 | |
| 90% lower confidence bound of the fold change was >2 and t-test p-value <0.001 | 8 clear cell stage I, 23 normal, Affymetrix 22,283 genes | 1359 up-regulated, 493 down-regulated | 37 up, 19 down-regulated | 15 up-regulated | |
| t-test with estimated false discovery rate <0.23 | 25 ccRCC, 25 normal/RCC-specific cDNA microarrays with 4207 genes | 620 up-regulated; 561 down-regulated genes | 13 up, 11 down-regulated | 5 up, 2 down-regulated |
genes not identified by Lenburg et al.
Disease-related markers among the 64 not identified by Lenburg et al.
| Gene | Regulation | Disease association |
|---|---|---|
| NCOR2 | Up | Suppresses target genes for the vitamin D receptor (VDR) in prostate cancer cells, resulting in hormonal insensitivity |
| ABL2 | Up | Related to proto-oncogene ABL. Implicated in hematologic neoplasms |
| CD81 | Up | The loss of CD81 was found to be associated with differentiation and metastasis of HCC |
| TUBB | Up | Implicated in many cancers including ovarian |
| SART3 | Up | Induces HLA-A24-restricted and tumor-specific cytotoxic T lymphocytes in colorectal cancer patients |
| HELAD1 | Down | Up-regulated in colorectal carcinomas |
| CHGB | Down | Up-regulated in neuroendocrine tumors |
| JAZF1 | Up | Frequent fusion of the JAZF1 and JJAZ1 genes in endometrial stromal tumors |
| PRKCDBP | Up | Epigenetic or mutational inactivation contributes to the pathogenesis of breast and lung cancers |
| TFAP2A | Down | Loss of AP-2 results in metastasis of melanoma cells |