George C Tseng, Chunrong Cheng, Yan Ping Yu, Joel Nelson, George Michalopoulos, Jian-Hua Luo.
Abstract
Microarray technology has been widely applied to the analysis of many malignancies; however, integrative analyses across multiple studies are rare. In this study we performed a meta-analysis of the expression profiles from four published studies that analyzed organ donor tissues, benign tissues adjacent to tumor, and tumor tissues from liver, prostate, lung and bladder samples. We identified 99 distinct multi-cancer biomarkers in the comparison of all three tissue types in liver and prostate, and 44 in the comparison of normal versus tumor in liver, prostate and lung. The bladder samples appeared to have a different list of biomarkers from the other three cancer types. In within-cancer-type prediction, the identified multi-cancer biomarkers achieved high accuracy, similar to that obtained with the whole genome. In inter-cancer-type prediction, they outperformed the whole genome. To test the validity of the multi-cancer biomarkers, 23 independent prostate cancer samples were evaluated, and 96% accuracy was achieved in inter-study prediction from each of the original prostate, liver and lung cancer data sets. The results suggest that the compact lists of multi-cancer biomarkers are important in cancer development and represent common signatures of malignancy across multiple cancer types. Pathway analysis revealed important functional categories in tumorigenesis.
Keywords: carcinogenesis; common signature; expression profile; meta-analysis; multi-cancer biomarker
Year: 2009 PMID: 19652763 PMCID: PMC2716681 DOI: 10.4137/bmi.s930
Source DB: PubMed Journal: Biomark Insights ISSN: 1177-2719
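Overview of data sets used in batch I and batch II analyses.
| Batch | Tissue | Normal (N) | Adjacent (A) | Tumor (T) | Total |
|---|---|---|---|---|---|
| I | Liver | 21 | 30 | 43 | 94 |
| I | Prostate | 23 | 59 | 66 | 148 |
| II | Liver | 21 | | 43 | 64 |
| II | Prostate | 23 | | 66 | 89 |
| II | Lung | 17 | | 134 | 151 |
| II | Bladder | 5 | | 57 | 62 |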
Figure 1. ANOVA model for batch I analysis: (A) Eight categories of ANOVA patterns used to select multi-cancer biomarkers. N denotes normal, A tissue adjacent to cancer, and T tumor sample. (B) Venn diagram representation of the number of ANOVA genes found to be significantly altered in liver and prostate tissues when comparing N, A and T groups. (C) Bar graph of genes that were altered in liver (1854), prostate (1139) or both tissue samples with the same pattern (111). (D) Histogram of correlations of N-A-T patterns across prostate and liver of the 520 common ANOVA genes.
Figure 4. Diagram of batchI-MBs and batchII-MBs and their intersection genes. The 47 batchII-MBs are listed in Table 5 and the 109 batchI-MBs are listed in Supplement Table 4.
Figure 2. Expression patterns of selected representative genes in liver and prostate samples. Genes were selected from seven pattern categories among the 111 common concordant ANOVA genes in liver and prostate samples. Global sample normalization was performed across the prostate and liver data sets. Although all these biomarkers show concordant patterns across prostate and liver, many of them (APBA2BP, SLC39A14, AGT, TOP2A and B2M) are expressed at different levels in the two tissues, so a prediction model developed in one data set will likely perform poorly when applied directly to the other.
Figure 3. Schemes of leave-one-out cross validation or external validation for batchI-MBs and batchII-MBs. Upper: scheme for leave-one-out cross validation to evaluate the procedure of selecting batchI-MBs and batchII-MBs. The test sample is first left aside. The remaining samples are used for selecting multi-cancer biomarkers and constructing the prediction model, which is then used to evaluate the set-aside test sample. This scheme is used to evaluate the procedures of selecting both batchI-MBs and batchII-MBs to generate Table 2 and Table 3. (A) An example to evaluate liv→liv in Table 2. (B) An example to evaluate pro→liv in Table 2. Lower: scheme for external validation of batchII-MBs by 23 independent prostate cancer samples. (C) External evaluation of the prediction model based on liver data and batchII-MBs (EV_liv→pro). (D) External evaluation of the prediction model based on the old prostate data and batchII-MBs (EV_pro→pro).
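The upper scheme of Figure 3 can be sketched in a few lines. The key point is that biomarker selection is repeated inside every leave-one-out fold, so the set-aside sample never influences which genes are chosen. This is only an illustrative sketch on synthetic data: the gene-scoring rule and the plain nearest-centroid classifier are simplified stand-ins (assumptions), not the ANOVA selection or the PAM classifier used in the study, and all function names are hypothetical.

```python
import numpy as np

def select_genes(X, y, k):
    """Rank genes by |difference of class means| scaled by the overall SD.
    A simplified stand-in for the study's biomarker selection (assumption)."""
    m0 = X[y == 0].mean(axis=0)
    m1 = X[y == 1].mean(axis=0)
    score = np.abs(m1 - m0) / (X.std(axis=0) + 1e-8)
    return np.argsort(score)[::-1][:k]

def nearest_centroid_predict(X_train, y_train, x):
    """Assign x to the class (0 = normal, 1 = tumor) with the closer centroid."""
    c0 = X_train[y_train == 0].mean(axis=0)
    c1 = X_train[y_train == 1].mean(axis=0)
    return int(np.sum((x - c1) ** 2) < np.sum((x - c0) ** 2))

def loocv(X, y, k=10):
    """Leave-one-out cross validation with gene selection inside each fold."""
    preds = np.empty(len(y), dtype=int)
    for i in range(len(y)):
        keep = np.arange(len(y)) != i          # set the test sample aside
        X_tr, y_tr = X[keep], y[keep]
        genes = select_genes(X_tr, y_tr, k)    # selection never sees sample i
        preds[i] = nearest_centroid_predict(X_tr[:, genes], y_tr, X[i, genes])
    return preds

# Synthetic data: 30 samples, 200 genes, the first 5 genes shifted in "tumor".
rng = np.random.default_rng(0)
y = np.array([0] * 15 + [1] * 15)
X = rng.normal(size=(30, 200))
X[y == 1, :5] += 3.0

preds = loocv(X, y, k=5)
accuracy = float(np.mean(preds == y))
print(f"LOOCV accuracy on synthetic data: {accuracy:.2f}")
```

Selecting genes once on the full data and cross-validating only the classifier would let the test sample leak into the gene list and inflate the estimated accuracy, which is why the selection step sits inside the loop.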
An example of a confusion matrix. One false negative and five false positives are made in the prediction, for a total of six errors (42/48 = 87.5% overall accuracy). The sensitivity is 41/42 = 97.6%, the specificity is 1/6 = 16.7%, and the prediction performance index (PPI) is (97.6% + 16.7%)/2 = 57.2%.
| | True normal tissues | True tumor tissues |
|---|---|---|
| Predicted as normal tissues | 1 | 1 |
| Predicted as tumor tissues | 5 | 41 |
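The quantities in the caption above follow directly from the four cells of the confusion matrix. The sketch below recomputes them for the example table; the function and argument names are illustrative choices, not taken from the paper. Note that averaging the unrounded sensitivity and specificity gives a PPI of about 57.1%, whereas the caption averages the already rounded percentages.

```python
def prediction_performance(tn, fn, fp, tp):
    """Return (accuracy, sensitivity, specificity, ppi) as fractions.

    tn: true normal predicted as normal   fn: true tumor predicted as normal
    fp: true normal predicted as tumor    tp: true tumor predicted as tumor
    """
    total = tn + fn + fp + tp
    accuracy = (tn + tp) / total
    sensitivity = tp / (tp + fn)      # fraction of true tumors called tumor
    specificity = tn / (tn + fp)      # fraction of true normals called normal
    ppi = (sensitivity + specificity) / 2
    return accuracy, sensitivity, specificity, ppi

# Cells of the example confusion matrix above.
acc, sens, spec, ppi = prediction_performance(tn=1, fn=1, fp=5, tp=41)
print(f"accuracy={acc:.1%} sensitivity={sens:.1%} "
      f"specificity={spec:.1%} PPI={ppi:.1%}")
```

Because the PPI weights sensitivity and specificity equally, it penalizes a model that labels nearly everything as tumor, even when the overall accuracy looks high because tumor samples dominate the data set.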
Figure 5. Pathway analysis heatmap. The enriched Gene Ontology terms are shown on rows and the lists of multi-cancer biomarkers on columns. The significance (p-value) is represented by a red color gradient. When the number of genes of the biomarker list that fall in the GO term is too small or zero, the p-value assessment is not computable or not stable and is shown as missing (white).
A total of 109 biomarkers were identified in more than 70% of the leave-one-out cross-validation runs in batch I (batchI-MBs). After removing duplicates, the 99 distinct predictive biomarkers are listed below.
Values are signed mean fold changes (A-N: adjacent versus normal; T-N: tumor versus normal).
| Probe set ID | Gene title | Gene symbol | Liver A-N | Liver T-N | Prostate A-N | Prostate T-N |
|---|---|---|---|---|---|---|
| 39597_at* | actin binding LIM protein family, member 3 | ABLIM3 | −1.4 | −2.1 | −1.3 | −1.6 |
| 37599_at* | aldehyde oxidase 1 | AOX1 | −1.5 | −2.8 | −1.6 | −2.6 |
| 34736_at* | cyclin B1 | CCNB1 | 1.3 | 2.3 | 1.2 | 1.8 |
| 37302_at* | centromere protein F, 350/400 ka (mitosin) | CENPF | 1.3 | 1.9 | 1.1 | 1.4 |
| 37203_at* | carboxylesterase 1 (monocyte/macrophage serine esterase 1) | CES1 | −1.3 | −1.8 | 1 | −1.7 |
| 32168_s_at* | Down syndrome critical region gene 1 | DSCR1 | 1.1 | −2 | 1.1 | −1.6 |
| 34311_at* | glutaredoxin (thioltransferase) | GLRX | −1.6 | −2.5 | −1.3 | −1.7 |
| 1737_s_at* | insulin-like growth factor binding protein 4 | IGFBP4 | −1.5 | −1.8 | −1.8 | −2.5 |
| 609_f_at* | metallothionein 1B | MT1B | −1.4 | −3.6 | −1.3 | −2.3 |
| 36130_f_at* | metallothionein 1E | MT1E | −1.4 | −3.5 | −1.2 | −1.9 |
| 31622_f_at* | metallothionein 1F | MT1F | −1.5 | −2.9 | −1.4 | −2.3 |
| 39594_f_at* | metallothionein 1H | MT1H | −1.5 | −3.2 | −1.4 | −2.4 |
| 41530_at | acetyl-Coenzyme A acyltransferase 2 (mitochondrial 3-oxoacyl-Coenzyme A thiolase) | ACAA2 | −1.1 | −2 | −1.1 | −1.6 |
| 34050_at | acyl-CoA synthetase medium-chain family member 1 | ACSM1 | 2.1 | 3.5 | 1.7 | 3.1 |
| 684_at | angiotensinogen (serpin peptidase inhibitor, clade A, member 8) | AGT | −1.9 | −2.5 | −3.3 | −3.4 |
| 32747_at | aldehyde dehydrogenase 2 family (mitochondrial) | ALDH2 | 1.1 | −1.6 | 1.1 | −1.5 |
| 33756_at | amine oxidase, copper containing 3 (vascular adhesion protein 1) | AOC3 | −1.1 | −1.4 | −1.2 | −2.5 |
| 41306_at | amyloid beta (A4) precursor protein-binding, family A, member 2 binding protein | APBA2BP | 1.2 | 1.7 | 1.4 | 1.5 |
| 287_at | activating transcription factor 3 | ATF3 | 2.4 | 1.5 | 5.2 | 3.4 |
| 201_s_at | beta-2-microglobulin | B2M | 1.1 | −1.4 | 1.1 | −1.4 |
| 2011_s_at | BCL2-interacting killer (apoptosis- inducing) | BIK | 1.3 | 1.6 | 1.3 | 1.5 |
| 39409_at | complement component 1, r subcomponent | C1R | −1.2 | −2.2 | −1.4 | −1.8 |
| 40496_at | complement component 1, s subcomponent | C1S | −1.1 | −1.7 | −1.2 | −1.8 |
| 1943_at | cyclin A2 | CCNA2 | 2 | 2.5 | 1.6 | 1.6 |
| 33950_g_at | corticotropin releasing hormone receptor 2 | CRHR2 | 1.5 | 1.4 | 1.3 | 1.4 |
| 408_at | chemokine (C-X-C motif) ligand 1 (melanoma growth stimulating activity, alpha) | CXCL1 | 2.5 | 1.1 | 1.8 | 1.2 |
| 649_s_at | chemokine (C-X-C motif) receptor 4 | CXCR4 | 3.3 | 2.6 | 3 | 2.8 |
| 38772_at | cysteine-rich, angiogenic inducer, 61 | CYR61 | 1.7 | −1.2 | 3.9 | 2.5 |
| 36643_at | discoidin domain receptor family, member 1 | DDR1 | 1.7 | 1.6 | 1.6 | 1.5 |
| 33393_at | DEAD (Asp-Glu-Ala-As) box polypeptide 19B | DDX19B | −1.3 | −1.7 | −1.2 | −1.5 |
| 32600_at | docking protein 4 | DOK4 | −1.4 | −1.5 | −1.4 | −1.6 |
| 37827_r_at | dopey family member 2 | DOPEY2 | 1.4 | 1.7 | 1.8 | 2.3 |
| 34823_at | dipeptidyl-peptidase 4 (CD26, adenosine deaminase complexing protein 2) | DPP4 | 1.8 | 2.4 | 2.9 | 2.7 |
| 36088_at | Down syndrome critical region gene 2 | DSCR2 | −2.4 | −2.8 | −1.3 | −1.3 |
| 167_at | eukaryotic translation initiation factor 5 | EIF5 | −1.6 | −2.1 | −1.5 | −1.8 |
| 1519_at | v-ets erythroblastosis virus E26 oncogene homolog 2 (avian) | ETS2 | 1.3 | −1.5 | −1.1 | −1.9 |
| 36543_at | coagulation factor III (thromboplastin, tissue factor) | F3 | 2.6 | 1.9 | 2.7 | 2.7 |
| 1915_s_at | v-fos FBJ murine osteosarcoma viral oncogene homolog | FOS | 2.2 | 1.2 | 5.8 | 3.9 |
| 36669_at | FBJ murine osteosarcoma viral oncogene homolog B | FOSB | 1.5 | 1.2 | 5.7 | 4 |
| 39822_s_at | growth arrest and DNA-damage-inducible, beta | GADD45B | 2.7 | 1.4 | 2.3 | 1.4 |
| 290_s_at | G protein-coupled receptor 3 | GPR3 | −1.1 | −1.6 | −1.2 | −1.8 |
| 35127_at | histone cluster 1, H2ae | HIST1H2AE | 1 | 1.4 | 1.2 | 1.7 |
| 31521_f_at | histone cluster 1, H4k | HIST1H4J | 1 | 1.4 | 1 | 1.5 |
| 152_f_at | histone cluster 2, H4a | HIST2H4A | −1.6 | −1.4 | −1.5 | −1.5 |
| 38833_at | major histocompatibility complex, class II, DP alpha 1 | HLA-DPA1 | 3.3 | 2.5 | 1.4 | 1.1 |
| 38096_f_at | major histocompatibility complex, class II, DP beta 1 | HLA-DPB1 | 2.8 | 1.8 | 1.5 | 1.1 |
| 36878_f_at | major histocompatibility complex, class II, DQ beta 1 | HLA-DQB1 | 2.3 | 2.1 | 1.5 | 1.4 |
| 37039_at | major histocompatibility complex, class II, DR alpha | HLA-DRA | 2.5 | 1.8 | 1.6 | 1.2 |
| 36617_at | inhibitor of DNA binding 1, dominant negative helix-loop-helix protein | ID1 | −1.3 | −2.2 | −1 | −1.5 |
| 676_g_at | interferon induced transmembrane protein 1 (9–27) | IFITM1 | −1.4 | −1.8 | −1.9 | −2.8 |
| 41745_at | interferon induced transmembrane protein 3 (1–8U) | IFITM3 | −1.4 | −1.6 | −2.1 | −3.2 |
| 37319_at | insulin-like growth factor binding protein 3 | IGFBP3 | 1.8 | −1.4 | 1.3 | −1 |
| 36227_at | interleukin 7 receptor | IL7R | 2.5 | 2.1 | 1.6 | 1.6 |
| 35372_r_at | interleukin 8 | IL8 | 6.9 | 2.3 | 1.9 | 1.5 |
| 38545_at | inhibin, beta B (activin AB beta polypeptide) | INHBB | 3 | 2.8 | 1.5 | 1.7 |
| 36355_at | involucrin | IVL | 1.6 | 1.6 | 1.3 | 1.4 |
| 1895_at | jun oncogene | JUN | 2.5 | 1.6 | 3 | 2.3 |
| 41483_s_at | jun D proto-oncogene | JUND | 2.3 | 1.8 | 1.8 | 1.5 |
| 217_at | kallikrein-related peptidase 2 | KLK2 | 1.8 | 2.1 | 5.7 | 6.5 |
| 35118_at | lecithin-cholesterol acyltransferase | LCAT | 1.1 | −1.8 | 1.1 | −1.3 |
| 41710_at | hypothetical protein LOC54103 | LOC54103 | 1.7 | 1.5 | 1.6 | 1.5 |
| 35926_s_at | lysozyme (renal amyloidosis) | LYZ | 2.2 | 1.9 | 1.6 | 1.4 |
| 36711_at | v-maf musculoaponeurotic fibrosarcoma oncogene homolog F (avian) | MAFF | 3 | 1.8 | 1.7 | 1.2 |
| 33146_at | myeloid cell leukemia sequence 1 (BCL2-related) | MCL1 | 2 | 1.4 | 1.5 | 1.2 |
| 33241_at | microfibrillar-associated protein 3-like | MFAP3L | −1.5 | −1.9 | −1.6 | −1.8 |
| 668_s_at | matrix metallopeptidase 7 (matrilysin, uterine) | MMP7 | 2 | 1.2 | 4.1 | 2.8 |
| 870_f_at | metallothionein 3 | MT3 | −1.5 | −2.8 | −1.3 | −2 |
| 36933_at | N-myc downstream regulated gene 1 | NDRG1 | 1.6 | 1.9 | 1.5 | 1.5 |
| 37544_at | nuclear factor, interleukin 3 regulated | NFIL3 | 1.1 | −1.3 | 1.2 | −1.3 |
| 190_at | nuclear receptor subfamily 4, group A, member 3 | NR4A3 | 2.2 | 1.6 | 1.6 | 1.2 |
| 31886_at | 5’-nucleotidase, ecto (CD73) | NT5E | −1.3 | −1.8 | −1.2 | −2.1 |
| 31733_at | purinergic receptor P2X, ligand-gated ion channel, 3 | P2RX3 | 1.7 | 1.6 | 1.6 | 1.7 |
| 32210_at | phosphoglucomutase 1 | PGM1 | −1.2 | −1.8 | −1.1 | −1.5 |
| 36980_at | proline-rich nuclear receptor coactivator 1 | PNRC1 | −1 | −1.7 | −1 | −1.4 |
| 39366_at | protein phosphatase 1, regulatory (inhibitor) subunit 3C | PPP1R3C | −1.4 | −1.9 | −1.4 | −2 |
| 36159_s_at | prion protein (p27–30) (Creutzfeldt-Jakob disease, Gerstmann-Strausler-Scheinker syndrome, fatal familial insomnia) | PRNP | 1.2 | −1.2 | −1.1 | −1.7 |
| 216_at | prostaglandin D2 synthase 21 kDa (brain) | PTGDS | 2.3 | 1.8 | 1.4 | −1.2 |
| 1069_at | prostaglandin-endoperoxide synthase 2 (prostaglandin G/H synthase and cyclooxygenase) | PTGS2 | 1.4 | 1 | 2.1 | 1.2 |
| 37701_at | regulator of G-protein signalling 2, 24 kDa | RGS2 | 3.1 | 2.3 | 1.2 | −1.3 |
| 41471_at | S100 calcium binding protein A9 | S100A9 | −1.9 | −4.1 | −1.8 | −2.4 |
| 33305_at | serpin peptidase inhibitor, clade B (ovalbumin), member 1 | SERPINB1 | −1.2 | −1.6 | −1 | −1.6 |
| 36979_at | solute carrier family 2 (facilitated glucose transporter), member 3 | SLC2A3 | 1.3 | 1.1 | 1.4 | 1.1 |
| 38797_at | solute carrier family 39 (zinc transporter), member 14 | SLC39A14 | −1.7 | −2.9 | −1.4 | −1.9 |
| 38994_at | suppressor of cytokine signaling 2 | SOCS2 | 2.2 | 1.4 | 1.4 | 1.1 |
| 34666_at | superoxide dismutase 2, mitochondrial | SOD2 | −2.4 | −3.2 | −1.4 | −1.8 |
| 38763_at | sorbitol dehydrogenase | SORD | 1.7 | 1.4 | 2.4 | 2.5 |
| 38805_at | TGFB-induced factor homeobox 1 | TGIF1 | 1.6 | 1.7 | 1.5 | 1.4 |
| 39411_at | TCDD-inducible poly (ADP-ribose) polymerase | TIPARP | 2 | 1.6 | 1.8 | 1.3 |
| 1715_at | tumor necrosis factor (ligand) superfamily, member 10 | TNFSF10 | 1.6 | 1.3 | 1.9 | 1.9 |
| 904_s_at | topoisomerase (DNA) II alpha 170 kDa | TOP2A | 1.1 | 1.4 | 1.1 | 1.4 |
| 32793_at | T cell receptor beta variable 19 | TRBC1 | 1.5 | 1.4 | 1.5 | 1.4 |
| 38469_at | tetraspanin 8 | TSPAN8 | 1.7 | 2.2 | 1.6 | 1.6 |
| 40198_at | voltage-dependent anion channel 1 | VDAC1 | −1.5 | −1.4 | −1.3 | −1.5 |
| 36909_at | WEE1 homolog (S. pombe) | WEE1 | 2 | 1.5 | 1.8 | 1.5 |
| 40448_at | zinc finger protein 36, C3H type, homolog (mouse) | ZFP36 | 2.9 | 1.5 | 2.6 | 1.7 |
| 32588_s_at | zinc finger protein 36, C3H type-like 2 | ZFP36L2 | 2 | 1.2 | 1.4 | 1.1 |
| 1514_g_at | | | 1.6 | 1.7 | 3.6 | 3.2 |
| 1662_r_at | | | 1.9 | 2 | 3.6 | 4.1 |
| 40487_at | Transcribed locus | | −1.2 | −1.6 | −1.1 | −1.5 |
Prediction performance indexes (PPI) in batch I analysis. Pairwise two-group comparisons (N vs. T, N vs. A and A vs. T) are performed.
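| Comparison | Gene set | liv→liv | pro→liv | pro→pro | liv→pro |
|---|---|---|---|---|---|
| N vs. T | All genes | 96.5% | 66.3% | 93.9% | 47.4% |
| N vs. T | Common signature | 96.5% | 93.0% | 98.8% | 96.3% |
| N vs. A | All genes | 92.6% | 77.9% | 96.6% | 54.6% |
| N vs. A | Common signature | 98.2% | 96.0% | 98.3% | 96.6% |
| A vs. T | All genes | 79.9% | 51.9% | 71.4% | 55.7% |
| A vs. T | Common signature | 75.6% | 74.7% | 66.7% | 65.1% |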
Prediction performance indexes (PPI) in batch II analysis. The values shaded in grey are summarized in Table 4.
| Data sets (X / Y) | Gene set | X→X | Y→X | Y→Y | X→Y |
|---|---|---|---|---|---|
| Liver / Prostate | All genes | 96.51% | 66.28% | 93.94% | 47.36% |
| Liver / Prostate | Common signature | 97.67% | 97.67% | 95.55% | 94.14% |
| Liver / Lung | All genes | 96.51% | 56.98% | 90.72% | 45.32% |
| Liver / Lung | Common signature | 95.23% | 93.02% | 95.94% | 94.72% |
| Lung / Prostate | All genes | 90.72% | 69.03% | 93.94% | 62.88% |
| Lung / Prostate | Common signature | 94.82% | 94.45% | 79.61% | 72.76% |
| Liver / Bladder | All genes | 96.51% | 62.79% | 88.60% | 49.65% |
| Liver / Bladder | Common signature | 91.74% | 91.86% | 98.25% | 98.25% |
| Prostate / Bladder | All genes | 93.94% | 36.30% | 88.60% | 42.63% |
| Prostate / Bladder | Common signature | 92.92% | 86.86% | 97.81% | 88.25% |
| Lung / Bladder | All genes | 90.72% | 51.87% | 88.60% | 50.88% |
| Lung / Bladder | Common signature | 89.38% | 85.91% | 97.37% | 85.61% |
Batch II leave-one-out cross validation analysis result (confusion matrix).
| Data sets (X / Y) | Gene set | Genes | Predicted | X→X: true N | X→X: true T | Y→X: true N | Y→X: true T | Genes | Predicted | Y→Y: true N | Y→Y: true T | X→Y: true N | X→Y: true T |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Liver / Prostate | All genes | 69 | Predicted N | 21 | 3 | 21 | 29 | 55 | Predicted N | 23 | 8 | 19 | 58 |
| | | | Predicted T | 0 | 40 | 0 | 14 | | Predicted T | 0 | 58 | 4 | 8 |
| Liver / Prostate | Common signature | 225.9 | Predicted N | 21 | 2 | 21 | 2 | 222.0 | Predicted N | 21.3 | 1 | 21 | 2 |
| | | | Predicted T | 0 | 41 | 0 | 41 | | Predicted T | 1.7 | 65 | 2 | 64 |
| Liver / Lung | All genes | 69 | Predicted N | 21 | 3 | 21 | 37 | 57 | Predicted N | 16 | 17 | 13 | 115 |
| | | | Predicted T | 0 | 40 | 0 | 6 | | Predicted T | 1 | 117 | 4 | 19 |
| Liver / Lung | Common signature | 120.1 | Predicted N | 21 | 4.1 | 21 | 6 | 120.3 | Predicted N | 16 | 3 | 16 | 6 |
| | | | Predicted T | 0 | 38.9 | 0 | 37 | | Predicted T | 1 | 131 | 1 | 128 |
| Lung / Prostate | All genes | 57 | Predicted N | 16 | 17 | 17 | 83 | 55 | Predicted N | 23 | 8 | 23 | 49 |
| | | | Predicted T | 1 | 117 | 0 | 51 | | Predicted T | 0 | 58 | 0 | 17 |
| Lung / Prostate | Common signature | 289.5 | Predicted N | 16 | 6 | 16 | 7 | 288.8 | Predicted N | 17 | 9.7 | 15 | 13 |
| | | | Predicted T | 1 | 128 | 1 | 127 | | Predicted T | 6 | 56.3 | 8 | 53 |
| Liver / Bladder | All genes | 69 | Predicted N | 21 | 3 | 21 | 32 | 135 | Predicted N | 5 | 13 | 4 | 46 |
| | | | Predicted T | 0 | 40 | 0 | 11 | | Predicted T | 0 | 44 | 1 | 11 |
| Liver / Bladder | Common signature | 51.7 | Predicted N | 21 | 7.1 | 21 | 7 | 54.5 | Predicted N | 5 | 2 | 5 | 2 |
| | | | Predicted T | 0 | 35.9 | 0 | 36 | | Predicted T | 0 | 55 | 0 | 55 |
| Prostate / Bladder | All genes | 55 | Predicted N | 23 | 8 | 16 | 64 | 135 | Predicted N | 5 | 13 | 4 | 54 |
| | | | Predicted T | 0 | 58 | 7 | 2 | | Predicted T | 0 | 44 | 1 | 3 |
| Prostate / Bladder | Common signature | 9.5 | Predicted N | 20.3 | 1.6 | 18 | 3 | 10.1 | Predicted N | 5 | 2.5 | 4 | 2 |
| | | | Predicted T | 2.7 | 64.4 | 5 | 63 | | Predicted T | 0 | 54.5 | 1 | 55 |
| Lung / Bladder | All genes | 57 | Predicted N | 16 | 17 | 17 | 129 | 135 | Predicted N | 5 | 13 | 5 | 56 |
| | | | Predicted T | 1 | 117 | 0 | 5 | | Predicted T | 0 | 44 | 0 | 1 |
| Lung / Bladder | Common signature | 19.1 | Predicted N | 14.9 | 11.9 | 15 | 22 | 19.2 | Predicted N | 5 | 3 | 4 | 5 |
| | | | Predicted T | 2.1 | 122.1 | 2 | 112 | | Predicted T | 0 | 54 | 1 | 52 |
The confusion matrices in the gray-shaded regions are used to generate the shaded PPI values in Table 3 and the corresponding entries in Table 4.
PPI summary of within-cancer-type and inter-cancer-type predictions in batch II analysis.
| Training data | Test: Liver | Test: Prostate | Test: Lung | Test: Bladder |
|---|---|---|---|---|
| Liver | 96.5% (69) | 94.1% (225) | 94.7% (119) | 98.3% (53) |
| Prostate | 97.7% (225) | 93.9% (55) | 94.5% (288) | 88.3% (10) |
| Lung | 93.0% (119) | 72.8% (288) | 90.7% (57) | 85.6% (19) |
| Bladder | 91.9% (53) | 86.9% (10) | 85.9% (19) | 88.6% (135) |
All genes are used in the within-cancer-type prediction so that PAM can perform automatic predictive gene selection. The numbers of genes used in PAM are shown in parentheses.
In all inter-cancer-type predictions, only common signature genes are used in PAM and PAM does not perform further gene selection. The numbers of genes that appeared in more than 70% of the leave-one-out cross validations are shown in parentheses (i.e. liv-pro-MBs, liv-lun-MBs and pro-lun-MBs).
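The 70% rule mentioned above can be illustrated as a simple frequency filter over the gene lists selected in each leave-one-out fold. The sketch below uses a made-up toy input of four folds (the gene symbols serve only as labels), and the function name and `min_fraction` parameter are hypothetical; it shows the counting logic only, not the study's actual selection output.

```python
from collections import Counter

def stable_biomarkers(selected_per_fold, min_fraction=0.70):
    """Keep genes selected in more than `min_fraction` of the folds."""
    n_folds = len(selected_per_fold)
    # set(fold) ensures a gene is counted at most once per fold
    counts = Counter(g for fold in selected_per_fold for g in set(fold))
    return sorted(g for g, c in counts.items() if c / n_folds > min_fraction)

# Toy example: gene lists from four hypothetical cross-validation folds.
folds = [
    ["MT1E", "CCNB1", "AOX1"],
    ["MT1E", "CCNB1"],
    ["MT1E", "CCNB1", "TOP2A"],
    ["MT1E", "AOX1"],
]
stable = stable_biomarkers(folds)
print(stable)
```

In this toy input only the genes chosen in at least three of the four folds pass the threshold, which mirrors how a frequency cutoff discards genes whose selection depends on a single left-out sample.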
The 44 batchII-MBs overlapped by pair-wise comparisons of liver, prostate and lung data sets (liv-pro-MB, liv-lun-MB, pro-lun-MB). The first 12 genes, marked with an asterisk, also appear among the batchI-MBs. The signed mean fold change is the mean fold change of tumor versus normal when positive (up-regulation) and of normal versus tumor when negative (down-regulation).
| Probe set ID | Gene title | Gene symbol | Liver | Prostate | Lung |
|---|---|---|---|---|---|
| 39597_at* | actin binding LIM protein family, member 3 | ABLIM3 | −2.1 | −1.6 | −2 |
| 37599_at* | aldehyde oxidase 1 | AOX1 | −2.8 | −2.6 | −1.5 |
| 34736_at* | cyclin B1 | CCNB1 | 2.3 | 1.8 | 1.7 |
| 37302_at* | centromere protein F, 350/400 ka (mitosin) | CENPF | 1.9 | 1.4 | 1.4 |
| 37203_at* | carboxylesterase 1 (monocyte/macrophage serine esterase 1) | CES1 | −1.8 | −1.7 | −3 |
| 32168_s_at* | Down syndrome critical region gene 1 | DSCR1 | −2 | −1.6 | −1.8 |
| 34311_at* | glutaredoxin (thioltransferase) | GLRX | −2.5 | −1.7 | −1.5 |
| 1737_s_at* | insulin-like growth factor binding protein 4 | IGFBP4 | −1.8 | −2.5 | −1.6 |
| 609_f_at* | metallothionein 1B | MT1B | −3.6 | −2.3 | −1.5 |
| 36130_f_at* | metallothionein 1E | MT1E | −3.5 | −1.9 | −1.8 |
| 31622_f_at* | metallothionein 1F | MT1F | −2.9 | −2.3 | −1.8 |
| 39594_f_at* | metallothionein 1H | MT1H | −3.2 | −2.4 | −1.7 |
| 35699_at | BUB1 budding uninhibited by benzimidazoles 1 homolog beta (yeast) | BUB1B | 1.5 | 1.4 | 1.3 |
| 38796_at | complement component 1, q subcomponent, B chain | C1QB | −2.4 | −1.4 | −2.3 |
| 35276_at | claudin 4 | CLDN4 | 1.4 | 2.4 | 1.4 |
| 36668_at | cytochrome b5 reductase 3 | CYB5R3 | −1.4 | −1.4 | −1.5 |
| 33295_at | Duffy blood group, chemokine receptor | DARC | −1.7 | −2.9 | −1.4 |
| 41225_at | dual specificity phosphatase 3 (vaccinia virus phosphatase VH1 related) | DUSP3 | −1.4 | −1.4 | −1.5 |
| 38052_at | coagulation factor XIII, A1 polypeptide | F13A1 | −1.7 | −2.1 | −1.5 |
| 37743_at | fasciculation and elongation protein zeta 1 (zygin I) | FEZ1 | −1.5 | −1.6 | −2 |
| 38326_at | G0/G1switch 2 | G0S2 | −3.1 | −2.2 | −1.7 |
| 1597_at | growth arrest-specific 6 | GAS6 | −1.6 | −2 | −1.7 |
| 411_i_at | interferon induced transmembrane protein 2 (1–8D) | IFITM2 | −1.6 | −2 | −1.4 |
| 37484_at | integrin, alpha 1 | ITGA1 | −1.6 | −1.4 | −1.3 |
| 38116_at | KIAA0101 | KIAA0101 | 2.2 | 1.6 | 1.3 |
| 37883_i_at | Hypothetical gene supported by AK096951 | LOC400879 | 1.5 | 1.7 | 1.4 |
| 242_at | microtubule-associated protein 4 | MAP4 | −1.4 | −1.6 | −1.4 |
| 31623_f_at | metallothionein 1A | MT1A | −3.5 | −2.6 | −1.4 |
| 39081_at | metallothionein 2A | MT2A | −2 | −2.5 | −2.1 |
| 37736_at | protein-L-isoaspartate (D-aspartate) O-methyltransferase | PCMT1 | −1.6 | −1.3 | −1.3 |
| 35752_s_at | protein S (alpha) | PROS1 | −2.2 | −1.7 | −2 |
| 34163_g_at | RNA binding protein with multiple splicing | RBPMS | −1.5 | −2.4 | −1.4 |
| 34887_at | Radixin | RDX | −1.5 | −1.4 | −1.7 |
| 39150_at | ring finger protein 11 | RNF11 | −1.6 | −1.4 | −1.3 |
| 41096_at | S100 calcium binding protein A8 | S100A8 | −3.6 | −2.3 | −3.3 |
| 33443_at | serine incorporator 1 | SERINC1 | −1.8 | −1.5 | −1.6 |
| 39775_at | serpin peptidase inhibitor, clade G (C1 inhibitor), member 1, (angioedema, hereditary) | SERPING1 | −1.6 | −2.2 | −2 |
| 1798_at | solute carrier family 39 (zinc transporter), member 6 | SLC39A6 | 1.4 | 1.6 | 1.4 |
| 33131_at | SRY (sex determining region Y)-box 4 | SOX4 | 2.5 | 1.8 | 1.8 |
| 40419_at | stomatin | STOM | −1.5 | −1.8 | −2 |
| 1897_at | transforming growth factor, beta receptor III | TGFBR3 | −1.4 | −1.7 | −2.5 |
| 38404_at | transglutaminase 2 (C polypeptide, protein-glutamine-gamma-glutamyltransferase) | TGM2 | −2.9 | −1.8 | −1.8 |
| 40145_at | topoisomerase (DNA) II alpha 170 kDa | TOP2A | 1.6 | 1.7 | 2 |
| 35720_at | WD repeat domain 47 | WDR47 | −2.4 | −1.5 | −1.3 |
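The signed mean fold change convention used in the table above can be written as a small function. This is a sketch under the assumption that the group means are on the raw (non-log) intensity scale, which the caption does not state; the function name and arguments are illustrative.

```python
def signed_fold_change(mean_tumor, mean_normal):
    """Signed fold change of tumor versus normal.

    Positive values give tumor/normal (up-regulation); negative values give
    -(normal/tumor) (down-regulation). Assumes positive raw-scale means.
    """
    if mean_tumor <= 0 or mean_normal <= 0:
        raise ValueError("group means must be positive")
    ratio = mean_tumor / mean_normal
    return ratio if ratio >= 1 else -1.0 / ratio

up = signed_fold_change(mean_tumor=200.0, mean_normal=100.0)    # tumor higher
down = signed_fold_change(mean_tumor=100.0, mean_normal=250.0)  # tumor lower
print(up, down)
```

With this convention the magnitude is always at least 1, so values between −1 and 1 cannot occur, and a table entry of 1 or −1 indicates no change after rounding.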
Figure 6. MDS plot of existing training data sets and independent prostate cancer data. Three MDS plots show the existing liver, prostate and lung training data sets, respectively, together with the 23 independent prostate tumor samples. The mixing of the 23 tumor samples with the old tumor samples excludes the possibility that the high accuracy arose accidentally from study differences.
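For readers unfamiliar with multidimensional scaling (MDS), the sketch below shows classical (Torgerson) MDS on synthetic data. The caption does not say which MDS variant or distance measure was used, so the choice of classical MDS with Euclidean distances is an assumption, and the function name is hypothetical.

```python
import numpy as np

def classical_mds(X, n_components=2):
    """Embed the rows of X so that Euclidean distances are preserved as well
    as possible in n_components dimensions (classical/Torgerson MDS)."""
    sq = np.sum(X ** 2, axis=1)
    D2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)   # squared distances
    n = X.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n                # centering matrix
    B = -0.5 * J @ D2 @ J                              # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(B)                     # eigenvalues in ascending order
    order = np.argsort(vals)[::-1][:n_components]      # keep the largest ones
    vals = np.clip(vals[order], 0.0, None)             # guard against tiny negatives
    return vecs[:, order] * np.sqrt(vals)

# Synthetic stand-in for an expression matrix: 10 samples, 5 genes.
rng = np.random.default_rng(0)
X = rng.normal(size=(10, 5))
Y = classical_mds(X, n_components=5)    # full-rank embedding keeps all distances
Y2 = classical_mds(X, n_components=2)   # 2-D coordinates suitable for plotting
```

In a plot of the two-dimensional coordinates, samples from a new study that fall among the existing samples of the same class suggest that study-specific effects are not what drives the classification.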
Batch I leave-one-out cross validation analysis result (confusion matrix).
| Comparison | Gene set | Genes | Predicted | liv→liv: true 1st class | liv→liv: true 2nd class | pro→liv: true 1st class | pro→liv: true 2nd class | Genes | Predicted | pro→pro: true 1st class | pro→pro: true 2nd class | liv→pro: true 1st class | liv→pro: true 2nd class |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| N vs. T | All genes | 69 | Predicted N | 21 | 3 | 21 | 29 | 55 | Predicted N | 23 | 8 | 19 | 58 |
| | | | Predicted T | 0 | 40 | 0 | 14 | | Predicted T | 0 | 58 | 4 | 8 |
| N vs. T | Common signature | 111.3 | Predicted N | 21 | 3 | 20 | 4 | 111.9 | Predicted N | 23 | 1.6 | 22 | 2 |
| | | | Predicted T | 0 | 40 | 1 | 39 | | Predicted T | 0 | 64.4 | 1 | 64 |
| N vs. A | All genes | 66 | Predicted N | 20 | 3 | 18 | 9 | 63 | Predicted N | 23 | 4 | 15 | 33 |
| | | | Predicted A | 1 | 27 | 3 | 21 | | Predicted A | 0 | 55 | 8 | 26 |
| N vs. A | Common signature | 111.2 | Predicted N | 21 | 1.1 | 20 | 1 | 110.4 | Predicted N | 23 | 2 | 23 | 4 |
| | | | Predicted A | 0 | 28.9 | 1 | 29 | | Predicted A | 0 | 57 | 0 | 55 |
| A vs. T | All genes | 64 | Predicted A | 27 | 13 | 13 | 17 | 266 | Predicted A | 44 | 21 | 46 | 44 |
| | | | Predicted T | 3 | 30 | 17 | 26 | | Predicted T | 15 | 45 | 13 | 22 |
| A vs. T | Common signature | 111.5 | Predicted A | 27 | 16.7 | 26 | 16 | 112.0 | Predicted A | 42 | 25 | 41 | 26 |
| | | | Predicted T | 3 | 26.3 | 4 | 27 | | Predicted T | 17 | 41 | 18 | 40 |
The numbers marked in dark gray are the number of genes used to construct the prediction model. When “all genes” are used, the PAM method performs automatic gene selection to construct the model. When “common signature genes” are used, no gene selection is performed in PAM and the results (number of genes and confusion matrix) shown are averages of leave-one-out cross-validation results.