Mark Burton, Mads Thomassen, Qihua Tan, Torben A Kruse.
Abstract
BACKGROUND: The popularity of microarray applications in cancer research has led to the development of many predictive or prognostic gene expression profiles. However, the diversity of microarray platforms has made full validation of such profiles and their related gene lists across studies difficult, and their classification accuracies have rarely been validated in multiple independent datasets. Frequently, although the individual genes in such lists do not match, genes with the same function are included across the lists. The development of such lists does not take into account that genes can be grouped together as metagenes (MGs) based on common characteristics such as pathways, regulation, or genomic location. Such MGs might be used as features in building a predictive model applicable to classifying independent data. There is therefore a need to systematically compare the independent validation of gene lists or classifiers based on metagene or individual gene (SG) features.
Keywords: breast cancer; classification; metagenes; microarray
Year: 2012 PMID: 23304070 PMCID: PMC3529607 DOI: 10.4137/CIN.S10375
Source DB: PubMed Journal: Cancer Inform ISSN: 1176-9351
Figure 1. Metagene and single gene selection procedure.
Notes: MGs (blue) and SGs (red) were both derived from the same eight breast cancer gene expression datasets. These covered 32418 genes. 1057 gene lists were defined from these 32418 genes/probes. These were subjected to gene set enrichment analysis (GSEA), ranked within each dataset according to their signal-to-noise ratio, and their mean rank across datasets was calculated. This mean rank was tested for significance as described in the Materials and Methods section, resulting in 71 metagenes, each scored by the median gene expression of its GSEA leading edge genes. The single genes were selected by directly ranking each gene/probe across the datasets and then following the same procedure as for the metagenes, resulting in 283 significant single genes. The measure for each single gene is its gene expression level.
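The ranking and scoring steps in this caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the signal-to-noise formula follows the common GSEA convention (difference of class means divided by the sum of class standard deviations), which is an assumption here, and the significance test on the mean rank is not reproduced. All function names and the toy values are hypothetical.

```python
from statistics import mean, median, stdev

def signal_to_noise(pos, neg):
    """GSEA-style signal-to-noise ratio (assumed definition, not stated in the source)."""
    return (mean(pos) - mean(neg)) / (stdev(pos) + stdev(neg))

def rank_descending(scores):
    """Map each feature name to its rank within one dataset (1 = highest score)."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: i + 1 for i, name in enumerate(ordered)}

def mean_rank(per_dataset_scores):
    """Average each feature's within-dataset rank over all datasets."""
    ranks = [rank_descending(s) for s in per_dataset_scores]
    return {name: mean(r[name] for r in ranks) for name in ranks[0]}

def metagene_score(sample_expression, leading_edge):
    """Median expression of the leading-edge genes in one sample."""
    return median(sample_expression[g] for g in leading_edge)

# Toy example: two datasets scoring the same three (hypothetical) features.
dataset_a = {"g1": 1.2, "g2": 0.3, "g3": -0.5}
dataset_b = {"g1": 0.8, "g2": 0.9, "g3": -0.1}
ranks = mean_rank([dataset_a, dataset_b])

# One sample's expression values, summarised as a single metagene value.
sample = {"g1": 2.0, "g2": 5.0, "g3": 3.0}
score = metagene_score(sample, ["g1", "g2", "g3"])
```

Using the median rather than the mean makes the metagene value less sensitive to a single outlying gene, which may be one reason the caption specifies it.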
Overview of datasets.
| Dataset | Chip | Probes (K) | Patients | Outcome | Treatment | Define MG | Define SG | Train | Test | Ref. |
|---|---|---|---|---|---|---|---|---|---|---|
| Amsterdam | Agilent/Rosetta | 25 | 295, N+, N− | DM | None, et, ct | √ | √ | [ | ||
| Amsterdam (AG1) (subset of the above) | Agilent/Rosetta | 25 | 151, N− | DM | None | √ | √ | √ | [ | |
| Rotterdam (AF1) | Affymetrix HG-133A | 22 | 286, N− | DM | None | √ | √ | √ | [ | |
| HUMAC | Spotted oligonucleotides | 29 | 60, N− | ME | None | √ | √ | [ | ||
| Huang | Affymetrix 95av2 | 12 | 52, N+ | RE | Ct | √ | √ | [ | ||
| Sotiriou 2003 | Spotted cDNA | 7.6 | 99, N+/N− | RE | Et, ct | √ | √ | [ | ||
| Sotiriou 2006 | Affymetrix HG-133A | 22 | 179, N+/N− | DM | Et | √ | √ | [ | ||
| Uppsala | Affymetrix HG-133A+B | 44 | 236, N+/N− | DF | None, ct, et | √ | √ | [ | ||
| Stockholm | Affymetrix HG-133A+B | 44 | 159, N+/N− | RE | None, ct, et | √ | √ | [ | ||
| TRANSBIG (AF2) | Affymetrix HG-133A | 22 | 147, N− | DM | None | √ | √ | [ | ||
| Mainz (AF3) | Affymetrix HG-133A | 22 | 200, N− | DM | None | √ | √ | [ |
Notes: The table shows name of dataset, microarray chip, number of probes, patients, outcome, patient treatment, datasets used to define features and for training and testing and the references.
designate used as training only when validating AF3;
designate used for training only when validating AF2.
Abbreviations: K, thousands; N+ and N−, node-positive and -negative patients; DM, distant metastasis; ME, metastasis; RE, relapse; DF, death from breast cancer; et, endocrine therapy; ct, chemotherapy; none, no adjuvant therapy.
The number of metagene and single gene features in the 24 models.
| Method | AG1 #MG | AG1 #SG | AF1 #MG | AF1 #SG | AF2 #MG | AF2 #SG | AF3 #MG | AF3 #SG |
|---|---|---|---|---|---|---|---|---|
| RF | 4 | 21 | 15 | 21 | 14 | 14 | 21 | 26 |
| R-SVM | 18 | 20 | 57 | 25 | 5 | 22 | 10 | 71 |
| S-SVM | 29 | 17 | 67 | 35 | 9 | 19 | 64 | 122 |
Figure 2. The internal dataset validation procedure.
Notes: For both types of features, the entire training set was used to rank each feature by its random forest importance value. This rank was used for feature selection, adding one feature at a time from the top 2 up to the top 71 (for MGs) or top 283 (for SGs), so that a classifier with a fixed number of features was tested in each round. The performance of each classifier was evaluated using 10 times repeated 10-fold cross validation. For the same combination of training data and classification method, the mean 10 times repeated 10-fold cross validation performance of the MG-classifier and the SG-classifier were compared with each other.
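The loop described in this caption can be sketched as follows. This is an illustration under stated assumptions rather than the study's implementation: a nearest-centroid rule stands in for RF and the SVMs, the feature ranking is taken as given instead of being computed from random forest importance, the folds are stratified by class (not stated in the source), and predictions are pooled within each repeat before the balanced accuracy is computed. All names are hypothetical.

```python
import random

def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity (class 1) and specificity (class 0)."""
    pos = [p == t for t, p in zip(y_true, y_pred) if t == 1]
    neg = [p == t for t, p in zip(y_true, y_pred) if t == 0]
    return 0.5 * (sum(pos) / len(pos) + sum(neg) / len(neg))

def nearest_centroid_predict(X_train, y_train, x):
    """Stand-in classifier (NOT RF/SVM): assign x to the closer class centroid."""
    best_label, best_dist = None, float("inf")
    for label in (0, 1):
        rows = [r for r, lab in zip(X_train, y_train) if lab == label]
        centroid = [sum(col) / len(col) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def stratified_folds(y, k, rng):
    """Deal shuffled sample indices of each class round-robin into k folds."""
    folds = [[] for _ in range(k)]
    for label in (0, 1):
        idx = [i for i, lab in enumerate(y) if lab == label]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds

def repeated_cv_bacc(X, y, k=10, repeats=10, seed=0):
    """Mean balanced accuracy over `repeats` runs of k-fold cross validation."""
    rng = random.Random(seed)
    scores = []
    for _ in range(repeats):
        y_pred = [None] * len(y)
        for fold in stratified_folds(y, k, rng):
            held_out = set(fold)
            train = [i for i in range(len(y)) if i not in held_out]
            X_tr, y_tr = [X[i] for i in train], [y[i] for i in train]
            for i in fold:
                y_pred[i] = nearest_centroid_predict(X_tr, y_tr, X[i])
        scores.append(balanced_accuracy(y, y_pred))
    return sum(scores) / len(scores)

def sweep_top_k(X, y, ranked_features, max_k):
    """Evaluate classifiers built on the top 2, 3, ..., max_k ranked features."""
    results = {}
    for n in range(2, max_k + 1):
        cols = ranked_features[:n]
        X_sub = [[row[c] for c in cols] for row in X]
        results[n] = repeated_cv_bacc(X_sub, y)
    return results
```

Calling `sweep_top_k(X, y, ranked_features, 71)` would mirror the MG sweep, and `max_k=283` the SG sweep; the feature count with the best mean balanced accuracy would then define the classifier carried forward.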
Figure 3. Internal classification performance.
Notes: The 10 times repeated 10-fold cross validation balanced accuracies (bAcc) within the four datasets (AG1, AF1, AF2, and AF3), using random forest (RF) or support vector machines with a radial (R-SVM) or sigmoid kernel (S-SVM), or averaged across the three classification methods (Overall), are shown in blue for MGs and red for SGs.
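For readers unfamiliar with the metric, balanced accuracy is conventionally the mean of sensitivity and specificity, which is consistent with the Sen, Spe and bAcc columns reported in the tables of this record. A minimal sketch (the function names are illustrative):

```python
def sensitivity(tp, fn):
    """Fraction of true positives among all actual positives."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Fraction of true negatives among all actual negatives."""
    return tn / (tn + fp)

def balanced_accuracy(sen, spe):
    """Mean of sensitivity and specificity (same units as the inputs)."""
    return (sen + spe) / 2

# Internal results, MG features with RF: Sen 73, Spe 69.
bacc_mg_rf = balanced_accuracy(73, 69)  # 71.0, matching the reported bAcc of 71
```

Unlike plain accuracy, this measure is not inflated when one outcome class is much larger than the other, which matters in datasets where most patients do not relapse.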
Figure 4. Between-study classifier or feature set validation. (A) Validation between different platforms. The best classifiers developed from the training set (AG1) are either applied directly (transfer classifier) and validated in the independent test data (AF2 or AF3), or only the features from the best classifier are used within the test data for model building and testing by leave-one-out cross validation (LOOCV). In each case, the performance of the MG- (blue) and SG-classifier (red) or feature set is compared using the same training data, classification method, and test data. (B) Validation between similar platforms. The best classifiers developed from the training set (AF1, AF2, or AF3) are either applied directly (transfer classifier) and validated in the independent test data (AF2 or AF3), or only the features from the best classifier are used within the test data for model building and testing by LOOCV. In each case, the performance of the MG- and SG-classifier or feature set is compared using the same training data, classification method, and test data.
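The difference between the two validation routes can be made concrete with a small sketch. A nearest-centroid rule is used here as a stand-in for RF and the SVMs, and the toy test data are shifted by a constant offset purely to mimic a platform difference; both choices are illustrative assumptions, not results or methods from the study.

```python
def fit_centroids(X, y):
    """Stand-in model (NOT RF/SVM): one mean vector per class."""
    model = {}
    for label in set(y):
        rows = [r for r, lab in zip(X, y) if lab == label]
        model[label] = [sum(col) / len(col) for col in zip(*rows)]
    return model

def predict(model, x):
    """Return the label of the closest class centroid."""
    return min(model, key=lambda lab: sum((a - b) ** 2 for a, b in zip(x, model[lab])))

def select(X, features):
    """Keep only the chosen feature columns."""
    return [[row[f] for f in features] for row in X]

def transfer_classifier(X_train, y_train, X_test, features):
    """Train on the training dataset, then apply the frozen model to the test dataset."""
    model = fit_centroids(select(X_train, features), y_train)
    return [predict(model, x) for x in select(X_test, features)]

def feature_set_loocv(X_test, y_test, features):
    """Export only the features; rebuild and test the model within the test data by LOOCV."""
    Xs = select(X_test, features)
    preds = []
    for i in range(len(Xs)):
        X_rest = Xs[:i] + Xs[i + 1:]
        y_rest = y_test[:i] + y_test[i + 1:]
        preds.append(predict(fit_centroids(X_rest, y_rest), Xs[i]))
    return preds

# Toy data: the test set has the same class structure but a +5 offset.
X_train, y_train = [[0.0], [1.0], [4.0], [5.0]], [0, 0, 1, 1]
X_test, y_test = [[5.0], [6.0], [9.0], [10.0]], [0, 0, 1, 1]

transferred = transfer_classifier(X_train, y_train, X_test, [0])
relearned = feature_set_loocv(X_test, y_test, [0])
```

In this toy case the transferred model assigns every test sample to class 1, while the LOOCV route recovers the labels, because only the latter re-estimates the decision rule on the test platform's scale.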
Figure 5. Exported feature set classification performance.
Notes: The mean balanced accuracies for the between-platform validation (AG1 vs. AF2 or AF3) and the validations within similar platforms (AF1 vs. AF2 or AF3, or AF2 vs. AF3 and vice versa), using random forest (RF) or support vector machines with a radial (R-SVM) or sigmoid kernel (S-SVM), or averaged across the three classification methods (Across), are shown in blue for MGs and red for SGs. The P values indicate the significance of the difference in classification performance, under down-sampled testing, between exported MG- and SG-feature sets used for model building and testing in independent data.
Figure 6. Exported classifier classification performance.
Notes: The mean balanced accuracies for the between-platform classifier validation (AG1 vs. AF2 or AF3) and the classifier validations within similar platforms (AF1 vs. AF2 or AF3, or AF2 vs. AF3 and vice versa), using random forest (RF) or support vector machines with a radial (R-SVM) or sigmoid kernel (S-SVM), or averaged across the three classification methods (Across), are shown in blue for MGs and red for SGs. The P values indicate the significance of the difference in classification performance between MG- and SG-classifiers under down-sampled testing.
List of the 71 metagenes.
| Metagene | Type | # genes |
|---|---|---|
| 12q13 | Chromosome region | 28 |
| 14q24 | Chromosome region | 18 |
| 16q22 | Chromosome region | 23 |
| 16q24 | Chromosome region | 14 |
| 17q23 | Chromosome region | 13 |
| 17q25 | Chromosome region | 16 |
| 1p31 | Chromosome region | 14 |
| 1q42 | Chromosome region | 24 |
| 20q11 | Chromosome region | 10 |
| 20q13 | Chromosome region | 29 |
| 5q14 | Chromosome region | 6 |
| 5q33 | Chromosome region | 7 |
| 8p21 | Chromosome region | 14 |
| 8q22 | Chromosome region | 12 |
| 8q24 | Chromosome region | 21 |
| ACTINYPATHWAY | Biological pathway | 14 |
| AMINOACYL_TRNA_BIOSYNTHESIS | Biological pathway | 8 |
| ARAPPATHWAY | Biological pathway | 5 |
| ATRBRCAPATHWAY | Biological pathway | 10 |
| BETA_ALANINE_METABOLISM | Biological pathway | 11 |
| CELL_CYCLE_KEGG | Biological pathway | 39 |
| DNA_REPLICATION_REACTOME | Biological pathway | 19 |
| EGFPATHWAY | Biological pathway | 8 |
| ELECTRON_TRANSPORT_CHAIN | Biological pathway | 39 |
| ERBB2_GRB7 | Biological pathway | 2 |
| FATTY_ACID_METABOLISM | Biological pathway | 20 |
| FRUCTOSE_AND_MANNOSE_METABOLISM | Biological pathway | 10 |
| G2PATHWAY | Biological pathway | 11 |
| GCCATNTTG_V$YY1_Q6 | Transcription factor binding motif | 65 |
| GLEEVECPATHWAY | Biological pathway | 7 |
| GLYCEROLIPID_METABOLISM | Biological pathway | 14 |
| GLYCOLYSIS_AND_GLUCONEOGENESIS | Biological pathway | 12 |
| GPCRPATHWAY | Biological pathway | 8 |
| HISTIDINE_METABOLISM | Biological pathway | 11 |
| Il-12 | Biological pathway | 8 |
| MRNA_PROCESSING_REACTOME | Biological pathway | 24 |
| MRPPATHWAY | Biological pathway | 3 |
| NUCLEAR_RECEPTORS | Biological pathway | 12 |
| OXIDATIVE_PHOSPHORYLATION | Biological pathway | 26 |
| PDGFPATHWAY | Biological pathway | 7 |
| PENTOSE_PHOSPHATE_PATHWAY | Biological pathway | 11 |
| PPARAPATHWAY | Biological pathway | 10 |
| PROTEASOME_DEGRADATION | Biological pathway | 18 |
| PURINE_METABOLISM | Biological pathway | 28 |
| PYRIMIDINE_METABOLISM | Biological pathway | 23 |
| RNA_TRANSCRIPTION_REACTOME | Biological pathway | 9 |
| S1P_SIGNALING | Biological pathway | 6 |
| S1P54_01 | Biological pathway | 53 |
| TGASTMAGC_V$NFE2_01 | Transcription factor binding motif | 35 |
| TNFR2 | Biological pathway | 9 |
| TOLLPATHWAY | Biological pathway | 10 |
| UBIQUITIN_MEDIATED_PROTEOLYSIS | Biological pathway | 2 |
| V$AP1_01 | Transcription factor binding motif | 39 |
| V$AP2_Q3 | Transcription factor binding motif | 33 |
| V$ARNT_02 | Transcription factor binding motif | 34 |
| V$BACH1_01 | Transcription factor binding motif | 50 |
| V$CETS1P54_01 | Transcription factor binding motif | 53 |
| V$COUP_DR1_Q6 | Transcription factor binding motif | 29 |
| V$E2F_Q6_01 | Transcription factor binding motif | 52 |
| V$ELK1_02 | Transcription factor binding motif | 38 |
| V$ER_Q6_02 | Transcription factor binding motif | 25 |
| V$GABP_B | Transcription factor binding motif | 20 |
| V$HIF1_Q5 | Transcription factor binding motif | 27 |
| V$MYCMAX_B | Transcription factor binding motif | 54 |
| V$NFY_Q6 | Transcription factor binding motif | 22 |
| V$NRF1_Q6 | Transcription factor binding motif | 35 |
| V$NRF2_01 | Transcription factor binding motif | 35 |
| V$SP1_Q6_01 | Transcription factor binding motif | 26 |
| V$USF2_Q6 | Transcription factor binding motif | 34 |
| VALINE_LEUCINE_AND_ISOLEUCINE_DEGRADATION | Biological pathway | 15 |
| VEGFPATHWAY | Biological pathway | 9 |
Notes: The first column shows the name of the metagenes. The second column shows whether the metagene covers a biological pathway, chromosomal region or genes sharing a specific transcription factor binding motif. # genes lists the number of genes underlying the final metagene.
List of the 283 single genes.
| Gene symbol | Description |
|---|---|
| ABCA5 | ATP-binding cassette, sub-family A (ABC1), member 5 |
| ABCA8 | ATP-binding cassette, sub-family A (ABC1), member 8 |
| ABCC10 | ATP-binding cassette, sub-family C (CFTR/MRP), member 10 |
| ABCC5 | ATP-binding cassette, sub-family C (CFTR/MRP), member 5 |
| ABTB2 | Ankyrin repeat and BTB (POZ) domain containing 2 |
| ACD | Adrenocortical dysplasia homolog (mouse) |
| ADFP | Adipose differentiation-related protein |
| ADH1B | Alcohol dehydrogenase IB (class I), beta polypeptide |
| ADRA2A | Adrenergic, alpha-2A-, receptor |
| ADRM1 | Adhesion regulating molecule 1 |
| ALDH1A1 | Aldehyde dehydrogenase 1 family, member A1 |
| ALDH2 | Aldehyde dehydrogenase 2 family (mitochondrial) |
| ALDH6A1 | Aldehyde dehydrogenase 6 family, member A1 |
| APOD | Apolipoprotein D |
| ARHGEF6 | Rac/Cdc42 guanine nucleotide exchange factor (GEF) 6 |
| ATP1B3 | ATPase, Na+/K+ transporting, beta 3 polypeptide |
| ATP2A2 | ATPase, Ca++ transporting, cardiac muscle, slow twitch 2 |
| ATP9A | ATPase, Class II, type 9A |
| AURKB | Aurora kinase B |
| BARD1 | BRCA1 associated RING domain 1 |
| BCL2 | B-cell CLL/lymphoma 2 |
| BCL2L1 | BCL2-like 1 |
| BRCA1 | Breast cancer 1, early onset |
| BUB1 | BUB1 budding uninhibited by benzimidazoles 1 homolog (yeast) |
| BUB1B | BUB1 budding uninhibited by benzimidazoles 1 homolog beta (yeast) |
| C6 | Complement component 6 |
| C7ORF24 | Chromosome 7 open reading frame 24 |
| CACNA1D | Calcium channel, voltage-dependent, L type, alpha 1D subunit |
| CARS | Cysteinyl-tRNA synthetase |
| CAT | Catalase |
| CCNA2 | Cyclin A2 |
| CCNB1 | Cyclin B1 |
| CCNB2 | Cyclin B2 |
| CCNE2 | Cyclin E2 |
| CCNF | Cyclin F |
| CCT5 | Chaperonin containing TCP1, subunit 5 (epsilon) |
| CCT6A | Chaperonin containing TCP1, subunit 6A (zeta 1) |
| CD44 | CD44 molecule (Indian blood group) |
| CDC2 | Cell division cycle 2, G1 to S and G2 to M |
| CDC20 | CDC20 cell division cycle 20 homolog (S. cerevisiae) |
| CDC25B | Cell division cycle 25B |
| CDC25C | Cell division cycle 25C |
| CDC34 | Cell division cycle 34 |
| CDC45L | CDC45 cell division cycle 45-like (S. cerevisiae) |
| CDK8 | Cyclin-dependent kinase 8 |
| CDKN3 | Cyclin-dependent kinase inhibitor 3 (CDK2-associated dual specificity phosphatase) |
| CDO1 | Cysteine dioxygenase, type I |
| CENPE | Centromere protein E, 312 kDa |
| CENPF | Centromere protein F, 350/400 kDa (mitosin) |
| CH25H | Cholesterol 25-hydroxylase |
| CHAF1B | Chromatin assembly factor 1, subunit B (p60) |
| CIRBP | Cold inducible RNA binding protein |
| CKAP5 | Cytoskeleton associated protein 5 |
| CKS2 | CDC28 protein kinase regulatory subunit 2 |
| CNN3 | Calponin 3, acidic |
| CNTN1 | Contactin 1 |
| CP | Ceruloplasmin (ferroxidase) |
| CREBL2 | cAMP responsive element binding protein-like 2 |
| CRIM1 | Cysteine rich transmembrane BMP regulator 1 (chordin-like) |
| CSE1L | CSE1 chromosome segregation 1-like (yeast) |
| CSTF1 | Cleavage stimulation factor, 3′ pre-RNA, subunit 1, 50 kDa |
| CTPS | CTP synthase |
| CTSD | Cathepsin D (lysosomal aspartyl peptidase) |
| CTSL | Cathepsin L |
| CX3CR1 | Chemokine (C-X3-C motif) receptor 1 |
| CYP4B1 | Cytochrome P450, family 4, subfamily B, polypeptide 1 |
| CYP4F12 | Cytochrome P450, family 4, subfamily F, polypeptide 12 |
| DDIT4 | DNA-damage-inducible transcript 4 |
| DDX39 | DEAD (Asp-Glu-Ala-Asp) box polypeptide 39 |
| DLG7 | Discs, large homolog 7 (Drosophila) |
| DLX2 | Distal-less homeobox 2 |
| DOCK1 | Dedicator of cytokinesis 1 |
| DPT | Dermatopontin |
| DUSP1 | Dual specificity phosphatase 1 |
| DUSP4 | Dual specificity phosphatase 4 |
| DYRK2 | Dual-specificity tyrosine-(Y)-phosphorylation regulated kinase 2 |
| EBP | Emopamil binding protein (sterol isomerase) |
| EDG1 | Endothelial differentiation, sphingolipid G-protein-coupled receptor, 1 |
| EGR2 | Early growth response 2 (Krox-20 homolog, Drosophila) |
| ELOVL5 | ELOVL family member 5, elongation of long chain fatty acids (FEN1/Elo2, SUR4/Elo3-like, yeast) |
| ENPP2 | Ectonucleotide pyrophosphatase/phosphodiesterase 2 (autotaxin) |
| EPHX2 | Epoxide hydrolase 2, cytoplasmic |
| ESPL1 | Extra spindle poles like 1 (S. cerevisiae) |
| EVPL | Envoplakin |
| EXO1 | Exonuclease 1 |
| EZH2 | Enhancer of zeste homolog 2 (Drosophila) |
| F3 | Coagulation factor III (thromboplastin, tissue factor) |
| FADD | Fas (TNFRSF6)-associated via death domain |
| FANCG | Fanconi anemia, complementation group G |
| FAS | Fas (TNF receptor superfamily, member 6) |
| FBLN1 | Fibulin 1 |
| FBLN5 | Fibulin 5 |
| FCER1A | Fc fragment of IgE, high affinity I, receptor for; alpha polypeptide |
| FEN1 | Flap structure-specific endonuclease 1 |
| FGL2 | Fibrinogen-like 2 |
| FLJ22531 | – |
| FMO2 | Flavin containing monooxygenase 2 (non-functional) |
| FOS | v-fos FBJ murine osteosarcoma viral oncogene homolog |
| FOXM1 | Forkhead box M1 |
| FRZB | Frizzled-related protein |
| FUCA1 | Fucosidase, alpha-L-1, tissue |
| GABARAP | GABA(A) receptor-associated protein |
| GAD1 | Glutamate decarboxylase 1 (brain, 67 kDa) |
| GALK1 | Galactokinase 1 |
| GEM | GTP binding protein overexpressed in skeletal muscle |
| GGCX | Gamma-glutamyl carboxylase |
| GLA | Galactosidase, alpha |
| GLI1 | Glioma-associated oncogene homolog 1 (zinc finger protein) |
| GMPS | Guanine monophosphate synthetase |
| GNG11 | Guanine nucleotide binding protein (G protein), gamma 11 |
| GNG12 | Guanine nucleotide binding protein (G protein), gamma 12 |
| GPSM2 | G-protein signalling modulator 2 (AGS3-like, C. elegans) |
| GRIK1 | Glutamate receptor, ionotropic, kainate 1 |
| GSTM3 | Glutathione S-transferase M3 (brain) |
| GUK1 | Guanylate kinase 1 |
| GYS2 | Glycogen synthase 2 (liver) |
| H2AFZ | H2A histone family, member Z |
| HIST1H3D | Histone cluster 1, H3d |
| HMGB2 | High-mobility group box 2 |
| HMMR | Hyaluronan-mediated motility receptor (RHAMM) |
| HNMT | Histamine N-methyltransferase |
| HNRPAB | Heterogeneous nuclear ribonucleoprotein A/B |
| HNRPH2 | Heterogeneous nuclear ribonucleoprotein H2 (H′) |
| HPN | Hepsin (transmembrane protease, serine 1) |
| HPRT1 | Hypoxanthine phosphoribosyltransferase 1 (Lesch-Nyhan syndrome) |
| IFNGR2 | Interferon gamma receptor 2 (interferon gamma transducer 1) |
| IGFBP4 | Insulin-like growth factor binding protein 4 |
| IQGAP2 | IQ motif containing GTPase activating protein 2 |
| ITM2A | Integral membrane protein 2A |
| ITPR1 | Inositol 1,4,5-triphosphate receptor, type 1 |
| JAK2 | Janus kinase 2 (a protein tyrosine kinase) |
| KCTD12 | Potassium channel tetramerisation domain containing 12 |
| KIF11 | Kinesin family member 11 |
| KIF13B | Kinesin family member 13B |
| KIF14 | Kinesin family member 14 |
| KIF2C | Kinesin family member 2C |
| KIFC1 | Kinesin family member C1 |
| KIAA0101 | KIAA0101 |
| KIAA0247 | KIAA0247 |
| KIAA0286 | – |
| KIAA0319 | KIAA0319 |
| LAMA2 | Laminin, alpha 2 (merosin, congenital muscular dystrophy) |
| LARP1 | La ribonucleoprotein domain family, member 1 |
| LEP | Leptin (obesity homolog, mouse) |
| LIG1 | Ligase I, DNA, ATP-dependent |
| LMNB1 | Lamin B1 |
| LMO2 | LIM domain only 2 (rhombotin-like 1) |
| LPHN2 | Latrophilin 2 |
| LPL | Lipoprotein lipase |
| LRIG1 | Leucine-rich repeats and immunoglobulin-like domains 1 |
| LRP2 | Low density lipoprotein-related protein 2 |
| MAD2L1 | MAD2 mitotic arrest deficient-like 1 (yeast) |
| MAPRE1 | Microtubule-associated protein, RP/EB family, member 1 |
| MARS | Methionine-tRNA synthetase |
| MCM3 | MCM3 minichromosome maintenance deficient 3 (S. cerevisiae) |
| MCM6 | MCM6 minichromosome maintenance deficient 6 (MIS5 homolog, S. pombe) (S. cerevisiae) |
| MCM7 | MCM7 minichromosome maintenance deficient 7 (S. cerevisiae) |
| MEIS1 | Meis1, myeloid ecotropic viral integration site 1 homolog (mouse) |
| MELK | Maternal embryonic leucine zipper kinase |
| MGP | Matrix Gla protein |
| MKI67 | Antigen identified by monoclonal antibody Ki-67 |
| MN1 | Meningioma (disrupted in balanced translocation) 1 |
| MRPL12 | Mitochondrial ribosomal protein L12 |
| MT2A | Metallothionein 2A |
| MTHFD2 | Methylenetetrahydrofolate dehydrogenase (NADP+ dependent) 2, methenyltetrahydrofolate cyclohydrolase |
| MVD | Mevalonate (diphospho) decarboxylase |
| MYBL2 | v-myb myeloblastosis viral oncogene homolog (avian)-like 2 |
| NCOA1 | Nuclear receptor coactivator 1 |
| NDUFA9 | NADH dehydrogenase (ubiquinone) 1 alpha subcomplex, 9, 39 kDa |
| NEDD9 | Neural precursor cell expressed, developmentally down-regulated 9 |
| NEK2 | NIMA (never in mitosis gene a)-related kinase 2 |
| NME5 | Non-metastatic cells 5, protein expressed in (nucleoside-diphosphate kinase) |
| NNAT | Neuronatin |
| NP | Nucleoside phosphorylase |
| NR3C2 | Nuclear receptor subfamily 3, group C, member 2 |
| NTRK2 | Neurotrophic tyrosine kinase, receptor, type 2 |
| NUDT1 | Nudix (nucleoside diphosphate linked moiety X)-type motif 1 |
| NUP155 | Nucleoporin 155 kDa |
| NUP62 | Nucleoporin 62 kDa |
| NVL | Nuclear VCP-like |
| OMD | Osteomodulin |
| P4HA2 | Procollagen-proline, 2-oxoglutarate 4-dioxygenase (proline 4-hydroxylase), alpha polypeptide II |
| PDCD4 | Programmed cell death 4 (neoplastic transformation inhibitor) |
| PDE4A | Phosphodiesterase 4A, cAMP-specific (phosphodiesterase E2 dunce homolog, Drosophila) |
| PDZRN3 | PDZ domain containing RING finger 3 |
| PFKP | Phosphofructokinase, platelet |
| PHLDA2 | Pleckstrin homology-like domain, family A, member 2 |
| PIN1 | Protein (peptidylprolyl cis/trans isomerase) NIMA-interacting 1 |
| PIP | Prolactin-induced protein |
| PIR | Pirin (iron-binding nuclear protein) |
| PKMYT1 | Protein kinase, membrane associated tyrosine/threonine 1 |
| PLK4 | Polo-like kinase 4 (Drosophila) |
| PLP2 | Proteolipid protein 2 (colonic epithelium-enriched) |
| PNMA2 | Paraneoplastic antigen MA2 |
| PNRC1 | Proline-rich nuclear receptor coactivator 1 |
| POLD1 | Polymerase (DNA directed), delta 1, catalytic subunit 125 kDa |
| POLR2H | Polymerase (RNA) II (DNA directed) polypeptide H |
| POLS | Polymerase (DNA directed) sigma |
| PRAME | Preferentially expressed antigen in melanoma |
| PSD3 | Pleckstrin and Sec7 domain containing 3 |
| PSMB3 | Proteasome (prosome, macropain) subunit, beta type, 3 |
| PSMB7 | Proteasome (prosome, macropain) subunit, beta type, 7 |
| PSMD1 | Proteasome (prosome, macropain) 26S subunit, non-ATPase, 1 |
| PSMD11 | Proteasome (prosome, macropain) 26S subunit, non-ATPase, 11 |
| PTDSR | Phosphatidylserine receptor |
| PTGER3 | Prostaglandin E receptor 3 (subtype EP3) |
| PTGER4 | Prostaglandin E receptor 4 (subtype EP4) |
| PTPRT | Protein tyrosine phosphatase, receptor type, T |
| PTTG1 | Pituitary tumor-transforming 1 |
| QDPR | Quinoid dihydropteridine reductase |
| RABGGTA | Rab geranylgeranyltransferase, alpha subunit |
| RABIF | RAB interacting factor |
| RAD51 | RAD51 homolog (RecA homolog, |
| RAD51AP1 | RAD51 associated protein 1 |
| RAE1 | RAE1 RNA export 1 homolog (S. pombe) |
| RALA | v-ral simian leukemia viral oncogene homolog A (ras related) |
| RBMS3 | RNA binding motif, single stranded interacting protein |
| RDBP | RD RNA binding protein |
| RECQL4 | RecQ protein-like 4 |
| RFC3 | Replication factor C (activator 1) 3, 38 kDa |
| RFC4 | Replication factor C (activator 1) 4, 37 kDa |
| RGS5 | Regulator of G-protein signalling 5 |
| RICS | – |
| RNASEH2A | Ribonuclease H2, subunit A |
| RRM1 | Ribonucleotide reductase M1 polypeptide |
| RRM2 | Ribonucleotide reductase M2 polypeptide |
| RTN1 | Reticulon 1 |
| SAC3D1 | SAC3 domain containing 1 |
| SC5DL | Sterol-C5-desaturase (ERG3 delta-5-desaturase homolog, fungal)-like |
| SDS | Serine dehydratase |
| SEC14L2 | SEC14-like 2 (S. cerevisiae) |
| SEC61G | Sec61 gamma subunit |
| SELE | Selectin E (endothelial adhesion molecule 1) |
| SEMA3E | Sema domain, immunoglobulin domain (Ig), short basic domain, secreted, (semaphorin) 3E |
| SERPINA1 | Serpin peptidase inhibitor, clade A (alpha-1 antiproteinase, antitrypsin), member 1 |
| SF3B4 | Splicing factor 3b, subunit 4, 49 kDa |
| SFRP4 | Secreted frizzled-related protein 4 |
| SFRS5 | Splicing factor, arginine/serine-rich 5 |
| SH3BGRL | SH3 domain binding glutamic acid-rich protein like |
| SIAHBP1 | – |
| SIX1 | Sine oculis homeobox homolog 1 (Drosophila) |
| SLBP | Stem-loop (histone) binding protein |
| SLC14A1 | Solute carrier family 14 (urea transporter), member 1 (Kidd blood group) |
| SLC16A3 | Solute carrier family 16, member 3 (monocarboxylic acid transporter 4) |
| SLC25A1 | Solute carrier family 25 (mitochondrial carrier; citrate transporter), member 1 |
| SLC4A7 | Solute carrier family 4, sodium bicarbonate cotransporter, member 7 |
| SLIT2 | Slit homolog 2 (Drosophila) |
| SMARCA2 | SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily a, member 2 |
| SORBS2 | Sorbin and SH3 domain containing 2 |
| SORL1 | Sortilin-related receptor, L(DLR class) A repeats-containing |
| SPAG5 | Sperm associated antigen 5 |
| SPRY2 | Sprouty homolog 2 (Drosophila) |
| SSPN | Sarcospan (Kras oncogene-associated gene) |
| SSRP1 | Structure specific recognition protein 1 |
| STC2 | Stanniocalcin 2 |
| STMN1 | Stathmin 1/oncoprotein 18 |
| SURF2 | Surfeit 2 |
| TACSTD1 | Tumor-associated calcium signal transducer 1 |
| TAT | Tyrosine aminotransferase |
| TBCD | Tubulin-specific chaperone d |
| TGFB3 | Transforming growth factor, beta 3 |
| TIMELESS | Timeless homolog (Drosophila) |
| TIMM17B | Translocase of inner mitochondrial membrane 17 homolog B (yeast) |
| TLR3 | Toll-like receptor 3 |
| TOP2A | Topoisomerase (DNA) II alpha 170 kDa |
| TPX2 | TPX2, microtubule-associated, homolog (Xenopus laevis) |
| TRIP13 | Thyroid hormone receptor interactor 13 |
| TROAP | Trophinin associated protein (tastin) |
| TUBA1 | Tubulin, alpha 1 |
| TXN | Thioredoxin |
| TXNIP | Thioredoxin interacting protein |
| TXNRD1 | Thioredoxin reductase 1 |
| TYRP1 | Tyrosinase-related protein 1 |
| UBE2C | Ubiquitin-conjugating enzyme E2C |
| UBE2V2 | Ubiquitin-conjugating enzyme E2 variant 2 |
| WDHD1 | WD repeat and HMG-box DNA binding protein 1 |
| WFDC2 | WAP four-disulfide core domain 2 |
| WWP2 | WW domain containing E3 ubiquitin protein ligase 2 |
| XPOT | Exportin, tRNA (nuclear export receptor for tRNAs) |
| YWHAZ | Tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein, zeta polypeptide |
| ZNF238 | Zinc finger protein 238 |
| ZWINT | ZW10 interactor |
| AASS | Aminoadipate-semialdehyde synthase |
Notes: The first column gives the gene symbol of each of the 283 single genes. The second column gives the gene name.
Internal results.
| Method | RF Sen | RF Spe | RF bAcc | R-SVM Sen | R-SVM Spe | R-SVM bAcc | S-SVM Sen | S-SVM Spe | S-SVM bAcc | Across Sen | Across Spe | Across bAcc |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MG | 73 | 69 | 71 | 80 | 65 | 73 | 71 | 76 | 74 | 75 | 70 | 73 |
| SG | 85 | 53 | 69 | 85 | 59 | 72 | 69 | 79 | 74 | 80 | 64 | 72 |
Exported feature set performance (different platform).
| Method | RF Sen | RF Spe | RF bAcc | R-SVM Sen | R-SVM Spe | R-SVM bAcc | S-SVM Sen | S-SVM Spe | S-SVM bAcc | Across Sen | Across Spe | Across bAcc |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MG | 14 | 95 | 55 | 70 | 74 | 72 | 70 | 74 | 72 | 51 | 81 | 66 |
| SG | 35 | 79 | 57 | 48 | 66 | 57 | 72 | 69 | 71 | 52 | 71 | 62 |
Notes: The table shows the mean sensitivity, specificity and balanced accuracies for feature sets defined by AG1 and validated in AF2 and AF3, using either metagenes (MG) or single genes (SG) as features and using random forest (RF), support vector machine with a radial-based kernel (R-SVM) or a sigmoid kernel (S-SVM). Across shows the mean of the results across the three classification methods.
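As a quick consistency check, the Across columns of this table can be recomputed from the per-method entries. The sketch below averages the rounded values printed in the table; the authors presumably averaged unrounded values, so agreement after rounding is not guaranteed in general, although it holds for this table.

```python
# (Sen, Spe, bAcc) per method, copied from the different-platform feature set table.
per_method = {
    "MG": {"RF": (14, 95, 55), "R-SVM": (70, 74, 72), "S-SVM": (70, 74, 72)},
    "SG": {"RF": (35, 79, 57), "R-SVM": (48, 66, 57), "S-SVM": (72, 69, 71)},
}

def across(methods):
    """Column-wise mean over the classification methods, rounded to whole percent."""
    columns = zip(*methods.values())
    return tuple(round(sum(col) / len(col)) for col in columns)

across_mg = across(per_method["MG"])  # (51, 81, 66)
across_sg = across(per_method["SG"])  # (52, 71, 62)
```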
Exported feature set performance (similar platform).
| Method | RF Sen | RF Spe | RF bAcc | R-SVM Sen | R-SVM Spe | R-SVM bAcc | S-SVM Sen | S-SVM Spe | S-SVM bAcc | Across Sen | Across Spe | Across bAcc |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MG | 71 | 46 | 59 | 68 | 69 | 69 | 73 | 72 | 73 | 71 | 62 | 67 |
| SG | 76 | 39 | 58 | 67 | 71 | 69 | 71 | 71 | 71 | 71 | 60 | 66 |
Notes: The table shows the mean sensitivity, specificity and balanced accuracies for external validation of feature sets, covering the following validations: feature sets defined by AF1 and validated in AF2 and AF3, and feature sets defined by AF2 and validated in AF3 and vice versa. Either metagenes (MG) or single genes (SG) were used as features, with random forest (RF) or a support vector machine with a radial-based kernel (R-SVM) or a sigmoid kernel (S-SVM). Across shows the mean of the results across the three classification methods.
Exported classifier performance (different platform).
| Method | RF Sen | RF Spe | RF bAcc | R-SVM Sen | R-SVM Spe | R-SVM bAcc | S-SVM Sen | S-SVM Spe | S-SVM bAcc | Across Sen | Across Spe | Across bAcc |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MG | 42 | 75 | 59 | 26 | 85 | 56 | 33 | 85 | 59 | 34 | 82 | 58 |
| SG | 56 | 67 | 62 | 49 | 66 | 58 | 33 | 71 | 52 | 46 | 68 | 58 |
Notes: The table shows the mean sensitivity, specificity and balanced accuracies for classifiers defined by AG1 and validated in AF2 and AF3, using either metagenes (MG) or single genes (SG) as features and using random forest (RF), support vector machine with a radial-based kernel (R-SVM) or a sigmoid kernel (S-SVM).
Exported classifier performance (similar platform).
| Method | RF Sen | RF Spe | RF bAcc | R-SVM Sen | R-SVM Spe | R-SVM bAcc | S-SVM Sen | S-SVM Spe | S-SVM bAcc | Across Sen | Across Spe | Across bAcc |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MG | 53 | 61 | 57 | 67 | 42 | 55 | 62 | 43 | 53 | 61 | 49 | 55 |
| SG | 65 | 58 | 62 | 65 | 53 | 59 | 58 | 60 | 59 | 63 | 57 | 60 |
Notes: The table shows the mean sensitivity, specificity and balanced accuracies for external validation of classifiers, covering the following validations: feature sets defined by AF1 and validated in AF2 and AF3, and feature sets defined by AF2 and validated in AF3 and vice versa. Either metagenes (MG) or single genes (SG) were used as features, with random forest (RF) or a support vector machine with a radial-based kernel (R-SVM) or a sigmoid kernel (S-SVM).