Yuan Chen, Lifeng Wang, Lanzhi Li, Hongyan Zhang, Zheming Yuan.
Abstract
BACKGROUND: Selecting a parsimonious set of informative genes to build a classifier with high generalization performance is the most important task in the analysis of tumor microarray expression data. Many existing gene-pair evaluation methods cannot highlight diverse patterns of gene pairs because they use only one strategy, either vertical comparison or horizontal comparison, while individual-gene-ranking methods ignore redundancy and synergy among genes.
Year: 2016 PMID: 26792270 PMCID: PMC4721022 DOI: 10.1186/s12859-016-0893-0
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Six patterns for joint effect of gene pairs in binary-class simulation data
| Class | Pattern I | Pattern II | Pattern III | Pattern IV | Pattern V | Pattern VI | |||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| G1 | G2 | G3 | G4 | G5 | G6 | G7 | G8 | G9 | G10 | G11 | G12 | ||||||||||||||
| + | 50 | 100 | 5 | 100 | 50 | 50 | 5 | 50 | 50 | 100 | 50 | 50 | |||||||||||||
| + | 50 | 100 | 5 | 100 | 50 | 50 | 5 | 50 | 5 | 10 | 100 | 100 | |||||||||||||
| + | 50 | 100 | 5 | 100 | 50 | 50 | 5 | 50 | 50 | 100 | 50 | 50 | |||||||||||||
| + | 50 | 100 | 5 | 100 | 50 | 50 | 5 | 50 | 5 | 10 | 100 | 100 | |||||||||||||
| - | 100 | 50 | 10 | 50 | 100 | 100 | 10 | 100 | 100 | 50 | 50 | 100 | |||||||||||||
| - | 100 | 50 | 10 | 50 | 100 | 100 | 10 | 100 | 10 | 5 | 100 | 50 | |||||||||||||
| - | 100 | 50 | 10 | 50 | 100 | 100 | 10 | 100 | 100 | 50 | 50 | 100 | |||||||||||||
| - | 100 | 50 | 10 | 50 | 100 | 100 | 10 | 100 | 10 | 5 | 100 | 50 | |||||||||||||
| Background difference between gene pairs | Not exist | Exist | Not exist | Exist | Not exist | Not exist | |||||||||||||||||||
| Background difference among samples | Not exist | Not exist | Not exist | Not exist | Exist | Not exist | |||||||||||||||||||
| Vertical comparison | G1 < 75 | G1 > 75 | G2 < 75 | G2 > 75 | G3 < 7 | G3 > 7 | G4 < 75 | G4 > 75 | G5 < 75 | G5 > 75 | G6 < 75 | G6 > 75 | G7 < 7 | G7 > 7 | G8 < 75 | G8 > 75 | G9 < 41 | G9 > 41 | G10 < 41 | G10 > 41 | G11 < 75 | G11 > 75 | G12 < 75 | G12 > 75 | |
| + | 4 | 0 | 0 | 4 | 4 | 0 | 0 | 4 | 4 | 0 | 4 | 0 | 4 | 0 | 4 | 0 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | |
| - | 0 | 4 | 4 | 0 | 0 | 4 | 4 | 0 | 0 | 4 | 0 | 4 | 0 | 4 | 0 | 4 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | |
| Highlight | Yes ( | Yes ( | Yes ( | Yes ( | Yes ( | Yes ( | Yes ( | Yes ( | No ( | No ( | No ( | No ( | |||||||||||||
| Horizontal comparison of pair-wise genes | G1 > G2 | G1 < G2 | G3 > G4 | G3 < G4 | G5 > G6 | G5 < G6 | G7 > G8 | G7 < G8 | G9 > G10 | G9 < G10 | G11 > G12 | G11 < G12 | |||||||||||||
| + | 0 | 4 | 0 | 4 | 2 | 2 | 0 | 4 | 0 | 4 | 2 | 2 | |||||||||||||
| - | 4 | 0 | 0 | 4 | 2 | 2 | 0 | 4 | 4 | 0 | 2 | 2 | |||||||||||||
| Highlight | Yes ( | No ( | No ( | No ( | Yes ( | No ( | |||||||||||||||||||
| Vertical comparison of pair-wise genes | G1 < 75 & G2 < 75 | G1 < 75 & G2 > 75 | G1 > 75 & G2 < 75 | G1 > 75 & G2 > 75 | G3 < 7 & G4 < 75 | G3 < 7 & G4 > 75 | G3 > 7 & G4 < 75 | G3 > 7 & G4 > 75 | G5 < 75 & G6 < 75 | G5 < 75 & G6 > 75 | G5 > 75 & G6 < 75 | G5 > 75 & G6 > 75 | G7 < 7 & G8 < 75 | G7 < 7 & G8 > 75 | G7 > 7 & G8 < 75 | G7 > 7 & G8 > 75 | G9 < 41 & G10 < 41 | G9 < 41 & G10 > 41 | G9 > 41 & G10 < 41 | G9 > 41 & G10 > 41 | G11 < 75 & G12 < 75 | G11 < 75 & G12 > 75 | G11 > 75 & G12 < 75 | G11 > 75 & G12 > 75 | |
| + | 0 | 4 | 0 | 0 | 0 | 4 | 0 | 0 | 4 | 0 | 0 | 0 | 4 | 0 | 0 | 0 | 2 | 0 | 0 | 2 | 2 | 0 | 0 | 2 | |
| - | 0 | 0 | 4 | 0 | 0 | 0 | 4 | 0 | 0 | 0 | 0 | 4 | 0 | 0 | 0 | 4 | 2 | 0 | 0 | 2 | 0 | 2 | 2 | 0 | |
| Highlight | Yes ( | Yes ( | Yes ( | Yes ( | No ( | Yes ( | |||||||||||||||||||
Values in parentheses are chi-square values; * denotes p < 0.05
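The contrast in Pattern V (background difference among samples) can be sketched in a few lines. The snippet below rebuilds the vertical and horizontal 2 × 2 count tables for G9 and G10 from the simulated values and scores them with an uncorrected Pearson chi-square. The function names are illustrative, and the choice of the plain Pearson statistic is an assumption, since the exact statistic behind the values in parentheses is not recoverable from this table.

```python
# Pattern V genes from the simulation table: four "+" samples, then four "-" samples.
G9 = [50, 5, 50, 5, 100, 10, 100, 10]
G10 = [100, 10, 100, 10, 50, 5, 50, 5]
LABELS = ["+", "+", "+", "+", "-", "-", "-", "-"]


def vertical_table(x, cutoff, labels):
    """Rows are (+, -); columns are (x < cutoff, x > cutoff), as in the table.

    No value equals the cutoff in this data, so ties are not handled here.
    """
    rows = {"+": [0, 0], "-": [0, 0]}
    for value, label in zip(x, labels):
        rows[label][0 if value < cutoff else 1] += 1
    return [rows["+"], rows["-"]]


def horizontal_table(x, y, labels):
    """Rows are (+, -); columns are (x > y, x < y), as in the table.

    No sample has x == y in this data, so ties are not handled here.
    """
    rows = {"+": [0, 0], "-": [0, 0]}
    for a, b, label in zip(x, y, labels):
        rows[label][0 if a > b else 1] += 1
    return [rows["+"], rows["-"]]


def pearson_chi2(table):
    """Uncorrected Pearson chi-square (assumed statistic); empty cells are skipped."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected == 0:
                continue
            chi2 += (observed - expected) ** 2 / expected
    return chi2


vertical = vertical_table(G9, 41, LABELS)
horizontal = horizontal_table(G9, G10, LABELS)
print(vertical, pearson_chi2(vertical))      # [[2, 2], [2, 2]] 0.0
print(horizontal, pearson_chi2(horizontal))  # [[0, 4], [4, 0]] 8.0
```

The vertical table is uninformative because half of each class falls on either side of the cut-off, while the within-sample ordering of G9 and G10 separates the classes completely.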
Nine multi-class gene expression datasets
| Dataset | Platform | No. of classes | No. of genes | No. of training samples | No. of test samples | Source |
|---|---|---|---|---|---|---|
| Leukemia1 | Affy | 3 | 7129 | 38 | 34 | [ |
| Lung1 | Affy | 3 | 7129 | 64 | 32 | [ |
| Leukemia2 | Affy | 3 | 12582 | 57 | 15 | [ |
| SRBCT | cDNA | 4 | 2308 | 63 | 20 | [ |
| Breast | Affy | 5 | 9216 | 54 | 30 | [ |
| Lung2 | Affy | 5 | 12600 | 136 | 67 | [ |
| DLBCL | cDNA | 6 | 4026 | 58 | 30 | [ |
| Cancers | Affy | 11 | 12533 | 100 | 74 | [ |
| GCM | Affy | 14 | 16063 | 144 | 46 | [ |
2×r Contingency table
| Class | Column 1 | … | Column j | … | Column r | Total |
|---|---|---|---|---|---|---|
| + | f+1 | … | f+j | … | f+r | f+ |
| - | f-1 | … | f-j | … | f-r | f- |
| Total | f1 | … | fj | … | fr | f+ + f- |
2×r Contingency table for maximum complexity
| Class | Column 1 | … | Column j | … | Column r | Total |
|---|---|---|---|---|---|---|
| + | f+/r | … | f+/r | … | f+/r | f+ |
| - | f-/r | … | f-/r | … | f-/r | f- |
| Total | (f+ + f-)/r | … | (f+ + f-)/r | … | (f+ + f-)/r | f+ + f- |
2 × 2 contingency table for individual gene
| Class | X > EP | X < EP | Total |
|---|---|---|---|
| + | f+1 | f+2 | f+ |
| - | f−1 | f−2 | f− |
| Total | f1 | f2 | f+ + f− |
f+1 is the number of positive samples with expression values larger than EP, f+2 is the number of positive samples with expression values less than EP, f−1 is the number of negative samples with expression values larger than EP, and f−2 is the number of negative samples with expression values less than EP. When X equals EP and Y belongs to the positive class {+}, f+1 and f+2 each increase by 0.5; when X equals EP and Y belongs to the negative class {−}, f−1 and f−2 each increase by 0.5.
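The tie rule in this footnote is easy to get wrong, so a minimal sketch may help. The function name `individual_gene_table` and the toy values are hypothetical; only the counting rule (larger than EP, less than EP, 0.5 to each cell on a tie) comes from the footnote.

```python
def individual_gene_table(values, labels, ep):
    """Return [[f+1, f+2], [f-1, f-2]] for one gene.

    Column 1 counts values larger than ep, column 2 values less than ep.
    A value equal to ep adds 0.5 to both cells of its class row.
    """
    rows = {"+": [0.0, 0.0], "-": [0.0, 0.0]}
    for x, y in zip(values, labels):
        if x > ep:
            rows[y][0] += 1.0
        elif x < ep:
            rows[y][1] += 1.0
        else:
            rows[y][0] += 0.5
            rows[y][1] += 0.5
    return [rows["+"], rows["-"]]


# Toy data (not from the paper): two positive and three negative samples.
values = [5, 7, 9, 7, 3]
labels = ["+", "+", "-", "-", "-"]
table = individual_gene_table(values, labels, ep=7)
print(table)  # [[0.5, 1.5], [1.5, 1.5]]
```

Splitting a tie evenly keeps each row total equal to the class size (2 and 3 here), so the marginal totals of the table are unaffected by ties.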
2 × 2 contingency table for gene pairs of horizontal comparison
| Class | X_ij > X_ik | X_ij < X_ik | Total |
|---|---|---|---|
| + | f+1 | f+2 | f+ |
| - | f−1 | f−2 | f− |
| Total | f1 | f2 | f+ + f− |
X_ij represents the expression value of the j-th gene (G_j) in the i-th sample, and X_ik that of the paired gene G_k in the same sample; f+1 is the number of positive samples with X_ij larger than X_ik, f+2 is the number of positive samples with X_ij less than X_ik, f−1 is the number of negative samples with X_ij larger than X_ik, and f−2 is the number of negative samples with X_ij less than X_ik.
2 × 4 contingency table for gene pairs of vertical comparison
| Class | X.j > EPj & X.k > EPk | X.j > EPj & X.k < EPk | X.j < EPj & X.k > EPk | X.j < EPj & X.k < EPk | Total |
|---|---|---|---|---|---|
| + | f+1 | f+2 | f+3 | f+4 | f+ |
| - | f−1 | f−2 | f−3 | f−4 | f− |
| Total | f1 | f2 | f3 | f4 | f+ + f− |
f+1 is the number of positive samples with X.j larger than EPj and X.k larger than EPk, f+2 is the number of positive samples with X.j larger than EPj and X.k less than EPk, f+3 is the number of positive samples with X.j less than EPj and X.k larger than EPk, f+4 is the number of positive samples with X.j less than EPj and X.k less than EPk, f−1 is the number of negative samples with X.j larger than EPj and X.k larger than EPk, f−2 is the number of negative samples with X.j larger than EPj and X.k less than EPk, f−3 is the number of negative samples with X.j less than EPj and X.k larger than EPk, and f−4 is the number of negative samples with X.j less than EPj and X.k less than EPk.
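As an illustration of the 2 × 4 table, the sketch below bins the Pattern VI genes (G11, G12) from the simulation table, using the cut-off of 75 given there for both genes. The column order follows this footnote (larger/larger, larger/less, less/larger, less/less). The function name is hypothetical, and rejecting values equal to a cut-off is a simplifying assumption, because no tie rule is stated for this table.

```python
def pair_vertical_table(xj, xk, labels, ep_j, ep_k):
    """Return the 2 x 4 count table [[f+1..f+4], [f-1..f-4]] for a gene pair."""
    rows = {"+": [0, 0, 0, 0], "-": [0, 0, 0, 0]}
    for a, b, label in zip(xj, xk, labels):
        if a == ep_j or b == ep_k:
            # Assumption: ties are rejected rather than split across cells.
            raise ValueError("value equal to its cut-off is not handled")
        # Column 0: (>, >), 1: (>, <), 2: (<, >), 3: (<, <)
        column = (0 if a > ep_j else 2) + (0 if b > ep_k else 1)
        rows[label][column] += 1
    return [rows["+"], rows["-"]]


# Pattern VI genes: four "+" samples, then four "-" samples.
G11 = [50, 100, 50, 100, 50, 100, 50, 100]
G12 = [50, 100, 50, 100, 100, 50, 100, 50]
LABELS = ["+", "+", "+", "+", "-", "-", "-", "-"]

table = pair_vertical_table(G11, G12, LABELS, ep_j=75, ep_k=75)
print(table)  # [[2, 0, 0, 2], [0, 2, 2, 0]]
```

Neither gene separates the classes alone, and their within-sample ordering does not either, but the joint bins do: positive samples occupy only the concordant columns and negative samples only the discordant ones.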
Fig. 1 Integrated evaluation process of G
Pseudo-code of informative genes selection
| Algorithm 1 Informative gene selection (Dataset, GRank) |
| Require: Dataset is a binary-class training dataset with |
| Require: GRank is the order of all |
| Ensure: Returns the binary-discriminative informative gene subset of Dataset |
| 1: true_ |
| 2: |
| 3: repeat |
| 4: |
| 5: if | |
| 6: for |
| 7: |
| 8: get |
| 9: |
| 10: get |
| 11: if |
| 12: else pred_ |
| 13: end for |
| 14: MCCbenchmark ← get MCC (true_ |
| 15: else |
| 16: for |
| 17: |
| 18: get |
| 19: |
| 20: get |
| 21: if |
| 22: else pred_ |
| 23: end for |
| 24: MCC ← get MCC (true_ |
| 25: end if |
| 26: if MCC > MCCbenchmark then MCCbenchmark ← MCC |
| 27: else delete GRank |
| 28: until |
| 29: return |
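Most of the pseudo-code body is lost above, but the surviving lines (a benchmark MCC, a comparison `MCC > MCCbenchmark`, and deletion of the gene from GRank otherwise) suggest a greedy pass over the ranked genes. The following is only a sketch of that general idea under stated assumptions: the predictor `vote`, the toy calls in `CALLS`, and the names `greedy_select` and `mcc` are all hypothetical and are not taken from the paper.

```python
import math


def mcc(true_labels, pred_labels):
    """Matthews correlation coefficient for 0/1 labels (0.0 when undefined)."""
    pairs = list(zip(true_labels, pred_labels))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def greedy_select(ranked_genes, true_labels, predict):
    """Keep a gene only if adding it raises the MCC above the current benchmark."""
    selected = []
    benchmark = float("-inf")  # the first gene always sets the benchmark
    for gene in ranked_genes:
        candidate = selected + [gene]
        score = mcc(true_labels, predict(candidate))
        if score > benchmark:
            selected, benchmark = candidate, score
    return selected


# Toy per-gene class calls (hypothetical data, not from the paper).
CALLS = {
    "gA": [1, 1, 0, 0, 0, 0],
    "gB": [1, 1, 0, 0, 0, 0],  # redundant with gA
    "gC": [0, 0, 1, 0, 0, 0],  # complements gA
}
TRUE = [1, 1, 1, 0, 0, 0]


def vote(genes):
    """Toy predictor: a sample is positive if at least half of the genes call it so."""
    return [
        int(2 * sum(CALLS[g][i] for g in genes) >= len(genes))
        for i in range(len(TRUE))
    ]


print(greedy_select(["gA", "gB", "gC"], TRUE, vote))  # ['gA', 'gC']
```

In this toy run the redundant gene gB is dropped because it leaves the MCC unchanged, while gC is kept because it corrects the one sample that gA misses.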
Independent test accuracy and the number of informative genes (in parentheses) among different models
| Model | Leuk1 | Lung1 | Leuk2 | SRBCT | Breast | Lung2 | DLBCL | Cancers | GCM | Average |
|---|---|---|---|---|---|---|---|---|---|---|
| HC-TSPa | 97.06 (4) | 71.88 (4) | 80.00 (4) | 95.00 (6) | 66.67 (8) | 83.58 (8) | 83.33 (10) | 74.32 (20) | 52.17 (26) | 78.22 ± 13.97 (10.00) |
| HC-K-TSPa | 97.06 (36) | 78.13 (20) | 100 (24) | 100 (30) | 66.67 (24) | 94.03 (28) | 83.33 (46) | 82.43 (128) | 67.39 (134) | 85.45 ± 13.12 (52.22) |
| DTa | 85.29 (2) | 78.13 (4) | 80.00 (2) | 75.00 (3) | 73.33 (4) | 88.06 (5) | 86.67 (5) | 68.92 (10) | 52.17 (18) | 76.40 ± 11.13 (5.89) |
| PAMa | 97.06 (44) | 78.13 (13) | 93.33 (62) | 95.00 (285) | 93.33 (4822) | 100 (614) | 90.00 (3949) | 87.84 (2008) | 56.52 (1253) | 87.91 ± 13.34 (1450) |
| TSGb | 97.06 (6) | 81.25 (20) | 100 (44) | 100 (13) | 86.67 (63) | 95.52 (60) | 93.33 (16) | 79.73 (81) | 67.39 (112) | 88.99 ± 11.11 (46.11) |
| mRMR-SVM | 76.47 (7) | 78.13 (13) | 100 (19) | 75.00 (9) | 96.67 (97) | 95.52 (120) | 96.67 (16) | 71.62 (89) | 45.65 (57) | 81.75 ± 17.54 (47.44) |
| SVM-RFE-SVM | 85.29 (5) | 78.13 (9) | 93.33 (8) | 95.00 (3) | 90.00 (7) | 88.06 (9) | 90.00 (13) | 93.24 (29) | 63.04 (199) | 86.23 ± 10.08 (31.33) |
| Entropy-based DC | 91.18 (7) | 78.13 (14) | 86.67 (13) | 100 (9) | 83.33 (13) | 88.06 (39) | 93.33 (15) | 78.38 (73) | 47.83 (93) | 82.99 ± 14.93 (30.67) |
| | 94.12 (23) | 81.00 (18) | 100 (30) | 100 (31) | 90.00 (33) | 97.02 (42) | 93.33 (23) | 90.54 (95) | 58.70 (90) | 89.41 ± 12.91 (42.78) |
| RS-based DC | 94.12 (7) | 84.38 (12) | 100 (13) | 100 (11) | 93.33 (15) | 98.51 (21) | 90.00 (16) | 90.54 (36) | 71.74 (54) | 91.40 ± 9.00 (20.56) |
a Results reported in [6]; b Results reported in [30]. The Average column gives the mean ± standard deviation across the nine datasets. Bold values indicate the best prediction model for each dataset.
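As a quick arithmetic check of the Average column, the RS-based DC row can be recomputed with the standard library. The footnote does not say which standard deviation is used; the sample standard deviation (n − 1 denominator) is an assumption here, chosen because it matches the reported figure for this row.

```python
import statistics

# Independent test accuracies of RS-based DC on the nine datasets (from the table).
rs_based_dc = [94.12, 84.38, 100, 100, 93.33, 98.51, 90.00, 90.54, 71.74]

mean_acc = statistics.mean(rs_based_dc)
sample_sd = statistics.stdev(rs_based_dc)  # n - 1 denominator (assumed)

print(f"{mean_acc:.2f} ± {sample_sd:.2f}")  # 91.40 ± 9.00
```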
Test accuracy of different classifiers with informative genes selected by different feature-selection methods
| Classifier | Feature-selection method | Leuk1 | Lung1 | Leuk2 | SRBCT | Breast | Lung2 | DLBCL | Cancers | GCM | Average |
|---|---|---|---|---|---|---|---|---|---|---|---|
| NB | ALLa | 85.29 | 81.25 | 100 | 60.00 | 66.67 | 88.06 | 86.67 | 79.73 | 52.17 | 77.76 |
| NB | RS | 94.12 | 84.38 | 100 | 85.00 | 93.33 | 88.06 | 90.00 | 85.14 | 71.74 | 87.97 |
| NB | mRMR | 79.41 | 68.75 | 100 | 90.00 | 93.33 | 97.01 | 96.67 | 70.27 | 45.65 | 82.34 |
| NB | SVM-RFE | 67.65 | 81.25 | 80.00 | 95.00 | 80.00 | 89.55 | 90.00 | 77.03 | 63.04 | 80.39 |
| NB | TSG | 91.18 | 84.38 | 93.33 | 100 | 86.67 | 94.03 | 100 | 71.62 | 65.22 | 87.38 |
| NB | HC-K-TSP | 91.18 | 81.25 | 100 | 80.00 | 80.00 | 95.52 | 86.67 | 77.03 | 65.22 | 84.10 |
| KNN | ALLa | 67.65 | 75.00 | 86.67 | 70.00b | 63.33 | 88.06 | 93.33 | 64.86 | 34.78 | 71.71 |
| KNN | RS | 97.06 | 78.13 | 93.33 | 90.00 | 93.33 | 95.52 | 93.33 | 72.97 | 43.48 | 84.13 |
| KNN | mRMR | 70.59 | 68.75 | 80.00 | 80.00 | 96.67 | 86.57 | 100 | 54.05 | 36.96 | 74.84 |
| KNN | SVM-RFE | 76.47 | 68.75 | 86.67 | 100 | 90.00 | 86.57 | 90.00 | 58.11 | 45.65 | 78.02 |
| KNN | TSG | 91.18 | 75.00 | 93.33 | 100 | 80.00 | 88.06 | 96.67 | 74.32 | 39.13 | 81.97 |
| KNN | HC-K-TSP | 88.24 | 87.50 | 86.67 | 85.00 | 83.33 | 94.03 | 93.33 | 64.86 | 52.17 | 81.68 |
| SVM | ALLa | 79.41 | 87.50 | 100 | 100 | 83.33 | 97.01 | 100 | 83.78 | 65.22 | 88.47 |
| SVM | RS | 94.12 | 84.38 | 100 | 95.00 | 93.33 | 95.52 | 96.67 | 89.19 | 65.22 | 90.38 |
| SVM | mRMR | 76.47 | 78.13 | 100 | 75.00 | 96.67 | 95.52 | 96.67 | 71.62 | 45.65 | 81.75 |
| SVM | SVM-RFE | 85.29 | 78.13 | 93.33 | 95.00 | 90.00 | 88.06 | 90.00 | 93.24 | 63.04 | 86.23 |
| SVM | TSG | 91.18 | 81.25 | 93.33 | 80.00 | 80.00 | 94.03 | 100 | 68.92 | 54.35 | 82.56 |
| SVM | HC-K-TSP | 85.29 | 84.38 | 100 | 90.00 | 86.67 | 98.51 | 96.67 | 82.43 | 60.87 | 87.20 |
a Results reported in [6]; b The value of 30 reported in [3] is 70.00 after validation. Bold values indicate the best average accuracy for each classifier.
Fig. 2 Accuracy of mRMR-SVM for fitting, LOOCV and independent test
Fig. 3 Accuracy of SVM-RFE-SVM for fitting, LOOCV and independent test
Fig. 4 Accuracy of HC-K-TSP for fitting, LOOCV and independent test
Fig. 5 Accuracy of TSG for fitting, LOOCV and independent test
Fig. 6 Accuracy of RS-based DC for fitting, LOOCV and independent test
SVM performance with and without parameter optimization, based on informative genes selected by RS
| Parameters optimization | Kernel | Evaluation | Leuk1 | Lung1 | Leuk2 | SRBCT | Breast | Lung2 | DLBCL | Cancers | GCM | Average |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No (fixed | linear | Fitting | 97.37 | 95.31 | 100 | 100 | 100 | 97.06 | 100 | 100 | 97.92 | 98.63 |
| No (fixed | linear | LOOCV | 97.37 | 81.25 | 98.25 | 98.41 | 94.44 | 94.12 | 96.55 | 96 | 77.78 | 92.69 |
| No (fixed | linear | Testing | 94.12 | 84.38 | 93.33 | 95 | 93.33 | 97.01 | 96.67 | 87.84 | 58.7 | 88.93 |
| Yes | linear | Fitting | 97.37 | 100 | 100 | 100 | 100 | 100 | 100 | 99 | 100 | 99.6 |
| Yes | linear | LOOCV | 97.37 | 90.63 | 100 | 100 | 98.15 | 94.12 | 100 | 96 | 81.25 | 95.28 |
| Yes | linear | Testing | 94.12 | 84.38 | 100 | 95 | 93.33 | 95.52 | 96.67 | 89.19 | 65.22 | 90.38 |
| Yes | linear | C | 0.25 | 32 | 0.03125 | 0.5 | 0.125 | 8 | 0.25 | 0.25 | 4 | |
| No (fixed | RBF | Fitting | 97.37 | 87.50 | 100 | 100 | 100 | 91.18 | 100 | 88.00 | 45.14 | 89.91 |
| No (fixed | RBF | LOOCV | 97.37 | 79.69 | 100 | 98.41 | 98.15 | 90.44 | 86.21 | 78.00 | 77.08 | 89.48 |
| No (fixed | RBF | Testing | 94.12 | 78.13 | 100 | 95.00 | 93.33 | 97.01 | 93.33 | 85.14 | 52.17 | 87.58 |
| Yes | RBF | Fitting | 97.37 | 100 | 100 | 100 | 100 | 95.59 | 100 | 100 | 100 | 99.22 |
| Yes | RBF | LOOCV | 97.37 | 90.63 | 100.00 | 98.00 | 98.15 | 94.12 | 100 | 98.00 | 82.64 | 95.43 |
| Yes | RBF | Testing | 94.12 | 84.38 | 86.67 | 90.00 | 93.33 | 95.52 | 90.00 | 87.84 | 52.17 | 86.00 |
| Yes | RBF | C | 8 | 2048 | 0.125 | 0.25 | 0.5 | 2 | 1 | 32768 | 32 | |
| Yes | RBF | γ | 0.0125 | 0.0075125 | 0.25 | 0.125 | 0.25 | 0.0625 | 0.25 | 0.00390625 | 0.0625 | |
C is the penalty parameter, with C ∈ [2^−5, 2^15]; γ is the gamma parameter of the kernel function, with γ ∈ [2^−15, 2^3]; m is the number of features in each SVM model
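The search range for C can be made concrete with a small sketch. It enumerates powers of two over the stated exponent range and checks that the optimized C values in the table lie on that grid. The step of one between exponents is an assumption, since the footnote gives only the end points, and the helper name `on_grid` is illustrative.

```python
import math


def on_grid(value, low_exp, high_exp):
    """True if value is a power of two with an integer exponent in [low_exp, high_exp]."""
    exponent = math.log2(value)
    return exponent.is_integer() and low_exp <= exponent <= high_exp


# Candidate C values 2^-5 ... 2^15 (exponent step of 1 is assumed).
c_grid = [2.0 ** e for e in range(-5, 16)]

# Optimized C values reported in the table for the linear and RBF kernels.
c_linear = [0.25, 32, 0.03125, 0.5, 0.125, 8, 0.25, 0.25, 4]
c_rbf = [8, 2048, 0.125, 0.25, 0.5, 2, 1, 32768, 32]

print(len(c_grid), c_grid[0], c_grid[-1])                   # 21 0.03125 32768.0
print(all(on_grid(c, -5, 15) for c in c_linear + c_rbf))    # True
```

Both end points of the range are actually reached in the table: 0.03125 (2^−5) for the linear kernel on Leuk2 and 32768 (2^15) for the RBF kernel on Cancers.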
Independent test accuracy of RS-based DC with different outlier adjustments and endpoint selection approaches
| Outlier adjustment | Endpoint selection | Leuk1 | Lung1 | Leuk2 | SRBCT | Breast | Lung2 | DLBCL | Cancers | GCM | Average |
|---|---|---|---|---|---|---|---|---|---|---|---|
| No adjustment | Formula (11) | 94.12 | 84.38 | 93.33 | 95.00 | 90.00 | 100.00 | 90.00 | 81.08 | 60.87 | 87.64 |
| 0.01 | Formula (11) | 94.12 | 84.38 | 93.33 | 100 | 90.00 | 97.02 | 90.00 | 90.54 | 63.04 | 89.16 |
| 0.05 | Formula (11) | 94.12 | 84.38 | 100 | 100 | 93.33 | 98.51 | 90.00 | 90.54 | 71.74 | 91.40 |
| 0.05 | Mean | 94.12 | 84.38 | 100 | 100 | 93.33 | 97.01 | 90.00 | 90.54 | 71.74 | 91.24 |
Horizontal and vertical comparison of gene pairs in real data
| GCM dataset | Horizontal comparison | | Vertical comparison | | | |
|---|---|---|---|---|---|---|
| Class 11 | 2 | 9 | 3 | 1 | 1 | 6 |
| Class 12 | 10 | 1 | 0 | 0 | 0 | 11 |
| | 0.0027 | | 0.0908 | | | |

| Lung1 dataset | Horizontal comparison | | Vertical comparison | | | |
|---|---|---|---|---|---|---|
| Class 1 | 41 | 3 | 18 | 11 | 10 | 5 |
| Non-Class 1 | 19 | 1 | 0 | 2 | 1 | 17 |
| | 0.7806 | | 2.0716 × 10^−7 | | | |
The 10 tumor-related genes selected by RS on original training group of Leuk2 dataset
| Symbol | Synonym(s) | Entrez Gene Name | Related carcinoma and Ref. |
|---|---|---|---|
| FTL | LFTD, NBIA3 | ferritin, light polypeptide | breast cancer [ |
| PDK1 | | pyruvate dehydrogenase kinase, isozyme 1 | leukemia [ |
| POU2AF1 | BOB1, OBF-1, OBF1, OCAB | POU class 2 associating factor 1 | leukemia [ |
| KLRK1 | CD314, D12S2489E, KLR, NKG2-D, NKG2D | killer cell lectin-like receptor subfamily K, member 1 | leukemia [ |
| KCNH2 | ERG-1, ERG1, H-ERG, HERG, HERG1, Kv11.1, LQT2, SQT1 | potassium channel, voltage gated eag related subfamily H, member 2 | leukemia [ |
| VLDLR | CAMRQ1, CARMQ1, CHRMQ1CH, VLDLR | very low density lipoprotein receptor | breast cancer [ |
| MEIS1 | | Meis homeobox 1 | leukemia [ |
| MLXIP | MIR, MONDOA, bHLHe36 | MLX interacting protein | leukemia [ |
| NF2 | ACN, BANF, SCH | neurofibromin 2 (merlin) | tumor suppressor [ |
| MAP3K5 | ASK1, MAPKKK5, MEKK5 | mitogen-activated protein kinase kinase kinase 5 | leukemia [ |
The 34 tumor-related genes selected by RS on original training group of Cancers dataset
| Symbol | Synonym(s) | Entrez Gene Name | Related carcinoma and Ref. |
|---|---|---|---|
| CYP1A1 | AHH, AHRR, CP11, CYP1, P1-450, P450-C, P450DX | cytochrome P450, family 1, subfamily A, polypeptide 1 | lung cancer [ |
| PTPRZ1 | HPTPZ, HPTPzeta, PTP-ZETA, PTP18, PTPRZ, PTPZ, R-PTP-zeta-2, RPTPB, RPTPbeta, phosphacan | protein tyrosine phosphatase, receptor-type, Z polypeptide 1 | lung cancer [ |
| WT1 | AWT1, EWS-WT1, GUD, NPHS4, WAGR, WIT-2, WT33 | Wilms tumor 1 | leukemia [ |
| ANGPT2 | AGPT2, ANG2 | angiopoietin 2 | lung cancer [ |
| LGALS1 | GAL1, GBP | lectin, galactoside-binding, soluble, 1 | hepatocellular carcinoma [ |
| ACPP | 5'-NT, ACP-3, ACP3 | acid phosphatase, prostate | prostate cancer [ |
| GC | DBP, DBP/GC, GRD3, HEL-S-51, VDBG, VDBP | group-specific component (vitamin D binding protein) | bladder cancer [ |
| PRMT1 | ANM1, HCP1, HRMT1L2, IR1B4 | protein arginine methyltransferase 1 | breast cancer [ |
| NOX1 | GP91-2, MOX1, NOH-1, NOH1 | NADPH oxidase 1 | colon cancer [ |
| ADH7 | ADH4 | alcohol dehydrogenase 7 (class IV), mu or sigma polypeptide | gastric cancer [ |
| DSG3 | CDHF6, PVA | desmoglein 3 | bladder carcinoma [ |
| NKX2-1 | BCH, BHC, NK-2, NKX2.1, NKX2A, T/EBP, TEBP, TITF1, TTF-1, TTF1 | NK2 homeobox 1 | lung cancer [ |
| EFHD1 | MST133, MSTP133, PP3051, SWS2 | EF-hand domain family, member D1 | colorectal cancer [ |
| EREG | EPR, ER, Ep | epiregulin | colorectal cancer [ |
| DHRS2 | HEP27, SDR25C1 | dehydrogenase/reductase (SDR family) member 2 | breast cancer [ |
| ENPEP | APA, CD249, gp160 | glutamyl aminopeptidase (aminopeptidase A) | prostate cancer [ |
| SCGB2A2 | MGB1, UGB2 | secretoglobin, family 2A, member 2 | breast cancer [ |
| KRT13 | CK13, K13, WSN2 | keratin 13, type I | breast cancer [ |
| SERPINC1 | AT3, AT3D, ATIII, THPH7 | serpin peptidase inhibitor, clade C (antithrombin), member 1 | bladder cancer [ |
| SLC12A2 | BSC, BSC2, NKCC1, PPP1R141 | solute carrier family 12 (sodium/potassium/chloride transporter), member 2 | esophageal squamous cell carcinoma [ |
| IRF4 | LSIRF, MUM1, NF-EM5, SHEP8 | interferon regulatory factor 4 | hematological malignancies [ |
| GPA33 | A33 | glycoprotein A33 (transmembrane) | colorectal cancer [ |
| BCAT1 | BCATC, BCT1, ECA39, MECA39, PNAS121, PP18 | branched chain amino-acid transaminase 1, cytosolic | colorectal cancer [ |
| COL10A1 | | collagen, type X, alpha 1 | breast cancer [ |
| CEL | BAL, BSDL, BSSLL, CEase, FAP, FAPP, LIPA, MODY8, CEL | carboxyl ester lipase | pancreatic cysts [ |
| NPC2 | EDDM1, HE1 | Niemann-Pick disease, type C2 | liver cancer [ |
| CDH17 | CDH16, HPT-1, HPT1 | cadherin 17, LI cadherin (liver-intestine) | gastric cancer [ |
| MEIS1 | | Meis homeobox 1 | pancreatic cancer [ |
| KLK3 | APS, KLK2A1, PSA, hK3 | kallikrein-related peptidase 3 | prostate cancer [ |
| CXCL13 | ANGIE, ANGIE2, BCA-1, BCA1, BLC, BLR1L, SCYB13 | chemokine (C-X-C motif) ligand 13 | breast cancer [ |
| ELA3A | ELA3, ELA3A | chymotrypsin-like elastase family, member 3A | pancreatic carcinoma [ |
| IRX5 | HMMS, IRX-2a, IRXB2 | iroquois homeobox 5 | prostate cancer [ |
| VCAM1 | CD106, INCAM-100 | vascular cell adhesion molecule 1 | ovarian cancer [ |
| P4HB | CLCRP1, DSI, ERBA2L, GIT, P4Hbeta, PDI, PDIA1, PHDB, PO4DB, PO4HB, PROHB | prolyl 4-hydroxylase, beta polypeptide | glioblastoma multiforme [ |