Maryam Farhadian, Hossein Mahjub, Jalal Poorolajal, Abbas Moghimbeigi, Muharram Mansoorizadeh.
Abstract
OBJECTIVES: Classifying breast cancer patients into risk classes is very important in clinical applications. High-dimensional gene expression data are expected to improve patient classification. This study presents a new method, based on the wavelet transform (WT), for transforming high-dimensional gene expression data into a low-dimensional space.
Keywords: breast cancer; microarray data; supervised wavelet; support vector machine
Year: 2014; PMID: 25562040; PMCID: PMC4281603; DOI: 10.1016/j.phrp.2014.09.002
Source DB: PubMed; Journal: Osong Public Health Res Perspect; ISSN: 2210-9099
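The wavelet-based reduction described in the abstract can be illustrated with a minimal sketch. Db1 is the Haar wavelet, and the labels Db1.1 and Db1.2 in the tables appear to denote decomposition levels 1 and 2 (the external-validation table writes them as "Db1. Level 1" and "Db1. Level 2"). The sketch assumes that the approximation coefficients are kept as the reduced features and that odd-length inputs are padded by repeating the last value; neither detail is stated in this record, and the function name and sample values are hypothetical.

```python
import math


def haar_approximation(signal, level=1):
    """Return Haar (Db1) approximation coefficients after `level` decompositions.

    Each level replaces adjacent pairs (a, b) with (a + b) / sqrt(2),
    roughly halving the number of features.
    """
    coeffs = list(signal)
    for _ in range(level):
        if len(coeffs) < 2:
            break
        if len(coeffs) % 2 == 1:
            # Assumed boundary rule: repeat the last value to get an even length.
            coeffs.append(coeffs[-1])
        coeffs = [
            (coeffs[i] + coeffs[i + 1]) / math.sqrt(2)
            for i in range(0, len(coeffs), 2)
        ]
    return coeffs


# Hypothetical expression profile of 8 preselected genes for one patient.
profile = [2.0, 4.0, 6.0, 8.0, 1.0, 3.0, 5.0, 7.0]
level1 = haar_approximation(profile, level=1)  # 4 features
level2 = haar_approximation(profile, level=2)  # 2 features
```

Under these assumptions, 70 preselected genes would yield 35 features at level 1 and 18 at level 2, which would then be passed to the SVM classifier.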
Results for supervised wavelet and supervised principal component analysis (PCA): NKI_97, 10 times 10-fold cross-validation.
| Classifier | No. of preselected genes | Feature extraction | Accuracy (%) | Sensitivity (%) | Specificity (%) | AUC (%) |
|---|---|---|---|---|---|---|
| SVM (linear) | 70 genes (van't Veer) | Wavelet (Db1.1) | 77.11 | 78.30 | 76.15 | 77.22 |
| SVM (linear) | 70 genes (van't Veer) | Wavelet (Db1.2) | 69.11 | 64.47 | 73.00 | 68.74 |
| SVM (linear) | 70 genes (van't Veer) | Supervised PCA | 73.77 | 75.72 | 71.84 | 73.78 |
| SVM (radial) | 70 genes (van't Veer) | Wavelet (Db1.1) | 77.55 | 82.28 | 73.24 | 77.76 |
| SVM (radial) | 70 genes (van't Veer) | Wavelet (Db1.2) | 75.66 | 82.20 | 69.76 | 75.98 |
| SVM (radial) | 70 genes (van't Veer) | Supervised PCA | 71.77 | 71.25 | 72.21 | 71.73 |
| SVM (sigmoid) | 70 genes (van't Veer) | Wavelet (Db1.1) | 78.88 | 78.57 | 79.18 | 78.87 |
| SVM (sigmoid) | 70 genes (van't Veer) | Wavelet (Db1.2) | 71.88 | 74.82 | 69.26 | 72.04 |
| SVM (sigmoid) | 70 genes (van't Veer) | Supervised PCA | 68.77 | 67.58 | 69.73 | 68.66 |
| SVM (linear) | 70 genes | Wavelet (Db1.1) | 72.33 | 67.55 | 76.38 | 71.97 |
| SVM (linear) | 70 genes | Wavelet (Db1.2) | 76.44 | 75.53 | 77.24 | 76.38 |
| SVM (linear) | 70 genes | Supervised PCA | 74.00 | 72.51 | 75.31 | 73.91 |
| SVM (radial) | 70 genes | Wavelet (Db1.1) | 82.77 | 90.14 | 74.46 | 82.30 |
| SVM (radial) | 70 genes | Wavelet (Db1.2) | 82.00 | 88.47 | 76.21 | 82.34 |
| SVM (radial) | 70 genes | Supervised PCA | 75.88 | 75.22 | 76.52 | 75.87 |
| SVM (sigmoid) | 70 genes | Wavelet (Db1.1) | 77.44 | 86.74 | 68.93 | 77.84 |
| SVM (sigmoid) | 70 genes | Wavelet (Db1.2) | 77.00 | 82.86 | 71.72 | 77.29 |
| SVM (sigmoid) | 70 genes | Supervised PCA | 78.22 | 76.83 | 79.45 | 78.14 |
| SVM (linear) | | Wavelet (Db1.1) | 71.00 | 68.40 | 73.09 | 70.75 |
| SVM (linear) | | Wavelet (Db1.2) | 72.88 | 72.09 | 73.67 | 72.88 |
| SVM (linear) | | Supervised PCA | 78.00 | 78.01 | 77.98 | 78.00 |
| SVM (radial) | | Wavelet (Db1.1) | 82.55 | 87.55 | 78.21 | 82.88 |
| SVM (radial) | | Wavelet (Db1.2) | 81.66 | 84.47 | 79.00 | 81.73 |
| SVM (radial) | | Supervised PCA | 79.22 | 83.25 | 75.22 | 79.24 |
| SVM (sigmoid) | | Wavelet (Db1.1) | 79.88 | 88.17 | 72.53 | 80.35 |
| SVM (sigmoid) | | Wavelet (Db1.2) | 78.88 | 86.62 | 70.94 | 78.78 |
| SVM (sigmoid) | | Supervised PCA | 75.55 | 80.00 | 71.48 | 75.74 |
| SVM (linear) | | Wavelet (Db1.1) | 73.77 | 76.62 | 71.34 | 73.98 |
| SVM (linear) | | Wavelet (Db1.2) | 70.88 | 67.78 | 73.95 | 70.86 |
| SVM (linear) | | Supervised PCA | 76.66 | 79.36 | 74.07 | 76.71 |
| SVM (radial) | | Wavelet (Db1.1) | 83.11 | 88.27 | 78.63 | 83.45 |
| SVM (radial) | | Wavelet (Db1.2) | 82.33 | 85.11 | 79.55 | 82.33 |
| SVM (radial) | | Supervised PCA | 77.33 | 82.43 | 72.72 | 77.58 |
| SVM (sigmoid) | | Wavelet (Db1.1) | 80.66 | 89.69 | 72.51 | 81.10 |
| SVM (sigmoid) | | Wavelet (Db1.2) | 80.77 | 85.77 | 76.07 | 80.92 |
| SVM (sigmoid) | | Supervised PCA | 76.00 | 80.87 | 71.86 | 76.37 |
AUC = area under the receiver operating characteristic curve; SVM = support vector machine.
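The four reported measures can be derived from a confusion matrix, as in the hedged sketch below. In the first row of the NKI_97 table the AUC (77.22) is close to the mean of sensitivity (78.30) and specificity (76.15), which is what the AUC reduces to when it is computed from hard class labels at a single operating point. Whether the authors computed it this way is an assumption, and the function name and sample labels are hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, sensitivity, specificity and single-point AUC in percent.

    Labels are 1 (positive class) and 0 (negative class); both classes are
    assumed to be present in y_true.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    sensitivity = 100.0 * tp / (tp + fn)
    specificity = 100.0 * tn / (tn + fp)
    return {
        "accuracy": 100.0 * (tp + tn) / len(pairs),
        "sensitivity": sensitivity,
        "specificity": specificity,
        # Assumption: AUC from hard labels equals the mean of the two rates.
        "auc": (sensitivity + specificity) / 2.0,
    }


# Hypothetical labels for eight patients (1 = poor prognosis).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
```

Reporting sensitivity and specificity alongside accuracy matters when the classes are imbalanced, because a high accuracy can then coexist with a low sensitivity.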
Results for supervised wavelet and supervised principal component analysis (PCA): NKI_295, 10 times 10-fold cross-validation.
| Classifier | No. of preselected genes | Feature extraction | Accuracy (%) | Sensitivity (%) | Specificity (%) | AUC (%) |
|---|---|---|---|---|---|---|
| SVM (linear) | 70 genes (van't Veer) | Wavelet (Db1.1) | 65.10 | 38.32 | 77.82 | 58.07 |
| SVM (linear) | 70 genes (van't Veer) | Wavelet (Db1.2) | 66.13 | 29.71 | 84.33 | 57.02 |
| SVM (linear) | 70 genes (van't Veer) | Supervised PCA | 67.00 | 28.55 | 87.38 | 57.97 |
| SVM (radial) | 70 genes (van't Veer) | Wavelet (Db1.1) | 70.96 | 32.82 | 90.37 | 61.59 |
| SVM (radial) | 70 genes (van't Veer) | Wavelet (Db1.2) | 67.96 | 26.37 | 88.64 | 57.50 |
| SVM (radial) | 70 genes (van't Veer) | Supervised PCA | 65.72 | 18.36 | 91.14 | 54.75 |
| SVM (sigmoid) | 70 genes (van't Veer) | Wavelet (Db1.1) | 63.17 | 24.70 | 81.82 | 53.26 |
| SVM (sigmoid) | 70 genes (van't Veer) | Wavelet (Db1.2) | 64.55 | 19.25 | 88.10 | 53.67 |
| SVM (sigmoid) | 70 genes (van't Veer) | Supervised PCA | 66.27 | 23.73 | 89.04 | 56.39 |
| SVM (linear) | 70 genes | Wavelet (Db1.1) | 70.20 | 48.68 | 81.29 | 64.98 |
| SVM (linear) | 70 genes | Wavelet (Db1.2) | 72.65 | 53.08 | 82.52 | 67.80 |
| SVM (linear) | 70 genes | Supervised PCA | 69.37 | 45.83 | 81.71 | 63.77 |
| SVM (radial) | 70 genes | Wavelet (Db1.1) | 71.13 | 36.98 | 88.76 | 62.87 |
| SVM (radial) | 70 genes | Wavelet (Db1.2) | 70.06 | 39.92 | 86.22 | 63.07 |
| SVM (radial) | 70 genes | Supervised PCA | 70.10 | 34.41 | 89.37 | 61.89 |
| SVM (sigmoid) | 70 genes | Wavelet (Db1.1) | 65.79 | 43.03 | 77.08 | 60.06 |
| SVM (sigmoid) | 70 genes | Wavelet (Db1.2) | 63.44 | 44.50 | 73.72 | 59.11 |
| SVM (sigmoid) | 70 genes | Supervised PCA | 68.86 | 33.92 | 87.55 | 60.74 |
| SVM (linear) | | Wavelet (Db1.1) | 69.68 | 48.65 | 80.87 | 64.76 |
| SVM (linear) | | Wavelet (Db1.2) | 67.20 | 41.12 | 80.87 | 60.99 |
| SVM (linear) | | Supervised PCA | 71.68 | 46.81 | 84.56 | 65.68 |
| SVM (radial) | | Wavelet (Db1.1) | 70.37 | 33.90 | 89.40 | 61.65 |
| SVM (radial) | | Wavelet (Db1.2) | 65.72 | 28.30 | 86.48 | 57.39 |
| SVM (radial) | | Supervised PCA | 70.82 | 40.54 | 86.62 | 63.58 |
| SVM (sigmoid) | | Wavelet (Db1.1) | 65.79 | 44.68 | 76.53 | 60.60 |
| SVM (sigmoid) | | Wavelet (Db1.2) | 66.37 | 41.38 | 79.49 | 60.43 |
| SVM (sigmoid) | | Supervised PCA | 71.10 | 45.46 | 84.21 | 64.83 |
| SVM (linear) | | Wavelet (Db1.1) | 72.37 | 46.50 | 86.00 | 66.25 |
| SVM (linear) | | Wavelet (Db1.2) | 70.43 | 80.97 | 57.00 | 67.24 |
| SVM (linear) | | Supervised PCA | 73.03 | 46.51 | 86.76 | 66.63 |
| SVM (radial) | | Wavelet (Db1.1) | 75.37 | 52.85 | 87.21 | 70.03 |
| SVM (radial) | | Wavelet (Db1.2) | 74.58 | 49.18 | 86.48 | 67.83 |
| SVM (radial) | | Supervised PCA | 71.06 | 39.56 | 88.05 | 63.81 |
| SVM (sigmoid) | | Wavelet (Db1.1) | 72.44 | 42.36 | 88.01 | 65.19 |
| SVM (sigmoid) | | Wavelet (Db1.2) | 74.34 | 47.21 | 88.38 | 67.80 |
| SVM (sigmoid) | | Supervised PCA | 69.10 | 49.47 | 78.63 | 64.05 |
AUC = area under the receiver operating characteristic curve; SVM = support vector machine.
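The "10 times 10-fold cross-validation" protocol named in the table captions can be sketched as follows. This is a minimal illustration, not the authors' procedure: stratification by class, the random seed, the function name, and the sample labels are all assumptions made for the example.

```python
import random
from collections import defaultdict


def repeated_stratified_kfold(labels, k=10, repeats=10, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated stratified k-fold CV."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    for r in range(repeats):
        folds = [[] for _ in range(k)]
        offset = 0
        for members in by_class.values():
            shuffled = members[:]
            rng.shuffle(shuffled)
            # Deal each class round-robin so every fold keeps the class balance.
            for j, idx in enumerate(shuffled):
                folds[(offset + j) % k].append(idx)
            offset += len(shuffled)
        for f in range(k):
            test_idx = sorted(folds[f])
            train_idx = sorted(i for g in range(k) if g != f for i in folds[g])
            yield r, f, train_idx, test_idx


# Hypothetical labels for a 97-sample data set (class sizes are made up).
labels = [1] * 40 + [0] * 57
splits = list(repeated_stratified_kfold(labels, k=10, repeats=10, seed=0))
```

Each of the 100 resulting splits would train a classifier on `train_idx` and score it on `test_idx`; averaging the scores over all splits gives one figure per table cell. To avoid optimistic bias, any supervised gene preselection would normally be repeated inside each training fold.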
Results for supervised wavelet and supervised principal component analysis (PCA): VDX_286, 10 times 10-fold cross-validation.
| Classifier | No. of preselected genes | Feature extraction | Accuracy (%) | Sensitivity (%) | Specificity (%) | AUC (%) |
|---|---|---|---|---|---|---|
| SVM (linear) | 76 genes (Wang) | Wavelet (Db1.1) | 64.42 | 44.42 | 76.25 | 60.33 |
| SVM (linear) | 76 genes (Wang) | Wavelet (Db1.2) | 66.39 | 44.86 | 79.13 | 61.99 |
| SVM (linear) | 76 genes (Wang) | Supervised PCA | 68.17 | 39.13 | 85.82 | 62.47 |
| SVM (radial) | 76 genes (Wang) | Wavelet (Db1.1) | 63.89 | 35.74 | 79.77 | 57.75 |
| SVM (radial) | 76 genes (Wang) | Wavelet (Db1.2) | 65.10 | 28.97 | 87.45 | 58.21 |
| SVM (radial) | 76 genes (Wang) | Supervised PCA | 67.82 | 33.97 | 87.88 | 60.92 |
| SVM (sigmoid) | 76 genes (Wang) | Wavelet (Db1.1) | 66.92 | 45.49 | 79.66 | 62.58 |
| SVM (sigmoid) | 76 genes (Wang) | Wavelet (Db1.2) | 65.64 | 43.42 | 79.11 | 61.27 |
| SVM (sigmoid) | 76 genes (Wang) | Supervised PCA | 67.39 | 43.54 | 81.28 | 62.41 |
| SVM (linear) | 76 genes | Wavelet (Db1.1) | 75.17 | 61.97 | 83.02 | 72.50 |
| SVM (linear) | 76 genes | Wavelet (Db1.2) | 76.35 | 59.94 | 85.99 | 72.96 |
| SVM (linear) | 76 genes | Supervised PCA | 67.96 | 42.04 | 83.65 | 62.85 |
| SVM (radial) | 76 genes | Wavelet (Db1.1) | 76.07 | 60.80 | 84.86 | 72.83 |
| SVM (radial) | 76 genes | Wavelet (Db1.2) | 77.25 | 56.48 | 89.23 | 72.86 |
| SVM (radial) | 76 genes | Supervised PCA | 67.32 | 37.17 | 85.37 | 61.27 |
| SVM (sigmoid) | 76 genes | Wavelet (Db1.1) | 77.21 | 62.41 | 86.10 | 74.26 |
| SVM (sigmoid) | 76 genes | Wavelet (Db1.2) | 71.57 | 61.79 | 77.34 | 69.56 |
| SVM (sigmoid) | 76 genes | Supervised PCA | 68.10 | 42.85 | 82.77 | 62.81 |
| SVM (linear) | | Wavelet (Db1.1) | 78.21 | 67.05 | 84.60 | 75.83 |
| SVM (linear) | | Wavelet (Db1.2) | 79.21 | 64.46 | 87.61 | 76.04 |
| SVM (linear) | | Supervised PCA | 76.00 | 68.76 | 80.66 | 74.71 |
| SVM (radial) | | Wavelet (Db1.1) | 77.00 | 58.65 | 87.56 | 73.10 |
| SVM (radial) | | Wavelet (Db1.2) | 75.17 | 54.41 | 88.33 | 71.37 |
| SVM (radial) | | Supervised PCA | 75.00 | 60.97 | 83.68 | 72.33 |
| SVM (sigmoid) | | Wavelet (Db1.1) | 77.03 | 65.75 | 83.54 | 74.65 |
| SVM (sigmoid) | | Wavelet (Db1.2) | 78.50 | 66.79 | 85.59 | 76.19 |
| SVM (sigmoid) | | Supervised PCA | 75.21 | 64.96 | 81.63 | 73.30 |
| SVM (linear) | | Wavelet (Db1.1) | 77.00 | 67.04 | 83.02 | 75.03 |
| SVM (linear) | | Wavelet (Db1.2) | 78.17 | 65.57 | 85.62 | 75.60 |
| SVM (linear) | | Supervised PCA | 75.96 | 66.14 | 82.11 | 74.12 |
| SVM (radial) | | Wavelet (Db1.1) | 75.96 | 55.15 | 88.20 | 71.68 |
| SVM (radial) | | Wavelet (Db1.2) | 76.17 | 53.57 | 89.45 | 71.51 |
| SVM (radial) | | Supervised PCA | 75.57 | 63.50 | 82.98 | 73.24 |
| SVM (sigmoid) | | Wavelet (Db1.1) | 77.32 | 66.18 | 83.91 | 75.04 |
| SVM (sigmoid) | | Wavelet (Db1.2) | 74.67 | 59.40 | 83.36 | 71.38 |
| SVM (sigmoid) | | Supervised PCA | 74.28 | 65.61 | 79.19 | 72.40 |
AUC = area under the receiver operating characteristic curve; SVM = support vector machine.
External validation for supervised wavelet: NKI_234_61, 10 times 10-fold cross-validation.
| Classifier | No. of preselected genes | Wavelet | Accuracy (%) | Sensitivity (%) | Specificity (%) | AUC (%) |
|---|---|---|---|---|---|---|
| SVM (linear) | 70 genes | Db1, level 1 | 67.83 | 75.63 | 59.15 | 67.39 |
| SVM (linear) | 70 genes | Db1, level 2 | 64.33 | 69.45 | 58.82 | 64.13 |
| SVM (radial) | 70 genes | Db1, level 1 | 64.50 | 72.47 | 54.94 | 63.71 |
| SVM (radial) | 70 genes | Db1, level 2 | 67.66 | 67.94 | 67.36 | 67.65 |
| SVM (sigmoid) | 70 genes | Db1, level 1 | 65.66 | 72.93 | 58.24 | 65.59 |
| SVM (sigmoid) | 70 genes | Db1, level 2 | 62.16 | 56.06 | 68.47 | 62.27 |
| SVM (linear) | | Db1, level 1 | 64.00 | 68.81 | 59.34 | 64.07 |
| SVM (linear) | | Db1, level 2 | 61.50 | 53.96 | 69.82 | 61.89 |
| SVM (radial) | | Db1, level 1 | 71.83 | 78.33 | 65.33 | 71.83 |
| SVM (radial) | | Db1, level 2 | 69.00 | 70.16 | 67.86 | 69.01 |
| SVM (sigmoid) | | Db1, level 1 | 70.66 | 65.06 | 76.73 | 70.90 |
| SVM (sigmoid) | | Db1, level 2 | 68.83 | 67.89 | 69.76 | 68.83 |
AUC = area under the receiver operating characteristic curve; SVM = support vector machine.
Previously published analyses for the breast cancer data.
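| Study | No. of samples | Feature selection | Classifier | Measure | Validation method |
|---|---|---|---|---|---|
| Current study | 97 | Supervised wavelet | SVM radial kernel | Accuracy: 83.11 | CV |
| Current study | 97 | Supervised PCA | SVM radial kernel | Accuracy: 79.22 | CV |
| Current study | 295 | Supervised wavelet | SVM radial kernel | Accuracy: 75.37 | CV |
| Current study | 295 | Supervised PCA | SVM linear kernel | Accuracy: 73.03 | CV |
| Current study | 286 | Supervised wavelet | SVM linear kernel | Accuracy: 79.21 | CV |
| Current study | 286 | Supervised PCA | SVM linear kernel | Accuracy: 76.00 | CV |
| Michiels et al (2005) | 97 | Correlation | Nearest-centroid | Accuracy: 68.00 | CV |
| Peng (2005) | 97 | Signal to noise ratio | SVM | Accuracy: 75.00 | Leave-one-out CV |
| Peng (2005) | 97 | Signal to noise ratio | Bagg & Boost SVM | Accuracy: 77.00 | Leave-one-out CV |
| Peng (2005) | 97 | Subsampling | Ensemble SVM | Accuracy: 81.00 | Leave-one-out CV |
| Pochet et al (2004) | 78+19* | None | LS-SVM linear kernel | Accuracy: 69.00 | Leave-one-out CV |
| Pochet et al (2004) | 78+19* | None | SVM RBF kernel | Accuracy: 69.00 | Leave-one-out CV |
| Pochet et al (2004) | 78+19* | None | SVM linear kernel | Accuracy: 52.00 | Leave-one-out CV |
| Alexe et al (2006) | 78+19 | Support set identified by logical analysis of data | SVM linear kernel | Accuracy: 77.00 | CV |
| Alexe et al (2006) | 78+19 | Support set identified by logical analysis of data | Artificial NN | Accuracy: 79.00 | CV |
| Alexe et al (2006) | 78+19 | Support set identified by logical analysis of data | Logistic regression | Accuracy: 78.00 | CV |
| Alexe et al (2006) | 78+19 | Support set identified by logical analysis of data | Nearest neighbors | Accuracy: 76.00 | CV |
| Alexe et al (2006) | 78+19 | Support set identified by logical analysis of data | Decision trees (C4.5) | Accuracy: 67.00 | CV |
| Jahid et al (2012) | 295 | Steiner tree based method | SVM | Accuracy: 62.00 | CV |
| Jahid et al (2012) | 286 | Steiner tree based method | SVM | Accuracy: 61.00 | CV |
| Chuang et al (2007) | 295 | Subnetwork marker | SVM | Accuracy: 72.00 | CV |
| Chuang et al (2007) | 286 | Subnetwork marker | SVM | Accuracy: 62.00 | CV |
| van Vliet et al (2012) | 295 | Filtering approach ( | Nearest mean classifier | AUC: 73.80 | CV |
| Dehnavi et al (2013) | 286 | Rough-set theory | Neuro-fuzzy system | Accuracy: 78.00 | 10-fold CV |
| Lee et al (2011) | 286 | Modules with condition responsive correlations | Naïve Bayesian classifier | AUC: 0.62 | Leave-one-out CV |
| Jahid et al (2014) | 295 | Patient–patient co-expression networks | PC-classifier | AUC: 0.78 | Leave-one-out CV |
| Jahid et al (2014) | 295 | Patient–patient co-expression networks | Dagging | AUC: 0.72 | Leave-one-out CV |
| Jahid et al (2014) | 295 | Patient–patient co-expression networks | AdaBoost | AUC: 0.66 | Leave-one-out CV |
| Jahid et al (2014) | 286 | Patient–patient co-expression networks | PC-classifier | AUC: 0.68 | Leave-one-out CV |
| Jahid et al (2014) | 286 | Patient–patient co-expression networks | Dagging | AUC: 0.61 | Leave-one-out CV |
| Jahid et al (2014) | 286 | Patient–patient co-expression networks | AdaBoost | AUC: 0.55 | Leave-one-out CV |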
AUC = area under the receiver operating characteristic curve; CV = cross-validation; PCA = principal component analysis; RBF = radial basis function; SVM = support vector machine.