Der-Chiang Li, Susan C Hu, Liang-Sian Lin, Chun-Wu Yeh.
Abstract
Learning models struggle to achieve high classification performance on imbalanced data sets: when one class is much larger than the others, most machine learning and data mining classifiers are overly influenced by the larger classes and ignore the smaller ones. As a result, classification algorithms often learn poorly because convergence is slow in the smaller classes. To balance such data sets, this paper presents a strategy that reduces the size of the majority data and generates synthetic samples for the minority data. In the reduction step, we use the box-and-whisker plot approach to exclude outliers and the Mega-Trend-Diffusion method to find representative data from the majority data. To generate the synthetic samples, we propose a counterintuitive hypothesis to find the distribution shape of the minority data, and then produce samples according to this distribution. Four real data sets were used to examine the performance of the proposed approach. We used paired t-tests to compare the Accuracy, G-mean, and F-measure scores of the proposed data pre-processing (PPDP) method combined with the D3C method (PPDP+D3C) with those of one-sided selection (OSS), the well-known SMOTEBoost (SB) method, the normal distribution-based oversampling (NDO) approach, and the PPDP method alone. The results indicate that the classification performance of the proposed approach is better than that of the above-mentioned methods.
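
The box-and-whisker step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact rule, so this assumes the common Tukey convention (fences at 1.5 times the interquartile range beyond the quartiles) applied to a single feature; the function names `tukey_fences` and `drop_outliers` and the sample values are hypothetical.

```python
from statistics import quantiles


def tukey_fences(values, k=1.5):
    """Return the (lower, upper) box-plot fences for a list of numbers.

    Assumption: fences sit k * IQR beyond the first and third quartiles,
    the usual box-and-whisker convention; the paper may use another rule.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_outliers(values, k=1.5):
    """Keep only the values that fall inside the box-plot fences."""
    lower, upper = tukey_fences(values, k)
    return [v for v in values if lower <= v <= upper]


# Made-up one-dimensional feature with one obvious outlier.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
kept = drop_outliers(sample)
print(tukey_fences(sample), kept)
```

For multi-feature data, one plausible extension is to apply the same filter per feature and drop any instance flagged on at least one feature, but that choice is not stated in the abstract.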
Year: 2017 PMID: 28771522 PMCID: PMC5542532 DOI: 10.1371/journal.pone.0181853
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Confusion matrix.
| | | Predicted class | |
|---|---|---|---|
| | | Positive | Negative |
| Actual class | Positive | TP | FN |
| | Negative | FP | TN |
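
The three scores reported in the result tables (ACC, G-mean, F1) can all be derived from the four cells of a binary confusion matrix. The sketch below uses the conventional cell names TP, FN, FP, and TN and the standard textbook definitions; the record does not reproduce the paper's formulas, so treating G-mean as the geometric mean of sensitivity and specificity, and the positive class as the minority class, are assumptions. The function name and the counts are hypothetical.

```python
from math import sqrt


def imbalance_metrics(tp, fn, fp, tn):
    """Compute accuracy, G-mean, and F1 from confusion-matrix counts.

    Assumed definitions (standard, not quoted from the paper):
      accuracy = (TP + TN) / total
      G-mean   = sqrt(sensitivity * specificity)
      F1       = harmonic mean of precision and recall
    """
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total if total else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    g_mean = sqrt(recall * specificity)
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "g_mean": g_mean, "f1": f1}


# Made-up counts: 10 positive (minority) and 90 negative instances.
scores = imbalance_metrics(tp=8, fn=2, fp=10, tn=80)
print(scores)
```

With these counts accuracy is high while F1 is much lower, which is the kind of gap that motivates reporting G-mean and F1 alongside accuracy on imbalanced data.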
Fig 1. Data trend estimation.
Fig 2. The proposed procedure for learning imbalanced data sets.
Fig 3. The testing procedure for imbalanced data sets.
Data set description.
| Data Set | No. Instances | No. Features | Feature Characteristics | Minority class (%) |
|---|---|---|---|---|
| WDBC | 569 | 30 | Numeric | 37.26 |
| PD | 195 | 22 | Numeric | 24.62 |
| VC | 310 | 6 | Numeric | 32.22 |
| HS | 306 | 3 | Numeric | 26.47 |
The results of four classifiers for the WDBC, PD, VC, and HS data sets.
| Data set | WDBC | |||
| classifiers | NB | 3-NN | SVM-linear | SVM-poly |
| ACC | 61.39 | 59.63 | 61.12 | |
| G-mean | 53.68 | 47.88 | 53.29 | |
| F1 | 49.71 | 44.14 | 49.34 | |
| Data set | PD | |||
| classifiers | NB | 3-NN | SVM-linear | SVM-poly |
| ACC | 66.31 | 53.91 | 55.41 | |
| G-mean | 51.71 | 10.13 | 37.30 | |
| F1 | 42.72 | 6.73 | 27.52 | |
| Data set | VC | |||
| classifiers | NB | 3-NN | SVM-linear | SVM-poly |
| ACC | 62.25 | 49.15 | 48.56 | |
| G-mean | 30.55 | 15.98 | 44.61 | |
| F1 | 20.56 | 7.32 | 37.21 | |
| Data set | HS | |||
| classifiers | NB | 3-NN | SVM-linear | SVM-poly |
| ACC | 68.32 | 43.86 | 41.46 | |
| G-mean | 18.57 | 7.62 | 28.75 | |
| F1 | 10.05 | 2.71 | 14.97 | |
The results for different α-cut values.
| Data set | α-cut | 0.1 | 0.2 | 0.3 | 0.4 | 0.5 | 0.6 | 0.7 | 0.8 | 0.9 |
| WDBC | ACC | 71.57 | 72.41 | 73.69 | 72.28 | 71.28 | 73.21 | 74.06 | 74.47 | |
| G-mean | 64.73 | 66.51 | 68.77 | 65.85 | 64.16 | 66.85 | 69.36 | 69.65 | ||
| F1 | 58.82 | 61.05 | 64.01 | 60.33 | 58.10 | 61.51 | 64.57 | 64.91 | ||
| 13 | 13 | 13 | 13 | 13 | 14 | 13 | 13 | 13 | ||
| 13 | 13 | 13 | 13 | 13 | 14 | 15 | 17 | 25 | ||
| 41 | 41 | 41 | 41 | 41 | 40 | 39 | 37 | 29 | ||
| 88 | 88 | 88 | 88 | 88 | 86 | 84 | 80 | 64 | ||
| Data set | α-cut | 0.1 | 0.2 | 0.3 | 0.4 | 0.5 | 0.6 | 0.7 | 0.8 | 0.9 |
| PD | ACC | 67.25 | 67.61 | 64.84 | 67.21 | 64.53 | 67.62 | 67.70 | 67.64 | |
| G-mean | 60.03 | 61.08 | 58.60 | 59.19 | 57.46 | 60.81 | 59.72 | 60.35 | ||
| F1 | 50.27 | 51.54 | 48.00 | 49.21 | 46.60 | 51.11 | 50.03 | 50.61 | ||
| 10 | 10 | 10 | 10 | 11 | 11 | 10 | 10 | 10 | ||
| 10 | 10 | 11 | 10 | 12 | 12 | 14 | 19 | 29 | ||
| 44 | 44 | 43 | 44 | 42 | 42 | 40 | 35 | 25 | ||
| 94 | 94 | 92 | 94 | 90 | 90 | 86 | 76 | 56 | ||
| Data set | α-cut | 0.1 | 0.2 | 0.3 | 0.4 | 0.5 | 0.6 | 0.7 | 0.8 | 0.9 |
| VC | ACC | 58.66 | 59.50 | 59.38 | 58.61 | 56.70 | 57.82 | 56.28 | 53.90 | |
| G-mean | 56.47 | 54.88 | 56.72 | 55.44 | 55.62 | 57.76 | 57.42 | 54.84 | ||
| F1 | 47.91 | 45.76 | 48.36 | 46.80 | 47.49 | 50.38 | 50.44 | 48.51 | ||
| 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | ||
| 8 | 9 | 12 | 14 | 18 | 23 | 29 | 35 | 46 | ||
| 46 | 45 | 42 | 40 | 36 | 31 | 26 | 19 | 8 | ||
| 98 | 96 | 90 | 86 | 78 | 68 | 58 | 44 | 22 | ||
| Data set | α-cut | 0.1 | 0.2 | 0.3 | 0.4 | 0.5 | 0.6 | 0.7 | 0.8 | 0.9 |
| HS | ACC | 57.51 | 57.11 | 55.30 | 57.67 | 57.84 | 55.49 | 54.63 | 52.35 | |
| G-mean | 51.39 | 51.11 | 49.46 | 51.71 | 51.87 | 50.64 | 50.18 | 48.04 | ||
| F1 | 38.29 | 38.18 | 36.54 | 38.92 | 39.08 | 37.93 | 37.37 | 36.66 | ||
| 6 | 7 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | ||
| 16 | 18 | 22 | 24 | 27 | 31 | 37 | 40 | 50 | ||
| 38 | 36 | 32 | 30 | 27 | 23 | 17 | 14 | 4 | ||
| 82 | 78 | 70 | 66 | 60 | 52 | 40 | 34 | 14 |
The results of the six methods on the WDBC dataset.
| Method | 60 | 80 | 100 | 150 | |||||
| ACC | OSS | 74.98 | (0.00) | 74.58 | (0.00) | 73.23 | (0.00) | 71.74 | (0.00) |
| SB | 69.83 | (0.00) | 70.15 | (0.00) | 68.77 | (0.00) | 67.17 | (0.00) | |
| NDO | 74.01 | (0.00) | 74.87 | (0.00) | 73.19 | (0.00) | 71.21 | (0.00) | |
| PPDP | 73.51 | (0.00) | 74.37 | (0.00) | 72.38 | (0.00) | 71.42 | (0.00) | |
| D3C | 88.28 | (0.00) | 89.20 | (0.03) | 89.54 | (0.09) | 91.03 | (0.03) | |
| PPDP+D3C | - | - | - | - | |||||
| G-mean | OSS | 68.01 | (0.00) | 68.89 | (0.00) | 68.33 | (0.00) | 69.40 | (0.00) |
| SB | 59.53 | (0.00) | 62.13 | (0.00) | 61.27 | (0.00) | 63.19 | (0.00) | |
| NDO | 68.07 | (0.00) | 70.37 | (0.00) | 68.76 | (0.00) | 69.08 | (0.00) | |
| PPDP | 67.91 | (0.00) | 70.74 | (0.00) | 68.97 | (0.00) | 70.67 | (0.00) | |
| D3C | 85.85 | (0.02) | 87.37 | (0.06) | 88.29 | (0.13) | 90.91 | (0.05) | |
| PPDP+D3C | - | - | - | - | |||||
| F1 | OSS | 63.44 | (0.00) | 64.70 | (0.00) | 64.10 | (0.00) | 65.56 | (0.00) |
| SB | 52.70 | (0.00) | 56.04 | (0.00) | 55.07 | (0.00) | 57.77 | (0.00) | |
| NDO | 63.12 | (0.00) | 66.29 | (0.00) | 64.42 | (0.00) | 65.18 | (0.00) | |
| PPDP | 62.66 | (0.00) | 66.50 | (0.00) | 64.46 | (0.00) | 66.98 | (0.00) | |
| D3C | 84.05 | (0.01) | 86.05 | (0.04) | 87.20 | (0.13) | 90.51 | (0.05) | |
| PPDP+D3C | - | - | - | - | |||||
| Method | 60 | 80 | 100 | 150 | |||||
| ACC | OSS | 75.08 | (0.00) | 74.85 | (0.00) | 75.22 | (0.00) | 73.72 | (0.00) |
| SB | 71.30 | (0.00) | 71.34 | (0.00) | 71.61 | (0.00) | 70.03 | (0.00) | |
| NDO | 74.82 | (0.00) | 74.99 | (0.00) | 75.14 | (0.00) | 72.81 | (0.00) | |
| PPDP | 73.96 | (0.00) | 75.02 | (0.00) | 74.63 | (0.00) | 74.36 | (0.00) | |
| D3C | 90.85 | (0.03) | 90.66 | (0.08) | 92.11 | (0.58) | 92.35 | (0.00) | |
| PPDP+D3C | - | - | - | - | |||||
| G-mean | OSS | 69.00 | (0.00) | 70.00 | (0.00) | 71.54 | (0.00) | 71.92 | (0.00) |
| SB | 63.50 | (0.00) | 64.92 | (0.00) | 66.66 | (0.00) | 67.40 | (0.00) | |
| NDO | 69.66 | (0.00) | 70.79 | (0.00) | 72.05 | (0.00) | 71.22 | (0.00) | |
| PPDP | 69.41 | (0.00) | 72.23 | (0.00) | 72.87 | (0.00) | 74.60 | (0.00) | |
| D3C | 89.57 | (0.05) | 89.48 | (0.08) | 91.55 | (0.36) | 92.26 | (0.00) | |
| PPDP+D3C | - | - | - | - | |||||
| F1 | OSS | 64.63 | (0.00) | 66.00 | (0.00) | 67.92 | (0.00) | 68.65 | (0.00) |
| SB | 57.46 | (0.00) | 59.51 | (0.00) | 61.76 | (0.00) | 63.05 | (0.00) | |
| NDO | 65.12 | (0.00) | 66.82 | (0.00) | 68.47 | (0.00) | 67.81 | (0.00) | |
| PPDP | 64.55 | (0.00) | 68.25 | (0.00) | 69.15 | (0.00) | 71.74 | (0.00) | |
| D3C | 88.20 | (0.04) | 88.29 | (0.08) | 90.69 | (0.41) | 91.85 | (0.00) | |
| PPDP+D3C | - | - | - | - | |||||
| Method | 60 | 80 | 100 | 150 | |||||
| ACC | OSS | 76.83 | (0.00) | 76.37 | (0.00) | 75.61 | (0.00) | 75.23 | (0.00) |
| SB | 74.31 | (0.00) | 73.27 | (0.00) | 73.23 | (0.00) | 73.13 | (0.00) | |
| NDO | 77.10 | (0.00) | 76.42 | (0.00) | 75.04 | (0.00) | 75.22 | (0.00) | |
| PPDP | 75.99 | (0.00) | 75.62 | (0.00) | 75.41 | (0.00) | 76.45 | (0.00) | |
| D3C | 92.07 | (0.03) | 91.82 | (0.13) | 91.98 | (0.02) | 92.51 | (0.00) | |
| PPDP+D3C | - | - | - | - | |||||
| G-mean | OSS | 71.89 | (0.00) | 72.43 | (0.00) | 72.12 | (0.00) | 73.67 | (0.00) |
| SB | 68.32 | (0.00) | 67.77 | (0.00) | 69.10 | (0.00) | 71.04 | (0.00) | |
| NDO | 72.93 | (0.00) | 72.84 | (0.00) | 71.83 | (0.00) | 73.98 | (0.00) | |
| PPDP | 72.55 | (0.00) | 73.18 | (0.00) | 73.55 | (0.00) | 76.74 | (0.00) | |
| D3C | 90.82 | (0.00) | 91.07 | (0.04) | 91.03 | (0.01) | 92.24 | (0.00) | |
| PPDP+D3C | - | - | - | - | |||||
| F1 | OSS | 68.27 | (0.00) | 68.83 | (0.00) | 68.55 | (0.00) | 70.73 | (0.00) |
| SB | 63.52 | (0.00) | 63.01 | (0.00) | 64.74 | (0.00) | 67.49 | (0.00) | |
| NDO | 69.22 | (0.00) | 69.29 | (0.00) | 68.11 | (0.00) | 71.09 | (0.00) | |
| PPDP | 68.40 | (0.00) | 69.36 | (0.00) | 69.90 | (0.00) | 74.18 | (0.00) | |
| D3C | 89.73 | (0.01) | 89.91 | (0.06) | 90.18 | (0.01) | 91.70 | (0.00) | |
| PPDP+D3C | - | - | - | - | |||||
| Method | 60 | 80 | 100 | 150 | |||||
| ACC | OSS | 76.11 | (0.00) | 77.48 | (0.00) | 76.61 | (0.00) | 77.16 | (0.00) |
| SB | 73.97 | (0.00) | 75.14 | (0.00) | 75.08 | (0.00) | 74.11 | (0.00) | |
| NDO | 76.25 | (0.00) | 77.70 | (0.00) | 76.97 | (0.00) | 75.64 | (0.00) | |
| PPDP | 75.76 | (0.00) | 77.13 | (0.00) | 77.17 | (0.00) | 76.17 | (0.00) | |
| D3C | 91.48 | (0.00) | 92.35 | (0.00) | 92.46 | (0.07) | 93.78 | (0.03) | |
| PPDP+D3C | - | - | - | - | |||||
| G-mean | OSS | 71.33 | (0.00) | 74.26 | (0.00) | 73.61 | (0.00) | 75.98 | (0.00) |
| SB | 68.60 | (0.00) | 70.90 | (0.00) | 71.75 | (0.00) | 71.92 | (0.00) | |
| NDO | 72.25 | (0.00) | 74.79 | (0.00) | 74.44 | (0.00) | 74.10 | (0.00) | |
| PPDP | 72.84 | (0.00) | 75.66 | (0.00) | 76.16 | (0.00) | 76.22 | (0.00) | |
| D3C | 90.22 | (0.00) | 91.45 | (0.00) | 91.65 | (0.00) | 93.57 | (0.00) | |
| PPDP+D3C | - | - | - | - | |||||
| F1 | OSS | 67.27 | (0.00) | 71.01 | (0.00) | 70.20 | (0.00) | 73.36 | (0.00) |
| SB | 63.73 | (0.00) | 66.66 | (0.00) | 67.82 | (0.00) | 68.41 | (0.00) | |
| NDO | 68.25 | (0.00) | 71.45 | (0.00) | 71.15 | (0.00) | 71.08 | (0.00) | |
| PPDP | 68.67 | (0.00) | 72.23 | (0.00) | 72.94 | (0.00) | 73.39 | (0.00) | |
| D3C | 88.82 | (0.00) | 90.38 | (0.00) | 90.73 | (0.01) | 93.06 | (0.01) | |
| PPDP+D3C | - | - | - | - | |||||
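
The parenthesized values in the table above appear to be p-values from the paired t-tests described in the abstract, with PPDP+D3C as the reference method (its rows show "-"); this reading is an interpretation, not something the record states. As a minimal sketch of the underlying test, the code below computes the paired t statistic and degrees of freedom from two equal-length score lists using only the standard library. Converting the statistic to a p-value needs a Student t distribution, which is left out here. The function name and the scores are made up for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees_of_freedom) for a paired t-test.

    The i-th entries of both lists must come from the same experimental
    run (same resampled training and test split).
    """
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all paired differences are identical")
    n = len(diffs)
    return mean(diffs) / (sd / sqrt(n)), n - 1


# Made-up accuracy scores from five paired runs of two methods.
method_a = [0.90, 0.88, 0.91, 0.89, 0.92]
method_b = [0.85, 0.86, 0.88, 0.84, 0.90]
t_stat, dof = paired_t_statistic(method_a, method_b)
print(round(t_stat, 3), dof)
```

A large absolute t value relative to the chosen degrees of freedom corresponds to a small p-value, that is, a difference between the two methods that is unlikely to be due to run-to-run variation alone.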
The results of the six methods on the PD dataset.
| Method | 60 | 80 | 100 | 150 | |||||
| ACC | OSS | 69.31 | (0.00) | 65.85 | (0.00) | 64.66 | (0.00) | 47.76 | (0.00) |
| SB | 67.28 | (0.00) | 64.17 | (0.00) | 62.07 | (0.00) | 44.96 | (0.00) | |
| NDO | 67.47 | (0.00) | 66.74 | (0.00) | 65.17 | (0.00) | 53.40 | (0.00) | |
| PPDP | 67.86 | (0.36) | 65.92 | (0.09) | 63.93 | (0.03) | 51.53 | (0.00) | |
| D3C | 71.35 | (0.00) | 70.78 | (0.01) | 68.67 | (0.01) | 62.98 | (0.03) | |
| PPDP+D3C | - | - | - | - | |||||
| G-mean | OSS | 55.09 | (0.00) | 54.39 | (0.00) | 58.89 | (0.00) | 62.99 | (0.00) |
| SB | 50.31 | (0.00) | 50.62 | (0.00) | 54.79 | (0.00) | 59.97 | (0.00) | |
| NDO | 60.54 | (0.00) | 61.18 | (0.00) | 62.50 | (0.00) | 64.91 | (0.00) | |
| PPDP | 58.88 | (0.00) | 61.61 | (0.00) | 61.86 | (0.00) | 66.24 | (0.00) | |
| D3C | 62.84 | (0.01) | 65.02 | (0.02) | 65.54 | (0.00) | 69.75 | (0.02) | |
| PPDP+D3C | - | - | - | - | |||||
| F1 | OSS | 45.53 | (0.00) | 45.33 | (0.00) | 52.12 | (0.00) | 61.58 | (0.00) |
| SB | 39.48 | (0.00) | 40.69 | (0.00) | 46.84 | (0.00) | 59.19 | (0.00) | |
| NDO | 50.83 | (0.00) | 53.39 | (0.00) | 56.97 | (0.00) | 67.40 | (0.00) | |
| PPDP | 49.04 | (0.00) | 53.70 | (0.00) | 56.05 | (0.00) | 66.58 | (0.00) | |
| D3C | 52.81 | (0.00) | 57.48 | (0.02) | 60.59 | (0.00) | 75.52 | (0.03) | |
| PPDP+D3C | - | - | - | - | |||||
| Method | 60 | 80 | 100 | 150 | |||||
| ACC | OSS | 71.99 | (0.05) | 70.72 | (0.00) | 70.77 | (0.32) | 66.56 | (0.46) |
| SB | 69.67 | (0.00) | 69.66 | (0.00) | 68.33 | (0.00) | 62.33 | (0.00) | |
| NDO | 69.68 | (0.00) | 70.93 | (0.00) | 68.47 | (0.00) | 63.22 | (0.00) | |
| PPDP | 68.44 | (0.76) | 70.57 | (0.03) | 67.44 | (0.44) | 62.47 | (0.90) | |
| D3C | 73.69 | (0.95) | (0.30) | (0.01) | (0.03) | ||||
| PPDP+D3C | - | 73.60 | - | 71.64 | - | 67.64 | - | ||
| G-mean | OSS | 62.29 | (0.00) | 66.01 | (0.00) | 67.92 | (0.05) | 72.66 | (0.16) |
| SB | 56.79 | (0.00) | 62.42 | (0.00) | 64.04 | (0.00) | 69.42 | (0.35) | |
| NDO | 64.77 | (0.00) | 68.32 | (0.01) | 66.69 | (0.00) | 69.39 | (0.37) | |
| PPDP | 63.78 | (0.02) | 69.02 | (0.77) | 66.90 | (0.33) | 69.94 | (0.37) | |
| D3C | 66.25 | (0.03) | 70.41 | (0.69) | (0.05) | (0.01) | |||
| PPDP+D3C | - | - | 69.92 | - | 70.63 | - | |||
| F1 | OSS | 53.30 | (0.03) | 58.55 | (0.01) | 62.64 | (0.45) | 74.34 | (0.57) |
| SB | 46.75 | (0.00) | 54.37 | (0.00) | 57.73 | (0.00) | 70.18 | (0.00) | |
| NDO | 55.18 | (0.09) | 61.19 | (0.37) | 61.02 | (0.04) | 71.43 | (0.00) | |
| PPDP | 53.72 | (0.06) | 61.75 | (0.68) | 61.16 | (0.27) | 71.68 | (0.08) | |
| D3C | 56.22 | (0.05) | 62.78 | (0.64) | (0.07) | (0.08) | |||
| PPDP+D3C | - | - | 64.55 | - | 76.14 | - | |||
| Method | 60 | 80 | 100 | 150 | |||||
| ACC | OSS | 72.63 | (0.71) | 71.24 | (0.08) | 73.48 | (0.30) | 71.04 | (0.27) |
| SB | 71.93 | (0.32) | 71.89 | (0.27) | 71.72 | (0.49) | 70.07 | (0.66) | |
| NDO | 71.36 | (0.09) | 72.86 | (0.96) | 71.91 | (0.61) | 70.02 | (0.67) | |
| PPDP | 69.81 | (0.08) | 69.68 | (0.05) | 70.22 | (0.00) | 67.76 | (0.01) | |
| D3C | (0.00) | (0.00) | (0.00) | (0.04) | |||||
| PPDP+D3C | 72.99 | - | 72.82 | - | 72.38 | - | 69.47 | - | |
| G-mean | OSS | 67.38 | (0.02) | 66.92 | (0.00) | 71.74 | (0.31) | 73.79 | (0.12) |
| SB | 63.63 | (0.00) | 65.82 | (0.00) | 67.71 | (0.02) | 73.463 | (0.18) | |
| NDO | 66.95 | (0.00) | 70.24 | (0.29) | 70.26 | (0.83) | 73.31 | (0.19) | |
| PPDP | 66.96 | (0.33) | 68.59 | (0.58) | 70.17 | (0.09) | 71.74 | (0.07) | |
| D3C | (0.09) | (0.29) | (0.04) | (0.01) | |||||
| PPDP+D3C | 70.04 | - | 71.17 | - | 70.49 | - | 71.60 | - | |
| F1 | OSS | 57.45 | (0.91) | 58.32 | (0.08) | 65.81 | (0.02) | 73.38 | (0.28) |
| SB | 53.55 | (0.01) | 57.57 | (0.01) | 61.07 | (0.42) | 73.78 | (0.42) | |
| NDO | 56.56 | (0.40) | 62.36 | (0.19) | 63.93 | (0.18) | 74.03 | (0.30) | |
| PPDP | 56.14 | (0.73) | 59.74 | (0.90) | 63.47 | (0.07) | 73.10 | (0.22) | |
| D3C | (0.04) | (0.17) | (0.03) | (0.33) | |||||
| PPDP+D3C | 58.63 | - | 61.91 | - | 63.24 | - | 73.62 | - | |
| Method | 60 | 80 | 100 | 150 | |||||
| ACC | OSS | 74.05 | (0.62) | 74.70 | (0.18) | 74.96 | (0.14) | 73.67 | (0.08) |
| SB | 73.40 | (0.83) | 74.65 | (0.10) | 74.39 | (0.32) | 75.40 | (0.00) | |
| NDO | 72.30 | (0.20) | 74.20 | (0.26) | 74.28 | (0.34) | 74.96 | (0.01) | |
| PPDP | 70.51 | (0.09) | 71.92 | (0.00) | 72.28 | (0.00) | 73.22 | (0.00) | |
| D3C | (0.00) | (0.00) | (0.00) | (0.00) | |||||
| PPDP+D3C | 73.59 | - | 73.41 | - | 73.58 | - | 71.20 | - | |
| G-mean | OSS | 70.97 | (0.33) | 71.78 | (0.80) | 73.88 | (0.05) | 75.35 | (0.04) |
| SB | 67.11 | (0.00) | 70.04 | (0.15) | 72.14 | (0.90) | (0.00) | ||
| NDO | 69.92 | (0.06) | 72.02 | (0.51) | 73.90 | (0.02) | 76.59 | (0.00) | |
| PPDP | 69.12 | (0.40) | 71.44 | (0.04) | 73.12 | (0.00) | 75.89 | (0.00) | |
| D3C | (0.85) | (0.01) | (0.00) | 76.52 | (0.00) | ||||
| PPDP+D3C | 72.05 | - | 71.47 | - | 72.03 | - | 72.23 | - | |
| F1 | OSS | (0.12) | 63.03 | (0.03) | 66.56 | (0.00) | 74.97 | (0.01) | |
| SB | 56.65 | (0.16) | 61.43 | (0.12) | 64.59 | (0.02) | (0.00) | ||
| NDO | 58.89 | (0.80) | 63.03 | (0.00) | 66.23 | (0.00) | 76.51 | (0.00) | |
| PPDP | 57.57 | (0.98) | 61.76 | (0.01) | 64.96 | (0.00) | 76.15 | (0.00) | |
| D3C | 60.64 | (0.34) | (0.00) | (0.00) | 75.03 | (0.00) | |||
| PPDP+D3C | 59.53 | - | 60.51 | - | 62.71 | - | 71.28 | - | |
The results of the six methods on the VC dataset.
| Method | 60 | 80 | 100 | 150 | |||||
| ACC | OSS | 58.70 | (0.00) | 58.68 | (0.00) | 58.91 | (0.00) | 59.49 | (0.00) |
| SB | 58.92 | (0.00) | 60.89 | (0.00) | 60.62 | (0.00) | 62.34 | (0.00) | |
| NDO | 62.76 | (0.00) | 63.30 | (0.00) | 64.56 | (0.00) | 66.84 | (0.01) | |
| PPDP | 59.17 | (0.00) | 58.97 | (0.00) | 60.59 | (0.00) | 60.04 | (0.00) | |
| D3C | (0.65) | (0.37) | (0.85) | 67.38 | (0.15) | ||||
| PPDP+D3C | 67.36 | - | 66.90 | - | 67.26 | - | - | ||
| G-mean | OSS | 52.71 | (0.00) | 55.49 | (0.00) | 57.71 | (0.00) | 61.88 | (0.00) |
| SB | 50.60 | (0.00) | 56.36 | (0.00) | 58.48 | (0.00) | 64.84 | (0.00) | |
| NDO | 59.86 | (0.00) | 62.02 | (0.00) | 64.97 | (0.05) | (0.54) | ||
| PPDP | 56.74 | (0.00) | 58.39 | (0.00) | 62.03 | (0.00) | 62.93 | (0.00) | |
| D3C | 62.26 | (0.24) | 63.36 | (0.44) | 64.46 | (0.06) | 67.95 | (0.47) | |
| PPDP+D3C | - | - | - | 68.61 | - | ||||
| F1 | OSS | 43.41 | (0.00) | 48.05 | (0.00) | 52.21 | (0.00) | 61.70 | (0.00) |
| SB | 40.49 | (0.00) | 48.56 | (0.00) | 52.49 | (0.00) | 64.65 | (0.00) | |
| NDO | 51.99 | (0.01) | 55.91 | (0.03) | 61.16 | (0.07) | 70.62 | (0.44) | |
| PPDP | 47.96 | (0.00) | 51.71 | (0.00) | 58.20 | (0.00) | 64.98 | (0.00) | |
| D3C | 54.43 | (0.24) | 57.06 | (0.30) | 59.48 | (0.00) | 68.15 | (0.00) | |
| PPDP+D3C | - | - | - | - | |||||
| Method | 60 | 80 | 100 | 150 | |||||
| ACC | OSS | 60.02 | (0.00) | 58.26 | (0.00) | 61.62 | (0.00) | 63.89 | (0.00) |
| SB | 60.74 | (0.00) | 60.33 | (0.00) | 64.40 | (0.00) | 67.36 | (0.00) | |
| NDO | 63.68 | (0.00) | 65.34 | (0.00) | 67.30 | (0.07) | 68.93 | (0.00) | |
| PPDP | 60.21 | (0.00) | 59.71 | (0.00) | 62.10 | (0.03) | 63.46 | (0.00) | |
| D3C | (0.48) | (0.45) | (0.00) | 71.06 | (0.97) | ||||
| PPDP+D3C | 68.33 | - | 69.00 | - | 68.42 | - | - | ||
| G-mean | OSS | 58.13 | (0.00) | 57.56 | (0.00) | 62.70 | (0.00) | 65.72 | (0.00) |
| SB | 56.91 | (0.00) | 58.92 | (0.00) | 65.04 | (0.00) | 69.29 | (0.04) | |
| NDO | 62.92 | (0.00) | 66.09 | (0.01) | 68.51 | (0.77) | 70.88 | (0.95) | |
| PPDP | 59.55 | (0.00) | 60.65 | (0.00) | 64.03 | (0.00) | 65.93 | (0.00) | |
| D3C | 63.92 | (0.01) | 64.19 | (0.00) | (0.44) | (0.77) | |||
| PPDP+D3C | - | - | 68.30 | - | 70.92 | - | |||
| F1 | OSS | 49.84 | (0.00) | 50.80 | (0.00) | 58.84 | (0.00) | 66.99 | (0.00) |
| SB | 47.89 | (0.00) | 51.80 | (0.00) | 60.93 | (0.01) | 70.19 | (0.00) | |
| NDO | 55.64 | (0.00) | 60.81 | (0.06) | (0.26) | 71.77 | (0.11) | ||
| PPDP | 51.57 | (0.00) | 54.40 | (0.00) | 60.34 | (0.00) | 68.06 | (0.00) | |
| D3C | 55.85 | (0.01) | 57.07 | (0.00) | 64.34 | (0.56) | 69.64 | (0.00) | |
| PPDP+D3C | - | - | 65.06 | - | - | ||||
| Method | 60 | 80 | 100 | 150 | |||||
| ACC | OSS | 59.06 | (0.00) | 61.15 | (0.00) | 63.71 | (0.00) | 65.56 | (0.00) |
| SB | 62.22 | (0.00) | 63.81 | (0.00) | 65.85 | (0.00) | 68.45 | (0.00) | |
| NDO | 64.61 | (0.00) | 67.20 | (0.00) | 68.68 | (0.22) | 69.23 | (0.03) | |
| PPDP | 60.26 | (0.00) | 62.42 | (0.00) | 62.49 | (0.00) | 63.59 | (0.00) | |
| D3C | 71.95 | (0.00) | 72.88 | (0.00) | 73.46 | (0.00) | 72.91 | (0.00) | |
| PPDP+D3C | - | - | - | - | |||||
| G-mean | OSS | 59.15 | (0.00) | 62.03 | (0.00) | 65.11 | (0.00) | 67.25 | (0.00) |
| SB | 60.95 | (0.00) | 63.50 | (0.00) | 66.87 | (0.00) | 70.23 | (0.98) | |
| NDO | 65.34 | (0.00) | 67.95 | (0.00) | 70.17 | (0.82) | 71.04 | (0.03) | |
| PPDP | 61.03 | (0.00) | 64.21 | (0.00) | 65.27 | (0.00) | 66.26 | (0.00) | |
| D3C | 66.31 | (0.16) | 69.38 | (0.66) | 71.13 | (0.19) | 72.46 | (0.00) | |
| PPDP+D3C | - | - | - | - | |||||
| F1 | OSS | 51.17 | (0.00) | 56.07 | (0.00) | 60.96 | (0.00) | 67.73 | (0.00) |
| SB | 52.63 | (0.00) | 56.95 | (0.00) | 62.35 | (0.00) | 70.36 | (0.86) | |
| NDO | 58.20 | (0.04) | 62.32 | (0.17) | 66.16 | (0.50) | 71.34 | (0.04) | |
| PPDP | 53.16 | (0.00) | 58.10 | (0.00) | 61.76 | (0.00) | 67.28 | (0.00) | |
| D3C | 58.38 | (0.06) | 63.11 | (0.19) | 65.96 | (0.42) | 70.46 | (0.22) | |
| PPDP+D3C | - | - | - | - | |||||
| Method | 60 | 80 | 100 | 150 | |||||
| ACC | OSS | 60.69 | (0.00) | 60.09 | (0.00) | 61.42 | (0.00) | 66.19 | (0.00) |
| SB | 63.57 | (0.00) | 63.83 | (0.00) | 66.00 | (0.00) | 69.40 | (0.01) | |
| NDO | 65.23 | (0.00) | 66.46 | (0.00) | 67.76 | (0.00) | 70.69 | (0.12) | |
| PPDP | 60.93 | (0.00) | 61.52 | (0.00) | 62.80 | (0.00) | 65.20 | (0.03) | |
| D3C | (0.00) | (0.00) | (0.00) | (0.00) | |||||
| PPDP+D3C | 69.34 | - | 69.59 | - | 70.68 | - | 71.51 | - | |
| G-mean | OSS | 62.23 | (0.00) | 62.24 | (0.00) | 63.57 | (0.00) | 68.06 | (0.00) |
| SB | 63.52 | (0.00) | 64.62 | (0.00) | 67.54 | (0.00) | 71.25 | (0.30) | |
| NDO | 66.10 | (0.00) | 68.18 | (0.00) | 69.50 | (0.00) | 72.61 | (0.27) | |
| PPDP | 62.63 | (0.00) | 64.14 | (0.00) | 65.57 | (0.00) | 68.04 | (0.00) | |
| D3C | 68.62 | (0.50) | 69.93 | (0.78) | (0.22) | (0.36) | |||
| PPDP+D3C | - | - | 71.60 | - | 72.01 | - | |||
| F1 | OSS | 54.73 | (0.00) | 56.57 | (0.00) | 59.05 | (0.00) | 67.68 | (0.00) |
| SB | 55.44 | (0.00) | 57.94 | (0.00) | 62.41 | (0.00) | 70.34 | (0.57) | |
| NDO | 58.62 | (0.01) | 62.32 | (0.05) | 64.68 | (0.00) | 71.56 | (0.14) | |
| PPDP | 54.72 | (0.00) | 57.94 | (0.00) | 60.76 | (0.00) | 67.45 | (0.00) | |
| D3C | 60.83 | (0.31) | 63.19 | (0.24) | 66.96 | (0.65) | 69.69 | (0.08) | |
| PPDP+D3C | - | - | - | - | |||||
The results of the six methods on the HS dataset.
| Method | 60 | 80 | 100 | 150 | |||||
| ACC | OSS | 59.46 | (0.53) | 56.78 | (0.06) | 56.48 | (0.03) | 52.44 | (0.00) |
| SB | 61.06 | (0.54) | 57.72 | (0.35) | 57.01 | (0.04) | 52.74 | (0.01) | |
| NDO | 60.87 | (0.62) | 57.18 | (0.11) | 58.30 | (0.28) | 55.14 | (0.27) | |
| PPDP | 56.25 | (0.27) | 54.05 | (0.66) | 54.90 | (0.50) | 50.35 | (0.32) | |
| D3C | (0.00) | (0.00) | (0.00) | (0.03) | |||||
| PPDP+D3C | 60.33 | - | 58.71 | - | 59.29 | - | 56.32 | - | |
| G-mean | OSS | 44.36 | (0.00) | 45.82 | (0.00) | 47.27 | (0.00) | 49.22 | (0.00) |
| SB | 42.20 | (0.00) | 43.56 | (0.00) | 46.20 | (0.00) | 49.24 | (0.00) | |
| NDO | 48.43 | (0.01) | 49.08 | (0.24) | 50.67 | (0.00) | 52.88 | (0.25) | |
| PPDP | 51.08 | (0.98) | 48.62 | (0.67) | 51.89 | (0.34) | 49.55 | (0.03) | |
| D3C | 49.50 | (0.08) | 45.83 | (0.00) | 48.08 | (0.00) | 46.46 | (0.00) | |
| PPDP+D3C | - | - | - | - | |||||
| F1 | OSS | 30.53 | (0.00) | 32.87 | (0.01) | 35.69 | (0.00) | 41.63 | (0.00) |
| SB | 28.27 | (0.00) | 30.27 | (0.00) | 34.16 | (0.00) | 41.37 | (0.00) | |
| NDO | 35.42 | (0.03) | 37.08 | (0.47) | 39.90 | (0.01) | 46.32 | (0.11) | |
| PPDP | 38.48 | (0.82) | 36.60 | (0.56) | 41.64 | (0.17) | 43.22 | (0.01) | |
| D3C | 36.62 | (0.05) | 33.30 | (0.00) | 35.02 | (0.00) | 36.40 | (0.00) | |
| PPDP+D3C | - | - | - | - | |||||
| Method | 60 | 80 | 100 | 150 | |||||
| ACC | OSS | 58.27 | (0.08) | 58.07 | (0.00) | 57.64 | (0.00) | 56.05 | (0.00) |
| SB | 59.21 | (0.29) | 58.88 | (0.00) | 58.58 | (0.00) | 57.94 | (0.00) | |
| NDO | 60.10 | (0.82) | 60.74 | (0.00) | 60.76 | (0.32) | 58.79 | (0.00) | |
| PPDP | 54.66 | (0.46) | 58.72 | (0.62) | 55.58 | (0.16) | 54.92 | (0.00) | |
| D3C | (0.00) | (0.00) | (0.00) | (0.09) | |||||
| PPDP+D3C | 60.33 | - | 64.05 | - | 61.63 | - | 61.82 | - | |
| G-mean | OSS | 46.66 | (0.00) | 50.16 | (0.00) | 52.19 | (0.00) | 53.26 | (0.00) |
| SB | 44.88 | (0.00) | 48.45 | (0.00) | 52.36 | (0.00) | 55.46 | (0.00) | |
| NDO | 50.25 | (0.00) | 52.48 | (0.00) | 53.27 | (0.00) | 55.04 | (0.00) | |
| PPDP | 50.51 | (0.13) | 54.72 | (0.02) | 53.86 | (0.00) | 54.34 | (0.00) | |
| D3C | 48.38 | (0.00) | 47.88 | (0.00) | 49.70 | (0.00) | 49.68 | (0.00) | |
| PPDP+D3C | - | - | - | - | |||||
| F1 | OSS | 32.53 | (0.00) | 37.41 | (0.00) | 40.86 | (0.00) | 45.44 | (0.00) |
| SB | 30.56 | (0.00) | 35.36 | (0.00) | 40.87 | (0.00) | 48.04 | (0.00) | |
| NDO | 36.77 | (0.01) | 40.27 | (0.00) | 42.28 | (0.00) | 47.57 | (0.00) | |
| PPDP | 37.05 | (0.07) | 42.58 | (0.01) | 43.17 | (0.00) | 47.23 | (0.00) | |
| D3C | 35.10 | (0.00) | 36.60 | (0.00) | 37.98 | (0.00) | 39.37 | (0.00) | |
| PPDP+D3C | - | - | - | - | |||||
| Method | 60 | 80 | 100 | 150 | |||||
| ACC | OSS | 56.58 | (0.00) | 57.30 | (0.00) | 57.14 | (0.00) | 58.88 | (0.00) |
| SB | 58.41 | (0.00) | 59.63 | (0.00) | 58.76 | (0.00) | 59.99 | (0.00) | |
| NDO | 61.19 | (0.03) | 62.78 | (0.71) | 62.49 | (0.00) | 61.50 | (0.00) | |
| PPDP | 55.55 | (0.00) | 54.75 | (0.00) | 56.96 | (0.00) | 55.58 | (0.00) | |
| D3C | (0.00) | (0.00) | (0.00) | (0.00) | |||||
| PPDP+D3C | 63.24 | - | 63.10 | - | 65.15 | - | 63.67 | - | |
| G-mean | OSS | 50.50 | (0.00) | 50.18 | (0.00) | 51.19 | (0.00) | 56.49 | (0.00) |
| SB | 49.46 | (0.00) | 50.15 | (0.00) | 50.96 | (0.00) | 57.01 | (0.00) | |
| NDO | 52.97 | (0.00) | 53.71 | (0.00) | 54.11 | (0.00) | 54.85 | (0.00) | |
| PPDP | 51.48 | (0.00) | 51.44 | (0.00) | 54.50 | (0.00) | 54.99 | (0.00) | |
| D3C | 49.46 | (0.00) | 45.30 | (0.00) | 49.54 | (0.00) | 49.69 | (0.00) | |
| PPDP+D3C | - | - | - | - | |||||
| F1 | OSS | 35.99 | (0.00) | 36.65 | (0.00) | 38.59 | (0.00) | 48.09 | (0.00) |
| SB | 34.84 | (0.00) | 36.68 | (0.00) | 38.29 | (0.00) | 48.54 | (0.00) | |
| NDO | 39.07 | (0.00) | 41.20 | (0.00) | 42.29 | (0.00) | 45.49 | (0.00) | |
| PPDP | 37.32 | (0.00) | 38.16 | (0.00) | 42.59 | (0.00) | 46.62 | (0.00) | |
| D3C | 31.54 | (0.00) | 32.47 | (0.00) | 37.52 | (0.00) | 38.64 | (0.00) | |
| PPDP+D3C | - | - | - | - | |||||
| Method | 60 | 80 | 100 | 150 | |||||
| ACC | OSS | 56.25 | (0.00) | 58.23 | (0.00) | 57.41 | (0.00) | 59.78 | (0.00) |
| SB | 59.24 | (0.00) | 60.13 | (0.00) | 59.98 | (0.00) | 60.63 | (0.00) | |
| NDO | 61.76 | (0.01) | 63.73 | (0.01) | 62.62 | (0.00) | 62.32 | (0.00) | |
| PPDP | 55.72 | (0.00) | 58.63 | (0.00) | 57.33 | (0.00) | 57.32 | (0.00) | |
| D3C | (0.00) | (0.00) | (0.00) | (0.00) | |||||
| PPDP+D3C | 64.04 | - | 66.13 | - | 65.09 | - | 65.42 | - | |
| G-mean | OSS | 50.56 | (0.00) | 52.48 | (0.00) | 54.39 | (0.00) | 57.93 | (0.00) |
| SB | 50.65 | (0.00) | 51.28 | (0.00) | 55.97 | (0.00) | 57.28 | (0.00) | |
| NDO | 53.80 | (0.00) | 54.26 | (0.00) | 55.56 | (0.00) | 55.63 | (0.00) | |
| PPDP | 52.34 | (0.00) | 54.98 | (0.00) | 55.03 | (0.00) | 56.29 | (0.00) | |
| D3C | 48.40 | (0.00) | 45.64 | (0.00) | 50.34 | (0.00) | 48.24 | (0.00) | |
| PPDP+D3C | - | - | - | - | |||||
| F1 | OSS | 35.78 | (0.00) | 38.53 | (0.00) | 41.46 | (0.00) | 48.34 | (0.00) |
| SB | 35.83 | (0.00) | 37.15 | (0.00) | 43.20 | (0.00) | 47.40 | (0.00) | |
| NDO | 39.72 | (0.00) | 41.23 | (0.00) | 43.02 | (0.00) | 45.39 | (0.00) | |
| PPDP | 37.79 | (0.00) | 41.44 | (0.00) | 42.18 | (0.00) | 46.45 | (0.00) | |
| D3C | 35.06 | (0.00) | 32.52 | (0.00) | 37.68 | (0.00) | 36.45 | (0.00) | |
| PPDP+D3C | - | - | - | - | |||||
The number of S, S, M'−m, and M'+m' with N = 60.
| Dataset | Dataset | ||||||||
| WDBC | PD | ||||||||
| 5 | 10 | 15 | 20 | 5 | 10 | 15 | 20 | ||
| 13 | 13 | 12 | 12 | 10 | 10 | 9 | 9 | ||
| 14 | 13 | 12 | 12 | 12 | 11 | 10 | 10 | ||
| 40 | 37 | 32 | 28 | 42 | 39 | 34 | 30 | ||
| 86 | 84 | 80 | 76 | 90 | 88 | 84 | 80 | ||
| Dataset | Dataset | ||||||||
| VC | HS | ||||||||
| 5 | 10 | 15 | 20 | 5 | 10 | 15 | 20 | ||
| 3 | 3 | 3 | 3 | 6 | 6 | 6 | 6 | ||
| 18 | 18 | 16 | 14 | 27 | 26 | 24 | 23 | ||
| 36 | 32 | 28 | 26 | 27 | 24 | 20 | 17 | ||
| 78 | 74 | 72 | 72 | 60 | 58 | 56 | 54 | ||