Yang Zhang, Limin Wang, Zhiyi Duan, Minghui Sun.
Abstract
Direct dependencies and conditional dependencies are the two basic kinds of dependencies in restricted Bayesian network classifiers (BNCs). Traditional approaches, such as filters and wrappers, have proved effective at identifying non-significant dependencies one by one, but their high computational overhead makes them inefficient, especially for BNCs with high structural complexity. Studying the distributions of information-theoretic measures provides a feasible way to identify non-significant dependencies in batch, which may help increase structure reliability and avoid overfitting. In this paper, we investigate two extensions to the k-dependence Bayesian classifier (KDB): feature selection based on mutual information (MI) and dependence selection based on conditional mutual information (CMI). These two techniques apply a novel adaptive thresholding method to filter out redundancy and can work jointly. Experimental results on 30 datasets from the UCI machine learning repository demonstrate that adaptive thresholds can help distinguish between dependencies and independencies, and that the proposed algorithm achieves competitive classification performance compared to several state-of-the-art BNCs in terms of 0-1 loss, root mean squared error, bias, and variance.
Keywords: Bayesian network classifiers; conditional mutual information; mutual information; thresholding
Year: 2019 PMID: 33267379 PMCID: PMC7515162 DOI: 10.3390/e21070665
Source DB: PubMed Journal: Entropy (Basel) ISSN: 1099-4300 Impact factor: 2.524
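The abstract relies on two information-theoretic measures: MI between a feature and the class, and CMI between two features given the class. As a minimal sketch of how these could be estimated from discrete data by frequency counts, the functions below are illustrative only; the function names, the base-2 logarithm, and the absence of smoothing are assumptions, not details taken from the paper.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical I(X;Y) in bits from two equal-length sequences of discrete values."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    count_x = Counter(xs)
    count_y = Counter(ys)
    # p(x,y) * log2( p(x,y) / (p(x) p(y)) ), written with raw counts
    return sum(
        (c / n) * log2(c * n / (count_x[x] * count_y[y]))
        for (x, y), c in joint.items()
    )


def conditional_mutual_information(xs, zs, ys):
    """Empirical I(X;Z|Y) in bits: MI of X and Z inside each class, weighted by P(y)."""
    n = len(ys)
    total = 0.0
    for y, count in Counter(ys).items():
        idx = [i for i in range(n) if ys[i] == y]
        total += (count / n) * mutual_information(
            [xs[i] for i in idx], [zs[i] for i in idx]
        )
    return total


# Toy data (hypothetical): X and Z are identical, and both are independent of Y.
x = [0, 1, 0, 1]
z = [0, 1, 0, 1]
y = [0, 0, 1, 1]
mi_xy = mutual_information(x, y)
cmi_xz_given_y = conditional_mutual_information(x, z, y)
```

On this toy data the feature carries no information about the class, while the two features are fully dependent within each class, which is the kind of contrast MI and CMI are meant to expose.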
Figure 1. Examples of network structures with four features for KDB.
Figure 2. The distributions of (a) MI and (b) CMI values on the Connect-4 dataset. Note that the MI and CMI values are sorted in descending order.
List of acronyms used.
| Notation | Description |
|---|---|
| MI | mutual information |
| CMI | conditional mutual information |
| BNCs | Bayesian network classifiers |
| BNC | Bayesian network classifier |
|  | a BNC |
| LOOCV | leave-one-out cross validation |
| RMSE | root mean squared error |
| AMI | the average MI |
| ACMI | the average CMI |
| FS | feature selection |
| DS | dependence selection |
| MDL | Minimum Description Length |
|  | the number of features in BNC |
|  | the number of conditional dependencies in BNC |
| SMI | the sum of MI |
| SCMI | the sum of CMI |
The MI values for each feature in KDB on the Hypo dataset.
| No. | MI | No. | MI | No. | MI | No. | MI | No. | MI |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.0000 | 7 | 0.0002 | 13 | 0.0009 | 19 | 0.0020 | 25 | 0.0123 |
| 2 | 0.0000 | 8 | 0.0004 | 14 | 0.0012 | 20 | 0.0030 | 26 | 0.0337 |
| 3 | 0.0000 | 9 | 0.0005 | 15 | 0.0017 | 21 | 0.0052 | 27 | 0.1425 |
| 4 | 0.0000 | 10 | 0.0006 | 16 | 0.0017 | 22 | 0.0062 | 28 | 0.1580 |
| 5 | 0.0001 | 11 | 0.0007 | 17 | 0.0018 | 23 | 0.0065 | 29 | 0.3528 |
| 6 | 0.0001 | 12 | 0.0007 | 18 | 0.0019 | 24 | 0.0105 | | |
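To make the idea of a data-driven threshold concrete, the sketch below applies one simple rule to the Hypo MI values listed above: keep only the features whose MI exceeds the average MI (AMI). Whether the paper's adaptive threshold is exactly the average, and whether the comparison is strict, is an assumption here; the function name is hypothetical.

```python
# MI values for the 29 Hypo features, copied from the table above (ascending order).
hypo_mi = [
    0.0000, 0.0000, 0.0000, 0.0000, 0.0001, 0.0001,
    0.0002, 0.0004, 0.0005, 0.0006, 0.0007, 0.0007,
    0.0009, 0.0012, 0.0017, 0.0017, 0.0018, 0.0019,
    0.0020, 0.0030, 0.0052, 0.0062, 0.0065, 0.0105,
    0.0123, 0.0337, 0.1425, 0.1580, 0.3528,
]


def select_above_average(scores):
    """Return (threshold, kept) where kept holds the 1-based positions scoring above the mean.

    Assumption: the threshold is the plain average and the comparison is strict.
    """
    threshold = sum(scores) / len(scores)
    kept = [i for i, s in enumerate(scores, start=1) if s > threshold]
    return threshold, kept


ami, kept_features = select_above_average(hypo_mi)
print(f"AMI = {ami:.4f}, kept features: {kept_features}")
```

Under this assumed rule the average is about 0.0257, so only the four features numbered 26 to 29 survive, which matches the heavily skewed shape of the Hypo MI distribution.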
The MI values for each feature in KDB on the Waveform dataset.
| No. | MI | No. | MI | No. | MI | No. | MI | No. | MI |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.0000 | 6 | 0.5847 | 11 | 0.6348 | 16 | 0.6588 | 21 | 0.7497 |
| 2 | 0.0000 | 7 | 0.6014 | 12 | 0.6379 | 17 | 0.7023 | | |
| 3 | 0.4891 | 8 | 0.6020 | 13 | 0.6439 | 18 | 0.7111 | | |
| 4 | 0.4960 | 9 | 0.6295 | 14 | 0.6446 | 19 | 0.7400 | | |
| 5 | 0.5801 | 10 | 0.6305 | 15 | 0.6550 | 20 | 0.7424 | | |
The CMI values for each feature pair in KDB on the Hypo dataset.
| No. | CMI | No. | CMI | No. | CMI | No. | CMI | No. | CMI |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.0000 | 12 | 0.0035 | 23 | 0.0092 | 34 | 0.0355 | 45 | 0.2331 |
| 2 | 0.0000 | 13 | 0.0044 | 24 | 0.0092 | 35 | 0.0471 | 46 | 0.3031 |
| 3 | 0.0000 | 14 | 0.0048 | 25 | 0.0098 | 36 | 0.0864 | 47 | 0.3354 |
| 4 | 0.0000 | 15 | 0.0049 | 26 | 0.0121 | 37 | 0.1153 | 48 | 0.4571 |
| 5 | 0.0000 | 16 | 0.0058 | 27 | 0.0146 | 38 | 0.1189 | 49 | 0.4782 |
| 6 | 0.0000 | 17 | 0.0058 | 28 | 0.0154 | 39 | 0.1240 | 50 | 0.4786 |
| 7 | 0.0018 | 18 | 0.0064 | 29 | 0.0241 | 40 | 0.1313 | 51 | 0.4834 |
| 8 | 0.0019 | 19 | 0.0073 | 30 | 0.0262 | 41 | 0.1361 | 52 | 0.4852 |
| 9 | 0.0023 | 20 | 0.0073 | 31 | 0.0279 | 42 | 0.1390 | 53 | 0.4912 |
| 10 | 0.0024 | 21 | 0.0076 | 32 | 0.0279 | 43 | 0.1855 | 54 | 0.5099 |
| 11 | 0.0032 | 22 | 0.0090 | 33 | 0.0286 | 44 | 0.2007 | 55 | 0.7263 |
The CMI values for each feature pair in KDB on the Waveform dataset.
| No. | CMI | No. | CMI | No. | CMI | No. | CMI | No. | CMI |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.0000 | 9 | 0.0029 | 17 | 0.1024 | 25 | 0.2741 | 33 | 0.4917 |
| 2 | 0.0000 | 10 | 0.0530 | 18 | 0.1408 | 26 | 0.3046 | 34 | 0.5233 |
| 3 | 0.0000 | 11 | 0.0580 | 19 | 0.1463 | 27 | 0.3077 | 35 | 0.5291 |
| 4 | 0.0000 | 12 | 0.0750 | 20 | 0.1510 | 28 | 0.3922 | 36 | 0.5318 |
| 5 | 0.0016 | 13 | 0.0872 | 21 | 0.1548 | 29 | 0.4092 | 37 | 0.5449 |
| 6 | 0.0016 | 14 | 0.0937 | 22 | 0.1611 | 30 | 0.4115 | 38 | 0.5748 |
| 7 | 0.0017 | 15 | 0.0969 | 23 | 0.1612 | 31 | 0.4564 | 39 | 0.5847 |
| 8 | 0.0022 | 16 | 0.0973 | 24 | 0.2475 | 32 | 0.4752 | | |
Figure 3. The MI values for KDB on the Hypo and Waveform datasets. Note that features are sorted in ascending order of MI.
Figure 4. The CMI values for KDB on the Hypo and Waveform datasets. Note that the conditional dependencies are sorted in ascending order of CMI.
Description of the datasets used in the experiments.
| No. | Dataset | Instance | Feature | Class | No. | Dataset | Instance | Feature | Class |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Echocardiogram | 131 | 6 | 2 | 16 | German | 1000 | 20 | 2 |
| 2 | Lymphography | 148 | 18 | 4 | 17 | Yeast | 1484 | 8 | 10 |
| 3 | Iris | 150 | 4 | 3 | 18 | Splice-c4.5 | 3177 | 60 | 3 |
| 4 | Hepatitis | 155 | 19 | 2 | 19 | Dis | 3772 | 29 | 2 |
| 5 | Autos | 205 | 25 | 7 | 20 | Hypo | 3772 | 29 | 4 |
| 6 | Glass Identification | 214 | 9 | 3 | 21 | Spambase | 4601 | 57 | 2 |
| 7 | Heart | 270 | 13 | 2 | 22 | Phoneme | 5438 | 7 | 50 |
| 8 | Primary Tumor | 339 | 17 | 22 | 23 | Page-blocks | 5473 | 10 | 5 |
| 9 | Ionosphere | 351 | 34 | 2 | 24 | Optdigits | 5620 | 64 | 10 |
| 10 | Musk1 | 476 | 166 | 2 | 25 | Mushroom | 8124 | 22 | 2 |
| 11 | Balance-scale | 625 | 4 | 3 | 26 | Magic | 19,020 | 10 | 2 |
| 12 | Soybean | 683 | 35 | 19 | 27 | Adult | 48,842 | 14 | 2 |
| 13 | Credit-a | 690 | 15 | 2 | 28 | Shuttle | 58,000 | 9 | 7 |
| 14 | Breast-cancer-w | 699 | 9 | 2 | 29 | Connect-4 | 67,557 | 42 | 3 |
| 15 | Vehicle | 846 | 18 | 4 | 30 | Waveform | 100,000 | 21 | 3 |
Experimental results of 0–1 loss.
| Dataset | NB | SNB-FSS | TAN | K | K | KDB-FS | KDB-DS | AKDB |
|---|---|---|---|---|---|---|---|---|
| Echocardiogram | 0.3359 | 0.3664 | 0.3282 | 0.3053 | 0.3435 | 0.3435 | 0.3206 | 0.3206 ∘ |
| Lymphography | 0.1486 | 0.1689 | 0.1757 | 0.1757 | 0.2365 | 0.1757 | 0.2095 | 0.1554 ∘ |
| Iris | 0.0867 | 0.0600 | 0.0800 | 0.0867 | 0.0867 | 0.0667 | 0.0867 | 0.0733 ∘ |
| Hepatitis | 0.1935 | 0.1677 | 0.1677 | 0.1548 | 0.1871 | 0.1806 | 0.1871 | 0.1419 ∘ |
| Autos | 0.3122 | 0.3561 | 0.2146 | 0.2146 | 0.2049 | 0.1951 | 0.2049 | 0.1951 |
| Glass-id | 0.2617 | 0.2430 | 0.2196 | 0.2243 | 0.2196 | 0.2009 | 0.2196 | 0.1963 ∘ |
| Heart | 0.1778 | 0.1741 | 0.1926 | 0.1963 | 0.2111 | 0.1926 | 0.1926 | 0.1630 ∘ |
| Primary-tumor | 0.5457 | 0.5398 | 0.5428 | 0.5693 | 0.5723 | 0.5693 | 0.5723 | 0.5428 ∘ |
| Ionosphere | 0.1054 | 0.0826 | 0.0684 | 0.0741 | 0.0741 | 0.0741 | 0.0712 | 0.0712 |
| Musk1 | 0.1660 | 0.1450 | 0.1134 | 0.1113 | 0.1155 | 0.1071 | 0.1034 | 0.1029 ∘ |
| Balance-scale | 0.2720 | 0.3648 | 0.2736 | 0.2816 | 0.2784 | 0.2720 | 0.2700 | 0.2800 |
| Soybean | 0.0893 | 0.0952 | 0.0469 | 0.0644 | 0.0556 | 0.0556 | 0.0556 | 0.0527 ∘ |
| Credit-a | 0.1406 | 0.1377 | 0.1507 | 0.1551 | 0.1464 | 0.1435 | 0.1464 | 0.1420 |
| Breast-cancer-w | 0.0258 | 0.0258 | 0.0415 | 0.0486 | 0.0744 | 0.0601 | 0.0715 | 0.0472 ∘ |
| Vehicle | 0.3924 | 0.4054 | 0.2943 | 0.3014 | 0.2943 | 0.2778 | 0.2943 | 0.3014 |
| German | 0.2530 | 0.2660 | 0.2730 | 0.2760 | 0.2890 | 0.2790 | 0.2810 | 0.2590 ∘ |
| Yeast | 0.4239 | 0.4239 | 0.4171 | 0.4218 | 0.4387 | 0.4387 | 0.4333 | 0.4218 |
| Splice-c4.5 | 0.0444 | 0.0381 | 0.0466 | 0.0482 | 0.0941 | 0.0469 | 0.0910 | 0.0469 ∘ |
| Dis | 0.0159 | 0.0154 | 0.0159 | 0.0146 | 0.0138 | 0.0130 | 0.0138 | 0.0130 ∘ |
| Hypo | 0.0138 | 0.0244 | 0.0141 | 0.0077 | 0.0114 | 0.0090 | 0.0098 | 0.0077 ∘ |
| Spambase | 0.1015 | 0.0765 | 0.0669 | 0.0765 | 0.0635 | 0.0635 | 0.0628 | 0.0752 • |
| Phoneme | 0.2615 | 0.2477 | 0.2733 | 0.2120 | 0.1984 | 0.1984 | 0.1984 | 0.1984 |
| Page-blocks | 0.0619 | 0.0442 | 0.0415 | 0.0433 | 0.0391 | 0.0378 | 0.0373 | 0.0391 |
| Optdigits | 0.0767 | 0.0788 | 0.0407 | 0.0416 | 0.0372 | 0.0352 | 0.0370 | 0.0358 |
| Mushrooms | 0.0196 | 0.0148 | 0.0001 | 0.0006 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| Magic | 0.2239 | 0.2132 | 0.1675 | 0.1742 | 0.1637 | 0.1637 | 0.1636 | 0.1636 |
| Adult | 0.1592 | 0.1656 | 0.1380 | 0.1385 | 0.1383 | 0.1338 | 0.1383 | 0.1338 |
| Shuttle | 0.0039 | 0.0040 | 0.0015 | 0.0015 | 0.0009 | 0.0009 | 0.0009 | 0.0009 |
| Connect-4 | 0.2783 | 0.2999 | 0.2354 | 0.2406 | 0.2283 | 0.2282 | 0.2283 | 0.2283 |
| Waveform | 0.0220 | 0.0273 | 0.0202 | 0.0226 | 0.0256 | 0.0193 | 0.0194 | 0.0196 ∘ |
∘ and • denote, respectively, significant improvement and significant degradation of AKDB relative to KDB.
Experimental results of RMSE.
| Dataset | NB | SNB-FSS | TAN | K | K | AKDB |
|---|---|---|---|---|---|---|
| Echocardiogram | 0.4896 | 0.4823 | 0.4886 | 0.4846 | 0.4889 | 0.4807 |
| Lymphography | 0.3465 | 0.3505 | 0.3813 | 0.3726 | 0.4362 | 0.4076 |
| Iris | 0.2545 | 0.2158 | 0.2441 | 0.2435 | 0.2447 | 0.2224 |
| Hepatitis | 0.3901 | 0.3770 | 0.3610 | 0.3559 | 0.3875 | 0.3823 |
| Autos | 0.5190 | 0.5330 | 0.4475 | 0.4460 | 0.4399 | 0.4380 |
| Glass-id | 0.4353 | 0.4325 | 0.4109 | 0.4223 | 0.4205 | 0.4105 |
| Heart | 0.3651 | 0.3579 | 0.3771 | 0.3752 | 0.3949 | 0.3773 |
| Primary-tumor | 0.7084 | 0.7159 | 0.7170 | 0.7190 | 0.7262 | 0.7092 |
| Ionosphere | 0.0856 | 0.0538 | 0.2615 | 0.0621 | 0.0499 | 0.0561 |
| Musk1 | 0.3972 | 0.3839 | 0.3022 | 0.3034 | 0.3058 | 0.3034 |
| Balance-scale | 0.4431 | 0.5448 | 0.4344 | 0.4384 | 0.4323 | 0.4605 |
| Soybean | 0.2945 | 0.3845 | 0.2014 | 0.2206 | 0.2063 | 0.2223 |
| Credit-a | 0.3342 | 0.3179 | 0.3411 | 0.3400 | 0.3525 | 0.3391 |
| Breast-cancer-w | 0.1570 | 0.1570 | 0.1928 | 0.1951 | 0.2497 | 0.2199 |
| Vehicle | 0.5736 | 0.5663 | 0.4593 | 0.4623 | 0.4591 | 0.4419 |
| German | 0.4945 | 0.4212 | 0.5000 | 0.4991 | 0.5053 | 0.4644 |
| Yeast | 0.5987 | 0.5987 | 0.5994 | 0.5997 | 0.6035 | 0.6035 |
| Splice-c4.5 | 0.1883 | 0.2030 | 0.1917 | 0.1944 | 0.2756 | 0.1848 |
| Dis | 0.1177 | 0.1104 | 0.1103 | 0.1072 | 0.1024 | 0.1024 |
| Hypo | 0.1105 | 0.1401 | 0.1050 | 0.0881 | 0.0955 | 0.0863 |
| Spambase | 0.2994 | 0.3939 | 0.2403 | 0.2480 | 0.2300 | 0.2300 |
| Phoneme | 0.4792 | 0.4632 | 0.5048 | 0.4385 | 0.4195 | 0.4195 |
| Page-blocks | 0.2331 | 0.1923 | 0.1894 | 0.1940 | 0.1811 | 0.1781 |
| Optdigits | 0.2637 | 0.2893 | 0.1906 | 0.1937 | 0.1806 | 0.1736 |
| Mushrooms | 0.1229 | 0.1083 | 0.0083 | 0.0188 | 0.0001 | 0.0001 |
| Magic | 0.3974 | 0.3802 | 0.3461 | 0.3509 | 0.3470 | 0.3470 |
| Adult | 0.3409 | 0.3345 | 0.3076 | 0.3071 | 0.3089 | 0.3047 |
| Shuttle | 0.0561 | 0.0674 | 0.0356 | 0.0367 | 0.0290 | 0.0279 |
| Connect-4 | 0.4787 | 0.5024 | 0.4435 | 0.4480 | 0.4336 | 0.4206 |
| Waveform | 0.1441 | 0.1499 | 0.1164 | 0.1285 | 0.1402 | 0.1253 |
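The two tables above report 0–1 loss and RMSE, but this excerpt does not spell out how they are computed. The sketch below uses definitions that are common for probabilistic classifiers and should be read as assumptions: 0–1 loss is the fraction of misclassified instances, and RMSE is the root of the mean squared difference between 1 and the probability assigned to the true class. The function names and the toy predictions are hypothetical.

```python
from math import sqrt


def zero_one_loss(true_labels, class_probs):
    """Fraction of instances whose most probable class differs from the true label."""
    wrong = sum(
        1 for y, probs in zip(true_labels, class_probs)
        if max(probs, key=probs.get) != y
    )
    return wrong / len(true_labels)


def rmse(true_labels, class_probs):
    """Root mean squared error of the probability assigned to the true class.

    Assumed definition: sqrt(mean((1 - P(true class | x))^2)).
    """
    squared = [
        (1.0 - probs.get(y, 0.0)) ** 2
        for y, probs in zip(true_labels, class_probs)
    ]
    return sqrt(sum(squared) / len(squared))


# Hypothetical posterior estimates for three test instances.
labels = ["a", "b", "a"]
posteriors = [
    {"a": 0.9, "b": 0.1},
    {"a": 0.6, "b": 0.4},  # misclassified: true class "b" gets only 0.4
    {"a": 0.8, "b": 0.2},
]
loss = zero_one_loss(labels, posteriors)
error = rmse(labels, posteriors)
```

RMSE penalizes a confident wrong prediction more than a hesitant one, which is why the two tables can rank classifiers differently on the same dataset.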
Experimental results of Bias.
| Dataset | NB | SNB-FSS | TAN | K | K | AKDB |
|---|---|---|---|---|---|---|
| Splice-c4.5 | 0.0341 | 0.0355 | 0.0444 | 0.0358 | 0.0968 | 0.0353 |
| Dis | 0.0160 | 0.0191 | 0.0188 | 0.0174 | 0.0171 | 0.0190 |
| Hypo | 0.0098 | 0.0177 | 0.0101 | 0.0083 | 0.0072 | 0.0077 |
| Spambase | 0.0965 | 0.0735 | 0.0656 | 0.0665 | 0.0504 | 0.0589 |
| Phoneme | 0.2284 | 0.2004 | 0.2470 | 0.1740 | 0.1599 | 0.1572 |
| Page-blocks | 0.0409 | 0.0363 | 0.0331 | 0.0342 | 0.028 | 0.0286 |
| Optdigits | 0.0655 | 0.0685 | 0.0308 | 0.0313 | 0.0285 | 0.0235 |
| Mushrooms | 0.0399 | 0.0148 | 0.0002 | 0.0011 | 0.0002 | 0.0000 |
| Magic | 0.1987 | 0.1942 | 0.1357 | 0.1451 | 0.1321 | 0.1292 |
| Adult | 0.1485 | 0.1880 | 0.1125 | 0.1117 | 0.1135 | 0.1236 |
| Shuttle | 0.0066 | 0.0036 | 0.0023 | 0.0026 | 0.0028 | 0.0008 |
| Connect-4 | 0.2327 | 0.2959 | 0.1829 | 0.1882 | 0.1788 | 0.2069 |
| Waveform | 0.0314 | 0.0257 | 0.0138 | 0.0154 | 0.0180 | 0.0164 |
Experimental results of Variance.
| Dataset | NB | SNB-FSS | TAN | K | K | AKDB |
|---|---|---|---|---|---|---|
| Splice-c4.5 | 0.0095 | 0.0051 | 0.0296 | 0.0357 | 0.0813 | 0.0572 |
| Dis | 0.0091 | 0.0000 | 0.0009 | 0.0021 | 0.0025 | 0.0012 |
| Hypo | 0.0063 | 0.0033 | 0.0078 | 0.0066 | 0.0059 | 0.0069 |
| Spambase | 0.0104 | 0.0070 | 0.0171 | 0.0176 | 0.0238 | 0.0178 |
| Phoneme | 0.1831 | 0.0783 | 0.2496 | 0.1710 | 0.1490 | 0.1064 |
| Page-blocks | 0.0128 | 0.0110 | 0.0142 | 0.0171 | 0.0185 | 0.0161 |
| Optdigits | 0.0247 | 0.0156 | 0.0280 | 0.0290 | 0.0322 | 0.0227 |
| Mushrooms | 0.0081 | 0.0000 | 0.0006 | 0.0013 | 0.0005 | 0.0002 |
| Magic | 0.0409 | 0.0284 | 0.0792 | 0.0744 | 0.0818 | 0.0453 |
| Adult | 0.0355 | 0.0304 | 0.0640 | 0.0652 | 0.0717 | 0.0196 |
| Shuttle | 0.0038 | 0.0004 | 0.0008 | 0.0016 | 0.0021 | 0.0004 |
| Connect-4 | 0.0953 | 0.0037 | 0.0883 | 0.0956 | 0.1044 | 0.0294 |
| Waveform | 0.0044 | 0.0009 | 0.0119 | 0.0110 | 0.0102 | 0.0023 |
The 0–1 loss results of AKDB for all datasets with different values.
| Dataset | | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| Echocardiogram | 0.3740 | | 0.3359 | 0.3511 | 0.3588 | 0.3664 | 0.3740 | 0.3664 | 0.3664 | 0.3664 |
| Lymphography | 0.2365 | | 0.2432 | 0.2095 | 0.2162 | 0.2027 | 0.2568 | 0.2027 | 0.2162 | 0.2500 |
| Iris | 0.0867 | | 0.0867 | 0.0867 | 0.0800 | 0.0767 | 0.0767 | 0.0800 | | |
| Hepatitis | 0.1677 | | 0.2129 | 0.1935 | 0.2323 | 0.2129 | 0.2000 | 0.2323 | 0.2323 | 0.2323 |
| Autos | 0.2098 | | 0.2000 | 0.2098 | 0.2098 | 0.1951 | 0.2195 | 0.2000 | 0.2000 | 0.2000 |
| Glass-Id | 0.2103 | | 0.2103 | 0.2150 | 0.2196 | 0.2243 | 0.2150 | 0.2196 | 0.2103 | 0.2150 |
| Heart | 0.1963 | | 0.1963 | 0.1963 | 0.1963 | 0.1889 | 0.1852 | 0.1852 | 0.1815 | 0.1815 |
| Primary-Tumor | 0.5693 | | 0.5988 | 0.5988 | 0.5693 | 0.5664 | 0.5811 | 0.5841 | 0.5752 | 0.5782 |
| Ionosphere | 0.0912 | | 0.0883 | 0.0912 | 0.0855 | 0.0769 | 0.0912 | 0.0940 | 0.0826 | 0.0940 |
| Musk1 | 0.1176 | | 0.1134 | 0.1155 | 0.1155 | 0.1092 | 0.1197 | 0.1197 | 0.1218 | 0.1261 |
| Balance-Scale | 0.2784 | 0.2800 | 0.2784 | 0.2784 | 0.2784 | 0.2784 | 0.2784 | 0.2784 | 0.2784 | |
| Soybean | 0.0556 | | 0.0556 | 0.0600 | 0.0571 | 0.0630 | 0.0571 | 0.0732 | 0.0761 | 0.1098 |
| Credit-A | 0.1681 | | 0.1551 | 0.1609 | 0.1768 | 0.1638 | 0.1493 | 0.1623 | 0.1536 | 0.1522 |
| Breast-Cancer-W | 0.0744 | | 0.0544 | 0.0644 | 0.0730 | 0.0758 | 0.0758 | 0.0758 | 0.0758 | 0.0601 |
| Vehicle | | 0.3014 | 0.2996 | 0.2986 | 0.2990 | 0.3002 | 0.3168 | 0.3109 | 0.3322 | 0.3310 |
| German | 0.2920 | | 0.2700 | 0.2920 | 0.2880 | 0.2890 | 0.2880 | 0.2940 | 0.2950 | 0.2940 |
| Yeast | 0.4387 | | 0.4447 | 0.4468 | 0.4461 | 0.4461 | 0.4501 | 0.4569 | 0.4616 | 0.4778 |
| Splice-C4.5 | 0.0853 | 0.0469 | 0.0661 | 0.0585 | 0.0516 | 0.0529 | 0.0475 | 0.0475 | | |
| Dis | 0.0151 | | 0.0146 | 0.0154 | 0.0151 | 0.0146 | 0.0138 | 0.0143 | 0.0146 | 0.0151 |
| Hypo | 0.0130 | | 0.0130 | 0.0103 | 0.0170 | 0.0217 | 0.0225 | 0.0233 | 0.0225 | 0.0233 |
| Spambase | 0.0762 | | 0.0767 | 0.0796 | 0.0761 | 0.0776 | 0.0785 | 0.0795 | 0.0787 | 0.0813 |
| Phoneme | | | | | 0.2896 | 0.2602 | 0.2655 | 0.2519 | 0.2589 | 0.2758 |
| Page-Blocks | 0.0391 | 0.0391 | 0.0391 | 0.0391 | 0.0391 | | 0.0380 | 0.0389 | 0.0402 | 0.0386 |
| Optdigits | 0.0438 | | 0.0372 | 0.0368 | 0.0391 | 0.0388 | 0.0374 | 0.0368 | 0.0400 | 0.0418 |
| Mushrooms | 0.0004 | | | | | 0.0004 | | | 0.0010 | 0.0011 |
| Magic | 0.1637 | | 0.1722 | 0.1900 | 0.1904 | 0.1914 | 0.1909 | 0.1906 | 0.1917 | 0.1897 |
| Adult | 0.1375 | 0.1338 | 0.1338 | | 0.1347 | 0.1347 | 0.1348 | 0.1409 | 0.1421 | 0.1413 |
| Shuttle | | | 0.0018 | 0.0018 | 0.0018 | 0.0021 | 0.0018 | 0.0018 | 0.0018 | 0.0019 |
| Connect-4 | 0.2294 | | 0.2442 | 0.2499 | 0.2535 | 0.2591 | 0.2575 | 0.2592 | 0.2652 | 0.2692 |
| Waveform | | 0.0196 | | 0.0195 | 0.0194 | 0.0194 | 0.0194 | 0.0194 | 0.0194 | 0.0234 |
The lowest 0–1 loss results for datasets are shown in bold.
Figure 5. The comparison results of (KDB-FS|KDB) and (KDB-DS|KDB).
W/D/L records of 0–1 loss on all datasets.
| W/D/L | NB | NB-FSS | TAN | K | K |
|---|---|---|---|---|---|
| NB-FSS | 10/12/8 | ||||
| TAN | 17/8/5 | 18/6/6 | |||
| K | 19/5/6 | 19/5/6 | 5/18/7 | ||
| K | 17/7/6 | 20/1/9 | 8/11/11 | 11/11/8 | |
| AKDB | 21/7/2 | 22/5/3 | 16/11/3 | 19/11/0 | 15/14/1 |
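A W/D/L record counts the datasets on which one classifier wins, draws, or loses against another. The sketch below tallies such a record from two columns of losses. It compares raw values with an optional tolerance, which is an assumption: the records in the table above presumably rest on a significance criterion that this excerpt does not describe, so the sketch is not expected to reproduce them. The function name and the `tol` parameter are hypothetical.

```python
def win_draw_loss(losses_a, losses_b, tol=0.0):
    """Count datasets where A has lower, equal (within tol), or higher loss than B."""
    win = draw = loss = 0
    for a, b in zip(losses_a, losses_b):
        if abs(a - b) <= tol:
            draw += 1
        elif a < b:
            win += 1
        else:
            loss += 1
    return win, draw, loss


# 0-1 loss on the first five datasets of the 0-1 loss table
# (Echocardiogram, Lymphography, Iris, Hepatitis, Autos).
akdb = [0.3206, 0.1554, 0.0733, 0.1419, 0.1951]
nb = [0.3359, 0.1486, 0.0867, 0.1935, 0.3122]

record = win_draw_loss(akdb, nb)
print("AKDB vs NB (W/D/L): %d/%d/%d" % record)
```

On these five rows AKDB has the lower raw loss on four datasets and the higher loss on one (Lymphography); a significance-based criterion would turn some of these small differences into draws.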
W/D/L records of RMSE on all datasets.
| W/D/L | NB | NB-FSS | TAN | K | K |
|---|---|---|---|---|---|
| NB-FSS | 6/18/6 | ||||
| TAN | 17/10/3 | 14/9/7 | |||
| K | 18/10/2 | 15/10/5 | 2/25/3 | ||
| K | 16/9/5 | 16/7/7 | 6/19/5 | 6/18/6 | |
| AKDB | 20/8/2 | 18/6/6 | 10/13/7 | 8/18/4 | 7/21/2 |
W/D/L records of bias and variance on large datasets.
| Metric | W/D/L | NB | NB-FSS | TAN | K | K |
|---|---|---|---|---|---|---|
| Bias | NB-FSS | 6/3/4 | | | | |
| Bias | TAN | 9/1/3 | 10/1/2 | | | |
| Bias | K | 11/1/1 | 12/1/0 | 4/5/4 | | |
| Bias | K | 11/0/2 | 12/0/1 | 6/4/3 | 7/3/3 | |
| Bias | AKDB | 11/1/1 | 11/2/0 | 8/2/3 | 8/1/4 | 5/3/5 |
| Variance | NB-FSS | 13/0/0 | | | | |
| Variance | TAN | 4/0/9 | 0/0/13 | | | |
| Variance | K | 4/2/7 | 0/0/12 | 4/3/6 | | |
| Variance | K | 5/0/8 | 0/0/13 | 4/1/8 | 4/0/9 | |
| Variance | AKDB | 8/0/5 | 1/1/11 | 9/1/3 | 10/2/1 | 12/0/1 |
W/D/L records of 0–1 loss on large datasets.
| W/D/L | NB | NB-FSS | TAN | K | K |
|---|---|---|---|---|---|
| NB-FSS | 5/5/3 | ||||
| TAN | 9/4/0 | 10/1/2 | |||
| K | 11/1/1 | 10/2/1 | 3/7/3 | ||
| K | 11/0/2 | 12/0/1 | 8/3/2 | 9/1/3 | |
| AKDB | 12/0/1 | 11/1/1 | 7/5/1 | 9/4/0 | 4/8/1 |
Figure 6. The results of ranking in terms of 0–1 loss and RMSE for alternative algorithms.
Figure 7. The results of Nemenyi tests in terms of 0–1 loss and RMSE for alternative algorithms.
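The ranking and Nemenyi comparisons in Figures 6 and 7 can be illustrated with a short sketch: rank the algorithms on each dataset (ties share the average rank), average the ranks, and compare rank differences against the Nemenyi critical difference. The three rows below are copied from the RMSE table; using only three datasets, the significance level of 0.05, and the critical value 2.850 for six algorithms are assumptions made for illustration, and the function names are hypothetical.

```python
from math import sqrt


def average_ranks(rows):
    """Mean rank per algorithm over datasets; lower score is better, ties share the average rank."""
    k = len(rows[0])
    totals = [0.0] * k
    for scores in rows:
        order = sorted(range(k), key=lambda j: scores[j])
        i = 0
        while i < k:
            j = i
            # Extend j over the run of algorithms tied with position i.
            while j + 1 < k and scores[order[j + 1]] == scores[order[i]]:
                j += 1
            shared = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
            for t in range(i, j + 1):
                totals[order[t]] += shared
            i = j + 1
    return [t / len(rows) for t in totals]


def nemenyi_critical_difference(k, n, q_alpha):
    """Critical difference between mean ranks for k algorithms compared on n datasets."""
    return q_alpha * sqrt(k * (k + 1) / (6.0 * n))


# RMSE rows in the column order of the RMSE table (NB, SNB-FSS, TAN, K, K, AKDB).
rmse_rows = [
    [0.4896, 0.4823, 0.4886, 0.4846, 0.4889, 0.4807],  # Echocardiogram
    [0.2545, 0.2158, 0.2441, 0.2435, 0.2447, 0.2224],  # Iris
    [0.5987, 0.5987, 0.5994, 0.5997, 0.6035, 0.6035],  # Yeast
]
ranks = average_ranks(rmse_rows)

# Assumed setting: 6 algorithms, all 30 datasets, alpha = 0.05 (q = 2.850).
cd = nemenyi_critical_difference(k=6, n=30, q_alpha=2.850)
```

Two algorithms would be judged significantly different under this test when their mean ranks, computed over all 30 datasets rather than the three shown here, differ by more than the critical difference.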