Mirosław Kordos, Krystian Łapa.
Abstract
The purpose of instance selection is to reduce the data size while preserving as much of the useful information stored in the data as possible and detecting and removing erroneous and redundant information. In this work, we analyze instance selection in regression tasks and apply the NSGA-II multi-objective evolutionary algorithm to direct the search for the optimal subset of the training dataset, with the k-NN algorithm evaluating the solutions during the selection process. A key advantage of the method is that it yields a pool of solutions situated on the Pareto front, each of which is the best for a certain RMSE-compression balance. We discuss different parameters of the process and their influence on the results and put special effort into reducing the computational complexity of our approach. The experimental evaluation shows that the proposed method achieves good performance in terms of minimization of prediction error and minimization of dataset size.
Keywords: computational complexity; information selection; instance selection; k-NN; multi-objective evolutionary algorithms; regression
Year: 2018 PMID: 33265835 PMCID: PMC7512309 DOI: 10.3390/e20100746
Source DB: PubMed Journal: Entropy (Basel) ISSN: 1099-4300 Impact factor: 2.524
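The two objectives named in the abstract (prediction error and dataset size) can be illustrated for a single candidate solution. The following is a minimal sketch, not the authors' implementation: it assumes a solution is a binary mask over the training instances, that the error is the RMSE of a k-NN regressor built from the selected instances only, and that each instance is excluded from its own neighbor pool. The names `knn_predict` and `evaluate_mask` are hypothetical.

```python
import math

def knn_predict(train_X, train_y, query, k):
    """Predict the mean output of the k nearest training instances (Euclidean)."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)

def evaluate_mask(X, y, mask, k=1):
    """Return (rmse, retention) for a binary selection mask.

    Assumption: RMSE is measured over all training instances, predicting
    each one from the selected instances only (the instance itself excluded).
    """
    selected = [i for i, keep in enumerate(mask) if keep]
    retention = len(selected) / len(X)
    squared_error = 0.0
    for i in range(len(X)):
        pool = [j for j in selected if j != i]
        if not pool:
            # No usable neighbor: treat the solution as infeasible.
            return float("inf"), retention
        pred = knn_predict([X[j] for j in pool], [y[j] for j in pool], X[i], k)
        squared_error += (pred - y[i]) ** 2
    return math.sqrt(squared_error / len(X)), retention

# Toy data for illustration only (not from the paper).
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0.0, 1.0, 2.0, 3.0]
print(evaluate_mask(X, y, [1, 1, 1, 1], k=1))
print(evaluate_mask(X, y, [1, 0, 0, 1], k=1))
```

A multi-objective optimizer such as NSGA-II would minimize both returned values at once; the optimizer itself is omitted here.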
Figure 1. Instance selection in classification (a) and in regression (b). The axes represent the attributes x1 and x2. In classification, the red circle and blue cross represent points of two different classes. In regression, the height of the vertical line represents the output value of an instance and the circle shows its location in the input space.
Figure 2. Sample results of a MEISR algorithm run. Each pair of points (orange and green) represents one solution (one training set), with the percentage of selected instances on the horizontal axis and the corresponding RMSE on the training set (orange) and on the test set (green) on the vertical axis. Only the points that formed the Pareto front are shown. The horizontal orange and green lines show the RMSE on the training set and on the test set, respectively, without instance selection. The additional points shown in blue (c3, r3 and others close to them) are described in the text.
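Figure 2 shows only the solutions that formed the Pareto front. As a minimal sketch of what that filtering means, assuming each solution is summarized by a (retention, RMSE) pair and both values are minimized, the non-dominated pairs can be extracted as follows. The function name `pareto_front` and the sample values are hypothetical.

```python
def pareto_front(solutions):
    """Return the non-dominated (retention, rmse) pairs, both minimized."""
    front = []
    for cand in solutions:
        # A different pair that is no worse in both objectives dominates cand.
        dominated = any(
            other[0] <= cand[0] and other[1] <= cand[1] and other != cand
            for other in solutions
        )
        if not dominated:
            front.append(cand)
    return sorted(set(front))

# Illustrative values only (not taken from the paper).
candidates = [(0.60, 0.50), (0.25, 0.65), (0.50, 0.55), (0.70, 0.52), (0.25, 0.70)]
print(pareto_front(candidates))
```

This quadratic scan is enough for a final population of modest size; NSGA-II itself uses a faster non-dominated sorting procedure.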
Figure 3. The experimental process.
Datasets used in the experiments and their properties: number of instances (Inst.), number of attributes (Attr.), the optimal k, i.e., the k in k-NN that gives the lowest RMSE (orgK), and the k used in the experiments as the optimal k (optK), for the reason explained further in the text.
| Dataset | Inst. | Attr. | orgK | optK |
|---|---|---|---|---|
| mach. CPU | 209 | 6 | 1 | 1 |
| baseball | 337 | 16 | 7 | 7 |
| dee | 365 | 6 | 7 | 7 |
| autoMPG8 | 392 | 7 | 6 | 6 |
| autoMPG6 | 392 | 5 | 4 | 4 |
| ele-1 | 495 | 2 | 11 | 11 |
| forestFires | 517 | 12 | 50 | 11 |
| stock | 950 | 9 | 3 | 3 |
| steel | 960 | 12 | 4 | 4 |
| laser | 993 | 4 | 3 | 3 |
| concrete | 1030 | 8 | 4 | 4 |
| treasury | 1049 | 15 | 3 | 3 |
| mortgage | 1049 | 15 | 2 | 2 |
| friedman | 1200 | 5 | 7 | 7 |
| wizmir | 1461 | 9 | 7 | 7 |
| wankara | 1609 | 9 | 9 | 9 |
| plastic | 1650 | 2 | 3 | 11 |
| quake | 2178 | 3 | 50 | 11 |
| anacalt | 4052 | 7 | 2 | 2 |
| abalone | 4177 | 8 | 13 | 11 |
| delta-ail | 7128 | 5 | 17 | 11 |
| puma32h | 8191 | 32 | 21 | 11 |
| compactiv | 8192 | 21 | 2 | 2 |
| delta-elv | 9516 | 6 | 35 | 11 |
| tic | 9822 | 85 | 50 | 11 |
| ailerons | 13,750 | 40 | 10 | 10 |
| pole | 14,998 | 26 | 4 | 4 |
| elevators | 16,598 | 18 | 8 | 8 |
| california | 20,640 | 8 | 9 | 9 |
| house | 22,784 | 16 | 11 | 11 |
| mv | 40,767 | 10 | 9 | 9 |
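The orgK column above is described as the k that gives the lowest error for k-NN. One common way to find such a k is to scan candidate values and keep the one with the lowest validation RMSE. The sketch below assumes leave-one-out validation and Euclidean distance, which the source does not confirm; `loo_rmse` and `best_k` are hypothetical names.

```python
import math

def loo_rmse(X, y, k):
    """Leave-one-out RMSE of k-NN regression with Euclidean distance."""
    squared_error = 0.0
    for i in range(len(X)):
        others = [j for j in range(len(X)) if j != i]
        others.sort(key=lambda j: math.dist(X[j], X[i]))
        nearest = others[:k]
        pred = sum(y[j] for j in nearest) / len(nearest)
        squared_error += (pred - y[i]) ** 2
    return math.sqrt(squared_error / len(X))

def best_k(X, y, candidates):
    """Return the candidate k with the lowest leave-one-out RMSE.

    Ties go to the candidate listed first.
    """
    return min(candidates, key=lambda k: loo_rmse(X, y, k))

# Toy data for illustration only (not from the paper).
X = [[0.0], [1.0], [2.0], [3.0], [4.0]]
y = [0.0, 1.0, 2.0, 3.0, 4.0]
print(best_k(X, y, [1, 2, 3]))
```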
Averaged results for experiments presented in Table A3, Table A4, Table A5, Table A6, Table A7, Table A8, Table A9 and Table A10.
| IEA | FPA | Validation | r0 | r1 | r1/r0 | c1 | r2 | r2/r0 | c2 | Details |
|---|---|---|---|---|---|---|---|---|---|---|
| 1-NN (BIS) | 1-NN (BIS) | 50%H | 0.604 | 0.537 | 0.955 | 0.619 | 0.646 | 1.176 | 0.246 | see |
| 1-NN (BIS) | 1-NN (BIS) | 10-CV | 0.573 | 0.503 | 0.960 | 0.627 | 0.594 | 1.175 | 0.214 | see |
| k-NN (BIS) | k-NN (BIS) | 50%H | 0.476 | 0.486 | 1.050 | 0.505 | 0.516 | 1.163 | 0.249 | see |
| k-NN (BIS) | k-NN (BIS) | 10-CV | 0.446 | 0.453 | 1.056 | 0.497 | 0.487 | 1.203 | 0.227 | see |
| k-NN (BIS) | MLP | 50%H | 0.427 | 0.424 | 0.994 | 0.505 | 0.458 | 1.076 | 0.249 | see |
| k-NN (BIS) | MLP | 10-CV | 0.393 | 0.391 | 0.992 | 0.497 | 0.418 | 1.068 | 0.227 | see |
| k-NN (RIS) | k-NN (RIS) | 50%H | 0.476 | 0.496 | 1.116 | 0.557 | 0.555 | 1.307 | 0.185 | see |
| k-NN (RIS) | k-NN (RIS) | 10-CV | 0.448 | 0.465 | 1.113 | 0.532 | 0.508 | 1.282 | 0.165 | see |
Experimental results for MEISR with 50%H, IEA: 1-NN (BIS), FPA: 1-NN (BIS).
| Dataset | r0 | r1 | r1/r0 | c1 | r2 | r2/r0 | c2 | r | r/r0 | c(r) |
|---|---|---|---|---|---|---|---|---|---|---|
| mach. CPU | 0.450 | 0.517 | 1.149 | 0.459 | 0.541 | 1.202 | 0.249 | 0.428 | 0.951 | 0.910 |
| baseball | 0.758 | 0.746 | 0.984 | 0.609 | 0.789 | 1.041 | 0.668 | 0.746 | 0.984 | 0.609 |
| dee | 0.561 | 0.502 | 0.894 | 0.433 | 0.561 | 0.999 | 0.532 | 0.502 | 0.894 | 0.433 |
| autoMPG8 | 0.478 | 0.450 | 0.942 | 0.519 | 0.509 | 1.065 | 0.199 | 0.450 | 0.942 | 0.519 |
| autoMPG6 | 0.464 | 0.464 | 1.001 | 0.653 | 0.691 | 1.491 | 0.203 | 0.451 | 0.973 | 0.925 |
| ele-1 | 0.690 | 0.670 | 0.970 | 0.628 | 0.708 | 1.025 | 0.596 | 0.670 | 0.970 | 0.628 |
| forestFires | 1.669 | 0.899 | 0.539 | 0.413 | 0.902 | 0.541 | 0.201 | 0.899 | 0.539 | 0.413 |
| stock | 0.136 | 0.158 | 1.162 | 0.450 | 0.217 | 1.596 | 0.168 | 0.136 | 1.000 | 1.000 |
| steel | 0.359 | 0.404 | 1.124 | 0.480 | 0.416 | 1.158 | 0.391 | 0.359 | 1.000 | 1.000 |
| laser | 0.293 | 0.312 | 1.064 | 0.516 | 0.343 | 1.170 | 0.208 | 0.284 | 0.969 | 0.898 |
| concrete | 0.639 | 0.677 | 1.060 | 0.664 | 0.825 | 1.292 | 0.201 | 0.639 | 1.000 | 1.000 |
| treasury | 0.092 | 0.095 | 1.033 | 0.530 | 0.145 | 1.576 | 0.180 | 0.090 | 0.978 | 0.902 |
| mortgage | 0.071 | 0.082 | 1.151 | 0.453 | 0.112 | 1.572 | 0.309 | 0.071 | 1.000 | 1.000 |
| friedman | 0.510 | 0.467 | 0.915 | 0.735 | 0.647 | 1.268 | 0.156 | 0.467 | 0.915 | 0.735 |
| wizmir | 0.244 | 0.245 | 1.003 | 0.726 | 0.316 | 1.293 | 0.192 | 0.244 | 1.000 | 1.000 |
| wankara | 0.239 | 0.229 | 0.959 | 0.727 | 0.295 | 1.236 | 0.209 | 0.229 | 0.959 | 0.727 |
| plastic | 0.634 | 0.518 | 0.817 | 0.719 | 0.608 | 0.959 | 0.286 | 0.518 | 0.817 | 0.719 |
| quake | 1.349 | 1.124 | 0.833 | 0.723 | 1.348 | 0.999 | 0.143 | 1.124 | 0.833 | 0.723 |
| anacalt | 0.263 | 0.272 | 1.036 | 0.521 | 0.304 | 1.157 | 0.217 | 0.248 | 0.944 | 0.886 |
| abalone | 0.900 | 0.771 | 0.857 | 0.718 | 0.926 | 1.029 | 0.185 | 0.771 | 0.857 | 0.718 |
| delta-ail | 0.733 | 0.691 | 0.943 | 0.706 | 0.784 | 1.069 | 0.208 | 0.691 | 0.943 | 0.706 |
| puma32h | 1.227 | 1.036 | 0.844 | 0.716 | 1.279 | 1.042 | 0.213 | 1.036 | 0.844 | 0.716 |
| compactiv | 0.303 | 0.344 | 1.135 | 0.396 | 0.384 | 1.267 | 0.312 | 0.303 | 1.000 | 1.000 |
| delta-elv | 0.826 | 0.709 | 0.859 | 0.719 | 0.901 | 1.091 | 0.165 | 0.709 | 0.859 | 0.719 |
| tic | 1.348 | 1.111 | 0.824 | 0.719 | 1.349 | 1.001 | 0.189 | 1.111 | 0.824 | 0.719 |
| ailerons | 0.685 | 0.601 | 0.878 | 0.720 | 0.748 | 1.092 | 0.144 | 0.601 | 0.878 | 0.720 |
| pole | 0.271 | 0.288 | 1.061 | 0.716 | 0.371 | 1.367 | 0.205 | 0.271 | 1.000 | 0.925 |
| elevators | 0.719 | 0.659 | 0.916 | 0.719 | 0.940 | 1.307 | 0.171 | 0.659 | 0.916 | 0.719 |
| california | 0.681 | 0.612 | 0.899 | 0.716 | 0.766 | 1.125 | 0.163 | 0.612 | 0.899 | 0.716 |
| house | 0.893 | 0.787 | 0.881 | 0.717 | 0.983 | 1.101 | 0.199 | 0.787 | 0.881 | 0.717 |
| mv | 0.235 | 0.208 | 0.884 | 0.677 | 0.312 | 1.325 | 0.162 | 0.208 | 0.884 | 0.677 |
| average | 0.604 | 0.537 | 0.955 | 0.619 | 0.646 | 1.176 | 0.246 | 0.526 | 0.918 | 0.777 |
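The average row deserves a note on how the ratio columns are aggregated. The averaged r1/r0 of 0.955 is not the ratio of the averaged columns (0.537 / 0.604 ≈ 0.889); it appears to be the mean of the per-dataset ratios. The sketch below contrasts the two aggregations on the first three rows of the table; the function names are hypothetical.

```python
def mean_of_ratios(baseline, reduced):
    """Average the per-dataset ratios reduced/baseline."""
    return sum(r / b for b, r in zip(baseline, reduced)) / len(baseline)

def ratio_of_means(baseline, reduced):
    """Divide the column averages (equivalently, the column sums)."""
    return sum(reduced) / sum(baseline)

# r0 and r1 for mach. CPU, baseball and dee, taken from the table above.
r0 = [0.450, 0.758, 0.561]
r1 = [0.517, 0.746, 0.502]
print(round(mean_of_ratios(r0, r1), 3), round(ratio_of_means(r0, r1), 3))
```

The mean of ratios weights every dataset equally, whereas the ratio of means is dominated by datasets with large absolute error values.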
Experimental results for MEISR with 10-CV, IEA: 1-NN (BIS), FPA: 1-NN (BIS).
| Dataset | r0 | r1 | r1/r0 | c1 | r2 | r2/r0 | c2 | r | r/r0 | c(r) |
|---|---|---|---|---|---|---|---|---|---|---|
| mach. CPU | 0.351 | 0.407 | 1.159 | 0.385 | 0.426 | 1.212 | 0.216 | 0.351 | 1.000 | 0.914 |
| baseball | 0.727 | 0.639 | 0.878 | 0.662 | 0.654 | 0.899 | 0.559 | 0.639 | 0.878 | 0.662 |
| dee | 0.555 | 0.450 | 0.811 | 0.652 | 0.510 | 0.918 | 0.401 | 0.450 | 0.811 | 0.652 |
| autoMPG8 | 0.429 | 0.387 | 0.904 | 0.685 | 0.483 | 1.127 | 0.155 | 0.387 | 0.904 | 0.685 |
| autoMPG6 | 0.407 | 0.424 | 1.042 | 0.698 | 0.506 | 1.242 | 0.150 | 0.401 | 0.985 | 0.920 |
| ele-1 | 0.689 | 0.640 | 0.929 | 0.678 | 0.666 | 0.966 | 0.619 | 0.640 | 0.929 | 0.678 |
| forestFires | 1.548 | 0.768 | 0.496 | 0.344 | 0.796 | 0.514 | 0.191 | 0.768 | 0.496 | 0.344 |
| stock | 0.112 | 0.132 | 1.175 | 0.629 | 0.190 | 1.692 | 0.155 | 0.112 | 1.000 | 0.918 |
| steel | 0.348 | 0.317 | 0.911 | 0.469 | 0.402 | 1.154 | 0.369 | 0.317 | 0.911 | 0.469 |
| laser | 0.231 | 0.276 | 1.192 | 0.467 | 0.306 | 1.324 | 0.221 | 0.231 | 1.000 | 0.911 |
| concrete | 0.533 | 0.595 | 1.117 | 0.696 | 0.768 | 1.441 | 0.155 | 0.533 | 1.000 | 1.000 |
| treasury | 0.069 | 0.087 | 1.259 | 0.662 | 0.118 | 1.693 | 0.159 | 0.069 | 1.000 | 0.916 |
| mortgage | 0.054 | 0.069 | 1.284 | 0.545 | 0.096 | 1.784 | 0.287 | 0.054 | 1.000 | 0.926 |
| friedman | 0.462 | 0.443 | 0.959 | 0.692 | 0.595 | 1.288 | 0.148 | 0.443 | 0.959 | 0.692 |
| wizmir | 0.230 | 0.211 | 0.919 | 0.693 | 0.301 | 1.311 | 0.160 | 0.211 | 0.919 | 0.693 |
| wankara | 0.225 | 0.218 | 0.966 | 0.692 | 0.289 | 1.285 | 0.160 | 0.218 | 0.966 | 0.692 |
| plastic | 0.617 | 0.543 | 0.879 | 0.682 | 0.607 | 0.983 | 0.212 | 0.543 | 0.879 | 0.682 |
| quake | 1.344 | 1.151 | 0.856 | 0.685 | 1.349 | 1.003 | 0.157 | 1.151 | 0.856 | 0.685 |
| anacalt | 0.227 | 0.227 | 1.000 | 0.449 | 0.286 | 1.260 | 0.186 | 0.222 | 0.977 | 0.887 |
| abalone | 0.915 | 0.773 | 0.845 | 0.685 | 0.880 | 0.962 | 0.155 | 0.773 | 0.845 | 0.685 |
| delta-ail | 0.716 | 0.628 | 0.877 | 0.685 | 0.738 | 1.031 | 0.159 | 0.628 | 0.877 | 0.685 |
| puma32h | 1.212 | 1.024 | 0.845 | 0.689 | 1.212 | 1.000 | 0.161 | 1.024 | 0.845 | 0.689 |
| compactiv | 0.254 | 0.299 | 1.177 | 0.429 | 0.297 | 1.172 | 0.237 | 0.235 | 0.926 | 0.847 |
| delta-elv | 0.828 | 0.705 | 0.851 | 0.685 | 0.837 | 1.010 | 0.157 | 0.705 | 0.851 | 0.685 |
| tic | 1.366 | 1.130 | 0.827 | 0.685 | 1.330 | 0.973 | 0.156 | 1.130 | 0.827 | 0.685 |
| ailerons | 0.657 | 0.584 | 0.889 | 0.688 | 0.709 | 1.079 | 0.160 | 0.584 | 0.889 | 0.688 |
| pole | 0.244 | 0.258 | 1.055 | 0.686 | 0.353 | 1.443 | 0.157 | 0.244 | 1.000 | 0.922 |
| elevators | 0.686 | 0.638 | 0.930 | 0.687 | 0.764 | 1.114 | 0.162 | 0.638 | 0.930 | 0.687 |
| california | 0.654 | 0.596 | 0.911 | 0.686 | 0.718 | 1.099 | 0.158 | 0.596 | 0.911 | 0.686 |
| house | 0.872 | 0.775 | 0.888 | 0.684 | 0.926 | 1.061 | 0.159 | 0.775 | 0.888 | 0.684 |
| mv | 0.210 | 0.197 | 0.941 | 0.686 | 0.291 | 1.387 | 0.158 | 0.197 | 0.941 | 0.686 |
| average | 0.573 | 0.503 | 0.960 | 0.627 | 0.594 | 1.175 | 0.214 | 0.493 | 0.910 | 0.740 |
Experimental results for MEISR with 50%H, IEA: k-NN (BIS), FPA: k-NN (BIS).
| Dataset | r0 | r1 | r1/r0 | c1 | r2 | r2/r0 | c2 | r | r/r0 | c(r) |
|---|---|---|---|---|---|---|---|---|---|---|
| mach. CPU | 0.450 | 0.517 | 1.149 | 0.437 | 0.541 | 1.202 | 0.249 | 0.441 | 0.980 | 0.877 |
| baseball | 0.648 | 0.671 | 1.035 | 0.409 | 0.718 | 1.108 | 0.287 | 0.648 | 1.000 | 0.409 |
| dee | 0.472 | 0.454 | 0.962 | 0.378 | 0.462 | 0.979 | 0.314 | 0.442 | 0.937 | 0.912 |
| autoMPG8 | 0.422 | 0.412 | 0.976 | 0.428 | 0.481 | 1.140 | 0.299 | 0.407 | 0.964 | 0.936 |
| autoMPG6 | 0.411 | 0.398 | 0.968 | 0.489 | 0.452 | 1.100 | 0.261 | 0.377 | 0.917 | 0.939 |
| ele-1 | 0.584 | 0.562 | 0.962 | 0.409 | 0.589 | 1.009 | 0.321 | 0.562 | 0.962 | 0.409 |
| forestFires | 0.901 | 0.847 | 0.940 | 0.324 | 0.870 | 0.966 | 0.230 | 0.847 | 0.940 | 0.324 |
| stock | 0.129 | 0.148 | 1.147 | 0.542 | 0.168 | 1.302 | 0.293 | 0.124 | 0.961 | 0.962 |
| steel | 0.364 | 0.391 | 1.076 | 0.481 | 0.490 | 1.348 | 0.247 | 0.364 | 1.000 | 0.912 |
| laser | 0.274 | 0.284 | 1.038 | 0.472 | 0.324 | 1.184 | 0.239 | 0.258 | 0.943 | 0.941 |
| concrete | 0.576 | 0.631 | 1.096 | 0.497 | 0.687 | 1.193 | 0.205 | 0.576 | 1.000 | 0.922 |
| treasury | 0.079 | 0.086 | 1.093 | 0.502 | 0.126 | 1.594 | 0.255 | 0.079 | 1.000 | 1.000 |
| mortgage | 0.065 | 0.082 | 1.258 | 0.517 | 0.137 | 2.101 | 0.172 | 0.065 | 1.000 | 0.941 |
| friedman | 0.375 | 0.410 | 1.092 | 0.506 | 0.490 | 1.305 | 0.200 | 0.375 | 1.000 | 0.899 |
| wizmir | 0.199 | 0.215 | 1.081 | 0.384 | 0.225 | 1.131 | 0.205 | 0.199 | 1.000 | 1.000 |
| wankara | 0.185 | 0.199 | 1.076 | 0.503 | 0.224 | 1.211 | 0.210 | 0.185 | 1.000 | 1.000 |
| plastic | 0.463 | 0.460 | 0.994 | 0.491 | 0.458 | 0.989 | 0.211 | 0.460 | 0.994 | 0.491 |
| quake | 1.024 | 1.008 | 0.984 | 0.621 | 1.030 | 1.005 | 0.215 | 1.008 | 0.984 | 0.621 |
| anacalt | 0.242 | 0.254 | 1.052 | 0.484 | 0.286 | 1.184 | 0.260 | 0.242 | 1.000 | 1.000 |
| abalone | 0.713 | 0.704 | 0.987 | 0.655 | 0.721 | 1.011 | 0.197 | 0.704 | 0.987 | 0.887 |
| delta-ail | 0.582 | 0.588 | 1.010 | 0.676 | 0.688 | 1.181 | 0.187 | 0.582 | 1.000 | 0.905 |
| puma32h | 0.910 | 0.918 | 1.008 | 0.583 | 0.932 | 1.024 | 0.220 | 0.910 | 1.000 | 1.000 |
| compactiv | 0.281 | 0.316 | 1.125 | 0.426 | 0.337 | 1.200 | 0.243 | 0.281 | 1.000 | 1.000 |
| delta-elv | 0.626 | 0.624 | 0.996 | 0.687 | 0.628 | 1.003 | 0.266 | 0.624 | 0.996 | 1.000 |
| tic | 1.011 | 0.997 | 0.986 | 0.683 | 1.007 | 0.996 | 0.224 | 0.997 | 0.986 | 0.683 |
| ailerons | 0.522 | 0.533 | 1.021 | 0.523 | 0.568 | 1.088 | 0.272 | 0.522 | 1.000 | 0.932 |
| pole | 0.244 | 0.270 | 1.107 | 0.516 | 0.301 | 1.234 | 0.302 | 0.244 | 1.000 | 0.933 |
| elevators | 0.586 | 0.604 | 1.030 | 0.534 | 0.643 | 1.097 | 0.287 | 0.586 | 1.000 | 1.000 |
| california | 0.548 | 0.566 | 1.033 | 0.535 | 0.446 | 0.814 | 0.299 | 0.548 | 1.000 | 0.948 |
| house | 0.712 | 0.724 | 1.017 | 0.566 | 0.764 | 1.073 | 0.287 | 0.712 | 1.000 | 0.901 |
| mv | 0.160 | 0.198 | 1.236 | 0.384 | 0.205 | 1.280 | 0.267 | 0.160 | 1.000 | 1.000 |
| average | 0.476 | 0.486 | 1.050 | 0.505 | 0.516 | 1.163 | 0.249 | 0.468 | 0.986 | 0.861 |
Experimental results for MEISR with 10-CV, IEA: k-NN (BIS), FPA: k-NN (BIS).
| Dataset | r0 | r1 | r1/r0 | c1 | r2 | r2/r0 | c2 | r | r/r0 | c(r) |
|---|---|---|---|---|---|---|---|---|---|---|
| mach. CPU | 0.351 | 0.370 | 1.053 | 0.436 | 0.410 | 1.168 | 0.213 | 0.339 | 0.964 | 0.852 |
| baseball | 0.584 | 0.582 | 0.998 | 0.450 | 0.649 | 1.112 | 0.217 | 0.582 | 0.998 | 0.450 |
| dee | 0.424 | 0.427 | 1.008 | 0.403 | 0.439 | 1.035 | 0.229 | 0.418 | 0.986 | 0.928 |
| autoMPG8 | 0.372 | 0.401 | 1.078 | 0.458 | 0.422 | 1.134 | 0.218 | 0.364 | 0.978 | 0.927 |
| autoMPG6 | 0.366 | 0.382 | 1.043 | 0.424 | 0.393 | 1.074 | 0.228 | 0.351 | 0.921 | 0.944 |
| ele-1 | 0.584 | 0.557 | 0.954 | 0.533 | 0.585 | 1.002 | 0.367 | 0.557 | 0.954 | 0.533 |
| forestFires | 0.864 | 0.725 | 0.839 | 0.340 | 0.726 | 0.840 | 0.162 | 0.725 | 0.839 | 0.340 |
| stock | 0.105 | 0.124 | 1.181 | 0.486 | 0.145 | 1.386 | 0.340 | 0.104 | 0.990 | 0.954 |
| steel | 0.323 | 0.330 | 1.022 | 0.444 | 0.421 | 1.302 | 0.224 | 0.323 | 1.000 | 0.934 |
| laser | 0.204 | 0.210 | 1.031 | 0.435 | 0.261 | 1.280 | 0.208 | 0.195 | 0.955 | 0.926 |
| concrete | 0.521 | 0.549 | 1.054 | 0.488 | 0.648 | 1.244 | 0.193 | 0.521 | 1.000 | 0.929 |
| treasury | 0.058 | 0.077 | 1.321 | 0.486 | 0.103 | 1.768 | 0.234 | 0.058 | 1.000 | 0.927 |
| mortgage | 0.045 | 0.059 | 1.304 | 0.489 | 0.107 | 2.387 | 0.156 | 0.045 | 1.000 | 0.923 |
| friedman | 0.340 | 0.365 | 1.073 | 0.472 | 0.441 | 1.297 | 0.162 | 0.340 | 1.000 | 0.947 |
| wizmir | 0.178 | 0.188 | 1.060 | 0.470 | 0.231 | 1.297 | 0.214 | 0.178 | 1.000 | 0.779 |
| wankara | 0.167 | 0.183 | 1.098 | 0.497 | 0.220 | 1.315 | 0.160 | 0.167 | 1.000 | 1.000 |
| plastic | 0.468 | 0.453 | 0.969 | 0.446 | 0.466 | 0.996 | 0.212 | 0.453 | 0.969 | 0.446 |
| quake | 1.025 | 1.008 | 0.983 | 0.685 | 1.025 | 1.001 | 0.219 | 1.008 | 0.983 | 0.685 |
| anacalt | 0.212 | 0.215 | 1.013 | 0.440 | 0.282 | 1.330 | 0.174 | 0.209 | 0.988 | 0.684 |
| abalone | 0.702 | 0.709 | 1.011 | 0.682 | 0.750 | 1.068 | 0.158 | 0.688 | 0.981 | 0.801 |
| delta-ail | 0.560 | 0.579 | 1.033 | 0.686 | 0.604 | 1.078 | 0.158 | 0.560 | 1.000 | 0.835 |
| puma32h | 0.896 | 0.908 | 1.014 | 0.461 | 0.920 | 1.027 | 0.222 | 0.896 | 1.000 | 1.000 |
| compactiv | 0.231 | 0.280 | 1.209 | 0.461 | 0.286 | 1.238 | 0.275 | 0.231 | 1.000 | 1.000 |
| delta-elv | 0.610 | 0.620 | 1.016 | 0.684 | 0.628 | 1.029 | 0.225 | 0.610 | 1.000 | 1.000 |
| tic | 1.015 | 0.995 | 0.980 | 0.684 | 1.011 | 0.996 | 0.186 | 0.995 | 0.980 | 0.684 |
| ailerons | 0.504 | 0.519 | 1.031 | 0.473 | 0.550 | 1.092 | 0.263 | 0.504 | 1.000 | 0.934 |
| pole | 0.214 | 0.241 | 1.125 | 0.491 | 0.261 | 1.219 | 0.307 | 0.214 | 1.000 | 0.947 |
| elevators | 0.559 | 0.581 | 1.040 | 0.448 | 0.622 | 1.112 | 0.226 | 0.559 | 1.000 | 1.000 |
| california | 0.527 | 0.546 | 1.037 | 0.488 | 0.571 | 1.083 | 0.332 | 0.527 | 1.000 | 0.944 |
| house | 0.687 | 0.707 | 1.029 | 0.483 | 0.737 | 1.074 | 0.276 | 0.687 | 1.000 | 0.894 |
| mv | 0.140 | 0.159 | 1.136 | 0.492 | 0.183 | 1.311 | 0.273 | 0.140 | 1.000 | 1.000 |
| average | 0.446 | 0.453 | 1.056 | 0.497 | 0.487 | 1.203 | 0.227 | 0.437 | 0.984 | 0.843 |
Experimental results for MEISR with 50%H, IEA: k-NN (BIS), FPA: MLP.
| Dataset | r0 | r1 | r1/r0 | c1 | r2 | r2/r0 | c2 | r | r/r0 | c(r) |
|---|---|---|---|---|---|---|---|---|---|---|
| mach. CPU | 0.457 | 0.491 | 1.076 | 0.437 | 0.570 | 1.247 | 0.249 | 0.457 | 1.000 | 0.877 |
| baseball | 0.618 | 0.659 | 1.067 | 0.409 | 0.763 | 1.235 | 0.287 | 0.618 | 1.000 | 0.409 |
| dee | 0.439 | 0.453 | 1.033 | 0.378 | 0.500 | 1.141 | 0.314 | 0.428 | 0.976 | 0.912 |
| autoMPG8 | 0.366 | 0.343 | 0.937 | 0.428 | 0.432 | 1.182 | 0.299 | 0.366 | 1.000 | 1.000 |
| autoMPG6 | 0.367 | 0.344 | 0.938 | 0.489 | 0.416 | 1.132 | 0.261 | 0.344 | 0.938 | 0.939 |
| ele-1 | 0.554 | 0.512 | 0.923 | 0.409 | 0.585 | 1.056 | 0.321 | 0.512 | 0.923 | 0.409 |
| forestFires | 1.084 | 1.026 | 0.946 | 0.324 | 1.124 | 1.037 | 0.230 | 1.026 | 0.946 | 0.324 |
| stock | 0.195 | 0.190 | 0.974 | 0.542 | 0.198 | 1.011 | 0.293 | 0.190 | 0.974 | 0.962 |
| steel | 0.251 | 0.219 | 0.872 | 0.481 | 0.224 | 0.893 | 0.247 | 0.219 | 0.872 | 0.481 |
| laser | 0.204 | 0.203 | 0.994 | 0.472 | 0.203 | 0.995 | 0.239 | 0.203 | 0.994 | 1.000 |
| concrete | 0.407 | 0.423 | 1.039 | 0.497 | 0.449 | 1.103 | 0.205 | 0.407 | 1.000 | 1.000 |
| treasury | 0.077 | 0.076 | 0.985 | 0.502 | 0.080 | 1.050 | 0.255 | 0.076 | 0.985 | 1.000 |
| mortgage | 0.070 | 0.069 | 0.981 | 0.517 | 0.079 | 1.127 | 0.172 | 0.066 | 0.941 | 0.941 |
| friedman | 0.301 | 0.300 | 0.997 | 0.506 | 0.347 | 1.152 | 0.200 | 0.298 | 0.989 | 0.899 |
| wizmir | 0.095 | 0.096 | 1.007 | 0.384 | 0.099 | 1.042 | 0.205 | 0.095 | 1.000 | 1.000 |
| wankara | 0.100 | 0.098 | 0.980 | 0.503 | 0.102 | 1.023 | 0.210 | 0.096 | 0.961 | 0.843 |
| plastic | 0.435 | 0.437 | 1.004 | 0.491 | 0.442 | 1.016 | 0.211 | 0.435 | 1.000 | 0.491 |
| quake | 1.000 | 0.999 | 0.999 | 0.621 | 1.054 | 1.054 | 0.215 | 0.999 | 0.999 | 0.621 |
| anacalt | 0.223 | 0.199 | 0.891 | 0.484 | 0.251 | 1.124 | 0.260 | 0.199 | 0.891 | 0.484 |
| abalone | 0.652 | 0.642 | 0.984 | 0.655 | 0.646 | 0.991 | 0.197 | 0.642 | 0.984 | 0.887 |
| delta-ail | 0.554 | 0.552 | 0.997 | 0.676 | 0.551 | 0.996 | 0.187 | 0.550 | 0.993 | 0.905 |
| puma32h | 0.368 | 0.406 | 1.104 | 0.583 | 0.473 | 1.286 | 0.220 | 0.368 | 1.000 | 1.000 |
| compactiv | 0.155 | 0.158 | 1.019 | 0.426 | 0.164 | 1.055 | 0.243 | 0.155 | 1.000 | 1.000 |
| delta-elv | 0.600 | 0.590 | 0.982 | 0.687 | 0.614 | 1.022 | 0.266 | 0.588 | 0.980 | 0.910 |
| tic | 1.019 | 1.014 | 0.995 | 0.683 | 1.052 | 1.033 | 0.224 | 1.014 | 0.995 | 0.683 |
| ailerons | 0.414 | 0.425 | 1.026 | 0.523 | 0.447 | 1.080 | 0.272 | 0.414 | 1.000 | 1.000 |
| pole | 0.241 | 0.251 | 1.041 | 0.516 | 0.282 | 1.170 | 0.302 | 0.241 | 1.000 | 1.000 |
| elevators | 0.682 | 0.680 | 0.997 | 0.534 | 0.706 | 1.036 | 0.287 | 0.677 | 0.993 | 0.922 |
| california | 0.532 | 0.531 | 0.998 | 0.535 | 0.550 | 1.034 | 0.299 | 0.532 | 1.000 | 1.000 |
| house | 0.712 | 0.719 | 1.011 | 0.566 | 0.728 | 1.023 | 0.287 | 0.712 | 1.000 | 1.000 |
| mv | 0.055 | 0.055 | 1.002 | 0.384 | 0.055 | 1.008 | 0.267 | 0.055 | 1.000 | 1.000 |
| average | 0.427 | 0.424 | 0.994 | 0.505 | 0.458 | 1.076 | 0.249 | 0.419 | 0.979 | 0.835 |
Experimental results for MEISR with 10-CV, IEA: k-NN (BIS), FPA: MLP.
| Dataset | r0 | r1 | r1/r0 | c1 | r2 | r2/r0 | c2 | r | r/r0 | c(r) |
|---|---|---|---|---|---|---|---|---|---|---|
| mach. CPU | 0.348 | 0.379 | 1.089 | 0.436 | 0.438 | 1.261 | 0.213 | 0.378 | 1.000 | 1.000 |
| baseball | 0.626 | 0.656 | 1.048 | 0.450 | 0.750 | 1.198 | 0.217 | 0.643 | 1.000 | 1.000 |
| dee | 0.407 | 0.412 | 1.012 | 0.403 | 0.470 | 1.153 | 0.229 | 0.404 | 0.992 | 0.928 |
| autoMPG8 | 0.354 | 0.336 | 0.948 | 0.458 | 0.421 | 1.187 | 0.218 | 0.334 | 0.942 | 0.927 |
| autoMPG6 | 0.357 | 0.330 | 0.924 | 0.424 | 0.407 | 1.141 | 0.228 | 0.326 | 0.914 | 0.944 |
| ele-1 | 0.531 | 0.488 | 0.920 | 0.533 | 0.552 | 1.040 | 0.367 | 0.486 | 0.917 | 0.533 |
| forestFires | 0.735 | 0.699 | 0.952 | 0.340 | 0.745 | 1.014 | 0.162 | 0.691 | 0.940 | 0.340 |
| stock | 0.178 | 0.171 | 0.958 | 0.486 | 0.181 | 1.016 | 0.340 | 0.168 | 0.945 | 0.954 |
| steel | 0.225 | 0.198 | 0.882 | 0.444 | 0.199 | 0.885 | 0.224 | 0.196 | 0.872 | 0.934 |
| laser | 0.164 | 0.162 | 0.988 | 0.435 | 0.165 | 1.011 | 0.208 | 0.160 | 0.975 | 0.926 |
| concrete | 0.379 | 0.386 | 1.017 | 0.488 | 0.416 | 1.097 | 0.193 | 0.379 | 1.000 | 0.929 |
| treasury | 0.074 | 0.072 | 0.970 | 0.486 | 0.077 | 1.042 | 0.234 | 0.072 | 0.968 | 0.927 |
| mortgage | 0.055 | 0.054 | 0.975 | 0.489 | 0.060 | 1.088 | 0.156 | 0.053 | 0.958 | 0.923 |
| friedman | 0.318 | 0.315 | 0.992 | 0.472 | 0.368 | 1.159 | 0.162 | 0.313 | 0.985 | 0.947 |
| wizmir | 0.085 | 0.086 | 1.013 | 0.470 | 0.087 | 1.027 | 0.214 | 0.085 | 1.000 | 1.000 |
| wankara | 0.097 | 0.095 | 0.978 | 0.497 | 0.097 | 1.004 | 0.160 | 0.094 | 0.968 | 1.000 |
| plastic | 0.437 | 0.433 | 0.990 | 0.446 | 0.434 | 0.995 | 0.212 | 0.433 | 0.990 | 0.446 |
| quake | 1.000 | 0.992 | 0.992 | 0.685 | 1.017 | 1.017 | 0.219 | 0.992 | 0.992 | 0.685 |
| anacalt | 0.201 | 0.181 | 0.898 | 0.440 | 0.220 | 1.096 | 0.174 | 0.180 | 0.894 | 0.684 |
| abalone | 0.651 | 0.650 | 0.998 | 0.682 | 0.655 | 1.006 | 0.158 | 0.649 | 0.997 | 0.801 |
| delta-ail | 0.552 | 0.551 | 0.997 | 0.686 | 0.555 | 1.005 | 0.158 | 0.548 | 0.993 | 0.835 |
| puma32h | 0.338 | 0.373 | 1.103 | 0.461 | 0.433 | 1.283 | 0.222 | 0.371 | 1.000 | 1.000 |
| compactiv | 0.152 | 0.155 | 1.018 | 0.461 | 0.158 | 1.042 | 0.275 | 0.153 | 1.000 | 1.000 |
| delta-elv | 0.599 | 0.598 | 0.998 | 0.684 | 0.604 | 1.009 | 0.225 | 0.596 | 0.995 | 1.000 |
| tic | 1.017 | 1.017 | 1.000 | 0.684 | 1.068 | 1.050 | 0.186 | 1.011 | 0.993 | 0.684 |
| ailerons | 0.402 | 0.409 | 1.018 | 0.473 | 0.418 | 1.041 | 0.263 | 0.407 | 1.000 | 1.000 |
| pole | 0.255 | 0.265 | 1.038 | 0.491 | 0.303 | 1.188 | 0.307 | 0.262 | 1.000 | 1.000 |
| elevators | 0.344 | 0.347 | 1.011 | 0.448 | 0.348 | 1.013 | 0.226 | 0.347 | 1.000 | 1.000 |
| california | 0.532 | 0.531 | 0.997 | 0.488 | 0.532 | 0.999 | 0.332 | 0.524 | 0.985 | 0.944 |
| house | 0.710 | 0.714 | 1.005 | 0.483 | 0.722 | 1.017 | 0.276 | 0.710 | 1.000 | 1.000 |
| mv | 0.055 | 0.055 | 1.005 | 0.492 | 0.056 | 1.009 | 0.273 | 0.055 | 0.991 | 1.000 |
| average | 0.393 | 0.391 | 0.992 | 0.497 | 0.418 | 1.068 | 0.227 | 0.388 | 0.974 | 0.880 |
Experimental results for MEISR with 50%H, IEA: k-NN (RIS), FPA: k-NN (RIS).
| Dataset | r0 | r1 | r1/r0 | c1 | r2 | r2/r0 | c2 |
|---|---|---|---|---|---|---|---|
| mach. CPU | 0.450 | 0.462 | 1.027 | 0.402 | 0.485 | 1.079 | 0.230 |
| baseball | 0.648 | 0.661 | 1.020 | 0.650 | 0.820 | 1.265 | 0.114 |
| dee | 0.472 | 0.471 | 0.998 | 0.429 | 0.577 | 1.222 | 0.148 |
| autoMPG8 | 0.422 | 0.460 | 1.090 | 0.525 | 0.546 | 1.295 | 0.166 |
| autoMPG6 | 0.411 | 0.442 | 1.076 | 0.402 | 0.511 | 1.243 | 0.176 |
| ele-1 | 0.584 | 0.569 | 0.974 | 0.533 | 0.596 | 1.020 | 0.184 |
| forestFires | 0.901 | 0.748 | 0.830 | 0.490 | 0.791 | 0.878 | 0.151 |
| stock | 0.129 | 0.176 | 1.361 | 0.753 | 0.225 | 1.746 | 0.364 |
| steel | 0.364 | 0.410 | 1.127 | 0.657 | 0.502 | 1.381 | 0.193 |
| laser | 0.274 | 0.330 | 1.205 | 0.484 | 0.365 | 1.334 | 0.152 |
| concrete | 0.576 | 0.659 | 1.145 | 0.622 | 0.784 | 1.361 | 0.137 |
| treasury | 0.079 | 0.120 | 1.523 | 0.541 | 0.165 | 2.093 | 0.180 |
| mortgage | 0.065 | 0.099 | 1.524 | 0.749 | 0.152 | 2.332 | 0.306 |
| friedman | 0.375 | 0.429 | 1.144 | 0.700 | 0.535 | 1.425 | 0.192 |
| wizmir | 0.199 | 0.215 | 1.081 | 0.481 | 0.274 | 1.377 | 0.140 |
| wankara | 0.185 | 0.206 | 1.115 | 0.673 | 0.297 | 1.604 | 0.139 |
| plastic | 0.463 | 0.452 | 0.975 | 0.551 | 0.512 | 1.107 | 0.223 |
| quake | 1.024 | 1.009 | 0.985 | 0.401 | 1.036 | 1.011 | 0.123 |
| anacalt | 0.242 | 0.399 | 1.652 | 0.504 | 0.478 | 1.978 | 0.233 |
| abalone | 0.713 | 0.738 | 1.035 | 0.444 | 0.780 | 1.094 | 0.166 |
| delta-ail | 0.582 | 0.585 | 1.004 | 0.475 | 0.630 | 1.081 | 0.180 |
| puma32h | 0.910 | 0.913 | 1.003 | 0.634 | 0.960 | 1.055 | 0.151 |
| compactiv | 0.281 | 0.303 | 1.079 | 0.447 | 0.387 | 1.379 | 0.105 |
| delta-elv | 0.626 | 0.639 | 1.020 | 0.556 | 0.670 | 1.070 | 0.150 |
| tic | 1.011 | 0.961 | 0.951 | 0.524 | 1.002 | 0.991 | 0.125 |
| ailerons | 0.522 | 0.528 | 1.011 | 0.437 | 0.581 | 1.112 | 0.153 |
| pole | 0.244 | 0.307 | 1.258 | 0.615 | 0.325 | 1.332 | 0.135 |
| elevators | 0.586 | 0.578 | 0.985 | 0.723 | 0.660 | 1.126 | 0.099 |
| california | 0.548 | 0.553 | 1.009 | 0.577 | 0.568 | 1.037 | 0.124 |
| house | 0.712 | 0.746 | 1.048 | 0.699 | 0.781 | 1.097 | 0.304 |
| mv | 0.160 | 0.205 | 1.279 | 0.578 | 0.221 | 1.380 | 0.494 |
| average | 0.476 | 0.496 | 1.116 | 0.557 | 0.555 | 1.307 | 0.185 |
Experimental results for MEISR with 10-CV, IEA: k-NN (RIS), FPA: k-NN (RIS).
| Dataset | r0 | r1 | r1/r0 | c1 | r2 | r2/r0 | c2 |
|---|---|---|---|---|---|---|---|
| mach. CPU | 0.392 | 0.407 | 1.038 | 0.385 | 0.426 | 1.086 | 0.216 |
| baseball | 0.606 | 0.623 | 1.029 | 0.591 | 0.753 | 1.244 | 0.110 |
| dee | 0.424 | 0.430 | 1.015 | 0.403 | 0.501 | 1.182 | 0.137 |
| autoMPG8 | 0.364 | 0.403 | 1.108 | 0.499 | 0.480 | 1.320 | 0.160 |
| autoMPG6 | 0.366 | 0.396 | 1.082 | 0.401 | 0.457 | 1.248 | 0.156 |
| ele-1 | 0.584 | 0.562 | 0.963 | 0.489 | 0.575 | 0.985 | 0.178 |
| forestFires | 0.864 | 0.729 | 0.844 | 0.456 | 0.732 | 0.847 | 0.126 |
| stock | 0.105 | 0.140 | 1.338 | 0.689 | 0.181 | 1.729 | 0.371 |
| steel | 0.323 | 0.359 | 1.110 | 0.665 | 0.431 | 1.333 | 0.165 |
| laser | 0.204 | 0.243 | 1.193 | 0.489 | 0.271 | 1.330 | 0.128 |
| concrete | 0.521 | 0.593 | 1.138 | 0.593 | 0.678 | 1.303 | 0.129 |
| treasury | 0.058 | 0.088 | 1.503 | 0.531 | 0.118 | 2.015 | 0.159 |
| mortgage | 0.049 | 0.076 | 1.554 | 0.682 | 0.109 | 2.246 | 0.263 |
| friedman | 0.340 | 0.385 | 1.132 | 0.668 | 0.489 | 1.437 | 0.176 |
| wizmir | 0.178 | 0.192 | 1.081 | 0.491 | 0.238 | 1.339 | 0.117 |
| wankara | 0.167 | 0.183 | 1.096 | 0.643 | 0.273 | 1.635 | 0.127 |
| plastic | 0.468 | 0.451 | 0.965 | 0.504 | 0.520 | 1.112 | 0.184 |
| quake | 1.025 | 1.000 | 0.976 | 0.374 | 1.003 | 0.979 | 0.117 |
| anacalt | 0.212 | 0.352 | 1.661 | 0.510 | 0.404 | 1.904 | 0.226 |
| abalone | 0.690 | 0.709 | 1.028 | 0.435 | 0.730 | 1.059 | 0.136 |
| delta-ail | 0.560 | 0.566 | 1.011 | 0.477 | 0.580 | 1.035 | 0.170 |
| puma32h | 0.896 | 0.902 | 1.008 | 0.619 | 0.927 | 1.035 | 0.135 |
| compactiv | 0.240 | 0.259 | 1.077 | 0.441 | 0.333 | 1.388 | 0.097 |
| delta-elv | 0.610 | 0.610 | 1.000 | 0.546 | 0.627 | 1.027 | 0.123 |
| tic | 1.015 | 0.985 | 0.970 | 0.496 | 0.992 | 0.977 | 0.121 |
| ailerons | 0.504 | 0.511 | 1.015 | 0.421 | 0.538 | 1.069 | 0.128 |
| pole | 0.214 | 0.267 | 1.249 | 0.583 | 0.282 | 1.319 | 0.108 |
| elevators | 0.559 | 0.553 | 0.990 | 0.662 | 0.620 | 1.110 | 0.081 |
| california | 0.527 | 0.531 | 1.008 | 0.532 | 0.555 | 1.053 | 0.112 |
| house | 0.687 | 0.718 | 1.046 | 0.671 | 0.743 | 1.083 | 0.236 |
| mv | 0.140 | 0.179 | 1.280 | 0.540 | 0.186 | 1.328 | 0.420 |
| average | 0.448 | 0.465 | 1.113 | 0.532 | 0.508 | 1.282 | 0.165 |
Relative RMSE for retention c1 = 0.50 and c2 = 0.25 (compression = 50% and 75%) in 50% holdout (average over 10 runs) for Threshold-Ensemble-CNN (TE-C), Threshold-Ensemble-ENN (TE-E), Discretization-Ensemble-CNN (DE-C), Discretization-Ensemble-ENN (DE-E) and MEISR. Inner evaluation algorithm: k-NN with optimal k and binary instance selection. Final prediction algorithm: k-NN with optimal k.
| Dataset | TE-C (c1 = 0.50) | TE-E (c1 = 0.50) | DE-C (c1 = 0.50) | DE-E (c1 = 0.50) | MEISR (c1 = 0.50) | TE-C (c2 = 0.25) | TE-E (c2 = 0.25) | DE-C (c2 = 0.25) | DE-E (c2 = 0.25) | MEISR (c2 = 0.25) |
|---|---|---|---|---|---|---|---|---|---|---|
| mach. CPU | 1.270 | 1.065 | 1.362 | 1.022 | 1.149 | 1.537 | 1.296 | 1.755 | 1.393 | 1.202 |
| baseball | 1.118 | 1.087 | 1.155 | 1.266 | 1.035 | 1.346 | 1.310 | 1.734 | 1.598 | 1.183 |
| dee | 1.151 | 1.096 | 1.077 | 1.066 | 0.962 | 1.348 | 1.265 | 1.217 | 1.374 | 1.203 |
| autoMPG8 | 1.255 | 1.086 | 1.185 | 1.121 | 0.976 | 1.538 | 1.308 | 1.280 | 1.248 | 1.198 |
| autoMPG6 | 1.170 | 1.065 | 1.102 | 1.092 | 0.968 | 1.377 | 1.309 | 1.231 | 1.210 | 1.112 |
| ele-1 | 1.047 | 1.058 | 1.028 | 1.066 | 0.962 | 1.080 | 1.179 | 1.075 | 1.160 | 1.014 |
| stock | 1.584 | 1.323 | 1.445 | 1.645 | 1.147 | 1.875 | 2.497 | 1.789 | 2.249 | 1.433 |
| laser | 1.466 | 1.142 | 1.543 | 1.132 | 1.038 | 1.520 | 1.606 | 1.797 | 1.282 | 1.184 |
| concrete | 1.279 | 1.156 | 1.225 | 1.136 | 1.096 | 1.382 | 1.278 | 1.338 | 1.365 | 1.193 |
| treasury | 1.544 | 1.589 | 1.694 | 1.351 | 1.093 | 2.072 | 3.062 | 2.360 | 3.260 | 1.990 |
| mortgage | 1.426 | 1.568 | 1.799 | 1.305 | 1.281 | 2.049 | 4.032 | 2.476 | 3.241 | 2.101 |
| friedman | 1.213 | 1.199 | 1.147 | 1.173 | 1.092 | 1.376 | 1.420 | 1.316 | 1.415 | 1.305 |
| wizmir | 1.289 | 1.224 | 1.227 | 1.146 | 1.081 | 1.365 | 1.389 | 1.356 | 1.371 | 1.131 |
| wankara | 1.214 | 1.220 | 1.145 | 1.151 | 1.076 | 1.422 | 1.588 | 1.401 | 1.489 | 1.211 |
| plastic | 1.025 | 1.069 | 1.033 | 1.523 | 0.994 | 1.033 | 1.155 | 1.285 | 1.957 | 0.989 |
| quake | 1.238 | 1.023 | 1.126 | 1.064 | 0.992 | 1.456 | 1.055 | 1.187 | 1.133 | 1.005 |
| abalone | 1.045 | 1.012 | 1.137 | 1.074 | 1.023 | 1.093 | 1.039 | 1.210 | 1.148 | 1.011 |
| compactiv | 1.129 | 1.274 | 1.277 | 1.366 | 1.125 | 1.302 | 2.862 | 1.328 | 1.428 | 1.200 |
| tic | 1.236 | 1.013 | 1.069 | 1.003 | 0.992 | 1.429 | 1.046 | 1.286 | 0.996 | 0.996 |
| ailerons | 1.126 | 1.033 | 1.198 | 1.048 | 1.028 | 1.238 | 1.126 | 1.236 | 1.172 | 1.088 |
| pole | 1.351 | 1.066 | 1.420 | 1.049 | 1.117 | 1.501 | 1.747 | 1.598 | 1.796 | 1.234 |
| elevators | 1.126 | 1.059 | 1.236 | 1.052 | 1.041 | 1.202 | 1.190 | 1.291 | 1.157 | 1.097 |
| california | 1.103 | 1.115 | 1.077 | 1.080 | 1.037 | 1.259 | 1.216 | 1.197 | 1.146 | 1.234 |
| house | 1.073 | 1.093 | 1.253 | 1.062 | 1.036 | 1.222 | 1.146 | 1.304 | 1.258 | 1.073 |
| average | 1.228 | 1.152 | 1.248 | 1.166 | 1.056 | 1.418 | 1.588 | 1.460 | 1.535 | 1.224 |
| times best | 2 | 0 | 0 | 1 | 21 | 2 | 0 | 0 | 0 | 22 |
| | 0.0000 | 0.005 | 0.0000 | 0.0022 | | 0.0085 | 0.0158 | 0.0070 | 0.0128 | |
| Wilcoxon | 0.0000 | 0.0000 | 0.0000 | 0.0000 | | 0.0001 | 0.0000 | 0.0000 | 0.0000 | |
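The "times best" row counts, for each method, the datasets on which it reaches the lowest relative error. The sketch below shows one way to compute such a count, assuming that lower is better and that a tie credits every tied method; the tie rule used in the source is not stated, so this is an assumption. The function name `times_best` is hypothetical, and the three rows are the c1 = 0.50 values for mach. CPU, baseball and dee from the table above.

```python
def times_best(methods, rows):
    """Count how often each method attains the row minimum (lower is better).

    Assumption: every method tied for the minimum gets credit.
    """
    wins = {name: 0 for name in methods}
    for row in rows:
        best = min(row)
        for name, value in zip(methods, row):
            if value == best:
                wins[name] += 1
    return wins

methods = ["TE-C", "TE-E", "DE-C", "DE-E", "MEISR"]
rows = [
    [1.270, 1.065, 1.362, 1.022, 1.149],  # mach. CPU
    [1.118, 1.087, 1.155, 1.266, 1.035],  # baseball
    [1.151, 1.096, 1.077, 1.066, 0.962],  # dee
]
print(times_best(methods, rows))
```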
Relative RMSE for retention c1 = 0.50 and c2 = 0.25 in 10-fold cross-validation with k-NN with optimal k. All symbols are explained in the caption of Table A11.
| Dataset | TE-C (c1 = 0.50) | TE-E (c1 = 0.50) | DE-C (c1 = 0.50) | DE-E (c1 = 0.50) | MEISR (c1 = 0.50) | TE-C (c2 = 0.25) | TE-E (c2 = 0.25) | DE-C (c2 = 0.25) | DE-E (c2 = 0.25) | MEISR (c2 = 0.25) |
|---|---|---|---|---|---|---|---|---|---|---|
| mach. CPU | 1.267 | 1.048 | 1.360 | 1.013 | 1.052 | 1.373 | 1.227 | 1.547 | 1.253 | 1.148 |
| baseball | 1.081 | 1.062 | 1.171 | 1.235 | 0.997 | 1.183 | 1.149 | 1.629 | 1.406 | 1.094 |
| dee | 1.129 | 1.105 | 1.058 | 1.058 | 1.007 | 1.199 | 1.199 | 1.110 | 1.293 | 1.031 |
| autoMPG8 | 1.226 | 1.098 | 1.172 | 1.120 | 1.031 | 1.362 | 1.226 | 1.199 | 1.199 | 1.138 |
| autoMPG6 | 1.127 | 1.082 | 1.071 | 1.088 | 1.043 | 1.280 | 1.197 | 1.141 | 1.197 | 1.074 |
| ele-1 | 1.019 | 1.029 | 1.041 | 1.039 | 0.954 | 1.056 | 1.100 | 1.054 | 1.097 | 1.048 |
| stock | 1.549 | 1.345 | 1.442 | 1.646 | 1.180 | 1.849 | 2.401 | 1.771 | 2.236 | 1.551 |
| laser | 1.409 | 1.155 | 1.570 | 1.118 | 1.031 | 1.501 | 1.501 | 1.723 | 1.293 | 1.232 |
| concrete | 1.240 | 1.181 | 1.192 | 1.111 | 1.054 | 1.346 | 1.308 | 1.346 | 1.308 | 1.224 |
| treasury | 1.491 | 1.563 | 1.763 | 1.327 | 1.302 | 1.982 | 3.091 | 2.364 | 3.273 | 1.372 |
| mortgage | 1.374 | 1.555 | 1.858 | 1.273 | 1.304 | 2.020 | 4.081 | 2.404 | 3.111 | 2.854 |
| friedman | 1.199 | 1.214 | 1.191 | 1.156 | 1.073 | 1.313 | 1.410 | 1.299 | 1.343 | 1.234 |
| wizmir | 1.243 | 1.200 | 1.217 | 1.122 | 1.060 | 1.362 | 1.362 | 1.362 | 1.362 | 1.264 |
| wankara | 1.212 | 1.212 | 1.164 | 1.146 | 1.098 | 1.373 | 1.492 | 1.373 | 1.403 | 1.254 |
| plastic | 0.991 | 1.071 | 1.006 | 1.519 | 0.969 | 1.022 | 1.138 | 1.253 | 1.909 | 0.989 |
| quake | 1.235 | 1.012 | 1.102 | 1.056 | 0.990 | 1.455 | 1.015 | 1.144 | 1.102 | 0.998 |
| abalone | 1.026 | 1.026 | 1.121 | 1.079 | 1.031 | 1.036 | 1.055 | 1.166 | 1.103 | 1.058 |
| compactiv | 1.107 | 1.293 | 1.281 | 1.341 | 1.209 | 1.245 | 2.810 | 1.331 | 1.348 | 1.241 |
| tic | 1.221 | 1.001 | 1.113 | 1.000 | 0.986 | 1.388 | 1.002 | 1.262 | 1.001 | 0.993 |
| ailerons | 1.096 | 1.042 | 1.158 | 1.054 | 1.031 | 1.170 | 1.102 | 1.251 | 1.138 | 1.096 |
| pole | 1.339 | 1.082 | 1.478 | 1.022 | 1.125 | 1.483 | 1.684 | 1.604 | 1.698 | 1.261 |
| elevators | 1.092 | 1.050 | 1.223 | 1.040 | 1.039 | 1.129 | 1.109 | 1.310 | 1.129 | 1.103 |
| california | 1.082 | 1.094 | 1.117 | 1.077 | 1.037 | 1.204 | 1.147 | 1.177 | 1.154 | 1.116 |
| house | 1.075 | 1.081 | 1.208 | 1.034 | 1.029 | 1.159 | 1.085 | 1.250 | 1.250 | 1.078 |
| average | 1.201 | 1.148 | 1.253 | 1.153 | 1.068 | 1.354 | 1.537 | 1.420 | 1.484 | 1.227 |
| times best | 1 | 1 | 0 | 3 | 19 | 1 | 0 | 0 | 1 | 22 |
| | 0.0000 | 0.0000 | 0.0000 | 0.0000 | | 0.0235 | 0.0059 | 0.0003 | 0.0167 | |
| Wilcoxon | 0.0001 | 0.0001 | 0.0000 | 0.0008 | | 0.0004 | 0.0000 | 0.0000 | 0.0004 | |
Relative in 50% holdout for DROP3-RT and MEISR for retention and with 8-NN. Dr2: relative with DROP3-RT. Mr2: relative with MEISR, Dc1: retention rate with DROP3-RT, Mc1: retention rate with MEISR, is the correlation between the predicted and actual output.
| Dataset | Dr2 | Mr2 | Mr2/Dr2 | Dc1 | Mc1 | Mc1/Dc1 |
|---|---|---|---|---|---|---|
| mach. CPU | 1.461 | 1.242 | 0.850 | 0.505 | 0.396 | 0.784 |
| baseball | 1.205 | 1.182 | 0.980 | 0.472 | 0.341 | 0.723 |
| dee | 1.120 | 1.074 | 0.959 | 0.523 | 0.315 | 0.602 |
| autoMPG8 | 1.211 | 1.116 | 0.921 | 0.539 | 0.387 | 0.718 |
| autoMPG6 | 1.109 | 1.171 | 1.056 | 0.536 | 0.345 | 0.644 |
| ele-1 | 1.082 | 0.847 | 0.783 | 0.530 | 0.314 | 0.593 |
| stock | 1.420 | 1.503 | 1.058 | 0.583 | 0.432 | 0.742 |
| laser | 1.381 | 1.017 | 0.736 | 0.659 | 0.443 | 0.673 |
| concrete | 1.292 | 1.224 | 0.948 | 0.507 | 0.410 | 0.808 |
| treasury | 1.419 | 1.470 | 1.036 | 0.636 | 0.421 | 0.662 |
| friedman | 1.195 | 1.153 | 0.964 | 0.574 | 0.434 | 0.756 |
| wizmir | 1.260 | 1.101 | 0.873 | 0.569 | 0.324 | 0.569 |
| wankara | 1.135 | 1.188 | 1.047 | 0.571 | 0.382 | 0.669 |
| plastic | 1.050 | 0.797 | 0.759 | 0.391 | 0.378 | 0.967 |
| quake | 1.078 | 0.962 | 0.892 | 0.438 | 0.403 | 0.920 |
| abalone | 1.031 | 1.081 | 1.049 | 0.455 | 0.432 | 0.949 |
| delta-ail | 1.040 | 1.037 | 0.997 | 0.445 | 0.425 | 0.955 |
| puma32h | 1.642 | 1.537 | 0.935 | 0.418 | 0.402 | 0.963 |
| compactiv | 1.023 | 1.094 | 1.070 | 0.485 | 0.429 | 0.885 |
| delta-elv | 1.051 | 1.043 | 0.992 | 0.469 | 0.438 | 0.933 |
| ailerons | 1.447 | 1.049 | 0.725 | 0.457 | 0.417 | 0.912 |
| pole | 1.151 | 1.139 | 0.990 | 0.335 | 0.409 | 1.222 |
| elevators | 1.066 | 1.150 | 1.079 | 0.470 | 0.416 | 0.885 |
| california | 1.089 | 1.072 | 0.984 | 0.502 | 0.474 | 0.944 |
| house | 1.136 | 0.974 | 0.858 | 0.457 | 0.428 | 0.938 |
| mv | 1.203 | 1.300 | 1.080 | 0.609 | 0.530 | 0.870 |
| average | 1.204 | 1.135 | 0.935 | 0.505 | 0.405 | 0.819 |
| times best | 8 | 18 | | 1 | 25 | |
| 0.0773 | 0.0000 | |||||
| Wilcoxon | 0.0566 | 0.0000 |
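The derived columns of the table above can be recomputed from the four measured columns. The following is a minimal sketch, assuming that lower values are better for both the relative error and the retention rate (which is consistent with the "times best" counts); the function name `summarize` and the row layout are illustrative and not part of the authors' software. Because the published ratios were presumably computed from unrounded values, recomputing them from the rounded table entries can differ in the last digit.

```python
# A few rows copied from the table above: (dataset, Dr2, Mr2, Dc1, Mc1).
rows = [
    ("mach. CPU", 1.461, 1.242, 0.505, 0.396),
    ("dee",       1.120, 1.074, 0.523, 0.315),
    ("ele-1",     1.082, 0.847, 0.530, 0.314),
    ("pole",      1.151, 1.139, 0.335, 0.409),
]


def summarize(rows):
    """Recompute the ratio columns and the 'times best' counts.

    Assumption: a lower value wins in both comparisons (error and retention).
    """
    ratios = []
    wins = {"Dr2": 0, "Mr2": 0, "Dc1": 0, "Mc1": 0}
    for name, dr2, mr2, dc1, mc1 in rows:
        ratios.append((name, mr2 / dr2, mc1 / dc1))
        # Strict comparisons: a tie gives no point to either method here.
        if mr2 < dr2:
            wins["Mr2"] += 1
        elif dr2 < mr2:
            wins["Dr2"] += 1
        if mc1 < dc1:
            wins["Mc1"] += 1
        elif dc1 < mc1:
            wins["Dc1"] += 1
    return {"ratios": ratios, "wins": wins}


result = summarize(rows)
for name, r_ratio, c_ratio in result["ratios"]:
    print(f"{name:10s} Mr2/Dr2 = {r_ratio:.3f}  Mc1/Dc1 = {c_ratio:.3f}")
print(result["wins"])
```

A ratio below 1 in either column means that MEISR obtained the lower value for that dataset.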
Relative in 10-fold cross-validation for DROP3-RT and MEISR for retention and with 8-NN. All symbols are explained in the caption of Table A13.
| Dataset | Dr2 | Mr2 | Mr2/Dr2 | Dc1 | Mc1 | Mc1/Dc1 |
|---|---|---|---|---|---|---|
| mach. CPU | 1.541 | 1.263 | 0.819 | 0.495 | 0.374 | 0.756 |
| baseball | 1.160 | 1.134 | 0.978 | 0.460 | 0.395 | 0.859 |
| dee | 1.166 | 0.991 | 0.850 | 0.511 | 0.352 | 0.689 |
| autoMPG8 | 1.210 | 1.118 | 0.924 | 0.491 | 0.421 | 0.857 |
| autoMPG6 | 1.168 | 1.137 | 0.974 | 0.511 | 0.373 | 0.729 |
| ele-1 | 1.023 | 0.892 | 0.873 | 0.486 | 0.353 | 0.725 |
| stock | 1.685 | 1.421 | 0.843 | 0.572 | 0.483 | 0.843 |
| laser | 1.436 | 1.079 | 0.751 | 0.605 | 0.455 | 0.752 |
| concrete | 1.351 | 1.235 | 0.914 | 0.502 | 0.465 | 0.926 |
| treasury | 1.346 | 1.500 | 1.114 | 0.620 | 0.439 | 0.707 |
| friedman | 1.076 | 1.186 | 1.102 | 0.538 | 0.437 | 0.812 |
| wizmir | 1.329 | 1.140 | 0.858 | 0.510 | 0.410 | 0.803 |
| wankara | 1.355 | 1.103 | 0.814 | 0.521 | 0.444 | 0.853 |
| plastic | 0.964 | 0.833 | 0.864 | 0.419 | 0.390 | 0.930 |
| quake | 1.058 | 1.044 | 0.986 | 0.420 | 0.387 | 0.922 |
| abalone | 1.053 | 1.034 | 0.982 | 0.416 | 0.389 | 0.935 |
| delta-ail | 1.039 | 1.001 | 0.963 | 0.433 | 0.364 | 0.841 |
| puma32h | 1.032 | 1.062 | 1.028 | 0.381 | 0.375 | 0.986 |
| compactiv | 1.557 | 1.066 | 0.685 | 0.449 | 0.391 | 0.870 |
| delta-elv | 1.016 | 1.011 | 0.995 | 0.432 | 0.376 | 0.870 |
| ailerons | 1.039 | 1.134 | 1.092 | 0.425 | 0.264 | 0.620 |
| pole | 1.425 | 1.696 | 1.190 | 0.245 | 0.244 | 0.996 |
| elevators | 1.137 | 1.135 | 0.998 | 0.438 | 0.358 | 0.816 |
| california | 1.048 | 1.105 | 1.055 | 0.475 | 0.402 | 0.845 |
| house | 1.084 | 1.041 | 0.969 | 0.430 | 0.318 | 0.741 |
| mv | 1.061 | 1.297 | 1.222 | 0.531 | 0.392 | 0.739 |
| average | 1.228 | 1.161 | 0.953 | 0.481 | 0.391 | 0.822 |
| times best | 8 | 19 | | 0 | 26 | |
| 0.1112 | 0.0001 | |||||
| Wilcoxon | 0.0588 | 0.0001 |
Figure 4. Comparison of the instance selection approaches.
Statistical significance test for experiments presented in Table A3, Table A4, Table A5, Table A6, Table A7, Table A8, Table A9 and Table A10.
| Algorithm | Relation | | Wilcoxon | Relation | | Wilcoxon |
|---|---|---|---|---|---|---|
| MEISR, IEA: 1-NN (BIS), FPA: 1-NN (BIS), 50%H | r1/r0 | 0.0148 | 0.0026 | r2/r0 | 0.4708 | 0.0018 |
| MEISR, IEA: 1-NN (BIS), FPA: 1-NN (BIS), 10-CV | r1/r0 | 0.0148 | 0.0026 | r2/r0 | 0.4708 | 0.0018 |
| MEISR, IEA: | r1/r0 | 0.0259 | 0.0182 | r2/r0 | 0.0001 | 0.0001 |
| MEISR, IEA: | r1/r0 | 0.2309 | 0.0058 | r2/r0 | 0.0001 | 0.0001 |
| MEISR, IEA: | r1/r0 | 0.5664 | 0.4839 | r2/r0 | 0.0001 | 0.0001 |
| MEISR, IEA: | r1/r0 | 0.4782 | 0.4593 | r2/r0 | 0.0003 | 0.0001 |
| MEISR, IEA: | r1/r0 | 0.0305 | 0.0022 | r2/r0 | 0.0001 | 0.0001 |
| MEISR, IEA: | r1/r0 | 0.0393 | 0.0032 | r2/r0 | 0.0001 | 0.0001 |
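For readers who want to run a paired test of this kind on the per-dataset results, a Wilcoxon signed-rank test can be sketched as follows. This is a plain-Python illustration using the normal approximation, without tie or continuity correction; it is an assumption that it does not match the exact variant or software used for the tables, so the p-values it produces should not be expected to reproduce the reported ones. The function name `wilcoxon_signed_rank` is hypothetical.

```python
import math


def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed-rank test (normal approximation).

    Returns (W, p), where W is the smaller of the positive and negative
    rank sums. Zero differences are discarded; tied absolute differences
    receive average ranks. No tie or continuity correction is applied.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, averaging ranks within ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for t in range(i, j + 1):
            ranks[order[t]] = avg_rank
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)

    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * Phi(-|z|)
    return w, p


# Illustrative input only: the first eight Dr2/Mr2 pairs of Table A13.
dr2 = [1.461, 1.205, 1.120, 1.211, 1.109, 1.082, 1.420, 1.381]
mr2 = [1.242, 1.182, 1.074, 1.116, 1.171, 0.847, 1.503, 1.017]
w, p = wilcoxon_signed_rank(dr2, mr2)
print(f"W = {w}, p = {p:.4f}")
```

For small samples an exact test is usually preferable to the normal approximation; the approximation is used here only to keep the sketch short.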
Figure 5. Left and middle: dependency between loss function (1-NN) and relative () for retention . MEISR with 1-NN inner evaluator and 1-NN final regressor for retention (left), MEISR with k-NN inner evaluator and k-NN final regressor (center) and k = optimal k. Larger and darker circles stand for datasets with a higher number of instances. Right: dependency between optimal k and . Darker and bigger points represent larger datasets.
Calculation time of the MEISR algorithm for BIS using our software (available from the online supplementary resources) on a dual Xeon X5-2696v2 machine. inst: number of instances; at: number of attributes; 1-NNt and kNNt: total time of the process, where the inner regressor was respectively 1-NN and k-NN with optimal k; 1d% and kd%: percentage of total time used for calculating the distance matrix with respectively 1-NN and k-NN with optimal k as inner regressors; 1-NNti and kNNti: total process time per one instance.
| Dataset | inst | at | k | 1-NNt [s] | kNNt [s] | dist [s] | 1d% [%] | kd% [%] | 1-NNti [ms] | kNNti [ms] |
|---|---|---|---|---|---|---|---|---|---|---|
| mach. CPU | 188 | 6 | 1 | 0.119 | 0.119 | 0.011 | 9.3 | 9.3 | 0.633 | 0.633 |
| baseball | 303 | 16 | 7 | 0.179 | 0.215 | 0.018 | 9.8 | 8.1 | 0.590 | 0.709 |
| dee | 329 | 6 | 7 | 0.181 | 0.219 | 0.014 | 7.8 | 6.5 | 0.551 | 0.667 |
| autoMPG8 | 353 | 7 | 6 | 0.185 | 0.206 | 0.016 | 8.6 | 7.7 | 0.524 | 0.584 |
| autoMPG6 | 353 | 5 | 4 | 0.186 | 0.202 | 0.015 | 8.0 | 7.3 | 0.527 | 0.573 |
| ele-1 | 446 | 2 | 11 | 0.243 | 0.319 | 0.017 | 6.8 | 5.2 | 0.545 | 0.716 |
| forestFire | 465 | 12 | 11 | 0.293 | 0.343 | 0.024 | 8.1 | 6.9 | 0.630 | 0.737 |
| stock | 855 | 9 | 3 | 0.474 | 0.491 | 0.043 | 9.1 | 8.8 | 0.554 | 0.574 |
| steel | 864 | 12 | 4 | 0.462 | 0.481 | 0.046 | 9.8 | 9.5 | 0.535 | 0.557 |
| laser | 894 | 4 | 3 | 0.481 | 0.483 | 0.036 | 7.6 | 7.5 | 0.538 | 0.540 |
| concrete | 927 | 8 | 4 | 0.494 | 0.536 | 0.047 | 9.5 | 8.8 | 0.533 | 0.578 |
| treasury | 944 | 15 | 3 | 0.491 | 0.506 | 0.066 | 13.5 | 13.1 | 0.520 | 0.536 |
| mortgage | 944 | 15 | 2 | 0.508 | 0.513 | 0.058 | 11.4 | 11.3 | 0.538 | 0.543 |
| friedman | 1080 | 5 | 7 | 0.548 | 0.612 | 0.048 | 8.7 | 7.8 | 0.507 | 0.567 |
| wizmir | 1315 | 9 | 7 | 0.647 | 0.731 | 0.077 | 11.8 | 10.5 | 0.492 | 0.556 |
| wankara | 1448 | 9 | 9 | 0.671 | 0.802 | 0.080 | 11.9 | 10.0 | 0.463 | 0.554 |
| plastic | 1485 | 2 | 11 | 0.659 | 0.905 | 0.065 | 9.9 | 7.2 | 0.444 | 0.609 |
| quake | 1960 | 3 | 11 | 0.851 | 1.070 | 0.123 | 14.5 | 11.5 | 0.434 | 0.546 |
| anacalt | 3647 | 7 | 2 | 1.58 | 1.61 | 0.309 | 19.6 | 19.2 | 0.433 | 0.441 |
| abalone | 3759 | 8 | 11 | 1.68 | 2.25 | 0.341 | 20.3 | 15.2 | 0.447 | 0.599 |
| delta-ail | 6415 | 5 | 11 | 2.93 | 3.31 | 0.749 | 25.6 | 22.6 | 0.457 | 0.516 |
| puma32h | 7372 | 32 | 11 | 4.09 | 5.06 | 1.67 | 40.8 | 33.0 | 0.555 | 0.686 |
| compactiv | 7373 | 21 | 2 | 3.91 | 3.93 | 1.45 | 37.1 | 36.9 | 0.530 | 0.533 |
| delta-elv | 8564 | 6 | 11 | 4.11 | 5.19 | 2.06 | 50.1 | 39.7 | 0.480 | 0.606 |
| tic | 8840 | 85 | 11 | 6.41 | 7.81 | 3.93 | 61.3 | 50.3 | 0.725 | 0.884 |
| ailerons | 12375 | 40 | 10 | 8.61 | 10.1 | 4.88 | 56.7 | 48.3 | 0.696 | 0.816 |
| pole | 13498 | 26 | 4 | 8.52 | 10.3 | 4.97 | 58.3 | 48.3 | 0.631 | 0.763 |
| elevators | 14938 | 18 | 8 | 10.4 | 12.7 | 5.13 | 49.3 | 44.7 | 0.696 | 0.850 |
| california | 18576 | 8 | 9 | 16.0 | 18.2 | 9.27 | 57.9 | 50.9 | 0.861 | 0.980 |
| house | 20506 | 16 | 11 | 18.6 | 21.1 | 11.3 | 60.8 | 53.6 | 0.907 | 1.029 |
| part of mv | 27178 | 10 | 9 | 28.1 | 30.3 | 17.7 | 63.0 | 58.4 | 1.034 | 1.115 |
| mv | 36690 | 10 | 9 | 47.8 | 50.0 | 32.1 | 67.1 | 64.2 | 1.303 | 1.363 |
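The timing table separates the cost of the distance matrix from the rest of the process, which suggests that the matrix is computed once and then reused for every candidate subset. The sketch below illustrates that idea under stated assumptions: the function names, the leave-one-out style evaluation on the training set and the fallback to the mean output are all hypothetical choices for illustration and are not taken from the authors' software.

```python
import math


def distance_matrix(X):
    """Pairwise Euclidean distances, computed once for the whole training set."""
    n = len(X)
    D = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            D[i][j] = D[j][i] = math.dist(X[i], X[j])
    return D


def neighbor_order(D):
    """For each instance, the indices of all other instances sorted by distance."""
    n = len(D)
    return [
        sorted((j for j in range(n) if j != i), key=lambda j, i=i: D[i][j])
        for i in range(n)
    ]


def subset_rmse(order, y, selected, k):
    """RMSE of k-NN predictions over all training instances.

    Only instances flagged in `selected` may serve as neighbors, and an
    instance is never its own neighbor. Assumption: if no neighbor is
    available, the mean of all outputs is used as the prediction.
    """
    fallback = sum(y) / len(y)
    total = 0.0
    for i, nbrs in enumerate(order):
        chosen = [j for j in nbrs if selected[j]][:k]
        pred = sum(y[j] for j in chosen) / len(chosen) if chosen else fallback
        total += (pred - y[i]) ** 2
    return math.sqrt(total / len(y))


# Toy data: four points on a line with outputs equal to their coordinate.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0.0, 1.0, 2.0, 3.0]
order = neighbor_order(distance_matrix(X))  # the expensive step, done once

full = subset_rmse(order, y, [True, True, True, True], k=1)
half = subset_rmse(order, y, [True, False, False, True], k=1)
print(f"retention 1.00: RMSE = {full:.3f}")
print(f"retention 0.50: RMSE = {half:.3f}, relative = {half / full:.3f}")
```

With the sorted neighbor lists cached, evaluating a new subset needs no further distance calculations, which is why the share of time spent on the distance matrix can be reported separately from the total.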
Figure 6. Left: MEISR running time as a function of the number of instances in the original training dataset. Right: Percentage of MEISR running time used to calculate the distance matrix used by k-NN. Light circles denote 1-NN as the inner evaluator and dark circles k-NN with optimal k.