| Literature DB >> 33265773 |
Fernando Jiménez, Carlos Martínez, Luis Miralles-Pechuán, Gracia Sánchez, Guido Sciavicco.
Abstract
The ease of interpretation of a classification model is essential for the task of validating it. Sometimes it is required to clearly explain the classification process of a model's predictions. Models which are inherently easier to interpret can be effortlessly related to the context of the problem, and their predictions can, if necessary, be ethically and legally evaluated. In this paper, we propose a novel method to generate rule-based classifiers from categorical data that can be readily interpreted. Classifiers are generated using a multi-objective optimization approach with two main objectives: maximizing the performance of the learned classifier and minimizing its number of rules. The multi-objective evolutionary algorithms ENORA and NSGA-II have been adapted to optimize the performance of the classifier based on three different machine learning metrics: accuracy, area under the ROC curve, and root mean square error. We have extensively compared the classifiers generated by our proposed method with classifiers generated by the classical methods PART, JRip, OneR and ZeroR. The experiments have been conducted in full training mode, in 10-fold cross-validation mode, and in train/test splitting mode. To make the results reproducible, we have used the well-known and publicly available datasets Breast Cancer, Monk's Problem 2, Tic-Tac-Toe-Endgame, Car, kr-vs-kp and Nursery. After performing an exhaustive statistical analysis of our results, we conclude that the proposed method is able to generate highly accurate and easy-to-interpret classification models.
Keywords: categorical data; interpretable machine learning; multi-objective evolutionary algorithms; rule-based classifiers
Year: 2018 PMID: 33265773 PMCID: PMC7513209 DOI: 10.3390/e20090684
Source DB: PubMed Journal: Entropy (Basel) ISSN: 1099-4300 Impact factor: 2.524
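The selection step described in the abstract, keeping only classifiers that are Pareto-optimal with respect to performance and rule count, can be sketched as follows. This is an illustrative sketch, not the paper's implementation; the candidate (error, number of rules) pairs are made up.

```python
# Hedged sketch: selecting non-dominated rule-based classifiers under the
# paper's two objectives -- minimize classification error (or 1 - AUC, or
# RMSE) and minimize the number of rules.

def dominates(a, b):
    """True if candidate a = (error, n_rules) Pareto-dominates b:
    no worse on both objectives and strictly better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(candidates):
    """Return the non-dominated subset (the Pareto-optimal classifiers)."""
    return [c for c in candidates
            if not any(dominates(o, c) for o in candidates if o != c)]

# (error, number of rules) for some hypothetical candidate classifiers
candidates = [(0.21, 2), (0.24, 2), (0.19, 5), (0.30, 1), (0.19, 7)]
front = sorted(pareto_front(candidates))
print(front)  # -> [(0.19, 5), (0.21, 2), (0.3, 1)]
```

Both ENORA and NSGA-II rank individuals using this kind of dominance relation; they differ in how ranks are assigned within the population (see Figure 1).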
Figure 1. Rank assignment of individuals with ENORA vs. NSGA-II.
Figure 2. A Pareto front of a binary classification problem as formulated in Equation (3), where both objectives are minimized.
Chromosome coding for an individual I. [Table layout: for each rule of the rule set, the codification of the antecedents and the consequent, together with the associated crossing and mutation operators; the cell contents (mathematical symbols) were lost in extraction.]
Attribute description of the Breast Cancer dataset.
| # | Attribute Name | Type | Possible Values |
|---|---|---|---|
| 1 | age | categorical | 10–19, 20–29, 30–39, 40–49, 50–59, 60–69, 70–79, 80–89, 90–99. |
| 2 | menopause | categorical | lt40, ge40, premeno |
| 3 | tumour-size | categorical | 0–4, 5–9, 10–14, 15–19, 20–24, 25–29, 30–34, 35–39, 40–44, 45–49, 50–54, 55–59 |
| 4 | inv-nodes | categorical | 0–2, 3–5, 6–8, 9–11, 12–14, 15–17, 18–20, 21–23, 24–26, 27–29, 30–32, 33–35, 36–39 |
| 5 | node-caps | categorical | yes, no |
| 6 | deg-malig | categorical | 1, 2, 3 |
| 7 | breast | categorical | left, right |
| 8 | breast-quad | categorical | left-up, left-low, right-up, right-low, central |
| 9 | irradiat | categorical | yes, no |
| 10 | class | categorical | no-recurrence-events, recurrence-events |
Attribute description of the MONK’s Problem 2 dataset.
| # | Attribute Name | Type | Possible Values |
|---|---|---|---|
| 1 | head_shape | categorical | round, square, octagon |
| 2 | body_shape | categorical | round, square, octagon |
| 3 | is_smiling | categorical | yes, no |
| 4 | holding | categorical | sword, balloon, flag |
| 5 | jacket_color | categorical | red, yellow, green, blue |
| 6 | has_tie | categorical | yes, no |
| 7 | class | categorical | yes, no |
Figure 3. Pareto fronts of one execution of ENORA and NSGA-II on the Breast Cancer dataset, and their respective HVR. Note that in the multi-objective classification cases where the performance function is maximized (accuracy and area under the ROC curve), the function has been converted to minimization for a better understanding of the Pareto front.
Figure 4. Pareto fronts of one execution of ENORA and NSGA-II on the Monk's Problem 2 dataset, and their respective HVR. Note that in the multi-objective classification cases where the performance function is maximized (accuracy and area under the ROC curve), the function has been converted to minimization for a better understanding of the Pareto front.
Run times of ENORA and NSGA-II for the Breast Cancer and Monk's Problem 2 datasets. [The method labels in the first column were lost in extraction.]

| Method | Breast Cancer | Monk's Problem 2 |
|---|---|---|
| — | 244.92 s | 428.14 s |
| — | 294.75 s | 553.11 s |
| — | 243.30 s | 414.42 s |
| — | 127.13 s | 260.83 s |
| — | 197.07 s | 424.83 s |
| — | 134.87 s | 278.19 s |
Comparison of the performance of the learning models in full training mode—Breast Cancer dataset. [Labels of the six ENORA/NSGA-II variants were lost in extraction and are shown as —; the last four rows correspond to PART, JRip, OneR and ZeroR.]

| Learning Model | Number of Rules | Percent Correct | TP Rate | FP Rate | Precision | Recall | F-Measure | MCC | ROC Area | PRC Area | RMSE |
|---|---|---|---|---|---|---|---|---|---|---|---|
| — | 2 | 79.02 | 0.790 | 0.449 | 0.796 | 0.790 | 0.762 | 0.455 | 0.671 | 0.697 | 0.458 |
| — | 2 | 75.87 | 0.759 | 0.374 | 0.751 | 0.759 | 0.754 | 0.402 | 0.693 | 0.696 | 0.491 |
| — | 2 | 77.62 | 0.776 | 0.475 | 0.778 | 0.776 | 0.744 | 0.410 | 0.651 | 0.680 | 0.473 |
| — | 2 | 77.97 | 0.780 | 0.501 | 0.805 | 0.780 | 0.738 | 0.429 | 0.640 | 0.679 | 0.469 |
| — | 2 | 75.52 | 0.755 | 0.368 | 0.749 | 0.755 | 0.752 | 0.399 | 0.693 | 0.696 | 0.495 |
| — | 2 | 79.37 | 0.794 | 0.447 | 0.803 | 0.794 | 0.765 | 0.467 | 0.673 | 0.700 | 0.454 |
| PART | 15 | 78.32 | 0.783 | 0.397 | 0.773 | 0.783 | 0.769 | 0.442 | 0.777 | 0.793 | 0.398 |
| JRip | 3 | 76.92 | 0.769 | 0.471 | 0.762 | 0.769 | 0.740 | 0.389 | 0.650 | 0.680 | 0.421 |
| OneR | 1 | 72.72 | 0.727 | 0.563 | 0.703 | 0.727 | 0.680 | 0.241 | 0.582 | 0.629 | 0.522 |
| ZeroR | - | 70.27 | 0.703 | 0.703 | 0.494 | 0.703 | 0.580 | 0.000 | 0.500 | 0.582 | 0.457 |
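The comparison tables appear to report Weka-style weighted per-class metrics. Assuming that, the binary-case metrics can be computed from confusion-matrix counts as below (a sketch with made-up counts, not the paper's code). A sanity check on the reconstruction: for ZeroR the FP rate equals the TP rate and the MCC is 0, which matches the baseline row.

```python
import math

# Hedged sketch of per-class metrics (TP rate, FP rate, precision, recall,
# F-measure, MCC) computed from the counts of a binary confusion matrix.
# The counts below are invented for illustration.

def binary_metrics(tp, fp, fn, tn):
    tpr = tp / (tp + fn)                      # TP rate (= recall)
    fpr = fp / (fp + tn)                      # FP rate
    precision = tp / (tp + fp)
    f_measure = 2 * precision * tpr / (precision + tpr)
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_den if mcc_den else 0.0
    return {"tp_rate": tpr, "fp_rate": fpr, "precision": precision,
            "recall": tpr, "f_measure": f_measure, "mcc": mcc}

m = binary_metrics(tp=50, fp=10, fn=15, tn=25)
print({k: round(v, 3) for k, v in m.items()})
```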
Comparison of the performance of the learning models in full training mode—Monk's Problem 2 dataset. [Labels of the six ENORA/NSGA-II variants were lost in extraction and are shown as —; the last four rows correspond to PART, JRip, OneR and ZeroR.]

| Learning Model | Number of Rules | Percent Correct | TP Rate | FP Rate | Precision | Recall | F-Measure | MCC | ROC Area | PRC Area | RMSE |
|---|---|---|---|---|---|---|---|---|---|---|---|
| — | 7 | 75.87 | 0.759 | 0.370 | 0.753 | 0.759 | 0.745 | 0.436 | 0.695 | 0.680 | 0.491 |
| — | 7 | 68.71 | 0.687 | 0.163 | 0.836 | 0.687 | 0.687 | 0.523 | 0.762 | 0.729 | 0.559 |
| — | 7 | 77.70 | 0.777 | 0.360 | 0.777 | 0.777 | 0.762 | 0.481 | 0.708 | 0.695 | 0.472 |
| — | 7 | 68.38 | 0.684 | 0.588 | 0.704 | 0.684 | 0.597 | 0.203 | 0.548 | 0.580 | 0.562 |
| — | 7 | 66.38 | 0.664 | 0.175 | 0.830 | 0.664 | 0.661 | 0.497 | 0.744 | 0.715 | 0.580 |
| — | 7 | 68.71 | 0.687 | 0.591 | 0.737 | 0.687 | 0.595 | 0.226 | 0.548 | 0.583 | 0.559 |
| PART | 47 | 94.01 | 0.940 | 0.087 | 0.940 | 0.940 | 0.940 | 0.866 | 0.980 | 0.979 | 0.218 |
| JRip | 1 | 65.72 | 0.657 | 0.657 | 0.432 | 0.657 | 0.521 | 0.000 | 0.500 | 0.549 | 0.475 |
| OneR | 1 | 65.72 | 0.657 | 0.657 | 0.432 | 0.657 | 0.521 | 0.000 | 0.500 | 0.549 | 0.585 |
| ZeroR | - | 65.72 | 0.657 | 0.657 | 0.432 | 0.657 | 0.521 | 0.000 | 0.500 | 0.549 | 0.475 |
Rule-based classifier obtained with NSGA-II-RMSE for Breast Cancer dataset.
| Rule | Antecedents | Consequent |
|---|---|---|
| 1 | age = 50–59 ∧ inv-nodes = 0–2 ∧ node-caps = no ∧ deg-malig = 1 ∧ breast = right ∧ breast-quad = left-low | class = no-recurrence-events |
| 2 | age = 60–69 ∧ inv-nodes = 18–20 ∧ node-caps = yes ∧ deg-malig = 3 ∧ breast = left ∧ breast-quad = right-up | class = recurrence-events |
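Applying a rule set like the one above to a new instance reduces to scoring each rule against the instance's attribute values and returning the consequent of the best-scoring rule. A minimal sketch follows; the fraction-of-matching-antecedents score is a simplified stand-in for the paper's compatibility/association degrees, and the patient record is invented.

```python
# Hedged sketch: classifying an instance with the two-rule Breast Cancer
# classifier shown above (antecedent/consequent pairs transcribed with
# ASCII hyphens in the category names).

RULES = [
    ({"age": "50-59", "inv-nodes": "0-2", "node-caps": "no",
      "deg-malig": "1", "breast": "right", "breast-quad": "left-low"},
     "no-recurrence-events"),
    ({"age": "60-69", "inv-nodes": "18-20", "node-caps": "yes",
      "deg-malig": "3", "breast": "left", "breast-quad": "right-up"},
     "recurrence-events"),
]

def classify(instance):
    # score each rule by the fraction of antecedents the instance satisfies
    def compatibility(antecedents):
        return sum(instance.get(a) == v for a, v in antecedents.items()) / len(antecedents)
    return max(RULES, key=lambda rule: compatibility(rule[0]))[1]

patient = {"age": "60-69", "inv-nodes": "15-17", "node-caps": "yes",
           "deg-malig": "3", "breast": "left", "breast-quad": "left-up"}
print(classify(patient))  # -> recurrence-events (rule 2 matches 4 of 6 antecedents)
```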
Rule-based classifier obtained with ENORA-RMSE for Monk’s Problem 2 dataset.
| Rule | Antecedents | Consequent |
|---|---|---|
| 1 | head_shape = round ∧ body_shape = round ∧ is_smiling = no ∧ holding = sword ∧ jacket_color = red ∧ has_tie = yes | class = yes |
| 2 | head_shape = octagon ∧ body_shape = round ∧ is_smiling = no ∧ holding = sword ∧ jacket_color = red ∧ has_tie = no | class = yes |
| 3 | head_shape = round ∧ body_shape = round ∧ is_smiling = no ∧ holding = sword ∧ jacket_color = yellow ∧ has_tie = yes | class = yes |
| 4 | head_shape = round ∧ body_shape = round ∧ is_smiling = no ∧ holding = sword ∧ jacket_color = red ∧ has_tie = no | class = yes |
| 5 | head_shape = square ∧ body_shape = square ∧ is_smiling = yes ∧ holding = flag ∧ jacket_color = yellow ∧ has_tie = no | class = no |
| 6 | head_shape = octagon ∧ body_shape = round ∧ is_smiling = yes ∧ holding = balloon ∧ jacket_color = blue ∧ has_tie = no | class = no |
| 7 | head_shape = octagon ∧ body_shape = octagon ∧ is_smiling = yes ∧ holding = sword ∧ jacket_color = green ∧ has_tie = no | class = no |
Comparison of the performance of the learning models in 10-fold cross-validation mode (three repetitions)—Breast Cancer dataset.
[Labels of the six ENORA/NSGA-II variants were lost in extraction and are shown as —; the last four rows correspond to PART, JRip, OneR and ZeroR. The middle column, lost from the header, is consistent with the area under the ROC curve.]

| Learning Model | Percent Correct | AUC | Serialized Model Size |
|---|---|---|---|
| — | 73.45 | 0.61 | 9554.80 |
| — | 70.16 | 0.62 | 9554.63 |
| — | 72.39 | 0.60 | 9557.77 |
| — | 72.50 | 0.60 | 9556.20 |
| — | 70.03 | 0.61 | 9555.70 |
| — | 73.34 | 0.60 | 9558.60 |
| PART | 68.92 | 0.61 | 55,298.13 |
| JRip | 71.82 | 0.61 | 7664.07 |
| OneR | 67.15 | 0.55 | 1524.00 |
| ZeroR | 70.30 | 0.50 | 915.00 |
Comparison of the performance of the learning models in split mode—Monk's Problem 2 dataset. [Labels of the six ENORA/NSGA-II variants were lost in extraction and are shown as —; the last four rows correspond to PART, JRip, OneR and ZeroR. The middle column, lost from the header, is consistent with the area under the ROC curve.]

| Learning Model | Percent Correct | AUC | Serialized Model Size |
|---|---|---|---|
| — | 76.69 | 0.70 | 9586.50 |
| — | 72.82 | 0.79 | 9589.30 |
| — | 75.66 | 0.68 | 9585.30 |
| — | 70.07 | 0.59 | 9590.60 |
| — | 67.08 | 0.70 | 9619.70 |
| — | 67.63 | 0.54 | 9565.90 |
| PART | 73.51 | 0.79 | 73,115.90 |
| JRip | 64.05 | 0.50 | 5956.90 |
| OneR | 65.72 | 0.50 | 1313.00 |
| ZeroR | 65.72 | 0.50 | 888.00 |
Shapiro–Wilk normality test p-values for percent correct metric—Breast Cancer dataset.
| Algorithm | p-Value | Null Hypothesis |
|---|---|---|
| — | 0.5316 | Not Rejected |
| — | 0.3035 | Not Rejected |
| — | 0.7609 | Not Rejected |
| — | 0.1734 | Not Rejected |
| — | 0.3802 | Not Rejected |
| — | 0.6013 | Not Rejected |
| — | 0.0711 | Not Rejected |
| — | 0.5477 | Not Rejected |
| — | 0.3160 | Not Rejected |
| — | — | Rejected |

[Algorithm labels and the p-value of the rejected case were lost in extraction.]
Friedman p-value for percent correct metric—Breast Cancer dataset.
| Test | p-Value | Null Hypothesis |
|---|---|---|
| Friedman | — | Rejected |
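The Friedman test used throughout these tables ranks the algorithms within each cross-validation fold/repetition and checks whether the mean ranks differ significantly. A self-contained sketch of the statistic follows (the accuracy matrix is made up, and ties are not handled); the resulting chi-square value would be compared against a critical value with k − 1 degrees of freedom.

```python
# Hedged sketch of the Friedman statistic for comparing k algorithms over
# n folds. Rank 1 = best (highest percent correct) within each fold.

def friedman_statistic(scores):
    """scores[i][j] = percent correct of algorithm j on fold i."""
    n, k = len(scores), len(scores[0])
    rank_sums = [0.0] * k
    for fold in scores:
        # rank algorithms within this fold (descending score -> rank 1)
        order = sorted(range(k), key=lambda j: -fold[j])
        for rank, j in enumerate(order, start=1):
            rank_sums[j] += rank
    mean_ranks = [s / n for s in rank_sums]
    chi2 = 12 * n / (k * (k + 1)) * sum((r - (k + 1) / 2) ** 2 for r in mean_ranks)
    return chi2, mean_ranks

# invented percent-correct scores: 4 folds x 3 algorithms
scores = [[79.0, 76.0, 73.0],
          [78.5, 75.5, 74.0],
          [80.0, 74.0, 75.0],
          [79.5, 76.5, 72.5]]
chi2, mean_ranks = friedman_statistic(scores)
print(chi2, mean_ranks)  # -> 6.5 [1.0, 2.25, 2.75]
```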
Nemenyi post-hoc procedure for percent correct metric—Breast Cancer dataset.
[Lower-triangular matrix of pairwise Nemenyi p-values for the ten algorithms; the algorithm labels and several entries were lost in extraction, so the matrix is not reproduced here.]
Summary of statistically significant differences for percent correct metric—Breast Cancer dataset.
[Matrix of pairwise significant differences; contents not recoverable from the extraction.]
Shapiro–Wilk normality test p-values for area under the ROC curve metric—Breast Cancer dataset.
| Algorithm | p-Value | Null Hypothesis |
|---|---|---|
| — | 0.6807 | Not Rejected |
| — | 0.3171 | Not Rejected |
| — | 0.6125 | Not Rejected |
| — | 0.0871 | Not Rejected |
| — | 0.5478 | Not Rejected |
| — | 0.6008 | Not Rejected |
| — | 0.6066 | Not Rejected |
| — | 0.2978 | Not Rejected |
| — | 0.4531 | Not Rejected |
| — | — | Rejected |

[Algorithm labels and the p-value of the rejected case were lost in extraction.]
Friedman p-value for area under the ROC curve metric—Breast Cancer dataset.
| Test | p-Value | Null Hypothesis |
|---|---|---|
| Friedman | — | Rejected |
Nemenyi post-hoc procedure for area under the ROC curve metric—Breast Cancer dataset.
[Lower-triangular matrix of pairwise Nemenyi p-values for the ten algorithms; the algorithm labels and several entries were lost in extraction, so the matrix is not reproduced here.]
Summary of statistically significant differences for area under the ROC curve metric—Breast Cancer dataset.
[Matrix of pairwise significant differences; contents not recoverable from the extraction.]
Shapiro–Wilk normality test p-values for serialized model size metric—Breast Cancer dataset.
| Algorithm | p-Value | Null Hypothesis |
|---|---|---|
| — | — | Rejected |
| — | — | Rejected |
| — | — | Rejected |
| — | — | Rejected |
| — | — | Rejected |
| — | — | Rejected |
| — | 0.3585 | Not Rejected |
| — | — | Rejected |
| — | — | Rejected |
| — | — | Rejected |

[Algorithm labels and the p-values of the rejected cases were lost in extraction.]
Friedman p-value for serialized model size metric—Breast Cancer dataset.
| Test | p-Value | Null Hypothesis |
|---|---|---|
| Friedman | — | Rejected |
Nemenyi post-hoc procedure for serialized model size metric—Breast Cancer dataset.
[Lower-triangular matrix of pairwise Nemenyi p-values for the ten algorithms; the algorithm labels and several entries were lost in extraction, so the matrix is not reproduced here.]
Summary of statistically significant differences for serialized model size metric—Breast Cancer dataset.
[Matrix of pairwise significant differences; contents not recoverable from the extraction.]
Shapiro–Wilk normality test p-values for percent correct metric—Monk’s Problem 2 dataset.
| Algorithm | p-Value | Null Hypothesis |
|---|---|---|
| — | 0.6543 | Not Rejected |
| — | 0.6842 | Not Rejected |
| — | — | Rejected |
| — | 0.9790 | Not Rejected |
| — | 0.3820 | Not Rejected |
| — | — | Rejected |
| — | 0.5671 | Not Rejected |
| — | — | Rejected |
| — | — | Rejected |
| — | — | Rejected |

[Algorithm labels and the p-values of the rejected cases were lost in extraction.]
Friedman p-value for percent correct metric—Monk’s Problem 2 dataset.
| Test | p-Value | Null Hypothesis |
|---|---|---|
| Friedman | — | Rejected |
Nemenyi post-hoc procedure for percent correct metric—Monk’s Problem 2 dataset.
[Lower-triangular matrix of pairwise Nemenyi p-values for the ten algorithms; the algorithm labels and several entries were lost in extraction, so the matrix is not reproduced here.]
Summary of statistically significant differences for percent correct metric—Monk’s Problem 2 dataset.
[Matrix of pairwise significant differences; contents not recoverable from the extraction.]
Shapiro–Wilk normality test p-values for area under the ROC curve metric—Monk’s Problem 2 dataset.
| Algorithm | p-Value | Null Hypothesis |
|---|---|---|
| — | 0.4318 | Not Rejected |
| — | 0.7044 | Not Rejected |
| — | — | Rejected |
| — | 0.3082 | Not Rejected |
| — | — | Rejected |
| — | 0.7802 | Not Rejected |
| — | 0.1641 | Not Rejected |
| — | 0.3581 | Not Rejected |
| — | — | Rejected |
| — | — | Rejected |

[Algorithm labels and the p-values of the rejected cases were lost in extraction.]
Friedman p-value for area under the ROC curve metric—Monk’s Problem 2 dataset.
| Test | p-Value | Null Hypothesis |
|---|---|---|
| Friedman | — | Rejected |
Nemenyi post-hoc procedure for area under the ROC curve metric—Monk’s Problem 2 dataset.
[Lower-triangular matrix of pairwise Nemenyi p-values for the ten algorithms; the algorithm labels and several entries were lost in extraction, so the matrix is not reproduced here.]
Summary of statistically significant differences for area under the ROC curve metric—Monk’s Problem 2 dataset.
[Matrix of pairwise significant differences; contents not recoverable from the extraction.]
Shapiro–Wilk normality test p-values for serialized model size metric—Monk’s Problem 2 dataset.
| Algorithm | p-Value | Null Hypothesis |
|---|---|---|
| — | — | Rejected |
| — | — | Rejected |
| — | — | Rejected |
| — | — | Rejected |
| — | — | Rejected |
| — | — | Rejected |
| — | 0.9721 | Not Rejected |
| — | — | Rejected |
| — | — | Rejected |
| — | — | Rejected |

[Algorithm labels and the p-values of the rejected cases were lost in extraction.]
Friedman p-value for serialized model size metric—Monk’s Problem 2 dataset.
| Test | p-Value | Null Hypothesis |
|---|---|---|
| Friedman | — | Rejected |
Nemenyi post-hoc procedure for serialized model size metric—Monk’s Problem 2 dataset.
[Lower-triangular matrix of pairwise Nemenyi p-values for the ten algorithms; the algorithm labels and several entries were lost in extraction, so the matrix is not reproduced here.]
Summary of statistically significant differences for serialized model size metric—Monk’s Problem 2 dataset.
[Matrix of pairwise significant differences; contents not recoverable from the extraction.]
Attribute description of the Tic-Tac-Toe-Endgame dataset.
| # | Attribute Name | Type | Possible Values |
|---|---|---|---|
| 1 | top-left-square | categorical | x, o, b |
| 2 | top-middle-square | categorical | x, o, b |
| 3 | top-right-square | categorical | x, o, b |
| 4 | middle-left-square | categorical | x, o, b |
| 5 | middle-middle-square | categorical | x, o, b |
| 6 | middle-right-square | categorical | x, o, b |
| 7 | bottom-left-square | categorical | x, o, b |
| 8 | bottom-middle-square | categorical | x, o, b |
| 9 | bottom-right-square | categorical | x, o, b |
| 10 | class | categorical | positive, negative |
Attribute description of the Car dataset.
| # | Attribute Name | Type | Possible Values |
|---|---|---|---|
| 1 | buying | categorical | vhigh, high, med, low |
| 2 | maint | categorical | vhigh, high, med, low |
| 3 | doors | categorical | 2, 3, 4, 5-more |
| 4 | persons | categorical | 2, 4, more |
| 5 | lug_boot | categorical | small, med, big |
| 6 | safety | categorical | low, med, high |
| 7 | class | categorical | unacc, acc, good, vgood |
Attribute description of the kr-vs-kp dataset.
| # | Attribute Name | Type | Possible Values |
|---|---|---|---|
| 1 | bkblk | categorical | t, f |
| 2 | bknwy | categorical | t, f |
| 3 | bkon8 | categorical | t, f |
| 4 | bkona | categorical | t, f |
| 5 | bkspr | categorical | t, f |
| 6 | bkxbq | categorical | t, f |
| 7 | bkxcr | categorical | t, f |
| 8 | bkxwp | categorical | t, f |
| 9 | blxwp | categorical | t, f |
| 10 | bxqsq | categorical | t, f |
| 11 | cntxt | categorical | t, f |
| 12 | dsopp | categorical | t, f |
| 13 | dwipd | categorical | g, l |
| 14 | hdchk | categorical | t, f |
| 15 | katri | categorical | b, n, w |
| 16 | mulch | categorical | t, f |
| 17 | qxmsq | categorical | t, f |
| 18 | r2ar8 | categorical | t, f |
| 19 | reskd | categorical | t, f |
| 20 | reskr | categorical | t, f |
| 21 | rimmx | categorical | t, f |
| 22 | rkxwp | categorical | t, f |
| 23 | rxmsq | categorical | t, f |
| 24 | simpl | categorical | t, f |
| 25 | skach | categorical | t, f |
| 26 | skewr | categorical | t, f |
| 27 | skrxp | categorical | t, f |
| 28 | spcop | categorical | t, f |
| 29 | stlmt | categorical | t, f |
| 30 | thrsk | categorical | t, f |
| 31 | wkcti | categorical | t, f |
| 32 | wkna8 | categorical | t, f |
| 33 | wknck | categorical | t, f |
| 34 | wkovl | categorical | t, f |
| 35 | wkpos | categorical | t, f |
| 36 | wtoeg | categorical | n, t, f |
| 37 | class | categorical | won, nowin |
Attribute description of the Nursery dataset.
| # | Attribute Name | Type | Possible Values |
|---|---|---|---|
| 1 | parents | categorical | usual, pretentious, great_pret |
| 2 | has_nurs | categorical | proper, less_proper, improper, critical, very_crit |
| 3 | form | categorical | complete, completed, incomplete, foster |
| 4 | children | categorical | 1, 2, 3, more |
| 5 | housing | categorical | convenient, less_conv, critical |
| 6 | finance | categorical | convenient, inconv |
| 7 | social | categorical | nonprob, slightly_prob, problematic |
| 8 | health | categorical | recommended, priority, not_recom |
| 9 | class | categorical | not_recom, recommend, very_recom, priority, spec_prior |
Comparison of the performance of the learning models in 10-fold cross-validation mode—Monk’s Problem 2, Tic-Tac-Toe-Endgame, Car, kr-vs-kp and Nursery datasets.
[The label of the proposed ENORA/NSGA-II model in each block was lost in extraction and is shown as —; the PART, JRip, OneR and ZeroR rows are identified by their row position and rule counts. Dataset blocks follow the order given in the caption.]

| Learning Model | Number of Rules | Percent Correct | TP Rate | FP Rate | Precision | Recall | F-Measure | MCC | ROC Area | PRC Area | RMSE |
|---|---|---|---|---|---|---|---|---|---|---|---|
| **Monk's Problem 2** | | | | | | | | | | | |
| — | 7 | 77.70 | 0.777 | 0.360 | 0.777 | 0.777 | 0.762 | 0.481 | 0.708 | 0.695 | 0.472 |
| PART | 47 | 79.53 | 0.795 | 0.253 | 0.795 | 0.795 | 0.795 | 0.544 | 0.884 | 0.893 | 0.380 |
| JRip | 1 | 62.90 | 0.629 | 0.646 | 0.526 | 0.629 | 0.535 | −0.034 | 0.478 | 0.537 | 0.482 |
| OneR | 1 | 65.72 | 0.657 | 0.657 | 0.432 | 0.657 | 0.521 | 0.000 | 0.500 | 0.549 | 0.586 |
| ZeroR | - | 65.72 | 0.657 | 0.657 | 0.432 | 0.657 | 0.521 | 0.000 | 0.491 | 0.545 | 0.457 |
| **Tic-Tac-Toe-Endgame** | | | | | | | | | | | |
| — | 2 | 98.33 | 0.983 | 0.031 | 0.984 | 0.983 | 0.983 | 0.963 | 0.976 | 0.973 | 0.129 |
| PART | 49 | 94.26 | 0.943 | 0.076 | 0.942 | 0.943 | 0.942 | 0.873 | 0.974 | 0.969 | 0.220 |
| JRip | 9 | 97.81 | 0.978 | 0.031 | 0.978 | 0.978 | 0.978 | 0.951 | 0.977 | 0.977 | 0.138 |
| OneR | 1 | 69.94 | 0.699 | 0.357 | 0.701 | 0.699 | 0.700 | 0.340 | 0.671 | 0.651 | 0.548 |
| ZeroR | - | 65.35 | 0.653 | 0.653 | 0.427 | 0.653 | 0.516 | 0.000 | 0.496 | 0.545 | 0.476 |
| **Car** | | | | | | | | | | | |
| — | 14 | 86.57 | 0.866 | 0.089 | 0.866 | 0.866 | 0.846 | 0.766 | 0.889 | 0.805 | 0.259 |
| PART | 68 | 95.78 | 0.958 | 0.016 | 0.959 | 0.958 | 0.958 | 0.929 | 0.990 | 0.979 | 0.1276 |
| JRip | 49 | 86.46 | 0.865 | 0.064 | 0.881 | 0.865 | 0.870 | 0.761 | 0.947 | 0.899 | 0.224 |
| OneR | 1 | 70.02 | 0.700 | 0.700 | 0.490 | 0.700 | 0.577 | 0.000 | 0.500 | 0.543 | 0.387 |
| ZeroR | - | 70.02 | 0.700 | 0.700 | 0.490 | 0.700 | 0.577 | 0.000 | 0.497 | 0.542 | 0.338 |
| **kr-vs-kp** | | | | | | | | | | | |
| — | 10 | 94.87 | 0.949 | 0.050 | 0.950 | 0.949 | 0.949 | 0.898 | 0.950 | 0.927 | 0.227 |
| PART | 23 | 99.06 | 0.991 | 0.010 | 0.991 | 0.991 | 0.991 | 0.981 | 0.997 | 0.996 | 0.088 |
| JRip | 16 | 99.19 | 0.992 | 0.008 | 0.992 | 0.992 | 0.992 | 0.984 | 0.995 | 0.993 | 0.088 |
| OneR | 1 | 66.46 | 0.665 | 0.350 | 0.675 | 0.665 | 0.655 | 0.334 | 0.657 | 0.607 | 0.579 |
| ZeroR | - | 52.22 | 0.522 | 0.522 | 0.273 | 0.522 | 0.358 | 0.000 | 0.499 | 0.500 | 0.500 |
| **Nursery** | | | | | | | | | | | |
| — | 15 | 88.41 | 0.884 | 0.055 | 0.870 | 0.884 | 0.873 | 0.824 | 0.915 | 0.818 | 0.2153 |
| PART | 220 | 99.21 | 0.992 | 0.003 | 0.992 | 0.992 | 0.992 | 0.989 | 0.999 | 0.997 | 0.053 |
| JRip | 131 | 96.84 | 0.968 | 0.012 | 0.968 | 0.968 | 0.968 | 0.957 | 0.993 | 0.974 | 0.103 |
| OneR | 1 | 70.97 | 0.710 | 0.137 | 0.695 | 0.710 | 0.702 | 0.570 | 0.786 | 0.632 | 0.341 |
| ZeroR | - | 33.33 | 0.333 | 0.333 | 0.111 | 0.333 | 0.167 | 0.000 | 0.500 | 0.317 | 0.370 |
Nomenclature table (Part I). [The symbols in the first column were lost in extraction; definitions truncated at a lost symbol are marked with an ellipsis.]

| Symbol | Definition |
|---|---|
| — | k-th decision variable |
| — | Set of decision variables |
| — | i-th objective function |
| — | j-th constraint |
| — | Number of objectives |
| — | Number of constraints |
| — | Number of decision variables |
| — | Domain of each decision variable |
| — | Domain of the set of decision variables |
| — | Set of all feasible solutions |
| — | Set of non-dominated solutions (Pareto-optimal set) |
| — | Pareto domination function |
| — | Dataset |
| — | Categorical input attributes in the dataset |
| — | Categorical output attribute in the dataset |
| — | Domain of the i-th categorical input attribute in the dataset |
| — | Domain of the categorical output attribute in the dataset |
| — | Number of categorical input attributes in the dataset |
| — | Rule-based classifier |
| — | Category for … |
| — | Category for the categorical output attribute and … |
| — | Compatibility degree of the … |
| — | Result of the … |
| — | Association degree of the classifier … |
| — | Result of the … |
| — | Classification (output) of the classifier |
| — | Performance objective function of the classifier |
| — | Number of rules of the classifier |
| — | Maximum number of rules allowed for classifiers |
| — | Number of instances in the dataset |
| — | Result of the classification of the … |
| — | Predicted value of the … |
| — | Corresponding true value for the … |
| — | Area under the ROC curve |
| — | Discrimination threshold |
| — | Root mean square error |
Nomenclature table (Part II). [The symbols in the first column were lost in extraction; definitions truncated at a lost symbol are marked with an ellipsis.]

| Symbol | Definition |
|---|---|
| — | Population |
| — | Set of non-dominated individuals of … |
| — | Volume of the search space dominated by the individual … |
| — | Hypervolume: volume of the search space dominated by population … |
| — | Volume of the search space non-dominated by population … |
| — | Hypervolume ratio: ratio of … |
| — | Volume of the search space |
| — | Minimum value for objective … |
| — | Maximum value for objective … |
| — | Minimum value for objective … |
| — | Maximum value for objective … |
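The hypervolume and HVR entries above can be made concrete for the two-objective case. Under the common definition (consistent with the symbols listed, though the exact formulas were lost in extraction), with both objectives minimized, HV(P) is the area dominated by the non-dominated set up to a reference point, and HVR divides it by the volume of the search space. A sketch with invented front points and bounds:

```python
# Hedged sketch of 2-objective hypervolume and hypervolume ratio (HVR).
# Front points, reference point and search-space bounds are made up.

def hypervolume_2d(front, ref):
    """Area dominated by a mutually non-dominated 2-objective
    (minimization) front, bounded by the reference point ref."""
    pts = sorted(front)                      # ascending in the first objective
    area, prev_f2 = 0.0, ref[1]
    for f1, f2 in pts:
        if f2 < prev_f2:                     # skip dominated points
            area += (ref[0] - f1) * (prev_f2 - f2)
            prev_f2 = f2
    return area

front = [(0.2, 6), (0.3, 3), (0.5, 1)]       # (error, number of rules)
ref = (1.0, 10)                              # worst corner of the space
hv = hypervolume_2d(front, ref)
hvr = hv / (ref[0] * ref[1])                 # search-space volume from (0, 0)
print(hv, round(hvr, 3))
```

A larger HVR means the front covers more of the objective space, which is how ENORA and NSGA-II fronts are compared in Figures 3 and 4.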