Muhammad Umar Chaudhry, Jee-Hyong Lee.
Abstract
Given the increasing size and complexity of the datasets needed to train machine learning algorithms, it is necessary to reduce the number of features required to achieve high classification accuracy. This paper presents a novel and efficient approach based on Monte Carlo Tree Search (MCTS) to find the optimal feature subset in the feature space. The algorithm searches for the best feature subset by combining the benefits of tree search with random sampling. Starting from an empty node, the tree is built incrementally by adding nodes that represent the inclusion or exclusion of features. Every iteration yields a feature subset by following the tree policy and then the default policy. The accuracy of the classifier on that feature subset is used as the reward and propagated backwards to update the tree. Finally, the subset with the highest reward is chosen as the best feature subset. The efficiency and effectiveness of the proposed method are validated by experiments on many benchmark datasets. The results are also compared with significant methods in the literature and demonstrate the superiority of the proposed method.
Keywords: MOTiFS; Monte Carlo Tree Search (MCTS); dimensionality reduction; feature selection; heuristic feature selection; wrapper
Year: 2018 PMID: 33265475 PMCID: PMC7512904 DOI: 10.3390/e20050385
Source DB: PubMed Journal: Entropy (Basel) ISSN: 1099-4300 Impact factor: 2.524
Figure 1. Key aspects of feature selection.
Figure 2. The proposed method, MOTiFS (Monte Carlo Tree Search Based Feature Selection).
Figure 3. Feature selection tree and search procedure of MOTiFS.
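The search procedure described in the abstract (tree policy, expansion, default policy, reward, backpropagation) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `Node`, `uct_child`, `mcts_feature_selection`, and `toy_reward` are hypothetical, the selection rule is the standard UCT formula with a scaling factor `c` (the exact formula used by MOTiFS is not given in this excerpt), and the default policy is assumed to include each remaining feature with probability 0.5.

```python
import math
import random


class Node:
    """A tree node reached after deciding include/exclude for `depth` features."""

    def __init__(self, depth, subset, parent=None):
        self.depth = depth        # number of features already decided
        self.subset = subset      # tuple of feature indices included so far
        self.parent = parent
        self.children = {}        # action (True = include, False = exclude) -> Node
        self.visits = 0
        self.total_reward = 0.0


def uct_child(node, c):
    """Standard UCT (assumed): mean reward plus a scaled exploration bonus."""
    def score(child):
        mean = child.total_reward / child.visits
        return mean + c * math.sqrt(2.0 * math.log(node.visits) / child.visits)
    return max(node.children.values(), key=score)


def mcts_feature_selection(n_features, reward_fn, iterations=500, c=0.1, seed=0):
    rng = random.Random(seed)
    root = Node(depth=0, subset=())
    best_subset, best_reward = (), float("-inf")
    for _ in range(iterations):
        node = root
        # Tree policy: descend while both include/exclude children exist.
        while node.depth < n_features and len(node.children) == 2:
            node = uct_child(node, c)
        # Expansion: add one untried child for the next feature.
        if node.depth < n_features:
            action = rng.choice([a for a in (True, False) if a not in node.children])
            subset = node.subset + ((node.depth,) if action else ())
            node.children[action] = Node(node.depth + 1, subset, parent=node)
            node = node.children[action]
        # Default policy (assumed): decide the remaining features at random.
        rollout = list(node.subset)
        rollout += [f for f in range(node.depth, n_features) if rng.random() < 0.5]
        rollout = tuple(rollout)
        reward = reward_fn(rollout)
        if reward > best_reward:
            best_subset, best_reward = rollout, reward
        # Backpropagation: update statistics from the new node up to the root.
        while node is not None:
            node.visits += 1
            node.total_reward += reward
            node = node.parent
    return best_subset, best_reward


def toy_reward(subset):
    """Stand-in for classifier accuracy: features 1 and 3 help, others cost a little."""
    relevant = {1, 3}
    hits = len(relevant & set(subset))
    return hits / len(relevant) - 0.05 * (len(subset) - hits)


if __name__ == "__main__":
    print(mcts_feature_selection(6, toy_reward, iterations=500, c=0.1))
```

In practice `reward_fn` would return the accuracy of a classifier trained on the selected columns; `toy_reward` only keeps the sketch self-contained.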
Notations used in the proposed method.
| Notation | Interpretation |
|---|---|
| | Original feature set |
| | Total number of features |
| | Node |
| | Action taken at |
| | Simulation reward |
Summary of the selected datasets.
| # | Dataset | No. of Features | No. of Instances | No. of Classes |
|---|---|---|---|---|
| 1 | Spambase | 57 | 4701 | 2 |
| 2 | WBC | 9 | 699 | 2 |
| 3 | Ionosphere | 34 | 351 | 2 |
| 4 | Arrhythmia | 195 | 452 | 16 |
| 5 | Multiple features | 649 | 2000 | 10 |
| 6 | Waveform | 40 | 5000 | 3 |
| 7 | WBDC | 30 | 569 | 2 |
| 8 | Glass | 9 | 214 | 6 |
| 9 | Wine | 13 | 178 | 3 |
| 10 | Australian | 14 | 690 | 2 |
| 11 | German number | 24 | 1000 | 2 |
| 12 | Zoo | 17 | 101 | 7 |
| 13 | Breast cancer | 10 | 683 | 2 |
| 14 | DNA | 180 | 2000 | 2 |
| 15 | Vehicle | 18 | 846 | 4 |
| 16 | Sonar | 60 | 208 | 2 |
| 17 | Hillvalley | 100 | 606 | 2 |
| 18 | Musk 1 | 166 | 476 | 2 |
| 19 | Splice | 60 | 1000 | 2 |
| 20 | KRFP * | 4860 | 215 | 2 |
| 21 | Soybean-small | 35 | 47 | 4 |
| 22 | Liver disorders | 6 | 345 | 2 |
| 23 | Credit | 15 | 690 | 2 |
| 24 | Tic-tac-toe | 9 | 985 | 2 |
| 25 | Libras movement | 90 | 360 | 15 |
* Downloaded from [11].
Parameters setup for MOTiFS.
| Parameter | Values Used for Different Datasets |
|---|---|
| Scaling factor, | (0.1, 0.05, 0.02) |
| Termination criteria | (500, 1000, 10,000) iterations |
Summary of the methods for our comparison.
| Method | Description |
|---|---|
| SFS, SBS | Sequential Forward Selection and Sequential Backward Selection * [ |
| FS-FS | Feature Similarity Technique [ |
| FR-FS | Fuzzy Rule Based Technique [ |
| SFSW | An Evolutionary Multi-Objective Optimization Approach [ |
| DEMOFS | Differential Evolution Based Multi-Objective Feature Selection * [ |
| BA | Bat Algorithm and Optimum-Path Forest Based Wrapper Approach [ |
| PSO | Particle Swarm Optimization Based Method * [ |
| SCE, CCE | Shannon’s Entropy Reduction, Complementary Entropy Reduction * [ |
| PDE-2 | Partition Differential Entropy Based Method [ |
* Results reported from the cited reference.
Comparison of MOTiFS with other methods, according to 5-NN.
| Dataset | Avg. Acc. | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| | MOTiFS | SFSW [ | SFS [ | SBS [ | FS-FS [ | FR-FS [ | DEMOFS [ | BA [ | PSO [ |
| Spambase | 0.885 | 0.874 | 0.870 | 0.900 | | | | | |
| | 31.5 | 35.7 | 37.3 | 29.0 | | | | | |
| WBC | 0.961 | 0.960 | 0.951 | 0.956 | | | | | |
| | 5.52 | 4.2 | 6.4 | 7.3 | | | | | |
| Ionosphere | 0.883 | 0.887 | 0.859 | 0.788 | 0.844 | 0.780 | 0.790 | | |
| | 12.32 | 11.5 | 9.1 | 16.0 | 4.33 | 21.0 | 14.0 | | |
| Arrhythmia | 0.650 ± 0.003 | 0.599 | 0.580 | 0.589 | | | | | |
| | 94.4 | 100.0 | 89.4 | 100.0 | | | | | |
| Multiple features | 0.979 | 0.903 | 0.912 | 0.783 | | | | | |
| | 321.84 | 270.0 | 305.0 | 325.0 | | | | | |
| Waveform | 0.816 ± 0.002 | 0.778 | 0.785 | 0.752 | | | | | |
| | 19.42 | 18.4 | 18.3 | 20.0 | | | | | |
| WBDC | 0.941 | 0.901 | 0.898 | 0.936 | | | | | |
| | 15.42 | 13.5 | 13.9 | 17.8 | | | | | |
| Glass | 0.678 | 0.631 | 0.636 | 0.615 | | | | | |
| | 4.80 | 5.8 | 7.0 | 6.96 | | | | | |
| Wine | 0.961 | 0.914 | 0.914 | 0.955 | 0.897 | | | | |
| | 7.52 | 6.9 | 6.0 | 7.5 | 6.0 | | | | |
| Australian | 0.846 | 0.830 | 0.828 | 0.773 | | | | | |
| | 6.98 | 4.7 | 3.7 | 4.0 | | | | | |
| German number | 0.713 | 0.682 | 0.658 | 0.701 | | | | | |
| | 11.46 | 10.5 | 12.2 | 10.8 | | | | | |
| Zoo | 0.920 ± 0.022 | 0.954 | 0.949 | 0.954 | | | | | |
| | 9.06 | 11.0 | 13.0 | 11.0 | | | | | |
| Breast cancer | 0.965 | 0.951 | 0.949 | 0.940 | 0.930 | | | | |
| | 6.14 | 6.10 | 6.10 | 5.0 | 5.0 | | | | |
| DNA | 0.810 ± 0.006 | 0.822 | 0.823 | 0.760 | 0.760 | | | | |
| | 89.26 | 71.8 | 20.6 | 96.0 | 91.0 | | | | |
| Vehicle | 0.653 | 0.686 | 0.673 | | | | | | |
| | 10.14 | 10.8 | 10.7 | | | | | | |
| Sonar | 0.827 | 0.729 | 0.786 | | | | | | |
| | 28.96 | 20.0 | 10.0 | | | | | | |
| Hillvalley | 0.535 ± 0.003 | 0.575 | | | | | | | |
| | 45.18 | 40.0 | | | | | | | |
| Musk 1 | 0.815 | 0.835 | | | | | | | |
| | 81.34 | 59.3 | | | | | | | |
| Splice | 0.680 | 0.670 | | | | | | | |
| | 28.0 | 28.0 | | | | | | | |
| KRFP | 0.842 * | 0.884 * | | | | | | | |
| | 2390.2 | 1866 | | | | | | | |
* Evaluated using the Weka library.
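The comparison tables report accuracies obtained with k-nearest-neighbour classifiers (5-NN and 3-NN), and the abstract states that classifier accuracy on a candidate subset serves as the reward. A rough sketch of such a reward is given below. The function name `knn_loo_accuracy`, the leave-one-out protocol, and the squared Euclidean distance are assumptions made for illustration; the evaluation protocol actually used is not stated in this excerpt.

```python
from collections import Counter


def knn_loo_accuracy(X, y, subset, k=5):
    """Leave-one-out k-NN accuracy using only the feature indices in `subset`."""
    if not subset:
        return 0.0  # assumption: an empty feature subset earns no reward
    correct = 0
    for i, (xi, yi) in enumerate(zip(X, y)):
        # Distances from instance i to every other instance, on the chosen features.
        dists = sorted(
            (sum((xi[f] - xj[f]) ** 2 for f in subset), j)
            for j, xj in enumerate(X)
            if j != i
        )
        # Majority vote among the k nearest neighbours.
        votes = Counter(y[j] for _, j in dists[:k])
        if votes.most_common(1)[0][0] == yi:
            correct += 1
    return correct / len(X)


# Tiny illustrative dataset: feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 5.0], [0.1, -3.0], [0.2, 4.0], [1.0, -4.0], [1.1, 3.0], [1.2, -5.0]]
y = [0, 0, 0, 1, 1, 1]
print(knn_loo_accuracy(X, y, subset=(0,), k=3))
print(knn_loo_accuracy(X, y, subset=(1,), k=3))
```

A function of this shape, with `X` and `y` bound in advance, could serve as the reward callback of a wrapper-style search; on real data a library classifier with cross-validation would be the more usual choice.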
Comparison of MOTiFS with other methods, according to 3-NN.
| Dataset | Avg. Acc. | Avg. Acc. | | |
|---|---|---|---|---|
| | MOTiFS | SCE [ | CCE [ | PDE-2 [ |
| Soybean-small | 0.988 ± 0.015 | | | |
| | 12.18 | | | |
| Liver-disorders | 0.602 | 0.592 | 0.590 | |
| | 3.94 | | | |
| Credit | 0.646 | 0.654 | 0.659 | |
| | 8.22 | | | |
| Tic-tac-toe | 0.774 | 0.747 | 0.757 | |
| | 7.18 | | | |
| Libras movement | 0.538 | 0.552 | 0.554 | |
| | 44.94 | | | |
Figure 4. Graphical representation of dimensional reduction (DR) achieved by MOTiFS on all the datasets.
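The definition of DR is not given in this excerpt. Assuming it denotes the fraction of original features that were discarded, DR = 1 − (selected features / total features), the figure's values could be approximated from the tables as in the sketch below. The helper name `dimensional_reduction` is hypothetical, and the first entry of each feature-count row is read as the MOTiFS value, which is also an assumption.

```python
def dimensional_reduction(n_selected, n_total):
    """Fraction of the original features that were discarded (assumed definition)."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return 1.0 - n_selected / n_total


# (total features, average number selected) read from the tables in this record.
examples = {"Spambase": (57, 31.5), "WBC": (9, 5.52), "Sonar": (60, 28.96)}
for name, (total, selected) in examples.items():
    print(f"{name}: DR = {dimensional_reduction(selected, total):.3f}")
```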