Xiao Zhang, Xia Liu, Yanyan Yang.
Abstract
The information entropy developed by Shannon is an effective measure of uncertainty in data, and rough set theory is a useful tool in computer applications for dealing with vague and uncertain data. Information entropy has been extensively applied in rough set theory, and several information entropy models have been proposed for rough sets. In this paper, building on an existing feature selection method that uses a fuzzy rough set-based information entropy, a corresponding fast algorithm is provided for efficient implementation: the fuzzy rough set-based information entropy, which serves as the evaluation measure for selecting features, is computed by an improved mechanism with lower complexity. The essence of the acceleration is to compute the lambda-conditional entropy on iteratively reduced sets of instances. Numerical experiments show that the proposed fast algorithm obtains the same feature subset as its original counterpart in significantly less time.
Keywords: fast algorithm; feature selection; fuzzy rough set theory; information entropy
Year: 2018 PMID: 33265876 PMCID: PMC7512350 DOI: 10.3390/e20100788
Source DB: PubMed Journal: Entropy (Basel) ISSN: 1099-4300 Impact factor: 2.524
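The acceleration idea in the abstract (evaluating the entropy on iteratively reduced instances) can be illustrated with a minimal sketch. It rests on loud assumptions: it uses the classical (crisp) conditional entropy on nominal data rather than the paper's fuzzy rough set-based lambda-conditional entropy, whose definition is not given in this record, and the names `cond_entropy` and `greedy_select` as well as the toy data are hypothetical. It is not the authors' Algorithm 1 or Algorithm 2.

```python
import math
from collections import Counter, defaultdict


def cond_entropy(rows, labels, feats, n_total):
    """Crisp conditional entropy H(D | feats), normalised by n_total.

    ASSUMPTION: stands in for the paper's lambda-conditional entropy.
    """
    blocks = defaultdict(list)
    for row, y in zip(rows, labels):
        blocks[tuple(row[f] for f in feats)].append(y)
    h = 0.0
    for ys in blocks.values():
        for c in Counter(ys).values():
            h -= (c / n_total) * math.log2(c / len(ys))
    return h


def greedy_select(rows, labels, reduce_instances=False, eps=1e-12):
    """Forward greedy selection; optionally drop settled instances."""
    n_total = len(rows)
    all_feats = list(range(len(rows[0])))
    target = cond_entropy(rows, labels, all_feats, n_total)
    current = cond_entropy(rows, labels, [], n_total)  # H(D)
    selected, active = [], list(range(n_total))
    while current - target > eps and len(selected) < len(all_feats):
        sub_rows = [rows[i] for i in active]
        sub_labels = [labels[i] for i in active]
        best_f, best_h = None, None
        for f in all_feats:
            if f in selected:
                continue
            h = cond_entropy(sub_rows, sub_labels, selected + [f], n_total)
            if best_h is None or h < best_h - eps:
                best_f, best_h = f, h
        selected.append(best_f)
        current = best_h
        if reduce_instances:
            # Instances in blocks with a single class contribute zero to
            # the entropy from now on, so they can be removed safely.
            classes = defaultdict(set)
            for i in active:
                classes[tuple(rows[i][f] for f in selected)].add(labels[i])
            active = [i for i in active
                      if len(classes[tuple(rows[i][f] for f in selected)]) > 1]
    return selected


# Hypothetical toy data: the class depends on features 0 and 2 only.
ROWS = [(0, 0, 0), (0, 1, 0), (0, 0, 1), (0, 1, 1),
        (1, 0, 0), (1, 1, 0), (1, 0, 1), (1, 1, 1)]
LABELS = ["a", "a", "a", "a", "b", "b", "c", "c"]

print(greedy_select(ROWS, LABELS))                         # [0, 2]
print(greedy_select(ROWS, LABELS, reduce_instances=True))  # [0, 2]
```

In this crisp setting both variants return the same subset, because a block that already contains a single class stays pure under any further refinement; the reduced variant simply evaluates each candidate feature on fewer instances (here, four instead of eight in the second iteration).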
Description of the data sets.
| Data Set | Abbreviation | Number of Objects | Conditional Attributes (All) | Conditional Attributes (Nominal) | Conditional Attributes (Real-Valued) | Number of Classes |
|---|---|---|---|---|---|---|
| Horse Colic | Horse | 368 | 22 | 15 | 7 | 2 |
| Credit Approval | Credit | 690 | 15 | 9 | 6 | 2 |
| German Credit Data | German | 1000 | 20 | 13 | 7 | 2 |
| Wisconsin Diagnostic Breast Cancer | WDBC | 569 | 30 | 0 | 30 | 2 |
| Libras Movement | Libras | 360 | 90 | 0 | 90 | 15 |
| Musk (Version 1) | Musk1 | 476 | 166 | 0 | 166 | 2 |
| Hill-Valley | HV | 606 | 100 | 0 | 100 | 2 |
| Wall-Following Robot Navigation Data | Robot | 5456 | 24 | 0 | 24 | 4 |
| Waveform Database Generator (Version 2) | WDG2 | 5000 | 40 | 0 | 40 | 3 |
Figure 1. Computation time of Algorithms 1 and 2 as the size of each data set increases.
Figure 2. Computation time of Algorithms 1 and 2 on the ten-fold data sets generated from each data set.
Average results of Algorithms 1 and 2 obtained from the ten-fold data sets.
| Data Set | Algorithm 2: Average Running Time (s) | Algorithm 2: \|·\| | Algorithm 1: Average Running Time (s) | Algorithm 1: \|·\| |
|---|---|---|---|---|
| Horse | 0.38 | 12.7 | 0.69 | 12.7 |
| Credit | 0.70 | 13.9 | 1.16 | 13.9 |
| German | 1.65 | 12.9 | 3.79 | 12.9 |
| WDBC | 3.20 | 30.0 | 3.85 | 30.0 |
| Libras | 7.94 | 71.4 | 11.48 | 71.4 |
| Musk1 | 30.69 | 112.4 | 54.69 | 112.4 |
| HV | 17.12 | 90.0 | 37.11 | 90.0 |
| Robot | 259.00 | 24.0 | 428.85 | 24.0 |
| WDG2 | 771.46 | 40.0 | 894.15 | 40.0 |
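To read the running times in the table above as relative gains, the short script below divides the Algorithm 1 time by the Algorithm 2 time for each data set. The figures are copied from the table; the variable names are illustrative only, and the ratios are plain arithmetic on reported averages rather than results stated by the authors.

```python
# Average running times in seconds, copied from the table above:
# data set -> (Algorithm 2, Algorithm 1)
times = {
    "Horse": (0.38, 0.69),
    "Credit": (0.70, 1.16),
    "German": (1.65, 3.79),
    "WDBC": (3.20, 3.85),
    "Libras": (7.94, 11.48),
    "Musk1": (30.69, 54.69),
    "HV": (17.12, 37.11),
    "Robot": (259.00, 428.85),
    "WDG2": (771.46, 894.15),
}

# Speedup of Algorithm 2 over Algorithm 1 (values above 1 mean faster).
speedup = {name: t1 / t2 for name, (t2, t1) in times.items()}

for name, s in sorted(speedup.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:7s} {s:.2f}x")
```

On these averages, every ratio is above 1; the largest is for German (about 2.3) and the smallest for WDG2 (about 1.16).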
Figure 3. Variation of with the increase of iteration number in Algorithm 2.