Giorgos Borboudakis, Ioannis Tsamardinos.
Abstract
Most feature selection methods identify only a single solution. This is acceptable for predictive purposes, but is not sufficient for knowledge discovery if multiple solutions exist. We propose a strategy to extend a class of greedy methods to efficiently identify multiple solutions, and show under which conditions it identifies all solutions. We also introduce a taxonomy of features that takes the existence of multiple solutions into account. Furthermore, we explore different definitions of statistical equivalence of solutions, as well as methods for testing equivalence. A novel algorithm for compactly representing and visualizing multiple solutions is also introduced. In experiments we show that (a) the proposed algorithm is significantly more computationally efficient than the TIE* algorithm, the only alternative approach with similar theoretical guarantees, while identifying similar solutions to it, and (b) the identified solutions have similar predictive performance.
Keywords: Feature selection; Multiple feature selection; Multiple solutions; Stepwise selection
Year: 2021 PMID: 34720675 PMCID: PMC8550441 DOI: 10.1007/s10618-020-00731-7
Source DB: PubMed Journal: Data Min Knowl Discov ISSN: 1384-5810 Impact factor: 3.670