Literature DB >> 36247211

Opposition-based sine cosine optimizer utilizing refraction learning and variable neighborhood search for feature selection.

Bilal H Abed-Alguni1, Noor Aldeen Alawad1, Mohammed Azmi Al-Betar2, David Paul3.   

Abstract

This paper proposes new improved binary versions of the Sine Cosine Algorithm (SCA) for the Feature Selection (FS) problem. FS is an essential machine learning and data mining task of choosing a subset of highly discriminating features from noisy, irrelevant, high-dimensional, and redundant features to best represent a dataset. SCA is a recent metaheuristic algorithm that emulates a model based on the sine and cosine trigonometric functions. It was initially proposed to tackle problems in the continuous domain. The SCA has been modified to Binary SCA (BSCA) to deal with the binary domain of the FS problem. To improve the performance of BSCA, three cumulative improved variations are proposed (i.e., IBSCA1, IBSCA2, and IBSCA3), where the last version has the best performance. IBSCA1 employs Opposition Based Learning (OBL) to help ensure a diverse population of candidate solutions. IBSCA2 improves IBSCA1 by adding Variable Neighborhood Search (VNS) and Laplace distribution to support several mutation methods. IBSCA3 improves IBSCA2 by optimizing the best candidate solution using Refraction Learning (RL), a novel OBL approach based on light refraction. For performance evaluation, 19 real-world datasets, including a COVID-19 dataset, were selected with different numbers of features, classes, and instances. Three performance measurements have been used to test the IBSCA versions: classification accuracy, number of features, and fitness values. Furthermore, the performance of the last variation, IBSCA3, is compared against 28 existing popular algorithms. Interestingly, IBSCA3 outperformed almost all comparative methods in terms of classification accuracy and fitness values. At the same time, it was ranked 15 out of 19 in terms of number of features. The overall simulation and statistical results indicate that IBSCA3 performs better than the other algorithms.


Keywords:  Feature selection; Laplace distribution; Mutation methods; Opposition-based learning; Refraction learning; Sine cosine algorithm

Year:  2022        PMID: 36247211      PMCID: PMC9547101          DOI: 10.1007/s10489-022-04201-z

Source DB:  PubMed          Journal:  Appl Intell (Dordr)        ISSN: 0924-669X            Impact factor:   5.019


Introduction

Back in 2003, the amount of generated data was around five exabytes. Nowadays, the same amount of data, and even more, is produced within two days [1]. This rapid increase in the volume, velocity and variety of data raises challenges and, at the same time, opportunities. Dealing with such data is a challenge, but there are opportunities to utilize the data for beneficial applications [2]. In order to perform data mining, data are first pre-processed [3], which involves cleaning and preparing the data to best meet the requirements of input for later stages. One possible pre-processing step is Feature Selection (FS) [3], which is a method of choosing a subset of features of a dataset that can best represent the data accurately without redundancy, noise, or repetition. FS is used in a wide range of applications, including data classification [4-6], data clustering [7-9], image processing [10-13], and text categorization [14, 15]. Generally speaking, FS techniques are either based on an evaluation criterion or on a search strategy. Evaluation criterion-based methods can be further classified as either filters or wrappers. The main difference between these two is the absence or presence (respectively) of a learning algorithm in the process to evaluate feature subsets. Chi-Square [16], Gain Ratio [17], Information Gain [18], support vector machines [19], ReliefF [20, 21], and hybrid ReliefF [22, 23] are filter methods. They depend upon correlations between features and classes in the dataset. Wrapper FS methods [24], on the other hand, utilize learning algorithms. A disadvantage of wrapper FS methods is their high computational cost; however, they often give precise results. Due to the huge search space, the FS problem has been shown to be NP-Hard [25, 26]. Thus, it is costly and time-consuming to employ exact methods to find a solution.
However, when searching for approximate solutions, randomization searching strategies, such as sequential forward, sequential backward, random, and heuristic [27], often enhance results. Further, metaheuristic algorithms often lead to efficient implementations of various FS methods. Metaheuristic algorithms use heuristic strategies or guidelines in optimization algorithms to solve complex optimization problems (e.g., the FS problem) in real time. Unlike single-purpose algorithms, metaheuristic algorithms can be used for many different optimization problems [27-33]. One major category of metaheuristic algorithms is Swarm Intelligence (SI), where creature swarms are the main inspiration (e.g., ants, flocks, bees) [34]. SI algorithms have been tested with various optimization problems, including FS. For instance, the authors of [35] utilize the powerful SI algorithm Grey Wolf Optimizer (GWO) with an FS problem, and the reported results showed respectable performance. Similarly, the Antlion Optimizer (ALO) [36] has been successfully used as a wrapper for an FS strategy, and the Whale Optimization Algorithm (WOA) has been utilised in several different implementations of FS algorithms [37-40], as have Particle Swarm Optimization (PSO) [41], Artificial Bee Colony (ABC) [42], Ant Colony Optimization (ACO) [43, 44], the Gravitational Search Algorithm [45], and the Salp Swarm Algorithm (SSA) [46-48]. Indeed, the hardness of tackling the FS problem increases considerably with the original problem's dimensions. For instance, when the FS data has n features, its search space contains 2^n different solutions. Thus, any metaheuristic algorithm used to tackle such an FS problem often requires modification to work well given the complex nature of the FS search space.
This is also reflected in the No Free Lunch (NFL) theorem [49], which states that no single algorithm can achieve the best performance for all optimization problems, or even for different instances of the same optimization problem. Therefore, research opportunities are still available to introduce new or modified metaheuristic algorithms for FS problems. Besides the previously mentioned SI algorithms, metaheuristic algorithms can imitate a physical rule, evolutionary phenomena, or a human-based technique [50]. To this end, Seyedali Mirjalili proposed a metaheuristic algorithm called the Sine Cosine Algorithm (SCA) [50] in 2016. SCA is a population-based algorithm inspired by the sine and cosine trigonometric functions. The simplicity, robustness and efficiency of the algorithm are SCA's main advantages. These characteristics have motivated others to apply SCA to different optimization problems. For example, SCA has been applied to truss structure optimization, an architecture-based optimization problem [51]. SCA has also been adapted for the travelling salesperson problem [52], text categorization [53], image segmentation [54], object tracking [55], unit commitment [56], optimal design of a shell and tube evaporator [57], abrupt motion tracking [58], and parameter optimization for support vector regression [59]. Because real-world problems are complex and have constraints, researchers have attempted to enhance SCA in a number of different ways. Firstly, SCA operators have been modified to deal with particular problems [60-63]. Alternatively, SCA has been hybridized with i) local-based algorithms [52, 64, 65], ii) population-based algorithms [66, 67], or iii) operators from other optimization algorithms [65, 68]. For instance, in [62] the SCA exploration and exploitation phases were managed by a nonlinear conversion parameter. In addition, to help avoid local optima, the position update equation was modified.
Another example of SCA hybridization is improving exploitation utilizing the Nelder-Mead simplex concept and the Opposition-Based Learning (OBL) searching strategy [64]. Further, the diversification of SCA has been enhanced by integrating SCA with a random mutation and a Gaussian local search technique [65]. Quite recently, Al-Betar et al. [69] introduced a memetic version of SCA to solve the economic load dispatch problem. In this approach, adaptive β-hill climbing [70] was hybridized with the optimization framework of SCA to better balance exploration and exploitation. SCA was initially proposed for continuous decision variables. However, with a mapping function (transforming a continuous search space to binary), a binary SCA (BSCA) version was introduced in [71], where it was implemented for an FS optimization problem and verified to be an efficient technique. The performance, accuracy, capability, and variety of decision variable types are the factors that motivated us to conduct the research described in this paper. We propose three versions of the Improved Binary Sine Cosine Algorithm (i.e., IBSCA1, IBSCA2, and IBSCA3) for the FS problem, in which different approaches to exploration and exploitation are conducted. This leads to the following contributions:
- We apply Opposition Based Learning (OBL) in IBSCA1 to ensure a diverse population of solutions. The use of OBL is expected to expand the search region and improve the solution's approximation.
- IBSCA2 builds on IBSCA1 and includes Variable Neighborhood Search (VNS) and Laplace distribution to explore the search space using several mutation methods (swap, insert, inverse, or random mutation). One of the advantages of VNS is that the mutated solution may break out of a local optimum.
- IBSCA3 builds on IBSCA2 and enhances the best candidate solution using Refraction Learning (RL). RL is a novel opposition learning approach that is based on the principle of light refraction. It is expected to improve the ability of IBSCA3 to jump out of local optima.
The three exploration techniques are applied in an incremental manner, where IBSCA3 implements all three. Our purpose here is to show that the incremental integration of each exploration method gradually improves the performance of IBSCA and eventually leads to a strong optimization algorithm (IBSCA3). The candidate solutions produced by the optimization process of SCA and RL are continuous. Therefore, we used the V3 transfer function to convert the values of continuous decision variables into binary ones. V3 was selected based on extensive simulations on eight binary transfer functions (4 S-shaped and 4 V-shaped transfer functions); the experimental results indicated that V3 is the most viable transfer function. We evaluate the variations of IBSCA utilizing 19 well-known datasets (18 FS datasets from the UCI repository and a COVID-19 dataset). IBSCA3 is found to be the most efficient version of IBSCA (Section 5.2). The performance of IBSCA3 was evaluated and compared to 10 popular binary algorithms (Section 5.3). The overall simulation results indicate that IBSCA3 outperformed all the compared algorithms in terms of accuracy and number of features selected over most of the datasets. We compared IBSCA3 to 10 state-of-the-art algorithms that adopt OBL-enhanced methods, VNS and Laplace distribution (Section 5.4), and found that IBSCA3 produces the best results among the compared algorithms. We also compared IBSCA3 to seven popular variations of SCA (Section 5.5); the experimental results indicate that IBSCA3 is the most accurate algorithm. The accumulative advantages proposed for IBSCA are included in IBSCA3, which can diversify the search through OBL, intensify the search through VNS, and escape local optima through RL.
By means of these improvements, a superior method (i.e., IBSCA3) is introduced for the FS problem. In general, the overall simulation results indicate that IBSCA3 outperforms the compared algorithms, based on accuracy and number of features selected, over almost all tested datasets. Note that there are two main differences between IBSCA3 and the other hybrid optimization algorithms that attempt to solve the FS problem. First, IBSCA3 is the only hybrid algorithm that combines OBL, RL, VNS and Laplace distribution in a single algorithm. Second, IBSCA3 is the first such algorithm to include Laplace distribution inside VNS. The rest of the paper is organized as follows: SCA optimization problem implementations and versions are highlighted in Section 2. Section 3 then reviews the binary Sine Cosine algorithm and the objective function used. The newly proposed Improved Binary SCA with multiple exploration and exploitation approaches (IBSCA) for solving the FS problem is presented in Section 4. For the purpose of evaluation, the algorithms' performances over different experiments are compared and discussed in Section 5. Lastly, Section 6 summarises the work and presents potential future research avenues.

Related work

Several discrete variations of SCA have been developed to solve the FS problem [48, 61, 72–78]. This section examines recently proposed variations of the SCA for global optimization and solving the FS problem. El-kenawy and Ibrahim [72] introduced a binary hybrid optimization algorithm (Binary SC-MWOA) that includes the SCA algorithm and a modified Whale Optimization algorithm. Binary SC-MWOA converts the continuous candidate solutions generated by the optimization operators of the SC and whale optimization algorithms into binary discrete solutions that can be used for the FS problem using the sigmoid function. Binary SC-MWOA was evaluated over 10 UCI repository datasets and compared to a number of popular optimization algorithms including the Grey Wolf Optimizer (GWO) [79], Whale Optimization Algorithms (WOA) [80] and memetic firefly algorithm. The Binary SC-MWOA was able to find an optimum subset of features with the best category error. Neggaz et al. [48] presented a new hybrid optimization algorithm for FS called ISSAFD that combines the optimization operators of the SC algorithm and the Disrupt Operator of the Salp Swarm Optimizer (SSA). ISSAFD optimizes followers’ positions in the SSA algorithm using sinusoidal mathematical functions similar to those in SCA operators. The disrupt operator diversifies the population of candidate solutions in the algorithm. The performance of ISSAFD was compared to many optimization algorithms including SSA, SCA, binary GWO (bGWO), PSO, ALO, and Genetic Algorithm (GA) over four well-known datasets. The simulation results suggested that ISSAFD was more accurate, had higher sensitivity, and chose fewer features than the other tested FS algorithms. Hussain et al. [73] suggested an algorithm to solve continuous optimization problems and the FS problem called SCHHO that integrates the SCA algorithm in the Harris Hawks Optimization (HHO) algorithm. The goal of SCHHO is to use SCA as an exploration method in HHO. 
In addition, the exploitation ability of HHO is improved in SCHHO by having candidate solutions adjust dynamically to help avoid staying in local optima. As reported in [73], SCHHO performs much better than popular optimization algorithms, including Dragonfly algorithm (DA), grasshopper optimization algorithm (GOA), GWO, WOA, and SSA. The wrapper-based Improved SCA (ISCA) [61] adds an Elitism strategy to SCA as well as a mechanism to update the best solution. The experimental results in [61] suggest that ISCA provides more accurate results and fewer features than GA, PSO and the original SCA algorithm. Abd Elaziz et al., [74] proposed SCADE, an algorithm that combines the differential evolution (DE) algorithm with the SCA algorithm. DE’s optimization operators are used at each iteration of SCA to improve its population of solutions. This helps the SCA algorithm avoid local optima. SCADE’s performance was assessed over eight UCI datasets with comparison to three popular algorithms (social spider optimization (SSO), ABC and ACO [74]), with SCADE obtaining the best results. Abualigah and Dulaimi [75] introduced the hybrid SCA and GA algorithm (SCAGA) for solving the FS problem. In SCAGA, the genetic optimization operators (crossover and mutation) are used to improve the optimization process of SCA and balance between its exploration and exploitation of candidate solutions. SCAGA was compared to SCA, PSO, and ALO using 16 UCI datasets. SCAGA was found to be a better feature-selection method than the other tested algorithms in terms of the maximum obtained accuracy and minimal obtained features. Sindhu et al., [77] proposed an algorithm named Improved Biogeography Based Optimization (IBBO) for solving the FS problem. IBBO attempts to improve the optimization process of Biogeography Based Optimization (BBO) by employing the optimization operators of SCA after the migration operator of BBO. 
The performance of IBBO was compared to the performance of popular optimization algorithms such as BBO, SCA, GA, PSO, and ABC using four popular datasets. The simulation results suggest that IBBO is more accurate and selects fewer features compared to the other FS algorithms. SCA may get stuck in sub-optimal regions during its optimization process. This is because its exploration operators (i.e., the two trigonometric functions of SCA) are unable to efficiently explore the search space. Abd Elaziz et al., [76] proposed Opposition-based SCA (OBSCA), which is a variation of SCA that uses the OBL technique to improve the performance of SCA. In OBSCA, OBL selects the best candidate solutions and generates their opposite solutions in an attempt to lead to more accurate solutions. OBSCA was compared in [76] to several optimization algorithms including SCA, Harmony Search (HS), GA, and PSO using standard optimization test functions and real-world engineering problems. OBSCA performed competitively compared to the other algorithms. Kumar and Bharti [78] proposed the Hybrid Binary PSO and SCA algorithm (HBPSOSCA). In this algorithm, a V-shaped transfer function converts continuous candidate solutions into binary solutions. The effectiveness of HBPSOSCA was compared in [78] to binary PSO, modified BPSO with chaotic inertia weight, binary moth flame optimization algorithm, binary DA, binary WOA, binary SCA, and binary ABC using 10 standard benchmark functions and seven real-world datasets. The conducted experiments showed that HBPSOSCA exhibited better performance in most of the tested cases. ASOSCA [81] is a hybrid optimization algorithm based on the Atom Search Optimization (ASO) algorithm and the SCA algorithm. It is basically used for automatic clustering. In ASOSCA, SCA is used to improve the quality of candidate solutions (i.e., reduce the number of features and improve accuracy of the solutions) in ASO. 
The performance of ASOSCA was compared in [81] to other optimization methods (e.g., SCA, ASO, PSO) using 16 clustering datasets and different cluster validity indexes. ASOSCA performed better than the other tested algorithms. The Artificial Algae Algorithm (AAA) is a metaheuristic for solving continuous optimization problems [82]. It was originally inspired by the living behaviors of microalgae, a photosynthetic species. Turkoglu et al. [83] proposed eight binary versions of the AAA algorithm for solving the FS problem. Each binary version of AAA uses a different transfer function (four V-shaped and four S-shaped transfer functions). The performance of the binary versions of AAA was compared to the performance of seven well-known optimization algorithms (BBA, binary CS, binary Firefly algorithm, binary GWO, binary Moth flame algorithm, binary PSO, binary WOA [83]) using the UCI datasets. The experimental results indicate that the binary versions of AAA outperform the other tested algorithms. The Horse herd Optimization Algorithm (HOA) is a metaheuristic that simulates the survival behaviour of a pack of horses in solving NP-hard optimization problems [84]. Awadallah et al. [85] proposed fifteen binary versions of HOA (BHOA) for solving the FS problem. The fifteen variations of BHOA were created by combining three popular crossover operators (one-point, two-point and uniform operators) with three transfer-function categories (S-shaped, V-shaped and U-shaped transfer functions). The versions of BHOA were tested and evaluated against each other using 24 real-world datasets, and the experimental findings suggest that the best version of BHOA is the one with the S-shaped transfer function and one-point crossover. The Black Widow Optimization (BWO) algorithm is a new population-based optimization algorithm that mimics the mating process of black-widow spiders to solve continuous optimization problems [86].
However, the BWO algorithm converges slowly to solutions when attempting to solve hard optimization problems. Therefore, an enhanced version of BWO (SDABWO) was proposed in [87] to improve the convergence behaviour of BWO and solve the FS problem. Three techniques were integrated in SDABWO. First, the spouses of male spiders are chosen based on a computational procedure that takes into consideration the weight of female spiders and the distance between spiders. Second, the mutation operators of differential evolution are used in SDABWO at its mutation phase in order to escape from local optima. Lastly, the three key parameters of SDABWO (procreating rate, cannibalism rate, and mutation rate) are adjusted dynamically over the course of the simulation process. SDABWO was compared to five well-established optimization algorithms (GWO, PSO, DE, BOA, HHO) using 12 datasets from the UCI repository. The experimental results indicate that SDABWO outperforms the other compared algorithms. The chimp optimization algorithm (ChOA) is an optimization algorithm that is inspired by the behaviour of individual chimps in their group hunting for prey [88]. This algorithm was originally proposed for solving continuous optimization problems. The binary chimp optimization algorithm (BChOA) for solving the FS problem was introduced in [89]. BChOA has two variations, which result from combining ChOA with the one-point crossover operator and two transfer-function categories (S-shaped and V-shaped transfer functions). The two versions of BChOA were compared to six popular metaheuristics (GA, PSO, BA, ACO, firefly algorithm, and flower pollination) and the results revealed that the two versions of BChOA perform better than the other tested algorithms. The Hunger Games Search Optimization (HGSO) algorithm is an optimization algorithm for continuous mathematical problems. It was inspired by the anxiety of prey about being eaten by their predators [90].
Devi et al. [91] presented two binary versions of the HGSO algorithm for the FS problem. These versions use V-shaped and S-shaped transfer functions to map continuous solutions to binary ones. Binary HGSO was compared to well-known optimization algorithms (e.g., binary GWO and BSCA) using 16 datasets from the UCI repository. The simulation results demonstrated that the binary HGSO versions are more accurate and select fewer features than the other tested algorithms. In summary, many of the hybrid SCA variations in this section, including Binary SC-MWOA, ISSAFD, SCHHO, SCADE, HBPSOSCA and SCAGA, have internal parameters that require fine tuning and use iterative-based optimization operators inside their optimization loops (e.g., the crossover and mutation operators in SCAGA). In general, when compared to traditional optimization algorithms, hybrid methods use more computations (e.g., ASOSCA, HBPSOSCA, SCHHO). We are encouraged to use SCA in this new work because the candidate solutions in SCA can easily be converted to binary solutions using the transfer function described in Section 4.3.

Binary version of sine cosine algorithm for FS

The Sine Cosine Algorithm (SCA) [50], summarized in code in Algorithm 3 and pictorially in Fig. 1, iteratively optimizes a population of candidate solutions using basic trigonometric functions. A candidate solution is usually made of m decision variables X = <x1, x2, ..., xm>, each initially generated randomly between the lower (LB) and upper (UB) bound for the variable. Once an initial population of candidate solutions has been randomly generated, SCA uses the problem's fitness function to calculate a fitness value for each candidate solution. The iterative optimization process of SCA then begins, and the decision variables of each candidate solution are updated as follows:

x_i^(t+1) = x_i^t + r1 × sin(r2) × |r3 × P_i^t − x_i^t|,  if r4 < 0.5
x_i^(t+1) = x_i^t + r1 × cos(r2) × |r3 × P_i^t − x_i^t|,  if r4 ≥ 0.5     (1)

where r1, r2, r3 and r4 are random numbers and P_i^t is the position of the destination point in dimension i at iteration t. In detail, r1 is used to balance between exploration and exploitation of the range of the trigonometric functions in (1). The value of r1 is selected at each iteration of SCA as follows:

r1 = a − t × (a / T)     (2)

where a is a constant, t is the iteration number and T is the maximum number of iterations. r2 ∈ [0, 2π] specifies the distance and direction of the movement related to the destination. r3 ∈ [0, 2] determines the weight of the destination point P_i^t. The fourth parameter r4 ∈ [0, 1] is a random number used to choose one of the two options in (1).
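The position-update rule above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation; the function and parameter names (sca_update, lb, ub, the default a = 2) are assumptions for demonstration.

```python
import math
import random

def sca_update(population, best, a=2.0, t=0, T=100, lb=-10.0, ub=10.0):
    """One SCA iteration: move each candidate toward the destination point
    (the best solution so far) using the sine/cosine update rule."""
    r1 = a - t * (a / T)  # Eq. (2): decays linearly from a to 0 over T iterations
    new_pop = []
    for x in population:
        new_x = []
        for d, xd in enumerate(x):
            r2 = random.uniform(0.0, 2.0 * math.pi)  # distance/direction of movement
            r3 = random.uniform(0.0, 2.0)            # weight of the destination point
            r4 = random.random()                     # chooses sine or cosine branch
            step = r1 * abs(r3 * best[d] - xd)
            if r4 < 0.5:
                xd = xd + step * math.sin(r2)        # sine branch of Eq. (1)
            else:
                xd = xd + step * math.cos(r2)        # cosine branch of Eq. (1)
            new_x.append(min(max(xd, lb), ub))       # clamp to the variable bounds
        new_pop.append(new_x)
    return new_pop
```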
Fig. 1

The flowchart of SCA algorithm

The FS problem is a binary optimization problem. A hypercube represents its search space, and a bit flip in the candidate vector X = <x1, x2, ..., xm> changes the candidate's position in the search space. However, given that SCA was originally designed for continuous optimization problems, a mapping function is needed. The transfer function (TF) proposed by [92] is utilized to map a candidate continuous value to its corresponding binary value. In this paper, the use of the TF follows the approach described in [93]. In more detail, the TF is applied as follows. First, the probability of flipping a bit is calculated using (3):

TF(v_i^d(t)) = |v_i^d(t) / √(1 + (v_i^d(t))²)|     (3)

where v_i^d(t) refers to the d-th dimension of the step (velocity) vector of the i-th candidate solution at the current iteration t. Next, the decision value is updated based on (4), in which a random number r ∈ [0,1] is generated and, if the probability of flipping is greater than r, a bit flip takes place on the d-th element of the position vector X_i(t + 1):

x_i^d(t+1) = ¬x_i^d(t)  if TF(v_i^d(t)) > r,  otherwise x_i^d(t)     (4)

This TF is called V-shaped and is visualized in Fig. 2.
Fig. 2

V-shaped Transfer function

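The flip-based V-shaped mapping described above can be sketched as follows. This is an illustrative sketch under the assumption that the V3 form is used as the transfer function; the function names are not from the paper.

```python
import math
import random

def v_shaped_tf(v):
    """V3-style transfer function: maps a continuous step value to a
    flip probability in [0, 1)."""
    return abs(v / math.sqrt(1.0 + v * v))

def binarize_step(x_bits, velocity):
    """Flip each bit of the binary position with probability TF(v_d):
    generate r in [0, 1) and flip when TF(v_d) > r, otherwise keep the bit."""
    out = []
    for bit, v in zip(x_bits, velocity):
        if v_shaped_tf(v) > random.random():
            out.append(1 - bit)  # flip the bit
        else:
            out.append(bit)      # keep the bit
    return out
```

Larger |v| values give flip probabilities closer to 1, so candidates taking big continuous steps are more likely to change their binary position.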

Objective function

In every optimization problem, there must be an objective function: an evaluation function used to measure a solution's effectiveness. In the case of the FS optimization problem, a wrapper (optimizer) aims to i) minimize the number of selected features, and ii) increase the classification accuracy. Therefore, the developed objective function is as illustrated in (5):

Fitness = α × ERR(D) + β × (|R| / |N|)     (5)

The focus is to minimize the classification error rate and the selection ratio, where the classification error rate is denoted as ERR(D) and the selection ratio is calculated by dividing the number of selected features (|R|) by the total number of features (|N|). α ∈ [0,1] is the weight assigned to the classification error rate, and β = 1 − α is the weight assigned to the selection ratio [94].
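The weighted objective can be written directly in code. This is a minimal sketch; the default α = 0.99 is a value commonly used in the wrapper-FS literature, not necessarily the setting used in this paper.

```python
def fs_fitness(error_rate, n_selected, n_total, alpha=0.99):
    """Wrapper FS objective: alpha * ERR(D) + beta * |R| / |N|,
    with beta = 1 - alpha. Lower is better: it trades off the
    classification error rate against the selection ratio."""
    beta = 1.0 - alpha
    return alpha * error_rate + beta * (n_selected / n_total)
```

For example, with α = 0.9, a 10% error rate and 5 of 10 features selected gives 0.9 × 0.1 + 0.1 × 0.5 = 0.14.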

Proposed algorithm: an improved binary sine cosine algorithm with multiple exploration and exploitation approaches for feature selection

We present three versions of our binary optimization algorithm called Improved Binary SCA with multiple exploration and exploitation approaches (IBSCA) which can be used to solve FS problems. Algorithm 2 and the flowchart in Fig. 3 present the details of this approach. Three exploration techniques are applied in an accumulative manner to the three versions of IBSCA (IBSCA1, IBSCA2, IBSCA3), where IBSCA3 uses all of the three exploration techniques. The three versions of IBSCA are as follows:
Fig. 3

The flowchart of IBSCA

IBSCA1: OBL is used as the exploration method.
IBSCA2: Builds on IBSCA1 by additionally using the VNS method combined with the Laplace distribution to explore the search space using several mutation methods.
IBSCA3: Builds on IBSCA2 by additionally using Refraction Learning to improve the current best candidate solution at each iteration of the optimization loop of SCA.
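To make the VNS neighbourhood moves concrete, the four mutation methods mentioned for IBSCA2 (swap, insert, inverse, and random mutation) can be sketched for binary vectors as below. These operator definitions are one plausible reading, and the uniform choice among them is a simplification: the paper couples the choice of mutation method to a Laplace distribution.

```python
import random

def swap_mutation(x):
    """Exchange the values at two randomly chosen positions."""
    i, j = random.sample(range(len(x)), 2)
    x = x[:]
    x[i], x[j] = x[j], x[i]
    return x

def insert_mutation(x):
    """Remove a randomly chosen element and re-insert it at another position."""
    i, j = random.sample(range(len(x)), 2)
    x = x[:]
    bit = x.pop(i)
    x.insert(j, bit)
    return x

def inverse_mutation(x):
    """Reverse the segment between two randomly chosen positions."""
    i, j = sorted(random.sample(range(len(x)), 2))
    return x[:i] + x[i:j + 1][::-1] + x[j + 1:]

def random_mutation(x):
    """Flip a single randomly chosen bit."""
    x = x[:]
    i = random.randrange(len(x))
    x[i] = 1 - x[i]
    return x

def vns_step(x):
    """Apply one randomly chosen neighbourhood move (uniform choice here;
    the paper draws on a Laplace distribution instead)."""
    op = random.choice([swap_mutation, insert_mutation,
                        inverse_mutation, random_mutation])
    return op(x)
```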

Representation of candidate solutions

A candidate solution for an FS problem with m features is a vector of m binary decision variables. Given a candidate solution X, x_i = 1 means that the i-th feature is included in X, whereas x_i = 0 means that it is not. Table 1 shows an example candidate solution consisting of 10 decision variables X = <x1 = 0, x2 = 1, x3 = 1, ..., x10 = 1>.
Table 1

A sample binary candidate solution

Dimension   1  2  3  4  5  6  7  8  9  10
x_i         0  1  1  0  1  0  1  1  0  1

Population initialization

The performance of optimization algorithms can be improved by a diversified initial population of solutions [95-97]. One possible way to create a diverse initial population is by using the opposition-based learning (OBL) approach. OBL is an intelligent method developed from the observation that considering opposite candidate solutions can lead to improved search times [98]. It can be applied to the decision variables in machine learning, optimization and search algorithms. For example, if X = <x1, x2, ..., xm> is a candidate solution with m decision variables, the opposite candidate solution X̄ = <x̄1, x̄2, ..., x̄m> is computed as follows:

x̄_i = LB_i + UB_i − x_i

where LB_i is the lower bound for variable i and UB_i is its upper bound. The initialization stage is similar in all versions of IBSCA. In this stage, the first half of the population is generated randomly. The remainder of the population is generated by applying OBL to the first half (Line 1 in Algorithm 2). The use of OBL is expected to expand the search region and improve the solution's approximation. While OBL can also be applied in the initialization stage of other optimization algorithms (e.g., Cuckoo Search [96, 99], Grey Wolf Optimizer [100], Whale Optimization [101]), as can be seen in Section 5.2, the performance of IBSCA using only OBL is slightly better than the performance of BSCA. This makes it a good base on which to later combine VNS, Laplace distribution, and RL to strongly improve IBSCA's performance.
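The OBL initialization described above (random first half, opposite second half) can be sketched as follows. This is an illustrative sketch; the function names are assumptions, not identifiers from the paper.

```python
import random

def opposite(x, lb, ub):
    """Opposition-based learning: the opposite of x_i is LB_i + UB_i - x_i."""
    return [lo + hi - xi for xi, lo, hi in zip(x, lb, ub)]

def init_population(n, m, lb, ub):
    """Generate half the population uniformly at random within the bounds,
    then mirror it with OBL to obtain the other half (cf. Line 1 of
    Algorithm 2)."""
    half = [[random.uniform(lb[d], ub[d]) for d in range(m)]
            for _ in range(n // 2)]
    return half + [opposite(x, lb, ub) for x in half]
```

Since each opposite solution lies on the far side of the interval midpoint from its original, the two halves together cover the search region more evenly than a purely random population of the same size.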

Discretization strategy

Candidate solutions produced by the optimization process of SCA and RL are continuous. Therefore, we use two-step transfer functions to convert the continuous decision variables into binary ones (lines 8 and 10). Table 2 shows eight binary transfer functions (4 S-shaped and 4 V-shaped transfer functions). We conducted extensive simulations to verify the efficiency of these transfer functions and found that V3 was the most viable transfer function. The experimental results in [93, 102] confirm our conclusion about V3. Thus, V3 is adopted in our experiments.
Table 2

S-shaped and V-shaped transfer functions

Name   S-shaped function         Name   V-shaped function
S1     1/(1 + e^(−2x))           V1     |erf((√π/2)·x)|
S2     1/(1 + e^(−x))            V2     |tan(x)|
S3     1/(1 + e^(−x/2))          V3     |x/√(1 + x²)|
S4     1/(1 + e^(−x/3))          V4     |(2/π)·arctan((π/2)·x)|
In V3, each decision variable x_i^t in a candidate solution at iteration t is used to calculate the probability of altering its bit. The probability is calculated as follows:

T(x_i^t) = |x_i^t / sqrt(1 + (x_i^t)^2)|.

Then, x_i^(t+1) is set to 0 or 1 as follows:

x_i^(t+1) = complement(x_i^t) if r < T(x_i^t), and x_i^(t+1) = x_i^t otherwise,

where r ∈ [0,1] is generated randomly. The chance of flipping the bit increases as T(x_i^t) increases.
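The two-step binarization with V3 can be sketched as below; the function names are illustrative, and the flip-the-previous-bit semantics follows the V-shaped convention described above:

```python
import math
import random

def v3(x):
    """V3 transfer function: |x / sqrt(1 + x^2)|, a flip probability in [0, 1)."""
    return abs(x / math.sqrt(1.0 + x * x))

def binarize(x_cont, x_bin, rng):
    """Two-step binarization: flip each previous bit with probability V3(x_i)."""
    return [(1 - b) if rng.random() < v3(x) else b
            for x, b in zip(x_cont, x_bin)]
```

A small continuous value maps to a small flip probability, so solutions that have nearly converged change few bits, while large updates trigger more flips.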

Fitness function

In wrapper FS methods, we seek to minimize the number of selected features while maximizing classification accuracy. These two conflicting goals should both be taken into account in the fitness function. We adopted the following fitness function in our proposed algorithm:

F(X) = α · ERR + β · (|R| / |N|),

where F(X) is the fitness of candidate solution X, ERR is the error rate obtained by a k-Nearest Neighbor classifier using X, |R| is the number of selected features in X, |N| is the total number of features in the dataset, α is the weight for ERR, and β = 1 − α is the weight for the selection ratio |R|/|N|.
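This fitness function is a one-liner; a minimal sketch (the error rate is passed in, since computing it requires running the classifier on the selected feature subset):

```python
def fitness(error_rate, n_selected, n_total, alpha=0.99):
    """F(X) = alpha * ERR + (1 - alpha) * |R| / |N|; lower is better."""
    beta = 1.0 - alpha
    return alpha * error_rate + beta * (n_selected / n_total)
```

With α = 0.99 (Table 3), accuracy dominates, and the feature-count term only breaks ties between subsets with nearly equal error rates.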

Optimization loop

The optimization loop of IBSCA starts at Line 3 in Algorithm 2 and ends at Line 15. The first step is to evaluate each candidate solution using the fitness function (Section 4.4). Then, the random parameters of the algorithm are initialized (r1, r2, r3 and r4) and the best solution is determined (P = X∗). Afterwards, all the candidate solutions are updated using (1), and the two-step transfer function (Section 4.3) is applied to the updated solutions to generate their binary equivalents. In Line 9, RL is applied to the best solution X∗ as described in Section 4.5.1, and the result is converted to a binary solution using the two-step transfer function. Finally, a combination of variable neighborhood search and the Laplace distribution (lines 11-14) is applied to a randomly selected solution from the current population, as described in Section 4.5.2.
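The continuous update in (1) is the standard SCA position update; a minimal per-solution sketch, with r1 decaying linearly from a to 0 and the r2, r3, r4 ranges taken from Table 3:

```python
import math
import random

def sca_update(x, best, t, max_iter, a=2.0, rng=None):
    """One SCA position update (Eq. (1)): each dimension moves toward the
    best solution P along a sine or cosine trajectory depending on r4."""
    rng = rng or random.Random(1)
    r1 = a - t * (a / max_iter)  # decreases linearly from a to 0
    new_x = []
    for xi, pi in zip(x, best):
        r2 = rng.uniform(0.0, 2.0 * math.pi)
        r3 = rng.uniform(0.0, 2.0)
        r4 = rng.random()
        trig = math.sin(r2) if r4 < 0.5 else math.cos(r2)
        new_x.append(xi + r1 * trig * abs(r3 * pi - xi))
    return new_x
```

Because r1 shrinks to zero, steps are large (exploratory) early in the run and vanish at the final iteration, where the update leaves the solution unchanged.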

Refraction learning

IBSCA3 applies RL to the current best solution to improve it. In this section, we describe RL and then show how it is used in IBSCA3. The refraction of light occurs when a light ray hits an interface between two different media (e.g., air and water). The ray bends as its velocity changes when it crosses the boundary between the two media. RL is an OBL method based on this principle of light refraction. The one-dimensional spatial refraction-learning process for the global optimum X∗ at iteration t is illustrated in Fig. 4 [95, 103].
Fig. 4

Refraction Learning for the Global Optimal x∗

The opposite of X∗ can be calculated using refraction learning as follows:

x̄∗ = (LB + UB)/2 + (LB + UB)/(2η) − x∗/η,

where η is the refraction index, given by

η = sin θ1 / sin θ2,

where

sin θ1 = ((LB + UB)/2 − x∗) / h  and  sin θ2 = (x̄∗ − (LB + UB)/2) / h̄.

In the above equations, x∗ represents the incidence point (original candidate solution) while x̄∗ is the refraction point (opposite candidate solution). O denotes the center point of the search interval [LB, UB], h denotes the distance between x∗ and O, and h̄ denotes the distance between x̄∗ and O. In general, (11) can handle n decision variables as follows:

x̄∗_j = (LB_j + UB_j)/2 + (LB_j + UB_j)/(2η) − x∗_j/η,

where x∗_j and x̄∗_j are the jth decision variable of X∗ and X̄∗, respectively, and LB_j and UB_j are the lower and upper bounds of the jth decision variable, respectively. In IBSCA3, (11) is applied to the best solution yet discovered (Line 9 in Algorithm 2).
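A minimal per-dimension sketch of the RL opposite, with the distance ratio h/h̄ folded into the single coefficient η (i.e., assuming h = h̄):

```python
def refraction_opposite(x, lb, ub, eta=1.0):
    """Refraction-learning opposite of a solution, per dimension:
    x_opp_j = (lb_j + ub_j)/2 + (lb_j + ub_j)/(2*eta) - x_j/eta.
    With eta = 1 this reduces to plain opposition: lb + ub - x."""
    return [(l + u) / 2.0 + (l + u) / (2.0 * eta) - xj / eta
            for xj, l, u in zip(x, lb, ub)]
```

Varying η moves the opposite point closer to or farther from the center of the interval, which is what lets RL generate opposites that are not simple mirror images.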

Variable neighborhood search with Laplace distribution

Two versions of IBSCA (IBSCA2 and IBSCA3) employ a combination of the Laplace distribution and the VNS method. In this section, we first explain the Laplace distribution and the VNS method, and then show how they are applied in these algorithms. Variable Neighborhood Search (VNS) is a powerful metaheuristic for solving combinatorial optimization problems. The primary goal of VNS is to enhance a candidate solution by performing a series of operations (e.g., mutation) on it; the resulting neighboring solution may escape a local optimum. The optimization process of VNS is iterative and moves between adjacent solutions in an attempt to identify a better candidate [97, 104]. The Laplace distribution is suitable for stochastic modeling because it is stable under geometric, rather than ordinary, summation [105, 106]. The Laplace density function is given by

f(x | a, b) = (1/(2b)) exp(−|x − a| / b),

and the cumulative Laplace distribution is then defined as follows:

F(x) = (1/2) exp((x − a)/b) for x ≤ a, and F(x) = 1 − (1/2) exp(−(x − a)/b) for x > a,

where a ∈ R is the location parameter and b > 0 is the scale parameter. IBSCA2 and IBSCA3 employ this combination of the Laplace distribution and the VNS method in lines 11 to 14 of Algorithm 2. In detail, these algorithms randomly pick a candidate solution at iteration t from the current population of solutions. They then generate a random number r ∈ [0,1] using the Laplace distribution, and r is used as a probability to select one of four operations on the selected candidate solution (swap, insert, inverse, or random mutation), as follows. The swap operator randomly selects two decision variables in the candidate solution (say x_i and x_j) and then exchanges their values, as illustrated in Fig. 5.
Fig. 5

Swap operator between x3 and x6

The insert operator randomly selects two decision variables (say x_i and x_j) in the candidate solution and then shifts the values between x_i and x_j down one position, inserting x_j into the position of x_i, as illustrated in Fig. 6.
Fig. 6

Insert operator between x2 and x9

The inverse operator, shown in Fig. 7, randomly selects two decision variables (x_i and x_j) in the candidate solution and then reverses the order of the values from x_i to x_j.
Fig. 7

Inverse operator between x3 and x6

The random operator, shown in Fig. 8, randomly selects a number of decision variables (say p of them) in the candidate solution and then flips the binary value of each selected decision variable.
Fig. 8

Random operator for x3, x6 and x9

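The four operators and the Laplace-driven choice among them can be sketched as below. The quartile thresholds and the Laplace parameters (a = 0.5, b = 0.1) are illustrative assumptions, not values given in the paper; the insert variant shown moves one selected variable to the other's position, shifting the values in between:

```python
import math
import random

def laplace_sample(rng, a=0.5, b=0.1):
    """Inverse-CDF sample from Laplace(a, b)."""
    u = rng.random() - 0.5
    return a - b * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def swap(sol, i, j):
    s = sol[:]
    s[i], s[j] = s[j], s[i]
    return s

def insert(sol, i, j):
    s = sol[:]
    s.insert(j, s.pop(i))  # move x_i to position j, shifting values in between
    return s

def inverse(sol, i, j):
    s = sol[:]
    s[i:j + 1] = reversed(s[i:j + 1])  # reverse the segment from i to j
    return s

def random_flip(sol, positions):
    s = sol[:]
    for p in positions:
        s[p] = 1 - s[p]  # flip the binary value
    return s

def mutate(sol, rng):
    """Pick one of the four operators using a Laplace-distributed r in [0, 1]."""
    i, j = sorted(rng.sample(range(len(sol)), 2))
    r = min(1.0, max(0.0, laplace_sample(rng)))
    if r < 0.25:
        return swap(sol, i, j)
    if r < 0.5:
        return insert(sol, i, j)
    if r < 0.75:
        return inverse(sol, i, j)
    k = rng.randint(1, len(sol))
    return random_flip(sol, rng.sample(range(len(sol)), k))
```

All four operators preserve the solution length and keep it binary, so the mutated solution can be evaluated directly by the fitness function.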

Computational complexity of IBSCA

The purpose of this section is to show the detailed computational complexity of IBSCA. We assume that the cost of any basic vector operation is O(1) and denote MaxItr as M. The computational complexity of IBSCA (Algorithm 2) can be calculated as follows. In Line 1(a), generating n/2 candidate solutions using a random generation function requires O(n/2) operations. In Line 1(b), generating n/2 opposite candidate solutions using OBL (7) requires O(n/2) operations. Line 2 requires O(1) operations. The internal operations inside the while loop (lines 3 to 15) are as follows: evaluating the fitness of the candidate solutions requires O(n) operations (Line 4); updating the best candidate solution so far (P = X∗) requires O(n) operations (Line 5); generating four random numbers requires O(1) operations (Line 6); updating the candidate solutions using (1) requires O(n) operations (Line 7); applying the two-step transfer function (Section 4.3) to the updated candidate solutions requires O(n) operations (Line 8); applying RL to the best solution X∗ requires O(1) operations (Line 9); applying the two-step transfer function to the solution updated by RL requires O(1) operations (Line 10); selecting a random solution from the current population requires O(1) operations (Line 11); generating a random number r ∈ [0,1] based on the Laplace distribution requires O(1) operations (Line 12); selecting one of the four moves based on the value of r requires O(1) operations (Line 13); and Line 14 requires O(1) operations. Overall, the cost of the operations in the while loop (lines 3 to 15) is O(M(n + n + 1 + n + n + 1 + 1 + 1 + 1 + 1 + 1)), where M is the maximum number of iterations, which reduces to O(M·n). The total number of operations in IBSCA (lines 1 to 16) is therefore O(n/2 + n/2 + 1 + M·n + 1), which reduces to O(M·n) because the M·n term dominates n + 2. In summary, the computational complexity of IBSCA is O(M·n).

Experiments

In this section, we first demonstrate the performance of the three variations of IBSCA when solving the FS problem. The detailed characteristics of the datasets used are presented in Section 5.1. Section 5.2 compares the convergence behavior of the original Binary Sine Cosine Algorithm (BSCA) [107] with that of the three variations of IBSCA over the UCI datasets. Section 5.3 shows the performance of IBSCA3 in comparison to other well-known FS algorithms. Table 3 lists the parameter settings of our proposed approach. The parameter values of all of the algorithms were fine-tuned based on several experiments; thus, the algorithms in this section were compared to each other using their best parameter settings. Since the optimization algorithms are stochastic in nature, we executed each algorithm for 30 independent runs. We executed our experiments on a Windows 7 computer with an Intel Core i7-3517U CPU @ 1.90 GHz (up to 2.40 GHz) and 8.0 GB of memory.
Table 3

Parameters Settings

Parameter                          Value
Population size (search agents)    10
Number of iterations               100
Dimension                          Number of features
Number of runs                     30
α in fitness function              0.99
a                                  2
r1                                 decreases linearly from a to 0
r2                                 a random number in the range [0, 2π]
r3                                 a random number in the range [0, 2]
r4                                 a random number in the range [0, 1]
rLaplace                           a random number in the range [0, 1]

Datasets properties

The performance of IBSCA was evaluated using nineteen datasets (18 from the UCI repository [108] and a real-world COVID-19 dataset). Table 4 provides a description of these datasets in terms of their dimensions, number of instances, and number of classes. All datasets were split randomly into 80% training instances and 20% testing instances [38], and the k-nearest neighbors (KNN) classifier was used. KNN is a supervised machine learning method for solving classification and regression problems [102].
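For illustration, a minimal pure-Python KNN classifier of the kind described above (majority vote among the k nearest training points); the experiments themselves would use a standard library implementation:

```python
from collections import Counter

def knn_predict(train_X, train_y, x, k=3):
    """Classify x by majority vote among its k nearest training points
    (squared Euclidean distance)."""
    order = sorted(range(len(train_X)),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]
```

In the wrapper setting, the same classifier is run using only the features selected by a candidate solution, and its error rate on the test split feeds the fitness function of Section 4.4.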
Table 4

Datasets description

Dataset        No. of Attributes   No. of Objects   No. of Classes
Breastcancer   9                   699              2
BreastEW       30                  569              2
Exactly        13                  1000             2
Exactly2       13                  1000             2
HeartEW        13                  270              2
Lymphography   18                  148              4
M-of-n         13                  1000             2
PenglungEW     325                 73               7
SonarEW        60                  208              2
SpectEW        22                  267              2
CongressEW     16                  435              2
IonosphereEW   34                  351              2
KrvskpEW       36                  3196             2
Tic_tac-toe    9                   958              2
Vote           16                  300              2
WaveformEW     40                  5000             3
WineEW         13                  178              3
Zoo            16                  101              7
COVID-19       15                  1085             2

Convergence behavior of BSCA vs three variations of IBSCA

Figures 9, 10 and 11 show the convergence behavior of BSCA, IBSCA1, IBSCA2 and IBSCA3 over the UCI datasets. In each chart of these figures, the x-axis represents the iteration number, and the y-axis represents the fitness value. The convergence charts show that IBSCA3 converges faster to good solutions than all of the other algorithms for all of the datasets. The superiority of IBSCA3 is mainly because it uses three exploration techniques. First, it uses OBL when initializing the population to improve quality and diversity. Second, it integrates the VNS and Laplace distribution to explore the search space using multiple mutation methods. Third, it uses RL to search the neighborhood of best candidate solutions for better solutions.
Fig. 9

Convergence behavior of BSCA, IBSCA1, IBSCA2 and IBSCA3 over the datasets: Breastcancer, BreastEW, CongressEW, Exactly, Exactly2 and HeartEW

Fig. 10

Convergence behavior of BSCA, IBSCA1, IBSCA2 and IBSCA3 over the datasets: IonosphereEW, KrvskpEW, Lymphography, M-of-n, penglungEW and SonarEW

Fig. 11

Convergence behavior of BSCA, IBSCA1, IBSCA2 and IBSCA3 over the datasets: SpectEW, Tic-tac-toe, Vote, WaveformEW, WineEW and Zoo

The second best performing algorithm was IBSCA2, which uses two of the three exploration techniques employed by IBSCA3. IBSCA1, which uses only one exploration technique, was the third best performing algorithm. BSCA exhibits the worst convergence behavior, most likely because it does not use any of these additional exploration techniques.

Performance analysis of IBSCA3 compared to baseline algorithms

In this section, we present a comparison between IBSCA3 and other binary versions of the baseline algorithms: BSCA, Random based Binary Dragonfly Algorithm (RBDA) [102], Linear based Binary Dragonfly Algorithm (LBDA) [102], Quadratic based Binary Dragonfly Algorithm (QBDA) [102], Sinusoidal based Binary Dragonfly Algorithm (SBDA) [102], Binary Gray Wolf Optimizer (BGWO) [109], Binary Gravitational Search Algorithm (BGSA) [109], and Binary Bat Algorithm (BBA) [109]. These algorithms were compared according to their classification accuracy, number of selected features, and best fitness values. We also compared IBSCA3 to the Coronavirus Herd Immunity Optimizer (CHIO) [110] and the Coronavirus Herd Immunity Optimizer with Greedy Crossover (CHIO-GC) [110]. Table 5 shows the parameter settings of these algorithms, as in [102, 110].
Table 5

Parameter settings of the baseline algorithms

Algorithm   Parameter settings
RBDA        Population size = 10, Number of iterations = 100, α = 0.99, β = 0.01
LBDA        Population size = 10, Number of iterations = 100, α = 0.99, β = 0.01
QBDA        Population size = 10, Number of iterations = 100, α = 0.99, β = 0.01
SBDA        Population size = 10, Number of iterations = 100, α = 0.99, β = 0.01
BGWO        Population size = 10, Number of iterations = 100, a = [2, 0]
BGSA        Population size = 10, Number of iterations = 100, G0 = 100, α = 20
BBA         Population size = 10, Number of iterations = 100, frequency minimum Qmin = 0, frequency maximum Qmax = 2, loudness A = 0.5, pulse rate r = 0.5
CHIO        HIS = 30, Max_Age = 100, BRr = 0.01, Max_Itr = 100, LB = 0, UB = 1
CHIO-GC     HIS = 30, Max_Age = 100, BRr = 0.01, Max_Itr = 100, LB = 0, UB = 1
Table 6 shows the average value and standard deviation of the results obtained by the proposed IBSCA3 algorithm and the compared algorithms in terms of classification accuracy. IBSCA3 outperforms the other algorithms and obtains the best classification accuracy on all of the UCI and COVID-19 datasets.
Table 6

Average and standard deviation of classification accuracy for the proposed IBSCA3 algorithm in comparison to existing algorithms

Dataset        Metric  IBSCA3  BSCA   RBDA   LBDA   QBDA   SBDA   BGWO   BGSA   BBA    CHIO   CHIO-GC
Breastcancer   Avg     0.997   0.965  0.983  0.978  0.993  0.993  0.978  0.948  0.932  N/A    N/A
               StDev   0.000   0.002  0.004  0.002  0.001  0.000  0.01   0.02   0.051  N/A    N/A
BreastEW       Avg     1.000   0.979  1.000  0.987  0.980  0.975  0.923  0.928  0.913  0.899  0.94
               StDev   0.000   0.005  0.008  0.008  0.006  0.006  0.015  0.014  0.035  0.021  0.019
Exactly        Avg     1.000   0.985  1.000  1.000  0.994  1.000  0.835  0.732  0.602  N/A    N/A
               StDev   0.000   0.037  0.003  0.000  0.020  0.000  0.077  0.124  0.055  N/A    N/A
Exactly2       Avg     0.823   0.783  0.797  0.780  0.785  0.757  0.674  0.644  0.683  N/A    N/A
               StDev   0.009   0.038  0.015  0.002  0.000  0.014  0.041  0.041  0.04   N/A    N/A
HeartEW        Avg     0.926   0.798  0.839  0.901  0.880  0.867  0.788  0.77   0.728  0.854  0.912
               StDev   0.041   0.047  0.011  0.034  0.019  0.009  0.039  0.066  0.061  0.027  0.018
Lymphography   Avg     0.967   0.814  0.930  0.913  0.924  0.954  0.842  0.864  0.689  0.761  0.834
               StDev   0.021   0.035  0.021  0.019  0.023  0.016  0.057  0.081  0.103  0.035  0.027
M-of-n         Avg     1.000   0.984  1.000  1.000  0.999  1.000  0.913  0.827  0.716  N/A    N/A
               StDev   0.000   0.005  0.000  0.000  0.004  0.000  0.052  0.061  0.083  N/A    N/A
PenglungEW     Avg     1.000   0.977  0.959  1.000  1.000  1.000  0.869  0.949  0.816  N/A    N/A
               StDev   0.068   0.000  0.039  0.000  0.000  0.000  0.012  0.054  0.054  N/A    N/A
SonarEW        Avg     0.995   0.952  0.964  0.944  0.948  0.993  0.887  0.865  0.814  N/A    N/A
               StDev   0.008   0.015  0.017  0.019  0.012  0.011  0.04   0.047  0.059  N/A    N/A
SpectEW        Avg     0.941   0.862  0.894  0.923  0.890  0.925  0.818  0.785  0.756  N/A    N/A
               StDev   0.017   0.007  0.010  0.010  0.013  0.011  0.029  0.034  0.039  N/A    N/A
CongressEW     Avg     1.000   0.961  0.976  0.999  0.993  0.975  0.95   0.943  0.869  N/A    N/A
               StDev   0.000   0.014  0.003  0.004  0.006  0.005  0.047  0.026  0.08   N/A    N/A
IonosphereEW   Avg     0.993   0.975  0.970  0.970  0.923  0.984  0.891  0.869  0.866  N/A    N/A
               StDev   0.004   0.013  0.013  0.009  0.012  0.011  0.025  0.026  0.027  N/A    N/A
KrvskpEW       Avg     0.984   0.962  0.975  0.981  0.968  0.966  0.935  0.898  0.79   N/A    N/A
               StDev   0.007   0.005  0.004  0.006  0.004  0.004  0.019  0.053  0.09   N/A    N/A
Tic-tac-toe    Avg     0.869   0.811  0.820  0.839  0.847  0.832  0.806  0.761  0.658  N/A    N/A
               StDev   0.000   0.049  0.005  0.000  0.005  0.005  0.029  0.038  0.081  N/A    N/A
Vote           Avg     0.998   0.984  0.996  0.971  0.959  0.972  0.939  0.943  0.856  N/A    N/A
               StDev   0.004   0.038  0.007  0.010  0.008  0.008  0.021  0.025  0.102  N/A    N/A
WaveformEW     Avg     0.791   0.724  0.766  0.760  0.738  0.776  0.705  0.697  0.659  N/A    N/A
               StDev   0.013   0.018  0.009  0.010  0.008  0.011  0.015  0.021  0.046  N/A    N/A
WineEW         Avg     1.000   0.947  0.991  1.000  1.000  1.000  0.938  0.976  0.838  N/A    N/A
               StDev   0.000   0.049  0.013  0.000  0.000  0.000  0.036  0.035  0.131  N/A    N/A
Zoo            Avg     1.000   0.994  1.000  1.000  1.000  1.000  0.993  0.995  0.867  N/A    N/A
               StDev   0.000   0.028  0.000  0.000  0.000  0.000  0.023  0.015  0.114  N/A    N/A
COVID-19       Avg     0.952   0.894  N/A    N/A    N/A    N/A    N/A    N/A    N/A    0.914  0.937
               StDev   0.008   0.028  N/A    N/A    N/A    N/A    N/A    N/A    N/A    0.025  0.019

Values in bold indicate the best results in the table

Table 7 presents the average number of selected features for the tested algorithms. IBSCA3 outperforms the other tested algorithms on 14 out of 18 datasets. This is better than the second best algorithm (SBDA), which outperforms the remaining compared algorithms on 11 out of 18 datasets.
Table 7

Average and standard deviation of average selected features for the proposed IBSCA3 algorithm in comparison to existing algorithms

Dataset  Metric  IBSCA3  BSCA  RBDA  LBDA  QBDA  SBDA  BGWO  BGSA  BBA
BreastcancerAvg2.715.015.074.933.0356.44.474.1
StDev0.120.671.310.250.1801.751.011.27
BreastEWAvg7.868.399.0711.713.3312.221.5714.9311.77
StDev1.691.781.741.972.512.544.823.94
ExactlyAvg5.928.046.076.137.036.1310.77.675.23
StDev1.830.090.250.350.850.352.021.492.25
Exactly2Avg1.016.382.831.31.035.036.976.135.77
StDev0.142.183.111.640.183.762.742.081.57
HeartEWAvg5.647.036.136.46.336.039.76.635.07
StDev1.821.471.251.281.060.961.991.941.7
LymphographyAvg5.916.049.438.077.676.8310.696.87
StDev0.772.031.811.511.840.912.632.181.96
M-of-nAvg5.276.886.076.076.976.0710.438.25.73
StDev1.940.540.250.250.670.251.451.161.82
PenglungEWAvg84.62106.38110.299.9132.47117.53152.33145.1126.47
StDev9.3310.3911.358.453.829.774.8815.62
SonarEWAvg22.1425.9123.126.5328.324.3334.8727.0723.53
StDev2.693.573.064.033.622.527.813.645.15
SpectEWAvg4.7810.139.575.29.48.5713.779.778.73
StDev2.162.562.372.311.941.632.932.32.29
StDev1.721.421.230.861.221.51.881.912.18
IonosphereEWAvg10.6515.731113.6312.9312.6716.1714.912.3
StDev1.832.812.33.152.992.172.352.893.4
KrvskpEWAvg13.0721.4518.9318.9720.619.5730.919.7314.97
StDev2.712.562.122.832.092.432.932.362.88
Tic-tac-toeAvg4.15.96.776.936.938.35.64.3
StDev0.651.060.4700.370.371.240.971.7
VoteAvg4.217.964.34.636.2348.637.376.1
StDev0.871.820.531.451.770.982.631.672.14
WaveformEWAvg18.5435.9421.421.421.7721.8334.0721.616.23
StDev3.494.933.542.32.712.654.483.694.08
WineEWAvg3.025.947.133.434.074.47.376.574.87
StDev0.431.531.430.680.691.071.671.361.87
ZooAvg1.65.633.44.24.51.977.376.976.43
StDev0.830.920.560.410.730.961.631.251.83
COVID-19 datasetAvg2.953.05N/AN/AN/AN/AN/AN/AN/A
StDev0.740.183N/AN/AN/AN/AN/AN/AN/A

Values in bold indicate the best results in the table

Table 8 illustrates the best fitness values obtained by the tested algorithms. We can observe that IBSCA3 shows superior performance over the other algorithms, obtaining the best fitness values on all datasets.
Table 8

Average and standard deviation of the best fitness value for the proposed IBSCA3 algorithm in comparison to existing algorithms

Dataset        Metric  IBSCA3  BSCA   RBDA   LBDA   QBDA   SBDA   BGWO   BGSA   BBA
Breastcancer   Avg     0.010   0.029  0.023  0.028  0.011  0.013  0.016  0.027  0.036
               StDev   0.001   0.002  0.002  0.001  0.002  0      0.002  0.007  0.005
BreastEW       Avg     0.002   0.022  0.003  0.017  0.025  0.029  0.043  0.039  0.036
               StDev   0.001   0.001  0.001  0.008  0.005  0.006  0.007  0.01   0.009
Exactly        Avg     0.003   0.024  0.006  0.005  0.012  0.005  0.185  0.253  0.303
               StDev   0       0.002  0.003  0      0.02   0      0.051  0.094  0.108
Exactly2       Avg     0.201   0.29   0.204  0.219  0.214  0.245  0.249  0.288  0.25
               StDev   0.009   0.12   0.18   0      0      0.011  0.014  0.014  0.015
HeartEW        Avg     0.103   0.196  0.165  0.104  0.124  0.137  0.128  0.137  0.161
               StDev   0.029   0.025  0.011  0.032  0.019  0.008  0.026  0.03   0.023
Lymphography   Avg     0.037   0.11   0.075  0.091  0.079  0.049  0.083  0.081  0.162
               StDev   0.012   0.019  0.02   0.018  0.022  0.016  0.035  0.033  0.053
M-of-n         Avg     0.005   0.009  0.005  0.005  0.007  0.005  0.087  0.165  0.165
               StDev   0       0      0.027  0      0.004  0      0.039  0.041  0.044
PenglungEW     Avg     0.002   0.048  0.044  0.003  0.004  0.004  0.126  0.004  0.132
               StDev   0       0.019  0.038  0      0      0      0.025  0      0.038
SonarEW        Avg     0.009   0.054  0.039  0.059  0.057  0.011  0.104  0.082  0.11
               StDev   0.007   0.025  0.017  0.019  0.011  0.011  0.02   0.023  0.03
SpectEW        Avg     0.063   0.136  0.11   0.079  0.113  0.079  0.143  0.153  0.143
               StDev   0.007   0.016  0.009  0.009  0.012  0.01   0.016  0.018  0.021
CongressEW     Avg     0.003   0.034  0.028  0.005  0.011  0.029  0.028  0.032  0.032
               StDev   0.001   0.004  0.003  0.003  0.005  0.004  0.01   0.013  0.015
IonosphereEW   Avg     0.018   0.091  0.033  0.033  0.081  0.020  0.099  0.127  0.124
               StDev   0.006   0.012  0.013  0.009  0.012  0.01   0.013  0.011  0.019
KrvskpEW       Avg     0.021   0.043  0.03   0.024  0.038  0.039  0.051  0.099  0.093
               StDev   0.007   0.005  0.003  0.006  0.004  0.004  0.009  0.049  0.039
Tic-tac-toe    Avg     0.157   0.214  0.187  0.169  0.160  0.175  0.177  0.232  0.232
               StDev   0.003   0.005  0.004  0      0.005  0.004  0.008  0.024  0.022
Vote           Avg     0.004   0.034  0.007  0.032  0.044  0.030  0.048  0.038  0.063
               StDev   0.003   0.008  0.007  0.01   0.007  0.008  0.009  0.009  0.017
WaveformEW     Avg     0.209   0.276  0.237  0.243  0.264  0.227  0.237  0.251  0.251
               StDev   0.009   0.012  0.008  0.009  0.008  0.011  0.008  0.013  0.016
WineEW         Avg     0.003   0.008  0.015  0.003  0.003  0.004  0.045  0.009  0.025
               StDev   0.000   0.009  0.013  0.001  0.001  0.001  0.017  0.012  0.017
Zoo            Avg     0.001   0.007  0.002  0.003  0.003  0.001  0.007  0.005  0.052
               StDev   0.001   0.002  0      0      0      0.001  0.01   0.001  0.032
COVID-19       Avg     0.002   0.013  N/A    N/A    N/A    N/A    N/A    N/A    N/A
               StDev   0.034   0.072  N/A    N/A    N/A    N/A    N/A    N/A    N/A

Values in bold indicate the best results in the table

In summary, the enhanced version of the Binary Sine Cosine algorithm outperformed the other algorithms for all of the tested datasets, with IBSCA3 providing the highest classification accuracy and the lowest fitness values for all datasets with different dimensions, and the lowest average number of selected features in most cases. The overall results indicate that IBSCA3 converges faster than the other algorithms to the most accurate solutions with the fewest features. The original SCA employs a random update method to update its solutions, which negatively affects its ability to balance exploration and exploitation of the search space. In contrast, IBSCA3 improves exploration and exploitation in the original SCA by employing several techniques. First, it employs an OBL approach to improve the diversity of the initial population. Second, it integrates VNS and the Laplace distribution to explore the search space using multiple mutation methods. Third, it uses RL to search the neighborhood of the best candidate solutions for better ones. The overall results indicate that IBSCA3 improves the performance and convergence behavior of the original SCA in solving the FS problem.

Performance analysis of IBSCA3 compared to state-of-the-art algorithms that adopt OBL-enhanced methods, VNS and Laplace distribution

In this section, we present a comparison between IBSCA3 and other recent algorithms that incorporate OBL into their basic structure. These algorithms are: Improved Salp Swarm Algorithm based on opposition based learning and novel local search algorithm for feature selection (ISSA) [111], Improved Harris Hawks Optimization using elite opposition-based learning and novel search mechanism for feature selection (IHHO) [112], and New feature selection methods based on opposition-based learning and self-adaptive cohort intelligence for predicting patient no-shows (OSACI) [113]. We also compare IBSCA3 with other recent algorithms that employ similar methods (VNS and the Laplace distribution): A variable neighborhood search algorithm for human resource selection and optimization problem in the home appliance manufacturing industry (VNS-HRS) [114], Improving feature selection performance for classification of gene expression data using Harris Hawks Optimizer with variable neighborhood learning (VNLHHO) [115], Improved equilibrium optimization algorithm using elite opposition-based learning and new local search strategy for feature selection in medical datasets (IEOA) [116], Dynamic salp swarm algorithm for feature selection (DSSA) [117], Semi-supervised feature selection with minimal redundancy based on local adaptive (SFS-LARLRM) [118], and Binary optimization using hybrid grey wolf optimization for feature selection (BGWOPSO) [119]. Table 9 shows the parameter settings of these algorithms, as in [111-119].
Table 9

Parameter settings of ISSA, IHHO, OSACI, VNS-HRS, VNLHHO, IEOA, DSSA, SFS-LARLRM and BGWOPSO

Algorithm    Parameter settings
ISSA         Population size = 10, Number of iterations = 40
IHHO         Population size = 10, Number of iterations = 50, α = 0.99, β = 0.01
OSACI        Population size = 100, Number of iterations = 50
VNS-HRS      Population size = 10, Number of iterations = 40
VNLHHO       Population size = 30, Number of iterations = 100
IEOA         Population size = 10, Number of iterations = 50, α = 0.99, β = 0.01
DSSA         Population size = 10, Number of iterations = 100, c2 = rand(), c3 = rand()
SFS-LARLRM   Population size = 10, Number of iterations = 100, k = 5, σ = {10^-7, 10^-5, 10^-3, 10^-1, 10^0, 10^1, 10^3, 10^5, 10^7}, p = {0.25, 0.5, 0.75, 1}
BGWOPSO      Population size = 10, Number of iterations = 100, c1 = 0.5, c2 = 0.5, c3 = 0.5, w = 0.5 + rand()/2, l ∈ [0, 1]
Table 10 shows a comparison of the average classification accuracy achieved by the proposed IBSCA3 algorithm, BSCA, and the other algorithms that incorporate OBL, VNS and the Laplace distribution. In Table 10, we report the average value and standard deviation of the results. IBSCA3 delivers the best classification accuracy on all of the UCI and COVID-19 datasets except one, where it is second best.
Table 10

Average and standard deviation of classification accuracy for the proposed IBSCA3 algorithm in comparison to BSCA and the other algorithms that incorporate OBL, VNS and Laplace distribution

Dataset        Metric  IBSCA3  BSCA   ISSA   IHHO   OSACI  VNS-HRS  VNLHHO  IEOA   DSSA   SFS-LARLRM  BGWOPSO
Breastcancer   Avg     0.997   0.965  0.952  0.986  0.991  0.994    0.935   0.964  0.931  0.995       0.978
               StDev   0.000   0.002  0.007  0.003  0.005  0.013    0.027   0.039  0.022  0.068       0.009
BreastEW       Avg     1.000   0.979  0.962  1.000  1.000  0.983    0.955   0.914  0.913  0.977       0.986
               StDev   0.000   0.005  0.012  0.000  0.000  0.019    0.057   0.041  0.097  0.088       0.062
Exactly        Avg     1.000   0.911  0.907  1.000  0.903  1.000    1.000   0.892  1.000  1.000
               StDev   0.000   0.037  0.014  0.002  0.004  0.000    0.001   0.051  0.023  0.000       0.000
Exactly2       Avg     0.823   0.783  0.724  0.687  0.719  0.753    0.796   0.616  0.698  0.781       0.765
               StDev   0.009   0.038  0.019  0.025  0.042  0.028    0.009   0.077  0.093  0.013       0.005
HeartEW        Avg     0.926   0.798  0.887  0.758  0.849  0.803    0.861   0.912  0.829  0.905       0.873
               StDev   0.041   0.047  0.052  0.078  0.066  0.017    0.048   0.034  0.056  0.016       0.037
Lymphography   Avg     0.967   0.814  0.930  0.913  0.921  0.954    0.842   0.864  0.689  0.761       0.838
               StDev   0.021   0.035  0.021  0.019  0.023  0.016    0.057   0.081  0.103  0.035       0.027
M-of-n         Avg     1.000   0.984  1.000  1.000  0.999  1.000    0.913   0.827  0.716  0.792       0.891
               StDev   0.000   0.005  0.000  0.000  0.004  0.000    0.052   0.061  0.083  0.010       0.007
PenglungEW     Avg     1.000   0.977  0.959  1.000  1.000  1.000    0.869   0.949  0.816  0.758       0.896
               StDev   0.068   0.000  0.039  0.000  0.000  0.000    0.012   0.054  0.054  0.025       0.007
SonarEW        Avg     0.995   0.952  0.962  0.944  0.948  0.993    0.887   0.865  0.814  0.836       0.923
               StDev   0.008   0.015  0.017  0.019  0.012  0.011    0.04    0.047  0.059  0.016       0.004
SpectEW        Avg     0.941   0.862  0.894  0.923  0.890  0.925    0.818   0.785  0.756  0.869       0.927
               StDev   0.017   0.007  0.010  0.010  0.013  0.011    0.029   0.034  0.039  0.062       0.004
CongressEW     Avg     1.000   0.961  0.976  0.999  0.993  1.000    0.952   0.943  0.869  0.954       0.981
               StDev   0.000   0.014  0.003  0.004  0.006  0.005    0.047   0.026  0.008  0.061       0.017
IonosphereEW   Avg     0.993   0.978  0.934  0.978  0.916  0.951    0.984   0.871  0.906  0.982       0.972
               StDev   0.004   0.013  0.006  0.018  0.023  0.007    0.036   0.044  0.012  0.072       0.004
KrvskpEW       Avg     0.984   0.962  0.918  0.932  0.946  0.901    0.979   0.813  0.969  0.972       0.955
               StDev   0.007   0.005  0.003  0.010  0.052  0.037    0.026   0.062  0.008  0.017       0.003
Tic-tac-toe    Avg     0.869   0.811  0.820  0.839  0.845  0.832    0.806   0.761  0.658  0.728       0.813
               StDev   0.000   0.049  0.005  0.000  0.005  0.005    0.029   0.038  0.081  0.061       0.006
Vote           Avg     0.998   0.984  0.963  0.971  0.969  0.988    0.922   0.929  0.905  0.891       0.993
               StDev   0.004   0.038  0.009  0.018  0.047  0.014    0.053   0.082  0.005  0.067       0.003
WaveformEW     Avg     0.791   0.724  0.783  0.854  0.758  0.772    0.763   0.709  0.684  0.691       0.782
               StDev   0.013   0.018  0.015  0.012  0.009  0.038    0.026   0.047  0.013  0.023       0.019
WineEW         Avg     1.000   0.947  0.991  1.000  0.985  1.000    0.923   1.000  0.908  0.862       1.000
               StDev   0.000   0.049  0.016  0.000  0.009  0.000    0.027   0.000  0.083  0.099       0.000
Zoo            Avg     1.000   0.994  0.991  1.000  0.986  1.000    0.982   0.919  1.000  0.971       1.000
               StDev   0.000   0.028  0.007  0.000  0.005  0.000    0.019   0.007  0.000  0.005       0.000
COVID-19       Avg     0.952   0.894  0.915  0.949  0.927  0.916    0.893   0.918  0.939  0.872       0.945
               StDev   0.008   0.028  0.006  0.007  0.019  0.048    0.061   0.052  0.013  0.031       0.009

Values in bold indicate the best results in the table


Performance analysis of IBSCA3 compared to state-of-the-art SCA algorithms

A comparison of IBSCA3 with other SCA variants is presented in this section. These variants include: An efficient hybrid sine-cosine Harris Hawks Optimization for low and high-dimensional feature selection (SCHHO) [73], A novel feature selection method for data mining tasks using hybrid Sine Cosine Algorithm and Genetic Algorithm (SCAGA) [75], A Hybrid Feature Selection Framework Using Improved Sine Cosine Algorithm with Metaheuristic Techniques (MetaSCA) [120], A novel hybrid BPSO–SCA approach for feature selection (BPSO–SCA) [78], Boosting Salp Swarm Algorithm by Sine Cosine algorithm and Disrupt Operator for Feature Selection (ISSAFD), and An improved sine cosine algorithm to select features for text categorization (ISCA) [121]. Table 11 shows the parameter settings of these algorithms, as in [72, 73, 75, 78, 120, 121].
Table 11

Parameter settings of SCHHO, SCAGA, MetaSCA, BPSO–SCA, ISSAFD and ISCA

| Algorithm | Parameter settings |
| SCHHO | Population size = 10, number of iterations = 100, α = 2 |
| SCAGA | Population size = 5, number of iterations = 80, pm = 0.02, α = 0.01, β = 0.99 |
| MetaSCA | Population size = 30, number of iterations = 300 |
| BPSO–SCA | Population size = 50, number of iterations = 150, e1 = 1.5, e2 = 1.5 |
| ISSAFD | Population size = 10, number of iterations = 100, γ = 0.99, μ = 0.01, c1 = 2, c2 = 2, ps = 0.7, pm = 0.2, Rate = 0.8 |
| ISCA | Population size = 30, number of iterations = 0, a = 1, b = 8 |
Table 12 displays the average classification accuracy of the proposed IBSCA3 algorithm, BSCA, and the other state-of-the-art SCA algorithms; we report the average and standard deviation of the results. IBSCA3 consistently outperforms the other algorithms on the UCI and COVID-19 datasets, achieving the best overall classification accuracy.
Table 12

Average and standard deviation of classification accuracy for the proposed IBSCA3 algorithm in comparison to BSCA and the other SCA variants

| Dataset | Metric | IBSCA3 | BSCA | SCHHO | SCAGA | MetaSCA | BPSO–SCA | ISSAFD | ISCA |
| Breastcancer | Avg | 0.997 | 0.965 | 0.936 | 0.957 | 0.921 | 0.906 | 0.983 | 0.891 |
| | StDev | 0.000 | 0.002 | 0.004 | 0.007 | 0.012 | 0.003 | 0.001 | 0.028 |
| BreastEW | Avg | 1.000 | 0.979 | 0.946 | 0.956 | 0.961 | 0.929 | 1.000 | 0.983 |
| | StDev | 0.000 | 0.005 | 0.011 | 0.016 | 0.007 | 0.009 | 0.000 | 0.031 |
| Exactly | Avg | 1.000 | 0.985 | 0.988 | 1.000 | 0.961 | 0.973 | 1.000 | 0.934 |
| | StDev | 0.000 | 0.037 | 0.005 | 0.000 | 0.003 | 0.001 | 0.000 | 0.014 |
| Exactly2 | Avg | 0.823 | 0.783 | 0.751 | 0.701 | 0.687 | 0.723 | 0.816 | 0.656 |
| | StDev | 0.009 | 0.038 | 0.014 | 0.038 | 0.015 | 0.022 | 0.006 | 0.043 |
| HeartEW | Avg | 0.926 | 0.798 | 0.812 | 0.803 | 0.825 | 0.813 | 0.852 | 0.739 |
| | StDev | 0.041 | 0.047 | 0.078 | 0.067 | 0.082 | 0.079 | 0.052 | 0.066 |
| Lymphography | Avg | 0.967 | 0.814 | 0.918 | 0.857 | 0.912 | 0.931 | 0.953 | 0.836 |
| | StDev | 0.021 | 0.035 | 0.015 | 0.023 | 0.036 | 0.031 | 0.045 | 0.025 |
| M-of-n | Avg | 1.000 | 0.984 | 1.000 | 0.932 | 0.908 | 0.951 | 1.000 | 0.881 |
| | StDev | 0.000 | 0.005 | 0.000 | 0.011 | 0.008 | 0.016 | 0.000 | 0.037 |
| PenglungEW | Avg | 1.000 | 0.977 | 0.946 | 0.981 | 0.915 | 0.966 | 1.000 | 0.904 |
| | StDev | 0.068 | 0.000 | 0.019 | 0.023 | 0.035 | 0.017 | 0.006 | 0.052 |
| SonarEW | Avg | 0.995 | 0.952 | 0.931 | 0.926 | 0.951 | 0.961 | 0.988 | 0.917 |
| | StDev | 0.008 | 0.015 | 0.036 | 0.022 | 0.017 | 0.028 | 0.013 | 0.042 |
| SpectEW | Avg | 0.941 | 0.862 | 0.853 | 0.819 | 0.779 | 0.825 | 0.858 | 0.841 |
| | StDev | 0.017 | 0.007 | 0.015 | 0.027 | 0.019 | 0.064 | 0.019 | 0.032 |
| CongressEW | Avg | 1.000 | 0.961 | 0.959 | 0.912 | 0.942 | 0.938 | 0.955 | 0.917 |
| | StDev | 0.000 | 0.014 | 0.026 | 0.039 | 0.052 | 0.061 | 0.017 | 0.062 |
| IonosphereEW | Avg | 0.993 | 0.975 | 0.963 | 0.958 | 0.916 | 0.937 | 0.972 | 0.856 |
| | StDev | 0.004 | 0.013 | 0.021 | 0.035 | 0.042 | 0.051 | 0.019 | 0.063 |
| KrvskpEW | Avg | 0.984 | 0.962 | 0.954 | 0.936 | 0.947 | 0.925 | 0.961 | 0.899 |
| | StDev | 0.007 | 0.005 | 0.007 | 0.012 | 0.007 | 0.005 | 0.003 | 0.018 |
| Tic-tac-toe | Avg | 0.869 | 0.811 | 0.831 | 0.806 | 0.826 | 0.802 | 0.842 | 0.783 |
| | StDev | 0.000 | 0.049 | 0.031 | 0.027 | 0.016 | 0.024 | 0.002 | 0.053 |
| Vote | Avg | 0.998 | 0.984 | 0.975 | 0.943 | 0.922 | 0.941 | 0.987 | 0.916 |
| | StDev | 0.004 | 0.038 | 0.017 | 0.023 | 0.015 | 0.037 | 0.013 | 0.041 |
| WaveformEW | Avg | 0.791 | 0.724 | 0.739 | 0.718 | 0.725 | 0.776 | 0.748 | 0.616 |
| | StDev | 0.013 | 0.018 | 0.015 | 0.023 | 0.031 | 0.026 | 0.013 | 0.039 |
| WineEW | Avg | 1.000 | 0.947 | 1.000 | 0.959 | 0.936 | 0.920 | 1.000 | 0.886 |
| | StDev | 0.000 | 0.049 | 0.000 | 0.019 | 0.021 | 0.028 | 0.000 | 0.035 |
| Zoo | Avg | 1.000 | 0.994 | 1.000 | 0.986 | 0.974 | 0.938 | 1.000 | 0.926 |
| | StDev | 0.000 | 0.028 | 0.000 | 0.005 | 0.009 | 0.016 | 0.000 | 0.014 |
| COVID-19 | Avg | 0.952 | 0.894 | 0.918 | 0.872 | 0.904 | 0.918 | 0.932 | 0.897 |
| | StDev | 0.008 | 0.028 | 0.016 | 0.033 | 0.024 | 0.018 | 0.011 | 0.007 |

Bold values indicate the best results in the table


Performance analysis of IBSCA3 compared to other new nature-inspired metaheuristic algorithms

This section compares IBSCA3 with other new nature-inspired metaheuristic algorithms, including: a novel binary Farmland Fertility Algorithm (BFFAG) [122], the African vultures optimization algorithm (AVOA) [123], and the artificial gorilla troops optimizer (GTO) [124]. Table 13 shows the parameter settings of these algorithms, as in [122–124].
Table 13

Parameter settings of BFFAG, AVOA and GTO

| Algorithm | Parameter settings |
| BFFAG | Population size = 10, number of iterations = 50, W = 1, Q = 0.7, R = 0.9 |
| AVOA | Population size = 30, number of iterations = 500, L1 = 0.8, L2 = 0.2, w = 2.5, p1 = 0.6, p2 = 0.4, p3 = 0.6 |
| GTO | Population size = 30, number of iterations = 500, β = 3, W = 0.8, p = 0.03 |
A comparison of the average classification accuracy achieved by the proposed IBSCA3 algorithm, BSCA, and the other new nature-inspired metaheuristic algorithms is shown in Table 14, where we report the average and standard deviation of the results. IBSCA3 delivers the best classification accuracy on all UCI and COVID-19 datasets.
Table 14

Average and standard deviation of classification accuracy for the proposed IBSCA3 algorithm in comparison to BSCA and the other new nature-inspired metaheuristic algorithms

| Dataset | Metric | IBSCA3 | BSCA | BFFAG | AVOA | GTO |
| Breastcancer | Avg | 0.997 | 0.965 | 0.972 | 0.985 | 0.994 |
| | StDev | 0.000 | 0.002 | 0.005 | 0.003 | 0.001 |
| BreastEW | Avg | 1.000 | 0.979 | 0.981 | 0.995 | 1.000 |
| | StDev | 0.000 | 0.005 | 0.009 | 0.006 | 0.005 |
| Exactly | Avg | 1.000 | 0.985 | 0.991 | 1.000 | 1.000 |
| | StDev | 0.000 | 0.037 | 0.006 | 0.000 | 0.000 |
| Exactly2 | Avg | 0.823 | 0.783 | 0.809 | 0.815 | 0.822 |
| | StDev | 0.009 | 0.038 | 0.009 | 0.001 | 0.000 |
| HeartEW | Avg | 0.926 | 0.798 | 0.813 | 0.857 | 0.913 |
| | StDev | 0.041 | 0.047 | 0.027 | 0.018 | 0.007 |
| Lymphography | Avg | 0.967 | 0.814 | 0.933 | 0.951 | 0.959 |
| | StDev | 0.021 | 0.035 | 0.028 | 0.015 | 0.013 |
| M-of-n | Avg | 1.000 | 0.984 | 0.988 | 1.000 | 1.000 |
| | StDev | 0.000 | 0.005 | 0.005 | 0.000 | 0.000 |
| PenglungEW | Avg | 1.000 | 0.977 | 0.962 | 1.000 | 1.000 |
| | StDev | 0.068 | 0.000 | 0.026 | 0.000 | 0.000 |
| SonarEW | Avg | 0.995 | 0.952 | 0.971 | 0.978 | 0.986 |
| | StDev | 0.008 | 0.015 | 0.013 | 0.011 | 0.009 |
| SpectEW | Avg | 0.941 | 0.862 | 0.885 | 0.919 | 0.937 |
| | StDev | 0.017 | 0.007 | 0.025 | 0.016 | 0.007 |
| CongressEW | Avg | 1.000 | 0.961 | 0.974 | 0.986 | 0.992 |
| | StDev | 0.000 | 0.014 | 0.010 | 0.005 | 0.002 |
| IonosphereEW | Avg | 0.993 | 0.975 | 0.979 | 0.988 | 0.991 |
| | StDev | 0.004 | 0.013 | 0.011 | 0.007 | 0.003 |
| KrvskpEW | Avg | 0.984 | 0.962 | 0.975 | 0.978 | 0.981 |
| | StDev | 0.007 | 0.005 | 0.010 | 0.008 | 0.005 |
| Tic-tac-toe | Avg | 0.869 | 0.811 | 0.836 | 0.852 | 0.861 |
| | StDev | 0.000 | 0.049 | 0.0041 | 0.003 | 0.001 |
| Vote | Avg | 0.998 | 0.984 | 0.987 | 0.991 | 0.997 |
| | StDev | 0.004 | 0.038 | 0.009 | 0.007 | 0.005 |
| WaveformEW | Avg | 0.791 | 0.724 | 0.753 | 0.779 | 0.788 |
| | StDev | 0.013 | 0.018 | 0.015 | 0.014 | 0.012 |
| WineEW | Avg | 1.000 | 0.947 | 0.995 | 1.000 | 1.000 |
| | StDev | 0.000 | 0.049 | 0.009 | 0.000 | 0.000 |
| Zoo | Avg | 1.000 | 0.992 | 0.996 | 1.000 | 1.000 |
| | StDev | 0.000 | 0.028 | 0.013 | 0.000 | 0.000 |
| COVID-19 | Avg | 0.952 | 0.894 | 0.931 | 0.945 | 0.948 |
| | StDev | 0.008 | 0.028 | 0.007 | 0.005 | 0.004 |

Bold values indicate the best results in the table

Consequently, the overall results across all sets of experiments demonstrate the strength of the IBSCA3 algorithm in improving the performance and convergence behavior of the original SCA when solving the FS problem.

Runtime performance comparison of IBSCA3 to existing algorithms

Tables 15, 16, 17 and 18 provide the running-time comparison of IBSCA3, BSCA, and the other algorithms described in Tables 6, 10, 12, and 14, respectively. The results are given in milliseconds and represent an average over 30 independent runs; for each algorithm, the reported value is the time required to complete 100 iterations. As the tables show, IBSCA3 is faster than the other algorithms on almost all datasets.
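As an illustrative sketch of this measurement protocol (our illustration, not the authors' code), the per-dataset runtimes can be collected with a harness like the one below, where `dummy_optimizer` is a hypothetical stand-in for any of the compared algorithms:

```python
import random
import time

def mean_runtime_ms(algorithm, dataset, runs=30, iterations=100):
    """Average wall-clock runtime in milliseconds over independent runs,
    mirroring the protocol of Tables 15-18 (30 runs, 100 iterations each)."""
    times_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        algorithm(dataset, iterations)            # one full optimization run
        times_ms.append((time.perf_counter() - start) * 1000.0)
    return sum(times_ms) / len(times_ms)

def dummy_optimizer(dataset, iterations):
    """Hypothetical optimizer stand-in: touches every feature each iteration."""
    for _ in range(iterations):
        _ = [random.random() for _ in dataset]

avg_ms = mean_runtime_ms(dummy_optimizer, dataset=list(range(30)), runs=5)
print(f"{avg_ms:.2f} ms")
```

Using `time.perf_counter` rather than `time.time` avoids clock-adjustment artifacts when timing short runs.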
Table 15

Runtime comparison of the proposed IBSCA3 algorithm with existing algorithms

| Dataset | IBSCA3 | BSCA | RBDA | LBDA | QBDA | SBDA | BGWO | BGSA | BBA |
| Breastcancer | 9.18E+03 | 1.12E+04 | 1.08E+04 | 1.03E+04 | 1.59E+04 | 1.09E+04 | 1.21E+04 | 1.48E+04 | 1.28E+04 |
| BreastEW | 1.61E+03 | 2.74E+03 | 1.89E+03 | 1.93E+03 | 1.86E+03 | 1.97E+03 | 2.95E+03 | 2.34E+03 | 2.91E+03 |
| Exactly | 1.29E+04 | 1.41E+04 | 1.34E+04 | 1.37E+04 | 1.36E+04 | 1.32E+04 | 1.49E+04 | 1.54E+04 | 1.58E+04 |
| Exactly2 | 1.15E+04 | 1.46E+04 | 1.25E+04 | 1.28E+04 | 1.23E+03 | 1.21E+04 | 1.53E+04 | 1.62E+04 | 1.57E+04 |
| HeartEW | 5.13E+03 | 6.19E+03 | 5.68E+03 | 5.08E+03 | 5.72E+03 | 6.02E+03 | 7.01E+03 | 8.72E+03 | 8.31E+03 |
| Lymphography | 3.19E+03 | 3.84E+03 | 3.57E+03 | 3.40E+03 | 3.91E+03 | 3.55E+03 | 4.16E+03 | 5.83E+03 | 4.07E+03 |
| M-of-n | 9.12E+03 | 1.29E+04 | 9.38E+03 | 8.49E+03 | 9.54E+03 | 9.78E+03 | 1.14E+04 | 1.09E+04 | 1.28E+04 |
| PenglungEW | 3.47E+03 | 3.98E+03 | 4.06E+03 | 4.15E+03 | 3.83E+03 | 3.65E+03 | 4.32E+03 | 5.79E+03 | 3.84E+03 |
| SonarEW | 3.56E+03 | 4.39E+03 | 3.83E+03 | 3.90E+03 | 4.01E+03 | 3.16E+03 | 4.72E+03 | 5.09E+03 | 4.87E+03 |
| SpectEW | 4.61E+03 | 5.82E+03 | 4.89E+03 | 4.93E+03 | 4.75E+03 | 4.87E+03 | 6.08E+03 | 5.74E+03 | 5.18E+03 |
| CongressEW | 1.14E+04 | 1.31E+04 | 1.19E+04 | 1.17E+04 | 1.18E+04 | 1.21E+04 | 1.53E+04 | 1.74E+04 | 1.43E+04 |
| IonosphereEW | 4.36E+03 | 5.98E+03 | 5.32E+03 | 5.74E+03 | 5.17E+03 | 5.06E+03 | 6.04E+03 | 7.01E+03 | 6.81E+03 |
| KrvskpEW | 5.73E+04 | 6.69E+04 | 6.06E+04 | 6.21E+04 | 5.92E+04 | 6.17E+04 | 8.14E+04 | 7.69E+04 | 7.83E+04 |
| Tic-tac-toe | 1.26E+04 | 1.66E+04 | 1.49E+04 | 1.37E+04 | 1.26E+04 | 1.39E+04 | 1.77E+04 | 1.53E+04 | 1.68E+04 |
| Vote | 5.96E+03 | 6.58E+03 | 6.97E+03 | 6.01E+03 | 6.86E+03 | 6.38E+03 | 8.34E+03 | 7.44E+03 | 7.81E+03 |
| WaveformEW | 1.48E+04 | 1.72E+04 | 1.51E+04 | 1.67E+04 | 1.56E+04 | 1.53E+04 | 1.87E+04 | 1.79E+04 | 1.78E+04 |
| WineEW | 1.09E+03 | 1.57E+03 | 1.28E+03 | 1.20E+03 | 1.14E+03 | 1.16E+03 | 2.51E+03 | 1.86E+03 | 2.01E+03 |
| Zoo | 4.52E+03 | 5.68E+03 | 4.79E+03 | 5.02E+03 | 4.97E+03 | 5.08E+03 | 6.42E+03 | 5.69E+03 | 6.83E+03 |

Bold values indicate the best results in the table

Table 16

Runtime comparison of the proposed IBSCA3 algorithm with the other algorithms that incorporate OBL, VNS and the Laplace distribution

| Dataset | IBSCA3 | ISSA | IHHO | OSACI | VNS-HRS | VNLHHO | IEOA | DSSA | SFS-LARLRM | BGWOPSO |
| Breastcancer | 9.18E+03 | 1.03E+04 | 1.14E+04 | 1.22E+04 | 1.31E+04 | 1.10E+04 | 1.37E+04 | 1.61E+04 | 1.19E+04 | 1.26E+04 |
| BreastEW | 1.61E+03 | 2.48E+03 | 3.11E+03 | 3.64E+03 | 3.92E+03 | 3.07E+03 | 3.98E+03 | 3.87E+03 | 3.25E+03 | 3.47E+03 |
| Exactly | 1.29E+04 | 1.53E+04 | 1.76E+04 | 1.88E+04 | 2.01E+04 | 1.74E+04 | 2.16E+04 | 1.99E+04 | 1.74E+04 | 1.82E+04 |
| Exactly2 | 1.15E+04 | 1.57E+04 | 1.78E+04 | 1.91E+04 | 2.13E+03 | 1.38E+04 | 2.36E+04 | 2.08E+04 | 1.84E+04 | 1.93E+04 |
| HeartEW | 5.13E+03 | 6.38E+03 | 6.77E+03 | 7.12E+03 | 7.49E+03 | 6.15E+03 | 7.36E+03 | 7.23E+03 | 6.55E+03 | 6.08E+03 |
| Lymphography | 3.19E+03 | 3.51E+03 | 3.66E+03 | 3.91E+03 | 4.02E+03 | 3.24E+03 | 3.86E+03 | 3.41E+03 | 3.51E+03 | 3.27E+03 |
| M-of-n | 9.12E+03 | 1.35E+04 | 1.42E+04 | 1.58E+04 | 1.71E+04 | 9.918E+03 | 1.16E+04 | 1.02E+04 | 1.11E+04 | 1.09E+04 |
| PenglungEW | 3.47E+03 | 3.63E+03 | 4.12E+03 | 4.27E+03 | 4.61E+03 | 3.51E+03 | 3.62E+03 | 3.76E+03 | 3.83E+03 | 3.57E+03 |
| SonarEW | 3.56E+03 | 3.74E+03 | 3.95E+03 | 4.28E+03 | 4.71E+03 | 3.68E+03 | 3.71E+03 | 3.73E+03 | 3.85E+03 | 3.79E+03 |
| SpectEW | 4.61E+03 | 4.75E+03 | 4.91E+03 | 4.95E+03 | 5.01E+04 | 4.69E+03 | 4.70E+03 | 4.82E+03 | 4.68E+03 | 4.65E+03 |
| CongressEW | 1.14E+04 | 1.35E+04 | 1.62E+04 | 1.77E+04 | 1.91E+04 | 1.54E+04 | 1.62E+04 | 1.67E+04 | 1.79E+04 | 1.29E+04 |
| IonosphereEW | 4.36E+03 | 4.76E+03 | 4.91E+03 | 5.03E+03 | 5.12E+03 | 4.88E+03 | 4.97E+03 | 5.02E+03 | 5.11E+03 | 4.52E+03 |
| KrvskpEW | 5.73E+04 | 6.04E+04 | 6.18E+04 | 6.44E+04 | 6.01E+04 | 5.81E+04 | 5.96E+04 | 6.03E+04 | 6.12E+04 | 5.98E+04 |
| Tic-tac-toe | 1.26E+04 | 1.72E+04 | 1.89E+04 | 1.97E+04 | 1.85E+04 | 1.91E+04 | 1.98E+04 | 2.02E+04 | 1.94E+04 | 1.79E+04 |
| Vote | 5.96E+03 | 6.06E+03 | 6.28E+03 | 6.74E+03 | 6.91E+03 | 6.33E+03 | 6.85E+03 | 6.91E+03 | 6.18E+03 | 6.03E+03 |
| WaveformEW | 1.48E+04 | 1.57E+04 | 1.68E+04 | 1.79E+04 | 1.72E+04 | 1.81E+04 | 1.66E+04 | 1.71E+04 | 1.61E+04 | 1.59E+04 |
| WineEW | 1.09E+03 | 1.27E+03 | 1.35E+03 | 1.49E+03 | 1.58E+03 | 1.41E+03 | 1.53E+03 | 1.62E+03 | 1.58E+03 | 1.32E+03 |
| Zoo | 4.52E+03 | 5.07E+03 | 5.18E+03 | 5.29E+03 | 5.12E+03 | 5.02E+03 | 5.09E+03 | 5.12E+03 | 5.07E+03 | 4.98E+03 |
| COVID-19 | 1.06E+03 | 1.48E+03 | 1.59E+03 | 1.67E+03 | 1.52E+03 | 1.64E+03 | 1.72E+03 | 1.63E+03 | 1.56E+03 | 1.50E+03 |

Bold values indicate the best results in the table

Table 17

Runtime comparison of the proposed IBSCA3 algorithm with the other SCA variants

| Dataset | IBSCA3 | SCHHO | SCAGA | MetaSCA | BPSO-SCA | ISSAFD | ISCA |
| Breastcancer | 9.18E+03 | 1.15E+04 | 1.36E+04 | 1.89E+04 | 1.53E+04 | 1.94E+04 | 1.97E+04 |
| BreastEW | 1.61E+03 | 1.78E+03 | 1.95E+03 | 2.04E+03 | 1.98E+03 | 2.14E+03 | 2.35E+03 |
| Exactly | 1.29E+04 | 1.42E+04 | 1.49E+04 | 1.45E+04 | 1.53E+04 | 1.62E+04 | 1.73E+04 |
| Exactly2 | 1.15E+04 | 1.23E+04 | 1.35E+04 | 1.41E+04 | 1.29E+03 | 1.39E+04 | 1.72E+04 |
| HeartEW | 5.13E+03 | 5.92E+03 | 6.01E+03 | 6.15E+03 | 6.12E+03 | 6.29E+03 | 6.43E+03 |
| Lymphography | 3.19E+03 | 3.97E+03 | 4.05E+03 | 4.16E+03 | 4.07E+03 | 4.15E+03 | 4.38E+03 |
| M-of-n | 9.12E+03 | 1.14E+04 | 1.20E+03 | 1.32E+03 | 1.25E+03 | 1.46E+03 | 1.76E+04 |
| PenglungEW | 3.47E+03 | 3.52E+03 | 3.75E+03 | 4.01E+03 | 3.87E+03 | 3.99E+03 | 4.26E+03 |
| SonarEW | 3.56E+03 | 4.16E+03 | 4.52E+03 | 4.91E+03 | 5.12E+03 | 5.16E+03 | 4.37E+03 |
| SpectEW | 4.61E+03 | 5.01E+03 | 5.13E+03 | 5.29E+03 | 5.22E+03 | 5.36E+03 | 5.44E+03 |
| CongressEW | 1.14E+04 | 1.27E+04 | 1.38E+04 | 1.31E+04 | 1.41E+04 | 1.59E+04 | 1.71E+04 |
| IonosphereEW | 4.36E+03 | 5.02E+03 | 5.19E+03 | 5.12E+03 | 5.35E+03 | 5.42E+03 | 5.91E+03 |
| KrvskpEW | 5.73E+04 | 5.79E+04 | 6.16E+04 | 6.27E+04 | 6.21E+04 | 6.39E+04 | 6.45E+04 |
| Tic-tac-toe | 1.26E+04 | 1.39E+04 | 1.45E+04 | 1.41E+04 | 1.47E+04 | 1.67E+04 | 1.95E+04 |
| Vote | 5.96E+03 | 6.18E+03 | 6.34E+03 | 6.27E+03 | 6.56E+03 | 6.67E+03 | 8.78E+03 |
| WaveformEW | 1.48E+04 | 1.59E+04 | 1.68E+04 | 1.61E+04 | 1.76E+04 | 1.83E+04 | 2.03E+04 |
| WineEW | 1.09E+03 | 1.15E+03 | 1.31E+03 | 1.46E+03 | 1.36E+03 | 1.56E+03 | 1.96E+03 |
| Zoo | 4.52E+03 | 4.70E+03 | 4.83E+03 | 4.74E+03 | 5.03E+03 | 5.16E+03 | 5.65E+03 |
| COVID-19 | 1.06E+03 | 1.32E+03 | 1.46E+03 | 1.61E+03 | 2.01E+03 | 2.15E+03 | 2.05E+03 |

Bold values indicate the best results in the table

Table 18

Runtime comparison of the proposed IBSCA3 algorithm with the other new nature-inspired metaheuristic algorithms

| Dataset | IBSCA3 | BFFAG | AVOA | GTO |
| Breastcancer | 9.18E+03 | 2.28E+04 | 1.86E+04 | 1.12E+04 |
| BreastEW | 1.61E+03 | 2.91E+03 | 1.75E+03 | 1.67E+03 |
| Exactly | 1.29E+04 | 1.92E+04 | 1.85E+04 | 1.61E+04 |
| Exactly2 | 1.15E+04 | 1.81E+04 | 1.72E+04 | 1.41E+04 |
| HeartEW | 5.13E+03 | 6.25E+03 | 6.15E+03 | 5.92E+03 |
| Lymphography | 3.19E+03 | 4.59E+03 | 4.39E+03 | 3.97E+03 |
| M-of-n | 9.12E+03 | 1.96E+04 | 1.64E+04 | 1.32E+04 |
| PenglungEW | 3.47E+03 | 4.01E+03 | 3.98E+03 | 3.62E+03 |
| SonarEW | 3.56E+03 | 4.16E+03 | 4.02E+03 | 3.77E+03 |
| SpectEW | 4.61E+03 | 5.29E+03 | 5.13E+03 | 4.84E+03 |
| CongressEW | 1.14E+04 | 1.64E+04 | 1.51E+04 | 1.37E+04 |
| IonosphereEW | 4.36E+03 | 5.06E+03 | 5.01E+03 | 4.91E+03 |
| KrvskpEW | 5.73E+04 | 6.37E+04 | 6.24E+04 | 6.05E+04 |
| Tic-tac-toe | 1.26E+04 | 1.72E+04 | 1.61E+04 | 1.59E+04 |
| Vote | 5.96E+03 | 6.37E+03 | 6.26E+03 | 5.99E+03 |
| WaveformEW | 1.48E+04 | 1.75E+04 | 1.71E+04 | 1.58E+04 |
| WineEW | 1.09E+03 | 1.93E+03 | 1.76E+03 | 1.41E+03 |
| Zoo | 4.52E+03 | 5.23E+03 | 5.14E+03 | 4.92E+03 |
| COVID-19 | 1.06E+03 | 1.48E+03 | 1.57E+03 | 1.62E+04 |

Bold values indicate the best results in the table

The experiments were conducted on an Intel Core i7-3517U CPU (1.90 GHz) with 16 GB of RAM running 64-bit Windows. All algorithms were implemented in the Python programming language.

Statistical test results

This section investigates the significance of the results in Tables 6, 10, 12, and 14. We applied both Friedman's test and Wilcoxon's test [125] to the classification accuracies in these tables with α = 0.05. Tables 19, 20, 21 and 22 present the results of Friedman's test; the best ranks in each row are highlighted in bold. The average ranks of the algorithms were as follows (best to worst): in Table 19: IBSCA3, SBDA, LBDA, RBDA, QBDA, BSCA, BGWO, BGSA, and BBA; in Table 20: IBSCA3, VNS-HRS, IHHO, BGWOPSO, OSACI, ISSA, VNLHHO, SFS-LARLRM, IEOA, and DSSA; in Table 21: IBSCA3, ISSAFD, SCHHO, BPSO-SCA, SCAGA, MetaSCA, and ISCA; in Table 22: IBSCA3, GTO, AVOA, and BFFAG.
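A minimal sketch of how such a Friedman analysis can be reproduced with SciPy (toy accuracy values rather than the paper's full tables; lower rank = better accuracy, as in Tables 19-22):

```python
import numpy as np
from scipy.stats import friedmanchisquare, rankdata

# Toy accuracy matrix: rows are datasets, columns are algorithms
# (column 0 plays the role of IBSCA3 here).
acc = np.array([
    [0.997, 0.965, 0.936],
    [1.000, 0.979, 0.946],
    [0.869, 0.811, 0.820],
    [0.952, 0.894, 0.915],
])

# Friedman omnibus test across the algorithm columns.
stat, p_value = friedmanchisquare(*acc.T)

# Per-dataset ranks (rank 1 = highest accuracy), then the average rank
# per algorithm, matching the "Average of ranks" rows of Tables 19-22.
ranks = np.array([rankdata(-row) for row in acc])
avg_ranks = ranks.mean(axis=0)
print(avg_ranks)   # the best algorithm has the lowest average rank
```

Negating the accuracies before ranking makes rank 1 correspond to the most accurate algorithm on each dataset.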
Table 19

Friedman’s test when comparing IBSCA3 with existing algorithms based on classification accuracy (Table 6)

Ranks of the algorithms
| Dataset | BSCA | RBDA | LBDA | QBDA | SBDA | BGWO | BGSA | BBA | IBSCA3 |
| Breastcancer | 7 | 4 | 5.5 | 2.5 | 2.5 | 5.5 | 8 | 9 | 1 |
| BreastEW | 5 | 1.5 | 3 | 4 | 6 | 8 | 7 | 9 | 1.5 |
| Exactly | 6 | 2.5 | 2.5 | 5 | 2.5 | 7 | 8 | 9 | 2.5 |
| Exactly2 | 4 | 2 | 5 | 3 | 6 | 8 | 9 | 7 | 1 |
| HeartEW | 6 | 5 | 2 | 3 | 4 | 7 | 8 | 9 | 1 |
| Lymphography | 8 | 3 | 5 | 4 | 2 | 7 | 6 | 9 | 1 |
| M-of-n | 6 | 2.5 | 2.5 | 5 | 2.5 | 7 | 8 | 9 | 2.5 |
| PenglungEW | 5 | 6 | 2.5 | 2.5 | 2.5 | 8 | 7 | 9 | 2.5 |
| SonarEW | 4 | 3 | 6 | 5 | 2 | 7 | 8 | 9 | 1 |
| SpectEW | 6 | 4 | 3 | 5 | 2 | 7 | 8 | 9 | 1 |
| CongressEW | 6 | 4 | 2 | 3 | 5 | 7 | 8 | 9 | 1 |
| IonosphereEW | 3 | 4.5 | 4.5 | 6 | 2 | 7 | 8 | 9 | 1 |
| KrvskpEW | 6 | 3 | 2 | 4 | 5 | 7 | 8 | 9 | 1 |
| Tic-tac-toe | 6 | 5 | 3 | 2 | 4 | 7 | 8 | 9 | 1 |
| Vote | 3 | 2 | 5 | 6 | 4 | 8 | 7 | 9 | 1 |
| WaveformEW | 6 | 3 | 4 | 5 | 2 | 7 | 8 | 9 | 1 |
| WineEW | 7 | 5 | 2.5 | 2.5 | 2.5 | 8 | 6 | 9 | 2.5 |
| Zoo | 7 | 3 | 3 | 3 | 3 | 8 | 6 | 9 | 3 |
| Sum of ranks | 101 | 63 | 63 | 70.5 | 59.5 | 130.5 | 136 | 160 | 26.5 |
| Sum of ranks squared | 10201 | 3969 | 3969 | 4970.25 | 3540.25 | 17030.25 | 18496 | 25600 | 702.25 |
| Average of ranks | 5.61 | 3.5 | 3.5 | 3.92 | 3.31 | 7.25 | 7.56 | 8.89 | 1.47 |

Bold values indicate the best results in the table

Table 20

Friedman’s test when comparing IBSCA3 with the other algorithms that incorporate OBL, VNS and Laplace distribution based on classification accuracy (Table 10)

Ranks of the algorithms
| Dataset | ISSA | IHHO | OSACI | VNS-HRS | VNLHHO | IEOA | DSSA | SFS-LARLRM | BGWOPSO | IBSCA3 |
| Breastcancer | 8 | 5 | 4 | 3 | 9 | 7 | 10 | 2 | 6 | 1 |
| BreastEW | 7 | 2 | 2 | 5 | 8 | 9 | 10 | 6 | 4 | 2 |
| Exactly | 8 | 4 | 9 | 4 | 4 | 10 | 4 | 4 | 4 | 4 |
| Exactly2 | 6 | 9 | 7 | 5 | 2 | 10 | 8 | 3 | 4 | 1 |
| HeartEW | 4 | 10 | 7 | 9 | 6 | 2 | 8 | 3 | 5 | 1 |
| Lymphography | 3 | 5 | 4 | 2 | 7 | 6 | 10 | 9 | 8 | 1 |
| M-of-n | 2.5 | 2.5 | 5 | 2.5 | 6 | 8 | 10 | 9 | 7 | 2.5 |
| PenglungEW | 5 | 2.5 | 2.5 | 2.5 | 8 | 6 | 9 | 10 | 7 | 2.5 |
| SonarEW | 3 | 5 | 4 | 2 | 7 | 8 | 10 | 9 | 6 | 1 |
| SpectEW | 5 | 4 | 6 | 3 | 8 | 9 | 10 | 7 | 2 | 1 |
| CongressEW | 6 | 3 | 4 | 1.5 | 8 | 9 | 10 | 7 | 5 | 1.5 |
| IonosphereEW | 7 | 4 | 8 | 6 | 2 | 10 | 9 | 3 | 5 | 1 |
| KrvskpEW | 8 | 7 | 6 | 9 | 2 | 10 | 4 | 3 | 5 | 1 |
| Tic-tac-toe | 5 | 3 | 2 | 4 | 7 | 8 | 10 | 9 | 6 | 1 |
| Vote | 6 | 4 | 5 | 3 | 8 | 7 | 9 | 10 | 2 | 1 |
| WaveformEW | 3 | 1 | 7 | 5 | 6 | 8 | 10 | 9 | 4 | 2 |
| WineEW | 6 | 3 | 7 | 3 | 8 | 3 | 9 | 10 | 3 | 3 |
| Zoo | 6 | 3 | 7 | 3 | 8 | 10 | 3 | 9 | 3 | 3 |
| COVID-19 | 8 | 2 | 5 | 7 | 9 | 6 | 4 | 10 | 3 | 1 |
| Sum of ranks | 106.5 | 79 | 101.5 | 79.5 | 123 | 146 | 157 | 132 | 89 | 31.5 |
| Sum of ranks squared | 11342.25 | 6241 | 10302.25 | 6320.25 | 15129 | 21316 | 24649 | 17424 | 7921 | 992.25 |
| Average of ranks | 5.47 | 4.28 | 5.36 | 4.03 | 6.33 | 7.28 | 8.5 | 6.78 | 4.78 | 1.69 |

Bold values indicate the best results in the table

Table 21

Friedman’s test when comparing IBSCA3 with the other SCA variants based on classification accuracy (Table 12)

Ranks of the algorithms
| Dataset | SCHHO | SCAGA | MetaSCA | BPSO-SCA | ISSAFD | ISCA | IBSCA3 |
| Breastcancer | 4 | 3 | 5 | 6 | 2 | 7 | 1 |
| BreastEW | 6 | 5 | 4 | 7 | 1.5 | 3 | 1.5 |
| Exactly | 4 | 2 | 6 | 5 | 2 | 7 | 2 |
| Exactly2 | 3 | 5 | 6 | 4 | 2 | 7 | 1 |
| HeartEW | 5 | 6 | 3 | 4 | 2 | 7 | 1 |
| Lymphography | 4 | 6 | 5 | 3 | 2 | 7 | 1 |
| M-of-n | 2 | 5 | 6 | 4 | 2 | 7 | 2 |
| PenglungEW | 5 | 3 | 6 | 4 | 1.5 | 7 | 1.5 |
| SonarEW | 5 | 6 | 4 | 3 | 2 | 7 | 1 |
| SpectEW | 3 | 6 | 7 | 5 | 2 | 4 | 1 |
| CongressEW | 2 | 7 | 4 | 5 | 3 | 6 | 1 |
| IonosphereEW | 3 | 4 | 6 | 5 | 2 | 7 | 1 |
| KrvskpEW | 3 | 5 | 4 | 6 | 2 | 7 | 1 |
| Tic-tac-toe | 3 | 5 | 4 | 6 | 2 | 7 | 1 |
| Vote | 3 | 4 | 6 | 5 | 2 | 7 | 1 |
| WaveformEW | 4 | 6 | 5 | 2 | 3 | 7 | 1 |
| WineEW | 2 | 4 | 5 | 6 | 2 | 7 | 2 |
| Zoo | 2 | 4 | 5 | 6 | 2 | 7 | 2 |
| COVID-19 | 3.5 | 7 | 5 | 3.5 | 2 | 6 | 1 |
| Sum of ranks | 66.5 | 93 | 96 | 89.5 | 39 | 124 | 24 |
| Sum of ranks squared | 4422.25 | 8649 | 9216 | 8010.25 | 1521 | 15376 | 576 |
| Average of ranks | 3.5 | 4.89 | 5.05 | 4.71 | 2.05 | 6.53 | 1.26 |

Bold values indicate the best results in the table

Table 22

Friedman’s test when comparing IBSCA3 with the other new nature-inspired metaheuristic algorithms based on classification accuracy (Table 14)

Ranks of the algorithms
| Dataset | BFFAG | AVOA | GTO | IBSCA3 |
| Breastcancer | 4 | 3 | 2 | 1 |
| BreastEW | 4 | 3 | 1.5 | 1.5 |
| Exactly | 4 | 2 | 2 | 2 |
| Exactly2 | 4 | 3 | 2 | 1 |
| HeartEW | 4 | 3 | 2 | 1 |
| Lymphography | 4 | 3 | 2 | 1 |
| M-of-n | 4 | 2 | 2 | 2 |
| PenglungEW | 4 | 2 | 2 | 2 |
| SonarEW | 4 | 3 | 2 | 1 |
| SpectEW | 4 | 3 | 2 | 1 |
| CongressEW | 4 | 3 | 2 | 1 |
| IonosphereEW | 4 | 3 | 2 | 1 |
| KrvskpEW | 4 | 3 | 2 | 1 |
| Tic-tac-toe | 4 | 3 | 2 | 1 |
| Vote | 4 | 3 | 2 | 1 |
| WaveformEW | 4 | 3 | 2 | 1 |
| WineEW | 4 | 2 | 2 | 2 |
| Zoo | 4 | 2 | 2 | 2 |
| COVID-19 | 4 | 3 | 2 | 1 |
| Sum of ranks | 76 | 52 | 37.5 | 24.5 |
| Sum of ranks squared | 5776 | 2704 | 1406.25 | 600.25 |
| Average of ranks | 4 | 2.74 | 1.97 | 1.29 |

Bold values indicate the best results in the table

It is clear from these results that IBSCA3 achieves the best rank on 12 datasets and competitive results on the remaining datasets; accordingly, IBSCA3 has the best average rank among the compared algorithms. We also conducted Wilcoxon's test with α = 0.05, summarized in Tables 23, 24, 25 and 26, to evaluate the data in Tables 6, 10, 12, and 14, respectively. The purpose is to assess whether the classification accuracy of IBSCA3 differs significantly from that of the other algorithms. The reported p-values indicate that the classification accuracy of IBSCA3 is statistically significant compared to that of the other algorithms.
Table 23

Wilcoxon’s test results when comparing IBSCA3 with existing algorithms based on classification accuracy (Table 6)

| Algorithm | BSCA | RBDA | LBDA | QBDA | SBDA | BGWO | BGSA | BBA |
| p-values | 0.00328 | 0.00096 | 0.00148 | 0.00064 | 0.00148 | 0.00020 | 0.00020 | 0.00020 |
Table 24

Wilcoxon’s test results when comparing IBSCA3 with the other algorithms that incorporate OBL, VNS and Laplace distribution based on classification accuracy (Table 10)

| Algorithm | ISSA | IHHO | OSACI | VNS-HRS | VNLHHO | IEOA | DSSA | SFS-LARLRM | BGWOPSO |
| p-values | 0.00020 | 0.01596 | 0.00030 | 0.00148 | 0.00020 | 0.00020 | 0.00030 | 0.00020 | 0.00044 |
Table 25

Wilcoxon’s test results when comparing IBSCA3 with the other SCA variants based on classification accuracy (Table 12)

| Algorithm | SCHHO | SCAGA | MetaSCA | BPSO-SCA | ISSAFD | ISCA |
| p-values | 0.00044 | 0.00020 | 0.00014 | 0.00014 | 0.00148 | 0.00014 |
Table 26

Wilcoxon’s test results when comparing IBSCA3 with the other new nature-inspired optimization algorithms based on classification accuracy (Table 14)

| Algorithm | BFFAG | AVOA | GTO |
| p-values | 0.00014 | 0.00096 | 0.00148 |
In addition, we used the Mann-Whitney U test to compare IBSCA3 against all other algorithms. Based on the results, IBSCA3 produces significantly different results compared to the other algorithms, except for IHHO (p = 0.28014), VNS-HRS (p = 0.35758), BGWOPSO (p = 0.0536), AVOA (p = 0.39532), and GTO (p = 0.65272). Accordingly, the statistical analysis provides evidence that the modifications included in IBSCA3 improve its search strategy compared to the original SCA algorithm, and thus IBSCA3 achieves the highest accuracy on most of the datasets.

Conclusion and future work

This paper introduced three versions of a binary optimization algorithm named the Improved Binary Sine Cosine Algorithm (IBSCA), which combines multiple exploration and exploitation approaches for solving the Feature Selection (FS) problem. All versions of IBSCA (IBSCA1, IBSCA2, IBSCA3) employ an opposition-based learning approach in their initialization stage to generate a diverse population of candidate solutions. IBSCA2 and IBSCA3 use a combination of variable neighborhood search and the Laplace distribution to explore the search space with several mutation methods. Further, IBSCA3 improves the best candidate solution using Refraction Learning, a novel opposition-based learning approach based on the principle of light refraction. All versions of IBSCA use two-step transfer functions to convert continuous decision variables into binary ones. The three versions of IBSCA were compared with each other using 18 FS datasets from the UCI repository and one COVID-19 dataset. These datasets are suitable for comparison because their numbers of features, objects, and classes vary significantly. IBSCA3 was found to be the most efficient version of IBSCA. Furthermore, the performance of IBSCA3 was evaluated against several popular binary algorithms (RBDA, LBDA, QBDA, SBDA, BGWO, BGSA, BBA, CHIO, CHIO-GC, ISSA, IHHO, OSACI, VNS-HRS, VNLHHO, IEOA, DSSA, SFS-LARLRM, BGWOPSO, SCHHO, SCAGA, MetaSCA, BPSO–SCA, ISSAFD, ISCA, BFFAG, AVOA, GTO) on the same 18 UCI datasets and the COVID-19 dataset. The overall simulation results indicate that IBSCA3 outperformed all comparative algorithms in terms of accuracy and number of selected features over most datasets. It is worth mentioning that the performance of IBSCA is affected by the limitations of its component methods.
First, OBL and RL tend to generate good solutions at the beginning of the optimization process, but the generated solutions may converge to sub-optimal regions as the optimization progresses [98]. Moreover, every optimization problem requires an OBL strategy suited to its structure; in other words, there are no clear guidelines for designing OBL strategies for different optimization problems [126, 127]. Second, if the VNS method is applied too frequently, the population of solutions can spread over a larger area than necessary [128]. In the future, we plan to conduct two research studies based on IBSCA3: applying IBSCA3 to multi-agent cooperative reinforcement learning [129, 130] based on the models described in [131, 132], and incorporating the island model [96, 133–137] into IBSCA3 to further improve its performance on the FS problem. Applying the proposed methods to other FS applications can also be addressed in future work.
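As a minimal sketch (our illustration, not the authors' implementation), the opposition-based initialization used by all IBSCA versions can be expressed for the binary FS encoding as follows, where the opposite of a 0/1 feature mask is its bitwise complement:

```python
import random

def obl_initial_population(pop_size, num_features, seed=None):
    """Opposition-based initialization for binary feature selection:
    each random 0/1 feature mask is paired with its bitwise opposite,
    so the initial population covers both a point and its opposition."""
    rng = random.Random(seed)
    population = []
    for _ in range(pop_size):
        mask = [rng.randint(0, 1) for _ in range(num_features)]
        opposite = [1 - bit for bit in mask]   # binary opposite solution
        population.append(mask)
        population.append(opposite)
    return population

pop = obl_initial_population(pop_size=5, num_features=8, seed=42)
```

In a full algorithm, the fittest `pop_size` of these `2 * pop_size` candidates would typically be retained before the main SCA loop begins.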
  10 in total

1.  Wavelet feature selection for image classification.

Authors:  Ke Huang; Selin Aviyente
Journal:  IEEE Trans Image Process       Date:  2008-09       Impact factor: 10.856

2.  Improving feature selection performance for classification of gene expression data using Harris Hawks optimizer with variable neighborhood learning.

Authors:  Chiwen Qu; Lupeng Zhang; Jinlong Li; Fang Deng; Yifan Tang; Xiaomin Zeng; Xiaoning Peng
Journal:  Brief Bioinform       Date:  2021-04-20       Impact factor: 11.622

3.  A hybrid feature extraction selection approach for high-dimensional non-Gaussian data clustering.

Authors:  Sabri Boutemedjet; Nizar Bouguila; Djemel Ziou
Journal:  IEEE Trans Pattern Anal Mach Intell       Date:  2009-08       Impact factor: 6.226

4.  A Hybrid Feature Selection Method Based on Binary State Transition Algorithm and ReliefF.

Authors:  Zhaoke Huang; Chunhua Yang; Xiaojun Zhou; Tingwen Huang
Journal:  IEEE J Biomed Health Inform       Date:  2018-09-28       Impact factor: 5.772

5.  Ant system: optimization by a colony of cooperating agents.

Authors:  M Dorigo; V Maniezzo; A Colorni
Journal:  IEEE Trans Syst Man Cybern B Cybern       Date:  1996

6.  Binary Horse herd optimization algorithm with crossover operators for feature selection.

Authors:  Mohammed A Awadallah; Abdelaziz I Hammouri; Mohammed Azmi Al-Betar; Malik Shehadeh Braik; Mohamed Abd Elaziz
Journal:  Comput Biol Med       Date:  2021-12-18       Impact factor: 4.589

7.  BLProt: prediction of bioluminescent proteins based on support vector machine and relieff feature selection.

Authors:  Krishna Kumar Kandaswamy; Ganesan Pugalenthi; Mehrnaz Khodam Hazrati; Kai-Uwe Kalies; Thomas Martinetz
Journal:  BMC Bioinformatics       Date:  2011-08-17       Impact factor: 3.169

8.  STatistical Inference Relief (STIR) feature selection.

Authors:  Trang T Le; Ryan J Urbanowicz; Jason H Moore; Brett A McKinney
Journal:  Bioinformatics       Date:  2019-04-15       Impact factor: 6.937

9.  Economic load dispatch using memetic sine cosine algorithm.

Authors:  Mohammed Azmi Al-Betar; Mohammed A Awadallah; Raed Abu Zitar; Khaled Assaleh
Journal:  J Ambient Intell Humaniz Comput       Date:  2022-02-07
