Article

Binary Whale Optimization Algorithm for Dimensionality Reduction

1 Faculty of Science, Fayoum University, Faiyum 63514, Egypt
2 IN3-Computer Science Department, Universitat Oberta de Catalunya, 08018 Barcelona, Spain
3 Departamento de Ciencias Computacionales, Universidad de Guadalajara, CUCEI, Guadalajara 44430, Mexico
4 Faculty of Computers and Information, Minia University, Minia 61519, Egypt
5 School of Information Science and Technology, Qingdao University of Science and Technology, Qingdao 266061, China
* Author to whom correspondence should be addressed.
Mathematics 2020, 8(10), 1821; https://doi.org/10.3390/math8101821
Submission received: 20 September 2020 / Revised: 30 September 2020 / Accepted: 12 October 2020 / Published: 17 October 2020
(This article belongs to the Special Issue Evolutionary Computation 2020)

Abstract

Feature selection (FS) is regarded as a global combinatorial optimization problem. FS is used to simplify and enhance the quality of high-dimensional datasets by selecting prominent features and removing irrelevant and redundant data, thereby providing good classification results. FS aims to reduce dimensionality and improve classification accuracy, and it is widely applied in fields such as pattern classification, data analysis, and data mining. The main problem is to find the best subset that contains the representative information of all the data. In order to overcome this problem, two binary variants of the whale optimization algorithm (WOA), called bWOA-S and bWOA-V, are proposed. They are used to decrease the complexity and increase the performance of a system by selecting significant features for classification purposes. The first variant, bWOA-S, uses the Sigmoid transfer function to convert WOA values to binary ones, whereas the second variant, bWOA-V, uses a hyperbolic tangent transfer function. Furthermore, the two binary variants introduced here were compared with well-known optimization algorithms in this domain, namely the Particle Swarm Optimizer (PSO), three variants of the binary ant lion optimizer (bALO1, bALO2, and bALO3), and the binary Dragonfly Algorithm (bDA), as well as the original WOA, over 24 benchmark datasets from the UCI repository. Eventually, a non-parametric test, Wilcoxon's rank-sum, was carried out at the 5% significance level to assess statistically the effectiveness of the two proposed algorithms compared with the other algorithms. The qualitative and quantitative results showed that the two introduced variants are able to minimize the number of selected features as well as maximize the classification accuracy within an appropriate time.

1. Introduction

Datasets from real-world applications such as industry or medicine are high-dimensional and contain irrelevant or redundant features. Such datasets carry useless information that degrades the performance of machine learning algorithms and hinders the learning process. Feature selection (FS) is a powerful technique used to select the most significant subset of features, addressing the high-dimensionality problem [1] by identifying the relevant features and removing redundant ones [2]. Moreover, using the selected subset of features, any machine learning algorithm can be applied for classification. Therefore, several studies have treated FS as an optimization problem in which the fitness function is the classifier's accuracy, to be maximized over the selected features [3]. Moreover, FS has been applied successfully to solve many classification problems in different domains, such as data mining [4,5], pattern recognition [6], information retrieval [7], information feedback [8], drug design [9,10], the job-shop scheduling problem [11], maximizing the lifetime of wireless sensor networks [12,13], and other areas where FS can be utilized [14].
There are three main classes of FS methods: (1) wrapper, (2) filter, and (3) hybrid methods [15]. Wrapper approaches generally incorporate classification algorithms to search for and select the relevant features [16]. Filter methods evaluate the relevance of features without prior data classification [17]. Hybrid techniques combine the complementary strengths of the wrapper and filter methods. Generally speaking, wrapper methods outperform filter methods in terms of classification accuracy, and hence wrapper approaches are used in this paper.
In fact, for many classification problems, high classification accuracy does not require a large number of selected features. In this context, classification problems can be categorized into two groups: (1) binary classification and (2) multi-class classification. In this paper, we deal with the binary classification problem. There are numerous methods applied to binary classification problems, such as discriminant analysis [18], decision trees (DT) [19], the K-nearest neighbor (K-NN) classifier [20], artificial neural networks (ANN) [21], and support vector machines (SVMs) [22].
On the other hand, traditional optimization methods suffer from some limitations in solving FS problems [23,24], and hence nature-inspired meta-heuristic algorithms [25] such as the whale optimization algorithm (WOA) [26], moth–flame optimisation [27], ant lion optimization [28], the crow search algorithm [29], the lightning search algorithm [30], Henry gas solubility optimization [31], and Lévy flight distribution [32] are widely used in the scientific community for solving complex optimization problems and several real-world applications [33,34,35]. Optimization is defined as the process of searching for the optimal solutions to a specific problem. In order to address issues such as FS, several nature-inspired algorithms have been applied; some of these algorithms are used alone or hybridized with each other, while others have been adapted into new variants, such as binary versions, to solve this problem. A survey on evolutionary computation [36] approaches for FS is presented in [37]. Several standalone and hybrid algorithms have been proposed for FS, such as a hybrid ant colony optimization algorithm [38], the forest optimization algorithm [39], the firefly optimization algorithm [40], a hybrid whale optimization algorithm with simulated annealing [41], particle swarm optimization [42], the sine cosine optimization algorithm [43], monarch butterfly optimization [44], and the moth search algorithm [45].
In addition to the aforementioned studies on the FS problem, other search strategies, namely binary optimization algorithms, have been implemented. Some examples are the binary flower pollination algorithm (BFPA) [46], the binary bat algorithm (BBA) [47], and the binary cuckoo search algorithm (BCSA) [48]; all of them evaluate the accuracy of the classifier as an objective function. He et al. presented a binary differential evolution algorithm (BDEA) [49] to select the relevant subset to train an SVM with a radial basis function (RBF) kernel. Moreover, Emary et al. proposed the binary ant lion and the binary grey wolf optimization approaches [50,51], respectively. Rashedi et al. introduced an improved binary gravitational search algorithm (BGSA) [52]. In addition, a salp swarm algorithm was used for feature selection of chemical compound activities [53], and a binary version of particle swarm optimization (BPSO) was proposed in [54]. Binary whale optimization algorithms for feature selection [55,56,57] have also been introduced. As the No Free Lunch (NFL) theorem states, there is no algorithm that is able to solve all optimization problems; hence, if an algorithm shows superior performance on one class of problems, it cannot show the same performance on other classes. This is the motivation of the present study, in which we propose two novel binary variants of the whale optimization algorithm (WOA), called bWOA-S and bWOA-V. In this regard, the WOA is a nature-inspired, population-based metaheuristic optimization algorithm that simulates the social behavior of humpback whales [26]. The original WOA is modified in this paper for solving FS problems. The two proposed variants are (1) the binary whale optimization algorithm using an S-shaped transfer function (bWOA-S) and (2) the binary whale optimization algorithm using a V-shaped transfer function (bWOA-V). In both approaches, the accuracy of the K-NN classifier [58] is used as an objective function that must be maximized. K-NN with leave-one-out cross-validation (LOOCV) based on Euclidean distance is also used to investigate the performance of the compared algorithms. The experimental results were evaluated on 24 datasets from the UCI repository [59]. The results of the two proposed algorithms were compared against different well-known algorithms in this domain, namely (1) the particle swarm optimizer (PSO) [60], (2) three versions of the binary ant lion optimizer (bALO1, bALO2, and bALO3) [50], (3) the binary grey wolf optimizer (bGWO) [51], (4) the binary dragonfly algorithm (bDA) [61], and (5) the original WOA. The reason behind choosing these algorithms is that PSO is one of the most famous and well-known algorithms, while bALO, bGWO, and bDA are recent algorithms whose performance has been shown to be significant. Hence, we implemented the compared algorithms following the original studies and generated new results using these methods under the same conditions. The experimental results revealed that bWOA-S and bWOA-V achieved higher classification accuracy with better feature reduction than the compared algorithms.
Therefore, the merits of the proposed algorithms over previous algorithms are illustrated by the following two aspects. First, bWOA-S and bWOA-V ensure not only feature reduction but also the selection of relevant features. Second, bWOA-S and bWOA-V use a wrapper search technique for selecting prominent features, so the selection is driven mainly by high classification accuracy rather than by the number of selected features. The wrapper method maintains an efficient balance between exploitation and exploration, so correct information about the features is provided [62]. Thus, bWOA-S and bWOA-V achieve a strong search capability that helps to select a minimal subset from the pool of the most significant features.
The rest of the paper is organized as follows: Section 2 briefly introduces the WOA. Section 3 describes the two binary versions of the whale optimization algorithm (bWOA), namely bWOA-S and bWOA-V, for feature selection. Section 4 discusses the empirical results for bWOA-S and bWOA-V. Eventually, conclusions and future work are drawn in Section 5.

2. Whale Optimization Algorithm

In [26], Mirjalili et al. introduced the whale optimization algorithm (WOA), based on the behaviour of whales. The most interesting behaviour of humpback whales is their special hunting method, called bubble-net feeding. In the classical WOA, the current best candidate solution is assumed to be close to either the optimum or the target prey, and the other whales update their positions towards this best solution. Mathematically, the WOA mimics the collective movements as follows
D = |C · X*(t) - X(t)|    (1)
X(t + 1) = X*(t) - A · D    (2)
where t denotes the current iteration, X is the position vector, and X* is the position vector of the best solution obtained so far. C and A are coefficient vectors that can be calculated from the following equations
A = 2 · a · r - a    (3)
C = 2 · r    (4)
where r is a random vector in the interval [0, 1] and a decreases linearly from 2 to 0 over the iterations. WOA has two different phases: exploitation (intensification) and exploration (diversification). In the diversification phase, the agents move to explore different regions of the search space, while in the intensification phase, the agents move to locally refine the current solutions.
The intensification phase: the intensification phase consists of two mechanisms. The first is the shrinking encircling mechanism, obtained by decreasing the value of a in Equation (3); note that A takes random values in the interval [-a, a]. The second is the spiral position update, in which the distance between the whale and the prey is calculated. To model the spiral movement, the following equation is used to mimic the helix-shaped movement.
X(t + 1) = D' · e^(bl) · cos(2πl) + X*(t)    (5)
In Equation (5), l is a value chosen randomly from [-1, 1], b is a constant that defines the shape of the logarithmic spiral, and D' = |X*(t) - X(t)| is the distance between the whale and the prey. A 50% probability is assumed for choosing between the spiral model and the shrinking encircling mechanism. Consequently, the mathematical model is established as follows
X(t + 1) = X*(t) - A · D,    if p < 0.5
X(t + 1) = D' · e^(bl) · cos(2πl) + X*(t),    if p ≥ 0.5    (6)
where p is a random number drawn from a uniform distribution in [0, 1].
The exploration phase: in the exploration phase, A takes random values with |A| ≥ 1 in order to force an agent to move away from the current best whale; this is formulated in Equations (7) and (8).
D = |C · X_rand - X|    (7)
X(t + 1) = X_rand - A · D    (8)
where X_rand is the position vector of a whale chosen randomly from the current population.
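To make the update rules above concrete, the following is a minimal NumPy sketch of one WOA iteration implementing Equations (1)-(8); the function and variable names (woa_step, positions, best), the per-dimension treatment of A, and the default spiral constant b = 1 are illustrative assumptions rather than the authors' implementation.

import numpy as np

def woa_step(positions, best, a, b=1.0):
    # One whale-position update per Equations (1)-(8); a sketch, not the authors' code.
    # positions: (n_whales, dim) array; best: (dim,) array holding X*; a: scalar decreased
    # linearly from 2 to 0 over the iterations; b: spiral shape constant (assumed 1 here).
    n, dim = positions.shape
    new_pos = np.empty_like(positions)
    for i in range(n):
        r = np.random.rand(dim)
        A = 2.0 * a * r - a                              # Equation (3)
        C = 2.0 * np.random.rand(dim)                    # Equation (4)
        p = np.random.rand()
        if p < 0.5:
            if np.all(np.abs(A) < 1):                    # shrinking encircling (exploitation)
                D = np.abs(C * best - positions[i])      # Equation (1)
                new_pos[i] = best - A * D                # Equation (2)
            else:                                        # exploration towards a random whale
                x_rand = positions[np.random.randint(n)]
                D = np.abs(C * x_rand - positions[i])    # Equation (7)
                new_pos[i] = x_rand - A * D              # Equation (8)
        else:                                            # spiral update (exploitation)
            l = np.random.uniform(-1, 1, dim)
            d_prime = np.abs(best - positions[i])        # distance to the prey
            new_pos[i] = d_prime * np.exp(b * l) * np.cos(2 * np.pi * l) + best  # Equation (5)
    return new_pos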

3. Binary Whale Optimization Algorithm

In the classical WOA, whales move inside a continuous search space in order to modify their positions. However, to solve FS problems, the solutions are limited to {0, 1} values. In order to solve feature selection problems, the continuous (free) positions must therefore be converted to their corresponding binary solutions. Two binary versions of WOA are introduced here to tackle problems such as FS and achieve superior results. The conversion is performed by applying a specific transfer function, either an S-shaped or a V-shaped function, in each dimension [63]. Transfer functions give the probability of switching the elements of the position vector from 0 to 1 and vice versa, i.e., they force the search agents to move in a binary space. Figure 1 shows the flow chart of the binary WOA. Algorithm 1 gives the pseudo code of the proposed bWOA-S and bWOA-V versions.

3.1. Approach 1: Proposed bWOA-S

The common S-shaped (Sigmoid) transfer function is used in this version; the corresponding update rule is shown in Equation (11). Figure 2 illustrates the curve of the Sigmoid function.

3.2. Approach 2: Proposed bWOA-V

In this version, the hyperbolic tangent function is applied. It is a common example of a V-shaped transfer function and is given in Equations (9) and (10).
y_k = |tanh(x_k)|    (9)
X_i^d(t + 1) = sel_d(t) if rand < S(x_i^k(t + 1)), and X_i^d(t + 1) = org_d(t) otherwise    (10)
y_k = 1 / (1 + e^(-x_i^k(t)))    (11)
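As an illustration of how Equations (9)-(11) map a continuous whale position into the binary space, the sketch below (in Python, with illustrative names) implements the two transfer functions and a binarization step; the exact switching rule behind the sel/org terms of Equation (10) is not spelled out in the text, so the bit-flip rule used here is an assumption.

import numpy as np

def s_shaped(x):
    # Sigmoid transfer function, Equation (11).
    return 1.0 / (1.0 + np.exp(-x))

def v_shaped(x):
    # Hyperbolic tangent transfer function, Equation (9).
    return np.abs(np.tanh(x))

def binarize(continuous_pos, current_binary, transfer):
    # Convert a continuous position into a binary one (a sketch of Equation (10)):
    # each bit changes when a uniform random number falls below the transfer value,
    # otherwise the previous bit is kept.  The assumed change is a bit flip.
    prob = transfer(continuous_pos)
    flip = np.random.rand(*prob.shape) < prob
    return np.where(flip, 1 - current_binary, current_binary).astype(int)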
Algorithm 1 Pseudo code of bWOA-S and bWOA-V
1: Input: n, the number of whales in the population.
2: MaxIter, the maximum number of iterations.
3: Output: position of the optimal whale.
4: Initialize a and n.
5: Calculate X*.
6: while current iteration < MaxIter do
7:     for each whale do
8:         Calculate a, A, C, p and l.
9:         if p < 0.5 then
10:            if |A| < 1 then
11:                Update the position of the whale using Equation (2).
12:            else (|A| ≥ 1)
13:                Choose a random search agent (X_rand).
14:                Update the position of the whale using Equation (8).
15:            end if
16:        else (p ≥ 0.5)
17:            Update the position of the whale using Equation (5).
18:        end if
19:        Update X(t + 1) using Equation (11) or (9).
20:    end for
21:    Update X* if there is a better solution.
22:    t = t + 1
23: end while
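A compact driver corresponding to Algorithm 1 could look as follows; it reuses the hypothetical woa_step, binarize and transfer helpers sketched earlier and takes the fitness function of Section 3.3 as a callable, so all names here are illustrative rather than the authors' code.

import numpy as np

def bwoa(fitness, dim, transfer, n_whales=8, max_iter=70, seed=None):
    # Binary WOA main loop (a sketch of Algorithm 1): fitness maps a 0/1 vector to the
    # value of Equation (13), to be minimized; transfer is s_shaped (bWOA-S) or
    # v_shaped (bWOA-V).
    rng = np.random.default_rng(seed)
    binary = (rng.random((n_whales, dim)) < 0.5).astype(int)      # random initialization
    continuous = binary.astype(float)
    scores = np.array([fitness(w) for w in binary])
    best_bin, best_score = binary[scores.argmin()].copy(), scores.min()
    for t in range(max_iter):
        a = 2.0 - 2.0 * t / max_iter                              # a decreases from 2 to 0
        continuous = woa_step(continuous, best_bin.astype(float), a)   # Equations (1)-(8)
        binary = binarize(continuous, binary, transfer)                # Equations (9)-(11)
        scores = np.array([fitness(w) for w in binary])
        if scores.min() < best_score:                             # update X* when improved
            best_bin, best_score = binary[scores.argmin()].copy(), scores.min()
    return best_bin, best_score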

3.3. bWOA-S and bWOA-V for Feature Selection

Two binary variants of the whale optimization algorithm, called bWOA-S and bWOA-V, are employed for solving the FS problem. If N is the number of features, then the number of possible feature subsets is 2^N, which is far too large to search exhaustively. Under such conditions, the proposed bWOA-S and bWOA-V algorithms perform an adaptive search of the feature space and provide the best combination of features. This combination is obtained by achieving the maximum classification accuracy and the minimum number of selected features. Equation (12) shows the fitness function used by the two proposed versions to evaluate individual whale positions.
F = α · γ_R(D) + β · |C - R| / |C|    (12)
where F denotes the fitness function, R is the selected feature subset, C is the total number of features, γ_R(D) is the classification accuracy of the condition attribute set R, and α and β are two parameters that weight the classification accuracy and the subset length, with α ∈ [0, 1] and β = 1 - α. Maximizing this fitness function yields the maximum classification accuracy. Equation (12) can also be converted into a minimization problem based on the classification error rate and the number of selected features; the resulting minimization problem is given in Equation (13)
F = α · E_R(D) + β · |R| / |C|    (13)
where F denotes the fitness function and E_R(D) is the classification error rate. In accordance with the characteristics of wrapper methods in FS, a classifier is employed to guide the selection; in this study, the K-NN classifier is used. K-NN is applied to ensure that the selected features are the most relevant ones, while bWOA is the search method that explores the feature space in order to optimize the feature evaluation criterion of Equation (13).
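A wrapper fitness function following Equation (13) can be sketched as below; it assumes scikit-learn, uses K-NN with leave-one-out cross-validation as described in Section 1, and takes the α = 0.99, β = 0.01 values reported in Table 1. The function name and the handling of the empty subset are illustrative choices.

import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import LeaveOneOut, cross_val_score

ALPHA, BETA = 0.99, 0.01          # weights from Table 1

def fitness(binary_pos, X, y, k=5):
    # Equation (13): alpha * error_rate + beta * |R| / |C|, where binary_pos marks
    # the selected features (1 = selected) and |C| is the total number of features.
    selected = np.flatnonzero(binary_pos)
    if selected.size == 0:        # an empty subset is given the worst possible value
        return 1.0
    knn = KNeighborsClassifier(n_neighbors=k)
    accuracy = cross_val_score(knn, X[:, selected], y, cv=LeaveOneOut()).mean()
    return ALPHA * (1.0 - accuracy) + BETA * selected.size / X.shape[1]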

4. Experimental Results and Discussion

The two proposed bWOA-S and bWOA-V methods are compared with a group of existing algorithms, including PSO, three variants of the binary ant lion optimizer (bALO1, bALO2, and bALO3), bGWO, bDA, and the original WOA. Table 1 reports the parameter settings for the competitor algorithms. In order to provide a fair comparison, three initialization scenarios are used, and the experiments are performed on 24 different datasets from the UCI repository.

4.1. Data Acquisition

Table 2 summarizes the 24 datasets from the UCI machine learning repository [59] that were used in the experiments. The datasets were selected with different numbers of instances and attributes to represent various kinds of problems (small, medium, and large). In each dataset, the instances are divided randomly into three subsets, namely training, testing, and validation subsets. The proposed algorithms were also tested on three gene expression datasets: colon cancer, lymphoma, and leukemia [64,65,66]. K-NN is used in the experiments; the value of K was chosen by trial and error, and K = 5 gave the best results. Meanwhile, every whale position encodes one attribute subset during the training process. The training set is used to train the K-NN classifier, whose performance is evaluated on the validation subset throughout the optimization process. The bWOA is employed to guide the FS process.
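The random three-way split and the K-NN (K = 5) evaluation described above can be sketched as follows; the split proportions are not stated in the paper, so the fractions below are illustrative, as are the function names.

import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def split_three_way(X, y, fractions=(0.34, 0.33), seed=0):
    # Randomly divide the instances into training, validation and test subsets.
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    n_tr = int(fractions[0] * len(y))
    n_va = int(fractions[1] * len(y))
    tr, va, te = idx[:n_tr], idx[n_tr:n_tr + n_va], idx[n_tr + n_va:]
    return (X[tr], y[tr]), (X[va], y[va]), (X[te], y[te])

def validation_accuracy(subset, train, valid, k=5):
    # Train K-NN on the training subset restricted to the selected features and
    # score it on the validation subset, as used to guide the search.
    (X_tr, y_tr), (X_va, y_va) = train, valid
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_tr[:, subset], y_tr)
    return knn.score(X_va[:, subset], y_va)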

4.2. Evaluation Criteria

Each algorithm was run 20 times independently with random initial positions of the search agents. Repeated runs were used to test convergence capability. Eight well-known and common measures were recorded in order to compare the algorithms' performance; these metrics are listed below, and a short sketch of how they can be computed follows the list:
  • Best: the minimum (i.e., best, for a minimization problem) fitness value obtained over the independent runs, as given in Equation (14).
    Best = min_{i=1,...,M} g_i    (14)
  • Worst: the maximum (i.e., worst, for a minimization problem) fitness value obtained over the independent runs, as given in Equation (15).
    Worst = max_{i=1,...,M} g_i    (15)
  • Mean: the average performance of the optimization algorithm over M runs, as given in Equation (16).
    Mean = (1/M) · Σ_{i=1}^{M} g_i    (16)
    where g_i is the optimal solution obtained in the i-th run;
  • Standard deviation (Std): calculated from Equation (17).
    Std = sqrt( (1/M) · Σ_{i=1}^{M} (g_i - Mean)^2 )    (17)
  • Average classification accuracy: measures the accuracy of the classifier and is calculated by Equation (18).
    AverageAccuracy = (1/M) · Σ_{j=1}^{M} (1/N) · Σ_{i=1}^{N} Match(C_i, L_i)    (18)
    where C_i is the classifier output for instance i, N is the number of instances in the test set, and L_i is the reference class label of instance i;
  • Average selection size (Avg-Selection): measures the average reduction in selected features over all feature sets and is calculated by Equation (19).
    AverageSelectionSize = (1/M) · Σ_{i=1}^{M} size(g_i) / N_t    (19)
    where N_t is the total number of features in the original dataset;
  • Average execution time (Avg-Time): measures the average execution time, in milliseconds, required by each compared optimization algorithm over the different runs, calculated by Equation (20).
    R_a = (1/M) · Σ_{i=1}^{M} RunT_{a,i}    (20)
    where M is the number of runs of optimizer a, and RunT_{a,i} is the computational time in milliseconds of optimizer a at run i;
  • Wilcoxon rank-sum test (Wilcoxon): a non-parametric test [67] that ranks all scores from the two groups together and then sums the ranks within each group. The rank-sum test is often described as the non-parametric counterpart of the t-test for two independent groups.
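The measures listed above reduce to simple aggregations over the per-run results; the sketch below (assuming NumPy and SciPy, with illustrative names) computes Equations (14)-(17) and (20) and performs the pairwise Wilcoxon rank-sum test at the 5% level.

import numpy as np
from scipy.stats import ranksums

def summarize_runs(fitness_per_run, time_per_run):
    # Best, worst, mean, standard deviation (Equations (14)-(17)) and average
    # execution time (Equation (20)) over M independent runs.
    g = np.asarray(fitness_per_run, dtype=float)
    return {"best": g.min(), "worst": g.max(), "mean": g.mean(),
            "std": g.std(), "avg_time": float(np.mean(time_per_run))}

def wilcoxon_rank_sum(runs_a, runs_b, alpha=0.05):
    # Pairwise Wilcoxon rank-sum test between the per-run results of two optimizers;
    # the difference is taken as significant when the p-value is below alpha.
    _, p_value = ranksums(runs_a, runs_b)
    return p_value, p_value < alpha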
The two proposed versions of the whale optimization algorithm (bWOA-S and bWOA-V) are compared with well-known algorithms in this domain. Four different initialization methods are used to verify the ability of the two proposed algorithms to converge from different initial positions. These methods are: (1) large initialization, which is expected to evaluate the local search capability of a given algorithm, as the search agents' positions are commonly close to the optimal solution; (2) small initialization, which is expected to evaluate the global search ability of a given algorithm from the initial search; (3) mixed initialization, in which some search agents are close to the optimal solution whereas the others are far from it, which frequently provides population diversity since the search agents are expected to be apart from each other; and (4) random initialization.
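The four initialization schemes can be sketched as below; the paper does not give the exact probability of a feature being initially selected, so the 0.1 and 0.9 values used for the small and large schemes are illustrative assumptions.

import numpy as np

def init_population(n_whales, dim, mode="mixed", seed=None):
    # Generate initial binary whale positions for the small, large, mixed and
    # random initialization schemes described above.
    rng = np.random.default_rng(seed)
    if mode == "small":
        probs = np.full(n_whales, 0.1)     # agents start with few selected features
    elif mode == "large":
        probs = np.full(n_whales, 0.9)     # agents start with most features selected
    elif mode == "mixed":
        probs = np.where(np.arange(n_whales) % 2 == 0, 0.1, 0.9)   # half small, half large
    else:                                  # "random"
        probs = np.full(n_whales, 0.5)
    return (rng.random((n_whales, dim)) < probs[:, None]).astype(int)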

4.3. Performance on Small Initialization

The statistical average fitness values of the different datasets obtained by the compared algorithms using the small initialization method are shown in Table 3. Table 4 shows the average classification accuracy on the test data of the compared algorithms using the small initialization method. From these tables, we can conclude that both bWOA-S and bWOA-V achieve better results than the other algorithms.

4.4. Performance on Large Initialization

The statistical average fitness values of the different datasets obtained by the compared algorithms using the large initialization method are shown in Table 5. Table 6 shows the average classification accuracy on the test data of the compared algorithms using the large initialization method. From these tables, we can conclude that, when using the large initialization method, both bWOA-S and bWOA-V achieve better results than the other algorithms.

4.5. Performance on Mixed Initialization

The statistical average fitness values on the different datasets obtained by the compared algorithms using the mixed initialization method are shown in Table 7. Table 8 shows the average classification accuracy on the test data of the compared algorithms using the mixed initialization method. As is notable from these tables, both bWOA-S and bWOA-V achieve better results than the other algorithms.

4.6. Discussion

Figure 3 shows the effect of the initialization method on the different optimizers applied over the selected datasets. The proposed bWOA-S and bWOA-V reach the global optimal solution in almost half of the datasets, compared to the other algorithms, for all initialization methods. The limited search space in the case of binary algorithms explains the enhanced performance due to the balance between global and local searching; this balance helps the optimization algorithm avoid early convergence and local optima. The small initialization keeps the initial search agents away from the optimal solution; in the large initialization, by contrast, the search agents are closest to the optimal solution but have low diversity. While the mixed initialization method improves the performance of all compared algorithms, the two proposed algorithms are superior even on high-dimensional datasets, as shown in Table 9.
The standard deviation of the obtained fitness values on the different datasets for the compared algorithms, averaged over the initialization methods, is given in Table 10. As shown in this table, the proposed bWOA-V reaches the optimal solution more reliably than the compared algorithms, regardless of the initialization used.
With regard to the time consumed in optimizing the test datasets, Table 11 presents the average time obtained by the two proposed versions and the other compared algorithms over 20 independent runs. As can be concluded from Table 11, bWOA-V ranks first among the algorithms. bWOA-S ranks fifth, yet it remains better than PSO and bALO, and it significantly outperforms the other compared algorithms at the cost of slightly more time.
On the other hand, Table 12 and Table 13 summarize the experimental results of the best and worst obtained fitness for the compared algorithms over 20 independent runs.
Table 14 reports the ratio of mean selected features obtained by the compared algorithms. As shown in Table 14, bWOA-V is superior in keeping good classification accuracy while selecting a lower number of features, which reveals its outstanding performance in both reducing the features and enhancing the optimization process.
In order to compare the results of each run, a non-parametric statistical test called Wilcoxon's rank-sum (WRS) was carried out over the 11 UCI datasets at the 5% significance level, and the p-values are given in Table 15. From this table, the p-values for bWOA-V are mostly less than 0.05, which indicates that its superiority is statistically significant. This means that bWOA-V exhibits statistically superior performance compared to the other algorithms in the pair-wise Wilcoxon rank-sum test.
Moreover, Figure 4 outlines the best and worst fitness function values averaged over all the datasets using small, mixed and large initialization. Figure 5 shows the average classification accuracy. From these figures, it can be seen that bWOA-V performs better than the other compared algorithms, such as PSO and bALO, which confirms bWOA-V's search capability, especially with the large initialization.
In order to show the merits of bWOA-S and bWOA-V qualitatively, Figure 6, Figure 7 and Figure 8 show the boxplots for the three initialization methods obtained by all compared algorithms. According to these figures, bWOA-S and bWOA-V are superior, since their boxplots are extremely narrow and located below the minima of PSO, bALO, and the original WOA. In summary, the qualitative results show that the two proposed algorithms provide remarkable convergence and coverage in solving FS problems. Another fact worth mentioning is that the boxplots show that the bALO and PSO algorithms provide poor performance.

5. Conclusions and Future Work

In this paper, two binary versions of the original whale optimization algorithm (WOA), called bWOA-S and bWOA-V, have been proposed to solve the FS problem. To convert the original WOA into a binary algorithm, S-shaped and V-shaped transfer functions are employed. In order to investigate the performance of the two proposed algorithms, the experiments employ 24 benchmark datasets from the UCI repository and eight evaluation criteria that assess different aspects of the compared algorithms. The experimental results revealed that the two proposed algorithms achieved superior results compared to three well-known algorithms, namely PSO, bALO (three variants), and the original WOA. Furthermore, the results showed that bWOA-S and bWOA-V both achieved the smallest number of selected features with the best classification accuracy in minimal time. In addition, Wilcoxon's rank-sum non-parametric statistical test was carried out at the 5% significance level to judge whether the results of the two proposed algorithms differ from the best results of the other compared algorithms in a statistically significant way. More specifically, the results showed that bWOA-S and bWOA-V have merit among binary optimization algorithms. For future work, the two binary algorithms introduced here will be applied to high-dimensional real-world applications and will be combined with other common classifiers, such as SVM and ANN, to verify their performance. The effect of different transfer functions on the performance of the two proposed algorithms is also worth investigating, the algorithms can be applied to many problems other than FS, and a multi-objective version can be explored.

Author Contributions

A.G.H.: Software, Resources, Writing—original draft, editing. D.O.: Conceptualization, Data curation, Resources, Writing—review and editing. E.H.H.: Supervision, Methodology, Conceptualization, Formal analysis, Writing—review and editing. A.A.J.: Formal analysis, Writing—review and editing. X.Y.: Formal analysis, Writing—review and editing. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Conflicts of Interest

The authors declare that there is no conflict of interest.

References

  1. Yi, J.H.; Deb, S.; Dong, J.; Alavi, A.H.; Wang, G.G. An improved NSGA-III algorithm with adaptive mutation operator for Big Data optimization problems. Future Gener. Comput. Syst. 2018, 88, 571–585. [Google Scholar] [CrossRef]
  2. Neggaz, N.; Houssein, E.H.; Hussain, K. An efficient henry gas solubility optimization for feature selection. Expert Syst. Appl. 2020, 152, 113364. [Google Scholar] [CrossRef]
  3. Sayed, S.A.F.; Nabil, E.; Badr, A. A binary clonal flower pollination algorithm for feature selection. Pattern Recognit. Lett. 2016, 77, 21–27. [Google Scholar] [CrossRef]
  4. Martin-Bautista, M.J.; Vila, M.A. A survey of genetic feature selection in mining issues. In Proceedings of the 1999 Congress on Evolutionary Computation-CEC99 (Cat. No. 99TH8406), Washington, DC, USA, 6–9 July 1999; Volume 2, pp. 1314–1321. [Google Scholar]
  5. Piramuthu, S. Evaluating feature selection methods for learning in data mining applications. Eur. J. Oper. Res. 2004, 156, 483–494. [Google Scholar] [CrossRef]
  6. Gunal, S.; Edizkan, R. Subspace based feature selection for pattern recognition. Inf. Sci. 2008, 178, 3716–3726. [Google Scholar] [CrossRef]
  7. Lew, M.S. Principles of Visual Information Retrieval; Springer Science & Business Media: Berlin, Germany, 2013. [Google Scholar]
  8. Wang, G.G.; Tan, Y. Improving metaheuristic algorithms with information feedback models. IEEE Trans. Cybern. 2017, 49, 542–555. [Google Scholar] [CrossRef] [PubMed]
  9. Houssein, E.H.; Hosney, M.E.; Elhoseny, M.; Oliva, D.; Mohamed, W.M.; Hassaballah, M. Hybrid Harris hawks optimization with cuckoo search for drug design and discovery in chemoinformatics. Sci. Rep. 2020, 10, 1–22. [Google Scholar] [CrossRef] [PubMed]
  10. Houssein, E.H.; Hosney, M.E.; Oliva, D.; Mohamed, W.M.; Hassaballah, M. A novel hybrid Harris hawks optimization and support vector machines for drug design and discovery. Comput. Chem. Eng. 2020, 133, 106656. [Google Scholar] [CrossRef]
  11. Gao, D.; Wang, G.G.; Pedrycz, W. Solving fuzzy job-shop scheduling problem using de algorithm improved by a selection mechanism. IEEE Trans. Fuzzy Syst. 2020. [Google Scholar] [CrossRef]
  12. Houssein, E.H.; Saad, M.R.; Hussain, K.; Zhu, W.; Shaban, H.; Hassaballah, M. Optimal sink node placement in large scale wireless sensor networks based on Harris’ hawk optimization algorithm. IEEE Access 2020, 8, 19381–19397. [Google Scholar] [CrossRef]
  13. Ahmed, M.M.; Houssein, E.H.; Hassanien, A.E.; Taha, A.; Hassanien, E. Maximizing lifetime of large-scale wireless sensor networks using multi-objective whale optimization algorithm. Telecommun. Syst. 2019, 72, 243–259. [Google Scholar] [CrossRef]
  14. Chandrashekar, G.; Sahin, F. A survey on feature selection methods. Comput. Electr. Eng. 2014, 40, 16–28. [Google Scholar] [CrossRef]
  15. Liu, H.; Yu, L. Toward integrating feature selection algorithms for classification and clustering. IEEE Trans. Knowl. Data Eng. 2005, 17, 491–502. [Google Scholar]
  16. Kohavi, R.; John, G.H. Wrappers for feature subset selection. Artif. Intell. 1997, 97, 273–324. [Google Scholar] [CrossRef] [Green Version]
  17. Liu, H.; Setiono, R. A probabilistic approach to feature selection-a filter solution. In Proceedings of the 13th International Conference on Machine Learning, Bari, Italy, 3–6 July 1996; Volume 23, pp. 319–327. [Google Scholar]
  18. Hastie, T.; Tibshirani, R.; Friedman, J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction; Springer: New York, NY, USA, 2002. [Google Scholar]
  19. Safavian, S.R.; Landgrebe, D. A survey of decision tree classifier methodology. IEEE Trans. Syst. Man, Cybern. 1991, 21, 660–674. [Google Scholar] [CrossRef] [Green Version]
  20. Dasarathy, B.V. Nearest Neighbor ({NN}) Norms:{NN} Pattern Classification Techniques; IEEE Computer Society Press: Washington, DC, USA, 1991. [Google Scholar]
  21. Verikas, A.; Bacauskiene, M. Feature selection with neural networks. Pattern Recognit. Lett. 2002, 23, 1323–1335. [Google Scholar] [CrossRef]
  22. Vapnik, V.N.; Vapnik, V. Statistical Learning Theory; Wiley: New York, NY, USA, 1998; Volume 1. [Google Scholar]
  23. Khalid, S.; Khalil, T.; Nasreen, S. A survey of feature selection and feature extraction techniques in machine learning. In Proceedings of the Science and Information Conference (SAI), London, UK, 27–29 August 2014; pp. 372–378. [Google Scholar]
  24. Wang, G.G.; Guo, L.; Gandomi, A.H.; Hao, G.S.; Wang, H. Chaotic krill herd algorithm. Inf. Sci. 2014, 274, 17–34. [Google Scholar] [CrossRef]
  25. Hassanien, A.E.; Emary, E. Swarm Intelligence: Principles, Advances, and Applications; CRC Press: Boca Raton, FL, USA, 2016. [Google Scholar]
  26. Mirjalili, S.; Lewis, A. The whale optimization algorithm. Adv. Eng. Softw. 2016, 95, 51–67. [Google Scholar] [CrossRef]
  27. Hussien, A.G.; Amin, M.; Abd El Aziz, M. A comprehensive review of moth-flame optimisation: Variants, hybrids, and applications. J. Exp. Theor. Artif. Intell. 2020, 32, 705–725. [Google Scholar] [CrossRef]
  28. Assiri, A.S.; Hussien, A.G.; Amin, M. Ant Lion Optimization: Variants, hybrids, and applications. IEEE Access 2020, 8, 77746–77764. [Google Scholar] [CrossRef]
  29. Hussien, A.G.; Amin, M.; Wang, M.; Liang, G.; Alsanad, A.; Gumaei, A.; Chen, H. Crow Search Algorithm: Theory, Recent Advances, and Applications. IEEE Access 2020, 8, 173548–173565. [Google Scholar] [CrossRef]
  30. Shareef, H.; Ibrahim, A.A.; Mutlag, A.H. Lightning search algorithm. Appl. Soft Comput. 2015, 36, 315–333. [Google Scholar] [CrossRef]
  31. Hashim, F.A.; Houssein, E.H.; Mabrouk, M.S.; Al-Atabany, W.; Mirjalili, S. Henry gas solubility optimization: A novel physics-based algorithm. Future Gener. Comput. Syst. 2019, 101, 646–667. [Google Scholar] [CrossRef]
  32. Houssein, E.H.; Saad, M.R.; Hashim, F.A.; Shaban, H.; Hassaballah, M. Lévy flight distribution: A new metaheuristic algorithm for solving engineering optimization problems. Eng. Appl. Artif. Intell. 2020, 94, 103731. [Google Scholar] [CrossRef]
  33. Hashim, F.A.; Houssein, E.H.; Hussain, K.; Mabrouk, M.S.; Al-Atabany, W. A modified Henry gas solubility optimization for solving motif discovery problem. Neural Comput. Appl. 2020, 32, 10759–10771. [Google Scholar] [CrossRef]
  34. Fernandes, C.; Pontes, A.; Viana, J.; Gaspar-Cunha, A. Using multiobjective evolutionary algorithms in the optimization of operating conditions of polymer injection molding. Polym. Eng. Sci. 2010, 50, 1667–1678. [Google Scholar] [CrossRef]
  35. Gaspar-Cunha, A.; Covas, J.A. RPSGAe—Reduced Pareto set genetic algorithm: Application to polymer extrusion. In Metaheuristics for Multiobjective Optimisation; Springer: Berlin/Heidelberg, Germany, 2004; pp. 221–249. [Google Scholar]
  36. Avalos, O.; Cuevas, E.; Gálvez, J.; Houssein, E.H.; Hussain, K. Comparison of Circular Symmetric Low-Pass Digital IIR Filter Design Using Evolutionary Computation Techniques. Mathematics 2020, 8, 1226. [Google Scholar] [CrossRef]
  37. Xue, B.; Zhang, M.; Browne, W.N.; Yao, X. A survey on evolutionary computation approaches to feature selection. IEEE Trans. Evol. Comput. 2016, 20, 606–626. [Google Scholar] [CrossRef] [Green Version]
  38. Kabir, M.M.; Shahjahan, M.; Murase, K. A new hybrid ant colony optimization algorithm for feature selection. Expert Syst. Appl. 2012, 39, 3747–3763. [Google Scholar] [CrossRef]
  39. Ghaemi, M.; Feizi-Derakhshi, M.R. Feature selection using forest optimization algorithm. Pattern Recognit. 2016, 60, 121–129. [Google Scholar] [CrossRef]
  40. Emary, E.; Zawbaa, H.M.; Ghany, K.K.A.; Hassanien, A.E.; Parv, B. Firefly optimization algorithm for feature selection. In Proceedings of the 7th Balkan Conference on Informatics Conference, Craiova, Romania, 2–4 September 2015; p. 26. [Google Scholar]
  41. Mafarja, M.M.; Mirjalili, S. Hybrid Whale Optimization Algorithm with simulated annealing for feature selection. Neurocomputing 2017, 260, 302–312. [Google Scholar] [CrossRef]
  42. Xue, B.; Zhang, M.; Browne, W.N. Particle swarm optimization for feature selection in classification: A multi-objective approach. IEEE Trans. Cybern. 2013, 43, 1656–1671. [Google Scholar] [CrossRef]
  43. Hafez, A.I.; Zawbaa, H.M.; Emary, E.; Hassanien, A.E. Sine cosine optimization algorithm for feature selection. In Proceedings of the 2016 International Symposium on INnovations in Intelligent Systems and Applications (INISTA), Sinaia, Romania, 2–5 August 2016; pp. 1–5. [Google Scholar]
  44. Wang, G.G.; Deb, S.; Cui, Z. Monarch butterfly optimization. Neural Comput. Appl. 2019, 31, 1995–2014. [Google Scholar] [CrossRef] [Green Version]
  45. Wang, G.G. Moth search algorithm: A bio-inspired metaheuristic algorithm for global optimization problems. Memetic Comput. 2018, 10, 151–164. [Google Scholar] [CrossRef]
  46. Rodrigues, D.; Yang, X.S.; De Souza, A.N.; Papa, J.P. Binary flower pollination algorithm and its application to feature selection. In Recent Advances in Swarm Intelligence and Evolutionary Computation; Springer: Berlin/Heidelberg, Germany, 2015; pp. 85–100. [Google Scholar]
  47. Nakamura, R.Y.; Pereira, L.A.; Costa, K.; Rodrigues, D.; Papa, J.P.; Yang, X.S. BBA: A binary bat algorithm for feature selection. In Proceedings of the 2012 25th SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI), Ouro Preto, Brazil, 22–25 August 2012; pp. 291–297. [Google Scholar]
  48. Rodrigues, D.; Pereira, L.A.; Almeida, T.; Papa, J.P.; Souza, A.; Ramos, C.C.; Yang, X.S. BCS: A binary cuckoo search algorithm for feature selection. In Proceedings of the 2013 IEEE International Symposium on Circuits and Systems (ISCAS), Beijing, China, 19–23 May 2013; pp. 465–468. [Google Scholar]
  49. He, X.; Zhang, Q.; Sun, N.; Dong, Y. Feature selection with discrete binary differential evolution. In Proceedings of the 2009 International Conference on Artificial Intelligence and Computational Intelligence, Shanghai, China, 7–8 November 2009; Volume 4, pp. 327–330. [Google Scholar]
  50. Emary, E.; Zawbaa, H.M.; Hassanien, A.E. Binary ant lion approaches for feature selection. Neurocomputing 2016, 213, 54–65. [Google Scholar] [CrossRef]
  51. Emary, E.; Zawbaa, H.M.; Hassanien, A.E. Binary grey wolf optimization approaches for feature selection. Neurocomputing 2016, 172, 371–381. [Google Scholar] [CrossRef]
  52. Rashedi, E.; Nezamabadi-Pour, H.; Saryazdi, S. BGSA: Binary gravitational search algorithm. Nat. Comput. 2010, 9, 727–745. [Google Scholar] [CrossRef]
  53. Hussien, A.G.; Hassanien, A.E.; Houssein, E.H. Swarming behaviour of salps algorithm for predicting chemical compound activities. In Proceedings of the 2017 Eighth International Conference on Intelligent Computing and Information Systems (ICICIS), Cairo, Egypt, 5–7 December 2017; pp. 315–320. [Google Scholar]
  54. Nezamabadi-pour, H.; Rostami-Shahrbabaki, M.; Maghfoori-Farsangi, M. Binary particle swarm optimization: Challenges and new solutions. CSI J. Comput. Sci. Eng. 2008, 6, 21–32. [Google Scholar]
  55. Hussien, A.G.; Houssein, E.H.; Hassanien, A.E. A binary whale optimization algorithm with hyperbolic tangent fitness function for feature selection. In Proceedings of the 2017 Eighth International Conference on Intelligent Computing and Information Systems (ICICIS), Cairo, Egypt, 5–7 December 2017; pp. 166–172. [Google Scholar]
  56. Hussien, A.G.; Hassanien, A.E.; Houssein, E.H.; Bhattacharyya, S.; Amin, M. S-shaped Binary Whale Optimization Algorithm for Feature Selection. In Recent Trends in Signal and Image Processing; Springer: Berlin/Heidelberg, Germany, 2019; pp. 79–87. [Google Scholar]
  57. Hussien, A.G.; Hassanien, A.E.; Houssein, E.H.; Amin, M.; Azar, A.T. New binary whale optimization algorithm for discrete optimization problems. Eng. Optim. 2020, 52, 945–959. [Google Scholar] [CrossRef]
  58. Chuang, L.Y.; Chang, H.W.; Tu, C.J.; Yang, C.H. Improved binary PSO for feature selection using gene expression data. Comput. Biol. Chem. 2008, 32, 29–38. [Google Scholar] [CrossRef]
  59. Dua, D.; Graff, C. UCI Machine Learning Repository. 2017. Available online: http://archive.ics.uci.edu/ml (accessed on 28 September 2020).
  60. Eberhart, R.; Kennedy, J. A new optimizer using particle swarm theory. In Proceedings of the Sixth International Symposium on Micro Machine and Human Science, Nagoya, Japan, 4–6 October 1995; pp. 39–43. [Google Scholar]
  61. Mafarja, M.M.; Eleyan, D.; Jaber, I.; Hammouri, A.; Mirjalili, S. Binary dragonfly algorithm for feature selection. In Proceedings of the 2017 International Conference on New Trends in Computing Sciences (ICTCS), Amman, Jordan, 11–13 October 2017; pp. 12–17. [Google Scholar]
  62. D’Angelo, G.; Palmieri, F. GGA: A modified Genetic Algorithm with Gradient-based Local Search for Solving Constrained Optimization Problems. Inf. Sci. 2020, 547, 136–162. [Google Scholar] [CrossRef]
  63. Mirjalili, S.; Hashim, S.Z.M. BMOA: Binary magnetic optimization algorithm. Int. J. Mach. Learn. Comput. 2012, 2, 204. [Google Scholar] [CrossRef] [Green Version]
  64. Alon, U.; Barkai, N.; Notterman, D.A.; Gish, K.; Ybarra, S.; Mack, D.; Levine, A.J. Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. Proc. Natl. Acad. Sci. USA 1999, 96, 6745–6750. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  65. Alizadeh, A.A.; Eisen, M.B.; Davis, R.E.; Ma, C.; Lossos, I.S.; Rosenwald, A.; Boldrick, J.C.; Sabet, H.; Tran, T.; Yu, X.; et al. Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature 2000, 403, 503–511. [Google Scholar] [CrossRef] [PubMed]
  66. Golub, T.R.; Slonim, D.K.; Tamayo, P.; Huard, C.; Gaasenbeek, M.; Mesirov, J.P.; Coller, H.; Loh, M.L.; Downing, J.R.; Caligiuri, M.A.; et al. Molecular classification of cancer: Class discovery and class prediction by gene expression monitoring. Science 1999, 286, 531–537. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  67. Wilcoxon, F. Individual comparisons by ranking methods. Biom. Bull. 1945, 1, 80–83. [Google Scholar] [CrossRef]
Figure 1. Binary whale optimization algorithm flowchart.
Figure 2. S-shaped and V-shaped transfer functions.
Figure 3. Statistical mean fitness averaged on the different datasets for the different optimizers using the different initializers.
Figure 4. Best and worst fitness obtained for the compared algorithms on the different datasets averaged over the four initialization methods.
Figure 5. Average classification accuracy and average selection size obtained on the different datasets averaged for the compared algorithms over the three initialization methods.
Figure 6. Small initialization boxplot for the compared algorithms on the different datasets.
Figure 7. Mixed initialization boxplot for the compared algorithms on the different datasets.
Figure 8. Large initialization boxplot for the compared algorithms on the different datasets.
Table 1. Parameter settings.
Parameter | Value
No. of search agents | 8
No. of iterations | 70
Problem dimension | No. of features in the data
Data search domain | [0, 1]
No. of repetitions of runs | 20
Inertia factor of PSO | 0.1
Individual-best acceleration factor of PSO | 0.1
α parameter in the fitness function | 0.99
β parameter in the fitness function | 0.01
Table 2. List of datasets used in the experiments.
No. | Name | Features | Samples
1 | Breastcancer | 9 | 699
2 | Tic-tac-toe | 9 | 958
3 | Zoo | 16 | 101
4 | WineEW | 13 | 178
5 | SpectEW | 22 | 267
6 | SonarEW | 60 | 208
7 | IonosphereEW | 34 | 351
8 | HeartEW | 13 | 270
9 | CongressEW | 16 | 435
10 | KrvskpEW | 36 | 3196
11 | WaveformEW | 40 | 5000
12 | Exactly | 13 | 1000
13 | Exactly 2 | 13 | 1000
14 | M-of-N | 13 | 1000
15 | Vote | 16 | 300
16 | BreastEW | 30 | 569
17 | Semeion | 265 | 1593
18 | Clean 1 | 166 | 476
19 | Clean 2 | 166 | 6598
20 | Lymphography | 18 | 148
21 | PenglungEW | 325 | 73
22 | Colon | 2000 | 62
23 | Lymphoma | 96 | 4026
24 | Leukemia | 7129 | 72
Table 3. Statistical mean fitness measure on the different datasets calculated for the compared algorithms using small initialization.
No. | WOA | bWOA-S | bWOA-V | bALO1 | bALO2 | bALO3 | PSO | bGWO | bDA
1 | 0.061 | 0.049 | 0.051 | 0.079 | 0.095 | 0.088 | 0.060 | 0.035 | 0.031
2 | 0.327 | 0.224 | 0.313 | 0.345 | 0.352 | 0.334 | 0.333 | 0.243 | 0.210
3 | 0.247 | 0.133 | 0.220 | 0.411 | 0.395 | 0.416 | 0.249 | 0.127 | 0.058
4 | 0.933 | 0.908 | 0.937 | 0.955 | 0.960 | 0.953 | 0.926 | 0.880 | 0.877
5 | 0.345 | 0.295 | 0.340 | 0.351 | 0.391 | 0.375 | 0.362 | 0.276 | 0.253
6 | 0.337 | 0.203 | 0.315 | 0.374 | 0.372 | 0.369 | 0.303 | 0.154 | 0.188
7 | 0.137 | 0.123 | 0.131 | 0.175 | 0.177 | 0.184 | 0.141 | 0.098 | 0.125
8 | 0.297 | 0.251 | 0.273 | 0.294 | 0.302 | 0.288 | 0.282 | 0.195 | 0.169
9 | 0.381 | 0.361 | 0.379 | 0.391 | 0.397 | 0.394 | 0.402 | 0.354 | 0.338
10 | 0.391 | 0.081 | 0.375 | 0.421 | 0.418 | 0.419 | 0.421 | 0.079 | 0.052
11 | 0.436 | 0.196 | 0.437 | 0.499 | 0.498 | 0.517 | 0.432 | 0.181 | 0.187
12 | 0.322 | 0.297 | 0.337 | 0.347 | 0.332 | 0.334 | 0.314 | 0.314 | 0.208
13 | 0.245 | 0.244 | 0.239 | 0.237 | 0.264 | 0.240 | 0.243 | 0.244 | 0.237
14 | 0.291 | 0.135 | 0.299 | 0.359 | 0.351 | 0.352 | 0.289 | 0.133 | 0.075
15 | 0.125 | 0.068 | 0.140 | 0.151 | 0.155 | 0.174 | 0.130 | 0.062 | 0.054
16 | 0.051 | 0.047 | 0.059 | 0.087 | 0.084 | 0.083 | 0.051 | 0.038 | 0.030
17 | 0.097 | 0.035 | 0.097 | 0.095 | 0.094 | 0.096 | 0.099 | 0.025 | 0.033
18 | 0.298 | 0.150 | 0.298 | 0.357 | 0.375 | 0.367 | 0.294 | 0.110 | 0.141
19 | 0.087 | 0.044 | 0.087 | 0.128 | 0.131 | 0.134 | 0.086 | 0.035 | 0.043
20 | 0.294 | 0.203 | 0.275 | 0.376 | 0.317 | 0.379 | 0.309 | 0.183 | 0.165
21 | 0.461 | 0.181 | 0.444 | 0.614 | 0.602 | 0.606 | 0.446 | 0.148 | 0.176
Table 4. Average classification accuracy for the compared algorithms on the different datasets using small initialization.
No. | WOA | bWOA-S | bWOA-V | bALO1 | bALO2 | bALO3 | PSO | bGWO | bDA
1 | 0.863 | 0.648 | 0.745 | 0.834 | 0.814 | 0.842 | 0.867 | 0.966 | 0.758
2 | 0.652 | 0.781 | 0.670 | 0.598 | 0.599 | 0.584 | 0.620 | 0.743 | 0.685
3 | 0.740 | 0.843 | 0.770 | 0.457 | 0.471 | 0.442 | 0.588 | 0.862 | 0.817
4 | 0.041 | 0.057 | 0.026 | 0.014 | 0.011 | 0.017 | 0.033 | 0.088 | 0.033
5 | 0.624 | 0.663 | 0.606 | 0.566 | 0.557 | 0.550 | 0.583 | 0.705 | 0.640
6 | 0.632 | 0.712 | 0.658 | 0.547 | 0.548 | 0.549 | 0.609 | 0.832 | 0.696
7 | 0.845 | 0.835 | 0.838 | 0.780 | 0.779 | 0.761 | 0.820 | 0.890 | 0.828
8 | 0.674 | 0.645 | 0.632 | 0.602 | 0.592 | 0.604 | 0.653 | 0.793 | 0.658
9 | 0.585 | 0.584 | 0.587 | 0.557 | 0.540 | 0.572 | 0.565 | 0.629 | 0.584
10 | 0.586 | 0.919 | 0.606 | 0.517 | 0.519 | 0.519 | 0.545 | 0.916 | 0.782
11 | 0.556 | 0.804 | 0.552 | 0.398 | 0.402 | 0.392 | 0.392 | 0.817 | 0.742
12 | 0.635 | 0.668 | 0.618 | 0.588 | 0.622 | 0.619 | 0.656 | 0.656 | 0.640
13 | 0.725 | 0.722 | 0.703 | 0.744 | 0.692 | 0.704 | 0.724 | 0.728 | 0.710
14 | 0.699 | 0.845 | 0.845 | 0.720 | 0.723 | 0.708 | 0.814 | 0.932 | 0.873
15 | 0.864 | 0.915 | 0.838 | 0.720 | 0.723 | 0.708 | 0.814 | 0.932 | 0.873
16 | 0.899 | 0.694 | 0.724 | 0.808 | 0.821 | 0.833 | 0.893 | 0.963 | 0.780
17 | 0.897 | 0.964 | 0.890 | 0.876 | 0.902 | 0.903 | 0.898 | 0.971 | 0.956
18 | 0.685 | 0.815 | 0.674 | 0.593 | 0.582 | 0.589 | 0.641 | 0.875 | 0.796
19 | 0.909 | 0.957 | 0.908 | 0.847 | 0.848 | 0.842 | 0.884 | 0.965 | 0.952
20 | 0.674 | 0.734 | 0.654 | 0.513 | 0.553 | 0.523 | 0.616 | 0.799 | 0.706
21 | 0.491 | 0.748 | 0.493 | 0.285 | 0.295 | 0.300 | 0.415 | 0.809 | 0.729
Table 5. Statistical mean fitness measure calculated on the different datasets for the compared algorithms using large initialization.
No. | WOA | bWOA-S | bWOA-V | bALO1 | bALO2 | bALO3 | PSO | bGWO | bDA
1 | 0.133 | 0.127 | 0.164 | 0.183 | 0.146 | 0.223 | 0.160 | 0.036 | 0.032
2 | 0.215 | 0.207 | 0.209 | 0.241 | 0.248 | 0.243 | 0.204 | 0.211 | 0.209
3 | 0.149 | 0.138 | 0.139 | 0.168 | 0.129 | 0.182 | 0.171 | 0.101 | 0.076
4 | 0.928 | 0.928 | 0.929 | 0.938 | 0.937 | 0.924 | 0.925 | 0.907 | 0.882
5 | 0.316 | 0.312 | 0.314 | 0.322 | 0.320 | 0.312 | 0.314 | 0.303 | 0.249
6 | 0.303 | 0.289 | 0.293 | 0.273 | 0.298 | 0.288 | 0.277 | 0.258 | 0.197
7 | 0.168 | 0.163 | 0.180 | 0.162 | 0.177 | 0.166 | 0.160 | 0.150 | 0.127
8 | 0.349 | 0.337 | 0.349 | 0.341 | 0.358 | 0.346 | 0.345 | 0.288 | 0.171
9 | 0.400 | 0.403 | 0.390 | 0.403 | 0.403 | 0.388 | 0.397 | 0.375 | 0.343
10 | 0.069 | 0.073 | 0.072 | 0.073 | 0.071 | 0.073 | 0.069 | 0.067 | 0.051
11 | 0.193 | 0.192 | 0.192 | 0.196 | 0.193 | 0.191 | 0.188 | 0.189 | 0.187
12 | 0.303 | 0.309 | 0.312 | 0.305 | 0.305 | 0.304 | 0.302 | 0.305 | 0.207
13 | 0.259 | 0.259 | 0.260 | 0.260 | 0.266 | 0.264 | 0.258 | 0.256 | 0.241
14 | 0.138 | 0.131 | 0.138 | 0.143 | 0.137 | 0.133 | 0.121 | 0.121 | 0.068
15 | 0.087 | 0.090 | 0.086 | 0.089 | 0.093 | 0.094 | 0.086 | 0.084 | 0.053
16 | 0.217 | 0.220 | 0.156 | 0.108 | 0.155 | 0.205 | 0.200 | 0.043 | 0.030
17 | 0.044 | 0.043 | 0.044 | 0.043 | 0.042 | 0.045 | 0.046 | 0.036 | 0.033
18 | 0.187 | 0.186 | 0.189 | 0.182 | 0.195 | 0.190 | 0.189 | 0.170 | 0.138
19 | 0.052 | 0.052 | 0.053 | 0.052 | 0.051 | 0.052 | 0.051 | 0.049 | 0.043
20 | 0.238 | 0.232 | 0.222 | 0.248 | 0.235 | 0.233 | 0.234 | 0.228 | 0.147
21 | 0.260 | 0.246 | 0.273 | 0.274 | 0.262 | 0.273 | 0.232 | 0.227 | 0.183
Table 6. Average classification accuracy on the different datasets for the compared algorithms using large initialization.
No. | WOA | bWOA-S | bWOA-V | bALO1 | bALO2 | bALO3 | PSO | bGWO | bDA
1 | 0.616 | 0.619 | 0.615 | 0.679 | 0.693 | 0.666 | 0.748 | 0.959 | 0.780
2 | 0.792 | 0.799 | 0.798 | 0.740 | 0.738 | 0.742 | 0.748 | 0.760 | 0.668
3 | 0.833 | 0.839 | 0.832 | 0.811 | 0.847 | 0.798 | 0.817 | 0.890 | 0.787
4 | 0.059 | 0.056 | 0.054 | 0.048 | 0.050 | 0.062 | 0.060 | 0.084 | 0.033
5 | 0.664 | 0.670 | 0.668 | 0.663 | 0.668 | 0.674 | 0.668 | 0.688 | 0.643
6 | 0.692 | 0.703 | 0.698 | 0.719 | 0.696 | 0.705 | 0.720 | 0.741 | 0.704
7 | 0.830 | 0.836 | 0.819 | 0.838 | 0.821 | 0.832 | 0.839 | 0.852 | 0.819
8 | 0.645 | 0.654 | 0.637 | 0.648 | 0.630 | 0.639 | 0.642 | 0.697 | 0.653
9 | 0.593 | 0.583 | 0.598 | 0.581 | 0.580 | 0.593 | 0.586 | 0.620 | 0.589
10 | 0.934 | 0.930 | 0.932 | 0.918 | 0.925 | 0.923 | 0.931 | 0.939 | 0.777
11 | 0.810 | 0.808 | 0.810 | 0.804 | 0.807 | 0.810 | 0.813 | 0.815 | 0.740
12 | 0.693 | 0.683 | 0.685 | 0.680 | 0.680 | 0.679 | 0.684 | 0.689 | 0.648
13 | 0.740 | 0.741 | 0.741 | 0.728 | 0.723 | 0.724 | 0.734 | 0.737 | 0.712
14 | 0.861 | 0.865 | 0.862 | 0.831 | 0.833 | 0.834 | 0.856 | 0.866 | 0.721
15 | 0.907 | 0.908 | 0.905 | 0.907 | 0.903 | 0.901 | 0.906 | 0.917 | 0.881
16 | 0.612 | 0.610 | 0.613 | 0.715 | 0.697 | 0.656 | 0.714 | 0.938 | 0.766
17 | 0.963 | 0.964 | 0.963 | 0.964 | 0.965 | 0.962 | 0.962 | 0.971 | 0.958
18 | 0.814 | 0.818 | 0.812 | 0.820 | 0.807 | 0.812 | 0.812 | 0.834 | 0.807
19 | 0.956 | 0.956 | 0.955 | 0.955 | 0.957 | 0.956 | 0.956 | 0.959 | 0.953
20 | 0.742 | 0.754 | 0.762 | 0.736 | 0.746 | 0.752 | 0.745 | 0.770 | 0.717
21 | 0.742 | 0.755 | 0.731 | 0.729 | 0.742 | 0.730 | 0.769 | 0.773 | 0.731
Table 7. Statistical mean fitness measure calculated on the different datasets for the compared algorithms using mixed initialization.
No. | WOA | bWOA-S | bWOA-V | bALO1 | bALO2 | bALO3 | PSO | bGWO | bDA
1 | 0.054 | 0.052 | 0.079 | 0.100 | 0.099 | 0.076 | 0.031 | 0.035 | 0.032
2 | 0.220 | 0.207 | 0.215 | 0.245 | 0.252 | 0.246 | 0.204 | 0.215 | 0.209
3 | 0.153 | 0.148 | 0.120 | 0.183 | 0.146 | 0.141 | 0.078 | 0.096 | 0.071
4 | 0.925 | 0.928 | 0.910 | 0.935 | 0.938 | 0.938 | 0.884 | 0.903 | 0.882
5 | 0.313 | 0.307 | 0.289 | 0.319 | 0.321 | 0.312 | 0.242 | 0.280 | 0.255
6 | 0.304 | 0.286 | 0.254 | 0.278 | 0.298 | 0.285 | 0.168 | 0.235 | 0.194
7 | 0.159 | 0.158 | 0.152 | 0.156 | 0.169 | 0.165 | 0.113 | 0.141 | 0.124
8 | 0.328 | 0.308 | 0.259 | 0.319 | 0.324 | 0.308 | 0.158 | 0.233 | 0.167
9 | 0.389 | 0.380 | 0.372 | 0.393 | 0.397 | 0.384 | 0.337 | 0.359 | 0.341
10 | 0.071 | 0.074 | 0.081 | 0.074 | 0.072 | 0.074 | 0.040 | 0.061 | 0.053
11 | 0.193 | 0.193 | 0.195 | 0.198 | 0.195 | 0.193 | 0.182 | 0.187 | 0.188
12 | 0.303 | 0.308 | 0.301 | 0.301 | 0.307 | 0.308 | 0.151 | 0.272 | 0.226
13 | 0.241 | 0.244 | 0.252 | 0.237 | 0.244 | 0.253 | 0.238 | 0.244 | 0.243
14 | 0.139 | 0.133 | 0.155 | 0.151 | 0.150 | 0.136 | 0.022 | 0.112 | 0.072
15 | 0.084 | 0.084 | 0.081 | 0.089 | 0.090 | 0.085 | 0.048 | 0.069 | 0.052
16 | 0.081 | 0.058 | 0.062 | 0.086 | 0.088 | 0.086 | 0.033 | 0.057 | 0.031
17 | 0.044 | 0.043 | 0.037 | 0.043 | 0.043 | 0.044 | 0.032 | 0.034 | 0.030
18 | 0.191 | 0.187 | 0.176 | 0.184 | 0.192 | 0.197 | 0.136 | 0.158 | 0.149
19 | 0.052 | 0.052 | 0.049 | 0.051 | 0.052 | 0.052 | 0.041 | 0.044 | 0.042
20 | 0.235 | 0.230 | 0.223 | 0.258 | 0.243 | 0.237 | 0.138 | 0.211 | 0.160
21 | 0.260 | 0.244 | 0.242 | 0.276 | 0.262 | 0.274 | 0.149 | 0.217 | 0.180
Table 8. Average classification accuracy on the different datasets for the compared algorithms using mixed initialization.
No. | WOA | bWOA-S | bWOA-V | bALO1 | bALO2 | bALO3 | PSO | bGWO | bDA
1 | 0.785 | 0.619 | 0.628 | 0.740 | 0.725 | 0.726 | 0.802 | 0.962 | 0.789
2 | 0.787 | 0.799 | 0.786 | 0.686 | 0.681 | 0.686 | 0.720 | 0.764 | 0.673
3 | 0.841 | 0.839 | 0.822 | 0.656 | 0.706 | 0.680 | 0.789 | 0.900 | 0.779
4 | 0.065 | 0.056 | 0.053 | 0.039 | 0.033 | 0.031 | 0.039 | 0.086 | 0.031
5 | 0.678 | 0.670 | 0.664 | 0.635 | 0.623 | 0.625 | 0.656 | 0.707 | 0.649
6 | 0.698 | 0.703 | 0.703 | 0.645 | 0.639 | 0.647 | 0.721 | 0.765 | 0.705
7 | 0.835 | 0.836 | 0.831 | 0.819 | 0.803 | 0.802 | 0.835 | 0.860 | 0.827
8 | 0.656 | 0.654 | 0.652 | 0.625 | 0.621 | 0.623 | 0.668 | 0.751 | 0.652
9 | 0.598 | 0.582 | 0.595 | 0.573 | 0.559 | 0.577 | 0.589 | 0.631 | 0.571
10 | 0.936 | 0.930 | 0.918 | 0.766 | 0.765 | 0.757 | 0.794 | 0.943 | 0.754
11 | 0.812 | 0.808 | 0.804 | 0.642 | 0.649 | 0.647 | 0.763 | 0.816 | 0.747
12 | 0.687 | 0.683 | 0.691 | 0.644 | 0.656 | 0.648 | 0.664 | 0.706 | 0.642
13 | 0.738 | 0.740 | 0.735 | 0.733 | 0.711 | 0.703 | 0.723 | 0.735 | 0.712
14 | 0.865 | 0.865 | 0.833 | 0.734 | 0.732 | 0.744 | 0.761 | 0.883 | 0.728
15 | 0.915 | 0.908 | 0.900 | 0.829 | 0.823 | 0.829 | 0.884 | 0.930 | 0.866
16 | 0.761 | 0.610 | 0.615 | 0.730 | 0.744 | 0.727 | 0.810 | 0.944 | 0.769
17 | 0.964 | 0.964 | 0.965 | 0.924 | 0.939 | 0.925 | 0.956 | 0.972 | 0.959
18 | 0.815 | 0.818 | 0.803 | 0.729 | 0.720 | 0.724 | 0.806 | 0.845 | 0.791
19 | 0.956 | 0.956 | 0.955 | 0.908 | 0.910 | 0.911 | 0.953 | 0.962 | 0.952
20 | 0.756 | 0.755 | 0.749 | 0.639 | 0.672 | 0.659 | 0.705 | 0.786 | 0.709
21 | 0.744 | 0.755 | 0.725 | 0.553 | 0.568 | 0.563 | 0.765 | 0.781 | 0.730
Table 9. Results for high dimensional datasets.
Dataset / Algorithm | Accuracy | STDEV | Fitness (Avg) | Fitness (Min) | Fitness (Max) | Time | SelSize
Colon
WOA | 0.67083 | 0.02710 | 0.52313 | 0.18933 | 0.33625 | 5.77346 | 0.52313
bWOA-S | 0.66667 | 0.03066 | 0.45386 | 0.18940 | 0.31566 | 15.37727 | 0.45386
bWOA-V | 0.66667 | 0.03003 | 0.49724 | 0.23179 | 0.35564 | 9.87549 | 0.49724
bALO1 | 0.62250 | 0.03513 | 0.46110 | 0.20995 | 0.35688 | 3.52489 | 0.46110
bALO2 | 0.62584 | 0.04386 | 0.47458 | 0.23059 | 0.37749 | 39.64500 | 0.47458
bALO3 | 0.62084 | 0.03544 | 0.49837 | 0.27183 | 0.35686 | 37.76940 | 0.49836
PSO | 0.66084 | 0.02626 | 0.48793 | 0.16870 | 0.31424 | 3.52425
bGWO1 | 0.79584 | 0.03536 | 0.35911 | 0.12644 | 0.27228 | 44.10091 | 0.35911
bDA | 0.65167 | 0.02854 | 0.43856 | 0.16915 | 0.25231 | 6.72146 | 0.43856
Lymphoma
WOA | 0.42628 | 0.06076 | 0.47314 | 0.38451 | 0.72184 | 13.73907 | 0.47314
bWOA-S | 0.35435 | 0.06035 | 0.44921 | 0.17422 | 0.71399 | 52.34015 | 0.44921
bWOA-V | 0.39457 | 0.05754 | 0.49642 | 0.37169 | 0.80664 | 22.35106 | 0.49642
bALO1 | 0.41973 | 0.06194 | 0.51039 | 0.42877 | 0.73545 | 8.27976 | 0.510395
bALO2 | 0.39939 | 0.06230 | 0.44844 | 0.33482 | 0.74161 | 77.97951 | 0.44844
bALO3 | 0.39923 | 0.05861 | 0.48594 | 0.41489 | 0.76677 | 81.05818 | 0.48594
PSO | 0.46635 | 0.05212 | 0.47878 | 0.18666 | 0.71151 | 7.31112
bGWO1 | 0.48642 | 0.05491 | 0.28062 | 0.26272 | 0.71343 | 89.87190 | 0.28062
bDA | 0.40717 | 0.03993 | 0.37595 | 0.32643 | 0.84836 | 16.47820 | 0.37595
Leukemia
WOA | 0.82353 | 0.08431 | 0.64732 | 0.15909 | 0.21848 | 30.54245 | 0.64732
bWOA-S | 0.82353 | 0.08471 | 0.69674 | 0.07869 | 0.15941 | 85.80836 | 0.69674
bWOA-V | 0.84647 | 0.07902 | 0.57925 | 0.09967 | 0.20281 | 45.17051 | 0.57925
bALO1 | 0.72500 | 0.08793 | 0.62512 | 0.14453 | 0.23919 | 14.68936 | 0.62511
bALO2 | 0.72471 | 0.09272 | 0.62429 | 0.15182 | 0.23920 | 171.695 | 0.62429
bALO3 | 0.73029 | 0.08913 | 0.62491 | 0.12273 | 0.23192 | 182.292 | 0.62491
PSO | 0.85059 | 0.08055 | 0.80121 | 0.06281 | 0.16521 | 5.26511
bGWO1 | 0.94588 | 0.07589 | 0.47347 | 0.02565 | 0.09169 | 205.829 | 0.47348
bDA | 0.83706 | 0.07626 | 0.48777 | 0.02671 | 0.06319 | 31.56270 | 0.48777
Table 10. Standard deviation fitness function on the different datasets averaged for the compared algorithms over the three initialization methods.
No. | WOA | bWOA-S | bWOA-v | BALO1 | BALO2 | BALO3 | PSO | bGWO | bDA
1 | 0.013 | 0.013 | 0.011 | 0.028 | 0.012 | 0.013 | 0.009 | 0.009 | 0.007
2 | 0.053 | 0.045 | 0.058 | 0.056 | 0.056 | 0.057 | 0.058 | 0.047 | 0.048
3 | 0.033 | 0.017 | 0.040 | 0.046 | 0.041 | 0.042 | 0.018 | 0.020 | 0.015
4 | 0.205 | 0.208 | 0.200 | 0.208 | 0.212 | 0.214 | 0.202 | 0.200 | 0.199
5 | 0.061 | 0.073 | 0.066 | 0.072 | 0.080 | 0.064 | 0.075 | 0.072 | 0.052
6 | 0.074 | 0.053 | 0.062 | 0.067 | 0.064 | 0.071 | 0.049 | 0.042 | 0.046
7 | 0.027 | 0.030 | 0.030 | 0.035 | 0.043 | 0.035 | 0.026 | 0.035 | 0.028
8 | 0.060 | 0.060 | 0.057 | 0.061 | 0.062 | 0.058 | 0.052 | 0.054 | 0.039
9 | 0.084 | 0.084 | 0.079 | 0.087 | 0.092 | 0.089 | 0.080 | 0.075 | 0.076
10 | 0.034 | 0.015 | 0.028 | 0.035 | 0.036 | 0.043 | 0.033 | 0.017 | 0.012
11 | 0.058 | 0.043 | 0.067 | 0.061 | 0.061 | 0.062 | 0.058 | 0.041 | 0.040
12 | 0.065 | 0.070 | 0.067 | 0.068 | 0.066 | 0.068 | 0.061 | 0.071 | 0.045
13 | 0.055 | 0.058 | 0.055 | 0.055 | 0.070 | 0.055 | 0.051 | 0.051 | 0.051
14 | 0.037 | 0.033 | 0.041 | 0.043 | 0.054 | 0.042 | 0.038 | 0.021 | 0.012
15 | 0.022 | 0.016 | 0.023 | 0.024 | 0.027 | 0.030 | 0.022 | 0.013 | 0.010
16 | 0.037 | 0.031 | 0.009 | 0.011 | 0.033 | 0.013 | 0.026 | 0.010 | 0.006
17 | 0.012 | 0.009 | 0.012 | 0.012 | 0.011 | 0.013 | 0.011 | 0.007 | 0.007
18 | 0.054 | 0.032 | 0.042 | 0.052 | 0.050 | 0.050 | 0.044 | 0.027 | 0.034
19 | 0.013 | 0.010 | 0.014 | 0.013 | 0.017 | 0.016 | 0.012 | 0.009 | 0.009
20 | 0.049 | 0.041 | 0.031 | 0.067 | 0.055 | 0.067 | 0.050 | 0.039 | 0.040
21 | 0.051 | 0.071 | 0.056 | 0.069 | 0.071 | 0.087 | 0.046 | 0.020 | 0.040
Table 11. Average execution time in seconds on the different datasets for the compared algorithms averaged over the three initialization methods.
No.WOAbWOA-SbWOA-vBALO1BALO2BALO3PSObGWObDA
14.7224.2434.4674.8964.7034.428906.1727.1774.629
29.7478.7849.1147.7917.4686.813498.25310.1496.720
33.7123.4603.7414.0194.0103.895033.8224.7713.859
411.09410.55712.45011.93410.9440410.77911.80414.94611.225
53.7253.3644.0724.1584.1953.922774.4235.2183.654
63.8353.5403.8163.6735.0144.836524.3165.7563.599
74.1394.2204.3764.0304.9784.733934.4565.6704.033
83.7143.1243.6423.6143.7963.871774.0295.3393.616
94.3533.7194.6804.1304.5024.661214.4111334.477
1078.31178.51677.18265.79564.66357.45839.67178.06351.987
1118021223449157153140112199116
126.6108.0688.6728.0047.2596.7406.2877.0116.468
137.2108.4229.8198.5546.9466.5546.7836.7207.123
147.3348.6386.9578.1696.3326.5197.7897.8566.569
153.2813.9013.3074.2673.6683.7174.2133.6953.303
164.2484.6003.9195.4644.7514.2944.9955.0903.813
1710713914491.55295.18577.56486.14099.636122
189.49711.97017.2098.41210.71011.4818.89310.4745.933
19267219961733985101885892020531281
203.5933.9323.9173.6053.6833.3963.8093.9413.087
214.8306.4785.5223.99310.43710.2204.4077.8524.183
Table 12. Best fitness function on the different datasets averaged for the compared algorithms over the three initialization methods.
No. | WOA | bWOA-S | bWOA-v | BALO1 | BALO2 | BALO3 | PSO | bGWO | bDA
1 | 0.032 | 0.031 | 0.031 | 0.033 | 0.038 | 0.038 | 0.025 | 0.022 | 0.020
2 | 0.201 | 0.185 | 0.186 | 0.234 | 0.225 | 0.233 | 0.195 | 0.192 | 0.172
3 | 0.023 | 0.046 | 0.065 | 0.093 | 0.074 | 0.123 | 0.057 | 0.015 | 0.004
4 | 0.853 | 0.861 | 0.873 | 0.884 | 0.862 | 0.877 | 0.849 | 0.830 | 0.812
5 | 0.250 | 0.247 | 0.230 | 0.275 | 0.274 | 0.249 | 0.225 | 0.218 | 0.216
6 | 0.218 | 0.182 | 0.190 | 0.213 | 0.208 | 0.220 | 0.175 | 0.132 | 0.137
7 | 0.108 | 0.123 | 0.107 | 0.122 | 0.128 | 0.124 | 0.088 | 0.077 | 0.083
8 | 0.229 | 0.214 | 0.207 | 0.241 | 0.247 | 0.214 | 0.168 | 0.140 | 0.125
9 | 0.334 | 0.343 | 0.324 | 0.346 | 0.342 | 0.339 | 0.328 | 0.328 | 0.310
10 | 0.103 | 0.057 | 0.097 | 0.117 | 0.125 | 0.126 | 0.106 | 0.038 | 0.038
11 | 0.212 | 0.179 | 0.196 | 0.262 | 0.258 | 0.261 | 0.200 | 0.171 | 0.177
12 | 0.273 | 0.186 | 0.281 | 0.276 | 0.278 | 0.283 | 0.144 | 0.185 | 0.026
13 | 0.222 | 0.225 | 0.220 | 0.221 | 0.226 | 0.226 | 0.217 | 0.216 | 0.217
14 | 0.131 | 0.085 | 0.123 | 0.154 | 0.133 | 0.170 | 0.061 | 0.046 | 0.012
15 | 0.045 | 0.042 | 0.036 | 0.050 | 0.038 | 0.046 | 0.029 | 0.043 | 0.027
16 | 0.028 | 0.028 | 0.029 | 0.039 | 0.040 | 0.035 | 0.024 | 0.023 | 0.018
17 | 0.049 | 0.030 | 0.045 | 0.044 | 0.042 | 0.045 | 0.040 | 0.022 | 0.024
18 | 0.150 | 0.128 | 0.161 | 0.179 | 0.191 | 0.185 | 0.143 | 0.092 | 0.109
19 | 0.051 | 0.041 | 0.049 | 0.056 | 0.060 | 0.062 | 0.049 | 0.037 | 0.038
20 | 0.180 | 0.150 | 0.116 | 0.196 | 0.161 | 0.183 | 0.115 | 0.119 | 0.115
21 | 0.136 | 0.122 | 0.174 | 0.206 | 0.184 | 0.238 | 0.111 | 0.071 | 0.046
Table 13. Worst fitness function on the different datasets averaged for the compared algorithms over the three initialization methods.
No. | WOA | bWOA-S | bWOA-v | BALO1 | BALO2 | BALO3 | PSO | bGWO | bDA
1 | 0.171 | 0.239 | 0.339 | 0.251 | 0.341 | 0.282 | 0.151 | 0.145 | 0.047
2 | 0.304 | 0.264 | 0.322 | 0.345 | 0.338 | 0.330 | 0.271 | 0.262 | 0.238
3 | 0.301 | 0.256 | 0.267 | 0.360 | 0.331 | 0.326 | 0.290 | 0.238 | 0.154
4 | 0.978 | 0.993 | 0.961 | 0.981 | 0.985 | 0.993 | 0.945 | 0.956 | 0.922
5 | 0.377 | 0.356 | 0.394 | 0.384 | 0.417 | 0.421 | 0.401 | 0.328 | 0.288
6 | 0.375 | 0.387 | 0.356 | 0.365 | 0.389 | 0.372 | 0.290 | 0.286 | 0.256
7 | 0.214 | 0.176 | 0.215 | 0.204 | 0.222 | 0.213 | 0.190 | 0.182 | 0.165
8 | 0.376 | 0.360 | 0.386 | 0.411 | 0.407 | 0.383 | 0.301 | 0.347 | 0.198
9 | 0.442 | 0.419 | 0.446 | 0.480 | 0.455 | 0.436 | 0.433 | 0.399 | 0.375
10 | 0.211 | 0.106 | 0.224 | 0.234 | 0.231 | 0.222 | 0.194 | 0.142 | 0.064
11 | 0.315 | 0.207 | 0.321 | 0.301 | 0.301 | 0.322 | 0.273 | 0.199 | 0.198
12 | 0.354 | 0.333 | 0.365 | 0.371 | 0.372 | 0.379 | 0.324 | 0.335 | 0.294
13 | 0.303 | 0.276 | 0.278 | 0.282 | 0.345 | 0.286 | 0.287 | 0.275 | 0.262
14 | 0.220 | 0.198 | 0.277 | 0.264 | 0.265 | 0.255 | 0.197 | 0.181 | 0.127
15 | 0.190 | 0.124 | 0.198 | 0.150 | 0.203 | 0.192 | 0.118 | 0.118 | 0.077
16 | 0.314 | 0.183 | 0.343 | 0.334 | 0.328 | 0.251 | 0.148 | 0.235 | 0.046
17 | 0.072 | 0.050 | 0.069 | 0.065 | 0.066 | 0.071 | 0.062 | 0.041 | 0.042
18 | 0.284 | 0.222 | 0.261 | 0.280 | 0.286 | 0.279 | 0.233 | 0.202 | 0.185
19 | 0.069 | 0.056 | 0.069 | 0.080 | 0.082 | 0.082 | 0.061 | 0.049 | 0.048
20 | 0.337 | 0.305 | 0.303 | 0.374 | 0.352 | 0.394 | 0.289 | 0.274 | 0.202
21 | 0.440 | 0.381 | 0.462 | 0.474 | 0.481 | 0.528 | 0.433 | 0.365 | 0.312
Table 14. Average selection size on the different datasets averaged for the compared algorithms over the three initialization methods.
No. | WOA | bWOA-S | bWOA-v | BALO1 | BALO2 | BALO3 | PSO | bGWO | bDA
1 | 0.60875 | 0.63875 | 0.56750 | 0.47500 | 0.50250 | 0.50875 | 0.636 | 0.63875 | 0.50625
2 | 0.77500 | 0.97083 | 0.75555 | 0.61806 | 0.63750 | 0.62083 | 0.520 | 0.79167 | 0.80417
3 | 0.66172 | 0.76328 | 0.60625 | 0.62031 | 0.61797 | 0.62500 | 0.609 | 0.59141 | 0.47109
4 | 0.62596 | 0.69904 | 0.58365 | 0.55865 | 0.56154 | 0.54231 | 0.643 | 0.58269 | 0.47019
5 | 0.64602 | 0.73920 | 0.59148 | 0.54432 | 0.59886 | 0.56989 | 0.568 | 0.62898 | 0.45966
6 | 0.64729 | 0.66396 | 0.55667 | 0.60563 | 0.60566 | 0.62396 | 0.520 | 0.62146 | 0.43854
7 | 0.60221 | 0.66875 | 0.59265 | 0.54522 | 0.55699 | 0.54081 | 0.564 | 0.61213 | 0.40625
8 | 0.55577 | 0.54519 | 0.54134 | 0.51731 | 0.45769 | 0.47885 | 0.611 | 0.57596 | 0.41730
9 | 0.53281 | 0.58438 | 0.54609 | 0.50859 | 0.52578 | 0.50469 | 0.427 | 0.62891 | 0.44219
10 | 0.70417 | 0.90347 | 0.67951 | 0.61909 | 0.62535 | 0.62361 | 0.578 | 0.76314 | 0.53368
11 | 0.73344 | 0.90500 | 0.70750 | 0.62656 | 0.63156 | 0.63062 | 0.750 | 0.79906 | 0.58656
12 | 0.64038 | 0.72693 | 0.69712 | 0.51635 | 0.54231 | 0.54231 | 0.475 | 0.62212 | 0.61827
13 | 0.49904 | 0.46731 | 0.61538 | 0.39423 | 0.40385 | 0.44615 | 0.475 | 0.42981 | 0.17885
14 | 0.72404 | 0.87884 | 0.69135 | 0.62212 | 0.60865 | 0.62115 | 0.695 | 0.76442 | 0.63462
15 | 0.66719 | 0.74609 | 0.60234 | 0.59141 | 0.56640 | 0.61016 | 0.520 | 0.61094 | 0.37813
16 | 0.57250 | 0.62375 | 0.60250 | 0.51875 | 0.49500 | 0.51000 | 0.552 | 0.60750 | 0.48875
17 | 0.66788 | 0.79953 | 0.59774 | 0.62183 | 0.62538 | 0.62363 | 0.856 | 0.64108 | 0.50028
18 | 0.69247 | 0.79488 | 0.58893 | 0.62146 | 0.61942 | 0.62387 | 0.657 | 0.64932 | 0.48532
19 | 0.66822 | 0.77086 | 0.57515 | 0.62432 | 0.62402 | 0.62771 | 0.781 | 0.68577 | 0.48735
20 | 0.66250 | 0.72708 | 0.60069 | 0.60555 | 0.58958 | 0.59028 | 0.499 | 0.62569 | 0.50486
21 | 0.64835 | 0.71131 | 0.53630 | 0.62142 | 0.62111 | 0.62312 | 0.550 | 0.49126 | 0.47477
Table 15. The Wilcoxon test for the average fitness obtained by the compared algorithms.
Algorithms | bWOA-S (Small) | bWOA-S (Mixed) | bWOA-S (Large) | bWOA-V (Small) | bWOA-V (Mixed) | bWOA-V (Large)
WOA | 0.0606 | 0.4756 | 0.4201 | 0.4178 | 0.4352 | 0.5640
bALO1 | 0.0000 | 0.4006 | 0.4609 | 0.1191 | 0.2180 | 0.4480
bALO2 | 0.0038 | 0.2736 | 0.4248 | 0.0754 | 0.2036 | 0.5881
bALO3 | 0.0947 | 0.0596 | 0.6410 | 0.3404 | 0.0725 | 0.4672
bGWO | 0.0589 | 0.0532 | 0.879 | 0.654 | 0.0587 | 0.0.300
bDA | 0.0439 | 0.0298 | 0.1406 | 0.4892 | 0.0584 | 0.400
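For reference, the p-values reported in Table 15 come from pairwise Wilcoxon rank-sum comparisons of the final fitness values at the 5% significance level. The following Python sketch is not the authors' code; the per-run fitness arrays, the sample size of 30, and the use of scipy.stats.ranksums are assumptions made only to illustrate how one such comparison could be computed.

# Minimal sketch of a pairwise Wilcoxon rank-sum comparison (illustrative only).
import numpy as np
from scipy.stats import ranksums

def wilcoxon_rank_sum(fitness_a, fitness_b, alpha=0.05):
    """Return the p-value and whether the difference is significant at level alpha."""
    _, p_value = ranksums(fitness_a, fitness_b)
    return p_value, p_value < alpha

# Hypothetical stand-ins for the final fitness value obtained in each independent run.
rng = np.random.default_rng(42)
bwoa_s_runs = rng.normal(0.20, 0.02, size=30)  # stand-in for bWOA-S runs
woa_runs = rng.normal(0.23, 0.02, size=30)     # stand-in for WOA runs

p, significant = wilcoxon_rank_sum(bwoa_s_runs, woa_runs)
print(f"p-value = {p:.4f}, significant at the 5% level: {significant}")

A p-value below 0.05 for a given pair would indicate that the difference in average fitness between the two algorithms is statistically significant under this test.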
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
