# Binary Whale Optimization Algorithm for Dimensionality Reduction

^{1} Faculty of Science, Fayoum University, Faiyum 63514, Egypt

^{2} IN3-Computer Science Department, Universitat Oberta de Catalunya, 08018 Barcelona, Spain

^{3} Departamento de Ciencias Computacionales, Universidad de Guadalajara, CUCEI, Guadalajara 44430, Mexico

^{4} Faculty of Computers and Information, Minia University, Minia 61519, Egypt

^{5} School of Information Science and Technology, Qingdao University of Science and Technology, Qingdao 266061, China

^{*} Author to whom correspondence should be addressed.

Received: 20 September 2020 / Revised: 30 September 2020 / Accepted: 12 October 2020 / Published: 17 October 2020

(This article belongs to the Special Issue Evolutionary Computation 2020)

Feature selection (FS) is regarded as a global combinatorial optimization problem. FS simplifies high-dimensional datasets and improves their quality by selecting prominent features and removing irrelevant and redundant ones, which in turn supports good classification results. By reducing dimensionality and improving classification accuracy, FS plays an important role in fields such as pattern classification, data analysis, and data mining. The main problem is to find the best subset of features, one that retains the representative information of the full dataset. To address this problem, two binary variants of the whale optimization algorithm (WOA) are proposed, called bWOA-S and bWOA-V. They decrease the complexity and increase the performance of a system by selecting significant features for classification. The first variant, bWOA-S, uses the sigmoid transfer function to convert continuous WOA values to binary ones, whereas the second, bWOA-V, uses a hyperbolic tangent transfer function. The two variants were compared with three well-known optimization algorithms in this domain, namely the Particle Swarm Optimizer (PSO), the binary ant lion optimizer in three variants (bALO1, bALO2, and bALO3), and the binary Dragonfly Algorithm (bDA), as well as the original WOA, over 24 benchmark datasets from the UCI repository. Finally, the non-parametric Wilcoxon rank-sum test was carried out at the 5% significance level to compare the two proposed algorithms statistically against the other algorithms. The qualitative and quantitative results showed that the two proposed variants are able to minimize the number of selected features and maximize classification accuracy within an appropriate time.
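To make the role of the two transfer functions concrete, the following minimal Python sketch shows how a continuous position vector could be mapped to a binary feature mask. The abstract does not give the exact update rules, so the sketch assumes the conventions commonly used for S-shaped and V-shaped transfer functions: in the S-shaped case a bit is set to 1 with probability sigmoid(x), and in the V-shaped case the current bit is flipped with probability |tanh(x)|. The function names (`sigmoid_transfer`, `tanh_transfer`, `binarize_s`, `binarize_v`) are hypothetical and chosen only for illustration.

```python
import math
import random


def sigmoid_transfer(x):
    """S-shaped transfer: map a real value to a probability in [0, 1]."""
    # Two branches keep math.exp from overflowing for large |x|.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def tanh_transfer(x):
    """V-shaped transfer: |tanh(x)| maps a real value to [0, 1]."""
    return abs(math.tanh(x))


def binarize_s(position, rng):
    """Assumed S-shaped rule: bit i becomes 1 with probability sigmoid(x_i)."""
    return [1 if rng.random() < sigmoid_transfer(x) else 0 for x in position]


def binarize_v(position, current_bits, rng):
    """Assumed V-shaped rule: flip bit i with probability |tanh(x_i)|."""
    return [
        1 - b if rng.random() < tanh_transfer(x) else b
        for x, b in zip(position, current_bits)
    ]


if __name__ == "__main__":
    rng = random.Random(42)              # fixed seed for reproducibility
    position = [-2.0, -0.3, 0.0, 0.8, 3.5]  # hypothetical continuous WOA position
    mask_s = binarize_s(position, rng)
    mask_v = binarize_v(position, [0, 1, 0, 1, 0], rng)
    # A 1 in the mask means the corresponding feature is selected.
    print("S-shaped mask:", mask_s)
    print("V-shaped mask:", mask_v)
```

Under these assumptions, the S-shaped rule ties the bit value directly to the sign and size of the continuous value, while the V-shaped rule ties only the chance of changing the current bit to the magnitude of the value, so a value near zero tends to leave the current selection unchanged.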