An Electric Fish-Based Arithmetic Optimization Algorithm for Feature Selection

With the widespread use of intelligent information systems, massive amounts of data are collected that contain many irrelevant, noisy, and redundant features, and the number of features to be handled is large. Therefore, introducing an efficient feature selection (FS) approach is a challenging aim. In the recent decade, various artificial methods and swarm models inspired by biological and social systems have been proposed to solve different problems, including FS. Thus, in this paper, an innovative approach is proposed based on a hybrid integration of two intelligent algorithms, electric fish optimization (EFO) and the arithmetic optimization algorithm (AOA), to boost the exploration stage of EFO so that it can handle high-dimensional FS problems with a remarkable convergence speed. The proposed EFOAOA is examined with eighteen datasets from different real-life applications. The EFOAOA results are compared with a set of recent state-of-the-art optimizers using a set of statistical metrics and the Friedman test. The comparisons show the positive impact of integrating the AOA operators into the EFO, as the proposed EFOAOA can identify the most important features with high accuracy and efficiency. Compared to the other FS methods, it obtained the lowest number of features and the highest accuracy in 50% and 67% of the datasets, respectively.


Introduction
The huge increase in data volume raises different challenges and problems, such as irrelevant features, high dimensionality, and noisy data [1]. Such problems affect the efficiency and accuracy of machine learning algorithms and lead to high computational costs. Feature selection (FS) approaches have been utilized to reduce computational costs and to boost classification accuracy [2]. FS methods are generally used to capture data properties by selecting a subset of relevant features [3]. Additionally, they remove unnecessary and noisy data [3]. FS methods have been widely employed in different fields, such as human action detection [4], text classification [5], COVID-19 CT image classification [6], neuromuscular disorders [7], data analytics problems [8], parameter estimation of biochemical systems [9], MR image segmentation [10,11], and other applications [12–14]. There are three types of FS approaches: wrapper, filter, and embedded [15]. Wrapper-based methods use the learning technique to assess the selected features, whereas filter-based methods use the properties of the datasets. Embedded methods learn which features contribute most to the accuracy of the classification model while it is being created. Because filter-based methods do not train a learner to evaluate each candidate subset, they are more efficient and faster than wrapper-based methods. The optimal FS method can be considered as one that minimizes the number of selected features and maximizes the accuracy of the classifier [15].
According to the no free lunch (NFL) theorem, no single algorithm can solve all problems. Thus, the hybridization concept is widely adopted to address several complex problems, including FS. Following the hybridization concept, we propose a new FS approach using a new variant of electric fish optimization (EFO). The EFO is a recently proposed metaheuristic (MH) algorithm inspired by the natural behavior of the nocturnal electric fish [24]. It was evaluated on different complex optimization problems and showed significant performance, except on high-dimensional problems, where it suffers from slow convergence and can get stuck in local minima. Thus, we use a recent optimizer, the arithmetic optimization algorithm (AOA), to overcome the shortcomings of the traditional EFO algorithm. The AOA is inspired by mathematical operations and was proposed by Abualigah et al. [25]. It has been adopted in several applications, such as proton exchange membrane fuel cells with an extreme learning machine [26], multilevel thresholding segmentation [27], real-world multiobjective problems [28], and damage assessment in FGM composite plates [29].
The proposed algorithm, EFOAOA, starts by splitting the tested dataset into training and testing sets, which represent 70% and 30% of the data, respectively. Then we set the initial values of a set of individuals that represent candidate solutions of the FS problem. To assess these individuals, their Boolean versions are computed, and the fitness value of each is computed using the features that correspond to the ones in its Boolean version. The next step is to determine the individual with the best fitness value and use it as the best individual. Then the operators of AOA and EFO compete in the exploration phase to discover the feasible region that contains the optimal solutions, which accelerates convergence towards the optimal solution. In the exploitation phase, the operators of traditional EFO are used. The individuals are updated in this way until the stop criteria are reached. Then the testing set is reduced to only the features that correspond to ones in the binary version of the best individual, and the performance is computed using different measures. To the best of our knowledge, this is the first time that either EFO or a modified version of it has been used for feature selection problems.
Our main objectives and contributions can be summarized as follows:
• Propose a new modified version of electric fish optimization that uses the operators of the arithmetic optimization algorithm to enhance its exploration ability.
• Apply the enhanced EFOAOA as an alternative FS technique to remove irrelevant features, which improves classification efficiency and accuracy.
• Use eighteen UCI datasets to assess the efficiency of the developed EFOAOA and compare it with well-known FS methods.
The rest of this paper is organized as follows: Section 2 reviews similar FS methods from the previous literature. Section 3 presents the background of electric fish optimization and the arithmetic optimization algorithm. In Section 4, the stages of the developed method are presented. Section 5 presents the experimental results and their discussion. Finally, the conclusion and future work are given in Section 6.

Related Works
In this section, we highlight a number of MH algorithms that were developed for feature selection problems. Chaudhuri and Sahu [30] proposed a modified version of the crow search algorithm (CSA) for FS. They used a time-varying flight length to balance the search process (exploration and exploitation). They evaluated eight variants of the developed FS method on 20 UCI benchmark datasets, and the method showed prominent performance.
A hybrid FS method based on the binary butterfly optimization algorithm (BOA) and information theory was proposed in [31]. The developed method, called information gain binary BOA (IG-bBOA), overcomes the shortcomings of the traditional BOA algorithm, and it achieved exceptional performance compared to several MH algorithms.
Maleki et al. [32] used the genetic algorithm as a feature selection method to improve the lung cancer disease classification process. A k-nearest-neighbors classifier was adopted to classify the stage of patients' disease. The evaluation outcomes confirmed that GA improved classification accuracy.
Song et al. [33] introduced an FS approach based on a new variant of the PSO algorithm, called bare bones PSO. The main idea is to use a swarm initialization technique that depends on label correlation. Also, two operators, called the supplementary and deletion operators, are used to enhance the exploitation process. Moreover, to avoid getting stuck in local minima, they developed an adaptive flip mutation operator. The approach was employed with kNN on several datasets and compared to several MH algorithms to verify its performance.
In [34], the authors presented an FS approach, called GWORS, that combines the grey wolf optimizer (GWO) and rough sets for mammogram image analysis. The GWORS was compared to well-known FS methods, and it obtained competitive performance. Tubishat et al. [35] developed an FS method based on a dynamic salp swarm algorithm (SSA). The SSA was improved using two methods: the first updates the salps' positions, whereas the second improves the local search process of the traditional SSA. The developed SSA was applied with a kNN classifier, evaluated on well-known benchmark datasets, and compared to the traditional SSA and several MH algorithms.
Dhiman et al. [36] proposed a binary emperor penguin optimizer (EPO) for FS. They used twenty-five datasets to evaluate this approach, with extensive comparisons to state-of-the-art methods. Overall, the binary version of the EPO showed superior performance compared to the original one. In [21], an FS method based on a hybrid of the HHO algorithm and simulated annealing was proposed. In [37], the genetic algorithm was used with Elastic Net for feature selection. Neggaz et al. [38] presented an FS approach using the Henry gas solubility optimization (HGSO) algorithm. Ewees et al. [3] proposed an FS technique using a hybrid of the slime mould algorithm and the firefly algorithm. It was evaluated on different datasets, including two high-dimensional QSAR datasets.
Yousri et al. [6] developed a new FS method to enhance COVID-19 CT images classification based on an improved cuckoo search (CS) optimization algorithm. The fractional-order (FO) calculus and heavy-tailed distributions are used to enhance the performance of the traditional CS algorithm. In general, many MH algorithms, including hybrid methods, have been developed for various FS applications, and they showed good performance compared to traditional methods, as described in [39,40].

Electric Fish Optimization
Electric fish optimization (EFO), proposed in [24], is inspired by the natural behavior of nocturnal electric fish. The $N$ electric fish (solutions) are initialized randomly within the search space, respecting its boundaries:

$$x_{ij} = x_{\min j} + \phi \left( x_{\max j} - x_{\min j} \right) \tag{1}$$

where $x_{ij}$ is position $j$ of solution $i$, $x_{\max j}$ and $x_{\min j}$ are the maximum and minimum boundaries, respectively, and $\phi \in [0, 1]$ is a random number. In the EFO, as in nature, individuals with a higher frequency use active electrolocation, and the others use passive electrolocation. The frequency value of each solution is determined from the fitness values by Equation (2), where $fit_i^t$ is the fitness value of solution $i$ at iteration $t$, $fit_{worst}^t$ and $fit_{best}^t$ are the worst and best fitness values obtained, and $f_{\max}$ and $f_{\min}$ are the maximum and minimum values between which the frequency lies. The amplitude of solution $i$ ($A_i$) is determined by Equation (3), where $\alpha$ is a value in the range [0, 1].
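The initialization, frequency, and amplitude steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the frequency formula assumes a linear scaling in which the best (lowest) fitness receives the highest frequency, and the amplitude formula assumes the weighted blend of the previous amplitude and the current frequency used in the standard EFO formulation.

```python
import random

def init_population(n, lower, upper, rng):
    """Place n fish uniformly at random inside the box [lower, upper] (Equation (1))."""
    return [[lo + rng.random() * (hi - lo) for lo, hi in zip(lower, upper)]
            for _ in range(n)]

def frequencies(fitness, f_min=0.0, f_max=1.0):
    """Assumed form of Equation (2): scale fitness (lower is better) linearly
    so that the best fish gets f_max and the worst gets f_min."""
    worst, best = max(fitness), min(fitness)
    if worst == best:  # all fish equally fit: avoid dividing by zero
        return [f_max] * len(fitness)
    return [f_min + (worst - f) / (worst - best) * (f_max - f_min)
            for f in fitness]

def amplitudes(previous, freq, alpha=0.5):
    """Assumed form of Equation (3): A_new = alpha * A_old + (1 - alpha) * f."""
    return [alpha * a + (1.0 - alpha) * f for a, f in zip(previous, freq)]

rng = random.Random(0)
fish = init_population(4, [0.0, 0.0], [1.0, 1.0], rng)
freq = frequencies([3.0, 2.0, 1.0])      # -> [0.0, 0.5, 1.0]
amp = amplitudes([1.0, 1.0, 1.0], freq)  # -> [0.5, 0.75, 1.0]
```

The defaults `f_min=0.0`, `f_max=1.0`, and `alpha=0.5` are placeholders chosen for the example, not values reported in this paper.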

Active Electrolocation
The active range of solution $i$ is estimated using Equation (4). To discover neighboring solutions within this range, the distance between the current solution and the other solutions is required; the distance between solutions $i$ and $k$ is calculated using Equation (5). If at least one neighbor is found in the active range, Equation (6) is used to generate the candidate position; otherwise, Equation (7) is used. In these equations, $k$ is a randomly selected solution, $\phi$ is a random value in [−1, 1], and $x_{ij}^{cand}$ is the candidate position of solution $i$.
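As a rough illustration of the active phase, the sketch below searches for neighbors inside the active range and then moves either relative to a random neighbor or by a random walk. It is written under several assumptions that are not stated in the text: the active range is taken as the largest boundary span times the amplitude, the distance is Euclidean, and every coordinate is perturbed. The function name `active_step` is hypothetical, and no boundary repair is applied here.

```python
import math
import random

def active_step(i, pop, amp, lower, upper, rng):
    """Candidate position for fish i in the active phase (illustrative sketch)."""
    x_i = pop[i]
    span = max(hi - lo for lo, hi in zip(lower, upper))
    radius = span * amp[i]  # assumed Equation (4): range grows with amplitude
    # assumed Equation (5): Euclidean distance to every other fish
    neighbours = [k for k in range(len(pop))
                  if k != i and math.dist(x_i, pop[k]) <= radius]
    if neighbours:
        k = rng.choice(neighbours)
        # Equation (6)-style move relative to a random neighbour
        return [x + rng.uniform(-1, 1) * (xk - x) for x, xk in zip(x_i, pop[k])]
    # Equation (7)-style random walk inside the active range
    return [x + rng.uniform(-1, 1) * radius for x in x_i]

pop = [[0.0, 0.0], [1.0, 1.0], [9.0, 9.0]]
amp = [0.2, 0.2, 0.2]
near = active_step(0, pop, amp, [0.0, 0.0], [10.0, 10.0], random.Random(1))
alone = active_step(2, pop, amp, [0.0, 0.0], [10.0, 10.0], random.Random(1))
```

In this toy population, fish 0 has fish 1 inside its range of 2.0, so it moves relative to that neighbor, while fish 2 has no neighbor and performs the random walk.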

Passive Electrolocation
The probability of solution $i$ in the active space is determined by Equation (8). Using a selection approach such as roulette wheel selection, $K$ solutions are chosen from the active set $N_A$ according to Equation (8). A reference location ($x_{rj}$) is then defined using Equation (9), and the new positions $x_{ij}^{new}$ are produced using Equation (10).
Although it is unusual, a solution with a higher frequency may still perform passive electrolocation. To handle this case, Equation (11) is used to choose the parameter values. The final action of the passive phase is to change one parameter of solution $i$ using Equation (12), which increases the likelihood that a parameter is altered. If parameter $j$ of solution $i$ oversteps the boundaries, it is relocated to the violated boundary using Equation (13).
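The passive phase can be sketched as follows. This is an assumption-laden illustration rather than the authors' code: the selection weight is taken as amplitude divided by distance, the roulette wheel is approximated by sampling with replacement, the reference point is an amplitude-weighted average, and the frequency of fish $i$ decides per coordinate whether the new value is accepted. The name `passive_step` and all argument names are hypothetical.

```python
import math
import random

def passive_step(i, pop, amp, active_ids, k_sel, freq_i, lower, upper, rng):
    """Candidate position for fish i in the passive phase (illustrative sketch)."""
    x_i = pop[i]
    # Equation (8)-style weights: strong, nearby active fish are more likely
    weights = [amp[k] / max(math.dist(x_i, pop[k]), 1e-12) for k in active_ids]
    chosen = rng.choices(active_ids, weights=weights, k=k_sel)  # roulette wheel
    total = sum(amp[k] for k in chosen)
    # Equation (9)-style reference point: amplitude-weighted centre
    ref = [sum(amp[k] * pop[k][j] for k in chosen) / total
           for j in range(len(x_i))]
    cand = []
    for j, x in enumerate(x_i):
        new = x + rng.uniform(-1, 1) * (ref[j] - x)       # Equation (10)-style move
        cand.append(new if rng.random() > freq_i else x)  # Equation (11)-style choice
    j = rng.randrange(len(cand))                          # Equation (12): re-sample one parameter
    cand[j] = lower[j] + rng.random() * (upper[j] - lower[j])
    # Equation (13): pull any out-of-range value back to the boundary
    return [min(max(c, lo), hi) for c, lo, hi in zip(cand, lower, upper)]

pop = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
amp = [0.1, 0.8, 0.9]
cand = passive_step(0, pop, amp, [1, 2], 2, 0.1, [0.0, 0.0], [2.0, 2.0],
                    random.Random(2))
```

The sketch assumes strictly positive amplitudes for the active fish; a production version would need to guard the weighted average against a zero total.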

Arithmetic Optimization Algorithm
The arithmetic optimization algorithm (AOA) is an optimizer that relies on arithmetic operations [25]. The improvement process starts by choosing the search mechanism based on Equation (14):

$$MOA(t) = Min + t \times \left( \frac{Max - Min}{T} \right) \tag{14}$$

where $t$ is the current iteration, which is in the range [1, T], and $Min$ and $Max$ are the smallest and highest values of this function. The mathematical formulation of the search mechanisms is given as follows.

Exploration Part
The exploration process is given in Equation (15). This search is performed when rand > MOA, where rand is a random number and MOA is obtained from Equation (14). The division (D) search is executed when a newly drawn rand < 0.5; otherwise, the multiplication (M) search is executed. In Equation (15), $x_i(t + 1)$ is solution $i$ at iteration $t + 1$, $x_{i,j}(t)$ is position $j$ of solution $i$ at iteration $t$, and $best(x_j)$ is position $j$ of the best solution obtained so far. $\mu$ and $\alpha$ are parameters fixed to 0.5 and 5, respectively [3]. $t$ is the current iteration, and $T$ is the total number of iterations.

Exploitation Part
This search is executed when rand ≤ MOA. The subtraction (S) search is executed when rand < 0.5; otherwise, the addition (A) search is executed. The exploitation search, based on S and A, helps the algorithm avoid getting trapped in a local search area. Equation (17) expresses the exploitation search mechanisms.
To summarize, the AOA begins with random solutions generated within the given constraints. Following the update rules, the search operators attempt to obtain the optimal solution. The solutions are improved mainly by moving them relative to the best global solution. A transition function (named MOA) is employed to keep the balance between the search mechanisms; it is a linear function that rises over the range [0.2, 0.9]. The exploration operators are applied when rand > MOA; otherwise, the exploitation operators are used. Within each search phase, the operator is chosen randomly. Finally, the AOA stops when the termination criterion is met.
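The switching logic summarized above can be sketched for a single coordinate as follows. The sketch assumes the standard AOA formulation: a math optimizer probability (here `mop`) that decays as $1 - t^{1/\alpha} / T^{1/\alpha}$, a scale term $(UB_j - LB_j)\mu + LB_j$, and a small constant that protects the division. The function names and the constant `EPS` are illustrative choices, not taken from this paper.

```python
import random

EPS = 1e-12  # small constant that guards the division operator (assumed)

def moa(t, t_max, lo=0.2, hi=0.9):
    """Equation (14): rises linearly from lo towards hi as t grows."""
    return lo + t * (hi - lo) / t_max

def mop(t, t_max, alpha=5.0):
    """Assumed coefficient of the operators: shrinks towards 0 as t -> t_max."""
    return 1.0 - (t ** (1.0 / alpha)) / (t_max ** (1.0 / alpha))

def aoa_update(best_j, lb_j, ub_j, t, t_max, rng, mu=0.5):
    """New value for one coordinate, built around the best solution's coordinate."""
    scale = (ub_j - lb_j) * mu + lb_j
    p = mop(t, t_max)
    if rng.random() > moa(t, t_max):           # exploration phase
        if rng.random() < 0.5:
            return best_j / (p + EPS) * scale  # division (D)
        return best_j * p * scale              # multiplication (M)
    if rng.random() < 0.5:                     # exploitation phase
        return best_j - p * scale              # subtraction (S)
    return best_j + p * scale                  # addition (A)

rng = random.Random(3)
new_value = aoa_update(best_j=0.4, lb_j=0.0, ub_j=1.0, t=10, t_max=50, rng=rng)
```

Because MOA grows with the iteration counter, exploration is chosen often in early iterations and rarely near the end, which matches the behavior described in the text. The returned value is not clipped to the bounds in this sketch.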

Proposed FS Method
The framework of the developed FS technique, which improves the efficiency of EFO using the operators of AOA, is given in Figure 1. The main target of using AOA is to enhance the exploration ability of EFO, since exploration has the largest influence on its ability to discover the feasible region that contains the optimal solutions. The proposed FS approach, named EFOAOA, begins by dividing the data into training and testing sets, which represent 70% and 30% of the data, respectively. Then random values are assigned to N individuals, and the fitness value of each of them is computed. The individual with the best fitness value is then used as the best individual. After this, the solutions are updated using the operators of EFO in the exploitation phase, while during the exploration phase either the operators of AOA or those of traditional EFO are used according to a random probability. The updating process is repeated until the stop conditions are reached. Thereafter, the testing set is reduced according to the best individual, and the performance of the developed EFOAOA as an FS method is evaluated using different metrics. The details of the EFOAOA are given in the following paragraphs.

First Stage
At this stage, the initial individuals, which represent the population of solutions, are generated. This process is formulated as:

$$X_{ij} = LB_j + rand \times (UB_j - LB_j), \quad i = 1, 2, \ldots, N, \quad j = 1, 2, \ldots, D \tag{18}$$

where $UB_j$ and $LB_j$ are the upper and lower boundaries of the $j$th dimension. $N$ represents the total number of individuals, and $D$ is the dimension of each solution, which equals the total number of features. $rand \in [0, 1]$ is a random number.

Second Stage
The main aim of this part of the developed EFOAOA is to update the individuals until the stop conditions are reached. This is achieved through a set of steps. The first step is to convert each individual $X_i$ into a binary individual $BX_i$ using Equation (19). The next step is to use the training features that correspond to ones in $BX_i$ to train the KNN classifier and to compute the fitness value defined in Equation (20). In Equation (20), $\gamma$ is the classification error of the KNN classifier, $|BX_i|$ is the total number of ones in $BX_i$ (i.e., the number of selected features), and $\alpha \in [0, 1]$ is the factor that balances the two parts of the fitness value. KNN is used because of its simplicity and efficiency and because it has only one parameter. In addition, it has provided better performance than most other classifiers in different applications. It also only needs to store the data of the training set.
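A minimal sketch of the binary conversion and the fitness evaluation is given below. Several details are assumptions made for illustration only: the binarization threshold of 0.5, the weighted-sum form $\alpha\gamma + (1 - \alpha)|BX_i|/D$ of the fitness, the default `alpha=0.9`, the choice of `k`, and the use of a separate validation split to measure the KNN error. The helper names are hypothetical, and the tiny pure-Python KNN stands in for whichever KNN implementation is actually used.

```python
import math
from collections import Counter

def binarize(x, threshold=0.5):
    """Assumed Equation (19): a feature is selected when its value exceeds the threshold."""
    return [1 if v > threshold else 0 for v in x]

def knn_error(train_x, train_y, val_x, val_y, mask, k=5):
    """Error rate of a plain k-nearest-neighbour vote on the selected columns."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 1.0  # no feature selected: treat as the worst case
    def proj(row):
        return [row[j] for j in cols]
    wrong = 0
    for row, label in zip(val_x, val_y):
        nearest = sorted(zip(train_x, train_y),
                         key=lambda p: math.dist(proj(p[0]), proj(row)))[:k]
        vote = Counter(y for _, y in nearest).most_common(1)[0][0]
        wrong += vote != label
    return wrong / len(val_y)

def fitness(x, train_x, train_y, val_x, val_y, alpha=0.9, k=5):
    """Assumed Equation (20): weighted sum of error and selected-feature ratio."""
    mask = binarize(x)
    gamma = knn_error(train_x, train_y, val_x, val_y, mask, k)
    return alpha * gamma + (1 - alpha) * sum(mask) / len(mask)

# Toy data: feature 0 separates the classes, feature 1 is noise.
train_x = [[0.0, 5.0], [0.1, -3.0], [1.0, 4.0], [0.9, -4.0]]
train_y = [0, 0, 1, 1]
val_x = [[0.05, 0.0], [0.95, 0.0]]
val_y = [0, 1]
good = fitness([0.9, 0.1], train_x, train_y, val_x, val_y, k=1)  # keeps feature 0
empty = fitness([0.1, 0.1], train_x, train_y, val_x, val_y, k=1)  # keeps nothing
```

In the toy example, the individual that keeps only the informative feature scores much better (lower) than the one that selects no feature at all.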
The next step is to identify the best individual $X_b$, which has the smallest fitness value $Fit_b$. Then the frequency ($f_i$) and amplitude ($A_i$) of each $X_i$ are computed using Equations (2) and (3), respectively. According to the value of $f_i$, the individuals are updated using either the active phase (i.e., $f_i > rand$) or the passive phase (i.e., $f_i \le rand$). During the active phase, the operators of traditional EFO are used to update the individuals, as given in Equations (4)–(7). Meanwhile, in the passive phase, the operators of AOA and EFO compete to improve the individuals according to Equation (21), where $Pro \in [0, 1]$ is the probability that determines whether AOA (i.e., Equations (14)–(17)) or EFO (i.e., Equations (8)–(13)) is used to update $X_{i,j}$. If the updated individual has a better fitness value than the old one, the updated individual is used; otherwise, the old one is kept. Then the stop conditions are checked; if they are satisfied, the best individual $X_b$ is returned from this stage.
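The competition between the two operator sets and the greedy acceptance rule can be sketched as below. The threshold of 0.5 on `Pro` is an assumption, since the text only states that the choice is made according to a random probability; `compete`, `aoa_op`, and `efo_op` are hypothetical names, and the two operators are passed in as plain callables.

```python
import random

def compete(x_old, fit_old, aoa_op, efo_op, fitness_fn, rng, threshold=0.5):
    """Pick AOA or EFO at random, then keep the update only if it improves fitness."""
    pro = rng.random()
    operator = aoa_op if pro > threshold else efo_op  # assumed form of Equation (21)
    x_new = operator(x_old)
    fit_new = fitness_fn(x_new)
    if fit_new < fit_old:  # lower fitness is better
        return x_new, fit_new
    return x_old, fit_old

# Toy stand-ins for the two operator sets and the fitness function.
sphere = lambda x: sum(v * v for v in x)
shrink = lambda x: [0.5 * v for v in x]
grow = lambda x: [2.0 * v for v in x]

improved = compete([1.0], 1.0, shrink, shrink, sphere, random.Random(0))
rejected = compete([1.0], 1.0, grow, grow, sphere, random.Random(0))
```

With these stand-ins, `improved` holds the halved position because it lowers the fitness, while `rejected` keeps the original position because the update made things worse.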

Third Stage
In this stage, the testing set is reduced by selecting only the features that correspond to ones in the binary version of $X_b$. Then the reduced testing set is passed to the trained classifier (KNN), which predicts the output for the testing set. The next step is to evaluate the quality of the output using different metrics. The steps of EFOAOA are given in Algorithm 1.

Algorithm 1. Steps of EFOAOA
1. Input: the dataset with D features, the number of individuals (N), the number of iterations (tmax), and the parameters of EFOAOA.
First Stage
2. Split the data into two parts (i.e., training and testing).
3. Construct the population X using Equation (18).
Second Stage
4. t = 1
5. While (t < tmax)
6.     Convert each X_i into its binary version using Equation (19).
7.     Compute the fitness value of each X_i based on the training set as in Equation (20).
8.     Find the best individual X_b.
9.     Update X using Equation (21).
10.     t = t + 1
11. EndWhile
Third Stage
12. Reduce the testing set based on the selected features from X_b.
13. Evaluate the performance using different measures.
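The control flow of the second stage of Algorithm 1 can be sketched as a generic loop. This is only a skeleton under stated assumptions: `fitness_fn` and `update_fn` are hypothetical callables standing in for Equation (20) and Equation (21), the final mask uses an assumed threshold of 0.5, and the toy update rule in the usage lines is purely illustrative.

```python
def efoaoa_loop(population, fitness_fn, update_fn, t_max):
    """Iterate as in steps 4-11 of Algorithm 1 and return the best binary mask."""
    fits = [fitness_fn(x) for x in population]
    best_fit = min(fits)
    best = list(population[fits.index(best_fit)])
    t = 1
    while t < t_max:
        # update_fn stands in for the EFO/AOA update of each individual
        population = [update_fn(x, best, t, t_max) for x in population]
        fits = [fitness_fn(x) for x in population]
        if min(fits) < best_fit:  # remember the best individual seen so far
            best_fit = min(fits)
            best = list(population[fits.index(best_fit)])
        t += 1
    return [1 if v > 0.5 else 0 for v in best], best_fit

# Toy problem: the "ideal" feature mask is known, fitness counts mismatches.
target = [1, 0, 1]
mismatch = lambda x: sum((1 if v > 0.5 else 0) != m for v, m in zip(x, target))
drift = lambda x, best, t, t_max: [v + 0.5 * (m - v) for v, m in zip(x, target)]
mask, score = efoaoa_loop([[0.2, 0.9, 0.1], [0.4, 0.6, 0.3]], mismatch, drift, 5)
```

Note that, as written in Algorithm 1, the loop condition `t < tmax` with `t` starting at 1 performs `tmax - 1` update rounds.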

Complexity of EFOAOA
The time complexity of EFOAOA depends on the complexity of EFO and AOA, which are given in Equations (22) and (23), respectively. The complexity of EFOAOA is obtained by combining them, where $K_p$ stands for the number of solutions updated using the operators of EFO.

Experimental Results
This section evaluates the performance of the developed EFOAOA method over eighteen benchmark datasets. In addition, we compare the proposed EFOAOA with ten FS algorithms.

Dataset Description and Parameter Setting
The description of the eighteen UCI datasets is listed in Table 1. These datasets are collected from different real-life applications, and they have different characteristics, for example, different numbers of samples, features, and classes. The developed EFOAOA is compared with ten methods, namely EFO, AOA, MRFO, bGWO, HGSO, MPA, TLBO, SGA, WOA, and SSA. The parameters of each algorithm are set according to its original work. The common parameters of these methods are the number of iterations and the number of individuals, which are set to 50 and 20, respectively. In addition, each method is run 25 times to make the comparison fair. The comparison is based on six measures: the average, worst (MAX), best (MIN), and standard deviation (Std) of the fitness value, the number of selected features, and the accuracy (Acc).
Table 2 lists the feature selection results of all methods in terms of the average fitness value. The proposed EFOAOA obtained the best average in eight out of 18 datasets (i.e., Breastcancer, D2, D4, D6, D8, D10, D12, and D17), and it was ranked first. The AOA achieved the second rank with three out of 18 datasets (i.e., D3, D9, and D18), followed by TLBO and MRFO with two datasets each, although the average of TLBO over all datasets was better than that of MRFO. The SGA recorded the worst results. Figure 2 illustrates the performance of the EFOAOA based on the average fitness values. In addition, the developed EFOAOA is more stable than the other FS methods in terms of fitness value, as can be seen from the standard deviation bars.
The standard deviation of the fitness function for all methods is recorded in Table 3. The developed EFOAOA approach showed good stability on all datasets compared to the other approaches, and it obtained the lowest Std value in four out of 18 datasets (i.e., D1, D5, D8, and D17), followed by TLBO and EFO with the lowest Std values in three datasets each; both MRFO and MPA also showed good stability. The WOA recorded the worst results in this measure.
The best fitness values are recorded in Table 4. The proposed EFOAOA showed competitive results with the TLBO method. The EFOAOA obtained the best results in three datasets (i.e., D3, D4, and D15), whereas TLBO reached the minimum value in four datasets (i.e., D7, D12, D16, and D17), and both showed the same Min result on the D10 dataset. The worst method was the EFO.
Regarding the worst fitness values, given in Table 5, the proposed EFOAOA is superior to the other compared methods: it achieved the best results in 39% of the datasets (i.e., D4, D5, D6, D9, D11, D13, and D15) and showed competitive results on the rest. The AOA achieved the second rank by obtaining the best results in 28% of the datasets (i.e., D3, D8, D10, D17, and D18), followed by TLBO and MPA with 22% each. The remaining methods were ranked in the following order: MRFO, HGSO, SSA, EFO, bGWO, and WOA, whereas the SGA showed the worst results.
Furthermore, the number of selected features is recorded in Table 6. In this measure, the algorithms should select the most relevant features while achieving the highest accuracy values. As shown in Table 6, the EFOAOA obtained the lowest number of features in 50% of the datasets. The WOA obtained the second rank with the lowest number of features in 44% of the datasets, followed by MRFO, bGWO, MPA, HGSO, TLBO, AOA, SSA, and EFO, whereas the SGA recorded the largest number of features among all methods.
The accuracy results of the proposed EFOAOA and all the compared methods are recorded in Table 7. These results show that the proposed EFOAOA is superior to the other compared approaches. It achieved the highest accuracy in 39% of the datasets and the same accuracy as other methods in 28% of the datasets. This result indicates that the EFOAOA is able to select the most relevant features while preserving the quality of the classification accuracy. The second rank was recorded by MRFO, followed by AOA, MPA, TLBO, SSA, HGSO, EFO, and bGWO, whereas the WOA showed the lowest accuracy. Figure 3 indicates that the proposed EFOAOA obtained the highest accuracy averaged over all datasets, and it is more stable than the other methods, as can be observed from the standard deviation bars.
Figure 3. Average of the accuracy measure over all the tested datasets.

Results and Discussion
Moreover, for further analysis of the results of the developed method, the Friedman test is used. This non-parametric test provides a statistical value that indicates whether there is a significant difference between the developed method and the other methods. Table 8 shows the Friedman rank test results of the compared methods based on the accuracy, the number of selected features, and the fitness value. In this table, the EFOAOA obtained the first rank, followed by MPA, MRFO, AOA, TLBO, and SSA. In terms of the average fitness value, the developed EFOAOA takes the second rank, after HGSO. In addition, the developed EFOAOA provides the best mean rank in terms of the number of selected features.
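To make the ranking procedure concrete, the following sketch computes Friedman mean ranks and the uncorrected Friedman chi-square statistic from a small score table. It is an illustration only: the scores are made up, rank 1 is assigned to the best method in each dataset (the direction used in Table 8 may differ), and no tie correction is applied to the statistic. The function names are hypothetical.

```python
def mean_ranks(scores, higher_is_better=True):
    """scores[d][m] is the score of method m on dataset d; returns mean rank per method."""
    n_methods = len(scores[0])
    totals = [0.0] * n_methods
    for row in scores:
        order = sorted(range(n_methods), key=lambda m: row[m],
                       reverse=higher_is_better)
        pos = 0
        while pos < n_methods:
            end = pos
            while end + 1 < n_methods and row[order[end + 1]] == row[order[pos]]:
                end += 1
            avg = (pos + end) / 2 + 1  # tied methods share the average rank
            for m in order[pos:end + 1]:
                totals[m] += avg
            pos = end + 1
    return [t / len(scores) for t in totals]

def friedman_statistic(scores, higher_is_better=True):
    """Chi-square form: 12n / (k(k+1)) * sum((R_j - (k+1)/2)^2), without tie correction."""
    n, k = len(scores), len(scores[0])
    ranks = mean_ranks(scores, higher_is_better)
    return 12 * n / (k * (k + 1)) * sum((r - (k + 1) / 2) ** 2 for r in ranks)

# Made-up accuracies of three methods on three datasets.
toy = [[0.90, 0.80, 0.70],
       [0.95, 0.85, 0.60],
       [0.90, 0.90, 0.50]]
ranks = mean_ranks(toy)
stat = friedman_statistic(toy)
```

For the number of selected features or the fitness value, where smaller is better, the same functions can be called with `higher_is_better=False`.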

Comparison with Other FS Techniques
In this part, the results of the developed EFOAOA are compared with well-known FS approaches that depend on MH techniques. These FS approaches include two whale optimization algorithms [41,42], the binary bat algorithm (BBA) [43], enhanced GWO (EGWO) [44], BGOA [45,46], PSO, biogeography-based optimization (BBO), two binary GWO algorithms, namely bGWO1 and bGWO2 [47], AGWO [44], the satin bird optimizer (SBO) [44], and the enhanced crow search algorithm (ECSA) [48]. Table 9 shows the classification accuracy of the developed EFOAOA and the other methods. From these results, it can be seen that the developed EFOAOA has a high ability to improve the classification accuracy over all the tested datasets except D5, D7, and D14, where PSO, bGWO2, and BGOA are the best, respectively. This indicates the high ability of EFOAOA to select the relevant features while preserving the classification accuracy.
To sum up, the previous results show that there is an obvious enhancement in solving feature selection problems when using the proposed EFOAOA method. The EFO is improved considerably by using the operators of the AOA in its structure. Therefore, the EFOAOA can be considered an efficient and effective optimization algorithm for solving feature selection problems.

Conclusions and Future Work
Aiming to provide an efficient feature selection (FS) optimizer, this paper proposed an innovative variant of the electric fish optimization algorithm (EFO) that integrates the operators of the arithmetic optimization algorithm (AOA) into the EFO's exploration phase. The EFO has a drawback when handling high-dimensional optimization problems; to address it, a hybrid variant named EFOAOA was proposed. The proposed EFOAOA was applied to eighteen different real-life datasets to tackle the FS optimization task. The EFOAOA was compared with the basic EFO and AOA as well as MRFO, bGWO, HGSO, MPA, TLBO, SGA, WOA, and SSA through a set of statistical metrics, namely the average, worst, best, and standard deviation of the fitness function and the classification accuracy, as well as the non-parametric Friedman test. The comparisons revealed that the EFOAOA obtained the best average fitness in eight out of 18 datasets, the lowest Std value in four out of 18, and the lowest number of features in 50% of the datasets. Moreover, the proposed EFOAOA obtained the highest accuracy averaged over all datasets; this result indicates that the EFOAOA was able to select the most relevant features while preserving the quality of the classification accuracy. Also, the operators of the AOA play an essential role in improving the exploration stage of the original EFO algorithm.
For future work, the proposed EFOAOA will be examined in several applications, including image segmentation, parameter estimation, and time-series forecasting. Also, we will further improve the EFOAOA stages by using different techniques, such as chaotic maps and opposition-based learning.