An Efficient Parallel Reptile Search Algorithm and Snake Optimizer Approach for Feature Selection

Abstract: Feature Selection (FS) is a major preprocessing stage that aims to improve the performance of Machine Learning (ML) models by choosing salient features while reducing the computational cost. Several approaches have been presented to select the Optimal Features Subset (OFS) in a given dataset. In this paper, we introduce an FS approach named Reptile Search Algorithm – Snake Optimizer (RSA-SO) that employs both the RSA and SO methods in a parallel mechanism to determine the OFS. This mechanism decreases the chance of the two methods becoming stuck in local optima and boosts the capability of both to balance exploration and exploitation. Numerous experiments are performed on ten datasets taken from the UCI repository and two real-world engineering problems to evaluate RSA-SO. The results obtained by RSA-SO are also compared with seven popular Meta-Heuristic (MH) methods for FS to demonstrate its superiority. The results show that the developed RSA-SO approach has a competitive performance relative to the tested MH methods and that it can provide practical and accurate solutions for engineering optimization problems.


Introduction
With the rapid development of contemporary enterprises and the digital world, the transformation process of data into useful information has become more and more difficult due to the large amount of data produced by different sources. Machine Learning (ML) can play an essential role in Knowledge Discovery, which is categorized into a number of tasks, including data preprocessing (data preparation, reduction, and transformation), pattern evaluation, and knowledge presentation [1].
FS is a major preprocessing step that can improve the ML model's performance by reducing the number of features and simplifying the classification problem [2,3]. The main concern of the FS process is to discard the irrelevant, redundant, and noisy features from the whole set of features to derive a subset of representative features. This process is used in many areas of science such as data classification [4], image processing [5], text categorization [6], data clustering [7], and signal processing [8]. The primary objective of the FS process is to find an OFS of highly discriminative features that results in high classification accuracy.
Recently, several MH methods have been introduced in the literature to simulate the behaviors of natural phenomena or living organisms for various problems. These methods show potential in selecting the OFS from a given dataset and in solving diverse and complex optimization problems, such as scheduling, engineering design, production problems, and ML [9][10][11]. MH methods rely on two principles: exploration and exploitation [12,13]. Exploration refers to the ability to search the entire search space; this ability is linked to avoiding and escaping local optima. Exploitation, on the other hand, is the ability to investigate promising nearby solutions to improve local quality. A proper balance between these two properties gives an excellent algorithm performance [14].
Multi-Verse Optimizer (MVO) [15], Particle Swarm Optimization (PSO) [16], Whale Optimization Algorithm (WOA) [17], Gray Wolf Optimizer (GWO) [18], and Salp Swarm Algorithm (SSA) [19] are some of the commonly applied MH methods for FS. However, the computational cost, the classification accuracy, and the ability of these methods to find a global optimum still need further improvement.
One can combine two or more MH approaches to develop a new one with a higher performance that can achieve a convincing balance between the two MH principles rather than using each of them alone for the problem of FS [20][21][22]. In the present work, a novel combined MH-based approach named Reptile Search Algorithm-Snake Optimizer (RSA-SO) is introduced to solve FS. The RSA-SO approach utilizes the best characteristics and capabilities of both the RSA [23] and SO [24] algorithms to obtain an optimal subset of informative features, where both collaborate in a parallel mechanism. The RSA and SO methods are among the most recent MH algorithms, and they show promising capabilities to solve FS problems with an efficient balance between exploration and exploitation. The parallel collaboration helps to decrease the chance of the two methods becoming stuck in local optima, and it boosts the capability of both of them to balance exploration and exploitation. The contributions of this paper can be summarized as follows:

• An efficient RSA-SO approach is introduced, which merges RSA and SO in a parallel mechanism to enhance the selection process of the OFS.

• The developed RSA-SO is tested on twelve datasets from different fields and it is applied to solve two well-known engineering optimization problems with constraints.

• The results show that the RSA-SO performs well when compared to other popular MH methods, and that it can also provide a practical and accurate solution for engineering optimization problems.
The structure of the paper is as follows. The next section gives an overview of RSA and SO. The details of the proposed RSA-SO approach are described in Section 3, while Section 4 analyzes and discusses the experimental results. Finally, the conclusion and future research directions are given in the last section.

Reptile Search Algorithm (RSA)
RSA is a nature-inspired MH approach based on crocodiles' encircling and hunting behavior, introduced by [23] in 2022. It is a gradient-free method that begins by generating random candidate solutions. The jth feature of the ith candidate solution, $x_{i,j}$, can be calculated as follows:

$$x_{i,j} = rand \times (UB_j - LB_j) + LB_j, \quad i = 1, \ldots, N, \quad j = 1, \ldots, M \tag{1}$$

where $UB_j$ and $LB_j$ are the upper and lower boundaries of the jth feature, $rand$ stands for a uniformly distributed random number in the range [0, 1], N is the total number of candidate solutions, and M is the feature dimension.

Like the other MH algorithms, RSA works on two principles: exploration and exploitation. These principles are facilitated by crocodiles' movement while searching for food. In the RSA, the total iterations are split into four stages to take advantage of the natural behavior of crocodiles. In the first two stages, RSA accomplishes the exploration based on the encircling behavior comprising the high walking and the belly walking movements. Crocodiles begin their encircling to search the region, facilitating a more exhaustive search of the solution space, and this can be mathematically modeled as:

$$x_{i,j}(g+1) = \begin{cases} Best_j(g) \times \left(-\eta_{i,j}(g)\right) \times \beta - R_{i,j}(g) \times rand, & g \le \frac{G}{4} \\ Best_j(g) \times x_{r_1,j}(g) \times ES(g) \times rand, & \frac{G}{4} < g \le \frac{2G}{4} \end{cases} \tag{2}$$

where g is the current iteration, G is the total number of iterations, $Best_j(g)$ is the best solution for the jth feature, $\eta_{i,j}$ refers to the hunting operator for the jth feature in the ith solution (calculated as in Equation (3)), and the parameter $\beta$ controls the exploration accuracy and is set as 0.1. The reduce function $R_{i,j}$ is used to reduce the search region and is computed as in Equation (6), $r_1 \in [1, N]$ is a number between 1 and N used to randomly select one of the possible candidate solutions, and the Evolutionary Sense $ES(g)$ stands for the probability ratio which reduces from 2 to −2 over iterations, calculated as in Equation (7).

$$\eta_{i,j} = Best_j(g) \times P_{i,j} \tag{3}$$

where $P_{i,j}$ indicates the percentage difference between the jth value of the best solution and its corresponding value in the current solution and is calculated as:

$$P_{i,j} = \alpha + \frac{x_{i,j} - M(x_i)}{Best_j(g) \times (UB_j - LB_j) + \epsilon} \tag{4}$$

where $\alpha$ denotes a sensitive parameter that controls the exploration performance, $\epsilon$ is a small floor value, and $M(x_i)$ refers to the average of the ith solution and is defined as:

$$M(x_i) = \frac{1}{M} \sum_{j=1}^{M} x_{i,j} \tag{5}$$

$$R_{i,j} = \frac{Best_j(g) - x_{r_2,j}}{Best_j(g) + \epsilon} \tag{6}$$

$$ES(g) = 2 \times r_3 \times \left(1 - \frac{g}{G}\right) \tag{7}$$

where the value 2 acts as a multiplier to provide correlation values in the range [0, 2], $r_2 \in [1, N]$ is a random index selected in the same way as $r_1$, and $r_3 \in \{-1, 1\}$ is a random integer.

In the last two stages, RSA exploits the search space (hunting) and approaches the optimal solution by using hunting coordination and hunting cooperation. The candidate solution can update its value during the exploitation stage using the following:

$$x_{i,j}(g+1) = \begin{cases} Best_j(g) \times P_{i,j}(g) \times rand, & \frac{2G}{4} < g \le \frac{3G}{4} \\ Best_j(g) - \eta_{i,j}(g) \times \epsilon - R_{i,j}(g) \times rand, & \frac{3G}{4} < g \le G \end{cases} \tag{8}$$

The quality of candidate solutions at each iteration is measured using the predefined FF. The algorithm stops after G iterations, and the candidate solution with the least FF value is used as the OFS.
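The four-stage update can be illustrated with a short Python sketch. It assumes the standard RSA update rules (high walking, belly walking, hunting coordination, hunting cooperation) with one rule per quarter of the iteration budget. The function name `rsa_step`, the argument names, and the final clipping to the feature bounds are illustrative assumptions, not details taken from this paper.

```python
import random

def rsa_step(pop, best, g, G, lb, ub, alpha=0.1, beta=0.1, eps=1e-10, rng=random):
    """One RSA position update for every candidate solution (illustrative sketch).

    pop  : list of N candidate solutions, each a list of M floats
    best : best solution found so far (list of M floats)
    g, G : current iteration and total number of iterations
    """
    n, m = len(pop), len(pop[0])
    # Evolutionary Sense: shrinks toward 0 as g approaches G, with a random sign.
    es = 2 * rng.choice([-1, 1]) * (1 - g / G)
    new_pop = []
    for i in range(n):
        mean_i = sum(pop[i]) / m  # average of the i-th solution
        row = []
        for j in range(m):
            r1, r2 = rng.randrange(n), rng.randrange(n)
            # Percentage difference between the best and the current value.
            p = alpha + (pop[i][j] - mean_i) / (best[j] * (ub[j] - lb[j]) + eps)
            eta = best[j] * p                                # hunting operator
            red = (best[j] - pop[r2][j]) / (best[j] + eps)   # reduce function
            rand = rng.random()
            if g <= G / 4:            # exploration: high walking
                x = best[j] * (-eta) * beta - red * rand
            elif g <= G / 2:          # exploration: belly walking
                x = best[j] * pop[r1][j] * es * rand
            elif g <= 3 * G / 4:      # exploitation: hunting coordination
                x = best[j] * p * rand
            else:                     # exploitation: hunting cooperation
                x = best[j] - eta * eps - red * rand
            # Assumption: keep the new value inside the feature bounds.
            row.append(min(max(x, lb[j]), ub[j]))
        new_pop.append(row)
    return new_pop
```

Calling `rsa_step` once per iteration and re-evaluating the FF afterwards gives the basic RSA loop; the best solution would be refreshed from the updated population before the next call.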

Snake Optimizer (SO)
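SO is an MH algorithm proposed by [24] in 2022 to mimic the mating behavior of snakes. Mating happens when the temperature is low and food is available. The SO, like other MH methods, initializes random candidate solutions using Equation (1). This method divides the swarm into male and female groups equally using the following:

$$N_m \approx \frac{N}{2}, \qquad N_f = N - N_m \tag{9}$$

where N is the number of individuals, $N_m$ refers to the male individuals, and $N_f$ refers to the female individuals.

In each iteration, the best individual candidate solution (food position $x_{food}$) is found by analyzing each group for the individual best male $x_{best,m}$ and best female $x_{best,f}$.

The Temperature (T) and the Food Quantity (FQ) can be defined as:

$$T = \exp\left(\frac{-g}{G}\right), \qquad FQ = c_1 \times \exp\left(\frac{g - G}{G}\right) \tag{10}$$

where g is the current iteration, G is the total number of iterations, and $c_1$ is a constant equal to 0.5.

When FQ < Threshold (Threshold = 0.5), the snakes search for food by selecting a random position and then update their position. To mathematically model the exploration behavior of the male and female snakes, the following can be used:

• For male snakes:

$$x_{i,m}(g+1) = x_{r,m}(g) \pm c_2 \times A_m \times \left((UB - LB) \times rand + LB\right), \qquad A_m = \exp\left(\frac{-f_{r,m}}{f_{i,m}}\right) \tag{11}$$

where $x_{i,m}$ is the ith male snake position, $x_{r,m}$ (with $r \in [1, N/2]$) refers to the position of a random male snake, $rand$ is a random number between 0 and 1, $c_2$ is a constant, $A_m$ is the ability of the male to find the food, $f_{r,m}$ is the fitness of the earlier selected random male snake, and $f_{i,m}$ is the fitness of the ith male in the group. The flag direction operator ± (i.e., diversity factor) can be used to randomly scan all the possible directions in the given search space.

• For female snakes:

$$x_{i,f}(g+1) = x_{r,f}(g) \pm c_2 \times A_f \times \left((UB - LB) \times rand + LB\right), \qquad A_f = \exp\left(\frac{-f_{r,f}}{f_{i,f}}\right) \tag{12}$$

where $x_{i,f}$ is the ith female snake position, $x_{r,f}$ (with $r \in [1, N/2]$) is the position of a random female snake, $A_f$ refers to her ability to find food, $f_{r,f}$ is the fitness of the earlier selected random female snake, and $f_{i,f}$ is the fitness of the ith individual in the female group.

In the exploitation phase (when FQ is not below the Threshold), SO uses two conditions to find the best solutions:

1. If T > 0.6, then the snakes move to find the food only:

$$x_{i,j}(g+1) = x_{food} \pm c_3 \times T \times rand \times \left(x_{food} - x_{i,j}(g)\right) \tag{13}$$

where $x_{i,j}$ is the position of individuals, either male or female; $x_{food}$ is the position of the best individuals; and $c_3$ is a constant equal to 2.

2. If T < 0.6, then the snakes will be in one of two modes, either fighting or mating. The fighting and mating models can be represented as follows. In the fighting mode, the male position is updated as:

$$x_{i,m}(g+1) = x_{i,m}(g) + c_3 \times F_m \times rand \times \left(FQ \times x_{best,f} - x_{i,m}(g)\right), \qquad F_m = \exp\left(\frac{-f_{best,f}}{f_i}\right) \tag{14}$$

where $x_{i,m}$ refers to the ith male position, $x_{best,f}$ refers to the position of the best individual in the female group, and $F_m$ is the fighting ability of the male agent. Similarly, the female position is updated as:

$$x_{i,f}(g+1) = x_{i,f}(g) + c_3 \times F_f \times rand \times \left(FQ \times x_{best,m} - x_{i,f}(g)\right), \qquad F_f = \exp\left(\frac{-f_{best,m}}{f_i}\right) \tag{15}$$

where $x_{i,f}$ refers to the ith female position, $x_{best,m}$ refers to the position of the best individual in the male group, and $F_f$ is the fighting ability of the female agent. In the mating mode, the positions are updated as:

$$\begin{aligned} x_{i,m}(g+1) &= x_{i,m}(g) + c_3 \times M_m \times rand \times \left(FQ \times x_{i,f}(g) - x_{i,m}(g)\right), & M_m &= \exp\left(\frac{-f_{i,f}}{f_{i,m}}\right) \\ x_{i,f}(g+1) &= x_{i,f}(g) + c_3 \times M_f \times rand \times \left(FQ \times x_{i,m}(g) - x_{i,f}(g)\right), & M_f &= \exp\left(\frac{-f_{i,m}}{f_{i,f}}\right) \end{aligned} \tag{16}$$

where $x_{i,m}$ and $x_{i,f}$ are the positions of the ith male and female agents, and $M_m$ and $M_f$ refer to the mating ability of males and females.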
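How the temperature and food quantity steer SO between its modes can be sketched as below. The sketch assumes the usual SO schedules, T = exp(−g/G) and FQ = c1·exp((g − G)/G), and treats exploitation as the case where FQ is not below the threshold. The names `so_phase` and `so_explore` and the default `c2 = 0.05` are illustrative assumptions rather than values stated in this paper.

```python
import math
import random

def so_phase(g, G, c1=0.5, fq_threshold=0.5, temp_threshold=0.6):
    """Return (temperature, food quantity, mode) for iteration g of G."""
    temp = math.exp(-g / G)
    fq = c1 * math.exp((g - G) / G)
    if fq < fq_threshold:
        return temp, fq, "exploration"      # search around a random snake
    if temp > temp_threshold:
        return temp, fq, "move_to_food"     # hot: head for the best position
    return temp, fq, "fight_or_mate"        # cold: fighting or mating mode

def so_explore(x_rand, fit_rand, fit_i, lb, ub, c2=0.05, rng=random):
    """Exploration move of one snake around a randomly chosen group member.

    fit_i must be non-zero; c2 = 0.05 is an assumed default.
    """
    ability = math.exp(-fit_rand / fit_i)   # ability to find the food
    sign = rng.choice([-1, 1])              # flag direction operator
    return [
        xr + sign * c2 * ability * ((u - l) * rng.random() + l)
        for xr, l, u in zip(x_rand, lb, ub)
    ]
```

Because c1 = 0.5 caps FQ at 0.5, the point at which the search leaves exploration depends strongly on the threshold, which is why the sketch exposes `fq_threshold` as a parameter instead of hard-coding it.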

Proposed RSA-SO Method
FS is a multi-objective problem where the minimal number of OFS and higher classification accuracy are simultaneously achieved [25]. The literature survey on different MH algorithms explores various nature-inspired phenomena to effectively search for the best solutions in a given search space. A combination of these MH algorithms is reported to enhance the overall performance by complementing the other's exploration and exploitation processes, which in turn can decrease the probability of trapping in local optima.
The RSA and SO methods are among the most recent MH algorithms, showing promising capabilities to solve several problems with an efficient balance between exploration and exploitation. In this work, the RSA and SO methods collaborate in a parallel strategy to solve an FS problem. The primary objective of the parallel mechanism is that if one of the algorithms cannot improve the candidate solutions or becomes stuck in local optima, the other algorithm moves the current candidate solutions into another search region where some better solutions might be found.

Figure 1 provides the procedural steps of the RSA-SO. At first, the hyper-parameters of RSA, SO, and the shared ones are initialized. A uniformly distributed random number generator in the range [−1, 1] is employed to initialize N candidate solutions for M features, as described earlier in the RSA section (Equation (1)). At the start of each iteration, the population (i.e., candidate solutions) is equally divided into two parts between the RSA and SO algorithms. For the gth iteration, the candidate solutions $\{x_{i,j}(g),\ 1 \le i \le N \text{ and } 1 \le j \le M\}$ are split into two parts. The first half is passed to RSA and the second half is passed to SO. It can be mathematically seen as follows:

$$X_{RSA}(g) = \{x_{i,j}(g),\ 1 \le i \le N/2\}, \qquad X_{SO}(g) = \{x_{i,j}(g),\ N/2 < i \le N\}, \qquad 1 \le j \le M$$

In the first iteration, both RSA and SO are executed in a parallel manner on their respective parts, and the candidate solutions are updated according to Equations (2)-(8) in the RSA and Equations (9)-(16) in the SO method. At the end of the first iteration, the updated candidate solutions from both algorithms are evaluated using the Fitness Function (FF). The solutions are sorted in ascending order of their fitness values using the Quick Sort algorithm. The candidate solutions with the smaller fitness values are selected from each part of the population. The top N/2 solutions from the entire population are found and passed to both algorithms for the next iteration.
The complete set of candidate solutions after each iteration can be generated by merging solutions from both algorithms, as in Equation (17).
The sorting finds the best N/2 solutions from the entire population, that is, the solutions whose fitness values are smaller than those of all the non-selected solutions. These solutions may be distributed differently amongst the RSA and SO algorithms. A set of improved low-fitness candidate solutions $\hat{X}(g)$ is obtained by swapping high-fitness candidate solutions with the low-fitness candidate solutions found by the complementary algorithm. The update falls into three cases. If the found candidate solutions comprise more solutions from RSA than SO, then the high-fitness candidate solutions from SO are replaced by solutions found by RSA and vice versa. Hence, the RSA will dominate the next iteration. On the other hand, if the found candidate solutions comprise more solutions from SO than RSA, then the SO will dominate in the next iteration. Lastly, if an equal number of low-fitness candidate solutions are found by both algorithms, then the next iteration displays the codominance of both algorithms.

An example of candidate solution optimization using N = 8 is shown in Figure 2. Candidate solutions from RSA (red) and SO (green) are identified using different colors of the bounding boxes. The corresponding fitness value marks each candidate solution with a maximum of 1 (darker shade fill) and a minimum of 0 (lighter shade fill). The top N/2 = 4 low-fitness solutions (lighter shade fill) are marked by an additional bounding box (dotted black). In the case of Figure 2a, the gth iteration marks three solutions from RSA and only one from SO as low-fitness. In the (g+1)th iteration, a high-fitness solution from RSA is replaced with a low-fitness solution from SO, while three high-fitness solutions in SO are replaced with three low-fitness solutions from RSA. Hence, the (g+1)th iteration is dominated by RSA, as observed in Figure 2a's selected solutions.
The opposite situation is presented in Figure 2b, where three solutions from SO and only one from RSA are marked low-fitness. Hence, solutions for the (g+1)th iteration are obtained by replacing three high-fitness solutions from RSA with low-fitness solutions from SO, and vice versa, so the (g+1)th iteration is dominated by SO, as observed in Figure 2b's selected solutions. Finally, Figure 2c shows the case in which an equal number of solutions is found by both algorithms. Even after the replacement, both algorithms have equal shares, indicating the codominance of both algorithms for the (g+1)th iteration. It should be noted that after optimizing, both algorithms will continue the next iteration using the exact same set of low-fitness candidate solutions, except for the sequence of solutions, as seen in Figure 2a,c. This can effectively coordinate and improve global exploration and local exploitation in the search space.
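The sort-and-share step described above can be sketched in Python as follows. This is a simplified illustration: the function name `share_best` is hypothetical, Python's built-in sort stands in for Quick Sort, and both algorithms receive the elite solutions in the same order, whereas the text notes that the sequence may differ between them.

```python
def share_best(rsa_part, so_part, fitness):
    """Pick the top N/2 solutions of the merged population and share them.

    rsa_part, so_part : candidate solutions updated by RSA and SO (lists of lists)
    fitness           : callable mapping a solution to its fitness (lower is better)
    Returns the next RSA part, the next SO part, and which algorithm dominated.
    """
    # Tag every solution with the algorithm that produced it, then merge.
    tagged = [("RSA", s) for s in rsa_part] + [("SO", s) for s in so_part]
    # Ascending fitness: the best (lowest-fitness) solutions come first.
    tagged.sort(key=lambda item: fitness(item[1]))
    half = len(tagged) // 2
    elite = tagged[:half]

    n_rsa = sum(1 for source, _ in elite if source == "RSA")
    n_so = half - n_rsa
    if n_rsa > n_so:
        dominance = "RSA"
    elif n_so > n_rsa:
        dominance = "SO"
    else:
        dominance = "codominance"

    # Both algorithms continue from the same set of low-fitness solutions.
    next_rsa = [list(s) for _, s in elite]
    next_so = [list(s) for _, s in elite]
    return next_rsa, next_so, dominance
```

With N = 8 and three of the four best solutions coming from RSA, as in the case of Figure 2a, the function reports RSA as the dominant algorithm for the next iteration.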
In the next iteration, (g + 1), the generated population is first split into two parts using Equation (19), and each part is passed as an input to the RSA and SO methods to simultaneously search other regions in the feature space. After finishing the second iteration, the obtained candidate solutions are sorted using the FF. A new population that is composed of the best candidate solutions is obtained from each population part. This process continues until the termination condition is satisfied (i.e., the maximum number of iterations is reached). The pseudo-code of the RSA-SO is provided in Algorithm 1.

The K-Nearest Neighbor (KNN) classifier with k = 5 is used as the classifier within the FF. The threshold value is set to 0.5 to produce a small number of features, as recommended by the work of [26,27]. The solution with the smallest number of features and highest accuracy is the best one (smallest fitness) and it is defined as:

$$Fitness = w_1 \times \gamma + w_2 \times \frac{|S|}{|D|}$$

where $\gamma$ is the error rate of the KNN, $|S|$ is the number of OFS, and $|D|$ is the number of features in the original dataset. $w_1$ and $w_2$ are two weights that control the importance of classification quality and feature reduction; the value of $w_1$ is in the range [0, 1] and the value of $w_2$ is $1 - w_1$. The parameters $w_1$ and $w_2$ are set to 0.99 and 0.01, respectively, in this work [28,29], and each feature in the OFS is selected according to:

$$x^{bin}_{i,j} = \begin{cases} 1, & x_{i,j} > 0.5 \\ 0, & \text{otherwise} \end{cases}$$
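A minimal Python sketch of the thresholding and the weighted fitness is given below. The function names `binarize` and `fs_fitness` are hypothetical, and the error rate is passed in as a number; in the experiments it would come from the KNN classifier with k = 5 evaluated on the selected features.

```python
def binarize(position, threshold=0.5):
    """Turn a continuous candidate solution into a 0/1 feature mask."""
    return [1 if value > threshold else 0 for value in position]

def fs_fitness(error_rate, mask, w_error=0.99):
    """Weighted sum of the classification error and the fraction of kept features.

    error_rate : classifier error on the selected features, in [0, 1]
    mask       : 0/1 list marking the selected features
    w_error    : weight of the error term; the feature term gets 1 - w_error
    """
    selected_ratio = sum(mask) / len(mask)
    return w_error * error_rate + (1.0 - w_error) * selected_ratio

# Example: 2 of 4 features kept, 10% error -> 0.99 * 0.1 + 0.01 * 0.5
mask = binarize([0.9, 0.2, 0.6, 0.1])
score = fs_fitness(0.10, mask)
```

With the weights 0.99 and 0.01, the error term dominates, so the number of selected features mainly acts as a tie-breaker between solutions of similar accuracy.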

Experiments and Results
To assess the capability of the RSA-SO, its performance is compared with other MH methods, including PSO [16], GWO [18], MVO [15], WOA [17], SSA [19], RSA [23], and SO [24], on twelve datasets; the results are provided in this section. All the experiments are implemented using Python scikit-learn and conducted on a 3.13 GHz PC with 16 GB RAM and Windows 10 operating system.

Dataset
The RSA-SO is tested on eight datasets taken from the UCI data repository, and each of them is split into 80% of the samples used for training and the remaining used for testing. Table 1 summarizes the details of the used datasets.

Parameter Settings
To compare RSA-SO with other methods, seven popular methods in the field of FS are selected. The population size and the maximum number of iterations are empirically set to 20 and 100, respectively. All the methods are run 20 times independently. The parameter settings of these methods follow their implementations in the original works, and they are listed in Table 2.

Results and Discussion
A set of widely used performance measures is employed to assess the results obtained by the RSA-SO and the other FS methods. These metrics include classification accuracy, number of selected OFS, fitness values (best, worst, average (Avg), and standard deviation (STD)), and the computational time consumed by each method. The Friedman ranking test is applied to rank each method for a fair comparison. Moreover, the convergence behavior of the introduced RSA-SO and other methods is provided in this section.

Figure 3 shows the distribution of the best-selected candidate solutions obtained by RSA (in red) and SO (in green) for the twelve datasets. It provides the number of iterations on the x-axis and the selected solutions on the y-axis. It can be noticed in Figure 3 that the RSA and SO begin by exploring the search space, followed by exploiting the best candidate solution in the feature space. For example, in the initial 25 iterations on the KrvskpEW dataset, more candidate solutions are selected from the first half of the revised candidate solutions, indicating that high walking in the RSA is more effective than SO. Similarly, the last 25 iterations indicate that the hunting cooperation process in RSA exploits candidate solutions more effectively than SO. The dominance of SO during iterations 25 to 50 and 50 to 75 shows that exploration using belly walking and exploitation using hunting coordination in RSA are not very effective. Similar observations can be made for the SpectEW, Tic-tac-toe, and Chemical Water datasets. In the IonosphereEW and Votes datasets, most of the iterations are dominated by the SO. On the other hand, most iterations for the Breastcancer dataset show approximately equal numbers of candidate solutions selected from both methods, indicating the codominance of both algorithms. Similar codominance can be observed in the first 25 and the last 25 iterations for the BreastEW, Churn, HeartEW, Sonar, and Zoo datasets.
Tables 3 and 4 compare all the FS approaches in terms of the average testing accuracy and the number of OFS. In MH methods, the solution with the highest classification accuracy and the minimum number of features is the best one in the population. In Table 3, the RSA-SO scored the best accuracy compared to the other techniques in eight out of twelve datasets. This can be attributed to the improved capability of the RSA-SO in broadly searching the high-performance regions in the search space. For IonosphereEW, SO is placed first, while for the SonarEW and Zoo datasets, GWO performed the best. Both WOA and RSA-SO achieved similar accuracy results on the Chemical Water dataset. As per the results in Table 4, the introduced RSA-SO had the smallest value of the selected OFS in nine out of twelve datasets. This confirms the efficiency of the proposed RSA-SO in eliminating irrelevant features in the datasets and reducing the search space. However, RSA-SO had the same results in the Breastcancer, IonosphereEW, and Tic-tac-toe datasets, while the RSA method gained the best results only in the Churn dataset and the SO attained the best results in the SonarEW and Vote datasets. A similar number of OFS is selected by all the methods on the Breastcancer and Tic-tac-toe datasets. GWO selected the smallest number of OFS on the SonarEW dataset.

Table 5 records a summary of the results obtained by the RSA-SO against the other MH algorithms for the different datasets. It also presents the ranks of the MH algorithms for each dataset depending on the average, STD, best, and worst of the fitness values. From Table 5, it can be observed that the RSA-SO earned the first rank in nine out of twelve datasets. For the Breastcancer, IonosphereEW, and Zoo datasets, PSO, MVO, and RSA achieved first ranks while the proposed RSA-SO achieved ranks of 4, 4, and 2, respectively.
The RSA-SO provides the best fitness values in eight datasets, while all the methods have similar average best fitness on the Tic-tac-toe dataset. RSA-SO has the smallest worst fitness value in seven datasets, while it has similar average best fitness on the Breastcancer dataset. Moreover, the RSA-SO has better Avg and STD of fitness values in eight and six datasets, respectively. RSA-SO and SSA had the same Avg and STD of fitness values on the HeartEW dataset, while WOA, SSA, RSA, and RSA-SO had the same Avg and STD on the Chemical Water dataset. These results support the capability of the introduced RSA-SO in sustaining a stable balance between the two main principles of MH methods.

The average computational time in seconds for the RSA-SO and the other MH methods, computed over 20 independent runs on all the datasets, is provided in Table 6. According to the results in Table 6, the average computational time consumed by the RSA-SO is lower than that of PSO, GWO, MVO, WOA, SSA, RSA, and SO in five datasets. This is because both the RSA and SO run at the same time in a parallel manner at each iteration, which decreases the running time. Taking into account the accuracy rate and running time, the introduced RSA-SO proves to be superior since it gained a high accuracy rate and a competitive execution time on most of the datasets. WOA ranked first for BreastEW, while SSA placed first for the HeartEW and Tic-tac-toe datasets. GWO required the least time on SpectEW and Chemical Water, and PSO needed the least time on the Zoo dataset.

The convergence behavior of the introduced RSA-SO is shown in Figure 4, with the 100 iterations on the x-axis and the average fitness values on the y-axis. Figure 4 presents the convergence curves of the best solution obtained after executing each method for 20 runs.
In Figure 4, one can observe that RSA-SO converges faster and to better fitness values than the other methods on nine out of the twelve datasets, the exceptions being the IonosphereEW, SpectEW, and Zoo datasets, which supports its suitability for the problem of FS.
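The ranking step behind the Friedman test can be illustrated with a short sketch that ranks the methods on each dataset (lower fitness is better, ties share the average rank) and then averages the ranks per method. The function name `average_ranks` is hypothetical and any scores passed to it in an example are made up for illustration; they are not results from this paper.

```python
def average_ranks(scores):
    """Average per-dataset rank of each method (rank 1 = lowest fitness).

    scores : dict mapping a method name to a list with one fitness per dataset
    """
    methods = list(scores)
    n_datasets = len(next(iter(scores.values())))
    totals = {m: 0.0 for m in methods}
    for d in range(n_datasets):
        ordered = sorted(methods, key=lambda m: scores[m][d])
        pos = 0
        while pos < len(ordered):
            # Extend the group while the following methods are tied on dataset d.
            end = pos
            while (end + 1 < len(ordered)
                   and scores[ordered[end + 1]][d] == scores[ordered[pos]][d]):
                end += 1
            # Tied methods share the average of the 1-based positions they span.
            shared_rank = (pos + end) / 2 + 1
            for m in ordered[pos:end + 1]:
                totals[m] += shared_rank
            pos = end + 1
    return {m: totals[m] / n_datasets for m in methods}
```

The method with the smallest average rank is placed first overall; the Friedman statistic itself is then computed from these average ranks.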

Performance of RSA-SO in Engineering
In this section, the performance of the RSA-SO is tested on well-known engineering problems, which are Pressure Vessel Design (PVD) and Cantilever Beam Design (CBD).

Pressure Vessel Design (PVD)
The optimal design of a PVD aims to reduce the total cost of a pressure vessel, which comprises material, shaping, and welding costs [30]. The PVD problem consists of four variables, as given in Figure 5: $T_s$ denotes the thickness of the shell, $T_h$ presents the thickness of the head, R is the inner radius, and L provides the length of the cylindrical section of the vessel. With $\vec{x} = [x_1, x_2, x_3, x_4] = [T_s, T_h, R, L]$, the objective function of this problem can be written as:

Minimize:

$$f(\vec{x}) = 0.6224\, x_1 x_3 x_4 + 1.7781\, x_2 x_3^2 + 3.1661\, x_1^2 x_4 + 19.84\, x_1^2 x_3$$

Subject to:

$$\begin{aligned} g_1(\vec{x}) &= -x_1 + 0.0193\, x_3 \le 0 \\ g_2(\vec{x}) &= -x_2 + 0.00954\, x_3 \le 0 \\ g_3(\vec{x}) &= -\pi x_3^2 x_4 - \tfrac{4}{3} \pi x_3^3 + 1{,}296{,}000 \le 0 \\ g_4(\vec{x}) &= x_4 - 240 \le 0 \end{aligned}$$

Variable range: $0 \le x_i \le 100$ for $i = 1, 2$ and $10 \le x_i \le 200$ for $i = 3, 4$.

Table 7 lists the results obtained by the RSA-SO for the PVD problem and compares them with the other methods. As listed in Table 7, the suggested RSA-SO provides a lower cost than the PSO, GWO, MVO, WOA, SSA, RSA, and SO methods, and therefore, RSA-SO is suggested as a helpful method for the PVD problem. GWO placed second, MVO and SO placed third and fourth, and RSA placed last for the PVD problem.

Cantilever Beam Design (CBD)
Figure 6 illustrates the design of the CBD problem. The problem tries to minimize the total weight, and it has five parameters: x1, x2, x3, x4, and x5 [31]. The objective function of the CBD problem can be mathematically presented as follows:

Minimize:

$$f(\vec{x}) = 0.6224\,(x_1 + x_2 + x_3 + x_4 + x_5)$$

Subject to:

$$g(\vec{x}) = \frac{61}{x_1^3} + \frac{37}{x_2^3} + \frac{19}{x_3^3} + \frac{7}{x_4^3} + \frac{1}{x_5^3} - 1 \le 0$$

In Table 8, the performance results of the RSA-SO for the CBD engineering problem are compared with those of the other MH methods. As per Table 8, the best weight obtained by RSA-SO is the smallest compared to the other methods. MVO, WOA, and SO place second, third, and fourth, respectively, while SSA and RSA are in last place.

Based on the previous results and discussion, the developed RSA-SO has a high ability to explore the feasible region which contains the optimal solution. However, the time complexity of RSA-SO still needs further improvement, especially when it is applied to high-dimensional data.
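For readers who want to experiment with the PVD problem, the sketch below codes the commonly used form of its cost function and four inequality constraints, together with a simple quadratic penalty so that an unconstrained optimizer can be applied. The penalty scheme and the names `pvd_cost`, `pvd_constraints`, and `pvd_penalized` are assumptions made for illustration; the text does not state how constraints are handled in RSA-SO.

```python
import math

def pvd_cost(ts, th, r, length):
    """Cost of the pressure vessel (material, forming, and welding)."""
    return (0.6224 * ts * r * length
            + 1.7781 * th * r ** 2
            + 3.1661 * ts ** 2 * length
            + 19.84 * ts ** 2 * r)

def pvd_constraints(ts, th, r, length):
    """Constraint values g1..g4; a design is feasible when all are <= 0."""
    return [
        -ts + 0.0193 * r,
        -th + 0.00954 * r,
        -math.pi * r ** 2 * length - (4.0 / 3.0) * math.pi * r ** 3 + 1_296_000,
        length - 240.0,
    ]

def pvd_penalized(x, penalty=1e6):
    """Cost plus a quadratic penalty for violated constraints (assumed scheme)."""
    violation = sum(max(0.0, g) ** 2 for g in pvd_constraints(*x))
    return pvd_cost(*x) + penalty * violation

# A feasible design: the penalty term is zero, so both values coincide.
design = (1.0, 0.5, 50.0, 120.0)
feasible = all(g <= 0 for g in pvd_constraints(*design))
```

An MH method would minimize `pvd_penalized` over the variable ranges given above, with infeasible designs pushed away by the penalty term.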

Conclusions and Future Works
FS is one of the key factors in improving the classifier capability in classification problems. In this paper, an FS approach based on RSA and SO, named RSA-SO, is presented. The introduced RSA-SO approach employs both RSA and SO in a parallel mechanism to tackle the problem of FS. We tested the RSA-SO approach on twelve different public datasets taken from UCI and on two engineering problems. RSA-SO's capability was evaluated using a set of evaluation measures and compared with some recently reported MH methods for FS, including SO, RSA, SSA, WOA, MVO, GWO, and PSO. The results verify that RSA-SO has a competitive performance relative to the other MH methods for FS, and that it can provide practical and accurate solutions for two engineering optimization problems. For future work, RSA-SO will be applied to address other problems in different fields, such as sentiment analysis, Big Data, smart cities, and other practical engineering problems.

Conflicts of Interest:
All authors declare that they have no conflict of interest.