Binarization of Metaheuristics: Is the Transfer Function Really Important?

In this work, an approach to solving binary combinatorial problems with continuous metaheuristics is studied. It focuses on the importance of binarization in the optimization process, since binarization can have a significant impact on the performance of the algorithm. Different binarization schemes are presented, and sets of actions, each combining different transfer functions and binarization rules, are evaluated under a selector based on reinforcement learning. The experimental results show that binarization rules have a greater impact than transfer functions on the performance of the algorithms and that some sets of actions are statistically better than others. In particular, it was found that the sets that incorporate the elitist or roulette elitist binarization rule are the best. Furthermore, exploration and exploitation were analyzed through percentage graphs, and a statistical test was performed to determine the best set of actions. Overall, this work provides a practical approach for selecting binarization schemes in binary combinatorial problems and offers guidance for future research in this field.


Introduction
In recent years, the optimization of systems has become a fundamental task in various areas of industry and technology. The search for optimal solutions to complex and multidimensional problems is a constant challenge in fields such as engineering, economics, physics, and computer science. In this sense, the application of metaheuristic optimization algorithms has become a valuable tool for finding efficient and effective solutions to these problems. Furthermore, the use of continuous metaheuristics to solve binary problems has become increasingly relevant due to their ability to find optimal solutions in a short period of time [1]. These techniques are capable of efficiently exploring and exploiting the search space, making them ideal for optimization problems in fields such as engineering, data science, and industry. However, it is important to note that the proper selection of parameters and search strategies is essential for obtaining optimal results. The proposal of this work is to study different sets of actions (combinations of transfer functions and binarization rules) in order to evaluate their impact on the resolution of binary combinatorial problems. The different sets were evaluated within the binarization scheme selector (BSS) proposed in [2], where Q-learning, a machine learning technique, acts as an intelligent selector of binarization schemes.
The no free lunch (NFL) theorem [3] states that there is no single algorithm capable of reaching the optimal solution to all optimization problems. This motivates researchers to innovate and/or create new algorithms and validate them on different optimization problems with different domains. Tests were carried out with three well-known metaheuristics (MHs): grey wolf optimization [4], the whale optimization algorithm [5], and the sine cosine algorithm (SCA) [6].
The main contributions of this work are as follows:
• Evaluate different sets of transfer functions and binarization rules.
• Explore the importance of binarization rules compared to transfer functions.
• Compare the results across three different and complex metaheuristics.
• Conduct a comprehensive comparison of the results obtained by solving the set covering problem.
The structure of the paper is as follows: In Section 2, a review of related work on the use of continuous metaheuristics and reinforcement learning in binary combinatorial problems is presented. Section 2.1 presents how continuous metaheuristics solve binary combinatorial optimization problems. Section 2.4 presents the hybridization between continuous metaheuristics and Q-learning, where Q-learning acts as an intelligent selector of binarization schemes. Section 3 presents the different sets of actions (pairs of transfer functions and binarization rules) to be compared. Section 4 presents the results obtained and the analysis performed, ending with the final conclusions in Section 5.

Related Work
In this section, we will discuss related work on four topics. The first topic is continuous MHs for solving combinatorial problems, discussed in Section 2.1. To perform work in binary domains, the second topic is two-step techniques, covered in Section 2.2. Additionally, in Section 2.3, we will explain the MHs used in this study. Finally, in Section 2.4, we describe the binarization scheme selector (BSS) hybridization technique.

Continuous Metaheuristics to Solve Combinatorial Problems
The binarization techniques used in continuous MHs involve transferring continuous-domain values to binary domains, with the aim of maintaining the quality of the moves and generating high-quality binary solutions. While some MHs operate on binary domains without a binary scheme, studies have demonstrated that continuous MHs supported by a binary scheme perform exceptionally well on multiple NP-hard combinatorial problems [1]. Examples of such MHs include the binary bat algorithm [28,29], particle swarm optimization [30], the binary sine cosine algorithm [2,31–33], the binary salp swarm algorithm [34,35], the binary grey wolf optimizer [32,36,37], the binary dragonfly algorithm [38,39], the binary whale optimization algorithm [2,32,40], and the binary magnetic optimization algorithm [41].
In the scientific literature, two main groups of binary schemes used to solve combinatorial problems can be identified. The first group refers to operators that do not alter the operations of the different elements of the MH. Within this group, two-step techniques stand out as the most widely used in recent years, as they are considered the most efficient in terms of convergence and their ability to find optimal solutions. These techniques modify the solution in the first step and discretize it into a 0 or a 1 in the second step [42]. In addition, the angle modulation technique also belongs to this group, as it has been shown to be effective in solving combinatorial problems [43].
On the other hand, the second group of binary schemes includes methods that alter the normal operation of an MH. One example is the quantum binary approach, which is based on the application of quantum mechanisms to solve combinatorial problems [44]. Also included in this group are set-based approaches, which focus on the selection of solution sets to improve the efficiency of the MH. Finally, clustering-based techniques, such as the k-means approach [45,46], are also considered part of this second group, as they modify the normal operation of the MH to improve its ability to find optimal solutions.

Two-Step Techniques
In the scientific community, two-step binarization schemes are very relevant [1]. They have been widely used to solve a variety of combinatorial problems [47]. As the name suggests, this binarization scheme consists of two stages. The first stage applies a transfer function [30], which maps the values generated by the continuous MH to the continuous interval between 0 and 1. The second stage applies a binarization rule, which discretizes the numbers within that interval into binary values. This technique has been shown to be effective in solving combinatorial problems, since it preserves the quality moves of the continuous MH while generating high-quality binary solutions.
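The two stages above can be sketched as follows. This is a minimal illustration assuming the sigmoid S-shaped transfer function and the standard binarization rule; the full catalogue of functions and rules used in this work is given in Tables 1, 2, and 4.

```python
import math
import random

def s_shape(d):
    """Step 1: transfer function mapping a continuous value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-d))

def standard_rule(prob, rng=random.random):
    """Step 2: binarization rule discretizing the probability into 0 or 1."""
    return 1 if rng() < prob else 0

def binarize(solution, transfer=s_shape, rule=standard_rule):
    """Two-step binarization of one continuous solution."""
    return [rule(transfer(d)) for d in solution]
```

Any transfer function and rule pair from the tables can be plugged in through the `transfer` and `rule` parameters, which is exactly what makes the pair a selectable "action" later on.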

Transfer Function
In 1997, Kennedy et al. [48] introduced transfer functions in the field of optimization. Their main advantage is that they provide a probability between 0 and 1 at low computational cost. There are several types of transfer functions, including those in the form of an S [30,49] and a V [50], as well as the O, Z, X, Q, and U forms, among others. These functions map the values generated by the continuous MH to the continuous interval between 0 and 1, allowing the quality movements of the continuous MH to be preserved while high-quality binary solutions are generated. The families of transfer functions used in this work can be seen in Tables 1 and 2 and Figures 1 and 2. The notation d_w^j observed in Tables 1 and 2 corresponds to the continuous value of the j-th dimension of the w-th individual resulting from the perturbation performed by the continuous metaheuristic.
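The S-shaped and V-shaped families can be sketched as below. These are the forms widely used in the literature since Mirjalili and Lewis's transfer-function study; the exact variants adopted in this paper are the ones listed in Tables 1 and 2, so the equations here should be read as representative, not authoritative.

```python
import math

# Common S-shaped (sigmoid-like) transfer functions S1-S4.
S_FAMILY = [
    lambda x: 1.0 / (1.0 + math.exp(-2.0 * x)),                  # S1
    lambda x: 1.0 / (1.0 + math.exp(-x)),                        # S2
    lambda x: 1.0 / (1.0 + math.exp(-x / 2.0)),                  # S3
    lambda x: 1.0 / (1.0 + math.exp(-x / 3.0)),                  # S4
]

# Common V-shaped (symmetric) transfer functions V1-V4.
V_FAMILY = [
    lambda x: abs(math.erf(math.sqrt(math.pi) / 2.0 * x)),       # V1
    lambda x: abs(math.tanh(x)),                                 # V2
    lambda x: abs(x / math.sqrt(1.0 + x * x)),                   # V3
    lambda x: abs(2.0 / math.pi * math.atan(math.pi / 2.0 * x)), # V4
]
```

Note the qualitative difference: S-shaped functions map 0 to 0.5 (maximum uncertainty at the origin), while V-shaped functions map 0 to 0 (small moves keep the current bit more often under rules that act on the probability).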
It is important to note that no transfer function is superior to the others in all the cases in which they have been used: according to the "no free lunch" theorem, there is no universal optimization algorithm that is better than the others in all situations. This theorem therefore leaves room for the experimentation and analysis of new optimization algorithms.

Transfer Functions
(Table 2 lists the X-shaped [51,52] and Z-shaped [53,54] transfer function equations.) On the other hand, different researchers [38,55–57] incorporated a parameter into these transfer functions, thus developing time-varying transfer functions. This parameter varies iteration by iteration, generating a new transfer function at each iteration. For example, in [38], the authors incorporate a parameter τ that is added to the transfer functions of the S-shaped and V-shaped families. Table 3 shows the time-varying transfer function equations proposed by the authors, and Figure 3 shows the results of running 100 iterations of each time-varying transfer function.
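A time-varying transfer function can be sketched as follows. The linearly decreasing τ schedule and its bounds (τ_max = 4, τ_min = 0.01) are assumptions for illustration; the exact schedule used in [38] is the one given in Table 3.

```python
import math

def tau(t, t_max, tau_min=0.01, tau_max=4.0):
    """Assumed control parameter: decreases linearly from tau_max to tau_min.

    The bounds and the linear schedule are illustrative assumptions; see [38]
    and Table 3 for the exact formulation."""
    return tau_max - (tau_max - tau_min) * t / t_max

def s_time_varying(x, t, t_max):
    """Time-varying S-shaped transfer function: as tau shrinks over the run,
    the sigmoid sharpens, making late-stage binarization more decisive."""
    return 1.0 / (1.0 + math.exp(-x / tau(t, t_max)))
```

The intended effect is that early iterations produce soft probabilities (favoring exploration), while late iterations produce near-0/1 probabilities (favoring exploitation).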

Time-Varying Transfer Functions [38]

Binarization Rule
The process of binarization involves converting continuous values into binary values, that is, values of 0 or 1. In this context, binarization rules are applied to the probability obtained from the transfer function to obtain a binary value. There are various techniques described in the scientific literature [58] that can be used for this binarization process. Some of these techniques are illustrated in Table 4 and can vary depending on the context and specific project needs. It is crucial to use an appropriate binarization technique to obtain accurate and reliable results.

(Table 4 lists the binarization rules used in this work: standard, complement, static probability, elitist, and roulette elitist.)
The notation X_w^j observed in Table 4 corresponds to the binary value of the j-th dimension of the w-th current individual, and X_Best^j corresponds to the binary value of the j-th dimension of the best solution.
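Three of the rules in Table 4 can be sketched as follows, using the definitions commonly found in the literature [58] (the paper's exact equations are in Table 4, so treat these as representative). Here `prob` is the transfer-function output, `x_w` the current bit X_w^j, and `x_best` the best solution's bit X_Best^j.

```python
import random

def standard(prob, x_w, x_best, rng=random.random):
    """Standard rule: set the bit to 1 with probability prob, else 0."""
    return 1 if rng() < prob else 0

def complement(prob, x_w, x_best, rng=random.random):
    """Complement rule: flip the current bit with probability prob, else 0."""
    return 1 - x_w if rng() < prob else 0

def elitist(prob, x_w, x_best, rng=random.random):
    """Elitist rule: copy the best solution's bit with probability prob, else 0."""
    return x_best if rng() < prob else 0
```

The shared signature is deliberate: it lets a selector swap rules at runtime without changing the surrounding binarization loop.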

Metaheuristics
Metaheuristics are general-purpose algorithms that provide good solutions in a reasonable time. The search process consists of balancing the diversification and intensification phases by means of operators specific to each algorithm [59]. Exploration aims to find tentative regions with good solutions, and exploitation intensifies the search in the best regions to try to find better solutions.
Human behavior, genetic evolution, the social behavior of animals, and physical phenomena are some of the main sources of inspiration for authors, and every year new metaheuristics are developed based on the no free lunch theorem [3]. This theorem tells us that there is no supreme algorithm that solves all optimization problems.
In general, metaheuristics are designed and used to solve continuous optimization problems. In the following sections, we present a brief summary of the three metaheuristics used in this study, with the aim of providing background on how each metaheuristic works; for more information, see the seminal manuscripts.

Sine Cosine Algorithm
The sine cosine algorithm (SCA) was proposed by Mirjalili [6]. This metaheuristic combines two main equations into one (Equation (2)) and uses four main parameters (r_1, r_2, r_3, and r_4) to update the position of the solutions. The combined equations are as follows:

X_i^(t+1) = X_i^t + r_1 · sin(r_2) · |r_3 · X_Best^t − X_i^t|,   if r_4 < 0.5
X_i^(t+1) = X_i^t + r_1 · cos(r_2) · |r_3 · X_Best^t − X_i^t|,   if r_4 ≥ 0.5

where X_i^t is the position of the current solution in the i-th dimension at the t-th iteration, X_Best^t is the best individual's position at the t-th iteration, and r_1, r_2, r_3, and r_4 are random parameters. SCA uses the latter parameters to avoid entrapment in suboptimal solutions and to balance the exploration and exploitation processes.
• r_1 is a linearly decreasing parameter calculated as r_1 = a − t · (a / T_max), where a is a constant, t is the current iteration, and T_max represents the maximum number of iterations allowed. This parameter conditions the movement of the solution either towards the best solution (r_1 < 1) or away from it (r_1 > 1), balancing exploration and exploitation.
• r_2 takes values in the range [0, 2π] and determines how far a solution moves towards or away from the destination.
• r_3 takes values in the range [0, 2] and assigns a weight to the destination, reinforcing or inhibiting the impact of the destination point on the update of the other solutions.
• r_4, with values in the range [0, 1], switches between the sine and cosine functions.
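The update rule and parameter roles above can be sketched for a single solution as follows; this is a minimal reading of Equation (2), drawing r_2, r_3, and r_4 per dimension as in the original algorithm.

```python
import math
import random

def sca_update(x, x_best, t, t_max, a=2.0, rng=random.random):
    """One SCA position update (Equation (2)) for a single solution.

    r1 decreases linearly (r1 = a - t*a/t_max); r2, r3, r4 are random."""
    r1 = a * (1.0 - t / t_max)
    updated = []
    for xj, bj in zip(x, x_best):
        r2 = 2.0 * math.pi * rng()            # movement amplitude/direction
        r3 = 2.0 * rng()                      # weight on the destination
        r4 = rng()                            # sine/cosine switch
        trig = math.sin(r2) if r4 < 0.5 else math.cos(r2)
        updated.append(xj + r1 * trig * abs(r3 * bj - xj))
    return updated
```

Note how r_1 = 0 at the final iteration freezes the population, which is the mechanism behind SCA's late-stage intensification.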

Grey Wolf Optimizer
The grey wolf optimizer (GWO) was proposed by Mirjalili [4]. This metaheuristic is inspired by both the hunting behavior and the social hierarchy of the grey wolf. Within the pack there are four levels of social hierarchy:
• Alpha (α): wolves at the top of the hierarchy that lead the pack.
• Beta (β): wolves that support the alpha wolves' decisions.
• Delta (δ): wolves that are strong but lack leadership skills.
• Omega (ω): wolves with no power, dedicated to following, helping, and protecting the younger members of the pack.
Applying the previously described hierarchy, at each step we denote the best three solutions as alpha, beta, and delta, and the other solutions as omega. Basically, this means that the optimization process follows the positions of the three best wolves in the hierarchy. In addition, the prey represents the optimal solution.
Most of the logic follows the equations:

D = |C · X_p(t) − X(t)|
X(t + 1) = X_p(t) − A · D

where t denotes the current iteration, A and C are coefficient vectors, X_p is the position vector of the prey, X is the position of the wolf, and "·" represents element-wise multiplication. Vectors A and C are equal to:

A = 2a · r_1 − a
C = 2 · r_2

where the components of a are linearly decreased from 2 to 0 through the iterations and r_1, r_2 are random vectors with values in [0, 1], calculated for each wolf at each iteration. Vector A controls the trade-off between exploration and exploitation, while C always adds some degree of randomness. This is necessary because the agents can become stuck in local optima, and most metaheuristics include a mechanism for avoiding this.
Since we do not know the real position of the optimal solution, X_p depends on the three best solutions, and the formulas for updating each of the agents (wolves) are:

D_α = |C_1 · X_α − X|,  D_β = |C_2 · X_β − X|,  D_δ = |C_3 · X_δ − X|
X_1 = X_α − A_1 · D_α,  X_2 = X_β − A_2 · D_β,  X_3 = X_δ − A_3 · D_δ
X(t + 1) = (X_1 + X_2 + X_3) / 3

where X is the current position of the agent and X(t + 1) is the updated one. The formula above indicates that the position of the wolf is updated according to the best three wolves from the previous iteration. Notice that it will not be exactly equal to the average of the three best wolves because of the vector C, which adds a small random shift. This makes sense because, on the one hand, we want the agents to be guided by the best individuals, but on the other hand, we do not want them to become stuck in local optima.
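The three-leader update can be sketched for one wolf as follows; A and C are drawn independently per leader and per dimension, matching the per-wolf, per-iteration randomness described above.

```python
import random

def gwo_update(wolf, alpha, beta, delta, a, rng=random.random):
    """Update one wolf's position from the three best wolves (alpha, beta, delta).

    a is the component that decreases linearly from 2 to 0 over the run."""
    updated = []
    for j, xj in enumerate(wolf):
        guided = []
        for leader in (alpha, beta, delta):
            A = 2.0 * a * rng() - a            # A = 2a*r1 - a
            C = 2.0 * rng()                    # C = 2*r2
            D = abs(C * leader[j] - xj)        # D = |C*X_leader - X|
            guided.append(leader[j] - A * D)   # X_k = X_leader - A*D
        updated.append(sum(guided) / 3.0)      # X(t+1) = (X1 + X2 + X3) / 3
    return updated
```

When a reaches 0 at the end of the run, A vanishes and each wolf settles on the plain average of the three leaders, which is the exploitation limit of the algorithm.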

Whale Optimization Algorithm
The whale optimization algorithm (WOA) is a metaheuristic that was proposed by Mirjalili and Lewis [5]. Like the GWO, the whale metaheuristic is inspired by a hierarchy and a particular way of hunting called bubble-net hunting. The metaheuristic has three main phases:
(1) Exploration phase: searching for the prey.
(2) Encircling the prey.
(3) Exploitation phase: attacking the prey using the bubble-net method.
Based on the three main phases mentioned above, at each step of the exploration phase the search agent (whale) is updated based on a random agent instead of the best one. The mathematical model behind this logic is the following:

D = |C · X_Rand − X|
X(t + 1) = X_Rand − A · D

where A and C are coefficient vectors and X_Rand is a random position vector selected from the current population. Vectors A and C are equal to:

A = 2a · r − a
C = 2 · r

where r is a random vector in the range [0, 1] and a decreases linearly from 2 to 0 during the iterations. In addition, if |A| > 1, the search agent is forced to move away from a reference whale. Humpback whales encircle the prey during hunting, considering the current best candidate solution as the best, near-optimal one. In short, the model of encircling behavior used to update the position of the other whales towards the best search agent is:

D = |C · X*(t) − X(t)|
X(t + 1) = X*(t) − A · D

where t is the current iteration, X* is the position of the best solution, X refers to the position vector of a solution, and A and C are the coefficient vectors shown in Equations (16) and (17).
The exploitation phase combines two approaches: a shrinking encircling mechanism and a spiral update of the position. In the shrinking encircling mechanism, the value of A is random within the interval [−a, a], and a decreases from 2 to 0 as previously stated in Equation (16). In the spiral position update mechanism, we first calculate the distance between the search agent (whale) and the best solution (prey); the mathematical model that simulates this movement is as follows:

X(t + 1) = D' · e^(bl) · cos(2πl) + X*(t)

where D' = |X*(t) − X(t)| represents the distance between the whale and the prey (the best solution obtained so far), l is a random number in [−1, 1] with a uniform distribution, and b is a constant defining the shape of the logarithmic spiral. Finally, the mathematical model that combines the two mechanisms is:

X(t + 1) = X*(t) − A · D,                     if p < 0.5
X(t + 1) = D' · e^(bl) · cos(2πl) + X*(t),    if p ≥ 0.5

where p is a random number in the range [0, 1] and represents the probability of selecting one of these two methods to update the position of the whales.
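The three mechanisms (search, encircling, spiral) can be combined into one position update as sketched below; b = 1 is an assumed spiral constant, and the |A| test routes between exploring a random whale and encircling the best, as described above.

```python
import math
import random

def woa_update(whale, best, population, a, b=1.0, rng=random.random):
    """One WOA position update combining encircling, spiral, and search phases."""
    p = rng()                                    # choose between the two mechanisms
    updated = []
    for j, xj in enumerate(whale):
        A = 2.0 * a * rng() - a
        C = 2.0 * rng()
        if p < 0.5:
            if abs(A) < 1.0:
                target = best[j]                 # encircle the best solution
            else:                                # explore: follow a random whale
                target = population[int(rng() * len(population))][j]
            updated.append(target - A * abs(C * target - xj))
        else:                                    # spiral update toward the best
            l = 2.0 * rng() - 1.0                # uniform in [-1, 1]
            dist = abs(best[j] - xj)             # D' = |X*(t) - X(t)|
            updated.append(dist * math.exp(b * l) * math.cos(2.0 * math.pi * l)
                           + best[j])
    return updated
```

Since a shrinks over the run, |A| ≥ 1 (the exploratory branch) becomes progressively rarer, which is how WOA shifts from searching for prey to attacking it.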

Hybridization: Binarization Schemes Selector
In the literature, there are several related works on binarization [30,42] that have laid the groundwork for investigations into this problem domain, as there are several practical applications where working in binary domains is necessary. Moreover, research has emerged on how changing binarization schemes affects each iteration of the search process, such as time-varying binarization schemes [38] and binarization scheme selectors [2,32,60], where the influence of binarization schemes and their impact both at the problem level and at each iteration of the search has been demonstrated.
In Section 2.1, we can see how to adapt a continuous MH so that it can solve binary combinatorial problems. The two-step technique provides us with different possible combinations for binarizing continuous solutions. Although it is desirable to have a wide variety of combinations, each one must be tested individually to determine which is best for a given problem.
In the literature, various related works have proposed the hybridization of the sine cosine algorithm, grey wolf optimizer, and whale optimization algorithm with Q-learning [2,31,36,40]. Q-learning was used as a dynamic binarization scheme selector in each of these metaheuristics, allowing them to solve binary combinatorial problems. This approach delegates the selection of a good binarization scheme to an artificial intelligence, eliminating any human bias.
The BSS provides a way to dynamically choose the binarization scheme, in this case the combination of transfer function and binarization rule, from within a set of them, using the scheme proposed in [2], where an intelligent operator chooses among a set of actions (possible binarization schemes) by observing the environment (exploration or exploitation).
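The BSS selection loop can be sketched as a standard Q-learning agent over the two states. The epsilon-greedy policy and the α, γ, and ε values below are illustrative assumptions; the actual parameter values used in the experiments are the ones listed in Table 6.

```python
import random

def select_action(q_table, state, actions, epsilon=0.1, rng=random.random):
    """Epsilon-greedy choice of a binarization scheme for the current state
    ('exploration' or 'exploitation'); epsilon is an assumed value."""
    if rng() < epsilon:
        return actions[int(rng() * len(actions))]      # random exploration
    q_row = q_table[state]
    return max(actions, key=lambda a: q_row.get(a, 0.0))

def update_q(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.4):
    """Standard Q-learning update: Q(s,a) += alpha*(r + gamma*max_a' Q(s',a') - Q(s,a)).
    alpha and gamma here are assumed values."""
    best_next = max(q_table[next_state].values(), default=0.0)
    q = q_table[state].get(action, 0.0)
    q_table[state][action] = q + alpha * (reward + gamma * best_next - q)
```

Each optimization iteration would call `select_action` before binarizing the population and `update_q` after observing the reward, so schemes that pay off in the current search phase accumulate higher Q-values.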

Actions
The decision of which action to take is a complex task that requires careful evaluation of multiple options in different situations. In this context, Q-learning is utilized, a reinforcement learning technique that seeks to determine the best action to take in a specific state. In this work, as in previous studies, we define the considered actions as the possible combinations between the transfer functions (Tables 1 and 2) and the binarization rules (Table 4). As an example, Figure 4 shows the 40 possible actions from which Q-learning would choose if it worked only with the S- and V-shaped transfer functions and the five defined binarization rules.
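The action set of Figure 4 is simply the Cartesian product of the transfer functions and the binarization rules, which can be enumerated directly:

```python
from itertools import product

# The S- and V-shaped families (four functions each) crossed with the five
# binarization rules, as in Figure 4.
transfer_functions = [f"S{i}" for i in range(1, 5)] + [f"V{i}" for i in range(1, 5)]
binarization_rules = ["standard", "complement", "static probability",
                      "elitist", "roulette elitist"]

# Each Q-learning action is one (transfer function, binarization rule) pair.
actions = list(product(transfer_functions, binarization_rules))
```

Adding the X- and Z-shaped families (four functions each) to `transfer_functions` grows the product to the 80-action set studied later.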

States
As can be seen in Section 2.3, metaheuristics perform the search process alternating between diversification (exploration) and intensification (exploitation). Previous proposals [2,31,40,60,61] used these phases as the states in Q-learning. In these papers, the authors determined the stage of the search process by calculating the population diversity. In particular, they used the diversity proposed by Hussain Kashif et al. [62], which is defined as follows:

Div = (1 / (l · n)) · Σ_{d=1}^{l} Σ_{i=1}^{n} |x̄^d − x_i^d|

where Div represents the diversity state, x̄^d denotes the mean value of the individuals in dimension d, x_i^d denotes the value of the i-th individual in dimension d, n denotes the population size, and l denotes the number of dimensions of the individuals.
If we denote the exploration and exploitation percentages as XPL% and XPT%, respectively, they are computed following the study of Morales-Castañeda et al. [63] as follows:

XPL% = (Div / Div_max) · 100
XPT% = (|Div − Div_max| / Div_max) · 100

where Div represents the diversity state determined by Equation (23) and Div_max denotes the maximum value of the diversity state discovered throughout the optimization process.
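The diversity measure and the two percentages can be computed directly from the population matrix:

```python
def diversity(population):
    """Dimension-wise diversity of Hussain et al. [62]:
    Div = (1/(l*n)) * sum_d sum_i |mean_d - x_i^d|."""
    n = len(population)          # population size
    l = len(population[0])       # number of dimensions
    div = 0.0
    for d in range(l):
        mean_d = sum(ind[d] for ind in population) / n
        div += sum(abs(mean_d - ind[d]) for ind in population)
    return div / (l * n)

def phase_percentages(div, div_max):
    """XPL% and XPT% as defined by Morales-Castaneda et al. [63]."""
    xpl = 100.0 * div / div_max
    xpt = 100.0 * abs(div - div_max) / div_max
    return xpl, xpt
```

A simple thresholding of XPL% versus XPT% (e.g., whichever is larger) then yields the discrete "exploration"/"exploitation" state fed to Q-learning.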

The Proposal: Analysis of Different Sets of Actions
Analyzing all the work presented so far and considering what is stated in Section 2.2, we ask ourselves the following questions:
(1) Which has more impact on binarization, the transfer function or the binarization rule?
(2) Will the binarization scheme selector work better with more actions?
To answer these two questions, we apply the scheme proposed in Figure 5. For our analysis we used three continuous metaheuristics: SCA, GWO, and WOA. These three metaheuristics solved the set covering problem, a classical combinatorial optimization problem that will be defined in Section 4.1. Since it is a combinatorial problem and the chosen metaheuristics solve continuous problems, it is necessary to binarize the solutions.
As noted in Section 2.2, we have different ways of binarizing. On the other hand, Section 2.4 explains hybridizations where machine learning techniques are used to select the binarization scheme dynamically.
We use the hybridization proposed in Section 2.4 as it allows us to evaluate how well a machine learning technique performs against different sets of actions.

(Figure 5 summarizes the BSS flow: initialize the Q-table; at each iteration, binarize the population using the action selected by Q-learning, update the best solution, obtain the immediate reward for the action, and update the Q-table; on termination, return the best solution.)

In other words, we analyzed different combinations between four families of transfer functions (S-shaped, V-shaped, X-shaped, and Z-shaped) and the five binarization rules. Twelve different action sets were analyzed. First, a total of five sets with eight actions each were formed, where the S-shaped and V-shaped families were used as transfer functions and the binarization rule was fixed in each set. Secondly, another five sets with sixteen actions each were formed, where the S-shaped, V-shaped, X-shaped, and Z-shaped families were used as transfer functions and the binarization rule was fixed in each set. Thirdly, a set with 40 actions was formed, replicating the work presented by the authors in [2,31,36,40]. Finally, a set with 80 actions was formed, where the S-shaped, V-shaped, X-shaped, and Z-shaped families were used as transfer functions together with all the binarization rules.
We present Table 5, which shows the sets of actions analyzed in our study. The table includes 12 sets of actions, labeled TFBR-1 to TFBR-12, and provides information on their transfer functions and binarization rules. The transfer functions considered are S-shaped, V-shaped, X-shaped, and Z-shaped, while the binarization rules used are standard, complement, static probability, elitist, and roulette elitist. The last column of the table reports the number of actions associated with each set. The sets differ in the combination of transfer functions and binarization rules used and in the number of actions considered. Table 5 serves as a reference for the subsequent experiments, in which we evaluate the performance of different algorithms on each set of actions. The experimental results are presented in Section 4. In particular, Section 4.2 reports the results obtained by each executed algorithm, Section 4.3 analyzes the convergence of each executed algorithm, Section 4.4 analyzes the exploration and exploitation behavior of each executed algorithm using Equations (24) and (25), and, finally, Section 4.5 analyzes the statistical test in which all the executed versions are compared.

Experimental Results
To evaluate the performance of the proposed algorithms, the test cases of the set covering problem from Beasley's OR-Library [64] were used. In particular, 45 instances of this problem were solved. The algorithms were developed in the Python 3.7 programming language and executed using the free Google Colaboratory service [65]. The results obtained were stored and processed through databases provided by the Google Cloud Platform.
Following recommendations from the literature [58], 40,000 calls were made to the objective function in each run. To achieve this, a population of 40 individuals and 1000 iterations were used across all GWO, SCA, and WOA runs. Thirty-one independent runs were performed for each instance. The parameters used for GWO, SCA, WOA, and Q-learning are detailed in Table 6.

Set Covering Problem
The set covering problem (SCP) is a classic NP-hard combinatorial optimization problem [66] that consists of finding the set of elements with the lowest cost that satisfies a given set of needs. The objective function of the problem is as follows:

Minimize Z = Σ_{j=1}^{n} c_j · x_j

subject to the following restrictions:

Σ_{j=1}^{n} a_ij · x_j ≥ 1, ∀ i ∈ {1, 2, . . ., m}
x_j ∈ {0, 1}, ∀ j ∈ {1, 2, . . ., n}

where A is a binary matrix of size m rows by n columns and a_ij ∈ {0, 1} is the value of each cell of the matrix A. If column j satisfies (covers) row i, then a_ij is equal to 1; otherwise it is 0. In addition, each column j has an associated cost c_j ∈ C, where C = {c_1, c_2, . . ., c_n}, and i ∈ {1, 2, . . ., m} and j ∈ {1, 2, . . ., n} index the sets of rows and columns, respectively. Finally, x_j indicates whether column j is selected to cover the area. The mathematical model of the set covering problem is explained in more detail in [67]. This problem formulation has inspired the modeling of different real-world problems such as airline and bus crew scheduling [68], the location of gas detectors for industrial plants [69], plant location selection [70], the location of emergency services [71], dynamic vehicle routing problems [72], the location of electric vehicle charging points in California [73], disaster management systems [74], emergency humanitarian logistics [75], and the optimal placement of UAVs for generating wireless communication networks in disaster areas [76], among others.
These studies allow us to appreciate the importance of solving this problem with optimization techniques that guarantee good results.
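The SCP objective and coverage constraint above reduce to two short functions, shown here on a toy instance (the matrix and costs are illustrative, not from the OR-Library benchmarks):

```python
def scp_cost(x, costs):
    """Objective value: total cost of the selected columns (sum of c_j * x_j)."""
    return sum(c for c, bit in zip(costs, x) if bit == 1)

def scp_feasible(x, A):
    """Coverage constraint: every row i must be covered by at least one
    selected column j with a_ij = 1."""
    return all(any(a_ij == 1 and bit == 1 for a_ij, bit in zip(row, x))
               for row in A)
```

In the experimental pipeline, each binarized solution is checked (and typically repaired) for feasibility before its cost is compared against the best solution found so far.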

Summary of Results
This section presents an analysis of the results obtained using the three metaheuristics. Tables 7–9 display the relative percentage deviation (RPD) between the optimum and the best result obtained for 45 instances of the set covering problem using GWO, SCA, and WOA, respectively, with 12 different action sets. The RPD is defined in Equation (29) as

RPD = 100 · (Best − Opt) / Opt

and its values are grouped into four ranges: a deviation of 0%, more than 0% and up to 3%, more than 3% and up to 5%, and greater than 5%. The 12 different combinations are called TFBR-1 to TFBR-12. Each table has thirteen columns: the first indicates the four RPD ranges, and columns two to thirteen indicate, for each executed algorithm and evaluated combination, the number of instances whose RPD falls within the range indicated in each row.
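The RPD of Equation (29) is a one-line computation:

```python
def rpd(best, optimum):
    """Relative percentage deviation between the best value found and the
    known optimum; 0.0 means the optimum was reached."""
    return 100.0 * (best - optimum) / optimum
```

For a minimization problem like the SCP, the RPD is non-negative whenever the known optimum is valid, and the four reporting ranges used in Tables 7-9 are simple cutoffs on this value.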
Tables 10 and 11 show the results obtained with the twelve sets applied to the grey wolf optimizer (GWO). Tables 12 and 13 show the results obtained with the twelve sets applied to the sine cosine algorithm (SCA). Tables 14 and 15 show the results obtained with the twelve sets applied to the whale optimization algorithm (WOA).
For all tables, the first column indicates the name of the OR-Library instance solved (Inst.) and the second column the known optimal value for each instance (Opt.); the following three columns are repeated for each executed set. The first indicates the best result obtained over the 31 independent runs, the second the average of the 31 results obtained, and the third the RPD. As can be seen in Table 7, only five sets achieved an RPD = 0, i.e., reached the known optimum. In particular, TFBR-5 achieved this in 12 instances, TFBR-6 as well as TFBR-11 and TFBR-12 achieved this in 8 instances, and TFBR-10 achieved this in one instance.
Table 16 shows a ranking of the best sets of actions considering only the RPD obtained. From this table, the first and second best sets include the elitist binarization rule (TFBR-5: S-shaped and V-shaped × elitist and TFBR-11: S-shaped, V-shaped, X-shaped, and Z-shaped × elitist), and the third and fourth best sets include the roulette elitist binarization rule (TFBR-12: S-shaped, V-shaped, X-shaped, and Z-shaped × roulette elitist and TFBR-6: S-shaped and V-shaped × roulette elitist).
On the other hand, the worst and second worst sets include the standard binarization rule (TFBR-8: S-shaped, V-shaped, X-shaped, and Z-shaped × standard and TFBR-2: S-shaped and V-shaped × standard) and the third and fourth worst sets include the complement binarization rule (TFBR-3: S-shaped and V-shaped × complement and TFBR-9: S-shaped, V-shaped, X-shaped, and Z-shaped × complement).
In terms of the number of actions in each set, the best sets have 8 and 16 actions. In contrast, the sets with the most actions (TFBR-1 with 40 actions and TFBR-7 with 80 actions) are in the middle of the ranking.
Looking at these results, we can see that the binarization rule has a greater impact than the transfer functions. Moreover, increasing the number of actions does not imply better results.
As can be seen in Table 8, only five sets achieved an RPD = 0, i.e., reached the known optimum. In particular, TFBR-6 achieved this in seven instances, TFBR-5 as well as TFBR-11 and TFBR-12 achieved this in six instances, and TFBR-7 achieved this in one instance.
Table 17 shows a ranking of the best sets of actions considering only the RPD obtained. From this table, the best set includes the roulette elitist binarization rule (TFBR-6: S-shaped and V-shaped × roulette elitist), the second best set includes the elitist binarization rule (TFBR-5: S-shaped and V-shaped × elitist), the third best set also includes the roulette elitist binarization rule (TFBR-12: S-shaped, V-shaped, X-shaped, and Z-shaped × roulette elitist), and the fourth best set also includes the elitist binarization rule (TFBR-11: S-shaped, V-shaped, X-shaped, and Z-shaped × elitist).
On the other hand, the worst set includes the static probability binarization rule (TFBR-10: S-shaped, V-shaped, X-shaped, and Z-shaped × static probability), the second worst set includes the standard binarization rule (TFBR-8: S-shaped, V-shaped, X-shaped, and Z-shaped × standard), the third worst set also includes the static probability binarization rule (TFBR-4: S-shaped and V-shaped × static probability), and the fourth worst set also includes the standard binarization rule (TFBR-2: S-shaped and V-shaped × standard).
In terms of the number of actions in each set, the best sets have 8 and 16 actions.In contrast, the sets with the most actions (TFBR-1 with 40 actions and TFBR-7 with 80 actions) are in the middle of the ranking.
Looking at these results, we can see that the binarization rule has a greater impact than the transfer functions. Moreover, increasing the number of actions does not imply better results.

Table 18 shows a ranking of the best sets of actions considering only the RPD obtained. From this table, the best set includes the roulette elitist binarization rule (TFBR-6: S-shaped and V-shaped × roulette elitist), the second best set includes the elitist binarization rule (TFBR-5: S-shaped and V-shaped × elitist), the third best set also includes the roulette elitist binarization rule (TFBR-12: S-shaped, V-shaped, X-shaped, and Z-shaped × roulette elitist), and the fourth best set also includes the elitist binarization rule (TFBR-11: S-shaped, V-shaped, X-shaped, and Z-shaped × elitist).
On the other hand, the worst set includes the static probability binarization rule (TFBR-10: S-shaped, V-shaped, X-shaped, and Z-shaped × static probability), the second worst set includes the standard binarization rule (TFBR-8: S-shaped, V-shaped, X-shaped, and Z-shaped × standard), the third worst set includes the complement binarization rule (TFBR-3: S-shaped and V-shaped × complement), and the fourth worst set also includes the standard binarization rule (TFBR-2: S-shaped and V-shaped × standard).
In terms of the number of actions in each set, the best sets have 8 and 16 actions. In contrast, the sets with the most actions (TFBR-1 with 40 actions and TFBR-7 with 80 actions) are in the middle of the ranking.
Looking at these results, we can see that the binarization rule has a greater impact than the transfer functions. Moreover, increasing the number of actions does not imply better results.
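The interplay of transfer functions and binarization rules discussed above follows the usual two-step binarization scheme: a transfer function maps a continuous value to a probability, and a binarization rule turns that probability into a bit. The sketch below, with one illustrative representative per transfer-function family and three of the rules (the exact definitions are not restated from the paper), shows how an action set arises as the Cartesian product of the two families:

```python
import math
import random

# Representative transfer functions; the paper's families contain four
# S-shaped and four V-shaped variants (plus X- and Z-shaped families).
# One of each is sketched here for illustration.
def s_shaped(x):
    return 1.0 / (1.0 + math.exp(-x))   # classic sigmoid (S1)

def v_shaped(x):
    return abs(math.tanh(x))            # classic V1

# Illustrative binarization rules: each turns a transfer probability into
# a bit, possibly using context (the current bit or the best solution's bit).
def standard(prob, current_bit, best_bit):
    return 1 if random.random() < prob else 0

def complement(prob, current_bit, best_bit):
    return 1 - current_bit if random.random() < prob else 0

def elitist(prob, current_bit, best_bit):
    return best_bit if random.random() < prob else 0

# A set of actions is the Cartesian product of transfer functions and
# rules; e.g., 8 S/V functions x 5 rules = 40 actions (TFBR-1), and
# 16 functions x 5 rules = 80 actions (TFBR-7).
actions = [(tf, rule)
           for tf in (s_shaped, v_shaped)
           for rule in (standard, complement, elitist)]

def binarize(x, current_bit, best_bit, action):
    """Two-step binarization of one continuous dimension value."""
    tf, rule = action
    return rule(tf(x), current_bit, best_bit)
```

This cross-product structure is why the cardinality of a set grows with both families, while the ranking above shows that the rule component dominates performance.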

Convergence Analysis
In this section, a convergence analysis is presented for each metaheuristic using the 12 sets of actions. Figure 6 shows the 12 convergence graphs for the best of the 31 executions performed using the grey wolf optimizer solving the scp44 instance; Figure 7 shows the corresponding graphs for the sine cosine algorithm solving the scpb2 instance; and Figure 8 shows those for the whale optimization algorithm solving the scp65 instance.
For all figures, the x-axis shows the 1000 iterations performed and the y-axis shows the best fitness obtained during the optimization process.

Analysis of the Convergence Graphs Using Grey Wolf Optimizer
To analyze the convergence of the 12 sets applied to the GWO, the scp44 instance was used as an example. Table 19 shows the ranking of the algorithms, ordered from the best to the worst fitness obtained for the scp44 instance. The global optimum for the scp44 instance is 494.
Analyzing Figure 6e,f,i,k, we can see that the algorithms converged quickly but were able to escape this early stagnation and obtain better results. The sets of actions in these algorithms incorporate the elitist and elitist roulette binarization rules, and they are the best-performing sets.
On the other hand, analyzing Figure 6b,c,h, we can see that the algorithms had slow convergence, indicating that they explored more of the search space. The sets of actions in these algorithms incorporate the standard and complement binarization rules, and they are the worst-performing sets.
Finally, analyzing Figure 6a,g, we can see that the algorithms converged quickly and, judging by their behavior, fell into local optima, since they did not improve much after convergence. The sets of actions in these algorithms incorporate all of the binarization rules.

Analysis of the Percentage Graphs Using Grey Wolf Optimizer
The instance selected to analyze the behavior of the grey wolf optimizer is the same as the one selected in Section 4.3.1, i.e., the scp44 instance.
Analyzing Figure 9e,f,k,l, we can see that there is a good balance between exploration and exploitation, reaching percentages close to 50%. The sets of actions in these algorithms incorporate the elitist and elitist roulette binarization rules. If we recall Table 19, these sets have the best performance.
On the other hand, analyzing Figure 9a-c,g-i, we can see that there is a poor balance between exploration and exploitation. In particular, we can observe that the algorithms had high exploration rates throughout the iterations. The sets of actions in these algorithms incorporate the standard and complement binarization rules, together with the sets that include all of the binarization rules, and they are the ones that presented the worst results.
Finally, analyzing Figure 9d,j, we can see that they have high exploitation percentages. The sets of actions in these algorithms incorporate the static probability binarization rule. Considering the results obtained, we can conclude that both algorithms fell into local optima and were unable to escape from them.
From the above, we can conclude that the binarization rules have a greater impact than the transfer functions.
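The exploration and exploitation percentages shown in these graphs can be computed from a population diversity measure; the sketch below assumes the commonly used dimension-wise median diversity (the exact metric behind the paper's figures is not restated here):

```python
import numpy as np

def diversity(population):
    """Dimension-wise diversity: mean absolute deviation from the
    per-dimension median, averaged over all dimensions and individuals."""
    med = np.median(population, axis=0)
    return float(np.mean(np.abs(population - med)))

def xpl_xpt(div_history):
    """Exploration (XPL%) and exploitation (XPT%) percentages per
    iteration, relative to the maximum diversity observed in the run."""
    div = np.asarray(div_history, dtype=float)
    div_max = div.max()
    xpl = 100.0 * div / div_max
    xpt = 100.0 * np.abs(div - div_max) / div_max
    return xpl, xpt
```

Since the diversity never exceeds its observed maximum, XPL% and XPT% sum to 100 at every iteration, which is why the percentage curves in the graphs mirror each other.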

Analysis of the Percentage Graphs Using Sine Cosine Algorithm
The instance selected to analyze the behavior of the sine cosine algorithm is the same as the one selected in Section 4.3.2, i.e., the scpb2 instance.
Analyzing Figure 10e,f,k,l, we can see that all four algorithms show a high percentage of exploitation, with exploration appearing only in the first iterations. The sets of actions in these algorithms incorporate the elitist and elitist roulette binarization rules. If we recall Table 20, these sets are the ones that had the best results, and they even reached the known global optimum.
On the other hand, analyzing Figure 10a,b,d,g,j, we can see that the algorithms do not present a good balance between exploration and exploitation. In particular, they have a very aggressive behavior, where in one iteration they have a high percentage of exploration and in the next a high percentage of exploitation. The sets of actions in these algorithms incorporate the standard and static probability binarization rules, and the sets that include all of the binarization rules are also present. Using the fitness values shown in Table 20, we can see that this aggressive behavior leads to good results in some cases and very good results in others.
Finally, analyzing Figure 10c,i, we can see that they have high exploration percentages in most iterations. The sets of actions in these algorithms incorporate the complement binarization rule. Reviewing Table 20, we can see that they reach acceptable results, but Tables 12 and 13 show that they present a high average fitness. This indicates a stochastic behavior that provides low confidence in their results.
In conclusion, we can see that the binarization rules have a greater impact than the transfer functions.

Analysis of the Percentage Graphs Using Whale Optimization Algorithm
The instance selected to analyze the behavior of the whale optimization algorithm is the same as the one selected in Section 4.3.3, i.e., the scp65 instance.
Analyzing Figure 11e,f,k,l, we can see that all four algorithms show a high percentage of exploitation, with exploration appearing only in the first iterations. The sets of actions in these algorithms incorporate the elitist and elitist roulette binarization rules. If we recall Table 21, these sets are the ones that had the best results, and one of them even reached the known global optimum.
On the other hand, analyzing Figure 11a,b,d,g,h,j, we can see that the algorithms have an aggressive behavior at the beginning and, as the iterations go by, they present a more balanced behavior.

Table 25 shows the ranking of the sets by metaheuristic, taking into account the statistical test applied. For each metaheuristic there are two columns: the first refers to the name of the set and the second to the number of times the set was better than the others. A set is counted as the winner of a comparison when the p-value is lower than 0.05. In Section 3, we presented two research questions, as follows.
(1) Which has more impact on binarization: the transfer function or the binarization rule? (2) Will the binarization scheme selector work better with more actions?
If we analyze Table 25, we can see that, for the three metaheuristics used in this research, the best sets of actions are those shown in Table 26.
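The win counts in Table 25 follow a pairwise scheme: a set scores a win over another when a statistical test on the fitness samples yields a p-value below 0.05. A sketch of this tally, assuming a one-sided Wilcoxon-Mann-Whitney test on minimization fitness (the specific test used in the paper is not restated here):

```python
from itertools import combinations
from scipy.stats import mannwhitneyu

def count_wins(results, alpha=0.05):
    """results: dict mapping set name -> list of fitness values over the
    independent runs (minimization). Returns, for each set, how many other
    sets it beats with p < alpha under a one-sided test."""
    wins = {name: 0 for name in results}
    for a, b in combinations(results, 2):
        # H1: fitness of a tends to be lower (better) than fitness of b
        if mannwhitneyu(results[a], results[b], alternative='less').pvalue < alpha:
            wins[a] += 1
        if mannwhitneyu(results[b], results[a], alternative='less').pvalue < alpha:
            wins[b] += 1
    return wins
```

Ranking the sets by their win counts then yields a table with the same two-column shape as Table 25.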

Set ID | Transfer Functions | Binarization Rules | Amount of Actions
TFBR-5 | S-shaped and V-shaped | Elitist | 8
TFBR-6 | S-shaped and V-shaped | Roulette Elitist | 8
TFBR-11 | S-shaped, V-shaped, X-shaped, and Z-shaped | Elitist | 16
TFBR-12 | S-shaped, V-shaped, X-shaped, and Z-shaped | Roulette Elitist | 16

As can be seen in Table 26, the best action sets are composed of the elitist binarization rule and the elitist roulette binarization rule. Therefore, we can conclude that the binarization rules have a greater impact than the transfer functions and that the intelligent binarization scheme selector does not perform better by incorporating more actions.

Conclusions and Outlook
Continuous metaheuristics are optimization algorithms that can also be applied to combinatorial problems, which makes them a powerful tool for solving binary problems, as they can efficiently explore a large number of possible solutions. However, it is necessary to incorporate intermediate steps to convert continuous solutions to the binary domain. These techniques are also capable of avoiding local optima and finding high-quality solutions in problems with a large number of variables. This work has important implications for industry, as the set covering problem is a key problem in many applications, such as production, logistics, project planning, and resource allocation. Using metaheuristic optimization techniques to tackle this problem helps to improve efficiency and reduce costs. Additionally, the approach proposed in this work allows for selecting the best combination of binarization rules and transfer functions for a given problem instance, leading to even better performance.
A proposal is presented to improve the performance of metaheuristic optimization algorithms by using differentiated sets of actions. These sets of actions are composed of combinations of binarization rules and transfer functions. Twelve different sets of actions were proposed and applied to three different metaheuristic optimization algorithms: grey wolf optimizer, sine cosine algorithm, and whale optimization algorithm. These algorithms were applied to 45 different instances of the set covering problem.
The experimental results showed that the sets of actions that incorporate at least one elitist or elitist roulette binarization rule are the best, as they obtained the best results in terms of fitness and were statistically superior to the other sets of actions. Furthermore, it was found that binarization rules have a greater impact on the performance of metaheuristic algorithms than transfer functions. This work has demonstrated the importance of solving combinatorial binary problems using continuous metaheuristic techniques. Through the proposal that selects among a set of actions based on a reinforcement learning technique, it has been possible to improve the results obtained using traditional techniques. Additionally, it has been demonstrated that the elitist and elitist roulette binarization schemes are the most effective compared to the standard, complement, and static probability schemes. This work opens a new line of research to improve the resolution of combinatorial binary problems using continuous metaheuristic techniques and their hybridization with machine learning techniques.
In summary, this study goes beyond conventional research on the binarization of continuous metaheuristics by providing a deeper understanding of this fundamental process and by exploring approaches not previously examined in the literature. Through a comprehensive comparative analysis, we have shown concretely how the careful selection of transfer functions and binarization rules can make a substantial difference in the effectiveness and precision of binarization in the context of metaheuristics. These contributions make this study a useful reference for research in this discipline and underscore its significance for the optimization of continuous algorithms.
Future work could investigate how to use these optimization techniques on coverage problems with additional constraints, such as time or capacity constraints. It would also be interesting to investigate how these techniques behave with other algorithms, such as genetic algorithms and knowledge-based algorithms. Additionally, it would be important to investigate how these optimization techniques can be adapted to real-time coverage problems and how they can be integrated with existing automation systems in industry. In general, this work provides a solid foundation for future research in this area and demonstrates the importance of using optimization techniques in coverage problems.

Figure 4. Building actions on the basis of binarization schemes.

Table 6. Parameter settings.

Parameter a of SCA: 2
Parameter a of GWO: decreases linearly from 2 to 0
Parameter a of WOA: decreases linearly from 2 to 0
Parameter b of WOA: 1
Parameter α of Q-learning: 0.1
Parameter γ of Q-learning: 0.4
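The Q-learning parameters listed in Table 6 plug into the standard one-step update used by the binarization-scheme selector; a minimal sketch follows, where the state and reward design are hypothetical placeholders rather than the paper's exact formulation:

```python
import random

# Parameter values from Table 6
ALPHA, GAMMA = 0.1, 0.4

def q_update(Q, state, action, reward, next_state):
    # Standard one-step Q-learning rule:
    # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
    best_next = max(Q[next_state].values())
    Q[state][action] += ALPHA * (reward + GAMMA * best_next - Q[state][action])

def select_action(Q, state, epsilon=0.1):
    # Epsilon-greedy choice among the available binarization actions
    if random.random() < epsilon:
        return random.choice(list(Q[state]))
    return max(Q[state], key=Q[state].get)
```

At each iteration the selector picks a (transfer function, binarization rule) action for the current state, observes a reward derived from the search progress, and updates its Q-table accordingly.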

Table 1. S-shaped and V-shaped transfer functions.

Table 2. X-shaped and Z-shaped transfer functions.

Table 3. X-shaped and Z-shaped transfer functions.

Table 5. Set of actions analyzed.

Table 8. RPD obtained for the 12 sets using SCA.

Table 16. Ranking of the best sets considering RPD for GWO.

Table 17. Ranking of the best sets considering RPD for SCA.

Table 18. Ranking of the best sets considering RPD for WOA.

Table 25. Ranking of the best set based on the statistical test.

Table 26. Best sets of actions.