Continuous Metaheuristics for Binary Optimization Problems: An Updated Systematic Literature Review

For years, extensive research has been devoted to the binarization of continuous metaheuristics for solving binary-domain combinatorial problems. This paper continues a previous review and seeks to draw a comprehensive picture of the various ways to binarize this type of metaheuristic. The study follows a standard systematic review procedure and analyzes 512 publications from 2017 to January 2022 (5 years). The work provides a theoretical foundation for novice researchers tackling combinatorial optimization using metaheuristic algorithms and for expert researchers analyzing the impact of the binarization mechanism on the performance of metaheuristic algorithms. Structuring this information allows for improving the results of metaheuristics and broadening the spectrum of binary problems to be solved. We conclude from this study that there is no single general technique capable of efficient binarization; instead, there are multiple forms with different performances.


Introduction
Information technologies have experienced exponential growth, generating multiple optimization problems. These problems can be classified into two main subfields: stochastic and deterministic optimization problems. The latter includes (a) unconstrained and constrained continuous optimization and (b) discrete optimization, which can be divided into integer programming and combinatorial optimization. The focus of this manuscript is combinatorial optimization, which deals with problems where the set of feasible solutions is discrete or can be reduced to a discrete set.
Techniques for solving combinatorial optimization problems can be classified into exact and approximate methods. The former include branch and cut [15,16] and branch and bound [17][18][19]. The latter include heuristic and metaheuristic approaches. Metaheuristics have gained enormous popularity over exact methods due to their simplicity and the robustness of their results when the combinatorial problem is of high dimension.
These metaheuristics have one feature in common: they were designed to solve problems with a continuous domain of variables. Therefore, the need arises to perform a binarization process. With binarization, popular continuous metaheuristics that perform very well can be applied to binary combinatorial problems.
In this context, and given the importance that metaheuristics have taken on in solving complex combinatorial problems, this paper presents a systematic review of articles published between 2017 and 2022 to identify and examine in depth the techniques or mechanisms used to binarize metaheuristic algorithms that operate in continuous search spaces. This manuscript is an update of the literature review presented in [31]. It will provide a theoretical foundation for young researchers tackling combinatorial optimization using metaheuristic algorithms and for experienced researchers looking at the impact of the binarization mechanism on the performance of metaheuristic algorithms. The systematic review is guided by the procedure described in 2004 by Kitchenham [32].
The systematic literature review results were analyzed from the following points of view:
1. Journals publishing papers on binarized continuous metaheuristics.
2. Scientific production by country.
3. Continuous metaheuristics used to solve binary combinatorial problems.
4. Classification and definition of the different binarization techniques.
The remaining sections are structured as follows. Section 2 explains the procedure and methodology used to perform this systematic review. Section 3 presents the results extracted and analyzed. Section 4 answers our research questions (presented in the methodology), and Section 5 presents our conclusions.

Methodology
This section is structured as follows. First, in Section 2.1, we identify the research questions that this work intends to clarify; the research questions guide the study and link its ideas to its main objective. Next, in Section 2.2, the sources or search engines used to extract results are defined, and the search terms are determined based on the abstracts of some primary publications that were already available. Once the results are obtained, a preliminary analysis is made to show trends in them. Then, in Section 2.3, inclusion and exclusion criteria are defined for the results, focusing on the title, keywords, and abstract of each article. Finally, each paper included in the review is assigned a score according to the scoring criteria in Section 2.4. The above process is represented in Figure 1. Section 2.5 describes the data collected and used for our study and for the bigrams and maps presented in later sections.

Research Questions
The overarching question from which the motivation of this study derives is: how can we draw a complete picture of the research on binarization techniques for solving combinatorial problems in the binary domain? To answer it, we formulated two research questions (RQs) to consider in the collected literature:
RQ1 What continuous metaheuristics have been used from 2017 to date to solve binary combinatorial problems?
RQ2 What techniques or forms of binarization have been used in metaheuristics from 2017 to date to solve binary combinatorial problems?
To address RQ1, we identified the number of articles published per year, the journal or conference that published them, and whether they used continuous metaheuristics to solve problems belonging to a binary domain. Regarding RQ2, we considered the scope of each study, i.e., which techniques were used to transform a metaheuristic designed for continuous domains so that it works in a binary domain.

Search Process
The search process consisted of queries in well-known search engines such as Web of Science (Clarivate) and Scopus, covering publications since 2017. Only accepted scientific articles in digital format and written in English were considered for the review.
To answer the questions posed and find the relevant articles, the following search terms (STs) were used in the engines mentioned above:
ST1 ("binarization" OR "binary")
ST2 ("optimization" OR "optimizer" OR "combinatorial" AND "problem*" OR "combinatorial" AND "optimization")
ST3 ("metaheuristic*" OR "continuous" AND "metaheuristic*")
Using these search terms, we built the respective queries in each source, restricted to the period from Jan 1st, 2017 to Jan 30th, 2022.
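As an illustration of how the three search terms combine into one query per source, the following minimal sketch joins them with AND under the field tags that appear in the query strings reported with this review (`TITLE-ABS-KEY` for Scopus, `ALL=` for Web of Science). The function names are hypothetical, and the date restriction is assumed to be applied separately in each search interface.

```python
# Search terms ST1-ST3, written without quotes as in the reported queries.
SEARCH_TERMS = [
    "binarization OR binary",
    "optimization OR optimizer OR combinatorial AND problem* "
    "OR combinatorial AND optimization",
    "metaheuristic* OR continuous AND metaheuristic*",
]


def scopus_query(terms):
    """Join the terms with AND, each restricted to title, abstract, and keywords."""
    clauses = [f"TITLE-ABS-KEY ( {term} )" for term in terms]
    return "( " + " AND ".join(clauses) + " )"


def web_of_science_query(terms):
    """Join the terms with AND, each searched in all fields."""
    return " AND ".join(f"ALL=({term})" for term in terms)


if __name__ == "__main__":
    print(scopus_query(SEARCH_TERMS))
    print(web_of_science_query(SEARCH_TERMS))
```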

Inclusion and Exclusion Criteria
For the results obtained between Jan 1st, 2017, and Jan 30th, 2022, a first selection is made, which consists of removing duplicates from the combined results of both queries. Articles that apply continuous metaheuristics and solve binary combinatorial problems are then selected according to the following inclusion criteria.

• Use of continuous metaheuristics.
• Solving binary domain combinatorial problems.
• Description of advances in the way of binarizing continuous metaheuristics.
Articles were excluded for the following reasons.
• Articles not related to metaheuristics.
• Articles not written in English.

Quality Assessment
Once the articles have been selected and stored, a second selection is made, which consists of reading the title, abstract, and keywords and assigning a score that reflects how useful the article is for this study. This score is calculated by adding the scores obtained for each of the quality assessment (QA) questions derived from the inclusion and exclusion criteria.
QA1 Are the authors using continuous metaheuristics?
QA2 Are the authors solving a binary domain combinatorial problem?
QA3 Do the authors binarize a continuous metaheuristic?
QA4 Do the authors describe advances in the binarization of continuous metaheuristics?
The scoring procedure corresponds to a binary evaluation, where Y = 1 and N = 0. The criteria are scored as follows:
QA1: Yes, the authors use a continuous metaheuristic; No, the authors do not use continuous metaheuristics.
QA2: Yes, the authors solve at least one problem with a binary domain; No, the authors solve discrete or continuous problems.
QA3: Yes, the authors use techniques to binarize a continuous metaheuristic, that is, they adapt the values of the real domain to binary to work with the binary problem; No, the authors do not apply techniques to binarize the metaheuristic, or they use one from the literature and do not explain the procedure.
QA4: Yes, the authors propose new techniques to perform binarization and transform continuous values to binary in a continuous metaheuristic; No, the authors propose nothing new and keep to more traditional binarization techniques.
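The scoring scheme can be sketched in a few lines. The class and function names below are hypothetical; the default threshold of 3 points is the cut-off reported later in the quality evaluation of the articles.

```python
from dataclasses import dataclass


@dataclass
class QualityAssessment:
    qa1: bool  # uses a continuous metaheuristic
    qa2: bool  # solves at least one binary-domain problem
    qa3: bool  # binarizes a continuous metaheuristic and explains how
    qa4: bool  # proposes a new binarization technique

    def score(self) -> int:
        """Binary evaluation: each Yes adds 1, each No adds 0."""
        return int(self.qa1) + int(self.qa2) + int(self.qa3) + int(self.qa4)


def is_selected(assessment: QualityAssessment, threshold: int = 3) -> bool:
    """An article is kept when its total score reaches the threshold."""
    return assessment.score() >= threshold
```

For example, an article that binarizes a continuous metaheuristic with a technique taken from the literature (QA1 to QA3 answered Yes, QA4 answered No) scores 3 and is kept.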

Data Collection
The data extracted from each article studied were:

Results
In this section, a set of graphs presents the most relevant information about the selected articles, such as the number of articles, the journals of origin and their respective countries, the general concepts addressed, the quantification of the QA factors, and the popularity factors.

General and Bigrams Analysis
This section conducts a general analysis of the metadata obtained from the selected articles. First, a visualization identifies the journals or conferences that publish the most on the binarization of metaheuristics, and a second visualization indicates the countries that have contributed the most to this area. This last graph considers the country of all the authors. A bigram is a sequence of two contiguous elements in a chain of tokens, which correspond to words in this instance. The purpose is to conduct a statistical analysis of the frequency distribution of these bigrams in the abstracts under consideration. The first display is the Treemap, which seeks to determine the frequency of the most prevalent bigrams in each theme. The thematic map is then used; this graph combines the concepts of density (internal associations) and centrality (external associations) [33,34]. Finally, the last visualizations correspond to conceptual maps and dendrograms. Conceptual structure visualization creates a conceptual structure map; specifically, Correspondence Analysis (CA) is performed on terms extracted from the abstracts of the documents. In addition, the relationship between the terms is analyzed hierarchically and displayed through a dendrogram.
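To make the notion of a bigram concrete, the following minimal sketch counts word bigrams over a list of abstracts. The tokenization rule (lowercase alphanumeric runs) and the function names are assumptions made for illustration; the tooling actually used to produce the figures is not described here.

```python
import re
from collections import Counter


def bigrams(text):
    """Return the contiguous word pairs of a text, lowercased."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return list(zip(tokens, tokens[1:]))


def bigram_frequencies(abstracts):
    """Count how often each bigram occurs across all abstracts."""
    counts = Counter()
    for abstract in abstracts:
        counts.update(bigrams(abstract))
    return counts


if __name__ == "__main__":
    sample = [
        "Binary particle swarm optimization",
        "A transfer function for particle swarm optimization",
    ]
    for pair, count in bigram_frequencies(sample).most_common(3):
        print(pair, count)
```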
The main journals and conferences are shown in Figure 2. As can be seen from the results, "Advances in Intelligent Systems and Computing" together with "IEEE Access" are the ones that publish the most in the area. Subsequently, reading notes and communication in computer and information science appear. All are generated by conferences except in the case of IEEE Access. Figure 3 shows the main countries publishing in the binarization area. The main country is Chile with 299 appearances, followed by China with 183 appearances and India with 143.

When analyzing the Treemap in Figure 4, three main concepts related to the techniques used stand out. In the first place, transfer functions appear as a binarization method, followed by machine learning techniques and, a little further down, techniques based on the concept of percentile. From the point of view of the problems used to verify the algorithms, the main problem is feature selection, followed by set-covering and unit commitment problems. When reviewing the metaheuristics, we see that they are all of the swarm intelligence type, where the most used is particle swarm, followed by bat algorithm, bee colony, and grey wolf.

When analyzing the thematic map in Figure 5, it is observed that a base theme, which has high centrality and low density, is related to the use of transfer functions. These transfer functions are associated with different swarm-like techniques. This can be seen in the lower right quadrant. On the other hand, the most important topics, which correspond to the high-centrality and high-density quadrant (upper right quadrant), include machine learning techniques related to binarization and problems such as set covering and knapsack, as well as percentile techniques related to local search operators and swarm optimization techniques.
Finally, the conceptual bigram analysis, shown in Figure 6, returns two groups. In red, there is a cluster that mainly relates binarization techniques and swarm-type metaheuristics. Here, the transfer function and machine learning concepts stand out within the binarization techniques. On the other hand, the cluster in blue is mainly related to binary problems, among which set-covering and knapsack stand out.

Search Results
Table 1 summarizes the results obtained from the search process shown in Figure 1. The first search with the queries returned a total of 733 articles, from which the existing duplicates had to be removed, leaving a total of 512 qualifying articles. For our second filter, we applied the quality criteria defined in Section 2.4 to the title, keywords, and abstract of each article, leaving a total of 283 potentially relevant articles for further analysis and to avoid false positives. From this list, we removed papers in a language other than English, those for which it was impossible to obtain access to the article, and the remaining false positives. Finally, we obtained 195 unique studies.
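The arithmetic behind this selection funnel can be checked with a short sketch. The stage labels and function names are hypothetical; the counts are the ones reported in the paragraph above.

```python
# Number of articles remaining after each stage of the search process.
STAGES = [
    ("retrieved by the queries", 733),
    ("after duplicate removal", 512),
    ("after title, keyword, and abstract screening", 283),
    ("final unique studies", 195),
]


def removed_per_stage(stages):
    """Number of articles dropped between each pair of consecutive stages."""
    return [earlier[1] - later[1] for earlier, later in zip(stages, stages[1:])]


def retention(stages, base_index=1):
    """Share of the articles at `base_index` that survive to the final stage."""
    return stages[-1][1] / stages[base_index][1]


if __name__ == "__main__":
    print(removed_per_stage(STAGES))
    print(f"{retention(STAGES):.1%} of the deduplicated articles were retained")
```

Under these counts, 221 duplicates were removed, and 229 + 88 = 317 of the 512 deduplicated articles were discarded in the two later filters.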

Quality Evaluation of Articles
As mentioned above, the quality of the selected research was evaluated using the criteria defined in Section 2.4. The score of each selected article is presented in Table 2. The last column shows the score the researchers agreed on; only articles that achieved a score equal to or greater than 3 points were retained. All disagreements were discussed and resolved. Consequently, a total of 317 articles scored less than three or were discarded for the reasons specified in Section 2.3.

Quality Factors
The relationship between the quality score of a selected article and its date of publication was investigated. The mean quality scores obtained from the studies in each year are shown in Table 3. According to this table, there is an increasing trend in the use of continuous metaheuristics to solve problems in the binary domain. Although there are only 16 articles for 2022, only the month of January was considered for that year. Based on the trend, we expect more articles by the end of 2022 than in 2021.

Popularity Factors
To facilitate reading, Table 4 lists the ID of each metaheuristic used in the respective works. The subsequent tables are organized as follows: the first column indicates the item analyzed (metaheuristic, problem, binarization technique, or transfer function); the next five columns correspond to the years considered in the study, from 2017 to 2021 (January 2022 is omitted to avoid confusion with the values); and the last column gives the total number of articles.
As with quality, we investigated the relationship between the publication date and the popularity of the problems, metaheuristics, binarization techniques, and transfer functions used in the selected articles. According to Table 5, the most frequently solved problems are Feature Selection, the Set Covering Problem, and the Knapsack Problem. Table 6 shows that, over the years, there is a tendency to use metaheuristics less conventional than Particle Swarm Optimization or Cuckoo Search. Table 7 indicates that the most traditional approach is a simple transformation, which usually consists of two steps: going from continuous to discrete and then, by means of an equation, to binary. Finally, the most popular transfer functions in the literature during the last five years remain the classical S-type and V-type functions; however, some authors have ventured to use less conventional ones, as can be seen in Table 8.

Table 6 (fragment). Number of articles and references per year.
...: 2020: 11 [14,41,134,142,143,158,179,209,211,234,235]; 2021: 9 [39,40,69,130,138,191,214,232,236]; Total: 41
BA: 2017: 1 [5]; 2018: 1 [186]; 2019: 5 [63,102,147,212,228]; 2020: 6 [47,99,103,164,172,215]; 2021: 3 [40,219,230]; Total: 16
CS: 2017: 1 [55]; 2018: 3 [73,105,112]; 2019: 3 [178,183,199]; 2020: 5 [10,71,158,179,196]; 2021: 3 [48,64,176

Discussion
In this section, we will discuss the answers to our research questions.

What Continuous Metaheuristics Have Been Used from 2017 to Date to Solve Binary Combinatorial Problems?
In general, as mentioned in the previous section, 195 of the 512 articles retrieved from the searched sources were found to be relevant; that is, roughly 38% of the articles use continuous metaheuristics to solve binary combinatorial problems. This is a good indication that continuous metaheuristics give good results when solving binary problems of different kinds. Of these articles, some opted for more traditional metaheuristics (PSO, BA, and CS), while the rest used less conventional metaheuristics (see Table 4 or Table 6).

What Techniques or Forms of Binarization Have Been Used in Metaheuristics from 2017 to Date to Solve Binary Combinatorial Problems?
To answer this question, the binarization techniques found were classified according to how they perform the binarization process. Five categories were defined: "Simple Transformation," "Encoding Transformation," "Machine Learning Structure," "Percentile Concept," and "Crossover." Table 7 shows the number of times each category was used in each year. Next, we define and exemplify each of the five categories.

Simple Transformation
Our Simple Transformation category corresponds to a generally sequential mechanism that works with the continuous operators without modifying them. We say it is sequential because the first step introduces operators that transform the solution from R^n into an intermediate space {InterSpace}. For example, in Great Value Priority, the intermediate space is Z^n; in the case of a transfer function, it is [0, 1]^n; and in Angle Modulation, it is a space of functions {InterSpace Functions}. The second step transforms from the intermediate space (Z^n, [0, 1]^n, {InterSpace Functions}) into the binary space {0, 1}^n. See Figure 7 for a better overview of the general scheme. Within this category, we find several techniques that follow these principles, which we detail below.

• Transfer Function and Binarization
The first step of this technique uses transfer functions, the most common normalization method, introduced in [239]. The advantage of a transfer function is that it is a very cheap operator that provides a range of probabilities and attempts to model the transition of the particle position. These functions are responsible for the first step of the binarization method, mapping the R^n solutions into [0, 1]^n.
The second step uses a binarization rule, which transforms the particle into a binary solution. In this review, we found that a large percentage of the articles used the standard rule (see Equation (1)). However, there are other rules; for example, four more are mentioned in [31]: static probability, elitist, elitist roulette [240,241], and complement, a technique also used in [46,122]. These four rules can be seen in Table 9.

Table 9. Binarization rules (columns: Type, Binarization): static probability, elitist, elitist roulette, and complement.
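The two-step scheme described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the formulation of any particular reviewed article: the sigmoid and |tanh| functions are used as representative S-shaped and V-shaped transfer functions, the standard rule is assumed to set a bit to 1 when a uniform random number falls below the transfer value, and the complement rule is assumed to flip the current bit in that case. All function names are hypothetical.

```python
import math
import random


def s_shaped(x):
    """Sigmoid transfer function mapping a real value into [0, 1]."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)  # numerically stable branch for negative x
    return e / (1.0 + e)


def v_shaped(x):
    """V-shaped transfer function |tanh(x)|, also with values in [0, 1]."""
    return abs(math.tanh(x))


def standard_rule(prob, rng):
    """Assumed standard rule: the bit is 1 with probability `prob`."""
    return 1 if rng.random() < prob else 0


def complement_rule(prob, current_bit, rng):
    """Assumed complement rule: flip the current bit with probability `prob`."""
    return 1 - current_bit if rng.random() < prob else current_bit


def binarize(position, current_bits, transfer=s_shaped, rule="standard", rng=None):
    """Step 1: transfer function (R^n -> [0, 1]^n). Step 2: rule (-> {0, 1}^n)."""
    rng = rng or random.Random()
    bits = []
    for x, bit in zip(position, current_bits):
        prob = transfer(x)
        if rule == "standard":
            bits.append(standard_rule(prob, rng))
        elif rule == "complement":
            bits.append(complement_rule(prob, bit, rng))
        else:
            raise ValueError(f"unknown rule: {rule}")
    return bits
```

Because the transfer function and the rule are independent arguments, any pair of them defines one binarization scheme, which is why the number of combinations to test grows quickly.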

• Angle Modulation
This approach originates in the telecommunications industry, where it is used for signal carrier modulation [242]; it uses a trigonometric function with four parameters that control the frequency and offset of the function itself.
Let X = {x_1, x_2, ..., x_n} be a solution of an n-dimensional binary problem. We start with a 4-dimensional search space, where each dimension represents a coefficient of Equation (2). In this search space, the first step maps each solution, represented as (a_i, b_i, c_i, d_i), to a trigonometric function g_i that lies in a function space.
For the second step, for each of the x_j elements present in X, we apply a binarization rule (see Equation (3)) and obtain an n-dimensional binary solution.
In this way, for each initial solution (a_i, b_i, c_i, d_i), we obtain an n-dimensional binary solution (b_i,1, b_i,2, ..., b_i,n). In this review, we found some papers that applied this technique, such as [228][229][230][231].
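The following minimal sketch illustrates the idea. It assumes the generating function commonly used in the angle modulation literature, g(x) = sin(2π(x − a) · b · cos(2π(x − a) · c)) + d, and a rule that sets a bit to 1 when g is positive; Equations (2) and (3) of this review may be written differently. Sampling g at the points 0, 1, ..., n − 1 is also an assumption, and the function names are hypothetical.

```python
import math


def generating_function(x, a, b, c, d):
    """Assumed angle modulation function with coefficients (a, b, c, d)."""
    shifted = 2 * math.pi * (x - a)
    return math.sin(shifted * b * math.cos(shifted * c)) + d


def angle_modulation_bits(coefficients, n, sample_points=None):
    """Map one 4-dimensional continuous solution to an n-dimensional bit vector."""
    a, b, c, d = coefficients
    points = sample_points if sample_points is not None else range(n)
    return [1 if generating_function(x, a, b, c, d) > 0 else 0 for x in points]


if __name__ == "__main__":
    print(angle_modulation_bits((0.1, 0.7, 0.3, 0.0), 10))
```

The design benefit is that the metaheuristic only searches four continuous coefficients, regardless of the number n of binary variables.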

• Binarization Conversion
In some works, the authors did not apply any of the traditional approaches mentioned above. Instead, they obtain the value of the particle and then apply binarization randomly. In [57,150], the population is updated with the equation x^0_tk = round(rand(0, 1)), and in [77,139], Equation (4) is used.

Encoding Transformation
Unlike the previous category, the methods classified as encoding are characterized by redefining the metaheuristic operators; we can identify two main groups. The first comprises methods that modify the operations of the search space through modified algebraic or logical operations. The second group can be referred to as promising regions: here the operators are restructured based on the promising regions selected in the search space. The Quantum Binary Approach is an example of this second group found in the analyzed articles.

• Algebraic or Logic Operations
This method transforms real operators into binary or logical operators. The transformation is performed with Boolean operations, which allows the operators to act on binary solutions. This approach was proposed as a binarization technique in PSO [243].
The Boolean operators, also known as logic gates, are XOR, AND, and OR. Equations (5) and (6) present the Boolean equations for velocity and position,
where V_i(t) and X_i represent the velocity at time t and the particle position, and c_1 and c_2 are random vectors. P_best,i is the best position found by the particle, and P_global is the position of the best global solution. This method has been applied to different binary optimization problems using GSK [107], PSO [40,86], BSO [84], TGA [173], BA [40,63], GWO [63], GSA [63], ABC [159], DA [40], SMO [114], BHO [188] and CS [196].
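As an illustration, the sketch below implements one update of a Boolean PSO in the form commonly described in the literature, where AND plays the role of multiplication, OR the role of addition, and XOR the role of subtraction. This form, the random inertia vector `w`, and the function name are assumptions; Equations (5) and (6) of this review may differ in detail.

```python
import random


def boolean_pso_step(x, v, p_best, p_global, w, c1, c2):
    """One Boolean PSO update on bit lists of equal length.

    Assumed velocity: (w AND v) OR (c1 AND (p_best XOR x)) OR (c2 AND (p_global XOR x)).
    Assumed position: x XOR new velocity.
    """
    new_v = [
        (wi & vi) | (a & (pb ^ xi)) | (b & (pg ^ xi))
        for xi, vi, pb, pg, wi, a, b in zip(x, v, p_best, p_global, w, c1, c2)
    ]
    new_x = [xi ^ vi for xi, vi in zip(x, new_v)]
    return new_v, new_x


def random_bits(n, rng):
    """Random bit vector, used here for w, c1, and c2."""
    return [rng.randint(0, 1) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(42)
    n = 6
    x, v = random_bits(n, rng), [0] * n
    p_best, p_global = random_bits(n, rng), random_bits(n, rng)
    w, c1, c2 = (random_bits(n, rng) for _ in range(3))
    print(boolean_pso_step(x, v, p_best, p_global, w, c1, c2))
```

Since every operand is a bit, the particle never leaves {0, 1}^n and no separate binarization step is needed.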

• Quantum Binary Approach
This approach is inspired by the uncertainty principle, according to which position and velocity cannot be determined simultaneously. Therefore, the PSO algorithm works differently for individual particles, and the operators need to be rewritten. In the quantum approach, each feasible solution has a position X = {x_1, x_2, ..., x_n} and a quantum vector Q = {Q_1, Q_2, ..., Q_n}, where Q_j represents the probability that x_j takes the value 1. For each dimension j, a random number in [0, 1] is generated and compared to Q_j (see Equation (7)).
After obtaining the binary values of the solutions, the new P_best and P_global are calculated using the objective function. Finally, the transition probability is updated using Equation (8),
where Q_self and Q_global are calculated with Equations (9) and (10), respectively.
The quantum method has been applied to various problems, such as the Multidimensional Knapsack Problem with PSO [209], the Next Release Problem with AAA [72], and Feature Selection with OSA [80] and GSA [42]. In [208], this method is applied to binarize PSO, while in [204] it is applied to ABC and AFSA in addition to PSO.
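The sampling and update steps can be sketched as follows. The sampling rule follows the description above (x_j = 1 when the random number is below Q_j). The update is a commonly described form in which Q_self and Q_global blend the best positions with their complements and the new Q is a convex combination of the three vectors; this form, the coefficient names (alpha, beta, c1, c2, c3), and their default values are assumptions, not a reproduction of Equations (8) to (10).

```python
import random


def sample_binary(q, rng):
    """Draw x_j = 1 with probability Q_j for every dimension j."""
    return [1 if rng.random() < qj else 0 for qj in q]


def blend(best, alpha, beta):
    """Assumed form: alpha * best + beta * (1 - best), with alpha + beta = 1."""
    return [alpha * b + beta * (1 - b) for b in best]


def update_quantum_vector(q, p_best, p_global,
                          alpha=0.7, beta=0.3, c1=0.2, c2=0.4, c3=0.4):
    """Assumed update: Q <- c1 * Q + c2 * Q_self + c3 * Q_global, c1 + c2 + c3 = 1."""
    q_self = blend(p_best, alpha, beta)
    q_global = blend(p_global, alpha, beta)
    return [c1 * qj + c2 * qs + c3 * qg
            for qj, qs, qg in zip(q, q_self, q_global)]


if __name__ == "__main__":
    rng = random.Random(0)
    q = [0.5] * 5
    x = sample_binary(q, rng)
    q = update_quantum_vector(q, p_best=x, p_global=x)
    print(x, q)
```

With these assumed coefficients, each Q_j stays in [0, 1] and drifts toward 1 where the best solutions hold a 1 and toward 0 elsewhere.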

Percentile Concept
This approach uses percentile operators and works directly with the displacement ∆(x) produced by each metaheuristic (see Equation (11)); i.e., the binary percentile operator considers the displacements generated by the metaheuristic in each dimension for each of the particles.
The parameter ∆_i(x) is the magnitude of the displacement ∆(x) at the i-th position for the particle x; these displacements are then grouped using their magnitude ∆_i(x) and an assigned percentile list. The binary percentile operator takes as input the percentile list, for example Pr = {20, 40, 60, 80, 100}, and a list of values. At a given iteration, the list of values contains the magnitudes of the displacements ∆_i(x) of the particles in each dimension. As a first step, the percentile operator uses the list of values to compute the percentile values given in Pr. Subsequently, each value in the list is assigned to the smallest percentile group to which it belongs. Finally, the resulting list of percentile groups is obtained, and each group is assigned a transition probability by Equation (12).
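The grouping step can be sketched as follows. The percentile definition (linear interpolation between order statistics) and the per-group transition probabilities passed to `transition_probabilities` are assumptions made for illustration; the values prescribed by Equation (12) are not reproduced here, and the function names are hypothetical.

```python
import bisect


def percentile(sorted_values, p):
    """Linear-interpolation percentile (p in [0, 100]) of a sorted, non-empty list."""
    rank = (len(sorted_values) - 1) * p / 100.0
    low = int(rank)
    high = min(low + 1, len(sorted_values) - 1)
    fraction = rank - low
    return sorted_values[low] + (sorted_values[high] - sorted_values[low]) * fraction


def percentile_groups(magnitudes, percentile_list):
    """Assign each magnitude the index of the smallest percentile group containing it."""
    ordered = sorted(magnitudes)
    thresholds = [percentile(ordered, p) for p in percentile_list]
    last = len(thresholds) - 1
    return [min(bisect.bisect_left(thresholds, m), last) for m in magnitudes]


def transition_probabilities(groups, group_probability):
    """Look up the (hypothetical) transition probability of each group."""
    return [group_probability[g] for g in groups]


if __name__ == "__main__":
    magnitudes = [0.05, 1.3, 0.4, 2.7, 0.9, 0.1]
    groups = percentile_groups(magnitudes, [20, 40, 60, 80, 100])
    # Hypothetical probabilities: larger displacements flip bits more often.
    print(groups, transition_probabilities(groups, [0.1, 0.1, 0.5, 0.5, 0.5]))
```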

Machine Learning Structure
This approach exploits machine learning techniques to binarize. Machine learning techniques have gained great prominence in recent years thanks to technological advances that provide large computational capacity. They are classified into three groups: Supervised Learning, Unsupervised Learning, and Reinforcement Learning [244][245][246]. In the present analysis, algorithms belonging to the Unsupervised Learning and Reinforcement Learning groups were detected.

• Unsupervised Learning
In unsupervised learning, the output data are unknown from the beginning. The learning lies in recognizing patterns in the input data. Clustering is the form of this type of learning found in the reviewed articles.
The algorithms used to binarize by clustering are K-Means [64,133,137,141] and DBSCAN [66,179,183,211]. They were applied to binarize CS, PSO, BA, BHO, GOA, FA, WOA, CSA, and GSO to solve the Buttressed Walls Problem, the Multidimensional Knapsack Problem, the Set Covering Problem, the Crew Scheduling Problem, and the Set Union Knapsack Problem.

• Reinforcement Learning
Reinforcement Learning consists of finding the best action to be taken by an agent in a given state.The performance of the action is judged through a reward.Therefore, Reinforcement Learning algorithms seek to maximize this reward accumulated over time.
In Section 4.2.1, we described the two-step technique, which combines a transfer function and a binarization rule to binarize continuous solutions. Finding the best combination of functions requires testing them all, which is very time-consuming. To address this, reinforcement learning techniques are incorporated.
In [85,104,170,195], Q-Learning is used, and in [85,227,247], SARSA is used as an intelligent selector of binarization schemes. These two techniques were applied to binarize GWO, HHO, SCA, and WOA to solve the Set Covering Problem.
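A selector of this kind can be sketched with tabular Q-Learning, where each action is one binarization scheme (a transfer function paired with a binarization rule). The state names, scheme names, reward, and hyperparameter values used below are hypothetical; the reviewed articles define their own.

```python
import random


class SchemeSelector:
    """Tabular Q-Learning over binarization schemes with epsilon-greedy choice."""

    def __init__(self, states, schemes, alpha=0.1, gamma=0.9, epsilon=0.1, rng=None):
        self.q = {s: {a: 0.0 for a in schemes} for s in states}
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = rng or random.Random()

    def choose(self, state):
        """Explore with probability epsilon, otherwise pick the best-valued scheme."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.q[state]))
        return max(self.q[state], key=self.q[state].get)

    def update(self, state, scheme, reward, next_state):
        """Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))."""
        best_next = max(self.q[next_state].values())
        target = reward + self.gamma * best_next
        self.q[state][scheme] += self.alpha * (target - self.q[state][scheme])


if __name__ == "__main__":
    selector = SchemeSelector(
        states=["exploration", "exploitation"],
        schemes=["S1+standard", "V1+complement"],
        rng=random.Random(0),
    )
    scheme = selector.choose("exploration")
    # Hypothetical reward: 1.0 when the best fitness improved in this iteration.
    selector.update("exploration", scheme, reward=1.0, next_state="exploitation")
    print(scheme, selector.q)
```

SARSA differs only in the update target: it uses the value of the scheme actually chosen in the next state instead of the maximum over all schemes.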

Crossover
Among population-based metaheuristics, there is a group called Evolutionary Algorithms [248], which use evolutionary theory to perturb the solutions. The operations used to perturb the solutions are crossover and mutation, and the most common algorithms within this group are the Genetic Algorithm (GA) and Differential Evolution (DE). Different proposals, such as [7,88,129,219], use a continuous metaheuristic hybridized with an evolutionary algorithm to perturb the solutions.
The optimization process consists of initializing the solutions of the metaheuristic in binary form and using its movement patterns in conjunction with the evolutionary operators to obtain perturbed binary solutions.
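As an illustration of how evolutionary operators keep solutions binary, the sketch below shows uniform crossover and bit-flip mutation on bit lists. These two operators are chosen here as common examples; the specific operators and the way they are coupled with the movement of the metaheuristic vary among the cited proposals, and the function names are hypothetical.

```python
import random


def uniform_crossover(parent_a, parent_b, rng):
    """Take each bit from either parent with equal probability."""
    return [a if rng.random() < 0.5 else b for a, b in zip(parent_a, parent_b)]


def bit_flip_mutation(bits, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in bits]


if __name__ == "__main__":
    rng = random.Random(7)
    current = [1, 0, 1, 1, 0, 0]
    best = [1, 1, 0, 1, 0, 1]
    child = bit_flip_mutation(uniform_crossover(current, best, rng), 0.1, rng)
    print(child)
```

Here the best solution found so far acts as the second parent, so the perturbed solution is pulled toward it while remaining in {0, 1}^n.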

Closing of Discussions
Summarizing the answers given in Sections 4.1 and 4.2, the most used metaheuristics are swarm-based ones, which have recently gained remarkable popularity both for their good performance and for their ease of implementation, in continuous domains as well as in discrete domains through the aforementioned techniques. In addition, the finding of the predecessor of this work is confirmed: the most used binarization techniques are those of the Simple Transformation category, in other words, the sequential ones. Beyond ratifying what was previously reported, new binarization alternatives are presented, such as the Percentile Concept and the Machine Learning Structure, together with novel transfer functions and the hybridizations referred to in this work as Crossover. These enrich the options available for operating continuous metaheuristics in binary space.

Conclusions
This study investigated and compiled important binarization methods for continuous metaheuristics. We propose five main categories for the methods collected, which are shown in Table 7. The first group, Simple Transformation, comprises sequential binarization mechanisms that use an intermediate space from which the binarization is mapped. The second group, Encoding Transformation, adapts the metaheuristic operators to a binary problem; by analyzing this adaptation, we found methods that transform algebraic operations and methods that use a probability to make the transition in the search space. The third group, Percentile Concept, is based on statistics and applies percentile groups to the values obtained in the solutions. The fourth group, Machine Learning Structure, incorporates machine learning techniques for binarization, as its name indicates. In the review, two different subgroups were detected: Unsupervised Learning and Reinforcement Learning. On the Unsupervised Learning side, the authors used clustering techniques such as K-Means and DBSCAN to binarize. On the Reinforcement Learning side, the authors used Q-Learning and SARSA as intelligent selectors of binarization schemes coming from the two-step technique. The fifth group, Crossover, hybridizes continuous metaheuristics with evolutionary algorithms to use operators such as crossover to obtain perturbed binary solutions.
In addition, we investigated which specific metaheuristics use these binarization techniques; the summary is shown in Table 6.Based on the study performed by all researchers, we can conclude that the most used method to binarize continuous metaheuristics is the transfer function, belonging to the Simple Transformation; a summary of the most used
QRY Scopus: ( TITLE-ABS-KEY ( binarization OR binary ) AND TITLE-ABS-KEY ( optimization OR optimizer OR combinatorial AND problem* OR combinatorial AND optimization ) AND TITLE-ABS-KEY ( metaheuristic* OR continuous AND metaheuristic* ) )
QRY Web of Science: ALL=( binarization OR binary) AND ALL=(optimization OR optimizer OR combinatorial AND problem* OR combinatorial AND optimization) AND ALL=(metaheuristic* OR continuous AND metaheuristic*)

Table 2. Quality evaluation of articles.

Table 3. Average quality scores for articles by publication date.

Table 4. Acronyms of the investigated metaheuristics.

Table 5. Problem per year.

Table 7. Technique of Binarization per year.

Table 8. Transfer Function per year.