Mathematics
  • Article
  • Open Access

24 January 2021

An Analysis of a KNN Perturbation Operator: An Application to the Binarization of Continuous Metaheuristics

1 Escuela de Ingeniería en Construcción, Pontificia Universidad Católica de Valparaíso, Valparaíso 2362807, Chile
2 Escuela de Negocios Internacionales, Universidad de Valparaíso, Valparaíso 2361864, Chile
3 Institute of Concrete Science and Technology (ICITECH), Universitat Politècnica de València, 46022 València, Spain
* Author to whom correspondence should be addressed.
This article belongs to the Special Issue Mathematics and Engineering II

Abstract

Optimization methods and, in particular, metaheuristics must be constantly improved to reduce execution times, improve the results, and thus be able to address larger instances. Addressing combinatorial optimization problems is critical in the areas of operational research and engineering. In this work, a perturbation operator based on the k-nearest neighbors technique is proposed and studied with the aim of improving the diversification and intensification properties of metaheuristic algorithms in their binary version. Random operators are designed to study the contribution of the perturbation operator. To verify the proposal, large instances of the well-known set covering problem are studied. Box plots, convergence charts, and the Wilcoxon statistical test are used to determine the operator's contribution. Furthermore, a comparison is made with metaheuristic techniques that use general binarization mechanisms, such as transfer functions or db-scan. The results obtained indicate that the KNN perturbation operator significantly improves the results.

1. Introduction

In many areas of industry, it is necessary to make increasingly complex decisions given the scarcity and cost of resources. The number of elements considered today means that these decisions involve a large number of assessments, which constitute combinatorial optimization problems (COPs), to find a result that is, on the one hand, feasible and, on the other hand, satisfactory. This situation occurs in several areas of industry, such as machine learning [1], transport [2], biology [3], logistics [4], civil engineering [5], and sustainability [6], among others. Among optimization problems are the so-called NP-hard problems, which are difficult to solve. Various techniques can be applied to solve them, ranging from exact techniques to approximate ones. Within the latter, we find metaheuristics, which allow us to tackle large problems and find good, though not necessarily optimal, solutions in a reasonable computing time. Metaheuristics are an active line of research in computer science and operational research that yields robust algorithms for the solution of COPs.
The need to find better results has driven the development of new lines of research, among which hybridization stands out, aiming at methods that are more robust both in the quality of the solution and in convergence times. In hybridization there are four lines of work: the first corresponds to the combination of heuristics with mathematical programming [7], the second to the combination of different heuristic methods [8], the third to the combination of simulation and heuristic methods [9], and the fourth to the combination of heuristics and machine learning. This last line is an emerging area of interest for researchers; the combination (heuristics-machine learning) can occur in such a way that metaheuristics help machine learning algorithms to improve their results ([10,11]), or in the reverse direction, where machine learning techniques help to obtain more robust metaheuristic algorithms (for example, in [12]). In Section 2.1, the different forms of hybridization are presented in detail.
In this work, in order to improve the diversification and intensification properties, and taking into account the different lines of research presented above, the k-nearest neighbors algorithm is applied within a perturbation operator. The contributions of this work are presented below.
  • Inspired by the work in [13], an improvement is proposed to the binarization technique based on transfer functions developed in [14], so that metaheuristics originally defined to operate in continuous spaces can efficiently solve COPs. This article incorporates the k-nearest neighbor technique to improve the diversification and intensification properties of a specific metaheuristic. Unlike [13], in which the perturbation operator is integrated with the k-means clustering technique, in this article the perturbation operator is integrated with transfer functions, which perform the binarization of the continuous metaheuristic. For this work, the Cuckoo Search (CS) metaheuristic was used; this algorithm was chosen for its ease of parameter tuning and the existence of basic theoretical convergence models.
  • Unlike [13], in which the multidimensional knapsack problem was tackled, this article addresses the set covering problem (SCP). This combinatorial problem has been widely studied and, because of that, instances of different difficulties are available, which facilitates our analysis. In this work, we have chosen large instances in order to adequately evaluate the contribution of the KNN perturbation operator.
  • For a suitable evaluation of our KNN perturbation operator, we first use a parameter estimation methodology proposed in [15] with the goal of finding the best metaheuristic configurations. Later, experiments are carried out to gain insight into the contribution of the KNN operator. Finally, our hybrid algorithm is compared with state-of-the-art general binarization methods. The numerical results show that our proposal achieves highly competitive results.
The rest of the work is organized as follows. In Section 2, the state of the art on integrating metaheuristics with machine learning is developed; then, in the same section, the different binarization methods are summarized. Later, in Section 3, the set covering optimization problem is explained. Then, in Section 4, the perturbation operator and the algorithm that solves the SCP are detailed. After that, the results obtained are presented in Section 5. Finally, in Section 6, the conclusions and new research lines are developed.

3. The Set Covering Problem

The classical set covering problem (SCP) is an important NP-hard combinatorial problem, relevant not only theoretically in the field of optimization but also from a practical point of view, with applications in different areas of engineering, for example, vehicle routing, railroads, airline crew scheduling, microbial communities, and pattern finding [57,58,59,60].
The SCP consists of choosing, from a set of possible locations, a subset that covers all agents at the lowest possible cost, where the cost is given by a fixed cost of construction and implementation.
Several algorithms have been created to solve this problem. Some of them are exact algorithms, which generally rely on branch-and-bound and branch-and-cut methods to obtain optimal solutions [61,62]. However, the time this type of algorithm requires to provide a solution does not allow it to be used in industrial problems. As an alternative, different heuristics have been proposed [63,64].
Mathematically, the SCP is formulated as follows. Let $A = (a_{ij})$ be an $n \times m$ zero-one matrix, where a column $j$ covers a row $i$ if $a_{ij} = 1$, and each column $j$ has an associated positive real cost $c_j$. Let $J = \{1, \ldots, m\}$ and $I = \{1, \ldots, n\}$ be, respectively, the sets of columns and rows of $A$. The SCP then consists of finding a minimum-cost subset $S \subseteq J$ such that each row $i \in I$ is covered by at least one column $j \in S$, i.e.,

$$\text{Minimize } f(x) = \sum_{j=1}^{m} c_j x_j$$

$$\text{subject to } \sum_{j=1}^{m} a_{ij} x_j \geq 1 \ \ \forall i \in I, \qquad x_j \in \{0, 1\} \ \ \forall j \in J,$$

where $x_j = 1$ if $j \in S$ and $x_j = 0$ otherwise.
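To make the formulation concrete, the following minimal sketch (ours, not the authors' code; NumPy is assumed and the instance below is a toy example) evaluates the objective and checks the coverage constraints for a candidate binary vector x:

import numpy as np

def scp_cost(x, c):
    # Objective f(x) = sum_j c_j * x_j for a binary column-selection vector x.
    return float(np.dot(c, x))

def scp_feasible(x, A):
    # Every row i must be covered by at least one selected column j.
    return bool(np.all(A @ x >= 1))

# Toy instance: 3 rows, 4 columns (a_ij = 1 if column j covers row i).
A = np.array([[1, 0, 1, 0],
              [0, 1, 1, 0],
              [0, 0, 0, 1]])
c = np.array([2.0, 3.0, 4.0, 1.0])
x = np.array([0, 0, 1, 1])                  # select columns 3 and 4
print(scp_feasible(x, A), scp_cost(x, c))   # True 5.0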

4. The Binary KNN Perturbed Algorithm

This section details the binary KNN perturbation algorithm used to solve the SCP. This hybrid algorithm has four main operators: a solution initialization operator, described in Section 4.1; a binarization operator that uses transfer functions, detailed in Section 4.4; a perturbation operator based on the k-nearest neighbor technique, described in Section 4.3; and a repair operator for the event that a solution does not meet the coverage constraints, detailed in Section 4.5. Additionally, the KNN perturbation algorithm has a KNN perturbation analysis module, whose objective is to collect data from the solutions obtained during the optimization process and later deliver this information to the perturbation operator. This module is detailed in Section 4.2. The algorithm flow chart is shown in Figure 1.
Figure 1. KNN perturbation algorithm flow chart.

4.1. Initialization Operator

The goal of this operator is to initialize the solutions. In the first stage, a column is selected randomly through the SelRandCol() function. Once the first column is selected, compliance with the coverage constraints is evaluated. If they are not fulfilled, the Heu() function, detailed in Section 4.6, is called. This function receives a list with the currently selected column or columns (lSol) and returns a new column nC to be incorporated into lSol. The Heu() function is executed until the coverage constraints are satisfied. The pseudocode of the initialization procedure is shown in Algorithm 1.
Algorithm 1 Init Operator
1: Function Init()
2: Input
3: Output lSol
4: lSol ← SelRandCol()
5: while not all rows are covered by lSol do
6:   nC ← Heu(lSol)
7:   lSol.append(nC)
8: end while
9: return lSol
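A compact Python sketch of this operator, under our assumptions (A is a NumPy coverage matrix and heu is the heuristic of Section 4.6), could look as follows:

import random
import numpy as np

def init_solution(A, heu):
    # SelRandCol(): start from a single randomly chosen column.
    l_sol = [random.randrange(A.shape[1])]
    # Call the heuristic until every row is covered (Algorithm 1, lines 5-8).
    while not np.all(A[:, l_sol].sum(axis=1) >= 1):
        l_sol.append(heu(l_sol))
    return l_sol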

4.2. KNN Perturbation Analysis Module

This module uses the data generated during the evolution of the metaheuristic algorithm and, through a measure, suggests a perturbation probability for each element of the solution. As a measure of importance, the definition given in [65] is adapted. The objective of the measure developed in [65] is to identify the most relevant variables using a reduced number of evaluations. This approach models linear and non-linear effects and the correlation between variables.
Let K points belong to the search space S. Then, $EE_i(X)$ is defined in Equation (5), where $X \in S$:

$$EE_i(X) = f(X_1, \ldots, \hat{X}_i, \ldots, X_d) - f(X) \tag{5}$$

where f corresponds to the objective function and $\hat{X}_i$ is the complement of $X_i$, that is, if $X_i = 1$ then $\hat{X}_i = 0$, and i represents the position of the element $X_i$ in the solution. Subsequently, for each dimension, the average $\mu_i$ and the standard deviation $\sigma_i$, defined in Equations (6) and (7) respectively, can be calculated.
$$\mu_i = \frac{1}{K} \sum_{j=1}^{K} \left| EE_i(X_j) \right| \tag{6}$$

$$\sigma_i = \sqrt{\frac{1}{K} \sum_{j=1}^{K} \left( \left| EE_i(X_j) \right| - \mu_i \right)^2} \tag{7}$$
When the ( μ , σ ) pair is analyzed, interesting interpretations can be obtained. In the case of obtaining small values of μ and σ , it is an indicator that the input variables have a small impact on the objective function. A small value of μ and a high value of σ suggest that the input variables have a nonlinear effect. A linear effect on the objective function is related to high values of μ and small values of σ . Finally, high values in both indicate that there are nonlinear effects or interaction with other variables.
In the methods reviewed in [65], the objective is to evaluate the exploration of the entire space. In our case, the goal is to measure the exploitation of a region around a solution in order to later apply a perturbation to it. Therefore, the previous calculation must be adapted by incorporating the concept of neighborhood: instead of calculating the indicators over the entire space, the calculation is carried out on the k-nearest neighbors of the solution to be perturbed. The data set used to obtain the k neighbors is generated with the 25% best solutions obtained in each iteration; therefore, the elements of the first quartile are used in the estimation. For neighbor retrieval, we use the k-nearest neighbor algorithm (KNN).
To perform the perturbation, it is necessary to obtain a value $w_i$ between 0 and 1 for each dimension i; $w_i$ is defined in Equation (8). The indicators $\mu_i^*$ and $\sigma_i^*$ correspond to the values of $\mu_i$ and $\sigma_i$ normalized between 0 and 1, and their sum is divided by 2 to ensure that $w_i$ takes values between 0 and 1. The detail of the $w_i$ calculation is shown in Algorithm 2.

$$w_i = \frac{\mu_i^* + \sigma_i^*}{2} \tag{8}$$
Algorithm 2 KNN perturbation analysis module
1: Function weight(lSol)
2: Input lSol
3: Output The list of weights (lWeight)
4: lWeight ← []
5: neighbours ← getKneighbours(lSol)
6: for (each dimension i in lSol) do
7:   w_i ← getweight(neighbours)
8:   lWeight.append(w_i)
9: end for
10: return lWeight
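The following sketch shows one way to implement this module (our reading of Algorithm 2, not the authors' code): the k nearest neighbours are retrieved with scikit-learn, the elementary effects of Equation (5) are computed by flipping one bit at a time, and the weights of Equation (8) are returned. The archive of good solutions (a 2D NumPy array) and the objective f are assumed to be given.

import numpy as np
from sklearn.neighbors import NearestNeighbors

def knn_weights(sol, archive, f, k=5):
    # getKneighbours(lSol): k nearest neighbours of sol within the archive
    # holding the best 25% of solutions per iteration.
    nn = NearestNeighbors(n_neighbors=k).fit(archive)
    idx = nn.kneighbors([sol], return_distance=False)[0]
    neighbours = archive[idx]
    d = len(sol)
    ee = np.empty((k, d))
    for j, x in enumerate(neighbours):
        fx = f(x)
        for i in range(d):
            x_flip = x.copy()
            x_flip[i] = 1 - x_flip[i]        # complement of X_i
            ee[j, i] = f(x_flip) - fx        # EE_i(X_j), Equation (5)
    mu = np.abs(ee).mean(axis=0)             # Equation (6)
    sigma = np.sqrt(((np.abs(ee) - mu) ** 2).mean(axis=0))  # Equation (7)

    def norm(v):                             # min-max normalization to [0, 1]
        span = v.max() - v.min()
        return (v - v.min()) / span if span > 0 else np.zeros_like(v)

    return (norm(mu) + norm(sigma)) / 2      # w_i, Equation (8)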

4.3. KNN Perturbation Operator

The goal of this operator is to perturb the solutions when the perturbation criterion is met. If a solution is among the best 25% of solutions of the iteration, the operator consults the KNN perturbation analysis module for the perturbation probability of each element of the solution. Otherwise, that is, if the solution is not among the best 25%, the solution is perturbed randomly, where the coefficient ν is used to manage the strength of the perturbation. The criterion that triggers the perturbation of the solution list is a number T of iterations without a change in the best value; in this work, T was set to 35. The pseudocode of the perturbation operator is shown in Algorithm 3, where bSolutions corresponds to the 25% best solutions and oSolutions to the rest of the solutions.
Algorithm 3 KNN Perturbation operator
1: Function Perturbation(listSolutions, ν)
2: Input The solution list listSolutions, strength of perturbation ν
3: Output The perturbed solution list listSolutions
4: bSolutions, oSolutions ← getBSols(listSolutions)
5: for each lSol in bSolutions do
6:   for (each dimension i in lSol) do
7:     if (w_i > random and lSol_i == 1) then
8:       remove element i from lSol
9:     end if
10:   end for
11:   lSol ← Repair(lSol)
12: end for
13: for (each lSol in oSolutions) do
14:   for (i = 1 to ν) do
15:     Randomly delete an item from lSol
16:   end for
17:   lSol ← Repair(lSol)
18: end for
19: return listSolutions
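Under our assumptions (solutions are Python lists of 0/1 values and lower fitness is better, since the SCP minimizes cost), Algorithm 3 can be sketched as:

import random

def perturbation(list_solutions, fitnesses, weights_fn, repair, nu):
    # getBSols: split into the best 25% (bSolutions) and the rest (oSolutions).
    order = sorted(range(len(list_solutions)), key=lambda s: fitnesses[s])
    cut = max(1, len(order) // 4)
    b_idx, o_idx = order[:cut], order[cut:]
    for s in b_idx:                          # guided perturbation
        sol = list_solutions[s]
        w = weights_fn(sol)                  # KNN perturbation analysis module
        for i, wi in enumerate(w):
            if sol[i] == 1 and wi > random.random():
                sol[i] = 0                   # remove element i
        list_solutions[s] = repair(sol)
    for s in o_idx:                          # random perturbation of strength nu
        sol = list_solutions[s]
        ones = [i for i, v in enumerate(sol) if v == 1]
        for i in random.sample(ones, min(nu, len(ones))):
            sol[i] = 0
        list_solutions[s] = repair(sol)
    return list_solutions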

4.4. Transfer Function Operator

As CS is an algorithm that naturally operates in continuous search spaces, it must be adapted to solve the SCP. A widely used method in these situations is transfer functions (Section 2.2). In this work, the function shown in Equation (9) was used as the transfer function, together with the elitist roulette discretization method shown in Equation (10), to obtain binary solutions.
$$T(x) = |\tanh(x)| \tag{9}$$

$$x_i^d(t+1) = \begin{cases} Best_i^d(t) & \text{if } rand \leq T\left(x_i^d(t+1)\right) \\ 0 & \text{otherwise} \end{cases} \tag{10}$$

where $Best_i^d(t)$ corresponds to the value in dimension d of the best solution obtained up to iteration t.
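A direct sketch of Equations (9) and (10) in Python (ours, not the authors' code; rand is a uniform draw in [0, 1] and best_bin is the stored best binary solution):

import math, random

def transfer(x):
    return abs(math.tanh(x))                 # Equation (9)

def binarize(x_cont, best_bin):
    # Elitist roulette, Equation (10): per dimension, copy the best
    # solution's bit with probability T(x_d); otherwise set it to 0.
    return [bd if random.random() <= transfer(xd) else 0
            for xd, bd in zip(x_cont, best_bin)]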

4.5. Repair Operator

Every time the transfer function operator or the perturbation operator is executed, there is a possibility that the obtained solution does not meet the constraints of the problem. In these cases, a repair operator is used to obtain a feasible solution. If there are rows not covered by the solution, the repair operator uses the heuristic function to choose the necessary columns. Once all the rows are covered, the operator checks whether there are groups of columns that cover the same set of rows; in this case, the one with the highest cost is eliminated. The pseudocode of the repair operator is shown in Algorithm 4.
Algorithm 4 Repair Operator
1: Function Repair(lSol)
2: Input The solution lSol
3: Output The repaired solution lSolrep
4: while needRepair(lSol) == True do
5:   lSol.append(Heu(lSol))
6: end while
7: lSolrep ← delRepeatedItem(lSol)
8: return lSolrep
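A possible implementation of the repair logic (a sketch under our assumptions; delRepeatedItem is rendered here as dropping redundant columns, costliest first, while coverage is preserved):

import numpy as np

def repair(l_sol, A, c, heu):
    l_sol = list(l_sol)
    # needRepair: add heuristic columns until every row is covered.
    while not np.all(A[:, l_sol].sum(axis=1) >= 1):
        l_sol.append(heu(l_sol))
    # Remove columns whose rows are already covered by the rest of the
    # solution, trying the most expensive columns first.
    for j in sorted(l_sol, key=lambda j: -c[j]):
        rest = [k for k in l_sol if k != j]
        if rest and np.all(A[:, rest].sum(axis=1) >= 1):
            l_sol = rest
    return l_sol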

4.6. Heuristic Function

In cases where it is necessary to repair or initialize a solution, a heuristic function is used to select the most suitable candidates. In our proposal, the heuristic function receives the list of elements of the solution, lSol, as an input parameter. From the elements of lSol, the set uR of rows that have not been covered is obtained. Using Equation (11), the bestRows function returns the first N rows. With this list of rows, lRows, and using Equation (12), the bestCols function returns the first M columns. Finally, from the list of selected columns, lCols, one column is chosen at random. The operation of the heuristic function is shown in Algorithm 5.

$$WeightRow(i) = \frac{1}{L_i} \tag{11}$$

where $L_i$ is the number of ones in row i.

$$WeightColumn(j) = \frac{c_j}{|R \cap M_j|} \tag{12}$$

where $M_j$ is the set of rows covered by column j and R is the set of rows not yet covered.
Algorithm 5 Heuristic function
1: Function Heuristic(lSol)
2: Input The solution lSol
3: Output The new column nC
4: lRows ← bestRows(lSol, N=10)
5: lCols ← bestCols(lRows, M=5)
6: nC ← getCols(lCols)
7: return nC
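The heuristic can be sketched as follows (our reading of Algorithm 5 and Equations (11) and (12); A is the coverage matrix, c the cost vector, and the function is assumed to be called only while uncovered rows remain):

import random
import numpy as np

def heuristic(l_sol, A, c, N=10, M=5):
    covered = A[:, l_sol].sum(axis=1) >= 1
    u_rows = np.where(~covered)[0]           # uncovered rows uR
    # bestRows: WeightRow(i) = 1 / L_i, so rows covered by fewer
    # columns come first; keep the best N.
    l_rows = sorted(u_rows, key=lambda i: A[i].sum())[:N]
    # bestCols: WeightColumn(j) = c_j / |R ∩ M_j| over columns covering
    # at least one selected row; keep the M cheapest per covered row.
    cand = [j for j in range(A.shape[1]) if A[l_rows, j].sum() > 0]
    l_cols = sorted(cand, key=lambda j: c[j] / A[l_rows, j].sum())[:M]
    return random.choice(l_cols)             # getCols: random pick from lCols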

5. Numerical Results

This section aims to study the contribution of the KNN perturbation operator when applied to the SCP. As a first stage, in Section 5.1, the methodology used for parameter tuning is explained. Later, in Section 5.2, the contribution of the KNN perturbation operator is analyzed. Finally, in Section 5.3, our proposal is compared with other algorithms that have solved the SCP in recent years. The dataset used in the experiments considers the instance sets E, F, G, and H of the OR-Library (http://people.brunel.ac.uk/~mastjjb/jeb/orlib/scpinfo.html). The equipment used in the execution of the experiments corresponds to an Intel Core i7-4770 with 16 GB of RAM, and the algorithm was programmed in Python 3.6. For the analysis, each instance was executed 30 times, and box plots, convergence charts, and the Wilcoxon test were used to develop the comparisons.
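As an illustration of the statistical methodology (the run values below are hypothetical; only the procedure is from the paper), the Wilcoxon signed-rank test over paired runs can be applied with SciPy:

from scipy.stats import wilcoxon

# Hypothetical objective values of two algorithms over paired runs.
knn_runs  = [215, 216, 215, 217, 216, 214]
rand_runs = [219, 218, 220, 219, 221, 218]
stat, p = wilcoxon(knn_runs, rand_runs)
print(f"p-value = {p:.4f}")   # a small p-value indicates a significant difference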

5.1. Parameter Settings

In this section, the methodology used for the parameter setting is described. The methodology was proposed in [15] and considers four measurements based on the best value, the worst value, the average value, and the average time obtained by a specific configuration. The definitions of the measures are shown in Equations (13)–(16).
  • The percentage deviation of the best value obtained in ten runs compared with the best-known value:
$$bSolution = 1 - \frac{KBestVal - BestVal}{KBestVal} \tag{13}$$
  • The percentage deviation of the worst value obtained in ten runs compared with the best-known value:
$$wSolution = 1 - \frac{KBestVal - WorstVal}{KBestVal} \tag{14}$$
  • The percentage deviation of the average value obtained in ten runs compared with the best-known value:
$$aSolution = 1 - \frac{KBestVal - AverageVal}{KBestVal} \tag{15}$$
  • The convergence time of each experiment, standardized using Equation (16):
$$nTime = 1 - \frac{AvgConvTime - minTime}{maxTime - minTime} \tag{16}$$
The different explored configurations were obtained from the Range column in Table 1. For each configuration, problems E1, F1, G1, and H1 were considered, and each was executed 10 times. Subsequently, the four previously defined measurements were obtained for each configuration. These measurements allow us to generate a radar plot and calculate its area for each configuration; the configuration that obtains the largest area is selected. In the case of CS, the selected configuration is shown in the Value column of Table 1.
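The selection criterion can be sketched as follows (a minimal version under our assumptions; variable names are ours): the four measures of Equations (13)–(16) are computed for a configuration, and the area of the polygon they trace on the radar chart is used as its score.

import math

def measures(best, worst, avg, kbest, t_avg, t_min, t_max):
    b = 1 - (kbest - best) / kbest             # Equation (13)
    w = 1 - (kbest - worst) / kbest            # Equation (14)
    a = 1 - (kbest - avg) / kbest              # Equation (15)
    n = 1 - (t_avg - t_min) / (t_max - t_min)  # Equation (16)
    return [b, w, a, n]

def radar_area(r):
    # Area of the polygon traced by the measures on a radar chart with
    # equally spaced axes; the configuration with the largest area wins.
    k = len(r)
    return 0.5 * math.sin(2 * math.pi / k) * sum(
        r[i] * r[(i + 1) % k] for i in range(k))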
Table 1. Parameter configuration for the CS algorithm.

5.2. Perturbation Operator Analysis

This section describes the experiments that evaluate the contribution of the KNN perturbation operator to the final result of the optimization and then presents and analyzes the results. To evaluate the contribution of the KNN perturbation operator, two algorithms were designed. In the first one, the KNN perturbation operator is replaced by a random perturbation operator. This random operator also uses a perturbation coefficient ν but does not consider the information provided by the KNN perturbation analysis module; the perturbation is executed randomly, in the same way that the oSolutions are perturbed in Algorithm 3. For this random perturbation operator, two values of ν, 25 and 50, were used, and the corresponding algorithms are denoted by rand.25 and rand.50, respectively. The first value is the same one used by the KNN perturbation operator.
The second algorithm used to understand the contribution of the perturbation operator corresponds to an algorithm that does not use a perturbation operator. That is, it is equivalent to the design in Figure 1; however, in this case, the solutions are never perturbed. This algorithm is denoted by non-perturbed.
For the analysis of the results, a comparative table, Table 2, is generated, where the best value and the average obtained from the 30 executions of the different algorithms are compared for each of the instances. Moreover, in Figure 2 and Figure 3, box plots have been generated by type of instance; the objective of these box plots is to make a statistical and visual comparison between the results obtained by the different algorithms. Finally, in Figure 4 and Figure 5, the convergence charts of the different algorithms by type of instance are shown.
Table 2. Comparison between non-perturbed, random, and KNN perturbation operators.
Figure 2. Box plot comparison between non-perturbed, random perturbed, and KNN perturbed operators for E and F instances.
Figure 3. Box plot comparison between non-perturbed, random perturbed, and KNN perturbed operators for G and H instances.
Figure 4. Convergence chart for E and F instances.
Figure 5. Convergence chart for G and H instances.
In Table 2, for the best value indicator, it is observed that the KNN perturbed algorithm obtained the best values in all instances. Rand.25 and rand.50 obtained similar values, and the non-perturbed algorithm obtained the worst values. In the case of the average indicator, the result is similar: again, the KNN perturbed algorithm obtained the best results, followed by rand.25, with non-perturbed once again obtaining the worst value. The Wilcoxon test indicates that the difference between the algorithms is statistically significant. In the case of the box plots shown in Figure 2 and Figure 3, KNN perturbed obtains the best interquartile range values, with values closer to 0 in all instance types, when compared with the other algorithms. In addition, in almost all cases the IQR (IQR = Q3 − Q1) is smaller than that of the other algorithms, except in instances E and F, in which its values are very similar to those of rand.25. Finally, in Figure 4 and Figure 5, the convergence curves for the four types of instances studied are compared. In all four types, we observe that the convergence charts stabilize at the same number of iterations for the different algorithms. However, in the first iterations, KNN perturbed separates from the other algorithms, obtaining GAPs lower than the rest, and this difference is maintained throughout the complete execution of the optimization.
In summary, it is observed that the KNN perturbed operator contributes to the final result. This contribution makes it possible to obtain consistently better results, as well as a decrease in the dispersion of the values when comparing these with the other proposed algorithms. Regarding convergence, it is observed that the KNN perturbed operator contributes in an important way in the initial iterations of the optimization process.

5.3. Comparisons

This section evaluates the KNN perturbed algorithm against other algorithms that have efficiently solved the SCP. For this, we selected three algorithms. The first is the Binary Black Hole algorithm (BBH) [66]. BBH uses a transfer function as a binarization method; in particular, the function used was $\frac{1}{1+e^{x_i^d(t+1)/3}}$. The maximum number of iterations allowed for BBH was 4000, and the implementation was done in Java. The second algorithm [66] corresponds to a Binary Cuckoo Search algorithm (BCS). The BCS also uses a transfer function, $\frac{1}{1+e^{x_i^d(t+1)}}$, as its binarization method; this algorithm was also developed in Java, with a maximum of 4000 iterations. Finally, the last algorithm used in the comparison [2] performs the binarization through the concept of clustering: the solutions are grouped through the db-scan clustering technique to generate the binarization. This algorithm was developed in Python, and the maximum number of iterations was 800.
The results are shown in Table 3. When analyzing the best value indicator, we observe that KNN perturbed was superior in 15 of the 20 instances when compared to BBH. When contrasting with BCS, we see that KNN perturbed was superior in 11 instances and BCS in none. Finally, the comparison with db-scan-CS showed similar results: KNN perturbed outperformed db-scan-CS in 4 instances, and the latter was superior in 1 instance. When comparing the final averages of KNN perturbed and db-scan-CS, these are practically the same. The Wilcoxon test indicates that the difference is significant in the cases of BBH and BCS and not significant for db-scan-CS.
Table 3. Comparison between db-scan-CS, BBH, BCS, and KNN-perturbation algorithms.
When analyzing the average indicator, we observe that the difference in favor of KNN perturbed with respect to BCS and BBH is maintained. In the case of BBH, KNN perturbed was superior in 18 instances and BBH in 2; the Wilcoxon test indicates that the difference is statistically significant. In the comparison with BCS, KNN perturbed was superior in 17 instances and BCS in 3; the Wilcoxon test also indicates that the difference is significant. In the comparison between db-scan-CS and KNN perturbed, the average indicator shows very similar results: db-scan-CS obtains a better result in 15 instances and KNN perturbed in 5. However, due to the proximity of the results, the Wilcoxon test indicates that the difference is not statistically significant.
Additionally, we have incorporated Table 4 in order to develop a better understanding of the comparisons. In the Average Gap column of Table 4, the value corresponds to the average of the GAPs calculated for each instance. Subsequently, in the Gap ratio column, the GAP ratio of each algorithm is calculated using the KNN perturbation algorithm as the basis for comparison.
Table 4. GAP ratio with respect to the KNN-perturbation algorithm.
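The quantities in Table 4 can be reproduced as follows (a sketch; the GAP definition below is our assumption of the usual percentage deviation from the best-known value, and the numbers are hypothetical):

def gap(value, best_known):
    # Assumed definition: percentage deviation from the best-known value.
    return 100 * (value - best_known) / best_known

avg_gap_knn, avg_gap_bbh = 1.2, 3.4            # hypothetical Average Gap values
print(avg_gap_bbh / avg_gap_knn)               # Gap ratio w.r.t. KNN perturbation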
We noted the following patterns.
  • KNN perturbation outperforms BBH and BCS in all 4 types of problems. These techniques use transfer functions as the binarization method, the same mechanism used by KNN perturbation.
  • KNN perturbation outperforms db-scan-CS only on the G instances; in all other instance types, db-scan-CS performs better. db-scan-CS uses a binarization mechanism based on db-scan that adapts from iteration to iteration. However, the difference is not statistically significant.

6. Conclusions

In this work, we have used the k-nearest neighbors technique in a perturbation operator in order to tackle combinatorial problems effectively. The cuckoo search algorithm was selected for the development of the experiments. Two random operators were designed to determine the contribution of the KNN perturbed operator to the optimization result. From the experiments, it was concluded that the KNN perturbed operator contributes to improving the quality and dispersion of the solutions, in addition to obtaining better convergence. Additionally, a comparison was made with other algorithms that have tackled the SCP efficiently. From the comparison, it is concluded that our proposal improves the results of the algorithms that use the same binarization mechanism (transfer functions). In the comparison with the algorithm that uses the db-scan clustering technique as a binarization method, we observed that the results were very close to each other.
As lines of future work, we observe that there is an opportunity in the dynamic handling of metaheuristic parameters. Here, we intuit that the integration of dynamic programming or reinforcement learning techniques is a line that can contribute to improving the quality and convergence of metaheuristic algorithms. Another interesting line is the use of machine learning techniques for the selection of algorithms or algorithm parameterizations. Appealing to the no-free-lunch theorem, having a group of algorithms or algorithm parameterizations, together with a method to select among them based on the historical behavior of the different algorithms, could contribute to improving the quality of the solutions obtained.

Author Contributions

Conceptualization, V.Y., G.A. and J.G.; methodology, V.Y., G.A. and J.G.; software, G.A. and J.G.; validation, V.Y., G.A. and J.G.; formal analysis, J.G.; investigation, G.A. and J.G.; data curation, G.A.; writing—original draft preparation, J.G.; writing—review and editing, V.Y., G.A. and J.G.; funding acquisition, V.Y. and J.G. All authors have read and agreed to the published version of the manuscript.

Funding

The first author was supported by the Grant CONICYT/FONDECYT/INICIACION/11180056.

Data Availability Statement

The data set used in this article can be obtained from: http://people.brunel.ac.uk/~mastjjb/jeb/orlib/scpinfo.html. The results of the experiments are in: https://drive.google.com/drive/folders/1RvYrzVjFn60qDeyrWQBGS_r8o7FW1uGZ?usp=sharing.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Al-Madi, N.; Faris, H.; Mirjalili, S. Binary multi-verse optimization algorithm for global optimization and discrete problems. Int. J. Mach. Learn. Cybern. 2019, 10, 3445–3465. [Google Scholar] [CrossRef]
  2. García, J.; Moraga, P.; Valenzuela, M.; Crawford, B.; Soto, R.; Pinto, H.; Peña, A.; Altimiras, F.; Astorga, G. A Db-Scan Binarization Algorithm Applied to Matrix Covering Problems. Comput. Intell. Neurosci. 2019, 2019, 3238574. [Google Scholar] [CrossRef] [PubMed]
  3. Guo, H.; Liu, B.; Cai, D.; Lu, T. Predicting protein–protein interaction sites using modified support vector machine. Int. J. Mach. Learn. Cybern. 2018, 9, 393–398. [Google Scholar] [CrossRef]
  4. Korkmaz, S.; Babalik, A.; Kiran, M.S. An artificial algae algorithm for solving binary optimization problems. Int. J. Mach. Learn. Cybern. 2018, 9, 1233–1247. [Google Scholar] [CrossRef]
  5. García, J.; Martí, J.V.; Yepes, V. The Buttressed Walls Problem: An Application of a Hybrid Clustering Particle Swarm Optimization Algorithm. Mathematics 2020, 8, 862. [Google Scholar] [CrossRef]
  6. Yepes, V.; Martí, J.V.; García, J. Black Hole Algorithm for Sustainable Design of Counterfort Retaining Walls. Sustainability 2020, 12, 2767. [Google Scholar] [CrossRef]
  7. Caserta, M.; Voß, S. Metaheuristics: Intelligent Problem Solving. In Matheuristics: Hybridizing Metaheuristics and Mathematical Programming; Maniezzo, V., Stützle, T., Voß, S., Eds.; Springer: Boston, MA, USA, 2010; pp. 1–38. [Google Scholar]
  8. Talbi, E.G. Combining metaheuristics with mathematical programming, constraint programming and machine learning. Ann. Oper. Res. 2016, 240, 171–215. [Google Scholar] [CrossRef]
  9. Juan, A.A.; Faulin, J.; Grasman, S.E.; Rabe, M.; Figueira, G. A review of simheuristics: Extending metaheuristics to deal with stochastic combinatorial optimization problems. Oper. Res. Perspect. 2015, 2, 62–72. [Google Scholar] [CrossRef]
  10. Chou, J.S.; Nguyen, T.K. Forward Forecast of Stock Price Using Sliding-Window Metaheuristic-Optimized Machine-Learning Regression. IEEE Trans. Ind. Inform. 2018, 14, 3132–3142. [Google Scholar] [CrossRef]
  11. Zheng, B.; Zhang, J.; Yoon, S.W.; Lam, S.S.; Khasawneh, M.; Poranki, S. Predictive modeling of hospital readmissions using metaheuristics and data mining. Expert Syst. Appl. 2015, 42, 7110–7120. [Google Scholar] [CrossRef]
  12. de León, A.D.; Lalla-Ruiz, E.; Melián-Batista, B.; Moreno-Vega, J.M. A Machine Learning-based system for berth scheduling at bulk terminals. Expert Syst. Appl. 2017, 87, 170–182. [Google Scholar] [CrossRef]
  13. García, J.; Lalla-Ruiz, E.; Voß, S.; Droguett, E.L. Enhancing a machine learning binarization framework by perturbation operators: Analysis on the multidimensional knapsack problem. Int. J. Mach. Learn. Cybern. 2020, 11, 1951–1970. [Google Scholar] [CrossRef]
  14. García, J.; Crawford, B.; Soto, R.; Astorga, G. A clustering algorithm applied to the binarization of swarm intelligence continuous metaheuristics. Swarm Evol. Comput. 2019, 44, 646–664. [Google Scholar] [CrossRef]
  15. García, J.; Crawford, B.; Soto, R.; Castro, C.; Paredes, F. A k-means binarization framework applied to multidimensional knapsack problem. Appl. Intell. 2018, 48, 357–380. [Google Scholar] [CrossRef]
  16. Dokeroglu, T.; Sevinc, E.; Kucukyilmaz, T.; Cosar, A. A survey on new generation metaheuristic algorithms. Comput. Ind. Eng. 2019, 137, 106040. [Google Scholar] [CrossRef]
  17. Geem, Z.W.; Kim, J.; Loganathan, G. A New Heuristic Optimization Algorithm: Harmony Search. Simulation 2001, 76, 60–68. [Google Scholar] [CrossRef]
  18. Karaboga, D. An Idea Based on Honey Bee Swarm for Numerical Optimization; Technical Report, Technical Report-tr06; Erciyes university, Engineering Faculty, Computer Engineering Department: Kayseri, Turkey, 2005. [Google Scholar]
  19. Yang, X.S.; Deb, S. Cuckoo search via Lévy flights. In Proceedings of the 2009 World Congress on Nature & Biologically Inspired Computing (NaBIC), Coimbatore, India, 9–11 December 2009; pp. 210–214. [Google Scholar]
  20. Rashedi, E.; Nezamabadi-Pour, H.; Saryazdi, S. GSA: A gravitational search algorithm. Inf. Sci. 2009, 179, 2232–2248. [Google Scholar] [CrossRef]
  21. Rao, R.V.; Savsani, V.J.; Vakharia, D. Teaching–learning-based optimization: A novel method for constrained mechanical design optimization problems. Comput.-Aided Des. 2011, 43, 303–315. [Google Scholar] [CrossRef]
  22. Gandomi, A.H.; Alavi, A.H. Krill herd: A new bio-inspired optimization algorithm. Commun. Nonlinear Sci. Numer. Simul. 2012, 17, 4831–4845. [Google Scholar] [CrossRef]
  23. Cuevas, E.; Cienfuegos, M. A new algorithm inspired in the behavior of the social-spider for constrained optimization. Expert Syst. Appl. 2014, 41, 412–425. [Google Scholar] [CrossRef]
  24. Abdel-Basset, M.; Abdel-Fatah, L.; Sangaiah, A.K. Metaheuristic algorithms: A comprehensive review. In Computational Intelligence for Multimedia Big Data on the Cloud with Engineering Applications; Elsevier: Amsterdam, The Netherlands, 2018; pp. 185–231. [Google Scholar]
  25. Xu, L.; Hutter, F.; Hoos, H.H.; Leyton-Brown, K. SATzilla: Portfolio-based algorithm selection for SAT. J. Artif. Intell. Res. 2008, 32, 565–606. [Google Scholar] [CrossRef]
  26. Bartz-Beielstein, T.; Markon, S. Tuning search algorithms for real-world applications: A regression tree based approach. In Proceedings of the 2004 Congress on Evolutionary Computation (IEEE Cat. No.04TH8753), Portland, OR, USA, 19–23 June 2004; Volume 1, pp. 1111–1118. [Google Scholar]
  27. Smith-Miles, K.; van Hemert, J. Discovering the suitability of optimisation algorithms by learning from evolved instances. Ann. Math. Artif. Intell. 2011, 61, 87–104. [Google Scholar] [CrossRef]
  28. Peña, J.M.; Lozano, J.A.; Larrañaga, P. Globally multimodal problem optimization via an estimation of distribution algorithm based on unsupervised learning of Bayesian networks. Evol. Comput. 2005, 13, 43–66. [Google Scholar] [CrossRef] [PubMed]
  29. Bischl, B.; Mersmann, O.; Trautmann, H.; Preuß, M. Algorithm selection based on exploratory landscape analysis and cost-sensitive learning. In Proceedings of the 14th Annual Conference on Genetic And Evolutionary Computation, Philadelphia, PA, USA, 7–11 July 2012; pp. 313–320. [Google Scholar]
  30. Hutter, F.; Xu, L.; Hoos, H.H.; Leyton-Brown, K. Algorithm runtime prediction: Methods & evaluation. Artif. Intell. 2014, 206, 79–111. [Google Scholar]
  31. Kazimipour, B.; Li, X.; Qin, A.K. A review of population initialization techniques for evolutionary algorithms. In Proceedings of the 2014 IEEE Congress on Evolutionary Computation (CEC), Beijing, China, 6–11 July 2014; pp. 2585–2592. [Google Scholar]
  32. De Jong, K. Parameter setting in EAs: A 30 year perspective. In Parameter Setting in Evolutionary Algorithms; Springer: Adelaide, Australia, 2007; pp. 1–18. [Google Scholar]
  33. Eiben, A.E.; Smit, S.K. Parameter tuning for configuring and analyzing evolutionary algorithms. Swarm Evol. Comput. 2011, 1, 19–31. [Google Scholar] [CrossRef]
  34. García, J.; Yepes, V.; Martí, J.V. A Hybrid k-Means Cuckoo Search Algorithm Applied to the Counterfort Retaining Walls Problem. Mathematics 2020, 8, 555. [Google Scholar] [CrossRef]
  35. García, J.; Moraga, P.; Valenzuela, M.; Pinto, H. A db-Scan Hybrid Algorithm: An Application to the Multidimensional Knapsack Problem. Mathematics 2020, 8, 507. [Google Scholar] [CrossRef]
  36. Poikolainen, I.; Neri, F.; Caraffini, F. Cluster-based population initialization for differential evolution frameworks. Inf. Sci. 2015, 297, 216–235. [Google Scholar] [CrossRef]
  37. García, J.; Maureira, C. A KNN quantum cuckoo search algorithm applied to the multidimensional knapsack problem. Appl. Soft Comput. 2021, 102, 107077. [Google Scholar] [CrossRef]
  38. Rice, J.R. The algorithm selection problem. In Advances in Computers; Elsevier: West Lafayette, IN, USA, 1976; Volume 15, pp. 65–118. [Google Scholar]
  39. Brazdil, P.; Carrier, C.G.; Soares, C.; Vilalta, R. Metalearning: Applications to data mining; Springer Science & Business Media: Houston, TX, USA, 2008. [Google Scholar]
  40. Burke, E.K.; Gendreau, M.; Hyde, M.; Kendall, G.; Ochoa, G.; Özcan, E.; Qu, R. Hyper-heuristics: A survey of the state of the art. J. Oper. Res. Soc. 2013, 64, 1695–1724. [Google Scholar] [CrossRef]
  41. Florez-Lozano, J.; Caraffini, F.; Parra, C.; Gongora, M. Cooperative and distributed decision-making in a multi-agent perception system for improvised land mines detection. Inf. Fusion 2020, 64, 32–49. [Google Scholar] [CrossRef]
  42. Crawford, B.; Soto, R.; Astorga, G.; García, J.; Castro, C.; Paredes, F. Putting Continuous Metaheuristics to Work in Binary Search Spaces. Complexity 2017, 2017, 8404231. [Google Scholar] [CrossRef]
  43. Taghian, S.; Nadimi-Shahraki, M.H.; Zamani, H. Comparative analysis of transfer function-based binary Metaheuristic algorithms for feature selection. In Proceedings of the 2018 International Conference on Artificial Intelligence and Data Processing (IDAP), Malatya, Turkey, 28–30 September 2018; pp. 1–6. [Google Scholar]
  44. Mafarja, M.; Aljarah, I.; Heidari, A.A.; Faris, H.; Fournier-Viger, P.; Li, X.; Mirjalili, S. Binary dragonfly optimization for feature selection using time-varying transfer functions. Knowl.-Based Syst. 2018, 161, 185–204. [Google Scholar] [CrossRef]
  45. Feng, Y.; An, H.; Gao, X. The importance of transfer function in solving set-union knapsack problem based on discrete moth search algorithm. Mathematics 2019, 7, 17. [Google Scholar] [CrossRef]
  46. Proakis, J.; Salehi, M. Communication Systems Engineering, 2nd ed.; Prentice Hall: Upper Saddle River, NJ, USA, 2002. [Google Scholar]
  47. Pampara, G.; Franken, N.; Engelbrecht, P. Combining particle swarm optimisation with angle modulation to solve binary problems. In Proceedings of the IEEE Congress on Evolutionary Computation Edinburgh, Scotland, UK, 2–5 September 2005; Volume 1, pp. 89–96. [Google Scholar]
  48. Liu, W.; Liu, L.; Cartes, D. Angle Modulated Particle Swarm Optimization Based Defensive Islanding of Large Scale Power Systems. In Proceedings of the IEEE Power Engineering Society Conference and Exposition in Africa, Johannesburg, South Africa, 16–20 July 2007; pp. 1–8. [Google Scholar]
  49. Swagatam, D.; Rohan, M.; Rupam, K. Multi-user detection in multi-carrier CDMA wireless broadband system using a binary adaptive differential evolution algorithm. In Proceedings of the 15th Annual Conference on Genetic and Evolutionary Computation, GECCO, Amsterdam, The Netherlands, July 2013; pp. 1245–1252. [Google Scholar]
  50. Dahi, Z.A.E.M.; Mezioud, C.; Draa, A. Binary bat algorithm: On the efficiency of mapping functions when handling binary problems using continuous-variable-based metaheuristics. In Proceedings of the IFIP International Conference on Computer Science and its Applications, Saida, Algeria, 20–21 May 2015; Springer: Cham, Switzerland, 2015; pp. 3–14. [Google Scholar]
  51. Leonard, B.J.; Engelbrecht, A.P. Frequency distribution of candidate solutions in angle modulated particle swarms. In Proceedings of the 2015 IEEE Symposium Series on Computational Intelligence, Cape Town, South Africa, 7–10 December 2015; pp. 251–258. [Google Scholar]
  52. Zhang, G. Quantum-inspired evolutionary algorithms: A survey and empirical study. J. Heurist. 2011, 17, 303–351. [Google Scholar] [CrossRef]
  53. Srikanth, K.; Panwar, L.K.; Panigrahi, B.; Herrera-Viedma, E.; Sangaiah, A.K.; Wang, G.G. Meta-heuristic framework: Quantum inspired binary grey wolf optimizer for unit commitment problem. Comput. Electr. Eng. 2018, 70, 243–260. [Google Scholar] [CrossRef]
  54. Hu, H.; Yang, K.; Liu, L.; Su, L.; Yang, Z. Short-term hydropower generation scheduling using an improved cloud adaptive quantum-inspired binary social spider optimization algorithm. Water Resour. Manag. 2019, 33, 2357–2379. [Google Scholar] [CrossRef]
  55. Gao, Y.J.; Zhang, F.M.; Zhao, Y.; Li, C. A novel quantum-inspired binary wolf pack algorithm for difficult knapsack problem. Int. J. Wirel. Mob. Comput. 2019, 16, 222–232. [Google Scholar] [CrossRef]
  56. Kumar, Y.; Verma, S.K.; Sharma, S. Quantum-inspired binary gravitational search algorithm to recognize the facial expressions. Int. J. Mod. Phys. C 2020, 31, 2050138. [Google Scholar] [CrossRef]
  57. Balas, E.; Padberg, M.W. Set partitioning: A survey. SIAM Rev. 1976, 18, 710–760. [Google Scholar] [CrossRef]
  58. Borneman, J.; Chrobak, M.; Della Vedova, G.; Figueroa, A.; Jiang, T. Probe selection algorithms with applications in the analysis of microbial communities. Bioinformatics 2001, 17, S39–S48. [Google Scholar] [CrossRef] [PubMed]
  59. Boros, E.; Hammer, P.L.; Ibaraki, T.; Kogan, A. Logical analysis of numerical data. Math. Program. 1997, 79, 163–190. [Google Scholar] [CrossRef]
  60. Garfinkel, R.S.; Nemhauser, G.L. Integer Programming; Wiley: New York, NY, USA, 1972; Volume 4. [Google Scholar]
  61. Balas, E.; Carrera, M.C. A dynamic subgradient-based branch-and-bound procedure for set covering. Oper. Res. 1996, 44, 875–890. [Google Scholar] [CrossRef]
  62. Beasley, J.E. An algorithm for set covering problem. Eur. J. Oper. Res. 1987, 31, 85–93. [Google Scholar] [CrossRef]
  63. Beasley, J.E. A Lagrangian heuristic for set-covering problems. Nav. Res. Logist. 1990, 37, 151–164. [Google Scholar]
  64. Beasley, J.E.; Chu, P.C. A genetic algorithm for the set covering problem. Eur. J. Oper. Res. 1996, 94, 392–404. [Google Scholar] [CrossRef]
  65. Iooss, B.; Lemaître, P. A review on global sensitivity analysis methods. In Uncertainty Management in Simulation-Optimization of Complex Systems; Springer: Boston, MA, USA, 2015; pp. 101–122. [Google Scholar]
  66. Soto, R.; Crawford, B.; Olivares, R.; Barraza, J.; Figueroa, I.; Johnson, F.; Paredes, F.; Olguín, E. Solving the non-unicost set covering problem by using cuckoo search and black hole optimization. Nat. Comput. 2017, 16, 213–229. [Google Scholar] [CrossRef]
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
