A db-Scan Hybrid Algorithm: An Application to the Multidimensional Knapsack Problem

This article proposes a hybrid algorithm that makes use of the db-scan unsupervised learning technique to obtain binary versions of continuous swarm intelligence algorithms. These binary versions are then applied to large instances of the well-known multidimensional knapsack problem. The contribution of the db-scan operator to the binarization process is systematically studied. For this, two random operators are built that serve as a baseline for comparison. Once the contribution is established, the db-scan operator is compared with two other binarization methods that have satisfactorily solved the multidimensional knapsack problem. The first method uses the unsupervised learning technique k-means as a binarization method. The second makes use of transfer functions as a mechanism to generate binary versions. The results show that the hybrid algorithm using db-scan produces more consistent results compared to transfer function (TF) and random operators.


Introduction
With the incorporation of technologies such as big data and the Internet of Things, the concept of real-time decisions has become relevant at the industrial level. Each of these decisions can be modeled as an optimization problem or, in many cases, a combinatorial optimization problem (COP). Examples of COPs are found in different areas: machine learning [1], transportation [2], facility layout design [3], logistics [4], scheduling problems [2,5], resource allocation [6,7], routing problems [8], robotics applications [9], image analysis [10], engineering design problems [11], fault diagnosis of machinery [12], and manufacturing problems [13], among others. If the problem is large, metaheuristics have proven to be a good approach for obtaining adequate solutions. However, the growing amount of data, together with the need for near-real-time solutions in some cases, motivates us to continue strengthening the methods that address these problems.
One way to classify metaheuristics is according to the search space in which they work. In that sense, we have metaheuristics that work in continuous spaces, discrete spaces, and mixed spaces [14]. An important line of inspiration for metaheuristic algorithms is natural phenomena, many of which develop in a continuous space. Examples of metaheuristics inspired by natural phenomena in continuous spaces include particle swarm optimization [15], black hole optimization [16], cuckoo search [17], the bat algorithm [18], the firefly algorithm [19], the fruitfly algorithm [20], the artificial fish swarm [21], and the gravitational search algorithm [22]. The design of binary versions of these algorithms entails important challenges when preserving their intensification and diversification properties [14]. The details of binarization methods are specified in Section 3.
A strategy that has strengthened the results of metaheuristic algorithms has been their hybridization with techniques that come from the same or other areas. The main hybridization proposals found in the literature are the following: (i) matheuristics, which combine heuristics or metaheuristics with mathematical programming [23]; (ii) hybrid heuristics, a combination of different heuristic or metaheuristic methods [24]; (iii) simheuristics, where simulation and metaheuristics are combined to solve a problem [25]; and (iv) integration between metaheuristic algorithms and machine learning techniques. The last, hybridization between the areas of machine learning and metaheuristic algorithms, is an emerging research line in computer science and operational research. We find that hybridization occurs mainly with two intentions. The first is that metaheuristics help machine learning algorithms improve their results (for example, [26,27]). The second is that machine learning techniques are used to strengthen metaheuristic algorithms (for example, [28,29]). The details of the hybridization forms are specified in Section 4. This article is inspired by the research lines mentioned above. A hybrid algorithm is designed that explores the application of a machine learning algorithm in a binarization operator to allow continuous metaheuristics to address combinatorial optimization problems. The contributions of this work are detailed below:
• A machine learning technique is used with the objective of obtaining binary versions of metaheuristics defined and used in continuous optimization to tackle COPs in a simple and effective way. To perform the binarization process, this algorithm uses db-scan, which corresponds to an unsupervised learning algorithm. The selected metaheuristics are particle swarm optimization (PSO) and cuckoo search (CS). Their selection is based on the fact that they have been frequently used in solving continuous optimization problems and their parameterization is relatively simple, which allows us to focus on the binarization process.
• The multidimensional knapsack problem (MKP) was used to check the performance of the obtained binary versions. MKP was chosen because it is a problem extensively studied in the literature; therefore, specific benchmark instances are available, making it easy to evaluate our hybrid algorithm. The details and applications of MKP are expanded on in Section 2.
• Two random operators are designed in order to define a baseline and quantify the contribution of the hybrid algorithm that uses db-scan in the binarization process. Additionally, to make the comparison more robust, the performance of our hybrid algorithm was evaluated against methods that use k-means and transfer functions (TF) as binarization mechanisms.
This article is organized in the following sections. The definition of MKP is detailed in Section 2. Subsequently, the state-of-the-art of the main binarization and hybridization techniques between machine learning and metaheuristics are developed in Section 3. The detail of the proposed hybrid binarization algorithm is described in Section 4. In Section 5, we study the contribution of the db-scan operator to the binarization process. Additionally, in this same section, the proposed hybrid algorithm is compared with the binarization techniques that use TF and k-means. Finally, the main conclusions and future lines of investigation are detailed in Section 6.

Multidimensional Knapsack Problem
Due to its large number of applications and NP-hard computational complexity, a major research effort has been dedicated to the MKP. This optimization problem has been addressed by different types of techniques. Examples of exact algorithms that have efficiently solved the MKP are found in [30,31]. There are also hybrid algorithms where exact algorithms are combined with depth-first search [32] or with variable fixation [33]. However, exact algorithms are capable of producing optimal solutions only for small and medium-sized instances, usually with a number of variables n ≤ 250 and a number of constraints m ≤ 10 [32,33]. This makes the MKP an interesting problem for metaheuristics to address.
In the case of metaheuristics, there are several algorithms that have addressed the MKP. A modification of the harmony search algorithm that redesigns the memory rule and improves exploration capabilities was proposed in [34]. A binary artificial algae algorithm was designed in [35] that uses transfer functions to perform binarization in addition to incorporating a local search operator. A hybrid algorithm based on the k-means technique was proposed in [29]. Additionally, this algorithm incorporates local perturbation and search operators. In [36], a binary multiverse optimizer was designed to address medium-sized problems. This multiverse algorithm uses a transfer function mechanism to perform binarization. Finally, a two-phase tabu-evolutionary algorithm was developed by Lai et al. [37] to address large instances of the MKP.
Let N = {1, 2, . . . , n} be a set of n elements and M = {1, 2, . . . , m} be a set of m resources with capacity limit b_i for each resource i ∈ M. Each element j has profit p_j and consumes an amount c_ij of resource i. The MKP consists of selecting a subset of elements such that the capacity limit of each resource is not exceeded while the profit of the selected elements is maximized. Formally, the problem is defined as follows:

maximize  Σ_{j∈N} p_j x_j    (1)

subject to:

Σ_{j∈N} c_ij x_j ≤ b_i,  ∀ i ∈ M    (2)

x_j ∈ {0, 1},  ∀ j ∈ N    (3)

where b_i corresponds to the capacity limitation of resource i ∈ M. Each element j ∈ N has a requirement c_ij for resource i as well as a benefit p_j. Moreover, x_j ∈ {0, 1} indicates whether element j is in the knapsack, c_ij ≥ 0, p_j > 0, b_i > 0, n corresponds to the number of items, and m is the number of knapsack constraints. As mentioned above, the MKP has numerous applications. MKP modeling has been used in project selection and capital budgeting [38] applications as well as in the delivery of groceries in vehicles with multiple compartments [39] and the daily photographic scheduling problem of an Earth observation satellite [40]. Another interesting problem related to the MKP is the shelf space allocation problem [41]. Additionally, we find applications in the allocation of databases and processors in distributed data processing [42].
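As an illustration of the model above, the following sketch evaluates a candidate binary solution against a toy instance (all numeric values are hypothetical, chosen only for the example):

```python
# Minimal sketch of the MKP model: checks feasibility and computes the
# profit of a candidate binary solution x. Instance data is illustrative.

def mkp_profit(x, p):
    """Profit of the selected elements (the objective value)."""
    return sum(pj for pj, xj in zip(p, x) if xj == 1)

def mkp_feasible(x, c, b):
    """True if every resource i satisfies sum_j c[i][j] * x[j] <= b[i]."""
    return all(
        sum(cij * xj for cij, xj in zip(row, x)) <= bi
        for row, bi in zip(c, b)
    )

# Toy instance: n = 4 elements, m = 2 resources (hypothetical numbers).
p = [10, 7, 12, 4]              # profits p_j
c = [[3, 2, 4, 1],              # consumption c_ij, resource 0
     [2, 3, 3, 2]]              # consumption c_ij, resource 1
b = [7, 6]                      # capacities b_i

x = [1, 0, 1, 0]                # select elements 0 and 2
print(mkp_feasible(x, c, b))    # True: 3 + 4 <= 7 and 2 + 3 <= 6
print(mkp_profit(x, p))         # 22 = 10 + 12
```

Note that verifying feasibility is linear in n·m, which is why repair operators (Section 4.5) can afford to re-check constraints after every insertion or deletion.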

Related Binarization Work
Because many processes of nature that are modeled in continuous spaces have inspired metaheuristic algorithms, a large number of these algorithms are designed to work in continuous spaces. In particular, the metaheuristics cuckoo search (CS) and particle swarm optimization (PSO) have been widely used in solving continuous problems. However, there is a large number of combinatorial optimization problems, a significant portion of which are NP-hard. This motivates the search for robust methods that allow algorithms that operate in continuous spaces to tackle combinatorial optimization problems.
When developing a classification of existing binarization techniques, two large groups [14] are defined. The first group designs an adaptation of a continuous algorithm to work in binary environments. This adaptation usually turns out to be specific to the metaheuristic algorithm and the problem that is being solved. We call this group the specific binarizations. The second group separates the binarization process from the metaheuristic algorithm. Therefore, the latter continues to work in a continuous search space. Once the continuous solutions are obtained, they are binarized. We call this group the generic binarizations. In Figure 1a,b, the generic and specific binarization diagrams are shown. Examples of specific binarizations are frequently found in quantum binary approaches and in set-based approaches [14]. In the case of a quantum approach, continuous algorithms are adapted based on the uncertainty principle, where position and velocity cannot be determined simultaneously. In [43], a quantum binary gray wolf optimizer is proposed to solve the unit commitment problem. Using a quantum binary lightning search algorithm, in [44], the optimal placement of vehicle-to-grid charging stations in the distribution power system was addressed. The short-term hydropower generation scheduling problem was successfully addressed by [45] using a quantum-binary social spider optimization algorithm. In the case of the set-based approach, in [46], this method succeeded in solving the feature selection problem. Additionally, the vehicle routing problem with time windows was addressed in [47] by a particle swarm optimization set-based approach. Other examples of specific binarizations are found in [48], where a chaotic antlion algorithm was used to find a parameterization of the support vector machine technique. In this case, a chaotic map and random walk operators were used.
In the case of generic transformations, the simplest and most commonly used binarization method corresponds to the transfer functions (TFs). In this method, the particle has a position given by a solution in one iteration and a velocity corresponding to the vector obtained from the difference in the position of the particle between two consecutive iterations. The TF is a very simple operator that relates the velocities of the particles in PSO with a transition probability. The TF takes values in R^n and generates transition probability values in [0, 1]^n. Depending on the form of the function, they are generally classified as S-shaped [49] and V-shaped [50] functions. However, in recent years, the study of transfer functions has been extended by defining new families. In [51], a recurrence-generated parametric Fibonacci hyperbolic tangent activation function was defined and applied to neural networks. A family of functions based on the half-hyperbolic tangent activation function was introduced in [52]. In this work, the authors demonstrated the existence of upper and lower estimates for the Hausdorff approximation of the sign function.
When the function produces a value between 0 and 1, the next step is to use a rule that allows 0 or 1 to be obtained. For this, well-defined rules have been used that use the concepts of complements, elites, and random functions, among others. In [53], a quadratic binary Harris hawk optimization, which uses transfer functions, successfully addressed the feature selection problem. Additionally, a feature selection problem in [54] was solved by a binary dragonfly optimization algorithm. In this case, a time-varying transfer function was used. Finally, in [55], binary butterfly optimization was used to solve the feature selection problem.
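A minimal sketch of the TF scheme described above, assuming the classic sigmoid as the S-shaped function and |tanh| as the V-shaped one, together with the most common 0/1 rule (set the bit to 1 when a uniform draw falls below the transfer value); the complement and elite rules mentioned above would replace this last step:

```python
import math
import random

def s_shape(v):
    """Classic S-shaped transfer function: maps a velocity in R to [0, 1]."""
    return 1.0 / (1.0 + math.exp(-v))

def v_shape(v):
    """Classic V-shaped transfer function: symmetric around v = 0."""
    return abs(math.tanh(v))

def binarize(velocity, tf, rng):
    """Common rule: set each bit to 1 with probability tf(velocity_i)."""
    return [1 if rng.random() < tf(v) else 0 for v in velocity]

rng = random.Random(42)
vel = [-2.0, 0.0, 3.5]          # example particle velocities
print(binarize(vel, s_shape, rng))
```

With the S-shape, a strongly negative velocity maps close to probability 0 and a strongly positive one close to 1; V-shaped functions instead grow with |v| and are usually paired with a complement rule rather than the set-to-1 rule shown here.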
The main challenge a binarization framework has to tackle is associated with spatial disconnection [56]. When two solutions that are close in continuous space are not close in binary space when applying the binarization process, a spatial disconnection occurs. As a consequence of the existence of a spatial disconnection, alterations are observed in the exploration and exploitation properties of the optimization algorithm. These alterations result in a decrease in precision and an increase in the convergence times of the algorithms. In [57], we analyzed how TFs altered the exploration and exploitation process. We also find an analysis of these properties in [56], for the angle modulation technique.

Hybridizing Metaheuristics with Machine Learning
Metaheuristics form a wide family of algorithms. These algorithms are classified as incomplete optimization techniques and are usually inspired by natural or social phenomena. Their main objective is to solve problems of high computational complexity, and they have the property of not having to deeply alter their optimization mechanism when the problem to be solved is modified. On the other hand, machine learning techniques correspond to algorithms capable of learning from a dataset [58]. If we classify these algorithms according to the method of learning, there are three main categories: supervised learning, unsupervised learning, and reinforcement learning. Machine learning algorithms are usually used to solve problems in time series, anomaly detection, computer vision, data transformation, dimensionality reduction, regression, and data classification, among others [59].
Among state-of-the-art algorithms that integrate machine learning techniques with metaheuristic algorithms, we have found two main approaches. In the first approach, machine learning techniques are used to improve the quality of the solutions and convergence times obtained by the metaheuristic algorithms. The second approach uses metaheuristic algorithms to improve the performance of machine learning techniques. Usually, the metaheuristic is responsible for solving an optimization problem related to the machine learning technique more efficiently than the machine learning technique alone.
When we analyze the integration mechanisms that take the first approach, we identify two lines of research. In the first line, machine learning techniques are used as metamodels to select different metaheuristics by choosing the most appropriate metaheuristic for each instance. The second line aims to use specific operators that make use of machine learning algorithms, and, subsequently, to integrate specific operators into a metaheuristic.
According to the articles found that use a general integration mechanism between machine learning algorithms and metaheuristics, three main groups are defined: hyper-heuristics, cooperative strategies, and algorithm selection. The algorithm selection method aims to select, from a group of algorithms, the most appropriate one for the instance being solved. This selection is made using a set of characteristics and associating the best algorithm that has solved similar instances. In the case of the hyper-heuristic strategy, the approach is to automate the design of heuristic methods in order to tackle a wide range of problems. Finally, cooperative strategies aim to combine algorithms through a parallel or a sequential mechanism, assuming that this combination will produce more robust methods. Examples of these approaches are found in [28], where the algorithm selection strategy is used and applied to the berth scheduling problem. A hyper-heuristic algorithm was used in [60] and was applied to the nurse training problem. A direct cooperation mechanism was used in [61] to solve the permutation flow shop problem.
A metaheuristic is determined by its evolution mechanism, together with different operators, such as initialization solution operators, solution perturbation, population management, binarization, parameter setting, and local search operators. Specific integrations explore machine learning applications in some of these operators. In the design of binary versions of algorithms that work naturally in continuous spaces, we find binarization operators in [2]. These binary operators use unsupervised learning techniques to perform the binarization process. In [62], the concept of percentiles was explored in the process of generating binary algorithms. In addition, in [5], the Apache spark big data framework was applied to manage the population size of solutions to improve convergence times and the quality of results. Another interesting line of research was found in the adjustment of metaheuristic parameters. In [63], the parameter setting of a chess classification system was implemented. Based on decision trees and using fuzzy logic, a semi-automatic parameter setting algorithm was designed in [64]. The initiation of solutions of a metaheuristic is frequently carried out in a random way. However, using machine learning techniques, it has been possible to improve the performance of metaheuristic algorithms, through the process of initiating solutions. In [65], the initiation of solutions of a genetic algorithm through the case-based reasoning technique was applied to the problem of the design of a weighted circle. Again, in the initiation of a genetic algorithm, in [66], Hopfield neural networks were used. The genetic algorithm, together with the Hopfield networks, were applied to the economic dispatch problem.
In the other direction, where metaheuristics support the development of more robust machine learning algorithms, there are many studies and applications. For example, we find applications in feature selection, parameter settings, and feature extraction. In [67], a genetic algorithm integrated with SVM was applied to the recognition of breast cancer. The hybrid algorithm improved the classification compared to the original SVM technique. In particular, the genetic algorithm was applied to the extraction of characteristics from the images involved in the analysis. Again, for feature extraction, a multiverse optimizer was used in [68]. Additionally, this optimizer was used to perform SVM tuning. In the case of neural networks, depending on the type of network and its topology, obtaining the weights properly can be a difficult and time-consuming task. In particular, in [69], the tuning of a feed-forward neural network was addressed through an improved monarch butterfly algorithm. The integration of a firefly algorithm with the least-squares support vector machine technique was developed in [70] with the goal of solving a geotechnical problem efficiently. The prediction of the compressive strength of high-performance concrete was modeled in [71] through a metaheuristic-optimized least-squares support vector regression algorithm. In [72], a hybrid algorithm that integrates metaheuristics with artificial neural networks was designed with the aim of improving the prediction of stock prices. Another example of price prediction was developed in [26]. In this case, the improvement was achieved using a sliding-window metaheuristic-optimized machine-learning regression and was applied to a construction company in Taiwan. We also find in [73] an application of a firefly algorithm for tuning parameters in the least-squares vector regression technique. In this case, the improved algorithm was applied to predictions in engineering design.
Applications of metaheuristics to unsupervised machine learning techniques are also found in the literature. In particular, there are a large number of studies applied to cluster techniques. In [74], an algorithm based on the combination of a metaheuristic with a kernel intuitionistic fuzzy c-means method was designed and applied to different datasets. Another interesting problem is the search for centroids because it requires a large computing capacity. An artificial bee colony algorithm was used in [75], to find centroids in an energy efficiency problem of a wireless sensor network. Planning for the transportation of employees from an oil platform through helicopters was addressed in [76] through cluster search using a metaheuristic.

Binary db-Scan Algorithm
To efficiently solve the MKP, the binary db-scan algorithm is composed of five operators. The first operator corresponds to the initialization of the solutions. This operator is detailed in Section 4.1. After the population of solutions is initialized, the next step is to verify whether the maximum iteration criterion is met. If the criterion is not satisfied, then the binary db-scan operator is used. In this operator, the metaheuristic is executed in the continuous space to later group the solutions considering the absolute value of the velocity and using the db-scan technique. The details of this operator are described in Section 4.2. Subsequently, with the clusters generated by the db-scan operator, the transition operator is used, which aims to binarize the solutions grouped by db-scan. The transition operator is described in Section 4.3. Then, if the solutions obtained do not satisfy all the constraints, the repair operator described in Section 4.5 is applied. Finally, a random perturbation operator is used that is associated with the criterion of the number of iterations that are performed without the best value being modified; this operator is detailed in Section 4.4. The general flow chart of the binary db-scan algorithm is shown in Figure 2.

Initialization Operator
Each solution is generated as follows. First, we select an item randomly. Then, we consult the constraints of our problem to see whether there are other elements that can be incorporated. The list of possible elements to be incorporated is obtained, the weight of each of these elements is calculated, and one of the three best elements is selected. The procedure continues until no more elements can be incorporated. The pseudocode is shown in Algorithm 1.

Algorithm 1: Initialization operator (fragment)
  element ← RandElement(lElements)
  p.append(element)
  while (there exist elements that satisfy the constraints) do
    lPosibleElements ← PosibleElements(lElements)
    element ← BestElement(lPosibleElements)
    p.append(element)
  end while
  return p

Several techniques have been proposed in the literature to calculate the weight of each element. For example, in [77], the pseudoutility in the surrogate duality approach was introduced. The pseudoutility of each variable is given in Equation (4).
Another more intuitive measure was proposed in [78]. This measure focuses on the average occupancy of resources. Its equation is shown in Equation (5).
In this article, we use a variation of this last measure focused on the average occupation shown in Equation (6). In this equation, c kj represents the cost of object k in knapsack j, b j corresponds to the capacity constraint of knapsack j, and p i corresponds to the profit of element i. This heuristic was proposed in [29], and its objective is to select the elements that enter the knapsack.
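A hedged Python sketch of the construction outlined in Algorithm 1. Since the exact measure of Equation (6) is given in the paper, `avg_occupancy_weight` below is a hypothetical stand-in (profit per unit of average resource occupancy); the random seed element and the choice among the three best-weighted candidates follow the prose:

```python
import random

def avg_occupancy_weight(j, p, c, b, m):
    """Hypothetical stand-in for an Equation (6)-style measure:
    profit per unit of average resource occupancy (higher is better)."""
    occ = sum(c[i][j] / b[i] for i in range(m)) / m
    return p[j] / occ

def feasible_additions(sol, p, c, b):
    """Elements not yet selected that still fit in every knapsack."""
    n, m = len(p), len(b)
    used = [sum(c[i][j] for j in sol) for i in range(m)]
    return [j for j in range(n) if j not in sol
            and all(used[i] + c[i][j] <= b[i] for i in range(m))]

def init_solution(p, c, b, rng):
    """Greedy-randomized construction: seed with a random element, then
    repeatedly add one of the three best-weighted feasible elements
    until none fit."""
    n, m = len(p), len(b)
    sol = [rng.randrange(n)]
    while True:
        cand = feasible_additions(sol, p, c, b)
        if not cand:
            return sol
        cand.sort(key=lambda j: avg_occupancy_weight(j, p, c, b, m),
                  reverse=True)
        sol.append(rng.choice(cand[:3]))

# Toy instance (hypothetical numbers).
p = [10, 7, 12, 4]
c = [[3, 2, 4, 1], [2, 3, 3, 2]]
b = [7, 6]
print(sorted(init_solution(p, c, b, random.Random(1))))
```

Choosing among the three best rather than always the single best keeps the initial population diverse, which matters for the swarm algorithms used later.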

Binary db-Scan Operator
Continuous metaheuristic solutions are clustered through the binary db-scan operator. If we make the analogy of solutions with particles, the position of the particle represents the location of the particle in the search space. Velocity is interpreted as a transition vector between a state t and a state t + 1.
The density-based spatial clustering of applications with noise (db-scan) is used as the technique to obtain clusters. Db-scan works with the concept of density to find clusters. The algorithm was proposed in 1996 by Ester et al. [79]. Let us consider a set S within a metric space; then, the db-scan algorithm groups the points that meet a minimum density criterion, and the others are labeled as outliers. To achieve this task, db-scan requires two parameters: a radius ε and a minimum number of neighbors δ. The main steps of the algorithm are shown below:
• Find the points in the ε-neighborhood of every point and identify the core points with more than δ neighbors.
• Find the connected components of core points on the neighbor graph, ignoring all noncore points.
• Assign each noncore point to a nearby cluster if the cluster contains an ε-neighbor of the point; otherwise, assign it to noise.
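The three steps above can be sketched as a minimal, self-contained 1-D implementation (matching the operator's later use on absolute velocity values); here `eps` plays the role of the radius and `min_pts` that of δ, with the point itself counted among its neighbors, and noise labeled -1:

```python
def dbscan_1d(values, eps, min_pts):
    """Minimal DBSCAN sketch on 1-D data following the three steps:
    find eps-neighborhoods and core points, connect core points into
    components, then attach border points; the rest is noise (-1)."""
    n = len(values)
    neigh = [[j for j in range(n) if abs(values[i] - values[j]) <= eps]
             for i in range(n)]
    core = set(i for i in range(n) if len(neigh[i]) >= min_pts)
    labels = [-1] * n
    cluster = 0
    for i in core:
        if labels[i] != -1:
            continue
        labels[i] = cluster
        stack = [i]
        while stack:                    # grow one connected component
            q = stack.pop()
            for j in neigh[q]:
                if labels[j] == -1:
                    labels[j] = cluster
                    if j in core:       # only core points expand the cluster
                        stack.append(j)
        cluster += 1
    return labels, cluster

vals = [0.1, 0.12, 0.11, 0.9, 0.92, 5.0]
labels, k = dbscan_1d(vals, eps=0.05, min_pts=2)
print(labels, k)                        # [0, 0, 0, 1, 1, -1] 2
```

Unlike k-means, the number of clusters is an output, not an input; this is precisely what the binary operator exploits, since the number of velocity groups can vary across iterations.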
Let us define lP(t) as the position list given by a metaheuristic MH in iteration t. Then, the binary db-scan operator takes MH and lP(t) as input objects. The operator's goal is to generate the clusters of the solutions delivered by MH. As a first step, the operator iterates lP(t) using MH, obtaining another list lP(t + 1) with the positions of the solutions in iteration t + 1. Finally, from lP(t + 1) and lP(t), we obtain a list of velocities lV(t + 1).
Let v^p(t + 1) ∈ lV(t + 1) be the velocity vector in the transition between t and t + 1 corresponding to particle p. The dimension of the vector is n, determined by the number of columns of the problem. Let v^p_i(t + 1) ∈ v^p(t + 1) be the value for dimension i of the vector v^p(t + 1). Then, lV_i(t + 1) corresponds to the list of absolute values of v^p_i(t + 1), for all v^p(t + 1) ∈ lV(t + 1). We then apply db-scan to the list lV_i(t + 1), thereby obtaining the number of clusters nClusters(t + 1) and, for each value, the cluster to which it belongs, lV_iClusters(t + 1). The mechanism of the binary db-scan operator is shown in Algorithm 2.

Transition Operator
The number of clusters and the list with the cluster membership identifier of each element are returned by the db-scan operator. Using these objects, the transition operator returns binarized solutions. To execute this binarization, the identifier Id(J) ∈ Z that identifies each cluster is assigned in an ordered manner: the value 0 is assigned to the cluster whose centroid has the smallest absolute value. As an example, let v_j ∈ J and v_i ∈ I be elements of Clusters J and I, respectively, with abs(v_j) > abs(v_i); then, Id(J) > Id(I). Additionally, in the case that db-scan labels some element as an outlier, Equation (7) is used to assign the probability of transition. In Equation (7), α represents a minimum transition coefficient, β models the separation between the different clusters, and T is the total number of clusters, not considering outliers.
Finally, to execute the binarization process, consider p(t) as the position of a particle in iteration t. Let p_i(t) be the value of dimension i for particle p(t), and let v^p_i(t + 1) be the velocity of particle p(t) in the ith dimension in the transition from iteration t to iteration t + 1. Additionally, let v^p_i(t + 1) ∈ J, where J is one of the clusters identified by the binary db-scan operator. Then, we use Equation (8) to generate the binary positions of the particles in iteration t + 1.
When v^p_i(t + 1) belongs to the outliers, a transition probability is assigned randomly. Finally, after the transition operator is applied, the repair operator described in Section 4.5 is used for solutions that do not satisfy some of the constraints. The details of the transition operator are shown in Algorithm 3.
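To make the flow concrete, the sketch below assigns each cluster a transition probability and applies it dimension-wise. The linear form α + β·Id(J)/(T − 1) is only an assumed instantiation of Equation (7), and the bit-flip rule stands in for Equation (8); outliers (id = -1) draw a random probability, as described above:

```python
import random

def cluster_transition_probs(ids, n_clusters, alpha=0.1, beta=0.8):
    """Illustrative stand-in for Equation (7): clusters whose centroids
    have larger absolute velocity (higher Id) get a larger transition
    probability. alpha is the minimum transition coefficient and beta
    the cluster separation; the linear ramp used here is an assumption,
    not the paper's exact formula."""
    rng = random.Random(0)
    probs = []
    for cid in ids:
        if cid == -1:                   # db-scan outlier: random probability
            probs.append(rng.random())
        else:
            probs.append(alpha + beta * cid / max(n_clusters - 1, 1))
    return probs

def apply_transition(bits, probs, rng):
    """Assumed Equation (8)-style rule: flip each dimension with its
    cluster's transition probability."""
    return [(1 - bit) if rng.random() < pr else bit
            for bit, pr in zip(bits, probs)]

ids = [0, 0, 2, 1, -1]                  # cluster id per dimension
probs = cluster_transition_probs(ids, n_clusters=3)
print(apply_transition([0, 1, 0, 1, 1], probs, random.Random(7)))
```

The key idea is that dimensions whose velocities fall in high-magnitude clusters are more likely to change state, which transfers the continuous algorithm's momentum into the binary space.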

Random Perturbation Operator
Because the MKP is a difficult problem to solve, the algorithm may become trapped in local optima. To address this situation, the optimization is complemented by a perturbation operator. Once the condition that the solution has not improved is met, the perturbation operator performs a set of random deletions whose number, η_ν, is obtained by multiplying the parameter ν, which must be estimated, by the total length of the solution. The procedure is outlined in Algorithm 4.
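A small sketch of this perturbation step, assuming the solution is kept as a list of selected items and that η_ν = round(ν · |solution|) elements are deleted uniformly at random:

```python
import random

def perturb(solution, nu, rng):
    """Random perturbation sketch: remove eta_nu = round(nu * len(solution))
    randomly chosen elements from the (list-form) solution. nu is the
    tunable fraction described in the text."""
    eta = int(round(nu * len(solution)))
    keep = rng.sample(range(len(solution)), len(solution) - eta)
    return [solution[i] for i in sorted(keep)]

sol = [2, 5, 9, 14, 21, 30]             # indices of selected items
print(perturb(sol, nu=0.3, rng=random.Random(3)))   # drops round(1.8) = 2
```

After the deletions, the repair operator's second phase can greedily refill the freed capacity, so the perturbation acts as a controlled restart around the incumbent.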

Repair Operator
Because the transition operator makes modifications to the solution, it may happen that the new solution does not respect some of the constraints of the optimization problem. This article solves this difficulty through a repair operator. The operator takes the solution to be repaired as input and returns a repaired solution as output. The operator first asks whether the solution needs repair. If so, the repair procedure uses Equation (6) to rank the elements to be eliminated. This elimination procedure runs until the solution obtained meets all the constraints. Subsequently, the possibility of incorporating new elements is verified. To rank the elements to incorporate, Equation (6) is used again. After completing this procedure, the repair operator returns the repaired solution. The pseudocode of this process is given in Algorithm 5.

Algorithm 5: Repair operator (fragment)
  while (bRepair == True) do
    p_max ← MaxWeight(p)
    p ← removeElement(p, p_max)
    bRepair ← Repair(p)
  end while
  state ← False
  while (state == False) do
    p_min ← MinWeight(p)
    if (p_min == ∅) then
      state ← True
    else
      p ← addElement(p, p_min)
    end if
  end while
  return p
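The two phases of the repair flow can be sketched as follows; `weight` stands in for the Equation (6) ranking, with plain profit used here only for illustration. Because this stand-in treats higher weight as more desirable, phase 1 drops the minimum-weight element, whereas the paper's measure ranks elements for elimination and removes the maximum:

```python
def repair(sol, p, c, b, weight):
    """Repair sketch in two phases: drop the worst-ranked elements until
    all constraints hold, then greedily re-add the best-ranked elements
    that still fit. weight(j) plays the role of the Equation (6) ranking
    (here: higher is better)."""
    m, n = len(b), len(p)
    sol = list(sol)

    def infeasible(s):
        return any(sum(c[i][j] for j in s) > b[i] for i in range(m))

    while infeasible(sol):                      # phase 1: remove worst
        worst = min(sol, key=weight)
        sol.remove(worst)
    while True:                                 # phase 2: add best that fits
        cand = [j for j in range(n) if j not in sol
                and not infeasible(sol + [j])]
        if not cand:
            return sorted(sol)
        sol.append(max(cand, key=weight))

# Toy instance (hypothetical numbers).
p = [10, 7, 12, 4]
c = [[3, 2, 4, 1], [2, 3, 3, 2]]
b = [7, 6]
w = lambda j: p[j]                      # toy ranking: plain profit
print(repair([0, 1, 2, 3], p, c, b, w)) # [0, 2]
```

Both loops terminate: phase 1 strictly shrinks the solution and the empty solution is always feasible, while phase 2 strictly grows it under finite capacity.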

Results and Discussion
To determine the importance of the db-scan operator in the binarization, three groups of experiments were defined. The first group of experiments aimed to define a comparison baseline. This baseline is determined through random operators. The results and comparisons of db-scan with the random operators are developed in Section 5.2. The second group of experiments compared db-scan with k-means. K-means is another clustering technique frequently used in data analysis. The comparison and its results are described in Section 5.3. The design of the binarization framework using k-means is found in [50]. Finally, the third group of experiments developed the comparison between db-scan and TF, which is detailed in Section 5.4. In this last case, state-of-the-art algorithms were used to make the comparison. Additionally, the methodology to determine the parameters involved in the algorithms used in the binarization process is detailed in Section 5.1. To carry out the different experiments, the PSO and CS algorithms were used. They were chosen mainly because they are simple to parameterize, both have successfully solved a large number of optimization problems [2,5,80–82], and there are simplified convergence models for CS [83] and PSO [56].
The cb.5.500, cb.10.500, and cb.30.500 instances, which correspond to the most difficult instances of the Beasley OR-library (http://www.brunel.ac.uk/mastjjb/jeb/orlib/mknapinfo.html), were selected to carry out the experiments. The algorithms were implemented in Python 2.7 and executed on a laptop running Windows 10 with an Intel Core i7-8550U processor and 16 GB of RAM. As a statistical test to measure significance, the Wilcoxon signed-rank non-parametric test was used.

Parameter Settings
The methodology used to determine the parameters is based on the evaluation of four measures. These measures are defined in Equations (9)-(12); radar plots of the four measures are then used to determine the appropriate parameterization. More detail on the methodology can be found in [29,50].
1. The percentage deviation of the best value obtained in the ten executions from the best known value (Equation (9)).
2. The percentage deviation of the worst value obtained in the ten executions from the best known value (Equation (10)).
3. The percentage deviation of the average value obtained in the ten executions from the best known value (Equation (11)).
4. The normalized convergence time for the best value in each experiment (Equation (12)).

For PSO, the coefficients c1 and c2 were set to 2. The parameter ω decreased linearly from 0.9 to 0.4. For the parameters used by db-scan, the minimum number of neighbors (minPts) was estimated as a percentage of the number of particles (N); specifically, N = 30 and minPts = 10. To select the parameters, the cb.5.250 problems were chosen. The parameter settings are shown in Tables 1 and 2. In both tables, the column labeled "Value" represents the selected value, and the column labeled "Range" corresponds to the set of scanned values.
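The four measures above can be computed as in the following sketch. The normalization reference used for the convergence time (the slowest run) is an assumption of this illustration, not taken from the paper.

```python
def evaluation_measures(results, best_known, times):
    """The four parameter-setting measures (Equations (9)-(12)).

    results:    objective values of the ten executions (maximization)
    best_known: best known value of the instance
    times:      convergence times of the executions; normalized here
                against the slowest run (an assumption of this sketch)
    """
    dev = lambda v: 100.0 * (best_known - v) / best_known
    return {
        "best_dev":  dev(max(results)),                 # Equation (9)
        "worst_dev": dev(min(results)),                 # Equation (10)
        "avg_dev":   dev(sum(results) / len(results)),  # Equation (11)
        "norm_time": min(times) / max(times),           # Equation (12)
    }
```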

The Contribution of the db-Scan Binary Operator
This section aims to determine the contribution of the db-scan operator to the MKP results. For this purpose, two random operators are designed that serve as a baseline for the comparison. The first random operator uses a fixed transition probability regardless of the velocity of the particle. We denote this operator B-rand, and we use the two transition probability values 0.3 (B-rand3) and 0.5 (B-rand5). The second operator additionally incorporates the cluster concept: three clusters are defined, and each is assigned a transition probability among the values {0.1, 0.3, 0.5}. This operator is denoted BC-rand3. To develop comparisons between the different algorithms, CS is used as the optimization algorithm, and all implementations use the same initiation, perturbation, and repair operators.
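The two baseline operators can be sketched as follows. This is an illustrative implementation in which the transition is interpreted as a bit flip; the function names are ours.

```python
import random

def b_rand(bits, p_transition):
    """B-rand: apply the transition (here, a bit flip) with a fixed
    probability, ignoring the particle velocity.
    p_transition = 0.3 gives B-rand3 and 0.5 gives B-rand5."""
    return [1 - b if random.random() < p_transition else b for b in bits]

def bc_rand(bits, cluster_of, probs=(0.1, 0.3, 0.5)):
    """BC-rand3: the flip probability of each bit depends on the cluster
    (0, 1, or 2) that the corresponding dimension was assigned to."""
    return [1 - b if random.random() < probs[cluster_of[i]] else b
            for i, b in enumerate(bits)]
```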
To make the comparisons, the set of problems cb.5.500 of the OR-library was used and divided into three groups: Group 0, Problems 0-9; Group 1, Problems 10-19; and Group 2, Problems 20-29. The results of the comparison of the db-scan algorithm with the B-rand and BC-rand operators are shown in Table 3 and in Figure 3.
When comparing the best values in Table 3, we observe that db-scan performs the same as or better than the random operators in all instances; the Wilcoxon test indicates that this difference is significant, although its magnitude is small. In the case of the avg indicator, db-scan is again superior in all cases, and here the magnitude of the difference is much larger; the Wilcoxon test again indicates that it is statistically significant. We must emphasize that, in this experiment, the only operator that was replaced was db-scan. Observing the shape of the distributions in the violin plots, we see that the median, the interquartile range, and the dispersion are much more robust for db-scan than for the rest of the algorithms.

K-means Algorithm Comparison
In this section, we develop a comparison between the db-scan and k-means clustering techniques. The k-means technique has been used to obtain binary versions of swarm intelligence algorithms and has been successfully applied to the set-covering problem [50] and to the knapsack problem [29]. Unlike db-scan, k-means needs the number of clusters to be set; therefore, this number is a parameter to estimate. In this experiment, the initiation, repair, and perturbation operators were exactly the same, and we only modified the binarization mechanism by replacing db-scan with k-means. For k-means, guided by the results obtained in [29], the number of clusters was set to k = 5, and we worked with the set of problems cb.30.500 to make the comparisons. As in the previous experiments, we used three groups: Problems 0-9 as Group 0, Problems 10-19 as Group 1, and Problems 20-29 as Group 2. The results are shown in Table 4 and Figure 4.

When we analyze the best known and average indicators, we see that there is very little difference between the results obtained by the two techniques. However, when we perform a group analysis, we see that db-scan performs better than k-means in Group 0, and the Wilcoxon test indicates that this difference is statistically significant. On the other hand, k-means performs better than db-scan in Groups 1 and 2: for Group 1, the difference is not significant, while for Group 2 it is significant in favor of the algorithm that uses k-means. The violin plot distributions do not show a relevant difference between the algorithms. Visually, the interquartile range of the algorithm that uses db-scan obtains better values in Group 0, whereas in Group 2 k-means obtains better values. The dispersion is similar across the groups, although in Group 0 a greater dispersion is observed for the algorithm that uses db-scan.
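To illustrate the clustering step that both techniques supply to the binarization, a minimal one-dimensional DBSCAN over velocity values can be written as follows. The values of ε, minPts, and the sample data are illustrative; in the hybrid algorithm, each resulting cluster is then assigned a transition probability.

```python
def dbscan_1d(values, eps, min_pts):
    """Minimal one-dimensional DBSCAN.

    Returns one cluster label per value; -1 marks noise points.
    """
    labels = [None] * len(values)          # None = unvisited

    def region(i):                         # indices within eps of values[i]
        return [j for j, w in enumerate(values) if abs(w - values[i]) <= eps]

    cluster = -1
    for i in range(len(values)):
        if labels[i] is not None:
            continue
        if len(region(i)) < min_pts:
            labels[i] = -1                 # provisionally noise
            continue
        cluster += 1                       # i is a core point: new cluster
        labels[i] = cluster
        queue = region(i)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster        # noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(region(j)) >= min_pts:  # j is also a core point: expand
                queue.extend(region(j))
    return labels
```

Unlike k-means, no cluster count is given: the number of clusters emerges from the data, which is precisely why db-scan needs one less parameter to tune.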

Transfer Function Comparison
In this section, we compare the algorithm that uses db-scan with another general binarization mechanism based on transfer functions. As described in Section 3.1 and in [14], this binarization uses a transfer function T : R → [0, 1] to transform the particle velocity into a value in [0, 1], which intuitively represents a probability. Subsequently, through a binarization mechanism, this probability becomes 0 or 1. In this comparison, we use the two best algorithms known to us that have solved the MKP using transfer functions as a binarization mechanism.
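This two-step mechanism can be sketched as follows, using the tanh-family transfer function discussed below together with a standard random binarization rule (one common choice among several variants).

```python
import math
import random

def transfer_binarize(velocities, tau=1.5):
    """Two-step transfer-function binarization.

    Step 1: T(x) = (e^(tau*|x|) - 1) / (e^(tau*|x|) + 1) maps each
            velocity to a probability in [0, 1).
    Step 2: a random binarization rule turns that probability into a
            bit (one common choice; elitist variants also exist).
    """
    bits = []
    for v in velocities:
        t = (math.exp(tau * abs(v)) - 1) / (math.exp(tau * abs(v)) + 1)
        bits.append(1 if random.random() < t else 0)
    return bits
```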
The first algorithm used in the comparison, the binary artificial algae algorithm (BAAA), was developed in [35]. The BAAA uses tanh(τ|x|) = (e^(τ|x|) − 1)/(e^(τ|x|) + 1) as a transfer function, setting the parameter τ to a fixed value of 1.5. Additionally, the BAAA incorporates an elitist local search operator with the aim of improving the quality of solutions. For the execution of the algorithm, an Intel Core 2 Quad Q9300 @ 2.5 GHz with 4 GB of RAM and the 64-bit Windows 7 operating system was used. The maximum number of BAAA iterations was 35,000.
In Table 5, the results of the comparison are shown, with the best result marked in bold. When analyzing the best indicator, it is observed that db-scan-CS obtains 25 best values, db-scan-PSO 21, and the BAAA 6; the sum is greater than 30 because some values are repeated. In the case of the average indicator, db-scan-CS obtained 16 best values, db-scan-PSO 13, and the BAAA 1. The Wilcoxon test indicates that the difference is significant.

The second algorithm corresponds to the binary differential search (BDS) algorithm designed in [84]. This algorithm uses tanh(τ|x|) = (e^(τ|x|) − 1)/(e^(τ|x|) + 1) as a transfer function and, as binarization mechanisms, a random procedure (TR-BDS) and an elitist procedure (TE-BDS). In the transfer function, the parameter τ was set to a value of 2.5, and each BDS variant used a maximum of 10,000 iterations. The BDS experiments were developed in MATLAB 7.5 on a PC with an Intel Core i7-4770 processor, 16 GB of RAM, and the Windows operating system. cb.10.500 was used as the dataset in the comparison.

In Table 6, the results obtained by the different algorithms are shown. When we analyze the best indicator, we find that TR-BDS obtains 5 best values and TE-BDS obtains 22; db-scan-PSO and db-scan-CS obtain 3 and 7 best values, respectively. When we compare the average of the best value indicator, we observe that db-scan-CS obtains the best result, followed by db-scan-PSO. This indicates that, although TE-BDS obtains the greatest number of best values, there are cases where its results perform worse than those of db-scan-CS and db-scan-PSO. The Wilcoxon statistical test indicates that this difference is not significant for the case of TE-BDS. When analyzing the average indicator, the TR-BDS algorithm obtained the best value, TE-BDS 5 times, and db-scan-PSO and db-scan-CS 12 times each.
This result confirms that db-scan-PSO and db-scan-CS consistently obtain better values than the BDS variants. The Wilcoxon test indicates that the difference is significant.

Conclusions
In this work, an algorithm is proposed that uses the db-scan clustering technique to enable continuous swarm intelligence metaheuristics to solve COPs. Additionally, the algorithm uses a perturbation operator for cases where the solutions become trapped in a deep local optimum. For the experiments, the 90 largest instances commonly used in the literature were used. In comparison with random operators, binarization with db-scan yields more robust binary versions, consistently obtaining better results and reducing dispersion with respect to the random operators. In the experiments that used TFs, the best algorithms known to us that have solved the MKP using TFs as a binarization method were chosen. In the case of the BAAA, the results of binarization with db-scan were better for both the best and the average indicators. In the case of TE-BDS, the difference was significant on average. In the case of k-means, the results were similar: db-scan performed significantly better in Group 0 and k-means in Group 2.
There are several possible directions for further extensions and improvements of the present work. The first line arises from observing the configuration parameters presented in Tables 1 and 2. The configuration procedure can be simplified and improved by incorporating adaptive mechanisms that modify the parameters in accordance with the feedback obtained from the candidate solutions. The second line is related to what was observed in the comparison between the k-means and db-scan techniques developed in Section 5.3: neither algorithm performs significantly better than the other on all problems. Incorporating an intelligent agent that uses action-value or policy-gradient methods from reinforcement learning could therefore yield a more robust algorithm, able to identify the appropriate technique or parameterization for the problem, or for the stage of the problem being solved. Another possible line of research is to manage the population of solutions dynamically: by analyzing the history of exploration and exploitation of the search space, one can identify regions where it is necessary to increase the population and others where it is appropriate to decrease it. Finally, an interesting line of research is to use new transfer functions, such as those defined in [52,85], and evaluate their performance on a problem such as the MKP. Additionally, as suggested by the research carried out in the previously cited articles, a procedure for optimally estimating the parameter τ can be explored.

Conflicts of Interest:
The authors declare no conflict of interest.