BARRAKUDA: A Hybrid Evolutionary Algorithm for Minimum Capacitated Dominating Set Problem

Abstract: The minimum capacitated dominating set problem is an NP-hard variant of the well-known minimum dominating set problem in undirected graphs. This problem finds applications in the context of clustering and routing in wireless networks. Two algorithms are presented in this work. The first one is an extended version of construct, merge, solve and adapt, while the main contribution is a hybrid between a biased random key genetic algorithm and an exact approach, which we labeled BARRAKUDA. Both algorithms are evaluated on a large set of benchmark instances from the literature. In addition, they are tested on a new, more challenging benchmark set of larger problem instances. On the problem instances from the literature, the performance of our two algorithms is very similar. Moreover, both algorithms clearly outperform the best approach from the literature. In contrast, BARRAKUDA is clearly the best-performing algorithm on the new, more challenging problem instances.


Introduction
Dominating set problems are difficult combinatorial optimization problems from the family of set covering problems. Over the last two decades, they have attracted research interest due to their applications to both clustering and routing in wireless networks [1][2][3]. The most basic dominating set problem is the classical minimum dominating set (MDS) problem. Given an undirected graph G with vertex set V (where |V| = n), a set D ⊆ V is called a dominating set if and only if each vertex v ∈ V either forms part of D or has at least one neighbor v′ which is a member of D. In the latter case, we say that v′ dominates v. Each dominating set is a feasible solution to the MDS problem. The optimization goal of the MDS problem is to find a smallest dominating set in G. In the case of the MDS problem, given a valid solution D, each vertex v ∈ D is said to dominate all its neighbors that do not form part of D themselves.
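The domination condition above can be checked directly. The following minimal sketch (the adjacency-list representation and the small path graph are illustrative, not taken from the paper) tests whether a vertex set D is a dominating set:

```python
# A minimal sketch of the dominating-set check on an adjacency-list graph.

def is_dominating_set(adj, D):
    """True iff every vertex is in D or has at least one neighbor in D."""
    return all(v in D or any(u in D for u in adj[v]) for v in adj)

# Path graph v0 - v1 - v2 - v3 - v4: the set {v1, v3} dominates all vertices.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
```

For the path graph above, `{1, 3}` is a (minimum) dominating set, while `{0, 1}` leaves vertices 3 and 4 undominated.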

Literature Review for the CapMDS Problem
In this work, we focus on an NP-hard variant of the MDS problem known as the minimum capacitated dominating set (CapMDS) problem. The difference to the MDS problem is that the CapMDS problem restricts the number of vertices that each vertex from a solution D can dominate. The CapMDS problem is relevant in the field of wireless communications, and for this reason, several approaches have appeared in the literature in recent years. Even though the CapMDS problem has been shown to be NP-hard [4], some researchers have focused on the development of exact approaches. Cygan et al. [5] presented an algorithm based on maximum matching that runs in O(1.89^n) time. The performance of the algorithm was further improved by considering dynamic programming over subsets [6], with a time complexity of O(1.8463^n). A distributed approximation scheme was developed in [7]. This algorithm achieves an O(log Δ)-approximation in O(log³ n + log(n)/ε) time, where n represents the number of vertices and Δ denotes the maximal degree. Finally, the CapMDS problem has been the subject of a few research studies focused on heuristics and metaheuristic approaches. Potluri and Singh [8] carried out a performance comparison between three greedy heuristics. They showed that the best heuristic is the one that selects, at each step, the vertex that maximizes the minimum between the vertex capacity and the number of uncovered neighbors. The neighboring vertices dominated by this vertex (in case the neighborhood is larger than the capacity of the vertex) are chosen randomly. This greedy heuristic has been used as the basis for the design of two metaheuristic approaches: one based on ant colony optimization (named ACO) and the other one based on genetic algorithms (named SGA) [9]. Later, Li et al. [10] developed an iterated local search approach labeled LS_PD, showing a significantly better performance than ACO and SGA when applied to general graphs with uniform and variable capacities. LS_PD adopts a penalization strategy in the context of the vertex-scoring scheme. Moreover, it makes use of a two-mode dominated vertex selection strategy taking into account both random and greedy decisions for the choice of the neighbors that a chosen vertex should dominate. This is done for the purpose of achieving a balance between the intensification and the diversification of the search process. Recently, Pinacho-Davidson et al. [11] developed a construct, merge, solve and adapt (CMSA) approach for the CapMDS problem. CMSA is a hybrid proposal that integrates a heuristic algorithm and an exact solver to accelerate the solution process, especially designed for the application to large-scale problem instances. Moreover, they applied, for the first time, a high-performance integer linear programming (ILP) solver, namely CPLEX, to all problem instances from the literature. They showed that both CMSA and CPLEX outperform LS_PD.
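As an illustration of the greedy heuristic of Potluri and Singh described above, the following Python sketch repeatedly selects the vertex maximizing the minimum of its capacity and its number of uncovered neighbors, with the dominated neighbors chosen at random. Function names, the fallback for isolated leftovers, and tie-breaking details are our own assumptions and may differ from the original implementation:

```python
import random

def greedy_capmds(adj, cap, seed=0):
    """Greedy sketch: at each step, pick an unselected vertex maximizing
    min(Cap(v), #uncovered neighbors of v); the neighbors it dominates
    are chosen at random, as in the heuristic of Potluri and Singh."""
    rng = random.Random(seed)
    uncovered, solution = set(adj), set()
    while uncovered:
        free = [v for v in adj if v not in solution]
        v = max(free, key=lambda u: min(cap[u], len(set(adj[u]) & uncovered)))
        if v not in uncovered and not (set(adj[v]) & uncovered):
            v = next(iter(uncovered))  # fallback: a leftover covers itself
        solution.add(v)
        uncovered.discard(v)  # a dominator covers itself
        nbrs = list(set(adj[v]) & uncovered)
        rng.shuffle(nbrs)  # dominated neighbors are chosen randomly
        for u in nbrs[:cap[v]]:
            uncovered.discard(u)
    return solution
```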

Literature Review on Related Problems
Apart from the CapMDS problem, various other computationally hard variants of the MDS problem can be found in the related literature. One of them is the so-called minimum connected dominating set (MCDS) problem, which has the additional restriction that a solution D must induce a connected sub-graph of the input graph. Recent algorithmic approaches for the MCDS problem include different local search algorithms [12,13] and a hybrid algorithm combining ant colony optimization with reduced variable neighborhood search [14]. Another example of an extension of the classical MDS problem is the minimum independent dominating set problem, in which a solution D must be an independent set of the input graph in addition to being a dominating set. Recent algorithmic approaches tailored to this problem include a two-phase removing algorithm [15] and a memetic algorithm [16]. The node-weighted versions of these problems are also often considered. The latest algorithms for the minimum weight (vertex) independent dominating set problem include a local search approach that makes use of a reinforcement-learning-based repair procedure [17] and a memetic algorithm [18]. Next, it is worth mentioning the so-called minimum total dominating set problem, which was solved by means of a hybrid evolutionary algorithm in [19]. Finally, it is also worth mentioning that high-quality algorithms for the CapMDS problem can also be useful for solving related problems, such as the capacitated vertex k-center problem; see [20].

Literature Review on Similar Techniques
In this paper, we develop heuristic algorithms for the CapMDS problem that are based on solving a sequence of reduced sub-instances of the original problem instance. Historically, this general idea was first exploited in the context of problem decomposition by techniques from the field of Operations Research, such as Dantzig-Wolfe decomposition (column generation) and Benders decomposition (row generation) [21]. More recently, however, these ideas have also been applied in the context of heuristic optimization algorithms that make use of mathematical programming techniques, known as matheuristics [22]. Large neighborhood search (LNS) [23] and very large-scale neighborhood search [24] are among the most popular matheuristic techniques. These algorithms function as local searches. However, at each iteration they make use of mathematical programming to find an improving solution in a large-scale neighborhood of the incumbent solution. Many LNS approaches are based on the principle of ruin-and-recreate [25], sometimes also called destroy-and-recreate or destroy-and-rebuild. Alternative ways of defining large neighborhoods are local branching [26], the corridor method [27], and POPMUSIC [28].
In this work, we present two algorithms from the field of matheuristics for the CapMDS problem. The first one is a version of construct, merge, solve and adapt (CMSA), whose general idea was first presented in [29]. In CMSA, the sub-instance that is solved at each iteration by a mathematical programming technique is assembled from a set of heuristically constructed solutions. In the second algorithm proposed in this work, BARRAKUDA, the intention is to enhance the idea of CMSA with the learning component present in evolutionary algorithms.

Organization of the Paper
The remainder of the paper is organized as follows. The considered optimization problem is introduced in Section 2, while the proposed algorithms are described in Section 3. Finally, a comprehensive experimental evaluation is provided in Section 4, and conclusions, as well as indications on future lines of work, are presented in Section 5.

The CapMDS Problem
We recall some preliminary definitions before describing the CapMDS problem. Let G = (V, E) be an undirected graph on a set of n vertices V = {v_1, v_2, . . ., v_n} and a set of edges E. We assume that G contains neither edges from a vertex v ∈ V to itself (loops) nor multi-edges. Two vertices are neighbors (also called adjacent to each other) if and only if there exists an edge between them, that is, v ∈ V and u ∈ V are said to be neighbors if and only if (v, u) ∈ E. For a vertex v, let N(v) := {u ∈ V | (v, u) ∈ E} denote the set of neighbors, known as the open neighborhood (or simply neighborhood) of v in G. Furthermore, the closed neighborhood of a vertex v ∈ V, denoted by N[v], contains the vertices adjacent to v in addition to v itself, that is, N[v] := N(v) ∪ {v}. The degree deg(v) of v is the cardinality of the set of neighbors of v, that is, deg(v) = |N(v)|. As already mentioned above, a dominating set of G is a (sub)set S ⊆ V such that each vertex v ∈ V \ S is adjacent to at least one vertex in S. Each vertex in S is called a dominator; any other vertex is said to be dominated. A dominator dominates itself and some or all of its neighbors.
A problem instance of the CapMDS problem is a tuple (G, Cap) that consists of an undirected graph G = (V, E), without loops or multi-edges, and a capacity function Cap : V → N. This function assigns a positive integer value Cap(v) to each vertex v ∈ V, representing the maximum number of adjacent vertices this vertex is allowed to dominate. Any dominating set S ⊆ V is a potential solution to the CapMDS problem. Such a dominating set S is a valid solution if it is possible to identify a set {C(v) | v ∈ S} that (1) contains for each dominator v ∈ S the (sub-)set C(v) ⊆ N(v) \ S of those neighbors of v that are chosen to be dominated by v, and (2) fulfills the following conditions:

1. S ∪ (∪_{v∈S} C(v)) = V, that is, all vertices from V are either chosen to be a dominator or are dominated by at least one dominator.
2. |C(v)| ≤ Cap(v) for all v ∈ S, that is, all chosen dominators v ∈ S dominate at most Cap(v) of their neighbors.
Finally, the objective function value (to be minimized) is defined as f(S) := |S|. Figure 1 presents an illustrative example of the CapMDS problem. While Figure 1a displays an example graph, Figure 1b shows an optimal solution (black vertices) considering a uniform capacity of 2 for each vertex. Notice that the sets of dominated neighbors of each node in S are indicated by bold edges. Vertices v_5 and v_9, for example, form set
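Conditions (1) and (2) above can be verified mechanically. The sketch below (the data structures and the small example graph are illustrative assumptions) checks whether a pair (S, C) constitutes a valid CapMDS solution:

```python
def is_valid_capmds(adj, cap, S, C):
    """Check a candidate CapMDS solution: C maps each dominator v in S to
    the set C(v) of neighbors it dominates. Valid iff C(v) is a subset of
    N(v) \ S, |C(v)| <= Cap(v), and S together with all C(v) covers V."""
    covered = set(S)
    for v in S:
        cv = set(C.get(v, ()))
        if not cv <= set(adj[v]) - set(S):
            return False  # C(v) must contain only non-dominator neighbors
        if len(cv) > cap[v]:
            return False  # capacity of v exceeded
        covered |= cv
    return covered == set(adj)  # condition (1): every vertex covered
```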

An ILP Formulation for the CapMDS Problem
The CapMDS problem can be modeled in terms of the following ILP model [11]. This model can be seen as an improvement over the model originally presented in [10], because it has fewer binary variables. First, a binary variable x_i is associated with each vertex v_i ∈ V, indicating whether or not v_i is chosen to be included in the solution. Second, for each edge (v_i, v_j) ∈ E, the model contains binary variables y_ij and y_ji. Hereby, variable y_ij takes value one if vertex v_i dominates vertex v_j. Likewise, variable y_ji takes value one if v_j dominates v_i.
In the above formulation, constraints (2) enforce that all vertices that are not part of the solution are dominated by at least one neighbor. Constraints (3) make sure that the total number of vertices dominated by a given vertex v_i is limited by Cap(v_i). In this way, a vertex v_i can dominate at most Cap(v_i) vertices from its neighborhood. The constraints in (4) ensure that a vertex v_i can dominate a neighbor v_j only if v_i is part of the solution.
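Putting the described variables and constraints together, the model can be written as follows (a reconstruction from the textual description; the original formulation in [11] may differ in notational details):

```latex
\begin{align}
\min \quad & \sum_{v_i \in V} x_i \tag{1}\\
\text{s.t.} \quad
  & x_i + \sum_{v_j \in N(v_i)} y_{ji} \geq 1 && \forall\, v_i \in V \tag{2}\\
  & \sum_{v_j \in N(v_i)} y_{ij} \leq Cap(v_i) && \forall\, v_i \in V \tag{3}\\
  & y_{ij} \leq x_i && \forall\, v_i \in V,\ v_j \in N(v_i) \tag{4}\\
  & x_i \in \{0,1\}, \quad y_{ij} \in \{0,1\} \notag
\end{align}
```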

Proposed Algorithms
In this paper, we present the application of two hybrid algorithms for the CapMDS problem. Both proposals can be seen as algorithms from the category of "hybrid algorithms based on problem instance reduction" [30]. The main goal of this type of hybridization is to allow the application of an exact solver to large problem instances for which the direct use of the exact solver would not be possible or efficient, given the instance size. For this purpose, this class of algorithms provides functionalities for the intelligent reduction of problem instances in order to obtain smaller sub-instances.
The first algorithm, labeled CMSA++, is a revised and extended version of the construct, merge, solve and adapt (CMSA) technique presented in prior work [11]. The second algorithm is a new proposal, called BARRAKUDA, which makes use of the algorithmic framework of a biased random key genetic algorithm (BRKGA) [31]. In Figure 2, we present a high-level overview of both techniques, in particular to point out the two separate levels of operation. At the original problem instance level of operation, CMSA++ and BARRAKUDA deal with the full-size problem instance and generate a set of solutions needed for the construction of sub-instances. In this context, CMSA++ generates independent solutions by means of a constructive probabilistic heuristic, while BARRAKUDA makes use of BRKGA as a generator of solutions. In contrast to CMSA++, BARRAKUDA makes use of learning for the generation of solutions. This is because it uses the natural learning mechanism of BRKGA, based on the selection of good solutions to act as parents in order to produce offspring for the next iteration. This aspect will be explained in detail in Section 3.2.1. At the reduced problem instance level of operation, both algorithms make use of an ILP solver to deal with the incumbent sub-instance. If this sub-instance is small enough, the solver obtains good results, providing high-quality solutions to the original problem instance over time. The main difference between the two proposals is the strategy for the creation and maintenance of the reduced sub-instances. CMSA++ produces sub-instances by using solution components found in the solutions produced by the constructive probabilistic heuristic, keeping the sub-instance size small enough through a sub-instance maintenance process. This process uses information about the utility of solution components in order to eliminate seemingly useless solution components from the sub-instances. In contrast, BARRAKUDA does not make use of such a maintenance mechanism. Instead, it creates sub-instances from scratch at each iteration. For this purpose, BARRAKUDA uses the solution components from a set of solutions selected from a subset of the chromosomes of the incumbent BRKGA population. After the application of the ILP solver to the sub-instance, BARRAKUDA implements a solution encoding process in order to be able to add the high-quality solution produced by the ILP solver to the incumbent population of the BRKGA.
Finally, the main reason for choosing BRKGA, instead of another evolutionary algorithm from the literature, as the main algorithmic framework of BARRAKUDA is the following. A BRKGA is usually very easy to apply, as it only requires finding a clever way of translating the individuals of the BRKGA (which are kept as random keys) into feasible solutions to the tackled problem. All other components, such as mutation and crossover, are the same for any BRKGA. In this way, it is possible to focus all efforts on the hybridization aspect.

CMSA++: An Extension of CMSA
As mentioned above, CMSA++ is a revised and extended version of the CMSA algorithm from [11]. The extensions concern making the algorithm more robust with respect to the parameter values. Note that, in [11], we had to conduct a very fine-grained tuning process in order to achieve good algorithm performance for all problem instances.
Nevertheless, keep in mind that the original CMSA from [11] can be obtained by setting the parameter values of CMSA++ in a specific way. In other words, the parameter tuning process may choose this option if it turns out to be better than the newly introduced variations.
A high-level description of CMSA++ for the CapMDS problem is provided in Algorithm 1. The algorithm maintains, at all times, an initially empty subset V′ of the set of vertices V of the input graph. This set is henceforth called the sub-instance. Vertices are added to sub-instance V′ by means of a probabilistic solution construction process, which is implemented in function ProbabilisticSolutionGeneration(d_rate^*, l_size, g_opt) (see line 10 of Algorithm 1). In particular, all dominator vertices found in the solutions generated by this function are added to V′ (if not already in V′) and their so-called "age value" age[] is initialized to zero (see lines 12 and 13 of Algorithm 1). Moreover, solving the sub-instance V′ refers to the application of the ILP solver CPLEX in order to find, if possible within the imposed time limit of t_solver seconds, the optimal solution to sub-instance V′; that is, the optimal CapMDS solution in G that is limited to only contain dominators from V′. This is achieved by adding the constraints x_i = 0 for all v_i ∈ V \ V′ to the ILP model from the previous section. The process of solving the sub-instance V′ is done in function ApplyExactSolver(V′, t_solver); see line 16 of Algorithm 1.

Algorithm 1 CMSA++ for the CapMDS problem
The algorithm takes as input the tackled problem instance (G, Cap) and values for eight required parameters. The first four of them (see line 2 of Algorithm 1) are inherited from the original CMSA proposal, while the remaining four parameters (see line 3 of Algorithm 1) belong to the extensions that lead to CMSA++. The eight parameters can be described as follows:

1. age_max: establishes the number of iterations that a vertex is allowed to form part of sub-instance V′ without forming part of the solution to V′ returned by the ILP solver (S_opt).
2. n_a: defines the number of solutions generated in the construction phase of the algorithm at each iteration; that is, the number of calls to function ProbabilisticSolutionGeneration().
3. t_solver: establishes the time limit (in seconds) used for running the ILP solver at each iteration.
4. l_size, d_rate^L, d_rate^U: these three parameters determine the greediness of the solution construction process in function ProbabilisticSolutionGeneration(), which is described in more detail in the following subsection.
5. g_opt: indicates the variant of the solution construction process.
6. A_type: chooses between two different behaviors implemented for the sub-instance maintenance mechanism in function Adapt().
As mentioned above, CMSA++ is, like any CMSA algorithm, equipped with a sub-instance maintenance mechanism for discarding seemingly useless solution components from the sub-instance V′ at each iteration. The original implementation of this mechanism works as follows. First, the age values of all vertices in V′ are incremented. Second, the age values of all vertices in S_opt are re-initialized to zero. Third, the vertices with age values greater than age_max are erased from V′. This original implementation is used in case A_type = 0. In the other case, that is, when A_type = 1, a probability age[v]/age_max of being removed from V′ is assigned to each vertex v ∈ V′ with age[v] > 0. Note that these probabilities increase linearly, reaching probability 1 for all vertices with an age value greater than or equal to age_max. Both mechanisms are implemented in function Adapt(V′, S_opt, age_max, A_type) (see line 18 of Algorithm 1). Their aim is to keep V′ small enough to be able to solve the sub-instance (if possible) to optimality. If this is not possible in the allotted computation time, the output S_opt of function ApplyExactSolver(V′, t_solver) is simply the best solution found by CPLEX within the given time.
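The two variants of the maintenance mechanism can be sketched as follows (variable names and the exact handling of the random draws are our own assumptions):

```python
import random

def adapt(V_sub, S_opt, age, age_max, A_type, rng=None):
    """Sub-instance maintenance sketch: increment all ages, reset ages of
    vertices in the solver's solution S_opt, then drop old vertices.
    A_type = 0: remove vertices with age > age_max (hard cutoff).
    A_type = 1: remove each vertex with probability age[v] / age_max."""
    rng = rng or random.Random(0)
    for v in V_sub:
        age[v] += 1
    for v in S_opt:
        age[v] = 0
    if A_type == 0:
        return {v for v in V_sub if age[v] <= age_max}
    return {v for v in V_sub if rng.random() >= age[v] / age_max}
```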
One of the new features of CMSA++ (in comparison to the original version from [11]) is a dynamic mechanism for adjusting the determinism rate (d_rate^*) used during the solution construction at each iteration. The mechanism is the same as the one described in [32]. In the original CMSA, this determinism rate was a fixed value between 0 and 1, determined by algorithm tuning. In contrast, CMSA++ chooses a value for d_rate^* dynamically at each iteration. This is done under the hypothesis that CMSA++ can escape from local optima by increasing the randomness of the algorithm. Whenever an iteration improves S_bsf, the value of d_rate^* is set back to its upper bound (d_rate^U). Otherwise, the value of d_rate^* is decreased by a factor determined by subtracting the lower bound value (d_rate^L) from the upper bound value (d_rate^U) and dividing the result by 3.0. Finally, whenever the value of d_rate^* falls below the lower bound, it is set back to the upper bound value. Adequate values for d_rate^L and d_rate^U must be determined by algorithm tuning. Note that the behavior of the original CMSA algorithm is obtained by setting d_rate^L = d_rate^U. Finally, note that CMSA++ iterates while a predefined CPU time limit is not reached. Moreover, it provides the best solution found during the search, S_bsf, as output.
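The described update rule for the determinism rate can be sketched as follows (function and variable names are illustrative):

```python
def update_d_rate(d_rate, improved, d_rate_l, d_rate_u):
    """Dynamic determinism-rate update sketch: reset to the upper bound
    whenever the best-so-far solution improved; otherwise decrease by
    (d_rate_u - d_rate_l) / 3.0, wrapping back to the upper bound once
    the value falls below the lower bound."""
    if improved:
        return d_rate_u
    d_rate -= (d_rate_u - d_rate_l) / 3.0
    return d_rate_u if d_rate < d_rate_l else d_rate
```

With d_rate^L = d_rate^U, the decrement is zero and the rate stays fixed, which recovers the behavior of the original CMSA.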

Probabilistic Solution Generation
The function used for generating solutions in a probabilistic way (line 10 of Algorithm 1) is presented in Algorithm 2. Note that there are three ways of generating solutions pseudo-coded in this algorithm. They are indicated in terms of three options: opt_1, opt_2, and opt_3. Line 18 of Algorithm 2, for example, is only executed for options opt_1 and opt_3. Note that option opt_1 implements the solution construction mechanism of the original CMSA algorithm for the CapMDS problem from [11]. In the following, first, the general solution construction procedure is described. Subsequently, we outline the differences between the three options mentioned above.
The procedure takes as input the problem instance (G, Cap), as well as values for the parameters l_size, d_rate^* and g_opt. The first two parameters are used for controlling the degree of determinism of the solution construction process, while g_opt indicates one of the three options. The process starts with an empty partial solution S := ∅. At each step, exactly one vertex from a set W ⊆ V is added to S, until the set of uncovered vertices (V_uncov) is empty. Note that both W and V_uncov are initially set to V. Next, we explain how to choose a vertex v* from W at each step of the procedure. First of all, vertices from W are evaluated in function ComputeHeuristicValues(W) by a dynamic greedy function h(), which depends on the current partial solution S. More specifically, h(v) := min{Cap(v), deg_S(v)}, where Cap(v) refers to the capacity of v and deg_S(v) is the number of currently uncovered neighbors of v. The h()-value of a vertex is henceforth called its heuristic value. The choice of a vertex v* is then done as follows. First, a value δ ∈ [0, 1] is chosen uniformly at random. In case δ ≤ d_rate^*, v* is chosen as the vertex that has the highest heuristic value among all vertices in W.
Otherwise, a candidate list L containing the min{l_size, |W|} vertices from W with the highest heuristic values is generated, and v* is chosen from L uniformly at random. Thus, the greediness of the solution construction procedure depends on the values of the determinism rate (d_rate^*) and the candidate list size (l_size). Note that, when deg_S(v*) is greater than Cap(v*), we have to choose which vertices from N(v*) ∩ V_uncov vertex v* is going to cover. In our current implementation, the set of Cap(v*) vertices is simply chosen from N(v*) ∩ V_uncov uniformly at random and stored in C(v*). Finally, W and V_uncov are updated at the end of each step.
When g_opt = opt_1, the solution construction procedure is the same as the one used in the original CMSA algorithm for the CapMDS problem. In particular, W is updated after each choice of a vertex v* in line 18 by removing the newly covered vertices. This concerns the chosen vertex v* and the vertices from C(v*) that were chosen to be covered by v*. Moreover, the greedy function values are always updated according to the current partial solution S (see line 23 of Algorithm 2). In contrast, when g_opt = opt_2, the update of W (see line 20) is done by only removing the chosen vertex v* from the options. The rationale behind this is as follows. Consider the example in Figure 3. In case g_opt = opt_1, node v_1 and the vertices in C(v_1) = {v_2, v_3, v_4, v_5} are removed from W for the next construction step. This removes node v_2 from the set of selectable vertices, although choosing v_2 in the next step would lead to the construction of the optimal solution in this example. Finally, with g_opt = opt_3, the solution construction is the same as with g_opt = opt_1, except that the greedy function values are not recalculated in line 23. The rationale behind this way of constructing solutions is to gain speed. On the negative side, this variant possibly overestimates the heuristic values of the vertices and generally favors the hubs (vertices with a high degree) in the input graph.
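A single vertex-selection step of the construction procedure described above might look as follows in Python (names are illustrative, and only the selection step is shown, not the full Algorithm 2):

```python
import random

def h_value(v, adj, cap, uncovered):
    """Greedy function h(v): minimum between the capacity of v and its
    number of currently uncovered neighbors."""
    return min(cap[v], len(set(adj[v]) & uncovered))

def choose_vertex(W, h, d_rate, l_size, rng):
    """One selection step: with probability d_rate take the vertex with
    the highest heuristic value, otherwise pick uniformly at random from
    a candidate list of the min(l_size, |W|) best vertices."""
    ranked = sorted(W, key=h, reverse=True)
    if rng.random() <= d_rate:
        return ranked[0]
    return rng.choice(ranked[:min(l_size, len(W))])
```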

BARRAKUDA: A Hybrid of BRKGA with an ILP Solver
As mentioned before, the main algorithm proposed in this work is a hybrid between the well-known biased random key genetic algorithm (BRKGA) [31] and an ILP solver. Roughly speaking, the solution components of (some of the solutions in) the population of the BRKGA are joined at the end of each iteration, and the resulting sub-instance is passed to the ILP solver. Afterwards, the solution returned by the ILP solver is fed back into the population of the BRKGA. As in CMSA, the main motivation for this proposal is to profit from the ILP solver even in the context of problem instances that are too large for a direct application of the ILP solver. The advantage of BARRAKUDA over CMSA is the learning component of BRKGA, which is not present in CMSA.

Main Algorithmic Framework
The main algorithmic framework of BARRAKUDA, which is pseudo-coded in Algorithm 3, is that of the BRKGA. In fact, the lines that turn the BRKGA into BARRAKUDA are lines 12-15, which are marked by a different background color. In the following, we first outline the BRKGA components of our algorithm before describing the additional BARRAKUDA components.
BRKGA is a steady-state genetic algorithm which consists of a problem-independent part and a problem-dependent part. It takes as input the tackled problem instance (G, Cap) and the values of four parameters (p_size, p_e, p_m and prob_elite) that are found in any BRKGA. The algorithm starts with the initialization of a population P composed of p_size random individuals (line 4 of Algorithm 3). Each individual π ∈ P is a vector of length 2|V|, where V is the set of vertices of G. In order to generate a random individual, for each position π(i) of π (i = 1, . . ., 2|V|) a random value from [0, 1] is generated. In this context, note that the values π(i) and π(i + |V|) are associated with vertex v_i of the input graph G. Then, the elements of the population are evaluated (see line 5 of Algorithm 3). This operation, which translates each individual π ∈ P into a solution S_π (a set of vertices) to the CapMDS problem, corresponds to the problem-dependent part described in the next subsection. Note that, after evaluation, each individual π ∈ P has assigned the objective function value f(S_π) of the corresponding CapMDS solution. Henceforth, f(π) will refer to f(S_π) and vice versa. Then, at each iteration of the algorithm, a set of genetic operators is executed in order to produce the population of the next iteration (see lines 7-11 of Algorithm 3). First, the best max{⌊p_e · p_size⌋, 1} individuals are copied from P to P_e in function EliteSolutions(P, p_e). Then, a set of max{⌊p_m · p_size⌋, 1} mutant individuals are generated and stored in P_m. These mutants are produced in the same way as individuals from the initial population. Next, a set of p_size − |P_e| − |P_m| individuals are produced by crossover in function Crossover(P, P_e, prob_elite) and stored in P_c. Each such individual is generated as follows:
First, an elite parent π_1 is chosen uniformly at random from P_e. Then, a second parent π_2 is chosen uniformly at random from P \ P_e. Finally, an offspring individual π_off is produced on the basis of π_1 and π_2 and stored in P_c. In the context of the crossover operator, value π_off(i) is set to π_1(i) with probability prob_elite, and to π_2(i) otherwise. After generating all new offspring in P_m and P_c, these new individuals are evaluated in line 10. A standard BRKGA iteration would now end after assembling the population of the next iteration (line 11).

Evaluation of an Individual
The problem-dependent part of the BRKGA concerns the evaluation of an individual, which consists of translating the individual into a valid CapMDS solution and calculating its objective function value. For this purpose, the solution generation procedure from Algorithm 2 is applied with option g_opt = opt_2. The only differences are in the choice of vertex v* ∈ W at each step (lines 8-14 of Algorithm 2) and in the choice of the set of neighbors C(v*) to be covered by v*. Both aspects are outlined in the following. Moreover, the resulting pseudo-code is shown in Algorithm 4.
Instead of choosing a vertex v* ∈ W at each iteration of the solution construction process in a semi-probabilistic way, as shown in lines 8-14 of Algorithm 2, the choice is done deterministically, based on combining the greedy function value h(v_i) with its first corresponding value in individual π(i). (Remember that function h() is defined in Equation (7).) More specifically, v* := argmax{h(v_i) · π(i) | v_i ∈ W}. This is also shown in line 7 of Algorithm 4. In this way, the BRKGA will produce, over time, individuals with high values π(i) for vertices v_i that should form part of a solution, and vice versa. The greedy function h() is only important for providing an initial bias towards an area in the search space that presumably contains good solutions. Finally, note that the first |V| values of an individual (corresponding, in the given order, to vertices v_1, . . ., v_n) are used for deciding which vertices are chosen for the solution.
As mentioned above, the second aspect that differs when comparing the evaluation of an individual with the construction of a solution in Algorithm 2 is the choice of the set of neighbors C(v*) to be covered by v* at each step. Remember that, in Algorithm 2, min{|N_uncov(v*)|, Cap(v*)} vertices are randomly chosen from N_uncov(v*) and stored in C(v*). In contrast, for the evaluation of an individual, vertices are sequentially picked from N_uncov(v*) and added to C(v*) until either min{|N_uncov(v*)|, Cap(v*)} vertices are picked or no further uncovered neighbor v_i of v* exists with π(i + |V|) > 0. Denoting the set of remaining (still unpicked) uncovered neighbors of v* by N, at each step the vertex v_j ∈ N that maximizes the product of its number of uncovered neighbors and π(j + |V|) is picked. This is also shown in line 13 of Algorithm 4. Note that, as a greedy function for choosing from the uncovered neighbors of v*, we make use of the number of uncovered neighbors of these vertices, preferring those with a large number of uncovered neighbors. Note also that, for selecting the vertices that are dominated by a chosen node v*, the second half of an individual is used, that is, values π(|V| + 1), . . ., π(2|V|). In other words, with the second half of an individual, the BRKGA learns by which node a non-chosen node should be covered.
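The two decoder choices, the dominator selection and the choice of C(v*), can be sketched as follows. This is a sketch under assumptions: 0-based indexing is used instead of the paper's 1-based indexing, and the combination of greedy value and random key is assumed to be a product in both cases:

```python
def choose_dominator(W, pi, adj, cap, uncovered):
    """Decoder sketch: deterministically pick the vertex v_i in W that
    maximizes h(v_i) * pi[i], where h(v) = min(Cap(v), #uncovered
    neighbors) and pi[i] is the vertex's first-half random key."""
    def h(v):
        return min(cap[v], len(set(adj[v]) & uncovered))
    return max(W, key=lambda v: h(v) * pi[v])

def choose_dominated(v, pi, adj, cap, uncovered, n):
    """Sketch of the choice of C(v): sequentially pick uncovered neighbors
    with a positive second-half key pi[n + u], preferring those with many
    uncovered neighbors, until Cap(v) vertices are picked or none remain."""
    chosen = []
    cands = {u for u in adj[v] if u in uncovered and pi[n + u] > 0}
    while cands and len(chosen) < cap[v]:
        u = max(cands, key=lambda w: len(set(adj[w]) & uncovered) * pi[n + w])
        chosen.append(u)
        cands.discard(u)
    return chosen
```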

From Brkga to BARRAKUDA
Finally, it remains to describe lines 12-15 of Algorithm 3, which are an addition to the standard BRKGA algorithm and which convert the algorithm into BARRAKUDA. This part makes use of three BARRAKUDA-specific parameters:

• Parameter t solver is used for establishing the time limit given to the ILP solver in function ApplyExactSolver(V', t solver ).

• Parameter ex mode is used for determining the way in which individuals/solutions are chosen from the current BRKGA population for building a sub-instance V'. This is done in function ChooseSolutions(P, ex mode , n a ).

• Parameter n a refers to the number of individuals that must be chosen from P for the construction of the sub-instance.
The BARRAKUDA-specific part of the algorithm starts by choosing exactly n a < p size individuals from the current population P and storing their corresponding solutions (which are known, because these individuals are already evaluated) in a set P ilp . Parameter ex mode indicates the way in which these n a solutions are chosen. In case ex mode = ELITIST, the n a best individuals from P are chosen. Otherwise (when ex mode = RANDOM), the best individual from P, in addition to n a − 1 randomly chosen solutions, is added to P ilp . Subsequently, the vertices of all solutions from P ilp are merged into set V', and CPLEX is applied in function ApplyExactSolver(V', t solver ) in the same way as in the case of CMSA++. Finally, the solution S opt returned by the solver is fed back to BRKGA in function IntegrateSolverSolution(P, S opt ). More specifically, S opt is transformed into an individual π, which is then added to P. Afterwards, the worst individual is deleted from P in order to keep the population size constant. The integration process is quite straightforward, because it only considers information about the vertices (dominators) used in the solver's solution and not the domination relationships (edges). More specifically, BARRAKUDA creates a new individual π where π(i) := 0.5 for all i ∈ {|V| + 1, |V| + 2, . . ., 2|V|}. Moreover, for each dominating vertex v i in the solution of the solver, π(i) := 1.0, whereas for each remaining vertex v j the value of π(j) is set to zero.
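The encoding of the solver's solution as a new individual follows directly from this description and can be sketched as follows (0-based indices; the helper name is hypothetical):

```python
def solution_to_individual(dominators, n):
    """Encode the ILP solver's solution S_opt as a random-key individual.

    dominators -- set of 0-based indices of the vertices chosen as
                  dominators in the solver's solution
    n          -- number of vertices |V|; individuals have length 2n

    Following the integration rule described above: first-half keys are
    1.0 for dominators and 0.0 otherwise, and every second-half key is
    set to the neutral value 0.5, since the domination relationships
    (edges) of the solver's solution are not transferred.
    """
    pi = [0.0] * (2 * n)
    for i in dominators:
        pi[i] = 1.0
    for i in range(n, 2 * n):
        pi[i] = 0.5
    return pi

# A 3-vertex instance whose solver solution uses vertices 0 and 2:
print(solution_to_individual({0, 2}, 3))  # → [1.0, 0.0, 1.0, 0.5, 0.5, 0.5]
```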

Experimental Evaluation
In this section, we present the comparison of our two algorithmic proposals: CMSA++ and BARRAKUDA. For this purpose, we employ four different sets of problem instances. Two of them were presented in [9] and previously used in [11]. The remaining two instance sets are newly generated, because we noticed that most of the existing problem instances from the first two data sets are too small, and thus too easy to solve, to be challenging. All four benchmark sets are described in the following subsection.
We implemented all algorithms from scratch in ANSI C++ and compiled the code with GCC 8.3.0 (Ubuntu/Linux). Moreover, IBM ILOG CPLEX Optimization Studio V12.8.0 was executed in single-threaded mode, both as a standalone application (henceforth simply called CPLEX) and within CMSA++ and BARRAKUDA for solving the generated sub-instances. All experiments were conducted on the computing cluster of the Engineering Faculty of the University of Concepción (Luthier). Luthier is composed of 30 computing nodes, all with the same hardware and software configuration. In particular, all nodes have an Intel Xeon E3-1270 v6 CPU at 3.8 GHz with 64 GB RAM. The cluster uses SLURM v17.11.2 for management and job scheduling purposes. The time limit was 1000 CPU seconds per algorithm and per run, for all problem instances.

Benchmark Instances
The first two benchmark sets were taken from the literature. They were initially presented in [9] and later used for the experimental evaluation of the current state-of-the-art method in [10]. The first benchmark set contains 120 unit disk graphs (UDGs) created using the topology generator from [33]. In these instances, the vertices of each graph are randomly distributed over a Euclidean square of size 1000 × 1000. Moreover, the graphs are generated with two different range values: 150 and 200 units. Note that with a range value of 150, for example, an undirected edge is introduced between each pair of vertices whose Euclidean distance is at most 150 units. Accordingly, graphs on the same number of vertices become denser with a growing range value. The second benchmark set consists of 180 general graphs taken from the set of so-called type I instances originally proposed by Shyu et al. [34] in the context of the minimum weight vertex cover problem. For each graph size (in terms of the number of vertices), this benchmark set contains graphs of three different densities, as indicated by their number of undirected edges. In both benchmark sets, the number of vertices of the graphs is from {50, 100, 250, 500, 800, 1000}. Finally, graphs with vertices of two types of capacities, namely uniform capacities and variable capacities, can be found in these benchmark sets. In the case of uniform capacity, three different capacities of 2, 5 and α are tested, where α is graph-specific and is taken as the average degree of the corresponding graph. In the case of variable capacities, the vertex capacities are randomly chosen from the following three intervals: (2, 5), (α/5, α/2) and [1, α]. Note that the instance files come with the capacities already explicitly assigned. Note also that, for each combination of graph size (number of vertices), density (as determined by range values, respectively number of edges) and capacity type/range, the benchmark sets contain 10 randomly generated problem instances.
As we had already noticed in our preliminary work [11] that many of the problem instances from the above-mentioned benchmark sets are easily solved by CPLEX, we decided to generate more challenging instances, as follows. The first set is composed of general random graphs with 1000 or 5000 nodes. In this set, the number of edges depends on a parameter called the edge probability (ep). This parameter is used in the construction of the graphs, where each possible edge is generated with probability ep. The probabilities that we considered are from {0.05, 0.15, 0.25}. Finally, we chose to generate instances with a uniform capacity of α, and others with a variable capacity for each vertex randomly chosen from [1, α]. The second one of the new benchmark sets is composed of random geometric graphs that are generated in a similar way as the unit disk graph instances mentioned above. All vertices are randomly distributed over the unit Euclidean square, that is, a square of size 1.0 × 1.0, and a connection radius r determines the edges. In particular, all vertices whose Euclidean distance is at most r are connected by an edge. Note that the radius r determines the density of the resulting graph. We used radius values from {0.14, 0.24, 0.34}. For this set we also considered instances with 1000 and 5000 vertices, as well as uniform capacity graphs with capacity α assigned to each node and variable capacity graphs with a random capacity from [1, α] assigned to each vertex. In both cases (general random graphs and random geometric graphs) we generated 10 instances for each combination of graph size, density, and capacity type/size.
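Both graph construction rules described above can be sketched in a few lines (this is a sketch under the stated rules; the function names and the use of Python's standard random module are our own choices, not the paper's actual generator):

```python
import itertools
import math
import random

def general_random_graph(n, ep, rng):
    """Each of the n*(n-1)/2 possible edges is generated with probability ep."""
    return [(u, v) for u, v in itertools.combinations(range(n), 2)
            if rng.random() < ep]

def random_geometric_graph(n, r, rng):
    """Vertices are placed uniformly at random on the unit square; two
    vertices are connected whenever their Euclidean distance is at most r."""
    pos = [(rng.random(), rng.random()) for _ in range(n)]
    return [(u, v) for u, v in itertools.combinations(range(n), 2)
            if math.dist(pos[u], pos[v]) <= r]

rng = random.Random(42)
# With ep = 1.0 every possible edge appears; with r = sqrt(2) any two
# points of the unit square are within range, so both graphs are complete.
print(len(general_random_graph(10, 1.0, rng)))             # → 45
print(len(random_geometric_graph(10, math.sqrt(2), rng)))  # → 45
```

The uniform capacity α (the average degree) would then be 2|E|/n for the generated edge list E, and variable capacities would be drawn from [1, α] per vertex.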

Tuning Process
As mentioned before, the reason for introducing CMSA++, which is an extension of the CMSA algorithm presented in preliminary work [11], was that CMSA was not robust at all. Its parameters had to be tuned separately for many sub-groups of the set of available benchmark instances; only in this way could the algorithm obtain very good results. In order to confirm that CMSA++ and BARRAKUDA are sufficiently robust algorithms, a significantly lower number of algorithm configurations than the number produced in [11] for CMSA is used for the final evaluation of CMSA++ and BARRAKUDA. In particular, we produce different algorithm configurations only for 12 subsets of the instances from the literature. This is in contrast to the 36 configurations that were produced for CMSA in [11]. Remember in this context that CMSA++ is, from an algorithmic point of view, a superset of CMSA. This means that, if the proposed extensions are not found to be useful, the tuning process would choose the parameter settings such that CMSA++ behaves exactly like CMSA.
As outlined in the previous section, CMSA++ requires well-working values for the parameters {age max , n a , t solver , l size , d rate L , d rate U , A type , g opt }, while BARRAKUDA requires values for the parameters {p size , p e , p m , prob elite , n a , ex mode , t solver }. Please refer to Section 3 for a comprehensive description of their function. We employed the scientific parameter tuning tool irace [35] for determining the values of these parameters. The parameter value domains for all the tuning runs were chosen as follows.
CMSA++ parameter domains: age max ∈ {1, . . .}. Hereby, real-valued parameters, such as d rate L , are all treated with a precision of two decimal positions; that is, parameter d rate L , whose domain is [0, 0.9], can take values from {0.0, 0.01, . . ., 0.89, 0.9}.
Tables 1 and 2 show the parameter value configurations of CMSA++ and BARRAKUDA for the problem instances from the literature. Note that we distinguish between small problem instances (marked by S in the first column of each table) and large problem instances (marked by L). Hereby, configuration S is for all instances with {50, 100, 250} vertices, and configuration L is for the remaining (large) problem instances. Moreover, tuning was performed for three different capacity types; see the second column of each table. Parameter values from the rows with {2, (2, 5)} in the second column are, for example, for all instances with a uniform capacity of 2 and for all instances with a variable capacity from (2, 5). Tables 3 and 4 show the parameter values determined by irace for the instances from the new data set of large random graphs (RGs) and large random geometric graphs (RGGs) for CMSA++ and BARRAKUDA, respectively. In these cases, the tuning for the uniform capacity problems was done separately from that for the variable capacity problems.
Finally, note that the original BRKGA algorithm can be obtained from BARRAKUDA by setting n a = 1, which means that the sub-instance V' is built from only one solution and that the ILP solver can therefore not find any better solution in the sub-instance. In fact, the setting n a = 1 was determined by irace only very few times, for example, for the instances of the first row in Table 2a. In other words, in these cases the original BRKGA algorithm is better than BARRAKUDA. In the overwhelming majority of cases, however, BARRAKUDA is significantly better than BRKGA.

Results
In addition to CPLEX, CMSA++ and BARRAKUDA were applied with the corresponding parameter values from above to all problem instances exactly once. The time limit for each run was set to 1000 CPU seconds per algorithm. The results for the problem instances from the literature are presented in numerical form in Tables 5-8. Each table row contains the solution quality (columns with heading q) and the time at which this result was obtained (columns with heading t), averaged over 10 problem instances, for each of the three algorithms. Moreover, we show the results of LS_PD, a local search based algorithm that was the state-of-the-art approach for the CapMDS problem before the publication of CMSA. In addition to solution quality and computation time, in the case of CMSA++ and BARRAKUDA we also provide the average gap (in percent) over the respective 10 problem instances, either with respect to the optimal solutions obtained by CPLEX or, if that was not possible, with respect to the optimal MDS solutions of the respective problem instances. In the former case, the gaps are marked by a superscript "a", and in the latter case by a superscript "b". Those cases in which not even the optimal MDS solutions could be computed (due to excessive computation times) are marked with "--". The best results per table row are indicated in bold font. Moreover, the result of CPLEX is shown with a grey background in case the result is proven optimal. Finally, results with a preceding asterisk indicate new best-known results. In order to summarize these results, we additionally generated so-called critical difference (CD) plots for subsets of these problem instances (see the graphics in Figure 4). For their generation, first, the Friedman test was used to compare CPLEX, CMSA++ and BARRAKUDA simultaneously. (LS_PD was not included in the critical difference plots because, unfortunately, the detailed results per instance were not available. In any case, LS_PD is clearly worse than the other three techniques.) The hypothesis that the techniques perform equally was rejected. Subsequently, the algorithms were compared pairwise by the Nemenyi post hoc test [36]. The obtained results can be shown graphically in the form of the above-mentioned CD plots, in which each algorithm is placed on the horizontal axis according to its average ranking for the considered subset of problem instances. The performances of those algorithm variants that are below the critical difference threshold (computed with a significance level of 0.05) are considered statistically equivalent. Such cases are shown by additional horizontal bars joining the average ranking markers of the respective algorithm variants. (The statistical evaluation was conducted using R's scmamp package [37], available at https://github.com/b0rxa/scmamp.) The following observations can be made:

• First of all, CPLEX as well as CMSA++ and BARRAKUDA clearly outperform LS_PD on most problem instances.

• However, most of the instances from the literature are no challenge, neither for the exact solver (CPLEX) nor for the heuristic solvers proposed in this work (CMSA++ and BARRAKUDA). In fact, CPLEX is able to solve most problem instances to optimality; see the cases with a grey background. Interestingly, CPLEX seems to have more problems with unit disk graphs than with random graphs. Moreover, instances with variable node capacities seem slightly harder as well. This is also indicated by the CD plots in Figure 4b-g. In particular, the CD plots in Figure 4b,d,f show that CPLEX is not significantly worse than the best-performing algorithm. In the case of medium capacities (Figure 4d) CPLEX even ranks best on average. In contrast, Figure 4c,e,g show that BARRAKUDA is (apart from the case of medium capacities) significantly better than CPLEX and CMSA++.

• Even though we can only detect rather small differences between the three best-performing algorithms, when considering all problem instances from the literature together, BARRAKUDA performs significantly better than CPLEX, which, in turn, performs significantly better than CMSA++.
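The statistical protocol behind the CD plots (a Friedman omnibus test, followed by average ranks per algorithm) can be sketched with SciPy. Note that the actual evaluation was done with R's scmamp package, the objective values below are synthetic illustration data, and the Nemenyi post hoc step is omitted here:

```python
import numpy as np
from scipy.stats import friedmanchisquare, rankdata

# Rows: problem instances; columns: algorithms (e.g., CPLEX, CMSA++,
# BARRAKUDA). These numbers are synthetic, for illustration only.
results = np.array([
    [40, 39, 38],
    [55, 54, 53],
    [62, 62, 60],
    [71, 70, 68],
    [83, 81, 80],
])

# Omnibus test: do the algorithms' rankings differ across instances?
stat, p_value = friedmanchisquare(*results.T)

# Per-instance ranks (lower objective value = better = rank 1); the mean
# rank per column is what places an algorithm on the CD plot's axis.
ranks = np.apply_along_axis(rankdata, 1, results)
avg_ranks = ranks.mean(axis=0)
print(avg_ranks)  # → [2.9 2.1 1. ]
```

A small p_value would then justify the pairwise post hoc comparisons that decide which horizontal bars appear in the CD plot.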

Conclusions and Future Work
In this paper we dealt with an NP-hard variant from the family of dominating set problems: the so-called minimum capacitated dominating set problem. Two algorithms were proposed to tackle this problem. The first one is an extended version of construct, merge, solve and adapt, of which a preliminary version was already published [11]. The aim of our extensions was to make the algorithm more robust with respect to the parameter value settings. The second proposed algorithm is a hybrid between a biased random key genetic algorithm and an exact solver, which we labeled BARRAKUDA. The main idea of BARRAKUDA is to use, at each iteration, the current population of individuals in order to generate a sub-instance of the tackled problem instance. This sub-instance is then solved by the exact solver (CPLEX), and the resulting solution is transformed into an individual and fed back into the population. The experimental results showed that both algorithms clearly outperform LS_PD, the currently best approach from the literature, in the context of the already published benchmark instances. However, the results also showed that these instances are not really a challenge for our algorithms. Therefore, we generated a new set of larger and more challenging benchmark instances. On these, BARRAKUDA clearly outperformed the extended version of construct, merge, solve and adapt, especially in the context of general random graphs. In contrast, the performance of the two algorithms was shown to be quite similar for random geometric graphs.
Concerning future work, we want to study the reasons why BARRAKUDA performs much better than construct, merge, solve and adapt on general random graphs, while this difference does not show for random geometric graphs. Moreover, we believe that BARRAKUDA-type algorithms, that is, hybrid algorithms that (1) profit from an exact solver and (2) incorporate a learning component, can be very successful for other types of combinatorial optimization problems, and this is something that we would like to study.

Figure 1 .
Figure 1. An example of the CapMDS problem. Note that the bold edges in (b) indicate the domination relations. Vertices v 1 and v 4 , for example, are dominated by vertex v 2 , which forms part of the solution.
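The domination relation illustrated in Figure 1 can also be checked programmatically. The following is a minimal feasibility checker for a candidate CapMDS solution (0-based vertices; all names are hypothetical, and the toy path graph below is ours, not the instance from the figure):

```python
def is_feasible(n, adj, cap, assignment):
    """Check a candidate capacitated dominating set.

    n          -- number of vertices
    adj        -- dict: vertex -> set of its neighbors
    cap        -- dict: vertex -> capacity Cap(v)
    assignment -- dict: dominator -> set of vertices it dominates

    Feasible iff every vertex is a dominator or assigned to a neighboring
    dominator, and no dominator dominates more than Cap(v) vertices.
    """
    covered = set(assignment)           # dominators cover themselves
    for d, dominated in assignment.items():
        if len(dominated) > cap[d]:     # capacity exceeded
            return False
        if not dominated <= adj[d]:     # may only dominate neighbors
            return False
        covered |= dominated
    return covered == set(range(n))

# Path graph 0-1-2-3-4 with capacity 2 everywhere: {1, 3} is feasible
# when 1 dominates {0, 2} and 3 dominates {4}.
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
cap = {v: 2 for v in range(5)}
print(is_feasible(5, adj, cap, {1: {0, 2}, 3: {4}}))  # → True
print(is_feasible(5, adj, cap, {1: {0, 2, 4}}))       # → False
```

The second call fails both because vertex 1's capacity of 2 is exceeded and because vertex 4 is not a neighbor of vertex 1.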

Figure 2 .
Figure 2. High-level overview of the two proposed hybrid algorithms.

Algorithm 2 (fragment, lines 11-15):
11: else
12: Let L ⊆ W contain the min{l size , |W|} vertices from W with the highest heuristic values
13: Choose v * uniformly at random from L
14: end if
15: if g opt = opt 1 or g opt = opt 2 then

Figure 3 .
Figure 3. Example problem instance after the first step of generating a solution. Let us assume that the capacity of each node is 4. Vertex v 1 was chosen in the first step. As the capacity of v 1 is 4, all its neighbors are placed in C(v 1 ) and covered by v 1 .
(g) Variable capacity graphs, high capacities.

Figure 4 .
Figure 4. Critical difference plots concerning the problem instances from the literature. Panels: (a) all instances from the literature; (b) uniform capacity graphs, low capacities; (c) variable capacity graphs, low capacities; (d) uniform capacity graphs, medium capacities; (e) variable capacity graphs, medium capacities; (f) uniform capacity graphs, high capacities; (g) variable capacity graphs, high capacities.
Random geometric graph instances.

Figure 5 .
Figure 5. Critical difference plots concerning the challenging large problem instances.

Algorithm 2
Function ProbabilisticSolutionGeneration(d rate *, l size , g opt )
1: input: a problem instance (G, Cap)
2: input: parameter values for l size , d rate *, g opt

Algorithm 3 BARRAKUDA for the CapMDS problem
1: input: a problem instance (G, Cap)
2: input: values for p size , p e , p m and prob elite (BRKGA parameters)
3: input: values for n a , ex mode , t solver (additional BARRAKUDA parameters)
4: P ← GenerateInitialPopulation(p size )
5: Evaluate(P)
6: while computation time limit not reached do
. . .
P c ← Crossover(P, P e , prob elite )
Evaluate(P m ∪ P c ) {NOTE: P e is already evaluated}
11: P := P e ∪ P m ∪ P c
12: P ilp ← ChooseSolutions(P, ex mode , n a )

Table 1 .
CMSA++ parameter values generated by irace for the instances from the literature.

Table 2 .
BARRAKUDA parameter values generated by irace for the instances from the literature.

Table 3 .
CMSA++ parameter values generated by irace for the new data set of large problem instances.

Table 4 .
BARRAKUDA parameter values generated by irace for the new data set of large problem instances.

Table 5 .
Results for general graphs with uniform capacity.

Table 6 .
Results for general graphs with variable capacity.

Table 7 .
Results for Unit Disk Graphs with uniform capacity.