A Performance Study of Some Approximation Algorithms for Computing a Small Dominating Set in a Graph

Abstract: We implement and test the performance of several approximation algorithms for computing a minimum dominating set of a graph. These algorithms are the standard greedy algorithm, the recent linear programming (LP) rounding algorithms, and a hybrid algorithm that we design by combining the greedy and LP rounding algorithms. Over the range of test data, all algorithms perform better than anticipated in theory and have small performance ratios, measured as the size of the output divided by the LP objective lower bound. However, each has advantages over the others. For instance, the LP rounding algorithm normally outperforms the other algorithms on sparse real-world graphs. On a graph with 400,000+ vertices, LP rounding took less than 15 s of CPU time to generate a solution with performance ratio 1.011, while the greedy and hybrid algorithms generated solutions of performance ratio 1.12 in similar time. For synthetic graphs, the hybrid algorithm normally outperforms the others, whereas for hypercubes and k-Queens graphs, greedy outperforms the rest. Another advantage of the hybrid algorithm is that it can solve very large problems that are suitable for LP rounding (sparse graphs) but whose LP formulations become formidable in practice and cause LP solvers to crash, as we observed on a real-world graph with 7.7 million+ vertices and a planar graph on 1,000,000 vertices.


Introduction
Domination theory has its roots in the k-Queens problem in the 18th century. Later, in 1957, Berge [1] formally introduced the domination number of a graph. A subset of vertices S in a graph G is a dominating set if every vertex not in S is adjacent to some vertex in S. A dominating set of smallest cardinality is called a minimum dominating set. The cardinality of a minimum dominating set is called the domination number of G and is denoted by γ(G). The vertices colored red in Figure 1 constitute a minimum dominating set in the graph Q_3, the three-dimensional cube.
For the remainder of the paper, we assume familiarity with general concepts of graph theory as in [2], the theory of algorithms as in [3], and linear and integer programming concepts as in [4], respectively. We refer the reader to the book by Haynes, Hedetniemi, and Slater [5] as a general reference in domination theory. The problem of computing the domination number of a graph is well studied, and has extensive applications, including the design of telecommunication networks, facility location, and social networks.
Computing γ(G) is known to be an NP-hard problem, even in restricted cases, including unit disc graphs and grids [6], and hence researchers have focused on approximation, that is, on finding a small dominating set. A simple greedy algorithm is known to approximate γ(G) to within a logarithmic factor of the optimal value. It is known that improving the logarithmic approximation factor is also NP-hard [7]. Hence, no algorithm for approximating γ(G) can improve the asymptotic worst-case performance ratio of the greedy algorithm. Different variations of this algorithm have been proposed, and some have been tested in practice. See the work of Chalupa [8], Campan et al. [9], Eubank et al. [10], Sanchis [11], and Siebertz [12]. There are other approximation algorithms for very specific classes of graphs, including planar graphs, which have better than constant performance ratio in the worst case but are more complex than the algorithms described here. See [12] for a brief reference to some related papers.
Very recently, Bansal and Umboh [13] and Dvořák [14] showed that an appropriate rounding of solutions to the linear programming (LP) formulations for computing γ(G) provides, in polynomial time, dominating sets whose cardinalities are at most 3 · a(G) · L* and (2 · a(G) + 1) · L*, respectively.
Here, a(G) is the arboricity of G, and L* is the optimal value of the linear program, which is a lower bound on γ(G). Hence, for graphs with bounded arboricity, one can improve the logarithmic performance ratio of the greedy algorithm to a constant.
The greedy algorithm is simple and fast, and it has been tested in practice. One anticipates that it outperforms the LP-based approach if CPU time is the criterion. Nonetheless, its worst-case performance ratio is logarithmic even for planar graphs, which have arboricity at most 3; see Example 1 in Appendix A. For sparse graphs, the recent LP rounding methods referenced above have a bounded performance ratio, which is better than that of greedy, but to our knowledge, and in contrast to the greedy algorithm, the performance of the LP-based approaches has not been tested in practice. Furthermore, one would expect that for large graphs, the LP formulations would become formidable. Can one hope that a combination of these methods would give a better result than each individually, and if so, in what scenarios?
In this paper, we compare and contrast the performance of the greedy algorithm, the LP rounding algorithm, and a hybrid algorithm that combines the greedy and LP approaches. Our hybrid algorithm first solves the problem using the greedy algorithm and finds a dominating set, then takes a portion of vertices in this set and forces their values to be 1 in the linear programming formulation, solves the resulting linear program, and finally properly rounds the solution.

Our Findings
In our experiments, all algorithms perform better than anticipated in theory, particularly with respect to the performance ratios, measured as the value of the solution divided by the computed LP objective lower bound. However, each may offer advantages over the others depending on the nature of the data. LP rounding does well on sparse real-world graphs, consistent with theory, and normally outperforms the other algorithms. On a graph with 400,000+ vertices, LP rounding took less than 15 s of CPU time to generate a solution with performance ratio 1.011, while the greedy and hybrid algorithms generated solutions of performance ratio 1.12 in similar time. It is remarkable that the hybrid algorithm can solve the problem for very large sparse graphs, where the LP formulation becomes formidable in practice. For instance, it solved a real-world graph with 7.7 million+ vertices in 106 s of CPU time with a performance ratio of 2.0075. The LP solver crashed on this problem. We note that the simplex-based LP package used in our experiments performed very fast, although, in theory, the simplex method is not necessarily polynomial-time bounded.
For hypercubes and k-Queens graphs (which are not sparse), greedy outperforms the rest, consistent with theory, both in terms of speed and performance ratio. In particular, on the 12-dimensional hypercube, greedy finds a solution with performance ratio 1.7 in 0.01 s. On the other hand, the LP rounding and hybrid algorithms produce solutions with performance ratios 13 and 3.3 using 7.5 and 0.08 s of CPU time, respectively. It is notable that greedy gives optimal results in some cases where the domination number is known. Specifically, the greedy algorithm produces an optimal solution on hypercubes with dimensions d = 2^k − 1, where k = 1, 2, 3, and 4. For synthetic graphs, namely generated k-trees (G is a k-tree if it has tree width k and the addition of any edge increases the tree width by one) and k-planar graphs (G is k-planar if it can be drawn in the plane with no edge crossed by more than k other edges), LP rounding is outperformed by the other two algorithms, and the hybrid algorithm outperforms greedy.
This paper is organized as follows. In Section 2, we set our notation and summarize related material for the greedy algorithm. The LP rounding and hybrid algorithms are explained in Sections 3 and 4, respectively. Section 5 (Environment, Implementation, and Datasets) contains our materials and methods. Section 6 contains the results for synthetic graphs (k-planar graphs and k-trees). Sections 7 and 8 contain the results for hypercubes and k-Queens graphs, and for real-world graphs, respectively. In Section 9, we describe how the hybrid algorithm can be applied to very large sparse graphs, where the LP formulation becomes formidable. We present our conclusions in Section 10.

Preliminaries
Throughout this paper, G = (V, E) denotes an undirected graph on vertex set V and edge set E with n = |V| and m = |E|. Two vertices x, y ∈ V with x ≠ y are adjacent (or they are neighbors) if xy ∈ E. For any x ∈ V, the degree of x, denoted by deg(x), is the number of vertices adjacent to x in G. For any x ∈ V, let N(x) denote the set of all vertices in G that are adjacent to x. Let N[x] denote N(x) ∪ {x}. The arboricity of G, denoted by a(G), is the minimum number of spanning acyclic subgraphs of G into which E can be partitioned. By a theorem of Nash-Williams, $a(G) = \max_{S} \lceil m_S / (n_S - 1) \rceil$, where the maximum is over vertex sets S with at least two vertices, and n_S and m_S are the number of vertices and edges, respectively, of the subgraph induced on S [15]. Consequently, m ≤ a(G)(n − 1), and thus a(G) measures how dense G is. It is known that a(G) can be computed in polynomial time [16].
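As a small illustration of the Nash-Williams formula, the following sketch evaluates it by brute force over all vertex subsets. It is exponential in n and is meant only for tiny graphs; it is not the polynomial-time method of [16]. The function name and the edge-list input format are our own assumptions.

```python
from itertools import combinations


def arboricity_bruteforce(n, edges):
    """Evaluate max over vertex subsets S (|S| >= 2) of ceil(m_S / (n_S - 1)).

    Vertices are assumed to be labeled 0..n-1; edges is a list of pairs.
    Exponential time: only for tiny illustrative graphs.
    """
    best = 0
    for size in range(2, n + 1):
        for subset in combinations(range(n), size):
            s = set(subset)
            # m_S: number of edges with both endpoints inside S
            m_s = sum(1 for u, v in edges if u in s and v in s)
            # integer ceiling of m_S / (n_S - 1)
            best = max(best, -(-m_s // (size - 1)))
    return best


k4 = list(combinations(range(4), 2))   # complete graph on 4 vertices
path = [(0, 1), (1, 2), (2, 3)]        # a tree, hence arboricity 1
print(arboricity_bruteforce(4, k4), arboricity_bruteforce(4, path))
```

On K_4 the maximum is attained by S = V, where 6 edges over n − 1 = 3 give arboricity 2.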
Let D ⊆ V. D is a dominating set if for every x ∈ V \ D there exists y ∈ D such that xy ∈ E. The domination number of G, denoted by γ(G), is the cardinality of a smallest dominating set of G. A dominating set of cardinality γ(G) is called a minimum dominating set. Additional definitions will be introduced when required.
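The definition above translates directly into a short check. The sketch below assumes the graph is given as a dictionary mapping each vertex to a list of its neighbors; the function name is ours. It is applied to the cube Q_3, whose vertices are labeled 0 to 7 and are adjacent when their labels differ in exactly one bit.

```python
def is_dominating_set(adj, candidate):
    """Return True if every vertex is in the candidate set or adjacent to a member of it."""
    d = set(candidate)
    return all(v in d or any(u in d for u in adj[v]) for v in adj)


# Q_3: flipping one of the three bits of a label gives a neighbor.
cube = {v: [v ^ (1 << b) for b in range(3)] for v in range(8)}

print(is_dominating_set(cube, {0, 7}))  # two antipodal vertices
print(is_dominating_set(cube, {0}))     # a single vertex covers only 4 of 8
```

Since a closed neighborhood in Q_3 contains four vertices, no single vertex can dominate all eight, so the two antipodal vertices form a minimum dominating set.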

Greedy Approximation Algorithm
A simple greedy algorithm attributed to Chvátal [17] and Lovász [18] (for approximating the set cover problem) is known to approximate γ(G) within a multiplicative factor of H(∆(G)) of its optimal value, where ∆(G) is the maximum degree of G and $H(k) = \sum_{i=1}^{k} 1/i$ is the k-th harmonic number. Note that $\ln(k + 1) \le H(k) \le \ln(k) + 1$. The algorithm initially labels all vertices uncovered. At iteration one, the algorithm selects a vertex v_1 of maximum degree in G, places v_1 in a set D, and labels all vertices adjacent to v_1 as covered. In general, at iteration i ≥ 2, the algorithm selects a vertex v_i ∈ V − {v_1, v_2, . . . , v_{i−1}} with the largest number of uncovered vertices adjacent to it, adds v_i to D, and labels all of the uncovered vertices adjacent to v_i as covered. The algorithm stops when D becomes a dominating set. It is easy to implement the algorithm in O(n + m) time. It is known that approximating γ(G) within a factor of (1 − ε) ln(∆) of the optimal, for any ε > 0, is NP-hard [7]. Hence, no algorithm for approximating γ(G) can improve the asymptotic worst-case performance ratio achieved by the greedy algorithm.
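The greedy selection rule can be sketched in a few lines of Python. This is a minimal illustration, not the O(n + m) C++ implementation used in the experiments: it rescans all vertices at every iteration, and it assumes the common variant in which a vertex also covers itself, so the gain is counted over the closed neighborhood N[v]. The function name and adjacency-dictionary input are assumptions.

```python
def greedy_dominating_set(adj):
    """Repeatedly pick a vertex whose closed neighborhood covers the most uncovered vertices.

    adj maps each vertex to an iterable of its neighbors.
    Ties are broken by dictionary order (an arbitrary choice).
    """
    uncovered = set(adj)
    chosen = []
    while uncovered:
        # gain of v = number of still-uncovered vertices in N[v]
        best = max(adj, key=lambda v: len(uncovered & ({v} | set(adj[v]))))
        chosen.append(best)
        uncovered -= {best} | set(adj[best])
    return chosen


star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
path5 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(greedy_dominating_set(star))
print(greedy_dominating_set(path5))
```

The loop always terminates, because any uncovered vertex has gain at least one and is therefore eligible for selection.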
The appendix includes two examples of worst-case graphs (one sparse and one dense) for the greedy algorithm, which are derived from an instance of the set cover problem provided in [19]. For both instances, the O(ln(∆)) performance ratio is tight.

Linear Programming Approach
One can formulate the computation of γ(G) as an integer programming problem IP1 stated below. However, since integer programming problems are known to be NP-hard [20], the direct application of the integer programming method would not be computationally fruitful. Next observe that by relaxing the integer program IP1, one obtains the linear program LP1 shown below.

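IP1: Minimize $I = \sum_{v \in V} x_v$ subject to $\sum_{u \in N[v]} x_u \ge 1$ for all $v \in V$, and $x_v \in \{0, 1\}$ for all $v \in V$.
LP1: Minimize $L = \sum_{v \in V} x_v$ subject to $\sum_{u \in N[v]} x_u \ge 1$ for all $v \in V$, and $0 \le x_v \le 1$ for all $v \in V$.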
Throughout the rest of this paper, we denote by L* and I* the values of L and I in LP1 and IP1, respectively, at optimality.
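To make the relaxation concrete, the following sketch checks whether a fractional assignment is feasible and reports its objective value. It assumes that LP1 is the usual relaxation in which the weights on every closed neighborhood N[v] must sum to at least 1 and each weight lies in [0, 1]. No LP solver is involved, and the function name is hypothetical.

```python
from fractions import Fraction


def lp1_value_if_feasible(adj, x):
    """Return the sum of x_v if x satisfies the assumed LP1 constraints, else None."""
    for v in adj:
        if not (0 <= x[v] <= 1):
            return None
        # closed-neighborhood constraint for v
        if x[v] + sum(x[u] for u in adj[v]) < 1:
            return None
    return sum(x.values())


# The 5-cycle with weight 1/3 on every vertex: each N[v] carries exactly 1.
cycle5 = {v: [(v - 1) % 5, (v + 1) % 5] for v in range(5)}
x = {v: Fraction(1, 3) for v in cycle5}
print(lp1_value_if_feasible(cycle5, x))
```

This assignment has value 5/3, so L* ≤ 5/3 on the 5-cycle, whereas the domination number of the 5-cycle is 2. The relaxation can therefore be strictly smaller than the integer optimum.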

Linear Programming Rounding
Algorithm R 1 is due to Bansal and Umboh [13].
Solve LP1, and let H be the set of all vertices that have weight at least 1/(3a(G)), where a(G) is the arboricity of graph G. Let U be the set of all vertices not adjacent to any vertex in H and return H ∪ U.
Dvořák [14,21] studied the d-domination problem, i.e., when a vertex dominates all vertices at distance at most d from it, and its combinatorial dual, the 2d-independent set [22]. In [14], he employed the LP rounding approach of Bansal and Umboh as a part of his framework and consequently, for d = 1, he improved the approximation ratio of algorithm R_1 by showing that the algorithm R_2 given below provides a (2a(G) + 1)-approximation.
Solve LP1, and let H be the set of all vertices that have weight at least 1/(2a(G) + 1), where a(G) is the arboricity of graph G. Let U be the set of all vertices that are not adjacent to any vertex of H and return H ∪ U.
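The rounding step shared by R_1 and R_2 can be sketched as follows. The sketch assumes that an LP1 solution x is already available (for example, from an external solver) and differs between the two algorithms only in the threshold. The function names are ours, and exact fractions are used for the thresholds to avoid floating-point comparisons.

```python
from fractions import Fraction


def round_lp_solution(adj, x, threshold):
    """Return H ∪ U, where H holds the heavy vertices and U the vertices H fails to dominate."""
    heavy = {v for v in adj if x[v] >= threshold}            # H
    dominated = heavy | {u for v in heavy for u in adj[v]}   # H ∪ N(H)
    leftover = set(adj) - dominated                          # U
    return heavy | leftover


def r1_threshold(arboricity):
    return Fraction(1, 3 * arboricity)       # algorithm R_1


def r2_threshold(arboricity):
    return Fraction(1, 2 * arboricity + 1)   # algorithm R_2


# Path 0-1-2-3 (arboricity 1) with the integral solution x_1 = x_2 = 1.
path4 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
x_path = {0: 0, 1: 1, 2: 1, 3: 0}
print(round_lp_solution(path4, x_path, r2_threshold(1)))

# Triangle (arboricity 2) with the fractional solution x_v = 1/3.
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
x_tri = {v: Fraction(1, 3) for v in triangle}
print(round_lp_solution(triangle, x_tri, r2_threshold(2)))
```

The triangle shows that rounding can overshoot on a single instance: all three vertices exceed the threshold 1/5, which is still within the guarantee (2 · 2 + 1) · L* = 5.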

Remark 1.
Graph G in Example 1 in Appendix A is planar, so a(G) ≤ 3. Thus, algorithms R_1 and R_2 have worst-case performance ratios of nine and seven, respectively, whereas greedy exhibits a worst-case O(log(n)) performance ratio. Throughout our experiments, the rounding algorithms returned an optimal solution of size two for both examples, whereas greedy returned a set of size three for Example 1. Furthermore, in Example 2 in Appendix A, it can be verified that a(G) ≥ (p + 2)/2 for graph G, and hence in theory the worst-case performance ratios of the rounding algorithms are not constant either. Interestingly, in our experiments, L* was always two for graphs of the type in Example 2, and the LP rounding algorithms also always found a solution of size two, which is the optimal value. Thus, the performance ratio was always one, much smaller than the predicted worst case.

Hybrid Approach
Next, we provide a description of the decomposition approach for approximating LP1 and our hybrid algorithm. Recall that a separation in G = (V, E) is a partition A ∪ B ∪ C of V so that no vertex of A is adjacent to any vertex of C. In this case, B is called a vertex separator in G. Let X = {x_v | v ∈ V} be a feasible solution to LP1, and let V′ ⊆ V. Then X(V′) denotes $\sum_{v \in V'} x_v$.
Lemma 1. Let A ∪ B ∪ C be a separation in G = (V, E) and consider the following linear programs:
Proof. Let X = {x_v | v ∈ V} be an optimal solution to LP1. Note that the restrictions of X to B ∪ C and A ∪ B give feasible solutions for LP3 and LP2 of values X(B ∪ C) and X(A ∪ B), respectively, and hence the claim for the lower bound on L* follows.
Note that in LP2 and LP3 the constraints are not written for all variables, and the rounding method in [13] may not be directly applicable.
Let X be an optimal solution for LP3, and let X(C) denote the sum of the weights assigned to all vertices in C. Then there is a dominating set in G of size at most |A| + 3a(G)X(C) ≤ |A| + 3a(G)N*.
Proof. Let H be the set of all vertices v in C with $x_v \ge 1/(3a(G))$, and let U = C − (H ∪ N(H)). Now apply the method in [13] to C to obtain a rounded solution, or a dominating set D, of at most |U| + |H| ≤ 3a(G)X(C) vertices in C. Finally, note that A ∪ D is a dominating set in G with cardinality at most |A| + 3a(G)X(C) ≤ |A| + 3a(G)N*.

The Hybrid Algorithm:
Fix 0 < α < 1. Apply the greedy algorithm to G to obtain a dominating set D = {x_1, x_2, . . . , x_d}, and let S = {x_1, x_2, . . . , x_{α·d}} be the first α · d vertices in D. Now, solve the following linear program on the induced subgraph of G with the vertex set V − S.
Next, let A = S, B = N(S) and C = V − (A ∪ B), and apply the rounding scheme in algorithms R 1 or R 2 to C. Let H and U be the corresponding sets, and output the set S ∪ H ∪ U.
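The partition used by the hybrid algorithm can be sketched as below. The sketch covers only the construction of A, B, and C from a greedy output and a value of α; solving the linear program and rounding on C are left to the LP solver and rounding routine. Taking the floor of α · d is our assumption, since the text does not say how a fractional count is handled, and the function name is hypothetical.

```python
from math import floor


def separation_from_greedy(adj, greedy_order, alpha):
    """Build A = S (first floor(alpha * d) greedy picks), B = N(S), C = V - (A ∪ B)."""
    keep = floor(alpha * len(greedy_order))
    a = set(greedy_order[:keep])
    b = {u for v in a for u in adj[v]} - a
    c = set(adj) - a - b
    return a, b, c


# Path 0-1-...-6 with a dominating set listed in a hypothetical selection order.
path7 = {v: [u for u in (v - 1, v + 1) if 0 <= u <= 6] for v in range(7)}
order = [1, 4, 6]
a, b, c = separation_from_greedy(path7, order, 0.5)
print(sorted(a), sorted(b), sorted(c))
```

By construction, every neighbor of a vertex in A lies in A ∪ B, so no vertex of A is adjacent to a vertex of C and the three sets form a separation.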

Remark 3.
We choose the value of α by trial and error, normally starting at α = 1/2.

Environment, Implementation, and Datasets
We used a laptop with modest computational power (8th generation Intel i5 at 1.6 GHz and 8 GB RAM) to perform the experiments. We implemented the O(n + m) time version of the greedy algorithm in C++. At the time of writing this paper, we did not have access to packages that offer polynomial-time bounded linear programming algorithms. We used IBM Decision Optimization CPLEX Modeling (DOCPLEX) for Python to solve the LP relaxation of the problem for the LP rounding and hybrid algorithms.
Our data sources are listed below.
https://github.com/joklawitter/GraphGenerators †
http://www.inf.udec.cl/~jfuentess/datasets/graphs.php ††
http://davidchalupa.github.io/research/data/social.html [8]
https://www.cc.gatech.edu/dimacs10/downloads.shtml [23]
The graph generator at † was used to create the k-planar graphs (graphs embedded in the plane with at most k crossings per edge) and k-trees (graphs with tree width k and the largest number of edges) with up to 20,000 vertices. We also used publicly available Google+ and Pokec social-network graphs, a publicly available 1,000,000-vertex planar graph ††, and real-world DIMACS graphs with up to more than 7,700,000 vertices. Furthermore, we generated the k-Queens graphs, hypercubes (up to 12 dimensions), and the graphs in Example A.1 (Figure A1) and Example A.2.

Remark 4.
Throughout our experiments, the value of the solution computed by rounding algorithm R_2 was always better than the value of the solution computed by rounding algorithm R_1, as predicted in theory. Likewise, the value of the solution computed by the hybrid algorithm using R_2 was always better than when R_1 was used. Throughout the tables in Sections 6 through 9, r denotes the value of the solution computed by R_2, h denotes the value of the solution computed by the hybrid algorithm associated with R_2, and g denotes the value of the solution computed by the greedy algorithm. In these tables, the best computed values are bolded.

Performance on k-Planar Graphs and k-Trees
In this section, we compare the performance ratios of the greedy, LP rounding and hybrid algorithms on k-planar graphs and k-trees. In all cases, the hybrid algorithm performed better than the others. Greedy performed close to hybrid and LP rounding performed the worst.
The arboricity of each of the planar graphs is at most 3. For k-trees, we use $k - \frac{(k/2)(k-1)}{n-1}$ for arboricity. For k-planar graphs, k ≥ 1, we use the upper bound of $8\sqrt{k}$ on arboricity. The k-planar graphs and k-trees were all generated using † described in Section 5. The typical value of α was 1/2, but increasing it to 3/4 resulted in better performance in some cases. All algorithms were able to compute dominating sets in less than 2 s in all cases.
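The k-tree expression above coincides with the edge density m/(n − 1) of a k-tree, which can be checked with exact arithmetic. The sketch below relies on the standard fact that a k-tree on n ≥ k + 1 vertices has kn − k(k + 1)/2 edges; the function names are ours.

```python
from fractions import Fraction


def ktree_edge_count(k, n):
    """Number of edges of a k-tree on n >= k + 1 vertices."""
    return k * n - k * (k + 1) // 2


def ktree_density(k, n):
    """The expression k - (k/2)(k - 1)/(n - 1) as an exact fraction."""
    return Fraction(k) - Fraction(k * (k - 1), 2 * (n - 1))


for k, n in [(1, 5), (3, 10), (4, 20000)]:
    m = ktree_edge_count(k, n)
    # the expression equals m / (n - 1) exactly
    print(k, n, m, ktree_density(k, n) == Fraction(m, n - 1))
```

For k = 1 the expression evaluates to 1, matching the arboricity of a tree. For fixed k it approaches k from below as n grows.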

Performance on Sparse k-Planar Graphs and k-Trees
In Tables 1-3, we present the performance of the algorithms on sparse k-planar graphs and sparse k-trees.

Performance on Dense k-Planar Graphs and k-Trees
In Tables 4 and 5, we present the algorithms' performance on k-planar graphs where $k = \ln(|V|)$, and k-trees where $k = |V|^{0.25}$, respectively. These graphs are dense.

Performance on Hypercubes and k-Queens Graphs
In this section, we present the performance of the greedy, rounding, and hybrid algorithms on hypercubes from d = 5 to 12 dimensions and on k-Queens graphs. Table 6 contains the results for hypercubes. For LP rounding and hybrid, we use the arboricity of hypercubes, a = ⌊d/2⌋ + 1 [24]. For k-Queens graphs, the arboricity is unknown, so we use the upper bound 3(k − 1), where k is the side length of the chessboard.
In both Tables 6 and 7, greedy performs the best, followed by hybrid. The rounding algorithms perform the worst by far. This is not surprising, as LP rounding approaches are known to perform worse in general on dense graphs than on sparse graphs. The chosen value of α was always 1/2. Solutions were computed in under 8 s for all graphs and algorithms.

Performance on Real-World Graphs
In this section, we present the performance of LP rounding, greedy, and hybrid on the real-world social network graphs from DIMACS [23], Google+ [8], and Pokec [8]. All of these graphs are sparse, but their arboricity is unknown. We therefore experiment with the threshold applied during LP rounding. The best threshold that we found for rounding was 2/a, where a = m/(n − 1). We denote the value of the solution computed by the LP rounding algorithm with this threshold as r and the value of the solution computed by the hybrid algorithm with this threshold as h. The chosen value of α was 1/2 in all cases. Table 8 contains the results for three sparse social network graphs from DIMACS. LP rounding performs better than the greedy and hybrid approaches, with greedy ranking last among the algorithms tested. In Tables 9 and 10, we compare r, h, and g on the Google+ and Pokec graphs. The performance ratios, although different, happen to be very close. Thus, we list the actual sizes of the dominating sets returned by the algorithms.
Compared to the best results from [8], which used a randomized local search algorithm run for up to one hour, the LP rounding approaches produced, with a few exceptions, a solution that is smaller or as good while using significantly less CPU time (less than 0.5 s for each graph).
Table 11 contains the results on a graph with more than 7 million vertices (the Great Britain street network) and a planar graph with 1 million vertices, where we used Lemma 1 to compute Z = max{M*, N*}. The LP solver crashed when it was directly applied to the Great Britain street network and took over 3.5 h of CPU time to compute an answer on the 1-million-vertex planar graph. Through a search for α, we arrived at α = 3/4 for hybrid. The hybrid algorithm's performance ratio with respect to Z was better than greedy's. On the street network graph, greedy took 14 s to produce a solution while hybrid took 107 s. On the 1-million-vertex planar graph, greedy took 4 s and hybrid took 19 s.

Conclusions
Our findings indicate that all of these algorithms perform better than anticipated in theory, particularly with respect to the performance ratio. The LP rounding does well on sparse real-world graphs, consistent with theory, and normally outperforms the other algorithms. For hypercubes and k-Queens graphs (which are not sparse) greedy outperforms the rest, consistent with theory, both in terms of speed and performance ratio. For synthetic graphs (generated k-trees and k-planar graphs), LP rounding is outperformed by the other two algorithms, and the hybrid algorithm outperforms greedy.
Throughout our experimentation, the hybrid algorithm was never the worst. The hybrid algorithm's success in solving large sparse problems suggests that more research in this area will be fruitful with respect to characterizations of the parameter α. In particular, a theoretical research direction would be to tighten the upper bound on the value of the solution computed by the hybrid algorithm, as stated in Remark 2. Can the upper bound on the value of the solution be shown to be better than (2a + 1) · γ(G) with the appropriate choice of α?
Author Contributions: J.L. and R.P. wrote the code, collected and generated data, ran experiments and collected the results, and participated in analysis and writing the paper. F.S. conceptualized, wrote a draft of the paper, and supervised and provided guidance for the work. All authors have read and agreed to the published version of the manuscript.
Example A2. Let p ≥ 2 be an integer, and let G be a graph with vertices V_1 ∪ V_2, where V_1 = {s_1, s_2, . . . , s_p, t_1, t_2} and V_2 = {v_1, v_2, . . . , v_{2^{p+1}−2}}. Make V_1 a clique and V_2 an independent set of vertices. Next, consider a linear ordering L on V_2: for i = 1, 2, . . . , p, the set of neighbors of s_i in V_2, denoted by W_i, has cardinality 2^i and is disjoint from W_k for any k < i. Finally, for i = 1, 2, . . . , p, place edges between t_1 and the first half of the vertices in each W_i, and place edges between t_2 and the second half of the vertices in each W_i. Now note that the greedy algorithm will be forced to pick the vertices s_p, s_{p−1}, . . . , s_1, in that order, but the minimum dominating set in G is {t_1, t_2}, and ∆ = 2^p + p + 1.
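The construction in Example A2 can be reproduced with a short script. The sketch below builds the graph for a given p and runs a simple closed-neighborhood greedy on it. The vertex names, helper names, and the quadratic-time greedy are our own choices for illustration, not the implementation used in the experiments.

```python
def build_example_a2(p):
    """Build the graph of Example A2 as a dict mapping each vertex to its neighbor set."""
    clique = [f"s{i}" for i in range(1, p + 1)] + ["t1", "t2"]
    adj = {v: set(clique) - {v} for v in clique}      # V_1 is a clique
    counter = 0
    for i in range(1, p + 1):
        block = [f"v{counter + j + 1}" for j in range(2 ** i)]   # W_i, |W_i| = 2^i
        counter += 2 ** i
        half = len(block) // 2
        for j, w in enumerate(block):
            owner = "t1" if j < half else "t2"        # first half to t_1, second to t_2
            adj[w] = {f"s{i}", owner}
            adj[f"s{i}"].add(w)
            adj[owner].add(w)
    return adj


def greedy(adj):
    """Pick, at each step, a vertex covering the most uncovered vertices in N[v]."""
    uncovered, chosen = set(adj), []
    while uncovered:
        best = max(adj, key=lambda v: len(uncovered & (adj[v] | {v})))
        chosen.append(best)
        uncovered -= adj[best] | {best}
    return chosen


graph = build_example_a2(4)
print(greedy(graph))
print(max(len(nbrs) for nbrs in graph.values()))   # maximum degree
```

At every step, s_i covers one more uncovered vertex than t_1 or t_2, so no tie-breaking is involved and greedy returns p vertices while {t_1, t_2} dominates the graph.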