1. Introduction
In a simple connected graph $G=(V,E)$ with $|V|=n$ vertices and $|E|=m$ edges, a subset of vertices $S\subseteq V$ is a dominating set in graph $G$ if any vertex $v\in V$ is adjacent to some vertex from this subset unless vertex $v$ itself belongs to set $S$. The general objective is to find an optimal solution, a dominating set with the minimum possible size $\gamma(G)$. The domination problem is known to be NP-hard [1], being among the hardest problems in the family. It is closely related to other well-known graph optimization problems such as set cover, graph coloring and independent set. The problem of domination in graphs was formalized in [2,3], and the topic is treated in detail in two well-known books [4,5]. The theory of domination in graphs is an area of increasing interest in discrete mathematics and combinatorial computing, in particular because of a number of important real-life applications, for example, in facility location problems in street networks [5,6,7], in monitoring electric power systems with the aim of minimizing the number of phase measurement units [8], in electric power networks [9], and in the analysis of food web robustness [10]. Different versions of the graph domination problem have been used to model clustering in wireless ad hoc networks [11,12,13,14]. A number of other important graph-theoretic problems, including set cover, maximum independent set and chromatic number, are reducible to the domination problem [5]. Dominating sets also find applications in the study of social networks [15,16,17].
Due to the complexity of the problem, major efforts have been made towards the development of heuristic algorithms. Among the earliest related works, we can mention that of Chvátal [
18], who described an approximation algorithm with an approximation ratio
for the related weighted set cover problem; here and later,
is the optimal objective value. Iteratively, the algorithm selects a vertex
v minimizing
, where
is the weight of vertex
v, and
until
. Adopting this algorithm, Parekh [19] addressed the domination problem, showing that the cardinality of the dominating set created by his algorithm is upper-bounded by $n+1-\sqrt{2m+1}$, where $n$ and $m$ denote the numbers of vertices and edges of the graph. Later, Eubank et al. [20] and Campan et al. [21] presented heuristic algorithms designed for special types of graphs, evaluating their performance solely through experimental study. More recently, a heuristic algorithm with better performance was described in [22].
Little work has been carried out in the development of exact algorithms. To the best of our knowledge, the only exact algorithms which do better than a complete enumeration are described in [
23,
24,
25]. In [
23,
24], the authors show that their algorithms run in times
and
, respectively, without presenting experimental evidence of the practical performance of their algorithms. Although these bounds are notably better than
, they remain impractical. Quite recently, in [25], two exact algorithms were proposed, an implicit enumeration algorithm and an alternative integer linear programming (ILP) formulation. The authors also described an approximation algorithm. On the one hand, it gave solutions of notably better quality than the previously known approximation algorithms for the benchmark instances with up to 2100 vertices, some of them being solved in less than 1 min, and it found an optimal solution for 61.54% of these instances. On the other hand, it failed to create solutions within a reasonable time limit for graphs with 3000 or more vertices. In contrast, for large-scale practical problems, the heuristic from [22] gave solutions within the time limit of 50 s for graphs with up to 14,000 vertices.
Relying on the framework of the heuristic from [22], we propose here new procedures that improve the quality of solutions for large-scale instances. The heuristic works in two stages. In stage 1, a dominating set is generated, and in stage 2, it is purified, i.e., its size is reduced. The purification relies on an analysis of the flowchart of stage 1, combined with a special kind of clustering of the dominating set of stage 1. A cluster captures the structural relationship between the vertices of a dominating set, which can be exploited during purification. Exploiting these structural properties, we propose a new method for analyzing the structure of a dominating set. We present four different purification procedures, all of them outperforming, in practice, the earlier known purification procedure from [22]. For over 1300 benchmark problem instances [26], the improvement is about 10%, on average. Compared to the best of the known upper bounds, the obtained solutions are, on average, about seven times smaller, a reduction of 85.71% relative to that bound. For the 500 benchmark instances where the optimum is known (see [25,26]), at least one of the new purification procedures obtains an optimal solution for 46.33% of the instances, whereas the average error for the remaining instances is 1.01 vertices.
The remainder of the paper is organized as follows. In Section 2, we describe the necessary preliminaries. In Section 3, we give our clustering procedure. Section 4 details the four purification procedures. In Section 5, we present the approximation ratios, and in Section 6, we report our experimental results. We give some final remarks in Section 7.
2. Preliminaries
A formal description of our domination problem is as follows. Given a simple connected graph $G=(V,E)$ with $|V|=n$ vertices and $|E|=m$ edges, a subset of vertices $S\subseteq V$ is a dominating set in graph $G$ if any vertex $v\in V$ is adjacent to some vertex $x$ from this subset (i.e., there is an edge $(v,x)\in E$), unless vertex $v$ itself belongs to set $S$. The objective is to find a feasible solution with the minimum possible size $\gamma(G)$.
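The definition above translates directly into a membership test. The following is a minimal sketch, assuming the graph is stored as a dictionary of adjacency sets; the name `is_dominating_set` and the small path graph are illustrative choices, not taken from the paper.

```python
from typing import Dict, Hashable, Iterable, Set

# Hypothetical representation: each vertex maps to the set of its neighbours.
Graph = Dict[Hashable, Set[Hashable]]


def is_dominating_set(graph: Graph, subset: Iterable[Hashable]) -> bool:
    """Return True if every vertex is in `subset` or adjacent to a member of it."""
    chosen = set(subset)
    return all(v in chosen or bool(graph[v] & chosen) for v in graph)


# Illustrative path graph 1 - 2 - 3 - 4 - 5 (not an instance from the paper).
path = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}

print(is_dominating_set(path, {2, 4}))  # {2, 4} dominates the whole path
print(is_dominating_set(path, {3}))     # vertices 1 and 5 are left undominated
```

On this path, `{2, 4}` passes the test while `{3}` fails it, since vertices 1 and 5 are neither in the set nor adjacent to it.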
Given a vertex $v\in V$, $N(v)$ is the set of neighbors or the open neighborhood of $v$ in $G$, that is, $N(v)=\{u\in V:(u,v)\in E\}$; we denote by $d(v)=|N(v)|$ the degree of vertex $v$ in $G$. We let $\delta=\min_{v\in V} d(v)$ and $\Delta=\max_{v\in V} d(v)$. The private neighborhood of vertex $v\in S$ is defined by $\{u\in V: N(u)\cap S=\{v\}\}$; a vertex in the private neighborhood of vertex $v$ is said to be its private neighbor with respect to set $S$. For further details on the basic graph terminology, see [5].
In [
22], a dominating set is formed in two stages. At the first stage, an initial dominating set
is created by a fast greedy algorithm, where
stands for the vertex included at iteration
$i$ of this greedy algorithm. At the second, purification stage, a special iterative procedure is applied to reduce the initial dominating set. This iterative procedure will be referred to as purification procedure 0, abbreviated P0. It is based on the analysis of the flowchart of the greedy algorithm, represented as a special kind of spanning forest $T$ of the original graph $G$ consisting of the vertices of set $S$. This forest is the union of the so-called clusters that organize vertices from set $S$ in special graphical structures. These structures are formed at the first stage of our algorithm, while the greedy algorithm generates set $S$. Clusters represent important dependencies in the dominating set that allow its purification in different ways. In particular, the order in which the vertices of each cluster are purified is important and affects the outcome of the purification process. A cluster can be seen as a specific type of spanning forest of the initial graph $G$. Different clustering methods yield collections of different types of rooted trees. By a traversal of these trees, we purify the initial dominating set at stage 2. Different combinations of a particular clustering and traversal/purification method result in overall purification algorithms of different efficiency.
In the following sections, we describe different tree traversal methods starting with a formal definition of a cluster in the next section. Then, we describe how we carry out the purification process during a traversal of each cluster. We combine different clustering and traversal methods and obtain different purification procedures.
3. Stage 1: The Clustering
As briefly noted, the cluster is a rooted tree, a sub-graph of graph G associated with some connected component of the graph induced by set S. Initially, a set of clusters, , form a partition of set S. We denote the union of these trees by T. In a cluster, the vertices of each connected component can be represented in different ways as trees. During the traversal of these trees, special conditions are verified, and some vertices are omitted (purified).
We generate the clusters during the construction of the initial dominating set
S. This construction is based on the greedy procedure from [
22], further referred to as
Greedy. It works in a number of iterations, which is upper-bounded by
n. At each iteration
, one specially selected vertex, denoted by
, is added to the dominant set
of the previous iteration, i.e.,
, where initially
.
is the partial dominant set of iteration
. At each iteration
h, the
active degree of a vertex
is
. At each iteration
h, vertex
is a vertex of
with the maximum active degree. Note that
is a vertex with the maximum degree in graph
G. A vertex is said to be covered at iteration
h if it has at least one neighbor in the set
. Greedy halts when
is a dominating set of
G. At that iteration, all uncovered vertices with active degree 0 (if any) are included in the set
. A formal description is provided in Algorithm 1.
Throughout this section,
is used for the vertex with the maximum active degree selected by
Greedy at iteration
h; hence,
is the dominant set of Stage 1.
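Algorithm 1 Algorithm Greedy
Input: A graph $G$.
Output: A dominating set $S$ of $G$.
$h:=0$; $i:=0$; $S_0:=\emptyset$; $T_0:=\emptyset$;
{iterative step}
while $S_h$ is not a dominating set of graph $G$ do
    $h:=h+1$;
    $v_h$ := any vertex of $V\setminus S_{h-1}$ with maximum active degree;
    $S_h:=S_{h-1}\cup\{v_h\}$;
    Cluster_Generation($T_{h-1}$, $v_h$, $h$, $i$);
end while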
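To make the greedy stage concrete, here is a minimal sketch of a greedy construction in the spirit of Algorithm 1. It is not the authors' implementation: the active degree of a vertex is assumed here to be the number of its neighbors that are not yet dominated, ties are broken by dictionary order, cluster generation is omitted, and the function name is hypothetical.

```python
from typing import Dict, Hashable, List, Set

Graph = Dict[Hashable, Set[Hashable]]  # hypothetical adjacency-set representation


def greedy_dominating_set(graph: Graph) -> List[Hashable]:
    """Return the vertices of a dominating set in the order they were selected."""
    dominated: Set[Hashable] = set()  # vertices in the set or adjacent to it
    order: List[Hashable] = []

    def active_degree(v: Hashable) -> int:
        # Assumption: active degree = number of not yet dominated neighbours.
        return len(graph[v] - dominated)

    while len(dominated) < len(graph):
        candidates = [v for v in graph if v not in order]
        best = max(candidates, key=active_degree)  # first maximum in dict order
        if active_degree(best) == 0:
            # Remaining undominated vertices have no undominated neighbours:
            # include all of them, as the text describes for the last iteration.
            order.extend(v for v in graph if v not in dominated)
            break
        order.append(best)
        dominated |= graph[best] | {best}
    return order


# Illustrative path graph 1 - 2 - 3 - 4 - 5 (not the graph of Figure 2a).
path = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
selected = greedy_dominating_set(path)
print(selected)
```

On this path the sketch selects vertices 2, 3 and 4 in that order. The result is dominating but not minimal, since vertex 3 is redundant, which is exactly the situation the purification stage is meant to repair.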
Our aim is to purify this set, i.e., omit some vertices from that set so that the reduced set remains dominant. A combination of a particular clustering and traversal method results in a particular purification procedure. The four purification procedures that we propose here apply the same rules for the creation of the clusters. Unlike the purification procedure P0, which is oriented on the creation of clusters in depth, here, Stage 1 creates clusters in width. Iteratively, every next vertex is included in the partial forest at iteration h, , as follows:
Below, we give a formal description of the procedure (see Algorithm 2).
Algorithm 2 Cluster_Generation |
Input: , , h, and number of clusters created i. Output: Forest and i. if then ; { is root of } ; else if then ; ; { is root of } ; else if then {insert as a child of x}; else {insert as a child of and update }; end if end if end if
|
Theorem 1. Stage 1 runs in time .
Proof. The number of trees
and the number of vertices in each tree are clearly bounded by
. At an iteration
h,
can be determined in time
, and the same time is required to locate and update each vertex
. Hence, at every iteration
h, vertex
can be incorporated into the forest
in time
. The theorem follows since, in Greedy, vertex
is determined and added to the dominating set
S also in time
(see [
22]). □
We use the graph in
Figure 2a to illustrate our algorithms. For this graph, Greedy creates the dominating set
. At iteration
,
and
. At iteration
, the vertices with the highest active degree (5) are 1, 2 and 3. So, vertex
can be chosen. Consequently,
and
. Note that the maximum active degree of the vertices in
is 1. Thus, in iteration
, we can choose vertex
, resulting in
and
. At iteration
, the active degree of vertex 3 is 1, so
,
, and
, where
is a tree with root vertex
, see
Figure 2b. At iteration
,
is a dominating set, and the algorithm stops.
4. Stage 2: Purification Procedures
Given a set of clusters formed during the execution of the algorithm of Stage 1, we wish to purify the forest T, i.e., to convert a possibly non-minimal dominating set to a minimal one. Note that there is a difference between a minimal and a minimum dominating set. The latter one is optimal, whereas the former set is not necessarily optimal; however, it cannot be reduced by omitting any node from it. As we will see, our purification procedures return a minimal dominating set. In stage 2, Procedure P0 takes forest T as input, which, together with the initial dominating set , is constructed at stage 1. During the construction of set S, the vertex is inserted into the current forest as a child of one of the previously inserted vertices , such that , where () is the vertex which is furthest away from the root in its cluster. If the root of another cluster in is adjacent to vertex in graph G, it becomes a child of vertex . Note that the clusters with so-defined roots will be merged with the former cluster C into a new larger cluster.
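The distinction between a minimal and a minimum dominating set can be checked mechanically on small graphs. The sketch below assumes an adjacency-set dictionary; the helper names and the example path are illustrative and do not come from the paper.

```python
from typing import Dict, Hashable, Set

Graph = Dict[Hashable, Set[Hashable]]  # hypothetical adjacency-set representation


def is_dominating(graph: Graph, subset: Set[Hashable]) -> bool:
    """Every vertex is in subset or has a neighbour in it."""
    return all(v in subset or bool(graph[v] & subset) for v in graph)


def is_minimal_dominating(graph: Graph, subset: Set[Hashable]) -> bool:
    """Dominating, and no single vertex can be dropped without losing domination."""
    if not is_dominating(graph, subset):
        return False
    return not any(is_dominating(graph, subset - {v}) for v in subset)


# Illustrative path graph 1 - 2 - 3 - 4 - 5.
path = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}

print(is_minimal_dominating(path, {2, 3, 4}))  # vertex 3 is redundant
print(is_minimal_dominating(path, {1, 3, 5}))  # minimal, yet larger than {2, 4}
print(is_minimal_dominating(path, {2, 4}))     # minimal and also minimum
```

The set `{1, 3, 5}` cannot be reduced by omitting any vertex, yet `{2, 4}` is smaller, so a minimal dominating set need not be a minimum one.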
We may observe that, in Procedure P0, the clusters are generated in depth (resulting in clusters with high depth). Dealing with clusters with high depth has some advantages but also disadvantages. In Procedure P0, each cluster is traversed in a bottom-up fashion from a leaf to the root. We will refer to a sequence of four or three consecutive vertices on a path in the currently purified tree as a quadruple or trio, respectively. In Procedure P0, a quadruple or a trio , forming sub-parts of the path, is identified. Given a quadruple , the vertices are purified, unless these vertices possess a semi-private neighbor, a vertex from set , which is the only (remaining) neighbor of the former vertex from set . For a trio, a similar condition is verified only for vertex b.
As described above, two vertices can be purified at once. However, this might not be possible, either because there may exist no quadruple (the upward paths might be too short) and/or because a candidate vertex may possess a semi-private neighbor. It might be beneficial to first set firm all vertices with a semi-private neighbor (a firm vertex remains unpurified in the resulting dominating set). It may also be beneficial to leave unpurified a vertex possessing a considerable number of immediate descendants without semi-private neighbors, in case all its immediate descendants are purified. Likewise, it might be good to leave a vertex unpurified if the number of its neighbors in is “large enough”. These and similar considerations are taken into account in the four new purification procedures.
The first three procedures P1–P3 set firm all vertices of forest T with private neighbors at iteration 0. If the so-formed set is dominating, all three procedures stop. Otherwise, three disjoint subsets of set S, forming a partition of S, are distinguished, as defined below.
- (1)
The set of firm vertices (as noted above, is the set of vertices from T with private neighbors; at iteration , is the set of vertices from T with semi-private neighbors).
- (2)
The set of the pending vertices formed by yet non-firm vertices which are neighbors of a firm vertex.
- (3)
Set consisting of the remaining yet unconsidered vertices.
Iteratively, once is determined, sets and are formed. The two purification parameters for each cluster vertex are defined below.
The
outer cover set of vertex
at iteration
h,
, is the set of neighbors of
v from
, which are not adjacent with any firm vertex from set
; i.e.,
The
inner cover set of vertex
at iteration
h,
, is the set of yet non-firm neighbors of vertex
v in
, i.e.,
Roughly, vertices with high
and
tend to be non-purified; vice versa, direct descendants of such a vertex with low outer and inner cover indices tend to be purified. An overall purification balance of vertex
v is
where
.
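The two cover sets can be sketched as follows. This is only an illustration under stated assumptions: the outer cover set is taken over the neighbors of $v$ outside the dominating set, the inner cover set over the neighbors of $v$ inside it, and the set of firm vertices is passed explicitly. The purification balance is not computed, and all names are hypothetical.

```python
from typing import Dict, Hashable, Set

Graph = Dict[Hashable, Set[Hashable]]  # hypothetical adjacency-set representation


def outer_cover_set(graph: Graph, dom_set: Set[Hashable],
                    firm: Set[Hashable], v: Hashable) -> Set[Hashable]:
    """Neighbours of v outside dom_set that are adjacent to no firm vertex (assumed domain)."""
    return {u for u in graph[v] if u not in dom_set and not (graph[u] & firm)}


def inner_cover_set(graph: Graph, dom_set: Set[Hashable],
                    firm: Set[Hashable], v: Hashable) -> Set[Hashable]:
    """Not yet firm neighbours of v inside dom_set (assumed domain)."""
    return {u for u in graph[v] if u in dom_set and u not in firm}


# Illustrative path graph 1 - 2 - 3 - 4 - 5, dominating set {2, 3, 4}, vertex 2 firm.
path = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
dom, firm = {2, 3, 4}, {2}

for v in (3, 4):
    print(v, outer_cover_set(path, dom, firm, v), inner_cover_set(path, dom, firm, v))
```

In this example vertex 4 still covers vertex 5, which no firm vertex reaches, whereas vertex 3 has an empty outer cover set and is therefore the natural candidate for purification.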
The distinguished features of our procedures are defined below.
Procedure P1. At iteration
, every cluster
is traversed in a bottom-up fashion. For every non-leaf vertex
x at a given level of cluster
C, if
x is firm, its children without semi-private neighbors are purified. If
x is not firm and has no firm children, it is set firm, and its children with no semi-private neighbors are purified, whereas the children with semi-private neighbors are set firm. If
x has a firm child, the parent of
x is set firm (unless it is already firm), and
x is purified (see Algorithm 3):
Algorithm 3 Procedure Purify 1 (P1) |
Input: Forest . Output: . {minimal dominating set} {set vertices with private neighbors} {set of vertices from T with private neighbors}; if is dominating set then ; else for any do {in order} ; ; {Travel by levels in ascending order} for any leaf set in level j of do if then children not a firm of x do not have semi-private neighbors}; else if x has non firm children then ; {children of x do not have semi-private neighbors}; else if parent of x is not in then {parent of ; ; else ; end if end if end if ; end for end for ; end if
|
Procedure P2. At any iteration
, a non-firm vertex
x with the maximum
is set firm (for each non-firm vertex
v, the sets
and
being updated, respectively). The procedure halts at iteration
h if
and outputs dominant set
(see Algorithm 4):
Algorithm 4 Procedure Purify 2 (P2) |
Input: Forest . Output: . {minimal dominating set} {set of vertices from T with private neighbors}; ; if is dominating set then ; else ; for any do {in order} while do v:= any vertex with the maximum PB in set ; ; ; ; end while if is dominating set then ; break; end if end for end if
|
Procedure P3. At any iteration
, a vertex
x with the minimum
is purified, and the vertices from
with semi-private neighbors from
are set firm (for each non-firm vertex
v, the sets OCS
and ICS
being updated, respectively). The procedure halts at iteration
h if
and outputs dominant set
(see Algorithm 5):
Algorithm 5 Procedure Purify 3 (P3) |
Input: Forest . Output: . {minimal dominating set} {set of vertices from T with private neighbors}; ; if is dominating set then ; else ; for any do {in order} while do v:= any vertex with the minimum PB in set ; ; all vertices in set that have a semi-private neighbor}; ; end while if is dominating set then ; break; end if end for end if
|
Note that, for the example in
Figure 2,
but
; hence, set
S can be purified. At iteration 0, Procedures P1–P3 set firm all vertices of forest
T with private neighbors. The vertices
and
have as private neighbors vertices 4 and 8, respectively; so,
. Since
is a dominating set,
, and the algorithm stops.
Procedure P4. (This procedure was essentially suggested by one of the anonymous referees of [22], whose contribution is gratefully acknowledged by the authors of this paper.) The
private neighborhood of vertex
is
Let
a and
b be adjacent vertices from set
, such that
a is a leaf vertex added to set
after vertex
b. Procedure P4 processes vertex
a first and then vertex
b, as follows (see Algorithm 6).
Algorithm 6 Procedure Purify 4 (P4) |
Input: A graph and dominating set S. Output: . {minimal dominating set} ; {vertices of S in the order they are added by Greedy.} ; {iterative step} while do if then ; {we purify vertex } end if ; end while
|
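The idea behind Procedure P4 can be illustrated with a short sketch. It is written under two assumptions that are not confirmed by the text: vertices are examined in the reverse of the order in which Greedy added them (so a later-added vertex is processed before an earlier one), and the private-neighborhood test of Algorithm 6 is replaced by a direct check that the set stays dominating after the removal. The function names are hypothetical.

```python
from typing import Dict, Hashable, List, Set

Graph = Dict[Hashable, Set[Hashable]]  # hypothetical adjacency-set representation


def is_dominating(graph: Graph, subset: Set[Hashable]) -> bool:
    """Every vertex is in subset or has a neighbour in it."""
    return all(v in subset or bool(graph[v] & subset) for v in graph)


def purify_in_reverse_order(graph: Graph, order: List[Hashable]) -> Set[Hashable]:
    """Drop each vertex, latest first, whenever the rest still dominates the graph."""
    kept = set(order)
    for v in reversed(order):
        if is_dominating(graph, kept - {v}):
            kept.discard(v)  # v is redundant, so it is purified
    return kept


# Illustrative path graph 1 - 2 - 3 - 4 - 5 with a non-minimal initial set.
path = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
initial_order = [2, 3, 4]  # assumed insertion order of the initial dominating set

purified = purify_in_reverse_order(path, initial_order)
print(purified)
```

Because a vertex is removed only when the remaining set is still dominating, and removals never make a previously needed vertex redundant, the returned set is a minimal dominating set; here it is `{2, 4}`.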
We use the example of
Figure 2 to demonstrate Procedure P4. Initially,
. In iterations
and
,
and
, respectively, so these vertices cannot be purified. In iteration
,
, so this vertex is purified,
and the procedure stops. Note that all four purification procedures find an optimal solution for our example.
6. Experimental Results
In this section, we describe our computational experiments. We implemented our algorithms in C++ under the 64-bit Windows 10 operating system on a personal computer with an Intel Core i7-9750H (2.6 GHz) and 16 GB of DDR4 RAM. We generated different sets of problem instances using different pseudo-random methods for the generation of graphs. The order (the number of vertices) and the size (the number of edges) of each instance were determined randomly using the function
. While creating a new edge, the corresponding pair of currently non-adjacent vertices was selected randomly, until the required number of edges was reached. For a given order, the corresponding size was determined according to the desired density,
We employed instances with densities from 0.2 to 0.9.
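The instance generation described above can be sketched as follows. This is not the authors' C++ generator: the density is assumed to be the usual ratio $2m/(n(n-1))$ for a simple graph with $n$ vertices and $m$ edges, the function name and the `seed` parameter are hypothetical, and connectivity of the result is not enforced.

```python
import random
from typing import Dict, Optional, Set


def random_graph(n: int, density: float, seed: Optional[int] = None) -> Dict[int, Set[int]]:
    """Random simple graph on vertices 0..n-1 with about density * n(n-1)/2 edges."""
    rng = random.Random(seed)
    target_edges = round(density * n * (n - 1) / 2)  # assumed density formula
    graph: Dict[int, Set[int]] = {v: set() for v in range(n)}
    edges = 0
    while edges < target_edges:
        u, w = rng.sample(range(n), 2)  # two distinct vertices
        if w not in graph[u]:           # only currently non-adjacent pairs add an edge
            graph[u].add(w)
            graph[w].add(u)
            edges += 1
    return graph


instance = random_graph(20, 0.5, seed=1)
edge_count = sum(len(nb) for nb in instance.values()) // 2
print(edge_count)
```

With 20 vertices and density 0.5 the target is 95 of the 190 possible edges. Rejection sampling of adjacent pairs keeps the code short, at the price of slower generation for densities close to 1.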
We analyzed 1316 benchmark instances from [26], where a complete summary of the results for these instances can also be found. Due to space limitations, here we summarize our results for 45 randomly selected sample instances in Tables 1 and 3, where
S denotes the initial dominant set found by [
22], and
is the purified dominating set returned by the corresponding purification procedure. The columns marked by % give the percentage by which the dominating set returned by each of the purification procedures P1–P4 is smaller than the dominating set returned by P0.
Table 1 below presents the percentage of the improvement by purification procedures P1–P4 for the 45 sample instances with sizes between 800 and 15,000.
Table 2 below presents the percentage of the improvement accomplished by purification procedures P1–P4, on average, for all the tested benchmark instances.
On average over all the 1316 tested instances, the best of the four solutions generated by purification procedures P1–P4 is 9.68% smaller than the dominating set returned by P0. Procedures P1–P4 turn out to be more effective for dense instances, with an improvement of about 40%, though in absolute terms the number of purified vertices is higher for non-dense instances.
Table 3 presents results for 45 sample benchmark instances from [26] for which the optimal solutions are known. This table compares the quality of the solutions obtained by purification procedures P1–P4 with the corresponding optimal solutions and upper bounds. The complete comparative analysis for all the 500 benchmark instances with known optimal solutions can be found in [26] and is also reflected in Figure 3. Over all the tested instances, our solutions contained, on average, about one-seventh of the number of vertices given by the best known upper bound $U$, an improvement of 85.71% (see Fact 1 at the end of this section). Remarkably, among the 500 instances with known optimum, an optimal solution is generated for 46.33% of the instances, whereas the average error over all the non-optimal solutions is 1.01 vertices.
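The 85.71% figure is consistent with the one-seventh ratio quoted above, as a quick check shows. Here $|S^*|$ is shorthand, introduced only for this check, for the size of the returned solution, and the ratio $1/7$ is treated as exact.

```latex
\[
\frac{|S^*|}{U} \approx \frac{1}{7}
\quad\Longrightarrow\quad
1-\frac{|S^*|}{U} \approx \frac{6}{7} \approx 0.8571 = 85.71\%.
\]
```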
As to the execution times, for the largest benchmark instances with 15,000 vertices, all four procedures finished in less than two seconds.
Figure 4 represents the execution times for all the tested instances.
The upper bound U is due to the following known results.
Fact 1 ([
4]
). If is a connected graph of order n, minimum degree and maximum degree , then , , and . Hence,is a valid upper bound on .