1. Introduction
The graph burning problem (GBP) is an NP-hard combinatorial optimization problem introduced in 2014 in the context of social contagion [
1]. This problem, concerned with the sequential spread of information over a graph, considers that information can be spread from different places and times [
1,
2]. In this paper, by graph we refer to an undirected simple graph [
3]. Computer network message propagation is an example of a real-world problem that the GBP may model; in this scenario, an initial spreading entity can send a message one host at a time, while these hosts can propagate the message only to their neighbors [
4,
5]. The GBP may also model other real-world problems, such as social contagion in social networks and the spread of viral infections in a highly idealized setting [
1,
6].
The GBP receives a graph $G=(V,E)$ as input, and its goal is to find a minimum-length sequence of vertices $({s}_{1},{s}_{2},\dots ,{s}_{k})$ that burns all the vertices of the graph by following the burning process. All vertices are unburned at the beginning, and once a vertex is burned, it remains in that state. The process consists of repeating the following steps for $i=1$ to $k$.
- (a) The neighbors of the burned vertices get burned.
- (b) Vertex ${s}_{i}$ gets burned.
Any sequence that burns all the vertices of the input graph by following the burning process is known as a burning sequence. Thus, the GBP seeks a burning sequence of minimum length; this length is known as the burning number and is denoted by
$b\left(G\right)$ [
1,
2].
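The burning process can be simulated directly from steps (a) and (b). The following is a minimal sketch, not the authors' implementation; the adjacency-set representation and the helper names `burn`, `is_burning_sequence`, and `path_graph` are assumptions made for illustration. It checks a candidate sequence on a nine-vertex path whose vertices are labeled 1 to 9.

```python
from typing import Dict, Hashable, Iterable, Set

# Assumed representation: adjacency sets of an undirected simple graph.
Graph = Dict[Hashable, Set[Hashable]]


def burn(graph: Graph, sequence: Iterable[Hashable]) -> Set[Hashable]:
    """Run the burning process and return the set of burned vertices."""
    burned: Set[Hashable] = set()
    for s in sequence:
        # Step (a): the neighbors of the burned vertices get burned.
        spread = {u for v in burned for u in graph[v]}
        # Step (b): vertex s_i gets burned.
        burned |= spread | {s}
    return burned


def is_burning_sequence(graph: Graph, sequence: Iterable[Hashable]) -> bool:
    """True if the sequence burns every vertex of the graph."""
    return burn(graph, sequence) == set(graph)


def path_graph(n: int) -> Graph:
    """Hypothetical helper: the path v1 - v2 - ... - vn, labeled 1..n."""
    return {i: {j for j in (i - 1, i + 1) if 1 <= j <= n} for i in range(1, n + 1)}


path9 = path_graph(9)
print(is_burning_sequence(path9, [3, 7, 9]))  # True
print(is_burning_sequence(path9, [3, 7]))     # False
```

The sketch only verifies a given sequence; finding a minimum-length one is the NP-hard part of the problem.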
Figure 1 shows the execution of the burning process over a nine-vertex path with optimal burning sequence $({v}_{3},{v}_{7},{v}_{9})$. Since this example is very simple, Figure 2 shows the optimal burning sequences of more interesting graphs. These optimal solutions were computed using the proposed mathematical formulations, which are introduced in Section 3. In all these graphs, the bigger vertices are the elements ${s}_{i}$ of the burning sequence, and the color assigned to each vertex is the color of the vertex ${s}_{i}$ with the smallest index that burns it.
Due to its NP-hard nature, the GBP has been approached chiefly through approximation algorithms and heuristics. Some of these proposals are based on centrality measures and binary search over the possible values of the burning number
$b\left(G\right)$. According to experimental results, these heuristics and approximation algorithms have an acceptable performance [
4,
5,
6,
7,
8]. However, they are not guaranteed to find optimal solutions. Therefore, there is a need for a mechanism for finding optimal solutions for arbitrary graphs. One such mechanism is a mathematical formulation, which can be solved using off-the-shelf optimization software. For this reason, this paper introduces three novel mathematical formulations for the GBP: an integer linear program (ILP) and two constraint satisfaction problems (CSP1 and CSP2). Given the NP-hard nature of the problem, solving these formulations can take exponential time, so there is a limit to their practicality. Through experimentation, we estimated such a limit.
The remaining part of the paper is organized as follows.
Section 2 presents the background of the problem and the main definitions used in this document.
Section 3 introduces the proposed mathematical formulations.
Section 4 presents an empirical performance comparison among these formulations using random graphs and off-the-shelf optimization software.
Section 5 reports optimal solutions found by the implemented CSP1 and CSP2 over some synthetic and real-world graphs using off-the-shelf optimization software. Finally,
Section 6 presents the concluding remarks.
2. Background
Let us begin by listing some basic definitions used throughout the paper. Observe that a graph is often called an undirected simple graph to distinguish it from directed graphs and multigraphs; however, in this paper, we only use the word graph to refer to this mathematical object.
Definition 1. Subsets with k elements are known as k-element subsets [3].
Definition 2. A graph $G=(V,E)$ is an ordered pair consisting of a set of vertices V and a set of edges E, where E contains 2-element subsets of V. The vertices and edges of any given graph J are also represented by $V\left(J\right)$ and $E\left(J\right)$, respectively [3].
Definition 3. Given a graph $G=(V,E)$, the distance $d(u,v)$ between vertices $u,v\in V$ is defined as the number of edges in their shortest path.
Definition 4. Given a graph $G=(V,E)$, the open neighborhood $N\left(v\right)$ of a vertex $v\in V$ is the set of vertices at distance one from v. Notice that $v\notin N\left(v\right)$.
Definition 5. Given a graph $G=(V,E)$, the closed neighborhood $N\left[v\right]$ of a vertex $v\in V$ is the set of vertices at distance at most one from v. In other words, $N\left[v\right]=N\left(v\right)\cup \left\{v\right\}$.
Definition 6. Given a graph $G=(V,E)$, the closed ${k}^{th}$ neighborhood ${N}_{k}\left[v\right]$ of a vertex $v\in V$ is the set of vertices at distance at most k from v. Notice that ${N}_{0}\left[v\right]=\left\{v\right\}$ and ${N}_{1}\left[v\right]=N\left[v\right]$.
Definition 7. A finite sequence with no repeated elements is a bijective function $f:S\to \{1,2,\dots ,\left|S\right|\}$, where S is the set of objects in the sequence and $\left|S\right|$ is its length. Notice that $f\left(v\right)$ is v’s position in the sequence and that each object in S is assigned to exactly one position.
The decision version of the GBP is NP-complete. This problem receives as input a graph $G=(V,E)$ and a positive integer k; it asks if a burning sequence of length at most k exists. The problem remains NP-complete when restricted to trees of maximum degree three, chordal graphs, bipartite graphs, planar graphs, spider graphs, and disconnected graphs [
2]. The optimization version of the problem also remains NP-hard for trees and graphs with disjoint paths [
7]. Regarding arbitrary graphs, two approximation algorithms are reported in the literature; they have an approximation factor of 3 and
$3-2/b\left(G\right)$, respectively [
7,
8]. There is a 2-approximation algorithm for trees, a 1.5-approximation algorithm for graphs with disjoint paths, and a 2-approximation algorithm for square grids [
6,
7]. For minimization problems, a
$\rho $-approximation algorithm returns solutions of size at most
$\rho \cdot OPT$, where
$OPT$ is the size of the optimal solution and
$\rho \ge 1$. In the case of maximization problems, a
$\rho $-approximation algorithm returns solutions of size at least
$(1/\rho )\cdot OPT$ [
9]. Besides approximation algorithms, some heuristics have been proposed too [
4,
5]; these are mainly based on centrality measures and binary search over the set of possible values of
$b\left(G\right)$.
The GBP has been approached mostly from a theoretical point of view. As a result, many of its properties over specific graph families have been identified. Among these families are paths [
2,
7], trees [
2,
7], grids [
6], intervals [
6], fences [
10], theta graphs [
11], spiders [
12], path-forests [
2,
7,
12], caterpillars [
13], products [
14], and generalized Petersen graphs [
15]. Some of the main properties of the GBP are the following. All paths and cycles
G of order
n have
$b\left(G\right)=\lceil {n}^{1/2}\rceil $, all graphs
G with a Hamiltonian path have
$b\left(G\right)\le \lceil {n}^{1/2}\rceil $ [
1,
2], all spiders and caterpillars
G have
$b\left(G\right)\le \lceil {n}^{1/2}\rceil $ [
12,
13], all complete graphs
G of order at least two have
$b\left(G\right)=2$, and all perfect binary trees
G of depth
r have
$b\left(G\right)=r+1$. Based on these properties, a conjecture on the upper bound of the burning number of connected graphs was formulated by Bonato et al. [
1]:
Conjecture 1. Every connected graph G of order n has burning number $b\left(G\right)\le \lceil {n}^{1/2}\rceil $.
This conjecture, known as the burning number conjecture (BNC), is one of the most important open questions in the area. To date, the best-known bound for the burning number of arbitrary connected graphs is
$b\left(G\right)\le \lceil {(4\cdot n/3)}^{1/2}\rceil +1$ [
16]. From the BNC, Conjecture 2 for disconnected graphs follows.
Conjecture 2. The burning number $b\left(G\right)$ of a disconnected graph $G=(V,E)$ with p connected components $\{{H}_{1},{H}_{2},\dots ,{H}_{p}\}$ is at most ${\sum}_{i=1}^{p}\lceil {|V\left({H}_{i}\right)|}^{1/2}\rceil $.
If Conjecture 1 is true, Conjecture 2 can be proved by observing the following facts. The concatenation of the optimal burning sequences of the components ${H}_{i}$ is a burning sequence for the whole graph G. This is because concatenation does not reduce the burning capacity of any vertex. So, $b\left(G\right)$ is upper bounded by the length of the described concatenation. Assuming the BNC is true, $b\left({H}_{i}\right)\le \lceil {|V\left({H}_{i}\right)|}^{1/2}\rceil $, where $|V\left({H}_{i}\right)|$ is the number of vertices in the connected component ${H}_{i}$. Therefore,
$b\left(G\right)\le {\sum}_{i=1}^{p}b\left({H}_{i}\right)\le {\sum}_{i=1}^{p}\lceil {|V\left({H}_{i}\right)|}^{1/2}\rceil $.
Independently of the BNC, from the best-known bound on the burning number of arbitrary connected graphs, we can prove the following lemma.
Lemma 1. The burning number $b\left(G\right)$ of a disconnected graph $G=(V,E)$ with p connected components $\{{H}_{1},{H}_{2},\dots ,{H}_{p}\}$ is at most $p+{\sum}_{i=1}^{p}\lceil {(4\cdot |V\left({H}_{i}\right)|/3)}^{1/2}\rceil $.
Proof. If we concatenate the optimal burning sequences of each
${H}_{i}$ component, the resulting sequence is a burning sequence for the whole graph
G. Since the concatenation does not reduce the burning capacity of any vertex,
$b\left(G\right)$ is upper bounded by the length of this concatenation.
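Then, since $b\left({H}_{i}\right)\le \lceil {(4\cdot |V\left({H}_{i}\right)|/3)}^{1/2}\rceil +1$ [16],
$b\left(G\right)\le {\sum}_{i=1}^{p}b\left({H}_{i}\right)\le {\sum}_{i=1}^{p}\left(\lceil {(4\cdot |V\left({H}_{i}\right)|/3)}^{1/2}\rceil +1\right)=p+{\sum}_{i=1}^{p}\lceil {(4\cdot |V\left({H}_{i}\right)|/3)}^{1/2}\rceil $. □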
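The bounds discussed in this section are easy to evaluate numerically. The sketch below is illustrative only: the function names and the component orders in the example are assumptions, not values from the paper. It uses integer arithmetic to avoid floating-point rounding when computing $\lceil {x}^{1/2}\rceil $.

```python
from math import isqrt
from typing import Sequence


def ceil_sqrt(m: int) -> int:
    """Smallest integer k with k * k >= m, for an integer m."""
    return 0 if m <= 0 else isqrt(m - 1) + 1


def bnc_bound(n: int) -> int:
    """Upper bound ceil(n^(1/2)) conjectured by the BNC for connected graphs."""
    return ceil_sqrt(n)


def best_known_bound(n: int) -> int:
    """Best-known bound ceil((4n/3)^(1/2)) + 1 for connected graphs."""
    # ceil(sqrt(x)) equals ceil_sqrt(ceil(x)) because k * k is an integer.
    return ceil_sqrt((4 * n + 2) // 3) + 1


def conjecture2_bound(orders: Sequence[int]) -> int:
    """Sum of the BNC bounds over the connected components (Conjecture 2)."""
    return sum(bnc_bound(n) for n in orders)


def lemma1_bound(orders: Sequence[int]) -> int:
    """p plus the sum of ceil((4 * |V(H_i)| / 3)^(1/2)) (Lemma 1)."""
    return len(orders) + sum(ceil_sqrt((4 * n + 2) // 3) for n in orders)


orders = [9, 12]  # hypothetical orders of two connected components
print(conjecture2_bound(orders))  # 3 + 4 = 7
print(lemma1_bound(orders))       # 2 + 4 + 4 = 10
```

For this made-up example, the conjectured bound is 7 while the proven bound of Lemma 1 is 10, which shows how much slack the best-known result leaves.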
Thanks to the best-known bounds on the burning number, the size of the explored search space may be reduced. Consistent with this observation, the proposed formulations tend to be solved faster when tighter lower and upper bounds on the burning number are available. Of course, feasible solutions returned by heuristics and approximation algorithms might help find better lower and upper bounds. We used such an approach in
Section 5 to solve the problem optimally over some benchmark graphs.
The GBP resembles other NP-hard problems, such as the vertex
k-center problem (VKCP) and the firefighter problem (FP). The VKCP consists of finding the best location for a set of
$k\in {\mathbb{Z}}^{+}$ centers, where such locations are the ones that minimize the maximum distance a
customer has to travel to its nearest center [
17,
18,
19]. Although the VKCP and the GBP are different, their approximation algorithms are conceptually similar [
7,
8,
19]. This similarity comes from the fact that the VKCP has a polynomial-time reduction to the minimum dominating set problem, which can be viewed as the problem of burning all vertices in parallel in one single step [
19,
20,
21,
22]. Regarding the FP, it aims at protecting vertices from burning given an initial set of
fire sources [
23,
24,
25]. Although GBP and FP have different goals, the latter’s integer linear program (ILP) inspired us to define an ILP for the GBP.
To end this section, notice that the GBP can be stated in different ways. For instance, it can be formulated in terms of the burning process. However, it can be formulated as a covering problem too [
1,
2,
6,
13]:
Definition 8. Given a simple graph $G=(V,E)$, the GBP consists of finding a minimum cardinality set $S\subseteq V$ and a bijective function $f:S\to \{1,2,\dots ,\left|S\right|\}$ such that Equation (6) holds, where $b\left(G\right)=\left|S\right|$, and ${N}_{b\left(G\right)-f\left(v\right)}\left[v\right]$ is the closed ${(b\left(G\right)-f\left(v\right))}^{th}$ neighborhood of v.
${\bigcup}_{v\in S}{N}_{b\left(G\right)-f\left(v\right)}\left[v\right]=V$ (6)
By Definition 8, the GBP is a covering problem that seeks an optimal burning sequence with no repeated elements. Namely, it consists of finding a minimum-length sequence $({s}_{1},{s}_{2},\dots ,{s}_{b\left(G\right)})$ that covers all the vertices of the graph:
${\bigcup}_{i=1}^{b\left(G\right)}{N}_{b\left(G\right)-i}\left[{s}_{i}\right]=V$.
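The covering view can be checked computationally: a sequence of length b is a burning sequence when the closed neighborhoods ${N}_{b-i}\left[{s}_{i}\right]$ jointly contain every vertex. The following sketch assumes an adjacency-set representation and uses hypothetical helper names; it computes each neighborhood with a breadth-first search truncated at the required depth.

```python
from collections import deque
from typing import Dict, Hashable, Sequence, Set

Graph = Dict[Hashable, Set[Hashable]]  # assumed adjacency-set representation


def closed_neighborhood(graph: Graph, v: Hashable, k: int) -> Set[Hashable]:
    """N_k[v]: the vertices at distance at most k from v."""
    dist = {v: 0}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        if dist[u] == k:
            continue  # do not expand beyond depth k
        for w in graph[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return set(dist)


def covers(graph: Graph, sequence: Sequence[Hashable]) -> bool:
    """True if the union of N_{b-i}[s_i] over i = 1..b equals V."""
    b = len(sequence)
    covered: Set[Hashable] = set()
    for i, s in enumerate(sequence, start=1):
        covered |= closed_neighborhood(graph, s, b - i)
    return covered == set(graph)


# Nine-vertex path with vertices labeled 1..9.
path9 = {i: {j for j in (i - 1, i + 1) if 1 <= j <= 9} for i in range(1, 10)}
print(covers(path9, [3, 7, 9]))  # True
print(covers(path9, [9, 7, 3]))  # False: the order of the sequence matters
```

The second call illustrates why the bijection f matters: the same three vertices in a different order no longer cover the path.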
5. Computing Optimal Solutions
The previous section suggests that CSP2 + BS is better suited for solving graphs with a relatively large burning number, whereas CSP1 + BS is better suited for graphs with a relatively small burning number. In this section, we executed CSP1 + BS and CSP2 + BS over synthetic and real-world benchmark graphs (see
Table 3); most of these were taken from the network repository [
29] and the Stanford large network dataset collection (SNAP) [
30]. Since this experimentation aims to find optimal burning sequences, we set the lower and upper bounds to the tightest known. To find the upper bound, we executed three state-of-the-art heuristics: improved cutting corners heuristic (ICCH), backbone-based greedy heuristic (BBGH), and component-based recursive heuristic (CBRH) [
5]; their authors kindly provided the implementation of these. We also executed the BFF and BFF+ approximation algorithms, where BFF+ returns the best possible solution BFF can find. To set the lower bound, we exploited that all solutions generated by the BFF algorithm have a length of at most
$3\cdot b\left(G\right)-2$ [
8]. Therefore, we computed the lower bound with
$L=\lceil (worst+2)/3\rceil $, where
$worst$ is the worst solution BFF can return. Since
$b\left(G\right)$ is unknown, the time reported is the time to proven optimality (
${t}_{p}$).
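The way the bounds narrow the search can be sketched as follows. The lower bound is the formula $L=\lceil (worst+2)/3\rceil $ from the text; everything else is an assumption for illustration. In particular, the brute-force feasibility oracle is only a stand-in for solving CSP1 or CSP2 with an optimization solver, and the values of `worst` and `upper` are made up.

```python
from itertools import permutations
from typing import Callable, Dict, Hashable, Sequence, Set

Graph = Dict[Hashable, Set[Hashable]]  # assumed adjacency-set representation


def lower_bound_from_bff(worst: int) -> int:
    """L = ceil((worst + 2) / 3), which follows from worst <= 3 * b(G) - 2."""
    return -(-(worst + 2) // 3)


def burns_all(graph: Graph, sequence: Sequence[Hashable]) -> bool:
    """Simulate the burning process and test whether every vertex burns."""
    burned: Set[Hashable] = set()
    for s in sequence:
        burned |= {u for v in burned for u in graph[v]} | {s}
    return burned == set(graph)


def brute_force_oracle(graph: Graph) -> Callable[[int], bool]:
    """Stand-in for a CSP solver: exhaustive search, tiny graphs only."""
    def feasible(length: int) -> bool:
        length = min(length, len(graph))
        return any(burns_all(graph, seq) for seq in permutations(graph, length))
    return feasible


def smallest_feasible(lower: int, upper: int, feasible: Callable[[int], bool]) -> int:
    """Binary search for the smallest B in [lower, upper] with feasible(B).

    Assumes feasibility is monotone in B and that feasible(upper) is True.
    """
    while lower < upper:
        mid = (lower + upper) // 2
        if feasible(mid):
            upper = mid
        else:
            lower = mid + 1
    return lower


path9 = {i: {j for j in (i - 1, i + 1) if 1 <= j <= 9} for i in range(1, 10)}
worst, upper = 5, 4  # made-up approximation and heuristic solution lengths
lower = lower_bound_from_bff(worst)
print(lower, smallest_feasible(lower, upper, brute_force_oracle(path9)))  # 3 3
```

With tight bounds the binary search needs only a few oracle calls, which matters because each call is an expensive solver run in the actual experiments.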
From the previous set of experiments, it seems likely that CSP1 + BS would be very inefficient over graphs with a burning number of seven or more. Therefore, we executed CSP1 + BS over graphs with an upper bound on the burning number of six or less. Furthermore, since CSP1 + BS has lower memory requirements than CSP2 + BS, we could execute the former over graphs of order up to 5908. Regarding CSP2 + BS, we executed it over graphs with an upper bound on the burning number of at least seven and order at most 1458 because Gurobi’s branch-and-bound algorithm exhausted all available memory when executed over bigger graphs. The results reported in
Table 3 confirm the optimality of most previously known solutions. Furthermore, the optimal solutions of some graphs are reported for the first time (ca-netscience, web-polblogs, DD68, DD199, DD497, lattice2D, and tech-routers-rf). The TVshow graph could not be solved within ∼48 h using CSP1 + BS; this is consistent with the observation that CSP1 + BS does not seem adequate for solving graphs with a relatively
large burning number. From
Table 3, we also observe that synthetic graphs with a well-defined structure seem more challenging to solve. For instance, lattice3D required 150,000 s (∼42 h) to be solved using CSP2 + BS. To find optimal solutions for challenging graphs, we executed CSP1 + BS and CSP2 + BS over some grid graphs (see
Table 4). As before, we executed CSP1 + BS and CSP2 + BS over graphs with a known upper bound of at most six and at least seven, respectively. From
Table 4, we can observe that the implemented formulations found twelve optimal solutions that the state-of-the-art heuristics could not find.
6. Discussion
This paper introduces three novel mathematical formulations for the GBP: an ILP and two CSPs (CSP1 and CSP2). Since CSP1 and CSP2 require the burning number in advance, they are integrated into a binary search procedure (CSP1 + BS and CSP2 + BS); this way, the issue of not knowing the burning number in advance is lessened. All these formulations can be solved over arbitrary graphs thanks to off-the-shelf optimization software.
Section 4 shows a series of experiments that validate the correct implementation of the proposed formulations. This same section presents an empirical performance comparison among them using random graphs. From these experiments, we observe that CSP2 + BS tends to be better suited for graphs with a relatively large burning number (we empirically estimated this value as seven or more). From these same experiments, we observe that CSP1 + BS seems to be a better choice for solving graphs with a relatively small burning number (we empirically estimated this value as six or less). Of course, these observations cannot be generalized. Thus, rigorous statistical analysis should be performed in the future.
In
Section 5, we used CSP1 + BS and CSP2 + BS to compute the optimal solution for some benchmark connected graphs of order at most 5908 (it is worth mentioning that, in all experiments, the BNC holds). From these, we found seven previously unknown optimal solutions. We could not apply CSP2 + BS over graphs of order greater than 1458 because the memory requirements grew beyond our hardware’s capacity. Regarding CSP1 + BS, it solved graphs with a relatively small burning number and order at most 5908. The obtained set of optimal solutions can be helpful as a benchmark dataset for comparing non-exact algorithms for the GBP, i.e., approximation algorithms, heuristics, and metaheuristics.
Finally, as part of the reviewing process, a reviewer pointed to the possibility of reducing the number of variables and constraints in CSP2 by removing variables ${x}_{i}$ and ${y}_{i,j}$. However, we believe such variables cannot be removed because they let us guarantee that the solution has length B and that all vertices are burned. Nevertheless, we agree that mathematical formulations with fewer variables and constraints might exist. Thus, we will seek alternative mathematical formulations of the problem in future work.