Iterated Clique Reductions in Vertex Weighted Coloring for Large Sparse Graphs

The Minimum Vertex Weighted Coloring (MinVWC) problem is an important generalization of the classic Minimum Vertex Coloring (MinVC) problem, which is NP-hard. Given a simple undirected graph G = (V, E), the MinVC problem is to find a coloring s.t. any pair of adjacent vertices are assigned different colors and the number of colors used is minimized. The MinVWC problem associates each vertex with a positive weight and defines the weight of a color to be the weight of its heaviest vertex; the goal is then to find a coloring that minimizes the sum of weights over all colors. Among various approaches, reduction is an effective one. It tries to obtain a subgraph whose optimal solutions can conveniently be extended into optimal ones for the whole graph, without costly branching. In this paper, we propose a reduction algorithm based on maximal clique enumeration. More specifically, our algorithm utilizes a certain proportion of maximal cliques and obtains lower bounds in order to perform reductions. It alternates between clique sampling and graph reductions and consists of three successive procedures: promising clique reductions, better bound reductions and post reductions. Experimental results show that our algorithm returns considerably smaller subgraphs for numerous large benchmark graphs, compared to the most recent method named RedLS. Also, we evaluate individual impacts and some practical properties of our algorithm. Furthermore, we have a theorem which indicates that the reduction effects of our algorithm are equivalent to those of a counterpart which enumerates all maximal cliques in the whole graph, if the run time is sufficiently long.


Introduction
Below we will introduce the MinVWC problem, current reduction approaches, and our proposed approach, together with some high-level motivation and comparisons.

The Problem
Given a simple undirected graph G = (V, E), a feasible coloring for G is an assignment of colors to V s.t. any pair of adjacent vertices are assigned different colors. Formally, a feasible coloring S for G = (V, E) is defined as a partition S = {V_1, ..., V_k} of V s.t. V_i ≠ ∅ for any 1 ≤ i ≤ k, V_i ∩ V_j = ∅ for any 1 ≤ i ≠ j ≤ k, ∪_{i=1}^{k} V_i = V, and for any edge {u, v} ∈ E, u and v are not in the same vertex subset V_i where 1 ≤ i ≤ k. Notice that k is unknown until we find a feasible coloring. In the Minimum Vertex Weighted Coloring (MinVWC) problem, each vertex is associated with a positive weight, i.e., there is an additional weighting function w : V → Z^+, and the goal is to find a feasible coloring that minimizes cost(S, G) = Σ_{i=1}^{k} max_{v∈V_i} w(v). Obviously, an instance of the NP-hard MinVC problem can conveniently be reduced to an instance of the MinVWC problem by associating a weight of 1 with each vertex. As a result, the MinVWC problem is also NP-hard [1,2]. This problem arises in several applications like traffic assignment [3,4], manufacturing [5], scheduling [6], etc. Up to now, there are two types of algorithms for this problem: complete algorithms [3,7,8] and incomplete ones [4,9,10].
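As a concrete check of these definitions, here is a minimal sketch in Python (the function names is_feasible and cost are ours, not from the paper); a coloring is represented as a list of color classes, i.e., sets of vertices:

```python
# A minimal sketch of the definitions above (is_feasible, cost are our names,
# not from the paper): a coloring is a list of color classes (sets of vertices).

def is_feasible(color_classes, edges):
    """Feasible iff no edge has both endpoints in the same color class."""
    return all(not (u in cls and v in cls)
               for cls in color_classes for (u, v) in edges)

def cost(color_classes, weights):
    """Each color costs the weight of its heaviest vertex; sum over colors."""
    return sum(max(weights[v] for v in cls) for cls in color_classes)

# Example: path a-b-c with weights 3, 5, 2.
edges = [("a", "b"), ("b", "c")]
weights = {"a": 3, "b": 5, "c": 2}
S = [{"a", "c"}, {"b"}]              # a and c may share a color
print(is_feasible(S, edges))         # True
print(cost(S, weights))              # max(3, 2) + 5 = 8
```

Note how a unit-weight instance makes cost equal to the number of colors used, which is exactly the reduction from MinVC mentioned above.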

Current Reduction Approaches
In MinVC solving, a clique provides a lower bound for reductions because any two vertices in a clique cannot have the same color. In MinVWC solving, a clique is also able to do so, as can be found in the most recent reduction method RedLS published in [11]. Roughly, it is desirable to have cliques in hand that are of great sizes and in which each vertex has a big weight. So one may think that we can call an incomplete maximum vertex weight clique solver like [12,13] to obtain a list of optimal or near-optimal cliques. Such examples can be found in the state-of-the-art method RedLS. In detail, RedLS first performs reduction to obtain a reduced subgraph and then does a local search on that subgraph. In this paper, we will abuse the name RedLS to refer to its reduction component as well. As to its reduction component, RedLS first samples a proportion of vertices, and for each of them, namely v, it tries to find one maximum or near-maximum vertex weight clique that contains v. Second, it combines such cliques to obtain a 'relaxed' partition set and applies this set for reductions. In a nutshell, the reduction method of RedLS performs clique sampling and graph reduction successively without interleaving, which we believe is not so flexible and may miss a few promising cliques and bounds.

Our Approach
We do not believe that sampling maximum or near-maximum vertex weight cliques is a perfect approach for clique reductions. In fact, there are two types of cliques that may not have great total vertex weights but are still useful: those only with big sizes and those only with high-weight vertices, because they also contribute to a bound. Actually, solving MinVWC requires diversification; to be specific, a list of cliques that vary in both sizes and vertex weight distributions is preferred. If we call a maximum vertex weight clique solving procedure, we may finally obtain a list of cliques that lack such diversification, which results in relatively ineffective reductions. Therefore, in this paper, we abandon such an approach and instead enumerate diverse cliques. In this sense, enumerating all maximal cliques in the input graph seems to be a good choice; however, doing so may be costly and thus infeasible even in sparse graphs, so we develop an algorithm that only enumerates a certain proportion of them but leads to equally effective reductions as the counterpart which enumerates all of them, if our algorithm completes.
Recently, complex networks have presented a number of applications like cloud computing [14,15], so research about vertex-weighted coloring in large complex networks is capturing great interest. In this paper, we will present a reduction algorithm that processes large sparse graphs in order to speed up current MinVWC solving. Roughly speaking, it alternates between clique sampling and graph reductions. In a graph reduction procedure, it obtains a subgraph whose optimal solutions can be extended into optimal ones for the whole graph, and we call this subgraph a VWC-reduced subgraph (Vertex Weighted Coloring-reduced subgraph). Since most large sparse graphs obey the power-law distribution [16,17], they can be reduced considerably by cliques of a certain quality. On the other hand, a smaller graph presents a smaller search space, and the algorithm may find better cliques more easily, which can then be used for further reductions.
Our algorithm consists of three successive procedures. Firstly, we collect vertices that have maximum degrees or weights and enumerate all maximal cliques containing them. Each time we find a maximal clique, we check whether it leads to further reductions and do so immediately if possible. Secondly, we systematically look for cliques that can trigger more effective reductions. As in the previous procedure, we will perform reductions immediately once we have found such a clique. Thirdly, we perform clique reductions which are ignored in the first two procedures. We evaluated our algorithm on a list of large sparse graphs that were accessed via http://networkrepository.com/ on 1 January 2018, and compared its performance with RedLS. Experimental results show that our reduction algorithm often returns subgraphs that are considerably smaller than those obtained by RedLS. Also, we evaluated the individual impacts of the three procedures above and found that they all had significant contributions. Furthermore, our algorithm was able to confirm that it had found the best bound on a list of benchmark graphs. Last, we have a theorem that indicates that although our algorithm only samples a certain proportion of maximal cliques in the whole graph, its reduction effects are equivalent to those of a counterpart that enumerates all of them in the whole graph, given sufficient run time.

Preliminaries
In what follows, we suppose a vertex weighted graph G = (V, E, w(·)) with w : V → Z^+ being a weighting function. If e = {u, v} is an edge of G, we say that u and v are adjacent/connected and thus neighbors. Given a vertex v, we define the set of its neighbors, denoted by N(v), as {u ∈ V | {u, v} ∈ E}, and we use d(v) = |N(v)| to denote the degree of v. A clique C ⊆ V is a vertex subset s.t. any two vertices in C are mutually connected. A clique is said to be maximal if it is not a subset of any other clique. By convention, we define the size of a clique C, denoted by |C|, to be the number of vertices in it. Given a graph G and a vertex subset V' ⊆ V, we use G[V'] to denote the subgraph of G which is induced by V', i.e., G[V'] = (V', E') where E' = {{u, v} ∈ E | u, v ∈ V'}. Given a graph G, we use V(G) and E(G) to denote the set of vertices and edges of G, respectively.
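The preliminary notions above can be made concrete with a few helpers (the names are ours; adjacency is stored as a dict from vertex to its neighbor set):

```python
# Minimal helpers matching the preliminaries above; names are ours.

def induced_subgraph(edges, sub):
    """G[V']: keep exactly the edges with both endpoints in V'."""
    sub = set(sub)
    return [(u, v) for (u, v) in edges if u in sub and v in sub]

def is_clique(C, adj):
    """Any two vertices in C are mutually connected."""
    return all(v in adj[u] for u in C for v in C if u != v)

def is_maximal_clique(C, adj, vertices):
    """Maximal: no vertex outside C is adjacent to all of C."""
    return is_clique(C, adj) and not any(
        all(u in adj[v] for u in C) for v in set(vertices) - set(C))

adj = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
print(is_maximal_clique({"a", "b", "c"}, adj, adj))   # True
print(is_maximal_clique({"a", "b"}, adj, adj))        # False: c extends it
```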
In the following, for the ease of discussion, we generalize the notion of a coloring and allow it to color vertices not in V, so a coloring has now been redefined as a set of pairs S = {⟨1, V_1⟩, ..., ⟨k, V_k⟩} where the V_i's are non-empty and pairwise disjoint vertex subsets that may contain vertices outside V. Then we say that V_1, ..., V_k are color classes and we redefine cost(S, G) as Σ_{i=1}^{k} max_{v∈V_i∩V} w(v). Obviously, according to the new definitions, one coloring can have several representations, e.g., {⟨1, U⟩, ⟨2, V⟩, ⟨3, W⟩} and {⟨1, W⟩, ⟨2, V⟩, ⟨3, U⟩} represent the same coloring.
Given a graph G, we use S|_G to denote a certain coloring for it. Then Proposition 1 below shows that given any feasible coloring, its cost on any induced subgraph does not exceed that on the whole graph.

Proposition 1. Given a graph G = (V, E, w(·)), a vertex subset V' ⊆ V and a feasible coloring S for G, we have cost(S, G[V']) ≤ cost(S, G).

Proof. See Appendix A.
Throughout this paper, when we say an optimal coloring/solution, we mean a feasible coloring/solution with the minimum cost. Given a vertex u, we use c_u to denote u's color. In addition, we use c_u ← j to denote the operation which assigns u the color j, so c_u ← c_v assigns u a color equal to that of v, i.e., it puts u in the same vertex subset as v.
Given a tuple t = ⟨x_1, ..., x_l⟩, we use |t| to denote the number of components in t, so |t| = l. For ease of expression, if t is an empty tuple, we define |t| to be 0. Given a map f : X → Y and an element x ∈ X, if y = f(x), then we say that y is x's image under f, or simply say f(x) is x's image. Such notions will be useful when we discuss the removal of vertices in clique reductions. Finally, given vertices u and v, we say that u is heavier (resp. lighter) than v if w(u) > w(v) (resp. w(u) < w(v)).

A Reduction Framework
Below we will present notions that are related to graph reductions for the MinVWC problem. The first is an extension to a coloring, which relates solutions for a subgraph to those for the whole graph.
We also define S as an extension of itself. So an extension to S will not change the color of any vertex that has already been colored before. Instead, it will put a new vertex into one of the k existing vertex partitions if 1 ≤ j ≤ k, or into a new one if j = k + 1. Obviously, given two operations a_1 and a_2 for extensions, we have (S ⊎ a_1) ⊎ a_2 = (S ⊎ a_2) ⊎ a_1, so the order of the operations does not matter.
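The extension operation described above can be sketched as follows (extend is our name; a coloring is a list of color classes indexed from 1):

```python
# A sketch of the extension operation described above (extend is our name):
# an extension never recolors an already-colored vertex; it puts a new vertex
# into an existing class V_j (1 <= j <= k) or opens the single new class k+1.

def extend(classes, x, j):
    k = len(classes)
    assert 1 <= j <= k + 1
    new = [set(cls) for cls in classes]   # copy: extensions do not mutate S
    if j == k + 1:
        new.append(set())
    new[j - 1].add(x)
    return new

S = [{"a"}, {"b"}]
# The order of independent extension operations does not matter:
A = extend(extend(S, "x", 1), "y", 2)
B = extend(extend(S, "y", 2), "x", 1)
print(A == B)        # True: both are [{'a', 'x'}, {'b', 'y'}]
```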
Given a set A = {a_1, ..., a_n}, we use S ⊎ A to denote S ⊎ a_1 ⊎ ... ⊎ a_n, and we also say S ⊎ A is an extension to S. Last, if S ⊎ A or S ⊎ (c_x ← j) is a feasible coloring for G, then we say that S ⊎ A or S ⊎ (c_x ← j) is a feasible extension to S for G. Below we have a proposition that will be useful in proving other later propositions.
The proposition below illustrates that extending a coloring will not decrease its cost.

Proof. See Appendix B.
Next, we define a type of subgraphs whose feasible solutions can be extended into feasible ones for the whole graph with the same cost.
This notion of VWC-reduced subgraph has two nice properties, which are shown in Propositions 3 and 4 below. In detail, Proposition 3 shows that the relation of the VWC-reduced subgraph is transitive, so we can compute a VWC-reduced subgraph in an iterative way.
Proposition 4 shows that in order to find an optimal solution for G, we can first find an optimal solution for its VWC-reduced subgraphs.

Proposition 4. Suppose G[U] is a VWC-reduced subgraph of G. Then:
1. given any optimal feasible solution S*|_{G[U]} for G[U], there exists an extension to S*|_{G[U]} which is an optimal solution for G;
2. given any non-optimal feasible solution S|_{G[U]} for G[U], there exists no extension to S|_{G[U]} that is an optimal solution for G.
Proof. See Appendix D.
These propositions allow our algorithms to interleave between clique sampling and graph reduction, which is different from the approach in RedLS [11] yet similar to that in FastWClq [12]. This is why we titled this paper 'iterated clique reductions'.
In what follows we will introduce a general principle for computing VWC-reduced subgraphs.

Clique Reductions
Below we will utilize the notion of VWC-reduced subgraph to introduce clique reductions, which were initially proposed in [11]. First, we introduce the notion of absorb, which illustrates that a vertex's close neighborhood is a weak sub-structure of a clique.

Definition 3. Given a vertex u and a clique C = {v_1, ..., v_{|C|}} with w(v_1) ≥ ... ≥ w(v_{|C|}), if |C| > d(u) and w(v_{d(u)+1}) ≥ w(u), then we say that u is absorbed by C.
Note that the condition |C| > d(u) guarantees that w(v_{d(u)+1}) always exists. Also notice that [11] did not allow the equality in w(v_{d(u)+1}) ≥ w(u) to hold, but we extend their statements slightly. In particular, when u is absorbed by C, there exists a map ξ from {u} ∪ N(u) to C s.t.:

1. w(u) ≤ w(ξ(u)), that is, u is no heavier than its image under ξ;
2. for any x ∈ N(u), w(ξ(x)) ≥ w(ξ(u)), that is, images of u's neighbors are no lighter than that of u, or we may roughly say that u's image is the lightest compared to those of its neighbors.

Consider G_1 in Example 1: even though we assign z^2_4 the same color as that of z^3_1, the lightest vertex in C, the cost of that coloring will not increase. So we can now simply ignore z^2_4 and later assign it an existing color after all its neighbors have been colored, depending on its weight as well as its neighbors' colors. Obviously, this is a feasible extension that does not increase the cost of a coloring. Therefore, G_1[V\{z^2_4}] is a VWC-reduced subgraph of G_1.
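The absorption test of Definition 3 can be sketched directly (is_absorbed is our name; clique weights are examined in non-increasing order and d(u) is u's current degree):

```python
# A sketch of the absorption test in Definition 3 (is_absorbed is our name):
# u is absorbed by C iff |C| > d(u) and the (d(u)+1)-th heaviest vertex of C
# is at least as heavy as u.

def is_absorbed(u, clique, weights, adj):
    d = len(adj[u])                  # d(u)
    if len(clique) <= d:             # need |C| > d(u) so v_{d(u)+1} exists
        return False
    ws = sorted((weights[v] for v in clique), reverse=True)
    return ws[d] >= weights[u]       # w(v_{d(u)+1}) >= w(u)

adj = {"u": {"a"}}                   # d(u) = 1
weights = {"u": 5, "a": 6, "b": 5, "c": 4}
print(is_absorbed("u", {"a", "b", "c"}, weights, adj))   # True: w(v_2) = 5 >= 5
```

The final comparison uses >= because, as noted above, this paper slightly extends [11] by allowing equality to hold.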
In general, we have a proposition below [11]. From the propositions above, we can see that whether a vertex can be removed to obtain a VWC-reduced subgraph or not depends on the quality of the cliques in hand. Below we define a partial order ⪯ between cliques which indicates whether the vertices absorbed by one clique are a subset of those absorbed by the other.
So if C_x ⪯ C_y, then C_y leads to reductions that are at least as effective as those resulting from C_x. In what follows, if C_x ⪯ C_y, we say that C_x is subsumed by C_y. Obviously, we have a proposition below which shows that the relation ⪯ is transitive.
Then we have two propositions which show that if C_x ⪯ C_y, then we can keep C_y and ignore C_x.

Proposition 8. Suppose u is a vertex and C_x, C_y are cliques s.t. u ∉ C_x ∪ C_y and C_x ⪯ C_y. Then if u is absorbed by C_x, it is also absorbed by C_y.
The proposition below states that if there occur reductions among C_x, C_y and their vertices where C_x ⪯ C_y, then keeping C_y is at least as good as keeping C_x.
So if we utilize C_x and C_y to perform clique reductions where C_x ⪯ C_y, we can simply ignore C_x and keep C_y.

A State-of-the-Art Reduction Method
To date, as far as we know, the only work on reductions for vertex weighted coloring is RedLS [11], which constructs promising cliques like FastWClq [12] and combines these cliques in an appropriate way to obtain a 'relaxed' partition set. Then it utilizes this set to perform reductions and compute lower bounds. So RedLS consists of clique sampling and graph reductions as successive procedures without interleaving.
Notice that FastWClq alternates between clique sampling and graph reduction and benefits much from this approach. Hence it is interesting to investigate whether such an alternating approach would lead to better reductions in vertex weighted coloring. Fortunately, the reduction framework introduced above allows us to do so.
For simplicity, we will put the details of RedLS in Section 4, where we will be able to reuse our notations and algorithms for succinct presentation.

Our Algorithm
Our reduction algorithm consists of three successive procedures: Algorithms 1 and 2 and the post reductions in Section 3.3. As to Algorithm 1, we will first run it with maximum-weight vertices assigned to startVertexSet in Line 1 and then run it again with maximum-degree vertices in the same way.

Sampling Promising Cliques
Algorithm 1 samples promising cliques that may lead to considerable reductions with three components as below.

1.
startVertexSet contains maximum degree/weight vertices and helps find promising cliques.

2.
criticalCliqSet contains cliques that are likely to lead to effective reductions and will be utilized in the post reductions in Section 3.3.

3.
topLevelWeights is a list of weights in non-increasing order and will be used for reductions.
In Line 7, we adopt depth-first search to enumerate all maximal cliques which contain vertices only in candSet. This operation can be costly, so in Section 3.4, we will set a cutoff for it. To be specific, before each enumeration, we will first put all related vertices into a list and shuffle this list randomly; then we will pick decision vertices one after another from this list to construct maximal cliques. By decision vertices, we mean those vertices that can be either included or excluded to form different maximal cliques.
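A minimal randomized sketch of such an enumeration (not the paper's exact code; the cutoff of Section 3.4 is omitted) is a Bron-Kerbosch-style DFS over shuffled decision vertices, reporting each maximal clique of the subgraph induced by candSet exactly once:

```python
import random

# Bron-Kerbosch-style DFS restricted to cand_set; decision vertices are
# visited in shuffled order. on_clique is called once per maximal clique
# of the induced subgraph G[cand_set].

def enum_maximal_cliques(cand_set, adj, on_clique, rng=random):
    def dfs(clique, candidates, excluded):
        if not candidates and not excluded:
            on_clique(frozenset(clique))      # nothing can extend this clique
            return
        order = list(candidates)
        rng.shuffle(order)                    # shuffled decision vertices
        for v in order:
            # branch 1: include v
            dfs(clique | {v}, candidates & adj[v], excluded & adj[v])
            # branch 2: exclude v from all further cliques on this path
            candidates = candidates - {v}
            excluded = excluded | {v}

    dfs(set(), set(cand_set), set())

adj = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
found = set()
enum_maximal_cliques({"a", "b", "c", "d"}, adj, found.add)
print(sorted(sorted(C) for C in found))       # [['a', 'b', 'c'], ['c', 'd']]
```

The on_clique callback is where the immediate reduction check described above would be hooked in.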
Furthermore, Lines 8, 9, 10, and 16 will be introduced in Definition 8. Lines 21 and 22 are based on Proposition 16 and will be introduced in detail there.

Geometric Representations
First, we introduce a notation for representing weight distributions within given cliques.
Second, we introduce an operator for appending items to the end of a weight list, somewhat like its counterparts for vector in C++, ArrayList in Java, or list in Python.

Definition 6. Given a list of weights L = ⟨ω_1, ..., ω_t⟩ and a weight ω, we define L ⊕ ω as ⟨ω⟩ if L = ⟨⟩ and as ⟨ω_1, ..., ω_t, ω⟩ otherwise.

In order to describe properties of our algorithms intuitively, we introduce Euclidean geometric representations of a list of weights in a rectangular coordinate system as below.

Algorithm 2: BetterBoundReductions
input: …; a set of critical cliques criticalCliqSet
output: a set of critical cliques criticalCliqSet

We draw the derived curves of δ(C_1), δ(C_2) and δ(C_3) as ABCDE (blue), FGHI (green) and FJHI (red) in (I) in Figure 1. On the other hand, we draw the derived curve of topLevelWeights, which has just been updated with respect to C_1 and C_2 successively in Algorithm 3, as FBCDE (black) in (II) in Figure 1.

1.
In detail, when topLevelWeights has just been updated with respect to C 1 , its derived curve exactly overlaps that of δ(C 1 ).

2.
Next, when topLevelWeights has just been updated with respect to C_2, a part of its derived curve, namely AB, has moved to its top-right, namely to FB, so the derived curve of topLevelWeights has turned into FBCDE. Notice that having been updated with respect to C_1 and C_2, the derived curve of topLevelWeights is the bottom-left-most curve that is not exceeded by those of C_1 and C_2. In other words, topLevelWeights has become the tightest envelope of those of C_1 and C_2.
Actually, if we switch the order of C 1 and C 2 in the procedure above, we will obtain the same sequence in topLevelWeights.In general, from the second time on, each time Algorithm 3 ends with topLevelWeights being updated, parts of the derived curve of topLevelWeights move to their top-right.Now we consider the derived curves of topLevelWeights and δ(C) and define several notions below which describe the relationship between a vertex weighted clique and a list of non-increasing weights.

Definition 8. Given a list of weights L = ⟨ω_1, ..., ω_t⟩ in non-increasing order and a clique C with δ(C) = ⟨w_1, ..., w_{|C|}⟩:

1. we say that C is covered by L iff t ≥ |C| and ω_i ≥ w_i for any 1 ≤ i ≤ |C|;
2. we say that C intersects with L at l iff 1 ≤ l ≤ min{|C|, t} and w_l = ω_l;
3. we say that C deviates above L at l iff 1 ≤ l ≤ |C| and either l > t or w_l > ω_l.

Example 4. Consider Example 3 with topLevelWeights having been updated with respect to C_1 and C_2. By referring to (II) in Figure 1, we can find the following. Obviously, we have a proposition below which helps determine whether a clique is effective in reductions.
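Under our reading of Definition 8 (in particular, that C deviates above L at l iff position l exists in δ(C) and either exceeds L there or lies beyond L's length), these relations can be sketched as follows; function names are ours, lists are 0-indexed while positions l stay 1-indexed as in the text:

```python
# Sketches of the relations between delta(C) = [w_1 >= ... >= w_|C|]
# and a non-increasing weight list L = [omega_1 >= ... >= omega_t].

def delta(clique, weights):
    """delta(C): clique weights in non-increasing order."""
    return sorted((weights[v] for v in clique), reverse=True)

def covered(w, L):
    return len(L) >= len(w) and all(L[i] >= w[i] for i in range(len(w)))

def intersects_at(w, L, l):
    return 1 <= l <= min(len(w), len(L)) and w[l - 1] == L[l - 1]

def deviates_above_at(w, L, l):
    return 1 <= l <= len(w) and (l > len(L) or w[l - 1] > L[l - 1])

L = [7, 6, 3, 2, 1]
print(covered([7, 6, 3], L))               # True
print(deviates_above_at([5, 5, 4], L, 3))  # True: 4 > 3 at position 3
```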
Proposition 10. If C_1 deviates above L at a certain l and C_2 is covered by L, then C_1 ⋠ C_2, i.e., C_1 is not subsumed by C_2.

Algorithm Execution
As to the execution of Algorithm 3, the next proposition presents a sufficient and necessary condition in which topLevelWeights will be updated.
Proposition 11. topLevelWeights in Algorithm 3 will be updated if and only if topLevelWeights = ⟨⟩ or the input clique C deviates above topLevelWeights at a certain l.

Also, we have propositions below which illustrate how topLevelWeights will be updated.

Proposition 13 (Successor Top-level Updates). Suppose topLevelWeights = ⟨ω_1, ..., ω_t⟩ and the input clique C has δ(C) = ⟨w_1, ..., w_{|C|}⟩. Then:

1. for any 1 ≤ l ≤ t, ω_l will be replaced with w_l in Line 5 in Algorithm 3 iff C deviates above topLevelWeights at l;
2. for any l > t, a weight w_l will be inserted in Line 7 in Algorithm 3 iff C deviates above topLevelWeights at l.
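Under the reading that these update rules amount to a pointwise maximum with extension at the tail, Algorithm 3's update can be sketched as follows (update_top_level is our name, not the paper's):

```python
# Merge a clique's weight list w into topLevelWeights L: positions where the
# clique deviates above are replaced; positions beyond t are inserted.

def update_top_level(L, w):
    """Merge clique weight list w into topLevelWeights L; report any change."""
    new, changed = list(L), False
    for i, wi in enumerate(w):
        if i < len(new):
            if wi > new[i]:          # deviates above at position i+1: replace
                new[i], changed = wi, True
        else:                        # beyond t: insert the new weight
            new.append(wi)
            changed = True
    return new, changed

L, _ = update_top_level([], [6, 5, 4, 3])   # first clique: exact overlap
L, _ = update_top_level(L, [7, 6, 3])       # lifts the first two positions
print(L)                                    # [7, 6, 4, 3]
```

Merging two non-increasing lists this way keeps the result non-increasing, and the order of the merges does not affect the final list, matching the observation about C_1 and C_2 above.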
The following proposition shows the relation between topLevelWeights and criticalCliqSet if topLevelWeights has been updated in Algorithm 3.

1. Right before the execution of Line 21, for any C' ∈ criticalCliqSet, C' is covered by topLevelWeights.

2.
In Line 19, if C deviates above topLevelWeights, then for any C' ∈ criticalCliqSet, we have C ⋠ C'.
Intuitively, right before the execution of Line 21, topLevelWeights can do whatever any clique in criticalCliqSet can, with exceptions being dealt with in Section 3.3. In Line 19, if C updates topLevelWeights, then it will be allowed entry into criticalCliqSet.
Example 5. After Algorithm 1 is run on G_2 in Example 3, criticalCliqSet has become {C_1, C_3} and topLevelWeights has been updated to ⟨7, 6, 3, 2, 1⟩, as shown as FJCDE in Figure 2. The details are as follows.
So either C_2 was refused entry into criticalCliqSet or it was removed from criticalCliqSet, depending on whether the algorithm found C_2 earlier than it found C_3.

3.
As to the two cliques above, neither subsumes the other. So in Line 19, cond2 implies cond3. In other words, if cond2 holds, then C is not covered by any clique in criticalCliqSet, i.e., C is not subsumed by any clique in criticalCliqSet. In this sense, we add it to criticalCliqSet, and this will not cause obvious redundancy.
Based on the discussion above, we have 1.
right before the execution of Line 21 in Algorithm 1, topLevelWeights contains the best-found bounds formed by all previously enumerated cliques;
and if any clique improves this bound, then no previously enumerated clique subsumes it. Unlike [11], we will apply topLevelWeights instead of a 'relaxed' partition set to perform reductions in Algorithms 1 and 2.
Furthermore, for the sake of efficiency, we should keep criticalCliqSet as small as possible and as powerful as possible. So in Algorithm 1, if C' ⪯ C, i.e., C' is subsumed by the newly found clique C, then we will simply remove C' in Line 14, and this will do no harm to the power of criticalCliqSet. In addition, if a clique does not intersect with the derived curve of topLevelWeights, its reduction power is overwhelmed by topLevelWeights, so we remove it in Line 18 as well.

Reductions Based on Top Level Weights
Next we have a proposition below which states that topLevelWeights can be utilized for clique reductions.
Proposition 16. Suppose topLevelWeights = ⟨ω_1, ..., ω_t⟩. Then:

1. for any 1 ≤ l ≤ t, there exists an already-enumerated clique C with δ(C) = ⟨w_1, ..., w_{|C|}⟩ s.t. |C| ≥ l and w_l = ω_l;
2. given any feasible coloring …;
3. given any vertex u s.t. (a) d(u) < t and (b) ω_{d(u)+1} > w(u), u is absorbed by an already-enumerated clique that does not contain u, and hence removing u yields a VWC-reduced subgraph.

Notice that Item 1 states that topLevelWeights is the tightest envelope of all cliques that have been enumerated (see Figure 2 above for details and intuition).

1.
In this sense, if t were decreased or any of ω_1, ..., ω_t were decreased, the derived curve of topLevelWeights would shift left or down, which in turn would make at least one clique deviate above topLevelWeights at some certain l. Therefore, there would exist a color whose weight is smaller than its lower bound.

2.
Considering that the weights of the other colors are all underestimated, the sum of all components in the new variant of topLevelWeights could never be achieved by any feasible coloring.
So in order to obtain a feasible coloring that avoids lower-bound conflicts in any enumerated cliques, we have to accept the cost revealed by topLevelWeights or even more. In a word, any feasible coloring for G costs at least Σ_{i=1}^t ω_i, which will be shown and proved formally in Proposition 17, and which has also been proved by [11] in another approach.
Given a vertex u, we represent it as a point P_u = (d(u) + 1, w(u)) on the rectangular coordinate plane xOy for intuition (see Figure 3). Then P_u is strictly below the derived curve of topLevelWeights iff d(u) < t and ω_{d(u)+1} > w(u), and such a location relation implies Items 3a and 3b above. Moreover, each time one neighbor of u is removed, d(u) decreases by 1 and the point P_u shifts left by 1. Meanwhile, as we enumerate cliques, topLevelWeights tends to move to its top-right. These opposite trends gradually help reduce the input graph.
Example 6. Consider G_1 in Example 1, in which there exists a maximal clique C = {z^3_1, z^5_2, z^6_5, z^4_6} with δ(C) = ⟨6, 5, 4, 3⟩. As to the four other vertices z^5_3, z^2_4, z^1_7, z^5_8 with degrees 2, 3, 3, 2, we represent them by E, F, G, H, respectively, on a rectangular coordinate plane in (I) in Figure 3 below. For instance, the coordinate of F is (d(z^2_4) + 1, w(z^2_4)), namely (4, 2). Notice that z^5_3 and z^5_8 have the same degree and weight, so their corresponding points, E and H respectively, overlap on the coordinate plane. On the other hand, we can utilize topLevelWeights instead of specific cliques to perform clique reductions. For example, Line 21 in Algorithm 1 exploits topLevelWeights to perform reductions based on Proposition 16 above. In detail, G ← applyCliqueReductions(G, topLevelWeights, S) performs clique reductions and obtains a VWC-reduced subgraph of G, but keeps all vertices in S in the returned subgraph. We do this for the following reason: in Line 21, since we are enumerating cliques in candSet, we should keep all vertices in it; otherwise, the procedure may crash. However, in Line 22, since we have completed the enumeration procedures, we do not have to keep any vertices in the VWC-reduced subgraph. We also remind readers that in the applyCliqueReductions procedure, each time one vertex is removed, all its neighbors will be taken into account for further reductions because their degrees have all been decreased by 1.
Notice that in Proposition 16 we require ω_{d(u)+1} > w(u) rather than ω_{d(u)+1} ≥ w(u), because we have to ensure that u is absorbed by a clique that does not contain u, which is coincident with the approach in [11]. However, this method may fail to perform some reductions which can be performed by Proposition 5. Yet this is not a problem, because at the end of our reductions, we will deal with that case. See Section 3.3 for more details.
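A condensed sketch of the applyCliqueReductions procedure under the rule of Proposition 16 (names and data layout are ours): repeatedly remove any unprotected vertex u with d(u) < t and ω_{d(u)+1} > w(u); each removal lowers its neighbors' degrees, so they are pushed back for re-examination.

```python
from collections import deque

def apply_clique_reductions(adj, weights, top, keep=frozenset()):
    adj = {v: set(ns) for v, ns in adj.items()}        # work on a copy
    t = len(top)
    def removable(u):                                   # d(u) < t and
        return len(adj[u]) < t and top[len(adj[u])] > weights[u]
    queue = deque(v for v in adj if v not in keep and removable(v))
    removed = []
    while queue:
        u = queue.popleft()
        if u not in adj or u in keep or not removable(u):
            continue
        removed.append(u)
        for x in adj.pop(u):                           # delete u from G
            adj[x].discard(u)
            if x not in keep and removable(x):
                queue.append(x)                        # degree dropped: recheck
    return adj, removed
```

Here top[len(adj[u])] is ω_{d(u)+1} with 0-indexed lists, and keep plays the role of the vertex set S that must stay in the returned subgraph.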
Example 7. Now we call applyCliqueReductions, which is based on Proposition 16, as below. See Figure 3 for visualization.

1.
In (I) in Figure 3, we find that the derived curve of topLevelWeights is ABCD and F is strictly below it, so Item 3 in Proposition 16 is applicable and the corresponding vertex z^2_4 is removed.

2.
Because of the removal of z^2_4, the degrees of z^5_3, z^1_7 and z^5_8 are all decreased by 1, so their corresponding points on the coordinate plane all shift left by 1 (see (II) in Figure 3). Notice that E, H, and B overlap at this time.

3.
Notice that G is strictly below the derived curve of topLevelWeights now, so we remove it like before, and this causes the left movement of H (see (III) in Figure 3).

4.
Analogously, we remove z^5_8 because H is strictly below ABCD now (see (IV) in Figure 3).

5.
Note that removing z^5_3 is not allowed by Proposition 16, but it is permitted by Proposition 5. This shows the weakness of our applyCliqueReductions procedure, and we will address this issue in Section 3.3. Obviously, in order to perform effective reductions, we want ω_1, ..., ω_t to be as big as possible. Hence, in Algorithm 2, we will try to increase their values. Furthermore, Proposition 16 is helpful in proving Proposition 17 below, which computes a lower bound on the cost of a feasible coloring.

Proposition 17. Given any feasible coloring S for G and topLevelWeights = ⟨ω_1, ..., ω_t⟩, we have cost(S, G) ≥ Σ_{i=1}^t ω_i.
Proof. See Appendix F.
Also, we have a proposition below which will be helpful in Section 3.

Proof. See Appendix G.

Searching for Better Cliques
Given topLevelWeights = ⟨ω_1, ..., ω_t⟩, Algorithm 2 attempts to increase the values of ω_1, ..., ω_t, and it even tries to find a clique whose size is bigger than t. So if Algorithm 2 completes, it will be able to confirm the following.

1.
Each component in topLevelWeights has achieved its maximum possible value.

2.
There exists no clique whose size is greater than |topLevelWeights|.
In Algorithm 2, we use updated(i) to denote whether ω_i is increased in the iteration for i. In Line 2, updated(i − 1) = false means that we failed to update ω_{i−1}. In our algorithm, there are two tricks that refer to updated(i), as below.

1.
If ω i = ω i−1 and we have confirmed that there are no cliques that improve ω i−1 , then there will be no cliques which improve ω i .

2.
If ω i = ω i−1 and we fail to update ω i−1 , then it will be hard for us to update ω i as well, so we adopt a continue statement here to avoid probably hopeless efforts.
We also call the procedure applyCliqueReductions, which was explained in the previous subsection. Notice that in Line 4, we enumerate maximal cliques which contain vertices in candSet only. To be specific, when i ≤ t, we will do so by considering only vertices with weights greater than ω_i, because we are now focusing on increasing ω_i. Like the counterpart in Algorithm 1, we will shuffle related vertices randomly before each enumeration.

Increasing Top Level Weights
Like Algorithm 1, we also exploit depth-first search to enumerate maximal cliques. Yet different from it, we will rarely enumerate all such maximal cliques. Instead, once we have found a clique that increases any value among ω_1, ..., ω_t, we will immediately perform reductions and break the enumeration procedure (see Line 14). Below we have a proposition that illustrates a sufficient and necessary condition under which ω_i (1 ≤ i ≤ t) will be increased.

Proposition 19.
As to the outermost loop in Algorithm 2, for any 1 ≤ i ≤ t + 1, ω_i will be increased if and only if there exists a clique C ⊆ candSet s.t. |C| ≥ i.
Example 8. Suppose we have topLevelWeights = ⟨7, 6, 3, 2, 1⟩, and we are now focusing on increasing ω_3, whose current value is 3. Suppose among vertices with weights greater than ω_3, we have found a clique C with δ(C) = ⟨w_1, w_2, w_3⟩ = ⟨5, 5, 4⟩, whose derived curve deviates above that of topLevelWeights at 3 (see FGH and ABCDE in Figure 4). So ω_3 can now increase to 4, and we will start another iteration to check whether ω_3 can increase further.
Notice that Line 14 breaks the clique enumeration loop, and the program control of this algorithm will eventually return to Line 3 with an increased ω_i. We do this for the following reason: since we have increased ω_i, any vertex that has a weight bigger than the previous ω_i but not bigger than the current ω_i will not help further increase ω_i. Hence, we eliminate these vertices from candSet and enumerate maximal cliques again with respect to the same i (see Line 14). With a smaller candSet, we can increase ω_i to its maximum possible value more efficiently. In a word, we increase ω_i gradually until it reaches its maximum. Notice that Line 7 might also increase t, so long as the algorithm has found a clique that is bigger than any found before. Last, we remind readers that although we are focusing on increasing ω_i, there could be side effects that increase ω_{i+1}, ..., ω_l as well, where l ≤ t, so long as we have found a clique that contains sufficiently many vertices with big weights.
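The loop just described can be condensed into the following simplified sketch (improve_bounds is our name): it raises one position at a time, omits the reduction calls and the updated(i)/continue tricks, and takes the clique finder as a parameter, with find_clique_of_size standing in for the cutoff-bounded maximal clique enumeration.

```python
def improve_bounds(top, adj, weights, find_clique_of_size):
    top, i = list(top), 1
    while i <= len(top) + 1:
        if i <= len(top):
            # only vertices strictly heavier than omega_i can improve omega_i
            cand = {v for v in adj if weights[v] > top[i - 1]}
        else:
            cand = set(adj)              # i = t + 1: try to lengthen top
        C = find_clique_of_size(cand, i)
        if C is None:
            i += 1                       # omega_i confirmed maximal; move on
            continue
        w = sorted((weights[v] for v in C), reverse=True)
        if i <= len(top):
            top[i - 1] = w[i - 1]        # raise omega_i and retry the same i
        else:
            top.append(w[i - 1])         # found a clique longer than t
    return top
```

Each improvement shrinks cand (only vertices strictly heavier than the new ω_i survive), so the loop terminates; the real algorithm additionally reduces the graph after every improvement.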

Effects of Better Cliques
There is a chance that the clique C obtained in Line 4 is not a maximal clique for the whole graph G, thus there may exist another clique in G that is a superset of C and has more reduction power.Alternatively, C may expand to a bigger clique by including vertices with a weight not greater than ω i and lead to more reductions.Yet this is not a problem.If such a case exists, the full reduction power will be exploited in later iterations.
In the first few iterations of the outermost loop, i is relatively small and thus ω i is relatively big, which is likely to result in a relatively small candSet, so enumerating cliques in candSet probably costs little time. Moreover, these cliques may lead to effective reductions which significantly decrease the time cost of later enumerations. When i = t + 1, we have candSet = V(G), so in the worst case we will have to enumerate all maximal cliques in G, which seems time-consuming and thus infeasible. Yet this is not so serious, because
1. we are dealing with large sparse graphs, which often obey a power-law degree distribution;
2. we have performed considerable reductions before, so at this point G is likely to be small enough to allow maximal clique enumeration. In Section 3.4, we will also set a cutoff for enumerating cliques.
Last, we remind readers that as i increases and ω i decreases, candSet becomes larger and larger, and enumerating cliques becomes more and more time-consuming, so we need to set a cutoff for enumerations (see Section 3.4). Due to this cutoff, once we fail to confirm that ω i has achieved its maximum, we make no effort to confirm whether ω j has arrived at its best possible value for any j > i.
Moreover, we have a proposition below which shows that, given sufficient run time, Algorithm 2 will be able to increase ω i to its maximum possible value for any 1 ≤ i ≤ ω(G), where ω(G) is the maximum size of a clique in G.
Proposition 20. As to the outermost loop in Algorithm 2, we have
1. for any 1 ≤ i ≤ t, right before i is increased by 1, there exist no cliques which deviate above topLevelWeights at i;
2. for i = t + 1, when the iteration ends, there exist no cliques which deviate above topLevelWeights at i.
Then by this proposition, we have a theorem below which shows that our clique reduction algorithm is as effective as a counterpart which enumerates all maximal cliques in G, if time permits. To describe this theorem, we first define the equality relation between two lists in Definition 9.

Definition 9. Given two lists of weights L 1 = ⟨a 1, · · · , a s⟩ and L 2 = ⟨b 1, · · · , b t⟩, we say L 1 = L 2 iff s = t and a i = b i for any 1 ≤ i ≤ s.
Theorem 1. Let L 1 be the topLevelWeights returned after Algorithms 1 and 2 are executed successively, and L 2 be the topLevelWeights returned after Algorithm 4 is executed; then L 1 = L 2.
Note that Algorithm 4 can be time-consuming even for sparse graphs.

Post Reductions
Section 3.1 mentions that we have not fully exploited Proposition 5 to perform reductions, so in this subsection we deal with the remaining case. At this stage, for each vertex, we examine whether it is absorbed by some clique in criticalCliqSet and perform reductions if so.

Implementation Issues
Although we apply various tricks to enumerate diverse cliques for effective reductions, our algorithm may still become stuck in dense subgraphs, so we have to set a certain cutoff for our algorithm.
We believe that a good reduction algorithm should not focus too much on a local subgraph, so our cutoff prevents each clique enumeration from spending too much time. The price of this compromise is that we sacrifice some of the good properties above; specifically, we can no longer expect all ω i values in topLevelWeights to increase to their maximum. Yet under our parameter setting, quite a few ω i values are still confirmed to achieve their optimum.
Furthermore, in some large graphs, we may need to consider a great many vertices and enumerate cliques containing them, so there can be numerous enumerations. Hence, even though each enumeration needs only a small amount of time, the total time cost of so many enumerations might not be affordable, so we also need to limit the total amount of time spent on enumerations.

Limiting The Number of Decisions Made in Each Enumeration
Notice that we adopt depth-first search to enumerate maximal cliques in Algorithms 1 and 2. During each depth-first enumeration, decisions of whether a vertex should be included in the current clique have to be made, and the search has to traverse both branches recursively, so there may be an exponential number of decisions in a single depth-first enumeration. Hence, in any enumeration, if topLevelWeights has not been improved within λ consecutive decisions, we simply stop this enumeration and move on to the next one.
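A decision-limited depth-first enumeration can be sketched as a Bron–Kerbosch-style search with a branching counter (an illustrative sketch, not the paper's implementation; `adj`, `budget` and the simple abort rule are our assumptions):

```python
def bounded_maximal_cliques(adj, budget):
    """Depth-first (Bron-Kerbosch) enumeration of maximal cliques that is
    abandoned once `budget` branching decisions have been made.
    `adj` maps each vertex to its set of neighbors."""
    decisions = [0]
    out = []

    def bk(R, P, X):
        if decisions[0] >= budget:      # cutoff reached: abandon this branch
            return
        if not P and not X:
            out.append(frozenset(R))    # R cannot be extended -> maximal
            return
        for v in list(P):
            decisions[0] += 1           # one decision: branch on including v
            bk(R | {v}, P & adj[v], X & adj[v])
            P.remove(v)
            X.add(v)

    bk(set(), set(adj), set())
    return out
```

On a triangle {1, 2, 3} with a pendant vertex 4 attached to 3, a generous budget yields exactly the maximal cliques {1, 2, 3} and {3, 4}, while a budget of 0 yields nothing.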

Limiting Running Time
Some benchmark graphs contain a large number of vertices of the greatest weights or degrees, so there can be a great number of enumerations in Algorithm 1. Moreover, as to Algorithm 2, there can be many candidate vertices that may form a clique improving a particular component in topLevelWeights, hence numerous enumerations may be performed as well.
Even though we limit the number of decisions and thus the time spent in each enumeration, too many enumerations may still cost our algorithm too much time. Hence, in practice, we employ another parameter T to limit the running time of our algorithm. More specifically, in Algorithm 2, we check whether the total time spent since the very beginning of our whole algorithm is greater than T; if so, we simply stop Algorithm 2 and turn to post reductions.
In fact, if Algorithm 2 is stopped because of this parameter, the cases below can arise. For the sake of presentation, we let K be the number of components in topLevelWeights, which is equal to the size of the greatest clique that has been found.

1. Algorithm 2 is unable to tell whether there exists a clique C s.t. |C| ≤ K and C is able to improve a particular component in topLevelWeights.
2. Algorithm 2 has confirmed that any clique containing at most K vertices will not improve topLevelWeights, yet it is unable to confirm whether there exists a clique whose size is bigger than K.
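The two cutoffs, the per-enumeration decision limit λ and the global time budget T, can be combined in a small helper (an assumed sketch; the paper's actual bookkeeping may differ):

```python
import time

class Budget:
    """Global time budget (parameter T, in seconds) plus a per-enumeration
    decision budget (parameter lambda), as described in Section 3.4."""
    def __init__(self, T_seconds, lam):
        self.deadline = time.monotonic() + T_seconds
        self.lam = lam
        self.decisions = 0

    def time_up(self):
        """True once the whole algorithm has spent more than T seconds."""
        return time.monotonic() > self.deadline

    def new_enumeration(self):
        """Reset the decision counter at the start of each enumeration."""
        self.decisions = 0

    def decide(self):
        """Record one branching decision; False means: abort this enumeration."""
        self.decisions += 1
        return self.decisions <= self.lam
```

An enumeration loop would call `new_enumeration()` before each depth-first search, `decide()` at every branch, and check `time_up()` between enumerations to decide whether to fall through to post reductions.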

Programming Tricks
In graph algorithms, there is a common procedure as follows: given a graph and two of its vertices u and v, determine whether u and v are neighbors. In our program, this procedure is called frequently, so we have to implement it efficiently. However, adjacency matrices are unsuitable for storing large sparse graphs. Therefore, we adopted a hash-based data structure proposed in [18] to do so.
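A minimal sketch of such a constant-time adjacency test using per-vertex hash sets (illustrative only; the exact structure of [18] differs):

```python
class SparseGraph:
    """Adjacency stored as per-vertex hash sets: O(1) expected-time neighbor
    queries without the O(|V|^2) memory of an adjacency matrix."""
    def __init__(self, edges):
        self.adj = {}
        for u, v in edges:
            self.adj.setdefault(u, set()).add(v)
            self.adj.setdefault(v, set()).add(u)

    def are_neighbors(self, u, v):
        """Expected O(1) membership test in u's neighbor set."""
        return v in self.adj.get(u, set())
```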
In our algorithm, we often have to obtain vertices of certain weights or degrees. Moreover, as vertices are removed, the degrees of their neighbors decrease. Furthermore, our algorithm alternates between clique sampling and graph reductions, which requires us to maintain such relations on the fly. So we need efficient data structures to maintain the vertices of each degree and/or weight in the reduced graph. Hence, we adapted the so-called Score-based Partition in [19] to do so.
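The idea of maintaining vertices grouped by their current degree, with cheap updates when a vertex is removed, can be sketched as a bucket structure (our illustrative adaptation, not the exact Score-based Partition of [19]):

```python
class DegreeBuckets:
    """Group vertices by current degree so that 'all vertices of degree d' is
    an O(1) lookup and removing a vertex decrements its neighbors' degrees
    incrementally instead of recomputing them."""
    def __init__(self, adj):
        self.adj = {u: set(vs) for u, vs in adj.items()}
        self.degree = {u: len(vs) for u, vs in self.adj.items()}
        self.buckets = {}
        for u, d in self.degree.items():
            self.buckets.setdefault(d, set()).add(u)

    def of_degree(self, d):
        """All vertices currently of degree d."""
        return self.buckets.get(d, set())

    def remove_vertex(self, u):
        """Delete u and move each neighbor one bucket down."""
        self.buckets[self.degree[u]].discard(u)
        for v in self.adj.pop(u):
            self.adj[v].discard(u)
            d = self.degree[v]
            self.buckets[d].discard(v)
            self.degree[v] = d - 1
            self.buckets.setdefault(d - 1, set()).add(v)
        del self.degree[u]
```

The same bucketing works for weights, which do not change, so only the degree buckets need dynamic maintenance.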

Related Works
To the best of our knowledge, the only reduction algorithm for vertex weighted coloring is RedLS [11], whose details are shown in Algorithm 5. In this algorithm, C is a candidate clique being constructed, and each vertex in candSet is connected to every one in C. Hence, any single vertex in candSet can be added into C to form a greater clique. In Line 2, 1% of the vertices in V are randomly collected to obtain startVertexSet. As to the outer loop starting from Line 4, each vertex v in startVertexSet is picked and a maximal clique containing v is constructed from Lines 6 to 11, based on a heuristic inspired by FastWClq [12]. In the inner loop starting from Line 8, Line 9 picks a vertex u in candSet, Line 10 places u into C, and Line 11 eliminates vertices which are not connected to every one in C, i.e., which cannot be added into C to make greater cliques.

Algorithm 5: RedLS
Notice that Line 9 selects the next vertex to put into C with a look-ahead technique. To be specific, rather than choosing the heaviest vertex to maximize the current benefit, it tries to maximize the total weight of the remaining possible vertices, i.e., N(x) ∩ candSet, hoping for greater future benefits. So given a vertex v, Algorithm 5 always aims to find maximum or near-maximum weight cliques that contain it.
Each time a maximal clique C is constructed, Algorithm 5 compares topLevelWeights with C and updates topLevelWeights if needed (see Line 12). Actually, RedLS adopts the so-called 'relaxed' vertex partition, yet the effects are equivalent to our description with topLevelWeights in Algorithm 5. After enumerating cliques with respect to the vertices in startVertexSet, Algorithm 5 calls the applyCliqueReductions procedure and performs reductions based on Proposition 16. However, when determining whether a vertex u can be removed, it always takes d(u) in the whole graph as u's degree, i.e., no degree decrease is taken into account. In a nutshell, RedLS performs clique sampling and graph reductions as successive procedures, which is different from our interleaving approach.
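The look-ahead selection rule can be sketched as follows (an illustration based on the textual description above, not the RedLS source; simple graphs without self-loops are assumed):

```python
def lookahead_clique(adj, w, v):
    """Grow a maximal clique containing v; at each step pick the candidate u
    maximizing the total weight of the candidates that would remain,
    i.e. of N(u) intersected with candSet."""
    C = {v}
    cand = set(adj[v])                  # every candidate is adjacent to all of C
    while cand:
        u = max(cand, key=lambda x: sum(w[y] for y in adj[x] & cand))
        C.add(u)
        cand &= adj[u]                  # keep only vertices adjacent to u as well
    return C
```

Since u is not its own neighbor, `cand &= adj[u]` also drops u itself, so the loop terminates with a maximal clique around v.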

Experiments
We will present solvers and benchmarks, parameter settings, presentation protocols, results, and discussions in this section.

Solvers and Benchmarks
We consider a list of networks accessed via http://networkrepository.com/ on 1 January 2018. They were originally unweighted; to obtain the corresponding MinVWC instances, we use the same method as in [11,12]: for the i-th vertex v i, w(v i) = (i mod 200) + 1. For the sake of space, we do not report results on graphs with fewer than 100,000 vertices or fewer than 1,000,000 edges. The instance named soc-sinaweibo contains 58,655,849 vertices and 261,321,033 edges, which is too large for our program; it ran out of memory, so we do not report its result. In the following experiments, we simply disable the local search component in RedLS [11] and compare its reduction method to our algorithm.
Our algorithm was coded in Java and is open-sourced at https://github.com/Fan-Yi/iterated-clique-reductions-in-vertex-weighted-coloring-for-large-sparse-graphs (accessed on 1 June 2023). It was compiled by OpenJDK 20.0.1 and run in an OpenJDK 64-bit Server VM (build 20.0.1+9-29, mixed mode, sharing). The experiments were conducted on a workstation with an Intel(R) Xeon(R) Platinum 8260 CPU @ 2.40 GHz and 266 GB RAM under CentOS 7.9. Since we shuffle vertices in Algorithms 1 and 2, there is randomness in the effectiveness of reduction. Yet we only test one arbitrary seed, since the benchmark graphs are diverse and each of them contains a large number of maximal cliques.

Parameter Settings
As to the parameter λ that limits the number of branching decisions in each depth-first enumeration procedure, we set it to 10,000 × d max, where d max is the maximum degree in the input graph. On the other hand, RedLS was run with its default parameter setting in the machine environment reported by [11]. In fact, RedLS usually completes reductions in a significantly shorter time than our algorithm, yet this is not a big problem, because this paper focuses on the potential effectiveness of a reduction algorithm rather than its efficiency. Since the MinVWC problem is NP-hard, even a small number of additionally removed vertices may save a great amount of later search time, so our idea is meaningful.
As to the parameter T that limits the total running time of enumerations, we set it as 1200 s.

Presentation Protocols
For each instance, we report the number of vertices and edges in the original graph (denoted by 'Original' in Table 1) as well as that obtained by RedLS and our algorithm (denoted by 'RedLS-reduced' and 'ours', respectively, in the same table).In Table 1, we mainly compare the number of remaining vertices obtained by RedLS and that by our algorithm (Columns 4 and 6), and better results (smaller numbers) are shown in bold.
To show the effectiveness of our algorithm more clearly, we also report the percentage of remaining vertices, ρ = |V′|/|V|, where V is the set of original vertices and V′ is the set of remaining vertices after reductions. So the closer ρ is to 0, the more effective our algorithm is. Furthermore, the time column reports the number of seconds needed by our algorithm to perform reductions.

Main Results and Discussions
From Table 1, we observe the following.

1. Our algorithm obtains significantly better results than RedLS on most of these instances. Among all the graphs, the number of remaining vertices returned by RedLS is at least 10,000. However, on nearly 20% of the instances, our algorithm returns a result below 10,000; moreover, on more than 10% of the instances, it returns a result below 1000.
2. On more than 40% of the instances, our percentage of remaining vertices is smaller than 10%, while on nearly 20%, the respective results are smaller than 1%.

3. The most attractive results lie in the road-net category, in which our algorithm returned subgraphs that contained 156, 86, 14, and 54 vertices, respectively, with |E| slightly more than |V|, whereas RedLS returns subgraphs that contain at least 800,000 vertices. Thanks to our algorithm, it seems that optimal solutions for these graphs can now be easily found by state-of-the-art complete solvers.

Individual Impacts
We will show individual impacts of our three successive procedures as well as the optimality of top-level weights returned.

Individual Impacts of Our Successive Procedures
To show that each of our three successive procedures is necessary, we calculate the number of vertices removed in each procedure during the execution of our algorithm. In Table 2, we use ∆ 1, ∆ 2 and ∆ 3 to represent the numbers of vertices removed by Algorithm 1, Algorithm 2 and post reductions, respectively. We select representative instances from most categories in order to reflect the individual impacts comprehensively.
From this table, we find that Algorithm 2 may sometimes remove no vertices, while post reductions usually contribute greatly, which is why we allow equations to hold and extend the statements in [11] when presenting Definition 3. Finally, we discuss the optimality of the topLevelWeights returned by our algorithm, which will play an essential role in future works. Notice that Algorithm 2 tries to enumerate all possible cliques that may increase any component of topLevelWeights, and even attempts to find a clique whose size is bigger than |topLevelWeights|. In practice, Algorithm 2 was able to confirm that some particular ω i values had achieved their maximum. To be specific, we take the instances web-it-2004, sc-pwtk and delaunay_n24 as examples and report our experimental results in this aspect below.

1. As to web-it-2004, our experiment guaranteed that each ω i value had achieved its maximum and that there existed no clique whose size was bigger than |topLevelWeights|. This is the best possible result, which ensures that no better top-level weights can be found. It also implies that we have found the smallest number of remaining vertices; no better results can be obtained by clique reductions. In Table 1, all such instances are marked with * in our |V| column.
2. As to sc-pwtk, our experiment guaranteed that each ω i value had achieved its maximum, but it was unable to tell whether a clique with a size greater than |topLevelWeights| existed. In this sense, future works on this instance can focus on finding a clique of greater size.
3. As to delaunay_n24, our experiment could only make certain that the first two ω i values of the returned topLevelWeights had achieved their maximum; two components remained unconfirmed. Hence, more effort is needed on this instance.

Conclusions
In this paper, we have proposed an iterated reduction algorithm for the MinVWC problem based on maximal clique enumeration. It alternates between clique sampling and graph reductions and consists of three successive procedures: promising clique reductions, better-bound reductions and post reductions. Experimental results on several large sparse graphs show that our algorithm is significantly more effective than RedLS on most of the instances. Moreover, it makes a big improvement on about 10% to 20% of them, especially on the road-net instances. Also, we have shown and discussed individual impacts as well as practical properties of our algorithm. Last, we have a theorem indicating that our algorithm's reduction effects are equivalent to those of a counterpart which enumerates all maximal cliques in the input graph, if time permits.
However, our clique enumeration procedures are somewhat brute-force and may waste a great amount of time checking useless cliques. Furthermore, given a vertex, clique reductions assume that each of its neighbors has a distinct color; this is not always the case and thus may limit the power of reductions.
For future works, we will develop various heuristics to sample promising cliques that are both effective and efficient for reductions.Also, we plan to develop reductions that allow neighbors of a vertex to have repeated colors.

Appendix B. Proof of Proposition 2
Proof.

Definition 2. Suppose G = (V, E, w(·)) and U ⊆ V. If, given any feasible coloring S| G[U] for G[U], there exists an extension of S| G[U], denoted by S| G, such that S| G is feasible for G and cost(S| G[U], G[U]) = cost(S| G, G), then G[U] is called a VWC-reduced subgraph of G.

Furthermore, we only have to focus on maximal cliques, as is shown by the proposition below.

Proposition 6. If u is absorbed by a clique in G, then it must be absorbed by a maximal clique in G.

Figure 1. (I) Derived curves of δ(C 1), δ(C 2) and δ(C 3), represented by blue, green and red curves, respectively; (II) topLevelWeights after being updated by C 1 and C 2 successively, which is the tightest envelope of them.

Proposition 14. If topLevelWeights has been updated in Algorithm 3, then topLevelWeights covers the clique C at the end of this algorithm. Such a covering relation will still hold after Algorithm 3 returns program control back to Algorithm 1.

Then we have a proposition about criticalCliqSet in Algorithm 1.

Proposition 15. (a) u is absorbed by some clique in G; (b) G[V\{u}] is a VWC-reduced subgraph of G.

Proposition 18. Right before the execution of Line 22 in Algorithm 1, there do not exist two cliques C 1, C 2 s.t. C 1, C 2 ∈ criticalCliqSet and C 1 ⊆ C 2.

Figure 4. A clique is found to improve ω 3. ABCDE represents topLevelWeights while FGH represents the derived curve of δ(C).

Appendix C. Proof of Proposition 3

Proof. Given any solution S| G[W] for G[W], there exists an extension of S| G[W], denoted by S| G[U], such that S| G[U] is feasible for G[U] and cost(S| G[W], G[W]) = cost(S| G[U], G[U]). For the same reason, given any solution S| G[U] for G[U], there exists an extension of S| G[U], denoted by S| G, such that S| G is feasible for G and cost(S| G[W], G[W]) = cost(S| G, G).

Given a list of positive numbers L = ⟨d 1, · · · , d t⟩, we draw a curve on the rectangular coordinate plane xOy with the list of coordinates (1, d 1), · · · , (t, d t) by connecting adjacent points, and we call this curve the derived curve of L.

Algorithm 2 (fragment):
9 remove C from criticalCliqSet;
10 foreach C ∈ criticalCliqSet do
11 if C intersects with topLevelWeights then continue;
12 remove C from criticalCliqSet;
13 G ← applyCliqueReductions(G, topLevelWeights, ∅);
14 break and keep i unchanged for the next iteration;
15 return criticalCliqSet;

Algorithm 3: updateTopLevelWeights
input: C, topLevelWeights = ⟨ω 1, · · · , ω t⟩ or ⟨⟩
output: implicit from the context
1 ⟨w 1, · · · , w |C|⟩ ← δ(C);
2 for i ← 1 to |C| do
3 if topLevelWeights ≠ ⟨⟩ and i ≤ |topLevelWeights| then
4 if w i > ω i then
5 replace ω i in topLevelWeights with w i;

Table 1. Reductions on instances. Numbers of remaining vertices that are confirmed to be optimal are marked with '*'.

Table 2. Individual impacts of the three successive procedures.
By the Pigeonhole Principle, there exists at least one 1 ≤ s ≤ l − 1 s.t. V s contains two or more vertices among v 1, · · · , v |Q|, that is, S is not a feasible coloring for G, which contradicts the preconditions. Hence, we have proved that cost(S, G) ≥ Σ_{i=1}^{t} ω i.

1. C 1 enters criticalCliqSet first. If C 1 ⊆ C 2, then C 1 will be removed from criticalCliqSet at Line 14 before C 2 enters criticalCliqSet, i.e., they will not be in criticalCliqSet simultaneously.
2. C 2 enters criticalCliqSet first. If C 1, C 2 ∈ criticalCliqSet, i.e., C 1 enters criticalCliqSet later, then by Proposition 15, we have C 1 ⊈ C 2.