
Clustering Improves the Goemans–Williamson Approximation for the Max-Cut Problem

1 Max Planck Institute for Mathematics in the Sciences, 04103 Leipzig, Germany
2 Bioinformatics Group, Department of Computer Science, Universität Leipzig, 04107 Leipzig, Germany
3 Department of Computer Science, University of California, Santa Cruz, CA 95064, USA
4 Centro de Investigación en Computación, Instituto Politécnico Nacional, Mexico City 07738, Mexico
5 Institute for Theoretical Chemistry, University of Vienna, 1090 Wien, Austria
6 Facultad de Ciencias, Universidad Nacional de Colombia, Sede Bogotá 111321, Colombia
7 Santa Fe Institute, Santa Fe, NM 87501, USA
* Authors to whom correspondence should be addressed.
Computation 2020, 8(3), 75; https://doi.org/10.3390/computation8030075
Received: 29 June 2020 / Revised: 11 August 2020 / Accepted: 24 August 2020 / Published: 26 August 2020

Abstract

MAX-CUT is one of the well-studied NP-hard combinatorial optimization problems. It can be formulated as an Integer Quadratic Programming problem and admits a simple relaxation obtained by replacing the integer “spin” variables $x_i$ by unitary vectors $v_i$. The Goemans–Williamson rounding algorithm assigns the solution vectors of the relaxed quadratic program to a corresponding integer spin depending on the sign of the scalar product $v_i \cdot r$ with a random vector $r$. Here, we investigate whether better graph cuts can be obtained by instead using a more sophisticated clustering algorithm. We answer this question affirmatively: different initializations of k-means and k-medoids clustering produce better cuts for the graph instances of the best-known benchmark for MAX-CUT. In particular, we found a strong correlation between cluster quality and cut weights during the evolution of the clustering algorithms. Finally, since in general the maximal cut weight of a graph is not known beforehand, we derive instance-specific lower bounds for the approximation ratio, which indicate how close a solution is to the global optimum for a particular instance. For the graphs in our benchmark, the instance-specific lower bounds significantly exceed the Goemans–Williamson guarantee.
Keywords: algorithms; approximation; semidefinite programming; Max-Cut; clustering

1. Introduction

The MAX-CUT problem is a well-known NP-hard [1] and APX-hard [2] combinatorial optimization problem. An instance consists of a graph G with vertex set V, edge set E, and edge weights $w : E \to \mathbb{R}$. A cut is a bipartition $\{A, V \setminus A\}$ of the vertex set V, where $A \subseteq V$. The MAX-CUT problem consists of finding a cut that maximizes the total weight of the edges that span the cut. That is, the function
$$f(A) = \sum_{\{p,q\} \in E:\; p \in A,\, q \in V \setminus A} w(\{p,q\}) \tag{1}$$
is to be maximized over all $A \subseteq V$. While MAX-CUT is NP-hard in general, polynomial-time algorithms exist for restricted graph classes, including planar graphs [3], graphs without long odd cycles [4], and cographs [5]. MAX-CUT can also be written as a spin glass model; see, e.g., [6] for an overview. Using an arbitrary numbering of the n vertices of the graph G, we write $x_i = +1$ if vertex $i \in A$, $x_i = -1$ if vertex $i \in V \setminus A$, and $x = (x_1, x_2, \dots, x_n)$. Furthermore, we set $w_{ij} := w(\{i,j\})$ if $\{i,j\}$ is an edge of G and $w_{ij} = 0$ otherwise. With this notation, we can rewrite $f(A)$ in terms of the “spin vector” as
$$f(x) = \frac{1}{4} \sum_{i,j} w_{ij} \left(1 - x_i x_j\right), \qquad x_i \in \{-1, +1\} \tag{2}$$
Maximizing $f(x)$ amounts to the integer quadratic programming (IQP) formulation of the MAX-CUT problem. Without loss of generality, we assume $w_{ij} = w_{ji}$.
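The IQP objective is straightforward to evaluate. The following sketch (Python/NumPy; all names are ours, not from the paper) computes $f(x)$ from a symmetric weight matrix and a spin vector:

```python
import numpy as np

def cut_value(W, x):
    """Cut weight f(x) = (1/4) * sum_{i,j} w_ij (1 - x_i x_j)
    for a symmetric weight matrix W (zero diagonal) and a spin
    vector x with entries in {-1, +1}."""
    return 0.25 * np.sum(W * (1.0 - np.outer(x, x)))

# Toy example: a path on three vertices with unit edge weights.
W = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
print(cut_value(W, np.array([1, -1, 1])))  # both edges cut: 2.0
print(cut_value(W, np.array([1, 1, 1])))   # empty cut: 0.0
```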
Solutions of large MAX-CUT problems are of considerable practical interest in network design, statistical physics, and data clustering, and hence a broad array of computational techniques have been customized to its solution. Broadly, they can be subdivided into combinatorial heuristics (see e.g., [7,8,9] and the references therein) and methods involving relaxations of the integer constraints in Equation (2).
Replacing the integer variables by $x \in \mathbb{R}^n \setminus \{0\}$ leads to an equivalent continuous optimization problem, for which an excellent heuristic is described in [10]; there is, moreover, a close connection with the largest eigenvalues of generalized Laplacians [11] and their corresponding eigenvectors [12]. On this basis, algorithms with uniform approximation guarantees have been devised [13,14]. Goemans and Williamson [15] replaced the integers $x_i$ by unitary vectors $v_i$ of dimension $n = |V|$:
$$\max_{\{v_i\}} \; \frac{1}{4} \sum_{i,j} w_{ij} \left(1 - v_i \cdot v_j\right) \quad \text{subject to } |v_i| = 1, \; i = 1, \dots, n \tag{3}$$
where $\cdot$ denotes the scalar product on the unitary vectors. This problem contains all the instances of the original problem (2), as seen by setting $v_i = (x_i, 0, \dots, 0)$ for all i. The relaxed problem (3) is a particular instance of Vector Programming or, after a change of variables, an instance of Semidefinite Programming (SDP), and thus can be solved in polynomial time (up to an arbitrarily small additive error); see, e.g., [16,17].
The solutions of the relaxed problem (3) are translated to solutions of the original IQP. Goemans and Williamson [15] proposed to use a unitary random vector r and to set
$$\hat{x}_i = \operatorname{sgn}(v_i \cdot r)$$
with $\operatorname{sgn}(t) = -1$ for $t < 0$ and $\operatorname{sgn}(t) = +1$ for $t \geq 0$. This amounts to cutting the sphere at the hyperplane with normal vector $r$ and assigning $x_i$ depending on whether $v_i$ lies in the “upper” or “lower” hemisphere defined by this hyperplane.
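As an illustration, the rounding step can be sketched as follows (Python/NumPy; function and variable names are ours):

```python
import numpy as np

def gw_round(V, rng=None):
    """Goemans-Williamson randomized rounding: V is an (n x d) array
    whose rows are the unit vectors v_i solving the relaxation; the
    result is the spin vector with x_i = sgn(v_i . r) for a random
    unit vector r, using sgn(0) = +1 as in the text."""
    rng = np.random.default_rng() if rng is None else rng
    r = rng.normal(size=V.shape[1])
    r /= np.linalg.norm(r)              # uniform random direction
    return np.where(V @ r >= 0, 1, -1)  # split by hemisphere
```

In practice, the rounding is repeated many times (50 repetitions in our experiments) and the best resulting cut is kept.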
The Goemans–Williamson relaxation yields an approximation bound of $\alpha := \min_{0 \le \theta \le \pi} \frac{2}{\pi}\,\frac{\theta}{1 - \cos\theta} > 0.878$ for the expected value $E[f(\hat{x})]/\max f$, where the expectation is taken over the choices of $r$ [15]. At present, it is the best randomized approximation algorithm for the integer quadratic programming formulation of the MAX-CUT problem. A clever derandomization [18] shows that deterministic algorithms can attain the same approximation bound. On the other hand, it is NP-hard to approximate MAX-CUT better than the ratio $16/17$ [19]. If the Unique Games Conjecture [20] is true, furthermore, the approximation ratio cannot be improved beyond the Goemans–Williamson bound for all graphs. Better ratios are achievable, however, e.g., for (sub)cubic graphs [21].
The translation of the solution of the relaxed problem (3) in the Goemans–Williamson approximation relies on the choice of a random vector $r$. Naturally, we ask whether the performance can be improved by expending more effort to obtain a better choice of $r$. The key observation is that the purpose of $r$ is to separate the solution vectors $v_i$ of Equation (3) into two disjoint sets of points on the sphere. The two sets $A^+ := \{i : v_i \cdot r \ge 0\}$ and $A^- := \{i : v_i \cdot r < 0\}$ can thus be thought of as a pair of clusters. Indeed, two vectors $v_i$ and $v_j$ tend to be anti-parallel if $w_{ij}$ is large, while pairs of points $i$ and $j$ with small or even negative weights $w_{ij}$ are likely to wind up on the same side of the maximal cut. Of course, the random vector $r$ is just one way of expressing this idea: if $v_i$ and $v_j$ are similar, then we will “usually” have $\operatorname{sgn}(v_i \cdot r) = \operatorname{sgn}(v_j \cdot r)$. The “randomized rounding” of the Goemans–Williamson method can therefore also be regarded as a clustering method. This immediately raises the question of whether the solutions of the MAX-CUT problem can be improved by replacing the random choice of $r$ with a clustering of the solution set $\{v_i\}$ of the relaxed problem (3). We shall see empirically that the answer to this question is affirmative, even though the theoretical performance bound is not improved.
In practical applications, solutions of relaxed problems are often post-processed by local search heuristics. We therefore performed a local search starting from the final results of both Goemans–Williamson rounding and our two best clustering approaches in order to further improve the cut values.
This contribution is organized as follows. In the following section, we briefly summarize the data sets and clustering algorithms with their relevant properties, as well as the details of the local search used to improve the cut values. In Section 3.1, we describe an initial analysis of the effect of clustering on the cut weights, showing that the quality of near-optimal clusters correlates well with cut weights. Since we were not able to show for most clustering methods that they retain the Goemans–Williamson performance bound, we derive an instance-specific bound in Section 3.2 that provides a convenient intrinsic quality measure. In Section 3.3, we extend the empirical analysis to the benchmarking set, which also contains very large graphs. We show that the use of clustering methods indeed provides a consistent performance gain. We also see that the instance-specific performance bounds are much closer to 1 than the uniform Goemans–Williamson α. Finally, in Section 3.4, we consider the improvement of the cut values achieved with local search starting from the Goemans–Williamson and the two best clustering relaxations.

2. Materials and Methods

2.1. Benchmark Data

In order to assess the utility of clustering as a rounding method, we used the benchmark set of graphs generated using Rinaldi’s machine-independent graph generator. Both the generator and the graphs can be downloaded from Ye’s web page http://web.stanford.edu/~yyye/yyye/Gset/. These graphs range from 800 to 20,000 nodes and have edge weights of ±1. The topology of a graph can be toroidal, almost planar, or random. The first 21 G-set graphs are a standard benchmark for the MAX-CUT problem. The G-set benchmark consists of graphs G1 to G67, G70, G72, G77, and G81. The optimal cuts are not known for most of these graphs; we therefore use the best known cut values compiled in [9] for comparison.
The relaxed problem (3) was solved using the CVX package in Matlab for graphs G1 to G21. For graphs G22 to G54, G57 to G59, G62 to G67, and G72, we used Mathematica’s function SemidefiniteOptimization. For graphs G55, G56, G60, G61, G70, G77, and G81, neither SemidefiniteOptimization nor CVX was able to find a solution to the SDP problem; we therefore had to exclude these instances from further consideration. From the SDP solution of each instance, we ran 50 repetitions of the randomized clustering algorithms, including Goemans–Williamson randomized rounding, and report the best solution for the seven best algorithms.

2.2. Clustering in the Goemans–Williamson Algorithm

We consider the following clustering methods:
  • randomized rounding as defined in [15].
  • k-means [22] adapted to the unitary sphere as in [23] for fixed k = 2 .
  • k-medoids [24] for fixed k = 2 .
  • Fuzzy c-means [25] adapted to the unitary sphere as in [23] for fixed c = 2 .
  • Minimum Spanning Tree (MST) clustering, i.e., splitting the MST at the longest edge [26].
In order to obtain the spherical variants of k-means and Fuzzy c-means, it is necessary to define the cluster centroids as points on the sphere. This can be achieved by rescaling the centroid to be unit length [23]. On the sphere, cosine similarity is a more natural choice than Euclidean dissimilarity. It is not difficult to see that the two variants are actually equivalent:
Lemma 1.
Spherical k-means clustering with cosine similarity is equivalent to k-means clustering with Euclidean distances and normalizing of the centroid vectors in each step.
Proof. 
For unitary vectors, the square of their Euclidean distance can be expressed as
$$\|x - y\|^2 = x \cdot x + y \cdot y - 2\, x \cdot y = 2(1 - \cos\theta)$$
in terms of the angle $\theta$ between $x$ and $y$. The centroid $c$ of a given cluster $C$ minimizes $\sum_{i \in C} \|x_i - c\|^2$ and thus maximizes the sum $\sum_{i \in C} \cos\theta_i$. Analogously, each $x_i$ is assigned to the cluster $C_j$ with minimal value of $\|x_i - c_j\|^2$ and thus maximal $\cos\theta_i$. Thus, the squared Euclidean distance and the cosine distance optimize the same objective function. □
The same argument can be made for Fuzzy c-means clustering, since its cost function is also a linear combination of squared Euclidean distances and thus, equivalently, a linear combination of the corresponding cosines. For MST clustering, no adjustment is necessary since the relationship between Euclidean distances and cosines is monotonic, and thus the transformation does not affect the MST. By the same token, k-medoids clustering is unaffected by the restriction to the sphere, since medoids by definition are always among the unitary vectors $v_i$.
An important ingredient for the clustering procedures is the initialization. For k-means, k-medoids, and fuzzy c-means, we consider both a deterministic and a non-deterministic version. In the deterministic version, the pair $v_{i^*}$ and $v_{j^*}$ with maximal distance from each other is chosen. In the non-deterministic variant, the two initial cluster centroids are selected from the solution vectors $v_i$ at random. For k-means, we observed that choosing the initial cluster centroids among the vectors $v_i$ does not work well, since the optimization quickly gets stuck in a local minimum. We therefore devised two alternative random initialization methods: (1) We generate a random unitary vector $r$ and use $c_1 = r$ and $c_2 = -r$ as initial centroids. (2) We use two independently generated random unitary vectors $r_1$ and $r_2$ as initial centroids. The main advantage of method (1) is that this initialization is equivalent to starting k-means from solutions obtained by Goemans–Williamson rounding. To see this, assume (without loss of generality) that the random vector chosen as initial centroid points towards the north pole, so that its negative points towards the south pole. Then, any point in the lower hemisphere of the hypersphere will be clustered with the south pole vector, since this is the closest centroid. The same happens for points on the upper hemisphere, and therefore this is equivalent to splitting the hypersphere into two hemispheres, which is the Goemans–Williamson rounding.
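Initialization method (1), later referred to as K-MeansNM, admits a compact sketch (spherical 2-means with centroids renormalized in each step, cf. Lemma 1; Python/NumPy, our names, fixed iteration budget instead of a convergence test):

```python
import numpy as np

def spherical_kmeans_nm(V, n_iter=100, rng=None):
    """2-means on the unit sphere. Initial centroids are r and -r for
    a random unit vector r, so the first assignment reproduces
    Goemans-Williamson rounding; centroids are renormalized to unit
    length in each step."""
    rng = np.random.default_rng() if rng is None else rng
    r = rng.normal(size=V.shape[1])
    r /= np.linalg.norm(r)
    C = np.vstack([r, -r])                   # initial centroids
    for _ in range(n_iter):
        labels = np.argmax(V @ C.T, axis=1)  # max cosine = nearest centroid
        for j in range(2):
            if np.any(labels == j):
                c = V[labels == j].mean(axis=0)
                norm = np.linalg.norm(c)
                if norm > 0:
                    C[j] = c / norm          # project centroid to the sphere
    return np.where(labels == 0, 1, -1)      # cluster labels as spins
```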
In summary, we consider a total of nine clustering procedures:
  • Fuzzy c-means (Fuzzy)
  • Randomized k-means initialized among vectors v i (K-MeansRand)
  • Deterministic k-means (K-MeansDet)
  • Randomized k-medoids (K-MedRand)
  • Deterministic k-medoids (K-MedDet)
  • Minimum Spanning Tree (MST)
  • Randomized Rounding of Goemans–Williamson (RR)
  • Randomized k-means initialized with two random vectors (K-Means2N)
  • Randomized k-means initialized with a random vector and its negative (K-MeansNM)
In order to quantify the quality of the clusters, we use the distortion [27,28], a measure of cluster coherence defined as
$$\operatorname{dis}(\mathcal{C}) = \sum_{j=1}^{k} \sum_{x_i \in C_j} \| x_i - \mu_{C_j} \|$$
Here, $\mu_{C_j} := \frac{1}{|C_j|} \sum_{x_i \in C_j} x_i$ is the centroid of the cluster $C_j \in \mathcal{C}$. We note that k-means clustering minimizes the distortion.
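The distortion can be computed directly from the solution vectors and a cluster assignment; a minimal sketch (Python/NumPy, our names), following the definition above:

```python
import numpy as np

def distortion(V, labels, k=2):
    """Sum over clusters of the Euclidean distances of the member
    vectors (rows of V) to their cluster centroid (mean)."""
    total = 0.0
    for j in range(k):
        Cj = V[labels == j]
        if len(Cj) == 0:
            continue                  # empty cluster contributes nothing
        mu = Cj.mean(axis=0)          # centroid of cluster j
        total += np.linalg.norm(Cj - mu, axis=1).sum()
    return total
```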

2.3. Local Search

A natural definition of locality for cuts considers two cuts A and A′ adjacent if $A' = A \cup \{u\}$ for some $u \in V \setminus A$ or $A' = A \setminus \{u\}$ for some $u \in A$ [12]. In terms of the spin vectors x, this amounts to “flipping” (changing the sign of) exactly one spin $x_i$. An adaptive walk iteratively accepts a spin flip if the cut value $f(A)$ improves. By construction, therefore, an adaptive walk terminates in a locally optimal solution, i.e., in a cut $A^*$ for which there is no adjacent cut with a strictly larger cut weight. In general, neither the Goemans–Williamson nor any of the clustering results are locally optimal. We therefore use adaptive walks as a simple way to further improve solutions. We performed local improvement for each of the 50 repetitions of Goemans–Williamson randomized rounding and of the two best-performing clustering algorithms: K-MeansNM and K-Means2N.
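The adaptive walk admits a simple vectorized sketch (Python/NumPy, our names): for symmetric weights with zero diagonal, flipping spin $x_i$ changes the cut weight by $x_i \sum_j w_{ij} x_j$, so we repeatedly flip a spin with positive gain until none remains.

```python
import numpy as np

def adaptive_walk(W, x):
    """Single-spin-flip local search for MAX-CUT: flip the spin with
    the largest positive gain until the cut is locally optimal.
    The gain of flipping spin i is x_i * sum_j w_ij x_j."""
    x = x.copy()
    while True:
        gains = x * (W @ x)       # gain of flipping each spin
        i = np.argmax(gains)
        if gains[i] <= 0:
            return x              # no improving flip: locally optimal
        x[i] = -x[i]
```

For the three-vertex path with unit weights, starting from the empty cut $x = (1, 1, 1)$, the walk flips the middle spin and stops at the maximal cut $(1, -1, 1)$.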

3. Results

3.1. Cluster Quality Correlations with Solution Quality

In a preliminary evaluation on the first 21 G-set graphs (see Materials and Methods), we observed that clustering instead of random rounding yields systematically larger cuts. In order to better understand the reason for the beneficial effect of clustering, we investigated the relationship between a quality measure for the clustering and the resulting weight of the maximal cut. Since k-means clustering minimizes distortion, it serves as a natural measure of cluster quality, irrespective of the clustering method that is actually used. We chose K-MeansNM for this initial analysis because it uses RR solutions to initialize clusters and thus allows for a direct evaluation of the effect of clustering heuristics.
The RR solutions fall into a very narrow range of distortion values that is clearly separated from the near optimal range achievable by the clustering methods. The cut weights of the RR solutions do not appear to be correlated with the distortion of the corresponding clusters. However, after only a few clustering steps, k-means enters a regime in which distortion and cut weight are strongly correlated, see Figure 1.
Figure 1 provides a clear motivation to consider clustering as a means of improving the Goemans–Williamson solution. We observe that there are two groups of graphs. In the first group, exemplified by G43 (Figure 1a), we consistently observe lower RR values at the starting point of K-MeansNM (bottom right) than at the endpoint (top left), and the cut values mostly increase monotonically with decreasing distortion. In the second group of graphs, exemplified by G31 (Figure 1b), this is still the case on average; however, the optimal cut weights are observed at sub-optimal distortions. This observation motivates us to record the cut weights at intermediary steps of the clustering procedures, not only at their endpoints. In the case of K-MeansNM, this guarantees that we retain the performance guarantee of the Goemans–Williamson bound.

3.2. An Instance-Specific Approximation Guarantee

Consider a fixed instance of MAX-CUT, let $\{v_i\}$ be the solution of the relaxed problem (3), and let $\{A, V \setminus A\}$ be a discrete solution. Denote by $\partial A := \{(i,j) \in E(G) \mid i \in A,\, j \in V \setminus A\}$ the set of cut edges. The value S of the relaxed solution can be written as
$$S = \frac{1}{2} \sum_{(i,j) \in \partial A} w_{ij} \left(1 - v_i \cdot v_j\right) + \frac{1}{2} \sum_{(i,j) \notin \partial A} w_{ij} \left(1 - v_i \cdot v_j\right) = \underbrace{\sum_{(i,j) \in \partial A} w_{ij}}_{f(A)} - \underbrace{\frac{1}{2} \sum_{(i,j) \in \partial A} w_{ij} \left(1 + v_i \cdot v_j\right)}_{g_{cut}} + \underbrace{\frac{1}{2} \sum_{(i,j) \notin \partial A} w_{ij} \left(1 - v_i \cdot v_j\right)}_{g_{in}}$$
Thus, we have $f(A) = S + g_{cut} - g_{in}$. Writing $f^*$ for the weight of the optimal cut, we know that the solution of the relaxed problem is an upper bound, i.e., $S \geq f^*$. We therefore have
$$f(A) \geq f^* - (g_{in} - g_{cut}) \tag{6}$$
Note that $g_{in} - g_{cut} \geq 0$ since by definition $f^* \geq f(A)$. First, consider the case of positive edge weights. Then $f(A) > 0$ and we can estimate the approximation ratio for the solution A as
$$\alpha(A) := \frac{f(A)}{f^*} \geq 1 - \frac{g_{in} - g_{cut}}{f^*} \geq 1 - \frac{g_{in} - g_{cut}}{f(A)} \tag{7}$$
If there are negative edge weights, we follow [15], define $W := \sum_{(i,j)} \min(w_{ij}, 0) \leq 0$, and make use of the fact that $f^* - W \geq 0$. From Equation (6), we immediately obtain $f(A) - W \geq f^* - W - (g_{in} - g_{cut})$, and thus a generalized version of the approximation ratio can be computed as
$$\alpha(A) := \frac{f(A) - W}{f^* - W} \geq 1 - \frac{g_{in} - g_{cut}}{f^* - W} \geq 1 - \frac{g_{in} - g_{cut}}{f(A) - W} \tag{8}$$
The bounds in Equations (7) and (8) can be seen as instance-specific versions of the general approximation ratio derived in [15]. Empirically, we found in benchmarking experiments (see below) that the instance-specific bound $\alpha(A)$ substantially exceeds the uniform Goemans–Williamson bound of $\alpha \approx 0.878$. For the Goemans–Williamson algorithms, of course, we necessarily have $E[\alpha(A)] \geq \alpha$.
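Given the weight matrix, the SDP solution, and a cut, the bound of Equation (8) can be evaluated directly; a sketch (Python/NumPy, our names; the sums over ordered index pairs are halved to account for double counting):

```python
import numpy as np

def instance_bound(W, V, x):
    """Instance-specific lower bound on the approximation ratio,
    alpha(A) >= 1 - (g_in - g_cut) / (f(A) - W_neg), from Eq. (8).
    W: symmetric weight matrix (zero diagonal), V: rows are the SDP
    solution vectors v_i, x: spin vector of the cut."""
    cut = np.outer(x, x) < 0                   # pair (i,j) crosses the cut
    G = V @ V.T                                # Gram matrix of the v_i
    f_A = W[cut].sum() / 2.0                   # cut weight f(A)
    g_cut = 0.25 * (W * (1.0 + G))[cut].sum()  # (1/2) sum over cut edges
    g_in = 0.25 * (W * (1.0 - G))[~cut].sum()  # (1/2) sum over non-cut edges
    W_neg = np.minimum(W, 0).sum() / 2.0       # sum of negative edge weights
    return 1.0 - (g_in - g_cut) / (f_A - W_neg)
```

As a sanity check, for an exact integer SDP solution ($v_i = (x_i, 0, \dots, 0)$) both $g_{in}$ and $g_{cut}$ vanish and the bound evaluates to 1.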

3.3. Benchmarking Experiments

In order to compare the cut values of RR with each of the clustering algorithms, we use the following relative measure of performance:
$$\hat{f}_{cluster} = \frac{f_{cluster} - f_{RR}}{f_{RR}} = \frac{f_{cluster}}{f_{RR}} - 1$$
Figure 2 shows that clustering yields significant improvements on most instances for most clustering approaches. K-MeansNM by construction always finds a solution that is at least as good as RR; K-Means2N lacks this guarantee but likewise never produced a worse solution than RR. K-MeansRand improves on all graphs except G12, Fuzzy on all except G12 and G72, and K-MedRand on all except G32, G39, and G72; finally, K-MeansDet failed to find a better solution on nine graphs.
The gains in solution quality differ substantially between the test instances, and a few graphs, in particular G12 and G72, do not profit from the clustering approaches other than the randomized versions of K-means. Interestingly, these two graphs are toroidal.
The same trend is also observed for the instance-specific performance bounds; see Figure 3. The performance bounds for the individual solutions are well above 0.9, i.e., they exceed the Goemans–Williamson guarantee even in those few cases where clustering is worse than RR. For bipartite graphs with non-negative edge weights, the entire edge set forms a maximal cut [11]. This is the case for the two unweighted graphs G48 and G49 and explains why, for these two cases, no difference between RR and clustering solutions is observed in Figure 2.
On average, all clustering algorithms yield improvements over RR. Interestingly, the average solution quality depends noticeably on the clustering algorithm. As expected, the largest average improvement was obtained by K-MeansNM, with an average increase of the cut weight of $\hat{f}_{K\text{-MeansNM}} = 1.541\%$ compared with RR. Other variants of 2-means performed similarly well: we found $\hat{f}_{K\text{-Means2N}} = 1.533\%$ and $\hat{f}_{K\text{-MeansRand}} = 1.471\%$. For Fuzzy clustering, we only observed an improvement of $\hat{f}_{Fuzzy} = 1.247\%$. With $\hat{f}_{K\text{-MedRand}} = 0.841\%$ and $\hat{f}_{K\text{-MeansDet}} = 0.789\%$, the performance of the medoid and deterministic methods was less encouraging. For individual instances, the improvement can be substantial: for example, K-MeansNM yields an improvement of 5.81% for graph G10 and 5.12% for G28.
Despite subtle differences between the clustering methods, the clusters are quite similar in their characteristics. One measure of interest is the mean clustering angle
$$\theta_A := \arccos \left\| \frac{1}{|A|} \sum_{v_i \in A} v_i \right\|$$
It measures the average angle between unit vectors in the same cluster. We found that the cardinalities $|A|$ and $|V \setminus A|$ are nearly even and that $\theta_A$ lies between 60 and 75 degrees for the graphs in the benchmark set. We observed no convincing trends connecting these parameters with the weight of the maximum cut.

3.4. Local Search Improvement

A straightforward way to improve a given solution of MAX-CUT is to add a local search step. We use adaptive walks for this purpose and restrict ourselves to RR and the two best-performing clustering approaches, i.e., K-MeansNM and K-Means2N. The results presented in Figure 4 are the best solutions found in all of the 50 iterations for each algorithm. The same relative measure of performance, $\hat{f}_{cluster}$, is used as in Section 3.3. The corresponding instance-specific performance bound α can be found in the Additional Material.
Figure 5 shows the ratio f r o u n d i n g / f b e s t , where f r o u n d i n g is the best solution after local search found either by clustering or RR for that instance and f b e s t is the best solution known in the literature, taken from [9]. The corresponding values are also available in tabular form as Additional Material. Graphs G11 to G13, G32 to G34, G57, G62, G65 to G67, and G72 have toroidal topology; both the clustering algorithm and Randomized Rounding show a comparably low approximation ratio for these instances. For graphs with other topologies, we obtain approximation ratios exceeding 0.96 , sometimes closely approaching 1. The combination of the Goemans–Williamson relaxation with clustering thus at least comes close to the best solutions known in the literature.

4. Conclusions

As the results show, using clustering methods other than the randomized rounding of [15], on average, leads to better cut values. Using k-means with an initialization equivalent to starting from Goemans–Williamson rounding solutions (K-MeansNM), and keeping track of the points visited by k-means at all times, we can guarantee that the approximation guarantee is maintained, with the possibility of finding larger cut values. For the other clustering algorithms, this is not true; however, one version of k-means (K-Means2N) found the same or better solutions than RR even without the guarantee. On average, the remaining clustering algorithms also yield larger cut values than RR, and the fraction of instances on which they find lower cuts is less than 15% in the worst case (K-MeansDet) and less than 5% for the others. Our approach is not guaranteed to improve all instances. In particular, it does not result in a theoretical improvement of the Goemans–Williamson approximation guarantee.
We have derived, however, an instance-specific lower bound for the approximation ratio that depends both on the instance and the solution, i.e., the cut itself. It provides a plausible measure of performance also for instances with unknown maximal cut value.
The success of k-means-related clustering approaches suggests extending this idea to other clustering methods. For spectral clustering, for instance, a natural starting point would be an auxiliary graph with weights $\omega_{ij} = \max(v_i \cdot v_j, 0)$ and edge set $\{(i,j) \mid \omega_{ij} > 0\}$.

Author Contributions

All authors jointly contributed to the conceptualization of the study, the theoretical results, the interpretation of the benchmark data, and the writing of the manuscript. A.E.R.-F. implemented the software and performed the benchmark. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded in part by the Consejo Nacional de Ciencia y Tecnología (CONACyT) and the University of California Institute for Mexico and the United States (UC MEXUS). The APC was funded by the Library of Leipzig University.

Acknowledgments

We thank Giovanni Rinaldi and Laura Palagi for the discussions about the solver of the SDP problem.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the scientific content of this work.

Abbreviations

The following abbreviations are used in this manuscript:
IQP: Integer Quadratic Programming
VP: Vector Programming
SDP: Semidefinite Programming
Fuzzy: Fuzzy c-means clustering
K-MeansRand: Randomized version of k-Means
K-MeansDet: Deterministic version of k-Means
K-MedRand: Randomized version of k-Medoids
K-MedDet: Deterministic version of k-Medoids
MST: Minimum Spanning Tree
RR: Randomized Rounding (Goemans–Williamson rounding)
K-Means2N: Randomized version of k-Means initialized with 2 random vectors
K-MeansNM: Randomized version of k-Means initialized with a random vector and its negative

References

  1. Karp, R.M. Reducibility among combinatorial problems. In Complexity of Computer Computations; Miller, R.E., Thatcher, J.W., Eds.; Plenum Press: New York, NY, USA, 1972; pp. 85–103.
  2. Papadimitriou, C.H.; Yannakakis, M. Optimization, approximation, and complexity classes. J. Comput. Syst. Sci. 1991, 43, 425–440.
  3. Hadlock, F. Finding a Maximum Cut of a Planar Graph in Polynomial Time. SIAM J. Comput. 1975, 4, 221–225.
  4. Grötschel, M.; Nemhauser, G.L. A polynomial algorithm for the max-cut problem on graphs without long odd cycles. Math. Program. 1984, 29, 28–40.
  5. Bodlaender, H.L.; Jansen, K. On the complexity of the maximum cut problem. Nord. J. Comput. 2000, 7, 14–31.
  6. Anglès d’Auriac, J.C.; Preissmann, M.; Sebö, A. Optimal cuts in graphs and statistical mechanics. Math. Comput. Model. 1997, 26, 1–11.
  7. Festa, P.; Pardalos, P.M.; Resende, M.G.C.; Ribeiro, C.C. Randomized heuristics for the Max-Cut problem. Optim. Methods Softw. 2002, 17, 1033–1058.
  8. Klemm, K.; Mehta, A.; Stadler, P.F. Landscape Encodings Enhance Optimization. PLoS ONE 2012, 7, e34780.
  9. Ma, F.; Hao, J.K. A multiple search operator heuristic for the max-k-cut problem. Ann. Oper. Res. 2017, 248, 365–403.
  10. Shao, S.; Zhang, D.; Zhang, W. A simple iterative algorithm for maxcut. arXiv 2019, arXiv:1803.06496.
  11. Delorme, C.; Poljak, S. Laplacian eigenvalues and the maximum cut problem. Math. Program. 1993, 62, 557–574.
  12. Poljak, S.; Rendl, F. Solving the max-cut problem using eigenvalues. Discret. Appl. Math. 1995, 62, 249–278.
  13. Trevisan, L. Max Cut and the Smallest Eigenvalue. SIAM J. Comput. 2012, 41, 1769–1786.
  14. Soto, J.A. Improved Analysis of a Max-Cut Algorithm Based on Spectral Partitioning. SIAM J. Discret. Math. 2015, 29, 259–268.
  15. Goemans, M.X.; Williamson, D.P. Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming. J. ACM 1995, 42, 1115–1145.
  16. Grippo, L.; Palagi, L.; Piccialli, V. An unconstrained minimization method for solving low-rank SDP relaxations of the maxcut problem. Math. Program. 2011, 126, 119–146.
  17. Palagi, L.; Piccialli, V.; Rendl, F.; Rinaldi, G.; Wiegele, A. Computational Approaches to Max-Cut. In Handbook on Semidefinite, Conic and Polynomial Optimization; Springer: Boston, MA, USA, 2011; pp. 821–847.
  18. Mahajan, S.; Ramesh, H. Derandomizing semidefinite programming based approximation algorithms. In Proceedings of the IEEE 36th Annual Foundations of Computer Science, Milwaukee, WI, USA, 23–25 October 1995; pp. 162–169.
  19. Håstad, J. Some optimal inapproximability results. J. ACM 2001, 48, 798–859.
  20. Khot, S.; Kindler, G.; Mossel, E.; O’Donnell, R. Optimal inapproximability results for MAX-CUT and other 2-variable CSPs? SIAM J. Comput. 2007, 37, 319–357.
  21. Feige, U.; Karpinski, M.; Langberg, M. Improved approximation of Max-Cut on graphs of bounded degree. J. Algorithms 2002, 43, 201–219.
  22. MacQueen, J. Some methods for classification and analysis of multivariate observations. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability; University of California Press: Berkeley, CA, USA, 1967; Volume 1, pp. 281–297.
  23. Dhillon, I.S.; Modha, D.S. Concept Decompositions for Large Sparse Text Data Using Clustering. Mach. Learn. 2001, 42, 143–175.
  24. Kaufmann, L.; Rousseeuw, P. Clustering by Means of Medoids. In Data Analysis Based on the L1-Norm and Related Methods; Dodge, Y., Ed.; North-Holland: Amsterdam, The Netherlands, 1987; pp. 405–416.
  25. Bezdek, J.C. Pattern Recognition with Fuzzy Objective Function Algorithms; Springer: Boston, MA, USA, 1981.
  26. Grygorash, O.; Zhou, Y.Z.; Jorgensen, Z. Minimum Spanning Tree Based Clustering Algorithms. In Proceedings of the 18th IEEE International Conference on Tools with Artificial Intelligence (ICTAI’06), Arlington, VA, USA, 13–15 November 2006; Lu, C.T.L., Bourbakis, N.G.B., Eds.; IEEE Computer Society: Los Alamitos, CA, USA, 2006; pp. 73–81.
  27. Bishop, C.M. Pattern Recognition and Machine Learning (Information Science and Statistics); Springer: Berlin/Heidelberg, Germany, 2006.
  28. Hastie, T.; Tibshirani, R.; Friedman, J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction; Springer: Berlin/Heidelberg, Germany, 2009.
Figure 1. Paths traced by 50 runs of K-MeansNM in the plane of cluster distortion versus weight of the corresponding cut. The diagram shows all values generated while running k-means. Points at the bottom right of the plot are the starting points, and thus the results of Goemans–Williamson rounding; points at the top left are the end points. At each step of K-MeansNM, the points move to the left until the algorithm finds a local minimum of the distortion. The red line is a linear fit of all points after the second step of k-means and shows a clear correlation between cluster distortion and cut weight.
Figure 2. Comparison between the cut value found by clustering algorithms and RR, using f ^ c l u s t e r as the comparison. The horizontal axis represents the graph number, i.e., graph Gi is shown in position i. The red line indicates the average over the benchmark set. Positive values indicate that the clustering solutions are superior to the RR solutions.
Figure 3. Comparison of instance-specific performance bounds α ( A ) between clustering algorithms and RR. The cyan line is the average of α ( A ) for the clustering methods and the magenta line is the average of RR for comparison.
Figure 4. Comparison between the cut value found by clustering algorithms and RR after local search (LS), using f ^ c l u s t e r as the comparison. The horizontal axis represents the graph number, i.e., graph Gi is shown at position i. The red line indicates the average over the benchmark set. Positive values indicate that the locally improved clustering solutions are superior to the locally improved RR solutions.
Figure 5. Comparison of the best cut value obtained with clustering (in magenta) and RR (in cyan) followed by subsequent local improvement, with the best cut value ( f b e s t ) available in the literature.