## 1. Introduction

The MAX-CUT problem is a well-known NP-hard [1] and APX-hard [2] combinatorial optimization problem. An instance consists of a graph $G$ with vertex set $V$, edge set $E$, and edge weights $w : E \to \mathbb{R}$. A cut is a bipartition $A, V\setminus A$ of the vertex set $V$, where $A \subseteq V$. The MAX-CUT problem consists of finding a cut that maximizes the total weight of the edges that span the cut. That is, the function

$$f(A) := \sum_{\substack{\{i,j\}\in E \\ i\in A,\; j\in V\setminus A}} w(\{i,j\}) \qquad (1)$$

is to be maximized over all $A \subseteq V$. While MAX-CUT is NP-hard in general, polynomial-time algorithms exist for restricted graph classes, including planar graphs [3], graphs without long odd cycles [4], and cographs [5].

MAX-CUT can also be written as a spin glass model; see, e.g., [6] for an overview. Using an arbitrary numbering of the $n$ vertices of the graph $G$, we write $x_i = +1$ if vertex $i \in A$, $x_i = -1$ if vertex $i \in V\setminus A$, and $x = (x_1, x_2, \dots, x_n)$. Furthermore, we set $w_{ij} := w(\{i,j\})$ if $\{i,j\}$ is an edge of $G$ and $w_{ij} = 0$ otherwise. With this notation, we can rewrite $f(A)$ in terms of the “spin vector” as

$$f(x) = \frac{1}{4}\sum_{i,j} w_{ij}\,(1 - x_i x_j). \qquad (2)$$

Maximizing $f(x)$ over $x \in \{-1,+1\}^n$ amounts to the integer quadratic programming (IQP) formulation of the MAX-CUT problem. Without loss of generality, we assume $w_{ij} = w_{ji}$.
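As a small consistency check of the spin-glass rewriting, the cut weight $f(A)$ equals $f(x) = \frac{1}{4}\sum_{i,j} w_{ij}(1 - x_i x_j)$ under $x_i = +1$ for $i \in A$ and $x_i = -1$ otherwise. The following sketch verifies this on a 4-vertex weighted graph of our own choosing:

```python
# Check that the cut weight f(A) equals the spin form f(x).
# The 4-vertex weighted graph below is an illustrative example of ours.

def cut_weight(w, A, n):
    """f(A): total weight of edges with exactly one endpoint in A."""
    return sum(wij for (i, j), wij in w.items() if (i in A) != (j in A))

def spin_value(w, x):
    """f(x): the spin form, summed over ordered pairs i != j."""
    n = len(x)
    wfull = {}
    for (i, j), wij in w.items():       # symmetrize: w_ij = w_ji
        wfull[(i, j)] = wfull[(j, i)] = wij
    return 0.25 * sum(wfull.get((i, j), 0.0) * (1 - x[i] * x[j])
                      for i in range(n) for j in range(n) if i != j)

w = {(0, 1): 1.0, (1, 2): 2.0, (2, 3): 1.5, (0, 3): 1.0, (0, 2): -0.5}
A = {0, 1}
x = [+1 if i in A else -1 for i in range(4)]
assert abs(cut_weight(w, A, 4) - spin_value(w, x)) < 1e-12  # both are 2.5
```

Each crossing edge contributes $w_{ij}(1-(-1)) = 2w_{ij}$ twice in the ordered sum, so the prefactor $\frac{1}{4}$ recovers exactly the cut weight.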

Solutions of large MAX-CUT problems are of considerable practical interest in network design, statistical physics, and data clustering; hence, a broad array of computational techniques has been customized to its solution. Broadly, they can be subdivided into combinatorial heuristics (see, e.g., [7,8,9] and the references therein) and methods involving relaxations of the integer constraints in Equation (2).

Replacing the integer vectors by $x \in \mathbb{R}^n\setminus\{0\}$ leads to an equivalent continuous optimization problem, for which an excellent heuristic is described in [10] and which has a close connection with the largest eigenvalues of generalized Laplacians [11] and their corresponding eigenvectors [12]. On this basis, algorithms with uniform approximation guarantees have been devised [13,14]. Goemans and Williamson [15] replaced the integers $x_i$ by unit vectors $\vec{v}_i$ of dimension $n = |V|$:

$$\max\; \frac{1}{4}\sum_{i,j} w_{ij}\,\bigl(1 - \vec{v}_i\cdot\vec{v}_j\bigr) \quad \text{subject to}\quad \vec{v}_i\cdot\vec{v}_i = 1 \text{ for all } i, \qquad (3)$$

where $\cdot$ denotes the scalar product of the unit vectors. This problem contains all the instances of the original problem (2), as seen by setting $\vec{v}_i = (x_i, 0, \dots, 0)$ for all $i$. The relaxed problem (3) is a particular instance of Vector Programming or, using a change of variables, an instance of Semidefinite Programming (SDP), and thus can be solved in polynomial time (up to an arbitrarily small additive error); see, e.g., [16,17].
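The change of variables behind the SDP form can be illustrated concretely: the vectors $\vec{v}_i$ enter the objective of (3) only through their Gram matrix $X$ with $X_{ij} = \vec{v}_i\cdot\vec{v}_j$, which is positive semidefinite with unit diagonal, and a Cholesky-type factorization $X = LL^{\mathsf{T}}$ recovers admissible vectors from $X$. A minimal sketch, with toy vectors of our own choosing:

```python
import math

# The vectors only matter through their Gram matrix X (PSD, unit diagonal);
# a Cholesky factorization X = L L^T recovers admissible unit vectors.
# The toy vectors below are illustrative, not taken from the paper.

def gram(vs):
    """Matrix of pairwise scalar products."""
    return [[sum(a * b for a, b in zip(u, v)) for v in vs] for u in vs]

def cholesky(X):
    """Lower-triangular L with L L^T = X (X positive semidefinite)."""
    n = len(X)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = X[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            # max(s, 0.0) guards against tiny negative round-off
            L[i][j] = math.sqrt(max(s, 0.0)) if i == j else s / L[j][j]
    return L

vs = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.6, 0.0, 0.8)]  # unit vectors
X = gram(vs)                  # diag(X) = 1 and X is positive semidefinite
L = cholesky(X)
X2 = gram(L)                  # rows of L reproduce the same scalar products
assert all(abs(X[i][j] - X2[i][j]) < 1e-9 for i in range(3) for j in range(3))
```

An SDP solver optimizes over such matrices $X$ directly; the factorization step is what turns its output back into the vectors $\vec{v}_i$ used for rounding.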

The solutions of the relaxed problem (3) are translated to solutions of the original IQP. Goemans and Williamson [15] proposed to use a random unit vector $\vec{r}$ and to set

$$\hat{x}_i := \mathrm{sgn}(\vec{v}_i\cdot\vec{r}), \qquad (4)$$

with $\mathrm{sgn}(t) = -1$ for $t < 0$ and $\mathrm{sgn}(t) = +1$ for $t \ge 0$. This amounts to cutting the sphere at the hyperplane with normal vector $\vec{r}$ and assigning $\hat{x}_i$ depending on whether $\vec{v}_i$ lies in the “upper” or “lower” hemisphere defined by this hyperplane.
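The hyperplane rounding step can be sketched in a few lines; the embedding vectors below are illustrative stand-ins for an SDP solution, not output of the paper's pipeline:

```python
import random

# Hyperplane rounding: draw a random direction r (i.i.d. Gaussians point
# uniformly on the sphere after normalization, which sgn ignores) and set
# x_i = sgn(v_i . r).

def sgn(t):
    return -1 if t < 0 else +1          # sgn(0) = +1, as in the text

def round_vectors(vs, rng):
    """Map unit vectors vs to a spin vector via a random hyperplane."""
    d = len(vs[0])
    r = [rng.gauss(0.0, 1.0) for _ in range(d)]
    return [sgn(sum(a * b for a, b in zip(v, r))) for v in vs]

# two antipodal pairs of unit vectors in the plane
vs = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
x = round_vectors(vs, random.Random(0))
# antipodal vectors land on opposite sides of almost every hyperplane
assert x[0] == -x[1] and x[2] == -x[3]
```

Antipodal embedding vectors, which the relaxation produces for strongly repelling vertex pairs, are separated by every hyperplane that does not contain them, which is the geometric intuition behind the approximation guarantee.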

The Goemans–Williamson relaxation yields an approximation bound of $\alpha := \min_{0\le\theta\le\pi}\frac{2}{\pi}\frac{\theta}{1-\cos\theta} > 0.878$ for the expected value $\mathbb{E}[f(\hat{x})]/\max f$, where the expectation is taken over the choices of $\vec{r}$ [15]. At present, it is the best randomized approximation algorithm for the integer quadratic programming formulation of the MAX-CUT problem. A clever derandomization [18] shows that deterministic algorithms can attain the same approximation bound. On the other hand, it is NP-hard to approximate MAX-CUT better than the ratio $16/17$ [19]. Furthermore, if the Unique Games Conjecture [20] is true, the approximation ratio cannot be improved beyond the Goemans–Williamson bound for all graphs. However, better ratios are achievable, e.g., for (sub)cubic graphs [21].
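The constant $\alpha$ can be confirmed numerically; a simple grid scan over $\theta$ (our choice of resolution, purely illustrative) suffices to verify the stated bound:

```python
import math

# Numerical sanity check of the Goemans-Williamson constant
# alpha = min over theta in (0, pi] of (2/pi) * theta / (1 - cos(theta)).

def ratio(theta):
    return (2.0 / math.pi) * theta / (1.0 - math.cos(theta))

steps = 10**5
alpha = min(ratio(k * math.pi / steps) for k in range(1, steps + 1))
assert 0.878 < alpha < 0.879
```

The minimum is attained in the interior of the interval; near the endpoints the ratio grows toward infinity (as $\theta \to 0$) and equals $1$ at $\theta = \pi$.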

The translation of the solution of the relaxed problem (3) in the Goemans–Williamson approximation relies on the choice of a random vector $\vec{r}$. Naturally, we ask whether the performance can be improved by expending more effort to obtain a better choice of $\vec{r}$. The key observation is that the purpose of $\vec{r}$ is to separate the solution vectors $\vec{v}_i$ of Equation (3) into two disjoint sets of points on the sphere. The two sets $A_+ := \{i : \vec{v}_i\cdot\vec{r} \ge 0\}$ and $A_- := \{i : \vec{v}_i\cdot\vec{r} < 0\}$ can thus be thought of as a pair of clusters. Indeed, two vectors $\vec{v}_i$ and $\vec{v}_j$ tend to be anti-parallel if $w_{ij}$ is large, while pairs of points $i$ and $j$ with small or even negative weights $w_{ij}$ are likely to wind up on the same side of the maximal cut. Of course, the random vector $\vec{r}$ is just one way of expressing this idea: if $\vec{v}_i$ and $\vec{v}_j$ are similar, then we will “usually” have $\mathrm{sgn}(\vec{v}_i\cdot\vec{r}) = \mathrm{sgn}(\vec{v}_j\cdot\vec{r})$. The “randomized rounding” of the Goemans–Williamson method can therefore also be regarded as a clustering method. This immediately raises the question of whether the solutions of the MAX-CUT problem can be improved by replacing the random choice of $\vec{r}$ with a clustering of the solution set $\{\vec{v}_i\}$ of the relaxed problem (3). We shall see empirically that the answer to this question is affirmative, even though the theoretical performance bound is not improved.
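The clustering idea can be sketched as follows: instead of a random hyperplane, split the embedding vectors $\{\vec{v}_i\}$ into two clusters and read the bipartition off the cluster labels. The tiny 2-means routine and the toy data below are our illustration; the paper's K-Means variants differ in their initialization:

```python
import math

# Illustrative sketch: a plain 2-means on the embedding vectors replaces
# the random hyperplane; cluster membership defines the spin vector.

def two_means_labels(points, iters=20):
    """Plain 2-means with naive initialization; returns a 0/1 label per point."""
    c0, c1 = points[0], points[-1]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [0 if dist2(p, c0) <= dist2(p, c1) else 1 for p in points]
        c0 = centroid([p for p, l in zip(points, labels) if l == 0], c0)
        c1 = centroid([p for p, l in zip(points, labels) if l == 1], c1)
    return labels

def dist2(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def centroid(pts, fallback):
    if not pts:
        return fallback                 # keep the old center for empty clusters
    d = len(pts[0])
    return tuple(sum(p[k] for p in pts) / len(pts) for k in range(d))

# two well-separated bundles of unit vectors on the circle
vs = [(math.cos(t), math.sin(t)) for t in (0.0, 0.1, -0.1, 3.0, 3.1, 3.2)]
labels = two_means_labels(vs)
x = [+1 if l == 0 else -1 for l in labels]
assert labels[0] == labels[1] == labels[2]
assert labels[3] == labels[4] == labels[5]
assert labels[0] != labels[3]
```

Note that the centroids leave the sphere; only the induced bipartition matters for the resulting cut.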

In practical applications, solutions of relaxed problems are often post-processed by local search heuristics. Therefore, a local search starting from the final results of both the Goemans–Williamson relaxation and two of our best clustering approaches was performed in order to improve the cut values.

This contribution is organized as follows. In the following section, we briefly summarize the data sets and clustering algorithms with their relevant properties, as well as the details of the local search used to improve the cut values. In Section 3.1, we describe an initial analysis of the effect of clustering on the cut weights, showing that the quality of near-optimal clusters correlates well with cut weights. Since we were not able to show for most clustering methods that they retain the Goemans–Williamson performance bound, we derive an instance-specific bound in Section 3.2 that provides a convenient intrinsic quality measure. In Section 3.3, we extend the empirical analysis to the benchmarking set that also contains very large graphs. We show that the use of clustering methods indeed provides a consistent performance gain. We also see that the instance-specific performance bounds are much closer to 1 than the uniform Goemans–Williamson $\alpha$. Finally, in Section 3.4, we consider the improvement to the cut values that is achieved with local search starting from the Goemans–Williamson and the two best clustering relaxations.

## 4. Conclusions

As we can see from the results, using clustering methods other than the randomized rounding of [15], on average, leads to better cut values. Using $k$-means with an initialization equivalent to starting from Goemans–Williamson rounding solutions (K-MeansNM), and keeping track of the points visited by $k$-means at all times, we can guarantee that the approximation guarantee is maintained, with the possibility of finding larger cut values. For the other clustering algorithms, this is not true; however, for one version of $k$-means (K-Means2N), the same or better solutions than RR were found, even without the guarantee. On average, the remaining clustering algorithms yield larger cut values than RR, and the number of instances where those algorithms find lower cuts is less than 15% in the worst case (K-MeansDet) and less than 5% for the others. Our approach is not guaranteed to improve all instances. In particular, it does not result in a theoretical improvement of the Goemans–Williamson approximation guarantee.

We have derived, however, an instance-specific lower bound for the approximation ratio that depends both on the instance and on the solution, i.e., the cut itself. It provides a plausible measure of performance even for instances whose maximal cut value is unknown.

The success of $k$-means-related clustering approaches suggests extending this idea to other clustering methods. For spectral clustering, for instance, a natural starting point would be an auxiliary graph with weights $\omega_{ij} = \max(\vec{v}_i\cdot\vec{v}_j, 0)$ and edge set $\{(i,j) : \omega_{ij} > 0\}$.