Symmetry Detection for Quadratic Optimization Using Binary Layered Graphs

: Symmetry in mathematical optimization may create multiple, equivalent solutions. In nonconvex optimization, symmetry can negatively affect algorithm performance, e.g., of branch-and-bound when symmetry induces many equivalent branches. This paper develops detection methods for symmetry groups in quadratically-constrained quadratic optimization problems. Representing the optimization problem with adjacency matrices, we use graph theory to transform the adjacency matrices into binary layered graphs. We enter the binary layered graphs into the software package nauty that generates important symmetric properties of the original problem. Symmetry pattern knowledge motivates a discretization pattern that we use to reduce computation time for an approximation of the point packing problem. This paper highlights the importance of detecting and classifying symmetry and shows that knowledge of this symmetry enables quick approximation of a highly symmetric optimization problem.


Introduction
When the optimization variables can be permuted without changing the structure of the underlying optimization problem, we say that the formulation group of an optimization problem is symmetric [1,2]. For motivation, consider the circle packing problem illustrated in Figure 1 [3]. Given an integer n > 0, the circle packing problem asks: what is the largest radius r for which n non-overlapping circles can be placed in the unit square? Costa et al. [3] show that the formulation group, i.e., a subgroup of symmetry group generated by permuting variables and constraints, is isomorphic to a symmetry group created by permuting the variable indices and switching the two coordinates in a unit square (C 2 × S n ). Solution methods for nonconvex optimization problems lacking symmetry-aware formulations and/or solution procedures may end up exploring all of these equivalent solutions. In other words, symmetry may cause classical optimization methods such as branch-and-bound to explore many unnecessary subtrees.
More generally, a number of authors have considered a range of symmetry detection methods, e.g., for constraint programming [4], integer programming [1,[5][6][7][8], and mixed-integer nonlinear optimization [2]. These automatic symmetry detection methods can then be used to mitigate the computational difficulties caused by symmetries, e.g., with symmetry-breaking constraints [9][10][11], objective perturbation [12], specialized branching strategies [13,14], cutting planes [15,16], and extended formulations [17]. The recent computational comparison of Pfetsch and Rehn [18] indicates that these state-of-the-art symmetry handling methods expedite the solution process for the MIPLIB 2010 instances and additionally enable more instances to be solved in a time limit.  Given an integer n > 0, the circle packing problem asks: what is the largest radius r for which n non-overlapping circles can be placed in the unit square? Already for n = 2, there are four equivalent solutions [3]; these solutions are related to one another via rotations and reflections.

M1
M2 Task 2   Task 1   Task 3   0  1  2  3  4  5  6  7  8  9  10 Time M1 M2  This paper develops detection methods for symmetry groups in quadratically-constrained quadratic optimization problems. Representing the optimization problem with adjacency matrices, we use graph theory to transform the adjacency matrices into binary layered graphs. We enter the binary layered graphs into the software package nauty that generates important symmetric properties of the original problem. Symmetry pattern knowledge motivates a discretization pattern that we use to reduce computation time for an approximation of the point packing problem.

Formulation Symmetry for Quadratically-Constrained Quadratic Optimization Problems
Consider the quadratically-constrained quadratic optimization problem (QCQP): where: with finite variable bounds x L i , x U i ∈ R, ∀i and coefficients α k ij ∈ R for i, j ∈ {0, . . . , n}, k ∈ {0, . . . , m}. To represent symmetry in the QCQP formulation, consider S n , the symmetric group of order n formed by the n! possible permutation operations. The formulation group of QCQP , or G QCQP , is the set of variable index permutations that preserve the objective and constraint structure [2]. For a variable index permutation π ∈ S n , we seek the constraint index permutations σ ∈ S n that maintain both the Because dom( f ) may be nonconvex and difficult to compute, this paper considers a G QCQP restriction that enforces symmetry on the entire box bounds, i.e., we assume that dom( f ) = x L , x U for the purpose of computing G QCQP . The next subsections represent formulation symmetry in two ways: (i) expression graphs in Section 2.1 and (ii) tensors in Section 2.2. Representing formulation symmetry using expression graphs is due to Liberti [2] and the tensor representation is new to this paper.

Symmetry Detection with Expression Graphs
One option to compare two functions is to compare their expression trees, i.e., a directed acyclic graph representation of each function that incorporates the relevant operations, constants, and variables [2]. These expression tree models were first developed for mixed integer nonlinear optimization (MINLP) by Smith and Pantelides [32] and are common in most global MINLP solvers [33][34][35][36][37][38][39] and other MINLP-related software [40][41][42]. Figure 3 illustrates a simple example of an expression tree for 3x 1 + 2x 2 4 + 2x 2 x 3 . A tree comparison algorithm may recursively compare two trees to determine equivalence [2]. More advanced implementations may detect equivalent but differently-formulated expressions, e.g., ( Example of an expression tree for 3x 1 With a directed acyclic graph representation, Liberti [2] computes the formulation symmetry group using the graph isomorphism problem, i.e., a problem that can be solved using off-the-shelf software nauty [43]. Liberti [2] also proves how to map the automorphism group of a directed acyclic graph to the formulation group of the original MINLP.

Symmetry Detection with Tensors
As an alternative to the expression tree representation, Figure 4 illustrates that QCQP can be represented as a tensor: A QCQP ∈ R (n+1)×(n+1)×(m+2n) . Each of the two dimensions (n + 1) corresponds to a constant term and the variables. Each two-dimensional slice of the tensor corresponds to the constant, linear, and quadratic terms in a constraint. The first m slices correspond to Equation (1) and have entries a k ij . The next 2n slices correspond to the box constraints, i.e., x i ≥ x L i and x i ≤ x U i , ∀i. The formulation group of this representation is:

Sparse Tensor Representation
For a given tensor A QCQP , consider a sparse representation, illustrated in Figure 5, that reduces the memory required to store the tensor. Instead of storing the entire tensor, we store arrays of length s where s is the number of nonzero entries in QCQP. The first array, M = (M 1 , . . . , M s ) stores all nonzero entries α k ij of QCQP. The next three arrays, I = (I 1 , . . . , I s ), J = (J 1 , . . . , J s ), K = (K 1 , . . . , K s ) represent the indices corresponding to the nonzero α k ij entries. The maximum size of s is (n + 1) 2 (m + 2n), but, in practice, most arrays will be significantly shorter.

Converting Matrices to Edge-Labeled, Vertex-Colored Graphs
We convert the sparse tensor representation of A QCQP into an edge-labeled, vertex-colored graph. Given the edge-labeled, vertex-colored graph, generating graph automorphisms to the original problem symmetries is well-known [1,5,44,45]. To construct the edge-labeled, vertex-colored graph, consider a graph G = (V, E, c) corresponding to an instance M, I, J, K. The function c : E → r, for r ∈ {0, . . . , − 1} is an edge coloring where ∈ Z + is the number of different coefficients in M. Each unique element in M is stored in a vector U ∈ R . We also partition (color) the vertex set into four subsets: a set V F representing the objective function, V C nodes for the constraints, a constant node V S , and V R variable nodes. The automorphism definition prevents vertices from being mapped onto a vertex of a different color, so these colors prevent, for example variables becoming constraints. The equivalence relation is [2]: Figure 6 illustrates the edge-labeled, vertex-colored graph. Initially, the edge set is empty E = ∅.
, from a vertex in the set that represents the constant element / variables to a vertex in the set of the objective function / constraints, with the relevant color. The graph construction incorporates edges between variable nodes V R for the quadratic bilinear terms. For i = {0, . . . , s}: . w 1 = Figure 6. The tensor A QCQP as an edge-labeled, vertex-colored graph.

Formulation Symmetry Detection via Binary Layered Graphs
The software nauty [43], which detects symmetry, accepts vertex-colored graphs but does not accept the Section 2.2.2 edge-labeled, vertex-colored graphs. Thus, we associate edge colors with layers in a graph and transform the edge-labeled, vertex-colored graph into a vertex-colored graph. Since the transformation from an edge-labeled, vertex-colored graph to a vertex-colored graph is isomorphic [43], the transformation does not lose anything. Using the resulting binary layered graphs, we generate the automorphism group and find symmetry in the original QCQP.
To convert a graph G = (V, E, c) with colors into an -layered graph [43], we replace each vertex v j ∈ V with a fixed connected graph of vertices v j . Finally, we partition the vertices by the superscripts, Alernatively, a binary representation avoids too many layers in G when the number of colors is large.
Definition 2 (Binary Layered Graph). Let ∈ Z + be the number of edge labels of G. A binary layered graph is a vertex-colored graph where the number of layers L = log 2 ( + 1) matches a binary representation.
We assign a unique positive integer µ(z) to each z ∈ U and map edge labels µ(z) to a binary representation that switches on/off parameters c t to represent the edge colors as layers. If c t = 1, add a new edge from v t i to v t j for every c t ∈ {c 1 , . . . , c L−1 }: Figure 7 illustrates the resulting binary labeled graph with its L = log 2 ( + 1) + 2 layers. There are vertices for the objective function and each constraint and layers of copies of these constraints (connected with vertical edges). The horizontal edges encode the problem coefficients. On the upper part of Figure 7, there are vertices for a constant element and each variable and a layer of variable copies (connected with vertical edges). Here, the horizontal edges and loops distinguish the linear and bilinear terms. Algorithms 1 and 2 summarize computing the vertex and edge sets, respectively.
L ∈ Z, L = log 2 (|U| + 1) + 2 Define L ∈ Z + the number of layers 4: for s = 0 → L − 3 do Partition of vertices representing the constraints 5: for k = 0 → m do 6: Copies of vertices representing the constraints 7: end for 8: end for 10: for s = L − 2, L − 1 do Partition of vertices representing the variables 11: for i = 0 → n do 12: Vertical edges between copies of vertices 5: for k = 0 . . . m do Copies of vertices representing the constraints 6: Vertical edges between copies of vertices 10: for j = 0 . . . n do Copies of vertices representing the variables 11: for k = 0 . . . m do 26: if KM(F) = k then 27: if I M(F) = 0 then 28: else 30: end if 32: end if 33: end for 34: end for 35: return G 36: end procedure
Equation (2) computes the binary representation of each unique element, e.g., 3 = 2 1 + 2 0 indicates that there is an edge between vertices on layer zero and another edge between the same vertices on layer 1. The graph consists of four layers and |V| = 18, one associated with a constant element and one with the objective function and the rest for the problem variables and constraints. The left-hand side of Figure 8 illustrates the graph representation. Nauty generates permutations: π = (1, 2)(5, 6)(9, 12)(10, 11)(14, 17) (15,16). To see how these Nauty-generated permutations usefully explain the symmetry properties of the entire problem, observe: (i) Permutations (1, 2)(5, 6), as shown in Figure 8, permute the constraints c 1 , c 2 and (ii) Permutations (9, 12)(10, 11) are associated with the variables x 1 , x 4 and x 2 , x 3 with (14, 17)(15, 16) their copies. These permutations therefore allow us to automatically calculate the formulation group G = (x 1 x 4 )(x 2 x 3 ). The right-hand side of Figure 8 uses Section 2.1 to develop a directed acyclic graph representation for the same problem. The graph colors represent the vertex partitioning that enables node exchanges. In this case, the directed acyclic graph representation uses a smaller number of vertices and edges than the tensor-based representation. However, the representations generate the same formulation group.
Comparison. To evaluate the trade-offs between the tensor and directed acyclic graph representations, observe that both methods will search for the same formulation group symmetries. However, the tensor representation may be especially useful when working with problems with many differently-valued coefficients, i.e., the logarithmic number of layers may reduce the number of nodes. The function assigning integer values to the problem coefficients lets us work not only with 0-1 coefficients, but also with any other value.

Exploiting Symmetry in the Point Packing Problem
Once symmetry has been detected, we can use our knowledge of the symmetry to mitigate the computational difficulties caused by symmetry. Here, we focus on solving the point packing problem [46][47][48][49]. The point packing problem concerns packing n points to a unit square. The aim is to maximize the in-between distance between any two points: where θ denotes the minimum distance between any two points. To approximate this problem, consider a grid approach that approximates the optimal solution by adding grid lines to the unit square and forcing the points to be placed only to the vertices generated by these grid lines. Note that the point packing problem has significant applications, e.g., in placing mobile phone towers. For n points, there will be at most a k * ×k * grid, where k * is the smallest number whose square is the least integer that is greater than n, i.e., (k * −1) 2 < n and k * 2 ≥ n. In other words, we add at most 2k * grid lines. On this k * ×k * grid, points will occupy most of the vertexes and the unit length of spacing has been maximized. This approach, unfortunately, has the potential of missing the optimal solution. Consider fitting six points to the square. The most fitting grid is 3 × 3 and the optimal θ we achieve is 0.5. However, the optimal solution of 0.6009 is achieved by the arrangement shown in Figure 9, which is not available on a 3 × 3 grid. Although an approximation, k * is still a useful pruning tool, e.g., points need to be at least 1 k * −1 away from any other point.

Exhaustive Search and 2D Symmetry Removal
First, consider Algorithm 3, an exhaustive search method. To break the symmetries, start by calculating how many grids can possibly have points and the upper limit on the number of points on any occupied grid lines. Once calculated, the complete set of all possible combinations on the x-axis is calculated. For example, on a 5 × 5 grid, one possible x-coordinate setup is (2, 0, 1, 0, 2). If we label the five horizontal grid lines from 1 to 5, this setup means that there are two points on grid 1, no points on grid 2, one point on grid 3, no points on grid 4 and two points on grid 5. To remove x-axis symmetries, we consider reflections as a duplication, i.e., only one of (1, 0, 2, 0, 2) and (2, 0, 2, 0, 1) are considered. Step2 Generate complete set of combinations of x-coordinate 3: Step3 Symmetries removal on x-coordinates 4: Step4 Generate all y-coordinate sets based on binomial coefficient 5: Step5 Y-coordinates pruning Algorithm 3 considers the unique x-axis combinations. The y-axis, i.e., the horizontal grid lines, should preserve the same characteristics. Thus, in the improved Algorithm 4, we generate the set O of possible point arrangement such as (2, 0, 1, 0, 2), and this is applied to both horizontal and vertical grid lines; in other words, we now consider the product O × O to give the exact coordinates of all n points.

Algorithm 4 Algorithm 2-2D Symmetry Removal
Input: number of points n, number of grids k, number of occupied grids m. Output: The optimal solution d *  Figure 10. In the last setup, we have also added the labels for grids to match what we defined earlier. This diagram contains all possible arrangements of points under this particular orbit partitioning. To improve further, we can implement pruning mechanisms such as using the bound 1 k * −1 to prune the non-optimal setups. In this particular example, we would have pruned all five setups as none of them satisfy the 0.5 bound from the most fitting grid. This means that O 3 × O 4 is not the optimal point setup for five points on a 5 × 5 grid.

Results and Comparisons
Both algorithms have been implemented and run on the same devices (HP EliteDesk 800 G2 TWR Intel Core i7-6700 3.4 GHz) to provide effective comparisons. Figure 11 shows the experimental results of 2D Symmetry Removal for packing 6 points. We eliminated the result from Exhaustive Search as it is clear that 2D Symmetry Removal outperforms Exhaustive Search in terms of run-time. We notice that m, the number of occupied grids (in the diagram, this corresponds to the occupied vertical grids), has an impact on the run-time. To a reasonable extent, the larger the m, the more choices we have regarding where we place the points so it in general takes longer time to compute. Although our strategies effectively convert the point packing QCQP into a mixed-integer linear optimization problems, we could have alternatively designed a branch-and-bound algorithm that is symmetry aware.

Conclusions
This paper has explored alternative representations for finding symmetry in formulation groups of a quadratically-constrained optimization problem. We also show that, after knowing the symmetry, we can design significantly better methods to solve the optimization problems. The contributions in this paper are relevant to industrial problems that contain a point packing element [50][51][52].

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.