1. Introduction
Multidimensional arrays, or tensors, arise naturally as an extension of matrices. They occur in applications where one needs to represent multidimensional data, such as signal processing [1,2,3], machine learning [2,4,5], material science [6], and speech recognition [7]. For example, any homogeneous polynomial of degree d in n variables is associated with a symmetric tensor of order d and dimension n. In polynomial optimization and, likewise, in control theory, checking the non-negativity of a polynomial is a fundamental problem [8,9]. Non-negativity of a polynomial over the real space or over its non-negative orthant corresponds to positive semidefiniteness or copositivity of the associated tensor, respectively.
The copositive and completely positive cones of matrices, which are tensors of order two, are very well explored (see, e.g., [10,11,12] and also [13] for a list of open problems). Therefore, it seems natural to study similar results for copositive tensors. The generalization from matrix to tensor is not trivial since a higher dimension usually destroys the nice structure present at a lower dimension.
Current research on copositive tensors focuses on describing properties that can be generalized from the quadratic case to the higher-dimensional case, and the area is not yet well explored. Similar to its matrix analog, a characterization of copositive tensors using eigenvectors of principal sub-tensors is described in [14]. Moreover, Qi and co-authors have discussed several basic properties of copositive tensors in a series of papers (see, e.g., [15] and the book [16]).
The set of copositive tensors forms the copositive cone. The copositive cone is used to reformulate hard combinatorial optimization problems as linear conic optimization problems. Reformulating a polynomial optimization problem as a copositive program does not reduce its complexity; rather, the complexity is packaged in the copositivity constraint, which is co-NP-hard to check. To approximate the copositive cone, several tractable approximation hierarchies have been developed. These hierarchies are based on sum-of-squares conditions, non-negativity of polynomial coefficients, simplicial partitions, or rational gridding of the simplex. For instance, Parrilo [17] provided a hierarchy of linear and semidefinite inner approximations for the copositive cone (see also [18]). Moreover, Bomze and de Klerk [19] developed a hierarchy for copositive matrices based on non-negativity of polynomial coefficients.
In this paper, we describe approximation hierarchies for the cone of copositive tensors. We extend the hierarchies presented by Parrilo [17] (represented by
) and Bomze and de Klerk [19] (represented by
) to higher dimensions. Earlier work focused on extending polynomial-time approximation schemes from lower to higher dimension. For example, Bomze and de Klerk [19] focused on improving the approximation result presented by Nesterov [20]. They first presented a polyhedral representation of
and then used this representation to derive an approximation scheme for polynomial optimization over the simplex. The approximation scheme was extended to polynomials of higher fixed degree by De Klerk, Laurent and Parrilo [21], who used rational gridding of the simplex to derive this polynomial-time approximation scheme. The results were further refined using the Bernstein approximation in [22]. Since these approximation schemes rely on rational gridding, an error analysis based on the multivariate hypergeometric distribution has also been presented (see [23] and also [24] for a note on the convergence rate).
Furthermore, we develop a compact representation of
with the aim of extending the initial results presented in Bomze and de Klerk [19]. To the best of our knowledge, the closest attempt in this direction is that of [25]. However, their results are restricted to tensors of order four. Moreover, those results rely heavily on breaking the tensor of dimension four into tensors of lower dimensions. The representation process was tedious and had no obvious generalization to higher-dimensional cases. Our representation is general and holds for all dimensions. These hierarchies can be used to develop polynomial-time approximation schemes for polynomial optimization over the simplex, and we provide results in this direction (see Section 5).
The main contributions of this paper are as follows. (a) We discuss basic properties of the copositive tensor cone and show that a Z-tensor is copositive if and only if it is positive semidefinite. (b) We describe approximation hierarchies for the copositive cone of tensors based on sum-of-squares decompositions and non-negativity of the polynomial coefficients. Moreover, we present the polyhedral representation of the approximation hierarchy based on non-negative coefficients; this representation is compact and has not appeared in the literature. (c) We apply the approximation hierarchies to polynomial optimization over the simplex.
The article is arranged as follows. Section 2 presents the basic definitions and notation. In Section 3, we define tensor cones and give related results. In Section 4, we present approximation hierarchies for the copositive cone of tensors and discuss special cases and characterizations of these hierarchies. Section 5 provides approximation results based on these hierarchies. In Section 6, we provide a conclusion and future directions.
2. Preliminaries
Throughout this article, the n-dimensional Euclidean space and its non-negative orthant are denoted by $\mathbb{R}^n$ and $\mathbb{R}^n_+$, respectively. The set of natural numbers is denoted by
and the set of the first
n natural numbers is denoted by
. The set of whole numbers is denoted by
while the set of first
whole numbers is denoted by
. For any
we define
. We also define the index set
The cardinality of
is
(see, e.g., [
22]).
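The index set and its cardinality can be checked by brute force. The following is a minimal sketch, assuming the index set is the standard one used in [22], namely all exponent vectors with non-negative integer entries summing to d, whose cardinality is the binomial coefficient C(n + d - 1, d). The function name `index_set` is hypothetical and introduced only for illustration.

```python
from itertools import combinations_with_replacement
from math import comb


def index_set(n, d):
    """Return all exponent vectors alpha in N_0^n with alpha_1 + ... + alpha_n = d."""
    result = []
    # Each multiset of d variable indices corresponds to exactly one exponent vector.
    for combo in combinations_with_replacement(range(n), d):
        alpha = [0] * n
        for i in combo:
            alpha[i] += 1
        result.append(tuple(alpha))
    return result


alphas = index_set(3, 2)
print(alphas)
# Assumed cardinality formula: C(n + d - 1, d).
print(len(alphas) == comb(3 + 2 - 1, 2))
```

Enumerating multisets rather than filtering all tuples keeps the cost proportional to the size of the index set itself.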
A
tensor is a multi-dimensional array of real numbers. Specifically, an
n-dimensional
dth order tensor is given by
Moreover, a tensor
is said to be symmetric if
The set of all symmetric tensors is denoted by
. For brevity of notation, if some index
of an element
is repeated
k-times, we write it as
i.e.,
Using the above notation, the ith diagonal element of the tensor is denoted by .
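The symmetry condition and the diagonal entries can be made concrete with a small sketch. It assumes that symmetry means invariance of the entries under every permutation of the d indices, and it stores a tensor as a dictionary from index tuples to numbers; this storage format and the function names are illustrative choices, not notation from the paper.

```python
from itertools import permutations, product


def is_symmetric(tensor, n, d):
    """Check that entries are invariant under every permutation of the d indices."""
    for idx in product(range(n), repeat=d):
        for perm in permutations(idx):
            if tensor[perm] != tensor[idx]:
                return False
    return True


def diagonal(tensor, n, d):
    """Return the diagonal entries a_{i i ... i} for i = 0, ..., n - 1."""
    return [tensor[(i,) * d] for i in range(n)]


# Entries depend only on the multiset of indices, so this tensor is symmetric.
sym = {idx: sum(idx) for idx in product(range(2), repeat=3)}
# Entries depend on the position of the first index, so this one is not.
non_sym = {idx: idx[0] for idx in product(range(2), repeat=3)}

print(is_symmetric(sym, 2, 3), is_symmetric(non_sym, 2, 3))
print(diagonal(sym, 2, 3))
```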
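The inner product of two tensors $\mathcal{A} = (a_{i_1 \cdots i_d})$ and $\mathcal{B} = (b_{i_1 \cdots i_d})$ of the same order d and dimension n
is defined as follows
$$\langle \mathcal{A}, \mathcal{B} \rangle = \sum_{i_1, \ldots, i_d = 1}^{n} a_{i_1 \cdots i_d}\, b_{i_1 \cdots i_d}.$$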
For
and
,
represents a monomial. The maximum degree among all the monomials of
is called the degree of the polynomial. A polynomial all of whose monomials have the same degree is termed a ‘form’ or a ‘homogeneous polynomial’. Moreover, for
,
denotes a symmetric tensor of dimension
n and degree
d and is given below,
Notice that the entries of tensor
are the monomials of degree
d in
n-variables. Thus for any symmetric tensor
its associated form can be written as
where
denotes the coefficient of the monomial
in
and
and
denotes the factorial of
.
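The link between a symmetric tensor and its associated form can be illustrated numerically. The sketch below assumes the standard convention that the form is the sum of a_{i_1...i_d} x_{i_1}...x_{i_d} over all index tuples, so that the coefficient of a monomial with exponent vector alpha is d!/alpha! times the corresponding tensor entry. The function names and the example tensor are hypothetical.

```python
from itertools import product
from math import factorial, prod


def form_value(tensor, x, d):
    """Evaluate the sum over all index tuples of a_{i_1...i_d} * x_{i_1} * ... * x_{i_d}."""
    n = len(x)
    return sum(tensor[idx] * prod(x[i] for i in idx)
               for idx in product(range(n), repeat=d))


def monomial_coefficients(tensor, n, d):
    """Collect the coefficients of the form, keyed by exponent vector alpha.

    For a symmetric tensor each monomial receives d!/alpha! equal contributions,
    one for every distinct ordering of its indices.
    """
    coeffs = {}
    for idx in product(range(n), repeat=d):
        alpha = tuple(idx.count(i) for i in range(n))
        coeffs[alpha] = coeffs.get(alpha, 0) + tensor[idx]
    return coeffs


n, d = 2, 3
tensor = {idx: 1 + sum(idx) for idx in product(range(n), repeat=d)}  # symmetric
coeffs = monomial_coefficients(tensor, n, d)

# Coefficient of x_1^2 x_2: multinomial factor 3!/(2! 1!) times the entry a_{112}.
multinomial = factorial(3) // (factorial(2) * factorial(1))
print(coeffs[(2, 1)] == multinomial * tensor[(0, 0, 1)])

# Both representations evaluate to the same number.
x = (1, 2)
via_coeffs = sum(c * prod(xi ** a for xi, a in zip(x, alpha))
                 for alpha, c in coeffs.items())
print(form_value(tensor, x, d), via_coeffs)
```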
For
a subset
is said to be a cone if for each tensor
the scalar product
for all
. Moreover, the cone
K is said to be a convex cone if for
and for non-negative scalars
we have,
. The dual
of the cone
K is defined as
A convex cone K is said to be pointed if $K \cap (-K) = \{0\}$, and K is said to be solid if its interior is nonempty. A convex cone that is closed, pointed, and solid is termed a proper cone. A convex cone K is said to be a polyhedral cone if it is finitely generated.
The cone of entry-wise non-negative tensors is denoted by
. A tensor with non-positive off-diagonal entries is termed a
Z-tensor (also called an
essentially non-positive tensor; see, e.g., [
26]). Finally, the standard simplex
in
n-dimensional Euclidean space
is
where
.
3. Positive Semidefinite Tensors, Copositive Tensors and Their Duals
The cone of
dth-order (with
d even)
n-dimensional, positive semidefinite tensors is denoted by
and is given below
For a tensor
, the polynomial
is called a PSD polynomial. The dual of the PSD cone
is the cone of completely positive semidefinite tensors, denoted by
, which is defined as follows.
For
, the PSD cone is self-dual, i.e.,
(cf. [
16]). However, for
we have
in general (see [
27], Example 4.5).
It is well known that a twice continuously differentiable function is convex if and only if its Hessian matrix is positive semidefinite everywhere (see, e.g., [
28], Theorem 4.5). Therefore, the convexity of the homogeneous polynomial defined in (
3) amounts to checking if
. It has been shown that if a polynomial
is convex then its associated tensor
is positive semidefinite (see [
27], Proposition 5.10). However, the converse need not be true in general (see [
27] (Example 5.11) and [
29]).
A tensor
is said to be
copositive if
for all
, whereas a tensor is called
strictly copositive if
for all
. The set of
n-dimensional,
dth-order copositive tensors defines a cone given below,
It is well known that a tensor
if and only if it is strictly copositive, where
denotes the interior of the set
. The dual of the copositive cone
is the completely positive cone, denoted by
(see e.g., [
16] (Theorem 6.9), [
30]), which is defined below,
It is clear that if a tensor is positive semidefinite, then it is also copositive. A copositive tensor need not be positive semidefinite (cf. Example 1). The question arises in which cases a copositive tensor is also positive semidefinite. In the following theorem, we describe one such case (for
see [
31], Lemma 2.6).
Theorem 1. Let be a Z-tensor, then is copositive if and only if it is positive semidefinite, where d is even.
Proof. Let us take
and since
is
Z-tensor, we can write
where
such that,
To show that
take
and consider,
Since
d is even, we have
for all
. However,
can be positive or negative. Note that if
for some
then clearly from (
13) we have,
.
So, the only case left is when
for some
. To show that
in this case also, we define
such that
. To show that
consider,
and note that for
we have
. Clearly, we have
Since
d is even therefore we have
. Furthermore,
is copositive and
implying
. Thus, we have
From (
13) and (
15) we deduce that
in this case also. Hence,
is positive semidefinite.
The converse is obvious since every positive semidefinite tensor is also copositive. □
Note that the above result also appeared in Zhang et al. [
32] (Theorem 3.5(e) and Theorem 3.12), where the proof is constructed based on the spectral properties of the so-called M-tensors. The proof given above is self-contained and does not require any extra structure.
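One way to see why Theorem 1 is plausible is the following inequality, which is a sketch of the mechanism rather than a transcription of the proof: for a Z-tensor of even order d, the value of the form at x is at least its value at the entry-wise absolute value |x|, because diagonal terms are unchanged and every off-diagonal term can only increase. Since |x| lies in the non-negative orthant, copositivity then forces non-negativity everywhere. The code below tests the inequality on random integer data; the helper names and the sampling ranges are assumptions made for illustration.

```python
import random
from itertools import combinations_with_replacement, permutations, product
from math import prod


def random_z_tensor(n, d, rng):
    """Build a symmetric integer tensor whose off-diagonal entries are <= 0."""
    tensor = {}
    for combo in combinations_with_replacement(range(n), d):
        is_diagonal = len(set(combo)) == 1
        value = rng.randint(-3, 3) if is_diagonal else rng.randint(-3, 0)
        # Assign the same value to every ordering of the indices (symmetry).
        for perm in set(permutations(combo)):
            tensor[perm] = value
    return tensor


def form_value(tensor, x, d):
    """Evaluate the form associated with the tensor at the point x."""
    n = len(x)
    return sum(tensor[idx] * prod(x[i] for i in idx)
               for idx in product(range(n), repeat=d))


def check_abs_inequality(n, d, trials, seed=0):
    """Return True if form(x) >= form(|x|) held in every random trial (d even)."""
    rng = random.Random(seed)
    for _ in range(trials):
        tensor = random_z_tensor(n, d, rng)
        x = [rng.randint(-3, 3) for _ in range(n)]
        abs_x = [abs(v) for v in x]
        if form_value(tensor, x, d) < form_value(tensor, abs_x, d):
            return False
    return True


print(check_abs_inequality(n=3, d=4, trials=200))
```

Integer data is used so that the comparison is exact and free of rounding effects.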
4. Approximation Hierarchies for the Copositive Cone
Recall that a tensor
is copositive if
for all
. Notice that for any
, we can write
for some
, where ∘ indicates the component wise (Hadamard) product, giving
Thus, the copositivity condition translates to
for all
, a sufficient condition for which is that (
16) can be written as a sum of squares (SOS). Let us illustrate this with an example:
Example 1. Consider a tensor such that . Then for , we have Clearly, is not positive semidefinite since . Let us consider where with , then we have, Thus, is copositive.
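The same mechanism can be seen on a separate toy instance of order two (a hypothetical 2×2 matrix chosen for illustration, not the tensor of Example 1). The form takes a negative value at a point with mixed signs, so the matrix is not positive semidefinite, yet after the substitution x = y∘y the form becomes a single square and is therefore non-negative.

```python
def quadratic_form(matrix, x):
    """Evaluate x^T A x for a square matrix given as a list of rows."""
    n = len(x)
    return sum(matrix[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


# Hypothetical example of order d = 2: the form is 2 * x1 * x2.
A = [[0, 1],
     [1, 0]]

# Not positive semidefinite: the form is negative at x = (1, -1).
print(quadratic_form(A, [1, -1]))

# With x = y∘y the form equals 2 * y1^2 * y2^2 = (sqrt(2) * y1 * y2)^2,
# a single square, hence non-negative for every real y.
for y in [(1, -1), (-2, 3), (3, -4)]:
    x = [y[0] ** 2, y[1] ** 2]
    assert quadratic_form(A, x) == 2 * (y[0] * y[1]) ** 2
```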
In Example 1,
can be written as a sum of squares. However, this is not the case in general (see Example 2). Therefore, to develop higher-order sufficient conditions, the following polynomial, introduced by Parrilo [
17], is most often used.
Clearly,
is a polynomial of degree
. Based on (
18), one can define two cone approximations for the copositive cone (as we will see),
For the tensor given in Example 1, we have . It is clear from that has a negative coefficient implying . However, one can show that .
Inclusions
and
for
are evident from the following,
The cone hierarchies
and
approximate the copositive cone from the inside. For instance, if
then
, being a sum of squares, is non-negative, i.e.,
for all
, which in turn implies that
for all
, hence
. Similarly, one can show that if
for some
then
is copositive. Thus, we have
Pólya’s theorem (see, e.g., [
25], Theorem 2.1) states that for a tensor
there exists a large enough
r such that
. This further implies that, for some
the strictly copositive tensor
allows
to have a sum of squares decomposition, that is,
. Therefore, the infinite union of these cones contains the interior of the copositive cone, i.e.,
For the tensor as given in Example 1 it holds since . Thus, neither nor holds.
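The role of the level r in the hierarchy based on non-negative coefficients can be explored with exact integer arithmetic in the matrix case. The sketch below assumes, following the matrix setting of [19], that membership at level r amounts to all coefficients of (x_1 + ... + x_n)^r · x^T A x being non-negative; this is equivalent to the squared-variable formulation under the substitution x = y∘y. The matrix and all function names are hypothetical.

```python
def multiply(p, q):
    """Multiply two polynomials stored as {exponent tuple: coefficient} dicts."""
    result = {}
    for alpha, a in p.items():
        for beta, b in q.items():
            gamma = tuple(i + j for i, j in zip(alpha, beta))
            result[gamma] = result.get(gamma, 0) + a * b
    return result


def quadratic_form_poly(matrix):
    """Return x^T A x as a polynomial dict."""
    n = len(matrix)
    poly = {}
    for i in range(n):
        for j in range(n):
            alpha = [0] * n
            alpha[i] += 1
            alpha[j] += 1
            key = tuple(alpha)
            poly[key] = poly.get(key, 0) + matrix[i][j]
    return poly


def smallest_level(matrix, max_r=50):
    """Smallest r <= max_r such that (x_1+...+x_n)^r * x^T A x has no negative coefficient."""
    n = len(matrix)
    linear = {tuple(1 if k == i else 0 for k in range(n)): 1 for i in range(n)}
    poly = quadratic_form_poly(matrix)
    for r in range(max_r + 1):
        if all(c >= 0 for c in poly.values()):
            return r
        poly = multiply(poly, linear)
    return None  # no level up to max_r certifies copositivity


# Strictly copositive (indeed positive definite) matrix with a negative entry.
A = [[1, -1],
     [-1, 2]]
print(smallest_level(A))  # 4

# A matrix that is not copositive is never certified.
print(smallest_level([[1, -2], [-2, 1]], max_r=10))  # None
```

For this matrix the coefficient vectors at levels 0 to 4 are (1, -2, 2), (1, -1, 0, 2), (1, 0, -1, 2, 2), (1, 1, -1, 1, 4, 2) and (1, 2, 0, 0, 5, 6, 2), so the first level without a negative coefficient is r = 4.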
4.1. The Case
The case is interesting and requires further exploration. It is clear that , which requires no further exploration. For , the tensor is often characterized in terms of the decomposition , where and . However, for only one direction is possible, as shown below.
Theorem 2. Let and if then , where and i.e., .
Proof. The proof for the matrix case can be easily generalized to higher-order tensors (see, e.g., [
19], Theorem 2.1). □
The converse of the above theorem is not true in general. The following is a counterexample.
Example 2. Let be such that The associated polynomial is given by (cf. (3)) The polynomial given in (26) is the well-known Robinson polynomial [33]. It is well known that for all . Therefore, trivially can be written as the sum of a positive semidefinite and a non-negative (actually zero) tensor. It is also well known that the Robinson polynomial cannot be written as an SOS (see [33] for a proof), i.e., .
To find special cases in which the converse of Theorem 2 holds, note that from
, we have
Clearly,
is SOS since
is non-negative. Thus, if
can be written as an SOS, the converse of Theorem 2 holds. It is well known that a matrix (i.e.,
) is positive semidefinite if and only if its associated polynomial can be written as a sum of squares. Therefore, the converse of Theorem 2 holds for
(see [
19] (Theorem 2.1) for a proof). Moreover, for
and
a tensor is positive semidefinite if and only if its associated form is sum-of-squares, see, e.g., [
33]. Therefore, the converse of Theorem 2 holds in this case also. We close this discussion by showing that the converse of Theorem 2 holds for the
Z-tensors.
Theorem 3. Let be a Z-tensor and , where and then .
Proof. Note that since
is a
Z-tensor and
is non-negative therefore
is also a
Z-tensor. It is well known that a
Z-tensor is positive semidefinite if and only if it is a sum of squares (see, e.g., [
26] (Proposition 2.1), [
34] (Theorem 11)). Hence (
27) can be written as a sum of squares. □
4.2. Characterization of
In this subsection, we formulate a characterization for
. Before presenting this characterization, we provide bounds on the value of coefficients for the polynomial
. For this consider,
Note that
can be written as
where
is a unit vector and
. Using this notation (
28) can be written as,
Denote by
, then
, and with these notations we can write (
29) as follows,
where
for all
.
Note that the maximum value of
is
(Observe that
. Take
then maximum value occurs when
and
for all
). Moreover, the minimum value of
occurs when
for all
and the minimum value is 1. Thus, we have
To show that the upper bound in (
31) is sharp take
and
then for
, we have
, thus the denominator reduces to 1 and the upper bound is exactly
. In the following theorem, we describe a characterization of
.
Theorem 4 (cf. [
19] for
).
The tensor if and only if there exists a PSD matrix associated with the polynomial (18) such that
Proof. The proof is an easy generalization of the matrix case (Theorem 2.2, [
19]), which is presented here for the sake of completeness. Let us consider the polynomial
described in (
30) and using its matrix formulation as follows
Taking
implies that its associated polynomial
allows a sum of squares decomposition, which further implies that the matrix
is PSD. For the converse, we assume that the matrix
in (
34) is PSD. So the following decomposition is evident
Combining (
34) and (
35) gives,
By comparing (
30) and (
34), we obtain the following
Comparing the coefficient of
in (
38) gives,
However, for
the coefficients of
on R.H.S. of (
38) are all zero, thus we have
Hence the proof is complete. □
From Theorem 4, it is clear that the matrix
is sparse. To deal with the sparsity, Ahmadi and Majumdar [
35] introduced polyhedral approximations using the observation that a symmetric diagonally dominant matrix with non-negative diagonal entries is positive semidefinite.
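The diagonal dominance test behind such polyhedral approximations only involves linear inequalities in the matrix entries, which is what makes it cheap. The following sketch checks the condition that each diagonal entry is at least the sum of the absolute values of the other entries in its row; for a symmetric matrix this is sufficient, but not necessary, for positive semidefiniteness. The function name and the example matrices are illustrative and not taken from [35].

```python
def is_diagonally_dominant(matrix):
    """True if every diagonal entry is >= the sum of absolute off-diagonal entries in its row."""
    for i, row in enumerate(matrix):
        off_diagonal = sum(abs(v) for j, v in enumerate(row) if j != i)
        if row[i] < off_diagonal:
            return False
    return True


# Symmetric and diagonally dominant, hence positive semidefinite.
M = [[2, -1, 1],
     [-1, 3, -2],
     [1, -2, 3]]
print(is_diagonally_dominant(M))

# Positive definite (trace 6, determinant 1) but not diagonally dominant,
# which shows that the test is only a sufficient condition.
N = [[1, 2],
     [2, 5]]
print(is_diagonally_dominant(N))
```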
4.3. Polyhedral Characterization of
In this section, we present a compact representation for the cone
. The representation is useful in deducing polyhedral characterization of
. For this, recall that for any
(
30) can be re-written as follows
Note that the first equality follows by using the definition of polynomial coefficient (see (
4)). Recognize that the product
is linked with the falling factorials. The falling factorials can be represented using the Stirling numbers of the first kind as follows
where
is the well-known Stirling number of the first kind (see, e.g., [
36], Chapter 6.1). Using (
41) in (
40) gives
For
and
the inequality
is defined to hold element wise i.e., the inequality is true if
for all
. Furthermore,
. Observe that, if
then
and also
if
. The observation leads to the following simplification for each
and
for all
Combining (
42) and (
43) leads to
We define the tensors
and
of order
and dimension
n as follows,
Remark 1. Interestingly for each we have Consequently, the above notations lead to the following simplified representation of (
44)
Finally, we define the notation
. The notation leads to define the tensor
of order
d and dimension
n, whose entries are either forms of degree
t or zero that is,
Note that
and
. Moreover, from (
49) one can easily assert that
.
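The expansion of falling factorials through Stirling numbers of the first kind, used earlier in this section, can be verified numerically. The sketch assumes the signed convention, in which x(x - 1)···(x - k + 1) equals the sum over j of s(k, j) x^j, and builds the numbers from the recurrence s(k + 1, j) = s(k, j - 1) - k·s(k, j). The function names are hypothetical.

```python
from math import prod


def stirling_first_kind(k_max):
    """Table s[k][j] of signed Stirling numbers of the first kind, 0 <= j <= k <= k_max."""
    s = [[0] * (k_max + 1) for _ in range(k_max + 1)]
    s[0][0] = 1
    for k in range(k_max):
        for j in range(k + 2):
            left = s[k][j - 1] if j >= 1 else 0
            s[k + 1][j] = left - k * s[k][j]
    return s


def falling_factorial(x, k):
    """Compute x * (x - 1) * ... * (x - k + 1); the empty product for k = 0 is 1."""
    return prod(x - i for i in range(k))


s = stirling_first_kind(5)
print(s[3])  # coefficients of x(x-1)(x-2) = 2x - 3x^2 + x^3

# Check the expansion for several integer arguments and all k up to 5.
ok = all(
    falling_factorial(x, k) == sum(s[k][j] * x ** j for j in range(k + 1))
    for x in range(-3, 8)
    for k in range(6)
)
print(ok)
```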
Remark 2. Recall that implying θ can be written as a linear combination of unit vectors as follows , . Since, the maximum number of non-zero elements of β will be . Similarly, for , we can have at most non-zero coefficients in the linear combination of the basis. Based on this observation we can obtain an explicit representation of the tensor for all . Thus, for each the entries of tensor are described as Thus, from (
45) and (
48) one obtains the following notationally convenient formulation of (
44),
Remark 3. As a sanity check, we consider a special case, the tensor of all ones , that is for all . Note that and from (51) we have
The above representation leads to a polyhedral representation of the cone hierarchy
and is presented in the following theorem (cf. [
19], Theorem 2.4).
Theorem 5. For all and the polyhedral representation of cones is given as
where for all .
Proof. The proof follows immediately from (
19) and (
51). □
5. Approximating Polynomial Optimization over the Simplex
In this section, we consider the homogeneous polynomial optimization over the simplex
where
. It is well-known (see e.g., [
18,
19,
25]) that (
53) can be equivalently reformulated as a copositive program over the cone of completely positive tensors. The reformulation is as follows,
The dual formulation of (
54) is also a conic program over the cone of copositive tensors, which is given below.
We consider a special case where
. Obviously, the feasible set
is precisely the simplex
. Thus, the minimum (maximum) value of (
53) in this special case, that is optimization of the homogeneous polynomial over the simplex
is
As mentioned before, testing whether a tensor is copositive is co-NP-hard. To find an approximate solution, we replace the cone
(for the special case
) in (
55) by its approximation
where
,
We are interested in bounding the difference between the approximate solution (to the dual program)
and the exact solution
. For this, we use rational gridding of the simplex
i.e., for a non-negative integer
we have,
The rational grid
is a discretization of
, which leads to a natural approximation of (
56), i.e.,
Note that
approximates the dual while
approximates the primal as given in (
56). It is interesting to investigate the connection between these two approximations. The connection is given below (cf. [
19] (Theorem 3.1), for
and [
25] (Theorem 3.1) for
),
Theorem 6. Let be a rational discretization of simplex as given in (58) for any , then for we have
where .
Proof. Let
be a feasible point of the program given in (
57) then for
it follows, from (
51) and the definition of
, that,
The above implies that the maximum value of
in (
57) is attained at the minimum value of
. Thus, (
57) can be equivalently written as follows,
A mere change of variable
where
and
in (
65) yields the required expression, i.e.,
□
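The rational grid appearing in Theorem 6 is easy to enumerate, which gives a concrete feel for the grid-based approximation. The sketch below assumes the grid consists of all points of the standard simplex whose entries are integer multiples of 1/m, where m is a hypothetical name for the grid denominator, and it minimizes a small quadratic form exactly using rational arithmetic. Because the grid is a subset of the simplex, the grid minimum is an upper bound on the minimum over the simplex.

```python
from fractions import Fraction
from itertools import combinations_with_replacement
from math import comb


def rational_grid(n, m):
    """All points x of the standard simplex in R^n such that m * x is an integer vector."""
    points = []
    for combo in combinations_with_replacement(range(n), m):
        counts = [0] * n
        for i in combo:
            counts[i] += 1
        points.append(tuple(Fraction(c, m) for c in counts))
    return points


def grid_minimum(poly, n, m):
    """Minimize a polynomial (a callable on a point) over the rational grid."""
    return min(poly(x) for x in rational_grid(n, m))


def p(x):
    """Hypothetical test form x1^2 - 2*x1*x2 + 2*x2^2."""
    return x[0] ** 2 - 2 * x[0] * x[1] + 2 * x[1] ** 2


# On the simplex, p reduces to 5t^2 - 6t + 2 with t = x1, whose minimum 1/5 is at t = 3/5.
print(grid_minimum(p, 2, 2))  # 1/4, attained at (1/2, 1/2)
print(grid_minimum(p, 2, 5))  # 1/5, the grid contains the exact minimizer (3/5, 2/5)
print(len(rational_grid(3, 4)) == comb(3 + 4 - 1, 4))
```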
Note that, for any
we have the relation
. However, in (
66) a correction term
is deducted from the actual objective
for obtaining a closer approximation to
. Clearly, for increasing
the value
surpasses the value
invariably. However, one has to compensate with the factor
. It would be interesting to find bounds on the difference between the two approximations, namely
and
. To compute the bound we define some notations. First recall that
denotes the Stirling number of the first kind. In addition, for
we define a function
as follows,
If there is no dependence on the variable
we simply write
that is,
One can define analogously.
Theorem 7. Let be a rational discretization of simplex as given in (58) for any , then for we have
Proof. From the expression given in Equation (
66),
Notice that
then for all
, we have
. From this observation, we have that
. This observation leads to the following,
The lower bound can be derived in a similar manner. □
6. Conclusions
This paper focused on describing the copositive tensor cone and its approximations. We have shown that a
Z-tensor of even order is copositive if and only if it is positive semidefinite. The result has already appeared in [
32] (Theorem 3.5(e) and Theorem 3.12). However, the proof given by Zhang et al. relies heavily on the notion of
M-tensors and convex analysis. The proof we have provided is simpler and self-contained. We have discussed some approximation hierarchies for the copositive cone, focusing on providing a compact representation of these hierarchies. For the Parrilo cone,
the proof techniques are a straightforward generalization to the higher-dimensional case. For the cone,
a more rigorous approach is used to derive the representation. Most of the notions used are new and have not appeared in the literature. We have illustrated this by applying our compact representation to polynomial optimization over the simplex. We have compared the approximation obtained by our representation
with the approximation based on rational gridding, and we have proved bounds between the two approximations. Moreover, the characterization helped to simplify the proofs and results related to approximating polynomial optimization over the simplex. It would also be interesting to investigate the convergence rate of these approximations.
In future work, we aim to utilize these hierarchies to provide approximation results related to copositive optimization, especially to recover approximation results for polynomial optimization over the simplex as obtained by De Klerk and co-authors [
21,
22,
23,
24]. Furthermore, our aim is to use these approximation hierarchies to develop numerical algorithms for application domains such as approximating clique numbers for uniform hypergraphs (see, e.g., [
37,
38]).