The Non-Tightness of a Convex Relaxation to Rotation Recovery

We study the Perspective-n-Point (PNP) problem, which is fundamental in 3D vision, for the recovery of camera translation and rotation. A common solution applies polynomial sum-of-squares (SOS) relaxation techniques via semidefinite programming. Our main result is that the polynomials which should be optimized can be non-negative but not SOS, hence the resulting convex relaxation is not tight; specifically, we present an example of real-life configurations for which the convex relaxation in the Lasserre Hierarchy fails, in both the second and third levels. In addition to the theoretical contribution, the conclusion for practitioners is that this commonly-used approach can fail; our experiments suggest that using higher levels of the Lasserre Hierarchy reduces the probability of failure. The methods we use are mostly drawn from the area of polynomial optimization and convex relaxation; we also use some results from real algebraic geometry, as well as Matlab optimization packages for PNP.


Introduction
The Perspective-n-Point (PNP) problem is a cornerstone of 3D computer vision [1][2][3][4][5]. We deal with its most fundamental variant: given 2D projections of 3D points whose realworld coordinates p i = (x i , y i , z i ) n i=1 are known, one seeks to determine the camera position and angle, i.e., a translation vector T and rotation matrix R, relative to the world coordinate frame, which provide an optimal fit between the 3D points and their projections on the camera image plane. We prove here a negative result in terms of complexity: specifically, that a commonly-used approach for convex relaxation of the problem fails for an open subset in the configuration space. A concrete example of such a configuration is provided. On the positive side, we note that, as the levels of the convex relaxation increase, the probability of failure decreases.

The PNP Problem: Details
Denoting by {u i } the unit vectors in the direction of the projections of {p i }, we obtain the very well-known expression to minimize, as a function of T and R: Geometrically (see Figure 1), this means rotating and translating the point set {p i }, so that the sum of squared norms of their respective projections on the lines spanned by the {u i } is maximized (i.e., the points are optimally aligned with the respective "lines of sight"). Due to the great importance of this problem in real-life applications, it was addressed in numerous papers. The solution obtained by minimizing Equation (1) is global and does not rely on approximations and iterative schemes. The minimization proceeds by first solving for T by differentiation, and then substituting back into Equation (1). If the rotation matrix R is "flattened" and viewed as a vector of length 9, we obtain the following minimization problem over rotation matrices (details are described in [2], see also a general survey on optimization of quadratic functions over rotation groups in [6]): where: R q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q (3) Plugging this representation into Equation (2) yields a positive semidefinite quartic homogenous polynomial in q1, q2, q3, q4, which should be minimized under the additional constraint + + + = 1. Since this polynomial depends on the input, {pi} and {ui}, it will be denoted by P { , } ( ), where ≜ ( , , , ). We shall say that a polynomial is realizable in PNP iff it is equal to P { , } ( ) for some {pi} and {ui}. We add the common requirements that, for every i, the z-coordinate of {pi} is non-negative, and 〈pi,ui〉 ≥ 0; these requirements correspond to the physical scenario of a forward-looking camera, i.e the field of view is limited to one side of the image plane.
The dependence of the 35 coefficients of P { , } ( ) on {pi} and {ui} is very complicated, due to the operations of matrix inversion and multiplication in Equation (2). While for given values of {pi} and {ui} it is straightforward to compute P { , } ( ), the question of characterizing the realizable polynomials is quite subtle, and we are not aware of a general solution to it.
Summarizing, the PNP problem becomes one of minimizing a homogenous quartic polynomial over = { ∈ ℜ | || || = 1}. However, there is no closed-form solution to The minimization proceeds by first solving for T by differentiation, and then substituting back into Equation (1). If the rotation matrix R is "flattened" and viewed as a vector of length 9, we obtain the following minimization problem over rotation matrices (details are described in [2], see also a general survey on optimization of quadratic functions over rotation groups in [6]): where: Note that since Q 2 i = Q i and Q t i = Q i , P is positive semidefinite. To minimize this quadratic polynomial over the space of rotation matrices, it is customary to represent R's elements as quadratics in the four components of a unit quaternion [7]: Plugging this representation into Equation (2) yields a positive semidefinite quartic homogenous polynomial in q 1 , q 2 , q 3 , q 4 , which should be minimized under the additional constraint q 2 1 + q 2 2 + q 2 3 + q 2 4 = 1. Since this polynomial depends on the input, {p i } and {u i }, it will be denoted by P {p i ,u i } (q), where q (q 1 , q 2 , q 3 , q 4 ). We shall say that a polynomial is realizable in PNP iff it is equal to P {p i ,u i } (q) for some {p i } and {u i }. We add the common requirements that, for every i, the z-coordinate of {p i } is non-negative, and p i ,u i ≥ 0; these requirements correspond to the physical scenario of a forward-looking camera, i.e the field of view is limited to one side of the image plane.
The dependence of the 35 coefficients of P {p i ,u i } (q) on {p i } and {u i } is very complicated, due to the operations of matrix inversion and multiplication in Equation (2). While for given values of {p i } and {u i } it is straightforward to compute P {p i ,u i } (q), the question of characterizing the realizable polynomials is quite subtle, and we are not aware of a general solution to it.
Summarizing, the PNP problem becomes one of minimizing a homogenous quartic polynomial over S 3 = q ∈ 4 | ||q| = 1 . However, there is no closed-form solution to this problem, as it requires finding the minimal value of a fourth-degree homogeneous quartic in four variables, which are constrained to lie on S 3 (i.e., the sum of their squares must be equal to 1). While the global minimum is known to exist, since the target function is continuous and S 3 is a compact set, recovering the minimum is a notoriously difficult problem [8,9]. In principle it can be solved using Lagrange multipliers, which reduce the optimization to the solution of a polynomial system, but this solution is lengthy, applies numerical techniques, and often multiple solutions are present, which need to be pruned (in addition to removing complex solutions); please see Appendix A for the details.
These difficulties have led practitioners [2,4] to apply a very well-developed and popular method for polynomial optimization, known as sum of squares (SOS) relaxation [8][9][10], which we now briefly describe.

Previous Work and Methods
Clearly, the problem argmin is equivalent to the problem That is, the minimal value p(q) attains on S 3 is equal to the maximal value of γ such that the polynomial p(q) − γ is non-negative on S 3 . Thus, the task of finding γ can be expressed as the problem of determining whether a polynomial is non-negative on S 3 . If there was a mechanism that allows to quickly determine whether a polynomial has this property, it could have been used (possibly combined with a binary search on γ) to find the minimum.
Alas, determining whether a polynomial is non-negative (even on a compact set) is notoriously difficult [8][9][10]. Therefore, a great deal of work concentrated on a sub-family of positive polynomials-which are defined by the property that they can be written as a sum of squares (SOS) of other polynomials (hence are clearly non-negative). The class of SOS polynomials has an important advantage: the question of whether a homogeneous polynomial of degree d in n variables is an SOS can be cast and solved as a semidefinite programing (SDP) problem, on matrices of size O n d [8,10].
The question whether every non-negative polynomial is SOS was first studied by Hilbert [11], who was able to prove the existence (but not an explicit construction) of non-negative, non-SOS polynomials. Much later, some simple and beautiful examples of such polynomials were found, one of them which we will use in this paper.
The set of non-negative, non-SOS quartic polynomials in four variables is denoted ∆ 4,4 , and it is known to be open (in the natural topology in coefficient space). This means that if a polynomial is in ∆ 4,4 , then small enough perturbations of its coefficient will define polynomials that are also in ∆ 4,4 .
To apply SOS polynomials to the minimization of polynomials over a sphere (similar results hold for more general algebraic varieties, but are more complicated and will not be described here), the following result is often applied: 13]). Assume that a polynomial p(q) is positive on S 3 . Then there exist polynomials φ(q) and s(q) such that p(q) = q 2 − 1 φ(q) + s(q), where s(q) is an SOS polynomial.
Note that if such polynomials exist, then, since on S 3 , q 2 − 1 = 0, it holds that on S 3 , p(q) = s(q), hence p(q) is non-negative. This is often referred to as a certificate of positivity for p(q) on S 3 . However, the theorem does not guarantee a bound on the degrees of the "auxiliary" polynomials φ(q) and s(q). For example, if the degree of p(q) is 4, then it may well be that the degree of φ(q) is 4 and the degree of s(q) is 6 (i.e., it is the sum of squares of cubic polynomials). In that case, the terms of degrees 5 and 6 will cancel out in Since the degrees of φ(q) and s(q) are not known in advance, optimization is carried out in the framework of the Lasserre hierarchy [10] for successively approximating the minimum. In our case, the levels of the hierarchy are defined as follows: setting deg(s) = 2t (clearly, the degree of an SOS polynomial is even), defines the t-th level. As the levels increase, the minimum approaches the correct one, alas the computational complexity rapidly increases with t, since the size of the matrices to optimize over in the corresponding semidefinite programming problem is of order 4 t × 4 t . The quest for a fast solution led researchers to implement the case t = 2 (i.e., the second level), which runs in reasonable time, and noting that, empirically, it appears to work well [2,4]. Still, the question remains: can this approach run into cases in which the second level fails to find the minimum (i.e., it is not tight)? We answer this question in the affirmative. To the best of our knowledge, we provide the first concrete example of a PNP configuration leading to a polynomial which fails the second Lasserre level. We then provide an example which also fails the third level.

Other Work on Convex Relaxation for Rotation Recovery
Recently, the question whether convex relaxation for rotation recovery (and other problems in computer vision) is tight, was addressed. In [14], the following question is studied: given an initial value at which the relaxation is tight, is there a neighborhood of this initial value at which tightness holds? (here, we empirically study the opposite scenario-i.e., how stable under additive noise is the "bad" property of the relaxation not being tight. In [15], an improved Lagrangian dual relaxation is provided, and it is noted that empirically, its performance is excellent, i.e the relaxation is tight. The work which we found to be closest to ours in spirit is [6], in which the question as to when the convex relaxation is tight is directly addressed, both for recovery of a single and multiple rotations. The authors note that, empirically (while not theoretically!), the convex relaxation for the recovery of a single rotation is tight, as opposed to the case of two rotations. Then, a general study of the tightness of convex relaxations is undertaken, relying in part on deep results from algebraic geometry, notably a study of minimal varieties [16].
Interestingly, the results in [6,15] suggest that the example we provide here, for the failure of convex relaxation for single rotation recovery, is rare; alas, it is not singular (i.e, the set of points-lines configurations in which this failure occurs are of full dimension, which immediately follows since ∆ 4,4 is an open set). An interesting question, which we hope to pursue in the future, is to estimate "how rare" it is, in terms of probability of failure. This question is related to estimating how many positive polynomials are not SOS [17]; while asymptotic results exist, we are not aware of research on this question for small numbers of variables and low degrees.

Results
We now present the results, which consist both of theoretical analysis and experiments which support them.
Proof. Since there are points on S 3 for which b(q 1 , q 2 , q 3 , q 4 ) = 0, the minimum of b c (q 1 , q 2 , q 3 , q 4 ) on S 3 is c. But the following identity, which is required for obtaining the solution in the second Lasserre level, cannot hold for a quartic SOS s(q): To see this, note that when Equation (4) is restricted to S 3 , it becomes b(q) = s(q) = ∑ i p 2 i (q), where p i (q) are quadratics in q (since s(q) is a quartic SOS). Denote p i (q) = p 2000 q 2 1 + . . . + p 0000 (i is suppressed since the following holds for every p i (q)). Obviously, if q 0 ∈ S 3 and b(q 0 ) = 0, then p i (q 0 ) = 0. There are 14 points on S 3 at which b(q 0 ) = 0: (±1, 0, 0, 0), (0, ±1, 0, 0), (0, 0, ±1, 0), as well as all combinations of the form 1 2 (±1, ±1, ±1, ±1) in which an even number of the coordinates are positive. Substituting these 14 points in p 2000 q 2 1 + . . . + p 0000 , equating to 0 and solving, yields that every p i (q) must be of the form a i q 2 − 1 + r i (q), where r i (q) is a quadratic form (i.e has no linear and constant terms). Therefore, it must hold that , but now both sides of the identity are homogenous quartics. Since they are equal on S 3 , they must be equal everywhere, but b(q) is not an SOS, a contradiction.

Note:
The proof depends on the fact that b(q) is not an SOS polynomial. To see this, observe that if s is a quartic SOS, then the following identity holds: In order to prove that an instance of the PNP problem can also fail in the second Lasserre level, we need: Proof. The proof is constructive, and consists of an example of {p i }, {u i }, which yield b c (q 1 , q 2 , q 3 , q 4 ). To find such {p i }, {u i }, the following paradigm was adopted: (1) Fix a certain value of c. This yields a polynomial b c (q 1 , q 2 , q 3 , q 4 ) with fixed scalar coefficients. (2) Define a distance between two polynomials f , g as the sum of squares of differences of their coefficients. In our case, f , g are homogenous quartics, hence there are 35 terms in this sum (the number of coefficients).
We note that the optimization problem 4 above is rather difficult, as the relation between {p i }, {u i } and the coefficients of P {p i ,u i } (q) is very complicated. Even when setting the vectors {u i } to scalar values, each of the 35 coefficients is a quadratic polynomial in the 3n variables {(x i , y i , z i )} n i=1 , i.e., its size is 3n 2 . After summing the squares of all these expressions with the respective coefficients of b c (q 1 , q 2 , q 3 , q 4 ) subtracted, the resulting expression to be minimized is a quartic in the 3n variables {(x i , y i , z i )} n i=1 , hence its size is O n 4 . The most general expression-with non-scalar values for the vectors {u i }has the same structure, but with each of the O n 4 coefficients being a very complicated rational function of the components of all the u i s, resulting from the matrix inversion and multiplication operations in Equation (2). Therefore, in order to obtain a feasible minimization problem, we have applied the following methodology: Empirically, it turned out that a solution exists only for values of c larger than a certain threshold. Since our goal in this short submission is only to demonstrate that real PNP scenarios can fail the second Lasserre level, we did not attempt to find the smallest values of c and n; however, these are interesting problems which merit further study. In order to verify the correctness of the optimization, the resulting values of {p i }, {u i } have been plugged into two packages for solving the PNP problem using SOS relaxation, and have indeed caused them to fail (see "Experimental Results").

Some Comments on the Result
• The long running time required to produce the configuration which yields b c (q 1 , q 2 , q 3 , q 4 ) is not an issue, as our goal was to show that such a configuration exists.

•
The configuration is stable, i.e small perturbations in the {p i }, {u i } still yield a polynomial which fails the relaxation. This is due to the fact that the set ∆ 4,4 is open [18]. That is, the "problematic" configurations are not restricted to a low-dimensional manifold.

•
Note that a rotation of {p i } and {u i } would have yielded a different polynomial, which would have also failed the relaxation; further, it would have been different from b c (q 1 , q 2 , q 3 , q 4 ). Thus, the problem the configuration poses cannot be solved by checking whether the resulting polynomial is equal to b c (q 1 , q 2 , q 3 , q 4 ).

•
While, as proved in the following section, the third Lasserre level can solve the PNP configuration which yield b c (q 1 , q 2 , q 3 , q 4 ), the solution did not converge due to numerical instabilities; a solution was obtained only at the fourth level. The numerical instability of solving optimization problems involving polynomials which are positive but not SOS was noticed by other researchers [9]. From the practical point of view, this indicates that there are PNP configurations at which higher Lasserre levels are required for the solution, than those indicated by Lemmas 3 and 5.

•
One may wonder why the expression c q 4 is added to b(q 1 , q 2 , q 3 , q 4 ). The reason is that we could not realize b(q 1 , q 2 , q 3 , q 4 ) in PNP, since b(q 1 , q 2 , q 3 , q 4 ) assumes the value of 0 at 14 points on S 3 . This means that to realize it in PNP, the matrix P in Equation (2) would have had to be highly singular.

PNP and the Third Lasserre Level
It turns out that a PNP configuration which yields b c (q 1 , q 2 , q 3 , q 4 ) can theoretically be solved in the next Lasserre level-the third one. To prove this, we must produce a quartic polynomial φ(q) and a sextic SOS polynomial s(q) , it remains to prove that q 2 b(q) is an SOS. This can be proved by directly verifying that: (where in order to reduce equation clutter, we denote q = (x, y, z, w)). It turns out that for our problem, the properties of being solvable in the third Lasserre level and of p(q) q 2 being an SOS are equivalent. This is summarized in the following: Lemma 4. Assume that p(q) is a homogenous quartic which is non-negative on S 3 and, in addition, its minimum on S 3 is equal to 0. Assume further that the minimum of p(q) on S 3 can be obtained in the third Lasserre level. Then the polynomial p(q) q 2 must be an SOS.
Proof. Assume that the minimum can be obtained in the third Lasserre level. Then, there is a quatric φ(q) and a sextic SOS s(q), such that the following holds: Next, we write down Equation (5), but while separating φ(q) and s(q) to homogeneous forms (in order to reduce equation clutter, q is suppressed wherever possible, as all polynomials are in q): That is, φ l contains only terms of total degree l, and C i , Q i , L i , A i stand for cubic, quadratic, linear, and constant terms in the polynomials whose squares compromise s(q).
Since two polynomials are equal iff all their respective forms are equal, Equation (6) yields the following identities for the even-degree forms: Extracting φ 4 , φ 2 , φ 0 , substituting in the equation for degree 4 above, and multiplying by q 2 , yields: , which is a sum of squares.
As a consequence of the above, the question of whether the PNP problem can also fail to have a solution in the third Lasserre level is equivalent to the question whether there exists a polynomial p(q) such that the following hold: • p(q) is non-negative on S 3 , and its minimum on S 3 is 0. • p(q) + c q 4 is realizable in PNP for some c > 0. • p(q) q 2 is not an SOS polynomial.

Lemma 5.
The answer to the above is affirmative.
Proof. Define p(q) = b q 1 , q 2 , q 3 , q 4 5 . As for b(q) + c q 4 , direct optimization proves that p(q) + c q 4 is realizable in PNP for c = 2. Also, p(q) q 2 is not an SOS, as can be verified directly by applying dedicated software.
As noted in Section 2, the complexity of solving the SOS problem quickly increases with the Lasserre level, as its solution requires processing matrices of size O 4 t . On the average, running time (on an Intel(R) Core TM i5-1135G7 CPU @ 2.40 GHz) were 0.07, 0.196, 0.98, 5.275 and 27.02 s, respectively for levels 2, 3, 4, 5 and 6.

Experimental Results
The experiments in this paper consisted of obtaining PNP configurations which fail the second and third Lasserre levels, as explained in Lemma 3; such a configuration is depicted in Figure 2. In this figure, the 14 points which were recovered in the optimization described in Section 3.1 are plotted in 3D space, with the corresponding lines of sight for each of them, in the same color. Thus, Figure 2 depicts a setting of the PNP problem which fails the second stage in the Lasserre hierarchy. In order to verify the result, this configuration was run on two existing PNP packages. In addition, the effect of adding noise to the configuration was examined; see Figure 3. This addition of noise was performed in order to demonstrate that the configurations which fail the second stage are "stable", i.e., do not lie on a low-dimensional manifold in the input space; intuitively speaking, if a configuration fails the second level, so will configurations which are close to it. The noise, which was uniform in the interval [−ε, ε], was added to each of the coordinates of the points p i in the configuration. As the power of the noise increased, the probability of finding a solution in the second level increased, as well as in the third level; since (by definition) the third level solutions contain all the solutions of the second level, the probability for obtaining a solution in the third level is always higher. A more elaborate analysis of the effect of adding noise to the solution is provided in Appendix B.  , ) is not an SOS for > √6 [19][20][21].
As noted in Section 2, the complexity of solving the SOS problem quickly increases with the Lasserre level, as its solution requires processing matrices of size (4 ). On the average, running time (on an Intel(R) Core TM i5-1135G7 CPU @ 2.40 GHz) were 0.07, 0.196, 0.98, 5.275 and 27.02 s, respectively for levels 2, 3, 4, 5 and 6.

Experimental Results
The experiments in this paper consisted of obtaining PNP configurations which fail the second and third Lasserre levels, as explained in Lemma 3; such a configuration is depicted in Figure 2. In this figure, the 14 points which were recovered in the optimization described in Section 3.1 are plotted in 3D space, with the corresponding lines of sight for each of them, in the same color. Thus, Figure 2 depicts a setting of the PNP problem which fails the second stage in the Lasserre hierarchy. In order to verify the result, this configuration was run on two existing PNP packages. In addition, the effect of adding noise to the configuration was examined; see Figure 3. This addition of noise was performed in order to demonstrate that the configurations which fail the second stage are "stable", i.e., do not lie on a low-dimensional manifold in the input space; intuitively speaking, if a configuration fails the second level, so will configurations which are close to it. The noise, which was uniform in the interval − , , was added to each of the coordinates of the points in the configuration. As the power of the noise increased, the probability of finding a solution in the second level increased, as well as in the third level; since (by definition) the third level solutions contain all the solutions of the second level, the probability for obtaining a solution in the third level is always higher. A more elaborate analysis of the effect of adding noise to the solution is provided in Appendix B.

Conclusions
We proved, both theoretically and experimentally, that there exist non-degenerate configurations which fail a commonly used convex relaxation to the PNP problem. The probability of failure decreases with the increase of the Lasserre Hierarchy level. We hope that this observation will contribute to further research in solving PNP, as well as other computer vision problems involving convex relaxations. Thus the theoretical implication of the work presented here is that the underlying optimization problem in PNP (and generally in rotation recovery) is difficult in some cases, and cannot always be solved by the first stages of the convex relaxations in the Lasserre hierarchy; however, if the solution in the second level fails, then a practical solution is to continue increasing the level, until a pre-defined threshold is reached. If no solution is obtained, one should resort to his/hers method of choice for constrained polynomial optimization-which are, indeed, more time-consuming than the convex relaxation method.

Data Availability Statement:
The dataset which is relevant (points/lines configuration) can be obtained by mailing the corresponding author.

Conflicts of Interest:
There are no conflicts of interest.

Conclusions
We proved, both theoretically and experimentally, that there exist non-degenerate configurations which fail a commonly used convex relaxation to the PNP problem. The probability of failure decreases with the increase of the Lasserre Hierarchy level. We hope that this observation will contribute to further research in solving PNP, as well as other computer vision problems involving convex relaxations. Thus the theoretical implication of the work presented here is that the underlying optimization problem in PNP (and generally in rotation recovery) is difficult in some cases, and cannot always be solved by the first stages of the convex relaxations in the Lasserre hierarchy; however, if the solution in the second level fails, then a practical solution is to continue increasing the level, until a pre-defined threshold is reached. If no solution is obtained, one should resort to his/hers method of choice for constrained polynomial optimization-which are, indeed, more time-consuming than the convex relaxation method.  where: and so this expression is always an SOS, which completes the proof.