A Novel Distribution for Representation of 6D Pose Uncertainty

The 6D Pose estimation is a crux in many applications, such as visual perception, autonomous navigation, and spacecraft motion. For robotic grasping, the cluttered and self-occlusion scenarios bring new challenges to the this field. Currently, society uses CNNs to solve this problem. The CNN models will suffer high uncertainty caused by the environmental factors and the object itself. These models usually maintain a Gaussian distribution, which is not suitable for the underlying manifold structure of the pose. Many works decouple rotation from the translation and quantify rotational uncertainty. Only a few works pay attention to the uncertainty of the 6D pose. This work proposes a distribution that can capture the uncertainty of the 6D pose parameterized by the dual quaternions, meanwhile, the proposed distribution takes the periodic nature of the underlying structure into account. The presented results include the normalization constant computation and parameter estimation techniques of the distribution. This work shows the benefits of the proposed distribution, which provides a more realistic explanation for the uncertainty in the 6D pose and eliminates the drawback inherited from the planar rigid motion.


Introduction
The Pose Estimation of rigid objects has a wide range of applications in today's life, such as robot grasping [1], aerospace applications, autonomous driving [2], and so on.
Throughout history, pose estimation has played a vital role, from Ceres's position predicting to navigation on the sea ( [3], Introduction 1.1). One day, Captain cook sailing in the sea has lost his bearings. The ship moves on a wavy and wind sea. With the reasonable assumption, He estimates the optimal localization of the ship from the noisy measurements ( Figure 1, top).
New challenges come to this field. The success of service robotics relies on wellperceiving an unfamiliar kitchen scenarios, such as bottles, dishes, and containers. Robot grasps under cluttered or limited lighting environments are becoming common in a rapidly growing warehouse automation industry [4]. Consequently, robots need to know the full 6D pose of objects to manipulate objects, which is difficult due to the uncertainty caused by environmental factors and sensing technologies. The intuition is to deploy expensive high-precision sensors to avoid such uncertainty. However, all sensors have limited precision, and measurements derived from actual sensors have associated uncertainty ( [3], Introduction 1.2).
The CNN [5] broke through in the ImageNet [6] challenge, motivating more and more works to embrace deep learning. However, most prior works in this area failed to take the weak information [7] and the 6D pose uncertainty into account, and they only provide a single best guess for each object pose [8][9][10]. The accurate metric of the model cannot handle unprepared situations, such as dim lighting. The uncertainty will rise with the increase in accuracy over time.
The pose estimation of symmetric objects is currently the most challenging task [11]. Existing multiple correct poses for the same visual appearance [12]. This issue also results in lousy training performance since a network receives inconsistent loss signals. Xiang et al. [9] design an MSE loss function based on quaternions and regress the 3D rotation using the neural network. It reports better performance for symmetric objects such as bowls. However, they do not take the underlying structure of the manifolds into consideration.
Deng et al. [13] detect the object's position by a bounding box using a neural network. They assume the Gaussian position and approximate the observation likelihood to obtain the conditioning orientation density by a Rao-Blackwellized particle filter. The result is considerably robust in rotational uncertainty. However, this approach is computationally expensive, and it is not clear which kind of distribution holds on the orientations. Furthermore, uncertain rotation axes need to be considered when underlying uncertainty is not axis aligned [14].
Nonetheless, the neural network correctly assumes the underlying structure of the data is robust. Prokudin [15] exhibits a good example dealing with periodic circular data. The same loss will emerge when the predicted angle is equal to 1°or 359°with the ground truth angle of 0°. He presents a loss based on the directional setting and assumes the angle obeys the von Mises distribution. As a result, the model works suitably for the data with the periodic nature.
The Gaussian assumption is commonly embedded, leading the neural network to fail to learn the underlying manifold structure of the 6D pose. Holding the robust assumption of uncertainty is the foremost for 6D pose estimation.
Inevitably, coupling or decoupling is the first consideration when parametrizing the 6D pose. It is also related to further modeling uncertainty. Daniilidis [16] employs dual quaternion to parameterize the 6D pose for hand-eye calibration and establishes a linear homogeneous system, simultaneously solving rotation and translation. Goddard and Addi [17] capture the correlation between rotation and translation for rigid motion tracking by dual quaternions.
Horn [18] proposes the method that decouples the orientation from translation. Inspired by Horn, Srivatsan [19] makes a linear assumption on the state of the SE(3) elements by adopting dual quaternions. Li [20] explains the correlation between rotation and translation terms based on hyperspherical parallel transport and gives a Bingham distribution on the orientation decoupled from the translation. Manhardt [12] illustrates the rotation uncertainty caused by occlusion in the 6D pose estimation by employing Bingham distribution. However, it is still unknown how the Bingham distribution extends to SE(3) space without variations.
The work's most Bingham distribution of interest is motivated by Glover. He presents the BPA [21] algorithm to recover the full poses from patches of local features of the 3D point cloud. This work extends from the SE(2) [22].

Main Contribution
In this work, we aim to model the uncertain 6D pose by extending the probability density function on the SE(2) to the SE(3). This is achieved through parameterizing the 6D pose by dual quaternions. The work simplifies a future generalization to the 6D pose estimation of objects (Figure 1, bottom). The proposed distribution appears as a marginal distribution for the rotational part of the proposed model. We further discuss the relationship between orientation and position. To the best of the author's knowledge, there is no work exactly claimed for this. This paper is organized as follows. We will compare several existing 6D parameterization techniques and select the most promising one in Section 2. In Section 3, we propose a new probability distribution on the 6D pose. The normalization constant and parameter estimation techniques will be discussed. Later, in Section 4, we use the proposed framework to represent the transformation of the 6D pose, and the algorithm that recovers the transformation from the samples of our proposed distribution will be presented. Lastly, the discussion and conclusions of this work are in Section 5.

Preliminaries
There are several existing methods to parameterize SE(3) elements, including orthonormal rotation matrices, Euler angles, Rodrigues vectors, quaternions, etc. For orthonormal rotation matrices, the compounding of two elements is much more complex ( [23], p. 45). Although Euler angles are invariant under transforms and easy to comprehend, there exist singularities [24]. Rodrigues vectors are complex to implement as a composition algorithm [25]. For parameterization of the 3D rotation, the quaternion is the best from the analog computation view [26]. Still, it is also limited to representing rotation in a full 6D pose, and the translation must be dealt with separately The dual quaternions this work introduced takes both rotation and translation into consideration. It provides a closed-form solution for the composition of 6D poses, which is the analogy to the transform matrix in homogeneous coordinates [27]. Kenwright [28] finds that the transforms by the dual quaternion multiplication 10 percent faster compared to matrix multiplication on average. Nonetheless, Kavan et al. [29] present a practical example, and they utilize dual quaternions to solve the shortage of the linear blend skinning.
Conventions. In this work, lower-case letters represent scalars, and matrices are represented by capital letters, vectors, and quaternions in bold. Quaternions are distinguished from dual quaternions by a caret, e.g., q denotes a quaternion and q denotes a dual quaternion. Two quaternions p and q multiplication is denoted as p q, while the multiplication of the two dual quaternions is denoted as p ⊗ q. The dot product and cross product of vectors v and w are denoted as v, w and v × w, respectively. Finally, we use H represent the skew-field of quaternions. With this in mind, consider the quaternions q ∈ H. The pose is synonymous with the 6D pose or transformation; the rotation is synonymous with orientation; the translation is synonymous with the position.

Quaternions
Similar to complex numbers, the sum of a real number and three complex numbers represent quaternions. The quaternion q ∈ R 4 can be treated as a 4-tuple (q 0 , q 1 , q 2 , q 3 ). It can also be identified with the typical basis elements i, j, k via the coefficients: The multiplication of the two quaternions p = (p 0 , p), q = (q 0 , q) is given by: where p 0 , q 0 names the scalar part and p, q ∈ R 3 is the vector part; note that the product is not commutative. Furthermore, linear operators [30] Q + p and Q − q → R 4×4 and associated with (2) can be also defined by matrix-vector form as: with where [ ] × is the skew-symmetric matrix generated from the corresponding vector. The canonical norm on H of quaternions is defined by The conjugate of the quaternion is obtained by changing the sign of each element in the imaginary part: q * = (q 0 , −q) = (q 0 , −q 1 , −q 2 , −q 3 ).

Representation of 3D Rotation
Unit quaternions q are quaternions with || q|| = 1 and commonly represent the rotations in 3D Euclid space. The inverse of a unit quaternion is obtained by its conjugate form The quaternion can represent the rotation around a 3D axis with a unit-length vector v and the rotation angle θ ∈ [−π, π]: A point p = (p 1 , p 2 , p 3 ) ∈ R 3 is denoted as the purely imaginary quaternion without real part p = p 1 i + p 2 j + p 3 k. From (4), the rotated point p rot is obtained as [25]: In addition, q and − q represent the same rotation due to the antipodal property on the hypersphere S 3 , so the set U of unit quaternion is a double coverage of the SO(3) of the 3D rotations.

Dual Quaternions
The dual theory is helpful when understanding the concept of dual quaternions. A dual number combines the non-dual part a 1 , and the dual part b 1 is represented as The is the dual unit; note that = 0, 2 = 0. The multiplication of two dual numbers is given as ( Dual quaternions ( x ∈ H D ) are quaternions equivalent to dual numbers, i.e., replacing real numbers a 1 , b 1 with quaternions p and q. Thus, a dual quaternion is given as follows: The multiply operation of two dual quaternions is similar to dual numbers. As two dual quaternions x 1 = p 1 + q 1 and x 2 = p 2 + q 2 ( x 1 , x 2 ∈ H D ), the product of the two is: Note that the product is non-commutative. The dual quaternions apply the same strategy utilized by quaternions, linear operators: (·) + and (·) − ∈ R 8×8 can be defined for dual quaternions by: Dual quaternions have three conjugates [30]: The norm of a dual quaternion in (5) is denoted as || x|| = x ⊗ x 2 * = x 2 * ⊗ x, which expands to: The unit dual quaternions have the norm that equals one. A dual quaternion satisfies the unit dual quaternion if and only if the non-dual part ||p|| equals one and the dot product of two parts p, q equals zero. This property helps to understand the relationship between the two parts of the dual quaternion. Especially, a unit quaternion is a unit dual quaternion when the dual part is zero.

Representation of 6D Pose
The 6D pose composes a 3D rotation part represented by a unit quaternion q r = [q 0 , q 1 , q 2 , q 3 ] in the form of (4) and a 3D translation part t ∈ R 3 . A unit dual quaternion representing the 6D pose is defined by the following equation: where q d for translation t is defined by: where t is represented by a purely imaginary translation quaternion with zero real parts. We introduce the Hamilton operators H + , H − to replace the linear operators defined above. The term Hamilton operator, which is borrowed from Akyar [31], is not commonly used, at least in the robotics literature, but it seems appropriate here: where H + t is a R 4×4 skew-symmetric matrix generated from translation vector t. A point p = (x, y, z) T ∈ R 3 , we can be embedded in skew-field H D by using a dual quaternion x d in (7). The transformation can be mathematically described as: Consistent with the unit quaternion in rotations, the dual quaternions x d and − x d represent the same pose.

Base Element
Consider the fact that the unit dual quaternions d q and − d q in the form of (5) represent the same pose. Furthermore, the underlying structure of the periodic nature of the 6D pose is no longer linear. Hence, a distribution that can characterize the non-linear structure antipodal symmetric property is required. Consequently, we can tackle multiple, conflicting hypotheses that naturally arise in ambiguous situations.
A Bingham distribution [32] is exactly a distribution on a unit hypersphere S d−1 . The property of hte most interest is the antipodal symmetric, where dF(−x) = dF(x) (i.e., opposite points on S d−1 have equal probability). It commonly represents uncertainty on 3D rotations, denoted as SO(3) mathematically, by the unit quaternion form [33], but not on the 6D pose.
For the representation of small uncertainty, Bingham is almost near the Gaussian distribution, and for the large uncertainty, the Gaussian is worse (or even quite poor) than the Bingham distribution [34].
For those who are familiar with multivariate Gaussian, the parameter matrices V and Λ can be derived via the eigendecomposition of a symmetric matrix C in subsequent Section 3.2 (which is denoted as the inverse covariance matrix Σ −1 in multivariate Gaussian distribution).

A New Distribution Model
For a quite intuitive interpretation, we decompose a unit dual quaternion x into (x r , x d ) in vector form. Thus, a joint probability density over the non-dual part, and the dual part can be denoted as the following Lemma 1.

Lemma 1.
A random vector x = (x r , x d ) ∈ S 3 × R 4 is distributed to the proposed distribution, and the p.d.f is: where x r ∈ S 3 and x d ∈ R 4 , symmetric and positive definite C ∈ R 8×8 , and a normalization constant N(C).
It is always possible to break a joint density into the product of two factors. We can work out the details for the joint case by employing the Schur complement. First, let C in Lemma 1 be denoted as: with C ij ∈ R 4×4 ; note that C 21 = C T 12 , according to the settings [22], C 11 needs to be symmetric, C 12 may be arbitrary, and C 22 has to be symmetric negative definite to ensure the antipodal symmetry, which we will further discuss in Section 3.3.

Lemma 2.
The proposed probability density function can be rewritten as: From Lemma 2, the rotation part of the dual quaternion evidently appears as a Bingham distribution, which can be derived by marginalizing out the corresponding conditional distribution of the dual part.
Since the dual part, x d , combines the rotation and translation by a Hamilton product, a canonical way to describe dependencies between the position and the orientation of a dual quaternion is still unknown [22].
From (9), the dual part given the non-dual part f (x d |x r ) can be treated as a multivariate Gaussian distribution N (A 2 x r , − C −1 22 2 ). Thus, the joint density function from Lemma 2 can be rewritten as:

Normliazation Constant
The Bingham distribution is flexible for 3D rotation represented by the unit quaternion on the hypersphere S 3 . Although it represents uncertainty, it is still not well propagated in the computer vision and robotics communities because the computation of the normalization constant is complex. The normalization constant, F, in the Bingham distribution is not closed form, which means the normalization constant, F, only has the numerical solution. However, several techniques can overcome this difficulty, such as caching techniques [35] and saddlepoint approximations. It is still an area of active research [14].
Fortunately, the computational burden is alleviated by the method adapted from [22]. The normalization constant is written as follows according to the discussion: Furthermore, this work calculates N B (A 1 ) by the hypergeometric function [34] of a matrix argument: For the case d = 3, this reduces to: A statistics library [36] is used for the computation of the normalization constant, F, in this work.

Parameter Estimation
We hope to obtain the parameters of the proposed distribution given a d × N sample matrix X = [x 1 , . . . , x N ], where each column vector is assumed to be generated i.i.d from our proposed distribution. This procedure can be divided into two stages. First, the first four entries of samples will recover the parameter of the Bingham distribution. Second, the eight entries recover the parameter in the Gaussian distribution.
From Lemma 2, we observed that the base element with d = 4 in Section 3.1 appears a marginal distribution of our model. Thus, we can apply Definition 1: where x r ∈ R 4 is the first four entries from the proposed distribution. The parameter V can be obtained as the matrix of eigenvectors of the covariance A 1 with corresponding eigenvalues Λ in the order λ 1 ≤ λ 2 ≤ λ 3 , equivalent to the eigendecomposition of The probability density function of a 4D Bingham distribution projected on S 3 by a unit vector is shown in Figure 2, which appears in the marginal distribution of the first four entries x r in our model.
As we have already discussed in Section 3.2, the dual part x d given the non-dual x r is assumed to be a Gaussian distribution N (A 2 x r , − C −1 22 2 ). The parameters A 2 and − C −1 22 2 can be obtained by maximum likelihood estimation of a multivariate linear regression ( [37], Theorem 8.2.1).
The mode of proposed distribution is related to the order of the column vector in matrix V . It turns to be the last column, according to diagonal entries when we enforce λ 1 ≤ · · · ≤ λ d−1 in Λ. It is also possible to swap columns of V without changing the distribution [38]. There exist two correct modes in the proposed distribution because of the antipodal symmetry. For one of the modes m = (m r , m d ), where m r is the normalized eigenvector of A 1 and m d = A 2 m r , the −m is the another correct one.

Result
As expected, unit dual quaternions can be used to represent 3D rotation while the dual part is zero. Thus, (4) can be extended to a dual form as: where This work utilizes dual quaternions to generate the form of the 3D translation. A vector t = (t 1 , t 2 , t 3 ) T ∈ R 3 embedded in the skew-field H D is represented by the following: To combine rotations and translations, the product of two unit dual quaternions is used according to (6). A rotation with a subsequent translation is given as follows: A dual quaterinon in the form of (14) is still a unit dual quaternion, and there is no restriction on translation that must be unit-length. According to the defined norm property of the unit dual quaternion, that is, p 0 q 0 + p 1 q 1 + p 2 q 2 + p 3 q 3 = 0, this property provides ingredients to our algorithm for translation. According to (5), the first four entries are interpreted as the non-dual part, which describes the rotation, and its last four entries are interpreted as a dual part of d x . That is: Represented by the vector form, the unit dual quaternion can be written as d x = q r + q d , decomposing into the non-dual part q r = (d 0 , d 1 , d 2 , d 3 ) and the dual part q d = (d 4 , d 5 , d 6 , d 7 ).
Furthermore, the Hamilton operator (·) + ∈ R 4×4 is associated with (8), and the translation part can be derived from (14) as follows: where q * r = (q 0 , −q) = (d 0 , −d 1 , −d 2 , −d 3 ) is the conjugate of q r . Algorithm 1 shows a robust method to generate the rotation angle, axis, and translation in 3D space by sampling from the proposed distribution. Furthermore, this work applies Algorithm 1 to several sets of the samples generated from the proposed distribution, and the results are shown in Figure 3.

Algorithm 1:
Recover the corresponding translation and rotation from sampled vector on the proposed distribution.
Input: A sampled random vector w x = (w 0 , . . . , w 7 ) T ∈ R 8 /* Compute the rotation angle */ φ ← 2 · atan2( w 1 2 + w 2 2 + w 3 2 , w 0 ) /* Compute the rotational arbitrary vector */ The 6D pose is obtained by the random vector sampled from the proposed distribution. Each sample from our distribution appears in the antipodal symmetry in rotations. Compared with the planar rigid motion case [22], our distribution breaks the antipodal symmetry in the translation that t and −t are denoted as positions that always have the same probability values.  In the first row, the rotation after the translation applies to sets sampled from the proposed distribution, the blue dots mean the position in 3D space, the rotation axis is represented by an arrow. In the bottom row, set the translation after the rotation. The mode in both rows appears as the antipodal symmetry in the representation of rotation; there is no such property in the position that recovers from the proposed distribution (Zoom in the figure to see the detail).

Conclusions
This work proposes a distribution over the 6D pose in order to capture the uncertainty which is typically ignored by the pose estimation task. Our model takes the periodic nature of the underline structure into consideration, such as antipodal symmetry, which is suitable for high-noise regimes. The proposed distribution eliminates the drawbacks from initializing the SE(2), in which the position is ambiguous due to antipodal symmetry, and our distribution setting is closer to reality.
However, we assume that the Gaussian on the correlation part leads to a better, but still imperfect, answer. Many beautiful properties in the parameter matrix of the proposed distribution have not been explored. Still, a unified distribution exists to couple the rotation and translation.
In the future, we will verify the proposed distribution in practical applications, such point cloud registration, as well as the most important one, the 6D pose estimation task. Last but not least, the combination of the neural network with the proposed probability to regress the object pose is worth endeavoring.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A. Proof of Lemma 2
The parameter matrix C in the distribution can be rewritten as the UDL form by Schur Complement: x r x d T C 11 C 12 where, once again, A 1 = C 11 − C 12 C −1 22 C 21 , A 2 = −C −1 22 C 21 .