Robust and Accurate Hand–Eye Calibration Method Based on Schur Matric Decomposition

To improve the accuracy and robustness of hand–eye calibration, a hand–eye calibration method based on Schur matrix decomposition is proposed in this paper. The accuracy of hand–eye calibration methods strongly depends on the quality of the observation data, so preprocessing the observation data is essential. As with traditional two-step hand–eye calibration methods, we first solve the rotation parameters, after which the translation vector can be determined immediately. A general solution is obtained from one observation through Schur matrix decomposition, which decreases the degrees of freedom from three to two. Observation data preprocessing is one of the basic unresolved problems of hand–eye calibration. A discriminant equation for deleting outliers is deduced based on Schur matrix decomposition, and the preprocessing problem is solved using this outlier detection, which significantly improves robustness. The proposed method was validated by both simulations and experiments. The results show that the prediction errors of rotation and translation were 0.06 arcmin and 1.01 mm, respectively, and that the proposed method performed much better in outlier detection than the five methods used for comparison. A minimal configuration for the unique solution is also proven from a new perspective.


Introduction
The combination of vision sensors and robots is a milestone in robotic intelligence, increasing the extent and efficacy of robot applications [1][2][3][4][5]. Hand-eye calibration is an important technique for determining the transformation between a robot gripper and a robot vision sensor [6]. It is mainly applied in the robot's hand-eye coordination, in which the machine vision system guides the robot gripper to accurately target and reach a specified location. From work at height to surgery, the more sophisticated the operation, the better the hand-eye coordination required.
Many researchers have studied hand-eye calibration, and all current methods can be divided into two categories: linear methods and iterative methods.
Linear methods are efficient and suitable for online hand-eye calibration. Shiu and Ahmad first introduced the dynamic equation AX = XB into hand-eye calibration and provided the minimal configuration for a unique solution [6]. Tsai and Lenz proposed a high-efficiency linear method for the equation AX = XB [7]. Chou and Kamel expressed rotation matrices using quaternions and obtained an analytical solution using Singular Value Decomposition (SVD) [8]. Lu and Chou used an eight-dimensional vector to express rotation and translation and obtained a least squares solution [9]. Chen analyzed the relationship between screw movement and hand-eye calibration, and then proved that the movement of the robot gripper and vision sensor must satisfy certain geometric constraints [10]. Daniilidis solved rotation and translation simultaneously by means of a dual quaternion [11]. Park introduced canonical coordinates into the hand-eye calibration equation, which simplified the parameters [12]. Shah constructed a closed-form solution and derived the minimal configuration for the unique solution based on the Kronecker product [13]. Compared with Daniilidis [11], Shah's method was more reliable and accurate.
Iterative methods are mainly used to improve accuracy and robustness. Other authors [14,15] took the Frobenius norm of the rotation error and translation error as the cost function, and then optimized it using nonlinear methods. Horaud expressed rotation matrices using quaternions and simultaneously optimized the robot-world and hand-eye transformations [16]. Strobl and Hirzinger proposed a new adaptive error model that helped improve the solution to AX = XB and AX = ZB [17]. Ruland proposed a self-calibration method that took projection error as its cost function and optimized it using branch-and-bound [18].

The accuracy of the above methods strongly depends on the quality of the observation data. Therefore, preprocessing the observation data is essential, yet it is rarely reported. Schmidt et al. [19] proposed a preprocessing method based on vector quantization, which improved the quality of the observation data to a certain extent but could not identify outliers. In addition, the complexity increased from O(N) to O(N^4), which considerably decreased the method's efficiency.

Description of Hand-Eye Calibration Problem
Figure 1 describes the hand-eye calibration problem. The symbols are defined as follows: $G_i$ is the robot gripper coordinate system, which is fixed on the robot gripper and moves together with it. $C_i$ is the camera coordinate system, which is fixed on the camera and moves together with it; its origin coincides with the camera's optical center, its Z-axis is parallel to the optical axis, and its X and Y axes are parallel to the X and Y axes of the image coordinate system. CW is the world coordinate system and RW is the robot coordinate system that is fixed on the robot and moves together with it. When the robot gripper moves, its controlling device can identify the gripper's pose in RW.
$A_i$ is the homogeneous transformation matrix from $G_i$ to RW, obtained from the robot controlling device:
$$A_i = \begin{bmatrix} R_{A_i} & t_{A_i} \\ 0 & 1 \end{bmatrix} \tag{1}$$
$B_i$ is the homogeneous transformation matrix from CW to $C_i$, obtained using camera pose estimation methods:
$$B_i = \begin{bmatrix} R_{B_i} & t_{B_i} \\ 0 & 1 \end{bmatrix} \tag{2}$$
$A_{ij}$ is the homogeneous transformation matrix from $G_i$ to $G_j$:
$$A_{ij} = A_j^{-1} A_i \tag{3}$$
$B_{ij}$ is the homogeneous transformation matrix from $C_i$ to $C_j$:
$$B_{ij} = B_j B_i^{-1} \tag{4}$$
and $X$ is the homogeneous transformation matrix from $C_i$ to $G_i$:
$$X = \begin{bmatrix} R_X & t_X \\ 0 & 1 \end{bmatrix} \tag{5}$$
Here $i$ and $j$ denote the $i$th and $j$th states of the robot gripper and camera, ranging from 0 to $N$, where $N$ is the number of movements. Since the camera is rigidly fixed to the robot gripper, $X$ is constant. With this notation, the hand-eye calibration equation is:
$$A_{ij} X = X B_{ij} \tag{6}$$
Partitioning the matrices yields two equations:
$$R_{A_{ij}} R_X = R_X R_{B_{ij}}, \qquad R_{A_{ij}} t_X + t_{A_{ij}} = R_X t_{B_{ij}} + t_X \tag{7}$$
Equation (7) shows that $R_X$ can be solved independently, whereas the accuracy of $t_X$ depends on that of $R_X$.
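The frame conventions above can be checked numerically. The following is a minimal sketch, assuming that the relative motions are formed as $A_{ij} = A_j^{-1} A_i$ and $B_{ij} = B_j B_i^{-1}$ (which is what the stated frame directions imply) and that a fixed transformation `Z` from CW to RW exists; `Z`, the helper names, and all numeric values are hypothetical and serve only to generate consistent synthetic poses.

```python
import math

def unit(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def rot(axis, angle):
    """Rotation matrix from a unit axis and an angle in radians (Rodrigues' formula)."""
    x, y, z = axis
    c, s = math.cos(angle), math.sin(angle)
    v = 1.0 - c
    return [[c + x * x * v, x * y * v - z * s, x * z * v + y * s],
            [y * x * v + z * s, c + y * y * v, y * z * v - x * s],
            [z * x * v - y * s, z * y * v + x * s, c + z * z * v]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def hom(R, t):
    """Assemble the 4 x 4 homogeneous matrix [R t; 0 1]."""
    return [R[i] + [t[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]

def inv_hom(T):
    """Inverse of a rigid transform: [R^T, -R^T t; 0 1]."""
    Rt = [[T[j][i] for j in range(3)] for i in range(3)]
    t = [-sum(Rt[i][k] * T[k][3] for k in range(3)) for i in range(3)]
    return hom(Rt, t)

# Hypothetical ground truth: X maps camera to gripper, Z maps CW to RW.
X = hom(rot(unit([1.0, 2.0, 3.0]), 0.4), [10.0, -5.0, 20.0])
Z = hom(rot(unit([0.0, 1.0, 1.0]), -0.7), [100.0, 50.0, 0.0])

# Two camera poses B_i (CW -> C_i); the gripper poses follow from A_i X B_i = Z.
B = [hom(rot(unit([1.0, 0.0, 1.0]), 0.3), [1.0, 2.0, 3.0]),
     hom(rot(unit([0.0, 1.0, 0.0]), 1.1), [-4.0, 0.5, 2.0])]
A = [matmul(matmul(Z, inv_hom(Bi)), inv_hom(X)) for Bi in B]

# Relative motions from state 0 to state 1.
A_01 = matmul(inv_hom(A[1]), A[0])
B_01 = matmul(B[1], inv_hom(B[0]))

# Residual of the hand-eye equation A_01 X = X B_01 (should vanish up to rounding).
lhs, rhs = matmul(A_01, X), matmul(X, B_01)
residual = max(abs(lhs[i][j] - rhs[i][j]) for i in range(4) for j in range(4))
print(residual)
```

The unknown `Z` cancels when the relative motions are formed, which is why only `X` remains in the calibration equation.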

Schur Matrix Decomposition
A given matrix can be reduced to a canonical form via a similarity transformation. For numerical stability, similarity transformations by unitary matrices are the most attractive. Schur matrix decomposition can be stated as follows: if $A \in \mathbb{C}^{n \times n}$, then there exists a unitary matrix $U$ such that $U^H A U = T = D + N$, where $D$ is a diagonal matrix and $N$ is a strictly upper triangular matrix, i.e., $n_{ij} = 0$ for all $i \ge j$. For a real matrix $A$, $U$ is restricted to an orthogonal matrix, $U^T A U = T$, and $T$ has the following form:
$$T = \begin{bmatrix} T_{11} & T_{12} & \cdots & T_{1m} \\ 0 & T_{22} & \cdots & T_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & T_{mm} \end{bmatrix}$$
where each $T_{ii}$ is either a 1 × 1 matrix or a 2 × 2 matrix with complex conjugate eigenvalues. If $R_{A_{ij}}$ is similar to $R_{B_{ij}}$, so that their eigenvalues are the same, then the matrices $T$ associated with $R_{A_{ij}}$ and $R_{B_{ij}}$ are the same.
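For a 3 × 3 rotation matrix, a real Schur form can be written down without a general-purpose solver. The sketch below is an illustration under assumptions, not the decomposition routine used by the authors: it builds an orthogonal `U` whose first column is the rotation axis and shows that $U^T R U$ consists of a 1 × 1 block equal to 1 and a 2 × 2 planar rotation block. The helper `schur_basis` and the numeric values are hypothetical.

```python
import math

def unit(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def rot(axis, angle):
    """Rotation matrix from a unit axis and an angle in radians (Rodrigues' formula)."""
    x, y, z = axis
    c, s = math.cos(angle), math.sin(angle)
    v = 1.0 - c
    return [[c + x * x * v, x * y * v - z * s, x * z * v + y * s],
            [y * x * v + z * s, c + y * y * v, y * z * v - x * s],
            [z * x * v - y * s, z * y * v + x * s, c + z * z * v]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def schur_basis(k):
    """Orthogonal U = [k, u, v] with u, v spanning the plane normal to the unit axis k."""
    seed = [1.0, 0.0, 0.0] if abs(k[0]) < 0.9 else [0.0, 1.0, 0.0]
    u = unit(cross(k, seed))
    v = cross(k, u)
    return transpose([k, u, v])  # k, u, v become the columns of U

k = unit([1.0, 2.0, 2.0])
theta = 0.8
R = rot(k, theta)
U = schur_basis(k)
T = matmul(matmul(transpose(U), R), U)  # T = U^T R U

for row in T:
    print([round(x, 6) for x in row])
```

Any rotation with the same angle produces the same `T` under this construction, which is the property the derivation relies on when $R_{A_{ij}}$ and $R_{B_{ij}}$ are similar.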

Hand-Eye Calibration Principle
$A_0$ and $B_0$ denote the initial states of the robot gripper and camera, and $(A_{i0}, B_{i0})$ $(i = 1, 2, \ldots, N - 1, N)$ is the series of homogeneous transformation matrices relative to these initial states. Without loss of generality, take $i = 1$ and consider only the rotation equation in Equation (7), $R_{A_{10}} R_X = R_X R_{B_{10}}$. From Theorem 1, proved in Appendix A, the general solution can be written as: And $R_X$ depends only on $c$ and $d$. For arbitrary $i = 1, 2, \ldots, N - 1, N$, the rotation satisfies $R_{A_{i0}} R_X = R_X R_{B_{i0}}$. Substitute Equation (10) into Equation (12): where: Collate Equation (13) into equations related only to $s = (c \;\; d)^T$.
$C_i$ is a matrix generated by the coefficients of $c$ and $d$, and $D_i$ is a matrix generated by the constant terms. Then, the final linear equation system can be constructed: where, This is a least squares problem with a constraint: where, Denote the cost function as: Let $s = (K - \lambda I)y$ and substitute it into the previous equations: $K$ is a symmetric matrix, so $s^T s = 1$ is equivalent to $y^T F = 1$: Because $y^T F = F^T y$: This is a symmetric quadratic eigenvalue problem [20]. The Lagrange multiplier is solved using previously published methods [21,22]. The least squares solution of $s$ is: Given $U_{R_{A_{10}}}$ and $U_{R_{B_{10}}}$, the least squares solution of $R_X$ is: An $R_X^i$ exists for each $i = 1, 2, \ldots, N - 1, N$. To weaken the effect of noise, these matrices are fused based on the chordal distance between matrices. First, calculate the singular value decomposition of the sum of $R_X^i$, $i = 1, 2, \ldots, N - 1, N$: Then: To solve for $t_X$, note that for the $i$th movement the translation satisfies, from Equation (7), $R_{A_{i0}} t_X + t_{A_{i0}} = R_X t_{B_{i0}} + t_X$. Substitute Equation (29) into Equation (30): Then, a large linear equation system can be obtained: where, This problem can be solved using the least squares method [20].
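The final translation step can be illustrated in isolation. The following is a minimal sketch, assuming the rotation $R_X$ has already been estimated and that each movement contributes the rows $(R_{A_{i0}} - I)\, t_X = R_X t_{B_{i0}} - t_{A_{i0}}$, which follows from the translation part of Equation (7). The stacked system is solved here through the 3 × 3 normal equations with Cramer's rule; that solver choice, the function names, and the synthetic data are assumptions made for the illustration.

```python
import math

def unit(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def rot(axis, angle):
    """Rotation matrix from a unit axis and an angle in radians (Rodrigues' formula)."""
    x, y, z = axis
    c, s = math.cos(angle), math.sin(angle)
    v = 1.0 - c
    return [[c + x * x * v, x * y * v - z * s, x * z * v + y * s],
            [y * x * v + z * s, c + y * y * v, y * z * v - x * s],
            [z * x * v - y * s, z * y * v + x * s, c + z * z * v]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def matvec(a, v):
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve3(m, b):
    """Solve the 3 x 3 system m x = b with Cramer's rule."""
    d = det3(m)
    return [det3([[b[r] if c == col else m[r][c] for c in range(3)] for r in range(3)]) / d
            for col in range(3)]

def solve_translation(R_X, motions):
    """Least squares t_X from stacked (R_A - I) t_X = R_X t_B - t_A via normal equations."""
    N = [[0.0] * 3 for _ in range(3)]
    g = [0.0] * 3
    for R_A, t_A, t_B in motions:
        M = [[R_A[i][j] - (1.0 if i == j else 0.0) for j in range(3)] for i in range(3)]
        rhs = [p - q for p, q in zip(matvec(R_X, t_B), t_A)]
        for i in range(3):
            g[i] += sum(M[k][i] * rhs[k] for k in range(3))
            for j in range(3):
                N[i][j] += sum(M[k][i] * M[k][j] for k in range(3))
    return solve3(N, g)

# Synthetic, noise-free data generated from a hypothetical ground truth.
R_X = rot(unit([1.0, 2.0, 3.0]), 0.4)
t_X = [10.0, -5.0, 20.0]
motions = []
for axis, angle, t_B in [([1.0, 0.0, 1.0], 0.6, [1.0, 2.0, 3.0]),
                         ([0.0, 1.0, 0.0], 1.1, [-4.0, 0.5, 2.0]),
                         ([1.0, 1.0, 0.0], 0.9, [0.0, 3.0, -1.0])]:
    R_B = rot(unit(axis), angle)
    R_A = matmul(matmul(R_X, R_B), transpose(R_X))
    t_A = [a + b - c for a, b, c in zip(matvec(R_X, t_B), t_X, matvec(R_A, t_X))]
    motions.append((R_A, t_A, t_B))

t_est = solve_translation(R_X, motions)
print(t_est)
```

Each $(R_{A_{i0}} - I)$ has rank two, so at least two movements with non-parallel rotation axes are needed for the normal matrix to be invertible.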

Outlier Detection
In practice, matrices $A_i$ and $B_i$ contain observation errors; the observed matrices are denoted $\hat{A}_i$ and $\hat{B}_i$, respectively. $B_i$ is more sensitive to image noise. A poor environment may lead to large observation errors, in which case the globally optimized solution is meaningless. This basic problem considerably decreases the robustness of hand-eye calibration and has not been well solved.
The form of $Y$ is: $R_{A_{i0}}$ and $R_{B_{i0}}$ must satisfy Equation (13).
For arbitrary $c$ and $d$, Equation (36) is satisfied: This equation can be used to discriminate the quality of the observation data: if the discriminant is greater than a specific threshold ε, the observation data are regarded as outliers and deleted. The threshold ε is an empirical value; by adjusting it, the observation data can be filtered. The lower the threshold ε, the higher the quality of the retained observation data. In the simulations and experiments, ε was set to 0.01. The flowchart of the proposed method is shown in Figure 2.
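The thresholding step can be illustrated with a simplified stand-in for the discriminant. The sketch below does not implement Equation (36); it uses a weaker necessary condition instead: because $R_{A_{i0}}$ and $R_{B_{i0}}$ are similar, their traces (and hence their rotation angles) must agree, so a movement is flagged when $|\mathrm{tr}(R_{A_{i0}}) - \mathrm{tr}(R_{B_{i0}})| > \varepsilon$. Reusing ε = 0.01 for this substitute criterion is an assumption, and the function name and data are hypothetical.

```python
import math

def unit(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def rot(axis, angle):
    """Rotation matrix from a unit axis and an angle in radians (Rodrigues' formula)."""
    x, y, z = axis
    c, s = math.cos(angle), math.sin(angle)
    v = 1.0 - c
    return [[c + x * x * v, x * y * v - z * s, x * z * v + y * s],
            [y * x * v + z * s, c + y * y * v, y * z * v - x * s],
            [z * x * v - y * s, z * y * v + x * s, c + z * z * v]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def trace(m):
    return m[0][0] + m[1][1] + m[2][2]

def flag_outliers(rotation_pairs, eps=0.01):
    """Indices of movements whose gripper and camera rotation traces disagree by more than eps."""
    return [i for i, (R_A, R_B) in enumerate(rotation_pairs)
            if abs(trace(R_A) - trace(R_B)) > eps]

# Synthetic movements generated from a hypothetical ground-truth rotation R_X.
R_X = rot(unit([1.0, 2.0, 3.0]), 0.4)
camera_rotations = [rot(unit([1.0, 0.0, 1.0]), 0.6),
                    rot(unit([0.0, 1.0, 0.0]), 1.1),
                    rot(unit([1.0, 1.0, 0.0]), 0.9)]
pairs = [(matmul(matmul(R_X, R_B), transpose(R_X)), R_B) for R_B in camera_rotations]

# Corrupt the second movement with an extra 0.35 rad rotation on the gripper side.
bad_R_A = matmul(rot([0.0, 0.0, 1.0], 0.35), pairs[1][0])
pairs[1] = (bad_R_A, pairs[1][1])

flagged = flag_outliers(pairs, eps=0.01)
print(flagged)
```

A trace comparison only tests the rotation angle, so it cannot detect an error that changes the axis while preserving the angle; a discriminant built on the full constraint is stricter.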

Unique Solution Conditions
Assume the rotation matrices of two movements are $A_1$, $A_2$, $B_1$, and $B_2$, and $X$ is unknown. From Theorem 1 (Appendix A), the general solution of $A_1 X = X B_1$ is: where $Y$ is a matrix related only to $c$ and $d$. Substitute Equation (37) into the equation built by the two movements: Substitute them into Equation (13) to obtain: And For Equation (39): Equation (41) is an identity.
If the rotation axes of the two movements are not parallel, $P_2$ and $Q_2$ are independent: From Theorem 2, proved in Appendix A, if the rotation axes of the $N$ movements of the robot gripper are all parallel, the hand-eye calibration has multiple solutions. Therefore, the minimal configuration for a unique solution is that the robot gripper and camera move at least twice with non-parallel rotation axes.
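The uniqueness condition can be checked numerically. The following is a minimal sketch, assuming that the general solution of a single movement has the form $R_X = U_A Y U_B^T$ with $Y = \mathrm{diag}(1, [[c, -d], [d, c]])$ and $c^2 + d^2 = 1$; this block layout of $Y$ and the hand-built Schur bases are assumptions chosen for the illustration, and all names and values are hypothetical. It shows that every angle `phi` satisfies the first movement, that a second movement about a parallel axis adds no constraint, and that a second movement about a non-parallel axis singles out one `phi`.

```python
import math

def unit(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def rot(axis, angle):
    """Rotation matrix from a unit axis and an angle in radians (Rodrigues' formula)."""
    x, y, z = axis
    c, s = math.cos(angle), math.sin(angle)
    v = 1.0 - c
    return [[c + x * x * v, x * y * v - z * s, x * z * v + y * s],
            [y * x * v + z * s, c + y * y * v, y * z * v - x * s],
            [z * x * v - y * s, z * y * v + x * s, c + z * z * v]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def matvec(a, v):
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0]]

def schur_basis(k):
    """Orthogonal U = [k, u, v] whose first column is the unit rotation axis k."""
    seed = [1.0, 0.0, 0.0] if abs(k[0]) < 0.9 else [0.0, 1.0, 0.0]
    u = unit(cross(k, seed))
    return transpose([k, u, cross(k, u)])

def residual(A, B, X):
    """Largest entry of |A X - X B|."""
    L, R = matmul(A, X), matmul(X, B)
    return max(abs(L[i][j] - R[i][j]) for i in range(3) for j in range(3))

R_X = rot(unit([1.0, 2.0, 3.0]), 0.4)  # ground truth, used only to simulate data

def gripper_rotation(R_B):
    return matmul(matmul(R_X, R_B), transpose(R_X))

k_b1 = unit([1.0, 0.0, 1.0])
B1 = rot(k_b1, 0.6)
A1 = gripper_rotation(B1)
U_A, U_B = schur_basis(matvec(R_X, k_b1)), schur_basis(k_b1)

def candidate(phi):
    """One member of the solution family of the first movement."""
    c, d = math.cos(phi), math.sin(phi)
    Y = [[1.0, 0.0, 0.0], [0.0, c, -d], [0.0, d, c]]
    return matmul(matmul(U_A, Y), transpose(U_B))

Y_true = matmul(matmul(transpose(U_A), R_X), U_B)
phi_true = math.atan2(Y_true[2][1], Y_true[1][1])

B2_par = rot(k_b1, 1.1)                    # second movement, parallel axis
A2_par = gripper_rotation(B2_par)
B2 = rot(unit([0.0, 1.0, 0.0]), 1.1)       # second movement, non-parallel axis
A2 = gripper_rotation(B2)

phis = [phi_true, phi_true + 0.5, phi_true + 1.0, phi_true - 2.0]
first = [residual(A1, B1, candidate(p)) for p in phis]
parallel = [residual(A2_par, B2_par, candidate(p)) for p in phis]
general = [residual(A2, B2, candidate(p)) for p in phis]
print(first, parallel, general)
```

Only the entries of `general` distinguish the true angle from the others, which mirrors the requirement of at least two movements with non-parallel rotation axes.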

Simulations
We designed simulations to test the performance of different hand-eye calibration methods. The hand-eye calibration equation can be written as: where $A_{ij}$ and $B_{ij}$ are the movements of the robot gripper and camera from time $i$ to time $j$, respectively. $A_i$ and $B_i$ were simulated as the observation data, and $X$ was simulated as the transformation from the camera to the robot gripper. $A_i$, $B_i$, and $X$ consist of rotation matrices and translation vectors. Each rotation matrix was generated from three Euler angles. The simulations comprised three parts: analysis of noise sensitivity, the relationship between the number of movements and accuracy, and outlier detection ability. All simulations were performed in MATLAB. In addition to the proposed method, we selected five other popular methods for comparison [7,11,12,13,23]. For the $i$th simulation, $R_X^i$ and $t_X^i$ are the ideal transformation from the camera to the robot gripper, and $\hat{R}_X^i$ and $\hat{t}_X^i$ are the measured transformation. The error matrix can be calculated as: where $k^i_{error}$ and $\theta^i_{error}$ are the rotation axis and rotation angle of $R^i_{error}$, respectively.
The errors of rotation and translation are defined as: where $n$ is the number of simulations.
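The rotation angle of an error matrix can be extracted from its trace. The sketch below is an illustration under assumptions: the error matrix is taken as $\hat{R} R^T$, the translation error as the Euclidean norm of $\hat{t} - t$, and the aggregation over $n$ trials as a root mean square; the function names and sample values are hypothetical.

```python
import math

def unit(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def rot(axis, angle):
    """Rotation matrix from a unit axis and an angle in radians (Rodrigues' formula)."""
    x, y, z = axis
    c, s = math.cos(angle), math.sin(angle)
    v = 1.0 - c
    return [[c + x * x * v, x * y * v - z * s, x * z * v + y * s],
            [y * x * v + z * s, c + y * y * v, y * z * v - x * s],
            [z * x * v - y * s, z * y * v + x * s, c + z * z * v]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def rotation_error_arcmin(R_est, R_true):
    """Angle of the error rotation R_est R_true^T, in arcminutes."""
    E = matmul(R_est, transpose(R_true))
    cos_t = (E[0][0] + E[1][1] + E[2][2] - 1.0) / 2.0
    cos_t = max(-1.0, min(1.0, cos_t))  # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(cos_t)) * 60.0

def translation_error(t_est, t_true):
    """Euclidean norm of the translation difference (same unit as the inputs)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(t_est, t_true)))

def rms(values):
    """Root mean square over trials (assumed aggregation)."""
    return math.sqrt(sum(v * v for v in values) / len(values))

R_true = rot(unit([1.0, 1.0, 0.0]), 0.5)
R_est = matmul(rot([0.0, 0.0, 1.0], 0.001), R_true)  # perturbed by 0.001 rad
rot_err = rotation_error_arcmin(R_est, R_true)
trans_err = translation_error([10.3, -5.0, 20.4], [10.0, -5.0, 20.0])
print(rot_err, trans_err, rms([rot_err, rot_err]))
```

The rotation angle is the same whether the error matrix is formed as $\hat{R} R^T$ or $R^T \hat{R}$, so the assumed ordering does not affect the reported angle.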

Analysis of Noise Sensitivity
Gaussian rotation noise ($\mu_R = 0$, $\sigma_R$ = 0°–5°) and translation noise ($\mu_T = 0$, $\sigma_T$ = 0–5 mm) were added to $A_i$ and $B_i$ ($i = 1, 2, \ldots, 9, 10$). We ran 100 simulations at each noise level. The results are shown in Figure 3, in which 'Rot.' stands for 'Rotation' and 'Trans.' stands for 'Translation'. Except for the dual quaternion method, translation perturbation had no effect on the rotation solution, because only the dual quaternion method solves rotation and translation simultaneously, whereas the other methods solve them in separate steps.

Relationship between Number of Movements and Accuracy
The simulation conditions were $\sigma_R$ = 0.2° and $\sigma_T$ = 2 mm, and the number of movements varied from 3 to 15. We ran 100 simulations at each number of movements. Figure 4a,b indicates that the accuracy of hand-eye calibration improves as the number of movements increases; the improvement is considerable when the number of movements increases from three to eight. Figure 4c,d demonstrates that the dual quaternion method is unstable, whereas the other five methods are more robust.

Outlier Detection
The simulation conditions were $\sigma_R$ = 0.2°, $\sigma_T$ = 2 mm, and ε = 0.01. The robot gripper moved 10 times; large noise was added to $n$ ($n$ = 1, 2, 3, 4, 5, 6) randomly chosen movements, and these observations were regarded as outliers. We ran 100 simulations at each number of outliers. Figure 5a,b shows the relationship between the calibration errors of $R_X$ and $t_X$ and the number of outliers, respectively. Figure 5c,d depicts the performance of the proposed method. The results indicate that the proposed method can detect outliers effectively and performs much better than the other five methods.

Experiments
Determining the poses of the robot gripper with high precision is costly, but the movements of the robot gripper can be measured precisely. Thus, most researchers adopt the following procedure to validate hand-eye calibration methods: the camera moves N + n times, where the first N movements are called the calibration link and the last n movements are called the verification link. The calibration link is used to solve the transformation between the robot gripper and the camera. The verification link is used to verify the method's accuracy by comparing its predicted movements with the true movements [3]. The predicted movements of the robot gripper can be solved from the camera's movements using Equation (47). The true movements of the robot gripper can be obtained from its controlling device. A camera was fixed on a robot arm, as shown in Figure 6a.
For the calibration link:
(1) Fix 9 feature points on the platform as shown in Figure 6b. The three-dimensional (3D) coordinates of the feature points are measured with a Leica Total Station. All the feature points' coordinates remain unchanged during the experiment.
(2) At time 0, capture an image of the feature points on the platform. Calculate the camera's pose $B_0$ through Perspective-n-Point (PnP) methods. The robot gripper's pose $A_0$ can be determined from its controlling device.
(3) At time i, move the robot gripper and camera.
(4) Capture an image of the feature points on the platform. Calculate the camera's pose $B_i$ through PnP methods. The robot gripper's pose $A_i$ can be determined from its controlling device.
(5) Repeat steps (3) and (4); the series $(A_{i0}, B_{i0})$ ($i = 1, \ldots, N - 1, N$) can be obtained.
(6) The transformation $X$ from the camera to the robot gripper can be calibrated using all six hand-eye calibration methods.
For the verification link:
The true movement of the robot gripper $A_{i0}$ can be obtained from its controlling device.
(9) Comparing $\hat{A}_{i0}$ with $A_{i0}$, the error matrix can be calculated using Equation (48): The rotation error is in arcmin and the translation error is in mm.
In the experiment, N = 2–9 and n = 200. The results are shown in Table 1. The experimental results indicate that the prediction error decreased as the number of movements increased. When the robot gripper moved 9 times, the rotation prediction accuracy of the proposed method was better than 6 arcsec, which is much higher than the calibration accuracy in the simulations. The reason is as follows. Expand Equation (47) using a partitioned matrix: The prediction error consists of the hand-eye calibration error and the camera pose estimation error. The hand-eye calibration error is denoted $\Delta R_X$. Then, the prediction error of Equation (51) can be written as: According to Equation (52), the effect of the hand-eye calibration error on the prediction is weakened. This conclusion also applies to the prediction error of translation. Thus, the prediction error in the experiment was much lower than the hand-eye calibration error in the simulations.

Conclusions
A hand-eye calibration method with high accuracy and robustness was proposed in this paper. With this method, the basic problem of observation data preprocessing is solved by outlier detection, which significantly improves robustness. However, two aspects remain to be studied. First, to improve the method's efficiency, we used constrained least squares optimization; if efficiency is not a strict requirement, an iterative method could be considered. Second, we decreased the degrees of freedom (DOFs) of the rotation matrix from three to two via Schur matrix decomposition, with the unknown parameters satisfying the constraint $c^2 + d^2 = 1$. If the following trigonometric substitution is adopted, the DOFs can be decreased from two to one. The Gröbner basis method can be used to solve polynomial equations [24]:

Conflicts of Interest:
The authors declare no conflict of interest.
Appendix A
Lemma 1. $A$ is a 3 × 3 rotation matrix and can be decomposed as in Equation (55) based on Schur matrix decomposition: Then, $T_A$ can be written as: and $T_{2 \times 2}$ is an orthogonal matrix:
Proof of Lemma 1. Because $A$ is an orthogonal matrix: $T_A$ can be written as: Substitute $T_A$ into $T_A^T T_A = I$: Thus, $T_{2 \times 2}$ is an orthogonal matrix. Denote: Then, The lemma has been proven.

Lemma 2. For rotation matrices $A$, $B$, $C$, $D$, and $X$: If the axes of $A$ and $B$ are parallel to the axes of $C$ and $D$, respectively, then: The form of $M$ can be written as:
Proof of Lemma 2. $U_A$ and $U_B$ can be obtained by Schur matrix decomposition. $k_a$ and $k_b$ are the rotation axes of $A$ and $B$ calculated through Rodrigues' formula, and $\theta_a$ and $\theta_b$ are the corresponding rotation angles. Then $k_a = X k_b$, $\theta_a = \theta_b$, $k_c = X k_d$, and $\theta_c = \theta_d$. Then, the rotation matrices can be written as: $E_A^i$ and $E_B^i$ ($i$ = 1, 2, 3) are linearly independent matrices generated from the rotation axes. An orthogonal transformation does not affect linear independence: Because $U_A^T A U_A = U_B^T B U_B = T$, The rotation angles of $C$ and $D$ are equal, then: Since,
Proof of Theorem 1. Since $A$ is similar to $B$, $T_A = T_B = T$. Substitute it into $AX = XB$, then: From Lemma 1, $T$ can be obtained: then: Due to arbitrariness, Thus, Assume then: $Y$ is an orthogonal matrix, so: The sign (±) of $Y_{11}$ is related to the determinant of $U_A U_B$. Because the determinant of $X$ is greater than 0, the sign of $Y_{11}$ is the same as the sign of the determinant of $U_A U_B$. Theorem 1 has been proven.
Theorem 2. If the rotation axes of the N movements of the robot gripper are parallel, there are multiple solutions to the hand-eye calibration.
Proof of Theorem 2. Assume the rotation matrices of two movements are $A_1$, $A_2$, $B_1$, and $B_2$, and $X$ is an unknown rotation matrix. From Theorem 1, the general solution of the equation $A_1 X = X B_1$ can be obtained: $Y$ is a matrix related only to $c$ and $d$: Substitute the general solution into the second movement: From Lemma 2: Thus, $MY \equiv YM$. This equation is an identity, indicating that the second movement cannot provide any extra constraint on $c$ and $d$.
In the same way, the remaining N − 1 movements with the same rotation axis cannot provide any extra constraint on $c$ and $d$. The general solution applies to all equations built from the N movements. Therefore, hand-eye calibration problems in which all rotation axes are the same have multiple solutions. Theorem 2 has been proven.