A Novel Point Set Registration-Based Hand–Eye Calibration Method for Robot-Assisted Surgery

Pedicle screw insertion with robot assistance dramatically improves surgical accuracy and safety when compared with manual implantation. In developing such a system, hand–eye calibration is an essential component that aims to determine the transformation between a position tracking system and a robot-arm system. In this paper, we propose an effective hand–eye calibration method, namely registration-based hand–eye calibration (RHC), which estimates the calibration transformation via point set registration without the need to solve the AX=XB equation. Our hand–eye calibration method consists of tool-tip pivot calibrations in two coordinate systems, in addition to paired-point matching, where the point pairs are generated via the steady movement of the robot arm in space. After calibration, our system allows for robot-assisted, image-guided pedicle screw insertion. Comprehensive experiments are conducted to verify the efficacy of the proposed hand–eye calibration method. A mean distance deviation of 0.70 mm and a mean angular deviation of 0.68° are achieved by our system when the proposed hand–eye calibration method is used. Further experiments on drilling trajectories are conducted on plastic vertebrae as well as pig vertebrae. A mean distance deviation of 1.01 mm and a mean angular deviation of 1.11° are observed when the drilled trajectories are compared with the planned trajectories on the pig vertebrae.


Introduction
Pedicle screw insertion is an effective treatment of spinal diseases, such as scoliosis, in addition to spinal fracture and vertebral injury. Manual implantation is challenging, especially in patients with severe spinal deformity, osteoporosis, or tumor [1][2][3]. To address the challenge, one of the proposed technologies is to integrate a robot arm with a computer navigation system [4][5][6][7]. In developing such a system, hand-eye calibration is an essential component, which aims to determine the homogeneous transformation between the robot hand/end-effector and the optical frame affixed to the end-effector [8,9].
Due to its importance, a number of approaches have been developed to solve the problem. Hand-eye calibration can be formulated in the form of AX = XB, where A and B are the robotic end-effector and the optical frame poses between successive time frames, respectively, and X is the unknown transformation matrix between the robot end-effector and the optical frame. Many solutions have been proposed to recover X given the data streams {A_i} and {B_i}. Solutions to the problem can be roughly classified into four categories, i.e., separable solutions [10][11][12][13][14], simultaneous solutions [15][16][17], iterative solutions [8,[18][19][20][21], and probabilistic methods [22,23]. Specifically, the equation AX = XB can be decomposed into rotational and translational parts. Separable solutions utilize this property, solving the rotational part first and then the translational part. In contrast, simultaneous solutions solve the rotational and translational parts at the same time. Methods in the third category solve a nonlinear optimization problem by iteratively minimizing an objective such as ||AX − XB|| until the algorithm converges on a solution for X. Different from methods of the first three categories, which assume an exact correspondence between the data streams {A_i} and {B_i}, methods in the fourth category eliminate such a requirement.
Despite these efforts, accurate hand-eye calibration remains challenging for the following reasons. First, although separable methods are useful, any error in the estimation of the rotational part is compounded when applied to solving the translational part. Second, while simultaneous solutions can significantly reduce the propagation of error [24], they are sensitive to the nonlinearities present in measurements in the form of noise and errors [25]. Third, although nonlinear iterative approaches have been observed to yield better results than linear and closed-form solutions in terms of accuracy [25], they can be computationally expensive to carry out and may not always converge on the optimal solution.
In this paper, to tackle these challenges, we propose an effective hand-eye calibration method, namely registration-based hand-eye calibration (RHC), which estimates the calibration transformation via paired-point matching without the need to solve the AX = XB equation. Specifically, in our solution, we reformulate hand-eye calibration as tool-tip pivot calibrations in two coordinate systems followed by a paired-point matching, taking advantage of the steady movement of the robot arm and thus reducing measurement errors and noise. The hand-eye calibration problem is then solved via closed-form solutions to three overdetermined equation systems. Our point set registration-based hand-eye calibration method has the following advantages:
• Our method is a simultaneous closed-form solution, which guarantees an optimal solution;
• Unlike other simultaneous solutions, our solution is obtained by solving three least-squares fitting problems, leading to three overdetermined equation systems. Thus, it is not sensitive to the nonlinearities present in measurements in the form of noise and errors;
• In comparison with the nonlinear iterative approaches, our method requires only simple matrix operations. Thus, it is computationally efficient;
• Our method achieves better results than the state-of-the-art (SOTA) methods.
The paper is organized as follows. Section 2 reviews related works. Section 3 presents the proposed method. Section 4 describes the experiments and results. Finally, we present discussions in Section 5, followed by our conclusion in Section 6.

Related Works
The earliest approaches separately estimated the rotational and translational parts. For example, Shiu et al. proposed a method for solving homogeneous transform equations [10]. Tsai presented an efficient 3D robotics hand-eye calibration algorithm that computed the 3D position and orientation separately [11]. Quaternion-based [13], extrinsic hand-eye calibration [12], and dual-quaternion-based [14] methods have been introduced for the individual estimation of the rotational and translational parts. One known problem with separable methods is that any error in the estimation of the rotation matrices may be propagated to the estimation of the translation vector.
Table 1. Four categories of solutions to the hand-eye calibration problem.

Category | Description | Limitation
Separable solutions [10–14] | Solve the rotational part first, followed by the translational part. | Error propagation problem.
Simultaneous solutions [15–17] | Solve the rotational and translational parts at the same time. | Sensitive to the nonlinearities present in measurements in the form of noise and errors.
Iterative solutions [8,18–21] | Iteratively minimize an objective such as ||AX − XB||. | Computationally expensive; may not always converge on the optimal solution.
Probabilistic methods [22,23] | Solve the calibration problem without the assumption of exact correspondence between the data streams. | Longer computation times.
To avoid the error propagation problem with separable solutions, methods in the second category simultaneously compute the orientation and position. For example, Lu et al. proposed an approach that transformed the kinematic equation into linear systems using normalized quaternions [16]. Andreff et al. proposed an on-line hand-eye calibration method that derived a linear formulation of the problem [15]. Zhao et al. [17] proposed a hand-eye calibration method based on screw motion theory to establish linear equations and simultaneously solve rotation and translation. As confirmed by experimental results, simultaneous methods have less error than separable solutions [25].
Iterative solutions are another type of method used to solve the problem of error propagation. For example, Zhuang et al. [18] presented an iterative algorithm to solve the unknown matrix X in one stage, thus eliminating error propagation and improving noise sensitivity. Mao et al. [20] proposed using a direct linear closed-form solution followed by Jacobian optimization to solve AX = XB for hand-eye calibration. Hirsh et al. [26] proposed a robust iterative method to simultaneously estimate both the hand-eye and robot-world spatial transformation. Based on a metric defined on the group of the rigid transformation SE(3), Strobl and Hirzinger [27] presented an error model for nonlinear optimization. They then proposed a calibration method for estimating both the hand-eye and robot-world transformations. While iterative solutions are generally accurate, they can be computationally expensive and may not always converge to the optimal solution [28].
The methods mentioned above assume an exact correspondence between the streams of sensor data, while methods in the fourth category eliminate such a requirement. For example, Ma et al. [23] proposed two probabilistic approaches by giving new definitions of the mean on SE(3), which alleviated the restrictions on the dataset and led to improved accuracy. Although it is worth investigating the situation when the exact correspondence between sensor data is unknown, probabilistic methods usually lead to longer computation times. Additionally, assuming an exact correspondence is not a problem in our study.
Hand-eye calibration is also an active research topic in medical applications. For example, Morgan et al. [29] presented a Procrustean perspective-n-point (PnP) solution for hand-eye calibration for surgical cameras, achieving an average projection error of 12.99 pixels when evaluated on a surgical laparoscope. Özgüner et al. [30] proposed a solution for hand-eye calibration for the da Vinci robotic surgical system by breaking down the calibration procedure into systematic steps to reduce error accumulation. They reported a root mean square (RMS) error of 2.1 mm and a mean rotational error of 3.2° when their calibration method was used to produce visually guided end-effector motions. Using the da Vinci Research Kit (dVRK) and an RGB-D camera, Roberti et al. [31] proposed to separate the calibration of the robotic arms and an endoscope camera manipulator from the hand-eye calibration of the camera for an improved accuracy in a 3D metric space. The proposed method reached sub-millimeter accuracy in a dual-arm manipulation scenario, while the use of the RGB-D camera limited its actual application in surgery. Sun et al. [32] proposed a hand-eye calibration method for robot-assisted minimally invasive surgery, which relied purely on surgical instruments already in the operating scenario. Their model was formed by the geometry information of the surgical instrument and the remote center-of-motion (RCM) constraint, outperforming traditional hand-eye calibration methods in both simulation and robot experiments.
Deep learning-based methods, especially those based on convolutional neural networks (CNN), have also been developed for low-level image-processing tasks in hand-eye calibration [33][34][35][36]. For example, Valassakis et al. [34] proposed a sparse correspondence model that used a U-Net to detect 2D key points for eye-in-hand camera calibration. Kim et al. [36] introduced deep learning-based methods to restore out-of-focus blurred images for an improved accuracy in hand-eye calibration.

System Overview
Our robot-assisted, image-guided pedicle screw insertion system consists of a master computer, an optical tracking camera (Polaris Vega XT, NDI, Waterloo, ON, Canada), and a robot arm (UR 5e, Universal Robots, Odense, Denmark) with a guiding tube. The master computer communicates with the tracking camera to obtain the poses of different optical tracking frames, and with the remote controller of the UR robot to realize a steady movement and to receive feedback information.
During pedicle screw insertion, the target point and the aiming trajectory are planned in a pre-operative CT and transformed to the tracking camera space via a homogeneous transformation obtained by a surface registration [37]. The pose of the guide is then adjusted to align with the planned trajectory. Thus, it is essential to determine the spatial transformation from the tracking camera space to the robot space, as shown in Figure 1. This transformation can be obtained via two calibration procedures: the hand-eye calibration and the guiding tube calibration.
Our robot-assisted, image-guided pedicle screw insertion procedure involves the following coordinate systems (COS), as shown in Figure 1. The 3D COS of the optical tracking camera is represented by O_C; the 3D COS of the optical reference frame on the end-effector is represented by O_M; the 3D COS of the robotic flange is represented by O_F; the 3D COS of the guiding tube is represented by O_T; the 3D COS of the robot base is represented by O_B; the 3D COS of the pre-operative CT data is represented by O_CT; and the 3D COS of the optical reference frame attached to the patient/phantom is represented by O_R. At any time, the poses of different optical tracking frames with respect to the tracking camera, such as ${}^{C}_{M}T$ and ${}^{C}_{R}T$, are known. At the same time, the pose of the robotic flange with respect to the robot base, ${}^{B}_{F}T$, is known. This transformation information can be retrieved from the API (application programming interface) of the associated devices.

Registration-Based Hand-Eye Calibration
The aim of the hand-eye calibration is to establish the spatial transformation between the robot system and the optical tracking system. Mathematically, we solve the 4 × 4 spatial transformation matrix from the COS O_M to the COS O_F, referred to as ${}^{F}_{M}T$. In this subsection, the proposed registration-based hand-eye calibration (RHC) is introduced, which mainly consists of two steps: (1) solving tool-tip pivot calibrations in both the optical tracking camera COS O_C and the robot base COS O_B; (2) solving hand-eye calibration via a paired-point matching.

Tool-Tip Calibration
In the first step, we rigidly fixed a calibration tool with a sharp tip to the flange, as shown in Figure 2a. We then needed to determine the coordinates of the tool tip relative to the two coordinate systems O_M and O_F. We obtained both by pivot calibration [38]. Once calibrated, the coordinates of the tool tip with respect to O_M and O_F are known, and they are then used in the next step to compute a paired-point matching.
We first describe the pivot calibration of the tool-tip coordinates with respect to the coordinate system O_M. We pivoted the tool tip around a stationary point, as shown in Figure 2b, to estimate the coordinates of the tool tip in both the optical tracking camera COS O_C and the 3D COS O_M of the optical reference frame on the end-effector. During pivoting, we placed the tool tip in a divot, which has the same size and shape as the tool tip, to avoid any possible sliding. Then, we moved the tool around this pivot point while always touching the divot with its tip. We denote the tip offset in O_M as p_M and the stationary pivot point in O_C as p_C. During pivoting, we kept p_C and p_M static while collecting a set of n homogeneous transformations $\{({}^{C}_{M}T)_i \mid i = 1, \dots, n\}$ via the tracking camera API. Since the tip stays at the same point p_C for every pose, each transformation yields $({}^{C}_{M}R)_i\, p_M + ({}^{C}_{M}t)_i = p_C$, which we stacked into the following overdetermined equation system:

$$
\begin{bmatrix}
({}^{C}_{M}R)_1 & -I \\
\vdots & \vdots \\
({}^{C}_{M}R)_n & -I
\end{bmatrix}
\begin{bmatrix}
p_M \\ p_C
\end{bmatrix}
=
\begin{bmatrix}
-({}^{C}_{M}t)_1 \\
\vdots \\
-({}^{C}_{M}t)_n
\end{bmatrix}
$$

where I is the 3 × 3 identity matrix. We then solved for p_M and p_C using the pseudo-inverse [39]. As we are only interested in knowing the offset of the tool tip with respect to O_M, we kept p_M while disregarding p_C.
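As a concrete illustration, the stacked pivot-calibration system can be solved in a few lines of NumPy. This is a sketch under our own naming (the function and variable names are illustrative, not from the paper); it solves the stacked least-squares problem via the pseudo-inverse:

```python
import numpy as np

def pivot_calibration(rotations, translations):
    """Estimate tool-tip offsets by pivot calibration (illustrative sketch).

    rotations    -- list of n 3x3 rotation matrices of the tracked frame
    translations -- list of n 3-vectors, the matching translations
    Returns (p_tip, p_pivot): the tip offset in the tracked frame and the
    stationary pivot point in the observing frame.
    """
    n = len(rotations)
    # Stack [R_i  -I] [p_tip; p_pivot] = -t_i into one overdetermined system.
    A = np.zeros((3 * n, 6))
    b = np.zeros(3 * n)
    for i, (R, t) in enumerate(zip(rotations, translations)):
        A[3 * i:3 * i + 3, 0:3] = R
        A[3 * i:3 * i + 3, 3:6] = -np.eye(3)
        b[3 * i:3 * i + 3] = -np.asarray(t, float)
    # Least-squares solution, equivalent to applying the pseudo-inverse.
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x[:3], x[3:]
```

The same routine serves both pivot calibrations: fed with camera poses it returns p_M (and p_C), and fed with robot flange poses it returns p_F (and p_B).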
Similarly, we used the same pivot calibration technique to estimate the coordinates of the tool tip in both the robotic flange COS O_F and the robot base COS O_B. This time, we pivoted the tool tip around a stationary point using the robot arm, as shown in Figure 2c. Again, we placed the tool tip in a divot to avoid sliding. We denote the tip offset in O_F as p_F and the stationary pivot point in O_B as p_B. During pivoting, we kept p_B and p_F static while collecting a set of l homogeneous transformations $\{({}^{B}_{F}T)_j \mid j = 1, \dots, l\}$ via the robot arm API. We then estimated p_B and p_F by solving the following overdetermined equation system [39]:

$$
\begin{bmatrix}
({}^{B}_{F}R)_1 & -I \\
\vdots & \vdots \\
({}^{B}_{F}R)_l & -I
\end{bmatrix}
\begin{bmatrix}
p_F \\ p_B
\end{bmatrix}
=
\begin{bmatrix}
-({}^{B}_{F}t)_1 \\
\vdots \\
-({}^{B}_{F}t)_l
\end{bmatrix}
$$

where I is the 3 × 3 identity matrix. Again, we are only interested in knowing the offset of the tool tip with respect to O_F; therefore, we kept p_F while disregarding p_B.

Solving Hand-Eye Calibration via Paired-Point Matching
After obtaining the offsets of the tool tip with respect to the two coordinate systems O_M and O_F, we can compute the coordinates of the tool tip in both the tracking camera COS O_C and the robot base COS O_B at any time via the corresponding device's API. In this section, we present an elegant method to solve the hand-eye calibration via paired-point matching using the setup shown in Figure 3.
Specifically, during the hand-eye calibration, we maintained a stationary spatial relationship between the robot base and the tracking camera while moving the robot flange. By controlling the flange to move to m different positions, we collected two sets of paired points, $P_C = \{(p_C)_1, \dots, (p_C)_m\}$ in the tracking camera COS O_C and $P_B = \{(p_B)_1, \dots, (p_B)_m\}$ in the robot base COS O_B, where $(p_C)_i = ({}^{C}_{M}T)_i\, p_M$ and $(p_B)_i = ({}^{B}_{F}T)_i\, p_F$. As the first step to match the two paired-point sets, we computed a 3 × 3 matrix H as follows:

$$
H = \sum_{i=1}^{m} \left((p_B)_i - \bar{p}_B\right) \left((p_C)_i - \bar{p}_C\right)^{\top}
$$

where $\bar{p}_B$ and $\bar{p}_C$ are the centroids of $P_B$ and $P_C$, respectively. We then used the singular value decomposition (SVD) [39] to decompose the matrix H into U, S, and V matrices:

$$
H = U S V^{\top}
$$

Based on the decomposed matrices, we computed the rotation matrix ${}^{C}_{B}R$ as:

$$
{}^{C}_{B}R = V \operatorname{diag}(1, 1, \lambda)\, U^{\top}
$$

where $\lambda = \det(V U^{\top})$. Based on ${}^{C}_{B}R$, we solved ${}^{C}_{B}t$ using:

$$
{}^{C}_{B}t = \bar{p}_C - {}^{C}_{B}R\, \bar{p}_B
$$

Therefore, we obtained the spatial transformation ${}^{C}_{B}T$ as:

$$
{}^{C}_{B}T = \begin{bmatrix} {}^{C}_{B}R & {}^{C}_{B}t \\ 0 & 1 \end{bmatrix}
$$

For each position in the movement trajectory, we computed the spatial transformation $({}^{F}_{M}T)_i$ as:

$$
({}^{F}_{M}T)_i = \left(({}^{B}_{F}T)_i\right)^{-1} \left({}^{C}_{B}T\right)^{-1} ({}^{C}_{M}T)_i
$$

where $({}^{B}_{F}T)_i$ and $({}^{C}_{M}T)_i$ are retrieved from the associated device's API when generating $P_C$ and $P_B$. Each position gives a different $({}^{F}_{M}T)_i$. To improve the robustness and to increase the accuracy, we averaged all the obtained transformations. Specifically, we used $(\psi_i, \theta_i, \phi_i)$ to represent the Euler angles of $({}^{F}_{M}R)_i$, so the average rotation matrix ${}^{F}_{M}R$ can be written as:

$$
{}^{F}_{M}R = R\!\left(\frac{1}{m}\sum_{i=1}^{m}\psi_i,\; \frac{1}{m}\sum_{i=1}^{m}\theta_i,\; \frac{1}{m}\sum_{i=1}^{m}\phi_i\right)
$$

where $R(\cdot)$ represents the transformation from the Euler angles to the rotation matrix. Meanwhile, the average translation vector ${}^{F}_{M}t$ can be written as:

$$
{}^{F}_{M}t = \frac{1}{m}\sum_{i=1}^{m} ({}^{F}_{M}t)_i
$$

where $({}^{F}_{M}t)_i$ is the translation vector of $({}^{F}_{M}T)_i$. Therefore, the hand-eye transformation ${}^{F}_{M}T$ is composed of the average rotation matrix ${}^{F}_{M}R$ and the average translation vector ${}^{F}_{M}t$, written as:

$$
{}^{F}_{M}T = \begin{bmatrix} {}^{F}_{M}R & {}^{F}_{M}t \\ 0 & 1 \end{bmatrix}
$$
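To make the paired-point step concrete, the following NumPy sketch implements the standard SVD-based rigid registration of two corresponding point sets (the function and variable names are ours; the diag(1, 1, λ) factor guards against a reflection solution):

```python
import numpy as np

def paired_point_registration(P_src, P_dst):
    """Rigid paired-point registration via SVD (illustrative sketch).

    P_src, P_dst -- (m, 3) arrays of corresponding points, e.g. the tool-tip
                    positions in the robot base frame and the camera frame.
    Finds R, t minimizing sum_i || R p_src_i + t - p_dst_i ||^2 and returns
    the 4x4 homogeneous transformation mapping source to destination.
    """
    P_src = np.asarray(P_src, float)
    P_dst = np.asarray(P_dst, float)
    c_src, c_dst = P_src.mean(axis=0), P_dst.mean(axis=0)
    # 3x3 cross-covariance of the demeaned point sets.
    H = (P_src - c_src).T @ (P_dst - c_dst)
    U, S, Vt = np.linalg.svd(H)
    # Force det(R) = +1 so the result is a proper rotation.
    lam = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, lam]) @ U.T
    t = c_dst - R @ c_src
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, t
    return T
```

Applied to the two tool-tip point sets, this yields the robot-base-to-camera transformation, which can then be combined with the per-position device poses as described in the text.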

Guiding Tube Calibration
To achieve the robot-assisted pedicle screw insertion, the guiding tube that guides the drilling of a screw insertion trajectory needs to be calibrated. The guiding tube calibration is a procedure to estimate the transformation ${}^{M}_{T}T$ of the COS O_T, defined on the guiding tube, relative to the COS O_M of the optical reference frame attached to the robot end-effector. In this calibration procedure, we utilized two COSs, i.e., the local COS O_T of the guiding tube and the COS O_M, as shown in Figure 4.
The local COS O_T can be determined using three points: the two end points of the guiding tube that lie on the center axis of the tube (referred to as p^(1) and p^(2)), and one further point on the guiding tube (referred to as p^(3)). To digitize p^(1) and p^(2), we inserted a plug into the guiding tube. We then digitized the coordinates of these three points in the COS O_M, referred to as $p^{(1)}_M$, $p^{(2)}_M$, and $p^{(3)}_M$. To establish the COS O_T, we defined the origin by p^(2), the z-axis by p^(1) and p^(2), and constrained the x–z plane to contain p^(3). We obtained the transformation ${}^{M}_{T}T$ from its origin and axes, as:

$$
{}^{M}_{T}T =
\begin{bmatrix}
v_x & v_y & v_z & p^{(2)}_M \\
0 & 0 & 0 & 1
\end{bmatrix}
$$

where,

$$
v_z = \frac{p^{(1)}_M - p^{(2)}_M}{\left\| p^{(1)}_M - p^{(2)}_M \right\|}, \qquad
v_y = \frac{v_z \times \left(p^{(3)}_M - p^{(2)}_M\right)}{\left\| v_z \times \left(p^{(3)}_M - p^{(2)}_M\right) \right\|}, \qquad
v_x = v_y \times v_z
$$
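The construction of a right-handed frame from the three digitized points can be sketched as follows (a hypothetical helper; the function and variable names are ours):

```python
import numpy as np

def tube_frame(p1, p2, p3):
    """Build the guiding-tube frame from three digitized points (sketch).

    p1, p2 -- points on the tube's center axis; p3 -- a further point on
    the tube. Returns the 4x4 pose of the tube frame expressed in the
    frame in which the points were digitized.
    """
    p1, p2, p3 = (np.asarray(p, float) for p in (p1, p2, p3))
    z = p1 - p2
    z = z / np.linalg.norm(z)          # z-axis along the tube axis
    y = np.cross(z, p3 - p2)
    y = y / np.linalg.norm(y)          # y-axis normal to the x-z plane
    x = np.cross(y, z)                 # completes a right-handed frame
    T = np.eye(4)
    T[:3, 0], T[:3, 1], T[:3, 2], T[:3, 3] = x, y, z, p2  # origin at p2
    return T
```

By construction, p3 lies in the x–z plane of the resulting frame, matching the convention described above.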

Pre-Operative Planning
In the first step, we obtained a pre-operative CT scan before the operation. We segmented the target vertebra in the CT image and defined a trajectory using an entry point and a target point.

Intra-Operative Registration
In the second step, we performed an intra-operative registration to establish the spatial transformation ${}^{R}_{CT}T$ from the CT image COS O_CT to the COS O_R. By digitizing points on the bone surface, we adopted a surface registration algorithm [37] to compute ${}^{R}_{CT}T$. In the third step, we transformed the planned trajectory to the robot base COS O_B so that the robot can align the guiding tube with the transformed trajectory, which is calculated as:

$$
{}^{B}_{CT}T = {}^{B}_{F}T \; {}^{F}_{M}T \; \left({}^{C}_{M}T\right)^{-1} \; {}^{C}_{R}T \; {}^{R}_{CT}T
\tag{16}
$$

In Equation (16), we retrieved ${}^{C}_{R}T$ and ${}^{C}_{M}T$ from the optical tracking camera's API, ${}^{F}_{M}T$ is the hand-eye transformation, and we retrieved ${}^{B}_{F}T$ from the robot arm's API.
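This transform chain amounts to a product of 4 × 4 homogeneous matrices. A minimal sketch, assuming each pose is available as a 4 × 4 matrix (the function and argument names are ours):

```python
import numpy as np

def trajectory_point_to_base(T_B_F, T_F_M, T_C_M, T_C_R, T_R_CT, p_ct):
    """Map a point planned in the CT frame into the robot base frame
    (illustrative sketch). T_X_Y denotes the 4x4 pose of frame Y
    expressed in frame X; p_ct is a 3-vector in CT coordinates.
    """
    # Chain: CT -> patient reference -> camera -> marker -> flange -> base.
    T_B_CT = T_B_F @ T_F_M @ np.linalg.inv(T_C_M) @ T_C_R @ T_R_CT
    p = T_B_CT @ np.append(np.asarray(p_ct, float), 1.0)
    return p[:3]
```

Applying the function to the planned entry and target points yields the trajectory in robot base coordinates, with which the guiding tube is aligned.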

Experiments and Results
In this section, we will introduce the experiments and results of our study. We designed and conducted three experiments to investigate the efficacy of the proposed method: (1) an investigation of the influence of the range of robot movement on the hand-eye calibration; (2) a comparison with state-of-the-art hand-eye calibration methods; and (3) an overall system accuracy study.

Metrics
In the experiments, the performance is quantified by the deviations between the actual path and the planned trajectory. The deviations consist of the incline angle (unit: °) and the distance deviation (unit: mm). We used the entry point p^(e) and the target point p^(t) on the planned trajectory to measure the distance, as shown in Figure 6. The distance deviation and incline angle between the guidance path and the planned trajectory are denoted as d and φ, respectively, while the distance deviation and the incline angle between the drilled path and the planned trajectory are denoted as d′ and φ′, respectively.
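These metrics can be sketched in a few lines of NumPy. This is one plausible implementation, not necessarily the paper's exact definition: we assume the incline angle is the angle between the two direction vectors and the distance deviation is the perpendicular distance from the planned entry point to the measured path line; the function and variable names are ours.

```python
import numpy as np

def trajectory_deviation(entry, target, path_entry, path_target):
    """Distance deviation (mm) and incline angle (degrees) between a
    measured path and the planned trajectory (illustrative sketch).
    """
    entry = np.asarray(entry, float)
    target = np.asarray(target, float)
    p0 = np.asarray(path_entry, float)
    p1 = np.asarray(path_target, float)
    u = target - entry
    u = u / np.linalg.norm(u)               # planned direction
    v = p1 - p0
    v = v / np.linalg.norm(v)               # measured direction
    # Incline angle between the two directions (orientation-insensitive).
    phi = np.degrees(np.arccos(np.clip(abs(u @ v), -1.0, 1.0)))
    # Perpendicular distance from the planned entry to the measured line.
    w = entry - p0
    d = np.linalg.norm(w - (w @ v) * v)
    return d, phi
```

For a measured path parallel to the plan but offset sideways, the function reports the offset as d and a zero incline angle.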

Investigation of the Influence of the Range of Robot Movement on the Hand-Eye Calibration
In this experiment, we investigated the influence of the spatial range of robot movement on the proposed RHC. A plastic phantom was designed and used, as shown in Figure 7a. The phantom, which was fabricated by 3D printing, had dimensions of 140 × 90 × 85 mm³, and 25 trajectories were planned on the phantom.
During the hand-eye calibration, the robot is controlled to move in an L × L × L mm³ cubic space. To investigate the influence of the range of robot movement, we calibrated different hand-eye transformation matrices with an L of 30, 60, 90, 120, 150, or 200 mm. Each time, after obtaining the hand-eye calibration, we used the obtained transformation to control the robot to align the guiding tube with a planned trajectory. After that, we digitized the guidance path to evaluate the alignment accuracy.
Experimental results are shown in Figure 7 and Table 2. Both d and φ decreased as L increased. When L was 200 mm, the mean distance deviation was 0.70 mm and the mean incline angle was 0.68°. The results demonstrate that the larger the robot movement range, the higher the hand-eye calibration accuracy. However, further increasing the movement range leads to a failure in tracking by the camera. We found that the maximally allowed robot movement range is 200 × 200 × 200 mm³.

Comparison with State-of-the-Art Hand-Eye Calibration Methods
The plastic phantom introduced in Section 4.2 was also used in this study to compare our method with state-of-the-art (SOTA) hand-eye calibration methods, including Tsai's method [11], Andreff's method [15], Chou's method [13], Shah's method [40], and Wu's method [8]. Each time, after obtaining the hand-eye calibration using one of the mentioned methods, we used the obtained transformation to control the robot to align the guiding tube with a planned trajectory. We then digitized the guidance path to evaluate the alignment accuracy, which reflects the hand-eye calibration accuracy.
The distance deviation d and the angular deviation φ are shown in Figure 8 and Table 3. We also report the computational time cost for each method in Table 3. In comparison with the SOTA methods, our method achieved the best results in terms of distance deviation and incline angle. Meanwhile, the time cost of our method is much lower than the iterative calibration method [8], as shown in Table 3.

Overall System Accuracy Study
To evaluate the overall system accuracy, we conducted trajectory drilling experiments on three types of objects: (a) the plastic phantom used in Section 4.2, (b) four 3D-printed vertebrae, and (c) eight pig vertebrae. Each time, we controlled the robot to align the guiding tube with the planned trajectory and drilled a trajectory into the test subject. In total, we planned and drilled 20 trajectories on the plastic phantom, another 8 trajectories on the 3D-printed vertebrae, and a further 8 trajectories on the pig vertebrae. For each trajectory, after drilling, both the guidance path and the drilled path were digitized to measure the accuracy.
Results are shown in Figure 9 and Table 4. Specifically, on the plastic phantom, the average distance deviation and the average incline angle between the guiding paths and the planned trajectories are 0.70 mm and 0.72°, respectively, while the average distance deviation and the average incline angle between the drilled trajectories and the planned trajectories are 0.93 mm and 1.04°, respectively. Additionally, on the 3D-printed vertebrae, our system achieved a slightly better result, i.e., the average distance deviation and the average incline angle between the guiding paths and the planned trajectories are 0.66 mm and 0.79°, respectively, and the average distance deviation and the average incline angle between the drilled trajectories and the planned trajectories are 0.90 mm and 0.96°, respectively. Finally, we evaluated our system accuracy on the pig vertebrae. The average distance deviation and the average incline angle between the guiding paths and the planned trajectories are 0.71 mm and 0.82°, respectively, while the average distance deviation and the average incline angle between the drilled trajectories and the planned trajectories are 1.01 mm and 1.11°, respectively. Figure 9b shows a post-operative CT scan of the drilled path on a pig vertebra, demonstrating the high accuracy of our system. Both quantitative and qualitative results demonstrate that our system accuracy is good enough for robot-assisted pedicle screw insertion.

Discussions
Hand-eye calibration is one of the essential components when developing a robot-assisted, image-guided pedicle screw insertion system. The accuracy of hand-eye calibration will have a direct influence on the system accuracy. However, it is challenging to develop an accurate and robust method for hand-eye calibration. In this paper, we proposed an effective hand-eye calibration method based on tool-tip pivot calibration and paired-point matching without the need to solve the AX = XB equation. Comprehensive experiments were conducted to validate the accuracy of our proposed hand-eye calibration method as well as the robot-assisted, image-guided pedicle screw insertion system. Both qualitative and quantitative results demonstrate the efficacy of our hand-eye calibration method and the high accuracy of our system.
In comparison with the SOTA hand-eye calibration methods, our method has the following advantages. First, our method is a simultaneous closed-form solution, which is derived by solving three overdetermined equation systems, guaranteeing an optimal solution. Second, unlike other simultaneous solutions, we reformulate the hand-eye calibration problem as solutions to tool-tip pivot calibrations in two coordinate systems and paired-point matching, taking advantage of the steady movement of the robot arm, thus reducing measurement errors and noise. Third, in comparison with methods depending on iterative solutions [18][19][20][21] or probabilistic models [22,23], our method is much faster because it is not iterative and only requires simple matrix operations.
Based on the novel hand-eye calibration method, we further developed a robot-assisted, image-guided pedicle screw insertion system. We conducted trajectory drilling experiments on a plastic phantom, 3D-printed vertebrae, and pig vertebrae to validate the accuracy of our system. When drilling trajectories on the plastic phantom, our system achieved a mean distance deviation of 0.93 mm and a mean angular deviation of 1.04°. When it was used to drill trajectories on the 3D-printed vertebrae, our system achieved a mean distance deviation of 0.90 mm and a mean angular deviation of 0.96°. To check whether the differences between the results obtained from the plastic phantom and the 3D-printed vertebrae are statistically significant, we conducted an unpaired t-test with a significance level of α = 0.05. We found a p-value of 0.52 for the distance deviation and a p-value of 0.40 for the angular deviation, indicating no statistically significant difference. When drilling trajectories on the pig vertebrae, our system achieved a mean distance deviation of 1.01 mm and a mean angular deviation of 1.11°, which are regarded as accurate enough for pedicle screw insertion.

Conclusions
In this paper, we proposed a novel hand-eye calibration method, namely registration-based hand-eye calibration (RHC), to estimate the calibration transformation via paired-point matching without the need to solve the AX = XB equation. Based on the proposed hand-eye calibration method, we developed a robot-assisted, image-guided pedicle screw insertion system. Comprehensive experiments were conducted to investigate the influence of the range of robot movement on the hand-eye calibration, to compare our method with state-of-the-art methods, and to evaluate the overall system accuracy. Our experimental results demonstrate the efficacy of our hand-eye calibration method and the high accuracy of our system. Our novel hand-eye calibration method can be applied to other types of robot-assisted surgery.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript:

CT    Computed Tomography
API   Application Programming Interface
COS   Coordinate System
SVD   Singular Value Decomposition
RHC   Registration-based Hand-eye Calibration
3D    Three-dimensional