Flexible and Accurate Calibration Method for Non-Overlapping Vision Sensors Based on Distance and Reprojection Constraints

This paper addresses the problem of flexible and accurate global calibration for multiple non-overlapping vision sensors in a confined workspace. Instead of using an auxiliary calibration pattern, the proposed method uses one laser tracker and only its accessory target sphere to obtain all the 3D calibration points and then accomplish the initial estimation of pose between the vision sensors. Then, the 3D calibration points and the extrinsic parameters between vision sensors are further optimized via the bundle adjustment algorithm based on the distance and reprojection constraints. Experiments were conducted to validate the performance and the experimental results demonstrate that the distance error can be decreased from 3.5 mm to 0.8 mm after introducing the distance and reprojection constraints.


Introduction
Accurate calibration of the relative pose between vision sensors is crucial for the final measurement accuracy of multi-vision-sensor (MVS) measurement systems. Although there are many existing methods that can efficiently handle the precise pose calibration problem for vision sensors with a common field-of-view (FOV), more effort is needed in the study of flexible and accurate calibration methods for non-overlapping vision sensors. Since the intrinsic calibration of a single vision sensor has some comprehensive solutions such as the methods presented by Tsai [1] and Zhang [2], we now focus on the extrinsic parameter calibration for vision sensors without overlapping FOVs in this paper.
According to the difference of auxiliary tools used in the calibration process, the published extrinsic calibration methods for non-overlapping vision sensors can be roughly classified into three categories. The first approach utilizes high precision measuring instruments to establish a global reference coordinate and finishes the global calibration by coordinate transformation. Lu and Li [3] employed two theodolites and additional calibration targets to accomplish the global calibration. To solve the problem of the blind observation zone of a dual theodolite, Zhang et al. [4] utilized one theodolite and one planar calibration target to obtain the 3D coordinates and corresponding image coordinates of the feature points. Xie et al. [5] used a high-resolution digital camera instead of a theodolite as a global measurement device to complete global calibration. Similarly, Dong et al. [6] realized the extrinsic calibration of a camera network based on close-range photogrammetry. In addition, Liu et al. [7] used a laser tracker to establish a global coordinate system, and then used one precise three dimensional target to obtain the rotation and translation matrix of each local vision sensor coordinate system relative to the global laser tracker coordinate system, which in turn, realized the global calibration among multiple sensors. Additionally, Lu et al. [8] utilized a coordinate measuring machine to accomplish the calibration of stereo cameras.
The second approach does not need auxiliary measuring instruments, but uses customized calibration targets that contain precise spatial geometric information. Taking advantage of a one-dimensional target, Liu et al. [9] proposed a global calibration method for multiple sensors based on the principle of cross-ratio invariance and collinear constraints of the calibration object. Likewise, vanishing points of a one-dimensional target were also used for solving the coordinates of each target point in camera coordinate systems [10]. Liu et al. [11] further used a laser rangefinder instead of a long one-dimensional target for the calibration of vision sensors over a wide range. In order to improve the acquisition efficiency of feature points for calibration, Zhang et al. [12] formed a dual-planar target by fixing two planar calibration panels on the two ends of a rigid beam. During the calibration process, each camera saw one planar target in its own FOV, and the invariance of the spatial structure between the two planar targets was used to calculate the relative pose between the two cameras. Additionally, Gong et al. [13] constructed a global target covered with circular features to realize the quick calibration of multiple vision sensors for some specific applications. It is also worth noting that Kumar et al. [14] acquired calibration images of the same target by means of planar mirroring. The global calibration of multiple sensors was then implemented through the transformation between the virtual local vision sensor coordinate system and the real one. Liu et al. [15] designed a three-dimensional target composed of three planar targets to calibrate the relative orientation between multiple depth cameras. Ni et al. [16] used Lie algebra optimization to address the relative pose estimation for multiple cameras in the context of motion-based camera calibration. In these methods, a planar target is almost indispensable.
The third approach, known as self-calibration, does not use any auxiliary tools during the calibration process. Methods of this kind calibrate the relative pose between vision sensors based on the invariance of the spatial structure in the observed scene. Specifically, Esquivel et al. [17] applied the structure from motion algorithm to recover the change in relative position across the image sequence acquired by each vision sensor and then implemented the global calibration of multiple sensors based on rigid constraints among the sensors. Lebraly et al. [18] also presented a similar algorithm. Anjum et al. [19] improved the estimation robustness of the relative orientation and position by taking the unobserved trajectory and the exit-entrance direction of each object into consideration. In addition, Mendikute et al. [20] presented a self-calibration technique for vision systems that uses redundant information from machine measurements to avoid extra mechanical anchoring or calibration means, but the method lacks versatility.
Among the aforementioned methods, the third kind of approach is the most flexible, but also has the worst calibration repeatability and accuracy. Essentially, most self-calibration methods are not designed for measurement applications. Calibration methods using customized targets can achieve moderate calibration accuracy and have a wide range of applications; however, they are usually executed in controllable test environments such as laboratories or factories. Despite the high cost of auxiliary measuring instruments, the first approach is still popular in outdoor environments due to its high precision and strong resistance to disturbance.
In some practical measurement applications, the field of view of each vision sensor is so narrow that there is no space to place a large planar target or a similar 3D target. The method presented in this paper was designed to overcome the calibration difficulties on these occasions. Unlike existing approaches, our method does not need additional targets to locate feature points for calibration. Instead, the accessory target sphere of the laser tracker is used to form the calibration points. Since the target sphere diameter can range from 10 mm to about 50 mm, accurate 3D locations of calibration points can be easily obtained in most constrained environments by freely moving the small target sphere to several positions. By substituting a small optical target sphere for planar or three-dimensional targets, our method can accomplish the extrinsic calibration of vision sensors installed in a confined workspace. Although the accurate location of the target sphere center in the global reference frame can be read directly from the laser tracker, the recovered 3D positions in the local camera coordinate system are inevitably affected by noise. Thus, the distance and reprojection constraints were introduced in our method to restrain the influence of noise.
The remainder of this paper is organized as follows. Section 2 introduces the basic calibration principle and procedure using a laser tracker and its accessory target sphere. In Section 3, the bundle adjustment for extrinsic parameters based on distance and reprojection constraints is described in detail. Section 4 presents a physical experiment conducted to verify the feasibility of the proposed approach, specifically, the accuracy improvement due to bundle adjustment based on distance and reprojection constraints. Conclusions are drawn in Section 5.

Calibration Principle
Since the local coordinate system of a vision sensor is always consistent with its camera coordinate, we hereinafter refer to the camera coordinate frame as the coordinate frame of the vision sensor. Without loss of generality, the global calibration of multiple vision sensors without overlapping FOVs is represented by the calibration of two non-overlapping cameras in this section. As illustrated in Figure 1, the relative pose between the left and right cameras is to be calibrated. For brevity, the global coordinate system established by the laser tracker is denoted by global coordinate system (GCS), and local coordinate systems (LCS) of the left and right cameras are denoted by LLCS and RLCS, respectively. The basic calibration procedure contains three steps. First, the target sphere is placed at several (at least three) different positions and then observed by the camera. The three dimensional coordinate of the sphere center in the GCS can be easily read out from the laser tracker and its coordinate in the LCS can be reconstructed according to the known radius of the sphere and its pixel coordinate in the projection image. Second, the relative position and orientation between the LCS and the GCS can be calculated after the point correspondence is given in the first step. Finally, the relative pose between the LLCS and the RLCS is computed by rigid transformation.

3D Localization of the Target Sphere Center in the LCS and the GCS
At each placement position, the coordinate of the target sphere center in the GCS can be read directly from the laser tracker. Meanwhile, the coordinate of the target sphere center in the LCS can be reconstructed through the projection contour of the target sphere on the image plane, according to Shiu et al. [21].
Specifically, the ellipse projected on the image plane by the target sphere is described by Equation (1), where (u, v) is the pixel coordinate of a point on the image ellipse. The coefficients (a to f) of the ellipse can be calculated by ellipse fitting after extracting the contour of the image ellipse. Suppose (x, y, z) is the back-projection coordinate in the LCS of a point on the image ellipse and f0 is the focal length of the camera; then Equation (1) can be rewritten as Equation (2), which can be further expressed as a quadratic form with matrix Q, Equation (3). At the ith position, the three-dimensional coordinate P_i^c of the target sphere center in the LCS is given by Equation (4), where {λ1, λ2, λ3} are the eigenvalues of matrix Q in Equation (3), and e3 = (e3x, e3y, e3z)^T is the eigenvector of matrix Q corresponding to the eigenvalue λ3. Furthermore, R0 is the radius of the target sphere. It can be seen from Equations (3) and (4) that the three-dimensional coordinates of the target sphere centers in the LCS are closely related to the size of the target sphere and its position relative to the camera.
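The reconstruction described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the fitted ellipse is assumed to be supplied as a 3×3 symmetric conic matrix C acting on (u, v, 1), with (u, v) measured from the principal point in the same units as f0, and lens distortion is assumed to have been removed already. The function name `sphere_center_from_conic` and the synthetic values are hypothetical. The distance R0·sqrt((|λ1| + |λ3|)/|λ3|) along e3 is the standard result for a right circular cone tangent to a sphere; it is used here in place of a transcription of Equation (4).

```python
import numpy as np

def sphere_center_from_conic(C, f0, R0):
    """Recover the sphere center in the camera frame from its image conic.

    C  : 3x3 symmetric matrix with [u, v, 1] C [u, v, 1]^T = 0 on the ellipse
         (assumption: u, v are measured from the principal point).
    f0 : focal length in the same units as (u, v).
    R0 : sphere radius; the result has the units of R0.
    """
    D = np.diag([f0, f0, 1.0])
    Q = D @ C @ D                    # cone of viewing rays through the ellipse
    Q = 0.5 * (Q + Q.T)              # enforce symmetry against round-off
    w, V = np.linalg.eigh(Q)         # eigenvalues in ascending order
    k = 2 if w[0] * w[1] > 0 else 0  # eigenvalue whose sign differs from the pair
    lam1, lam3 = w[1], w[k]          # w[1] always belongs to the (near) double pair
    e3 = V[:, k]                     # cone axis, defined up to sign
    if e3[2] < 0:                    # the sphere lies in front of the camera
        e3 = -e3
    dist = R0 * np.sqrt((abs(lam1) + abs(lam3)) / abs(lam3))
    return dist * e3

# Synthetic check: build the exact tangent cone of a known sphere,
# map it to an image conic (with an arbitrary scale), and recover the center.
center = np.array([50.0, -30.0, 800.0])
R0, f0 = 19.5, 12.0
Q_true = np.outer(center, center) - (center @ center - R0**2) * np.eye(3)
D_inv = np.diag([1.0 / f0, 1.0 / f0, 1.0])
C = -3.7 * (D_inv @ Q_true @ D_inv)
recovered = sphere_center_from_conic(C, f0, R0)
```

Because a conic is only defined up to scale, the sketch selects λ3 by its sign pattern rather than by its position in the sorted eigenvalue list.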

Local Calibration of the Relative Pose between the LCS and the GCS
Once the three-dimensional coordinates of the target sphere center in the LCS and the GCS at several placements are known, the local calibration of the relative pose between the LCS and the GCS can be accomplished through the following two steps. In the first step, three positions of the target sphere center are randomly selected to establish a transfer coordinate system (TCS) and obtain the initial value of the transformation matrix from the LCS to the GCS. Then, all positions of the target sphere center are involved in the optimization of the transformation matrix using a nonlinear iterative method such as the well-known Levenberg-Marquardt algorithm.
Assume that the coordinates of the target sphere center at three different positions in the LCS (LLCS or RLCS) are P_1^c, P_2^c, and P_3^c, respectively, and their counterparts in the GCS are P_1^g, P_2^g, and P_3^g. As long as the three points are not collinear, we can establish a TCS as follows:
(1) The origin Ot of the TCS in the LCS and the GCS is given by Equation (5).
(2) The x-axis Xt of the TCS in the LCS and the GCS is given by Equation (6).
(3) The z-axis Zt of the TCS in the LCS and the GCS is given by Equation (7).
(4) The y-axis Yt of the TCS is calculated as the cross product of Xt and Zt, Equation (8).
According to Equation (5), the translation vectors T_t^c and T_t^g from the TCS to the LCS and the GCS are given by Equation (9). Based on Equations (6)-(8), the rotation matrices R_t^c and R_t^g from the TCS to the LCS and the GCS are given by Equation (10). Using Equations (9) and (10), we can now calculate the relative pose (that is, the rotation matrix R_c^g and the translation vector T_c^g) between the LCS and the GCS, Equation (11). Using the rotation matrix and the translation vector given by Equation (11) as the initial value, the relative pose can be further optimized by minimizing the objective function defined by Equation (12).
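The three-point initialization described above can be illustrated with the following sketch. Several details are assumptions made for the example, since the exact definitions are given by Equations (5)-(8): the TCS origin is placed at the first point, the x-axis points from the first to the second point, the z-axis is normal to the plane of the three points, and the y-axis is taken as Zt × Xt so that the frame is right-handed. The function names `tcs_frame` and `initial_pose` are hypothetical.

```python
import numpy as np

def tcs_frame(p1, p2, p3):
    """Return (R, T): TCS axes as the columns of R, and the TCS origin T."""
    x = (p2 - p1) / np.linalg.norm(p2 - p1)
    z = np.cross(x, p3 - p1)         # normal of the plane through the points
    z = z / np.linalg.norm(z)
    y = np.cross(z, x)               # completes a right-handed frame
    return np.column_stack([x, y, z]), p1.copy()

def initial_pose(pts_c, pts_g):
    """Initial (R, T) such that P_g = R @ P_c + T, from three point pairs."""
    R_tc, T_tc = tcs_frame(*pts_c)   # TCS expressed in the camera frame
    R_tg, T_tg = tcs_frame(*pts_g)   # the same TCS expressed in the GCS
    R = R_tg @ R_tc.T
    T = T_tg - R @ T_tc
    return R, T

# Synthetic check with a known rigid transformation.
a, b = np.deg2rad(30.0), np.deg2rad(-20.0)
Rz = np.array([[np.cos(a), -np.sin(a), 0.0],
               [np.sin(a),  np.cos(a), 0.0],
               [0.0, 0.0, 1.0]])
Rx = np.array([[1.0, 0.0, 0.0],
               [0.0, np.cos(b), -np.sin(b)],
               [0.0, np.sin(b),  np.cos(b)]])
R_true, T_true = Rz @ Rx, np.array([100.0, -250.0, 1500.0])
pts_c = np.array([[10.0, 5.0, 600.0],
                  [-40.0, 30.0, 650.0],
                  [25.0, -35.0, 700.0]])
pts_g = pts_c @ R_true.T + T_true
R_est, T_est = initial_pose(pts_c, pts_g)
```

With noise-free points the estimate is exact; with real measurements it depends on which three points are chosen, which is why the text follows it with a refinement over all positions.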

Global Calibration of the Relative Pose between Vision Sensors
Suppose the rotation matrix and the translation vector between the LLCS and the GCS are R_l^g and T_l^g, and their counterparts between the RLCS and the GCS are R_r^g and T_r^g. The rotation matrix R_l^r and the translation vector T_l^r between the two cameras are then obtained by composing these two rigid transformations, where R_l^g, T_l^g, R_r^g, and T_r^g can be obtained using the method described in Section 2.2. The global calibration of the extrinsic parameters between two vision sensors is thus accomplished once the two local calibrations have been carried out.
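As a minimal sketch of this composition, assume that each local calibration maps camera coordinates into the GCS, that is, P_g = R_gl·P_l + T_gl for the left camera and P_g = R_gr·P_r + T_gr for the right camera. The direction convention and the function name `relative_pose` are assumptions for illustration; the paper's own equation may use the inverse direction.

```python
import numpy as np

def relative_pose(R_gl, T_gl, R_gr, T_gr):
    """Pose that maps left-camera coordinates into the right-camera frame.

    Assumes P_g = R_gl @ P_l + T_gl and P_g = R_gr @ P_r + T_gr.
    Returns (R_rl, T_rl) with P_r = R_rl @ P_l + T_rl.
    """
    R_rl = R_gr.T @ R_gl
    T_rl = R_gr.T @ (T_gl - T_gr)
    return R_rl, T_rl

def rot_y(angle_deg):
    """Rotation about the y-axis, used only to build the toy example."""
    a = np.deg2rad(angle_deg)
    return np.array([[np.cos(a), 0.0, np.sin(a)],
                     [0.0, 1.0, 0.0],
                     [-np.sin(a), 0.0, np.cos(a)]])

# Toy example: one global point seen in both camera frames.
R_gl, T_gl = rot_y(15.0), np.array([0.0, 100.0, -300.0])
R_gr, T_gr = rot_y(-50.0), np.array([1500.0, 80.0, 200.0])
P_g = np.array([400.0, -20.0, 900.0])
P_l = R_gl.T @ (P_g - T_gl)
P_r = R_gr.T @ (P_g - T_gr)
R_rl, T_rl = relative_pose(R_gl, T_gl, R_gr, T_gr)
P_r_from_left = R_rl @ P_l + T_rl
```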

Bundle Adjustment Based on Distance and Reprojection Constraints
In Section 2, we described the global calibration procedure for multiple vision sensors without an overlapping field of view in detail. Nevertheless, the practical image of the target sphere usually deteriorates to some extent due to the variations of the environmental illumination and the observation angle of the vision sensor. Furthermore, the uncertainties of the feature extraction during the sphere image processing and the elliptic fitting process also introduce errors into the final three-dimensional location of the target sphere center. Position errors of the target sphere center in the LCS will inevitably reduce the extrinsic calibration accuracy.
To reduce the adverse impact of deviations in the three-dimensional positions of the target sphere in the LCS, two constraints are taken into consideration. One is the distance between different positions of the target sphere center in the GCS; the other concerns the image projection of the target sphere center, in addition to that of the sphere contour.
Assume that the three-dimensional coordinates of the target sphere center in the GCS and the LCS at the ith position are P_i^g and P_i^c, respectively, and that the deviation vector between the reconstructed coordinates of the target sphere center and the real ones at the ith position in the LCS is ∆P_i^c. Then, we can optimize the local calibration by bundle adjustment based on the distance and reprojection constraints, with the objective function given by Equation (15). In this objective function, the squared-norm term indexed by pairs of positions (i, j) represents the distance constraint between the three-dimensional coordinates of target sphere centers in the LCS and the GCS. The term di² represents the squared distance between the image projection point of the target sphere center and the major axis of the projection ellipse of the target sphere, and ei² represents the reprojection errors of the reconstructed target sphere center positions.
Assume that the camera intrinsic parameters fx, fy, u0, and v0 and the distortion parameters k1 and k2 are known, and that the line of the major axis of the projection ellipse of the target sphere is l(l1, −1, l2); then di is given by Equation (16). Based on the conclusion of Sun et al. [22], the image projection point of the target sphere center should be located on the major axis of the projection ellipse of the target sphere, that is, the ideal value of di should be zero. Additionally, the principal point of the image should be on the major axis of the projection ellipse of the target sphere, according to Daucher et al. [23]. We can therefore obtain the line equation of the major axis from the known principal point of the image (u0, v0) and the geometric center of the projection ellipse (uc, vc), Equation (17), where the coordinates of the geometric center of the projection ellipse are given by Equation (18) and the coefficients a, b, c, d, and e are defined in Equation (1).
Additionally, the error function ei represents the distance between the extracted ellipse feature points (uij, vij) and the reprojection ellipse at the ith position, as defined by Equation (19), where M represents the number of extracted feature points on the ith projection image. Given the target sphere center (P_i^c − ∆P_i^c), the radius of the sphere R0, and the focal length f0, the reprojection error ei can be calculated by the method proposed by Shiu et al. [21].
During the bundle adjustment process, ∆P_i^c is set to zero as its initial value, and the other parameters can be initialized using the method described in Section 2. From Equations (15)-(19), the optimal estimate of the rotation matrix and translation vector between the LCS and the GCS can be obtained by means of the large-scale trust-region reflective algorithm. After the precise local calibration, the precise calibration between vision sensors can be realized using the method described in Section 2.3.
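The point-to-axis term di described above can be illustrated as follows. The sketch makes several assumptions that are not taken from the paper: the conic is written as a·u² + b·u·v + c·v² + d·u + e·v + f = 0 (the coefficient ordering in Equation (1) may differ), the sphere center is projected with an ideal pinhole model without distortion, and the major axis is represented as a homogeneous line through the principal point and the ellipse center instead of the slope form l(l1, −1, l2), which avoids a division by zero for a vertical axis. The function names are hypothetical.

```python
import numpy as np

def ellipse_center(a, b, c, d, e):
    """Center of a*u^2 + b*u*v + c*v^2 + d*u + e*v + f = 0 (assumed ordering)."""
    den = 4.0 * a * c - b * b
    return np.array([(b * e - 2.0 * c * d) / den,
                     (b * d - 2.0 * a * e) / den])

def axis_distance(P_c, fx, fy, u0, v0, center_uv):
    """Distance from the projected sphere center to the line through the
    principal point and the ellipse center (lens distortion ignored)."""
    u = fx * P_c[0] / P_c[2] + u0
    v = fy * P_c[1] / P_c[2] + v0
    # Homogeneous line through two image points.
    line = np.cross([u0, v0, 1.0], [center_uv[0], center_uv[1], 1.0])
    return abs(line @ np.array([u, v, 1.0])) / np.hypot(line[0], line[1])

# Ellipse (u - 3)^2 / 4 + (v + 2)^2 = 1, expanded: u^2 + 4v^2 - 6u + 16v + 21 = 0.
uc, vc = ellipse_center(1.0, 0.0, 4.0, -6.0, 16.0)

# A sphere center that projects to (805, 437) with these toy intrinsics.
P_c = np.array([50.0, -30.0, 800.0])
d_on_axis = axis_distance(P_c, 2000.0, 2000.0, 680.0, 512.0, (930.0, 362.0))
d_off_axis = axis_distance(P_c, 2000.0, 2000.0, 680.0, 512.0, (780.0, 512.0))
```

In the first call the ellipse center lies on the ray from the principal point through the projected sphere center, so the distance vanishes; in the second call the axis is the horizontal line v = 512 and the distance is 75 pixels.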

Experimental Results
To verify the feasibility of the proposed method, a typical vision measurement system with two non-overlapping vision sensors was established, as illustrated in Figure 2. The two vision sensors were both AVT GC1380H digital cameras with 12 mm Schneider lenses. The image resolution of each vision sensor was 1360 × 1024. The reference GCS was built on a Leica API-T3 laser tracker, and the radius of its target sphere was 19.5 mm. The LLCS and RLCS coincided with the corresponding local camera coordinate systems. During the experiment, the target sphere was moved several times (no fewer than three) within the FOV of each camera. The target sphere positions should cover as much of the measurement range of each camera as possible. For each position of the target sphere, the laser tracker took 10 samples of the sphere center coordinates, and their mean was used as the precise coordinate of the target sphere center in the GCS, thus reducing the positioning noise of the target sphere center. Similarly, for each position of the target sphere, the camera captured 10 projection images of the target sphere to reduce the influence of illumination on the image quality.
In the following section, the ellipse detection and fitting results in the projection images of the target sphere are first introduced. Then, the 3D positions of the target sphere center in the LCS and the GCS at each position are given. Finally, the accuracy of the global calibration results before and after the bundle adjustment are analyzed and compared.

Ellipse Detection and Fitting in the Projection Image of the Target Sphere
After the projection images of the target sphere were captured, edge feature points were selectively extracted using the Harris detector. Then, the ellipse center of the projected sphere was located and the initial estimate of the ellipse equation coefficients was calculated using the method proposed by Bennett et al. [24]. Finally, the parameters of the projection ellipse equation were further optimized using a least-squares fitting algorithm. Figures 3 and 4 exemplify the ellipse detection and fitting results for the left and right projection images, respectively.
It is noteworthy that the distortion of all extracted feature points should be corrected before they are used to locate the ellipse center and further applied to fit the ellipse equation.

3D Localization of the Target Sphere Center in the LCS and GCS
The intrinsic parameters of the left and right cameras were obtained using the method presented by Zhang [2] and are shown in Table 1. For each camera, the target sphere was placed at ten different positions. The 3D coordinates of the target sphere center in the GCS read by the laser tracker, together with their counterparts in the LLCS and the RLCS reconstructed by Equation (4) with R0 = 19.5 mm, are shown in Table 2.
Table 2. 3D coordinates of the target sphere center in the GCS and LCS (mm).

Results Analysis and Accuracy Comparison
Given the three-dimensional coordinates of the target sphere center in the GCS and LCS, the initial global calibration of the relative pose between the LLCS and the RLCS can be realized using the method described in Section 2.3. After the bundle adjustment based on distance and reprojection constraints, the rectified 3D coordinates of the target sphere center in the LLCS and RLCS are shown in Table 3. The extrinsic parameters between the LLCS and RLCS before and after the bundle adjustment are shown in Table 4.
Table 3. Rectified 3D coordinates of the target sphere center in the LLCS and RLCS (mm).
Using the distances between the left and right points in the GCS as the reference distances, we can compute the corresponding distance errors using the points in the LCS. According to the calibration results, the root mean square error of the 100 distances was about 3.5 mm before optimization and 0.8 mm after the bundle adjustment, that is, the error was reduced by a factor of more than four (3.5/0.8 ≈ 4.4). The error curves calculated from the points before and after the bundle adjustment are displayed in Figure 5.
Meanwhile, the reprojection errors using the initial and rectified 3D points in the LLCS and RLCS were also calculated and are shown in Figures 6 and 7, respectively. From Figures 6 and 7, we can see that the distance errors between the extracted feature points and the reprojection ellipse of each sphere center point were all reduced to some extent after the bundle adjustment.
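The distance-error evaluation used in this section can be sketched as follows. The sketch assumes that the calibrated extrinsics map left-camera coordinates into the right-camera frame (P_r = R_rl·P_l + T_rl) and that every left position is paired with every right position, which gives 10 × 10 = 100 distances for ten positions per camera. The function name `cross_distance_rmse` and the synthetic data are hypothetical; they are not the measurements reported in the tables.

```python
import numpy as np

def cross_distance_rmse(left_l, right_r, left_g, right_g, R_rl, T_rl):
    """RMS error of all left-right distances against the GCS reference.

    left_l, right_r : (N, 3) and (M, 3) sphere centers in the LLCS and RLCS.
    left_g, right_g : the same positions measured in the GCS.
    R_rl, T_rl      : extrinsics with P_r = R_rl @ P_l + T_rl (assumed convention).
    """
    left_r = left_l @ R_rl.T + T_rl   # left points expressed in the RLCS
    d_cam = np.linalg.norm(left_r[:, None, :] - right_r[None, :, :], axis=2)
    d_ref = np.linalg.norm(left_g[:, None, :] - right_g[None, :, :], axis=2)
    return np.sqrt(np.mean((d_cam - d_ref) ** 2))

# Synthetic setup in which the GCS coincides with the LLCS.
rng = np.random.default_rng(0)
left_l = rng.uniform(-100.0, 100.0, (10, 3)) + [0.0, 0.0, 800.0]
right_r = rng.uniform(-100.0, 100.0, (10, 3)) + [0.0, 0.0, 800.0]
a = np.deg2rad(40.0)
R_gr = np.array([[np.cos(a), 0.0, np.sin(a)],
                 [0.0, 1.0, 0.0],
                 [-np.sin(a), 0.0, np.cos(a)]])
T_gr = np.array([1500.0, 0.0, 200.0])
left_g = left_l
right_g = right_r @ R_gr.T + T_gr
R_rl, T_rl = R_gr.T, -R_gr.T @ T_gr

rmse_exact = cross_distance_rmse(left_l, right_r, left_g, right_g, R_rl, T_rl)
rmse_biased = cross_distance_rmse(left_l, right_r, left_g, right_g,
                                  R_rl, T_rl + [2.0, 0.0, 0.0])
```

With exact extrinsics the error vanishes up to round-off, while a 2 mm bias in the translation produces a distance error on the millimetre scale, which is the kind of effect the comparison before and after bundle adjustment is meant to expose.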

Conclusions
In this paper, we proposed a flexible and accurate method for the global calibration of vision sensors without overlapping fields of view. The presented method uses a laser tracker to establish the reference coordinate system and its accessory target sphere as the calibration medium. By fitting the projection ellipse of the sphere, the three-dimensional coordinates of the target sphere center can be recovered and the initial global calibration can be accomplished. The calibration of the extrinsic parameters is further optimized via the bundle adjustment algorithm after the distance and reprojection constraints are introduced. For two point sets about 1500 mm apart, the experimental results showed that the distance error decreased from 3.5 mm to 0.8 mm after the optimization. The presented method is therefore feasible and flexible for the global calibration of vision sensors, especially when the vision sensors are in a confined workspace where common planar targets are awkward to place but a small target sphere is easy to handle.