Abstract
Three-dimensional (3D) triangulation based on active binocular vision has an increasing number of applications in computer vision and robotics. An active binocular vision system with non-fixed cameras needs to calibrate the stereo extrinsic parameters online to perform 3D triangulation, and the accuracy of the stereo extrinsic parameters and of the disparity has a significant impact on the triangulation precision. To reduce this impact, we propose a novel eye gaze based 3D triangulation method that does not use the stereo extrinsic parameters directly. Instead, we drive both cameras to gaze at a 3D spatial point P through visual servoing, so that P stays on the optical axis of each camera. We can then obtain the 3D coordinates of P from the intersection of the two optical axes of the cameras. We performed experiments on our robotic bionic eyes to compare with previous disparity based work, the integrated two-pose calibration (ITPC) method. The experiments show that our method achieves results comparable to ITPC.
1. Introduction
Active binocular vision refers to a binocular vision system that can actively change its view direction. It is beneficial for many applications such as manipulation [], three-dimensional (3D) reconstruction [], navigation [], 3D mapping [], and so on. 3D coordinate estimation based on active binocular vision has therefore attracted extensive research interest.
Active binocular vision systems can be divided into two categories: systems with fixed cameras and systems with non-fixed cameras. The vision systems of ASIMO [], HRP-3 [], HRP-4 [], PR2 [], ATLAS [], and Walkman [] belong to the first category. However, this category cannot perceive objects that are very close to the cameras. The second category, which is similar to the human visual system, clearly improves flexibility and extends the field of view. The vision systems of the ESCHeR head [], Yorick head [], MAC-EYE robot eyes [], iCub head [], ARMAR-III head [], Flobi head [], Zhang XL’s robot eyes [], ARMAR-4 head [], Muecas [], Romeo [], SARA [], and our anthropomorphic robotic bionic eyes [,,] belong to the second category.
Usually, 3D triangulation is performed using disparity and stereo extrinsic parameters [,,,,,,]. The stereo extrinsic parameters are calibrated offline [,] or online [,,,,,]. Active vision systems of the first category only need offline calibration because one camera is static with respect to (w.r.t.) the other, so the stereo extrinsic parameters are fixed. The stereo extrinsic parameters of the second category need to be calibrated online because they may change at any time. In [], Chen et al. proposed an integrated two-pose calibration (ITPC) method that calculates the stereo extrinsic parameters online using forward kinematics and the calibrated head-eye parameters.
However, in disparity based triangulation, errors in the stereo extrinsic parameters and in the disparity have a significant impact on the 3D triangulation result []. In [], Dang et al. analyzed how errors in the stereo extrinsic parameters generate baseline and pixel errors, while the disparity error arises from image detection and matching. The baseline error, pixel errors, and disparity error directly affect the 3D triangulation precision. To reduce this impact, this paper proposes a novel eye gaze based 3D triangulation method that does not use the stereo extrinsic parameters directly. In our method, the 2D image of a spatial point P in each camera is kept at the principal point through eye gazing (visual servoing); in other words, P is always at the fixation point of the two cameras. We can then obtain the 3D coordinates of P from the intersection of the two optical axes. In real applications, the fixation point is represented by the middle point of the common perpendicular line segment of the two skew optical axes. Figure 1 illustrates the idea. One advantage of eye gaze based triangulation is that the stereo extrinsic parameters are no longer needed directly. Another advantage is that its tolerance to image detection error is higher than that of the disparity based triangulation method.

Figure 1.
In our proposed method, we use eye gazing (visual servoing) to actively adjust the view direction of each camera so that the 2D image of a three-dimensional (3D) point P stays at the principal point. In other words, P is located at the fixation point. 3D triangulation can then be performed through the intersection of the two cameras’ optical axes.
The main contribution of this paper is that, by keeping the target point P on the optical axis of each camera through eye gazing, we can perform 3D triangulation by calculating the intersection point of the two cameras’ optical axes instead of solving the projection equations of the two cameras, which rely on disparity and stereo extrinsic parameters.
2. System Configuration
The anthropomorphic robotic bionic eyes that we designed for human-like active stereo perception (shown in Figure 2) are described in detail in []. This is a revised version with seven degrees of freedom (DOFs). It has two eyes and one neck. Each eye has two DOFs (tilt and pan), and the neck has three DOFs. The joints of the neck and eyes are serially connected, so the links form ordered open chains in which each joint is the load of the previous joint. As shown in Figure 3, our robotic bionic eyes contain two serial open chains, one for each eye. Each eye is equipped with a USB camera with a resolution of 640 × 480 at 120 fps. A common trigger signal is connected to both cameras in order to realize hardware synchronization.

Figure 2.
The anthropomorphic robotic bionic eyes.

Figure 3.
The definition of the robot bionic eyes’ coordinate system.
We define the coordinate systems based on the standard D-H convention. Figure 3 shows the frame definitions, and Table 1 lists the D-H parameters of the robotic bionic eyes. The transformation matrix between Frame i−1 and Frame i can be calculated from the D-H parameters [] (a sketch of this computation is given after Table 1).

Table 1.
The D-H parameters of the robotic bionic eyes. (The parameters d_i, θ_i, a_i, and α_i are the link offset, joint angle, link length, and link twist, respectively. The i-th joint offset is the value of θ_i in the initial state. The six link constants are 64.27 mm, 11.00 mm, 44.80 mm, 47.20 mm, 13.80 mm, and 30.33 mm.)
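As a reading aid, the sketch below shows how the transform between consecutive frames can be assembled in code, following the classic D-H convention of Paul [] that the text cites; the function name and the use of NumPy are our own choices and not part of the original implementation.

```python
import numpy as np

def dh_transform(theta, d, a, alpha):
    """Homogeneous transform from Frame i-1 to Frame i (classic D-H convention)."""
    ct, st = np.cos(theta), np.sin(theta)
    ca, sa = np.cos(alpha), np.sin(alpha)
    return np.array([
        [ct, -st * ca,  st * sa, a * ct],
        [st,  ct * ca, -ct * sa, a * st],
        [0.0,      sa,       ca,      d],
        [0.0,     0.0,      0.0,    1.0],
    ])

# Chaining the per-joint transforms gives the pose of the last frame w.r.t. the base:
# T_0_n = dh_transform(q1, d1, a1, alpha1) @ ... @ dh_transform(qn, dn, an, alphan)
```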
3. Eye Gaze Based 3D Triangulation Method
The main idea of the eye gaze [,,,] based 3D triangulation method is to control the view direction of both cameras so that a 3D spatial point P stays on the optical axis of each camera. Since the pose of each camera can be obtained through forward kinematics, we can calculate the 3D coordinates of P by computing the intersection point of the two optical axes. We also know that once P lies on the optical axis of a camera, its image is located at the principal point of the 2D image. Accordingly, we perform image based visual servoing to change the pose of both cameras so that the images of P stay at the principal points of the two cameras and, consequently, P lies on both optical axes.
In this section, we first introduce the visual servoing based approach for eye gazing, and then we show how the intersection point is calculated.
3.1. Visual Servoing Based Eye Gazing
We use image based visual servoing to control the pose of each camera. We first derive the relationship between the velocity of the image feature points and the joint velocities. We then present a simple visual servoing control law that generates commands to drive the feature point error to zero.
3.1.1. Relation Matrix
Here, we use a relation matrix to describe the relationship between the velocity of the image feature points and the joint velocities. The relation matrix can be decomposed into the product of an Interaction matrix and a Jacobian matrix. The Interaction matrix relates the velocity of an image feature point to the velocity of the camera, and the Jacobian matrix relates the velocity of the camera to the joint velocities.
where the three quantities are the velocity of an image feature point, the velocity of the camera, and the Interaction matrix, respectively. The Interaction matrix is calculated, as follows:
where the two focal lengths are those in the X and Y directions of the camera, respectively, and Z is the depth of the spatial point P in the camera frame.
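The displayed expression for the Interaction matrix is not reproduced here; as a reference point, the following sketch implements the standard point-feature interaction matrix from the visual servoing literature, written for pixel velocities using the focal lengths and depth mentioned above. The sign and ordering conventions of the paper's own expression may differ, and the function name is ours.

```python
import numpy as np

def interaction_matrix(u, v, Z, fx, fy, u0, v0):
    """Standard 2x6 interaction matrix of a point feature, relating pixel
    velocities (du/dt, dv/dt) to the camera twist (vx, vy, vz, wx, wy, wz)
    expressed in the camera frame."""
    # normalized image coordinates
    x = (u - u0) / fx
    y = (v - v0) / fy
    L_norm = np.array([
        [-1.0 / Z, 0.0,       x / Z, x * y,        -(1.0 + x * x),  y],
        [0.0,      -1.0 / Z,  y / Z, 1.0 + y * y,  -x * y,         -x],
    ])
    # scale the rows so that the output is in pixels rather than normalized units
    return np.diag([fx, fy]) @ L_norm
```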
The relationship between the camera velocity and the joint velocities can be described, as follows:
where the matrix involved is the Jacobian matrix w.r.t. the camera frame, and it is defined as:
where the first factor is the rotation matrix of the base frame w.r.t. the camera frame (applied to both the linear and the angular parts), and the second factor is the Jacobian matrix w.r.t. the base frame. Its columns can be calculated, as follows:
where the two vectors are the direction of the Z axis and the origin of the corresponding joint frame, expressed w.r.t. the base frame, respectively.
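For concreteness, a sketch of this construction is given below: the standard geometric Jacobian of a serial chain of revolute joints, built from the Z-axis directions and origins of the joint frames, followed by the block-diagonal rotation into the camera frame described above. Function and variable names are ours, and the exact indexing in the paper may differ.

```python
import numpy as np

def geometric_jacobian(frame_transforms):
    """Base-frame Jacobian of a serial chain of revolute joints.
    frame_transforms[i] is the 4x4 pose of Frame i w.r.t. the base frame,
    with frame_transforms[0] = identity; the last entry is the camera frame."""
    o_n = frame_transforms[-1][:3, 3]          # origin of the end (camera) frame
    cols = []
    for T in frame_transforms[:-1]:
        z = T[:3, 2]                           # Z-axis direction of the joint frame
        o = T[:3, 3]                           # origin of the joint frame
        cols.append(np.hstack((np.cross(z, o_n - o), z)))  # [linear part; angular part]
    return np.array(cols).T                    # shape 6 x n

def jacobian_in_camera_frame(J_base, R_cam_base):
    """Rotate the base-frame Jacobian into the camera frame
    (block-diagonal rotation applied to linear and angular parts)."""
    R = np.zeros((6, 6))
    R[:3, :3] = R_cam_base
    R[3:, 3:] = R_cam_base
    return R @ J_base
```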
Accordingly, the relationship between the velocity of an image feature point and the joint velocity can be formulated, as follows:
3.1.2. Image Based Visual Servoing Control Law
We use a simple image based visual servoing control law to control the motion of the camera so as to reduce the error between the expected image point and the actual image point of the 3D spatial point P. The expected image point is set to the principal point, which can be obtained from the intrinsic parameters, and the actual image point is given by the pixel coordinates of P in the camera image.
where the gain matrix contains two gains that control how fast the pixel errors in the X and Y directions are regulated to zero independently [], and the error is the difference between the actual and the expected image points. From Equation (7), we can infer that the error will decay to zero exponentially, with the gains acting as the exponential coefficients.
The above equation can be rewritten for the left eye and the right eye, respectively. For the left eye, we can derive
where the two image points are the actual and the expected pixel coordinates of the spatial point P in the left camera, respectively, and the remaining vector is the joint velocity.
We fix the neck joints and establish the base frame at link 3, so the joint velocity vector reduces to the eye joint velocities:
From Equations (10) and (11), the velocities of left eye joints can be calculated, as follows:
where the subscripts denote the element in the i-th row and j-th column of the corresponding relation matrix.
Similarly, the velocities of the right eye joints can be calculated, as follows:
In each loop of the visual servoing, we send the generated eye joint velocity commands to drive the robotic bionic eyes toward the view direction that reduces the error between the actual and the expected image points in both cameras.
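A minimal sketch of one such servoing iteration is given below. It reuses the interaction_matrix function from the earlier sketch, assumes that the camera-frame Jacobian has already been restricted to the two joints of one eye, and uses hypothetical gain values; it illustrates the control law above rather than reproducing the exact implementation.

```python
import numpy as np

def eye_gaze_step(u, v, u0, v0, Z, fx, fy, J_cam_eye, lam_u=1.0, lam_v=1.0):
    """One image based visual servoing step for a single eye.
    J_cam_eye: 6x2 camera-frame Jacobian restricted to the eye's tilt/pan joints."""
    e = np.array([u - u0, v - v0])                    # pixel error w.r.t. the principal point
    L = interaction_matrix(u, v, Z, fx, fy, u0, v0)   # 2x6, from the sketch above
    M = L @ J_cam_eye                                 # 2x2 relation matrix
    gains = np.diag([lam_u, lam_v])
    # joint velocity command that makes the pixel error decay exponentially
    return -np.linalg.pinv(M) @ gains @ e             # [tilt_rate, pan_rate]
```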
3.2. 3D Triangulation by Calculation of Intersection Point
After visual servoing, the point P lies on the optical axis of each camera, and theoretically the intersection point of the two optical axes is the representation of P. We can calculate the intersection point geometrically once the equations of the two optical axes are obtained. Because each optical axis is the Z axis of the corresponding camera frame, we can derive the descriptions of the two optical axes through forward kinematics. In a real situation, the two optical axes may not intersect each other, so we use the middle point of the common perpendicular line segment of the two skew optical axes as the representation of the intersection point instead.
3.2.1. Calculation of the Coordinates of the Optical Axes
First, we derive the coordinates of the two optical axes. The optical axis of each camera is the Z axis of the camera frame expressed w.r.t. the neck frame N, so we need to derive the transformation matrices of the two camera frames w.r.t. frame N through forward kinematics, i.e., the pose of the left camera and the pose of the right camera.
where the head-eye parameters of the left camera can be calculated using the method in [], and the transformation matrix of the left eye frame w.r.t. the neck frame N can be obtained using the D-H parameters of the robotic bionic eyes, as follows:
The same holds for the right camera.
where the head-eye parameters of the right camera appear together with the transformation matrix of the right eye frame w.r.t. the neck frame N, which can be obtained, as follows:
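As an illustration, the optical axis of one camera can be recovered in frame N as sketched below: the eye pose from the D-H forward kinematics is composed with the calibrated head-eye transform, and the axis is read off as the translation and Z column of the resulting pose. The composition order shown assumes the head-eye transform maps camera coordinates to eye coordinates; names are ours.

```python
import numpy as np

def optical_axis_in_neck_frame(T_neck_eye, T_eye_camera):
    """Return (origin, unit direction) of a camera's optical axis in frame N.
    T_neck_eye:   4x4 pose of the eye frame w.r.t. the neck frame (from D-H / joint feedback)
    T_eye_camera: 4x4 calibrated head-eye transform (camera frame w.r.t. the eye frame)"""
    T_neck_camera = T_neck_eye @ T_eye_camera
    origin = T_neck_camera[:3, 3]       # camera optical center in frame N
    direction = T_neck_camera[:3, 2]    # camera Z axis (optical axis direction) in frame N
    return origin, direction / np.linalg.norm(direction)
```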
3.2.2. Calculation of the Intersection Point
We now calculate the intersection point of the two optical axes. The ideal optical axes intersect at P, but the actual optical axes no longer intersect each other due to measurement errors; that is, they may not lie in the same plane. We therefore use the middle point of their common perpendicular line segment as the representation of P instead, as shown in Figure 4.

Figure 4.
The actual model of our proposed method.
In order to obtain P, we first compute the foot of the common perpendicular on each optical axis. The foot on the left axis is the intersection of the left axis with the plane derived from the right axis and the common perpendicular direction; the foot on the right axis is the intersection of the right axis with the plane derived from the left axis and the common perpendicular direction.
To build these planes, we introduce two points on each optical axis and express their coordinates in frame N (in the following statements, all parameters are represented in frame N). The direction of each axis is obtained from its two points, the common perpendicular direction is the cross product of the two axis directions, and the normal of each plane is the cross product of the corresponding axis direction with the common perpendicular direction. Each plane equation then follows from its normal and a point on the corresponding axis.
Intersecting the left axis with the plane built on the right axis, and the right axis with the plane built on the left axis, yields the two feet of the common perpendicular. Finally, the fixation point P is obtained as the middle point of these two feet.
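The construction above can be condensed into a short sketch: the common perpendicular direction is the cross product of the two axis directions, each axis is intersected with the plane spanned by the other axis and that direction, and P is taken as the midpoint of the two intersections. Variable names are ours, and the sketch assumes the two axes are not parallel.

```python
import numpy as np

def fixation_point(o_l, d_l, o_r, d_r):
    """Midpoint of the common perpendicular segment of two (possibly skew) lines.
    o_l, d_l: origin and unit direction of the left optical axis (frame N)
    o_r, d_r: origin and unit direction of the right optical axis (frame N)
    Assumes the two axes are not parallel."""
    n = np.cross(d_l, d_r)                        # common perpendicular direction

    def line_plane_intersection(o, d, p0, normal):
        # point on the line o + t*d lying in the plane through p0 with the given normal
        t = np.dot(normal, p0 - o) / np.dot(normal, d)
        return o + t * d

    # plane containing the right axis and the common perpendicular, hit with the left axis
    p_l = line_plane_intersection(o_l, d_l, o_r, np.cross(d_r, n))
    # plane containing the left axis and the common perpendicular, hit with the right axis
    p_r = line_plane_intersection(o_r, d_r, o_l, np.cross(d_l, n))
    return 0.5 * (p_l + p_r)
```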
Hence, the eye gaze based 3D triangulation Algorithm 1 can be summarized, as follows:
Algorithm 1: Eye Gaze Based 3D Triangulation Method.
4. Propagation of Uncertainty
Uncertainty [,,] is a quantification of the doubt about the validity of measurement results. The uncertainties of our method and of the ITPC method are mainly propagated from the image detection uncertainty. For our method, the image detection uncertainty is sometimes so small that it produces no motion of the optical axes, because the joint feedback resolution is not as fine as the image coordinates. As a consequence, we cannot measure the uncertainty of P directly, because the coordinates of P do not vary. To solve this problem, we instead use the law of propagation of uncertainty [,] to calculate the theoretical uncertainties of the point P obtained from the two methods, based on the image detection uncertainty.
4.1. Image Detection Uncertainty
The image detection uncertainties in the X and Y directions of the image can be calculated, as follows:
where u_i and v_i are the i-th (i = 1, 2, …, n) independent observations of the image coordinates u and v, respectively, and ū and v̄ are the averages of the n observations of u and v, respectively.
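As a sketch, each image detection uncertainty can be evaluated from the n repeated observations as in the GUM's Type A evaluation; whether the paper uses the standard deviation of the observations or of their mean is not restated here, so the choice below (standard deviation of the mean) is an assumption, and the function name is ours.

```python
import numpy as np

def detection_uncertainty(samples):
    """Experimental standard deviation of the mean of n repeated observations
    (Type A evaluation in the GUM sense)."""
    samples = np.asarray(samples, dtype=float)
    n = samples.size
    s = np.sqrt(np.sum((samples - samples.mean()) ** 2) / (n - 1))  # sample standard deviation
    return s / np.sqrt(n)

# sigma_u = detection_uncertainty(u_observations)
# sigma_v = detection_uncertainty(v_observations)
```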
4.2. Uncertainty of P Using Our Method
If eye gazing converges at time t, the image point of P coincides with the principal point in both cameras. After time t, the eye joint angles θ_i (i = 4, 5, 7, 8) vary with the image coordinates of P in both cameras through eye gazing. For the left eye, the joint angles θ_4 and θ_5 can be formulated, as follows:
where the image coordinates of P in the left camera, the principal point of the left camera, and the focal lengths in the X and Y directions of the left camera appear as parameters.
The uncertainties of the left eye joint angles θ_4 and θ_5 can be calculated by propagating the uncertainties of the image of P in the left camera through the partial derivatives of Equations (27) and (28), as follows:
where the image detection uncertainties can be obtained using Equations (25) and (26), and the average image coordinates over the n observations are used in place of the instantaneous image coordinates when evaluating the partial derivatives. The propagation of the uncertainties of the right eye joint angles θ_7 and θ_8 is similar.
From the previous section, the 3D point P can be triangulated by Equations (22)–(24). The uncertainties of P can then be calculated by propagating the uncertainties of the eye joint angles θ_i (i = 4, 5, 7, 8) through the partial derivatives of Equation (24). The uncertainties of P using our proposed method can be formulated, as follows:
In the above equations, we use the average of the n independent repeated observations of the i-th (i = 4, 5, 7, 8) joint angle feedback as the value of θ_i when evaluating the partial derivatives.
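Because the closed-form partial derivatives are lengthy, the sketch below shows a numerically equivalent propagation: the input uncertainties (joint angles for our method, image coordinates for ITPC) are pushed through a generic triangulation function with a finite-difference Jacobian, assuming uncorrelated inputs as in the first-order law of propagation of uncertainty. The function names are hypothetical.

```python
import numpy as np

def propagate_uncertainty(triangulate, x, sigma_x, eps=1e-6):
    """First-order propagation of uncorrelated input uncertainties through a
    triangulation function P = triangulate(x), via a finite-difference Jacobian.
    x:       nominal inputs (e.g., joint angles for our method, image coordinates for ITPC)
    sigma_x: standard uncertainties of the inputs (same length as x)"""
    x = np.asarray(x, dtype=float)
    P0 = np.asarray(triangulate(x), dtype=float)
    J = np.zeros((P0.size, x.size))
    for i in range(x.size):
        dx = np.zeros_like(x)
        dx[i] = eps
        J[:, i] = (np.asarray(triangulate(x + dx)) - P0) / eps   # dP / dx_i
    # variance of each coordinate of P, assuming uncorrelated inputs
    var_P = (J ** 2) @ (np.asarray(sigma_x, dtype=float) ** 2)
    return np.sqrt(var_P)   # standard uncertainties in X, Y, Z
```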
4.3. Uncertainty of P Using ITPC Method
In the ITPC method, the eye joint angles θ_i (i = 4, 5, 7, 8) do not change after time t; the stereo extrinsic parameters are calculated from the joint angle feedback at time t. P is then triangulated using the intrinsic parameters of both cameras, the stereo extrinsic parameters, and the image coordinates of P in both cameras. The uncertainties of P using the ITPC method can be calculated by propagating the uncertainties of the image coordinates in both cameras through partial derivatives:
where the image detection uncertainties can be calculated using Equations (25) and (26), and the averages of the n observations of the image coordinates in both cameras are used when evaluating the partial derivatives.
5. Experiments and Results
To evaluate the eye gaze based 3D triangulation method, we first compare its triangulation performance with that of the ITPC method in simulated and physical experiments. After that, we carry out precision experiments against the ZED Mini stereo system, which has fixed cameras. Finally, we perform experiments to measure the time response of our method.
5.1. Comparison Experiments with the ITPC Method
We will compare our proposed method with the ITPC method in both simulated and physical experiments. Static target points as well as moving target points will be triangulated to see the performance of the two methods.
5.1.1. Simulated Experiments
We perform two comparison experiments in the simulation environment: the first triangulates static target points and the second triangulates a moving target point.
Experimental Setup
In simulated experiments, we use GAZEBO [] in order to simulate the kinematics of the robotic bionic eyes with the same mechanical model as the physical robotic bionic eyes. We also use the same joint controller under the ROS framework to send joint velocity commands to GAZEBO and get the joint angle feedback as well as image feedback from both eyes.
The image size of the left and right cameras is 1040 × 860 pixels. The intrinsic parameters, obtained using the Bouguet toolbox [], are shown in Table 2. The checkerboard corner points are detected with sub-pixel accuracy. We choose the corner point at the upper right of the checkerboard as the spatial point P. The expected pixel coordinates of P in both cameras are fixed at the principal point (520.50, 430.50). Figure 5 shows the images of P in the left and right cameras.

Table 2.
The intrinsic parameters of the left and right simulated cameras.

Figure 5.
The images of P in the left and right simulated cameras. (a) left image; (b) right image.
Triangulation of Static Target Points
In this experiment, we obtained 21 different static target points by placing the simulated checkerboard at 21 different positions. At each position, we recorded the 3D coordinates w.r.t. the base frame N estimated by our proposed method and by the ITPC method, respectively. The ground truth was obtained from GAZEBO (as shown in Table 3).

Table 3.
The ground truth of the 3D coordinates of the 21 static target points.
Error Analysis: we calculated the absolute errors of the mean of the estimated 3D coordinates w.r.t. the ground truth for our proposed method and the ITPC method, respectively (as shown in Figure 6). The absolute errors of both methods in the simulated experiments stem mainly from errors in the D-H parameters, the intrinsic parameters, and the head-eye parameters. As shown in Figure 6, the absolute errors in the X axis of both methods are between 0.73 mm and 3.96 mm, and the absolute errors in the Y axis are between 1.14 mm and 6.18 mm. The minimum absolute errors in the Z axis of the two methods are only 0.02 mm and 0.04 mm, respectively, when the Z ground truth is 1100 mm, and the maximum absolute errors in the Z axis reach only 0.5% of the ground-truth depth. In conclusion, the absolute errors of our proposed method are very close to those of the ITPC method in the X, Y, and Z axes.

Figure 6.
The absolute error of the mean of the estimated 3D coordinates using our proposed method and the integrated two-pose calibration (ITPC) method w.r.t. ground truth, respectively. (a) absolute error in X axis; (b) absolute error in Y axis; and (c) absolute error in Z axis.
Uncertainty Analysis: the uncertainties of our proposed method (shown in Figure 7) were calculated using Equations (31)–(33), and the uncertainties of the ITPC method (shown in Figure 7) were calculated using Equations (34)–(36). At each position, the number n of independent repeated observations is 1150. For both the left and right cameras, the image detection uncertainties in the X and Y directions are between 0.006 pixels and 0.057 pixels. As shown in Figure 7, the absolute uncertainties of our proposed method are very close to those of the ITPC method in the X, Y, and Z axes: between 0.009 mm and 0.130 mm in the X axis, between 0.028 mm and 0.188 mm in the Y axis, and between 0.088 mm and 4.545 mm in the Z axis.

Figure 7.
The uncertainties of our proposed method and ITPC method. (a) absolute uncertainty in X axis; (b) absolute uncertainty in Y axis; and, (c) absolute uncertainty in Z axis.
Triangulation of a Moving Target Point
We want to verify the effectiveness of our proposed method in the case that the target point is moving.
We placed a checkerboard at 1000 mm along the Z axis w.r.t. the base frame N. The target point moves along prescribed trajectories in the X and Y directions w.r.t. frame N. We recorded the 3D coordinates of the moving target point estimated by our proposed method and by the ITPC method, respectively.
Error Analysis: we calculated the absolute errors of our proposed method and the ITPC method w.r.t. the ground truth, respectively. As shown in Figure 8, the mean absolute errors in the X axis of the two methods are 1.76 mm and 1.61 mm, respectively; the mean absolute errors in the Y axis are 2.03 mm and 2.17 mm, respectively; and the mean absolute errors in the Z axis are 1.69 mm and 1.71 mm, respectively, which amount to only 0.17% of the ground-truth depth. The mean absolute error in the X axis of our proposed method is larger than that of the ITPC method, while the mean absolute errors in the Y and Z axes of our proposed method are smaller.

Figure 8.
The absolute error of the mean of the estimated 3D coordinates using our proposed method and ITPC method w.r.t ground truth, respectively. (a) absolute error in X axis; (b) absolute error in Y axis; and, (c) absolute error in Z axis.
Uncertainty Analysis: we calculated the standard deviation of the coordinates in the Z axis as the uncertainty of our proposed method and of the ITPC method. In this experiment, the number n of observations is 3780. The absolute uncertainties in the Z axis of our proposed method and the ITPC method are 0.30 mm and 0.98 mm, respectively; the uncertainty of our proposed method is smaller.
5.1.2. Physical Experiments
The physical experiments are performed on the real robotic bionic eyes. Triangulation on static and moving target points is performed. Comparisons are carried out between our proposed method and the ITPC method.
Experimental Setup
In the physical experiments, the image size of the left and right cameras is 640 × 480 pixels. The intrinsic parameters of both cameras are shown in Table 4. The distortion coefficients of the left and right cameras are [−0.0437, 0.1425, 0.0005, −0.0012, 0.0000] and [−0.0425, 0.1080, 0.0001, −0.0015, 0.0000], respectively. The maximum horizontal and vertical fields of view of the real robotic bionic eyes are both 170°. The AprilTag [] information is obtained through the ViSP [] library. We choose the center of the AprilTag as the fixation point P. The expected images of P in the left and right cameras are set to the principal points (362.94, 222.53) and (388.09, 220.82), respectively. Figure 9 shows the images of P in the left and right cameras.

Table 4.
The intrinsic parameters of the left and right cameras of the designed robotic bionic eyes.

Figure 9.
The images of P in the left and right cameras. (a) left image; (b) right image.
Triangulation of Static Target Points
We placed the AprilTag at 21 different positions (as shown in Table 5). At each position, the estimated 3D coordinates using our proposed method and ITPC method were recorded, respectively.

Table 5.
The ground truth of the static target points.
Error Analysis: we calculated the absolute errors of the mean of the estimated 3D coordinates w.r.t. the ground truth using our proposed method and the ITPC method, respectively (as shown in Figure 10). The absolute errors of both methods in the physical experiments stem mainly from errors in the forward kinematics, the intrinsic parameters, the head-eye parameters, and the joint offsets. As shown in Figure 10, our proposed method is closer to the ground truth than the ITPC method in the X and Z axes at most of the 21 positions, especially when the Z ground truth is larger than 2000 mm. The absolute error in the Y axis of our proposed method is between 0.55 mm and 12.28 mm, while that of the ITPC method is between 1.52 mm and 12.88 mm. The minimum absolute errors in the Z axis of our proposed method and the ITPC method are 1.42 mm and 2.32 mm, which amount to 0.23% and 0.38% of the ground-truth depth, respectively. The maximum absolute errors in the Z axis of our proposed method and the ITPC method are 124.49 mm and 174.64 mm, which amount to 4.97% and 6.98% of the ground-truth depth, respectively. Overall, our proposed method obtains smaller mean absolute errors in the X, Y, and Z axes.

Figure 10.
The absolute error of the mean of the estimated 3D coordinates using our proposed method and ITPC method w.r.t ground truth respectively. (a) absolute error in X axis; (b) absolute error in Y axis; and, (c) absolute error in Z axis.
Uncertainty Analysis: the uncertainties of our proposed method (shown in Figure 11) were calculated using Equations (31)–(33), and the uncertainties of the ITPC method (shown in Figure 11) were calculated using Equations (34)–(36). At each position, n = 1200. For the left camera, the image detection uncertainties in the X direction are between 0.027 pixels and 0.089 pixels, and those in the Y direction are between 0.039 pixels and 0.138 pixels. For the right camera, the image detection uncertainties in the X direction are between 0.016 pixels and 0.067 pixels, and those in the Y direction are between 0.046 pixels and 0.138 pixels. As shown in Figure 11, the absolute uncertainties of both methods are between 0.119 mm and 3.905 mm in the X axis, between 0.091 mm and 0.640 mm in the Y axis, and between 0.268 mm and 7.975 mm in the Z axis. In conclusion, the absolute uncertainties of our proposed method are very close to those of the ITPC method in the X, Y, and Z axes.

Figure 11.
The uncertainties of our proposed method and ITPC method. (a) absolute uncertainty in X axis; (b) absolute uncertainty in Y axis; and, (c) absolute uncertainty in Z axis.
Triangulation of a Moving Target Point
We moved the target point (the center of the AprilTag) from (−278.60, −23.13, 957.84) to (−390.02, −30.39, 1111.91) at an average speed of 0.01 m/s. Figure 12 shows the trajectory of the ground truth and of the point P estimated by our proposed method and the ITPC method.

Figure 12.
The trajectory of the ground truth and the estimated point using our proposed method and ITPC method.
Error Analysis: we calculated the absolute errors of our proposed method and the ITPC method w.r.t. the ground truth, respectively (shown in Figure 13). The mean absolute errors in the X axis of our proposed method and the ITPC method are 11.81 mm and 9.53 mm, those in the Y axis are 14.89 mm and 16.59 mm, and those in the Z axis are 23.74 mm and 22.48 mm, respectively. The mean absolute errors in the X and Z axes of our proposed method are larger than those of the ITPC method, while the mean absolute error in the Y axis of our proposed method is smaller.

Figure 13.
The absolute error of our proposed method and ITPC method w.r.t ground truth respectively. (a) absolute error in X axis; (b) absolute error in Y axis; and, (c) absolute error in Z axis.
5.2. Comparison Experiments with Zed Mini
We compare the triangulation performance of our proposed method and the ITPC method against the ZED Mini stereo system, which uses fixed cameras (shown in Figure 2).
Experimental Setup
The image size of the left and right cameras of the ZED Mini is 1280 × 720 pixels. The intrinsic parameters, including the focal lengths and principal points of both cameras, are shown in Table 6. The maximum horizontal and vertical fields of view of the ZED Mini are 90° and 60°, respectively. We placed the AprilTag at eight different positions to obtain eight different spatial points. The 3D coordinates estimated by the ZED Mini are shown in Table 7.

Table 6.
The intrinsic parameters of the left and right cameras of ZED mini.

Table 7.
The estimated 3D coordinates of the spatial points w.r.t the base frame using ZED mini.
Error analysis: we calculated the absolute errors of our proposed method and the ITPC method w.r.t. the ZED Mini, respectively. As shown in Figure 14, the absolute errors of our proposed method are very close to those of the ITPC method in the X and Y axes, and our proposed method obtains a smaller absolute error in the Z axis at most of the eight positions. The absolute errors in the X axis of both methods are between 25.17 mm and 68.63 mm, and those in the Y axis are between 35.26 mm and 55.18 mm. The minimum absolute errors in the Z axis of our proposed method and the ITPC method are 0.44 mm and 2.10 mm, and the maximum absolute errors are 86.40 mm and 92.90 mm, respectively. Because no camera movements exist in the ZED Mini, our proposed method and the ITPC method produce slightly larger errors than the ZED Mini, mainly due to eye motion.

Figure 14.
The absolute error of our proposed method and ITPC method w.r.t ZED Mini. (a) absolute error in X axis; (b) absolute error in Y axis; and, (c) absolute error in Z axis.
5.3. Time Performance Experiments
5.3.1. Simulated Experiments
We performed experiments to evaluate how the two gains in the gain matrix affect the time it takes for eye gazing. We initially placed the simulated checkerboard so that the target point was at (−0.00, −100.00, 1000). A step signal with an amplitude of 100 mm in the Y direction was then applied to the target point. We adjusted each gain from 0.5 to 4.0; the system overshot when either gain was larger than 4.0. As shown in Figure 15, increasing the gains moderately shortens the time it takes for eye gazing.

Figure 15.
The time it takes for eye gazing as a function of the two gains in the gain matrix.
5.3.2. Physical Experiments
In the physical experiments, we placed the AprilTag so that the target point was at (−0.00, −145.00, 1500) and set the robotic bionic eyes to the initial state. With the gains in the gain matrix fixed, it takes 650 ms for eye gazing to move the image of the target point from (385.98, 196.60) to (362.94, 222.53) in the left image frame and from (361.09, 191.82) to (388.09, 220.82) in the right image frame, respectively.
6. Conclusions
In this paper, we have proposed an eye gaze based 3D triangulation method for our designed robotic bionic eyes. Eye gazing is realized through image based visual servoing in order to keep the two optical axes passing through the target point P. We can then obtain the 3D coordinates of P from the intersection of the two optical axes, which are derived from forward kinematics with head-eye parameters calibrated beforehand. In real applications, the two optical axes may not intersect each other due to visual servoing errors and model errors, so we use the middle point of the common perpendicular line segment of the two skew optical axes as the representation of the intersection point P.
The simulated and physical experiments show that the proposed method achieves results comparable to the ITPC method in both the absolute errors and the propagated uncertainties, and that our proposed method obtains smaller mean absolute errors for the triangulation of static target points in the physical experiments. Both our proposed method and the ITPC method produce larger errors than conventional stereo systems with fixed cameras, such as the ZED Mini, due to model errors introduced by manufacturing, including link length error, coaxiality error, and error due to limited link stiffness. The experiments also show that, at the beginning of the visual servoing process, our method needs several hundred milliseconds to bring the image of a target point to the principal point. Selecting the gains in the gain matrix with a fuzzy PID scheme could be a potential solution to minimize this initial settling time.
Although our method offers only a small improvement in triangulation precision compared with the ITPC method, it is a new bionic approach to triangulation that uses eye gazing through image based visual servoing. Our system has a much larger field of view than a traditional stereo pair, such as the ZED Mini, and it does not rely on the stereo extrinsic parameters directly. Another advantage is its higher tolerance to image detection error. In the future, we plan to reduce the model error and joint offset error introduced by manufacturing in order to obtain precision comparable to a stereo pair with fixed cameras.
Author Contributions
Conceptualization, X.C.; Data curation, X.L.; Formal analysis, D.F. and X.C.; Investigation, F.M.; Methodology, D.F. and Y.L. (Yunhui Liu); Software, D.F. and Z.U.; Supervision, Y.L. (Yunhui Liu) and Q.H.; Validation, Y.L. (Yanyang Liu), F.M. and W.C.; Visualization, Y.L. (Yanyang Liu) and X.L.; Writing—original draft, D.F.; Writing—review and editing, X.C., Z.U., Y.L. (Yunhui Liu) and Q.H. All authors have read and agreed to the published version of the manuscript.
Funding
This work was supported by the National Natural Science Foundation of China (Grant No. 91748202); the Science and Technology Program of Beijing Municipal Science and Technology Commission (Z191100008019003); the Pre-research Project under Grant 41412040101; and the Open Project Fund of Hebei Static Traffic Technology Innovation Center, AIPARK (No.001/2020).
Conflicts of Interest
The authors declare no conflict of interest.
References
- Sangeetha, G.; Kumar, N.; Hari, P.; Sasikumar, S. Implementation of a stereo vision based system for visual feedback control of robotic arm for space manipulations. Procedia Comput. Sci. 2018, 133, 1066–1073. [Google Scholar]
- Lin, C.-Y.; Shih, S.-W.; Hung, Y.-P.; Tang, G.Y. A new approach to automatic reconstruction of a 3-d world using active stereo vision. Comput. Vis. Image Underst. 2002, 85, 117–143. [Google Scholar] [CrossRef]
- Al-Mutib, K.N.; Mattar, E.A.; Alsulaiman, M.M.; Ramdane, H. Stereo vision slam based indoor autonomous mobile robot navigation. In Proceedings of the 2014 IEEE International Conference on Robotics and Biomimetics (ROBIO 2014), Bali, Indonesia, 5–10 December 2014; pp. 1584–1589. [Google Scholar]
- Diebel, J.; Reutersward, K.; Thrun, S.; Davis, J.; Gupta, R. Simultaneous localization and mapping with active stereo vision. In Proceedings of the 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (IEEE Cat. No. 04CH37566), Sendai, Japan, 28 September–2 October 2004; Volume 4, pp. 3436–3443. [Google Scholar]
- Sakagami, Y.; Watanabe, R.; Aoyama, C.; Matsunaga, S.; Higaki, N.; Fujimura, K. The intelligent asimo: System overview and integration. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Lausanne, Switzerland, 30 September–4 October 2002; Volume 3, pp. 2478–2483. [Google Scholar]
- Kaneko, K.; Harada, K.; Kanehiro, F.; Miyamori, G.; Akachi, K. Humanoid robot hrp-3. In Proceedings of the 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems, Nice, France, 22–26 September 2008; pp. 2471–2478. [Google Scholar]
- Zhang, T.; Uchiyama, E.; Nakamura, Y. Dense rgb-d slam for humanoid robots in the dynamic humans environment. In Proceedings of the 2018 IEEE-RAS 18th International Conference on Humanoid Robots (Humanoids), Beijing, China, 6–9 November 2018; pp. 270–276. [Google Scholar]
- Willow Garage. Available online: https://en.wikipedia.org/wiki/Willow_Garage (accessed on 1 September 2018).
- Atlas (Robot). Available online: https://en.wikipedia.org/wiki/Atlas_(robot) (accessed on 1 September 2018).
- Tsagarakis, N.G.; Caldwell, D.G.; Negrello, F.; Choi, W.; Baccelliere, L.; Loc, V.-G.; Noorden, J.; Muratore, L.; Margan, A.; Cardellino, A.; et al. Walk-man: A high-performance humanoid platform for realistic environments. J. Field Robot. 2017, 34, 1225–1259. [Google Scholar] [CrossRef]
- Berthouze, L.; Bakker, P.; Kuniyoshi, Y. Learning of oculo-motor control: A prelude to robotic imitation. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS’96, Osaka, Japan, 8 November 1996; Volume 1, pp. 376–381. [Google Scholar]
- Sharkey, P.M.; Murray, D.W.; McLauchlan, P.F.; Brooker, J.P. Hardware development of the yorick series of active vision systems. Microprocess. Microsyst. 1998, 21, 363–375. [Google Scholar] [CrossRef]
- Biamino, D.; Cannata, G.; Maggiali, M.; Piazza, A. Mac-eye: A tendon driven fully embedded robot eye. In Proceedings of the 5th IEEE-RAS International Conference on Humanoid Robots, Tsukuba, Japan, 5 December 2005; pp. 62–67. [Google Scholar]
- Beira, R.; Lopes, M.; Praça, M.; Santos-Victor, J.; Bernardino, A.; Metta, G.; Becchi, F.; Saltarén, R. Design of the robot-cub (icub) head. In Proceedings of the 2006 IEEE International Conference on Robotics and Automation, Orlando, FL, USA, 15–19 May 2006; pp. 94–100. [Google Scholar]
- Asfour, T.; Welke, K.; Azad, P.; Ude, A.; Dillmann, R. The karlsruhe humanoid head. In Proceedings of the Humanoids 2008-8th IEEE-RAS International Conference on Humanoid Robots, Daejeon, Korea, 1–3 December 2008; pp. 447–453. [Google Scholar]
- Lütkebohle, I.; Hegel, F.; Schulz, S.; Hackel, M.; Wrede, B.; Wachsmuth, S.; Sagerer, G. The bielefeld anthropomorphic robot head flobi. In Proceedings of the 2010 IEEE International Conference on Robotics and Automation, Anchorage, AK, USA, 3–7 May 2010; pp. 3384–3391. [Google Scholar]
- Song, Y.; Zhang, X. An active binocular integrated system for intelligent robot vision. In Proceedings of the 2012 IEEE International Conference on Intelligence and Security Informatics, Arlington, VA, USA, 11–14 June 2012; pp. 48–53. [Google Scholar]
- Asfour, T.; Schill, J.; Peters, H.; Klas, C.; Bücker, J.; Sander, C.; Schulz, S.; Kargov, A.; Werner, T.; Bartenbach, V. Armar-4: A 63 dof torque controlled humanoid robot. In Proceedings of the 2013 13th IEEE-RAS International Conference on Humanoid Robots (Humanoids), Atlanta, GA, USA, 15–17 October 2013; pp. 390–396. [Google Scholar]
- Cid, F.; Moreno, J.; Bustos, P.; Núnez, P. Muecas: A multi-sensor robotic head for affective human robot interaction and imitation. Sensors 2014, 14, 7711–7737. [Google Scholar] [CrossRef]
- Pateromichelakis, N.; Mazel, A.; Hache, M.; Koumpogiannis, T.; Gelin, R.; Maisonnier, B.; Berthoz, A. Head-eyes system and gaze analysis of the humanoid robot romeo. In Proceedings of the 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems, Chicago, IL, USA, 14–18 September 2014; pp. 1374–1379. [Google Scholar]
- Penčić, M.; Rackov, M.; Čavić, M.; Kiss, I.; Cioată, V. Social humanoid robot sara: Development of the wrist mechanism. In IOP Conference Series: Materials Science and Engineering; IOP Publishing: Bristol, UK, 2018; Volume 294, p. 012079. [Google Scholar]
- Zhang, T.; Chen, X.; Owais, H.M.; Liu, G.; Fu, S.; Tian, Y.; Chen, X. Multi-loop stabilization control of a robotic bionic eyes. In Proceedings of the 2017 IEEE International Conference on Cyborg and Bionic Systems (CBS), Beijing, China, 17–19 October 2017; pp. 87–90. [Google Scholar]
- Liu, G.; Owais, H.M.; Zhang, T.; Fu, S.; Tian, Y.; Chen, X. Reliable eyes pose measurement for robotic bionic eyes with mems gyroscope and akf filter. In Proceedings of the 2017 IEEE International Conference on Cyborg and Bionic Systems (CBS), Beijing, China, 17–19 October 2017; pp. 83–86. [Google Scholar]
- Fan, D.; Chen, X.; Zhang, T.; Chen, X.; Liu, G.; Owais, H.M.; Kim, H.; Tian, Y.; Zhang, W.; Huang, Q. Design of anthropomorphic robot bionic eyes. In Proceedings of the 2017 IEEE International Conference on Robotics and Biomimetics (ROBIO), Macau, China, 5–8 December 2017; pp. 2050–2056. [Google Scholar]
- Majumdar, J. Efficient parallel processing for depth calculation using stereo. Robot. Autonomous Syst. 1997, 20, 1–13. [Google Scholar] [CrossRef]
- Nefti-Meziani, S.; Manzoor, U.; Davis, S.; Pupala, S.K. 3D perception from binocular vision for a low cost humanoid robot nao. Robot. Autonomous Syst. 2015, 68, 129–139. [Google Scholar] [CrossRef]
- Zhang, J.; Du, R.; Gao, R. Passive 3D reconstruction based on binocular vision. In Proceedings of the Tenth International Conference on Graphics and Image Processing (ICGIP 2018), International Society for Optics and Photonics, Chengdu, China, 12–14 December 2018; Volume 11069, p. 110690Y. [Google Scholar]
- Hartley, R.I.; Sturm, P. Triangulation. Comput. Vision Image Underst. 1997, 68, 146–157. [Google Scholar] [CrossRef]
- Kanatani, K.; Sugaya, Y.; Niitsuma, H. Triangulation from two views revisited: Hartley-sturm vs. optimal correction. Practice 2008, 4, 5. [Google Scholar]
- Hartley, R.; Zisserman, A. Multiple View Geometry in Computer Vision; Cambridge University Press: Cambridge, UK, 2003. [Google Scholar]
- Zhong, F.; Shao, X.; Quan, C. A comparative study of 3d reconstruction methods in stereo digital image correlation. Opt. Lasers Eng. 2019, 122, 142–150. [Google Scholar] [CrossRef]
- Zhang, Z. A flexible new technique for camera calibration. IEEE Trans. Pattern Anal. Mach. Intell. 2000, 22, 1330–1334. [Google Scholar] [CrossRef]
- Cui, Y.; Zhou, F.; Wang, Y.; Liu, L.; Gao, H. Precise calibration of binocular vision system used for vision measurement. Opt. Express 2014, 22, 9134–9149. [Google Scholar] [CrossRef] [PubMed]
- Neubert, J.; Ferrier, N.J. Robust active stereo calibration. In Proceedings of the 2002 IEEE International Conference on Robotics and Automation (Cat. No. 02CH37292), Washington, DC, USA, 11–15 May 2002; Volume 3, pp. 2525–2531. [Google Scholar]
- Xu, D.; Li, Y.F.; Tan, M.; Shen, Y. A new active visual system for humanoid robots. IEEE Trans. Syst. Man Cybern. Part B (Cybern.) 2008, 38, 320–330. [Google Scholar]
- Xu, D.; Wang, Q. A new vision measurement method based on active object gazing. Int. J. Adv. Robot. Syst. 2017, 14, 1729881417715984. [Google Scholar] [CrossRef]
- Wang, Y.; Wang, X.; Wan, Z.; Zhang, J. A method for extrinsic parameter calibration of rotating binocular stereo vision using a single feature point. Sensors 2018, 18, 3666. [Google Scholar] [CrossRef]
- Wang, Y.; Wang, X. An improved two-point calibration method for stereo vision with rotating cameras in large fov. J. Modern Opt. 2019, 66, 1106–1115. [Google Scholar] [CrossRef]
- Li, M. Kinematic calibration of an active head-eye system. IEEE Trans. Robot. Autom. 1998, 14, 153–158. [Google Scholar]
- Chen, X.; Wang, C.; Zhang, W.; Lan, K.; Huang, Q. An integrated two-pose calibration method for estimating head-eye parameters of a robotic bionic eye. IEEE Trans. Instrum. Meas. 2019, 69, 1664–1672. [Google Scholar] [CrossRef]
- Dang, T.; Hoffmann, C.; Stiller, C. Continuous stereo self-calibration by camera parameter tracking. IEEE Trans. Image Process. 2009, 18, 1536–1550. [Google Scholar] [CrossRef]
- Paul, R.P. Robot Manipulators: Mathematics, Programming, and Control: The Computer Control of Robot Manipulators; MIT Press: Cambridge, MA, USA, 1981. [Google Scholar]
- Coombs, D.J.; Brown, C.M. Cooperative gaze holding in binocular vision. IEEE Control Syst. Mag. 1991, 11, 24–33. [Google Scholar]
- Tanaka, M.; Maru, N.; Miyazaki, F. Binocular gaze holding of a moving object with the active stereo vision system. In Proceedings of the 1994 IEEE Workshop on Applications of Computer Vision, Sarasota, FL, USA, 5–7 December 1994; pp. 250–255. [Google Scholar]
- Roca, X.; Vitria, J.; Vanrell, M.; Villanueva, J. Gaze control in a binocular robot systems. In Proceedings of the 1999 7th IEEE International Conference on Emerging Technologies and Factory Automation, ETFA’99 (Cat. No. 99TH8467), Barcelona, Spain, 18–21 October 1999; Volume 1, pp. 479–485. [Google Scholar]
- Satoh, Y.; Okatani, T.; Deguchi, K. Binocular motion tracking by gaze fixation control and three-dimensional shape reconstruction. Adv. Robot. 2003, 17, 1057–1072. [Google Scholar] [CrossRef]
- Hutchinson, S.; Chaumette, F. Visual servo control, part I: Basic approaches. IEEE Robot. Autom. Mag. 2006, 13, 82–90. [Google Scholar]
- Muelaner, J.E.; Wang, Z.; Martin, O.; Jamshidi, J.; Maropoulos, P.G. Estimation of uncertainty in three-dimensional coordinate measurement by comparison with calibrated points. Meas. Sci. Technol. 2010, 21, 025106. [Google Scholar] [CrossRef]
- da Silva Hack, P.; ten Caten, C.S. Measurement uncertainty: Literature review and research trends. IEEE Trans. Instrum. Meas. 2012, 61, 2116–2124. [Google Scholar] [CrossRef]
- BIPM; IFCC; IUPAC; ISO. Evaluation of measurement data—Guide for the expression of uncertainty in measurement; JCGM 100:2008; Joint Committee for Guides in Metrology (JCGM), 2008. [Google Scholar]
- Leo, G.D.; Liguori, C.; Paolillo, A. Propagation of uncertainty through stereo triangulation. In Proceedings of the 2010 IEEE Instrumentation & Measurement Technology Conference Proceedings, Austin, TX, USA, 3–6 May 2010; pp. 12–17. [Google Scholar]
- Leo, G.D.; Liguori, C.; Paolillo, A. Covariance propagation for the uncertainty estimation in stereo vision. IEEE Trans. Instrum. Meas. 2011, 60, 1664–1673. [Google Scholar] [CrossRef]
- Koenig, N.; Howard, A. Design and use paradigms for gazebo, an open-source multi-robot simulator. In Proceedings of the 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (IEEE Cat. No. 04CH37566), Sendai, Japan, 28 September–2 October 2004; Volume 3, pp. 2149–2154. [Google Scholar]
- Camera Calibration Toolbox for Matlab. Available online: http://www.vision.caltech.edu/bouguetj/calib_doc/ (accessed on 1 September 2018).
- Olson, E. Apriltag: A robust and flexible visual fiducial system. In Proceedings of the 2011 IEEE International Conference on Robotics and Automation, Shanghai, China, 9–13 May 2011; pp. 3400–3407. [Google Scholar]
- Marchand, É.; Spindler, F.; Chaumette, F. Visp for visual servoing: A generic software platform with a wide class of robot control skills. IEEE Robot. Autom. Mag. 2005, 12, 40–52. [Google Scholar] [CrossRef]
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).