Gaze Point Tracking Based on a Robotic Body–Head–Eye Coordination Method

Feng, Xingyang; Wang, Qingbin; Cong, Hua; Zhang, Yu; Qiu, Mianhao

doi:10.3390/s23146299

Open AccessArticle

Gaze Point Tracking Based on a Robotic Body–Head–Eye Coordination Method

by

Xingyang Feng

^1,†,

Qingbin Wang

^2,†,

Hua Cong

¹,

Yu Zhang

¹ and

Mianhao Qiu

^1,*

¹

Army Academy of Armored Forces, Beijing 100072, China

²

Research Center of Precision Sensing and Control, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China

^*

Author to whom correspondence should be addressed.

^†

These authors contributed equally to this work.

Sensors 2023, 23(14), 6299; https://doi.org/10.3390/s23146299

Submission received: 1 June 2023 / Revised: 29 June 2023 / Accepted: 6 July 2023 / Published: 11 July 2023

(This article belongs to the Special Issue Mobile Robots: Navigation, Control and Sensing)

Download

Browse Figures

Review Reports Versions Notes

Abstract

:

When the magnitude of a gaze is too large, human beings change the orientation of their head or body to assist their eyes in tracking targets because saccade alone is insufficient to keep a target at the center region of the retina. To make a robot gaze at targets rapidly and stably (as a human does), it is necessary to design a body–head–eye coordinated motion control strategy. A robot system equipped with eyes and a head is designed in this paper. Gaze point tracking problems are divided into two sub-problems: in situ gaze point tracking and approaching gaze point tracking. In the in situ gaze tracking state, the desired positions of the eye, head and body are calculated on the basis of minimizing resource consumption and maximizing stability. In the approaching gaze point tracking state, the robot is expected to approach the object at a zero angle. In the process of tracking, the three-dimensional (3D) coordinates of the object are obtained by the bionic eye and then converted to the head coordinate system and the mobile robot coordinate system. The desired positions of the head, eyes and body are obtained according to the object’s 3D coordinates. Then, using sophisticated motor control methods, the head, eyes and body are controlled to the desired position. This method avoids the complex process of adjusting control parameters and does not require the design of complex control algorithms. Based on this strategy, in situ gaze point tracking and approaching gaze point tracking experiments are performed by the robot. The experimental results show that body–head–eye coordination gaze point tracking based on the 3D coordinates of an object is feasible. This paper provides a new method that differs from the traditional two-dimensional image-based method for robotic body–head–eye gaze point tracking.

Keywords:

bionic eyes; gaze point tracking; gaze point approaching; body–eye–head coordination; 3D coordinates

1. Introduction

When the magnitude of a gaze is too large, human beings change the orientation of their head or body to assist their eyes in tracking targets because saccade alone is insufficient to keep a target at the center region of the retina. Studies on body–head–eye coordination gaze point tracking are still rare because the body–head–eye coordination mechanism of humans is prohibitively complex. Multiple researchers have investigated the eye–head coordination mechanism, binocular coordination mechanism and bionic eye movement control. In addition, researchers have validated the eye–head coordination models on eye–head systems. This work is significant for the development of intelligent robots for human–robot interaction. However, most of these methods are based on the principle of neurology, and their further developments and applications may be limited by people’s understanding of human processes. However, binocular coordination based on the 3D coordinates of an object is simple and practical, as verified by our previous paper [1].

When the fixation point transfers greatly, the head and eyes should move in coordination to accurately shift the gaze to the target. Multiple studies have built models of eye–head coordination based on the physiological characteristics of humans. For example, Kardamakis A A et al. [2] researched eye–head movement and gaze shifting. The best balance between eye movement speed and the duration time was sought, and the optimal control method was used to minimize the loss of motion. Freedman E G et al. [3] studied the physiological mechanism of coordinated eye–head movement. However, they did not establish an engineering model. Nakashima et al. [4] proposed a method for gaze prediction that combines information on the head direction with a saliency map. In another study [5], the authors presented a robotic head for social robots to attend to scene saliency with bio-inspired saccadic behaviors. The scene saliency was determined by measuring low-level static scene information, motion, and prior object knowledge. Law et al. [6] described a biologically constrained architecture for developmental learning of eye–head gaze control on an iCub robot. They also identified stages in the development of infant gaze control and proposed a framework of artificial constraints to shape the learning of the robot in a similar manner. Other studies have investigated the mechanisms of eye–head movement for robots and achieved satisfactory performance [7,8].

Some application studies based on coordinated eye–head movement have been carried out in addition to the mechanism research. For example, Kuang et al. [9] developed a method for egocentric distance estimation based on the parallax that emerges during compensatory head–eye movements. This method was tested in a robotic platform equipped with an anthropomorphic neck and two binocular pan–tilt units. Reference [10]’s model is capable of reaching static targets posed at a starting distance of 1.2 m in approximately 250 control steps. Hülse et al. [11] introduced a computational framework that integrates robotic active vision and reaching. Essential elements of this framework are sensorimotor mappings that link three different computational domains relating to visual data, gaze control and reaching.

Some researchers have applied the combined movement of the eyes, head and body in mobile robots. In one study [12], large reorientations of the line of sight, involving combined rotations of the eyes, head, trunk and lower extremities, were executed either as fast single-step or as slow multiple-step gaze transfers. Daye et al. [13] proposed a novel approach for the control of linked systems with feedback loops for each part. The proximal parts had separate goals. In addition, an efficient and robust human tracker for a humanoid robot was implemented and experimentally evaluated in another study [14].

On the one hand, human eyes can obtain three-dimensional (3D) information from objects. This 3D information is useful for humans to make decisions. Human can shift their gaze stably and approach a target using the 3D information of the object. When the human gaze shifts to a moving target, the eyes first rotate to the target, and then the head and even the body rotate if the target leaves the sight of the eyes [15]. Therefore, the eyes, head and body move in coordination to shift the gaze to the target with minimal energy expenditure. On the other hand, when a human approaches a target, the eyes, head and body rotate to face the target and the body moves toward the target. The two movements are typically executed with the eyes, head and body acting in conjunction. A robot that can execute these two functions will be more intelligent. Such a robot would need to exploit the smooth pursuit of eyes [16], coordinated eye–head movement [17], target detection and the combined movement of the eyes, head and robot body to carry out these two functions. Studies have achieved many positive results in these aspects.

Mobile robots can track and locate objects according to 3D information. Some special cameras such as deep cameras and 3D lasers have been applied to obtain the 3D information of the environment and target. In one study [18], a nonholonomic under-actuated robot with bounded control was described that travels within a 3D region. A single sensor provided the value of an unknown scalar field at the current location of the robot. Nefti-Meziani S et al. [19] presented the implementation of a stereo-vision system integrated in a humanoid robot. The low cost of the vision system is one of the main aims, avoiding expensive investment in hardware when used in robotics for 3D perception. Namavari A et al. [20] presented an automatic system for the gauging and digitalization of 3D indoor environments. The configuration consisted of an autonomous mobile robot, a reliable 3D laser rangefinder and three elaborated software modules.

The main forms of motion of bionic eyes include saccade [1], smooth pursuit, vergence [21], vestibule–ocular reflex (VOR) [22] and optokinetic reflex (OKR) [23]. Saccade and smooth pursuit are the two most important functions of the human eye. Saccade is used to move eyes voluntarily from one point to another by rapid jumping, while smooth pursuit can be applied to track moving targets. In addition, binocular coordination and eye–head coordination are of high importance to realize object tracking and gaze control.

It is of great significance for robots to be able change their fixation point quickly. In control models, the saccade control system should be implemented using a position servo controller to change and keep the target at the center region of the retina with minimum time consumption. Researchers have been studying the implementation of saccade on robots over the last twenty years. For example, in 1997, Bruske et al. [24] incorporated saccadic control into a binocular vision system by using the feedback error learning (FEL) strategy. In 2013, Wang et al. [25] designed an active vision system that can imitate saccade and other eye movements. The saccadic movements were implemented with an open-loop controller, which ensures faster saccadic eye movements than a closed-loop controller can accommodate. In 2015, Antonelli et al. [26] achieved saccadic movements on a robot head by using a model called recurrent architecture (RA). In this model, the cerebellum is regarded as an adaptive element used to learn an internal model, while the brainstem is regarded as a fixed-inverse model. The experimental results on the robot showed that this model is more accurate and less sensitive to the choice of the inverse model relative to the FEL model.

The smooth pursuit system acts as a velocity servo controller to rotate eyes at the same angular rate as the target while keeping them oriented toward the desired position or in the desired region. In Robinson’s model of smooth pursuit [27], the input is the velocity of the target’s image across the retina. The velocity deviation is taken as the major stimulus to pursue and is transformed into an eye velocity command. Based on Robinson’s model, Brown [28] added a smooth predictor to accommodate time delays. Deno et al. [29] applied a dynamic neural network, which unified two apparently disparate models of smooth pursuit and dynamic element organization to the smooth pursuit system. The dynamic neural network can compensate for delays from the sensory input to the motor response. Lunghi et al. [30] introduced a neural adaptive predictor that was previously trained to accomplish smooth pursuit. This model can explain a human’s ability to compensate for the 130 ms physiological delay when they follow external targets with their eyes. Lee et al. [31] applied a bilateral OCS model on a robot head and established rudimentary prediction mechanisms for both slow and fast phases. Avni et al. [32] presented a framework for visual scanning and target tracking with a set of independent pan–tilt cameras based on model predictive control (MPC). In another study [33], the authors implemented smooth pursuit eye movement with prediction and learning in addition to solving the problem of time delays in the visual pathways. In addition, some saccade and smooth pursuit models have been validated on bionic eye systems [34,35,36,37]. Santini F et al. [34] showed that the oculomotor strategies by which humans scan visual scenes produce parallaxes that provide an accurate estimation of distance. Other studies have realized the coordinated control of eye and arm movements through configuration and training [35]. Song Y et al. [36] proposed a binocular control model, which was derived from a neural pathway, for smooth pursuit. In their smooth pursuit experiments, the maximum retinal error was less than 2.2°, which is sufficient to keep a target in the field of view accurately. An autonomous mobile manipulation system was developed in the form of a modified image-based visual servo (IBVS) controller in a study [37].

The above-mentioned work is significant for the development of intelligent robots. However, there are some shortcomings. First, most of the existing methods are based on the principle of neurology, and further developments and applications may be limited by people’s understanding aimed at humans. Second, only two-dimensional (2D) image information is applied when gaze shifts to targets are implemented, while 3D information is ignored. Third, the studies of smooth pursuit [16], eye–head coordination [17], gaze shift and approach are independent and have not been integrated. Fourth, bionic eyes are different from human eyes; for example, some of them are two eyes that are fixed without movement or move with only 1 DOF, whereas some of them use special cameras or a single camera. Fifth, the movements of bionic eyes and heads are performed separately, without coordination.

To overcome the shortcomings mentioned above to a certain extent, a novel control method that implements the gaze shift and approach of a robot according to 3D coordinates is proposed in this paper. A robot system equipped with bionic eyes, a head and a mobile robot is designed to help nurses deliver medicine in hospitals. In this system, both the pan and each eye have 2 DOF (namely, tilt and pan [38]), and the mobile robot can rotate and move forward over the ground. When the robot gaze shifts to the target, the 3D coordinates of the target are acquired by the bionic eyes and transferred to the eye coordination system, head coordination system and robot coordination system. The desired position of the eye, head and robot are calculated based on the 3D information of the target. Then, the eye, head and mobile robot are driven to the desired positions. When the robot approaches the target, the eye, head and mobile robot first rotate to the target and then move to the target. This method allows the robot to achieve the above-mentioned functions with minimal resource consumption and can separate the control of the eye, head and mobile robot, which can improve the interactions between robots, human beings and the environment.

The rest of the paper is organized as follows. In Section 2, the robot system platform is introduced, and the control system is presented. In Section 3, the desired position is discussed and calculated. Robot pose control is described in Section 4. The experimental results are given and discussed in Section 5; finally, conclusions are drawn in Section 6.

2. Platform and Control System

To study the gaze point tracking of the robot, this paper designs a robot experiment platform including the eye–head subsystem and the mobile robot subsystem.

2.1. Robot Platform

The physical object of the robot is shown in Figure 1. With the mobile robot as a carrier, a head with two degrees of freedom is fixed on the mobile robot, and the horizontal and vertical rotations of the head are controlled by M_hu and M_hd, respectively. The bionic eye system is fixed to the head. The mobile robot is driven by two wheels, each of which is individually controlled by a servo motor. The angle and displacement of the robot platform can be determined by controlling the distance and speed of each wheel’s movement. The output shaft of each stepper motor of the head and eye is equipped with a rotary encoder to detect the position of the motor. Using the frequency multiplication technique, the resolution of the rotary encoder is 0.036°. The purpose of using a rotary encoder is to prevent the effects of lost motor motion on the 3D coordinate calculations. The movement of each motor is limited by a limit switch. The initial positioning of the eye system is based on the visual positioning plate [39].

The robot system includes two eyes and one mobile robot. To simulate the eyes and the head, six DOFs are designed in this system. The left eye’s pan and tilt are controlled by motors M_lu and M_ld, respectively. The right eye’s pan and tilt are controlled by motors M_ru and M_rd, respectively. The head’s pan and tilt are controlled by motors M_hu and M_hd, respectively. The mobile robot has two driving wheels and can perform rotation and forward movement. When the mobile robot needs to rotate, two wheels are set to turn the same amount in different directions. When the mobile robot needs to go forward, two wheels are set to turn the same amount in the same direction.

A diagram of the robot system’s organization is shown in Figure 2. The host computer and the mobile robot motion controller, the head motion controller and the eye motion controller all communicate through the serial ports. For satisfactory communication quality and stability, the baud rate of serial communication is 9600 bps. The camera communicates with the host computer via a GigE Gigabit Network. The camera’s native resolution is 1600 × 1200 pixels. To increase the calculation speed, the system uses an image downsampled to 400 × 300 pixels.

2.2. Control System

Figure 3 shows the control block diagram of the gaze point tracking of the mobile robot. First, based on binocular stereo-vision perception, the binocular pose and the left and right images are used to calculate the 3D coordinates of the target [40], and the coordinates of the target in the eye coordinate system are converted to the head and mobile robot coordinate system. Then, the desired poses of the eyes, head and mobile robot are calculated according to the 3D coordinates of the target. Finally, according to the desired pose, the motor is controlled to move to the desired position, and the change in the position of the motor is converted into changes in the eyes, head and mobile robot.

The tracking and approaching motion control problem based on the target 3D coordinates [1] is equivalent to solving the index J minimization problem of Equation (1), where f_i is the current state vector of the joint pose of the eye, head and mobile robot and f_q is the desired state vector:

J = ‖f_{i} - f_{q}‖

(1)

where J is the indicator function.

Figure 4a shows the definition of each coordinate system of the robot. The coordinate system of the eye is O_eX_eY_eZ_e, which coincides with the left motion module’s base coordinate system at the initial position. The head coordinate system is O_hX_hY_hZ_h, and the coordinates P_h (x_h, y_h, z_h) of the point P in the head coordinate system can be calculated using the coordinates P_e (x_e, y_e, z_e) in the eye coordinate system. The definitions of d_x and d_y are shown in Figure 4b. The robot coordinate system O_wX_wY_wZ_w coincides with the head coordinate system of the initial position. In the bionic eye system, the axis of rotation of the robot approximately coincides with Y_w.

Figure 4b,c show the definition of each system parameter. ^lθ_p and ^lθ_t are the pan and tilt of the left eye, respectively. ^rθ_p and ^rθ_t are the pan and tilt of the right eye, respectively. ^hθ_p and ^hθ_t are the pan and tilt of the head, respectively. The angle of the robot that rotates around the Y_w axis is ^wθ_p. The robot can not only rotate around Y_w but can also shift in the X_wO_wZ_w plane. When the robot moves, the robot coordinate system at time i is the base coordinate system, and the position of the robot at time i + 1 relative to the base coordinate system is ^wP_m (^wx_m, ^wz_m). When the robot performs gaze point tracking or approaches the target, the 3D coordinates of the target are first calculated at time i, and then the desired posture f_q of each part of the robot at time i + 1 is calculated according to the 3D coordinates of the target. When the current pose f_i of the robot system is equal to the desired pose, the robot maintains the current pose; when not equal, the system controls the various parts of the robot to move to the desired pose. The current pose vector of the robot system is f_i = (^wx_mi, ^wz_mi, ^wθ_pi, ^hθ_pi, ^hθ_ti, ^lθ_pi, ^lθ_ti, ^rθ_pi, ^rθ_ti), and the desired pose is f_q = (^wx_mq, ^wz_mq, ^wθ_pq, ^hθ_pq, ^hθ_tq, ^lθ_pq, ^lθ_tq, ^rθ_pq, ^rθ_tq). When performing in situ gaze point tracking, the robot performs only pure rotation and does not move forward. When the robot approaches the target, it first turns to the target and then moves straight toward the target. Therefore, the definition of f_q in the two tasks is different. Let ^gf_q be the desired pose when the gaze point is tracked and ^af_q be the desired pose of the robot when approaching the target.

After analyzing the control system, we found that the most important step in solving this control problem is to determine the desired pose.

3. Desired Pose Calculation

When performing in situ gaze point tracking, the robot performs only pure rotation and does not move forward. When the robot approaches the target, it first turns to the target and then moves straight toward the target. Therefore, the calculation of the desired pose can be divided into two sub-problems: (1) desired pose calculation for in situ gaze point tracking and (2) desired pose calculation for approaching gaze point tracking.

The optimal observation position is used for the accurate acquisition of 3D coordinates. The 3D coordinate accuracy is related to the baseline, time difference and image distortion. In the bionic eye platform, the baseline is changed with the changes in the cameras’ positions because the optical center is not coincident with the center of rotation. The 3D coordinate error of the target is smaller when the baseline of the two cameras is longer. Therefore, it is necessary to keep the baseline unchanged. On the other hand, there is a time difference caused by unstick synchronization between image acquisition and camera position acquisition. In addition, it is necessary to keep the target in the center areas of the two camera images to obtain accurate 3D coordinates of the target.

3.1. Optimal Observation Position of Eyes

In the desired pose of the robot, the most important aspect is the expected pose of the bionic eye [40]. Following the definition of this parameter, the calculation of the desired pose of the robot system is greatly simplified; thus, we present an engineering definition here of the desired pose of the bionic eye.

As shown in Figure 5, ^lm_i (^lu_i, ^lv_i) and ^r(^ru_i, ^rv_i) are the image coordinates of point ^eP in the camera at time i. ^lm_o and ^rm_o are the image centers of the left and right cameras, respectively. ^lP is the vertical point of ^eP along the line ^lO_c^lZ_c, and ^rP is the vertical point of ^eP along the line ^rO_c^rZ_c. ^l∆m is the distance between ^lm and ^lm_o. ^r∆m is the distance between ^rm and ^rm_o. D_b is the baseline length. The pan angles of the left and right cameras in the optimal observation position are ^lθ_p and ^rθ_p, respectively. The tilt angles of the left and right cameras in the optimal observation position are ^lθ_t and ^rθ_t, respectively. P_ob (^lθ_p, ^lθ_t, ^rθ_p, ^rθ_t) is the optimal observation position.

When the two eyeballs of the bionic eye move relative to each other, the 3D coordinates of the target obtained by the bionic eye produces a large error. To characterize this error, we give a detailed analysis of its origins in Appendix A. Through analysis, we obtain the following conclusions to reduce the measurement error of the bionic eye:

(1) Make the length of D_b long enough, and maintain as much length as possible during the movement;

(2) Try to observe the target closer to the target so that the depth error is as small as possible;

(3) During the movement of the bionic eye, control the two cameras so that they move at the same angular velocity;

(4) Try to keep the target symmetrical, and make ^lΔm and ^rΔm as equal as possible in the left and right camera images.

Based on these four methods, the motion strategy of the motor is designed, and the measurement accuracy of the target’s 3D information can be effectively improved.

According to the conclusion, we can define a definition of the optimal observed pose of the bionic eye to reduce the measurement error.

The optimal observation position needed to meet the conditions is listed in Equation (2). When the target is very close to the eyes, the target’s optimal observation position cannot be obtained because the image position of the target can be kept at the image center region. It is challenging to obtain the optimal solution of the observation position based on Equation (12). However, a suboptimal solution can be obtained by using a simplified calculation method. First, ^lθ_t and ^rθ_t are calculated in the case that ^lθ_t and ^rθ_t are equal to zero; then, ^lθ_t and ^rθ_t are calculated while ^lθ_t and ^rθ_t are kept equal to the calculated value. Trial-and-error methods can be used to obtain the optimal solution when the suboptimal solution is obtained.

\{\begin{cases} ^{l} θ_{pq} =^{r} θ_{pq} = θ_{p} \\ ^{l} θ_{tq} =^{r} θ_{tq} = θ_{t} \\ ^{l} Δ m = -^{r} Δ m \end{cases}

(2)

where

^{l} Δ m = (\begin{array}{l} ^{l} Δ u \\ ^{l} Δ v \end{array}) = (\begin{matrix} ^{l} u_{i} -^{l} u_{0} \\ ^{l} v_{i} -^{l} v_{0} \end{matrix})

(3)

^{r} Δ m = (\begin{array}{l} ^{r} Δ u \\ ^{r} Δ v \end{array}) = (\begin{matrix} ^{r} u_{i} -^{r} u_{0} \\ ^{r} v_{i} -^{r} v_{0} \end{matrix})

(4)

3.2. Desired Pose Calculation for In Situ Gaze Point Tracking

When the range of target motion is large and the desired posture of the eyeball exceeds its reachable posture, the head and mobile robot move to keep the target in the center region of the image. In robotic systems, eye movements tend to consume the least amount of resources and do not have much impact on the stability of the head and mobile robot during exercise. Head rotation consumes more resources than the eyeball but consumes fewer resources than trunk rotation. At the same time, the rotation of the head affects the stability of the eyeball but does not have much impact on the stability of the trunk. Mobile robot rotation consumes the most resources and has a large impact on the stability of the head and eyeball. When tracking the target, one needs only to keep the target in the center region of the binocular image. Therefore, when performing gaze point tracking, the movement mechanism of the head, eyes and mobile robot are designed with the principle of minimal resource consumption and maximum system stability. When the eyeball can perceive the 3D coordinates of the target in the reachable and optimal viewing posture, only the eye is rotated; otherwise, the head is rotated. The head also has an attainable range of poses. When the desired pose exceeds this range, the mobile robot needs to be turned so that the bionic eye always perceives the 3D coordinates of the target in the optimal viewing position. Let ^hγ_p and ^hγ_t be the angles between the head and the gaze point in the X_hO_hZ_h and Y_hO_hZ_h planes, respectively. The range of binocular rotation in the horizontal direction is [−^eθ_pmax, ^eθ_pmax], and the range of binocular rotation in the vertical direction is [−^eθ_tmax, ^eθ_tmax]. The range of head rotation in the horizontal direction is [−^hθ_pmax, ^hθ_pmax], and the range of head rotation in the vertical direction is [−^hθ_tmax, ^hθ_tmax]. For the convenience of calculation, the angles between the head and the fixation point in the horizontal direction and the vertical direction are designated as [−^hγ_pmax, ^hγ_pmax] and [−^hγ_tmax, ^hγ_tmax], respectively. When the angle between the head and the target exceeds a set threshold, the head needs to be rotated to the

^{h} θ_{p}^{'}

and

^{h} θ_{t}^{'}

positions in the horizontal and vertical directions, respectively. When

^{h} θ_{p}^{'}

exceeds the angle that the head can attain, the angle at which the mobile robot needs to be compensated is ^wθ_p. In the in situ gaze point tracking task, the cart does not need to translate in the X_wO_wZ_w plane, so x_w = 0, and z_w = 0. Furthermore, according to the definition of the optimal observation pose of the bionic eye, the conditions that ^gf_q should satisfy are

^{g} f_{q} = (\begin{array}{l} ^{w} x_{mq} = 0 \\ ^{w} z_{mq} = 0 \\ ^{w} θ_{pq} = {θ | | θ | \leq 2 π,^{h} θ_{pq} + θ =^{h} θ_{p}^{'}} \\ ^{h} θ_{pq} = {θ | | θ | \leq^{h} θ_{pmax}, |^{h} γ_{p} | \leq^{h} γ_{pmax}} \\ ^{h} θ_{tq} = {θ | | θ | \leq^{h} θ_{tmax}, |^{h} γ_{t} | \leq^{h} γ_{tmax}} \\ ^{l} θ_{pq} =^{r} θ_{pq} = {θ | | θ | \leq^{e} θ_{pmax},^{l} {Δ m}_{l} = - Δ m_{r}} \\ ^{l} θ_{tq} =^{r} θ_{tq} = {θ | | θ | \leq^{e} θ_{tmax}, Δ m_{l} = - Δ m_{r}} \end{array})

(5)

The desired pose needs to be calculated based on the 3D coordinates of the target. Therefore, to obtain the desired pose, it is necessary to acquire the 3D coordinates of the target according to the current pose of the robot.

3.2.1. Three-Dimensional Coordinate Calculation

The mechanical structure and coordinate settings of the system are shown in Figure 6a. The principle of binocular stereoscopic 3D perception is shown in Figure 6b. E is the eye coordinate system, E_l is the left motion module’s end coordinate system, E_r is the right motion module’s end coordinate system, B_l is the left motion module’s base coordinate system, B_r is the right motion module’s base coordinate system, C_l is the left camera coordinate system and C_r is the right camera coordinate system. In the initial position, E_l coincides with B_l, and E_r overlaps with B_r. When the binocular system moves, the base coordinate system does not change. ^lT represents the transformation matrix of the eye coordinate system E to the left motion module’s base coordinate system B_l, ^rT represents the transformation matrix of E to B_r, ^lT_e represents the transformation matrix of B_l to E_l, ^rT_e represents the transformation matrix of B_r to E_r and ^lT_m represents the leftward motion. The module end coordinate system corresponds to the transformation matrix of the left camera coordinate system, and ^rT_m represents the transformation matrix of the right motion module’s end coordinate system to the right camera coordinate system. ^lT_r represents the transformation matrix of the right camera coordinate system to the left camera coordinate system at the initial position.

The origin ^lO_c of C_l lies at the optical center of the left camera, the ^lZ_c axis points in the direction of the object parallel to the optical axis of the camera, the ^lX_c axis points horizontally to the right along the image plane and the ^lY_c axis points vertically downward along the image plane. The origin ^rO_c of C_r lies at the optical center of the right camera, ^rZ_c is aligned with the direction of the object parallel to the optical axis of the camera, ^rX_c points horizontally to the right along the image plane and ^rY_c points vertically downward along the image plane. E_l’s origin ^lO_e is set at the intersection of the two rotation axes of the left motion module, ^lZ_e is perpendicular to the two rotation axes and points to the front of the platform, ^lX_e coincides with the vertical rotation axis and ^lY_e coincides with the horizontal rotation axis. Similarly, the origin ^rO_e of the coordinate system E_r is set at the intersection of the two rotation axes of the right motion module, ^rZ_e is perpendicular to the two rotation axes and points toward the front of the platform, ^rX_e coincides with the vertical rotation axis and ^rY_e coincides with the horizontal rotation axis.

The left motion module’s base coordinates system B_l coincides with the eye coordinate system E; thus, ^lT consists of an identity matrix. To calculate the 3D coordinates of the feature points in real time from the camera pose, it is necessary to calculate ^rT. At the initial position of the system, the external parameters ^lT_r of the left and right cameras are calibrated offline, as are the hand–eye parameters of the left–right motion module to the camera coordinate system.

When the system is in its initial configuration, the coordinates of point P in the eye coordinate system are P_e (x_e, y_e, z_e). Its coordinates in B_l are ^lP_e (^lx_e, ^ly_e, ^lz_e), and its coordinates ^lP_c (^lx_c, ^ly_c, ^lz_c) in C_l are

^{l} P_{c} =^{l} T_{m}^{- 1} P_{e}

(6)

The coordinates ^rP_e (^rx_e, ^ry_e, ^rz_e) of point P in B_r are

^{r} P_{e} =^{r} T P_{e}

(7)

The coordinates ^rP_c (^rx_c, ^ry_c, ^rz_c) of point P in C_r are

^{r} P_{c} =^{r} T_{m}^{- 1}^{r} T P_{e}

(8)

The point in C_r is transformed into C_l:

^{l} P_{c} =^{l} T_{r}^{r} T_{m}^{- 1}^{r} T P_{e}

(9)

Based on the Equations (6) and (9), ^rT is available:

^{r} T =^{r} T_{m}^{l} T_{r}^{- 1}^{l} T_{m}^{- 1}

(10)

During the movement of the system, when the left motion module rotates by ^lθ_p and ^lθ_t in the horizontal and vertical directions, respectively, the transformation relationship between B_l and E_l is

^{l} T_{e} = (\begin{matrix} Rot (Y,^{l} θ_{p}) Rot (X,^{l} θ_{t}) & 0 \\ 0 & 1 \end{matrix})

(11)

The coordinates of point P in C_l are

^{l} P_{e} =^{l} T_{m}^{- 1}^{l} T_{e} P_{w} =^{l} T_{d} P_{e}

(12)

Assume that

^{l} T_{d} = (\begin{matrix} ^{l} n_{x} & ^{l} o_{x} & ^{l} a_{x} & ^{l} p_{x} \\ ^{l} n_{y} & ^{l} o_{y} & ^{l} a_{y} & ^{l} p_{y} \\ ^{l} n_{z} & ^{l} o_{z} & ^{l} a_{z} & ^{l} p_{z} \\ 0 & 0 & 0 & 1 \end{matrix})

(13)

The point ^lP_1c (^lx_1c, ^ly_1c) at which line P^lO_c intersects ^lZ_c = 1 is

(\begin{matrix} ^{l} x_{1 c} \\ ^{l} y_{1 c} \end{matrix}) = (\begin{matrix} \frac{^{l} n_{x} x_{e} +^{l} o_{x} y_{e} +^{l} a_{x} z_{e} +^{l} p_{x}}{^{l} n_{z} x_{e} +^{l} o_{z} y_{e} +^{l} a_{z} z_{e} +^{l} p_{z}} \\ \frac{^{l} n_{y} x_{e} +^{l} o_{y} y_{e} +^{l} a_{y} z_{e} +^{l} p_{y}}{^{l} n_{z} x_{e} +^{l} o_{z} y_{e} +^{l} a_{z} z_{e} +^{l} p_{z}} \end{matrix})

(14)

The image coordinates of ^lP_1c in the left camera are m_l (u_l, v_l), (^lx_1c, ^ly_1c) and (u_l, v_l) and can be converted by the parameters of the camera. According to the camera’s internal parameter model, the following can be obtained:

(\begin{matrix} ^{l} x_{1 c} \\ ^{l} y_{1 c} \\ 1 \end{matrix}) =^{l} M_{in}^{- 1} (\begin{matrix} u_{l} \\ v_{l} \\ 1 \end{matrix})

(15)

where ^lM_in is the internal parameter matrix of the left camera. The value of (^lx_1c, ^ly_1c) can be obtained by the image coordinates of ^lP_1c, and the parameters of the left camera can be obtained by substituting (15) into (14):

\{\begin{cases} (^{l} n_{x} -^{l} x_{1 c}^{l} n_{z}) x_{e} + (^{l} o_{x} -^{l} x_{1 c}^{l} o_{z}) y_{e} + (^{l} a_{x} -^{l} x_{1 c}^{l} a_{z}) z_{e} +^{l} p_{x} -^{l} x_{1 c}^{l} p_{z} = 0 \\ (^{l} n_{y} -^{l} y_{1 c}^{l} n_{z}) x_{e} + (^{l} o_{y} -^{l} y_{1 c}^{l} o_{z}) y_{e} + (^{l} a_{y} -^{l} y_{1 c}^{l} a_{z}) z_{e} +^{l} p_{y} -^{l} y_{1 c}^{l} p_{z} = 0 \end{cases}

(16)

During the motion of the system, when the right motion module rotates through ^rθ_p and ^rθ_t in the horizontal and vertical directions, respectively, the transformation relationship between B_r and E_r is

^{r} T_{e} = (\begin{matrix} Rot (Y,^{r} θ_{p}) Rot (X,^{r} θ_{t}) & 0 \\ 0 & 1 \end{matrix})

(17)

The coordinates of point P in C_r are

^{r} P_{e} =^{r} T_{m}^{- 1}^{r} T_{e}^{r} T P_{e} =^{r} T_{d} P_{e}

(18)

Assume that

^{r} T_{d} = (\begin{matrix} ^{r} n_{x} & ^{r} o_{x} & ^{r} a_{x} & ^{r} p_{x} \\ ^{r} n_{y} & ^{r} o_{y} & ^{r} a_{y} & ^{r} p_{y} \\ ^{r} n_{z} & ^{r} o_{z} & ^{r} a_{z} & ^{r} p_{z} \\ 0 & 0 & 0 & 1 \end{matrix})

(19)

The point ^lP_1c (^rx_1c, ^ry_1c) at which line P^rO_c intersects ^rZ_c = 1 is

(\begin{matrix} ^{r} x_{1 c} \\ ^{r} y_{1 c} \end{matrix}) = (\begin{matrix} \frac{^{r} n_{x} x_{e} +^{r} o_{x} y_{e} +^{r} a_{x} z_{e} +^{r} p_{x}}{^{r} n_{z} x_{e} +^{r} o_{z} y_{e} +^{r} a_{z} z_{e} +^{r} p_{z}} \\ \frac{^{r} n_{y} x_{e} +^{r} o_{y} y_{e} +^{r} a_{y} z_{e} +^{r} p_{y}}{^{r} n_{z} x_{e} +^{r} o_{z} y_{e} +^{r} a_{z} z_{e} +^{r} p_{z}} \end{matrix})

(20)

The image coordinates of ^rP_1c in the camera, namely, m_r (u_r, v_r), (^rx_1c, ^ry_1c) and (u_r, v_r), can be converted using the parameters of the camera. According to the camera’s internal parameter model, the following can be obtained:

(\begin{matrix} ^{r} x_{1 c} \\ ^{r} y_{1 c} \\ 1 \end{matrix}) =^{l} M_{in}^{- 1} (\begin{matrix} u_{r} \\ v_{r} \\ 1 \end{matrix})

(21)

where ^rM_in is the inner parameter matrix of the right camera. The value of (^rx_1c, ^ry_1c) can be obtained by the image coordinates of ^rP_1c and the parameters in the camera, and the following can be obtained by substituting (21) into (20):

\{\begin{cases} (^{r} n_{x} -^{r} x_{1 c}^{r} n_{z}) x_{e} + (^{r} o_{x} -^{r} x_{1 c}^{r} o_{z}) y_{e} + (^{r} a_{x} -^{r} x_{1 c}^{r} a_{z}) z_{e} +^{r} p_{x} -^{r} x_{1 c}^{r} p_{z} = 0 \\ (^{r} n_{y} -^{r} y_{1 c}^{r} n_{z}) x_{e} + (^{r} o_{y} -^{r} y_{1 c}^{r} o_{z}) y_{e} + (^{r} a_{y} -^{r} y_{1 c}^{r} a_{z}) z_{e} +^{r} p_{y} -^{r} y_{1 c}^{r} p_{z} = 0 \end{cases}

(22)

Four equations can be obtained from Equations (16) and (22) for x_e, y_e and z_e, and the 3D coordinates of point P_e can be calculated by the least squares method.

The 3D coordinates P_h (x_h, y_h, z_h) in the head coordinate system can be obtained by Equation (23). d_x and d_y are illustrated in Figure 4.

(\begin{matrix} x_{h} \\ y_{h} \\ z_{h} \end{matrix}) = (\begin{matrix} x_{e} - d_{x} \\ y_{e} - d_{y} \\ z_{e} \end{matrix})

(23)

Let the angles at which the current moment of the head rotate relative to the initial position be ^hθ_pi and ^hθ_ti; the coordinates of the target in the robot coordinate system are

(\begin{matrix} \begin{matrix} ^{w} x_{m} \\ ^{w} y_{m} \\ ^{w} z_{m} \end{matrix} \\ 1 \end{matrix}) = {(\begin{matrix} Rot (X,^{h} θ_{t i}) Rot (Y,^{h} θ_{p i}) & 0 \\ 0 & 1 \end{matrix})}^{- 1} (\begin{matrix} \begin{matrix} x_{h} \\ y_{h} \\ z_{h} \end{matrix} \\ 1 \end{matrix})

(24)

According to the 3D coordinates of the target in the head coordinate system, the angle between the target and Z_h in the horizontal direction and the vertical direction can be obtained as follows:

^{h} γ_{p} = \arctan (\frac{x_{h}}{z_{h}})

(25)

^{h} γ_{t} = \arctan (\frac{y_{h}}{z_{h}})

(26)

When ^hγ_p and ^hγ_t exceed a set threshold, the head needs to rotate. To leave a certain margin for the rotation of the eyeball and for the convenience of calculation, the angles required for the head to rotate in the horizontal direction and the vertical direction are calculated by the principle shown in Figure 7a,b, respectively. Figure 7a shows the calculation principle of the horizontal direction angle when the target’s x coordinates of the head coordinate system is greater than zero. After the head is rotated to

^{h} θ_{p}^{'}

, the target point is on the ^lZ_e axis of the left motion module end coordinate system, and the left motion module reaches the maximum rotatable threshold ^eθ_pmax. Figure 7b shows the calculation principle of the vertical direction when the target’s y coordinates of the head coordinate system are greater than d_y. After the head is rotated to

^{h} θ_{t}^{'}

, the target point is on the Z_e axis of the eye coordinate system, and the eye reaches the maximum threshold ^eθ_tmax that can be rotated.

3.2.2. Horizontal Rotation Angle Calculation

Let the current angle of the head in the horizontal direction be ^hθ_pi. When the head is rotated in the horizontal direction to

^{h} θ_{p}^{'}

, the 3D coordinates of the target in the new head coordinate system are

(\begin{matrix} \begin{matrix} x_{h}^{'} \\ y_{h}^{'} \\ z_{h}^{'} \end{matrix} \\ 1 \end{matrix}) = (\begin{matrix} \cos (^{h} θ_{p}^{'} -^{h} θ_{p i}) & 0 & - \sin (^{h} θ_{p}^{'} -^{h} θ_{p i}) & 0 \\ 0 & 1 & 0 & 0 \\ \sin (^{h} θ_{p}^{'} -^{h} θ_{p i}) & 0 & \cos (^{h} θ_{p}^{'} -^{h} θ_{p i}) & 0 \\ 0 & 0 & 0 & 1 \end{matrix}) (\begin{matrix} \begin{matrix} x_{h} \\ y_{h} \\ z_{h} \end{matrix} \\ 1 \end{matrix})

(27)

Therefore,

(\begin{matrix} x_{h}^{'} \\ y_{h}^{'} \\ z_{h}^{'} \end{matrix}) = (\begin{matrix} x_{h} \cos (^{h} θ_{p}^{'} -^{h} θ_{p i}) - z_{h} \sin (^{h} θ_{p}^{'} -^{h} θ_{p i}) \\ y_{h} \\ x_{h} \sin (^{h} θ_{p}^{'} -^{h} θ_{p i}) + z_{h} \cos (^{h} θ_{p}^{'} -^{h} θ_{p i}) \end{matrix})

(28)

The coordinates of the target in the new eye coordinate system are

(\begin{matrix} ^{e} x_{h}^{'} \\ ^{e} y_{h}^{'} \\ ^{e} z_{h}^{'} \end{matrix}) = (\begin{matrix} d_{x} + x_{h} \cos (^{h} θ_{p}^{'} -^{h} θ_{p i}) - z_{h} \sin (^{h} θ_{p}^{'} -^{h} θ_{p i}) \\ y_{h} + d_{y} \\ x_{h} \sin (^{h} θ_{p}^{'} -^{h} θ_{p i}) + z_{h} \cos (^{h} θ_{p}^{'} -^{h} θ_{p i}) \end{matrix})

(29)

After turning, the left motion module reaches the maximum threshold ^eθ_pmax that can be rotated, so that

\tan (^{e} θ_{pmax}) = \frac{^{e} z_{h}^{'}}{^{e} x_{h}^{'}} = \frac{x_{h} \sin (^{h} θ_{p}^{'} -^{h} θ_{p i}) + z_{h} \cos (^{h} θ_{p}^{'} -^{h} θ_{p i})}{d_{x} + x_{h} \cos (^{h} θ_{p}^{'} -^{h} θ_{p i}) - z_{h} \sin (^{h} θ_{p}^{'} -^{h} θ_{p i})}

(30)

Simplifying Equation (30), we have

\sin (^{h} θ_{p}^{'} -^{h} θ_{p i}) = \frac{d_{x} \tan (^{e} θ_{qmax})}{x_{h} + z_{h} \tan (^{e} θ_{qmax})} + \frac{x_{h} \tan (^{e} θ_{qmax}) - z_{h}}{x_{h} + z_{h} \tan (^{e} θ_{qmax})} \cos (^{h} θ_{p}^{'} -^{h} θ_{p i})

(31)

Assume that

\{\begin{cases} k_{1} = \frac{x_{h} \tan (^{e} θ_{qmax}) - z_{h}}{x_{h} + z_{h} \tan (^{e} θ_{qmax})} \\ k_{2} = \frac{d_{x} \tan (^{e} θ_{qmax})}{x_{h} + z_{h} \tan (^{e} θ_{qmax})} \end{cases}

(32)

According to the triangular relationship,

{[k_{1} \cos (^{h} θ_{p}^{'} -^{h} θ_{p i}) + k_{2}]}^{2} + \cos^{2} (^{h} θ_{p}^{'} -^{h} θ_{p i}) = 1

(33)

The solution of Equation (33) is

\cos (^{h} θ_{p}^{'} -^{h} θ_{p i}) = \frac{- k_{1} k_{2} \pm \sqrt{k_{1}^{2} - k_{2}^{2} + 1}}{k_{1}^{2} + 1}

(34)

Therefore,

^{h} θ_{p}^{'} =^{h} θ_{p i} + \arccos (\frac{- k_{1} k_{2} \pm \sqrt{k_{1}^{2} - k_{2}^{2} + 1}}{k_{1}^{2} + 1})

(35)

Equation (35) has two solutions; therefore, we choose the solution in which the deviation e of Equation (36) is minimized:

e = |\tan (^{e} θ_{qmax}) - \frac{x_{h} \sin (^{h} θ_{p}^{'} -^{h} θ_{p i}) + z_{h} \cos (^{h} θ_{p}^{'} -^{h} θ_{p i})}{d_{x} + x_{h} \cos (^{h} θ_{p}^{'} -^{h} θ_{p i}) - z_{h} \sin (^{h} θ_{p}^{'} -^{h} θ_{p i})}|

(36)

When the obtained ^h

θ_{p}^{'}

is outside of the range [−^hθ_pmax, ^hθ_pmax], the value of ^hθ_pq is

^{h} θ_{pq} = \{\begin{cases} ^{h} θ_{pmax} & , & ^{h} θ_{p}^{'} \geq^{h} θ_{pmax} \\ -^{h} θ_{pmax} & , & ^{h} θ_{p}^{'} \leq -^{h} θ_{pmax} \\ ^{h} θ_{p}^{'} & , & else \end{cases}

(37)

Finally, one can obtain the ^wθ_pq value:

^{w} θ_{pq} = \{\begin{cases} ^{h} θ_{p}^{'} -^{h} θ_{pmax} & , & ^{h} θ_{p}^{'} >^{h} θ_{pmax} \\ ^{h} θ_{p}^{'} +^{h} θ_{pmax} & , & ^{h} θ_{p}^{'} < -^{h} θ_{p \max} \\ 0 & , & else \end{cases}

(38)

Based on the same principle, when the x coordinate of the target in the head coordinate system is less than 0, the coordinates of the target in the right motion module base coordinate system after the rotation are

(\begin{matrix} ^{r} x_{e}^{'} \\ ^{r} y_{e}^{'} \\ ^{r} z_{e}^{'} \\ 1 \end{matrix}) = (\begin{matrix} x_{h} \cos (^{h} θ_{p}^{'} -^{h} θ_{p i}) - z_{h} \sin (^{h} θ_{p}^{'} -^{h} θ_{p i}) - d_{x} \\ y_{h} + d_{y} \\ x_{h} \sin (^{h} θ_{p}^{'} -^{h} θ_{p i}) + z_{h} \cos (^{h} θ_{p}^{'} -^{h} θ_{p i}) \\ 1 \end{matrix})

(39)

After turning, the right motion module reaches −^eθ_pmax, and the following can be obtained:

\tan (-^{e} θ_{qmax}) = \frac{^{r} z_{e}^{'}}{^{r} x_{e}^{'}} = \frac{x_{h} \sin (^{h} θ_{p}^{'} -^{h} θ_{p i}) + z_{h} \cos (^{h} θ_{p}^{'} -^{h} θ_{p i})}{x_{h} \cos (^{h} θ_{p}^{'} -^{h} θ_{p i}) - z_{h} \sin (^{h} θ_{p}^{'} -^{h} θ_{p i}) - d_{x}}

(40)

We simplify Equation (40) as follows:

\sin (^{h} θ_{p}^{'} -^{h} θ_{p i}) = \frac{d_{x} \tan (^{e} θ_{qmax})}{x_{h} - z_{h} \tan (^{e} θ_{qmax})} - \frac{x_{h} \tan (^{e} θ_{qmax}) + z_{h}}{x_{h} - z_{h} \tan (^{e} θ_{qmax})} \cos (^{h} θ_{p}^{'} -^{h} θ_{p i})

(41)

Let

\{\begin{cases} k_{1}^{'} = - \frac{x_{h} \tan (^{e} θ_{qmax}) + z_{h}}{x_{h} - z_{h} \tan (^{e} θ_{qmax})} \\ k_{2}^{'} = \frac{d_{x} \tan (^{e} θ_{qmax})}{x_{h} - z_{h} \tan (^{e} θ_{qmax})} \end{cases}

(42)

The same two solutions are available:

^{h} θ_{p}^{'} =^{h} θ_{p i} + \arccos (\frac{- k_{1}^{'} k_{2}^{'} \pm \sqrt{{(k_{1}^{'})}^{2} - {(k_{2}^{'})}^{2} + 1}}{{(k_{1}^{'})}^{2} + 1})

(43)

Select the solution in which the deviation e of Equation (44) is minimized:

e = |- \tan (^{e} θ_{qmax}) - \frac{x_{h} \sin (^{h} θ_{p}^{'} -^{h} θ_{p i}) + z_{h} \cos (^{h} θ_{p}^{'} -^{h} θ_{p i})}{x_{h} \cos (^{h} θ_{p}^{'} -^{h} θ_{p i}) - z_{h} \sin (^{h} θ_{p}^{'} -^{h} θ_{p i}) - d_{x}}|

(44)

Using Equations (37) and (38), ^hθ_pq and ^wθ_pq can be obtained.

3.2.3. Vertical Rotation Angle Calculation

When the target’s y coordinate in the head coordinate system is greater than d_y, the current angle of the head in the vertical direction is ^hθ_ti, and when the head is rotated in the vertical direction to

^{h} θ_{t}^{'}

, the target is in the new head coordinate system. The 3D coordinates are

(\begin{matrix} \begin{matrix} x_{h}^{'} \\ y_{h}^{'} \\ z_{h}^{'} \end{matrix} \\ 1 \end{matrix}) = (\begin{matrix} 1 & 0 & 0 & 0 \\ 0 & \cos (^{h} θ_{t}^{'} -^{h} θ_{t i}) & \sin (^{h} θ_{t}^{'} -^{h} θ_{t i}) & 0 \\ 0 & - \sin (^{h} θ_{t}^{'} -^{h} θ_{t i}) & \cos (^{h} θ_{t}^{'} -^{h} θ_{t i}) & 0 \\ 0 & 0 & 0 & 1 \end{matrix}) (\begin{matrix} \begin{matrix} x_{h} \\ y_{h} \\ z_{h} \end{matrix} \\ 1 \end{matrix})

(45)

Therefore,

(\begin{matrix} x_{h}^{'} \\ y_{h}^{'} \\ z_{h}^{'} \end{matrix}) = (\begin{matrix} x_{h} \\ y_{h} \cos (^{h} θ_{t}^{'} -^{h} θ_{t i}) + z_{h} \sin (^{h} θ_{t}^{'} -^{h} θ_{t i}) \\ z_{h} \cos (^{h} θ_{t}^{'} -^{h} θ_{t i}) - y_{h} \sin (^{h} θ_{t}^{'} -^{h} θ_{t i}) \end{matrix})

(46)

Using Equation (29), the coordinates of the eye coordinate system after the rotation of the target can be calculated:

(\begin{matrix} ^{e} x_{h}^{'} \\ ^{e} y_{h}^{'} \\ ^{e} z_{h}^{'} \end{matrix}) = (\begin{matrix} d_{x} + x_{h} \\ y_{h} \cos (^{h} θ_{t}^{'} -^{h} θ_{t i}) + z_{h} \sin (^{h} θ_{t}^{'} -^{h} θ_{t i}) + d_{y} \\ z_{h} \cos (^{h} θ_{t}^{'} -^{h} θ_{t i}) - y_{h} \sin (^{h} θ_{t}^{'} -^{h} θ_{t i}) \end{matrix})

(47)

After rotation, the left and right motion modules reach the rotatable maximum value ^eθ_tmax in the vertical direction, so that

\tan (^{e} θ_{tmax}) = \frac{^{e} y_{h}^{'}}{^{e} z_{h}^{'}} = \frac{y_{h} \cos (^{h} θ_{t}^{'} -^{h} θ_{t i}) + z_{h} \sin (^{h} θ_{t}^{'} -^{h} θ_{t i}) + d_{y}}{z_{h} \cos (^{h} θ_{t}^{'} -^{h} θ_{t i}) - y_{h} \sin (^{h} θ_{t}^{'} -^{h} θ_{t i})}

(48)

Simplifying Equation (48), we obtain

\sin (^{h} θ_{t}^{'} -^{h} θ_{t i}) = \frac{z_{h} \tan (^{e} θ_{tmax}) - y_{h}}{z_{h} + y_{h} \tan (^{e} θ_{tmax})} \cos (^{h} θ_{t}^{'} -^{h} θ_{t i}) - \frac{d_{y}}{z_{h} + y_{h} \tan (^{e} θ_{tmax})}

(49)

Let

\{\begin{cases} k_{1} = \frac{z_{h} \tan (^{e} θ_{tmax}) - y_{h}}{z_{h} + y_{h} \tan (^{e} θ_{tmax})} \\ k_{2} = - \frac{d_{y}}{z_{h} + y_{h} \tan (^{e} θ_{tmax})} \end{cases}

(50)

Therefore,

^{h} θ_{t}^{'} =^{h} θ_{t i} + \arccos (\frac{- k_{1} k_{2} \pm \sqrt{k_{1}^{2} - k_{2}^{2} + 1}}{k_{1}^{2} + 1})

(51)

Equation (51) has two solutions; therefore, we choose the solution in which the deviation e of Equation (52) is minimized:

e = |\tan (^{e} θ_{tmax}) - \frac{y_{h} \cos (^{h} θ_{t}^{'} -^{h} θ_{t i}) + z_{h} \sin (^{h} θ_{t}^{'} -^{h} θ_{t i}) + d_{y}}{z_{h} \cos (^{h} θ_{t}^{'} -^{h} θ_{t i}) - y_{h} \sin (^{h} θ_{t}^{'} -^{h} θ_{t i})}|

(52)

Similarly, when the target’s y coordinates in the head coordinate system are less than d_y, we have

\tan (-^{e} θ_{tmax}) = \frac{^{e} y_{h}^{'}}{^{e} z_{h}^{'}} = \frac{y_{h} \cos (^{h} θ_{t}^{'} -^{h} θ_{t i}) + z_{h} \sin (^{h} θ_{t}^{'} -^{h} θ_{t i}) + d_{y}}{z_{h} \cos (^{h} θ_{t}^{'} -^{h} θ_{t i}) - y_{h} \sin (^{h} θ_{t}^{'} -^{h} θ_{t i})}

(53)

\sin (^{h} θ_{t}^{'} -^{h} θ_{t i}) = - \frac{z_{h} \tan (^{e} θ_{tmax}) + y_{h}}{z_{h} - y_{h} \tan (^{e} θ_{tmax})} \cos (^{h} θ_{t}^{'} -^{h} θ_{t i}) - \frac{d_{y}}{z_{h} - y_{h} \tan (^{e} θ_{tmax})}

(54)

\{\begin{cases} k_{1} = - \frac{z_{h} \tan (^{e} θ_{tmax}) + y_{h}}{z_{h} - y_{h} \tan (^{e} θ_{tmax})} \\ k_{2} = - \frac{d_{y}}{z_{h} - y_{h} \tan (^{e} θ_{tmax})} \end{cases}

(55)

e = |- \tan (^{e} θ_{tmax}) - \frac{y_{h} \cos (^{h} θ_{t}^{'} -^{h} θ_{t i}) + z_{h} \sin (^{h} θ_{t}^{'} -^{h} θ_{t i}) + d_{y}}{z_{h} \cos (^{h} θ_{t}^{'} -^{h} θ_{t i}) - y_{h} \sin (^{h} θ_{t}^{'} -^{h} θ_{t i})}|

(56)

When the obtained

^{h} θ_{t}^{'}

is outside of the range [−^hθ_tmax, ^hθ_tmax], the value of ^hθ_tq is

^{h} θ_{tq} = \{\begin{cases} ^{h} θ_{tmax} & , & ^{h} θ_{t}^{'} \geq^{h} θ_{tmax} \\ -^{h} θ_{tmax} & , & ^{h} θ_{t}^{'} \leq -^{h} θ_{tmax} \\ ^{h} θ_{t}^{'} & , & else \end{cases}

(57)

After obtaining ^hθ_pq, ^hθ_tq and ^wθ_pq,

P_{e}^{'} (x_{e}^{'}, y_{e}^{'}, z_{e}^{'})

are the coordinates of the target in the eye coordinate system after the mobile robot and the head are rotated:

(\begin{matrix} \begin{matrix} x_{e}^{'} \\ y_{e}^{'} \\ z_{e}^{'} \end{matrix} \\ 1 \end{matrix}) = (\begin{matrix} Rot (X,^{h} θ_{tq}) Rot (Y,^{h} θ_{pq}) & 0 \\ 0 & 1 \end{matrix}) (\begin{matrix} Rot (Y,^{w} θ_{pq}) & 0 \\ 0 & 1 \end{matrix}) (\begin{matrix} \begin{matrix} x_{w} \\ y_{w} \\ z_{w} \end{matrix} \\ 1 \end{matrix}) + (\begin{matrix} d_{x} \\ d_{y} \\ 0 \\ 0 \end{matrix})

(58)

The desired observation pose of the eye, characterized by ^lθ_tq, ^lθ_pq, ^rθ_tq and ^rθ_pq, can be obtained using the method described in the following section.

3.2.4. Calculation of the Desired Observation Poses of the Eye

According to Formula (2), ^lθ_tq = ^rθ_tq = θ_t, and ^lθ_pq = ^rθ_pq = θ_p.

The inverse of the hand–eye matrix of the left camera and left motion module end coordinate system is

^{l} T_{m}^{- 1} = (\begin{matrix} ^{l} n_{x} & ^{l} o_{x} & ^{l} a_{x} & ^{l} p_{x} \\ ^{l} n_{y} & ^{l} o_{y} & ^{l} a_{y} & ^{l} p_{y} \\ ^{l} n_{z} & ^{l} o_{z} & ^{l} a_{z} & ^{l} p_{z} \\ 0 & 0 & 0 & 1 \end{matrix})

(59)

The coordinate ^lP_c (^lx_c, ^ly_c, ^lz_c) of

P_{e}^{'} (x_{e}^{'}, y_{e}^{'}, z_{e}^{'})

in the left camera coordinate system satisfies the following relationship:

(\begin{matrix} ^{l} P_{c} \\ 1 \end{matrix}) =^{l} T_{m}^{- 1} (\begin{matrix} Rot (X, -^{l} θ_{t}) & 0 \\ 0 & 1 \end{matrix}) (\begin{matrix} Rot (Y, -^{l} θ_{p}) & 0 \\ 0 & 1 \end{matrix}) (\begin{matrix} P_{e}^{'} \\ 1 \end{matrix})

(60)

According to the small hole imaging model, the imaging coordinates of the

P_{e}^{'} (x_{e}^{'}, y_{e}^{'}, z_{e}^{'})

point in the left camera are

(\begin{matrix} ^{l} u \\ ^{l} v \\ 1 \end{matrix}) =^{l} M_{in}^{l} P_{1 c} = (\begin{matrix} ^{l} k_{x} & 0 & ^{l} u_{0} \\ 0 & ^{l} k_{y} & ^{l} v_{0} \\ 0 & 0 & 1 \end{matrix}) (\begin{matrix} ^{l} x_{c} /^{l} z_{c} \\ ^{l} y_{c} /^{l} z_{c} \\ 1 \end{matrix}) = (\begin{matrix} ^{l} k_{x}^{l} x_{c} /^{l} z_{c} +^{l} u_{0} \\ ^{l} k_{y}^{l} y_{c} /^{l} z_{c} +^{l} v_{0} \\ 1 \end{matrix})

(61)

Substituting Equation (61) into Equation (2), we obtain

(\begin{matrix} ^{l} Δ u \\ ^{l} Δ v \end{matrix}) = (\begin{matrix} ^{l} k_{x}^{l} x_{c} /^{l} z_{c} \\ ^{l} k_{y}^{l} y_{c} /^{l} z_{c} \\ 1 \end{matrix})

(62)

Based on the same principle, the coordinate ^rP_c (^rx_c, ^ry_c, ^rz_c) of

P_{e}^{'} (x_{e}^{'}, y_{e}^{'}, z_{e}^{'})

in the right camera coordinate system is

(\begin{matrix} ^{r} P_{c} \\ 1 \end{matrix}) =^{r} T_{m}^{- 1} (\begin{matrix} Rot (X, -^{r} θ_{t}) & 0 \\ 0 & 1 \end{matrix}) (\begin{matrix} Rot (Y, -^{r} θ_{p}) & 0 \\ 0 & 1 \end{matrix})^{r} T_{e} (\begin{matrix} P_{e}^{'} \\ 1 \end{matrix})

(63)

The imaging coordinates of point

P_{e}^{'} (x_{e}^{'}, y_{e}^{'}, z_{e}^{'})

in the right camera are

(\begin{matrix} ^{r} u \\ ^{r} v \\ 1 \end{matrix}) =^{r} M_{in}^{r} P_{1 c} = (\begin{matrix} ^{r} k_{x} & 0 & ^{r} u_{0} \\ 0 & ^{r} k_{y} & ^{r} v_{0} \\ 0 & 0 & 1 \end{matrix}) (\begin{matrix} ^{r} x_{c} /^{r} z_{c} \\ ^{r} y_{c} /^{r} z_{c} \\ 1 \end{matrix}) = (\begin{matrix} ^{r} k_{x}^{r} x_{c} /^{r} z_{c} +^{r} u_{0} \\ ^{r} k_{y}^{r} y_{c} /^{r} z_{c} +^{r} v_{0} \\ 1 \end{matrix})

(64)

(\begin{matrix} ^{r} Δ u \\ ^{r} Δ v \end{matrix}) = (\begin{matrix} ^{r} k_{x}^{r} x_{c} /^{r} z_{c} \\ ^{r} k_{y}^{r} y_{c} /^{r} z_{c} \\ 1 \end{matrix})

(65)

By Equations (2), (62) and (65), two equations related to θ_t and θ_p (see Appendix C for the complete equations) can be obtained. It is challenging to calculate the values of θ_t and θ_p directly from these two equations, however. To obtain a solution, we consider a suboptimal observation pose and use this pose as the initial value; then, we use the trial-and-error method to obtain the optimal observation pose. When θ_t is calculated, let θ_p = 0; the solution of θ_t can then be obtained by Δv_l = −Δv_r. When θ_p is calculated, the solution of θ_p is solved by Δu_l = −Δu_r. The solution P_ob (θ_t, θ_t, θ_p, θ_p) is a suboptimal observed pose. Based on the suboptimal observation pose, the trial-and-error method can be used to obtain the optimal solution with the smallest error. The range of θ_t is [−θ_tmax, θ_tmax]. The range of θ_p is [−θ_pmax, θ_pmax].

According to Equations (60) and (63), let θ_p be equal to 0 to obtain

(\begin{matrix} ^{l} P_{c} \\ 1 \end{matrix}) = {(^{l} T_{m})}^{- 1} (\begin{matrix} Rot (X, - θ_{t}) & 0 \\ 0 & 1 \end{matrix})^{l} T_{e} (\begin{matrix} P_{e}^{'} \\ 1 \end{matrix})

(66)

The following result is also available:

(\begin{matrix} ^{r} P_{c} \\ 1 \end{matrix}) = {(^{r} T_{m})}^{- 1} (\begin{matrix} Rot (X, - θ_{t}) & 0 \\ 0 & 1 \end{matrix})^{r} T_{e} (\begin{matrix} P_{e}^{'} \\ 1 \end{matrix})

(67)

The base coordinate system of the left motion module is the world coordinate system. Therefore, ^lT_w is a unit matrix. To simplify the calculation, we have

(\begin{matrix} ^{r} P_{e} \\ 1 \end{matrix}) = (\begin{matrix} ^{r} x_{e}^{'} \\ ^{r} y_{e}^{'} \\ ^{r} z_{e}^{'} \\ 1 \end{matrix}) =^{r} T_{e} (\begin{matrix} P_{e}^{'} \\ 1 \end{matrix})

(68)

According to the calculation principle of Section 3.2.1, we have the following:

\{\begin{cases} Δ u_{l} =^{l} f_{x} \frac{(^{l} o_{x} y_{e}^{'} +^{l} a_{x} z_{e}^{'}) \cos θ_{t} + (^{l} o_{x} z_{e}^{'} -^{l} a_{x} y_{e}^{'}) \sin θ_{t} + (^{l} p_{x} +^{l} n_{x} x_{e}^{'})}{(^{l} o_{z} y_{e}^{'} +^{l} a_{z} z_{e}^{'}) \cos θ_{t} + (^{l} o_{z} z_{e}^{'} -^{l} a_{z} y_{e}^{'}) \sin θ_{t} + (^{l} p_{z} +^{l} n_{z} x_{e}^{'})} \\ Δ v_{l} =^{l} f_{y} \frac{(^{l} o_{y} y_{e}^{'} +^{l} a_{y} z_{e}^{'}) \cos θ_{t} + (^{l} o_{y} z_{e}^{'} -^{l} a_{y} y_{e}^{'}) \sin θ_{t} + (^{l} p_{y} +^{l} n_{y} x_{e}^{'})}{(^{l} o_{z} y_{e}^{'} +^{l} a_{z} z_{e}^{'}) \cos θ_{t} + (^{l} o_{z} z_{e}^{'} -^{l} a_{z} y_{e}^{'}) \sin θ_{t} + (^{l} p_{z} +^{l} n_{z} x_{e}^{'})} \end{cases}

(69)

\{\begin{cases} Δ u_{r} =^{r} f_{x} \frac{(^{r} o_{x}^{r} y_{e}^{'} +^{r} a_{x}^{r} z_{e}^{'}) \cos θ_{t} + (^{r} o_{x}^{r} z_{e}^{'} -^{r} a_{x}^{r} y_{e}^{'}) \sin θ_{t} + (^{r} p_{x} +^{r} n_{x}^{r} x_{e}^{'})}{(^{r} o_{z}^{r} y_{e}^{'} +^{r} a_{z}^{r} z_{e}^{'}) \cos θ_{t} + (^{r} o_{z} z_{e}^{'} -^{r} a_{z}^{r} y_{e}^{'}) \sin θ_{t} + (^{r} p_{z} +^{r} n_{z}^{r} x_{e}^{'})} \\ Δ v_{r} =^{r} f_{y} \frac{(^{r} o_{y}^{r} y_{e}^{'} +^{r} a_{y}^{r} z_{e}^{'}) \cos θ_{t} + (^{r} o_{y}^{r} z_{e}^{'} -^{r} a_{y}^{r} y_{e}^{'}) \sin θ_{t} + (^{r} p_{y} +^{r} n_{y}^{r} x_{e}^{'})}{(^{r} o_{z}^{r} y_{e}^{'} +^{r} a_{z}^{r} z_{e}^{'}) \cos θ_{t} + (^{r} o_{z}^{r} z_{e}^{'} -^{r} a_{z}^{r} y_{e}^{'}) \sin θ_{t} + (^{r} p_{z} +^{r} n_{z}^{r} x_{e}^{'})} \end{cases}

(70)

Assume the following:

E_{sv} = |Δ v_{l}| + |Δ v_{r}|

(71)

The solution to θ_t that keeps the target at the center of the two cameras needs to satisfy the following conditions:

\{\begin{cases} Δ v_{l} + Δ v_{r} = 0 \\ - θ_{tmax} \leq θ_{t} \leq θ_{tmax} \\ θ_{t} = \arg \min (E_{sv}) \end{cases}

(72)

Substituting the second equation of Equations (69) and (70) into Equation (72) and solving the equation, we have

k_{1} \cos^{2} θ_{t} + k_{2} \sin^{2} θ_{t} + k_{3} \sin θ_{t} \cos θ_{t} + k_{4} \cos θ_{t} + k_{5} \sin θ_{t} + k_{6} = 0

(73)

where

k_{1}, k_{2}, k_{3}, k_{4}, k_{5}

are

k_{1} =^{l} f_{y} (^{l} o_{y} y_{e}^{'} +^{l} a_{y} z_{e}^{'}) (^{r} o_{z}^{r} y_{e}^{'} +^{r} a_{z}^{r} z_{e}^{'}) +^{r} f_{y} (^{l} o_{z} y_{e}^{'} +^{l} a_{z} z_{e}^{'}) (^{r} o_{y}^{r} y_{e}^{'} +^{r} a_{y}^{r} z_{e}^{'})

(74)

k_{2} =^{l} f_{y} (^{l} o_{y} z_{e}^{'} -^{l} a_{y} y_{e}^{'}) (^{r} o_{z}^{r} z_{e}^{'} -^{r} a_{z}^{r} y_{e}^{'}) +^{r} f_{y} (^{l} o_{z} z_{e}^{'} -^{l} a_{z} y_{e}^{'}) (^{r} o_{y}^{r} z_{e}^{'} -^{r} a_{y}^{r} y_{e}^{'})

(75)

\begin{array}{r} k_{3} =^{l} f_{y} (^{l} o_{y} y_{e}^{'} +^{l} a_{y} z_{e}^{'}) (^{r} o_{z}^{r} z_{e}^{'} -^{r} a_{z}^{r} y_{e}^{'}) +^{l} f_{y} (^{l} o_{y} z_{e}^{'} -^{l} a_{y} y_{e}^{'}) (^{r} o_{z}^{r} y_{e}^{'} +^{r} a_{z}^{r} z_{e}^{'}) \\ +^{r} f_{y} (^{l} o_{z} y_{e}^{'} +^{l} a_{z} z_{e}^{'}) (^{r} o_{y}^{r} z_{e}^{'} -^{r} a_{y}^{r} y_{e}^{'}) +^{r} f_{y} (^{l} o_{z} z_{e}^{'} -^{l} a_{z} y_{e}^{'}) (^{r} o_{y}^{r} y_{e}^{'} +^{r} a_{y}^{r} z_{e}^{'}) \end{array}

(76)

\begin{array}{r} k_{4} =^{l} f_{y} (^{l} o_{y} y_{e}^{'} +^{l} a_{y} z_{e}^{'}) (^{r} p_{z} +^{r} n_{z}^{r} x_{e}^{'}) +^{l} f_{y} (^{l} p_{y} +^{l} n_{y} x_{e}^{'}) (^{r} o_{z}^{r} y_{e}^{'} +^{r} a_{z}^{r} z_{e}^{'}) \\ +^{r} f_{y} (^{l} o_{z} y_{e}^{'} +^{l} a_{z} z_{e}^{'}) (^{r} p_{y} +^{r} n_{y}^{r} x_{e}^{'}) +^{r} f_{y} (^{l} p_{z} +^{l} n_{z} x_{e}^{'}) (^{r} o_{y}^{r} y_{e}^{'} +^{r} a_{y}^{r} z_{e}^{'}) \end{array}

(77)

\begin{array}{r} k_{5} =^{l} f_{y} (^{l} o_{y} z_{e}^{'} -^{l} a_{y} y_{e}^{'}) (^{r} p_{z} +^{r} n_{z}^{r} x_{e}^{'}) +^{l} f_{y} (^{l} p_{y} +^{l} n_{y} x_{e}^{'}) (^{r} o_{z}^{r} z_{e}^{'} -^{r} a_{z}^{r} y_{e}^{'}) \\ +^{r} f_{y} (^{l} o_{z} z_{e}^{'} -^{l} a_{z} y_{e}^{'}) (^{r} p_{y} +^{r} n_{y}^{r} x_{e}^{'}) +^{r} f_{y} (^{l} p_{z} +^{l} n_{z} x_{e}^{'}) (^{r} o_{y}^{r} z_{e}^{'} -^{r} a_{y}^{r} y_{e}^{'}) \end{array}

(78)

k_{6} =^{l} f_{y} (^{l} p_{y} +^{l} n_{y} x_{e}^{'}) (^{r} p_{z} +^{r} n_{z}^{r} x_{e}^{'}) +^{r} f_{y} (^{l} p_{z} +^{l} n_{z} x_{e}^{'}) (^{r} p_{y} +^{r} n_{y}^{r} x_{e}^{'})

(79)

According to the triangle relationship, we have

\cos^{2} θ_{t} + \sin^{2} θ_{t} = 1

(80)

Replacing cosθ_t in Equation (73) with sinθ_t, we obtain the following:

k_{1}^{'} \sin^{4} θ_{t} + k_{2}^{'} \sin^{3} θ_{t} + k_{3}^{'} \sin^{2} θ_{t} + k_{4}^{'} \sin θ_{t} + k_{5}^{'} = 0

(81)

where

k_{1}, k_{2}, k_{3}, k_{4}, k_{5}

are

k_{1}^{'} = {(k_{2} - k_{1})}^{2} + k_{3}^{2}

(82)

k_{2}^{'} = 2 (k_{2} - k_{1}) k_{5} + 2 k_{3} k_{4}

(83)

k_{3}^{'} = 2 (k_{2} - k_{1}) k_{6} + k_{5}^{2} + k_{4}^{2} - k_{3}^{2}

(84)

k_{4}^{'} = 2 k_{5} k_{6} - 2 k_{3} k_{4}

(85)

k_{5}^{'} = k_{6}^{2} - k_{4}^{2}

(86)

Four solutions can be obtained using Equation (81). The optimal solution is a real number, and the most suitable solution can be selected by the condition of Equation (72).

After θ_t is obtained, θ_p can be solved based on the obtained θ_t.

According to Equations (60) and (63),

{\bar{θ}}_{t}

is the solution obtained in Section 3.2.2, so that

(\begin{matrix} ^{l} P_{c} \\ 1 \end{matrix}) = {(^{l} T_{m})}^{- 1} (\begin{matrix} Rot (X, - {\bar{θ}}_{t}) & 0 \\ 0 & 1 \end{matrix}) (\begin{matrix} Rot (Y, -^{l} θ_{p}) & 0 \\ 0 & 1 \end{matrix})^{l} T_{e} (\begin{matrix} P_{e}^{'} \\ 1 \end{matrix})

(87)

The following result is also available:

(\begin{matrix} ^{r} P_{c} \\ 1 \end{matrix}) = {(^{r} T_{m})}^{- 1} (\begin{matrix} Rot (X, - {\bar{θ}}_{t}) & 0 \\ 0 & 1 \end{matrix}) (\begin{matrix} Rot (Y, -^{l} θ_{p}) & 0 \\ 0 & 1 \end{matrix})^{r} T_{e} (\begin{matrix} P_{e}^{'} \\ 1 \end{matrix})

(88)

Since

{\bar{θ}}_{t}

is known, for convenience of calculation, we set

^{l} T_{m}^{'} = {(^{l} T_{m})}^{- 1} (\begin{matrix} Rot (X, - {\bar{θ}}_{t}) & 0 \\ 0 & 1 \end{matrix}) = (\begin{matrix} ^{l} n_{x}^{'} & ^{l} o_{x}^{'} & ^{l} a_{x}^{'} & ^{l} p_{x}^{'} \\ ^{l} n_{y}^{'} & ^{l} o_{y}^{'} & ^{l} a_{y}^{'} & ^{l} p_{y}^{'} \\ ^{l} n_{z}^{'} & ^{l} o_{z}^{'} & ^{l} a_{z}^{'} & ^{l} p_{z}^{'} \\ 0 & 0 & 0 & 1 \end{matrix})

(89)

^{r} T_{m}^{'} = {(^{r} T_{m})}^{- 1} (\begin{matrix} Rot (X, - {\bar{θ}}_{t}) & 0 \\ 0 & 1 \end{matrix}) = (\begin{matrix} ^{r} n_{x}^{'} & ^{r} o_{x}^{'} & ^{r} a_{x}^{'} & ^{r} p_{x}^{'} \\ ^{r} n_{y}^{'} & ^{r} o_{y}^{'} & ^{r} a_{y}^{'} & ^{r} p_{y}^{'} \\ ^{r} n_{z}^{'} & ^{r} o_{z}^{'} & ^{r} a_{z}^{'} & ^{r} p_{z}^{'} \\ 0 & 0 & 0 & 1 \end{matrix})

(90)

The following results are obtained:

\{\begin{cases} Δ u_{l} =^{l} f_{x} \frac{(^{l} n_{x}^{'} x_{e}^{'} +^{l} a_{x}^{'} z_{e}^{'}) \cos θ_{p} + (^{l} a_{x}^{'} x_{e}^{'} -^{l} n_{x}^{'} z_{e}^{'}) \sin θ_{p} + (^{l} p_{x}^{'} +^{l} o_{x}^{'} y_{e}^{'})}{(^{l} n_{z}^{'} x_{e}^{'} +^{l} a_{z}^{'} z_{e}^{'}) \cos θ_{p} + (^{l} a_{z}^{'} x_{e}^{'} -^{l} n_{z}^{'} z_{e}^{'}) \sin θ_{p} + (^{l} p_{z}^{'} +^{l} o_{z}^{'} y_{e}^{'})} \\ Δ v_{l} =^{l} f_{y} \frac{(^{l} n_{y}^{'} x_{e}^{'} +^{l} a_{y}^{'} z_{e}^{'}) \cos θ_{p} + (^{l} a_{y}^{'} x_{e}^{'} -^{l} n_{y}^{'} z_{e}^{'}) \sin θ_{p} + (^{l} p_{y}^{'} +^{l} o_{y}^{'} y_{e}^{'})}{(^{l} n_{z}^{'} x_{e}^{'} +^{l} a_{z}^{'} z_{e}^{'}) \cos θ_{p} + (^{l} a_{z}^{'} x_{e}^{'} -^{l} n_{z}^{'} z_{e}^{'}) \sin θ_{p} + (^{l} p_{z}^{'} +^{l} o_{z}^{'} y_{e}^{'})} \end{cases}

(91)

\{\begin{cases} Δ u_{r} =^{r} f_{x} \frac{(^{r} n_{x}^{'}^{r} x_{e}^{'} +^{r} a_{x}^{'}^{r} z_{e}^{'}) \cos θ_{p} + (^{r} a_{x}^{'}^{r} x_{e}^{'} -^{r} n_{x}^{'}^{r} z_{e}^{'}) \sin θ_{p} + (^{r} p_{x}^{'} +^{r} o_{x}^{'}^{r} y_{e}^{'})}{(^{r} n_{z}^{'}^{r} x_{e}^{'} +^{r} a_{z}^{'}^{r} z_{e}^{'}) \cos θ_{p} + (^{r} a_{z}^{'}^{r} x_{e}^{'} -^{r} n_{z}^{'}^{r} z_{e}^{'}) \sin θ_{p} + (^{r} p_{z}^{'} +^{r} o_{z}^{'}^{r} y_{e}^{'})} \\ Δ v_{r} =^{r} f_{y} \frac{(^{r} n_{y}^{'}^{r} x_{e}^{'} +^{r} a_{y}^{'}^{r} z_{e}^{'}) \cos θ_{p} + (^{r} a_{y}^{'}^{r} x_{e}^{'} -^{r} n_{y}^{'}^{r} z_{e}^{'}) \sin θ_{p} + (^{r} p_{y}^{'} +^{r} o_{y}^{'}^{r} y_{e}^{'})}{(^{r} n_{z}^{'}^{r} x_{e}^{'} +^{r} a_{z}^{'}^{r} z_{e}^{'}) \cos θ_{p} + (^{r} a_{z}^{'}^{r} x_{e}^{'} -^{r} n_{z}^{'}^{r} z_{e}^{'}) \sin θ_{p} + (^{r} p_{z}^{'} +^{r} o_{z}^{'}^{r} y_{e}^{'})} \end{cases}

(92)

Assume that

E_{su} = |Δ u_{l}| + |Δ u_{r}|

(93)

The solution to θ_p that keeps the target at the center of the two cameras needs to satisfy the following conditions:

\{\begin{cases} Δ u_{l} + Δ u_{r} = 0 \\ - θ_{pmax} \leq θ_{p} \leq θ_{pmax} \\ θ_{p} = \arg \min (E_{su}) \end{cases}

(94)

Substituting the second equation of Equations (91) and (92) into Equation (94) and solving the available equation, we obtain

k_{1} \cos^{2} θ_{p} + k_{2} \sin^{2} θ_{p} + k_{3} \sin θ_{p} \cos θ_{p} + k_{4} \cos θ_{p} + k_{5} \sin θ_{p} + k_{6} = 0

(95)

where

k_{1} =^{l} f_{x} (^{l} n_{x}^{'} x_{e}^{'} +^{l} a_{x}^{'} z_{e}^{'}) (^{r} n_{z}^{'}^{r} x_{e}^{'} +^{r} a_{z}^{'}^{r} z_{e}^{'}) +^{r} f_{x} (^{r} n_{x}^{'}^{r} x_{e}^{'} +^{r} a_{x}^{'}^{r} z_{e}^{'}) (^{l} n_{z}^{'} x_{e}^{'} +^{l} a_{z}^{'} z_{e}^{'})

(96)

k_{2} =^{l} f_{x} (^{l} n_{x}^{'} z_{e}^{'} -^{l} a_{x}^{'} x_{e}^{'}) (^{r} n_{z}^{'}^{r} z_{e}^{'} -^{r} a_{z}^{'}^{r} x_{e}^{'}) +^{r} f_{x} (^{r} n_{x}^{'}^{r} z_{e}^{'} -^{r} a_{x}^{'}^{r} x_{e}^{'}) (^{l} n_{z}^{'} z_{e}^{'} -^{l} a_{z}^{'} x_{e}^{'})

(97)

\begin{array}{r} k_{3} =^{l} f_{x} (^{l} n_{x}^{'} x_{e}^{'} +^{l} a_{x}^{'} z_{e}^{'}) (^{r} n_{z}^{'}^{r} z_{e}^{'} -^{r} a_{z}^{'}^{r} x_{e}^{'}) +^{l} f_{x} (^{l} n_{x}^{'} z_{e}^{'} -^{l} a_{x}^{'} x_{e}^{'}) (^{r} n_{z}^{'}^{r} x_{e}^{'} +^{r} a_{z}^{'}^{r} z_{e}^{'}) \\ +^{r} f_{x} (^{r} n_{x}^{'}^{r} z_{e}^{'} -^{r} a_{x}^{'}^{r} x_{e}^{'}) (^{l} n_{z}^{'} x_{e}^{'} +^{l} a_{z}^{'} z_{e}^{'}) +^{r} f_{x} (^{r} n_{x}^{'}^{r} x_{e}^{'} +^{r} a_{x}^{'}^{r} z_{e}^{'}) (^{l} n_{z}^{'} z_{e}^{'} -^{l} a_{z}^{'} x_{e}^{'}) \end{array}

(98)

\begin{array}{r} k_{4} =^{l} f_{x} (^{l} n_{x}^{'} x_{e}^{'} +^{l} a_{x}^{'} z_{e}^{'}) (^{r} o_{z}^{'}^{r} y_{e}^{'} +^{r} p_{z}^{'}) +^{l} f_{x} (^{l} o_{x}^{'} y_{e}^{'} +^{l} p_{x}^{'}) (^{r} n_{z}^{'}^{r} x_{e}^{'} +^{r} a_{z}^{'}^{r} z_{e}^{'}) \\ +^{r} f_{x} (^{r} o_{x}^{'}^{r} y_{e}^{'} +^{r} p_{x}^{'}) (^{l} n_{z}^{'} x_{e}^{'} +^{l} a_{z}^{'} z_{e}^{'}) +^{r} f_{x} (^{r} n_{x}^{'}^{r} x_{e}^{'} +^{r} a_{x}^{'}^{r} z_{e}^{'}) (^{l} o_{z}^{'} y_{e}^{'} +^{l} p_{z}^{'}) \end{array}

(99)

\begin{array}{r} k_{5} =^{l} f_{x} (^{l} n_{x}^{'} z_{e}^{'} -^{l} a_{x}^{'} x_{e}^{'}) (^{r} o_{z}^{'}^{r} y_{e}^{'} +^{r} p_{z}^{'}) +^{l} f_{x} (^{l} o_{x}^{'} y_{e}^{'} +^{l} p_{x}^{'}) (^{r} n_{z}^{'}^{r} z_{e}^{'} -^{r} a_{z}^{'}^{r} x_{e}^{'}) \\ +^{r} f_{x} (^{r} o_{x}^{'}^{r} y_{e}^{'} +^{r} p_{x}^{'}) (^{l} n_{z}^{'} z_{e}^{'} -^{l} a_{z}^{'} x_{e}^{'}) +^{r} f_{x} (^{r} n_{x}^{'}^{r} z_{e}^{'} -^{r} a_{x}^{'}^{r} x_{e}^{'}) (^{l} o_{z}^{'} y_{e}^{'} +^{l} p_{z}^{'}) \end{array}

(100)

k_{6} =^{l} f_{x} (^{l} o_{x}^{'} y_{e}^{'} +^{l} p_{x}^{'}) (^{r} o_{z}^{'}^{r} y_{e}^{'} +^{r} p_{z}^{'}) +^{r} f_{x} (^{r} o_{x}^{'}^{r} y_{e}^{'} +^{r} p_{x}^{'}) (^{l} o_{z}^{'} y_{e}^{'} +^{l} p_{z}^{'})

(101)

Replacing cosθ_p in Equation (73) with sinθ_p, we obtain

k_{1}^{'} \sin^{4} θ_{p} + k_{2}^{'} \sin^{3} θ_{p} + k_{3}^{'} \sin^{2} θ_{p} + k_{4}^{'} \sin θ_{p} + k_{5}^{'} = 0

(102)

where

k_{1}^{'} = {(k_{2} - k_{1})}^{2} + {(k_{3})}^{2}

(103)

k_{2}^{'} = 2 (k_{2} - k_{1}) k_{5} + 2 k_{3} k_{4}

(104)

k_{3}^{'} = 2 (k_{2} - k_{1}) k_{6} + {(k_{5})}^{2} + {(k_{4})}^{2} - {(k_{3})}^{2}

(105)

k_{4}^{'} = 2 k_{5} k_{6} - 2 k_{3} k_{4}

(106)

k_{5}^{'} = {(k_{6})}^{2} - {(k_{4})}^{2}

(107)

Four solutions can be obtained using Equation (102). The optimal solution must be a real number, and the most suitable solution can be selected using the condition of Equation (94). For the case where the four solutions cannot satisfy Equation (94), the position of the target is beyond the position that the bionic eye can reach. In this case, compensation is required through the head or torso.

{\bar{θ}}_{t}

and

{\bar{θ}}_{p}

obtained at this time are suboptimal solutions close to the optimal solution. θ_t and θ_p are the optimal solutions.

Through the above steps, the desired observation pose can be calculated. The calculation steps of ^gf_q can be summarized by the flow chart shown in Figure 8.

3.3. Desired Pose Calculation for Approaching Gaze Point Tracking

The mobile robot approaches the target in two steps: the first step is that the robot and the head rotate in the horizontal direction until the robot and the head are facing the target, and the second step is that the robot moves straight toward the target. The desired position of the approaching motion should satisfy the following conditions: (1) the target should be on the Z axis of the robot and the head coordinate system, (2) the distance between the target and the robot should be less than the set threshold D_T and (3) the eye should be in the optimal observation position. ^af_q can be defined as

^{a} f_{q} = (\begin{array}{l} ^{w} x_{mq} = 0 \\ ^{w} z_{mq} = {z | 0 <^{w} z_{m} - z \leq D_{T}} \\ ^{w} θ_{pq} = {θ | | θ | \leq 2 π,^{w} γ_{p} = 0} \\ ^{h} θ_{pq} = 0 \\ ^{h} θ_{tq} = {θ | | θ | \leq^{h} θ_{tmax}, |^{h} γ_{t} | \leq^{h} γ_{tmax}} \\ ^{l} θ_{pq} =^{r} θ_{pq} = {θ | | θ | \leq^{e} θ_{pmax}, Δ m_{l} = - Δ m_{r}} \\ ^{l} θ_{tq} =^{r} θ_{tq} = {θ | | θ | \leq^{e} θ_{tmax}, Δ m_{l} = - Δ m_{r}} \end{array})

(108)

The desired rotation angle ^wθ_pq of the moving robot is the same as the angle ^bγ_p between the robot and the target and can be obtained by

^{w} θ_{pq} =^{w} γ_{p} = \arctan (\frac{^{w} z_{m}}{^{w} x_{m}})

(109)

^hθ_tq can be obtained using the method described in Section 3.2. The optimal observation pose described in Section 3.2.4 can be used to obtain ^lθ_tq, ^lθ_pq, ^rθ_tq and ^rθ_pq.

4. Robot Pose Control

After obtaining the desired pose of the robot system, the control block diagram shown in Figure 9 is used to control the robot to move to the desired pose.

The desired pose is converted to the desired position of the motor. Δθ_lt, Δθ_lp, Δθ_rt, Δθ_rp, Δθ_ht and Δθ_hp are deviations of the desired angle from the current angle of motor M_lu, motor M_ld, motor M_ru, motor M_rd, motor M_hu and motor M_hd, respectively. ^lθ_m and ^rθ_m are the angles at which each wheel of the moving robot needs to be rotated. During the in situ gaze point tracking process, the moving robot performs only the rotation of the original position, and the angle of the robot movement can be calculated according to the desired angle of the robot. When the robot rotates, the two wheels move in opposite directions at the same speed. Let the distance between the two wheels of the moving robot be D_r; when the robot rotates around an angle ^wθ_pq, the distance that each wheel needs to move is

S =^{w} θ_{pq} \frac{D_{r}}{2}

(110)

The diameter of each wheel is d_w, and the angle of rotation of each wheel is (where counterclockwise is positive)

^{r} θ_{m} = -^{l} θ_{m} = \frac{2 S}{d_{w}}

(111)

In the process of approaching the target, the moving robot follows a straight line, and the angle of rotation of each wheel is

^{r} θ_{m} =^{l} θ_{m} = \frac{2^{w} z_{mq}}{d_{w}}

(112)

The movement of the moving robot is achieved by controlling the rotation of each wheel. Each wheel is equipped with a DC brushless motor, and a DSP2000 controller is used to control the movement of the DC brushless motor. Position servo control is implemented in the DSP2000 controller.

In the robot system, the weight of the camera and lens is approximately 80 g, the weight of the camera and the fixed mechanical parts is approximately 50 g and the motor that controls the vertical rotation of the camera (rotating around the horizontal axis of rotation) and the corresponding encoder weighs approximately 250 g. The mechanical parts of the fixed vertical rotating motor and encoder weigh approximately 100 g. The radius of the rotation of the camera in the vertical direction is approximately 1 cm, and the rotation in the horizontal direction (rotation about the vertical axis of rotation) has a radius of approximately 2 cm. Therefore, when the gravitational acceleration is 9.8 m/s², the torque required for the vertical rotating electric machine is approximately 0.013 N·m, and the torque required for the horizontal rotating electric machine is approximately 0.043 N·m. The vertical rotating motor uses a 28BYG5401 stepping motor with a holding torque of 0.1 N·m and a positioning torque of 0.008 N·m. The driver is HSM20403A. The horizontal rotating motor is a 57BYGH301 stepping motor with a holding torque of 1.5 N·m, a positioning torque of 0.07 N·m and drive model HSM20504A. The four stepping motors of the eye have a step angle of 1.8° and are all subdivided by 25, so the actual step angle of each motor is 0.072°, and the minimum pulse width that the driver can receive is 2.5 µs. The stepper motor has a maximum angular velocity of 200°/s.

The head vertical rotary motor uses a 57BYGH401 stepper motor with a holding torque of 2.2 N·m, a positioning torque of 0.098 N·m and drive model HSM20504A. The head horizontal rotary motor is an 86BYG350B three-phase AC stepping motor with a holding torque of 5 N·m, a positioning torque of 0.3 N-m and an HSM30860M driver. The step angle of the head motor after subdivision is also 0.072°. The head vertical motor has a load of approximately 5 kg and a radius of rotation of less than 1 cm. The head horizontal rotary motor has a load of approximately 9.5 kg and a radius of rotation of approximately 5 cm. In the experiment, we found that the maximum horizontal pulse frequency that the head horizontal rotary motor can receive is 0.6 Kpps. Its maximum angular velocity is 43.2°/s.

5. Experiments and Discussion

Using the robot platform introduced in Section 2, experiments on in situ gaze point tracking and approaching gaze point tracking were performed

Each camera has a resolution of 400 × 300 pixels. The directions of rotation are [−45°, 45°]. The range of rotation of the head is [−30°, 30°]. d_x and d_y are 150 mm and 200 mm, respectively. The internal and external parameters, distortion parameters, initial position parameters and left- and right-hand–eye parameters of the dual purpose method are calibrated as follows:

^{l} M_{in} = (\begin{matrix} 341.58 & 0 & 201.6 \\ 0 & 341.97 & 147.62 \\ 0 & 0 & 1 \end{matrix})

(113)

K_{l} = (\begin{matrix} - 0.1905 & 0.2171 & - 0.0018 & - 0.0005 & - 0.0823 \end{matrix})

(114)

^{l} T_{m} = (\begin{matrix} 1.0 & 0.0078 & - 0.0022 & 58.4172 \\ 0.0001 & 0.9954 & 0.0959 & 3.6042 \\ 0.0013 & - 0.0959 & 0.9954 & 51.9366 \\ 0 & 0 & 0 & 1 \end{matrix})

(115)

^{r} M_{in} = (\begin{matrix} 335.13 & 0 & 184.32 \\ 0 & 335.5 & 141.26 \\ 0 & 0 & 1 \end{matrix})

(116)

K_{r} = (\begin{matrix} - 0.1861 & 0.1987 & - 0.004 & - 0.0011 & - 0.0739 \end{matrix})

(117)

^{r} T_{m} = (\begin{matrix} 0.9999 & - 0.0086 & - 0.0125 & - 45.0147 \\ 0.0190 & - 0.9969 & - 0.0782 & - 24.5528 \\ 0.0097 & - 0.0784 & 0.9970 & 42.9270 \\ 0 & 0 & 0 & 1 \end{matrix})

(118)

^{l} T_{r} = (\begin{matrix} 0.9998 & - 0.0099 & - 0.0193 & 189.5922 \\ 0.0095 & 0.9997 & - 0.0215 & - 0.0426 \\ 0.0195 & 0.0213 & 0.9996 & 8.9671 \\ 0 & 0 & 0 & 1 \end{matrix})

(119)

The experimental in situ gaze point tracking scene is shown in Figure 10, with a checkerboard target used as the target. For in situ gaze point tracking, the target is held by a person. In the approaching target gaze tracking experiment, the target is fixed in front of the robot.

5.1. In Situ Gaze Point Tracking Experiment

In the in situ gaze experiment, the target moves at a low speed within a certain range, and the robot combines the movement of the eye, the head and the mobile robot so that the binocular vision can always perceive the 3D coordinates of the target at the optimal observation posture. This experiment prompts the robot to find the target and gaze at it. In the gaze point tracking process, binocular stereo vision is used to calculate the 3D coordinates of the target in the eye coordinate system in real time. Through the positional relationship between the eye and the head, the coordinate system of the target in the eye can be converted to the head coordinate system. Similarly, the 3D coordinates of the target in the robot coordinate system can be obtained. Through the 3D coordinates, the desired poses of the eyes, head and mobile robot are calculated according to the method proposed in this paper. Then, the camera is controlled to the desired position by the stepping motor; after reaching the desired position, the image and the motor position information are collected again, and the 3D coordinates of the target are calculated.

In the experiment, the angles between the head and the target, ^hγ_pmax and ^hγ_pmax, are each 30°. The method described in Section 3 is used to calculate the desired pose of each joint of the robot based on the 3D coordinates of the target. In the experiment, the actual coordinate position and desired coordinate position of the target in the binocular image space, the actual position and desired position of the eye and head motor, the angle between the head and the robot and the target, and the target in the robot coordinate system are stored. Figure 11a,b show the u and v coordinates of the target on the left image, respectively, and Figure 11c,d show the u and v coordinates of the target on the right image, respectively. The desired image coordinates are recalculated based on the optimal observation position. Figure 11e–h show the positions of the tilt motor (M_lu) of the left eye, the pan motor (M_ld) of the left eye, the tilt motor (M_ru) of the right eye and the pan motor (M_rd) of the right eye, respectively. Figure 11i shows the positions of the pan motor (M_hd) of the head. Since the target moves in the vertical direction with small amplitude, the motor M_hu does not rotate, and the case is similar to the motion principle of the motor M_hd, so the motor position of the head only provides the result of the motor M_hd. Figure 11j shows the angle deviation and rotation. In this figure, T-h is the angle between the head and target, T-r is the angle between the robot and target, R-r is the angle of the robot rotation from the origin location and T-o is the angle of the target to the origin location. Figure 11k shows the coordinates (^wx, ^wz) of the target in the world coordinate system. Figure 11l shows the coordinates (^ox, ^oz) of the target in the world coordinate system of the origin location.

As shown in Figure 11, the image coordinate of the target is substantially within ±40 pixels in the central region of the left and right images in the x direction. These coordinates are kept within ±10 pixels of the center region of the left and right images in the y direction. Throughout the experiment, the target was rotated approximately 200° around the robot. The robot moved approximately 140°, the head rotated 30° and the target could be kept in the center region of the binocular images. The motor position curve shows that the motor’s operating position can track the desired position very well. The angle variation curve shows that the angle between the target and the head and the robot changes and that the robot turning angles are suitably consistent. The coordinates of the target shown in Figure 11 in the robot coordinate system and the coordinates of the target in the initial position of the world coordinate system are very close to the actual position change in the target’s position.

Through the above analysis, we can determine the following: (1) It is feasible to realize gaze point tracking of a robot based on 3D coordinates. (2) Using the movement of the head, eyes and mobile robot used in this paper, it is possible to achieve gaze point tracking of the target while ensuring minimum resource consumption.

5.2. Approaching Gaze Point Tracking Experiment

The approaching gaze point tracking experimental scene is shown in Figure 12.

The robot approaches the target without obstacles and reaches the area in which the robot can operate on the target. The target can be grasped or carefully observed. In the approaching gaze experiment, a target is fixed at a position 2.2 m from the robot, and when the robot moves to a position where the distance from the target to robot is 0.6 m, the motion is stopped, and the maximum speed of the moving robot is 1 m/s. The experiment realizes the approaching movement to the target in two steps: first, the head, the eye and the moving robot chassis are rotated so that the head and the moving robot are facing the target, and the head observes the target in the optimal observation posture; second, the movement is controlled. The robot moves linearly in the target’s direction. During the movement, the angles of the head and the eye are fine-tuned, and the 3D coordinates of the target are detected in real time until the z coordinate of the target in the robot coordinate system is less than the threshold set to stop the motion.

Figure 13 shows the results of the approaching gaze point tracking experiment. Figure 13a,b show the u and v coordinates of the target on the left image, respectively, and Figure 13c,d show the u and v coordinates of the target on the right image, respectively. The desired image coordinates are recalculated based on the optimal observation position. Figure 13e–h show the positions of the tilt motor (M_lu) of the left eye, the pan motor (M_ld) of the left eye, the tilt motor (M_ru) of the right eye and the pan motor (M_rd) of the right eye, respectively. Figure 13i shows the positions of the pan motor (M_hd) of the head. Figure 13j shows the angle deviation and rotation. T-h is the angle between the head and the target, T-r is the angle between the robot and the target, R-r is the angle of the robot’s rotation from the origin location and T-o is the angle of the target to the origin location. Figure 13k shows the coordinates (^wx, ^wz) of the target in the world coordinate system. Figure 13l shows the robot’s forward distance and the distance between the target and the robot.

The change in the image’s coordinate curve indicates that the coordinates of the target in the left and right images move from the initial position to the central region of the image and stabilize in the center region of the image during the approach process. In the process of turning towards the target in the first step, the target coordinates in the image fluctuate because the head motor rotates a large amount and is accompanied by a certain vibration during the rotation, which can be avoided by using a system with better stability. The variety curve of the motor position in Figure 13 shows that the motion of the motor can track the target well with the desired pose, and the prediction of the 3D coordinates is not used during the tracking process, so this prediction is accompanied by a cycle lag. The changes in angle in Figure 13 show that the robot system achieves the task of steering toward the target in the first few control cycles and then moves toward the target at a stable angle. Figure 13a shows the change in the coordinates of the target in the robot coordinate system. When the robot rotates, fluctuations arise around the measured x coordinate, mainly due to the measurement error caused by the shaking of the system. The experimental results in Figure 13b show that the robot’s movement toward the target is very consistent. During the approach process, the target can be kept within ±50 pixels of the desired position in the horizontal direction of the image while being within ±20 pixels of the desired position in the vertical direction of the image. The eye motor achieves fast tracking of the target in 1.5 s. The angle between the target and the head is reduced from 20° to 0°, and the angle between the target and the robot is reduced from 35° to 0°. The robot then over-turns. At 34°, the target changes by 34° from the initial position.

Through the above analysis, it can be found that by using the combination of the head, the eye and the trunk in the present method, the approach toward the target can be achieved while ensuring that the robot is gazing at the target.

6. Conclusions

This study achieved gaze point tracking based on the 3D coordinates of the target. First, a robot experiment platform was designed. Based on the bionic eye experiment platform, a head with two degrees of freedom was added, using the mobile robot as a carrier.

Based on the characteristics of the robot platform, this paper proposed a method of gaze point tracking. To achieve in situ gaze point tracking, the combination of the eyes, head and trunk is designed based on the principles of minimum resource consumption and maximum system stability. Eye rotation consumes the least amount of resources and has minimal impact on the stability of the overall system during the exercise. The head rotation consumes more resources than the eyeball but fewer than the trunk rotation. At the same time, the rotation of the head affects the stability of the eyeball but only minimally affects the stability of the entire robotic system. The resources consumed by the rotation of the trunk generally predominate, and the rotation of the trunk tends to affect the stability of the head and the eye. Therefore, when the eye can observe the target in the optimal observation posture, only the eye is rotated; otherwise, the head is rotated, and when the angle at which the head needs to move exceeds its threshold, the mobile robot rotates. When approaching gaze point tracking is performed, the robot and head first face the target and then move straight toward the vicinity of the target. Based on the proposed gaze point tracking method, this paper provides an expected pose calculation method for the horizontal rotation angle and the vertical rotation angle.

Based on the experimental robot platform, a series of experiments was performed, and the effectiveness of the gaze point tracking method was verified. In our future works, a practical task of delivering medicine in a hospital and more detailed comparative experiments, as well as discussions with other similar studies, will be implemented.

Author Contributions

Conceptualization, Q.W. and M.Q.; Methodology, X.F.; Software, X.F.; Validation, X.F.; Formal analysis, H.C.; Investigation, M.Q.; Data curation, Q.W. and Y.Z.; Writing—original draft, X.F. and Q.W.; Writing—review & editing, M.Q.; Project administration, H.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

No new data was created.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Measurement Error Analysis of Binocular Stereo Vision

It is assumed that the angle at which the left motion module rotates in the horizontal direction with respect to the initial position is ^lθ_p and that the angle at which the right motion module rotates in the horizontal direction with respect to the initial position is ^rθ_p. When using the bionic eye platform for 3D coordinate calculation, we find that when ^lθ_p = ^rθ_p, the measured three-dimensional coordinate ratio is higher than ^lθ_p > 0 and ^rθ_p < 0, which is more accurate, as shown in Figure A1a,b. In this case, the measurement accuracy is higher than that in the case shown in Figure A1c. This chapter analyzes the measurement error of three-dimensional coordinates, explains the reason for this phenomenon and proposes a method to improve the accuracy of three-dimensional coordinate measurement.

Figure A1. Binocular motion mode, (a) initial position ^lθ_p = ^rθ_p, (b) ^lθ_p= ^rθ_p, (c) ^lθ_p > 0 and ^rθ_p < 0.

For the convenience of calculation, according to the characteristics of the bionic eye platform, the binocular stereo vision measurement model shown in Figure A2 is used to analyze the error of the three-dimensional coordinate measurement.

Figure A2. The principle of the vision system of bionic eyes.

Appendix A.1. Vision System of Bionic Eyes

Two cameras are used to imitate human eyes and the principle of the vision system of bionic eyes is shown in Figure 1. We suppose that the optical axes of two cameras used to imitate human eyes are coplanar. As shown in Figure 1, O_wX_wZ_w is the world coordinate system, the X_w axis is along the baseline of two cameras and the Z_w axis is in the plane

\prod

, which consists of the two cameras’ optical axes. ^lO_c^lX_c^lZ_c is the coordinate system of the left camera, where ^lZ_c is along the optical axis of the left camera and ^lX_c axis is in the plane

\prod

. ^rO_c^rX_c^rZ_c is the coordinate system of the right camera, ^rZ_c is along the optical axis of the right camera and ^rX_c axis is in the plane

\prod

. The two cameras can move cooperatively to imitate the movement of human eyes.

Appendix A.2. Depth Measurement Model

In Figure A2, the position vector of point ^lO_c in O_wX_wZ_w is ^wO_l = [−d_i/2, 0]^T and the position vector of point ^rO_c in O_wX_wZ_w is ^wO_r = [d_i/2, 0]^T. d_i is the length of the baseline. Let ^lP = [x_l, z_l]^T and ^rP = [x_r, z_r]^T be the position vectors of the object point P in the left and right camera coordinate systems, respectively. Let ^wP = [x_w, z_w]^T be the position vector of point P in O_wX_wZ_w; then, ^lP can be obtained as follows:

^{l} P =^{l} R_{w} (^{w} P -^{w} O_{l})

(A1)

where ^lR_w is the rotation transformation matrix from the world coordinate system to the left camera coordinate system. ^lR_w can be expressed as

^{l} R_{w} = [\begin{matrix} \cos θ_{l} & \sin θ_{l} \\ - \sin θ_{l} & \cos θ_{l} \end{matrix}]

(A2)

where

θ_{l}

is the rotation angle which is defined as the angle between camera optical axis and the Z_w axis.

Based on the same principle, we can obtain

^{r} P =^{r} R_{w} (^{w} P -^{w} O_{r})

(A3)

where

^{r} R_{w} = [\begin{matrix} \cos θ_{r} & \sin θ_{r} \\ - \sin θ_{r} & \cos θ_{r} \end{matrix}]

(A4)

As shown in Figure 1, P_l₁ is the intersection point of the line ^lO_cP and the normalized image plane of the left camera, and P_r₁ is the intersection point of the line ^rO_cP and the normalized image plane of the right camera. From the geometric relationship as shown in Figure 1, the position vector of point P_l₁ in ^lO_c^lX_c^lZ_c can be expressed as ^lP_l₁ = [x_l/z_l, 1]^T, and the position vector of point P_r₁ in ^rO_c^rX_c^rZ_c can be expressed as ^rP_r₁ = [x_r/z_r, 1]. Let ^wP_l₁ = [^wx_l₁, ^wz_l₁]^T be the position vector of point P_l₁ in O_wX_wZ_w and ^wP_r₁ = [^wx_r₁, ^wz_r₁]^T be the position vector of point P_r₁ in O_wX_wZ_w. ^wP_l₁ and ^wP_r₁ can be calculated in (A5) and (A6) by coordinate transformation according to (A1) and (A3).

^{w} P_{l 1} = {(^{l} R_{w})}^{- 1}^{l} P_{l 1} +^{w} O_{l}

(A5)

^{w} P_{r 1} = {(^{r} R_{w})}^{- 1}^{r} P_{r 1} +^{w} O_{r}

(A6)

From Equations (99) and (100), we can obtain

^{w} P_{l 1} = [\begin{matrix} ^{w} x_{l 1} \\ ^{w} z_{l 1} \end{matrix}] = [\begin{array}{l} \frac{x_{l}}{z_{l}} \cos θ_{l} - \sin θ_{l} - \frac{d_{i}}{2} \\ \frac{x_{l}}{z_{l}} \sin θ_{l} + \cos θ_{l} \end{array}]

(A7)

^{w} P_{r 1} = [\begin{matrix} ^{w} x_{r 1} \\ ^{w} z_{r 1} \end{matrix}] = [\begin{array}{l} \frac{x_{r}}{z_{r}} \cos θ_{r} - \sin θ_{r} + \frac{d_{i}}{2} \\ \frac{x_{r}}{z_{r}} \sin θ_{r} + \cos θ_{r} \end{array}]

(A8)

According to ^wO_l = [−d_i/2, 0]^T and Equation (A7), the line ^lO_cP can be expressed as Equation (A9) in the world coordinate system O_wX_wZ_w.

z = \frac{^{w} z_{l 1}}{\frac{d_{i}}{2} +^{w} x_{l 1}} x + \frac{^{w} z_{r 1} \times \frac{d_{i}}{2}}{\frac{d_{i}}{2} +^{w} x_{l 1}}

(A9)

Based on ^wO_r = [d_i/2, 0]^T and Equation (A8), the line ^rO_cP in the world coordinate system O_wX_wZ_w can be expressed as follows:

z = - \frac{^{w} z_{r 1}}{\frac{d_{i}}{2} -^{w} x_{r 1}} x + \frac{^{w} z_{r 1} \times \frac{d_{i}}{2}}{\frac{d_{i}}{2} -^{w} x_{l 1}}

(A10)

The intersection point of the lines ^rO_cP and ^lO_cP is the point P, so the depth z_w of point P can be obtained by Equations (A9) and (A10).

z_{w} = \frac{- d_{i}^{w} z_{r 1}^{w} z_{r 1}}{^{w} x_{r 1}^{w} z_{l 1} -^{w} z_{l 1} \frac{d_{i}}{2} -^{w} z_{r 1} \frac{d_{i}}{2} -^{w} x_{l 1}^{w} z_{r 1}}

(A11)

Appendix A.3. Measurement Error Analysis

In the real situation, there are the eyes’ rotation angle errors (usually caused by time difference between image acquisition and motor angle acquisition when the two cameras move, the stepper motor’s clearance error and the encoder’s resolution) and image errors (usually caused by image distortion, image resolution and image feature extraction error). Let

Δ m_{l}

and

Δ m_{r}

be the image errors of the two cameras, respectively, then ^lP_l₁= [x_l/z_l, 1]^T and ^rP_r₁= [x_r/z_r, 1] can be revised as

^{l} P_{l 1}^{'}

= [x_l/z_l +

Δ m_{l}

, 1]^T and

^{r} P_{r 1}^{'}

= [x_r/z_r +

Δ m_{r}

, 1]. Let

Δ θ_{l}

and

Δ θ_{r}

be the errors of two cameras’ rotation angles, respectively. Therefore, we can rewrite Equations (A7) and (A8) as follows:

^{w} P_{l 1}^{'} = [\begin{matrix} ^{w} x_{l 1}^{'} \\ ^{w} z_{l 1}^{'} \end{matrix}] = [\begin{array}{l} (\frac{x_{l}}{z_{l}} + Δ m_{l}) \cos (θ_{l} + Δ θ_{l}) - \sin (θ_{l} + Δ θ_{l}) - \frac{d_{i}}{2} \\ (\frac{x_{l}}{z_{l}} + Δ m_{l}) \sin (θ_{l} + Δ θ_{l}) + \cos (θ_{l} + Δ θ_{l}) \end{array}]

(A12)

^{w} P_{r 1}^{'} = [\begin{matrix} ^{w} x_{r 1}^{'} \\ ^{w} z_{r 1}^{'} \end{matrix}] = [\begin{array}{l} (\frac{x_{r}}{z_{r}} + Δ m_{r}) \cos (θ_{r} + Δ θ_{r}) - \sin (θ_{r} + Δ θ_{r}) + \frac{d_{i}}{2} \\ (\frac{x_{r}}{z_{r}} + Δ m_{r}) \sin (θ_{r} + Δ θ_{r}) + \cos (θ_{r} + Δ θ_{r}) \end{array}]

(A13)

Based on the same principle, the revised depth

z_{w}^{'}

of point P can be obtained as follows:

z_{w}^{'} = \frac{- d_{i}^{w} z_{l 1}^{'}^{w} z_{r 1}^{'}}{^{w} x_{r 1}^{'}^{w} z_{l 1}^{'} -^{w} z_{l 1}^{'} \frac{d_{i}}{2} -^{w} z_{r 1}^{'} \frac{d_{i}}{2} -^{w} x_{l 1}^{'}^{w} z_{r 1}^{'}}

(A14)

According to the cooperative movement pattern of human eyes, the absolute values of θ_l and θ_r are restricted to a limited range and assumed to be equal as follows:

- θ_{l} = θ_{r} = θ s . t . 0 \leq θ < \frac{π}{2}

(A15)

Since Δm_r, Δm_l, Δθ_l and Δθ_r are usually close to 0, the simplified expression of the error

Δ z

between the actual value z_w and measurement value

z_{w}^{'}

can be derived from (A7), (A8) and (A11)–(A14).

Δ z = z_{w}^{'} - z_{w} \approx - z_{w} \frac{z_{w} (Δ m_{l} \cos^{2} θ - Δ m_{r} \cos^{2} θ - Δ θ_{l} + Δ θ_{r})}{d_{i} + z_{w} (Δ m_{l} \cos^{2} θ - Δ m_{r} \cos^{2} θ - Δ θ_{l} + Δ θ_{r})}

(A16)

In addition, the relative error of z_w is

Δ z_{r} = \frac{Δ z}{z_{w}} \approx - \frac{z_{w} (Δ m_{l} \cos^{2} θ - Δ m_{r} \cos^{2} θ - Δ θ_{l} + Δ θ_{r})}{d_{i} + z_{w} (Δ m_{l} \cos^{2} θ - Δ m_{r} \cos^{2} θ - Δ θ_{l} + Δ θ_{r})}

(A17)

Let

ε = Δ m_{l} \cos^{2} θ - Δ m_{r} \cos^{2} θ - Δ θ_{l} + Δ θ_{r}

(A18)

In practice, ɛ has very small value and z_wɛ < < d_i, so

Δ z_{r}

can be simplified as follows:

Δ z_{r} \approx - \frac{z_{w} ε}{d_{i}}

(A19)

From Equation (A19), it can be known that relative error of z_w is proportional to z_wɛ and inversely proportional to d_i. Thus, we can adopt the following strategies to reduce the error of depth:

(1) Keep d_i long enough and constant when the bionic eyes move.

(2) Observe the target as close as possible since the depth error is smaller when bionic eyes observe target in a close distance.

(3) Control the two cameras of the bionic eyes with the same angular velocity during the process of the eyes’ movement. In this way, Δθ^l and Δθ^r will be approximately equal to each other, and ɛ can be reduced.

(4) Keep the target on the Z_w axis if possible, so that Δm_l and Δm_r are close to each other.

These strategies can be used to design effective motion control methods so that bionic eyes can perceive the target’s 3D information accurately.

Appendix B. Proof of Equation (100)

Let

a = - d_{i}^{w} z_{l 1}^{'}^{w} z_{r 1}^{'}

(A20)

b =^{w} x_{r 1}^{'}^{w} z_{l 1}^{'} -^{w} z_{l 1}^{'} \frac{d_{i}}{2} -^{w} z_{r 1}^{'} \frac{d_{i}}{2} -^{w} x_{l 1}^{'}^{w} z_{r 1}^{'}

(A21)

Then,

z_{w}^{'}

in Equation (A14) can be expressed as

z_{w}^{'} = \frac{a}{b}

(A22)

From Equations (A7) and (A8), we can obtain

^{w} z_{l 1}^{'} = (\frac{x_{l}}{z_{l}} + Δ m_{l}) \sin (θ_{l} + Δ θ_{l}) + \cos (θ_{l} + Δ θ_{l})

(A23)

^{w} z_{r 1}^{'} = (\frac{x_{r}}{z_{r}} + Δ m_{r}) \sin (θ_{r} + Δ θ_{r}) + \cos (θ_{r} + Δ θ_{r})

(A24)

From (A10), (A23) and (A24), we can obtain

\begin{array}{l} a = & - d_{i} \frac{x_{r}}{z_{r}} \frac{x_{l}}{z_{l}} [\sin θ_{r} \cos Δ θ_{r} + \cos θ_{r} \sin Δ θ_{r}] [\sin θ_{l} \cos Δ θ_{l} + \cos θ_{l} \sin Δ θ_{l}] \\ - d_{i} \frac{x_{r}}{z_{r}} [\sin θ_{r} \cos Δ θ_{r} + \cos θ_{r} \sin Δ θ_{r}] [\sin θ_{l} \cos Δ θ_{l} + \cos θ_{l} \sin Δ θ_{l}] Δ m_{l} \\ - d_{i} \frac{x_{r}}{z_{r}} [\sin θ_{r} \cos Δ θ_{r} + \cos θ_{r} \sin Δ θ_{r}] [\cos θ_{l} \cos Δ θ_{l} - \sin θ_{l} \sin Δ θ_{l}] \\ - d_{i} \frac{x_{l}}{z_{l}} [\sin θ_{r} \cos Δ θ_{r} + \cos θ_{r} \sin Δ θ_{r}] [\sin θ_{l} \cos Δ θ_{l} + \cos θ_{l} \sin Δ θ_{l}] Δ m_{r} \\ - d_{i} [\sin θ_{r} \cos Δ θ_{r} + \cos θ_{r} \sin Δ θ_{r}] [\sin θ_{l} \cos Δ θ_{l} + \cos θ_{l} \sin Δ θ_{l}] Δ m_{l} Δ m_{r} \\ - d_{i} [\sin θ_{r} \cos Δ θ_{r} + \cos θ_{r} \sin Δ θ_{r}] [\cos θ_{l} \cos Δ θ_{l} - \sin θ_{l} \sin Δ θ_{l}] Δ m_{r} \\ - d_{i} \frac{x_{l}}{z_{l}} [\sin θ_{l} \cos Δ θ_{l} + \cos θ_{l} \sin Δ θ_{l}] [\cos θ_{r} \cos Δ θ_{r} - \sin θ_{r} \sin Δ θ_{r}] \\ - d_{i} [\sin θ_{l} \cos Δ θ_{l} + \cos θ_{l} \sin Δ θ_{l}] [\cos θ_{r} \cos Δ θ_{r} - \sin θ_{r} \sin Δ θ_{r}] Δ m_{l} \\ - d_{i} [\cos θ_{r} \cos Δ θ_{r} - \sin θ_{r} \sin Δ θ_{r}] [\cos θ_{l} \cos Δ θ_{l} - \sin θ_{l} \sin Δ θ_{l}] \end{array}

(A25)

Δm_r, Δm_l, Δθ_l and Δθ_r are usually close to 0, so

\{\begin{cases} \cos Δ θ_{r} \approx 1 \\ \cos Δ θ_{l} \approx 1 \\ \sin Δ θ_{r} \sin Δ θ_{l} \approx 0 \\ Δ m_{l} Δ m_{r} \approx 0 \\ Δ m_{l} \sin Δ θ_{l} \approx 0 \\ Δ m_{r} \sin Δ θ_{l} \approx 0 \\ Δ m_{l} \sin Δ θ_{r} \approx 0 \\ Δ m_{r} \sin Δ θ_{r} \approx 0 \end{cases}

(A26)

From (A25) and (A26), we can obtain

\begin{aligned} a \approx & d_{i} \frac{1}{z_{r} z_{l}} [- x_{l} x_{r} \sin θ_{l} \sin θ_{r} - x_{r} z_{l} \cos θ_{l} \sin θ_{r} - x_{l} z_{r} \sin θ_{l} \cos θ_{r} - z_{r} z_{l} \cos θ_{r} \cos θ_{l} \\ + \sin Δ θ_{l} (- x_{l} x_{r} \sin θ_{r} \cos θ_{l} + x_{r} z_{l} \sin θ_{r} \sin θ_{l} - x_{l} z_{r} \cos θ_{r} \cos θ_{l} + z_{r} z_{l} \cos θ_{r} \sin θ_{l}) \\ + \sin Δ θ_{r} (x_{l} z_{r} \sin θ_{l} \sin θ_{r} + z_{r} z_{l} \cos θ_{l} \sin θ_{r} - x_{l} x_{r} \sin θ_{l} \cos θ_{r} - x_{r} z_{l} \cos θ_{l} \cos θ_{r}) \\ - Δ m_{l} (x_{r} z_{l} \sin θ_{l} \sin θ_{r l} + z_{r} z_{l} \sin θ_{l} \cos θ_{r}) - Δ m_{r} (x_{l} z_{r} \sin θ_{r} \sin θ_{l} - z_{r} z_{l} \sin θ_{r} \cos θ_{l})] \end{aligned}

(A27)

From (A7), (A8) and (A19), we can obtain

\{\begin{cases} x_{l} = (x_{w} + \frac{d_{i}}{2}) \cos θ - z_{w} \sin θ \\ z_{l} = - (x_{w} + \frac{d_{i}}{2}) \sin θ + z_{w} \cos θ \\ x_{r} = (x_{w} - \frac{d_{i}}{2}) \cos θ + z_{w} \sin θ \\ z_{r} = (x_{w} - \frac{d_{i}}{2}) \sin θ + z_{w} \cos θ \end{cases}

(A28)

Equation (A19) can be derived from (A19), (A27) and (A28):

\begin{array}{l} a \approx z_{w}^{2} \frac{- d_{i} + [\frac{d_{i}}{2} \sin 2 θ + \frac{d_{i} (x_{w} + \frac{d_{i}}{2}) \sin^{2} θ}{z_{w}}] Δ m_{l}}{z_{l} z_{r}} + z_{w}^{2} \frac{[\frac{d_{i} (x_{w} - \frac{d_{i}}{2}) \sin^{2} θ}{z_{w}} - \frac{d_{i}}{2} \sin 2 θ] Δ m_{r}}{z_{l} z_{r}} \\ + z_{w}^{2} \frac{\frac{- d_{i} (x_{w} + \frac{d_{i}}{2}) \sin Δ θ_{l}}{z_{w}} + \frac{d_{i} (x_{w} - \frac{d_{i}}{2}) \sin Δ θ_{r}}{z_{w}}}{z_{l} z_{r}} \end{array}

(A29)

Δm_r, Δm_l, Δθ_l and Δθ_r are usually close to 0,

x_{w} ≪ z_{w},

and

d_{i} ≪ z_{w}

. Thus,

a \approx z_{w}^{2} \frac{- d_{i}}{z_{l} z_{r}}

(A30)

From (A21), (A23) and (A24), we can obtain

\begin{array}{l} b = & \frac{x_{l}}{z_{l}} \frac{x_{r}}{z_{r}} (\sin θ_{l} \cos Δ θ_{l} + \cos θ_{l} \sin Δ θ_{l}) (\cos θ_{r} \cos Δ θ_{r^{r}} - \sin θ_{r} \sin Δ θ_{r}) \\ + \frac{x_{l}}{z_{l}} (\sin θ_{l} \cos Δ θ_{l} + \cos θ_{l} \sin Δ θ_{l}) (\cos θ_{r} \cos Δ θ_{r} - \sin θ_{r} \sin Δ θ_{r}) Δ m_{r} \\ - \frac{x_{l}}{z_{l}} (\sin θ_{l} \cos Δ θ_{l} + \cos θ_{l} \sin Δ θ_{l}) (\sin θ_{r} \cos Δ θ_{r} + \cos θ_{r} \sin Δ θ_{r}) \\ + \frac{x_{r}}{z_{r}} (\sin θ_{l} \cos Δ θ_{l} + \cos θ_{l} \sin Δ θ_{l}) (\cos θ_{r} \cos Δ θ_{r} - \sin θ_{r} \sin Δ θ_{r}) Δ m_{l} \\ + (\sin θ_{l} \cos Δ θ_{l} + \cos θ_{l} \sin Δ θ_{l}) (\cos θ_{r} \cos Δ θ_{r} - \sin θ_{r} \sin Δ θ_{r}) Δ m_{l} Δ m_{r} \\ - (\sin θ_{l} \cos Δ θ_{l} + \cos θ_{l} \sin Δ θ_{l}) (\sin θ_{r} \cos Δ θ_{r} + \cos θ_{r} \sin Δ θ_{r}) Δ m_{l} \\ + \frac{x_{r}}{z_{r}} (\cos θ_{l} \cos Δ θ_{l} - \sin θ_{l} \sin Δ θ_{l}) (\cos θ_{r} \cos Δ θ_{r} - \sin θ_{r} \sin Δ θ_{r}) \\ + (\cos θ_{l} \cos Δ θ_{l} - \sin θ_{l} \sin Δ θ_{l}) (\cos θ_{r} \cos Δ θ_{r} - \sin θ_{r} \sin Δ θ_{r}) Δ m_{r} \\ - (\cos θ_{l} \cos Δ θ_{l} - \sin θ_{l} \sin Δ θ_{l}) (\sin θ_{r} \cos Δ θ_{r} + \cos θ_{r} \sin Δ θ_{r}) \\ - \frac{x_{r}}{z_{r}} \frac{x_{l}}{z_{l}} (\sin θ_{r} \cos Δ θ_{r} + \cos θ_{r} \sin Δ θ_{r}) (\cos θ_{l} \cos Δ θ_{l} - \sin θ_{l} \sin Δ θ_{l}) \\ - \frac{x_{r}}{z_{r}} (\sin θ_{r} \cos Δ θ_{r} + \cos θ_{r} \sin Δ θ_{r}) (\cos θ_{l} \cos Δ θ_{l} - \sin θ_{l} \sin Δ θ_{l}) Δ m_{l} \\ + \frac{x_{r}}{z_{r}} (\sin θ_{r} \cos Δ θ_{r} + \cos θ_{r} \sin Δ θ_{r}) (\sin θ_{l} \cos Δ θ_{l} + \cos θ_{l} \sin Δ θ_{l}) \\ - \frac{x_{l}}{z_{l}} (\sin θ_{r} \cos Δ θ_{r} + \cos θ_{r} \sin Δ θ_{r}) (\cos θ_{l} \cos Δ θ_{l} - \sin θ_{l} \sin Δ θ_{l}) Δ m_{r} \\ - (\sin θ_{r} \cos Δ θ_{r} + \cos θ_{r} \sin Δ θ_{r}) (\cos θ_{l} \cos Δ θ_{l} - \sin θ_{l} \sin Δ θ_{l}) Δ m_{l} Δ m_{r} \\ + (\sin θ_{r} \cos Δ θ_{r} + \cos θ_{r} \sin Δ θ_{r}) (\sin θ_{l} \cos Δ θ_{l} + \cos θ_{l} \sin Δ θ_{l}) Δ m_{r} \\ - \frac{x_{l}}{z_{l}} (\cos θ_{r} \cos Δ θ_{r} - \sin θ_{r} \sin Δ θ_{r}) (\cos θ_{l} \cos Δ θ_{l} - \sin θ_{l} \sin Δ θ_{l}) \\ - (\cos θ_{r} \cos Δ θ_{r} - \sin θ_{r} \sin Δ θ_{r}) (\cos θ_{l} \cos Δ θ_{l} - \sin θ_{l} \sin Δ θ_{l}) Δ m_{l} \\ + (\cos θ_{r} \cos Δ θ_{r} - \sin θ_{r} \sin Δ θ_{r}) (\sin θ_{l} \cos Δ θ_{l} + \cos θ_{l} \sin Δ θ_{l}) \end{array}

(A31)

From (A26) and (A31), we can obtain:

\begin{aligned} b \approx & \frac{1}{z_{l} z_{r}} [x_{l} x_{r} \sin θ_{l} \cos θ_{r} - x_{l} z_{r} \sin θ_{l} \sin θ_{r} + z_{l} x_{r} \cos θ_{l} \cos θ_{r} - z_{l} z_{r} \cos θ_{l} \sin θ_{r} \\ - x_{l} x_{r} \sin θ_{r} \cos θ_{l} + z_{l} z_{r} \sin θ_{l} \cos θ_{r} - x_{l} z_{r} \cos θ_{l} \cos θ_{r} + z_{l} x_{r} \sin θ_{l} \sin θ_{r} \\ + \sin Δ θ_{l} (x_{l} x_{r} \cos θ_{l} \cos θ_{r} - x_{l} z_{r} \cos θ_{l} \sin θ_{r} - z_{l} x_{r} \cos θ_{r} \sin θ_{l} + x_{l} z_{r} \sin θ_{l} \cos θ_{r} \\ + z_{l} z_{r} \cos θ_{r} \cos θ_{l} + z_{l} z_{r} \sin θ_{l} \sin θ_{r} + x_{l} x_{r} \sin θ_{l} \sin θ_{r} + z_{l} x_{r} \sin θ_{r} \cos θ_{l}) \\ + \sin Δ θ_{r} (- x_{l} x_{r} \sin θ_{l} \sin θ_{r} - x_{l} z_{r} \sin θ_{l} \cos θ_{r} - z_{l} x_{r} \cos θ_{l} \sin θ_{r} - z_{l} z_{r} \cos θ_{l} \cos θ_{r} \\ - x_{l} x_{r} \cos θ_{l} \cos θ_{r} + z_{l} x_{r} \sin θ_{l} \cos θ_{r} + x_{l} z_{r} \cos θ_{l} \sin θ_{r} - z_{l} z_{r} \sin θ_{l} \sin θ_{r}) \\ + Δ m_{l} (z_{l} x_{r} \sin θ_{l} \cos θ_{r} - z_{l} z_{r} \sin θ_{l} \sin θ_{r} - z_{l} x_{r} \cos θ_{l} \sin θ_{r} - z_{l} z_{r} \cos θ_{l} \cos θ_{r}) \\ + Δ m_{r} (x_{l} z_{r} \sin θ_{l} \cos θ_{r} + z_{l} z_{r} \cos θ_{l} \cos θ_{r} - x_{l} z_{r} \cos θ_{l} \sin θ_{r} + z_{l} z_{r} \sin θ_{l} \sin θ_{r})] \end{aligned}

(A32)

Equation (A33) can be derived by (A19), (A28) and (A32):

\begin{array}{l} b \approx & z_{w} \frac{- d_{i} - Δ m_{l} [2 x_{w} \sin θ \cos θ + \frac{(x_{w}^{2} - \frac{d_{i}^{2}}{4})}{z_{w}} \sin^{2} θ + z_{w} \cos^{2} θ]}{z_{l} z_{r}} \\ + z_{w} \frac{Δ m_{r} [z_{w} \cos^{2} θ + \frac{(x_{w}^{2} - \frac{d_{i}^{2}}{4})}{z_{w}} \sin^{2} θ - 2 x_{w} \sin θ \cos θ]}{z_{l} z_{r}} \\ + z_{w} \frac{\sin Δ θ_{l} [z_{w} + \frac{x_{w}^{2} - \frac{d_{i}^{2}}{4}}{z_{w}}] - \sin Δ θ_{r} [z_{w} + \frac{x_{w}^{2} - \frac{d_{i}^{2}}{4}}{z_{w}}]}{z_{l} z_{r}} \end{array}

(A33)

Δm_r, Δm_l, Δθ_l and Δθ_r are usually close to 0,

x_{w} ≪ z_{w},

and

d_{i} ≪ z_{w}

. Thus,

b \approx z_{w} \frac{- d_{i} + z_{w} (Δ m_{r} \cos^{2} θ - Δ m_{l} \cos^{2} θ + \sin Δ θ_{l} - \sin Δ θ_{r})}{z_{l} z_{r}}

(A34)

From (A22), (A29) and (A34), we can obtain

z_{w}^{'} \approx \frac{z_{w} d_{i}}{d_{i} - z_{w} (Δ m_{r} \cos^{2} θ - Δ m_{l} \cos^{2} θ + \sin Δ θ_{l} - \sin Δ θ_{r})}

(A35)

So,

Δ z = z_{w}^{'} - z_{w} \approx - z_{w} \frac{z_{w} (Δ m_{l} \cos^{2} θ - Δ m_{r} \cos^{2} θ - \sin Δ θ_{l} + \sin Δ θ_{r})}{d_{i} + z_{w} (Δ m_{l} \cos^{2} θ - Δ m_{r} \cos^{2} θ - \sin Δ θ_{l} + \sin Δ θ_{r})}

(A36)

Δθ_l and Δθ_r are usually close to 0, so

Δ z \approx - z_{w} \frac{z_{w} (Δ m_{l} \cos^{2} θ - Δ m_{r} \cos^{2} θ - Δ θ_{l} + Δ θ_{r})}{d_{i} + z_{w} (Δ m_{l} \cos^{2} θ - Δ m_{r} \cos^{2} θ - Δ θ_{l} + Δ θ_{r})}

(A37)

The proof is completed.

Appendix C. Two Equations Related to θ_t and θ_p

Substituting Equation (59) into Equation (60), we can obtain:

(\begin{matrix} ^{1} x_{c} \\ ^{1} y_{c} \\ ^{1} z_{c} \\ 1 \end{matrix}) = (\begin{array}{cccc} ^{1} n_{x} & ^{1} o_{x} & ^{1} a_{x} & ^{1} p_{x} \\ ^{1} n_{y} & ^{1} o_{y} & ^{1} a_{y} & ^{1} p_{y} \\ ^{1} n_{z} & ^{1} o_{z} & ^{1} a_{z} & ^{1} p_{z} \\ 0 & 0 & 0 & 1 \end{array}) (\begin{array}{cccc} 1 & 0 & 0 & 0 \\ 0 & \cos^{1} θ_{t} & \sin^{1} θ_{t} & 0 \\ 0 & - \sin^{1} θ_{t} & \cos^{1} θ_{t} & 0 \\ 0 & 0 & 0 & 1 \end{array}) (\begin{array}{cccc} \cos^{1} θ_{p} & 0 & - \sin^{1} θ_{p} & 0 \\ 0 & 1 & 0 & 0 \\ \sin^{1} θ_{p} & 0 & \cos^{1} θ_{p} & 0 \\ 0 & 0 & 0 & 1 \end{array}) (\begin{matrix} x_{w} \\ y_{w} \\ z_{w} \\ 1 \end{matrix})

(A38)

Equation (A38) can be factored out:

(\begin{matrix} ^{1} x_{c} \\ ^{1} y_{c} \\ ^{1} z_{c} \end{matrix}) = (\begin{matrix} \begin{matrix} ^{l} n_{x} x_{w} \cos^{l} θ_{p} -^{l} n_{x} z_{w} \sin^{l} θ_{p} +^{l} o_{x} x_{w} \sin^{l} θ_{p} \sin^{l} θ_{t} \\ +^{l} o_{x} y_{w} \cos^{l} θ_{t} +^{l} o_{x} z_{w} \cos^{l} θ_{p} \sin^{l} θ_{t} +^{l} a_{x} x_{w} \sin^{l} θ_{p} \cos^{l} θ_{t} \\ -^{l} a_{x} y_{w} \sin^{l} θ_{t} +^{l} a_{x} z_{w} \cos^{l} θ_{p} \cos^{l} θ_{t} +^{l} p_{x} \end{matrix} \\ \begin{matrix} ^{l} n_{y} x_{w} \cos^{l} θ_{p} -^{l} n_{y} z_{w} \sin^{l} θ_{p} +^{l} o_{y} x_{w} \sin^{l} θ_{p} \sin^{l} θ_{t} \\ +^{l} o_{y} y_{w} \cos^{l} θ_{t} +^{l} o_{y} z_{w} \cos^{l} θ_{p} \sin^{l} θ_{t} +^{l} a_{y} x_{w} \sin^{l} θ_{p} \cos^{l} θ_{t} \\ -^{l} a_{y} y_{w} \sin^{l} θ_{t} +^{l} a_{y} z_{w} \cos^{l} θ_{p} \cos^{l} θ_{t} +^{l} p_{y} \end{matrix} \\ \begin{matrix} ^{l} n_{z} x_{w} \cos^{l} θ_{p} -^{l} n_{z} z_{w} \sin^{l} θ_{p} +^{l} o_{z} x_{w} \sin^{l} θ_{p} \sin^{l} θ_{t} \\ +^{l} o_{z} y_{w} \cos^{l} θ_{t} +^{l} o_{z} z_{w} \cos^{l} θ_{p} \sin^{l} θ_{t} +^{l} a_{z} x_{w} \sin^{l} θ_{p} \cos^{l} θ_{t} \\ -^{l} a_{z} y_{w} \sin^{l} θ_{t} +^{l} a_{z} z_{w} \cos^{l} θ_{p} \cos^{l} θ_{t} +^{l} p_{z} \end{matrix} \end{matrix})

(A39)

Substituting Equation (A39) into Equation (62), we can obtain:

(\begin{matrix} Δ u_{l} \\ Δ v_{l} \end{matrix}) = (\begin{matrix} \frac{\begin{matrix} (^{l} k_{x}^{l} n_{x} x_{w} \cos^{l} θ_{p} -^{l} k_{x}^{l} n_{x} z_{w} \sin^{l} θ_{p} +^{l} k_{x}^{l} o_{x} x_{w} \sin^{l} θ_{p} \sin^{l} θ_{t} \\ +^{l} k_{x}^{l} o_{x} y_{w} \cos^{l} θ_{t} +^{l} k_{x}^{l} o_{x} z_{w} \cos^{l} θ_{p} \sin^{l} θ_{t} +^{l} k_{x}^{l} a_{x} x_{w} \sin^{l} θ_{p} \cos^{l} θ_{t} \\ -^{l} k_{x}^{l} a_{x} y_{w} \sin^{l} θ_{t} +^{l} k_{x}^{l} a_{x} z_{w} \cos^{l} θ_{p} \cos^{l} θ_{t} +^{l} k_{x}^{l} p_{x}) \end{matrix}}{\begin{matrix} (^{l} n_{z} x_{w} \cos^{l} θ_{p} -^{l} n_{z} z_{w} \sin^{l} θ_{p} +^{l} o_{z} x_{w} \sin^{l} θ_{p} \sin^{l} θ_{t} \\ +^{l} o_{z} y_{w} \cos^{l} θ_{t} +^{l} o_{z} z_{w} \cos^{l} θ_{p} \sin^{l} θ_{t} +^{l} a_{z} x_{w} \sin^{l} θ_{p} \cos^{l} θ_{t} \\ -^{l} a_{z} y_{w} \sin^{l} θ_{t} +^{l} a_{z} z_{w} \cos^{l} θ_{p} \cos^{l} θ_{t} +^{l} p_{z}) \end{matrix}} \\ \frac{\begin{matrix} (^{l} k_{y}^{l} n_{y} x_{w} \cos^{l} θ_{p} -^{l} k_{y}^{l} n_{y} z_{w} \sin^{l} θ_{p} +^{l} k_{y}^{l} o_{y} x_{w} \sin^{l} θ_{p} \sin^{l} θ_{t} \\ +^{l} k_{y}^{l} o_{y} y_{w} \cos^{l} θ_{t} +^{l} k_{y}^{l} o_{y} z_{w} \cos^{l} θ_{p} \sin^{l} θ_{t} +^{l} k_{y}^{l} a_{y} x_{w} \sin^{l} θ_{p} \cos^{l} θ_{t} \\ -^{l} k_{y}^{l} a_{y} y_{w} \sin^{l} θ_{t} +^{l} k_{y}^{l} a_{y} z_{w} \cos^{l} θ_{p} \cos^{l} θ_{t} +^{l} k_{y}^{l} p_{y}) \end{matrix}}{\begin{matrix} (^{l} n_{z} x_{w} \cos^{l} θ_{p} -^{l} n_{z} z_{w} \sin^{l} θ_{p} +^{l} o_{z} x_{w} \sin^{l} θ_{p} \sin^{l} θ_{t} \\ +^{l} o_{z} y_{w} \cos^{l} θ_{t} +^{l} o_{z} z_{w} \cos^{l} θ_{p} \sin^{l} θ_{t} +^{l} a_{z} x_{w} \sin^{l} θ_{p} \cos^{l} θ_{t} \\ -^{l} a_{z} y_{w} \sin^{l} θ_{t} +^{l} a_{z} z_{w} \cos^{l} θ_{p} \cos^{l} θ_{t} +^{l} p_{z}) \end{matrix}} \end{matrix})

(A40)

Based on the same principle, substituting into each matrix and factoring the value of ∆m_r, we can obtain

(\begin{matrix} Δ u_{r} \\ Δ v_{r} \end{matrix}) = (\begin{matrix} \frac{\begin{matrix} (^{r} k_{x}^{r} n_{x}^{r} x_{w} \cos^{r} θ_{p} -^{r} k_{x}^{r} n_{x}^{r} z_{w} \sin^{r} θ_{p} +^{r} k_{x}^{r} o_{x}^{r} x_{w} \sin^{r} θ_{p} \sin^{r} θ_{t} \\ +^{r} k_{x}^{r} o_{x}^{r} y_{w} \cos^{r} θ_{t} +^{r} k_{x}^{r} o_{x}^{r} z_{w} \cos^{r} θ_{p} \sin^{r} θ_{t} +^{r} k_{x}^{r} a_{x}^{r} x_{w} \sin^{r} θ_{p} \cos^{r} θ_{t} \\ -^{r} k_{x}^{r} a_{x}^{r} y_{w} \sin^{r} θ_{t} +^{r} k_{x}^{r} a_{x}^{r} z_{w} \cos^{r} θ_{p} \cos^{r} θ_{t} +^{r} k_{x}^{r} p_{x}) \end{matrix}}{\begin{matrix} (^{r} n_{z}^{r} x_{w} \cos^{r} θ_{p} -^{r} n_{z}^{r} z_{w} \sin^{r} θ_{p} +^{r} o_{z}^{r} x_{w} \sin^{r} θ_{p} \sin^{r} θ_{t} \\ +^{r} o_{z}^{r} y_{w} \cos^{r} θ_{t} +^{r} o_{z}^{r} z_{w} \cos^{r} θ_{p} \sin^{r} θ_{t} +^{r} a_{z}^{r} x_{w} \sin^{r} θ_{p} \cos^{r} θ_{t} \\ -^{r} a_{z}^{r} y_{w} \sin^{r} θ_{t} +^{r} a_{z}^{r} z_{w} \cos^{r} θ_{p} \cos^{r} θ_{t} +^{r} p_{z}) \end{matrix}} \\ \frac{\begin{matrix} (^{r} k_{y}^{r} n_{y}^{r} x_{w} \cos^{r} θ_{p} -^{r} k_{y}^{r} n_{y}^{r} z_{w} \sin^{r} θ_{p} +^{r} k_{y}^{r} o_{y}^{r} x_{w} \sin^{r} θ_{p} \sin^{r} θ_{t} \\ +^{r} k_{y}^{r} o_{y}^{r} y_{w} \cos^{r} θ_{t} +^{r} k_{y}^{r} o_{y}^{r} z_{w} \cos^{r} θ_{p} \sin^{r} θ_{t} +^{r} k_{y}^{r} a_{y}^{r} x_{w} \sin^{r} θ_{p} \cos^{r} θ_{t} \\ -^{r} k_{y}^{r} a_{y}^{r} y_{w} \sin^{r} θ_{t} +^{r} k_{y}^{r} a_{y}^{r} z_{w} \cos^{r} θ_{p} \cos^{r} θ_{t} +^{r} k_{y}^{r} p_{y}) \end{matrix}}{\begin{matrix} (^{r} n_{z}^{r} x_{w} \cos^{r} θ_{p} -^{r} n_{z}^{r} z_{w} \sin^{r} θ_{p} +^{r} o_{z}^{r} x_{w} \sin^{r} θ_{p} \sin^{r} θ_{t} \\ +^{r} o_{z}^{r} y_{w} \cos^{r} θ_{t} +^{r} o_{z}^{r} z_{w} \cos^{r} θ_{p} \sin^{r} θ_{t} +^{r} a_{z}^{r} x_{w} \sin^{r} θ_{p} \cos^{r} θ_{t} \\ -^{r} a_{z}^{r} y_{w} \sin^{r} θ_{t} +^{r} a_{z}^{r} z_{w} \cos^{r} θ_{p} \cos^{r} θ_{t} +^{r} p_{z}) \end{matrix}} \end{matrix})

(A41)

By Equations (2), (A40) and (A41), Equation (A42) related to θ_t and θ_p can be obtained. It can be found from Equation (A32) that both θ_t and θ_p appear in the form of a trigonometric function, and it is difficult to obtain values of θ_t and θ_p directly from these two equations. In order to obtain the solution available in the project, we firstly obtain a sub-optimal observation pose and then use the sub-optimal observation pose as the initial value. We finally use the trial and error method to obtain the optimal observation pose.

{\begin{matrix} \begin{array}{l} (^{l} k_{x}^{l} n_{x} x_{w} \cos θ_{p} -^{l} k_{x}^{l} n_{x} z_{w} \sin θ_{p} +^{l} k_{x}^{l} o_{x} x_{w} \sin θ_{p} \sin θ_{t} \\ +^{l} k_{x}^{l} o_{x} y_{w} \cos θ_{t} +^{l} k_{x}^{l} o_{x} z_{w} \cos θ_{p} \sin θ_{t} + -^{l} k_{x}^{l} a_{x} y_{w} \sin θ_{t} \\ ^{l} k_{x}^{l} a_{x} x_{w} \sin θ_{p} \cos θ_{t} +^{l} k_{x}^{l} a_{x} z_{w} \cos θ_{p} \cos θ_{t} +^{l} k_{x}^{l} p_{x}) \\ (^{r} n_{z}^{r} x_{w} \cos θ_{p} -^{r} n_{z}^{r} z_{w} \sin θ_{p} +^{r} o_{z}^{r} x_{w} \sin θ_{p} \sin θ_{t} +^{r} p_{z} \\ +^{r} o_{z}^{r} y_{w} \cos θ_{t} +^{r} o_{z}^{r} z_{w} \cos θ_{p} \sin θ_{t} +^{r} a_{z}^{r} x_{w} \sin θ_{p} \cos θ_{t} \\ -^{r} a_{z}^{r} y_{w} \sin θ_{t} +^{r} a_{z}^{r} z_{w} \cos θ_{p} \cos θ_{t}) = (^{r} k_{x}^{r} n_{x}^{r} x_{w} \cos θ_{p} \\ -^{r} k_{x}^{r} n_{x}^{r} z_{w} \sin θ_{p} +^{r} k_{x}^{r} o_{x}^{r} x_{w} \sin θ_{p} \sin θ_{t} +^{r} k_{x}^{r} o_{x}^{r} y_{w} \cos θ_{t} \\ +^{r} k_{x}^{r} o_{x}^{r} z_{w} \cos θ_{p} \sin θ_{t} +^{r} k_{x}^{r} a_{x}^{r} x_{w} \sin θ_{p} \cos θ_{t} +^{r} k_{x}^{r} p_{x} \\ -^{r} k_{x}^{r} a_{x}^{r} y_{w} \sin θ_{t} +^{r} k_{x}^{r} a_{x}^{r} z_{w} \cos θ_{p} \cos θ_{t}) (^{l} n_{z} x_{w} \cos θ_{p} \\ -^{l} n_{z} z_{w} \sin θ_{p} +^{l} o_{z} x_{w} \sin θ_{p} \sin θ_{t} +^{l} o_{z} y_{w} \cos θ_{t} +^{l} o_{z} z_{w} \cos θ_{p} \sin θ_{t} \\ +^{l} a_{z} x_{w} \sin θ_{p} \cos θ_{t} -^{l} a_{z} y_{w} \sin θ_{t} +^{l} a_{z} z_{w} \cos θ_{p} \cos θ_{t} +^{l} p_{z}) \end{array} \\ \begin{array}{l} (^{l} k_{y}^{l} n_{y} x_{w} \cos θ_{p} -^{l} k_{y}^{l} n_{y} z_{w} \sin θ_{p} +^{l} k_{y}^{l} o_{y} x_{w} \sin θ_{p} \sin θ_{t} \\ +^{l} k_{y}^{l} o_{y} y_{w} \cos θ_{t} +^{l} k_{y}^{l} o_{y} z_{w} \cos θ_{p} \sin θ_{t} -^{l} k_{y}^{l} a_{y} y_{w} \sin θ_{t} \\ +^{l} k_{y}^{l} a_{y} x_{w} \sin θ_{p} \cos θ_{t} +^{l} k_{y}^{l} a_{y} z_{w} \cos θ_{p} \cos θ_{t} +^{l} k_{y}^{l} p_{y}) \\ (^{r} n_{z}^{r} x_{w} \cos θ_{p} -^{r} n_{z}^{r} z_{w} \sin θ_{p} +^{r} o_{z}^{r} x_{w} \sin θ_{p} \sin θ_{t} +^{r} p_{z} \\ +^{r} o_{z}^{r} y_{w} \cos θ_{t} +^{r} o_{z}^{r} z_{w} \cos θ_{p} \sin θ_{t} +^{r} a_{z}^{r} x_{w} \sin θ_{p} \cos θ_{t} \\ -^{r} a_{z}^{r} y_{w} \sin θ_{t} +^{r} a_{z}^{r} z_{w} \cos θ_{p} \cos θ_{t}) = (^{r} k_{y}^{r} n_{y}^{r} x_{w} \cos θ_{p} \\ -^{r} k_{y}^{r} n_{y}^{r} z_{w} \sin θ_{p} +^{r} k_{y}^{r} o_{y}^{r} x_{w} \sin θ_{p} \sin θ_{t} +^{r} k_{y}^{r} o_{y}^{r} y_{w} \cos θ_{t} \\ +^{r} k_{y}^{r} o_{y}^{r} z_{w} \cos θ_{p} \sin θ_{t} +^{r} k_{y}^{r} a_{y}^{r} x_{w} \sin θ_{p} \cos θ_{t} +^{r} k_{y}^{r} p_{y} \\ -^{r} k_{y}^{r} a_{y}^{r} y_{w} \sin θ_{t} +^{r} k_{y}^{r} a_{y}^{r} z_{w} \cos θ_{p} \cos θ_{t}) (^{l} n_{z} x_{w} \cos θ_{p} \\ -^{l} n_{z} z_{w} \sin θ_{p} +^{l} o_{z} x_{w} \sin θ_{p} \sin θ_{t} +^{l} o_{z} y_{w} \cos θ_{t} +^{l} o_{z} z_{w} \cos θ_{p} \sin θ_{t} \\ +^{l} a_{z} x_{w} \sin θ_{p} \cos θ_{t} -^{l} a_{z} y_{w} \sin θ_{t} +^{l} a_{z} z_{w} \cos θ_{p} \cos θ_{t} +^{l} p_{z}) \end{array} \end{matrix}

(A42)

References

Wang, Q.; Zou, W.; Xu, D.; Zhu, Z. Motion control in saccade and smooth pursuit for bionic eye based on three-dimensional coordinates. J. Bionic Eng. 2017, 14, 336–347. [Google Scholar] [CrossRef]
Kardamakis, A.A.; Moschovakis, A.K. Optimal control of gaze shifts. J. Neurosci. 2009, 29, 7723–7730. [Google Scholar] [CrossRef] [PubMed]
Freedman, E.G.; Sparks, D.L. Coordination of the eyes and head: Movement kinematics. Exp. Brain Res. 2000, 131, 22–32. [Google Scholar] [CrossRef] [PubMed]
Nakashima, R.; Fang, Y.; Hatori, Y.; Hiratani, A.; Matsumiya, K.; Kuriki, I.; Shioiri, S. Saliency-based gaze prediction based on head direction. Vis. Res. 2015, 117, 59–66. [Google Scholar] [CrossRef] [PubMed]
He, H.; Ge, S.S.; Zhang, Z. A saliency-driven robotic head with bio-inspired saccadic behaviors for social robotics. Auton. Robot. 2014, 36, 225–240. [Google Scholar] [CrossRef]
Law, J.; Shaw, P.; Lee, M. A biologically constrained architecture for developmental learning of eye–head gaze control on a humanoid robot. Auton. Robot. 2013, 35, 77–92. [Google Scholar] [CrossRef]
Wijayasinghe, I.B.; Aulisa, E.; Buttner, U.; Ghosh, B.K.; Glasauer, S.; Kremmyda, O. Potential and optimal target fixating control of the human head/eye complex. IEEE Trans. Control Syst. Technol. 2015, 23, 796–804. [Google Scholar] [CrossRef]
Ghosh, B.K.; Wijayasinghe, I.B.; Kahagalage, S.D. A geometric approach to head/eye control. IEEE Access 2014, 2, 316–332. [Google Scholar] [CrossRef]
Kuang, X.; Gibson, M.; Shi, B.E.; Rucci, M. Active vision during coordinated head/eye movements in a humanoid robot. IEEE Trans. Robot. 2012, 28, 1423–1430. [Google Scholar] [CrossRef]
Vannucci, L.; Cauli, N.; Falotico, E.; Bernardino, A.; Laschi, C. Adaptive visual pursuit involving eye-head coordination and prediction of the target motion. In Proceedings of the IEEE-RAS International Conference on Humanoid Robots, Madrid, Spain, 18–20 November 2014; pp. 541–546. [Google Scholar]
Huelse, M.; McBride, S.; Law, J.; Lee, M. Integration of active vision and reaching from a developmental robotics perspective. IEEE Trans. Auton. Ment. Dev. 2010, 2, 355–367. [Google Scholar] [CrossRef] [Green Version]
Anastasopoulos, D.; Naushahi, J.; Sklavos, S.; Bronstein, A.M. Fast gaze reorientations by combined movements of the eye, head, trunk and lower extremities. Exp. Brain Res. 2015, 233, 1639–1650. [Google Scholar] [CrossRef] [Green Version]
Daye, P.M.; Optican, L.M.; Blohm, G.; Lefèvre, P. Hierarchical control of two-dimensional gaze saccades. J. Comput. Neurosci. 2014, 36, 355–382. [Google Scholar] [CrossRef] [PubMed]
Rajruangrabin, J.; Popa, D.O. Robot head motion control with an emphasis on realism of neck–eye coordination during object tracking. J. Intell. Robot. Syst. 2011, 63, 163–190. [Google Scholar] [CrossRef]
Schulze, L.; Renneberg, B.; Lobmaier, J.S. Gaze perception in social anxiety and social anxiety disorder. Front. Hum. Neurosci. 2013, 7, 1–5. [Google Scholar] [CrossRef] [PubMed]
Liu, Y.; Zhu, D.; Peng, J.; Wang, X.; Wang, L.; Chen, L.; Li, J.; Zhang, X. Real-time robust stereo visual SLAM system based on bionic eyes. IEEE Trans. Med. Robot. Bionics 2020, 2, 391–398. [Google Scholar] [CrossRef]
Guitton, D. Control of eye–head coordination during orienting gaze shifts. Trends Neurosci. 1993, 15, 174–179. [Google Scholar] [CrossRef]
Matveev, A.S.; Hoy, M.C.; Savkin, A.V. 3D environmental extremum seeking navigation of a nonholonomic mobile robot. Automatica 2014, 50, 1802–1815. [Google Scholar] [CrossRef]
Nefti-Meziani, S.; Manzoor, U.; Davis, S.; Pupala, S.K. 3D perception from binocular vision for a low cost humanoid robot NAO. Robot. Auton. Syst. 2015, 68, 129–139. [Google Scholar] [CrossRef]
Surmann, H.; Nüchter, A.; Hertzberg, J. An autonomous mobile robot with a 3D laser range finder for 3D exploration and digitalization of indoor environments. Robot. Auton. Syst. 2003, 45, 181–198. [Google Scholar] [CrossRef]
Song, W.; Minami, M.; Shen, L.Y.; Zhang, Y.N. Bionic tracking method by hand & eye-vergence visual servoing. Adv. Manuf. 2016, 4, 157–166. [Google Scholar]
Li, H.Y.; Luo, J.; Huang, C.J.; Huang, Q.Z.; Xie, S.R. Design and control of 3-DoF spherical parallel mechanism robot eyes inspired by the binocular vestibule-ocular reflex. J. Intell. Robot. Syst. 2015, 78, 425–441. [Google Scholar] [CrossRef]
Masseck, O.A.; Hoffmann, K.P. Comparative neurobiology of the optokinetic reflex. Ann. N. Y. Acad. Sci. 2009, 1164, 430–439. [Google Scholar] [CrossRef] [PubMed]
Bruske, J.; Hansen, M.; Riehn, L.; Sommer, G. Biologically inspired calibration-free adaptive saccade control of a binocular camera-head. Biol. Cybern. 1997, 77, 433–446. [Google Scholar] [CrossRef]
Wang, X.; Van De Weem, J.; Jonker, P. An advanced active vision system imitating human eye movements. In Proceedings of the 2013 16th International Conference on Advanced Robotics, Montevideo, Uruguay, 25–29 November 2013; pp. 5–10. [Google Scholar]
Antonelli, M.; Duran, A.J.; Chinellato, E.; Pobil, A.P. Adaptive saccade controller inspired by the primates’ cerebellum. In Proceedings of the IEEE International Conference on Robotics and Automation, Seattle, WA, USA, 26–30 May 2015; pp. 5048–5053. [Google Scholar]
Robinson, D.A.; Gordon, J.L.; Gordon, S.E. A model of the smooth pursuit eye movement system. Biol. Cybern. 1986, 55, 43–57. [Google Scholar] [CrossRef] [PubMed]
Brown, C. Gaze controls with interactions and delays. IEEE Trans. Syst. Man Cybern. 1990, 20, 518–527. [Google Scholar] [CrossRef]
Deno, D.C.; Keller, E.L.; Crandall, W.F. Dynamical neural network organization of the visual pursuit system. IEEE Trans. Biomed. Eng. 1989, 36, 85–92. [Google Scholar] [CrossRef] [PubMed]
Lunghi, F.; Lazzari, S.; Magenes, G. Neural adaptive predictor for visual tracking system. In Proceedings of the 20th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Hong Kong, China, 1 November 1998; Volume 20, pp. 1389–1392. [Google Scholar]
Lee, W.J.; Galiana, H.L. An internally switched model of ocular tracking with prediction. IEEE Trans. Neural Syst. Rehabil. Eng. 2005, 13, 186–193. [Google Scholar] [CrossRef]
Avni, O.; Borrelli, F.; Katzir, G.; Rivlin, E.; Rotstein, H. Scanning and tracking with independent cameras-a biologically motivated approach based on model predictive control. Auton. Robot. 2008, 24, 285–302. [Google Scholar] [CrossRef]
Zhang, M.; Ma, X.; Qin, B.; Wang, G.; Guo, Y.; Xu, Z.; Wang, Y.; Li, Y. Information fusion control with time delay for smooth pursuit eye movement. Physiol. Rep. 2016, 4, e12775. [Google Scholar] [CrossRef]
Santini, F.; Rucci, M. Active estimation of distance in a robotic system that replicates human eye movement. Robot. Auton. Syst. 2007, 55, 107–121. [Google Scholar] [CrossRef]
Chinellato, E.; Antonelli, M.; Grzyb, B.J.; Del Pobil, A.P. Implicit sensorimotor mapping of the peripersonal space by gazing and reaching. IEEE Trans. Auton. Ment. Dev. 2011, 3, 43–53. [Google Scholar] [CrossRef]
Song, Y.; Zhang, X. An active binocular integrated system for intelligent robot vision. In Proceedings of the IEEE International Conference on Intelligence and Security Informatics, Washington, DC, USA, 11–14 June 2012; pp. 48–53. [Google Scholar]
Wang, Y.; Zhang, G.; Lang, H.; Zuo, B.; De Silva, C.W. A modified image-based visual servo controller with hybrid camera configuration for robust robotic grasping. Robot. Auton. Syst. 2014, 62, 1398–1407. [Google Scholar] [CrossRef]
Lee, Y.C.; Lan, C.C.; Chu, C.Y.; Lai, C.M.; Chen, Y.J. A pan-tilt orienting mechanism with parallel axes of flexural actuation. IEEE-ASME Trans. Mechatron. 2013, 18, 1100–1112. [Google Scholar] [CrossRef]
Wang, Q.; Zou, W.; Zhang, F.; Xu, D. Binocular initial location and extrinsic parameters real-time calculation for bionic eye system. In Proceedings of the 11th World Congress on Intelligent Control and Automation, Shenyang, China, 29 June–4 July 2014; pp. 74–80. [Google Scholar]
Fan, D.; Liu, Y.Y.; Chen, X.P.; Meng, F.; Liu, X.L.; Ullah, Z.; Cheng, W.; Liu, Y.H.; Huang, Q. Eye gaze based 3D triangulation for robotic bionic eyes. Sensors 2020, 20, 5271. [Google Scholar] [CrossRef] [PubMed]

Figure 1. Physical implementation of the robot system. (a) The front side. (b) The left side. (c) The right side.

Figure 2. Robot system’s organization diagram.

Figure 3. Block diagram of the gaze point tracking control system.

Figure 4. Robot coordinates system and system parameter definition, (a) coordinate system definition, (b) eye–head system parameters and (c) mobile robot parameters.

Figure 5. Schematic of the relationship between a Cartesian point and its image point.

Figure 6. (a) Mechanical structure and coordinate systems of the bionic eye platform and (b) binocular 3D perception principle of bionic eyes.

Figure 7. Principle of head rotation calculation in fixation point tracking: (a) horizontal rotation angle and (b) vertical rotation angle.

Figure 8. Steps for calculating the desired pose of the fixation point.

Figure 9. Robot pose control block diagram.

Figure 10. Experimental in situ gaze point tracking scene.

Figure 11. Experimental results of gaze shifting to the target: (a) U coordinates of the target on the left image. (b) V coordinates of the target on the left image. (c) U coordinates of the target on the right image. (d) V coordinates of the target on the right image. (e) Left camera tilt. (f) Left camera pan. (g) Right camera tilt. (h) Right camera pan. (i) Head pan. (j) Angle deviation and rotation. (k) Coordinates (^wx, ^wz) of the target in the world coordinate system. (l) Coordinates (^ox, ^oz) of the target in the world coordinate system based on the origin location. The “+” in the subfigures (k,l) represents the position of the target in the coordinate system, and the “☆” represents the position of the robot in the coordinate system.

Figure 12. Experimental approaching gaze point tracking scene.

Figure 13. Experimental results of gaze shifting to the target: (a) U coordinates of the target on the left image. (b) V coordinates of the target on the left image. (c) U coordinates of the target on the right image. (d) V coordinates of the target on the right image. (e) Left camera tilt. (f) Left camera pan. (g) Right camera tilt. (h) Right camera pan. (i) Head pan. (j) Angular deviation and rotation. (k) Coordinates (^wx, ^wz) of the target in the world coordinate system. (l) Robot forward distance and the distance between the target and robot.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Feng, X.; Wang, Q.; Cong, H.; Zhang, Y.; Qiu, M. Gaze Point Tracking Based on a Robotic Body–Head–Eye Coordination Method. Sensors 2023, 23, 6299. https://doi.org/10.3390/s23146299

AMA Style

Feng X, Wang Q, Cong H, Zhang Y, Qiu M. Gaze Point Tracking Based on a Robotic Body–Head–Eye Coordination Method. Sensors. 2023; 23(14):6299. https://doi.org/10.3390/s23146299

Chicago/Turabian Style

Feng, Xingyang, Qingbin Wang, Hua Cong, Yu Zhang, and Mianhao Qiu. 2023. "Gaze Point Tracking Based on a Robotic Body–Head–Eye Coordination Method" Sensors 23, no. 14: 6299. https://doi.org/10.3390/s23146299

APA Style

Feng, X., Wang, Q., Cong, H., Zhang, Y., & Qiu, M. (2023). Gaze Point Tracking Based on a Robotic Body–Head–Eye Coordination Method. Sensors, 23(14), 6299. https://doi.org/10.3390/s23146299

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Gaze Point Tracking Based on a Robotic Body–Head–Eye Coordination Method

Abstract

1. Introduction

2. Platform and Control System

2.1. Robot Platform

2.2. Control System

3. Desired Pose Calculation

3.1. Optimal Observation Position of Eyes

3.2. Desired Pose Calculation for In Situ Gaze Point Tracking

3.2.1. Three-Dimensional Coordinate Calculation

3.2.2. Horizontal Rotation Angle Calculation

3.2.3. Vertical Rotation Angle Calculation

3.2.4. Calculation of the Desired Observation Poses of the Eye

3.3. Desired Pose Calculation for Approaching Gaze Point Tracking

4. Robot Pose Control

5. Experiments and Discussion

5.1. In Situ Gaze Point Tracking Experiment

5.2. Approaching Gaze Point Tracking Experiment

6. Conclusions

Author Contributions

Funding

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Conflicts of Interest

Appendix A. Measurement Error Analysis of Binocular Stereo Vision

Appendix A.1. Vision System of Bionic Eyes

Appendix A.2. Depth Measurement Model

Appendix A.3. Measurement Error Analysis

Appendix B. Proof of Equation (100)

Appendix C. Two Equations Related to θ_t and θ_p

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI

Article Menu

Gaze Point Tracking Based on a Robotic Body–Head–Eye Coordination Method

Abstract

1. Introduction

2. Platform and Control System

2.1. Robot Platform

2.2. Control System

3. Desired Pose Calculation

3.1. Optimal Observation Position of Eyes

3.2. Desired Pose Calculation for In Situ Gaze Point Tracking

3.2.1. Three-Dimensional Coordinate Calculation

3.2.2. Horizontal Rotation Angle Calculation

3.2.3. Vertical Rotation Angle Calculation

3.2.4. Calculation of the Desired Observation Poses of the Eye

3.3. Desired Pose Calculation for Approaching Gaze Point Tracking

4. Robot Pose Control

5. Experiments and Discussion

5.1. In Situ Gaze Point Tracking Experiment

5.2. Approaching Gaze Point Tracking Experiment

6. Conclusions

Author Contributions

Funding

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Conflicts of Interest

Appendix A. Measurement Error Analysis of Binocular Stereo Vision

Appendix A.1. Vision System of Bionic Eyes

Appendix A.2. Depth Measurement Model

Appendix A.3. Measurement Error Analysis

Appendix B. Proof of Equation (100)

Appendix C. Two Equations Related to θt and θp

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI

Appendix C. Two Equations Related to θ_t and θ_p