Variable Admittance Control Based on Trajectory Prediction of Human Hand Motion for Physical Human-Robot Interaction

Abstract: To achieve effective physical human-robot interaction, human dynamic characteristics need to be considered in admittance control. This paper proposes a variable admittance control method for physical human-robot interaction based on trajectory prediction of human hand motion. By predicting the moving direction of the robot end tool under human guidance, the admittance control parameters are adjusted to reduce the interaction force. The end tool trajectory of the robot under human guidance is used for offline training of a long short-term memory (LSTM) neural network to generate trajectory predictors. The trajectory predictors are then used in variable admittance control to predict the trajectory and movement direction of the robot end tool in real time. The variable admittance controller adjusts the damping matrix to reduce the damping value in the moving direction. Experiment results show that, using the constant admittance method as a benchmark, the proposed method reduces the interaction force by at least 23%, the trajectory error by at least 51%, and the operating jerk by at least 21%, which indicates that the proposed method improves the accuracy and compliance of the operation.


Introduction
With the expansion of the application range of robots, it is increasingly common for humans and robots to collaborate closely in the same space, especially in the fields of industrial assembly and medical robots [1]. To improve the working efficiency and safety of the human-robot coupling system, physical human-robot interaction (pHRI) has been widely studied [2]. The pHRI performance can be improved from two aspects: structure design and controller design. Methods of structure design, such as artificial muscles and soft robots, have been studied extensively. For example, Dai et al. [3] proposed a handshaking robotic arm based on a variable viscoelastic joint consisting of artificial muscles and magnetorheological brakes, enabling it to acquire properties similar to human arm joints. Ohta et al. [4] developed a seven-DOF bio-inspired manipulator that combines elements of rigid and soft robots, thus showing excellent compliance and accuracy while maintaining a low weight. This paper focuses on improving the pHRI performance from the perspective of controller design, especially the integration of human arm impedance into variable admittance control. The impedance characteristics of the arm are generally incorporated into the controller design of pHRI by means of parameter estimation, indirect measurement, artificial intelligence, and adaptive control.
Scholars have done extensive work on modeling and indirect measurement of arm impedance characteristics. As early as 2002, Rahman et al. [5] modeled the human arm as a spring-damper system. After collecting interaction force and position data through human-robot cooperation experiments, they obtained the impedance characteristics of the human arm by means of a parametric model, which was used to control a robot to simulate human motion and thus achieve human-robot cooperation. Tsumugiwa et al. [6] estimated the stiffness of the human arm from experiment data, including the interaction force and the position of the robot end effector in the low-speed range, and adjusted the viscosity coefficient of the impedance controller in real time to be proportional to the estimated stiffness of the operator's arm. The drawback of this method is that the environment force must not be too large relative to the operating force; otherwise, the accuracy of the estimated human stiffness is affected. Erden et al. [7][8][9] carried out a series of works on human-robot interaction in robot-assisted welding. In [7], they distinguished skilled from unskilled workers by analyzing the pose information of the welding gun. In [8], they investigated the optimal impedance parameters of the robot to reduce human effort while suppressing the vibration of the welding torch. In [9], they applied the measurement of human arm impedance to human-robot interaction control. To measure the impedance of the human arm, the robotic arm was set to admittance control mode, multiple subjects operated the torch to draw straight lines repeatedly, and the force-position information was collected. The results showed that experienced operators exhibited greater impedance in the direction perpendicular to the drawing line. On this basis, a directional damping scheme for robot-assisted welding was developed, which significantly improved the accuracy of the operators.
Grafakos et al. [10] proposed a variable admittance control method that evaluates human arm stiffness through EMG signal strength. The virtual damping is adjusted in real time by measuring the operator's muscle activation state to switch between high and low damping; compared with constant admittance, the algorithm significantly reduced the operator's operating force and improved the accuracy of the co-movement. Gallagher et al. [11] proposed a variable admittance control method based on EMG signals. A skeletal muscle model of the upper limb containing multiple muscles was first built; the best pair of antagonist muscles was then selected by comparing the effects of various muscles on the wrist and elbow joints; finally, the end stiffness of the arm was estimated by measuring the contraction level of the antagonist muscles. Ajoudani et al. [12] introduced the concept of tele-impedance, which estimates the human arm stiffness in real time by measuring the EMG signals of eight muscles on the operator's arm and uses it to adjust the virtual stiffness of a robot in a master-slave tele-operation task. The method yielded good results in preliminary experiments, but its application was limited by the variability, drift and inaccuracy of EMG signals in real environments and by the need for frequent sensor recalibration. In summary, many important achievements have been obtained in the estimation and measurement of human arm impedance, but there are still many limitations. On the one hand, these methods are difficult to apply to unstructured tasks because they require an a priori motion model [13]. On the other hand, it is difficult to estimate the stiffness of each operator's arm accurately, and this approach increases the complexity of the pHRI control system because it requires complex peripheral sensors.
The neural network approach has advantages in dealing with complex model problems and can be used to recognize and predict human intentions. It should be noted that stochastic artificial intelligence (S.A.I) does not need a full understanding of the dynamic model of the controlled object, and the convergence and accuracy of its predictions can be ensured by learning from a large amount of data. Deterministic artificial intelligence (D.A.I), which establishes the controller model according to physical laws, can achieve high-precision performance and greatly reduce the demand for large amounts of data. Timothy reviewed the development of D.A.I in the control of unmanned underwater vehicles [14], and used D.A.I to control the DC motor of an unmanned underwater vehicle to reduce the trajectory tracking error [15]. Baker et al. [16] realized the autonomous generation of control trajectories through D.A.I, which significantly reduced the average trajectory error. Although D.A.I has achieved excellent results, S.A.I is still the main method in research on trajectory prediction in pHRI. Sharkawy [17] proposed a variable admittance control method for pHRI, in which a multilayer feedforward neural network adjusts the virtual damping of the admittance controller online using the robot's velocity and the operating force as inputs. Dimeas et al. [18] proposed a variable admittance controller based on reinforcement learning for a human-robot cooperative operation task. By setting the objective of the reinforcement learning algorithm to minimize acceleration during point-to-point motion, the controller can learn an appropriate damping adjustment for effective collaboration without prior knowledge of the target position or other task characteristics. Maithani et al. [19] used an RNN-LSTM network to predict the interaction force in an industrial meat cutting task, and applied it in an impedance controller to enable the robot to provide assisting forces to the butcher. Medina et al. [20] combined an empirical model of human motion to predict human behavior in order to estimate human intentions in pHRI. Lee et al. [21] characterized the dynamics of the human arm and robot and compensated the reaction forces using human hand impedance, thus improving the transparency of the pHRI control system. He et al. [22] investigated an admittance controller for interactive operations in a restricted task space and used an adaptive neural network to handle the trajectory tracking problem. In general, the neural network method has achieved good results in pHRI by predicting human behavior and introducing it into variable admittance control. However, most of these studies do not consider the direction of motion; that is, the admittance is isotropic in the task space. In the application scenario of this paper, robot-assisted joint replacement surgery, the robot needs to move accurately along a predetermined task path under the guidance of the operator, and the compliance of pHRI in the motion direction has the greatest clinical significance.
In this paper, we propose a new variable admittance control method, which uses a long short-term memory (LSTM) neural network to predict the intended trajectory and moving direction of the robot end tool under operator guidance. The virtual damping matrix of the robot is then changed to realize variable admittance control in the motion direction, thus obtaining good pHRI performance. The experiment results show that the proposed method achieves smoother human-robot interaction and more accurate path following than constant admittance control.
The rest of this paper is organized as follows. Section 2 provides an overview of the proposed control method. Section 3 provides details of the admittance controller with human hand motion prediction. Section 4 shows the LSTM design and training process. The experiment results and analysis are presented in Section 5. Finally, Section 6 presents the conclusions and discussions.

Overview of the Control System
The entire system includes a robot system, an LSTM predictor, and a variable admittance controller, as shown in Figure 1. The robot system includes a six-DOF serial manipulator, and a six-DOF force/torque sensor is installed between the robot and the end tool. When the robot is moved by the operator, the trajectory of the robot end tool between t−n and t is collected by the control system and used as the input of the LSTM predictor. The LSTM predictor outputs, in real time, the predicted trajectory of the robot end tool from t to t+m. The control system calculates the unit vector P_{t+m} of the movement direction from the predicted trajectory. The variable admittance controller sets the proportional coefficient of the damping matrix in the projection direction of P_{t+m} to k, and adjusts k to improve the compliance of pHRI.
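The data flow described above (a rolling window of past positions, a predictor, and a unit direction vector) can be sketched as follows. This is only an illustration under stated assumptions: the window length `N_HISTORY`, the horizon `M_HORIZON`, and the function names are hypothetical, and `constant_velocity_predictor` is a simple stand-in used here in place of the trained LSTM predictor.

```python
from collections import deque
import math

N_HISTORY = 10   # n: number of past samples kept for the predictor (assumed value)
M_HORIZON = 5    # m: number of future samples predicted (assumed value)


def constant_velocity_predictor(history, m):
    """Hypothetical stand-in for the LSTM predictor: repeat the last step m times."""
    last, prev = history[-1], history[-2]
    step = [a - b for a, b in zip(last, prev)]
    return [[a + (i + 1) * s for a, s in zip(last, step)] for i in range(m)]


def unit_direction(current, predicted_end, eps=1e-9):
    """Unit vector from the current position to the end of the predicted trajectory.

    Returns None when the predicted displacement is too small to define a direction.
    """
    delta = [e - c for c, e in zip(current, predicted_end)]
    norm = math.sqrt(sum(c * c for c in delta))
    if norm < eps:
        return None
    return [c / norm for c in delta]


# Rolling window holding the end tool positions between t-n and t.
window = deque(maxlen=N_HISTORY)
for i in range(12):
    window.append([0.001 * i, 0.002 * i, 0.0])  # synthetic straight-line motion

history = list(window)
predicted = constant_velocity_predictor(history, M_HORIZON)
direction = unit_direction(history[-1], predicted[-1])
```

Returning `None` for a near-zero displacement is a design choice of this sketch, so that a caller could fall back to the unmodified damping matrix when the tool is almost stationary.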

Trajectory Prediction Variable Admittance Control Method
Consider a typical mass-spring-damper model in which the mass M_x is subjected to an external force F_ext; the second-order linear equation is as follows.
where X is the position of the end tool in Cartesian space, X_d represents the pre-defined reference trajectory, G represents gravity, and f is the friction. M_x, D_x and K_x are the virtual mass, damping and stiffness in the Cartesian coordinate system, respectively. The application scenario of this paper is the grinding task in knee replacement surgery, for which Equation (1) can be further simplified. First, in the "hands-on" human-robot collaboration mode, the robot can be moved freely by the operator and maintains its position after the hand is released. In this control mode, the robot has no reference trajectory and the stiffness in the equivalent admittance is zero, so X_d and K_x do not appear in the admittance equation. Second, the gravity term can be compensated based on the joint angles, and friction is not considered in this paper, so G and f do not appear in the equation. Finally, considering that the damping force is produced by the output torque of the controller, the dynamic equation is as follows: where J is the Jacobian matrix and τ is the output torque of the motor. The corresponding dynamic model in joint space is generally written as: where q is the joint angle, and M_q and C_q are the virtual mass and damping in joint space, respectively.
In the trajectory prediction variable admittance control (TPVAC) method proposed in this paper, the key step is to predict the movement direction of the robot end tool and then change the damping in that direction. The schematic diagram and control block diagram of TPVAC are shown in Figure 2. In Figure 2a, points P_t and P_{t+∆t} are the positions of the robot end tool at times t and t+∆t, respectively, and P'_{t+∆t} is the predicted point at time t+∆t. V_t and V'_t are the actual and predicted movement velocities, respectively, with V'_t = (P'_{t+∆t} − P_t)/∆t. The predicted motion direction vector is then d_mt = V'_t/|V'_t|. F_t and F'_t are the robot damping forces at time t under constant damping and variable damping control, respectively. Under constant damping control, the damping force opposes the motion; with the initial damping matrix denoted D_0, F_t = −D_0 V_t. The damping force can be decomposed into components F_tu and F_tv that are parallel and perpendicular to the predicted direction of motion, respectively. Correspondingly, D_0 can be decomposed into D_tu and D_tv, so that F_tu = −D_tu V_t and F_tv = −D_tv V_t. The projection matrix along the direction of d_mt is D_mt = d_mt d_mt^T, and the two components of D_0 are therefore D_tu = D_mt D_0 and D_tv = (I − D_mt) D_0, where I is the identity matrix.
For variable admittance control, a proportional coefficient k with 0 < k < 1 is applied to D_tu. The damping force F_tu in the direction of motion is thereby reduced, while the damping force F_tv perpendicular to the direction of motion remains unchanged, so that the robot end tool tends to move toward the predicted direction of motion. The operator then feels less resistance when moving along the original direction, so the human-robot interaction force can be reduced while the direction of movement is kept stable. The damping matrix in Equation (4) at time t is D_x, which can be expressed as D_x = k D_tu + D_tv.
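As a rough numerical illustration of the directional damping adjustment described in this section, the sketch below builds a damping matrix that is scaled by k along the predicted direction and left unchanged perpendicular to it. Several points are assumptions rather than the paper's stated implementation: the projection matrix is taken as the outer product of the unit direction vector with itself, the simplified admittance law is taken in the common form M a + D v = F_ext (no stiffness, no reference trajectory), the integration uses an explicit Euler step, and the function names and numerical values are hypothetical.

```python
import numpy as np


def directional_damping(D0, direction, k):
    """Scale the damping along `direction` by k and keep the perpendicular part.

    Assumes the projection onto the unit direction d is P = d d^T, so that
    D = k * P @ D0 + (I - P) @ D0.
    """
    if not 0.0 < k < 1.0:
        raise ValueError("k is expected to satisfy 0 < k < 1")
    d = np.asarray(direction, dtype=float)
    d = d / np.linalg.norm(d)
    P = np.outer(d, d)                 # projection onto the motion direction
    I = np.eye(len(d))
    D_parallel = P @ D0                # damping component along the motion
    D_perpendicular = (I - P) @ D0     # damping component across the motion
    return k * D_parallel + D_perpendicular


def admittance_step(v, F_ext, M, D, dt):
    """One explicit-Euler step of the assumed law M a + D v = F_ext."""
    a = np.linalg.solve(M, np.asarray(F_ext, dtype=float) - D @ v)
    return v + dt * a


# Hypothetical values, chosen only to make the arithmetic easy to follow.
D0 = 20.0 * np.eye(3)
M = 2.0 * np.eye(3)
D = directional_damping(D0, direction=[2.0, 0.0, 0.0], k=0.5)
v_next = admittance_step(np.zeros(3), [4.0, 0.0, 0.0], M, D, dt=0.005)
```

With an isotropic D0 the resulting matrix stays symmetric; for a general D0 the product form used here need not be, which is worth checking before using such a matrix in a controller.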

Trajectory Prediction
The key to the variable admittance method is to accurately predict the trajectory of the robot end tool. In this paper, a trajectory predictor is established using a neural network, and trajectory data from simulated operations, collected offline, are used as the training set.

LSTM Predictor
The recurrent neural network (RNN) can process time series data. However, when the time span between the cell that holds the relevant information and the cell that uses it is large, the RNN has difficulty connecting the relevant information because of the vanishing and exploding gradient problems. LSTM improves on the original RNN by introducing a three-gate architecture (input gate, forget gate and output gate). Since the arm motion data form a time series, LSTM is well suited to predicting the motion trajectory.
Initially, we tried a network structure with multiple middle layers, but the training time was too long and the prediction performance was poor. After comparing experimental results, a three-layer network structure was adopted. The input layer and the output layer are fully connected layers, each containing three nodes that represent the x, y and z coordinate values, respectively. The middle layer is composed of basic LSTM cells, and the hyperbolic tangent (tanh) is used as the activation function.
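To make the three-layer structure concrete, the following NumPy sketch runs a forward pass through a fully connected input layer (three nodes), a single layer of basic LSTM cells, and a fully connected output layer (three nodes). It is an illustration under assumptions, not the trained predictor: the class name `TinyLSTMPredictor` is hypothetical, the weights are random and untrained, the hidden size of 32 is taken from the training section, and the gate layout follows the standard LSTM formulation.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class TinyLSTMPredictor:
    """Untrained forward-pass sketch: FC(3) -> LSTM(hidden) -> FC(3)."""

    def __init__(self, hidden=32, dim=3, seed=0):
        rng = np.random.default_rng(seed)
        scale = 0.1
        self.hidden = hidden
        # Fully connected input layer with three nodes (x, y, z).
        self.W_in = rng.normal(0.0, scale, (dim, dim))
        self.b_in = np.zeros(dim)
        # One weight matrix holding the input, forget, candidate and output gates.
        self.W = rng.normal(0.0, scale, (4 * hidden, dim + hidden))
        self.b = np.zeros(4 * hidden)
        # Fully connected output layer with three nodes (x, y, z).
        self.W_out = rng.normal(0.0, scale, (dim, hidden))
        self.b_out = np.zeros(dim)

    def forward(self, sequence):
        """Map a (T, 3) sequence of positions to one predicted 3D point."""
        h = np.zeros(self.hidden)  # hidden state
        c = np.zeros(self.hidden)  # cell state
        for x in sequence:
            u = self.W_in @ x + self.b_in
            z = self.W @ np.concatenate([u, h]) + self.b
            i, f, g, o = np.split(z, 4)
            c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)  # forget old, add new
            h = sigmoid(o) * np.tanh(c)                   # gated output (tanh)
        return self.W_out @ h + self.b_out


model = TinyLSTMPredictor()
sequence = np.linspace(0.0, 1.0, 30).reshape(10, 3)  # ten synthetic 3D points
prediction = model.forward(sequence)
```

In practice the weights would come from training in a deep learning framework; the sketch only shows how the three gates and the tanh activation combine in a single cell update.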

Data Preparation
To obtain the data for training the LSTM predictor, two testers operated the robot to simulate the grinding task of the prosthesis implantation plane, with the end tool of the robot moving in the plane along the designated path. During this process, data such as the trajectory of the robot end tool and the operating force (human-robot interaction force) were collected. During data collection, the robot's admittance value was set to be constant, and this value is also the initial admittance value of the TPVAC method. To compare the influence of different admittance values, the admittance values were divided into two groups. With two testers and two admittance groups, there are four sets of original data, as shown in Table 1. In each set of data, 10,000 points are selected as the training set and 1500 points are used as the test set, and all data are normalized. Considering the time limit of the control cycle (30 ms), the data sampling interval is set to 5 ms.
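The split and normalization step can be sketched as below. The paper does not state which normalization was used, so min-max scaling is an assumption here, as are the function names and the synthetic trajectory; the split sizes (10,000 training points and 1500 test points) and the 5 ms sampling interval are taken from the text.

```python
import numpy as np

SAMPLE_INTERVAL_S = 0.005  # 5 ms sampling interval


def split_points(points, n_train=10_000, n_test=1_500):
    """Take the first n_train points for training and the next n_test for testing."""
    if len(points) < n_train + n_test:
        raise ValueError("not enough points for the requested split")
    return points[:n_train], points[n_train:n_train + n_test]


def fit_min_max(points):
    """Per-axis offset and span for min-max scaling (assumed normalization)."""
    lo = points.min(axis=0)
    span = points.max(axis=0) - lo
    span = np.where(span > 0, span, 1.0)  # avoid division by zero on flat axes
    return lo, span


def apply_min_max(points, lo, span):
    return (points - lo) / span


# Synthetic planar trajectory standing in for one recorded data set.
t = np.arange(11_500) * SAMPLE_INTERVAL_S
raw = np.column_stack([np.sin(t), np.cos(t), np.zeros_like(t)])

train, test = split_points(raw)
lo, span = fit_min_max(train)
train_norm = apply_min_max(train, lo, span)
test_norm = apply_min_max(test, lo, span)
```

Fitting the scaling on the training set and reusing it for the test set is a design choice of this sketch; it keeps test data from influencing the normalization constants.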

Training and Evaluation
According to the characteristics of the RNN, the data are truncated into 10-point fragments, which are then grouped into batches of 100. The initial learning rate is 0.1, the learning rate is reduced to 98% of its previous value after each training round, and the total number of training steps is 20,000. To determine the best hidden layer size, we started training with a size of 512 and gradually reduced it to 8. By observing the trend of the absolute error loss function, the best hidden layer size was determined to be 32. The four training sets were used to train four LSTM predictors, and predictions were made on the corresponding test sets. The results are shown in Figure 3. The test trajectories and the predicted trajectories closely coincide, which indicates that the performance of the predictors is good and stable. To evaluate the performance of the predictors quantitatively, the prediction error is defined as the distance between the predicted point and the actual point. The probability density of each group of prediction errors, together with the location parameter µ and scale parameter σ of the fitted normal distribution, is shown in Figure 4. The experiment results show that the average prediction error of each group is less than 1 mm. Taking Predictor 1 as an example, the maximum error within the 99.7% confidence probability is 1.9 mm. The prediction error of the predictors is close to the positioning accuracy of the robot, which meets the requirements of the subsequent experiments.
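The fragmenting, batching and error evaluation described above might look as follows in NumPy. This is a sketch under assumptions: the fragments are built with an overlapping sliding window whose target is the next point (the paper does not specify overlap or target), the 99.7% bound is computed as µ + 3σ of a normal fit, and all function names and data are hypothetical.

```python
import numpy as np


def make_fragments(points, length=10):
    """Sliding 10-point input fragments, each paired with the next point as target."""
    n = len(points) - length
    X = np.stack([points[i:i + length] for i in range(n)])
    y = points[length:]
    return X, y


def iterate_batches(X, y, batch_size=100):
    """Yield consecutive batches of at most batch_size fragments."""
    for start in range(0, len(X), batch_size):
        yield X[start:start + batch_size], y[start:start + batch_size]


def prediction_error_stats(predicted, actual):
    """Euclidean prediction error with mean, std and the mu + 3*sigma bound."""
    err = np.linalg.norm(predicted - actual, axis=1)
    mu, sigma = err.mean(), err.std()
    return mu, sigma, mu + 3.0 * sigma


# Synthetic data, used only to exercise the helpers.
points = np.arange(1010 * 3, dtype=float).reshape(1010, 3)
X, y = make_fragments(points)
batches = list(iterate_batches(X, y))

actual = np.zeros((200, 3))
predicted = actual + np.array([0.3, 0.4, 0.0])  # constant 0.5 offset
mu, sigma, bound = prediction_error_stats(predicted, actual)
```

For normally distributed errors, about 99.7% of samples fall within three standard deviations of the mean, which is why µ + 3σ is used here as the upper bound.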

Experimental Setup and Procedure
The experimental setup is shown in Figure 5. The robot is a self-developed cable-driven six-joint serial manipulator with a positioning accuracy of 1.5 mm. In physical human-robot interaction scenarios, the flexibility of operation is an important factor. Compared with a traditional robot, our robot uses a cable-pulley transmission instead of gear reducers, which significantly reduces the inertia of the manipulator arm and thus improves flexibility. A six-DOF force sensor (SRI-M3552B, Sunrise Instruments, Shanghai, China) is installed at the end of the robot to measure the interaction force between the operator and the robot, with a range of 150 N in the X and Y directions and 250 N in the Z direction. The nonlinearity and hysteresis are less than 1.5% F.S., and the refresh rate in the experiment is 200 Hz. An indicator pen is installed on the robot's sixth joint (end joint) to represent the grinding tool and to help the testers move the robot along the task path. The simulation experiment is to operate the robot to grind the prosthesis implantation plane, and the task path is an N-shaped path covering the plane. To simplify the analysis of the experiment data, the grinding plane is limited to the XY plane. The experiment is divided into four groups, which use the four trajectory predictors from the above section, respectively. The tester holds the handle and drags the robot along the designated task path. The moving speed and acceleration of the robot end tool on the path are adjusted by the tester according to his own habits. The overall time for each group of experiments is roughly the same.

Results and Discussion
The robot end tool trajectories and interaction forces of the four groups are shown in Figure 6. CAC-i denotes the results of the human-robot interaction experiment under constant admittance and corresponds to the training set Train-i, and TPVAC-i denotes the experiment results obtained with Predictor i. CAC-1, CAC-2, TPVAC-1 and TPVAC-2 were completed by Tester 1, and the rest of the experiments were completed by Tester 2. From a qualitative analysis of the data in Figure 6, the following conclusions can be drawn: (a) The direction of the interaction force somewhat leads the direction of motion of the robot end tool, especially at right-angle turns. (b) Compared with constant admittance control (CAC), the amplitude of the human-robot interaction force under TPVAC is smaller, but the direction of the interaction force changes more frequently, indicating that the tester can perceive the change of the parameters and adjust in time. (c) The habits of the two testers are slightly different: Tester 1 applies a larger interaction force than Tester 2, and his steering is smoother. These observations suggest that the proposed variable admittance control method can effectively help the operator move the robot end tool along the task path.
In this paper, the error between the actual trajectory and the task path is used to demonstrate the effectiveness of the method. The robot end tool trajectories and the task path for the four groups of comparative experiments are shown in Figure 7a,b, and the trajectory errors are shown in Figure 7c. Compared with CAC, the deviation between the actual trajectory and the task path is much smaller with TPVAC, indicating that the TPVAC method can improve the robot's operational performance. With TPVAC, the trajectory errors decreased by 51~67%, as shown in Table 2. Two indicators were adopted to quantitatively evaluate the compliance improvement of the TPVAC method. One is the moving average of the square of the jerk (MAS-jerk), which evaluates the compliance of the operation. The other is the moving average of the interaction force (MA-force), which evaluates the magnitude of the interaction force. The calculation of the two indicators is given in Equations (8) and (9). Moving averages are used because the task path is an N-shaped line, along which the tester constantly accelerates and decelerates.
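A minimal way to compute the two indicators from sampled data is sketched below. It is not a reproduction of Equations (8) and (9): the jerk is estimated here by a third-order finite difference of the position samples, the force magnitude is taken as the Euclidean norm, and the window length, sampling interval and function names are assumptions made for illustration.

```python
import numpy as np


def moving_average(values, window):
    """Simple moving average over `window` consecutive samples."""
    kernel = np.ones(window) / window
    return np.convolve(values, kernel, mode="valid")


def mas_jerk(positions, dt, window):
    """Moving average of the squared jerk magnitude (finite-difference estimate)."""
    jerk = np.diff(positions, n=3, axis=0) / dt ** 3
    jerk_squared = np.sum(jerk ** 2, axis=1)
    return moving_average(jerk_squared, window)


def ma_force(forces, window):
    """Moving average of the interaction force magnitude."""
    magnitude = np.linalg.norm(forces, axis=1)
    return moving_average(magnitude, window)


# Synthetic check: x(t) = t^3 has a constant jerk of 6, so the squared jerk is 36.
dt = 0.005
t = np.arange(50) * dt
positions = np.column_stack([t ** 3, np.zeros_like(t), np.zeros_like(t)])
forces = np.tile([3.0, 4.0, 0.0], (50, 1))  # constant 5 N force magnitude

jerk_curve = mas_jerk(positions, dt, window=5)
force_curve = ma_force(forces, window=5)
```

On measured data, differentiating three times amplifies sensor noise, so some low-pass filtering before the finite difference would likely be needed.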
The MAS-jerk and MA-force curves are shown in Figure 8. For each group of comparative experiments, the MAS-jerk and MA-force under TPVAC are lower than those under CAC. The average values and reduction rates of each group of curves are shown in Table 2. With TPVAC, the MAS-jerk of each group of experiments is reduced by 21~59%, and the MA-force is reduced by 23~45%. The experiment results support the effectiveness of TPVAC in improving operation accuracy and reducing interaction force, which indicates that our method helps to improve the tester's operating performance in pHRI.
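The trajectory error and the reduction rates reported here can be illustrated with a short sketch. The definitions are assumptions made for this example: the trajectory error is taken as the shortest distance from each sampled point to a piecewise-linear task path, the reduction rate is the relative decrease with respect to the CAC baseline, and the waypoints and numbers are hypothetical.

```python
import numpy as np


def point_to_segment(p, a, b):
    """Shortest distance from point p to the segment from a to b."""
    ab = b - a
    denom = ab @ ab
    s = 0.0 if denom == 0 else np.clip((p - a) @ ab / denom, 0.0, 1.0)
    return np.linalg.norm(p - (a + s * ab))


def path_errors(trajectory, waypoints):
    """Distance from each trajectory point to the nearest path segment."""
    segments = list(zip(waypoints[:-1], waypoints[1:]))
    return np.array([
        min(point_to_segment(p, a, b) for a, b in segments)
        for p in trajectory
    ])


def reduction_rate(baseline, improved):
    """Relative reduction in percent with respect to the baseline value."""
    return 100.0 * (baseline - improved) / baseline


# Hypothetical N-shaped task path in the XY plane (unit square, arbitrary units).
waypoints = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
trajectory = np.array([[-0.5, 0.5], [0.5, 0.5]])

errors = path_errors(trajectory, waypoints)
rate = reduction_rate(baseline=2.0, improved=0.98)
```

For example, a mean error that drops from 2.0 to 0.98 in the same units corresponds to a reduction rate of 51%.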

Conclusions
To improve the physical human-robot interaction (pHRI) performance in robot-assisted surgery, a trajectory prediction variable admittance control method is proposed in this paper. A three-layer LSTM neural network is designed to predict human hand movements, and the virtual damping matrix of the robot is then changed to realize variable admittance control. Experiment results show that the average trajectory prediction error is less than 1 mm and that, compared with constant admittance control, the trajectory error is reduced by at least 51%, the operating force is reduced by at least 23%, and the operating jerk is reduced by at least 21%, which indicates that the proposed method improves the operation performance in both accuracy and compliance. The proposed method is intended for human-robot cooperation in surface grinding for joint replacement surgery, and its potential application can be extended to human-robot operations with predefined task paths. In our future work, we will consider the influence of the contact force between the robot end tool and the diseased bone in actual surgery, and take human-robot-environment interaction into account in the controller design. We will also study how to integrate online neural network training into the variable admittance controller, so that the controller can better adapt to different operators.