Neural Adaptive Impedance Control for Force Tracking in Uncertain Environment

Abstract: Torque-based impedance control, a classical active compliant control method, is widely used in human–robot interaction, medical rehabilitation, and other fields. Adaptive impedance control effectively tracks the force when the robot comes in contact with an unknown environment. Conventional adaptive impedance control (AIC) introduces the force tracking error of the previous sampling instant to adjust the controller parameters online, which is an indirect method. In this paper, joint friction in the robot system is first identified and compensated for so that torque-based impedance control can perform well. Second, neural networks are inserted into the torque-based impedance controller, and a neural adaptive impedance control (NAIC) scheme whose parameters are optimized directly online is proposed. In addition, NAIC can be deployed directly without the need for data collection and training. Simulation studies and real-world experiments with a six-link rotary robot manipulator demonstrate the performance of NAIC.


Introduction
In the last few years, compliant control has been widely used in polishing, assembly, dual-arm coordination, human-robot interaction, and other fields. Classical compliant control methods include hybrid or parallel force/position control, impedance control, and admittance control. Compared with passive compliance achieved through mechanical structures, active compliant control realized in software is applied in a wider range of robotics [1]. Impedance control, as a typical active compliant control method, has been evolving since its first proposal by Hogan in 1985 [2]. In contrast to position-based admittance control, impedance control is a torque-based control method. However, robotic systems usually contain unknown dynamics such as friction, modeling uncertainties, and external disturbances, which can significantly limit the precision of impedance control.
Friction feedforward compensation plays a vital role in robot dynamics control, for example in collision detection [3,4] and sensorless kinesthetic teaching [5]. The Stribeck effect [6] is a nonlinear friction phenomenon that appears at low velocities. Over the past decades, friction models have evolved from Coulomb/viscous friction to the LuGre model [7,8] and the smooth tanh function [9,10]. Both of the latter two models can capture the Stribeck effect. The LuGre model is piecewise continuous and therefore non-differentiable, whereas the tanh-based friction model is smooth and differentiable with respect to time. Nevertheless, either model can depict real joint friction. Alongside mathematical friction models, joint torque sensors have been designed and used to sense the friction in the LWR iiwa [11]. In this paper, measured results reveal that the friction torque is not entirely symmetrical with respect to the positive and negative rotation directions of the joint motor. Thus, the piecewise continuous LuGre friction model is adopted.
By establishing virtual mass-spring-damper contact dynamics, classical impedance control can implement force control when the environment is known. In order to track the desired force when the robot is in contact with an unknown environment, adaptive impedance control [12,13] was developed. In [14], the contact force model of the robot end-effector (EE) and the environment was analyzed, and an adaptive impedance control law was proposed. In [15], an adaptive target impedance control scheme for a dual collaborative robotic arm was proposed, such that the controller exhibits different compliant behaviors depending on the magnitude of the interaction with the environment. In [16], a discrete robust position control law based on a Cartesian impedance control law was discussed, and the control input of the robot manipulator was explicitly converted to joint torque. In [17], a cascaded position-torque control loop for impedance control was designed, and the goal of the adaptive control law was to achieve adaptive assistance based on human force. These methods often achieve the adaptive effect by introducing additional terms in the impedance controller but do not update the impedance parameters online.
In addition to the aforementioned adaptive control schemes, some scholars have also used neural networks to implement variable impedance control, helped by the dramatic increase in computing power [18]. Neural network techniques can be applied to both torque-based impedance control and position-based admittance control [19]. In fact, the optimization of controller parameters with neural networks has been widely used.
In [20], a radial basis function (RBF) network is utilized in both the position loop and the force loop to deal with the uncertainties of the manipulator in a grinding task. In [21], PID controller parameters were optimized online with a neural network. Besides neural networks, reinforcement learning has also been studied for designing variable impedance control [22]. However, most neural network or reinforcement learning methods require large-scale data acquisition, which is expensive in human and material resources.
In this paper, we propose a neural adaptive impedance control scheme to track the desired force when the robot is in contact with an unknown environment. By embedding the neural network into the traditional impedance control, the performance of the controller is improved. Compared to some recent results, the contributions of this study are summarized as follows:
(1) An adaptive impedance control method based on the neural network is proposed. This method is simple and easy to implement. Although some modern control approaches have also been proposed, their design procedures are more difficult or complex.
(2) The impedance parameters of the controller are directly adjusted online, which improves the performance of the controller.
(3) The proposed method can be deployed without any data collection or training process. In addition, its simple structure does not require a large amount of computing resources.
The remainder of this article is organized as follows. The dynamic model of the robot and friction is introduced as preliminaries in Section 2. The neural adaptive impedance control method and stability analysis are given in Section 3. Several simulation scenarios are shown in Section 4, and the real-world experiment is presented in Section 5. Finally, conclusions are presented in Section 6.

Preliminaries
The dynamic model of a six-degree-of-freedom robot can be described as
$$M(q)\ddot{q} + C(q,\dot{q})\dot{q} + G(q) + \tau_f(q,\dot{q}) = \tau + \tau_{ext} \qquad (1)$$
where $q$, $\dot{q}$, and $\ddot{q} \in \mathbb{R}^{6}$ denote the joint position, velocity, and acceleration, respectively; $M(q) \in \mathbb{R}^{6\times 6}$ is the positive definite inertia matrix; $C(q,\dot{q}) \in \mathbb{R}^{6\times 6}$ is the Coriolis matrix; $G(q) \in \mathbb{R}^{6}$ represents gravity; $\tau_f(q,\dot{q}) \in \mathbb{R}^{6}$ is a column vector resulting from joint friction; $\tau \in \mathbb{R}^{6}$ denotes the joint driving torque; and $\tau_{ext} \in \mathbb{R}^{6}$ is the external torque resulting from an external wrench $F_{ext}$,
$$\tau_{ext} = J^{T}(q)\,F_{ext} \qquad (2)$$
where $J(q) \in \mathbb{R}^{6\times 6}$ represents the Jacobian matrix and the symbol $T$ means transpose. Equation (1) describes the dynamic model of the robot in joint space, and it can be transformed to Cartesian space with the Jacobian matrix, which gives (3), where $D = J^{-T} M J^{-1}$ and $h = J^{-T}\bigl(C(q,\dot{q})\dot{q} + G(q)\bigr) - D\dot{J}J^{-1}\dot{X}$; $X \in \mathbb{R}^{6}$ denotes the pose of the end-effector (EE). Dynamic parameters of the robot can be determined from carefully designed identification experiments [23].
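As a hedged sketch of the intermediate step, the Cartesian-space model can be recovered from the joint-space model (1) using $\dot{X} = J\dot{q}$ and the usual wrench mapping $\tau_{ext} = J^{T}F_{ext}$. The grouping of the friction and driving-torque terms in the last line is an assumption, chosen only so that it agrees with the definitions of $D$ and $h$ given above; it holds wherever $J$ is nonsingular.

```latex
\begin{aligned}
\dot{X} &= J\dot{q}, \qquad \ddot{X} = J\ddot{q} + \dot{J}\dot{q}
  \;\Longrightarrow\; \ddot{q} = J^{-1}\bigl(\ddot{X} - \dot{J}J^{-1}\dot{X}\bigr),\\
J^{-T}\bigl[M\ddot{q} + C\dot{q} + G + \tau_f\bigr] &= J^{-T}\tau + J^{-T}\tau_{ext},\\
\underbrace{J^{-T}MJ^{-1}}_{D}\,\ddot{X}
  + \underbrace{J^{-T}\bigl(C\dot{q} + G\bigr) - D\dot{J}J^{-1}\dot{X}}_{h}
  + J^{-T}\tau_f &= J^{-T}\tau + F_{ext}.
\end{aligned}
```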
The compensation of joint friction plays a pivotal role in robot dynamics control. In general, the $i$th joint friction torque $\tau_{fi}$ can be modeled as
$$\tau_{fi} = f_{ci}\,\mathrm{sgn}(\dot{q}_i) + f_{vi}\,\dot{q}_i \qquad (4)$$
where $\mathrm{sgn}(\cdot)$ denotes the sign function, $f_{ci}$ represents the Coulomb friction coefficient, and $f_{vi}$ is the viscous friction coefficient. However, this model does not reflect the actual joint friction at low speeds very well. To solve this problem, the Stribeck effect is introduced through the LuGre friction model (5), where $f_{ai}$, $f_{bi}$, $\dot{q}_{si}$, and $f_{vi}$ are parameters to be identified.
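The two friction models can be compared with a short sketch. The Coulomb plus viscous function follows the description above. The Stribeck-type expression is an assumption, since the low-speed model (5) is not reproduced in this text: a common exponential form is used here, with separate parameter sets for the two rotation directions. All parameter values and the names `POS`, `NEG`, and `qd_s` are hypothetical.

```python
import math

def coulomb_viscous(qd, f_c, f_v):
    """Coulomb plus viscous friction torque [Nm] for joint velocity qd [rad/s]."""
    sign = (qd > 0) - (qd < 0)  # sgn(qd), with sgn(0) = 0
    return f_c * sign + f_v * qd

def stribeck_friction(qd, pos, neg):
    """Stribeck-type friction with direction-dependent parameters.

    ASSUMED form (not taken from the paper):
        tau = (f_a + f_b * exp(-(qd / qd_s)**2)) * sgn(qd) + f_v * qd
    """
    if qd == 0.0:
        return 0.0
    p = pos if qd > 0 else neg            # pick the parameter set by direction
    sign = 1.0 if qd > 0 else -1.0
    stribeck = p["f_b"] * math.exp(-(qd / p["qd_s"]) ** 2)
    return (p["f_a"] + stribeck) * sign + p["f_v"] * qd

# Hypothetical parameter values, for illustration only.
POS = {"f_a": 1.0, "f_b": 0.4, "qd_s": 0.05, "f_v": 2.0}
NEG = {"f_a": 1.2, "f_b": 0.3, "qd_s": 0.05, "f_v": 2.2}
```

At high speed the exponential term vanishes and the assumed model reduces to Coulomb plus viscous friction; near zero velocity it predicts a larger torque, which is the low-speed behavior the simpler model misses.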

Neural Adaptive Impedance Control
This section proposes a neural adaptive impedance control law based on backpropagation and analyzes its stability.

Traditional Adaptive Impedance Control Law
The goal of impedance control is to give the robot a spring-like behavior when it interacts with the environment. To track the external force in Cartesian space, the impedance control law can be designed as (6), where $X_d \in \mathbb{R}^{6}$ represents the desired trajectory. Substituting (6) into (3), the target impedance relationship (7) can be obtained,
where $E = X_d - X$ represents the position error between the desired robot position $X_d$ and the measured EE position $X$.
The new impedance function (8) is obtained by subtracting the desired force $F_d$. The interaction between the robot and the environment can be regarded as a two-phase control process: the first phase is free-space control, when the robot is approaching the environment, and the second phase is contact-space control, in which the EE is in contact with the environment [16]. In the first phase, the contact force $F_{ext} = 0$. In the second phase, $K_d = 0$ is set to satisfy the controller's stability. Adaptive impedance control (AIC) is often used to handle the uncertainty of robot interaction with unknown environments. The traditional AIC law [14] can be designed as (9), where $\lambda$ is the sampling period of the controller and $\sigma$ is the update rate.
Remark 1. Without loss of generality, a lowercase letter in (9) and hereinafter denotes one element of the corresponding uppercase matrix or vector. For example, $m_d$ is an element on the main diagonal of the matrix $M_d \in \mathbb{R}^{6\times 6}$, and $f_d$ is an element of the vector $F_d \in \mathbb{R}^{6\times 1}$.
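The integral character of a traditional AIC law can be illustrated for a single Cartesian axis. The adaptive law (9) is not reproduced in this text, so the sketch below assumes a form that is common in the adaptive impedance literature: a compensation term updated as Φ(t) = Φ(t − λ) + σ (f_d(t − λ) − f_ext(t − λ)) / b_d, entering a target relation m_d ë + b_d (ė + Φ) = f_ext − f_d with k_d = 0. The sign conventions, the class name, and the default gains are all assumptions.

```python
from dataclasses import dataclass

@dataclass
class AdaptiveImpedance1D:
    """One-axis AIC sketch with fixed m_d, b_d, sigma (assumed form, see text)."""
    m_d: float = 1.0      # desired inertia
    b_d: float = 40.0     # desired damping
    sigma: float = 0.5    # update rate
    phi: float = 0.0      # adaptive compensation term Phi
    prev_err: float = 0.0 # f_d - f_ext at the previous sample

    def step(self, f_d, f_ext, e_dot):
        """Return the commanded error acceleration for the current sample."""
        # Phi accumulates the previous-sample force error (integral action).
        self.phi += self.sigma * self.prev_err / self.b_d
        self.prev_err = f_d - f_ext
        e_f = f_ext - f_d
        # Solve m_d * e_ddot + b_d * (e_dot + phi) = e_f for e_ddot.
        return (e_f - self.b_d * (e_dot + self.phi)) / self.m_d
```

With a persistent force error, `phi` grows by the same increment at every sample while `m_d`, `b_d`, and `sigma` stay fixed, which is the limitation the neural variant is meant to address.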

Neural Adaptive Impedance Control
Combining the three equations in (9) yields (10). Unlike classic impedance control, the adaptive impedance control law (10) explicitly introduces the error between the desired force and the environment force at the previous sampling instant. In addition, the accumulated error between the desired and environment forces up to the previous instant is also included in (10), which can be regarded as an integral action. However, the parameters $m_d$, $b_d$, and $\sigma$ do not change, which limits the adaptive capability of the law. This is an indirect method of optimizing the controller parameters.
Because of their nonlinear fitting ability and adaptive features, neural networks have been widely used in the design of adaptive controllers. Figure 1 shows the proposed neural adaptive impedance controller (NAIC), which is based on backpropagation. Let $e_f(t) = f_{ext}(t) - f_d(t)$ denote the error between the environment force and the desired force. The network is deliberately small. The input layer has four neurons corresponding to $\ddot{e}$, $\dot{e}$, $\Phi(t-\lambda)$, and $e_f(t-\lambda)$. The hidden layer has three neurons, and the output layer has one neuron. This gives exactly three weights from the hidden layer to the output layer, which matches the number of impedance parameters. The output of the NAIC is $e_f(t)$, and the weights of the output layer are the parameters of the NAIC. Let $f_i$ denote the output of the $i$th neuron in the hidden layer. The relationship from the input layer to the hidden layer is described by (11).

Let $w_i \in \{m_d, b_d, \sigma\}$ $(i = 1, 2, 3)$ represent the weight from the $i$th hidden neuron to the output layer, so that the impedance parameters are represented by the weights of the neural network. The mapping from the hidden layer to the output layer is given in (12). The loss function is then selected, and its derivative with respect to the output weights is obtained with the chain rule.
Finally, the update law of the output weights is given by (15), where $\gamma$ is the learning rate. Obviously, when $\gamma = 0$, NAIC reduces to a classical adaptive impedance controller.
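The forward pass and the weight update can be sketched in a few lines. Several details are assumptions because the corresponding equations are not reproduced in this text: the hidden-layer mapping [ë, ė + Φ(t − λ), −e_f(t − λ)], a linear output e_f = Σ w_i f_i, the loss ½ e_f², and a plain gradient-descent step w_i ← w_i − γ e_f f_i. Function names are hypothetical.

```python
def hidden_layer(e_ddot, e_dot, phi_prev, e_f_prev):
    """ASSUMED input-to-hidden mapping: four inputs combined into three features."""
    return [e_ddot, e_dot + phi_prev, -e_f_prev]

def output_layer(weights, hidden):
    """Linear output: weights are [m_d, b_d, sigma], one per hidden neuron."""
    return sum(w * f for w, f in zip(weights, hidden))

def update_weights(weights, hidden, e_f, gamma):
    """One gradient-descent step on the loss 0.5 * e_f**2 (assumed loss).

    With d(e_f)/d(w_i) = f_i, the gradient of the loss is e_f * f_i.
    """
    return [w - gamma * e_f * f for w, f in zip(weights, hidden)]

# Example with hypothetical numbers: initial [m_d, b_d, sigma] and one sample.
weights = [1.0, 40.0, 0.5]
hidden = hidden_layer(e_ddot=0.0, e_dot=0.1, phi_prev=0.0, e_f_prev=2.0)
new_weights = update_weights(weights, hidden, e_f=1.5, gamma=0.01)
```

Setting `gamma=0` leaves the weights unchanged, which matches the remark that the scheme then falls back to a fixed-parameter adaptive impedance controller.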
Proof of Theorem 1. The Lyapunov stability theorem is used to establish the stability of the system with the NAIC. A Lyapunov function is first defined, and its change over one sampling period is given by (17). According to the NAIC structure shown in Figure 1, (18) can be obtained by a Taylor expansion.
From (12), one has (19). Based on the backpropagation update law (15), we have (20). Substituting (18)-(20) into (17) yields (21). According to the Lyapunov stability theorem, the stability of the system with NAIC is guaranteed only if $\Delta V(t) \le 0$ at every sampling time $t$. In (21), $\gamma \ge 0$, $e_f^2(t) \ge 0$, and $f_i^2(t) \ge 0$. Therefore, a sufficient condition for $\Delta V(t) \le 0$ is that the learning rate $\gamma$ satisfies the bound stated in Theorem 1.
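The intermediate equations of the proof are not reproduced in this text. As a hedged illustration of how a learning-rate bound of this kind typically arises, assume the Lyapunov function $V = \tfrac{1}{2}e_f^2$, a first-order Taylor expansion of $e_f$ in the output weights, $\partial e_f / \partial w_i = f_i$, and the gradient step $\Delta w_i = -\gamma\, e_f f_i$. The bound derived below follows from these assumptions and is not claimed to be the exact statement of Theorem 1.

```latex
\begin{aligned}
\Delta V(t) &= \tfrac{1}{2}\bigl[e_f^2(t+\lambda) - e_f^2(t)\bigr]
             = e_f(t)\,\Delta e_f(t) + \tfrac{1}{2}\,\Delta e_f^2(t),\\
\Delta e_f(t) &\approx \sum_{i=1}^{3}\frac{\partial e_f}{\partial w_i}\,\Delta w_i
             = \sum_{i=1}^{3} f_i(t)\bigl(-\gamma\,e_f(t)\,f_i(t)\bigr)
             = -\gamma\,e_f(t)\,S(t), \qquad S(t) = \sum_{i=1}^{3} f_i^2(t),\\
\Delta V(t) &= -\gamma\,e_f^2(t)\,S(t)\Bigl(1 - \tfrac{\gamma}{2}\,S(t)\Bigr),\\
\Delta V(t) &\le 0 \iff 0 \le \gamma \le \frac{2}{S(t)}
             \quad \text{for } e_f(t) \ne 0,\; S(t) > 0 .
\end{aligned}
```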

Simulation Studies
In this section, the proposed control algorithm is tested by simulating the force tracking performance of the six-degree-of-freedom collaborative robot manipulator JK5. The simulation environment is based on the open-source physics engine MuJoCo. The simulation can be regarded as a two-phase motion: the first phase is the free-space phase, when the robot moves towards the environment, and the second phase is the contact phase, in which the EE is in contact with the environment.

Flat Surface Tracking
In this simulation, the environment is a flat plane, as shown in Figure 2. When the robot is in contact with the environment, the control objective is force tracking in the Z-direction. The height of the flat surface is set to $x_{env} = 0.08$ m, so $\dot{x}_{env} = \ddot{x}_{env} = 0$. In the free-space phase, $f_{ext} = 0$, $M_d = I \in \mathbb{R}^{6\times 6}$, $B_d = \mathrm{diag}[500, 500, 40, 500, 500, 500]$, and $K_d = \mathrm{diag}[500, 500, 50, 1000, 1000, 1000]$. In the contact phase, the initial desired contact force is set to $f_d = 10$ N, and after 3000 frames the desired contact force is $f_d = 20$ N. In this phase, the initial impedance control parameters are selected as $M_d = I$, $B_d = \mathrm{diag}[500, 500, 40, 500, 500, 500]$, and $K_d = \mathrm{diag}[100, 100, 0, 1000, 1000, 1000]$; that is, the Z-directional stiffness parameter $k_d$ is set to 0 while the robot is in contact with the environment.
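The two parameter sets and the desired-force schedule quoted above can be written down as a small configuration sketch. The numeric values are those listed in the text; the contact test based on a force threshold, the threshold value, and the function names are hypothetical, and the switch at frame 3000 is one reading of "after 3000 frames".

```python
# Diagonal entries of M_d, B_d, K_d for the two phases (values from the text).
FREE_SPACE = {
    "M_d": [1.0] * 6,
    "B_d": [500.0, 500.0, 40.0, 500.0, 500.0, 500.0],
    "K_d": [500.0, 500.0, 50.0, 1000.0, 1000.0, 1000.0],
}
CONTACT = {
    "M_d": [1.0] * 6,
    "B_d": [500.0, 500.0, 40.0, 500.0, 500.0, 500.0],
    "K_d": [100.0, 100.0, 0.0, 1000.0, 1000.0, 1000.0],  # k_d = 0 along Z
}

def impedance_params(f_ext_z, contact_threshold=0.5):
    """Select the parameter set; the force-threshold contact test is hypothetical."""
    return CONTACT if abs(f_ext_z) > contact_threshold else FREE_SPACE

def desired_force(frame):
    """Desired Z-direction contact force [N]: 10 N, then 20 N from frame 3000."""
    return 10.0 if frame < 3000 else 20.0
```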
In Figure 2, the robot comes into contact with the flat surface at t = 0.7 s, and the external force then tracks the desired force $f_d = 10$ N. When t = 3 s, the desired force suddenly changes to 20 N. Simulation results show that both NAIC and AIC can track the desired force, but NAIC converges faster and with less oscillation than AIC.
Consider a case of variable environmental impedance: assume that the stiffness of the plane is variable. The surface stiffness $k_e$ (N/m) is changed abruptly according to (22), and the simulation results are shown in Figure 3.
Figure 3 illustrates that, with unknown environment stiffness, NAIC still outperforms AIC when the environment stiffness changes abruptly. When t = 2 s, the surface stiffness suddenly changes from 2000 N/m to 5000 N/m, and the NAIC quickly converges to the expected force, whereas the AIC oscillates substantially for some time.
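A short sketch shows why the stiffness step disturbs the force loop. It assumes a linear spring contact model, f_ext = k_e · max(0, x_env − x), which is a common simplification and not necessarily how the physics engine computes contact. Only the single step from 2000 N/m to 5000 N/m at t = 2 s described above is modeled; the full schedule (22) may contain more segments. Function names are hypothetical.

```python
def surface_stiffness(t):
    """Surface stiffness k_e [N/m]: 2000 before t = 2 s, 5000 afterwards."""
    return 2000.0 if t < 2.0 else 5000.0

def contact_force(x, t, x_env=0.08):
    """ASSUMED linear spring contact: force grows with penetration below x_env."""
    penetration = max(0.0, x_env - x)
    return surface_stiffness(t) * penetration

def penetration_for_force(f_d, t):
    """Penetration depth [m] needed to hold the force f_d at time t."""
    return f_d / surface_stiffness(t)
```

Under this assumption, holding 10 N requires a penetration of 5 mm before the step and only 2 mm after it, so an EE that stays at the old depth would momentarily produce 25 N, and the controller has to retract to recover the desired force.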

Slope Surface Tracking
Another simulation was carried out for a slope-shaped environment, as shown in Figure 4a. In contrast to the flat-surface simulation, the environment height changes along the slope, so $\dot{x}_{env} \neq 0$ m/s while $\ddot{x}_{env} = 0$ m/s$^2$ when the robot is in contact with the environment. The desired force is set to 10 N, and the initial damping parameter in the Z-direction is 40 N·s/m for both NAIC and AIC. Simulation results in Figure 4b-d show that the robot comes into contact with the environment at Z = 0.124 m at t = 0.621 s. For both NAIC and AIC, the contact force successfully tracks the desired force. However, in the contact phase, the damping parameter is adaptively adjusted under NAIC, which allows the contact force to reach the desired force more quickly and smoothly than under AIC.

Experimental Studies
This section uses a real-world collaborative robot, JK5, with six rotational joints. The payload of JK5 is 5 kg, and the control frequency is 1 kHz. The rated torque of the first three joints of JK5 is 27.27 Nm, and that of the last three joints is 192.39 Nm.

Friction Compensation
Joint friction compensation must be performed for a real robot arm to achieve good control performance. According to the friction model in (5), the joint friction is a function of the angular velocity $\dot{q}$. Equation (1) shows that the gravity torque is related to the joint position $q$. In the experiment, the robot starts from its zero position. The six joints of the robot are controlled to move sequentially at different speeds within a certain range.
The velocity, joint angle, and corresponding torque of each joint are measured. The joint friction torques can be obtained by subtracting the gravity torque from the measured torque. Taking the sixth joint as an example, the measured velocity and angle are shown in Figure 5. By calculating the mean value of each set of measured velocities and the mean value of the corresponding friction torque, the relationship between them can be plotted, as shown in Figure 6. The friction model (5) treats positive and negative joint velocities uniformly. However, the measured results in Figure 6 show that the friction torque depends on the direction of joint rotation.
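The identification procedure described above can be sketched with the standard library only. Each constant-speed run is reduced to one (mean velocity, mean friction torque) point after gravity is subtracted, noisy velocity samples are median filtered, and the two rotation directions are fitted separately. For brevity the fit uses the simpler Coulomb plus viscous form rather than the full model (5), which would need a nonlinear solver; that substitution, the function names, and the window size are assumptions.

```python
from statistics import mean, median

def median_filter(samples, window=5):
    """Sliding-window median to suppress spikes in the measured velocity."""
    half = window // 2
    return [median(samples[max(0, i - half): i + half + 1])
            for i in range(len(samples))]

def segment_point(velocities, torques, gravity_torques):
    """Reduce one constant-speed run to (mean velocity, mean friction torque)."""
    friction = [t - g for t, g in zip(torques, gravity_torques)]
    return mean(velocities), mean(friction)

def fit_coulomb_viscous(points):
    """Least-squares fit |tau| = f_c + f_v * |qd| over points of one direction."""
    xs = [abs(v) for v, _ in points]
    ys = [abs(t) for _, t in points]
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    f_v = sxy / sxx
    return my - f_v * mx, f_v  # (f_c, f_v)

def fit_both_directions(points):
    """Fit positive and negative rotation separately, since friction is asymmetric."""
    pos = [p for p in points if p[0] > 0]
    neg = [p for p in points if p[0] < 0]
    return {"+": fit_coulomb_viscous(pos), "-": fit_coulomb_viscous(neg)}
```

Fitting the two directions separately is what allows the identified parameters to differ between positive and negative rotation.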
According to the friction model (5) and the measured data in Figure 6, the identified parameters are listed in Table 1. In Table 1, the symbols + and − indicate the movement of the joint in the positive ($\dot{q} > 0$) and negative ($\dot{q} < 0$) directions, respectively. The unit of $f_{ai}$ and $f_{bi}$ is Nm, and the unit of $f_{vi}$ is Nm·s/rad.
Table 1. Identified friction parameters of the six joints in (5).
After the identification of joint friction and feedforward compensation, a real-world experiment with a flat foam surface, shown in Figure 7, was conducted to demonstrate the performance of our method. The ATI Mini45 is a six-axis force/torque sensor that can acquire data at frequencies up to 7 kHz.
In the experiment, the desired position is the height of the flat foam surface, which is 0.3813 m in the robot base frame. The desired force in the contact phase is set to −10 N. The robot arm moves from the start position of 0.538 m and comes into contact with the flat foam at t = 2.826 s. In the contact phase, the robot is controlled by NAIC to track the desired force, and the result shows that NAIC works well in this task.
Because the end of the tool is a rod rather than a ball [14], the robot struggles to move in the x/y direction if the contact force is relatively large.

Force Tracking under an Unknown Environment
In this experiment, the environment consists of an acrylic plate and foam (see Figure 8a). The two materials clearly differ in stiffness. Therefore, this experimental setup can be used to emulate the interaction between the robot and an unknown environment. As shown in Figure 8b, the robot comes into contact with the stiffer acrylic plate at t = 3.327 s, and a sudden impact is produced. The contact force converges to the desired force of −3 N before t = 4.756 s. In Figure 8c, the robot moves to the softer foam plate. Due to the sudden change in environmental stiffness, the contact force also changes suddenly. Subsequently, the contact force tracks the desired force well under the action of NAIC, as shown in Figure 8f.

Conclusions
In this study, a new adaptive impedance control scheme is presented. The neural network is embedded into the classical adaptive impedance controller to optimize the parameters directly. Consistent with previous studies of adaptive impedance control laws, NAIC can track the contact force in an unknown environment. Unlike them, however, our method explicitly adjusts the impedance parameters online. NAIC can be deployed directly without data collection and training. Simulations showed that the parameters of the NAIC were updated online in the contact phase. NAIC tracks the desired force well when it changes suddenly and works better than AIC when the environment stiffness changes abruptly. Owing to friction feedforward compensation, the desired force tracking is also achieved in real-world experiments. In subsequent work, we will consider the possibility of migrating the method to other network architectures.

Figure 1. NAIC structure. The neural network consists of three layers: input layer, hidden layer, and output layer. The input contains $\ddot{e}$, $\dot{e}$, $\Phi(t-\lambda)$, and $e_f(t-\lambda)$. The output is the force error $e_f(t)$. The dark blue dashed line denotes the backpropagation. The parameters of the AIC are updated with the neural network.

Figure 2. Force tracking simulation on a flat surface. (a) Simulation scene. (b) Adaptive effect of the damping parameter in the Z-direction. (c) The legend Desired denotes the environment position, which is 0.08 m. (d) Force tracking result. When t = 3 s, the desired force suddenly changes to $f_d = 20$ N, and the error between the EE position and the surface height increases.

Figure 4. Force tracking with the slope surface. (a) Slope surface scene. (b) Adaptive effect of the Z-direction damping parameter $b_z$. (c) EE position; the legend Desired denotes the environment position. (d) Contact force (force tracking result). When t = 0.621 s, the robot comes into contact with the slope surface at Z = 0.124 m.

Figure 5. Measured data of the sixth joint. For clarity, the figure shows a data acquisition window of 216.023 s, starting at 339.765 s and ending at 555.788 s. (a) The measured velocity (blue line) from the servo driver is noisy, so median filtering (red line) is required. (b) Measured position from the servo driver. The limitation angle is ±5° when $|\dot{q}| < 0.02$ rad/s and ±10° when $|\dot{q}| \ge 0.02$ rad/s.

Figure 6. The relationship between friction torque and joint velocity. The dots represent the measured data, and the solid lines indicate the identified results.

Figure 7. Real-world experiment with the proposed NAIC method: (a) {B} denotes the robot base frame, and {S} is the force sensor frame; {S} can be obtained by rotating {B} by 180° around the y-axis; (a,b) demonstrate the motion of the robot; (c,d) are the measured Z-direction position and contact force, respectively.

Figure 8. Force tracking in an unknown environment with the proposed NAIC method: (a-d) show the motion of the robot; (e) depicts the measured position of the EE in the Z-direction; (f) illustrates the force tracking result. The contact position in the Z-direction is 0.3648 m, and the desired force is −3 N.