Adaptive Kinematic Modelling for Multiobjective Control of a Redundant Surgical Robotic Tool

Accurate kinematic models are essential for effective control of surgical robots. For tendon driven robots, which are common for minimally invasive surgery, the high nonlinearities in the transmission make modelling complex. Machine learning techniques are a preferred approach to tackle this problem. However, surgical environments are rarely structured, due to organs being very soft and deformable, and unpredictable, for instance, because of fluids in the system, wear and break of the tendons that lead to changes of the system’s behaviour. Therefore, the model needs to quickly adapt. In this work, we propose a method to learn the kinematic model of a redundant surgical robot and control it to perform surgical tasks both autonomously and in teleoperation. The approach employs Feedforward Artificial Neural Networks (ANN) for building the kinematic model of the robot offline, and an online adaptive strategy in order to allow the system to conform to the changing environment. To prove the capabilities of the method, a comparison with a simple feedback controller for autonomous tracking is carried out. Simulation results show that the proposed method is capable of achieving very small tracking errors, even when unpredicted changes in the system occur, such as broken joints. The method proved effective also in guaranteeing accurate tracking in teleoperation.


Introduction
Accuracy and precision are of utmost importance in many robotic applications, especially in minimally invasive surgery, where little (or preferably no) damage should be inflicted on the patient's body. The introduction of robots in the operating room has revolutionized the way patients are treated, leading to much higher performance, less trauma, and faster recoveries.
Even today, because of safety issues related to the accuracy and capabilities of surgical robots, and ethical issues due to patients not feeling comfortable being operated by autonomous robots, most of the surgical robotic platform are still teleoperated: the surgeon directly controls the surgical robot through a master device. This means that the precision of the robotic system is still not fully exploited, because of the possible motion errors coming from the surgeon. Moreover, high precision tasks require high cognitive effort by the surgeon, which increases the possibility of surgical errors. Autonomous surgery, instead, allows faster and more precise execution, and reduction of the surgeon's burden [1].
To be able to perform surgeries autonomously, robots need to be safe, precise, and capable of adapting to different situations. However, robots for minimally invasive surgery have usually complex structures, being very articulated or even continuum robots. Therefore, modelling their kinematics may be very challenging. Moreover, due to miniaturization requirements, flexibility, and sterilization, least squares is chosen to update online the parameters of an initial model of a concentric tube robot. These approaches, however, work only if the model can be expressed in a linear form with respect to its parameters. Generally, however, the behaviour of the robotic system is highly nonlinear and more complex models are required.
In this work, we present an approach to effectively model a surgical robotic system and use the learned model to perform a tumor resection task autonomously.
Feedforward ANN are used to first build the forward kinematic model of the robot offline. It has been shown that feedforward networks can be regarded to as universal approximators, meaning that they can model any suitably smooth function, given enough hidden units, to any desired level of accuracy [19,20]. They are thus capable of representing complicated behaviours, without the need for knowing any mathematical or physical model. Feedforward networks consist of different layers of neurons. The first layer is the input layer, the last one is the output layer, and all the others in between are called hidden layers. Each layer has several neurons, with a defined activation function, each one receiving a linear combination of inputs from the neurons of the previous layer and sending an output to the neurons of the following layer. The final output of the layer results to be a nonlinear function of the input values, controlled by the nodes' weights. These weights can be retrieved by minimizing a desired cost function [21].
To deal with possible unpredictable changes in the model and not sufficient generalizabilty of ANN, an online adaptive strategy is also included. By measuring the tip position (for instance through the endoscope or electromagnetic trackers), the adaptive strategy updates the network weights to reduce the error between the network expected output and the real measurement. ANN have been chosen thanks to their ability to model complex functions. Secondly, the parametric nature of ANN allows easily and analytically computing the network derivatives, which are needed for solving the inverse kinematics of the robot and for updating the model. In addition, a global learning method has been preferred over local ones because it allows the model to be more reliable in case data for the adaptation are not available. As a matter of fact, local methods need to be continuously fed with data, and this might be an issue if they are not available (e.g., camera obstructions). Global learning methods, instead, may not need any new data if the learned offline model is accurate enough. The proposed method is validated in a simulation environment based on V-REP simulator [22], which also allows having a ground truth for comparison. Therefore, the paper is structured as follows. Section 2 presents how the kinematics of the robot is computed, along with the strategy to control redundant robots and the proposed adaptive modelling. Section 3 shows the results of the proposed method. The method is compared to a simple feedback controller. In this work two tests were performed in simulation: autonomous path tracking and teleoperation. Finally, discussion and conclusions are drawn in Sections 4 and 5.

Materials and Method
In this section the kinematic modelling of the robot under examination is presented, along with the control strategy and the online model re-adaptation.

Robot Kinematic Modelling
The Micro-IGES [23] (Figure 1) is a surgical robotic tool, composed of a rigid shaft (27 cm) and a flexible section (54 mm at zero configuration). The shaft provides the roll and translation degrees of freedom (DOFs). The articulated end consists of two elbows for pitch and yaw, with each elbow made of a pair of coupled joints, one DOF revolute joint for the wrist pitch, and the jaws. The jaws provide two more DOFs: one for the wrist yaw and one for the gripper's opening/closing. Each joint of the articulated part is driven by an antagonistic pair of tendons, with each pair being connected to the corresponding driving capstan at the proximal drive unit. The coupling of the two pairs of joints of the elbows occurs at the driving unit: the two capstans that drive the two serial joints for each DOF of the elbow (pitch and yaw) are coupled by a series of gears with 1:2 ratio. Because of the current setup, however, the translation DOF cannot be actuated. Moreover, the gripping angle is also considered fixed, therefore only 5 DOFs will be considered in this work (Roll, Elbow 1, Elbow 2, Wrist Pitch, Wrist Yaw).

Elbow1
Elbow2   The nonlinearities in tendon transmission make the mathematical derivation of the system kinematics and dynamics tedious. In addition, tendon-driven systems for minimally invasive surgery usually lack sensors at the distal side, therefore joint values cannot be measured and controlled directly. Because of the generally nonlinear motor to joint (and joint to motor) mapping q = f(θ), being θ ∈ R n m the vector of motor positions and q ∈ R n j the vector of joint positions, the kinematic model of the robot can be rewritten as: P = P q (q) = P(θ) where P ∈ R 3 is the Cartesian end-effector position, P q (q) and P(θ) describe the mapping from the joint values to the Cartesian tip position and from the motor value to the tip position, respectively.
The matrix J q = ∂P q ∂q is the Cartesian Jacobian with respect to the joint variables and J = ∂P ∂θ the Jacobian with respect to the motor values. The matrix L, with L i,j = ∂q i ∂θ j , is the motor to joint differential matrix [24]. In this work, the same motor to joint mapping described in [25] is used to compute the motor values from the joint values obtained from the simulator. Thanks to the ability of ANN to learn complex models, the mapping from motor values to Cartesian tip position θ → P can be learned. The expected output of the ANN modelled robot can be written asP = P NN (w, θ), where w is the vector of the ANN weights and biases. Being each Cartesian component independent on each other, three different neural networks can be used to find the mapping between the motor values and each Cartesian component.
It is known that regression approaches are highly influenced by outliers [26][27][28], and ANN are no exception [29,30]. To improve the performance of the learning approach, the method presented in [31] is employed. This method consists in iteratively reweighting each data point, based on a user defined weighting function, thus allowing discarding points too far from the model's expected output (outliers). Moreover, due to the parametric nature of ANN, it is possible to easily compute the derivatives of the network output with respect to the network weights through back-propagation. Additionally, it is also possible to compute in a similar way the derivatives of the output with respect to the inputs, which allows easily retrieving the robot Jacobian. For each network layer, with input x i and output y i , the derivative of the output with respect to the input can be computed as: where z i = W i x i + W i,0 , with W i being the matrix of weights of the layer and W i,0 the biases, and h i the activation function. The first two partial derivatives can be easily computed analytically once the activation function is chosen (for instance sigmoid), and ∂z i ∂x i = W i . Being the total network a cascade of layers, the final derivative of the network output with respect to the network inputs can be calculated iteratively by applying the chain rule to the derivatives of each layer. The computed derivatives provide the estimated JacobianĴ = J NN (w, θ). By using three networks, the derivative of each network will correspond to one row of the Jacobian matrix.

Multiobjective Control
Since in this work we only focus on the robot tip position, and it being 3-dimensional, the robot results in being a redundant system (the number of DOF is larger than the task dimension).
Different techniques were proposed to solve the redundancy problem, such as the extended Jacobian technique or the augmented Jacobian [32,33]. However, these approaches may not be optimal, because the tasks may conflict with each other, having all the same priority level. A better approach is to solve the kinematic problem for redundant robots by means of the stack of tasks. In this scenario, each task is given a different level of priority.
In general a desired 3D Cartesian task can be defined at each instant by P d ,Ṗ d ∈ R 3 . To exploit the redundancy of the system, prioritized subtasks P d,k ,Ṗ d,k can be assigned. Considering a scenario with only two subtasks k = 1, 2 (with k = 1 being the highest priority task) such that P d = P d,1 P d,2 T andṖ d = Ṗ d,1Ṗd,2 T , the motor values can then be computed as [33]: where J k is the end-effector's Jacobian corresponding to the subtask k, † is the pseudoinversion operator andṖ k =Ṗ d,k + K p (P d,k − P a,k ), with K p being a positive definite gain matrix and P a = P a,1 P a,2 T the measured tip position. Projecting the lower priority task in the null space of the higher one allows solving the actual task at best, without violating the higher priority one.

Adaptive Modelling
Machine learning and data-driven approaches such as ANN are highly influenced by the data. To have good models, data should be carefully acquired and this might be time consuming and complicated. Moreover, it is usually very tedious to have such good data that allow the model to be very generalizable and applicable to different situations. In fields such as surgical robotics, the environment may vary during the task, for instance due to tissue deformation, wearing or breaking of tendons, or to fluids in the body (e.g., blood) that change the friction in the system. This, in turn, may cause unpredicted changes in the transmission. Having a feedback control loop may not be sufficient due to the errors in the model. Considering the kinematics of the real robotic system, the end-effector velocity is:Ṗ where ∆J is the deviation between the real robot Cartesian Jacobian and the modelled one. The error between the desired end-effector tip position and the actual position can be expressed as d,a = P d − P a = P d − P NN − NN,a and˙ d,a =Ṗ d −Ṗ NN −˙ NN,a . A feedback control of the form Jθ =Ṗ d + K p (P d − P a ) would not be sufficient to guarantee error convergence. With such a controller, the error dynamics would be˙ d,a = −K p d,a −˙ NN,a , which may not converge to zero due to the neural network modelling errors at the velocity level˙ NN,a = ∆Jθ. Therefore model adaptation may be required, in order to reduce the deviation between the real and expected model. The proposed re-adaptation works as in the flow chart of Figure 2. With the current model, 1 the motor values to reach the desired position are computed as in (3). These motor values are then commanded to the robot and 2 the actual tip position P a is measured and the expected tip position computed. If the error between the expected position and the actual robot position (for any Cartesian component) is higher than a certain threshold, 3 the network weights are updated. Once the model is updated, Equation (3) is used again to compute the motor values to reach the desired position. The adaptation is repeated until convergence. Afterwards, 4 a new desired trajectory point is assigned and the process repeated until the whole task has been executed.
To re-adapt the model, the newly computed motor values and the corresponding actual robot position need to be added to the training set of the neural net.
Ideally, the training set for the adaptation would consider all data points acquired. However, having a large datasets may slow down the computation of the new model. To have a faster learning, a "short memory" approach is used, which consist of considering only m values. The NN weights are then adapted to minimize the mean squared error in the chosen dataset as: where m is the chosen number of points to be stored in the memory and k n is a weight that can be assigned to each datapoint. Being each Cartesian component of the tip-position learnt independently with a single neural network, each network can be updated singularly. If two or more networks need to be updated, the computation can be parallelized, making it faster. This would not be possible if one single network, mapping the motor values to all three Cartesian components, was used. For the network adaptation Levenberg-Marquardt algorithm is used [34]. The algorithm consists in sequentially approximating the cost function and incrementally update the weights by ∆w = (∑ m n=1 J T w,n J w,n + µI) −1 ∑ m n=1 J T w,n ∆P n , where J w,n are the matrices of network derivatives with respect to the network weights, obtained via back-propagation, H and g are the approximated Hessian matrix and the gradient of the cost function. Algorithm 1 shows the process to update the weights. The update stops when the mean squared error between the actual tip position and the network output in the chosen dataset is smaller than a threshold δ. Different aspects, however, affect the speed of adaptation, such as the damping factor µ in the Levenberg-Marquardt algorithm, the size of the dataset, and the network architecture. Having a small damping factor may lead to larger changes of the weights, and thus longer time for the adaptation. Large datasets, instead, means computing the network estimates and derivatives among more datapoints, thus requiring more computational time. Large networks also increase the computational time, since the more the layers and neurons, the larger the number of weights to update.
while StopCriteria == False do H = µI, g = 0 for n = 1 . . . m do Get inputs and outputs from dataset θ n , P a n ← getData(Dataset) Compute network estimates and derivativeŝ P n , J w,n ← getNetEstimates(θ n , w i ) H = H + J T w,n J w,n g = g + J T w,n (P a,n −P n ) end for w i+1 = w i + H −1 g Compute mean squared error with new weights mse ← getMeanSquaredError(Dataset, w i+1 ) if mse ≤ δ then StoppingCriteria = True end if i = i + 1 end while return w i+1 end function

Results
In this section, the results of the proposed method are presented. Results are evaluated on a simulated environment based on V-REP simulator [22]. The attached Video S1 shows the simulation results.

Robot Modelling
To build the model of the robot as presented in Section 2.1, a similar process to the one that would be applied on the real system is carried out. The robot is commanded to follow a specified path, which in this case is set to be a circumference with a radius of 15 mm. All measurements are referred to the robot base frame. The simulated kinematics of the robot is obtained through standard Denavit-Hartenberg convention [35]. This model is then used to compute the motor values to command to the robot to follow the path. In the mean time, the end-effector tip position is collected from the simulator. Gaussian noise with zero mean and standard deviation of 1 mm for each Cartesian component is added. In total, the dataset consisted of 30,000 points.
The learning method is then used to learn the simulated kinematics and the mapping from motor values to end-effector tip position from the recorded data. The simulated kinematics is also used to have a ground truth for the learned model and evaluate its performance. Being each position component independent on each other, three different neural networks are employed. In this work, all three networks have a similar structure with one input layer (with the input being the 5-dimensional vector of the motor values θ), one hidden layer, and one output layer (the output being a 1-dimensional vector corresponding to each Cartesian component). For the x, y components satisfying results were obtained with 20 neurons in the hidden layer, whereas for the z component 10 neurons were used. For all the components, the activation function is set to be a sigmoid, which guarantees smoothness and continuity in the computation of network derivatives. Figure 3 shows the results for the offline learned model. The acquired dataset has been randomly divided into three subsets for the training (70%), validation (15%), and testing (15%). The robust approach [31] is able to find a very good model for the data acquired offline, yielding very small errors between the estimated outputs and the ground truths (Table 1).

Autonomous Tracking
For the control task, the end-effector of the robot is required to follow three different circular paths: one with a radius of 15 mm, such as the one used for training the network; one with radius of 20 mm; one with radius of 10 mm. Two exploit the redundancy of the system, two subtasks with different priorities are assigned: the primary task is to perform the circle in the x, y directions; the secondary task is to keep the z constant. In this case the z is fixed at 0.050 m. The robot starts from the home configuration and moves to the circle along a straight line. Then it is required to loop around twice. The desired motion time is set to 25 s (5 s for reaching the circle and 20 s for looping) with a sampling time dt = 1 ms. The motor commands are computed as in (3), where P d,1 is the 2D vector of the desired x, y tip position, P a,1 the x, y measured components of the tip position, P d,2 is the z tip position, P a,2 the measured z component. Since we focused only on path tracking, the desired velocitiesṖ d,2 ,Ṗ d,2 can be set to 0. Figure 4 plots the results for tracking the desired paths, whereas Figure 5 shows the absolute errors for each Cartesian component between the ANN expected position and the desired position ( NN,d ), the expected position and the actual tip position ( NN,a ), and between the actual and the desired positions ( a,d ).   For all the three paths, the feedback controller is able to lead to good tracking performance, with small errors between the actual tip position and the desired one. This happens because, despite the modelling errors between the expected network output and the actual tip position, the error at the velocity level is small, and, therefore, feasible for the feedback controller to compensate for (4). The proposed adaptive method also leads to effective tracking, with generally smaller errors, especially in the z direction. The errors in the z direction are higher than those along x, y since the former is given a lower priority. For the adaptation, the retraining set consists of the latest 100 data points and each model for each Cartesian component is re-adapted whenever ( NN,a ) is greater than a certain threshold, set to 0.5 0.5 1 mm for each component. The threshold for the z is larger, being it a less important task. However, when no adaptation is implemented, the errors between the actual tip position and the ANN expected position are large. The adaptation, instead, ensures minimization also of NN,a , and, thus, minimizing the risk of instability of the system. Table 2 shows the mean absolute errors¯ a,d ,¯ NN,a in the different cases. Thanks to the use of a global learning method such as ANN, which allows having more generalizable models compared to local learning methods such as those proposed in [13,16], model adaptation is not always needed, therefore there is not a continuous re-learning of the model. Table 3 shows the number of times each model, for each Cartesian component, was adapted. Due to the adaptation, however, the total time for executing the desired motion tasks is larger than in the case with a simple feedback controller. The main drawback of a simple feedback controller is that it works well if the model is accurate enough. In surgical robots, however, nonlinear effects such as friction, cable extensions, or even the occurrence of a broken tendon are difficult to predict and compensate for. To validate the effectiveness of the proposed method, we designed a path tracking test where the robot needs to follow the circular path of 15 mm in radius, but with a broken joint (Elbow 1). In this scenario, the joint is considered locked at an angle of 10 • . Detecting which joint is not functioning anymore may be challenging or even impossible in a real scenario and, therefore, the system model needs to adapt accordingly. In this case, no information about the broken joint is given, and the controller only receives data about the tip position and the commanded motor values. Figure 6 shows that the proposed method is able to effectively relearn the model and ensure correct execution of the task. Apart from some initial errors, the adaptation is able to converge to an accurate model. When a simple feedback controller is used, instead, the system becomes unstable and fails the tracking task, due to the unmodelled errors. When the adaptation is activated, the mean absolute errors result to bē a,d = 0.006 0.006 1.625 mm and¯ NN,d = 0.208 0.215 1.639 mm, proving the capabilities of the approach to guarantee correct path tracking. Again the errors along z are higher since it is a lower priority task. In this case, the model for the x component adapted 171 times, the one for the y component 265 times, and the one for the z component 28 times. Larger number of adaptations occur due to the unpredicted system behaviour.

Teleoperation
To further prove the capabilities of the proposed method, a teleoperation test is also performed. This test serves for two purposes. On one hand, surgical robots are usually teleoperated and, therefore, the adaptation must be fast and accurate enough to allow the robot to follow the master commands. On the other hand, through teleoperation it was possible to generate paths completely different from the training one, which might happen due to tissue deformations. Also in this case the primary task is following the path in the x, y Cartesian space. The same thresholds and memory size of the autonomous tracking are used. Figure 7a shows the results. Thanks to the fast model adaptation, there is very little lag between the commanded motion and the executed one. Moreover, the model adaptation proved effective and able to allow the system to compensate for the errors and follow the desired path with high accuracy resulting in¯ a,d = 0.144 0.214 3.922 mm and¯ NN,d = 0.248 0.277 3.616 mm. Again, errors in the z direction are higher, but just because this task is not as important as the tracking in the x, y plane.
If larger and faster motions were commanded, the adaptation would take longer. However, in surgical environments, motions are usually slow, making the use of this method in such operations feasible.
Also for the teleoperation case, a test with a broken joint was conducted. Again, the adaptive method allows effectively relearning the model and guarantee safe and correct task execution, as shown in Figure 7b. The mean absolute errors result to be¯ a,d = 0.087 0.236 3.185 mm and¯ NN,d = 0.209 0.239 3.076 mm. Due to the reduction of redundancy and dexterity for the broken joint, the tracking errors are generally higher.

Discussion
The simulation results showed the capabilities of the proposed approach in re-adapting the model of the surgical tool online, improving system's stability and tracking performance. However, the speed of the adaptation still depends on different parameters, such as the size of the memory chosen, the complexity of the neural network, and the learning algorithm. Having a large dataset for retraining may allow building a better global model, whereas if a short memory is used, the proposed approach may be considered more like a local learning method. A network with many layers and neurons may also better learn a global model, yet it highly increases the computational cost for the adaptation. The algorithm for adaptation also has impact on the time it takes the adaptation to minimize the errors in the dataset. Levenberg-Marquardt algorithm was chosen because it is a generalization of the gradient descend method, it is fast, and allows specifying the damping factor µ, which affects the rate of change of the network weights. Moreover, in this work, the proposed approach was tested only in simulation. To better compensate for modelling errors due to the nonlinearities in the real system, a more complex network architecture may be needed, with more layers and neurons.
Finally, another point to consider is that the proposed adaptation relies only on the measured tip position and on the commanded motor values. The motor values can be directly measured from the motor encoders, whereas, for the tip position external sensors such as electromagnetic trackers or the endoscopic camera need to be used. The accuracy of the adaptation therefore depends on the accuracy of the measurements.

Conclusions
One of the major challenges that still limits the use of autonomous (or little supervised) surgical robots is proper modelling and control of these systems. The nonlinearities in the actuation, the highly articulated design, and the unstructured surgical environments require complex modelling techniques and control strategies to guarantee safety and accuracy. An inaccurate model may lead to system instability and improper control, causing serious traumas to the patients. In this work, we presented an approach to control a redundant surgical robot, consisting in learning the forward kinematic model of the robot by means of a global learning method such as ANN. The forward kinematic model is learned because it allows having access to the robot Cartesian Jacobian and, thus, exploit effectively the redundancies of the system.
As for the chosen modelling method, ANN have been chosen for different reasons. Firstly, they allow modeling complex functions; secondly, their parametric nature makes it easy and fast to compute the robot Jacobian; thirdly, being global methods, they reduce the amount of model adaptation and increase the reliability in case data cannot be collected and used to learn the model online. Most of the state-of-the-art works, instead, focus on learning the inverse kinematics of surgical robots, therefore, given the desired tip position, the machine learning algorithm is employed to directly retrieve the corresponding joint or motor values. Consequently, this does not allow for solving the motion task as an optimal control problem, thus neglecting the possibility to execute prioritized subtasks. In an highly articulated robot, such as the Micro-IGES, however, exploiting the redundancies can be beneficial to improve the versatility of the system and execute a desired task more safely.
In our approach, the kinematic model of the robot is first learned offline. Having an accurate model allows effectively tracking a desired path even with a simple feedback controller. However, when unpredicted changes in the system occur, such as a broken tendon, a feedback controller may lead to system instability and failure to execute the task. The proposed adaptive method, instead, proved its effectiveness in allowing the model to quickly adapt to a changing environment. Results show that the precision achieved in tracking the path is very high, even when unknown paths need to be tracked, as in the case of teleoperation.
Future work will focus on implementing the proposed method on a real surgical robotic tool, where the nonlinear effects of the tendon-driven transmission are difficult to model and highly unpredictable. As a matter of fact, the simulation used here does not consider nonlinear effects such as friction or tendon elongation and slack, which are common in tendon-driven surgical robots and which can cause inaccuracies in the model and in the control. These effects are configuration dependent and a fixed offline model may not be accurate enough to achieve the required precision. The proposed online model adaptation may be able to compensate for the modelling errors, improving the tracking performance.
Author Contributions: Conceptualization, methodology, software, validation, writing, F.C.; supervision and proofreading, P.K., G.P.M. All authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.