Passive Exercise Adaptation for Ankle Rehabilitation Based on Learning Control Framework

Ankle injuries are among the most common injuries in sport and daily life. However, for their recovery, it is important for patients to perform rehabilitation exercises. These exercises are usually done with a therapist’s guidance to help strengthen the patient’s ankle joint and restore its range of motion. However, in order to share the load with therapists so that they can offer assistance to more patients, and to provide an efficient and safe way for patients to perform ankle rehabilitation exercises, we propose a framework that integrates learning techniques with a 3-PRS parallel robot, acting together as an ankle rehabilitation device. In this paper, we propose to use passive rehabilitation exercises for dorsiflexion/plantar flexion and inversion/eversion ankle movements. The therapist is needed in the first stage to design the exercise with the patient by teaching the robot intuitively through learning from demonstration. We then propose a learning control scheme based on dynamic movement primitives and iterative learning control, which takes the designed exercise trajectory as a demonstration (an input) together with the recorded forces in order to reproduce the exercise with the patient for a number of repetitions defined by the therapist. During the execution, our approach monitors the sensed forces and adapts the trajectory by adding the necessary offsets to the original trajectory to reduce its range without modifying the original trajectory and subsequently reducing the measured forces. After a predefined number of repetitions, the algorithm restores the range gradually, until the patient is able to perform the originally designed exercise. We validate the proposed framework with both real experiments and simulation using a Simulink model of the rehabilitation parallel robot that has been developed in our lab.


Introduction
Nowadays, robots are present in many different areas. For instance, rehabilitation devices can be used as therapy aids, e.g., for the development of adjustable devices for assisting different sensorimotor The robot applies assistive or resistive moments depending on whether the exercise is passive or active, respectively. For entertainment during exercises, the platform can be interfaced with game-like virtual environments [30]. The Rutgers Ankle is also being used to perform clinical trials for post-stroke rehabilitation [31]. Despite its use in research and experimentation, the device suffers from redundant actuations. For redundancy reduction, the authors of [13] proposed three-df and four-df PRs with a configurable central strut for sprained ankle treatments [13,32]. Different configurations of the central strut allowed the authors to analyze three different PRs in the stiffness domain.
The characteristics of the exercises to be performed in each case are very different. For that reason, a reconfigurable device is introduced in order to adapt to each patient's range of ankle motion [33]. This robot works on the metatarsophalangeal joint and its controller varies the impedance parameters in order to accommodate different exercise modes. A three-RSS (Revolute, Spherical, and Spherical) PR is proposed by [34] and validated in simulation for ankle rehabilitation. Syrseloudis and Emiris [26] introduced a tripod-based PR actuated by electric motors for ankle rehabilitation.
Actuation redundancy is used to deal with PR singularities [35]. Fan and Yin developed a four-df wearable PR [36]. In their design, the moving platform is linked to the patient's foot, while the fixed platform is attached to the lower extremity. Cable-driven systems [37] have also been used in rehabilitation.
For more recent approaches for ankle rehabilitation using PRs, readers may refer to [38][39][40][41][42]. The selection and design of the control algorithms are based on analysis of the rehabilitation protocol taking into account the dynamics of both the system and the human-robot interaction. Dynamic posturography has been studied [43], where multi-axial perturbations are required. However, they did not use force sensor measurement.
The main motivation behind this work is to improve the therapeutic resources that can be applied to people with locomotive disorders and to offer better rehabilitation results by providing different types of exercises. In this context, it is important to develop an appropriate low-cost mechanical solution that is able to adapt different rehabilitation exercises to different patients. Unlike the aforementioned rehabilitation devices, the proposed system not only has a suitable kinematic and dynamic design but also provides a control system equipped with a learning algorithm that monitors movements and forces that arise during the execution of the exercise and that can adapt to patients' needs using a new Learning from Demonstration (LfD) framework [44]. LfD is an end user technique for non-roboticists to teach new behaviors to a robot by extracting task-relevant information from a demonstration (or several demonstrations) and transfer these skills directly to a robot instead of hard-coding. LfD has proven to be an effective way of teaching robots important motion skills that are necessary when assisting people and providing health care services [45]. Specifically, LfD approaches have been used to teach robots a variety of skills, e.g., physical rehabilitation [46], hand rehabilitation [47], motion planning for rehabilitation [48], robotic surgery [49], and feeding [50], among others.
In this paper, we propose to exploit LfD to learn passive rehabilitation exercises and adapt them based on the patient's needs by integrating Dynamic Movement Primitives (DMPs) [51] and Iterative Learning Control (ILC) [52]. DMPs are trajectory generators that can effectively encode and reproduce trajectories. DMPs were first introduced by [51], then updated in 2013 by [53], before being further updated to include unit quaternion trajectories [54]. Most recently, they have been updated to encode symmetric positive definite matrices profiles [55]. On the other hand, ILC makes it possible to reuse the control signal from the previous iteration cycle in the next one [56]. This paper is an extension of a previous conference paper [57], where we extended the theory behind the rehabilitation device and algorithm, the control scheme, stability analysis, and the validation experiments. The main contributions of this paper are: -Exploitation of force sensing in an LfD framework for ankle rehabilitation using a PR that integrates ILC and DMPs to learn different passive exercises and adapt them autonomously; -Implementation of soft emergency stopping due to the integration of DMP phase-stopping in the emergency button control loop in order to provide soft and smooth stopping; -Provision of a stability analysis of our learning control; -Provision of a brief review of ankle rehabilitation devices, injuries, and exercises; -Implementation of different experiments in order to validate our control scheme.

Overview of the Ankle Joint: Anatomy, Physiology, and Injuries
The human foot and ankle are composed of 28 bones: tibia, fibula, 7 tarsals, 5 metatarsals, and 14 phalanges. The human ankle joint is a very complex bony structure [58] that is composed of three joints: the ankle joint proper or talocrural joint, the subtalar joint, and the inferior tibiofibular joint.
The ankle joint has rotations in the sagittal, frontal, and transverse planes. Figure 1 shows the ankle motion in these three orthogonal planes. These motions are: (i) plantar flexion and dorsiflexion movements in the sagittal plane that occur around the y-axis, (ii) adduction and abduction movements in the transverse plane that occur around the z-axis, and (iii) inversion and eversion movements in the frontal plane that occur around the x-axis. The ranges and moment requirements are summarized in Table 1. the foot also known as medial side is lifted up) and eversion (when the inner side of the foot is pushed down) ( Figure 3). that slowly increase the stress on the ligament. The specific demands of each individual sport dictate the individual drills of this progression. The athlete should have complete range of motion and at least 80 to 90% of pre-injury strength before considering a return to the sport. Finally, if full practice is tolerated without any pain in the injured part, the athlete may return to competition [24].
Currently, most centers have some limitations in the service that can provide to patients requiring rehabilitation, since they have few staff, the therapist can apply sudden movements for various reasons (fatigue, carelessness, etc..) causing pain in the affected part, so the availability of ankle rehabilitation devices that could assist therapists in the development of their work is considered to be very useful.

Low complexity devices
Devices used in ankle rehabilitation could be very simple; such as elastic bands, roller foams and wobble boards. These devices are typically used in exercises that could be performed both in clinic or at home, they are easy to find in almost any physiotherapist shop; they are usually intended for functional rehabilitation. Elastic bands are the simplest devices, made of multishaped strips of resistive elastic intended for muscular foams (b); wobble board (c).

Intermediate complexity devices
Recently, in order to assist and improve the process of ankle rehabilitation, some firms have developed several commercial electromechanical systems that allow patients to move and stretch the muscles and tendons gently [29][30][31][32][33]; usually their movements are similar to the basic ankle movements (figure 4), being able to obtain different ROM (measured in degrees) and various angular velocities (measured in degrees/s) for each rotation axis considered. These machines, in a general sense, are good in helping the rehabilitation ankle process, focusing in range of motion restoration and improving the flexibility of the ankle muscles. Nevertheless, they

Ankle Joint Injuries
Ankle injuries are among the most common injuries in sport and daily life [37]. Namely ankle sprains represents 20% to 40% of all sport injuries and these cause stretching or tearing of the ligaments due to a sudden movement 5 Figure 1. Ankle joint movements in three orthogonal planes [59]. Ankle injuries are among the most common injuries in sport and daily life [58]. Ankle sprains represents 20% to 40% of all sport injuries. These sprains are a stretch or tear of the ligaments due to sudden changes in direction [64,65]. In most cases, ankle sprains can become chronic if the injury is not rehabilitated properly. Approximately 85% of ankle sprains are caused by excessive inversion [66].
The first ankle treatment after injury includes Rest, Ice, Compression, and Elevation (RICE) of the affected foot [13], which should be followed by stretching and therapy exercise along with partial weight bearing with crutches to maintain mobility in the ankle. In order to avoid muscular atrophy, which may lead to a reduction of the ROM, and stimulate healing of the injured ligaments, patient should start motion therapy within 72 h after the injury [64]. Once the ROM is achieved, strengthening of weakened muscles is essential for rapid recovery and is a preventive measure against further injury. Once patients achieve full weight-bearing capability without pain, proprioceptive exercises are initiated. These exercises aim to recover both balance and postural control using wobble boards. Finally, advanced exercises using an uneven surface wobble board should be performed to regain normal activity functions.
At present, different techniques are used for ankle rehabilitation; however, not all of them have the same effectiveness. Some techniques require the patient to be an active agent in the rehabilitation process [67]. In these cases, the patient performs active work through a series of exercises which are gradually intensified to help the ankle regain its mobility. On the other hand, the patient may also perform passive work [67,68], which usually occurs in the early stages of rehabilitation. In this type of work, an external agent, either a qualified person or a device, moves the patient's ankle without her/his voluntary movement.
In this article, a number of references have been generated to rehabilitate an injured ankle with passive exercises. These passive exercises are used to train dorsiflexion/plantar flexion and inversion/eversion ankle movements. The rehabilitation exercises are performed using a three-PRS (Prismatic, Revolute, and Spherical) PR, which is described in the following section.

Parallel Robot: Kinematics And Dynamics
A three-df PR is used as a mechanical device for ankle rehabilitation. Its kinematic and dynamic models are explained in the following subsections.

Three-PRS Kinematics
The PR has been modeled by means of a set of nine dependent coordinates, as can be seen in Figure 2. These coordinates are: (i) actuated prismatic joints (P) represented by q 1 , q 6 , and q 8 , (ii) passive revolute joints (R) in q 2 , q 7 , and q 9 , and (iii) the coordinates q 3 , q 4 , and q 5 only correspond to one spherical joint (S), which is located at position P 1 . Explicit expressions can be obtained for the inverse kinematic problem. Given the location of the mobile platform, so that the position and orientation of the reference system {P m − X m Y m Z m } with respect to the fixed one {A 1 − X 0 Y 0 Z 0 } attached to the base of the robot, it is possible to obtain the coordinates of points P 1 , P 2 , and P 3 , corresponding to the spherical joints. These points are then used to determine the active coordinates q 1 , q 6 , and q 8 .
The forward kinematics is solved using a geometric approach, taking into account that the length between any consecutive P d points is constant and equal to l m , where d = 1, 2, 3. Thus, the following three nonlinear geometrical constraints can be obtained: If we know the active generalized coordinates q 1 , q 6 , and q 8 , it is possible to obtain the passive coordinates q 2 , q 7 , and q 9 by solving the equation (Equation (1)). Afterwards, the locations of points P 1 , P 2 , and P 3 can be easily obtained. From the coordinates of those three points, the roll γ, the pitch β angles, and the heave z of the mobile platform can be obtained.
The velocity problems, both inverse and forward, are based on the following: Sensors 2020, 20, 6215 6 of 23 where j = 1, 6, and n = 1, 2, 3. A 1 is the fixed frame system. P d is the origin of the reference system attached to the mobile platform (end effector), u A n B n is a unitary vector from joints A n to B n , u B n P n is a unitary vector from points B n to P n , and l r is the constant length of links 2, 5, and 7. Figure 3 shows the closed loop for the velocity and acceleration problems for link 1; the same can be applied for links 2 and 3. By multiplying (dot product) both sides of Equation (2) by u B n P n and taking time derivatives, the following matrix expression can be obtained. This expression relates the linear and angular velocities of the mobile platform to the time derivative of the active generalized coordinates: where V P m = [ẋẏż] is the velocity of the origin of the mobile reference frame (end effector).
is the angular velocity of the mobile platform, J x is the Jacobian matrix in Cartesian space, and J q is the Jacobian matrix in generalized coordinate space. Finally, taking into account that the parallel robot has three degrees of freedom, it is possible to obtain a relationship between the velocities of the mobile platform and the time derivatives of the roll, pitch, and heave of the reference system attached to the mobile platform and any choice of three of them forming the matrix J m , such as: so that: This equation allows us to solve both the inverse and the forward velocity problems. In a similar way, the expressions for the acceleration can be obtained.

Three-PRS Dynamics
As mentioned before, the parallel robot has been modeled through a set of dependent generalized coordinates, considering which the equation of motion will be as follows: where Θ is a vector grouping the dynamic parameters (masses, first inertia moments, moments and products of inertia of the links and friction coefficients). q,q, andq are the generalized coordinates, velocities, and accelerations. M stands for the mass, C for the centrifugal and Coriolis terms, while G denotes the gravitational vector. τ is the generalized torque vector. By deriving the constraint equations with respect to all generalized coordinates, we can obtain the Jacobian matrix J. λ is the vector of Lagrange multipliers. The detailed dynamic model of this PR is described in [69]. For control purposes, the generalized internal forces term is not convenient, so it could be canceled by multiplying both terms of Equation (6) by an orthogonal complement R [70]. Thus, Equation (6) can be rewritten as follows: By considering the relationship between all the generalized coordinates and the active ones, Equation (7) can be written as follows: where d f is the number of degrees of freedom of the parallel robot, and the new vectorsq * ,q * correspond to the active generalized velocities and accelerations.

Policy Learning and Adaptation Algorithm
In this section, we introduce our trajectory learning and adaptation algorithm for robots employed in rehabilitation activities. The proposed algorithm is general and can accommodate other robotics applications that involve contact with the environment, such as force-based trajectory tracking. Figure 4 shows an illustrative diagram of the proposed framework. The position controller is fed by q c (χ) (Equation (14)), which in turn results from either the emergency reference trajectory or the adapted one from the policy learning block (in gray). The policy learning block is covered in Sections 4.1-4.4, while feedback error and offset learning is covered in Section 4.5. A more detailed diagram is shown below in Figure 6.

Learning from Demonstration for Rehabilitation Exercises
In this section, the patient's ankle reference exercise trajectory is designed and the robot learning procedure of the rehabilitation trajectory is described. The trajectory is designed by a medical professional, who guides the mobile platform of the PR while the patient's foot is in the orthopedic boot [38]. This means that the specialist moves the platform in specific directions in dorsiflexion/plantar flexion and eversion/inversion. During this movement, the trajectory of the orthopedic boot (with the patient's foot) is measured by proprioception. The specialist moves the platform, performing an appropriate rehabilitation exercise and setting the maximum positions in each direction. These maximums are determined for each patient according to the pain that the specialist considers the patient can endure in each direction of the movement.
The force sensor is located under the orthopedic boot ( Figure 5) and, consequently, the forces exerted by the specialist (human operator) during the demonstration affect the measured forces and torques. Therefore, to obtain the net forces and torques exerted by the patient, the acquired trajectory is replayed by the PR interacting solely with the patient's foot in the boot and without any adaptation. The resulting force profiles are then recorded. These profiles indicate the actual maximum forces allowed by the patient. This procedure should be repeated for each patient. At this stage, the exercise reference trajectory is determined. Moreover, patients should be able to repeat the exercise without any danger, because they are doing a customized exercise designed for them. In order to make the exercise more comfortable for patients and reduce the maximum forces (pain) applied to them, the specialist marks a threshold a little below those maximum forces. Thus, patients would be able to repeat the exercise assisted by the PR with less pain. After several repetitions, patients may be able to repeat the reference trajectory perfectly without pain. Afterwards, the specialist would determine whether a new exercise reference trajectory should be designed or whether the treatment should end.

Overview of DMPs
In this paper, robot trajectories are encoded by DMPs. They have the ability to slow execution of the trajectory (exercise) down using a phase-stopping mechanism [51] whenever it is necessary to adapt to the patient's needs. DMPs can be found in many applications, e.g., biped locomotion [71], adaptive frequency modulation [72], reinforcement learning [73], automatic assembly [54,74,75], etc.
For each exercise, as mentioned in Section 4.1, a medical professional sets a personalized exercise reference trajectory for each patient. These trajectories are encoded by DMPs. A DMP for a single arbitrary trajectory y is defined by the following nonlinear differential equations [53]: where χ is the phase variable, z is an auxiliary variable, and τ is the time constant. Both parameters α z and β z define the behavior of the second-order system described by Equations (9) and (10). The phase evolution is defined by Equation (11). With the choice of the time constant τ > 0, α z = 4β z , and α χ > 0, the convergence of the underlying dynamic system to a unique attractor point at y = y d and z = 0 is guaranteed [53]. f (χ) is the linear combination of N nonlinear radial basis functions, which enable the robot to follow any trajectory smoothly from an initial position y 0 to a target position y d . In the basic DMP Equations (9) and (10), each df is encoded as a separate DMP (one for dorsiflexion/plantar flexion and another for eversion/inversion); however, all the dfs share the same phase variable χ.

Exercise Generation Using DMPs
In order to encode exercise trajectories we substitute y, y g , f (χ) in Equation (9) for q, q d , f q (χ), andẏ in Equation (10) forq.
where D = diag(q d − q) ∈ R 3×3 . The diagonal matrix D is used to scale the movement amplitude if the target configuration changes.
are fixed basis functions. c i are the centers of Gaussian distributed functions throughout the phase of the trajectory and h i are their widths. w i are adjustable weights. For each df trajectory, the weights w i are estimated from nominal trajectories using regression [76]. Thus, the resulting DMPs encode the desired exercise trajectory. To track the desired trajectory,Equations (9) and (10) need to be integrated for all dfs with the common phase in Equation (11).
Since forces F d and torques M d (obtained from human demonstration in Section 4.1) are used as desired variables along the trajectory and not as robot control variables, they do not need to be encoded by DMPs. Instead, linear combinations of radial basis functions are used to approximate the desired forces throughout the phase χ i = χ(t i ): Thus, six systems of linear equations need to be solved in order to estimate w F i and w M i from the measured force/torque data.

Overview of ILC
ILC [52,56] is a tracking control method for systems that execute the same trajectory in a repetitive mode. ILC assumes that the performance of an agent that repeatedly performs the same task can be improved by learning from past executions. In the conventional ILC formulation, the objective is to reduce the trajectory tacking error while rejecting periodic disturbances. This is obtained by adjusting the pre-defined control input with a corrective term that linearly depends on the tracking error.
Standard ILC assumes: (1) stable system dynamics, (2) fixed common initial conditions for each trial, and (3) the same duration for each trial. In the case of this paper, the third assumption cannot be fulfilled due to the slowing-down/speeding-up of the trajectory. However, in order to overcome this problem, the trajectory is temporarily scaled as a function of the phase variable using Equation (11). In this case, it is possible to sample the same number of times in each trial. In other word, all trials have the same phase even though they have different durations.

Error Feedback and DMP Phase Stopping
In human-robot interaction, the interaction might change the resulting measured forces/torques when the robot executes the demonstrated trajectory. These forces may be different from the ones (desired forces) recorded by the human demonstration. Consequently, the robot has to adapt the trajectory in order to minimize this difference between the measured and desired forces. As a solution for this problem, either admittance or impedance control can be implemented. In this work, admittance control [77] has been implemented. In human life, it can be observed that individuals are able to acquire skills in many different ways (through work, play, etc.) by repeating the same action over and over again. This means that humans learn skills from repeating actions. In the same way, and for the same task, the robot should learn from previous repetitions to adapt the executed trajectory, especially when the robot is interacting with humans for safety reasons. Hence the importance of tracking and monitoring the error in the previous repetition, which can be used to improve the performance of the next repetition for the same action (trajectory). This principle is used to learn and adapt rehabilitation exercises (trajectories) which is the basic idea of ILC [52,56].
In passive ankle rehabilitation exercises, the robot is required to follow a specific predetermined trajectory, as described in Section 4.1. However, depending on the state of the patient's ankle, this exercise may cause some pain if the robot executes its preset trajectory. To avoid this, trajectory adaptation is introduced whenever the measured force exceeds a certain safety threshold due to the resistance of the patient's ankle to follow the exercise. In order to adapt the PR to the new situation, the trajectory is modified according to the admittance control law [77]: where q c (χ) is the new position commanded of the robot controller. q DMP (χ) is the reference trajectory obtained by DMPs and K is the gain matrix. The force feedback control is provided by the feedback error K · e q (χ), where e q (χ) = F d (χ) − F, F and F d (χ) are the actual measured force profile and the desired force one as a function of phase χ, respectively. ϕ q (χ) is the on-line learned offset to be added to the original trajectory, where its initial value is [0, 0, 0] . This way, the commanded trajectory is modified by adding the offset to the original one, instead of modifying it. To ensure safety for the patients, our proposed framework adapts the execution trajectory in order to minimize the force error between the desired and measured forces. Thus, low gains are used in order to achieve stable and robust force adaptation. Moreover, for efficient force adaptation the algorithm slow the trajectory execution down. DMP slow-down mechanism is derived from Equation (11) [51]: where = q − q , q andq are the DMP output and the the corresponding actual position of the robot, respectively. In this work, = e T q , and α pχ , α py are positive constants.

Offset Learning
The aim of our learning framework is to iteratively modify the reference trajectory so that the patient can safely repeat the rehabilitation exercise. The offset is updated after each trial through δ q t,l+1 = ϕ q,l (χ t ) + K · e q (χ t ), (17) where l is the iterator. Each offset component ϕ k is represented as a linear combination of M radial basis functions as follows The new data points {δ k t,l+1 }, t = 0, . . . , T, are obtained from the k-th component of the offsets trajectory, where k = 1, 2, 3. This optimization problem aims to find {w i,k } that minimize the quadratic objective function:

Ankle Rehabilitation Control Scheme
A detailed control scheme of the proposed ankle rehabilitation framework is shown in Figure 6. The reference trajectory Q d (t) is designed by the specialist. The transformation of data from time domain to phase domain and vice versa are done in the blocks t ⇒ χ and χ ⇒ t, respectively. q d (χ) is the DMP reproduction of the encoded original trajectory Q d (t). F d (t) describes the reference/desired force profile that produces F d (χ) by applying Equation (13). The measured forces F may have high values depending on the patient's response during the exercise. The objective of this work is to adapt the exercise by reducing the difference between F and F d (χ). By applying Equation (14), the new offset is estimated and added to the one from the previous repetitions ϕ(χ). q c (χ) is the commanded trajectory to be executed by the robot and represented as the aggregation of force feedback in Equation (14), the learned offset in several repetitions through Equation (17), and the DMP-generated trajectory. This procedure is repeated until the desired and measured forces match or no further improvement is possible. The gray shaded area in Figure 6 represents the learning procedure, which belongs to the ILC algorithms, where current iteration causal learning is applied, as described in [52,56].

Position Controller
In our framework, we do not modify the reference/original trajectory, instead, we adapt it by adding iteratively an offset learned from the previous iteration. This offset update is represented by the discrete delay Z −N block in Figure 6, where N is the number of samples of that repetition.
Under normal conditions, the DMP provides the reference (heave, pitch, and roll angles) for the parallel robot control unit (see Figure 6). The controller compares the reference with the robot position/velocity to calculate the torques as in Equation (8), providing them at a frequency of 100 Hz.
It should be noted that the robot is equipped with an emergency button that can be actuated by the patient or by the specialist. If the button is pressed, the DMP stops the rehabilitation exercise and brings smoothly the platform into a safe configuration. In our case, it has been considered that the platform is at zero value for the heave as well as for the angles of eversion (roll) and dorsiflexion (pitch). When the emergency button is released, the DMP smoothly brings it to the pose where the platform was in before the button was pressed and continues with the rehabilitation exercise.

Stability Analysis
In order to prove the stability of the learning control, we assumed that the closed control loop of the 3-PRS is stable, without the iteration loop, with proper choice of the admittance feedback gain K [78,79]. However, closed-loop stability does not necessarily imply that the system will remain stable during the repetitive learning. In this regard, the aim of this section is to determine how our system may be affected by the iteration loop.
To clarify the notation, let uppercase letters denote one-sided Z-transform of the corresponding time-discrete signal, which is denoted with lower case letters. Note that the signals in Equations (14), (17) and (18) are phase dependent. However, we can always express the time-dependent counterpart and express the corresponding Z-transform. For the sake of simplicity, explicit dependence on z in transfer functions and Z-transform of the signals is omitted. After that, it is assumed that the nonlinear dynamics of the robot is fully compensated for using feedback control. By assuming a known environment stiffness K s , the force at iteration l (F l ) can be predicted: where K s is a diagonal positive definite environment stiffness matrix, G is a diagonal matrix containing the decoupled dynamics of the robot in the form of a second-order system which maps the desired position vector P l into the actual position, and P o denotes the environment contact positions. According to Equations (14) and (17), the Z-transform of the error function E l , the position update function P l , and the learned offset function Φ are: where Q presents a transfer function which maps the original sampled function to a function approximated with Gaussian kernel functions. K is the gain matrix from Equation (14). In [80] it was shown that Q can be approximated with a second-order transfer function. However, with enough M (in Equation (18)), the approximated function, with Gaussian kernel functions, is close enough to the original function, so that Q can be set to I. By defining E l as the error function and E l−1 as the error function in the previous learning cycle [81]: Dividing Equation (24) by E l and re-arranging leads to: Asymptotic stability is assured if E l E l−1 < 1, ∀ l. Inserting the z dependence into transfer functions and signals and substituting z = e jω in Equation (25), the condition for asymptotic stability becomes [52]: With a proper selection of K s and K, the above equation is fulfilled and the learning stability is guaranteed.

Results
In this section, we develop different simulation examples as will as real experiments in order to validate the performance of our LfD framework. The trajectories used in these examples were obtained from the real robot by a medical professional guiding the moving platform of the robot, while the patient's foot is installed in the boot, according to the procedure detailed in Section 4.1. These trajectories represent only passive exercises, which require the robot to follow them accurately.
To simulate the robot, we used a MATLAB Simulink R model. The model accurately imitates the real robot [69]. The description of the robot's hardware is detailed in the next section, where the kinematic and dynamic models used in the simulation are based on that setup.

Hardware Description
The rehabilitation robot (see Figure 7) consists of three PRS kinematic chains (1, 4, 6 in Figure 2) connected at one end to a coupling bar (2, 5, 7 in Figure 3), and at the other end they are perpendicularly attached to the platform's base, as shown in Figure 2. Each leg is driven by a direct drive ball screw actuator (actuated prismatic joints) (P). The coupling bar is connected on one side to the leg with a revolute (R) joint and with a spherical joint (S) to the moving platform on the other side. The legs are distributed forming an equilateral triangular configuration at the base. The choice of this configuration is based on the need to develop a low-cost robot with two rotational dfs that are required to perform the main rehabilitation exercises; dorsiflexion/plantar flexion and eversion/inversion. In addition, a translational df is used to adapt the platform with respect to the patient's height while sitting on a chair. The configuration and dimensions of the robot fulfill the requirements to perform the lower limb rehabilitation exercises proposed in this paper. The motor for each actuated prismatic joint is a brushless DC servomotor equipped with a power amplifier, with the following specifications: continuous stall torque of 2.86 Nm and continuous peak torque of 11.43 Nm. The lead of the ball screw is 20 mm and the actuators are Aerotech BMS465 AH brushless servomotors.
The control unit of the robot is based on an industrial PC equipped with two AdvantechTM cards: a PCI-1720 to supply the control actions by means of digital-to-analog outputs, and a PCI-1784, to read the actuated prismatic joint positions. The PC is equipped with a Linux Ubuntu operating system (patched with the real-time kernel Xenomai) and the open software middlewares Open RObot COntrol Software (Orocos) and Robot Operating System (ROS).
We have used open-source middlewares installed on an industrial PC for the robot control architecture. The main advantages of that are: (i) open-source with high-level tasks programming capabilities (e.g., control based on external sensing using a force sensor, automatic trajectory generation, and artificial vision, etc) and (ii) it is low-cost where the total cost of the hardware is around $2000, in addition to free operating system and programming tools.
The parallel robot is equipped with an orthopedic boot and Delta SI-330-30 ATI force sensor ( Figure 5). This sensor is six-df and is capable of measuring forces and torques in 3D using a amonolithic instrumented transducer. The force/torque sensor is integrated into the system to measure the effort exerted by the patient. This configuration provides the possibility of implementing different types of rehabilitation exercises (active and passive), although this paper focuses on passive exercises. A complete description of the mechatronics of this robot can be found in [79].

Execution of Different Exercises
In the first simulation test, we have applied our proposed framework to learn and adapt an exercise trajectory that moves the robot platform in z while maintaining zero values for the pitch and roll angles, Figure 8. The robot moves along the nominal trajectory as long as the vertical force exerted by the patient keeps within the admissible values. Whenever the sensed forces exceeds a threshold designed by the therapist, the control system will trigger the phase-stopping mechanism in order to slow the trajectory evolution down, Equation (15). During the slow-down, a new z-offset will be estimated to be added to the trajectory in the next cycle which subsequently will reduce the sensed forces. Consequently, across cycles, the phase stopping becomes less frequent and the execution time decreases in each learning iteration. If we observe Figure 8, the measured forces decrease from one trial to the next (from 2 to 10). This is because the PR adapts its trajectory whenever the measured forces exceed the desired forces. Moreover, the learned offset increases with each iteration (from 2 to 10) to adapt the trajectory to the new situation. Line-1 in Figure 8 corresponds to the reference force, offset, and phase. At this stage of this work, the desired force profile is equal to zero to test the functionality of the proposed algorithm on our robot. Figure 9 shows a simulation for an exercise that has been repeated 14 times. The purpose of this simulation is to show the adaptability of the system to new situations. For instance, when the measured forces are high, trajectory amplitude is reduced. The reference trajectory peaks are at 0.5 m and the corresponding forces are at 17.5 N. Now, if we set the force threshold to 13 N; which corresponds to position 0.35 m, it can be observed from the figure that the algorithm modifies the reference trajectory amplitude by adding a position offset in each repetition. This offset is calculated from the feedback forces error. After 14 repetitions, the algorithm is able to replay the trajectory into the safe region.
The next experiment, shown in Figure 10, demonstrates a three-df exercise. In this experiment the algorithm runs γ, β, and z trajectories. Figure 10-bottom shows the phase evolution of the whole exercise, the learned offset for each df is shown in Figure 10-left, while the force adaptation for each df is illustrated in Figure 10-middle. Figure 10-right shows the original exercise trajectory for each df as a dashed red line, the first, fifth, and fifteenth repetitions (cycles) of the same exercise. It is clear from the figure that the algorithm does not change the shape of the original trajectory. In fact, it slows the exercise down and tries to reduce the resulting forces.
It is noteworthy that when the system stabilizes and there is no further adaptation, after a period of time determined by the specialist, the algorithm starts to go backward in the direction of the original    Figure 9-bottom shows the phase evolution of the whole exercise, the learned offset for each df is shown in Figure 9-left, while the force adaptation for each df is illustrated in Figure 9-middle. Figure 9-right shows the original exercise trajectory for each df as a dashed red line, the first, fifth, and fifteenth repetitions (cycles) of the same exercise. It is clear from the figure that the algorithm does not change the shape of the original trajectory. In fact, it slows the exercise down and tries to reduce the resulting forces.
It is noteworthy that when the system stabilizes and there is no further adaptation, after a period of time determined by the specialist, the algorithm starts to go backward in the direction of the original trajectory by gradually removing the added offsets. At each offset removal, the system repeats the exercise for another period of time, and so on.

Position Tracking Error
Different model-based control strategies have been implemented for the PR, such as passivity-based control, inverse dynamic control, and adaptive control. More detailed information about the design and implementation of these controllers can be found in [79]. In addition, the authors in [78] demonstrated that the closed-loop system (robot/adaptive controller) is convergent, so the tracking error asymptotically converges to zero and all internal signals remain bounded under suitable conditions of the controller gains. Figure 11 illustrates the robot response for the third active generalized coordinate q 8 . Figure 11-left shows the reference and the robot response, while Figure 11-right represents the joint error. As can readily be appreciated in these figures, the robot response obtained is accurate because the robot joint follows the reference with a very small error. The other generalized coordinates, q 1 and q 6 , have a very similar behavior. For verification, Table 2 shows the mean error, the root-mean-square error (RMSE) and the variance between the references and the parallel robot active joint positions.  Figure 11. The response of the generalized coordinate q 8 in Simulink.

Emergency Button Testing
In order to validate soft-emergency-stopping, in this simulation we have used an isokinetic rehabilitation exercise based on gait trajectory training [33]. Such exercises are used to restore the original mobility and ROM and to strengthen the affected limbs or ankle joint. The ankle and foot motions are generated based on a gait of normal walking on level ground. This exercise is used to restore the range of flexion/dorsiflexion motion. In order to establish the exercise, we need to elevate the platform to a certain heave. Figure 12-left shows the reference trajectory of the heave and the corresponding robot response, while Figure 12-right represents the error between both signals. In the same way, Figure 13-left illustrates the plantar flexion/dorsiflexion trajectory of the gait exercise along with the robot response. The error between both signals is shown in Figure 13-left. In all the cases, the robot controller provides very good performance, with a small error value. The following figures show another application of the DMPs with the rehabilitation robot. In this case, the patient and the medical doctor have an alarm button (red block in Figure 6). By pressing it, the control unit stops the normal execution of the exercise and moves the platform to a rest position using linear DMPs in Equation (9) with f q (χ) = 0 (yellow block in Figure 6). As soon as the alarm button is released, the robot returns to the initial position before the alarm activation and restarts the rehabilitation exercise. The resting configuration that has been considered for the moving platform is a heave position of 0.3 m with an orientation for the roll and pitch of 0 rad. In this trial, the alarm button was pressed at time t = [19,24] s. and at time t = [44,53] s. Figures 14 and 15 show the heave and the pitch evolution for the mobile platform in this experiment.

Experiments in Real Robot
Similarly, the previous simulated experiments (Section 5.2.3) have been tested here in real setup using the PR shown in Figure 7. Figure 16 illustrates how accurately the robot tracks a reference trajectory. Figure 16a shows the reference and the response of the active joint q 8 , while (b) shows its response when the emergency button is pressed and released in two different locations during the execution. As can be observed in Figure 16c,d, the response of the real robot is very precise, having a tracking error around 1 mm in both previous executions. Figure 16e,f show the reference trajectory and the response of the robot for γ and the heave, respectively. In this execution, the emergency button has been activated twice, as it can be observed in the figure. This execution has been obtained with a system of 10 cameras that detect the position and orientation of the mobile platform. As we can see, the tracking for the angle γ is accurate, while the heave has a tracking error of about 3 mm due to the mechanical clearances of the robot.

Conclusions
In this paper, we proposed an LfD framework to learn and adapt passive exercises for ankle rehabilitation using a PR. This framework, exploits DMPs along with ILC in order to iteratively adapt the exercise trajectory by transferring the feedback error into an offset that can be added to the original trajectory.
Moreover, we solved the forward and inverse kinematic models for our device as well as the dynamic model and the Jacobian needed to implement force control. A model-based controller was chosen to carry out the active generalized coordinates position control. The response obtained with this position control showed an accurate response in terms of position error.
In order to validate our system, we conducted several simulation examples in addition to real experiments in order to test the adaptability, robustness, and accuracy of the system. In these tests, we used passive exercises trajectories where different movement references for γ, β, and z have been executed by the robot. Observing the experiments, the algorithm was able to successfully adapt the exercise to the patient's needs by learning the offset that leads to a reduction in the measured forces exerted by the patient.
Finally, our proposed framework successfully adapts trajectories based on sensed forces. However, in the future we still need to extend this approach in two directions: (i) perform a clinical study, and (ii) speed-up the exercise execution based on the therapeutic recommendations; for example, when the patient repeats the original exercise, the algorithm starts to speed up the exercise. Funding: This work has been partially funded by the FEDER-CICYT project with reference DPI2017-84201-R (Integración de modelos biomecánicos en el desarrollo y operación de robots rehabilitadores reconfigurables) financed by Ministerio de Economía, Industria e Innovación (Spain).