Learning, Generalization, and Obstacle Avoidance with Dynamic Movement Primitives and Dynamic Potential Fields

To offer simple and convenient assistance that helps the elderly and disabled take care of themselves, we propose a general learning and generalization approach that enables a service robot to accomplish specified tasks autonomously in an unstructured home environment. The approach first acquires the required tasks through learning from demonstration (LfD) and represents them with dynamic movement primitives (DMPs), so that they can be generalized to a new environment with little modification. Furthermore, we integrate a dynamic potential field (DPF) with the DMPs model to give the service robot an autonomous obstacle avoidance function. The approach is validated on a wheelchair mounted robotic arm (WMRA) through a series of experiments in which a cup is placed on a table with and without an obstacle on the motion path.


Introduction
A wheelchair mounted robotic arm (WMRA) is a typical service robot developed to help the elderly and disabled take care of themselves in a home environment [1][2][3][4]. However, due to the physical or cognitive impairments of the users, it is still hard or even impossible for them to manipulate the WMRA flexibly to complete daily tasks [5,6]. From the perspective of the disabled and the elderly, it is preferable for such service robots to manipulate objects autonomously and thus help them accomplish the tasks that arise in a home environment [7,8]. Unlike the well-structured factory environment, the natural home environment is full of dynamic, unpredictable, and stochastic events. In this situation, the robot requires a flexible motion planning and control approach that responds to changes in the environment, such as changed goals, encountered obstacles, and external perturbations [9,10]. The traditional approach to generating a complex movement plan relies on a search process or related controllers [11,12] to satisfy all the constraints, including the changes in the environment. However, these approaches are computationally expensive and time-consuming, which makes them unsuitable for a service robot that must react rapidly. At the same time, ease of programming is another factor that affects how widely service robots are adopted. This can be achieved by learning from demonstration (LfD) [13][14][15], in which the robot learns new skills simply by reproducing recorded human movements. Given the large number of tasks that exist in the home environment, this is feasible only if a demonstrated movement can be generalized to other contexts, such as different goal positions [16].
To address the above questions, much recent work has focused on dynamic movement primitives (DMPs) [17][18][19][20][21][22], which offer a simple and versatile framework to represent and generate movements. The core of DMPs is learning from demonstration (LfD), which offers a simple and convenient way to obtain the movement information of the tasks. DMPs can then represent an arbitrary recorded movement with a set of nonlinear differential equations in which a linear point attractor is modulated by a nonlinear function [23,24]. Representing the movement with differential equations has two advantages: the representation can be easily initialized by learning from demonstration, and the generated movement is robust against changes in the task duration, changes in the goal point, and slight perturbations [7]. Moreover, it ensures that the system converges to the specified goal position. In other words, the learned movement can be generalized to a new goal simply by changing the goal parameters [16]. This generalization characteristic is very suitable for a service robot that must replay various learned tasks in different situations.
Online obstacle avoidance is also a difficult and classical problem that robots often encounter during autonomous manipulation. So far, various approaches have been proposed to solve it [25]. They can be divided into two categories: local methods and global methods. Local methods respond quickly to obstacles through local optimization and mainly include the vector field histogram [26], motion field flow [27], the curvature-velocity method [28], and the artificial potential field approach [29,30]. Global methods ensure that a valid, fully optimized trajectory is found if one exists, but they require heavy computation and a global representation of the environment and obstacles; they mainly comprise path planning algorithms [31,32] and global search approaches [33,34]. The artificial potential field approach, which was proposed by Khatib [29] in 1986, has been widely studied for obstacle avoidance. Generally speaking, once a potential field is constructed, the goal generates an attractive field and the obstacle generates a repulsive field. A robot in this field is therefore subject to an attractive force from the goal and a repulsive force from the obstacle simultaneously, which prevents it from colliding with the obstacle. The two fields act together on the robot to generate the desired motion trajectory [25,35]. This approach has a simple structure and supports real-time low-level control. For these reasons, it has been extensively used in mobile robotics [36,37] and robotic manipulators [38] to achieve real-time obstacle avoidance and smooth trajectory control.
Recently, new results have been reported in this research direction. According to the studies of Dae-Hyung Park and Peter Pastor, DMPs can be combined with artificial potential fields to avoid obstacles [13,16,23]. The artificial potential field can be treated as a coupling term added directly to the differential equation of the DMPs, with the purpose of providing direct feedback from the environment. Additionally, Dae-Hyung Park found that a static potential cannot generate a smooth trajectory, especially when the end effector moves directly towards the obstacle. To solve this problem, he used a dynamic potential field in conjunction with DMPs. This approach has been successfully used to avoid a point obstacle [13] and multiple static obstacles [23]. However, the obstacles in these papers are all treated as individual points, without considering spatially extended obstacles. Additionally, we have not found any paper that systematically describes the generalization ability of DMPs together with their obstacle avoidance function in the autonomous manipulation of service robots.
In this paper, we detail a general framework that allows the WMRA to learn a demonstrated motion and generalize it to a new environment, even when obstacles are located on its motion path, so as to help the elderly and disabled live independently in an unstructured home environment. This framework is mainly based on the DMPs-DPF approach, which combines dynamic movement primitives with dynamic potential fields. The main contribution of this paper is a systematic description of the whole realization process of the WMRA's autonomous manipulation, including learning, generalization, and avoidance of obstacles of any shape and size. Unlike traditional path planning methods, this approach can quickly generate a new path that conforms to the user's operating habits, without a tedious programming process or heavy time consumption. This characteristic is particularly well suited to users of the WMRA, because it lets them accept assistance from the robot without panic or fear. It is worth highlighting that the users of the WMRA are located inside the operating space of the robotic arm, which is clearly different from other service robots. Additionally, this approach can also be used in the field of human-robot collaboration [39,40], which has work scenes similar to those of the WMRA. The rest of this paper is organized as follows.
In Section 2, we describe the dynamic system framework for movement generation, mainly including an introduction to the dynamic movement primitives model and the learning and generalization process of DMPs. In Section 3, we introduce the overall framework of the DMPs-DPF approach and describe the coupling term for obstacle avoidance. In Section 4, we carry out a set of experiments of placing a cup on the table: without an obstacle, with a small spherical obstacle, and with a large cuboid obstacle. We conclude and briefly introduce future research directions in Section 5.

Dynamic Movement Primitives Model
Dynamic movement primitives (DMPs) were first introduced to the trajectory control of a robot by Auke Ijspeert et al. [17] in 2002. The basic idea of DMPs is to describe a motion using a set of nonlinear differential equations with point attractors that are modulated by a nonlinear function. In particular, DMPs guarantee that the system converges to the goal point, because the nonlinear function vanishes at the end of a movement. For different goals, discrete DMPs can generate new trajectories that meet the current environmental requirements while preserving the shape of the demonstrated trajectory [13,41,42].
In this paper, we build on the improved DMPs model motivated by human behavioral data and convergent force fields [13,16,23]. This improved model avoids two major defects of the traditional DMPs model. One is the large acceleration generated when the goal is close to the initial position; the other is the mirrored trajectory generated when the sign of $x_g - x_0$ is opposite to that of the demonstration.
Any discrete movement in the DMPs model can be represented with the transformation system,
\[ \tau \dot{v} = K(x_g - x) - D v - K(x_g - x_0)s + K f(s), \tag{1} \]
\[ \tau \dot{x} = v, \tag{2} \]
and the canonical system,
\[ \tau \dot{s} = -\alpha s. \tag{3} \]
In the above equations, $x$ and $v$ represent the current position and velocity of the system; $x_0$ and $x_g$ represent the initial position and goal position; $D$ is the damping term; $K$ acts as the spring constant; $\tau$ is the temporal scaling factor of the movement duration; $f$ is the non-linear function that allows arbitrarily complex movements to be generated; $s$ is the phase variable; and $\alpha$ is a predefined constant.
In particular, the non-linear function is defined as
\[ f(s) = \frac{\sum_{i=1}^{N} \omega_i \psi_i(s)\, s}{\sum_{i=1}^{N} \psi_i(s)}, \tag{4} \]
where $\omega_i$ are the weights and $\psi_i(s)$ are Gaussian basis functions, $N$ in total. Each $\psi_i(s)$ is computed by
\[ \psi_i(s) = \exp\left(-h_i (s - c_i)^2\right), \tag{5} \]
with width $h_i$ and center $c_i$. The function $f$ depends on the phase $s$, which is obtained from the canonical system with $s(0) = 1$ as its initial state. Because $s$ decays towards zero, the influence of $f$ vanishes at the end of a movement.
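The phase variable and the forcing term described above can be illustrated with a minimal Python sketch. The helper names (`phase`, `make_basis`, `forcing`), the value $\alpha = 4$, and the rule for placing centers and widths are assumptions chosen for illustration and are not taken from the paper.

```python
import math


def phase(t, alpha=4.0, tau=1.0):
    """Closed-form solution of tau * ds/dt = -alpha * s with s(0) = 1."""
    return math.exp(-alpha * t / tau)


def make_basis(n, alpha=4.0):
    """Assumed layout: centers evenly spaced in time, then mapped to phase."""
    centers = [math.exp(-alpha * i / (n - 1)) for i in range(n)]
    # Assumed width rule: inverse squared distance to the next center.
    widths = [1.0 / (centers[i + 1] - centers[i]) ** 2 for i in range(n - 1)]
    widths.append(widths[-1])  # reuse the last width for the final basis
    return centers, widths


def forcing(s, weights, centers, widths):
    """Normalized weighted sum of Gaussian bases, scaled by the phase s."""
    psi = [math.exp(-h * (s - c) ** 2) for h, c in zip(widths, centers)]
    total = sum(psi)
    if total == 0.0:  # every Gaussian underflowed; treat the forcing as zero
        return 0.0
    return sum(w * p for w, p in zip(weights, psi)) * s / total


centers, widths = make_basis(10)
weights = [2.0] * 10  # arbitrary illustrative weights
print(forcing(phase(0.0), weights, centers, widths))  # forcing at the start, s = 1
print(forcing(0.0, weights, centers, widths))         # forcing when the phase is 0
```

Because the forcing term is multiplied by $s$, it is exactly zero when the phase reaches zero, which is the property that lets the point attractor take over at the end of the movement.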

Learning and Generalization Process of DMPs
Learning the demonstrated movements and generalizing them to new situations are the ultimate goals of DMPs. As introduced with the DMPs model, the role of the nonlinear function term $f(s)$ is to generate arbitrarily complex movements while preserving the shape of the demonstrated trajectory. In particular, the weight parameter $\omega_i$ is the core quantity that is learned from a given trajectory during the DMPs learning process.
Given the displacement sequence $x_{demo}(t)$ of a demonstration trajectory, where the corresponding time is $t \in \{\Delta t, 2\Delta t, \cdots, n\Delta t\}$ and $\Delta t$ represents the step size, we can easily obtain the velocity sequence $\dot{x}_{demo}(t)$ and the acceleration sequence $\ddot{x}_{demo}(t)$. Rearranging Equation (1), substituting $v = \tau \dot{x}$ from Equation (2), and setting the initial position to $x_0 = x_{demo}(0)$ and the final position to $x_g = x_{demo}(n\Delta t)$, we can calculate the non-linear function sequence by
\[ f_{demo}(s) = \frac{\tau^2 \ddot{x}_{demo} + D \tau \dot{x}_{demo}}{K} - (x_g - x_{demo}) + (x_g - x_0)s. \tag{6} \]
Additionally, since the canonical system can be integrated in closed form, $s(t) = \exp(-\alpha t / \tau)$, the phase $s$ can be calculated from the constants $\alpha$ and $\tau$. Thus, the non-linear function sequence $f_{demo}$ can be easily obtained.
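As a hedged illustration of this step, the sketch below computes the target forcing sequence from a sampled one-dimensional demonstration by rearranging Equation (1). The function names, the finite-difference scheme, the gains $K = 100$ and $D = 20$, and the choice of $\tau$ equal to the demonstration duration are all assumptions made for the example.

```python
import math


def finite_diff(seq, dt):
    """Central differences inside, one-sided differences at both ends."""
    n = len(seq)
    out = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        out.append((seq[hi] - seq[lo]) / ((hi - lo) * dt))
    return out


def forcing_target(x_demo, dt, K=100.0, D=20.0, alpha=4.0):
    """Return the phase sequence and the target forcing sequence f_demo."""
    tau = (len(x_demo) - 1) * dt  # assumption: tau equals the demo duration
    x0, xg = x_demo[0], x_demo[-1]
    xd = finite_diff(x_demo, dt)   # velocity sequence
    xdd = finite_diff(xd, dt)      # acceleration sequence
    s_seq, f_seq = [], []
    for i, x in enumerate(x_demo):
        s = math.exp(-alpha * i * dt / tau)  # closed-form canonical system
        # Equation (1) solved for f, using v = tau * dx/dt.
        f = (tau ** 2 * xdd[i] + D * tau * xd[i]) / K - (xg - x) + (xg - x0) * s
        s_seq.append(s)
        f_seq.append(f)
    return s_seq, f_seq


# Illustrative demonstration: a minimum-jerk profile from 0 to 1 over 1 s.
dt = 0.01
demo = [10 * u ** 3 - 15 * u ** 4 + 6 * u ** 5 for u in (i * dt for i in range(101))]
s_seq, f_seq = forcing_target(demo, dt)
print(len(s_seq), len(f_seq))
```

A demonstration that never moves yields a target forcing of zero at every sample, which is a quick sanity check on the rearrangement.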
Furthermore, the learning problem can be transformed into a function approximation problem. The purpose of the DMPs learning framework is to determine the weight parameters $\omega_i$ in Equation (4) so that the values of $f$ are close to $f_{demo}$. This problem can be addressed with a regression method such as locally weighted regression or the least mean square method. Equation (4) can be converted to the form of a linear equation
\[ f(s) = \boldsymbol{\Phi}(s)^{T} \boldsymbol{\omega}, \tag{7} \]
where
\[ \Phi_i(s) = \frac{\psi_i(s)\, s}{\sum_{j=1}^{N} \psi_j(s)}, \qquad \boldsymbol{\omega} = [\omega_1, \ldots, \omega_N]^{T}. \tag{8} \]
Based on the minimum error criterion
\[ J = \sum_{s} \left( f_{demo}(s) - f(s) \right)^2, \tag{9} \]
the optimal weight parameters $\omega_i$ of the system are obtained when $J$ takes its minimum value. Following the above steps, we obtain the weight sequence of the demonstrated movement, which can be used to generate new motions in a new environment with the same motion characteristics as the demonstration.
The generalization process of DMPs is the reverse of the learning process. A movement plan in a new environment can be generated by reusing the obtained weight parameters $\omega_i$, specifying the desired start point $x_0$ and goal point $x_g$, and integrating the canonical system from the initial state $s(0) = 1$. The non-linear function $f$, which is driven by the phase variable, in turn perturbs the linear spring-damper system to generate the desired attractor landscape. By rearranging Equations (1) and (2), the displacement $x$ and velocity $\dot{x}$ of the generalized trajectory can be computed with a point-by-point iterative method. The whole learning and generalization process is illustrated in detail in Figure 1.
Furthermore, the learning and generalization process of DMPs can also be extended to motion learning with multiple degrees of freedom. In this case, all the motions are coupled in time, but each degree of freedom has its own non-linear function and dynamical system. Sharing the same canonical system keeps the degrees of freedom synchronized in time, while the separate dynamical systems let each degree of freedom keep its own motion characteristics. This process is described in detail in Figure 2.
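The learning and generalization steps can be sketched end to end for one degree of freedom. This is a minimal illustration under stated assumptions rather than the authors' implementation: the weights are fitted with a per-basis locally weighted regression, the system is integrated with explicit Euler steps, and the gains, basis layout, rollout duration of $2\tau$, and all function names are hypothetical choices.

```python
import math

K, D, ALPHA = 100.0, 20.0, 4.0  # assumed gains; D = 2*sqrt(K) is critical damping


def make_basis(n):
    centers = [math.exp(-ALPHA * i / (n - 1)) for i in range(n)]
    widths = [1.0 / (centers[i + 1] - centers[i]) ** 2 for i in range(n - 1)]
    widths.append(widths[-1])
    return centers, widths


def forcing(s, w, centers, widths):
    psi = [math.exp(-h * (s - c) ** 2) for h, c in zip(widths, centers)]
    return sum(wi * p for wi, p in zip(w, psi)) * s / (sum(psi) + 1e-12)


def finite_diff(seq, dt):
    n = len(seq)
    return [(seq[min(i + 1, n - 1)] - seq[max(i - 1, 0)])
            / ((min(i + 1, n - 1) - max(i - 1, 0)) * dt) for i in range(n)]


def learn(x_demo, dt, n_basis=20):
    """Fit one weight per basis by locally weighted regression (assumed variant)."""
    n = len(x_demo)
    tau = (n - 1) * dt
    x0, xg = x_demo[0], x_demo[-1]
    xd = finite_diff(x_demo, dt)
    xdd = finite_diff(xd, dt)
    s = [math.exp(-ALPHA * i * dt / tau) for i in range(n)]
    f = [(tau ** 2 * xdd[i] + D * tau * xd[i]) / K - (xg - x_demo[i]) + (xg - x0) * s[i]
         for i in range(n)]
    centers, widths = make_basis(n_basis)
    w = []
    for c, h in zip(centers, widths):
        psi = [math.exp(-h * (sk - c) ** 2) for sk in s]
        num = sum(p * sk * fk for p, sk, fk in zip(psi, s, f))
        den = sum(p * sk * sk for p, sk in zip(psi, s))
        w.append(num / (den + 1e-12))
    return w, centers, widths, tau


def generalize(w, centers, widths, x0, xg, tau, dt):
    """Integrate the transformation and canonical systems point by point."""
    x, v, s = x0, 0.0, 1.0
    traj = [x]
    for _ in range(round(2.0 * tau / dt)):  # run past tau so the phase decays
        f = forcing(s, w, centers, widths)
        v_dot = (K * (xg - x) - D * v - K * (xg - x0) * s + K * f) / tau
        x_dot = v / tau
        x, v = x + x_dot * dt, v + v_dot * dt
        s += -ALPHA * s / tau * dt
        traj.append(x)
    return traj


dt = 0.005
demo = [10 * u ** 3 - 15 * u ** 4 + 6 * u ** 5 for u in (i * dt for i in range(201))]
w, centers, widths, tau = learn(demo, dt)
same_goal = generalize(w, centers, widths, 0.0, 1.0, tau, dt)
new_goal = generalize(w, centers, widths, 0.0, 2.0, tau, dt)
print(round(same_goal[-1], 3), round(new_goal[-1], 3))
```

For several degrees of freedom, the same phase sequence would be shared while `learn` and the transformation system are applied to each coordinate separately.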

Overall Framework of DMPs-DPF Approach
Although DMPs can generalize the learned skills to new environments, obstacles on the re-planned motion path remain a difficult problem. Here, we only consider stationary obstacles on the motion path. Given the excellent performance of the dynamic potential field (DPF) in obstacle avoidance, it is natural to combine the DPF with the DMPs model to extend its functionality. In essence, the DMPs model and the DPF model are both force field models guided by attractors, and both can be described by a set of differential equations. Therefore, it is reasonable and feasible to insert the DPF force field model, after suitable abstraction and correction, into the DMPs model.
When the DPF term is inserted into the transformation system as a coupling term, the transformation system of DMPs can be described as
\[ \tau \dot{v} = K(x_g - x) - D v - K(x_g - x_0)s + K f(s) + p(o, x, \nu), \tag{10} \]
where $o$ is the position of the obstacle, $x$ is the current position of the system, $\nu$ is the relative velocity of the obstacle and the system, and $p(o, x, \nu)$ is the coupling term. The modified DMPs are called "DMPs-DPF" for short.
The block diagram of the DMPs-DPF approach is shown in Figure 3. It mainly contains two parts (the DMPs model and the DPF coupling term) with three functions (learning, obstacle avoidance, and generalization). The DPF coupling term is used to avoid obstacles on the motion path. This term is inserted into the DMPs model through the transformation system shown in Equation (10). With this modification alone, the robot can follow steps similar to those shown in Figure 1 to learn and generalize movements, and it can also avoid any obstacles it encounters.
Compared with the DMPs approach, the main difference in the DMPs-DPF approach is the structure of the transformation system. With this modification alone, the robot gains an obstacle avoidance function while retaining the original learning and generalization functions.

Description of the DPF Coupling Term
Based on the relative position and speed of the system and the obstacle, the coupling term feeds the repulsive force generated by the obstacle back to the system in real time. If the system is close to the obstacle, the repulsive force increases; otherwise, it decreases. Based on this principle, the system can successfully avoid the obstacle. In particular, compared to $p(o, x)$, which only considers the relative position of the system and the obstacle, the coupling term $p(o, x, \nu)$ also considers their relative speed. This avoids the drawbacks of an unsmooth obstacle avoidance trajectory and an incoherent speed profile.
According to the size and shape of the obstacle, the obstacle avoidance problem can be divided into two categories. One is an obstacle with negligible size and shape; the other is an obstacle with non-negligible shape and size. The coupling terms $p(o, x, \nu)$ in these two cases are calculated as follows.
(1) An obstacle with negligible size and shape
When the system operates in an environment with a single small obstacle, the obstacle can be treated as a point with no size and shape. In this situation, the motion analysis of the system and obstacle is drawn in Figure 4.
In the motion analysis diagram, $O$ is the simplified center point of the obstacle, $x$ is the current position of the system, $v$ is the current velocity of the system, $a_p$ is the acceleration generated by the coupling term $p(o, x, \nu)$, and $\varphi$ is the angle between the velocity vector and the relative position vector, which can be computed with the following equation
\[ \varphi = \arccos\left( \frac{(o - x)^{T} v}{|o - x|\,|v|} \right). \tag{11} \]
To realize the most efficient obstacle avoidance, the acceleration term should lie in the plane determined by the obstacle $O$, the position $x$, and the velocity $v$, at an angle of 90° from the velocity $v$ and pointing away from the obstacle.
Because the dynamic potential field is matched to the velocity, the magnitude of the acceleration generated by $p(o, x, \nu)$ should be consistent with the velocity of the system. Thus, the coupling term should include the factor constructed by the following equations
\[ R v, \tag{12} \]
\[ r = (o - x) \times v, \tag{13} \]
where $r$ is the cross product of the relative position vector and the velocity vector, and $R$ is the rotation matrix with $r$ as the rotation axis and $\pi/2$ as the rotation angle. The rotation matrix can be obtained with the Rodrigues rotation formula. The processed Equation (12) can be represented as
\[ R v = v \cos\frac{\pi}{2} + (r_0 \times v) \sin\frac{\pi}{2} + r_0 (r_0 \cdot v)\left(1 - \cos\frac{\pi}{2}\right) = r_0 \times v + r_0 (r_0 \cdot v), \tag{14} \]
where $r_0$ is the unit vector of $r$, which can be calculated by $r_0 = r/|r|$. Besides the impact of velocity, the coupling term $p(o, x, \nu)$ also takes the angle $\varphi$ and the relative distance $d$ into account. This factor in the coupling term can be calculated by
\[ \psi = \varphi \exp(-\beta \varphi) \exp(-k d), \tag{15} \]
where $\psi$ is the control factor, $\beta$ is the angle coefficient used to adjust the influence of the angle on the coupling term, and $k$ is the distance coefficient used to adjust the influence of the relative distance on the coupling term. Specifically, when the system is moving slowly or the obstacle in the path is small, it is better to choose large $\beta$ and $k$. On the other hand, when the system is moving quickly or the obstacle is large, it is better to choose small $\beta$ and $k$. Following these rules, the effects of angle and distance on the value of the coupling term can be represented effectively.
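The geometry of this factor can be checked numerically. The sketch below is an illustration under assumptions: the helper names are hypothetical, the rotation is the standard Rodrigues formula for a unit axis, and the control factor is written as the product of $\varphi$, $\exp(-\beta\varphi)$, and $\exp(-kd)$ as described in the text.

```python
import math


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def norm(a):
    return math.sqrt(dot(a, a))


def rodrigues(v, axis, theta):
    """Rotate v about the unit vector `axis` by the angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    axv = cross(axis, v)
    adv = dot(axis, v)
    return [v[i] * c + axv[i] * s + axis[i] * adv * (1.0 - c) for i in range(3)]


def steering_angle(o, x, v):
    """Angle between the relative position vector o - x and the velocity v."""
    d_vec = [oi - xi for oi, xi in zip(o, x)]
    cos_phi = dot(d_vec, v) / (norm(d_vec) * norm(v))
    return math.acos(max(-1.0, min(1.0, cos_phi)))


def control_factor(phi, d, beta, k):
    return phi * math.exp(-beta * phi) * math.exp(-k * d)


o, x, v = [1.0, 0.2, 0.0], [0.0, 0.0, 0.0], [1.0, 0.0, 0.0]  # illustrative values
d_vec = [oi - xi for oi, xi in zip(o, x)]
r = cross(d_vec, v)
r0 = [c / norm(r) for c in r]
Rv = rodrigues(v, r0, math.pi / 2.0)
phi = steering_angle(o, x, v)
print(Rv, phi, control_factor(phi, norm(d_vec), beta=2.0, k=1.0))
```

Since $r$ is perpendicular to $v$ by construction, the term $r_0 (r_0 \cdot v)$ is zero and the rotated vector reduces to $r_0 \times v$: it keeps the magnitude of $v$, is perpendicular to it, and points away from the obstacle.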
In summary, taking the velocity, angle, and relative distance into account, the coupling term is constituted by multiplying Equations (14) and (15) and a coefficient $\gamma$ that directly adjusts the amplitude of the force field. The basic form of the coupling term is
\[ p(o, x, \nu) = \gamma R v \psi = \gamma R v\, \varphi \exp(-\beta \varphi) \exp(-k d). \tag{16} \]
It is worth noting that Equation (16) is only suitable for a single small obstacle, because it treats the obstacle as a point with no size and shape.
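To show how the point-obstacle coupling term acts inside the transformation system, the following sketch integrates a three-dimensional system with and without the term. It is a simplified illustration, not the authors' implementation: the forcing term $f$ is set to zero so that the nominal path is a straight line, and the parameter values ($\gamma = 1000$, $\beta = 20/\pi$, $k = 5$, $K = 100$, $D = 20$) and function names are assumptions.

```python
import math


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def norm(a):
    return math.sqrt(dot(a, a))


def coupling_term(o, x, v, gamma=1000.0, beta=20.0 / math.pi, k=5.0, eps=1e-9):
    """Point-obstacle coupling term: gamma * R v * phi * exp(-beta phi) * exp(-k d)."""
    d_vec = [oi - xi for oi, xi in zip(o, x)]
    d, speed = norm(d_vec), norm(v)
    r = cross(d_vec, v)
    r_norm = norm(r)
    if d < eps or speed < eps or r_norm < eps:
        return [0.0, 0.0, 0.0]  # rotation axis undefined; apply no coupling
    phi = math.acos(max(-1.0, min(1.0, dot(d_vec, v) / (d * speed))))
    r0 = [c / r_norm for c in r]
    rv = cross(r0, v)  # rotation of v by pi/2 about r0, valid because r0 is normal to v
    psi = phi * math.exp(-beta * phi) * math.exp(-k * d)
    return [gamma * psi * c for c in rv]


def rollout(x0, xg, obstacle=None, K=100.0, D=20.0, alpha=4.0,
            tau=1.0, dt=0.001, duration=3.0):
    """Euler integration of the transformation system with f = 0 (assumption)."""
    x, v, s = list(x0), [0.0, 0.0, 0.0], 1.0
    path = [list(x)]
    for _ in range(round(duration / dt)):
        p = coupling_term(obstacle, x, v) if obstacle is not None else [0.0, 0.0, 0.0]
        v_dot = [(K * (xg[j] - x[j]) - D * v[j] - K * (xg[j] - x0[j]) * s + p[j]) / tau
                 for j in range(3)]
        x = [x[j] + v[j] / tau * dt for j in range(3)]
        v = [v[j] + v_dot[j] * dt for j in range(3)]
        s += -alpha * s / tau * dt
        path.append(list(x))
    return path


def min_distance(path, o):
    return min(norm([a - b for a, b in zip(pt, o)]) for pt in path)


start, goal = [0.0, 0.0, 0.0], [1.0, 0.0, 0.0]
obstacle = [0.5, 0.02, 0.0]  # illustrative point obstacle just off the straight path
free = rollout(start, goal)
avoid = rollout(start, goal, obstacle)
print(min_distance(free, obstacle), min_distance(avoid, obstacle))
print(avoid[-1])
```

The guard that returns zero when the rotation axis is undefined also covers the case in which the system heads exactly at the obstacle, where the angle and the cross product both vanish and this basic form produces no avoidance force.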
(2) An obstacle with non-negligible shape and size
When encountering an obstacle with non-negligible shape and size, as illustrated in Figure 5, we choose the nearest point on the obstacle to approximate the obstacle's boundary. In this situation, the coupling term can be defined as
\[ p(o, x, \nu) = \gamma_p R v \psi_p + \gamma_o R v \psi_o + \gamma_d R v \exp(-k d). \tag{17} \]
In Equation (17), the first term $\gamma_p R v \psi_p$ is generated by the point $P$, which represents the point on the obstacle nearest to the system; the selection of $P$ changes in real time with the relative location of the system and the obstacle. The second term $\gamma_o R v \psi_o$ is generated by the centroid of the obstacle, which is fixed during the whole process. The third term $\gamma_d R v \exp(-k d)$ relies only on the relative distance; its purpose is to guarantee that the coupling term generates enough force to avoid the obstacle even if the first two terms are close to 0 when the system is moving directly towards the obstacle.
Equation (17) is the final form of the DPF-based negative feedback coupling term. It feeds complex obstacle information back into the DMPs model in a timely manner and in a simple form, so as to achieve stable and smooth obstacle avoidance behavior. Additionally, the learning and generalization process based on DMPs-DPF is similar to the process based on DMPs alone. Once the transformation system is combined with the corresponding coupling term, the robot can follow similar steps to learn demonstrated tasks and generalize them to a new environment, even one with obstacles on its motion path.

Robot Experiment
The DMPs-DPF approach to learning and generalizing the demonstrated motion while avoiding obstacles was validated on the wheelchair mounted robotic arm (WMRA) by performing the common domestic task of placing a cup on the table.
The WMRA in our laboratory is a typical service robot mainly composed of an electric wheelchair (Vermeiren, Suzhou, Jiangsu Province, China) and a 6-DOF JACO robotic arm (Kinova, Montreal, QC, Canada) mounted on its front right side. It combines the mobility of an electric wheelchair with the manipulation capability of a robotic arm [43,44]. Figure 6 shows a photograph of the WMRA with the JACO robotic arm retracted on its shoulder joint. In this paper, we only focus on the movement of the hand, which is the end-effector of the JACO robotic arm, to accomplish a set of specific tasks.

Task Demonstration
With the help of appropriate teaching interfaces, end users can easily teach a service robot to complete various tasks that exist in the domestic environment. At present, the common teaching interfaces are: (1) handle control mode; (2) kinesthetic teaching; (3) directly recording human motions, for example with visual, exoskeleton, or wearable sensors; (4) teleoperation mode [43]. Considering that the WMRA is mainly used in a home environment and that precise demonstration motion information is required in the learning framework, we choose the default handle control mode. This mode requires flexible manipulation of the handle, which may cause some trouble for the elderly or disabled but is easy for teachers with good motor ability.
To facilitate the acquisition of demonstrated motion information, we also designed a demonstration interface in Visual C++ with Visual Studio 2013 (Microsoft Corporation, Redmond, WA, US) based on the original application programming interfaces (APIs) of the JACO robotic arm. With this interface, we only have to gather a few key points of the demonstration trajectory; the robot then moves along the gathered key points and saves the accurate motion information of the complete trajectory to a specified file. In this way, the Cartesian coordinates of the hand and the motion state of the fingers are stored as the original demonstration motion information. Note that this motion information is obtained by reading the relevant API information of the JACO robotic arm and is measured in the arm's default coordinate system.

An Experiment of Placing a Cup on the Table
The experiment of placing a cup on the table brings a distant cup to the front of the user. It is first carried out in a scene with no obstacle, to demonstrate that the JACO robotic arm in our framework can learn the demonstrated task and replay it in a new environment with a similar trajectory shape. On the basis of this experiment, we then modify the initial settings of the scene by placing a small obstacle and a large obstacle, in separate experiments, on the motion path, in order to verify that the DMPs-DPF learning framework can also effectively avoid obstacles.
(1) Placing a cup with no obstacle
In this experiment, the WMRA is parked in front of the table and stays still during the whole experiment. The JACO robotic arm is predefined in the right-hand configuration, which is consistent with the operating habits of most people. After these initial settings, we carry out the task demonstration and subsequently generalize the learned task to new locations. Both the demonstrated and the generalized locations are randomly selected on the table, with the one requirement that they lie in the workspace of JACO. The detailed positions are shown in Figure 7, marked with round numbered papers, and the Cartesian coordinates of each paper position are given in advance. In this experiment scene, No. 1 indicates the initial position of the cup; No. 2 indicates the goal position for task demonstration; and No. 3 and No. 4 are two new goal positions for replaying the learned task, which are clearly different from the demonstrated one.
The complete demonstration process of placing a cup on the table with the help of the WMRA is shown in Figure 8. During this process, the Cartesian coordinates of the JACO robotic arm are recorded as the original raw motion information. The task demonstration process mainly contains two phases: the grasp motion phase (from step a to step c) and the mobile motion phase (from step d to step f). It is worth noting that the initial positions of the cup and the JACO robotic arm are unchanged during the whole experiment. For this reason, the grasp motion phase remains the same for any goal position, and the only difference is the mobile motion phase.
To obtain a clearer comparison, we consider only the mobile motion phase in the following learning and generalization process and comparatively analyze the task replaying with different goal positions. The complete demonstrated trajectory toward the No. 2 goal position is drawn in Figure 9a. In this figure, the blue straight line represents the grasp motion phase and the red straight line represents the mobile motion phase.
During the learning process, the recorded motion information is used to compute the non-linear function $f_{demo}$ in Equation (5). Its weight parameters $\omega_i$ are then calculated with the least mean square method using Equations (6)-(8) of the dynamical system. For the related parameters in our system, we made the following choices. The $\alpha$ in the canonical system is set to $-\log(0.01)$, so that the phase has converged by ninety-nine percent at $t = \tau$. The spring constant K in the transformation system is set to 100, and the damping D is set to 20 to make the system critically damped ($D = 2\sqrt{K}$). Additionally, the number of Gaussian basis functions in the non-linear function is set to 4.
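
As a rough illustration of this learning step, the sketch below fits the four basis-function weights for a single coordinate. It is written under assumptions that are not confirmed by this section: the transformation system is taken to be $\tau\dot{v} = K(g - x) - Dv + (g - x_0)f(s)$ with $\tau\dot{x} = v$, which may differ from the form of Equation (5); the basis centers and widths follow a common heuristic; and `np.linalg.lstsq` stands in for the least mean square fit of Equations (6)-(8). The function names are hypothetical.

```python
import numpy as np

ALPHA = -np.log(0.01)   # canonical system: s(tau) = exp(-ALPHA) = 0.01
K, D = 100.0, 20.0      # D = 2*sqrt(K) gives critical damping
N_BASIS = 4

def canonical_phase(t, tau):
    """Phase variable s(t) = exp(-ALPHA * t / tau), decaying from 1 towards 0."""
    return np.exp(-ALPHA * t / tau)

def basis_features(s):
    """Normalized Gaussian activations times s, shape (len(s), N_BASIS)."""
    centers = np.exp(-ALPHA * np.linspace(0.0, 1.0, N_BASIS))  # assumed placement
    widths = 1.0 / np.gradient(centers) ** 2                   # assumed heuristic
    psi = np.exp(-widths * (s[:, None] - centers) ** 2)
    return psi * s[:, None] / psi.sum(axis=1, keepdims=True)

def learn_weights(x_demo, dt):
    """Fit the forcing-term weights for one recorded coordinate."""
    tau = dt * (len(x_demo) - 1)
    t = np.arange(len(x_demo)) * dt
    xd = np.gradient(x_demo, dt)    # numerical velocity
    xdd = np.gradient(xd, dt)       # numerical acceleration
    x0, g = x_demo[0], x_demo[-1]
    # Forcing term that would reproduce the demonstration (assumed DMP form).
    f_target = (tau**2 * xdd + D * tau * xd - K * (g - x_demo)) / (g - x0)
    features = basis_features(canonical_phase(t, tau))
    weights, *_ = np.linalg.lstsq(features, f_target, rcond=None)
    return weights, tau
```

For a three-dimensional hand trajectory, one such fit would be run per Cartesian coordinate, with all coordinates sharing the same phase variable.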
In the replaying phase, we update Equation (5) with the new goal position. In addition, the weight parameters $\omega_i$ obtained in the learning process are used to calculate the corresponding non-linear function. This is the core step for generating any new trajectory with a similar shape style. Finally, we use the point-by-point iterative method to calculate the motion information of the new generalized trajectory. In this step, the precision threshold is set to 0.02 and the number of iterations is set to 5. Following the above steps, the generalized mobile motion trajectories towards different goal positions are obtained and drawn in Figure 9b for illustration. With these generalized mobile motion phases integrated with the grasp motion phase, the WMRA can easily accomplish the specified task of placing the cup at the different goal positions marked No. 3 and No. 4.
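
The replay step can be sketched as a point-by-point Euler integration towards a new goal. As before, this is a minimal sketch under assumptions: the transformation system $\tau\dot{v} = K(g - x) - Dv + (g - x_0)f(s)$ and the basis heuristic are assumed rather than taken from Equation (5), the function name `rollout` is hypothetical, and the precision threshold and iteration count mentioned above are not modeled.

```python
import numpy as np

ALPHA = -np.log(0.01)   # same canonical decay as in the learning step
K, D = 100.0, 20.0      # critically damped spring-damper

def rollout(weights, x0, goal, tau, dt):
    """Integrate one coordinate from x0 towards `goal` using learned weights."""
    n_basis = len(weights)
    centers = np.exp(-ALPHA * np.linspace(0.0, 1.0, n_basis))  # assumed placement
    widths = 1.0 / np.gradient(centers) ** 2                   # assumed heuristic
    n_steps = int(round(tau / dt)) + 1
    path = np.empty(n_steps)
    x, v = float(x0), 0.0
    for i in range(n_steps):
        path[i] = x
        s = np.exp(-ALPHA * i * dt / tau)            # phase at the current step
        psi = np.exp(-widths * (s - centers) ** 2)
        f = s * (psi @ weights) / psi.sum()          # non-linear forcing term
        v_dot = (K * (goal - x) - D * v + (goal - x0) * f) / tau
        x += dt * v / tau                            # explicit Euler update
        v += dt * v_dot
    return path

# Example: the same (made-up) weights replayed towards two different goals.
w = np.array([5.0, -3.0, 2.0, 0.0])
to_goal_a = rollout(w, x0=0.1, goal=0.5, tau=2.0, dt=0.002)
to_goal_b = rollout(w, x0=0.1, goal=-0.3, tau=2.0, dt=0.002)
```

In this assumed form, the forcing term is scaled by $(g - x_0)$, so the two trajectories are identical after normalizing by the start-to-goal distance, which is one way to read the "similar shape style" property described in the text.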
In Figure 9b, the green dotted line represents the generalized trajectory towards the No. 3 goal position and the blue dash-dotted line represents the trajectory towards the No. 4 position. For contrast, we also add the original mobile demonstration trajectory, which is represented by the red solid line.
From the above figures, we can conclude that the generalized trajectories converge to the specified positions. This indicates that the WMRA can replay the learned task in the new environment. In particular, both the green dotted line and the blue dash-dotted line have trajectory shapes similar to the original demonstration. This confirms that the basic characteristic of the learning framework is to generate a new trajectory with a similar shape style. This characteristic is significant for the users of the WMRA: with it, the WMRA can accomplish related tasks according to the user's preferred operating habits.
(2) Placing a cup with a small spherical obstacle
Based on the initial settings of the above experiment, we add a small spherical obstacle (a ball with a diameter of 55 mm) at the center of the mobile motion path from the No. 1 position to the No. 3 position. The initial experiment setting is shown in Figure 10a.
Since the size of the spherical obstacle is small, we integrate Equation (10) with Equation (16) as the modified transformation system. Using the initial demonstration trajectory obtained in experiment (1), we follow the same learning and generalization steps to obtain the new obstacle avoidance trajectory and draw the related trajectories in Figure 10b. With this obstacle avoidance trajectory, the WMRA easily avoids the spherical obstacle during its mobile motion phase and successfully accomplishes the task.
In Figure 10b, the green dotted line represents the new generalized obstacle avoidance trajectory towards the No. 3 goal position. For contrast, the initial generalization trajectory and the original demonstration trajectory are also added and represented with a purple solid line and a red dash-dotted line, respectively. Moreover, the three-dimensional models of the spherical obstacle and the cup are drawn in this figure for illustration. The cup position is randomly selected on the table, excluding the No. 1 position, so as to avoid coinciding with the related trajectories. Note that the trajectory in this paper is the path of the JACO robotic arm, which grasps the cup at its middle height (shown in Figure 8b). For this reason, we also need to consider half the height of the cup to avoid obstacles in the actual experiment.
In Figure 10b, the initial part of the green dotted line coincides with the purple solid line. This indicates that the generalized trajectory initially remains the same as the one generated without the obstacle. When the system approaches the spherical obstacle, the generalized trajectory changes immediately to avoid it. After that, the generalized trajectory smoothly approaches the original trajectory and converges to the specified goal position. In particular, the shape of the obstacle avoidance trajectory is similar to the original demonstration trajectory except for the avoidance part. This verifies that the DMPs-DPF learning framework can not only avoid the obstacles encountered in its motion path but also maintain the demonstrated motion style while doing so.
(3) Placing a cup with a large cuboid obstacle
The initial experiment setting, shown in detail in Figure 11a, is similar to experiment (2). The only difference is that we replace the spherical obstacle with a large cuboid obstacle (a carton with length 100 mm, width 100 mm, and height 148 mm) on the motion path. Specifically, the carton is located at the center of the mobile motion path from location 1 to location 4, and its edges are parallel to the axes of the JACO robotic arm coordinate system. Since the shape of the obstacle cannot be ignored, we integrate Equation (10) with Equation (17) as the modified transformation system. After this modification, we obtain the new obstacle avoidance trajectory, which is drawn in Figure 11b. Combining this obstacle avoidance trajectory with the grasp demonstration trajectory, the WMRA easily accomplishes the task of placing a cup with a large cuboid obstacle on its path.
In Figure 11b, the new generalized obstacle avoidance trajectory towards the No. 4 goal position is represented with the green dotted line. The original demonstration trajectory and the initial generalization trajectory are also added in this figure and represented with the red solid line and the blue dash-dotted line, respectively. In addition to these trajectories, the three-dimensional models of the cuboid obstacle and the cup are drawn for illustration. As in experiment (2), the position of the cup is randomly selected.
In Figure 11b, the initial part of the generalized obstacle avoidance trajectory is almost the same as the original demonstration trajectory. Compared to the obstacle avoidance trajectory in Figure 10b, the trajectory shape in Figure 11b changes considerably when the system approaches the obstacle. This may be caused by the large repulsive force generated when the shape of the obstacle is taken into account. Even with this large fluctuation, the trajectory still converges to the goal position with a similar shape style. This experiment further confirms the validity and advantage of the proposed DMPs-DPF approach.

Conclusions
In this paper, we describe a general learning and generalization framework based on DMPs-DPF that allows the WMRA service robot to autonomously accomplish common domestic tasks. With this framework, we only have to demonstrate the related tasks to the WMRA. The WMRA can then learn the tasks and generalize them to a new environment, even when obstacles exist on the motion path. Experiments of placing a cup on the table, with and without an obstacle on the motion path, show that our learning framework helps the robot accomplish the learned tasks and generate motion trajectories similar to the demonstrated one. Even when an obstacle exists on the path, the shape style of the generalized trajectory remains similar except for the obstacle avoidance part. This demonstrates the validity of the proposed approach.
It is important to emphasize that the approach is not restricted to the WMRA service robot. Any type of service robot that can capture the end-effector's Cartesian coordinate information and the related environment state can substitute for the WMRA to accomplish the demonstrated task. Future work will focus on the management and extension of the related demonstration task library.

Figure 2. Multiple degrees of freedom schematic in DMPs generalization.

Figure 3. Block diagram of dynamic movement primitives combined with dynamic potential field (DMPs-DPF) approach.

Figure 4. Motion analysis diagram of the system and obstacle.

Figure 5. Diagram of an obstacle with non-negligible shape and size.

Figure 6. A physical picture of the wheelchair mounted robotic arm (WMRA).
Appl. Sci. 2019, 9.

Figure 7. Experiment setting of placing a cup on the table. Cup initial position: 1; demonstration goal position: 2; new goal positions: 3, 4; demonstration 1: from 1 to 2; replay task 1: from 1 to 3; replay task 2: from 1 to 4.

Figure 8. Demonstration process of placing a cup on the table: (a) home configuration; (b) reach the cup; (c) grasp the cup; (d) pick up the cup; (e) move to the goal position; (f) put down the cup.

Figure 10. The experiment of placing a cup with a small spherical obstacle: (a) initial experiment setting with spherical obstacle; (b) generalization trajectory with a spherical obstacle.

Figure 11. Experiment of placing a cup with a large cuboid obstacle: (a) initial experiment setting with cuboid obstacle; (b) generalization trajectory with a cuboid obstacle.

Author Contributions: M.C. wrote the paper; Y.Y. conceived and designed the experiments; Y.L. performed the experiments; M.Z. revised the paper.