Software Sensor to Enhance Online Parametric Identification for Nonlinear Closed-Loop Systems for Robotic Applications

This paper proposes an online direct closed-loop identification method based on a new dynamic sliding mode technique for robotic applications. The estimated parameters are obtained by minimizing the prediction error with respect to the vector of unknown parameters. The estimation step requires knowledge of the actual input and output of the system, as well as the successive estimate of the output derivatives. Therefore, a special robust differentiator based on higher-order sliding modes with a dynamic gain is defined. A proof of convergence is given for the robust differentiator. The dynamic parameters are estimated using the recursive least squares algorithm by the solution of a system model that is obtained from sampled positions along the closed-loop trajectory. An experimental validation is given for a 2 Degrees Of Freedom (2-DOF) robot manipulator, where direct and cross-validations are carried out. A comparative analysis is detailed to evaluate the algorithm’s effectiveness and reliability. Its performance is demonstrated by a better-quality torque prediction compared to other differentiators recently proposed in the literature. The experimental results highlight that the differentiator design strongly influences the online parametric identification and, thus, the prediction of system input variables.


Introduction
The general problem of manipulators still lies in the large number of their physical parameters, which are usually not well known. To correctly build a mathematical model of such systems, different parameter identification techniques exist, which can be divided into two categories: direct and indirect approaches [1]. In indirect approaches, the definition of the identification algorithm depends on the controller expression, whereas direct methods can identify the parameters of the system model independently of the structure of the controller applied to the robot. This study relates to direct methods.
Most of the identification methods defined in the literature use the direct model of the system [1,2]. Formulating the identification problem with such a model can lead to nonlinear optimization. Unfortunately, such a solution suffers from various difficulties, such as multiple feasible regions, each with multiple locally optimal points, specifically when the objective function or any of the constraints is non-convex. This non-convexity issue is more problematic for nonlinear models because methods that provide a coherent initialization step may not be available for many nonlinear model structures. Other methods in the literature are based on neural networks [3][4][5][6]. Identification methods based on nonlinear observers/filters, such as the extended Kalman filter [7], high-gain observers [8], and sliding mode observers [9][10][11][12], have also been proposed. The drawback of these last methods is that they depend on a priori knowledge of the modeled system, which can sometimes be quite complex and inaccurate. Nevertheless, the direct model for mechanical systems is often flat, which promotes the use of the inverse model, since it can be considered linear with respect to a set of dynamic parameters. By relying on the inverse model, the identification problem is reduced to minimizing a prediction error with respect to the unknown parameters. This minimization is performed according to a chosen criterion that is generally of a quadratic form. Thus, the vector of unknown parameters can be obtained through optimization based on different methods that exist in the literature. These include the least-squares (LS) method [13], the weighted least squares (WLS) method [7], the maximum likelihood estimation method [14], and the recursive least squares (RLS) algorithm. The RLS algorithm has a major advantage in that it updates the estimate recursively as each new sample arrives, making online identification easier. 
It has good convergence and provides small estimation errors in the stationary case with normally distributed noise [15]. To further improve the performance of this algorithm, a forgetting factor (FF) term can be included [15,16]. Beyond the application of a classical algorithm such as the RLS, the use of the inverse model also requires a good prior estimation of the state and its derivatives. Therefore, both filtering and differentiation algorithms play a key role in such identification processes. Few studies have investigated or implemented online parametric identification combined with a differentiator, for fear of amplifying the measurement noise. The measurement noise properties are not known beforehand, especially in practice. Thus, the main challenge here is to find a suitable online algorithm that can guarantee a good compromise between the differentiation accuracy and noise rejection. This algorithm is defined as a soft sensor that estimates the successive derivatives of the system state and replaces a real physical sensor. The use of such a sensor makes it possible to reduce the number of speed and acceleration sensors. Indeed, an n-DoF manipulator would otherwise need 2 × n physical sensors. In addition, a software sensor rarely breaks down, never wears out, and does not require calibration. Soft sensors have become increasingly important in various applications, such as the design of controllers, observers [17,18], the sensorless control approach, and diagnostic problems [19]. A limited number of studies have proposed software sensors based on the differentiation algorithm [20,21]. In [21], the authors addressed the problem of online identification for uncertain nonlinear systems based on the state derivative estimation method. 
For their purpose, they have approximated the system model online, using a neural network with some feedback term to compensate for the modeling errors and exogenous disturbances. To achieve the identification process, a comparative study of different differentiation algorithms-including the high gain observer, Levant's first-order sliding mode differentiator, backward difference, and central difference methods-was performed. This study was done with simulation tests, and only a first-order state derivative was computed. In [22][23][24], the authors proposed the parameter identification of a robot manipulator, using a causal Jacobi orthogonal-based algebraic differentiator to compute the joint acceleration from the position measurements. The principle of this kind of differentiator, as proposed by [25], is based on the truncated series of the Taylor expansion signal to be estimated. Although such an algorithm allows efficient attenuation of the noise, it is sensitive to the truncation order, to the size of the sliding window estimation, and especially to the setting of its parameters. All this makes it difficult to set these parameters in order to obtain a good estimate. An alternative differentiator that is based on a higher-order sliding mode can be used. For the state estimation, a robust differentiator based on the sliding mode technique is applied, as in [26][27][28]. In fact, the author uses the well-known Levant's differentiator, the so-called Super Twisting (ST) algorithm [27], and the LS method for online parametric identification of nonlinear systems in the presence of noise. Different new forms of the popular first-order ST differentiator have been proposed and applied with satisfactory results [26][27][28][29][30][31][32][33][34].
In this paper, we propose a soft sensor based on the sliding mode differentiation algorithm to enhance an online dynamic parameter identification procedure for a robot manipulator based on its inverse model. The software sensor outputs are, therefore, injected into an RLS estimator to reach this aim. To the best of our knowledge, only a few software sensors based on the second-order adaptive sliding mode algorithm allow velocity and acceleration to be provided simultaneously [35]. The main features of the proposed online identification approach may be summarized as follows: (1) Estimation of both the velocity and acceleration of each joint by the proposed software sensor, which is robust with respect to the noisy data, without any knowledge of the statistical properties of the noise. (2) A comparative analysis allows assessing the performance of the proposed identification approach with respect to other software sensor-based differentiators. (3) The proposed approach allows identifying model parameters with very good performances. This paper is organized as follows. Section 2 outlines the general principle of the proposed method. Section 3 presents the system application and the different implementation steps of the identification process. Section 4 is dedicated to the experimental results, validation, and discussion.

Manipulator Modeling and Parameter Identification
This paper considers the parameter identification problem of closed-loop nonlinear systems [36]. In fact, the problem is especially interesting for open-loop unstable systems and also for operating safety reasons. The correlation between the input/noise in the closed-loop technique is the major area in which it differs from the open-loop methods. Considering some input/output constraints, it is possible to consider that the performances of the open-loop and closed-loop identification techniques are the same [37,38]. For both techniques, the manipulator input-output signals are recorded as the system tracks some pre-defined trajectories. Thus, the main practical challenge for the identification procedure is the noises that could be provided by the data acquisition chain, sensors, and even the process. The defined closed-loop online identification approach can be applied to any robot manipulator no matter how many inputs/outputs it has. The only condition is that the direct model must be flat [39]. Therefore, exploiting the properties of the inverse model gives us a linear model with respect to a grouping of physical parameters.
The problem then becomes the minimization of a prediction error, where $\tau$, $q$, $\hat{\tau}$, $\hat{\dot{q}}$, $\hat{\ddot{q}}$, and $\hat{\theta}$ are the vectors of system inputs, outputs, input estimation, velocity estimation, acceleration estimation, and parameter estimation, respectively. In return, this approach requires a good prior estimate of the state derivatives. Thus, the choice of the software sensor plays a key role in the identification process. For such sensors, the first issue concerns the noise on their outputs, and the second concerns the estimation accuracy of the velocities and accelerations. For the closed-loop identification approach, the robots are usually position-controlled. Basic controllers such as PD and PID controllers are generally used. Although these controllers do not have good precision compared with others, they are widely used for identification due to their ease of adjustment. Commonly, all the electrical parts relating to the actuators are neglected because their dynamics are very fast compared to those of the mechanical parts. Thus, the relationship between the torque and the current signal provided by the electric motors is modeled by a static gain. The value of this gain is known a priori from the manufacturer data.

Dynamic Model
This section presents the class of nonlinear systems to be identified. Let us define a mathematical dynamic model of an n-link rigid robot in joint space as follows:

$$A(q)\,\ddot{q} + C(q, \dot{q})\,\dot{q} + g(q) + \tau_F = \tau \quad (1)$$

where $q, \dot{q}, \ddot{q} \in \mathbb{R}^{n}$ are the vectors of joint positions, velocities, and accelerations, respectively. $A(q)$ is the inertia matrix of the robot, $C(q, \dot{q})$ is the matrix containing the centrifugal and Coriolis terms, $g(q) \in \mathbb{R}^{n}$ is the gravitational vector, and $\tau_F$ and $\tau$ are the friction force and joint torque vectors, respectively. The friction force vector is defined by the viscous and dry friction terms as follows:

$$\tau_F = F_v\,\dot{q} + F_s\,\mathrm{sign}(\dot{q}) \quad (2)$$

where $F_v, F_s \in \mathbb{R}^{n \times n}$ are constant diagonal matrices representing viscous and Coulomb friction parameters, respectively. Let us rewrite Equation (1) in a linearly parameterized form with respect to a vector of $n_\theta$ dynamic parameters [40]:

$$\tau = H(q, \dot{q}, \ddot{q})\,\theta \quad (3)$$

where $H(q, \dot{q}, \ddot{q}) \in \mathbb{R}^{n \times n_\theta}$ is a regression matrix and $\theta \in \mathbb{R}^{n_\theta \times 1}$ is the vector of model parameters that represents the minimal set of identifiable parameters needed to describe the dynamic model. The vector $\theta$ is obtained by regrouping some of the base parameters by means of the QR decomposition [39] or via some linear relations [2,7].
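To make the linear parameterization concrete, the following minimal sketch builds the regressor of a hypothetical single joint whose inverse model is assumed to be tau = J*qdd + Fv*qd + Fs*sign(qd). The one-joint model, the parameter ordering, the function names, and the numerical values are illustrative assumptions and are not taken from the paper.

```python
def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def regressor_row(qd, qdd):
    """Regressor H of a hypothetical 1-DoF joint, so that tau = H . theta
    with theta = [J, Fv, Fs] (inertia, viscous and Coulomb friction)."""
    return [qdd, qd, float(sign(qd))]


def inverse_model_torque(theta, qd, qdd):
    """Torque predicted by the inverse model, linear in the parameters."""
    return sum(h * p for h, p in zip(regressor_row(qd, qdd), theta))


# Illustrative values only: J = 0.5, Fv = 0.1, Fs = 0.2.
theta = [0.5, 0.1, 0.2]
tau = inverse_model_torque(theta, qd=-2.0, qdd=3.0)
```

Because the torque is linear in theta, stacking one such row per sample yields the over-determined linear system that a least-squares estimator can solve.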

Algorithm for the Proposed Soft Sensor
The main advantage of higher-order sliding mode (HOSM) algorithms is the ease of their real-time implementation, which justifies their successful applications [41][42][43]. However, their major drawback is the gain setting in real time. In fact, the gain setting requires the Lipschitz constant of the derivative of the signal to be accurately known beforehand, which is difficult in practice since the signal to estimate is not necessarily known in advance. Thus, for online applications, it is necessary to adjust the gains each time the base signal changes. Therefore, the major difficulty lies in the gain selection of such differentiators. To fix this problem, various new sliding mode differentiators have been proposed in the literature, where the aim is to define an adaptive form of the classic scheme.
Let the input signal $g(t)$ be a function defined on [0, ∞[ whose derivative is Lipschitz with constant $C$, that is, $|\dot{g}(t_1) - \dot{g}(t_2)| \le C\,|t_1 - t_2|$, where $C$ is unknown. This input signal can be defined as follows: $g(t) = g_0(t) + \xi(t)$, where $g_0(t)$ is an unknown base signal whose (n + 1)th derivative has some Lipschitz constant $C > 0$ and $\xi(t)$ is a bounded Lebesgue-measurable noise with unknown features that satisfies $|\xi(t)| < \varepsilon$, where $\varepsilon$ is sufficiently small.
The basic forms of the first-order ST differentiator and of the nth-order differentiator are defined in [27,28], respectively. The ST algorithm is described by Equation (5). By referring to [26], a basic form of the second-order sliding mode differentiator (2SMD) (for n = 2) is defined by Equation (6): where $\eta_i$, $i \in \{0, \ldots, n\}$, are positive differentiator parameters that depend on the Lipschitz constant $C$ of $g_0^{(n+1)}(t)$, and $y_0$ and $y_1$ are the differentiator outputs. At time t = 0, these differentiators can be initialized as follows: $z_0(0) = g(0)$, $z_i(0) = 0$, $i \in \{1, 2\}$. After finite-time convergence and in the noiseless case, $z_1 = y_0$ is the estimate of $\dot{g}_0(t)$ and $z_2 = y_1$ is the estimate of $\ddot{g}_0(t)$. In systems (5) and (6), the arguments of the sign(.) functions must theoretically vanish in finite time, but this is impossible to achieve in practice due to various sources of inaccuracy, such as measurement errors. In addition, this problem is amplified by the discontinuities introduced by the sign(.) functions in these equations, which produce the so-called chattering effect on the estimated signals. To overcome this problem, the differentiator parameters must be given adequate values that provide good accuracy while minimizing the chattering effect as much as possible. In some previous works [44], the "sign(.)" function was replaced with the "sat(.)" function; this slightly reduces the noise amplification, at the expense of the convergence of the algorithm, which may be lost if the slope value is inadequate.
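As an illustration of the classic scheme discussed above, the sketch below implements a first-order super-twisting differentiator in the form usually attributed to Levant, discretized with an explicit Euler step. It is not the paper's Equation (5) in its notation, nor the proposed P2SMD: the gains 1.5 and 1.1 are the commonly quoted default values, and the Lipschitz bound, the sampling step, the function names, and the sine test signal are assumptions made for the example.

```python
import math


def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def st_differentiator(samples, dt, lipschitz, lam0=1.5, lam1=1.1):
    """Estimate the first derivative of a uniformly sampled signal.

    z0 tracks the signal and z1 tracks its derivative:
        dz0/dt = -lam0 * sqrt(L) * |z0 - g|^(1/2) * sign(z0 - g) + z1
        dz1/dt = -lam1 * L * sign(z0 - g)
    where L is an assumed upper bound on the Lipschitz constant of dg/dt.
    """
    z0, z1 = samples[0], 0.0  # z0(0) = g(0), z1(0) = 0
    estimates = []
    for g in samples:
        estimates.append(z1)
        e = z0 - g
        v0 = -lam0 * math.sqrt(lipschitz) * math.sqrt(abs(e)) * sign(e) + z1
        z0 += dt * v0
        z1 += dt * (-lam1 * lipschitz * sign(e))
    return estimates


def demo_max_error(dt=1e-3, duration=10.0, lipschitz=2.0):
    """Differentiate sin(t) and return the worst error over the last 2 s."""
    n = int(round(duration / dt))
    ts = [k * dt for k in range(n)]
    est = st_differentiator([math.sin(t) for t in ts], dt, lipschitz)
    tail = int(round(2.0 / dt))
    return max(abs(e - math.cos(t)) for e, t in zip(est[-tail:], ts[-tail:]))
```

The fixed value of `lipschitz` is exactly the quantity that is hard to know in advance, which is the motivation given in the text for adaptive or dynamic gains.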
To define an algorithm that offers a compromise between exactness and the noise level of the considered signal, a new scheme of the 2SMD is proposed. This solution imposes dynamic laws on the estimator's settings, which are presented in the following.
Proof of Theorem 1. Let $\sigma_0 = e_0 - g$. With this new coordinate, the first two equations of system (7) can be rewritten as follows:

where

Subtracting $\dot{g}(t)$ from both sides of the second equation of (7), we obtain:

Substituting $\dot{g}(t)$ in Equation (13) with its expression in Equation (10), we obtain the following equation:

with $\tilde{\eta}_0 = \hat{\eta}_0 - \eta_0^{*}$, which is the error between the estimated gain value and the gain value known a priori. Considering now that from both sides of the last equation of (7), we have:

Let us define a Lyapunov function as:

Let us define the equilibrium point such that $X_{eq} = (0, 0, 0)^T$. Then, the derivative of the Lyapunov function defined by Equation (16) is given by:

$\eta_0 = -\Gamma_0 \sigma_0^2$, and we can also obtain:

We have:

Replacing $\hat{\eta}_0$ (see system (9)) in Equation (19) by its expression, the following equality is satisfied:

Consequently, Equation (17) can be rewritten as follows:

To show that $\dot{V}$ is negative, it is sufficient to prove that:

Therefore, let us assume that $|\tilde{\eta}_0| \le \eta_{0M}$, where $\eta_{0M}$ is a positive constant satisfying the following inequality:

To obtain the condition defined by Equation (23), one must choose $\Gamma_0$ such that:

It should be noted that the only condition needed to satisfy inequalities (24) and (25) is a positive and sufficiently high value of $\Gamma_0$.

Remark 1. It can be noticed that $\dot{V}$ is a negative function $\forall (\sigma_0, \sigma_1, \eta_i) \in \mathbb{R}^3$ and that it vanishes when $\sigma_0 = 0$ and $\sigma_1 = 0$. $\dot{V}$ is thus a globally negative semi-definite function on $\mathbb{R}^3$ and a locally negative definite function on $\mathbb{R}^3 \setminus \{(\sigma_0 = 0, \sigma_1 = 0, \eta_i)^T\}$. Consequently, with the defined Lyapunov function, global convergence of the equilibrium point on $\mathbb{R}^3$ is proved, and local asymptotic convergence of the algorithm has also been proven. The global asymptotic convergence of the algorithm could not be demonstrated using LaSalle's invariance principle.

Remark 2.
The convergence of the dynamic gains $\eta_0^{*}$ and $\eta_1^{*}$ is not guaranteed. Instead, these dynamic gains change continuously over time according to the imposed adaptation laws. For all the simulation tests carried out, the dynamic gains remained bounded, with an evolution that depends on the initial values of the differentiator gains.

Remark 3.
The behavior of the P2SMD is equivalent to that of a bandpass filter for small values of the pair of gains $(\Gamma_0, \Gamma_1)$. In fact, the setting of $(\Gamma_0, \Gamma_1)$ is specified for two possible cases: noiseless signals and noisy signals. In the first case, the higher the gain values $(\Gamma_0, \Gamma_1)$, the shorter the convergence time of the algorithm. In the second case, there is a compromise between the convergence time and the noise amplification rate. In fact, the linear terms $\Gamma_i s_i$, $i \in \{0, 1\}$, defined by the gains $(\Gamma_0, \Gamma_1)$ are the key to smoother differentiator outputs compared to the basic scheme. It is therefore necessary not to choose values that are too high.

Recursive Least Squares Estimator
To track the imposed reference trajectories that excite the dynamics of the robot manipulator, the inputs/outputs of the manipulator are sampled. In our case, a recursive least squares (RLS) estimation method is used. Then, via the measurement of the torques and positions of each joint, the root-mean-square residual error of the model is minimized. This error is the difference between the measured torque and its estimated value. Thereafter, a cost function based on this error is used to obtain a parameter identification formula under the assumption that the measurement errors are negligible. The joint positions/torques are measured with a sampling period $T_e$ and these data are collected with N samples over one period $T_e$. We denote the kth sampling time as $t_k$. These measurements can be used to obtain an over-determined set of equations [36]:

$$\Gamma_t = W(q, \hat{\dot{q}}, \hat{\ddot{q}})\,\theta + \rho \quad (25)$$

where $q$, $\hat{\dot{q}}$, and $\hat{\ddot{q}}$ are the vectors of joint positions, estimated velocities, and estimated accelerations, respectively. $\Gamma_t \in \mathbb{R}^{n \times 1}$ and $\rho \in \mathbb{R}^{n \times 1}$ are vectors that represent a sampling of the actual torques $\tau$ and all terms caused by the modeling error, friction, and measurement noise, respectively. $n$ is the number of equations and $n_\theta$ is the number of parameters to be identified. $\rho$ is assumed to have zero mean and to be serially uncorrelated. $W \in \mathbb{R}^{n \times n_\theta}$ is the observation matrix, which is a sampling of the regression matrix $H$ defined by (3). Then, $\Gamma_t$ and $W$ are defined as follows: where $H_j(q(\cdot), \hat{\dot{q}}(\cdot), \hat{\ddot{q}}(\cdot))$ is the jth row of the ($n \times n_\theta$) regressor matrix given by Equation (3).
Finally, the over-determined set of the equation system (25) is solved using the RLS estimator with a constant forgetting factor.
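The recursive solution mentioned above can be sketched as follows for a scalar measurement per update (one regressor row and one torque sample at a time). This is a generic textbook RLS with a constant forgetting factor, not the authors' implementation: the class name, the forgetting factor of 0.99, and the synthetic two-parameter data are assumptions, while the initial covariance of 100 times the identity follows the value reported later in the text.

```python
import math


class RecursiveLeastSquares:
    """RLS estimator with a constant forgetting factor (scalar output)."""

    def __init__(self, n_params, p0=100.0, forgetting=0.99):
        self.theta = [0.0] * n_params
        # Initial covariance P0 = p0 * I.
        self.P = [[p0 if i == j else 0.0 for j in range(n_params)]
                  for i in range(n_params)]
        self.lam = forgetting

    def update(self, phi, y):
        """Process one regressor row phi and one measurement y."""
        n = len(self.theta)
        p_phi = [sum(self.P[i][j] * phi[j] for j in range(n)) for i in range(n)]
        denom = self.lam + sum(phi[i] * p_phi[i] for i in range(n))
        gain = [v / denom for v in p_phi]
        err = y - sum(phi[i] * self.theta[i] for i in range(n))
        self.theta = [self.theta[i] + gain[i] * err for i in range(n)]
        # P is symmetric, so phi^T P equals (P phi)^T.
        self.P = [[(self.P[i][j] - gain[i] * p_phi[j]) / self.lam
                   for j in range(n)] for i in range(n)]
        return self.theta


def demo_estimate(n_samples=300):
    """Recover hypothetical parameters [2, -3] from noise-free data."""
    true_theta = [2.0, -3.0]
    rls = RecursiveLeastSquares(n_params=2)
    for k in range(n_samples):
        phi = [math.sin(0.1 * k), math.cos(0.3 * k)]  # exciting regressor
        y = sum(p * t for p, t in zip(phi, true_theta))
        rls.update(phi, y)
    return rls.theta
```

In a multi-joint setting, each row of the regressor and the corresponding torque sample would be fed to `update` in turn; a forgetting factor below 1 discounts old samples, which helps track slowly varying parameters at the cost of higher sensitivity to noise.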
Noise limits the accuracy of the parameters obtained by least squares as well as the convergence rate of the RLS algorithm. To overcome such problems, the trajectory used in the identification process must be correctly chosen; such a trajectory is called a persistently exciting trajectory [45]. The parameter identification results rely on a well-conditioned reduced observation matrix, which is needed to obtain a unique LS solution. The uniqueness of this solution directly depends on the rank of the observation matrix W. A rank loss of this matrix may occur in two cases: (i) when there is a structural parameter identifiability problem; (ii) when there is a data deficiency because the trajectory is not sufficiently exciting. To obtain a persistently exciting trajectory, two methods are usually used: (i) compute the trajectory based on some optimization criteria [45]; (ii) use special test movements performed sequentially, each exciting a subset of the parameters. These special tests consist of locking some joints while moving others. In our case, such test movements have been generated, which is why the observation matrix is assumed to be a full-rank and well-conditioned matrix. For the identification experiment, the solution of Equation (25) may exhibit some bias, essentially due to the measurement noise. Therefore, it is better to filter the data to improve the parameter estimates given by the RLS method.

Application System: Robot SCARA
The proposed identification scheme has been experimentally tested on a 2-DoF SCARA robot, without gravity effects, whose joints are driven by synchronous motors with absolute encoders (see Figure 2). The control law of the robot is validated via a dSPACE 1104 controller board with a dedicated digital signal processor. The different terms defined in Equation (4) can be described as follows [17], where the regression matrix is given by: with $C_{O2} = \cos(q_2)$ and $S_{O2} = \sin(q_2)$. The vector of unknown parameters is given by: where $ZZR_1 = ZZ_1 + M_2 L_1^2$, $L_1$ is the length of body 1, $M_2$ is the mass of body 2, and $ZZ_i$ represents the moment of inertia of body i, i = (1, 2). $L_1 MX_2$ and $L_1 MY_2$ are the first moments of body 2 multiplied by the length of body 1, and $F_{v1}$, $F_{s1}$, $F_{v2}$, and $F_{s2}$ are the viscous and dry friction parameters of both axes.

Implementation of the Identification Procedure
The different steps required to implement the identification procedure are summarized as follows. (1) Make an adequate choice of the reference trajectory, which must satisfy the persistency of excitation (PE) condition [44]. (2) Apply a controller for the system defined by Equations (4)

As indicated previously, the identification procedure is applied when the system operates in a closed loop due to its instability in an open loop. Closed-loop identification is performed with a PD feedback control as the system tracks a fifth-order polynomial trajectory. The sampling frequency used to record the experimental data is 200 Hz.

Excitation trajectory. The reference trajectory is computed to have a well-conditioned observation matrix W [45]. To satisfy the PE condition with an excitation trajectory $q_{ref}$, some conditions are imposed to define the polynomial trajectory as: where $q_{max}$, $\dot{q}_{max}$, and $\ddot{q}_{max}$ are bounds on joint positions, velocities, and accelerations, respectively. As shown in Equation (28), the values at the start (t = 0) and endpoints (t = $t_f$) are null.
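A fifth-order polynomial reference of the kind described above can be sketched as follows. The sketch assumes the common rest-to-rest case in which the velocity and acceleration are zero at both t = 0 and t = t_f; the paper's exact boundary conditions and coefficients (its Equation (28)) are not reproduced here, and the function name and arguments are hypothetical.

```python
def quintic_rest_to_rest(q0, qf, tf, t):
    """Return (position, velocity, acceleration) of a quintic trajectory.

    The trajectory goes from q0 to qf in tf seconds with zero velocity and
    zero acceleration at both ends (assumed boundary conditions).
    """
    s = min(max(t / tf, 0.0), 1.0)  # normalized time, clamped to [0, 1]
    dq = qf - q0
    pos = q0 + dq * (10 * s**3 - 15 * s**4 + 6 * s**5)
    vel = dq * (30 * s**2 - 60 * s**3 + 30 * s**4) / tf
    acc = dq * (60 * s - 180 * s**2 + 120 * s**3) / tf**2
    return pos, vel, acc


def peak_velocity(q0, qf, tf):
    """Peak speed of this profile, reached at mid-trajectory."""
    return abs(quintic_rest_to_rest(q0, qf, tf, tf / 2.0)[1])
```

For this profile, the peak speed equals 1.875 |qf - q0| / tf, so the duration tf can be chosen to keep the motion within given velocity bounds; acceleration bounds can be checked in the same way by sampling the third returned value.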
The PD controller is given by the following equation: where $K_{pr}$ and $K_{dr}$ are positive definite diagonal matrices for the proportional and derivative actions, respectively. It is worth noting that the PD controller has no impact on the accuracy of the parameter identification because such a control scheme allows for a very small error.
Data filtering. The measurements are performed on the test bench in order to collect both the joint position and torque data. However, these data may be noisy and biased due to sensor imperfections, such as the quantization noise of the encoder. Therefore, to improve the accuracy of the parameter identification, a low-pass filter, e.g., a Butterworth filter, is used to process the data (inputs/outputs) online. The cutoff frequency $f_F$ of this filter is set such that $f_F \ge 10 f_{dyn}$, where $f_{dyn}$ represents the estimated natural frequency of the robot. The filtered torques are shown in Figure 3. All filters are implemented in their discrete form with the same sampling period as the control loop.
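As an illustration of a discrete low-pass filter that can run sample by sample, the sketch below implements a second-order Butterworth filter obtained through the bilinear transform with frequency pre-warping. The filter order, the 20 Hz cutoff, and the class and function names are assumptions made for the example; only the 200 Hz sampling frequency is taken from the text.

```python
import math


def butter2_lowpass_coeffs(fc, fs):
    """Second-order Butterworth low-pass coefficients (bilinear transform).

    Returns ((b0, b1, b2), (a1, a2)) for the difference equation
    y[k] = b0*x[k] + b1*x[k-1] + b2*x[k-2] - a1*y[k-1] - a2*y[k-2].
    """
    k = math.tan(math.pi * fc / fs)  # pre-warped cutoff
    norm = 1.0 / (1.0 + math.sqrt(2.0) * k + k * k)
    b0 = k * k * norm
    a1 = 2.0 * (k * k - 1.0) * norm
    a2 = (1.0 - math.sqrt(2.0) * k + k * k) * norm
    return (b0, 2.0 * b0, b0), (a1, a2)


class OnlineLowPass:
    """Causal filter that processes one sample per call."""

    def __init__(self, fc, fs):
        self.b, self.a = butter2_lowpass_coeffs(fc, fs)
        self.x1 = self.x2 = self.y1 = self.y2 = 0.0

    def step(self, x):
        b0, b1, b2 = self.b
        a1, a2 = self.a
        y = b0 * x + b1 * self.x1 + b2 * self.x2 - a1 * self.y1 - a2 * self.y2
        self.x2, self.x1 = self.x1, x
        self.y2, self.y1 = self.y1, y
        return y


def step_response_final_value(fc=20.0, fs=200.0, n=2000):
    """Filter a unit step and return the last output (DC gain check)."""
    filt = OnlineLowPass(fc, fs)
    y = 0.0
    for _ in range(n):
        y = filt.step(1.0)
    return y
```

A causal filter of this kind introduces a phase lag that grows as the cutoff is lowered, which is one reason for keeping the cutoff well above the natural frequency of the robot.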

Experimental Results and Discussion
Different algorithms are also presented to serve as a comparative basis for evaluating the proposed identification approach. Direct and cross-validations are performed, and the results are discussed using several performance metrics to show the role of the soft sensor-based differentiator used in the parametric identification loop. Our aim is therefore to compare the practical parameter estimation obtained with the proposed differentiator (P2SMD) with the estimations obtained with the classic sliding mode differentiator (2SMD) defined in Equation (6), the basic Euler differentiator (backward difference algorithm) associated with a finite impulse response (FIR) low-pass filter (ED+FIR), and a new scheme of the adaptive super twisting differentiator (ASTD) proposed by Shtessel [30]. The latter is described as follows: The sliding function $e_0(t)$ is defined in the first equation of Equation (8). $k_0(t)$ and $k_1(t)$ are positive gains, where $k_1(t) = 2\varepsilon k_0(t)$, and the dynamics of $k_0(t)$ are given by Equation (31), in which $\varepsilon$, $\alpha_1$, $\alpha_0$, $\mu_1$, $\gamma$, and $k_{0m}$ are positive constants.

For the experimental validation, an initialization step is necessary. For the RLS estimator, the initial covariance matrix $P_0$ is arbitrarily chosen as a diagonal matrix. It is preferable that the values of the matrix coefficients are high in order to ensure fast convergence of the dynamic parameters. In practice, we have $P_0 = 100 I_1$, where $I_1$ is an identity matrix. The values of the gains are set to $\alpha_1 = 0.95$ and $\alpha_2 = 1$. For the PD controller, the gain values are selected as $K_{pr} = [6.5 \;\; 155]^T$ and $K_{dr} = [5 \;\; 25]^T$. The setting of each differentiation algorithm is described in Table 1.

Table 1. Parametric tuning of algorithms.

ED+FIR: 10th-order FIR low-pass filter with a cutoff frequency of 5 Hz.

It should be noted that, in the experiments, the joint positions were measured with an optical encoder. These position signals are affected by the quantization error. The velocities and accelerations estimated with the aforementioned differentiators are depicted in Figures 4 and 5. We observe from Figures 4a and 5a that the velocity signals present low noise. On the other hand, the acceleration signals show amplification of the noise level (Figures 4b and 5b). Moreover, the proposed algorithm presents the lowest noise level compared to the others. However, the signals estimated by the ASTD have a relatively long transient phase compared to the signals estimated by the other algorithms. In fact, changing the settings to improve this transient phase would increase the noise level. Therefore, a delicate compromise exists in the adjustment of the ASTD.

Statistical Analysis and Comparison Criteria
To correctly compare the estimated parameters obtained with the different differentiators, the relative standard deviations of the estimates are determined, assuming the matrix W to be deterministic. From Equation (25), $\rho$ is a zero-mean additive independent noise with standard deviation $\sigma_\rho$ such that:

$$C_{\rho\rho} = E(\rho\rho^T) = \sigma_\rho^2\, I \quad (32)$$

where $I$ is an (n × n) identity matrix. The covariance matrix of the estimation error is calculated by:

$$C_{\hat{\theta}\hat{\theta}} = \sigma_\rho^2\,(W^T W)^{-1} \quad (33)$$

where $\sigma_{\hat{\theta}_i}^2$ is the ith diagonal coefficient of $C_{\hat{\theta}\hat{\theta}}$. The relative standard deviation $\%\sigma_{\hat{\theta}_i}$ of $\hat{\theta}_i$ is given by:

$$\%\sigma_{\hat{\theta}_i} = 100\,\frac{\sigma_{\hat{\theta}_i}}{|\hat{\theta}_i|} \quad (34)$$

Other quantitative criteria are also considered for the comparative study. A root-mean-square (RMS) error is computed for each estimated parameter. This error is given by the following expression:

$$RMSE_{\theta_i} = \sqrt{\frac{1}{N_n}\sum_{k=1}^{N_n}\left(\theta_i - \hat{\theta}_i(k)\right)^2} \quad (35)$$

where $N_n$ is the number of samples and $\theta_i$, $\hat{\theta}_i$, $i = \{1, \ldots, 8\}$, are the ith actual and estimated parameters, respectively. Good validation requires a good prediction of the torques of each joint. To quantify the quality of torque prediction, some criteria are calculated, such as the RMS error (RMSE) already defined in Equation (35). The coefficient of determination, denoted by $R^2$, is also computed to assess the strength of the linear relationship between the actual torques and the predicted ones. The formula used to calculate this coefficient is as follows: where $N_n$ is the sample size and $\tau_i$, $\hat{\tau}_i$ are the individual samples of the actual and the predicted torques, respectively, indexed with i. Table 2 presents the parameters identified with the four algorithms as well as the a priori values of these parameters. The a priori values have been determined via measurements made on the disassembled links. From Table 2, one can notice that, of all the differentiation algorithms, the P2SMD has the lowest relative standard deviation for almost all the parameters. The deviations recorded for the friction parameters by the P2SMD are high, but their values remain lower than those given by the other algorithms. 
For example, for the dry friction F v2 , the value given by the proposed algorithm is 3.37 times lower than the ED+FIR value and more than 10 times lower than the ASTD value. In Figure 5, the maximum value %σθ i is presented for each algorithm. It is clear from this figure that the lowest value is given by the proposed algorithm. For each of the other algorithms, the maximum value of the relative standard deviation greatly exceeds 30%. If the values of %σθ i exceed 20% or 30%, then the parameters are misidentified [8]. In this research work, the authors suggest that the parameters which remain poorly identifiable must be canceled because their contributions to the system dynamics are poor. However, there are no statistical tests that prove the cancelation or not of such parameters. However, we can conclude that the rather large %σθ i proves that the algorithm used presents more noise and allows the estimator to be biased. Thus, one can observe that, except for the friction parameters (F v1 , F s1 , F v2 , F s2 ), the obtained outcomes are widely accepted values of %σθ i . Therefore, it is more interesting to consider them as parameters varying over time, which can enhance their identification. Table 3 provides the results concerning the RMS error corresponding to each identified parameter. The lowest RMS error values are described in bold; these lowest values are given by the proposed differentiation algorithm. The time evolution of the estimated parameters using the studied and proposed algorithms is presented in Figure 6. Note from this figure that for the estimates provided by the ASTD, the time evolution of the parameters behaves in an oscillatory manner and has the highest transient phase. This adaptive form of the ST differentiator (ASTD) remains hard to adjust due to the large number of parameters required (six parameters) and the compromise that exists between the convergence speed, the noise level, and the accuracy of the algorithm. 
The evolutions of the estimated parameters obtained by the 2SMD and ED+FIR are quite close, although the 2SMD presents faster convergence with less overshoot for some parameters. With the proposed algorithm, the parameters are identified with the weakest transient phase and have the lowest overshoot. For some parameters, the estimated values converge to the a priori known values in 2 s. A direct validation is performed to test the model and to highlight its performance. The aim is to compare the actual torques with those estimated using the identified parameters (see Figure 7). Note that the predicted torques given by the ED+FIR are most affected by the noise (see Figure 7a). This is obviously because the velocity and acceleration estimates for this differentiator are much noisier than for the other differentiators. It is also worth mentioning that all disturbances, such as noise, position quantization error, and quality of velocity and acceleration estimations, remarkably affect the identification results. Figure 7b,c show that the identified model and measurements are close with less noise than the ED+FIR. From Figure 7d, it is clear that the predicted torques are very close to the measured ones for the P2SMD. This implies a good estimation of the parameters. In addition, the proposed algorithm is easier to adjust than the other algorithms. As noted in Table 1, the P2SMD presents only two parameters that need adjustment. Table 4 shows the RMS error and the R 2 values yielded by each differentiation method, with the best values highlighted in bold. In general, the validation gives better results for the prediction of torque 1 than for the prediction of torque 2. Indeed, for torque 1, the values of R 2 and RMSE are quite close for the ED+FIR, 2SMD, and ASTD algorithms. The value of R 2 for each of these three algorithms is about 91%, while the corresponding value for the proposed algorithm is 96%. 
The RMSE of the proposed algorithm is 1.5 times lower than those of the other algorithms. The results provided for torque 2 remain acceptable, with an R² equal to 92%, 85%, 81%, and 72% for P2SMD, 2SMD, ED+FIR, and ASTD, respectively. Thus, the R² values obtained with the proposed method are close to 1 and the RMSE values are the lowest for both torques, which indicates a close fit between the actual and predicted signals; therefore, the model is reliable. The other differentiators also give acceptable R² values, all exceeding 71%. However, the values of RMSE and R² given by the ASTD for torque 2 are the worst among the compared methods. Although this algorithm has a good filtering rate, its estimates are less accurate. For a given system, the set of parameter estimates should be valid for any inputs/outputs. Therefore, validation must be carried out with another input/output dataset to draw a conclusion on the quality of the parameter identification step or the torque prediction step. The next section presents the result of the cross-validation of the model.
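As a minimal sketch of how the two validation metrics above can be computed, the following Python fragment evaluates the RMSE and the coefficient of determination R² between a measured and a predicted torque signal. It assumes the standard definitions (R² = 1 − SS_res/SS_tot); the function names and the short torque sequences are hypothetical illustration values, not data from the experiments.

```python
import math


def rmse(actual, predicted):
    """Root-mean-square error between two equally long sequences."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def r_squared(actual, predicted):
    """Coefficient of determination, assuming R^2 = 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical torque samples (N.m), for illustration only.
tau_measured = [0.0, 1.0, 2.0, 3.0, 4.0]
tau_predicted = [0.1, 0.9, 2.1, 2.9, 4.0]

print("RMSE =", rmse(tau_measured, tau_predicted))
print("R^2  =", r_squared(tau_measured, tau_predicted))
```

A low RMSE together with an R² close to 1 corresponds to the situation reported for the P2SMD; the same two functions can be reused unchanged on a cross-validation dataset.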

Results of the Cross Validation
For the cross-validation, we use the same parametric setting of the PD controller. The parametric setting of the different algorithms also remains the same. Exciting trajectories different from those defined during the identification process are used. Figure 8 shows the actual and predicted actuator torques. For torque 1, the cross-validation yields relatively good results. The results are not as good for torque 2. This can be explained by the fact that the dynamics of the new trajectory for torque 2 differ markedly from those of the trajectory used for direct validation. Figure 8 indicates that the proposed method has the best performance and predicts the actuator torque with a good filtering rate. In general, algorithms based on sliding modes exhibit a transient phase, but it remains short, and all the algorithms converge quickly. This transient phase can be reduced by increasing the gains of the algorithms, although increasing the gains tends to increase the noise level. This is not a limitation for the proposed algorithm: given the good filtering provided by the P2SMD, increasing the convergence gains does not pose such a challenge. The torque ripples can also be suppressed with a decimation procedure, which is detailed in [22]. Consistent with the qualitative comparison in Figure 8, the quantitative criteria show good performance of the proposed algorithm. The P2SMD has the lowest RMSE values and the highest R² values (see Table 5). Moreover, the R² values are very close to 1, which shows that the obtained model correctly predicts both torques. For torque 2, the results provided by the other algorithms are largely degraded, especially for the ASTD. This is due to its high cumulative imprecision, since the acceleration is estimated via two successive ASTD blocks.

Discussion
The parametric identification of a nonlinear model and the prediction of the inputs based on the inverse model associated with a differentiation algorithm have been studied in this paper. Such an approach can circumvent the non-convexity problem to find a local solution. In fact, this study describes the key role of a soft sensor for this purpose with a basic RLS algorithm. It is well known that the RLS method can provide biased results due to inaccurate measurements of the joint positions q at a high sampling rate and especially due to poorly tuned filtering. In previous works, the RLS method is typically associated with a traditional Euler-type differentiator followed by a filter. The challenge of such an LS estimator concerns the noisy observation matrix; thus, the filter cut-off frequency must be correctly chosen. An alternative to the conventional differentiator associated with a filter is a soft sensor based on SM differentiators, in which differentiation and filtering take place in parallel. Two SM algorithms (2SMD and ASTD) have been compared to the proposed P2SMD. The results indicate a significant difference between the three studied algorithms (ED+FIR, 2SMD, and ASTD) and the P2SMD algorithm. The performance has been evaluated in terms of qualitative and quantitative criteria. The qualitative criteria are based on the shape of the predicted torque curves compared with the actual ones and on the time evolution of the estimated parameters for all soft sensors. The quantitative criteria are defined by performance metrics such as the relative standard deviations, the coefficient of determination R², and the RMSE. For all these performance criteria, the results show the higher performance of the proposed soft sensor, which may be attributed to the quality of both the velocity and acceleration estimates.
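The sensitivity of the conventional Euler differentiator to position quantization, and the role of the filter setting, can be illustrated with the following minimal Python sketch. It is not the ED+FIR implementation used in the experiments: the moving-average filter stands in for a generic FIR filter, and the sampling period, encoder resolution, window length, and test trajectory q(t) = sin(t) are all assumed values chosen for illustration.

```python
import math


def quantize(x, resolution):
    """Round a position to the nearest multiple of the encoder resolution."""
    return round(x / resolution) * resolution


def euler_derivative(samples, dt):
    """Backward-difference derivative; the first value is set to 0."""
    return [0.0] + [(samples[k] - samples[k - 1]) / dt
                    for k in range(1, len(samples))]


def moving_average(signal, window):
    """Causal moving-average FIR filter over the last `window` samples."""
    out, acc = [], 0.0
    for k, value in enumerate(signal):
        acc += value
        if k >= window:
            acc -= signal[k - window]
        out.append(acc / min(k + 1, window))
    return out


def max_abs_error(estimate, reference, skip):
    """Largest absolute error after discarding the first `skip` samples."""
    return max(abs(e - r) for e, r in zip(estimate[skip:], reference[skip:]))


dt = 0.001          # assumed sampling period (s)
resolution = 0.001  # assumed encoder resolution (rad)
window = 50         # assumed FIR window length (samples)

t = [k * dt for k in range(2000)]
q_measured = [quantize(math.sin(tk), resolution) for tk in t]
qd_true = [math.cos(tk) for tk in t]

qd_raw = euler_derivative(q_measured, dt)
qd_filtered = moving_average(qd_raw, window)

raw_err = max_abs_error(qd_raw, qd_true, window)
filt_err = max_abs_error(qd_filtered, qd_true, window)
print("max velocity error, raw Euler:     ", raw_err)
print("max velocity error, Euler + filter:", filt_err)
```

In this sketch, the unfiltered derivative jumps by one quantization step divided by the sampling period, while the filtered estimate is much smoother but lags the true velocity by about half the window length. A longer window reduces the noise further at the cost of more lag, which is the cut-off frequency compromise mentioned above.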
The performance likely depends on three factors: the convergence time, the accuracy, and the noise robustness of the differentiator. Despite the disturbances due to the position quantization error, the P2SMD algorithm has been found to be faster and more accurate than the other sensors. Moreover, it has a high filtering rate compared to the other algorithms. As a result, the considered metric values barely change between the direct and cross-validations, which supports the effectiveness of the P2SMD. Furthermore, the gains of the proposed algorithm are easier to adjust than those of the other methods, and the algorithm is also easy to implement. Only two parameters are necessary for this adjustment, compared to three for the 2SMD and six for the ASTD. The classic differentiator yields better results than the 2SMD and ASTD, but attention must be paid to its filter setting in order to avoid any distortion in the frequency range related to the closed-loop manipulator. In [7], the authors present an extended Kalman filter to identify the parameters of a 2-DoF manipulator such as the one used in this paper. With such a filter, it is not necessary to carry out any processing of the measured joint positions. However, the obtained results were very sensitive to the initial values. Therefore, the extended Kalman filter requires good a priori knowledge of the initial parameters and takes a very long computing time compared to the proposed algorithm. It is important to highlight that, for all the algorithms, the performance may be increased by incorporating a variable forgetting factor (FF) to correctly track the parameter changes. Specifically, for the viscous and dry friction parameters, it is possible to improve their identification using a value of FF that is smaller than 1, since these parameters are non-stationary. In fact, friction models are generally nonlinear.
Thus, it is possible to use the separable least squares as presented in [46], where the dynamic model is divided into two sub-models. The first sub-model contains all base parameters and the second one is a nonlinear model that defines the friction parameters.
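The effect of a forgetting factor smaller than 1 on the tracking of non-stationary friction parameters, mentioned above, can be sketched as follows. This is a minimal illustration under stated assumptions, not the identification code used in this study: the friction model τ_f = F_v·q̇ + F_s·sign(q̇), the constant forgetting factor, the velocity profile, and all numerical values (including the step change of F_v) are hypothetical.

```python
import math


def rls_step(theta, P, phi, y, lam):
    """One recursive least squares update with forgetting factor lam."""
    n = len(theta)
    P_phi = [sum(P[i][j] * phi[j] for j in range(n)) for i in range(n)]
    denom = lam + sum(phi[i] * P_phi[i] for i in range(n))
    gain = [v / denom for v in P_phi]
    err = y - sum(phi[i] * theta[i] for i in range(n))  # prediction error
    theta = [theta[i] + gain[i] * err for i in range(n)]
    # P is symmetric, so phi^T P equals P_phi transposed.
    P = [[(P[i][j] - gain[i] * P_phi[j]) / lam for j in range(n)]
         for i in range(n)]
    return theta, P


def identify_friction(lam, steps=4000, dt=0.001):
    """Estimate [F_v, F_s] while F_v changes halfway through the run."""
    theta = [0.0, 0.0]
    P = [[1e3, 0.0], [0.0, 1e3]]
    for k in range(steps):
        t = k * dt
        # Assumed exciting joint velocity (rad/s).
        qd = math.sin(2.0 * math.pi * t) + 0.5 * math.sin(7.0 * t)
        sgn = float((qd > 0) - (qd < 0))
        fv_true = 0.5 if k < steps // 2 else 0.8  # hypothetical drift of F_v
        fs_true = 0.2
        y = fv_true * qd + fs_true * sgn          # noise-free friction torque
        theta, P = rls_step(theta, P, [qd, sgn], y, lam)
    return theta


theta_ff = identify_friction(lam=0.995)  # forgetting factor below 1
theta_no_ff = identify_friction(lam=1.0)  # ordinary RLS, no forgetting

print("with FF:   ", theta_ff)
print("without FF:", theta_no_ff)
```

With the forgetting factor, the estimate of F_v follows the new value after the change, whereas ordinary RLS averages the two regimes. In practice the choice of the forgetting factor trades tracking speed against noise sensitivity, which is why a variable forgetting factor is suggested above.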
Our study is applied mainly to a 2-DoF SCARA robot with no gravity effect. As in [47], it would be very interesting to validate our algorithm on a more complex robot (six degrees of freedom, with gravity effect) that has a considerable number of parameters.