Improved Performance for PMSM Sensorless Control Based on Robust-Type Controller, ESO-Type Observer, Multiple Neural Networks, and RL-TD3 Agent

This paper summarizes a robust controller based on the fact that, in the operation of a permanent magnet synchronous motor (PMSM), a number of disturbance factors naturally occur, among which both changes in internal parameters (e.g., stator resistance Rs and combined inertia of rotor and load J) and changes in load torque TL can be mentioned. In this way, the performance of the control system can be maintained over a relatively wide range of variation in the types of parameters mentioned above. It also presents the synthesis of robust control, the implementation in MATLAB/Simulink, and an improved version using a reinforcement learning twin-delayed deep deterministic policy gradient (RL-TD3) agent, working in tandem with the robust controller to achieve superior performance of the PMSM sensored control system. The comparison of the proposed control systems, in the case of sensored control versus the classical field oriented control (FOC) structure, based on classical PI-type controllers, is made both in terms of the usual response time and error speed ripple, but also in terms of the fractal dimension (DF) of the rotor speed signal, by verifying the hypothesis that the use of a more efficient control system results in a higher DF of the controlled variable. Starting from a basic structure of an ESO-type observer which, by its structure, allows the estimation of both the PMSM rotor speed and a term incorporating the disturbances on the system (from which, in this case, an estimate of the PMSM load torque can be extracted), four variants of observers are proposed, obtained by combining the use of a multiple neural network (NN) load torque observer and an RL-TD3 agent. The numerical simulations performed in MATLAB/Simulink validate the superior performance obtained by using properly trained RL-TD3 agents, both in the case of sensored and sensorless control.


Introduction
In today's world, the control of actuation systems based on servomotors has become a priority, due to the fact that these systems can be found in a wide range of products that contribute both to improving the efficiency of certain activities and to improving ergonomics [1,2]. It is worth remembering that PMSM is incorporated into servomotors used in robotics, computer peripherals, the aerospace industry, electric drives, etc. [3,4]. Thus, PMSM control takes on new dimensions, in terms of the complex concerns of researchers to improve the performance of control systems. It is worth remembering that the

•
The synthesis of a robust PMSM controller using the d-q frame mathematical model for the sensored case of the PMSM control structure; • The robust controller synthesis using MATLAB and integration with PMSM sensored control system; • The improvement in the control performance of the PMSM sensored control system, by combining the robust controller and an RL-TD3 agent that provides additional correction signals to adjust the u d and u q commands generated by the main robust controller; • The comparative presentation of the performance of the proposed controllers used in the PMSM control system structure, in terms of response time, speed ripple error and DF of the rotor speed signal; • The ESO-type observer synthesis for the sensorless case of the PMSM control system for rotor speed estimation; • The improvement in the performance of the basic ESO observer structure for PMSM rotor speed estimation, by providing load torque estimates using two-layer feedforward multiple NN networks implemented with the Neural Net Fitting MATLAB application [31]; Sensors 2023, 23, 5799 3 of 29

•
The improvement in the performance of the ESO basic observer structure for PMSM rotor speed estimation, by implementing an RL-TD3 agent [33] in MATLAB/Simulink that provides correction signals to estimate the PMSM rotor speed value as close as possible to the sensored rotor speed provided by a PMSM speed sensor; • Comparison of the performance of the four observer variants for estimating the proposed PMSM rotor speed, in terms of response time and speed ripple error.
The rest of the article is organized as follows: Section 2 presents the proposed sensored and sensorless control structure of PMSM. The robust control of the PMSM is presented in Section 3, while the MATLAB/Simulink implementation of the PMSM sensored control system together with the improvement in the control performance by using an RL-TD3 agent is presented in Section 4. The PMSM sensorless control system using improved ESO-type observer variants is presented in Section 5, and numerical simulations of the operation of the PMSM sensorless control system are presented in Section 6. Section 7 presents some conclusions and suggestions for future approaches.

Sensored and Sensorless PMSM Control
Traditionally, PMSM control is achieved using the FOC strategy, in which a PMSM rotor speed sensor is integrated and the name of the control structure is sensored control. Figure 1 shows the schematic diagram of the FOC-type strategy for the PMSM control system, where the controllers used in the current and voltage loops are PI-type.
In Relation (1), the notations are the usual ones in the d-q rotating reference frame [1,2,28]. Therefore, it can be denoted these notations as follows: the stator voltages, u d and u q , in the d-q frame; the stator currents, i d and i q , in the d-q frame; the stator inductances, L d and L q , in the d-q frame; and the stator resistances, R d and R q , in the d-q frame. Other common parameters of the PMSM include: the flux linkage is denoted by λ 0 , the number of pole pairs is denoted by n p , the rotor moment inertia combined with load moment inertia is denoted by J, the load torque is denoted by T L , the viscous friction coefficient is denoted by B, and the PMSM rotor speed is noted with ω.
In short, the description of the nonlinear operating equations of the PMSM is given by Relation (1).
In the usual case, where L d = L q is R d = R q = R s , the system shown in (1) can be rewritten in the following form [1,2,28]: In the following sections we propose both an improved control structure, in the case of sensored control, and the use of ESO-type observer combinations, in the case of a structure called sensorless control. Thus, Figure 2 shows the schematic diagram of the proposed PMSM sensorless control system based on a robust controller and RL-TD3 agent algorithm, using an improved ESO-type observer combined with a multiple NN load torque estimation and an RL-TD3 agent for speed estimation.
In terms of improving control structures, a robust controller combined with an RL-TD3 agent is presented, which provides superior performance compared to classical PI controllers. Four combinations of observers are proposed for PMSM speed estimation. The first type of observer is based on the ESO-type observer, which can estimate not only the PMSM rotor speed ω, but also the load torque T L , by adding another variable containing disturbance elements and by a special structure of the observer. A second proposed variant is a multiple NN, trained over different load torque ranges, which provides this value to an ESO-type observer which, in turn, will have a better PMSM rotor speed estimation relative to the measured PMSM rotor speed (sensored control structure case). The following two variants are derived from those presented above, with the addition of an RL-TD3 agent that provides superior estimates of rotor PMSM. In the following sections we propose both an improved control structure, in the case of sensored control, and the use of ESO-type observer combinations, in the case of a structure called sensorless control. Thus, Figure 2 shows the schematic diagram of the proposed PMSM sensorless control system based on a robust controller and RL-TD3 agent algorithm, using an improved ESO-type observer combined with a multiple NN load torque estimation and an RL-TD3 agent for speed estimation.   In the following sections we propose both an improved control structure, in the case of sensored control, and the use of ESO-type observer combinations, in the case of a structure called sensorless control. Thus, Figure 2 shows the schematic diagram of the proposed PMSM sensorless control system based on a robust controller and RL-TD3 agent algorithm, using an improved ESO-type observer combined with a multiple NN load torque estimation and an RL-TD3 agent for speed estimation.  Schematic diagram of the proposed PMSM sensorless control system based on a robust controller and an RL-TD3 agent algorithm, using an improved ESO-type observer combined with a multiple NN load torque estimation and an RL-TD3 agent for speed estimation.

Robust Control of PMSM
Consider the operating point (x*, u*) obtained from the linearization of the nonlinear PMSM model in Section 2. Consider the operating point (x*, u*) obtained by linearizing the nonlinear PMSM model in Section 2. The explanation of the operating point is given by the following Relation (3) [28]. Around the operating point (x*, u*), the nonlinear equation of the PMSM takes the following form: f (x * , u * ) = 0 (4) The following linearized system can be related to the nonlinear Equation (4): .
where the matrices are defined by Relation (6).
Relation (7) shows the reference values selected for the operating point: For robust control of the PMSM, Figure 3   Following the steps in the synthesis of a robust control algorithm [23], Figure 4 shows the generalized plant. In addition to inputs/outputs and references, it contains the K controller structure, the nominal plant P, the uncertainties block ∆ and the input and output uncertainties w and z.  Following the steps in the synthesis of a robust control algorithm [23], Figure 4 shows the generalized plant. In addition to inputs/outputs and references, it contains the K controller structure, the nominal plant P, the uncertainties block ∆ and the input and output uncertainties w and z.
The relation between these elements can be written as follows: where ∆ represents a diagonal matrix defining the uncertainties and u is the controller feedback, i.e., the command given by the robust controller K. Following the steps in the synthesis of a robust control algorithm [23], Figure 4 shows the generalized plant. In addition to inputs/outputs and references, it contains the K controller structure, the nominal plant P, the uncertainties block ∆ and the input and output uncertainties w and z. The relation between these elements can be wri en as follows: where ∆ represents a diagonal matrix defining the uncertainties and u is the controller feedback, i.e., the command given by the robust controller K.
The global transfer matrix from r and TL to y−r is given by Relation (10): where: the generalized plant is denoted by P, the robust controller is denoted by K, and the upper/lower fractional linear transform is denoted by Fu/l [24,28].
According to [23], the generalized plant P verifies the following relation:  The global transfer matrix from r and T L to y−r is given by Relation (10): where: the generalized plant is denoted by P, the robust controller is denoted by K, and the upper/lower fractional linear transform is denoted by F u/l [24,28]. According to [23], the generalized plant P verifies the following relation: Figure 5 shows the block diagram of the extended system. It also shows the weighting function W i (with i = 1, 2, 3) and the linearized nominal transfer function of the PMSM given in (6) and denoted by G nom . Typically, the sensitivity function is denoted by S, the complementary sensitivity function is denoted by T, and the additive robustness matrix is denoted by R. Their expressions are given in (12)- (14) respectively.
The main objectives of robust control are tracking performance and disturbance a enuation over the entire frequency range of the controlled process. These objectives can be summarized as shown in Relation (15). The weighting matrices in Relation (15) are usually chosen as follows: W1S for a low-pass-type filter, W2U for a high-pass-type filter, and the matrix W3T provides high-frequency a enuation for the complementary sensitivity function T [24,28].

MATLAB/Simulink Implementation and Numerical Simulations for PMSM Sensored Control System Using a Robust Controller Combined with RL-TD3 Agent
Based on what was described in the previous section on the synthesis of a robust controller for the sensored control of PMSM, this section presents the MATLAB/Simulink implementation of such a controller, while the second part of this section presents a way to improve the performance of the robust controller using an RL-TD3 agent.

Robust Controller Synthesis Based on MATLAB/Simulink
The nominal parameters of the PMSM used in the following numerical simulations are shown in Table 1. Figure 6 shows the diagram of the MATLAB/Simulink model block for the proposed PMSM sensored control system using a robust controller. Typically, the sensitivity function is denoted by S, the complementary sensitivity function is denoted by T, and the additive robustness matrix is denoted by R. Their expressions are given in (12)- (14) respectively.
The main objectives of robust control are tracking performance and disturbance attenuation over the entire frequency range of the controlled process. These objectives can be summarized as shown in Relation (15). The weighting matrices in Relation (15) are usually chosen as follows: W 1S for a low-pass-type filter, W 2U for a high-pass-type filter, and the matrix W 3T provides high-frequency attenuation for the complementary sensitivity function T [24,28].

MATLAB/Simulink Implementation and Numerical Simulations for PMSM Sensored Control System Using a Robust Controller Combined with RL-TD3 Agent
Based on what was described in the previous section on the synthesis of a robust controller for the sensored control of PMSM, this section presents the MATLAB/Simulink implementation of such a controller, while the second part of this section presents a way to improve the performance of the robust controller using an RL-TD3 agent.

Robust Controller Synthesis Based on MATLAB/Simulink
The nominal parameters of the PMSM used in the following numerical simulations are shown in Table 1. Figure 6 shows the diagram of the MATLAB/Simulink model block for the proposed PMSM sensored control system using a robust controller.   The implementation of the robust controller follows the steps described in Section 3, as well as a set of instructions specific to robust synthesis, used in the MATLAB Robust Control toolbox environment [25]. Therefore, to define a variation in some parameters, e.g., load torque TL, around nominal values, the command ureal() is used. Using the musyn() command, the numerical values for the implementation of the robust controller K(s) are obtained as shown in Relation (16  The implementation of the robust controller follows the steps described in Section 3, as well as a set of instructions specific to robust synthesis, used in the MATLAB Robust Control toolbox environment [25]. Therefore, to define a variation in some parameters, e.g., load torque T L , around nominal values, the command ureal() is used. Using the musyn() command, the numerical values for the implementation of the robust controller K(s) are obtained as shown in Relation (16).
Furthermore, the controller K synthesized in Relation (16) and partitioned as shown in Relation (17) is also implemented in Simulink, as shown in Figure 6 [26,28].
where each component of the controller K(s) can be written as a transfer function: num(s)/den(s): The numerical values of the numerators and denominators of the components of the robust controller K(s), namely, N 11 (s), N 12 (s), N 21 (s), N 22 (s), and D(s) are as follows [26,28]: N 12 (s) = 8 · 10 −5 · s 6 + 7 · 10 −5 · s 5 + 5 · 10 −5 · s 4 + 0.0422 · s 3 + 0.0976 · s 2 + 0.0143 · s + 0.0211 (20) N 21 (s) = 5 · 10 −5 · s 6 + 6 · 10 −5 · s 5 + 7 · 10 −5 · s 4 + 0.0235 · s 3 + 0.0232 · s 2 + 0.0089 · s + 0.0678 (21) D(s) = 3 · 10 −4 · s 7 + 2 · 10 −4 · s 6 + 9 · 10 −4 · s 5 + 0.0196 · s 4 + 1.4102 · s 3 + 2.5062 · s 2 + 1.2128 · s + 0.0973 (23) The following are the general rules for the selection of the weighting matrices in Relation (15) that characterize the robust control [26,28]: Bode plots can be drawn to highlight the robustness properties of the synthesized controller. Thus, Figure 7a shows Bode plots for the transfer function ω ref to z 1 , while Figure 7b shows Bode plots for the transfer function T L to z 1 . From the analysis of these Bode plots, it can be seen that the robust control system of the PMSM ensures the stability of the closed-loop system, even in the case of parametric variations.  Figure 8 shows the evolution of the parameters of interest of the PMSM sensored control system using the classical FOC control structure with PI-type controllers, namely: rotor speed-ω; electromagnetic torque-T e ; load torque-T L ; stator currents-i a , i b , and i c ; and d-q rotating reference frame i d and i q . The numerical simulations are performed by applying a sequence of step signals to the PMSM rotor speed reference as follows: ω ref = [1000, 1250, 1500, 900] rpm, together with a load torque T L with a value of de 0.5 Nm. It can be seen that the response time of the control system is 29 ms, and there is no overshoot.   Figure 8 shows the evolution of the parameters of interest of the PMSM sensored control system using the classical FOC control structure with PI-type controllers, namely: rotor speed-ω; electromagnetic torque-Te; load torque-TL; stator currents-ia, ib, and ic; and d-q rotating reference frame id and iq. The numerical simulations are performed by applying a sequence of step signals to the PMSM rotor speed reference as follows: ωref = [1000, 1250, 1500, 900] rpm, together with a load torque TL with a value of de 0.5 Nm. It can be seen that the response time of the control system is 29 ms, and there is no overshoot. Figure 8. Time evolution parameters of the PMSM sensored control system using a PI controller for rotor speed reference ωref = [1000, 1250, 1500, 900] rpm and load torque TL = 0.5 Nm. Figure 9 shows the numerical simulations of the PMSM sensored control system, when both the classical PI-type controller (FOC-type strategy) and the robust-type controller are used. Thus, numerical simulations are performed after applying a sequence of step signals to the PMSM rotor speed reference as follows: ωref = [1000, 1200, 1400, 1000] rpm, together with a load torque TL with a value of 0.5 Nm. For the PMSM control, be er  Figure 9 shows the numerical simulations of the PMSM sensored control system, when both the classical PI-type controller (FOC-type strategy) and the robust-type controller are used. Thus, numerical simulations are performed after applying a sequence of step signals to the PMSM rotor speed reference as follows: ω ref = [1000, 1200, 1400, 1000] rpm, together with a load torque T L with a value of 0.5 Nm. For the PMSM control, better behavior of the robust controller is observed compared to the PI-based controller. However, if there is a 50% variation in the load torque TL and a 50% variation in the combined rotor and load inertia J parameter, Figure 10 shows a comparison of the performance of the robust-type controller and the PI-type controller. It can be seen that the robust controller has the same performance as when the parameters have nominal values (see Figure 9), while the PI controller provides a response time more than 50% higher than when the PMSM parameters have nominal values.  However, if there is a 50% variation in the load torque T L and a 50% variation in the combined rotor and load inertia J parameter, Figure 10 shows a comparison of the performance of the robust-type controller and the PI-type controller. It can be seen that the robust controller has the same performance as when the parameters have nominal values (see Figure 9), while the PI controller provides a response time more than 50% higher than when the PMSM parameters have nominal values. However, if there is a 50% variation in the load torque TL and a 50% variation in the combined rotor and load inertia J parameter, Figure 10 shows a comparison of the performance of the robust-type controller and the PI-type controller. It can be seen that the robust controller has the same performance as when the parameters have nominal values (see Figure 9), while the PI controller provides a response time more than 50% higher than when the PMSM parameters have nominal values.

Improvement in the Robust Control of PMSM Sensored Control System Using RL-TD3 Agent
In order to improve the performance of the robust controller according to [28], an RL-TD3 agent is used which, after the training phase, will provide correction signals over the u d and u q control signals of the robust controller, so that the control system consisting of the robust controller plus the RL-TD3 agent will provide superior performance for the PMSM control. The proposed block diagram of the implementation in Simulink of the robust controller in tandem with the RL-TD3 agent for the sensored control of a PMSM is shown in Figure 11.

Improvement in the Robust Control of PMSM Sensored Control System Using RL-TD3 Agent
In order to improve the performance of the robust controller according to [28], an RL-TD3 agent is used which, after the training phase, will provide correction signals over the ud and uq control signals of the robust controller, so that the control system consisting of the robust controller plus the RL-TD3 agent will provide superior performance for the PMSM control. The proposed block diagram of the implementation in Simulink of the robust controller in tandem with the RL-TD3 agent for the sensored control of a PMSM is shown in Figure 11. Using the elements and structure from the MATLAB Reinforcement Learning toolbox, Figure 12 shows the RL-TD3 agent implementation subsystem. Note that the inputs to this subsystem, viz.: PMSM rotor speed ω, iq current, iqerror current error and speed error ωerror, are observations for this type of algorithm, while the outputs are the correction signals for the ud and uq control signals and are considered actions. The generalized reward plays the role of a global criterion for optimizing the performance of the control system. The expression of the reward is given by the following relation [32,33]: Note that in Relation (27), the term j t u 1  includes the actions generated in the previous step. The training performance for 200 epochs of the RL-TD3 agent is shown in Figure 13. Using the elements and structure from the MATLAB Reinforcement Learning toolbox, Figure 12 shows the RL-TD3 agent implementation subsystem. Note that the inputs to this subsystem, viz.: PMSM rotor speed ω, i q current, i qerror current error and speed error ω error , are observations for this type of algorithm, while the outputs are the correction signals for the u d and u q control signals and are considered actions. The generalized reward plays the role of a global criterion for optimizing the performance of the control system. The expression of the reward is given by the following relation [32,33]: Note that in Relation (27), the term u j t−1 includes the actions generated in the previous step. The training performance for 200 epochs of the RL-TD3 agent is shown in Figure 13.
The following is a series of numerical simulations performed in MATLAB/Simulink for a PMSM with the nominal values given in Table 1 on the performance of the robust control system, but also with the enhancements of the RL-TD3 agent, where the variables of interest are those presented in Section 4.1. Therefore, Figure 14 shows the performance of the PMSM sensored robust control system for rotor speed reference ω ref = [1000, 1250, 1500, 900] rpm and load torque T L = 0.5 Nm, while Figure 15 shows the performance of the PMSM sensored robust control system plus RL-TD3 agent. Figure 16 shows the results of the performance of the robust control system plus RL-TD3 agent for PMSM control, for a 100% increase in the nominal value of the load torque and a 50% increase in the combined rotor and load inertia J parameter. It can be seen that the control system maintains its performance, but with a corresponding increase in currents.  The following is a series of numerical simulations performed in MATLAB/Simulink for a PMSM with the nominal values given in Table 1 on the performance of the robust control system, but also with the enhancements of the RL-TD3 agent, where the variables of interest are those presented in Section 4.1. Therefore, Figure 14 shows the performance of the PMSM sensored robust control system for rotor speed reference ωref = [1000, 1250, 1500, 900] rpm and load torque TL = 0.5 Nm, while Figure 15 shows the performance of the PMSM sensored robust control system plus RL-TD3 agent. Figure 16 shows the results of the performance of the robust control system plus RL-TD3 agent for PMSM control, for a 100% increase in the nominal value of the load torque and a 50% increase in the combined rotor and load inertia J parameter. It can be seen that the control system maintains its performance, but with a corresponding increase in currents.  The following is a series of numerical simulations performed in MATLAB/Simulink for a PMSM with the nominal values given in Table 1 on the performance of the robust control system, but also with the enhancements of the RL-TD3 agent, where the variables of interest are those presented in Section 4.1. Therefore, Figure 14 shows the performance of the PMSM sensored robust control system for rotor speed reference ωref = [1000, 1250, 1500, 900] rpm and load torque TL = 0.5 Nm, while Figure 15 shows the performance of the PMSM sensored robust control system plus RL-TD3 agent. Figure 16 shows the results of the performance of the robust control system plus RL-TD3 agent for PMSM control, for a 100% increase in the nominal value of the load torque and a 50% increase in the combined rotor and load inertia J parameter. It can be seen that the control system maintains its performance, but with a corresponding increase in currents.       Figure 17 shows a comparison of the time evolution of the rotor speed for the PMSM sensored control system using the PI controller, the robust controller, and the robust controller plus RL-TD3 agent. It can be seen that, compared to the robust controller, the response time of the robust controller used in tandem with RL-TD3 agent decreases from 22.9 ms to 18.2 ms, while the PI controller provides a response time of 30.1 ms.  Figure 17 shows a comparison of the time evolution of the rotor speed for the PMSM sensored control system using the PI controller, the robust controller, and the robust controller plus RL-TD3 agent. It can be seen that, compared to the robust controller, the response time of the robust controller used in tandem with RL-TD3 agent decreases from 22.9 ms to 18.2 ms, while the PI controller provides a response time of 30.1 ms.
plus RL-TD3 agent, for a rotor speed reference ωref = [1000, 1250, 1500, 900] rpm, load torque TL = 1 Nm, and 50% increase in J parameter. Figure 17 shows a comparison of the time evolution of the rotor speed for the PMSM sensored control system using the PI controller, the robust controller, and the robust controller plus RL-TD3 agent. It can be seen that, compared to the robust controller, the response time of the robust controller used in tandem with RL-TD3 agent decreases from 22.9 ms to 18.2 ms, while the PI controller provides a response time of 30.1 ms. Figure 17. Comparison of the time evolution of rotor speed for the PMSM sensored control system using PI controller, robust controller, and robust controller plus RL-TD3 agent.
In addition to the response time and ripple of the PMSM rotor speed signal, the calculation of the DF of the PMSM rotor speed is also proposed as an element to compare the performance of PMSM control systems for sensored control structures.
The box-counting method is used to calculate DF, according to [34]. In principle, it starts from a square that can contain the signal, and the length of one side of this square is Figure 17. Comparison of the time evolution of rotor speed for the PMSM sensored control system using PI controller, robust controller, and robust controller plus RL-TD3 agent.
In addition to the response time and ripple of the PMSM rotor speed signal, the calculation of the DF of the PMSM rotor speed is also proposed as an element to compare the performance of PMSM control systems for sensored control structures.
The box-counting method is used to calculate DF, according to [34]. In principle, it starts from a square that can contain the signal, and the length of one side of this square is used as the unit of measurement. The side of the square is chosen so that it contains the signal, and the size of the side of the square is chosen as a power of 2 to speed up the calculations. By dividing the unit of measurement by 2, the algorithm is repeated until the value is below a predefined threshold. We can write that the form of each division has the expression 1/2k, where k is the current step. The division on the second axis is of the form n k /2 2k where the total number of occupied domains on the scale given by the initial signal is denoted by n k . In the current step, the values obtained for each of the two axes using the algorithm described above are retained, namely the coordinates of a point M k (x,y), represented in the Cartesian system by logarithmic coordinates. The sequence of points M 1 , M 2 , . . . , M n is calculated sequentially at each step. The stop condition is when the set threshold value has been exceeded. The slope of a line closest to points M 1 , M 2 , . . . , M n is calculated by least squares, and the value obtained represents the DF of the initial signal. In the MATLAB environment, using the command "[n,r] = boxcount(signal,'slope')", the vectors of the two dimensions are obtained according to the algorithm described above.
Moreover, by using the commands "df = −diff(log(n))./diff(log(r))" and "['DF = 'num2str(mean(df(1:length(df)))) '+/− 'num2str(std(df(1:length(df))))])" the coordinates of the M k points in logarithmic coordinates are obtained, as well as the slope of the line closest to the points provided by the algorithm representing DF. Figures 18-20 show the DF of the rotor speed signal when using PI controller, robust controller, and robust controller plus RL-TD3 agent for PMSM sensored control system. Table 2 shows the comparative performance of the proposed controllers for the PMSM sensored control system. The elements that are compared are response time, rotor speed ripple and DF of the rotor speed signal. The performance improvement is clearly seen when using a robust controller compared to the case of using the classical PI controllers in the sensored control structure of the PMSM. Moreover, the improvement in the performance of the PMSM control system can be further enhanced by using the robust controller in tandem with an RL-TD3 agent. Note that by analyzing the obtained values of the DF of the rotor speed signal for the three cases of controllers presented, the hypothesis presented in [34] is verified, i.e., when using a controller with superior performance, the DF of the controlled signal is high. algorithm described above.
Moreover, by using the commands "df = −diff(log(n))./diff(log(r))" and "['DF = 'num2str(mean(df(1:length(df)))) '+/− 'num2str(std(df(1:length(df))))])" the coordinates of the Mk points in logarithmic coordinates are obtained, as well as the slope of the line closest to the points provided by the algorithm representing DF. Figures 18-20 show the DF of the rotor speed signal when using PI controller, robust controller, and robust controller plus RL-TD3 agent for PMSM sensored control system. Figure 18. DF of the rotor speed signal in the case of using the PI controller for the PMSM control system. Figure 19. DF of the rotor speed signal in the case of using the robust controller for the PMSM control system. Moreover, by using the commands "df = −diff(log(n))./diff(log(r))" and "['DF = 'num2str(mean(df(1:length(df)))) '+/− 'num2str(std(df(1:length(df))))])" the coordinates of the Mk points in logarithmic coordinates are obtained, as well as the slope of the line closest to the points provided by the algorithm representing DF. Figures 18-20 show the DF of the rotor speed signal when using PI controller, robust controller, and robust controller plus RL-TD3 agent for PMSM sensored control system.    Table 2 shows the comparative performance of the proposed controllers for the PMSM sensored control system. The elements that are compared are response time, rotor speed ripple and DF of the rotor speed signal. The performance improvement is clearly seen when using a robust controller compared to the case of using the classical PI controllers in the sensored control structure of the PMSM. Moreover, the improvement in the performance of the PMSM control system can be further enhanced by using the robust controller in tandem with an RL-TD3 agent. Note that by analyzing the obtained values of the DF of the rotor speed signal for the three cases of controllers presented, the hypothesis presented in [34] is verified, i.e., when using a controller with superior performance, the DF of the controlled signal is high.

PMSM Sensorless Control System Using Improved ESO-Type Observer Variants
This section presents the sensorless control of the PMSM, where the controller used is the one described in Section 4.2, i.e., a robust controller working in tandem with an RL-TD3 agent. Once the controller is fixed, there is the problem of estimating the speed of the PMSM rotor by means of a speed observer. Due to the structure of the nonlinear equations of the PMSM, an ESO-type observer is chosen according to [29]. This section also proposes improvements to PMSM rotor speed estimation, where the element of comparison is the rotor speed provided by the speed sensor in the PMSM sensored control structure. Therefore, first of all, the ESO-type observer, as its name implies, extends the estimated states by an additional state, and from the description of the ESO-type observer equations, it is also possible to estimate the load torque T L . In order to achieve an improvement in the estimated speed of the PMSM rotor, it is proposed to use a multiple NN, which uses multiple NNs trained over portions of the load torque variation, and selected accordingly by a stateflow to provide an estimate of the load torque. A second way to improve PMSM rotor speed estimation is to use an RL-TD3 agent that, after training, provides signals that overlap the ESO-type observer output to provide an estimate of the PMSM rotor closer to the value provided for the PMSM sensored control structure.

ESO-Type Observer Description
To estimate the PMSM rotor speed using the ESO-type observer, we start from the form of the PMSM nonlinear description equations. These equations were described in Section 2, but can be written in the general form given by Equation (28).
Equation (28) describes a monovariable nonlinear system in which perturbations occur, where y (l) denotes the derivative l of the output of the system y, u denotes the input to the system y, and d denotes the perturbation acting on the system y.
By denoting x 1 = y, x 2 = . y, . . . , x n = y (n−1) , the following form is obtained: x n = f (x 1 , x 2 , . . . , x n , d, t) + bu (29) An additional state is also selected in the following form: which gives the form of the function h(t): Note that in the descriptions given by the previous equations, the variable f includes the concentrated disturbances acting on the system.
Starting from the equations of the PMSM nonlinear system in Section 2, and adding the previous transformations, the equations of an ESO-type observer can be synthesized in the following form [29]: Thus, Relation (32) describes the component equations of the system describing the ESO-type observer which, based on a minimum of information, will provide the estimate of the states of the system together with the extended state including the f term. By adjusting the above relation to estimate the PMSM rotor speed, the following relation can be written: where the term f, which includes the generalized disturbances acting on the system of equations describing the operation of the PMSM, is expressed as follows: Substituting Relation (34) into Relation (33) gives the following equation: If in the previous relations x 1 = ω, x 2 = f and . f = h, the matrix expression of the system containing the PMSM rotor speed is obtained as follows: where the system matrices (36) are given by: Based on this, the ESO-type observer structure can be written as follows: .
The bandwidth of the ESO-type observer system is denoted by ω 0 . The poles of this system are assigned the value −ω 0 , and to determine the gain of the observer L, the identification is derived from the equality of polynomials as follows [29]: For fast convergence, the chosen ω 0 is equal to 900, and in the implementation in Figure 21, it can be seen that the PMSM rotor speed is estimated by the state variable x 1 of the observer, and the term f is estimated by the state variable x 2 of the observer.
Furthermore, considering the estimated PMSM rotor speed, the estimated PMSM rotor angle can also be obtained as:θ where: θ(t 0 ) is the initial value of the rotor position relative to a fixed reference point. The estimate of the load torque can also be obtained in the form of the following equation, where x 1 and x 2 represent the states of the ESO-type observer: In the following section, a method is proposed to improve the estimation of the PMSM rotor speed, by replacing Equation (43), which was used to estimate the load torque T L with a multiple NN trained on the specific portions of the load torque variation.
For fast convergence, the chosen ω0 is equal to 900, and in the implementation in Figure 21, it can be seen that the PMSM rotor speed is estimated by the state variable x1 of the observer, and the term f is estimated by the state variable x2 of the observer.
Furthermore, considering the estimated PMSM rotor speed, the estimated PMSM rotor angle can also be obtained as: where: is the initial value of the rotor position relative to a fixed reference point.
The estimate of the load torque can also be obtained in the form of the following equation, where x1 and x2 represent the states of the ESO-type observer:

Load Torque Estimation Using Multiple NN
T L load torque estimation is performed using multiple NNs, each of which is a twolayer feed-forward network [30,31]. The Neural Net Fitting application from MATLAB is used in the implementation and proper training of each NN over 5 load torque ranges T L , i.e., between 0 Nm and 1 Nm. After the numerical simulation on a predetermined interval in the load torque range specified above, the input data of each NN, namely the PMSM rotor speed ω, the derivative of the PMSM rotor speed dω/dt, i d and i q currents from d-q frame, but also the output of each NN given by the load torque required to generate and train it, were recorded. Therefore, in the Neural Net Fitting application, the input data, represented by sets of 500,000 values for each variable, is divided into 70% data needed for training, 15% data needed for validation, and 15% data needed for testing.
The algorithm chosen for training, validation, and testing is Bayesian regularization. This type of algorithm minimizes a linear combination of squared errors and weights. It also modifies the linear combination so that, at the end of training, the resulting network has good generalization qualities. After generating the required 5 NNs in the selected load torque range, each NN is selected using a stateflow implemented in MATLAB [30,31]. Figure 22 shows the MATLAB/Simulink subsystem implementation of multiple NN for load torque estimation and selection. Implementation of a NN for load torque estimation in the Neural Net Fitting app from MATLAB is shown in Figure 23. Details of this implementation, in terms of the block diagram of a NN for load torque estimation, represented by two-layer feed-forward network with sigmoid hidden neurons (150) and linear output neurons for regression tasks, is shown in Figure 23a, while Figure 23b shows the Simulink NN exported model for load torque estimation, and this stage is performed using the Neural Net Fitting application after the process of training, validation and testing of each NN. estimation in the Neural Net Fi ing app from MATLAB is shown in Figure 23. Details of this implementation, in terms of the block diagram of a NN for load torque estimation, represented by two-layer feed-forward network with sigmoid hidden neurons (150) and linear output neurons for regression tasks, is shown in Figure 23a, while Figure 23b shows the Simulink NN exported model for load torque estimation, and this stage is performed using the Neural Net Fi ing application after the process of training, validation and testing of each NN. Figure 22. MATLAB/Simulink subsystem implementation of multiple NN for load torque estimation and selection using stateflow. "*" is specific to the multi switch block in Simulink and represents the default selection. The number of epochs used in the training phase is 1000, and the learning rate is 0.1. Figure 24 shows the best validation performance, which represents the mean square error (MSE) values obtained on the entire data set selected for training, validation and testing during the iterative calculation process. Therefore, it can be seen that, at epoch 518, a MSE value of 0.00031339 is obtained from the total computation period of the Bayesian regularization algorithm. Figure 25 shows the evolution of the gradient in the iterative calculation algorithm, and it can be seen that, when applying the NN algorithm that provides The number of epochs used in the training phase is 1000, and the learning rate is 0.1. Figure 24 shows the best validation performance, which represents the mean square error (MSE) values obtained on the entire data set selected for training, validation and testing during the iterative calculation process. Therefore, it can be seen that, at epoch 518, a MSE value of 0.00031339 is obtained from the total computation period of the Bayesian regularization algorithm. Figure 25 shows the evolution of the gradient in the iterative calculation algorithm, and it can be seen that, when applying the NN algorithm that provides the load torque estimation at epoch 524, the gradient value is 0.00001489. Figure 23. Implementation of an NN for load torque estimation in Neural Network Fi ing toolbox from MATLAB: (a) block diagram of a NN for load torque estimation; (b) Simulink NN exported model for load torque estimation.
The number of epochs used in the training phase is 1000, and the learning rate is 0.1. Figure 24 shows the best validation performance, which represents the mean square error (MSE) values obtained on the entire data set selected for training, validation and testing during the iterative calculation process. Therefore, it can be seen that, at epoch 518, a MSE value of 0.00031339 is obtained from the total computation period of the Bayesian regularization algorithm. Figure 25 shows the evolution of the gradient in the iterative calculation algorithm, and it can be seen that, when applying the NN algorithm that provides the load torque estimation at epoch 524, the gradient value is 0.00001489.    Figure 26 shows the sample error distribution in the iterative calculation process. can be seen that the sample error corresponding to an NN for load torque estimation distributed between values of ±0.01. The regression analysis diagrams of the trainin sample data, test sample data, and the entire data set during the application of th Bayesian regularization type algorithm, are shown in Figure 27. It can be seen that th correlation factor R value corresponding to a NN for load torque estimation is around th value of 0.94.  Figure 26 shows the sample error distribution in the iterative calculation process. It can be seen that the sample error corresponding to an NN for load torque estimation is distributed between values of ±0.01. The regression analysis diagrams of the training sample data, test sample data, and the entire data set during the application of the Bayesian regularization type algorithm, are shown in Figure 27. It can be seen that the correlation factor R value corresponding to a NN for load torque estimation is around the value of 0.94. Figure 28 shows the load torque and estimated load torque using multiple NN estimation time evolution after numerical simulation of the PMSM control system for the following load torque sequence [0.1, 0.7, 0.9, 0.5, 0.3] Nm. It can be seen that the multiple NN estimation is very good, which creates the conditions for obtaining superior PMSM rotor speed estimation results when the load torque is provided by multiple NNs in the ESO observer structure. Figure 25. Training stage of a neural fi ing network for load torque estimation. Figure 26 shows the sample error distribution in the iterative calculation process. It can be seen that the sample error corresponding to an NN for load torque estimation is distributed between values of ±0.01. The regression analysis diagrams of the training sample data, test sample data, and the entire data set during the application of the Bayesian regularization type algorithm, are shown in Figure 27. It can be seen that the correlation factor R value corresponding to a NN for load torque estimation is around the value of 0.94.   Figure 28 shows the load torque and estimated load torque using multiple NN estimation time evolution after numerical simulation of the PMSM control system for the following load torque sequence [0.1, 0.7, 0.9, 0.5, 0.3] Nm. It can be seen that the multiple NN estimation is very good, which creates the conditions for obtaining superior PMSM

Improved Estimation of PMSM Rotor Speed Using ESO-Type Observer and RL-TD3 Agent
Similar to using an RL-TD3 agent to improve the performance of the PMSM control system, an agent whose implementation is shown in Figure 29 is selected and its reward is defined by the following relation [32,33]: where Q = 0.5, R = 0.1, and u j t−1 is the history of previous weighted actions. Figure 27. Regression factor after training, validation, and testing stages of a neural fi ing network for load torque estimation. Figure 28 shows the load torque and estimated load torque using multiple NN estimation time evolution after numerical simulation of the PMSM control system for the following load torque sequence [0.1, 0.7, 0.9, 0.5, 0.3] Nm. It can be seen that the multiple NN estimation is very good, which creates the conditions for obtaining superior PMSM rotor speed estimation results when the load torque is provided by multiple NNs in the ESO observer structure.  Sensors 2023, 23, x FOR PEER REVIEW 5.3. Improved Estimation of PMSM Rotor Speed Using ESO-Type Observer and R Similar to using an RL-TD3 agent to improve the performance of the P system, an agent whose implementation is shown in Figure 29 is selected an is defined by the following relation [32,33]:  is the history of previous weighted actions. Figure 30 shows the evolution of the performance of the RL-TD3 age during the training phase over a period of 200 episodes. After the RL-TD3 a stage, it will cooperate with the ESO-type observer in the sense that the R will provide correction signals that overlap the value of the rotor speed est vided by the ESO-type observer itself, thus achieving superior performance rotor speed estimation. Thus, by implementing this observer variant, an imp rotor speed estimation is achieved for the sensorless variant of the PMSM ture.   Figure 30 shows the evolution of the performance of the RL-TD3 agent algorithm during the training phase over a period of 200 episodes. After the RL-TD3 agent training stage, it will cooperate with the ESO-type observer in the sense that the RL-TD3 agent will provide correction signals that overlap the value of the rotor speed estimation provided by the ESO-type observer itself, thus achieving superior performance to the PMSM rotor speed estimation. Thus, by implementing this observer variant, an improvement in rotor speed estimation is achieved for the sensorless variant of the PMSM control structure.

Numerical Simulation of the PMSM Sensorless Control System
MATLAB/Simulink implementation of the proposed PMSM sensorlesss control system based on robust controller combined with RL-TD3 agent for improvement in ud and uq command using improved ESO-types version observers for speed estimation is shown in Figure 31. Figure 32 shows the MATLAB/Simulink subsystem implementation for ESO-type basic version observer for PMSM rotor speed estimation.

Numerical Simulation of the PMSM Sensorless Control System
MATLAB/Simulink implementation of the proposed PMSM sensorlesss control system based on robust controller combined with RL-TD3 agent for improvement in u d and u q command using improved ESO-types version observers for speed estimation is shown in Figure 31. Figure 32 shows the MATLAB/Simulink subsystem implementation for ESO-type basic version observer for PMSM rotor speed estimation.
Sensors 2023, 23, x FOR PEER REVIEW 25 of 31 Figure 31. MATLAB/Simulink implementation of the proposed PMSM sensorlesss control system based on a robust controller combined with an RL-TD3 agent for improvement in ud and uq command, using improved ESO-types version observers for speed estimation. "*" is specific to the multi switch block in Simulink and represents the default selection. Figure 31. MATLAB/Simulink implementation of the proposed PMSM sensorlesss control system based on a robust controller combined with an RL-TD3 agent for improvement in u d and u q command, using improved ESO-types version observers for speed estimation. "*" is specific to the multi switch block in Simulink and represents the default selection. Figure 31. MATLAB/Simulink implementation of the proposed PMSM sensorlesss control system based on a robust controller combined with an RL-TD3 agent for improvement in ud and uq command, using improved ESO-types version observers for speed estimation. "*" is specific to the multi switch block in Simulink and represents the default selection.  The comparison between the time evolution of the PMSM rotor speed estimates using the four observer variants and the time evolution of the measured PMSM rotor speed is shown in Figure 37. The comparison between the time evolution of the PMSM rotor speed estimates using the four observer variants and the time evolution of the measured PMSM rotor speed is shown in Figure 37.   . Time evolution parameters of the PMSM sensorless control system based on a robust controller with an RL-TD3 agent, using the ESO-type basic version observer for rotor speed estimation; rotor speed reference ωref = [500, 800, 1100, 700] rpm and load torque TL = [0.1, 0.7, 0.9, 0.5, 0.3] Nm. Figure 34. Time evolution parameters of the PMSM sensorless control system based on a robust controller with an RL-TD3 agent, using an ESO-type version observer with a multiple NN observer       Table 3 summarizes the comparative performance of the four observer variants of the PMSM rotor speed and measured PMSM rotor speed as a function of the response time and estimated speed ripple. Note that the error ripple for an analyzed variable, relative to the reference variable, can be associated with the root mean square (RMS) value of the error between the two signals. While, in Table 2, the error ripple was calculated for the error signal given by the reference rotor speed and the measured rotor speed, in Table  3 the error ripple is calculated for the error signal given by the measured rotor speed and Figure 37. Comparison of the time evolution for the rotor speed in a sensored case, and estimated rotor speeds in sensorless cases (improved ESO-types version observers) for the PMSM control system based on a robust controller with an RL-TD3 agent. Table 3 summarizes the comparative performance of the four observer variants of the PMSM rotor speed and measured PMSM rotor speed as a function of the response time and estimated speed ripple. Note that the error ripple for an analyzed variable, relative to the reference variable, can be associated with the root mean square (RMS) value of the error between the two signals. While, in Table 2, the error ripple was calculated for the error signal given by the reference rotor speed and the measured rotor speed, in Table 3 the error ripple is calculated for the error signal given by the measured rotor speed and the estimated rotor speed. It can be noted that the best performance in PMSM rotor speed estimation, in terms of the lowest estimated response time and speed ripple, is obtained when using the ESO-type observer with the load torque estimated using the multiple NN observer combined with the RL-TD3 agent.

Conclusions
Naturally, PMSM operation is subject to a number of perturbations, including both changes in internal parameters (e.g., stator resistance R s and combined inertia of rotor and load J) and changes in load torque T L . In order to maintain the performance of the control system over a relatively wide range of variation of the above types of parameters, a PMSM control system based on a robust controller is used. The article also presents the synthesis of the robust controller, its implementation in MATLAB/Simulink, and an improved version using an RL-TD3 agent that works in tandem with the robust controller to achieve superior performance of the PMSM sensored control system.
Regarding the sensorless control of the PMSM, starting from a basic structure of an ESO-type observer, four variants of observers are presented, obtained by combining the use of a multiple NN and an RL-TD3 agent.
Future work proposes the implementation in real time and the optimization of the performance, both in terms of control and in terms of PMSM rotor speed estimation, by modifying the robust control weights accordingly, and implementing multiple NN and reward of the RL-TD3 agent.