A Robust Optimal Control Strategy for PMSM Based on VGPDO and Actor-Critic Neural Network Against Flux Weakening and Mismatched Load Torque

Niu, Yangyu; Shi, Haibin

doi:10.3390/math13213387

by Yangyu Niu and Haibin Shi *

College of Information Science and Engineering, Northeastern University, Shenyang 110819, China

* Author to whom correspondence should be addressed.

Mathematics 2025, 13(21), 3387; https://doi.org/10.3390/math13213387

Submission received: 20 August 2025 / Revised: 8 October 2025 / Accepted: 13 October 2025 / Published: 24 October 2025

Abstract

In this paper, a novel robust optimal control strategy is proposed for permanent magnet synchronous motors (PMSMs), simultaneously addressing two critical challenges in speed regulation: flux linkage degradation during long-term operation and abrupt load torque variations. The robust optimal control strategy is implemented through a combination of feedforward control and feedback control. A novel Variable-Gain Proportional Disturbance Observer (VGPDO) is proposed to simultaneously estimate time-varying flux linkage and torque disturbances in PMSM systems. The estimated disturbances are then compensated via a feedforward control loop, significantly improving the system’s robustness against parameter variations and external load changes. An optimal controller based on an actor-critic neural network provides feedback for optimal control performance. The uniform ultimate boundedness (UUB) of the proposed strategy is proved through Lyapunov stability analysis, and comprehensive simulation studies demonstrate the efficacy of both the proposed VGPDO and the proposed robust optimal control strategy.

Keywords: PMSM; variable-gain proportional disturbance observer; flux linkage degradation; actor-critic neural network

MSC: 93B51; 93C40; 93B70; 93B53; 93C95

1. Introduction

Owing to their superior power density, wide speed regulation range, compact structure, and high reliability, permanent magnet synchronous motors (PMSMs) have been widely adopted in high-tech industries such as aerospace and new energy vehicles. PMSM control strategies have evolved significantly with the continuous development of control theory. As modern industry advances, demand is growing for PMSM speed control that achieves both robustness and optimality, which plays a crucial role in industrial production.

An effective PMSM control strategy should typically achieve optimality to minimize energy consumption while maintaining robustness to ensure long-term stable and efficient operation of the equipment. For PMSM speed regulation, conventional control strategies such as PID, sliding mode control, $H_{\infty}$ control, model reference adaptive control, and backstepping control [1,2,3,4,5,6,7,8,9] have been widely adopted. However, these methods fail to achieve process optimality in terms of energy efficiency and dynamic performance. To achieve optimal control for the nonlinear PMSM system, it is necessary to formulate the Hamilton–Jacobi–Bellman (HJB) equation based on the cost function. This equation typically lacks an analytical solution and poses significant challenges for numerical computation.

With the advancement of computational capabilities, reinforcement learning (RL), also known as adaptive dynamic programming (ADP), has emerged as a powerful tool for approximating solutions to the HJB equation, thereby enabling optimal control in practical applications [10,11,12,13,14,15,16]. Recent ADP-based PMSM optimal control methods [17,18,19,20,21] have demonstrated the effectiveness of this approach: they use two neural networks to approximate the value function (critic) and the optimal control policy (actor), respectively, thereby circumventing the numerical challenges of directly solving the HJB equation. Theoretical analysis proves that gradient-based weight updating ensures uniformly ultimately bounded (UUB) stability within a neighborhood of the ideal parameters [22].

Standard ADP methods struggle with HJB estimation during sudden parameter changes. Real-time identification and compensation of PMSM internal uncertainties and external disturbances constitute an effective solution. Prior works extensively address mismatched load torque effects: ref. [18] developed a serial extended state observer (ESO) for load torque observation; ref. [19] constructed a disturbance observer to estimate the load torque; ref. [20] proposed a load torque estimator that incorporates actuator error compensation to guarantee system stability. However, to ensure long-term operational stability of permanent magnet synchronous motors, it is essential to account for flux linkage deterioration effects, which necessitates real-time, accurate observation of motor flux linkage parameters. Numerous studies have developed dedicated flux linkage observers [23,24,25] and load torque observers [17,18,26,27] for PMSM. However, research on designing integrated observers capable of simultaneously estimating both flux linkage and load torque to ensure long-term stable operation in PMSM speed control systems remains scarce.

To solve the above problem, a variable-gain proportional disturbance observer (VGPDO), which simultaneously estimates both the flux linkage deterioration and external load torque disturbances, is proposed in this paper. Meanwhile, a robust optimal speed control strategy is proposed for PMSM. The strategy deploys a feedforward controller in the PMSM control system, which utilizes the flux linkage and torque estimates obtained from VGPDO to compensate for disturbances in real time, thereby transforming the PMSM dynamic system into an error dynamic system with disturbance rejection robustness. Furthermore, an actor-critic network-based optimal controller is implemented in this error dynamic system to stabilize the system and achieve optimal control.

The remainder of this paper is organized as follows: Section 2 derives the PMSM error dynamic equations with feedforward compensation for flux linkage and load torque variations. Section 3 develops the VGPDO to estimate both flux linkage and load torque in PMSM. Section 4 presents the actor-critic-based optimal controller design. Section 5 performs simulation studies with a comprehensive analysis. Finally, Section 6 concludes the paper by summarizing key contributions.

The main contributions of this paper are summarized as follows:

• To mitigate flux linkage degradation in PMSM control systems during prolonged load operation, a VGPDO is proposed to simultaneously estimate torque and flux linkage disturbances, thereby enhancing system robustness.
• To achieve optimality and robustness in PMSM speed regulation under varying load torque and flux linkage conditions, a robust optimal control strategy is proposed, which integrates feedforward-based VGPDO compensation with an actor-critic neural network-based optimal speed controller.
• Comprehensive simulations are conducted to validate the effectiveness of the VGPDO and robust optimal control strategy.

2. System Descriptions

As proposed in [19], the PMSM dynamics can be represented as:

\{\begin{matrix} \dot{ω} = \frac{n_{p} ψ_{f}}{J} i_{q} - \frac{B}{J} ω - \frac{1}{J} T_{L} \\ {\dot{i}}_{q} = - \frac{R_{s}}{L} i_{q} - n_{p} ω i_{d} - \frac{n_{p} ψ_{f}}{L} ω + \frac{1}{L} u_{q} \\ {\dot{i}}_{d} = - \frac{R_{s}}{L} i_{d} + n_{p} ω i_{q} + \frac{1}{L} u_{d} \end{matrix}

(1)

where $ω$, $J$, $T_{L}$, and $B$ are the angular velocity, inertia, load torque, and viscous friction coefficient of the PMSM; $L$, $R_{s}$, and $n_{p}$ are the stator inductance, stator resistance, and number of pole pairs, respectively; $i_{q}$ and $u_{q}$ are the q-axis stator current and voltage; $i_{d}$ and $u_{d}$ are the d-axis stator current and voltage; and $ψ_{f}$ is the rotor flux linkage. The PMSM states $ω$, $i_{q}$, and $i_{d}$ are assumed to be measurable.
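As an illustration, Model (1) can be written as a right-hand-side function suitable for a numerical integrator. The following is a minimal Python sketch; the class and function names are hypothetical, and the parameter values in `PmsmParams` are placeholders chosen for this example, not the values of Table 1.

```python
from dataclasses import dataclass


@dataclass
class PmsmParams:
    # Placeholder values (assumptions), used only to make the sketch runnable.
    n_p: int = 4         # number of pole pairs
    psi_f: float = 0.1   # rotor flux linkage
    J: float = 1e-3      # inertia
    B: float = 1e-4      # viscous friction coefficient
    R_s: float = 1.0     # stator resistance
    L: float = 5e-3      # stator inductance


def pmsm_rhs(state, u_q, u_d, T_L, p):
    """Right-hand side of Model (1): returns (d_omega, d_iq, d_id)."""
    omega, i_q, i_d = state
    d_omega = (p.n_p * p.psi_f / p.J) * i_q - (p.B / p.J) * omega - T_L / p.J
    d_iq = (-(p.R_s / p.L) * i_q - p.n_p * omega * i_d
            - (p.n_p * p.psi_f / p.L) * omega + u_q / p.L)
    d_id = -(p.R_s / p.L) * i_d + p.n_p * omega * i_q + u_d / p.L
    return d_omega, d_iq, d_id
```

At standstill, a load torque equal to $n_{p} ψ_{f} i_{q}$ exactly balances the electromagnetic torque, so the first component returned by `pmsm_rhs` is zero in that case.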

When a PMSM operates over an extended period, demagnetization effects must be considered. Let $Δ ψ$ denote the variation in magnetic flux and let $ψ_{s}$ represent the initial magnetic flux; hence we have:

ψ_{f} = ψ_{s} + Δ ψ

(2)

For convenience in the following analysis, we denote $\frac{n_{p} ψ_{f}}{J}$ as $m_{1}$, $\frac{B}{J}$ as $m_{2}$, $\frac{1}{J}$ as $m_{3}$, $\frac{R_{s}}{L}$ as $m_{4}$, $\frac{n_{p} ψ_{f}}{L}$ as $m_{5}$, $n_{p}$ as $m_{6}$, and $\frac{1}{L}$ as $m_{7}$. We also define:

\{\begin{matrix} m_{1}^{*} ≜ \frac{n_{p} ψ_{s}}{J}; m_{5}^{*} ≜ \frac{n_{p} ψ_{s}}{L} \\ {\hat{m}}_{1} ≜ \frac{n_{p} (ψ_{s} + Δ \hat{ψ})}{J}; {\hat{m}}_{5} ≜ \frac{n_{p} (ψ_{s} + Δ \hat{ψ})}{L} \\ {\tilde{m}}_{1} ≜ m_{1} - {\hat{m}}_{1}; {\tilde{m}}_{5} ≜ m_{5} - {\hat{m}}_{5} \end{matrix}

(3)

In order to achieve our tracking goals, we introduce:

\{\begin{matrix} \tilde{ω} ≜ ω - ω_{d} \\ {\tilde{i}}_{q} ≜ i_{q} - {\hat{i}}_{q}^{*} \end{matrix}

(4)

where $ω_{d}$ represents the reference speed and ${\hat{i}}_{q}^{*}$ represents the virtual control input, given by:

{\hat{i}}_{q}^{*} = \frac{m_{2} ω_{d} + m_{3} {\hat{T}}_{L} + {\dot{ω}}_{d}}{{\hat{m}}_{1}}

(5)

To facilitate the practical implementation of the actor-critic neural network-based optimal speed controller, the system described above is restructured via feedforward compensation. Hence, the control inputs $u_{d}$ and $u_{q}$ are decomposed into:

\{\begin{matrix} u_{q} = {\hat{u}}_{c q} + u_{s q} \\ u_{d} = {\hat{u}}_{c d} + u_{s d} \end{matrix}

(6)

where $u_{s q}$ and $u_{s d}$ denote the stabilizing terms generated by the actor-critic neural network-based optimal speed controller, while ${\hat{u}}_{c d}$ and ${\hat{u}}_{c q}$ represent the feedforward compensation terms, defined as:

\{\begin{matrix} {\hat{u}}_{c q} & = \frac{1}{m_{7}} (m_{4} {\hat{i}}_{q}^{*} + {\hat{m}}_{5} ω_{d} + m_{6} ω_{d} i_{d} + {\dot{\hat{i}}}_{q}^{*}) \\ {\hat{u}}_{c d} & = - \frac{1}{m_{7}} (m_{6} ω_{d} {\tilde{i}}_{q} + m_{6} ω {\hat{i}}_{q}^{*}) \end{matrix}

(7)
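The feedforward terms (5) and (7) can be sketched as two small functions. The function and argument names below are illustrative assumptions, and the time derivative of ${\hat{i}}_{q}^{*}$ is passed in as an argument because the text does not specify how it is obtained.

```python
def virtual_current(m2, m3, m1_hat, omega_d, omega_d_dot, T_L_hat):
    """Virtual control input i_q^* of Eq. (5)."""
    return (m2 * omega_d + m3 * T_L_hat + omega_d_dot) / m1_hat


def feedforward_voltages(m4, m5_hat, m6, m7, omega, omega_d,
                         i_q, i_d, iq_star, iq_star_dot):
    """Feedforward terms (u_cq_hat, u_cd_hat) of Eq. (7)."""
    iq_tilde = i_q - iq_star  # q-axis current error from Eq. (4)
    u_cq = (m4 * iq_star + m5_hat * omega_d
            + m6 * omega_d * i_d + iq_star_dot) / m7
    u_cd = -(m6 * omega_d * iq_tilde + m6 * omega * iq_star) / m7
    return u_cq, u_cd
```

With exact estimates and a constant reference, substituting these terms into Model (1) with $u_{s q} = u_{s d} = 0$ gives zero derivatives at $ω = ω_{d}$, $i_{q} = {\hat{i}}_{q}^{*}$, $i_{d} = 0$, which is a quick sanity check on the signs.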

By substituting (2)–(7) into Model (1), a new error dynamics model can be derived:

\{\begin{matrix} \dot{\tilde{ω}} = m_{1} {\tilde{i}}_{q} - m_{2} \tilde{ω} - m_{3} {\tilde{T}}_{L} + \frac{{\tilde{m}}_{1}}{{\hat{m}}_{1}} (m_{2} ω_{d} + m_{3} {\hat{T}}_{L} + {\dot{ω}}_{d}) \\ {\dot{\tilde{i}}}_{q} = - m_{5} \tilde{ω} - m_{4} {\tilde{i}}_{q} + m_{7} u_{s q} - m_{6} \tilde{ω} i_{d} - {\tilde{m}}_{5} ω_{d} \\ {\dot{i}}_{d} = - m_{4} i_{d} + m_{7} u_{s d} + m_{6} \tilde{ω} {\tilde{i}}_{q} \end{matrix}

(8)

To facilitate further analysis, Model (8) can be rewritten in the following form:

\dot{x} = f (x) + g u + δ

(9)

where:

\{\begin{matrix} x ≜ {[\begin{matrix} \tilde{ω} & {\tilde{i}}_{q} & i_{d} \end{matrix}]}^{T}, u ≜ {[\begin{matrix} u_{s q} & u_{s d} \end{matrix}]}^{T}; \\ f (x) ≜ [\begin{matrix} - m_{2} \tilde{ω} + m_{1} {\tilde{i}}_{q} \\ - m_{5} \tilde{ω} - m_{4} {\tilde{i}}_{q} - m_{6} \tilde{ω} i_{d} \\ - m_{4} i_{d} + m_{6} \tilde{ω} {\tilde{i}}_{q} \end{matrix}], g ≜ [\begin{matrix} 0 & 0 \\ m_{7} & 0 \\ 0 & m_{7} \end{matrix}]; \\ δ ≜ {[\begin{matrix} - m_{3} {\tilde{T}}_{L} + \frac{{\tilde{m}}_{1}}{{\hat{m}}_{1}} (m_{2} ω_{d} + m_{3} {\hat{T}}_{L} + {\dot{ω}}_{d}) & - {\tilde{m}}_{5} ω_{d} & 0 \end{matrix}]}^{T} \end{matrix}

(10)

3. Variable-Gain Proportional Disturbance Observer Design

To ensure long-term stable operation of PMSM, an advanced observer capable of capturing both external disturbances and internal parameter variations is essential in a robust optimal control strategy. In this paper, we explicitly address abrupt load torque variations and flux linkage degradation, which are two primary disturbances occurring during prolonged PMSM operation, to enhance system robustness.

Considering the variation in magnetic flux, Model (1) can be reformulated by substituting (2) and (3), yielding:

\{\begin{matrix} \dot{ω} = m_{1}^{*} i_{q} - m_{2} ω - m_{3} T_{L} + n_{1} i_{q} Δ ψ \\ {\dot{i}}_{q} = - m_{4} i_{q} - m_{6} ω i_{d} - m_{5}^{*} ω + m_{7} u_{q} + n_{2} ω Δ ψ \\ {\dot{i}}_{d} = - m_{4} i_{d} + m_{6} ω i_{q} + m_{7} u_{d} \end{matrix}

(11)

where $n_{1} ≜ \frac{n_{p}}{J}$ and $n_{2} ≜ - \frac{n_{p}}{L}$.

For simplicity, the above model (11) can be further expressed as:

\dot{z} = A z (t) + h (z) + B u + E (z) d

(12)

with:

\{\begin{matrix} z ≜ {[\begin{matrix} z_{1} & z_{2} & z_{3} \end{matrix}]}^{T} ≜ {[\begin{matrix} ω & i_{q} & i_{d} \end{matrix}]}^{T}; \\ d ≜ {[\begin{matrix} Δ ψ & T_{L} \end{matrix}]}^{T}; \\ A ≜ [\begin{matrix} - m_{2} & m_{1}^{*} & 0 \\ - m_{5}^{*} & - m_{4} & 0 \\ 0 & 0 & - m_{4} \end{matrix}], B ≜ [\begin{matrix} 0 & 0 \\ m_{7} & 0 \\ 0 & m_{7} \end{matrix}]; \\ h (z) ≜ [\begin{matrix} 0 \\ - m_{6} z_{1} z_{3} \\ m_{6} z_{1} z_{2} \end{matrix}], E (z) ≜ [\begin{matrix} n_{1} z_{2} & - m_{3} \\ n_{2} z_{1} & 0 \\ 0 & 0 \end{matrix}] . \end{matrix}

(13)

Based on model (12), a disturbance observer is proposed as follows:

\{\begin{matrix} \hat{d} = s + p \\ \dot{p} = l (z) \dot{z} \\ \dot{s} = - l (z) E (z) s - l (z) [A z + h (z) + B u + E (z) p] \end{matrix}

(14)

where $l (z)$ is a matrix function to be determined, expressed as $l (z) = [\begin{matrix} l_{11} & l_{12} & l_{13} \\ l_{21} & l_{22} & l_{23} \end{matrix}]$.

Defining the estimation error as $\tilde{d} = d - \hat{d}$, and combining (12) and (14), we obtain:

\begin{matrix} \dot{\tilde{d}} & = - \dot{\hat{d}} + \dot{d} = l (z) E (z) s + l (z) [A z + h (z) + B u + E (z) p] - l (z) \dot{z} + \dot{d} \\ = l (z) E (z) s + l (z) [A z + h (z) + B u + E (z) p] - l (z) [A z + h (z) + B u + E (z) d] + \dot{d} \\ = l (z) E (z) s + l (z) [E (z) p - E (z) d] + \dot{d} \\ = - l (z) E (z) (d - s - p) + \dot{d} \\ = - l (z) E (z) (d - \hat{d}) + \dot{d} \\ = - l (z) E (z) \tilde{d} + \dot{d} \end{matrix}

(15)

To guarantee the bounded convergence of $\tilde{d}$, the following condition must be satisfied:

l (z) E (z) = [\begin{matrix} l_{11} & l_{12} & l_{13} \\ l_{21} & l_{22} & l_{23} \end{matrix}] [\begin{matrix} n_{1} z_{2} & - m_{3} \\ n_{2} z_{1} & 0 \\ 0 & 0 \end{matrix}] = [\begin{matrix} n_{1} z_{2} l_{11} + n_{2} z_{1} l_{12} & - m_{3} l_{11} \\ n_{1} z_{2} l_{21} + n_{2} z_{1} l_{22} & - m_{3} l_{21} \end{matrix}] > 0

(16)

Considering the safety margins $α_{1}$ and $α_{2}$, we set:

[\begin{matrix} n_{1} z_{2} l_{11} + n_{2} z_{1} l_{12} & - m_{3} l_{11} \\ n_{1} z_{2} l_{21} + n_{2} z_{1} l_{22} & - m_{3} l_{21} \end{matrix}] \geq [\begin{matrix} α_{1} & 0 \\ 0 & α_{2} \end{matrix}] > 0

(17)

A feasible solution is given as follows:

\{\begin{matrix} l_{11} = l_{13} = l_{23} = 0 \\ l_{12} = \frac{α_{1}}{n_{2} z_{1}} \\ l_{21} = \frac{α_{2}}{- m_{3}} \\ l_{22} = \frac{n_{1} z_{2}}{n_{2} z_{1}} \frac{α_{2}}{m_{3}} \end{matrix}

(18)
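The feasible solution (18) can be checked numerically: multiplying the gain $l (z)$ by $E (z)$ from (13) should return the diagonal matrix of $α_{1}$ and $α_{2}$. The sketch below does this in plain Python; the function names and the numerical values used in the check are assumptions made for illustration. Because $l_{12}$ and $l_{22}$ divide by $z_{1}$, the sketch assumes a nonzero speed.

```python
def vgpdo_gain(z1, z2, alpha1, alpha2, n1, n2, m3):
    """Gain matrix l(z) of Eq. (18); requires z1 != 0."""
    l12 = alpha1 / (n2 * z1)
    l21 = alpha2 / (-m3)
    l22 = (n1 * z2) / (n2 * z1) * alpha2 / m3
    return [[0.0, l12, 0.0],
            [l21, l22, 0.0]]


def disturbance_matrix(z1, z2, n1, n2, m3):
    """Disturbance input matrix E(z) of Eq. (13)."""
    return [[n1 * z2, -m3],
            [n2 * z1, 0.0],
            [0.0, 0.0]]


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


# Illustrative operating point and gains (assumed values).
gain = vgpdo_gain(100.0, 2.5, 50.0, 80.0, 4000.0, -800.0, 1000.0)
product = matmul(gain, disturbance_matrix(100.0, 2.5, 4000.0, -800.0, 1000.0))
```

Here `product` is the 2-by-2 matrix $l (z) E (z)$, which equals the diagonal matrix with entries 50 and 80 up to rounding, so the error dynamics (15) decouple for this choice of gain.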

Assumption 1.

1. The load torque of the PMSM and its rate of change are bounded by: $\{\begin{matrix} ‖ T_{L} ‖ ⩽ γ_{1} \\ ‖ {\dot{T}}_{L} ‖ ⩽ Υ_{1} \end{matrix}$ .
2. The variation rate of the PMSM flux linkage is bounded by: $‖ Δ \dot{ψ} ‖ ⩽ Υ_{2}$ .
3. The reference speed of the PMSM and its derivative are bounded by: $\{\begin{matrix} ‖ ω_{d} ‖ ⩽ γ_{3} \\ ‖ {\dot{ω}}_{d} ‖ ⩽ Υ_{3} \end{matrix}$ .

Remark 1.

The above assumptions are practically reasonable for the following reasons: (1) In real-world mechanical systems, the load torque is always finite due to physical limitations and typically varies smoothly due to inertia and damping effects. (2) The flux linkage variation rate $Δ \dot{ψ}$ is naturally limited by the physics of magnetic domain reorientation and by the limited energy available for instantaneous demagnetization. (3) The reference speed $ω_{d}$ and acceleration ${\dot{ω}}_{d}$ are bounded by both the motor’s rated specifications and practical motion control needs.

Theorem 1.

When applying the VGPDO (14) to Model (12) for flux linkage deterioration and disturbance torque estimation, the estimation error is bounded as follows under Assumption 1:

\{\begin{matrix} ‖ {\tilde{T}}_{L} ‖ ⩽ \frac{Υ_{1}}{\sqrt{1 + α_{1}^{2}}} \\ ‖ Δ \tilde{ψ} ‖ ⩽ \frac{Υ_{2}}{\sqrt{1 + α_{2}^{2}}} \end{matrix}

(19)

Notably, the VGPDO estimation error exhibits asymptotic stability under constant load torque and flux linkage conditions.

Furthermore, the parameter δ in (10) remains bounded (the detailed formulation is given in (A8)).

Proof of Theorem 1.

The complete mathematical derivations are provided in Appendix A.1. □

4. Actor-Critic Network-Based Optimal Controller Design

The optimal controller u is designed to guarantee speed tracking while minimizing the following cost function:

V (x_{0}) = \int_{0}^{\infty} r (x (τ), u (τ)) d τ

(20)

where $r (x, u) = Q (x) + u^{T} R u$, $Q (x)$ is a positive definite function, $R$ is a symmetric positive definite constant matrix, and $x_{0}$ denotes the initial state of $x$.

According to [19], the minimization of this cost function can be achieved through the Hamiltonian function. The Hamiltonian corresponding to the cost function $V (x)$ is:

H (x, u, V_{x}) = r (x, u) + V_{x}^{T} [f (x) + g u + δ]

(21)

where $V_{x}$ is defined as $V_{x} ≜ \frac{d V}{d x}$.

Define the optimal cost function as:

V^{*} (x_{0}) = min_{u} (\int_{0}^{\infty} r (x (τ), u (x (τ))) d τ)

(22)

According to [28], the optimal controller $u$ and the gradient of the optimal cost function, $V_{x}^{*}$, satisfy:

0 = Q (x) + u^{T} R u + V_{x}^{* T} (f (x) + g u + δ)

(23)

The optimal control input u can be solved as:

u = \arg min_{μ} [H (x, μ, V_{x})] = - \frac{1}{2} R^{- 1} g^{T} V_{x}^{*} .

(24)
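Equation (24) follows from setting the derivative of the Hamiltonian (21) with respect to the input to zero, $2 R u + g^{T} V_{x}^{*} = 0$. The sketch below evaluates this minimizer for the input matrix $g$ of (10) and a diagonal $R$, and exposes the input-dependent part of the Hamiltonian so the minimum can be checked. The function names and the diagonal-$R$ restriction are assumptions made to keep the example short.

```python
def optimal_input(V_x, m7, R_diag):
    """u = -1/2 R^{-1} g^T V_x for g = [[0, 0], [m7, 0], [0, m7]] and diagonal R."""
    gT_Vx = [m7 * V_x[1], m7 * V_x[2]]  # g^T V_x picks the two current components
    return [-0.5 * gT_Vx[i] / R_diag[i] for i in range(2)]


def hamiltonian_input_part(u, V_x, m7, R_diag):
    """Input-dependent part of H in Eq. (21): u^T R u + V_x^T g u."""
    quadratic = sum(R_diag[i] * u[i] ** 2 for i in range(2))
    linear = m7 * (V_x[1] * u[0] + V_x[2] * u[1])
    return quadratic + linear


# Illustrative gradient value (assumed) with the R used later in the simulations.
u_star = optimal_input([1.0, 0.02, -0.03], 200.0, [0.5, 1.0])
```

Any perturbation of `u_star` increases `hamiltonian_input_part`, because $R$ is positive definite and the function is therefore strictly convex in $u$.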

Based on [28], the cost function $V (x)$ can be expanded as an infinite series of basis functions $ζ_{i}$ multiplied by their corresponding coefficients $c_{i}$:

\begin{matrix} V (x) = \sum_{i = 1}^{\infty} c_{i} ζ_{i} (x) = \sum_{i = 1}^{N} c_{i} ζ_{i} (x) + \sum_{i = N + 1}^{\infty} c_{i} ζ_{i} (x) \end{matrix}

(25)

For notational simplicity, we introduce:

V (x) ≜ C_{1}^{T} ζ + \sum_{i = N + 1}^{\infty} c_{i} ζ_{i} (x)

(26)

where $C_{1} ≜ {[\begin{matrix} c_{1} & c_{2} & \dots & c_{N} \end{matrix}]}^{T}$ and $ζ ≜ {[\begin{matrix} ζ_{1} & ζ_{2} & \dots & ζ_{N} \end{matrix}]}^{T}$.

According to [22,29], there exist unknown neural network weights $W_{c}$ and a neural network basis function vector $ζ (x)$ that can approximate the cost function $V (x)$ as:

V (x) = W_{c}^{T} ζ (x) + ε (x)

(27)

where $ζ : R^{n} \to R^{N}$, $N$ represents the number of neurons in the hidden layer, and $ε$ is the approximation error.

Substituting (27) into the Hamiltonian function (21), we obtain:

H (x, u, W_{c}) = W_{c}^{T} \nabla ζ (f + g u) + Q (x) + u^{T} R u = ε_{H}

(28)

where the residual error $ε_{H}$ is expressed as:

\begin{matrix} ε_{H} & = - {(\nabla ε)}^{T} (f + g u) - W_{c}^{T} \nabla ζ δ \\ = - {(C_{1} - W_{c})}^{T} \nabla ζ (f + g u) - \sum_{i = N + 1}^{\infty} c_{i} \nabla ζ_{i} (x) (f + g u) - W_{c}^{T} \nabla ζ δ \end{matrix}

(29)

Similarly, the estimated cost function $\hat{V} (x)$ obtained using the estimated weights ${\hat{W}}_{c}$ can be expressed as:

\hat{V} (x) = {\hat{W}}_{c}^{T} ζ (x)

(30)

and the corresponding Hamiltonian function of $\hat{V} (x)$ is given by:

H (x, u, {\hat{W}}_{c}) = {\hat{W}}_{c}^{T} \nabla ζ (f + g u) + Q (x) + u^{T} R u = e_{1}

(31)

e_{1} = - {\tilde{W}}_{c}^{T} \nabla ζ (f + g u) + ε_{H}

(32)

where ${\tilde{W}}_{c} ≜ W_{c} - {\hat{W}}_{c}$.

To minimize the estimation error ${\tilde{W}}_{c}$, we define $E_{1} = \frac{1}{2} e_{1}^{T} e_{1}$. Following [19], the normalized gradient descent algorithm yields:

{\dot{\hat{W}}}_{c} = - a_{1} \frac{\partial E_{1}}{\partial {\hat{W}}_{c}} = - a_{1} \frac{σ_{1}}{{(σ_{1}^{T} σ_{1} + 1)}^{2}} [σ_{1}^{T} {\hat{W}}_{c} + Q (x) + u^{T} R u]

(33)

where $a_{1}$ is a tunable parameter and $σ_{1} = \nabla ζ (f + g u)$, while ${(σ_{1}^{T} σ_{1} + 1)}^{2}$ is used for normalization.
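The critic law (33) can be sketched in discrete time as follows. The forward-Euler discretization, the step size, and the function name are assumptions introduced for this example; the text itself states only the continuous-time law.

```python
def critic_step(W_c_hat, sigma1, running_cost, a1, dt):
    """One forward-Euler step of the normalized gradient law (33).

    running_cost stands for Q(x) + u^T R u evaluated at the current sample.
    Returns the updated weights and the residual e_1 computed before the update.
    """
    e1 = sum(s * w for s, w in zip(sigma1, W_c_hat)) + running_cost
    normalizer = (sum(s * s for s in sigma1) + 1.0) ** 2
    new_weights = [w - dt * a1 * s / normalizer * e1
                   for w, s in zip(W_c_hat, sigma1)]
    return new_weights, e1


# Repeated steps on a frozen sample (assumed values) shrink the residual e_1.
weights = [0.0, 0.0]
for _ in range(1000):
    weights, residual = critic_step(weights, [1.0, 2.0], 3.0, a1=10.0, dt=0.01)
```

On a frozen sample, each step multiplies the residual by $1 - a_{1} Δt \, σ_{1}^{T} σ_{1} / {(σ_{1}^{T} σ_{1} + 1)}^{2}$, so the residual decays geometrically as long as this factor lies between 0 and 1.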

To achieve optimal control, a neural network-based estimator for the optimal controller, $u_{2}$, is given by:

u_{2} (x) = - \frac{1}{2} R^{- 1} g^{T} (x) \nabla ζ^{T} {\hat{W}}_{a}

(34)

where ${\hat{W}}_{a}$ denotes the estimate of $W_{c}$, with the estimation error of $W_{a}$ defined as ${\tilde{W}}_{a} ≜ W_{a} - {\hat{W}}_{a}$.

The update law of ${\hat{W}}_{a}$ is designed as:

{\dot{\hat{W}}}_{a} = - a_{2} (η_{a} {\hat{W}}_{a} - η_{c} {\hat{W}}_{c} - \frac{1}{4} G {\hat{W}}_{a} m^{T} {\hat{W}}_{c})

(35)

where $a_{2}$ is a tunable parameter and:

G ≜ \nabla ζ g R^{- 1} g^{T} \nabla ζ^{T}

(36)

σ_{2} ≜ \nabla ζ (f + g u_{2})

(37)

m ≜ \frac{σ_{2}}{{(σ_{2}^{T} σ_{2} + 1)}^{2}}

(38)
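The right-hand side of the actor law (35) can be sketched as a plain function of the current weights, with $G$ and $m$ supplied from (36) and (38). The function name and the nested-list representation are assumptions made for illustration.

```python
def actor_rate(W_a, W_c, G, m, a2, eta_a, eta_c):
    """Right-hand side of Eq. (35): time derivative of the actor weights.

    G is an N-by-N nested list (Eq. (36)); m is a length-N list (Eq. (38)).
    """
    n = len(W_a)
    G_Wa = [sum(G[i][j] * W_a[j] for j in range(n)) for i in range(n)]
    m_Wc = sum(mi * wi for mi, wi in zip(m, W_c))  # scalar m^T W_c
    return [-a2 * (eta_a * W_a[i] - eta_c * W_c[i] - 0.25 * G_Wa[i] * m_Wc)
            for i in range(n)]
```

When the coupling term vanishes and $η_{a} = η_{c}$, the law reduces to a first-order pull of ${\hat{W}}_{a}$ toward ${\hat{W}}_{c}$, which is zero exactly when the two weight vectors coincide.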

Assumption 2.

1. f(x) is Lipschitz, and g(x) is bounded by a constant: $\begin{matrix} ‖ f (x) ‖ < γ_{f} ‖ x ‖ \\ ‖ g (x) ‖ < γ_{g} \end{matrix}$
2. The NN approximation error and its gradient are bounded by: $\begin{matrix} ∥ ε ∥ < γ_{ε} \\ ∥ \nabla ε ∥ < γ_{ε_{x}} \end{matrix}$
3. The NN basis functions and their gradients are bounded by: $\begin{matrix} ∥ ζ (x) ∥ < γ_{ζ} \\ ∥ \nabla ζ (x) ∥ < γ_{ζ x} \end{matrix}$

Remark 2.

These assumptions are fundamentally grounded in the optimal control framework based on ADP [30,31,32], with rigorous theoretical foundations established in [28] and practical validations demonstrated in [20].

Theorem 2.

For the PMSM model (1) equipped with the feedforward controller (7) and the feedback controller (34), the update law of the critic neural network is designed as:

{\dot{\hat{W}}}_{c} = - a_{1} \frac{σ_{2}}{{(σ_{2}^{T} σ_{2} + 1)}^{2}} (σ_{2}^{T} {\hat{W}}_{c} + Q + u_{2}^{T} R u_{2}),

(39)

and the updating law of the actor neural network as:

{\dot{\hat{W}}}_{a} = - a_{2} (η_{a} {\hat{W}}_{a} - η_{c} {\hat{W}}_{c} - \frac{1}{4} G {\hat{W}}_{a} m^{T} {\hat{W}}_{c}),

(40)

where $η_{c} > 0$ and $η_{a} > 0$ are tuning parameters satisfying the LMI:

[\begin{matrix} ϵ I & 0 & 0 \\ 0 & I & {(- \frac{1}{2} η_{c} - \frac{1}{8 m_{s}} G W_{c})}^{T} \\ 0 & - \frac{1}{2} η_{c} - (\frac{1}{8 m_{s}} G W_{c}) & η_{a} - \frac{1}{8} (G W_{c} m^{T} + m W_{c}^{T} G) \end{matrix}] > 0

(41)

If Assumption 2 holds and there exists a positive integer $N_{0}$ such that the number of hidden-layer units satisfies $N > N_{0}$, then the error dynamic system state $x$, the critic neural network approximation error ${\tilde{W}}_{c}$, and the actor neural network approximation error ${\tilde{W}}_{a}$ are uniformly ultimately bounded.

Proof of Theorem 2.

The complete mathematical derivations are provided in Appendix A.2. □

Figure 1 illustrates the complete flowchart of the proposed robust optimal control strategy for clearer demonstration.

5. Simulation Results

To evaluate the proposed approach, Software-in-the-Loop (SIL) simulations of both the VGPDO algorithm and the robust optimal control strategy were conducted in the MATLAB/Simulink (R2023b) environment. The PMSM simulation parameters are consistent with those in [18,19], with detailed values presented in Table 1.

In simulation case 1, to demonstrate the effectiveness of the proposed VGPDO capable of simultaneous torque and flux estimation, two benchmark methods are employed for comparison: (1) a sliding-mode flux observer (SMO) algorithm from [25], which has been adopted by Simulink as a standard benchmark module, and (2) a load torque nonlinear disturbance observer (NDO) from [19]. Notably, the NDO explicitly accounts for demagnetization effects through compensation, where the compensation quantity is derived from the flux linkage estimation provided by SMO.

The load torque profile is configured as: (1) 0–5 s: linear ramp from 0 N·m to 2 N·m; (2) 5–30 s: linear decrease from 2 N·m to 1 N·m; (3) 30–40 s: sinusoidal oscillation (mean: 1 N·m, amplitude: 1 N·m, frequency: $2π$ rad/s); (4) 40–60 s: constant torque at 1 N·m. The flux linkage variation includes three phases: (1) 0–2 s: maintained at initial value; (2) 2–40 s: decay at a rate of 0.6% of $ψ_{s}$ per second; (3) 40–60 s: held constant. Additionally, an abrupt 20% flux linkage reduction is introduced at 40 s to emulate abrupt demagnetization under fault conditions. To intuitively demonstrate the performance of the proposed VGPDO algorithm, a simulation was first conducted under ideal, noise-free conditions; the results are presented in Figure 2 and Figure 3.
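The disturbance profiles of simulation case 1 can be written as two piecewise functions of time. The sketch below is one possible reading of the description: the sinusoid is assumed to start with zero phase at 30 s, and the abrupt 20% reduction at 40 s is assumed to be 20% of $ψ_{s}$ applied on top of the accumulated decay. Neither detail is stated in the text, and the function names are hypothetical.

```python
import math


def load_torque_case1(t):
    """Load torque in N*m at time t (s) for simulation case 1."""
    if t < 5.0:
        return 2.0 * t / 5.0                    # ramp 0 -> 2
    if t < 30.0:
        return 2.0 - (t - 5.0) / 25.0           # linear decrease 2 -> 1
    if t < 40.0:
        # mean 1, amplitude 1, angular frequency 2*pi rad/s (zero phase assumed)
        return 1.0 + math.sin(2.0 * math.pi * (t - 30.0))
    return 1.0                                  # constant until 60 s


def flux_ratio_case1(t):
    """Flux linkage divided by its initial value psi_s at time t (s)."""
    if t < 2.0:
        return 1.0
    ratio = 1.0 - 0.006 * (min(t, 40.0) - 2.0)  # 0.6% of psi_s per second
    if t >= 40.0:
        ratio -= 0.20                           # assumed: step of 20% of psi_s
    return ratio
```

Under these assumptions the flux linkage ends at about 57% of $ψ_{s}$ after 40 s, since the gradual decay alone removes 22.8% by that time.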

The flux linkage estimation results of the two observers are compared in Figure 2. During t = 0–40 s, the proposed VGPDO demonstrates superior dynamic tracking performance for flux estimation compared to the SMO proposed in [25], with no significant chattering phenomena observed. Following the abrupt flux linkage demagnetization event at t = 40 s, the SMO exhibits significant high-amplitude chattering in tracking. In contrast, VGPDO demonstrates smoother and more stable tracking performance. Ultimately, when both flux linkage and load torque stabilize after t = 40 s, VGPDO demonstrates a smooth transient response with rapid convergence, achieving asymptotic stability.

The torque estimation performance of different observers is shown in Figure 3. The NDO proposed in [19] achieves relatively accurate load torque estimation in the early period. However, with progressive deterioration of the flux linkage, its estimation error increases steadily, eventually causing complete observer failure. The SMO-compensated NDO achieves satisfactory load torque tracking but exhibits persistent chattering, which intensifies dramatically under severe flux linkage degradation at t = 40 s. In contrast, the proposed VGPDO maintains smooth and stable load torque tracking throughout.

To further evaluate the performance of the proposed VGPDO algorithm and demonstrate its robustness against measurement noise, the simulation study was extended to include three distinct noise variance levels, building upon the initial noise-free results. A comparative summary of the performance metrics across all conditions is provided in Table 2.

Remark 3.

The terms $δ_{w}^{2}$, $δ_{i q}^{2}$, and $δ_{i d}^{2}$ denote the measurement noise variances for the angular velocity, q-axis current, and d-axis current, respectively. The maximum settling time is defined as the duration required for the observer’s estimation errors of both load torque and flux linkage to fall and remain below 2% after a step disturbance is applied to the PMSM at t = 40 s. The term "N/A" denotes a non-existing value. The CPU utilization was calculated in a SIL environment by converting the algorithm’s floating-point operation count into the equivalent execution time on a 180 MHz processor.

As shown in Table 2, under various noise conditions, the proposed VGPDO achieves smaller maximum estimation errors and lower RMSE for both load torque and flux, demonstrating better tracking accuracy. Meanwhile, the proposed VGPDO also exhibits a shorter maximum settling time, indicating a faster dynamic response.

Additionally, the proposed VGPDO algorithm demonstrates robustness to measurement noise, as evidenced by the minimal variation in its performance metrics across the first three different noise levels. Moreover, under different measurement noise variances, the CPU utilization of the proposed VGPDO algorithm showed minimal change. This suggests that the level of measurement noise has no strong correlation with its computational burden, and thus did not significantly increase the real-time computational demand.

To demonstrate the efficacy of the tuned parameters $α_{1}$ and $α_{2}$ in VGPDO, a sensitivity analysis was conducted under measurement noise variances of $δ_{w}^{2} = 1$, $δ_{i_{q}} = 10^{- 3}$ and $δ_{i_{d}} = 10^{- 3}$; the results are summarized in Table 3.

As shown in Table 3, the settling time is approximately inversely proportional to the parameters $α_{1}$ and $α_{2}$. When $α_{1}$ and $α_{2}$ are increased from 10 to 100, the settling time of the VGPDO decreases from 0.3921 s to 0.0403 s, indicating a substantial improvement in convergence speed and dynamic tracking performance. This trend is further corroborated by the monotonic decrease in the RMSE for both the load torque (from $1.520 \times 10^{- 2}$ to $2.986 \times 10^{- 3}$) and the flux linkage (from $1.116 \times 10^{- 3}$ to $3.590 \times 10^{- 4}$).

However, when $α_{1} = α_{2} = 1000$, the maximum estimation error increases. This suggests that excessively large parameters degrade the algorithm’s filtering effectiveness against measurement noise, consequently reducing estimation accuracy.

Therefore, selecting $α_{1}$ and $α_{2}$ necessitates a trade-off between estimation accuracy and dynamic response speed. Systematic tuning is required to determine the values that achieve the best overall performance of the VGPDO algorithm.
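The reported dependence of the settling time on the gains can be compared with a first-order estimate. Assuming that condition (17) holds with equality, so that (15) decouples into scalar equations of the form $\dot{\tilde{d}}_{i} = - α_{i} \tilde{d}_{i} + \dot{d}_{i}$, and assuming a constant disturbance with no measurement noise, each error decays as $e^{- α_{i} t}$ and enters a 2% band after $\ln (50) / α_{i}$. The sketch below evaluates this idealized estimate; it is a simplification introduced here, not the procedure used to produce Table 3.

```python
import math


def first_order_settling_time(alpha, band=0.02):
    """Time for exp(-alpha * t) to fall to the given fraction of its initial value."""
    return math.log(1.0 / band) / alpha


for alpha in (10.0, 100.0):
    print(alpha, round(first_order_settling_time(alpha), 4))
```

The idealized values, roughly 0.391 s and 0.039 s, are close to the 0.3921 s and 0.0403 s reported for gains of 10 and 100, which is consistent with an inverse proportionality between gain and settling time.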

In simulation case 2, to validate the effectiveness of the proposed robust optimal control strategy integrating the VGPDO and the actor-critic neural network-based optimal speed controller (VGPDO-AC), we select as the benchmark the optimal control strategy from [18], which integrates an ESO with an actor-critic neural network-based optimal speed controller (ESO-AC). This comparative strategy employs the ESO for system disturbance estimation and compensation, while utilizing an actor-critic neural network to obtain the optimal control policy. Additionally, an $H_{\infty}$ robust control algorithm that incorporates flux compensation is adopted as a second benchmark.

The load torque profile is configured as: (1) 0–2 s: constant at 1.5 N·m; (2) 2–10 s: linear increase from 1.5 N·m to 1.8 N·m; (3) 10–15 s: rectangular wave (mean: 1.8 N·m, amplitude: ±0.3 N·m, period: 2 s). The flux linkage variation includes: (1) 0–2 s: maintained at initial value $ψ_{s}$; (2) 2–15 s: decay at a rate of 2% of $ψ_{s}$ per second. Additionally, an abrupt 20% reduction is introduced at 7 s to emulate fault conditions. The PMSM reference speed $ω_{d}$ is set as a step command of 1000 RPM. The simulation is conducted with measurement noise variances of $δ_{w}^{2} = 1$, $δ_{i_{q}} = 10^{- 3}$, and $δ_{i_{d}} = 10^{- 3}$.

For the actor-critic neural network, the weight matrix for state deviations is given by $Q = [\begin{matrix} 100 & 0 & 0 \\ 0 & 100 & 0 \\ 0 & 0 & 100 \end{matrix}]$. The weight matrix for the input is given by $R = [\begin{matrix} 0.5 & 0 \\ 0 & 1 \end{matrix}]$. The basis functions are selected as $ζ (x) = {[\begin{matrix} x_{1} x_{2} & x_{2} x_{3} & x_{1} x_{3} & x_{1}^{2} & x_{2}^{2} & x_{3}^{2} \end{matrix}]}^{T}$. The tunable parameters $η_{c}$ and $η_{a}$ are both set to 1.
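The quadratic basis above and the gradient $\nabla ζ$ that appears in (28) to (38) can be sketched as follows. The function names are assumptions; the Jacobian rows are obtained by differentiating each basis function with respect to $x_{1}$, $x_{2}$, and $x_{3}$.

```python
def basis(x):
    """Basis vector zeta(x) with the six quadratic terms selected in the text."""
    x1, x2, x3 = x
    return [x1 * x2, x2 * x3, x1 * x3, x1 ** 2, x2 ** 2, x3 ** 2]


def basis_jacobian(x):
    """6-by-3 Jacobian of zeta(x): row i holds the partial derivatives of zeta_i."""
    x1, x2, x3 = x
    return [
        [x2, x1, 0.0],
        [0.0, x3, x2],
        [x3, 0.0, x1],
        [2.0 * x1, 0.0, 0.0],
        [0.0, 2.0 * x2, 0.0],
        [0.0, 0.0, 2.0 * x3],
    ]


def value_estimate(weights, x):
    """Estimated cost of Eq. (30): weights^T zeta(x)."""
    return sum(w * z for w, z in zip(weights, basis(x)))
```

Because every basis function is quadratic, the estimated cost is a quadratic form in the error state, and a central finite difference reproduces `basis_jacobian` up to rounding.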

The simulation results are presented in Figure 4, Figure 5, Figure 6, Figure 7 and Figure 8. The corresponding performance metrics are listed in Table 4.

As summarized in Table 4, the proposed VGPDO-AC strategy achieves the highest steady-state accuracy and fastest dynamic response, as evidenced by its smallest steady-state error (0.01217%), shortest settling time (0.0676 s), and minimum RMSE (27.3861) among all control strategies. Furthermore, its energy consumption ($1.396 \times 10^{4}$ J) is lower than that of the $H_{\infty}$ control ($1.424 \times 10^{4}$ J). Although its energy consumption is slightly higher than that of the ESO-AC strategy, the VGPDO-AC strategy combines the best tracking metrics with competitive energy use.

To demonstrate the robustness of the proposed VGPDO-AC control strategy, the dynamic speed responses of the PMSM under flux linkage and load torque step changes are illustrated in Figure 9 and Figure 10, respectively, with the corresponding performance metrics provided in Table 5 and Table 6.

As shown in Table 5 and Table 6, the proposed VGPDO-AC control strategy achieves both the lowest RMSE and the minimum maximum deviation under flux linkage and load torque disturbances. These results demonstrate that the proposed VGPDO-AC control strategy exhibits enhanced robustness against flux-weakening and abrupt load changes.

Based on the results from simulation case 2 under the step speed command scenario during the first second, a sensitivity analysis of the tunable parameters

η_{a}

and

η_{c}

is conducted in Table 7 and Table 8 to quantify their effects on key performance metrics.

Remark 4.

The settling time of ${\hat{W}}_{a}$ is defined as the earliest time $t_{settling}$ such that, for all $t > t_{settling}$, we have: $\frac{∥{\hat{W}}_{a} (t) - {\hat{W}}_{c} (t)∥}{∥{\hat{W}}_{c} (t)∥} \leq 2\%$.
The percentage deviation of ${\hat{W}}_{a}$ is used to evaluate its tracking performance relative to ${\hat{W}}_{c}$ and is computed as:

$\max_{t \geq t_{steady}} ∥\frac{{\hat{W}}_{c} (t) - {\hat{W}}_{a} (t)}{{\hat{W}}_{c} (t)}∥ \times 100\%$

where $t_{steady}$ denotes the time instant at which the system state enters the steady-state regime, and is computed as twice the value of $t_{settling}$.
The steady-state estimation error on $\dot{\hat{V}} (x)$ is used to evaluate how closely ${\hat{W}}_{c}$ approximates the actual cost function after ${\hat{W}}_{c}$ has converged, and is calculated as:

$\max_{t \geq t_{steady}} ∥\frac{\dot{V} (x (t)) - \dot{\hat{V}} (x (t))}{\dot{V} (x (t))}∥ \times 100\%$
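The metrics in Remark 4 can be evaluated on sampled weight trajectories roughly as follows. This is a sketch under stated assumptions: the trajectories are stored as arrays of shape (T, N), the vector ratio in the percentage deviation is interpreted as the norm ratio used in the settling-time definition, and the settling time is approximated by the first sample from which the gap stays inside the 2% band. All function names are hypothetical.

```python
import numpy as np


def relative_gap(W_a, W_c):
    """Per-sample ratio ||W_a - W_c|| / ||W_c|| for arrays of shape (T, N)."""
    return np.linalg.norm(W_a - W_c, axis=1) / np.linalg.norm(W_c, axis=1)


def settling_time(t, W_a, W_c, tol=0.02):
    """First sample time after which the relative gap never exceeds `tol`.

    Returns None if the gap is still outside the band at the last sample.
    """
    gap = relative_gap(W_a, W_c)
    outside = np.nonzero(gap > tol)[0]
    if outside.size == 0:
        return float(t[0])
    last = int(outside[-1])
    if last == len(t) - 1:
        return None
    return float(t[last + 1])


def percentage_deviation(t, W_a, W_c, tol=0.02):
    """Maximum relative gap (in percent) for t >= t_steady = 2 * t_settling."""
    ts = settling_time(t, W_a, W_c, tol)
    if ts is None:
        return None
    steady = t >= 2.0 * ts
    if not steady.any():
        return None
    return float(100.0 * relative_gap(W_a, W_c)[steady].max())
```

The steady-state estimation error on the cost derivative can be computed in the same way by replacing the weight trajectories with the two scalar derivative signals.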

As shown in Table 8, the steady-state estimation error of

\dot{\hat{V}} (x)

remains below 2% with no significant variation. This confirms that the critic neural network weight

{\hat{W}}_{c}

in the proposed VGPDO-AC algorithm achieves an excellent approximation of the cost function

V (x)

. Furthermore, the CPU usage stays between approximately 52% and 60% for all tested settings, which indicates that the computational load of the proposed algorithm is manageable and suggests its potential for practical deployment on processors with 180 MHz clock frequencies at 5 kHz control rates.

Meanwhile, as shown in Table 7 and Table 8, as parameters

η_{a}

and

η_{c}

increase, the dynamic performance of both the PMSM speed control and the neural network weight updates improves, as evidenced by the monotonic decrease in the speed RMSE, the speed settling time, the percentage deviation of

{\hat{W}}_{a}

, and the settling time of

{\hat{W}}_{a}

. This indicates that appropriately increasing

η_{a}

and

η_{c}

is beneficial for enhancing both the dynamic response rate of PMSM speed control and the convergence rate of

{\hat{W}}_{a}

. However, this comes at the cost of a larger steady-state speed error (Table 7). Therefore, the selection of

η_{a}

and

η_{c}

should be based on the specific control performance requirements.

6. Conclusions

In this paper, a VGPDO is proposed for PMSMs that simultaneously estimates the flux linkage and the load torque. Simulation results demonstrate that the VGPDO tracks flux linkage and load torque variations smoothly and rapidly, while guaranteeing asymptotic convergence of the estimation errors once the flux linkage and load torque reach steady-state conditions. Furthermore, a robust optimal control framework is developed that integrates the VGPDO estimates through feedforward compensation, which transforms the PMSM dynamics into an error dynamic system. An actor-critic neural network is then employed to achieve optimal control. The simulation studies confirm that the proposed control strategy achieves optimal control while exhibiting superior dynamic tracking performance with minimal steady-state error and maintaining robustness against both flux-weakening operation and abrupt load torque variations.

Author Contributions

Conceptualization, Y.N.; methodology, Y.N.; software, Y.N.; validation, Y.N., H.S.; formal analysis, Y.N., H.S.; writing—original draft, Y.N.; writing—review and editing, Y.N.; supervision, Y.N. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no funding.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

PMSM	Permanent magnet synchronous motor
UUB	Uniform ultimate boundedness
ADP	Adaptive dynamic programming
RL	Reinforcement learning
VGPDO	Variable-gain proportional disturbance observer
AC	Actor-critic neural network

Appendix A

Appendix A.1

By substituting (18) into (15), we obtain:

\{\begin{matrix} {\dot{\tilde{T}}}_{L} = - α_{1} {\tilde{T}}_{L} + {\dot{T}}_{L} \\ Δ \dot{\tilde{ψ}} = - α_{2} Δ \tilde{ψ} + Δ \dot{ψ} \end{matrix}

(A1)

Therefore, based on Assumption 1, the estimation error concerning flux linkage deterioration and disturbance torque can be derived as:

\{\begin{matrix} ‖ {\tilde{T}}_{L} ‖ ⩽ \frac{Υ_{1}}{\sqrt{1 + α_{1}^{2}}} \\ ‖ Δ \tilde{ψ} ‖ ⩽ \frac{Υ_{2}}{\sqrt{1 + α_{2}^{2}}} \end{matrix}

(A2)

Notably, when the rate of change for both flux linkage and load torque becomes zero (i.e.,

{\dot{T}}_{L} = 0

and

Δ \dot{ψ} = 0

), it follows from (A1) that the estimation errors

{\tilde{T}}_{L}

and

Δ \tilde{ψ}

converge asymptotically to zero.
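The zero-rate case of (A1) can be checked numerically. The sketch below integrates the scalar error equation e' = -alpha * e + d(t) with forward Euler, assuming a constant gain alpha (the variable-gain schedule of the observer is not modelled here) and taking d as the rate of change of the estimated quantity. The function name and default step size are illustrative.

```python
import math


def simulate_error(alpha, e0, d=lambda t: 0.0, dt=1e-4, t_end=1.0):
    """Forward-Euler integration of e' = -alpha * e + d(t); returns e(t_end).

    alpha : observer gain, assumed constant in this sketch
    e0    : initial estimation error, e.g. the size of a step change
    d     : rate of change of the true quantity (zero after a step)
    """
    e, t = e0, 0.0
    for _ in range(int(round(t_end / dt))):
        e += dt * (-alpha * e + d(t))
        t += dt
    return e


if __name__ == "__main__":
    for alpha in (10.0, 50.0, 100.0):
        e_sim = simulate_error(alpha, e0=1.0, t_end=0.1)
        e_exact = math.exp(-alpha * 0.1)   # closed-form solution for d = 0
        print(f"alpha={alpha:6.1f}  simulated={e_sim:.3e}  exact={e_exact:.3e}")
```

With d equal to zero the error decays as exp(-alpha * t), so a larger gain shortens the transient that follows a step in load torque or flux linkage.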

Furthermore, the following expression can be derived:

‖ \frac{{\tilde{m}}_{1}}{{\hat{m}}_{1}} ‖ = ‖ \frac{Δ \tilde{ψ}}{ψ_{s} + Δ \hat{ψ} + Δ \tilde{ψ}} ‖ ⩽ \frac{\frac{Υ_{2}}{\sqrt{1 + α_{2}^{2}}}}{ψ_{s} + \frac{Υ_{2}}{\sqrt{1 + α_{2}^{2}}}} = \frac{Υ_{2}}{ψ_{s} \sqrt{1 + α_{2}^{2}} + Υ_{2}}

(A3)

Simultaneously, the following relationship holds:

\begin{matrix} ‖ m_{2} ω_{d} + m_{3} {\hat{T}}_{L} + {\dot{ω}}_{d} ‖ & = ‖ m_{2} ω_{d} + m_{3} T_{L} + {\dot{ω}}_{d} - m_{3} {\tilde{T}}_{L} ‖ \\ ⩽ ‖ m_{2} γ_{3} + m_{3} γ_{1} + Υ_{3} - m_{3} \frac{Υ_{1}}{\sqrt{1 + α_{1}^{2}}} ‖ \end{matrix}

(A4)

By combining inequalities (A2)–(A4) with Assumption 1, the following can be obtained:

\begin{matrix} ‖ - m_{3} {\tilde{T}}_{L} + \frac{{\tilde{m}}_{1}}{{\hat{m}}_{1}} (m_{2} ω_{d} + m_{3} {\hat{T}}_{L} + {\dot{ω}}_{d}) ‖ ⩽ δ_{1} \end{matrix}

(A5)

where:

\begin{matrix} δ_{1} ≜ \sqrt{{(m_{3} \frac{Υ_{1}}{\sqrt{1 + α_{1}^{2}}})}^{2} + {(\frac{Υ_{2}}{ψ_{s} \sqrt{1 + α_{2}^{2}} + Υ_{2}} (m_{2} γ_{3} + m_{3} γ_{1} + Υ_{3} - m_{3} \frac{Υ_{1}}{\sqrt{1 + α_{1}^{2}}}))}^{2}} \end{matrix}

(A6)

From inequality (A2) and Assumption 1, it can be readily derived that:

\begin{matrix} ‖ {\tilde{m}}_{5} ω_{d} ‖ ⩽ γ_{3} ‖ \frac{n_{p} Δ \tilde{ψ}}{L} ‖ ⩽ \frac{n_{p} γ_{3} Υ_{2}}{L \sqrt{1 + α_{2}^{2}}} \end{matrix}

(A7)

Finally, combining inequalities (A5) and (A7), we obtain the bound on

δ

as:

‖ δ ‖ ⩽ δ_{m}

(A8)

where:

δ_{m} ≜ \sqrt{δ_{1}^{2} + \frac{n_{p}^{2} γ_{3}^{2} Υ_{2}^{2}}{L^{2} (1 + α_{2}^{2})}}

(A9)
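The chain of bounds (A2), (A3), (A4), (A6), (A7) and (A9) is plain arithmetic once the constants are fixed, so it can be evaluated with a few lines of code. The sketch below mirrors those expressions term by term. The argument names are illustrative, and every numerical value other than the Table 1 entries (flux linkage 0.0192 Wb, 4 pole pairs, inductance 0.4 mH) is a hypothetical placeholder, not a value from the paper.

```python
import math


def observer_bound(upsilon, alpha):
    """Right-hand side of (A2): upsilon / sqrt(1 + alpha^2)."""
    return upsilon / math.sqrt(1.0 + alpha**2)


def delta_m(m2, m3, gamma1, gamma3, ups1, ups2, ups3,
            alpha1, alpha2, psi_s, n_p, L):
    """Evaluate the disturbance bound delta_m of (A9)."""
    t_err = observer_bound(ups1, alpha1)                      # (A2), load torque
    psi_err = observer_bound(ups2, alpha2)                    # (A2), flux linkage
    ratio = psi_err / (psi_s + psi_err)                       # (A3)
    ff = abs(m2 * gamma3 + m3 * gamma1 + ups3 - m3 * t_err)   # (A4)
    delta1 = math.hypot(m3 * t_err, ratio * ff)               # (A6)
    m5_term = n_p * gamma3 * psi_err / L                      # (A7)
    return math.hypot(delta1, m5_term)                        # (A9)


if __name__ == "__main__":
    # psi_s, n_p and L come from Table 1; all other numbers are placeholders.
    bound = delta_m(m2=0.5, m3=1.0e3, gamma1=1.0, gamma3=100.0,
                    ups1=1.0, ups2=1.0e-3, ups3=10.0,
                    alpha1=50.0, alpha2=50.0,
                    psi_s=0.0192, n_p=4, L=0.4e-3)
    print(f"delta_m = {bound:.4g}")
```

Evaluating the bound for several gain values gives a quick feel for how the observer gains enter the disturbance bound used in the stability analysis.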

Appendix A.2

Substituting (24) into (23), we have:

0 = Q (x) + V_{x}^{* T} (x) f (x) - \frac{1}{4} V_{x}^{* T} (x) g (x) R^{- 1} g^{T} (x) V_{x}^{*} (x) + V_{x}^{* T} (x) δ

(A10)

Substituting (27) into (A10), we have:

W_{c}^{T} \nabla ζ f - \frac{1}{4} W_{c}^{T} \nabla ζ g R^{- 1} g^{T} \nabla ζ^{T} W_{c} + Q (x) = ε_{H J B}

(A11)

where

ε_{H J B} = - \nabla ε^{T} f + \frac{1}{2} W_{c}^{T} \nabla ζ g R^{- 1} g^{T} \nabla ε + \frac{1}{4} \nabla ε^{T} g R^{- 1} g^{T} \nabla ε - W_{c}^{T} \nabla ζ δ - \nabla ε^{T} δ

. Substituting (36) and

σ_{1} = \nabla ζ (f + g u)

into (A11) yields:

W_{c}^{T} σ_{1} = - Q (x) - \frac{1}{4} W_{c}^{T} G (x) W_{c} + ε_{H J B} (x)

(A12)

The overall Lyapunov function candidate is given by:

L = L_{V} (x) + L_{c} (x) + L_{a} (x) = \frac{1}{2} x^{T} x + \frac{1}{2} t r ({\tilde{W}}_{c}^{T} a_{1}^{- 1} {\tilde{W}}_{c}) + \frac{1}{2} t r ({\tilde{W}}_{a}^{T} a_{2}^{- 1} {\tilde{W}}_{a})

(A13)

where we have:

L_{V} (x) ≜ \frac{1}{2} x^{T} x

,

L_{c} (x) ≜ \frac{1}{2} t r ({\tilde{W}}_{c}^{T} a_{1}^{- 1} {\tilde{W}}_{c})

,

L_{a} (x) ≜ \frac{1}{2} t r ({\tilde{W}}_{a}^{T} a_{2}^{- 1} {\tilde{W}}_{a})

.

The derivative of

L_{V}

is written as:

\begin{matrix} {\dot{L}}_{V} (x) = & \frac{\partial L_{V}}{\partial x} \cdot \dot{x} = (\nabla ζ^{T} W_{c} + \nabla ε) \cdot (f (x) + g u_{2} + δ) \\ = & W_{c}^{T} (\nabla ζ f (x) - \frac{1}{2} G (x) {\hat{W}}_{a}) + (\nabla ζ^{T} W_{c} + \nabla ε) δ \\ + \nabla ε^{T} (x) (f (x) - \frac{1}{2} g (x) R^{- 1} g^{T} (x) \nabla ζ^{T} {\hat{W}}_{a}) \end{matrix}

(A14)

To simplify the notation, let us define:

ε_{1} (x) ≜ \nabla ε^{T} (x) (f (x) - \frac{1}{2} g (x) R^{- 1} g^{T} (x) \nabla ζ^{T} (x) {\hat{W}}_{a}) + (\nabla ζ^{T} W_{c} + \nabla ε) δ

(A15)

Then (A14) can be written as:

\begin{matrix} {\dot{L}}_{V} (x) = & W_{c}^{T} (\nabla ζ f (x) - \frac{1}{2} G (x) {\hat{W}}_{a}) + ε_{1} (x) \\ = & W_{c}^{T} \nabla ζ f (x) + \frac{1}{2} W_{c}^{T} G (x) (W_{c} - {\hat{W}}_{a}) \\ - \frac{1}{2} W_{c}^{T} G (x) W_{c} + ε_{1} (x) \\ = & W_{c}^{T} \nabla ζ f (x) + \frac{1}{2} W_{c}^{T} G (x) {\tilde{W}}_{a} - \frac{1}{2} W_{c}^{T} G (x) W_{c} + ε_{1} (x) \\ = & W_{c}^{T} σ_{1} + \frac{1}{2} W_{c}^{T} G (x) {\tilde{W}}_{a} + ε_{1} (x) \end{matrix}

(A16)

Substituting (A12) into (A16) yields:

\begin{matrix} {\dot{L}}_{V} (x) = & - Q (x) - \frac{1}{4} W_{c}^{T} G (x) W_{c} \\ + \frac{1}{2} W_{c}^{T} G (x) {\tilde{W}}_{a} + ε_{H J B} (x) + ε_{1} (x) \end{matrix}

(A17)

For

{\dot{L}}_{c}

, we have:

\begin{matrix} {\dot{L}}_{c} & = {\tilde{W}}_{c}^{T} a_{1}^{- 1} {\dot{\tilde{W}}}_{c} \\ = {\tilde{W}}_{c}^{T} a_{1}^{- 1} a_{1} \frac{σ_{2}}{{(σ_{2}^{T} σ_{2} + 1)}^{2}} (σ_{2}^{T} {\hat{W}}_{c} + Q (x) + \frac{1}{4} {\hat{W}}_{a}^{T} G {\hat{W}}_{a}) \end{matrix}

(A18)
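Read together with the convention that the weight error is the ideal weight minus its estimate (as used in (A23)) and a constant ideal weight, (A18) corresponds to a normalized gradient-type update of the critic estimate. The following is a minimal sketch of one explicit-Euler step of such an update. It assumes a scalar gain `a1`, a quadratic state cost x^T Q x, and G = ∇ζ g R^{-1} g^T ∇ζ^T as inferred from (A11) and (A12); the function and argument names are hypothetical and the discretization is not taken from the paper.

```python
import numpy as np


def critic_step(W_c_hat, W_a_hat, x, f, g, grad_zeta, Q, R, a1, dt):
    """One Euler step of a normalized-gradient critic update.

    f(x) -> (n,), g(x) -> (n, m), grad_zeta(x) -> (N, n).
    Returns the updated critic weights and the residual e before the update.
    """
    dz = grad_zeta(x)
    gx = g(x)
    R_inv = np.linalg.inv(R)
    G = dz @ gx @ R_inv @ gx.T @ dz.T              # (N, N), symmetric
    u2 = -0.5 * R_inv @ gx.T @ dz.T @ W_a_hat      # actor-based control input
    sigma2 = dz @ (f(x) + gx @ u2)                 # regressor sigma_2
    # Residual appearing inside the bracket of (A18).
    e = sigma2 @ W_c_hat + x @ Q @ x + 0.25 * W_a_hat @ G @ W_a_hat
    W_dot = -a1 * sigma2 / (sigma2 @ sigma2 + 1.0) ** 2 * e
    return W_c_hat + dt * W_dot, float(e)


if __name__ == "__main__":
    # Toy scalar plant x' = -x + u with cost x^2 + u^2 (not the PMSM model).
    f = lambda x: -x
    g = lambda x: np.array([[1.0]])
    gz = lambda x: np.array([[2.0 * x[0]]])        # basis zeta(x) = [x^2]
    p = np.sqrt(2.0) - 1.0                         # Riccati solution of the toy problem
    W, e = critic_step(np.array([1.0]), np.array([p]), np.array([1.0]),
                       f, g, gz, np.eye(1), np.eye(1), a1=1.0, dt=0.1)
    print(W, e)
```

In the toy problem the residual vanishes when both weight estimates equal the Riccati solution, which is a convenient sanity check before applying the same update to the three-state error system.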

Rewriting (A12) gives:

0 = - W_{c}^{T} σ_{1} - Q (x) - \frac{1}{4} W_{c}^{T} G (x) W_{c} + ε_{H J B} (x)

(A19)

Substituting Equation (A19) into Equation (A18) yields the following:

\begin{matrix} {\dot{L}}_{c} = & {\tilde{W}}_{c}^{T} \frac{σ_{2}}{{(σ_{2}^{T} σ_{2} + 1)}^{2}} (σ_{2}^{T} {\hat{W}}_{c} + Q (x) + \frac{1}{4} {\hat{W}}_{a}^{T} G (x) {\hat{W}}_{a} \\ - Q (x) - σ_{1}^{T} W_{c} - \frac{1}{4} W_{c}^{T} G (x) W_{c} + ε_{H J B} (x)) \\ = & {\tilde{W}}_{c}^{T} \frac{σ_{2}}{{(σ_{2}^{T} σ_{2} + 1)}^{2}} (σ_{2}^{T} (x) {\hat{W}}_{c} - σ_{1}^{T} (x) W_{c} \\ + \frac{1}{4} {\hat{W}}_{a}^{T} G (x) {\hat{W}}_{a} - \frac{1}{4} W_{c}^{T} G (x) W_{c} + ε_{H J B} (x)) \end{matrix}

(A20)

Substituting (24) and (27) into

σ_{1} = \nabla ζ (f + g u)

yields:

\begin{matrix} σ_{1} & = \nabla ζ (f + g u) \\ = \nabla ζ (f - \frac{1}{2} g R^{- 1} g (W_{c}^{T} ζ + ε)) \\ = \nabla ζ f - \frac{1}{2} \nabla ζ g R^{- 1} g W_{c}^{T} ζ (x) - \frac{1}{2} \nabla ζ g R^{- 1} g ε (x) \end{matrix}

(A21)

Substituting (A21), (36) and (37) into

σ_{2}^{T} (x) {\hat{W}}_{c} - σ_{1}^{T} (x) W_{c}

yields:

\begin{matrix} σ_{2}^{T} (x) {\hat{W}}_{c} - σ_{1}^{T} (x) W_{c} \\ = {(\nabla ζ (f + g u_{2}))}^{T} {\hat{W}}_{c} - {(\nabla ζ (f + g u))}^{T} W_{c} \\ = - \nabla ζ f {\tilde{W}}_{c} + {(\nabla ζ g u_{2})}^{T} {\hat{W}}_{c} - {(\nabla ζ g u)}^{T} W_{c} \\ = - \nabla ζ f {\tilde{W}}_{c} + {(- \frac{1}{2} \nabla ζ g R^{- 1} g^{T} (x) \nabla ζ^{T} {\hat{W}}_{a})}^{T} {\hat{W}}_{c} - {(\nabla ζ g u)}^{T} W_{c} \\ = - \nabla ζ f {\tilde{W}}_{c} - \frac{1}{2} ({\hat{W}}_{a}^{T} G {\hat{W}}_{c} - W_{c}^{T} G W_{c}) + \frac{1}{2} {(\nabla ζ g R^{- 1} g ε (x))}^{T} W_{c} \end{matrix}

(A22)

Additionally, we can transform

\frac{1}{2} {\hat{W}}_{a}^{T} G {\hat{W}}_{c} - \frac{1}{2} W_{c}^{T} G W_{c} - \frac{1}{4} {\hat{W}}_{a}^{T} G (x) {\hat{W}}_{a} + \frac{1}{4} W_{c}^{T} G (x) W_{c}

as:

\begin{matrix} \frac{1}{2} {\hat{W}}_{a}^{T} G {\hat{W}}_{c} - \frac{1}{2} W_{c}^{T} G W_{c} - \frac{1}{4} {\hat{W}}_{a}^{T} G (x) {\hat{W}}_{a} + \frac{1}{4} W_{c}^{T} G (x) W_{c} \\ = \frac{1}{2} {\hat{W}}_{a}^{T} G (W_{c} - {\tilde{W}}_{c}) - \frac{1}{2} W_{c}^{T} G W_{c} - \frac{1}{4} {\hat{W}}_{a}^{T} G (x) {\hat{W}}_{a} + \frac{1}{4} W_{c}^{T} G (x) W_{c} \\ = - \frac{1}{2} {\hat{W}}_{a}^{T} G {\tilde{W}}_{c} + \frac{1}{2} {\hat{W}}_{a}^{T} G W_{c} - \frac{1}{2} W_{c}^{T} G W_{c} - \frac{1}{4} {\hat{W}}_{a}^{T} G (x) {\hat{W}}_{a} + \frac{1}{4} W_{c}^{T} G (x) W_{c} \\ = - \frac{1}{2} {\hat{W}}_{a}^{T} G {\tilde{W}}_{c} + \frac{1}{2} {\hat{W}}_{a}^{T} G W_{c} - \frac{1}{4} W_{c}^{T} G W_{c} - \frac{1}{4} {\hat{W}}_{a}^{T} G (x) {\hat{W}}_{a} \\ = - \frac{1}{2} {\hat{W}}_{a}^{T} G {\tilde{W}}_{c} - \frac{1}{4} {\tilde{W}}_{a}^{T} G W_{c} + \frac{1}{4} {\hat{W}}_{a}^{T} G {\tilde{W}}_{a} \\ = - \frac{1}{2} {\hat{W}}_{a}^{T} G {\tilde{W}}_{c} + \frac{1}{4} {\tilde{W}}_{a}^{T} G (x) {\tilde{W}}_{a} \end{matrix}

(A23)

Substituting (A22) into (A20) and adding (A23) yields:

\begin{matrix} {\dot{L}}_{c} = & {\tilde{W}}_{c}^{T} \frac{σ_{2}}{{(σ_{2}^{T} σ_{2} + 1)}^{2}} (- f {(x)}^{T} \nabla ζ^{T} (x) {\tilde{W}}_{c} + \frac{1}{2} {\tilde{W}}_{a}^{T} G (x) {\tilde{W}}_{c} \\ + \frac{1}{4} {\tilde{W}}_{a}^{T} G (x) {\tilde{W}}_{a} + ε_{H J B} (x) - \frac{1}{2} {(\nabla ζ g R^{- 1} g ε)}^{T} W_{c}) \end{matrix}

(A24)

By combining Equations (34), (36) and (37), Equation (A24) can be reformulated as:

\begin{matrix} \begin{matrix} {\dot{L}}_{c} & = {\tilde{W}}_{c}^{T} \frac{σ_{2}}{{(σ_{2}^{T} σ_{2} + 1)}^{2}} (- σ_{2}^{T} {\tilde{W}}_{c} + \frac{1}{4} {\tilde{W}}_{a}^{T} G (x) {\tilde{W}}_{a} + ε_{H J B} (x) - \frac{1}{2} {(\nabla ζ g R^{- 1} g ε)}^{T} W_{c}) \end{matrix} \end{matrix}

(A25)

Substituting (A17), (A25) and (35) into the derivative of Equation (A13) yields:

\begin{matrix} \begin{matrix} \dot{L} (x) = & x^{T} \dot{x} + {\tilde{W}}_{c}^{T} a_{1}^{- 1} {\dot{\tilde{W}}}_{c} + {\tilde{W}}_{a}^{T} a_{2}^{- 1} {\dot{\tilde{W}}}_{a} \\ = & - Q (x) - \frac{1}{4} W_{c}^{T} G (x) W_{c} + \frac{1}{2} W_{c}^{T} G (x) {\tilde{W}}_{a} \\ + ε_{H J B} (x) + ε_{1} (x) + {\tilde{W}}_{c}^{T} \frac{σ_{2}}{{(σ_{2}^{T} σ_{2} + 1)}^{2}} \\ \times (- σ_{2}^{T} {\tilde{W}}_{c} + \frac{1}{4} {\tilde{W}}_{a}^{T} G (x) {\tilde{W}}_{a} + ε_{H J B} (x) - \frac{1}{2} {(\nabla ζ g R^{- 1} g ε)}^{T} W_{c}) + {\tilde{W}}_{a}^{T} a_{2}^{- 1} {\dot{\tilde{W}}}_{a} \end{matrix} \end{matrix}

(A26)

Equation (A26) can be written as:

\begin{matrix} \begin{matrix} \dot{L} (x) = & {\dot{\bar{L}}}_{V} + {\dot{\bar{L}}}_{c} + ε_{1} (x) - {\tilde{W}}_{a}^{T} a_{2}^{- 1} {\dot{\tilde{W}}}_{a} + \frac{1}{2} {\tilde{W}}_{a}^{T} G (x) W_{c} \\ + \frac{1}{4} {\tilde{W}}_{a}^{T} G (x) W_{c} \frac{{\bar{σ}}_{2}^{T}}{m_{s}} {\tilde{W}}_{c} - \frac{1}{4} {\tilde{W}}_{a}^{T} G (x) W_{c} \frac{{\bar{σ}}_{2}^{T}}{m_{s}} W_{c} \\ + \frac{1}{4} {\tilde{W}}_{a}^{T} G (x) {\tilde{W}}_{a} \frac{{\bar{σ}}_{2}^{T}}{m_{s}} W_{c} + \frac{1}{4} {\tilde{W}}_{a}^{T} G (x) {\hat{W}}_{a} \frac{{\bar{σ}}_{2}^{T}}{m_{s}} {\hat{W}}_{c} \end{matrix} \end{matrix}

(A27)

with the definitions:

\begin{matrix} \{\begin{matrix} {\bar{σ}}_{2} ≜ \frac{σ_{2}}{σ_{2}^{T} σ_{2} + 1} \\ m_{s} ≜ σ_{2}^{T} σ_{2} + 1 \\ {\dot{\bar{L}}}_{V} ≜ - Q (x) - \frac{1}{4} W_{c}^{T} G (x) W_{c} + ε_{H J B} (x) \\ {\dot{\bar{L}}}_{c} ≜ {\tilde{W}}_{c}^{T} \frac{σ_{2}}{{(σ_{2}^{T} σ_{2} + 1)}^{2}} (- σ_{2}^{T} {\tilde{W}}_{c} + ε_{H J B} (x) - \frac{1}{2} {(\nabla ζ g R^{- 1} g ε)}^{T} W_{c}) \end{matrix} \end{matrix}

(A28)

Substituting (35) into (A27), we have the following:

\begin{matrix} \dot{L} (x) = & - Q (x) - \frac{1}{4} W_{c}^{T} G (x) W_{c} + ε_{H J B} (x) + {\tilde{W}}_{c}^{T} {\bar{σ}}_{2} (- {\bar{σ}}_{2}^{T} {\tilde{W}}_{c} + \frac{ε_{H J B} (x) - \frac{1}{2} {(\nabla ζ g R^{- 1} g ε)}^{T} W_{c}}{m_{s}}) \\ + ε_{1} (x) + \frac{1}{2} {\tilde{W}}_{a}^{T} G (x) W_{c} + \frac{1}{4} {\tilde{W}}_{a}^{T} G (x) W_{c} \frac{{\bar{σ}}_{2}^{T}}{m_{s}} {\tilde{W}}_{c} \\ - \frac{1}{4} {\tilde{W}}_{a}^{T} G (x) W_{c} \frac{{\bar{σ}}_{2}^{T}}{m_{s}} W_{c} + \frac{1}{4} {\tilde{W}}_{a}^{T} G (x) W_{c} \frac{{\bar{σ}}_{2}}{m_{s}} {\tilde{W}}_{a} \\ + {\tilde{W}}_{a}^{T} η_{a} W_{c} - {\tilde{W}}_{a}^{T} η_{a} {\tilde{W}}_{a} - {\tilde{W}}_{a}^{T} η_{c} {\bar{σ}}_{2}^{T} W_{c} + {\tilde{W}}_{a}^{T} η_{c} {\bar{σ}}_{2}^{T} {\tilde{W}}_{c} . \end{matrix}

(A29)

As established in [28], for any given

γ_{ε}

, there exists

N_{0}

such that

‖ ε_{H J B} ‖ < γ_{ε}

holds when

N > N_{0}

.

From Assumption 2 and (A15), it follows that:

‖ ε_{1} (x) ‖ < γ_{ε_{x}} γ_{f} ‖ x ‖ + \frac{1}{2} γ_{ε_{x}} γ_{g}^{2} γ_{ζ x} σ_{min}^{- 1} (R) (‖ W_{c} ‖ + ‖ {\tilde{W}}_{a} ‖) + γ_{g} γ_{ζ x} δ_{m}

(A30)

{(\nabla ζ g ε (x))}^{T} W_{c} < γ_{ε} γ_{g} γ_{ζ x} W_{c}

(A31)

where

σ_{min}^{- 1} (R)

denotes the reciprocal of the minimum singular value of matrix R.

Let

ϵ

be a positive constant satisfying the inequality:

x^{T} ϵ x < Q (x)

and define

Ω ≜ [\begin{matrix} x \\ {\bar{σ}}_{2}^{T} {\tilde{W}}_{c} \\ {\tilde{W}}_{a} \end{matrix}]

. Then we obtain:

\begin{matrix} \dot{L} < & \frac{1}{4} ‖ W_{c} ‖^{2} ‖ G (x) ‖ + γ_{ε} + \frac{1}{2} ‖ W_{c} ‖ γ_{ε_{x}} γ_{ζ_{x}} γ_{g}^{2} σ_{min}^{- 1} (R) + γ_{g} γ_{ζ_{x}} δ_{m} \\ - Ω^{T} [\begin{matrix} ϵ I & 0 & 0 \\ 0 & I & {(- \frac{1}{2} η_{c} - \frac{1}{8 m_{s}} G W_{c})}^{T} \\ 0 & - \frac{1}{2} η_{c} - (\frac{1}{8 m_{s}} G W_{c}) & η_{a} - \frac{1}{8} (G W_{c} m^{T} + m W_{c}^{T} G) \end{matrix}] Ω \\ + Ω^{T} [\begin{matrix} γ_{ε_{x}} γ_{f} \\ \frac{2 γ_{ε} + γ_{ε} γ_{g}^{2} γ_{ζ_{x}} W_{c} σ_{min}^{- 1} (R)}{2 m_{s}} \\ (\frac{1}{2} G + η_{a} - η_{c} {\bar{σ}}_{2}^{T} - \frac{1}{4} G W_{c} m^{T}) W_{c} + \frac{1}{2} γ_{ε_{x}} γ_{g}^{2} γ_{ζ_{x}} σ_{min}^{- 1} (R) \end{matrix}] \end{matrix}

(A32)

Define:

\{\begin{matrix} \begin{matrix} H ≜ [\begin{matrix} ϵ I & 0 & 0 \\ 0 & I & {(- \frac{1}{2} η_{c} - \frac{1}{8 m_{s}} G W_{c})}^{T} \\ 0 & - \frac{1}{2} η_{c} - (\frac{1}{8 m_{s}} G W_{c}) & η_{a} - \frac{1}{8} (G W_{c} m^{T} + m W_{c}^{T} G) \end{matrix}] \\ Γ ≜ [\begin{matrix} γ_{ε_{x}} γ_{f} \\ \frac{2 γ_{ε} + γ_{ε} γ_{g}^{2} γ_{ζ_{x}} W_{c} σ_{min}^{- 1} (R)}{2 m_{s}} \\ (\frac{1}{2} G + η_{a} - η_{c} {\bar{σ}}_{2}^{T} - \frac{1}{4} G W_{c} m^{T}) W_{c} + \frac{1}{2} γ_{ε_{x}} γ_{g}^{2} γ_{ζ_{x}} σ_{min}^{- 1} (R) \end{matrix}] \end{matrix} \\ ρ ≜ \frac{1}{4} ‖ W_{c} ‖^{2} ‖ G (x) ‖ + γ_{ε} + \frac{1}{2} ‖ W_{c} ‖ γ_{ε_{x}} γ_{ζ_{x}} γ_{g}^{2} σ_{min}^{- 1} (R) + γ_{g} γ_{ζ_{x}} δ_{m} \end{matrix}

(A33)

With proper selection of

η_{c}

and

η_{a}

guaranteeing

H > 0

, the following holds:

\dot{L} < - {‖ Ω ‖}^{2} σ_{\min} (H) + ‖ Γ ‖ ‖ Ω ‖ + ρ + γ_{ε}

(A34)

Based on (A34), we conclude that

\dot{L}

is negative whenever:

‖ Ω ‖ > \frac{‖ Γ ‖}{2 σ_{min} (H)} + \sqrt{\frac{{‖ Γ ‖}^{2}}{4 σ_{min}^{2} (H)} + \frac{ρ + γ_{ε}}{σ_{min} (H)}}

(A35)

Thus, all signals in the closed-loop system are uniformly ultimately bounded.
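The threshold in (A35) is the positive root of the quadratic on the right-hand side of (A34), so the ultimate-bound radius can be computed directly once the matrix H, the vector Γ, and the constants ρ and γ_ε are available. The sketch below assumes that H is symmetric positive definite, so that its smallest eigenvalue equals its minimum singular value; the function name `uub_radius` is illustrative.

```python
import math

import numpy as np


def uub_radius(H, Gamma, rho, gamma_eps):
    """Right-hand side of (A35) for a symmetric positive definite H."""
    H = np.asarray(H, dtype=float)
    sigma_min = float(np.linalg.eigvalsh(H).min())
    if sigma_min <= 0.0:
        raise ValueError("H must be positive definite")
    gamma_norm = float(np.linalg.norm(Gamma))
    return (gamma_norm / (2.0 * sigma_min)
            + math.sqrt(gamma_norm**2 / (4.0 * sigma_min**2)
                        + (rho + gamma_eps) / sigma_min))


if __name__ == "__main__":
    # Illustrative numbers only; they are not taken from the paper.
    r = uub_radius(np.eye(2), [3.0, 4.0], rho=5.0, gamma_eps=1.0)
    print(r)
```

Outside the ball of this radius the bound in (A34) is negative, which is what the uniform ultimate boundedness argument relies on.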

References

1. Wang, M.; Ren, X.; Chen, Q. Cascade Optimal Control for Tracking and Synchronization of a Multimotor Driving System. IEEE Trans. Control Syst. Technol. 2019, 27, 1376–1384.
2. Errouissi, R.; Al-Durra, A.; Muyeen, S.M. Experimental Validation of a Novel PI Speed Controller for AC Motor Drives with Improved Transient Performances. IEEE Trans. Control Syst. Technol. 2018, 26, 1414–1421.
3. Lee, H.; Lee, Y.; Shin, D.; Chung, C.C. H∞ control based on LPV for load torque compensation of PMSM. In Proceedings of the 2015 15th International Conference on Control, Automation and Systems (ICCAS), Busan, Republic of Korea, 13–16 October 2015; pp. 1013–1018.
4. Zhang, X.; Sun, L.; Zhao, K.; Sun, L. Nonlinear Speed Control for PMSM System Using Sliding-Mode Control and Disturbance Compensation Techniques. IEEE Trans. Power Electron. 2013, 28, 1358–1365.
5. Repecho, V.; Biel, D.; Arias, A. Fixed Switching Period Discrete-Time Sliding Mode Current Control of a PMSM. IEEE Trans. Ind. Electron. 2018, 65, 2039–2048.
6. Linares-Flores, J.; García-Rodríguez, C.; Sira-Ramírez, H.; Ramírez-Cárdenas, O.D. Robust Backstepping Tracking Controller for Low-Speed PMSM Positioning System: Design, Analysis, and Implementation. IEEE Trans. Ind. Inform. 2015, 11, 1130–1141.
7. Yin, W.; Wu, X.; Rui, X. Adaptive Robust Backstepping Control of the Speed Regulating Differential Mechanism for Wind Turbines. IEEE Trans. Sustain. Energy 2019, 10, 1311–1318.
8. Li, S.; Zhou, M.; Yu, X. Design and Implementation of Terminal Sliding Mode Control Method for PMSM Speed Regulation System. IEEE Trans. Ind. Inform. 2013, 9, 1879–1891.
9. Preindl, M.; Bolognani, S. Model Predictive Direct Speed Control with Finite Control Set of PMSM Drive Systems. IEEE Trans. Power Electron. 2013, 28, 1007–1015.
10. Li, S.; Ding, L.; Gao, H.; Liu, Y.J.; Huang, L.; Deng, Z. ADP-Based Online Tracking Control of Partially Uncertain Time-Delayed Nonlinear System and Application to Wheeled Mobile Robots. IEEE Trans. Cybern. 2020, 50, 3182–3194.
11. Xue, S.; Zhao, N.; Zhang, W.; Luo, B.; Liu, D. A Hybrid Adaptive Dynamic Programming for Optimal Tracking Control of USVs. IEEE Trans. Neural Netw. Learn. Syst. 2025, 36, 9961–9969.
12. Yu, Y.; Ma, X.; Su, R.; Jet, T.K.; Viswanathan, V.; Gajanayake, C.J.; RamaKrishna, S.; Gupta, A.K. Application of integral reinforcement learning for optimal control of a high speed flux-switching permanent magnet machine. In Proceedings of the IECON 2016-42nd Annual Conference of the IEEE Industrial Electronics Society, Florence, Italy, 23–26 October 2016; pp. 2702–2707.
13. Chen, G.; Wang, W.; Dong, J. Performance-Optimize Adaptive Robust Tracking Control for USV-UAV Heterogeneous Systems with Uncertainty. IEEE Trans. Veh. Technol. 2025, 74, 7251–7262.
14. Chen, G.; Dong, J. Approximate Optimal Adaptive Prescribed Performance Control for Uncertain Nonlinear Systems with Feature Information. IEEE Trans. Syst. Man Cybern. Syst. 2024, 54, 2298–2308.
15. Jiang, Y.; Jiang, Z.P. Robust Adaptive Dynamic Programming with an Application to Power Systems. IEEE Trans. Neural Netw. Learn. Syst. 2013, 24, 1150–1156.
16. Heydari, A. Optimal Impulsive Control Using Adaptive Dynamic Programming and its Application in Spacecraft Rendezvous. IEEE Trans. Neural Netw. Learn. Syst. 2021, 32, 4544–4552.
17. El-Sousy, F.F.M.; Amin, M.M.; Al-Durra, A. Adaptive Optimal Tracking Control Via Actor-Critic-Identifier Based Adaptive Dynamic Programming for Permanent-Magnet Synchronous Motor Drive System. IEEE Trans. Ind. Appl. 2021, 57, 6577–6591.
18. Lee, J.; You, S.; Kim, W.; Moon, J. Extended state observer-actor–critic architecture based output-feedback optimized backstepping control for permanent magnet synchronous motors. Expert Syst. Appl. 2025, 270, 126542.
19. Fan, Z.X.; Li, S.; Liu, R. ADP-Based Optimal Control for Systems with Mismatched Disturbances: A PMSM Application. IEEE Trans. Circuits Syst. II Express Briefs 2023, 70, 2057–2061.
20. Fan, Z.X.; Li, S.; Su, J. Adaptive Dynamic Programming for PMSM Control Under Safety, Robustness, and Optimality Constraints. IEEE Trans. Syst. Man Cybern. Syst. 2025, 55, 2724–2733.
21. Wang, Z.; Ye, H.; Wang, Y.; Shi, Y.; Liang, L. Optimal Output-Feedback Controller Design Using Adaptive Dynamic Programming: A Permanent Magnet Synchronous Motor Application. IEEE Trans. Circuits Syst. II Express Briefs 2025, 72, 208–212.
22. Vamvoudakis, K.G.; Lewis, F.L. Online actor–critic algorithm to solve the continuous-time infinite horizon optimal control problem. Automatica 2010, 46, 878–888.
23. Uddin, M.N.; Zou, H.; Azevedo, F. Online Loss-Minimization-Based Adaptive Flux Observer for Direct Torque and Flux Control of PMSM Drive. IEEE Trans. Ind. Appl. 2016, 52, 425–431.
24. Ye, S.; Yao, X. A Modified Flux Sliding-Mode Observer for the Sensorless Control of PMSMs with Online Stator Resistance and Inductance Estimation. IEEE Trans. Power Electron. 2020, 35, 8652–8662.
25. Podder, A.; Pandit, D. Study of Sensorless Field-Oriented Control of SPMSM Using Rotor Flux Observer & Disturbance Observer Based Discrete Sliding Mode Observer. In Proceedings of the 2021 IEEE 22nd Workshop on Control and Modelling of Power Electronics (COMPEL), Cartagena, Colombia, 2–5 November 2021; pp. 1–8.
26. Zhu, G.; Dessaint, L.A.; Akhrif, O.; Kaddouri, A. Speed tracking control of a permanent-magnet synchronous motor with state and load torque observer. IEEE Trans. Ind. Electron. 2000, 47, 346–355.
27. Apte, A.; Joshi, V.A.; Mehta, H.; Walambe, R. Disturbance-Observer-Based Sensorless Control of PMSM Using Integral State Feedback Controller. IEEE Trans. Power Electron. 2020, 35, 6082–6090.
28. Hornik, K.; Stinchcombe, M.; White, H. Universal approximation of an unknown mapping and its derivatives using multilayer feedforward networks. Neural Netw. 1990, 3, 551–560.
29. Li, J.; Liu, H.; Zhang, Z.; Li, X.; Yang, X. Event-triggered adaptive NN tracking control with dynamic gain for a class of unknown nonlinear systems. Neurocomputing 2022, 467, 292–299.
30. Abu-Khalaf, M.; Lewis, F.L. Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach. Automatica 2005, 41, 779–791.
31. Modares, H.; Lewis, F.L. Optimal tracking control of nonlinear partially-unknown constrained-input systems using integral reinforcement learning. Automatica 2014, 50, 1780–1792.
32. Li, D.; Dong, J. Approximate Optimal Robust Tracking Control Based on State Error and Derivative Without Initial Admissible Input. IEEE Trans. Syst. Man Cybern. Syst. 2024, 54, 1059–1069.

Figure 1. Flowchart of the robust optimal control strategy based on VGPDO and actor-critic neural network.

Figure 2. Flux linkage estimation results achieved by VGPDO and SMO.

Figure 3. Torque estimation results achieved by VGPDO and SMO.

Figure 4. Speed trajectories under three strategies.

Figure 5. Total voltage under three strategies.

Figure 6. Critic NN weight update trajectory on VGPDO-AC.

Figure 7. Actor NN weight update trajectory on VGPDO-AC.

Figure 8.

\dot{V} (x)

and

\dot{\hat{V}} (x)

trajectory on VGPDO-AC.

Figure 9. Speed trajectories for three strategies under flux linkage step disturbance.

Figure 10. Speed trajectories for three strategies under load torque step disturbance.

Table 1. Parameters of PMSM.

Symbols	Values	Symbols	Values
$U_{R}$	60 V	$n_{p}$	4
$I_{R}$	12 A	$ψ_{s}$	0.0192 Wb
J	$7.06 \times 10^{- 4} kg \cdot m^{2}$	$R_{s}$	$0.72 Ω$
B	$3.5 \times 10^{- 4} N \cdot s / rad$	L	$0.4 \times 10^{- 3} H$

Table 2. Performance comparison of VGPDO and NDOB-SMO algorithms under different measurement noise variance conditions.

Algorithm	Noise Variance ( $δ_{w}, δ_{iq}^{2}, δ_{id}^{2}$ )	Maximum Estimation Error, Load Torque (%)	Maximum Estimation Error, Flux Linkage (%)	RMSE, Load Torque	RMSE, Flux Linkage	Maximum Settling Time (s)	CPU Usage (%)
VGPDO	$(0, 0, 0)$	0.3331	0.0755	2.961 × $10^{- 3}$	5.074 × $10^{- 5}$	0.0782	12.26
	$(1, 10^{- 3}, 10^{- 3})$	0.3964	0.1525	2.986 × $10^{- 3}$	5.083 × $10^{- 5}$	0.0801	13.98
	$(10, 10^{- 2}, 10^{- 2})$	0.6165	0.2469	3.271 × $10^{- 3}$	5.175 × $10^{- 5}$	0.0857	14.24
	$(100, 0.5, 0.5)$	3.601	3.109	1.176 × $10^{- 2}$	1.263 × $10^{- 4}$	N/A	13.17
NDOB-SMO	$(0, 0, 0)$	0.7176	0.3496	1.148 × $10^{- 2}$	3.358 × $10^{- 4}$	0.1328	11.6
	$(1, 10^{- 3}, 10^{- 3})$	2.996	0.5022	1.251 × $10^{- 2}$	3.590 × $10^{- 4}$	0.1331	15.74
	$(10, 10^{- 2}, 10^{- 2})$	3.365	0.9817	1.955 × $10^{- 2}$	3.608 × $10^{- 4}$	0.1345	15.96
	$(100, 0.5, 0.5)$	7.957	5.422	2.271 × $10^{- 2}$	4.244 × $10^{- 4}$	N/A	14.2

Table 3. Sensitivity analysis of

α_{1}, α_{2}

on VGPDO performance.

$α_{1}$	$α_{2}$	Maximum Estimation Error, Load Torque (%)	Maximum Estimation Error, Flux Linkage (%)	RMSE, Load Torque	RMSE, Flux Linkage	Maximum Settling Time (s)	CPU Usage (%)
10	10	3.306	0.1265	1.520 × $10^{- 2}$	1.116 × $10^{- 3}$	0.3921	11.63
30	30	1.116	0.1183	6.470 × $10^{- 3}$	6.424 × $10^{- 4}$	0.1308	15.74
50	50	0.6842	0.1525	4.563 × $10^{- 3}$	4.987 × $10^{- 4}$	0.0801	15.96
100	100	0.3965	0.2206	2.986 × $10^{- 3}$	3.590 × $10^{- 4}$	0.0403	14.20
1000	1000	0.6471	0.8010	1.791 × $10^{- 3}$	2.782 × $10^{- 4}$	N/A	12.96

Table 4. Speed control performance evaluated for three strategies under comprehensive operating conditions.

Control Strategy	Steady-State Error (%)	RMSE	Energy Consumption (J)	Settling Time (s)
$H \infty$ - Flux Compensation	0.08331	32.8425	1.424 × $10^{4}$	0.0814
ESO-AC	0.39359	34.4345	1.387 × $10^{4}$	0.1044
VGPDO-AC	0.01217	27.3861	1.396 × $10^{4}$	0.0676

Table 5. Speed control performance evaluated for three strategies under flux linkage step disturbance.

Algorithm	Maximum Deviation (%)	RMSE
$H \infty$ - Flux Compensation	0.5957	0.6515
ESO-AC	0.495	3.4886
VGPDO-AC	0.4623	0.3582

Table 6. Speed control performance evaluated for three strategies under load torque step disturbance.

Algorithm	Maximum Deviation (%)	RMSE
$H \infty$ - Flux Compensation	0.1747	0.6947
ESO-AC	0.3816	3.7403
VGPDO-AC	0.1156	0.1034

Table 7. Sensitivity analysis of

η_{a}, η_{c}

on PMSM speed control.

$η_{a}$	$η_{c}$	Speed RMSE	Speed Steady-State Error (%)	Speed Settling Time (s)
0.1	0.1	100.20	0.00972	0.1004
0.5	0.5	96.99	0.01058	0.0874
1	1	92.21	0.01217	0.0689
2	2	90.87	0.0226	0.0443

Table 8. Sensitivity analysis of

η_{a}, η_{c}

on the NN weight update performance of the VGPDO-AC algorithm.

$η_{a}$	$η_{c}$	${\hat{W}}_{a}$ Percentage Deviation (%)	${\hat{W}}_{a}$ Settling Time (s)	Steady-State Estimation Error on $\dot{\hat{V}} (x)$ (%)	CPU Usage (%)
0.1	0.1	0.5307	0.6329	1.684	60.329
0.5	0.5	0.3303	0.3048	1.653	59.484
1	1	0.1050	0.2066	1.640	56.649
2	2	0.0129	0.1609	1.807	52.418

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
