Article

Design of a Robust Adaptive Cascade Fractional-Order Proportional–Integral–Derivative Controller Enhanced by Reinforcement Learning Algorithm for Speed Regulation of Brushless DC Motor in Electric Vehicles

School of Engineering, Edith Cowan University, Joondalup, WA 6027, Australia
*
Author to whom correspondence should be addressed.
Energies 2025, 18(19), 5056; https://doi.org/10.3390/en18195056
Submission received: 8 July 2025 / Revised: 26 August 2025 / Accepted: 26 August 2025 / Published: 23 September 2025

Abstract

Brushless DC (BLDC) motors are widely used in electric vehicles (EVs) because of their efficiency, compact size, and favorable torque-speed performance. They also offer low maintenance, high reliability, and high power density. Nevertheless, BLDC motors are highly nonlinear and their dynamics are complex, particularly under changing load and supply conditions. These features call for robust and adaptive control methods that maintain performance over a broad range of disturbances and uncertainties. To address these issues, this paper uses a Fractional-Order Proportional-Integral-Derivative (FOPID) controller, which offers better control precision, an improved frequency response, and extra tuning freedom through its non-integer-order terms. Despite these benefits, the FOPID controller has three primary drawbacks: (i) it does not adapt in real time, (ii) appropriate initial gain values are hard to choose, and (iii) it is sensitive to large disturbances and parameter changes. A new control framework is proposed to address these problems. First, a Reinforcement Learning (RL) approach based on Deep Deterministic Policy Gradient (DDPG) optimizes the FOPID gains online so that the controller adjusts itself continuously to variations in the system. Second, the Snake Optimization (SO) algorithm fine-tunes the FOPID parameters at the initial stage to promote stable convergence. Lastly, a cascade control structure is adopted, with FOPID controllers in the inner (current) and outer (speed) loops. This structure adds robustness to the overall system and reduces the effect of disturbances on performance. The cascade design also allows more coordinated and smoother control actions, which reduces stress on the power electronic switches, lowers switching losses, and improves the overall efficiency of the drive system.
The proposed RL-enhanced cascade FOPID controller is verified by Hardware-in-the-Loop (HIL) testing, which shows better speed regulation, robustness, and adaptability under realistic EV operating conditions.

1. Introduction

The growing adoption of Brushless DC (BLDC) motors in electric vehicles (EVs) is driven by their high efficiency, favorable torque-speed characteristics, and reliability [1,2]. Their compact, brushless design lowers maintenance and raises power density, which suits them to space-limited, high-performance applications such as EVs. Despite these advantages, BLDC motors are highly nonlinear, their parameters are uncertain, and their dynamic behavior varies, particularly under changing load and supply conditions [3]. Moreover, BLDC motors are generally driven by a three-phase inverter, which adds control complexity because high-speed switching and synchronization are required. Improper or inaccurate control of the inverter switches may result in higher switching losses, electromagnetic interference, and thermal stress, which complicates real-time implementation and decreases the efficiency of the overall system [4]. These properties pose significant difficulties for traditional control systems and call for control strategies that ensure robust, accurate, and flexible operation.
The classical PID controller is still in common use because of its simplicity, low cost, and well-understood stability properties. Nevertheless, its fixed-gain structure cannot provide the flexibility needed to manage the dynamics of BLDC motors. PID controllers are difficult to tune under nonlinear conditions, and their performance degrades strongly under load disturbances or system uncertainties [5]. To overcome these shortcomings, a number of intelligent control methods have been proposed in the literature. Fuzzy logic controllers, neural networks, and sliding mode controllers (SMC) have all been shown to be more adaptive and robust [6,7,8,9,10]. Fuzzy logic control adjusts the controller gains on the basis of the real-time error but needs careful design of the rule base [8,9]. Neural-network controllers provide nonlinear approximation and learning but require significant training effort and computational resources [10]. SMC, in contrast, is highly robust but suffers from the familiar problem of chattering, which may cause undue wear in mechanical systems [6,7]. Fractional-Order PID (FOPID) controllers, a compromise between classical and intelligent controllers, have gained increasing attention [11,12]. By generalizing the integral and derivative operators to non-integer orders, FOPID controllers gain two extra tuning parameters, which allow more flexible frequency-domain shaping and better time-domain performance. Numerous studies report that FOPID controllers can outperform conventional PID controllers, especially when high-precision tracking and effective disturbance rejection are required [13,14,15].
Despite their theoretical merits, three major issues restrict the widespread use of FOPID controllers: the difficulty of choosing appropriate initial gain values, the inability to adapt to changes in real time, and the vulnerability to abrupt parameter changes and disturbances. These disadvantages are particularly problematic in real-time EV systems, in which the environment and loads are expected to change rapidly [15]. To enhance FOPID performance in such environments, metaheuristic optimization algorithms have been used for offline tuning. Particle Swarm Optimization (PSO), Genetic Algorithms (GA), and the Antlion Optimizer (ALO) have been used to determine optimal FOPID gain settings [16,17,18,19]. Such algorithms address the issue of initial gain selection, but they are static: they offer no online adaptation or learning. Moreover, many of these metaheuristics tend to converge prematurely or incur high computational costs, which limits their applicability to real-time control.
To provide online adaptability, reinforcement learning (RL) and its more advanced variants such as Deep Reinforcement Learning (DRL) have emerged as promising control tools [20,21]. RL allows agents to learn optimal control policies by trial and error and is therefore well suited to model-free control of nonlinear systems such as BLDC motor drives. The Deep Deterministic Policy Gradient (DDPG) and Twin Delayed DDPG (TD3) algorithms have been shown to control voltage, current, and speed in power converters and electric drives with high adaptability and disturbance rejection [22,23,24,25]. DDPG is particularly useful in continuous action spaces and is applicable to real-time control of motor speed and torque. TD3 improves on DDPG by correcting the overestimation bias with twin critic networks and delayed updates, whereas Proximal Policy Optimization (PPO) provides better stability in noisy environments [26,27]. Despite these developments, several problems remain unresolved, namely sample inefficiency, safety during exploration, and the discrepancy between simulated and real-world performance. To reduce these problems, hybrid architectures that integrate RL with traditional or fractional-order PID controllers have attracted interest because they enhance adaptability while preserving the control structure [28,29]. Among the DRL techniques discussed, DDPG offers the most potential for this application because it balances convergence rate and control accuracy and is compatible with continuous dynamic systems such as BLDC motors. RL methods are, however, sensitive to their initial policy parameters. Poor initialization may result in slow convergence, unstable transients, or poor learning outcomes. This highlights the importance of giving the learning agent a good starting point, both to improve early performance and to keep the system safe.
Recently, a body of work has emerged on hybrid methods that integrate metaheuristic optimization with RL [30,31]. In such methods, the controller gains are initialized using optimization algorithms and then tuned online by RL agents. Whereas existing studies typically rely on well-known algorithms, a recent bio-inspired metaheuristic, the Snake Optimization (SO) algorithm, has been demonstrated to be a promising alternative [32]. The SO algorithm simulates the social behavior and adaptive movement of snakes and provides a dynamic trade-off between exploration and exploitation at a lower computational cost than conventional techniques [33,34]. Its population update strategies help it avoid local optima and converge quickly to high-quality solutions. Furthermore, in power electronic systems, especially BLDC motors driven by three-phase inverters, switching losses are a key factor in overall system efficiency and thermal behaviour; a soft-switching approach using a cascade control architecture is a promising solution, since it reduces switching stress and improves energy efficiency. Alongside these developments, cascade control systems have gained popularity for BLDC motor control and power converters. Cascade controllers offer better robustness, disturbance rejection, and dynamic response because they decouple the control loops, usually placing current control in the inner loop and speed control in the outer loop [35,36]. Recent papers have shown that cascade structures enhance the stability and accuracy of BLDC motor drives, particularly under load disturbances, noise, and sensor uncertainty. Advanced versions of cascade control add predictive and sensorless strategies to further improve control precision and simplify the system [37].
Cascade control is therefore not only a viable way of managing the nonlinear dynamics of BLDC motors but also a strategic basis for incorporating intelligent tuning mechanisms, such as reinforcement learning, to further increase adaptability and performance.
In light of the state of the art, this paper proposes a robust and adaptive control scheme that combines fractional-order control, RL, and metaheuristic optimization in a cascade structure. The FOPID gains are adjusted continuously online using a DDPG algorithm, so the system learns the control actions in real time. To avoid poor initialization and unstable early learning, the Snake Optimization algorithm optimizes the FOPID parameters offline to provide a stable starting point. The proposed controller is implemented in cascade form, with the outer FOPID loop controlling the speed and the inner FOPID loop controlling the current, which improves transient response and disturbance rejection. The combined solution is verified by Hardware-in-the-Loop (HIL) testing, which demonstrates its robustness, flexibility, and accuracy under dynamic EV operating conditions. Figure 1 shows the overall design of the test system with the BLDC motor and the controller. The main contributions of this work are as follows:
  • A FOPID control strategy is developed and implemented in cascade form to control the speed and current of the motor. The method reduces error, ensures accurate tracking, and offers better disturbance rejection, especially under parametric variations and noise, without affecting the stability of the system.
  • RL, in the form of the DDPG algorithm, adaptively optimizes the FOPID controller gains in real time to improve the robustness and tracking performance of the BLDC motor without requiring an accurate system model.
  • The SOA is used as an evolutionary computation method to find the best parameters of the system and reduce the number of conflicts. This increases the convergence rate and improves adaptation to the system frequency behavior.
  • Real-time testing in HIL setups validates the system under both simulation and experimental conditions, which makes the results more credible and practically applicable.

2. BLDC Motor Modeling

The characteristics of the BLDC motor are difficult to analyze precisely because its induced electromotive force is trapezoidal and contains significant harmonics. A three-phase inverter powers the BLDC motor, as illustrated in Figure 1a. The speed controller receives position and speed information from the BLDC motor through the Hall-effect signals. To regulate the motor’s speed and current, the cascade controller comprises two loops, the inner and outer loops (Figure 1). Based on the error reported by the speed controller in the outer loop, the current controller generates commutation signals that drive the gates of the power MOSFETs, which in turn regulate the BLDC motor. Based on Figure 1, the stator voltage equations can be expressed in the a, b, and c coordinates as [38,39,40,41,42]:
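$$
\begin{aligned}
v_a &= L\frac{di_a}{dt} + e_a(\theta_r,\omega_r) + R\,i_a \\
v_b &= L\frac{di_b}{dt} + e_b(\theta_r,\omega_r) + R\,i_b \\
v_c &= L\frac{di_c}{dt} + e_c(\theta_r,\omega_r) + R\,i_c
\end{aligned}
\tag{1}
$$
where $R$ is the stator resistance, $v_a$, $v_b$, $v_c$ are the phase voltages, $L$ is the stator inductance, $i_a$, $i_b$, $i_c$ are the phase currents, $\omega_r$ is the rotor electrical speed, $\theta_r$ is the rotor electrical angle, and $e_a$, $e_b$, $e_c$ are the back EMFs of each phase. From Equation (1), the initial state-space model can be written as [37]:
$$
\begin{bmatrix} v_a \\ v_b \\ v_c \end{bmatrix}
= R \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} i_a \\ i_b \\ i_c \end{bmatrix}
+ \begin{bmatrix} L & 0 & 0 \\ 0 & L & 0 \\ 0 & 0 & L \end{bmatrix}
\begin{bmatrix} \dot{i}_a \\ \dot{i}_b \\ \dot{i}_c \end{bmatrix}
+ \begin{bmatrix} e_a(\theta_r,\omega_r) \\ e_b(\theta_r,\omega_r) \\ e_c(\theta_r,\omega_r) \end{bmatrix}
\tag{2}
$$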
Electromagnetic torque is a crucial component of the BLDC motor’s performance. It arises from the interaction between the back EMF and the phase currents, which converts electrical energy into mechanical torque. The electromagnetic torque is defined as [39]:
$$
T_e = \frac{e_a i_a + e_b i_b + e_c i_c}{\omega_m}; \qquad \omega_m = \frac{\omega_r}{P/2}
\tag{3}
$$
where $P$ is the number of poles, and $\omega_m$ is the rotor mechanical speed. Using the linearized model, the electromagnetic torque becomes [41]:
$$
T_e = \frac{P}{2} \times \frac{E I}{\omega_r}
\tag{4}
$$
Driven by the torque in Equation (4), the rotor motion can be described as:
$$
J\frac{d\omega_r}{dt} = T_e(\Delta\theta_r) - T_l - D\,\omega_r
\tag{5}
$$
where $T_e(\Delta\theta_r)$ denotes the torque as a function of $\Delta\theta_r$, and $J$, $D$, and $T_l$ are the moment of inertia, the damping coefficient, and the load torque, respectively. From Equations (4) and (5), the dynamic equation representing the synchronization error between the rotor and the stator can be derived as [43]:
$$
\Delta\dot{\theta}_r = \Delta\omega_r,
\tag{6}
$$
$$
\Delta\dot{\omega}_r = \frac{P}{J}T_e(\Delta\theta_r) - \frac{D}{J}\Delta\omega_r - \frac{P}{J}T_l - \frac{D}{J}\omega_s - \dot{\omega}_s
\tag{7}
$$
where $\omega_s$ is the synchronous speed. The stability analysis of the BLDC motor in open-loop operation thus reduces to a qualitative study of the nonlinear ordinary differential Equations (6) and (7).
The relationship between the electrical angle $\theta_r$ and $\omega_r$ in BLDC motors is a standard identity based on the number of poles. It is given by [41]:
$$
\frac{d\theta_r}{dt} = \frac{P}{2} \times \omega_r
\tag{8}
$$
Rewriting the electrical dynamics, we get:
$$
\begin{aligned}
\frac{di_a}{dt} &= \frac{1}{L}\left[v_a - R\,i_a - e_a(\theta_r,\omega_r)\right] \\
\frac{di_b}{dt} &= \frac{1}{L}\left[v_b - R\,i_b - e_b(\theta_r,\omega_r)\right] \\
\frac{di_c}{dt} &= \frac{1}{L}\left[v_c - R\,i_c - e_c(\theta_r,\omega_r)\right]
\end{aligned}
\tag{9}
$$
where the back-EMF terms are defined as:
$$
e_{a,b,c}(\theta_r,\omega_r) = K_e \cdot g_{a,b,c}(\theta_r) \cdot \omega_r
\tag{10}
$$
Equation (9) represents the dynamic equations of each phase of the BLDC motor. The left-hand side is the time derivative of the phase currents. The back-EMF terms $e_{a,b,c}(\theta_r,\omega_r)$ are modeled as proportional to the angular speed and shaped by the position-dependent function $g_{a,b,c}(\theta_r)$, which approximates the trapezoidal waveform.
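The phase dynamics above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the shape function `trapezoid` (unit amplitude, 120-degree flat tops, phases shifted by 120 degrees) is an assumed form of $g_{a,b,c}(\theta_r)$, and the function names and numeric values are hypothetical.

```python
import math

def trapezoid(theta):
    """Assumed unit-amplitude trapezoidal shape with 120-degree flat tops."""
    th = theta % (2 * math.pi)
    if th < math.pi / 6:                      # rising edge from 0 to +1
        return th * 6 / math.pi
    if th < 5 * math.pi / 6:                  # positive flat top
        return 1.0
    if th < 7 * math.pi / 6:                  # falling edge from +1 to -1
        return 1.0 - (th - 5 * math.pi / 6) * 6 / math.pi
    if th < 11 * math.pi / 6:                 # negative flat top
        return -1.0
    return -1.0 + (th - 11 * math.pi / 6) * 6 / math.pi  # rising edge to 0

def back_emf(theta_r, omega_r, k_e):
    """Back EMF per phase: e = K_e * g(theta_r) * omega_r."""
    shifts = (0.0, 2 * math.pi / 3, 4 * math.pi / 3)  # assumed phase offsets
    return tuple(k_e * trapezoid(theta_r - s) * omega_r for s in shifts)

def phase_current_derivatives(v, i, theta_r, omega_r, R, L, k_e):
    """di/dt = (v - R*i - e) / L for phases a, b, c."""
    e = back_emf(theta_r, omega_r, k_e)
    return tuple((v_k - R * i_k - e_k) / L for v_k, i_k, e_k in zip(v, i, e))

# Hypothetical values: at standstill the back EMF vanishes and di_a/dt = v_a / L.
di = phase_current_derivatives((12.0, 0.0, 0.0), (0.0, 0.0, 0.0),
                               theta_r=0.0, omega_r=0.0, R=0.5, L=1e-3, k_e=0.1)
```

Feeding these derivatives to any fixed-step integrator gives a simple simulation of the electrical subsystem.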
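The back EMF is proportional to the rotor speed and depends on the rotor’s position:
$$
\begin{aligned}
e_a &= \omega_r \cdot g_a(\theta_r) \cdot K_e \\
e_b &= \omega_r \cdot g_b(\theta_r) \cdot K_e \\
e_c &= \omega_r \cdot g_c(\theta_r) \cdot K_e
\end{aligned}
\tag{11}
$$
where $K_e$ is the back-EMF constant. To facilitate control design, the nonlinear model is linearized about a steady-state operating point. The resulting system in state-space form is [37,38]:
$$
\begin{aligned}
\dot{x}(t) &= A\,x(t) + B\,u(t) \\
y(t) &= C\,x(t)
\end{aligned}
\tag{12}
$$
where:
$$
x(t) = \begin{bmatrix} i_a & i_b & i_c & \omega_r & \theta_r \end{bmatrix}^T, \qquad
u(t) = \begin{bmatrix} v_a & v_b & v_c & T_l \end{bmatrix}^T,
$$
$$
A = \begin{bmatrix}
-\frac{R}{L} & 0 & 0 & -\frac{K_e}{L} g_a(\theta_r) & -\frac{K_e}{L}\omega_r\, g'_a(\theta_r) \\
0 & -\frac{R}{L} & 0 & -\frac{K_e}{L} g_b(\theta_r) & -\frac{K_e}{L}\omega_r\, g'_b(\theta_r) \\
0 & 0 & -\frac{R}{L} & -\frac{K_e}{L} g_c(\theta_r) & -\frac{K_e}{L}\omega_r\, g'_c(\theta_r) \\
\frac{e_a}{J} & \frac{e_b}{J} & \frac{e_c}{J} & -\frac{\varepsilon}{J} & 0 \\
0 & 0 & 0 & 1 & 0
\end{bmatrix}, \quad
B = \begin{bmatrix}
\frac{1}{L} & 0 & 0 & 0 \\
0 & \frac{1}{L} & 0 & 0 \\
0 & 0 & \frac{1}{L} & 0 \\
0 & 0 & 0 & -\frac{1}{J} \\
0 & 0 & 0 & 0
\end{bmatrix}, \quad
C = \begin{bmatrix} 0 & 0 & 0 & 1 & 0 \end{bmatrix}
\tag{13}
$$
In Equation (13), $g'_{a,b,c}(\theta_r)$ denotes the derivative of $g_{a,b,c}(\theta_r)$ with respect to $\theta_r$. For the design and implementation of the proposed controller, the real-time specifications have been considered, focusing on speed regulation in EVs. Key parameters of this motor are listed in Table 1 to ensure relevance to practical performance conditions.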
It is important to highlight that, in order to achieve a realistic representation of the converter behavior, parasitic elements were incorporated into the HIL model. Specifically, parasitic resistances were considered in the transformer windings and inductors. These resistive losses ensure that the simulation environment closely replicates the practical operating conditions of the BLDC motor, thereby improving the reliability and accuracy of the obtained experimental validation results. By accounting for these parasitic elements, the proposed controller’s robustness was validated under more realistic and non-ideal system dynamics.

3. Stability of Open-Loop Operation for BLDC Motor System

For the purpose of open-loop stability analysis, the motor equations are transformed to the rotating dq rotor reference frame using Park’s transformation. Using the transformed dq variables, the electromagnetic torque $T_e$ in Equation (3) for a round-rotor machine can be expressed as [1]:
$$
T_e = \frac{3}{2} P\, \psi_{rm} \left[ i_q + i_d \right]
\tag{14}
$$
where $i_q$ and $i_d$ are the stator currents in the d-q frame. Higher-order harmonics of the rotor flux are neglected in this equation. For round PMSMs, only the quadrature component of the AC currents (i.e., $i_{qs}$) contributes to the torque. Hence, considering the mean value of the $i_q$ stator current [43], the mean electromagnetic torque is obtained as:
$$
T_e = \frac{9}{2}\,\frac{P}{\pi}\,\psi_{rm}\, i \cos(\Delta\theta_r)
\tag{15}
$$
where:
  • $\psi_{rm}$: Peak flux linkage of the rotor’s magnetic field, representing the maximum magnetic coupling between rotor and stator,
  • $i$: Stator phase current,
  • $\Delta\theta_r$: Commutation angle error, representing the alignment between the rotor’s magnetic field and the stator’s current field.
The effectiveness of torque generation is determined by the term $\cos(\Delta\theta_r)$ in $T_e$. An ideally aligned system ($\Delta\theta_r = 0$) provides maximum torque, whereas misalignment ($\Delta\theta_r \neq 0$) decreases the torque and affects stability. These equations emphasize the dependence of the torque on the orientation of the stator current relative to the rotor magnetic flux and the effect of commutation-angle misalignment on torque generation.
If the electromagnetic torque $T_e$ is replaced with its mean value $\bar{T}_e$, the error dynamics (6) and (7) of the open-loop operation mode for BLDC motors become
$$
\Delta\dot{\theta}_r = \Delta\omega_r,
\tag{16}
$$
$$
\Delta\dot{\omega}_r = \frac{9P^2}{2\pi J}\,\psi_{rm}\, i \cos\Delta\theta_r - \frac{D}{J}\Delta\omega_r - \frac{P}{J}T_l - \frac{D}{J}\omega_s - \dot{\omega}_s,
\tag{17}
$$
The error dynamics in (17) are clearly nonlinear. For the stability of nonlinear dynamical systems, various results have been derived in [44,45]. For convenience, the stability of Equation (17) can be analyzed using the Lyapunov indirect method.
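As a sketch of how the indirect method applies here, the error dynamics can be linearized about an equilibrium $(\Delta\theta_r^{*}, 0)$. The derivation below assumes constant $i$, $T_l$, and $\omega_s$ (so $\dot{\omega}_s = 0$) and positive $D$ and $J$; the shorthand $k$ is introduced only for this illustration.

```latex
k = \frac{9P^{2}}{2\pi J}\,\psi_{rm}\, i > 0, \qquad
A(\Delta\theta_r^{*}) =
\begin{bmatrix}
0 & 1 \\
-k \sin\Delta\theta_r^{*} & -\dfrac{D}{J}
\end{bmatrix},
\qquad
\lambda^{2} + \frac{D}{J}\,\lambda + k \sin\Delta\theta_r^{*} = 0 .
```

Under these assumptions, both roots have negative real parts when $\sin\Delta\theta_r^{*} > 0$, one root is positive when $\sin\Delta\theta_r^{*} < 0$, and a zero root appears when $\sin\Delta\theta_r^{*} = 0$, in which case the linearization alone is inconclusive.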

Stability Conditions

The stability of the open-loop system depends on the ability of $T_e$ to counteract $T_l$ and viscous damping. The system exhibits the following behavior:
  • Unstable Dynamics (Advanced Commutation, $\Delta\theta_r < 0$): In this case, $T_e$ is reduced due to misalignment. Insufficient torque generation causes the system to decelerate uncontrollably, leading to instability.
  • Stable Dynamics (Retarded Commutation, $\Delta\theta_r > 0$): When $\cos(\Delta\theta_r) > 0$, torque generation is favorably aligned, providing sufficient force to counter $T_l$ and damping. The system becomes asymptotically stable, though transient responses may be slower.
  • Marginal Stability (Accurate Commutation, $\Delta\theta_r = 0$): When $\cos(\Delta\theta_r) = 1$, torque generation is maximized. The system reaches equilibrium where $T_e = T_l + \epsilon\,\omega_r$. While this configuration is ideal for steady-state operation, it is highly sensitive to disturbances, resulting in limited robustness.
To further explain the stability implications of the commutation angle error, Figure 2 shows the torque behavior with respect to $\Delta\theta_r$. As indicated, the maximum torque is obtained when $\Delta\theta_r = 0$, which corresponds to marginal stability. When $\Delta\theta_r > 0$ the system is stable because the torques are aligned positively, and when $\Delta\theta_r < 0$ the system is unstable because the torques are aligned negatively or reduced in magnitude. This visualization supports the theoretical classification of dynamic behavior in the open-loop operation of BLDC motors.

4. Controller Design

4.1. Fractional Calculus

The FOPID controller extends the conventional PID controller by means of fractional calculus. Its main advantage is the additional design freedom provided by two extra parameters, which improves design flexibility [12]. The concept is based on linear filters built around a differ-integral operator whose limits are denoted by $a$ and $b$ and whose order determines whether it integrates or differentiates. A generalized transfer function of the FOPID controller is:
$$
G_{FOPID}(s) = K_D s^{\mu} + K_P + \frac{K_I}{s^{\lambda}}
$$
Here $K_D$, $K_I$, and $K_P$ are the controller gains, and $\mu$ and $\lambda$ are the additional non-integer orders of the differ-integral terms. These five parameters determine the effectiveness and performance of the controller and must be chosen carefully. The RL-DDPG algorithm is incorporated into the proposed controller to increase its adaptability to different disturbances. This approach is described in the next section.
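To make the fractional-order terms concrete, the sketch below approximates the differ-integral with the Grünwald-Letnikov sum, one common discretization. The article does not state which approximation the authors use, so this choice, the function names, and the sample values are assumptions for illustration only.

```python
def gl_weights(alpha, n):
    """First n Grunwald-Letnikov weights w_k = (-1)^k * binom(alpha, k)."""
    w = [1.0]
    for k in range(1, n):
        # Standard recurrence: w_k = w_{k-1} * (1 - (alpha + 1) / k)
        w.append(w[-1] * (1.0 - (alpha + 1.0) / k))
    return w

def gl_differintegral(samples, alpha, h):
    """Approximate D^alpha of a uniformly sampled signal at its last sample.

    alpha > 0 differentiates, alpha < 0 integrates; h is the sample time.
    """
    n = len(samples)
    w = gl_weights(alpha, n)
    acc = sum(w[k] * samples[n - 1 - k] for k in range(n))
    return acc / h ** alpha

def fopid_output(errors, h, kp, ki, kd, lam, mu):
    """u = Kp*e + Ki*D^(-lam) e + Kd*D^(mu) e, evaluated at the latest sample."""
    return (kp * errors[-1]
            + ki * gl_differintegral(errors, -lam, h)
            + kd * gl_differintegral(errors, mu, h))

# Hypothetical error history (a ramp) and gains.
ramp = [0.0, 0.1, 0.2, 0.3]
u = fopid_output(ramp, h=0.1, kp=2.0, ki=1.0, kd=0.05, lam=0.9, mu=0.8)
```

With `lam = mu = 1` the sum reduces to a backward difference and a rectangular integral, that is, a classical discrete PID. In practice the history is usually truncated to a fixed window to bound the cost per step.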

4.2. Reinforcement Learning-Based Adaptive Gain Tuning

Reinforcement Learning (RL) has become a strong, model-free control technique for nonlinear and complex settings [20]. Within the proposed control structure, RL adaptively adjusts the gains $K_D$, $K_I$, and $K_P$ of the FOPID controllers applied to both the speed and current regulation loops of the BLDC motor drive. This learning-based adaptation can keep the controller near optimal performance under different operating conditions, such as parameter changes, load disturbances, and voltage variations. The RL problem is modeled as a Markov Decision Process (MDP), with the control system described by the tuple $(S, A, R, \gamma)$ [21]. Here, $S$ is the state space, $A$ is the action space, $R$ is the reward function, and $\gamma \in [0, 1]$ is the discount factor that balances immediate and future rewards.
In this work, the state vector is defined minimally as:
$$
s_t = \left[ e(t),\ \dot{e}(t) \right]
$$
where $e(t)$ is the instantaneous control error (either speed or current) and $\dot{e}(t)$ is its first derivative. This concise representation keeps learning efficient while capturing the key dynamics of the system. The output of the RL agent is the action vector:
$$
a_t = \left[ K_P(t),\ K_D(t),\ K_I(t) \right]
$$
which maps directly to the controller gains used in the FOPID structure. The non-integer orders $\lambda$ and $\mu$ are fixed during the training and execution phases to limit computational complexity. A scalar reward is computed at every time step to steer the learning process as:
$$
r_t = -\left( \alpha\, e(t)^2 + \beta\, \dot{e}(t)^2 + \gamma\, u(t)^2 \right)
$$
where $u(t)$ is the control effort and $\alpha$, $\beta$, and $\gamma$ are positive scalar weights. This reward formulation penalizes tracking error, aggressive transients, and control saturation, which promotes smooth and accurate system regulation.
The aim of the RL agent is to find a deterministic policy $\pi: s_t \rightarrow a_t$ that maximizes the expected cumulative reward:
$$
J = \mathbb{E}\left[ \sum_{t=0}^{\infty} \gamma^{t} r_t \right]
$$
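The reward and the cumulative objective can be illustrated with a short Python sketch. It assumes that the quadratic penalty enters with a negative sign, so that maximizing the return minimizes error, error rate, and control effort. The weight values, the discount value, and the name `gamma_u` (used to keep the effort weight apart from the discount factor) are hypothetical.

```python
def reward(e, e_dot, u, alpha=1.0, beta=0.1, gamma_u=0.01):
    """Negative quadratic penalty on error, error rate and control effort.

    The weights are illustrative placeholders, not values from the article.
    """
    return -(alpha * e ** 2 + beta * e_dot ** 2 + gamma_u * u ** 2)

def discounted_return(rewards, discount=0.99):
    """Finite-horizon estimate of sum_t discount^t * r_t."""
    total = 0.0
    for r in reversed(rewards):
        total = r + discount * total
    return total

# Hypothetical episode in which the tracking error decays toward zero.
errors = [1.0, 0.5, 0.25, 0.0]
episode_rewards = [reward(e, 0.0, 0.0) for e in errors]
episode_return = discounted_return(episode_rewards)
```

A perfect step (zero error, zero error rate, zero effort) yields the maximum reward of zero, so every return is non-positive under this convention.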

Deep Deterministic Policy Gradient (DDPG) Integration

The Deep Deterministic Policy Gradient (DDPG) algorithm is used to implement the above RL framework in a continuous action space. DDPG is a model-free actor-critic reinforcement learning algorithm aimed specifically at continuous control problems. It combines a deterministic policy (actor) with a learned value function (critic), which allows stable gradient-based policy updates [31]. The actor network maps the observed state $s_t$ to a set of controller gains $a_t = [K_P, K_I, K_D]$. Simultaneously, the critic network $Q(s_t, a_t \mid \theta^{Q})$ estimates the expected return in order to assess the quality of the chosen action. Both networks are deep neural networks and are trained in parallel.
To make learning stable, DDPG employs an experience replay buffer that stores a history of transitions $(s_t, a_t, r_t, s_{t+1})$. At every training step, mini-batches are sampled randomly from this buffer to decorrelate observations and reduce variance. In addition, target networks $\mu'$ and $Q'$, which slowly track the weights of the main networks, are used to avoid sudden policy changes and to support convergence.
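The two stabilizing mechanisms described above, experience replay and slowly tracking target networks, can be sketched without any deep-learning library. The class and function names, the flat weight lists, and the rate `tau` are hypothetical choices for this illustration and are not taken from the article.

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-capacity store of (state, action, reward, next_state) transitions."""

    def __init__(self, capacity, seed=None):
        self._data = deque(maxlen=capacity)   # oldest transitions drop out first
        self._rng = random.Random(seed)

    def add(self, state, action, reward, next_state):
        self._data.append((state, action, reward, next_state))

    def sample(self, batch_size):
        # Uniform sampling without replacement decorrelates consecutive steps.
        return self._rng.sample(list(self._data), batch_size)

    def __len__(self):
        return len(self._data)

def soft_update(target_weights, main_weights, tau):
    """Move each target weight a fraction tau toward the main network weight."""
    return [(1.0 - tau) * t + tau * m
            for t, m in zip(target_weights, main_weights)]

# Hypothetical usage with scalar states and a flat list of weights.
buffer = ReplayBuffer(capacity=3, seed=0)
for k in range(5):
    buffer.add(k, 0.0, -1.0, k + 1)
batch = buffer.sample(2)
new_target = soft_update([0.0, 0.0], [1.0, 2.0], tau=0.1)
```

A small `tau` makes the target networks lag the main networks, which keeps the bootstrapped critic target from shifting abruptly between updates.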
The critic network is trained to minimize the temporal-difference (TD) error:
$$
L = \left( Q(s_t, a_t) - \left[ r_t + \gamma\, Q'\!\left(s_{t+1}, \mu'(s_{t+1})\right) \right] \right)^2
$$
The actor is updated using the sampled policy gradient:
$$
\nabla_{\theta^{\mu}} J \approx \mathbb{E}_{s}\left[ \nabla_{a} Q(s, a \mid \theta^{Q}) \, \nabla_{\theta^{\mu}} \mu(s \mid \theta^{\mu}) \right]
$$
Exploration is promoted during training by adding noise from an Ornstein-Uhlenbeck process to the output of the actor. After training, the trained actor is placed in both the inner and outer control loops to supply updated gain values continuously. This lets the controller adapt to dynamic variations of the motor parameters or to external disturbances, thereby increasing the overall robustness and accuracy of the system. As depicted in Figure 3, the reward signal passes from the environment to the critic.
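The exploration noise can be sketched with an Euler-Maruyama discretization of the Ornstein-Uhlenbeck process. The parameter values below (`theta`, `sigma`, `dt`) are common illustrative defaults and are assumptions, not settings reported in the article; the class name is hypothetical.

```python
import math
import random

class OrnsteinUhlenbeckNoise:
    """Temporally correlated noise: dx = theta*(mu - x)*dt + sigma*sqrt(dt)*N(0, 1)."""

    def __init__(self, size, theta=0.15, sigma=0.2, mu=0.0, dt=1e-2, seed=None):
        self.theta = theta      # mean-reversion rate
        self.sigma = sigma      # noise scale
        self.mu = mu            # long-run mean
        self.dt = dt            # time step
        self._rng = random.Random(seed)
        self.state = [mu] * size

    def sample(self):
        self.state = [
            x + self.theta * (self.mu - x) * self.dt
            + self.sigma * math.sqrt(self.dt) * self._rng.gauss(0.0, 1.0)
            for x in self.state
        ]
        return list(self.state)

# One noise component per gain in the action vector (three gains assumed here).
noise = OrnsteinUhlenbeckNoise(size=3, seed=42)
actor_output = [1.0, 0.5, 0.1]            # hypothetical gains from the actor
noisy_action = [a + n for a, n in zip(actor_output, noise.sample())]
```

Because successive samples are correlated, the perturbed gains drift smoothly instead of jumping at every step, which is gentler on a physical drive than white noise. In a real loop the perturbed gains would also need to be clipped to safe bounds.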
With DDPG in the control design, the proposed system becomes highly autonomous and resilient: it removes the need for manual gain tuning or precise plant modeling and maintains performance under a broad variety of operating conditions.
The hybrid training algorithm in Table 2 combines the SOA, which tunes the initial controller gains, with DDPG, which tunes the FOPID controller gains in real time. The table lists each step: state observation, action generation, reward computation, and the updates of the actor and critic networks under different operating conditions. To choose the controller gains, the controller is coupled with Snake Optimization, which improves its results. The following part explains the Snake optimizer.

4.3. Snake Optimizer Algorithm (SOA)

The Snake Optimization (SO) algorithm is a metaheuristic based on the social and reproductive behaviours of snakes in response to environmental stimuli. In nature, snakes adjust their behavior, including foraging and mating, depending on food availability and temperature. The algorithm simulates this adaptive behavior in order to balance exploration (searching new regions of the solution space) and exploitation (refining known good regions). Mating takes priority when environmental requirements such as low temperature and availability of food are met. In this stage, male snakes compete to attract a female, and the female can decide whether to mate. When mating takes place, eggs are deposited in a safe place and the female leaves after the young hatch.
In the context of optimization, these biological mechanisms are translated into mathematical operators. The exploration phase is based on the individual food-seeking process without mating conditions, in which snakes move randomly within the search space to find new solutions. When environmental indicators are favorable, the exploitation phase simulates localized searching and competition, as when male snakes compete to mate, so that the algorithm can refine solutions around known optima. Alternating between these two phases according to environmental thresholds allows SO to avoid premature convergence and maintain diversity in the population. This bio-inspired flexibility makes the Snake Optimization algorithm useful for nonlinear, multidimensional, and constrained problems, including the tuning of controller gains in power electronic systems. The flow of the algorithm and its steps are illustrated in [32]. To support the selection of the SOA quantitatively, Table 3 compares it in detail with four popular optimization algorithms: GWO, ALO, PSO, and GA. The evaluation criteria are the convergence iterations, the final Integral of Absolute Error (IAE), the standard deviation (as an indicator of solution stability), the parameter sensitivity, the CPU time, and the final fitness value. As shown, SOA provides the lowest IAE and the best robustness with minimal CPU overhead, so the controller parameters can be tuned quickly, accurately, and stably. This makes it particularly useful in areas such as real-time motor control and adaptive converter systems.
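The threshold-driven switching between phases can be sketched as follows. The temperature and food-quantity schedules and the constants (`c1 = 0.5`, thresholds 0.25 and 0.6) follow the commonly cited formulation of the SO algorithm and are assumptions here; the article's own settings are the ones listed in its Table 4, which is not reproduced. The function name and phase labels are hypothetical.

```python
import math

def so_phase(t, t_max, c1=0.5, food_threshold=0.25, temp_threshold=0.6):
    """Return the search phase at iteration t out of t_max (assumed schedule)."""
    temp = math.exp(-t / t_max)                  # environment cools over time
    food = c1 * math.exp((t - t_max) / t_max)    # food quantity grows over time
    if food < food_threshold:
        return "exploration"       # no food: random search for new regions
    if temp > temp_threshold:
        return "exploit_food"      # food available, hot: move toward the best
    return "fight_or_mate"         # food available, cold: compete or mate

# Phase schedule over a hypothetical run of 100 iterations.
schedule = [so_phase(t, 100) for t in range(1, 101)]
```

Under these assumed constants the run begins with exploration, passes through a food-seeking stage, and spends the remaining iterations in the fight-or-mate stage, where solutions are refined.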
Table 4 presents the algorithmic and internal parameters used to configure the Snake Optimization Algorithm (SOA) for optimizing the controller gains, chosen to balance convergence speed, solution diversity, and robustness.
To compare the proposed method with classical methods, four other controllers are implemented: single-loop PID and PI controllers optimized with the PSO algorithm, a cascade PID controller optimized by SOA, and a cascade FOPID controller optimized with the PSO algorithm. Table 5 shows the gains used for the controllers.

4.4. Closed-Loop Stability Analysis of the System

This section presents a powerful closed-loop control architecture of the BLDC motor that will have cascaded FOPID controllers on speed and current loops. Through the flexibility of FOPID controllers, the system has a better control accuracy and dynamic performance and hence it is suitable in challenging motor control applications. The control system has a cascaded structure to provide the best operation. Speed is controlled by the outer loop, which takes the error in speed and the actual motor speed and a FOPID controller to produce a reference current to the inner loop. The inner loop calculates in conjunction with the measured motor current to generate control signals to the PWM inverter that drives the motor.
To examine the stability of the closed-loop system, Figure 4 shows the cascaded control system with FOPID controllers. The diagram illustrates how the speed and current control loops interact with each other, with the BLDC motor dynamics, and with the feedback paths that make the system robust and stable. In Figure 4, X, Y, and Z denote the feedback gains and U(s) is the control signal produced by the inner controller. From this block diagram, the closed-loop transfer function of the system can be derived to prove its stability.
The closed-loop transfer function describes the behavior of the system once the feedback controller is included. In a feedback arrangement, the FOPID controller transfer function is multiplied by the open-loop transfer function of the system, and the loop is then closed around the product. The closed-loop transfer function of the inner loop is:
$$T_{\mathrm{inner}}(s) = \frac{G_{\mathrm{open}}(s)\, G_{\mathrm{FOPID,inner}}(s)}{1 + G_{\mathrm{open}}(s)\, G_{\mathrm{FOPID,inner}}(s)}$$
where $G_{\mathrm{open}}(s)$ is the open-loop transfer function of the BLDC motor and $G_{\mathrm{FOPID,inner}}(s)$ is the FOPID controller for the current loop. $G_{\mathrm{open}}(s)$ can be defined using the state-space matrices of the system as:
$$G_{\mathrm{open}}(s) = \frac{a_{11}s^{3} + a_{12}s^{2} + a_{13}s}{s^{5} + a_{21}s^{4} + a_{22}s^{3} + a_{23}s^{2} + a_{24}s}$$
with $a_{11} = 1.463\times10^{8}$, $a_{12} = 2.927\times10^{11}$, $a_{13} = 1.463\times10^{14}$, $a_{21} = 3003$, $a_{22} = 1.93\times10^{9}$, $a_{23} = 3.855\times10^{12}$, $a_{24} = 1.927\times10^{15}$.
The outer loop transfer function corresponds to the speed controller, which regulates the motor speed using the output of the inner loop. The transfer function for the closed-loop system can be given by:
$$G_{\mathrm{closed}}(s) = \frac{T_{\mathrm{inner}}(s)\, G_{\mathrm{FOPID,outer}}(s)}{1 + T_{\mathrm{inner}}(s)\, G_{\mathrm{FOPID,outer}}(s)}$$
Using the values defined for each transfer function, the following closed-loop transfer function can be reached:
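$$G_{\mathrm{closed}}(s) = \frac{b_{11}s^{6} + b_{12}s^{5} + b_{13}s^{4} + b_{14}s^{3} + b_{15}s^{2} + b_{16}s + b_{17}}{b_{21}s^{9} + b_{22}s^{8} + b_{23}s^{7} + b_{24}s^{6} + b_{25}s^{5} + b_{26}s^{4} + b_{27}s^{3} + b_{28}s^{2} + b_{29}s + b_{210}}$$
with numerator coefficients $b_{11} = 8.194$, $b_{12} = 1.66\times10^{4}$, $b_{13} = 8.674\times10^{6}$, $b_{14} = 2.633\times10^{8}$, $b_{15} = 1.58\times10^{10}$, $b_{16} = 2.009\times10^{11}$, $b_{17} = 5.895\times10^{12}$,
and denominator coefficients $b_{21} = 0.001\times10^{9}$, $b_{22} = 0.32\times10^{8}$, $b_{23} = 1.931\times10^{5}$, $b_{24} = 4.245\times10^{8}$, $b_{25} = 2.726\times10^{8}$, $b_{26} = 4.29\times10^{13}$, $b_{27} = 2.012\times10^{15}$, $b_{28} = 3.89\times10^{15}$, $b_{29} = 1.943\times10^{15}$, $b_{210} = 2.14\times10^{15}$.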
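The loop-closing step used in both formulas, $GC/(1+GC)$, reduces to polynomial arithmetic when the plant and controller are ratios of polynomials. The sketch below illustrates this with NumPy for the inner loop. A FOPID with non-integer orders is not itself a ratio of polynomials, so the example assumes the controller has already been replaced by a rational transfer function. The PI controller and its gains `kp` and `ki` are hypothetical stand-ins, not the controller or values used in the paper.

```python
import numpy as np


def unity_feedback(num_plant, den_plant, num_ctrl, den_ctrl):
    """Return (num, den) of G*C / (1 + G*C); coefficients in descending powers of s."""
    num_open = np.polymul(num_plant, num_ctrl)
    den_open = np.polymul(den_plant, den_ctrl)
    # Closed-loop denominator is den_open + num_open (shorter array is zero-padded).
    return num_open, np.polyadd(den_open, num_open)


# G_open(s) built from the coefficients a11..a24 quoted in the text.
num_g = [1.463e8, 2.927e11, 1.463e14, 0.0]
den_g = [1.0, 3003.0, 1.93e9, 3.855e12, 1.927e15, 0.0]

# Hypothetical integer-order PI, C(s) = (kp*s + ki)/s, standing in for the inner FOPID.
kp, ki = 0.5, 50.0
num_t, den_t = unity_feedback(num_g, den_g, [kp, ki], [1.0, 0.0])
```

Applying the same function a second time, with `num_t` and `den_t` as the plant and a rational outer controller, gives the structure of the overall closed-loop transfer function.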
Figure 5 shows the performance and stability analysis of the closed-loop system with the designed cascade FOPID controllers. In the pole-zero map of Figure 5a, all poles lie in the left half of the complex plane, so the system is stable and well damped. The zero distribution complements the poles, which helps to improve the transient performance. The Bode diagram in Figure 5b further confirms stability and robustness: the gain margin (G.M.) is 56.6 dB and the phase crossover frequency is 1.02 rad/s. These values indicate that the system retains adequate margins and remains stable in the presence of perturbations or model uncertainty. The magnitude plot rolls off smoothly at high frequencies, indicating good noise rejection, and the phase plot verifies that the closed-loop dynamics have suitable phase characteristics. Together, these analyses confirm the robustness and efficiency of the closed-loop system obtained with the cascade control structure.
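The pole-location criterion behind a pole-zero map can be checked numerically for any rational transfer function: compute the roots of the denominator and confirm that every real part is negative. The following Python sketch shows the check on two small test polynomials. The function name `is_hurwitz_stable` is illustrative, and the example does not reproduce the poles of Figure 5a.

```python
import numpy as np


def is_hurwitz_stable(den):
    """Return (stable, poles); stable is True if all roots have negative real part."""
    poles = np.roots(den)  # den holds coefficients in descending powers of s
    return bool(np.all(poles.real < 0.0)), poles


stable, poles = is_hurwitz_stable([1.0, 3.0, 2.0])    # (s + 1)(s + 2)
unstable, _ = is_hurwitz_stable([1.0, -1.0, -2.0])    # (s + 1)(s - 2)
```

A root exactly on the imaginary axis fails the strict inequality, so a marginally stable denominator is reported as not stable.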

5. Simulation Results and Analysis

To control the output speed of the system, the proposed robust cascade SO-RL-FOPID architecture is implemented and evaluated in MATLAB/Simulink (2024a). This section illustrates the performance of the controllers under different conditions through numerical simulations. The tracking capability of the controllers is first assessed at a target speed of 6000 revolutions per minute (RPM) and a supply voltage of 350 V; the results are displayed in Figure 6.

5.1. Case 1: Tracking Performance

The controllers differ clearly in tracking speed, overshoot, and undershoot. The analyzed controllers are C-SO-PID, C-SO-RL-FOPID, C-PSO-FOPID, PSO-PID, and PSO-PI, each tuned by either the SO or the PSO algorithm. The tuning algorithm is critical to controller performance. Beyond tuning, several of these controllers employ a cascade control structure and, in some cases, fractional-order control, both of which improve performance further. The speed tracking performance of the controllers is shown in Figure 6, and Table 6 summarizes the corresponding performance measures.
According to the comparative analysis in Table 6, the C-SO-RL-FOPID controller gives the best overall performance in regulating the output speed of the BLDC motor: it provides the fastest tracking, zero overshoot, very small undershoot, and very high stability. The other controllers, including PSO-PID, PSO-PI and C-SO-PID, show greater overshoot and undershoot, which can compromise stability and control accuracy.
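Measures such as overshoot and settling time can be extracted from a sampled step response. The Python sketch below shows one common definition, assuming a 2% tolerance band around the setpoint. The function name `step_metrics`, the band, and the sample trace are illustrative assumptions and do not reproduce the values in Table 6.

```python
def step_metrics(t, y, setpoint, band=0.02):
    """Percent overshoot and settling time (last entry into the +/-band tolerance)."""
    peak = max(y)
    overshoot = max(0.0, (peak - setpoint) / setpoint * 100.0)
    tol = band * abs(setpoint)
    settling_time = t[0]
    for ti, yi in zip(t, y):
        if abs(yi - setpoint) > tol:
            settling_time = None  # outside the band at this sample
        elif settling_time is None:
            settling_time = ti    # first sample back inside the band
    return overshoot, settling_time


# Hypothetical response (RPM) to a 6000 RPM step.
t = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
y = [0.0, 4000.0, 6300.0, 5900.0, 6050.0, 6000.0]
overshoot, ts = step_metrics(t, y, setpoint=6000.0)
```

The function returns `None` for the settling time when the response is still outside the band at the final sample.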
Next, to check the stability of these controllers under severe conditions, the reference speed is changed in steps, as shown in Figure 7, so that the effectiveness of each technique can be observed clearly.
Unlike the PSO-tuned controllers, the SO-tuned controllers, especially the cascade FOPID, perform well under large reference changes, i.e., when the reference steps from 5000 RPM to 7000 RPM and from 7000 RPM to 5500 RPM. The C-SO-RL-FOPID controller deviates little from the setpoints, giving controlled and stable performance even in these extreme conditions. The PSO-PID and PSO-PI controllers respond quickly but struggle with large reference changes, showing considerable overshoot and undershoot, particularly during rapid speed changes. This high initial overshoot may place significant mechanical stress on the motor and shorten its life. The other configurations do not track as smoothly or reliably as the SO-tuned controllers, in particular the cascade FOPID, when aggressive speed changes are applied. Combining the fractional-order concept with RL tuning enables finer control of the system dynamics, which leads to smoother behavior and enhanced stability under different operating conditions. Overall, the cascade structure and the fractional-order methodology contribute greatly to motor control performance, and the SO algorithm improves the results further. The RL-tuned cascade FOPID controller handles large speed variations with little overshoot and undershoot and therefore provides the best motor operation.

5.2. Case 2: Supply Voltage Variations

Variations in supply voltage can also degrade the performance of BLDC motors, especially in EVs, so the speed controllers must be tested at different voltages to confirm that they remain reliable and efficient. In Figure 8, the controllers are tested at a setpoint of 5000 RPM while the supply voltage changes.
Figure 8 compares the speed-tracking performance of the controllers under the following supply voltage profile: the voltage starts at 350 V, rises to 400 V at 4 s, drops to 300 V at 8 s, and rises to 450 V at 12 s. The results show that the cascade and fractional-order approaches are critical to the stability and speed response of the motor under large voltage changes. The cascade method handles complicated control problems more effectively by separating the control action into several levels, which enhances system robustness. The SO-tuned controllers settle faster and overshoot less than the PSO-tuned controllers, as seen in their smoother and more accurate tracking of the reference speed as the voltage varies. This shows the importance of both the controller design and the tuning algorithm in maintaining motor performance and stability under varying supply voltages.
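The supply voltage profile of this test is piecewise constant and can be written as a simple lookup. The Python sketch below encodes the levels and switching times stated above. The names `SWITCH_TIMES`, `LEVELS`, and `supply_voltage` are illustrative, and the sketch assumes each new level takes effect exactly at its switching instant.

```python
import bisect

# Switching instants (s) and voltage levels (V) from the Case 2 description.
SWITCH_TIMES = [4.0, 8.0, 12.0]
LEVELS = [350.0, 400.0, 300.0, 450.0]


def supply_voltage(t):
    """Piecewise-constant supply voltage at time t (seconds)."""
    # bisect_right counts how many switching instants have been reached by time t.
    return LEVELS[bisect.bisect_right(SWITCH_TIMES, t)]


profile = [supply_voltage(t) for t in (0.0, 4.0, 7.99, 8.0, 12.0, 15.0)]
```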

5.3. Case 3: Parametric Variations

The EV industry is performance-sensitive and fast-paced, and speed regulation is crucial to efficient operation across different parametric conditions. BLDC motors are widely used in EVs because of their high efficiency and high torque-to-weight ratio, but changes in internal parameters such as resistance and inductance can greatly affect speed control and current management.
Figure 9 compares the C-SO-RL-FOPID controller with classical techniques under these parametric variations at a set speed of 5000 RPM. The C-SO-RL-FOPID controller performs better, with minimal overshoot, shorter settling time and greater robustness to disturbances than classical controllers such as PSO-PID, PSO-PI and C-SO-PID, which respond more slowly and deviate further from the reference speed, particularly under sudden disturbances. The robust structure of the C-SO-RL-FOPID controller ensures low deviation and quick recovery, which matter in EVs because smooth and efficient driving depends on accurate speed and current control. Because it manages parameter uncertainties effectively, this controller has the potential to improve the stability and energy consumption of drives, as well as the overall reliability of electric transportation systems.

6. Real-Time Validation

The Typhoon HIL microgrid system shown in Figure 10 is a high-fidelity real-time simulation platform designed for developing and testing control strategies in power electronic systems and smart grid scenarios. It supports Hardware-in-the-Loop (HIL) testing, which allows complex microgrid systems, including renewable energy sources (e.g., solar PV and wind), battery energy storage systems (BESS), inverters, and distributed energy resources (DERs), to be simulated with sub-microsecond accuracy. In contrast to classical software simulations, Typhoon HIL can readily integrate real control devices, including digital signal processors (DSPs), FPGAs, and real-time controllers, into the test loop.

Hardware-in-the-Loop (HIL) Implementation Setup

To test the practicality of the suggested adaptive cascade controller, the Typhoon HIL 606 platform was used to validate it in real time. The entire BLDC motor drive system, including the inverter, the motor dynamics, and the DC link, was modeled in the Typhoon Schematic Editor. The control algorithm was created in MATLAB/Simulink and translated to C code using Simulink Coder. This C code was then compiled and added to the HIL environment to run on the Typhoon onboard real-time processor. The DDPG-based gain tuning agent was built as a modular block and ran in parallel with the main PI controller. The FOPI terms were approximated with low-order digital filters. The real-time plant model supplied the input signals (motor current, rotor speed), and the control signals (duty cycles) were sent back to the simulated inverter. This closed-loop setup permits real-time testing of the controller without actual hardware, which makes experiments under conditions such as voltage disturbance, measurement noise, and load uncertainty safe and repeatable. The HIL tests show that the suggested system works reliably with a low execution delay, which supports its suitability for real-time application. Traditional MATLAB/Simulink simulations run in non-real time and depend on the variable processing speed of the CPU; the HIL simulation, by contrast, provides deterministic, fixed-step real-time execution (typically with steps in the range of 1–50 µs), hardware-level accuracy, and real-time feedback, all of which are essential for validating control strategies under realistic operating conditions. A more detailed description of the HIL setup is given in [31].
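The text does not state which digital filter approximates the fractional-order terms. One common option is the Grünwald-Letnikov definition truncated to a finite memory window, which yields a short FIR filter. The Python sketch below illustrates that option only and is not the filter deployed on the HIL platform. The function names and the `memory` length are assumptions.

```python
def gl_weights(alpha, n):
    """First n Gruenwald-Letnikov weights, w_j = (-1)^j * C(alpha, j)."""
    w = [1.0]
    for j in range(1, n):
        # Recurrence for the signed generalized binomial coefficients.
        w.append(w[-1] * (1.0 - (alpha + 1.0) / j))
    return w


def fractional_derivative(samples, alpha, dt, memory=64):
    """Approximate D^alpha at the latest sample using a finite memory window."""
    window = samples[-memory:][::-1]  # newest sample first
    w = gl_weights(alpha, len(window))
    return sum(wj * xj for wj, xj in zip(w, window)) / dt ** alpha


# Sanity checks on integer orders: alpha = 1 is a backward difference,
# alpha = -1 is a running (rectangular) integral.
slope = fractional_derivative([0.0, 0.5, 1.0, 1.5], alpha=1.0, dt=0.5)
area = fractional_derivative([1.0, 1.0, 1.0, 1.0], alpha=-1.0, dt=0.5)
half = gl_weights(0.5, 3)
```

A positive non-integer `alpha` gives a fractional derivative term and a negative one gives a fractional integral term. A longer `memory` improves accuracy at the cost of more multiplications per sample.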

7. HIL Simulation Results

This configuration speeds up development, improves safety, and offers a real-time evaluation of system reaction and stability. The Supervisory Control and Data Acquisition (SCADA) environment and HIL simulation used in this investigation are shown in Figure 11. The online platform for the BLDC motor with the suggested speed controller is displayed in Figure 11a, and the real-time SCADA platform for the online results is displayed in Figure 11b. This real-time testing setup is used to test various scenarios in order to assess the effectiveness of the suggested controller for BLDC’s speed regulation. The specifics of this technique are explained in [23].

7.1. Case 1: Tracking Performance

First, to examine the performance of the proposed controller in the real-time environment, references of 5000 RPM and 6000 RPM are tracked with no disturbances and a 350 V supply voltage (Figure 12). In addition, the phase currents and the back-EMF of each phase are shown for the proposed C-SO-RL-FOPID controller.
Figure 12 shows the results of the C-SO-RL-FOPID controller tracking the speed of a BLDC motor at 5000 RPM (top) and 6000 RPM (bottom). The left-hand graphs (Figure 12a,c) show how the controller tracks the reference speed, and the right-hand graphs (Figure 12b,d) show the currents and back-EMF of the three motor phases. In both cases, the proposed controller tracks the desired speed smoothly and stably, with little overshoot and no noticeable steady-state error, demonstrating the effectiveness of fractional-order control for complex dynamics. The RL algorithm is essential in optimizing the gains of the FOPID controller: by adapting the controller parameters, it ensures a well-tuned response in terms of speed tracking, current regulation, and overall stability. SO tuning helps the controller to cope with the nonlinearities and disturbances of the system and leads to a well-coordinated control strategy for the current and speed loops. The right-hand graphs also demonstrate the robustness of the controller, since the currents ($i_a$, $i_b$, $i_c$) and back-EMF signals ($e_a$, $e_b$, $e_c$) are well regulated at both speeds. The waveforms are consistent and show very little distortion, which indicates that the controller can work under different operating conditions without affecting the performance of the motor. As a result, the combination of cascade control, the FOPID methodology, and the SO algorithm provides high performance in speed tracking, current regulation, and stability of the BLDC motor. It is an efficient solution for challenging motor control applications because fractional-order control increases the flexibility of the controller and the SO algorithm improves its performance.
The current waveforms in Figure 12b,d represent the three phases of the BLDC motor at 5000 RPM and 6000 RPM, respectively. The currents are correctly aligned with the trapezoidal back-EMF waveforms, which indicates that commutation works properly. The proposed controller produces smooth and symmetrical current profiles in all phases with minimal distortion or ripple. This shows that the inner current control loop is well damped in steady-state operation. There are also no current spikes or asymmetries, which indicates that the system effectively suppresses switching transients and handles inverter switching losses.
Note that the back-EMF waveforms shown in Figure 12 are obtained from the mathematical model under ideal conditions and represent the theoretical shape of $e_{a,b,c}(\theta_r, \omega_r)$. In practice, back-EMF signals contain small distortions caused by current ripple and inverter switching. These effects are not shown in this plot; however, they were accounted for in the HIL environment through detailed inverter modeling and signal coupling. The performance of the overall controller under these non-ideal conditions has been confirmed by real-time testing.

7.2. Case 2: Variable Speed Tracking—Controller Performance Evaluation

Figure 13 illustrates the real-time operation of the proposed cascade controller tracking the speed of a BLDC motor with different speed references, carried out on the HIL system. The motor speed (red line) follows the reference speed (black dashed line) through the successive changes in speed setpoint, which demonstrates that the controller is accurate and responsive. The motor speed tracks the reference with little deviation and no major overshoot, demonstrating the effectiveness of the C-SO-RL-FOPID controller in real-time applications. It is thus an effective and efficient motor control solution for real-time use, especially in demanding industrial settings.

7.3. Case 3: Supply Voltage Variation Tracking: Controller Performance

Another scenario that demonstrates the robustness of this method is a change in the supply voltage level. In this case, the impact of supply voltage variation is tested to confirm the stability of the system. Figure 14 shows the performance of the controller in handling this disturbance at various levels.
The HIL results show that the proposed C-SO-RL-FOPID controller is highly robust to abrupt changes in the supply voltage. Under sudden variations of the input voltage between 300 V and 450 V, the controller consistently holds the motor speed close to the 5000 RPM reference. It keeps the speed within a very tight range and recovers quickly after every voltage change. This confirms that the controller can cope with unpredictable variations in supply voltage, so the motor can run reliably and stably in a real operating environment despite large external disturbances.
Industrial settings in which motors operate are not ideal, so the motor dynamics are likely to change during operation. To confirm the high efficiency and strong dynamic performance of this controller in real-time applications, abrupt parametric changes are tested in Figure 15.

7.4. Case 4: Load Variation-Controller Performance Tracking

Figure 15 shows the real-time simulation results of the BLDC motor with the proposed C-SO-RL-FOPID controller under two different parametric changes. The two marked steps, First Step and Second Step, are the instants at which variations in internal resistance and inductance were introduced to evaluate the ability of the controller to regulate the speed. The controller follows the reference speed robustly and stably despite the added disturbances.

7.5. Case 5: Noise Impact

In practice, EV motor control systems are exposed to many kinds of disturbance and noise, arising from the environment, electromagnetic interference, or sensor errors. Robustness to these noisy conditions is important for stable and reliable performance in practice. This case study examines the behaviour of the suggested C-SO-RL-FOPID controller in the presence of sudden noise disturbances. Figure 16 shows the robustness of the suggested controller in a noisy environment.
The BLDC motor was operated at a fixed reference speed of 5000 RPM, and noise with variances from 0.5 to 2.5 was added at different time intervals to represent real-world noise. With a noise variance of 0.5, the controller held a stable speed near the reference, showing high noise rejection with only slight variations. When the variance was set to 1, the fluctuations in speed were more significant, but the controller reduced the deviations and returned to stability quickly. The greatest deviations were recorded at the largest noise variance of 2.5, with occasional sharp dips and spikes in speed. Despite these variations, the controller settled the motor speed after every disturbance, which suggests a high level of robustness. These findings indicate that the controller manages moderate noise levels efficiently and remains effective at high disturbance levels, although adaptive improvements or additional filtering might be required for extreme noise levels. In general, the results support the reliability and applicability of the cascade SO-RL-FOPID controller for real-life EVs and point to future directions for improving its performance. Across the different scenarios, the experimental results show that the cascade SO-RL-FOPID controller is more reliable, robust, and adaptive than the other controllers, and it was the best-performing and most versatile option in every test carried out. To underline the benefits of this method further, the next section presents a thorough comparison of the main features and assessment criteria of the various controllers.
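Noise of a given variance can be injected into a measurement as in the Python sketch below. The text specifies only the variance levels, so the sketch assumes zero-mean Gaussian noise added to the speed measurement. The function name `add_measurement_noise`, the fixed seed, and the sample count are illustrative.

```python
import random


def add_measurement_noise(speed_rpm, variance, rng):
    """Return the speed sample corrupted by zero-mean Gaussian noise."""
    # random.gauss takes a standard deviation, hence the square root.
    return speed_rpm + rng.gauss(0.0, variance ** 0.5)


rng = random.Random(0)  # fixed seed so the run is repeatable
noisy = [add_measurement_noise(5000.0, 2.5, rng) for _ in range(20000)]
mean = sum(noisy) / len(noisy)
sample_var = sum((x - mean) ** 2 for x in noisy) / (len(noisy) - 1)
```

With this many samples, the sample mean and variance should lie close to 5000 and 2.5, which gives a quick check that the injected noise has the intended level.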
Note that the dynamic performance observed in the real-time Hardware-in-the-Loop (HIL) environment differs slightly from the results of the offline simulation. The differences arise mainly because non-idealities such as signal sampling delays, switching transients, and parasitic effects are included in the HIL platform but are generally absent from idealized MATLAB/Simulink simulations. In addition, real-time control hardware introduces quantization and processing delays, which influence current loop accuracy and transient timing. Despite these differences, the general control goals of fast tracking, low overshoot and rapid response to disturbances are still met. The consistency of the trends and behavior supports the practical feasibility of the proposed controller.

8. Discussion

Although the suggested controller incorporates advanced ideas such as fractional calculus and RL, the overall framework is computationally efficient and practical for real-time implementation. The core controller is a typical cascade PI structure, and the fractional-order operators, implemented as low-order filters, add only two extra degrees of freedom. These filters are chosen to be accurate while placing a minimal computational load on the processor. SOA is used in the offline stage to initialize the controller gains. As a metaheuristic, it provides global exploration in the early tuning phase without adding to the computational burden at runtime. For online adaptability, a DDPG agent updates the PI gains in real time. Although this adds computation, the learning agent is a modular optimization block that revises the controller parameters at a fixed interval based on state feedback. The RL block runs in parallel with the main control loop and uses a compact neural network structure so that it converges efficiently. The feasibility of the suggested method has been demonstrated on the Typhoon HIL 606 platform. The results indicate that the system handled real-time signal acquisition, learning-based adaptation, and control computation without degrading stability or responsiveness. This establishes that the suggested algorithm is not only robust and flexible but can also be deployed on real-time embedded control systems. Despite the integration of sophisticated elements, including fractional-order dynamics, optimization heuristics, and deep reinforcement learning, the controller is designed for robustness through modularity and fault tolerance.
The inherent PI control structure ensures that the system is stable without real-time tuning, and the extra elements are used to enhance tracking precision, flexibility and disturbance rejection. Moreover, a large number of tests were performed both in simulation and real time to determine the robustness to a large variety of disturbances such as voltage variations, loading variations, parametric uncertainties and noise injection. These tests proved that the suggested approach does not only preserve the accuracy of control but also is capable of responding to the dynamic operating conditions, which proves its stability and resilience in real-life settings.

Novelty Highlight

The originality of this work can be summarized as follows:
  • Cascade RL-DDPG-SO-FOPID Control Structure—A dual-loop cascade Fractional-Order PID (FOPID) controller is developed, where both inner current and outer speed loops are coordinated. The architecture is enhanced by RL-DDPG and SOA, which is not reported in existing BLDC motor control literature.
  • Hybrid Initialization and Online Adaptation—Unlike prior works that rely only on offline tuning or solely adaptive learning, this study introduces a hybrid framework: (i) SOA provides optimal initial gain tuning to avoid poor initialization, and (ii) RL-DDPG continuously adjusts the controller gains in real-time to ensure adaptability under nonlinear, uncertain, and dynamic EV conditions.
  • Hardware-in-the-Loop Validation—The proposed RL-DDPG-SO-FOPID controller is implemented and tested on a Typhoon HIL 606 real-time platform, which incorporates non-idealities such as parasitic elements, delays, and switching transients. This ensures practical feasibility beyond conventional MATLAB/Simulink-based studies.
  • Extensive Robustness Testing—The controller is validated under diverse scenarios including abrupt reference speed changes, voltage disturbances, parametric uncertainties, load variations, and noise injection. Results confirm that the RL-DDPG-SO-FOPID controller offers superior adaptability, tracking accuracy, and robustness compared with classical PI/PID and metaheuristic-based controllers.
In summary, this work presents the first cascade RL-DDPG-SO-FOPID control scheme for BLDC motor drives, establishing a novel hybrid adaptive control strategy that is both theoretically innovative and practically validated for EV applications.

9. Conclusions

In conclusion, this paper proposes a robust and intelligent control strategy for BLDC motors based on a cascade fractional-order PID (FOPID) controller. The suggested dual-loop structure overcomes the shortcomings of traditional PID controllers in dealing with the nonlinearities, parameter variations, and changing dynamics that are characteristic of BLDC motor systems. By separating the control tasks into an inner loop that regulates current and an outer loop that controls speed, the cascade structure makes the system more responsive, more stable, and better able to reject disturbances over a wide range of operating conditions. The non-integer dynamics of the FOPID controller offer several benefits over conventional PID methods, such as better noise resistance, more tuning freedom, and better handling of system uncertainties. To address the selection of optimal gain values and their adaptation in real time, a hybrid learning-based method is proposed. The Snake Optimization Algorithm (SOA) provides a good initial set of controller gains so that training starts from a well-conditioned point. The RL method, based on the Deep Deterministic Policy Gradient (DDPG) algorithm, then adjusts the FOPID gains online according to the performance of the system. This combination allows the controller to optimize its behavior continually with respect to varying load conditions, disturbances and unmodeled dynamics. The effectiveness of the proposed cascade RL-SOA-FOPID controller is demonstrated by real-time implementation with Hardware-in-the-Loop (HIL) simulation. The results show faster tracking, lower overshoot and greater robustness to variations in supply voltage and parameters than classical and fixed-gain controllers. On the whole, the designed controller offers a flexible, accurate and stable solution for advanced motor control applications.
Although the current validation scenarios involve a wide range of disturbances and dynamic loads, future studies will use standard EV driving cycles, e.g., the EPA UDDS and WLTP, to test the performance of the controller further under realistic operating conditions. Multi-objective or transfer learning-based reinforcement frameworks may also be explored to improve generalization and convergence across different types of motors and environments. All in all, the suggested controller presents a realistic, adaptive, and high-performance solution for real-world BLDC motor control.

Author Contributions

S.M.G.: Conceptualization, Methodology, Software, Writing—original draft, Investigation, Validation, Implementation. A.A.: Principal Supervision, Writing—review & editing, Methodology, Project Management. M.G.: Writing—review & editing. D.H.: Supervision, Writing—review & editing. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Data is available on request from the authors.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Vk, A.R.; Prasad, V. Online Adaptive Gain for Passivity-Based Control for Sensorless BLDC Motor Coupled with DC Motor for EV Application. IEEE Trans. Power Electron. 2023, 38, 13625–13634.
  2. Zhang, H.; Li, H. Fast Commutation Error Compensation Method of Sensorless Control for MSCMG BLDC Motor with Nonideal Back EMF. IEEE Trans. Power Electron. 2020, 36, 8044–8054.
  3. Mohanraj, D.; Aruldavid, R.; Verma, R.; Sathiyasekar, K.; Barnawi, A.B.; Chokkalingam, B.; Mihet-Popa, L. A review of BLDC motor: State of art, advanced control techniques, and applications. IEEE Access 2022, 10, 54833–54869.
  4. Gad, E.; Pimentel, J. An algebraic approach for the stability analysis of BLDC motor controllers. arXiv 2020, arXiv:2007.01387.
  5. Kumarasamy, V.; KarumanchettyThottam Ramasamy, V.; Chandrasekaran, G.; Chinnaraj, G.; Sivalingam, P.; Kumar, N.S. A review of integer order PID and fractional order PID controllers using optimization techniques for speed control of brushless DC motor drive. Int. J. Syst. Assur. Eng. Manag. 2023, 14, 1139–1150.
  6. Bodur, F.; Kaplan, O. Integral sliding mode control with improved reaching law for brushless DC motor speed control. In Proceedings of the 2023 11th International Conference on Smart Grid (icSmartGrid), Paris, France, 4–7 June 2023.
  7. Alnaib, I.I.; Alsammak, A.N.; Mohammed, K.K. Brushless DC motor drive with optimal fractional-order sliding-mode control based on a genetic algorithm. Electr. Eng. ElectroMech. 2025, 2, 19–23.
  8. Sikora, A.; Zielonka, A.; Woźniak, M.; Orság, P.; Mlčák, T.; Hrabovský, L. Fuzzy control system to improve the efficiency of the brushless direct current motor by correcting the control angle. Int. J. Electr. Power Energy Syst. 2025, 169, 110762.
  9. He, X.; Yu, Q.; Pan, X.; Liu, L.; Jiang, Z.; Zhao, W.; Fan, R. Improved beluga whale optimization-based variable universe fuzzy controller for brushless direct current motors of electric tractors. Comput. Electr. Eng. 2024, 120, 109866.
  10. Zhang, Y.; Gono, R.; Jasiński, M. An improvement in dynamic behavior of single phase PM brushless DC motor using deep neural network and mixture of experts. IEEE Access 2023, 11, 64260–64271.
  11. Benbouhenni, H.; Bizon, N.; Mosaad, M.I.; Colak, I.; Djilali, A.B.; Gasmi, H. Enhancement of the power quality of DFIG-based dual-rotor wind turbine systems using fractional order fuzzy controller. Expert Syst. Appl. 2024, 238, 121695.
  12. Ghamari, S.M.; Molaee, H.; Ghahramani, M.; Habibi, D.; Aziz, A. Design of an Improved Robust Fractional-Order PID Controller for Buck–Boost Converter using Snake Optimization Algorithm. IET Control Theory Appl. 2025, 19, e70008.
  13. Kaveh, A.; Vahedi, M.; Gandomkar, M. Improving the Performance of the Chaotic Nonlinear System of the Fractional-Order Brushless Direct Current Electric Motor by Using Fractional-Order Sliding Mode Control. 2023. Available online: https://assets-eu.researchsquare.com/files/rs-3166378/v1_covered_819d5038-8444-4659-a890-9c1d5ded5ff1.pdf (accessed on 8 July 2023).
  14. Abro, K.A.; Atangana, A.; Gómez-Aguilar, J.F. Chaos control and characterization of brushless DC motor via integral and differential fractal-fractional techniques. Int. J. Model. Simul. 2023, 43, 416–425.
  15. Sharma, M.; Sharma, S.; Vajpai, J. A Novel Approach to Design and Analyze Fractional Order PID Controller for Speed Control of Brushless DC motor. Renew. Energy Sustain. Dev. 2024, 10, 279–293.
  16. Ghamari, S.; Habibi, D.; Ghahramani, M.; Aziz, A. Robust Cascade PID-Based Controller Design for Brushless DC Motor Using Antlion Optimization Algorithm. In Proceedings of the 2024 International Conference on Sustainable Technology and Engineering (i-COSTE), Perth, Australia, 18–20 December 2024.
  17. Vanchinathan, K.; Valluvan, K.R.; Gnanavel, C.; Gokul, C. Numerical simulation and experimental verification of fractional-order PI^λ controller for solar PV fed sensorless brushless DC motor using whale optimization algorithm. Electr. Power Components Syst. 2022, 50, 64–80.
  18. Prabhakaran, A.; Ponnusamy, T.; Janarthanan, G. Optimized fractional order PID controller with sensorless speed estimation for torque control in induction motor. Expert Syst. Appl. 2024, 249, 123574.
  19. Kumari, S.; Kumar, R. Hybridized GWO-RUN optimized fractional order control for permanent magnet brush-less dc motor. Eng. Res. Express 2023, 5, 015056.
  20. Shakya, A.K.; Pillai, G.; Chakrabarty, S. Reinforcement learning algorithms: A brief survey. Expert Syst. Appl. 2023, 231, 120495.
  21. Aske, P. Deep Reinforcement Learning; Springer: Singapore, 2022; Volume 10.
  22. Ye, J.; Guo, H.; Wang, B.; Zhang, X. Deep deterministic policy gradient algorithm based reinforcement learning controller for single-inductor multiple-output DC–DC converter. IEEE Trans. Power Electron. 2024, 39, 4078–4090.
  23. Ghamari, S.M.; Habibi, D.; Aziz, A. Robust Adaptive Fractional-Order PID Controller Design for High-Power DC-DC Dual Active Bridge Converter Enhanced Using Multi-Agent Deep Deterministic Policy Gradient Algorithm for Electric Vehicles. Energies 2025, 18, 3046.
  24. Muktiadji, R.F.; Ramli, M.A.; Milyani, A.H. Twin-delayed deep deterministic policy gradient algorithm to control a boost converter in a DC microgrid. Electronics 2024, 13, 433.
  25. Zholtayev, D.; Rubagotti, M.; Do, T.D. Deep reinforcement learning for PMSG wind turbine control via twin delayed deep deterministic policy gradient (TD3). Optim. Control Appl. Methods 2024, 45, 1889–1906.
  26. Saha, U.; Jawad, A.; Shahria, S.; Rashid, A.H.U. Proximal policy optimization-based reinforcement learning approach for DC-DC boost converter control: A comparative evaluation against traditional control techniques. Heliyon 2024, 10, e37823.
  27. Chen, P.; Zhao, J.; Liu, K.; Zhou, J.; Dong, K.; Li, Y. A review on the applications of reinforcement learning control for power electronic converters. IEEE Trans. Ind. Appl. 2024, 60, 8430–8450.
  28. Alejandro-Sanjines, U.; Maisincho-Jivaja, A.; Asanza, V.; Lorente-Leyva, L.L.; Peluffo-Ordóñez, D.H. Adaptive PI controller based on a reinforcement learning algorithm for speed control of a DC motor. Biomimetics 2023, 8, 434.
  29. Cheng, H.; Jung, S.; Kim, Y.B. A novel reinforcement learning controller for the DC-DC boost converter. Energy 2025, 321, 135479.
  30. Ghamari, S.M.; Hajihosseini, M.; Habibi, D.; Aziz, A. Design of an Adaptive Robust PI Controller for DC/DC Boost Converter Using Reinforcement-Learning Technique and Snake Optimization Algorithm. IEEE Access 2024, 12, 141814–141829.
  31. Ghamari, S.; Habibi, D.; Ghahramani, M.; Aziz, A. Design of a Robust Adaptive Cascade Fractional-Order Nonlinear-Based Controller Enhanced Using Grey Wolf Optimization for High-Power DC/DC Dual Active Bridge Converter in Electric Vehicles. IET Power Electron. 2025, 18, e70056.
  32. Hashim, F.A.; Hussien, A.G. Snake Optimizer: A novel meta-heuristic optimization algorithm. Knowl. Based Syst. 2022, 242, 108320.
  33. Çelik, E.; Karayel, M. Effective speed control of brushless DC motor using cascade 1PDf-PI controller tuned by snake optimizer. Neural Comput. Appl. 2024, 36, 7439–7454.
  34. Mohammadi, F.; Kaffash, A.; Donyagozashteh, Z.; Marasi, M.; Tavakoli, M. Design of a novel robust adaptive backstepping controller optimized by snake algorithm for buck-boost converter. IET Control Theory Appl. 2025, 19, e12770.
  35. Manoharan, S.K.; Megalingam, R.K.; Shaju, B. Cascade PI Control with Extended Kalman Filtering for BLDC Actuators in Collaborative Robotic Arm. In Proceedings of the 2024 Parul International Conference on Engineering and Technology (PICET), Vadodara, India, 3–4 May 2024.
  36. Mossadak, M.A.; Chebak, A.; Ouahabi, N.; Rabhi, A.; Elmahjoub, A.A. A novel hybrid PI–backstepping cascade controller for battery–supercapacitor electric vehicles considering various driving cycles scenarios. IET Power Electron. 2024, 17, 1089–1105.
  37. Hanselman, D.C. Brushless Permanent Magnet Motor Design; The Writers’ Collective: Cranston, RI, USA, 2003.
  38. Ali, E. (Ed.) Handbook of Automotive Power Electronics and Motor Drives; CRC Press: Boca Raton, FL, USA, 2017.
  39. Feng, Z.; Ramesh, R.R.; Tahim, E.S.; Zhang, J.; Ebrahimi, S.; Jatskevich, J. Torque Ripple Reduction in Brushless DC Motors with 180° Commutation. IEEE Trans. Ind. Appl. 2025, 61, 7304–7317.
  40. Saha, B.; Singh, B. Torque ripple mitigation in sensorless PMBLDC motor drive with adaptive observer for LEV. IEEE Trans. Power Electron. 2024, 40, 1739–1747.
  41. Ramu, K. Permanent Magnet Synchronous and Brushless DC Motor Drives; CRC Press: Boca Raton, FL, USA, 2017.
  42. Niapour, S.A.; Garjan, G.H.; Shafiei, M.; Feyzi, M.R.; Danyali, S.; Bahrami Kouhshahi, M. Review of permanent-magnet brushless DC motor basic drives based on analysis and simulation study. Int. Rev. Electr. Eng. 2014, 9, 930–957.
  43. Wu, Z.; Lyu, H.; Shi, Y.; Shi, D. On Stability of Open-Loop Operation Without Rotor Information for Brushless DC Motors. Math. Probl. Eng. 2014, 1, 740498.
  44. Zhang, W.; Li, J.; Chen, M. Global exponential stability and existence of periodic solutions for delayed reaction-diffusion BAM neural networks with Dirichlet boundary conditions. Bound. Value Probl. 2013, 2013, 105.
  45. Li, J.; Zhang, W.; Chen, M. Synchronization of delayed reaction–diffusion neural networks via an adaptive learning control approach. Comput. Math. Appl. 2013, 65, 1775–1785.
Figure 1. Schematic of the BLDC motor drive with the proposed control technique: (a) general structure; (b) current and EMF waveforms.
Figure 2. Stability classification of BLDC motor open-loop operation based on the commutation angle error $\Delta\theta$.
Figure 3. Schematic diagram of RL-DDPG method used [31].
Figure 4. Block diagram of the proposed cascaded control system for the BLDC motor using FOPID controllers.
Figure 5. Closed-loop analysis of the structure with pole-zero placement of the system.
Figure 6. Tracking performance of the controllers with set point of 6000 RPM on the BLDC motor.
Figure 7. Tracking performance of the controllers under the impact of sudden reference speed variations.
Figure 8. The performance of the controllers under the impact of sudden supply voltage changes.
Figure 9. Robustness of the controller under parametric variations.
Figure 10. HIL Typhoon setup with 606 Module [23].
Figure 11. Testing process using real-time setup; (a) testing model using the proposed controller in HIL environment, (b) HIL simulation and corresponding SCADA platform.
Figure 12. Fixed speed tracking; (a) tracking 5000 RPM, (b) current and back EMF generated by the proposed controller in tracking 5000 RPM, (c) tracking 6000 RPM, (d) current and back EMF generated by the proposed controller in tracking 6000 RPM.
Figure 13. Variable speed tracking performance of the controller.
Figure 14. Impact of supply voltage variation on the performance of the controllers.
Figure 15. Performance of the controllers under load variations.
Figure 16. Robustness of the proposed controller under various levels of noise.
Table 1. Specifications of the BLDC motor.

| Parameter | Definition | Value |
|---|---|---|
| $V_{DC}$ | Nominal Supply Voltage | 300 V–400 V |
| $R$ | Stator Resistance | 0.04 Ω |
| $L$ | Stator Inductance | 0.04 mH |
| $P$ | Number of Poles | 8 |
| $P_{out}$ | Output Power | 3 kW |
| $\omega_n$ | Nominal Speed | 6000 RPM |
| $K_e$ | Back EMF Constant | 8.78 |
| $\varepsilon$ | Viscous Friction Coefficient | 0.00524 |
| $J$ | Inertia Coefficient | 0.0015 |
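As a quick sanity check on the values in Table 1, the following minimal sketch derives two quantities that the table does not list: the electrical time constant $L/R$ and the electrical frequency at nominal speed. It assumes that $R$ and $L$ are per-phase values and that $P = 8$ poles corresponds to four pole pairs; the function and variable names are illustrative and do not come from the paper.

```python
# Values copied from Table 1; all names below are illustrative assumptions.
R_OHM = 0.04      # stator resistance in ohms
L_H = 0.04e-3     # stator inductance, 0.04 mH expressed in henries
POLES = 8         # number of poles
N_RPM = 6000      # nominal speed in RPM


def electrical_time_constant(resistance, inductance):
    """Return the R-L time constant L/R in seconds."""
    return inductance / resistance


def electrical_frequency_hz(poles, speed_rpm):
    """Return the electrical frequency (pole pairs times mechanical rev/s)."""
    pole_pairs = poles / 2
    return pole_pairs * speed_rpm / 60.0


tau_e = electrical_time_constant(R_OHM, L_H)
f_e = electrical_frequency_hz(POLES, N_RPM)
print(f"tau_e = {tau_e * 1e3:.3f} ms, f_e = {f_e:.1f} Hz")
```

Under these assumptions the electrical time constant is about 1 ms and the electrical frequency at 6000 RPM is 400 Hz, which gives a rough lower bound on how fast the inner (current) loop has to act.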
Table 2. Algorithm process.

| Step | Description |
|---|---|
| 1 | Initial Gain Fine-Tuning with SOA: The Snake Optimization Algorithm (SOA) is used to obtain a well-tuned initial set of FOPID gains ($K_P$, $K_I$, $K_D$) by minimizing an objective function such as the Integral of Squared Error (ISE). These optimized gains initialize the RL agent to improve convergence and early-stage performance. |
| 2 | Initialization: The actor and critic neural networks are initialized using the SOA-tuned gains. Target networks (for both actor and critic) are set equal to the initial networks. A replay buffer is also initialized. |
| 3 | State Observation: At each control step, the system state is observed. The state vector consists of the instantaneous tracking error $e(t)$ and its derivative $\dot{e}(t)$. |
| 4 | Action Selection: The actor network generates an action consisting of three controller gains $[K_P, K_I, K_D]$. To promote exploration, noise is added to the action. |
| 5 | Control Execution: The FOPID controller applies the selected gains to compute the control signal, which is then fed to the BLDC motor. |
| 6 | Reward Computation: A scalar reward is computed to penalize tracking error, fast fluctuations, and high control effort. The typical reward function is defined in Equation (17). |
| 7 | Experience Storage: The current state, action, reward, and next state are stored as a transition tuple in the replay buffer. |
| 8 | Critic Network Update: A mini-batch is sampled from the buffer. The critic is updated by minimizing the error between the predicted Q-value and the target Q-value, which is computed using the target networks. |
| 9 | Actor Network Update: The actor is updated using the deterministic policy gradient method to improve the action predictions based on the critic’s feedback. |
| 10 | Target Network Soft Update: The weights of the target networks are softly updated to slowly track the main networks using a weighted average. |
| 11 | Repeat: Steps 3 to 10 are repeated continuously to refine the controller gains in real time, ensuring adaptability and robustness under varying operating conditions. |
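Steps 7, 8 and 10 of Table 2 rest on three small mechanisms: a replay buffer, a bootstrapped target Q-value, and a weighted-average target update. The sketch below illustrates them without any deep-learning framework. The class and function names, the discount factor `gamma` and the soft-update rate `tau` are assumptions introduced here, not values reported in the paper, and network weights are represented as plain lists of floats.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of (state, action, reward, next_state) tuples (step 7)."""

    def __init__(self, capacity):
        # A bounded deque silently drops the oldest transition when full.
        self.buffer = deque(maxlen=capacity)

    def store(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size):
        # Uniform random mini-batch without replacement (used in step 8).
        return random.sample(list(self.buffer), batch_size)


def td_target(reward, next_q, gamma=0.99):
    """Target Q-value for the critic update (step 8); gamma is assumed."""
    return reward + gamma * next_q


def soft_update(target_weights, main_weights, tau=0.005):
    """Weighted-average target update (step 10); tau is assumed."""
    return [tau * w + (1.0 - tau) * wt
            for w, wt in zip(main_weights, target_weights)]


# Usage: state = (e, de/dt), action = (K_P, K_I, K_D), all values illustrative.
buf = ReplayBuffer(capacity=3)
for k in range(5):
    buf.store((0.1 * k, 0.0), (1.0, 0.5, 0.1), -0.1 * k, (0.1 * (k + 1), 0.0))
batch = buf.sample(2)
new_target = soft_update([0.0, 0.0], [1.0, 2.0], tau=0.5)
```

A small `tau` makes the target networks lag the main networks, which is what keeps the critic's regression target from moving too quickly between updates.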
Table 3. Comparison of optimization methods for controller gain initialization.

| Algorithm | Convergence Iteration | Final IAE | Std. Deviation | Param. Sensitivity | CPU Time (s) | Final Fitness |
|---|---|---|---|---|---|---|
| SOA | 45 | 1.48 | 0.005 | 1.2% | 15.2 | 0.110 |
| GWO | 52 | 1.62 | 0.007 | 1.5% | 17.4 | 0.125 |
| ALO | 57 | 1.73 | 0.009 | 1.8% | 18.6 | 0.138 |
| PSO | 74 | 2.38 | 0.026 | 3.7% | 19.8 | 0.196 |
| GA | 85 | 2.61 | 0.033 | 4.2% | 31.4 | 0.209 |
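The IAE column in Table 3 and the ISE objective mentioned in Table 2 are both integrals of the tracking error. A minimal sketch of how such indices can be approximated from a uniformly sampled error signal is given below; the trapezoidal rule, the function names and the sample data are assumptions made for illustration and are not taken from the paper.

```python
def iae(errors, dt):
    """Integral of Absolute Error, approximated with the trapezoidal rule."""
    abs_e = [abs(e) for e in errors]
    return sum(0.5 * (a + b) * dt for a, b in zip(abs_e, abs_e[1:]))


def ise(errors, dt):
    """Integral of Squared Error, approximated with the trapezoidal rule."""
    sq_e = [e * e for e in errors]
    return sum(0.5 * (a + b) * dt for a, b in zip(sq_e, sq_e[1:]))


# Illustrative error samples taken every dt = 1.0 s.
errors = [1.0, 0.5, -0.5, 0.0]
iae_value = iae(errors, dt=1.0)
ise_value = ise(errors, dt=1.0)
```

IAE weights all errors linearly, whereas ISE penalizes large excursions more heavily, so the two indices can rank the same pair of responses differently.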
Table 4. Control parameters and algorithm-specific settings for the SOA.

| Parameter | Value | Description |
|---|---|---|
| Number of Snakes ($N$) | 25 | Total population size (search agents) used for exploring and exploiting the search space. |
| Number of Snake Pairs | 12 | Number of cooperative pairs (snakes) that form during the pairing phase to balance diversification. |
| Search Dimension ($D$) | 6 | Dimensionality of the problem. |
| Maximum Iterations ($T_{\max}$) | 100 | Maximum number of iterations allowed for convergence. |
| Head–Tail Behavior Ratio | 0.7 | Ratio controlling the transition between exploration (head) and exploitation (tail) phases. |
| Step Size Coefficient ($\lambda$) | 0.05–1.0 | Controls snake movement range per iteration; adaptively reduced to focus on exploitation. |
| Temperature Factor ($\tau$) | 0.95 | Influences convergence stability during search, especially for avoiding local optima. |
| Escape Mechanism | Enabled | Snakes periodically change direction if stagnation is detected in local neighborhood. |
| Fitness Function | IAE + Param. Sensitivity | Multi-objective function to minimize error (IAE) and improve robustness against parameter variation. |
| Boundary Control | Reflective | Ensures search agents stay within feasible control gain bounds by reflecting them at boundaries. |
| Initialization Method | Random + Uniform Spread | Initial snakes are randomly positioned but spread uniformly across the search space. |
| Pair Switching Frequency | Every 10 Iterations | Snake pairs are reshuffled periodically to maintain diversity. |
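The "Reflective" boundary control in Table 4 can be illustrated with a short sketch. The version below mirrors an out-of-range candidate gain back into its feasible interval, folding repeatedly if the overshoot exceeds the interval width. The function name and the bounds are hypothetical, and the paper does not state which reflection rule its implementation uses.

```python
def reflect(value, lower, upper):
    """Mirror value back into [lower, upper] by reflecting at the bounds."""
    if lower >= upper:
        raise ValueError("lower must be smaller than upper")
    span = upper - lower
    # Fold the offset into one period of length 2*span ...
    offset = (value - lower) % (2.0 * span)
    # ... and mirror the second half of the period back into the interval.
    if offset > span:
        offset = 2.0 * span - offset
    return lower + offset


# Illustrative bounds [0, 1] for a single gain.
inside = reflect(0.5, 0.0, 1.0)    # already feasible, returned unchanged
above = reflect(1.2, 0.0, 1.0)     # overshoots the upper bound by 0.2
below = reflect(-0.3, 0.0, 1.0)    # undershoots the lower bound by 0.3
```

Compared with simply clipping to the bound, reflection avoids piling search agents up exactly on the boundary, which helps preserve population diversity.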
Table 5. Gain values of the controllers.
ControllerLoop K P K I K D μ λ
PSO-PI127.25×××
PSO-PID10.525.750.78××
Cascade SO-PIDInner0.250.0210.074××
Outer7.250.040.57××
Cascade PSO-FOPIDInner1.082.070.420.910.86
Outer1.04 × 10−30.420.030.250.54
Cascade SO-FOPIDInner0.98 × 10−30.1050.010.751.093
Outer2.08 × 10−40.380.580.930.25
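To show how the five parameters in Table 5 enter a controller, the sketch below evaluates the frequency response of a fractional-order PID. It assumes the common $PI^{\lambda}D^{\mu}$ form $C(s) = K_P + K_I s^{-\lambda} + K_D s^{\mu}$, with $\lambda$ as the integral order and $\mu$ as the derivative order; the paper's own convention should be checked against its controller equation. The gains used are hypothetical and are not taken from Table 5.

```python
import cmath
import math


def fopid_response(kp, ki, kd, lam, mu, omega):
    """Evaluate C(jw) = kp + ki*(jw)^(-lam) + kd*(jw)^mu for omega > 0."""
    if omega <= 0:
        raise ValueError("omega must be positive")
    jw = complex(0.0, omega)
    # Python evaluates complex powers on the principal branch.
    return kp + ki * jw ** (-lam) + kd * jw ** mu


# Hypothetical gains and orders, chosen only for illustration.
c = fopid_response(kp=1.0, ki=2.0, kd=0.5, lam=0.9, mu=0.8, omega=10.0)
print(f"|C| = {abs(c):.4f}, phase = {math.degrees(cmath.phase(c)):.2f} deg")

# With lam = mu = 1 the expression reduces to an ordinary PID.
pid_like = fopid_response(1.0, 1.0, 1.0, 1.0, 1.0, 1.0)
```

The non-integer orders let the phase contribution of the integral and derivative terms take values other than the fixed −90° and +90° of an integer-order PID, which is the extra tuning freedom referred to in the abstract.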
Table 6. Comparative analysis based on Figure 5.

| Controller | Tracking | Overshoot | Undershoot | Stability |
|---|---|---|---|---|
| PSO-PI | Fast | Very High | Very High | High |
| PSO-PID | Fast | Very High | Very High | High |
| C-SO-PID | Slow | None | High | Moderate |
| C-PSO-FOPID | Moderate | None | Low | High |
| C-SO-RL-FOPID | Fastest | None | Very Low | Very High |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Ghamari, S.M.; Ghahramani, M.; Habibi, D.; Aziz, A. Design of a Robust Adaptive Cascade Fractional-Order Proportional–Integral–Derivative Controller Enhanced by Reinforcement Learning Algorithm for Speed Regulation of Brushless DC Motor in Electric Vehicles. Energies 2025, 18, 5056. https://doi.org/10.3390/en18195056
