Design of Robust Adaptive Nonlinear Backstepping Controller Enhanced by Deep Deterministic Policy Gradient Algorithm for Efficient Power Converter Regulation
Abstract
1. Introduction
- An adaptive backstepping control (ABSC) approach is designed for Boost and Buck converters, with its dynamics improved using Lyapunov theory. The approach targets accurate reference tracking with reduced sensitivity to errors.
- The ABSC approach is enhanced with a reinforcement learning deep deterministic policy gradient (RL-DDPG) method that tunes the control parameters online, yielding better disturbance rejection and greater adaptability to the operating conditions.
- The grey wolf optimization (GWO) algorithm is adopted as a metaheuristic stage that initializes the controller parameters, so that the controller converges faster and adapts better to the frequency behavior of the system.
- The learning-based gain tuning is model-free, which lowers the dependency on an exact mathematical model of the system while providing faster dynamics and better disturbance rejection.
- Hardware-in-the-loop (HIL) setups are used for real-time testing, validating the performance of the system in both simulation and experimental environments and strengthening the practical relevance of the results.
2. Mathematical Modeling of the Converters
2.1. Boost Converter
2.2. State Variables and Input Definition
2.3. Circuit Equations
2.3.1. Switch ON (Q Closed)
2.3.2. Switch OFF (Q Open)
2.4. Averaged State-Space Model
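As a hedged illustration of what an averaged model of this kind looks like, the sketch below uses the textbook continuous-conduction averaged model of an ideal Boost converter, with the inductor current and capacitor voltage as states and the duty ratio `u` as input. The function names and the 20 V supply are illustrative assumptions, the component values follow the parameter table, and losses are neglected, so this is not necessarily the exact model derived in the paper.

```python
# Textbook averaged model of an ideal Boost converter (assumption: CCM, no losses).
# States: i_l (inductor current, A), v_c (capacitor/output voltage, V).
# Input:  u in [0, 1) is the duty ratio of switch Q.

def boost_averaged(i_l, v_c, u, e=20.0, l=2e-6, c=100e-6, r=50.0):
    """Return (di_l/dt, dv_c/dt) of the averaged Boost model."""
    di_dt = (e - (1.0 - u) * v_c) / l        # L di/dt = E - (1 - u) v
    dv_dt = ((1.0 - u) * i_l - v_c / r) / c  # C dv/dt = (1 - u) i - v / R
    return di_dt, dv_dt


def boost_equilibrium(u, e=20.0, r=50.0):
    """Steady state obtained by setting both derivatives to zero."""
    v_c = e / (1.0 - u)
    i_l = v_c / (r * (1.0 - u))
    return i_l, v_c


if __name__ == "__main__":
    i_eq, v_eq = boost_equilibrium(u=0.75)
    print(i_eq, v_eq, boost_averaged(i_eq, v_eq, 0.75))
```

Evaluating the derivatives at the computed equilibrium is a quick sanity check: both should vanish, and the output voltage should equal `e / (1 - u)`.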
2.5. Buck Converter
2.6. State Variables and Input Definition
2.7. Mode 1: Switch ON (Q Closed)
- Diode is reverse-biased.
- Inductor is connected directly to the source.
- Capacitor supplies the load and is charged.
2.8. Mode 2: Switch OFF (Q Open)
- Diode conducts.
- Inductor current freewheels through the diode.
2.9. Averaged State-Space Model
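For the two modes listed above, and assuming an ideal switch and diode with a purely resistive load $R$ (an assumption, since the equations of the paper are not reproduced here), the per-mode equations and their duty-ratio-weighted average can be sketched as follows, with $i_L$ the inductor current, $v_C$ the capacitor voltage, $E$ the supply voltage, and $u \in [0, 1]$ the duty ratio:

```latex
\begin{aligned}
\text{Switch ON:}  &\quad L\,\dot{i}_L = E - v_C,   &\quad C\,\dot{v}_C = i_L - \frac{v_C}{R} \\
\text{Switch OFF:} &\quad L\,\dot{i}_L = -\,v_C,    &\quad C\,\dot{v}_C = i_L - \frac{v_C}{R} \\
\text{Averaged:}   &\quad L\,\dot{i}_L = u\,E - v_C, &\quad C\,\dot{v}_C = i_L - \frac{v_C}{R}
\end{aligned}
```

The averaged row is the ON row weighted by $u$ plus the OFF row weighted by $1 - u$; only the inductor equation changes between modes, so the input enters through the term $uE$.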
3. Controllers
Adaptive Backstepping Technique
4. RL-Based Optimization of ABSC Gains
4.1. Problem Setup
- State:
- Action:
- Reward: A function penalizing poor tracking and excessive control effort.
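One common way to realize a reward of this kind is a negative quadratic cost on the tracking error and the control signal. The sketch below is only an assumption about its shape: the function name and the weights `w_e` and `w_u` are hypothetical and are not taken from the paper.

```python
def tracking_reward(v_ref, v_out, u, w_e=1.0, w_u=0.01):
    """Negative quadratic cost: a larger error or control effort gives a lower reward.

    w_e and w_u are hypothetical weights trading tracking accuracy against effort.
    """
    error = v_ref - v_out
    return -(w_e * error ** 2 + w_u * u ** 2)
```

With this shape the reward is zero only for perfect tracking with zero control input, and the ratio `w_e / w_u` sets how strongly tracking is favored over effort.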
4.2. DDPG Agent Architecture
- An actor network, which outputs continuous-valued gains.
- A critic network, which estimates the expected return.
- Target networks for the actor and the critic, for stable learning.
- A replay buffer for experience replay and sample efficiency.
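Two of the components listed above, the replay buffer and the target networks, can be sketched without any deep learning library. The following is a minimal illustration under stated assumptions: the class and function names are hypothetical, network parameters are represented as flat lists of floats, and the default capacity and mini-batch size follow the hyperparameter table.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of (state, action, reward, next_state) transitions."""

    def __init__(self, capacity=100_000):
        # A bounded deque silently drops the oldest transition when full.
        self.storage = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state):
        self.storage.append((state, action, reward, next_state))

    def sample(self, batch_size=64):
        # Uniform sampling without replacement breaks temporal correlation.
        return random.sample(list(self.storage), batch_size)

    def __len__(self):
        return len(self.storage)


def soft_update(target_params, online_params, tau):
    """Polyak averaging: target <- tau * online + (1 - tau) * target."""
    return [tau * w + (1.0 - tau) * w_t
            for w, w_t in zip(online_params, target_params)]
```

A small `tau` makes the target parameters trail the online parameters slowly, which is what stabilizes the bootstrapped critic targets.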
4.3. Mathematical Formulation
4.4. Learning Loop and Training Behavior
- Observe the system state.
- The actor network outputs the gains.
- The controller computes the control signal using these gains.
- The system returns the new state and the reward.
- The transition is stored in the replay buffer.
- Critic and actor networks are updated from sampled mini-batches using Equations (32)–(35).
- Target networks are updated with a soft update rule to improve stability.
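The interaction part of the loop above can be sketched as a short function. This is an illustrative skeleton rather than the training code of the paper: all names are hypothetical, the network updates are only marked by a comment, plain Gaussian noise stands in for the exploration noise to keep the sketch short, and the toy actor, controller, and plant are stand-ins for the converter setup.

```python
import random


def run_episode(plant_step, actor, controller, state, buffer, n_steps, noise_std=0.0):
    """Observe -> actor gains -> control signal -> new state and reward -> store."""
    total_reward = 0.0
    for _ in range(n_steps):
        # Exploration: perturb the gains proposed by the actor.
        gains = [g + random.gauss(0.0, noise_std) for g in actor(state)]
        u = controller(state, gains)
        next_state, reward = plant_step(state, u)
        buffer.append((state, gains, reward, next_state))
        # Mini-batch critic/actor updates and the soft target update would go here.
        total_reward += reward
        state = next_state
    return total_reward


# Toy stand-ins (hypothetical): scalar error state and one proportional gain.
def toy_actor(error):
    return [5.0]


def toy_controller(error, gains):
    return -gains[0] * error


def toy_plant(error, u):
    next_error = error + 0.1 * u
    return next_error, -next_error ** 2


if __name__ == "__main__":
    transitions = []
    print(run_episode(toy_plant, toy_actor, toy_controller, 1.0, transitions, 3))
```

Keeping the controller between the actor and the plant reflects the structure described above: the agent acts on the gains, not directly on the switch.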
4.5. Stability Consideration
4.6. Neural Network Training and Architecture
4.7. Grey Wolf Optimization for Initial Gain Tuning
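For orientation, a compact version of the standard GWO update is sketched below for a three-dimensional gain vector, using the population size and iteration count from the settings table. The cost function in the usage lines is a hypothetical quadratic surrogate; in practice the cost would come from a closed-loop simulation (for example an IAE value). All names, bounds, and the target point are assumptions made for illustration.

```python
import random


def gwo(cost, bounds, n_wolves=30, n_iter=100, seed=0):
    """Minimize `cost` over box `bounds` with a basic Grey Wolf Optimizer."""
    rng = random.Random(seed)
    dim = len(bounds)
    wolves = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_wolves)]
    leaders = []  # best-so-far (cost, position) pairs: alpha, beta, delta
    for t in range(n_iter):
        scored = leaders + [(cost(w), list(w)) for w in wolves]
        scored.sort(key=lambda pair: pair[0])
        leaders = scored[:3]
        a = 2.0 * (1.0 - t / n_iter)  # decreases linearly from 2 towards 0
        for w in wolves:
            for d in range(dim):
                x_new = 0.0
                for _, leader in leaders:
                    coef_a = 2.0 * a * rng.random() - a
                    coef_c = 2.0 * rng.random()
                    dist = abs(coef_c * leader[d] - w[d])
                    x_new += leader[d] - coef_a * dist
                lo, hi = bounds[d]
                w[d] = min(max(x_new / 3.0, lo), hi)  # average of three pulls, clipped
    final = leaders + [(cost(w), list(w)) for w in wolves]
    best_cost, best_pos = min(final, key=lambda pair: pair[0])
    return best_pos, best_cost


if __name__ == "__main__":
    target = (12.0, 4.0, 0.5)  # hypothetical optimum of the surrogate cost
    surrogate = lambda k: sum((ki - ti) ** 2 for ki, ti in zip(k, target))
    print(gwo(surrogate, [(0.0, 20.0), (0.0, 10.0), (0.0, 2.0)]))
```

The returned position would then serve as the starting gain vector for the online tuning stage.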
Comparative Analysis of Optimization Strategies
- Convergence speed: Number of iterations to reach 95% of minimum cost.
- Final tracking error: Mean square error after optimization.
- Control effort: Sum of squared control input.
- Stability margin: Evaluated through Lyapunov derivative analysis.
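The first three criteria above are simple to compute from logged data. The sketch below shows one possible reading of them; the function names are hypothetical, and "95% of minimum cost" is interpreted here as the first iteration at which 95% of the total cost reduction has been achieved, which is an assumption.

```python
def mean_square_error(errors):
    """Mean of the squared tracking errors."""
    return sum(e * e for e in errors) / len(errors)


def control_effort(inputs):
    """Sum of the squared control inputs."""
    return sum(u * u for u in inputs)


def convergence_iteration(cost_history, fraction=0.95):
    """First 1-based iteration at which `fraction` of the total cost drop is reached."""
    start, best = cost_history[0], min(cost_history)
    threshold = start - fraction * (start - best)
    for k, c in enumerate(cost_history, start=1):
        if c <= threshold:
            return k
    return len(cost_history)
```

For a cost history of `[10, 6, 2, 1.2, 1.0]` the threshold is 1.45, so the convergence iteration under this reading is 4.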
5. Simulation Results
5.1. Case 1: Output Regulation
5.2. Case 2: Supply Voltage Variation
5.3. Case 3: Load Uncertainty
5.4. Case 4: Noise Impact
6. Experimental Implementation
6.1. Case: Tracking Performance
6.2. Case: Supply Voltage Variation
6.3. Case: Load Variation
6.4. Case: Noise Impact
6.5. Case: Sudden Variation
7. Discussion
8. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Symbol | Definition | Boost | Buck |
---|---|---|---|
 | Output Voltage | 60–120 V | 50–120 V |
E | Supply Voltage | 10–50 V | 60–120 V |
L | Inductor | 2 μH | 2 μH |
R | Resistor (Load) | 50 Ω | 50 Ω |
C | Capacitor | 100 μF | 100 μF |
f | Switching Frequency | 20 kHz | 20 kHz |
Component | Description |
---|---|
Actor network architecture | Input layer: 3 neurons; hidden layer 1: 10 neurons (ReLU); hidden layer 2: 6 neurons (ReLU); output layer: 3 neurons with tanh activation scaled to safe gain ranges (Figure 4a). |
Critic network architecture | Inputs: state vector + action vector; hidden layers: 2 layers of 64 neurons each (ReLU activations); output: single Q-value (linear). |
Optimizers | Adam; learning rate for actor, for critic. |
Replay buffer | Capacity: 100,000 transitions; mini-batch size: 64. |
Target networks | Soft update with factor . |
Discount factor | . |
Exploration strategy | Ornstein–Uhlenbeck noise (, decays from 0.2 to 0.05). |
Training schedule | 300 episodes, 6000 steps per episode; warm-up of 5000 random steps before training begins. |
Regularization | Gradient clipping (max norm = 1.0) and normalization of state observations. |
Initialization | Gains initialized offline using Grey Wolf Optimization (GWO), ensuring stable exploration from the start. |
Training outcome | Cumulative reward increased steadily; fluctuations in early episodes due to noise; convergence achieved as actor policy stabilized. Training/test losses converge towards zero (Figure 4b). |
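The actor structure listed in the table (3 inputs, hidden layers of 10 and 6 ReLU units, 3 tanh outputs scaled to gain ranges) can be sketched as a plain forward pass. The weights below are random and untrained, and the gain ranges and the input state are hypothetical placeholders, so the sketch only illustrates how the tanh output is mapped into a bounded gain interval.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (untrained, for illustration only)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, weights, biases):
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]


def relu(values):
    return [max(0.0, v) for v in values]


def actor_forward(state, layers, gain_min, gain_max):
    """3 -> 10 (ReLU) -> 6 (ReLU) -> 3 (tanh), then scale to [gain_min, gain_max]."""
    h = relu(dense(state, *layers[0]))
    h = relu(dense(h, *layers[1]))
    raw = [math.tanh(v) for v in dense(h, *layers[2])]
    # Map each tanh output from [-1, 1] onto its own gain interval.
    return [lo + 0.5 * (r + 1.0) * (hi - lo)
            for r, lo, hi in zip(raw, gain_min, gain_max)]


if __name__ == "__main__":
    rng = random.Random(0)
    layers = [init_layer(3, 10, rng), init_layer(10, 6, rng), init_layer(6, 3, rng)]
    # Hypothetical gain ranges and input state.
    print(actor_forward([0.1, -0.2, 0.05], layers, [0.0, 0.0, 0.0], [20.0, 10.0, 2.0]))
```

Because tanh is bounded, the gains can never leave the chosen intervals regardless of the network weights, which is one way to keep exploration within a safe region.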
Algorithm | Population (P) | Iterations (T) | Dimension (d) | Algorithm-Specific Settings | Selected Gains |
---|---|---|---|---|---|
GWO | 30 wolves | 100 | 3 | Coefficient vectors | (10.2, 5.1, 0.8) |
ALO | 30 ants | 100 | 3 | Roulette wheel selection | (11.4, 5.6, 0.9) |
PSO | 30 particles | 100 | 3 | Inertia , | (9.8, 4.9, 0.7) |
GA | 30 chromosomes | 100 | 3 | Crossover = 0.8, mutation = 0.1 | (12.0, 6.2, 1.0) |
Algorithm | Convergence Iteration | Final IAE | Std. Deviation | Param. Sensitivity | CPU Time (s) | Final Fitness |
---|---|---|---|---|---|---|
GWO | 52 | 1.62 | 0.007 | 1.5% | 17.4 | 0.125 |
ALO | 57 | 1.73 | 0.009 | 1.8% | 18.6 | 0.138 |
PSO | 74 | 2.38 | 0.026 | 3.7% | 19.8 | 0.196 |
GA | 85 | 2.61 | 0.033 | 4.2% | 31.4 | 0.209 |
Controller | Online Effort | Memory | Sampling Time (µs) | Exec. Time (µs) | Delay (µs) |
---|---|---|---|---|---|
ABSC | Few algebraic ops. (fixed) | Low | 100 | 10 | 3 |
RL–PID | Error calc., sum/diff, clamp | Low | 100 | 5 | 2 |
RL–ABSC | Actor NN + ABSC algebra | Moderate | 100 | 25 | 5 |
Controller | Case | Rise Time (ms) | Overshoot (%) | Settling Time (ms) | RMSE (V) | IAE |
---|---|---|---|---|---|---|
ABSC | Buck | 6.5 | 3.8 | 12.0 | 0.42 | 5.1 |
ABSC | Boost | 7.2 | 4.1 | 13.5 | 0.47 | 5.6 |
RL–PID | Buck | 4.2 | 6.7 | 10.5 | 0.36 | 4.3 |
RL–PID | Boost | 4.8 | 7.2 | 11.0 | 0.41 | 4.9 |
Proposed RL–ABSC | Buck | 3.1 | 2.1 | 7.8 | 0.25 | 3.2 |
Proposed RL–ABSC | Boost | 3.6 | 2.5 | 8.4 | 0.29 | 3.5 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Ghamari, S.M.; Aziz, A.; Ghahramani, M. Design of Robust Adaptive Nonlinear Backstepping Controller Enhanced by Deep Deterministic Policy Gradient Algorithm for Efficient Power Converter Regulation. Energies 2025, 18, 4941. https://doi.org/10.3390/en18184941