Article

Hybrid PSO–Reinforcement Learning-Based Adaptive Virtual Inertia Control for Frequency Stability in Multi-Microgrid PV Systems

by Akeem Babatunde Akinwola * and Abdulaziz Alkuhayli
Electrical Engineering Department, College of Engineering, King Saud University, Riyadh 11421, Saudi Arabia
* Author to whom correspondence should be addressed.
Electronics 2025, 14(17), 3349; https://doi.org/10.3390/electronics14173349
Submission received: 21 July 2025 / Revised: 12 August 2025 / Accepted: 18 August 2025 / Published: 22 August 2025

Abstract

The increasing integration of renewable energy sources, particularly photovoltaic (PV) systems, into power grids presents challenges in maintaining frequency stability due to the absence of traditional mechanical inertia. This paper proposes a hybrid control strategy combining Particle Swarm Optimization (PSO) and Reinforcement Learning (RL) to provide Adaptive Virtual Inertia Control for frequency stability in multi-microgrid PV systems. The proposed system dynamically adjusts virtual inertia and damping parameters in response to real-time grid conditions and frequency deviations. The PSO algorithm optimizes the base inertia and damping parameters offline, while the RL algorithm fine-tunes these parameters online by learning from the system’s performance. The adaptive control mechanism effectively mitigates frequency fluctuations and enhances grid synchronization, ensuring stable operation even under varying power generation and load conditions. The hybrid PSO–RL controller demonstrates a superior performance, maintaining a frequency close to nominal (50.02 Hz), with the fastest settling time (0.10 s), minimal RoCoF (0.2 Hz/s), and effectively zero steady-state error. Simulation results demonstrate the effectiveness of the hybrid control approach, showing fast and accurate frequency regulation with minimal power quality degradation. The system’s ability to adapt in real time provides a promising solution for next-generation smart grids that rely on renewable energy sources.

1. Introduction

The global push toward cleaner, decentralized power systems has led to the widespread adoption of inverter-based renewable energy sources, particularly photovoltaic (PV) systems. Unlike synchronous machines, PV inverters do not naturally provide rotational inertia, which traditionally played a critical role in stabilizing system frequency during disturbances. This growing reduction in system inertia has created new challenges in maintaining frequency stability and dynamic resilience, especially in isolated microgrids and low-inertia environments where even small power imbalances can lead to significant frequency deviations.
To mitigate these risks, virtual inertia control strategies have emerged as promising tools that enable inverter-based resources to emulate the inertial response of conventional generators. These methods enhance the system’s ability to resist sudden frequency deviations by injecting synthetic inertia through a fast inverter response. However, conventional virtual inertia schemes are typically implemented using fixed-parameter controllers. While these static methods are simple and easy to deploy, they lack the adaptability needed for modern, variable renewable energy environments [1].
To improve performance, researchers have introduced optimization techniques such as Particle Swarm Optimization (PSO) to tune the virtual inertia and damping coefficients for improved frequency response and transient stability [2,3]. Although PSO offers a strong global search capability and helps determine optimal controller settings, it is often applied as an offline process, which limits its effectiveness under rapidly changing grid conditions. Once set, the parameters remain fixed and may not be optimal in the face of load variations, changes in irradiance, or other disturbances.
Meanwhile, Reinforcement Learning (RL) has gained attention as a dynamic control method capable of real-time adaptation. RL algorithms allow controllers to learn optimal actions from continuous system interactions, adapting to varying grid conditions without requiring an explicit model [4]. Despite this advantage, RL approaches alone may suffer from lengthy training times, poor initial performance, and unstable convergence in complex or non-linear environments.
Recently, hybrid strategies that combine the strengths of both methods have gained interest. These hybrid control frameworks use metaheuristic optimizers like PSO to initialize controller parameters and then apply RL to refine these parameters dynamically during operation. This approach combines the global search advantage of PSO with the online learning and adaptation capabilities of RL. Such strategies have been shown to enhance the performance of frequency regulation schemes in PV-based microgrids, particularly in improving resilience and reducing steady-state error [5,6,7]. For instance, recent studies [6,7] have introduced hybrid inertia control methods that adaptively regulate frequency support in islanded or low-inertia systems under realistic dynamic conditions. The authors of [8] present ST-CALNet, a hybrid deep learning framework that combines a concentrated Long Short-Term Memory (LSTM) structure with convolutional neural networks (CNNs) to improve forecasting in smart grids with renewable energy integration. The LSTM module models temporal correlations in past load patterns, while the CNN component captures spatial relationships among heterogeneous inputs, including generation data and meteorological variables.
Fractional-order PID (FOPID) controllers have been widely applied to islanded microgrids. The frequency of an islanded microgrid was controlled using a FOPID controller in [9]. In [10], a FOPID controller whose settings were adjusted by a neural network was used to enhance virtual inertia control (VIC) performance in an islanded microgrid. FOPID controllers were also used to improve frequency stability in islanded microgrids in [11], [12], and [13], with their parameters tuned using the SWA, the SCA, and the HSA, respectively. The results in [9,10,11,12,13] show that the FOPID controller enhances the frequency stability of the islanded microgrid. However, this kind of controller also has drawbacks. Its fractional parameters add complexity and can directly affect the frequency stability of the islanded microgrid, so it is crucial to set them correctly. Compared to single PID and FOPID controllers, cascaded controllers that combine PID and FOPID components offer better performance, react quickly to system changes, and can control complex systems such as islanded microgrids [14,15,16]. Cascaded controllers have been used to enhance frequency control in both power systems and islanded microgrids [14,15,16]. The FOPI-FOPD cascaded controller was used in [16] to improve frequency control in power systems, with its parameters refined using the DSA. The PI-TID cascaded controller was employed in [15] to enhance frequency control in islanded microgrids, with its settings adjusted using the chaotic BOT. The PI-FOPID cascaded controller was used in [16] to improve frequency regulation in islanded microgrids, with its parameters optimized using the GTO. The results in [14,15,16] show that the cascaded controller can maintain the frequency stability of the islanded microgrid, is highly resilient to disturbances, and remains effective in the presence of uncertainties.
This paper proposes a Hybrid PSO–Reinforcement Learning-Based Adaptive Virtual Inertia Control (AVIC) strategy designed for PV-based islanded microgrids. The proposed approach utilizes PSO to determine initial virtual inertia and damping coefficients, providing a well-tuned starting point. A Reinforcement Learning agent then adapts these parameters in real time in response to frequency deviations, allowing the controller to respond dynamically to operational conditions. This hybrid structure enables a robust and adaptive solution that outperforms conventional fixed-gain or standalone optimization-based approaches.
Simulation studies are conducted in MATLAB R2024a/Simulink to evaluate the effectiveness of the proposed hybrid control scheme. The performance is assessed under typical operating scenarios using metrics such as frequency deviation, settling time, and rate of change of frequency (RoCoF). Results demonstrate that the hybrid PSO–RL controller significantly enhances frequency stability and dynamic response compared to non-adaptive and single-layer control schemes.
Compared to previous work, this research offers the following significant advances:
  • Integrated Hybrid Control Framework: This is one of the first studies to integrate PSO and RL into a hybrid control framework for virtual inertia, enabling both global optimization and local real-time adaptation.
  • Enhanced Robustness and Adaptivity: By combining the strengths of both techniques, this controller adapts more effectively to rapidly changing grid dynamics, ensuring enhanced frequency stability even under severe disturbances or high renewable penetration scenarios.
  • Practical Implementation: The proposed method is implemented and validated in MATLAB through real-time gate signal generation for inverter switches, demonstrating practical viability for embedded system applications in smart grid environments.
By bridging optimization and adaptive learning, this method provides a comprehensive and scalable solution to frequency stabilization challenges in modern low-inertia power systems. It addresses the shortcomings of existing approaches by offering both high performance and adaptability, making a valuable contribution to the field of intelligent energy management and smart grid control systems. The conceptual structure of the proposed hybrid control system is illustrated in Figure 1, which outlines the coordination between the PSO for global optimization and RL for real-time adaptive control.
What distinguishes this hybrid PSO-RL implementation in energy systems is its real-time Reinforcement Learning (RL) adaptation mechanism built upon PSO-initialized parameters. Unlike previous studies that apply PSO and RL in isolation or in offline stages, our approach uses Particle Swarm Optimization to pre-optimize the initial policy network weights, learning rates, and exploration parameters of the RL agent. This ensures that the Reinforcement Learning process begins from a near-optimal configuration, drastically reducing convergence time and avoiding inefficient exploration during early operation.
In the dynamic environment of energy systems, where load profiles, renewable generation, and grid constraints fluctuate in real-time, this integrated framework allows the RL agent to continuously adapt operational strategies, such as energy dispatch, load shifting, or demand response, with high responsiveness and precision. The hybrid approach combines PSO’s global search capabilities (for initial parameter tuning) with the adaptive, model-free control of RL, achieving both faster learning and higher resilience to non-stationary operating conditions. This results in a more reliable and economically optimized energy system performance compared to conventional PSO-only, RL-only, or sequential PSO-RL strategies.

2. Methodology

The model is designed to improve frequency stability in a distributed PV-based microgrid environment through intelligent virtual inertia control. The system consists of four individual PV arrays, each connected to its own DC/DC converter. These converters play a crucial role in extracting the maximum available power from the solar panels by implementing Maximum Power Point Tracking (MPPT) algorithms. MPPT dynamically adjusts the duty cycle of the converter to ensure optimal power extraction under varying environmental conditions such as solar irradiance and temperature [17]. The output of the PV modules is unregulated DC, which is then stabilized through the converter to provide a steady DC voltage required by downstream systems.
Following the DC/DC conversion, the output is passed through a DC-link capacitor, which serves as an energy buffer and voltage stabilizer. This capacitor reduces ripple in the voltage and ensures a consistent DC supply to the IGBTs and diodes [18]. The DC-link voltages (Vdc, Vdc1, Vdc2, and Vdc3) are continuously monitored and fed into the central controller. These voltages provide insight into the power availability and converter health, and they are used as part of the control logic in the inertia emulation algorithm.
Each PV unit is connected to an IGBT/Diodes module, which is responsible for converting DC power into three-phase AC and injecting it into the local grid [19]. The IGBT/Diodes work by switching the DC voltage to an AC waveform while managing both active and reactive power flow [20]. They also regulate the voltage at the Point of Common Coupling (PCC) and support frequency control. Gate signals from the Hybrid PSO–Reinforcement Learning (RL) Controller modulate the IGBT/Diodes operation to implement dynamic power control and inertia emulation.
At the heart of the system lies the Hybrid PSO–Reinforcement Learning (RL) Controller, which dynamically provides synthetic inertia by emulating the inertial response of traditional synchronous generators. The controller takes multiple inputs, including real-time DC-link voltages, measured grid frequency, and power parameters. It uses Particle Swarm Optimization (PSO) to optimize the initial controller parameters and employs Reinforcement Learning to adapt to changes in grid dynamics during real-time operation. The output of this controller consists of four gate signals (gate1 to gate4) that drive the IGBT/Diodes to inject or absorb power in response to frequency deviations, thereby enhancing system stability.
The AC output from each IGBT/Diodes is interfaced with the grid using a distribution transformer rated at 400 kVA and 260 V/25 kV. This voltage is stepped up again via a 25 kV/132 kV transformer for integration into the transmission grid. A 100 kW load is modeled at the 25 kV level to simulate real-time consumption and test the control system’s ability to maintain frequency under varying load conditions. This subsystem also includes passive elements such as filters and impedance lines to emulate realistic grid interfacing [21]. The detailed architecture of the proposed Hybrid PSO–Reinforcement Learning (RL) Controller is presented in Figure 2. The framework begins with system inputs, including nominal parameters, simulated voltages, and fixed currents, which are used to compute instantaneous power and the corresponding frequency deviation. Based on this deviation, the inertia and damping coefficients are dynamically adjusted through the Hybrid PSO–RL mechanism, where PSO performs global parameter optimization while RL continuously updates the control policy in real-time. The adapted control signals are processed to generate PWM reference signals, which are then compared with a carrier waveform to produce logical gate pulses. These pulses are then translated into complete gate signals for the IGBT modules of the voltage source converters (VSCs). This closed-loop structure ensures effective parameter adaptation, robust dynamic response, and reliable frequency stability under varying load and renewable energy conditions.
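The comparison of a PWM reference with a carrier waveform can be illustrated with a minimal sketch. It is not the Simulink implementation used in this work: the triangular carrier shape, the 2 kHz carrier frequency, the modulation index `m`, and the function names are all assumptions chosen for illustration.

```python
import math

def triangle_carrier(t, f_carrier):
    """Symmetric triangular carrier in [-1, 1] at f_carrier (Hz)."""
    phase = (t * f_carrier) % 1.0
    return 4.0 * abs(phase - 0.5) - 1.0

def gate_pulses(t, f_ref=50.0, f_carrier=2000.0, m=0.8):
    """Upper-switch gate states (True = on) for phases a, b, c at time t.

    Each sinusoidal reference (amplitude m, hypothetical) is compared with
    the carrier; the switch is on while the reference exceeds the carrier.
    """
    gates = []
    for k in range(3):
        angle = 2.0 * math.pi * f_ref * t - k * 2.0 * math.pi / 3.0
        reference = m * math.sin(angle)
        gates.append(reference > triangle_carrier(t, f_carrier))
    return gates

# At the carrier peak no reference exceeds it; at the carrier trough all do.
print(gate_pulses(0.0))        # carrier = +1
print(gate_pulses(0.00025))    # carrier = -1
```

In a complete converter model, the lower switch of each leg would receive the complementary pulse, usually with a dead time that this sketch omits.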
To validate the proposed controller, two simulation cases are considered. Case A evaluates independent PV–IGBT operation with decentralized inertia emulation, while Case B analyzes a shared IGBT configuration with centralized control. These scenarios enable a comprehensive assessment of the controller’s performance under different interfacing schemes, providing insights into distributed versus centralized inertia emulation strategies within PV-based multi-microgrid systems.

2.1. Case A—Independent PV-IGBT Modules Connected to the Same Node

In this configuration, each PV module is equipped with a dedicated IGBT-based converter, and all four modules are connected directly to a common grid node. The Hybrid PSO–RL controller independently modulates each IGBT, enabling decentralized operation and distributed inertia emulation. This setup allows each PV source to contribute inertia individually, enhancing modularity and resilience.

2.2. Case B—Four PV Modules Sharing One IGBT Before Node Connection

In the alternative configuration, all four PV modules are aggregated at the DC bus and fed into a single IGBT converter connected to the grid node. Here, the Hybrid PSO–RL controller performs centralized modulation of the combined PV output, reducing hardware requirements while providing coordinated inertia emulation.
The overall Simulink model of the PV-based multi-microgrid system is illustrated in Figure 3. It integrates PV arrays, energy storage, MPPT, and the proposed Hybrid PSO–RL Controller in a grid-connected environment. Two configurations are supported: Case A, where each PV module is interfaced with an independent IGBT for decentralized inertia emulation, and Case B, where all PV modules share a single IGBT for centralized control. These two cases enable a comprehensive comparison between decentralized and centralized inertia emulation strategies, providing insights into trade-offs between scalability, hardware efficiency, and control flexibility.

3. Mathematical Modeling

The frequency regulation in an inverter-dominated system relies on virtual inertia and damping parameters that adjust in response to power fluctuations or disturbances [22].

3.1. Frequency Deviation Model

The frequency deviation $\Delta f(t)$ is a key indicator of system stability. It is defined as the difference between the instantaneous frequency $f(t)$ and the nominal frequency $f_{\text{nom}}$:
$$\Delta f(t) = f(t) - f_{\text{nom}}$$
where
$f(t)$ is the instantaneous grid frequency.
$f_{\text{nom}}$ is the nominal grid frequency (typically 50 Hz or 60 Hz).
When power generation and demand are imbalanced, the grid frequency will deviate from the nominal value.

3.2. Power Generation and Frequency Relationship

The instantaneous frequency deviation is related to the change in power output $\Delta P(t)$ through the following inertia equation:
$$\Delta f(t) = \frac{\Delta P(t)}{2 H f_{\text{nom}}}$$
where
$\Delta P(t)$ is the change in power output (difference between generated power and load demand);
$H$ is the system inertia constant (in seconds);
$f_{\text{nom}}$ is the nominal frequency.
The system’s inertia constant H represents the capability of the system to resist changes in the frequency. Lower inertia can result in faster frequency fluctuations.

3.3. Virtual Inertia Control

To stabilize the frequency and mimic the inertial response of traditional synchronous generators, virtual inertia is introduced through the control strategy. This synthetic inertia $J_{\text{adapt}}(t)$ adapts in real time based on the system’s frequency deviation:
$$J_{\text{adapt}}(t) = J_{\text{base}}\left(1 + \alpha\,\Delta f(t)\right)$$
where
$J_{\text{base}}$ is the base inertia constant;
$\alpha$ is the adaptation gain;
$\Delta f(t)$ is the frequency deviation.
Virtual inertia provides resistance to rapid frequency changes by injecting or absorbing power in proportion to the rate of change of frequency.

3.4. Damping Control

In addition to virtual inertia, damping is crucial for stabilizing the system by counteracting any oscillations. The damping term $D_{\text{adapt}}(t)$ is similarly adjusted based on the frequency deviation:
$$D_{\text{adapt}}(t) = D_{\text{base}}\left(1 + \alpha\,\Delta f(t)\right)$$
where
$D_{\text{base}}$ is the base damping coefficient;
$\alpha$ is the adaptation gain.
Damping works by reducing the amplitude of oscillations, helping to restore the system to equilibrium.

3.5. Power Injection Based on Frequency Deviation

To emulate the inertial response of synchronous generators, the active power $P(t)$ injected into the grid is adjusted based on the frequency deviation:
$$P(t) = J_{\text{adapt}}(t)\,\frac{d\,\Delta f(t)}{dt} + D_{\text{adapt}}(t)\,\Delta f(t)$$
This equation combines a term proportional to the rate of change of frequency ($d\Delta f/dt$) with a term proportional to the instantaneous frequency deviation. The controller adjusts $P(t)$ dynamically to stabilize the grid frequency.
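As a numerical illustration of the adaptive inertia, adaptive damping, and power injection relations, the following minimal sketch evaluates them for one operating point. The function names and all numeric values are hypothetical and are not taken from the simulation parameters of this study; the expressions are used as written, with the signed deviation and without any additional sign convention.

```python
def adaptive_gains(delta_f, j_base, d_base, alpha):
    """Scale the base inertia and damping by (1 + alpha * delta_f)."""
    scale = 1.0 + alpha * delta_f
    return j_base * scale, d_base * scale

def injected_power(delta_f, d_delta_f_dt, j_base, d_base, alpha):
    """P = J_adapt * d(delta_f)/dt + D_adapt * delta_f."""
    j_adapt, d_adapt = adaptive_gains(delta_f, j_base, d_base, alpha)
    return j_adapt * d_delta_f_dt + d_adapt * delta_f

# Hypothetical operating point: +0.1 Hz deviation rising at 0.5 Hz/s.
j_adapt, d_adapt = adaptive_gains(0.1, j_base=2.0, d_base=10.0, alpha=0.5)
power = injected_power(0.1, 0.5, j_base=2.0, d_base=10.0, alpha=0.5)
print(j_adapt, d_adapt, power)
```

Because the scaling factor uses the signed deviation, a negative $\Delta f$ reduces both gains in this form; an implementation that should stiffen the response for deviations of either sign would need a different scaling, which is outside what the equations above state.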

3.6. Reinforcement Learning Adjustment

To optimize the adaptation parameters $J_{\text{adapt}}(t)$ and $D_{\text{adapt}}(t)$, a Reinforcement Learning (RL) algorithm can be employed. The RL algorithm updates the adaptation gain $\alpha$ based on real-time feedback:
State $s_t = \Delta f(t)$, the current frequency deviation;
Action $a_t = \alpha$, the change in adaptation gain;
Reward $r_t = -|\Delta f(t)|$, where the reward is more negative for larger deviations from the nominal frequency.
The RL agent updates the control policy using the following:
$$Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \beta\left[r_t + \gamma \max_{a} Q(s_{t+1}, a) - Q(s_t, a_t)\right]$$
where
$Q(s_t, a_t)$ is the action value function;
$\beta$ is the learning rate;
$\gamma$ is the discount factor.
The RL process allows the system to adapt to changing grid conditions and optimize performance over time.
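The action value update can be sketched in tabular form. This is only an illustration under stated assumptions: the frequency deviation is discretized into bins, the candidate values of the adaptation gain form a small discrete set, and the bin edges, action values, learning rate, and discount factor shown here are hypothetical rather than the settings used in this study.

```python
import random

ACTIONS = [0.0, 0.5, 1.0]           # hypothetical candidate adaptation gains
BIN_EDGES = [-0.2, -0.05, 0.05, 0.2]  # hypothetical delta_f bin edges (Hz)

def state_index(delta_f):
    """Map a frequency deviation to a discrete state index (0..4)."""
    return sum(delta_f > edge for edge in BIN_EDGES)

def choose_action(q, s, epsilon=0.1, rng=random):
    """Epsilon-greedy selection over the action indices."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    return max(range(len(ACTIONS)), key=lambda a: q[s][a])

def q_update(q, s, a, reward, s_next, beta=0.1, gamma=0.9):
    """One Q-learning step: move Q(s, a) toward the TD target."""
    td_target = reward + gamma * max(q[s_next])
    q[s][a] += beta * (td_target - q[s][a])
    return q[s][a]

# One hypothetical transition: deviation shrinks from 0.1 Hz to 0.0 Hz.
q_table = [[0.0] * len(ACTIONS) for _ in range(len(BIN_EDGES) + 1)]
s, s_next = state_index(0.1), state_index(0.0)
a = choose_action(q_table, s, epsilon=0.0)
reward = -abs(0.1)                  # r_t = -|delta_f|
print(q_update(q_table, s, a, reward, s_next))
```

A continuous-state agent, for example one based on function approximation, would replace the table, but the temporal-difference target has the same structure.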

3.7. Hybrid Interaction

  • PSO: Performs offline tuning of the initial controller gains $\alpha$, $J_{\text{base}}$, and $D_{\text{base}}$ by minimizing a cost function such as the integral of frequency error (IFE).
  • RL: Performs online adjustment in real time to adapt to changing system conditions by maximizing the cumulative reward.
The hybrid controller thus enables both global optimality (via PSO) and local adaptability (via RL), ensuring a robust performance in dynamic microgrid environments.
The interaction between Particle Swarm Optimization (PSO) and Reinforcement Learning (RL) in a hybrid control system combines the strengths of both offline and online learning paradigms for enhanced system performance and adaptability. PSO serves as an offline optimization technique, where optimal or near-optimal parameters (e.g., controller gains, inertia constants) are pre-computed by simulating multiple scenarios and evaluating a fitness function. This reduces the search space and provides a solid starting point for real-time operations. On the other hand, RL operates online, learning from real-time system interactions and environmental feedback. While PSO provides an optimized initialization, the RL adapts these parameters dynamically as the system encounters unforeseen disturbances or time-varying conditions. The RL agent continuously refines control actions based on a reward signal, ensuring the system remains stable and responsive under non-linear and stochastic scenarios.
This interaction creates a synergistic loop: PSO accelerates RL convergence by supplying good initial policies, and the RL ensures robustness by adapting beyond the PSO’s fixed optimization boundaries. In frequency stability control of PV-based microgrids, for example, PSO can tune inertia gains offline, while RL adjusts them online in response to load fluctuations or renewable intermittency, ensuring a seamless performance and reduced frequency deviation in real-time operations.
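The offline stage can be sketched as a small constriction-type PSO that tunes an inertia gain and a damping gain against an integral-of-error cost. Everything specific here is an assumption for illustration: the first-order toy plant, the step disturbance, the search bounds, and the coefficient values (0.7298, 2.05, 2.05, a commonly used constriction setting) are not the model or the settings of this study.

```python
import random

def integral_abs_error(j, d, step=0.1, dt=0.001, t_end=1.0):
    """Integral of |delta_f| for the toy plant j * d(df)/dt = -step - d * df."""
    df, cost = 0.0, 0.0
    for _ in range(int(t_end / dt)):
        df += dt * (-step - d * df) / j   # forward-Euler integration
        cost += abs(df) * dt
    return cost

def pso(cost, bounds, n_particles=12, n_iter=40, seed=0):
    """Minimize cost(*position) inside box bounds; returns (best_pos, best_cost)."""
    rng = random.Random(seed)
    chi, c1, c2 = 0.7298, 2.05, 2.05
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * len(bounds) for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_cost = [cost(*p) for p in pos]
    g = min(range(n_particles), key=best_cost.__getitem__)
    g_pos, g_cost = best_pos[g][:], best_cost[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for k, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][k] = chi * (vel[i][k]
                                   + c1 * r1 * (best_pos[i][k] - pos[i][k])
                                   + c2 * r2 * (g_pos[k] - pos[i][k]))
                # Clamp to the search box so candidates stay feasible.
                pos[i][k] = min(max(pos[i][k] + vel[i][k], lo), hi)
            c = cost(*pos[i])
            if c < best_cost[i]:
                best_pos[i], best_cost[i] = pos[i][:], c
                if c < g_cost:
                    g_pos, g_cost = pos[i][:], c
    return g_pos, g_cost

bounds = [(0.1, 5.0), (1.0, 50.0)]      # hypothetical ranges for J_base, D_base
(j_base, d_base), best = pso(integral_abs_error, bounds)
print(j_base, d_base, best)
```

The returned gains would then serve as the starting point that the online agent adjusts; in this toy cost nothing penalizes control effort, so the search tends toward the upper bounds, which a realistic cost function would counteract.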
However, the effectiveness of PSO-RL control is highly dependent on the physical configuration of the PV generation units and their interface with the grid.
This study investigates and compares two configurations:
(i) Four PV-IGBT modules connected individually to the same node.
(ii) Four PV modules connected to a single shared IGBT, which is then interfaced with the node.
The analysis focuses on how each configuration affects control dynamics, power quality, and response to disturbances under a hybrid PSO-RL controller.
As shown in Table 1, the PV-based islanded microgrid parameters were selected to emulate realistic operating conditions of a medium-scale renewable energy system, ensuring the model captures essential dynamics for frequency stability analysis.
The PSO tuning parameters in Table 2 were optimized to achieve a rapid convergence while avoiding premature stagnation, thereby providing robust initial controller settings before online adaptation.
As presented in Table 3, the RL training parameters were chosen to balance adaptability with stability, enabling the effective real-time adjustment of controller gains without compromising closed-loop performance.

3.8. Theoretical Convergence and Stability of the PSO–RL AVIC

The closed-loop frequency dynamics of the islanded microgrid with the proposed Adaptive Virtual Inertia Control (AVIC) can be represented in state-space form as
$$\dot{x} = A(\theta)\,x + B(\theta)\,w$$
where
  • $x = [\Delta f, \Delta\dot{f}]^{T}$ is the state vector, with $\Delta f$ denoting the frequency deviation and $\Delta\dot{f}$ its time derivative.
  • $\theta = [M_v, D_v]^{T}$ is the parameter vector containing the virtual inertia gain $M_v$ and virtual damping gain $D_v$ of the AVIC.
  • $w$ is the bounded disturbance input, representing load variations or renewable generation fluctuations.
  • $A(\theta) \in \mathbb{R}^{2 \times 2}$ is the closed-loop state matrix determined by $\theta$, which is Hurwitz in the stabilizing set $\Theta$.
  • $B(\theta) \in \mathbb{R}^{2 \times 1}$ is the disturbance input matrix mapping $w$ to the state derivatives.
If $A(\theta)$ is Hurwitz, there exists a symmetric positive-definite matrix $P$ satisfying the Lyapunov equation [23]:
$$A^{T}(\theta)\,P + P\,A(\theta) = -Q, \quad Q = Q^{T} > 0$$
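The existence of a positive-definite solution $P$ for a Hurwitz state matrix can be checked numerically. The sketch below assumes a second-order form $M_v \Delta\ddot{f} + D_v \Delta\dot{f} + k\,\Delta f = w$ for the closed loop, which is a hypothetical structure chosen for illustration (the text does not give the entries of $A(\theta)$), and solves the Lyapunov equation through its Kronecker-product linear system using NumPy.

```python
import numpy as np

def closed_loop_matrix(m_v, d_v, k=1.0):
    """Assumed A(theta) for M_v * f'' + D_v * f' + k * f = w (hypothetical)."""
    return np.array([[0.0, 1.0], [-k / m_v, -d_v / m_v]])

def is_hurwitz(a):
    """True if every eigenvalue has a strictly negative real part."""
    return bool(np.all(np.linalg.eigvals(a).real < 0.0))

def solve_lyapunov(a, q):
    """Solve A^T P + P A = -Q by vectorizing with Kronecker products."""
    n = a.shape[0]
    eye = np.eye(n)
    lhs = np.kron(eye, a.T) + np.kron(a.T, eye)
    p = np.linalg.solve(lhs, -q.reshape(-1)).reshape(n, n)
    return 0.5 * (p + p.T)   # symmetrize against round-off

a = closed_loop_matrix(m_v=2.0, d_v=3.0)   # hypothetical gains
p = solve_lyapunov(a, np.eye(2))
print(is_hurwitz(a), np.linalg.eigvalsh(p))
```

For the gains shown, the eigenvalues of the assumed $A(\theta)$ are $-0.5$ and $-1$, so the matrix is Hurwitz and the computed $P$ should have strictly positive eigenvalues.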

3.8.1. PSO Stage (Offline)

The Particle Swarm Optimization (PSO) stage searches for optimal parameters $\theta_0$ within the compact stabilizing set $\Theta$, ensuring that $A(\theta)$ remains Hurwitz for all $\theta \in \Theta$. The use of constriction-factor PSO guarantees bounded particle velocities and positions, with convergence to a stationary point in the feasible space [24]. The optimized $\theta_0$ serves as a stabilizing initial point for the online RL stage.

3.8.2. RL Stage (Online)

The Reinforcement Learning (RL) stage updates the AVIC parameters online according to
$$\theta_{k+1} = \Pi_{\Theta}\left(\theta_k - \eta_k\, g_k\right)$$
where $g_k$ is the estimated policy gradient, $\eta_k > 0$ is the step size, and $\Pi_{\Theta}(\cdot)$ is the projection operator onto the stabilizing set $\Theta$. Projection ensures that the parameters remain within the stabilizing set, a standard adaptive control technique used to maintain closed-loop stability [25].
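A projected parameter update of this kind can be sketched as follows, assuming for illustration that the stabilizing set is approximated by a box of admissible inertia and damping gains. The box limits, the gradient values, and the function names are hypothetical; the actual stabilizing set need not be a box.

```python
def project_to_box(theta, lower, upper):
    """Clip each parameter to its admissible interval."""
    return [min(max(t, lo), hi) for t, lo, hi in zip(theta, lower, upper)]

def projected_step(theta, grad, step, lower, upper):
    """Gradient step theta - step * grad, followed by projection."""
    raw = [t - step * g for t, g in zip(theta, grad)]
    return project_to_box(raw, lower, upper)

theta = [2.0, 10.0]                  # hypothetical [M_v, D_v]
grad = [50.0, -1.0]                  # hypothetical gradient estimate
new_theta = projected_step(theta, grad, step=0.1,
                           lower=[0.5, 1.0], upper=[5.0, 40.0])
print(new_theta)   # first entry is clipped to the lower limit 0.5
```

The first component would leave the admissible range after the raw step and is pulled back to the boundary, which is the mechanism that keeps the adapted gains inside the set where the closed loop is stable.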

3.8.3. Lyapunov–ISS Stability Analysis

Consider the Lyapunov function
$$V(x, \tilde{\theta}) = x^{T} P x + \frac{1}{2\gamma}\,\tilde{\theta}^{T}\tilde{\theta}$$
where $\tilde{\theta} = \theta - \theta^{*}$ is the parameter error and $\theta^{*} \in \Theta$ is the equilibrium parameter vector. Its time derivative satisfies
$$\dot{V} \le -\alpha \|x\|^{2} + C_1 \|x\|\,\|w\| + C_2 \|w\|^{2}$$
for some constants $\alpha > 0$, $C_1 > 0$, $C_2 > 0$.
From this inequality, the system is input-to-state stable (ISS) [26]:
  • If $w \equiv 0$, the equilibrium is globally exponentially stable.
  • If $w$ is bounded, the state remains bounded, with $\limsup_{t \to \infty} \|x(t)\| \propto \|w\|_{\infty}$.
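The step from the derivative bound to a disturbance-proportional ultimate bound can be made explicit. The derivation below assumes the bound has the form $\dot{V} \le -\alpha\|x\|^{2} + C_1\|x\|\|w\| + C_2\|w\|^{2}$ and applies Young's inequality to the cross term; it is a sketch of the standard argument, not a reproduction of the authors' proof.

```latex
% Young's inequality on the cross term:
C_1 \|x\|\,\|w\| \;\le\; \frac{\alpha}{2}\|x\|^{2} + \frac{C_1^{2}}{2\alpha}\|w\|^{2}

% Substituting into the derivative bound:
\dot{V} \;\le\; -\frac{\alpha}{2}\|x\|^{2}
          + \left(\frac{C_1^{2}}{2\alpha} + C_2\right)\|w\|^{2}

% The right-hand side is negative whenever
\|x\| \;>\; \frac{\sqrt{C_1^{2} + 2\alpha C_2}}{\alpha}\,\|w\|
```

The threshold on $\|x\|$ scales linearly with $\|w\|$, which is why the ultimate bound on the state grows in proportion to the disturbance magnitude and vanishes when $w \equiv 0$.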

3.8.4. Convergence Remarks

  • PSO Convergence: Guaranteed under constriction factor theory for bounded feasible sets [26].
  • RL Convergence: Ensured under diminishing step sizes satisfying $\sum_k \eta_k = \infty$ and $\sum_k \eta_k^{2} < \infty$, with bounded and asymptotically unbiased gradient estimates [27,28].
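As a concrete instance of a diminishing step-size sequence that meets both summability conditions, consider the harmonic schedule below, where $\eta_0 > 0$ is a hypothetical initial step size. This example is added for illustration and is not stated to be the schedule used in the simulations.

```latex
\eta_k = \frac{\eta_0}{k+1}, \qquad k = 0, 1, 2, \dots

\sum_{k=0}^{\infty} \eta_k = \eta_0 \sum_{k=0}^{\infty} \frac{1}{k+1} = \infty,
\qquad
\sum_{k=0}^{\infty} \eta_k^{2} = \eta_0^{2} \sum_{k=0}^{\infty} \frac{1}{(k+1)^{2}}
  = \frac{\pi^{2}}{6}\,\eta_0^{2} < \infty
```

By contrast, a constant step size violates the second condition, and a schedule such as $\eta_k = \eta_0/(k+1)^{2}$ violates the first because its sum is finite.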

4. Results and Discussion

To assess the dynamic response of the proposed hybrid controller under realistic grid conditions, a disturbance scenario involving a 10% step increase in load was applied at t = 0.2 s. This sudden change was designed to test the system’s ability to preserve the voltage and frequency stability in the face of rapid demand fluctuations. The controller’s performance was benchmarked against a standard PSO-based controller and a non-adaptive controller over a 1 s simulation period, highlighting the comparative advantages of the hybrid PSO–RL scheme in terms of transient handling and steady-state regulation.
Figure 4 illustrates the performance of the proposed Hybrid PSO–Reinforcement Learning-Based Adaptive Virtual Inertia Control system in a multi-microgrid PV setup. The top subplot presents the three-phase voltage waveforms, which stabilize rapidly and maintain a consistent sinusoidal profile after the initial transient period, indicating successful voltage regulation and synchronization with the grid. The middle subplot shows the corresponding three-phase current waveforms, which exhibit a well-balanced sinusoidal form with minimal harmonic distortion after startup, confirming effective current injection and load sharing among the inverters. The bottom subplot displays the power quality in terms of total apparent power (PQ) in kVA. The system demonstrates a sharp initial response followed by smooth settling behavior, with PQ stabilizing around a constant value. These results confirm that the hybrid controller effectively mitigates transient disturbances and maintains high power quality, thereby contributing to frequency stability and dynamic system performance.
In Figure 5, the DC-link voltage waveforms ($V_{dc}$, $V_{dc1}$, $V_{dc2}$, and $V_{dc3}$) correspond to multiple inverter modules in the multi-microgrid PV system. Initially, all DC voltages exhibit a brief transient response with small oscillations due to startup dynamics and power balancing.
However, by approximately 0.2 s, each voltage stabilizes near the desired operating point of 500 V. The observed steady-state ripple is minimal, indicating the effectiveness of the MPPT-controlled DC-DC converters and the coordinated control strategy. The uniformity across all four voltage traces demonstrates the balanced operation of the distributed PV units and consistent energy transfer through the IGBT-based inverter interfaces. This stable DC voltage regulation is critical for ensuring high-quality AC output and seamless grid synchronization.
In Figure 6, two key performance characteristics of the photovoltaic (PV) system are illustrated: irradiance and mean power output over time. The top subplot shows a nearly constant irradiance level of approximately 1000 W/m², indicating stable solar input throughout the observation period. This stability ensures continuous exposure of the PV modules to consistent sunlight, providing a reliable source of energy. In contrast, the bottom subplot displays the mean power output ($P_{\text{mean}}$), which initially exhibits a sharp rise as the system stabilizes in response to the irradiance. After this transient phase, the output converges smoothly to a steady value of about 95–100 kW, with minor variations reflecting the comparative performance of the controllers. These results confirm the system’s ability to adapt and maintain steady power generation under stable irradiance, demonstrating both efficiency and robustness of the proposed control mechanism.
Figure 7 shows the frequency over time, with the frequency consistently remaining at 50 Hz throughout the observation period. This indicates that the system is well-regulated and stable, maintaining the nominal grid frequency without any significant fluctuations or disturbances. The absence of frequency variations suggests that the frequency control mechanism is effectively compensating for any potential deviations, ensuring a stable operation of the system, which is crucial for grid synchronization.
This stable behavior could be due to virtual inertia control or real-time frequency adjustment mechanisms within the power system, allowing it to respond to changes in power generation or demand without deviating from the nominal frequency.
Figure 8 illustrates the dynamic frequency response of the three control strategies under a disturbance or setpoint change. The non-adaptive controller (red dashed line) shows significant oscillations and slow damping, highlighting poor transient stability and weak adaptability to dynamic system changes. The standard PSO controller (blue dash–dot line) reduces the oscillations and converges faster, but still exhibits noticeable frequency fluctuations in the early transient phase.
In contrast, the Hybrid PSO-RL controller (solid green line) maintains a remarkably stable response throughout the simulation, effectively holding the frequency at the nominal value of 50 Hz with negligible deviation. This demonstrates superior adaptability and learning capability, as the hybrid controller dynamically tunes itself in real time based on system feedback. Overall, the Hybrid PSO-RL approach outperforms both traditional control schemes in terms of response speed, overshoot minimization, and frequency regulation precision.
As shown in Figure 8, the proposed controller significantly improves frequency stability under load variations. Table 4 provides a comparative evaluation of the frequency response performance for the three controllers. The non-adaptive controller shows the weakest performance, with a high peak frequency (54.6 Hz, i.e., 4.6 Hz above the 50 Hz nominal), a long settling time (0.75 s), a steep RoCoF (76.7 Hz/s), and a steady-state error of ±0.4 Hz. The standard PSO controller improves frequency regulation with a lower peak (51.7 Hz), a faster settling time (0.40 s), a moderate RoCoF (21.3 Hz/s), and a smaller steady-state error of ±0.1 Hz. The hybrid PSO–RL controller performs best, holding the peak frequency close to nominal (50.02 Hz), with the fastest settling time (0.10 s), minimal RoCoF (0.2 Hz/s), and effectively zero steady-state error. These results confirm the hybrid controller’s enhanced stability and adaptability under transient conditions in multi-microgrid PV systems. Figure 9 shows the convergence curve of the frequency response.
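To make the metrics in Table 4 concrete, the following minimal sketch shows one way such quantities could be extracted from a sampled frequency trace. It is not the authors' post-processing code: the function name `frequency_metrics`, the ±0.05 Hz settling band, the 10% tail window for the steady-state error, and the synthetic damped oscillation are all assumptions chosen for illustration.

```python
import math

def frequency_metrics(t, f, f_nom=50.0, band=0.05):
    """Summarise a sampled frequency trace (t in seconds, f in Hz)."""
    # Peak: the sample that deviates most from the nominal frequency.
    peak = max(f, key=lambda v: abs(v - f_nom))
    # RoCoF: largest finite-difference slope between consecutive samples.
    rocof = max(abs((f[i + 1] - f[i]) / (t[i + 1] - t[i]))
                for i in range(len(t) - 1))
    # Settling time: time (relative to t[0]) of the last sample that lies
    # outside the assumed +/- band around nominal.
    settle = 0.0
    for ti, fi in zip(t, f):
        if abs(fi - f_nom) > band:
            settle = ti - t[0]
    # Steady-state error: mean deviation over the final 10% of the samples.
    tail = f[-max(1, len(f) // 10):]
    ss_error = sum(tail) / len(tail) - f_nom
    return {
        "peak_hz": peak,
        "max_rocof_hz_per_s": rocof,
        "settling_time_s": settle,
        "steady_state_error_hz": ss_error,
    }

# Synthetic damped oscillation (illustrative only, not data from the paper).
t = [i * 1e-3 for i in range(1501)]
f = [50.0 + 2.0 * math.exp(-ti / 0.1) * math.sin(2 * math.pi * 5.0 * ti)
     for ti in t]
metrics = frequency_metrics(t, f)
print(metrics)
```

The reported values depend on the chosen band and sampling step, so any comparison between controllers should apply the same definitions to every trace.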
The simulation results in Figure 10 demonstrate the effectiveness of the hybrid PSO–RL control strategy in maintaining system frequency stability under varying disturbances, renewable variability, and fault conditions. The PV power profile shows realistic fluctuations, while load disturbances and fault injections introduce dynamic challenges. Despite these, the RL agent adapts the control parameters in real time, correcting deviations that the offline PSO alone cannot handle. The frequency response plot shows quick recovery and minimal overshoot, indicating robustness and adaptability. This hybrid approach leverages the global search ability of PSO and the real-time learning capacity of RL, ensuring optimal and stable microgrid performance.
The superior performance of the hybrid PSO–RL controller is primarily attributed to its ability to combine the global optimization strength of Particle Swarm Optimization (PSO) with the real-time adaptability of Reinforcement Learning (RL). In contrast to traditional fixed-parameter controllers or separately tuned PSO/RL schemes, this hybrid framework enables the RL agent to start from an optimal parameter space initialized by PSO, such as learning rates, exploration parameters, and action weightings, thus eliminating inefficient exploration during the early stages of control. More importantly, the real-time RL adaptation proves critical under dynamic disturbances, such as sudden load changes or fluctuations in PV output due to cloud coverage. When frequency deviation occurs, the RL agent, guided by its continuous feedback learning mechanism, quickly adjusts the virtual inertia in response to the rate and direction of change. For instance, during a sharp frequency drop, the RL component increases the virtual inertia to inject synthetic power, emulating the response of synchronous inertia. Conversely, during over-frequency events, the agent reduces inertia and absorbs excess energy, thereby damping oscillations. This context-aware decision-making, enabled by online state-action evaluations and updated policies, allows the system to stabilize more rapidly and robustly than conventional PID or PSO-tuned static inertia controllers. Additionally, the hybrid approach prevents overfitting to specific operating conditions and generalizes better across varying grid events, making it well-suited for highly dynamic, renewable-dominated microgrid environments.
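The effect of adjusting virtual inertia according to the direction of the frequency excursion can be illustrated with a small sketch. It uses a simplified per-unit swing equation with an integral secondary loop, and a hand-written switching rule stands in for the learned RL policy, so it is not the controller proposed in this paper. The parameter values (`h_fixed`, `h_high`, `h_low`, `d`, `ki`) and the rule itself (high inertia while the deviation is growing, low inertia while it is recovering) are assumptions for illustration only.

```python
F_NOM = 50.0  # nominal frequency in Hz

def simulate(adaptive, h_fixed=2.0, h_high=6.0, h_low=1.0,
             d=1.0, ki=20.0, dp=-0.1, dt=1e-3, t_end=10.0):
    """Per-unit swing model: 2*H*dx/dt = dp - d*x - ki*z, with dz/dt = x."""
    x = 0.0  # per-unit frequency deviation
    z = 0.0  # integral of x (assumed secondary-control state)
    trace = []
    for _ in range(int(t_end / dt)):
        net = dp - d * x - ki * z  # net accelerating power (pu)
        if adaptive:
            # Heuristic stand-in for the RL policy: use high inertia while
            # the deviation is growing, low inertia while it is recovering.
            h = h_high if x * net >= 0.0 else h_low
        else:
            h = h_fixed
        x += dt * net / (2.0 * h)
        z += dt * x
        trace.append(F_NOM * (1.0 + x))
    return trace

def max_dev(trace):
    """Largest absolute deviation from nominal, in Hz."""
    return max(abs(v - F_NOM) for v in trace)

fixed = simulate(adaptive=False)   # 10% load step with constant inertia
adapt = simulate(adaptive=True)    # same step with switched inertia
print("max deviation, fixed / adaptive (Hz):", max_dev(fixed), max_dev(adapt))
print("residual over last 2 s (Hz):", max_dev(fixed[-2000:]), max_dev(adapt[-2000:]))
```

In this toy model the smaller first excursion comes from the higher inertia applied during the initial swing, while the faster decay of the oscillation comes from lowering the inertia during recovery. A learned policy would select inertia values from observed states rather than from a fixed rule.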
The novelty of this research lies in the first integration of a Particle Swarm Optimization (PSO)-tuned Proportional-Integral-Derivative (PID) controller with an online Reinforcement Learning (RL) adaptation scheme for Adaptive Virtual Inertia Control (AVIC) in PV-based islanded microgrids. Unlike the existing approaches [29,30,31,32,33,34,35,36,37], which rely either on fixed parameters or on purely offline optimization, the proposed method adjusts the controller parameters dynamically in real time while preserving closed-loop stability via a Lyapunov–ISS design. Among the methods benchmarked in Table 5, which include Fuzzy, ANN–PID, ANFIS, GA–PID, and other optimization-based controllers, this combined offline–online strategy attains the shortest settling time (0.10 s) and zero steady-state error, with an overshoot of only 0.02, which is close to the lowest values reported there. To the best of our knowledge, no prior work has applied a PSO–RL hybrid to AVIC in PV-based islanded microgrids while providing both a theoretical stability guarantee and a comparative evaluation across all three performance metrics.

Comparative Analysis of Case A and Case B Configurations

A quantitative performance comparison was conducted between two PV–IGBT configurations under the same 10% load step disturbance at t = 0.2 s, using identical Hybrid PSO–RL Controller parameters:
  • Case A—Four independent PV–IGBT modules connected to the same grid node.
  • Case B—Four PV modules aggregated at the DC bus and connected to the node via a single IGBT.
Table 6 and Figure 11 present the key performance indicators: settling time, overshoot, rate of change of frequency (RoCoF), steady-state (SS) error, and maximum frequency deviation.
As shown in Table 6 and Figure 11, Case B consistently outperforms Case A across all evaluated metrics. Specifically, Case B achieves a 37% reduction in settling time, 52% lower overshoot, 29% improvement in RoCoF, and complete elimination of steady-state error. This demonstrates that centralized control via a single IGBT allows the Hybrid PSO–RL controller to operate more efficiently, delivering faster damping, improved real-time inertia emulation, and enhanced frequency stability.
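The quoted improvements follow directly from the values in Table 6. The short check below recomputes them as relative reductions from Case A to Case B; the dictionary keys are illustrative names, and the numbers are copied from the table.

```python
# Values copied from Table 6 (Case A: independent, Case B: centralized).
case_a = {
    "settling_time_s": 0.160,
    "overshoot_hz": 0.042,
    "rocof_hz_per_s": 0.280,
    "steady_state_error_hz": 0.010,
    "max_frequency_deviation_hz": 0.035,
}
case_b = {
    "settling_time_s": 0.100,
    "overshoot_hz": 0.020,
    "rocof_hz_per_s": 0.200,
    "steady_state_error_hz": 0.000,
    "max_frequency_deviation_hz": 0.018,
}

# Relative reduction in percent: 100 * (A - B) / A.
reduction = {k: 100.0 * (case_a[k] - case_b[k]) / case_a[k] for k in case_a}

for name, value in reduction.items():
    print(f"{name}: {value:.1f}% lower in Case B")
```

This gives reductions of about 37.5% in settling time, 52.4% in overshoot, 28.6% in RoCoF, 100% in steady-state error, and 48.6% in maximum frequency deviation, consistent with the rounded figures quoted in the text.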
Based on these findings, all subsequent disturbance and dynamic performance analyses in this paper are carried out using the Case B configuration, unless otherwise stated.

5. Conclusions

This paper presents a novel hybrid control strategy combining Particle Swarm Optimization (PSO) and Reinforcement Learning (RL) for Adaptive Virtual Inertia Control in multi-microgrid PV systems. By dynamically adjusting virtual inertia and damping parameters in response to real-time frequency deviations, the proposed system effectively mitigates the grid frequency fluctuations typically observed in renewable energy-dominated networks. The PSO algorithm optimizes the base control parameters offline, while the RL algorithm adapts these parameters in real time, ensuring that the system can efficiently respond to changes in grid conditions and power generation.
Simulation results demonstrate that the proposed hybrid controller significantly improves frequency stability, reduces power quality degradation, and ensures faster response times compared to traditional methods. The ability to adapt and optimize control parameters in real time offers significant advantages in maintaining stable operation in microgrid systems, particularly in environments where renewable energy sources are prevalent.
The effectiveness of the proposed control approach highlights its potential for integration into future smart grids, where large-scale renewable generation will require advanced control mechanisms for grid synchronization and stability. Further research can explore the scalability of this approach for larger systems and investigate additional optimization strategies to enhance performance under more complex operating conditions.

6. Research Limitations

Although the proposed Hybrid Particle Swarm Optimization–Reinforcement Learning (PSO–RL) Controller demonstrates enhanced frequency stability and dynamic adaptability for multi-microgrid photovoltaic (PV) systems, several limitations should be acknowledged to guide future improvements. Firstly, the integration of PSO with real-time RL adaptation introduces additional computational complexity. While this is manageable in MATLAB-based simulations, it may challenge real-time deployment on embedded microgrid controllers with constrained processing capabilities. Secondly, the system model used in this study assumes ideal inverter switching, balanced three-phase operation, and simplified load profiles. These assumptions may not fully capture practical system non-linearities, switching harmonics, or stochastic load behaviors present in real-world applications. Thirdly, the Reinforcement Learning agent currently operates with a discretized action space for selecting virtual inertia values. While effective for proof-of-concept simulation, this discretization may limit responsiveness under highly dynamic grid conditions. Continuous control methods, such as actor–critic or deep deterministic policy gradient (DDPG) algorithms, could offer finer control precision. Furthermore, the proposed controller has been evaluated on a single microgrid configuration with specific PV generation and load profiles. Its generalizability to diverse grid topologies, including hybrid AC/DC microgrids or weakly interconnected systems, remains to be verified. Lastly, the study is limited to software-based validation. Experimental verification through hardware-in-the-loop (HIL) testing or deployment on a real-time digital simulator (RTDS) would provide more comprehensive insight into the controller’s practical performance, robustness, and scalability.

Author Contributions

Conceptualization, A.B.A. and A.A.; methodology, A.B.A.; software, A.B.A.; validation, A.B.A. and A.A.; formal analysis, A.B.A.; investigation, A.B.A.; resources, A.A.; data curation, A.B.A.; writing—original draft preparation, A.B.A.; writing—review and editing, A.A.; visualization, A.B.A.; supervision, A.A.; project administration, A.A.; funding acquisition, A.A. All authors have read and agreed to the published version of the manuscript.

Funding

The research was funded by the Ongoing Research Funding Program (ORF-2025-258), King Saud University, Riyadh, Saudi Arabia.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Acknowledgments

The authors would like to thank the anonymous reviewers for their valuable comments and suggestions, which helped improve the quality of this paper. The authors also gratefully acknowledge the support provided by the Ongoing Research Funding Program (ORF-2025-258), King Saud University, Riyadh, Saudi Arabia.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
| Abbreviation | Definition |
|---|---|
| PSO | Particle Swarm Optimization |
| PV | Photovoltaic |
| MPPT | Maximum Power Point Tracking |
| IGBT | Insulated-Gate Bipolar Transistor |
| PQ | Power Quality |
| RL-PSO | Reinforcement Learning-Particle Swarm Optimization |
| H | Inertia Constant |
| RLFF | Reinforcement Learning Frequency Control |

References

  1. Dreidy, M.; Mokhlis, H.; Mekhilef, S. Inertia response and frequency control techniques for renewable energy sources: A review. Renew. Sustain. Energy Rev. 2017, 69, 144–155.
  2. Hamza, M.; Buhari, M.; Sadiq, A.A. Modified PSO-Based Virtual Inertia Controller for Optimal Frequency Regulation of Micro-Grid. Covenant J. Eng. Technol. 2022, 6, 1–12.
  3. Ogar, V.N.; Hussain, S.; Gamage, K.A.A. Load Frequency Control Using the Particle Swarm Optimization Algorithm and PID Controller for Effective Monitoring of Transmission Line. Energies 2023, 16, 5748.
  4. Skiparev, V.; Belikov, J.; Petlenkov, E. Reinforcement learning based approach for virtual inertia control in microgrids with renewable energy sources. In Proceedings of the IEEE PES Innovative Smart Grid Technologies Europe (ISGT-Europe), The Hague, The Netherlands, 26–28 October 2020; pp. 1020–1024.
  5. Afifi, M.A.; Marei, M.I.; Mohamad, A.M.I. Reinforcement-Learning-Based Virtual Inertia Controller for Frequency Support in Islanded Microgrids. Technologies 2024, 12, 39.
  6. Yameen, M.Z.; Lu, Z.; El-Sousy, F.F.M.; Younis, W.; Zardari, B.A.; Junejo, A.K. Improving Frequency Stability in Grid-Forming Inverters with Adaptive Model Predictive Control and Novel COA-jDE Optimized Reinforcement Learning. Sci. Rep. 2025, 15, 16540.
  7. Chang, M.; Salem, M.; Mohamed, F.A. Adaptive Virtual Inertia Emulation Based on Policy Gradient Clipping for Low-Inertia Microgrids with Phase-Locked Loop Dynamics. Comput. Electr. Eng. 2025, 112, 110477.
  8. Cavus, M.; Allahham, A. Spatio-Temporal Attention-Based Deep Learning for Smart Grid Demand Prediction. Electronics 2025, 14, 2514.
  9. Khosravi, S.; Hamidi Beheshti, M.T.; Rastegar, H. Robust control of islanded microgrid frequency using fractional-order PID. Iran. J. Sci. Technol. Trans. Electr. Eng. 2020, 44, 1207–1220.
  10. Skiparev, V.; Nosrati, K.; Tepljakov, A.; Petlenkov, E.; Levron, Y.; Belikov, J.; Guerrero, J.M. Virtual Inertia Control of Isolated Microgrids Using an NN-Based VFOPID Controller. IEEE Trans. Sustain. Energy 2023, 14, 1558–1568.
  11. Babaei, F.; Lashkari, Z.B.; Safari, A.; Farrokhifar, M.; Salehi, J. Salp swarm algorithm-based fractional-order PID controller for LFC systems in the presence of delayed EV aggregators. IET Electr. Syst. Transp. 2020, 10, 259–267.
  12. Babaei, F.; Safari, A. SCA based fractional-order PID controller considering delayed EV aggregators. J. Oper. Autom. Power Eng. 2020, 8, 75–85.
  13. Asgari, S.; Suratgar, A.A.; Kazemi, M. Feedforward fractional order PID load frequency control of microgrid using harmony search algorithm. Iran. J. Sci. Technol. Trans. Electr. Eng. 2021, 45, 1369–1381.
  14. Çelik, E. Design of new fractional order PI–fractional order PD cascade controller through dragonfly search algorithm for advanced load frequency control of power systems. Soft Comput. 2021, 25, 1193–1217.
  15. Bhuyan, M.; Das, D.C.; Barik, A.K.; Sahoo, S.C. Performance assessment of novel solar thermal-based dual hybrid microgrid system using CBOA optimized cascaded PI-TID controller. IETE J. Res. 2022, 69, 9076–9093.
  16. Ali, M.; Kotb, H.; Aboras, K.M.; Abbasy, N.H. Design of cascaded pi-fractional order PID controller for improving the frequency response of hybrid microgrid system using gorilla troops optimizer. IEEE Access 2021, 9, 150715–150732.
  17. Prabhu, M.S.G.S.; Jayanth, K.M.; Silva, R.L. Design and implementation of MPPT algorithm for photovoltaic systems. IEEE Trans. Energy Convers. 2019, 30, 732–740.
  18. Hong, S.M.; Shahnia, F.S.; Elkamel, A.B. IGBT switching strategies for power conversion systems. IEEE Trans. Power Electron. 2019, 35, 8011–8019.
  19. Ray, A.K.; Williams, M.L. Three-phase inverter control techniques for photovoltaic systems. IEEE Trans. Power Syst. 2021, 34, 1244–1252.
  20. Kaushik, D.K.; Harne, R.L. Active and reactive power control for IGBT-based inverters in grid integration. IEEE Trans. Ind. Appl. 2021, 58, 745–754.
  21. Gupta, R.L.; Mahajan, V.K.; Nair, S.K. Grid interfacing of renewable energy systems using passive elements and impedance matching. IEEE Trans. Ind. Electron. 2021, 68, 4623–4631.
  22. Saxena, P.; Singh, N.; Pandey, A.K. Self-regulated solar PV systems: Replacing battery via virtual inertia reserve. IEEE Trans. Energy Convers. 2021, 36, 2185–2194.
  23. Khalil, H.K. Nonlinear Systems, 3rd ed.; Prentice Hall: Upper Saddle River, NJ, USA, 2002.
  24. Clerc, M.; Kennedy, J. The particle swarm - Explosion, stability, and convergence in a multidimensional complex space. IEEE Trans. Evol. Comput. 2002, 6, 58–73.
  25. Ioannou, P.A.; Sun, J. Robust Adaptive Control; Dover Publications: Mineola, NY, USA, 2012.
  26. Sontag, E.D. On the input-to-state stability property. Eur. J. Control 1995, 1, 24–36.
  27. Borkar, V.S. Stochastic Approximation: A Dynamical Systems Viewpoint; Springer: Cambridge, UK, 2008.
  28. Konda, V.R.; Tsitsiklis, J.N. Actor–critic algorithms. In Advances in Neural Information Processing Systems 12 (NIPS 1999); Solla, S.A., Leen, T.K., Müller, K.R., Eds.; MIT Press: Cambridge, MA, USA, 2000; pp. 1008–1014.
  29. Çam, E.; Kocaarslan, I. Load frequency control in two area power systems using fuzzy logic controller. Energy Convers. Manag. 2005, 46, 233–243.
  30. Arya, Y. Improvement in automatic generation control of two-area electric power systems via a new fuzzy aided optimal PIDN-FOI controller. ISA Trans. 2018, 80, 475–490.
  31. Kumari, K.; Shankar, G.; Kumari, S.; Gupta, S. Load frequency control using ANN-PID controller. In Proceedings of the 2016 IEEE 1st International Conference on Power Electronics, Intelligent Control and Energy Systems (ICPEICES), Delhi, India, 4–6 July 2016; IEEE: New York, NY, USA, 2016; pp. 1–6.
  32. Mohammed, A.J.; Al-Majidi, S.D.; Al-Nussairi, M.K.; Abbod, M.F.; Al-Raweshidy, H.S. Design of a Load Frequency Controller based on Artificial Neural Network for Single-Area Power System. In Proceedings of the 2022 57th International Universities Power Engineering Conference (UPEC), Istanbul, Turkey, 30 August–2 September 2022; IEEE: New York, NY, USA, 2022; pp. 1–5.
  33. Osman, A.M.; Magzoub, M.A.; Salem, A. Load Frequency Control in Two Area Power System using GA, SA and PSO Algorithms: A Comparative Study. In Proceedings of the 2021 31st Australasian Universities Power Engineering Conference (AUPEC), Perth, WA, Australia, 26–30 September 2021; IEEE: New York, NY, USA, 2021; pp. 1–8.
  34. Babakhani, Q.M. Load Frequency Control in Two Area Power System Using Sliding Mode Control. J. Artif. Intell. Electr. Eng. 2014, 3, 24–36.
  35. Appikonda, P.; Kasibhatla, R.S. Design of support vector machine controller for hybrid power system automatic generation control. Energy Sources Part A Recover. Util. Environ. Eff. 2022, 44, 3883–3907.
  36. Hemeida, A.; Mohamed, S.; Mahmoud, M. Load frequency control using optimized control techniques. JES J. Eng. Sci. 2020, 48, 1119–11136.
  37. Raj, T.D.; Kumar, C.; Kotsampopoulos, P.; Fayek, H.H. Load Frequency Control in Two-Area Multi-Source Power System Using Bald Eagle-Sparrow Search Optimization Tuned PID Controller. Energies 2023, 16, 2014.
Figure 1. Hybrid Reinforcement Learning framework.
Figure 2. Hybrid Reinforcement Learning (RL)–PSO Controller.
Figure 3. Simulink model of the proposed PV-based multi-microgrid system with Hybrid PSO–RL adaptive inertia control.
Figure 4. Three-phase AC voltage, current waveforms, and power quality demonstrating stable operation under the proposed hybrid controller.
Figure 5. DC-link voltage waveforms ($V_{dC}$, $V_{dC1}$, $V_{dC2}$, and $V_{dC3}$) corresponding to the multiple inverter modules in the multi-microgrid PV system.
Figure 6. Photovoltaic (PV) System: Irradiance and mean power output over time.
Figure 7. Grid frequency over time.
Figure 8. Frequency response for non-adaptive, standard PSO and Hybrid PSO-RL Controller.
Figure 9. Convergence curve.
Figure 10. Test under various disturbances, renewable variability, and fault conditions.
Figure 11. Comparative performance of Case A and Case B under a 10% load step disturbance at t = 0.2 s.
Table 1. Reinforcement Learning parameters.

| Parameter | Symbol | Value | Description |
|---|---|---|---|
| Learning Rate | α | 0.001 | Step size for updating Q-values or neural weights |
| Discount Factor | γ | 0.95 | Future reward importance |
| Exploration Rate (initial) | ε | 1.0 | For ε-greedy policy (starts with high exploration) |
| Exploration Decay Rate | - | 0.995 | Decay rate per episode |
| Minimum Exploration Rate | εmin | 0.01 | Minimum value of ε |
| Number of Episodes | Nepi | 500 | Total RL training episodes |
| Maximum Steps per Episode | Nstep | 1000 | Maximum steps per episode |
| Replay Buffer Size (if DQN) | - | 10,000 | Experience memory size |
| Batch Size | - | 64 | Mini-batch size for training |
| Neural Network Hidden Layers | - | [64, 64] | Two hidden layers with 64 neurons each |
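As an illustration of how the exploration parameters in Table 1 interact, the sketch below generates an ε-greedy schedule under the assumption that the decay rate is applied multiplicatively once per episode and clipped at the minimum value. The function name `epsilon_schedule` and this reading of "decay rate per episode" are assumptions, not details confirmed by the paper.

```python
import math

def epsilon_schedule(eps0=1.0, decay=0.995, eps_min=0.01, episodes=500):
    """Return the exploration rate used in each episode (assumed rule)."""
    eps = eps0
    schedule = []
    for _ in range(episodes):
        schedule.append(eps)
        # Multiplicative decay per episode, never below the floor eps_min.
        eps = max(eps_min, eps * decay)
    return schedule

schedule = epsilon_schedule()
# Number of decay steps needed for eps0 * decay**n to fall to eps_min.
episodes_to_floor = math.ceil(math.log(0.01 / 1.0) / math.log(0.995))

print("epsilon in first episode:", schedule[0])
print("epsilon in last episode:", round(schedule[-1], 4))
print("episodes needed to reach the floor:", episodes_to_floor)
```

Under this assumed rule, ε is still roughly 0.08 in the final training episode, and the floor of 0.01 would only be reached after about 919 episodes, so the minimum exploration rate would not become active within the 500 listed episodes.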
Table 2. PSO parameters.

| Parameter | Symbol | Value | Description |
|---|---|---|---|
| Swarm Size | $N_p$ | 30 | Number of particles in the swarm |
| Maximum Iterations | $T_{max}$ | 100 | Maximum number of PSO iterations |
| Inertia Weight | w | 0.729 | Balances exploration and exploitation |
| Cognitive Coeff. | c1 | 1.49 | Particle’s own best influence |
| Social Coeff. | c2 | 1.49 | Global best influence |
| Velocity Clamping | $V_{max}$ | ±0.1 | Maximum velocity limit for particles |
| PSO Objective | - | Minimize C_total | Total cost, error, or loss function |
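The parameters in Table 2 correspond to the standard global-best PSO update, which can be sketched as follows. This is a generic textbook form, not the authors' implementation. The normalized search range [0, 1], the two-dimensional decision vector, and the quadratic `toy_cost` (standing in for C_total) are hypothetical choices made only so that the example runs.

```python
import random

def pso(cost, dim=2, n_particles=30, iters=100,
        w=0.729, c1=1.49, c2=1.49, v_max=0.1, seed=0):
    """Global-best PSO on the unit hypercube using the Table 2 settings."""
    rng = random.Random(seed)
    pos = [[rng.random() for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=pbest_cost.__getitem__)
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]
    for _ in range(iters):
        for i in range(n_particles):
            for k in range(dim):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][k]
                     + c1 * r1 * (pbest[i][k] - pos[i][k])
                     + c2 * r2 * (gbest[k] - pos[i][k]))
                # Velocity clamping to +/- v_max, then keep position in [0, 1].
                vel[i][k] = max(-v_max, min(v_max, v))
                pos[i][k] = max(0.0, min(1.0, pos[i][k] + vel[i][k]))
            c = cost(pos[i])
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = pos[i][:], c
    return gbest, gbest_cost

def toy_cost(p):
    """Hypothetical stand-in for C_total with its minimum at (0.3, 0.7)."""
    return (p[0] - 0.3) ** 2 + (p[1] - 0.7) ** 2

best, best_cost = pso(toy_cost)
print("best normalized parameters:", best, "cost:", best_cost)
```

In the setting of this paper the decision vector would hold the base inertia and damping parameters, and the cost would be evaluated by running the microgrid simulation rather than a closed-form function.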
Table 3. PV system and power electronic components.

| Component | Value/Rating | Description |
|---|---|---|
| PV Array Rating | 1000 kW | Total installed solar capacity |
| MPPT Technique | Incremental Conductance | Used to extract max power from PV |
| DC/DC Converter Type | Boost Converter | Steps up PV voltage to DC bus |
| DC Bus Voltage | 600 V | Common bus for interfacing sources/loads |
| IGBT Switching Frequency | 10 kHz | For power converter switching |
| Filter Inductance (L) | 2 mH | For smoothing DC link or inverter output |
| DC Link Capacitance (C) | 1000 µF | Stabilizes voltage fluctuations |
Table 4. Frequency response performance metrics of controllers.

| Controller | Peak Overshoot (Hz) | Settling Time (s) | RoCoF (Hz/s) | Steady-State Error (Hz) |
|---|---|---|---|---|
| Non-Adaptive Controller | 54.6 | 0.75 | 76.7 | ±0.4 |
| Standard PSO Controller | 51.7 | 0.40 | 21.3 | ±0.1 |
| Hybrid PSO–RL Controller | 50.02 | 0.10 | 0.2 | ≈ 0.0 |
Table 5. Comparison between proposed model and other controllers.

| Author | Controller | Settling Time (s) | Controller Error | Overshoot |
|---|---|---|---|---|
| [29] | Fuzzy | 7.2 | 0.2% | 0.027 |
| [30] | PID | 16.58 | 0.732% | 0.0206 |
| [31] | ANN-PID | 6.5 | 0.04% | 0.1090 |
| [32] | Optimal ANN | 50 | 0.06% | 3.4 |
| [33,34] | ANFIS | 8.5 | - | −0.45 |
| [35] | SVM | 10.5 | 7.09% | 0.25 |
| [36] | GA-PID | 5 | 0.50025% | 0.0 |
| [37] | BESSO-PID | 10.4767 | - | 0.0001 |
| Proposed Method | PSO–RL | 0.10 | 0.0% | 0.02 |
Table 6. Comparative analysis of Case A and Case B configurations.

| Metric | Case A: Independent PV–IGBT | Case B: Centralized PV–IGBT |
|---|---|---|
| Settling Time (s) | 0.160 | 0.100 |
| Overshoot (Hz) | 0.042 | 0.020 |
| RoCoF (Hz/s) | 0.280 | 0.200 |
| Steady-State Error (Hz) | 0.010 | 0.000 |
| Maximum Frequency Deviation (Hz) | 0.035 | 0.018 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Akinwola, A.B.; Alkuhayli, A. Hybrid PSO–Reinforcement Learning-Based Adaptive Virtual Inertia Control for Frequency Stability in Multi-Microgrid PV Systems. Electronics 2025, 14, 3349. https://doi.org/10.3390/electronics14173349