1. Introduction
Hydrogen has emerged as a versatile and clean energy carrier, widely recognized for its high energy density, ease of storage, and lack of harmful emissions [1,2,3]. Traditional hydrogen production methods, such as natural gas reforming and coal gasification, are associated with significant carbon and pollutant emissions [4,5,6]. In contrast, water electrolysis powered by renewable energy sources offers a sustainable and environmentally friendly alternative: green hydrogen. Among the various electrolyzer technologies, proton exchange membrane (PEM) electrolyzers stand out due to their compact design, high current density, and ability to produce high-purity hydrogen [7,8,9]. PEM electrolysis is particularly well-suited for integration with renewable energy sources like solar and wind power, addressing the need for green hydrogen production.
Integrating renewable energy, particularly photovoltaic (PV) systems, into hydrogen production through electrolysis presents a promising pathway for sustainable energy solutions. Unlike conventional methods reliant on fossil fuels, this approach significantly reduces environmental impact while leveraging the declining cost of solar electricity [10,11]. However, because solar irradiance fluctuates naturally, PV generation is inherently intermittent and requires hybrid energy storage for dependable hydrogen production. Integrating PV systems with PEM electrolyzers allows two-stage energy storage (short-term battery buffering and long-term hydrogen storage), thereby supporting uninterrupted operation [12,13].
Despite their promising potential, standalone PV–Battery–PEM electrolyzer systems for green hydrogen production present both advantages and challenges that must be carefully considered. One of the primary advantages is the ability to directly utilize intermittent solar energy to generate hydrogen, enabling clean and decentralized energy storage that can decouple energy generation from consumption in time and space. The integration of batteries provides short-term energy buffering, smoothing fluctuations in PV output, and enhancing system reliability, while PEM electrolyzers offer fast dynamic response and high efficiency at varying loads, making them well-suited to cope with solar variability. Furthermore, PEM electrolyzers produce high-purity hydrogen without the need for extensive downstream purification, and their modular design allows for scalability and rapid startup/shutdown cycles, which align well with the variable nature of renewable energy.
However, these systems also face several challenges that limit their widespread deployment. The high capital costs of PEM electrolyzers and batteries remain a significant barrier, alongside operational costs related to maintenance and component degradation, particularly the limited cycle life of batteries under frequent charge–discharge cycles. Moreover, the variability and intermittency of solar energy impose complex control and energy management requirements to optimize hydrogen production without excessive energy curtailment or premature battery aging. The efficiency of the entire PV–Battery–PEM system depends heavily on accurate forecasting and dynamic control strategies to balance energy flows among components while meeting variable load demands. Additionally, PEM electrolyzers require stable operating conditions, and frequent power fluctuations can accelerate membrane degradation, reducing system lifetime and increasing replacement costs.
Advanced energy management and control strategies are needed to exploit these advantages and mitigate these limitations. In this context, the standalone DC PV microgrid plays a pivotal role in bridging the gap between solar energy’s intermittent nature and the stable power demands of electrical loads and hydrogen production. The system relies on solar panels to generate DC electricity, which is then regulated by DC-DC converters to maintain optimal voltage levels for different components [14,15]. An integrated energy management system (EMS) dynamically allocates power between direct consumption, battery storage for short-term fluctuations, and hydrogen production for long-term energy storage. Additionally, charge controllers ensure efficient energy flow to storage units, preventing overcharging and optimizing battery lifespan [16,17].
Effective power optimization is crucial to improving the efficiency and reliability of PV-PEM microgrids, where solar energy is the primary source for both electrical loads and hydrogen production. A well-designed control strategy ensures the optimal power distribution between the load demand, energy storage system (ESS), and electrolyzer, enhancing the overall system stability [18,19]. Due to PV power’s intermittent nature, fluctuations in solar irradiance can lead to sudden variations in power output, which, if not properly managed, can negatively impact system performance.
To address these challenges, DC-DC converters play a critical role in regulating the power flow from the PV array to the loads, batteries, and electrolyzer. The converter adjusts the voltage and current levels to ensure efficient energy transfer while mitigating the effects of power fluctuations [20,21]. Maximum power point tracking (MPPT) control further enhances system performance by dynamically adjusting the operating point of the PV array to extract the maximum available power under varying solar irradiance conditions [22,23]. By integrating an advanced MPPT control, the system can quickly respond to changes in solar input and optimize power allocation. This ensures stable operation and prevents excessive voltage deviations that could compromise the efficiency of the ESS or the electrolyzer, while coordinated operation of the energy storage system smooths the remaining power fluctuations.
When solar generation exceeds demand, surplus energy can be stored in batteries, ensuring optimal energy utilization. Conversely, the stored energy can be dispatched during periods of low solar availability to maintain a stable and reliable power supply. Additionally, surplus energy can be strategically allocated to power electrolyzers for hydrogen production, optimizing resource utilization and reducing energy wastage. By dynamically managing power flow within the PV-PEM microgrid, the EMS enhances energy efficiency, reduces operational costs, and improves the overall sustainability of hydrogen production systems.
Despite significant advancements in renewable energy technologies, a critical challenge in the real-world deployment of standalone PV systems is the unpredictable nature of load demand. Traditional deterministic energy management strategies often rely on fixed forecasts or average load profiles, making them inadequate in handling abrupt, stochastic variations in consumption. These limitations lead to inefficient energy allocation, increased reliance on batteries, and underutilization of surplus solar energy. Consequently, there is a pressing need for energy management approaches that can anticipate and adapt to fluctuations in random load in real time. This research addresses this gap by proposing a stochastic EMS framework based on a Markov decision process (MDP), designed to optimize energy flow within a PV–Battery–PEM electrolyzer system and ensure sustainable hydrogen production under uncertain operating conditions.
1.1. Literature Review
Recent research has demonstrated a growing interest in standalone renewable energy systems for green hydrogen production. The authors of [24] propose an autonomous hybrid wind–solar system designed for green hydrogen production and water treatment, focusing on optimal sizing and economic feasibility to support hydrogen refueling stations. Similarly, Ref. [25] addresses the adaptability of renewable energy systems to various demand profiles, such as rural, institutional, and medical needs, by using optimization algorithms to ensure economically and environmentally sound configurations. Moreover, an operational optimization method tailored to off-grid hydrogen systems is designed in [26], emphasizing the reduction of design inefficiencies and improving long-term operational reliability.
Furthermore, the authors of [27] explore optimal hybrid microgrid configurations by performing extensive techno-economic simulations, with a focus on cost minimization, renewable penetration, and hydrogen production potential. Their study provides valuable insights into long-term system planning and highlights hydrogen’s role in sustainable energy strategies. However, their approach is purely design-oriented and relies on deterministic scenarios with averaged resource and load profiles. In contrast, this work shifts the focus to real-time stochastic control that dynamically manages energy flows under uncertain and time-varying solar generation and load demand.
In terms of control strategies and energy flow management, various studies have explored advanced optimization techniques for hydrogen-integrated microgrids. The authors of [17] implement a model predictive control (MPC) scheme to manage a standalone PV-hydrogen-battery system, effectively reducing battery cycling and prioritizing hydrogen production during energy surplus periods. Ref. [28] introduces a hierarchical economic MPC framework that coordinates short- and long-term control layers to maintain system flexibility and economic performance. In the same context, an MPC-based management system for an on-site hydrogen refueling station is designed in [29], considering dynamic constraints and operational scheduling to maximize hydrogen production and distribution efficiency. Furthermore, the study presented in [30] proposes a neural network-based predictive control system capable of smoothing power fluctuations in a solar-wind hybrid system with hydrogen and battery storage, thereby advancing intelligent control applications in hydrogen energy systems.
Several researchers have also focused on improving system reliability through enhanced power electronics and hybrid storage solutions. Ref. [20] proposes a two-stage DC–DC conversion architecture combining a resonant frequency converter and a partial power regulation unit to facilitate MPPT and efficient electrolyzer operation in off-grid PV systems. Although this architecture enhances MPPT and electrolyzer functionality, the study’s dependence on idealized irradiance profiles neglects actual transient situations, such as rapid cloud cover. The complexity of the proposed topology, which requires synchronized management of several conversion stages, may raise maintenance costs and limit scalability for field deployments. These constraints highlight the need for resilient control systems that can adjust to erratic solar fluctuations.
The authors of [31] advocate for sliding mode control and bidirectional converters to improve operational flexibility and storage efficiency in standalone PV systems. An integer linear programming formulation combined with MPC was proposed in [32] to optimize hydrogen-battery hybrid storage, aiming to reduce component degradation and enhance economic performance. Ref. [33] takes a complementary approach by proposing a hybrid energy storage system, optimized using a fractional gradient descent algorithm, that integrates batteries, fuel cells, and supercapacitors, which together reduce stress on any single storage element and extend the operational life of the system.
In addition, the authors of [21] propose a simplified architecture for standalone PV-powered hydrogen generation by eliminating traditional power converters and implementing a degradation-aware control strategy. Their approach maintains constant electrolyzer power despite irradiance variations, achieving impressive efficiency metrics through indirect PV control and strategic battery use. However, unlike the proposed stochastic control framework, their method lacks real-time adaptability to random load fluctuations. The direct electrical coupling between the PV array and electrolyzer limits the system’s flexibility, particularly under unpredictable load conditions.
Ref. [34] focuses on long-term performance estimation through machine learning, identifying the most suitable forecasting models for hydrogen production across different geographic and climatic conditions based on extensive weather datasets. These works emphasize the value of predictive and simplified design methodologies in making green hydrogen systems more accessible, resilient, and scalable for off-grid and distributed energy applications. Yet, the methodology of [34] requires high-resolution input data (1-min irradiance/wind measurements), which may be infeasible for remote off-grid locations with inadequate monitoring infrastructure. These restrictions indicate a need for adaptive algorithms that balance prediction accuracy with practical operating limits.
Despite the comprehensive advancements in the design, control, and optimization of the PV-PEM microgrid via optimized power electronics, streamlined architectures, and machine learning forecasting, a critical limitation across much of the existing literature lies in the limited consideration of demand-side uncertainty, particularly in scenarios where load behavior exhibits random variations. Many studies emphasize the optimization of generation resources, energy storage, and power electronics, yet they often operate under assumptions of predictable or averaged load profiles. Such simplifications can significantly underestimate the operational challenges faced in real-world off-grid systems, where consumption patterns may fluctuate unpredictably due to user behavior, seasonal changes, or application-specific demands. This oversight leaves a gap in ensuring robust energy management strategies that can dynamically adapt to random load variations. Addressing this gap requires control models that optimize energy flow and incorporate stochastic representations of load demand to reflect actual operating conditions and enhance the resilience of autonomous PV microgrids.
Incorporating stochastic modeling into the control architecture of PV-PEM microgrids is essential for capturing the inherent randomness associated with real-world load consumption. Unlike deterministic methods that rely on fixed or average values, stochastic models use probability distributions and random variables to describe fluctuations in energy demand more realistically. This approach allows for anticipating a wide range of possible consumption scenarios, rather than a single expected outcome, enabling the system to prepare for both typical and extreme load conditions. By integrating uncertainty directly into forecasting and decision-making processes, stochastic modeling enhances the responsiveness and flexibility of energy management strategies. As a result, PV-PEM microgrids are better equipped to allocate resources, schedule storage use, and maintain supply-demand balance, even under erratic or rapidly changing consumption patterns. This capability is especially critical in isolated or autonomous systems, where forecasting errors can lead to power shortages, unnecessary cycling of storage devices, or system instability.
An MDP offers a structured and adaptable approach for addressing uncertainties in energy management, particularly in standalone PV microgrids dedicated to green hydrogen production. In such systems, the unpredictable nature of load demand and hydrogen production dynamics introduces complexity that requires probabilistic modeling. MDPs are well-suited to this context, as they represent decision-making in environments where the outcome of each action depends only on the current system state and a set of probabilistic transitions. This property makes them ideal for modeling sequential decision-making under uncertainty, where actions, such as charging batteries, powering electrolyzers, or shedding loads, must be taken in response to fluctuating inputs. Integrating MDPs into the control strategy makes it possible to evaluate each decision’s long-term impact on system performance, balancing hydrogen production efficiency, storage stability, and energy availability. The ability of MDPs to generate optimal policies over time enhances the resilience and intelligence of the microgrid, enabling it to adapt dynamically to both forecasted and unforeseen changes in its operational environment.
1.2. Main Contribution
This research introduces a stochastic energy management approach designed to enhance the utilization of excess energy in standalone PV microgrids, with the primary objective of maximizing green hydrogen production. The proposed framework leverages an MDP integrated with energy management control to anticipate unpredictable load consumption and optimize the distribution of generated power between local consumption, battery storage, and a PEM electrolyzer. Unlike conventional strategies that rely on deterministic assumptions, this method incorporates the stochastic nature of load behavior to make informed, real-time decisions that prevent energy wastage. By accurately forecasting when energy demand will be low, the system intelligently channels surplus power toward hydrogen production rather than letting it go unused or overcharging the storage systems. This ensures a more efficient exploitation of solar energy, even under fluctuating environmental and consumption conditions, thereby improving the reliability and sustainability of standalone microgrids designed for hydrogen generation.
The main contributions of this work can be summarized as follows:
A novel integration of MDP and energy management control is proposed to manage power flows in a standalone PV–battery–electrolyzer system, specifically focusing on converting excess solar energy into hydrogen by anticipating future load behavior.
The method enhances forecasting accuracy of energy consumption patterns, enabling the system to proactively allocate available power to hydrogen production when predicting low demand.
The approach ensures continuous and efficient hydrogen generation by maintaining operational stability despite random solar input and load profile variations.
Battery lifetime is extended through smart control of charge–discharge cycles, reducing unnecessary cycling by prioritizing hydrogen production during energy surplus periods.
The remainder of this paper is structured as follows. Section 2 outlines the architecture of the standalone PV–battery–electrolyzer microgrid, including its main components and the integration of DC loads. Section 3 details the proposed stochastic energy management strategy, combining energy management control with an MDP for optimized power distribution. In Section 4, simulation results are presented to evaluate the performance of the developed method under realistic operating conditions. Finally, Section 5 summarizes the key findings and offers concluding insights.
2. Overview of the Standalone PV–PEM–Microgrid System
A standalone DC PV-PEM microgrid is an autonomous energy system engineered to function independently from the main power grid, with the primary goal of generating and storing renewable electricity for green hydrogen production. It integrates energy conversion, regulation, storage, and consumption subsystems in a coordinated manner to ensure reliable and continuous operation. As shown in Figure 1, the system comprises a PV array, DC-DC converters, a hydrogen production unit based on a PEM electrolyzer, and energy storage components. The PV modules are the main energy source, converting solar radiation into DC electricity through the photovoltaic effect. Their output depends on irradiance and ambient temperature, and is inherently variable throughout the day.
Power conditioning stages are employed to manage these fluctuations and match the power requirements of the PEM electrolyzer and storage units. A DC-DC converter is essential to stabilize the voltage and current levels, thereby improving power transfer efficiency and safeguarding sensitive components. These converters also integrate MPPT algorithms, which continuously adjust the operating point of the PV array to extract the highest possible power under changing solar conditions [22,23].
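The text does not say which MPPT algorithm the converters run. As an illustration only, the sketch below uses perturb and observe (P&O), one widely used hill-climbing scheme. The PV curve `pv_power`, its parameters, and the step size are hypothetical placeholders, not values from this work.

```python
import math

def pv_power(v, v_oc=40.0, i_sc=8.0):
    """Toy PV power curve in watts (assumed shape, not the paper's model)."""
    if v <= 0.0 or v >= v_oc:
        return 0.0
    current = i_sc * (1.0 - math.exp((v - v_oc) / 3.0))
    return v * current

def perturb_and_observe(power_fn, v_start=20.0, step=0.2, iterations=200):
    """Climb the power curve by perturbing the operating voltage."""
    v = v_start
    p_prev = power_fn(v)
    direction = 1.0
    for _ in range(iterations):
        v += direction * step          # perturb the operating point
        p = power_fn(v)                # observe the resulting power
        if p < p_prev:                 # power dropped: reverse direction
            direction = -direction
        p_prev = p
    return v, p_prev

v_mpp, p_mpp = perturb_and_observe(pv_power)
print(f"operating point near {v_mpp:.1f} V, {p_mpp:.1f} W")
```

In steady state the operating point oscillates within a few steps of the maximum, which is the usual trade-off of P&O between tracking speed and ripple.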
Electrolyzers require a stable and continuous DC power input to operate efficiently, so the available power is allocated in a fixed priority order. The regulated power from the PV array is first used to meet the energy demand of all connected DC loads. When solar availability is high, the PV system also charges the batteries until they reach their maximum state of charge (SoC). Once the load demand is satisfied and the storage system is fully charged, surplus energy is directed to the PEM electrolyzer to produce green hydrogen. This approach prioritizes supplying critical loads and maximizing energy storage before utilizing excess renewable energy for hydrogen generation. The battery system serves as a buffer, storing energy during high-generation periods and supplying power when solar input is insufficient, thereby maintaining energy balance and supporting continuous hydrogen production when direct PV power is not available.
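The priority order described above (loads first, then battery charging, then the electrolyzer) can be sketched as a simple rule. This is a minimal illustration, not the controller developed in this work: the function name `dispatch`, the SoC thresholds, and the battery power limit are assumptions, and routing any charge power above the battery limit to the electrolyzer is an added simplification.

```python
def dispatch(p_pv, p_load, soc, soc_min=0.2, soc_max=0.9, p_batt_max=500.0):
    """Split PV power (W) between load, battery, and electrolyzer.

    Returns (p_batt, p_el, p_unmet). p_batt > 0 means charging and
    p_batt < 0 means discharging. All thresholds are illustrative.
    """
    surplus = p_pv - p_load
    p_batt = p_el = p_unmet = 0.0
    if surplus >= 0.0:
        if soc < soc_max:
            p_batt = min(surplus, p_batt_max)   # charge the battery first
        p_el = surplus - p_batt                 # remainder feeds the electrolyzer
    else:
        deficit = -surplus
        if soc > soc_min:
            p_batt = -min(deficit, p_batt_max)  # discharge to cover the load
        p_unmet = deficit + p_batt              # part the battery cannot cover
    return p_batt, p_el, p_unmet

# Battery full, 600 W surplus: everything goes to hydrogen production.
print(dispatch(p_pv=1000.0, p_load=400.0, soc=0.9))
```

A rule of this kind reacts only to the present state; the stochastic strategy in Section 3 differs in that it also uses a forecast of the load.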
2.1. PV Conversion System
The PV system is designed to harness solar energy by utilizing PV modules that generate DC electricity when exposed to sunlight. These modules are connected to a DC-DC converter, which plays a crucial role in conditioning the output power to meet the requirements of various DC loads. As illustrated in Figure 2, the converter adjusts the voltage and current from the PV array through duty cycle control, which modulates the switching of the MOSFET, ensuring efficient power delivery to the connected loads. This setup allows the PV generator to supply power to different components, including batteries and a PEM electrolyzer, while maintaining optimal operation.
The nonlinear state-space model governs the PV conversion system [
35]:
where
is the state vector representing the system states,
is the voltage at the PV terminals,
is the inductor current, and
is the output capacitor voltage of the DC-DC converter. The system matrices are given as follows:
2.2. Battery Energy Storage System
Battery energy storage systems (BESSs) are a cornerstone of energy management strategies in stand-alone PV microgrids. They function as dynamic buffers, capturing surplus electrical energy for deferred use, thereby ensuring a continuous supply during periods of low solar generation. The operational behavior of BESSs is governed by electrochemical mechanisms within the cells, which define the dynamics of charging and discharging, as well as the overall efficiency and service life of the storage unit [
36].
Energy exchange between the battery and the rest of the microgrid is handled by a power conversion interface, typically a bidirectional DC-DC converter. This system is tasked with managing energy flows in both directions, absorbing energy during surplus generation and supplying it during deficits, while minimizing conversion losses and enhancing the overall reliability of the system. Control strategies embedded within this converter also serve to monitor the state of charge (SoC), prevent battery degradation from overcharging or deep discharging, and maintain load balance [
37].
The SoC is a critical indicator of the battery’s remaining usable capacity and is commonly tracked using a Coulomb counting technique. This method estimates the SoC by integrating the current entering or leaving the battery over time, accounting for the direction of flow. Mathematically, the SoC at time $t$ is calculated as follows:
$$\mathrm{SoC}(t) = \mathrm{SoC}(t_0) + \frac{Q(t)}{Q_n}$$
where $Q_n$ denotes the nominal capacity of the battery, and $Q(t)$ is the net charge accumulated or discharged between the initial time $t_0$ and time $t$, given by the following:
$$Q(t) = \int_{t_0}^{t} I_b(\tau)\, d\tau$$
with $I_b(\tau)$ representing the instantaneous current. A positive value indicates charging, while a negative value corresponds to discharging.
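In discrete time, Coulomb counting reduces to a running sum of sampled current. The sketch below assumes a fixed sampling period and a capacity given in ampere-hours; the function name and the clamping to the range 0 to 1 are illustrative choices rather than details taken from this work.

```python
def coulomb_count(soc0, currents_a, dt_s, capacity_ah):
    """Integrate battery current (A, positive = charging) into an SoC trace."""
    capacity_as = capacity_ah * 3600.0      # ampere-hours to ampere-seconds
    soc = soc0
    history = []
    for current in currents_a:
        soc += current * dt_s / capacity_as # rectangle-rule integration
        soc = min(1.0, max(0.0, soc))       # keep SoC within physical bounds
        history.append(soc)
    return history

# One hour of charging at 10 A into a 100 Ah battery raises SoC by 0.1.
trace = coulomb_count(soc0=0.5, currents_a=[10.0] * 3600, dt_s=1.0,
                      capacity_ah=100.0)
print(round(trace[-1], 3))
```

Because the method integrates a measured current, sensor offsets accumulate over time, so practical implementations usually recalibrate the estimate periodically.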
In a standalone PV system, the bidirectional converter operates in two main configurations, as illustrated in Figure 3:
Buck mode (charging phase): The converter steps down the higher DC bus voltage to align with the lower battery voltage, facilitating safe energy storage.
Boost mode (discharging phase): When the system needs energy, the converter increases the battery voltage to match the bus voltage, ensuring adequate power delivery to the loads and other subsystems.
This flexible operation allows the BESS to actively contribute to energy autonomy, load regulation, and stability in the standalone PV microgrid architecture.
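For intuition about the two modes, the textbook steady-state relations for an ideal, lossless converter in continuous conduction link the duty cycle to the voltage ratio. These relations are standard results and are not taken from this work; the function names and the 48 V and 24 V figures are illustrative.

```python
def buck_duty(v_bus, v_batt):
    """Ideal buck (charging): v_batt = D * v_bus."""
    if not 0.0 < v_batt < v_bus:
        raise ValueError("buck mode needs 0 < v_batt < v_bus")
    return v_batt / v_bus

def boost_duty(v_batt, v_bus):
    """Ideal boost (discharging): v_bus = v_batt / (1 - D)."""
    if not 0.0 < v_batt < v_bus:
        raise ValueError("boost mode needs 0 < v_batt < v_bus")
    return 1.0 - v_batt / v_bus

# A 48 V bus and a 24 V battery give D = 0.5 in both directions.
print(buck_duty(48.0, 24.0), boost_duty(24.0, 48.0))
```

Real converters deviate from these ratios because of switching and conduction losses, which is one reason closed-loop control of the duty cycle is needed.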
2.3. PEM Electrolyzer System
The PEM electrolyzer is a critical subsystem in the standalone PV-based microgrid, enabling the conversion of electrical energy into hydrogen via electrolysis. The electrolyzer dissociates water molecules into hydrogen and oxygen gases when powered by DC electricity. This process is highly sensitive to the input voltage and current, which the upstream power electronics regulate to ensure safe and efficient operation [38].
The dynamic behavior of the PEM electrolyzer can be described by the relationship between the input current and the produced hydrogen flow rate. The molar flow of hydrogen $\dot{n}_{H_2}$ is directly proportional to the electrolyzer current $I_{el}$, and can be expressed as follows:
$$\dot{n}_{H_2} = \eta_F \frac{I_{el}}{2F}$$
where $\eta_F$ is the Faraday efficiency, and $F$ is the Faraday constant (96,485 C/mol). The Faraday efficiency accounts for losses due to side reactions and non-idealities in the electrochemical process.
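As a numerical check on this proportionality, the sketch below applies Faraday's law with two electrons transferred per hydrogen molecule. The default Faraday efficiency of 0.99 and the optional `n_cells` multiplier for a series stack are assumptions added for illustration.

```python
FARADAY = 96485.0   # C/mol, as quoted in the text
M_H2 = 2.016e-3     # kg/mol, molar mass of H2

def h2_molar_flow(current_a, eta_f=0.99, n_cells=1):
    """Hydrogen production rate in mol/s (2 electrons per H2 molecule)."""
    return n_cells * eta_f * current_a / (2.0 * FARADAY)

def h2_mass_per_hour(current_a, eta_f=0.99, n_cells=1):
    """Mass of hydrogen in kg produced in one hour at constant current."""
    return h2_molar_flow(current_a, eta_f, n_cells) * M_H2 * 3600.0

print(h2_molar_flow(100.0), h2_mass_per_hour(100.0))
```

Doubling the current doubles the hydrogen flow, which is why the surplus power routed to the electrolyzer translates directly into production.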
The terminal voltage of the PEM electrolyzer $V_{el}$ can be modeled as the sum of the thermodynamic voltage $V_{rev}$ and the overpotentials resulting from activation, ohmic, and concentration losses. A simplified dynamic expression is as follows:
$$V_{el} = V_{rev} + V_{act} + V_{ohm} + V_{conc}$$
where $V_{rev}$ is the reversible voltage dependent on temperature and pressure, $V_{act}$ is the activation overvoltage caused by reaction kinetics, $V_{ohm}$ accounts for voltage losses due to membrane and electrode resistance, and $V_{conc}$ is the concentration overvoltage arising at high current densities.
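The text does not give closed forms for the individual overpotentials. The sketch below therefore uses commonly cited textbook forms (an inverse hyperbolic sine activation term, a linear ohmic term, and a logarithmic concentration term) purely to show how the four contributions add up. Every parameter value is an assumption, and the reversible voltage is held constant here even though the text notes that it depends on temperature and pressure.

```python
import math

R_GAS = 8.314       # J/(mol K)
FARADAY = 96485.0   # C/mol

def cell_voltage(j, temp_k=333.15, v_rev=1.229, alpha=0.5, j0=1e-3,
                 r_area=0.15, j_lim=3.0):
    """Cell voltage (V) at current density j (A/cm^2); parameters assumed."""
    if not 0.0 <= j < j_lim:
        raise ValueError("current density must satisfy 0 <= j < j_lim")
    # Activation loss: grows quickly at low current, then flattens.
    v_act = (R_GAS * temp_k / (alpha * FARADAY)) * math.asinh(j / (2.0 * j0))
    # Ohmic loss: linear in current density (area-specific resistance).
    v_ohm = r_area * j
    # Concentration loss: diverges as j approaches the limiting value.
    v_conc = -(R_GAS * temp_k / (2.0 * FARADAY)) * math.log(1.0 - j / j_lim)
    return v_rev + v_act + v_ohm + v_conc

for j in (0.5, 1.0, 2.0):
    print(j, round(cell_voltage(j), 3))
```

The rising voltage with current density is what limits the high-efficiency operating zone of the electrolyzer referred to in the control objectives.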
2.4. Control Objectives for Energy Management
Energy management in autonomous PV systems integrating a PEM electrolyzer and operating under variable and uncertain load profiles requires a robust and forward-looking control strategy to ensure optimal performance and system sustainability. In this context, stochastic energy management (SEM) offers a suitable framework for real-time decision-making under uncertainty [39,40,41], enabling the optimization of power flows while preserving system reliability and operational efficiency. The primary control objective is to maximize the utilization of PV-generated power, prioritizing local consumption, without the need for external energy sources. A secondary, but equally critical, objective is to maintain the battery’s SoC within predefined safety and performance thresholds to prevent degradation and ensure energy availability during low irradiance periods.
To meet these goals, the control strategy must account for the inherent stochasticity in solar irradiance and load demand, as well as the nonlinear dynamics of system components, such as the PV generator, battery energy storage system, and PEM electrolyzer. The SEM algorithm incorporates models of these uncertainties, often using an MDP to forecast possible future states of the system, including energy generation and consumption scenarios. These forecasts inform control decisions, enabling proactive and adaptive power dispatch.
Operational constraints are integrated into the control model, including the maximum and minimum SoC limits, the efficiency maps of the PV panels, and the safe operating range of the electrolyzer, particularly its allowable input current and voltage ranges. The DC-DC converters interfacing the PV array with both the battery and electrolyzer are modulated accordingly, ensuring that power flows are dynamically regulated in real-time. This prevents both overcharging and deep discharging of the battery, and maintains the electrolyzer within its high-efficiency operating zone.
Through this SEM approach, the system ensures optimal energy sharing between the battery and the electrolyzer, while adapting to real-time variations in generation and load. As a result, the PV-microgrid operates autonomously and efficiently, producing green hydrogen when surplus energy is available, maintaining battery health, and ensuring a reliable and continuous energy supply even in the face of unpredictable demand patterns.
The following section will demonstrate how the proposed SEM-MDP framework achieves these objectives.
3. MDP-Driven Approach to Optimizing Hydrogen Production
This section presents a stochastic energy management architecture designed for an off-grid DC PV-PEM microgrid integrating multiple loads, battery storage, and a PEM electrolyzer. The system configuration, shown in Figure 4, illustrates the proposed energy management strategy aimed at enhancing hydrogen production by intelligently managing surplus solar energy. This framework represents a novel contribution, as it integrates MDP-based forecasting directly into the energy management loop for real-time, adaptive decision-making under uncertainty.
The proposed EMS orchestrates the operation of the microgrid by dynamically supervising power flows among the PV generator, load demands, battery storage, and the electrolyzer. Central to this EMS is the MDP-based stochastic controller, which anticipates future variations in load consumption, enabling proactive and optimal energy allocation.
A power management controller (PMC) receives probabilistic forecasts from the MDP and determines the optimal power dispatch strategy. It continuously assesses the instantaneous load demand, battery SoC, and PV generation. Under normal conditions, priority is given to supplying local loads and charging the battery. However, when the load is fully met and the battery reaches its maximum SoC, any remaining surplus PV energy is automatically redirected to the PEM electrolyzer.
This approach ensures that no solar energy is wasted, as excess power is effectively transformed into green hydrogen. In doing so, the system not only enhances hydrogen production but also avoids unnecessary battery cycling, thereby preserving battery lifespan and maximizing overall system efficiency and sustainability, even under fluctuating environmental and unpredictable load conditions.
3.1. MDP-Driven Load Consumption Forecasting
The overall power consumption in a standalone PV microgrid that supplies diverse loads often exhibits random and unpredictable patterns. This variability arises from factors such as user behavior, intermittent appliance usage, and the non-uniform power profiles of individual loads. For instance, some appliances may operate cyclically or have usage peaks at specific times (e.g., day vs. night), leading to significant temporal fluctuations in total load demand. Moreover, when several loads with distinct power profiles are involved, their combined consumption behavior becomes highly stochastic, complicating the task of real-time energy allocation. These fluctuations can undermine the stability, efficiency, and reliability of the microgrid if not properly anticipated and managed. Designing an EMS capable of predicting and adapting to these changes is, therefore, a key requirement for a sustainable and resilient PV-PEM microgrid.
To address these challenges, this work introduces a novel stochastic control strategy based on an MDP, which constitutes a key contribution of this study. Unlike conventional methods, the MDP-based approach enables real-time modeling of random fluctuations in load demand and facilitates probabilistic forecasting of power consumption across multiple users. This probabilistic load model becomes the cornerstone of the EMS, allowing for anticipatory decisions that optimize energy dispatch across the PV generator, battery storage, and hydrogen production unit.
The MDP captures the discrete and dynamic nature of load states, where each load may be either active or inactive at any moment. This abstraction leads to a finite but possibly large number of system states, each corresponding to a unique combination of load activities and associated power demand. The stochastic behavior is then modeled as a continuous-time Markov chain in which transitions between states are governed by transition probabilities reflecting the likelihood of switching from one load configuration to another.
By incorporating this stochastic load model into the EMS, the system is able to predict future demand trajectories and optimize energy use accordingly. For instance, during low-demand intervals or when the battery reaches its maximum SoC, the EMS proactively redirects surplus PV energy to the PEM electrolyzer, thereby producing green hydrogen instead of curtailing generation or over-cycling the battery. This predictive logic improves system efficiency, prolongs battery lifespan, and ensures maximum utilization of renewable energy resources.
The MDP framework applied in this study is structured as follows:
State space definition: The state space includes all possible combinations of the on/off statuses of $n$ loads, leading to $2^n$ distinct states, each associated with a deterministic load power. These states capture the stochastic load profile dynamics over time [35, 42, 43].
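Transition probabilities: Given the state space, the probability of transitioning from state $i$ to state $j$ after a short time $\hbar$ takes the standard continuous-time Markov chain form:
$$p_{ij}(\hbar) = \begin{cases} q_{ij}\,\hbar + o(\hbar), & i \neq j \\ 1 + q_{ii}\,\hbar + o(\hbar), & i = j \end{cases}$$
where $q_{ij} \geq 0$ ($i \neq j$) are the transition rates, and $q_{ii} = -\sum_{j \neq i} q_{ij}$ ensures that total probabilities sum to one.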
Real-time updating: As new measurements are acquired, the transition matrix is updated to reflect observed consumption patterns. For example, if a load becomes more active than predicted, the model adapts by increasing the transition rates toward higher-power states, thereby enhancing future prediction accuracy.
Continuous forecasting loop: The algorithm operates in a loop, continuously updating state probabilities and outputting a real-time forecast of the load power. This forecast feeds directly into the EMS, informing the power dispatch decisions of the PV system, battery, and PEM electrolyzer.
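The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the load powers, the uniform transition rate, and the step length are hypothetical values, and the one-step transition matrix uses the first-order approximation $P \approx I + Q\hbar$.

```python
from itertools import product

# Hypothetical rated powers (W) of n = 3 switchable loads; not values from the paper.
LOAD_POWERS = [20.0, 35.0, 50.0]

def enumerate_states(load_powers):
    """Return all 2^n on/off combinations and the total power of each one."""
    combos = list(product((0, 1), repeat=len(load_powers)))
    powers = [sum(on * p for on, p in zip(c, load_powers)) for c in combos]
    return combos, powers

def uniform_rate_matrix(n_states, rate):
    """Illustrative rate matrix: equal off-diagonal rates, each row summing to zero."""
    Q = [[rate] * n_states for _ in range(n_states)]
    for i in range(n_states):
        Q[i][i] = -rate * (n_states - 1)
    return Q

def step_matrix(Q, h):
    """First-order transition probabilities over a short step h: P = I + Q*h."""
    n = len(Q)
    return [[(1.0 if i == j else 0.0) + Q[i][j] * h for j in range(n)]
            for i in range(n)]

def forecast(p_now, Q, h, powers):
    """Propagate the state distribution one step and return the expected load."""
    P = step_matrix(Q, h)
    n = len(Q)
    p_next = [sum(p_now[i] * P[i][j] for i in range(n)) for j in range(n)]
    expected = sum(pj * w for pj, w in zip(p_next, powers))
    return p_next, expected

combos, powers = enumerate_states(LOAD_POWERS)
Q = uniform_rate_matrix(len(powers), rate=0.1)
p0 = [1.0] + [0.0] * (len(powers) - 1)      # start with all loads off
p1, expected_load = forecast(p0, Q, h=0.5, powers=powers)
```

The real-time updating step would replace `uniform_rate_matrix` with rates re-estimated from measurements; the forecast step itself stays unchanged.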
By leveraging this MDP-based stochastic modeling approach, the EMS gains the ability to make optimal decisions under uncertainty, ensuring that PV generation is dynamically matched to actual and forecasted load demands. The integration of this mechanism into the microgrid control strategy enables a predictive and robust energy management system. By outperforming traditional MPPT and rule-based energy scheduling, this strategy provides probabilistic foresight, unlocking new levels of operational efficiency and enabling the exploitation of excess PV energy for sustainable hydrogen production when conventional storage and consumption pathways are saturated.
3.2. PV Power Optimization Under Unpredictable Load Consumption
To maintain MPPT under uncertain and time-varying load demand, a robust control framework based on the $H_\infty$ technique is proposed as an alternative to model optimization control. This approach exploits real-time load consumption forecasts provided by an MDP, which captures the stochastic behavior of load consumption.
Unpredictable load demand fluctuations significantly affect the PV array’s output power. This relationship can be characterized through a nonlinear dependence on the load state and the converter duty cycle, where the ratio between PV and load currents modulates the efficiency of effective power transfer. More formally, the output power of the PV generator under stochastic load conditions is expressed as follows:
To design a control scheme capable of responding to such uncertainties, we adopt a robust stochastic optimization framework that directly incorporates the PV system’s nonlinear electrical characteristics. According to established PV modeling approaches [35, 44], the instantaneous power extracted from the PV panel can also be described as follows:
where the parameters are the photocurrent, the reverse saturation current, the numbers of PV cells connected in parallel and in series, and the inverse thermal voltage of the cell, which encapsulates the temperature dependence of the PV diode equation.
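For orientation, a commonly used single-diode array model built from the parameters listed above is $i_{pv} = n_p I_{ph} - n_p I_{rs}\left(e^{\alpha v_{pv}/n_s} - 1\right)$ with $P_{pv} = v_{pv}\, i_{pv}$. The sketch below evaluates this model and locates the maximum power point by a simple voltage scan. The symbols and all numerical values are assumptions chosen for illustration; they are not the authors' notation or the Siemens SP75 datasheet values.

```python
import math

# Hypothetical parameters (assumptions, not taken from the paper or a datasheet).
I_PH = 4.8      # photocurrent, A
I_RS = 1e-9     # reverse saturation current, A
N_P = 1         # cells in parallel
N_S = 36        # cells in series
ALPHA = 32.4    # inverse thermal voltage, 1/V

def pv_current(v):
    """Array current at terminal voltage v (single-diode model, no resistances)."""
    return N_P * I_PH - N_P * I_RS * (math.exp(ALPHA * v / N_S) - 1.0)

def pv_power(v):
    """Instantaneous PV power at terminal voltage v."""
    return v * pv_current(v)

def open_circuit_voltage():
    """Voltage at which the array current drops to zero."""
    return (N_S / ALPHA) * math.log(I_PH / I_RS + 1.0)

def find_mpp(steps=2000):
    """Scan the voltage range [0, V_oc] and return (v_mpp, p_mpp)."""
    v_oc = open_circuit_voltage()
    candidates = (v_oc * k / steps for k in range(steps + 1))
    v_best = max(candidates, key=pv_power)
    return v_best, pv_power(v_best)

v_mpp, p_mpp = find_mpp()
```

A grid scan is used here only because it is easy to verify; it stands in for, and does not reproduce, the robust tracking controller described in this section.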
These expressions are integrated into the $H_\infty$-based MPPT controller design, ensuring that the PV array operates near its maximum power point despite variations in load and environmental conditions. The combined power equations and stochastic control formulation enable the system to robustly track optimal power levels, maintaining efficient energy delivery in dynamic and uncertain scenarios.
The output power delivered by the PV system, as influenced by the stochastic load dynamics and converter operation, is characterized by the following expression:
The stochastic $H_\infty$ controller is designed to minimize the worst-case impact of unpredictable disturbances, such as random load fluctuations, on the ability of the PV system to operate at its maximum power point. By combining (1) and (9), the dynamics of the controlled system are modeled as a linear time-varying process influenced by the state of the MDP and are expressed as follows:
where the state vector is the augmented system state, and the control input is the duty cycle of the DC-DC converter. The desired optimal operating trajectory corresponds to the conditions under which the PV output reaches the MPP. To regulate tracking, the error signal is defined as follows:
with the goal of driving the tracking error to zero, i.e., asymptotic convergence to the MPP. The robust feedback control law is formulated as follows:
where the two feedback gains are state- and error-dependent gain matrices adapted to each MDP state. The $H_\infty$ control problem seeks to minimize the worst-case energy gain from the disturbance associated with the stochastic load demand to the regulated performance output, thereby ensuring robust stability and optimal tracking performance.
This is achieved by minimizing the following cost functional:
where the robustness parameter acts as a predefined robustness margin. The design guarantees that tracking error energy remains bounded and attenuated despite random demand deviations. Moreover, physical constraints are imposed on the converter’s duty cycle and the battery’s state-of-charge to ensure operational safety:
By embedding the $H_\infty$ controller within a stochastic energy management framework based on MDP forecasts, the PV microgrid gains the capability to robustly and adaptively track its optimal operating point. This strategy enhances energy capture, improves dynamic response, and maintains overall system stability, even under high variability and uncertainty. It is particularly effective in autonomous settings where environmental and load unpredictability are prominent.
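For orientation, a common textbook form of the stochastic $H_\infty$ performance criterion and of the box constraints described above is given below. The symbols are assumptions introduced for illustration and may differ from the authors' notation: $z$ is the regulated output, $w$ the load-induced disturbance, $\gamma$ the attenuation level, $d$ the duty cycle, and $SoC_{\min}$, $SoC_{\max}$ the battery limits.

```latex
J = \mathbb{E}\left\{ \int_{0}^{\infty}
      \left( z^{\top}(t)\, z(t) - \gamma^{2}\, w^{\top}(t)\, w(t) \right) dt \right\} < 0,
\qquad
0 \le d(t) \le 1,
\qquad
SoC_{\min} \le SoC(t) \le SoC_{\max}
```

Requiring $J < 0$ for every nonzero finite-energy disturbance is equivalent to bounding the expected output energy by $\gamma^{2}$ times the disturbance energy, which is the sense in which a smaller $\gamma$ corresponds to stronger disturbance attenuation.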
3.3. Stochastic Power Flow Management
Algorithm 1 implements a predictive and adaptive control strategy within the PV microgrid to ensure optimal energy utilization in real time. At each control interval, the algorithm begins by forecasting future load consumption using an MDP, which captures the stochastic behavior of user demand through probabilistic transitions between discrete load states. Concurrently, the PV subsystem is governed by a robust $H_\infty$-based MPPT controller that computes the optimal duty cycle required to track the maximum power point under environmental uncertainties.
This control loop minimizes the tracking error while enhancing system robustness against unpredictable load changes. The predicted load demand and generated PV power are then used to evaluate the power balance and determine the appropriate operational mode. If the PV generation exceeds the load and the battery is not fully charged, the surplus energy is directed to charge the battery.
However, when the SoC reaches its maximum, any excess energy is intelligently routed to a hydrogen production unit via an electrolyzer, thereby preventing energy curtailment and contributing to long-term energy storage. Conversely, if the PV power is insufficient to meet demand, the battery discharges according to its SoC level to supply the load partially or fully.
The real-time execution of the proposed MDP-based energy management algorithm proceeds as follows:
1. Stochastic load forecasting: At each time step t, the MDP predicts the next load level based on the current state and the transition probability matrix Q. This provides a probabilistic estimate of the upcoming load power without relying on historical datasets. The transition matrix is constructed offline and can be updated periodically using recent operational data.
2. Measurement and state update: Real-time data such as irradiance, temperature, the current system state, and battery SoC are measured. These serve as inputs for evaluating the PV model and updating the system state prediction for the next step.
3. Robust power optimization: The controller solves the $H_\infty$ optimization problem by minimizing the cost functional, yielding the control input. The robustness parameter is tuned to achieve optimal performance. The control sampling period is chosen to be sufficient for capturing solar and load dynamics without inducing excessive computational load.
4. Control execution: The control input is applied to adjust the PV operating point, ensuring maximum power extraction under uncertainties while driving the power tracking error toward zero.
5. Supervisory energy dispatch: Based on the predicted surplus or deficit between PV power and load power, and the current SoC level, the controller activates one of five operational modes: charging the battery (Mode 1), routing power to hydrogen production (Mode 2), discharging the battery to supply full or partial load (Modes 3 and 4), or relying on direct PV supply if the battery is depleted (Mode 5). This ensures safe SoC management and effective hydrogen utilization without violating system constraints.
Algorithm 1 Stochastic $H_\infty$-based energy management for PV microgrid with hydrogen production
Identify discrete load levels
Define MDP states
Construct transition probability matrix Q
Initialize load state, system state, and battery SoC
Set control time step ℏ
for each control interval do
    Load forecasting via MDP:
        Update transition probabilities
        Identify next load state and get predicted load power
    PV optimization using stochastic MPPT:
        Measure current irradiance and temperature
        Compute optimal reference
        Evaluate system dynamics via PV model
        Minimize cost functional to derive the control input
        Apply control to maximize PV power while driving the tracking error to zero
    Energy management decision:
        Update the power balance
        if PV power exceeds the predicted load then
            Compute the surplus power
            if SoC is below its upper limit then
                Mode 1: Charge the battery with the surplus
            else
                Mode 2: Route the surplus to the hydrogen production system
            end if
        else
            Compute the power deficit
            if the battery is not depleted then
                if the battery must supply the total load then
                    Mode 3: Discharge battery to supply total load
                else
                    Mode 4: Discharge battery to meet deficit
                end if
            else
                Mode 5: Battery off, PV supplies as much as possible
            end if
        end if
end for
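The supervisory decision at the end of Algorithm 1 can be sketched as a small pure function. This is an illustrative reading of the five modes rather than the authors' code: the SoC limits `soc_min` and `soc_max` are hypothetical defaults, and the test used to separate Mode 3 from Mode 4 (no PV power available) is an assumption.

```python
from enum import Enum

class Mode(Enum):
    CHARGE_BATTERY = 1
    PRODUCE_HYDROGEN = 2
    BATTERY_SUPPLIES_TOTAL_LOAD = 3
    BATTERY_COVERS_DEFICIT = 4
    PV_ONLY = 5

def dispatch(p_pv, p_load, soc, soc_min=0.2, soc_max=0.9):
    """Return (mode, power in W routed by that mode) for one control interval.

    soc_min and soc_max are hypothetical limits, not values from the paper.
    """
    if p_pv > p_load:
        surplus = p_pv - p_load
        if soc < soc_max:
            return Mode.CHARGE_BATTERY, surplus       # Mode 1
        return Mode.PRODUCE_HYDROGEN, surplus         # Mode 2
    deficit = p_load - p_pv
    if soc > soc_min:
        # Assumption: Mode 3 applies when the PV array delivers no power.
        if p_pv <= 0.0:
            return Mode.BATTERY_SUPPLIES_TOTAL_LOAD, p_load   # Mode 3
        return Mode.BATTERY_COVERS_DEFICIT, deficit           # Mode 4
    return Mode.PV_ONLY, p_pv                                 # Mode 5

# Example: full battery and 30 W of surplus PV power go to the electrolyzer.
mode, power = dispatch(p_pv=75.0, p_load=45.0, soc=0.9)
```

Keeping the decision free of side effects makes each mode transition easy to unit-test before it is connected to the switches that route power.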
4. Simulation Results and Discussion
This section presents simulation results that highlight the performance of the proposed stochastic control-based EMS tailored for standalone PV–PEM microgrids dedicated to green hydrogen production. The proposed strategy effectively manages uncertainties from fluctuating solar irradiance and stochastic load demand by dynamically coordinating energy flows between PV generation, battery storage, and the PEM electrolyzer.
Figure 5 illustrates the real-time implementation architecture of the EMS for the standalone DC PV microgrid. The proposed EMS integrates three key functional blocks: (i) a stochastic $H_\infty$ controller for robust MPPT operation, (ii) an MDP-based load consumption forecasting module, and (iii) a decision-making unit that orchestrates power flow among the system components. This architecture ensures that the PV system’s output is continuously optimized under uncertainty by dynamically adjusting the duty cycle of the DC–DC boost converter, allowing the PV generator to track its MPP despite unpredictable changes in irradiance and load conditions.
Unlike traditional fixed-rule EMS schemes, this architecture supports adaptive control and predictive decision-making. The EMS defines multiple operational modes governed by the SoC of the battery, the priority of the load, and the status of the hydrogen production. During high solar generation periods, if the battery reaches its SoC upper limit, the EMS triggers hydrogen production mode, redirecting excess PV power to the PEM electrolyzer via a dedicated switch. In contrast, during low irradiance periods, if the SoC remains above a defined threshold, the EMS enables battery discharge mode through the battery switch to maintain load supply. Three switches handle battery charging, DC load supply, and electrolyzer activation, respectively. These transitions are coordinated by a central logic controller, as shown in Figure 5, which dynamically adjusts power routing based on real-time conditions and predictive inputs.
This architecture underscores the novelty of the proposed EMS: it combines stochastic control, load forecasting, and intelligent mode switching in a unified framework to ensure reliable operation, high PV utilization, and optimized hydrogen production under varying environmental and load profiles.
The simulation is conducted on a representative off-grid PV–PEM system comprising a Siemens SP75 solar panel, a high-efficiency DC–DC boost converter, a lithium-ion battery for short-term storage, and a PEM electrolyzer for green hydrogen production. These components mirror real-world deployment scenarios and offer practical insight into EMS performance. Detailed specifications are provided in Table A1 in Appendix A.
An MDP represents the stochastic behavior of the load profiles within the microgrid. This approach allows for a realistic modeling of load consumption patterns, which are inherently uncertain and time-varying. The transitions between different discrete load states are governed by a transition rate matrix Q, which encodes the probabilities of switching from one consumption level to another over time. This matrix is constructed based on simulated load scenarios that mimic typical user demand patterns, generated through a combination of consumption statistics and random sampling to capture variability. Statistical analysis of these scenarios is then used to estimate transition probabilities, ensuring that Q reflects the temporal dynamics of real-world load fluctuations. By integrating this probabilistic framework, the energy management strategy can anticipate likely future load scenarios and proactively adjust control actions. The following rate matrix exemplifies the modeled transitions:
Through this formulation, the PV–PEM microgrid dynamically adapts its energy flow, not only to meet immediate consumption needs but also to prioritize long-term sustainability by converting excess renewable energy into hydrogen.
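One simple way to obtain a rate matrix from simulated load scenarios is to count jumps between states and divide by the time spent in each origin state. The sketch below follows that idea; it is an assumed estimation procedure shown for illustration, since the text does not specify which statistical method was used, and the example trajectory is made up.

```python
def estimate_rate_matrix(states, h, n_states):
    """Estimate a rate matrix from a state sequence sampled every h seconds.

    q_ij (i != j) = number of observed i -> j jumps / total time spent in i,
    and q_ii is set so that each visited row sums to zero.
    """
    jumps = [[0] * n_states for _ in range(n_states)]
    time_in = [0.0] * n_states
    for s, s_next in zip(states, states[1:]):
        time_in[s] += h
        if s_next != s:
            jumps[s][s_next] += 1
    Q = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states):
        if time_in[i] == 0.0:
            continue  # state never visited: leave its row at zero
        for j in range(n_states):
            if j != i:
                Q[i][j] = jumps[i][j] / time_in[i]
        Q[i][i] = -sum(Q[i])  # diagonal is still 0.0 here, so this is the off-diagonal sum
    return Q

# Made-up two-state trajectory sampled once per second.
Q_hat = estimate_rate_matrix([0, 0, 1, 1, 1, 0], h=1.0, n_states=2)
```

Re-running the estimator on a sliding window of recent measurements is one way the matrix could be refreshed online.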
4.1. Performance Evaluation and Interpretation of Key Findings
The simulation study was carried out under two distinct scenarios to evaluate the performance of the proposed EMS strategy. The first scenario considered synthetically generated weather profiles under standard test conditions. The second scenario employed real-time meteorological data obtained from the weather monitoring station at the School of Electrical, Mechanical, and Computer Engineering (EMC), the Federal University of Goiás (UFG), located in Goiânia, Brazil. This dataset, which includes measurements of solar irradiance and ambient temperature, is publicly accessible via https://sites.google.com/site/sfvemcufg/weather-station (accessed on 1 May 2025).
4.1.1. Scenario 1: Synthetic Weather Profiles
In the first scenario, simulations were carried out using synthesized weather data to replicate dynamic environmental conditions. This setup enables a rigorous evaluation of the proposed EMS under standard test conditions (an irradiance of 1000 W/m² and a cell temperature of 25 °C), which provide a fixed reference against which the effect of stochastic load demand in off-grid PV systems can be isolated.
These simulations are crucial for validating the system’s performance when subjected to unpredictable load consumption, offering insights into its ability to track the maximum power point, manage load demands, and regulate the battery SoC efficiently. Furthermore, the controller’s capability to handle energy surpluses through hydrogen production using the PEM electrolyzer was also evaluated, ensuring that excess PV power is utilized productively when the battery reaches its maximum capacity.
In our simulations, the PV system was configured to supply energy to a set of DC loads with time-varying power demand. To account for the stochastic nature of load behavior, we defined eight distinct consumption scenarios, as presented in Table 1, each corresponding to a specific level of load demand. These scenarios were synthetically generated to reflect a wide range of realistic load conditions and abrupt consumption variations commonly observed in standalone PV systems. They were modeled using an MDP, represented by a discrete state variable, allowing us to capture the probabilistic switching between load profiles based on a predefined transition rate matrix Q.
To manage the uncertainties associated with time-varying load behavior, an MDP-based energy management strategy was implemented to estimate, in real-time, the global consumption state. This strategy relies on the transition matrix Q, which defines the probabilistic evolution between eight predefined load scenarios, as presented in Figure 6.
The MDP framework enables the control system to dynamically adjust energy distribution according to the anticipated load level by continuously evaluating the current operating state and forecasting likely transitions. This adaptability ensures that energy flows are optimally balanced between the PV source, the storage system, the DC loads, and the hydrogen production.
As depicted in Figure 7, the proposed MDP-based mechanism effectively tracks abrupt and unpredictable variations in consumption, demonstrating a high level of responsiveness and robustness. Its predictive capability enhances decision-making under uncertainty, allowing the system to proactively compensate for demand fluctuations while maintaining operational efficiency.
Figure 8a depicts the temporal evolution of the PV power output under the supervision of the proposed control strategy, previously outlined in Figure 5. The results clearly show that the controlled system effectively follows the reference power profile, even in the presence of sudden shifts in load demand. This capacity to promptly track dynamic setpoints demonstrates the responsiveness and precision of the control algorithm during both start-up and steady-state conditions. These results are achieved by minimizing the robust control cost defined in Equation (13), where the robustness parameter was tuned to its optimal value. This setting ensures reliable tracking of the reference power under stochastic load variations, balancing robustness and performance.
Furthermore, the proposed control scheme demonstrates remarkable resilience to stochastic variation induced by MDP-based load forecasting, ensuring stable and reliable power delivery under dynamic and unpredictable conditions. Compared to the conventional Perturb and Observe (P&O) technique, the proposed strategy achieves a significantly lower average tracking error of 0.3125, versus 9.8836 for P&O. It also maintains a higher average energy conversion efficiency than the P&O method. As illustrated in Figure 8, the proposed controller responds much faster to abrupt load changes, quickly converging to the new optimal operating point, while the P&O technique exhibits slower adaptation and larger oscillations. This rapid and accurate convergence, coupled with stable voltage and current behavior, confirms the effectiveness of the proposed method for real-time energy management in standalone PV systems. Overall, these results emphasize the superior performance, robustness, and adaptability of the proposed control strategy in managing uncertain and time-varying energy flows without compromising system stability.
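The two comparison metrics can be computed from sampled power traces as sketched below. The definitions used here (mean absolute deviation from the reference power, and delivered energy as a percentage of reference energy) are assumptions, since the text does not state the exact formulas, and the sample values are made up.

```python
def mean_tracking_error(p_ref, p_pv):
    """Average absolute deviation between reference and measured PV power."""
    return sum(abs(r - p) for r, p in zip(p_ref, p_pv)) / len(p_ref)

def conversion_efficiency(p_ref, p_pv):
    """Delivered energy as a percentage of the reference (available) energy.

    With equally spaced samples the time step cancels, so plain sums suffice.
    """
    return 100.0 * sum(p_pv) / sum(p_ref)

# Made-up traces (W), used only to exercise the two functions.
p_ref = [75.0, 75.0, 60.0, 60.0]
p_pv = [70.0, 74.0, 60.0, 58.0]
error = mean_tracking_error(p_ref, p_pv)
efficiency = conversion_efficiency(p_ref, p_pv)
```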
The temporal profile of the battery’s SoC is illustrated in Figure 9, highlighting the effectiveness of the proposed stochastic EMS in real-time operation. Throughout the simulation, the SoC remains consistently within the predefined safety margins, between its lower and upper limits, despite the presence of unpredictable variations in load demand. This indicates that the control system successfully anticipates fluctuations and allocates available energy accordingly. When the battery approaches its upper charge threshold, the control algorithm diverts excess photovoltaic power, rather than curtailing it, toward the PEM electrolyzer for green hydrogen production. This coordinated mechanism ensures optimal utilization of solar resources, prevents battery overcharging, and promotes sustainable energy storage through hydrogen generation.
Figure 10 and Figure 11 illustrate the cumulative mass of hydrogen produced and the corresponding water consumption over the simulation horizon. These results highlight the effectiveness of the proposed energy management strategy in harnessing surplus PV energy for sustainable hydrogen generation once the battery reaches its maximum SoC.
As shown, hydrogen production increases progressively during periods of high solar availability and low load demand, indicating that excess energy is efficiently diverted to the PEM electrolyzer rather than being curtailed. The associated water consumption profile in Figure 11 mirrors the hydrogen production trend, reflecting the relationship between water electrolysis and hydrogen output.
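The link between the two curves follows from Faraday's law of electrolysis: each mole of hydrogen requires two moles of electrons and consumes one mole of water. The sketch below integrates a stack current profile into cumulative hydrogen and water masses. The current profile, the cell count, and the constant Faraday efficiency are hypothetical inputs, not values from the simulations.

```python
FARADAY = 96485.33212   # Faraday constant, C/mol
M_H2 = 2.016e-3         # molar mass of hydrogen, kg/mol
M_H2O = 18.015e-3       # molar mass of water, kg/mol

def hydrogen_molar_rate(current_a, n_cells=1, faraday_eff=1.0):
    """Hydrogen production rate in mol/s: eta_F * N_cells * I / (2 F)."""
    return faraday_eff * n_cells * current_a / (2.0 * FARADAY)

def integrate_production(currents, dt, n_cells=1, faraday_eff=1.0):
    """Return (hydrogen mass, water mass) in kg for a sampled current profile."""
    moles = sum(hydrogen_molar_rate(i, n_cells, faraday_eff) * dt for i in currents)
    return moles * M_H2, moles * M_H2O

# Hypothetical profile: 10 A through a single cell for one hour, sampled at 1 s.
m_h2, m_h2o = integrate_production([10.0] * 3600, dt=1.0)
```

Under these assumptions the water-to-hydrogen mass ratio is fixed at `M_H2O / M_H2` (about 8.9), which is why the water consumption curve is a scaled copy of the hydrogen curve.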
Overall, the results validate the controller’s capability to intelligently coordinate energy flows between the battery, the DC loads, and the PEM electrolyzer in accordance with the operational modes defined in Figure 12. By dynamically adjusting power allocation based on the battery’s SoC, the real-time load demand, and the availability of PV generation, the proposed strategy ensures efficient energy utilization under varying conditions. This adaptive management not only prevents battery overcharge or deep discharge but also enables the productive use of surplus solar energy for green hydrogen production.
4.1.2. Scenario 2: Real-Time Weather Data
In this scenario, the performance of the proposed multi-objective stochastic control strategy is evaluated under real-time solar irradiance and temperature conditions, as depicted in Figure 13. These realistic, time-varying environmental profiles make it possible to evaluate the controller’s adaptability and robustness in managing energy flows within the microgrid, under operating temperatures ranging from 19 °C to 55 °C.
Figure 14 illustrates the temporal evolution of the MDP. Based on the defined state space in Table 2, the MDP effectively tracks and forecasts variations in load consumption under realistic weather conditions. The system achieved precise real-time estimation of the total load demand by implementing the MDP-based stochastic forecasting strategy, as shown in Figure 15. These predictive insights allowed the energy management algorithm to take anticipatory and well-informed actions in distributing the available energy resources.
Figure 16 illustrates the optimized PV power output achieved under real-time weather conditions. The results demonstrate the ability of the proposed stochastic control strategy to continuously adapt PV generation in response to fluctuating solar irradiance and sudden shifts in load demand. By efficiently tracking the available solar resource and reallocating power accordingly, the controller ensures optimal energy utilization. This adaptability highlights the robustness and responsiveness of the control approach in dynamic operational environments.
The battery’s SoC evolution under real-time weather conditions is depicted in Figure 17, offering insights into its operational performance within the proposed microgrid architecture. Throughout the simulation, the battery exhibited consistent behavior, charging during periods of high solar availability and discharging when the PV output was insufficient to meet load demand. The observed SoC peaks align with midday periods characterized by strong irradiance, while decreases correspond to cloudy intervals or load surges. Notably, once the SoC approached its upper threshold, surplus energy was redirected to the electrolyzer for hydrogen production, thereby preventing overcharging. The battery operated reliably within the predefined bounds, ensuring smooth transitions and avoiding excessive cycling. This controlled SoC profile confirms the robustness of the proposed stochastic control framework in maintaining energy balance and safeguarding system stability under varying real-world conditions.
The results presented in Figure 18 and Figure 19 provide a comprehensive overview of the hydrogen production performance and corresponding water consumption under real-time weather conditions. The mass of hydrogen generated follows a variable trend that reflects both the availability of excess PV energy and the battery’s SoC. When the battery reaches its upper SoC threshold, surplus solar energy is intelligently redirected to power the electrolyzer, resulting in efficient green hydrogen production.
This process is inherently linked to the fluctuating solar irradiance levels observed throughout the day, as well as the stochastic load demand predicted by the MDP-based strategy. Simultaneously, the evolution of the water consumption curve shows a direct proportional relationship to hydrogen generation, confirming the expected electrolysis behavior. According to the operating modes illustrated in Figure 20, these results validate the capacity of the energy management system to optimize renewable energy utilization and support sustainable hydrogen production, even under dynamically changing weather and load conditions.
4.2. Benchmarking Against Existing Control Methods
This section provides a comparative analysis with existing studies to better contextualize the proposed MDP-based energy management strategy for green hydrogen production. Table 3 highlights key distinctions in power optimization techniques, load forecasting approaches, and data requirements. Many conventional methods employ deterministic optimization, such as adaptive control or model predictive control, often based on simplified or repetitive load profiles. In contrast, some recent approaches adopt stochastic optimization techniques, including Monte Carlo simulations (MCS) and deep reinforcement learning, which, although powerful, typically demand significant computational resources and large volumes of historical data for training and scenario generation.
In contrast, the proposed method leverages a finite-state MDP to model and manage random load fluctuations in real time. This eliminates the need for large-scale data collection or computationally heavy prediction engines. The key contributions of our approach are summarized as follows:
Stochastic load adaptation: By using a Markov model with discrete load states and transition probabilities, the control system can respond effectively to abrupt and unpredictable changes in load without relying on historical consumption data.
Efficient energy allocation: The MDP policy optimizes the distribution of PV power between battery charging and hydrogen production, ensuring safe battery operation and improved hydrogen yield under dynamic conditions.
Low computational overhead: The proposed strategy provides a practical alternative to high-complexity algorithms, making it well-suited for real-time applications in remote areas where computational resources may be limited.
5. Conclusions
This study aimed to develop a robust stochastic energy management strategy for standalone PV systems dedicated to green hydrogen production, focusing on the intelligent exploitation of excess solar energy under unpredictable load conditions. Recognizing the limitations of traditional deterministic control methods, the proposed approach integrates $H_\infty$ optimization control with an MDP to forecast short-term load consumption and guide optimal power dispatch between DC loads, battery storage, and a PEM electrolyzer.
The simulation results validated the effectiveness of the proposed strategy under realistic and dynamic operating scenarios. The MDP-based controller successfully adapted to fluctuations in both solar irradiance and load behavior, enabling more efficient use of surplus energy for hydrogen production. Compared to baseline strategies without stochastic modeling, the proposed method achieved a higher power optimization efficiency, which consequently enhanced the continuity of hydrogen generation. Additionally, intelligent scheduling of battery operations contributed to a potential extension of battery lifespan by reducing cycling stress, as reflected in decreased depth-of-discharge fluctuations.
One of the key findings of this work is the demonstration that incorporating stochastic load forecasting enables proactive and adaptive energy management, even in highly variable off-grid environments. This makes the proposed framework particularly relevant for autonomous energy systems in remote or infrastructure-limited regions—defined as locations where grid connectivity is unavailable or unreliable, and where access to maintenance resources, fuel supply chains, or technical support is severely constrained—making consistent hydrogen production and system resilience critical.
Despite these promising results, the current model assumes idealized sensor measurements without explicit consideration of measurement noise or hardware limitations. Furthermore, the Faraday efficiency and electrolyzer performance parameters were treated as constants, which may limit accuracy under varying operating conditions. These assumptions represent limitations that will be addressed in future work.
Future research will explore the integration of additional renewable energy sources, such as wind or biomass, to further enhance the adaptability and autonomy of the standalone microgrid system, enabling more robust and resilient operation under diverse environmental conditions. To bridge the gap between simulation and practical deployment, we also plan to conduct real-time hardware-in-the-loop (HIL) validation to assess the feasibility and effectiveness of the proposed stochastic energy management strategy on actual control hardware.
Implementing the MDP-based controller in real hardware environments presents several challenges, including computational requirements for real-time load forecasting and decision-making, as well as the necessity for continuous, accurate monitoring of load demand and system states. Addressing these challenges will involve optimizing the algorithm for embedded platforms with limited processing power and memory, and developing efficient data acquisition systems capable of reliable, low-latency measurement of system variables.