2. Materials and Methods
2.1. Introduction and System Overview
This system is a hybrid renewable energy system (HRES) management solution that combines data-driven forecasting and model-based optimization to meet the electricity load at the lowest cost and CO2 emissions. Time, seasonality, and lag features are automatically prepared from historical data, and H2O AutoML models use them to predict electricity demand and solar and wind generation. These forecasts, together with actual biomass generation and CO2 intensity, are used in an MPC (Model Predictive Control) scheme, where a MILP optimization problem with a moving horizon is solved at each time step. The optimization selects battery charging and discharging, hydrogen production by the electrolyzer, electricity production by fuel cells, energy curtailment, and grid import in a way that ensures energy balance, complies with technical constraints, and minimizes import costs and the associated CO2 footprint. Additionally, mild penalty factors are applied for battery cycling and energy waste, and the final state (battery SOC and stored H2) is included as a terminal reward to promote long-term system sustainability. The import ‘cost’ in this work is modeled as a weighted penalty (proxy) rather than a real dynamic market price. The results combine actual data, forecasts and optimal management decisions and are presented in a structured report for analysis and decision evaluation.
Figure 1 shows the AutoML–MPC architecture of a hybrid energy system. The process begins with automatic processing of historical data (load, generation, CO2 intensity): cleaning, time axis alignment, and feature engineering (forming cyclicity, lag, and moving averages). In the forecasting stage, regression models are trained to predict load and renewable generation using the H2O AutoML environment, with the best model selected based on the RMSE metric. To achieve physical validity, point forecasts are restricted to non-negative values and transmitted to the control layer together with biomass and CO2 data. A moving horizon MPC is used for control, in which a mixed-integer linear programming (MILP) problem is solved at each step. The optimization goal is to minimize CO2 emissions and grid imports while adhering to technological constraints (battery, electrolyzer, fuel cell operation). The objective function includes penalties for curtailment and equipment wear, and terminal rewards (SOC and H2 states) reduce control myopia. After the first optimal action is implemented, the system states are updated and the cycle is repeated. In the final stage, performance indicators (KPIs) are calculated for analysis of the results and evaluation of the method.
To ensure full transparency, clarity, and reproducibility of the proposed control framework, the key Model Predictive Control (MPC) parameters are explicitly defined in this work. The discretization time step is set to 1 h. This choice is consistent with the temporal resolution of the processed dataset and ensures direct compatibility between power (MW) and energy (MWh) quantities used in the MILP formulation. The prediction horizon is defined as 24 discrete time steps, corresponding to a 24 h ahead optimization window. This horizon length was selected to fully capture the diurnal cycle of renewable energy generation and electricity demand, allowing the controller to anticipate daily variations in solar irradiance, wind generation, and load profiles, and to optimally coordinate the use of short-term and long-term storage technologies. The control horizon is set to 1 time step. This corresponds to a standard receding horizon MPC implementation, in which the optimization problem is solved over the entire prediction horizon at each time step, but only the first control action is applied to the system. After implementation, the system states are updated, new forecast information becomes available, and the optimization is repeated. The total simulation period presented in the results spans approximately 1440 min (24 h), which corresponds to 24 discrete time steps under the chosen discretization. Within this simulation window, the receding horizon is parameterized in a fully consistent manner: at each time step, the MPC solves an optimization problem over a fixed 24-step (24 h) prediction horizon starting from the current time instant. After applying the first control action, the horizon is shifted forward by one time step, and the optimization is repeated for the updated system state. This implementation ensures that the controller continuously operates with a full 24 h foresight, while dynamically adapting decisions based on updated system conditions and forecasts. 
As a result, the MPC framework is able to effectively capture both short-term dynamics, handled by the battery system, and long-term energy balancing, coordinated through the hydrogen storage and conversion subsystem, while maintaining computational tractability and physical consistency.
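The receding-horizon parameterization described above (1 h step, 24-step prediction horizon, only the first action applied) can be sketched as the loop below. This is a minimal illustration under stated assumptions, not the implementation used in this work: `plan_horizon` is a hypothetical stand-in for the MILP solve, the state is reduced to a single battery SOC value, and losses are ignored.

```python
from typing import Callable, List, Sequence

N_HORIZON = 24  # prediction horizon in 1 h steps (from the text)


def plan_horizon(soc: float, net_forecast: Sequence[float]) -> List[float]:
    """Hypothetical stand-in for the MILP solve (assumption, not the real solver).

    Returns one battery action per horizon step in MWh
    (+ charge on forecast surplus, - discharge on forecast deficit).
    """
    return list(net_forecast)


def run_receding_horizon(
    net_forecast: Sequence[float],
    soc0: float,
    soc_max: float,
    n_steps: int,
    planner: Callable[[float, Sequence[float]], List[float]] = plan_horizon,
) -> List[float]:
    """Plan over a 24-step window, apply the first action, shift by one step."""
    soc = soc0
    trajectory = [soc]
    for k in range(n_steps):
        window = net_forecast[k:k + N_HORIZON]  # look-ahead from the current step
        plan = planner(soc, window)
        action = plan[0]                        # control horizon of one step
        soc = min(max(soc + action, 0.0), soc_max)  # state update within bounds
        trajectory.append(soc)
    return trajectory
```

For a 24-step simulation with a full 24 h look-ahead at every step, the forecast series passed in needs at least 47 entries, because the window starting at the last simulated step still extends 23 steps further.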
2.2. Data Preparation and Pre-Processing
To ensure full transparency, reproducibility, and methodological rigor, a detailed description of the dataset, its origin, preprocessing, and validation strategy is provided in this section. The time series data used in this study were constructed from multiple publicly available sources. Electricity load data were obtained from the ENTSO-E Transparency Platform, specifically the “Actual Total Load” time series for the Lithuanian bidding zone (LT). Renewable generation profiles were derived from a combination of sources: solar generation was reconstructed using normalized irradiance data from the Global Solar Atlas, while wind generation profiles were based on capacity factor data from the Global Wind Atlas. Installed capacity assumptions for scaling these normalized profiles were derived from IRENA Renewable Energy Statistics for Lithuania. CO2 intensity data were based on publicly available European grid CO2 intensity datasets and aligned with the same temporal resolution. Biomass generation was modeled as a constant baseload component based on typical CHP operation profiles in the Baltic region. It is important to emphasize that the final dataset represents a physically grounded hybrid energy system, where real-world data (load, meteorological profiles, and CO2 intensity) are combined with normalized generation models and capacity assumptions to construct a consistent multi-source dataset. This approach ensures both realism and full controllability of the system configuration. The dataset consists of synchronized hourly time series for the following variables: electrical load (MWh), solar generation (MWh), wind generation (MWh), biomass generation (MWh), and CO2 intensity (kg/MWh). The temporal resolution is 1 h, and the dataset covers a continuous period of one full calendar year (365 days, 8760 time steps), ensuring representation of daily, weekly, and seasonal patterns. After feature engineering and lag construction, the final dataset contains 8752 usable samples. 
Data integration was performed through a deterministic preprocessing pipeline. All time series were first mapped to a unified hourly timestamp index corresponding to the Lithuanian time zone (UTC+2, or UTC+3 during daylight saving time). The ENTSO-E load data served as the reference timeline. Solar and wind generation profiles were interpolated and aligned to this timeline and scaled according to installed capacity assumptions. CO2 intensity data were resampled to hourly resolution and synchronized with load. Missing values were handled using forward filling for short gaps (<3 consecutive hours), while longer gaps were excluded from the dataset. After alignment and cleaning, all variables were merged into a single multivariate time series. Feature engineering included: (i) time-based features (hour of day, sine and cosine transformations to represent cyclicity), (ii) multi-step lag features (1–4 h for load, solar, and wind; 1–2 h for biomass), and (iii) rolling mean features (2 and 4 h windows). These features allow the models to capture both short-term temporal dependencies and periodic patterns. Rows affected by lag-induced missing values were removed, resulting in the final dataset size. The forecasting task is formulated as a one-step-ahead prediction problem within a receding horizon control framework. At each time step, the model generates a single-step forecast, which is recursively used as input to the MPC optimization over a 24 h prediction horizon. This structure ensures consistency between forecasting and control layers. To ensure robust and unbiased model evaluation, a strict time-based validation strategy was applied. The dataset was split chronologically into training (first 80%, 7008 samples) and validation (remaining 20%, 1752 samples) subsets. In addition, a rolling-origin (walk-forward) validation procedure was implemented using multiple evaluation windows. 
Specifically, the dataset was divided into 6 sequential folds, each representing approximately two-month periods, and models were evaluated across all folds to assess temporal stability. Furthermore, scenario-based validation was performed to assess robustness under different operating conditions. Three representative seasonal scenarios were explicitly analyzed: winter (high load, low solar), summer (high solar, moderate load), and transitional periods (spring/autumn with high variability). In each scenario, model performance metrics () were evaluated independently. The results confirm that forecasting accuracy remains consistently high across all scenarios, with only minor degradation during highly volatile wind conditions. Potential overfitting was addressed through multiple mechanisms. The AutoML framework incorporates internal cross-validation, regularization, and ensemble learning, ensuring that model complexity is balanced against generalization performance. Additionally, training and validation errors were explicitly compared, showing no significant divergence. The stability of performance across rolling windows and seasonal scenarios further confirms that the models generalize well and do not overfit to specific temporal segments. The dataset exhibits realistic and physically consistent temporal patterns. Solar generation follows a strong diurnal and seasonal cycle, with peak production during summer and near-zero output during winter nights. Wind generation shows stochastic variability with seasonal trends, particularly higher activity during colder months. Electricity load demonstrates both daily cycles and seasonal dependence, with increased demand during winter due to heating. These characteristics ensure that the forecasting models are trained on representative and sufficiently complex time series. 
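The chronological split and the rolling-origin folds described above can be illustrated with plain index arithmetic. The sketch below is a hedged example, not the evaluation code of this study: the function names are hypothetical, folds are assumed to be equal-length blocks, and the training window is assumed to expand over all earlier blocks, so the first block serves only as initial history.

```python
from typing import List, Tuple


def chronological_split(n_samples: int, train_pct: int = 80) -> Tuple[range, range]:
    """Split indices by time order; no shuffling, so the future never leaks."""
    n_train = n_samples * train_pct // 100  # integer arithmetic avoids rounding
    return range(0, n_train), range(n_train, n_samples)


def rolling_origin_folds(n_samples: int, n_blocks: int = 6) -> List[Tuple[range, range]]:
    """Walk-forward evaluation over sequential blocks.

    Assumption: block i is evaluated with a model trained on all
    samples that precede it (expanding window).
    """
    edges = [i * n_samples // n_blocks for i in range(n_blocks + 1)]
    return [
        (range(0, edges[i]), range(edges[i], edges[i + 1]))
        for i in range(1, n_blocks)
    ]
```

With 8760 hourly samples, `chronological_split` yields 7008 training and 1752 validation indices, and each of the six blocks spans 1460 h, roughly two months.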
To further assess robustness, additional perturbation-based experiments were conducted by introducing controlled deviations (±5%, ±10%, ±20%) in input variables. The forecasting models maintained stable performance under moderate deviations, confirming their suitability for integration into MPC-based control. Overall, the dataset construction, integration procedure, and multi-level validation strategy ensure that the forecasting models are trained and evaluated on a realistic, well-defined, and temporally diverse dataset. This provides a reliable and fully reproducible foundation for the proposed AutoML–MPC framework. The data preparation and pre-processing stage in the system is implemented as a deterministic, automated process that ensures a consistent and model-appropriate input data set. The initial data are read from a historical data file, and the required columns (time, solar, wind, biomass generation, electrical load and CO2 intensity) are identified automatically using textual normalization and partial name matching. The time axis is converted to a numeric format in minutes, the data are sorted chronologically, and the time step is calculated as the most frequently recurring interval between measurements and converted to hours ($\Delta t$) to be consistent with energy units (MW/MWh). Additionally, cyclical time features (sine and cosine transformations of the time of day) are generated, allowing the models to capture diurnal seasonality. Next, a feature engineering stage is performed, in which multi-step lag and rolling mean features are created for load and generation, ensuring the representation of short-term dynamics and inertia. Rows with unavoidable NaNs are removed after lag generation, resulting in a single, time-aligned dataset that can be directly used for both H2O AutoML predictions and subsequent MPC/MILP optimization.
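The time-step detection and feature construction described in this section can be sketched in a few lines of standard-library Python. This is an illustrative sketch under assumptions: the function and feature names are hypothetical, only the load series is shown (generation series would be handled the same way), and rolling means are assumed to use past values only.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence


def infer_step_hours(t_min: Sequence[float]) -> float:
    """Most frequent interval between sorted timestamps (minutes), in hours."""
    ts = sorted(t_min)
    diffs = [b - a for a, b in zip(ts, ts[1:])]
    step_min, _count = Counter(diffs).most_common(1)[0]
    return step_min / 60.0


def build_features(
    t_min: Sequence[float],
    load: Sequence[float],
    lags: Sequence[int] = (1, 2, 3, 4),
    windows: Sequence[int] = (2, 4),
) -> List[Dict[str, float]]:
    """Cyclical, lag and rolling-mean features; rows lacking history are dropped."""
    start = max(max(lags), max(windows))
    rows = []
    for i in range(start, len(load)):
        # Diurnal cycle encoded on the unit circle (1440 min per day).
        angle = 2.0 * math.pi * (t_min[i] % 1440) / 1440.0
        row = {"t_min": t_min[i], "sin_t": math.sin(angle), "cos_t": math.cos(angle)}
        for k in lags:
            row[f"load_lag{k}"] = load[i - k]
        for w in windows:
            # Assumption: mean of the w values preceding step i.
            row[f"load_roll{w}"] = sum(load[i - w:i]) / w
        row["target"] = load[i]
        rows.append(row)
    return rows
```

Starting the loop at the largest lag or window length plays the role of removing rows with unavoidable NaNs after lag generation.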
2.3. Model Architecture and Training Details
The forecasting module is implemented using the H2O AutoML library, where a separate independent regression model is trained for each forecast variable (load, solar and wind generation). The input to the models consists of a fixed feature vector, including: (i) absolute time in minutes, (ii) a representation of the diurnal cycle through sine and cosine transformations, (iii) multi-step lag features (1–4-step lags for load, solar and wind generation; 1–2-step lags for biomass generation), and (iv) moving averages describing short-term dynamics (2- and 4-step windows). CO2 intensity is included as an additional exogenous feature, allowing the models to indirectly capture external system states. The training process is performed using a time-based data partitioning, where the chronologically earliest 80% of the observations are used for training and the remaining 20% for validation, thus eliminating information leakage from the future. During the AutoML search, up to 20 candidate models are trained for each target variable, including gradient boosting trees (GBM), random forests (DRF), generalized linear models (GLM), deep neural networks (Deep Learning) and their ensembles (Stacked Ensembles). Model selection is performed according to the RMSE metric on the validation set, and the “leader” model with the lowest error is selected for final use. The trained models are applied to the entire feature set, generating deterministic one-step predictions for each time point, and the predicted values are additionally clipped to non-negative values in order to maintain the physical interpretability of the system. The resulting set of predictions is strictly time-aligned and directly used as input to the MPC/MILP optimization layer, without additional interpolations or data transformations.
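The selection rule (lowest validation RMSE) and the non-negativity clipping can be illustrated independently of the AutoML library. The sketch below does not call the H2O API; the candidate names and prediction values are made-up placeholders, and the helper names are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error over paired observations."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


def select_leader(
    y_val: Sequence[float], candidates: Dict[str, Sequence[float]]
) -> Tuple[str, float]:
    """Return the candidate with the lowest validation RMSE and its score."""
    scores = {name: rmse(y_val, pred) for name, pred in candidates.items()}
    leader = min(scores, key=scores.get)
    return leader, scores[leader]


def clip_non_negative(pred: Sequence[float]) -> List[float]:
    """Generation and load cannot be negative, so negative forecasts become zero."""
    return [max(0.0, p) for p in pred]


# Placeholder validation targets and candidate predictions (illustrative only).
y_val = [0.0, 1.0, 2.0]
candidates = {"gbm": [0.1, 0.9, 2.1], "glm": [-0.5, 1.5, 2.5]}
leader, score = select_leader(y_val, candidates)
```

In this toy case the candidate labeled `gbm` is selected because its validation error is lower, and clipping the `glm` predictions would replace the negative first value with zero.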
2.4. MILP Formulation for a Hybrid Energy System
The MILP formulation is constructed in an energetically and mathematically consistent manner, based on discretized energy and mass balances at each control time step. All power limits (MW) in the optimization problem are converted to energy quantities (MWh) according to the discretization step $\Delta t$; the maximum flows of the battery, electrolyzer and fuel cells are therefore defined as $E^{\max} = P^{\max}\,\Delta t$, ensuring the compatibility of units throughout the formulation. The electricity balance of the system is described by the equation $\hat{E}^{\mathrm{pv}}_t + \hat{E}^{\mathrm{w}}_t + E^{\mathrm{bio}}_t + E^{\mathrm{imp}}_t + \eta_{\mathrm{dis}} E^{\mathrm{dis}}_t + E^{\mathrm{fc}}_t = \hat{L}_t + E^{\mathrm{ch}}_t + E^{\mathrm{el}}_t + E^{\mathrm{curt}}_t$, in which the predicted generation of renewable resources and biomass, together with the control decisions (grid import, battery discharge, fuel cell output), exactly covers the electricity demand, internal consumption (battery charging, electrolyzer) and unavoidable surplus (curtailment). The state of charge (SOC) dynamics of the battery is modeled separately from the energy supplied to the grid, deliberately separating the “internal” battery energy from the energy effectively delivered: the charging flow increases the SOC with the charging efficiency $\eta_{\mathrm{ch}}$, and the discharging flow decreases the SOC directly according to the energy taken from the battery, while the discharging losses are accounted for only in the electricity balance via the term $\eta_{\mathrm{dis}} E^{\mathrm{dis}}_t$. Such a structure eliminates double counting of efficiencies and allows for a clear interpretation of the variables in physical terms. The hydrogen subsystem is described by a mass balance, in which the change in the H2 stock is calculated as $H_t - H_{t-1} = \kappa_{\mathrm{el}} E^{\mathrm{el}}_t - \kappa_{\mathrm{fc}} E^{\mathrm{fc}}_t$: the electricity consumed by the electrolyzer is converted into the mass of hydrogen produced according to a constant conversion factor $\kappa_{\mathrm{el}}$, and the electricity generated by the fuel cells reduces the H2 stock in proportion to their specific hydrogen consumption $\kappa_{\mathrm{fc}}$. The power constraints of the electrolyzer and fuel cells are applied via the energy flows, thus ensuring compatibility with the time discretization. 
To avoid physically meaningless or numerically unstable solutions, binary logical variables are included in the optimization problem; they strictly prohibit simultaneous charging and discharging of the battery, as well as simultaneous hydrogen production and its conversion back to electricity. In this way, the MILP solution corresponds to real technological operating states while maintaining the interpretability and stability of the solution over the entire MPC horizon.
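The balance logic described above, with charging efficiency applied in the SOC update and discharging efficiency applied only in the electricity balance, can be checked numerically for a single time step. The sketch below is illustrative and rests on assumptions: the efficiencies and the fuel cell coefficient are placeholder values, the electrolyzer factor of 20 kg/MWh is only what the reported results suggest (about 80 kg per 4 MWh step), and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Params:
    eta_ch: float = 0.95    # charging efficiency (assumed placeholder)
    eta_dis: float = 0.95   # discharging efficiency (assumed placeholder)
    kappa_el: float = 20.0  # kg H2 per MWh consumed by the electrolyzer (assumed)
    kappa_fc: float = 60.0  # kg H2 per MWh delivered by the fuel cell (assumed)


def balance_residual(p: Params, gen: float, load: float, imp: float,
                     ch: float, dis: float, el: float, fc: float,
                     curt: float) -> float:
    """Supply minus demand for one step (MWh); zero when the balance holds."""
    supply = gen + imp + p.eta_dis * dis + fc   # discharge losses appear here only
    demand = load + ch + el + curt
    return supply - demand


def step_storage(p: Params, soc: float, h2: float, ch: float, dis: float,
                 el: float, fc: float) -> Tuple[float, float]:
    """Battery SOC (MWh) and H2 stock (kg) after one step."""
    if ch > 0 and dis > 0:
        raise ValueError("simultaneous battery charging and discharging")
    if el > 0 and fc > 0:
        raise ValueError("simultaneous electrolyzer and fuel cell operation")
    soc_next = soc + p.eta_ch * ch - dis        # SOC drops by the energy withdrawn
    h2_next = h2 + p.kappa_el * el - p.kappa_fc * fc
    return soc_next, h2_next
```

The two `ValueError` checks mirror what the binary variables enforce inside the MILP; here they only validate a given schedule rather than constrain an optimizer.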
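To ensure a complete, transparent, and reproducible representation of the optimization problem, the control of the hybrid renewable energy system (HRES) is formulated as a mixed-integer linear programming (MILP) problem solved within a receding horizon Model Predictive Control (MPC) framework and is explicitly defined in terms of indices, variables, and constraints. All decision variables are defined in units of energy (MWh) per time step. The time index is defined as $t \in T = \{1, \dots, N\}$, where $N$ is the prediction horizon. For each time step $t \in T$, the optimization variables are defined as follows: grid import $E^{\mathrm{imp}}_t$, battery charge and discharge $E^{\mathrm{ch}}_t$ and $E^{\mathrm{dis}}_t$, battery state of charge $SOC_t$, electrolyzer energy input $E^{\mathrm{el}}_t$, fuel cell energy output $E^{\mathrm{fc}}_t$, hydrogen storage level $H_t$, and curtailed energy $E^{\mathrm{curt}}_t$. Binary variables are introduced to ensure mutually exclusive operation modes of the system, including battery operation (charging or discharging) and hydrogen subsystem operation (electrolyzer or fuel cell). The control problem is formulated as a minimization problem $\min J$, subject to power balance, storage dynamics, operational limits, and logical constraints for all $t \in T$. The objective is to minimize grid import, associated CO2 emissions, battery cycling, and renewable energy curtailment, while encouraging sustainable terminal states of storage systems:

$$J = \sum_{t \in T} \left[ w_{\mathrm{imp}} E^{\mathrm{imp}}_t + w_{\mathrm{CO_2}}\, c_t\, E^{\mathrm{imp}}_t + w_{\mathrm{cyc}} \left( E^{\mathrm{ch}}_t + E^{\mathrm{dis}}_t \right) + w_{\mathrm{curt}} E^{\mathrm{curt}}_t \right] - w_{\mathrm{term}} \left( SOC_N + \alpha H_N \right)$$

where $E^{\mathrm{imp}}_t$ is grid import, $c_t$ is the CO2 intensity, $E^{\mathrm{ch}}_t$ and $E^{\mathrm{dis}}_t$ are battery charge and discharge, and $E^{\mathrm{curt}}_t$ is curtailed energy. The coefficients $w_{\mathrm{imp}}$, $w_{\mathrm{CO_2}}$, $w_{\mathrm{cyc}}$, and $w_{\mathrm{curt}}$ represent penalty weights for grid import, emissions, battery cycling and curtailment, respectively, while $w_{\mathrm{term}}$ defines the terminal reward and $\alpha$ determines the relative importance of hydrogen storage.
The weighting coefficients are defined as normalized penalty factors to ensure comparable magnitudes of all objective terms. In this study, the coefficients are calculated from four scaling parameters selected based on the relative importance of emission reduction, battery degradation, energy waste, and long-term storage preservation. The coefficients are tuned such that all objective function components are of comparable order of magnitude, preventing domination of a single term and ensuring balanced optimization between economic and environmental objectives. At each time step, the power balance constraint is enforced:

$$\hat{E}^{\mathrm{pv}}_t + \hat{E}^{\mathrm{w}}_t + E^{\mathrm{bio}}_t + E^{\mathrm{imp}}_t + \eta_{\mathrm{dis}} E^{\mathrm{dis}}_t + E^{\mathrm{fc}}_t = \hat{L}_t + E^{\mathrm{ch}}_t + E^{\mathrm{el}}_t + E^{\mathrm{curt}}_t$$

where $\hat{E}^{\mathrm{pv}}_t$, $\hat{E}^{\mathrm{w}}_t$, and $\hat{L}_t$ are the forecasted solar generation, wind generation, and load, and $E^{\mathrm{bio}}_t$ is the biomass generation. The battery state of charge evolves according to:

$$SOC_t = SOC_{t-1} + \eta_{\mathrm{ch}} E^{\mathrm{ch}}_t - E^{\mathrm{dis}}_t$$

where charging efficiency is applied during charging, while discharge losses are accounted for in the power balance. The battery is subject to:

$$SOC^{\min} \le SOC_t \le SOC^{\max}$$

The hydrogen storage dynamics are defined as:

$$H_t = H_{t-1} + \kappa_{\mathrm{el}} E^{\mathrm{el}}_t - \kappa_{\mathrm{fc}} E^{\mathrm{fc}}_t$$

with storage limits:

$$0 \le H_t \le H^{\max}.$$

All system components are constrained by their operational limits:

$$0 \le E^{\mathrm{ch}}_t \le P^{\max}_{\mathrm{ch}} \Delta t, \quad 0 \le E^{\mathrm{dis}}_t \le P^{\max}_{\mathrm{dis}} \Delta t, \quad 0 \le E^{\mathrm{el}}_t \le P^{\max}_{\mathrm{el}} \Delta t, \quad 0 \le E^{\mathrm{fc}}_t \le P^{\max}_{\mathrm{fc}} \Delta t, \quad E^{\mathrm{imp}}_t \ge 0, \quad E^{\mathrm{curt}}_t \ge 0$$

To ensure physically feasible operation, binary decision variables $u^{\mathrm{b}}_t, u^{\mathrm{h}}_t \in \{0, 1\}$ are introduced:

$$E^{\mathrm{ch}}_t \le u^{\mathrm{b}}_t\, P^{\max}_{\mathrm{ch}} \Delta t, \quad E^{\mathrm{dis}}_t \le \left(1 - u^{\mathrm{b}}_t\right) P^{\max}_{\mathrm{dis}} \Delta t, \quad E^{\mathrm{el}}_t \le u^{\mathrm{h}}_t\, P^{\max}_{\mathrm{el}} \Delta t, \quad E^{\mathrm{fc}}_t \le \left(1 - u^{\mathrm{h}}_t\right) P^{\max}_{\mathrm{fc}} \Delta t$$

These constraints prevent simultaneous battery charging and discharging, as well as simultaneous hydrogen production and consumption, ensuring physical consistency of the solution. The inclusion of binary variables makes the problem a mixed-integer linear programming (MILP) formulation. To ensure smooth operation of conversion technologies, ramping constraints are defined as:

$$-\Delta E^{\max}_{\mathrm{el}} \le E^{\mathrm{el}}_t - E^{\mathrm{el}}_{t-1} \le \Delta E^{\max}_{\mathrm{el}}, \qquad -\Delta E^{\max}_{\mathrm{fc}} \le E^{\mathrm{fc}}_t - E^{\mathrm{fc}}_{t-1} \le \Delta E^{\max}_{\mathrm{fc}}$$

where $\Delta E^{\max}_{\mathrm{el}}$ and $\Delta E^{\max}_{\mathrm{fc}}$ are the maximum permitted changes per time step. The optimization is initialized with known system states:

$$SOC_0 = SOC^{\mathrm{init}}, \qquad H_0 = H^{\mathrm{init}}$$

The optimization problem is solved at each time step over a finite prediction horizon in a receding horizon manner, and only the first control action is implemented before updating the system states for the next iteration.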
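To make the structure of the objective concrete, the sketch below evaluates a cost of the kind described in this section for a given candidate schedule and checks a ramping limit. It is a hedged illustration only: the exact functional form (per-step import, CO2-weighted import, cycling and curtailment penalties minus a terminal reward on SOC and H2 stock) is an assumption reconstructed from the verbal description, and all names and weight values are hypothetical.

```python
from typing import Dict, Sequence


def objective(imp: Sequence[float], co2: Sequence[float], ch: Sequence[float],
              dis: Sequence[float], curt: Sequence[float],
              soc_end: float, h2_end: float, w: Dict[str, float]) -> float:
    """Running penalties over the horizon minus the terminal storage reward."""
    running = sum(
        w["imp"] * e                 # grid import penalty
        + w["co2"] * c * e           # emissions attributed to imported energy
        + w["cyc"] * (a + b)         # battery cycling (charge + discharge)
        + w["curt"] * u              # curtailed renewable energy
        for e, c, a, b, u in zip(imp, co2, ch, dis, curt)
    )
    terminal = w["term"] * (soc_end + w["alpha"] * h2_end)
    return running - terminal


def ramp_ok(flows: Sequence[float], max_delta: float, tol: float = 1e-9) -> bool:
    """True if consecutive steps never differ by more than max_delta (MWh)."""
    return all(abs(b - a) <= max_delta + tol for a, b in zip(flows, flows[1:]))


# Placeholder weights and a two-step schedule (illustrative values only).
weights = {"imp": 1.0, "co2": 2.0, "cyc": 0.1, "curt": 0.05, "term": 0.01, "alpha": 0.5}
J = objective(imp=[1.0, 0.0], co2=[0.2, 0.3], ch=[0.0, 1.0], dis=[0.5, 0.0],
              curt=[0.0, 0.2], soc_end=2.0, h2_end=10.0, w=weights)
```

Evaluating the objective outside the solver in this way is a simple means of checking that each term has a comparable order of magnitude before tuning the weights.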
In the proposed MPC–MILP framework, mild penalty factors are introduced in the objective function to account for battery cycling and renewable energy curtailment. These coefficients (the battery cycling weight $w_{\mathrm{cyc}}$ and the curtailment weight $w_{\mathrm{curt}}$) are not intended to dominate the optimization but rather to regularize system behavior by discouraging excessive battery degradation and unnecessary energy waste. The tuning of these hyperparameters was performed using a structured heuristic sensitivity analysis. Initially, baseline values were selected such that each penalty term contributed within the same order of magnitude as the primary objective components (grid import and emissions). This normalization ensures that no single term disproportionately influences the optimization outcome. The coefficients were then iteratively varied within a predefined range (±50%) to evaluate their impact on system operation and objective function value. The sensitivity analysis indicates that the overall system behavior is robust to moderate variations in $w_{\mathrm{cyc}}$ and $w_{\mathrm{curt}}$. For small perturbations (±10–20%), key performance indicators such as grid import, emissions, and energy balance remain nearly unchanged. In this range, the optimization results are stable, and the control strategy preserves the intended coordination between short-term (battery) and long-term (hydrogen) storage technologies. For larger deviations, predictable shifts in system operation are observed. Increasing $w_{\mathrm{cyc}}$ reduces battery cycling, leading to a more conservative battery usage pattern, a slightly higher reliance on hydrogen storage and, in some cases, increased grid import during deficit periods. Conversely, lower values of $w_{\mathrm{cyc}}$ promote more intensive battery utilization, improving short-term balancing but increasing cycling frequency. Similarly, increasing $w_{\mathrm{curt}}$ encourages the system to prioritize storage over curtailment, resulting in higher utilization of renewable energy at the expense of increased storage-related losses. Lower values of $w_{\mathrm{curt}}$ allow more frequent curtailment, simplifying system operation but reducing renewable energy utilization efficiency. From a cost perspective, the sensitivity analysis shows that variations in $w_{\mathrm{cyc}}$ and $w_{\mathrm{curt}}$ within ±10–20% lead to only marginal changes in the total objective value, typically no more than 2–5%. This confirms that the overall system cost is primarily driven by grid import and CO2-related terms, while the penalty factors act as secondary regularization components. For larger deviations (up to ±50%), the changes in total cost become more noticeable but remain moderate, reflecting shifts in storage utilization rather than fundamental changes in system efficiency. Overall, the selected penalty factors ensure a balanced trade-off between system efficiency, storage utilization, and operational realism. Importantly, the MPC framework maintains stable, feasible, and physically consistent solutions across a wide range of parameter values, demonstrating that the control strategy is not overly sensitive to precise tuning of these hyperparameters.
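The one-at-a-time sweep used in such a sensitivity analysis can be organized as in the sketch below. It is a hedged illustration, not the procedure actually run in this study: `evaluate` is a hypothetical callback that would re-solve the MPC problem for a given weight set and return the total objective value, and a toy linear function stands in for it here.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def sweep_weight(
    base: Dict[str, float],
    name: str,
    evaluate: Callable[[Dict[str, float]], float],
    rel_steps: Sequence[float] = (-0.5, -0.2, -0.1, 0.1, 0.2, 0.5),
) -> List[Tuple[float, float]]:
    """Vary one weight by relative steps; return (step, relative change in J)."""
    j0 = evaluate(base)
    results = []
    for r in rel_steps:
        weights = dict(base)                  # copy so the baseline stays intact
        weights[name] = base[name] * (1.0 + r)
        results.append((r, (evaluate(weights) - j0) / j0))
    return results


def toy_evaluate(w: Dict[str, float]) -> float:
    """Toy stand-in for a full MPC run (assumption): cost grows with the weight."""
    return 10.0 + 2.0 * w["cyc"]


results = sweep_weight({"cyc": 0.5}, "cyc", toy_evaluate)
```

In a real study each call to `evaluate` would rerun the closed-loop simulation, so the number of perturbation levels directly sets the computational cost of the analysis.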
3. Results and Discussion
The application of Model Predictive Control in hybrid energy systems inevitably reveals trade-offs between short-term system balancing and long-term storage goals. In practical scenarios, these trade-offs become particularly pronounced when decisions are made based on predicted, rather than actual, load and generation values. An analysis of the behavior of the proposed AutoML–MPC scheme shows that the accuracy of forecasts directly shapes not only instantaneous control decisions, but also the strategy for using storage devices over the entire horizon. The interaction of the battery and hydrogen system plays a special role here: it buffers the volatility of renewable energy sources, but at the same time raises new questions regarding electricity import, emissions and maintaining storage states. The interaction of these factors determines the behavior of the system, which is further analyzed based on the obtained numerical results.
The solar generation forecasts presented in Figure 2 clearly reflect a typical daily photovoltaic production cycle with a pronounced daytime peak phase and almost zero generation during the night and twilight hours. In the early period (around 120–330 min), the forecasted generation remains very low, from ~0.00 to 0.03 MWh, which corresponds to low solar altitude and limited radiation. From around 360 min, a sharp rise begins: the generation increases from 0.06 MWh to more than 3.5 MWh within a few hours, indicating a rapid increase in morning irradiance. The highest forecasted generation is reached around 690–750 min, where the values reach ~4.9–5.0 MWh, forming a broad midday peak; this is typical of a stable day without sudden cloudiness fluctuations. After the peak, a consistent and fairly symmetrical decrease is observed: from ~4.8 MWh at 780 min to ~1.33 MWh at 1020 min and less than 0.1 MWh after 1080 min, which reflects the decrease in solar altitude and the evening phase of radiation decline. Subsequent small fluctuations (around 1200–1410 min, down to ~0.19 MWh) are likely related to model noise or very low diffuse radiation outside the main diurnal cycle. The overall shape of the curve is smooth, unimodal and physically reasonable; therefore, the forecast can be considered consistent and suitable for further modeling of energy balance and optimization of storage systems [
10].
The wind generation forecasts presented in Figure 2 show a flatter but more irregular production profile than the diurnal cycle of solar generation, which is typical of wind energy. At the beginning of the period under consideration (around 120–360 min), the forecasted generation fluctuates between approximately 1.5 and 3.3 MWh, without a clear monotonic increasing or decreasing trend, reflecting the erratic nature of the wind speed. A more pronounced increase is observed at around 420 min, when the generation reaches around 4.24 MWh, the highest value in the entire interval, indicating a short-term episode of increased wind. In the later period (around 450–750 min), the production remains at an intermediate level, mostly in the range of 2.0–3.7 MWh, but with more frequent fluctuations. In the second part of the day, a general downward trend is observed: many values fall to the 1.0–1.5 MWh range, and at some moments (e.g., around 900 min) the generation decreases even to ~0.7 MWh, which can be associated with weakened wind. However, at the end of the period (around 1380–1410 min) the forecast again shows an increase to more than 2 MWh, suggesting a renewed strengthening of the wind. Overall, the forecasted profile is characterized by stochastic, short-term peak and dip behavior, which is physically justified for wind energy and is particularly important when planning storage and balancing measures in a hybrid energy system [
9,
11]. Biomass generation results remain constant throughout the period under consideration—around 1.5 MWh at each step, therefore this curve has neither daily cyclicality nor pronounced short-term fluctuations. Although the biomass generation profile in the analyzed dataset appears nearly constant (approximately 1.5 MWh per time step) and does not exhibit pronounced daily cyclicality or short-term fluctuations, it is intentionally included in the AutoML-based forecasting pipeline rather than treated as a fixed deterministic parameter. Importantly, it is acknowledged that, for the specific dataset used in this study, biomass generation could be equivalently modeled as a deterministic baseload input within the MILP formulation without any loss of accuracy. However, adopting such a deterministic representation would introduce a separate modeling pathway for one component of the system, thereby breaking the structural consistency of the proposed forecasting–control architecture. The chosen approach is motivated by several methodological and practical considerations. First, the AutoML–MPC framework is designed as a unified, data-driven architecture in which all generation and demand components are processed in a consistent manner. Including biomass in the same forecasting pipeline ensures a homogeneous interface between the forecasting and optimization layers, simplifies model integration, and improves reproducibility by avoiding ad hoc assumptions specific to individual energy sources. Second, while the biomass profile is constant in the present dataset, this assumption does not generally hold in real-world systems. Biomass-based generation may vary due to operational constraints, maintenance schedules, fuel availability, or participation in flexible dispatch strategies. By retaining biomass within the forecasting module, the framework remains directly applicable to more realistic scenarios without requiring structural modifications to the model. 
Third, the inclusion of biomass in the AutoML process introduces negligible computational overhead. Due to the very low variance of the signal, the selected models (e.g., GLM) converge to trivial predictors that effectively reproduce a constant output. Thus, the forecasting step does not add unnecessary complexity but instead provides a data-driven confirmation of the signal stability. Finally, from a control perspective, representing biomass as a forecasted input ensures full compatibility with the receding horizon MPC formulation, where all exogenous inputs are treated uniformly as time-dependent predictions. This design choice facilitates consistent state propagation and enables straightforward extension of the framework toward uncertainty-aware or stochastic control formulations. For these reasons, biomass generation is included in the forecasting framework despite its near-constant profile. This decision prioritizes methodological consistency, scalability, and general applicability, while maintaining full equivalence with a deterministic representation in the specific case considered in this study. This profile is typical of biomass power plants, which usually operate in a constant, regulated mode and, unlike solar or wind power plants, are not directly dependent on instantaneous meteorological conditions. For this reason, biomass naturally appears in the model as a stable, baseload energy source, ensuring a constant share of generation in the system balance. Since the historical data are almost unchanged, forecasting algorithms have no basis for modeling fluctuations, and therefore future values remain constant, reflecting the role of biomass as a reliable and low-variability source of energy supply [
17].
The electricity load forecast shows a clear daily consumption cycle with a sharp increase in the second half of the day and the formation of an evening peak (Figure 3). In the early period, the load is the lowest, about 2.0 MWh, and briefly decreases to ~1.96 MWh, which is typical of nighttime or early morning consumption levels. From about 360 min, a consistent increase begins: the load rises from ~2.1 MWh to ~3.4 MWh at about 540–570 min, reflecting increasing daytime activity. After a relatively stable phase during the day (about 3.2–3.4 MWh), the second half of the period sees the most pronounced jump: the load reaches a maximum at about 1080 min, where the forecast value reaches ~4.28 MWh. This is a typical evening consumption peak, associated with domestic and commercial electricity use. Later, a steady decrease to approximately 2.3–2.6 MWh is observed at the end of the period, indicating a transition to a phase of lower consumption. Such a load profile is particularly important for the management of the HRES, since it is precisely during phases of increased consumption that the need for energy storage or additional energy sources increases. During the daytime, when the solar generation peak also occurs, the system can effectively cover the load from local renewable sources and store the excess in the battery or in the form of hydrogen [
10,
11]. Meanwhile, during the evening peak, when solar generation has already decreased, the HRES system must rely on previously stored energy (battery or
circuit), therefore such load forecasts directly determine the optimal storage and discharge solutions and the overall level of system autonomy.
The amount of hydrogen produced by the electrolyzer is clearly linked to the surplus of renewable generation in the HRES (Figure 4). At the beginning of the period under consideration, hydrogen production is already under way at a low level (about 12–32 kg per step at 120–240 min). This indicates episodic short-term generation surpluses, which the MPC uses for hydrogen production even though the overall system balance at this stage remains close to neutral or in slight deficit. As solar generation increases and wind production remains sufficient, hydrogen production rises steadily, reaching about 50–55 kg per step in the interval of about 300–600 min. From about 720 min a stable maximum of about 80 kg per step is reached and held until about 930 min, indicating that the electrolyzer is operating close to nominal power and absorbs most of the excess electricity. Later, as solar generation decreases and the load share increases, hydrogen production gradually declines, from about 78 kg at 960 min to less than 20 kg at 1110–1170 min, and from about 1200 min it is zero: the electrolyzer is switched off once local generation is insufficient to form a surplus. Overall, this profile confirms that the electrolyzer acts as a flexible consumer of surplus energy in the HRES, switched on when there is surplus generation and switched off when the energy is needed for direct load coverage [17].
The electrolyzer’s electricity consumption profile clearly reflects its role as a flexible consumer of excess energy in the HRES (Figure 5). At the beginning of the period under consideration, the electrolyzer already operates intermittently in a low-power mode (about 0.6–1.9 MWh per step at 120–300 min). This indicates episodic short-term generation surpluses, which the MPC uses for hydrogen production even though the overall system balance at this stage remains close to neutral or in slight deficit. As solar generation increases, consumption rises steadily and reaches a higher operating level at about 600–690 min (about 2.3–3.9 MWh). From about 720 min the electrolyzer reaches its nominal operating limit of about 4 MWh per step, and this maximum load remains stable until about 930 min, indicating that the system has a significant surplus of renewable energy at that time, which is converted into hydrogen. Later, as solar generation decreases and the load share increases, the electrolyzer power is gradually reduced, from about 3.9 MWh at 960 min to less than 1 MWh at 1110–1170 min. From about 1200 min the electrolyzer is essentially turned off, as local generation becomes insufficient to form a surplus. Very small numerical consumption pulses at the end of the model period are insignificant and are not shown in the graphs, so hydrogen production at this stage is treated as zero. Overall, the profile shows that the MPC uses the electrolyzer as a flexible balancing device: it absorbs excess energy during the period of maximum generation and is gradually turned off when the energy is needed for direct consumption [9, 14, 17].
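The hydrogen production and electrolyzer consumption figures quoted above can be cross-checked with simple arithmetic. The sketch below is illustrative only: pairing the values by time stamp is an assumption, and the resulting conversion ratio is inferred from the reported numbers rather than stated in the text.

```python
# (electrolyzer energy per step in MWh, hydrogen per step in kg), taken from
# the descriptions of Figures 4 and 5. The pairing is an assumption.
reported_points = [
    (0.6, 12.0),  # early low-power operation (lower bounds of both ranges)
    (3.9, 78.0),  # about 960 min
    (4.0, 80.0),  # nominal plateau, about 720-930 min
]


def implied_yield_kg_per_mwh(energy_mwh, hydrogen_kg):
    """Hydrogen mass produced per MWh of electricity consumed."""
    if energy_mwh <= 0:
        raise ValueError("energy must be positive")
    return hydrogen_kg / energy_mwh


yields = [implied_yield_kg_per_mwh(e, h) for e, h in reported_points]
# Equivalent specific energy consumption in kWh per kg of hydrogen.
specific_energy_kwh_per_kg = [1000.0 / y for y in yields]

for (energy, hydrogen), y, s in zip(reported_points, yields, specific_energy_kwh_per_kg):
    print(f"{energy} MWh -> {hydrogen} kg: {y:.1f} kg/MWh, {s:.1f} kWh/kg")
```

Under these assumptions all three points give the same ratio, which suggests that the two profiles are related by a constant conversion factor in the model.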
The fuel cell electricity generation profile shows that this part of the HRES acts as a flexible discharge device for long-term energy storage, activated only when local renewable generation and the battery are not sufficient to cover the load (Figure 6). For most of the period under consideration the fuel cell does not operate, because the excess energy generated during the day is directed to battery charging and hydrogen production by the electrolyzer. Only in the later part of the period, as solar generation decreases and the relative load increases, does the fuel cell briefly turn on and supply electricity from the previously stored hydrogen. Generation reaches a maximum of about 1.1 MWh per step and then gradually decreases until the device is turned off again. This operating mode shows that the fuel cell is used not as a permanent source of generation but as a reserve balancing device, which makes it possible to cover the energy deficit of the evening period and reduce the need for electricity imports from the grid. Overall, the fuel cell operating profile confirms the consistent HRES control logic: during the day, excess renewable energy is stored as hydrogen, and later, when generation decreases, this stored energy is converted back to electricity to maintain the system balance.
The dynamics of the battery state of charge (SOC) reveal its key role in short-term energy balancing in the HRES (Figure 7). In the initial period the SOC gradually decreases from ~43.6% to about 21% at 390 min, indicating that the battery is discharged to cover the load while local generation is not yet sufficient. Later, with the rapid growth of solar generation, the SOC increases steadily and reaches a high, stable level of about 74–75% at about 720 min, which remains almost unchanged until about 930 min. This indicates that the battery has reached its optimal operating range and is mainly used to smooth small short-term fluctuations, while most of the excess energy is directed to hydrogen production. In the later part of the period, as solar generation decreases and the relative share of the load increases, the SOC gradually decreases again. This decrease is neither sudden nor complete, because part of the longer-term deficit is covered by the fuel cell using previously stored hydrogen. As a result, the battery retains some of its charge reserve, and at the end of the period the SOC remains around 28% rather than reaching the minimum level. This profile confirms that the battery acts as fast-response storage that balances short-term mismatches between generation and consumption, while the combination of hydrogen storage and fuel cell compensates for longer-term energy shortages [11, 14]. During evening deficits the system uses both at once, the battery to smooth short-term fluctuations and the fuel cell to cover longer-term shortages, thus distributing the load between fast and slow storage technologies.
The chart “Overall Forecast Model Accuracy (H2O AutoML)” summarizes the accuracy of the models for all three forecast variables (electrical load, solar generation, and wind generation) using four main evaluation indicators: RMSE, MAE, MAPE, and R² (Figure 8). The load forecast model has very high accuracy: RMSE is about 0.026 MWh, MAE is 0.015 MWh, and the mean absolute percentage error (MAPE) is only ~0.48%. The R² value of ~0.998 indicates that the model explains almost all the variation in the actual data, so the load dynamics are reproduced very accurately in both shape and absolute values. The solar generation forecast is also very accurate: RMSE is about 0.063 MWh, MAE is 0.046 MWh, and MAPE is ~1.32%. The R² value of ~0.999 indicates an excellent fit between the predicted and actual data. The model therefore reproduces the clear diurnal cycle and midday peak very well, and the errors remain small even during periods of maximum generation. The wind generation forecast is slightly less accurate than the other two but remains of high quality: RMSE is about 0.104 MWh, MAE is 0.065 MWh, and MAPE is ~4.18%. The R² value of ~0.986 still indicates a very strong correlation with the actual data. The higher errors are physically justified, since wind generation is characterized by higher stochasticity and short-term fluctuations, which are more difficult to predict accurately than regular load or solar cycles. Overall, this diagram confirms that the forecasting models based on H2O AutoML provide very high accuracy for all input variables of the HRES, with the highest reliability achieved in the load and solar forecasts. Such forecast accuracy is sufficient for reliable MPC optimization, as it reduces decision-making uncertainty and allows more efficient planning of energy storage and distribution in a hybrid energy system. Although the overall forecast accuracy is very high, even small systematic errors or time shifts can have a significant impact on MPC decisions because of storage device limitations and time dependence, so forecast quality remains a critical factor for control efficiency [17].
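The four indicators can be reproduced from any pair of actual and predicted series. The following is a minimal sketch using the standard textbook definitions; the function name and the handling of zero actual values in MAPE are assumptions, since the text does not say how such points (e.g., night-time solar output) were treated.

```python
import math


def forecast_metrics(actual, predicted):
    """Return RMSE, MAE, MAPE (%) and R^2 for two equal-length sequences."""
    n = len(actual)
    if n == 0 or n != len(predicted):
        raise ValueError("series must be non-empty and of equal length")

    errors = [p - a for a, p in zip(actual, predicted)]
    ss_res = sum(e * e for e in errors)

    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n

    # Assumption: points with a zero actual value are skipped in MAPE,
    # because the percentage error is undefined there.
    nonzero = [(a, e) for a, e in zip(actual, errors) if a != 0]
    if nonzero:
        mape = 100.0 * sum(abs(e / a) for a, e in nonzero) / len(nonzero)
    else:
        mape = float("nan")

    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"rmse": rmse, "mae": mae, "mape": mape, "r2": r2}


# Toy example with made-up numbers (not data from the study).
example = forecast_metrics([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
print(example)
```

Because MAPE divides by the actual value, it is sensitive to small denominators, which is one reason why MAPE values for different variables are not directly comparable.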
While the previous analysis provides a detailed interpretation of the temporal dynamics of key system variables (solar and wind forecasts, load, hydrogen production, electrolyzer consumption, fuel cell operation, and battery state of charge), a robust validation of the proposed control strategy requires a consolidated and fully quantitative evaluation based on key performance indicators (KPIs). To ensure methodological consistency, the KPIs used for validation are directly aligned with the objective function components and performance metrics defined in the MPC–MILP framework. These indicators were systematically calculated over the entire simulation horizon and explicitly consolidated, covering energy performance, environmental impact, and operational efficiency.
The KPIs are defined as follows: Grid import (MWh): total energy imported from the external grid over the simulation horizon; CO₂ emissions (kg): total CO₂ emissions associated with grid import, calculated using the time-dependent CO₂ intensity; Renewable energy utilization (%): ratio of renewable energy used to total available renewable generation; Load supplied by renewables (%): share of total load covered by renewable sources and storage; Curtailment (MWh): total unused renewable energy; Avoided emissions (%): relative reduction in CO₂ emissions compared to the RBC baseline; Battery utilization (%): ratio of total battery throughput to its maximum possible capacity over the horizon; Hydrogen system utilization (%): fraction of time the electrolyzer and fuel cell are actively used; Operating cost proxy (normalized): total objective function value normalized with respect to the RBC case (Table 1).
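Several of these definitions can be expressed directly as sums over the simulation horizon. The sketch below is a hedged illustration: the function and argument names are hypothetical, "renewable energy used" is taken as available generation minus curtailment, and the locally supplied load is taken as load minus grid import. Both are assumptions that may differ from the exact formulas used in the study.

```python
def compute_kpis(grid_import, intensity, renewable_available, curtailment,
                 load, baseline_emissions=None):
    """Aggregate per-step series (MWh per step, kg per MWh) into KPIs."""
    series = (grid_import, intensity, renewable_available, curtailment, load)
    if len({len(s) for s in series}) != 1:
        raise ValueError("all series must have the same length")

    total_import = sum(grid_import)
    # Emissions use the time-dependent intensity of each step.
    emissions = sum(g * c for g, c in zip(grid_import, intensity))
    total_renewable = sum(renewable_available)
    total_curtailment = sum(curtailment)
    total_load = sum(load)

    kpis = {
        "grid_import_mwh": total_import,
        "emissions_kg": emissions,
        "curtailment_mwh": total_curtailment,
        # Assumption: used renewable energy = available - curtailed.
        "renewable_utilization_pct": (
            100.0 * (total_renewable - total_curtailment) / total_renewable
            if total_renewable > 0 else float("nan")),
        # Assumption: everything not imported is supplied locally.
        "load_supplied_locally_pct": (
            100.0 * (total_load - total_import) / total_load
            if total_load > 0 else float("nan")),
    }
    if baseline_emissions:
        kpis["avoided_emissions_pct"] = (
            100.0 * (baseline_emissions - emissions) / baseline_emissions)
    return kpis


# Toy three-step example with made-up numbers (not data from the study).
toy = compute_kpis(
    grid_import=[1.0, 0.0, 2.0],
    intensity=[0.5, 0.5, 0.25],
    renewable_available=[4.0, 6.0, 0.0],
    curtailment=[0.0, 1.0, 0.0],
    load=[5.0, 5.0, 2.0],
    baseline_emissions=2.0,
)
print(toy)
```

Battery and hydrogen utilization are omitted here because they depend on device ratings that are not given in this section.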
The results demonstrate that the proposed AutoML–MPC framework significantly improves system performance across all evaluated metrics. Compared to the rule-based control (RBC) baseline, grid import is reduced by 28%, while CO₂ emissions decrease by 25%. At the same time, renewable energy utilization increases from 82% to 94%, indicating a substantial improvement in system autonomy and local energy use. Curtailment is reduced by 66%, confirming that the predictive control strategy enables more effective absorption and utilization of renewable energy through coordinated operation of the battery and hydrogen storage systems. The increase in battery utilization (from 48% to 63%) reflects improved short-term balancing, while hydrogen system utilization nearly doubles compared to RBC, highlighting its role in long-term energy shifting. The avoided emissions indicator further confirms the environmental benefit of the proposed approach, reaching 25% relative to the baseline case. In parallel, the operating cost proxy shows a consistent reduction across control strategies, with AutoML–MPC achieving a 28% improvement compared to RBC. Compared to the ideal MPC case with perfect foresight, the performance gap remains limited (below 10% for all major KPIs), demonstrating that the AutoML-based forecasting models provide sufficiently accurate inputs for near-optimal control decisions. All KPIs are consolidated in Table 1 and correspond directly to the performance metrics embedded in the MPC formulation, ensuring a transparent and reproducible validation procedure. This KPI-based evaluation complements the time-series analysis presented above by translating the observed system behavior into measurable and comparable performance improvements. Overall, the results provide a consistent, quantitative validation of the proposed AutoML–MPC framework, confirming its effectiveness in improving energy efficiency, reducing emissions, and enhancing the operational performance of hybrid renewable energy systems.
To substantiate the contribution of the proposed AutoML–MPC framework, a quantitative comparison against baseline control strategies and a forecast error sensitivity analysis were performed. This allows the performance improvements to be evaluated rigorously rather than inferred qualitatively from system behavior. The quantitative results are summarized in Table 2.
For the baseline comparison, three alternative strategies were considered. A rule-based control (RBC) strategy was implemented as a non-predictive reference, in which energy dispatch follows fixed priority rules without anticipating future system states. In addition, an MPC formulation using naive persistence-based forecasts was used to isolate the impact of forecast quality. Finally, an MPC case with perfect foresight, in which actual future values are used, was included as an upper performance bound. The simulation results demonstrate that the proposed AutoML–MPC framework provides a substantial and measurable improvement over simpler strategies. Compared to RBC, total grid import is reduced from 31.4 MWh to 22.6 MWh, a reduction of 28%, while emissions decrease from 12.8 kg to 9.6 kg (−25%). At the same time, renewable energy utilization increases from 82% to 94%, indicating significantly more efficient use of locally generated energy. These improvements are directly linked to the predictive capability of the controller, which enables coordinated use of battery and hydrogen storage across different time scales. In contrast, the RBC strategy operates reactively and cannot anticipate future surpluses or deficits, leading to suboptimal storage usage and higher reliance on external grid supply. Compared to MPC with naive forecasting, the AutoML–MPC approach reduces grid import from 26.9 MWh to 22.6 MWh (−16%) and improves renewable energy utilization from 87% to 94% (+7 percentage points). Furthermore, the gap between AutoML–MPC and the ideal MPC case with perfect foresight remains limited, with grid import values of 22.6 MWh and 20.8 MWh, respectively (difference below 10%), indicating that the forecasting models support near-optimal control performance under practical conditions.
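The percentages quoted in this comparison follow from the reported absolute values. The short sketch below recomputes them; the helper name is hypothetical, and the only inputs are the numbers stated in the paragraph above.

```python
def relative_change_pct(baseline, value):
    """Relative change of value with respect to baseline, in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return 100.0 * (value - baseline) / baseline


# Values reported in the text (grid import in MWh, emissions in kg).
import_rbc, import_naive, import_automl, import_perfect = 31.4, 26.9, 22.6, 20.8
emissions_rbc, emissions_automl = 12.8, 9.6

import_vs_rbc = relative_change_pct(import_rbc, import_automl)
emissions_vs_rbc = relative_change_pct(emissions_rbc, emissions_automl)
import_vs_naive = relative_change_pct(import_naive, import_automl)
gap_to_perfect = relative_change_pct(import_perfect, import_automl)

# Utilization is already a percentage, so the gain is in percentage points.
utilization_gain_pp = 94 - 87

print(f"Grid import vs RBC:   {import_vs_rbc:.1f}%")
print(f"Emissions vs RBC:     {emissions_vs_rbc:.1f}%")
print(f"Grid import vs naive: {import_vs_naive:.1f}%")
print(f"Gap to perfect MPC:   {gap_to_perfect:.1f}%")
print(f"Utilization gain:     {utilization_gain_pp} percentage points")
```

The gap to the perfect-foresight case is computed relative to the perfect-foresight import, which is one reasonable convention; the text does not specify which denominator was used.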
In addition to the benchmark comparison, a systematic sensitivity analysis was conducted to evaluate the robustness of the proposed framework with respect to forecast errors. Forecast inputs for solar generation, wind generation, and electrical load were perturbed within ±5%, ±10%, and ±20% ranges, and the resulting control performance was evaluated. The results show that for moderate forecast errors (±5–10%), grid import increases by 2–9% and emissions by 2–8%, indicating high robustness of the control strategy. Even under larger deviations (±20%), the system remains stable and feasible, with grid import increasing to approximately 25.3–26.7 MWh (12–18% increase compared to the nominal case), while all operational constraints are satisfied. This demonstrates that the MPC formulation, combined with the complementary roles of battery and hydrogen storage, provides inherent robustness to realistic levels of uncertainty.
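One way to generate such perturbed inputs is sketched below. It is an illustration only: the text does not state whether the perturbations were uniform biases or random deviations within each range, so a uniform multiplicative bias applied to all three signals is assumed here, and the function names are hypothetical. Clipping at zero mirrors the non-negativity restriction applied to the point forecasts.

```python
def perturb_forecast(forecast, relative_error):
    """Scale a forecast by (1 + relative_error) and clip at zero."""
    return [max(0.0, v * (1.0 + relative_error)) for v in forecast]


def sensitivity_cases(forecasts, levels=(0.05, 0.10, 0.20)):
    """Build perturbed copies of all forecast series for each +/- level.

    forecasts maps a signal name (e.g. "solar") to a list of values.
    Returns a dict keyed by labels such as "+5%" or "-20%".
    """
    cases = {}
    for level in levels:
        for sign in (1, -1):
            label = f"{sign * level:+.0%}"
            cases[label] = {
                name: perturb_forecast(series, sign * level)
                for name, series in forecasts.items()
            }
    return cases


# Toy forecasts with made-up numbers (not data from the study).
nominal = {
    "solar": [0.0, 2.0, 4.0],
    "wind": [1.0, 1.5, 0.5],
    "load": [2.0, 3.4, 4.28],
}
cases = sensitivity_cases(nominal)
for label, perturbed in cases.items():
    print(label, perturbed["load"])
```

Each perturbed set would then be passed to the controller in place of the nominal forecasts, with the KPIs evaluated against the actual data.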
The marginal contribution of hydrogen storage was evaluated by comparing the full system configuration with a battery-only scenario. The results show that removing the hydrogen subsystem leads to a clear degradation in system performance. Grid import increases by approximately 10–15%, while renewable energy curtailment increases by approximately 30–40% due to the limited ability of the battery to absorb excess generation. In addition, battery utilization increases significantly, indicating more intensive cycling and reduced operational flexibility. From a system-level perspective, this corresponds to an effective reduction in renewable energy self-sufficiency of approximately 8–12%, highlighting the inability of short-term storage alone to manage prolonged imbalances. These results confirm that batteries alone are insufficient to handle long-term energy shifting, while the hydrogen subsystem provides a complementary function by enabling energy transfer across extended time scales. This contribution becomes particularly critical during prolonged deficit periods, such as evening peaks, when the battery alone cannot sustain the load and the fuel cell reduces dependence on external grid import.
The influence of the objective function weighting coefficients was also analyzed to assess the sensitivity of control decisions. Increasing the penalty associated with battery cycling leads to more conservative battery usage and a corresponding increase in reliance on hydrogen storage and grid import. Increasing the penalty on curtailment promotes more intensive use of storage, particularly by increasing hydrogen production during surplus periods. When the weight associated with emissions is increased, the system prioritizes emission reduction, resulting in reduced grid import and more aggressive utilization of available storage resources. Despite these variations, moderate changes in the weighting coefficients within a range of ±10–20% lead to only minor variations in total system cost, typically below 5%, indicating that the optimization framework is robust and not overly sensitive to precise parameter tuning.
Finally, the sensitivity of system performance to storage capacity was examined. Increasing battery capacity improves short-term balancing and reduces peak grid import; however, the marginal benefit decreases beyond a certain capacity due to the limited duration of short-term imbalances. In contrast, increasing hydrogen storage capacity has a more pronounced effect on long-term system performance, as it enables the system to store excess renewable energy and use it during extended deficit periods. When storage capacity is reduced by approximately 20%, grid import increases by approximately 10–14%, while renewable energy curtailment increases by approximately 12–18% due to limited absorption capability. These effects are nonlinear, as insufficient storage capacity simultaneously increases both unused renewable energy and dependence on external supply. The results highlight the importance of balanced sizing of short-term and long-term storage technologies, as their coordinated operation is essential for achieving high renewable energy utilization and system autonomy.
Overall, the extended analysis demonstrates that the proposed AutoML–MPC framework not only significantly improves system performance compared to baseline strategies but also maintains stable, robust, and physically consistent operation under forecast uncertainty, parameter variation, and different system configurations.