4.1. Problem Definition and System Description
Large-scale offshore wind farms are subject to significantly higher spatiotemporal uncertainty than their onshore counterparts, primarily owing to rapid marine atmospheric boundary layer dynamics, complex wake interactions among turbines, and frequent extreme meteorological events. Conventional deterministic or low-resolution stochastic configuration methods systematically underestimate the magnitude and correlation structure of minute-to-hour power fluctuations, resulting in either excessively conservative storage oversizing or unacceptable reliability degradation during real operation. Existing hybrid storage studies further suffer from an artificial decoupling between forecasting layers and planning layers: uncertainty is ignored, treated via oversimplified Gaussian assumptions, or represented by a limited number of hand-crafted scenarios that fail to capture the non-stationary tail risks characteristic of offshore environments. Consequently, lithium-ion batteries are forced to absorb almost all fluctuations, leading to accelerated calendar and cycling degradation, whereas long-duration storage technologies remain underutilised because static allocation rules cannot adapt to the time-varying predictability of wind power.
To overcome these fundamental limitations, this work establishes a closed-loop, prediction-guided configuration paradigm in which minute-ahead probabilistic forecasts directly determine both the capacity planning vector and the time-varying power-sharing strategy of a hybrid energy storage system composed of lithium-ion batteries and liquid air energy storage. The core philosophy is to exploit the intrinsic complementarity between the two storage media: batteries excel at delivering high power over short durations with sub-second response capability required for primary frequency regulation, whereas liquid air energy storage provides high energy density, negligible self-discharge, and geographical independence suitable for intra-day and multi-day energy shifting as well as black-start ancillary services. By making the power allocation factor explicitly dependent on the width and shape of the forecasted uncertainty distribution, the proposed framework ensures that batteries are protected from excessive cycling whenever wind power becomes more predictable, while liquid air energy storage absorbs predictable bulk energy imbalances, thereby achieving a globally optimal lifetime cost-reliability trade-off that static or deterministic approaches cannot attain.
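The physical system under consideration comprises an offshore wind farm with total rated capacity $P_{\mathrm{w}}^{\mathrm{rated}}$. The aggregated wind power output at time $t$ is denoted $P_{\mathrm{w}}(t)$. The hybrid storage system injects or absorbs power according to
\[ P_{\mathrm{HS}}(t) = P_{\mathrm{B}}(t) + P_{\mathrm{L}}(t), \]
where $P_{\mathrm{B}}(t)$ and $P_{\mathrm{L}}(t)$ are the instantaneous charge/discharge powers of the lithium-ion battery and the liquid air energy storage, with positive values indicating discharge. State-of-energy dynamics are expressed as
\[ E_k(t+\Delta t) = E_k(t) - \left( \frac{\max\{P_k(t),0\}}{\eta_k^{\mathrm{dis}}} + \eta_k^{\mathrm{ch}} \min\{P_k(t),0\} \right) \Delta t, \qquad k \in \{\mathrm{B},\mathrm{L}\}, \]
with $0 \le E_k(t) \le E_k^{\mathrm{rated}}$, $|P_k(t)| \le P_k^{\mathrm{rated}}$, and time-invariant round-trip efficiencies $\eta_k^{\mathrm{ch}} \eta_k^{\mathrm{dis}}$.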
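The state-of-energy update can be illustrated with a minimal Python sketch. The function name, the split of the round-trip efficiency into separate charge and discharge efficiencies, and the units (MW, MWh, hours) are assumptions made for illustration; the sign convention (positive power means discharge) follows the text.

```python
def step_state_of_energy(energy_mwh, power_mw, dt_h,
                         eta_charge, eta_discharge, capacity_mwh):
    """Advance the stored energy of one storage unit by a single time step.

    power_mw > 0 means discharge, power_mw < 0 means charge.
    The efficiencies are assumed constant (hypothetical parameter names).
    """
    if power_mw >= 0.0:
        # Discharging: the store loses more energy than it delivers.
        delta = -power_mw / eta_discharge * dt_h
    else:
        # Charging: only a fraction of the absorbed energy is stored.
        delta = -power_mw * eta_charge * dt_h
    new_energy = energy_mwh + delta
    if not 0.0 <= new_energy <= capacity_mwh:
        raise ValueError("dispatch violates the state-of-energy bounds")
    return new_energy


# Charging at 10 MW for half an hour with 90 % charge efficiency.
print(step_state_of_energy(50.0, -10.0, 0.5, 0.9, 0.9, 100.0))
```

Raising an error on a bound violation is one possible design choice; a planning model would instead impose the bounds as constraints.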
The primary decision variables of the planning problem are the rated power and energy capacities of both technologies together with a dynamic power allocation factor $\alpha(t) \in [0,1]$ that governs the fraction of the total compensating power assigned to the battery at each instant:
\[ P_{\mathrm{B}}(t) = \alpha(t)\,\Delta P(t), \qquad P_{\mathrm{L}}(t) = \bigl(1-\alpha(t)\bigr)\,\Delta P(t), \]
where $\Delta P(t) = P_{\mathrm{ref}}(t) - P_{\mathrm{w}}(t)$ is the instantaneous imbalance with respect to a desired smooth grid injection profile $P_{\mathrm{ref}}(t)$. Unlike previous works where $\alpha$ is fixed or heuristically predefined, here $\alpha(t)$ is continuously adjusted by a prediction-guided rule that maps the forecasted conditional variance of wind power over the next four hours onto an optimal battery utilisation level, thereby minimising unnecessary battery throughput while preserving reliability.
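As a small illustration of the power-sharing rule, the sketch below splits an imbalance between the battery and the liquid air energy storage for a given allocation factor. The function and argument names are hypothetical, and the imbalance is assumed to be the reference injection minus the wind output, so that a positive value calls for discharge.

```python
def split_power(imbalance_mw, battery_share):
    """Split the compensating power between battery and liquid air storage.

    battery_share is the allocation factor in [0, 1]; the remainder of the
    imbalance is assigned to the liquid air energy storage.
    """
    if not 0.0 <= battery_share <= 1.0:
        raise ValueError("allocation factor must lie in [0, 1]")
    battery_mw = battery_share * imbalance_mw
    laes_mw = imbalance_mw - battery_mw
    return battery_mw, laes_mw


# A 20 MW shortfall with a quarter of the duty assigned to the battery.
print(split_power(20.0, 0.25))  # (5.0, 15.0)
```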
The overall optimisation horizon spans one typical year with minute resolution for critical segments and hourly resolution elsewhere. Because perfect foresight is unavailable, the configuration vector $x = (P_{\mathrm{B}}^{\mathrm{rated}}, E_{\mathrm{B}}^{\mathrm{rated}}, P_{\mathrm{L}}^{\mathrm{rated}}, E_{\mathrm{L}}^{\mathrm{rated}})$ must remain feasible and near-optimal against the true but unknown probability distribution of future wind power trajectories. This naturally leads to a distributionally robust optimisation formulation driven by high-fidelity empirical distributions constructed in real time from the probabilistic forecasting module presented subsequently.
4.2. PredOpt-HS: Prediction-Guided Rolling Robust Multi-Objective Optimization Framework
Conventional hybrid storage configuration approaches rely on either deterministic power profiles or a small set of pre-generated stochastic scenarios that are detached from the actual time-evolving predictability of offshore wind power. Such decoupling inevitably produces configurations that are simultaneously over-sized for periods of high forecast skill and critically under-sized for low-predictability episodes dominated by frontal passages and convective turbulence. Moreover, the majority of existing multi-objective formulations treat uncertainty through expected-value metrics alone, thereby exposing the system to unacceptably large tail risks during extreme ramp events that are disproportionately responsible for frequency excursions and black-start triggers in isolated offshore grids.
The PredOpt-HS framework eliminates these deficiencies by establishing a direct, causal linkage between the instantaneous quality of minute-ahead probabilistic forecasts and the instantaneous intensity with which each storage technology is utilised. At every rolling horizon initiation, the DCR-Net module delivers a high-resolution empirical distribution comprising thousands of equally likely wind power trajectories that jointly capture non-Gaussian, non-stationary, and spatially correlated fluctuations. This empirical distribution serves as the reference measure around which a Wasserstein-metric ambiguity set is constructed, endowing the optimisation with out-of-sample performance guarantees against distribution misspecification.
4.3. Evaluation Metrics and Objective Functions
To explicitly quantify the economic and reliability performance of the hybrid storage configuration, three primary mathematical metrics are defined within the objective function.
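The first objective $f_1$ represents the Levelized Cost of Energy (LCOE). It is computed over the full life cycle as the annuity of total investment, replacement, and fixed operation and maintenance (O&M) costs divided by the expected energy served:
\[ f_1 = \frac{\mathrm{CRF}(r,Y)\left[\sum_{k \in \{\mathrm{B},\mathrm{L}\}} \left(c_k^{P} P_k^{\mathrm{rated}} + c_k^{E} E_k^{\mathrm{rated}}\right) + C^{\mathrm{rep}}\right] + C^{\mathrm{O\&M}}}{\mathbb{E}\left[E^{\mathrm{served}}\right]}, \qquad \mathrm{CRF}(r,Y) = \frac{r(1+r)^{Y}}{(1+r)^{Y}-1}, \]
where $\mathrm{CRF}(r,Y)$ is the capital recovery factor with discount rate $r$ and project lifetime $Y$. The terms $c_k^{P}$ and $c_k^{E}$ denote the unit power and energy capital costs for each storage technology, respectively, $C^{\mathrm{rep}}$ is the present value of replacement costs, $C^{\mathrm{O\&M}}$ is the annual fixed O&M cost, and $\mathbb{E}[E^{\mathrm{served}}]$ is the expected annual energy served.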
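The capital recovery factor has the standard closed form r(1+r)^Y / ((1+r)^Y - 1). The following sketch evaluates it and a simplified levelized cost; treating replacement costs as a present value and O&M as an annual cost is an assumption of this example, as are all function and argument names.

```python
def capital_recovery_factor(rate, years):
    """Annuity factor converting a present value into equal annual payments.

    Assumes a strictly positive discount rate.
    """
    growth = (1.0 + rate) ** years
    return rate * growth / (growth - 1.0)


def levelized_cost(capex, replacement_pv, annual_om, annual_energy_mwh,
                   rate, years):
    """Annualised cost per MWh served (simplified, illustrative)."""
    annuity = capital_recovery_factor(rate, years) * (capex + replacement_pv)
    return (annuity + annual_om) / annual_energy_mwh


# An 8 % discount rate over 20 years gives a factor of roughly 0.102.
print(round(capital_recovery_factor(0.08, 20), 4))
```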
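The second objective $f_2$ penalizes the tail risk of power shortages under extreme meteorological uncertainty, formulated as the Conditional Value-at-Risk (CVaR) of the Expected Energy Not Supplied (EENS) at a confidence level $\beta$ (e.g., 0.99):
\[ f_2 = \min_{\zeta \in \mathbb{R}} \left\{ \zeta + \frac{1}{1-\beta}\, \mathbb{E}_{\xi}\!\left[ \bigl(L^{\mathrm{shed}}(\xi) - \zeta\bigr)^{+} \right] \right\}, \]
where $\zeta$ is the Value-at-Risk auxiliary variable, $(\cdot)^{+} = \max\{\cdot, 0\}$, and $L^{\mathrm{shed}}(\xi)$ represents the total load shedding under a specific wind power trajectory $\xi$.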
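For a finite set of equally likely trajectories, the CVaR can be evaluated with the Rockafellar-Uryasev auxiliary-variable formulation. The sketch below is illustrative only: the function name and the sample losses are hypothetical. It relies on the fact that the objective is piecewise linear and convex in the auxiliary variable, so a minimiser can always be found among the sample values.

```python
def cvar(losses, beta):
    """Sample CVaR at confidence level beta via the auxiliary-variable form."""
    n = len(losses)

    def objective(zeta):
        # zeta plus the scaled mean excess of the losses above zeta.
        excess = sum(max(loss - zeta, 0.0) for loss in losses)
        return zeta + excess / ((1.0 - beta) * n)

    # The breakpoints of the piecewise-linear objective are the samples.
    return min(objective(zeta) for zeta in losses)


# Ten scenarios of load shedding (MWh); the worst 20 % average to 15.
shed = [0, 0, 0, 0, 0, 0, 0, 0, 10, 20]
print(cvar(shed, 0.8))
```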
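The third objective $f_3$ minimizes the maximum frequency nadir deviation ($\Delta f_{\mathrm{nadir}}$) to ensure primary frequency regulation capabilities. The nadir is defined by the maximum absolute deviation during a transient event:
\[ \Delta f_{\mathrm{nadir}} = \max_{t} \left| f(t) - f_0 \right|, \]
subject to the system swing equation dynamics:
\[ \frac{2H}{f_0}\,\frac{\mathrm{d}\,\Delta f(t)}{\mathrm{d}t} = P_{\mathrm{B}}^{\mathrm{reg}}(t) - \Delta P_{\mathrm{d}}(t) - D\,\Delta f(t), \qquad \Delta f(t) = f(t) - f_0, \]
where $H$ is the equivalent system inertia, $f_0$ is the nominal frequency, $D$ is the load damping constant, $\Delta P_{\mathrm{d}}(t)$ is the active power deficit that triggers the event, and $P_{\mathrm{B}}^{\mathrm{reg}}(t)$ reflects the instantaneous active power injected by the fast-responding lithium-ion batteries to arrest the frequency drop.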
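To illustrate how a fast battery injection limits the frequency deviation, the sketch below integrates a single-area swing equation with forward Euler. Everything specific here is an assumption rather than the authors' model: powers are in per unit, the damping constant is in per unit per hertz, the disturbance is a constant step, and the battery follows a simple droop law clipped at its power rating.

```python
def frequency_nadir(inertia_s, f0_hz, damping, disturbance_pu,
                    battery_gain, battery_limit_pu,
                    dt=0.01, horizon_s=30.0):
    """Largest absolute frequency deviation (Hz) after a step power deficit."""
    delta_f = 0.0
    worst = 0.0
    for _ in range(int(horizon_s / dt)):
        # Droop-like battery response, limited by the rated power.
        p_batt = max(-battery_limit_pu,
                     min(battery_limit_pu, -battery_gain * delta_f))
        # Swing equation: (2H / f0) * d(delta_f)/dt = power balance.
        dfdt = (p_batt - disturbance_pu - damping * delta_f) \
            * f0_hz / (2.0 * inertia_s)
        delta_f += dfdt * dt
        worst = max(worst, abs(delta_f))
    return worst


no_battery = frequency_nadir(4.0, 50.0, 0.02, 0.05, 0.0, 0.0)
with_battery = frequency_nadir(4.0, 50.0, 0.02, 0.05, 0.2, 0.1)
print(no_battery > with_battery)  # True
```

Without governor dynamics this first-order model settles monotonically, so the reported value is the largest deviation reached within the simulated horizon rather than a true overshoot.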
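The distributionally robust multi-objective optimisation problem is formally stated as
\[ \min_{x,\,\alpha(\cdot)} \left\{ f_1(x),\ \ \sup_{\mathbb{P} \in \mathcal{P}} \mathrm{CVaR}_{\beta}^{\mathbb{P}}\!\left[L^{\mathrm{shed}}(\xi)\right],\ \ \sup_{\mathbb{P} \in \mathcal{P}} \mathbb{E}^{\mathbb{P}}\!\left[\Delta f_{\mathrm{nadir}}(\xi)\right],\ \ T^{\mathrm{bs}}(x) \right\}, \]
subject to the storage dynamics and the power allocation rule of Section 4.1. Here $x$ is the configuration vector of rated power and energy capacities, $\alpha(\cdot)$ is the dynamic allocation function, $\xi$ is a wind power trajectory, and $\mathcal{P}$ is the ambiguity set of admissible distributions.
The first objective is the levelized cost of energy, computed over the full life cycle as the annuity of total investment, replacement, and fixed operation costs divided by the expected energy served. The second and third objectives penalise the worst-case conditional value-at-risk of expected energy not supplied and the worst-case expected frequency nadir deviation, respectively, thereby providing explicit protection against rare but catastrophic events. The fourth objective $T^{\mathrm{bs}}$ complements the three metrics defined above: it is the maximum black-start recovery time, expressed as the minimum duration required for the hybrid system to restore predefined critical load using only stored energy after a complete farm outage.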
Crucially, the dynamic allocation function is a monotonically increasing mapping from the forecasted conditional variance $\hat{\sigma}^2(t)$ over the forthcoming four hours to the battery utilisation factor $\alpha(t)$, so that a lower variance translates into a reduced battery share. When forecast uncertainty is low, $\hat{\sigma}^2(t)$ is small, $\alpha(t)$ approaches zero, and nearly all smoothing and regulation duties are transferred to liquid air energy storage, dramatically extending battery lifetime. Conversely, during passages of mesoscale fronts where the variance spikes, $\alpha(t)$ rises toward unity, mobilising the full battery power capacity to suppress steep ramps and provide synthetic inertia. This prediction-guided load-sharing mechanism constitutes the core innovation that static or rule-based hybrid systems cannot replicate, because they lack access to continuously updated, high-fidelity uncertainty quantification.
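The text does not specify the functional form of this mapping. As one hypothetical choice that reproduces the limiting behaviour described (a battery share near zero at low forecast variance and near one at high variance), the sketch below uses a saturating exponential; the reference variance is an assumed tuning parameter.

```python
import math


def battery_share(forecast_variance, reference_variance):
    """Map forecast variance to a battery allocation factor in [0, 1).

    Saturating exponential: an illustrative choice, not the authors' rule.
    """
    if forecast_variance < 0.0 or reference_variance <= 0.0:
        raise ValueError("variance inputs must be non-negative / positive")
    return 1.0 - math.exp(-forecast_variance / reference_variance)


# The share grows with the four-hour-ahead forecast variance.
for variance in (0.0, 0.5, 2.0, 10.0):
    print(variance, round(battery_share(variance, 1.0), 3))
```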
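The Wasserstein ambiguity set is defined as
\[ \mathcal{P} = \left\{ \mathbb{P} \in \mathcal{M}(\Xi) \,:\, W_1\bigl(\mathbb{P}, \hat{\mathbb{P}}_N\bigr) \le \varepsilon \right\}, \]
where $\mathcal{M}(\Xi)$ is the set of probability distributions supported on the trajectory space $\Xi$, $\hat{\mathbb{P}}_N$ is the empirical distribution of the $N$ trajectories delivered by DCR-Net, $W_1$ denotes the 1-Wasserstein distance, and the radius $\varepsilon$ is chosen proportionally to the evidential total uncertainty output by DCR-Net, ensuring that the degree of robustness itself adapts to forecast confidence. Thanks to recent advances in distributionally robust optimisation, the supremum over the ambiguity set can be reformulated exactly as a finite-dimensional convex program for most common risk measures, preserving computational tractability despite the minute-level resolution and annual horizon.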
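Two elementary facts about the 1-Wasserstein distance help build intuition for the ambiguity set. For two equally sized one-dimensional samples, the distance equals the mean absolute difference of the sorted values; and for a loss that is Lipschitz in the uncertain quantity, Kantorovich-Rubinstein duality bounds the worst-case expected loss over the ball by the empirical mean plus the Lipschitz constant times the radius. The sketch below illustrates both with hypothetical names and scalar data; it is not the exact convex reformulation referred to in the text.

```python
def wasserstein1(samples_a, samples_b):
    """1-Wasserstein distance between two equal-size 1-D empirical samples."""
    if len(samples_a) != len(samples_b):
        raise ValueError("this shortcut needs samples of equal size")
    pairs = zip(sorted(samples_a), sorted(samples_b))
    return sum(abs(a - b) for a, b in pairs) / len(samples_a)


def worst_case_mean_bound(empirical_losses, lipschitz, radius):
    """Upper bound on the expected loss over a 1-Wasserstein ball."""
    mean_loss = sum(empirical_losses) / len(empirical_losses)
    return mean_loss + lipschitz * radius


print(wasserstein1([0.0, 1.0, 2.0], [1.0, 2.0, 3.0]))  # 1.0
print(worst_case_mean_bound([1.0, 3.0], 2.0, 0.5))     # 3.0
```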
Through this tightly coupled, prediction-driven, distributionally robust formulation, PredOpt-HS achieves a configuration that is simultaneously economical under nominal conditions, resilient against extreme meteorological uncertainty, and capable of delivering both primary frequency regulation and black-start services at minimum lifetime cost, objectives that remain mutually conflicting and unattainable in all previously decoupled or deterministic paradigms.
4.4. DCR-Net: Dual-Channel Residual Network for Ultra-Short-Term Probabilistic Forecasting
Existing ultra-short-term forecasting models for offshore wind power overwhelmingly adopt either purely data-driven sequential architectures or physically based numerical weather prediction downscaling. The former excel at capturing deep latent temporal representations yet inevitably fail to generalise across unseen meteorological regimes and systematically underestimate tail uncertainty during rapidly evolving convective systems. The latter provide physically consistent uncertainty estimates but operate at temporal resolutions far too coarse for minute-ahead storage dispatch and primary frequency control. Hybrid approaches that simply concatenate meteorological variables as additional inputs to black-box deep networks still suffer from catastrophic error propagation when physical constraints are violated, producing poorly calibrated probabilistic forecasts whose coverage deviates by more than thirty percentage points from nominal confidence levels in real offshore datasets.
The proposed DCR-Net overcomes these fundamental shortcomings through a dual-channel architecture that separately extracts and subsequently fuses physically interpretable spatiotemporal features with purely statistical temporal dependencies, thereby preserving the strengths of both paradigms while eliminating their respective failure modes. The network ingests two fundamentally different input streams that are processed in parallel before being combined through a gated cross-attention mechanism specifically designed to let each channel veto spurious patterns generated by the other.
The physical-informed channel receives a tensor of collocated meteorological variables measured at hub height and surrounding met-ocean buoys. These variables include wind speed, wind direction, atmospheric pressure, air temperature, relative humidity, significant wave height, and spectral wave period. Spatial-temporal features are extracted by a stack of convolutional long short-term memory layers whose recurrent kernels are regularised by discrete approximations of the momentum and continuity equations, ensuring that learned flow patterns remain divergence-free and momentum-conserving even under severe extrapolation. This physically constrained convolution forces the channel to ignore statistically salient but dynamically impossible transients, dramatically improving tail calibration during extreme ramp events.
The statistical residual channel processes only the historical aggregated power sequence of the entire wind farm. Six stacked temporal convolutional blocks with exponentially increasing dilation rates are arranged in a residual fashion, enabling an effective receptive field exceeding one hour while maintaining linear computational complexity with respect to sequence length. Each residual block contains weight-normalised temporal convolutions followed by gated linear unit activations, a design proven to mitigate the vanishing gradient problem and accelerate convergence in non-stationary time series. Because this channel observes only past power realisations, it specialises in learning turbine-level wake interactions and aggregate non-linear dynamics that are invisible to the physical channel.
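The receptive-field claim can be checked with simple arithmetic. For stacked dilated causal convolutions, the receptive field is one plus the sum over layers of (kernel size minus one) times the dilation. The kernel size, the number of convolutions per block, and the one-minute sampling interval below are assumptions, since the text does not state them.

```python
def receptive_field(num_blocks, kernel_size, convs_per_block=1,
                    dilation_base=2):
    """Receptive field (in time steps) of a stack of dilated convolutions."""
    dilations = [dilation_base ** i for i in range(num_blocks)]
    return 1 + convs_per_block * (kernel_size - 1) * sum(dilations)


# Six blocks with dilations 1, 2, 4, 8, 16, 32 and kernel size 2.
print(receptive_field(6, 2))  # 64 steps, i.e. 64 minutes at 1-minute data
```

Under these assumptions even the smallest kernel already covers slightly more than one hour; larger kernels or two convolutions per block widen the field further.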
Feature fusion occurs through a gated cross-attention module that computes attention weights separately for each time step and each channel, allowing the network to adaptively penalize spurious spatiotemporal representations during periods dominated by turbine control nonlinearities and to down-weight statistical artifacts when strong synoptic forcing is present. The multi-modal embedding is finally passed to a deep evidential regression head that parameterises a normal inverse-gamma distribution over future power values. The four evidential parameters (evidence for the mean, evidence for the variance, degrees of freedom, and scale) are predicted jointly, yielding a closed-form predictive distribution that naturally decomposes total uncertainty into aleatoric and epistemic components without requiring ensemble training or Monte Carlo dropout.
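For a normal inverse-gamma head, the uncertainty decomposition has a well-known closed form: the aleatoric part is the expected noise variance, scale / (shape - 1), and the epistemic part is the variance of the mean, scale / (evidence * (shape - 1)). The sketch below assumes the standard parameterisation from the deep evidential regression literature (location, mean evidence, shape greater than one, scale); the function and argument names are hypothetical.

```python
def evidential_summary(location, evidence, shape, scale):
    """Point forecast and uncertainty split for a normal inverse-gamma head."""
    if evidence <= 0.0 or shape <= 1.0 or scale <= 0.0:
        raise ValueError("need evidence > 0, shape > 1 and scale > 0")
    aleatoric = scale / (shape - 1.0)               # expected noise variance
    epistemic = scale / (evidence * (shape - 1.0))  # variance of the mean
    return {
        "mean": location,
        "aleatoric": aleatoric,
        "epistemic": epistemic,
        "total": aleatoric + epistemic,
    }


print(evidential_summary(0.6, 2.0, 3.0, 4.0))
```

The total equals the variance of the resulting Student-t predictive distribution, which is what makes the split available without ensembles or dropout sampling.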
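Formally, let the physical channel output at time $t$ be denoted $h_t^{\mathrm{phy}}$ and the statistical channel output $h_t^{\mathrm{stat}}$. The fused representation is obtained as
\[ g_t = \sigma\!\left(W_p h_t^{\mathrm{phy}} + W_s h_t^{\mathrm{stat}}\right), \qquad z_t = g_t \odot h_t^{\mathrm{phy}} + (1 - g_t) \odot h_t^{\mathrm{stat}}, \]
where $\sigma$ is the sigmoid activation, $W_p$ and $W_s$ are learned projection matrices, and ⊙ denotes element-wise multiplication. The evidential head then maps $z_t$ to the parameters $(\gamma_t, \nu_t, a_t, b_t)$, with $\nu_t > 0$, $a_t > 1$, and $b_t > 0$, such that the predictive distribution is
\[ p(y_t \mid z_t) = \mathrm{St}\!\left(y_t;\ \gamma_t,\ \frac{b_t(1+\nu_t)}{\nu_t a_t},\ 2a_t\right), \]
where $\gamma_t$ is the location, the scale is controlled jointly by $\nu_t$ and $b_t$, and $a_t$ governs tail thickness through the $2a_t$ degrees of freedom. Training minimises the negative log-likelihood augmented by a Kullback–Leibler regularisation term that penalises overconfident evidence whenever ground truth falls outside high-probability predictive intervals.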
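The gating step can be sketched in plain Python. This is an assumed simplification of the gated cross-attention module: a single sigmoid gate, computed from linear projections of both channel outputs, forms an element-wise convex combination of the two feature vectors. The names and the toy dimensions are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def gated_fusion(h_phys, h_stat, w_phys, w_stat):
    """Blend two channel outputs with an element-wise learned gate."""
    pre_activation = [p + s for p, s in
                      zip(matvec(w_phys, h_phys), matvec(w_stat, h_stat))]
    gate = [sigmoid(value) for value in pre_activation]
    # A gate near 1 trusts the physical channel, near 0 the statistical one.
    return [g * hp + (1.0 - g) * hs
            for g, hp, hs in zip(gate, h_phys, h_stat)]


zeros = [[0.0, 0.0], [0.0, 0.0]]
# With zero projections the gate is 0.5 and the result is a plain average.
print(gated_fusion([1.0, 3.0], [3.0, 1.0], zeros, zeros))  # [2.0, 2.0]
```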
By explicitly separating physical consistency from statistical flexibility and recombining them only under mutual supervision, DCR-Net produces probabilistic forecasts that simultaneously achieve state-of-the-art point accuracy and dramatically superior calibration and sharpness compared to all existing single-channel or naïvely hybridised architectures, providing the high-fidelity, minute-resolution uncertainty quantification that PredOpt-HS requires to make theoretically justified and practically robust storage configuration decisions.
To clarify the parameter updating mechanism of the forecasting module, it is important to note that the trainable parameters of the DCR-Net are optimized independently of the downstream multi-objective scheduling function. The network parameters $\theta$ are updated by minimizing a deep evidential loss function $\mathcal{L}(\theta)$, which is formulated as:
\[ \mathcal{L}(\theta) = \mathcal{L}^{\mathrm{NLL}}(y_t, m_t) + \lambda\, \mathcal{L}^{\mathrm{KL}}(y_t, m_t), \]
where $y_t$ is the ground-truth wind power, $m_t = (\gamma_t, \nu_t, a_t, b_t)$ represents the predicted evidential parameters (location, mean evidence, shape, and scale of the normal inverse-gamma distribution), and $\lambda$ is a dynamic annealing coefficient. The Negative Log-Likelihood (NLL) loss ensures the accuracy of the predictive distribution and is defined as:
\[ \mathcal{L}^{\mathrm{NLL}}(y_t, m_t) = \frac{1}{2}\log\frac{\pi}{\nu_t} - a_t \log \Omega_t + \left(a_t + \frac{1}{2}\right) \log\!\left((y_t - \gamma_t)^2 \nu_t + \Omega_t\right) + \log\frac{\Gamma(a_t)}{\Gamma\!\left(a_t + \frac{1}{2}\right)}, \qquad \Omega_t = 2 b_t (1 + \nu_t). \]
Meanwhile, the Kullback–Leibler (KL) divergence term acts as a regularizer to penalize overconfident evidence. This decoupled updating strategy ensures that the forecasting model preserves the objective physical and statistical integrity of meteorological dynamics without being biased by downstream operational scheduling targets.
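The negative log-likelihood of a normal inverse-gamma evidential head is the negative log-density of a Student-t distribution and can be written with log-gamma functions. The sketch below implements only this term, using the standard parameterisation from the deep evidential regression literature; the argument names are hypothetical, and the KL regulariser is omitted because its exact form is not given in the text.

```python
import math


def evidential_nll(y, location, evidence, shape, scale):
    """Negative log-likelihood of y under the evidential predictive density."""
    omega = 2.0 * scale * (1.0 + evidence)
    return (0.5 * math.log(math.pi / evidence)
            - shape * math.log(omega)
            + (shape + 0.5) * math.log(evidence * (y - location) ** 2 + omega)
            + math.lgamma(shape) - math.lgamma(shape + 0.5))


# The loss grows as the observation moves away from the predicted location.
near = evidential_nll(0.5, 0.5, 2.0, 3.0, 4.0)
far = evidential_nll(5.0, 0.5, 2.0, 3.0, 4.0)
print(near < far)  # True
```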