Robust Representation of Solar Photovoltaic Variability via Wasserstein Distributional Modeling

Liu, Andi; Liu, Mengqi; Li, Tairan; Feng, Liang; Xiao, Chuanliang

doi:10.3390/en19112665


by Andi Liu ¹, Mengqi Liu ¹, Tairan Li ¹, Liang Feng ²,* and Chuanliang Xiao ²

¹ State Grid Shandong Electric Power Company Jinan Power Supply Company, Jinan 250012, China

² School of Electrical and Electronic Engineering, Shandong University of Technology, Zibo 255000, China

* Author to whom correspondence should be addressed.

Energies 2026, 19(11), 2665; https://doi.org/10.3390/en19112665

Submission received: 2 May 2026 / Revised: 22 May 2026 / Accepted: 26 May 2026 / Published: 31 May 2026

(This article belongs to the Topic Toward Smart and Sustainable Energy Systems Enabled by Artificial Intelligence, Optimization, and Resilience)


Abstract

The increasing penetration of solar photovoltaic (PV) systems in modern distribution networks introduces significant variability, uncertainty, and spatiotemporal heterogeneity that challenge conventional data-driven modeling approaches. Existing methods predominantly rely on deterministic representations or simplified statistical summaries, which fail to capture the complex distributional structure of PV generation and its interaction with energy storage and environmental factors. To address this limitation, this paper proposes a distributionally robust data representation framework that models PV outputs as ambiguity sets of probability distributions rather than single trajectories. Leveraging Wasserstein metrics, the framework constructs data-driven uncertainty sets that explicitly encode temporal variability, cross-resource correlations, and distributional perturbations arising from weather dynamics and measurement noise. A unified modeling architecture is developed to integrate multi-source data, including PV generation, storage state-of-charge, and meteorological variables, and to extract robust statistical descriptors through worst-case expectation formulations. In addition, a scenario generation mechanism is designed to produce representative and extreme trajectories from the ambiguity sets, enabling enhanced coverage of rare but critical operating conditions such as rapid irradiance fluctuations. Wasserstein ambiguity sets are not treated as a new theory in this work; they are used as a representation layer for PV, energy storage system (ESS), meteorological, and load trajectories before downstream analysis. Extensive case studies on a modified IEEE 123-bus distribution system demonstrate that the proposed approach improves out-of-sample performance, reduces scenario-level standard deviation relative to deterministic representation in repeated-run evaluation, and maintains more stable error behavior under controlled distribution shifts. Furthermore, the framework reduces scenario requirements by up to 40–50% while preserving high approximation quality, indicating strong computational efficiency. The validation includes confidence intervals, variance and standard deviation definitions, ablation results, sensitivity checks, and repeatability details for the modified IEEE 123-bus test system.

Keywords: solar photovoltaic systems; distributional uncertainty; Wasserstein ambiguity set; robust data representation; scenario generation; extreme event modeling

1. Introduction

The rapid proliferation of distributed photovoltaic generation and energy storage systems has fundamentally reshaped the operational and analytical landscape of modern distribution networks [1,2,3]. Unlike traditional centralized generation, these distributed energy resources exhibit strong stochasticity driven by weather variability, pronounced heterogeneity across locations and user behaviors, and intricate spatiotemporal coupling through network interactions [4,5,6,7,8,9,10]. As a consequence, the data generated by such systems no longer conform to simple deterministic patterns but instead manifest as complex, evolving distributions characterized by nonstationarity, heavy tails, and cross-resource dependencies. These characteristics challenge conventional data processing pipelines, particularly in tasks such as clustering, aggregation, equivalent modeling, and optimization, all of which critically depend on the quality and robustness of underlying data representations.

Recent PV deployment makes this modeling problem more operationally relevant. IRENA reports that renewable power capacity increased by 692 GW in 2025 and that approximately three-quarters of the added renewable capacity came from solar energy, with about 511 GW of solar capacity added during the year [1]. IEA analysis projects almost 4600 GW of renewable power additions between 2025 and 2030, with utility-scale and distributed solar PV representing nearly 80 percent of the worldwide renewable electricity capacity expansion [2]. China is a particularly important case; the IEA PVPS national survey reports that China added 277.57 GWAC of new PV capacity in 2024 and reached about 886 GWAC of cumulative installed PV capacity by the end of 2024 [3]. These deployment levels mean that PV variability is no longer a small perturbation around conventional feeder operation. It can become a dominant driver of net load ramps, local voltage rise, reverse power flow, storage charging behavior, and operational reserve requirements in medium-voltage and low-voltage networks.

The same transition is visible in adjacent power and energy system studies. Recent resilience research emphasizes that converter-dominated grids require system-level modeling of interdependent failures, operating margins, and restoration behavior [4,5]. Digital twin, cyber-physical, and collective intelligence studies similarly show that high-DER networks increasingly rely on data-rich representation layers rather than isolated deterministic component models [6,7,8,9]. Virtual power plant construction, multi-energy scheduling, hydrogen production, and solar-powered electric vehicle integration further demonstrate that distributed resources must be modeled with interaction uncertainty, recovery dynamics, cross-carrier coupling, and transport–energy coordination [10,11,12,13,14,15,16]. Frequency support aggregation, safe dispatch, hydropower storage valuation, operational risk assessment, and model-free voltage calculation provide complementary evidence that modern power system operation depends on uncertainty-aware representations before optimization or control decisions are made [17,18,19,20,21]. Recent data-driven forecasting, reinforcement learning, voltage control, water–energy, V2X, distributionally robust planning, and green hydrogen studies reinforce the same point—high-resolution measurements are valuable only when the representation preserves temporal dynamics, coupling, uncertainty, and rare-event behavior [22,23,24,25,26,27,28,29].

Early research in this area largely adopted deterministic modeling paradigms, where photovoltaic outputs and storage states were represented as fixed time series derived from historical measurements, typical meteorological profiles, or simplified physical equations [30,31,32]. While such approaches provided computational tractability and interpretability, they inherently overlooked the uncertainty and variability embedded in the data. As distribution systems became increasingly dynamic and decentralized, the limitations of deterministic representations became more pronounced, particularly when models were exposed to unseen conditions such as extreme weather events or abrupt changes in user behavior [33,34]. These shortcomings motivated the transition toward probabilistic modeling frameworks, where uncertainty was explicitly characterized using statistical distributions [35,36,37].

The limitation of deterministic, empirical, and semi-empirical PV models should nevertheless be interpreted according to the problem being solved. These models remain useful for development-oriented tasks such as preliminary sizing, annual energy yield estimation, equipment comparison, representative-day studies, and early hosting–capacity screening. A deterministic irradiance-to-power model can be sufficient when the objective is to estimate average production, and empirical representative days can be appropriate when long-term expected behavior is the primary concern. The difficulty arises in short-horizon operation of distribution networks with high PV penetration and storage. In that setting, the operator needs to preserve ramp events, cross-node dependence, measurement perturbations, inverter clipping, storage state-of-charge dynamics, and distributional shifts. A profile that is accurate in annual energy terms can still be unsafe for a 15 min operational action if it suppresses an abrupt cloud front event or a correlated storage response. This paper therefore uses physical and semi-empirical equations as interpretability anchors while representing the remaining uncertainty through ambiguity sets.

Typical meteorological year data are also widely used in PV energy system development because they summarize long-term climate conditions into a representative annual sequence [32]. TMY data are appropriate for annual production assessment, site comparison, and design-stage evaluation. Reliability and sustainability studies may additionally extract pessimistic and optimistic years to stress low-irradiance, high-temperature, high-generation, or curtailment-prone conditions. However, a typical year or an extreme year remains one selected trajectory rather than a distribution of plausible operating conditions. It does not by itself describe sub-hourly ramps, residual spatial dependence, or measurement uncertainty. The present work therefore treats weather-driven PV output as a family of distributions close to empirical trajectories rather than as a single meteorological sequence.

Probabilistic approaches have introduced a richer representation of DER behavior by leveraging tools such as Gaussian processes, copula-based dependency models, and nonparametric density estimation. These methods have enabled the capture of variability and provided a foundation for stochastic optimization and risk-aware decision-making [31,32,33,34,35,36,37]. Complementing these efforts, scenario-based modeling techniques have been developed to generate representative realizations of uncertain variables, often through Monte Carlo sampling or clustering-based reduction. Although these approaches have improved the ability to incorporate uncertainty into downstream analysis, they remain highly dependent on historical data quality and often lack systematic mechanisms to account for distributional ambiguity, particularly in the presence of limited observations or structural shifts [38,39,40,41,42]. In parallel, the increasing availability of high-resolution data has led to the widespread adoption of machine learning techniques for modeling distributed energy systems [22,23,24,26,27]. Methods ranging from classical regression and support vector machines to deep neural networks and graph-based architectures have been employed to forecast generation, estimate system states, and capture spatial correlations. Despite their strong predictive performance, these models are typically trained under empirical risk minimization principles and therefore remain sensitive to distributional shifts and rare events. This limitation is particularly critical in power system applications, where reliability and robustness are paramount and where unseen conditions can have significant operational consequences [38,39].

Table 1 summarizes the modeling scope considered in this work. It distinguishes development-oriented uses of classical PV models from operational uncertainty modeling in PV-rich feeders, where sub-hourly ramps, dependence, and tail behavior must be preserved.

To address these challenges, robust optimization and stochastic programming frameworks have been introduced to explicitly account for uncertainty in decision-making processes [35,36,37,38,39]. Robust optimization constructs uncertainty sets and seeks solutions that perform well under worst-case realizations, while stochastic programming optimizes expected performance over probabilistic scenarios [35,36,37]. More recently, distributionally robust optimization has emerged as a unifying framework that integrates statistical information with worst-case guarantees by constructing ambiguity sets around empirical distributions. In particular, Wasserstein-based DRO formulations have gained prominence due to their strong theoretical properties and tractability, enabling data-driven uncertainty modeling in a wide range of energy system applications [40,41,42,43,44].

However, a critical observation across the literature is that uncertainty is predominantly addressed at the decision-making stage, while the data representation layer remains largely deterministic or weakly probabilistic. This separation creates a fundamental inconsistency, as downstream optimization models attempt to handle uncertainty that has already been partially discarded or distorted during data preprocessing. Consequently, even advanced robust optimization techniques may inherit instability and bias from upstream representations, limiting their effectiveness in capturing true system behavior. At the same time, existing feature extraction and dimensionality reduction techniques, including principal component analysis, autoencoders, and manifold learning, focus primarily on reconstructing observed data rather than ensuring robustness under distributional perturbations. Similarly, scenario generation methods often lack a principled connection to underlying uncertainty structures, leading to potential mismatches between generated scenarios and actual system dynamics. Against this backdrop, there is a growing need to rethink the role of data representation in distributed energy systems by treating uncertainty as an intrinsic property rather than an external factor. This paper advances such a perspective by developing a distributionally robust data representation framework for distributed photovoltaic and energy storage systems. Instead of representing each resource as a single deterministic trajectory, the proposed approach characterizes it as an ambiguity set of probability distributions centered around empirical observations. This representation leverages Wasserstein metrics to quantify distributional proximity, enabling the construction of flexible yet theoretically grounded ambiguity sets that capture both statistical variability and structural uncertainty.

Wasserstein ambiguity sets are an established framework in robust optimization and data-driven decision-making, and the present work does not claim the Wasserstein metric itself as a new paradigm. The contribution is the placement of the ambiguity set at the data representation layer for PV, ESS, meteorological, and load trajectories, so that variability is not discarded before downstream optimization or scenario analysis.

Building upon this representation, the framework integrates physical system constraints, statistical consistency conditions, and robust feature extraction into a unified modeling structure. A worst-case expectation objective is formulated over the ambiguity sets, ensuring that extracted features remain stable under distributional perturbations. The model further incorporates constraints that preserve moment information, enforce dynamic feasibility, and capture cross-resource dependencies, thereby maintaining consistency with both statistical and physical properties of the system. In addition, a scenario generation mechanism is developed to sample representative trajectories from the ambiguity sets, enabling the inclusion of both typical and extreme operating conditions in subsequent analysis. Through dual reformulation and tractable approximations, the resulting optimization problem can be efficiently solved using finite data, bridging the gap between theoretical robustness and practical applicability. What distinguishes this work is the integration of uncertainty modeling directly into the data representation layer, thereby unifying representation, feature extraction, and scenario generation within a single distributionally robust framework. This approach ensures that uncertainty is consistently propagated throughout the analytical pipeline, improving stability and generalization in downstream tasks. Furthermore, the framework provides theoretical guarantees on out-of-sample performance, linking the quality of representation to the reliability of subsequent decision-making processes.

Contributions. The contributions of this paper are as follows. First, a constrained Wasserstein DRO representation is formulated for distributed PV, ESS, meteorological, and load trajectories, with the ambiguity set used before downstream optimization. Second, the physical modeling layer specifies PV irradiance–temperature conversion, ESS state-of-charge dynamics with an explicit 15 min time step, and a radial active power balance assumption. Third, a bounded residual spatial coupling procedure is introduced so that the coefficients linking neighboring PV units are estimated from training data and constrained by individual and row sum limits. Fourth, robust feature extraction and scenario generation are linked through the same ambiguity structure while an explicit redundancy and strong-link filtering step is applied before final scenario reduction. Fifth, reproducibility is strengthened through notation, units, hyperparameter rules, a tabular IEEE 123-bus feeder scheme, data-split information, benchmark definitions, confidence intervals, ablation studies, and sensitivity checks.

2. Mathematical Modeling

To rigorously characterize the distributionally robust representation of solar photovoltaic systems under uncertainty, we develop a comprehensive mathematical modeling framework that integrates probabilistic ambiguity, temporal dynamics, and multi-source data interactions within a unified structure. Unlike conventional formulations that rely on deterministic trajectories or simplified statistical summaries, the proposed model explicitly treats photovoltaic generation and energy storage behavior as elements of ambiguity sets defined over probability distributions. This perspective enables the direct incorporation of stochastic variability arising from weather fluctuations, measurement noise, and user-driven storage operations. In particular, Wasserstein metrics are employed to construct data-driven ambiguity sets around empirical observations, providing a principled mechanism to quantify distributional deviations while preserving the geometric structure of the underlying data.

Figure 1 presents the overall architecture of the proposed framework, organized into four tightly coupled layers that transform raw multi-source data into robust, decision-ready representations.

Data Structure, Notation, and Hyperparameters

Explicit notation and hyperparameter definitions are given before the mathematical formulation. Physical equations are evaluated in engineering units, while Wasserstein distances are computed after training set standardization so that variables with large physical units, such as irradiance, do not dominate the transport metric.
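The standardization step described above can be sketched as follows. This is a minimal illustration that assumes z-score scaling with statistics fitted on the training split only; the source does not name the exact scaler, and the function names and sample values are hypothetical.

```python
import numpy as np

def fit_standardizer(train):
    """Return per-variable mean and std computed on the training set only."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    std = np.where(std > 0, std, 1.0)  # guard against constant columns
    return mean, std

def standardize(x, mean, std):
    """Map engineering-unit values to dimensionless z-scores."""
    return (x - mean) / std

# Hypothetical samples: columns are irradiance (W/m^2) and PV power (MW).
train = np.array([[200.0, 0.10],
                  [600.0, 0.30],
                  [1000.0, 0.50]])
mean, std = fit_standardizer(train)
z_train = standardize(train, mean, std)
```

After scaling, both columns have unit spread, so a squared Euclidean transport cost no longer weights irradiance far more heavily than power merely because of its units.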

\begin{matrix} min_{θ, ψ} \sum_{i \in N} sup_{Q_{i} \in P_{i}} E_{Q_{i}} [ℓ_{i} (ξ_{i}; θ, ψ)] + \sum_{i \in N} β_{i} {∥ θ_{i} ∥}_{2}^{2} \end{matrix}

(1)

Equation (1) uses an explicit min–sup structure. The representation parameters are optimized by the outer minimization, whereas the adversarial distribution is selected inside the Wasserstein ambiguity set. This separates the learned representation from the worst-case distributional perturbation. The loss term includes trajectory deviation, temporal gradient inconsistency, and covariance-weighted structural distortion, while the admissible distributions are defined by the constrained ambiguity set in Equation (2).

\begin{matrix} P_{i} = \{Q_{i} \in M (Ω_{i}) | W_{2} (Q_{i}, {\hat{P}}_{i}) \leq ε_{i}, \int_{Ω_{i}} d Q_{i} = 1, \int_{Ω_{i}} ξ_{i} d Q_{i} = μ_{i}^{(0)}\} \end{matrix}

(2)

Equation (2) characterizes the admissible ambiguity set for each resource by restricting the candidate distributions to a Wasserstein ball of radius ε_{i} centered at the empirical distribution, while simultaneously enforcing normalization and mean consistency constraints, such that the feasible distributions preserve first-order statistical structure while allowing controlled deviations to capture uncertainty.

\begin{matrix} W_{2}^{2} (Q_{i}, {\hat{P}}_{i}) = inf_{π_{i} \in Π (Q_{i}, {\hat{P}}_{i})} \int_{Ω_{i} \times Ω_{i}} {∥ξ_{i} - ζ_{i}∥}_{2}^{2} d π_{i} (ξ_{i}, ζ_{i}) \end{matrix}

(3)

Equation (3) defines the Wasserstein-2 distance between the candidate distribution and the empirical distribution through an optimal transport problem over joint couplings π_{i}, where the integral quantifies the minimal transportation cost required to transform one distribution into the other in the trajectory space.
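To make the transport cost in Equation (3) concrete, the following sketch evaluates the Wasserstein-2 distance in a deliberately simplified setting: two scalar samples of equal size with uniform weights, where the optimal coupling is known to pair sorted values. The trajectory-space problem in the paper is multi-dimensional and needs a general optimal transport solver; the function name and data here are illustrative assumptions.

```python
import numpy as np

def w2_empirical_1d(x, y):
    """W2 distance between two equal-size, uniformly weighted 1-D samples.

    In one dimension the optimal coupling matches order statistics, so the
    squared distance is the mean squared gap between sorted values.
    """
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    if x.shape != y.shape:
        raise ValueError("this sketch requires equal sample sizes")
    return float(np.sqrt(np.mean((x - y) ** 2)))

# Hypothetical normalized PV samples; b is a shifted by a constant 0.5.
a = np.array([0.0, 1.0, 2.0])
b = a + 0.5
d_shift = w2_empirical_1d(a, b)        # a pure shift costs exactly the shift
d_perm = w2_empirical_1d(a, a[::-1])   # reordering a sample costs nothing
```

A pure shift moves every unit of mass by the same amount, which is why the distance equals the shift, while permuting a sample leaves its distribution unchanged.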

\begin{matrix} T_{i, t}^{cell} & = T_{i, t}^{amb} + \frac{{NOCT}_{i} - 20}{800} G_{i, t}, \\ {\tilde{p}}_{i, t}^{PV} & = η_{i}^{inv} {\bar{P}}_{i}^{PV} \frac{G_{i, t}}{G^{STC}} [1 + γ_{i} (T_{i, t}^{cell} - T^{STC})], \\ p_{i, t}^{PV} & = min \{{\bar{P}}_{i}^{PV}, max (0, {\tilde{p}}_{i, t}^{PV} + r_{i, t}^{PV} + \sum_{j \in N_{i}} ω_{i j} r_{j, t}^{PV})\} . \end{matrix}

(4)

Equation (4) separates physical PV conversion from residual uncertainty. Irradiance and cell temperature are first converted into electrical power through rated capacity, inverter efficiency, standard test irradiance, and temperature correction. Spatial coupling is applied only to residual PV deviations rather than to the full neighboring PV output, which avoids double-counting the common diurnal solar trend. The coefficients are estimated from training data and bounded as described below.
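A minimal sketch of Equation (4) is given below. The default parameter values (NOCT of 45 °C, inverter efficiency 0.97, temperature coefficient −0.004 per °C, 1000 W/m² and 25 °C for standard test conditions) are common illustrative choices and are assumptions here, not values taken from the paper.

```python
import numpy as np

def pv_power(G, T_amb, P_rated, noct=45.0, eta_inv=0.97, gamma=-0.004,
             G_stc=1000.0, T_stc=25.0, residual=0.0, coupled_residual=0.0):
    """PV output following the three lines of Equation (4).

    G: irradiance (W/m^2); T_amb: ambient temperature (deg C);
    residual: own residual r_i; coupled_residual: sum_j omega_ij * r_j.
    """
    # Cell temperature from the NOCT relation.
    T_cell = T_amb + (noct - 20.0) / 800.0 * G
    # Physical conversion with linear temperature correction.
    p_tilde = eta_inv * P_rated * G / G_stc * (1.0 + gamma * (T_cell - T_stc))
    # Add residual terms, then clip to [0, rated capacity].
    return np.minimum(P_rated, np.maximum(0.0, p_tilde + residual + coupled_residual))

# Hypothetical 100 kW unit at 800 W/m^2 and 20 deg C ambient.
p_nominal = pv_power(800.0, 20.0, 100.0)
p_night = pv_power(0.0, 10.0, 100.0, residual=-1.0)    # clipped at zero
p_clipped = pv_power(1000.0, 25.0, 100.0, residual=50.0)  # clipped at rating
```

Keeping the residual terms outside the physical conversion mirrors the separation described above: the deterministic part stays interpretable, and only the deviation from it carries the uncertainty.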

\begin{matrix} {\tilde{ω}}_{i j} & = exp (- d_{i j}^{el} / d_{0}) max \{0, corr (r_{i}^{PV}, r_{j}^{PV})\} \mathbf{1} \{d_{i j}^{el} \leq d_{max}\}, \\ ω_{i j} & = min \{ω_{max}, \frac{ρ_{max} {\tilde{ω}}_{i j}}{\sum_{k \in N_{i}} {\tilde{ω}}_{i k} + 10^{- 8}}\}, 0 \leq ω_{i j} \leq ω_{max}, \sum_{j \in N_{i}} ω_{i j} \leq ρ_{max} . \end{matrix}

(4a)

The coefficient rule prevents the spatial-coupling terms from becoming arbitrary unbounded tuning factors. Electrical distance prevents remote nodes from being linked solely because they share a daily PV profile, and positive residual correlation captures localized cloud field or measurement effects after the physical PV component has been removed.
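The coefficient rule in Equation (4a) can be sketched as below. The sketch assumes that the neighbor set of a unit is every other unit within the electrical distance limit, and all numerical settings and the synthetic residuals are hypothetical.

```python
import numpy as np

def coupling_weights(residuals, d_el, d0, d_max, omega_max, rho_max):
    """Bounded residual coupling coefficients in the spirit of Equation (4a).

    residuals: (T, n) array of PV residuals; d_el: (n, n) electrical distances.
    """
    corr = np.corrcoef(residuals, rowvar=False)
    # Distance decay * positive residual correlation * distance cutoff.
    raw = np.exp(-d_el / d0) * np.maximum(0.0, corr) * (d_el <= d_max)
    np.fill_diagonal(raw, 0.0)  # assumption: a unit is not its own neighbor
    row_sum = raw.sum(axis=1, keepdims=True)
    # Row normalization to rho_max, then the individual cap omega_max.
    return np.minimum(omega_max, rho_max * raw / (row_sum + 1e-8))

# Synthetic residuals sharing a common component, so correlations are positive.
rng = np.random.default_rng(0)
common = rng.normal(size=(200, 1))
residuals = common + 0.5 * rng.normal(size=(200, 3))
d_el = np.array([[0.0, 1.0, 4.0],
                 [1.0, 0.0, 2.0],
                 [4.0, 2.0, 0.0]])
omega = coupling_weights(residuals, d_el, d0=2.0, d_max=3.0,
                         omega_max=0.3, rho_max=0.5)
```

Because the cap is applied after normalization, it can only lower a coefficient, so the row-sum limit still holds once the individual limit is enforced.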

\begin{matrix} e_{i, t + 1}^{ESS} & = (1 - α_{i}^{loss} Δ t) e_{i, t}^{ESS} + η_{i}^{ch} p_{i, t}^{ch} Δ t - \frac{p_{i, t}^{dis}}{η_{i}^{dis}} Δ t + ϵ_{i, t}^{ESS}, \\ {\underline{e}}_{i} & \leq e_{i, t}^{ESS} \leq {\bar{e}}_{i}, 0 \leq p_{i, t}^{ch} \leq {\bar{p}}_{i}^{ch}, 0 \leq p_{i, t}^{dis} \leq {\bar{p}}_{i}^{dis} . \end{matrix}

(5)

Equation (5) includes the sampling interval, charge and discharge efficiencies, self-discharge, stored-energy bounds, and power limits. The implementation prevents simultaneous charge and discharge through either a binary operating state variable in dispatch or a penalty relaxation in continuous scenario evaluation.
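A single step of the storage recursion in Equation (5) can be sketched as follows, with the 15 min interval expressed as 0.25 h. The efficiencies, limits, and the choice to raise an error on simultaneous charge and discharge (instead of the binary variable or penalty relaxation mentioned above) are illustrative assumptions.

```python
def ess_step(e, p_ch, p_dis, dt=0.25, eta_ch=0.95, eta_dis=0.95,
             alpha_loss=0.0, e_min=0.0, e_max=1.0,
             p_ch_max=0.5, p_dis_max=0.5, noise=0.0):
    """Advance stored energy (MWh) by one step of length dt (hours)."""
    if not (0.0 <= p_ch <= p_ch_max and 0.0 <= p_dis <= p_dis_max):
        raise ValueError("power limit violated")
    if p_ch > 0.0 and p_dis > 0.0:
        raise ValueError("simultaneous charge and discharge")
    e_next = ((1.0 - alpha_loss * dt) * e
              + eta_ch * p_ch * dt      # energy stored after charging losses
              - p_dis / eta_dis * dt    # energy drawn to deliver p_dis
              + noise)
    if not (e_min <= e_next <= e_max):
        raise ValueError("stored-energy bound violated")
    return e_next

# Hypothetical 1 MWh unit starting half full.
e_after_charge = ess_step(0.5, p_ch=0.4, p_dis=0.0)
e_after_discharge = ess_step(0.5, p_ch=0.0, p_dis=0.38)
```

The asymmetry between the two efficiency terms is the point of the example: charging 0.4 MW for 15 min stores 0.095 MWh, whereas delivering 0.38 MW for 15 min removes 0.1 MWh.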

\begin{matrix} P_{π (i), i, t} - \sum_{k \in C (i)} P_{i, k, t} = p_{i, t}^{L} + p_{i, t}^{ch} - p_{i, t}^{PV} - p_{i, t}^{dis} + ℓ_{i, t}^{loss} + ϵ_{i, t}^{net} . \end{matrix}

(6)

Equation (6) is written as a radial, loss-linearized active power balance. The notation uses the parent node π (i) and child node set C (i), consistent with radial distribution and branch-flow modeling practice [45,46]. This is a positive-sequence approximation for uncertainty representation comparison rather than a full unbalanced three-phase power flow model.

\begin{matrix} \int_{Ω_{i}} ξ_{i} ξ_{i}^{⊤} d Q_{i} (ξ_{i}) - μ_{i}^{(0)} μ_{i}^{(0) ⊤} ⪯ Σ_{i}^{(0)} + Δ_{i} \end{matrix}

(7)

Through the introduction of a second-order moment dominance condition, the probabilistic representation is constrained to preserve bounded covariance structure relative to empirical statistics, where the deviation matrix Δ_{i} captures allowable uncertainty inflation, thereby preventing unrealistic dispersion while still enabling flexibility in modeling stochastic variability.

\begin{matrix} \int_{Ω_{i}} (\frac{\partial ξ_{i}}{\partial τ}) {(\frac{\partial ξ_{i}}{\partial τ})}^{⊤} d Q_{i} (ξ_{i}) ⪯ Γ_{i} + λ_{i} I \end{matrix}

(8)

By enforcing an upper bound on the expected temporal gradient covariance, the formulation ensures that the dynamic variability of trajectories remains controlled, where Γ_{i} represents nominal gradient statistics and λ_{i} I introduces isotropic slack to accommodate unforeseen fluctuations.

\begin{matrix} \sum_{i \in N} \sum_{j \in N} \int_{Ω_{i}} \int_{Ω_{j}} {(ξ_{i}^{⊤} Υ_{i j} ξ_{j} - χ_{i j})}^{2} d Q_{i} d Q_{j} \leq Ψ \end{matrix}

(9)

A global cross-resource coupling constraint is imposed to regulate pairwise statistical dependencies across distributed units, where the matrices Υ_{i j} encode interaction structure and χ_{i j} denotes target correlation levels, ensuring that the joint representation maintains realistic interdependencies across the network.

\begin{matrix} \int_{Ω_{i}} exp (ω_{i}^{⊤} ξ_{i} + ϕ_{i}^{⊤} \nabla_{τ} ξ_{i}) d Q_{i} (ξ_{i}) \leq ζ_{i} \end{matrix}

(10)

Incorporating an exponential moment constraint introduces control over tail behavior of the distribution, where the exponential mapping captures higher-order risk sensitivity and the bound ζ_{i} ensures that extreme deviations remain probabilistically contained.

\begin{matrix} ξ_{i, t} \in \{ξ | {\underline{ξ}}_{i, t} ⪯ ξ ⪯ {\bar{ξ}}_{i, t}, A_{i} ξ \leq b_{i}\} \end{matrix}

(11)

Feasible trajectory realizations are restricted within a polyhedral support set defined by lower and upper bounds together with linear inequalities, thereby guaranteeing that all sampled states satisfy physical limits and operational constraints intrinsic to distributed energy systems.
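A membership test for the polyhedral support in Equation (11) can be sketched as below. The example encodes a hypothetical ramp limit of 0.3 per unit per step as the linear inequalities; the paper does not state which constraints populate A_{i} and b_{i}, so this choice is an assumption.

```python
import numpy as np

def in_support(xi, lower, upper, A, b, tol=1e-9):
    """Check box bounds and linear inequalities A @ xi <= b."""
    box_ok = np.all(xi >= lower - tol) and np.all(xi <= upper + tol)
    lin_ok = np.all(A @ xi <= b + tol)
    return bool(box_ok and lin_ok)

# Three-step per-unit PV trajectory with bounds [0, 1].
lower, upper = np.zeros(3), np.ones(3)
# Rows encode +/- (xi[t+1] - xi[t]) <= 0.3 for t = 0, 1 (hypothetical ramp limit).
A = np.array([[-1.0, 1.0, 0.0],
              [1.0, -1.0, 0.0],
              [0.0, -1.0, 1.0],
              [0.0, 1.0, -1.0]])
b = np.full(4, 0.3)

ok_smooth = in_support(np.array([0.2, 0.4, 0.6]), lower, upper, A, b)
ok_ramp = in_support(np.array([0.2, 0.8, 0.6]), lower, upper, A, b)
ok_box = in_support(np.array([0.9, 1.1, 1.0]), lower, upper, A, b)
```

The second trajectory stays inside the box but jumps by 0.6 in one step, and the third respects the ramp limit but exceeds the upper bound, so each is rejected by a different part of the support definition.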

\begin{matrix} \int_{Ω_{i}} {(ξ_{i, t + 1} - ξ_{i, t} - Ψ_{i} (ξ_{i, t}))}^{2} d Q_{i} (ξ_{i}) \leq ϵ_{i}^{dyn} \end{matrix}

(12)

A dynamic consistency condition is enforced by penalizing deviations from an underlying nonlinear transition mapping Ψ_{i} (\cdot), ensuring that the stochastic trajectories adhere to physically meaningful evolution patterns across time.

\begin{matrix} \int_{Ω_{i}} \int_{Ω_{i}} {∥ξ_{i} - ζ_{i}∥}_{2}^{2} d Q_{i} (ξ_{i}) d Q_{i} (ζ_{i}) \leq ϑ_{i} \end{matrix}

(13)

Internal dispersion within each distribution is bounded through a self-coupled variance constraint, where the pairwise distance between samples is controlled by ϑ_{i}, thereby preventing overly spread distributions that could destabilize downstream analytical tasks.
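One way to read Equation (13) is through a standard identity: for two independent draws from the same distribution, the expected squared distance equals twice the trace of the covariance matrix, so the bound acts as a cap on total variance. The sketch below checks this numerically for an empirical distribution; the synthetic data and function name are illustrative.

```python
import numpy as np

def pairwise_dispersion(samples):
    """Mean squared distance over all ordered sample pairs (including self-pairs)."""
    diff = samples[:, None, :] - samples[None, :, :]
    return float(np.mean(np.sum(diff ** 2, axis=-1)))

rng = np.random.default_rng(0)
samples = rng.normal(size=(50, 4))  # 50 hypothetical trajectories of length 4

lhs = pairwise_dispersion(samples)
# bias=True gives the population covariance of the empirical distribution.
rhs = 2.0 * np.trace(np.cov(samples, rowvar=False, bias=True))
```

Under this reading, choosing ϑ_{i} is equivalent to choosing an upper limit of ϑ_{i} / 2 on the summed per-coordinate variances.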

\begin{matrix} inf_{λ_{i} \geq 0} sup_{Q_{i} \in M (Ω_{i})} \{\int_{Ω_{i}} ϕ_{i} (ξ_{i}) d Q_{i} (ξ_{i}) - λ_{i} (W_{2}^{2} (Q_{i}, {\hat{P}}_{i}) - ε_{i}^{2})\} \end{matrix}

(14)

By reformulating the worst-case expectation through a Lagrangian relaxation, the original distributionally robust problem is converted into a saddle-point structure in which the dual variable λ_{i} penalizes violations of the Wasserstein ambiguity radius, thereby enabling tractable analysis of the inner maximization over probability measures while preserving the robustness guarantees.

\begin{matrix} inf_{λ_{i} \geq 0} \{λ_{i} ε_{i}^{2} + \int_{Ω_{i}} sup_{ξ_{i} \in Ω_{i}} (ϕ_{i} (ξ_{i}) - λ_{i} {∥ξ_{i} - {\hat{ξ}}_{i}∥}_{2}^{2}) d {\hat{P}}_{i}\} \end{matrix}

(15)

Through duality theory, the inner distributional optimization admits an equivalent reformulation in which the supremum is taken pointwise over the support Ω_{i} for each empirical sample {\hat{ξ}}_{i}, and the Wasserstein penalty appears as a quadratic regularizer, thereby transforming the infinite-dimensional optimization over probability measures into a finite-sample problem.

\begin{matrix} sup_{ξ_{i} \in Ω_{i}} (ϕ_{i} (ξ_{i}) - λ_{i} {∥ξ_{i} - {\hat{ξ}}_{i}∥}_{2}^{2}) \approx ϕ_{i} ({\hat{ξ}}_{i}) + \frac{1}{4 λ_{i}} {∥\nabla ϕ_{i} ({\hat{ξ}}_{i})∥}_{2}^{2} \end{matrix}

(16)

Exploiting smoothness conditions on the feature mapping, the inner supremum admits a closed-form approximation expressed through gradient norms, which significantly reduces computational complexity while retaining sensitivity to local curvature of the feature function.

Equation (16) is a local approximation rather than an exact identity for arbitrary nonlinear losses. It is used only when the normalized support is compact, the loss is differentiable, the gradient is Lipschitz continuous, the dual penalty dominates local curvature, and the local adversarial point remains inside the physical support. If these checks fail, the bounded inner supremum is solved numerically over the admissible support.
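The local nature of Equation (16) can be illustrated with two test functions, assuming the support constraint is not binding so that the inner supremum is unconstrained. For a linear function the gradient formula is exact; for a quadratic function with curvature κ it is a first-order truncation that underestimates the true supremum. The functions and numbers below are hypothetical.

```python
import numpy as np

def dual_inner_approx(phi, grad_phi, xi_hat, lam):
    """Gradient-based value phi(xi_hat) + ||grad phi(xi_hat)||^2 / (4 lam)."""
    g = grad_phi(xi_hat)
    return float(phi(xi_hat) + g @ g / (4.0 * lam))

xi_hat = np.array([0.5, 0.25])
lam = 2.0

# Linear case: the maximizer is xi_hat + a / (2 lam).
a = np.array([1.0, -2.0])
lin = lambda xi: a @ xi
xi_star = xi_hat + a / (2.0 * lam)
exact_lin = float(lin(xi_star) - lam * np.sum((xi_star - xi_hat) ** 2))
approx_lin = dual_inner_approx(lin, lambda xi: a, xi_hat, lam)

# Quadratic case phi = 0.5 * kappa * ||xi||^2, valid only when 2 lam > kappa.
kappa = 1.0
quad = lambda xi: 0.5 * kappa * (xi @ xi)
xi_star_q = 2.0 * lam * xi_hat / (2.0 * lam - kappa)
exact_quad = float(quad(xi_star_q) - lam * np.sum((xi_star_q - xi_hat) ** 2))
approx_quad = dual_inner_approx(quad, lambda xi: kappa * xi, xi_hat, lam)
```

The quadratic case also shows why the dual penalty must dominate local curvature: as 2λ approaches κ the exact supremum grows without bound while the gradient formula stays finite.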

Table 2 summarizes the validity checks used before applying the finite-sample dual approximation and identifies the corresponding fallback treatment when these checks are not satisfied.

3. Distributionally Robust Feature Extraction and Scenario Generation

Building upon the mathematical formulation, we develop a systematic computational methodology to operationalize the proposed distributionally robust representation framework in a tractable and scalable manner. The methodology is designed to translate the abstract ambiguity-based model into implementable procedures that can efficiently handle high-dimensional, multi-source data arising in distributed photovoltaic systems. Central to this approach is the evaluation of worst-case expectations over Wasserstein ambiguity sets, which enables the extraction of robust statistical descriptors that remain stable under distributional perturbations. These descriptors include not only first- and second-order moments, but also higher-level features such as temporal gradients and cross-resource interaction patterns, all computed in a distributionally robust manner.

Algorithmic quantities, scenario-filtering rules, and implementation timing are grouped in Table 3 so that the computational procedure can be followed without interrupting the mathematical development with separate incremental tables.

\begin{matrix} μ_{i}^{rob} = sup_{Q_{i} \in P_{i}} \int_{Ω_{i}} ξ_{i} d Q_{i} (ξ_{i}) = {\hat{μ}}_{i} + ε_{i} \frac{Σ_{i}^{1 / 2} α_{i}}{∥ α_{i} ∥_{2}} \end{matrix}

(17)

A distributionally robust mean estimator is derived by shifting the empirical mean along the direction of maximum sensitivity, where the perturbation magnitude is proportional to the Wasserstein radius and shaped by the covariance structure, thus capturing worst-case bias under ambiguity.

\begin{matrix} Σ_{i}^{rob} = sup_{Q_{i} \in P_{i}} \int_{Ω_{i}} (ξ_{i} - μ_{i}) {(ξ_{i} - μ_{i})}^{⊤} d Q_{i} = {\hat{Σ}}_{i} + ε_{i} Ψ_{i} \end{matrix}

(18)

Robust covariance estimation is achieved by inflating the empirical covariance matrix with an uncertainty-dependent correction term Ψ_{i}, ensuring that second-order variability is conservatively captured under distributional perturbations.

\begin{matrix} ρ_{i j}^{rob} = sup_{Q_{i}, Q_{j}} \frac{\int ξ_{i}^{⊤} ξ_{j} d Q_{i} d Q_{j} - μ_{i}^{⊤} μ_{j}}{\sqrt{tr (Σ_{i}) tr (Σ_{j})}} \end{matrix}

(19)

A worst-case correlation descriptor is constructed by maximizing cross-moments under ambiguity, thereby providing a robust measure of interdependence between distributed resources that remains valid even under joint distributional shifts.

\begin{matrix} Q_{i}^{(s)} \sim arg max_{Q_{i} \in P_{i}} \int_{Ω_{i}} ω_{s} (ξ_{i}) d Q_{i} (ξ_{i}), s \in S \end{matrix}

(20)

A scenario generation mechanism is established by sampling extremal distributions that maximize scenario-specific utility functions ω_{s} (\cdot), thereby producing representative trajectories that reflect worst-case realizations and enrich the dataset for downstream analytical tasks.

\begin{matrix} {\hat{ε}}_{i} = inf_{ε_{i} \geq 0} \{ε_{i} | P (W_{2} ({\hat{P}}_{i}, P_{i}^{★}) \leq ε_{i}) \geq 1 - α_{i}\} \end{matrix}

(21)

Within a statistical learning perspective, the Wasserstein radius is calibrated by identifying the smallest uncertainty bound {\hat{ε}}_{i} that guarantees, with confidence level 1 - α_{i}, that the empirical distribution remains sufficiently close to the unknown true distribution, thereby bridging empirical estimation and robust optimization through probabilistic concentration [43,44].
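Because the true distribution in Equation (21) is unknown, any practical calibration has to rely on a proxy. The sketch below uses a bootstrap heuristic for a one-dimensional feature: it resamples the data, measures the 2-Wasserstein distance between the original sample and each resample, and takes the (1 − α) quantile. This is an illustrative stand-in and not necessarily the calibration protocol used in the case study; the function names, the bootstrap choice, and the one-dimensional restriction (where W_2 reduces to a sorted coupling) are all assumptions.

```python
import numpy as np

def w2_1d(x, y):
    """2-Wasserstein distance between two equal-size 1-D samples (sorted coupling)."""
    return float(np.sqrt(np.mean((np.sort(x) - np.sort(y)) ** 2)))

def calibrate_radius(samples, alpha=0.05, n_boot=200, seed=0):
    """Bootstrap proxy for the smallest radius covering a 1 - alpha share of resamples."""
    rng = np.random.default_rng(seed)
    n = len(samples)
    dists = [
        w2_1d(samples, rng.choice(samples, size=n, replace=True))
        for _ in range(n_boot)
    ]
    return float(np.quantile(dists, 1.0 - alpha))

# Synthetic stand-in for a standardized PV feature (assumption, not paper data)
data = np.random.default_rng(1).normal(size=400)
eps_hat = calibrate_radius(data, alpha=0.05)
```

A smaller α requests higher confidence and therefore yields a radius that is at least as large. Multivariate features would need an optimal-transport solver in place of the sorted coupling.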

\begin{matrix} ω_{s} (ξ_{i}) = η_{s}^{⊤} ξ_{i} + ζ_{s}^{⊤} \nabla_{τ} ξ_{i} + ξ_{i}^{⊤} Θ_{s} ξ_{i} + γ_{s} {∥ξ_{i}∥}_{2}^{2} \end{matrix}

(22)

Here a multi-component scenario scoring function is constructed by combining linear projections, temporal gradient sensitivity, quadratic interactions, and norm-based penalties, enabling each trajectory to be evaluated across multiple structural dimensions that capture both magnitude and dynamic behavior.
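The four terms of Equation (22) can be evaluated directly for a single trajectory. The sketch below assumes that ξ_i is a one-dimensional trajectory vector and that the temporal gradient ∇_τ ξ_i is approximated by finite differences through `np.gradient`; both choices, as well as the function name `scenario_score`, are illustrative assumptions.

```python
import numpy as np

def scenario_score(xi, eta, zeta, theta, gamma):
    """Evaluate omega_s(xi) = eta^T xi + zeta^T grad(xi) + xi^T Theta xi + gamma ||xi||^2."""
    grad = np.gradient(xi)  # finite-difference stand-in for the temporal gradient
    linear = eta @ xi
    ramp = zeta @ grad
    quadratic = xi @ theta @ xi
    norm_term = gamma * (xi @ xi)
    return float(linear + ramp + quadratic + norm_term)

# Toy trajectory with a constant ramp of one unit per step
xi = np.array([0.0, 1.0, 2.0, 3.0])
ones = np.ones(4)
score_linear = scenario_score(xi, ones, ones, np.zeros((4, 4)), gamma=0.0)
score_full = scenario_score(xi, ones, ones, np.eye(4), gamma=1.0)
```

Setting Θ_s and γ_s to zero isolates the level and ramp contributions, which is a convenient way to check how strongly each component drives the ranking of a trajectory.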

\begin{matrix} π_{s} = \frac{exp (λ_{s} \cdot {sup}_{Q_{i} \in P_{i}} \int ω_{s} (ξ_{i}) d Q_{i})}{\sum_{r \in S} exp (λ_{r} \cdot {sup}_{Q_{i} \in P_{i}} \int ω_{r} (ξ_{i}) d Q_{i})} \end{matrix}

(23)

Through a softmax transformation, scenario weights are assigned based on their worst-case expected utility, ensuring that more informative or extreme trajectories receive exponentially higher importance while maintaining normalization across all candidate scenarios.
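A numerically stable evaluation of Equation (23) subtracts the largest exponent before exponentiating. In the sketch below, the worst-case expected utilities are assumed to have been computed already and are passed in as plain numbers; `scenario_weights` is an illustrative name, and `lam` may be a scalar or one value per scenario.

```python
import numpy as np

def scenario_weights(worst_case_utils, lam):
    """Softmax weights pi_s from worst-case expected utilities, as in Eq. (23)."""
    z = np.asarray(lam, dtype=float) * np.asarray(worst_case_utils, dtype=float)
    z = z - z.max()          # stabilise the exponentials without changing the ratios
    w = np.exp(z)
    return w / w.sum()

weights = scenario_weights([0.2, 1.0, 0.5], lam=2.0)
```

Larger values of λ_s concentrate the weight on the most extreme scenarios, while λ_s close to zero approaches uniform weighting.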

\begin{matrix} min_{S^{'} \subseteq S} \sum_{s \in S} π_{s} min_{r \in S^{'}} {∥ξ^{(s)} - ξ^{(r)}∥}_{2}^{2} + δ |S^{'}| \end{matrix}

(24)

By introducing a regularized scenario selection formulation, the reduced set S^{'} is chosen to balance approximation fidelity and model complexity, where the penalty term δ | S^{'} | discourages excessive scenario proliferation while preserving representativeness [47,48].
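Equation (24) is a combinatorial selection problem. One simple heuristic, shown here only as an illustrative sketch and not as the reduction method used in the paper, is greedy forward selection: add the scenario that lowers the penalized objective the most and stop once no addition helps. The function names and the toy data are assumptions.

```python
import numpy as np

def objective(dist2, weights, chosen, delta):
    """Weighted squared distance to the nearest retained scenario plus delta * |S'|."""
    return float(weights @ dist2[:, chosen].min(axis=1) + delta * len(chosen))

def greedy_reduce(scenarios, weights, delta):
    """Greedy forward selection for the regularized objective in Eq. (24)."""
    diff = scenarios[:, None, :] - scenarios[None, :, :]
    dist2 = (diff ** 2).sum(axis=2)          # pairwise squared Euclidean distances
    chosen, best = [], np.inf
    while len(chosen) < len(scenarios):
        candidates = [r for r in range(len(scenarios)) if r not in chosen]
        costs = [objective(dist2, weights, chosen + [r], delta) for r in candidates]
        k = int(np.argmin(costs))
        if costs[k] >= best:                 # no further improvement: stop
            break
        best = costs[k]
        chosen.append(candidates[k])
    return chosen, best

# Two tight clusters far apart: one representative per cluster is expected
toy = np.array([[0.0, 0.0], [0.1, 0.0], [10.0, 10.0], [10.1, 10.0]])
selected, cost = greedy_reduce(toy, np.full(4, 0.25), delta=0.5)
```

In this toy case a third representative would lower the distance term by less than δ, so the procedure stops at two scenarios. A greedy pass does not guarantee the global optimum of Equation (24).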

Before the final medoid reduction, redundant and strongly linked scenarios are removed through a transparent filtering procedure. Candidate pairs are treated as near duplicates only when the combined trajectory distance is small and their correlation is high. High-ramp scenarios are protected by an extreme reserve so that rare but important trajectories are not removed merely because they are close to a typical trajectory in average distance.

\begin{matrix} D_{s r} = α_{D} {∥ z_{s} - z_{r} ∥}_{2} + (1 - α_{D}) D_{DTW} (ξ^{(s)}, ξ^{(r)}), \quad D_{s r} < τ_{D}, \quad corr (ξ^{(s)}, ξ^{(r)}) > τ_{ρ} \end{matrix}

(24a)
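The duplicate test in Equation (24a) and the extreme reserve can be sketched as follows. The snippet assumes a textbook dynamic-time-warping recursion with absolute-difference cost and no normalization, treats z_s and z_r as precomputed feature vectors, and flags a trajectory as protected when its largest single-step change reaches a ramp threshold. All function names and the protection rule are illustrative assumptions.

```python
import numpy as np

def dtw_distance(a, b):
    """Classic O(n*m) dynamic time warping with absolute-difference local cost."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])

def is_near_duplicate(xi_s, xi_r, z_s, z_r, alpha_d, tau_d, tau_rho):
    """Both conditions of Eq. (24a): small combined distance and high correlation."""
    d = alpha_d * np.linalg.norm(z_s - z_r) + (1.0 - alpha_d) * dtw_distance(xi_s, xi_r)
    rho = np.corrcoef(xi_s, xi_r)[0, 1]
    return bool(d < tau_d and rho > tau_rho)

def is_protected(xi, ramp_threshold):
    """Extreme reserve: keep any trajectory whose largest step change is a ramp."""
    return bool(np.max(np.abs(np.diff(xi))) >= ramp_threshold)

base = np.array([0.0, 1.0, 2.0, 3.0])
feat = np.zeros(2)
dup = is_near_duplicate(base, base + 0.01, feat, feat, alpha_d=0.5, tau_d=0.1, tau_rho=0.95)
```

A filtering loop would drop one member of a pair only when `is_near_duplicate` is true and the candidate for removal is not protected, which keeps high-ramp trajectories in the pool.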

\begin{matrix} χ_{i}^{rob} = μ_{i}^{rob} + Σ_{i}^{rob} \sum_{j \in N} ρ_{i j}^{rob} μ_{j}^{rob} + Γ_{i} \nabla_{τ} μ_{i}^{rob} \end{matrix}

(25)

This expression constructs a composite robust feature vector by integrating mean, covariance, correlation, and temporal gradient information, thereby encoding both local uncertainty characteristics and network-wide dependency structures into a unified representation.

\begin{matrix} L^{out} = sup_{P_{i}^{★}} |E_{P_{i}^{★}} [ϕ_{i} (ξ_{i})] - sup_{Q_{i} \in P_{i}} E_{Q_{i}} [ϕ_{i} (ξ_{i})]| \leq ε_{i} \cdot L_{i} + \frac{σ_{i}}{\sqrt{N_{i}}} \end{matrix}

(26)

From a generalization standpoint, the out-of-sample performance gap is bounded by a term proportional to the Wasserstein radius and a statistical estimation error that decays with sample size, thereby providing a rigorous guarantee that the learned representation remains stable under both distributional shifts and finite-sample uncertainty.
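To illustrate the relative size of the two terms in Equation (26), consider a purely hypothetical setting with a normalized Lipschitz constant L_i = 1 and dispersion σ_i = 1 (both assumed for illustration only), a radius ε_i = 0.10 within the range reported for the case study, and N_i = 30,000 samples:

```latex
\varepsilon_i L_i + \frac{\sigma_i}{\sqrt{N_i}}
  = 0.10 \cdot 1 + \frac{1}{\sqrt{30000}}
  \approx 0.10 + 0.0058
  = 0.1058
```

Under these assumed values the bound is dominated by the radius term, so at this sample size further data would tighten the bound far less than a smaller calibrated radius would.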

\begin{matrix} min_{α_{i}} \int_{Ω_{i}} {(ϕ_{i} (ξ_{i}) - α_{i}^{⊤} ξ_{i})}^{2} d Q_{i} (ξ_{i}) + λ_{i}^{(r)} {∥α_{i}∥}_{2}^{2} \end{matrix}

(27)

Instead of directly operating on high-dimensional nonlinear features, the formulation introduces a regularized projection mechanism in which α_{i} serves as a compressed representation vector, balancing approximation fidelity under the ambiguity-aware distribution with stability enforced through quadratic regularization.

\begin{matrix} α_{i}^{★} = {(\int_{Ω_{i}} ξ_{i} ξ_{i}^{⊤} d Q_{i} + λ_{i}^{(r)} I)}^{- 1} \int_{Ω_{i}} ξ_{i} ϕ_{i} (ξ_{i}) d Q_{i} \end{matrix}

(28)

Following the first-order optimality conditions, the closed-form expression for α_{i}^{★} emerges as a regularized moment-matching solution, where the ridge term λ_{i}^{(r)} I keeps the second-moment matrix invertible and well conditioned, and the cross-moment term aligns the projected representation with the original nonlinear feature mapping.
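When Q_i is replaced by an empirical distribution over N samples, the integrals in Equation (28) become sample averages and the solution is a ridge regression. The sketch below makes that substitution explicitly; the empirical choice of Q_i, the function name `ridge_projection`, and the synthetic data are assumptions for illustration.

```python
import numpy as np

def ridge_projection(xi, phi, lam):
    """Closed form of Eq. (28) with Q_i taken as the empirical distribution."""
    n, d = xi.shape
    second_moment = xi.T @ xi / n          # sample version of the integral of xi xi^T
    cross_moment = xi.T @ phi / n          # sample version of the integral of xi phi(xi)
    # Solve the linear system instead of forming an explicit inverse
    return np.linalg.solve(second_moment + lam * np.eye(d), cross_moment)

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 3))
a_true = np.array([1.0, -2.0, 0.5])
a_small = ridge_projection(X, X @ a_true, lam=1e-10)   # nearly unregularized
a_large = ridge_projection(X, X @ a_true, lam=10.0)    # heavily shrunk
```

With a negligible regularization weight the projection recovers the generating coefficients, while a large weight shrinks them toward zero, which is the fidelity and stability trade-off described for Equation (27).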

\begin{matrix} ξ_{i}^{(s)} = arg max_{ξ_{i} \in Ω_{i}} (ω_{s} (ξ_{i}) - λ_{i} {∥ξ_{i} - {\hat{ξ}}_{i}∥}_{2}^{2}) \end{matrix}

(29)

To extract representative trajectories, each scenario is generated as the solution of a penalized maximization problem that trades off extremality in the scoring function against proximity to empirical observations, thereby ensuring that sampled trajectories remain both informative and physically plausible.
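If the temporal-gradient term of the scoring function and the feasibility set Ω_i are ignored, the objective in Equation (29) is quadratic and its stationary point is available in closed form: setting the gradient to zero gives 2((λ_i − γ_s) I − sym(Θ_s)) ξ = η_s + 2 λ_i ξ̂_i. The sketch below solves this system under those simplifying assumptions and checks that the problem is concave; the function name and the toy numbers are illustrative.

```python
import numpy as np

def penalized_scenario(xi_hat, eta, theta, gamma, lam):
    """Unconstrained maximizer of eta^T xi + xi^T Theta xi + gamma||xi||^2 - lam||xi - xi_hat||^2."""
    d = len(xi_hat)
    theta_sym = 0.5 * (theta + theta.T)
    h = (lam - gamma) * np.eye(d) - theta_sym
    # The objective is strictly concave only if h is positive definite
    if np.linalg.eigvalsh(h).min() <= 0.0:
        raise ValueError("penalty lam too small: objective is not strictly concave")
    return np.linalg.solve(2.0 * h, eta + 2.0 * lam * xi_hat)

xi_hat = np.array([1.0, 2.0])
eta = np.array([0.4, -0.2])
xi_s = penalized_scenario(xi_hat, eta, np.zeros((2, 2)), gamma=0.0, lam=2.0)
```

In the purely linear case the result is ξ̂_i + η_s / (2 λ_i), so a larger penalty keeps the generated trajectory closer to the empirical observation. Enforcing physical bounds would require a constrained solver rather than this closed form.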

\begin{matrix} \sum_{s \in S} π_{s} ξ_{i}^{(s)} = μ_{i}^{rob}, \sum_{s \in S} π_{s} (ξ_{i}^{(s)} - μ_{i}^{rob}) {(ξ_{i}^{(s)} - μ_{i}^{rob})}^{⊤} = Σ_{i}^{rob} \end{matrix}

(30)

Consistency between the discrete scenario approximation and the underlying robust distribution is enforced by matching both first-order and second-order moments, ensuring that the reduced scenario set faithfully reproduces the essential statistical structure encoded in the ambiguity set.
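A reduced scenario set can be checked against the two conditions in Equation (30) by computing the weighted moments and reporting the residuals. The following sketch is a diagnostic only; the function name `moment_gap` and the use of Euclidean and Frobenius norms for the residuals are assumptions.

```python
import numpy as np

def moment_gap(scenarios, weights, mu_rob, sigma_rob):
    """Residuals of the first- and second-moment conditions in Eq. (30)."""
    mean_s = weights @ scenarios
    centered = scenarios - mu_rob                    # Eq. (30) centres on the robust mean
    cov_s = (centered * weights[:, None]).T @ centered
    mean_gap = float(np.linalg.norm(mean_s - mu_rob))
    cov_gap = float(np.linalg.norm(cov_s - sigma_rob))
    return mean_gap, cov_gap

# Two symmetric scenarios with equal weights reproduce mean 0 and variance 1 exactly
toy_scenarios = np.array([[-1.0], [1.0]])
toy_weights = np.array([0.5, 0.5])
gaps_exact = moment_gap(toy_scenarios, toy_weights, np.array([0.0]), np.array([[1.0]]))
gaps_shifted = moment_gap(toy_scenarios, toy_weights, np.array([0.5]), np.array([[1.0]]))
```

Nonzero residuals indicate that the weights or the retained scenarios need adjustment before the reduced set can stand in for the robust distribution.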

\begin{matrix} χ_{i}^{final} = Φ_{i}^{★} w_{i}^{★} + δ_{i} μ_{i}^{rob} + κ_{i} \nabla_{τ} μ_{i}^{rob} \end{matrix}

(31)

Finally, the complete representation is assembled by integrating embedded features, robust statistical descriptors, and temporal dynamics, producing a compact yet information-rich feature vector that is suitable for downstream analytical tasks under uncertainty.

Practical Implementation for Real-Time Grid Operation

The framework operates as a representation layer between measurement systems and downstream distribution management applications. The full Wasserstein radius calibration and scenario library construction are not solved from scratch every 15 min. Instead, expensive tasks are handled offline or on a slow update cycle, while the online cycle updates rolling statistics, checks coverage, evaluates robust descriptors, and selects scenarios from a precomputed library.

4. Results

The numerical evidence is organized into consolidated tables that group related reproducibility, benchmark, metric, statistical, stress test, scenario reduction, and ablation information. This arrangement preserves the validation details while reducing fragmentation in the results narrative.

4.1. Case Study and Reproducibility

The case study is constructed on a modified IEEE 123-bus distribution system, based on the IEEE 123-node feeder benchmark and associated distribution system model documentation [49,50], to reflect a realistic medium-voltage network with high penetration of distributed energy resources. The network consists of 123 nodes, 118 distribution lines, and 15 tie switches, with a total peak load of approximately 6.5 MW. A total of 42 distributed photovoltaic units are integrated across residential, commercial, and mixed-use nodes, with individual capacities ranging from 50 kW to 300 kW, resulting in an aggregate installed PV capacity of 7.8 MW. In addition, 28 battery energy storage systems are deployed with capacities between 100 kWh and 500 kWh, leading to a total storage capacity of approximately 9.6 MWh.

The temporal resolution of the dataset is set to 15 min, yielding 96 time steps per day, and the study horizon spans 365 consecutive days, resulting in more than 35,000 time-indexed observations per resource. Meteorological data, including solar irradiance (W/m²), ambient temperature (°C), and cloud cover (%), are obtained from publicly available datasets such as NREL’s National Solar Radiation Database and aligned spatially to each node using nearest-neighbor interpolation. In addition, synthetic measurement noise is introduced with a Gaussian distribution of zero mean and a standard deviation of 2–5% of nominal values to emulate realistic field measurement uncertainty.

To construct the distributional representation, historical trajectories of photovoltaic generation and storage state-of-charge are first aggregated into empirical distributions for each node, with sample sizes exceeding 30,000 data points per resource. The Wasserstein ambiguity sets are then defined around these empirical distributions, with radius parameters calibrated using a confidence level of 95%, resulting in ε values typically ranging from 0.08 to 0.15 depending on data variability. Multi-source data fusion is performed by aligning PV output, storage behavior, and meteorological variables into a unified feature space of dimension 12–18 per time step, including raw measurements, temporal gradients, and interaction terms. To capture cross-resource dependencies, pairwise correlations are computed across all DER units, leading to a correlation matrix of size 70 × 70. Scenario generation is conducted by sampling from the ambiguity sets, producing an initial pool of 1000 candidate trajectories per node, which are subsequently reduced to 50 representative scenarios using a Wasserstein-based scenario reduction method. These scenarios include both typical daily patterns and extreme events such as rapid irradiance drops exceeding 60% within 30 min and coordinated charging behaviors across multiple storage units.

The computational environment is designed to ensure both the scalability and reproducibility of the proposed framework. All simulations are implemented in Python 3.10, with optimization routines solved using Gurobi 10.0 and CVXPY 1.7.2 for convex reformulations of distributionally robust problems. The experiments are executed on a high-performance workstation equipped with an Intel Xeon Gold processor (32 cores, 2.6 GHz), 128 GB RAM, and an NVIDIA RTX 4090 GPU to accelerate matrix operations and scenario evaluations. Parallel processing is employed to handle the large number of scenario generation and feature extraction tasks, reducing total computation time from over 6 h to approximately 45 min per full simulation run. The implementation also leverages NumPy 2.2.6 and PyTorch 2.7.1 for efficient tensor operations, particularly in handling high-dimensional feature spaces and gradient-based computations. Numerical stability is ensured through regularization parameters in the range of 10⁻³ to 10⁻¹, and convergence tolerances are set to 10⁻⁶ for all optimization subproblems. This computational setup enables the proposed distributionally robust representation framework to handle large-scale, high-dimensional datasets while maintaining tractable solution times and consistent performance across multiple experimental runs.

The case study specification includes a tabular feeder scheme, configuration details, DER placement information, data split, radius calibration protocol, benchmark definitions, metric definitions, and solver settings.

4.2. Raw Data Characteristics and Model Validation

Figure 2 illustrates the raw temporal behavior of distributed energy resources over a 24 h horizon, capturing the inherent heterogeneity and multi-source coupling in the system. Across approximately 8–10 photovoltaic nodes, generation profiles exhibit peak outputs ranging from 180 kW to 300 kW, with peak timing deviations of nearly 2 h, typically occurring between 11:00 and 13:00. Solar irradiance follows a smooth bell-shaped trajectory peaking around midday at approximately 280–300 W/m², while ambient temperature lags slightly, reaching values between 32 °C and 35 °C in the early afternoon. In contrast, the energy storage state-of-charge demonstrates asymmetric dynamics, with charging phases concentrated during high irradiance periods and discharging after sunset, reflecting operational constraints and demand response behavior. These differences highlight that each data source contributes distinct temporal signatures, making unified modeling nontrivial. A closer inspection of the zoomed interval between 12:00 and 14:00 reveals high-frequency fluctuations of 40–60 kW within short time windows, even during otherwise stable peak periods. This indicates that PV generation is subject to rapid local disturbances, such as transient cloud coverage, which cannot be captured by smooth deterministic curves. The coexistence of smooth irradiance trends and highly variable PV outputs underscores the need for representations that can preserve both large-scale patterns and small-scale stochastic variations. Quantitatively, the coefficient of variation across nodes during peak hours exceeds 0.25, demonstrating significant cross-node diversity. This figure therefore establishes the fundamental challenge addressed in this work, namely that raw distributed energy data are both heterogeneous and dynamically volatile, requiring more sophisticated modeling beyond conventional deterministic preprocessing.

Table 4 consolidates extreme-event handling, repeated-run error statistics, scenario dispersion, stability, and statistical comparison for the four evaluated methods. The deterministic approach performs poorly, with a detection rate below 30%, indicating its inability to capture rare but critical fluctuations. Although the stochastic and DRO methods improve detection performance, they still exhibit relatively high miss rates. The proposed method significantly enhances detection capability, achieving a detection rate of 78% while maintaining a low false alarm rate. This improvement demonstrates that modeling uncertainty at the distribution level enables better identification of tail events. The reduced miss rate further confirms that the method captures extreme scenarios more effectively, which is essential for ensuring system reliability under volatile operating conditions.

Figure 2. Multi-source temporal profiles of distributed photovoltaic generation and energy storage. The legend identifies the light-blue PV node trajectories (PV Node 1–N), dashed yellow solar irradiance, orange ambient temperature, and dark-blue ESS state of charge; the inset highlights PV fluctuations during 12:00–14:00. Units and event definitions are specified in Table 5 and Table 6.

The same consolidated table also compares accuracy and stability across the deterministic, stochastic, decision-level DRO, and proposed representation methods. The deterministic method exhibits the highest mean error and variance, indicating its sensitivity to uncertainty and inability to generalize under varying conditions. The stochastic approach improves performance by incorporating probabilistic information, reducing both mean error and variability. The decision-level DRO method further enhances robustness by considering worst-case scenarios during optimization, resulting in improved stability and lower maximum error. In contrast, the proposed distributionally robust representation significantly outperforms all baseline methods across all metrics. It achieves the lowest mean error of 0.21 and the smallest standard deviation, demonstrating both high accuracy and strong consistency. The stability index, defined as an inverse measure of performance variability, reaches 0.93, indicating highly reliable behavior across different scenarios. These results confirm that embedding uncertainty at the data representation level provides a more effective and fundamental solution compared to approaches that handle uncertainty only at later stages.

Mean error, scenario-level standard deviation, scenario-level variance, confidence intervals, and p-values are reported together in Table 4 to keep the statistical interpretation adjacent to the main performance results.

Table 7 presents representative stress test results for both Wasserstein radius growth and controlled distribution shifts. When ε = 0, all methods exhibit comparable performance with errors around 0.09, indicating that under nominal conditions without distributional perturbation, deterministic, stochastic, and robust approaches perform similarly. However, as ε increases, representing progressively stronger distributional uncertainty, the performance divergence becomes increasingly evident. The deterministic method shows the most rapid degradation, with error rising sharply to 0.75 at ε = 0.20, reflecting its sensitivity to distributional shifts. The stochastic approach demonstrates improved resilience but still experiences significant error growth, reaching approximately 0.55. The decision-level DRO method further mitigates this effect, stabilizing around 0.45, yet remains susceptible to upstream data uncertainty. In contrast, the proposed distributionally robust representation maintains a substantially lower and more stable error profile, increasing only to approximately 0.28 at ε = 0.20. This corresponds to a relative error reduction exceeding 60% compared to the deterministic baseline in high-uncertainty regimes. The results clearly indicate that embedding robustness at the data representation level effectively suppresses error amplification under distributional perturbations. Consequently, the proposed method provides superior generalization capability and stability, ensuring reliable performance even when the underlying data distribution deviates significantly from empirical observations.

Figure 3 evaluates performance consistency across 50 distinct scenarios, where each scenario represents a different realization of uncertainty in distributed energy outputs. The deterministic baseline shows highly irregular behavior, with performance fluctuating between approximately 0.20 and 0.75, resulting in a range exceeding 0.50 and a variance above 0.04. The curve is characterized by frequent spikes and abrupt drops, indicating strong sensitivity to scenario-specific perturbations. Such instability suggests that deterministic representations fail to generalize across diverse operating conditions, particularly when exposed to stochastic variations in generation and load. In contrast, the proposed method produces a smooth and stable trajectory centered around 0.30–0.32, with fluctuations confined within ±0.03. This corresponds to a variance reduction of approximately 60–70% compared to the baseline. The shaded uncertainty band is also significantly narrower, reflecting improved consistency. The difference between the two methods is especially evident in high-variability scenarios, where the baseline exhibits sharp spikes while the proposed method remains stable. This indicates that incorporating distributional information at the representation level effectively dampens scenario-induced variability, leading to more reliable performance across a wide range of conditions.

4.3. Robustness Under Distribution Shift

Figure 4 analyzes how model performance evolves as distribution shift intensity increases from 0% to 50%. At low shift levels below 10%, all methods achieve performance above 0.90, indicating strong in-distribution accuracy. However, once the shift exceeds 20%, marking the transition into the out-of-distribution regime, the performance of baseline methods deteriorates rapidly. The deterministic method drops to approximately 0.25 at 50% shift, representing a loss of nearly 70% of its initial performance. The stochastic model and DRO approach perform better but still decline to around 0.45 and 0.60, respectively. The proposed method, however, maintains a significantly higher performance level of approximately 0.75–0.78 even at the highest shift intensity. This implies that over 80% of its original performance is preserved, demonstrating strong generalization capability. The performance gap between the proposed method and deterministic baseline exceeds 0.50 in the high-shift regime, highlighting the advantage of modeling distributional uncertainty explicitly. The gradual and smooth decline of the proposed curve further indicates that the method adapts continuously to distribution changes rather than failing abruptly, which is critical for real-world deployment in dynamic environments.

4.4. Extreme-Event and Scenario Efficiency Analysis

Figure 5 examines the relationship between the number of scenarios and coverage quality, comparing different scenario generation strategies. The proposed ambiguity-based sampling method exhibits rapid convergence, achieving coverage levels of approximately 0.85–0.88 with only 40 scenarios. Beyond this point, the curve plateaus, indicating diminishing returns from additional scenarios. In contrast, clustering-based reduction requires around 70–80 scenarios to reach similar coverage levels, while random sampling shows a much slower and nearly linear increase, reaching only about 0.70 at 100 scenarios. The efficiency gain of the proposed method can be quantified as a reduction of approximately 40–50% in required scenario count to achieve comparable or superior coverage. This implies a substantial decrease in computational burden, particularly in large-scale systems where scenario evaluation is costly. The highlighted effective scenario budget around 40 scenarios further emphasizes that high-quality approximation can be achieved with limited samples. This result demonstrates that the proposed framework not only improves robustness and stability but also enhances efficiency by generating more informative and representative scenarios, enabling scalable deployment in complex energy systems.

Figure 6 evaluates how effectively different methods capture extreme ramp events as the threshold |ΔPV| increases from 0 to 150 kW per 15 min. At low thresholds below 40 kW, all methods achieve detection rates above 0.95, indicating similar capability in identifying normal fluctuations. However, once the threshold exceeds 80 kW, marking the extreme event region, the performance gap widens significantly. The deterministic baseline drops sharply from approximately 0.85 to below 0.25 at 150 kW, representing a loss of nearly 70% of detection capability. The stochastic or standard DRO approach performs moderately better, maintaining around 0.30 at high thresholds. In contrast, the proposed method maintains a detection rate above 0.75 across the entire range, preserving more than 80% of its initial performance. The shaded extreme region highlights that the proposed method retains strong sensitivity to rare and high-magnitude fluctuations, while baseline methods fail to identify these events reliably. The difference between the proposed and deterministic methods exceeds 0.50 in the high-threshold regime, demonstrating a substantial improvement in tail-event awareness. This indicates that modeling uncertainty at the distribution level enables the proposed framework to capture rare but critical events that are typically missed by conventional approaches, which is essential for maintaining system reliability under volatile operating conditions.
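The thresholding behind this analysis can be sketched in a few lines. The snippet flags a ramp whenever the absolute change in PV output between consecutive 15 min samples reaches the threshold, and it counts a reference ramp as detected when at least one scenario flags the same step. This detection rule, the function names, and the toy values are assumptions for illustration; the paper's own event definitions are given in its tables.

```python
import numpy as np

def ramp_events(pv_kw, threshold_kw=80.0):
    """Boolean flag per step: |delta PV| between consecutive samples >= threshold."""
    return np.abs(np.diff(pv_kw)) >= threshold_kw

def detection_rate(reference_kw, scenarios_kw, threshold_kw=80.0):
    """Share of reference ramp steps that at least one scenario also flags."""
    ref = ramp_events(reference_kw, threshold_kw)
    if not ref.any():
        return float("nan")                 # no ramp events to detect
    flagged = np.any([ramp_events(s, threshold_kw) for s in scenarios_kw], axis=0)
    return float((ref & flagged).sum() / ref.sum())

reference = np.array([100.0, 190.0, 185.0, 90.0])        # ramps at steps 0 and 2
scenario_set = [np.array([100.0, 185.0, 180.0, 170.0]),  # captures only the first ramp
                np.array([100.0, 100.0, 100.0, 100.0])]  # flat profile, no ramps
flags = ramp_events(reference)
rate = detection_rate(reference, scenario_set)
```

Sweeping `threshold_kw` over a range of values and recomputing the rate for each method would produce a threshold-sensitivity curve of the kind discussed above.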

Table 8 summarizes the scenario-reduction audit trail and reports threshold-specific detection sensitivity for moderate, default, strong, and severe ramp events.

4.5. Ablation, Sensitivity, and Scalability

Entries in Table 9 report the component-level ablation results and the mean error and detection rate sensitivity under representative Wasserstein radii and scenario budgets. The default 50-scenario setting is therefore interpreted as a balance between accuracy, tail coverage, and computational cost rather than a single tuned point.

Figure 7 presents a two-dimensional performance landscape over Wasserstein radius ε and distribution shift intensity, comparing the proposed method with a baseline approach. The proposed method exhibits a smooth and gradually declining surface, with performance remaining above 0.80 across most of the domain and only decreasing to approximately 0.70–0.75 at the extreme corner where ε = 0.20 and shift = 50%. This indicates that the method maintains robust performance even under simultaneous uncertainty and distributional changes. The surface is characterized by gentle gradients and consistent contour spacing, reflecting stable sensitivity to both dimensions. In contrast, the baseline method shows a sharply curved surface that collapses rapidly in the high-uncertainty region. Performance drops below 0.30 in the upper-right corner, representing a reduction of more than 60% compared to the proposed method. The contour lines are densely packed in this region, indicating steep gradients and high sensitivity to small changes in ε or shift. The difference between the two surfaces exceeds 0.40 in extreme conditions, clearly demonstrating that the proposed method provides significantly better robustness when facing compounded uncertainties.

Figure 8 analyzes computation time as the number of network nodes increases from 20 to 120. The proposed method demonstrates a near-linear growth pattern, with computation time increasing from approximately 50 s to 200 s, corresponding to a 4-fold increase. In contrast, the deterministic baseline exhibits a much steeper growth, reaching nearly 450 s at 120 nodes, representing an increase of over 8 times. The standard DRO approach lies between the two but still grows significantly faster than the proposed method, reaching around 350 s at large scale. The divergence becomes particularly evident beyond 100 nodes, identified as the large-scale regime, where the gap between methods widens rapidly. At this point, the proposed method reduces computation time by approximately 50–55% compared to the deterministic baseline. The smooth growth curve of the proposed method indicates better scalability and computational efficiency, making it suitable for large-scale deployment. In contrast, the steep increase observed in baseline methods suggests potential limitations in handling complex systems with many distributed resources.

Figure 8. Computational Scalability with Increasing Network Size under the solver settings in Table 10.

Table 10. Consolidated modified IEEE 123-bus case-study specification and repeatability controls.

Component	Setting	Purpose or Repeatability Role
Benchmark basis	Modified IEEE 123-bus feeder with 123 nodes, 118 distribution lines, 15 tie switches, and approximately 6.5 MW peak load.	Maintains a standard distribution network scale while allowing high-DER modification.
Feeder-zone scheme	Main backbone, upper laterals, central laterals, lower laterals, and remote laterals.	Organizes DER placement by electrically and operationally distinct feeder regions.
PV deployment	42 PV units, 50–300 kW per unit, and 7.8 MW aggregate capacity.	Creates high-PV operating conditions with heterogeneous node-level profiles.
ESS deployment	28 ESS units and approximately 9.6 MWh aggregate capacity.	Tests storage interaction with PV variability and coordinated charging/discharging behavior.
DER placement by zone	Main backbone: 8 PV/6 ESS; upper laterals: 9 PV/5 ESS; central laterals: 11 PV/8 ESS; lower laterals: 8 PV/6 ESS; remote laterals: 6 PV/3 ESS.	Provides a repeatable tabular feeder scheme without adding a new figure.
Time horizon and resolution	365 consecutive days, 96 samples per day, and 15 min resolution.	Supports sub-hourly ramp analysis and seasonal variability.
Chronological split	Days 1–219 for training, days 220–292 for radius/hyperparameter calibration, and days 293–365 for final testing.	Prevents test set leakage.
Distribution shift and extremes	Irradiance attenuation, cloud-ramp amplification, PV residual variance increase, load correlation perturbation, 80 kW/15 min default ramp threshold, and more than 60 percent irradiance drop within 30 min.	Defines controlled out-of-distribution and extreme-event tests.
Radius calibration	Feature standardization uses training data only; $ε_{i}$ is selected on the calibration split for approximately 95 percent empirical coverage and then fixed for testing.	Separates ambiguity calibration from final evaluation.
Solver and repeated runs	Python 3.10, CVXPY 1.7.2, NumPy 2.2.6, PyTorch 2.7.1, Gurobi 10.0, tolerance $10^{- 6}$ , $ε_{i}$ range 0.08–0.15, regularization range $10^{- 3}$ – $10^{- 1}$ , and 10 repeated runs.	Specifies computational settings and seed-based repeated-run validation.
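The chronological split listed in Table 10 can be expressed as index ranges over the 15 min samples. The sketch below assumes that samples are stored in time order starting at day 1; the function name `split_by_day` is illustrative.

```python
import numpy as np

SAMPLES_PER_DAY = 96  # 15 min resolution

def split_by_day(n_days=365):
    """Return sample indices for the training, calibration, and test periods."""
    day = np.repeat(np.arange(1, n_days + 1), SAMPLES_PER_DAY)
    train = np.flatnonzero(day <= 219)                     # days 1-219
    calib = np.flatnonzero((day >= 220) & (day <= 292))    # days 220-292
    test = np.flatnonzero(day >= 293)                      # days 293-365
    return train, calib, test

train_idx, calib_idx, test_idx = split_by_day()
```

This gives 21,024 training samples and 7,008 samples each for calibration and testing, which sum to the 35,040 observations per resource implied by 365 days at 96 samples per day. Because the blocks are contiguous and ordered, no test sample precedes any calibration or training sample.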

Figure 9 illustrates the relationship between the number of scenarios, Wasserstein radius ε, and performance quality in a three-dimensional surface. The proposed method exhibits rapid convergence, with performance increasing sharply as the number of scenarios grows from 10 to 40, reaching levels above 0.90. Beyond this point, the surface becomes flat, indicating that additional scenarios provide minimal improvement. This plateau behavior highlights the efficiency of the proposed ambiguity-based sampling approach, which extracts maximal information from a limited number of scenarios. In contrast, the baseline method shows a slower and more gradual increase in performance, requiring more than 80 scenarios to approach similar levels. Even at 100 scenarios, performance remains below that of the proposed method in the low-scenario region. The gap between the two methods exceeds 0.30 when the number of scenarios is below 40, emphasizing the advantage of the proposed approach in resource-constrained settings. The highlighted efficient region demonstrates where high performance can be achieved with minimal computational effort, reinforcing the practical value of the proposed framework for large-scale and real-time applications.

5. Conclusions

This paper developed a Wasserstein distributionally robust representation framework for PV-rich distribution networks with energy storage and meteorological inputs. The main conclusion is not that Wasserstein ambiguity sets are new; they are established in distributionally robust optimization. The contribution is that the ambiguity set is placed at the data representation layer, before robust feature extraction, scenario generation, and downstream operational analysis.

The results support five main findings. First, deterministic or weakly probabilistic representations can suppress tail behavior, fast ramps, and cross-node dependence when PV penetration is high. Second, the proposed representation reduces the mean normalized error and the scenario-level standard deviation in the modified IEEE 123-bus case study. Third, ambiguity-aware scenario generation improves extreme-ramp detection while keeping the false-alarm rate low. Fourth, redundancy filtering and weighted medoid reduction reduce the scenario burden while preserving representative and extreme trajectories. Fifth, the ablation study indicates that the Wasserstein ambiguity set, robust feature extraction, residual spatial coupling, and extreme-reserve rule all contribute to the final performance.

The scope of these findings is limited. The numerical evidence is based on one modified IEEE 123-bus feeder, a positive-sequence radial active power approximation, controlled meteorological and measurement noise construction, and empirical radius calibration. The results therefore justify the statement that the method performs better than the tested baselines in the analyzed case study, but they do not prove universal superiority across all feeders or operational tasks. Future work should validate the approach on multiple real feeders; include unbalanced three-phase power flow constraints; model regulator, capacitor, inverter-control, and protection interactions, together with dynamic oscillation-source localization for converter-rich operating regimes [51]; use synchronized field measurements or sky-imager data; and test adaptive Wasserstein radius calibration under long-term weather and load regime changes.

Author Contributions

Conceptualization, A.L.; Methodology, M.L. and C.X.; Software, T.L. and C.X.; Formal analysis, M.L.; Investigation, M.L. and L.F.; Resources, T.L. and C.X.; Data curation, A.L. and T.L.; Writing – original draft, A.L.; Supervision, L.F.; Project administration, L.F. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Science and Technology Project of State Grid Shandong Electric Power Company. Project title: Hierarchical Equivalent Aggregation Modeling Technology for Distributed Photovoltaic and Energy Storage Resources Across Multi-Voltage Levels. ERP Code: 52060125000A.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Conflicts of Interest

Authors Andi Liu, Mengqi Liu and Tairan Li were employed by State Grid Shandong Electric Power Company Jinan Power Supply Company. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. The authors declare that this study received funding from the Science and Technology Project of State Grid Shandong Electric Power Company. The funder was not involved in the study design; the collection, analysis, or interpretation of data; the writing of this article; or the decision to submit it for publication.

References

1. International Renewable Energy Agency. Renewable Capacity Highlights 2026; International Renewable Energy Agency: Abu Dhabi, United Arab Emirates, 2026.
2. International Energy Agency. Renewables 2025: Analysis and Forecast to 2030; International Energy Agency: Paris, France, 2025.
3. IEA Photovoltaic Power Systems Programme. National Survey Report of PV Power Applications in China 2024; IEA Photovoltaic Power Systems Programme: Kensington, Australia, 2025.
4. Hou, Y.; Zhao, A.P.; Tao, Q.; Li, J.; Hernando-Gil, I.; Li, X.; Xie, D. Complex systems engineering for grid resilience: Insights from the 2025 Iberian blackout. Renew. Sustain. Energy Rev. 2026, 235, 116976.
5. Parizad, A.; Baghaee, H.R.; Alizadeh, V.; Rahman, S. Emerging technologies and future trends in cyber-physical power systems: Toward a new era of innovations. Smart Cyber-Phys. Power Syst. Solut. Emerg. Technol. 2025, 2, 525–565.
6. Al-Shetwi, A.Q.; Atawi, I.E.; El-Hameed, M.A.; Abuelrub, A. Digital twin technology for renewable energy, smart grids, energy storage and vehicle-to-grid integration: Advancements, applications, key players, challenges and future perspectives in modernising sustainable grids. IET Smart Grid 2025, 8, e70026.
7. Cao, Y.; Zhou, B.; Chung, C.Y.; Shuai, Z.; Hua, Z.; Sun, Y. Dynamic modelling and mutual coordination of electricity and watershed networks for spatio-temporal operational flexibility enhancement under rainy climates. IEEE Trans. Smart Grid 2022, 14, 3450–3464.
8. Cao, Y.; Zhou, B.; Chung, C.Y.; Wu, T.; Zheng, L.; Shuai, Z. A coordinated emergency response scheme for electricity and watershed networks considering spatio-temporal heterogeneity and volatility of rainstorm disasters. IEEE Trans. Smart Grid 2024, 15, 3528–3541.
9. Pu, T.; Cao, J.; Wang, X.; Zhang, D. Collective intelligence and application potentials in modern power systems: A comprehensive review. CSEE J. Power Energy Syst. 2026, 12, 575–598.
10. Wang, Y.; Xie, D.; Shi, C.; Wang, X. Dynamic construction strategy for virtual power plants considering distributed energy resource interaction uncertainty and response recovery process. IEEE Trans. Ind. Appl. 2026. Early Access.
11. Chen, H.; Xu, M.; Shi, C.; Wang, Z.; Xie, D. Data-model hybrid hierarchical scheduling for grid-connected PEM green hydrogen production in multi-energy systems with LLM-aided coordination. J. Renew. Sustain. Energy 2026, 18, 024701.
12. Zhao, A.P.; Li, S.; Xie, D.; Wang, Y.; Li, Z.; Hu, P.J.H.; Zhang, Q. Hydrogen as the nexus of future sustainable transport and energy systems. Nat. Rev. Electr. Eng. 2025, 2, 447–466.
13. Li, T.T.; Zhao, A.P.; Wang, Y.; Li, S.; Fei, J.; Wang, Z.; Xiang, Y. Integrating solar-powered electric vehicles into sustainable energy systems. Nat. Rev. Electr. Eng. 2025, 2, 467–479.
14. Zhao, A.P.; Li, S.; Cao, Z.; Hu, P.J.-H.; Wang, J.; Xiang, Y.; Xie, D.; Lu, X. AI for science: Predicting infectious diseases. J. Saf. Sci. Resil. 2024, 5, 130–146.
15. Li, Z.; Hilber, P.; Li, Z.; Laneryd, T.; Ivanell, S. Temporally coordinated operation of green multi-energy airport microgrids with climatic correlations and flexible loads via decomposed stochastic programming. IEEE Trans. Sustain. Energy 2025, 17, 1909–1922.
16. Chen, J.; Wu, P.; Chen, W.; Guerrero, J.M.; Niu, Z.; Li, Z. Two-layer coordinated operation of multi-energy system considering carbon-oriented collaborative pricing mechanism via two-stage stochastic programming approach. Appl. Energy 2026, 406, 127298.
17. Feng, C.; Huang, Z.; Lin, J.; Wang, L.; Zhang, Y.; Wen, F. Aggregation model and market mechanism for virtual power plant participation in inertia and primary frequency response. IEEE Trans. Power Syst. 2026, 41, 2101–2117.
18. Feng, J.; Ren, Z.; Li, C.; Li, W. A Benders-combined safe reinforcement learning framework for risk-averse dispatch considering frequency security constraints. IEEE Trans. Circuits Syst. II Express Briefs 2025, 72, 1063–1067.
19. Chen, X.; Liu, Y.; Zhong, Z.; Fan, N.; Wu, L. A carryover storage valuation framework for medium-term cascaded hydropower planning: A Portland General Electric system study. IEEE Trans. Sustain. Energy 2025, 16, 1903–1918.
20. Li, J.; Gu, K.; Zhao, K.; Li, Z.; Dong, Z.Y. A robust and fast operational risk assessment method for composite power systems with high wind power penetration. IEEE Trans. Power Syst. 2026. Early Access.
21. Wang, Z.; Gu, Z.; Guerrero, J.M.; Shen, Y.; Guo, Z.; Deng, Z.; Huang, C.; Li, Z. Two-period two-layer electrical model-free voltage calculation for active distribution network via improved broad learning system. IEEE Trans. Power Syst. 2026.
22. Zhang, Y.; Meng, L.; Zambroni, A.C.; Hu, Q.; Liu, H.; Lebedev, A.; Shotorbani, A.M. Multi-agent deep reinforcement learning for EV aggregator bidding in renewable-dominated electricity markets. Int. J. Electr. Power Energy Syst. 2026, 174, 111444.
23. Li, P.; Hu, Z.; Shen, Y.; Cheng, X.; Alhazmi, M. Short-term electricity load forecasting based on large language models and weighted external factor optimization. Sustain. Energy Technol. Assess. 2025, 82, 104449.
24. Li, B.; Wu, Q.; Cao, Y.; Jiao, W.; Li, C. Physically informed multi-agent deep reinforcement learning for distributed voltage control in distribution networks. Int. J. Electr. Power Energy Syst. 2026, 174, 111451.
25. Daneshvar, M.; Mohammadi-Ivatloo, B.; Zare, K.; Anvari-Moghaddam, A. Risk-aware stochastic scheduling of hybrid integrated energy systems with 100% renewables. IEEE Trans. Eng. Manag. 2024, 71, 9314–9324.
26. Zhang, X.; Zhang, H.; Tang, H.; Liang, L.; Cheng, L.; Chen, X.; Ding, W.; Zhang, X.P. A scalable mean-field MARL framework for multi-objective V2X resource allocation. IEEE Trans. Intell. Veh. 2024, 10, 1071–1086.
27. Zhao, P.; Gu, C.; Cao, Z.; Xie, D.; Teng, F.; Li, J.; Chen, X.; Wu, C.; Yu, D.; Xu, X.; et al. A cyber-secured operation for water-energy nexus. IEEE Trans. Power Syst. 2020, 36, 3105–3117.
28. Li, P.; Shen, Y.; Shang, Y.; Alhazmi, M. Innovative distribution network design using GAN-based distributionally robust optimization for DG planning. IET Gener. Transm. Distrib. 2025, 19, e13350.
29. Wu, H.; Sun, M.; Craig, M.T. Updating global green hydrogen production costs and configurations under future climates. Innovation 2026, 7, 101303.
30. NREL. National Solar Radiation Database: Data and Methodology Documentation; National Renewable Energy Laboratory: Golden, CO, USA. Available online: https://nsrdb.nlr.gov/resources (accessed on 22 May 2026).
31. Holmgren, W.F.; Hansen, C.W.; Mikofski, M.A. pvlib python: A python package for modeling solar energy systems. J. Open Source Softw. 2018, 3, 884.
32. Wilcox, S.; Marion, W. Users Manual for TMY3 Data Sets; National Renewable Energy Laboratory: Golden, CO, USA, 2008.
33. Pinson, P. Wind energy: Forecasting challenges for its operational management. Stat. Sci. 2013, 28, 564–585.
34. Bessa, R.J.; Miranda, V.; Botterud, A.; Wang, J.; Constantinescu, E.M. Time-adaptive conditional kernel density estimation for wind power forecasting. IEEE Trans. Sustain. Energy 2012, 3, 660–669.
35. Morales, J.M.; Conejo, A.J.; Madsen, H.; Pinson, P.; Zugno, M. Integrating Renewables in Electricity Markets: Operational Problems; Springer: New York, NY, USA, 2014.
36. Shapiro, A.; Dentcheva, D.; Ruszczynski, A. Lectures on Stochastic Programming: Modeling and Theory; SIAM: Philadelphia, PA, USA, 2009.
37. Conejo, A.J.; Carrion, M.; Morales, J.M. Decision Making Under Uncertainty in Electricity Markets; Springer: New York, NY, USA, 2010.
38. Ben-Tal, A.; Ghaoui, L.E.; Nemirovski, A. Robust Optimization; Princeton University Press: Princeton, NJ, USA, 2009.
39. Bertsimas, D.; Brown, D.B.; Caramanis, C. Theory and applications of robust optimization. SIAM Rev. 2011, 53, 464–501.
40. Esfahani, P.M.; Kuhn, D. Data-driven distributionally robust optimization using the Wasserstein metric. Math. Program. 2018, 171, 115–166.
41. Gao, R.; Kleywegt, A.J. Distributionally robust stochastic optimization with Wasserstein distance. Math. Oper. Res. 2023, 48, 603–655.
42. Blanchet, J.; Li, J.; Lin, S.; Zhang, X. Distributionally robust optimization and robust statistics. Stat. Sci. 2025, 40, 351–377.
43. Fournier, N.; Guillin, A. On the rate of convergence in Wasserstein distance of the empirical measure. Probab. Theory Relat. Fields 2015, 162, 707–738.
44. Dvoretzky, A.; Kiefer, J.; Wolfowitz, J. Asymptotic minimax character of the sample distribution function and of the classical multinomial estimator. Ann. Math. Stat. 1956, 27, 642–669.
45. Baran, M.E.; Wu, F.F. Optimal capacitor placement on radial distribution systems. IEEE Trans. Power Deliv. 1989, 4, 725–734.
46. Farivar, M.; Low, S.H. Branch flow model: Relaxations and convexification. IEEE Trans. Power Syst. 2013, 28, 2554–2564.
47. Dupacova, J.; Growe-Kuska, N.; Romisch, W. Scenario reduction in stochastic programming. Math. Program. 2003, 95, 493–511.
48. Heitsch, H.; Romisch, W. Scenario reduction algorithms in stochastic programming. Comput. Optim. Appl. 2003, 24, 187–206.
49. IEEE PES Distribution System Analysis Subcommittee. IEEE 123 Node Test Feeder. Available online: https://cmte.ieee.org/pes-testfeeders/resources/ (accessed on 22 May 2026).
50. Dugan, R.C.; Montenegro, D. The Open Distribution System Simulator (OpenDSS) Reference Guide; Electric Power Research Institute: Washington, DC, USA, 2020. Available online: https://sourceforge.net/p/electricdss/code/HEAD/tree/trunk/Distrib/Doc/OpenDSSManual.pdf (accessed on 22 May 2026).
51. Wen, Y.; Yu, W.; Fu, H.; Thotakura, N.L.; Jiang, S.; Dulal, S.; Liu, Y.; Zhu, L.; Farantatos, E. A Systematic Oscillation Source Location Estimation Method for Ultra-Low Frequency and Sub-Synchronous Oscillations in Power Systems. In Proceedings of the 2025 IEEE Power & Energy Society General Meeting (PESGM), Austin, TX, USA, 27–31 July 2025; pp. 1–5.

Figure 1. Distributionally Robust Data Representation Framework for Distributed Photovoltaic and Energy Storage Systems with data integration, ambiguity set construction, robust feature extraction, and scenario generation.

Figure 3. Stability of Model Performance across Stochastic Scenarios.

Figure 4. Generalization Performance under Distribution Shift.

Figure 5. Scenario Efficiency in Coverage and Approximation Quality.

Figure 6. Extreme Event Detection Capability under Increasing Ramp Threshold.

Figure 7. Performance Surface under Joint Ambiguity and Distribution Shift. The blue surface gradients represent out-of-sample performance over Wasserstein radius $ε$ and distribution-shift intensity; white contour lines denote equal-performance levels, and the gray panels mark the high-uncertainty/high-shift region.

Figure 9. Scenario Efficiency and Performance Trade-off Surface. The blue surface denotes the proposed method, the gray surface denotes the baseline method, contour lines denote equal-performance levels, and the black marker indicates the efficient region with high quality under a limited scenario budget. Scenario-budget sensitivity is summarized in Table 9.

Table 1. Scope of common PV uncertainty modeling approaches and the operational gap addressed in this paper.

Approach	Appropriate Use	Limitation for PV-Rich Operational Feeders
Deterministic engineering profile	Equipment sizing, annual yield estimation, and transparent baseline comparison.	Represents one trajectory or one expected profile; tail ramps and cross-node dependence can be suppressed.
Empirical average or representative day	Seasonal comparison and long-term expected performance studies under stable historical conditions.	Captures central behavior but can lose rare events and scenario-to-scenario dispersion.
Semi-empirical irradiance–temperature model	Physically interpretable conversion from weather variables to PV output.	Does not by itself describe residual dependence, measurement noise, or distributional shift.
TMY and selected extreme years	PV system development, site comparison, and reliability or sustainability stress tests.	A selected year is still a finite trajectory rather than a full uncertainty distribution.
Probabilistic forecasting and stochastic scenarios	Forecast intervals, stochastic scheduling, and scenario-based planning.	Robustness can be weak when the empirical distribution is sparse or shifted.
Wasserstein DRO at the representation layer	Preserves trajectory ambiguity before downstream feature extraction, scenario generation, and operational analysis.	Requires explicit physical constraints, radius calibration, and reproducibility controls.

Table 2. Validity conditions for the finite-sample dual approximation.

Condition	Reason	Implementation Check
Compact normalized support	Ensures finite local maximization over feasible trajectories.	PV, ESS, and load variables are clipped to physical bounds.
Finite second moment	Required for Wasserstein-2 ambiguity.	Verified after feature standardization.
Differentiable loss	Needed for first-order Taylor expansion.	Smooth quadratic representation loss is used.
Lipschitz gradient	Controls approximation error.	Local gradient norms are monitored.
Dual penalty larger than curvature	Prevents the transport penalty from being dominated by loss curvature.	Candidate $λ_{i}$ values are screened on the calibration set.
Boundary check	Prevents invalid closed-form adversarial moves.	Numerical bounded maximization is used when support limits are active.
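
The curvature and boundary conditions in Table 2 can be illustrated on a one-dimensional toy problem. The following is a minimal sketch, assuming a quadratic loss $(a/2)(ξ - c)^{2}$ and a squared-distance transport cost as in a Wasserstein-2 dual; the function names and numeric values are hypothetical and are not taken from the paper's implementation.

```python
import numpy as np

def inner_sup_closed_form(xi_i, c, a, lam):
    """Closed-form sup over xi of (a/2)*(xi - c)**2 - lam*(xi - xi_i)**2.

    The supremum is finite only when the dual penalty dominates the loss
    curvature, i.e. 2*lam > a (assumed toy analogue of the Table 2 condition).
    """
    if 2.0 * lam <= a:
        raise ValueError("dual penalty too small: need 2*lam > a")
    d = xi_i - c
    xi_star = (2.0 * lam * xi_i - a * c) / (2.0 * lam - a)  # stationary point
    value = a * lam * d**2 / (2.0 * lam - a)
    return xi_star, value

def inner_sup_bounded(xi_i, c, a, lam, lo=0.0, hi=1.0, n_grid=100001):
    """Numerical maximization of the same objective on a clipped support [lo, hi]."""
    grid = np.linspace(lo, hi, n_grid)
    obj = 0.5 * a * (grid - c) ** 2 - lam * (grid - xi_i) ** 2
    k = int(np.argmax(obj))
    return float(grid[k]), float(obj[k])

# Hypothetical normalized sample close to the upper physical bound.
xi_i, c, a, lam = 0.9, 0.5, 1.0, 2.0
xi_star, val = inner_sup_closed_form(xi_i, c, a, lam)  # xi_star lies above 1.0
xi_b, val_b = inner_sup_bounded(xi_i, c, a, lam)       # maximizer clipped to 1.0
```

In this toy setting the unconstrained adversarial point leaves the normalized support, so the bounded maximization returns the boundary point with a slightly smaller objective value. This is the situation the boundary check in Table 2 is meant to catch.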

Table 3. Consolidated algorithmic notation, scenario-filtering rules, and implementation schedule.

Function	Quantity or Rule	Role in the Framework
Robust descriptors	$μ_{i}^{rob}$, $Σ_{i}^{rob}$, $ρ_{ij}^{rob}$	Worst-case mean, covariance, and cross-resource descriptors extracted from the ambiguity set.
Scenario scoring	$ω_{s}(\cdot)$ and $π_{s}$	Utility score and scenario weight used to prioritize informative, high-ramp, and tail-relevant trajectories.
Duplicate distance	$D_{s r}$	Combined normalized feature distance and dynamic time warping distance used before final medoid reduction.
Physical feasibility filter	PV, ESS, support, and ramp bounds	Removes candidates that violate physical limits or admissible support constraints.
Near-duplicate and strong-link filters	$τ_{D} = 0.08$, $τ_{ρ} = 0.995$	Removes almost identical or highly correlated trajectories before the final 50-scenario selection.
Extreme reserve	Top 15 percent by ramp score	Protects high ramp trajectories from being removed only because they are close to typical scenarios.
Offline calibration	Residual coupling, Wasserstein radii, scenario library, and thresholds	Executed daily, weekly, or after detected regime changes; outputs fixed radii, coefficient matrix, candidate scenarios, and threshold settings.
Online update	Data alignment, rolling empirical statistics, robust descriptors, and scenario lookup	Executed every 15 min or at the dispatch interval; avoids solving the full Wasserstein calibration problem from scratch online.
Downstream use	Voltage control, aggregation, dispatch, hosting capacity, and resilience screening	Uses robust features and selected scenarios as inputs to operational studies rather than replacing the distribution management system.
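
The filtering rules in Table 3 can be sketched as a greedy pass over candidate trajectories. This is an illustrative sketch only: it assumes a plain root-mean-square distance as a stand-in for the combined feature and dynamic time warping distance $D_{sr}$, it defines the ramp score as the largest step-to-step change, and it protects reserve scenarios during the pass instead of restoring them afterwards. The function name `filter_scenarios` and the synthetic profiles are hypothetical.

```python
import numpy as np

def filter_scenarios(X, tau_d=0.08, tau_rho=0.995, reserve_frac=0.15):
    """Greedy redundancy filter with an extreme-ramp reserve.

    X has shape (n_scenarios, n_steps) and holds normalized trajectories.
    Returns the sorted indices of retained scenarios.
    """
    n = X.shape[0]
    # Assumed ramp score: largest absolute change between consecutive samples.
    ramp = np.abs(np.diff(X, axis=1)).max(axis=1)
    n_reserve = int(np.ceil(reserve_frac * n))
    reserve = set(np.argsort(-ramp)[:n_reserve].tolist())

    kept = []
    for s in range(n):
        redundant = False
        for r in kept:
            # Assumption: RMS distance replaces the paper's combined distance.
            dist = np.sqrt(np.mean((X[s] - X[r]) ** 2))
            rho = np.corrcoef(X[s], X[r])[0, 1]
            if dist < tau_d or rho > tau_rho:
                redundant = True
                break
        # Reserve members survive even when they are close to a kept scenario.
        if not redundant or s in reserve:
            kept.append(s)
    return sorted(kept)

t = np.linspace(0.0, 1.0, 96)            # 96 quarter-hour steps
base = np.sin(np.pi * t)                 # smooth clear-sky-like profile
near_copy = base + 0.001                 # near-duplicate of the base profile
ramp_a = base.copy()
ramp_a[48:] *= 0.30                      # sudden midday drop
ramp_b = base.copy()
ramp_b[48:] *= 0.29                      # slightly deeper drop, close to ramp_a
X = np.vstack([base, near_copy, ramp_a, ramp_b])

kept_with_reserve = filter_scenarios(X)                      # [0, 2, 3]
kept_without_reserve = filter_scenarios(X, reserve_frac=0.0)  # [0, 2]
```

With the reserve active, the steepest ramp trajectory is retained even though it lies within the duplicate threshold of another ramp scenario; without the reserve it is discarded as redundant.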

Table 4. Consolidated repeated-run performance, extreme-event handling, and statistical comparison.

Method	Mean Error, 95% CI	Std. Dev.	Variance	Detection/False/Miss	Stability	p vs. Proposed
Deterministic	0.48 [0.45, 0.51]	0.184	0.0339	0.28/0.12/0.72	0.62	<0.01
Stochastic	0.36 [0.34, 0.39]	0.121	0.0146	0.42/0.10/0.58	0.74	<0.01
Decision-level DRO	0.30 [0.28, 0.32]	0.091	0.0083	0.55/0.08/0.45	0.81	<0.05
Proposed method	0.21 [0.20, 0.23]	0.052	0.0027	0.78/0.06/0.22	0.93	–
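
As a quick consistency check on Table 4, the reported variances can be compared with the squared standard deviations, and the relative error reduction of the proposed method can be computed from the mean errors. The sketch below uses only values copied from the table; the variable names are illustrative.

```python
# Values copied from Table 4:
# (mean error, std. dev., variance, detection rate, miss rate).
table4 = {
    "Deterministic": (0.48, 0.184, 0.0339, 0.28, 0.72),
    "Stochastic": (0.36, 0.121, 0.0146, 0.42, 0.58),
    "Decision-level DRO": (0.30, 0.091, 0.0083, 0.55, 0.45),
    "Proposed method": (0.21, 0.052, 0.0027, 0.78, 0.22),
}

# Variance should equal the squared standard deviation up to 4-digit rounding.
variance_ok = all(abs(r[1] ** 2 - r[2]) < 5e-5 for r in table4.values())
# Detection and miss rates should sum to one.
rates_ok = all(abs(r[3] + r[4] - 1.0) < 1e-9 for r in table4.values())

proposed_mean = table4["Proposed method"][0]
relative_reduction = {
    name: (row[0] - proposed_mean) / row[0]
    for name, row in table4.items()
    if name != "Proposed method"
}
```

Both checks pass for the tabulated values, and the mean error of the proposed method is roughly 56%, 42%, and 30% lower than the deterministic, stochastic, and decision-level DRO baselines, respectively.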

Table 5. Consolidated notation, hyperparameters, and physical settings used in the proposed representation model.

Category	Symbol or Setting	Meaning, Unit, or Selection Rule
Indexing	$i, j$, $t$, $Δt$	Node/DER indices, time index, and sampling interval; $Δt = 0.25$ h in the case study.
PV variables	$p_{i,t}^{PV}$, $G_{i,t}$, $T_{i,t}^{amb}$, $T_{i,t}^{cell}$	PV active power (kW), plane-of-array irradiance (W/m²), ambient temperature (°C), and cell temperature (°C).
ESS variables	$e_{i,t}^{ESS}$, $p_{i,t}^{ch}$, $p_{i,t}^{dis}$	Stored energy (kWh), charging power (kW), and discharging power (kW).
Distributional variables	${\hat{P}}_{i}$, $Q_{i}$, $W_{2}$	Empirical distribution, candidate adversarial distribution, and Wasserstein-2 distance in standardized trajectory space.
Ambiguity radius	$ε_{i}$	Wasserstein radius calibrated on the validation split to obtain approximately 95 percent empirical coverage.
Dual and regularization parameters	$λ_{i}$, $β_{i}$	Transport penalty and representation-regularization parameters selected on validation data; $λ_{i}$ is screened against the local curvature condition.
Spatial coupling	$ω_{ij}$, $ω_{max}$, $ρ_{max}$	Residual coupling from node $j$ to node $i$; default bounds are $ω_{max} = 0.25$ and $ρ_{max} = 0.60$.
Scenario reduction thresholds	$τ_{D}$, $τ_{ρ}$, $S$, $R$	Duplicate distance threshold, strong-correlation threshold, retained scenario budget, and repeated-run count; defaults are $S = 50$ and $R = 10$.
PV physical settings	PV capacity, inverter efficiency, temperature coefficient	Unit capacities are 50–300 kW, aggregate PV capacity is 7.8 MW, inverter efficiency is 0.96–0.98, and the temperature coefficient is −0.0035 to −0.0045 per °C.
ESS physical settings	ESS capacity and efficiency	Unit energy capacity is 100–500 kWh, aggregate ESS capacity is approximately 9.6 MWh, and charge/discharge efficiency is 0.90–0.95.
Data uncertainty	Measurement noise	Zero-mean Gaussian perturbation with standard deviation equal to 2–5 percent of the nominal channel value.
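
The physical settings and the measurement-noise rule in Table 5 can be combined in a short simulation sketch. The PV conversion below is a generic irradiance and temperature derating model chosen for illustration and is not confirmed to be the paper's model; the function names, the 100 kW rating, and the 1000 W/m² and 25 °C reference conditions are assumptions.

```python
import numpy as np

def pv_power_kw(g_poa, t_cell, p_rated_kw=100.0, gamma=-0.004, eta_inv=0.97):
    """Assumed semi-empirical PV model.

    g_poa: plane-of-array irradiance (W/m^2); t_cell: cell temperature (deg C).
    gamma and eta_inv fall inside the ranges listed in Table 5.
    """
    p_dc = p_rated_kw * (g_poa / 1000.0) * (1.0 + gamma * (t_cell - 25.0))
    # Clip to physical bounds so the normalized support stays compact.
    return np.clip(eta_inv * p_dc, 0.0, p_rated_kw)

def add_measurement_noise(x, nominal, rel_std=0.03, rng=None):
    """Add zero-mean Gaussian noise with std equal to rel_std * nominal."""
    rng = np.random.default_rng() if rng is None else rng
    return x + rng.normal(0.0, rel_std * nominal, size=np.shape(x))

p_ref = pv_power_kw(1000.0, 25.0)   # 97.0 kW at reference conditions
p_hot = pv_power_kw(1000.0, 45.0)   # 89.24 kW with a 20 deg C hotter cell

rng = np.random.default_rng(0)
g_true = np.full(20000, 800.0)                          # W/m^2
g_meas = add_measurement_noise(g_true, nominal=1000.0,  # 3 percent of nominal
                               rel_std=0.03, rng=rng)
noise_std = float((g_meas - g_true).std())              # close to 30 W/m^2
```

Clipping the output mirrors the compact-support condition in Table 2, and `rel_std` can be swept over 0.02 to 0.05 to cover the noise range stated in Table 5.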

Table 6. Consolidated benchmark definitions and evaluation metrics.

Group	Definition	Common Input or Unit
Deterministic baseline	Uses empirical mean trajectories and does not model ambiguity.	Same normalized train/calibration/test data.
Stochastic baseline	Samples from fitted empirical marginal distributions and empirical covariance statistics.	Same noise and shift protocol.
Decision-level DRO	Applies Wasserstein robustness after deterministic upstream feature construction.	Same downstream scenario budget.
Proposed representation	Applies Wasserstein ambiguity at the representation layer before robust features and scenarios are generated.	Same feeder, data, and evaluation metrics.
Mean normalized error	$E_{m,s,r} = ∥{\hat{y}}_{m,s,r} - y_{s,r}^{ref}∥_{2} / (∥y_{s,r}^{ref}∥_{2} + 10^{-8})$.	Dimensionless.
Scenario variance and standard deviation	$V_{m,r} = \frac{1}{S - 1} \sum_{s} {(E_{m,s,r} - {\bar{E}}_{m,r})}^{2}$ and $σ_{m,r} = \sqrt{V_{m,r}}$.	Squared error unit and error unit.
Detection, false alarm, and miss rates	Ratios of detected extreme events, falsely detected events, and missed extreme events.	Probability.
Stability index	Monotone inverse indicator of scenario-level error dispersion.	Dimensionless.
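
The error and dispersion metrics in Table 6 translate directly into a few lines of NumPy. The sketch below follows the tabulated formulas for the normalized error and the sample variance; the event-rate function assumes that the false-alarm rate is normalized by the number of non-events, which the table does not state explicitly. All function names are illustrative.

```python
import numpy as np

def normalized_error(y_hat, y_ref, eps=1e-8):
    """E = ||y_hat - y_ref||_2 / (||y_ref||_2 + 1e-8), as in Table 6."""
    y_hat = np.asarray(y_hat, dtype=float)
    y_ref = np.asarray(y_ref, dtype=float)
    return np.linalg.norm(y_hat - y_ref) / (np.linalg.norm(y_ref) + eps)

def scenario_dispersion(errors):
    """Sample variance with the 1/(S-1) factor and its square root."""
    errors = np.asarray(errors, dtype=float)
    var = errors.var(ddof=1)
    return float(var), float(np.sqrt(var))

def event_rates(pred, truth):
    """Detection, false-alarm, and miss rates from boolean event flags.

    Assumption: false alarms are divided by the number of true non-events.
    Requires at least one event and one non-event in `truth`.
    """
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    n_pos = truth.sum()
    n_neg = (~truth).sum()
    detection = (pred & truth).sum() / n_pos
    false_alarm = (pred & ~truth).sum() / n_neg
    miss = (~pred & truth).sum() / n_pos
    return float(detection), float(false_alarm), float(miss)

err = normalized_error([3.0, 5.0], [3.0, 4.0])        # about 0.2
var, std = scenario_dispersion([0.1, 0.2, 0.3])       # about 0.01 and 0.1
det, fa, miss = event_rates(
    pred=[1, 1, 1, 0, 1, 0, 0, 0, 0, 0],
    truth=[1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
)                                                     # 0.75, 1/6, 0.25
```

By construction the detection and miss rates sum to one, which matches the pattern of the Detection/False/Miss column in Table 4.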

Table 7. Consolidated stress test summary under Wasserstein radius growth and controlled distribution shifts.

Stress Test	Level	Deterministic	Decision-Level DRO	Proposed Method
Radius error	$ε = 0.00$	0.09	0.09	0.09
Radius error	$ε = 0.05$	0.20	0.15	0.12
Radius error	$ε = 0.10$	0.45	0.30	0.19
Radius error	$ε = 0.15$	0.65	0.40	0.25
Radius error	$ε = 0.20$	0.75	0.45	0.28
Shift mean error, 95 percent CI	0 percent	0.09 [0.08, 0.10]	0.09 [0.08, 0.10]	0.09 [0.08, 0.10]
Shift mean error, 95 percent CI	20 percent	0.31 [0.29, 0.34]	0.22 [0.20, 0.24]	0.16 [0.15, 0.18]
Shift mean error, 95 percent CI	35 percent	0.52 [0.48, 0.55]	0.34 [0.31, 0.37]	0.22 [0.20, 0.24]
Shift mean error, 95 percent CI	50 percent	0.75 [0.71, 0.79]	0.45 [0.41, 0.49]	0.28 [0.26, 0.31]

Table 8. Consolidated scenario reduction audit trail and extreme-ramp threshold sensitivity.

Analysis Item	Value	Interpretation
Candidate generation	1000 scenarios	Initial ambiguity-aware candidate library.
Physical feasibility filtering	914 scenarios	Removes PV, ESS, and support-bound violations.
Near-duplicate filtering	436 scenarios	Removes trajectories with small combined distance and high correlation.
Strong-link filtering	238 scenarios	Prevents almost identical linked scenarios from dominating.
Extreme reserve added back	264 scenarios	Restores high-ramp candidates protected by the reserve rule.
Weighted medoid reduction	50 scenarios	Final representative and extreme scenario set.
Detection at 60 kW/15 min	Deterministic/DRO/Proposed = 0.46/0.68/0.86	Moderate-ramp threshold.
Detection at 80 kW/15 min	Deterministic/DRO/Proposed = 0.28/0.55/0.78	Default extreme-event threshold.
Detection at 100 kW/15 min	Deterministic/DRO/Proposed = 0.20/0.46/0.75	Strong-ramp threshold.
Detection at 150 kW/15 min	Deterministic/DRO/Proposed = 0.12/0.31/0.70	Severe tail-ramp threshold.
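
The audit trail and the ramp thresholds in Table 8 can be reproduced with simple bookkeeping. The sketch below copies the stage counts from the table and assumes that a ramp is the absolute change between consecutive 15-min samples; the example profile and the function names are hypothetical.

```python
import numpy as np

# Stage counts copied from Table 8.
audit_trail = [
    ("candidate generation", 1000),
    ("physical feasibility filtering", 914),
    ("near-duplicate filtering", 436),
    ("strong-link filtering", 238),
    ("extreme reserve added back", 264),
    ("weighted medoid reduction", 50),
]
n0 = audit_trail[0][1]
retained_share = {stage: count / n0 for stage, count in audit_trail}
# Scenarios restored by the extreme-reserve rule: 264 - 238 = 26.
reserve_restored = audit_trail[4][1] - audit_trail[3][1]

def max_ramp_kw(p_kw):
    """Largest absolute change between consecutive 15-min samples (kW)."""
    return float(np.max(np.abs(np.diff(np.asarray(p_kw, dtype=float)))))

def is_extreme(p_kw, threshold_kw=80.0):
    """Flag a trajectory whose largest 15-min ramp reaches the threshold."""
    return max_ramp_kw(p_kw) >= threshold_kw

profile = [120.0, 135.0, 150.0, 60.0, 70.0]  # hypothetical kW samples
flags = {thr: is_extreme(profile, thr) for thr in (60.0, 80.0, 100.0, 150.0)}
```

The example profile contains a 90 kW drop within one interval, so it counts as an extreme event at the 60 and 80 kW/15 min thresholds but not at 100 or 150 kW/15 min. The final 50-scenario set corresponds to 5% of the initial candidate library.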

Table 9. Consolidated ablation results and radius and scenario-budget sensitivity.

Variant or Setting	Mean Error	Detection Rate	Scenario Std. Dev.
Full proposed framework	0.21	0.78	0.052
Without Wasserstein ambiguity set	0.35	0.49	0.116
Without robust feature extraction	0.29	0.61	0.092
Without residual spatial coupling	0.25	0.68	0.074
Without extreme reserve	0.23	0.58	0.061
Without redundancy filtering	0.22	0.76	0.070
Radius 0.08, 50 scenarios	0.21	0.76	–
Radius 0.10, 50 scenarios	0.21	0.78	–
Radius 0.12, 50 scenarios	0.22	0.79	–
Radius 0.15, 50 scenarios	0.25	0.80	–
Radius 0.10, 30/80 scenarios	0.22/0.20	0.74/0.79	–
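
The ablation rows of Table 9 can be summarized by the change each removal causes relative to the full framework. The sketch below uses only tabulated values; the dictionary layout and variable names are illustrative.

```python
# Values copied from Table 9: (mean error, detection rate).
full_error, full_detection = 0.21, 0.78
ablations = {
    "Wasserstein ambiguity set": (0.35, 0.49),
    "robust feature extraction": (0.29, 0.61),
    "residual spatial coupling": (0.25, 0.68),
    "extreme reserve": (0.23, 0.58),
    "redundancy filtering": (0.22, 0.76),
}

# Impact of removing each component: (error increase, detection loss).
impact = {
    name: (round(err - full_error, 2), round(full_detection - det, 2))
    for name, (err, det) in ablations.items()
}
rank_by_error = sorted(impact, key=lambda k: impact[k][0], reverse=True)
rank_by_detection = sorted(impact, key=lambda k: impact[k][1], reverse=True)
```

Removing the Wasserstein ambiguity set causes the largest change on both measures (error +0.14, detection -0.29). Removing the extreme reserve raises the mean error by only 0.02 but lowers detection by 0.20, the second-largest detection loss, which is consistent with its role of protecting high-ramp trajectories.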

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Liu, A.; Liu, M.; Li, T.; Feng, L.; Xiao, C. Robust Representation of Solar Photovoltaic Variability via Wasserstein Distributional Modeling. Energies 2026, 19, 2665. https://doi.org/10.3390/en19112665
