A Reproducible and Correlation-Aware Polynomial Chaos Framework for Probabilistic AC Power Flow in Renewable-Rich Distribution Networks

Guerra, Julio; Recalde, Gustavo; Gavilanez, Jean; Cuenca, Dirley

doi:10.3390/en19122777

¹ Faculty of Engineering in Applied Sciences, Universidad Técnica del Norte, Ibarra 100105, Ecuador

² Discience Ec., Ibarra 100101, Ecuador

* Author to whom correspondence should be addressed.

Energies 2026, 19(12), 2777; https://doi.org/10.3390/en19122777 (registering DOI)

Submission received: 14 May 2026 / Revised: 4 June 2026 / Accepted: 6 June 2026 / Published: 9 June 2026

Abstract

High renewable penetration introduces stochastic variability in distribution-network operation, requiring probabilistic AC power-flow tools that remain accurate in the tails while avoiding the computational burden of large Monte Carlo simulations. This paper presents a fully reproducible non-intrusive polynomial chaos expansion (PCE) framework for uncertainty propagation through nonlinear Newton–Raphson AC power flow. The method uses sparse-grid quadrature to train PCE surrogates from deterministic power-flow evaluations and is benchmarked against high-fidelity Monte Carlo simulations. The validation covers the IEEE 33-bus feeder with up to 50,000 Monte Carlo samples, 95% bootstrap confidence intervals, PCE orders 2–5, correlated uncertainty scenarios, realistic thermal-loading recalibration, and reactive-power sensitivity of renewable injections, together with multi-feeder testing on IEEE 33-bus, CIGRE MV, CIGRE LV, and IEEE 118-bus networks and a 365-snapshot full-year daily screening. For the base IEEE 33-bus case, third-order PCE required only 494 deterministic power-flow evaluations and reproduced the 50,000-sample Monte Carlo benchmark with relative mean errors of 0.014% for minimum voltage, 0.119% for active losses, and 0.113% for substation import. The corresponding wall-clock speed-up was 13.29×, and the number of deterministic evaluations fell by a factor of approximately 101. Correlated load–PV uncertainty increased the upper tail of substation import from 6.06 MW to 6.30 MW, and realistic thermal recalibration revealed line-loading p99 values above 100% for the 60% target case, demonstrating the operational value of physically meaningful ampacity settings. The proposed workflow provides an open, scalable, and tail-aware basis for uncertainty-informed distribution-network planning under renewable variability.

Keywords:

probabilistic power flow; polynomial chaos expansion; non-intrusive spectral projection; correlated uncertainty; renewable integration; distribution networks; Monte Carlo benchmarking; sparse-grid quadrature

1. Introduction

1.1. Motivation, Scope, Research Problem, Hypothesis, and Objectives

Distribution networks are undergoing a rapid structural change driven by the large-scale integration of variable renewable generation (in particular photovoltaic and wind resources) and by increasingly volatile demand patterns. Under these conditions, the operating point of the feeder is no longer well described by a single deterministic snapshot: both injections and, in some cases, effective network parameters vary stochastically, which propagates nonlinearly through the AC power flow equations. As a consequence, engineering questions that are inherently risk-based—such as the probability of voltage-limit violations, the likelihood of extreme losses, or the tail behavior of substation import—cannot be answered reliably using purely deterministic power-flow calculations alone. This motivates the adoption of probabilistic power flow (PPF) methods in distribution systems with high renewable penetration, as also emphasized by recent surveys on probabilistic load-flow practices for PV-rich distribution feeders [1].

This need for probabilistic and uncertainty-aware analysis is further reinforced by emerging flexible demand and electrification trends, including electric-vehicle integration, where grid operation increasingly depends on stochastic interactions among renewable generation, demand response, storage, and market participation [2].

A widely used baseline for PPF is Monte Carlo (MC) simulation, where thousands of deterministic AC power-flow solves are performed under randomly sampled scenarios. MC is attractive due to its conceptual simplicity and statistical consistency; however, its computational cost scales linearly with the number of samples and becomes prohibitive when tails must be estimated accurately (e.g., p01–p99) or when extensive parametric sweeps are required (e.g., renewable-penetration or uncertainty-magnitude studies). This motivates surrogate-based approaches that preserve the physical fidelity of the nonlinear AC equations while reducing the number of required deterministic solves.

In this context, polynomial chaos expansion (PCE) provides a principled framework for uncertainty propagation by approximating stochastic outputs as orthogonal polynomial expansions of latent random inputs, enabling efficient computation of moments and distributional properties. The Wiener–Askey polynomial chaos framework formalizes the construction of orthogonal bases aligned with common input distributions [3] and has been applied to stochastic AC power flow problems, demonstrating that PCE can provide accurate distributional approximations with substantially fewer deterministic evaluations than brute-force MC [4]. While PCE has gained attention in power-system applications, including recent contributions in Electric Power Systems Research that leverage generalized polynomial chaos to tackle distribution-level stochastic studies and related optimization formulations [5,6], there remains a practical gap in reproducible, configuration-driven pipelines that (i) retain the full nonlinear AC model, (ii) validate not only moments but also tail percentiles against MC, and (iii) quantify the computational speed-up and numerical robustness across renewable regimes.

Accordingly, the research problem addressed in this work is: can a non-intrusive PCE-based probabilistic AC power flow framework reproduce the output distributions of a Monte Carlo benchmark—including tails relevant to operational risk—while sufficiently reducing computational cost to enable routine, fully reproducible studies on standard hardware? To answer this question, this study focuses on a distribution-feeder case study and a set of output risk indicators that are standard in distribution-system assessment: minimum system voltage, total active-power losses, and substation import power (with additional indicators reported where meaningful).

The central hypothesis is that a low-order PCE surrogate, trained using numerical projection on a sparse set of deterministic power-flow evaluations, can approximate the stochastic mapping induced by the nonlinear AC equations with sufficient accuracy to match Monte Carlo statistics (mean, variance, and risk-relevant quantiles), while requiring substantially fewer deterministic solves and maintaining numerical stability across high renewable penetration regimes.

This study pursues five specific objectives. First, it develops a fully reproducible non-intrusive PCE pipeline for probabilistic AC power flow using open-source tools and configuration-driven execution. Second, it validates the surrogate against a substantially larger Monte Carlo benchmark, including 50,000-sample reference simulations and bootstrap confidence intervals for means and tail quantiles. Third, it evaluates PCE order convergence for polynomial orders 2–5 to assess whether a third-order expansion is sufficient under the studied nonlinear AC response. Fourth, it extends the uncertainty model beyond independent inputs by incorporating correlated load–PV, PV–wind, and mixed dependence scenarios, as well as reactive-power sensitivity through alternative renewable power factors. Fifth, it broadens the engineering validation through realistic thermal-loading recalibration, multi-feeder testing on IEEE 33-bus, CIGRE MV, CIGRE LV, and IEEE 118-bus systems, and a 365-snapshot full-year daily screening. These additions reposition the contribution from a single-feeder PCE demonstration toward a reproducible, tail-aware, and operationally interpretable validation framework for renewable-rich distribution networks [7,8].

The novelty of this work lies in integrating non-intrusive sparse-grid PCE with a transparent and reproducible AC power-flow workflow that explicitly validates tails, Monte Carlo statistical uncertainty, PCE order convergence, correlated inputs, realistic line-loading calibration, multi-feeder scalability, and annual operating variability. This combination addresses several practical limitations commonly found in PCE-based probabilistic power-flow studies, where validation is often restricted to a single benchmark feeder, independent inputs, moment-level comparisons, or non-binding thermal constraints.

1.2. Theoretical Background

High-penetration renewable distributed energy resources (DERs) change the operational regime of distribution networks by introducing fast and persistent uncertainty in net injections. In a radial or weakly meshed feeder, small stochastic variations in nodal active/reactive injections can translate into non-negligible variability of bus-voltage magnitudes, feeder losses, and upstream import at the substation—precisely the quantities used for power-quality compliance, loss assessment, and operational planning. Probabilistic power flow (PPF) formalizes this setting by treating the AC power-flow mapping as a transformation from random inputs (e.g., load multipliers and renewable generation multipliers) to random outputs (e.g., voltage extrema and losses). In the distribution context, modern surveys consistently identify Monte Carlo (MC) simulation as the most general benchmark because it does not linearize the nonlinearity of the AC equations; however, they also emphasize that its computational burden becomes prohibitive when accurate tail statistics are required or when many operating points must be assessed [1].

Within this landscape, a large body of work has pursued “surrogate” uncertainty-propagation techniques that preserve the nonlinear AC physics while reducing the number of full Newton–Raphson (NR) solves. In power systems, variance-reduction sampling (e.g., structured space-filling strategies) has been proposed to improve MC efficiency while keeping the black-box solver unchanged; for instance, Latin-supercube sampling has been reported as a practical improvement over plain MC in probabilistic power-flow studies [9]. Yet, even improved sampling retains the slow convergence of sampling-based estimators for rare-event or far-tail quantiles—precisely the regime that matters for voltage-violation risk and contingency-aware planning.

Polynomial chaos expansion (PCE) provides a spectral alternative for uncertainty propagation when the power-flow solution is sufficiently smooth with respect to the uncertain inputs. The central idea is to represent a quantity of interest $y(\xi)$ (e.g., system minimum-voltage magnitude) as a truncated series of orthonormal polynomials $\{\Psi_k\}_{k=0}^{K}$ in a vector of independent random variables $\xi$:

$$y(\xi) \approx \sum_{k=0}^{K} \hat{y}_k \Psi_k(\xi), \tag{1}$$

where the polynomial family is chosen to be orthogonal with respect to the joint probability measure of $\xi$. The generalized PCE framework links common distributions to orthogonal polynomial families through the Wiener–Askey scheme, enabling, for example, Hermite polynomials for Gaussian inputs and Legendre polynomials for uniform inputs [10]. This representation is attractive because low-order moments can be analytically computed from the coefficients (e.g., $E[y]$ is $\hat{y}_0$ under standard normalization, while $\mathrm{Var}(y)$ is the sum of squared coefficients of non-constant modes), and because the same surrogate can be inexpensively queried to extract percentiles, risk measures, or sensitivity indices.

Two broad computational routes exist for obtaining the PCE coefficients. Intrusive stochastic Galerkin (SG) methods insert the PCE ansatz directly into the governing equations (here, the nonlinear AC power-flow equations) and enforce orthogonality of the residual with respect to each basis function. This yields a coupled deterministic system in the unknown PCE coefficients, typically solved by a Newton-type method applied to an “augmented” residual system [11]. The intrusive approach can be highly accurate and can leverage the structure of the coupled system, but it requires a careful numerical implementation to maintain robustness. Non-intrusive methods, in contrast, keep the deterministic solver as a black box and compute coefficients through projection or regression using solver evaluations at selected nodes in the random space; stochastic collocation is a well-established non-intrusive route that often attains high-order accuracy for smooth responses while retaining solver modularity [12]. In power-flow applications, both routes have been reported. Since the present study keeps the deterministic pandapower solver unchanged and computes PCE coefficients from deterministic evaluations at sparse-grid quadrature nodes, it follows the non-intrusive projection route.

A practical challenge in PCE-based PPF is the growth of the basis size with the number of uncertain inputs and the polynomial order (the “curse of dimensionality”). Consequently, efficient numerical integration and node-selection strategies are crucial. When coefficients are computed via projection, multidimensional quadrature rules are required to evaluate inner products; sparse-grid constructions, particularly Smolyak-type rules, exploit the tensor-product structure to reduce the number of nodes dramatically relative to full tensor quadrature while preserving high polynomial exactness in moderate dimensions. A comprehensive numerical analysis perspective on sparse grids and their approximation properties is provided in the standard survey literature [13]. In distribution-network PPF, such sparse quadrature is especially relevant because it directly reduces the number of deterministic NR solves needed to build the surrogate.

Finally, a complementary strand of the literature emphasizes that PCE-based PPF is most compelling when it is paired with transparent modeling assumptions and open computational tooling. Recent power-systems studies have demonstrated PPF workflows where generalized PCE is coupled with deterministic distribution power-flow solvers to reproduce full output distributions with substantial speed-ups over MC, while still retaining a physically interpretable AC model [14]. Reproducibility is further enabled by open-source power-system analysis frameworks and uncertainty-quantification libraries, including pandapower for AC power-flow automation [7] and Chaospy for polynomial-chaos and quadrature constructions [8]. Together, these elements motivate PCE-based PPF as a numerically principled and practically reproducible approach for uncertainty-aware assessment of distribution networks with high renewable penetration.

2. Materials and Methods

This study evaluates probabilistic AC power flow in a distribution feeder with high renewable penetration by propagating uncertainty through the nonlinear AC equations. A Monte Carlo (MC) simulation is adopted as the statistical benchmark, while a polynomial chaos expansion (PCE) surrogate is used to reduce the computational burden without sacrificing physical interpretability. Both approaches rely on the same deterministic Newton–Raphson (NR) AC power flow engine implemented in pandapower [7], ensuring that differences between MC and PCE arise exclusively from the uncertainty propagation strategy. Uncertainty quantification (UQ) primitives for polynomial chaos and quadrature are implemented through Chaospy [8].

2.1. Case Study and Deterministic AC Power Flow Model

The primary benchmark is the IEEE 33-bus radial distribution feeder (“case33bw” in pandapower), which is retained to ensure comparability with the original validation. To address scalability and generality, this study also includes additional test networks available through pandapower: CIGRE MV, CIGRE LV, and IEEE 118-bus. The CIGRE MV and LV systems provide more realistic distribution-network structures, whereas IEEE 118-bus is used as a larger-scale stress case for computational scalability rather than as a distribution-feeder representation. All networks are solved using the same balanced steady-state AC formulation and the same Newton–Raphson solver configuration. Let $Y = G + jB$ denote the bus admittance matrix and $V_i = |V_i| e^{j\theta_i}$ the complex voltage at bus $i$. The active- and reactive-power balance equations are expressed as

$$P_i = \sum_{k=1}^{n} |V_i| |V_k| \left( G_{ik} \cos(\theta_i - \theta_k) + B_{ik} \sin(\theta_i - \theta_k) \right), \tag{2}$$

$$Q_i = \sum_{k=1}^{n} |V_i| |V_k| \left( G_{ik} \sin(\theta_i - \theta_k) - B_{ik} \cos(\theta_i - \theta_k) \right). \tag{3}$$

The nonlinear system is solved using a Newton–Raphson iteration with state vector $x = [\theta, |V|]$, mismatch vector $F(x)$, and Jacobian $J(x)$:

$$J(x^{(m)}) \, \Delta x^{(m)} = -F(x^{(m)}), \qquad x^{(m+1)} = x^{(m)} + \Delta x^{(m)}. \tag{4}$$

The same solver, convergence tolerances, and initialization procedure are used for every deterministic evaluation in both MC and PCE, providing a controlled comparison focused on uncertainty propagation.
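As a minimal illustration of Equations (2)–(4), the following sketch applies the Newton–Raphson update to a two-bus system (one slack bus feeding one PQ load bus). It is not the pandapower implementation used in this study: the function name `solve_two_bus`, the line impedance, and the load values are illustrative assumptions chosen only to make the iteration concrete.

```python
import cmath
import math


def solve_two_bus(p_load, q_load, r=0.02, x=0.04, tol=1e-10, max_iter=20):
    """Newton-Raphson for a slack bus (1.0 p.u., 0 rad) feeding one PQ bus.

    All quantities are in per unit; r and x are hypothetical line parameters.
    Returns (theta_2, |V_2|, number of iterations).
    """
    y = 1.0 / complex(r, x)             # series admittance of the line
    g21, b21 = (-y).real, (-y).imag     # off-diagonal entry Y_21 = -y
    g22, b22 = y.real, y.imag           # diagonal entry Y_22 = y
    p_spec, q_spec = -p_load, -q_load   # a load is a negative injection
    theta, v = 0.0, 1.0                 # flat start
    for iteration in range(max_iter):
        c, s = math.cos(theta), math.sin(theta)
        # Injections from Equations (2)-(3) with V_1 = 1 and theta_1 = 0
        p = v * (g21 * c + b21 * s) + v * v * g22
        q = v * (g21 * s - b21 * c) - v * v * b22
        f_p, f_q = p - p_spec, q - q_spec   # mismatch vector F(x)
        if max(abs(f_p), abs(f_q)) < tol:
            return theta, v, iteration
        # Analytic Jacobian J(x) with respect to (theta, |V|)
        dp_dth = v * (-g21 * s + b21 * c)
        dp_dv = (g21 * c + b21 * s) + 2.0 * v * g22
        dq_dth = v * (g21 * c + b21 * s)
        dq_dv = (g21 * s - b21 * c) - 2.0 * v * b22
        det = dp_dth * dq_dv - dp_dv * dq_dth
        # Solve J dx = -F by Cramer's rule, then update the state (Equation (4))
        theta += (-f_p * dq_dv + f_q * dp_dv) / det
        v += (-dp_dth * f_q + dq_dth * f_p) / det
    raise RuntimeError("Newton-Raphson did not converge")


theta2, v2_mag, n_iter = solve_two_bus(p_load=0.5, q_load=0.2)
print(f"|V2| = {v2_mag:.4f} p.u., theta2 = {math.degrees(theta2):.3f} deg, iterations = {n_iter}")
```

The converged state can be checked independently by evaluating the complex power injected at the load bus, which should equal the negative of the specified load.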

Distributed renewable generation is represented as injections connected at candidate buses selected automatically by the workflow. The PV injections operate at unity power factor, whereas wind injections operate at a near-unity power factor. System voltage quality and operational risk are assessed using distributional metrics computed from the solved AC state.

2.2. Operating Point Construction and Snapshot Selection

A 24-h operating horizon is used to define the base snapshot for the main Monte Carlo–PCE comparison. The reference snapshot is selected consistently for all base, correlated, thermal, and power-factor sensitivity experiments so that differences among cases are attributable to the uncertainty model or network configuration rather than to a change in operating point. In addition, a 365-snapshot full-year daily screening is introduced to assess temporal variability across a synthetic annual operating profile. In this annual screening, one representative snapshot per day is evaluated using the same PCE configuration, yielding 365 daily probabilistic assessments. This design provides a compact and reproducible annual-scale test of seasonal and diurnal variability while keeping the study computationally tractable.

2.3. Uncertainty Model and Random Inputs

Uncertainty is introduced through a low-dimensional latent random vector $Z \sim N(0, I_d)$. Physical uncertain scalings are constructed from $Z$ to ensure consistent sampling and orthogonal polynomial bases aligned with the latent distribution. In particular, load uncertainty is modeled through a multiplicative lognormal scaling,

$$\lambda_L = \exp(\mu_L + \sigma_L Z_1), \tag{5}$$

which preserves positivity and captures realistic relative fluctuations around the deterministic demand snapshot. Renewable availability is represented through bounded scalings derived via probability integral transforms: $U = \Phi(Z)$, with $\Phi$ the standard normal CDF, is mapped to the target distribution and then converted into a multiplicative factor $\lambda \in [0,1]$. Network-parameter uncertainty, when enabled, is represented via small perturbations on line parameters (e.g., resistance scaling) constrained to remain physically meaningful.
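The latent-to-physical mapping can be sketched with the Python standard library alone. The example below covers the lognormal load multiplier of Equation (5) and a Weibull-based wind factor obtained through the probability integral transform, using the parameter values reported in Section 2.10. The function names are hypothetical, and clipping the Weibull draw to $[0,1]$ is an assumption made here for illustration, since the exact bounding rule is not restated in this section. The Beta-distributed PV factor follows the same pattern but needs an inverse Beta CDF, which the standard library does not provide.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, provides the CDF Phi


def load_multiplier(z, mu=0.0, sigma=0.2):
    """Lognormal load scaling of Equation (5)."""
    return math.exp(mu + sigma * z)


def weibull_from_latent(z, shape=2.0, scale=1.0):
    """Map a latent normal draw to a Weibull draw via U = Phi(z).

    The Weibull inverse CDF is scale * (-ln(1 - U))**(1/shape); the identity
    1 - Phi(z) = Phi(-z) is used so the upper tail is evaluated accurately.
    """
    return scale * (-math.log(STD_NORMAL.cdf(-z))) ** (1.0 / shape)


def wind_availability(z, shape=2.0, scale=1.0):
    """Bounded wind factor in [0, 1]; the clipping rule is an assumption."""
    return min(1.0, weibull_from_latent(z, shape, scale))


for z in (-2.0, 0.0, 2.0):
    print(z, round(load_multiplier(z), 4), round(wind_availability(z), 4))
```

Because both maps are monotone in the latent variable, quantiles of the physical multipliers correspond directly to quantiles of $Z$.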

This latent-variable construction is compatible with polynomial chaos expansions from the Wiener–Askey framework [4], supporting Hermite bases for Gaussian inputs and enabling stable computation of surrogate moments and distributions. To address the limitations of independent stochastic inputs, the experiments additionally include dependence-aware scenarios implemented through a Gaussian copula in the latent space. Three representative dependence structures are considered: negative load–PV correlation, negative PV–wind correlation, and a mixed correlation case combining load–PV, load–wind, and PV–wind dependencies. These scenarios are not intended to represent a specific measured feeder but to quantify the direction and magnitude of tail-risk bias introduced by the independence assumption. Reactive-power sensitivity is also evaluated by varying the renewable power factor from 1.00 to 0.98, 0.95, and 0.90, thereby testing whether near-unity inverter assumptions materially affect voltage and loss distributions.
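A Gaussian copula in the latent space amounts to correlating the standard normal drivers before the marginal transforms are applied. The sketch below shows the two-variable case (for example, load and PV) using the 2×2 Cholesky factor of the correlation matrix. The correlation value $\rho = -0.5$ and the function names are illustrative assumptions, not the values used in the reported experiments.

```python
import math
import random


def correlated_latent_pairs(rho, n_samples, seed=123):
    """Draw (Z_a, Z_b) pairs with correlation rho from independent normals."""
    rng = random.Random(seed)
    ortho = math.sqrt(1.0 - rho * rho)
    pairs = []
    for _ in range(n_samples):
        e1 = rng.gauss(0.0, 1.0)
        e2 = rng.gauss(0.0, 1.0)
        # Cholesky factor of [[1, rho], [rho, 1]] applied to (e1, e2)
        pairs.append((e1, rho * e1 + ortho * e2))
    return pairs


def sample_correlation(pairs):
    """Pearson correlation of a list of (a, b) pairs."""
    n = len(pairs)
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    return cov / math.sqrt(var_a * var_b)


pairs = correlated_latent_pairs(rho=-0.5, n_samples=20_000)
print(round(sample_correlation(pairs), 3))
```

One consequence of this construction is that the independent draws `e1` and `e2` remain available as the variables of a Hermite basis, while the dependence enters only through the linear map. Note also that the correlation imposed on the latent normals is generally not identical to the Pearson correlation of the transformed physical multipliers, although rank dependence is preserved under monotone marginal transforms.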

The selected marginal distributions are intended as reproducible benchmark assumptions rather than feeder-specific forecasts. Load uncertainty is represented with a lognormal multiplier because demand variations are naturally positive and often modeled as multiplicative deviations around a forecast or deterministic operating point. PV availability is mapped to a bounded beta-type scaling because photovoltaic production is physically constrained between zero and its available capacity and typically exhibits skewness under uncertain irradiance conditions. Wind availability is represented with a Weibull-based scaling because Weibull models are widely used to describe wind-speed variability and can be transformed into bounded generation availability factors. Line resistance uncertainty is modeled as a small normal perturbation around the nominal value to represent parameter tolerance, temperature-related variation, and modeling uncertainty while preserving physically meaningful positive impedances. The numerical parameters used in this study are therefore not claimed as measurements from a specific utility feeder; rather, they define a transparent and repeatable stress-test configuration for comparing Monte Carlo and PCE under controlled uncertainty conditions.

2.4. Monte Carlo Probabilistic Power Flow Benchmark

The MC benchmark estimates output distributions through repeated deterministic NR solves under random scenarios. For $i = 1, \dots, N_{MC}$, a latent sample $Z^{(i)}$ is drawn, transformed into a physical scenario (loads, renewable injections, and parameter scalings), applied to the feeder model, and then solved with NR. For each sample, convergence status is recorded. Converged samples are used to compute empirical statistics, while non-converged samples (if any) are reported to quantify numerical robustness under uncertainty.

For a scalar output $y$, MC estimators of mean and variance are computed as

$$\hat{\mu}_{MC} = \frac{1}{N_{ok}} \sum_{i=1}^{N_{ok}} y^{(i)}, \qquad \hat{\sigma}_{MC}^{2} = \frac{1}{N_{ok} - 1} \sum_{i=1}^{N_{ok}} \left( y^{(i)} - \hat{\mu}_{MC} \right)^{2}, \tag{6}$$

where $N_{ok}$ is the number of converged trials. Empirical quantiles $\hat{q}_{\alpha, MC}$ are obtained from the sorted converged samples. MC provides an asymptotically unbiased reference but requires $N_{MC}$ deterministic solves, making it computationally demanding for large parametric studies.
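The estimators in Equation (6) and the empirical quantiles can be sketched as follows. The function name `mc_summary`, the dictionary layout, and the nearest-rank quantile definition are assumptions for illustration; other quantile conventions (for example, linear interpolation) differ only marginally at the sample sizes used here.

```python
import math
import statistics


def mc_summary(outputs, converged, probs=(0.01, 0.05, 0.95, 0.99)):
    """Summarize one scalar output using converged Monte Carlo trials only."""
    ok = [y for y, flag in zip(outputs, converged) if flag]
    ordered = sorted(ok)
    n_ok = len(ordered)
    quantiles = {}
    for p in probs:
        rank = max(1, math.ceil(p * n_ok))   # nearest-rank rule (assumption)
        quantiles[p] = ordered[rank - 1]
    return {
        "n_ok": n_ok,
        "n_failed": len(outputs) - n_ok,      # reported for robustness
        "mean": statistics.fmean(ok),
        "variance": statistics.variance(ok),  # uses the N_ok - 1 denominator
        "quantiles": quantiles,
    }


summary = mc_summary(
    outputs=[0.951, 0.948, 0.955, 0.940, 0.0],
    converged=[True, True, True, True, False],
)
print(summary["n_ok"], summary["n_failed"], round(summary["mean"], 4))
```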

2.5. Polynomial Chaos Expansion Surrogate

PCE approximates the stochastic mapping from uncertain inputs to power-flow outputs using an orthogonal polynomial series. For a scalar output $y(Z)$, a truncated expansion is written as

$$y(Z) \approx \sum_{k=0}^{P} c_k \Psi_k(Z), \tag{7}$$

where $\{\Psi_k\}$ is a multivariate orthogonal polynomial basis in $Z$ (Hermite polynomials for Gaussian latent inputs) and $\{c_k\}$ are deterministic coefficients. Under a total-order truncation of degree $p$ in dimension $d$, the number of retained terms is

$$P + 1 = \binom{d + p}{p}. \tag{8}$$

This work adopts the Wiener–Askey polynomial chaos perspective [3] and follows established stochastic power-flow practices that apply polynomial chaos to AC power flow and stochastic optimal power flow formulations [5].
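To make Equation (8) concrete, the short sketch below evaluates the basis size for the four stochastic inputs enabled in this study ($d = 4$) and the polynomial orders considered in the convergence analysis. The function name is hypothetical.

```python
from math import comb


def pce_basis_size(d: int, p: int) -> int:
    """Number of terms P + 1 in a total-order PCE of degree p in d variables."""
    return comb(d + p, p)


for order in (2, 3, 4, 5):
    print(f"d = 4, p = {order}: {pce_basis_size(4, order)} coefficients")
# d = 4, p = 2: 15 coefficients
# d = 4, p = 3: 35 coefficients
# d = 4, p = 4: 70 coefficients
# d = 4, p = 5: 126 coefficients
```

The number of coefficients is distinct from the number of deterministic solves, which is set by the quadrature rule used to estimate them.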

2.6. Non-Intrusive Coefficient Estimation via Numerical Projection and Quadrature

Coefficients are estimated using a non-intrusive spectral projection (NISP) strategy, which treats the deterministic NR solver as a black box. Each coefficient is defined by an $L^{2}$ projection,

$$c_k = \frac{E[y(Z) \Psi_k(Z)]}{E[\Psi_k(Z)^{2}]}, \tag{9}$$

and expectations are approximated through numerical quadrature with nodes $\{Z^{(i)}\}_{i=1}^{N_q}$ and weights $\{w_i\}$:

$$E[f(Z)] \approx \sum_{i=1}^{N_q} w_i f(Z^{(i)}). \tag{10}$$

The computational cost of PCE training is therefore proportional to the number of quadrature nodes $N_q$, because each node triggers one deterministic NR solve. To mitigate node growth with dimension, sparse-grid quadrature is used, which substantially reduces the number of deterministic evaluations compared to tensor-product rules while maintaining high-order integration accuracy.
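The projection in Equations (9) and (10) can be illustrated in one dimension with NumPy's probabilists' Hermite utilities. This is a simplified sketch rather than the sparse-grid, four-dimensional Chaospy workflow used in the study: the "model" is a toy lognormal response standing in for a power-flow solve, a dense Gauss–Hermite rule replaces the sparse grid, and the function names are hypothetical. The toy model is chosen because its exact coefficients are known, $c_k = e^{\sigma^2/2} \sigma^k / k!$, which gives a direct check.

```python
import math

import numpy as np
from numpy.polynomial.hermite_e import hermegauss, hermeval


def project_onto_hermite(model, order, n_nodes):
    """Estimate c_k = E[y He_k] / E[He_k^2] with Gauss-Hermite quadrature."""
    nodes, weights = hermegauss(n_nodes)
    weights = weights / math.sqrt(2.0 * math.pi)   # weights now sum to 1
    values = np.array([model(z) for z in nodes])   # one model call per node
    coeffs = np.empty(order + 1)
    for k in range(order + 1):
        unit = np.zeros(k + 1)
        unit[k] = 1.0
        basis_k = hermeval(nodes, unit)            # He_k evaluated at the nodes
        # E[He_k^2] = k! for probabilists' Hermite polynomials
        coeffs[k] = np.sum(weights * values * basis_k) / math.factorial(k)
    return coeffs


def toy_model(z, sigma=0.2):
    """Stand-in for a deterministic solve: a lognormal response."""
    return math.exp(sigma * z)


coeffs = project_onto_hermite(toy_model, order=3, n_nodes=8)
exact = [math.exp(0.02) * 0.2**k / math.factorial(k) for k in range(4)]
print(np.round(coeffs, 6))
print(np.round(exact, 6))
```

In this toy setting the eight model calls play the role of the quadrature-node power-flow solves; the cost argument in the text follows because the loop over $k$ reuses the same model values.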

2.7. Surrogate Post-Processing: Moments and Quantiles

Once $\{c_k\}$ are computed, key statistics are obtained without additional NR calls. Due to orthogonality, the PCE mean equals the constant coefficient,

$$\mu_{PCE} = c_0, \tag{11}$$

and the variance is computed from the energy of the non-constant modes,

$$\sigma_{PCE}^{2} = \sum_{k=1}^{P} c_k^{2} \, E[\Psi_k(Z)^{2}]. \tag{12}$$

Tail-oriented metrics are obtained by drawing a large number of samples in the latent space and evaluating the polynomial surrogate directly, which is substantially faster than repeatedly solving the nonlinear AC equations. This makes PCE particularly attractive for estimating operationally relevant percentiles (e.g., p01, p05, p95, p99) in renewable-rich regimes.
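The post-processing step can be sketched in one dimension as follows. The coefficients are those of a toy lognormal response in the unnormalized probabilists' Hermite basis (for which $E[\Psi_k^2] = k!$), not coefficients from the feeder study, and the function names are hypothetical. The 50,000 surrogate samples and the seed 123 mirror the settings reported for this study.

```python
import math

import numpy as np
from numpy.polynomial.hermite_e import hermeval


def pce_mean_variance(coeffs):
    """Equations (11)-(12) for a 1-D Hermite expansion with E[He_k^2] = k!."""
    mean = coeffs[0]
    variance = sum(c * c * math.factorial(k) for k, c in enumerate(coeffs) if k > 0)
    return mean, variance


def pce_quantiles(coeffs, probs, n_samples=50_000, seed=123):
    """Sample the latent variable and evaluate the polynomial surrogate."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n_samples)
    y = hermeval(z, coeffs)          # surrogate evaluation, no solver calls
    return np.quantile(y, probs)


SIGMA = 0.2  # toy lognormal response y = exp(SIGMA * Z)
coeffs = [math.exp(SIGMA**2 / 2) * SIGMA**k / math.factorial(k) for k in range(4)]
mean, variance = pce_mean_variance(coeffs)
p01, p99 = pce_quantiles(coeffs, [0.01, 0.99])
print(round(mean, 5), round(variance, 6), round(float(p01), 3), round(float(p99), 3))
```

For this toy response the third-order variance differs from the exact lognormal variance only by the truncated higher-order terms, which illustrates why low-order expansions can suffice for smooth responses.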

2.8. Scenario Sweeps: Renewable Penetration and Uncertainty Sensitivity

To evaluate robustness across operating regimes, renewable penetration is swept by scaling distributed generation injections relative to the nominal case and repeating the probabilistic analysis at each penetration level. In addition, uncertainty sensitivity can be assessed by varying uncertainty magnitudes while keeping the deterministic snapshot fixed, thereby isolating how dispersion and tail widening scale with uncertainty. These sweeps are executed using the same configuration-driven pipeline and identical deterministic solver settings to ensure fair comparison across regimes.

2.9. Implementation and Reproducibility Protocol

All experiments are controlled through YAML configuration files specifying the case study, profile mode, uncertainty distributions, MC sample size, PCE truncation order, quadrature settings, and the number of surrogate evaluation samples used to estimate quantiles. A fixed random seed ensures repeatable MC sampling and repeatable surrogate post-processing. Each run exports (i) raw MC samples and convergence logs, (ii) MC summary statistics, (iii) PCE moments, (iv) surrogate-derived quantiles, and (v) a metadata file recording the complete setup and software environment. The deterministic and probabilistic pipelines are implemented in Python using open-source dependencies and can be executed either on a local workstation for moderate sample sizes or on an HPC node for large Monte Carlo benchmarks, high-order PCE, multi-feeder testing, and annual screening.

2.10. Exact Configuration Used in This Study (Replication-Ready)

The base IEEE 33-bus benchmark corresponds to the configuration file configs/case33bw.yaml and its associated exported metadata. Extended validation experiments, including high-sample Monte Carlo runs, correlated uncertainty, thermal recalibration, multi-feeder testing, and full-year daily screening, are specified through additional configuration files reported in the reproducibility package.

The base reproducibility package was originally prepared in a Python environment using pandapower [7] and Chaospy [8]. The extended high-performance validation runs are documented separately in Section 2.11 with their corresponding software stack. UQ routines for polynomial chaos and quadrature are provided by Chaospy [8]. The case study is the pandapower feeder “case33bw” (IEEE 33-bus). The voltage operating limits are set to [0.9,1.1] p.u. Distributed generation is enabled with automatic bus selection for PV and wind candidates. The nominal renewable penetration (fraction of total load served by DER in the deterministic snapshot) is set to 0.6. PV injections operate at power factor 1.0, and wind injections operate at power factor 0.98.

The deterministic operating point is constructed from a 24 h profile horizon with 60 min time steps in a fully offline synthetic mode. The snapshot analyzed in the probabilistic experiments is fixed at $t_{\mathrm{index}} = 12$ (i.e., the same time index is used for both MC and PCE to guarantee consistency). The random seed is fixed to 123 to ensure repeatable scenario generation.

Uncertainty is defined through four enabled stochastic inputs driven by a latent standard normal vector: (i) load scaling modeled as lognormal with $\mu = 0$ and $\sigma = 0.2$; (ii) PV availability scaling modeled as Beta with $\alpha = 2$, $\beta = 5$; (iii) wind availability scaling modeled as Weibull with shape $k = 2$ and scale $1.0$; and (iv) line resistance scaling modeled as a small normal perturbation with standard deviation 0.05. Line reactance uncertainty is disabled in this configuration. The probabilistic outputs tracked are: minimum-voltage magnitude $v_{\min}$ (“vm_pu_min”), maximum voltage magnitude (“vm_pu_max”), total active-power losses (“loss_mw”), maximum line loading (“line_loading_max”), and slack/substation active-power import (“ext_grid_p_mw”).

The MC benchmark uses $N_{MC} = 3000$ independent samples and executes a full deterministic NR solve per sample. The PCE surrogate uses a total polynomial order $p = 3$ and quadrature order 4, implemented through a sparse-grid Gaussian quadrature in the latent space. This configuration produces 494 quadrature nodes, hence 494 deterministic NR solves to fit the surrogate. Quantiles are computed by drawing 50,000 random surrogate evaluations from the latent input distribution and evaluating the fitted polynomial expansion. This exact configuration follows established PCE methodology within the Wiener–Askey framework [3] and is consistent with prior applications of polynomial chaos to stochastic AC power flow [4] and with recent EPSR studies using generalized polynomial chaos in power-flow-related settings [5,6].
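For readers who want the base settings in one place, the sketch below collects the values stated in this subsection in a Python dataclass. The class and field names are hypothetical and do not reproduce the actual key names in configs/case33bw.yaml; only the numerical values and output identifiers are taken from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BaseCaseConfig:
    """Base IEEE 33-bus settings; field names are illustrative only."""

    case: str = "case33bw"
    voltage_limits_pu: tuple = (0.9, 1.1)
    renewable_penetration: float = 0.6
    pv_power_factor: float = 1.0
    wind_power_factor: float = 0.98
    horizon_hours: int = 24
    step_minutes: int = 60
    t_index: int = 12
    seed: int = 123
    load_lognormal_mu_sigma: tuple = (0.0, 0.2)
    pv_beta_alpha_beta: tuple = (2.0, 5.0)
    wind_weibull_shape_scale: tuple = (2.0, 1.0)
    line_r_sigma: float = 0.05
    line_x_uncertainty: bool = False
    n_mc: int = 3000
    pce_order: int = 3
    quadrature_order: int = 4
    n_surrogate_samples: int = 50_000
    outputs: tuple = (
        "vm_pu_min",
        "vm_pu_max",
        "loss_mw",
        "line_loading_max",
        "ext_grid_p_mw",
    )


config = BaseCaseConfig()
print(config.case, config.pce_order, config.quadrature_order, config.n_mc)
```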

2.11. High-Performance Validation Protocol

To address sample robustness, scalability, correlated uncertainty, and thermal realism, the validation protocol was extended using a high-performance computing environment. The base IEEE 33-bus benchmark was re-run with 10,000 and 50,000 Monte Carlo samples. Bootstrap resampling with 1000 repetitions was used to estimate 95% confidence intervals for the Monte Carlo mean, standard deviation, and tail quantiles. PCE order convergence was evaluated for total polynomial orders 2, 3, 4, and 5, with corresponding sparse-grid quadrature orders 3, 4, 5, and 6. Correlation-aware experiments considered load–PV, PV–wind, and mixed dependence structures. Realistic thermal-loading tests were obtained by recalibrating line ampacities so that the deterministic base case reached target maximum loadings of 40%, 60%, and 80%. Multi-feeder scalability was evaluated using IEEE 33-bus, CIGRE MV, CIGRE LV, and IEEE 118-bus systems. Finally, a 365-snapshot full-year daily PCE screening was performed for the IEEE 33-bus feeder to evaluate annual operating variability. The high-performance runs were executed in the CEDIA HPC Jupyter environment on an interactive CPU session with 32 CPU cores and 128 GB RAM. The software stack was Python 3.10.20, pandapower 3.4.0, Chaospy 4.3.21, NumPy 2.2.6, and SciPy 1.15.3, with the random seed fixed to 123.
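One simple way to realize the thermal recalibration described above is to rescale all line ampacities by a common factor so that the most heavily loaded line in the deterministic base case sits exactly at the target loading. The sketch below assumes this uniform scaling, which is an illustrative choice; the currents and ratings are made-up values and the function names are hypothetical.

```python
def max_loading_percent(currents_ka, ampacities_ka):
    """Largest line loading, defined as 100 * I / I_max."""
    return max(100.0 * i / a for i, a in zip(currents_ka, ampacities_ka))


def recalibrate_ampacities(currents_ka, ampacities_ka, target_max_loading_percent):
    """Scale every ampacity by one factor so the maximum loading hits the target.

    Ampacity is a rating rather than an electrical parameter, so the base-case
    line currents are unchanged by this rescaling.
    """
    current_max = max_loading_percent(currents_ka, ampacities_ka)
    scale = current_max / target_max_loading_percent
    return [a * scale for a in ampacities_ka]


base_currents = [0.10, 0.05, 0.02]     # kA, illustrative base-case flows
base_ampacities = [0.40, 0.40, 0.20]   # kA, illustrative original ratings
for target in (40.0, 60.0, 80.0):
    new_ratings = recalibrate_ampacities(base_currents, base_ampacities, target)
    print(target, round(max_loading_percent(base_currents, new_ratings), 6))
```

With ratings calibrated this way, stochastic scenarios that raise line currents above their base-case values can push the loading distribution toward or beyond 100%, which is what makes the thermal constraint informative.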

3. Results

3.1. Benchmark Configuration and Monte Carlo Robustness

The IEEE 33-bus feeder was evaluated with 50,000 Monte Carlo samples, all of which converged under the adopted Newton–Raphson settings. This sample size provides a more stable empirical reference for the central moments and tail quantiles used to validate the PCE surrogate. The non-intrusive PCE surrogate was trained using total polynomial order 3 and sparse-grid quadrature order 4, which produced 494 deterministic AC power-flow evaluations. Therefore, the base comparison uses 50,000 full nonlinear Monte Carlo power-flow solutions versus 494 deterministic quadrature-node solutions for PCE.

To account for the statistical uncertainty of Monte Carlo itself, 95% bootstrap confidence intervals were computed for the mean and for the p01 and p99 tail quantiles. This additional layer of validation is important because the PCE surrogate is assessed against a finite-sample Monte Carlo benchmark rather than against an exact distribution. The bootstrap results show that the 50,000-sample Monte Carlo benchmark provides narrow confidence intervals for the main operational indicators, supporting its use as a reference for surrogate validation.
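A percentile bootstrap of this kind can be sketched as follows. The sketch assumes a plain percentile interval and nearest-rank quantiles; the manuscript does not state which bootstrap variant was used. The `v_min` samples are synthetic placeholders, not the study's Monte Carlo output, and only the 1000 repetitions, the 95% level, and the seed of 123 are taken from the text.

```python
import random
import statistics

def percentile(values, q):
    """Empirical quantile by nearest rank on the sorted sample (q in [0, 1])."""
    ordered = sorted(values)
    return ordered[round(q * (len(ordered) - 1))]

def bootstrap_ci(sample, stat, n_boot=1000, level=0.95, seed=123):
    """Percentile bootstrap interval for an arbitrary statistic."""
    rng = random.Random(seed)
    n = len(sample)
    # Resample with replacement and recompute the statistic each time.
    reps = [stat(rng.choices(sample, k=n)) for _ in range(n_boot)]
    alpha = (1.0 - level) / 2.0
    return percentile(reps, alpha), percentile(reps, 1.0 - alpha)

# Synthetic placeholder for Monte Carlo minimum-voltage samples (p.u.).
rng = random.Random(123)
v_min = [rng.gauss(0.92, 0.01) for _ in range(2000)]

mean_ci = bootstrap_ci(v_min, statistics.fmean)
p01_ci = bootstrap_ci(v_min, lambda s: percentile(s, 0.01))
print("95% CI for the mean:", mean_ci)
print("95% CI for p01:", p01_ci)
```

Tail quantiles typically give wider intervals than the mean at the same sample size, which is why a large Monte Carlo reference matters most for p01 and p99.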

Figure 1 shows the Monte Carlo–PCE comparison for the minimum system voltage magnitude. Figure 2 shows the corresponding comparison for total active-power losses, and Figure 3 reports the comparison for substation active-power import. These three indicators are used as the main operational validation set because they represent voltage-quality risk, energy-efficiency degradation, and upstream grid stress, respectively.

Table 1 summarizes the base validation. For the minimum-voltage magnitude, PCE matched the 50,000-sample Monte Carlo mean with a relative error of 0.0137%. For total active losses and substation import, the corresponding relative mean errors were 0.1186% and 0.1127%, respectively. Tail agreement was also strong: the p99 discrepancy was negligible for voltage, 0.147% for losses, and 0.276% for substation import. These results confirm that the third-order non-intrusive PCE surrogate captures both central tendency and tail-relevant behavior while using approximately 101× fewer deterministic AC power-flow evaluations than the Monte Carlo benchmark.

Table 2 reports the Monte Carlo bootstrap confidence intervals for the same base benchmark. The PCE estimates remain close to the corresponding Monte Carlo confidence ranges, particularly for the mean and upper-tail import and loss metrics. The larger deviations occur in extreme lower-tail indicators, where both finite-sample Monte Carlo uncertainty and polynomial truncation effects are expected to be more pronounced.

3.2. Computational Cost and PCE Order Convergence

The benchmark provides a stronger estimate of computational benefit than the original 3000-sample comparison. The 50,000-sample Monte Carlo benchmark required 116.32 s using 32 CPU cores, whereas the third-order PCE surrogate required 494 deterministic power-flow evaluations and 8.75 s in a representative isolated run. This corresponds to a wall-clock speed-up of 13.29× and a reduction from 50,000 to 494 deterministic AC power-flow evaluations. The latter represents approximately 101× fewer nonlinear power-flow solves. Since wall-clock time is affected by shared-cluster load and scheduling, the number of deterministic AC solves is also reported as an architecture-independent measure of computational cost.

Under the 32-core HPC setup, the effective parallel wall-clock time was approximately 2.33 ms per Monte Carlo sample for the 50,000-sample benchmark and 17.71 ms per quadrature-node evaluation for the third-order PCE run. These values should be interpreted as parallel wall-clock averages under the reported HPC environment, not as universal serial laptop timings.
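The cost figures quoted above follow from four reported numbers. The short check below recomputes them; the variable names are illustrative.

```python
# Reported benchmark figures for the IEEE 33-bus base case.
mc_samples, mc_seconds = 50_000, 116.32
pce_nodes, pce_seconds = 494, 8.75

speedup = mc_seconds / pce_seconds            # wall-clock speed-up
solve_reduction = mc_samples / pce_nodes      # ratio of deterministic solves
ms_per_mc_sample = 1000.0 * mc_seconds / mc_samples
ms_per_pce_node = 1000.0 * pce_seconds / pce_nodes

print(f"wall-clock speed-up: {speedup:.2f}x")
print(f"solve reduction:     {solve_reduction:.1f}x")
print(f"ms per MC sample:    {ms_per_mc_sample:.2f}")
print(f"ms per PCE node:     {ms_per_pce_node:.2f}")
```

The per-node time exceeds the per-sample time even though both are single power-flow solves, which is consistent with the PCE run carrying fixed overhead that is amortized over far fewer evaluations.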

A PCE order-convergence analysis was performed for total polynomial orders 2, 3, 4, and 5. Figure 4 illustrates the convergence trend for active-power losses, which is a nonlinear and right-tailed output. Table 3 shows that all tested orders reproduced the Monte Carlo mean values with errors below 0.13% for the selected indicators. Increasing the order beyond 3 did not materially improve mean accuracy for the base case, but it increased the number of quadrature nodes from 494 to 1278 and 2958 for orders 4 and 5, respectively. Therefore, order 3 represents a practical compromise between accuracy and computational cost for the studied four-dimensional uncertainty representation.
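The accuracy–cost trade-off can be made concrete by comparing the number of polynomial terms with the number of quadrature nodes. The sketch below assumes a full total-degree truncation, for which the basis size in d dimensions at order p is the binomial coefficient C(d + p, p). The node counts are those reported above; the count for order 2 is not stated in this section and is left as `None`.

```python
import math

DIM = 4  # latent uncertainty dimensions in this study

def basis_size(dim, order):
    """Number of terms in a total-degree polynomial basis (assumed truncation)."""
    return math.comb(dim + order, order)

# Quadrature-node counts reported in the text; order 2 is not reported here.
reported_nodes = {2: None, 3: 494, 4: 1278, 5: 2958}

for order, nodes in reported_nodes.items():
    terms = basis_size(DIM, order)
    if nodes is None:
        print(f"order {order}: {terms:3d} terms, nodes not reported")
    else:
        print(f"order {order}: {terms:3d} terms, {nodes} nodes, "
              f"{nodes / terms:.1f} solves per coefficient")
```

Under this assumption the solver cost per fitted coefficient rises with the order, so higher orders pay more for each additional term while the mean error is already small.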

3.3. Effect of Correlated Uncertainty

The independence assumption was relaxed by introducing three correlated uncertainty scenarios: negative load–PV correlation, negative PV–wind correlation, and a mixed correlation case. Table 4 compares the independent case with the correlated cases using Monte Carlo benchmarks. The strongest tail impact appears under negative load–PV correlation, where the p99 of substation import increases from 6.063 MW to 6.295 MW and the p99 of total active losses increases from 0.520 MW to 0.547 MW. This behavior is physically consistent: when high load coincides with low PV availability, net demand increases, producing higher upstream import, larger currents, and larger losses. In contrast, the negative PV–wind correlation case produces only small changes relative to the independent benchmark, indicating that its effect is weaker for the selected snapshot and penetration level. These findings confirm that ignoring dependence can bias tail-risk estimates, particularly when correlations align high demand with low local renewable output.
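The mechanism behind this tail shift can be reproduced with a deliberately simple linear toy model. This is a sketch under loudly stated assumptions: net demand is taken as load minus PV with no network model, the means, spreads, and the correlation of -0.6 are hypothetical, and the correlated latent pair is built with a 2×2 Cholesky factor. None of these values come from the study.

```python
import math
import random

def p99_net_demand(rho, n=50_000, seed=123):
    """p99 of (load - pv) when the latent load and PV variables have correlation rho."""
    rng = random.Random(seed)
    net = []
    for _ in range(n):
        z_load = rng.gauss(0.0, 1.0)
        z_ind = rng.gauss(0.0, 1.0)
        # 2x2 Cholesky construction: corr(z_load, z_pv) = rho.
        z_pv = rho * z_load + math.sqrt(1.0 - rho * rho) * z_ind
        load_mw = 3.7 + 0.4 * z_load            # hypothetical load model
        pv_mw = max(0.0, 1.0 + 0.3 * z_pv)      # hypothetical PV model, clipped at 0
        net.append(load_mw - pv_mw)
    net.sort()
    return net[int(0.99 * (n - 1))]

p99_independent = p99_net_demand(rho=0.0)
p99_negative = p99_net_demand(rho=-0.6)
print("p99 net demand, independent:      ", round(p99_independent, 3))
print("p99 net demand, negative load-PV: ", round(p99_negative, 3))
```

With negative correlation the variance of the difference grows by the cross term, so the upper tail of net demand moves outward even though both marginals are unchanged.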

3.4. Realistic Thermal-Loading Recalibration

The original IEEE 33-bus benchmark produced maximum line-loading values on the order of 10⁻⁴% because the encoded/default ampacities were unrealistically large. To provide physically meaningful thermal-stress indicators, the analysis recalibrated line ampacities so that the deterministic base case reached target maximum loadings of 40%, 60%, and 80%. This recalibration does not alter the AC power-flow physics; it only rescales thermal limits to evaluate how uncertainty propagates into binding or near-binding congestion indicators.
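One way to perform such a recalibration is a single uniform scale factor applied to all ratings. This is an assumption made for illustration; the manuscript does not specify whether ratings were scaled uniformly or per line. The currents and ratings below are hypothetical placeholders chosen only to mimic very large default ampacities.

```python
def loading_pct(currents_ka, ratings_ka):
    """Line loading in percent of the thermal rating."""
    return [100.0 * i / r for i, r in zip(currents_ka, ratings_ka)]

def recalibrate_ratings(currents_ka, ratings_ka, target_pct):
    """Scale all ratings uniformly so the most loaded line sits at target_pct."""
    scale = max(loading_pct(currents_ka, ratings_ka)) / target_pct
    return [r * scale for r in ratings_ka]

# Hypothetical deterministic base-case currents and oversized default ratings.
base_currents_ka = [0.21, 0.12, 0.05]
default_ratings_ka = [99999.0, 99999.0, 99999.0]

print("default max loading (%):",
      max(loading_pct(base_currents_ka, default_ratings_ka)))
for target in (40.0, 60.0, 80.0):
    new_ratings = recalibrate_ratings(base_currents_ka, default_ratings_ka, target)
    print(f"target {target:.0f}%: max loading =",
          round(max(loading_pct(base_currents_ka, new_ratings)), 6))
```

Because only the ratings change, the power-flow solution itself is untouched; the loading percentages of every stochastic realization are simply rescaled by the same factor.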

Figure 5 shows the Monte Carlo–PCE distributional comparison for the 60% thermal target case. This target is used as the main thermal-risk illustration because it produces a realistic operating range in which the deterministic case remains below the limit but stochastic realizations may approach or exceed it.

Table 5 shows that realistic thermal calibration substantially changes the operational interpretation of line loading. Under the 60% target case, the Monte Carlo mean maximum line loading was 61.43%, the p95 was 88.14%, and the p99 exceeded 100%, reaching 103.40%. Therefore, once physically meaningful ampacities are imposed, the same uncertainty model reveals low-probability thermal-risk events that were completely masked by the original benchmark ratings. The PCE surrogate accurately tracks this behavior, matching the mean line loading closely across all target levels.

3.5. Renewable Power-Factor Sensitivity

Reactive-power assumptions were tested by varying the PV and wind power factor from 1.00 to 0.98, 0.95, and 0.90. Table 6 shows that lower power factors produced higher minimum-voltage values and lower active losses for the selected operating point. The mean minimum voltage increased from 0.9200 p.u. at unity power factor to 0.9231 p.u. at 0.90 power factor, while mean active losses decreased from 0.1920 MW to 0.1782 MW. These results indicate that near-unity power-factor assumptions are not neutral for voltage and loss analysis and that inverter reactive-power capability can materially affect probabilistic voltage profiles.
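The reactive power implied by each tested power factor follows from Q = P tan(arccos(pf)). The sketch below evaluates this for a 1 MW injection, which is a hypothetical reference value. The sign convention is an assumption here: the reported rise in minimum voltage is consistent with the units injecting reactive power, but the section does not state this explicitly.

```python
import math

def reactive_power_mvar(p_mw, power_factor):
    """Magnitude of reactive power for a given active power and power factor."""
    return p_mw * math.tan(math.acos(power_factor))

P_REF_MW = 1.0  # hypothetical reference injection

for pf in (1.00, 0.98, 0.95, 0.90):
    q = reactive_power_mvar(P_REF_MW, pf)
    print(f"pf = {pf:.2f}: Q = {q:.3f} Mvar per MW")
```

Even a power factor of 0.98 corresponds to roughly a fifth of a Mvar per MW, so small departures from unity are not negligible on a weak feeder.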

3.6. Multi-Feeder Scalability and Generality

This study evaluated four networks: IEEE 33-bus, CIGRE MV, CIGRE LV, and IEEE 118-bus. Figure 6 summarizes the minimum-voltage results across feeders, while Table 7 reports the Monte Carlo and PCE runtimes and the main mean indicators. The CIGRE MV and LV systems provide additional distribution-network validation, whereas IEEE 118-bus serves as a larger-scale computational stress case rather than as a distribution-feeder representation. All networks were successfully evaluated with the same workflow. These results indicate that the implementation is not restricted to the IEEE 33-bus feeder and can be applied across networks of different size and structure.

3.7. Full-Year Daily PCE Screening

Finally, a 365-snapshot full-year daily screening was performed using the IEEE 33-bus feeder. Each daily snapshot was evaluated using the same third-order PCE configuration, resulting in 180,310 deterministic quadrature-node power-flow evaluations across the annual screening. The total wall-clock time was 6844.95 s, equivalent to an average of 18.75 s per daily probabilistic snapshot in the HPC environment.

Because each probabilistic snapshot is solved independently, a full chronological 8760 h extension would scale approximately linearly with the number of operating points. Under the present third-order PCE setting, such an extension would require 8760 × 494 = 4,327,440 deterministic quadrature-node AC power-flow evaluations. An equivalent 50,000-sample Monte Carlo benchmark at every hourly point would require 8760 × 50,000 = 438,000,000 deterministic AC power-flow evaluations. Therefore, although the present manuscript reports a 365-snapshot daily screening rather than a full 8760 h chronological simulation, the deterministic-solve reduction of approximately 101× is expected to be preserved for independent hourly snapshot assessments.
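The solve counts in this subsection can be recomputed directly from the per-snapshot figures. The snippet is an arithmetic check only and assumes, as stated above, that every snapshot is solved independently with the same configuration.

```python
PCE_NODES = 494          # deterministic solves per snapshot, third-order PCE
MC_SAMPLES = 50_000      # deterministic solves per snapshot, Monte Carlo

solve_counts = {}
for label, snapshots in (("daily, 365", 365), ("hourly, 8760", 8760)):
    pce_solves = snapshots * PCE_NODES
    mc_solves = snapshots * MC_SAMPLES
    solve_counts[label] = (pce_solves, mc_solves)
    print(f"{label}: PCE {pce_solves:,} vs MC {mc_solves:,} "
          f"({mc_solves / pce_solves:.1f}x)")

# Average wall-clock time per daily snapshot from the reported annual total.
avg_seconds = 6844.95 / 365
print(f"average per snapshot: {avg_seconds:.2f} s")
```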

Figure 7 shows the normalized annual variation of the main PCE-derived indicators. Figure 8 provides the annual evolution of the mean minimum voltage together with the p05–p95 probabilistic band. These figures demonstrate that the surrogate workflow can be extended beyond a single snapshot to capture annual-scale operating variability.

Table 8 summarizes the annual descriptive statistics. The annual screening shows that the mean minimum voltage varied from 0.9126 p.u. to 0.9404 p.u., while the mean total active losses varied from 0.1041 MW to 0.2153 MW across the year. Substation import ranged from 2.717 MW to 3.935 MW in terms of daily mean values, with the highest p95 value reaching 5.463 MW. Although this screening uses one representative snapshot per day rather than all 8760 hourly points, it provides a compact temporal validation and supports the feasibility of full chronological extension.

4. Discussion

This study tested the hypothesis that a non-intrusive probabilistic AC power-flow formulation, based on polynomial chaos expansion (PCE) and sparse-grid numerical projection, can reproduce the uncertainty propagation of a full nonlinear AC power-flow model with substantially fewer deterministic solves than Monte Carlo simulation. The results provide stronger support for this hypothesis than the original submission. Using a 50,000-sample Monte Carlo benchmark, third-order PCE reproduced the main operational indicators with relative mean errors below 0.12% while requiring only 494 deterministic power-flow evaluations. The validation also shows that the method remains informative under correlated uncertainty, realistic thermal-loading calibration, renewable power-factor variation, multiple test feeders, and annual-scale daily operating variability. Therefore, the main contribution is not the introduction of PCE itself, but the construction and validation of a reproducible, tail-aware, and operationally interpretable PCE workflow for probabilistic AC power flow in renewable-rich distribution networks.

The results also clarify the engineering meaning of the probabilistic outputs. The lower tail of the minimum-voltage distribution identifies rare but relevant undervoltage-prone operating conditions; the right tail of active losses captures high-current scenarios with disproportionate efficiency penalties; and the upper tail of substation import reflects feeder or transformer stress under coincident high demand and low renewable availability. The correlated uncertainty experiments show that assuming independence can underestimate these tail risks, particularly when load and PV availability are negatively correlated. Similarly, the recalibrated thermal-loading experiments demonstrate that default benchmark ampacities can mask congestion risk: once realistic loading targets are imposed, p99 line-loading values can exceed 100% even when the deterministic base point remains within the target operating range.

From a computational perspective, the results show that PCE is most attractive when the same stochastic mapping must be queried repeatedly, as in tail estimation, sensitivity studies, multi-scenario planning, or annual screening. The observed 13.29× wall-clock speed-up relative to the 50,000-sample Monte Carlo benchmark is complemented by an approximately 101× reduction in deterministic AC power-flow evaluations. The 365-snapshot annual screening further indicates that the same surrogate-building process can be applied repeatedly over a representative operating horizon, requiring 180,310 deterministic quadrature-node solves across the year and an average of 18.75 s per daily probabilistic snapshot in the HPC environment.

4.1. Positioning with Respect to Prior Literature

The main findings align with a substantial body of work showing that polynomial-chaos methods can outperform brute-force sampling for uncertainty propagation when the underlying model is expensive and sufficiently smooth for spectral approximations. Foundationally, the efficiency of generalized polynomial chaos (gPC) follows from orthogonality and spectral convergence properties of polynomial bases [3]. In power-system contexts, PCE/gPC has been applied through both intrusive Galerkin formulations and non-intrusive collocation or projection strategies. The present study follows the latter route: the deterministic AC power-flow solver is kept unchanged, and the PCE coefficients are obtained from sparse-grid quadrature evaluations of the black-box solver.

A key agreement with the prior probabilistic load-flow literature is that PCE-based surrogates can reproduce not only central moments but also risk-relevant tail behavior when the expansion is sufficiently expressive and the stochastic discretization is well chosen. This is consistent with results reported for sparse/adaptive PCE variants designed to mitigate the curse of dimensionality in higher-dimensional settings [15,16,17,18]. The present results extend that narrative by emphasizing distributional fidelity on operationally meaningful outputs.

This study also sits naturally alongside the growing literature that leverages gPC/PCE for nonlinear power-flow and OPF formulations, including work showing that formulation choices can substantially influence tractability and runtime. For example, Van Acker et al. discuss gPC in a current–voltage OPF formulation and report strong computational advantages relative to alternative variable spaces [6,19]. While the present work focuses on probabilistic power flow rather than stochastic OPF, the shared conclusion is that PCE-based uncertainty propagation can deliver substantial computational gains when repeated deterministic solves dominate runtime.

The most meaningful discrepancies versus some prior studies arise from modeling scope and uncertainty representation rather than from the PCE mechanism itself. Several probabilistic load-flow works rely on (i) linearized or DC approximations, (ii) non-intrusive collocation/regression surrogates, or (iii) uncertainty models assuming known marginal families and/or weak dependence. The present results show that, even in a fully nonlinear AC setting, a non-intrusive, quadrature-based PCE surrogate can remain accurate on distributional metrics, but its performance is necessarily conditioned on (a) the dimensionality of uncertain inputs and (b) whether dependence structures are modeled explicitly. In that sense, data-driven arbitrary polynomial chaos (aPC) and dependence-handling strategies (e.g., Nataf-based transformations) represent a complementary direction that can improve realism under correlated uncertainties ([18]; also see the broader aPC framework beyond the Wiener–Askey scheme [19,20]).

4.2. Mechanistic Interpretation

The observed distributional patterns can be explained by how nonlinear AC physics transforms uncertain injections into voltages, currents, and losses.

Worst-case voltage behavior (minimum per-unit voltage) is driven by the combination of (i) stochastic net loading at electrically weak locations and (ii) nonlinear voltage drops governed by feeder impedances and reactive-power balance. Under uncertainty, the minimum-voltage statistic behaves like an “extreme-value projection” of the voltage field, so it is expected to exhibit sensitivity to tail events (rare but physically plausible combinations of higher demand and lower renewable output). The fact that PCE tracks these tails with small percentile errors indicates that the mapping from uncertain injections to the minimum-voltage indicator remains sufficiently smooth over the considered regime and is well captured by the chosen expansion order and stochastic discretization.
Total active-power losses naturally develop a right-skewed distribution because losses scale approximately with $I^{2}R$, so high-current events disproportionately contribute to the upper tail. As uncertainty increases, both the mean and the spread of losses grow, and the tail thickens. The fact that this widening is reproduced by the surrogate supports the interpretation that the PCE basis is capturing the dominant nonlinear interactions (including cross-terms) rather than only first-order effects.
Imported substation power reflects the aggregated balance between uncertain demand and uncertain renewable injections. Its variability therefore depends strongly on the assumed stochastic structure of net injections, including whether load and renewable uncertainties are treated as independent or correlated. Within the present experimental scope, the surrogate’s ability to reproduce percentiles suggests that the principal uncertainty modes are well represented. However, the literature indicates that stronger dependence modeling (e.g., correlated forecast errors and spatial coupling) can materially alter tail risk; this motivates extending the present framework toward aPC/dependence-aware constructions in future work.
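The right-skew argument for losses can be checked numerically with a single-branch toy model. The sketch assumes a symmetric Gaussian current with hypothetical mean and spread and a unit resistance; it is not a power-flow calculation, only an illustration of how the square law turns a symmetric input into a right-skewed output.

```python
import random
import statistics

def skewness(values):
    """Sample skewness (third standardized moment, population form)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return statistics.fmean([((v - mean) / std) ** 3 for v in values])

rng = random.Random(123)
RESISTANCE = 1.0  # hypothetical per-unit branch resistance

# Symmetric current distribution (hypothetical mean 1.0, std 0.2, per unit).
currents = [rng.gauss(1.0, 0.2) for _ in range(20_000)]
losses = [RESISTANCE * i * i for i in currents]

print("skewness of current:", round(skewness(currents), 3))
print("skewness of losses: ", round(skewness(losses), 3))
```

The current skewness stays near zero while the loss skewness is clearly positive, which is the pattern a surrogate must capture through its second-order and higher terms.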

4.3. Practical Implications for Distribution-Network Studies

The results have direct implications for distribution-network planning and operation under renewable uncertainty. First, the lower tail of minimum voltage provides a probabilistic indicator of undervoltage-prone operating conditions that may not appear in a deterministic snapshot. This is relevant for planning voltage-control margins, assessing feeder hosting capacity, and identifying operating regimes where additional reactive-power support or voltage regulation may be required.

Second, the right tail of active-power losses captures rare but inefficient high-current conditions. Because losses scale nonlinearly with current magnitude, small changes in the tails of load and renewable availability can produce disproportionate changes in loss distributions. The ability of the non-intrusive PCE surrogate to reproduce these upper-tail statistics indicates that the surrogate captures not only first-order uncertainty effects but also relevant nonlinear interactions in the AC power-flow response.

Third, substation active-power import reflects the aggregate balance between uncertain demand and uncertain renewable injections. The correlation scenarios show that assuming independence can underestimate the upper tail of imported power, particularly when high load coincides with low PV availability. This has practical relevance for transformer loading, feeder capacity assessment, and upstream grid-support planning.

Finally, the recalibrated thermal-loading results demonstrate that benchmark ampacity settings can mask congestion risk. Once realistic thermal targets are imposed, stochastic realizations can produce p99 loading values above 100% even when the deterministic base case remains below the target limit. Therefore, the proposed workflow is not only a statistical surrogate exercise; it can support operationally meaningful risk assessment when physically interpretable network constraints are used.

4.4. Limitations and Future Work

Limitations should be explicitly recognized. First, although this study extends the validation to correlated uncertainty, multiple test networks, realistic thermal-loading recalibration, and annual daily screening, the uncertainty representation remains moderate-dimensional. The present four-dimensional latent model is appropriate for controlled methodological validation and tail-risk benchmarking, but realistic distribution systems may require many more uncertain inputs, including spatially distributed loads, multiple PV plants, wind sites, electric-vehicle charging clusters, inverter reactive-power controls, and feeder-parameter uncertainty. Higher-dimensional uncertainty representations would require adaptive sparse PCE, basis selection, regression-based PCE, low-rank approximations, or arbitrary polynomial chaos to control basis growth and quadrature-node explosion.

Second, the correlation scenarios used in this manuscript are representative stress cases rather than empirically fitted dependence structures. They were introduced to quantify the direction and magnitude of bias caused by the independence assumption, especially under negative load–PV dependence. Future work should estimate spatial and temporal correlations directly from measured load, irradiance, wind, and DER output data and should evaluate dependence-aware transformations, such as Nataf-type mappings or arbitrary polynomial chaos, when the input dependence structure is not compatible with independent standard bases.

Third, this study relies on benchmark networks and synthetic operating profiles rather than confidential utility measurements. Therefore, the numerical values should be interpreted as reproducible benchmark evidence rather than feeder-specific operational forecasts. Validation using measured feeder data, SCADA records, advanced metering infrastructure, inverter telemetry, or hardware-in-the-loop platforms would substantially strengthen the operational applicability of the framework.

Fourth, the annual analysis uses 365 representative daily snapshots instead of a full chronological 8760 h simulation. This screening demonstrates the feasibility of repeated annual-scale surrogate construction, but it does not capture all chronological dependencies, ramping effects, storage states, or control-device trajectories. Future work should extend the method to full time-series probabilistic analysis and probabilistic optimal power flow.

Finally, discrete control devices and nonlinear operating logic were not explicitly modeled. On-load tap changers, switched capacitors, protection actions, inverter Volt–VAR curves, and dispatch or curtailment rules may introduce discontinuities that reduce the accuracy of low-order global polynomial surrogates. Future research should investigate adaptive PCE, local surrogate partitioning, hybrid PCE–machine-learning models, and probabilistic optimal power-flow formulations for these more complex operating regimes.

5. Conclusions

This study presented and validated a reproducible non-intrusive polynomial chaos expansion framework for probabilistic AC power flow in renewable-rich distribution networks. In the analysis, the framework was benchmarked against a 50,000-sample Monte Carlo simulation, supplemented with 95% bootstrap confidence intervals, PCE order-convergence tests, correlated uncertainty scenarios, realistic thermal-loading recalibration, renewable power-factor sensitivity, multi-feeder validation, and a 365-snapshot full-year daily screening.

The main conclusion is that third-order PCE provides an accurate and computationally efficient surrogate for the studied four-dimensional uncertainty representation. In the IEEE 33-bus base case, PCE required 494 deterministic AC power-flow evaluations and reproduced the 50,000-sample Monte Carlo benchmark with relative mean errors of 0.014% for minimum voltage, 0.119% for active losses, and 0.113% for substation import. The corresponding wall-clock speed-up was 13.29×, with approximately 101× fewer deterministic power-flow evaluations. The order-convergence analysis showed that increasing the polynomial order beyond three did not materially improve mean accuracy in the base case, supporting the use of third-order PCE as a practical accuracy–cost compromise.

The extended experiments also demonstrated the importance of physically and statistically realistic modeling assumptions. Negative load–PV correlation increased the p99 of substation import from 6.063 MW to 6.295 MW, showing that independence assumptions can underestimate operational tail risk. Realistic thermal-loading calibration revealed p99 line-loading values above 100% in the 60% target case, whereas the default benchmark ampacities masked this risk almost completely. Renewable power-factor sensitivity showed that reactive-power assumptions affect voltage and loss distributions. Multi-feeder tests on IEEE 33-bus, CIGRE MV, CIGRE LV, and IEEE 118-bus systems confirmed that the workflow is not restricted to a single feeder. Finally, the 365-snapshot annual screening demonstrated the feasibility of applying the surrogate-based workflow across a representative yearly operating horizon.

Future work should incorporate measured feeder data, empirically fitted dependence structures, higher-dimensional uncertainty models, adaptive or sparse-regression PCE, discrete voltage-control devices, inverter Volt–VAR logic, and full chronological 8760 h studies. Despite these limitations, the results support the use of non-intrusive PCE as a transparent, reproducible, and tail-aware tool for uncertainty-informed distribution-network planning under high renewable penetration.

Author Contributions

Conceptualization, J.G. (Julio Guerra) and G.R.; methodology, J.G. (Julio Guerra), G.R., J.G. (Jean Gavilanez) and D.C.; software, G.R. and J.G. (Jean Gavilanez); validation, J.G. (Julio Guerra), G.R., J.G. (Jean Gavilanez) and D.C.; formal analysis, J.G. (Julio Guerra), G.R. and J.G. (Jean Gavilanez); investigation, J.G. (Julio Guerra), G.R., J.G. (Jean Gavilanez) and D.C.; resources, J.G. (Julio Guerra) and D.C.; data curation, G.R. and J.G. (Jean Gavilanez); writing—original draft preparation, J.G. (Julio Guerra); writing—review and editing, J.G. (Julio Guerra), G.R., J.G. (Jean Gavilanez) and D.C.; visualization, G.R. and J.G. (Jean Gavilanez); supervision, J.G. (Julio Guerra); project administration, J.G. (Julio Guerra); funding acquisition, J.G. (Julio Guerra). All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Universidad Técnica del Norte, grant number InvestigaUTN-2025-1561. The APC was funded by Universidad Técnica del Norte.

Data Availability Statement

The data, configuration files, and computational artifacts supporting the findings of this study are available from the corresponding author upon reasonable request.

Acknowledgments

The authors acknowledge Universidad Técnica del Norte for its institutional and academic support. During the preparation of this manuscript, the authors used ChatGPT, GPT-5.5 Thinking, by OpenAI, solely for grammar correction, language editing, and improvement of writing clarity. The authors reviewed and edited the output and take full responsibility for the content of this publication.

Conflicts of Interest

The authors declare no conflicts of interest. The funder had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Abbreviations

The following abbreviations are used in this manuscript:

Abbreviation	Meaning
AC	Alternating Current
DER	Distributed Energy Resource
IEEE	Institute of Electrical and Electronics Engineers
MC	Monte Carlo
NR	Newton–Raphson
NISP	Non-Intrusive Spectral Projection
PCE	Polynomial Chaos Expansion
PF	Power Flow
PPF	Probabilistic Power Flow
PV	Photovoltaic
UQ	Uncertainty Quantification

References

Ramadhani, U.H.; Shepero, M.; Munkhammar, J.; Widén, J.; Etherden, N. Review of probabilistic load flow approaches for power distribution systems with photovoltaic generation and electric vehicle charging. Int. J. Electr. Power Energy Syst. 2020, 120, 106003.
Lei, X.; Zhong, J.; Chen, Y.; Shao, Z.; Jian, L. Grid integration of electric vehicles within electricity and carbon markets: A comprehensive overview. eTransportation 2025, 25, 100435.
Xiu, D.; Karniadakis, G.E. The Wiener–Askey Polynomial Chaos for Stochastic Differential Equations. SIAM J. Sci. Comput. 2002, 24, 619–644.
Mühlpfordt, T.; Faulwasser, T.; Hagenmeyer, V. Solving Stochastic AC Power Flow via Polynomial Chaos Expansion. In 2016 IEEE Conference on Control Applications (CCA); IEEE: Piscataway, NJ, USA, 2016.
Koirala, A.; Hashmi, M.U.; D’hulst, R.; Van Hertem, D. Decoupled probabilistic feeder hosting capacity calculations using general polynomial chaos. Electr. Power Syst. Res. 2022, 211, 108535.
Van Acker, T.; Geth, F.; Koirala, A.; Ergun, H. General polynomial chaos in the current–voltage formulation of the optimal power flow problem. Electr. Power Syst. Res. 2022, 211, 108472.
Thurner, L.; Scheidler, A.; Schäfer, F.; Menke, J.H.; Dollichon, J.; Meier, F.; Meinecke, S.; Braun, M. Pandapower—An Open-Source Python Tool for Convenient Modeling, Analysis, and Optimization of Electric Power Systems. IEEE Trans. Power Syst. 2018, 33, 6510–6521.
Feinberg, J.; Langtangen, H.P. Chaospy: An open source tool for designing methods of uncertainty quantification. J. Comput. Sci. 2015, 11, 46–57.
Hajian, M.; Rosehart, W.D.; Zareipour, H. Probabilistic Power Flow by Monte Carlo Simulation With Latin Supercube Sampling. IEEE Trans. Power Syst. 2013, 28, 1550–1559.
Xiu, D.; Karniadakis, G.E. Modeling uncertainty in steady state diffusion problems via generalized polynomial chaos. Comput. Methods Appl. Mech. Eng. 2002, 191, 4927–4948.
Xiu, D.; Karniadakis, G.E. Modeling uncertainty in flow simulations via generalized polynomial chaos. J. Comput. Phys. 2003, 187, 137–167.
Xiu, D.; Hesthaven, J.S. High-Order Collocation Methods for Differential Equations with Random Inputs. SIAM J. Sci. Comput. 2005, 27, 1118–1139.
Bungartz, H.-J.; Griebel, M. Sparse Grids. Acta Numer. 2004, 13, 147–269.
Gruosso, G.; Maffezzoni, P.; Zhang, Z.; Daniel, L. Probabilistic load flow methodology for distribution networks including loads uncertainty. Int. J. Electr. Power Energy Syst. 2019, 106, 392–400.
Open Power System Data, Time Series Data Package. 2020. Available online: https://data.open-power-system-data.org/time_series/2020-10-06 (accessed on 1 March 2026).
Sun, X.; Tu, Q.; Chen, J.; Zhang, C.; Duan, X. Probabilistic load flow calculation based on sparse polynomial chaos expansion. IET Gener. Transm. Distrib. 2018, 12, 2735–2744.
Ni, F.; Nguyen, P.H.; Cobben, J.F.G. Basis-Adaptive Sparse Polynomial Chaos Expansion for Probabilistic Power Flow. IEEE Trans. Power Syst. 2017, 32, 694–704.
Wang, G.; Xin, H.; Wu, D.; Ju, P.; Jiang, X. Data-Driven Arbitrary Polynomial Chaos-Based Probabilistic Load Flow Considering Correlated Uncertainties. IEEE Trans. Power Syst. 2019, 34, 3274–3276.
Wan, X.; Karniadakis, G.E. Multi-Element Generalized Polynomial Chaos for Arbitrary Probability Measures. SIAM J. Sci. Comput. 2006, 28, 901–928.
Xu, Y.; Mili, L.; Zhao, J. Probabilistic Power Flow Calculation and Variance Analysis Based on Hierarchical Adaptive Polynomial Chaos-ANOVA Method. IEEE Trans. Power Syst. 2019, 34, 3316–3325.

Figure 1. Monte Carlo versus PCE surrogate for the minimum system voltage magnitude in the IEEE 33-bus base case. The histogram shows the empirical 50,000-sample Monte Carlo distribution, while the dashed vertical markers indicate PCE-derived percentile locations.

Figure 2. Monte Carlo versus PCE surrogate for total active-power losses in the IEEE 33-bus base case. The comparison highlights the right-tailed behavior of losses and the ability of the PCE surrogate to reproduce operationally relevant upper-tail statistics.

Figure 3. Monte Carlo versus PCE surrogate for substation active-power import in the IEEE 33-bus base case. The upper tail represents high-import operating conditions relevant to transformer and feeder capacity assessment.

Figure 4. PCE order convergence analysis for total active-power losses in the IEEE 33-bus base case. The figure shows that increasing the polynomial order beyond three does not materially change the mean loss estimate under the studied uncertainty representation.

Figure 5. Monte Carlo versus PCE surrogate for maximum line loading after realistic thermal-limit recalibration with a 60% deterministic target. The resulting distribution shows that stochastic operating conditions can produce upper-tail loading values above 100%, which were masked by the original benchmark ampacities.

Figure 6. Multi-feeder comparison of mean minimum-voltage magnitude obtained with the PCE-based probabilistic workflow. The tested networks include IEEE 33-bus, CIGRE MV, CIGRE LV, and IEEE 118-bus systems.

Figure 7. Normalized annual variation of PCE-derived probabilistic indicators over the 365-snapshot full-year daily screening. Each daily point corresponds to one representative operating snapshot evaluated with a third-order PCE surrogate.

Figure 8. Full-year daily PCE screening for minimum system voltage magnitude. The solid line represents the daily PCE mean, while the shaded band represents the p05–p95 interval.

Table 1. Base validation of the non-intrusive PCE surrogate against the 50,000-sample Monte Carlo benchmark for the IEEE 33-bus feeder.

Indicator	MC Mean	PCE Mean	Mean Error (%)	MC p01	PCE p01	MC p99	PCE p99
Minimum voltage, vm_pu_min (p.u.)	0.920387	0.920514	0.0137	0.862198	0.861976	0.956956	0.956957
Total active losses, loss_mw (MW)	0.190427	0.190201	0.1186	0.053825	0.051555	0.520382	0.521147
Substation import, ext_grid_p_mw (MW)	3.548834	3.544835	0.1127	1.831718	1.821171	6.063438	6.080180
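
The Mean Error column of Table 1 can be cross-checked from the tabulated means. The sketch below assumes that the error is defined as the absolute difference between the PCE mean and the MC mean, divided by the MC mean and expressed in percent; the helper name `relative_mean_error_pct` is illustrative and is not taken from the study's code. Because the inputs are the six-decimal rounded values printed in the table, the recomputed errors can differ from the tabulated ones in the last displayed digit.

```python
# (MC mean, PCE mean) pairs copied from Table 1.
table1 = {
    "vm_pu_min": (0.920387, 0.920514),
    "loss_mw": (0.190427, 0.190201),
    "ext_grid_p_mw": (3.548834, 3.544835),
}


def relative_mean_error_pct(mc_mean: float, pce_mean: float) -> float:
    """Absolute relative error of the PCE mean w.r.t. the MC mean, in percent.

    Assumed definition; the study's exact formula is not restated here.
    """
    return abs(pce_mean - mc_mean) / abs(mc_mean) * 100.0


errors = {
    name: relative_mean_error_pct(mc, pce) for name, (mc, pce) in table1.items()
}
for name, err in errors.items():
    print(f"{name}: {err:.4f} %")
```

Under this assumed definition, all three recomputed errors agree with the tabulated values to within rounding of the printed means.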

Table 2. Bootstrap 95% confidence intervals for the 50,000-sample Monte Carlo benchmark.

Indicator	Statistic	Estimate	95% CI Low	95% CI High
vm_pu_min (p.u.)	Mean	0.920387	0.920196	0.920560
vm_pu_min (p.u.)	p01	0.862198	0.861141	0.863450
vm_pu_min (p.u.)	p99	0.956956	0.956541	0.957289
loss_mw (MW)	Mean	0.190427	0.189598	0.191295
loss_mw (MW)	p01	0.053825	0.052950	0.054774
loss_mw (MW)	p99	0.520382	0.512535	0.528703
ext_grid_p_mw (MW)	Mean	3.548834	3.540812	3.556817
ext_grid_p_mw (MW)	p01	1.831718	1.812310	1.849433
ext_grid_p_mw (MW)	p99	6.063438	6.018813	6.119895
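
Intervals of the kind reported in Table 2 can be obtained with a percentile bootstrap. The following is a minimal sketch, not the study's implementation: it uses synthetic Gaussian samples as a stand-in for Monte Carlo outputs, and the number of resamples (`n_boot=500`), the seeds, and the helper names `percentile` and `bootstrap_ci` are assumptions chosen for illustration.

```python
import random
import statistics


def percentile(sorted_vals, q):
    """Linear-interpolation percentile (q in [0, 100]) of pre-sorted data."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)


def bootstrap_ci(samples, stat, n_boot=500, level=95.0, seed=0):
    """Percentile-bootstrap confidence interval for stat(samples)."""
    rng = random.Random(seed)
    n = len(samples)
    # Resample with replacement and recompute the statistic each time.
    reps = sorted(stat(rng.choices(samples, k=n)) for _ in range(n_boot))
    alpha = (100.0 - level) / 2.0
    return percentile(reps, alpha), percentile(reps, 100.0 - alpha)


# Synthetic stand-in for Monte Carlo outputs (NOT the paper's data).
data_rng = random.Random(42)
mc_outputs = [data_rng.gauss(0.92, 0.02) for _ in range(2000)]

mean_est = statistics.fmean(mc_outputs)
mean_ci = bootstrap_ci(mc_outputs, statistics.fmean)
p99_est = percentile(sorted(mc_outputs), 99.0)
p99_ci = bootstrap_ci(mc_outputs, lambda s: percentile(sorted(s), 99.0))

print(f"mean = {mean_est:.6f}, 95% CI = ({mean_ci[0]:.6f}, {mean_ci[1]:.6f})")
print(f"p99  = {p99_est:.6f}, 95% CI = ({p99_ci[0]:.6f}, {p99_ci[1]:.6f})")
```

Tail percentiles rest on far fewer effective samples than the mean, so their intervals are wider. The same pattern appears in Table 2, where the p99 interval for loss_mw is roughly ten times wider than the interval for its mean.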

Table 3. PCE order-convergence analysis for the IEEE 33-bus base case.

PCE Order	Quadrature Order	Nodes	Wall Time (s)	Mean Error vm_pu_min (%)	Mean Error loss_mw (%)	Mean Error ext_grid_p_mw (%)
2	3	165	6.06	0.0105	0.1218	0.1141
3	4	494	8.75	0.0137	0.1186	0.1127
4	5	1278	30.50	0.0135	0.1192	0.1132
5	6	2958	27.90	0.0124	0.1190	0.1131
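
The reduction in deterministic power-flow evaluations relative to the 50,000-sample Monte Carlo benchmark follows directly from the node counts in Table 3. The short calculation below simply divides the benchmark sample count by the number of quadrature nodes for each PCE order; the variable names are illustrative.

```python
# Monte Carlo benchmark size and sparse-grid node counts from Table 3.
mc_samples = 50_000
nodes_by_order = {2: 165, 3: 494, 4: 1278, 5: 2958}

# Ratio of deterministic evaluations: Monte Carlo versus PCE training.
reduction = {order: mc_samples / n for order, n in nodes_by_order.items()}

for order, ratio in reduction.items():
    print(f"PCE order {order}: {nodes_by_order[order]} nodes, "
          f"about {ratio:.1f}x fewer evaluations than Monte Carlo")
```

For the third-order surrogate, the ratio is about 101, in line with the reduction quoted in the abstract. Since the mean errors in Table 3 barely change between orders, the extra nodes required by orders 4 and 5 buy little additional accuracy in the mean for this case.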

Table 4. Effect of correlated uncertainty on tail-risk indicators.

Scenario	MC Samples	vm_pu_min p01	loss_mw p99 (MW)	ext_grid_p_mw p99 (MW)	ext_grid_p_mw std (MW)
Independent	50,000	0.862198	0.520382	6.063438	0.902568
load–PV ρ = −0.50	20,000	0.858263	0.547482	6.295110	1.001317
PV–wind ρ = −0.40	20,000	0.862873	0.520280	6.066672	0.900211
Mixed correlations	20,000	0.859092	0.542911	6.273219	0.986242
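
One common way to impose a target correlation between two uncertain inputs is to transform independent standard normal draws with the Cholesky factor of the correlation matrix. The sketch below illustrates this for a single pair with ρ = −0.50. It is a toy example under stated assumptions: both inputs are standard normal, the net import deviation is modeled as the load deviation minus the PV deviation, and the function names are hypothetical. It does not reproduce the study's uncertainty model or power-flow results.

```python
import math
import random
import statistics


def correlated_normal_pairs(rho, n, seed=0):
    """Draw n pairs of standard normals with target Pearson correlation rho.

    Uses the 2x2 Cholesky factor of [[1, rho], [rho, 1]]:
    x = z1, y = rho * z1 + sqrt(1 - rho^2) * z2.
    """
    rng = random.Random(seed)
    scale = math.sqrt(1.0 - rho * rho)
    pairs = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        pairs.append((z1, rho * z1 + scale * z2))
    return pairs


def pearson(pairs):
    """Sample Pearson correlation coefficient of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)


independent = correlated_normal_pairs(0.0, 20_000, seed=1)
load_pv = correlated_normal_pairs(-0.5, 20_000, seed=1)

# Toy net-import deviation: load deviation minus PV deviation (assumption).
std_indep = statistics.pstdev(x - y for x, y in independent)
std_corr = statistics.pstdev(x - y for x, y in load_pv)

print(f"sample correlation: {pearson(load_pv):.3f}")
print(f"std of net deviation: independent {std_indep:.3f}, correlated {std_corr:.3f}")
```

With unit variances, the variance of the difference is 2 − 2ρ, so a negative load–PV correlation widens the net distribution. This is qualitatively consistent with the larger ext_grid_p_mw standard deviation and p99 in the load–PV row of Table 4.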

Table 5. Maximum line loading under realistic thermal-limit recalibration.

Thermal Target (%)	MC Mean (%)	MC p95 (%)	MC p99 (%)	PCE Mean (%)
40	40.9519	58.7631	68.9361	40.9368
60	61.4279	88.1446	103.4041	61.4052
80	81.9038	117.5261	137.8721	81.8736

Table 6. Sensitivity of probabilistic outputs to renewable power factor.

Power Factor	vm_pu_min Mean	vm_pu_min p01	loss_mw Mean (MW)	ext_grid_p_mw Mean (MW)
1.00	0.920024	0.862387	0.192039	3.546481
0.98	0.921355	0.863952	0.186010	3.540452
0.95	0.922146	0.864996	0.182453	3.536895
0.90	0.923100	0.866283	0.178230	3.532672
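
For reference, the reactive power associated with a given active power and power factor follows from Q = P·tan(arccos(pf)). The sketch below evaluates this relation for the four power factors in Table 6, using a hypothetical 1 MW injection. Whether the renewable units inject or absorb reactive power is a sign convention that the relation itself does not fix; the function returns the magnitude only, and its name is illustrative.

```python
import math


def reactive_power_mvar(p_mw: float, power_factor: float) -> float:
    """Magnitude of reactive power (Mvar) for active power p_mw at a given power factor."""
    if not 0.0 < power_factor <= 1.0:
        raise ValueError("power_factor must lie in (0, 1]")
    return p_mw * math.tan(math.acos(power_factor))


# Hypothetical 1 MW injection; power factors taken from Table 6.
q_by_pf = {pf: reactive_power_mvar(1.0, pf) for pf in (1.00, 0.98, 0.95, 0.90)}

for pf, q in q_by_pf.items():
    print(f"pf = {pf:.2f}: |Q| = {q:.4f} Mvar per MW")
```

The reactive share grows quickly as the power factor decreases, reaching nearly half of the active power at pf = 0.90. The trends in Table 6 (higher minimum voltage and lower losses at lower power factor) would be consistent with reactive-power injection by the renewable units, although that reading is an inference from the table rather than a statement of the study's convention.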

Table 7. Multi-feeder validation and scalability results.

Feeder	MC Converged/Requested	MC Wall Time (s)	PCE Nodes	PCE Wall Time (s)	vm_pu_min Mean (p.u.)	loss_mw Mean (MW)	ext_grid_p_mw Mean (MW)
IEEE 33-bus	10,000/10,000	29.73	494	21.11	0.920458	0.190169	3.544611
CIGRE MV	10,000/10,000	28.52	494	9.26	0.974389	0.055640	40.515751
CIGRE LV	10,000/10,000	30.67	494	9.75	0.922872	0.017555	0.643369
IEEE 118-bus	9995/10,000	41.73	494	9.46	0.941983	165.412392	135.151452
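
The wall-clock ratios and Monte Carlo convergence fractions implied by Table 7 can be derived with a few lines of arithmetic. The sketch below only post-processes the tabulated values; the dictionary layout and key names are illustrative.

```python
# (MC converged, MC requested, MC wall time in s, PCE wall time in s) from Table 7.
table7 = {
    "IEEE 33-bus": (10_000, 10_000, 29.73, 21.11),
    "CIGRE MV": (10_000, 10_000, 28.52, 9.26),
    "CIGRE LV": (10_000, 10_000, 30.67, 9.75),
    "IEEE 118-bus": (9_995, 10_000, 41.73, 9.46),
}

summary = {}
for feeder, (converged, requested, mc_s, pce_s) in table7.items():
    summary[feeder] = {
        "speedup": mc_s / pce_s,                     # MC wall time over PCE wall time
        "converged_fraction": converged / requested,  # share of MC runs that converged
    }
    print(f"{feeder}: speed-up {summary[feeder]['speedup']:.2f}x, "
          f"MC convergence {summary[feeder]['converged_fraction']:.2%}")
```

These ratios are based on 10,000 Monte Carlo samples per feeder, so they are not directly comparable with the 13.29× speed-up reported for the 50,000-sample IEEE 33-bus base case.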

Table 8. Annual descriptive statistics from the 365-snapshot full-year daily PCE screening.

Output	Mean of Daily Means	Minimum Daily Mean	Maximum Daily Mean	Mean Daily Std.	Maximum Daily Std.	Minimum p05	Maximum p95
ext_grid_p_mw	3.344268	2.717203	3.935454	0.744823	0.852627	1.836652	5.462734
line_loading_max	0.000181	0.000148	0.000211	0.000040	0.000046	0.000101	0.000294
loss_mw	0.157506	0.104077	0.215295	0.074377	0.100335	0.046073	0.404795
vm_pu_max	1.000000	1.000000	1.000000	0.000000	0.000000	1.000000	1.000000
vm_pu_min	0.926219	0.912593	0.940443	0.017091	0.019788	0.876973	0.960257

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Guerra, J.; Recalde, G.; Gavilanez, J.; Cuenca, D. A Reproducible and Correlation-Aware Polynomial Chaos Framework for Probabilistic AC Power Flow in Renewable-Rich Distribution Networks. Energies 2026, 19, 2777. https://doi.org/10.3390/en19122777
