Mathematical Analysis Methods for Quantitative Scenario Generation of Renewable Power Output: A Comprehensive Review

Ma, Tong; Qin, Boyu; Hong, Shidong; Su, Yiwei

doi:10.3390/en19071701



¹ School of Energy and Electrical Engineering, Xi’an Jiaotong University, Xi’an 710049, China

² School of Energy and Electrical Engineering, Qinghai University, Xining 810016, China

* Author to whom correspondence should be addressed.

Energies 2026, 19(7), 1701; https://doi.org/10.3390/en19071701 (registering DOI)

Submission received: 10 March 2026 / Revised: 16 March 2026 / Accepted: 17 March 2026 / Published: 31 March 2026

(This article belongs to the Special Issue Emerging AI Technologies in Renewable Power System Assessment, Control and Dispatching)


Abstract

As the proportion of renewable power continues to increase, its inherent intermittency and volatility pose serious challenges to the security and stability of power systems. Scenario generation technology serves as a key tool supporting decision-making methods such as stochastic optimization and risk analysis. By generating representative power output scenarios, it can effectively characterize the uncertainty of renewable power output. This paper systematically reviews mainstream methods for the scenario generation of renewable power output, categorizing them into two major classes: sampling-based methods and model-based methods. Among them, sampling-based methods include Monte Carlo sampling, Latin hypercube sampling (LHS), Markov chains (MCs), and Copula functions. Model-based methods encompass artificial neural networks (ANNs), long short-term memory networks (LSTMs), autoregressive moving average models (ARMAs), generative adversarial networks (GANs), variational autoencoders (VAEs), diffusion models and transformer-based models. This paper elaborates on the principles and characteristics of each type of method. Moreover, scenario quality is evaluated from three dimensions: output-based metrics for numerical accuracy, distribution-based metrics for statistical consistency, and event-based metrics for key operational event representation. The current research challenges and future research directions are also summarized to provide a reference for modeling the uncertainty of renewable output.

Keywords: renewable power output; scenario generation; sampling-based methods; model-based methods

1. Introduction

Driven by the global consensus on carbon neutrality and low-carbon energy transition, renewable energy sources (RESs) represented by wind and photovoltaic (PV) power have become the core driving force for the transformation of global power systems [1]. In recent years, the installed capacity of renewable energy has maintained sustained and rapid growth worldwide. By 2025, the global cumulative installed capacity of wind and PV power has exceeded 3 TW and 4 TW, respectively, and more than 100 countries and regions have formulated clear roadmaps for 100% renewable power supply. In China, under the “Dual Carbon” strategic goals, wind and PV power are gradually evolving from supplementary energy sources to the mainstay of the power system, and it is expected that the installed capacity of new energy will exceed 1.2 TW by 2030, accounting for more than 50% of the total installed power generation capacity. This profound transformation of the energy structure fundamentally changes the operation mode and physical characteristics of traditional power systems, and also brings unprecedented challenges to the safe, stable and economical operation of power systems worldwide.

Unlike conventional fossil fuel power generation with controllable and adjustable output, wind and PV power generation are inherently driven by meteorological factors. Wind power output is directly determined by stochastic fluctuations in wind speed [2], while PV generation is highly dependent on real-time solar irradiance, ambient temperature and cloud cover conditions [3]. These natural driving factors lead to the prominent randomness, volatility, intermittency and non-stationarity of renewable power output, and also form complex spatiotemporal correlation characteristics between geographically adjacent wind farms and PV stations [4]. With the increasing proportion of renewable energy connected to the grid, these inherent uncertainties have penetrated all links of the power system, from long-term generation and transmission expansion planning, through medium-term unit commitment and maintenance scheduling, to short-term day-ahead and intra-day economic dispatch, real-time frequency regulation and peak shaving, as well as electricity market transaction settlement and system resilience assessment [5]. Specifically, large-scale renewable energy integration has brought severe challenges to the power system: It aggravates the imbalance between power supply and load, reduces the system inertia and anti-disturbance capability, increases the difficulty of frequency and voltage control, deteriorates power quality, and even leads to cascading failures and large-scale power outages under extreme weather conditions [6]. Therefore, accurate and effective modeling of the uncertainty of renewable power output has become a core scientific problem and technical bottleneck that must be solved for the high-proportion renewable energy power system.

At present, stochastic optimization (SO), robust optimization (RO), and distributionally robust optimization (DRO) are the three mainstream methodological frameworks for dealing with the uncertainty of renewable power output in power system decision-making [7]. Among them, SO formulates the optimization objective based on the statistical expectation of uncertain variables, which requires a preset probability distribution of wind and PV output to describe the uncertainty [8]; RO seeks the optimal solution under the worst-case scenario by defining a bounded uncertainty set, which relies on the accurate characterization of the fluctuation boundary of renewable output [9]; DRO combines the advantages of SO and RO, and uses an ambiguity set of probability distributions to handle the modeling error of the probability distribution, so its performance also depends on a reasonable mapping of real output scenarios [10]. Notably, SO, RO, and DRO all fundamentally depend on a set of representative renewable power output scenarios [11]. Scenario generation (SG) technology, which converts the continuous uncertainty of renewable output into a finite set of discrete scenarios with statistical representativeness, has gradually become the key enabling technology connecting uncertainty modeling and power system engineering decision-making [12]. By simulating the external meteorological driving factors and the spatiotemporal evolution law of renewable output, SG technology constructs a series of representative output scenarios, which provide essential boundary conditions and data input for the optimal operation, planning and risk assessment of power systems with a high proportion of renewable energy [13,14].

In recent years, with the rapid development of statistical theory and artificial intelligence technology, renewable power output SG methods have been continuously enriched and innovated, forming two major technical routes: sampling-based methods and model-based methods [5]. A series of research breakthroughs have been made in both routes: Traditional sampling-based methods represented by Monte Carlo sampling, Latin hypercube sampling, Markov chains and Copula functions have been continuously improved in terms of sampling efficiency and high-dimensional dependency modeling; data-driven model-based methods, from early time-series models such as ARMA and classical neural networks such as ANN and LSTM, to deep generative models represented by GAN, VAE, diffusion models and transformer-based architectures, have achieved leaps in the fidelity, diversity and computational efficiency of SG. However, a survey of existing research shows that clear gaps remain in the systematic organization and comprehensive review of renewable power SG technology.

On the one hand, most of the existing review studies focus on a single type of SG method, or only cover part of the technical routes. For example, some reviews only focus on traditional statistical sampling methods, lacking a systematic sorting of the latest deep generative models that have developed rapidly in recent years; some reviews only discuss the application of GAN and its variants in SG, but do not include the emerging diffusion models and transformer-based generative architectures, which have shown outstanding performance in this field. On the other hand, existing reviews rarely establish a unified, multi-dimensional evaluation system for the quality of generated scenarios, and fail to clarify the applicable boundaries and engineering selection criteria of different SG methods under different scales of renewable energy clusters and different application scenarios. In addition, most of the existing studies focus on unconstrained SG based on historical data, while ignoring the core role of Numerical Weather Prediction (NWP) data in engineering-oriented conditional SG, which leads to a disconnection between academic research and the practical engineering application of SG technology. At the same time, there is still a lack of a comprehensive review that systematically sorts out the development context, technical principles, advantages and limitations of the full spectrum of SG methods, and summarizes the current research bottlenecks and future development trends in this field. This gap not only makes it difficult for new researchers to systematically grasp the overall framework of SG technology, but also brings obstacles to engineering practitioners in selecting appropriate SG methods according to actual application requirements.

To fill the above research gaps, this paper conducts a comprehensive and systematic review of mainstream renewable power output SG methods. The main contributions of this paper are as follows. Firstly, we establish a clear classification system for SG methods, dividing them into two major categories—sampling-based methods and model-based methods—and systematically elaborate the technical principles, improved variants, advantages and limitations, and typical applications of each sub-category method, covering the full spectrum from traditional statistical methods to the latest deep generative models. Secondly, we construct a three-dimensional evaluation system for scenario quality, including output-based metrics for numerical accuracy, distribution-based metrics for statistical consistency, and event-based metrics for key operational event representation, which provides a unified standard for the performance evaluation of different SG methods. Thirdly, we systematically sort out the integration mechanism of NWP data in mainstream SG models, especially the conditional embedding methods in deep generative models, which bridges the gap between academic research and the engineering application of SG technology. Finally, we summarize the current key research challenges in this field, and look ahead to the future research directions and development trends, providing reference and guidance for the subsequent theoretical research and engineering application of renewable power output uncertainty modeling.

The review is organized into three logically progressive core parts with clear research focuses, to avoid the fragmentation caused by an overly broad scope.

In the first part, we define the mathematical connotation and mainstream implementation paradigms of renewable power SG, and we establish a unified classification framework dividing mainstream SG methods into sampling-based and model-based categories. We systematically review the technical principles, improved variants, advantages, limitations and engineering applications of each sub-method. In the second part, we develop a unified three-dimensional evaluation system for scenario quality, comprising output-based numerical accuracy metrics, distribution-based statistical consistency metrics and event-based key operational event representation metrics, and we clarify the calculation principles and applicable scenarios of each metric to provide a standardized evaluation criterion for different SG methods. In the third part, we focus on the engineering-oriented application mechanism and future research of SG technology. We systematically review the core role of Numerical Weather Prediction (NWP) data and its embedding mechanisms in mainstream SG models. We also summarize the key research challenges and six core future research directions in this field, providing a systematic reference for the engineering application of renewable power output uncertainty modeling.

The main contributions of this paper are summarized as follows. Firstly, we establish a clear and complete classification system for renewable power SG methods, covering the full technical spectrum from traditional statistical methods to the latest deep generative models. Secondly, we construct a three-dimensional unified evaluation system for scenario quality, which fills the gap of the lack of systematic evaluation standards in existing reviews. Thirdly, we systematically sort out the integration mechanism of NWP data in mainstream SG models, which bridges the gap between academic research and engineering application. Finally, we summarize the current research challenges and look ahead to the future development trends in this field, providing directional guidance for subsequent research.

The rest of this paper is organized as follows. Section 2 gives the mathematical definition of SG technology, and describes the classification framework and essential differences between sampling-based and model-based SG methods, laying the theoretical foundation for the first core part. Section 3 and Section 4 systematically elaborate the technical principles, characteristics, improved variants and typical applications of sampling-based methods and model-based methods, completing the full-spectrum methodological review of the first core part. Section 5 summarizes the three-dimensional evaluation system for the quality of generated scenarios, which constitutes the full content of the second core part. Section 6 analyzes the core role of NWP data in engineering-oriented SG, and sorts out the NWP information embedding mechanisms of different types of models. Section 7 summarizes the current research bottlenecks and future development trends in this field. Section 6 and Section 7 together form the third core part of this review.

2. Description of Scenario Generation

2.1. Definition of SG Methods

SG refers to the computational process that approximates the randomness and uncertainty in a model. Specifically, this process converts the randomness and uncertainty into a finite discrete probability distribution through structured and systematic methods [15]. The generated discrete scenario set forms a critical bridge between the continuous distribution and the stochastic optimization model. Its quality directly affects the reliability and robustness of the final decision solution [16].

Unlike simple random sampling, SG is a controlled approximation. It typically divides the support region of the probability distribution into several intervals. For each interval, a representative scenario is generated (often taken as the mean, midpoint, or conditional expectation of that interval) with its probability equal to the probability mass of the original distribution over that interval [17]. The core objective of the entire process is to make this discrete set approximate the statistical properties of the original distribution as closely as possible [18]. This maintains a reasonable characterization of uncertainty while ensuring computational feasibility [19].
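The interval-based approximation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a method taken from the reviewed literature: the function name `discretize_by_intervals`, the choice of equal-probability (quantile) intervals, the conditional mean as the representative value, and the synthetic Weibull wind-speed sample are all hypothetical choices made for the example.

```python
import numpy as np

def discretize_by_intervals(samples, n_scenarios):
    """Approximate an empirical distribution by n_scenarios (value, probability) pairs."""
    samples = np.sort(np.asarray(samples, dtype=float))
    # Quantile edges split the support into intervals of (roughly) equal probability mass
    edges = np.quantile(samples, np.linspace(0.0, 1.0, n_scenarios + 1))
    values, probs = [], []
    for k in range(n_scenarios):
        lo, hi = edges[k], edges[k + 1]
        if k == n_scenarios - 1:
            mask = (samples >= lo) & (samples <= hi)   # last interval is closed on the right
        else:
            mask = (samples >= lo) & (samples < hi)
        if not mask.any():
            continue
        values.append(samples[mask].mean())       # conditional expectation within the interval
        probs.append(mask.sum() / samples.size)   # probability mass of the interval
    return np.array(values), np.array(probs)

# Synthetic stand-in for historical wind speed (shape 2, scale 8 m/s; assumed values)
rng = np.random.default_rng(0)
wind_speed = 8.0 * rng.weibull(2.0, size=10_000)
values, probs = discretize_by_intervals(wind_speed, 5)
```

Because each representative value is the conditional mean of its interval, the probability-weighted mean of the discrete set reproduces the sample mean; higher moments are only approximated, and the approximation improves as the number of intervals grows.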

For this review, an SG method is defined as a targeted technical approach for power systems with high renewable penetration. It takes wind/PV power output data, load data, and Numerical Weather Prediction (NWP) data as core inputs, and models the inherent uncertainty of renewable energy output, including its randomness, volatility, intermittency, and spatiotemporal correlations. The method generates renewable power time-series output scenarios that are consistent with both statistical laws and physical constraints, and further constructs a probability-weighted discrete scenario set. The scenario set provides essential data inputs and boundary conditions for core power system applications, including optimization, planning, and risk assessment. This review focuses on the SG methods for wind and PV power output in power systems, covering all SG methods for modeling the uncertainty of wind/PV power output time series (including single-site, multi-site and cluster-level wind/PV power output) and constructing corresponding probability-weighted scenario sets. We also strictly exclude SG technologies in unrelated fields that are not associated with renewable power output uncertainty modeling in power systems, such as financial time series SG, traffic flow modeling and SG, general industrial process simulation SG, and load SG without coupling with renewable power output.

The theoretically continuous or highly complex true probability distribution is transformed into a set consisting of a limited number n of scenarios:

S = \{(ζ_{1}, ρ_{1}), (ζ_{2}, ρ_{2}), \dots, (ζ_{n}, ρ_{n})\}

(1)

where ζ_{i} (i = 1, 2, \dots, n) is a scenario, i.e., a specific possible realization value or vector of the random variable, representing a possible state of uncertainty, and ρ_{i} is the corresponding probability of that scenario occurring (ρ_{i} > 0 and \sum_{i = 1}^{n} ρ_{i} = 1).

For modern model-based SG, two mainstream paradigms are widely adopted to bridge the learned continuous distribution and the discrete scenario set required by SO.

First is direct sampling with equiprobable scenario assignment. This paradigm is the most commonly used lightweight processing method in engineering practice. We directly draw N samples from the continuous distribution P_{θ} (ξ) learned by the generative model via forward propagation, take each sample as an individual scenario, and assign equal occurrence probability to all scenarios. Its mathematical formulation is as follows:

{\tilde{ξ}}_{1}, {\tilde{ξ}}_{2}, \dots, {\tilde{ξ}}_{N} \sim P_{θ} (ξ)

(2)

where {\tilde{ξ}}_{n} denotes the n-th scenario generated from the continuous distribution P_{θ} (ξ), which is a specific realization value or vector of the random vector representing a possible state of uncertainty; N is the total number of scenarios. Under this paradigm, the occurrence probability of each scenario is uniformly assigned as follows:

π_{n} = \frac{1}{N}, \forall n = 1, 2, \dots, N

(3)

Second is sampling-scenario reduction with non-equiprobable assignment. This paradigm is suited to applications that require high statistical fitting accuracy of the scenario set while needing to control the number of scenarios to ensure the efficiency of the optimization solution. First, a sufficient number of raw samples are drawn from the continuous distribution P_{θ} (ξ) learned by the generative model. Then, the massive raw samples are compressed into N representative scenarios via scenario reduction algorithms (e.g., fast forward selection, backward elimination, K-means clustering), with the corresponding probability mass of the original distribution assigned to each representative scenario. Finally, we transform the continuous probability distribution characterizing uncertainty into a discrete scenario set composed of a finite number of scenarios, whose general mathematical formulation is the following:

S = {\{(ξ_{n}, π_{n})\}}_{n = 1}^{N}

(4)

where ξ_{n} is the n-th representative scenario, which is a specific realization value or d-dimensional vector of the random vector, representing a possible state of uncertainty; π_{n} is the corresponding occurrence probability of the n-th scenario, which satisfies the fundamental constraints of probability:

\sum_{n = 1}^{N} π_{n} = 1, 0 < π_{n} < 1, \forall n = 1, 2, \dots, N

(5)
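The two paradigms in Equations (2)–(5) can be illustrated with a short Python sketch. It is only an illustration under explicit assumptions: a clipped Gaussian random walk stands in for samples from the learned distribution P_{θ} (ξ), a plain Lloyd-style K-means written in NumPy stands in for the scenario reduction step, and the names `nearest` and `reduce_scenarios` are hypothetical.

```python
import numpy as np

def nearest(raw, centers):
    """Index of the closest centre (squared Euclidean distance) for every raw sample."""
    d2 = ((raw[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def reduce_scenarios(raw, n_keep, n_iter=50, seed=0):
    """Compress raw samples (M x d) into at most n_keep scenarios with probabilities."""
    rng = np.random.default_rng(seed)
    centers = raw[rng.choice(len(raw), size=n_keep, replace=False)].copy()
    for _ in range(n_iter):
        labels = nearest(raw, centers)
        for k in range(n_keep):
            members = raw[labels == k]
            if len(members) > 0:
                centers[k] = members.mean(axis=0)
    labels = nearest(raw, centers)
    # Probability of a representative scenario = share of raw samples it absorbs
    mass = np.bincount(labels, minlength=n_keep) / len(raw)
    keep = mass > 0            # drop empty clusters so every retained probability is positive
    return centers[keep], mass[keep]

# Stand-in for the generative model: 2000 daily profiles with 24 hourly values in [0, 1]
rng = np.random.default_rng(0)
steps = 0.05 * rng.standard_normal((2000, 24))
raw = np.clip(0.5 + np.cumsum(steps, axis=1), 0.0, 1.0)

# Paradigm 1, Eqs. (2)-(3): every raw sample is a scenario with probability 1/N
pi_equal = np.full(len(raw), 1.0 / len(raw))

# Paradigm 2, Eqs. (4)-(5): reduced set with non-equiprobable weights
scenarios, probs = reduce_scenarios(raw, n_keep=10)
```

The reduced probabilities sum to one by construction, since every raw sample is assigned to exactly one representative scenario. Fast forward selection or backward elimination would replace the clustering step but leave the probability assignment logic unchanged.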

2.2. Classification of SG Methods for Renewable Power Scenarios

The methods for generating renewable output scenarios are categorized into two major classes: sampling-based methods [20] and model-based methods [12].

Sampling-based methods generate discrete scenarios by sampling from probability distributions. They primarily include Monte Carlo (MC) [21], Latin hypercube sampling (LHS) [22], Markov chains (MCs) [23], Copula functions [24], and their derivatives. The fundamental characteristic of sampling-based methods lies in their reliance on prior probability distribution assumptions and parametric modeling [25]. Typically, these methods start with key assumptions. They assume that the core meteorological parameters governing renewable power output follow specific probability distributions. For instance, wind speed is often assumed to follow a Weibull distribution [26,27], while solar irradiance is typically modeled using a Beta distribution [28,29]. Alternatively, the underlying stochastic processes themselves are assumed to conform to established mathematical models, such as Markov processes [30] or Copula-based dependency structures [31]. Model parameters are then estimated by fitting historical data.

The primary advantage of sampling-based methods is model transparency, clear computational logic, and strong interpretability [32]. They provide an explicit mathematical description and statistical significance for uncertainty quantification. However, the performance of these methods heavily depends on the accuracy of the predefined models [33]. They struggle to precisely capture the complex, high-dimensional, nonlinear characteristics and spatiotemporal couplings inherent in renewable power output [34]. Consequently, their flexibility and representational capacity are limited when handling large-scale, non-stationary time series.

Model-based methods use various models to learn complex feature distributions from historical data, in turn producing more realistic and flexible scenarios of renewable energy output. These mainly include autoregressive moving average (ARMA) models [35], artificial neural networks (ANNs), long short-term memory networks (LSTMs), generative adversarial networks (GANs) [36], variational autoencoders (VAEs) [17], diffusion models [37], transformers, and their variants. The fundamental characteristic of model-based methods is data-driven, end-to-end feature learning and distribution mapping [38]. These methods do not depend on predetermined probability distributions. Instead, they leverage the powerful nonlinear mapping capabilities of neural networks or deep generative models to automatically extract and learn complex, high-dimensional spatiotemporal features and joint distributions directly from historical data [39].

The core strength of model-based methods lies in their powerful representational learning capacity and generative flexibility. They are capable of capturing intricate patterns and nonlinear dependencies that are difficult for traditional methods to model, thereby generating highly realistic and diverse scenarios [40]. However, these methods usually demand large amounts of high-quality training data. Their internal mechanisms often resemble a “black box,” resulting in poor interpretability. Furthermore, the training process can face challenges such as instability or extremely high computational costs [41].

The classification of SG methods is shown in Figure 1.

Sampling-based methods follow the paradigm of modeling the joint distribution first, then generating samples via sampling. Specifically, they first explicitly or semi-explicitly define the N-dimensional joint probability distribution and spatial dependence structure of renewable farm outputs via statistical methods (e.g., Copula functions), then extract samples from the predefined distribution via sampling algorithms. For Copula-based sampling, the parameter space of the N-dimensional dependence structure grows at O (N^{2}) or even exponentially. For example, a Gaussian Copula requires fitting an N × N correlation matrix, which involves 4950 independent parameters when N = 100. Consequently, this paradigm is inherently limited by the curse of dimensionality.
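The parameter count quoted above follows from the N(N − 1)/2 free off-diagonal entries of a symmetric correlation matrix. The sketch below checks that figure and shows, for two farms only, how a Gaussian Copula sample might be drawn. The correlation value 0.8, the Weibull marginal (shape 2, scale 8) and the function names are assumptions made for illustration.

```python
import math
import numpy as np

def n_copula_params(n_farms):
    """Free off-diagonal entries of an n x n correlation matrix."""
    return n_farms * (n_farms - 1) // 2

def gaussian_copula_sample(corr, n_samples, seed=0):
    """Uniform marginals whose dependence structure is a Gaussian Copula with matrix corr."""
    rng = np.random.default_rng(seed)
    chol = np.linalg.cholesky(corr)
    z = rng.standard_normal((n_samples, corr.shape[0])) @ chol.T   # correlated normals
    # Standard normal CDF, applied element-wise, maps each column to U(0, 1)
    phi = np.vectorize(lambda v: 0.5 * (1.0 + math.erf(v / math.sqrt(2.0))))
    return phi(z)

corr = np.array([[1.0, 0.8],
                 [0.8, 1.0]])                 # assumed dependence between two farms
u = gaussian_copula_sample(corr, 5000)
u = np.clip(u, 1e-12, 1.0 - 1e-12)           # guard the inverse CDF against exact 0 or 1

# Inverse Weibull CDF turns the uniforms into correlated wind speeds (assumed marginals)
shape, scale = 2.0, 8.0
wind = scale * (-np.log1p(-u)) ** (1.0 / shape)

print(n_copula_params(100))
```

For two farms a single correlation coefficient suffices, whereas every additional farm adds a full row of coefficients to estimate, which is the quadratic growth the paragraph refers to.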

Model-based methods follow the paradigm of end-to-end joint distribution learning, implicit spatial dependence encoding, and sample generation via forward mapping. Instead of explicitly defining the N-dimensional joint distribution in advance, model-based methods use the nonlinear mapping of deep neural networks to learn the transformation from a low-dimensional latent space to the high-dimensional sample space, implicitly fitting the joint distribution and spatial dependence structure of renewable farm outputs. Because the parameters of the N-dimensional joint distribution are not modeled explicitly, the parameter scale grows linearly or sub-linearly with N, rather than quadratically or exponentially as in sampling-based methods. For example, transformer-based generative models can control the computational complexity within O (N \log N) via sparse attention mechanisms. This paradigm fundamentally changes the logic of high-dimensional distribution modeling, and systematically mitigates the curse of dimensionality compared to sampling-based methods.

Overall, sampling-based methods and model-based methods follow different logical paths and have different performance boundaries when dealing with spatial dimension scaling. For small-scale renewable clusters (N < 30 farms), sampling-based methods (especially Copula-based methods) offer explicit interpretability, low computational overhead, and sufficient modeling accuracy, and are a reasonable choice. For medium- and large-scale renewable clusters (30 ≤ N ≤ 200 farms), model-based methods break through the curse of dimensionality that constrains sampling-based methods, and have clear advantages in spatial dependence capture, sample generation efficiency, and constraint embedding capability, making them the mainstream technical path at this scale. For ultra-large-scale renewable clusters (N > 200 farms), both types of methods have significant limitations. Sampling-based methods are no longer viable, while model-based methods face core problems such as data hunger, lack of interpretability, and difficulty in constraint embedding. Future research should focus on the fusion of the two: using statistical methods to ensure the accuracy and interpretability of marginal distributions, using deep generative models to capture complex spatial dependence structures in high-dimensional space, and combining physical information embedding to improve the generalization and constraint satisfaction of the model, so as to realize high-dimensional output modeling for ultra-large-scale power systems.

3. Sampling-Based Methods

Sampling-based methods are founded on probability and statistical theory. They generate scenarios by sampling from preset probability distributions or by exploring correlations between variables. These methods are characterized by strong interpretability and clear computational logic. They are suitable for situations with limited data availability or where high model transparency is required.

3.1. Monte Carlo Method

The Monte Carlo method does not rely on complex causal modeling. Instead, it follows the logic of “probability distribution assumption—parameter fitting—random sampling—scenario output” to reproduce the randomness inherent in renewable power output [42]. The core concept of Monte Carlo is to utilize historical data to ascertain the distribution types of key parameters. Then, extensive sampling is performed to cover various possible meteorological scenarios [43].

The first step in the Monte Carlo method is to identify the core influencing parameters and their distribution types [44]. For wind power, the core parameter is typically wind speed (with auxiliary parameters such as air density considered in some cases), which is often fitted using a Weibull distribution [45]. For PV power generation, the core parameter is usually solar irradiance (sometimes combined with influencing factors such as temperature and cloud cover), which is often fitted using a Beta distribution [46]. Next, the parameters of these distributions are estimated from collected historical data. Typical examples include the shape parameter and scale parameter of the Weibull distribution, and the α and β parameters of the Beta distribution. Finally, extensive independent random sampling is performed based on the fitted distribution for each parameter. The sampled meteorological parameters are then converted into actual power output scenarios. The specific flowchart is shown in Figure 2.
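The three steps for wind power (fit the distribution, sample it, convert to power) can be sketched as follows. This is a hedged illustration rather than the procedure of any cited study: the Weibull parameters are estimated with the empirical moment approximation k ≈ (σ/μ)^(−1.086) instead of maximum likelihood, the "historical" data are synthetic, and the turbine power curve (cut-in 3 m/s, rated 12 m/s, cut-out 25 m/s, 2 MW) is hypothetical.

```python
import math
import numpy as np

def fit_weibull_moments(speeds):
    """Moment-based Weibull estimate (empirical approximation, assumed for this sketch)."""
    mean, std = np.mean(speeds), np.std(speeds)
    shape = (std / mean) ** -1.086
    scale = mean / math.gamma(1.0 + 1.0 / shape)
    return shape, scale

def power_curve(v, cut_in=3.0, rated_v=12.0, cut_out=25.0, rated_p=2.0):
    """Idealised turbine curve in MW; all four parameters are hypothetical."""
    v = np.asarray(v, dtype=float)
    ramp = rated_p * (v**3 - cut_in**3) / (rated_v**3 - cut_in**3)
    p = np.where((v >= cut_in) & (v < rated_v), ramp, 0.0)     # cubic ramp below rated speed
    p = np.where((v >= rated_v) & (v <= cut_out), rated_p, p)  # flat at rated power
    return p                                                   # zero below cut-in, above cut-out

rng = np.random.default_rng(1)
history = 8.0 * rng.weibull(2.0, size=5000)            # synthetic historical wind speeds (m/s)

shape, scale = fit_weibull_moments(history)            # step 1: parameter fitting
sampled_speed = scale * rng.weibull(shape, size=1000)  # step 2: independent random sampling
power_scenarios = power_curve(sampled_speed)           # step 3: conversion to power output
```

Each sampled value is drawn independently, so a sequence built from these samples carries no temporal correlation between consecutive time steps.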

Monte Carlo sampling is the most prevalent approach for SG. In [47], the Monte Carlo method was adopted to generate an adequate number of scenarios considering random fault locations and load distributions, and the number of scenarios was subsequently reduced with the K-means clustering algorithm. Similarly, ref. [48] utilized Monte Carlo sampling to simulate the uncertainties and random perturbations of loads and wind energy.

Nevertheless, the Monte Carlo method demands a large sample size to guarantee statistical characteristics, leading to high computational costs, and it also struggles to directly characterize the autocorrelation of time series. To address these problems, the Markov chain Monte Carlo (MCMC) method has been proposed [49]. MCMC is a class of random sampling algorithms based on Markov chain construction; its core principle is to construct a Markov chain whose stationary distribution is consistent with the target probability distribution. Through long-term state transitions on this chain, approximate samples that comply with the target distribution can be obtained [50]. The MCMC method is particularly suitable for the sampling and numerical integration of high-dimensional, non-standard complex distributions (such as posterior distributions in Bayesian statistics) [51].

In [52], the MCMC method is adopted to directly generate synthetic wind power output time series. The key to this approach resides in leveraging the nonlinear characteristics of wind turbine power curves to map continuous wind speed values into discrete power states. Consequently, the number of states in the power domain is much lower than that in the wind speed domain, thereby significantly cutting down the number of Markov chain parameters and alleviating the inaccuracy in parameter estimation caused by limited data. Additionally, the autocorrelation of wind power sequences is weaker than that of wind speed sequences, allowing low-order Markov chains (such as first-order chains) to achieve satisfactory fitting performance in the power domain. By optimizing the sampling time step, the MCMC method can precisely reproduce the probability density function as well as the autocorrelation function of power, offering a more efficient and stable “black-box” simulation tool for wind power stochastic modeling.
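The idea of working with discrete power states and a first-order chain can be illustrated with a small sketch. It is not a reproduction of the method in [52]: the ten equal-width power states, the synthetic normalized power series and the function names are assumptions chosen to keep the example self-contained.

```python
import numpy as np

def fit_transition_matrix(states, n_states):
    """Estimate first-order transition probabilities from an observed state sequence."""
    counts = np.zeros((n_states, n_states))
    for a, b in zip(states[:-1], states[1:]):
        counts[a, b] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    P = np.full_like(counts, 1.0 / n_states)   # unseen states fall back to a uniform row
    seen = row_sums[:, 0] > 0
    P[seen] = counts[seen] / row_sums[seen]
    return P

def simulate_chain(P, start, length, seed=0):
    """Generate a synthetic state sequence by repeated sampling of the next state."""
    rng = np.random.default_rng(seed)
    path = [int(start)]
    for _ in range(length - 1):
        path.append(int(rng.choice(len(P), p=P[path[-1]])))
    return np.array(path)

# Synthetic normalized power series in [0, 1] (stand-in for measured wind farm output)
rng = np.random.default_rng(2)
power = np.empty(3000)
power[0] = 0.5
for t in range(1, len(power)):
    power[t] = np.clip(power[t - 1] + 0.05 * rng.standard_normal(), 0.0, 1.0)

n_states = 10
states = np.minimum((power * n_states).astype(int), n_states - 1)   # discretize into states
P = fit_transition_matrix(states, n_states)
synthetic_states = simulate_chain(P, start=states[0], length=500)
synthetic_power = (synthetic_states + 0.5) / n_states                # state midpoints
```

With ten states the chain has at most 100 transition probabilities to estimate, which illustrates why a coarse power-domain state space eases parameter estimation when data are limited.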

Building on this, the persistence and variation-Monte Carlo (PV-MC) method was proposed for the direct generation of synthetic wind power output sequences [53]. The PV-MC method introduces a key innovation that decouples two critical time-domain characteristics of wind power (persistence and variability) from traditional state transition probabilities. The PV-MC method first generates an initial state sequence based on a modified state transition matrix, then assigns random durations consistent with historical rules to each state using the inverse Gaussian distribution, and finally generates specific power values matching actual fluctuation patterns via the t location-scale distribution. This improvement enables the generated synthetic sequences to retain the statistical characteristics (such as probability density and autocorrelation) reproducible by the traditional MCMC method. Meanwhile, the PV-MC method realizes more realistic and accurate simulation of two key time-domain dynamic characteristics (state duration and power fluctuation), thus providing higher-quality stochastic scenarios for power system planning.

3.2. Latin Hypercube Sampling (LHS) Method

Latin hypercube sampling (LHS) is an efficient random sampling method based on the stratified sampling principle [22]. Compared with Monte Carlo sampling, LHS covers the probability space more uniformly with fewer samples, improving sampling efficiency [54].

In renewable power output SG, LHS is often used to generate samples of input random variables [55]. The flowchart of LHS scenario generation is shown in Figure 3. Initially, the cumulative distribution function (CDF) [56] of each random variable is evenly partitioned into several intervals with equal probability. Subsequently, one sample (typically the midpoint or a random point within the interval) is extracted from each of these intervals [57]. Finally, the samples of the different variables are paired through permutation and combination operations so that unintended correlation among the random variables is reduced.

The mathematical formula is as follows:

[\frac{i - 1}{N}, \frac{i}{N}], i = 1, 2, \dots, N

(6)

U_{j}^{(i)} = \frac{i - 0.5}{N}, i = 1, 2, \dots, N

(7)

x_{j}^{(i)} = F_{j}^{- 1} (U_{j}^{(i)})

(8)

where

x_{j}^{(i)}

denotes the sample and

F_{j}^{- 1}

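is the inverse CDF of the j-th variable. Equation (6) gives the i-th equal-probability interval, and Equation (7) takes its midpoint as the stratified probability value.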
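As a minimal sketch of Equations (6)–(8), the following Python fragment draws one value per equal-probability interval and maps it through an inverse CDF. The function names, the Weibull marginal and its shape and scale values are illustrative assumptions and are not taken from the reviewed references.

```python
import math
import random


def lhs_uniform(n_samples, n_vars, rng, midpoint=True):
    """Return an n_samples x n_vars list of stratified uniforms in [0, 1)."""
    columns = []
    for _ in range(n_vars):
        if midpoint:
            # Eq. (7): midpoint of each interval [(i - 1) / N, i / N].
            strata = [(i - 0.5) / n_samples for i in range(1, n_samples + 1)]
        else:
            # Alternative: a random point inside each interval.
            strata = [(i - 1 + rng.random()) / n_samples
                      for i in range(1, n_samples + 1)]
        rng.shuffle(strata)  # random pairing across variables
        columns.append(strata)
    # Transpose so that each row is one scenario.
    return [list(row) for row in zip(*columns)]


def weibull_inv_cdf(u, shape, scale):
    """Inverse CDF of a Weibull distribution (assumed wind-speed marginal)."""
    return scale * (-math.log(1.0 - u)) ** (1.0 / shape)


rng = random.Random(0)
u = lhs_uniform(n_samples=10, n_vars=2, rng=rng)
# Eq. (8): x = F^{-1}(U); shape and scale are illustrative values only.
wind = [[weibull_inv_cdf(u_ij, shape=2.0, scale=8.0) for u_ij in row] for row in u]
```

Each column of `u` contains exactly one value from every interval, which is the stratification property that distinguishes LHS from simple random sampling; only the pairing between columns is random.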

LHS generally requires fewer samples and exhibits higher sampling efficiency, and is therefore widely applied in renewable energy output SG. In [55], researchers utilized the LHS technique to analyze the reliability of power systems containing renewable energy sources. First, the probability distributions of load and renewable energy output are divided into equal-probability intervals, and sampling is performed within each interval to generate two independent initial sample sequences. Subsequently, four “matching” strategies are adopted, namely the load duration curve constructed from historical data, the linear regression model, the joint probability table, and the rank correlation coefficient matrix. These strategies are utilized to rearrange or correlate the two sequences, thereby embedding the actual correlation. Finally, the matched load-renewable energy samples are combined with independently sampled conventional unit states to form a complete system state for reliability assessment (such as calculating LOLP and EUE). The process ensures comprehensive coverage of the respective probability distributions by stratified sampling and reflects the dependencies between variables through correlation reconstruction, enabling stable estimation of reliability indicators with a small number of samples and notably improving computational efficiency. Ref. [55] established a probability distribution model using LHS combined with Polynomial Normal Transformation (PNT), fully considering the correlations between random variables.

However, in multi-variable stochastic problems, the simulation accuracy of LHS depends on the sampled values, and handling multi-variable correlations involves a considerable computational burden. Consequently, a probabilistic load flow calculation method (CLMCS) combining Nataf transformation and LHS was proposed [58], aiming to efficiently handle complex correlations between input random variables (such as wind power, photovoltaic power, and load). Its principal advantage lies in obtaining high-precision solutions with less computational effort, and the procedure is hardly constrained by the probability distributions of the input random variables. The core process of CLMCS is as follows. First, original variables with arbitrary marginal distributions and a given correlation matrix are transformed into standard normal variables via Nataf transformation, yielding a known correlation coefficient matrix. Second, LHS is applied to these transformed variables to generate samples with target correlations, which are then inversely transformed back to the original variables’ distribution space. Finally, the correlated input samples are substituted into deterministic load flow calculation, enabling the statistical analysis of output variables such as voltage, phase angle, and line load flow. The CLMCS method enhances the coverage efficiency of the sampling space by leveraging LHS and embeds the correlations between variables via Nataf transformation. Thus, the CLMCS method achieves more accurate and stable probabilistic load flow solutions with fewer samples than traditional simple random sampling methods, while imposing minimal restrictions on the distribution types of input variables.

To address the low efficiency of Monte Carlo simulation with simple random sampling (SRS) in probabilistic load flow calculation, an improved sampling technique combining LHS with Cholesky decomposition (LHS-CD) is proposed for load flow problems under multiple uncertainties in power systems [59]. First, stratified sampling is performed on the probability distribution of each input random variable (such as node load and generator output) through LHS to ensure that the entire distribution space is covered with a small number of samples. Then, Cholesky decomposition is used to process the ordering matrix of the initial sampling matrix to minimize unintended, spurious statistical correlations between samples of different variables. This key step addresses the problem that residual correlations between samples affect accuracy when traditional LHS only uses random permutation. The reported results show that, with the same small sample size, LHS-CD produces much smaller errors than SRS when estimating the mean and standard deviation of output variables. Furthermore, LHS-CD is significantly superior to LHS using only random permutation (LHS-RP), while its computation time remains comparable to both SRS and LHS-RP. Thus, LHS-CD achieves a substantial improvement in sampling efficiency and computational accuracy while maintaining the robustness and flexibility of the Monte Carlo method.
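The decorrelation step can be sketched as follows, under the assumption that the ordering matrix is a K x N matrix whose rows are permutations of 1, ..., N, one row per input variable. The correlation matrix of the rows is factorized as D D^T, the ordering matrix is multiplied by the inverse of D, and each row is then replaced by the ranks of its entries. All function names are hypothetical, and the code is one plausible reading of the procedure rather than the implementation used in [59].

```python
import math
import random


def correlation_matrix(rows):
    """Pearson correlation matrix between the rows of a K x N matrix."""
    n = len(rows[0])
    centered = []
    for r in rows:
        mean = sum(r) / n
        centered.append([v - mean for v in r])
    norms = [math.sqrt(sum(v * v for v in c)) for c in centered]
    k = len(rows)
    return [[sum(a * b for a, b in zip(centered[i], centered[j]))
             / (norms[i] * norms[j]) for j in range(k)] for i in range(k)]


def cholesky_lower(a):
    """Lower-triangular D with D D^T = a (a must be positive definite)."""
    k = len(a)
    d = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1):
            s = sum(d[i][m] * d[j][m] for m in range(j))
            if i == j:
                d[i][j] = math.sqrt(a[i][i] - s)
            else:
                d[i][j] = (a[i][j] - s) / d[j][j]
    return d


def decorrelate_ordering(order):
    """Return a new ordering matrix with reduced correlation between rows."""
    d = cholesky_lower(correlation_matrix(order))
    k, n = len(order), len(order[0])
    # Forward substitution solves D G = order, i.e. G = D^{-1} order.
    g = [[0.0] * n for _ in range(k)]
    for i in range(k):
        for col in range(n):
            s = sum(d[i][m] * g[m][col] for m in range(i))
            g[i][col] = (order[i][col] - s) / d[i][i]
    # Replace each row of G by the ranks (1..N) of its entries.
    new_order = []
    for row in g:
        ranks = [0] * n
        for rank, idx in enumerate(sorted(range(n), key=row.__getitem__), start=1):
            ranks[idx] = rank
        new_order.append(ranks)
    return new_order


rng = random.Random(1)
n_vars, n_samples = 3, 50
order = [rng.sample(range(1, n_samples + 1), n_samples) for _ in range(n_vars)]
new_order = decorrelate_ordering(order)
# Rank r selects the r-th stratum midpoint of each variable.
samples = [[(r - 0.5) / n_samples for r in row] for row in new_order]
```

Because every row of the result is still a permutation of 1, ..., N, the marginal stratification of LHS is preserved; only the pairing between variables changes.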

Focusing on wind power generation and its modeling characteristics, a low-discrepancy LHS method is proposed [60]. This sampling design combines the strengths of two established techniques. On the one hand, it retains the stratified marginal uniformity of traditional LHS across each parameter dimension. On the other hand, it employs low-discrepancy sequences to govern the pairing of samples between dimensions, thereby improving the overall spatial uniformity and space-filling properties in the multivariate space. The method also considers correlations among generator units, reduces sample fluctuation through optimized sampling point selection, and enhances the accuracy of scenario simulation.

3.3. Markov Chains (MCs) Method

The Markov chain (MC) refers to a stochastic process built on the probabilities of state transitions [61], suitable for capturing the temporal correlations in renewable power output. The core concept of MC is “no memory”, meaning the state of the system at the next instant is only dependent on the present state and independent of the historical state sequences from the past [62]. In generating renewable energy output scenarios, MC first discretizes continuous wind speed or PV output values into a finite number of states. Then, based on historical data, MC calculates the transition frequencies between states to construct a probability matrix of state transitions, in which each entry represents the probability of transitioning from the current state i to the next state j. When generating scenarios, starting from an initial state, subsequent states are sequentially generated through random sampling according to the probability distribution provided by the transition matrix, forming a time-series scenario sequence [63]. The mathematical model of MC is as follows:

P_{i j} = P (X_{t + 1} = j | X_{t} = i) = P (X_{t + 1} = j | X_{t} = i, X_{t - 1}, \dots, X_{0})

(9)

where

P_{i j}

denotes the transition probability from state i to state j, with

P_{i j} \geq 0

and

\sum_{j} P_{i j} = 1

. The matrix composed of all

P_{i j}

is the state transition probability matrix P. The flowchart of the MC scenario generation steps is shown in Figure 4.
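The three steps described above (state discretization, estimation of the transition matrix from transition frequencies, and sequential sampling) can be sketched in a few lines of Python. The uniform binning, the handling of unvisited states and the short normalized output series are illustrative assumptions.

```python
import random


def discretize(values, n_states, v_min, v_max):
    """Map continuous values in [v_min, v_max] to state indices 0..n_states-1."""
    width = (v_max - v_min) / n_states
    return [min(int((v - v_min) / width), n_states - 1) for v in values]


def estimate_transition_matrix(states, n_states):
    """Row-normalized transition counts; unvisited states get a uniform row."""
    counts = [[0] * n_states for _ in range(n_states)]
    for i, j in zip(states[:-1], states[1:]):
        counts[i][j] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total for c in row] if total
                      else [1.0 / n_states] * n_states)
    return matrix


def generate_scenario(matrix, initial_state, length, rng):
    """Sample a state sequence; each step depends on the current state only."""
    states = [initial_state]
    for _ in range(length - 1):
        row = matrix[states[-1]]
        states.append(rng.choices(range(len(row)), weights=row)[0])
    return states


# Hypothetical normalized output history (per unit of rated power).
history = [0.10, 0.15, 0.40, 0.55, 0.60, 0.35, 0.20, 0.05, 0.30, 0.70, 0.90, 0.65]
states = discretize(history, n_states=4, v_min=0.0, v_max=1.0)
P = estimate_transition_matrix(states, n_states=4)
scenario = generate_scenario(P, initial_state=states[-1], length=24,
                             rng=random.Random(0))
```

Repeating the last call with different seeds yields a scenario set. Mapping states back to power values (for example, by sampling within each bin) is a separate design choice that the sketch leaves out.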

The MC method can effectively generate scenarios with reasonable temporal evolution patterns and is widely applied. In [64], an MC-based SG method is proposed to solve the stochastic optimization problem in Active Distribution Networks (ADN). First, historical PV output and power demand data are clustered by day type (such as sunny, cloudy, and overcast days), and an MC model is trained for each cluster to obtain its state transition probability matrix. Then, random forest is used to predict the next-day baseline scenario and determine its category, and numerous time-series scenarios are generated based on the corresponding transition probability matrix. Finally, the number of scenarios is reduced through K-Means clustering to obtain a set of representative scenarios with both typicality and specificity. Ref. [63] also employs the MC method to calculate the short-circuit fault probability for each category.

However, basic MC still has limitations in handling the complex temporal characteristics of wind and PV output. Therefore, numerous researchers have proposed various improved methods, which are detailed as follows:

The High-order Markov chain (HMC) is an extended model of the basic MC. HMC relaxes the first-order assumption that “the future state only depends on the current state” and allows the future state to rely on a finite number of consecutive historical states in the past (the previous k states). This enables HMC to capture more complex long-term dependencies as well as dynamic characteristics in the time series. In [65], a PV power probability distribution prediction method based on HMC is proposed. First, ambient temperature and solar irradiance are taken as features, and the operating conditions of PV power generation are categorized via the Pattern Discovery Method (PDM). HMC models are constructed for each condition category to capture the multi-step temporal dependencies of PV power. Then, the Gaussian Mixture Model (GMM) is used to fit the PV power probability distribution for the next 15 min, where the weights, means, and standard deviations of GMM are all derived from HMC parameters. Finally, based on the similarity between the current operating point and the centroid point of each category, weighted fusion is performed on the conditional probability distributions output by each HMC to obtain the final predicted PDF, and the GMM parameters are optimized through a genetic algorithm to improve prediction accuracy. Additionally, the model supports online updates to adjust to the non-stationary features of PV power output. Ref. [66] also used HMC for electricity price data fitting and prediction, providing a robust solution for power market contract pricing. Ref. [67] proposed an HMC modeling framework to represent the random behavior of PV output and load curves, which can effectively capture nonlinear temporal autocorrelations across multiple time intervals.

The enhanced MC based on graph learning technology is a hybrid method integrating graph structure modeling and sequence state transition. It extracts complex dependencies between state nodes in the system through graph neural networks and dynamically adjusts state transition probabilities using these dependencies, thereby overcoming the limitation that traditional Markov chains only rely on the current state sequence. Ref. [68] enhanced the Markov chain model by constructing a graph structure based on the Minimum Spanning Tree (MST), which can effectively capture the spatiotemporal features of wind farms and notably improve the accuracy of scenario modeling. The process is as follows. Using the spatial position information of wind turbines in the wind farm, an MST is established with the reference wind turbine as the root node for each turbine type. Linear regression quantifies the power output relationship between parent–child turbine pairs, and a Markov chain with non-uniform state division is designed based on these spatiotemporal analysis results. Compared with traditional uniform quantization methods, the MC based on graph learning technology more accurately characterizes the strong correlation between wind turbines and the spatiotemporal non-stationarity of power evolution, significantly improving the accuracy of short-term distribution prediction and point prediction.

The Markov chain Monte Carlo (MCMC) method has been previously mentioned as an improvement over basic Monte Carlo sampling. Ref. [49] proposed an optimized MCMC method for SG, while ref. [51] has applied MCMC to produce synthetic time series of wind energy generation output, fully using its advantage in capturing complex probability distributions and temporal correlations.

The Markov chain mixture distribution (MCM) model is a statistical model for modeling high-order temporal dependencies, aiming to capture long-term historical dependencies with low computational complexity. By decomposing the state transition probability of HMC into a weighted combination of multiple low-order transition probabilities, MCM avoids the problem that the number of parameters in traditional HMC grows exponentially with the order. Ref. [69] proposed the MCM model and applied it to ultra-short-term load prediction of electricity consumption in domestic areas. The electricity consumption range in the training data is uniformly partitioned into 100 bins, and a state transition probability matrix is constructed from historical electricity consumption sequences to characterize the transition likelihood between different consumption states. For prediction, only the probability distribution of the corresponding row is extracted from the transition matrix according to the bin of the current electricity consumption, thus yielding a piecewise uniform distribution for the next-time-step electricity consumption prediction. Ref. [70] also used the MCM distribution model to explore probability-based and scenario-based solar irradiance prediction.

3.4. Copula Functions

Copula functions are mathematical tools used to model the dependency structure among multiple random variables [71]. They allow the marginal distribution of each variable to be separated from the correlations between variables [72]. In the process of generating joint renewable power output scenarios, Copula functions can effectively capture the complex, nonlinear spatiotemporal dependencies between wind and solar power outputs [73], which makes them a key method for generating scenarios with realistic statistical correlations. The core principle is Sklar’s theorem, which states that any multivariate joint distribution function

F (X_{1}, X_{2}, \dots, X_{d})

can be decomposed into the marginal distribution functions of each variable and a Copula function:

F (X_{1}, X_{2}, \dots, X_{d}) = C (F_{1} (X_{1}), F_{2} (X_{2}), \dots, F_{d} (X_{d}))

(10)

where

X_{1}, X_{2}, \dots, X_{d}

represent wind-PV related variables,

F_{1} (X_{1}), F_{2} (X_{2}), \dots, F_{d} (X_{d})

are the marginal distributions of each variable, and C is the Copula function. Subsequently, uniform samples (u_{1}, u_{2}, \dots, u_{d}) are drawn from the Copula function. Ultimately, joint scenarios with specified marginal distributions and dependence structures are generated through the inverse transformation

x_{i} = F_{i}^{- 1} (u_{i})

. The specific steps are illustrated in Figure 5:

Copula functions allow for the independent selection of the most suitable marginal distribution for each variable, followed by flexible dependency modeling. As a result, they are widely applied in renewable energy output SG. Ref. [74] used Copula functions to model the correlations between different random variables, providing theoretical support for the analysis of uncertainty in power systems. First, each random variable (such as wind speed or load) is mapped to a unified “uniform domain” via cumulative distribution function (CDF) transformation to eliminate the impact of marginal distributions. Subsequently, the Copula function is employed in this domain to describe the interdependence structure between variables, characterizing their monotonic correlation through the rank correlation matrix. Finally, the correlated uniform distribution samples are mapped back to the original variable distribution space via inverse CDF transformation, thereby reflecting the random dependence between variables while accounting for actual marginal distributions. Ref. [75] also proposed a wind energy SG method based on Copula function and prediction error, which can embody the spatiotemporal characteristics of wind power generation and the probabilistic features of prediction errors.

However, traditional Copula functions are mostly used for static dependence modeling and have weak ability to capture dynamic correlations that change over time. Consequently, many derivative methods have been developed, which are detailed as follows.

The multivariate Gaussian Copula method is a statistical tool for modeling correlations between multiple variables. The core concept of multivariate Gaussian Copula is to transform the marginal distribution of each variable into a uniform distribution through CDF to eliminate the influence of marginal distribution forms, and then use the multivariate normal distribution in the transformed “uniform domain” to characterize the dependence structure between variables. Ref. [76] proposed a multivariate Gaussian Copula method to characterize the dependence structure between different time points or different energy sources, generating aggregated output scenarios with strong correlations based on the probabilistic prediction of wind power, solar energy and small hydropower.
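A bivariate Gaussian Copula sampler can be sketched as follows: correlated standard normal variables are drawn, mapped to dependent uniforms through the normal CDF, and then passed through the inverse marginal CDFs as in x_i = F_i^{-1}(u_i). The Weibull marginals, their parameters and the correlation value are illustrative assumptions; note that rho is the correlation of the latent normal variables, not of the generated outputs.

```python
import math
import random
from statistics import NormalDist


def weibull_inv_cdf(u, shape, scale):
    """Inverse CDF of a Weibull marginal (an assumed wind-speed model)."""
    return scale * (-math.log(1.0 - u)) ** (1.0 / shape)


def gaussian_copula_scenarios(n, rho, inv_cdfs, rng):
    """Draw n joint scenarios for two variables coupled by a Gaussian Copula."""
    std_normal = NormalDist()
    scenarios = []
    for _ in range(n):
        # Correlated standard normals with correlation rho.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        # The normal CDF maps them to dependent uniforms; clamp away from 1
        # so that the logarithm in the inverse CDF stays finite.
        u = [min(std_normal.cdf(z), 1.0 - 1e-12) for z in (z1, z2)]
        # x_i = F_i^{-1}(u_i) restores the chosen marginals.
        scenarios.append(tuple(f(ui) for f, ui in zip(inv_cdfs, u)))
    return scenarios


inv_cdfs = [
    lambda u: weibull_inv_cdf(u, shape=2.0, scale=8.0),  # site 1 (illustrative)
    lambda u: weibull_inv_cdf(u, shape=2.2, scale=7.0),  # site 2 (illustrative)
]
scenarios = gaussian_copula_scenarios(1000, rho=0.8, inv_cdfs=inv_cdfs,
                                      rng=random.Random(0))
```

Changing a marginal only requires replacing the corresponding entry of `inv_cdfs`; the dependence structure set by `rho` is untouched, which is the separation of marginals and dependence that Sklar’s theorem provides.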

The vine Copula function is a flexible framework that decomposes complex high-dimensional joint distributions into a series of conditional bivariate Copulas and marginal distributions for modeling. By constructing a vine structure, the vine Copula function decomposes high-dimensional dependence into a combination of multiple low-dimensional Copulas, thereby enabling the fine characterization of various complex nonlinear, asymmetric and tail dependencies between variables. The vine Copula function mitigates the “curse of dimensionality” by setting conditional distributions layer by layer. Therefore, the vine Copula function is more flexible than a single multivariate Copula and is especially suitable for modeling and analyzing high-dimensional data with heterogeneous dependence structures. Ref. [77] proposed an SG method based on vine Copula. First, K-means clustering is adopted to partition historical wind power data into multiple subclasses, thereby capturing its multimodal characteristics. Then, C-vine and D-vine models are constructed within each subclass, and optimal bivariate Copula functions (such as Frank and Gumbel) are selected to characterize the complex interdependencies between wind power outputs. Subsequently, scenario samples are generated separately based on these two structures, and their consistency with the original data distribution is evaluated to identify the optimal scenarios. Finally, the optimal scenarios from all subclasses are integrated to form a comprehensive scenario set that reflects the spatiotemporal interdependencies of wind power, which can support subsequent system optimization analysis. Refs. [78,79] also used the vine Copula-based method for SG, which allows flexible regulation of the correlation structure between different variables and markedly enhances the accuracy of wind energy correlation modeling.

The Pair-Copula construction method is a core method for constructing high-dimensional joint distributions. The core concept of Pair-Copula is to recursively decompose complex multivariate dependence structures into a combination of a series of paired bivariate Copulas and univariate marginal distributions through probability decomposition. By sequentially conditioning on variables, Pair-Copula converts the multivariate distribution modeling problem into multiple flexibly selected bivariate Copula modeling problems, thereby enabling the fine characterization of complex nonlinear and tail dependencies between variables. Ref. [80] used the Pair-Copula construction to capture correlations in SG, simulating time-coupled wind power input scenarios aggregated over wind farms and effectively characterizing the complex correlations between wind farm scenarios. First, the error marginal distribution at each prediction time point is estimated based on quantile regression. Subsequently, the D-vine Copula structure is employed to decompose the multi-dimensional temporal dependence into a cascade of conditional bivariate Copulas, with the optimal Copula function flexibly selected to characterize the asymmetric and nonlinear dependence across different time points. Ultimately, error scenarios with temporal autocorrelation are generated via this structure and superimposed onto the point prediction values, yielding wind power output-coupled scenarios that reflect the spatiotemporal propagation characteristics of errors. Ref. [81] also used the Pair-Copula model for wind turbine fault prediction.

3.5. Brief Summary

Each of these sampling methods has its own characteristics and applicable conditions, so the selection of an appropriate method depends on the specific problem. Table 1 summarizes the advantages and disadvantages of the sampling methods discussed above.

4. Model-Based Methods

Model-based methods are data-driven at their core. No preset probability distributions are required for these data-driven models, and complex features in data can be automatically uncovered through neural networks. These approaches are well-suited for generating renewable power output scenarios characterized by high dimensionality, strong nonlinearity, and significant spatiotemporal correlations.

Before elaborating on specific model-based SG architectures, it is critical to highlight a core statistical validation principle for all data-driven time-series generative models discussed in this chapter: the rigorous application of appropriate temporal data splitting techniques, which is the fundamental prerequisite to ensure the reliability of published results and avoid over-optimistic performance estimates caused by data leakage.

Unlike independent tabular data, renewable power output time series have inherent temporal continuity and strong autocorrelation characteristics. The widely used random k-fold cross-validation, which randomly shuffles and splits the dataset without preserving temporal order, will cause severe data leakage in time-series generative tasks. It allows the model to access future data during the training phase, leading to inflated performance metrics and over-fitted models that fail to generalize to real unseen future scenarios. For renewable power SG, this leakage will directly result in generated scenarios overestimating the predictability of real-world output fluctuations, and ultimately lead to unreliable, even risky decision-making for downstream power system optimization and scheduling. To address this issue, temporal splitting techniques, including blocked cross-validation and rolling-origin cross-validation, are the only statistically valid approaches for time-series generative model validation. These methods strictly preserve the chronological order of the dataset, ensuring that all training data are chronologically prior to the validation/test data, thus eliminating data leakage fundamentally and providing an unbiased evaluation of the model’s actual generalization performance in real-world grid operations.
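Rolling-origin splitting can be illustrated with a short generator that yields index ranges in which every training index precedes every test index. The expanding training window and the parameter names are assumptions chosen for illustration; a fixed-length (sliding) window is an equally common variant.

```python
def rolling_origin_splits(n_samples, initial_train, horizon, step=None):
    """Yield (train_indices, test_indices) with training data strictly earlier."""
    step = step or horizon
    origin = initial_train
    while origin + horizon <= n_samples:
        # Expanding window: train on everything before the forecast origin.
        yield range(0, origin), range(origin, origin + horizon)
        origin += step


splits = list(rolling_origin_splits(n_samples=10, initial_train=4, horizon=2))
for train_idx, test_idx in splits:
    print(list(train_idx), "->", list(test_idx))
```

In contrast to shuffled k-fold splitting, no split produced here places a later observation in the training set of an earlier test block.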

4.1. Artificial Neural Networks (ANNs)

The artificial neural network (ANN) is a machine learning model inspired by the structure and information transmission mechanism of neuronal networks in the biological brain. The core objective of the ANN is to model complex nonlinear data and conduct feature learning and pattern recognition by simulating the connections and signal transmission between neurons. It is made up of multiple layers of neurons that are interconnected. As shown in Figure 6, the typical structure includes an input layer, several hidden layers and an output layer. The input layer is responsible for receiving raw data, such as time-series features like wind speed, light intensity and temperature in wind–solar scenario generation. The hidden layers serve as the core part for feature extraction and transformation. The design of the number of layers and nodes directly affects the fitting ability of the model. The output layer outputs prediction results according to task requirements, such as wind–solar power values and scenario categories. Each neuron receives the output signals from the nodes of the previous layer. It first performs weighted summation and adds a bias term. Then it conducts nonlinear transformation via an activation function. Finally, it maps the results of linear combination to a specific range, thus realizing the modeling of nonlinear relationships.
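The neuron computation described above (weighted summation, bias, nonlinear activation) can be written out for a very small network. The layer sizes, weights and the three normalized input features (wind speed, irradiance, temperature) are illustrative assumptions; in practice the weights are learned from data.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense_layer(inputs, weights, biases, activation):
    """Each neuron: weighted sum of inputs, plus bias, then activation."""
    return [activation(sum(w * x for w, x in zip(w_row, inputs)) + b)
            for w_row, b in zip(weights, biases)]


def forward(features, layers):
    """Propagate a feature vector through (weights, biases, activation) layers."""
    signal = features
    for weights, biases, activation in layers:
        signal = dense_layer(signal, weights, biases, activation)
    return signal


# One hidden layer (2 tanh neurons) and one sigmoid output neuron.
layers = [
    ([[0.8, 0.1, -0.2], [-0.3, 0.9, 0.05]], [0.0, -0.1], math.tanh),
    ([[1.2, 0.7]], [-0.4], sigmoid),
]
features = [0.6, 0.3, 0.5]  # hypothetical normalized inputs
power = forward(features, layers)[0]  # normalized power in (0, 1)
```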

When generating renewable power scenarios using the ANN, no preset probability distribution is required. By learning the complex relationships between meteorological parameters and power output, ANNs can produce temporal or joint scenarios that are both realistic and diverse. This approach is particularly well-suited for modeling high-dimensional, strongly coupled uncertainties in wind and PV power generation.

First, an ANN model is trained using data that include historical wind/PV output, meteorological factors (wind speed and irradiance), and temporal features. This enables the model to learn the complex nonlinear mapping from input features to output values. Subsequently, in the generation phase, a new feature vector is fed into the trained network. The network can directly output a deterministic power sequence (deterministic generation). Alternatively, it can be embedded as a core component into more complex probabilistic frameworks like GANs or VAEs. By leveraging its strong nonlinear transformation capability, the ANN then assists in generating a large number of probabilistic scenarios that conform to the data distribution. Ref. [82] employed an ANN to generate load, PV and wind power scenarios. Ref. [83] used an ANN to model the stochastic process of wind turbine output, thereby forecasting multiple future scenarios.

4.2. Long Short-Term Memory Networks (LSTM)

Long short-term memory (LSTM) is a specialized improved variant of the recurrent neural network (RNN), which is specifically designed to mitigate the issues of gradient vanishing or gradient explosion that commonly arise when traditional RNNs process long time-series sequences. The core design of LSTM incorporates memory cells and a gating mechanism, enabling the efficient capture and persistent storage of long-distance temporal dependencies. The network structure of LSTM is shown in Figure 7. As a variant of recurrent neural networks, LSTM still preserves the core feature of feedback-connected architecture: The output at the current moment is determined not only by the current input, but also by the hidden state of the preceding moment. However, unlike traditional RNNs, LSTM precisely regulates information transmission and retention through three collaborative gating units: the forget gate, the input gate, and the output gate. The forget gate filters and discards irrelevant or outdated historical information within the memory cells, preventing the model from being perturbed by redundant data. The input gate controls the selective integration of new time-series feature information into the memory cells, thus completing the update and storage of valid data. The output gate generates the hidden state at the current time step and produces the corresponding output based on the current status of the memory cells and the input information, laying the foundation for computations in the subsequent time step. This distinctive gating mechanism allows LSTM to not only retain key temporal information that is critical for subsequent predictions over extended time horizons but also filter out invalid information promptly, thereby significantly enhancing the model’s capability for modeling long time-series sequences.
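The interplay of the three gates can be made concrete with a single LSTM time step. The sketch below uses scalar input, hidden state and cell state for readability and follows the standard LSTM update equations; the parameter layout and the numerical values are illustrative assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step with scalar input, hidden state and cell state."""
    def gate(name, squash):
        w_x, w_h, b = params[name]
        return squash(w_x * x + w_h * h_prev + b)

    f = gate("forget", sigmoid)        # how much of the old cell state to keep
    i = gate("input", sigmoid)         # how much new information to admit
    g = gate("candidate", math.tanh)   # candidate content for the cell state
    o = gate("output", sigmoid)        # how much of the cell state to expose
    c = f * c_prev + i * g             # updated memory cell
    h = o * math.tanh(c)               # hidden state passed to the next step
    return h, c


# Hypothetical (w_x, w_h, b) triples; real values come from training.
params = {
    "forget": (0.5, 0.1, 1.0),
    "input": (0.6, -0.2, 0.0),
    "candidate": (0.9, 0.3, 0.0),
    "output": (0.4, 0.2, 0.0),
}
h, c = 0.0, 0.0
hidden_states = []
for x in [0.2, 0.5, 0.9, 0.7, 0.3]:  # e.g. a normalized wind-speed sequence
    h, c = lstm_step(x, h, c, params)
    hidden_states.append(h)
```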

In the field of renewable power output SG, LSTM demonstrates remarkable applicability and advantages and can effectively capture the nonlinear correlations and periodic fluctuation characteristics of key factors (such as wind speed, light intensity, and ambient temperature) that vary with time. Through learning from historical time-series monitoring data, LSTM accurately fits the dynamic variation patterns of wind–solar power output, thus generating wind–solar power scenarios that conform to actual distribution characteristics. Therefore, LSTM can provide reliable data support for uncertainty analysis, dispatching planning, and risk assessment of power systems. First, the long-term dependencies between historical output sequences and meteorological data are learned via the gating mechanism and cell state, and the model is subsequently trained to minimize prediction errors. Subsequently, the trained LSTM network can be used in an autoregressive or sequence-to-sequence manner. By inputting new temporal features, LSTM can directly generate deterministic output sequences. Alternatively, LSTM can be embedded as a generator within a probabilistic framework (such as combined with VAE or GAN). Through multiple sampling, LSTM can then produce a large number of probabilistic scenarios that conform to both temporal dynamics and statistical distributions. Ref. [84] proposed a class-driven method based on LSTM for electricity price SG and reduction. Ref. [85] introduced a method based on an LSTM autoencoder for creating typical representative scenarios in an integrated hydro-photovoltaic power generation system.

4.3. Autoregressive Moving Average (ARMA)

The autoregressive moving average model (ARMA) is a classic linear time-series modeling method in the field of time-series analysis. ARMA combines two core components, the autoregressive (AR) component and the moving average (MA) component, and is generally denoted as ARMA (p, q), where p represents the order of the autoregressive component, and q denotes the order of the moving average component. The core assumption of ARMA is that the time-series data satisfy the stationarity condition, which means the mean, variance, and autocovariance of the series remain constant over time. The mathematical expression is presented as

X_{t} = c + \sum_{i = 1}^{p} ϕ_{i} X_{t - i} + ε_{t} + \sum_{j = 1}^{q} θ_{j} ε_{t - j}

(11)

where

X_{t}

is the value of the time series at time t; c is a constant term;

ϕ_{i}

is the autoregressive (AR) coefficient, describing the linear relationship between the current value and the past p values;

θ_{j}

is the moving average (MA) coefficient, describing the linear relationship between the current value and the past q random shocks; and

ε_{t}

is a white noise sequence with zero mean and constant variance. From the perspective of model mechanism, the AR (p) component captures the autocorrelation characteristics of time-series data via linear weighting of the series’ own historical observations from the previous p periods, thereby reflecting the impact of the past variation trends on its current values. In contrast, the MA (q) component characterizes the short-term fluctuation characteristics of the series that cannot be explained by historical observations through the linear combination of random disturbance terms from the previous q periods. The combination of these two components enables effective fitting of the linear dynamic variation patterns of stationary time-series. In practical modeling, four key steps need to be completed in sequence: stationarity test (such as ADF test), order determination, parameter estimation, and model diagnosis. These steps ensure the rationality and effectiveness of the established model.

In renewable power output SG, ARMA is primarily used to simulate and generate sequences of wind speed, solar irradiance, or power output that exhibit linear temporal correlations. First, stationarity testing and any necessary preprocessing are performed on the historical wind speed or PV output time series. Next, the model orders (p, q) are identified using the autocorrelation function (ACF) and partial autocorrelation function (PACF). The autoregressive and moving average coefficients are then estimated, for example by maximum likelihood estimation. Subsequently, model diagnostics are conducted to ensure that the model adequately captures the linear dependency structure of the series. Finally, the fitted model is applied: using the historical series as initial conditions and sampling the random noise ε_{t}, the model recursively generates probabilistic scenario sets for future periods that exhibit the corresponding linear temporal correlations; setting the noise to zero instead yields a single deterministic output sequence.
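The recursive generation step can be sketched as follows. This is a minimal illustration of Equation (11), not the implementation of any cited reference: the function name `simulate_arma_scenarios`, the coefficient values, and the choice to set unknown past shocks to zero are all assumptions made for the example, and the coefficients are taken as already estimated.

```python
import numpy as np

def simulate_arma_scenarios(phi, theta, c, sigma, history, horizon,
                            n_scenarios, seed=0):
    """Recursively generate ARMA(p, q) scenarios following Equation (11).

    phi, theta : AR and MA coefficients (assumed already estimated)
    history    : past observations used as initial conditions (len >= p)
    """
    rng = np.random.default_rng(seed)
    phi = np.asarray(phi, dtype=float)
    theta = np.asarray(theta, dtype=float)
    history = np.asarray(history, dtype=float)
    p, q = len(phi), len(theta)
    scenarios = np.empty((n_scenarios, horizon))
    for s in range(n_scenarios):
        x = list(history[-p:]) if p > 0 else []
        eps = [0.0] * q  # assumption: unobserved past shocks are set to zero
        for t in range(horizon):
            e_t = rng.normal(0.0, sigma)                        # white noise
            ar = sum(phi[i] * x[-1 - i] for i in range(p))      # AR part
            ma = sum(theta[j] * eps[-1 - j] for j in range(q))  # MA part
            x_t = c + ar + e_t + ma
            x.append(x_t)
            eps.append(e_t)
            scenarios[s, t] = x_t
    return scenarios

# Hypothetical coefficients for an hourly wind speed series (illustrative only).
scenarios = simulate_arma_scenarios(phi=[0.6, 0.2], theta=[0.3], c=1.2,
                                    sigma=0.4, history=[6.1, 5.8],
                                    horizon=24, n_scenarios=100)
```

Each row of `scenarios` is one 24-step trajectory; calling the function with `sigma=0.0` reproduces the deterministic recursion.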

The ARMA model is concise and computationally efficient, making it a classical tool for generating wind and PV temporal scenarios when dealing with linear stationary sequences. Ref. [86] applied the ARMA time-series approach to replicate hourly wind speed data, establishing a reliability model for wind energy conversion systems. Ref. [87] adopted the ARMA time-series method to simulate hourly wind speeds at different locations, aiming to study the impact of a large number of intermittent energy sources on system reliability. First, wind speed data are generated with the ARMA model, which captures the temporal autocorrelation and dynamic variation characteristics of wind speed sequences. Then, the generated wind speed data are converted into time-series wind power output through the wind turbine power curve, which maps meteorological parameters to energy output. Furthermore, through sequential Monte Carlo simulation and the system health analysis framework, the study evaluates the impact of large-scale wind power grid connection on system reliability from two dimensions: power generation adequacy and security. This integrated approach combines the advantages of ARMA in temporal sequence simulation and of Monte Carlo simulation in uncertainty analysis, providing comprehensive technical support for the reliability assessment of power systems with a high proportion of intermittent energy. Ref. [88] proposed a bivariate autoregressive moving average–generalized autoregressive conditional heteroscedasticity (ARMA-GARCH) model, which jointly models wave height and wave period. This model adopts the vector autoregressive moving average (VARMA) structure to describe the dynamic relationships between variables, and combines it with multivariate GARCH (MGARCH) to capture the temporal variations in their conditional variances and covariances.
During fitting, a diagonal simplified form (MGARCH-DG) with upper triangular constraints is introduced to reduce the number of parameters and ensure positive definiteness. Eventually, the model achieves high-precision density and point prediction of wave energy flux within the 1 to 24 h prediction horizon, fully leveraging the advantages of VARMA in inter-variable dynamic modeling and MGARCH in heteroscedasticity capture.

However, ARMA is not suitable for non-stationary and non-Gaussian stochastic processes. To address the non-stationarity issue, the autoregressive integrated moving average (ARIMA) model was proposed. As a classic time-series prediction method, ARIMA incorporates three components (autoregressive, differencing, and moving average) for the modeling and prediction of non-stationary time series. It is determined by three parameters, namely the autoregressive order, the differencing order, and the moving average order, and is established through order determination and parameter estimation. ARIMA first renders a non-stationary sequence stationary via differencing, and subsequently captures the dynamic characteristics of the differenced sequence through linear combinations of historical observations and historical prediction errors. Ref. [89] proposed a stochastic wind power time-series model based on ARIMA, and constructed a limited ARIMA (LARIMA) model by introducing a limiter to describe the non-stationarity and physical lower bounds of wind power. LARIMA describes stochastic wind power generation by means of average magnitude, temporal correlation, and driving noise. It outperforms traditional discrete Markov models in terms of probability distribution fitting, autocorrelation fitting, and partial autocorrelation fitting. Moreover, LARIMA can capture seasonal variations by adjusting parameters, providing an effective tool for time-series wind power simulation in power system reliability assessment.
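The differencing idea can be illustrated with a short sketch. The helper names (`difference`, `integrate_once`, `limit`) and the numerical values are hypothetical, and the clipping step is only a simple stand-in for a limiter; it is not the LARIMA formulation of Ref. [89].

```python
import numpy as np

def difference(series, d=1):
    """Apply d-th order differencing to remove trends (the 'I' in ARIMA)."""
    out = np.asarray(series, dtype=float)
    for _ in range(d):
        out = np.diff(out)
    return out

def integrate_once(diffs, last_value):
    """Invert first-order differencing, starting from the last known level."""
    return last_value + np.cumsum(diffs)

def limit(series, lower, upper):
    """Clip values to physical bounds (illustrative limiter, an assumption)."""
    return np.clip(series, lower, upper)

# Hypothetical wind power levels in MW with an upward trend.
power = np.array([10.0, 12.0, 15.0, 19.0])
steps = difference(power)                     # stationary increments
rebuilt = integrate_once(steps, power[0])     # recovers power[1:]
bounded = limit(rebuilt, lower=0.0, upper=18.0)
```

In an ARIMA workflow, an ARMA model would be fitted to `steps` and its simulated increments passed through `integrate_once` to return to the original scale.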

4.4. Generative Adversarial Networks (GANs)

The generative adversarial network (GAN) is a highly representative deep generative model proposed by Ian Goodfellow et al. The core idea of the GAN is derived from the zero-sum game in game theory. The GAN learns the real data distribution and generates new samples by constructing two deep neural networks that compete and coevolve with each other: the generator (G) and the discriminator (D). The network structure diagram is shown in Figure 8. The generator is a deep neural network that takes random noise (usually drawn from a uniform or normal distribution) as input. Its core objective is to map this random noise, through multi-layer nonlinear transformation, into fake samples that closely follow the real data distribution. The discriminator is also a deep neural network, whose function is to perform binary classification on the input samples: it distinguishes whether a sample is a “real sample” from the real dataset or a “fake sample” generated by the generator, and outputs the probability that the sample is real. During the model training phase, the generator and the discriminator are trained via an alternating iterative optimization approach. The first step is to fix the generator parameters and train the discriminator, updating the discriminator parameters through gradient-based optimization so that it distinguishes real samples from fake samples more accurately. The second step is to fix the discriminator parameters and train the generator, adjusting the generator parameters through backpropagation so that its fake samples confuse the discriminator as much as possible and the discriminator outputs a probability close to 0.5. This cyclic adversarial process of generation, discrimination, and regeneration ideally converges to a Nash equilibrium.
Through this dynamic adversarial process, GAN can capture complex nonlinear spatiotemporal patterns in wind and PV data, thereby generating highly realistic and diverse output scenarios. The adversarial loss function of GAN is as follows:

\min_{G} \max_{D} V (D, G) = E_{x \sim p_{data} (x)} [\log D (x)] + E_{z \sim p_{z} (z)} [\log (1 - D (G (z)))]

(12)

where G is the generator, D is the discriminator, x represents real scenarios, and z denotes random noise drawn from a prior distribution.
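As a numerical illustration of Equation (12), the value function can be estimated from discriminator outputs on a batch of real and generated samples. The function name `gan_value` and the probability values below are assumptions for the example; no networks are trained here.

```python
import numpy as np

def gan_value(d_real, d_fake, eps=1e-12):
    """Monte Carlo estimate of V(D, G) in Equation (12).

    d_real : discriminator outputs D(x) on real scenarios
    d_fake : discriminator outputs D(G(z)) on generated scenarios
    """
    d_real = np.clip(np.asarray(d_real, dtype=float), eps, 1.0 - eps)
    d_fake = np.clip(np.asarray(d_fake, dtype=float), eps, 1.0 - eps)
    return np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake))

# A discriminator that cannot tell real from fake outputs 0.5 everywhere,
# which gives V = 2 * log(0.5) = -log(4).
v_equilibrium = gan_value([0.5, 0.5], [0.5, 0.5])

# A sharper discriminator (hypothetical outputs) attains a larger value.
v_sharp = gan_value([0.9, 0.8], [0.1, 0.2])
```

The discriminator update pushes this quantity up, whereas the generator update pushes the second term down by making `d_fake` larger.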

In the field of renewable power SG, the GAN exhibits distinctive technical advantages by capturing the nonlinear correlations and the periodic fluctuation characteristics of key factors, as well as the power output mutation patterns under extreme weather conditions. Furthermore, the GAN generates high-dimensional, diversified wind–solar power output scenarios that are consistent with the distribution characteristics of real data, thereby providing data support for uncertainty analysis, dispatching planning, and risk assessment of power systems. In the process of renewable energy output SG, historical wind–solar power output sequences are first prepared and standardized as the training dataset. Then, two deep neural networks (the generator and the discriminator) are constructed: the generator is responsible for mapping random noise to synthetic scenarios, while the discriminator learns to distinguish between real scenarios and generated ones. Next, adversarial training is performed by alternately and iteratively optimizing the two networks: the discriminator is trained to maximize its classification accuracy, while the generator is trained to minimize the probability that its generated scenarios are identified as fake. Once the training process converges to the Nash equilibrium, the trained generator can be deployed. By inputting different random noise vectors, the generator can batch-generate new wind–solar power output scenarios that are similar to historical data in terms of statistical characteristics and spatiotemporal patterns. Finally, inverse standardization and multi-dimensional evaluation (such as statistical similarity, diversity, and downstream task performance) are conducted on the generated scenarios.

GANs do not rely on any prior probability distribution assumptions. Instead, they can learn and replicate complex joint distributions directly from historical sequences in a data-driven manner. The GAN is particularly adept at capturing nonlinear dependencies, spatiotemporal correlations, and extreme fluctuation patterns (such as ramp events) in wind and PV output, which are hard to simulate with conventional linear approaches. Consequently, the GAN can generate high-quality scenarios with realistic statistical properties and strong visual diversity. Ref. [90] proposed a GAN-based method for wind power SG. Ref. [91] used the GAN to capture the complex spatial and temporal correlations of wind energy. The generator network learns the high-dimensional distribution of real wind energy data from latent noise vectors, and the discriminator network continuously optimizes the authenticity of generated data. Without assuming a specific distribution form, the GAN can implicitly model the multi-dimensional spatiotemporal dependence structure of wind energy, generating synthetic scenarios with spatiotemporal correlations similar to real wind energy. The generated data present correlation patterns similar to real wind farms in space and maintain reasonable autocorrelation characteristics in time, thereby effectively supporting the construction and solution of distributionally robust joint chance-constrained economic dispatch models. Ref. [92] also applied the GAN for renewable energy scenario generation.

However, GAN methods also have several notable limitations. The training process is unstable, making it difficult to reach equilibrium between the generator and the discriminator. This often leads to issues such as vanishing gradients and mode collapse, in which the generator produces repetitive scenarios. Therefore, numerous variants have been derived from the GAN to enable conditional generation or improve training stability.

The Fed-LSGAN model is a framework combining federated learning and deep generative models. The core of Fed-LSGAN lies in training the GAN in a distributed environment that protects data privacy. The framework is based on the least squares generative adversarial network (LSGAN). By replacing the cross-entropy loss of the classic GAN with least squares loss, Fed-LSGAN improves training stability and generation quality. In Fed-LSGAN, multiple clients train generators and discriminators locally using private data, and only upload model parameters to a central server for secure aggregation without sharing raw data. The distributed training mechanism effectively solves the problems of data silos and privacy leakage, which is particularly critical for renewable energy SG involving multi-party data (such as data from multiple wind farms or photovoltaic power plants) that are not suitable for direct sharing. Ref. [93] proposed Fed-LSGAN, which integrates federated learning with LSGAN for renewable energy SG.
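The two ingredients described above, the least squares loss and server-side parameter aggregation, can be sketched as follows. The target labels (1 for real, 0 for fake) and the sample-count weighting are common conventions adopted here as assumptions; the exact loss coding and aggregation rule of Ref. [93] may differ, and all function names are hypothetical.

```python
import numpy as np

def lsgan_discriminator_loss(d_real, d_fake):
    """Least squares loss: push D(x) toward 1 and D(G(z)) toward 0."""
    d_real = np.asarray(d_real, dtype=float)
    d_fake = np.asarray(d_fake, dtype=float)
    return 0.5 * np.mean((d_real - 1.0) ** 2) + 0.5 * np.mean(d_fake ** 2)

def lsgan_generator_loss(d_fake):
    """Least squares loss: push D(G(z)) toward 1."""
    d_fake = np.asarray(d_fake, dtype=float)
    return 0.5 * np.mean((d_fake - 1.0) ** 2)

def federated_average(client_params, client_sizes):
    """Aggregate client parameters, weighted by local sample counts.

    client_params : list of dicts mapping parameter name -> ndarray
    client_sizes  : number of local training samples per client
    """
    weights = np.asarray(client_sizes, dtype=float) / np.sum(client_sizes)
    return {
        name: sum(w * params[name] for w, params in zip(weights, client_params))
        for name in client_params[0]
    }

# Two hypothetical wind farms upload only parameters, never raw data.
farm_a = {"w": np.array([1.0, 2.0])}
farm_b = {"w": np.array([3.0, 4.0])}
global_params = federated_average([farm_a, farm_b], client_sizes=[100, 300])
```

Only the dictionaries of parameters cross the network in this sketch, which mirrors the privacy argument made in the text.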

The conditional generative adversarial network (CGAN) model is an important extension of the GAN. The CGAN introduces additional conditional information into the inputs of both the generator and the discriminator. By inputting conditional information together with random noise into the generator, and requiring the discriminator to simultaneously judge the authenticity of the data and the degree of matching with the conditions, the CGAN enables targeted and controllable data generation. This mechanism addresses the uncontrollable problem of the generation process in the original GAN, allowing the CGAN to generate samples with clear targets and distinct features according to specific conditions. For renewable energy SG, such conditional information can be meteorological factors (such as temperature, humidity, wind direction), time periods, or operation modes of power systems. This controllability makes the CGAN particularly suitable for generating scenario samples that meet specific operation requirements, providing more targeted data support for power system optimization, scheduling, and reliability assessment. Ref. [94] generates PV output scenarios by extracting features from historical data using an improved CGAN. By introducing conditional terms (such as day-ahead prediction data) into the inputs of both the generator and the discriminator, the generator can generate PV output scenarios with corresponding spatiotemporal correlations for specific meteorological conditions, and can effectively capture the complex implicit features of PV output affected by factors like light and temperature under extreme weather conditions. Consequently, a high-quality uncertainty scenario input is provided for subsequent multi-stage recovery strategies. Ref. [95] is among the first to explore the CGAN for generating sufficient PV scenarios to support PV power plant planning.

The Wasserstein generative adversarial network (WGAN) is an important improved model proposed to address the training instability and mode collapse problems of the classic GAN. The core innovation of the WGAN lies in replacing the Jensen–Shannon (JS) divergence in the original GAN with the Wasserstein distance to measure the difference between the generated distribution and the real distribution. The Wasserstein distance can provide smooth and meaningful gradients even between non-overlapping distributions, which significantly stabilizes the training process and makes the generation quality and diversity more controllable. The conditional WGAN further introduces conditional constraints on the basis of the original WGAN, integrating specific scenario conditions (such as meteorological constraints and power system operation boundaries) into the training process of the generator and discriminator. This enables the conditional WGAN to generate targeted and high-quality renewable energy scenarios while maintaining training stability, which compensates for the defects of the classic GAN in practical applications and has led to its wide use in complex scenario generation tasks. Ref. [96] proposed a multi-wind farm wind power SG framework based on the conditionally improved WGAN. First, unsupervised classification of wind power prediction errors is performed through agglomerative clustering, and a support vector classifier is used to establish the mapping relationship between wind power point prediction and error categories. Then, a conditional WGAN-GP is constructed, with category labels introduced as conditional information into the inputs of the generator and discriminator; weight clipping is replaced with a gradient penalty term to enforce the Lipschitz constraint, thereby stabilizing the adversarial training process and improving generation quality.
Finally, the generated scenarios can not only fit the marginal distribution characteristics of each category but also effectively capture the spatiotemporal correlations among wind farms, providing high-quality and diverse uncertainty scenario inputs for stochastic optimization problems involving wind power. Ref. [97] introduced the Wasserstein distance with gradient penalty (WGAN-GP) for renewable energy SG, optimizing the stability of model training. Ref. [98] established an extreme SG model based on WGAN with gradient penalty and a variable learning rate. Ref. [99] also used WGAN for stochastic wind power output scenario generation.
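For reference, the discriminator (critic) objective commonly associated with WGAN-GP can be written as below. This is the standard form from the WGAN-GP literature and not necessarily the exact loss used in Refs. [96,97,98]; the symbols p_{g} (distribution of generated scenarios), \hat{x} (random interpolate between a real and a generated scenario), and λ (penalty weight) are introduced here for illustration.

```latex
L_{D} = E_{\tilde{x} \sim p_{g}} [D(\tilde{x})] - E_{x \sim p_{data}} [D(x)]
      + \lambda \, E_{\hat{x}} \left[ \left( \| \nabla_{\hat{x}} D(\hat{x}) \|_{2} - 1 \right)^{2} \right],
\qquad \hat{x} = \epsilon x + (1 - \epsilon) \tilde{x}, \quad \epsilon \sim U[0, 1]
```

The first two terms estimate the Wasserstein distance, and the third term penalizes deviations of the discriminator's gradient norm from 1, which is how the Lipschitz constraint is enforced without weight clipping.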

The spectral normalization generative adversarial network (SNGAN) is an efficient model that stabilizes GAN training through spectral norm normalization technology. The core operation of SNGAN is to impose spectral norm constraints on the weight matrix of each layer in the discriminator, thereby controlling the Lipschitz constant of the discriminator. The constraint mechanism can effectively prevent gradient explosion or disappearance during the training process, which is a key problem affecting the stability of the classic GAN. By regulating the Lipschitz constant, the SNGAN not only enhances the stability of the training process but also improves the quality and diversity of generated samples. For renewable energy SG, the SNGAN is particularly suitable for tasks requiring high sample consistency and stability, such as large-scale wind–solar-storage integrated system scenario modeling, providing a reliable deep learning tool for complex uncertainty modeling. Ref. [100] proposed a PV power plant SG method based on the SNGAN, which improved the stability and convergence of the GAN model. The SNGAN imposes spectral normalization on the parameters of the discriminator and introduces the 1-Lipschitz constraint to enhance training stability. Both the generator and the discriminator adopt convolutional and deconvolutional layers, combine batch normalization and spectral normalization technologies, and use the Adam optimizer for training. Then, the generator generates PV power generation scenarios consistent with the probability distribution of real data, featuring similar trends but distinct specific samples. The rationality and diversity of the generated scenarios are verified through indicators such as MAE, RMSE and cumulative distribution function, providing an effective tool for the uncertainty analysis of PV power generation and grid planning.
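The spectral norm constraint on a single weight matrix can be sketched with power iteration. This is a generic illustration, not the implementation of Ref. [100]; the function name and the iteration count are assumptions, and practical implementations typically run a single iteration per training step and reuse the vector `u` between steps.

```python
import numpy as np

def spectral_normalize(weight, n_iter=50, seed=0):
    """Divide a weight matrix by its largest singular value.

    The largest singular value is estimated by power iteration, so the
    normalized matrix has spectral norm close to 1.
    """
    rng = np.random.default_rng(seed)
    w = np.asarray(weight, dtype=float)
    u = rng.normal(size=w.shape[0])
    for _ in range(n_iter):
        v = w.T @ u
        v /= np.linalg.norm(v) + 1e-12
        u = w @ v
        u /= np.linalg.norm(u) + 1e-12
    sigma = float(u @ w @ v)  # estimate of the spectral norm
    return w / sigma, sigma

# Hypothetical discriminator layer weights.
w = np.array([[3.0, 0.0],
              [0.0, 1.0]])
w_sn, sigma = spectral_normalize(w)
```

Applying this to every discriminator layer bounds the Lipschitz constant of each linear map by approximately 1, which is the mechanism the paragraph above describes.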

The sequence generative adversarial network (SeqGAN) is a GAN designed specifically for discrete sequence generation tasks (such as text and music). The core innovation of the SeqGAN is introducing the policy gradient technique in reinforcement learning to solve the gradient non-differentiable problem of the traditional GAN caused by discrete outputs. In the SeqGAN, the generator is regarded as an agent in reinforcement learning, where the sequences it generates serve as actions, and the discriminator provides reward signals. The generator samples the un-generated parts of the sequence through Monte Carlo search, and updates parameters via policy gradient based on the discriminator’s evaluation of the complete sequence, thereby realizing the step-by-step optimized generation of long sequences. For renewable energy SG, the SeqGAN is highly suitable for generating discrete time-series sequences (such as wind and PV output), as it can effectively handle the discrete characteristics of sequence data and optimize the generation quality of long-term time series, providing a new technical path for modeling the temporal dynamics of intermittent energy. Ref. [101] proposed a wind power SG method based on the SeqGAN. By combining the GAN, LSTM and reinforcement learning, a distribution-free time-series generation model is constructed. The generator adopts an LSTM structure to capture the temporal dynamic characteristics of wind power data, while the discriminator is used to evaluate the authenticity of the generated sequences. To balance the local fitting of each time point and the overall sequence quality, a reinforcement learning mechanism is introduced, and future rewards are estimated through Monte Carlo search to guide the optimization of the generation process. 
The SeqGAN does not require manual feature selection, avoids overfitting and pattern misjudgment problems of traditional supervised learning, and can generate wind power or prediction error scenarios with reasonable temporal correlation, diversity and extreme event coverage.

The progressive growing generative adversarial network (PgGAN) is an efficient GAN variant designed to generate high-resolution, high-quality images through a progressive layered training strategy. The core idea of PgGAN is to start training the generator and discriminator from low-resolution images, then gradually add network layers to improve resolution, and smoothly integrate higher-resolution details at each new stage. The progressive approach allows the PgGAN model to first learn the overall structure of the image, then optimize local features layer by layer, which significantly stabilizes the training process of high-resolution images and reduces the phenomenon of mode collapse. In renewable energy SG, the PgGAN can be applied to the high-precision visualization of spatiotemporal distribution scenarios (such as wind farm power output spatial distribution maps and PV array operation state diagrams), generating high-resolution scenario images that retain both overall distribution laws and local detail characteristics, providing intuitive and accurate data support for power system visual analysis and decision-making. Ref. [102] proposed a wind power scenario prediction method based on the PgGAN. The PgGAN gradually learns the temporal dynamic characteristics of wind power data from low resolution to high resolution through a progressive training strategy. During the training process, the PgGAN gradually increases the number of network layers and resolution, enabling the generator to progressively capture complex temporal patterns of wind power (such as randomness, volatility and intermittency) from overall trends to local details. In addition, the PgGAN designs a composite scenario structure including consecutive days of wind power and corresponding point predictions, allowing the PgGAN to simultaneously learn the internal temporal dependence of wind power, the correlation of daily patterns, and their association with point predictions. 
Thus, high-quality wind power scenarios are generated that not only conform to historical statistical characteristics but also have high spatiotemporal consistency, providing more reliable stochastic inputs for day-ahead scheduling.

4.5. Variational Autoencoder (VAE)

The variational autoencoder (VAE) is a type of generative model that integrates Bayesian probabilistic inference with deep learning architectures, composed of two core neural network modules: the encoder and the decoder. The model structure diagram is shown in Figure 9. The core objective of the VAE is to learn the latent probability distribution of input data and generate new samples that conform to this distribution. Unlike the game-theoretic training mechanism of the GAN, the VAE models data from a probabilistic perspective and fits the latent space distribution by maximizing the Evidence Lower Bound (ELBO). The working mechanism of the VAE can be divided into three key steps: encoding, reparameterization, and decoding. The encoder takes real data (such as historical wind–solar power output time series in wind–solar SG) as input, maps them to a low-dimensional latent space via multi-layer nonlinear transformation, and outputs the probability distribution parameters of this latent space (typically the mean and variance of a normal distribution). Directly sampling latent vectors from this distribution is not differentiable with respect to the encoder parameters and therefore blocks gradient backpropagation, so the VAE introduces the reparameterization trick. A random noise vector is sampled from a standard normal distribution, and the latent vector is then computed as a differentiable function of this noise and of the mean and variance output by the encoder, which allows gradients to flow through the sampling step during training. The decoder takes these latent vectors as input, maps them back to the original data space, and reconstructs samples similar to the input data. During the model training phase, the optimization objective of the VAE consists of two parts.
The first is the reconstruction loss, which measures the difference between the samples reconstructed by the decoder and the original input data to ensure the authenticity of the generated samples; the second is the KL divergence loss, which constrains the latent space distribution output by the encoder to be as close as possible to a predefined prior distribution (such as standard normal distribution), ensuring the continuity and regularity of the latent space and avoiding model overfitting. Once training is completed, new samples can be generated simply by randomly sampling latent vectors from the prior distribution and feeding them into the decoder.

First, the encoder learns the posterior distribution q_{ϕ} (z | x) of the input scenario data. The reparameterization trick z = μ + σ ⊙ ε is then applied to sample the latent variable, where μ and σ are the mean and standard deviation output by the encoder and ε is noise sampled from a standard normal distribution. Next, the decoder learns the likelihood distribution p_{θ} (x | z) to reconstruct the data. The model is trained by maximizing the Evidence Lower Bound (ELBO), balancing reconstruction accuracy with the regularity of the latent space:

L (θ, ϕ; x) = E_{q_{ϕ} (z | x)} [\log p_{θ} (x | z)] - D_{KL} (q_{ϕ} (z | x) ‖ p (z))

(13)

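In this formulation, the first term is the expected reconstruction log-likelihood, whose negative is the reconstruction loss, and the second term is the KL divergence between the approximate posterior q_{ϕ} (z | x) and the prior p (z).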
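The two terms of Equation (13) can be evaluated in closed form under common modeling choices. The sketch below assumes a diagonal Gaussian posterior, a standard normal prior, and a Gaussian likelihood with fixed variance `obs_var`; these choices and all function names are assumptions for illustration, and the encoder and decoder networks are replaced by fixed arrays.

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, I)."""
    mu = np.asarray(mu, dtype=float)
    log_var = np.asarray(log_var, dtype=float)
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dims."""
    mu = np.asarray(mu, dtype=float)
    log_var = np.asarray(log_var, dtype=float)
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var, axis=-1)

def elbo(x, x_recon, mu, log_var, obs_var=1.0):
    """Single-sample ELBO estimate per data point (Equation (13))."""
    x = np.asarray(x, dtype=float)
    x_recon = np.asarray(x_recon, dtype=float)
    d = x.shape[-1]
    log_lik = (-0.5 * np.sum((x - x_recon) ** 2, axis=-1) / obs_var
               - 0.5 * d * np.log(2.0 * np.pi * obs_var))
    return log_lik - kl_to_standard_normal(mu, log_var)

rng = np.random.default_rng(0)
mu = np.array([[1.0, 0.0]])        # hypothetical encoder mean
log_var = np.array([[0.0, 0.0]])   # hypothetical encoder log-variance
z = reparameterize(mu, log_var, rng)
kl = kl_to_standard_normal(mu, log_var)
```

Because `z` is a deterministic function of `mu`, `log_var`, and the externally sampled noise, an automatic differentiation framework can propagate gradients to the encoder through this step.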

The training process of the VAE is stable and less prone to common GAN issues such as mode collapse. Furthermore, the VAE provides a continuous and structured latent space, facilitating scenario interpolation and controllable generation. These attributes make the VAE widely applicable in renewable energy SG research. For instance, Ref. [103] employed the VAE to analyze electric vehicle charging load profiles. Ref. [104] proposed an importance-weighted autoencoder method for wind power SG. By drawing multiple latent-variable samples for each input sample, a tighter variational lower bound can be achieved, thereby enhancing the model’s generative capability.

However, the VAE has an inherent limitation: the samples it generates often lack sharp details and appear blurry or over-smoothed. To address this defect, a derivative method, namely the variational autoencoder–generative adversarial network (VAE-GAN), was proposed.

As a hybrid generative model integrating the VAE and GAN, the core design of the VAE-GAN is that the structure of the decoder and the generator is shared, while both the reconstruction loss and the adversarial loss are optimized simultaneously. Specifically, the encoder maps input data to a latent space distribution, the decoder and generator reconstruct data based on sampling, and the discriminator distinguishes between real data and generated samples. The joint training mechanism enables the VAE-GAN to not only learn the structured latent representation of data through the VAE branch but also to enhance the visual authenticity and detail richness of generated samples via the GAN branch. Thus, the VAE-GAN achieves a better balance between realism and diversity in tasks (such as image generation, feature learning, and data reconstruction). For renewable energy SG, the VAE-GAN is suitable for tasks requiring both structural rationality and detail fidelity (such as PV panel fault scenario reconstruction and wind farm spatiotemporal detail simulation), making up for the shortcomings of single VAE or GAN models. Ref. [105] introduced a VAE-GAN model capable of learning diverse data distributions and generating plausible samples from the same distribution without requiring prior data analysis during training. Ref. [106] explored the integration of the VAE with the GAN applying the deep generative model to synthetic data generation for solar PV systems.

4.6. Diffusion Models

The diffusion model (DM) is a type of generative model integrating non-equilibrium thermodynamics principles with deep learning architectures. The core idea of DM is to learn the probability distribution of real data and generate high-quality new samples by simulating a Markov chain process of “gradual noising—gradual denoising”. Compared with the game-theoretic training mechanism of the GAN and the probabilistic inference framework of the VAE, diffusion models have emerged as a research hotspot in the field of deep generative models in recent years, thanks to their advantages of stable training, resistance to mode collapse, and high diversity and fidelity of generated samples. Figure 10 presents the network structure diagram of the diffusion model. The workflow of a diffusion model can be divided into two core stages: the forward diffusion process and the reverse denoising process. The forward diffusion process is an artificially defined, computable procedure. Gaussian noise is gradually added to real data within a limited number of steps using a fixed noise scheduling strategy. The forward diffusion process ultimately transforms the data into a completely random noise distribution, and the noising operation at each step satisfies the Markov property, that is, the distribution of the current data is only related to the state of the previous step. The reverse denoising process, by contrast, is a learnable inverse procedure. A deep neural network (typically a convolution- or transformer-based U-Net architecture) is constructed as a denoiser. Taking the noisy data at a certain noising step and the corresponding step information as input, the network learns to predict the noise added at that step or to directly restore the low-noise data from the previous step. The optimization objective of the diffusion model is usually based on the Evidence Lower Bound (ELBO) of variational inference or the simplified denoising score matching criterion. 
By minimizing the difference between the noise predicted by the model and the actual added noise, the denoiser gradually grasps the rules of restoring real data from noise. Once model training is completed, the process of generating new samples starts from random noise. Using the well-trained denoiser, denoising operations are performed step by step in the reverse order of the forward diffusion process. After a preset number of steps, new samples highly congruent with the distribution of real data can be obtained.

In renewable energy output SG, diffusion models progressively transform a simple noise distribution into high-dimensional, complex scenarios that conform to the statistical properties and spatiotemporal patterns of historical wind and PV output. Compared with the GAN, they offer high generation quality, good diversity, and a more stable training process. Diffusion models consist of two key processes. The first is the forward diffusion process, which gradually adds Gaussian noise to the original wind and PV scenario data x_{0} over T timesteps, eventually transforming them into near-pure noise x_{T}. This is a fixed Markov process:

q (x_{t} | x_{t - 1}) = N (x_{t}; \sqrt{1 - β_{t}} x_{t - 1}, β_{t} I)

(14)

where β_{t} \in (0, 1) is the noise scheduling parameter used to control the noise intensity at each step, and N denotes the multivariate Gaussian distribution.
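The forward process of Equation (14) can be sketched as follows. The linear schedule and its endpoints are a common choice adopted here as an assumption, as are the function names. The sketch also uses the well-known closed form obtained by composing Equation (14) over t steps, x_{t} = \sqrt{\bar{α}_{t}} x_{0} + \sqrt{1 - \bar{α}_{t}} ϵ with \bar{α}_{t} = \prod_{s \le t} (1 - β_{s}), which allows a noisy sample at any step to be drawn directly.

```python
import numpy as np

def linear_beta_schedule(T, beta_start=1e-4, beta_end=0.02):
    """Noise schedule beta_1..beta_T (endpoints are an assumed common choice)."""
    return np.linspace(beta_start, beta_end, T)

def forward_step(x_prev, beta_t, rng):
    """One step of Equation (14): x_t ~ N(sqrt(1 - beta_t) x_{t-1}, beta_t I)."""
    noise = rng.standard_normal(x_prev.shape)
    return np.sqrt(1.0 - beta_t) * x_prev + np.sqrt(beta_t) * noise

def forward_jump(x0, t, betas, rng):
    """Sample x_t directly from x_0; t is a 0-based index into betas."""
    alpha_bar = np.cumprod(1.0 - betas)[t]
    noise = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * noise
    return x_t, noise

rng = np.random.default_rng(0)
betas = linear_beta_schedule(T=1000)
x0 = np.ones((3, 24))                       # 3 hypothetical daily profiles
x_T, added_noise = forward_jump(x0, t=999, betas=betas, rng=rng)
```

With this schedule the cumulative product of (1 - β_{t}) is very close to zero at the last step, so `x_T` is dominated by the noise term.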

The second is the reverse generative process. The diffusion models learn to predict either the added noise or the denoised data from the noise, thereby progressively reconstructing the noise into new wind and PV scenarios. This is a parameterized Markov chain:

p_{θ} (x_{t - 1} | x_{t}) = N (x_{t - 1}; μ_{θ} (x_{t}, t), Σ_{θ} (x_{t}, t))

(15)

where μ_{θ} and Σ_{θ} are the mean and covariance learned by the neural network.

The optimization objective is to minimize the mean squared error between the noise predicted by the denoising network and the actual added noise:

L_{simple} = E_{t, x_{0}, ϵ} [‖ ϵ - ϵ_{θ} (x_{t}, t) ‖^{2}]

(16)

where

L_{simple}

is the simplified training loss function.

E

denotes the expectation operator.
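A batch estimate of the loss in Equation (16) can be sketched as follows. The sketch relies on the closed form of the forward process implied by Equation (14), x_t = sqrt(ᾱ_t) x_0 + sqrt(1 - ᾱ_t) ϵ with ᾱ_t equal to the product of (1 - β_s) up to step t. The function names, the data, and the placeholder predictor are hypothetical; a real application would pass a trained denoising network as `eps_model`.

```python
import numpy as np

def simple_loss(x0, betas, eps_model, rng):
    """Monte Carlo estimate of L_simple in Eq. (16) for one batch of samples."""
    alpha_bar = np.cumprod(1.0 - betas)          # cumulative signal-retention factor
    n = x0.shape[0]
    t = rng.integers(0, len(betas), size=n)      # one random step index per sample
    eps = rng.standard_normal(x0.shape)          # the noise that is actually added
    a = alpha_bar[t][:, None]
    # Closed form of the forward process: jump from x_0 to x_t in one step.
    x_t = np.sqrt(a) * x0 + np.sqrt(1.0 - a) * eps
    eps_hat = eps_model(x_t, t)
    # Squared Euclidean norm per sample, averaged over the batch.
    return np.mean(np.sum((eps - eps_hat) ** 2, axis=1))

def zero_predictor(x_t, t):
    """Placeholder that always predicts zero noise; NOT a trained network."""
    return np.zeros_like(x_t)

rng = np.random.default_rng(0)
x0 = 0.5 + 0.1 * rng.standard_normal((4000, 24))   # hypothetical normalized profiles
betas = np.linspace(1e-4, 0.02, 1000)              # assumed linear schedule
loss = simple_loss(x0, betas, zero_predictor, rng)
print(loss)
```

For the zero predictor the loss is roughly the dimension of the data (24 here), which gives a simple baseline that any trained denoiser should beat. Many implementations average over the data dimension instead of summing, which only rescales the objective.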

The training process of diffusion models avoids issues such as mode collapse and adversarial instability, enabling the stable generation of scenarios with highly realistic statistical properties, rich detail, and sufficient diversity. Diffusion models excel particularly in capturing extreme fluctuations and complex spatiotemporal correlations, and are therefore increasingly applied in renewable energy SG research. Ref. [107] proposed a novel wind power generation framework based on Denoising Diffusion Probabilistic Models (DDPMs), which overcomes the limitations of conventional methodologies by learning the distribution of real data and generating reliable renewable energy scenarios. Ref. [108] introduced an improved diffusion model for generating high-variability output scenarios for new energy sources in high-altitude regions. The DDPM is enhanced by embedding LSTM networks, thereby improving the model’s ability to learn complex temporal features.

The diffusion models have achieved remarkable breakthroughs in generation quality in recent years and have thus gained growing attention in renewable energy SG. However, the diffusion models still face two core challenges. First, the generation process is completely random, making it difficult to achieve precise controllable generation. Second, the numerous iterative sampling steps lead to high computational costs and slow generation speed. To overcome these limitations, the basic framework has been enhanced through different approaches. Conditional diffusion models endow the model with precise controllable generation capabilities by using external information as guidance. Improved diffusion models focus on optimizing noise scheduling, training objectives, and sampling algorithms, significantly improving efficiency while maintaining generation quality. These two derivative directions jointly promote diffusion models to a more efficient and practical new stage.

The conditional diffusion model is a probabilistic generative model that achieves high-fidelity and refined controllable generation in multi-modal generation tasks (such as images and audio). The core mechanism introduces the guidance of conditional information on the basis of the standard diffusion process: Noise is gradually added to the data in the forward diffusion phase, while in the reverse denoising phase, conditional information (such as text descriptions, category labels, or reference images) is integrated into the prediction of the denoising network to gradually reconstruct data samples that both meet conditional constraints and possess highly realistic details. The conditional diffusion model based on iterative denoising, combined with powerful conditional injection technology, significantly surpasses traditional conditional generative models in balancing quality and diversity. Ref. [109] proposed a novel conditional latent diffusion model (CLDM) suitable for short-term SG to address key challenges in short-term wind energy SG, significantly reducing the denoising complexity of the diffusion model. The core of the CLDM lies in decomposing the complex joint probability distribution modeling task into two sub-problems. First, a high-precision deterministic wind power prediction is obtained through regression using Numerical Weather Prediction (NWP) data via an independent embedding network, filtering out irrelevant meteorological features and ensuring the baseline accuracy of scenarios. Second, prediction errors are taken as the latent space, and a conditional diffusion model is applied in this space to generate random scenarios conforming to the real error distribution. The CLDM not only reduces denoising computational complexity but also maintains spatiotemporal correlations among multiple time periods. 
By integrating the accuracy of deterministic regression and the distribution expression capability of generative models, the CLDM finally reconstructs reliable and diverse wind energy scenarios through the superposition of deterministic predictions and error scenarios, significantly improving comprehensive performance in terms of probabilistic prediction, scenario quality, and downstream scheduling economy. Ref. [110] introduced a new deep learning method using conditional diffusion models to tackle wind energy SG challenges.

The improved diffusion model mainly incorporates key algorithmic and training strategy optimizations on the basis of the original diffusion probabilistic model. For example, linear scheduling is replaced with cosine noise scheduling to make the noise addition process smoother, thereby improving training stability and generation quality. The improved diffusion model also uses a weighted loss function that focuses more on semantically meaningful intermediate noise levels during training, effectively enhancing the semantic coherence of samples. Ref. [111] proposed an enhanced VAE-DM deep SG method by combining the VAE with a diffusion model (DM), which is used to efficiently generate new energy output and load scenarios with uncertainties. The VAE-DM method first uses the VAE to map wind power, PV output, and multi-dimensional load data to the latent space for feature extraction, then performs forward noise diffusion and reverse denoising generation processes in the latent space through the DM to generate diverse and high-quality scenario data. While maintaining temporal correlation and statistical consistency, the model significantly improves the accuracy and diversity of generated scenarios, providing a reliable uncertainty characterization basis for subsequent low-carbon optimal scheduling of integrated energy systems.
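The difference between linear and cosine noise scheduling can be made concrete with a short sketch. It assumes the cosine form commonly used in the improved-DDPM literature, where the cumulative signal factor follows a squared cosine; the offset `s = 0.008`, the clipping value 0.999, and the linear end points are conventional choices assumed here, not values reported by the reviewed studies.

```python
import numpy as np

def cosine_beta_schedule(T, s=0.008, max_beta=0.999):
    """Cosine noise schedule: alpha_bar follows a squared cosine in t / T."""
    steps = np.arange(T + 1)
    f = np.cos((steps / T + s) / (1.0 + s) * np.pi / 2.0) ** 2
    alpha_bar = f / f[0]
    # beta_t is recovered from consecutive ratios of alpha_bar.
    betas = 1.0 - alpha_bar[1:] / alpha_bar[:-1]
    # Clip to keep beta_t strictly below 1 near the end of the chain.
    return np.clip(betas, 0.0, max_beta)

T = 1000
cos_betas = cosine_beta_schedule(T)
lin_betas = np.linspace(1e-4, 0.02, T)      # assumed linear schedule for comparison

# Fraction of the original signal variance retained halfway through the chain.
cos_mid = np.cumprod(1.0 - cos_betas)[T // 2 - 1]
lin_mid = np.cumprod(1.0 - lin_betas)[T // 2 - 1]
print(cos_mid, lin_mid)
```

Under these assumptions the cosine schedule still retains about half of the signal at the midpoint, whereas the linear schedule has already destroyed most of it, which is the sense in which the noising process is smoother.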

4.7. Transformer-Based Models

The transformer is a classic deep learning architecture first proposed in 2017. The architecture completely abandons the sequential structure of traditional RNNs. At the core of the transformer lies the self-attention mechanism. This mechanism enables global modeling of information at any position in the input sequence and efficiently captures long-range dependencies within the sequence. The transformer adopts a series of key designs to optimize model performance, including multi-head attention, positional encoding, residual connections, and layer normalization. These designs enable parallel processing of input data and guarantee stable training of the deep network. The classic encoder–decoder structure forms the core framework of the transformer. The encoder module completes deep semantic understanding of input information. The decoder module implements autoregressive sequence generation through the masked self-attention mechanism. The transformer was initially applied in the field of machine translation. Subsequently, its application scenarios have gradually expanded to various generative tasks, including text generation, abstract writing, code generation, dialog generation, and multimodal content generation. With powerful representation capability, excellent generation quality, and superior scalability, the transformer has become the foundational architecture of mainstream deep generative models, including Large Language Models (LLMs) and multimodal generative models. It also serves as the core technical support for the rapid development of modern deep generative artificial intelligence.

Among all components of the transformer architecture, the self-attention mechanism (also known as intra-attention) is the core innovation. The network structure of the self-attention mechanism is shown in Figure 11. This mechanism is also the fundamental enabler of the transformer’s capability in deep generation tasks and long-text understanding. The core function of the self-attention mechanism is to enable each position in the input sequence to directly access all information from the entire sequence and to combine it through learned weights. This is fundamentally different from traditional RNNs, which can only process sequence information in a step-by-step sequential manner. During the calculation process, the model first maps input vectors through three independent linear transformations to generate three core matrices: Query (Q), Key (K), and Value (V), where Q represents the information intended to be retrieved at the current position, K contains the matchable features of all positions in the sequence, and V stores the actual content information to be weighted and aggregated. Next, the dot product between Q and each K is calculated to obtain the correlation score between the current position and all other positions in the sequence, which represents the degree of attention the current token pays to other tokens in the sequence. To stabilize the numerical range and ensure smoother model training, the dot product results are scaled by dividing by the square root

\sqrt{d_{k}}

of the dimension of the key vector. The scaled scores are then normalized into attention weights ranging from 0 to 1 via the softmax function, where a higher weight indicates a stronger dependence of the current position on the information from the corresponding position. Finally, a weighted sum of the V matrix is performed using these obtained attention weights, thus generating the self-attention output that integrates the global context information of the input sequence. The mathematical formulation of the self-attention mechanism is expressed as follows:

Attention (Q, K, V) = softmax (\frac{Q K^{T}}{\sqrt{d_{k}}}) V

(17)
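Equation (17) can be illustrated with a minimal single-head NumPy sketch. The sequence length, feature sizes, and random projection matrices below are hypothetical placeholders; in a trained transformer the projections are learned parameters.

```python
import numpy as np

def softmax(z, axis=-1):
    """Row-wise softmax; subtracting the maximum avoids overflow in exp."""
    z = z - np.max(z, axis=axis, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=axis, keepdims=True)

def self_attention(X, W_q, W_k, W_v):
    """Scaled dot-product self-attention of Eq. (17) for one sequence."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    d_k = K.shape[-1]
    # Scores between every pair of positions, scaled by sqrt(d_k).
    weights = softmax(Q @ K.T / np.sqrt(d_k), axis=-1)
    # Each output row is a weighted sum of the value vectors.
    return weights @ V, weights

rng = np.random.default_rng(0)
X = rng.standard_normal((24, 8))     # hypothetical: 24 time steps, 8 embedded features
W_q = rng.standard_normal((8, 4))    # random stand-ins for learned projections
W_k = rng.standard_normal((8, 4))
W_v = rng.standard_normal((8, 4))
out, weights = self_attention(X, W_q, W_k, W_v)
print(out.shape, weights.shape)
```

Each row of `weights` sums to 1, so every time step receives a convex combination of the value vectors of all time steps, which is how long-range dependencies enter the representation in a single layer.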

The structural diagram of SG based on the transformer model is presented in Figure 12. When applying the transformer model to SG for renewable energy sources, the input data, including historical output sequences of wind farms and PV stations, meteorological features (e.g., wind speed and irradiance), and spatiotemporal labels, are first processed via embedding and positional encoding. The processed data are then fed into an encoding module consisting of stacked multi-layer encoders. Through the multi-head self-attention mechanism and feed-forward network, the encoders capture the long-term temporal dependencies of the output data and the spatial correlations across multiple renewable farms, and generate an encoding representation fused with global context information. Subsequently, this encoding representation is taken as the context input of the decoder. Through masked multi-head self-attention (to ensure the legality of the generation order) and encoder–decoder attention (to focus on the key information of the input), the decoder realizes autoregressive generation of the output sequence for future time periods. Meanwhile, physical constraints (e.g., not exceeding the rated power) are embedded at the output end. Finally, renewable energy scenarios that both conform to the real data distribution and comply with physical laws are generated, which provide critical input data support for uncertainty analysis, stochastic programming, and robust scheduling of power systems. Ref. [112] adopts the transformer for ultra-short-term photovoltaic (PV) power generation forecasting. It selects PV power generation data and meteorological data from Hebei Province, China, and compares the forecasting results of the transformer model with those of the Gated Recurrent Unit (GRU) and deep neural network (DNN) models, verifying that the transformer model has superior forecasting performance and stability. Experimental results demonstrated that the proposed transformer model improves on the GRU and DNN models by about 0.04 kW and 0.047 kW in MSE, and by 22.0% and 29.1% in MAPE, respectively.

In [113], a hybrid transformer-KAN architecture is proposed for PV power generation forecasting, to address the variable and stochastic nature of PV power generation. For the first time, this model introduces the Kolmogorov–Arnold Network (KAN) into the renewable energy field, aiming to improve the interpretability and forecasting accuracy of the transformer architecture. First, a VAE is applied for feature selection and dimensionality reduction, and the OPTICS clustering algorithm is used to identify the intrinsic operation states of the data. Second, time-GAN is adopted to generate temporal features for enhancing data diversity, combined with multi-stage data preprocessing based on diurnal and seasonal division. Finally, the transformer is utilized to capture long-term temporal dependencies, while the B-spline functions of KAN are leveraged to conduct local feature interaction analysis, thus forming a “global–local” dual-scale modeling capability. Experimental results show that the proposed hybrid model significantly outperforms traditional single models in terms of error metrics, including RMSE and MAE. Compared with the standalone transformer model, it reduces RMSE and MAE by 29.51% and 12.0%, respectively, exhibiting stronger robustness and interpretability while maintaining favorable computational efficiency.

Ref. [114] proposed a method based on the transformer–Wasserstein GAN with gradient penalty for short-term renewable energy output SG. First, a deep neural network architecture inspired by the transformer algorithm is developed to capture the temporal characteristics of renewable energy output. Subsequently, combined with the strengths of the Wasserstein generative adversarial network with gradient penalty (WGAN-GP) in data generation, the transformer-WGAN-GP framework is proposed for short-term renewable energy output SG. Finally, experimental validation is conducted on open-source wind energy and PV datasets from the U.S. National Renewable Energy Laboratory (NREL) to evaluate the performance of the proposed model in renewable energy output SG across multiple dimensions, including expectation, variance, PDF, CDF, Power Spectral Density (PSD) and autocorrelation coefficient. The results show that the proposed method outperforms the standalone WGAN-GP model, VAE model, Copula function model, and LHS model across all the aforementioned evaluation metrics, and provides a more accurate representation of the probability distribution of short-term renewable energy output.

In [115], a learning method based on the Wavelet Transform Convolutional Transformer (WTC-Transformer) is proposed for wind power forecasting under extreme scenarios. First, a CGAN is adopted to generate dynamic extreme scenarios in accordance with physical constraints and expert rules, so as to ensure the realism of generated scenarios and capture the key characteristics of wind power fluctuations under extreme conditions. Next, wavelet transform convolutional layers are applied to enhance the sensitivity to frequency-domain characteristics and effectively extract features from extreme scenarios, thus enabling an in-depth understanding of the input data. Subsequently, the WTC-Transformer leverages the self-attention mechanism of the transformer to capture the global dependencies among features, which strengthens its sequence modeling capability and improves the forecasting accuracy for extreme scenarios. The WTC-Transformer model is trained on a wide range of scenarios to further enhance its forecasting performance under extreme conditions. Compared with other benchmark models, its coefficient of determination (R²) reaches as high as 0.95, while MAE and RMSE are significantly lower than those of the counterparts. The results demonstrate the high accuracy and effectiveness of the proposed model in handling complex wind power operating conditions.

4.8. Brief Summary

These model-based methods each possess distinct characteristics and applicable conditions, thus requiring selection based on the specific problem at hand. Table 2 summarizes the advantages and disadvantages of the methods discussed above.

Through a systematic analysis of the advantages, disadvantages and core technical bottlenecks of various model-based methods, we have condensed five core cutting-edge research and development themes in conjunction with the practical requirements of power system engineering. First, we have considered the development of physics-informed and interpretable deep generative architectures, which embeds physical prior knowledge into model design to solve the “black-box” problem of existing deep models and ensure the physical consistency of generated scenarios. Second, we have considered few-shot and zero-shot scenario generation methods for data-scarce renewable energy scenarios, breaking the heavy data dependency of data-driven models through transfer learning, meta-learning and physics-informed optimization. Third, we have considered ultra-high-dimensional spatiotemporal joint scenario generation for multi-energy coupled large-scale power systems, which breaks the curse of dimensionality through hybrid architectures integrating graph neural networks and a transformer, and supports joint modeling of wind-PV–storage-load multi-energy sources. Fourth, we have considered NWP-conditioned high-fidelity extreme SG, which optimizes the NWP information embedding mechanism of conditional generative models to enhance the capture capacity of low-probability high-impact extreme events that determine grid security. Fifth, we have considered end-to-end task-aligned scenario generation and a unified evaluation framework, which realizes customized SG oriented to downstream power system tasks, and establishes a multi-dimensional evaluation system that connects statistical metrics with practical engineering performance. These research directions directly address the core bottlenecks of existing methods, and will drive the transition of SG technology from academic data fitting to practical decision support for power system operation.

5. Evaluation of SG Methods

The evaluation of generated scenarios is a crucial component in stochastic planning for wind-integrated power systems. Due to the significant intermittency and uncertainty of renewable power generation, whether the generated scenarios can authentically reflect its stochastic characteristics directly impacts the scientific validity and reliability of subsequent dispatch, planning, and decision-making. In practical applications, low-quality scenarios may not only lead to underestimated operational risks and overestimated economic benefits, but also cause dispatch deviations and inappropriate reserve capacity arrangements, and may even affect grid frequency stability and power supply security. Especially under the current trend of the high penetration of renewable energy, wind power volatility and forecast errors impose higher demands on the real-time balance and long-term planning of power systems. Therefore, establishing a systematic and comprehensive scenario evaluation framework is not only fundamental for verifying the effectiveness of scenario generation methods, but also a key step in supporting the transition of system operation from “passive response” to “active control.” Such a framework thereby provides an essential basis for achieving “observable, measurable, and controllable” wind power integration.

While numerous studies have focused on SG methods, systematic evaluation of the quality of generated scenarios remains relatively limited. To improve the accuracy of SG methods and maximize their economic feasibility, it is essential to evaluate the quality of generated scenarios to ascertain whether they sufficiently embody the characteristics of real-world data. This paper systematically reviews existing evaluation metrics for wind and solar output scenario generation and categorizes them into three groups: output-based evaluation, distribution-based evaluation, and event-based evaluation [116]. This classification framework systematically assesses scenario accuracy, statistical consistency, and engineering applicability from different dimensions, offering a scientific basis for the selection and optimization of SG methods.

SG methods that generate results in the form of expected values are categorized into the output-based evaluation group. This type of evaluation focuses on the direct numerical differences between generated scenarios and actual observations, making it suitable for situations that require accuracy measurement for point forecasts or deterministic outputs. For SG methods that yield results in the form of discrete probability distributions, classification is performed through distribution-based evaluation. This evaluation emphasizes whether the generated scenarios align with real data in terms of probability distribution shape, statistical moments, and uncertainty representation, and is applicable in contexts such as risk assessment and reliability analysis that require probabilistic information. In addition to these two evaluations of SG results, the event-based group assesses SG at an overall level. This evaluation treats wind power output as a dynamic process with temporal continuity and event correlation. By examining the performance of generated scenarios in key system events (such as ramping, sustained high output, and extreme fluctuations), the event-based evaluation determines whether they possess sufficient engineering representativeness and operational guidance value. Together, these three types of evaluation characterize the quality of SG from different dimensions, forming a multi-level, multi-perspective comprehensive evaluation system.

5.1. Output-Based Evaluation

Output-based evaluation primarily measures the accuracy of generated scenarios at the numerical level, and is applicable to scenario generation methods that produce outputs in the form of expected values or deterministic trajectories. The core idea of output-based evaluation is to directly compare the differences between generated scenario sequences and actual observed sequences at each time step or on an overall basis, quantifying the deviation of the generated results through statistical error metrics. Due to intuitive computation and clear interpretation, output-based evaluation is widely adopted in engineering practice, particularly when point-wise validation of generated scenarios is required or when the scenarios serve as deterministic inputs for subsequent optimization models, as these metrics provide clear references for accuracy. Commonly used evaluation metrics include Mean Absolute Error (MAE) [117], Root Mean Square Error (RMSE) [118], and Mean Absolute Percentage Error (MAPE) [119]. These metrics characterize the deviation between generated and actual values from different perspectives, offering quantitative support for model parameter tuning and method comparison.

MAE represents the average magnitude of prediction errors, measured as the mean of absolute differences between predicted and actual values. MAE assigns balanced weighting to all error magnitudes and is less sensitive to extreme values. The formula is as follows:

\mathrm{MAE} = \frac{1}{T} \sum_{t = 1}^{T} |y_{t} - {\hat{y}}_{t}|

(18)

where T is the total number of time periods,

y_{t}

is the actual observed value, and

{\hat{y}}_{t}

is the generated scenario value. A smaller MAE value indicates a lower overall deviation of the generated scenarios.

RMSE quantifies the accuracy of generated scenarios. By squaring the errors, RMSE assigns greater weight to larger errors, making it more sensitive to outliers. Its formula is as follows:

\mathrm{RMSE} = \sqrt{\frac{1}{T} \sum_{t = 1}^{T} {(y_{t} - {\hat{y}}_{t})}^{2}}

(19)

RMSE reflects the fluctuation consistency of the generated scenarios, where a smaller value indicates that the generated results are closer to the actual sequence.

MAPE represents the percentage deviation of the generated scenarios from the actual scenarios. The formula is as follows:

\mathrm{MAPE} = \frac{1}{T} \sum_{t = 1}^{T} |\frac{y_{t} - {\hat{y}}_{t}}{y_{t}}| \times 100\%

(20)

MAPE is particularly suitable for assessing the importance of relative error. However, MAPE may lead to distortion of the metric when actual values are close to zero. In such cases, caution is advised, so MAPE should be considered together with other metrics for a comprehensive evaluation.
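The three output-based metrics in Equations (18) to (20) can be computed directly, as in the following sketch. The four-point series is a hypothetical example in arbitrary power units, and the function names are illustrative.

```python
import numpy as np

def mae(y, y_hat):
    """Mean Absolute Error, Eq. (18)."""
    return np.mean(np.abs(y - y_hat))

def rmse(y, y_hat):
    """Root Mean Square Error, Eq. (19)."""
    return np.sqrt(np.mean((y - y_hat) ** 2))

def mape(y, y_hat):
    """Mean Absolute Percentage Error in percent, Eq. (20); y must be nonzero."""
    return np.mean(np.abs((y - y_hat) / y)) * 100.0

# Hypothetical observed and generated values.
y = np.array([10.0, 20.0, 30.0, 40.0])
y_hat = np.array([12.0, 18.0, 33.0, 40.0])
print(mae(y, y_hat), rmse(y, y_hat), mape(y, y_hat))
```

For this example the MAE is 1.75, the RMSE is about 2.06, and the MAPE is 10%. The RMSE exceeds the MAE because the single error of magnitude 3 is weighted more heavily after squaring. The division by `y` in `mape` is also where the distortion near zero output arises, for example for PV output at night.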

5.2. Distribution-Based Evaluation

Distribution-based evaluation focuses on analyzing whether the statistical probability characteristics of generated scenarios align with the distribution of real data. As a stochastic process, renewable energy output has significant implications for system risk assessment, reserve capacity determination, and chance-constrained planning through its probability distribution shape, tail characteristics, and moment information. Therefore, relying solely on numerical error evaluation is insufficient to fully reflect the quality of generated scenarios in capturing uncertainty. Distribution-based evaluation assesses the effectiveness of a generation method in capturing the statistical patterns of the original data by comparing the similarity between the distribution of generated scenarios and that of actual observed data in terms of shape, central tendency, dispersion, skewness, and kurtosis. This type of evaluation is typically suitable for scenario generation methods that output probability distributions or distribution parameters. Commonly used metrics include distance measures (such as Wasserstein distance) [120], the Energy Score [121], and the Quantile Score [122].

Commonly used distance metrics include the Wasserstein distance and the Euclidean distance, which describe the proximity between two probability distributions. The Wasserstein distance quantifies the minimum “effort” required to transform one distribution into another. It offers sound mathematical properties and an intuitive interpretation when measuring differences in distribution shapes. The formula for the Wasserstein distance is as follows:

W_{p} (P, Q) = {[\inf_{γ \in Γ (P, Q)} \int {d (x, y)}^{p} \, \mathrm{d} γ (x, y)]}^{1 / p}

(21)

where P and Q represent two probability distributions,

Γ (P, Q)

denotes the set of all joint distributions whose marginals are P and Q, d is the distance function, and p is the order parameter. Euclidean distance, on the other hand, is commonly used to compare the direct differences between statistical feature vectors of distributions, such as various moments. A smaller distance metric value indicates that the two distributions are closer.
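For one-dimensional samples of equal size, the infimum in Equation (21) has a simple closed form: the optimal coupling pairs the sorted values of the two samples. The sketch below uses this special case with d(x, y) = |x - y|. The sample data and the function name are hypothetical, and the multivariate case requires a general optimal transport solver that is not shown here.

```python
import numpy as np

def wasserstein_1d(x, y, p=1):
    """Empirical p-Wasserstein distance between two equal-size 1-D samples."""
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    if x.shape != y.shape:
        raise ValueError("this sketch assumes samples of equal size")
    # Sorted values are matched one to one, which is optimal in one dimension.
    return np.mean(np.abs(x - y) ** p) ** (1.0 / p)

rng = np.random.default_rng(0)
real = rng.normal(0.5, 0.1, size=1000)   # hypothetical observed outputs
generated = real + 0.3                    # the same sample shifted by a constant bias
print(wasserstein_1d(real, generated, p=1))
```

A constant shift of 0.3 yields a distance of 0.3 for any order p, which matches the interpretation of the metric as the effort needed to move one distribution onto the other.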

The Energy Score (ES) is a metric used to evaluate the quality of probabilistic predictive distributions, measuring the discrepancy between a discrete probability distribution and the actual output. The ES not only considers the central tendency of the predictive distribution but also assesses the sharpness of the prediction by evaluating the dispersion among predictive samples. Its formula is as follows:

E_{s} = \frac{1}{S} \sum_{i = 1}^{S} ‖ς_{i} - y‖ - \frac{1}{2 S^{2}} \sum_{i = 1}^{S} \sum_{j = 1}^{S} ‖ς_{i} - ς_{j}‖

(22)

where S is the number of generated scenarios,

ς_{i}

represents the i-th generated scenario sequence, y is the actual observed sequence, and

‖•‖

denotes the Euclidean norm. A smaller ES indicates that the predictive distribution is more similar to the actual distribution, and the generated scenarios themselves exhibit a moderate level of dispersion.
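Equation (22) translates almost directly into array operations. In the sketch below, `scenarios` is an array with one generated sequence per row and `y` is the observed sequence; the synthetic data and the function name are hypothetical.

```python
import numpy as np

def energy_score(scenarios, y):
    """Energy Score of Eq. (22); scenarios has shape (S, T), y has shape (T,)."""
    scenarios = np.asarray(scenarios, dtype=float)
    y = np.asarray(y, dtype=float)
    S = scenarios.shape[0]
    # First term: mean Euclidean distance between each scenario and the observation.
    term1 = np.mean(np.linalg.norm(scenarios - y, axis=1))
    # Second term: half the mean pairwise distance among scenarios (spread of the ensemble).
    diffs = scenarios[:, None, :] - scenarios[None, :, :]
    term2 = np.sum(np.linalg.norm(diffs, axis=2)) / (2.0 * S ** 2)
    return term1 - term2

rng = np.random.default_rng(0)
y_obs = 0.5 + 0.1 * rng.standard_normal(24)             # hypothetical observed profile
centered = y_obs + 0.05 * rng.standard_normal((200, 24)) # ensemble centred on the observation
biased = centered + 0.3                                  # the same ensemble with a constant bias
print(energy_score(centered, y_obs), energy_score(biased, y_obs))
```

Both ensembles have the same spread, so the second term is identical and the biased ensemble receives the larger (worse) score through the first term alone.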

Quantile Score (QS) characterizes the alignment between the generated distribution and the true distribution by comparing the values of the generated distribution at various quantile points with the actual observed values. The QS is typically computed using the Pinball Loss function, which provides a comprehensive evaluation of the predictive performance across different probability levels. Its formula is as follows:

L_{τ} = \begin{cases} τ (y - q), & y \geq q \\ (1 - τ) (q - y), & y < q \end{cases}

(23)

where τ is the target quantile, y is the actual observed value, and q is the predicted value from the generated distribution at the τ quantile. Averaging over all quantiles yields the overall Quantile Score. A smaller score indicates that the generated distribution matches the distribution characteristics of the actual data well across all quantile points.
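The pinball loss in Equation (23) and its average over quantile levels can be sketched as follows. Estimating the quantile q at each time step from the scenario ensemble with `np.quantile` is an assumption of this sketch, as are the quantile levels and the synthetic data.

```python
import numpy as np

def pinball_loss(y, q, tau):
    """Pinball loss of Eq. (23), evaluated element-wise."""
    y = np.asarray(y, dtype=float)
    q = np.asarray(q, dtype=float)
    return np.where(y >= q, tau * (y - q), (1.0 - tau) * (q - y))

def quantile_score(scenarios, y, taus):
    """Average pinball loss over time steps and quantile levels."""
    losses = []
    for tau in taus:
        # Empirical tau-quantile of the ensemble at each time step.
        q = np.quantile(scenarios, tau, axis=0)
        losses.append(np.mean(pinball_loss(y, q, tau)))
    return float(np.mean(losses))

rng = np.random.default_rng(0)
y_obs = 0.5 + 0.1 * rng.standard_normal(24)              # hypothetical observed profile
scenarios = y_obs + 0.05 * rng.standard_normal((200, 24)) # hypothetical ensemble
taus = np.arange(0.1, 1.0, 0.1)                           # assumed levels 0.1, ..., 0.9
print(quantile_score(scenarios, y_obs, taus))
```

The asymmetry of the loss is visible in a simple case: for τ = 0.9, an observation 2 units above the predicted quantile costs 1.8, whereas an observation 2 units below it costs only 0.2.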

5.3. Event-Based Evaluation

Event-based evaluation assesses the overall performance of generated scenarios at the event level, focusing on event coverage and relevance. Renewable energy output is treated as a dynamic process that evolves over time and includes several key operational events, such as sharp power ramps, sudden drops, and sustained high-power periods. From the perspective of practical engineering needs, such as system operation security, dispatch response, and reserve allocation, an evaluation is conducted on whether the generated scenarios can effectively reproduce the statistical characteristics and occurrence patterns of these critical events. The event-based evaluation goes beyond mere numerical or distributional matching, placing greater emphasis on the authenticity of generated scenarios at the level of “behavioral patterns.” Significant importance is attached to verifying the applicability of scenarios in specific power system applications, such as ramp event warning, frequency response analysis, and reserve requirement assessment. Commonly used evaluation metrics include the Coverage Rate [116], the Brier Score [123], and various correlation coefficients (such as Pearson correlation coefficient and tail correlation coefficient) [124].

Coverage Rate (c) is defined as the probability that the variation range of the generated scenarios within a given time interval—typically formed by the minimum and maximum values of all generated scenarios during that interval—covers the actual observed value. A higher Coverage Rate indicates that the set of generated scenarios encompasses a broader range of possible actual values, reflecting stronger representativeness. Its formula is as follows:

c = \frac{1}{T} \sum_{t = 1}^{T} I (y_{t} \in [\min_{s} {\hat{y}}_{s, t}, \max_{s} {\hat{y}}_{s, t}])

(24)

where I is the indicator function, taking a value of 1 when the condition is satisfied and 0 otherwise;

{\hat{y}}_{s, t}

represents the value of the s-th generated scenario at time t.
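The Coverage Rate in Equation (24) reduces to a comparison against the per-step envelope of the scenario set. The small arrays below are hypothetical values chosen so that the result can be checked by hand.

```python
import numpy as np

def coverage_rate(scenarios, y):
    """Coverage Rate of Eq. (24); scenarios has shape (S, T), y has shape (T,)."""
    scenarios = np.asarray(scenarios, dtype=float)
    y = np.asarray(y, dtype=float)
    lower = scenarios.min(axis=0)   # minimum over scenarios at each time step
    upper = scenarios.max(axis=0)   # maximum over scenarios at each time step
    # The indicator is 1 where the observation lies inside the envelope.
    return np.mean((y >= lower) & (y <= upper))

scenarios = np.array([[1.0, 2.0, 3.0],
                      [3.0, 4.0, 5.0]])
y = np.array([2.0, 5.0, 4.0])
print(coverage_rate(scenarios, y))
```

Here the envelopes are [1, 3], [2, 4], and [3, 5], so the first and third observations are covered and the second is not, giving a coverage of 2/3. Because the envelope widens as scenarios are added, this metric is usually read together with a sharpness or distribution-based measure.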

Brier Score (BS) is a metric used to measure the accuracy of probabilistic forecasts, particularly suitable for assessing the predictive capability of generated scenarios regarding the probability of a specific event (such as “output exceeding a certain threshold” or “the occurrence of a ramp event”). The BS converts probabilistic predictions into a binary classification evaluation of whether the event occurs or not, quantifying forecast accuracy by calculating the mean squared error between the predicted probability and the actual outcome. The formula is as follows:

B_{s} = \frac{1}{N} \sum_{i = 1}^{N} {(p_{i} - o_{i})}^{2}

(25)

where N is the number of evaluation samples,

p_{i}

is the predicted probability of the event occurring, and

o_{i}

is the actual observed value (1 if the event occurs, 0 otherwise). A smaller Brier Score indicates more accurate probabilistic predictions of events by the generated scenarios.
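One way to obtain the event probabilities in Equation (25) from a scenario set is to take the fraction of scenarios in which the event occurs. The sketch below does this for the event “output exceeding a threshold”; this estimator, the threshold, and the data are assumptions made for illustration.

```python
import numpy as np

def event_probability(scenarios, threshold):
    """Fraction of scenarios whose output exceeds the threshold at each time step."""
    return np.mean(np.asarray(scenarios) > threshold, axis=0)

def brier_score(p, o):
    """Brier Score of Eq. (25) for predicted probabilities p and binary outcomes o."""
    p = np.asarray(p, dtype=float)
    o = np.asarray(o, dtype=float)
    return np.mean((p - o) ** 2)

# Hypothetical ensemble: 4 scenarios, 2 time steps.
scenarios = np.array([[1.0, 5.0],
                      [3.0, 7.0],
                      [6.0, 8.0],
                      [7.0, 2.0]])
y = np.array([5.0, 3.0])          # hypothetical observations
threshold = 4.0                   # assumed event threshold
p = event_probability(scenarios, threshold)
o = (y > threshold).astype(float)
print(p, o, brier_score(p, o))
```

The predicted probabilities are 0.5 and 0.75, the observed outcomes are 1 and 0, and the resulting score is (0.25 + 0.5625) / 2 = 0.40625.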

Commonly used correlation metrics include the Pearson Product-Moment Correlation Coefficient (PPMCC), tail correlation (TC), and non-parametric correlation measures. The PPMCC is mainly used to measure linear correlation between variables. The TC focuses on measuring the co-movement probability of two sequences under extreme conditions, such as very high or low renewable energy output. The non-parametric correlation measures, such as Spearman’s rank correlation coefficient, are used to assess the monotonic relationships between sequences. Granger causality (GC) can also be used to evaluate the consistency in the direction and magnitude of changes between random variables [125]. These correlation coefficients help determine whether the generated scenarios capture the dependence structure and variation patterns of the actual output over time. Among them, the formula for Spearman’s rank correlation coefficient is as follows:

\rho = 1 - \frac{6 \sum_{i = 1}^{n} d_{i}^{2}}{n (n^{2} - 1)}

(26)

where

\rho

is the correlation coefficient, ranging from −1 to 1; n is the sample size; and

d_{i}

is the difference in ranks for each pair of observations. A value close to 1 indicates a high consistency in the ordering of the two sequences, suggesting that the generated scenarios effectively preserve the trend characteristics of the actual output fluctuations over time.
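
Equation (26) can be illustrated with a short sketch. The function name and the toy sequences are assumptions made for illustration, and the closed form is only exact when neither sequence contains tied values.

```python
import numpy as np


def spearman_rho(x, y):
    """Spearman's rho via 1 - 6 * sum(d^2) / (n * (n^2 - 1)); assumes no tied values."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    # argsort of argsort yields 0-based ranks; the offset cancels in the difference d.
    rank_x = np.argsort(np.argsort(x))
    rank_y = np.argsort(np.argsort(y))
    d = rank_x - rank_y
    return 1.0 - 6.0 * float(np.sum(d ** 2)) / (n * (n ** 2 - 1))


# Toy sequences (illustrative only): actual output and one generated scenario.
actual = np.array([0.10, 0.35, 0.60, 0.80, 0.55])
generated = np.array([0.15, 0.30, 0.70, 0.65, 0.50])
rho = spearman_rho(actual, generated)
print(f"Spearman rho: {rho:.2f}")
```

In this toy case only two adjacent ranks are swapped, so the coefficient stays close to 1. When ties are present, a rank-averaging implementation should be used instead of this closed form.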

5.4. Brief Summary

In summary, the evaluation of generated renewable output scenarios is a multidimensional, multi-level systematic task. Output-based evaluation focuses on the point estimation accuracy of generated values, providing a direct basis for deterministic applications of scenarios. Distribution-based evaluation emphasizes whether the probabilistic statistical characteristics of generated scenarios align with real data, offering support for risk-informed decision-making. Event-based evaluation, starting from the practical operational concerns of power systems, examines the performance of generated scenarios in key operational events to ensure their engineering applicability. The three categories of evaluation metrics each have their own focus, complement one another, and together form a comprehensive assessment system for the quality of scenario generation. The classification of these evaluation metrics for SG methods is shown in Figure 13.

In practical research, the selection and combination of the aforementioned evaluation metrics should be determined by the characteristics of the scenario generation method and its target application. For example, for SG primarily used in short-term deterministic scheduling, output-based evaluation may be emphasized. When SG is applied to medium- or long-term risk assessment or planning, distribution-based evaluation should be strengthened. If the focus is on system behavior under extreme weather conditions or special operational modes, event-based evaluation becomes particularly important. As artificial intelligence and big data technologies become increasingly integrated into scenario generation, the evaluation framework must evolve accordingly. Additional dimensions may be considered, such as the diversity, novelty, and computational efficiency of generated scenarios, as well as end-to-end evaluation methods directly linked to the performance of specific power system optimization problems. Such advancements will drive renewable power scenario generation technology toward greater accuracy, reliability, and practicality.

6. Analysis of the Impacts of NWP Data on SG Methods

Renewable power SG is a core enabling technology for dispatch and optimization under uncertainty in power systems [126]. In practical power grid operation, SG is by no means an unconstrained replay of historical data, but must take Numerical Weather Prediction (NWP) data for future periods as the core boundary condition [127]. The generated scenarios should not only fit the power output baseline corresponding to the NWP data, but also fully cover the power output uncertainty caused by NWP forecast errors. Otherwise, the generated scenarios will be severely inconsistent with the actual future operating conditions, and thus cannot provide effective support for dispatch decision-making.

Existing review studies in this field have mostly focused on unconstrained SG methods based on historical time-series data, and lack a systematic review of conditional SG techniques that integrate NWP forecast information and are widely applied in practical engineering. In particular, no complete review framework exists for studies on how to embed NWP conditions into mainstream state-of-the-art architectures, including CGANs and diffusion models, to improve the forecast consistency and engineering applicability of generated scenarios. To address this research gap, this section briefly outlines the technical system of renewable power SG integrated with NWP data, and elaborates on the embedding mechanisms of NWP information by model category. It also analyzes the technical advantages of state-of-the-art architectures such as CGANs and diffusion models in forecast-conditioned scenario synthesis, thus filling the gap of insufficient attention to engineering-oriented, practically applicable SG techniques in existing reviews.

6.1. Core Roles and Integration Paradigms of NWP Data in SG

In the practice of renewable energy dispatch in power systems, NWP is both the core input for deterministic wind and PV power forecasting and the critical boundary condition for SG of power output uncertainty [128]. Its three core attributes, namely its multi-dimensional spatiotemporal nature, its multi-time-scale adaptability, and its coupling with uncertainty, are highly compatible with SG requirements. These attributes not only provide data support for spatiotemporal coupling modeling of wind and PV power output, but also offer a core anchor for uncertainty modeling, driving the paradigm shift in SG from unconstrained historical data replay to forecast-bound conditional uncertainty synthesis that supports practical dispatch decision-making.

In the engineering practice and academic research of renewable power output SG, three mature technical paradigms have been formed for the integration of NWP data and SG models, namely, two-stage pre-integration, end-to-end conditional integration, and error modeling-based integration. Based on differentiated technical logics, these three paradigms are adapted to different model architectures and engineering application scenarios, and together constitute the core technical system of NWP-driven conditional SG.

Among them, two-stage pre-integration is the classic mode with the lowest implementation threshold. The core logic is to first calculate the future power output baseline curve with NWP as the input, then generate fluctuation scenarios through sampling based on the statistical characteristics of historical NWP forecast errors, and superimpose them on the baseline to obtain the final power output scenarios. The mode features a low application threshold and strong interpretability, and is suitable for the rapid engineering requirements of small and medium-sized power grids and distributed renewable energy projects.
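
The second stage of this paradigm can be sketched as follows, assuming that the NWP-based baseline has already been computed in the first stage. The function name, the per-unit scaling, and the synthetic error archive are illustrative assumptions; the empirical resampling shown here is only one of several possible sampling choices.

```python
import numpy as np


def two_stage_scenarios(baseline, historical_errors, n_scenarios, rng):
    """Superimpose resampled historical error curves on an NWP-based baseline.

    baseline:          shape (T,), deterministic forecast in per unit of capacity
    historical_errors: shape (D, T), past daily curves of (actual - forecast)
    """
    baseline = np.asarray(baseline, dtype=float)
    historical_errors = np.asarray(historical_errors, dtype=float)
    # Resample whole error curves so that their temporal correlation is preserved.
    idx = rng.integers(0, historical_errors.shape[0], size=n_scenarios)
    scenarios = baseline[None, :] + historical_errors[idx]
    # Keep the output inside the physical range [0, 1] per unit.
    return np.clip(scenarios, 0.0, 1.0)


rng = np.random.default_rng(0)
baseline = np.array([0.20, 0.45, 0.70, 0.50])
# Synthetic placeholder errors; a real study would take them from forecast archives.
historical_errors = rng.normal(0.0, 0.05, size=(30, 4))
scenarios = two_stage_scenarios(baseline, historical_errors, n_scenarios=10, rng=rng)
print(scenarios.shape)
```

Resampling complete error curves, rather than individual time steps, is a deliberate choice in this sketch because it retains the autocorrelation of forecast errors without fitting an explicit model.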

End-to-end conditional integration is the mainstream route in current academic research and engineering applications. The core logic is to directly embed NWP as a constraint condition into deep generative models such as CGANs and diffusion models, to realize end-to-end integrated modeling from NWP input to scenario output. It avoids the error accumulation of the two-stage structure, can accurately capture the nonlinear spatiotemporal coupling relationship between wind/PV power output and NWP, and achieves higher generation accuracy. It is suitable for scenarios with high requirements for scenario refinement, such as large-scale power grid day-ahead dispatch and electricity spot markets.

Error modeling-based integration is a lightweight implementation paradigm. The core logic, on the premise of an existing mature and high-precision NWP-based baseline power output forecast, is to directly model the historical NWP forecast error sequence to generate fluctuation scenarios, which are then superimposed on the baseline power output to obtain the final scenarios. This mode has the advantages of low transformation cost and flexible deployment, and is suitable for application scenarios requiring high-frequency and rapid scenario generation, such as intra-day ultra-short-term rolling dispatch.
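
One simple way to model a forecast error sequence directly is a first-order autoregressive process. The sketch below is a hypothetical illustration of that idea, not the method of any cited reference: the AR(1) form, the function names, and the synthetic error history are all assumptions.

```python
import numpy as np


def fit_ar1(errors):
    """Estimate the AR(1) coefficient and innovation std from an error series."""
    e = np.asarray(errors, dtype=float)
    e = e - e.mean()
    phi = float(np.dot(e[1:], e[:-1]) / np.dot(e[:-1], e[:-1]))
    resid = e[1:] - phi * e[:-1]
    return phi, float(resid.std(ddof=1))


def simulate_ar1(phi, sigma, horizon, n_scenarios, rng):
    """Simulate error paths e_t = phi * e_{t-1} + sigma * eps_t, started from zero."""
    paths = np.zeros((n_scenarios, horizon))
    prev = np.zeros(n_scenarios)
    for t in range(horizon):
        prev = phi * prev + sigma * rng.standard_normal(n_scenarios)
        paths[:, t] = prev
    return paths


rng = np.random.default_rng(0)
# Synthetic stand-in for an archived forecast error series (assumption).
history = simulate_ar1(0.8, 0.05, 2000, 1, rng)[0]
phi, sigma = fit_ar1(history)

baseline = np.full(24, 0.5)  # existing baseline forecast, per unit of capacity
scenarios = np.clip(baseline + simulate_ar1(phi, sigma, 24, 100, rng), 0.0, 1.0)
print(round(phi, 2), round(sigma, 3), scenarios.shape)
```

Because only two parameters are estimated and simulation is a short loop, a model of this kind can be refitted and resampled at every rolling step, which matches the lightweight character of this paradigm.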

6.2. NWP Integration Mechanisms in Deep Generative Models

The ANN is the earliest method to integrate NWP into SG, and the mainstream implementation adopts a two-stage paradigm. In the first stage, with NWP data as the input, a deterministic baseline of future power output is obtained through models such as back propagation artificial neural network (BP-ANN). In the second stage, based on the statistical characteristics of historical forecast errors, uncertainty scenarios are generated around the baseline via sampling methods including Monte Carlo sampling and LHS. At present, this method is still widely applied in the engineering scenarios of small and medium-sized power grids. Ref. [129] adopts NWP to improve wind characteristic assessment and medium-term wind forecasting over complex hilly terrain, and combines an ANN to implement medium-term wind power forecasting. The research results demonstrate that the medium-term forecasting framework using the NWP model outputs and a simple ANN achieves a 14% reduction in the annual NMAE of wind power forecasting.
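
The second-stage sampling with LHS can be sketched as below. The sketch assumes zero-mean Gaussian forecast errors that are independent across time steps, which is a strong simplification; the function names and parameter values are illustrative and are not taken from Ref. [129].

```python
import numpy as np
from statistics import NormalDist


def lhs_uniform(n_samples, n_dims, rng):
    """Latin hypercube sample on (0, 1)^n_dims: one point per stratum per dimension."""
    # Jitter inside each of the n_samples equal-width strata.
    u = (np.arange(n_samples)[:, None] + rng.random((n_samples, n_dims))) / n_samples
    # Shuffle the strata independently in each dimension.
    for j in range(n_dims):
        u[:, j] = rng.permutation(u[:, j])
    return u


def lhs_error_scenarios(baseline, error_std, n_scenarios, rng):
    """Baseline plus zero-mean Gaussian errors drawn by LHS, independent per time step."""
    baseline = np.asarray(baseline, dtype=float)
    u = lhs_uniform(n_scenarios, baseline.size, rng)
    # Guard against the (extremely unlikely) exact 0, where the inverse CDF is undefined.
    u = np.clip(u, 1e-12, 1.0 - 1e-12)
    inv_cdf = np.vectorize(NormalDist().inv_cdf)
    errors = inv_cdf(u) * np.asarray(error_std, dtype=float)
    return baseline + errors


rng = np.random.default_rng(0)
baseline = np.array([0.20, 0.45, 0.70, 0.50])  # assumed output of the first stage
scenarios = lhs_error_scenarios(baseline, error_std=0.05, n_scenarios=8, rng=rng)
print(scenarios.shape)
```

Compared with plain Monte Carlo sampling, the stratification guarantees that every probability interval of the error distribution is represented exactly once at each time step, so fewer scenarios are needed to cover the tails.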

LSTM is the fundamental architecture for temporal SG, and the core of its integration with NWP is to achieve end-to-end integrated modeling of forecasting and SG. Mainstream studies in this field adopt the Multi-Input Multi-Output (MIMO) LSTM architecture. Its input consists of multi-source features concatenated via a transformer, namely historical power output sequences, observed NWP data at the corresponding time steps, and NWP forecast data for future periods, and its output consists of multi-scenario power output sequences. Ref. [130] proposes a novel neural network forecasting model named EALSTM-QR for wind power forecasting, which integrates inputs from NWP data and deep learning methods. The model consists of four core modules, namely an encoder, an attention module, a Bidirectional LSTM (Bi-LSTM) module, and a quantile regression (QR) module. The encoder and attention module fuse historical wind power data with features extracted from NWP data. The Bi-LSTM module generates probabilistic forecasting results of wind power time series, and the QR method is applied to obtain the final prediction intervals. The results demonstrate that the proposed model achieves favorable accuracy and reliability in interval and probabilistic forecasting.

The CGAN is the most widely used NWP integration architecture in engineering applications of renewable energy SG. The core logic is to take NWP data as the global conditional variable c, and construct a condition-constrained generator

G (z, c)

and discriminator

D (x, c)

. Specifically, the generator takes random noise z and NWP condition c as inputs to output scenarios conforming to the forecast constraints, while the discriminator simultaneously incorporates the corresponding NWP conditions to distinguish between real and generated scenarios. The design fundamentally addresses the key limitation of unconstrained GANs, where generated scenarios are severely inconsistent with the actual future operating conditions of the power grid. At present, the NWP embedding and optimization directions under this architecture are mainly divided into three categories.

The first is basic concatenation-based embedding, which encodes NWP temporal features into a conditional vector, concatenates it with random noise as the generator input, and also uses it as the conditional input of the discriminator. This method is easy to implement and has strong engineering applicability. Ref. [131] focuses on day-ahead wind power scenario generation, and constructs a CGAN model by encoding 24 h NWP sequences of wind speed, wind direction and temperature into conditional vectors. Compared with the conventional unconstrained GAN, the proposed model achieves a substantial improvement in the consistency between the generated scenarios and the actual operating conditions.
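
The data flow of concatenation-based embedding can be illustrated with an untrained forward pass. The layer sizes, the single hidden layer, and the random placeholder inputs below are assumptions chosen for brevity; the sketch shows only where the condition c enters G(z, c) and D(x, c), not the architecture of Ref. [131].

```python
import numpy as np


def dense(x, w, b):
    """Fully connected layer without activation."""
    return x @ w + b


def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))


def generator(z, c, params):
    """G(z, c): noise and NWP condition are concatenated before the first layer."""
    h = np.tanh(dense(np.concatenate([z, c], axis=1), params["g_w1"], params["g_b1"]))
    # The sigmoid keeps the generated per-unit output in (0, 1).
    return sigmoid(dense(h, params["g_w2"], params["g_b2"]))


def discriminator(x, c, params):
    """D(x, c): the scenario is judged together with the same NWP condition."""
    h = np.tanh(dense(np.concatenate([x, c], axis=1), params["d_w1"], params["d_b1"]))
    return sigmoid(dense(h, params["d_w2"], params["d_b2"]))


rng = np.random.default_rng(0)
batch, noise_dim, horizon, n_features, hidden = 5, 16, 24, 3, 32
cond_dim = horizon * n_features  # e.g. 24 h of wind speed, direction and temperature

# Randomly initialized (untrained) weights, for shape illustration only.
params = {
    "g_w1": rng.normal(0, 0.1, (noise_dim + cond_dim, hidden)), "g_b1": np.zeros(hidden),
    "g_w2": rng.normal(0, 0.1, (hidden, horizon)), "g_b2": np.zeros(horizon),
    "d_w1": rng.normal(0, 0.1, (horizon + cond_dim, hidden)), "d_b1": np.zeros(hidden),
    "d_w2": rng.normal(0, 0.1, (hidden, 1)), "d_b2": np.zeros(1),
}

z = rng.normal(size=(batch, noise_dim))
c = rng.normal(size=(batch, cond_dim))  # placeholder for a flattened, normalized NWP sequence
fake = generator(z, c, params)          # (batch, horizon) generated scenarios
score = discriminator(fake, c, params)  # (batch, 1) probability of being real
print(fake.shape, score.shape)
```

Holding c fixed and redrawing z yields multiple scenarios for the same forecast, which is how a trained model of this type would produce a scenario set for one dispatch day.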

The second is spatiotemporal feature coding-based embedding. For multi-station joint SG, the spatial distribution characteristics of multi-site NWP are encoded via CNN/GNN, and then embedded into the generator and discriminator as conditional inputs. Ref. [132] proposes a Dynamic Spatiotemporal Graph GAN (DSTG-GAN), which adopts Chebyshev Graph Convolution to extract time-varying graph structural features and dynamically encode the spatial dependencies among multi-station sites. The results demonstrate that this method significantly improves the quality of generated scenarios. Ref. [133] uses an undirected graph to characterize the dependencies among PV power stations, with each station acting as a node and the inter-station correlation as edges. It further develops an Adaptive GCN to capture the hidden spatiotemporal dependencies among stations. A sparse spatiotemporal attention mechanism is introduced to filter out weak correlations, ensuring the model focuses on strongly correlated spatial distribution features of NWP data and effectively addresses the spatial correlation mismatching problem in traditional methods.

The third is advanced variant optimization. To address the inherent training instability of the vanilla CGAN, NWP-integrated variants such as WGAN-GP have become research hotspots. Ref. [134] integrates the Wasserstein distance, gradient penalty (GP) and the CGAN to solve the training instability problem of the traditional CGAN, and validates the proposed method using real-world renewable energy data and forecast data. The results demonstrate that this method can effectively capture the uncertainty and volatility associated with renewable energy.
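
For orientation, the conditional WGAN-GP objective is commonly written in the following form, using the notation G(z, c) and D(x, c) introduced above. This is the standard formulation from the WGAN-GP literature extended with the condition c, and it is given here as an assumption; the exact loss used in Ref. [134] may differ. The penalty weight \lambda and the interpolation variable \epsilon are symbols introduced for this illustration.

```latex
L_{D} = \mathbb{E}_{z}\left[ D(G(z, c), c) \right] - \mathbb{E}_{x}\left[ D(x, c) \right] + \lambda \, \mathbb{E}_{\hat{x}}\left[ \left( \left\| \nabla_{\hat{x}} D(\hat{x}, c) \right\|_{2} - 1 \right)^{2} \right]

\hat{x} = \epsilon x + (1 - \epsilon) G(z, c), \quad \epsilon \sim U[0, 1]

L_{G} = - \mathbb{E}_{z}\left[ D(G(z, c), c) \right]
```

The first two terms of L_{D} estimate the Wasserstein distance between real and generated scenarios under the same NWP condition, while the third term penalizes critic gradients whose norm deviates from 1, which is the mechanism that stabilizes training.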

CVAE realizes NWP-based SG by taking NWP as the conditional variable and embedding it into both the encoder and the decoder at the same time, with core advantages of stable training and strong interpretability. Ref. [41] adopts an advanced CVAE, whose encoder modules are trained on historical wind speed observations, multivariate outputs (including wind speed, wind direction, temperature, atmospheric pressure, and humidity) from the NWP model, as well as spatiotemporal encodings. This model is used to generate probabilistic hub-height wind speed forecasts for power output prediction. The results demonstrate that the proposed method achieves superior wind speed forecasting performance from both the deterministic perspective (with reduced RMSE) and the probabilistic perspective (with lowered CRPS).

Diffusion models, with their superior full distribution fitting capability, have become the current research hotspot in NWP-conditional SG. The core logic is to embed NWP conditions into each step of the denoising U-Net via the cross-attention mechanism, to achieve conditional denoising and SG under NWP constraints. At present, the mainstream NWP embedding and optimization path for this model is temporal conditional attention embedding. Under this technical framework, the NWP time series is encoded into a conditional vector through an LSTM or transformer, and then integrated into the up-sampling and down-sampling modules of the U-Net via cross-attention. Ref. [135] proposes the Time Series-specific Diffusion Transformer (TimeDiT), which integrates modular conditional mechanisms to separately fuse NWP input, noise, and temporal features. The results demonstrate that this method consistently achieves superior performance in terms of deterministic metrics and probabilistic accuracy, delivering better forecast calibration and sharper distribution characterization.
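
The cross-attention step described above can be sketched in a few lines. This is a single-head illustration with random, untrained projection matrices; the dimensions and variable names are assumptions, and residual connections, normalization, and the output projection found in practical denoisers are omitted.

```python
import numpy as np


def softmax(a, axis=-1):
    """Numerically stable softmax."""
    a = a - a.max(axis=axis, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(features, condition, w_q, w_k, w_v):
    """Single-head scaled dot-product cross-attention.

    features:  (L, d_model) hidden states of the noisy scenario inside the denoiser
    condition: (M, d_cond)  encoded NWP sequence
    """
    q = features @ w_q    # queries come from the denoiser features
    k = condition @ w_k   # keys come from the NWP encoding
    v = condition @ w_v   # values come from the NWP encoding
    weights = softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)  # (L, M)
    return weights @ v, weights


rng = np.random.default_rng(0)
L, M, d_model, d_cond, d_attn = 24, 24, 8, 6, 8
features = rng.normal(size=(L, d_model))   # placeholder denoiser features
condition = rng.normal(size=(M, d_cond))   # placeholder NWP encoding
w_q = rng.normal(size=(d_model, d_attn))
w_k = rng.normal(size=(d_cond, d_attn))
w_v = rng.normal(size=(d_cond, d_attn))

out, weights = cross_attention(features, condition, w_q, w_k, w_v)
print(out.shape, weights.shape)
```

Each row of the attention weights shows how strongly one time step of the scenario being denoised draws on each time step of the NWP sequence, which is what allows the condition to steer every denoising step.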

7. Conclusions

Although significant achievements have been made in research on SG methods, the following key challenges remain:

(1): SG models exhibit a strong dependence on the quality and quantity of historical data. The performance of most methods, especially data-driven deep generative models (GANs, VAEs, diffusion models), heavily relies on large volumes of complete, high-quality historical data. In practical applications, remote areas often face difficulties such as data scarcity, high noise levels, or incomplete records, which severely constrain model training and generalization capabilities.
(2): There is insufficient model interpretability and physical consistency. Methods represented by deep learning commonly suffer from the “black-box” problem. Their internal decision-making mechanisms are opaque, and the generated scenarios lack clear physical explanations. This makes it difficult to determine whether the generated results violate fundamental physical laws.
(3): SG models inadequately characterize extreme scenarios and transitional scenarios. Efficiently generating multi-timescale scenarios ranging from seconds to years within a unified framework and achieving dynamic coupling with multi-physical processes such as meteorology and power grid power flows still requires further breakthroughs.
(4): The scenario evaluation system is not comprehensive and lacks unified standards. Existing evaluation practices mostly focus on single-dimensional metrics, failing to form a systematic framework that integrates output-based, distribution-based, and event-based evaluation. Meanwhile, there is no unified standard for selecting and weighting evaluation metrics, leading to inconsistent evaluation results of the same SG model in different studies, which hinders the comparison and optimization of various SG methods.

To address the aforementioned challenges, future research and practice should focus on the following key directions and development trends:

(1): Future research will aim to develop “few-shot” or “zero-shot” generation techniques. This involves utilizing transfer learning, meta-learning, and physics-informed enhancements. These approaches can reduce reliance on historical data and improve the physical plausibility of generated scenarios.
(2): Model interpretability should be enhanced by designing latent spaces with clear physical meaning, developing interpretable generative architectures, and constructing hybrid modeling frameworks that combine the powerful fitting capabilities of deep learning with the physical interpretability of traditional models.
(3): With the increasing integration of high proportions of renewable energy, the research focus will shift from single-energy, single-site scenario generation to joint scenario generation for “wind–PV–storage–load” multi-energy coupled, multi-region interconnected systems. This requires models to simultaneously capture the complex spatiotemporal complementarity and competition among resources.
(4): Scenario evaluation metrics should be optimized and innovated. This includes improving the adaptability of correlation metrics to nonlinear, asymmetric dependencies in multi-energy systems, and developing targeted evaluation metrics for extreme and transitional scenarios. End-to-end evaluation methods directly linked to the performance of power system optimization tasks (such as day-ahead scheduling and reserve allocation) should also be explored, so that the evaluation results can better guide the practical application of generated scenarios.
(5): Future research will focus on developing end-to-end task-aligned SG methodology and its closed-loop integration with power system operation control. This involves constructing customized SG methods for core downstream tasks including VPP optimal scheduling, real-time frequency control and chance-constrained optimization, and realizing the joint optimization of SG models and downstream optimization/control frameworks. Meanwhile, this direction will explore the deep fusion of SG technology and model predictive control (MPC) for power system real-time regulation, realizing the dynamic update of scenario sets and rolling optimization of control strategies in the closed-loop process, so as to fundamentally solve the mismatch between general SG scenarios and specific task requirements, and improve the robustness of renewable energy control under uncertainty.
(6): Future research will advance distributed SG technology for multi-agent VPP scenarios and establish a unified engineering application benchmark system for SG methods. This includes combining federated learning and privacy-preserving mechanisms to develop distributed SG methods adapted to the multi-agent structure of VPP, which can support the distributed optimization and coordinated control of VPP while protecting the data privacy of each market entity. In addition, this direction will build a unified benchmark system covering typical downstream scenarios such as long-term planning, day-ahead scheduling, intra-day optimization and real-time control, to provide a quantitative basis for the selection of SG methods in different engineering applications, and promote the standardized engineering application of SG technology.

In summary, the field of renewable SG is evolving from pursuing performance improvements with single methods towards constructing a new generation of generative paradigms. These paradigms should be data-efficient, physically consistent, highly interpretable, and capable of supporting decision-making for complex systems. Addressing the existing challenges relies on interdisciplinary collaboration, deeply integrating meteorology, power system analysis, and artificial intelligence to jointly advance this technology and provide more solid and reliable support for the energy transition.

Author Contributions

Conceptualization, T.M. and B.Q.; methodology, T.M. and B.Q.; investigation, S.H. and Y.S.; resources, S.H.; writing—original draft preparation, T.M.; writing—review and editing, S.H.; visualization, T.M.; supervision, Y.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Basic Research Program of Qinghai Province (2024-ZJ-929Q).

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Li, C.Z.; Umair, M. Does green finance development goals affect renewable energy in China. Renew. Energy 2023, 203, 898–905.
Quan, H.; Khosravi, A.; Yang, D.; Srinivasan, D. A Survey of Computational Intelligence Techniques for Wind Power Uncertainty Quantification in Smart Grids. IEEE Trans. Neural Netw. Learn. Syst. 2020, 31, 4582–4599.
Wen, Y.; Alhakeem, D.; Mandal, P.; Shantanu, C. Performance Evaluation of Probabilistic Methods Based on Bootstrap and Quantile Regression to Quantify PV Power Point Forecast Uncertainty; IEEE: Piscataway, NJ, USA, 2020.
Li, X.; Song, J.; Ma, Y.; Zhu, Z.; Liu, H.; Wei, C. Capacity planning for hydro-wind-photovoltaic-storage systems considering high-dimensional uncertainties. Energy Inform. 2025, 8, 3.
Qin, B.; Wang, H.; Liao, Y.; Li, H.; Ding, T.; Wang, Z.; Li, F.; Liu, D. Challenges and opportunities for long-distance renewable energy transmission in China. Sustain. Energy Technol. Assess. 2024, 69, 103925.
Wang, H.; Qin, B.; Su, Y.; Li, F.; Hong, S.; Ding, T. Coordinated planning of mobile electric-hydrogen energy storage for remote power system resilience enhancement. J. Energy Storage 2026, 147, 120160.
Li, B.; Tan, Y.; Wu, A.G.; Duan, G.R. A Distributionally Robust Optimization Based Method for Stochastic Model Predictive Control. IEEE Trans. Autom. Control (T-AC) 2022, 67, 15.
Li, J.; Zhang, J. Maximum likelihood identification of dual-rate Hammerstein output-error moving average system. IET Control Theory Appl. 2020, 14, 1089–1101.
Ceseña, E.A.M.; Mancarella, P. Energy systems integration in smart districts: Robust optimisation of multi-energy flows in integrated electricity, heat and gas networks. IEEE Trans. Smart Grid 2018, 10, 1122–1131.
Xu, X.; Lin, Z.; Li, X.; Shang, C.; Shen, Q. Multi-objective robust optimisation model for MDVRPLS in refined oil distribution. Int. J. Prod. Res. 2022, 60, 6772–6792.
Qin, B.; Hong, S.; Wang, H.; Zhao, J.; Li, H.; Chen, P.; Ding, T. Non-isothermal Dynamic Model and Collaborative Optimization for Multi-energy System Considering Pipeline Energy Storage. J. Energy Storage 2026, 141, 119083.
Li, J.; Zhou, J.; Chen, B. Review of wind power scenario generation methods for optimal operation of renewable energy systems. Appl. Energy 2020, 280, 115992.
Wang, H.; Qin, B.; Hong, S.; Cai, Q.; Li, F.; Ding, T.; Li, H. Optimal planning of hybrid hydrogen and battery energy storage for resilience enhancement using bi-layer decomposition algorithm. J. Energy Storage 2025, 110, 115367.
Qin, B.; Wang, H.; Li, F.; Liu, D.; Liao, Y.; Li, H. Towards zero carbon hydrogen: Co-production of photovoltaic electrolysis and natural gas reforming with CCS. Int. J. Hydrogen Energy 2024, 78, 604–609.
Zhou, W.; Chu, W.; Tian, Y.; Fei, T.; Ding, Y.; Tang, X.; Li, K. Autonomous driving scenario generation based on neural radiance fields or 3D Gaussian splatting: State-of-the-art investigations, reviews, and perspectives. Chin. J. Mech. Eng. 2026, 39, 100093.
Luo, F.; Qiu, X.; Wang, S.; Xu, Z. Long-term scenario generation for distribution network loads based on interpretable diffusion models. Int. J. Electr. Power Energy Syst. 2026, 174, 111491.
Jin, Y.; Xing, H.; Zhu, H.; Li, Z.; Wu, C. A solar radiation data generation method for solar energy utilization scenarios: BIPV generation forecasting as a case study. Renew. Energy 2026, 259, 124772.
Guo, P.; Cheng, X.; Min, W.; Zeng, X.; Sun, J. A Climate-Informed Scenario Generation Method for Stochastic Planning of Hybrid Hydro–Wind–Solar Power Systems in Data-Scarce Regions. Energies 2025, 19, 74.
Ursachi, M.T.; Dascalu, I.M. Curriculum to Immersion: A Conceptual Framework of Artificial Intelligence-Assisted Scenario Generation in Extended Reality for Primary and Secondary Education. Electronics 2025, 14, 4955.
Zheng, K.; Sun, Z.; Song, Y.; Zhang, C.; Zhang, C.; Chang, F.; Yang, D.; Fu, X. Stochastic Scenario Generation Methods for Uncertainty in Wind and Photovoltaic Power Outputs: A Comprehensive Review. Energies 2025, 18, 503.
Hanoot, A.K.A.; Mokhlis, H.; Mekhilef, S.; Alghoul, M.; Aqil, M.A.; Alhanut, M. Monte Carlo simulation for real-world energy yield analysis of car park solar PV system installations in harsh environments. Results Eng. 2025, 28, 106996.
Hakimi, F. Robust estimation with Latin Hypercube Sampling: A Central Limit Theorem for Z-estimators. J. Stat. Plan. Inference 2026, 243, 106374.
Chao, H.; Hu, B.; Xie, K.; Tai, H.; Yan, J.; Li, Y. A Sequential MCMC Model for Reliability Evaluation of Offshore Wind Farms Considering Severe Weather Conditions. IEEE Access 2019, 7, 132552–132562.
Haghi, H.V.; Lotfifard, S. Spatiotemporal modeling of wind generation for optimal energy storage sizing. IEEE Trans. Sustain. Energy 2014, 6, 113–121.
Stian, B.; Mohammadreza, A.; Asgeir, T. Stable stochastic capacity expansion with variable renewables: Comparing moment matching and stratified scenario generation sampling. Appl. Energy 2021, 302, 117538.
Kabir, A.; Cecilia, O.; Sylvia, L. Assessment of Wind Energy Potential in North Central Nigeria Using Weibull Distribution for Sustainable Energy Planning and Generation. Phys. Sci. Int. J. 2026, 30, 12–23.
Liu, J.; Xiong, G.; Fu, X.; Mohamed, A.W. Estimating the best-fit parameters of Weibull distribution with numerical methods for wind energy assessment: A case study in China. Energy Strategy Rev. 2026, 63, 102017.
Yoon, H.P.; Lazar, M.; Salem, C.; Seough, J.; Martinović, M.M.; Klein, G.K.; López, A.R. Boundary of the Distribution of Solar Wind Proton Beta versus Temperature Anisotropy. Astrophys. J. 2024, 969, 77.
Kumar, M.; Tyagi, B. Multi-variable constrained non-linear optimal planning and operation problem for isolated microgrids with stochasticity in wind, solar, and load demand data. IET Gener. Transm. Distrib. 2020, 14, 2181–2190.
Zhao, J.; Wang, C.; Fan, N. Continuous-time Markov processes on infinite dimensional hypercube. Infin. Dimens. Anal. Quantum Probab. Relat. Top. 2026; prepublish.
Xie, J.; Wang, X.; Li, X.; Cai, J.; Zhao, Y. A Data-Driven Approach for Reliability Assessment and Remaining Life prediction of aero-engines based on a time-transformed bivariate wiener process and Copula dependency structure. Aerosp. Sci. Technol. 2026, 168, 111257.
Benitez, B.I.; Singh, G.J. A comprehensive review of machine learning applications in forecasting solar PV and wind turbine power output. J. Electr. Syst. Inf. Technol. 2025, 12, 54.
Fong, Y.; Fung, A.S. Review and analysis of wind and solar PV farms power outputs to meet Ontario hourly electricity demand with optimal sizing of PV farms, wind farms, and energy storage systems. E3S Web Conf. 2025, 629, 05006.
Yang, M.; Li, J.; Sun, J.; Xu, J.; Li, J. Robust Optimal Scheduling of EHG-IES Based on Uncertainty of Wind Power and PV Output. Int. Trans. Electr. Energy Syst. 2022, 2022, 6587478.
Matevosyan, J.; Soder, L. Minimization of imbalance cost trading wind power on the short-term power market. IEEE Trans. Power Syst. 2006, 21, 1396–1404.
Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial networks. Commun. ACM 2020, 63, 139–144.
Doersch, C. Tutorial on variational autoencoders. arXiv 2016, arXiv:1606.05908.
Koike, Y.; Nakagawa, T.; Waida, H.; Kanamori, T. Noiseless Diffusion-GAN: Scaling-based data augmentation for generative models. Neural Netw. 2026, 197, 108458.
Swathi, B.; Rao, J.B.D. Automated image inpainting for historical artifact restoration using hybridisation of transfer learning with deep generative models. Sci. Rep. 2026, 16, 4810.
Liu, J.; Yang, G.; Li, X.; Hao, S.; Guan, Y.; Li, Y. A deep generative model based on CNN-CVAE for wind turbine condition monitoring. Meas. Sci. Technol. 2023, 34, 035902.
Salazar, A.A.S.; Zhang, J.; Che, Y.; Xiao, F. Deep generative model for probabilistic wind speed and wind power estimation at a wind farm. Energy Sci. Eng. 2022, 10, 1855–1873.
Liu, C.; Xu, W.; Ni, L.; Chen, H.; Hu, X.; Lin, H. Development of a sensitive simultaneous analytical method for 26 targeted mycotoxins in coix seed and Monte Carlo simulation-based exposure risk assessment for local population. Food Chem. 2024, 435, 137563.
Xu, S.; Zhang, Q.; Wang, D.; Huang, X. Uncertainty Quantification of Compressor Map Using the Monte Carlo Approach Accelerated by an Adjoint-Based Nonlinear Method. Aerospace 2023, 10, 280.
Tranos, D.M. Is the Monte Carlo search method efficient for a paleostress analysis of natural heterogeneous fault-slip data? An example from the Kraishte area, SW Bulgaria. J. Struct. Geol. 2018, 116, 178–188.
Dariush, B.; Hamid, K.; Mehdi, S. Evaluating the effect of wind turbine faults on power using the Monte Carlo method. Wind Energy 2022, 25, 935–951.
Özdenizci, O.; Legenstein, R. Restoring vision in adverse weather conditions with patch-based denoising diffusion models. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 10346–10357.
Shi, Q.; Li, F.; Teja, K.; Olama, M.M.; Dong, J.; Wang, X.; Chris, W. Resilience-Oriented DG Siting and Sizing Considering Stochastic Scenario Reduction. IEEE Trans. Power Syst. 2021, 36, 3715–3727.
Sun, Y.; Zhang, J.; Li, Z.; Tian, W.; Shahidehpour, M. Stochastic Scheduling of Battery-Based Energy Storage Transportation System with the Penetration of Wind Power. IEEE Trans. Sustain. Energy 2017, 8, 135–144.
Nabergoj, D.; Štrumbelj, E. Empirical evaluation of normalizing flows in Markov chain Monte Carlo. Mach. Learn. 2025, 114, 282.
Zhang, K.; Wu, H.; Zhang, J.; Wu, S. Efficient Markov chain Monte Carlo sampling for Bayesian inverse problems with covariance matrix adaptation. J. Hydrol. 2025, 663, 134235.
Majee, S.; Abhishek, A.; Strauss, T.; Khan, T. MCMC-Net: Accelerating Markov Chain Monte Carlo with neural networks for inverse problems. Inverse Probl. 2025, 41, 095013.
Papaefthymiou, G.; Klockl, B. MCMC for Wind Power Simulation. IEEE Trans. Energy Convers. 2008, 23, 234–240.
Li, J.; Li, J.; Wen, J.; Cheng, S.; Xie, H.; Yue, C. Generating wind power time series based on its persistence and variation characteristics. Sci. China Technol. Sci. 2014, 57, 2475–2486.
Boschini, M.; Gerosa, D.; Crespi, A.; Falcone, M. “LHS in LHS”: A new expansion strategy for Latin hypercube sampling in simulation design. SoftwareX 2025, 31, 102294.
Xue, C.; Bai, X. Probabilistic carbon emission flow calculation of power system with Latin Hypercube Sampling. Energy Rep. 2025, 14, 751–765.
Siti, N.B.; Wira, I.F.R.; Mohammad, F.; Winda, W. Utilization of quantile mapping method using cumulative distribution function (CDF) to calibrated satellite rainfall GSMaP in Majalaya watershed. In IOP Conference Series: Earth and Environmental Science; IOP Publishing: Bristol, UK, 2023; Volume 1165.
Tang, C.; Wang, Y.; Xu, J.; Sun, Y.; Zhang, B. Efficient scenario generation of multiple renewable power plants considering spatial and temporal correlations. Appl. Energy 2018, 221, 348–357.
Liu, Y.; Gao, S.; Cui, H.; Yu, L. Probabilistic load flow considering correlations of input variables following arbitrary distributions. Electr. Power Syst. Res. 2016, 140, 354–362. [Google Scholar] [CrossRef]
Chen, Q.; Zuo, L.; Wu, C.; Bu, Y.; Huang, Y.; Chen, F.; Chen, J. Supply adequacy assessment of the gas pipeline system based on the Latin hypercube sampling method under random demand. J. Nat. Gas Sci. Eng. 2019, 71, 102965. [Google Scholar] [CrossRef]
Roy, T.P.; Jofre, L.; Jouhaud, J.; Cuenot, B. Versatile sequential sampling algorithm using Kernel Density Estimation. Eur. J. Oper. Res. 2020, 284, 201–211. [Google Scholar] [CrossRef]
Inah, I.O.; Akuru, B.U.; Sotenga, Z.P. Modelling South Africa’s carbon-peak trajectories through a Decoupling–Markov Chain–Monte Carlo (D-MCMC) energy–economic transition framework. Results Eng. 2026, 29, 108979. [Google Scholar] [CrossRef]
Raju, A. Applications of Markov chains in climate change modelling: A comprehensive review of advances, challenges, and future directions. Ecol. Model. 2026, 514, 111470. [Google Scholar] [CrossRef]
Wang, C.; Yang, H.; Ni, L.; Bian, X.; Wang, D.; Liang, Y. Risk assessment method for stator transposition bar of AC generator based on fault dataset and Markov chain. Eng. Sci. Technol. Int. J. 2026, 73, 102249. [Google Scholar] [CrossRef]
Mohammad, R.; Mokhtar, B.; Mauro, C.; Rachid, C. Stochastic optimization and Markov chain-based scenario generation for exploiting the underlying flexibilities of an active distribution network. Sustain. Energy Grids Netw. 2023, 34, 100999. [Google Scholar] [CrossRef]
Sanjari, M.J.; Gooi, H.B. Probabilistic Forecast of PV Power Generation Based on Higher Order Markov Chain. IEEE Trans. Power Syst. 2017, 32, 2942–2952. [Google Scholar] [CrossRef]
Xiong, H.; Mamon, R. A higher-order Markov Chain-modulated model for electricity spot-price dynamics. Appl. Energy 2019, 233–234, 495–515. [Google Scholar] [CrossRef]
Liu, M.; Xiong, Y.; Li, Q.; Murad, A.A.M.; Zhong, W. Higher-Order Markov Chain-Based Probabilistic Power Flow Calculation Method Considering Spatio-Temporal Correlations. Energies 2025, 18, 1058. [Google Scholar] [CrossRef]
Avila, P.M.D.; Rodríguez, R.A.; Torres, S.E.; Rodríguez, S.A.; Lino, G.T.; Bolaños, O.R. Verification of the Short-Term Forecast of the Wind Speed for the Gibara II Wind Farm according to the Prevailing Synoptic Situation Types. Environ. Sci. Proc. 2023, 27, 25. [Google Scholar] [CrossRef]
Joakim, M.; Dennis, M.D.V.; Joakim, W. Very short-term load forecasting of residential electricity consumption using the Markov-chain mixture distribution (MCM) model. Appl. Energy 2021, 282, 116180. [Google Scholar] [CrossRef]
Munkhammar, J. Very short-term probabilistic and scenario-based forecasting of solar irradiance using Markov-chain mixture distribution modeling. Sol. Energy Adv. 2024, 4, 100057. [Google Scholar] [CrossRef]
Terzi, B.T.; Üçüncü, O. Probabilistic Risk Assessment of Meteorological and Hydrological Droughts with Copula Functions: A Multivariate Framework. Water Resour. Manag. 2026, 40, 61. [Google Scholar] [CrossRef]
Wu, C.; Ren, C.; Jin, J.; Zhou, Y.; Nie, B.; Bai, X.; Cui, Y.; Tong, F.; Zhang, L. C-Vine Copulas Function and Conditional Quantile Regression Coupling Model for Agricultural Drought Prediction Analysis. Water Resour. Manag. 2026, 40, 59. [Google Scholar] [CrossRef]
Orcel, O.; Sergent, P. Sea level rise: Using copula function to get structure crest level rating from its average overtopping discharge. LHB 2025, 111, 2579896. [Google Scholar] [CrossRef]
Dusson, G.; Klüppelberg, C.; Friesecke, G. Copula methods for modeling pair densities in density functional theory. J. Chem. Phys. 2025, 162, 144109. [Google Scholar] [CrossRef]
Yoo, J.; Son, Y.; Yoon, M.; Choi, S. A Wind Power Scenario Generation Method Based on Copula Functions and Forecast Errors. Sustainability 2023, 15, 16536. [Google Scholar] [CrossRef]
Camal, S.; Teng, F.; Michiorri, A.; Kariniotakis, G.; Badesa, L. Scenario generation of aggregated Wind, Photovoltaics and small Hydro production for power systems applications. Appl. Energy 2019, 242, 1396–1406. [Google Scholar] [CrossRef]
Qiu, Y.; Li, Q.; Pan, Y.; Yang, H.; Chen, W. A scenario generation method based on the mixture vine copula and its application in the power system with wind/hydrogen production. Int. J. Hydrogen Energy 2019, 44, 5162–5170. [Google Scholar] [CrossRef]
Wang, Z.; Wang, W.; Liu, C.; Wang, B. Forecasted Scenarios of Regional Wind Farms Based on Regular Vine Copulas. J. Mod. Power Syst. Clean Energy 2020, 8, 77–85. [Google Scholar] [CrossRef]
Xu, Y.; Yuan, Y. Analysis of Aggregated Wind Power Dependence Based on Optimal Vine Copula. In IEEE Innovative Smart Grid Technologies—Asia (ISGT Asia); IEEE: Piscataway, NJ, USA, 2019; pp. 1788–1792. [Google Scholar] [CrossRef]
Raik, B. Generation of Time-Coupled Wind Power Infeed Scenarios Using Pair-Copula Construction. IEEE Trans. Sustain. Energy 2018, 9, 1298–1306. [Google Scholar] [CrossRef]
Luo, Z.; Liu, C.; Liu, S. A Novel Fault Prediction Method of Wind Turbine Gearbox Based on Pair-Copula Construction and BP Neural Network. IEEE Access 2020, 8, 91924–91939. [Google Scholar] [CrossRef]
Vagropoulos, I.S.; Kardakos, G.E.; Simoglou, K.C.; Bakirtzis, G.A.; Catalão, P.S.J. ANN-based scenario generation methodology for stochastic variables of electric power systems. Electr. Power Syst. Res. 2016, 134, 9–18. [Google Scholar] [CrossRef]
Cui, M.; Ke, D.; Sun, Y.; Gan, D.; Zhang, J.; Hodge, B.M. Wind Power Ramp Event Forecasting Using a Stochastic Scenario Generation Method. IEEE Trans. Sustain. Energy 2015, 6, 422–433. [Google Scholar] [CrossRef]
Stappers, B.; Paterakis, N.G.; Kok, K.; Gibescu, M. A Class-Driven Approach Based on Long Short-Term Memory Networks for Electricity Price Scenario Generation and Reduction. IEEE Trans. Power Syst. 2020, 35, 1. [Google Scholar] [CrossRef]
Yang, J.; Zhang, S.; Xiang, Y.; Liu, J.; Liu, J.; Han, X.; Teng, F. LSTM auto-encoder based representative scenario generation method for hybrid hydro-PV power system. IET Gener. Transm. Distrib. 2020, 14, 5935–5943. [Google Scholar] [CrossRef]
Gao, X.Z.; Mao, J.A.; Chen, Z.D.; Song, T.Y. A Wind Farm Capacity Credibility Calculation Method Based on Parabola. Appl. Mech. Mater. 2014, 472, 953–957. [Google Scholar] [CrossRef]
Wangdee, W.; Billinton, R. Probing the Intermittent Energy Resource Contributions from Generation Adequacy and Security Perspectives. IEEE Trans. Power Syst. 2012, 27, 2306–2313. [Google Scholar] [CrossRef]
Jeon, J.; Taylor, W.J. Short-term density forecasting of wave energy using ARMA-GARCH models and kernel density estimation. Int. J. Forecast. 2016, 32, 991–1004. [Google Scholar] [CrossRef]
Han, J.; Hao, S.; Chen, S. Prediction Study of Wind Power Generation Power Based on Arima Model. Int. J. New Dev. Eng. Soc. 2025, 9, 350–360. [Google Scholar] [CrossRef]
Jiang, C.; Mao, Y.; Chai, Y.; Yu, M.; Tao, S. Scenario Generation for Wind Power Using Improved Generative Adversarial Networks. IEEE Access 2018, 6, 62193–62203. [Google Scholar] [CrossRef]
Ning, C.; You, F. Deep Learning Based Distributionally Robust Joint Chance Constrained Economic Dispatch Under Wind Power Uncertainty. IEEE Trans. Power Syst. 2022, 37, 191–203. [Google Scholar] [CrossRef]
Chen, Y.; Wang, Y.; Kirschen, D.; Zhang, B. Model-Free Renewable Scenario Generation Using Generative Adversarial Networks. IEEE Trans. Power Syst. 2018, 33, 3265–3275. [Google Scholar] [CrossRef]
Li, Y.; Li, J.; Wang, Y. Privacy-Preserving Spatiotemporal Scenario Generation of Renewable Energies: A Federated Deep Generative Learning Approach. IEEE Trans. Ind. Inform. 2022, 18, 2310–2320. [Google Scholar] [CrossRef]
Liu, W.; Wang, Y.; Shi, Q.; Yao, Q.; Wan, H. Multi-Stage Restoration Strategy to Enhance Distribution System Resilience with Improved Conditional Generative Adversarial Nets. CSEE J. Power Energy Syst. 2025, 11, 1657–1669. [Google Scholar] [CrossRef]
Yang, X.; He, H.; Li, J.; Zhang, Y. Toward Optimal Risk-Averse Configuration for HESS with CGANs-Based PV Scenario Generation. IEEE Trans. Syst. Man Cybern. Syst. 2021, 51, 1779–1793. [Google Scholar] [CrossRef]
Zhang, Y.; Ai, Q.; Xiao, F.; Hao, R.; Lu, T. Typical wind power scenario generation for multiple wind farms using conditional improved Wasserstein generative adversarial network. Int. J. Electr. Power Energy Syst. 2020, 114, 105388. [Google Scholar] [CrossRef]
Zhang, G. Renewable Scenario Generation Based on Improved Conditional Generative Adversarial Networks. In Proceedings of the 4th International Symposium on New Energy Technology Innovation and Low Carbon Development (NET-LC), Hangzhou, China, 9–11 May 2025; pp. 85–89. [Google Scholar] [CrossRef]
Wang, H.; Qin, B.; Hong, S.; Xu, X.; Su, Y.; Lu, T.; Ding, T. Enhanced GAN Based Joint Wind-Solar-Load Scenario Generation with Extreme Weather Labelling. IEEE Trans. Smart Grid 2025, 16, 4213–4224. [Google Scholar] [CrossRef]
Zhang, X.; Li, D.; Fu, X. A novel wasserstein generative adversarial network for stochastic wind power output scenario generation. IET Renew. Power Gener. 2024, 18, 3731–3742. [Google Scholar] [CrossRef]
Zhang, X.; Fan, S.; Li, D. Spectral normalization generative adversarial networks for photovoltaic power scenario generation. IET Renew. Power Gener. 2024, 19, E12978. [Google Scholar] [CrossRef]
Liang, J.; Tang, W. Sequence Generative Adversarial Networks for Wind Power Scenario Generation. IEEE J. Sel. Areas Commun. 2020, 38, 110–118. [Google Scholar] [CrossRef]
Yuan, R.; Wang, B.; Mao, Z.; Watada, J. Multi-objective wind power scenario forecasting based on PG-GAN. Energy 2021, 226, 120379. [Google Scholar] [CrossRef]
Pan, Z.; Wang, J.; Liao, W.; Chen, H.; Yuan, D.; Zhu, W.; Fang, X.; Zhu, Z. Data-Driven EV Load Profiles Generation Using a Variational Auto-Encoder. Energies 2019, 12, 849. [Google Scholar] [CrossRef]
Xiao, S.; Liu, Z.; He, X.; Zhang, J.; Liu, H. Wind Power Output Scenario Generation Based on Importance Weighted Auto-encoder. In Proceedings of the 26th International Conference on Electrical Machines and Systems (ICEMS), Zhuhai, China, 5–8 November 2023; pp. 791–796. [Google Scholar] [CrossRef]
Razghandi, M.; Zhou, H.; Erol-Kantarci, M.; Turgut, D. Variational Autoencoder Generative Adversarial Network for Synthetic Data Generation in Smart Home. In IEEE International Conference on Communications (ICC); IEEE: Piscataway, NJ, USA, 2022; pp. 4781–4786. [Google Scholar] [CrossRef]
Rosa de Jesús, D.A.; Mandal, P.; Senjyu, T.; Kamalasadan, S. Unsupervised Hybrid Deep Generative Models for Photovoltaic Synthetic Data Generation. In IEEE Power & Energy Society General Meeting (PESGM); IEEE: Piscataway, NJ, USA, 2021; pp. 1–5. [Google Scholar] [CrossRef]
Xu, C.; Dai, Y.; Xu, P.; Gao, T.; Zhang, J. Wind Power Scenario Generation Based on Denoising Diffusion Probabilistic Model. In IEEE International Conference on Systems, Man, and Cybernetics (SMC); IEEE: Piscataway, NJ, USA, 2023; pp. 4525–4529. [Google Scholar] [CrossRef]
Cai, Y.R.; Zhang, X.; Hu, W.; Ding, R.S.; He, G.C. A scenario generation method for highly volatile renewable energy output in high-altitude areas based on an improved diffusion model. Power Syst. Technol. 2026. [Google Scholar] [CrossRef]
Dong, X.; Mao, Z.; Sun, Y.; Xu, X. Short-Term Wind Power Scenario Generation Based on Conditional Latent Diffusion Models. IEEE Trans. Sustain. Energy 2024, 15, 1074–1085. [Google Scholar] [CrossRef]
Yan, J.; Li, P.; Huang, Y. A Short-Term Wind Power Scenario Generation Method Based on Conditional Diffusion Model. In IEEE Sustainable Power and Energy Conference (iSPEC); IEEE: Piscataway, NJ, USA, 2023; pp. 1–6. [Google Scholar] [CrossRef]
Li, S.; Xu, C.; Wei, L.; Li, R.; Ai, X. Scenario Generation of Renewable Energy Based on Improved Diffusion Model. In IEEE Sustainable Power and Energy Conference (iSPEC); IEEE: Piscataway, NJ, USA, 2023; pp. 1–7. [Google Scholar] [CrossRef]
Tian, F.; Fan, X.; Wang, R.; Qin, H.; Fan, Y. A Power Forecasting Method for Ultra-Short-Term Photovoltaic Power Generation Using Transformer Model. Math. Probl. Eng. 2022, 2022, 9421400. [Google Scholar] [CrossRef]
Pan, Y.; Wang, Z.; Tan, Z.; Zhu, Z. Interpretable photovoltaic power modeling via Kolmogorov-Arnold network and timeGAN hybrid architecture with regime-aware data augmentation. Sol. Energy 2025, 302, 114022. [Google Scholar] [CrossRef]
Gu, L.; Xu, J.; Ke, D.; Ke, D.; Deng, Y.; Hua, X.; Yu, Y. Short-Term Output Scenario Generation of Renewable Energy Using Transformer–Wasserstein Generative Adversarial Nets-Gradient Penalty. Sustainability 2024, 16, 10936. [Google Scholar] [CrossRef]
Liang, J.; Wang, Q.; Wang, L.; Zhang, Z.; Sun, Y.; Tao, H.; Li, X. Wavelet Transform Convolution and Transformer-Based Learning Approach for Wind Power Prediction in Extreme Scenarios. Comput. Model. Eng. Sci. 2025, 143, 945–965. [Google Scholar] [CrossRef]
Wang, X.; Hu, Z.; Zhang, M. Research on Establishment of Quality Evaluation Framework of Short-Term Wind Power Scenarios. Power Syst. Technol. 2017, 5, 33. [Google Scholar]
Prosper, M.A.; Otero-Casal, C.; Canoura Fernandez, F.; Miguez-Macho, G. Wind power forecasting for a real onshore wind farm on complex terrain using WRF high resolution simulations. Renew. Energy 2019, 135, 674–686. [Google Scholar] [CrossRef]
Gao, Y.; Xue, F.; Yang, W.; Yang, Q.; Sun, Y.; Sun, Y.; Liang, H.; Li, P. Optimal operation modes of photovoltaic-battery energy storage system based power plants considering typical scenarios. Prot. Control Mod. Power Syst. 2017, 2, 36. [Google Scholar] [CrossRef]
Ouyang, T.; Zha, X.; Qin, L. A combined multivariate model for wind power prediction. Energy Convers. Manag. 2017, 144, 361–373. [Google Scholar] [CrossRef]
Pflug, C.G.; Pichler, A. Dynamic generation of scenario trees. Comput. Optim. Appl. 2015, 62, 641–668. [Google Scholar] [CrossRef]
Panni, Y.U.; Donald, C.; Blatnik, A.J.; Williams, M.; Yu, J.; Wise, E.P. Evaluating the Quality of AI-Written Scenarios for Virtual Oral Surgical Board Preparatory Examination. J. Surg. Educ. 2025, 82, 103736. [Google Scholar] [CrossRef]
Chen, H.; Zuo, Y.; Chau, K.T.; Zhao, W.; Lee, C.H.T. Modern electric machines and drives for wind power generation: A review of opportunities and challenges. IET Renew. Power Gener. 2021, 15, 1864–1887. [Google Scholar] [CrossRef]
Hoessly, L. On misconceptions about the Brier score in binary prediction models. Glob. Epidemiol. 2026, 11, 100242. [Google Scholar] [CrossRef]
Ming, H.; Xie, L.; Campi, M.; Garatti, S.; Kumar, P.R. Scenario-based economic dispatch with uncertain demand response. IEEE Trans. Smart Grid 2017, 10, 1858–1868. [Google Scholar] [CrossRef]
Ma, R.; Xu, W.; Liu, S.; Zhang, Y.; Xiong, J. Asymptotic mean and variance of Gini correlation under contaminated Gaussian model. IEEE Access 2016, 4, 8095–8104. [Google Scholar] [CrossRef]
Li, H.; Qin, B.; Wang, S.; Ding, T.; Liu, J.; Wang, H. Aggregate Power Flexibility of Multi-Energy Systems Supported by Dynamic Networks. Appl. Energy 2025, 377, 124565. [Google Scholar] [CrossRef]
Qin, B.; Liu, J.; Wang, H.; Wang, Z.; Xiong, Z.; Wang, M.; Qian, Q. Energy-efficient and reliable urban rail transit: A new framework incorporating underground energy storage systems. iEnergy 2025, 4, 86–97. [Google Scholar] [CrossRef]
Qin, B.; Chen, P.; Zhang, Z.; Wang, H.; Ding, T. Coordinated preventive control strategy for transient overvoltage suppression in hybrid AC/DC sending-side systems. Int. J. Electr. Power Energy Syst. 2025, 171, 111017. [Google Scholar] [CrossRef]
Kim, J.; Shin, H.J.; Lee, K.; Hong, J. Enhancement of ANN-based wind power forecasting by modification of surface roughness parameterization over complex terrain. J. Environ. Manag. 2025, 362, 121246. [Google Scholar] [CrossRef]
Peng, X.; Wang, H.; Lang, J.; Li, W.; Xu, Q.; Zhang, Z.; Cai, T.; Duan, S.; Liu, F.; Li, C. EALSTM-QR: Interval wind-power prediction model based on numerical weather prediction and deep learning. Energy 2020, 220, 119692. [Google Scholar] [CrossRef]
He, G.; Liu, K.; Wang, S.; Lei, Y.; Li, J. CWM-CGAN Method for Renewable Energy Scenario Generation Based on Weather Label Multi-Factor Definition. Processes 2022, 10, 470. [Google Scholar] [CrossRef]
Hu, J.; Cao, Y.; Tan, G. A dynamic spatiotemporal graph generative adversarial network for scenario generation of renewable energy with nonlinear dependence. Energy 2025, 335, 138049. [Google Scholar] [CrossRef]
Yang, Y.; Liu, Y.; Zhang, Y.; Shu, S.; Zheng, J. DEST-GNN: A double-explored spatio-temporal graph neural network for multi-site intra-hour PV power forecasting. Appl. Energy 2025, 378, 124744. [Google Scholar] [CrossRef]
Zhou, Y.; Wang, K.; Huang, X.; Xu, Y.; Wan, X.; Li, W. A Renewable Power Scenario Generation Method Based on Wasserstein Distance for Conditional Generative Adversarial Networks. In 10th Asia Conference on Power and Electrical Engineering (ACPEE); IEEE: Piscataway, NJ, USA, 2025. [Google Scholar] [CrossRef]
Gao, L. Learning Residual Distributions with Diffusion Models for Probabilistic Wind Power Forecasting. Energies 2025, 18, 4226. [Google Scholar] [CrossRef]

Figure 1. A classification diagram of scenario generation methods.

Figure 2. A flowchart of Monte Carlo scenario generation.

Figure 3. A flowchart of LHS SG.

Figure 4. The flowchart of the Markov chain SG steps.

Figure 5. The flowchart of the Copula function scenario generation steps.

Figure 6. The network structure of an artificial neural network.

Figure 7. The network structure of a long short-term memory network.

Figure 8. The network structure of the generative adversarial network.

Figure 9. The network structure of the variational autoencoder.

Figure 10. The network structure of the diffusion model.

Figure 11. The network structure of the self-attention mechanism.

Figure 12. The brief network structure of the transformer.

Figure 13. A classification diagram of evaluation of SG methods.

Table 1. Advantages, disadvantages, and problems solved by the sampling-based methods discussed above.

Method	Advantages	Disadvantages	Problems Solved
Monte Carlo	Simple principle; easy to implement	Convergence speed significantly slower than ARMA; requires more samples to achieve equivalent statistical accuracy	Uncertainties of loads and wind energy [48]
MCMC	Higher temporal quality	Higher computational cost than ARMA; higher parameter estimation complexity than ARMA	Wind power output time series [52]
Latin Hypercube Sampling	Higher sampling efficiency than ARMA and Monte Carlo	Does not directly generate temporal scenarios	Reliability of power systems [55]
Markov Chain	Better fitting of sequential state transition characteristics than ARMA	Limited to modeling short-term dependencies; higher state division complexity than ARMA	Short-circuit fault probability [63]
HMC (higher-order Markov chain)	Richer temporal pattern capture capability than basic MC; better long-term dependency modeling than ARMA	Prone to overfitting; higher computational cost than ARMA	PV power probability distribution prediction [65]
MCM (Markov-chain mixture distribution)	Stronger nonlinear fitting capability than ARMA; avoids parameter explosion of HMC	Higher computational cost and model complexity than ARMA; requires a larger sample size for stable training	Ultra-short-term load prediction [69]
Copula Functions	Captures complex nonlinear correlations	Higher computational cost than ARMA in high-dimensional scenarios	Correlations between different random variables [74]
Multivariate Gaussian Copula	Captures nonlinear dependencies between variables	May not align with non-elliptical dependencies observed in real wind/PV output	Dependence structure between different time points [76]
Vine Copula	Better high-dimensional complex dependency modeling than Gaussian Copula; flexible decomposition of high-dimensional joint distributions	Model structure selection is complex; higher computational cost than ARMA in ultra-high-dimensional scenarios	Spatiotemporal interdependencies of wind power [77]
Pair Copula	Higher high-dimensional modeling efficiency than vine Copula	Higher computational cost than ARMA; requires a large sample size for stable parameter estimation	Wind turbine fault prediction [81]
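
The sampling-efficiency contrast between Monte Carlo and Latin hypercube sampling in Table 1 can be illustrated with a minimal one-dimensional sketch. It is not taken from the reviewed works: the function names, the sample size, and the Weibull wind-speed marginal (scale 8 m/s, shape 2) are assumptions chosen only for illustration. Plain Monte Carlo draws points independently, so some equal-probability strata may stay empty, whereas LHS places exactly one point in each stratum before mapping through the inverse CDF.

```python
import math
import random

def monte_carlo_uniform(n, rng):
    """Plain Monte Carlo: n independent draws from U(0, 1)."""
    return [rng.random() for _ in range(n)]

def latin_hypercube_uniform(n, rng):
    """1-D Latin hypercube: one draw inside each of n equal-probability strata."""
    points = [(i + rng.random()) / n for i in range(n)]
    rng.shuffle(points)  # randomize the order in which strata are visited
    return points

def occupied_strata(points, n):
    """Count how many of the n equal-width strata of [0, 1) contain a point."""
    return len({min(int(p * n), n - 1) for p in points})

def weibull_quantile(u, scale, shape):
    """Inverse CDF of a Weibull distribution (assumed wind-speed marginal)."""
    return scale * (-math.log(1.0 - u)) ** (1.0 / shape)

rng = random.Random(0)
n = 20  # hypothetical number of scenarios
mc = monte_carlo_uniform(n, rng)
lhs = latin_hypercube_uniform(n, rng)

# Map the stratified uniforms to wind speeds (illustrative parameters only).
wind_speeds = [weibull_quantile(u, scale=8.0, shape=2.0) for u in lhs]

print("strata covered by Monte Carlo:", occupied_strata(mc, n))
print("strata covered by LHS:", occupied_strata(lhs, n))
```

Because each LHS point is an independent draw from a marginal distribution, the resulting samples carry no temporal ordering, which is consistent with the table's note that LHS does not directly generate temporal scenarios.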

Table 2. Advantages, disadvantages, and problems solved by the model-based methods discussed above.

Method	Advantages	Disadvantages	Problems Solved
ANN	Strong nonlinear fitting capability; no requirement for sequence stationarity	Heavily reliant on large volumes of high-quality data; poorer long-sequence modeling capability than ARMA	The stochastic process of wind turbine output [83]
LSTM	Far better long-term temporal dependency capture capability than ARMA; superior nonlinear sequence modeling performance	Heavily reliant on large volumes of high-quality data; poorer long-sequence modeling capability than ARMA; black-box model with poor interpretability	Electricity price SG and reduction [84]
ARMA	Simple model, computationally efficient, and highly interpretable	Only applicable to stationary linear sequences; cannot capture nonlinear spatiotemporal correlations	Wind energy conversion systems [86]
ARIMA	Extends ARMA to non-stationary sequences; better trend fitting capability than ARMA	Still based on linear assumption; higher-order identification complexity than ARMA	Stochastic wind power time-series model [84]
GAN	High generation quality and strong diversity; superior nonlinear spatiotemporal correlation capture capability; better extreme event fitting than ARMA	Unstable training process; limited interpretability of generated scenarios; higher computational cost than ARMA	The complex spatial and temporal correlations of wind energy [91]
Fed-LSGAN	Enables privacy-preserving distributed training on decentralized data; better training stability than vanilla GAN	High communication overhead; higher computational cost than ARMA; requires multi-party data coordination	Renewable energy power SG [93]
CGAN	Generation can be controlled by conditional information; higher forecast consistency than GAN	Requires well-defined conditional data; training can be more complex; higher computational cost than GAN	PV power plant planning [95]
WGAN	More stable training than vanilla GAN; mitigates mode collapse; better distribution fitting than ARMA	Critic network requires weight clipping or gradient penalty; may have higher computational cost than GAN	Multi-wind farm wind power SG [96]
SNGAN	Uses spectral normalization to stabilize GAN training	May sometimes sacrifice some generation diversity for stability; higher computational cost than ARMA	PV power plant SG [100]
SeqGAN	Designed for sequential data generation using RL (policy gradient)	Training is complex and can be sample-inefficient	Wind power SG [101]
PgGAN	High-resolution long-sequence generation capability; better progressive detail capture than ARMA	Sequential generation is very slow, not suitable for long sequences; higher training complexity than GAN	Wind power scenario prediction [102]
VAE	Stable training process (no mode collapse risk); continuous structured latent space not available in ARMA; better controllability than GAN	Generated samples are over-smoothed; inferior extreme scenario capture capability to GAN; higher computational cost than ARMA	Electric vehicle charging load profiles [103]
VAE-GAN	Combines VAE’s stable training with GAN’s high-quality generation	More complex architecture; combines the hyperparameter tuning challenges of both models; higher computational cost than VAE and GAN	Synthetic data generation for solar PV systems [106]
Diffusion Models	Far higher scenario fidelity and distribution fitting capability than ARMA; stable training (no mode collapse)	Extremely high computational cost and slower sampling speed than ARMA and GAN	High-variability output scenarios for new energy sources in high-altitude regions [108]
Conditional Diffusion Model	Precise controllable generation capability not available in ARMA; better NWP conditional embedding than CGAN	Higher training complexity than basic diffusion model; requires paired condition-data samples	Short-term wind energy SG [109]
Improved Diffusion Model	Faster sampling speed than basic diffusion model; maintains high generation quality; better computational efficiency	May introduce approximation errors; requires tuning the trade-off between speed and fidelity; higher computational cost than ARMA and GAN variants	Mapping of wind power, PV output, and multi-dimensional load data [111]
Transformer-KAN	Better long-sequence modeling capability than ARMA; better nonlinear fitting than LSTM; higher forecasting accuracy	High computational cost; complex hyperparameter tuning; poor interpretability	PV power generation forecasting [113]
Transformer-WGAN	Excellent spatiotemporal correlation capture; more stable training than vanilla GAN; better long-sequence performance than ARMA	High computational complexity; complex tuning; limited heterogeneous data generalization	Short-term renewable energy output SG [114]
WTC-Transformer	Effective extreme feature capture; strong long-sequence modeling; high forecasting robustness	Poor parameter adaptability; high inference latency; prone to small-sample overfitting	Wind power forecasting under extreme scenarios [115]
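
Because ARMA serves as the comparison baseline throughout Table 2, a minimal sketch of ARMA-based scenario generation may help fix ideas. It is an illustration under stated assumptions, not the procedure of any cited work: the ARMA(1,1) recursion x_t = φ·x_{t−1} + ε_t + θ·ε_{t−1} is used to simulate forecast errors, and the point forecast, the coefficients (φ = 0.8, θ = 0.2, σ = 0.05), and the clipping to the per-unit range [0, 1] are all hypothetical choices.

```python
import random

def simulate_arma11(n_steps, phi, theta, sigma, rng, burn_in=100):
    """Simulate x_t = phi*x_{t-1} + eps_t + theta*eps_{t-1} with Gaussian noise."""
    if abs(phi) >= 1:
        raise ValueError("stationarity requires |phi| < 1")
    x_prev, eps_prev = 0.0, 0.0
    path = []
    for t in range(burn_in + n_steps):
        eps = rng.gauss(0.0, sigma)
        x = phi * x_prev + eps + theta * eps_prev
        if t >= burn_in:  # discard the transient caused by the zero start
            path.append(x)
        x_prev, eps_prev = x, eps
    return path

def generate_scenarios(forecast, n_scenarios, phi, theta, sigma, seed=0):
    """Add simulated ARMA(1,1) errors to a point forecast, clipped to [0, 1] p.u."""
    rng = random.Random(seed)
    scenarios = []
    for _ in range(n_scenarios):
        errors = simulate_arma11(len(forecast), phi, theta, sigma, rng)
        scenarios.append([min(1.0, max(0.0, f + e))
                          for f, e in zip(forecast, errors)])
    return scenarios

# Hypothetical per-unit wind power point forecast for six time steps.
forecast = [0.30, 0.35, 0.42, 0.50, 0.55, 0.52]
scenarios = generate_scenarios(forecast, n_scenarios=5,
                               phi=0.8, theta=0.2, sigma=0.05)
for s in scenarios:
    print([round(v, 3) for v in s])
```

The recursion is linear and needs only three parameters, which reflects the simplicity and interpretability listed for ARMA in the table; the same linearity is why it cannot reproduce the nonlinear spatiotemporal correlations that the deep generative models target.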

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
