1. Introduction
In recent years, with the continued advancement of the “dual carbon” strategy, wind power, as a clean renewable energy source, has reached increasing penetration in power systems and become a key pillar of the new-type power system [1]. However, because wind power output depends strongly on meteorological conditions, it is highly random and volatile. The resulting forecasting errors create discrepancies between the reported schedule and actual generation, which increases the operational risk and cost of system dispatch [2].
To address the uncertainty of wind power, early studies predominantly adopted deterministic optimization for day-ahead scheduling, in which the forecasted power is submitted directly as the schedule [3]. Although this approach is structurally simple, it ignores the deviation risk caused by forecasting errors and often leads to large-scale wind curtailment or substantial penalty costs. To improve the feasibility of scheduling plans, some studies have introduced chance-constrained optimization frameworks that limit the probability of wind power schedule deviations at a specified confidence level [4]. Reference [5] proposed a hydro–wind–thermal joint dispatching method based on the confidence interval of wind power output. By modeling wind power uncertainty through the confidence interval and leveraging the regulation capabilities of hydropower and thermal power, the method accommodates wind power uncertainty while minimizing system costs.
Meanwhile, robust optimization methods have been increasingly applied to wind power dispatch models to enhance the system’s resilience against adverse scenarios [6]. Reference [7] proposed a robust modeling approach for multiple wind farms based on high-dimensional non-parametric Copula functions, which effectively captures the correlation structure of wind power outputs. Reference [8] integrated an improved Wasserstein metric with a combined heat and power (CHP) coordination mechanism to balance the conservativeness and economic efficiency of system scheduling. Building on this, Reference [9] proposed a piecewise stochastic robust optimization approach that partitions the wind power uncertainty set and introduces piecewise linear decision rules to coordinate automatic generation control (AGC) and power-to-gas (P2G) resources, thereby improving real-time dispatch efficiency under wind power uncertainty. Although robust optimization handles extreme risks well, its inherent conservativeness may limit the economic performance of dispatch strategies.
To strike a balance between risk control and profit maximization, multi-scenario optimization methods have been extensively studied and applied in wind power scheduling [10]. By constructing multiple representative wind power output scenarios, this approach incorporates the randomness of forecasting errors into the scheduling model, thereby enhancing the stability and flexibility of the dispatch strategy. Reference [11] addressed the uncertainty of load response by proposing a two-stage multi-scenario demand response strategy based on forecast deviation compensation; its two-stage prediction-adjustment framework effectively mitigates the impact of forecasting errors on plan execution. Reference [12] considered the coordinated regulation capabilities of industrial loads and energy storage systems and developed a multi-timescale optimization model for high wind power penetration, improving both system flexibility and the economic efficiency of wind power integration. Reference [13] further incorporated environmental wind conditions and proposed an efficient scenario generation method that integrates scenario screening with incremental risk control, enhancing the robustness and computational efficiency of chance-constrained wind power scheduling models. Reference [14] proposed a rolling intraday scheduling strategy for wind/storage systems based on multi-scenario synergy. By adjusting energy storage dispatch in real time in response to wind power fluctuations, the method enhances operational flexibility and short-term economic benefits.
Accurate modeling of wind power forecast errors and their dependencies is essential for generating realistic multi-scenario datasets. Traditional parametric models, such as Beta, Lévy-stable, or Gamma-like distributions, often fail to capture the skewness, heavy tails, and multimodal characteristics observed in actual data [15]. To address these limitations, this study adopts kernel density estimation (KDE) for flexible marginal distribution modeling without predefined distributional assumptions. In addition, Copula-based methods are employed to characterize the spatio-temporal dependencies among forecast errors more accurately than traditional linear correlation models, providing a more realistic foundation for scenario generation [16].
Alongside efforts to improve the executability of wind power schedules, energy storage systems have been integrated into the dispatch framework as flexible regulation resources. Reference [17] proposed a frequency control strategy based on state-of-charge (SOC) regulation to enable coordinated operation between wind and storage systems. Reference [18] developed a fluctuation characteristic analysis and typical day clustering model based on actual wind and solar output data, and proposed an energy storage capacity optimization strategy aimed at smoothing power fluctuations. However, most existing studies regard energy storage primarily as a regulation resource, without fully exploring its arbitrage potential under time-of-use pricing mechanisms. Moreover, current scheduling models often fail to incorporate the initial SOC of storage systems into the joint optimization framework.
On the other hand, although multi-scenario approaches can improve the average profit of dispatch plans, they remain exposed to revenue decline under extreme scenarios. Therefore, Conditional Value-at-Risk (CVaR) has been introduced into power dispatch optimization models as a risk measure to quantify tail risk [19]. Reference [20] incorporated CVaR into a wind power investment equilibrium model under multiple market trading mechanisms to characterize the risk preferences on the power production side. Reference [21] introduced CVaR into the scheduling model of an integrated energy system with uncertain wind power, enabling the quantification and management of risks caused by wind power volatility. However, existing studies that incorporate CVaR into dispatch models mainly focus on the formulation and constraint of the risk metric itself and pay insufficient attention to how energy storage systems participate in risk mitigation. In particular, the lack of joint modeling of the initial state-of-charge and time-of-use price response strategies limits the ability of wind/storage systems to coordinate profitability and robustness.
While existing studies have explored wind/storage scheduling through multi-scenario optimization, demand-side coordination, or rolling intraday control, many of them assume a fixed SOC and do not explicitly incorporate risk constraints. In contrast, this study focuses on day-ahead scheduling and introduces a unified framework that integrates non-parametric modeling of forecast errors, Copula-based scenario generation, initial SOC optimization, and CVaR-based risk control. This approach captures both the economic and risk dimensions of decision-making under uncertainty, distinguishing it from prior works that either overlook initial SOC flexibility or treat uncertainty in a simplified manner. The main contributions of this study are as follows:
- (1) Based on historical wind power forecasting errors, kernel density estimation is employed to fit the marginal distributions, while a Gaussian Copula function is used to capture the temporal correlation among forecasting errors across different time periods. This enables the generation of wind power output scenarios with sequential dependence. Representative scenarios are then extracted using K-means clustering to enhance scenario representativeness and modeling accuracy;
- (2) The initial SOC of the energy storage system is introduced as a decision variable in the scheduling framework. By jointly considering electricity sales revenue, operation and maintenance costs, penalties for schedule deviations, and time-of-use pricing, the model enables economic arbitrage through charging and discharging during peak and off-peak price periods, thereby improving overall system profit and operational flexibility;
- (3) A CVaR-based risk control mechanism is incorporated into the multi-scenario optimization framework to effectively constrain profit losses under extreme scenarios, thereby enhancing the robustness and risk controllability of the scheduling strategy under high uncertainty.
The remainder of this paper is organized as follows:
Section 2 introduces the modeling of wind power forecasting errors and the generation of representative scenarios.
Section 3 develops the day-ahead optimization scheduling model for the wind/storage system, with a detailed description of the objective function, constraints, and the CVaR-based risk control mechanism.
Section 4 presents case studies based on multi-scenario wind power outputs, in which simulation experiments are conducted under different optimization strategies, initial SOC settings, and CVaR confidence levels.
Section 5 concludes the paper and outlines directions for future research.
2. Multi-Scenario Generation Based on Wind Power Forecasting Errors
The data used in this study are obtained from the SCADA system of a wind farm in Jiangsu Province, covering a typical month (30 days). The scheduling horizon is 24 h with a 10 min sampling interval, giving 144 intervals per day and 4320 time points over the month. The 10 min interval is consistent with common industry practice, offering sufficient resolution to capture power fluctuations while keeping the computation tractable.
Figure 1 presents a comparison between the forecasted and actual wind power outputs for the entire month. The forecasting errors derived from the forecasted and actual outputs over the first 29 days are used for statistical analysis and multi-scenario modeling. The forecasted power on the 30th day serves as the input for scheduling optimization.
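As a minimal sketch of this data split (not the authors' code), the monthly 10 min series can be reshaped into a day-by-interval array, with the first 29 days kept for error modeling and Day 30 reserved for scheduling. The function name `split_month` and the placeholder input are illustrative assumptions.

```python
import numpy as np

INTERVALS_PER_DAY = 24 * 60 // 10  # 144 ten-minute intervals in 24 h
N_DAYS = 30                        # one typical month, as in the text


def split_month(series):
    """Reshape a flat monthly series into (days, intervals) and split it.

    Returns (history, day30): the first 29 days and the final day.
    """
    series = np.asarray(series, dtype=float)
    expected = N_DAYS * INTERVALS_PER_DAY  # 4320 time points
    if series.size != expected:
        raise ValueError(f"expected {expected} points, got {series.size}")
    daily = series.reshape(N_DAYS, INTERVALS_PER_DAY)
    return daily[:-1], daily[-1]


# Placeholder input: the index 0..4319 stands in for real SCADA data.
history, day30 = split_month(np.arange(4320))
```

Here `history` has shape (29, 144) and `day30` has shape (144,), which matches the 29-day modeling window and the single scheduling day described above.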
2.1. Statistical Analysis of Wind Power Forecasting Errors
To characterize wind power forecasting errors, statistical analysis is first conducted based on historical data from 144 time intervals per day over a one-month period. Let the forecasted wind power at time interval t of a given day be denoted as $P_t^{f}$, and the actual wind power be denoted as $P_t^{a}$. The wind power forecasting error is then defined as follows:
$$e_t = P_t^{a} - P_t^{f} \tag{1}$$
Based on the historical data from the first 29 days, the forecasting errors are grouped by time interval to form an error sample set for each time period, denoted as $E_t = \{e_{t,1}, e_{t,2}, \ldots, e_{t,d}\}$, where $t = 1, 2, \ldots, 144$, $e_{t,i}$ is the error at interval $t$ on day $i$, and $d = 29$ is the number of historical days.
To describe the distribution characteristics of wind power forecasting errors at different time intervals, statistical analysis is further conducted on the error samples of each interval to compute the sample mean $\mu_t$ and sample standard deviation $\sigma_t$. The corresponding formulas are as follows:
$$\mu_t = \frac{1}{d}\sum_{i=1}^{d} e_{t,i} \tag{2}$$
$$\sigma_t = \sqrt{\frac{1}{d-1}\sum_{i=1}^{d}\left(e_{t,i} - \mu_t\right)^{2}} \tag{3}$$
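A minimal NumPy sketch of these per-interval statistics is given here for illustration. It assumes the error is taken as actual minus forecast, which is consistent with Section 2.4 where errors are added to the forecast, and it assumes the unbiased (d − 1) form of the sample standard deviation. The function name and the synthetic data are hypothetical.

```python
import numpy as np


def error_statistics(forecast, actual):
    """Per-interval forecasting error statistics.

    forecast, actual: arrays of shape (days, intervals).
    Returns (errors, mean, std), where mean and std have shape (intervals,).
    """
    forecast = np.asarray(forecast, dtype=float)
    actual = np.asarray(actual, dtype=float)
    errors = actual - forecast          # assumed sign: actual minus forecast
    mean = errors.mean(axis=0)          # sample mean over days, per interval
    std = errors.std(axis=0, ddof=1)    # sample std with d - 1 denominator
    return errors, mean, std


# Synthetic stand-in for 29 days x 144 intervals of wind farm data.
rng = np.random.default_rng(0)
forecast = rng.uniform(20.0, 80.0, size=(29, 144))
actual = forecast + rng.normal(0.0, 5.0, size=(29, 144))
errors, mu, sigma = error_statistics(forecast, actual)
```

Grouping by column means each of the 144 intervals gets its own 29-sample error set, as described above.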
2.2. Kernel Density Estimation for Error Distribution Fitting
To characterize the distribution of wind power forecasting errors across different time intervals, this study employs the kernel density estimation (KDE) method to fit the error samples derived from the historical data of the first 29 days [22]. Compared with traditional parametric distribution fitting methods, KDE, as a non-parametric technique, can adapt to the underlying distribution of the data without assuming any specific distribution form in advance. The estimated probability density of the forecasting error at time interval $t$ is given by:
$$\hat{f}_t(e) = \frac{1}{dh}\sum_{i=1}^{d} K\!\left(\frac{e - e_{t,i}}{h}\right) \tag{4}$$
where $d$ is the number of historical sample days, $e_{t,i}$ is the forecasting error at interval $t$ on day $i$, and $K(\cdot)$ is the kernel function. In this study, the Gaussian function is selected, which is defined as follows:
$$K(u) = \frac{1}{\sqrt{2\pi}}\exp\!\left(-\frac{u^{2}}{2}\right) \tag{5}$$
In Equation (4), $h$ denotes the bandwidth parameter, which directly affects the smoothness of the kernel density estimate. To determine an appropriate bandwidth, this study adopts Silverman’s rule of thumb [23], which defines the bandwidth as follows:
$$h = 1.06\,\sigma\,d^{-1/5} \tag{6}$$
where $\sigma$ is the standard deviation of the forecasting error sample set, and $d$ is the number of historical samples used for error distribution estimation. If the bandwidth is too small, the KDE curve becomes overly spiky and sensitive to noise; conversely, if the bandwidth is too large, the estimated distribution becomes overly smooth and may lose important detail.
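The Gaussian-kernel KDE with a Silverman bandwidth can be sketched in a few lines of NumPy. This is an illustrative implementation under the assumption that the rule of thumb takes the common form h = 1.06·σ·d^(−1/5); the function names and the synthetic sample are hypothetical and not taken from the study.

```python
import numpy as np


def silverman_bandwidth(samples):
    """Rule-of-thumb bandwidth, assumed form: h = 1.06 * sigma * d**(-1/5)."""
    samples = np.asarray(samples, dtype=float)
    return 1.06 * samples.std(ddof=1) * samples.size ** (-1 / 5)


def gaussian_kde_pdf(x, samples, h=None):
    """Evaluate a Gaussian-kernel density estimate at the points x."""
    samples = np.asarray(samples, dtype=float)
    x = np.atleast_1d(np.asarray(x, dtype=float))
    if h is None:
        h = silverman_bandwidth(samples)
    u = (x[:, None] - samples[None, :]) / h          # scaled distances
    kernel = np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)  # Gaussian kernel K(u)
    return kernel.sum(axis=1) / (samples.size * h)


# Synthetic errors for a single time interval (29 historical days).
rng = np.random.default_rng(0)
samples = rng.normal(0.0, 5.0, size=29)
grid = np.linspace(-40.0, 40.0, 2001)
pdf = gaussian_kde_pdf(grid, samples)
```

Passing a smaller or larger `h` explicitly is a quick way to see the under- and over-smoothing behavior described above.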
In addition, to validate the fitting performance of kernel density estimation, two typical parametric approaches, namely the normal distribution and the t-distribution, are selected for comparison. The normal distribution fit derives the probability density function from the sample mean and standard deviation of the historical forecasting errors, estimated via maximum likelihood. The t-distribution fit accounts for the heavy-tailed characteristics of the data, making it more suitable for modeling extreme forecasting errors [24].
Figure 2 presents a comparison between the histogram of forecasting errors from the first 29 days and the fitted curves obtained from the three methods.
As shown in Figure 2, the normal distribution curve fits well near zero forecasting error, but its accuracy deteriorates in regions with larger errors. Although the t-distribution curve improves on the normal distribution to some extent, its unimodal nature prevents it from capturing the multimodal patterns or local density fluctuations present in the actual data. In contrast, kernel density estimation, as a non-parametric method, is not constrained by any specific distributional assumption and adapts the fitted curve to the actual data characteristics. It captures the skewness, multimodality, and heavy-tailed features of the forecasting error data, and the KDE curve matches the error histogram more closely than either the normal or the t-distribution fit.
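The qualitative comparison above can be illustrated numerically on synthetic data. The sketch below is an assumption-laden stand-in, not the study's evaluation: it draws a bimodal sample, fits a normal distribution by maximum likelihood and a Gaussian KDE with the rule-of-thumb bandwidth, and compares them by mean held-out log-likelihood (a metric chosen here for illustration). The t-distribution is omitted because fitting it requires a numerical optimizer.

```python
import numpy as np


def normal_logpdf(x, mu, sigma):
    """Log-density of a normal distribution with mean mu and std sigma."""
    return -0.5 * np.log(2 * np.pi * sigma**2) - (x - mu) ** 2 / (2 * sigma**2)


def kde_logpdf(x, samples):
    """Log-density of a Gaussian KDE with bandwidth 1.06 * sigma * d**(-1/5)."""
    d = samples.size
    h = 1.06 * samples.std(ddof=1) * d ** (-1 / 5)
    u = (x[:, None] - samples[None, :]) / h
    dens = np.exp(-0.5 * u**2).sum(axis=1) / (d * h * np.sqrt(2 * np.pi))
    return np.log(dens)


rng = np.random.default_rng(42)


def draw(n):
    """Synthetic bimodal 'errors': an equal mix of two normal components."""
    left = rng.normal(-6.0, 1.5, size=n)
    right = rng.normal(5.0, 2.0, size=n)
    return np.where(rng.random(n) < 0.5, left, right)


train, held_out = draw(400), draw(400)

# Normal fit by maximum likelihood (ddof=0 gives the MLE of sigma).
ll_normal = normal_logpdf(held_out, train.mean(), train.std()).mean()
ll_kde = kde_logpdf(held_out, train).mean()
```

On a sample with two well-separated modes, a higher held-out log-likelihood for the KDE reflects the same limitation of unimodal fits that Figure 2 shows visually.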
2.3. Error Correlation Modeling Based on Copula Functions
In practical wind power scenarios, forecasting errors not only exhibit random fluctuations within individual time intervals but also demonstrate significant temporal correlation. Traditional single-period probabilistic modeling methods fail to capture the interdependence among forecasting errors across different time intervals, which in turn compromises the accuracy of subsequent scenario generation and system scheduling. To address this limitation, this study introduces the Copula function to construct a joint probabilistic model that captures the dependency structure among wind power forecasting errors across multiple time intervals [25].
A Copula function describes the dependency structure among multivariate random variables. It allows for the separation of marginal distributions from the joint distribution, thus avoiding rigid assumptions about the form of the data distribution and offering broad applicability. According to Sklar’s theorem, for any set of random variables, the joint distribution can be expressed as a combination of the marginal distributions and a Copula function:
$$F(x_1, x_2, \ldots, x_n) = C\left(F_1(x_1), F_2(x_2), \ldots, F_n(x_n)\right) \tag{7}$$
where $F(x_1, x_2, \ldots, x_n)$ is the joint distribution function of the n-dimensional random variables, $F_i(x_i)$ is the marginal distribution function of the i-th variable, and $C(\cdot)$ denotes the Copula function.
Among various Copula functions, the Gaussian Copula is widely used to characterize the dependence structure among complex random variables due to its advantages of simple parameter estimation, clear functional form, and strong adaptability [26]. Therefore, the Gaussian Copula is selected to model the temporal correlation of wind power forecasting errors. Its mathematical formulation is given as follows:
$$C(u_1, u_2, \ldots, u_n; \Sigma) = \Phi_{\Sigma}\left(\Phi^{-1}(u_1), \Phi^{-1}(u_2), \ldots, \Phi^{-1}(u_n)\right) \tag{8}$$
where $\Phi_{\Sigma}$ is the joint cumulative distribution function (CDF) of an n-dimensional standard normal distribution with covariance matrix $\Sigma$; $\Phi^{-1}$ is the inverse of the standard normal CDF; and $u_i$ denotes the marginal cumulative probability of the forecasting error at time interval i.
To accurately capture the temporal correlation characteristics of wind power forecasting errors, a joint probability distribution model is constructed using the Gaussian Copula function. The specific implementation steps are as follows:
- 1. Marginal Distribution Transformation
First, kernel density estimation is applied to the historical forecasting error data from the first 29 days for each time interval, yielding the empirical distribution function of forecasting errors at each time step. Based on this, the original forecasting error data are transformed into a unified [0,1] probability space using their respective empirical cumulative distribution functions, thereby forming a set of standardized random variables with consistent marginal distributions. This transformation eliminates the influence of scale differences among error data on the modeling of correlation structures;
- 2. Gaussian Copula Parameter Estimation
Next, to determine the parameters reflecting the temporal correlation of errors, the standardized data in the [0,1] space are further transformed into the standard normal space using the inverse of the standard normal distribution function. Then, the covariance matrix parameter of the Gaussian Copula function is estimated using the maximum likelihood estimation method. This allows the model to capture the intrinsic temporal dependence among forecasting errors across different time intervals;
- 3. Forecasting Error Scenario Generation
Finally, based on the parameters of the Gaussian Copula model, stochastic scenarios of wind power forecasting errors are generated. The generation process consists of two stages. First, using the established Gaussian Copula parameters, random samples are drawn from the joint distribution in the standard normal space. Then, the sampled data are mapped to the [0,1] probability space through the standard normal CDF and back to the original error space by applying the inverse of the fitted empirical cumulative distribution function (CDF) for each time interval. In this way, a set of candidate wind power forecasting error scenarios is constructed, from which the wind power output scenarios are built.
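The three steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the marginal CDF is the integral of a Gaussian-kernel KDE with a rule-of-thumb bandwidth, the copula matrix is estimated by the sample correlation of the normal scores (a common stand-in for full maximum likelihood), and the inverse marginal CDF is obtained by interpolation on a grid. All function names and the synthetic AR(1) errors are hypothetical.

```python
import numpy as np
from statistics import NormalDist

_std_normal = NormalDist()
norm_cdf = np.vectorize(_std_normal.cdf, otypes=[float])      # Phi
norm_ppf = np.vectorize(_std_normal.inv_cdf, otypes=[float])  # Phi^{-1}


def kde_cdf(x, samples, h):
    """CDF of a Gaussian-kernel KDE, evaluated at the 1-D array x."""
    u = (np.asarray(x, dtype=float)[:, None] - samples[None, :]) / h
    return norm_cdf(u).mean(axis=1)


def fit_gaussian_copula(errors):
    """Steps 1-2: map errors to [0,1], then to normal scores, then correlate.

    errors: array of shape (days, intervals).
    Returns (corr, h): copula correlation matrix and per-interval bandwidths.
    """
    d, n_int = errors.shape
    h = 1.06 * errors.std(axis=0, ddof=1) * d ** (-1 / 5)
    u = np.column_stack(
        [kde_cdf(errors[:, t], errors[:, t], h[t]) for t in range(n_int)]
    )
    z = norm_ppf(np.clip(u, 1e-6, 1 - 1e-6))  # guard against 0 and 1
    return np.corrcoef(z, rowvar=False), h


def sample_errors(errors, corr, h, n_scenarios, rng):
    """Step 3: sample in normal space, map to [0,1], invert each marginal."""
    n_int = errors.shape[1]
    z = rng.multivariate_normal(np.zeros(n_int), corr, size=n_scenarios,
                                method="svd")
    u = norm_cdf(z)
    out = np.empty_like(z)
    for t in range(n_int):
        col = errors[:, t]
        grid = np.linspace(col.min() - 4 * h[t], col.max() + 4 * h[t], 512)
        # Numerical inverse CDF: interpolate grid values against CDF values.
        out[:, t] = np.interp(u[:, t], kde_cdf(grid, col, h[t]), grid)
    return out


# Synthetic temporally correlated errors: 29 days, 6 intervals (AR(1)).
rng = np.random.default_rng(1)
d, n_int = 29, 6
noise = rng.normal(size=(d, n_int))
errors = np.empty((d, n_int))
errors[:, 0] = noise[:, 0]
for t in range(1, n_int):
    errors[:, t] = 0.8 * errors[:, t - 1] + 0.6 * noise[:, t]
errors *= 5.0

corr, h = fit_gaussian_copula(errors)
scenarios = sample_errors(errors, corr, h, 500, rng)
```

Only six intervals are used here to keep the example small. With 144 intervals and 29 days, a sample correlation matrix has rank at most 28, so some form of regularization or an SVD-based sampler (as used above) would likely be needed.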
2.4. Multi-Scenario Generation of Wind Power Output
Let the forecasted wind power output for each time interval on Day 30 be denoted as $P_t^{f}$, with a data sampling interval of 10 min. To generate candidate scenarios, S samples are first randomly drawn from the joint distribution constructed using the wind power data from the first 29 days, resulting in a standardized forecasting error matrix. The mathematical expression is given by the following:
$$U = \begin{bmatrix} u_{1,1} & u_{1,2} & \cdots & u_{1,T} \\ u_{2,1} & u_{2,2} & \cdots & u_{2,T} \\ \vdots & \vdots & \ddots & \vdots \\ u_{S,1} & u_{S,2} & \cdots & u_{S,T} \end{bmatrix} \tag{9}$$
where $u_{s,t}$ represents the standardized forecasting error at time interval t in the s-th candidate scenario, and $T = 144$ is the number of time intervals per day. By applying the inverse of the empirical cumulative distribution function $F_t^{-1}$ for each time interval, the standardized sampled values can be transformed into actual forecasting errors:
$$e_{s,t} = F_t^{-1}\left(u_{s,t}\right) \tag{10}$$
By adding the inverse-transformed actual forecasting errors to the forecasted wind power values of each time interval on Day 30, the wind power output scenarios for Day 30 can be constructed as follows:
$$P_{s,t} = P_t^{f} + e_{s,t} \tag{11}$$
Directly using all candidate scenarios may lead to severe data redundancy, making it difficult for the scheduling model to accurately capture the characteristics of wind power uncertainty, and substantially increasing the computational complexity of the optimization problem. Therefore, k-means clustering is employed to reduce the number of scenarios by grouping the S candidate scenarios into K clusters, from which representative wind power output scenarios are extracted [27]. This reduction preserves the main statistical structure of the original scenario set while significantly improving the computational efficiency of the subsequent optimization process. Let $N_k$ denote the number of candidate scenarios in the k-th cluster. The weight assigned to the representative scenario of this cluster is defined as follows:
$$\pi_k = \frac{N_k}{S}, \quad k = 1, 2, \ldots, K \tag{12}$$
Figure 3 illustrates the overall process of the proposed approach, from data acquisition and error modeling to scenario generation, reduction, and scheduling optimization. It outlines the key steps involved in constructing the wind/storage day-ahead scheduling model.
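The scenario construction and reduction steps of Section 2.4 can be sketched as below. This is an illustrative NumPy version of Lloyd's k-means, assuming that the cluster centroid serves as the representative scenario and that its weight is the share of candidate scenarios in the cluster; the study may select representatives differently. The function name and the synthetic forecast and errors are hypothetical.

```python
import numpy as np


def kmeans_reduce(scenarios, k, rng, n_iter=100):
    """Reduce candidate scenarios to k representatives with weights.

    scenarios: array of shape (S, intervals).
    Returns (centroids, weights), where weights[j] = N_j / S.
    """
    scenarios = np.asarray(scenarios, dtype=float)
    n = scenarios.shape[0]

    def assign(centroids):
        # Squared Euclidean distance from every scenario to every centroid.
        dist = ((scenarios[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        return dist.argmin(axis=1)

    # Initialize with k distinct candidate scenarios chosen at random.
    centroids = scenarios[rng.choice(n, size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = assign(centroids)
        updated = centroids.copy()
        for j in range(k):
            members = scenarios[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                updated[j] = members.mean(axis=0)
        if np.allclose(updated, centroids):
            break
        centroids = updated

    labels = assign(centroids)                 # final assignment
    counts = np.bincount(labels, minlength=k)  # N_j for each cluster
    return centroids, counts / n


# Synthetic Day 30 forecast and sampled errors (stand-ins for real data).
rng = np.random.default_rng(7)
forecast = 50.0 + 20.0 * np.sin(np.linspace(0.0, 2.0 * np.pi, 144))
errors = rng.normal(0.0, 5.0, size=(200, 144))
scenarios = forecast[None, :] + errors  # scenario = forecast + error

reps, weights = kmeans_reduce(scenarios, k=5, rng=rng)
```

The returned `weights` sum to one, so they can be used directly as scenario probabilities in an expected-revenue objective.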
5. Conclusions
This paper proposes a day-ahead planning and scheduling model for wind/storage systems that integrates multi-scenario generation with CVaR constraints. Wind power output scenarios are generated using non-parametric modeling combined with Copula functions. The model also incorporates initial SOC optimization and a peak–valley arbitrage mechanism, effectively addressing the scheduling challenges posed by wind power uncertainty.
Simulation results show that the proposed model significantly enhances both economic performance and risk control. Compared with the benchmark without CVaR and arbitrage design, the expected revenue increases by CNY 340.67, the CVaR improves by CNY 420.74, and the minimum scenario revenue rises by CNY 563.95. When the initial SOC is optimized to 0.1, the expected revenue is CNY 189.08 and CNY 310.66 higher than those with SOC values of 0.5 and 0.9, respectively. Under different confidence levels, raising the level from 0.8 to 0.95 leads to a CVaR increase of CNY 739.96 and a minimum revenue gain of CNY 1599.73, at the cost of CNY 681.42 in expected revenue, reflecting a clear trade-off between risk and return. In addition, increasing the number of clusters from 5 to 10 improves both CVaR and expected revenue, while further increasing to 15 results in only a slight revenue gain (CNY 145.89) but a decline in CVaR by CNY 132.49, indicating diminishing returns and potential overfitting.
Although the proposed model improves the economic efficiency and robustness of the wind/storage system, it primarily focuses on day-ahead scheduling and does not fully capture the dynamic response capabilities required during real-time operation. This limitation may affect its adaptability under highly volatile conditions. Future work will aim to incorporate real-time dispatch mechanisms to enhance the responsiveness and practical applicability of the proposed approach.