2.1. Definition of Marginal Distribution
For each geographical location i, a marginal distribution is defined for each random variable Xi (wind power Pw,i, photovoltaic power Ps,i, and load PL,i). The marginal distributions are based on physical models and historical data, as follows:
- (1) Photovoltaic Marginal Distribution
$$F_{s,i}(x) = P(P_{s,i} \le x)$$
In the equation, Fs,i(x) is the cumulative distribution function (CDF) of the photovoltaic output Ps,i at geographical location i, that is, the probability that Ps,i is less than or equal to x, where x is any real number representing a possible value of photovoltaic output. In planning, x can be used to set output thresholds and analyze the system’s performance under different output levels.
The output of the photovoltaic power source at a certain moment is related to the radiation intensity and temperature at that moment, as represented by Equation (2). The actual radiation intensity is calculated using Equation (3).
Pstd is the photovoltaic output under standard illumination of 1000 W/m² and standard temperature of 25 °C. F is the actual radiation intensity received by the photovoltaic power source, αT is the power temperature coefficient, and T is the actual temperature of the photovoltaic power source. ε is the reduction coefficient for the output of the photovoltaic power source due to weather changes such as shading and cloud cover, simulated using a Beta distribution. Fstd is the standard illumination under conditions where cloud cover is not considered and is determined by the local geographic location.
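The photovoltaic model described above can be sketched in Python. The formulas in the code are an assumption: they use a commonly seen form, P = Pstd · (F / 1000) · [1 + αT (T − 25)] with F = ε · Fstd, which matches the symbol definitions here but may differ in detail from Equations (2) and (3). The function name, the Beta shape parameters, and all numeric values are hypothetical.

```python
import random

def pv_output(p_std, f_std, temp, alpha_t=-0.004, beta_a=4.0, beta_b=2.0, rng=random):
    """Sample one photovoltaic output value (same unit as p_std).

    Assumed model, not confirmed by the text:
      F = eps * f_std,  eps ~ Beta(beta_a, beta_b)
      P = p_std * (F / 1000) * (1 + alpha_t * (temp - 25))
    """
    eps = rng.betavariate(beta_a, beta_b)   # weather reduction coefficient in [0, 1]
    irradiance = eps * f_std                # actual radiation intensity F, W/m^2
    power = p_std * (irradiance / 1000.0) * (1.0 + alpha_t * (temp - 25.0))
    return max(power, 0.0)                  # output cannot be negative

random.seed(0)
# Hypothetical site: 50 MW at standard conditions, clear-sky 800 W/m^2, 35 degC.
pv_samples = [pv_output(p_std=50.0, f_std=800.0, temp=35.0) for _ in range(1000)]
```

Sorting `pv_samples` gives an empirical estimate of Fs,i(x) for this hypothetical site.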
- (2) Wind Power Marginal Distribution
$$F_{w,i}(x) = P(P_{w,i} \le x)$$
In the equation, Fw,i(x) is the cumulative distribution function (CDF) of the wind power output at geographical location i, that is, the probability that Pw,i is less than or equal to x, where x is any real number representing a possible value of wind power output. In planning, x can be used to set output thresholds and analyze system performance under different output levels. Pw,i is the power output of the wind farm at location i, reflecting the generation capacity of the wind farm at a given moment, which is affected by the wind speed v(t).
According to the power curve of the wind turbine, the relationship between Pw,i and the wind speed v(t) is a piecewise function:
In the equation, Pw,i(t) is the wind power output at time t; v(t) is the wind speed at time t; vs is the cut-in wind speed of the wind turbine—when the wind speed reaches vs, the turbine meets the grid-connection requirements and can be used for power generation; CP is the wind energy utilization coefficient; Sw is the swept area of the wind turbine rotor; ρ is the air density; vr is the rated wind speed, which is the minimum wind speed at which the wind turbine outputs at rated power; v0 is the cut-out wind speed—when the wind speed exceeds v0, the environmental conditions are no longer suitable for turbine operation, and the turbine shuts down.
The wind speed v(t) typically follows a Weibull distribution. This approach ensures that the model captures the actual fluctuation characteristics of wind power output, such as zero output under low wind speeds and saturation effects at rated wind speed.
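A minimal sketch of the wind model follows, assuming the usual three-segment power curve: zero output below vs and above v0, a cubic segment 0.5 · CP · Sw · ρ · v³ between vs and vr, and constant rated output between vr and v0. Taking the rated power as the cubic value at vr is an assumption made here for continuity. The function name, turbine parameters, and Weibull parameters are hypothetical.

```python
import random

def wind_power(v, v_s=3.0, v_r=12.0, v_0=25.0, c_p=0.4, s_w=5000.0, rho=1.225):
    """Piecewise turbine power curve in watts (assumed form)."""
    if v < v_s or v > v_0:
        return 0.0                      # below cut-in or above cut-out: no output
    v_eff = min(v, v_r)                 # output saturates at the rated wind speed
    return 0.5 * c_p * s_w * rho * v_eff ** 3

random.seed(1)
# random.weibullvariate(alpha, beta): alpha is the scale, beta is the shape.
wind_speeds = [random.weibullvariate(8.0, 2.0) for _ in range(1000)]
wind_powers = [wind_power(v) for v in wind_speeds]
```

Because the curve is flat at zero and at rated power, the resulting distribution of Pw,i has probability mass at both ends, which is the zero-output and saturation behaviour described above.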
- (3) Load Marginal Distribution
In actual power system operation, electricity load demand is influenced by various factors, including seasonal changes, weather conditions, industrial production activities, and people’s lifestyles. These factors lead to significant variability in load. Actual load data often exhibit symmetric fluctuations around the mean. Therefore, in the planning of new-type power systems, load is typically modeled as a random variable that follows a normal distribution:
$$F_{L,i}(x) = P(P_{L,i} \le x) = \Phi\left(\frac{x - \mu_{L,i}}{\sigma_{L,i}}\right)$$
In the equation, FL,i(x) is the cumulative distribution function (CDF) of the load at location i, and PL,i is the electricity demand at geographical location i, which follows a normal distribution. Φ is the standard normal CDF, μL,i is the average load at location i, and σL,i is the standard deviation of the load at location i, which measures its fluctuation amplitude.
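As a small illustration of how the load CDF can be used to set a threshold, the sketch below evaluates a normal CDF with Python's standard library. The mean, standard deviation, and threshold are hypothetical values, not data from this study.

```python
from statistics import NormalDist

mu_l, sigma_l = 100.0, 10.0              # hypothetical mean and std. dev. of load, MW
load_dist = NormalDist(mu_l, sigma_l)

def load_cdf(x):
    """F_L(x): probability that the load is at most x."""
    return load_dist.cdf(x)

# Probability that the load exceeds a 120 MW threshold (two std. devs. above the mean).
p_exceed = 1.0 - load_cdf(120.0)
```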
2.2. Hierarchical Copula Model Construction
After defining the marginal distributions, different types of physical variables are standardized into uniform distribution variables through the probability integral transform. This process converts the heterogeneous distributions of wind power Pw,i, photovoltaic power Ps,i, and load PL,i into uniform distribution variables Ui within the [0,1] interval, eliminating the differences in distribution types between the variables. This allows for the handling of correlations between multiple variables under the Copula framework. The formula is as follows:
$$U_i = F_i(X_i)$$
In the equation, Ui is the uniform distribution variable, representing the standardized probability value at location i. Fi is the marginal distribution function at location i (i.e., Fw,i, Fs,i, or FL,i), and Xi is the original random variable at location i. Us,i, Uw,i, and Ul,i are the standardized uniform distribution variables, representing the normalized photovoltaic power, wind power, and load at location i, respectively.
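The probability integral transform can be sketched as follows, using a normally distributed load as the example marginal. The helper name `pit` and the numeric parameters are hypothetical. The check at the end only confirms that the transformed values lie in [0, 1] and average close to 0.5, as expected for a uniform variable.

```python
import random
from statistics import NormalDist

def pit(samples, cdf):
    """Probability integral transform: map each sample x to U = F(x)."""
    return [cdf(x) for x in samples]

random.seed(2)
load_dist = NormalDist(100.0, 10.0)                        # hypothetical load marginal
loads = [random.gauss(100.0, 10.0) for _ in range(5000)]   # synthetic load data
u_load = pit(loads, load_dist.cdf)
mean_u = sum(u_load) / len(u_load)                         # close to 0.5 if U is uniform
```

The same function applies to wind and photovoltaic power once their CDFs Fw,i and Fs,i are available, for example as empirical CDFs built from historical data.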
A Copula is a function that links the marginal distributions of several variables to their joint distribution, so that the dependence structure among the variables is described separately from the marginals.
To address the complex spatial and temporal relationships among wind power, photovoltaic power, and load, a hierarchical Copula model is adopted.
The main idea of the hierarchical Copula model is to divide the dependencies in the system into multiple levels, with each level of the Copula function representing dependencies at a different scale.
- (1) First-level Copula
Wind power, photovoltaic power, and load are assumed to be locally correlated. This correlation is modeled with the Gaussian Copula, which is suitable for describing data with symmetric dependencies.
In the equation, ρ is the correlation coefficient between wind power and photovoltaic power, and σw and σs are the standard deviations of wind power and photovoltaic power, respectively. CLayer1 is the first-level Copula function, used to capture the local correlation between wind power and photovoltaic power.
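For illustration, a bivariate Gaussian Copula for the wind and photovoltaic pair can be sampled as sketched below. This uses the standard construction (correlated standard normals mapped through the standard normal CDF) and is an assumption about the form of CLayer1, not a reproduction of it. The function name and the value ρ = 0.7 are hypothetical.

```python
import math
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()   # standard normal, used for the CDF mapping

def sample_gaussian_copula(rho, n, rng=random):
    """Draw n pairs (u_w, u_s) from a bivariate Gaussian copula with correlation rho."""
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        # Second normal built so that corr(z1, z2) = rho.
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        pairs.append((_STD_NORMAL.cdf(z1), _STD_NORMAL.cdf(z2)))
    return pairs

random.seed(5)
gauss_pairs = sample_gaussian_copula(0.7, 4000)
```

With a positive ρ, most pairs fall on the same side of 0.5 in both coordinates, and the dependence is equally strong in the lower and upper tails, which is the symmetry referred to above.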
- (2) Second-Level Copula
The second-level Copula is used to describe dependencies over longer time scales, such as the long-term dependencies among wind power, photovoltaic power, and load. To capture these correlations, the Clayton Copula is used, which is suitable for modeling variables with asymmetric dependencies.
In the equation, θ is the dependence parameter in the Clayton Copula, which determines the strength of the correlation between wind power and photovoltaic power. CLayer2 is the second-level Copula, which describes the long-range dependencies among wind power, photovoltaic power, and load.
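The asymmetry of the Clayton Copula can be seen in a short sketch. It assumes the standard bivariate form C(u, v) = (u^(−θ) + v^(−θ) − 1)^(−1/θ) with θ > 0 and samples by conditional inversion; the model in this section may use a multivariate variant. Function names and θ = 2 are hypothetical.

```python
import random

def clayton_cdf(u, v, theta):
    """Standard bivariate Clayton copula CDF (theta > 0)."""
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)

def sample_clayton(theta, n, rng=random):
    """Draw n pairs from a bivariate Clayton copula by conditional inversion."""
    pairs = []
    for _ in range(n):
        u = 1.0 - rng.random()   # uniform on (0, 1], avoids 0 ** negative power
        w = 1.0 - rng.random()   # conditional probability level
        v = ((w ** (-theta / (1.0 + theta)) - 1.0) * u ** -theta + 1.0) ** (-1.0 / theta)
        pairs.append((u, v))
    return pairs

random.seed(6)
clayton_pairs = sample_clayton(2.0, 5000)
# Share of pairs with v < 0.1 among those with u < 0.1 (lower-tail clustering).
low = [(u, v) for u, v in clayton_pairs if u < 0.1]
low_tail_share = sum(v < 0.1 for _, v in low) / len(low)
```

A high `low_tail_share` reflects the lower-tail dependence of the Clayton family: joint low values occur together more often than joint high values.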
Finally, the two levels of Copula functions are combined into a joint Copula to describe the joint distribution of wind power, photovoltaic power, and load. The resulting joint Copula model is as follows:
- (3) Parameter Estimation
To estimate the parameters in the Copula model from data, we use the Maximum Likelihood Estimation (MLE) method to maximize the log-likelihood function and obtain the optimal parameter estimates.
Given sample data {Uw,i, Us,i, Ul,i}, the log-likelihood function is as follows:
$$\ell(\theta) = \sum_{i=1}^{N} \ln c(U_{w,i}, U_{s,i}, U_{l,i} \mid \theta)$$
In the equation, c(Uw,i, Us,i, Ul,i|θ) is the density function of the joint Copula, θ is the parameter to be estimated, and N is the number of samples.
Based on the joint Copula function mentioned above, uniform variables for multiple scenarios are sampled from the joint distribution. The formula is as follows:
The density function of the Clayton Copula used in the second layer is as follows:
By maximizing the log-likelihood function using MLE, the parameter θ is obtained.
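The estimation step can be sketched for the bivariate case. The density used below is the standard bivariate Clayton density, c(u, v) = (1 + θ)(uv)^(−θ−1)(u^(−θ) + v^(−θ) − 1)^(−2−1/θ), which is an assumption about the form used here. A plain grid search stands in for a numerical optimizer, and the synthetic data are generated with a known θ = 2 only to check that the estimate recovers it. All names are hypothetical.

```python
import math
import random

def clayton_log_density(u, v, theta):
    """Log of the standard bivariate Clayton copula density (theta > 0)."""
    return (math.log(1.0 + theta)
            - (theta + 1.0) * (math.log(u) + math.log(v))
            - (2.0 + 1.0 / theta) * math.log(u ** -theta + v ** -theta - 1.0))

def log_likelihood(pairs, theta):
    """Sum of log densities over all sample pairs."""
    return sum(clayton_log_density(u, v, theta) for u, v in pairs)

def fit_clayton(pairs, grid=None):
    """Return the grid value of theta that maximizes the log-likelihood."""
    if grid is None:
        grid = [0.05 * k for k in range(1, 161)]   # candidates 0.05, 0.10, ..., 8.00
    return max(grid, key=lambda t: log_likelihood(pairs, t))

def simulate_clayton(theta, n, rng):
    """Synthetic Clayton pairs via conditional inversion (for the demo only)."""
    out = []
    for _ in range(n):
        u, w = 1.0 - rng.random(), 1.0 - rng.random()
        v = ((w ** (-theta / (1.0 + theta)) - 1.0) * u ** -theta + 1.0) ** (-1.0 / theta)
        out.append((u, v))
    return out

demo_pairs = simulate_clayton(2.0, 1000, random.Random(3))
theta_hat = fit_clayton(demo_pairs)
```

In practice the grid search would usually be replaced by a one-dimensional optimizer, and the pairs would be the transformed historical observations rather than simulated data.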
- (4) Scenario Generation from the Joint Distribution
By using the joint Copula model and maximum likelihood estimation, multiple scenarios can be generated from the joint distribution for risk assessment and system optimization. Based on the estimated joint Copula model, K scenarios are sampled from the joint Copula function, with each scenario corresponding to a set of uniform variables:
$$\left(U_w^{(k)}, U_s^{(k)}, U_l^{(k)}\right), \quad k = 1, 2, \ldots, K$$
In the equation, $U_w^{(k)}$, $U_s^{(k)}$, and $U_l^{(k)}$ are the uniform variables for wind power, photovoltaic power, and load in the k-th scenario, where k = 1, 2, …, K, and K is the total number of scenarios.
Using the Monte Carlo method, K sets of samples are extracted from the joint Copula function, with each sample set representing a possible supply–demand combination. These samples not only reflect the marginal distribution characteristics of each location but also capture the interactions between geographical locations through the spatial correlation matrix R. In the planning of new power systems, this sampling method can generate a diverse range of scenarios, including extreme cases, thus providing comprehensive data support for risk assessment and optimization. The randomness of the Monte Carlo method ensures the statistical representativeness of the samples, while the spatial characteristics of R enhance the physical realism of the samples. Finally, the sampled uniform variables are inverse-transformed into actual physical quantities, forming scenario data that can be used for planning.
$$X_i^{(k)} = F_i^{-1}\left(U_i^{(k)}\right)$$
In the equation, $X_i^{(k)}$ is the physical quantity at location i in the k-th scenario, $F_i^{-1}$ is the inverse function of the marginal distribution, and $U_i^{(k)}$ is the uniform variable at location i in the k-th scenario.
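The inverse transform can be sketched as below. It assumes that the wind and photovoltaic marginals are represented by empirical quantiles of historical data and the load marginal by a normal distribution; these choices, the function names, and the demo data are hypothetical. The uniform triples in the demo are drawn independently as a stand-in for samples from the fitted joint Copula.

```python
import random
from statistics import NormalDist

def empirical_inv_cdf(sorted_hist, u):
    """Approximate empirical quantile: order statistic at index floor(u * n)."""
    idx = min(int(u * len(sorted_hist)), len(sorted_hist) - 1)
    return sorted_hist[idx]

def to_physical(u_triples, wind_hist, pv_hist, load_dist):
    """Map uniform triples (u_w, u_s, u_l) to (wind, PV, load) scenarios."""
    wind_sorted, pv_sorted = sorted(wind_hist), sorted(pv_hist)
    scenarios = []
    for u_w, u_s, u_l in u_triples:
        # NormalDist.inv_cdf requires 0 < p < 1, so clamp away from the ends.
        u_l = min(max(u_l, 1e-12), 1.0 - 1e-12)
        scenarios.append((empirical_inv_cdf(wind_sorted, u_w),
                          empirical_inv_cdf(pv_sorted, u_s),
                          load_dist.inv_cdf(u_l)))
    return scenarios

rng = random.Random(4)
wind_hist = [rng.uniform(0.0, 50.0) for _ in range(500)]   # synthetic history, MW
pv_hist = [rng.uniform(0.0, 30.0) for _ in range(500)]     # synthetic history, MW
# Placeholder uniforms; in the full method these come from the joint Copula.
u_demo = [(rng.random(), rng.random(), rng.random()) for _ in range(100)]
demo_scenarios = to_physical(u_demo, wind_hist, pv_hist, NormalDist(100.0, 10.0))
```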
Finally, K joint scenarios are generated: