Source–load forecasting can support both day-ahead scheduling and intra-day or hourly rolling decision-making in power system operation. Within a unified mathematical framework, the forecasting model developed in this paper can flexibly accommodate operational requirements at different time scales by adjusting the forecasting start time and the length of the forecasting horizon.
2.1. Wind and PV Output Forecasting Considering Spatiotemporal Correlation
Wind and photovoltaic generation outputs exhibit pronounced temporal dependence at the hourly time scale, i.e., the output states of adjacent time periods are statistically dependent. The system load also shows strong temporal correlation and is coupled with wind/PV outputs. Using a Markov chain model, the current output can be characterized based on a finite historical sequence, thereby effectively capturing such temporal dependence.
Conventional Markov-chain sampling methods can only derive the probability distribution of outputs under each state and cannot directly predict specific wind/PV output states. To address this limitation, this paper integrates Markov chains with statistical simulation techniques: output variations are discretized into multiple states, and extensive simulations are conducted at each time instant to generate sample data. The resulting samples are used to estimate state transition probabilities, yielding a time-varying state transition matrix. By jointly considering the state transition matrices across time and the output-variation ranges associated with each state, effective forecasting of wind/PV generation is achieved. The overall procedure is illustrated in Figure 2.
In forecasting, it is necessary not only to predict the source–load outcomes at the next time instant, but also to extend the prediction to multiple consecutive time instants. To this end, a recursive sampling approach can be adopted: the forecast at each time instant is used as the input for forecasting the subsequent instant, thereby forming a set of forecast trajectories.
Assume that the system contains $N_0$ forecasting targets, including wind/PV plants and the aggregate load. Let the power of target $q$ at time $t$ be $P_{q,t}$, and let its reference upper limit be $P_q^{\max}$. The normalized series is defined as follows:
$$x_{q,t} = \frac{P_{q,t}}{P_q^{\max}}$$
where $q$ denotes the index of the forecasting target, $t$ denotes the hourly discrete time instant, $P_{q,t}$ denotes the power of target $q$ at time $t$, $P_q^{\max}$ is the normalization base, and $x_{q,t}$ denotes the normalized power series.
For any forecasting target, the value range of the normalized power $x_{q,t}$ is partitioned into $K_M$ state intervals, and a discrete state variable is used to indicate which interval each time instant belongs to. Let the state boundaries be $b_{q,0} < b_{q,1} < \dots < b_{q,K_M}$; then the $k$-th state interval is defined as follows:
$$I_{q,k} = [\,b_{q,k-1},\ b_{q,k}\,), \quad k = 1, \dots, K_M$$
Based on the above definitions, the discrete state variable $Z_{q,t}$ is defined as follows:
$$Z_{q,t} = k \quad \text{if } x_{q,t} \in I_{q,k}$$
where $K_M$ is the number of Markov-chain states, $b_{q,k}$ denotes the state boundaries, $I_{q,k}$ is the $k$-th state interval, and $Z_{q,t}$ represents the discrete state of target $q$ at time $t$.
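To improve statistical stability and avoid having too few samples in certain states, the state boundaries are determined via empirical quantile binning:
$$b_{q,k} = Q_q\!\left(\frac{k}{K_M}\right), \quad k = 0, 1, \dots, K_M$$
where $Q_q(\cdot)$ is the sample quantile function of the normalized series of target $q$, and $K_M$ is the number of states.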
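The quantile binning and state assignment described above can be sketched as follows. This is a minimal illustration rather than the paper's implementation: the function names, the synthetic beta-distributed series, and the choice of five states are assumptions made for the example.

```python
import numpy as np

def quantile_boundaries(x, n_states):
    """Return n_states + 1 boundaries placed at the empirical quantiles k / n_states."""
    probs = np.linspace(0.0, 1.0, n_states + 1)
    return np.quantile(x, probs)

def discretize(x, boundaries):
    """Map each value to a state index in 1..n_states using right-open intervals."""
    # Only the interior boundaries are searched, so the sample minimum maps to
    # state 1 and the sample maximum maps to the last state.
    return np.searchsorted(boundaries[1:-1], x, side="right") + 1

# Synthetic normalized power series in [0, 1] (assumed data, for illustration only).
rng = np.random.default_rng(0)
x = rng.beta(2.0, 5.0, size=1000)

b = quantile_boundaries(x, n_states=5)
z = discretize(x, b)
counts = np.bincount(z, minlength=6)[1:]   # number of samples per state
```

Because the boundaries sit at equally spaced quantiles, each state receives roughly the same number of samples, which is the statistical-stability property the binning is meant to provide.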
To characterize the intraday periodicity of wind/PV generation and load, a time-segmented transition matrix is adopted by mapping each time instant $t$ to a group index $g(t)$. In this study, hourly grouping is used with $g(t) \in \{1, \dots, 24\}$, and the state-transition frequencies are counted within each hour.
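Let $N^{(g)}_{q,ij}$ denote the number of transitions from state $i$ to state $j$ in group $g$. To avoid zero transition probabilities caused by insufficient samples, a smoothing term $\alpha > 0$ is introduced, and the transition probability is estimated as follows:
$$\hat{\pi}^{(g)}_{q,ij} = \frac{N^{(g)}_{q,ij} + \alpha}{\sum_{l=1}^{K_M} N^{(g)}_{q,il} + K_M\,\alpha}$$
Accordingly, the transition matrix for group $g$ is $\boldsymbol{\Pi}^{(g)}_{q} = \big[\hat{\pi}^{(g)}_{q,ij}\big]_{K_M \times K_M}$.
Here, $g(t)$ is the time-segment index of time $t$; $N^{(g)}_{q,ij}$ is the transition count within group $g$; $\alpha$ is the smoothing coefficient; $\hat{\pi}^{(g)}_{q,ij}$ is the estimated transition probability; $K_M$ is the number of states; and $\boldsymbol{\Pi}^{(g)}_{q}$ is the transition matrix for group $g$.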
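A minimal sketch of the time-segmented, smoothed transition-matrix estimate is given below. Several details are assumptions made for illustration: groups are indexed from 0, each transition from $t$ to $t+1$ is attributed to the group of the origin instant $t$, the smoothing value 0.5 is arbitrary, and the state sequence is synthetic.

```python
import numpy as np

def transition_matrices(z, groups, n_states, n_groups, alpha=1.0):
    """Estimate one additively smoothed transition matrix per time group.

    z      : state sequence with values in 1..n_states
    groups : group index in 0..n_groups-1 for each time instant
    alpha  : additive smoothing term (its value here is an assumption)
    """
    counts = np.zeros((n_groups, n_states, n_states))
    # Count transitions t -> t+1, filed under the group of the origin instant t.
    np.add.at(counts, (groups[:-1], z[:-1] - 1, z[1:] - 1), 1.0)
    smoothed = counts + alpha
    # Normalize each row so that it sums to one.
    return smoothed / smoothed.sum(axis=2, keepdims=True)

# Synthetic hourly state sequence over 60 days (assumed data, for illustration only).
rng = np.random.default_rng(1)
n_hours = 24 * 60
z = rng.integers(1, 5, size=n_hours)      # states 1..4
groups = np.arange(n_hours) % 24          # hour-of-day grouping

Pi = transition_matrices(z, groups, n_states=4, n_groups=24, alpha=0.5)
```

With a positive smoothing term every entry of every matrix is strictly positive, so no state becomes unreachable simply because a transition was never observed in a sparsely populated hour.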
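For multi-step forecasting, a recursive sampling strategy is used to generate a set of state trajectories. Let the number of sampled trajectories be $M$, and the prediction horizon be $H$. For the $m$-th trajectory of target $q$, given $Z^{(m)}_{q,t} = i$, the next-step state is sampled from the corresponding row of the transition matrix:
$$\Pr\!\left(Z^{(m)}_{q,t+1} = j \,\middle|\, Z^{(m)}_{q,t} = i\right) = \hat{\pi}^{(g(t))}_{q,ij}$$
After obtaining the trajectory state $Z^{(m)}_{q,t}$ recursively, a continuous sample is drawn uniformly within the corresponding interval:
$$x^{(m)}_{q,t} \sim U\!\left(b_{q,k-1},\ b_{q,k}\right),\ \ k = Z^{(m)}_{q,t}; \qquad P^{(m)}_{q,t} = x^{(m)}_{q,t}\, P_q^{\max}$$
where $m = 1, \dots, M$ indexes the sampled trajectories; $\hat{\pi}^{(g(t))}_{q,ij}$ is the estimated probability of moving from state $i$ to state $j$ in the time group of $t$; $U(\cdot,\cdot)$ denotes a uniform distribution over an interval; $b_{q,k-1}$ and $b_{q,k}$ are the boundaries of state $k$; $P_q^{\max}$ is the normalization base; and $x^{(m)}_{q,t}$ and $P^{(m)}_{q,t}$ are the normalized and power samples of the $m$-th trajectory at time $t$, respectively.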
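For each target $q$, the sample mean is taken as the temporal forecast, which considers temporal correlation only:
$$\hat{P}^{T}_{q,t} = \frac{1}{M} \sum_{m=1}^{M} P^{(m)}_{q,t}$$
where $P^{(m)}_{q,t}$ is the power sample of the $m$-th of the $M$ trajectories at time $t$.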
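The recursive sampling and averaging steps can be sketched as follows. The transition matrices, state boundaries, starting state, and rated power used here are placeholder values chosen for illustration, and selecting the matrix by the group of the origin instant is an assumption.

```python
import numpy as np

def sample_trajectories(Pi, boundaries, z0, t0, horizon, n_traj, p_max, rng):
    """Recursively sample state paths and convert them to power samples.

    Pi         : array (n_groups, n_states, n_states) of transition matrices
    boundaries : array (n_states + 1,) of normalized state boundaries
    z0, t0     : state (1-based) and time index at the forecast start
    """
    n_groups, n_states, _ = Pi.shape
    z = np.full(n_traj, z0)
    power = np.empty((n_traj, horizon))
    for h in range(horizon):
        g = (t0 + h) % n_groups                     # group of the origin instant
        cdf = np.cumsum(Pi[g, z - 1], axis=1)       # one row CDF per trajectory
        u = rng.random(n_traj)
        # Inverse-CDF draw of the next state; the minimum guards against
        # floating-point rounding in the last CDF entry.
        z = np.minimum((u[:, None] > cdf).sum(axis=1), n_states - 1) + 1
        # Uniform draw inside the sampled state's interval, then rescale to power.
        x = rng.uniform(boundaries[z - 1], boundaries[z])
        power[:, h] = x * p_max
    return power

# Placeholder inputs: 4 states, 24 hourly groups, persistence-dominated transitions.
Pi = np.tile(0.1 * np.ones((4, 4)) + 0.6 * np.eye(4), (24, 1, 1))
boundaries = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
rng = np.random.default_rng(2)

power = sample_trajectories(Pi, boundaries, z0=2, t0=10, horizon=6,
                            n_traj=200, p_max=100.0, rng=rng)
forecast = power.mean(axis=0)   # sample mean over trajectories at each step
```

Each sampled state feeds the next draw, so every row of `power` is one forecast trajectory over the horizon, and `forecast` is the temporal-only point forecast.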
For wind, PV, and load forecasting, correlation exists not only in the temporal dimension but also in the spatial dimension. By performing joint forecasting of wind, PV, and load at different locations to account for spatial correlation, the uncertainties of individual series can be mutually reduced, thereby improving the overall stability and accuracy of the forecasts.
Considering the nonlinear correlation characteristics of wind and PV outputs within the same region, this paper adopts a dynamic C-Copula function to construct a dynamic copula model for the joint wind–PV output in the region. The dynamic C-Copula function features asymmetric tail characteristics, with stronger lower-tail dependence and weaker upper-tail dependence, and is more sensitive to variations in the lower tail of the joint distribution. Therefore, it can effectively characterize the correlation patterns when wind, PV, and load are relatively low [24].
This study uses the Akaike information criterion (AIC) and Bayesian information criterion (BIC) to evaluate the goodness of fit of candidate models, and further uses the maximum log-likelihood (LogL) to compare the fitting performance of dynamic and static models under the same copula family. The definitions of AIC and BIC are given by the following:
$$\mathrm{AIC} = -2\,\mathrm{LogL} + 2 d_j$$
$$\mathrm{BIC} = -2\,\mathrm{LogL} + d_j \ln N_{cp}$$
where $d_j$ is the number of model parameters, $\mathrm{LogL}$ is the maximized log-likelihood value, and $N_{cp}$ is the sample size.
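As a small worked illustration of the two criteria, the sketch below compares a static model with one parameter against a dynamic model with three parameters. The log-likelihood values and the sample size are made-up numbers used only to show the arithmetic; they are not results from this study.

```python
import math

def aic(log_l, n_params):
    """Akaike information criterion: -2 LogL + 2 d."""
    return -2.0 * log_l + 2.0 * n_params

def bic(log_l, n_params, n_samples):
    """Bayesian information criterion: -2 LogL + d ln N."""
    return -2.0 * log_l + n_params * math.log(n_samples)

# Hypothetical fits on the same data (illustrative numbers only).
n_samples = 1000
static_aic = aic(120.0, 1)                 # -238.0
dynamic_aic = aic(131.5, 3)                # -257.0
static_bic = bic(120.0, 1, n_samples)
dynamic_bic = bic(131.5, 3, n_samples)
```

Lower values indicate a better trade-off between fit and complexity. In this made-up case the dynamic model is preferred under both criteria, even though BIC penalizes its two extra parameters more heavily than AIC does.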
Assume that the system contains $n$ wind and PV plants in total. Based on the wind and PV output forecasts generated above for different forecasting horizons, which account for temporal dependence, a nonparametric method is applied to obtain the marginal distribution of wind and PV power. The nonparametric method is based on empirical distributions and nonparametric kernel density estimation, where the empirical distribution function of wind and PV power is used as an approximation of the population distribution. The probability density function of nonparametric kernel density estimation is given by the following equation:
$$f(x) = \frac{1}{N h} \sum_{i=1}^{N} K\!\left(\frac{x - x_i}{h}\right)$$
In this equation, $N$ is the number of samples; $h$ is the smoothing parameter, $h > 0$; $K(\cdot)$ is the kernel function; and $x_i$ is a sample of the random variable $x$. By integrating $f(x)$, the corresponding marginal distribution function can be obtained.
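The kernel density estimate and its integral can be sketched as follows. The text does not state which kernel or bandwidth is used, so the Gaussian kernel, the fixed bandwidth of 0.05, and the synthetic samples are assumptions. With a Gaussian kernel the marginal distribution function has a closed form, namely the average of normal CDFs centred on the samples.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)   # element-wise error function

def kde_pdf(x, samples, h):
    """Kernel density estimate f(x) with a Gaussian kernel (assumed) and bandwidth h."""
    u = (np.asarray(x, dtype=float)[..., None] - samples) / h
    return np.exp(-0.5 * u**2).sum(axis=-1) / (len(samples) * h * math.sqrt(2.0 * math.pi))

def kde_cdf(x, samples, h):
    """Marginal distribution function obtained by integrating the Gaussian KDE."""
    u = (np.asarray(x, dtype=float)[..., None] - samples) / h
    return (0.5 * (1.0 + _erf(u / math.sqrt(2.0)))).mean(axis=-1)

# Synthetic normalized power samples (assumed data, for illustration only).
rng = np.random.default_rng(3)
samples = rng.beta(2.0, 5.0, size=200)

grid = np.linspace(-1.0, 2.0, 1201)
pdf = kde_pdf(grid, samples, h=0.05)
cdf = kde_cdf(grid, samples, h=0.05)
total = pdf.sum() * (grid[1] - grid[0])   # numerical check that f integrates to about 1
```

Evaluating `kde_cdf` at the observed outputs maps them to values in (0, 1), which is the form of input a copula model expects.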
For system load, since it exhibits pronounced intraday periodicity, intraweek periodicity, and seasonal characteristics, this paper models the marginal distribution of load power under a day-ahead rolling forecasting framework using a nonparametric method based on conditional grouping. Specifically, the month, day-type of the week, and hour corresponding to the forecasting time instant are taken as conditioning variables, and historical load samples sharing the same conditioning characteristics as the forecasting time instant are selected to form the corresponding sample set. On this basis, the sample set is modeled using the empirical distribution function or nonparametric kernel density estimation to obtain the marginal distribution function of load under the given conditions. With this approach, the marginal distribution of load power can effectively capture the statistical characteristics of load uncertainty under different time conditions while retaining the advantages of nonparametric modeling.
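The conditional grouping step can be sketched as below. The two-level day type (weekday versus weekend), the function name, and the synthetic two-year hourly load series are assumptions made for illustration; the paper's actual day-type classification may be finer.

```python
import numpy as np

def conditional_ecdf(load, month, day_type, hour, cond):
    """Select samples matching cond = (month, day_type, hour) and build their ECDF."""
    mask = (month == cond[0]) & (day_type == cond[1]) & (hour == cond[2])
    samples = np.sort(load[mask])
    if samples.size == 0:
        raise ValueError("no historical samples match the conditioning variables")

    def ecdf(v):
        # Fraction of conditional samples less than or equal to v.
        return np.searchsorted(samples, v, side="right") / samples.size

    return samples, ecdf

# Two years of hourly timestamps and their conditioning variables.
ts = np.arange("2022-01-01T00", "2024-01-01T00", dtype="datetime64[h]")
hour = ts.astype(np.int64) % 24
days = ts.astype("datetime64[D]").astype(np.int64)
day_type = ((days + 3) % 7 >= 5).astype(int)       # 1970-01-01 was a Thursday; 1 = weekend
month = ts.astype("datetime64[M]").astype(np.int64) % 12 + 1

# Synthetic load with a daily cycle and a weekend reduction (assumed data).
rng = np.random.default_rng(4)
load = (500.0 + 150.0 * np.sin(2.0 * np.pi * (hour - 6) / 24.0)
        - 60.0 * day_type + rng.normal(0.0, 20.0, size=ts.size))

# Marginal distribution for a July weekday at 18:00.
samples, ecdf = conditional_ecdf(load, month, day_type, hour, cond=(7, 0, 18))
```

Finer conditioning gives a more specific marginal distribution but fewer samples per group, so the number of conditioning levels has to be balanced against the length of the available history.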
After obtaining the marginal distribution functions of each wind/PV plant and the load, the C-Copula function is used to connect them pairwise to derive the dynamic correlation coefficients.
In the dynamic copula model, estimating the correlation coefficients is transformed into estimating the parameters in the evolution equation. Using nonparametric kernel density estimation, the output series $w_i$ and $s_i$ are substituted into the evolution equation of the dynamic C-Copula function, which is expressed as follows:
In this equation, the three coefficients are parameters to be estimated, and the logistic function is introduced to keep the dynamic correlation coefficient within its admissible range.
The evolution equation transforms the static $\theta$ into the dynamic $\theta_t$, and the corresponding likelihood function changes from a function of the correlation coefficient $\theta$ to a function of the three evolution-equation parameters. Maximizing the likelihood function yields the maximum log-likelihood LogL and the corresponding parameter estimates. Substituting these estimates together with $w_i$ and $s_i$ into the evolution equation yields the dynamic correlation-coefficient series $\theta_t$. By evaluating all plant pairs, the spatial correlation-coefficient matrix at each time instant can be obtained as follows:
$$\boldsymbol{\Theta}_t = \big[\theta_{ij,t}\big]_{(n+1) \times (n+1)}$$
In this equation, $\theta_{ij,t}$ denotes the correlation coefficient between variable $i$ and variable $j$ at time $t$. The $(n+1)$-th variable corresponds to the total system load of the single region, so that the matrix simultaneously characterizes multi-plant spatial correlation and source–load coupling correlation.
Based on the spatial correlation-coefficient matrix at each time instant, a regression adjustment model considering spatial correlation is constructed to obtain the final forecast outputs at each time instant:
Historical data are used to estimate the parameters of the adjustment model. The regression coefficients can be estimated using the least squares method:
In this equation, $N_{sl}$ is the total number of wind/PV plants and the load. The remaining quantities are the final forecast output of wind/PV Plant 1 in period $t$; the forecast value of the $i$-th wind/PV plant considering only temporal dependence; the spatial correlation coefficient between Plant 1 and the $i$-th wind/PV plant in period $t$; the mean output of the $i$-th wind/PV plant in period $t$; and the standard deviation of the output of the $i$-th wind/PV plant.