Interval Forecasting Method of Aggregate Output for Multiple Wind Farms Using LSTM Networks and Time-Varying Regular Vine Copulas

Abstract: Interval forecasting has become a research hotspot in recent years because it provides richer uncertainty information on wind power output than spot forecasting. However, compared with studies on single wind farms, fewer studies exist for multiple wind farms. To determine the aggregate output of multiple wind farms, this paper proposes an interval forecasting method based on long short-term memory (LSTM) networks and copula theory. The method first uses LSTM networks for spot forecasting and then uses the forecasting error data generated by the LSTM networks to model the conditional joint probability distribution of the forecasting errors for multiple wind farms through a time-varying regular vine copula (TVRVC) model, thereby obtaining the probability interval of the aggregate output under different confidence levels. The proposed method is applied to three adjacent wind farms in Northwest China, and the results show that the forecasting intervals it generates are highly reliable and narrow. Moreover, a comparison with four other methods shows that the proposed method has better forecasting performance because it considers the time-varying correlations among multiple wind farms and uses a spot forecasting model with smaller errors.


Introduction
Owing to its intermittent and stochastic nature, the increasing penetration of wind power poses a major challenge to power grid operation. Interval forecasting is a type of probabilistic forecasting that provides the upper and lower boundaries of wind power output at a given confidence level [1]. Compared with a spot forecast, an interval forecast provides more uncertainty information for solving problems in power grid operation, such as unit commitment optimization, reserve planning and operational risk assessment, and has therefore been widely studied in recent years [2].
The methods of interval forecasting can be divided into three categories: methods based on fitting the probability density function (PDF) of spot forecasting errors, quantile regression methods and bound evaluation methods. Among them, a quantile regression method [3-6] obtains the forecasting intervals by constructing a regression model for multiple quantiles; the complexity of the model increases with the number of quantiles required, and the method is computationally intensive. A bound evaluation method transforms interval forecasting into an optimization problem [7,8] that directly outputs the upper and lower bounds of wind power; this method usually requires machine learning algorithms because the optimization objectives are discontinuous. Moreover, both of these methods need to be remodeled when the required confidence level changes. A method based on fitting the PDF of spot forecasting errors first performs spot forecasting and then calculates forecasting intervals by fitting the PDF of the spot forecasting error [1]. Although this method has more steps than the previous two, the wind power output intervals at all confidence levels can be obtained once the PDF of the forecast errors is modeled. In addition, since most power dispatching departments already have short-term wind power spot forecasting capability, this method is the easiest to implement and for dispatchers to understand, and it is the most widely used.
For the methods based on fitting the PDF of spot forecasting errors, both the spot forecast model and the PDF model affect the forecasting results. Spot forecast models can be mainly categorized into physical models and statistical learning models. A physical model uses numerical weather forecast data and geographic information to predict wind power [9]. A statistical learning model makes predictions by learning from historical data and finding the relationship between relevant data and wind power output; examples include the time series model in ref. [10] and the Markov and autoregressive moving average (ARMA) models in ref. [11]. In recent years, with the continuous development of artificial intelligence theory, various artificial intelligence algorithms have also been incorporated into statistical learning methods [12-16]. Specifically, refs. [17,18] constructed a backpropagation (BP) neural network and ref. [19] constructed a support vector machine (SVM) model for wind power forecasting. The authors of refs. [1,20,21] used long short-term memory (LSTM) networks for spot forecasting, and ref. [22] established a spot forecasting model based on a deep Boltzmann machine. The use of artificial intelligence algorithms for the spot forecasting of wind power output has become a development trend. For the PDF modeling of the spot forecasting error, a Gaussian distribution is used in refs. [23,24], and a beta distribution is used in ref. [25] to model the PDF of the wind farm forecast error. The work in ref. [26] established a conditional PDF for the prediction errors through the copula function. In ref. [1], nonparametric kernel density estimation was used to fit the PDF of the wind power prediction errors.
However, all the aforementioned studies address interval forecasting for a single wind farm. With the introduction of the goals of carbon peaking and carbon neutrality, the proportion of wind power as clean energy in the power grid will expand further [27-30], so the power grid will be more concerned about the aggregate output of multiple wind farms than about that of a single wind farm. Interval forecasting methods for multiple wind farms include the extrapolation method [31], the upscaling method [32], the superposition method [33] and the high-dimensional copula method [34]. The extrapolation method predicts aggregate wind power by finding wind speed predictions in the historical dataset that are similar to the current ones; thus, its accuracy is strongly influenced by the accuracy of the weather forecasts. In the upscaling method, reference wind farms are selected, and the weighted sum of their forecast outputs is used as the aggregate output, but there is no standard for selecting the reference wind farms and weights. The superposition method adds the forecasting values of multiple single wind farms to obtain the forecasting value of their aggregate output. However, this method ignores the correlations among multiple wind farms that arise from their similar geographic locations and climatic conditions [35,36], and it may have poor forecasting accuracy.
The high-dimensional copula model performs better than other models in fitting the correlations of multidimensional variables and has been introduced in recent years into interval forecasting for multiple wind farms to model the PDF of forecasting errors. In ref. [34], the regular vine copula model was introduced to describe the joint PDF of the spot forecasts and real outputs of multiple wind farms. The aggregate output of multiple wind farms can then be derived from the joint PDF, but only one type of copula function is used in this correlation model. The work in ref. [37] constructs a regular vine copula model for multiple wind farms with more types of copula functions. The studies in refs. [35,37] both use static copula functions, but in practice, the correlations between multiple wind farms vary with time. The authors of ref. [38] used time-varying copulas to construct a drawable vine copula model that considers the time-varying correlations of multiple variables of wind farms, but only one type of copula function is used in the model, which may not adequately describe the correlation between multiple wind farms.
In summary, most existing interval forecasting studies focus on single wind farms, and few address the aggregate output of multiple wind farms. Additionally, the accuracy of the copula model used in interval forecasting of the aggregate output of multiple wind farms needs to be improved. Moreover, the studies mentioned above rarely provide a complete modeling process that states how the copula model is combined with the spot forecasting model to achieve interval forecasting. To address these problems, the main contributions of this paper are as follows: (1) A time-varying regular vine copula model is proposed to obtain the conditional joint PDF of the forecasting errors of multiple wind farms. The proposed method not only uses multiple types of copula functions and optimizes the structure of the vine copula based on the Akaike information criterion (AIC) but also uses copula functions with time-varying dependence parameters to improve the model's ability to capture the complex and time-varying correlations among multiple wind farms; (2) Interval forecasting of the aggregate output of multiple wind farms is achieved by combining the spot forecasting model based on LSTM networks with the time-varying regular vine copula model. In this method, the historical outputs of multiple wind farms are used to train the spot forecasting model based on LSTM networks. Then, using the forecasting outputs and errors generated by the trained spot forecasting model as modeling data, a time-varying regular vine copula model is established to obtain the conditional joint PDF of the forecasting errors, from which the confidence interval can be derived. Finally, the confidence intervals of the aggregate output are obtained by adding the confidence intervals of the forecasting errors to the spot forecasting outputs. This modeling framework can also be applied to combine the copula model with other spot forecasting methods.
This paper is organized as follows. Section 2 proposes a time-varying regular vine copula model to obtain the conditional joint PDF of the forecasting errors. Section 3 proposes an aggregate output interval forecasting method based on LSTM networks and time-varying regular vine copulas (LSTM-TVRVC). Section 4 provides a case analysis and presents the experimental results and discussion, and Section 5 concludes the paper.

Sklar's Theorem
A copula function is defined as the connection function between the joint distribution function of multiple variables and their marginal distribution functions [39]. For an $s$-dimensional variable $(g_1, g_2, \cdots, g_i, \cdots, g_s)$, according to Sklar's theorem, there must be a copula function that satisfies (1):

$$F(g_1, g_2, \cdots, g_s) = C(u_1, u_2, \cdots, u_s) \tag{1}$$

where $u_i = F(g_i)$ is the value of the cumulative distribution function (CDF) of variable $g_i$, $F(\cdot)$ is the CDF and $C(\cdot)$ is the copula CDF. Differentiating (1) gives the corresponding $s$-dimensional joint PDF:

$$f(g_1, g_2, \cdots, g_s) = c(u_1, u_2, \cdots, u_s) \prod_{i=1}^{s} f(g_i) \tag{2}$$

where $c(\cdot)$ is the copula PDF and $f(g_i)$ is the marginal PDF of variable $g_i$. Therefore, the joint PDF of multiple variables can be decomposed into a copula function and the marginal distribution functions of the variables.
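The decomposition of a joint PDF into a copula density and marginal PDFs can be checked numerically. The following is a minimal sketch, assuming a bivariate Gaussian copula with standard Gaussian marginals (a choice made here only because the result has a known closed form); the function names are illustrative and do not come from the paper.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard Gaussian, used for the marginals and for Phi^{-1}


def gaussian_copula_pdf(u1: float, u2: float, rho: float) -> float:
    """Density c(u1, u2) of a bivariate Gaussian copula with correlation rho."""
    x = STD_NORMAL.inv_cdf(u1)
    y = STD_NORMAL.inv_cdf(u2)
    one_minus = 1.0 - rho * rho
    exponent = -(rho * rho * (x * x + y * y) - 2.0 * rho * x * y) / (2.0 * one_minus)
    return math.exp(exponent) / math.sqrt(one_minus)


def joint_pdf_via_copula(g1: float, g2: float, rho: float) -> float:
    """Joint PDF built as copula density times the product of the marginal PDFs."""
    u1, u2 = STD_NORMAL.cdf(g1), STD_NORMAL.cdf(g2)
    return gaussian_copula_pdf(u1, u2, rho) * STD_NORMAL.pdf(g1) * STD_NORMAL.pdf(g2)


def bivariate_normal_pdf(g1: float, g2: float, rho: float) -> float:
    """Closed-form bivariate standard normal PDF, used only as a cross-check."""
    one_minus = 1.0 - rho * rho
    quad = (g1 * g1 - 2.0 * rho * g1 * g2 + g2 * g2) / one_minus
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(one_minus))


if __name__ == "__main__":
    print(joint_pdf_via_copula(0.3, -1.2, 0.6), bivariate_normal_pdf(0.3, -1.2, 0.6))
```

With Gaussian marginals the two functions should agree up to rounding; with other marginals, only `joint_pdf_via_copula` would change, which is the practical appeal of the decomposition.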

Regular Vine Copulas
As (1) shows, a single multivariate copula function uses only one type of copula to establish the dependence structure among multidimensional variables; consequently, it has great limitations. In addition, as the dimensionality of the variables increases, the number of parameters to be estimated for the high-dimensional copula function grows, which makes the estimation difficult.
The regular vine copula model transforms a high-dimensional copula function into a cascade of bivariate copula models [40], thus allowing for more types of copula functions to be used and making the correlation modeling process for multiple variables more flexible.
The regular vine copula model for $s$ variables consists of $s-1$ trees denoted as $T_1, T_2, \cdots, T_{s-1}$. The $j$-th tree contains $s-j+1$ nodes connected by $s-j$ edges. Each node corresponds to a CDF, and each edge corresponds to a bivariate copula model calculated from the two nodes connected by the edge.
The node set and edge set of $T_j$ are denoted as $N_j$ and $E_j$, respectively, and regular vine copulas should satisfy the following conditions [37]:
(1) $T_1$ has $s$ nodes, with a node set $N_1 = \{1, 2, \cdots, s\}$ and an edge set $E_1$.
(2) For $T_j$ ($2 \le j \le s-1$), $N_j$ equals $E_{j-1}$.
(3) If two edges in $T_j$ are joined in $T_{j+1}$, the edges need to share a common node in $T_j$.
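An edge $e$ in $E_j$ is written as $e = a(e), b(e) \mid D(e)$, where $a(e)$ and $b(e)$ are the two conditioned variables and $D(e)$ is the conditioning set. The joint PDF of the regular vine copula model is

$$f(g_1, g_2, \cdots, g_s) = \prod_{k=1}^{s} f(g_k) \prod_{j=1}^{s-1} \prod_{e \in E_j} c_{a(e),b(e)|D(e)}\left(u_{a(e)|D(e)}, u_{b(e)|D(e)}\right) \tag{3}$$

where $u_{a(e)|D(e)} = F\left(g_{a(e)} \mid g_{D(e)}\right)$, $u_{b(e)|D(e)}$ is defined analogously and $g_{D(e)}$ denotes the variables corresponding to $D(e)$ (the $D(e)$ of tree $T_1$ is empty).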
Figure 1 shows a possible structure of a six-dimensional regular vine copula model, where blocks denote nodes and lines denote edges. As (3) and Figure 1 show, the regular vine copula model not only offers more options for the type of bivariate copula on each edge but also allows the tree structures to be adjusted freely as needed.
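The requirement that two edges joined in the next tree must share a common node restricts which edge sets are admissible. The sketch below enumerates the admissible pairs for one tree; representing each edge as a `frozenset` of its two endpoints is an assumption made here for illustration, and `candidate_edges` is a hypothetical helper, not part of the paper.

```python
from itertools import combinations


def candidate_edges(nodes):
    """Return every pair of nodes in the next tree that may be joined.

    Each node is a frozenset holding the two endpoints of an edge in the
    previous tree. Two nodes may be joined only if those edges share
    exactly one endpoint.
    """
    return [(a, b) for a, b in combinations(nodes, 2) if len(a & b) == 1]


if __name__ == "__main__":
    # Edges of a first tree on variables 1..4: 1-2, 2-3, 3-4.
    e12, e23, e34 = frozenset({1, 2}), frozenset({2, 3}), frozenset({3, 4})
    second_tree = candidate_edges([e12, e23, e34])
    print(second_tree)  # 1-2 and 3-4 cannot be joined: no shared variable

    # The same helper works for the third tree, whose nodes are the second tree's edges.
    third_tree = candidate_edges([frozenset(pair) for pair in second_tree])
    print(third_tree)
```

Because the helper only inspects shared endpoints, it can be reused for every tree level; structure selection then amounts to choosing a spanning tree among these candidates.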

Time-Varying Copula Functions
When constructing the regular vine copula model, the type of bivariate copula corresponding to each edge needs to be determined. Copula functions can be divided into elliptical copulas and Archimedean copulas [41]. Elliptical copulas, which include Gaussian and t copulas, have symmetric tail correlations. Archimedean copulas, which include Gumbel and Clayton copulas, have asymmetric tail correlations. The dependence parameters of traditional copula functions are static; that is, these parameters do not change over time. However, the nonlinear correlations among the outputs of multiple wind farms often exhibit time-varying characteristics. Therefore, when constructing regular vine copulas, this paper uses the time-varying copula functions proposed by Patton as candidate copula functions, whose dependence parameters follow a process akin to a restricted ARMA process [42,43].
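The asymmetric tail behavior of Archimedean copulas can be illustrated numerically. The following sketch uses the standard static Clayton and Gumbel CDFs (not the time-varying versions used in this paper) and approximates the tail dependence coefficients by evaluating the copulas close to the corners; the helper names and the probe values are assumptions chosen for illustration.

```python
import math


def clayton_cdf(u1: float, u2: float, theta: float) -> float:
    """Static Clayton copula CDF, valid for theta > 0."""
    return (u1 ** -theta + u2 ** -theta - 1.0) ** (-1.0 / theta)


def gumbel_cdf(u1: float, u2: float, theta: float) -> float:
    """Static Gumbel copula CDF, valid for theta >= 1."""
    s = (-math.log(u1)) ** theta + (-math.log(u2)) ** theta
    return math.exp(-(s ** (1.0 / theta)))


def lower_tail_ratio(cdf, q: float, theta: float) -> float:
    """C(q, q) / q, which tends to the lower tail dependence coefficient as q -> 0."""
    return cdf(q, q, theta) / q


def upper_tail_ratio(cdf, q: float, theta: float) -> float:
    """(1 - 2q + C(q, q)) / (1 - q), which tends to the upper coefficient as q -> 1."""
    return (1.0 - 2.0 * q + cdf(q, q, theta)) / (1.0 - q)


if __name__ == "__main__":
    theta = 2.0
    print("Clayton lower:", lower_tail_ratio(clayton_cdf, 1e-4, theta))
    print("Clayton upper:", upper_tail_ratio(clayton_cdf, 0.9999, theta))
    print("Gumbel lower:", lower_tail_ratio(gumbel_cdf, 1e-4, theta))
    print("Gumbel upper:", upper_tail_ratio(gumbel_cdf, 0.9999, theta))
```

For the Clayton copula the lower ratio approaches $2^{-1/\theta}$ while the upper ratio vanishes; for the Gumbel copula the upper ratio approaches $2 - 2^{1/\theta}$ while the lower ratio decays toward zero.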
The time-varying copula functions used in this paper and the evolution equations of their dependence parameters are as follows.
(1) Time-varying Gaussian copula. This copula has a symmetric distribution but does not reflect tail correlations. Its functional form is given in (4), where $u_1$ and $u_2$ are the marginal CDFs of two variables, $\Phi^{-1}(\cdot)$ is the inverse of the standard Gaussian CDF and $\rho_{N,t}$ is the dependence parameter. The evolution equation of $\rho_{N,t}$ is given in (5), where $u_{1,t-j}$ and $u_{2,t-j}$ are the CDFs of the two variables at moment $t-j$; the parameters to be estimated are $\{\omega_N, \beta_N, \alpha_N\}$.
(2) Time-varying t copula. Its functional form is given in (6), where $T^{-1}(\cdot)$ is the inverse of the t CDF and $\rho_{T,t}$ and $d_{T,t}$ are the dependence parameters, whose evolution equations are given in (7).
(3) Time-varying Clayton copula. This copula is suitable for variables with a strong lower tail correlation. Its functional form is given in (8), where $\theta_{C,t}$ is the dependence parameter; its evolution equation is given in (9), and the parameters to be estimated are $\{\omega_C, \beta_C, \alpha_C\}$.
(4) Time-varying Gumbel copula. This copula is suitable for variables with a strong upper tail correlation. Its functional form is given in (10), where $\theta_{G,t}$ is the dependence parameter; its evolution equation is given in (11), and the parameters to be estimated are $\{\omega_G, \beta_G, \alpha_G\}$.
(5) Time-varying SJC copula. This copula is suitable for variables with different upper tail and lower tail correlations. Its functional form is given in (12), where $\tau^{U}_{SJC,t}$ and $\tau^{L}_{SJC,t}$ are the upper and lower tail dependence parameters, respectively; their evolution equations are given in (13), and the parameters to be estimated are $\{\omega^{U}_{SJC}, \beta^{U}_{SJC}, \alpha^{U}_{SJC}\}$ and $\{\omega^{L}_{SJC}, \beta^{L}_{SJC}, \alpha^{L}_{SJC}\}$.
Processes 2023, 11, 1530
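To make the idea of a time-varying dependence parameter concrete, the sketch below iterates an ARMA-like recursion for the Gaussian copula correlation. It assumes the form commonly attributed to Patton: a modified logistic transform applied to a constant, the previous correlation and the average of the products $\Phi^{-1}(u_{1,t-j})\Phi^{-1}(u_{2,t-j})$ over the last 10 observations. The 10-lag window, the initial value `rho0` and the shortened window at the start of the series are assumptions made here, not details confirmed by this paper.

```python
import math
from statistics import NormalDist

PHI_INV = NormalDist().inv_cdf  # inverse of the standard Gaussian CDF


def modified_logistic(x: float) -> float:
    """Map the real line into (-1, 1) so the result is a valid correlation."""
    return math.tanh(x / 2.0)  # equals (1 - exp(-x)) / (1 + exp(-x))


def gaussian_rho_path(u1, u2, omega, beta, alpha, rho0=0.0, lags=10):
    """Dependence parameter path for a time-varying Gaussian copula (assumed form).

    u1 and u2 are sequences of marginal CDF values strictly inside (0, 1).
    """
    rho = [rho0]
    for t in range(1, len(u1)):
        window = range(max(0, t - lags), t)  # up to `lags` most recent observations
        forcing = sum(PHI_INV(u1[j]) * PHI_INV(u2[j]) for j in window) / len(window)
        rho.append(modified_logistic(omega + beta * rho[-1] + alpha * forcing))
    return rho


if __name__ == "__main__":
    u1 = [0.10, 0.40, 0.80, 0.60, 0.30]
    u2 = [0.20, 0.50, 0.70, 0.90, 0.35]
    print(gaussian_rho_path(u1, u2, omega=0.1, beta=0.5, alpha=0.3))
```

In practice the three parameters would be estimated by maximizing the copula likelihood along this path; the values used above are arbitrary.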

Modeling the Conditional Joint PDF for the Forecast Errors of Multiple Wind Farms Using Time-Varying Regular Vine Copulas
For adjacent regions, where multiple wind farms have similar geographical and meteorological environments, there is a statistical correlation between the spot forecasting values and the forecasting errors; this correlation is the basis for constructing the conditional PDF of the forecasting error by using the copula model.
According to Section 2.2, three problems need to be solved when constructing regular vine copulas:
(1) The marginal distribution function of each variable needs to be estimated.
(2) The edge set $E_j$ needs to be determined for tree $T_j$; that is, the structure of tree $T_j$ needs to be selected.
(3) The type of copula function for constructing the bivariate copula model corresponding to each edge needs to be chosen.
By using the five kinds of copula functions in Section 3, this paper proposes a time-varying regular vine copula (TVRVC) model to obtain the conditional joint PDF of the forecasting errors for multiple wind farms. Suppose that the number of wind farms is $M$; $p_1, p_3, \ldots, p_{2M-1}$ denote the spot forecasting outputs of the $M$ wind farms and $p_2, p_4, \ldots, p_{2M}$ denote their forecasting errors. Then, the $M$ wind farms contain $2M$ variables, and the modeling process is as follows:
1. Fit the marginal distributions of the $2M$ variables. Kernel density estimation is used to fit the marginal PDFs of the $2M$ variables, and the estimation formula is as follows:

$$f_i(p_i) = \frac{1}{nh} \sum_{t=1}^{n} K\left(\frac{p_i - p_{i,t}}{h}\right) \tag{14}$$

where $f_i(p_i)$ is the PDF of variable $p_i$, $h$ is the window width (bandwidth), $n$ is the number of samples of variable $p_i$, $p_{i,t}$ is the $t$-th sample and $K(\cdot)$ is the kernel function. The CDF of $p_i$ is obtained by integrating $f_i(p_i)$;
2. Set $j = 1$;
3. Form the node set $N_j$ for tree $T_j$, and calculate the CDFs corresponding to the nodes in $N_j$; the formula for calculating these CDFs is given in reference [39];
4. Form all possible edge sets for tree $T_j$;
5. Construct bivariate copula models for each possible edge set. For each edge in a possible edge set, construct bivariate copula models by using the five candidate types of time-varying copula functions and calculate the AIC indices of these models. The model with the smallest AIC is selected as the final bivariate copula model corresponding to this edge. The AIC index is calculated as follows:

$$\mathrm{AIC} = 2k - 2\ln L \tag{15}$$

where $k$ is the number of model parameters and $L$ is the likelihood. The smaller the AIC, the better the bivariate copula;
6. Choose the optimal edge set as the edge set $E_j$ for tree $T_j$. The sum of the AICs of all edges in a possible edge set is used as the evaluation index, and the edge set with the smallest AIC sum is selected as the edge set $E_j$ for tree $T_j$;
7. Determine whether $j = 2M - 1$. If not, set $j = j + 1$, calculate the node set $N_j$ for tree $T_j$ and return to step 3; otherwise, proceed to step 8;
8. Calculate the joint PDF of the spot forecasting outputs and errors of the $M$ wind farms by using (3);
9. Calculate the conditional joint PDF of the forecasting errors of the $M$ wind farms by using (16).
The flow chart for modeling the conditional joint PDF by TVRVC is displayed in Figure 2.
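Two building blocks of the procedure above, the kernel-based marginal CDF and the AIC-based choice of copula family, can be sketched as follows. The Gaussian kernel is an assumption (the kernel is not specified here), and the log-likelihood values in the usage example are made-up numbers used only to show the selection rule; the function names are illustrative.

```python
from statistics import NormalDist

KERNEL = NormalDist()  # Gaussian kernel (assumed)


def kde_pdf(x: float, samples, h: float) -> float:
    """Kernel density estimate of the marginal PDF with bandwidth h."""
    return sum(KERNEL.pdf((x - s) / h) for s in samples) / (len(samples) * h)


def kde_cdf(x: float, samples, h: float) -> float:
    """Integral of kde_pdf up to x; closed form for a Gaussian kernel."""
    return sum(KERNEL.cdf((x - s) / h) for s in samples) / len(samples)


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion; smaller is better."""
    return 2.0 * n_params - 2.0 * log_likelihood


def select_family(fits):
    """Pick the family with the smallest AIC from {name: (log_likelihood, n_params)}."""
    return min(fits, key=lambda name: aic(*fits[name]))


if __name__ == "__main__":
    errors = [-3.1, -0.4, 0.2, 1.5, 2.8]  # hypothetical forecasting errors in MW
    print(kde_cdf(0.0, errors, h=1.0))

    # Hypothetical fitted log-likelihoods for one edge.
    fits = {"Gaussian": (-120.0, 3), "Clayton": (-115.0, 3), "SJC": (-114.5, 6)}
    print(select_family(fits))
```

The SJC candidate has the highest likelihood in this example but loses to the Clayton candidate once its three additional parameters are penalized, which is the trade-off the AIC is meant to capture.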

Interval Forecasting Method
The TVRVC model proposed in the previous section requires spot forecasting errors as modeling data. An LSTM network is a special kind of recurrent neural network (RNN) that was proposed by Sepp Hochreiter and Jürgen Schmidhuber in 1997 [1]. The LSTM network changes the way gradients are propagated during backpropagation by adding a memory cell to the hidden layer unit of the RNN, thereby effectively alleviating the problems of vanishing and exploding gradients. Many studies have shown that LSTM performs well in wind power spot forecasting [44,45], so this paper combines a spot forecasting model based on LSTM networks with the TVRVC model to realize interval forecasting of the aggregate output of multiple wind farms. If there are a total of M wind farms, the modeling process of the interval forecasting method using LSTM networks and time-varying regular vine copulas (LSTM-TVRVC) is as follows:
1. Construct the spot forecasting model based on LSTM networks for the M wind farms. First, the historical output data of the M wind farms are divided into a training set and a test set. The training set is used to train the LSTM networks, and the test set is used to verify the effectiveness of the model. Then, M LSTM networks are trained, and the i-th LSTM network is trained for the i-th wind farm. After training, the test set is fed to the trained LSTM networks to obtain spot forecasting outputs for each wind farm. The spot-forecasted aggregate output of the M wind farms is obtained by adding up the spot forecasting outputs of the M wind farms at the same moment;
2. Generate modeling data for the time-varying regular vine copula model by using the trained spot forecasting model. The test set of the spot forecasting model is divided into subsets A and B. The spot forecasting outputs and the errors of subset A generated by the trained spot forecasting model are used as modeling data to construct the TVRVC model, while subset B is used to test the effectiveness of the proposed method;
3. Construct the time-varying regular vine copula model for multiple wind farms, and obtain the conditional joint PDF of the forecasting errors;
4. Calculate the conditional joint PDF of the spot forecasting errors corresponding to subset B. Input the spot forecasting outputs obtained for the M wind farms at the same moment in subset B into the conditional joint PDF of the forecasting errors to obtain the conditional joint PDF of the forecasting errors for the corresponding moment;
5. Transform the conditional joint PDF of the forecasting errors into the conditional PDF of the sum of the forecasting errors by using the convolution formula [38];
6. Calculate the confidence interval of the sum of the forecasting errors at the given confidence level;
7. Calculate the forecasting interval for the aggregate output of the M wind farms. This interval is obtained by adding the confidence interval of the sum of the forecasting errors to the corresponding spot-forecasted aggregate output.
Figure 3 shows the interval forecasting method using LSTM-TVRVC.
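The last three steps of the procedure (distribution of the summed errors, its confidence interval, and the shift by the spot forecast) can be sketched on a discretized distribution. The sketch assumes that the conditional joint PDF has already been evaluated on a grid and normalized into a joint probability mass function, and that an error is defined as real output minus forecast output; both are assumptions made for illustration, and all names are hypothetical.

```python
def sum_distribution(joint_pmf):
    """Collapse a joint PMF over per-farm errors into the PMF of their sum."""
    pmf = {}
    for errors, prob in joint_pmf.items():
        total = round(sum(errors), 6)  # rounding keeps float keys stable
        pmf[total] = pmf.get(total, 0.0) + prob
    return pmf


def quantile(pmf, level):
    """Smallest value whose cumulative probability reaches `level`."""
    cumulative = 0.0
    for value in sorted(pmf):
        cumulative += pmf[value]
        if cumulative >= level - 1e-12:
            return value
    return max(pmf)


def forecast_interval(joint_pmf, spot_total, alpha):
    """Interval for the aggregate output at confidence level 1 - alpha."""
    pmf = sum_distribution(joint_pmf)
    lower_error = quantile(pmf, alpha / 2.0)
    upper_error = quantile(pmf, 1.0 - alpha / 2.0)
    return spot_total + lower_error, spot_total + upper_error


if __name__ == "__main__":
    # Hypothetical joint PMF of the errors (MW) of two wind farms.
    joint = {(-1.0, -1.0): 0.25, (-1.0, 1.0): 0.25, (1.0, -1.0): 0.25, (1.0, 1.0): 0.25}
    print(sum_distribution(joint))
    print(forecast_interval(joint, spot_total=100.0, alpha=0.5))
```

Summing over grid cells is the discrete counterpart of the convolution formula and does not require the errors to be independent, since the dependence is already encoded in the joint PMF.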

Evaluation Indices
Reliability and sharpness are two aspects of evaluating a forecasting interval [46]. Generally, reliability is the primary factor that needs to be guaranteed for interval forecasting, and the average coverage deviation (ACD) is a commonly used index for evaluating reliability. The normalized mean prediction interval width (NMPIW) is a commonly used index for evaluating sharpness.
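For a given nominal confidence level $1 - \alpha$, the ACD is calculated as follows:

$$\mathrm{ACD} = \mathrm{PICP} - (1 - \alpha) \tag{17}$$

where PICP is the probability that the test samples lie within the forecasting intervals at the given confidence level; PICP is calculated by:

$$\mathrm{PICP} = \frac{1}{N} \sum_{i=1}^{N} \lambda_i \tag{18}$$

where $\lambda_i$ is an indicator variable over the $N$ test samples that is defined as follows:

$$\lambda_i = \begin{cases} 1, & x_i \in \left[q_i^{(\alpha/2)}, q_i^{(1-\alpha/2)}\right] \\ 0, & \text{otherwise} \end{cases} \tag{19}$$

where $q_i^{(\alpha/2)}$ and $q_i^{(1-\alpha/2)}$ are the lower and upper boundaries of the forecasting intervals, respectively, and $x_i$ is the $i$-th test sample. Forecasting intervals with smaller absolute ACD values are more reliable.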
The NMPIW is calculated as follows:

$$\mathrm{NMPIW} = \frac{1}{NR} \sum_{i=1}^{N} \left(q_i^{(1-\alpha/2)} - q_i^{(\alpha/2)}\right) \tag{20}$$

where $N$ is the number of test samples and $R$ is the difference between the maximum and minimum real wind power output values. Reliable forecasting intervals with smaller NMPIWs are preferred. The skill score (SS) is a comprehensive index that considers both reliability and sharpness [37]; the SS is positively oriented and is computed by (21), in which $\xi$ is an indicator variable that equals 1 when $x_i \le q_i^{(\alpha_j)}$ and equals 0 otherwise. Equation (21) gives the SS for the confidence level $1 - \alpha$.
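The evaluation indices can be computed directly from the interval bounds and the real outputs. The sketch below follows the definitions in this section; the skill score uses the common quantile-based form, summed over the two interval bounds and averaged over the test samples, and that averaging is an assumption rather than a detail confirmed here. All function names and sample values are illustrative.

```python
def picp(real, lower, upper):
    """Share of test samples that fall inside their forecasting interval."""
    hits = sum(1 for x, lo, hi in zip(real, lower, upper) if lo <= x <= hi)
    return hits / len(real)


def acd(real, lower, upper, alpha):
    """Average coverage deviation: empirical coverage minus nominal level."""
    return picp(real, lower, upper) - (1.0 - alpha)


def nmpiw(real, lower, upper):
    """Mean interval width divided by the range of the real outputs."""
    span = max(real) - min(real)
    mean_width = sum(hi - lo for lo, hi in zip(lower, upper)) / len(real)
    return mean_width / span


def skill_score(real, lower, upper, alpha):
    """Quantile-based skill score (assumed form); values closer to 0 are better."""
    total = 0.0
    for x, lo, hi in zip(real, lower, upper):
        for level, q in ((alpha / 2.0, lo), (1.0 - alpha / 2.0, hi)):
            xi = 1.0 if x <= q else 0.0
            total += (xi - level) * (x - q)
    return total / len(real)


if __name__ == "__main__":
    real = [10.0, 20.0, 30.0, 40.0]   # hypothetical real aggregate outputs (MW)
    lower = [8.0, 21.0, 25.0, 35.0]   # hypothetical lower bounds
    upper = [12.0, 25.0, 35.0, 45.0]  # hypothetical upper bounds
    print(picp(real, lower, upper), acd(real, lower, upper, 0.2))
    print(nmpiw(real, lower, upper), skill_score(real, lower, upper, 0.2))
```

In this toy example three of four samples are covered, so the coverage falls slightly short of the nominal 80% and the ACD is negative.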

Results and Discussion
The proposed method is applied to three adjacent wind farms in Northwest China to verify its effectiveness. Each wind farm has an installed capacity of 50 MW, and the data resolution is 15 min. The distance between any two of the three wind farms is no more than 50 km, and Kendall's tau coefficients for the real outputs of the three wind farms are shown in Table 1, demonstrating strong correlations. The historical data of the three wind farms for the whole year of 2017 are divided into four datasets (datasets 1-4), and each dataset contains three months of wind power output data. The proposed method is applied to these four datasets to verify its effectiveness.
For each dataset, the training set of the spot forecasting model contains the historical data of the three wind farms over the first two months, and the test set contains the historical data of the third month. The modeling data for the TVRVC are the spot forecasting data and the forecasting error data of the three wind farms over the first three weeks of the third month (generated by the spot forecast model based on LSTM networks), and confidence intervals for the aggregate output of the fourth week of the third month are finally calculated to verify the validity of the interval forecast method.
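Kendall's tau, used above to quantify the correlation between the real outputs of the wind farms, can be computed by counting concordant and discordant pairs. The sketch below implements the simple tau-a variant, which does not correct for ties; which variant underlies Table 1 is not stated, so this choice is an assumption.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


if __name__ == "__main__":
    farm_a = [12.0, 30.5, 8.2, 44.1]  # hypothetical outputs (MW)
    farm_b = [10.1, 28.0, 9.5, 40.3]  # hypothetical outputs (MW)
    print(kendall_tau(farm_a, farm_b))
```

The pairwise loop is quadratic in the number of samples, which is acceptable for a sketch but slow for a full year of 15 min data.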

Spot Forecasting Results and Discussion
The LSTM networks have two hidden layers with 32 and 8 hidden units, and the learning rate is set to 0.005. For comparison with the LSTM networks, a spot forecasting model based on BP neural networks is also created; the modeling steps are the same as those in Section 3.1, except that the LSTM networks are replaced by BP networks. The BP networks also have two hidden layers with 32 and 8 hidden units, and the learning rate is set to 0.005. The root mean square error (RMSE) is taken as the evaluation index, and its formula can be found in [1]. The RMSEs of the two spot forecasting models are shown in Table 2. For each wind farm, the RMSE of the spot forecasting model based on the LSTM networks does not exceed 5% of the installed capacity and is smaller than that of the BP neural networks. The results demonstrate the effectiveness of the spot forecasting model used in this paper.
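The RMSE comparison above can be reproduced with a few lines. This sketch computes the RMSE in MW and expresses it as a percentage of the installed capacity; normalizing by capacity is an assumption consistent with how the results are reported here, and the sample values are made up.

```python
import math


def rmse(real, forecast):
    """Root mean square error between real and forecast outputs."""
    squared = sum((r - f) ** 2 for r, f in zip(real, forecast))
    return math.sqrt(squared / len(real))


def rmse_percent_of_capacity(real, forecast, capacity_mw):
    """RMSE expressed as a percentage of the installed capacity."""
    return 100.0 * rmse(real, forecast) / capacity_mw


if __name__ == "__main__":
    real = [10.0, 20.0, 30.0]      # hypothetical real outputs (MW)
    forecast = [12.0, 18.0, 30.0]  # hypothetical spot forecasts (MW)
    print(rmse_percent_of_capacity(real, forecast, capacity_mw=50.0))
```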

Structures of the Time-Varying Regular Vine Copulas for the Three Wind Farms
The structure of each tree and the type of bivariate copula model corresponding to each edge in the TVRVC of the three wind farms for dataset 1 are shown in Figure 4. Table 3 shows the AIC values produced when fitting edge 1,2 and edge 3,6|5 with five types of time-varying copula functions. According to Table 3, the optimal bivariate copula for edge 1,2 is the Clayton copula, and for edge 5,2|1 it is the SJC copula.

Interval Forecasting Results and Discussion
Figure 5 shows the aggregate output intervals, the real aggregate output values and the spot-forecasted aggregate output values at different confidence levels for the three wind farms on days 1-2 of the fourth week of the third month in dataset 1.
As Figure 5 shows, the forecasting interval of the proposed method can effectively envelop the real aggregate output of the three wind farms and can reflect uncertainty characteristics well even when the real output fluctuates greatly. The figure also shows that the width of the interval is narrower when the wind power output is smaller and wider when the output is larger, which indicates that the uncertainty of the aggregate output of the three wind farms is smaller at lower wind power output levels.
Figure 6 shows the means and fluctuation ranges of the absolute values of the ACD for each of the four datasets at different confidence levels. Regarding reliability, the means of the absolute values of the ACD indices obtained for the four datasets do not exceed 4% at any confidence level and fluctuate within 3%. Regarding sharpness, the interval width of the 90% confidence interval is only 10% and fluctuates within 5%. The forecasting results show that the proposed method is highly adaptable, performs well during different periods of the year and produces narrow intervals while remaining highly reliable.
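For readers who want to reproduce the reliability and sharpness indices, the sketch below computes them under commonly used definitions, which are assumed here rather than taken from this section: ACD is the empirical coverage (PICP) minus the nominal confidence level, and NMPIW is the mean interval width divided by a normalising range such as the installed capacity. The function names and the sample numbers are illustrative only.

```python
import numpy as np


def picp(y, lower, upper):
    """Fraction of observations that fall inside their forecast interval."""
    y, lower, upper = (np.asarray(a, dtype=float) for a in (y, lower, upper))
    return float(np.mean((y >= lower) & (y <= upper)))


def acd(y, lower, upper, nominal):
    """Average coverage deviation: empirical coverage minus nominal level."""
    return picp(y, lower, upper) - nominal


def nmpiw(lower, upper, value_range):
    """Mean interval width normalised by `value_range` (assumed normaliser)."""
    lower, upper = np.asarray(lower, dtype=float), np.asarray(upper, dtype=float)
    return float(np.mean(upper - lower) / value_range)


# Illustrative data: five time steps, nominal confidence level of 90%.
y_true = [10, 20, 30, 40, 50]
lo = [8, 15, 31, 35, 45]
hi = [12, 25, 36, 45, 55]

print(acd(y_true, lo, hi, nominal=0.9))   # coverage 0.8 versus nominal 0.9
print(nmpiw(lo, hi, value_range=100.0))   # mean width 7.8 over a range of 100
```

An ACD near zero means the interval covers the observations about as often as the confidence level promises, and a smaller NMPIW means a sharper interval.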
The following methods are designed for comparison with the method proposed in this paper:

• BP networks and time-varying regular vine copulas (BP-TVRVC) method: The forecast error data generated by BP networks are used for time-varying regular vine copula modeling, and the RMSE of the BP networks is shown in Table 2.

• LSTM networks and time-varying copulas (LSTM-TVC) method: This is a superposition method. After spot forecasting with the LSTM networks, a time-varying bivariate copula of the spot forecasting outputs and forecasting errors is modeled for each wind farm to obtain its forecasting interval, and the individual intervals are then superimposed to form the aggregate output interval. This method ignores the spatial correlation between multiple wind farms.

• LSTM networks and static regular vine copulas (LSTM-SRVC) method: LSTM networks are still used for spot forecasting, but the time-varying copula functions used in the time-varying regular vine copula model are replaced by the corresponding static copula functions. This method considers the correlations among the outputs of multiple wind farms but does not consider the time-varying nature of the correlations; this method was also used in ref. [37].

• LSTM networks and time-varying regular vine Gaussian copulas (LSTM-TVRVGC) method: After the LSTM networks are used for spot forecasting, the regular vine copula model is constructed with time-varying Gaussian copulas only. This method corresponds to the method used in [38].
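The superposition step of the LSTM-TVC baseline can be sketched as below, assuming that "superimposed" means adding the lower bounds and the upper bounds of the individual wind farm intervals at each time step. This reading and the array layout (one row per wind farm, one column per time step) are assumptions, and the numbers are illustrative.

```python
import numpy as np


def superpose_intervals(lower_per_farm, upper_per_farm):
    """Aggregate interval obtained by summing per-farm bounds at each time step.

    Both inputs have shape (n_farms, n_steps). No dependence between farms
    is modelled: each farm's interval is built separately and then added.
    """
    lower = np.asarray(lower_per_farm, dtype=float).sum(axis=0)
    upper = np.asarray(upper_per_farm, dtype=float).sum(axis=0)
    return lower, upper


# Illustrative bounds for three wind farms over two time steps.
farm_lower = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
farm_upper = [[2.0, 3.0], [5.0, 6.0], [8.0, 9.0]]

agg_lower, agg_upper = superpose_intervals(farm_lower, farm_upper)
print(agg_lower, agg_upper)
```

The aggregate bounds produced this way depend only on the marginal intervals, which is why this baseline cannot reflect how the forecasting errors of neighbouring wind farms move together.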
The absolute values of the ACD, NMPIW and SS obtained by the method proposed in this paper and the four methods mentioned above are shown in Figure 7a,b and Table 4, respectively. For presentation purposes, the index scores in Figure 7a,b and Table 4 are the mean index scores of the forecasting intervals obtained for the four datasets.

Comparing the proposed method with the BP-TVRVC method, the absolute values of the ACDs of both methods are less than 6% at all confidence levels, which indicates that both methods are reliable. Since the absolute values of the ACDs of these two methods are small and do not differ significantly, more attention is given to sharpness. As Figure 7b shows, the proposed method has smaller NMPIW values, which indicates that combining the copula model with a spot forecasting model that has a smaller error helps improve the sharpness of the forecast interval.

Comparing the proposed method with the LSTM-TVC method, the NMPIW values of the LSTM-TVC method are smaller than those of the proposed method at all confidence levels, which indicates that the former method has higher sharpness. However, in terms of reliability, the absolute values of the ACDs of the LSTM-TVC method are greater than those of the proposed method at all confidence levels, especially at the 90% confidence level. The ACD values of the LSTM-TVC method are 4 times as large as those of the proposed method, which indicates that considering the spatial correlation of multiple wind farms helps compensate for the forecasting errors of single wind farms and thus improves the reliability of the forecasting intervals. Although the sharpness of the LSTM-TVC method is higher, its reliability, which is the most important property for interval forecasting, is not as good as that of the proposed method.
Comparing the proposed method with the LSTM-SRVC method, the absolute values of the ACDs of the proposed method are smaller than those of the LSTM-SRVC method at all confidence levels, which indicates that the proposed method has higher reliability. Additionally, the absolute values of the ACDs of the proposed method do not vary much: with a minimum value of 0.982% at the 10% confidence level and a maximum value of 3.087% at the 30% confidence level, the difference between the maximum and minimum values is 2.117%. In contrast, the absolute values of the ACDs of the method using static copula functions vary greatly across confidence levels, with a minimum value of 3.325% at the 90% confidence level and a maximum value of 16.723% at the 40% confidence level; the difference between the maximum and minimum values is 13.398%. This result indicates that static copula functions, which ignore the time-varying characteristics of the correlations, make it difficult to accurately capture the correlations of multiple wind farms and to maintain high reliability at different confidence levels. Regarding sharpness, the NMPIW values of the proposed method are smaller than those of the LSTM-SRVC method at all confidence levels. Figure 8 shows the forecasting intervals of the LSTM-SRVC method for days 1-2 of the fourth week of the third month in dataset 1.
Comparing Figure 8 with Figure 5 shows that the forecasting intervals can effectively envelop the real aggregate wind power regardless of whether static or time-varying copula functions are used, but the intervals produced by the method using static copula functions are significantly wider than those of the proposed method, which indicates that the forecasting results of the LSTM-SRVC method are more conservative.

Comparing the proposed method with the LSTM-TVRVGC method, the absolute values of the ACDs of the LSTM-TVRVGC method are larger than those of the proposed method at all confidence levels, and the values of the ACDs vary greatly at different confidence levels. This indicates that it is also difficult to accurately model the correlation of multiple wind farms and achieve high reliability at different confidence levels by using only one type of copula function. In terms of sharpness, the NMPIW values of the proposed method are smaller than those of the LSTM-TVRVGC method at all confidence levels.

In terms of the comprehensive index, the SS values of the proposed method are closer to 0 than those of the other four methods, which indicates that the proposed method has the best overall performance.
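To make the "closer to 0 is better" reading of the SS concrete, the sketch below uses a widely used interval skill score based on the Winkler score. This definition is an assumption, since the section does not restate the formula: each interval is penalised in proportion to its width, with an extra penalty when the observation falls outside it, so the score is never positive. The function name and data are illustrative.

```python
import numpy as np


def interval_skill_score(y, lower, upper, alpha):
    """Mean interval skill score for a nominal level of 1 - alpha.

    Per-step score: -2*alpha*width, minus 4 times the distance by which the
    observation misses the interval (zero when it is covered).
    """
    y, lower, upper = (np.asarray(a, dtype=float) for a in (y, lower, upper))
    width = upper - lower
    miss_below = np.clip(lower - y, 0.0, None)  # observation under the lower bound
    miss_above = np.clip(y - upper, 0.0, None)  # observation over the upper bound
    scores = -2.0 * alpha * width - 4.0 * (miss_below + miss_above)
    return float(scores.mean())


# Illustrative data: 90% intervals (alpha = 0.1), one observation not covered.
y_obs = [10, 20, 30, 40, 50]
lo_b = [8, 15, 31, 35, 45]
hi_b = [12, 25, 36, 45, 55]

print(interval_skill_score(y_obs, lo_b, hi_b, alpha=0.1))
```

Under this definition a method can only approach 0 by being both narrow and well covered, which is why a single score can summarise reliability and sharpness together.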

Conclusions
This paper proposes an interval forecasting method by combining a spot forecast model based on LSTM networks and a time-varying regular vine copula model. Case analysis shows that the proposed method can provide high-reliability forecast intervals with reasonable interval widths.
In the case study, the proposed method is compared with a method using only a single type of copula, a method using static copulas (which ignores the time-varying characteristics of the correlations among multiple wind farms) and a superposition method (which ignores the spatial correlations among multiple wind farms). The results show that the proposed method strikes a good balance between reliability and sharpness and has good comprehensive performance because it considers the temporal and spatial correlation of multiple wind farms and uses multiple types of copula functions. Furthermore, the copula model was combined with a different spot forecasting model for interval forecasting in the case analysis, and the result shows that a spot forecasting model with smaller errors leads to sharper forecasting intervals.
In conclusion, by combining a spot forecast model with a small error and an improved copula model that considers the temporal and spatial correlation of multiple wind farms, the proposed method can provide accurate uncertainty information on wind power for scheduling plans, thus improving the economic benefit and reliability of system operation. Future research will focus on how to further improve spot forecast accuracy and how to apply the interval forecast results to scheduling decisions.
An edge $e$ in $E_j$ is denoted as $e = a(e), b(e) \mid D(e)$, where $a(e) \mid D(e)$ and $b(e) \mid D(e)$ are the two nodes connected by $e$ and $D(e)$ is the conditioning set. The copula PDF corresponding to this edge is denoted as $c_{a(e),b(e) \mid D(e)}\left(u_{a(e) \mid D(e)}, u_{b(e) \mid D(e)}\right)$, where $u_{a(e) \mid D(e)}$ and $u_{b(e) \mid D(e)}$ denote the CDFs corresponding to the two nodes connected by $e$; then, (2) can be transformed into:

$$f(g_1, \ldots, g_s) = \prod_{j=1}^{s-1} \prod_{e \in E_j} c_{a(e),b(e) \mid D(e)}\left(u_{a(e) \mid D(e)}, u_{b(e) \mid D(e)}\right) \prod_{i=1}^{s} f(g_i)$$

Figure 1. A possible structure of the six-dimensional regular vine copulas.

Figure 2. Flow chart for modeling the conditional joint PDF of the forecasting errors for multiple wind farms by TVRVC.
1. Construct the spot forecasting model based on LSTM networks for M wind farms. First, the historical output data of the M wind farms are divided into a training set and a test set. The training dataset is used to train the LSTM networks, and the test dataset is used to verify the effectiveness of the model. Then, M LSTM networks are trained, and the i-th LSTM network is trained for the i-th wind farm, where i ∈ [1, M]. After training, the test dataset can be fed to the trained LSTM networks to obtain spot forecasting outputs for each wind farm. The spot forecasted aggregate output for the M wind farms is obtained by adding up the spot forecasting outputs of the M wind farms at the same moment;
2. Generate modeling data for the time-varying regular vine copula model based on the trained spot forecasting model. The test set of the spot model is divided into subsets A and B. The spot forecasting outputs and the errors of subset A generated by the trained spot forecasting model are used as modeling data to construct the TVRVC model, while subset B is used to test the effectiveness of the proposed method;
3.

Figure 3. Interval forecasting method of aggregate output for multiple wind farms using LSTM-TVRVC.

Figure 4. Structure of the TVRVC of the three wind farms.

Figure 5. Forecasting intervals produced by the interval forecasting method

Figure 6. Mean values and fluctuation ranges of ACD and NMPIW for four datasets at different confidence levels using LSTM-TVRVC.

Figure 7. ACDs and NMPIWs produced by different interval forecasting methods.

Figure 8. Forecasting intervals produced by the interval forecast method using LSTM-SRVC.

Table 1. Kendall's tau coefficients for the three wind farms.

Table 2. RMSEs of different spot forecasting models.

Table 4. SSs of different interval forecast methods.