A Novel Twin Support Vector Regression Model for Wind Speed Time-Series Interval Prediction

Abstract: Although the machine-learning model demonstrates high accuracy in wind speed prediction, it struggles to accurately depict the fluctuation range of the predicted values due to the inherent uncertainty in wind speed sequences. To address this limitation and enhance the reliability, we propose an effective wind speed interval prediction model that combines twin support vector regression (TSVR), variational mode decomposition (VMD), and the slime mould algorithm (SMA). In our methodology, the complex wind speed series is decomposed into multiple relatively stable subsequences using the VMD method. The principal component and residual series are then subject to interval prediction using the TSVR model, while the remaining components undergo point prediction. The SMA method is employed to search for optimal parameter combinations. The prediction interval of wind speed is obtained by aggregating the forecasting results of all TSVR models for each subseries. Our proposed model has demonstrated superior performance in various applications. It ensures that the wind speed value falls within the designated interval range while achieving the narrowest prediction interval. For instance, in the spring dataset with 1-period, we obtained a predicted interval with a prediction interval coverage probability (PICP) value of 0.9791 and prediction interval normalized range width (PINRW) value of 0.0641. This outperforms other comparative models and significantly enhances its practical application value. After adding the residual interval prediction model, the reliability of the prediction interval is significantly improved. As a result, this study presents a novel twin support vector regression model as a valuable approach for multi-step wind speed interval prediction.


Introduction
With the rapid advancement of artificial intelligence, machine-learning-based point prediction models have gained extensive applications in the field of time-series forecasting [1-3]. However, for wind speed forecasting, the deterministic values obtained from these point prediction models fail to capture the inherent uncertainty in wind speed sequences. As a result, interval prediction emerges as a more suitable approach in this context [4]. Therefore, this paper is set to perform interval prediction by providing upper and lower bounds that reflect the probability of numerical occurrence and the potential range of fluctuation in the prediction results.
Traditional interval prediction methods, including delta, Bayesian, and bootstrap methods, require a prior assumption of data distribution and involve complex calculations, limiting their widespread application [5,6]. Some existing hybrid interval prediction methods are shown in Table 1. In Ref. [5], the performance of four traditional prediction interval methods was compared, highlighting their respective advantages. To address this problem, a linear combination of these techniques was suggested, demonstrating significant advantages over individual methods. Probability prediction methods such as quantile regression [7,8] and kernel density estimation (KDE) [9] also rely on accurate point predictions and often entail intricate mathematical calculations [10,11]. In the case of water demand prediction, a hybrid model combining KDE, particle swarm optimization (PSO), and long short-term memory (KDE-PSO-LSTM) was proposed in Ref. [11] to obtain an optimal prediction error interval through confidence-window shifting. Another approach, lower upper bound estimation (LUBE) [12], tackles the limitations of traditional interval prediction methods. A neural network dual-output model is constructed in LUBE to directly obtain prediction intervals. Model parameters are iteratively optimized and calibrated using an evaluation index as the objective function, resulting in high-reliability and clear prediction intervals [13]. Due to the high nonlinearity, complexity, and discontinuity of the cost function, conventional derivative-based algorithms are inadequate for minimizing it. Therefore, utilizing more powerful optimization algorithms becomes crucial to achieving significant changes in the calculation results [14]. An interval prediction method based on LUBE and LSTM models is proposed in Ref. [15]. Ref. [16] proposed a wavelet transform, LUBE, and artificial neural network (WT-LUBE-ANN) model for interval prediction, combining the PSO and mutation operator with coverage width criterion (CWC) cost function minimization. The model proves effective in accurately anticipating wind power intervals. In flood prediction, Ref. [17] introduced a dual-output kernel extreme learning machine (KELM) model based on LUBE, incorporating orthogonal chaotic non-dominated sorting genetic algorithm-II (OCNSGA-II) for parameter calibration through multi-objective optimization to obtain optimal prediction intervals. The slime mould algorithm (SMA) [18] is also used for parameter calibration. Moreover, various improvement methods have emerged in recent years [19,20]. Among them, Ref. [20] proposed the LUBE-ANN method, which utilizes a gradient descent (GD) training mechanism and a new Huber loss function. The results show that this approach reduces training time while enhancing interval quality compared to traditional LUBE models.
The complexity of time series can affect the accuracy of prediction models. To address this issue, various methods yielding good results have been proposed by scholars [21-23], all of which preprocess the data with decomposition methods to reduce the nonlinearity and non-stationary nature of the sequence. For instance, in the field of wind speed prediction, Ref. [24] developed an interval prediction model based on the gated recurrent unit (GRU) prediction module, incorporating the variational mode decomposition (VMD) method and an error correction model. Building upon this, Ref. [25] introduced two GRU prediction modules and utilized the PSO method to assign weights to the prediction errors of each sub-signal, resulting in a high-quality stacked prediction interval. Moreover, other scholars have employed different methods to establish prediction intervals [26,27]. In Ref. [27], an attention-based LSTM model was proposed to significantly improve the interval coverage, and the prediction interval was obtained using the fuzzy information granulation (FIG) theory with a tree-structured Parzen estimator (TPE) as the hyperparameter optimizer. Another study used the cubic spline interpolation (CSI) algorithm to obtain the shortest confidence interval based on the point predictions from the VMD, firefly algorithm, and KELM (VMD-FA-KELM) model [28]. Ref. [29] combined CSI with support vector machine quantile regression to estimate wind power intervals. Furthermore, in Ref. [30], two different machine-learning approaches, including multilayer perceptron neural networks trained with a multi-objective genetic algorithm (GA) and ELMs, were coupled with the nearest-neighbors approach to derive prediction intervals.
Support vector regression (SVR) and least squares support vector regression (LSSVR) have also been employed by researchers to predict intervals [31-33]. In Ref. [31], a random forest model is initially used to screen for influential factors for the model, while the sparrow search algorithm and SVR model are then utilized to forecast China's future total energy consumption. To enhance the predictive relevance of the model, interval forecasting is implemented using the KDE approach. On the other hand, Ref. [33] leverages the LUBE theory and PSO algorithm to construct a dual-output LSSVR model by considering the bias of prediction. This approach enables the interval prediction of landslide displacement. In comparison to SVR, twin support vector regression (TSVR) [34] simplifies a large quadratic programming problem (QPP) into two smaller QPPs, resulting in accelerated model training, improved generalization capability, and enhanced model robustness. This study proposes utilizing a pair of upper and lower boundary functions generated by the model itself to establish upper and lower interval bounds for achieving TSVR interval prediction.

Energies 2023, 16, 5656

Table 1. Existing hybrid interval prediction methods.

| Reference | Basic Theory | Research Contents |
| --- | --- | --- |
| [6,8] | Bootstrap and quantile regression | The probability method based on bootstrap and quantile regression quantifies the uncertainty. |
| [10] | KDE | A brand-new distance-weighted KDE method is put forward to predict the complete distribution function of wind power. |
| [17] | LUBE with optimization algorithm | Multi-objective optimization for flood interval prediction using KELM and OCNSGA-II. |
| [16,21,22] | Prediction method with data preprocessing | Decomposition methods such as WT and VMD are used to preprocess the original data before applying interval prediction methods such as bootstrap and LUBE, so as to improve the quality of the prediction interval. |
| [25] | Two prediction modules | PSO is used to find the optimal weight for the prediction error of each GRU module and obtain the prediction interval. |

In this study, an interval prediction model called VMD-SMA-TSVR is proposed. The variational mode decomposition (VMD) method is used to preprocess the wind speed series. Twin support vector regression (TSVR) prediction models combined with the slime mould algorithm (SMA) are constructed for both the subsequences and residual sequence obtained from the decomposition. Experimental results demonstrate that the proposed model exhibits superior performance in constructing prediction intervals while effectively balancing reliability and clarity. Importantly, the proposed method does not require assumptions about the error distribution. Instead, it achieves interval prediction by leveraging the point prediction and interval prediction of the decomposed subsequences. This approach provides a novel strategy for interval prediction within the TSVR framework.
The rest of the paper is structured as follows: Section 2 describes various methods used in this paper; Section 3 contains the evaluation indices for prediction intervals; Section 4 discusses wind speed data experiments, along with the outcomes of the proposed model and comparisons to other approaches; and Section 5 gives the conclusions of this paper.

Variational Mode Decomposition
The VMD technique involves the formulation and solution of variational problems [35]. It decomposes the original time series into $K$ modes, each capturing specific data characteristics. By applying the Hilbert transform to each mode $u_k$, the corresponding one-sided frequency spectrum can be obtained. Multiplying by an exponential tuned to the estimated center frequency $\omega_k$ shifts each mode's frequency spectrum to the baseband. The bandwidth of each mode is determined by the Gaussian smoothness of the demodulated signal. Consequently, a constrained variational problem arises [24].
In this problem, $j^2 = -1$, $\delta$ is the Dirac distribution, and $*$ denotes the convolution operation. It should be emphasized that $u_1$ is the principal mode with the maximum amplitude. By employing the augmented Lagrange function, the original constrained variational problem is transformed into an unconstrained variational problem. The resulting extended Lagrange expression is solved by iteratively updating the variables $u_k$, $\omega_k$ and the Lagrangian multiplier $\lambda$. The iteration process continues until the convergence condition is met.
Here $n$ is the number of iterations, and the hatted quantities (for example, $\hat{u}_k^{n+1}$) are the Fourier transforms of the corresponding time-domain variables.
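For reference, the constrained problem and the stopping rule mentioned above are commonly written as follows in the VMD literature [35]. This is the standard textbook form, not a transcription of the equations in this paper; the symbols $f(t)$ (the original series) and $\epsilon$ (the convergence tolerance) are assumed names.

```latex
% Constrained variational problem (standard VMD form)
\min_{\{u_k\},\{\omega_k\}} \sum_{k=1}^{K}
  \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right]
  e^{-j\omega_k t} \right\|_2^2
\quad \text{s.t.} \quad \sum_{k=1}^{K} u_k(t) = f(t)

% Convergence condition checked after iteration n
\sum_{k=1}^{K}
  \frac{\left\| \hat{u}_k^{n+1} - \hat{u}_k^{n} \right\|_2^2}
       {\left\| \hat{u}_k^{n} \right\|_2^2} < \epsilon
```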

Slime Mould Algorithm
The SMA is a metaheuristic algorithm that constructs a mathematical model based on the actions and physical changes observed in the slime mold Physarum polycephalum during foraging [36,37]. Figure 1 demonstrates the sequential steps involved in the SMA method.

Figure 1. The steps of SMA (adopted from [36]).

(1) The slime molds actively search for food and move towards it by sensing airborne chemical signals. At each iteration $t$, with a population size of $n$, the location of a particular slime mold is denoted as $X$. Additionally, $X_A$ and $X_B$ represent the directional vectors of two randomly selected slime molds. In the mathematical representation of this process, $v_b$ is a parameter with a range of $[-a, a]$ and $v_c$ decreases linearly within the range $[0, 1]$; $X_b$ is the location of the individual with the highest odour concentration found so far; and the switching probability is expressed through the fitness $S(i)$ of each slime mold and the best fitness $D_F$ obtained in the iterative process. The weight $W$ is calculated as follows.

$$W(\mathrm{SmellIndex}(i)) = \begin{cases} 1 + r \cdot \log\left(\dfrac{b_F - S(i)}{b_F - w_F} + 1\right), & \text{condition} \\ 1 - r \cdot \log\left(\dfrac{b_F - S(i)}{b_F - w_F} + 1\right), & \text{others} \end{cases} \qquad (6)$$

$$\mathrm{SmellIndex} = \mathrm{sort}(S) \qquad (7)$$

where $r$ is a random value in $[0, 1]$; $b_F$ and $w_F$ are, respectively, the best and worst fitness values obtained in the current iteration; condition refers to the individuals whose fitness values rank in the first half of the population; and SmellIndex is the sorted fitness value sequence (in ascending order for a minimization problem).
(2) Wrapping food. The higher the food concentration, the greater the weight near the area. The positions of the slime molds are updated accordingly. In the position update rule, $z$ is a parameter; rand is a random number within the range $[0, 1]$; and $U_B$, $L_B$ are the upper and lower boundary values of the search range.
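The two steps above can be sketched in code. The following is a minimal illustration based on the commonly published form of SMA [36], not the authors' implementation: the function name `sma_minimize`, the default values of `n_pop`, `max_iter`, and `z`, the base-10 logarithm, and the sampling of $v_c$ from a symmetric range that shrinks linearly to zero are all assumptions.

```python
import numpy as np

def sma_minimize(f, lb, ub, dim, n_pop=30, max_iter=200, z=0.03, seed=0):
    """Minimise f over the box [lb, ub]^dim with a basic slime mould algorithm."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(lb, ub, size=(n_pop, dim))   # initial locations
    best_x, best_f = None, np.inf                # X_b and D_F
    eps = np.finfo(float).eps
    half = n_pop // 2
    # +1 for the better half of the ranking ("condition"), -1 for the others
    sign = np.where(np.arange(n_pop) < half, 1.0, -1.0)[:, None]

    for t in range(1, max_iter + 1):
        S = np.array([f(x) for x in X])          # fitness S(i)
        order = np.argsort(S)                    # SmellIndex = sort(S), ascending
        bF, wF = S[order[0]], S[order[-1]]       # best / worst in this iteration
        if bF < best_f:
            best_f, best_x = bF, X[order[0]].copy()

        # Weight W, Equations (6)-(7); eps guards against bF == wF
        ratio = (bF - S[order]) / (bF - wF - eps)
        log_term = np.log10(ratio + 1.0)
        r = rng.random((n_pop, dim))
        W = np.empty((n_pop, dim))
        W[order] = 1.0 + sign * r * log_term[:, None]

        # Oscillation parameters (assumed schedules)
        a = np.arctanh(1.0 - t / max_iter)
        b = 1.0 - t / max_iter
        vb = a * rng.uniform(-1.0, 1.0, (n_pop, dim))   # in [-a, a]
        vc = b * rng.uniform(-1.0, 1.0, (n_pop, dim))   # shrinks linearly to 0
        p = np.tanh(np.abs(S - best_f))[:, None]        # switching probability

        # Two randomly selected individuals per slime mould
        XA = X[rng.integers(0, n_pop, n_pop)]
        XB = X[rng.integers(0, n_pop, n_pop)]

        approach = best_x + vb * (W * XA - XB)
        contract = vc * X
        X_new = np.where(rng.random((n_pop, dim)) < p, approach, contract)

        # With probability z, restart an individual anywhere in [L_B, U_B]
        restart = rng.random(n_pop) < z
        X_new[restart] = rng.uniform(lb, ub, (int(restart.sum()), dim))

        X = np.clip(X_new, lb, ub)

    return best_x, best_f
```

A call such as `sma_minimize(lambda x: float(np.sum(x**2)), -5.0, 5.0, dim=3)` minimises a sphere function; this is an easy case because the contraction step pulls locations toward the origin, where the optimum lies. In the setting of this paper the objective would instead be an interval-quality index evaluated on a trained model.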

Twin Support Vector Regression
The TSVR method aims to address the computational challenges associated with the traditional SVR by transforming a large QPP into two smaller ones. TSVR achieves this by maximizing the number of samples located between two nonparallel hyperplanes. For the training samples, with $A \in \mathbb{R}^{l \times n}$ as the inputs and $Y \in \mathbb{R}^{l \times 1}$ as the corresponding outputs, the two QPPs are constructed as in [34,38]. In these problems, $C_1 > 0$ and $C_2 > 0$ are the penalty coefficients; $w$ is the weight vector and $b$ is the bias term of the regression function; $\xi$ and $\eta$ are slack vectors; $e$ is a vector of ones of appropriate dimension; and $\varepsilon_1$, $\varepsilon_2$ are parameters to be optimized. By introducing Lagrange multipliers $\alpha$ and $\beta$, the original problem can be transformed into a corresponding dual problem. This transformation involves using these multipliers to create a Lagrangian function, which combines the objective function and the constraints of the original problem. In the resulting dual optimization problem, $G = [K(A, A^T)\ \ e]$, and $K(x, A) = \exp(-\sigma \|x - A\|^2)$ is a Gaussian kernel function; $I$ is the identity matrix of the appropriate dimension; and $C_3$, $C_4$ are positive regularization parameters.
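For orientation, the pair of primal problems in the original TSVR formulation [34] is usually written as below for the linear case. This is the standard form and is offered only as a reference sketch: the regularized variant used in this paper (with $C_3$ and $C_4$) may include additional terms, and the kernel version replaces $A$ with $K(A, A^T)$.

```latex
% Down-bound regressor f_1(x) = w_1^T x + b_1
\min_{w_1, b_1, \xi} \; \tfrac{1}{2} \left\| Y - e\varepsilon_1 - (A w_1 + e b_1) \right\|^2 + C_1 e^T \xi
\quad \text{s.t.} \quad Y - (A w_1 + e b_1) \ge e\varepsilon_1 - \xi, \;\; \xi \ge 0

% Up-bound regressor f_2(x) = w_2^T x + b_2
\min_{w_2, b_2, \eta} \; \tfrac{1}{2} \left\| Y + e\varepsilon_2 - (A w_2 + e b_2) \right\|^2 + C_2 e^T \eta
\quad \text{s.t.} \quad (A w_2 + e b_2) - Y \ge e\varepsilon_2 - \eta, \;\; \eta \ge 0

% Final regressor: mean of the two bound functions
f(x) = \tfrac{1}{2} \left( f_1(x) + f_2(x) \right)
     = \tfrac{1}{2} (w_1 + w_2)^T x + \tfrac{1}{2} (b_1 + b_2)
```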
We obtain the final prediction result through the regression function (Equation (13)).
This paper employs a pair of upper and lower boundary functions to establish interval bounds, [LB, UB], thus enabling interval prediction for the TSVR model. Figure 2 provides a visual representation of the TSVR interval prediction model in conjunction with SMA.
In the boundary functions, $k$ denotes the distance between each boundary function and the predicted point value, and $a$ denotes the optimization parameter. When $a = 1$, the upper and lower boundary functions coincide, which is the predicted point value.

A Novel Interval Prediction Method
The aim of this research is to enhance the performance of the wind speed interval prediction model. To achieve this, the study proposes the VMD-SMA-TSVR model incorporating a residual series. The VMD method is initially used to decompose the data into numerous suitable subsequences. To improve interval effectiveness, the prediction interval model (PI Model) Model 1 is constructed based on the principal component obtained from the VMD decomposition, resulting in the prediction of upper and lower intervals $[LB_1, UB_1]$. Prediction point models (PP Models) Model 2 to Model N are then constructed for the remaining components, and their point prediction values are summed. Additionally, a $\mathrm{TSVR}_{Residual}$ model is developed for the residual series generated due to the reconstruction error after VMD, leading to the derivation of upper and lower intervals $[LB_{Res}, UB_{Res}]$. Subsequently, the obtained results are summed to generate the final prediction interval [LB, UB], as illustrated in Figure 3. An evaluation index system is established, incorporating indicators such as PICP, PINRW, and CWC. Furthermore, a dual-output interval prediction model is constructed for LSTM, LSSVR, and KELM, with the KDE method being applied based on the TSVR model, resulting in a total of five comparative models. For clarity, the TSVR model combined with the KDE method and the proposed method are abbreviated as TSVR-K and TSVR-P, respectively. Moreover, the aforementioned models undergo multi-step interval prediction to thoroughly investigate the reliability and accuracy of the proposed model.
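The aggregation step described above amounts to adding the summed point predictions to both bounds of the two interval models. A minimal sketch follows, assuming each sub-model has already produced its forecasts as arrays of equal length; the function and argument names are hypothetical.

```python
import numpy as np

def aggregate_interval(lb1, ub1, point_preds, lb_res, ub_res):
    """Combine sub-model outputs into the final interval [LB, UB].

    lb1, ub1       : bounds from the PI model on the principal component
    point_preds    : array of shape (N-1, T), one row per PP model (Models 2..N)
    lb_res, ub_res : bounds from the residual interval model
    """
    lb1 = np.asarray(lb1, dtype=float)
    ub1 = np.asarray(ub1, dtype=float)
    pp_sum = np.sum(np.asarray(point_preds, dtype=float), axis=0)
    lb = lb1 + pp_sum + np.asarray(lb_res, dtype=float)
    ub = ub1 + pp_sum + np.asarray(ub_res, dtype=float)
    return lb, ub
```

Because the point predictions shift both bounds equally, the width of the final interval comes only from the principal-component and residual intervals.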

PICP
The performance of the interval prediction can be evaluated in terms of two aspects: the coverage probability and width. The prediction interval coverage probability (PICP) reflects the reliability of the prediction interval [39]. If the observation value $y_i$ is within the constructed prediction interval $[L_i, U_i]$, $c_i = 1$; otherwise, $c_i = 0$.

PINAW and PINRW
Indicators serve as a measure of the width of the prediction interval to verify the clarity of the prediction interval. An ideal prediction interval is characterized by a low prediction interval normalized average width (PINAW) or prediction interval normalized range width (PINRW) [16]. The smaller the indicator value, the narrower the prediction interval and the higher the clarity. Since PINAW is already accounted for in the CWC, this paper will not include it in the subsequent tables.

where $R$ is the range of the observed values.
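The coverage and width indicators can be computed directly from the observations and the interval bounds. The sketch below uses the definitions that are common in the interval prediction literature (PICP as the mean of $c_i$, PINAW as the mean width divided by $R$, PINRW as the root-mean-square width divided by $R$); these are assumed to match the paper's equations, which are not reproduced here, and the function names are hypothetical.

```python
import numpy as np

def _as_arrays(y, lower, upper):
    return (np.asarray(v, dtype=float) for v in (y, lower, upper))

def picp(y, lower, upper):
    """Share of observations y_i that fall inside [L_i, U_i]."""
    y, lower, upper = _as_arrays(y, lower, upper)
    return float(np.mean((y >= lower) & (y <= upper)))

def pinaw(y, lower, upper):
    """Mean interval width, normalised by the range R of the observations."""
    y, lower, upper = _as_arrays(y, lower, upper)
    R = y.max() - y.min()
    return float(np.mean(upper - lower) / R)

def pinrw(y, lower, upper):
    """Root-mean-square interval width, normalised by the range R."""
    y, lower, upper = _as_arrays(y, lower, upper)
    R = y.max() - y.min()
    return float(np.sqrt(np.mean((upper - lower) ** 2)) / R)
```

PINRW weights wide intervals more heavily than PINAW, so a few very wide intervals raise it more than they raise the average width.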

NAD and ACE
Considering the possibility of true values lying outside the prediction interval, the aforementioned indicators may not be sufficient.To address this, the normalized average deviation (NAD) is introduced as a measure of the deviation of data not covered by the interval [40].In Equation (19), the NAD values are processed for ease of presentation, as they generally tend to be close to 0. The average coverage error (ACE) indicator captures the deviation between the prediction interval and the prediction interval nominal confidence (PINC).In this paper, a value of 0.1 is set for α, while PINC is set at 90%.If the PICP is significantly smaller than 1 − α, it indicates that the constructed prediction interval is unreliable.

CWC
When determining the objective optimization function, there is a trade-off between achieving a higher PICP value and a lower PINAW value. In certain situations, increasing the coverage rate can result in a wider interval, while reducing the interval width may lead to a lower coverage rate. To address this, a comprehensive evaluation index called the CWC is constructed to convert complex multi-objective optimization problems into a single objective [15]. The CWC formula is set so that the CWC and PINAW values can be differentiated when PICP > PINC. In this formula, $\mu$ takes the value of PINC, and $\eta$ is a penalty term used to amplify the gap between PICP and $\mu$; it is set to 30.
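As an illustration of how coverage and width are folded into one objective, the sketch below implements the classic CWC form from the LUBE literature together with ACE as PICP minus PINC. It is an assumption that this resembles the paper's formula: the text states that the paper's variant is set so that CWC differs from PINAW when PICP > PINC, whereas the classic form below returns PINAW unchanged in that case. The function names are hypothetical.

```python
import math

def ace(picp_value, pinc=0.90):
    """Average coverage error: negative when coverage falls short of PINC."""
    return picp_value - pinc

def cwc_classic(picp_value, pinaw_value, mu=0.90, eta=30.0):
    """Classic coverage width criterion (not the paper's modified variant).

    The exponential penalty is switched on only when PICP < mu.
    """
    gamma = 0.0 if picp_value >= mu else 1.0
    return pinaw_value * (1.0 + gamma * math.exp(-eta * (picp_value - mu)))
```

With $\eta = 30$, a coverage shortfall of 0.10 multiplies the width term by about 21, which is why intervals that miss the nominal coverage dominate the CWC ranking.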

Winkler Score
The Winkler Score is set as an additional indicator for evaluating the prediction interval. At a given confidence level, a lower absolute value of the Winkler Score indicates a sharper interval. However, it is important to note that, while the sharpness of the interval can be assessed using the Winkler Score, PICP remains the primary indicator for evaluating the reliability of the prediction interval [41].
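One negatively oriented form of this score, widely used in wind power interval forecasting, penalises the interval width and adds a further penalty for observations outside the interval. The sketch below assumes that form and averages over all samples; whether the paper uses exactly this definition cannot be confirmed from the text, and the function name is hypothetical.

```python
import numpy as np

def winkler_score(y, lower, upper, alpha=0.10):
    """Mean negatively oriented interval score at nominal level 1 - alpha.

    Each sample contributes -2*alpha*width, plus -4 times the distance
    by which the observation misses the interval (zero if covered).
    """
    y, lower, upper = (np.asarray(v, dtype=float) for v in (y, lower, upper))
    width = upper - lower
    miss_below = np.clip(lower - y, 0.0, None)
    miss_above = np.clip(y - upper, 0.0, None)
    scores = -2.0 * alpha * width - 4.0 * (miss_below + miss_above)
    return float(np.mean(scores))
```

Scores are always non-positive under this definition, which is consistent with comparing models by the absolute value of the score.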

Case Studies

Data Preprocessing
This study examines the wind speed series from the 1st to the 10th day of each month of the year, with measurements taken at 10 min intervals. Missing data may degrade the accuracy of the wind speed prediction model. To address this issue, this article employs the k-NN imputation algorithm to fill in the missing values. The specific principle and details are explained in Ref. [42]. The dataset is then divided into two distinct subsets: a training and validation set, comprising the initial 70% of the data, and a testing set containing the remaining 30% [43,44].
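The split described above is chronological, so the test set always lies after the training data in time. A minimal sketch, assuming the imputed series is available as a one-dimensional array; the function name is hypothetical.

```python
import numpy as np

def chronological_split(series, train_frac=0.70):
    """Split a time series into leading train/validation and trailing test parts."""
    series = np.asarray(series, dtype=float)
    cut = int(round(len(series) * train_frac))
    return series[:cut], series[cut:]
```

Ten days at a 10 min resolution give 1440 samples per month when no timestamps are missing, so this split would yield 1008 training and validation samples and 432 test samples.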

Model Development
Multiple types of comparative models were employed to evaluate the performance of the proposed model. The first type involved employing traditional prediction models such as LSTM, LSSVR, and KELM in conjunction with the LUBE method. In this approach, the CWC objective function is utilized, and the SMA optimizer iterates to determine the optimal proportion. The second type of model employed the TSVR model with nonparametric KDE using the Gaussian kernel function. This method initially utilized the TSVR model coupled with SMA for point prediction, with RMSE serving as the objective function. The prediction intervals were then obtained by analyzing the error series. The third type of model was the VMD-SMA-TSVR model without a residual model. Building on this, the proposed model conducted interval prediction on the residual model, thus obtaining high-quality prediction intervals. The parameter settings of other methods are shown in Table 2.

Prediction Results of the Annual Wind Speed Data
This study aimed to establish various wind speed interval prediction models and compare their performance across different months using multiple statistical evaluation indicators. Table 3 presents detailed statistical indicators for the 1-period prediction results. The data reveal that each model exhibited varying predictive performance across different months. In terms of the Score indicator, all models performed poorly in predicting intervals in June. Regarding the CWC indicator, the LSTM model exhibited significant fluctuations, while the LSSVR, KELM, and TSVR-P models showed poor indicator values in May and June. Comparatively, the differences in indicator values between TSVR-K and the proposed model were relatively small when considering the entire year. The proposed model consistently achieved the lowest values for both the Score and CWC indicators, indicating its ability to generate prediction intervals with good coverage and width. Figures 4-6 depict the annual test results of LSTM, LSSVR, and the proposed model. The conclusion drawn is that, under high interval reliability conditions, the proposed model can achieve the narrowest and highest-quality prediction intervals.
Based on the annual average of indices, although LSTM's NAD value is small, the ACE indicator is negative and lower than the confidence level, suggesting poor reliability in interval prediction. Comparing it with the TSVR-K model, we find that, while the TSVR-P model can reduce the width of the prediction interval, it comes at the cost of sacrificing reliability, resulting in significant deviation. However, coupling the VMD method greatly improves this situation, with the NAD index decreasing from 0.1551 to 0.0712, though still inferior to the TSVR-K model, which has a value of 0.0693.

Multi-Step Prediction of Different Seasons
To further validate the quality of the proposed model's prediction interval, this section selects wind speed data from April, July, October, and January, representing the seasons of spring, summer, autumn, and winter, respectively. Multi-step interval prediction is conducted, and Figures 7-10 illustrate the interval prediction results for each season. Analyzing the four statistical indicators, namely, PICP, PINRW, CWC, and Score, it becomes evident that the TSVR-P model outperforms other comparative models in terms of providing a clearer prediction interval, with a more stable performance across all four seasons. Specifically, the KELM and TSVR-K models demonstrate poor performance in predicting intervals during summer and winter, respectively. In contrast, the proposed model exhibits good stability and robustness based on the four prediction datasets.

To further validate the quality of the proposed model's prediction intervals, this section selects wind speed data from April, July, October, and January, representing spring, summer, autumn, and winter, respectively. Multi-step interval prediction is conducted, and Figures 7-10 illustrate the interval prediction results for each season. Across the four statistical indicators (PICP, PINRW, CWC, and Score), the TSVR-P model provides clearer prediction intervals than the other comparative models and performs more stably across all four seasons. The KELM and TSVR-K models perform poorly in summer and winter, respectively. In contrast, the proposed model exhibits good stability and robustness on all four prediction datasets.

To assess stability, multi-step predictions are conducted for each model, and the findings are presented in Tables 4 and 5. As the prediction period increases, the ACE values of the LSTM, LSSVR, KELM, and TSVR-P models are occasionally negative, which means that the predicted intervals of these models fail to reach the desired confidence level and lack credibility. On the other hand, the TSVR-K interval prediction model consistently constructs reliable intervals with minimal deviation from the true points, even for the 2-period winter forecast, where its PICP value remains at 1, indicating high reliability, and its NAD value is 0. However, this comes at the cost of reduced interval clarity, with a PINRW value of 0.3164, and an interval this wide holds limited practical value. The proposed model also achieves a PICP value of 1 and a NAD value of 0, but with a PINRW value of 0.1365. As a result, the proposed model significantly sharpens the prediction intervals and improves their clarity. The small absolute values of Score likewise indicate that the sharpness of the prediction interval provided by the proposed model has been significantly improved.

Figures 11 and 12 show how the CWC and Score values of each model change during multi-step prediction. The changes in the CWC indicator are less pronounced than those in the Score indicator. Only the KELM model shows a pattern of deteriorating performance as the forecast period increases. For the Score indicator, every model exhibits a consistent trend of decreasing absolute values. This demonstrates that the proposed model consistently delivers outstanding interval prediction performance across various forecast periods. The difference may arise because the CWC indicator reflects both the reliability and the clarity of the prediction interval: it combines the PICP and PINRW indicators, which partly conflict with each other. Score, in contrast, reflects only the sharpness of the prediction interval, which results in a strong regularity.
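
The coverage and width indicators discussed above can be illustrated with a minimal sketch. It assumes the definitions commonly used in the interval prediction literature (PICP as the fraction of covered observations, PINRW as the root-mean-square width normalized by the range of the observations, and ACE as PICP minus the nominal confidence level); the formulas used in this paper may differ in detail. The function names, the default level `pinc=0.90`, and the sample values are illustrative assumptions, although the reported pair PICP = 0.5986 and ACE = −30.1392% is consistent with a nominal level of 0.90.

```python
import numpy as np

def picp(y, lower, upper):
    """Fraction of observations that fall inside [lower, upper]."""
    y, lower, upper = map(np.asarray, (y, lower, upper))
    return float(np.mean((y >= lower) & (y <= upper)))

def pinrw(y, lower, upper):
    """Root-mean-square interval width, normalized by the range of y."""
    y, lower, upper = map(np.asarray, (y, lower, upper))
    return float(np.sqrt(np.mean((upper - lower) ** 2)) / (y.max() - y.min()))

def ace(y, lower, upper, pinc=0.90):
    """Average coverage error: PICP minus the nominal level (assumed 0.90)."""
    return picp(y, lower, upper) - pinc

# Made-up observations and bounds, for illustration only.
y = np.array([4.0, 5.0, 6.0, 8.0])
lower = np.array([3.5, 4.5, 6.5, 7.0])
upper = np.array([4.5, 5.5, 7.5, 9.0])

print(picp(y, lower, upper))   # three of the four points are covered
print(pinrw(y, lower, upper))
print(ace(y, lower, upper))    # negative: coverage is below the nominal level
```

A negative ACE, as in this toy example, corresponds to the situation described above in which an interval fails to reach the desired confidence level.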

Analysis of the Residual Model
After VMD is applied, reconstruction errors may occur, which can degrade the accuracy of the interval prediction results to some extent. Given the residual sequence's inherent non-stationary and nonlinear characteristics, a reliable interval prediction model is developed for it to mitigate the prediction uncertainty. The final upper and lower bounds are obtained by constructing interval predictions of the residual sequences and then summing them with the original prediction results. Table 6 compares the proposed model with the VMD-SMA-TSVR interval prediction model without residual sequences for 1-period prediction. In autumn, the comparison model's CWC value is high and the quality of its forecast interval is poor because its prediction interval has low reliability (PICP = 0.5986, ACE = −30.1392%). During winter, adding the interval prediction results of the residual sequences may not necessarily enhance the reliability and clarity of the prediction intervals, but it significantly reduces the deviation from the true values (NAD decreases from 0.0626 to 0.0039). In the other three seasons, incorporating the interval prediction results of the residual sequences significantly increases the PICP value of the prediction intervals. However, this gain in coverage also increases the PINRW value of the prediction intervals.
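
The summation of bounds described above can be sketched as follows. This is a minimal illustration that assumes plain element-wise addition of the principal-component interval, the summed point predictions of the remaining components, and the residual interval; the function name `aggregate_interval`, its argument names, and the numbers are hypothetical and are not taken from the paper.

```python
import numpy as np

def aggregate_interval(lb_main, ub_main, point_preds, lb_res, ub_res):
    """Combine component forecasts into final lower and upper bounds.

    lb_main, ub_main : interval forecast for the principal component
    point_preds      : array of shape (n_components, n_steps) holding the
                       point forecasts of the remaining components
    lb_res, ub_res   : interval forecast for the residual series
    """
    point_sum = np.sum(point_preds, axis=0)
    lower = np.asarray(lb_main) + point_sum + np.asarray(lb_res)
    upper = np.asarray(ub_main) + point_sum + np.asarray(ub_res)
    return lower, upper

# Illustrative values for two forecast steps and two point-predicted components.
lower, upper = aggregate_interval(
    lb_main=[4.0, 5.0],
    ub_main=[5.0, 6.5],
    point_preds=[[0.2, -0.1], [0.1, 0.3]],
    lb_res=[-0.3, -0.2],
    ub_res=[0.3, 0.2],
)
print(lower, upper)
```

Under this reading, the residual interval widens the final interval by its own width, which matches the reported pattern of higher PICP together with higher PINRW.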
Energies 2023, 16, x FOR PEER REVIEW 17 of 24

Figure 13 takes the summer prediction results for 1-period as an example. Although the CWC value of the model without residual sequences is lower than that of the proposed model because of its narrow interval width (PINRW = 0.0233), this narrowness also results in poor reliability, far below the confidence level. The model without residual sequences evidently fails to keep enough points within the prediction interval, whereas the proposed model exhibits higher reliability and lower deviation in its prediction interval.

Analysis of the Improvement of CWC in the Proposed Model
Table 7 and Figure 15 show the degree of improvement in the CWC values relative to all comparative models, to present the predictive performance of these models more directly. The data indicate that, as the forecast period increases, the degree of improvement of the proposed model trends upward. This suggests that, with a longer forecast period, the proposed model maintains high stability and consistently generates high-quality prediction intervals.
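
As a rough illustration of a "degree of improvement" in CWC, the sketch below assumes it is the relative reduction in CWC with respect to a comparative model, expressed in percent. The text does not state the exact formula behind Table 7, so this definition, the function name `cwc_improvement_pct`, and the sample values are assumptions.

```python
def cwc_improvement_pct(cwc_baseline, cwc_proposed):
    """Relative reduction in CWC achieved by the proposed model, in percent."""
    if cwc_baseline == 0:
        raise ValueError("baseline CWC must be non-zero")
    return (cwc_baseline - cwc_proposed) / cwc_baseline * 100.0

# Hypothetical CWC values, not taken from Table 7.
print(cwc_improvement_pct(2.0, 0.5))
```

Because a lower CWC is better, a positive value under this definition means that the proposed model improves on the comparative model.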

Analysis of the Proposed Model in Public Dataset
In this section, two sets of wind speed data collected every 5 min over a period of 5 days from Humeston, IA, US, are used for analysis. The first 1008 samples are designated as training data, while the remaining samples are used as testing data. The experimental results are presented in Table 8 and Figures 16 and 17. The multi-step prediction experiments show that the proposed model achieves more accurate prediction intervals. The CWC value increases consistently as the foresight period lengthens, which suggests that prediction errors accumulate over time and the model's predictive accuracy declines. Incorporating a coupling decomposition method into the prediction model effectively reduces the width of the prediction interval, resulting in improved clarity.

The proposed model should be tested under diverse conditions to ensure its effectiveness and practical applicability. This includes accounting for variable wind conditions in different geographical regions, extreme weather events such as storms and hurricanes, seasonal changes in various landform regions, temporal variations throughout the day, infrastructure and environmental constraints, integration with the power grid, and considerations for maintenance schedules. These factors may affect the analysis of the model's prediction performance. However, due to the limited availability of the necessary data, this study focused on analyzing the interval prediction performance of the proposed model for wind speed in specific areas during different seasons and months. Validating the model under these complex factors will contribute to a wider understanding of wind speed prediction and enhance its reliability and usability in real-world scenarios.

Moreover, the proposed model utilizes a decomposition method, which requires establishing a separate prediction model for each subsequence. This approach increases the time required for model establishment and reduces the efficiency of the prediction model. Future work will focus on addressing this issue by finding methods to streamline the process. It should be noted that, due to time constraints, the use of the SMA method to optimize the prediction interval may lead to local optima, potentially impacting the CWC. This article acknowledges the inherent trade-off between the two evaluation indicators, PICP and PINRW, in the prediction interval. To resolve this, the multi-objective problem was transformed into a single-objective problem. Future research will delve further into the multi-objective optimization problem associated with the proposed prediction model, aiming to produce more reasonable and scientifically grounded prediction intervals for wind speed.
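
One way to fold the PICP and PINRW trade-off into a single objective is a coverage width criterion. The sketch below uses a form that is common in the interval prediction literature, in which the width is penalized exponentially whenever coverage drops below the nominal level. It is an assumption that this resembles the single objective used here; the exact CWC expression and its parameters are not given in this section, and the names `mu` and `eta` and their default values are illustrative.

```python
import math

def cwc(picp, pinrw, mu=0.90, eta=50.0):
    """Coverage width criterion (a common literature form, assumed here).

    mu  : nominal confidence level (assumed value)
    eta : penalty strength for under-coverage (assumed value)
    """
    gamma = 1.0 if picp < mu else 0.0  # the penalty applies only below mu
    return pinrw * (1.0 + gamma * math.exp(-eta * (picp - mu)))

print(cwc(1.00, 0.1365))  # coverage reached: CWC equals the width term
print(cwc(0.85, 0.1000))  # coverage missed: the width is inflated by the penalty
```

With a single scalar of this kind, a metaheuristic such as SMA can minimize one objective, at the cost of fixing the relative weight of coverage and width in advance.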

Conclusions
Accurate and effective wind speed prediction is crucial for the stability of the electricity system and the efficient production of wind energy. It enables the timely dispatch of the power grid and the optimal selection of wind farms, and it ensures the secure and stable operation of the power system. Interval prediction of wind speed can meet the actual needs of the power production system by providing information on the likelihood of occurrence and the potential fluctuation range of anticipated outcomes. To satisfy practical requirements, this paper proposes a novel wind speed time-series interval prediction approach that combines the SMA method and the VMD method with the TSVR model and includes interval prediction for residual sequences.

Experiments on real wind speed data show that the LSTM model exhibits significant uncertainty in interval prediction. For example, the annual prediction results in Table 3 show that the CWC of the LSTM model fluctuates significantly, ranging from 1.1716 to 3.4943, while the reliability of the LSSVR and KELM models decreases as the forecast period increases, even falling below the confidence level during the prediction process. The TSVR-K model tends to produce prediction intervals with good reliability but poor clarity. Although the prediction interval of the TSVR-P model has better clarity than that of the TSVR-K model, its results can deteriorate as the prediction period increases. For example, when predicting summer wind speed data, the ACE at 3-period is −0.9049%, which indicates that PICP is lower than the confidence level. For the proposed model, the integration of the VMD method and the additional residual interval prediction model significantly enhances predictive performance. Compared with VMD-SMA-TSVR, the coverage of the prediction interval is improved (the average improvement rate of PICP is 30.49%), while the deviation of the interval is reduced. Overall, the proposed model effectively reduces uncertainty and enhances the reliability and clarity of the prediction intervals, thereby improving accuracy and quantitatively describing the uncertainty in wind speed prediction.
the accuracy convergence criterion.
(PI Model) Model 1 is constructed based on the principal component obtained by decomposing the sequence with VMD, and it yields the upper and lower prediction bounds [LB_1, UB_1]. Point prediction models (PP Models), Model 2 to Model N, are then constructed for the remaining components, and their point prediction values are summed. Additionally, a TSVR residual model is developed for the residual series generated by the reconstruction error after VMD, yielding the upper and lower bounds [LB_Res, UB_Res].
Energies 2023, 16, 5656

Figure 3. Sketch map of interval prediction for the proposed model.

Figure 6. Interval prediction process diagram of the proposed model.

Figure 7. Display of predicted results of six models in spring for 1-period.

Figure 8. Display of predicted results of six models in summer for 1-period.

Figure 9. Display of predicted results of six models in autumn for 1-period.

Figure 10. Display of predicted results of six models in winter for 1-period.

Figure 11. CWC of several models for different foresight periods.

Figure 12. Score of several models for different foresight periods.

Figure 13. Process chart of interval prediction ability for 1-period in summer.

Analysis of Model Prediction Performance for Different Seasons
Figure 14 presents a comparison of three statistical indicators, namely, Score, NAD, and CWC, for 1-period forecasts across different seasons. For autumn wind speed data, the CWC values obtained by all models except the LSSVR model are generally low, suggesting that the prediction intervals for autumn wind speed demonstrate good reliability and clarity. Conversely, the summer wind speed data exhibit high NAD values for each model, indicating a significant deviation of the predicted range from the actual sequence and poor interval clarity.

Figure 14. Index values for different seasons of each model for 1-period.

Figure 15. Improvement of CWC (%) in multi-step prediction for the proposed model.

Figure 16. Indicator charts for multiple models with 1-period.

Figure 17. CWC values for different cases during different foresight periods.

Table 1. Summary of different existing hybrid interval prediction methods.

Table 2. Parameter settings of the other methods.

Table 3. Performance indices comparison of six models.

Table 4. Comparison of multi-step prediction performance of various models.

Table 5. Comparison of multi-step prediction performance of various models.

Table 6. Comparing the interval prediction performance of different models for 1-period.

Table 7. Improvement of CWC (%) of the proposed model.

Table 8. Prediction results of wind speed data experimental intervals.