Improving Wind Power Generation Forecasts: A Hybrid ANN-Clustering-PSO Approach

Abstract: This study introduces a novel hybrid forecasting model for wind power generation. It integrates Artificial Neural Networks, data clustering, and Particle Swarm Optimization algorithms. The methodology follows a systematic framework: weather data are first clustered with the k-means algorithm, and Pearson's correlation analysis then identifies the most relevant elements in each cluster. Subsequently, a Multi-Layer Perceptron Artificial Neural Network is trained with a Particle Swarm Optimization algorithm, which improves convergence and reduces prediction error. An important focus of this study is to streamline wind forecasting. By using only sixteen observation points near a wind farm plant, in contrast to the complex global numerical weather prediction systems employed by the European Centre for Medium-Range Weather Forecasts, which rely on thousands of data points, this approach not only enhances forecast accuracy but also significantly simplifies the modeling process. Validation is performed using data from the Italian National Meteorological Centre. Comparative assessments against both a persistence model and actual wind farm data from Southern Italy substantiate the superior performance of the proposed hybrid model. Specifically, the Clustered Particle Swarm Optimization-Artificial Neural Network-Wind Forecasting Method reduces the mean absolute percentage error by up to 59.47% and the root mean square error by up to 52.27% compared with the persistence model.


Introduction
In recent years, there has been an escalating global commitment to combat climate change, particularly in ensuring sustainable electricity production [1]. Projections indicate that renewable energy sources (RES) will account for 60% of the world's total energy generation by 2030, with wind energy poised to emerge as the leading contributor within a decade [2]. Notably, the cumulative global wind power capacity surged from 432.9 GW in 2015 to 744 GW by 2020, bolstered by new installations in Latin America, Asia, and Africa. This growth complements that of established leaders such as China, the US, Europe, India, and Brazil. In Europe, wind power covered approximately 18% of electricity demand in 2020 [3,4].
While wind power generation (WPG) boasts a clean and pollution-free profile, its inherent intermittency and unpredictability pose integration challenges within the electric power system. The variability of wind power requires transmission system operators (TSOs) to oversize primary, secondary, and tertiary reserves, which ensures system stability but diminishes the advantages of this renewable energy source [5][6][7]. An alternative approach to mitigate this variability is the implementation of energy storage systems (ESS) at the wind farm or transmission network level. However, this option entails substantial initial investments and ongoing maintenance expenses. Hence, accurate wind power forecasting emerges as a cost-effective and readily implementable solution, crucial for the successful large-scale integration of WPG.
In recent times, a growing number of researchers have turned their attention to the challenge of wind speed forecasting in the power system sector. Wind power generation, due to its stochastic and intermittent nature, is heavily contingent on the variability of wind speed, which is one of the most challenging meteorological parameters to predict [8]. Forecasting models are categorized into short-term (10 min to 1 h), medium-term (1 h to 24 h), and long-term (1 day to 2 days) predictions, based on the temporal depth of the forecast, and into statistical, physical, and machine learning models, depending on the approach employed [9][10][11][12].
Statistical models, such as autoregressive processes (AR), autoregressive moving averages (ARMA), autoregressive integrated moving averages (ARIMA), Gaussian processes (GP), and wavelet transforms (WTs), are part of the classic Box-Jenkins methodology for wind speed forecasting, leveraging historical wind series [13]. These methods establish a linear correlation between the wind speed expected in the near future and the presently measured speed, proving particularly effective for very short-term forecasting [14]. Notably, Lydia et al. propose a forecasting model integrating linear and non-linear ARMA models for wind speed prediction at 10-min intervals up to 1 h, using measured wind direction and annual trends [15]. Cai et al. introduce an approach based on a multi-task Gaussian process (MTGP) regression model, while Haiqiang et al. employ ARMA and gray prediction for ultra-short-term (1 min to 10 min) wind speed prediction, considering spatial and temporal correlations [16,17]. Other researchers, including Skittides and Früh, advocate a model based on principal component analysis (PCA) and past wind speed data to improve prediction accuracy [18]. Recent works combine linear models with time series decomposition methods, as demonstrated by Kiplangat et al. in their hybrid model combining f-ARIMA and wavelet decomposition AR models for wind speed forecasting [11].
On the other hand, physical models leverage atmospheric physics to predict wind speed, using meteorological data such as temperature, air pressure, and atmospheric stratification, along with local information like surface roughness [19][20][21][22][23][24]. Among physical models, the mesoscale, combined dynamic factor (CDF), and time-series approaches are widely employed [21,22]. Sanz et al. propose a hybridized global and mesoscale model for wind speed prediction with impressive performance [23], while Zajaczkowski et al. explore the application of computational fluid dynamic models (CFDM) assimilating numerical weather prediction data [24].
Physical models can complement statistical approaches, as demonstrated by authors employing high-resolution regional atmospheric systems in tandem with statistical processes for precise local wind forecasts, minimizing prediction error [25].
For medium to long-term wind power forecasting, machine learning approaches, including artificial neural networks (ANN), support vector machines (SVM), genetic algorithms (GA), and fuzzy logic (FL), are favored for their ability to model the non-linear relationship between wind power and local meteorological data [10,26,27]. These methods correlate weather parameters like wind speed, direction, temperature, and pressure with wind power production time series [28]. Recent advancements have extended machine learning techniques to short-term forecasting, enhancing prediction accuracy [29,30]. Kaur et al. employ five different ANN models to identify the optimal short-term forecasting model based on mean squared error (MSE) [31]. Some researchers leverage uncertain data to develop models based on SVMs and FL theory, achieving superior wind power forecasting performance compared to statistical approaches [32][33][34][35][36].
Furthermore, hybrid models combining ANNs with particle swarm optimization (PSO) or wavelet transforms (WTs) have gained traction in the technical literature. These approaches leverage the strengths of both statistical and machine learning models for short-term wind speed and power prediction [37][38][39][40][41][42].

Proposal and Main Contribution
This study introduces a methodology for hourly wind speed prediction, building upon prior research [43,44]. The proposed model centers on the spatio-temporal evolution of weather fronts near the wind farm plant (WFP) and its correlation with the anticipated wind speed at the WFP ($S_0$). Unlike prior literature that primarily relied on local-scale meteorological evolution and historical wind power data, our approach extends the mesoscale methodology to forecast wind speed on an hourly basis using a limited number of data points, overcoming previous limitations.
The primary contribution of this work lies in the strategic utilization of only sixteen observation points in the vicinity of the WFP, a significant departure from the complex global numerical weather prediction employed by the European Centre for Medium-Range Weather Forecasts (ECMWF), which relies on thousands of points [45,46]. This streamlined approach not only ensures high forecast accuracy but also simplifies the modeling process considerably. Additionally, to address the seasonal variability and irregular wind patterns in temperate regions like Italy, our major innovation involves introducing a hybrid method with a data pre-clustering phase, markedly enhancing the efficiency of the ANN.
Furthermore, an optimized parameterization of the Artificial Neural Network (ANN) is introduced using a hybrid system based on Particle Swarm Optimization (PSO). This ancillary approach accelerates convergence toward analytical minima, enhancing the performance of the traditional multi-layer perceptron ANN (MLP-ANN). The resulting model, called the Clustered PSO-ANN-Wind Forecasting Method (CPA-WF method), combines the benefits of data pre-clustering, ANN modeling, and PSO optimization, offering a comprehensive solution for precise wind speed forecasting.
Through extensive experimentation and evaluation using real weather data obtained from the Italian National Meteorological Centre, the superior performance of the CPA-WF method over existing approaches is demonstrated. Moreover, comparative analysis of different case studies with persistence models and real wind data from the WFP validates the forecasting accuracy of our hybrid model, assessed by metrics such as mean absolute percentage error (MAPE) and root mean square error (RMSE).
By introducing the concept of weather front evolution and employing a streamlined approach with reduced data points, seasonality consideration, and optimization techniques, our CPA-WF method represents a significant step forward in short-term wind speed prediction. This research contributes to the field of wind power forecasting by providing a practical and efficient solution for the reliable integration of wind energy into existing power systems, promoting a sustainable energy future.

Methodology and CPA-WF Model
The proposed model is aimed at hourly wind speed forecasting, in order to estimate the power production of a wind farm. The emphasis has been placed on wind speed prediction due to its greater impact on WPG compared to factors such as plant size, turbine availability, and site-specific characteristics, which are standard parameters known to wind power producers.
As mentioned in the previous section, the model employs only sixteen observation points (OPs) within the vicinity of the WFP ($S_0$). This streamlined approach not only ensures high forecast accuracy but also greatly simplifies the modeling process.
The approach is mainly based on an MLP-ANN whose inputs are the weather data at the OPs around the WFP and at the WFP itself. Weather data allow a phenomenological characterization of wind: the gradients of barometric pressure (mb) and air temperature (°C) are the most influential meteorological factors in wind formation [47]. In detail, the study leverages the mesoscale-beta region, spanning approximately 20 km to 200 km around the WFP and encompassing the OPs ($S_{ij}$), to characterize weather front phenomena, which significantly influence wind speed patterns within the same area [48,49]. The OPs are chosen along the cardinal and secondary points of the wind rose around the WFP (Figure 1).
To model the evolution of the weather fronts, we consider data at three time instants ($t_{-2}$, $t_{-1}$, and $t_0$) and at two distances from the WFP ($\delta_1$ and $\delta_2$). The data at the earliest time instant ($t_{-2}$) refer to the 8 points farthest from the WFP (distance $\delta_2$); the data at the intermediate time instant ($t_{-1}$) refer to the 8 points at distance $\delta_1$ from the WFP; the meteorological data at the current time $t_0$ refer to $S_0$. The model output is the wind speed $W$ at the WFP, $S_0$, forecast at the next time instant $t_{+1}$, as expressed in Equation (1), where $T$, $P$, and $W$ are temperature, barometric pressure, and wind speed, respectively, and $t_{ij}^{-1}$ and $t_{ij}^{-2}$ are the two time instants preceding $t_{+1}$ (the forecast time horizon). Based on definitions (2) and (3), the set of points $S$ on which the proposed model is based is given by (4). Assuming $\delta_2 \cong 2\delta_1$, according to the mesoscale-beta model, the spatio-temporal correlation between the evolution of the weather fronts and the WFP simplifies to relation (5), where $t_0$ is the current time instant and $\alpha$ is the time-shift delay, which depends on the propagation speed of the meteorological fronts in the period and area considered.
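One way to write relations (1) to (5) that is consistent with the definitions in the text is sketched below. It is an illustrative form, not necessarily the authors' exact notation: the function symbol $f$, the ring index $i \in \{1, 2\}$ (distance $\delta_1$ or $\delta_2$), the direction index $j \in \{1, \dots, 8\}$, and the subset names $S_1$ and $S_2$ are assumptions.

```latex
\begin{align}
W_{S_0}(t_{+1}) &= f\Big(\{T,P,W\}_{S_0}(t_0),\;
                        \{T,P,W\}_{S_{1j}}\big(t_{1j}^{-1}\big),\;
                        \{T,P,W\}_{S_{2j}}\big(t_{2j}^{-2}\big)\Big),
                   \quad j = 1,\dots,8 \tag{1}\\
S_1 &= \{\, S_{1j} : j = 1,\dots,8 \,\} \quad \text{(points at distance } \delta_1\text{)} \tag{2}\\
S_2 &= \{\, S_{2j} : j = 1,\dots,8 \,\} \quad \text{(points at distance } \delta_2\text{)} \tag{3}\\
S   &= \{S_0\} \cup S_1 \cup S_2 \tag{4}\\
t_{1j}^{-1} &= t_0 - \alpha, \qquad t_{2j}^{-2} = t_0 - 2\alpha \tag{5}
\end{align}
```

Under this reading, the predictor receives three variables at each of the 17 points in $S$, that is, 51 scalar inputs in total.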
Based on the model inputs and outputs, a hybrid approach is proposed, involving a pre-clustering step for the input data. This enables the examination of wind characteristics and their dominant directions throughout the different months of the year. The data in each cluster are then filtered to reduce redundant and potentially misleading information. To this end, Pearson's indices are determined for each cluster, and data showing a low correlation with the related cluster are removed from the training set. This step contributes significantly to reducing the forecasting error, as shown in Section 4. Finally, the hybrid training method based on backpropagation (BP) optimized by a PSO-based algorithm represents a further contribution of the proposed methodology, in terms of minimizing the prediction error and speeding up the convergence of the wind speed predictor. In fact, the weights of the MLP-ANN are defined by using the particle swarm algorithm, thus preventing the ANN from falling into local minima.
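The Pearson-based filtering step can be sketched as follows. This is a minimal illustration, assuming that each candidate input series is compared with the target wind speed at $S_0$ and kept only when the absolute correlation reaches a cut-off. The function names and the 0.3 cut-off are hypothetical, since the text does not state the threshold used.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0.0 or sy == 0.0:
        return 0.0  # a constant series carries no linear information
    return cov / (sx * sy)


def select_inputs(series_by_name, target, threshold=0.3):
    """Keep the input series whose |r| with the target reaches the threshold.

    The threshold value is an assumption made for this sketch.
    """
    return {
        name: values
        for name, values in series_by_name.items()
        if abs(pearson(values, target)) >= threshold
    }


# Toy data: wind speed at S0 and three hypothetical candidate inputs.
target = [1.0, 2.0, 3.0, 4.0, 5.0]
candidates = {
    "strongly_correlated": [2.0, 4.0, 6.0, 8.0, 10.0],
    "weakly_correlated": [2.0, 5.0, 1.0, 4.0, 3.0],
    "constant": [3.0, 3.0, 3.0, 3.0, 3.0],
}
selected = select_inputs(candidates, target)
```

With the toy data above, only the strongly correlated series passes the filter; the weakly correlated and constant series are discarded.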

Wind Speed Forecasting Procedure
The forecasting procedure based on an ANN can be divided into two main phases: ANN building and ANN operation. The first is divided into ANN definition, construction of the data set, and training of the neural network (Figure 2). In the definition step, the inputs and outputs of the model are defined, as described in Section 2, and the ANN is designed. The input layer has 51 neurons (three variables for each of the 17 points, i.e., the sixteen OPs and the WFP), whereas the output layer has a single neuron that provides the hourly wind speed at the wind farm, at point $S_0$, at time $t_{+1}$. Taking into account the convexity of the problem [50], the MLP [31,44,45] has just one hidden layer with 150 neurons, a number defined through a classic set-up procedure. The logistic sigmoid is chosen as the activation function, as it ensures good network performance in terms of convergence speed in the learning phase for the same average error on the training set. The learning law is BP, optimized by a PSO-based algorithm (PSO-BP, Figure 3). Next, the data sets (training, validation, and test) are defined. Specifically, each element $h_n$ of the data set $H = \{h_1, h_2, \ldots, h_N\}$ is the hourly wind speed, air temperature, and barometric pressure at time $n$ at the input point $S_{ij} \in S$.
The input data are clustered to account for the seasonality of the meteorological events associated with wind formation. Subsequently, a dedicated MLP-ANN wind speed predictor is developed for each identified cluster. Each cluster is defined as a subset of consecutive months of the year. In detail, the k-means clustering algorithm is used to split the whole data set $H$ into $K$ groups [51,52]. The clustering process, described in Figure 4, starts with $K = 2$ as the initial value and ends when the algorithm identifies the optimal number of sets ($K$), representative of data with a high correlation between the meteorological parameters at points $S_{ij}$ and the wind at $S_0$ [48,49,53]. Data with a high correlation index are identified through Pearson correlation analysis: if the Pearson correlation index is close to zero, the correlation between the analyzed data is weak; if the index is close to −1 or +1, the correlation is strong [54]. The $K$ CPA-WFs are then trained on the training set structured in this way.
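The network topology described above (51 inputs, one hidden layer of 150 logistic-sigmoid neurons, one output) can be sketched in plain Python as follows. This is only an illustration of the forward pass: the random initialization range, the zero biases, and the use of a sigmoid on the output neuron (which implies a target wind speed scaled to the interval (0, 1)) are assumptions not stated in the text.

```python
import math
import random

N_POINTS = 17                   # sixteen OPs plus the wind farm site S0
N_VARS = 3                      # temperature, pressure, wind speed
N_INPUT = N_POINTS * N_VARS     # 51 input neurons
N_HIDDEN = 150
N_OUTPUT = 1


def sigmoid(x):
    """Logistic sigmoid, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def init_params(rng):
    """Small random weights and zero biases (assumed initialization)."""
    w1 = [[rng.uniform(-0.1, 0.1) for _ in range(N_INPUT)]
          for _ in range(N_HIDDEN)]
    b1 = [0.0] * N_HIDDEN
    w2 = [rng.uniform(-0.1, 0.1) for _ in range(N_HIDDEN)]
    b2 = 0.0
    return w1, b1, w2, b2


def forward(x, params):
    """Forward pass: 51 inputs -> 150 sigmoid units -> 1 sigmoid output."""
    w1, b1, w2, b2 = params
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w1, b1)
    ]
    return sigmoid(sum(w * h for w, h in zip(w2, hidden)) + b2)


# Example call with a dummy, already normalized input vector.
params = init_params(random.Random(0))
prediction = forward([0.5] * N_INPUT, params)
```

The output is a normalized value that would have to be rescaled to m/s with whatever normalization is applied to the training targets.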
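The k-means step can be illustrated with a plain implementation of Lloyd's algorithm for a given number of clusters. This is a sketch under stated assumptions: it uses the Euclidean distance, random initial centroids drawn from the data, and a fixed `k`; the outer loop that increases `K` from 2 until the Pearson-based criterion is met is not reproduced, since the text does not specify its stopping rule.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, iterations=100, seed=0):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
    return centroids, labels


# Toy data: two well-separated groups of two-dimensional feature vectors.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, labels = k_means(points, k=2)
```

In the setting of the paper, each point would be a feature vector built from the meteorological data of a given period, and each resulting cluster would receive its own predictor.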

Figure 4 flowchart steps: select the Euclidean or Pearson distance; set the number of clusters $k$ equal to 2; cluster according to the minimum distance between the data objects and the initial $k_i$; check whether all objects are assigned; recalculate the centroids.

As mentioned above, the weights and biases of the MLP-ANNs are updated through a PSO-based algorithm that optimizes the standard BP procedure, thus improving the convergence of each CPA-WF [55,56] and overcoming the drawback of the standard BP algorithm, which tends to become trapped in local minima [50,51]. The cost function (i.e., the fitness function to be minimized) of the $i$th particle is assumed to be the MSE produced by the $K$ CPA-WF, given by Equation (6), where $N$ is the number of training samples, $O$ is the number of output neurons, and $d_{ij}$ and $y_{ij}$ are the desired and forecast outputs, respectively. According to the fitness function value of each particle, the individual extreme values and the global extreme value are computed. At each iteration, the fitness value of each particle after the update is compared with the value computed before it. This iterative procedure ends when the stopping criterion (MSE minimization) is satisfied; the global optimum solution, corresponding to the particle position, then becomes the network weight vector, which is used in the training of the $K$ CPA-WFs.
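For reference, the MSE cost function referred to as (6) can be written in the following form. This is a sketch built from the symbols $N$, $O$, $d_{ij}$, and $y_{ij}$ defined in the text, not necessarily the authors' exact expression: the normalization by $N$ alone is an assumption (with the single output neuron used here, $O = 1$, dividing by $N \cdot O$ gives the same value), and $i$ is taken to index the training samples.

```latex
\[
\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} \sum_{j=1}^{O} \left( d_{ij} - y_{ij} \right)^{2} \tag{6}
\]
```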
In detail, for each CPA-WF, the specific steps of the PSO-BP training phase are as follows:
1. define the topology of the neural network;
2. randomly initialize the parameters of the $K$ CPA-WF (weights and biases);
3. initialize the parameters (velocity and position) and the search space of the PSO according to the topology of the CPA-WF;
4. run the $K$ CPA-WF so that a wind speed forecast is derived for each particle at each iteration $h$; the PSO then computes the best position of the $i$th particle over its history up to iteration $h$ ($P_{best}$) and the position of the best particle in the swarm at iteration $h$ ($G_{best}$);
5. calculate for each particle the value of the cost function, as defined in (6);
6. update the velocity and position of the PSO particles until the cost function is minimized, as described in [49];
7. set the vectors of the best position and velocity that minimize the cost function as the weights and biases of the MLP-ANN.
The above procedure is applied to all $K$ CPA-WFs. After that, the validation phase is run, thus concluding the ANN building phase. The ANN operation phase will be described in the next section.
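The PSO part of the training steps above can be sketched as follows. This is a simplified illustration, not the authors' PSO-BP implementation: it searches the weight vector of a small one-hidden-layer MLP with a standard global-best PSO and omits the backpropagation refinement. The inertia and acceleration coefficients, the swarm size, the flat weight layout, and all function names are assumptions, and a toy network size is used so that the example runs quickly.

```python
import math
import random


def sigmoid(x):
    """Logistic sigmoid, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def mlp_forward(weights, x, n_in, n_hidden):
    """MLP with a flat weight vector laid out as:
    hidden weights, hidden biases, output weights, output bias."""
    idx = 0
    sums = []
    for _ in range(n_hidden):
        sums.append(sum(weights[idx + k] * x[k] for k in range(n_in)))
        idx += n_in
    hidden = [sigmoid(sums[h] + weights[idx + h]) for h in range(n_hidden)]
    idx += n_hidden
    out = sum(weights[idx + h] * hidden[h] for h in range(n_hidden))
    idx += n_hidden
    return sigmoid(out + weights[idx])


def mse(weights, samples, n_in, n_hidden):
    """Fitness of one particle: mean squared error over the training samples."""
    return sum((target - mlp_forward(weights, x, n_in, n_hidden)) ** 2
               for x, target in samples) / len(samples)


def pso_train(samples, n_in, n_hidden, n_particles=20, iterations=200,
              inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best PSO over the MLP weight vector; returns (g_best, g_cost)."""
    rng = random.Random(seed)
    dim = n_hidden * n_in + 2 * n_hidden + 1
    pos = [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    p_best = [p[:] for p in pos]
    p_cost = [mse(p, samples, n_in, n_hidden) for p in pos]
    g = min(range(n_particles), key=p_cost.__getitem__)
    g_best, g_cost = p_best[g][:], p_cost[g]
    for _ in range(iterations):
        for i in range(n_particles):
            for k in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][k] = (inertia * vel[i][k]
                             + c1 * r1 * (p_best[i][k] - pos[i][k])
                             + c2 * r2 * (g_best[k] - pos[i][k]))
                pos[i][k] += vel[i][k]
            cost = mse(pos[i], samples, n_in, n_hidden)
            if cost < p_cost[i]:        # update the particle's own best (P_best)
                p_best[i], p_cost[i] = pos[i][:], cost
                if cost < g_cost:       # update the swarm's best (G_best)
                    g_best, g_cost = pos[i][:], cost
    return g_best, g_cost


# Toy usage: two inputs, three hidden neurons, targets scaled to (0, 1).
samples = [([0.0, 0.0], 0.2), ([0.0, 1.0], 0.6),
           ([1.0, 0.0], 0.6), ([1.0, 1.0], 0.9)]
best_weights, best_cost = pso_train(samples, n_in=2, n_hidden=3, iterations=50)
```

In a hybrid scheme such as the one described, the returned weight vector would typically serve as the starting point (or the final setting) of the MLP weights for one cluster, and the same routine would be repeated for each of the $K$ predictors.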

Wind Prediction Results and Error Analysis
To show the effectiveness of the proposed methodology, an extensive test campaign was carried out. The analysis focuses on the most critical points, where the forecast error is greatest and where the benefits achieved by the methodology can be better evaluated. The main results are presented and discussed below.

Case Study and Input Data
The proposed hybrid model has been implemented and applied to forecast hourly wind speed in a wind farm situated in the South of Italy. The ANN training set consists of hourly average meteorological data, specifically wind speed $W$ (m/s), air temperature $T$ (°C), and barometric pressure $P$ (mb), defined at 17 points, as described in Section 2. The sixteen OPs are placed at a minimum distance $\delta_1$ of approximately 20 km and at a maximum distance $\delta_2$ of approximately 50 km. The time-shift delay factor $\alpha$ is set equal to 1 h.
The data are provided by the Italian Air Force Meteorological Service and by IVPC (Italian Vento Power Corporation). All data were acquired every 10 min by weather measurement stations located at each of the chosen OPs and at the test site where the prediction is required.
Accordingly, the training set is built with data averaged over one hour. The entire data set covers a period of four years and has been partitioned into clusters, according to the procedure described above. The data from the first three years are used to build the training set, while the data from the fourth year are used for the model validation phase.
The application of the clustering procedure led to the identification of four clusters, corresponding approximately to the four seasons of the year. As shown in Table 1, the forecasting error is significantly reduced when the clustering phase is combined with Pearson's correlation analysis.
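The data preparation described above (hourly averaging of 10-min readings, then a chronological split with the first three of four years used for training) can be sketched as follows. The function names are hypothetical, the series is assumed to be complete (six readings per hour, no gaps), and the 0.75 fraction simply encodes the three-out-of-four-years split.

```python
def hourly_means(samples_10min):
    """Average consecutive groups of six 10-minute readings into hourly values."""
    if len(samples_10min) % 6 != 0:
        raise ValueError("expected a whole number of hours (six readings each)")
    return [
        sum(samples_10min[i:i + 6]) / 6.0
        for i in range(0, len(samples_10min), 6)
    ]


def chronological_split(hourly, train_fraction=0.75):
    """First part for training, remainder for validation, with no shuffling."""
    cut = int(len(hourly) * train_fraction)
    return hourly[:cut], hourly[cut:]


# Toy usage: two hours of 10-minute wind speed readings (m/s).
readings = [float(v) for v in range(12)]
hourly = hourly_means(readings)
train, validation = chronological_split(list(range(8)))
```

Keeping the split chronological avoids using future observations to train a model that is validated on the past.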

Wind Speed Prediction Results
To better highlight the effectiveness of the proposed method, the predicted values are shown as the average over the first 10 runs of the ANN corresponding to each cluster, rather than as the best-predicted value for each cluster. The results refer to four specific days of the year, one for each cluster identified by the clustering procedure, namely the two equinoxes and the two solstices. These days mark transitional periods of the year, typically characterized by high meteorological variability, which makes forecasting more difficult. Moreover, the simulation results of the proposed CPA-WF predictor have been compared both with the results obtained with the persistence model [49] (the typical benchmark for an ANN-based predictor) and with the real wind data acquired by IVPC. The comparison is shown in Figures 5-8.

Forecasting Error Analyses
The RMSE and MAPE indexes are used to evaluate the performance of the CPA-WF model proposed for hourly wind speed prediction. These error metrics are calculated from the actual and forecast wind speed values as follows:
$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( d_i^{a} - d_i^{f} \right)^{2}}$$
$$\mathrm{MAPE} = \frac{100}{N} \sum_{i=1}^{N} \left| \frac{d_i^{a} - d_i^{f}}{d_i^{a}} \right|$$
where $d_i^{a}$ and $d_i^{f}$ are the actual and forecast wind speeds at hour $i$, respectively, while $N$ is the prediction horizon. The performance comparison of the proposed hybrid model and the persistence model is shown in Tables 2 and 3.
Simulation results show that the proposed model is characterized by an almost constant error for each cluster, which is significantly lower than that of the persistence model used as a benchmark. The trends in Figures 5-8 also show that, even in the worst case, the forecast error of the proposed model is significantly lower than that of the persistence model in absolute terms. In addition, the case studies show that the proposed model can closely predict the actual wind profile, with superior performance not only in terms of absolute value but also in its ability to follow rapid variations in wind speed. Specifically, with reference to the four clusters in the worst case, the RMSE of the proposed CPA-WF has a variance of 0.06, against 0.26 for the persistence model. Consequently, the standard deviation of the proposed model is less than half that of the persistence model (0.24 versus 0.51). Similar results are obtained for the MAPE index: the variance falls from 13.07 for the persistence model to 2.31 for the proposed model, and the standard deviation falls from 3.62 to 1.52, confirming that the proposed methodology produces high-quality hourly predictions when compared with the benchmark widely used in the literature.
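The two error metrics and the persistence benchmark can be computed as in the sketch below. It assumes the usual definitions (RMSE as the square root of the mean squared error, MAPE expressed in percent) and the usual one-step persistence rule, in which the forecast for the next hour equals the last observed value; the function names and the toy series are illustrative.

```python
import math


def rmse(actual, forecast):
    """Root mean square error between actual and forecast values."""
    n = len(actual)
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n)


def mape(actual, forecast):
    """Mean absolute percentage error in percent; actual values must be non-zero."""
    n = len(actual)
    return 100.0 / n * sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast))


def persistence_forecast(series):
    """One-step persistence: the forecast for hour i+1 is the value at hour i."""
    return series[:-1]


# Toy usage on an hourly wind speed series (m/s).
series = [4.0, 5.0, 4.0, 2.0]
actual = series[1:]                      # hours for which a forecast exists
forecast = persistence_forecast(series)  # previous-hour values
persistence_rmse = rmse(actual, forecast)
persistence_mape = mape(actual, forecast)
```

The same two functions can be applied to the output of any predictor, which makes the comparison with the persistence benchmark direct.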

Conclusions
This paper introduces a hybrid methodology for hourly wind speed forecasting, called the Clustered PSO-ANN-Wind Forecasting Method. The approach is grounded in a simplified mesoscale model with a significantly reduced number of points, complemented by an MLP-ANN utilizing pre-clustered input data and an optimized learning law. This innovative methodology relies on a limited set of fundamental meteorological data, encompassing time series of wind speed, air temperature, and barometric pressure, within a maximum time horizon of two hours and a perimeter of approximately 50 km around the focal point for the next hour's wind forecast.
The cornerstone of this methodology lies in the construction of the data set, which hinges on the spatio-temporal evolution of weather fronts and their influence on wind formation. These data are then processed by an MLP-ANN. To enhance the performance of the ANN and substantially reduce forecasting errors and convergence time, two additional refinements were introduced. First, a hybrid PSO-BP approach was integrated into the training phase to expedite backpropagation convergence. Second, to better align with the seasonal characteristics of winds, a clustering algorithm based on the k-means method and Pearson's indices was implemented. Subsequently, a dedicated ANN was trained for each cluster identified by the k-means method.
The effectiveness of this proposed methodology was rigorously assessed on an actual site using four years' worth of meteorological data provided by the Italian Air Force Meteorological Service and IVPC. The results of the test campaign, when compared both with those obtained by the persistence model and with the measured data, demonstrated a higher consistency of forecasts throughout all periods of the year. Notably, the prediction error was approximately half that which characterizes the most widely used benchmark models in the literature.
In terms of contributions to the field of knowledge, this study brings to the forefront a streamlined utilization of only sixteen strategically placed observation points, coupled with the introduction of a hybrid method incorporating data pre-clustering. These advancements significantly enhance the efficiency of the ANN. The applicability of this methodology is particularly pertinent in the realm of wind power generation forecasting, offering a highly accurate and practical approach with substantial potential for implementation in the renewable energy sector. Looking ahead, future investigations may delve into further refinements to optimize performance in specific meteorological contexts, as well as address emerging research questions pertaining to the scalability of the approach for diverse geographical regions and climate patterns.

Figure 4. K-Means Clustering for Seasonal Wind Pattern Analysis.

Table 1. Mean absolute percentage error (MAPE %) with and without Pearson's correlation analysis.

Table 2. RMSE comparison between CPA-WF and Persistence model.

Table 3. MAPE comparison between CPA-WF and Persistence model.