Using Artificial Intelligence to Predict Wind Speed for Energy Application in Saudi Arabia

Abstract: Predicting wind speed for wind energy conversion systems (WECS) is essential to monitor, control, plan, and dispatch generated power and to meet customer needs. The Kingdom of Saudi Arabia recently set ambitious targets in its national transformation program and Vision 2030 to move away from oil dependence and redirect oil and gas exploration efforts to other higher-value uses, chiefly meeting 10% of its energy demand through renewable energy sources. In this paper, we propose the use of artificial neural networks (ANNs) as a means of predicting daily wind speed in a number of locations in the Kingdom of Saudi Arabia based on multiple local meteorological measurement data provided by K.A.CARE. The suggested model is a feed-forward neural network trained with a supervised learning technique using a back-propagation algorithm. Results indicate that the best structure is obtained with thirty neurons in the hidden layer, corresponding to the minimum root mean square error (RMSE) and the highest correlation coefficient (R). A comparison between predicted and actual data from meteorological stations showed good agreement. A comparison between five machine learning algorithms, namely ANN, support vector machines (SVM), random tree, random forest, and RepTree, revealed that random tree has low correlation and relatively high root mean square error. The significance of the present study lies in its ability to predict wind speeds, a necessary prerequisite to executing sustainable integration of wind power into Saudi Arabia's electrical grid, assisting operators in efficiently managing generated power, and helping achieve the energy efficiency and production targets of Vision 2030. More work is underway to compare and test other machine learning algorithms for predicting wind speed distribution accurately, as well as a sensitivity analysis of the effect of different attributes on the predicted wind speed.


Introduction
Today, governments, policymakers, and energy utilities employ a broad range of tools, such as cash incentives and tax credits, to encourage the use of various renewable energy technologies. According to various resources in the energy sector, such as the Global Status Report REN21 [1] and the Global Wind Energy Council Report GWEC [2,3], the global installed wind energy capacity increased from about 159 gigawatts (GW) in 2009 to about 591 GW at the end of 2018. By the end of 2019, it is expected to reach around 656 GW, as shown in Figure 1; Table 1 gives the global annual installed wind capacity from 2009 to the expected value for 2019 (2019e). According to WindEurope's Central Scenario, 323 GW of cumulative wind energy capacity would be installed in the EU by 2030. In the High Scenario, 397 GW of wind energy capacity would be installed in the EU by 2030, with 298.5 GW onshore and 99 GW offshore. In the Low Scenario, there would be 256.4 GW of wind capacity in 2030, with 207 GW onshore and 49 GW offshore [4]. Currently, in Europe alone, electricity generated from wind turbines covers up to 11% of the electricity demand. By 2020, this share is expected to increase to 16.5%, and by 2030 renewable energy could serve at least 27% of Europe's electricity needs and generate over three million jobs [4].
In Saudi Arabia, 15% of domestic oil production is used to generate electricity, and another 50% is consumed by electric power plants [5,6]. According to the Electricity and Cogeneration Regulatory Authority, the residential sector consumes 50% of the generated power [7,8]. In the last ten years, the consumption per capita increased from 7258 kWh/capita to 9167 kWh/capita. In 2017, Saudi Arabia produced only 0.04% of its electricity from non-fossil fuels, using solar energy [7,9].
Recently, Saudi Arabia set an ambitious target in its national transformation program and Vision 2030 to move away from oil dependence and redirect oil and gas exploration to other higher-value uses [10,11]. This goal is being pursued through an energy roadmap that aims to supply 10% of the country's energy demand from renewable sources, with an initial target of generating 9.5 GW of renewable energy by 2023 and 3.45 GW by 2020. More recently, the Renewable Energy Project Development Office (REPDO) of Saudi Arabia's Ministry of Energy, Industry and Mineral Resources [12] announced a substantial increase in the renewable energy share to produce 40 GW of solar energy and 16 GW of wind power over the next decade. Vestas will supply 99 V150-4.2 MW wind turbines to Saudi Arabia's first utility-scale wind farm in Dumat Al Jandal, south of the Al Jouf region.
With the current rapid development and growth in wind generation, there is a need for serious research on different areas of wind energy conversion systems (WECS) and energy management, in particular, wind speed and power forecasting. As the wind speed and nonlinear fluctuations represent the main components in the prediction of the energy output of wind turbines, improving wind forecasting has substantial financial and technical benefits, and assists electrical system operators in decreasing the risk of unreliability in electricity supply. The wind is particularly intermittent and variable by nature, and any unexpected variations critically affect the operating costs of electricity production. In this paper, we propose using the artificial neural networks (ANNs) method to predict the daily wind speed distribution in some locations in the Kingdom of Saudi Arabia using multiple local meteorological measurement data provided by K.A.CARE [13]. The primary benefit of using artificial intelligence techniques to determine wind speed is the ability to build a prediction model capable of forecasting short term wind speed distribution since the model can train and learn from the historical data set. The rest of this paper is organized as follows: Section 2 details the main research on wind speed forecasting techniques and methods for wind energy conversion systems (WECS). Section 3 discusses the wind data collection and analysis in some cities in Saudi Arabia. Section 4 presents the artificial neural network modeling designed to forecast wind speed distribution. Section 5 discusses the results obtained from the prediction of the wind speed distribution and its comparison to experimental data. Finally, Section 6 provides the main conclusion of this study.

Related Work
With rapid technology development, wind energy has become an important commercial option for large-scale power production. However, the wind energy resource is highly variable, and the resulting fluctuations in generation capacity can cause instability in the power grid. Today there is growing ambition and enthusiasm for using the power of the wind to generate clean and environmentally friendly renewable energy and reduce our dependence on fossil fuel resources. In a report published by the United Nations [14], it is recommended that the maximum emission of CO2 should not exceed 44 gigatons by 2020, falling to 40 gigatons by 2025 and further to 22 gigatons by 2050. With the current rapid development and growth in wind generation, there is a need for serious research on different areas of wind energy conversion systems (WECS) and energy management, in particular wind speed prediction and power forecasting. Because the wind speed and its nonlinear fluctuations are the main components in predicting the energy output of wind turbines, investigating wind energy production at a given location requires an intensive study of the wind distribution in terms of its availability, direction, hourly distribution, diurnal variation, and frequency. In contrast to conventional power plants, the electricity generated from wind turbines depends mostly on meteorological conditions, in particular the magnitude of the wind speed, the atmospheric turbulence, and the control of the wind turbine characteristics [15,16]. In recent years, there has been an increasing interest in developing methods and techniques to predict the wind speed and the generated power of WECS to maintain a sustainable integration of wind power into the electricity grid [17-21].
These methods are classified by time scale into (i) very short-term forecast, (ii) short-term forecast, (iii) medium-term forecast, and (iv) long-term forecast, as shown in Table 2 along with the time scale, range, and applications.

Table 2. Time scale, range, and applications of wind forecasting methods.
Time scale | Range | Applications
[...] | [...] | Dispatching the generated power of the WECS to meet customer need within a short time [24,25]
Medium-term | 6 h to 1 day | Operational security, safety, and electricity market [26,27]
Long-term | 1 day to 1 week or above | Unit commitment decisions, maintenance scheduling, and operational management [28,29]

Machine learning methods have been widely used to predict wind speed distribution [17,30-32]. These methods are divided into (i) supervised learning, where both inputs and outputs are provided, (ii) unsupervised learning, where only the input data is provided, (iii) reinforcement learning, represented by a mixture of supervised and unsupervised learning, and (iv) evolutionary learning, which is considered a biological approach to machine learning. On the other hand, wind forecasting methods are classified into four categories: physical methods, statistical methods, hybrid methods, and artificial intelligence methods [17,19,33]. Other authors, such as Sun et al. [34], also list the persistence method or "Naïve Predictor", which assumes that the wind speed at a future time "t + x" is the same as the current wind speed at time "t". The precision of this model decreases quickly as the prediction horizon grows. The following subsections highlight the main points of each of the four forecasting methods.
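As a minimal sketch of the persistence ("Naïve Predictor") baseline described above, the following Python snippet forecasts the speed at time t + x as the speed at time t and reports how the error grows with the horizon. The function names and the sample wind speeds are illustrative assumptions and are not taken from the study.

```python
from math import sqrt

def persistence_forecast(series, horizon):
    """Pair each prediction v(t) with the value it predicts, v(t + horizon)."""
    if horizon < 1 or horizon >= len(series):
        raise ValueError("horizon must be between 1 and len(series) - 1")
    predicted = series[:len(series) - horizon]  # value known at time t
    actual = series[horizon:]                   # value observed at time t + horizon
    return predicted, actual

def rmse(predicted, actual):
    """Root mean square error between two equally long sequences."""
    return sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

# Hypothetical hourly wind speeds in m/s (illustrative values only).
speeds = [5.0, 5.5, 6.0, 6.5, 7.0, 6.0, 5.0, 4.0]

for h in (1, 2, 3):
    pred, act = persistence_forecast(speeds, h)
    print(f"horizon {h} h: RMSE = {rmse(pred, act):.3f} m/s")
```

On this toy series the error increases with the horizon, which is the behavior the text attributes to the persistence method; a real evaluation would use measured station data.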

Physical Methods
Physical methods are based on a series of empirical formulas where a large amount of data is obtained from weather forecasts based on the geographical locations of the WECS and physical parameters such as temperature, humidity, and pressure [19,35-37]. The training input from past data is not necessary for the physical method. Using the downscaling method, the meteorological wind speed is used to determine the wind speed at the hub of the WECS and then the amount of energy produced. Multiple physical models have been developed to help in producing a more reliable forecast, such as the numeric weather prediction (NWP) model [35], which solves complex mathematical equations describing the physical process of the wind over a given geographical area. NWP runs once or twice a day, which limits its use to a short-term prediction. Applying physical modeling requires considerable computing time and shows better results only when weather conditions are stable [38].

Statistical Methods
Wind speed prediction using statistical methods is based on historical data and patterns [19,39]. These methods are fundamental for WECS control and planning. However, due to the stochastic nature of wind, their predictions are often less accurate than desired. The time series technique takes a statistical approach: a model is fitted to the measured data by minimizing the difference between the predicted and the measured values. Time series models such as the autoregressive (AR) model, the moving-average (MA) model, the autoregressive moving average (ARMA) model, and the autoregressive integrated moving average (ARIMA) model are capable of forecasting the average hourly wind speed and the annual power generated by the WECS [17-19].
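To make the time series idea concrete, the sketch below fits the simplest autoregressive model, AR(1), by ordinary least squares and iterates it forward. It is only an illustration under stated assumptions: the model order, the function names, and the synthetic series are chosen here for brevity, and the studies cited above may use higher-order or ARIMA models.

```python
def fit_ar1(series):
    """Least-squares fit of v[t] = c + phi * v[t-1]; returns (c, phi)."""
    x = series[:-1]   # lagged values v[t-1]
    y = series[1:]    # targets v[t]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("series must vary to estimate phi")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    phi = sxy / sxx
    c = mean_y - phi * mean_x
    return c, phi

def forecast_ar1(last_value, c, phi, steps):
    """Iterate the fitted recursion 'steps' times from the last observation."""
    forecasts = []
    value = last_value
    for _ in range(steps):
        value = c + phi * value
        forecasts.append(value)
    return forecasts

# Synthetic series generated by v[t] = 2 + 0.5 * v[t-1] (illustrative only).
series = [10.0]
for _ in range(6):
    series.append(2.0 + 0.5 * series[-1])

c, phi = fit_ar1(series)
print("c =", round(c, 6), "phi =", round(phi, 6))
print("next two steps:", forecast_ar1(series[-1], c, phi, 2))
```

Because the synthetic series follows the recursion exactly, the fit recovers its coefficients; measured wind speeds would leave a residual that the least-squares criterion minimizes.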

Hybrid Methods
Hybrid methods use a combination of different techniques such as mixing physical and statistical methods, individual models, or short-term and medium-term forecasts. The main objective is to take advantage of each method and increase the accuracy of the wind speed forecast [17,40]. Several hybrid techniques have been used to forecast the wind speed and power generated by WECS, including Wavelet and ANN, ANN and Fuzzy logic, spatial correlation and neural networks (NNs), and ARIMA and neural networks (NNs) to list just a few [41]. In a recent study conducted by Alencar et al. [19] on ultra-short, short, medium, and long-term wind speed prediction models based on ANN, ARIMA, and hybrid models, the authors concluded that despite the existence of a full range of methods none of the forecasts cover the entire forecasting possibilities between ultra-short-term and several years ahead.

Artificial Intelligence Methods
The core strategy in artificial intelligence methods is to build a relationship between the input and output data using algorithms rather than analytical methods. In statistical methods, the model seeks to decrease the difference between the predicted and immediate past values using an autoregressive mathematical model, while neural networks look for patterns in the input and output data over a long period. Among the most popular artificial intelligence models used in forecasting wind speed are ANN models [18,42], support vector machines (SVM) [43,44], fuzzy logic [45], and the adaptive neuro-fuzzy inference system (ANFIS) approach [46,47]. Although ANN has some advantages due to its simplicity and self-learning capacity, selecting the appropriate method is not an easy task. In some locations and under certain atmospheric circumstances, some models may perform well in predicting the wind speed; however, the same models may generate inaccurate predictions under other circumstances. As artificial intelligence methods developed rapidly, many researchers turned to machine learning techniques to predict wind speed, as reported by [17,30,43]. Artificial intelligence can capture the non-linearity of the wind speed characteristic but may not capture all the factors, such as the topography of the terrain, the temperature, the humidity, the seasonal wind speed distribution, and the height of the wind turbines. There is still more work to be done to increase the accuracy of wind speed prediction.
Neural network approaches require particular attention to select the appropriate and relevant structural parameters of the model, such as the number of layers and neurons. Alencar et al. [19] and Sun et al. [34] showed that the forecasting error increases with the time horizon. In a comparison between ANN and a hybrid model (ANN and computational fluid dynamics (CFD)) against Supervisory Control and Data Acquisition (SCADA) data from a wind farm in Italy, Burlando et al. [48] found that both methods gave similar results; however, ANN forecasts were better at medium wind speed ranges while the hybrid model forecasts were better at low and high wind speed ranges. In another review of ANN models for energy prediction, including wind, solar, and hydropower, Bermejo et al. [49] concluded that many contributions endorsed the ANN model but only under some conditions, including the use of a large range of data to improve the prediction accuracy. The authors stated that the main advantage of ANN models is their capacity to determine complex relations among variables while keeping high data tolerance. Ernst et al. [50] stated that using a combination of models and forecasting for larger regions and shorter horizons reduces the average prediction errors. An exhaustive survey of artificial neural networks applied in wind energy systems published by Marugán et al. [51] demonstrated that artificial neural networks could be an alternative to conventional methods in many cases. Their analysis summarizes the key approaches by compiling processes, algorithms, and models. Their conclusion highlighted the evolution of ANN in the last 10 years along with future trends and applications in the domain of wind turbines. In a review of 30 years of wind speed prediction, Costa et al. [52] suggested further study on complex terrain and research on adaptive parameter estimation. A best practice for short-term forecasting has been suggested by Giebel and Kariniotakis [53]. Mujeeb et al. [54] proposed a demand management scheme and a big data-driven wind power forecasting method. They employed a deep-learning technique to predict the day-ahead wind power at the New England wind farm located in Maine, USA. Mosavi et al. [17], in a study on "State of the Art of Machine Learning Models in Energy Systems, a Systematic Review", indicated that the current trend is towards personalized machine learning models designed for a particular application.

Wind Data Collection
The cut-in speed for a wind turbine is typically between 3 and 4 m/s [55]. In terms of wind speed distribution, Saudi Arabia's regions exhibit significant onshore wind speeds ranging between 6 and 8 m/s [56-58], enough for a wind turbine to produce an output. The average annual wind speed at 100-m height recorded in Saudi Arabia at different cities ranges from 5. Data used in the present study has been collected from the King Abdullah City for Atomic and Renewable Energy (K.A.CARE) as part of the Renewable Resource Monitoring and Mapping (RRMM) Program [13]. The hourly data provided (May 2013 to July 2016) includes different attributes such as air temperature, wind direction and speed, global horizontal irradiance (GHI), relative humidity, and barometric pressure. The hourly distribution of wind speed for the city of Jeddah during January and May 2015 is shown in Figure 2. A comparison between wind distribution in 2014 and 2015 is given in Figure 3. Both Figures 2 and 3 show the random variation of the wind speed.
The windrose diagram in Figure 4a shows that the most prevailing wind directions in Jeddah city are in the 315-317.5 sectors, north-west, and that the wind rarely blows from the south or east. The north-west and north-north-west also comprise almost 50% of all hourly wind directions. For the Taif region, Figure 4b, the most prevailing wind direction is from the west and west-north-west.
Another important parameter is the variation of the wind speed as a function of the height of the WECS. The speed increases with the height according to a power law [60]. Based on measured data at a specific height, usually 10 m, it is possible to calculate the wind speed at different heights using:

$$V_z = V_r \left( \frac{Z}{Z_r} \right)^{\alpha} \quad (1)$$

where $V_z$ is the wind speed at the height $Z$, $V_r$ is the wind speed at the reference height $Z_r$, and $\alpha$ is the wind shear or Hellman's exponent, which depends on the terrain and atmospheric stability [61]. It is important to note that the power generated by the wind turbine, Equation (2), is proportional to the cube of the wind speed [55]:

$$P = \frac{1}{2} \rho A V^3 \eta_t \eta_m \eta_e \quad (2)$$

where $\rho$ is the air density, $A$ is the area swept by the rotor of the wind turbine, $V$ is the wind speed, and $\eta_t$, $\eta_m$, $\eta_e$ represent the turbine efficiency, mechanical efficiency, and electrical efficiency, respectively. Accurate prediction of the wind speed distribution is therefore essential, since the cubic dependence amplifies any wind speed error in the predicted power output of the WECS.
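The height extrapolation and the cubic power relation can be illustrated with a short Python sketch. The shear exponent of 1/7, the sea-level air density of 1.225 kg/m^3, the rotor area, and the 5 m/s reference speed are assumptions chosen for illustration only; they are not values reported in this study, and the efficiencies default to 1 for simplicity.

```python
def wind_speed_at_height(v_ref, z_ref, z, alpha):
    """Power-law extrapolation of wind speed from height z_ref to height z."""
    if z_ref <= 0 or z <= 0:
        raise ValueError("heights must be positive")
    return v_ref * (z / z_ref) ** alpha

def wind_power(v, rho, area, eta_t=1.0, eta_m=1.0, eta_e=1.0):
    """Power in watts for SI inputs: 0.5 * rho * A * V^3 times the efficiencies."""
    return 0.5 * rho * area * v ** 3 * eta_t * eta_m * eta_e

# Assumed illustrative inputs (not taken from the study).
ALPHA = 1.0 / 7.0   # assumed wind shear (Hellman) exponent
RHO = 1.225         # kg/m^3, standard sea-level air density
AREA = 100.0        # m^2, hypothetical rotor swept area

v10 = 5.0                                              # m/s at the 10 m reference height
v100 = wind_speed_at_height(v10, 10.0, 100.0, ALPHA)   # extrapolated to 100 m
p_true = wind_power(v100, RHO, AREA)
p_over = wind_power(1.10 * v100, RHO, AREA)            # speed overestimated by 10%

print(f"speed at 100 m: {v100:.2f} m/s")
print(f"power error caused by a 10% speed error: {100 * (p_over / p_true - 1):.1f}%")
```

Since 1.1 cubed is 1.331, a 10% overestimate of the wind speed inflates the predicted power by about 33%, regardless of the rotor size or efficiencies assumed.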

Artificial Neural Network Modeling
In a neural network-based system, the neurons are linked by weighted connections. The network is modeled by an input layer related to the sources of information (input data), a hidden layer constructed of several neurons, and an output layer that transfers information from the network to the output signal [36,62,63]. The available database used for the ANN prediction is divided into training and validation sets. The main objective of the algorithm is to minimize the difference between the predicted and actual data. All the input parameters are gathered for wind speed estimation and analyzed to characterize the connection between the input and output using the correlation technique. The suggested neural network for the wind speed prediction model is a feed-forward neural network trained with a supervised learning technique using the backpropagation algorithm [64].
Assuming $w_{ij}$ as the connection weight between input node $i$ and hidden node $j$, $w_{oj}$ the bias on hidden node $j$, $v_{jk}$ the connection weight from hidden node $j$ to output node $k$, and $v_{ok}$ the bias term of output node $k$, then the output is calculated by Equations (3)-(6) [65]:

$$z_{netj} = w_{oj} + \sum_{i} x_i w_{ij} \quad (3)$$

$$z_j = g(z_{netj}) \quad (4)$$

$$y_{netk} = v_{ok} + \sum_{j} z_j v_{jk} \quad (5)$$

$$y_k = g(y_{netk}) \quad (6)$$

where $x_i$ denotes the $i$-th input and $y_k$ the $k$-th network output. Equation (3) represents the summing product of the net inputs, Equation (4) represents the hidden layer using the sigmoid activation function, Equation (5) represents the summing product of the hidden layer outputs, and, finally, the output layer is calculated using Equation (6) along with the sigmoid function defined by $g(x) = \frac{1}{1+e^{-x}}$.
An example of a multi-layer neural network formed by a passive input layer with seven nodes, three hidden layers with eight, four, and three neurons, respectively, and an active output layer for the wind speed prediction is shown in Figure 5. Each input is multiplied by a weight, and the products are summed to produce a single value that is passed through the nonlinear "sigmoid" transfer function. The input layer includes, among others, the air temperature, the relative humidity, the wind direction, the direct normal irradiation (DNI), the diffuse horizontal irradiation (DHI), the global horizontal irradiation (GHI), and the barometric pressure. In addition, selecting the number of layers and the number of neurons in each layer is a very important decision in choosing the architecture of the ANN. There is no 'one size fits all' solution for all applications, but there are some guidelines that most researchers follow, such as (i) increasing the number of neurons when the relationship between the input data and the required output is complex, and (ii) increasing the number of hidden layers when modeling a multiple-stage process [66].
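The forward computation described above (weighted sum plus bias, sigmoid in the hidden layer, then weighted sum plus bias and sigmoid at the output) can be sketched in a few lines of Python for a single hidden layer. The function and argument names, the layer sizes, and the weight values below are hypothetical and serve only to show the data flow; they are not the trained network of this study.

```python
import math

def sigmoid(x):
    """Logistic function g(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, w, w_bias, v, v_bias):
    """One forward pass through a single-hidden-layer network.

    x: list of inputs; w[i][j]: weight from input i to hidden node j;
    w_bias[j]: bias of hidden node j; v[j][k]: weight from hidden node j
    to output node k; v_bias[k]: bias of output node k.
    Returns (hidden activations, outputs).
    """
    z = []
    for j in range(len(w_bias)):
        z_net = w_bias[j] + sum(x[i] * w[i][j] for i in range(len(x)))
        z.append(sigmoid(z_net))
    y = []
    for k in range(len(v_bias)):
        y_net = v_bias[k] + sum(z[j] * v[j][k] for j in range(len(z)))
        y.append(sigmoid(y_net))
    return z, y

# Hypothetical network: 3 inputs, 2 hidden neurons, 1 output (made-up weights).
inputs = [0.2, 0.7, 0.5]                       # e.g. scaled meteorological attributes
w = [[0.1, -0.3], [0.4, 0.2], [-0.5, 0.6]]
w_bias = [0.0, 0.1]
v = [[0.7], [-0.2]]
v_bias = [0.05]

hidden, output = forward(inputs, w, w_bias, v, v_bias)
print("hidden activations:", hidden)
print("network output:", output)
```

Because a sigmoid output always lies between 0 and 1, a target such as wind speed would have to be rescaled to that range (and scaled back afterwards) if this exact output activation were used.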
The weights in each intermediate and output layer are adjusted in the training mode until the errors fall within a prescribed tolerance. During the training period, the system parameters can be monitored and adjusted based on the multilayer perceptron method, as defined in WEKA software [64]. Training determines the output by iteratively reducing the errors using the steepest descent technique, and the gradient is computed using the backpropagation algorithm. The performance of the proposed algorithm is measured by using the root-mean-square-error (RMSE), the mean absolute error (MAE), and the correlation coefficient (R). Values close to zero are desirable for MAE and RMSE, while values of R close to one indicate a strong correlation. Assuming $V_i^{act}$ is the observed wind speed, $V_i^{pred}$ is the predicted wind speed, and $N$ is the number of samples, the root mean square error is given by

$$RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( V_i^{pred} - V_i^{act} \right)^2}$$

The mean absolute error (MAE), measuring the average magnitude of the absolute difference between the predicted and actual data, can be written as:

$$MAE = \frac{1}{N} \sum_{i=1}^{N} \left| V_i^{pred} - V_i^{act} \right|$$

Using the average wind speeds $\bar{V}^{act}$ and $\bar{V}^{pred}$, the linear correlation coefficient can be written as

$$R = \frac{\sum_{i=1}^{N} \left( V_i^{act} - \bar{V}^{act} \right) \left( V_i^{pred} - \bar{V}^{pred} \right)}{\sqrt{\sum_{i=1}^{N} \left( V_i^{act} - \bar{V}^{act} \right)^2 \sum_{i=1}^{N} \left( V_i^{pred} - \bar{V}^{pred} \right)^2}}$$
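The three performance measures named above can be computed directly from paired observed and predicted speeds. The following Python sketch uses the standard definitions of RMSE, MAE, and the linear correlation coefficient; the function names and the sample values are illustrative assumptions rather than data from the stations.

```python
from math import sqrt

def rmse(actual, predicted):
    """Root mean square error."""
    n = len(actual)
    return sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / n)

def mae(actual, predicted):
    """Mean absolute error."""
    n = len(actual)
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / n

def correlation(actual, predicted):
    """Linear (Pearson) correlation coefficient R."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_a == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_a * var_p)

# Hypothetical observed and predicted wind speeds in m/s.
observed = [4.0, 5.0, 6.0, 7.0]
forecast = [4.5, 5.5, 6.5, 7.5]   # every prediction is 0.5 m/s too high

print("RMSE:", rmse(observed, forecast))
print("MAE :", mae(observed, forecast))
print("R   :", correlation(observed, forecast))
```

In this toy case the predictions are uniformly biased, so R is essentially 1 even though RMSE and MAE are both 0.5 m/s. This is one reason to report the error measures alongside the correlation coefficient, as done here.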

Results and Discussion
Four cities have been selected for the prediction of wind speed using artificial neural networks (ANNs). To develop the ANN model, the database was divided into two subsets, one for training and one for testing. To train the neural networks, several architectures were evaluated using the cross-validation method. Training consists of optimizing the weights by finding the set of weights that minimizes the errors. The number of attributes, the percentages of training and testing data, and the number of hidden layers and neurons were selected according to the accuracy determined by the root-mean-square-error (RMSE), the mean absolute error (MAE), and the correlation coefficient (R). As explained in the previous section, there is no known rule for calculating the number of hidden layers and neurons; to overcome this challenge, a number of tests were performed to select the optimum number of hidden layers and neurons. We tested 2, 5, 10, 20, 30, 40, and 50 neurons; the results are shown in Table 3 and Figure 6. Using one hidden layer and 30 neurons, the best result, with an RMSE of 0.6109 and a correlation coefficient of 0.9222, was obtained for the station at King Abdulaziz University (KAU) located in the city of Jeddah. Therefore, a network with 30 neurons was selected for the optimum ANN algorithm. Once the number of layers and neurons was set, additional runs were conducted to select the best percentages for training and testing. Our tests showed that a 70%-30% split between training and testing gives the best RMSE of 0.8078, compared to an RMSE of 1.1119 obtained when using 60%-40%. Regarding the attributes used in this study, extensive tests were conducted to study the effect of different attributes. After a series of tests, six attributes were selected from the 13 attributes available in the original data: the air temperature, the wind direction, the GHI, the peak wind speed, the relative humidity, and the pressure.
In summary, the optimum ANN configuration used one hidden layer with 30 neurons, six input attributes, and a 70%-30% training-testing split. Note that other parameters such as the direct normal irradiation (DNI) and the diffuse horizontal irradiation (DHI) were dropped from the input data, since the global horizontal irradiation (GHI) includes both the DNI and the DHI. Regarding the amount of data used, tests were conducted on four years of data as well as on one year of data. The performance obtained at four stations located in four cities, namely King Abdulaziz University (KAU) in Jeddah (21.49604, 39.24492), King Saud University in Riyadh (24.72359, 46.61639), Taif University in Taif (21.43278, 40.49173), and Afif Technical Institute in Afif (23.92118, 42.94815), is shown in Table 4. The correlation (R) between the predicted and actual data is highest for the station located in Jeddah, with R = 0.9222.
Table 5 and Figure 11 show a comparison between ANN and four other machine learning techniques, namely SVM, random tree, random forest, and RepTree. Details of each of these techniques can be found in [64,67]; however, we highlight the main characteristics of each method. The support vector machine (SVM) is a supervised machine learning algorithm that handles both linear and nonlinear data. SVM uses the kernel trick to transform the available data and then finds an optimal boundary between the possible outputs. The random tree is an ensemble learning algorithm that generates many individual learners. The performance of single decision trees is significantly enhanced by using tree diversity and randomization. The random forest is, as its name implies, an ensemble learning method for classification or regression made up of many decision trees acting together. During the testing process, the prediction is obtained by averaging the predictions of the individual decision trees.
The reduced-error pruning (REP) tree is a fast decision tree learning algorithm that builds a regression or decision tree using information gain or variance reduction as the splitting criterion and prunes it using reduced-error pruning.
Figure 11. Comparison between five machine learning algorithms, "Jeddah, KAU Station, 2015".
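The averaging step used by the random forest can be illustrated with a deliberately simplified Python sketch. It rests on several assumptions: depth-one regression trees (stumps) stand in for full decision trees, each stump sees a bootstrap sample and one randomly chosen feature, the data are synthetic, and all function names are hypothetical. It is not the implementation compared in Table 5.

```python
import numpy as np

def fit_stump(X, y, feature):
    """Fit a depth-one regression tree on one feature by minimizing squared error."""
    best = None
    values = np.unique(X[:, feature])                     # sorted unique values
    for threshold in (values[:-1] + values[1:]) / 2.0:    # candidate split points
        left = X[:, feature] <= threshold
        left_mean, right_mean = y[left].mean(), y[~left].mean()
        sse = ((y[left] - left_mean) ** 2).sum() + ((y[~left] - right_mean) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return feature, threshold, left_mean, right_mean

def predict_stump(stump, X):
    feature, threshold, left_mean, right_mean = stump
    return np.where(X[:, feature] <= threshold, left_mean, right_mean)

def fit_forest(X, y, n_trees=50, seed=0):
    """Build an ensemble of stumps from bootstrap samples and random features."""
    rng = np.random.default_rng(seed)
    forest = []
    for _ in range(n_trees):
        rows = rng.integers(0, len(y), size=len(y))       # bootstrap sample
        feature = int(rng.integers(0, X.shape[1]))        # random feature choice
        forest.append(fit_stump(X[rows], y[rows], feature))
    return forest

def predict_forest(forest, X):
    """Average the predictions of the individual trees."""
    return np.mean([predict_stump(s, X) for s in forest], axis=0)

# Synthetic data (assumption): three input attributes and a noisy target.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = 2.0 * X[:, 0] + X[:, 1] + rng.normal(0.0, 0.2, size=200)

forest = fit_forest(X, y)
ensemble_mse = float(np.mean((predict_forest(forest, X) - y) ** 2))
single_mse = float(np.mean([np.mean((predict_stump(s, X) - y) ** 2) for s in forest]))
print(f"ensemble MSE = {ensemble_mse:.4f}, mean single-tree MSE = {single_mse:.4f}")
```

Because the squared error is convex, the error of the averaged prediction cannot exceed the average error of the individual trees, which is one way to see why the ensemble improves on its members.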

When comparing the five algorithms (Table 5), the random forest performed slightly better than the other algorithms. All of them showed favorable results except the random tree, which showed a low correlation and a relatively high root mean square error.

Conclusions
Wind speed forecasting has significant potential for energy infrastructure development and management. It requires careful investigation of wind availability, direction, hourly distribution, and frequency in order to assess wind energy production and ensure the sustainable integration of wind power into electricity grids. The present paper introduced artificial neural networks (ANNs) as a powerful tool for predicting the wind speed distribution required for energy applications in Saudi Arabia. The study showed that it is possible to estimate and predict wind speed variability using ANN techniques. Based on different tests, ANN proved to be a flexible method in terms of accuracy and computing time when compared to atmospheric models such as Weather Research and Forecasting (WRF). Correlation coefficients of 0.9222, 0.8655, 0.9039, and 0.8957 were obtained for the cities of Jeddah, Riyadh, Taif, and Afif, respectively. The networks were trained on data obtained from the King Abdullah City for Atomic and Renewable Energy (K.A.CARE) using different numbers of hidden layers and 2, 5, 10, 20, 30, 40, and 50 neurons. Based on the RMSE and the correlation coefficient R, the best result was obtained with an RMSE of 0.6109 for ANN and an RMSE of 0.5543 for the random forest. Our tests also showed that a 70%-30% split between training and testing data provided the best correlation. Moreover, the correlation between actual and predicted data was low in the case of the random tree, for which the root mean square error was relatively high. ANN predictions could be improved by conducting additional tests on the hidden layers and the ANN parameters to find those producing the best RMSE and correlation. More work is underway to compare and test other machine learning algorithms capable of predicting the wind speed distribution with better accuracy, and to study the effect of different attributes on the prediction of wind speed through sensitivity analysis.