A Comprehensive Review on Machine Learning Techniques for Forecasting Wind Flow Pattern

: The wind is a crucial factor in various domains such as weather forecasting, the wind power


Introduction
Wind is a fundamental variable of the atmosphere. It is a major source of renewable energy, variations in wind patterns drive global weather, and wind conditions are an indicator of comfort and safety. The study of wind flow patterns and their characteristics has crucial applications in varied domains such as weather forecasting, pollution control, climate change, power production, and structural health monitoring. In recent years, as the importance of wind flow prediction has increased, several research methodologies have been introduced for accurate prediction. Urbanization and population growth have led to the establishment of densely built residential and commercial spaces. The wind flow variation between closely spaced high-rise buildings creates discomfort and unhealthy conditions for occupants and pedestrians. The aerodynamic characteristics and structural health of high-rise buildings also depend on the wind speed and the flow pattern around the buildings. Structural design factors, such as the height, shape, and nature of nearby building clusters, affect wind flow around these buildings. Evaluating the wind scale is also essential for the effective planning of wind power generation, maintenance, and conservation. By understanding and forecasting wind flow, stakeholders can optimize their operations, ensure a stable power supply, and maximize the utilization of this valuable renewable resource.
Given the nonlinear nature of wind, physical methods were initially employed for wind pattern forecasting. These methods were implemented across different terrains, altitudes, pressures, and temperatures to estimate future wind behavior. Initial measurements were conducted on site as part of studies related to pedestrian-level wind (PLW) assessment. Wind tunnel experiments were introduced to simulate the surrounding urban environment, and the wind characteristics were investigated. The experiments were conducted using hot-wire measurements in limited measurement regions. The observed values at tunnel surfaces were scaled to pedestrian levels. More recently, sensors were installed around buildings to measure the wind speed, and the effects were investigated [2]. Many other wind measuring techniques, such as particle image velocimetry, laser Doppler, thermography, and thermistor anemometry, were used to characterize the wind flow around buildings. A lack of professionals for constant observation, the complexity of modelling based on first principles, demanding data maintenance, and high computational costs in earlier years hindered the improvement of these physical methods. This led to the development of data-driven numerical weather prediction (NWP) models that use statistical methods. The NWP model was not accurate for short-term predictions, and thus digital evolution models were introduced to reduce error and improve accuracy [3,4].
Models based on physical process derivation inferred wind flow from a variety of meteorological parameters using computational fluid dynamics (CFD) and thermodynamic methods. Physical methods have been employed for a few decades due to their excellent interpretability and solid theoretical foundations. CFD-based Gerri's method, Sketchup and CREO3.0 models have been applied for large-scale, long-term predictions. Nevertheless, these methods were unsuitable for short-term predictions due to inadequate computational facilities and low accuracy. Therefore, prediction using physical methods has numerous restrictions [5].
The physical models were found to introduce non-quantifiable environmental uncertainties. A numerical simulation using a CFD approach was conducted to explore the effect of wind at the pedestrian level in an urban area. One study employed a CFD approach with a modified Launder-Kato-type (LK) model to compare wind measurements from CFD and wind tunnel tests. The results showed that the CFD-based model fits the wind tunnel experimental results reasonably well [6].

Statistical Numerical Methods
The conventional statistical models used time-series data for pattern identification and for estimating characteristics from historical data. The autoregressive (AR) model, the moving average (MA) model, and the integration of these two models (ARIMA) were proposed to calculate the wind speed at a given time. These models used higher-order differential functions. The Kalman filter model was introduced for online wind speed forecasting. An integration of autoregression and an information criterion function was applied to test the effectiveness of the prediction models by comparing the one-month average wind speed and the one-hour average wind speed. Other non-Gaussian time-series models were developed, but the autoregressive moving average model outperformed the persistence models with less error [7]. The Wind Atlas Analysis and Application Program (WAsP) is the tool most extensively used in the wind industry to forecast long-term wind energy. WAsP is a data-driven model that predicts wind speed in a target region based on the connection between short-term wind data in the target region and neighboring reference regions. WAsP methods were implemented on the assumptions that the target and reference regions have the same weather, are neutrally stable, and have smooth terrain, and that the collected data are reliable. These assumptions were unrealistic, and thus the model suffered from significant errors [8]. Measure-correlate-predict (MCP) methods were also used in long-term wind energy estimation by determining the relationship between target and reference areas. Numerous MCP methods were devised based on linear, nonlinear, and probabilistic relationships in the data. MCP methods were found to be more accurate than physical methods and WAsP but had underlying constraints on data collection, such as the choice of measuring sites, installation of anemometers, height of wind measurement, and influence of nearby buildings and vegetation [9].
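To make the comparison between autoregressive and persistence forecasts concrete, the following is a minimal sketch rather than the method of any cited study. It assumes NumPy, fits an AR(p) model by ordinary least squares, and uses a synthetic AR(2) series as a stand-in for measured wind speed; the function names `fit_ar` and `predict_next` and all numeric settings are illustrative assumptions.

```python
import numpy as np

def fit_ar(series, p):
    """Estimate AR(p) intercept and lag coefficients by ordinary least squares."""
    y = np.asarray(series, dtype=float)
    # Column k holds the lag-(k+1) values aligned with the targets y[p:].
    lags = np.column_stack([y[p - k - 1 : len(y) - k - 1] for k in range(p)])
    design = np.column_stack([np.ones(len(lags)), lags])
    coef, *_ = np.linalg.lstsq(design, y[p:], rcond=None)
    return coef  # [intercept, phi_1, ..., phi_p]

def predict_next(history, coef):
    """One-step-ahead forecast from the p most recent observations."""
    p = len(coef) - 1
    recent = np.asarray(history[-p:], dtype=float)[::-1]  # most recent value first
    return float(coef[0] + coef[1:] @ recent)

# Synthetic wind speed (m/s): an AR(2) process with mean 8, an assumption for illustration.
rng = np.random.default_rng(0)
n = 2000
speed = np.empty(n)
speed[:2] = 8.0
for t in range(2, n):
    speed[t] = 5.6 + 0.6 * speed[t - 1] - 0.3 * speed[t - 2] + rng.normal(0.0, 0.5)

train, test = speed[:1600], speed[1600:]
coef = fit_ar(train, p=2)

# Walk forward through the test period, comparing AR with persistence (last value).
history = list(train)
ar_errors, persistence_errors = [], []
for actual in test:
    ar_errors.append(predict_next(history, coef) - actual)
    persistence_errors.append(history[-1] - actual)
    history.append(actual)

rmse_ar = float(np.sqrt(np.mean(np.square(ar_errors))))
rmse_persistence = float(np.sqrt(np.mean(np.square(persistence_errors))))
print(f"AR(2) RMSE: {rmse_ar:.3f}  persistence RMSE: {rmse_persistence:.3f}")
```

On real measurements the ranking depends on the forecast horizon and on how strongly the series is autocorrelated, so this should be read as an illustration of the procedure only.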
Spatial correlation models that considered the relationship between different terrestrial locations and the wind speed were used to determine the wind speed of neighboring areas. This approach was more challenging than time-series modelling, as it relied on measurements from spatially correlated areas. In a combination of spatial correlation with fuzzy methods, genetic-based algorithms were utilized to increase the accuracy of the prediction. The error in these methods was studied, revealing that accuracy mainly depends on the size of the region of interest and the number of points considered for estimation [10]. Spatio-temporal, moving average, and fractional autoregressive models were found to capture patterns more accurately on very short-term time series. The accuracy of the autoregressive approaches relied on the wind speed and the reliability of the distribution of investigation areas. Autoregressive integrated models outperformed second-order Markov chain, Weibull, and recurrent neural network-based models when applied to seasonal wind trends [11]. To make short-term projections, ARIMA models were used in conjunction with hybrid or combinational methods. The findings demonstrated that ARMAX models with exogenous inputs (for example, wind speed at various heights, wind direction, temperature, and solar radiation) outperformed the other models [12]. The boundary layer scaling model (BLSM) was applied to NWP data and a wind map of a power hub in Great Britain to predict the wind speed and power density at a height of 10 m. BLSM produced accurate wind speed predictions with less error than the microgeneration certification scheme model [13], which predicts average wind speed by scaling wind speed data. BLSM performed well on high-resolution data and more detailed datasets [14]. A VMD-TCN-ST model was built to forecast the wind speed to support the safety and stability of wind power industries. The model integrated variational mode decomposition of time-series wind data for feature extraction, a temporal convolutional network for prediction, and a sequence triplet loss function for enhanced performance. The results were improved compared with ARIMA, LSTM, and other decomposition methods [15].
A seasonal ARIMA model was implemented to predict the short-term wind speed time series in the offshore area of Scotland and was observed to produce more accurate results than deep learning algorithms [16]. The nonlinearity of wind energy limits the applicability of linear statistical methods; thus, combinations of several methods produced better results [17]. Reliable wind speed predictions were achieved by combining data decomposition, sub-model selection, and different predictor selection techniques with the Mayfly algorithm [18]. These methods could be applied only to a single time scale of inputs and outputs. In addition, when the time scale was longer, averaging methods were applied, which may result in the loss of useful information.
The statistical and probability-based models were chosen over physical methods for real-time wind predictions, as they are efficient and use historical data. The Markov chain (MC) was a popular statistical tool used for predictions. An MC transition matrix was utilized as an optimization model with multi-objective evolutionary strategy algorithms for large-scale, high-dimensional, real-time predictions [19]. A probability mass function was applied to measure wind flow direction and speed on varied time scales for forecasting. The method could accurately predict shorter time intervals, but the results were unrealistic for long-range time intervals [20]. Gaussian processes (GP) were proposed for probabilistic predictions of regional wind flow. The covariance function is the critical element of a GP. A comparative study was conducted on different covariance functions with direct, indirect, static, dynamic, and combined structures with prediction intervals. The results showed that the indirect dynamic GP had higher accuracy on broader prediction intervals [21]. However, the forecasting performance of statistical models was also significantly constrained by their poor nonlinear fitting ability. A comparative study of conventional statistical models, namely autoregressive moving average and autoregressive integrated moving average, against support vector machine learning methods was conducted. The results showed that the machine learning methods produce better forecasting results than the conventional methods [22]. Table 1 shows the summary of physical, statistical, and numerical methods used for wind prediction.

CFD and CREO | The system was developed to predict wind information on complex terrain. | The model proved to be a cost-effective approach when coupled with CFD rather than WAsP for complex terrain.
Hill D.C et al., 2012, [7] | Univariate and multivariate autoregression | The work analyzed the wind power impact for future power systems, considering geographically dispersed wind speed. | The results showed ML models outperformed conventional methods for short-term wind energy prediction.
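The Markov chain approach described in this section can be illustrated with a small sketch. It is not the optimization model of [19]; it only shows how a transition matrix is estimated from a discretized wind speed record and used for one-step and two-step state forecasts. NumPy, the bin edges, the sample speeds, and the function name `transition_matrix` are assumptions made for illustration.

```python
import numpy as np

def transition_matrix(states, n_states):
    """Estimate first-order Markov transition probabilities by counting transitions."""
    counts = np.zeros((n_states, n_states))
    for current, nxt in zip(states[:-1], states[1:]):
        counts[current, nxt] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    probs = np.full((n_states, n_states), 1.0 / n_states)  # unvisited rows stay uniform
    visited = row_sums[:, 0] > 0
    probs[visited] = counts[visited] / row_sums[visited]
    return probs

# Hypothetical wind speeds (m/s), binned into calm (<4), moderate (4 to 8), strong (>=8).
speeds = np.array([3.1, 3.8, 5.2, 6.0, 9.1, 8.4, 6.5, 3.9, 4.4, 7.2, 8.8, 9.5])
bin_edges = [4.0, 8.0]
states = np.digitize(speeds, bin_edges)  # state index 0, 1, or 2 per observation

P = transition_matrix(states, n_states=3)
next_step = P[states[-1]]                             # distribution one step ahead
two_steps = np.linalg.matrix_power(P, 2)[states[-1]]  # distribution two steps ahead
most_likely_next = int(np.argmax(next_step))
print(P)
print("most likely next state:", most_likely_next)
```

Because multi-step forecasts are obtained by repeated multiplication with the same matrix, the predicted distribution drifts toward the stationary distribution as the horizon grows, which is consistent with the weaker long-range results noted above.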

Intelligent Based Approaches
Sound decisions can be made in advance based on prediction results. Precise wind speed forecasting is an efficient technique for increasing the reliability and safety of the wind power system [23]. As a result, in-depth studies focusing on wind speed forecasting have emerged recently. Wind speed forecasting models can be grouped into four main categories: physical, statistical, intelligent, and hybrid approaches [24]. Physical models simulate the dynamic process of wind by utilizing geographic and meteorological data. They are known for their high accuracy and physical interpretability. However, due to their limited computational efficiency, they are not suitable for short-term wind speed predictions [25]. Statistical models, on the other hand, rely on probability theory and quantitative statistics, using historical data to generate straightforward forecasts. They offer practicality and real-time applicability and can even predict wind direction. However, their predictive capacity may be constrained by their poor ability to capture nonlinear relationships in the data [26,27]. In contrast, intelligent approaches, such as machine learning and artificial neural networks, go beyond traditional statistical methods. They can capture complex nonlinear patterns because they employ algorithms capable of learning from data and adapting to changing patterns. These intelligent models have the potential to outperform statistical models, particularly when dealing with nonlinear and dynamic wind patterns. In this way, intelligent approaches enhance forecasting capabilities and provide valuable insights for optimizing wind power systems.

Machine Learning Approaches
In recent years, there has been growing interest in using machine learning algorithms for wind pattern forecasting. These algorithms are well suited to this task because they can analyze large amounts of data and make predictions based on patterns and trends. One popular machine learning technique used for wind pattern forecasting is the artificial neural network (ANN). ANNs are inspired by the structure and function of the human brain and can be used to model complex relationships between inputs and outputs. In wind pattern forecasting, ANNs can be trained to predict wind patterns based on historical data and a variety of meteorological variables. Another machine learning technique that has been used for wind pattern forecasting is support vector regression (SVR). SVR can also model complex, nonlinear relationships between inputs and outputs and can be used to predict wind patterns from the same kinds of historical and meteorological data.
Random Forest is another machine learning algorithm that has been used for wind pattern forecasting. Random Forest is an ensemble method that creates multiple decision trees and combines them to create a more robust model. Each decision tree in the ensemble is built from a randomly selected subset of the training data and a randomly selected subset of the features. This randomness helps to reduce overfitting and improve the generalization of the model. Gradient Boosting Machine (GBM) is also an ensemble method based on multiple decision trees that has been used in wind pattern forecasting. Unlike Random Forest, GBM creates the decision trees sequentially, with each tree trying to correct the errors of the previous trees. This sequential nature helps the model capture complex relationships in the data. Long Short-Term Memory (LSTM) is a more advanced algorithm that has been used in wind pattern forecasting. LSTM is a type of Recurrent Neural Network (RNN) that is well suited to sequential data, such as time series. LSTM has a memory cell that can retain information over long periods, which helps in predicting the future values of the time series.
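The sequential error correction that distinguishes gradient boosting from Random Forest can be sketched in a few lines. The example below is a simplified illustration, assuming NumPy, squared-error loss, one-split regression trees (stumps) on a single feature, and a synthetic normalized power curve; it is not a production GBM, and every name and constant in it is an assumption.

```python
import numpy as np

def fit_stump(x, target):
    """Find the single split on x that minimizes squared error against target."""
    best = None
    for threshold in np.unique(x)[:-1]:  # keep at least one sample on each side
        left = x <= threshold
        left_value, right_value = target[left].mean(), target[~left].mean()
        sse = (np.sum((target[left] - left_value) ** 2)
               + np.sum((target[~left] - right_value) ** 2))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_value, right_value)
    return best[1:]

def stump_predict(stump, x):
    threshold, left_value, right_value = stump
    return np.where(x <= threshold, left_value, right_value)

def fit_boosting(x, y, n_rounds, learning_rate=0.1):
    """Each round fits a stump to the residuals left by the previous rounds."""
    base = float(np.mean(y))
    pred = np.full(len(y), base)
    stumps = []
    for _ in range(n_rounds):
        stump = fit_stump(x, y - pred)  # residuals of the current ensemble
        pred = pred + learning_rate * stump_predict(stump, x)
        stumps.append(stump)
    return {"base": base, "stumps": stumps, "learning_rate": learning_rate}

def boosting_predict(model, x):
    pred = np.full(len(x), model["base"])
    for stump in model["stumps"]:
        pred = pred + model["learning_rate"] * stump_predict(stump, x)
    return pred

def mse(actual, forecast):
    return float(np.mean((actual - forecast) ** 2))

# Synthetic, noise-free power curve: normalized power as a clipped cubic of wind speed.
wind_speed = np.linspace(3.0, 15.0, 200)
power = np.clip((wind_speed / 12.0) ** 3, 0.0, 1.0)

mse_mean = mse(power, np.full(len(power), power.mean()))
mse_1 = mse(power, boosting_predict(fit_boosting(wind_speed, power, 1), wind_speed))
mse_100 = mse(power, boosting_predict(fit_boosting(wind_speed, power, 100), wind_speed))
print(mse_mean, mse_1, mse_100)
```

The training error shrinks as rounds are added because each stump is fitted to what the earlier stumps left unexplained. A Random Forest would instead fit its trees independently on resampled data and average them.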
To generate very short-term parametric probabilistic wind power estimates at numerous locations, a sparse vector autoregression approach using a machine learning technique was proposed [28]. To operate well, smart grids with tens or even hundreds of wind generators need high-quality, very short-term forecasts, and spatial information is highly desirable. For multistep-ahead estimation of wind power production, a mathematical morphology-based local predictor (MMLP) and a mean trend detector (MTD) have been employed. More specifically, wind power was predicted from historical daily wind speed data using support vector regression (SVR), random forest (RF) regression, least absolute shrinkage and selection operator (LASSO) regression, and k-nearest neighbors (kNN) [29]. The results emphasize that machine learning models may be applied in areas other than those where the models were trained. These models, however, are static and disregard information from earlier data [30].
Figure 1 shows the architectural stage of machine learning approaches. The architectural stage in machine learning refers to the process of designing and organizing the components and flow of a machine learning model. It involves several steps that are crucial for constructing a strong and efficient predictive model. At the outset, data gathering plays a vital role. This entails collecting the datasets that will be used for training, validating, and testing the model. These datasets can be obtained from a variety of sources, including public repositories or private databases. Once the data have been collected, the next step is data cleaning. This involves preprocessing the data to address missing values, outliers, or inconsistencies that could affect the model's performance. Techniques such as imputation, outlier detection, and data normalization or standardization are employed to ensure the data's quality and reliability.
Another important aspect of the architectural stage is exploratory data analysis. This step entails examining and visualizing the data to gain insights into their inherent patterns, distributions, and relationships. Doing so helps to characterize the data and identify relevant features or variables that will be valuable for the model. The subsequent phase is feature engineering and selection, which involves transforming the data and selecting the most informative features. Feature engineering techniques may include scaling, dimensionality reduction, or creating new derived features that capture essential information. Feature selection techniques help identify the subset of features that have the most significant impact on the model's performance.
Next comes model selection and assessment. This step entails choosing the machine learning algorithm or ensemble of algorithms that best fits the problem at hand. Factors such as the nature of the data, the complexity of the problem, and the desired interpretability or accuracy are taken into account. Model assessment involves evaluating the model's performance using appropriate metrics and validation techniques to ensure its generalization and reliability. Model training is a critical step within the architectural stage. It involves feeding the prepared data to the selected algorithm so that it learns the underlying patterns and relationships. The model is trained using optimization techniques that aim to minimize errors on the training data. Once the model is trained, it undergoes evaluation and optimization. Model evaluation assesses performance on unseen test data, allowing an unbiased assessment of the model's ability to generalize. If necessary, further optimization techniques such as hyperparameter tuning or regularization are applied to fine-tune the model and enhance its performance.
Lastly, ongoing maintenance and monitoring are essential for ensuring the model's sustained effectiveness. Regular monitoring of the model's performance and periodic retraining or updating help address issues such as concept drift or changes in the underlying data distribution. Researchers must monitor the model using predictive analytics software and look for problems like model drift or bias to ensure that it continues to produce correct predictions over time.
Intelligent techniques, such as the extreme learning machine (ELM) [31], Elman neural network (Elman), echo state network (ESN) [32], and long short-term memory network (LSTM) [33], are frequently utilized for wind speed forecasting due to their excellent nonlinear fitting and self-learning capabilities. However, the predictive performance of these intelligent models is not always stable because of their heavy reliance on built-in factors. As a result, ensemble models were developed. Ensemble approaches may be more efficient since they can simultaneously ensure the generalization and accuracy of the forecasting approach [18]. The ensemble learning method uses a weight integration concept and an optimization technique to combine several predictors, which mitigates model overfitting while enhancing the predictor's capacity to adapt to data with various temporal properties [34].
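The data preparation, model selection, and evaluation steps described above can be sketched end to end on a toy problem. This is a minimal illustration under stated assumptions, not the workflow of any cited study: it assumes NumPy, a synthetic hourly wind speed series with a daily cycle, linear interpolation for missing values, a chronological 70/15/15 split, standardization with training statistics only, and a closed-form ridge regression on 24 lagged values whose penalty is chosen on the validation set. All function names and constants are hypothetical.

```python
import numpy as np

def fill_gaps(series):
    """Data cleaning: replace NaNs by linear interpolation between valid neighbors."""
    values = np.asarray(series, dtype=float).copy()
    idx = np.arange(len(values))
    missing = np.isnan(values)
    values[missing] = np.interp(idx[missing], idx[~missing], values[~missing])
    return values

def make_windows(series, n_lags):
    """Feature engineering: row i holds n_lags consecutive values, target is the next one."""
    X = np.column_stack([series[k : len(series) - n_lags + k] for k in range(n_lags)])
    return X, series[n_lags:]

def fit_ridge(X, y, alpha):
    """Model training: closed-form ridge regression with an unpenalized intercept."""
    Xb = np.column_stack([np.ones(len(X)), X])
    penalty = alpha * np.eye(Xb.shape[1])
    penalty[0, 0] = 0.0
    return np.linalg.solve(Xb.T @ Xb + penalty, Xb.T @ y)

def predict(X, w):
    return w[0] + X @ w[1:]

def rmse(actual, forecast):
    return float(np.sqrt(np.mean((actual - forecast) ** 2)))

# Data gathering stand-in: synthetic hourly wind speed (m/s) with a few missing readings.
rng = np.random.default_rng(1)
hours = np.arange(1000)
raw = 8.0 + 2.0 * np.sin(2 * np.pi * hours / 24) + rng.normal(0.0, 0.5, size=hours.size)
raw[[50, 51, 400]] = np.nan

clean = fill_gaps(raw)
n_lags, n_train, n_val = 24, 700, 850
mu, sigma = clean[:n_train].mean(), clean[:n_train].std()  # training statistics only
X, y = make_windows((clean - mu) / sigma, n_lags)

# Chronological split; row i of X predicts hour i + n_lags.
train = slice(0, n_train - n_lags)
val = slice(n_train - n_lags, n_val - n_lags)
test = slice(n_val - n_lags, None)

# Hyperparameter tuning on the validation set, final evaluation on the test set.
alphas = [0.01, 1.0, 100.0]
val_scores = {a: rmse(y[val], predict(X[val], fit_ridge(X[train], y[train], a)))
              for a in alphas}
best_alpha = min(val_scores, key=val_scores.get)
w = fit_ridge(X[train], y[train], best_alpha)

test_rmse = sigma * rmse(y[test], predict(X[test], w))    # back in m/s
persistence_rmse = sigma * rmse(y[test], X[test][:, -1])  # previous hour as forecast
print(f"ridge: {test_rmse:.3f} m/s  persistence: {persistence_rmse:.3f} m/s")
```

Splitting chronologically and computing the scaling statistics from the training portion alone keeps information from the evaluation period out of the fitted model, which is what makes the test score a fair estimate of generalization.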
Among other techniques, particle swarm optimization (PSO) has been used to integrate several neural networks. The ensemble network model can achieve significantly better performance than a conventional network. By combining four separate networks with different structures using the Grey Wolf optimizer (GWO), a unique ensemble model can outperform individual networks [35]. It is challenging for a single neural network to adapt to many scenarios because of the variety of the data. To improve the model's overall capacity for time-series regression analysis, the ensemble learning approach blends neural networks with other structures. Therefore, there are potential research opportunities for ensemble learning in wind power forecasting. Many researchers have used unsupervised machine learning to extract autocorrelation features from high-resolution time series to improve prediction performance [36,37].
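The weight integration idea behind such ensembles can be shown with a deliberately simple scheme. The sketch below does not use PSO or GWO; as a plainly named substitute, it weights each predictor by the inverse of its mean squared error on a validation window. NumPy, the three synthetic member forecasts, and the function name `inverse_error_weights` are assumptions for illustration.

```python
import numpy as np

def inverse_error_weights(val_forecasts, val_actual):
    """Weight each member by 1/MSE on validation data, normalized to sum to one.

    Assumes every member has a nonzero validation error.
    """
    mse = np.mean((val_forecasts - val_actual) ** 2, axis=1)
    inverse = 1.0 / mse
    return inverse / inverse.sum()

def rmse(actual, forecast):
    return float(np.sqrt(np.mean((actual - forecast) ** 2)))

# Synthetic "true" wind speed and three hypothetical member forecasts of varying quality.
rng = np.random.default_rng(2)
t = np.arange(400)
actual = 8.0 + 2.0 * np.sin(2 * np.pi * t / 24)
forecasts = np.stack([
    actual + rng.normal(0.0, 0.3, t.size),        # accurate member
    actual + rng.normal(0.0, 0.6, t.size),        # noisier member
    actual + 1.0 + rng.normal(0.0, 0.3, t.size),  # biased member
])

val, test = slice(0, 200), slice(200, 400)
weights = inverse_error_weights(forecasts[:, val], actual[val])
ensemble = weights @ forecasts[:, test]  # weighted average of member forecasts

member_rmse = [rmse(actual[test], f) for f in forecasts[:, test]]
ensemble_rmse = rmse(actual[test], ensemble)
print("weights:", np.round(weights, 3))
print("member RMSE:", np.round(member_rmse, 3), "ensemble RMSE:", round(ensemble_rmse, 3))
```

Because the weights are non-negative and sum to one, the combined forecast can never be worse than the worst member in RMSE terms. Metaheuristics such as PSO or GWO search for weights that minimize the ensemble error directly rather than deriving them from individual member errors.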
Current error correction methods use machine learning to anticipate residual errors. However, it is difficult to train a forecast model with high generalization because of the significant unpredictability and discreteness of the error data [24]. Some improved methods, such as the multi-cycle error correction approach and the error correction model based on decomposition [38], still suffer from limited suitability for engineering applications, poor operating efficiency, and over-correction. Therefore, an integration study is required to enhance the wind speed forecasting performance of high-performance error correction approaches. In conclusion, machine learning algorithms such as Artificial Neural Networks, Support Vector Regression, Random Forest, Gradient Boosting Machine, and Long Short-Term Memory have been used in wind pattern forecasting to improve the accuracy of predictions. These algorithms can analyze large amounts of data and make predictions based on patterns and trends, which is particularly useful in wind pattern forecasting, where complex relationships between inputs and outputs are involved. Table 2 shows the summary of machine learning approaches used for wind prediction.

Particle swarm optimization and gravitational search algorithm (PSOGSA) | A three-step data-driven hybrid approach was utilized to predict locomotive axle environmental parameters, including temperature, wind, and humidity. | The data were combined using the Complementary empirical mode decomposition method. The goal function weights could be optimized, ensembled, and combined using the PSOGSA to provide the final forecasts.
Liu, H. et al., 2019, [36] | Unsupervised machine learning | A two-layer stacked sparse autoencoder (SSAE) is proposed to extract the hidden representation of the original 3 s high-resolution wind speed data.
A multi-step wind speed forecasting model with deep bidirectional learning, multi-objective data ensemble, and error correction has been suggested. | Four experiments were run on datasets acquired from three separate places.

Deep Learning Approaches
Numerous wind power prediction models have emerged in recent years, encompassing statistical, physical, and artificial intelligence approaches. Artificial intelligence methods, which leverage the in-depth information contained in erratic wind power data, have shown promise in creating accurate nonlinear models. Consequently, researchers have proposed various artificial intelligence algorithms to develop precise wind power forecasting models. Neural network-based approaches have gained popularity in wind power prediction due to their ability to handle complex relationships in the data. These approaches use artificial neural networks to learn and model wind speed patterns. In neural network-based approaches, interconnected nodes called neurons form layers. Each neuron takes inputs, applies weights, and generates an output using an activation function. By combining multiple neurons in layers, neural networks capture intricate relationships and make predictions based on learned patterns.
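The neuron computation just described (weighted inputs passed through an activation function, stacked into layers) can be written out directly. The following is a minimal sketch assuming NumPy and a ReLU activation; the weights are chosen by hand purely for illustration rather than learned, and the three inputs are hypothetical standardized lagged wind speeds.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def identity(z):
    return z

def dense(x, W, b, activation):
    """One layer: each row of W holds the input weights of one neuron."""
    return activation(W @ x + b)

# Hypothetical inputs, e.g. three standardized lagged wind speed values.
x = np.array([0.5, -1.0, 2.0])

# Hidden layer with two neurons, then a single linear output neuron.
W1 = np.array([[1.0, 0.0, 1.0],
               [0.0, 1.0, -1.0]])
b1 = np.array([0.0, 0.5])
W2 = np.array([[2.0, 1.0]])
b2 = np.array([-1.0])

hidden = dense(x, W1, b1, relu)           # pre-activations [2.5, -2.5] become [2.5, 0.0]
output = dense(hidden, W2, b2, identity)  # 2.0 * 2.5 + 1.0 * 0.0 - 1.0 = 4.0
print(hidden, output)
```

In a trained network the entries of `W1`, `b1`, `W2`, and `b2` would be fitted to data by gradient-based optimization instead of being fixed in advance.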
To capture the intricate details of wind flow patterns around buildings and to depict street-level wind environments, particle image velocimetry (PIV) is commonly applied near building models. However, measuring instantaneous wind speeds in sheltered areas around building models remains challenging due to laser-light shielding. Deep learning algorithms, such as the generative adversarial imputation network (GAIN), can be employed to analyze wind flow patterns more completely by imputing the unmeasured wind velocities [2]. Deep learning techniques have also proven effective in time-series forecasting models, surpassing conventional algorithms [39].
The effectiveness of bidirectional LSTM (Bi-LSTM) in forecasting wind power has been demonstrated by numerous comparison tests [40]. One study introduced Linear Neural Networks with Tapped Delay (LNNTD) combined with the Wavelet Transform (WT) for probabilistic wind power forecasting in a time-based environment, with the aim of increasing wind power prediction accuracy and reliability [41]. A new wind speed forecasting model was created using the long short-term memory network (LSTM) as the predictor. Although the computing overhead might be lessened, LSTM could standardize the lengthy time-series information in wind power data [42]. High-precision prediction modelling was developed based on LSTM and, to be more efficient, was realized using the gated recurrent unit (GRU) [43]. A temporal convolutional network (TCN) was also incorporated into the forecasting model. Based on the works above, deep learning (DL) techniques have demonstrated practical research value in wind power forecasting [44].
The Convolutional Neural Network (CNN) extracts deep features from the wind speed data to conduct wind speed forecasting. Figure 2 demonstrates the working of the CNN architecture as a deep learning approach. CNNs are valuable in wind speed forecasting due to their ability to extract useful features from raw wind speed data. The CNN architecture comprises key components that enhance its effectiveness in capturing spatial patterns and correlations. The core of the architecture consists of convolutional layers that use filters to perform convolutions on the input wind speed data. These filters learn to detect specific local patterns at different scales (in image data, for example, edges or corners). By applying these filters across the data, the CNN identifies spatial information relevant to wind speed prediction. Pooling layers are often added after the convolutional layers to downsample the feature maps generated by the convolutions. For example, max pooling selects the maximum value within each region, reducing data dimensions while retaining important features. This downsampling reduces model complexity and provides a degree of translation invariance, allowing the CNN to focus on crucial features regardless of their location. The output of the convolutional and pooling layers is then flattened and fed into fully connected layers. As in traditional neural networks, these layers learn complex relationships between the extracted features and the target variable, in this case wind speed. Fully connected layers enable the CNN to capture higher-level representations and make predictions based on learned patterns. During training, the CNN adjusts its internal parameters (weights and biases) to minimize the difference between predicted and actual wind speed values in the training data. This optimization is achieved through the backpropagation algorithm, which computes the gradients of the loss function with respect to the model parameters so that they can be updated. The CNN architecture is a powerful tool for extracting spatial features from wind speed data. By utilizing convolutional layers for local patterns, pooling layers for downsampling, and fully connected layers for higher-level relationships, CNNs can learn and predict wind speed patterns. Understanding the CNN architecture helps clarify its importance in wind power prediction and its potential for improving forecast accuracy.
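The convolution, pooling, flattening, and fully connected stages described above can be traced on a short one-dimensional wind speed sequence. This is a forward-pass sketch only, assuming NumPy, two hand-picked filters (a rise detector and a fall detector), and hand-picked dense weights. Nothing here is trained, and none of the values come from the cited studies.

```python
import numpy as np

def conv1d(x, kernel, bias=0.0):
    """Slide one filter over the sequence (valid cross-correlation), then apply ReLU."""
    k = len(kernel)
    out = np.array([np.dot(x[i : i + k], kernel) for i in range(len(x) - k + 1)]) + bias
    return np.maximum(out, 0.0)

def max_pool1d(x, size):
    """Keep the maximum of each non-overlapping window, dropping any remainder."""
    n = len(x) // size
    return x[: n * size].reshape(n, size).max(axis=1)

# Hypothetical wind speed sequence (m/s) with a ramp-up followed by a ramp-down.
speed = np.array([1.0, 2.0, 4.0, 7.0, 7.0, 6.0, 3.0, 1.0])

rise_filter = np.array([-1.0, 1.0])  # responds where speed increases
fall_filter = np.array([1.0, -1.0])  # responds where speed decreases

rise_map = conv1d(speed, rise_filter)  # [1, 2, 3, 0, 0, 0, 0]
fall_map = conv1d(speed, fall_filter)  # [0, 0, 0, 0, 1, 3, 2]

# Pool each feature map, then flatten into one feature vector.
features = np.concatenate([max_pool1d(rise_map, 2), max_pool1d(fall_map, 2)])

# Fully connected output neuron with illustrative (untrained) weights.
dense_weights = np.array([0.5, 0.5, 0.0, 0.0, 0.0, -0.5])
dense_bias = 5.0
prediction = float(dense_weights @ features + dense_bias)
print(features, prediction)
```

Training would replace the hand-picked filters and dense weights with values obtained by backpropagating a forecasting loss, and a practical model would use many filters and more than one convolutional layer.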
This substantially improves the modelling performance and the overall prediction results of the LSTM. The example demonstrates how well a CNN can extract the one-dimensional variation features from the original wind speed data while outperforming more conventional approaches. Using the features extracted by the CNN, the model may examine and improve the quality of the actual data [45]. Simple models such as GRU and LSTM have a limited capability to mine the core properties of sequences on their own; when combined with other techniques that examine the time-series data more thoroughly, they may significantly increase the accuracy of wind speed prediction. The CNN is employed to investigate the correlation and nonlinear characteristics of time series. For example, a model utilizing a 2D-CNN as the auto-encoder is used to forecast 2D regional wind speeds [46]. A 2D-CNN has also been employed as the feature extractor and attached to a radial basis function (RBF) network to estimate wind power. Table 3 shows the summary of neural network approaches used for wind prediction.

LNNTD
A model using LNNTD has been applied to wind power estimation for the Ontario Electricity Market (OEM).
The developed method is presented for a stochastic time series of wind energy forecasting. The advantage of this method is that it considers all sources of uncertainty introduced by uncertain input data throughout the parametric hypothesis.

2021, [42] LSTM
The main meteorological elements that significantly affect power generation through nonlinear effects were successfully extracted using a long short-term memory network.
The mid-to-long-term forecast accuracy of wind and solar power generation based on the long short-term memory network was significantly higher than that of the persistence and support vector machine models.

Temporal convolutional neural network (TCN)
The framework supports both parametric and non-parametric settings for estimating probability density. To capture the temporal dependencies of the series, stacked residual blocks based on dilated causal convolutional nets were utilized.
The methodology has excellent scalability and flexibility due to its capacity to learn latent correlations between series and to handle challenging real-world forecasting scenarios, including data sparsity and cold starts.

2022, [45] CNN
An innovative wind speed forecast model based on WPD (Wavelet Packet Decomposition) and CNN.
The resultant high-frequency sublayers were forecasted using a CNN with a 1D convolution operator. The low-frequency sublayer was forecasted using CNN-LSTM.

2021, [46] 2D-CNN
A CNN auto-encoder and long short-term memory (LSTM) were used to create a novel deep learning model for 2D regional wind speed forecasting.
The array of wind speed measurements was represented as a structured 2D matrix and learned using a CNN model. Using the deep features from the preceding stages, the LSTM unit then predicted these features step by step along the timeline.
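The dilated causal convolution named in the TCN entry of Table 3 can be sketched in a few lines. The following is a minimal plain-Python illustration, not the cited framework: the function names are hypothetical, the residual connections and probabilistic output are omitted, and the kernel values are placeholders. Each output at time t depends only on inputs at t, t - d, t - 2d, and so on, where d is the dilation, so no future value leaks into the forecast.

```python
def dilated_causal_conv(signal, kernel, dilation=1):
    """y[t] = sum_k kernel[k] * signal[t - k * dilation], with zeros before t = 0."""
    output = []
    for t in range(len(signal)):
        total = 0.0
        for k, weight in enumerate(kernel):
            index = t - k * dilation
            if index >= 0:            # implicit left zero-padding keeps it causal
                total += weight * signal[index]
        output.append(total)
    return output


def receptive_field(kernel_size, dilations):
    """Number of past steps seen by a stack with one causal convolution per dilation."""
    return 1 + (kernel_size - 1) * sum(dilations)


# Stacking layers with dilations 1, 2, 4 widens the temporal context quickly.
series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
hidden = series
for d in (1, 2, 4):
    hidden = dilated_causal_conv(hidden, [0.5, 0.5], dilation=d)
```

With a kernel of size 2 and dilations 1, 2, and 4, the last output of the stack depends on the eight most recent inputs, which is why doubling the dilation per layer is a common way to cover long histories with few layers.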

Hybrid Approaches
Hybrid models have emerged as promising approaches for wind power prediction by combining different techniques and models to enhance forecast accuracy and reliability. One example of a hybrid model is the combination of artificial neural networks with wavelet transforms. Wavelet transforms capture both time and frequency information in wind speed data, allowing hybrid models to extract relevant features for prediction and handle different scales and patterns. By combining the benefits of several models, numerous deep learning and machine learning procedures have been created in the literature to increase prediction accuracy further. Another approach combines linear neural networks with methods like tapped delay, considering lagged effects and temporal dependencies. This combination captures both linear and nonlinear relationships, resulting in more accurate predictions for complex wind speed behavior. Hybrid models can also incorporate statistical techniques like time-series analysis and regression models. These complement the predictive power of neural networks by providing a solid foundation for understanding underlying patterns and trends. In one study, data mining, discrete wavelet transforms, and a multilayer perceptron neural network were combined to create a hybrid wind power forecaster. The study emphasized that cluster selection speeds up the forecasting process by limiting the training data to the relevant subset rather than the entire dataset. During training, hybrid models integrate different components and optimize their parameters to minimize prediction errors. Careful calibration and parameter tuning ensure that the models work harmoniously and leverage each other's strengths. By utilizing hybrid models, researchers and practitioners can benefit from the complementary capabilities of different techniques, leading to more accurate and reliable wind power predictions. These models contribute to the advancement of wind power forecasting, facilitating better decision-making in renewable energy systems.

An SVM-enhanced Markov approach has also been proposed for forecasting short-term wind power. To predict the nominal evolution of wind generation, finite-state Markov techniques based on data analytics are first used [47]. The SVM forecast is then combined with the finite-state Markov models. A hybrid model that anticipates wind power with high accuracy would significantly reduce the uncertainty of the system's operation. One such hybrid model is built using multiple linear regression and least squares methods (MLR&LS) [48], which paves the way for a novel principle in prediction methodologies. The suggested hybrid model predicts wind power in two phases. Data transformation is the initial step, in which historical wind power data are transformed into ratios and used as values for the prediction. Inverse transformation, the second phase, yields the total wind power by converting the predicted ratios back into actual values. Figure 3 shows how two architectures extract features and then concatenate the features for better prediction.
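To make the finite-state Markov idea concrete, the sketch below discretizes a wind power series into states, estimates the transition matrix from observed state changes, and reads off the most likely next state. It is a simplified illustration under stated assumptions, not the method of [47]: the bin edges, the sample series, and the function names are hypothetical, and the SVM correction step is not shown.

```python
def discretize(values, edges):
    """Map each value to a state index: the number of bin edges it reaches."""
    return [sum(1 for edge in edges if value >= edge) for value in values]


def transition_matrix(states, n_states):
    """Row-normalized counts of observed state-to-state transitions."""
    counts = [[0] * n_states for _ in range(n_states)]
    for current, following in zip(states, states[1:]):
        counts[current][following] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        if total == 0:
            # A state never left in the data: fall back to a uniform row.
            matrix.append([1.0 / n_states] * n_states)
        else:
            matrix.append([count / total for count in row])
    return matrix


def most_likely_next(matrix, current_state):
    """Index of the state with the highest transition probability."""
    row = matrix[current_state]
    return max(range(len(row)), key=row.__getitem__)


# Hypothetical wind power samples (MW) and two thresholds giving three states.
power = [5, 12, 25, 22, 14, 6, 4, 11, 23, 27]
edges = [10, 20]                       # low: <10, medium: 10-20, high: >=20
states = discretize(power, edges)
matrix = transition_matrix(states, n_states=len(edges) + 1)
next_state = most_likely_next(matrix, states[-1])
```

In a hybrid scheme of the kind described above, such a state forecast would supply the nominal evolution, and a second model would refine it.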
The hybrid approach is the forecasting technique commonly used by sophisticated wind speed prediction systems. However, the application research and error correction of multiple-reservoir hybrid models still need improvement. To address this, a dual dynamic serial-parallel echo state network (SP-ESN) is employed. Its serial structure may dynamically choose the length of the training set and capture the short-term memory information of the sequence. However, the parallel system cannot automatically correct the prediction inaccuracy of the serial structure. An improved complete ensemble empirical mode decomposition with adaptive noise (ICEEMDAN) approach, the chaotic coyote optimization algorithm (CCOA), and the SP-ESN are then combined in a dynamic forecast model based on phase space reconstruction [49].

Another model considers the impact of several meteorological variables to improve the accuracy of short-term wind speed forecasts. The filter-wrapper non-dominated sorting differential evolution algorithm with K-medoid clustering (FWNSDEC) was developed to identify critical meteorological elements and produce various feature subsets. A fusion DL model is constructed for each feature subset. A convolutional long short-term memory (ConvLSTM) network is then used to process the sample set of three-dimensional sequences. The final forecast is the average of the predictions of all the constructed ConvLSTMs. Singular spectrum analysis (SSA) decomposes the meteorological factors and creates the three-dimensional input structure [50].

Forecasting wind speeds is essential for the reliability and operational safety of wind power plants. However, high-performance wind speed forecasting remains difficult because natural wind has strongly non-stationary and nonlinear properties. To provide precise and trustworthy wind speed forecasts, a multi-scale feature adaptive extraction (MSFAE) ensemble model has been created. The model builds six GWO-CNN-BiLSTM (GCNBiL) networks with various convolution operator lengths. At different time scales, it extracts and learns the deep autocorrelation features hidden in high-resolution data. The forecasting outcomes of the six GCNBiL benchmark models are then combined using the multi-objective cuckoo-search-moth-flame hybrid optimization (MOCSMFHO) algorithm [17]. Table 4 shows the summary of hybrid approaches used for wind prediction.

Data transformation was the initial step, in which historical wind power data were transformed into ratios and used as values for the prediction. The predicted ratios were then converted back into actual values by inverse transformation, after which the total wind power was forecasted.

In sophisticated wind speed prediction systems, the hybrid approach is acknowledged as the standard forecasting technique.
A dynamic prediction model based on phase space reconstruction that combines the SP-ESN, the CCOA, and the ICEEMDAN approach was created.

The model is built with six GWO-CNN-BiLSTM (GCNBiL) networks with various convolution operator lengths and at various time scales.
A multi-objective hybrid optimization algorithm is then used by the proposed model to combine the forecasting outputs of the six GCNBiL benchmark models.
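The combination step shared by these ensembles, merging the forecasts of several member models into one, can be sketched as follows. The example shows plain averaging and, as a deliberately simple stand-in for the multi-objective optimizers used in the cited work, weights proportional to the inverse validation error of each member. All function names and numbers are hypothetical.

```python
def mean_squared_error(actual, predicted):
    """Average squared difference between two equally long sequences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def inverse_error_weights(actual, member_predictions, eps=1e-12):
    """Weights proportional to 1 / MSE on validation data, normalized to sum to 1."""
    inverse = [1.0 / (mean_squared_error(actual, p) + eps)
               for p in member_predictions]
    total = sum(inverse)
    return [value / total for value in inverse]


def combine(member_predictions, weights=None):
    """Weighted sum of member forecasts; equal weights give the plain average."""
    n_members = len(member_predictions)
    if weights is None:
        weights = [1.0 / n_members] * n_members
    horizon = len(member_predictions[0])
    return [sum(w * p[t] for w, p in zip(weights, member_predictions))
            for t in range(horizon)]


# Hypothetical validation data (m/s) and forecasts from three member models.
observed = [6.0, 6.4, 7.1, 6.8]
validation_forecasts = [[6.1, 6.3, 7.0, 6.9],
                        [5.5, 6.0, 7.6, 7.2],
                        [6.0, 6.6, 7.3, 6.5]]
weights = inverse_error_weights(observed, validation_forecasts)

# Apply the learned weights to new forecasts from the same three members.
new_forecasts = [[7.0, 7.2], [6.6, 7.5], [7.1, 7.0]]
ensemble_forecast = combine(new_forecasts, weights)
```

Members with lower validation error receive larger weights, which is the same intuition that the optimization-based weighting schemes pursue with more elaborate search procedures.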
Figure 4 illustrates the growth of research in the field of wind flow prediction. Initially, researchers predominantly relied on physical models to predict wind flow patterns near tall buildings. However, recent trends indicate a gradual shift towards hybrid approaches, which combine the strengths of two or more models to increase prediction accuracy. The figure thus provides a visual representation of the transition from physical methods to hybrid models in the research domain.
Different types of predictive models, including machine learning, deep learning, and hybrid models, possess their own strengths and weaknesses. Machine learning models exhibit versatility, allowing them to be applied across diverse domains, while certain algorithms offer interpretability and scalability. However, they require manual feature engineering and may struggle to capture complex patterns or be sensitive to noise. In contrast, deep learning models automatically learn complex features, possess powerful representation capabilities, and can handle large-scale datasets. Nonetheless, they come with computational complexity, lack interpretability, and demand substantial labeled data. Hybrid models combine different models to achieve enhanced accuracy, improved robustness, and flexibility. However, they entail increased complexity and computational overhead, and they present interpretability challenges due to the integration of various model types. It is important to acknowledge that these observations are general and can vary based on specific algorithms, datasets, and application domains. Therefore, researchers and practitioners should carefully evaluate these factors when selecting the most suitable model for their prediction tasks.
The limitations and research gaps of the methods used for predicting wind patterns are discussed in the next section.

Research Gaps and Limitations
Persistence methods ignore changes in the weather and treat it as a deterministic event. Physical methods are suited for short-term forecasting, but their accuracy decreases as the forecast horizon increases. As the physical methods do not depend on historical data, they are well suited for new wind farms. Electrical utilities use some of the physical methods for instant measurement of wind speed. The numerical methods are suitable for long-term predictions. The WAsP model is widely used across the wind industry despite its topographical limitations, as it assumes that the sites are located in smooth terrain. The MCP methods are valuable in estimating the long-term wind characteristics at farm sites. Wind forecasting involves many uncertainties, and no robust methods have been devised to quantify them; existing approaches tend to underestimate the uncertainty and produce erroneous results. MCP methods are most often applied with, and prove effective on, linear regression models. Some statistical methods assess the uncertainty from the variability in the relationship between the current data and the long-term variability between the reference and target regions.
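The linear-regression form of MCP (measure-correlate-predict) mentioned above can be illustrated with a short sketch. Under the assumption of a single reference site and a purely linear relationship, it fits a line to the concurrent reference and target measurements and applies that line to the long-term reference record. The function names and all wind speed values are hypothetical, and no uncertainty estimate is produced, which mirrors the limitation noted in the text.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mcp_long_term_mean(ref_concurrent, target_concurrent, ref_long_term):
    """Estimate the long-term mean wind speed at the target site."""
    slope, intercept = fit_line(ref_concurrent, target_concurrent)
    predicted = [slope * value + intercept for value in ref_long_term]
    return sum(predicted) / len(predicted)


# Hypothetical concurrent measurements (m/s) at the reference and target sites.
ref_concurrent = [4.2, 5.1, 6.3, 7.0, 5.6]
target_concurrent = [4.8, 5.9, 7.4, 8.1, 6.5]

# Hypothetical long-term record at the reference site only.
ref_long_term = [5.0, 5.5, 6.1, 4.7, 6.8, 5.9]
estimate = mcp_long_term_mean(ref_concurrent, target_concurrent, ref_long_term)
```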
The MCP methods make predictions based only on the current data available in the reference and target regions, so they provide little information on uncertainty. Autoregression- and moving-average-based statistical methods are best suited for short-term predictions. ARIMA models showed reliable results that were more accurate than backpropagation neural networks for short time intervals. ARIMA models can be made more effective when integrated into hybrid models. The BLS technique used wind-map image data, which also include the climatology, vegetation, and surface morphology, instead of raw NWP data. BLS and MCS produced quick and cost-effective estimations for long-term predictions. These methods rely on wind maps, which are only available up to 45 m from the proposed site, whereas raw NWP data are available at various heights. The BLS with the vertical scaling technique produced better accuracy than before. The accuracy of the estimations depends on the correctness of the meteorological parameters, and any error in them greatly affects the accuracy of the prediction. The statistical methods rely on previous wind data and fail to consider the other atmospheric parameters that prevail at present.
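For readers unfamiliar with autoregressive forecasting, the sketch below fits a first-order autoregressive model, AR(1), which corresponds to ARIMA(1,0,0), by least squares and compares it with the persistence baseline that simply repeats the last observation. It is a minimal illustration with hypothetical function names and data; practical ARIMA modelling also involves differencing, moving-average terms, and order selection.

```python
def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1]; returns (c, phi)."""
    previous = series[:-1]
    current = series[1:]
    n = len(previous)
    mean_prev = sum(previous) / n
    mean_curr = sum(current) / n
    sxx = sum((p - mean_prev) ** 2 for p in previous)
    sxy = sum((p - mean_prev) * (c - mean_curr)
              for p, c in zip(previous, current))
    phi = sxy / sxx
    return mean_curr - phi * mean_prev, phi


def forecast_ar1(series, steps, c, phi):
    """Iterate the fitted recursion forward from the last observation."""
    forecasts = []
    last = series[-1]
    for _ in range(steps):
        last = c + phi * last
        forecasts.append(last)
    return forecasts


def forecast_persistence(series, steps):
    """Persistence baseline: the future equals the most recent observation."""
    return [series[-1]] * steps


# Hypothetical hourly wind speeds (m/s).
speeds = [6.2, 6.5, 6.1, 5.8, 6.0, 6.4, 6.9, 7.1, 6.8, 6.6]
c, phi = fit_ar1(speeds)
ar_forecast = forecast_ar1(speeds, steps=3, c=c, phi=phi)
baseline = forecast_persistence(speeds, steps=3)
```

Because each multi-step forecast feeds on the previous one, errors accumulate with the horizon, which is consistent with the observation that these methods suit short-term prediction.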
Table 5 presents a comprehensive overview of the methodology and research gap in intelligence-based approaches for wind flow prediction. It provides insights into the techniques and methods employed in these approaches, such as machine learning and deep learning. Additionally, the table highlights the existing research gap in this area, indicating areas where further investigation and improvement are required. The information in Table 5 helps readers understand the landscape of intelligence-based approaches and the potential areas for future research and advancements.

Method Research Gap
Sparse Vector Autoregression (sVAR)
The disadvantage of the sparse vector autoregression approach is that it only considers a small dimension. Work should be extended to spatial dimensions.

LASSO and SVR
When there are two or more highly correlated variables, or when the number of predictors exceeds the number of observations, Lasso regression and SVR will not work well. For predicting wind patterns, the attributes are highly correlated with each other.

Ensemble Learning
Due to the added complexity, using ensemble methods decreases the model's interpretability. Additionally, computation- and design-intensive ensemble systems are not suitable for applications requiring quick responses.

Kernel extreme learning machine (KELM)
The PSO algorithm is needed alongside KELM to tune the kernel function parameters of the neural network to their best values. The KELM algorithm lacks both generalization performance and stability.

Echo state networks (ESN)
The dependency of ESNs on their hyperparameters is one of their main drawbacks. It is difficult to train these networks because the stable working region of ESNs in the hyperparameter space is extremely constrained.

Grey wolf optimizer
The Grey Wolf Optimization (GWO) algorithm has a few drawbacks, including poor local search performance, low solving accuracy, and a sluggish convergence rate.

Particle Swarm Optimization
In a high-dimensional dataset, the particle swarm optimization (PSO) technique has a low convergence rate during the iterative process. PSO does not perform well on high-dimensional datasets because of its computational complexity.

Empirical wavelet transform
Understanding the variables important to a project or model is the largest issue with the empirical wavelet transform (EWT). Shift sensitivity, inadequate directionality, and a lack of phase information are three significant drawbacks.

Generative adversarial imputation network
The generator and discriminator networks in a GAIN are continually contending with one another, which can lead to unstable and slow training. Every generator iteration overoptimizes for a specific discriminator, and the discriminator never learns how to escape the trap.

LSTM
One of the main drawbacks of LSTMs is their difficulty in managing temporal dependencies longer than a few steps. The small size of the context window is another drawback.

Linear Neural Networks
Only linear relationships between input and output vectors can be learned by linear networks. Problems with a temporal component cannot be solved by linear neural networks.

GRU
The sluggish convergence rate and low learning efficiency of GRU models continue to be an issue, leading to excessively long training times and even under-fitting when used alone.

CNN
The drawbacks of CNNs include the requirement for a large amount of training data and the inability to encode the position and orientation of objects, which adds complexity to the proposed model.

Deep reinforcement learning
During the learning process, federated learning necessitates frequent communication between nodes. Therefore, it requires more than just adequate local processing and memory.
Only discrete action spaces are suitable for deep reinforcement techniques.
Tables 1-4 provide a basic list of several current techniques that can optimize the forecasting performance of wind power forecasting models. Although the methodologies listed in these tables have produced useful research findings on wind power forecasting, several research gaps, summarized in Table 5, remain worthy of further exploration.
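Several entries in Table 5 refer to particle swarm optimization as a tuning procedure. For orientation, a basic PSO loop is sketched below in plain Python. It is a textbook-style illustration with assumed settings (inertia 0.7, acceleration coefficients 1.5, 20 particles), and the sphere function stands in for what would in practice be a validation error as a function of model hyperparameters, for example kernel parameters.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iter=200,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize objective(position) within box bounds using basic PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    positions = [[rng.uniform(lo, hi) for lo, hi in bounds]
                 for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    best_positions = [p[:] for p in positions]          # personal bests
    best_values = [objective(p) for p in positions]
    g = min(range(n_particles), key=best_values.__getitem__)
    global_position, global_value = best_positions[g][:], best_values[g]

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + c1 * r1 * (best_positions[i][d] - positions[i][d])
                    + c2 * r2 * (global_position[d] - positions[i][d]))
                lo, hi = bounds[d]
                # Move the particle and clamp it to the search box.
                positions[i][d] = min(hi, max(lo, positions[i][d] + velocities[i][d]))
            value = objective(positions[i])
            if value < best_values[i]:
                best_positions[i], best_values[i] = positions[i][:], value
                if value < global_value:
                    global_position, global_value = positions[i][:], value
    return global_position, global_value


def sphere(position):
    """Placeholder objective with its minimum of 0 at the origin."""
    return sum(x * x for x in position)


best_position, best_value = pso_minimize(sphere, [(-5.0, 5.0), (-5.0, 5.0)])
```

The number of objective evaluations grows with both the swarm size and the iteration count, which is one reason the table lists slow convergence on high-dimensional problems as a limitation.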

State of Art and Future Directions
Currently, one of the top trends in wind and weather forecasting methods used by NASA and other geographic research institutes is the use of advanced numerical models. These models use mathematical equations to simulate the behavior of the atmosphere and make predictions about future weather patterns. The Global Forecast System (GFS), for example, is operated by the U.S. National Centers for Environmental Prediction (NCEP) under NOAA to make global weather predictions. Another trend in wind and weather forecasting is the use of satellite data. Satellites can provide a wealth of information about the atmosphere, including temperature, humidity, and wind patterns. NASA and other research institutes use these data to improve the accuracy of weather predictions. A third trend in wind and weather forecasting is the use of machine learning algorithms. As mentioned before, these algorithms can analyze large amounts of data and make predictions based on patterns and trends. NASA and other research institutes are exploring the use of these algorithms to improve the accuracy of weather predictions.
One way to improve these forecasting methods is to increase the amount of data used in the models. With more data, the models can better capture the complexity of the atmosphere and make more accurate predictions. However, it is important to ensure that the data used are of high quality and relevant to the forecast being made. Another way to improve these forecasting methods is to increase the resolution of the models. Higher resolution models can capture smaller-scale weather phenomena, such as thunderstorms and tornadoes, which are often missed by lower resolution models. However, high-resolution models also require more computational power and resources, which can be a limitation. A third way to improve these forecasting methods is to increase collaboration between research institutes. By sharing data and models, research institutes can improve the accuracy of weather predictions by combining the strengths of different models and data sets. Additionally, collaboration can help to identify gaps in knowledge and research areas that need further exploration.
In addition to these points, it is important to note that weather forecasting is a challenging task, and even with the most advanced methods, it is not possible to predict the weather with 100% accuracy. Therefore, it is crucial to have a clear understanding of the limitations of the forecasting methods used, and to communicate this information to the public in a transparent and accurate way. Ultimately, advanced numerical models, satellite data, and machine learning algorithms are some of the top trends in wind and weather forecasting methods used by NASA and other research institutes. To improve these methods, it is important to increase the amount of data used in the models, increase the resolution of the models, and increase collaboration between research institutes. However, it is also important to acknowledge the limitations of forecasting methods and communicate them accurately to the public.

Conclusions
This paper presents an overview of wind characteristic forecasting schemes available in various domains. Many forecasting models have been introduced, and researchers are still focusing on robustness. Every model has its own features and limitations. Some models perform well at short-term forecasting, while others are best suited for the long term. Some physical or persistence methods like WAsP are simple and widely used despite their limitations regarding the topography of the sites. The physical methods do not require any historical data for wind forecasting at new sites, whereas the numerical or statistical methods require comprehensive information about the sites. The MCP methods were used for linear long-term wind energy predictions. The data-driven machine learning models achieved greater accuracy with minimized errors in wind forecasting. For long-term predictions, neural networks have gained more visibility in forecasting models due to their greater accuracy with a high volume of data. The evaluation of the various strategies shows that most models rely heavily on the underlying data, and thus overfit the available data and cannot be generalized to other data. Besides individual methods, the hybridization of methods and the ensembling of models have been shown to achieve greater accuracy with minimal errors. For any forecasting approach, the atmospheric conditions play a crucial role in determining the accuracy and fitness of the model. In addition, for weather forecasting models, the accuracy relies on the land topography, the altitude of wind measurement, temperature, pressure, and so on.

Sustainability 2023, 23
Figure 1. Architectural stage of machine learning approaches.
The machine learning life cycle's main activity is model creation, which has three subpoints: model selection and assessment, model training, and model evaluation. Finally, optimizing the model and executing ongoing maintenance checks are critical. Researchers must monitor the model using predictive analytics software and look for problems like model drift or bias to ensure that it continues to produce correct predictions over time. Contrarily, intelligent techniques, such as the extreme learning machine (ELM) [31], Elman neural network (Elman), echo state networks (ESN) [32], and long short-term memory network (LSTM) [33], are frequently utilized for wind speed forecasting due to their excellent nonlinear fitting and self-learning capabilities. However, the predictive performance of these intelligent models can be unstable because of their heavy reliance on built-in parameters. As a result, ensemble models were developed. Ensemble approaches may be more efficient since they can concurrently ensure the generalization and accuracy of the forecasting approach [18]. The ensemble learning method uses the weight integration concept and an optimization technique to combine several predictors, which mitigates model overfitting while enhancing the predictor's capacity to adapt to data with various temporal properties [34]. Among other techniques, particle swarm optimization (PSO) integrates other neural networks. The ensemble network model can significantly better optimize performance


Figure 4. Growth rate of research from physical methods to hybrid models.


Table 1. Summary of conventional wind energy prediction approaches.

Table 2. Summary of machine learning approaches.

Table 3. Summary of neural network approaches.

Table 4. Summary of hybrid approaches.

Table 5. Methodology and research gap in intelligence-based approaches.