Benchmarking of Various Flexible Soft-Computing Strategies for the Accurate Estimation of Wind Turbine Output Power

Bilal, Boudy; Yetilmezsoy, Kaan; Ouassaid, Mohammed

doi:10.3390/en17030697

by Boudy Bilal ¹,², Kaan Yetilmezsoy ³,* and Mohammed Ouassaid ⁴

¹ Electrical Engineering Department, UR-EEDD, Ecole Supérieure Polytechnique, Nouakchott BP 4303, Mauritania

² URAER/FST, Université de Nouakchott, Nouakchott BP 5026, Mauritania

³ Department of Environmental Engineering, Faculty of Civil Engineering, Yildiz Technical University, Davutpasa, Esenler, 34220 Istanbul, Turkey

⁴ Electrical Engineering Department, Engineering for Smart and Sustainable Systems Research Centre, Mohammadia School of Engineers, Mohammed V University in Rabat, Rabat 10090, Morocco

* Author to whom correspondence should be addressed.

Energies 2024, 17(3), 697; https://doi.org/10.3390/en17030697

Submission received: 17 December 2023 / Revised: 16 January 2024 / Accepted: 22 January 2024 / Published: 1 February 2024

(This article belongs to the Special Issue Whole-Energy System Modeling)

Abstract

This computational study explores the potential of several soft-computing techniques for wind turbine (WT) output power (kW) estimation based on seven input variables of wind speed (m/s), wind direction (°), air temperature (°C), pitch angle (°), generator temperature (°C), rotating speed of the generator (rpm), and voltage of the network (V). In the present analysis, a nonlinear regression-based model (NRM), three decision tree-based methods (random forest (RF), random tree (RT), and reduced error pruning tree (REPT) models), and multilayer perceptron-based soft-computing approach (artificial neural network (ANN) model) were simultaneously implemented for the first time in the prediction of WT output power (WTOP). To identify the top-performing soft computing technique, the applied models’ predictive success was compared using over 30 distinct statistical goodness-of-fit parameters. The performance assessment indices corroborated the superiority of the RF-based model over other data-intelligent models in predicting WTOP. It was seen from the results that the proposed RF-based model obtained the narrowest uncertainty bands and the lowest quantities of increased uncertainty values across all sets. Although the determination coefficient values of all competitive decision tree-based models were satisfactory, the lower percentile deviations and higher overall accuracy score of the RF-based model indicated its superior performance and higher accuracy over other competitive approaches. The generator’s rotational speed was shown to be the most useful parameter for RF-based model prediction of WTOP, according to a sensitivity study. This study highlighted the significance and capability of the implemented soft-computing strategy for better management and reliable operation of wind farms in wind energy forecasting.

Keywords: artificial neural networks; decision tree-based modeling; soft-computing; wind turbine output power

1. Introduction

Meeting the world’s expanding energy demand still depends mostly on fossil fuels, which increases greenhouse gas emissions, the primary cause of global warming [1], and results in lasting environmental degradation [2]. However, the consequences of global warming can be mitigated by increasing renewable energy production [3]. In particular, global warming effects could be reduced by utilizing wind energy, which is one of the most promising renewable energy sources. Many efforts have been made to generate combined mechanical and electric power [4]. To reduce dependence on the fossil fuel sector and coal for electricity generation, various wind farms have been constructed [5]. According to the Global Wind Energy Council (GWEC), more than 1.2 billion tons of CO₂ are avoided thanks to the 837 GW total worldwide wind generating capacity installed up to 2022 [6]. Moreover, many countries have built offshore wind farms contributing about 22.5% to wind energy installed worldwide in 2021 [4]. However, the wind turbine (WT) is an intricate electro-mechanical device made up of multiple components. It incorporates an electrical generator, a rotating shaft, a gearbox, a lubrication system, and an electronic-control system, and is generally susceptible to various problems and failures in severe conditions, which eventually lead to higher operational and maintenance costs [7]. In addition, a number of operational and meteorological factors affect how wind turbines (WTs) operate [8]. While the operating parameters include the pitch angle, generator operating temperature, generator rotation speed, and grid voltage [9], the intermittency of the meteorological parameters is one of the major causes of WT failures. Precise knowledge of the operating state of WTs makes it possible to optimize the control and planning of energy consumption management. 
For instance, one of the most important indicators used to decide the quality of wind potential for wind energy generation is wind speed distribution [8]. In addition, the pitch angle is generally considered for optimizing the WTs’ output power [4]. Therefore, the use of WT technology should be supported by the development of new tools for wind power forecasting, to build up an ideal strategy for the intelligent control, maintenance, and management of the electrical systems, minimizing the deterioration of the wind system components. In addition, system reliability has a considerable influence on WTs’ power producing cost [10]. In this situation, optimizing actual wind farms at the lowest possible cost requires an original approach consisting of a variety of relevant decision variables, and limitations [11]. As a result, a strong and flexible optimization technique combined with an appropriate estimation method may provide significant financial advantages over manual scheduling [12,13]. Additionally, monitoring the dynamic nature of meteorological parameters and wind farms is a challenging task that calls for the development of reliable forecasting methodologies, forecasting algorithms, and expert systems.

In light of the dearth of wind data on actual locations, a number of methods for predicting wind power and speed have recently been developed [14,15,16,17,18,19,20,21,22,23]. Additionally, a three-stage genetic ensemble and auxiliary predictor [24], Bayesian model averaging and ensemble learning (BMA-EL) [25], a stacked recurrent neural network (SRNN) with parametric sine activation function (PSAF) algorithm [26], a data-driven approach integrating data pre-processing and deep learning models [27], a spatiotemporally multiple clustering algorithm and hybrid neural network method [28], a deep residual gated recurrent unit (GRU) network combined with ensemble empirical mode decomposition (EEMD) and crisscross optimization algorithm (CSO) [29], machine learning [30], three improved encoder–decoder architectures (TIEDA), sequence-to-sequence bidirectional gated recurrent unit (SBIGRU), attention-based sequence-to-sequence Bi-GRU (ASBIGRU) and transformer, in natural language processing (T-NLP) [31], data-driven applications using both historical measurements and modern-era retrospective analysis [32], and a wavelet transform based convolutional neural network and twin support vector regression [33] have been applied to improve prediction accuracy in wind power forecasting. Appendix A (Table A1 and Table A2) summarizes various techniques and approaches developed for the forecast of wind power and speed to understand the behavior of wind farms in different climatic conditions. Statistical regression, machine learning, and artificial intelligence methods are used to predict wind speed and wind power, with the goal of improving the quality of the wind signal to optimize energy production and reduce the failures of WTs. It is noted that other parameters (e.g., air pressure, humidity, ambient temperature, etc.) can also influence the production of the WTs and should be taken into account when forecasting wind power. 
Furthermore, the wind power forecast performance is influenced by the specific operational parameters of the WTs and their components under certain operating conditions. These parameters include rotating speed, lubrication, output voltage, output current, alignment angle, and energy loss during warm-up. These factors can all lead to structural fatigue, bearing pitting, corrosion, abrasion of the blades, and failures, as well as lowering the accuracy of wind power prediction. From this perspective, several studies have been conducted using various new approaches to increase the accuracy of wind power and wind speed forecasts (Appendix A). Moreover, Xiong et al. [34] proposed a hybrid model that combines complementary ensemble empirical mode decomposition (CEEMD), sample entropy (SE), random forest (RF), improved reptile search algorithm (IRSA), bidirectional long short-term memory (BiLSTM) network, and extreme learning machine (ELM) for wind power prediction. Furthermore, Jiading et al. [35] presented a novel strategy integrating a learning algorithm (TS_XGB model) based on spatio-temporal data mining according to changes in the direction and speed of the wind for ultra-short-term wind power forecasting. In another study from China, Sheng et al. [36] conducted a short-term wind power prediction based on the deep clustering-improved temporal convolutional network (DCTCN) for WT output power (WTOP) prediction by classifying the various typical features of numerical weather prediction (NWP) extracted based on the categorical generative adversarial network (CGAN). In Portugal, Osório et al. [37] developed a hybrid and adaptable adaptive neuro-fuzzy inference system (ANFIS)-based technique incorporating the wavelet transform (WT) and particle swarm optimization (PSO) with mutual information (MI). An et al. [38] employed a hybrid prediction model including the EMD (empirical mode decomposition) based on the chaos and grey theories. 
The EMD allowed the decomposition of the power signal into numerous intrinsic-mode-function (IMF) components and one residual, whereas the grey forecasting model allowed the residual prediction. Guo et al. [39] developed a physics-inspired neural network (PINN) model for short-term wind power prediction considering wake effects. Al-qaness et al. [40] optimized a Random Vector Functional Link (RVFL) network using a new naturally inspired technique called the capuchin search algorithm (CapSA). Ye et al. [41] proposed an ensemble learning prediction model considering the rolling error correction strategy for wind power prediction based on multiple gradient boosting trees (GBDTs) tuned by Bayesian optimization. In Iran, Bigdeli et al. [42] introduced various hybrid prediction models based on neural networks optimized by an imperialist competitive algorithm (ICA), the genetic algorithm (GA), and the PSO. Benchmarking of the NN-ICA, NN-GA, and NN-PSO prediction models on an input dataset selected using time series analysis revealed the dominance of the NN-ICA prediction model. Additionally, a novel adaptive neuro-fuzzy inference system with the moving window (ANFIS-MoW) for wind power prediction was developed by Bilal et al. [43]. The proposed approach was applied to a dataset in different time series windows, namely the very short-term, short-term, medium-term and long-term time horizons. According to the study’s findings, the recommended method was a potential soft-computing tool for precisely estimating the WT output power. Weidong et al. [44] built a genetic neural network (G-NN) modeling technique for predicting both the wind speed and the WTs’ output power. To highlight the superior performance of the G-NN-based model, they benchmarked it against the standard back-propagation (BP), the momentum BP, and the GA. 
For multi-step offshore wind power prediction, a new hybrid probability density model including time varying filter based empirical mode decomposition (TVFEMD), approximate entropy (AE), Yeo-Johnson transform quantile regression (YJQR), and Gaussian approximation of quantiles (GAQ) was proposed by Zhang et al. [45]. First, the raw data were preprocessed using TVFEMD decomposition and AE theory. YJQR was then used to forecast offshore wind power 16 steps ahead. Finally, the GAQ technique was used to generate probability density curves for the outcomes of the 16-step cumulative quantile prediction. Recently, Liu and Zhang [46] conducted a study on a bilateral branch learning paradigm with data of multiple sampling resolutions for short-term wind power prediction. Huang et al. [47] studied a multiple-SVR-based model as another innovative approach for short-term wind power prediction. Kassa et al. [48] indicated that a hybrid GA-BP-NN-based model prediction outperformed others. The model’s parameters were determined using the enhanced harmony search (EHS) approach, with accurate forecasts in 15-min increments for predictions at 3 h. In another study conducted by the same research team, Kassa et al. [49] suggested an ANFIS-based one-day wind power generation forecast that outperformed the BP-NN and hybrid GA-BP-NN-based models. According to the above-mentioned research, hybrid models can enhance prediction accuracy significantly more than single models.

WTs are complex electromechanical devices that depend heavily on wind speed and direction to function. Many other meteorological factors (e.g., ambient temperature, atmospheric pressure, humidity, air density, etc.) and operational parameters (e.g., rotational speed, lubricating oil temperature, output voltage, output current, alignment angle, energy loss with warm-up, and so forth) could influence the behavior of the wind farm, reducing the energy produced in the long term, so they should be taken into consideration when modeling a new approach for wind power. The most recent studies based on hybrid techniques for wind speed [14,15,16,17,18,19,20,21,50,51,52,53,54] and on hybrid techniques for wind power predictions [25,26,28,45,46,48] have used a single input variable (wind speed or wind power). Also, two or three input variables combined are used in some studies [21,47]. However, the performance of these models is generally degraded by the neglect of the other potential meteorological and operational parameters. Furthermore, the complexities of some hybrid techniques [20,45,54] have made prediction accuracy a challenging task in scientific research. It was found in another study [28] that complicated intermittent weather factors, including typhoons, cannot be processed fully, while another [47] produced different confidence levels for different forecasting periods. In this situation, more models may be considered to improve forecasting accuracy. In addition, the model’s capability to extract the meaningful characteristics of the considered input data is investigated in order to refine the model. Of course, the dynamic character of wind farms and the operational characteristics of WTs have been extensively studied and documented. However, there is still a shortage of research on state-of-the-art soft computing or machine learning methods for modeling WTOP. 
Previous models have weaknesses that need to be addressed by robust techniques, and it is envisaged that the application of models based on more effective techniques will produce better results and offer new possibilities for the WT design scheme. Although several data-driven models have been used in earlier studies, a nonlinear regression-based model (NRM), three decision tree-based methods (RF, RT, REPT), and a multilayer perceptron-based soft-computing approach (ANN) are still lacking comprehensive benchmarks for WTOP modeling. The primary contribution of this research is to identify the best-performing model for wind turbine output power prediction while taking into account various input variables such as meteorological and wind farm operating parameters. Furthermore, the impact of input variables and the dynamic behavior of the wind system on the accuracy of the indicator performance have been taken into account. The most reliable model for WTOP estimation that responds to the intermittency of critical system parameters is required to better manage wind farms’ operation and contribute to better scheduling of wind farm maintenance operations, reducing system component failure.

As a result, in order to shed light on the specified gap, the following goals were devised for the current study: (1) gathering a considerable amount of WT-related data from a 30-MW wind farm; (2) producing the first-ever WTOP forecast using simultaneous optimization and inter-comparison of the applied data-intelligent methodologies; (3) identifying the top soft-computing strategy utilizing more than 30 different statistical performance evaluations, reliable mathematical diagrams, and detailed supportive visual/tabulated presentations for the WT dataset; and (4) presenting the flexibility and usefulness of the proposed soft-computing approach for a highly nonlinear real-world wind farm power system.

2. Materials and Methods

2.1. Collection of the Dataset Used in the Present Computational Analysis

The dataset from a 30-MW wind farm (Figure 1) considered in this investigation was provided by the Mauritanian Electricity Company (SOMELEC). Figure 1 illustrates the meteorological station (including anemometers at different heights, a wind vane, a tower for installing the wind sensors, and a data logger to collect the meteorological data for the control of the wind farm system), the structure of the wind farm, the SCADA/SGIPE data system for monitoring and controlling the wind system, and an interface for observation and planning actions. The data were collected over a year, from 00:00 on 3 July 2015 to 23:50 on 2 July 2016. The facility, composed of 15 WTs of 2 MW each, is located on the northwestern coast of Nouakchott (Mauritania) at 17.99° North, 15.97° West, and 1 m above sea level. The cut-in speed of each WT is 3 m/s, the rated speed is 12 m/s, the cut-off speed is 20 m/s, and the tower height is 100 m. The dataset contains meteorological observations (e.g., wind speed, wind direction, temperature, air density, humidity, and pressure) and the WTs’ state variables including output power, WT generator temperature, voltage, pitch angle, rotating speed of the generator, frequency, oil lubrication temperature, gearbox temperature, etc. Data were recorded every second, and the average was calculated over a 10-min duration (which corresponds to 52,560 values a year for every parameter, before filtering out the outliers). Next, the data for the standard deviation, maximum, minimum, and average were transferred to a supervisory control and data acquisition (SCADA) system, as shown in Figure 1. Software called SGIPE (Sistema de Gestión de Parques Eólicos) was connected to the SCADA server for facility monitoring, providing observation of measured parameters, reports on energy generated, wind farm availability, and graphical representation of the findings. 
Due to the server’s ability to send data via a wide area network (WAN) to distant SCADA computers, this program allows an external operator to stop or restart a turbine based on its performance and circumstances. However, an expert intervention is susceptible to delays based on the degradation of system components. Thus, considering operational factors and meteorological data, an accurate assessment of wind farms’ output power is required to lower the danger of failure and financial loss. The SGIPE software is configured according to two scenarios, namely local configuration and distance configuration. For the local configuration, the operator can act locally on the SCADA to collect data, stop/start wind turbines, and configure the system operation. However, for the distance configuration, the expert can operate online.
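The aggregation described above (one reading per second reduced to 10-min average, minimum, maximum, and standard deviation) can be sketched as follows. This is only an illustration under stated assumptions: the function name `window_stats` is hypothetical, and the use of the population standard deviation is an assumption, since the text does not say how the SCADA system computes it.

```python
from statistics import mean, pstdev

SAMPLES_PER_WINDOW = 10 * 60  # one reading per second over a 10-min window


def window_stats(samples, window=SAMPLES_PER_WINDOW):
    """Return (average, minimum, maximum, std) for each complete window."""
    stats = []
    for start in range(0, len(samples) - window + 1, window):
        block = samples[start:start + window]
        # Population standard deviation is an assumption made for this sketch.
        stats.append((mean(block), min(block), max(block), pstdev(block)))
    return stats


# Six 10-min records per hour over a 365-day year, as stated in the text.
records_per_year = 365 * 24 * 6
```

With these definitions, `records_per_year` reproduces the 52,560 values per parameter quoted for the unfiltered dataset.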

2.2. Importance of Selected Predictor Variables

Wind speed and direction are the main inputs for electricity generation from WTs [55]. Generally, WTs operate in wind speeds ranging from 3 up to 25 m/s [56]. However, the wind speed can reach more than 25 m/s in some cases [57]. Under real operation conditions, WTs use wind speed and direction to generate power. WT power decreases with rapid changes in wind speed and direction due to torsional vibrations in the drivetrain structure, increasing stress on turbine components [58,59]. Moreover, WTs are controlled and adjusted to maximize power. The swept area of the wind blades has a significant impact on output power, so it is important to determine the blades’ orientation towards the wind direction based on changes in the location of the WTs. So, the pitch angle is controlled to maximize the best region on the power curve for the WT [60]. The WT’s pitch angle mechanism may malfunction, preventing it from optimizing output power and forcing it to operate at a fixed angle. Without repairs, damage to the pitch angle system may spread to other parts of the WT, which would cause it to shut down. WT performance is significantly impacted by blade angle control [61]. Moreover, WTOP depends on wind speed, direction, system conversion parameters, blade pitch angle, rotor-generator speed, temperature, and turbine components (e.g., generator, gearbox, lubricating oil temperature, etc.) [43,62]. Additionally, high wind power penetration can reduce the frequency regulation ability of conventional synchronous generators (SGs). WTs operate using maximum power point tracking (MPPT), which is independent of the grid system and does not react to system frequency deviation [63]. Therefore, WTs must be controlled to guarantee the grid’s stability in the event of faults.

2.3. Descriptive Statistics of the Model Components Assigned for Training and Testing Phases

The goal of this work was to show how different complex soft-computing methods could be applied to estimate WTOP and to determine which method would yield the greatest modeling performance. Out of a total of 36,798 observations from a 30-MW wind farm, 25,759 observations (~70% of the entire dataset) were employed during the model’s training phase, while 11,039 observations (~30% of the entire dataset) were considered in light of the literature for the testing step [64,65,66,67,68]. In this study, wind speed (X₁: WS [=] m/s), wind direction (X₂: WD [=] °), air temperature (X₃: AT [=] °C), pitch angle (X₄: PA [=] °), generator temperature (X₅: GT [=] °C), rotating speed of the generator (X₆: RSG [=] rpm), and voltage of the network (X₇: VN [=] V) were considered as the input variables, whereas WT output power (Y: WTOP [=] kW) was selected as the models’ target. The comprehensive descriptive statistics of the model components utilized in soft computing approaches based on multiple inputs single output (MISO) are compiled in Table 1. The preliminary trial-and-error results (not presented here due to limited space) also demonstrated that better predictions were obtained with actual data-based inputs in the present study, which is in line with the previous findings [65,69,70]. The skewness values showed that AT and VN datasets were approximately symmetric (“+” indicates right-skewed or right-tailed, and the “−” symbol indicates left-skewed or left-tailed), while WS and PA datasets were weakly skewed right (Table 1). On the other hand, the GT dataset was somewhat skewed to the right for both the training and test sets. Although WD and WTOP datasets showed a slight skewness to the right, the RSG dataset had a slight skewness to the left. In addition, the kurtosis values indicated that the WS dataset was classified as almost mesokurtic (i.e., K ≈ 3), whereas the VN data revealed a leptokurtic character (i.e., K > 3). 
All other attributes (WD, AT, PA, GT, RSG, and WTOP) had platykurtic distributions for training and testing phases (i.e., K < 3). To further illustrate the point, Figure 2 shows scatter plots of WTOP versus each predictor. Considering the strength of S-type, I-type, and square-type clusters in certain intervals, every predictor showed a specific importance, suggesting that none of them should be excluded from the applied models, which is consistent with other MISO-type data-intelligent studies [67,71,72].
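The skewness and kurtosis classification used above can be illustrated with a short sketch. It assumes simple moment-based (population) estimators and the non-excess kurtosis convention implied by the K ≈ 3 threshold; the statistical package used in the study may apply bias corrections, and the tolerance `tol` for "approximately 3" is a hypothetical choice.

```python
def skewness_kurtosis(values):
    """Moment-based skewness and (non-excess) kurtosis of a sample."""
    n = len(values)
    centre = sum(values) / n
    m2 = sum((v - centre) ** 2 for v in values) / n
    m3 = sum((v - centre) ** 3 for v in values) / n
    m4 = sum((v - centre) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2


def kurtosis_class(k, tol=0.1):
    """Label a distribution by its kurtosis K (tol is an illustrative choice)."""
    if abs(k - 3.0) <= tol:
        return "mesokurtic"
    return "leptokurtic" if k > 3.0 else "platykurtic"


skew, kurt = skewness_kurtosis([1.0, 2.0, 3.0, 4.0, 5.0])
label = kurtosis_class(kurt)
```

For the evenly spaced toy sample, the skewness is zero (symmetric) and K = 1.7, so the sample is labelled platykurtic.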

2.4. Presentation of Soft-Computing Techniques and Software Systems

Soft-computing approaches, including the random forest (RF) model, random tree (RT) model, reduced error pruning tree (REPT) model, and feed-forward artificial neural network (ANN) (or multi-layer perceptron (MLP)) model, were developed utilizing WEKA 3.9.6 (Waikato Environment for Knowledge Analysis) software (The University of Waikato, Hamilton, New Zealand, https://www.cs.waikato.ac.nz/ml/weka/, accessed on 16 December 2023). In the computational analysis, the overall dataset was shuffled with a random seed value of 42 to ensure consistency and reproducibility, which is in accordance with recent studies [73,74,75,76]. A fixed random state such as 42 is frequently used in machine learning applications because it provides a repeatable starting point for random number generation: whenever the random state is set to 42, the random number generation process always produces the same sequence of values.
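The effect of a fixed seed on the shuffle and the ~70/30 partition can be sketched as follows. This uses Python's own `random` module rather than WEKA, so the resulting row order will not match the study's; the function name `shuffled_split` is hypothetical, and only the dataset size, the split ratio, and the seed value come from the text.

```python
import random


def shuffled_split(n_rows, train_fraction=0.70, seed=42):
    """Shuffle row indices with a fixed seed, then cut into train/test parts."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # same seed -> same permutation
    n_train = round(n_rows * train_fraction)
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = shuffled_split(36798)
```

With 36,798 rows this yields 25,759 training and 11,039 testing indices, matching the counts reported in Section 2.3, and repeated calls with the same seed return identical partitions.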

In addition to three decision tree-based methods and an ANN-based approach, a nonlinear regression-based model (NRM) was also established within the scope of the present study. To achieve this, the training portion of the entire dataset imported from Microsoft^® Excel^® 2010 (V14.0.7015.1000) was moved into the DataFit^® (V8.1.69) multiple regression software package’s numerical computation environment and assessed using a Casper Excalibur PC running Windows 10.

Two statistical and visualization software packages, namely StatsDirect (V2.7.2, Copyright^© 2024–2008, StatsDirect Ltd., Altrincham, Cheshire, UK) and GraphPad Prism (V9.5.0 (730), Copyright^© 2024–2022, GraphPad Software LLC, Boston, MA, USA) were employed in the computation of the comprehensive descriptive statistics as well as the measured and forecasted WTOP values (kW) for the training, testing, and overall datasets. These software packages were also employed in order to generate predictor variable scatter plots, violin plots, box-and-whisker plots, and spread plots. SigmaPlot^® (V10.0.0.54, Systat Software, Inc., GmbH, Düsseldorf, Germany) software and Microsoft^® Excel^® 2010 were used to create linear correlation graphs of the applied soft-computing models for both the training and testing phases.

In the present computational analysis, the MATLAB^® R2018a program (V9.4.0.813654) was used for the determination of more than 30 distinct statistical performance evaluators (definitions of the relevant indices are presented in Section 2.5). Moreover, Taylor diagrams for both the training and testing stages were developed through the execution of an original solution script created in MATLAB^®’s M-file Editor.

2.4.1. Nonlinear Regression-Based Model (NRM)

In the current study, the training dataset (n = 25,759) was imported from Microsoft^® Excel^® 2010. The nonlinear regression-based analysis was implemented within the context of DataFit^® software. For the convergence criterion in the multiple regression-based analysis, the values of the solution preferences were selected as follows: (a) regression tolerance = 1 × 10⁻¹⁰, (b) maximum number of iterations = 250, and (c) diverging nonlinear iteration limit = 10. The nonlinear regression was performed using Richardson’s extrapolation approach to obtain numerical derivatives for the model solution. The Levenberg–Marquardt approach with double-precision was used to conduct the multiple regression-based analysis. An alpha (α) level of 0.05 (or 95% confidence) was used to determine the statistical importance of the model’s components.
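As a rough illustration of how Richardson's extrapolation sharpens a numerical derivative, the sketch below combines two central-difference estimates with step sizes h and h/2. It is a generic textbook form, not a description of the internal scheme of the regression package; the function names and the default step size are assumptions.

```python
import math


def central_difference(f, x, h):
    """Second-order central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def richardson_derivative(f, x, h=0.1):
    """Cancel the h^2 error term by combining estimates at h and h/2."""
    coarse = central_difference(f, x, h)
    fine = central_difference(f, x, h / 2.0)
    return (4.0 * fine - coarse) / 3.0


plain = central_difference(math.exp, 0.0, 0.1)
improved = richardson_derivative(math.exp, 0.0, 0.1)
```

For f(x) = exp(x) at x = 0, where the exact derivative is 1, the extrapolated estimate is several orders of magnitude closer to the true value than the plain central difference with the same step.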

2.4.2. Random Forest (RF) Model

The random forest (RF) algorithm is a popular ensemble machine learning technique that uses random vector samples to create a structured set of tree predictors from input vectors [77,78]. To obtain a final decision via majority voting, it mixes random subspace, bagging (sometimes called bootstrap aggregating), and functions. When determining how to split the forest trees, two factors need to be considered: the number of decision trees (N-tree) to be formed and the number of features to be examined to identify the ideal split. By combining many criteria, random forest regression allows the tree to expand to the depth of all new training data [79]. Regression forests are generally more predictive than single regression trees. To give the optimum RF model, RF models rank the variables based on their relevance. In order to accurately predict WTOP, an RF-based model was built in this work using a trial-and-error methodology.
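The two ingredients named above, bagging and random subspace selection, can be illustrated with a deliberately small toy ensemble. This sketch is not the WEKA implementation used in the study: it grows depth-one trees (stumps) instead of full trees, averages the tree outputs (the usual aggregation for a numeric target, in place of majority voting), and all function names, default values, and the toy data are hypothetical.

```python
import random
from statistics import mean


def fit_stump(rows, targets, feature_ids):
    """Return (feature, threshold, left_mean, right_mean) minimizing squared error."""
    best = None
    for f in feature_ids:
        # Candidate thresholds: every distinct value except the largest,
        # so both sides of the split are guaranteed to be non-empty.
        for thr in sorted({r[f] for r in rows})[:-1]:
            left = [t for r, t in zip(rows, targets) if r[f] <= thr]
            right = [t for r, t in zip(rows, targets) if r[f] > thr]
            lm, rm = mean(left), mean(right)
            sse = sum((t - lm) ** 2 for t in left) + sum((t - rm) ** 2 for t in right)
            if best is None or sse < best[0]:
                best = (sse, f, thr, lm, rm)
    if best is None:  # every candidate feature is constant in this sample
        m = mean(targets)
        return (feature_ids[0], float("inf"), m, m)
    return best[1:]


def fit_forest(rows, targets, n_trees=25, n_features=2, seed=42):
    rng = random.Random(seed)
    n, d = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # bootstrap sample (bagging)
        feats = rng.sample(range(d), n_features)     # random feature subset
        forest.append(fit_stump([rows[i] for i in idx],
                                [targets[i] for i in idx], feats))
    return forest


def predict(forest, row):
    """Average the individual stump outputs."""
    return mean([lm if row[f] <= thr else rm for f, thr, lm, rm in forest])


# Toy data: only the first feature carries signal; the other two are noise.
rows = [[float(i), float(i % 3), float((7 * i) % 5)] for i in range(20)]
targets = [10.0 if i >= 10 else 0.0 for i in range(20)]
forest = fit_forest(rows, targets)
low = predict(forest, [0.0, 1.0, 1.0])
high = predict(forest, [19.0, 1.0, 1.0])
```

Because each stump sees a different bootstrap sample and feature subset, individual stumps disagree, but the averaged prediction still separates the low and high regions of the informative feature.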

2.4.3. Random Tree (RT) Model

Without pruning, a predetermined number of random features at each node are evaluated using the random tree (RT) technique [80]. It is a member of the class of forests called tree estimators. Before extracting the category mark with the most votes, the random tree classifier takes the vector input property and classifies it for each tree in the forest. Model trees are decision-generating frameworks that show the linear process for each leaf, specific to the local subdomain it represents. The RT uses a combination of random forests and model trees for division criteria [81]. This facilitates balanced trees with an environment of spherical ridges passing through every leaf, which facilitates optimization [82]. A process of trial and error was utilized to create an RT-based model for WTOP estimation.

2.4.4. Reduced Error Pruning Tree (REPT) Model

Using the logistic regression technique, the reduced error pruning tree (REPT) approach is a fast classification tree logic strategy that creates several trees via a series of calculations [83]. In order to strive for the shortest representation of the optimal precision tree logic, it considers backward over-fitting complexity and applies the post-pruning technique [84,85]. REPT’s main advantages are its ability to precisely minimize decision tree complexity and its ability to minimize variance-related error [86]. In the present analysis, a REPT-based approach was also implemented as a competitive decision tree method through a process of trial-and-error for the estimation of WTOP.

2.4.5. Artificial Neural Network (ANN) Model

As reviewed by Sheela and Deepa [87], several researchers have offered different deterministic and heuristic techniques to establish the ideal number of hidden neurons in multi-layer neural networks. In this study, the number of hidden neurons (n_h) required for the proposed three-layer ANN model was searched between the following lower and upper bounds [88]:

\frac{1.5 \, (2^{n_{i}})}{n_{i} + 1} \leq n_{h} \leq \frac{3 \, (2^{n_{i}})}{n_{i} + 2}

where n_i is the number of inputs (n_i ≥ 3). The suggested multi-layer perceptron model was implemented with a typical sigmoidal activation function, f(x) = (1 + e^{-x})^{-1}, to simulate inter-node interactions.
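Reading the inequality exactly as printed, the search range for n_h with the seven inputs of this study can be evaluated with a short sketch. The function names are hypothetical, and rounding the bounds inward to whole neurons is an assumption rather than something stated in the text.

```python
import math


def hidden_neuron_bounds(n_inputs):
    """Lower and upper bounds on n_h from the inequality as printed."""
    lower = 1.5 * 2 ** n_inputs / (n_inputs + 1)
    upper = 3 * 2 ** n_inputs / (n_inputs + 2)
    return lower, upper


def sigmoid(x):
    """Sigmoidal activation f(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


lower, upper = hidden_neuron_bounds(7)
# Whole-neuron candidates inside the bounds (inward rounding is an assumption).
candidates = range(math.ceil(lower), math.floor(upper) + 1)
```

For n_i = 7 the bounds evaluate to 24 and roughly 42.7, so the candidate hidden-layer sizes under this reading run from 24 to 42.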

2.5. Description of the Statistical Performance Indices

The current computational analysis includes a number of key statistics, such as slope of the best-fit line (herein b or s), intercept (a), determination coefficient (R²), adjusted coefficient of multiple determination (R²_adj), mean absolute error (MAE), mean bias error (MBE), mean absolute percentage error (MAPE), root mean squared error (RMSE), systematic and unsystematic RMSE (RMSE_S and RMSE_U, respectively), proportion of systematic error (PSE), standard error of the estimate (SEE), index of agreement (IA) (also known as Willmott’s Index (WI)), fractional variance (FV), the factor of two (FA2), coefficient of variation of RMSE (CV(RMSE), also known as scattering index (SI) or normalized root mean squared error (NRMSE)), Durbin–Watson statistic (DW), Nash–Sutcliffe efficiency (NSE), Legates and McCabe’s index (LMI), mean fractional bias (MFB), mean fractional error (MFE), Akaike information criterion (AIC), t-statistic, and overall accuracy score (ψ) (with varying weighting factors of 3, 1, 1, 1, and 1 for s, R², RMSE, MBE, and MAE, respectively), which were calculated to measure the degree of agreement and to make detailed comparisons between the applied soft-computing techniques for the training, testing, and overall datasets. Detailed descriptions and formulae of these measures (not presented here due to space limitations) can be found in the previous investigations [64,65,66,67,68,89,90,91,92,93,94,95].
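A few of the listed indices can be sketched with their commonly used definitions. The exact formulae adopted in the study are given in its cited references, so the expressions below, the sign convention for MBE (predicted minus observed), and the function name `performance_indices` should be read as assumptions.

```python
import math


def performance_indices(obs, pred):
    """Common textbook forms of MAE, MBE, RMSE, NSE, IA (WI), and FA2."""
    n = len(obs)
    o_bar = sum(obs) / n
    err = [p - o for o, p in zip(obs, pred)]  # predicted minus observed
    sse = sum(e * e for e in err)
    sst = sum((o - o_bar) ** 2 for o in obs)
    potential = sum((abs(p - o_bar) + abs(o - o_bar)) ** 2
                    for o, p in zip(obs, pred))
    return {
        "MAE": sum(abs(e) for e in err) / n,
        "MBE": sum(err) / n,
        "RMSE": math.sqrt(sse / n),
        "NSE": 1.0 - sse / sst,
        "IA": 1.0 - sse / potential,
        # Share of predictions within a factor of two of the observation.
        "FA2": sum(1 for o, p in zip(obs, pred)
                   if o != 0 and 0.5 <= p / o <= 2.0) / n,
    }


# Made-up illustrative values, not taken from the wind farm dataset.
scores = performance_indices([100.0, 200.0, 300.0, 400.0],
                             [110.0, 190.0, 310.0, 390.0])
```

In this toy case the errors alternate between +10 and −10, so MBE is 0 while MAE and RMSE are both 10, which shows why a bias measure alone cannot describe model accuracy.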

3. Results

3.1. Assessment of the Prediction Accuracy for the Nonlinear Regression-Based Model

In the present computational analysis, three multiple regression-based models (ERM, PRM-1, and PRM-2) were established using the training dataset (n = 25,759) within the computational framework of DataFit® software for the forecast of WTOP: (a) an exponential regression model (ERM) (SEE = 96.6287 kW, R² = 0.9783, NNI (number of nonlinear iterations) = 8), (b) a polynomial regression model with a constant term (PRM-1) (SEE = 113.0784 kW, R² = 0.9702, NNI = 5), and (c) a polynomial regression model without a constant term (PRM-2) (SEE = 118.4587 kW, R² = 0.9672, NNI = 11). The corresponding results for the best-fit multi-regression model (ERM) are summarized in Table 2. The best-fit regression-based model (ERM) described as a function of seven independent variables is expressed in Equation (1) (the units of the model variables are as given in Section 2.3).

\begin{array}{l} \mathrm{WTOP} = \exp [ (3.52 \times 10^{-2}) (\mathrm{WS}) - (2.21 \times 10^{-5}) (\mathrm{WD}) - (6.11 \times 10^{-3}) (\mathrm{AT}) \\ \quad - (1.17 \times 10^{-4}) (\mathrm{PA}) + (4.95 \times 10^{-3}) (\mathrm{GT}) + (2.52 \times 10^{-3}) (\mathrm{RSG}) \\ \quad + (3.97 \times 10^{-4}) (\mathrm{VN}) + 2.3115 ] \end{array}

(1)
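Equation (1) can be evaluated directly once the seven inputs are known. The sketch below copies the coefficients from Equation (1) and assumes that the first term refers to the wind speed (WS); the function name and the sample operating point are hypothetical and serve only as an illustration.

```python
import math

# Coefficients copied from Equation (1), keyed by input abbreviation.
# Assumption: the first term of the equation is the wind speed (WS).
ERM_COEFFICIENTS = {
    "WS": 3.52e-2,    # wind speed (m/s)
    "WD": -2.21e-5,   # wind direction (deg)
    "AT": -6.11e-3,   # air temperature (deg C)
    "PA": -1.17e-4,   # pitch angle (deg)
    "GT": 4.95e-3,    # generator temperature (deg C)
    "RSG": 2.52e-3,   # rotating speed of the generator (rpm)
    "VN": 3.97e-4,    # voltage of the network (V)
}
ERM_INTERCEPT = 2.3115


def erm_wtop(inputs):
    """Evaluate the exponential regression model for one record (kW)."""
    exponent = ERM_INTERCEPT + sum(
        coef * inputs[name] for name, coef in ERM_COEFFICIENTS.items()
    )
    return math.exp(exponent)


# Hypothetical operating point, not taken from the study's dataset.
sample = {"WS": 10.0, "WD": 180.0, "AT": 30.0, "PA": 0.0,
          "GT": 60.0, "RSG": 1500.0, "VN": 400.0}
print(erm_wtop(sample))
```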

The t-ratios and p-values for each parameter used in the multiple regression-based analysis of WTOP are presented in Table 2. In this regard, RSG, GT, and WS are more important than the other variables for the ERM in the prediction of WTOP [96]. The comparative statistical performance of the nonlinear regression-based methodology (NRM) and other soft-computing approaches is presented in Table 3. Figure 3 illustrates the linear correlation between the observed and predicted values of WTOP using the best-fit NRM for the training and testing phases.

The statistical results obtained for the ERM (Table 3) suggested that the performance of the nonlinear regression-based methodology was acceptable, with R² values of 0.9783 and 0.9789, MBE values of 3.6799 and 3.3816, PSE values of 0.0456 and 0.0566, IA (WI) values of 0.9944 and 0.9945, FA2 values of 0.9670 and 0.9652, and NSE values of 0.9782 and 0.9787 for the training and the testing stages, respectively. Figure 3 shows that the predicted values obtained from NRM are within the ±32% and ±30% error bands during the training and testing phases, respectively. Moreover, the DW statistics obtained for the ERM were close to 2.0 (1.9780 and 2.0106 for the training and the testing stages, respectively), suggesting that there is probably no autocorrelation among the regression model’s residual error terms [97]. Although the multiple regression-based model is better than the ANN model for some statistical parameters (e.g., R², R²_adj, MBE, RMSE_S, PSE, FA2, MFB, OAS), its performance was much lower than that of the models based on decision trees (RF-, RT-, and REPT-based models) in terms of all quantitative statistics (Table 3).
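The Durbin–Watson check mentioned above can be reproduced from a residual series. The sketch below uses the standard definition of the statistic (sum of squared successive differences divided by the sum of squared residuals); the function name and the toy residuals are illustrative only.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for an ordered sequence of residuals."""
    if len(residuals) < 2:
        raise ValueError("at least two residuals are required")
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e * e for e in residuals)
    if denominator == 0:
        raise ValueError("all residuals are zero")
    return numerator / denominator


# Strictly alternating residuals push the statistic above 2,
# while constant residuals push it to 0.
print(durbin_watson([1.0, -1.0, 1.0, -1.0]))
print(durbin_watson([2.0, 2.0, 2.0]))
```

Values near 2 indicate little first-order autocorrelation, values toward 0 indicate positive autocorrelation, and values toward 4 indicate negative autocorrelation.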

3.2. Assessment of the Prediction Accuracy for the Random Forest (RF) Model

In the present study, a number of trials were conducted using the RF-based model, and the values of user-defined parameters (not presented here due to limited space but available upon request) are in line with the values reported in previous decision tree-based modeling studies [98,99]. Using the current dataset, the RF-based model was built, trained, and tested for 25,759 instances in 5.64 s, 25,759 instances in 10.11 s, and 11,039 instances in 4.53 s, respectively. At the end of the analysis conducted in WEKA, RF-based predictions on the training set (n = 25,759) produced a relative absolute error (RAE) of 1.8789% and a root relative squared error (RRSE) of 2.3441%, whilst RAE and RRSE values for the testing set (n = 11,039) were computed as 2.9105% and 4.1946%, respectively. Figure 4 shows the linear correlation between the measured and predicted values of WTOP using the RF-based model for both training and testing stages. It illustrates that the estimated values generated by the RF-based technique fell within the error bands of ±10% and ±15% during the training and testing phases, respectively. As observed from the boldface statistics in Table 3, the RF-based model outperformed the other methods in 13 of the 22 indicators (complementary statistics of n, a, and DW are excluded) for all datasets (training, testing, and overall stages). In the case of WTOP estimation, for example, the results showed that the RF-based method outperformed the other methods based on decision trees (RT and REPT), according to its R² (0.9995 and 0.9982), MAE (10.7843 kW and 16.8908 kW), MAPE (7.0737% and 7.5597%), RMSE (15.3417 kW and 27.7217 kW), CV(RMSE) (SI) (0.0155 and 0.0277), and MFE (3.0783 and 3.6428) values for the training and testing phases, respectively. Moreover, using the current WT dataset, it was found that the NSE and LMI values derived from the RF-based technique were superior to those of other models (Table 3). In addition, the calculated OAS (ψ) values (6.6967 and 6.4797 for the training and testing sets, respectively) were closer to 7 [92] than those of the other models, suggesting that the RF-based model worked better than other soft-computing-based methods. Furthermore, the RF-based model’s AIC values were the lowest across all subgroups, demonstrating its superior predictive accuracy in comparison to alternative modeling strategies.
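The RAE and RRSE figures quoted in this section are relative measures that compare the model against a predictor that always returns the mean. The sketch below uses the usual textbook definitions and assumes that the reference mean is taken over the evaluated observations; WEKA's internal computation may differ in detail, and the function name is illustrative.

```python
import math


def relative_errors(observed, predicted):
    """Return (RAE, RRSE) in percent for paired series."""
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("series must be non-empty and of equal length")
    # Assumption: the naive reference predictor is the mean of `observed`.
    mean_obs = sum(observed) / n
    abs_model = sum(abs(p - o) for o, p in zip(observed, predicted))
    abs_naive = sum(abs(o - mean_obs) for o in observed)
    sq_model = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    sq_naive = sum((o - mean_obs) ** 2 for o in observed)
    rae = 100.0 * abs_model / abs_naive
    rrse = 100.0 * math.sqrt(sq_model / sq_naive)
    return rae, rrse


# Hypothetical power values in kW, for illustration only.
print(relative_errors([100, 200, 300, 400], [110, 190, 310, 390]))
```

A value of 100% means the model is no better than predicting the mean, so the single-digit percentages reported for the tree-based models indicate a large improvement over that baseline.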

3.3. Assessment of the Prediction Accuracy for the Random Tree (RT) Model

A number of trials were conducted using the RT-based model, and the user-defined parameters (not presented here due to space limitations but available upon request) are consistent with the RT-based hyper-parameters reported in previous data mining studies conducted for modeling of reference crop evapotranspiration [100] and global solar radiation [98]. Using the current dataset, the RT-based model was built, trained, and tested for 25,759 instances in 0.09 s, 25,759 instances in 8 s, and 11,039 instances in 3.49 s, respectively. The results indicated that the size of the tree (the total number of nodes) was 5991 after the model was built. The developed tree is not visualized here due to its large size. The results of the computational analysis revealed that RAE and RRSE values obtained for the RT-based model were computed as 2.1154% and 2.5492% for the training set (n = 25,759), and 4.3418% and 6.3258% for the testing set (n = 11,039). Figure 5 shows the linear relationships between the observed and estimated values of WTOP using the RT-based model for both the training and testing stages. As depicted in Figure 5, the estimations of the RT-based model range within the ±12% error line during the training stage and within the ±28% error line during the testing stage. The boldface statistics presented in Table 3 indicate that the RT-based model outperformed the other approaches in six of the 22 indicators for all datasets. For instance, the results showed that the RT-based method performed better than the RF and REPT approaches in the estimation of WTOP according to its RMSE_S (0.4254 kW and 1.2803 kW), PSE (0.0007 and 0.0009), FV (0.0003 and −0.0006), and MFB (0.4565 and 0.4650) values for the training and testing phases, respectively. In addition, the RT-based approach shows its superiority over other methods (NRM, REPT, and ANN) in estimating WTOP by providing satisfactory OAS (ψ) values (6.6678 and 6.2211 for the training and testing sets, respectively) compared to these approaches (Table 3).

3.4. Assessment of the Prediction Accuracy for the Reduced Error Pruning Tree (REPT) Model

In the computational analysis, a number of trials were conducted using the REPT-based model, and the values of user-defined parameters were found to be consistent with the values listed in other data-driven machine learning studies such as prediction of groundwater level [101] and modeling the thermal conductivity of concrete [102]. Using the current dataset, the REPT-based model was built, trained, and tested for 25,759 instances in 0.33 s, 25,759 instances in 8.01 s, and 11,039 instances in 3.44 s, respectively. The results indicated that the size of the tree (the total number of nodes) was 639 after the model was built. Due to its large size, the flow network diagram of the generated tree is not depicted here. The results of the computational analysis showed that RAE and RRSE values obtained for the REPT-based model were computed as 3.3419% and 4.5970% for the training set (n = 25,759), and 3.7333% and 5.3967% for the testing set (n = 11,039). Figure 6 illustrates the linear correlation between the observed and predicted values of WTOP using the REPT-based model for the training and testing stages. It shows that the predictions of the REPT-based model were within the ±25% and ±21% error bands during the training and testing phases, respectively. The boldface statistics presented in Table 3 reveal that the REPT-based model outperformed the other approaches in three of the 22 indicators for all datasets. For instance, the obtained results indicated that the REPT-based method performed better than other decision tree-based models (RF and RT) in the estimation of WTOP according to its MBE (2.37 × 10⁻⁶ kW and 0.2168 kW), FA2 (1.0000 and 1.0010), FV (0.0003 and −0.0006), and t-statistic (1.26 × 10⁻⁵ and 0.6387, both below t_critical ≈ 1.96 at the α level of 0.05 and (n − 1) degrees of freedom) values for the training and testing stages, respectively [103,104]. Nevertheless, the REPT-based strategy ranked third for the current dataset among the applied decision tree models based on all statistical indicators.
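The t-statistic used in this comparison can be derived from MBE and RMSE alone. The sketch below assumes the widely used form t = sqrt((n − 1)·MBE² / (RMSE² − MBE²)); the exact formula is given in the cited references rather than in the text, so this form and the function name are assumptions.

```python
import math


def t_statistic(mbe, rmse, n):
    """t-statistic from MBE and RMSE (assumed form, see lead-in)."""
    if n < 2:
        raise ValueError("n must be at least 2")
    if rmse ** 2 <= mbe ** 2:
        raise ValueError("RMSE must exceed |MBE|")
    return math.sqrt((n - 1) * mbe ** 2 / (rmse ** 2 - mbe ** 2))


# REPT testing-stage values quoted in this article:
# MBE = 0.2168 kW, RMSE = 35.6662 kW, n = 11,039.
t_rept = t_statistic(0.2168, 35.6662, 11039)
print(round(t_rept, 4), t_rept < 1.96)
```

With the REPT testing-stage inputs, this assumed form yields a value of roughly 0.64, in line with the reported 0.6387 and below the critical value of about 1.96.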

3.5. Assessment of the Prediction Accuracy for the Artificial Neural Network (ANN) Model

In the present ANN-based soft-computing approach (n_i = 7), the optimum n_h value was explored in the range of 24–43 using WEKA’s “multilayer perceptron” classifier, which learns a multi-layer perceptron by backpropagation. The values of user-defined parameters (not presented here due to limited space) are consistent with the numerical simulation conditions considered in previous MLP-based modeling studies [76,102,105,106].

The trial-and-error results (not shown here due to the lack of space but will be available upon request) showed that the number of neurons (n_h) in the hidden layer was optimized as 30 within the lower and upper limits searched for the three-layer ANN model. Although the R² values did not show a significant change for the n_h values between 25 and 40 (up to three decimal places) during the simulation process, a noticeable change was recorded for the other statistics (i.e., MAE, RMSE, RAE, RRSE) reported by WEKA. The results of the computational analysis indicated that RAE and RRSE values obtained for the optimal three-layer ANN structure (n_i:n_h:n_o = 7:30:1) were computed as 13.2547% and 12.5050% for the training set (n = 25,759), and 13.1857% and 12.4158% for the testing set (n = 11,039). On the other hand, MAE, RMSE, RAE and RRSE values were found to be higher for other neural network topologies (e.g., MAE = 83.1004 kW and 81.7560 kW, RMSE = 89.0069 kW and 87.3479 kW, RAE = 14.3191% and 14.0874%, RRSE = 13.4678% and 13.2168% for the testing stages of the ANN models in 7:25:1 and 7:40:1 structures, respectively).

Using the current dataset, the three-layer (7:30:1) ANN-based model was built for 25,759 instances in 55.57 s, while the GUI window was active during the simulation. It was trained and tested for 25,759 instances in 8.18 s and 11,039 instances in 3.56 s, respectively. Figure 7 illustrates the linear correlation between the observed and predicted values of WTOP using the three-layer (7:30:1) ANN-based model for both the training and testing stages. As illustrated in Figure 7, the estimations of the ANN-based model range within the ±24% and ±22% error bands during the training and testing phases, respectively. The statistical results summarized in Table 3 show that the ANN-based model worked better than the nonlinear regression-based model (NRM) in terms of some performance indicators, such as R²_adj, MAE, MAPE, RMSE, RMSE_U, SEE, IA, FV, CV(RMSE) (SI), NSE, LMI, and MFE. Although these statistics reflected the superiority of the ANN-based model over the NRM, the estimation performance of the multilayer perceptron-based approach on WTOP was far behind that of the decision tree-based models (RF, RT, and REPT) in terms of all statistical indicators examined (Table 3).

3.6. Inter-Comparison of the Implemented Soft-Computing Models

In this section, the inconsistency of WTOP estimation and the comparative descriptive statistics of absolute residual errors (ARE) between the measured and forecasted values of the soft-computing models for the testing phase (n = 11,039) are assessed in Table 4. The box-and-whisker plot, violin plot, and Taylor diagram are three helpful graphical tools that were used to benchmark the prediction accuracy of all utilized soft-computing models by visual comparison. Figure 8a,b (violin plots for the training and testing phases) and Figure 8c,d (box-and-whisker plots for the training and testing phases) reveal the structure of the actual data against the implemented models for the prediction of WTOP. The box-and-whisker plots summarize each variable by the following components: (1) the median value (Q₂: median or second quartile) in each box is shown as a solid center line (red line for the actual dataset and blue lines for the applied models); (2) a box represents the range of variation around this central tendency (the edges of the box are the 25th (Q₁: lower quartile or first quartile) and 75th (Q₃: upper quartile or third quartile) percentiles); and (3) a black diamond (♦) inside each boxplot indicates the mean value. Both the violin plots and the box-and-whisker plots of the decision tree models show shapes that are almost identical to those of the observed values based on the whole distribution of the WT dataset. However, to visually examine the prediction performance of the applied models in more detail, the box-and-whisker plots and spread plots of residual errors between the measured and forecasted values are also depicted in Figure 9.

Upon examining Figure 9, it can be seen that the RF-based model outperformed the NRM-, RT-, REPT-, and ANN-based models in predicting WTOP with the least amount of variance. These conclusions are supported by descriptive statistics of ARE values. In addition, for the testing stage, the interquartile range (IQR) of the ARE values of the RF-based model (2.8090 kW) was found to be lower than those of the other applied models, indicating its superior performance (Table 4). Moreover, the second quartile (median) for the RF-based model (Q₂ = 866.5470 kW) was more closely aligned with the observed data (Q₂ = 866.7900 kW) during the testing phase.
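The descriptive statistics behind such box-and-whisker comparisons can be obtained from the absolute residual errors. The sketch below is a minimal illustration, assuming that ARE is the absolute difference between measured and forecasted values and that quartiles are computed with the inclusive method of Python's statistics module; the quartile convention used for Table 4 is not stated, and the function name is hypothetical.

```python
import statistics


def are_summary(observed, predicted):
    """Quartiles and IQR of absolute residual errors (ARE)."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length series with >= 2 points")
    are = [abs(o - p) for o, p in zip(observed, predicted)]
    # Assumption: 'inclusive' quartiles; other conventions give
    # slightly different values on small samples.
    q1, q2, q3 = statistics.quantiles(are, n=4, method="inclusive")
    return {"Q1": q1, "Q2": q2, "Q3": q3, "IQR": q3 - q1,
            "mean": statistics.fmean(are)}


# Hypothetical power values in kW, for illustration only.
print(are_summary([100, 200, 300, 400, 500], [101, 198, 303, 396, 505]))
```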

Lastly, one of the most well-known graphical representations used for comparing soft computing-based techniques, the Taylor diagram, was also used to evaluate and validate the prediction performances of the constructed models [66,67]. Figure 10 illustrates that the RF-based approach is the best-performing model since it is closest to the observed position (solid blue circle on the x-axis), as can be seen from the zoomed-in sections for both the training and testing phases. On the other hand, Figure 10 indicates that the NRM is the worst among all models in the estimation of WTOP due to its greatest distance from the actual data point.
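A Taylor diagram places each model using three quantities: the standard deviation of its predictions, its correlation with the observations, and the centered root-mean-square difference (CRMSD). These are linked by CRMSD² = σ_o² + σ_p² − 2σ_oσ_pR. The sketch below computes them with population statistics; the function name and sample values are illustrative only.

```python
import math


def taylor_statistics(observed, predicted):
    """Standard deviations, correlation and centered RMSD."""
    n = len(observed)
    if n < 2 or n != len(predicted):
        raise ValueError("need two equal-length series with >= 2 points")
    mo = sum(observed) / n
    mp = sum(predicted) / n
    dev_o = [o - mo for o in observed]
    dev_p = [p - mp for p in predicted]
    # Population (divide-by-n) statistics keep the identity exact.
    std_o = math.sqrt(sum(d * d for d in dev_o) / n)
    std_p = math.sqrt(sum(d * d for d in dev_p) / n)
    if std_o == 0 or std_p == 0:
        raise ValueError("both series must have non-zero variance")
    cov = sum(a * b for a, b in zip(dev_o, dev_p)) / n
    crmsd = math.sqrt(sum((b - a) ** 2 for a, b in zip(dev_o, dev_p)) / n)
    return {"std_obs": std_o, "std_pred": std_p,
            "correlation": cov / (std_o * std_p), "crmsd": crmsd}


# Hypothetical power values in kW, for illustration only.
print(taylor_statistics([100, 200, 300, 400], [110, 190, 310, 390]))
```

A model plotted close to the observed point therefore has a similar standard deviation, a correlation near 1, and a small CRMSD at the same time.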

3.7. Uncertainty Analysis for the Applied Prediction Models

Uncertainty analysis was employed in the present investigation to more realistically examine the applicability and accuracy of the soft-computing methods that were utilized to estimate WTOP. The expanded uncertainty with 95% confidence level (U₉₅) was utilized to evaluate the prediction accuracy of the developed models for each subset in order to further compare the model performances. The model exhibiting a smaller value of U₉₅ was deemed to be the more precise approach [68,89,97]. Statistical details regarding the uncertainty analysis can be found in the literature [107,108,109]. The results of the uncertainty are presented in Table 5 for all subsets of the implemented approaches.

Decision tree-based models exhibited almost comparable behavior during the testing stage, according to the results of the uncertainty analysis (0 < e_m < 1), whereas NRM and ANN models showed the opposite behavior and overestimated (e_m >> 0) WTOP. Overall, the data confirmed that the RF-based model had the narrowest uncertainty bands when compared to other soft-computing methods, despite the fact that the subgroups of the REPT-based model showed the lowest mean prediction errors (e_m). In addition, the narrowest prediction error intervals were observed for the RF-based model. Furthermore, the RF-based approach fared better than the other models with the lowest U₉₅ values. As a result, the benchmarking findings indicated that a decision tree modeling methodology based on RF would be helpful in accurately estimating WTOP.
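The mean prediction error, the uncertainty band, and the prediction error interval discussed here can be sketched as follows. The example assumes that errors are defined as predicted minus observed (so that e_m > 0 means overestimation), that S_e is the sample standard deviation of the errors, and that the interval is e_m ± 1.96·S_e. The expanded uncertainty U₉₅ is not reproduced because its formula is given only in the cited references.

```python
import statistics


def uncertainty_summary(observed, predicted):
    """Mean error, +/-1.96*S_e band and prediction error interval."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length series with >= 2 points")
    # Assumed sign convention: error = predicted - observed.
    errors = [p - o for o, p in zip(observed, predicted)]
    e_m = statistics.fmean(errors)
    # Assumption: S_e is the sample (n - 1) standard deviation.
    s_e = statistics.stdev(errors)
    band = 1.96 * s_e
    return {"e_m": e_m, "S_e": s_e, "band": band,
            "interval": (e_m - band, e_m + band)}


# Hypothetical power values in kW, for illustration only.
print(uncertainty_summary([100, 200, 300, 400], [110, 190, 310, 390]))
```

Under these assumptions, the width of the interval depends only on S_e while its center is shifted by e_m, which is why a model can have the smallest mean error without having the narrowest band.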

3.8. Sensitivity Analysis for the Best-Fit Soft-Computing Model

Ultimately, the best-performing method (RF-based model) was used to estimate WTOP, and a sensitivity test was run to determine which predictor variable was the most significant. As shown in Table 6, several testing datasets were constructed through the gradual removal of each input component. The impact of each WT-related input on the output (WTOP) was evaluated in terms of R², MAE, and RMSE. The results from Table 6 suggest that the rotating speed of the generator (RSG) has the most significant role in predicting WTOP. The sensitivity test was also corroborated by the regression variable results of the best-fit model (ERM) for the RSG with the largest absolute t-ratio (370.7333) (Table 2).

4. Discussion

The purpose of this study is to forecast WTOP. Finding an appropriate soft-computing model structure for WTOP prediction is the primary contribution of the current computational research to the relevant topic. Therefore, the emphasis was on enhancing the performance of the WTs based on experimental parameters and WT operational variables collected over one year from a 30-MW wind farm installed under Sahelian conditions in Mauritania. In some cases, visual observation of the wind farm components and manual collection of the faults detected on some components of the WTs are used to analyze the wind system’s performance. This way of managing the critical state of the wind system under stochastic parameters, such as wind speed and direction and other operational factors (e.g., pitch angle, generator temperature, and network voltage), is not sufficient to make a system of this complexity trustworthy. In addition, it is challenging to improve the operational risk assessment and shutdown plan of a wind farm due to the lack of real meteorological and operating data and accurate forecasting techniques. Therefore, the purpose of SCADA systems is to collect the operational and climatic data necessary to understand the operation of a wind farm through in-depth analysis and to suggest an approach for fault detection in the wind system components.

The approach developed in this investigation used data from a 30-MW wind farm in Mauritania to estimate WTOP by considering the meteorological parameters of the region and the operating variables of the wind farm. Such predictive analysis can support better management of the system's operation, reducing the gap between supply and demand by considering interoperability among components and optimizing the transmission of wind farm-generated energy to the power grid. In addition, this study could serve as a useful tool for reducing financial risk through adapted maintenance planning and improved wind system management.

The results in Table 3 show improved performance compared to previous studies in the literature dedicated to wind power forecasting [40]. In particular, the random forest (RF) model performed best for the majority of performance indicators. When the R² of this model was compared with the literature, the results (between 0.9985 and 0.9995) were found to be better than those of a previous study [26]. The MAE (which ranged between 12.6161 and 16.8908 kW) demonstrated that the proposed model outperformed the models in previous works [22,26,40,45,49]. However, the obtained MAPE (between 7.0737% and 7.5597%) was higher than those reported in previous investigations [25,38], and the RMSE (between 15.3417 and 27.7217 kW) was higher than in the literature [22,25,28,40,42,45,46,49]. Additionally, the developed approach was evaluated with other performance indicators to refine the selection of the most appropriate model for predicting WTOP, and most of these indicators favored the RF model. Moreover, when compared to existing adaptive neuro-fuzzy inference system-based models and other methods in the literature, the suggested soft-computing strategy demonstrated improved forecasting ability and hence greater accuracy in estimating wind power.

The study’s superiority can be attributed to the use of appropriate meteorological and operating parameters, which have the greatest influence on the operation and management of the wind farm. Wind power prediction is highly sensitive to the input variables, so forecasting requires the use of appropriate parameters. The sensitivity analysis results indicated that the parameters selected for forecasting have a considerable impact on wind turbine prediction. The wind speed and the generator’s rotation speed are naturally highly correlated, as are the rotation speed of the generator and its temperature: the greater the wind speed, the faster the generator rotates, and the higher the rotation speed, the more the generator temperature rises due to the machine’s high current output. This heating is coupled with that caused by the ambient temperature, which is significant. However, ignoring one of the parameters because of its correlation with another will decrease the models’ performance. This study found (Table 6) that omitting the rotating speed of the generator (RSG) alone reduces the model’s performance. Indeed, Table 6 demonstrates that the synergistic effect of all the specified parameters helps to improve the model’s accuracy. Furthermore, the use of six input parameters is one of the reasons why the current study has a lower RMSE than our previous work with ANFIS. According to the preceding analysis, the proposed approach for predicting wind power, incorporating meteorological and operational parameters, outperforms several models.

This research also has a limitation. The present study utilized only some meteorological and operational variables (e.g., wind speed and direction, rotational speed of the generator, pitch angle, temperature of the generator, and grid voltage) as inputs to the model and ignored other environmental aspects (e.g., air density, pressure, humidity, and solar radiation).

5. Conclusions

This study benchmarked different flexible soft-computing models (NRM, RF, RT, REPT, and ANN) for the prediction of WTOP. It made use of meteorological and operational parameter data that were gathered over the course of a year at the 30-MW wind farm in Nouakchott, Mauritania. The simultaneous adoption of these data-driven methodologies in the modeling of WTOP for the first time was the most important contribution of the current computational investigation. A variety of visual representations and over 30 distinct statistical performance evaluators were used for the first time in the framework of the present subject to measure the effectiveness of the established soft-computing models. Another noteworthy finding of this study was that the RF model outperformed the RT-, REPT-, nonlinear regression-, and ANN-based models, as demonstrated by comparative statistics of the testing datasets of the implemented soft-computing methods. On the other hand, NRM performed the worst among all models used.

The performance assessment indices corroborated the superiority of the RF-based model (R² = 0.9982, MAE = 16.8908 kW, RMSE = 27.7217 kW, SEE = 27.6704 kW, IA or WI = 0.9996, CV(RMSE) or SI = 0.0277, NSE = 0.9982, LMI = 0.9709 for the testing dataset) over other data-driven approaches in estimation of WTOP. On the other hand, the RT (R² = 0.9960, MAE = 25.1978 kW, RMSE = 41.8067 kW for the testing dataset) and REPT (R² = 0.9971, MAE = 21.6661 kW, RMSE = 35.6662 kW for the testing dataset) models also showed a competitive prediction potential over the NRM (R² = 0.9789, MAE = 77.3617 kW, RMSE = 96.4472 kW for the testing dataset) and the ANN (R² = 0.9974, MAE = 76.5227 kW, RMSE = 82.0540 kW for the testing dataset) models. While all competitive decision tree-based models have respectable R² values (>0.995 for the testing dataset), the RF-based model had greater performance and higher accuracy compared to other competitive techniques (MAPE = 8.2325% and 8.9620% and ψ = 6.2211 and 6.3362 for the RT and REPT, respectively), as seen by its smaller percentile deviations (MAPE = 7.5597% < 10%) and higher overall accuracy score (ψ = 6.4797). Although the lowest mean prediction errors (e_m) were observed for the subsets of the REPT-based model, overall statistics corroborated that the narrowest uncertainty bands were generated for all sets using the proposed RF-based model (±1.96S_e = ±30.0702 kW, ±54.3319 kW, and ±38.9685 kW for the training, testing, and overall datasets, respectively) in contrast to alternative soft-computing techniques. In addition, the narrowest prediction error intervals were observed for the RF-based model (−30.0302 kW to 30.1101 kW, −53.9517 kW to 54.7121 kW, and −38.8265 kW to 39.1105 kW for the training, testing, and overall datasets, respectively). 
Moreover, the RF-based strategy surpassed the remaining models by exhibiting the lowest levels of expanded uncertainty values (U₉₅ = 7.9948 kW, 12.3385 kW, and 6.7099 kW for the training, testing, and overall datasets, respectively). Furthermore, sensitivity analysis revealed that the generator’s rotational speed was the key factor in the RF-based model’s ability to accurately estimate WTOP. The results were also supported by the best-fitting exponential regression model’s regression variable results (SEE = 96.6287 kW, R² = 0.9783, NNI = 8) for the RSG with a relatively small error value (6.80 × 10⁻⁶) and the largest absolute t-ratio (370.7333). Therefore, the computational findings indicated that the precise calculation of WTOP could be achieved by the application of an RF-based decision tree modeling approach.

This work highlights the importance of the soft-computing technique used to estimate WTOP for the improved management and steady operation of wind farms in wind energy forecasting. The proposed approach improves the accuracy of wind energy forecasts and provides strong technical support that reduces the downtime and financial risk associated with wind farm operation. It also offers a fairly adaptable method for calculating WTOP. Hence, it would be intriguing to build on the existing study to integrate some further sophisticated and hybrid algorithms. Moreover, to strengthen the stability of wind power system deployment and management, more meteorological characteristics will be incorporated into the models in future studies. Ultimately, further systematic data with new process-related input factors must be gathered for more accurate findings.

As a result, future research directions will be as follows: (a) including additional climatic parameters in the model prediction scheme, (b) building efficient optimization algorithms, and (c) incorporating certain deep learning algorithms into future study efforts for better prediction. Moreover, in future studies, the proposed approach may be extended to more complicated and challenging time series forecasting problems.

Author Contributions

Supervision, B.B., K.Y. and M.O.; project administration, B.B. and K.Y.; conceptualization, B.B., K.Y. and M.O.; methodology, B.B. and K.Y.; software, B.B. and K.Y.; formal analysis, B.B., K.Y. and M.O.; resources, B.B. and K.Y.; data curation, B.B. and K.Y.; investigation, B.B., K.Y. and M.O.; writing—original draft preparation, B.B., K.Y. and M.O.; writing—review and editing, B.B., K.Y. and M.O.; visualization, B.B., K.Y. and M.O. All authors have read and agreed to the published version of the manuscript.

Funding

This research is funded by the program (initiative “2023, Année de Production Scientifique”) of the ANRSI: Agence Nationale de la Recherche Scientifique et de l’Innovation.

Data Availability Statement

Data are contained within the article.

Acknowledgments

Authors would like to acknowledge the Ministry of Petroleum, Energy, and Mines, National Industrial and Mining Company of Mauritania and Mauritanian Electricity Company for providing the data used in this study. Also, authors would like to acknowledge the ANRSI for the funding.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Data-Intelligent Approaches Used in Wind Speed and WTOP Estimation

Table A1. Classification of various data-driven model categories related to wind speed prediction.

Model Category	Wind Speed Prediction	Study Location	Approach and Methods	Used Datasets	Obtained Performance Metrics	Advantages of Study	Disadvantages of Study
Statistical regression method	MSFAE [50]	Xinjiang, China	A novel multi-scale feature adaptive extraction (MSFAE) ensemble model for wind speed forecasting	Three different wind speed time series collected from anemometers are selected to prove the superiority of the model.	Datasite#1 MAPE (%): 3.426 MAE (m/s): 0.146 RMSE (m/s): 0.182 Datasite#2 MAPE (%): 2.312 MAE (m/s): 0.128 RMSE (m/s): 0.166 Datasite#1 MAPE (%): 2.326 MAE (m/s): 0.142 RMSE (m/s):0.186	The proposed algorithm has the advantages that it provided better global search accuracy and convergence speed than the traditional algorithms	Only wind speeds are considered as input to the model. The training phase is time-consuming. Model is applied only for wind speed prediction.
Statistical regression method	MKSVRE-WOA [51]	Shandong Province, China	Multi-kernel SVR ensemble (MKSVRE) model based on unified optimization and whale optimization algorithm (WOA)	Wind speed datasets (from 00:00 on 1 September 2011 to 23:50 on 20 September 2011) for two sites (A and B).	Site A MAE (m/s): 0.3698 RMSE (m/s): 0.4786 MAPE (%): 5.21 SAE (m/s): 53.2519 STD (m/s): 0.4796 Site B MAE (m/s): 0.5288 RMSE (m/s): 0.6751 MAPE (%): 8.58 SAE (m/s): 76.1455 STD (m/s): 0.6773	The model provides results without the need to select a specific kernel function and achieves a global parameter selection.	Only wind speeds are considered as input to the model. The training time of the SVR model is long. Model is applied only for wind speed prediction.
Machine learning	EISM, RTRD Bi-LSTM [14]	Yunnan, China	GWO-CNN-BiLSTM (GCNBiL) networks model with different lengths of convolution operators	Wind speeds collected for 91 days, from 4 January 2010 to 30 June 2010 and included 13,104 sets .	For six-step prediction RMSE (m/s): 0.816 MAPE (%): 13.295 MAE (m/s): 0.635	The proposed model has greater accuracy than traditional neural network models	Only wind speeds are considered as input to the model. Model is applied only for wind speed prediction.
	MST-GNN [15]	Denmark, Netherlands	Multidimensional spatial-temporal graph neural networks (MST-GNN) model for wind speed prediction based on multidimensional data	Open-source datasets for wind speed from Denmark and Netherlands	Denmark dataset MAE(m/s): 1.244 MSE (m/s): 2.616 Netherlands Dataset MAE (m/s): 7.849 MSE (m/s): 11.851	The model performs the best, especially in long-term prediction tasks considering multidimensional data	Model is applied only for wind speed prediction.
	MFMS [16]	Zhangjiakou, North China	Method based on multi-feature and multi-scale integrated learning (MFMS) for wind speed prediction	Wind speed data from 16 wind turbines in a wind farm	For 4-h ultra-short-term wind speed prediction MAPE (%): 6.164 RMSE (m/s): 0.275 R²: 0.966	This method provides a reference for the ultra-short-term wind speed prediction of wind farms.	Model is applied only for wind speed prediction.
	CNN-LSM-NDL [17]	Jiangsu Province, China	Hybrid wind speed prediction model based on convolutional neural network and long short-term memory network deep learning model	Historical wind speed dataset collected at two sites from “22 July to 12 August 2017” and from “22 August to 11 September 2017” are used for this study.	Dataset #1 MAE (m/s): 0.1477 RMSE (m/s): 0.1964 MAPE (%): 3.7803 R²: 0.9702 Dataset #2 MAE (m/s): 0.1675 RMSE (m/s): 0.2461 MAPE (%): 2.9065 R²: 0.9726	Model allows denoising operation in the data preprocessing process, that can provide a high-quality input data, which help to find high prediction performance	Only wind speeds are considered as input to the model. Model is applied only for wind speed prediction.
	VMD-TCN-STL [18]	Xinjiang, China	Novel wind-speed prediction model based on variational mode decomposition, temporal convolutional network, and sequential triplet loss	Three sets of wind speed series from the SCADA system of the Xinjiang wind farm are used.	MAPE (%): 4.77 MAE (m/s): 0.11 RMSE (m/s): 0.15	Prediction accuracy is effectively improved by introducing modal decomposition. VMD exhibits advantages in the same type of method	Only original wind speeds are considered as input to the model. Although the method can greatly improve the efficiency of the wind energy system, the problem has not been fundamentally solved in the process of network training through this study.
	RNN-CNN-LSTM [19]	New Zealand	A novel hybrid neural network scheme based on convolutional neural network (CNN) and long short-term memory (LSTM)	Three datasets given as Data1, Data2, and Data3: Data1 has 39,575 sampling records. Data2 has 26,135 sampling records. Data3 has 39,916 samples records.	Data 1 MAE (m/s): 0.4783 RMSE (m/s): 0.6480 R²: 0.9070 Data 2 MAE (m/s): 0.3193 RMSE (m/s): 0.4477 R²: 0.9414 Data 3 MAE (m/s): 0.6281 RMSE (m/s): 0.8724 R²: 0.9775	RNN-CNN-LSTM can learn the spatial and temporal information of the raw data. It improved the accuracy of the wind speed prediction compared with the traditional single neural network model	The models allow only the wind speed prediction; Only the wind speeds were considered as input to the model.
	DRIPS-PDI [20]	Nolan and Kern, US	A novel decomposition-recognition-integration-prediction system (DRIPS) based on a newly developed predictive difficulty index	Wind dataset collected for every 10 min for two American sites (Nolan and Kern).	Nolan Site RMSE (m/s): 0.0655 MAPE (%): 0.3743 R²: 0.9997 Kern Site RMSE (m/s): 0.0347 MAPE (%): 2.4855 R²: 0.9998	DRIPS combined with the PDI can provide excellent performance in the accuracy of wind speed prediction, and the complexity of the proposed prediction system is acceptable to the industry with the increase in computing power of modern hardware devices	The models allow only wind speed prediction. Only the wind speeds were considered as input to the model. The model prediction accuracy is a difficult task in scientific research.
	CNN-BILSTM-MOHHO [21]	Hebei, China	Variable short wind speed prediction model of Capsule Neural Network (Capsnet) and bidirectional Long-and Short-Term Memory Network (BILSTM) combined with Multi-Object Harris Hawk optimization (MOHHO)	Historical wind speed information from wind farm and multidimensional meteorological variables	Combined model MAE (m/s): 0.1646 MAPE (%):2.43 RMSE (m/s): 0.1992	The proposed model combines historical data of multiple meteorological data, so the model performs better than other univariate machine learning models.	The study analyzed only the effects of two typical climates on wind speed prediction. The models allow only a study of wind speed prediction.
	WT-CNN-tSVR [33]	Sotavento, Spain VejaMate, Germany Madryn, Argentina	Hybrid techniques employing wavelet decomposition transform in tandem with convolutional neural network and twin support vector machine	Wind speed datasets collected in three different periods (three months, 12 months, and 36 months) at the height of 10 m over 10 min	Sotavento (36 months) RMSE (%): 0.275 MSE (m/s): 0.0756 VejaMate (36 months) RMSE (%): 0.1375 MSE (m/s): 0.01890 Madryn (36 months) RMSE (%): 0.085 MSE (m/s): 0.0072	The model outperforms the classical and simple machine learning for wind speed prediction.	Only the wind speeds are considered as input variables to the model. The models allow only a study of wind speed prediction.
Artificial intelligence	EPT-CEEMDAN-TCN [52]	Gansu, Liaoning, Jiangsu, China	A hybrid decomposition method coupling the ensemble patch transform (EPT) and the complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN)	Historical wind speed data from three wind farms located at Gansu, Liaoning, and Jiangsu in China	Gansu site MAE (m/s): 0.28890 RMSE (m/s): 0.40157 MAPE (%):0.07595 Liaoning site MAE (m/s): 0.15659 RMSE (m/s): 0.19586 MAPE (%):0.08896 Jiangsu site MAE (m/s): 0.17790 RMSE (m/s): 0.22361 MAPE (%): 0.09606	The proposed model has the capability of decomposing the nonlinear volatility completely and allows higher computational efficiency.	Only the wind speeds are considered as input variables to the model.
	ED-Wavenet-TF [53]	Minnesota, USA	A novel forecasting model called EDWavenet-TF	Two WS datasets collected from wind farms in Nebraska and Minnesota, USA (in 2012 and 2011, respectively)	MAE (m/s): 0.8018 RMSE (m/s): 1.1052 R²: 0.9135 SMAPE (%): 13.9128	ED-Wavenet-TF outperforms the comparable models in most cases at the 1% significance level and could be used for the wind speed and wind power forecasting.	Only the wind speeds and wind power were considered as input variables to the model.
	VMD-CA-LSTM-EL-EC [54]	Hebei, China	This study proposed a hybrid model based on the variational mode decomposition (VMD), clustering analysis, LSTM network, stacking ensemble learning and error complementation for wind speed forecasting	Four original wind speed datasets monitored from four wind farms in Hebei Province in China	Site#1 MRE: 0.025 RMSE (m/s): 0.65 SSE (m/s): 754.774	The approach has provided an improvement in terms of the predicted accuracy.	The number of clusters is determined by the experience, which needs to be optimized by artificial intelligence algorithms to find out the information hidden in the decomposed subseries. Only the wind speeds are considered as input variables to the model.

Descriptions of the abbreviations are provided below, in the footer of Table A2.

Table A2. Classification of various data-driven model categories related to WTOP prediction.

Model Category	Wind Power Prediction Model	Study Location	Approach and Methods	Used Datasets	Obtained Performance Metrics	Advantages of Study	Disadvantages of Study
Statistical regression method	BMA-EL [25]	Inner Mongolia Autonomous region, China	Hybrid wind power forecasting approach based on Bayesian model averaging and Ensemble learning (BMA-EL)	SCADA system of a wind farm, sampled at 15-min intervals (from August to October 2014)	RMSE (kW): 27.8960 MAPE (%): 10.0848	The model reduces the uncertainty of the forecasting results of a single model by increasing the diversity of sub-training sets. Three meteorological input variables were considered: wind speed, wind direction and ambient temperature.	Other operating parameters should be considered (pitch angle, temperature of generator, rotating speed, etc.).
Statistical regression method	TVFEMD-AE-YJQR-GAQ [45]	Germany	A hybrid probability model for multi-step offshore wind power prediction, including time varying filter based empirical mode decomposition (TVFEMD), approximate entropy (AE), Yeo–Johnson Transforms Quantile regression (YJQR), and Gaussian Approximation of Quantile (GAQ)	Two datasets recorded at 15-min intervals (from 1 July 2020 to 31 July 2020 and 1 December 2020 to 31 December 2020) from offshore wind power	Datasets #1 MAPE (%): 3.9681 RMSE (kW): 58.9924 MAE (kW): 40.8323 Datasets #2 MAPE (%): 3.3487 RMSE (kW): 46.3364 MAE (kW): 34.7261	The developed method can be used for further model prediction. Also, the use of the improved GAQ helps to effectively improve the reliability and the accuracy of multi-step interval prediction	The wind speed was the only parameter used as an input variable to the model. The grid search method used in this study to determine the optimal model parameters increases the running time, and improving the running efficiency remains a real challenge.
Machine learning	SRNN-PSAF [26]	China	A method based on stacked recurrent neural network (SRNN) with parametric sine activation function (PSAF) algorithm for wind power forecasting	Data (wind power and meteorological data) collected from the continental United States (from 2007 to 2012) and from the National Renewable Energy Laboratory (NREL)	MAE (MW): 0.0602 MAPE (%): 0.9360 MSE (MW): 0.0143 RMSE (MW): 0.1195 R²: 0.7847	The SRNNPSAF neural network approach can combine the advantages of RNN, deep learning framework and merits of PSAF for more accuracy prediction.	The study did not consider other operating parameters (pitch angle, temperature of generator, rotating speed, etc.).
	MC-hNN [28]	United States	A regional method using a spatio-temporal, multiple clustering algorithm and hybrid neural network for wind power prediction	Actual measured power and meteorological data from the wind integration national dataset (WIND)	MAPE (%): 4.86–5.58 MAE: 18.64–22.44 RMSE: 28.45–33.26	This study allows for enhancing the recognition ability and helps with wind power prediction.	This study focuses on the deterministic prediction of wind farm power in relatively stable weather. So, the processing capacity of complex power fluctuations in extreme weather such as typhoons is insufficient.
	BBLP-MSR [46]	Mainland China	Novel bilateral branch learning based wind power prediction (WPP) modeling framework, which includes two data feature engineering branches and one prediction module	A SCADA dataset of a commercial wind farm, which contains 33 wind turbines with rated power of 2 MW in Mainland China	RMSE: 130.95–255.04	The proposed model for the WPP modeling framework consisting of a high sampling resolution data feature engineering branch which allowed improved the WPP accuracy.	The study investigated only the usage of data of multiple sampling resolutions in the short-term WPP task. The study did not consider other operating parameters (e.g., pitch angle, temperature of generator, rotating speed, etc.).
	SVR [47]	Taiwan	A hybrid intelligent method for short-term wind power forecasting and uncertainty analysis	The actual wind power generation, wind speed and wind direction data collected for every 15-min over one year	RMSE (W): 67.2543 MRE (%): 2.8845	The proposed method provides more accurate forecasts than other existing methods	The proposed approach produced different confidence levels for each forecasting period. So, to allow more accurate forecasting, more models could be considered.
	GA-BP-ANN [48]	Beijing, China	A GA-BP hybrid algorithm-based ANN model for wind power prediction	Actual datasets correspond to records of 10-min average wind speed and wind turbine output power for the period of one year (from 26 March 2014 to 25 March 2015)	MAE (kW): 45.68 MAPE (%): 7.48	The proposed approach demonstrated superior performance and substantial improvement over persistence and feed forward BP NN based forecast models; It could be an important tool for 1-day-ahead hourly wind power prediction.	The study was carried out for 1-day-ahead wind power prediction considering only the wind speed as input data.
Artificial intelligence	LSTM-IVMD-SE [22]	Dingbian and Gansu, in China	A robust short-term wind power forecasting model based on Long Short-term Memory (LSTM) with correntropy including improved variational mode decomposition (IVMD) and sample entropy (SE)	Two sets of data with different sampling intervals and different scales were used for this work.	RMSE (kW): 58.77 MAE (kW): 41.10 TIC: 0.0047	Since the hybrid model is insensitive to outliers and noise, it can significantly improve prediction accuracy.	Several interesting studies should be conducted, such as the non-linear weighted combination of components forecasting results, etc. Input data should be improved with other wind turbine operating parameters.
	FCM-Clustering algorithm [23]	Northeastern China	An improved Fuzzy C-means (FCM) Clustering Algorithm for day-ahead wind power prediction.	Historical data collected from two different wind farms of 52.5 MW located in northeastern China were used.	RMSE (%): 4.12–21.18 MAE (%): 5.49–23.96	The proposed approach can be used to establish the relationship between wind speed and wind power.	Only the wind power is considered as an input variable to the model.
	DD-PPDL [27]	Levenmouth, Fife, Scotland and United Kingdom	A novel data-driven approach by integrating data pre-processing & re-sampling, anomalies detection and treatment, feature engineering, and hyperparameter tuning based on gated recurrent deep learning models is proposed for wind power forecasting.	Datasets recorded from SCADA over a nine-month period from 1 July 2018 to 31 March 2019 were used in this study.	MSE: 0.003532 Accuracy (%): 94.06	The developed approach in this study has the advantage of a high degree of accuracy while retaining low computational costs.	The study did not consider other wind turbine operating parameters (e.g., wind direction, pitch angle, temperature of generator, rotating speed, etc.).
	ANFIS-WT-PSO-MI [37]	Portugal	New hybrid evolutionary-adaptive methodology for wind power forecasting in the short-term, successfully combining mutual information, wavelet transform, evolutionary particle swarm optimization, and the adaptive neuro-fuzzy inference system	Datasets collected in Portugal were used for this study.	MAPE (%): 3.75 NMAE (%): 1.51 NRMSE (%): 2.66	The application of the proposed hybrid evolutionary-adaptive (HEA) methodology was revealed to be accurate and effective, helping to reduce the uncertainty associated with wind power.	The study did not consider other operating parameters (e.g., wind direction, pitch angle, temperature of generator, rotating speed, etc.) for wind power prediction.
	EMD-C-GT [38]	Dongtai, China	A hybrid prediction model with empirical mode decomposition (EMD), chaotic theory, and grey theory	Power data collected every 10 min.	MAPE(%): 18.33 NMAE(%): 5.71 NRMSE (%): 7.80	The approach can reduce the non-stationary wind farm of the power time series and enhance the prediction accuracy compared to the direct prediction method for using the power data directly.	Only the wind turbine output power datasets were used as input to the model.
	CapSA-RVFL [40]	La Haute Borne, France	An optimized RVFL network using a new naturally inspired technique called the Capuchin search algorithm (CapSA)	Datasets obtained from La Haute Borne wind turbines in France (from 2017 to 2020)	RMSE (kW):127.7821 MAE (kW): 84.6789 R²: 0.9638	The application of the CapSA has boosted the process of the parameter configuration to provide the RVFL with a high performance and high prediction accuracy and could be used for other applications.	The study did not consider other wind turbine operating parameters (e.g., wind speed, pitch angle, temperature of generator, rotating speed, etc.).
	NN-ICA-GA and PSO [42]	Alberta, Canada	Different hybrid prediction models based on neural networks trained by various optimization approaches are examined to forecast the wind power time series from Alberta, Canada.	Experimental data from a wind farm in Alberta, Canada for the year 2007	MAE (kW): 3.4320–8.7586 RMSE (kW): 4.2963–13.8326 MAPE (%): 7.3888–20.3263	The low error indices and very fast convergence are the main properties of the proposed approach specifically for the hybrid ICA–neural network model.	The study did not clearly indicate the input variables and their influence on the performance of the model.
	ANFIS-MoW [43]	Nouakchott, Mauritania	A novel adaptive neuro-fuzzy inference system with the moving window approach	Wind turbine datasets from a 30-MW wind farm over one year provided by the Mauritanian Electricity Company (SOMELEC) are used in this study.	NMSE: 0.0027–0.0075 NMAE: 0.0347–0.0636 RMSE (kW): 36.6973–53.9617 R²: 0.9961–0.9987	The proposed approach can be used as a useful tool to avoid shutdown risks in the wind farm system and is helpful for the management of the electricity grid.	Further research is needed to improve the accuracy of the ANFIS-MoW model by considering more operational parameters and further improving the ANFIS-MoW approach.
	G-NN [44]	Zhangbei, China	Short-term forecasting of wind turbine power generation based on a genetic neural network approach	Actual wind speed data from 10 days were used as original data to train and validate the model.	RMSE (kW): 4.031 MAE (kW): 3.534 MRE (%): 2.38	The proposed model ranges from the wind speed to the output power from wind turbines.	The proposed approaches used predicted wind speed to generate the output power from the WTs. Also, only the datasets measured every 10 min over 10 days are used for this study.
	ANFIS [49]	Beijing, China	An ANFIS-based approach for 1-day-ahead hourly wind power generation prediction	Datasets recorded for every 10-min average wind speed and turbine output power for a period of one year from 26 March 2014 to 25 March 2015	MAE (kW): 28.39 MAPE (%): 4.45 RMSE (kW): 46.06 MSE (kW): 2121.5	The validation of the proposed model demonstrates the capability of the approach to predict wind power from a daily wind speed profile at a reasonable accuracy with superior precision over feed-forward ANN and GA-BP NN models.	Only wind speeds are used as input for the proposed model.

EISM: Effective information screening module; RTRD: Real-time rolling decomposition module; Bi-LSTM: Bidirectional long short-term memory neural network; MST-GNN: Multidimensional spatio-temporal graph based on the neural networks; MFMS: Multi-feature and multi-scale learning; CNN-LSM-NDL: Convolutional neural network and long short-term memory network deep learning model; VMD-TCN-STL: Variation mode decomposition–temporal convolutional network and sequential triplet loss; RNN-CNN-LSTM: Hybrid neural network scheme coupled to convolutional neural network and long short-term memory; DRIPS-PDI: Decomposition–recognition–integration-prediction system considering a recently predictive difficulty index; CNN-BILSTM-MOHHO: Capsule neural network and bidirectional long and short term memory network combined with multi-object harris hawk optimization; LSTM-IVMD-SE: Long short-term memory neural network coupled with correntropy combining an improved variational mode decomposition and sample entropy; FCM: Fuzzy C-means clustering algorithm; BMA-EL: Bayesian model averaging and ensemble learning; DD-PPDL: Data-driven approach integrating data pre-processing and deep learning models; SRNN-PSAF: Stacked recurrent neural network coupled with parametric sine activation function algorithm; MC-hNN: Multiple clustering algorithm and hybrid neural network method; WT-CNN-tSVR: Wavelet transform based convolutional neural network and twin support vector regression; ANFIS-WT-PSO-MI: Hybrid and adaptable ANFIS-based technique incorporating the wavelet transform and the PSO with mutual information; EMD-C-GT: Hybrid prediction model including the empirical model decomposition based on the chaos and grey theories; CapSA-RVFL: An optimized random vector functional link network using capuchin search algorithm approach; NN-ICA-GA and PSO: Hybrid prediction models based on neural networks optimized using the so-called imperialist competitive algorithm (ICA), the GA, and the PSO; ANFIS-MoW: A novel adaptive neuro-fuzzy inference system with the moving window; G-NN: Genetic neural network (G-NN) modeling technique; hPDM-TVFEMD-AE-YJQR-GAQ: Hybrid probability density model including time varying filter based empirical mode decomposition, approximate entropy, Yeo-Johnson transform quantile regression, and Gaussian approximation of quantiles; BBLP-MSR: Bilateral branch learning paradigm with data of multiple sampling resolutions; Multiple-SVR: Multiple support vector regression-based model; GA-BP-ANN: Genetic algorithm trained by the back propagation artificial neural network learning algorithm; ANFIS: Adaptive network-based fuzzy inference system; EPT-CEEMDAN-TCN: Ensemble patch transform and the complete ensemble empirical mode decomposition with adaptive noise combined with temporal convolutional networks; ED-Wavenet-TF: Wavenet networks based encoder-decoder framework; MSFAE: Multi-scale feature adaptive extraction ensemble model; MKSVRE-WOA: multi-kernel SVR ensemble model based on unified optimization and whale optimization algorithm; VMD-CA-LSTM-EL-EC: hybrid model based on the variational mode decomposition, clustering analysis, long short-term memory network, stacking ensemble learning, and error complementation.
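
Tables A1 and A2 compare the cited studies mainly through MAE, RMSE, MAPE, and R². As a reading aid, the following is a minimal sketch of how these four indices are commonly computed from observed and predicted values. It uses the standard textbook definitions, which may differ in detail from the formulations adopted in individual cited studies; the function name `error_metrics` and the sample power values are hypothetical and are not taken from any of the tabulated works.

```python
import math


def error_metrics(observed, predicted):
    """Return MAE, RMSE, MAPE (%) and R2 using standard textbook definitions.

    Assumptions (illustrative only): both inputs are equal-length,
    non-empty sequences of numbers, no observed value is zero
    (MAPE is undefined otherwise), and the observed values are not
    all identical (R2 is undefined otherwise).
    """
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]

    # Mean absolute error, in the unit of the target variable
    mae = sum(abs(e) for e in errors) / n

    # Root mean square error, in the unit of the target variable
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)

    # Mean absolute percentage error, expressed in percent
    mape = 100.0 * sum(abs(e / o) for e, o in zip(errors, observed)) / n

    # Coefficient of determination
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot

    return {"MAE": mae, "RMSE": rmse, "MAPE": mape, "R2": r2}


# Hypothetical observed and predicted output power values (kW)
observed = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]
metrics = error_metrics(observed, predicted)
print(metrics)
```

Because MAE and RMSE carry the unit of the predicted quantity (m/s for wind speed, kW or MW for power) while MAPE and R² are dimensionless, values reported for different targets or turbine ratings in the tables are not directly comparable on MAE or RMSE alone.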

References

Valente, A.; Iribarren, D.; Dufour, J. Harmonised life-cycle global warming impact of renewable hydrogen. J. Clean. Prod. 2017, 149, 762–772.
Jin, T.; Kim, J. What is better for mitigating carbon emissions – Renewable energy or nuclear energy? A panel data analysis. Renew. Sustain. Energy Rev. 2018, 91, 464–471.
Cho, H.H.; Strezov, V.; Evans, T.J. A review on global warming potential, challenges and opportunities of renewable hydrogen production technologies. Sustain. Mater. Technol. 2023, 35, e00567.
Martínez-Barbeito, M.; Gomila, D.; Colet, P. Dynamical model for power grid frequency fluctuations: Application to islands with high penetration of wind generation. IEEE Trans. Sustain. Energy 2023, 14, 1436–1445.
Nezhad, M.M.; Neshat, M.; Piras, G.; Garcia, D.A. Sites exploring prioritisation of offshore wind energy potential and mapping for wind farms installation: Iranian islands case studies. Renew. Sustain. Energy Rev. 2022, 168, 112791.
Global Wind Energy Council (GWEC). Global Wind Report 2022; GWEC: Brussels, Belgium, 2022; pp. 1–154.
Jiang, G.; Fan, W.; Li, W.; Wang, L.; He, Q.; Xie, P.; Li, X. DeepFedWT: A federated deep learning framework for fault detection of wind turbines. Measurement 2022, 199, 111529.
Bilal, B.; Adjallah, K.H.; Yetilmezsoy, K.; Bahramian, M.; Kıyan, E. Determination of wind potential characteristics and techno-economic feasibility analysis of wind turbines for Northwest Africa. Energy 2020, 218, 119558.
Bilal, B.; Adjallah, K.H.; Sava, A. Data-Driven Fault Detection and Identification in Wind Turbines Through Performance Assessment. In Proceedings of the 10th IEEE International Conference on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications, Metz, France, 18–21 September 2019.
Peeters, C.; Guillaume, P.; Helsen, J. Vibration-based bearing fault detection for operations and maintenance cost reduction in wind energy. Renew. Energy 2018, 116, 74–87.
Xiaodong, L.; Djamila, O.; Xiang, S.; Dylan, J.; Graham, W.; Kerry, E.H.; Paul, I.; Simon, M.; Dongping, S.; Emmanuel, P. A decision support system for strategic maintenance planning in offshore wind farms. Renew. Energy 2016, 99, 784–799.
Stock-Williams, C.; Swamy, S.K. Automated daily maintenance planning for offshore wind farms. Renew. Energy 2019, 133, 1393–1403.
Atashgar, K.; Abdollahzadeh, H. Reliability optimization of wind farms considering redundancy and opportunistic maintenance strategy. Energy Convers. Manag. 2016, 112, 445–458.
Li, K.; Shen, R.; Wang, Z.; Yan, B.; Yang, Q.; Zhou, X. An efficient wind speed prediction method based on a deep neural network without future information leakage. Energy 2023, 267, 126589.
Wu, Q.; Zheng, H.; Guo, X.; Liu, G. Promoting wind energy for sustainable development by precise wind speed prediction based on graph neural networks. Renew. Energy 2022, 199, 977–992.
Xiaoxun, Z.; Zixu, X.; Yu, W.; Xiaoxia, G.; Xinyu, H.; Hongkun, L.; Ruizhang, L.; Yao, C.; Huaxin, L. Research on wind speed behavior prediction method based on multi-feature and multi-scale integrated learning. Energy 2023, 263, 125593.
Long, H.; He, Y.; Cui, H.; Li, Q.; Tan, H.; Tang, B. Research on short-term wind speed prediction based on deep learning model in multi-fan scenario of distributed generation. Energy Rep. 2022, 8, 14183–14199.
Li, H.; Jiang, Z.; Shi, Z.; Han, Y.; Yu, C.; Mi, X. Wind-speed prediction model based on variational mode decomposition, temporal convolutional network, and sequential triplet loss. Sustain. Energy Technol. Assess. 2022, 52, 101980.
Shen, Z.; Fan, X.; Zhang, L.; Yu, H. Wind speed prediction of unmanned sailboat based on CNN and LSTM hybrid neural network. Ocean Eng. 2022, 245, 111352.
Gao, Y.; Wang, J.; Zhang, X.; Li, R. Ensemble wind speed prediction system based on envelope decomposition method and fuzzy inference evaluation of predictability. Appl. Soft. Comput. 2022, 124, 109010.
Liang, T.; Chai, C.; Sun, H.; Tan, J. Wind speed prediction based on multi-variable Capsnet-BILSTM-MOHHO for WPCCC. Energy 2022, 250, 123761.
Duan, J.; Wang, P.; Ma, W.; Tian, X.; Fang, S.; Cheng, Y.; Chang, Y.; Liu, H. Short-term wind power forecasting using the hybrid model of improved variational mode decomposition and correntropy long short-term memory neural network. Energy 2021, 214, 118980.
Yang, M.; Shi, C.; Liu, H. Day-ahead wind power forecasting based on the clustering of equivalent power curves. Energy 2021, 218, 119515.
Shahid, F.; Khan, A.; Zameer, A.; Arshad, J.; Safdar, K. Wind power prediction using a three stage genetic ensemble and auxiliary predictor. Appl. Soft. Comput. 2020, 90, 106151.
Wang, G.; Jia, R.; Liu, J.; Zhang, H. A hybrid wind power forecasting approach based on Bayesian model averaging and ensemble learning. Renew. Energy 2020, 145, 2426–2434.
Liu, X.; Zhou, J.; Qian, H. Short-term wind power forecasting by stacked recurrent neural networks with parametric sine activation function. Electr. Power Syst. Res. 2021, 192, 107011.
Kisvari, A.; Lin, Z.; Liu, X. Wind power forecasting – A data-driven method along with gated recurrent neural network. Renew. Energy 2021, 163, 1895–1909.
Yu, G.; Liu, C.; Tang, B.; Chen, R.; Lu, L.; Cui, C.; Hu, Y.; Shen, L.; Muyeen, S. Short term wind power prediction for regional wind farms based on spatial-temporal characteristic distribution. Renew. Energy 2022, 199, 599–612.
Meng, A.; Chen, S.; Ou, Z.; Ding, W.; Zhou, H.; Fan, J.; Yin, H. A hybrid deep learning architecture for wind power prediction based on bi-attention mechanism and crisscross optimization. Energy 2022, 238, 121795.
He, R.; Yang, H.; Sun, S.; Lu, L.; Sun, H.; Gao, X. A machine learning-based fatigue loads and power prediction method for wind turbines under yaw control. Appl. Energy 2022, 326, 120013.
Wang, L.; He, Y.; Li, L.; Liu, X.; Zhao, Y. A novel approach to ultra-short-term multi-step wind power predictions based on encoder–decoder architecture in natural language processing. J. Clean. Prod. 2022, 354, 131723.
Mandzhieva, R.; Subhankulova, R. Data-driven applications for wind energy analysis and prediction: The case of “La Haute Borne” wind farm. Digital Chem. Eng. 2022, 4, 100048.
Dhiman, H.S.; Deb, D.; Guerrero, J.M. On wavelet transform based convolutional neural network and twin support vector regression for wind power ramp event prediction. Sustain. Comput. Inform. Syst. 2022, 36, 100795.
Xiong, J.; Peng, T.; Tao, Z.; Zhang, C.; Song, S.; Nazir, M.S. A dual-scale deep learning model based on ELM-BiLSTM and improved reptile search algorithm for wind power prediction. Energy 2023, 226, 126419.
Jiading, J.; Feng, W.; Rui, T.; Lingling, Z.; Xin, X. TS_XGB: Ultra-short-term wind power forecasting method based on fusion of time-spatial data and XGBoost algorithm. Procedia Comput. Sci. 2022, 199, 1103–1111.
Sheng, Y.; Wang, H.; Yan, J.; Liu, Y.; Han, S. Short-term wind power prediction method based on deep clustering-improved Temporal Convolutional Network. Energy Rep. 2023, 9, 2118–2129.
Osório, G.J.; Matias, J.C.O.; Catalão, J.P.S. Short-term wind power forecasting using adaptive neuro-fuzzy inference system combined with evolutionary particle swarm optimization, wavelet transform and mutual information. Renew. Energy 2015, 75, 301–307.
An, X.; Jiang, D.; Zhao, M.; Liu, C. Short-term prediction of wind power using EMD and chaotic theory. Commun. Nonlinear Sci. Numer. Simulat. 2012, 17, 1036–1042.
Guo, N.Z.; Shi, K.Z.; Li, B.; Qi, L.W.; Wu, H.H.; Zhang, Z.L.; Xu, J.Z. A physics-inspired neural network model for short-term wind power prediction considering wake effects. Energy 2022, 261, 125208.
Al-qaness, M.A.A.; Ewees, A.A.; Fan, H.; Abualigah, L.; Elsheikh, A.H.; Elaziz, M.A. Wind power prediction using random vector functional link network with capuchin search algorithm. Ain. Shams. Eng. J. 2022, 14, 102095.
Ye, L.; Dai, B.; Li, Z.; Pei, M.; Zhao, Y.; Lu, P. An ensemble method for short-term wind power prediction considering error correction strategy. Appl. Energy 2022, 322, 119475.
Bigdeli, N.; Afshar, K.; Gazafroudi, A.S.; Ramandi, M.Y. A comparative study of optimal hybrid methods for wind power prediction in wind farm of Alberta, Canada. Renew. Sustain. Energy Rev. 2013, 27, 20–29.
Bilal, B.; Adjallah, K.H.; Sava, A.; Yetilmezsoy, K.; Ouassaid, M. Wind turbine output power prediction and optimization based on a novel adaptive neuro-fuzzy inference system with the moving window. Energy 2023, 263, 126159.
Weidong, X.; Yibing, L.; Xingpei, L. Short-term forecasting of wind turbine power generation based on genetic neural network. In Proceedings of the 8th World Congress on Intelligent Control and Automation (WCICA) 2010, Jinan, China, 7–9 July 2010; pp. 5943–5946.
Zhang, W.; He, Y.; Yan, S. A multi-step probability density prediction model based on gaussian approximation of quantiles for offshore wind power. Renew. Energy 2023, 202, 992–1011.
Liu, H.; Zhang, Z. A bilateral branch learning paradigm for short term wind power prediction with data of multiple sampling resolutions. J. Clean. Prod. 2022, 380, 134977.
Huang, C.M.; Kuo, C.J.; Huang, Y.C. Short-term wind power forecasting and uncertainty analysis using a hybrid intelligent method. IET Renew. Power Gener. 2017, 11, 678–687.
Kassa, Y.; Zhang, J.H.; Zheng, D.H.; Wei, D. A GA-BP hybrid algorithm based ANN Model for wind power prediction. In Proceedings of the IEEE Smart Energy Grid Engineering (SEGE), Oshawa, ON, Canada, 21–24 August 2016; pp. 158–163.
Kassa, Y.; Zhang, J.H.; Zheng, D.H.; Wei, D. Short term wind power prediction using ANFIS. In Proceedings of the IEEE International Conference on Power and Renewable Energy (ICPRE), Shanghai, China, 21–23 October 2016; pp. 388–393.
Chen, J.; Liu, H.; Chen, C.; Duan, Z. Wind speed forecasting using multi-scale feature adaptive extraction ensemble model with error regression correction. Expert. Syst. Appl. 2022, 207, 117358.
Xian, H.; Che, J. Unified whale optimization algorithm based multi-kernel SVR ensemble learning for wind speed forecasting. Appl. Soft. Comput. 2022, 130, 109690.
Li, D.; Jiang, F.; Chen, M.; Qian, T. Multi-step-ahead wind speed forecasting based on a hybrid decomposition method and temporal convolutional networks. Energy 2022, 238, 121981.
Wang, Y.; Chen, T.; Zhou, S.; Zhang, F.; Zou, R.; Hu, Q. An improved Wavenet network for multi-step-ahead wind energy forecasting. Energy Convers. Manag. 2023, 278, 116709.
Sun, Z.; Zhao, M.; Zhao, G. Hybrid model based on VMD decomposition, clustering analysis, long short memory network, ensemble learning and error complementation for short-term wind speed forecasting assisted by Flink platform. Energy 2022, 261, 125248.
Mahmoodi, K.; Ghassemi, H.; Razminia, A. Wind energy potential assessment in the Persian Gulf: A spatial and temporal analysis. Ocean Eng. 2020, 15, 107674.
Korkos, P.; Linjama, M.; Kleemola, J.; Lehtovaara, A. Data annotation and feature extraction in fault detection in a wind turbine hydraulic pitch system. Renew. Energy 2022, 185, 692–703.
He, J.; Chan, P.W.; Li, Q.; Lee, C.W. Spatiotemporal analysis of offshore wind field characteristics and energy potential in Hong Kong. Energy 2020, 201, 117622.
Hyers, R.W.; Mcgowan, J.G.; Sullivan, K.L.; Manwell, J.F.; Syrett, B.C. Condition monitoring and prognosis of utility scale wind turbines. Energy Mater. 2006, 3, 187–203.
Avazov, A.; Colas, F.; Beerten, J.; Guillaud, X. Application of input shaping method to vibrations damping in a Type-IV wind turbine interfaced with a grid-forming converter. Electr. Power Syst. Res. 2022, 210, 108083. [Google Scholar] [CrossRef]
Sreenivas, P.; Murthy, V.S.S.; Kumar, S.V.; Kumar, U.P. Design and analysis of new pitch angle controller for enhancing the performance of wind turbine coupled with PMSG. Mater. Today Proc. 2022, 52, 1456–1460. [Google Scholar] [CrossRef]
Dao, P.B. On Wilcoxon rank sum test for condition monitoring and fault detection of wind turbines. Appl. Energy 2022, 318, 119209. [Google Scholar] [CrossRef]
Kusiak, A.; Li, W. The prediction and diagnosis of wind turbine faults. Renew. Energy 2011, 36, 16–23. [Google Scholar] [CrossRef]
Hu, Z.; Gao, B.; Sun, R. An active primary frequency regulation strategy for grid integrated wind farms based on model predictive control. Sustain. Energy Grids Netw. 2022, 32, 100955. [Google Scholar] [CrossRef]
Dayev, Z.; Kairakbaev, A.; Yetilmezsoy, K.; Bahramian, M.; Sihag, P.; Kıyan, E. Approximation of the discharge coefficient of differential pressure flowmeters using different soft computing strategies. Flow Meas. Instrum. 2021, 79, 101913. [Google Scholar] [CrossRef]
Yetilmezsoy, K.; Sihag, P.; Kıyan, E.; Doran, B. A benchmark comparison and optimization of Gaussian process regression, support vector machines, and M5P tree model in approximation of the lateral confinement coefficient for CFRP-wrapped rectangular/square RC columns. Eng. Struct. 2021, 246, 113106. [Google Scholar] [CrossRef]
Dayev, Z.; Shopanova, G.; Toksanbaeva, B.; Yetilmezsoy, K.; Sultanov, N.; Sihag, P.; Bahramian, M.; Kıyan, E. Modeling the flow rate of dry part in the wet gas mixture using decision tree/kernel/non-parametric regression-based soft-computing techniques. Flow Meas. Instrum. 2022, 86, 102195. [Google Scholar] [CrossRef]
Dayev, Z.; Yetilmezsoy, K.; Sihag, P.; Bahramian, M.; Kıyan, E. Modeling of the mass flow rate of natural gas flow stream using genetic/decision tree/kernel-based data-intelligent approaches. Flow Meas. Instrum. 2023, 90, 102331. [Google Scholar] [CrossRef]
Fan, J.; Wu, L.; Zhang, F.; Cai, H.; Zeng, W.; Wang, X.; Zou, H. Empirical and machine learning models for predicting daily global solar radiation from sunshine duration: A review and case study in China. Renew. Sust. Energy Rev. 2019, 100, 186–212. [Google Scholar] [CrossRef]
Thakur, M.S.; Pandhiani, S.M.; Kashyap, V.; Upadhya, A.; Sihag, P. Predicting bond strength of FRP bars in concrete using soft computing techniques. Arab. J. Sci. Eng. 2021, 46, 4951–4969. [Google Scholar] [CrossRef]
Yaseen, Z.M.; Sihag, P.; Yusuf, B.; Al-Janabi, A.M.S. Modelling infiltration rates in permeable stormwater channels using soft computing techniques. Irrig. Drain. 2021, 70, 117–130. [Google Scholar] [CrossRef]
Yetilmezsoy, K.; Karakaya, K.; Bahramian, M.; Abdul-Wahab, S.A.; Goncaloğlu, B.İ. Black-, gray-, and white-box modeling of biogas production rate from a real-scale anaerobic sludge digestion system in a biological and advanced biological treatment plant. Neural Comput. Applic. 2021, 33, 11043–11066. [Google Scholar] [CrossRef]
Yetilmezsoy, K.; Abdul-Wahab, S.A. A prognostic approach based on fuzzy-logic methodology to forecast PM₁₀ levels in Khaldiya residential area, Kuwait. Aerosol Air Qual. Res. 2012, 12, 1217–1236. [Google Scholar] [CrossRef]
Hassan, D.; Hussein, H.I.; Hassan, M.M. Heart disease prediction based on pre-trained deep neural networks combined with principal component analysis. Biomed. Signal Process Control. 2023, 79, 104019. [Google Scholar] [CrossRef]
Coban, O. Use of different variants of item response theory-based feature selection method for text categorization. In Proceedings of the 2022 International Conference on Theoretical and Applied Computer Science and Engineering (ICTASCE), Ankara, Turkey, 29 September–1 October 2022; pp. 66–71. [Google Scholar] [CrossRef]
Wang, Y.; Cui, W.; Vuong, N.K.; Chen, Z.; Zhou, Y.; Wu, M. Feature selection and domain adaptation for cross-machine product quality prediction. J. Intell. Manuf. 2023, 34, 1573–1584. [Google Scholar] [CrossRef]
Sharma, V.; Chouhan, A.P.S.; Bisen, D. Prediction of activation energy of biomass wastes by using multilayer perceptron neural network with Weka. Mater. Today Proc. 2022, 57, 1944–1949. [Google Scholar] [CrossRef]
Breiman, L. Bagging predictors. Mach. Learn. 1996, 24, 123–140. [Google Scholar] [CrossRef]
Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef]
Breiman, L.; Friedman, J.; Olshen, R.; Stone, C. Classification and Regression Trees; Chapman and Hall/CRC: Boca Raton, FL, USA, 1984; p. 368. [Google Scholar]
Hamoud, A.; Hashim, A.S.; Awadh, W.A. Predicting student performance in higher education institutions using decision tree analysis. Int. J. Interact. Multi. Artif. Intell. 2018, 5, 26–31. Available online: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3243704 (accessed on 16 December 2023). [CrossRef]
Cutler, A.; Cutler, D.R.; Stevens, J.R. Random forests. In Ensemble Machine Learning, Methods and Applications; Zhang, C., Ma, Y., Eds.; Springer: Boston, MA, USA, 2012; pp. 157–175. [Google Scholar] [CrossRef]
Barddal, J.P.; Enembreck, F.; Gomes, H.M.; Bifet, A.; Pfahringer, B. Merit-guided dynamic feature selection filter for data streams. Expert Syst. Appl. 2019, 116, 227–242. [Google Scholar] [CrossRef]
Quinlan, J.R. Simplifying decision trees. Int. J. Man Mach. Stud. 1987, 27, 221–234. [Google Scholar] [CrossRef]
Lakshmi, D.C. Proficiency comparison of LADTree and REPTree classifiers for credit risk forecast. Int. J. Comput. Sci. Appl. 2015, 5, 39–50. [Google Scholar] [CrossRef]
Mohamed, W.N.H.W.; Salleh, M.N.M.; Omar, A.H. A comparative study of reduced error pruning method in decision tree algorithms. In Proceedings of the 2012 IEEE International Conference on Control System, Computing and Engineering, Penang, Malaysia, 23–25 November 2012; pp. 392–397. [Google Scholar] [CrossRef]
Shahdad, M.; Saber, B. Drought forecasting using new advanced ensemble-based models of reduced error pruning tree. Acta Geophys. 2022, 70, 697–712. [Google Scholar] [CrossRef]
Sheela, K.G.; Deepa, S.N. Review on methods to fix number of hidden neurons in neural networks. Math Probl. Eng. 2013, 2013, 425740. [Google Scholar] [CrossRef]
Zhang, Z.; Ma, X.; Yang, Y. Bounds on the number of hidden neurons in three-layer binary neural networks. Neural Netw. 2003, 16, 995–1002. [Google Scholar] [CrossRef]
Saberi-Movahed, F.; Najafzadeh, M.; Mehrpooya, A. Receiving more accurate predictions for longitudinal dispersion coefficients in water pipelines: Training group method of data handling using extreme learning machine conceptions. Water Resour. Manag. 2020, 34, 529–561. [Google Scholar] [CrossRef]
Wang, Q.; Luo, K.; Fan, J.; Gao, X.; Cen, K. Spatial distribution and multiscale transport characteristics of PM_2.5 in China. Aerosol Air Qual. Res. 2019, 19, 1993–2007. [Google Scholar] [CrossRef]
Badescu, V. Assessing the performance of solar radiation computing models and model selection procedures. J. Atmos. Sol. Terr. Phys. 2013, 105, 119–134. [Google Scholar] [CrossRef]
Caliskan, N.; Jadraque, E.; Tham, Y.; Muneer, T. Evaluation of the accuracy of mathematical models through use of multiple metrics. Sustain. Cities Soc. 2011, 1, 63–66. [Google Scholar] [CrossRef]
Moreno, J.J.M.; Pol, A.P.; Abad, A.S.; Blasco, B.C. Using the R-MAPE index as a resistant measure of forecast accuracy. Psicothema 2013, 25, 500–506. [Google Scholar] [CrossRef]
Çelik, A.N.; Makkawi, A.; Muneer, T. Critical evaluation of wind speed frequency distribution functions. J. Renew. Sustain. Energy 2010, 2, 013102. [Google Scholar] [CrossRef]
Shabanlou, S. Improvement of extreme learning machine using self-adaptive evolutionary algorithm for estimating discharge capacity of sharp-crested weirs located on the end of circular channels. Flow Meas. Instrum. 2018, 59, 63–71. [Google Scholar] [CrossRef]
Yetilmezsoy, K.; Özçimen, D.; Koçer, A.T.; Bahramian, M.; Kıyan, E.; Akbin, H.M.; Goncaloğlu, B.İ. Removal of anthraquinone dye via struvite: Equilibria, kinetics, thermodynamics, fuzzy logic modeling. Int. J. Environ. Res. 2020, 14, 541–566. [Google Scholar] [CrossRef]
Yetilmezsoy, K.; Bahramian, M.; Kıyan, E.; Bahramian, M. Development of a new practical formula for pipe-sizing problems within the framework of a hybrid computational strategy. J. Irrig. Drain Eng. 2021, 147, 04021012. [Google Scholar] [CrossRef]
Sharafati, A.; Khosravi, K.; Khosravinia, P.; Ahmed, K.; Salman, S.A.; Yaseen, Z.M.; Shahid, S. The potential of novel data mining models for global solar radiation prediction. Int. J. Environ. Sci. Technol. 2019, 16, 7147–7164. [Google Scholar] [CrossRef]
Nwulu, N.I. Modelling locational marginal prices using decision trees. In Proceedings of the 2017 International Conference on Information and Communication Technologies (ICICT), Karachi, Pakistan, 30–31 December 2017; pp. 156–159. [Google Scholar] [CrossRef]
Thongkao, S.; Ditthakit, P.; Pinthong, S.; Salaeh, N.; Elkhrachy, I.; Linh, N.T.T.; Pham, Q.B. Estimating FAO Blaney-Criddle b-Factor using soft computing models. Atmosphere 2022, 13, 1536. [Google Scholar] [CrossRef]
Pham, Q.B.; Kumar, M.; Di Nunno, F.; Elbeltagi, A.; Granata, F.; Islam, A.R.M.T.; Talukdar, S.; Nguyen, X.C.; Ahmed, A.N.; Anh, D.T. Groundwater level prediction using machine learning algorithms in a drought-prone area. Neural Comput. Applic 2022, 34, 10751–10773. [Google Scholar] [CrossRef]
Sargam, Y.; Wang, K.; Cho, I.H. Machine learning based prediction model for thermal conductivity of concrete. J. Build. Eng. 2021, 34, 101956. [Google Scholar] [CrossRef]
Bakirci, K. Prediction of global solar radiation and comparison with satellite data. J. Atmos. Sol. Terr. Phys. 2017, 152, 41–49. [Google Scholar] [CrossRef]
Stone, R.J. Improved statistical procedure for the evaluation of solar radiation estimation models. Sol. Energy 1993, 51, 289–291. [Google Scholar] [CrossRef]
Evin, M.; Hidalgo-Munoz, A.; Béquet, A.J.; Moreau, F.; Tattegrain, H.; Berthelon, C.; Fort, A.; Jallais, C. Personality trait prediction by machine learning using physiological data and driving behavior. Mach. Learn Appl. 2022, 9, 100353. [Google Scholar] [CrossRef]
Psarras, A.; Anagnostopoulos, T.; Salmon, I.; Psaromiligkos, Y.; Vryzidis, L. A Change management approach with the support of the balanced scorecard and the utilization of artificial neural networks. Adm. Sci. 2022, 12, 63. [Google Scholar] [CrossRef]
Shamshirband, S.; Jafari Nodoushan, E.; Adolf, J.E.; Abdul Manaf, A.; Mosavi, A.; Chau, K.W. Ensemble models with uncertainty analysis for multi-day ahead forecasting of chlorophyll a concentration in coastal waters. Eng. Appl. Comput. Fluid. Mech. 2019, 13, 91–101. [Google Scholar] [CrossRef]
Najafzadeh, M.; Rezaie Balf, M.; Rashedi, E. Prediction of maximum scour depth around piers with debris accumulation using EPR, MT, and GEP models. J. Hydroinformatics 2016, 18, 867–884. [Google Scholar] [CrossRef]
Sattar, A.M. Gene expression models for the prediction of longitudinal dispersion coefficients in transitional and turbulent pipe flow. J. Pipeline Syst. Eng. Pract. 2014, 5, 04013011. [Google Scholar] [CrossRef]

Figure 1. Description of the wind farm, SCADA, and SGIPE for control and data monitoring in Nouakchott, Mauritania.

Figure 2. Scatter plots of WTOP based on predictor components: WTOP = f (WS, WD, AT, PA, GT, RSG, VN).

Figure 3. Correlations between the measured and forecasted WTOP values using the nonlinear regression-based model: (a) training phase (n = 25,759) and (b) testing phase (n = 11,039).

Figure 4. Correlations between the measured and forecasted WTOP values using the RF-based model: (a) training phase (n = 25,759) and (b) testing phase (n = 11,039).

Figure 5. Correlations between the measured and forecasted WTOP values using the RT-based model: (a) training phase (n = 25,759) and (b) testing phase (n = 11,039).

Figure 6. Correlations between the measured and forecasted WTOP values using the REPT-based model: (a) training phase (n = 25,759) and (b) testing phase (n = 11,039).

Figure 7. Correlations between the measured and forecasted WTOP values using the three-layer (7:30:1) ANN-based model: (a) training phase (n = 25,759) and (b) testing phase (n = 11,039).

Figure 8. Visual inter-comparison of the implemented soft-computing approaches for both the training and testing phases, respectively: (a,b) violin plots and (c,d) box-and-whisker plots.

Figure 9. Visual inter-comparison of the residual errors for both the training and testing phases, respectively: (a,b) box-and-whisker-plots and (c,d) spread plots.

Figure 10. Taylor diagrams representing the soft computing methods utilized to forecast WTOP: (a) training phase (n = 25,759) and (b) testing phase (n = 11,039).

Table 1. Descriptive statistics of the model variables used in the soft-computing techniques.

Statistics		Set	WS	WD	AT	PA	GT	RSG	VN	WTOP
Number of data (n)		TRA	25,759	25,759	25,759	25,759	25,759	25,759	25,759	25,759
		TES	11,039	11,039	11,039	11,039	11,039	11,039	11,039	11,039
		ALL	36,798	36,798	36,798	36,798	36,798	36,798	36,798	36,798
Mean		TRA	7.2961	157.3875	26.2558	172.9313	87.5073	1412.0088	690.6746	992.4396
		TES	7.3086	159.7961	26.2374	169.5787	87.7762	1413.8035	691.0028	1000.9088
		ALL	7.2998	158.1100	26.2502	171.9256	87.5880	1412.5472	690.7731	994.9803
Standard deviation		TRA	1.9911	175.5085	4.0817	178.7955	13.1531	246.5384	10.9000	654.4935
		TES	1.9907	175.7825	4.1103	178.6290	13.3157	247.5135	10.8640	660.8619
		ALL	1.9910	175.5918	4.0903	178.7498	13.2025	246.8294	10.8901	656.4129
Variance coefficient		TRA	0.2729	1.1151	0.1555	1.0339	0.1503	0.1746	0.0158	0.6595
		TES	0.2724	1.1000	0.1567	1.0534	0.1517	0.1751	0.0157	0.6603
		ALL	0.2727	1.1106	0.1558	1.0397	0.1507	0.1747	0.0158	0.6597
Standard error of mean		TRA	0.0124	1.0935	0.0254	1.1140	0.0820	1.5361	0.0679	4.0779
		TES	0.0189	1.6731	0.0391	1.7002	0.1267	2.3558	0.1034	6.2899
		ALL	0.0104	0.9154	0.0213	0.9318	0.0688	1.2867	0.0568	3.4219
Upper 95% CL of mean		TRA	7.3204	159.5309	26.3056	175.1149	87.6679	1415.0197	690.8077	1000.4326
		TES	7.3458	163.0755	26.3141	172.9113	88.0247	1418.4212	691.2055	1013.2382
		ALL	7.3202	159.9042	26.2920	173.7520	87.7229	1415.0692	690.8843	1001.6873
Lower 95% CL of mean		TRA	7.2718	155.2441	26.2059	170.7478	87.3467	1408.9980	690.5415	984.4466
		TES	7.2715	156.5166	26.1607	166.2461	87.5278	1409.1857	690.8001	988.5795
		ALL	7.2795	156.3159	26.2085	170.0992	87.4531	1410.0252	690.6618	988.2733
Quadratic mean (RMS)		TRA	7.5630	235.7000	26.5700	248.7000	88.4900	1433.0000	690.8000	1189.0000
		TES	7.5750	237.6000	26.5600	246.3000	88.7800	1435.0000	691.1000	1199.0000
		ALL	7.5660	236.3000	26.5700	248.0000	88.5800	1434.0000	690.9000	1192.0000
Skewness		TRA	0.1874	0.2601	−0.0559	0.0865	0.6748	−0.2624	−0.1135	0.2951
		TES	0.1199	0.2321	−0.0337	0.1241	0.6494	−0.2760	−0.0365	0.2757
		ALL	0.1671	0.2517	−0.0492	0.0978	0.6673	−0.2665	−0.0907	0.2893
Kurtosis		TRA	3.1188	1.0682	2.5573	1.0079	2.4583	1.4716	3.4671	1.6994
		TES	2.9105	1.0544	2.5353	1.0158	2.3835	1.4739	3.3665	1.6690
		ALL	3.0560	1.0639	2.5506	1.0099	2.4355	1.4722	3.4390	1.6901
Maximum (Q₄)		TRA	19.5000	360.0000	40.1400	360.0000	122.6000	1685.6100	739.4300	2040.1100
		TES	16.1900	360.0000	40.3900	360.0000	122.7400	1686.0900	737.0300	2031.9700
		ALL	19.5000	360.0000	40.3900	360.0000	122.7400	1686.0900	739.4300	2040.1100
Upper quartile (Q₃)		TRA	8.7300	357.0000	29.4200	359.6500	96.6100	1679.5900	697.7800	1627.2400
		TES	8.8100	357.0000	29.4500	359.6400	97.2100	1679.7700	698.0300	1660.2000
		ALL	8.7500	357.0000	29.4300	359.6400	96.7600	1679.6600	697.8600	1636.7800
Median (Q₂)		TRA	7.3100	6.0000	26.5400	7.7200	83.5700	1448.2700	690.7300	855.1900
		TES	7.3400	6.0000	26.5100	6.6800	83.7300	1454.3700	690.9900	866.7900
		ALL	7.3200	6.0000	26.5300	7.4200	83.6200	1449.7850	690.8100	858.2600
Lower quartile (Q₁)		TRA	5.8900	3.0000	23.2500	0.7300	77.5400	1159.1100	683.8300	419.5100
		TES	5.8900	3.0000	23.1800	0.7300	77.5200	1159.3700	684.0400	420.2900
		ALL	5.8900	3.0000	23.2300	0.7300	77.5400	1159.1200	683.9100	419.6400
Minimum (Q₀)		TRA	2.1300	0.0000	13.9600	−0.9000	42.1300	1045.2300	638.8800	0.1200
		TES	2.4200	0.0000	13.9900	−0.9000	37.0200	1045.4400	643.7300	0.0900
		ALL	2.1300	0.0000	13.9600	−0.9000	37.0200	1045.2300	638.8800	0.0900
Range (Q₄–Q₀)		TRA	17.3700	360.0000	26.1800	360.9000	80.4700	640.3800	100.5500	2039.9900
		TES	13.7700	360.0000	26.4000	360.9000	85.7200	640.6500	93.3000	2031.8800
		ALL	17.3700	360.0000	26.4300	360.9000	85.7200	640.8600	100.5500	2040.0200
Interquartile range (IQR = Q₃–Q₁)		TRA	2.8400	354.0000	6.1700	358.9200	19.0700	520.4800	13.9500	1207.7300
		TES	2.9200	354.0000	6.2700	358.9100	19.6900	520.4000	13.9900	1239.9100
		ALL	2.8600	354.0000	6.2000	358.9100	19.2200	520.5400	13.9500	1217.1400
Centile 95		TRA	10.3000	359.2400	32.2600	359.9100	112.7500	1681.7400	708.4900	2001.6900
		TES	10.2500	359.2300	32.3700	359.9100	113.1800	1681.8600	708.8300	2002.1000
		ALL	10.2900	359.2300	32.3000	359.9100	112.8600	1681.7800	708.5800	2001.8900
Centile 5		TRA	4.0100	0.6600	19.3700	−0.3000	71.5300	1049.9900	672.7400	108.4400
		TES	3.9900	0.7100	19.4200	−0.3000	71.5300	1049.9900	673.1800	108.8100
		ALL	4.0000	0.6800	19.3900	−0.3000	71.5300	1049.9900	672.9100	108.5400

TRA: Training dataset; TES: Testing dataset; ALL: Overall dataset; CL: Confidence limit; RMS: Root mean square; Q₀: Minimum value or zeroth quartile (0th centile/percentile, quantile 0.00); Q₁: Lower quartile or first quartile (25th centile/percentile, quantile 0.25); Q₂: Median or second quartile (50th centile/percentile, quantile 0.50); Q₃: Upper quartile or third quartile (75th centile/percentile, quantile 0.75); Q₄: Maximum value or fourth quartile (100th centile/percentile, quantile 1.00); IQR: Interquartile range; WS: Wind speed (m/s); WD: Wind direction (°); AT: Air temperature (°C); PA: Pitch angle (°); GT: Generator temperature (°C); RSG: Rotating speed of the generator (rpm); VN: Voltage of the network (V); WTOP: Wind turbine output power (kW).
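
Several rows of Table 1 follow from the mean, standard deviation, and sample size. The short Python sketch below recomputes them for the training-set wind speed (WS) column. It assumes the standard error of the mean is SD/√n, the variance coefficient is SD/mean, and the 95% confidence limits use a normal critical value of 1.96; the function name `summary_from_moments` is hypothetical and is not part of the original analysis.

```python
import math


def summary_from_moments(mean, sd, n, z=1.96):
    """Derive dispersion statistics from a mean, a standard deviation, and n.

    Assumes SEM = SD / sqrt(n) and a normal 95% critical value z = 1.96.
    """
    sem = sd / math.sqrt(n)
    return {
        "variance_coefficient": sd / mean,
        "sem": sem,
        "lower_cl": mean - z * sem,
        "upper_cl": mean + z * sem,
    }


# Training-set wind speed (WS, m/s) values taken from Table 1.
ws_train = summary_from_moments(mean=7.2961, sd=1.9911, n=25759)

for name, value in ws_train.items():
    print(f"{name}: {value:.4f}")
```

Under these assumptions, the recomputed values agree with the tabulated WS entries for the training set to within rounding.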

Table 2. Regression coefficients and model components of the best-fit nonlinear regression-based model (NRM) for estimating WTOP.

Regression Coefficients and Constant Term	Input Variables	Standard Error	t-Ratio
a = 3.52 × 10⁻²	X₁: Wind speed (m/s)	5.73 × 10⁻⁴	61.5091
b = −2.21 × 10⁻⁵	X₂: Wind direction (°)	2.94 × 10⁻⁶	−7.5289
c = −6.11 × 10⁻³	X₃: Air temperature (°C)	1.34 × 10⁻⁴	−45.6246
d = −1.17 × 10⁻⁴	X₄: Pitch angle (°)	3.66 × 10⁻⁶	−32.1099
e = 4.95 × 10⁻³	X₅: Generator temperature (°C)	7.05 × 10⁻⁵	70.2989
f = 2.52 × 10⁻³	X₆: Rotating speed of the generator (rpm)	6.80 × 10⁻⁶	370.7333
g = 3.97 × 10⁻⁴	X₇: Voltage of the network (V)	5.21 × 10⁻⁵	7.6171
h = 2.3115	Constant term	3.64 × 10⁻²	63.4548
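
The t-ratio column of Table 2 can be cross-checked against the other two columns. The sketch below assumes the usual definition, t-ratio = coefficient / standard error, and applies it to three rows of the table. The coefficients are reported to three significant figures, so only approximate agreement is expected; the helper name `t_ratio` is hypothetical.

```python
# (coefficient, standard error, tabulated t-ratio) for selected rows of Table 2.
rows = {
    "a (wind speed)": (3.52e-2, 5.73e-4, 61.5091),
    "f (generator rotating speed)": (2.52e-3, 6.80e-6, 370.7333),
    "h (constant term)": (2.3115, 3.64e-2, 63.4548),
}


def t_ratio(coefficient, standard_error):
    """Assumed definition: estimate divided by its standard error."""
    return coefficient / standard_error


# Relative deviation between the recomputed and the tabulated t-ratio.
rel_errors = {
    name: abs(t_ratio(coef, se) - t_tab) / abs(t_tab)
    for name, (coef, se, t_tab) in rows.items()
}

for name, err in rel_errors.items():
    print(f"{name}: relative deviation = {err:.2%}")
```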

Table 3. Comparative indicator performance of the implemented soft-computing models considering various quantitative statistics (boldface values show superior statistical outputs in the comparison of relevant datasets among themselves).

Statistics	Set	NRM	RF	RT	REPT	ANN
Number of data (n)	TRA	25,759	25,759	25,759	25,759	25,759
	TES	11,039	11,039	11,039	11,039	11,039
	ALL	36,798	36,798	36,798	36,798	36,798
R²	TRA	0.9783	0.9995	0.9994	0.9979	0.9973
	TES	0.9789	0.9982	0.9960	0.9971	0.9974
	ALL	0.9785	0.9991	0.9983	0.9976	0.9974
b (slope: s)	TRA	0.9697	0.9986	0.9994	0.9979	1.0042
	TES	0.9666	0.9975	0.9986	0.9973	1.0048
	ALL	0.9688	0.9983	0.9991	0.9977	1.0044
a (intercept)	TRA	33.7679	1.3956	0.6450	2.0973	70.2112
	TES	36.7948	2.9319	2.2995	2.9327	70.0002
	ALL	34.6830	1.8577	1.1396	2.3483	70.1452
R²_adj	TRA	0.9783	0.9995	0.9993	0.9979	0.9973
	TES	0.9789	0.9982	0.9960	0.9971	0.9974
	ALL	0.9785	0.9991	0.9983	0.9976	0.9974
MAE (kW)	TRA	77.4032	10.7843	12.1422	19.1817	76.0789
	TES	77.3617	16.8908	25.1978	21.6661	76.5227
	ALL	77.3908	12.6161	16.0587	19.9270	76.2120
MBE (kW)	TRA	3.6799	0.0400	−4.32 × 10⁻⁵	2.37 × 10⁻⁶	74.3916
	TES	3.3816	0.3802	0.8517	0.2168	74.7765
	ALL	3.5904	0.1420	0.2555	0.0650	74.5071
MAPE (%)	TRA	73.8172	7.0737	7.1107	8.8677	34.7264
	TES	73.4223	7.5597	8.2325	8.9620	33.9020
	ALL	73.6988	7.2195	7.4472	8.8960	34.4791
RMSE (kW)	TRA	96.6137	15.3417	16.6843	30.0867	81.8426
	TES	96.4472	27.7217	41.8067	35.6662	82.0540
	ALL	96.5638	19.8821	26.8175	31.8632	81.9061
RMSE_S (kW)	TRA	20.1804	0.8949	0.4254	1.3831	74.4427
	TES	22.3181	1.7271	1.2803	1.8062	74.8430
	ALL	20.8242	1.1407	0.6368	1.5077	74.5626
RMSE_U (kW)	TRA	94.4825	15.3155	16.6789	30.0549	34.0074
	TES	93.8294	27.6679	41.7871	35.6204	33.6360
	ALL	94.2916	19.8494	26.8100	31.8275	33.8972
SEE (kW)	TRA	94.4862	15.3161	16.6795	30.0560	34.0087
	TES	93.8379	27.6704	41.7908	35.6236	33.6391
	ALL	94.2942	19.8499	26.8107	31.8284	33.8982
PSE	TRA	0.0456	0.0034	0.0007	0.0021	4.7918
	TES	0.0566	0.0039	0.0009	0.0026	4.9510
	ALL	0.0488	0.0033	0.0006	0.0022	4.8385
IA (WI)	TRA	0.9944	0.9999	0.9998	0.9995	0.9961
	TES	0.9945	0.9996	0.9990	0.9993	0.9962
	ALL	0.9944	0.9998	0.9996	0.9994	0.9961
FV	TRA	0.0198	0.0011	0.0003	0.0011	−0.0055
	TES	0.0233	0.0017	−0.0006	0.0013	−0.0060
	ALL	0.0209	0.0013	0.0001	0.0011	−0.0057
FA2	TRA	0.9670	0.9976	1.0000	1.0000	0.8742
	TES	0.9652	0.9982	1.0011	1.0010	0.8741
	ALL	0.9665	0.9978	1.0003	1.0003	0.8742
CV(RMSE) (SI)	TRA	0.0973	0.0155	0.0168	0.0303	0.0825
	TES	0.0964	0.0277	0.0418	0.0356	0.0820
	ALL	0.0971	0.0200	0.0270	0.0320	0.0823
DW	TRA	1.9780	2.0265	1.9869	2.0246	0.3517
	TES	2.0106	2.0035	1.9938	2.0081	0.3396
	ALL	1.9878	2.0131	1.9920	2.0184	0.3480
NSE	TRA	0.9782	0.9995	0.9994	0.9979	0.9844
	TES	0.9787	0.9982	0.9960	0.9971	0.9846
	ALL	0.9784	0.9991	0.9983	0.9976	0.9844
LMI	TRA	0.8651	0.9812	0.9788	0.9666	0.8675
	TES	0.8669	0.9709	0.9567	0.9627	0.8684
	ALL	0.8657	0.9781	0.9721	0.9654	0.8677
MFB (%)	TRA	6.9448	0.6334	0.4565	0.5520	14.9609
	TES	7.1322	0.6120	0.4650	0.4813	14.9603
	ALL	7.0010	0.6270	0.4590	0.5308	14.9607
MFE (%)	TRA	16.4711	3.0783	3.6114	4.3072	15.0623
	TES	16.4707	3.6428	4.8233	4.5192	15.0649
	ALL	16.4710	3.2476	3.9750	4.3708	15.0631
AIC	TRA	2.35 × 10⁵	1.41 × 10⁵	1.45 × 10⁵	1.75 × 10⁵	2.27 × 10⁵
	TES	1.01 × 10⁵	7.34 × 10⁴	8.24 × 10⁴	7.89 × 10⁴	9.73 × 10⁴
	ALL	3.36 × 10⁵	2.20 × 10⁵	2.42 × 10⁵	2.55 × 10⁵	3.24 × 10⁵
t-statistic	TRA	NS	0.4180	0.0004	1.26 × 10⁻⁵	NS
	TES	NS	1.4411	NS	0.6387	NS
	ALL	NS	1.3703	1.8274	0.3916	NS
OAS (ψ)	TRA	4.8379	6.6967	6.6678	6.4323	4.1547
	TES	4.8335	6.4797	6.2211	6.3362	4.1432
	ALL	4.8365	6.6231	6.5070	6.4024	4.1512

All abbreviations are defined in the main text (see Section 2.5) and under the previous tables. The t-statistics of some models are shown as NS (not significant) because their t values exceed the critical value t_α/2 (t_critical ≈ 1.96) at the α = 0.05 level with (n − 1) degrees of freedom.
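
Two entries of Table 3 can be reproduced from other rows of the same table. The sketch below is illustrative only and rests on two assumptions that are not stated in this part of the article: that the t-statistic follows the MBE/RMSE-based form of Stone (1993), t = √[(n − 1)·MBE² / (RMSE² − MBE²)], and that RMSE_S and RMSE_U are the systematic and unsystematic components, which combine as RMSE² = RMSE_S² + RMSE_U². The function names are hypothetical.

```python
import math

T_CRITICAL = 1.96  # critical value quoted in the note under Table 3


def stone_t(mbe, rmse, n):
    """t-statistic from MBE and RMSE (assumed form, after Stone, 1993)."""
    return math.sqrt((n - 1) * mbe**2 / (rmse**2 - mbe**2))


def combined_rmse(rmse_s, rmse_u):
    """Recombine the systematic and unsystematic RMSE components."""
    return math.hypot(rmse_s, rmse_u)


# Testing-set values (kW) taken from Table 3, n = 11,039.
t_rf_test = stone_t(mbe=0.3802, rmse=27.7217, n=11039)
t_nrm_test = stone_t(mbe=3.3816, rmse=96.4472, n=11039)
rmse_rf_test = combined_rmse(rmse_s=1.7271, rmse_u=27.6679)

print(f"RF  t = {t_rf_test:.4f}, below critical: {t_rf_test < T_CRITICAL}")
print(f"NRM t = {t_nrm_test:.4f}, below critical: {t_nrm_test < T_CRITICAL}")
print(f"RF RMSE from components = {rmse_rf_test:.4f} kW")
```

Under these assumptions, the RF testing value stays below the critical value, whereas the NRM value exceeds it, which is consistent with the NS entry in the table.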

Table 4. Comparative descriptive statistics of absolute residual errors (ARE) between the actual and predicted WTOP values of the soft-computing models for the testing stage.

Statistics	Set	Actual	NRM	RF	RT	REPT	ANN
Mean	TES	1000.9088	1004.2904	1001.2890	1001.7605	1001.1257	1075.6854
Mean	ARE	-	3.3816	0.3802	0.8517	0.2168	74.7765
Standard deviation	TES	660.8619	645.6553	659.7576	661.2278	660.0307	664.8670
Standard deviation	ARE	-	15.2066	1.1043	0.3659	0.8312	4.0051
Variance coefficient	TES	0.6603	0.6429	0.6589	0.6601	0.6593	0.6181
Variance coefficient	ARE	-	0.0174	0.0014	0.0002	0.0010	0.0422
Standard error of mean	TES	6.2899	6.1452	6.2794	6.2934	6.2820	6.3281
Standard error of mean	ARE	-	0.1447	0.0105	0.0035	0.0079	0.0381
Upper 95% CL of mean	TES	1013.2382	1016.3361	1013.5978	1014.0967	1013.4396	1088.0895
Upper 95% CL of mean	ARE	-	3.0979	0.3596	0.8585	0.2013	74.8513
Lower 95% CL of mean	TES	988.5795	992.2448	988.9803	989.4243	988.8118	1063.2813
Lower 95% CL of mean	ARE	-	3.6653	0.4008	0.8448	0.2323	74.7018
Geometric mean	TES	711.4340	780.0629	716.8255	715.5859	715.9597	832.8495
Geometric mean	ARE	-	68.6289	5.3915	4.1518	4.5257	121.4154
Harmonic mean	TES	234.2000	594.3000	379.8000	369.7000	377.2000	584.0000
Harmonic mean	ARE	-	360.1000	145.6000	135.5000	143.0000	349.8000
Quadratic mean (RMS)	TES	1199.0000	1194.0000	1199.0000	1200.0000	1199.0000	1265.0000
Quadratic mean (RMS)	ARE	-	5.0000	0.0000	1.0000	0.0000	66.0000
Skewness	TES	0.2757	0.3151	0.2694	0.2747	0.2694	0.2922
Skewness	ARE	-	0.0395	0.0063	0.0010	0.0063	0.0166
Kurtosis	TES	1.6690	1.5260	1.6594	1.6661	1.6599	1.6787
Kurtosis	ARE	-	0.1430	0.0096	0.0028	0.0091	0.0097
Maximum (Q₄)	TES	2031.9700	2420.7424	2001.4660	2013.0900	1999.4690	2191.4470
Maximum (Q₄)	ARE	-	388.7724	30.5040	18.8800	32.5010	159.4770
Upper quartile (Q₃)	TES	1660.2000	1668.0759	1660.5070	1659.9300	1660.4990	1723.1070
Upper quartile (Q₃)	ARE	-	7.8759	0.3070	0.2700	0.2990	62.9070
Median (Q₂)	TES	866.7900	825.8791	866.5470	864.4850	857.0380	937.2570
Median (Q₂)	ARE	-	40.9109	0.2430	2.3050	9.7520	70.4670
Lower quartile (Q₁)	TES	420.2900	367.6910	417.7880	414.7790	425.6550	492.9600
Lower quartile (Q₁)	ARE	-	52.5990	2.5020	5.5110	5.3650	72.6700
Minimum (Q₀)	TES	0.0900	219.2158	25.7500	23.9380	30.1930	72.2940
Minimum (Q₀)	ARE	-	219.1258	25.6600	23.8480	30.1030	72.2040
Range (Q₄–Q₀)	TES	2031.8800	2201.5266	1975.7160	1989.1520	1969.2760	2119.1530
Range (Q₄–Q₀)	ARE	-	169.6466	56.1640	42.7280	62.6040	87.2730
Interquartile range (IQR = Q₃–Q₁)	TES	1239.9100	1300.3849	1242.7190	1245.1510	1234.8440	1230.1470
Interquartile range (IQR = Q₃–Q₁)	ARE	-	60.4749	2.8090	5.2410	5.0660	9.7630
Centile 95	TES	2002.1000	1980.8696	1999.3650	1997.7430	1999.4690	2095.8200
Centile 95	ARE	-	21.2304	2.7350	4.3570	2.6310	93.7200
Centile 5	TES	108.8100	262.6021	106.5400	99.5670	99.3730	184.8770
Centile 5	ARE	-	153.7921	2.2700	9.2430	9.4370	76.0670

NRM: Nonlinear regression-based model; RF: Random forest model; RT: Random tree model; REPT: Reduced error pruning tree model; ANN: Artificial neural network model. All other abbreviations are defined under Table 1.

Table 5. Uncertainty estimation for the implemented nonlinear regression/decision tree/multilayer perceptron-based soft-computing approaches (boldface values show superior statistical outputs in the comparison of relevant datasets among themselves).

Statistics (kW)	Set	NRM	RF	RT	REPT	ANN
Expanded uncertainty (U₉₅)	TRA	8.0792	7.9948	7.9952	8.0010	8.0549
	TES	12.4583	12.3385	12.3524	12.3456	12.4224
	ALL	6.7790	6.7099	6.7124	6.7147	6.7588
Mean prediction error (e_m)	TRA	3.6799	0.0400	−4.32 × 10⁻⁵	2.37 × 10⁻⁶	74.3916
	TES	3.3816	0.3802	0.8517	0.2168	74.7765
	ALL	3.5904	0.1420	0.2555	0.0650	74.5071
Width of uncertainty band (±1.96 S_e)	TRA	±189.2290	±30.0702	±32.7019	±58.9710	±66.8745
	TES	±188.9289	±54.3319	±81.9278	±69.9075	±66.2187
	ALL	±189.1367	±38.9685	±52.5607	±62.4526	±66.6784
95% PEI (LL)	TRA	−185.5492	−30.0302	−32.7019	−58.9710	7.5171
	TES	−185.5473	−53.9517	−81.0761	−69.6907	8.5578
	ALL	−185.5463	−38.8265	−52.3052	−62.3875	7.8286
95% PEI (UL)	TRA	192.9089	30.1101	32.7018	58.9710	141.2661
	TES	192.3105	54.7121	82.7794	70.1244	140.9953
	ALL	192.7270	39.1105	52.8161	62.5176	141.1855

PEI: prediction error interval; LL: lower limit; UL: upper limit. All other abbreviations are defined in the main text and under the previous tables.
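
The prediction error intervals in Table 5 are consistent with LL = e_m − 1.96 S_e and UL = e_m + 1.96 S_e. The sketch below recomputes the testing-set limits from the tabulated mean errors and band half-widths. It is a minimal illustration; the function and variable names are hypothetical.

```python
def prediction_error_interval(mean_error, half_width):
    """Return (lower limit, upper limit) of the 95% prediction error interval."""
    return mean_error - half_width, mean_error + half_width


# Testing-set (e_m, 1.96 * S_e) pairs in kW, taken from Table 5.
table5_test = {
    "NRM": (3.3816, 188.9289),
    "RF": (0.3802, 54.3319),
    "RT": (0.8517, 81.9278),
    "REPT": (0.2168, 69.9075),
    "ANN": (74.7765, 66.2187),
}

intervals = {
    model: prediction_error_interval(e_m, width)
    for model, (e_m, width) in table5_test.items()
}

# Model with the narrowest uncertainty band on the testing set.
narrowest = min(table5_test, key=lambda model: table5_test[model][1])

for model, (lower, upper) in intervals.items():
    print(f"{model}: [{lower:.4f}, {upper:.4f}] kW")
print("Narrowest band:", narrowest)
```

The ANN interval lies entirely above zero, which reflects the large positive mean prediction error reported for that model.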

Table 6. Summary of the sensitivity analysis for the testing dataset of the best-performing approach.

Combination of Inputs ^a							Output	Statistical Indicators ^b
WS ^c (m/s)	WD (°)	AT (°C)	PA (°)	GT (°C)	RSG (rpm)	VN (V)	WTOP (kW)	R²	MAE	RMSE
OV	+	+	+	+	+	+	+	0.9974	19.1727	33.9919
+	OV	+	+	+	+	+	+	0.9980	17.7169	29.7665
+	+	OV	+	+	+	+	+	0.9978	18.8213	31.2336
+	+	+	OV	+	+	+	+	0.9980	17.6786	29.4051
+	+	+	+	OV	+	+	+	0.9982	17.1856	28.6774
+	+	+	+	+	OV	+	+	0.9968	23.2314	37.0061
+	+	+	+	+	+	OV	+	0.9982	16.8222	27.4775

^a The plus symbol (+) denotes that the relevant variable is included in the RF-based model. ^b The statistics of the most important input variable are displayed as boldface values; OV: omitted variable. ^c All other abbreviations are defined under Table 1.
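
The ranking implied by Table 6 can be made explicit by sorting the omitted variables by the testing RMSE obtained without them: the larger the error after omission, the more the RF-based model relies on that input. The sketch below uses only the tabulated values; the variable names are hypothetical.

```python
# Omitted variable -> (R², MAE in kW, RMSE in kW) for the RF testing set (Table 6).
omitted = {
    "WS": (0.9974, 19.1727, 33.9919),
    "WD": (0.9980, 17.7169, 29.7665),
    "AT": (0.9978, 18.8213, 31.2336),
    "PA": (0.9980, 17.6786, 29.4051),
    "GT": (0.9982, 17.1856, 28.6774),
    "RSG": (0.9968, 23.2314, 37.0061),
    "VN": (0.9982, 16.8222, 27.4775),
}

# Sort from most to least influential: highest RMSE after omission first.
ranking = sorted(omitted, key=lambda var: omitted[var][2], reverse=True)

for rank, var in enumerate(ranking, start=1):
    r2, mae, rmse = omitted[var]
    print(f"{rank}. {var}: RMSE = {rmse:.4f} kW, MAE = {mae:.4f} kW, R² = {r2:.4f}")
```

Omitting RSG gives the largest RMSE and MAE and the lowest R², in line with the finding that the generator's rotating speed is the most influential input.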

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).