Fuzzy First-Order Transition-Rules-Trained Hybrid Forecasting System for Short-Term Wind Speed Forecasts

: Due to the ever-increasing environmental pollution becoming progressively more serious, wind power has been widely used around the world in recent years. However, because of their randomness and intermittence, the accurate prediction of wind speeds is di ﬃ cult. To address this problem, this article proposes a hybrid system for short-wind-speed prediction. The system combines the autoregressive di ﬀ erential moving average (ARIMA) model with a three-layer feedforward neural network. An ARIMA model was employed to predict linear patterns in series, while a feedforward neural network was used to predict the nonlinear patterns in series. To improve accuracy of the predictions, the neural network models were trained by using two methods: ﬁrst-order transition rules and fuzzy ﬁrst-order transition rules. The Levenberg–Marquardt (LM) algorithm was applied to update the weight and deviation of each layer of neural network. The dominance matrix method was employed to calculate the weight of the hybrid system, which was used to establish the linear hybrid system. To evaluate the performance, three statistical indices were used: the mean square error (MSE), the root mean square error (RMSE) and the mean absolute percentage error (MAPE). A set of Lorenz-63 simulated values and two datasets collected from di ﬀ erent wind ﬁelds in Qilian County, Qinghai Province, China, were utilized as to perform a comparative study. The results show the following: (a) compared with the neural network trained by ﬁrst-order transition rules, the prediction accuracy of the neural network trained by the fuzzy ﬁrst-order transition rules was higher; (b) the proposed hybrid system attains superior performance compared with a single model; and (c) the proposed hybrid system balances the forecast accuracy and convergence speed simultaneously during forecasting. Therefore, it was feasible to apply the hybrid model to the prediction of real time-series.


Introduction
Wind speed forecasting is an extremely relevant metric for wind resource assessment (WRA) for wind parks in renewable sector. With environmental pollution becoming increasingly more serious, the development of green and environmentally safe renewable energy has become a research hotspot. As one form of green renewable energy, wind energy has been widely used around the world in recent years. The accurate prediction of wind speeds is an important component of effectively utilizing wind energy to generate electricity. However, due to the stochasticity and intermittence of wind speed [1][2][3][4], it is difficult to accurately predict the wind speed estimation errors and meet the actual requirements of wind power generation. Therefore, the accurate prediction of wind speed is of great significance.
Wind speed predictions can be divided into short, medium and long term predictions according to the sample time interval [5]. There are also various prediction models: physical models, statistical models and artificial intelligence models and hybrid methods [6]. Each model has its specialty: for example, physical models are suitable for long-term predictions, and statistical methods are applicable to short-term predictions [5]. With the development of artificial intelligence methods in recent years, machine learning methods have also been applied to wind speed prediction. The feedforward neural network is the most commonly used neural network, which due to its fast learning speed and good generalization performance. Various feedforward neural networks have been used in various scientific fields. Authors in [6] proposed training the support vector machine (SVR) and artificial neural network (ANN) models with different training sets for short-term wind speed prediction and compared their prediction results. Authors in [7] proposed the use of feedforward neural networks to predict stock index and sunspot movements. The results show that the method is effective at predicting stock and sunspot activity by improving the training set of the neural network. However, whether this method is suitable for short-term wind speed prediction is a topic for further research. In addition, the main problem with these models was that utilizing a single neural network to model and predict wind-speed data did not provide sufficient results [8][9][10][11][12].
As for hybrid methods, Authors in [13] the idea of combining predictions, which combines different prediction methods in an appropriate way. In recent decades, different models such as linear/nonlinear, supervised/unsupervised and statistical/intelligent have been combined to form various hybrid models, including the following: ARIMA-ANN [8,9] ARIMA-LSSVM [10], EMD-ANN [11], ARIAM-BP [12], ARIMA-Kalman [14], ARIMA-MLP and ARIMA-SVR [15]. Aasim [16] proposed an RWT-ARIMA model based on the continuous wavelet transform, which improved the accuracy of very-short-term wind speed predictions. Wang et.al [5] proposed a nonlinear combination model based on data feature extraction and multi-objective optimization for short-term wind speed predictions. The results show that the model has higher prediction accuracy and stability than the comparison model. The hybrid models proposed in the above literature [17][18][19][20][21][22][23][24] combine the single models through different methods, and the results verify that the combined prediction effect is better than that of the single model.
In recent years, the idea of a fuzzy set has been introduced into time-series prediction algorithms. Through the fuzzy preprocessing of original time-series, fuzzy relation are established and defuzzification is applied to improve the prediction accuracy. A large number of definitions and applications of fuzzy-time series are proposed in [25][26][27][28][29][30][31]. Jiang et.al [32] developed a mixed prediction system consisting of a data pretreatment module, an optimization module and a prediction module. The multi-objective differential evolutionary algorithm is used to optimize the fuzzy-time series to balance the prediction accuracy and stability. Although the fuzzy-time-series (FTS) methods has been widely used in other scientific fields, to the best of our knowledge, it is rarely used to predict wind speed. Therefore, we apply the fuzzy set idea to wind-speed data prediction, which is a great contribution to the application of fuzzy-time series to meteorology. Therefore, in order to solve the above problems to improve the accuracy of time-series prediction, we propose a new hybrid model in this study, which is indicated to enhance the forecast accuracy. It combines the ARIMA model, the three-layer feedforward neural network and fuzzy sets. This hybrid system uses the ARIMA and feedforward neural network to predict the original series and utilizes the dominance matrix to optimize the weight of the single model to maximize the accuracy of the hybrid model prediction. Additionally, we evaluate it with a set of Lorenz-63 and two sets of actual wind-speed data and analyze the results by using three evaluation indices. All of these approaches confirm that the novel hybrid model can be used to predict time series. The innovations of our study are as following: First, the overwhelming majority of existing studies used ARIMA methods to preprocess the original time-series, and they then used other intelligent models to model and predict the residuals of the original time-series. In this study, the ARIMA is employed to model and predict the original time-series, and then the neural network is employed to process the original time-series, forming a Energies 2020, 13, 3332 3 of 18 parallel linear-nonlinear hybrid model. Additionally, the neural network model in this study can be used as a preprocessing method for the original time-series and can also be used as a postprocessing method to deal with the errors.
Second, to validate the effectiveness of the framework of the proposed ARIMA-NN-FFOTR model, another hybrid model is proposed based on the framework: the ARIMA-NN-FOTR. Furthermore, this study reports the results of a comparative study on single models and the proposed model, including the following: the ARIMA, the NN-FOTR and the NN-FFOTR. Additionally, the three-layer feedforward neural network is employed, which has many advantages over other neural networks, the most important of which are its extremely short training time and fast convergence speed.
Third, in existing studies, most fuzzy-time series are mainly used for enrollment rate or stock index predictions. In this article, the fuzzy-time-series method is applied to wind speed predictions, which is a great contribution to the application of fuzzy-time series to meteorology. Structure of our proposed hybrid forecasting system is described in Figure 1.
The rest of this study is organized as follows: Section 2 describes the methods of the proposed hybrid system. Section 3 shows the experimental set up and the results of the proposed system. Section 4 presents relevant aspects of the proposed system. Finally, Section 5 contains the concluding remarks and the future work.  The rest of this study is organized as follows: Section 2 describes the methods of the proposed hybrid system. Section 3 shows the experimental set up and the results of the proposed system. Section 4 presents relevant aspects of the proposed system. Finally, section 5 contains the concluding remarks and the future work.

Background Theories
In this section, we summarize some background theories which are relative to our hybrid method in detail.

Fuzzy Sets
A fuzzy set is different from a regular set, in that the elements of the fuzzy set belong to a certain membership value in the range of the closed interval [0, 1]. The definition of the fuzzy set is defined as follows Section 2.1.1.

Definition
Given a set U that is the "universe of discourse", a fuzzy set in the universe U is defined as a set of 2-tuples [7], as show below where x is an element of universe U and µ A (x) ∈ [0, 1] denotes the membership value of element x in fuzzy set A. If the universe of discourse U is continuous and infinite, it is not possible to define a set of discrete 2-tuples as given in Equation (1) for fuzzy set A. In such cases, a fuzzy membership function µ A : U → [0, 1] is defined to map each element in the universe of discourse to its corresponding membership value in fuzzy set A.

Definition
Given a time series C, let c max and c min represent the maximum and minimum values of the time-series, respectively. We define partitioning as the act of dividing the range into h non-overlapping contiguous Intervals such that the following two conditions jointly hold [7]: Let a partition P i = P − i , P + i be defined as a bounded set, where P − i and P + i represent the lower and upper bounds of the partition, respectively. Clearly, a time-series data point c(t) belongs to the partition P i if and only if P − i ≤ c(t) ≤ P + i .

Definition
Let c(t) and c(t + 1) denote two consecutive data points in a time-series C and let P i and P j be the partitions to which these data points belong. Then, we denote P i → P j as a first-order [7].

Basis Function (RBF) Networks
The radial basis function (RBF) network is a three-layer (input, hidden layer and output) feedforward neural network, and its input consists of a source node that connects the network to the outside world. Its hidden layer uses a nonlinear radial basis function as the activation function. The output of the network is typically a linear combination of hidden layer functional values. Let the network output be a vector → Y = y 1 , y 2 , . . . y k . There are n hidden layer neurons in the network, and the output of the RBF network is as follows: → C j represents the center vector of the jth hidden layer neuron, is the Euclidean norm, φ is the activation function, and q is the number of hidden layer unit activation functions. There are a variety Energies 2020, 13, 3332 6 of 18 of hidden layer activation functions and the most commonly used radial basis function is the Gaussian function [33]. It is defined as follows: where β is a positive integer. Since the activation function depends on the distance between the input vector and the center vector of the hidden-layer neuron, the function is radially symmetric with the central vector of the neuron. RBF neurons are used as pre-selectors to determine when the neural network triggers based on the pre-trained neural network transition rules. The linear weighted sum is used to convert the weighted sum of the output layers to the output of the neural network and the activity of the ith unit in the output layer can be calculated according to the following formula: where w 0 is the bias term, w pr is the connecting weight between the pth hidden unit and the rth output unit, w pr is calculated by using the Levenberg-Marquardt (LM) back-propagation algorithm [34], Y is the response of the qth hidden unit resulting from all input data and N is the number of output units.

First-Order Transition Rules Based on the Neural Network Model
In this section, the neural networks are trained by using first-order transition rules [7] for a given time series. S i is a set of all first-order transition rules and p P j /P i indicates the probability that the next partition is P j when the current partition is P i . Then the formula for p P j /P i F is as follows: where count P i → P j is the sum of the number of occurrences of the transition rule P i → P j . The set S i can be expressed as follows: Given the current partition, in order to predict the next partition, consider the weighted contributions of all transition rules with P i as an antecedent, which can be achieved by designing a neural network model. This model allows all rules in the set S i to be triggered at the same time. To implement the model, a set of neural networks is used that satisfy the following conditions. Condition 1: Given the current input partition P i , each neural network in the ensemble P i can trigger at most one transition rule with P i in the antecedent.
Condition 2: All transition rules with partitions P i in the antecedent must be triggered by the ensemble. In other words, all rules contained in ensemble S i must be triggered.
Condition 3: There are no two neural networks in ensemble that can trigger the same transition rules, such as: P i → P a and P i → P b simultaneously triggering P a → P b .
The above conditions can be satisfied by constraining the training set of the neural network. See the relevant theorems from these conditions [7].
The steps of grouping the set transition rules into training sets are as follows: Step 1: Partition. Divide set S into a subset or group of transition rules. All rules in a group have the same partition in the antecedent, and the antecedent contains partition P i . The number of these subsets is equal to the total number of different partitions that occurred before the transition rules. The choice of the number of partitions will affect prediction accuracy [25], and the number of partitions of the model is set to 20 (multiple trials). The partition diagram of Lorenz-63 is shown in Figure 2. Step 1: Partition. Divide set S into a subset or group of transition rules. All rules in a group have the same partition in the antecedent, and the antecedent contains partition i P . The number of these subsets is equal to the total number of different partitions that occurred before the transition rules. The choice of the number of partitions will affect prediction accuracy [25], and the number of partitions of the model is set to 20 (multiple trials). The partition diagram of Lorenz-63 is shown in Figure 2  Step 2: Construct a training set. The transition rules in each subset are ordered. By collecting the transition rules of the position in each subset, the training set of the ith neural network in the set is constructed. By repeating this process, multiple training sets are constructed. These training sets contain transition rules, and the antecedent and consequence of each rule represent a partition label. The regression neural network model is used to obtain the time-series data points that are input as the current time-series data points and output as the future time-series data points. Therefore, the label of each partition is replaced by the corresponding partition median value to get the modified training set propagation algorithm is used to train the neural network.
Step 3: Neural network training. The first-order transition rules training set constructed in step 2 is used to train the neural network.
Step 4: Predicting the next data point. In the prediction phase, assuming that the partition containing the current time-series data points (system inputs) is i P , it may happen that the neural network in the ensemble has not been trained according to the rules containing i P in the antecedent, which will lead to an approximation error. Therefore, the RBF is used as the pre-selector to trigger the appropriate neural network in the given input set. The output i ρ of the ith hidden layer neuron is as follows: where x is the input to the RBF neuron, i m is the median of the ith partition and  Step 2: Construct a training set. The transition rules in each subset are ordered. By collecting the transition rules of the position in each subset, the training set of the ith neural network in the set is constructed. By repeating this process, multiple training sets are constructed. These training sets contain transition rules, and the antecedent and consequence of each rule represent a partition label. The regression neural network model is used to obtain the time-series data points that are input as the current time-series data points and output as the future time-series data points. Therefore, the label of each partition is replaced by the corresponding partition median value to get the modified training set T = m 1q , m 2q , . . . m iq , m iq q ∈ { 1, 2, ...20} . The Levenberg-Marquardt (LM) back propagation algorithm is used to train the neural network.
Step 3: Neural network training. The first-order transition rules training set constructed in step 2 is used to train the neural network.
Step 4: Predicting the next data point. In the prediction phase, assuming that the partition containing the current time-series data points (system inputs) is P i , it may happen that the neural network in the ensemble has not been trained according to the rules containing P i in the antecedent, which will lead to an approximation error. Therefore, the RBF is used as the pre-selector to trigger the appropriate neural network in the given input set. The output ρ i of the ith hidden layer neuron is as follows: where x is the input to the RBF neuron, m i is the median of the ith partition and β = 1. If the current partition P i exists in the antecedent of any training set rules of neural network NN j , neural network NN j is enabled. The enable signal of NN j is obtained by the logical OR operation of the RBF neuron output. Given the current data point c(t), the final prediction c (t+1) is calculated using the weighted sum of the single output of the neural network triggered by the pre-selector RBF neurons. Suppose there are υ elected neurons whose outputs o 1 , o 2 , . . . o υ are located in partitions p o 1 , p o 2 , . . . p o υ , respectively, and the final output is as follows:

Fuzzy First-Order Transition Rules Based on the Neural Network Model
In this section, the first-order transition rules are fuzzified to get the fuzzy first-order transition rules to build the training set and then train the neural network [7]. It should be noted that the best way to represent a time-series partition is by using the center of the partition, but the data points of time series cannot be completely represented by the partition. Therefore, by approximating the data points in the partition and their corresponding midpoints, some approximate errors that increase with the Energies 2020, 13, 3332 8 of 18 width of the partition itself can be deliberately studied [10]. One way to avoid this error is to treat each partition as a fuzzy set. Obviously, each partition is associated with a fuzzy membership function, and each time-series data point will have some membership values in the set [0, 1]. The classical Gaussian function is chosen as the membership function, and its peak corresponds to the center of the partition. Near the boundary of the partition, both sides of the peak have values that gradually decrease and approach 0. The membership value at the partition boundary is not necessarily 0. These membership values can be used to train the neural network in the model. The advantage of this approach is that it takes advantage of the inherent fuzziness involved in assigning partitions to time-series data points and uses partitions to obtain better results. The membership function corresponding to the ith partition is as calculated as follows: Then, we train the neural network using the improved fuzzy-first-order-transition rules, as follows.
Step 1: Partition. Set the number of partitions to 40 (multiple trials) to get the best prediction results.
Step 2: Construct the first-order transition rules and training set. The training set T constructed by the fuzzified first-order transition rules is T = µ 1 m iq , µ 2 m iq , . . . µ k m iq , m jq q ∈ { 1, 2, ...40} where m iq and m jq are the median values corresponding to the partitions P iq and P jq , respectively and µ i (x) is the membership value of x in the fuzzy set.
Step 3: Neural network training. The fuzzy-first-order-transition-rules training set constructed in step 2 is used to train the neural network.
Step 4: Predicting the next data point. This is the same as the first-order-transition-rules-based model discussed in Section 2.4.

Determination of the Weights of the Proposed Model
The general form of a hybrid model uses the weighted sum of each single prediction model. Therefore, the key point of the hybrid model is to determine the weight coefficients. If the weight coefficients of each single prediction model are properly assigned, the prediction accuracy of the whole hybrid model will be improved accordingly. However, when using the hybrid model, how to determine the weight coefficients of the single models is a big problem. In this regard, many scholars have proposed their own methods for determining the weights. The current commonly used methods are the arithmetic average method, the optimal weight method, the variance reciprocal method, etc. In this study, we use the dominance matrix method to determine the prediction weights of two single models. Let the ARIMA model be model 1 and the neural network model be model 2. By calculating the relative errors of two single-model predictions, we determine the size of the relative error of two single models. For example, when the relative error of the predicted value at a certain point in model 1 is less than the relative error of the predicted value at the same point in model 2. It shows that the prediction effect of model 1 is better than model 2. We judge the relative error of 200 predicted values of two single models by analogy, it can be obtained that the prediction effect of model 1 is better than model 2 m 1 times, the prediction effect of model 2 is better than model 1 m 2 times, and the total number of predicted steps is n. Then, the weights are w 1 = m 1 n and w 2 = m 2 n respectively.

Proposed Model Prediction
After determining the weights, the predicted value of the hybrid model can be obtained by the formula y = w 1 x 1 + w 2 x 2 . w 1 is the weight of the ARIMA model, w 2 is the weight of the neural network model, x 1 is the predicted value of the ARIMA model and x 2 is the predicted value of the neural network model. The weights of the new proposed model for the three time-series data sets are shown in Table 1 below.

Experimental Data
In this study, the experimental data were the simulated values generated by the Lorenz-63 model and the wind-speed data of two sites: Yakou station and Arou station that were located in the Qilian

Evaluation Indices
To evaluate the prediction accuracy, we use three well-known error metrics, which are the mean square error (MSE), the root mean square error (RMSE) and the mean absolute percentage error (MAPE). Let be the time-series value of the test cycle at time t, and ( ) ' c t is the predicted time-series value at the same time. The calculation formulas of the error metrics mentioned above are defined as Equations (12), (13) and (14), respectively, and the results of neural network are shown in Table 2 ( ) ( ) ( )

Evaluation Indices
To evaluate the prediction accuracy, we use three well-known error metrics, which are the mean square error (MSE), the root mean square error (RMSE) and the mean absolute percentage error (MAPE). Let be the time-series value of the test cycle at time t, and c (t) is the predicted time-series value at the same time. The calculation formulas of the error metrics mentioned above are defined as Equations (12), (13) and (14), respectively, and the results of neural network are shown in Table 2 (14) where N is the total number of test phase time-series data points and c(t) is the average of N data points.

Experiment I: Compare the Prediction Results of the Neural Networks Trained by Two Training Sets
The main purpose of Experiment I was to compare the prediction results of the first-order-transitionrules-trained neural network model with those of the fuzzy-first-order-transition-rules-trained neural network model. The neural network models used in this study were all three-layer-feedforward neural networks. For the neural network, the relevant parameters were set as follows Table 3 below. From Figures 4-6, it could be seen that the prediction effect of the neural network model trained by the fuzzy-first-order transition rules was better than that trained by the first-order transition rules. The first-order transition rules replaces the complete data band with the median value of the partition, which resulted in an approximation error. The first-order transition rules were fuzzified and replaced the antecedent of the partition by using the fuzzy membership value. The neural network with the large weight value was triggered according to the size of the membership value, which avoided triggering an unnecessary neural network and improved the speed and prediction accuracy of the network convergence. Table 3. Experimental parameters setting in the neural network model. model trained by the fuzzy-first-order transition rules was better than that trained by the first-order transition rules. The first-order transition rules replaces the complete data band with the median value of the partition, which resulted in an approximation error. The first-order transition rules were fuzzified and replaced the antecedent of the partition by using the fuzzy membership value. The neural network with the large weight value was triggered according to the size of the membership value, which avoided triggering an unnecessary neural network and improved the speed and prediction accuracy of the network convergence.  transition rules. The first-order transition rules replaces the complete data band with the median value of the partition, which resulted in an approximation error. The first-order transition rules were fuzzified and replaced the antecedent of the partition by using the fuzzy membership value. The neural network with the large weight value was triggered according to the size of the membership value, which avoided triggering an unnecessary neural network and improved the speed and prediction accuracy of the network convergence.   Table 3. Experimental parameters setting in the neural network model.   Table 3. Experimental parameters setting in the neural network model. Energies 2020, 13, 3332 12 of 18

Training Rules Experimental Parameters Value
The Lorenz system is a set of three dimensional first-order ordinary differential equations set up by the American meteorologist Lorenz to simulate the atmospheric circulation model. The expression is as follows: where x, y and z are state variables of Lorenz-63 system. We use a step size of 0.01 and use the Runge-Kutta method to generate 2000 sample points. Then, the neural networks trained using the first-order transition rules and the fuzzy first-order transition rules are, respectively constructed. Moreover, their prediction performances are compared.

Experiment II: Compare the Prediction Results of Hybrid Models and Single Models
The main purpose of Experiment II was to compare the prediction effects of single models and hybrid models. Three sets of time-series data were modeled by the ARIMA model. As seen from Figure 7, the first 1800 series of the two sets of wind-speed data satisfied the AR or ARMA model. Further, the ACF had a trailing character, and the PACF had a decreasing and oscillating phenomenon. The same was true for the ACF and PACF with the theoretical value series of Lorenz-63. Table 4 showed the determined model order. Figure 8 shows the ACF and PACF of the residual series after two sets of original wind-speed data after modeling. It can be seen that the ACF and PACF of the residual series were both within the 95% confidence interval. Similarly, the ACF and PACF of residual series of Lorenz-63 were the same. Thus, the residual series was white noise, and the ARIMA model was reasonable and could be predicted. According to Figure Figure 11 show that the prediction effect of the Arou site was the same. The prediction and evaluation indices of the three groups of data were shown in Table 5.

Discussion
Aiming at the problems and shortcomings of the existing time-series prediction models, this study proposes a comparative study of two hybrid prediction models based on the ARIMA and a Energies 2020, 13, 3332 16 of 18 neural network. Each model obtains a better prediction with respect to the Lorenz theoretical value and wind-speed data. Based on the existing work, we can conduct in-depth discussions from the following aspects.
(1) Effectiveness of the new algorithm This study proposes a new linear-nonlinear parallel combination model. For the theoretical and actual wind-speed data, the combined ARIMA-NN-FFOTR model proposed in this study has higher prediction accuracy than any single model. This study uses a three-layer feedforward neural network with a single input and a single output. The resulting training time is short, and the convergence rate of the whole prediction system is effectively improved. The effectiveness of the new algorithm is verified by comparative experiments. The results show that the hybrid model effectively combines the advantages of a single model to make up for the shortcomings of a single model prediction, and it improves the prediction accuracy and convergence speed of the hybrid model.
(2) Robustness of the fuzzy-time series This study applies fuzzy-time series to wind-speed data. The original wind-speed data are partitioned and then fuzzified, which will describe the distribution of the membership values of the data points in the partition, and the comparison experiment verifies the robustness of the neural network prediction by using fuzzy set training. Based on the existing research work, the robustness of the fuzzy-time series in real-time-series predictions is further proved.
(3) Applicability in real-time series In this study, the proposed model is verified by using the Lorenz-63 theoretical value and actual wind-speed data, which shows the validity of this model. The model can be applied to any real-time-series prediction.

Conclusions
This work proposes a hybrid system that performs time-series forecasting in three steps: The linear component of time-series modeling using ARIMA model; the nonlinear component of time-series forecasting with feedforward neural network model; and linear combination of the forecasts of ARIMA and feedforward neural network using the dominant matrix theory. To maximize the accuracy, the feedforward neural network is trained by first-order transition rules and fuzzy first-order transition rules. Generating two hybrid systems: ARIMA-NN-FOTR and ARIMA-NN-FFOTR.
Using three evaluation metrics to evaluate experiments, the results show that the ARIMA-NN-FFOTR attains a better performance than single models and ARIMA-NN-FOTR model. The ARIMA-NN-FFOTR reaches a higher accuracy because it is able to forecast separately the linear and nonlinear patterns of the series through ARIMA and NN-FFOTR model; compare with the ARIMA-NN-FOTR model, after the input partition Gaussian of the feedforward neural network is fuzzified, the contribution of the weighted data points in the partition is effectively improved.
In future work, we aim to develop a method to calculate the optimal weight. Additionally, this study considers only short-term wind-speed predictions. Therefore, future studies should evaluate the performance of the developed method for mid-term and long-term wind speed predictions.