To better illustrate the forecasting performance of the combined forecasting model ELM–Elman–LSTM, we compare its forecasting results with those of the three single models (ELM, Elman, and LSTM) using four evaluation indices (MSE, MAE, MAPE, and R-squared).
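For reference, the four evaluation indices can be computed as in the following minimal Python sketch. It assumes the standard textbook definitions of the MSE, MAE, MAPE, and coefficient of determination; the function name `evaluation_indices` and the sample values are illustrative assumptions and are not the code or data used in this study.

```python
import numpy as np

def evaluation_indices(actual, forecast):
    """Return MSE, MAE, MAPE (in percent), and R-squared for one forecast series."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    error = actual - forecast
    mse = np.mean(error ** 2)
    mae = np.mean(np.abs(error))
    # MAPE is undefined when an actual value is zero; the wind speeds
    # are assumed to be strictly positive here.
    mape = 100.0 * np.mean(np.abs(error / actual))
    # R-squared: one minus the ratio of residual to total sum of squares.
    ss_res = np.sum(error ** 2)
    ss_tot = np.sum((actual - actual.mean()) ** 2)
    r_squared = 1.0 - ss_res / ss_tot
    return {"MSE": mse, "MAE": mae, "MAPE": mape, "R2": r_squared}

# Illustrative values only (hypothetical wind speeds in m/s).
actual = [4.0, 5.0, 6.0, 5.0]
forecast = [4.2, 4.8, 6.3, 5.1]
indices = evaluation_indices(actual, forecast)
print(indices)
```

Smaller MSE, MAE, and MAPE values and an R-squared value closer to 1 indicate a better forecast, which is how the tables in this section are read.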
5.1. Forecasting Results of Individual Models
For the single models, we use the MAE, MSE, and MAPE as evaluation indices. The number of hidden layers and the number of neurons in each hidden layer strongly affect the forecasting results. The ELM network has only one hidden layer. For the data selected in this paper, the ELM network with 20 hidden-layer neurons forecasts better than the ELM network with 10 hidden-layer neurons. The forecasting results of the ELM network are shown in Figure 6, and Table 2 lists the evaluation indices of the ELM network.
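The role of the hidden-layer size in a single-hidden-layer ELM can be sketched as follows. This is a generic ELM under stated assumptions (random input weights, a tanh activation, output weights obtained by a pseudo-inverse, and a six-step input window) applied to a synthetic series; the helper names, the activation, the window length, and the data are all hypothetical and do not reproduce the configuration or the results of this paper.

```python
import numpy as np

def train_elm(X, y, n_hidden, rng):
    """Fit an ELM: random fixed hidden weights, least-squares output weights."""
    W = rng.normal(size=(X.shape[1], n_hidden))  # input weights (never trained)
    b = rng.normal(size=n_hidden)                # hidden biases (never trained)
    H = np.tanh(X @ W + b)                       # hidden-layer outputs
    beta = np.linalg.pinv(H) @ y                 # output weights via pseudo-inverse
    return W, b, beta

def predict_elm(X, model):
    W, b, beta = model
    return np.tanh(X @ W + b) @ beta

def make_windows(series, lag):
    """Turn a series into (lagged inputs, next value) pairs."""
    X = np.array([series[i:i + lag] for i in range(len(series) - lag)])
    y = series[lag:]
    return X, y

rng = np.random.default_rng(0)
t = np.arange(400)
# Synthetic stand-in for a denoised wind speed series (assumption).
series = 5.0 + np.sin(t / 10.0) + 0.1 * rng.normal(size=t.size)
series = (series - series.mean()) / series.std()  # normalize before training

X, y = make_windows(series, lag=6)
X_train, y_train = X[:300], y[:300]
X_test, y_test = X[300:], y[300:]

results = {}
for n_hidden in (10, 20):
    model = train_elm(X_train, y_train, n_hidden, rng)
    train_mse = np.mean((predict_elm(X_train, model) - y_train) ** 2)
    test_mse = np.mean((predict_elm(X_test, model) - y_test) ** 2)
    results[n_hidden] = {"train_mse": train_mse, "test_mse": test_mse}
    print(n_hidden, train_mse, test_mse)
```

Because the hidden weights are random, which hidden-layer size performs better depends on the data and the random seed, so the size is best chosen by comparing errors on held-out data.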
First, we compared the forecasting results of different datasets at the same height. At the height of 20 m, the ELM network performed best on Tuesday, when all three evaluation indices reached their minimum values: the MSE was 0.1773, the MAE was 0.3475, and the MAPE was 9.89%. At the heights of 50 m and 80 m, the MSE and MAE were smallest on Tuesday, while the MAPE was smallest on Sunday. For example, at the height of 50 m, the MSE and MAE on Tuesday were 0.1914 and 0.3387, respectively, while the smallest MAPE was 7.03%.
Second, we compared the forecasting results at different heights within the same dataset. For all datasets, the MAPE decreases as the height increases, although not by a regular amount. For example, in the forecasting results on Tuesday, the MAPE values were 9.89%, 7.68%, and 7.51% at the heights of 20, 50, and 80 m, respectively. In contrast, there is no particular relationship between height and the values of the MSE and MAE. For example, in the forecasting results on Tuesday, the MSE and MAE values were 0.1773 and 0.3475 at the height of 20 m, 0.1914 and 0.3387 at the height of 50 m, and 0.1603 and 0.3222 at the height of 80 m.
In addition, the forecasting results at the height of 80 m on Sunday were the best, with a MAPE of 6.34%.
When using the Elman neural network to forecast the wind speed, the same attention should be paid to the selection of hidden layer nodes. The results are shown in Figure 7, and Table 3 lists the evaluation indices of the Elman neural network.
First, we compared the forecasting results of different datasets at the same height. At the height of 20 m, the Elman neural network had the smallest MSE and MAPE on Tuesday (0.2540 and 11.89%, respectively), while the MAE of 0.4179 was higher than that on Wednesday. At the height of 50 m, all three evaluation indices were smallest on Tuesday: the MSE was 0.1936, the MAE was 0.3315, and the MAPE was 7.46%. At the height of 80 m, the MSE and MAPE were smallest on Tuesday, while the MAE was higher than that on Monday.
Second, we compared the forecasting results at different heights within the same dataset. The three evaluation indices do not change according to any definite rule. For example, on Monday, the MSE and MAPE decreased as the height increased, while the MAE did not follow the same pattern. On Wednesday and Sunday, the MSE and MAE increased as the height decreased, but the MAPE did not change in the same way. In addition, on Monday and Wednesday, the MAPE decreased as the height increased. In short, when the Elman neural network is used for wind speed forecasting at different heights, the MAPE does not necessarily decrease as the height increases.
In addition, the forecasting results at the height of 50 m on Tuesday were the best, with a MAPE of 7.46%.
The wind speed data after noise reduction were entered into the LSTM network, and the forecasting results are shown in Figure 8 and Table 4. From the figure and table, we can draw the following conclusions.
First, we compared the forecasting results of different datasets at the same height. At the height of 20 m, the LSTM network performed best on Tuesday, with a MAPE of 10.87% and an MSE of 0.2336, the smallest MSE among all the forecasting results. At the heights of 50 m and 80 m, the MSE and MAE were smallest on Wednesday, but the MAPE was relatively large on that day. For example, at the height of 50 m, the MAPE was 16.26% on Wednesday, compared with 8.94% on Tuesday.
Second, comparing the forecasting results at different heights within the same dataset shows that, across the three evaluation indices, the LSTM forecasts follow no specific rule with respect to height. However, in terms of the MAPE alone, the smallest value occurred at the height of 50 m on three of the four days, the exception being Wednesday. This suggests that the LSTM network has some advantage for wind speed forecasting at the height of 50 m.
5.2. Forecasting Results of ELM–Elman–LSTM
The forecasting results of the ELM, Elman, and LSTM networks were combined using weights obtained by the variance reciprocal weighting method, and the weight coefficients were then optimized by the SCO algorithm.
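The initial weighting step can be illustrated with a minimal Python sketch. It assumes the common form of the variance reciprocal method, in which each model's weight is proportional to the reciprocal of its mean squared forecasting error and the weights are normalized to sum to 1. The function names and the sample forecasts are hypothetical, and the subsequent optimization of the weight coefficients is not shown.

```python
import numpy as np

def variance_reciprocal_weights(actual, forecasts):
    """Weights proportional to 1 / (error variance) of each model, summing to 1.

    `forecasts` has shape (n_models, n_samples). The error variance is taken
    here as the mean squared error of each model (an assumption).
    """
    actual = np.asarray(actual, dtype=float)
    forecasts = np.asarray(forecasts, dtype=float)
    error_variance = np.mean((forecasts - actual) ** 2, axis=1)
    reciprocal = 1.0 / error_variance
    return reciprocal / reciprocal.sum()

def combine(forecasts, weights):
    """Weighted sum of the individual forecasts at each time step."""
    return np.asarray(weights, dtype=float) @ np.asarray(forecasts, dtype=float)

# Illustrative values only (hypothetical wind speeds in m/s).
actual = np.array([4.0, 5.0, 6.0, 5.0])
forecasts = np.array([
    [4.2, 4.8, 6.3, 5.1],  # stand-in for the ELM forecast
    [4.5, 5.4, 5.5, 5.3],  # stand-in for the Elman forecast
    [3.7, 5.2, 6.4, 4.6],  # stand-in for the LSTM forecast
])
weights = variance_reciprocal_weights(actual, forecasts)
combined = combine(forecasts, weights)
print(weights, combined)
```

With this scheme, the model with the smallest error receives the largest weight, so the combined forecast leans toward the most accurate single model while still using the other two.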
Figure 9 shows the forecasting results of the ELM–Elman–LSTM model. The forecasting curve of the combined model ELM–Elman–LSTM closely follows the actual wind speed curve. The agreement is especially close at the heights of 50 m and 80 m on Sunday. In comparison, the fit between the forecasting curve and the actual wind speed curve at the height of 20 m is not as good as that at the other two heights.
To further demonstrate the forecasting performance of the combined model, the ELM–Elman–LSTM model was compared to the three single models (ELM, Elman, and LSTM). Table 5 records the three evaluation indices (MSE, MAE, and MAPE), and Table 6 records the R-squared values of the four models.
First, we made a simple comparison of the three individual models. The MAE and MSE of the ELM network were higher than those of the other two models at the height of 20 m on Wednesday and lower than those of the other two models in the remaining datasets. The MAPE of the ELM network was higher than that of the Elman network at the height of 50 m on Wednesday and at the height of 20 m on Sunday, and higher than those of both other models at the height of 20 m on Wednesday. Except for these cases, the three evaluation indices of the ELM network were smaller than those of the other two models. On this basis, the ELM network has better forecasting performance than the Elman and LSTM networks. In addition, the evaluation indices of the Elman and LSTM networks fluctuate across datasets, and each of the two models has an advantage on different datasets. Next, we compared the combined model with the single models.
We compared the forecasting results of the ELM–Elman–LSTM model with those of the ELM neural network. The MSE of the ELM–Elman–LSTM model was higher than that of the ELM network at the height of 50 m on Wednesday and Monday. For example, for the wind speed at the height of 50 m on Wednesday, the MSE was 0.2600 for the ELM–Elman–LSTM model, while the MSE of the ELM network was 0.2587. Except for these cases, the values of the three evaluation indices for the remaining datasets were smaller than those of the ELM network. For example, for the wind speed data at the height of 20 m on Monday, the MAPE of the ELM network was 14.68%, and the MAPE of the ELM–Elman–LSTM model was 13.46%, a decrease of 1.22 percentage points.
We compared the forecasting results of the ELM–Elman–LSTM model with those of the Elman neural network. The values of all three evaluation indices of the ELM–Elman–LSTM model were smaller than those of the Elman network, which means that the forecasting performance of the ELM–Elman–LSTM model is better than that of the Elman neural network. For example, for the wind speed data at the height of 20 m on Tuesday, the Elman neural network gave an MSE of 0.2540, an MAE of 0.4179, and a MAPE of 11.89%. In contrast, the ELM–Elman–LSTM model gave an MSE of 0.1618, an MAE of 0.3140, and a MAPE of 8.63%, a decrease in the MAPE of 3.26 percentage points.
We compared the forecasting results of the ELM–Elman–LSTM model with those of the LSTM neural network. The values of the three evaluation indices of the ELM–Elman–LSTM model were smaller than those of the LSTM network for all datasets. The reduction is especially large in terms of the MAPE. For example, at the height of 80 m on Sunday, the MAPE of the LSTM network was 11.69%, whereas the MAPE of the ELM–Elman–LSTM model was 5.96%, a decrease of 5.73 percentage points.
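The MAPE decreases quoted in the three comparisons above are absolute differences between two percentages. The short Python sketch below recomputes them from the MAPE values reported in the text and also expresses each decrease relative to the single model's MAPE; the relative figures are an additional illustration and are not reported in the tables.

```python
# MAPE values (in percent) quoted in the text: (single model, combined model).
comparisons = {
    "ELM, 20 m, Monday": (14.68, 13.46),
    "Elman, 20 m, Tuesday": (11.89, 8.63),
    "LSTM, 80 m, Sunday": (11.69, 5.96),
}

reductions = {}
for label, (single_mape, combined_mape) in comparisons.items():
    absolute = single_mape - combined_mape       # in percentage points
    relative = 100.0 * absolute / single_mape    # in percent of the single model's MAPE
    reductions[label] = (absolute, relative)
    print(f"{label}: {absolute:.2f} percentage points ({relative:.1f}% relative)")
```

Reading the decreases as percentage points avoids confusing, for example, the 5.73-point drop for the LSTM comparison with a 5.73% relative improvement.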
In conclusion, when the MSE, MAE, and MAPE are used to evaluate the models, the forecasting performance of the ELM–Elman–LSTM model for wind speed at different heights is generally superior to that of the three single forecasting models. According to Table 6 and Figure 10, for the wind speed at the height of 50 m on Sunday, the R-squared value of the ELM–Elman–LSTM model is smaller than that of the ELM network. For the remaining datasets, the R-squared value of the ELM–Elman–LSTM model is greater than those of the three individual models. Moreover, the R-squared value of the combined model ELM–Elman–LSTM is close to 1 for most datasets. For example, for the data at the height of 80 m on Tuesday, the R-squared value of the ELM–Elman–LSTM model reached 0.9388. For the data at the height of 20 m on Wednesday, however, none of the four methods achieved an R-squared value close to 1; the highest value was 0.5447.
Next, we compare the forecasting results of the ELM–Elman–LSTM model on wind speed data at different heights. Figure 11 shows the R-squared and MAPE values of the combined model ELM–Elman–LSTM at different heights. According to Figure 11 and Table 6, the R-squared value of the wind speed data at the height of 80 m is the largest among the three heights and is also the closest to 1, with a maximum value of 0.9388. The MAPE at the height of 80 m is greater than that at 50 m only on Tuesday. Therefore, the ELM–Elman–LSTM model has the best forecasting performance for the wind speed at the height of 80 m. In addition, if the model is evaluated only in terms of R-squared, the only low value at the height of 20 m occurs on Wednesday, at 0.5447. For the rest of the datasets, the forecasting performance at the height of 20 m is as good as that at the height of 50 m. However, if the model is evaluated in terms of the MAPE, the forecasting performance at the height of 50 m is better than that at 20 m.
In short, the forecasting results show that the ELM–Elman–LSTM model improves forecasting accuracy compared to a single forecasting model and obtains better forecasting results for wind speed data at different heights. Notably, the forecasting results at the height of 80 m were the best among the three heights.