A Novel Ultra-Short-Term PV Power Forecasting Method Based on DBN-Based Takagi-Sugeno Fuzzy Model

Abstract: Forecasting uncertainties limit the development of photovoltaic (PV) power generation. New forecasting technologies are urgently needed to improve the accuracy of power generation forecasting. In this paper, a novel ultra-short-term PV power forecasting method is proposed based on a deep belief network (DBN)-based Takagi-Sugeno (T-S) fuzzy model. First, correlation analysis is used to filter redundant information. Then, a T-S fuzzy model is developed that integrates fuzzy c-means (FCM) for the fuzzy division of input variables and DBN for forecasting on each fuzzy subset. Finally, the proposed method is compared with a benchmark DBN method and the T-S fuzzy model in case studies. The numerical results show the feasibility and flexibility of the proposed ultra-short-term PV power forecasting approach.

Author Contributions: Conceptualization, L.L.; methodology, F.L.; software, L.L.; validation, L.L. and F.L.; formal analysis, Y.Z.; investigation, F.L.; resources, F.L.; data curation, Y.Z.; writing (original), L.L.; writing (review and editing), Y.Z.; visualization, Y.Z.; supervision, F.L.


Introduction
To cope with the challenges of energy security and climate change, promote the energy transition and advance sustainable development, countries around the world are promoting sustainable energy generation. Among the available options, PV power generation is a focus in many countries because of its high efficiency, low pollution, safety and convenience [1]. However, when large-scale PV power generation is connected to the grid, its randomness and intermittency endanger the stability and security of the electricity system [2,3]. Ultra-short-term forecasting is useful for scheduling power generation and energy storage. Therefore, accurate ultra-short-term forecasting of PV power is of great importance for ensuring the proper functioning of the electricity grid and optimizing transmission in the electricity system.
Extensive research has been conducted on PV power prediction, and a relatively systematic body of methods has formed. According to the input data, power prediction methods are divided into univariate and multivariate prediction. Univariate prediction usually applies a modal decomposition to the power series, predicts each main mode independently, and finally combines the individual predictions into the final result. Multivariate prediction takes not only historical power as an input but also meteorological data, such as wind speed and direction. According to the prediction technology, there are physical methods, statistical methods and machine learning methods [4,5]. Physical methods are limited by how well the physical rules are understood and by numerical weather prediction, require a large amount of computation, and have poor timeliness. Statistical methods cope poorly with strong randomness, unstable data and multivariate situations, so their application scenarios are limited. Machine learning can readily capture the nonlinear relationships in data and has a higher generalization ability. Common machine learning prediction methods include the Markov chain [6], support vector machine [7], artificial neural network [8], extreme learning machine [9], XGBoost [10], and random forest [11]. Recently, the Takagi-Sugeno (T-S) fuzzy model was applied to ultra-short-term predictions of low-dimensional PV power, as it can characterize complex nonlinear functions with a small number of fuzzy rules. In [12], an interval type-2 fuzzy c-regression algorithm based on the T-S fuzzy model was used for short-term wind energy interval prediction. The simulation shows that the proposed model greatly improves the quality of the prediction interval (PI) compared to traditional prediction models. The T-S fuzzy model has also been applied to ultra-short-term PV power prediction on a 1 h time scale; however, its accuracy depends greatly on the similarity between the training set and the actual situation [13,14].
Many factors influence PV power generation, and deep learning is able to extract deep features. Compared to shallow machine learning algorithms, a deep network model with multiple hidden layers can better capture the variation trend in time series data and make more accurate predictions. At present, well-suited deep learning prediction methods include the long short-term memory network, the gated recurrent neural network, and the deep belief network (DBN) [15][16][17]. Deep learning models developed from the neural network are used to solve high-dimensional problems with multiple meteorological factors as input [18][19][20][21][22]. Based on the panel temperature, ambient temperature, cumulative power, and irradiance, Neo uses a DBN to forecast the PV power 15 min ahead [18]. Based on weather classification and DBN, a prediction method is proposed to achieve one-day forecasting of the PV power output [19]. In addition, DBN is used to learn the high-level abstractions in the historical PV output data by leveraging Gary's hierarchical architectures [20]. The results show that DBN can guarantee a minimal prediction error and satisfactory accuracy with the optimal data set length. However, because of the small amount of data, the model cannot be fully trained at a 15 min sampling interval. Alzahrani takes high-resolution global horizontal irradiance and global tilt irradiance data as model inputs, and the PV power output is then predicted by the interval [21]. Sorkun uses hourly global horizontal irradiance data from the last decade to establish a long short-term memory network model for PV power forecasting [22]. These two models [20,21] place requirements on the data sampling interval or capacity. Moreover, the above approaches only use a single model to forecast PV power.
To further improve the forecasting accuracy, several schemes that combine multiple models have been devised [23][24][25][26][27][28][29]. In [23], self-organizing map and learning vector quantization networks are used to classify the historical data into multiple weather types, and a support vector machine is then used to predict the PV power. However, the single prediction model is simple and has a low precision. A PV power generation prediction method based on a combination of a genetic algorithm, particle swarm optimization and the adaptive neuro-fuzzy inference method is proposed in [24]. Liu combines backpropagation (BP) neural networks with an artificial neural network (ANN) model to predict future PV power generation, and the prediction accuracy exceeds that of conventional ANNs [25]. The BP-ANN model combines the advantages of BP neural networks and ANN and thereby achieves a higher PV power forecasting performance. A temporal convolution network model based on wavelet decomposition is presented to predict short-term PV power generation [26]. In [27], quantile regression averaging (QRA) is used to combine a group of independent deterministic long short-term memory (LSTM) prediction models, and the ensemble benefits from the accuracy of the deterministic predictions. Wang proposes an independent day-ahead PV power forecasting model based on a long short-term memory recurrent neural network (LSTM-RNN) [28]. However, the differences between the combined models are small and their characteristics are not significant. Given the many influencing factors and the large uncertainty, Ge proposes a hybrid model based on principal component analysis, gray wolf optimization and the general regression neural network (PCA-GWO-GRNN) for the short-term forecasting of 24 h of solar energy production [29]. PCA is used to reduce the dimensions of the meteorological features, GRNN is applied to the inputs after the dimension reduction, and the GRNN parameter is optimized with GWO. The results confirm the accuracy and usefulness of the proposed model in real life. In addition, Akylas C. Stratigakos et al. [30] combine LSTM with singular spectrum analysis (SSA) to improve the accuracy of day-ahead hourly load prediction. The above analysis indicates that a combination of different models can outperform single methods.

Energies 2021, 14, 6447
Motivated by the above, this paper proposes a DBN-based T-S fuzzy model for ultra-short-term PV power forecasting. The proposed prediction model can overcome some of the shortcomings of traditional approaches, such as handling only low-dimensional inputs and losing precision when input variables are screened out. The paper is organized as follows. Section 2 introduces the proposed model framework. Section 3 describes the methodologies and the whole process in detail. Section 4 presents the case study, including the data processing and parameter settings. Finally, conclusions are provided in Section 5.

Forecast Model Framework
The framework of the DBN-based T-S fuzzy model for ultra-short-term PV power forecasting is shown in Figure 1. First, the experimental data are preprocessed and normalized. After the candidate clustering variables are filtered via correlation analysis, the FCM algorithm is employed to cluster the input variables into several subsets. Then, a DBN PV power forecasting model is established for each fuzzy subset. The weight of each sub-model is determined by the similarity to the fuzzy clustering center. Based on the forecasting results of all sub-models and their weights, the final PV power forecast is calculated.
Each T-S fuzzy model with multiple inputs and a single output is described by a set of if-then fuzzy rules, in which $R^{(i)}$ is the $i$-th if-then rule, $x_j$ is the $j$-th independent input parameter, $A_{ij}$ are the fuzzy sets, $y^{(i)}$ is the output value of the $i$-th rule, $p_{ij}$ are the linear parameters, and $w_i$ is the fuzzy system output. The contribution of each rule to the output is weighted by the applicability of that rule.
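As an illustration of how such a rule base produces an output, the following is a minimal sketch of first-order T-S inference. It assumes Gaussian membership functions, a product of memberships as the applicability of a rule, and an applicability-weighted average as the system output; none of these choices is stated in the text, and the names `gaussian_membership`, `ts_inference` and the toy rule values are hypothetical.

```python
import math


def gaussian_membership(x, center, width):
    """Membership degree of x in a fuzzy set (Gaussian shape is an assumption)."""
    return math.exp(-((x - center) ** 2) / (2.0 * width ** 2))


def ts_inference(x, rules):
    """First-order T-S inference for one input vector x.

    Each rule is a dict with:
      'sets': list of (center, width) pairs, one fuzzy set per input
      'p':    list [p_i0, p_i1, ..., p_in] of linear consequent parameters
    """
    weighted_sum = 0.0
    strength_sum = 0.0
    for rule in rules:
        # Applicability of the rule: product of the input memberships (assumed).
        strength = 1.0
        for xj, (center, width) in zip(x, rule["sets"]):
            strength *= gaussian_membership(xj, center, width)
        # Linear consequent: y_i = p_i0 + sum_j p_ij * x_j
        y_i = rule["p"][0] + sum(p * xj for p, xj in zip(rule["p"][1:], x))
        weighted_sum += strength * y_i
        strength_sum += strength
    # Applicability-weighted average of the rule outputs.
    return weighted_sum / strength_sum


# Toy rule base with two inputs and two rules (illustrative values only).
toy_rules = [
    {"sets": [(0.2, 0.1), (0.3, 0.1)], "p": [0.0, 0.5, 0.5]},
    {"sets": [(0.8, 0.1), (0.7, 0.1)], "p": [0.1, 0.6, 0.3]},
]
print(ts_inference([0.2, 0.3], toy_rules))
```

An input that sits at the centers of the first rule's fuzzy sets is dominated by that rule's linear consequent, which is the behavior the weighting by applicability is meant to produce.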

Filter Fuzzy Clustering Variables
The factors affecting PV power generation are complex. If all factors are used as clustering inputs, the information becomes redundant and the efficiency of the operation is reduced. Therefore, a correlation analysis is required to select the factors that have an important impact on the power forecasting results. The correlation coefficient $\gamma$ is a statistical indicator that shows the correlation between two variables:

$$\gamma = \frac{\sum_{k=1}^{n} (x_k - \bar{x})(y_k - \bar{y})}{\sqrt{\sum_{k=1}^{n} (x_k - \bar{x})^2}\,\sqrt{\sum_{k=1}^{n} (y_k - \bar{y})^2}}$$

where $x$ is one of the factors, $y$ is the power output, $\bar{x}$ is the average value of $x$, $\bar{y}$ is the average value of $y$, and the sums run over the $n$ samples. A factor with a larger $|\gamma|$ has a stronger association with the output and should be selected as an input of the clustering system.
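The selection step can be sketched as follows, assuming the correlation coefficient is the usual Pearson coefficient and that a factor is kept when $|\gamma|$ reaches a fixed threshold. The threshold of 0.8, the function names and the toy series are illustrative assumptions, not values prescribed by the method.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def select_factors(factors, power, threshold=0.8):
    """Return the factor names whose |gamma| reaches the (assumed) threshold."""
    scores = {name: pearson(values, power) for name, values in factors.items()}
    selected = [name for name, gamma in scores.items() if abs(gamma) >= threshold]
    return selected, scores


# Toy data (hypothetical): irradiance follows power closely, wind speed does not.
toy_power = [0.0, 1.0, 2.0, 3.0, 4.0]
toy_factors = {
    "irradiance": [0.0, 2.1, 3.9, 6.0, 8.1],
    "wind_speed": [3.0, 1.0, 4.0, 1.0, 5.0],
}
selected, scores = select_factors(toy_factors, toy_power)
print(selected, scores)
```

Taking the absolute value keeps strongly negative correlations as well, since a factor that moves opposite to the power output is just as informative for clustering.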

Fuzzy Division of Input Variables
Clustering analysis divides the input variables into different fuzzy sets, and input variables belonging to the same fuzzy set combine to form a fuzzy subset. In this part, the FCM clustering algorithm is employed to obtain the degree of membership of the input variables in each fuzzy set. The main steps of the FCM clustering algorithm are shown in Figure 2.
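For readers without access to Figure 2, the textbook FCM iteration can be sketched as below. This is the standard algorithm (alternating center and membership updates with fuzzifier `m`), which may differ in detail from the steps in the figure; the fuzzifier value, the stopping tolerance, the `init_centers` argument and the toy points are all assumptions made for the example.

```python
import random


def memberships(point, centers, m=2.0):
    """FCM membership degrees of one point in every cluster (they sum to 1)."""
    dists = [sum((p - c) ** 2 for p, c in zip(point, center)) ** 0.5
             for center in centers]
    # A point that coincides with a center belongs fully to that cluster.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_j) ** exponent for d_j in dists) for d_i in dists]


def fcm(points, n_clusters, m=2.0, tol=1e-5, max_iter=100,
        init_centers=None, seed=0):
    """Standard fuzzy c-means; returns (centers, membership matrix)."""
    rng = random.Random(seed)
    centers = [list(c) for c in (init_centers or rng.sample(points, n_clusters))]
    u = [memberships(p, centers, m) for p in points]
    dim = len(points[0])
    for _ in range(max_iter):
        # Step 1: each center becomes the u^m-weighted mean of all points.
        for i in range(n_clusters):
            weights = [row[i] ** m for row in u]
            total = sum(weights)
            centers[i] = [sum(w * p[d] for w, p in zip(weights, points)) / total
                          for d in range(dim)]
        # Step 2: recompute memberships and track the largest change.
        new_u = [memberships(p, centers, m) for p in points]
        change = max(abs(a - b)
                     for old, new in zip(u, new_u) for a, b in zip(old, new))
        u = new_u
        if change < tol:
            break
    return centers, u


# Two well-separated toy groups of (irradiance, power) pairs, already normalized.
toy_points = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [1.0, 1.0], [0.9, 1.0], [1.0, 0.9]]
toy_centers, toy_u = fcm(toy_points, 2, init_centers=[[0.0, 0.0], [1.0, 1.0]])
print(toy_centers)
```

The membership matrix returned here is what later serves as the per-subset weight, so each row is kept normalized to sum to one.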

Establish A DBN Model for Each Fuzzy Subset
Figure 3. The structure of DBN model.

A restricted Boltzmann machine (RBM) is a stochastic neural network model with a two-layer structure (a visible layer $v$ and a hidden layer $h$), symmetric connections, and no self-feedback. The energy of an RBM system is defined as:

$$E(v, h) = -\sum_{i=1}^{n} a_i v_i - \sum_{j=1}^{m} b_j h_j - \sum_{i=1}^{n}\sum_{j=1}^{m} v_i w_{ij} h_j$$

where $a_i$, $b_j$ are the biases of the visible layer and the hidden layer, $n$, $m$ are the numbers of neurons in the visible layer and the hidden layer, and $w_{ij}$ is the weight between $v_i$ and $h_j$. The probability distribution function of the visible layer in RBM is:

$$P(v) = \frac{1}{Z}\sum_{h} e^{-E(v, h)}, \qquad Z = \sum_{v, h} e^{-E(v, h)}$$

When the visible cell state is known, the activation probability of the hidden cell $P(h_j = 1 \mid v)$ is:

$$P(h_j = 1 \mid v) = \sigma\Big(b_j + \sum_{i=1}^{n} v_i w_{ij}\Big)$$

Because of the symmetrical structure, when the hidden cell state is known, the activation probability of the visible cell $P(v_i = 1 \mid h)$ is:

$$P(v_i = 1 \mid h) = \sigma\Big(a_i + \sum_{j=1}^{m} w_{ij} h_j\Big)$$

The activation function selected in the above formulas is:

$$\sigma(x) = \frac{1}{1 + e^{-x}}$$

Each RBM is trained to obtain the weight matrix and bias vector of each layer until the training of the entire RBM stack is complete. The power output of the final DBN model can be expressed as:

$$y_{dbn} = f(x, a_i, w_{ij}) \quad (9)$$
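The RBM quantities described above can be made concrete with a small sketch. It assumes the standard binary RBM forms (bilinear energy, sigmoid conditionals) and does not include a training procedure, since the text does not specify one; the function names and the toy parameters are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic activation, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def energy(v, h, a, b, w):
    """E(v, h) = -sum_i a_i v_i - sum_j b_j h_j - sum_ij v_i w_ij h_j."""
    visible_term = sum(ai * vi for ai, vi in zip(a, v))
    hidden_term = sum(bj * hj for bj, hj in zip(b, h))
    pair_term = sum(v[i] * w[i][j] * h[j]
                    for i in range(len(v)) for j in range(len(h)))
    return -visible_term - hidden_term - pair_term


def p_hidden_given_visible(v, b, w):
    """Activation probability of every hidden unit for a known visible state."""
    return [sigmoid(b[j] + sum(v[i] * w[i][j] for i in range(len(v))))
            for j in range(len(b))]


def p_visible_given_hidden(h, a, w):
    """Activation probability of every visible unit for a known hidden state."""
    return [sigmoid(a[i] + sum(w[i][j] * h[j] for j in range(len(h))))
            for i in range(len(a))]


# Toy RBM with 3 visible and 2 hidden units (illustrative parameters only).
toy_a = [0.1, 0.2, 0.3]
toy_b = [0.0, 1.0]
toy_w = [[1.0, -1.0], [0.5, 0.5], [2.0, 0.0]]
toy_v = [1, 0, 1]
toy_h = [1, 0]
print(energy(toy_v, toy_h, toy_a, toy_b, toy_w))
print(p_hidden_given_visible(toy_v, toy_b, toy_w))
print(p_visible_given_hidden(toy_h, toy_a, toy_w))
```

Because the two conditionals share the same weight matrix, a stack of such layers can pass the hidden probabilities of one RBM upward as the visible input of the next, which is the usual way a DBN is assembled.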

T-S Model Output
The final forecasting result of PV power is a combination of the results for each fuzzy subset. Thus, it can be represented as follows:

$$y_{\text{PV Power}} = \sum_{i} \lambda(i)\, y_{\text{dbn-sub}i}$$

Here, $y_{\text{PV Power}}$ is the final forecasting result of PV power, $y_{\text{dbn-sub}i}$ is the power output of the $i$-th sub-DBN model, and $\lambda(i)$ is the weight of the $i$-th sub-DBN model. The degree of membership of each subsystem is selected as the weight.
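A minimal sketch of this combination step is given below. It assumes the weights are the membership degrees of the current sample in each fuzzy subset; the division by the weight total is an added safeguard that has no effect when the memberships already sum to one. The function name and the numbers are hypothetical.

```python
def combine_forecasts(sub_forecasts, weights):
    """Membership-weighted combination of the sub-model forecasts."""
    if len(sub_forecasts) != len(weights):
        raise ValueError("one weight is needed per sub-model forecast")
    total = sum(weights)
    if total <= 0.0:
        raise ValueError("weights must have a positive sum")
    return sum(lam * y for lam, y in zip(weights, sub_forecasts)) / total


# Three sub-DBN forecasts in kW and the sample's membership in each subset
# (illustrative values only).
toy_sub_forecasts = [120.0, 200.0, 80.0]
toy_weights = [0.7, 0.2, 0.1]
print(combine_forecasts(toy_sub_forecasts, toy_weights))
```

A sample close to one cluster center is therefore forecast almost entirely by that cluster's sub-model, while a sample between clusters receives a blend.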

Data Set Description
The proposed approach is tested using data from a 433 kW PV array on the University of Queensland campus in St Lucia. The data include air temperature, humidity, solar radiation, wind speed, wind direction, and historical power. The data sampling interval is 1 h. The data span March, June, September, and December of 2015, which represent the four seasons. The first to the twentieth day of each month form the training set, and the remaining days of the month form the test set.
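The day-based split can be sketched as follows, assuming each record carries a timestamp. The record layout, the `split_by_day` helper and the placeholder power values are hypothetical; only the rule "days 1 to 20 for training, the rest for testing" comes from the text.

```python
from datetime import datetime, timedelta


def split_by_day(records, last_train_day=20):
    """Days 1..last_train_day of each month go to training, the rest to testing."""
    train = [r for r in records if r["time"].day <= last_train_day]
    test = [r for r in records if r["time"].day > last_train_day]
    return train, test


# Hourly placeholder records for March 2015 (31 days, 24 samples per day).
start = datetime(2015, 3, 1)
toy_records = [{"time": start + timedelta(hours=k), "power": 0.0}
               for k in range(31 * 24)]
train_set, test_set = split_by_day(toy_records)
print(len(train_set), len(test_set))
```

Splitting on the calendar day rather than on a fixed fraction keeps whole days together, so no test day has some of its hours in the training set.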

Evaluation Index
The following three evaluation indicators were selected to assess the predictive performance of the different networks or models: Mean Absolute Error (MAE), Mean Relative Error (MRE) and Root Mean Square Error (RMSE). In the formulas for these indicators, $y$ and $\hat{y}$ represent the actual power and the forecasting result, $\bar{y}$ and $\bar{\hat{y}}$ represent the averages of $y$ and $\hat{y}$, Cap represents the daily average operating capacity, and $n$ represents the number of samples.
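As a rough guide to how these indicators are computed, the sketch below uses the common textbook definitions of MAE and RMSE and assumes that MRE is the mean absolute error normalized by Cap. The exact normalization used in the paper is not recoverable from the text, so the `mre` definition, the use of 433 kW as Cap and the toy series are assumptions.

```python
import math


def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(y - f) for y, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Root mean square error."""
    squared = sum((y - f) ** 2 for y, f in zip(actual, forecast))
    return math.sqrt(squared / len(actual))


def mre(actual, forecast, cap):
    """Mean relative error, assumed here to be MAE divided by the capacity."""
    return mae(actual, forecast) / cap


# Toy hourly power values in kW (illustrative only).
toy_actual = [100.0, 200.0, 300.0]
toy_forecast = [110.0, 190.0, 330.0]
toy_cap = 433.0  # assumed capacity, taken from the array rating
print(mae(toy_actual, toy_forecast))
print(rmse(toy_actual, toy_forecast))
print(mre(toy_actual, toy_forecast, toy_cap))
```

RMSE penalizes the single 30 kW miss more heavily than MAE does, which is why the two indicators are usually reported together.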

Experimental Setup
Following the method described in Section 3.1, the correlation coefficients between the PV power and the influencing factors are shown in Table 1. According to Table 1, the irradiation data have the highest correlation coefficient with the PV power, followed by the historical power data, and both coefficients are greater than 0.8. This indicates that these two types of influencing factors reflect the variation of the power output to a large extent. Thus, the input variables are classified using the insolation and power data within 15 min.
The forecasting performance of the DBN model is not fixed; it depends strongly on parameters such as the number of RBM layers and the number of neurons in each layer. Therefore, multiple sets of experimental parameters are compared to establish the optimal model structure. The results are shown in Table 2. Table 2 shows that the performance of the DBN model does not simply improve as the number of RBM layers and the number of neurons in each layer increase. Unnecessarily complicated multi-layer neural networks not only lead to an excessive training time but also increase the error due to structural redundancy. Consequently, it is necessary to construct an optimal DBN model framework by comparing the evaluation indices across different parameters. According to Table 2, the DBN model with 3 RBM layers performs best among all frameworks when the numbers of neurons in the layers are 200, 300, and 400.

Experimental Results
The forecasting results of the proposed model, the DBN model, and the T-S model are shown in Figure 4, where (a) shows the forecasting results for one day in March (spring), (b) for one day in June (summer), (c) for one day in September (autumn), and (d) for one day in December (winter).
From the results in Figure 4, it is observed that the predictions of the DBN and T-S models fluctuate greatly during periods of power fluctuation, whereas the proposed model tracks the target power curve well. The proposed model outperforms the DBN model and the T-S model in all four seasons because the distance between its curve and the actual power is minimal. This suggests that the proposed model can classify successfully without many input variables and learn the characteristics of the multiple impact factors via RBM stacks. This finding indicates that the proposed model can be used to improve the performance of PV power forecasting.
Moreover, Table 3 lists the MAE, MRE, and RMSE of the proposed model and the other forecasting approaches in the four seasons, from which the forecasting accuracy of each model can be compared. The selected evaluation indices of the proposed model are smaller than those of the T-S and DBN models in the four seasons. This indicates that the proposed model is more in accordance with the actual output and has a more concentrated prediction error. The average RMSEs of the T-S fuzzy model, the DBN model and the proposed model over the four seasons are 8.2%, 12.6% and 7.2%, respectively, all of which meet the requirement that the RMSE of the PV power forecast not exceed 20%. Additionally, the average RMSE of the proposed model is 12.2% and 42.85% lower than those of the T-S model and the DBN model, respectively. Therefore, the proposed model absorbs the characteristics of the two types of models and effectively improves the overall forecasting accuracy.
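The reported reductions follow from the three average RMSE values as relative differences. The short check below reproduces that arithmetic; the helper name is hypothetical, and the second result evaluates to about 42.857%, which the text reports as 42.85%.

```python
def relative_reduction(baseline, proposed):
    """Percentage by which `proposed` is lower than `baseline`."""
    return (baseline - proposed) / baseline * 100.0


# Average RMSE values (in %) quoted in the text.
ts_rmse, dbn_rmse, proposed_rmse = 8.2, 12.6, 7.2

vs_ts = relative_reduction(ts_rmse, proposed_rmse)
vs_dbn = relative_reduction(dbn_rmse, proposed_rmse)
print(round(vs_ts, 2), round(vs_dbn, 2))
```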

Conclusions
This article proposes a new DBN-based T-S fuzzy model for predicting PV power. The hourly historical PV power, air temperature, humidity and solar radiation are selected as the input variables of the proposed prediction model. The fuzzy clustering makes the forecasting data more similar to the training data, and the RBM stacks can outperform typical models with high-dimensional input data. Data from the PV array at the University of Queensland, Australia, are used to test the proposed approach by comparing it with the single DBN and T-S fuzzy models. The results indicate that smaller prediction errors and a more stable prediction process can be achieved by using the proposed approach. In summary, the proposed model effectively improves the ultra-short-term forecasting accuracy of PV power and performs well in the case studies.