Hour-Ahead Photovoltaic Output Forecasting Using Wavelet-ANFIS

Abstract: The operational challenge of a photovoltaic (PV) integrated system is the uncertainty (irregularity) of its future power output. Integration and correct operation depend on accurate forecasting of the PV output power. In the present study, an artificial intelligence method was employed to forecast the PV output power from endogenous data and to investigate its accuracy. Discrete wavelet transforms were used to decompose the PV output power into approximation and detail components. The decomposed PV output was fed into an adaptive neuro-fuzzy inference system (ANFIS) to forecast the short-term PV power output. Various wavelet mother functions were also investigated, including Haar, Daubechies, Coiflets, and Symlets. The performance of the proposed model depended strongly on the input set and the wavelet mother function. The wavelet-ANFIS model achieved better statistical performance than the ANFIS and ANN models. In addition, wavelet-ANFIS with coif2 and sym4 offered the best precision among all the studied models. The results highlight that combining wavelet decomposition with the ANFIS model can be a helpful tool for accurate short-term PV output forecasting, yielding better efficiency and performance than the conventional model.


Introduction
Estimating and forecasting solar power output plays a critical role in integrated system management and in the optimal operation of solar power during high-demand periods. The subject is therefore of interest both to academia and to power companies. Forecasting upcoming events is the backbone of crisis management; once this target is accomplished, integrated system management becomes feasible [1]. Several published studies have proposed methods to forecast future power output, and each of these methods generally incurs some error. Accurate prediction of PV output can provide helpful information for power management in an integrated grid [1][2][3].
The many power forecasting methods proposed for solar power systems in the last decade can be divided into statistical, artificial intelligence, fuzzy inference, and hybrid methods [4]. Statistical methods, for example, autoregressive moving average and autoregressive integrated moving average, have been used in power system prediction [5,6]. Artificial intelligence (AI) methods can serve as reliable alternatives for forecasting the performance of complex systems while minimizing research costs and reducing computing time [7]. Among the artificial intelligence methods are multilayer perceptron neural networks [8][9][10], radial basis function neural networks (RBF NN) [11], physical hybrid artificial neural networks [12], recurrent neural networks [13], deep neural networks [14], [...] training, and a testing stage using ANFIS.
Compared to the existing literature, the main contributions of this paper are:
• A new method of combined PV power forecasting based on decomposition at different resolution levels to optimize the weight determination.
• A study of different wavelet mother functions to find the most suitable one for PV power time series.
• A combination of wavelet decomposition and ANFIS that has a strong learning ability and can handle the chaotic, highly non-linear PV power output sequence.
• The use of 2 and 3 h of data each day to forecast 10 min to 60 min ahead, increasing the forecasting accuracy and reducing the computation time.
• Up to 30 shuffles are conducted to obtain random initial weighting and capture the diversity of the data for a reliable and robust proposed combination. Moreover, the reconstructed wavelet features are used to calculate the accuracy of the proposed method.
This manuscript is organized as follows: the wavelet transform (decomposition and reconstruction) model is presented in Section 2; an overview of the basic ANFIS methodology is provided in Section 3; and Section 4 gives an overview of the case study. In Section 5, the forecasting results are presented, together with the effect of different mother wavelets, such as Haar, Daubechies, Symlets, and Coiflets, on forecasting accuracy. The discussion and conclusions are presented in Sections 6 and 7, respectively.

Wavelet Transform
PV power output data usually contain non-linear and dynamic features such as peaks and fluctuations [32], which are among the principal attributes that influence the accuracy of endogenous PV output power forecasting. In actual conditions, PV power data include both low- and high-frequency signals [33], which can affect the learning process of the artificial intelligence method. However, the outliers and behaviors at each individual frequency are more easily forecasted. Accordingly, signal decomposition techniques such as wavelet transforms can be used to forecast the behavior at each frequency, which in turn can improve the accuracy of the PV generation forecast. The wavelet transform is a well-documented, effective, "scalable" technique for time-series analysis with high time-frequency localization. Discrete wavelet transforms (DWT) are generally used to increase the calculation efficiency of the prediction model. The DWT of time-series data can be computed as [34]:

$$W_{m,n} = 2^{-m/2} \sum_{t=0}^{N-1} \psi\left(2^{-m} t - n\right) f(t) \tag{1}$$

where $W_{m,n}$ is the wavelet coefficient at dilation $m$ and translation $n$, $N$ is the number of samples, $\psi(t)$, a continuous function, designates the wavelet mother function, $f(t)$ is the time-series signal, and $m$ and $n$ are integers that control the wavelet dilation and translation, respectively. Various wavelet mother functions are available, such as Haar, Daubechies, Symlets, Coiflets, Biorthogonal, Morlet, or Mexican Hat. The most popular wavelet mother functions, including Haar, Daubechies (db3), Coiflets (coif2), and Symlets (sym4), are illustrated in Figure 1. The Mallat technique [34] is used in the current literature as a fast DWT method that balances wavelength and smoothness. This computation requires less time, as it is the least complex way of calculating the DWT. The original time-series signal $f(t)$ can be reconstructed as [34]:

$$f(t) = \bar{T}(t) + \sum_{m=1}^{M} \sum_{n=0}^{2^{M-m}-1} W_{m,n}\, 2^{-m/2}\, \psi\left(2^{-m} t - n\right) \tag{2}$$

where the integers $m$ and $n$ are in the ranges $1 \le m \le M$ and $0 \le n \le 2^{M-m} - 1$, respectively, and $\bar{T}(t)$ designates the approximation subsignal at level $M$.
A simple format of the signal reconstruction can be computed as [34]:

$$f(t) = \bar{T}(t) + \sum_{m=1}^{M} W_m(t) \tag{3}$$

in which $W_m(t)$ are the detail subsignals, which capture the small-scale features present in the data.
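The decomposition and reconstruction steps can be illustrated with a minimal sketch. It uses the orthonormal Haar wavelet only because its two-tap filter fits in a few lines; the longer filters studied in this work (db3, coif2, sym4) are not implemented here. The function names and the sample values are hypothetical and chosen purely for illustration.

```python
import math

SQRT_HALF = 1.0 / math.sqrt(2.0)


def haar_step(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    approx = [(signal[i] + signal[i + 1]) * SQRT_HALF for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * SQRT_HALF for i in range(0, len(signal), 2)]
    return approx, detail


def haar_decompose(signal, levels):
    """Mallat-style cascade: the approximation is split again at every level."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    approx, details = list(signal), []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)  # details[0] holds the finest scale
    return approx, details


def haar_reconstruct(approx, details):
    """Invert haar_decompose by undoing the levels from coarsest to finest."""
    current = list(approx)
    for detail in reversed(details):
        rebuilt = []
        for a, d in zip(current, detail):
            rebuilt.append((a + d) * SQRT_HALF)
            rebuilt.append((a - d) * SQRT_HALF)
        current = rebuilt
    return current


if __name__ == "__main__":
    # Twelve made-up 10 min PV samples (2 h of data), for illustration only.
    pv = [0.0, 0.4, 1.1, 1.9, 2.6, 3.0, 3.1, 2.9, 2.2, 1.5, 0.7, 0.1]
    approx, details = haar_decompose(pv, levels=2)
    print(approx, details)
    print(haar_reconstruct(approx, details))
```

Because the same `haar_decompose` call can be applied to both training and test windows, the sketch also mirrors the idea of reusing one wavelet function for both datasets.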
Mathematics 2021, 9, x FOR PEER REVIEW 4 of 14

Adaptive Neuro-Fuzzy Inference System
ANFIS is an intelligent system that combines the strengths of fuzzy logic and neural networks into a hybrid technique. ANFIS can be regarded as a data-learning algorithm that uses fuzzy logic to transform inputs into the desired output through strongly interconnected neural network processing elements and weighted information connections. This approach can simulate complex non-linear mappings and is suitable for accurate short-term predictions [35][36][37].
As illustrated in Figure 2, the ANFIS architecture relies on a Takagi-Sugeno-Kang (TSK) rule inference system and comprises five layers, each carrying out a distinct function: the fuzzification layer, rule reasoning layer, normalization layer, defuzzification layer, and output layer. The premise and consequent parameters are the essential parameters of ANFIS. The premise parameters {α k , β k , γ k } belong to the membership functions of the fuzzification layer, which is the input layer of the ANFIS network and generates input spaces by retrieving patterns in the input data. The consequent parameters {ρ k , σ k , τ k } correspond to the parameter sets of the defuzzification layer.
The premise and the consequent parameters are optimized through training. A hybrid algorithm optimizes the parameter sets of the ANFIS forecasting system in the proposed approach.
ANFIS is explained by assuming two input variables ($z_1$ and $z_2$) and a single output variable ($\hat{y}$). The relationship between the inputs, the membership functions, and the output can be expressed using the following if-then TSK fuzzy rules [25]:

Rule 1: if $z_1$ is $D_1$ and $z_2$ is $E_1$, then $\hat{y}_1 = \rho_1 z_1 + \sigma_1 z_2 + \tau_1$
Rule 2: if $z_1$ is $D_2$ and $z_2$ is $E_2$, then $\hat{y}_2 = \rho_2 z_1 + \sigma_2 z_2 + \tau_2$

where $D_1$, $D_2$, $E_1$, and $E_2$ are the fuzzy sets, also called linguistic labels. The function of each layer is presented below.
Fuzzification layer (layer 1): Every node in this layer is adaptive. The input variables are mapped into fuzzy sets through the membership functions, and each node is assigned the function:

$$L^1_k = \mu_{A_k}(z_1) \quad \text{or} \quad L^1_k = \mu_{B_k}(z_2)$$

where $L^1_k$ represents the output of the kth node, and $\mu_{A_k}$ and $\mu_{B_k}$ are the membership functions associated with the fuzzy sets $D_k$ and $E_k$. Various membership functions exist; the generalized bell function can be written as:

$$\mu_{A_k}(z) = \frac{1}{1 + \left| \dfrac{z - \gamma_k}{\alpha_k} \right|^{2\beta_k}}$$

where $\alpha_k$, $\beta_k$, and $\gamma_k$ represent the premise parameters used to change the shape of the membership function.
Rule layer (layer 2): Each node in this layer is a non-adaptive node, labeled Π in Figure 2, and all the incoming signals are multiplied to compute the output:

$$L^2_k = w_k = \mu_{A_k}(z_1)\, \mu_{B_k}(z_2), \quad k = 1, 2$$

where $L^2_k$ is the layer 2 output and $w_k$ is the weight strength of the kth TSK rule. Every node represents the weight strength of one rule.
Normalization layer (layer 3): Each node calculates the activity level of one rule. The weight strength of the kth rule is divided by the sum of the weight strengths of all rules:

$$L^3_k = \bar{w}_k = \frac{w_k}{w_1 + w_2}, \quad k = 1, 2$$

where $L^3_k$ stands for the layer 3 output and $\bar{w}_k$ is the normalized weight strength.
Defuzzification layer (layer 4): The nodes of this layer are adaptive. Each node computes the contribution of the kth rule to the fifth layer. The output $L^4_k$ is computed using the consequent parameters and is defined as:

$$L^4_k = \bar{w}_k\, \hat{y}_k = \bar{w}_k \left( \rho_k z_1 + \sigma_k z_2 + \tau_k \right)$$

Output layer (layer 5): This layer consists of a single fixed node. The overall output is computed as the sum of the signals arriving from the previous layer:

$$L^5 = \hat{y} = \sum_k \bar{w}_k\, \hat{y}_k = \frac{\sum_k w_k\, \hat{y}_k}{\sum_k w_k}$$
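The five layers described above can be traced in a short forward-pass sketch for the two-input, two-rule case. It assumes the standard generalized bell membership function and first-order TSK consequents; the function names and all parameter values are hypothetical, and no training (hybrid learning) is performed here.

```python
def gbell(z, alpha, beta, gamma):
    """Generalized bell membership function with premise parameters (alpha, beta, gamma)."""
    return 1.0 / (1.0 + abs((z - gamma) / alpha) ** (2.0 * beta))


def anfis_forward(z1, z2, premise_1, premise_2, consequent):
    """Forward pass of a two-input TSK ANFIS.

    premise_1[k], premise_2[k]: (alpha, beta, gamma) of rule k for z1 and z2
    consequent[k]:              (rho, sigma, tau) of rule k
    """
    # Layer 1: fuzzification of each input
    mu_1 = [gbell(z1, *p) for p in premise_1]
    mu_2 = [gbell(z2, *p) for p in premise_2]
    # Layer 2: rule weight strength (product of memberships)
    w = [a * b for a, b in zip(mu_1, mu_2)]
    # Layer 3: normalization of the weight strengths
    total = sum(w)
    w_bar = [wk / total for wk in w]
    # Layer 4: normalized strength times the linear rule output
    rule_out = [
        wb * (rho * z1 + sigma * z2 + tau)
        for wb, (rho, sigma, tau) in zip(w_bar, consequent)
    ]
    # Layer 5: sum of all rule contributions
    return sum(rule_out)


if __name__ == "__main__":
    # Illustrative parameters only; they are not taken from the trained model.
    premise = [(1.0, 2.0, 0.0), (1.0, 2.0, 2.0)]
    consequent = [(1.0, 1.0, 0.0), (0.0, 0.0, 5.0)]
    print(anfis_forward(1.0, 1.0, premise, premise, consequent))
```

With these illustrative parameters, the point z1 = z2 = 1 lies midway between the two bell centers, so both rules fire equally and the output is the plain average of the two rule outputs.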

Case Study
In power systems, stability must be guaranteed to enhance grid performance. Achieving grid stability requires appropriate scheduling of the spinning reserves and demand response. With the increasing adoption of PV as an energy source, its intermittent nature may impact grid stability. Forecasting models that predict the PV output power with good accuracy are therefore needed to mitigate grid stability issues and make PV a reliable energy source. This study used year-round solar power output data from 2017 to 2020, averaged over 10 min intervals. The data were recorded at a solar power plant close to Taichung, in central Taiwan. The first 900 days of measurements were used as training data, and the last 185 days were used to assess the trained model. The model was designed to forecast 10 min, 30 min, and 60 min ahead each day during the peak PV generation time.
The proposed forecasting algorithm consists of two parts. In the first part, the ANFIS training dataset is decomposed using DWT. The function used to decompose the training dataset is then applied to decompose the test dataset. The study also investigated the accuracy of well-known wavelet mother functions. A statistical study was conducted to evaluate the performance of each function, including haar, db2, db3, db5, db8, coif1, coif2, coif3, coif5, sym4, sym6, and sym8.
The second part corresponds to the training and testing steps using ANFIS. Before developing the model, the optimal selection of the number of model inputs is essential because it can significantly reduce the computational time and cost. Two different numbers of inputs are presented in this work. The main settings of the ANFIS network include the types of input and output membership functions, the number of input and output membership functions, the number of iterations (epochs), and the optimization method, such as hybrid learning.
The MATLAB toolbox is used to generate the ANFIS model for the studied data. The resulting equation of each rule is obtained by linear least-squares estimation. Fuzzy c-means (FCM) was employed as the data clustering technique: every data point belongs to each cluster to a degree determined by its membership level. The number of clusters was set to 12, with 4 partition matrix exponents in 0.01 steps. First, the optimal dataset is determined so that the initial FIS can be generated. In this study, a maximum of 200 epochs was used to achieve accurate prediction.
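To make the clustering step concrete, the following is a minimal sketch of the standard fuzzy c-means iteration in plain Python; it is not the MATLAB toolbox routine used in the study. The function name `fcm`, the deterministic initialization, the fuzzifier value `m = 2.0`, and the stopping tolerance are all assumptions made for illustration.

```python
import math


def _dist(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def fcm(points, n_clusters, m=2.0, max_iter=100, tol=1e-5):
    """Fuzzy c-means on a list of equal-length tuples.

    Returns (centers, u), where u[i][j] is the membership of point j in
    cluster i, taken from the last membership update.
    """
    n, dim = len(points), len(points[0])
    # Assumed initialization: pick centers evenly spaced through the data order.
    centers = [list(points[(i * n) // n_clusters]) for i in range(n_clusters)]
    u = []
    for _ in range(max_iter):
        # Membership update: inverse-distance weighting controlled by m.
        u = [[0.0] * n for _ in range(n_clusters)]
        for j, p in enumerate(points):
            d = [_dist(p, c) for c in centers]
            zeros = [i for i, di in enumerate(d) if di == 0.0]
            if zeros:
                # The point sits exactly on one or more centers.
                for i in zeros:
                    u[i][j] = 1.0 / len(zeros)
                continue
            for i in range(n_clusters):
                u[i][j] = 1.0 / sum((d[i] / dk) ** (2.0 / (m - 1.0)) for dk in d)
        # Center update: mean of the points weighted by u**m.
        new_centers = []
        for i in range(n_clusters):
            weights = [u[i][j] ** m for j in range(n)]
            total = sum(weights)
            new_centers.append(
                [sum(w * p[k] for w, p in zip(weights, points)) / total for k in range(dim)]
            )
        shift = max(_dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, u
```

Each column of the membership matrix sums to one, which is what allows a data point to belong partially to several clusters, and hence to several fuzzy rules, at once.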
The training data were used to determine the parameters of the TSK-type FIS built on the hybrid learning algorithm, which integrates the least-squares estimator and backpropagation gradient descent, as described in Table 1. After the training stage, the developed forecasting models were run, and their efficiency was calculated with different evaluation criteria. The architecture of the developed ANFIS model is presented in Figure 3, with k inputs (k = 12 or k = 18), one output (ŷ), and 12 fuzzy rules (r = 12). When k = 12, the inputs are the 10 min PV generation data of the last 2 h; k = 18 uses the last 3 h of data. The 12 conditional statements are used as the rule base. The output (ŷ i ) gives the forecasted value of PV power. Data from the same period are used to forecast the coming 10 min, 30 min, and 60 min of each day. That is, the model uses 2 h or 3 h of data (from 9 am to 11 am or from 8 am to 11 am, in other words starting at t − 12 or t − 18, respectively) to forecast the PV power at 11:10 (t + 1), 11:30 (t + 3), and 12:00 (t + 6). The training and testing flowcharts are summarized in Figures 4 and 5, respectively.
The asterisk (*) in Figure 5 indicates that the test dataset wavelet transforms are acquired by applying the wavelet transform function used to decompose the training dataset.
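The input pattern described above can be sketched as a simple windowing routine. The sketch assumes that each day is stored as a list of 10 min samples and that the k inputs are the samples ending at index t (11:00) inclusive; the exact alignment of the window, like the function names, is an assumption rather than a detail confirmed by the text.

```python
def build_sample(series, t, k, horizon):
    """Return (inputs, target) for one day.

    series:  one day's PV output at 10 min resolution
    t:       index of the last observed sample (for example 11:00)
    k:       number of lagged inputs (12 for 2 h, 18 for 3 h)
    horizon: steps ahead to forecast (1, 3, or 6 for 10, 30, or 60 min)
    """
    if t - k + 1 < 0 or t + horizon >= len(series):
        raise IndexError("window or target falls outside the day's series")
    return series[t - k + 1 : t + 1], series[t + horizon]


def daily_samples(days, t, k, horizons=(1, 3, 6)):
    """Build one (inputs, target) pair per day and per forecasting horizon."""
    return {h: [build_sample(day, t, k, h) for day in days] for h in horizons}


if __name__ == "__main__":
    # A dummy day whose values equal their own index (144 ten-minute slots).
    day = list(range(144))
    t_11am = 11 * 6  # index of 11:00 when the day starts at 00:00
    inputs, target = build_sample(day, t_11am, k=12, horizon=6)
    print(inputs, target)
```

Keeping the window fixed and varying only `horizon` reflects the setup in which the same 2 h or 3 h of data serve all three forecasting horizons.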

Table 1. Hybrid learning algorithm.
                        Forward pass              Backward pass
Premise parameters      Fixed                     Gradient descent
Consequent parameters   Least-squares estimator   Fixed
Signals                 Node outputs              Error signals

The same datasets were used for an ANN model as a comparison to establish the effectiveness of the proposed approach. The ANN model's input pattern contained 12 or 18 inputs. The output layer was designed to forecast the power output value as described above for the ANFIS model.

Forecasting Accuracy Evaluation
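Various standard error metrics were used to evaluate the proposed PV output power prediction strategy. The actual and forecast sequences are represented by $\dot{y}_i$ and $\hat{y}_i$, respectively, with $N$ time steps, and $\dot{y}_{\max}$ represents the maximum recorded PV power in the test dataset. The normalized root mean square error (nRMSE (%)), mean absolute percentage error (MAPE (%)), mean absolute error (MAE (kWh)), root mean square error (RMSE (kWh)), and standard deviation (STD (kWh)) criteria are used to assess the accuracy of the different forecasting models. In their standard forms, these metrics are computed as:

$$\mathrm{nRMSE} = \frac{100}{\dot{y}_{\max}} \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( \dot{y}_i - \hat{y}_i \right)^2}$$

$$\mathrm{MAPE} = \frac{100}{N} \sum_{i=1}^{N} \left| \frac{\dot{y}_i - \hat{y}_i}{\dot{y}_i} \right|$$

$$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| \dot{y}_i - \hat{y}_i \right|$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( \dot{y}_i - \hat{y}_i \right)^2}$$

and STD is the standard deviation of the forecast error $\dot{y}_i - \hat{y}_i$.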
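The evaluation criteria can be sketched in a few lines of Python. The sketch assumes the textbook definitions of each metric, normalizes nRMSE by the maximum actual value, and takes STD as the population standard deviation of the forecast errors; these conventions, the function name, and the dictionary keys are assumptions and may differ in detail from the definitions used in the study.

```python
import math


def forecast_metrics(actual, forecast):
    """Compute nRMSE (%), MAPE (%), MAE, RMSE, and STD of the forecast error.

    Assumes len(actual) == len(forecast) and that no actual value is zero
    (MAPE is undefined otherwise).
    """
    n = len(actual)
    errors = [a - f for a, f in zip(actual, forecast)]
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    mape = 100.0 / n * sum(abs(e / a) for e, a in zip(errors, actual))
    nrmse = 100.0 * rmse / max(actual)  # normalized by the maximum actual value
    mean_error = sum(errors) / n
    std = math.sqrt(sum((e - mean_error) ** 2 for e in errors) / n)
    return {"nRMSE": nrmse, "MAPE": mape, "MAE": mae, "RMSE": rmse, "STD": std}


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    print(forecast_metrics([2.0, 4.0], [1.0, 5.0]))
```

Restricting the evaluation to peak generation hours, as in this case study, also avoids the zero actual values that would make MAPE undefined.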

Wavelet Study Results
The performance of the wavelet-ANFIS model was computed for various wavelet mother functions applied to the input datasets. Figure 6 shows the original PV output (in blue) compared with the wavelet output for different mother functions (Haar, Daubechies (db3), Coiflets (coif2), and Symlets (sym4)). Table 2 shows the forecasting results for the different wavelet mother functions. It can be seen from Table 2 that the wavelet-ANFIS models using coif2, sym4, and sym6 were more accurate than those using the other mother wavelets. Overall, the wavelet-ANFIS model with sym4 gave the highest efficiency, indicating that wavelet decomposition with the sym4 mother wavelet can noticeably enhance the accuracy of ANFIS forecasting models compared with other mother wavelet functions. Note that the values in Table 2 are averages computed over 30 differently shuffled versions of the data to allow a robust comparison.

Forecasting Results
The forecasting results were analyzed using the forecasting accuracy evaluation metrics described in Section 5.1.
All the mother wavelets described previously were evaluated, and the outcomes are presented in Table 2. It is essential to mention that after the forecasting output is obtained, the reconstruction function (2) is used to reconstruct the output. The forecasting error is computed with $\dot{y}_i$, which represents the original value of the PV power output before the wavelet decomposition.

Different forecasting methods were evaluated. First, wavelet-ANFIS was devised to show the effectiveness of wavelet decomposition; the decomposition significantly improves the efficiency of the forecasting method. The same forecasting cases were then conducted with ANFIS without wavelet decomposition. The statistical performance was likewise computed with the ANN forecasting model. The ANN and ANFIS models have similar input patterns, with 12 or 18 inputs. The ANN additionally has two hidden layers with 20 nodes each and one output. The Bayesian regularization backpropagation algorithm is used to update the weights, with a maximum of 800 epochs. Table 3 compares the wavelet-ANFIS, ANFIS, and ANN models for the different prediction scenarios. Up to 30 scenarios are conducted for each forecasting method to randomize the initial weights; each scenario is obtained by shuffling the available data to capture their diversity for a reliable and robust comparison. The average RMSE over all cases is given in Table 2, and Table 3 summarizes all evaluation criteria (nRMSE, MAPE, MAE, RMSE, and STD) for the wavelet-ANFIS, ANFIS, and ANN models.
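The repeated-shuffle protocol can be sketched as follows. How the shuffling interacts with the training and test split is not detailed in the text, so the sketch simply reshuffles all samples and re-splits them on every repetition; the split fraction, the seed, and the placeholder model `mean_predictor_rmse` are hypothetical stand-ins for the actual forecasting models.

```python
import math
import random


def mean_predictor_rmse(train, test):
    """Placeholder model (not from the study): predict the mean training target."""
    mean_target = sum(y for _, y in train) / len(train)
    return math.sqrt(sum((y - mean_target) ** 2 for _, y in test) / len(test))


def average_rmse_over_shuffles(samples, fit_and_score, n_shuffles=30,
                               train_fraction=0.8, seed=0):
    """Shuffle, split, and score n_shuffles times; return (mean score, all scores).

    samples:       list of (inputs, target) pairs
    fit_and_score: callable(train, test) -> RMSE, e.g. a wrapper around a model
    """
    rng = random.Random(seed)  # fixed seed so the experiment is repeatable
    scores = []
    for _ in range(n_shuffles):
        shuffled = samples[:]
        rng.shuffle(shuffled)
        cut = int(len(shuffled) * train_fraction)
        scores.append(fit_and_score(shuffled[:cut], shuffled[cut:]))
    return sum(scores) / len(scores), scores


if __name__ == "__main__":
    # Synthetic samples, for illustration only.
    data = [((float(i),), float(i % 5)) for i in range(50)]
    mean_rmse, all_rmse = average_rmse_over_shuffles(data, mean_predictor_rmse)
    print(mean_rmse, len(all_rmse))
```

Passing the model as a callable keeps the averaging loop identical for every model being compared, so differences in the averaged RMSE come from the models rather than from the protocol.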

Discussion
The simulation results for the different wavelet mother functions and the two input patterns are listed in Table 2. All cases are evaluated by the RMSE (kWh) index defined in (13). The mother function coif2 yielded the highest efficiency among the mother wavelets in the forecasting model with 12 inputs. With 18 inputs, sym4 yielded the lowest forecasting errors overall. Nevertheless, for the 10 min forecast, sym6 yielded the lowest error, followed by sym4. Furthermore, Table 2 indicates that the best accuracy for 10, 30, and 60 min PV power forecasting is obtained with 18 inputs. Meanwhile, the comparison demonstrates that DWT decomposition with the mother wavelets coif2, sym4, and sym6 can considerably increase the effectiveness of ANFIS models compared with the ANN one.
The comparisons between the forecasts obtained using the wavelet-ANFIS, ANFIS, and ANN models are listed in Table 3. The maximum forecasting RMSE values are obtained for the 60 min horizon: $2.2565 \times 10^{-4}$, $1.0610 \times 10^{-3}$, and $2.2924 \times 10^{-3}$ kWh for the wavelet-ANFIS, ANFIS, and ANN models, respectively. In other words, the maximum forecasting error of the ANN model was about 10.16 times that of the wavelet-ANFIS model and about 2.16 times that of the ANFIS model. Such results show that the wavelet-ANFIS and ANFIS models perform better in PV output forecasting. In addition, in terms of MAPE, MAE, and nRMSE, wavelet-ANFIS and ANFIS were more efficient than the ANN prediction model in all of the forecasting scenarios presented in this paper. The results in Table 3 are the average of 30 different shuffles, and it can be concluded that short-term PV power forecasting using the ANFIS model is more efficient than using the ANN.
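As a quick arithmetic check on the 60 min RMSE values quoted above, the pairwise ratios can be computed directly. The three values are taken from the text; the dictionary and function names are illustrative.

```python
# 60 min forecasting RMSE (kWh) as quoted in the text
rmse_60min = {
    "wavelet-ANFIS": 2.2565e-4,
    "ANFIS": 1.0610e-3,
    "ANN": 2.2924e-3,
}


def rmse_ratio(model_a, model_b):
    """How many times larger the RMSE of model_a is than that of model_b."""
    return rmse_60min[model_a] / rmse_60min[model_b]


ann_vs_wavelet = rmse_ratio("ANN", "wavelet-ANFIS")
anfis_vs_wavelet = rmse_ratio("ANFIS", "wavelet-ANFIS")
ann_vs_anfis = rmse_ratio("ANN", "ANFIS")

if __name__ == "__main__":
    print(f"ANN / wavelet-ANFIS:   {ann_vs_wavelet:.2f}")
    print(f"ANFIS / wavelet-ANFIS: {anfis_vs_wavelet:.2f}")
    print(f"ANN / ANFIS:           {ann_vs_anfis:.2f}")
```

The factor of about 10.16 corresponds to ANN relative to wavelet-ANFIS and the factor of about 4.70 to ANFIS relative to wavelet-ANFIS, while ANN relative to ANFIS comes out at about 2.16.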
For further investigation, the comparison between wavelet-ANFIS and ANFIS in Table 3 indicates that the 10, 30, and 60 min forecasting RMSE values of the ANFIS model with 12 inputs are $6.3196 \times 10^{-4}$, $8.5144 \times 10^{-4}$, and $1.0610 \times 10^{-3}$ kWh, respectively, which are about 3.38, 4.04, and 4.70 times those of the wavelet-ANFIS model. With 18 inputs, the RMSE of the ANFIS model is about 3.90, 3.66, and 4.67 times that of the wavelet-ANFIS model for the same scenarios. Figure 7, which illustrates the error distribution of the ANFIS and wavelet-ANFIS models, shows that the 10, 30, and 60 min forecasts of the wavelet-ANFIS model have more error-index values, computed with (18), close to zero.
In summary, Tables 2 and 3 and Figure 7 show that models using the wavelet components produced by DWT as inputs achieve greater efficiency than ANFIS models without wavelet transforms. It can be concluded that the forecasts of the wavelet-ANFIS models are closer to the actual PV power output values than those of the ANFIS and ANN models. Such a result indicates that wavelet decomposition can enhance the effectiveness of ANFIS models for PV power forecasting.

Conclusions
An extensive study on applying the wavelet-ANFIS method to PV output forecasting is presented in this paper. The specific aims are to develop and evaluate a purely endogenous method for PV output forecasting in Taiwan. In this work, we compared wavelet-ANFIS, ANFIS, and ANN based on several performance indexes, including RMSE, nRMSE, MAE, MAPE, and standard deviation. The results highlight that ANFIS obtains better results than the ANN model. Furthermore, the wavelet-ANFIS model forecasts the PV output power more accurately than both the ANFIS model and the ANN model. Comparison of the wavelet-based models shows that wavelet-ANFIS with coif2 and with sym4 performs better than all other mother functions when the last 2 h and 3 h of generated PV power, respectively, are used as the model input. The wavelet-ANFIS model using the last 3 h of data decomposed with sym4 yielded the best overall accuracy for forecasting PV power 10, 30, and 60 min ahead. Even at the 60 min horizon, the forecast retained high accuracy. The outcomes of this research demonstrate that combining DWT decomposition with the ANFIS model can meaningfully improve the reliability of short-term PV output power forecasting models and could be an outstanding tool for this task.

Institutional Review Board Statement:
The study did not involve humans or animals.

Informed Consent Statement:
The study did not involve humans.
Data Availability Statement: Not applicable.