Short-Term Forecasting of Energy Production for a Photovoltaic System Using a NARX-CVM Hybrid Model

In this paper, a methodology for short-term forecasting of power generated by a photovoltaic module is reported. The method incorporates a nonlinear autoregressive with exogenous inputs (NARX) fed by the solar radiation and temperature times series, as well as an estimation of power time series obtained by implementing an ideal single diode model. This synthetic time series was validated against an actual photovoltaic module. The NARX model has been implemented in conjunction with the corrective vector multiplier (CVM) technique, which uses solar radiation under clear sky conditions to adjust the forecasting results. In addition, collinearity and the Granger causality tests were used to choose the input variables. The forecasting horizon was 24-h-ahead. The hybrid NARX-CVM model was compared to a nonlinear autoregressive neural network and persistence model using the typic forecasting error measures such as the mean bias error, mean squared error, root mean squared error and forecast skill. The results showed that the forecasting skills of the hybrid model are about 34% against the NAR model and about 42% against the Persistence model. The model was validated by blind forecasting. The results demonstrated evidence of the quality of the conformed forecasting model and the convenience of its implementation and building.


Introduction
Research concerning solar radiation (SR) forecasting with the finality of estimating the electrical energy produced by photovoltaic systems has been increased in the last few years. SR estimation is of great importance because this parameter is used in the sizing of the photovoltaic and thermal systems; moreover, it has meteorological variables [1,2].
Even though it is difficult to classify them, they can generally be divided into three groups: in the first group, those who use only the clarity index [3][4][5][6][7]. In the second, the hybrid models, such as hybrid data grouping models [8], models that combine different types of artificial neural networks (ANN) [9,10], and models that use the clarity index [11]. Finally, in the third group, only ANN techniques are used.
There are models to predict the solar radiation or some of its components that only use historical data of a single variable; these models generally present a good performance; however, in some cases, there are used because there are no more meteorological variables. The most useful techniques for univariate data are statistical techniques [2,5] and neural networks.
Many of these works focused mainly on the architecture of the neural network and varying different parameters, such as the lags number, training functions, activation func-2 of 23 tions, et cetera [2,3,[11][12][13][14]. Among the most used models are the Feed Forward Neural Networks and Elman Recurrent [15].
For example, Tingting Zhu et al. [16] used a Siamese convolutional neural network to carry out the forecasting of the direct normal irradiance in a time horizon of 10 min; this is an interesting study due to the implemented neural network type since the more conventional used is deep learning. Other ANNs are also used in this forecasting process as genetic algorithms [17]. Besides the univariate models, there are approaches that take advantage of the characteristics of two or more techniques to develop hybrid models [8]. Abdel-Rahman Hedar et al. [18] proposed a hybrid model used to forecast solar radiation, using Weather Research Forecasting (WRF) and machine learning. They established a classification of the solar radiation, and then several models of classification were used to determine the corresponding class. The proposed methodology is evaluated using a real environment temporal environment data set collected from different regions of Saudi Arabia; nevertheless, the causality between the variables was not analyzed. Other models used different ANN techniques [9,10], diverse meteorological variables and additional variables such as extraterrestrial radiation [19]. It is also important to mention the models that only used the clear index to make predictions.
According to the above mentioned, many models use different types of ANN. In most works, the vectors that make up the input layer of the network are chosen arbitrarily; the same happens with the inputs that feed the network that represent the lags of the used variables (in the case of the NARX model, for example). Generally, the authors solve the problem of the input vector conformation by testing different combinations of variables and choosing the architecture that provides the best results, focusing on the development of a methodology for ANN training [20,21]; failing that, they use other types of methods to find the optimal inputs such as the k-nearest neighbors model [5,8,10], or other data grouping techniques.
Different combinations and types of input variables have been proposed for training the ANN for modeling, estimating, or forecasting the behavior of the SR. Kaplanis et al. [14], Yadav et al. [22] and Kaushika et al. [23] implemented a combination of meteorological and geographic data to improve the SR prediction. In contrast, Hocaoglu and Serttas [20] and Wang et al. [21] worked only with meteorological data. NARX models have been implemented successfully to forecast wind speed by Cadenas et al. [24]. These models have also been implemented successfully to predict SR by Ahmad et al. [25] and Alzahrani et al. [26].
This research reports a methodology for predicting short-term energy produced by a photovoltaic generating model (PGM) using a hybrid model. The methodology incorporates a NARX model fed with solar radiation and temperature time series, as well as a synthetic time series of the power of a photovoltaic module obtained from an idealized model of a single diode. Collinearity and the Engel-Granger tests were used to choose the significant variables of the input vector of the NARX model. The autocorrelation and partial autocorrelation functions were also implemented to estimate the input vector lags. In addition to the NARX model, the corrective vector multiplier (CVM) technique obtained from a solar model was implemented. The turbidity factor of the solar energy was used to adjust the forecasting results.
This work was divided into six sections: Section 1 exhibits a general view of the models used to forecast solar radiation as a first approach to predicting solar systems' output power. Section 2 presents the fundamental aspects of the models, techniques and tests used for solar radiation prediction. Section 3 describes the proposed methodology used to build the forecasting hybrid model. Section 4 shows the typical performance metrics employed to evaluate the forecasting models. In Section 5, an analysis of the forecasting results, as well as a discussion of the feasibility of the methodology used in the hybrid model construction, are included. Finally, the conclusions of the research work are presented in Section 6.

Collinearity Test
The collinear variables are those whose data vectors that represent them and are in the same line. In a general way, it is if the vectors that represent n variables are in the same subspace, that is, when one vector is a linear combination of the other vectors [27]. Exact collinearity is difficult to find in a real situation, obviously; the fact that exact collinearity is difficult to find does not mean that there are no collinearity troubles [28].
To find trouble collinearity, an analysis is required of the main components of the dependent variables to figure out if there is such a problem. The linear combination of the original variables is defined by the main components of a set of variables to other variables.

Augmented Dickey-Fuller Test
It is possible to apply the causality test only if the time series is stationary. This work implemented the augmented Dickey-Fuller (ADF) test. This test is based on the null hypothesis, which establishes that the analyzed time series is not stationary. Therefore, the time series is stationary if the null hypothesis is rejected [29][30][31]. One way to delete nonstationarity is through the differentiation method. This method is defined by the change between each of the observations in the original time series, as in the following equation: where Y t is the corresponding value in the time t and Y t−1 is the corresponding value in the time t−1.

Engle-Granger Causality Test
The Engle-Granger causality test has been used to know the relationship between diverse meteorological variables, as presented in [32]. This technique examines whether the lagged value of one variable helps to forecast other variables in a model, such as the Engle-Granger test [29]: Rule of decision: of p-value is: • <0.01 X Granger causes Y at the 1%; • >0.01 X does not Granger cause Y at the 1%.
The equations for two variables are as follows: where X represents mean global irradiation, Y represents the air temperature, u 1t is the uncorrelated white noise, α i , β j , λ i and δ j are parameters to be determined and n is the number of lags. Similarly, using the autoregressive vector technique, it is possible to apply this technique to several variables [29].

Simplified Single Diode Model
The method based on series resistance has been widely studied and characterized to model a PV generator [33][34][35]. It can obtain the characteristic curves and the generated electrical power for a PVG [27], as shown in Figure 1.

Simplified Single Diode Model
The method based on series resistance has been widely studied and characterized to model a PV generator [33][34][35]. It can obtain the characteristic curves and the generated electrical power for a PVG [27], as shown in Figure 1. The characteristic equation V-I that is used to model de PVG is defined by: where is determined by solving Equation (4) for the following open-circuit conditions: = 0 and = . and are the short-circuit current and open-circuit voltages to the environmental conditions for the generator.
Equation (5) can be solved by an iterative method. The resistance of the generator is defined as [36]: where is the filling factor of the generator without series resistance, and is the normalized value for the open-circuit voltage.
The short-circuit current of the PVG with temperature dependence was defined by [36] as: where is the cell temperature, is the cell temperature to conditions of nominal operation (25 °C), is the short-circuit current of the generator for environmental conditions, is the temperature coefficient for short-circuit current and is the The characteristic equation V-I that is used to model de PVG is defined by: where I o is determined by solving Equation (4) for the following open-circuit conditions: I G = 0 and V G = V ocGE . I scGE and V ocGE are the short-circuit current and open-circuit voltages to the environmental conditions for the generator.
Equation (5) can be solved by an iterative method. The resistance R sG of the generator is defined as [36]: with, where FF 0 is the filling factor of the generator without series resistance, and v oc is the normalized value for the open-circuit voltage.
The short-circuit current of the PVG with temperature dependence was defined by [36] as: where T c is the cell temperature, T c0 is the cell temperature to conditions of nominal operation (25 • C), I scGE is the short-circuit current of the generator for environmental conditions, ∂I scG ∂T c is the temperature coefficient for short-circuit current and I scGn is the nominal short-circuit current of the generator, i.e., the maximum power point to nominal operation.
The open-circuit voltage of the generator, depending on the temperature, is defined as: where ∂V ocG The cell temperature can be calculated with the environmental temperature and the nominal operating cell temperature (NOCT) provided by the manufacturer. In this way, the cell and module temperature will be given by: where NOCT is the nominal operating cell temperature provided by the manufacturer, T c is the cell temperature and T E is the environmental temperature.

Solar Radiation under Clear Sky Conditions
Solar radiation suffers losses because of atmospheric absorption and dispersion when this passes through the atmosphere. According to the Handbook of Solar Energy [37], after the atmospheric absorption phenomenon, the normal solar flown rate (solar radiation/normal irradiation) that reaches the Earth's surface can be estimated from Equation (12): where SR clr is the SR under clear sky conditions, I ext is the extraterrestrial radiation and cosθ z is defined as: cosθ z = cosφcosδcosω + sinδsinφ (13) where θ z is the zenith angle, T R is the turbidity factor, φ is the site latitude, δ is the declination angle of the Earth and ω is the hourly angle. The extraterrestrial radiation is obtained through the following equation [38]: where E 0 is the eccentricity correction factor, I sc = 1367 W/m 2 is the solar constant and ω 0.5 is the hourly angle each half hour.

Calculation of the Turbidity Factor
The following criteria were considered for calculating the turbidity factor: • At least one day of each month is completely clear and corresponds to the day that records the maximum SR measured for that month. • For each month of the year, the maximum extraterrestrial SR value was calculated to each maximum extraterrestrial SR value corresponding to a value of cosθ z month .
By solving the turbidity factor from Equation (12), Equation (15) is obtained: where SR max moth is the monthly maximum SR, I ext month is the monthly maximum extraterrestrial SR and cosθ z month ; i.e., they depend on the maximum extraterrestrial radiation value. Twelve SR max moth values were obtained; one for each month. In the same way, twelve I ext month maximum values were calculated. Finally, twelve values of cosθ z month , each according to the maximum monthly of I ext month , were obtained. Once these values were obtained, the monthly turbidity factor could be calculated [27].

Methodology for Building the NARX-CVM Hybrid Model
During the first stage, the time series of the site's meteorological variables used for the hybrid NARX-CVM model construction were the temperature (T), solar radiation (SR), relative humidity (RH), wind speed (WS), and pressure (P). In addition, a synthetic time series of the electric power (EP) was implemented, which was obtained by modeling the photovoltaic equipment. The structure of the NARX model was performed by an input/output vector. Collinearity, unit root and causality tests defined the input vector [24,27,29,32]. In addition, the autocorrelation and partial autocorrelation functions were implemented to estimate the lags. The final resulting vector used to feed up the NARX model was conformed by the time series of SR, T and EP, as well as 24 lags. The output vector was the time series of the electric power (CFP). Figure 2 shows the flow chart to forecast the electric power of a photovoltaic system. First, the electrical power (EP) estimation was obtained by feeding up the ideal single diode model (PVM) with SR and T time series. Second, the NARX model was trained using the time series of the SR, T and EP to get the forecast of the electrical power (FP). Finally, FP was improved by implementing the corrective vector multiplier (CVM) technique and obtaining the corrected forecast of the power (CFP). The main steps of the methodology are described below.

Methodology for Building the NARX-CVM Hybrid Model
During the first stage, the time series of the site's meteorological variables used for the hybrid NARX-CVM model construction were the temperature (T), solar radiation (SR), relative humidity (RH), wind speed (WS), and pressure (P). In addition, a synthetic time series of the electric power (EP) was implemented, which was obtained by modeling the photovoltaic equipment. The structure of the NARX model was performed by an input/output vector. Collinearity, unit root and causality tests defined the input vector [24,27,29,32]. In addition, the autocorrelation and partial autocorrelation functions were implemented to estimate the lags. The final resulting vector used to feed up the NARX model was conformed by the time series of SR, T and EP, as well as 24 lags. The output vector was the time series of the electric power (CFP). Figure 2 shows the flow chart to forecast the electric power of a photovoltaic system. First, the electrical power (EP) estimation was obtained by feeding up the ideal single diode model (PVM) with SR and T time series. Second, the NARX model was trained using the time series of the SR, T and EP to get the forecast of the electrical power (FP). Finally, FP was improved by implementing the corrective vector multiplier (CVM) technique and obtaining the corrected forecast of the power (CFP). The main steps of the methodology are described below.

Step 1. Databases (VARIABLES)
Temixco is a Mexican city located in the state of Morelos. It has warm, sub-humid weather and possesses excellent solar energy potential. The meteorological variables recorded in the ESOLMET station [39] are solar radiation, temperature, relative humidity, wind speed and pressure. This database was chosen thanks to its quality, its small sampling rate and its storage frequency.
The sensors record the data with an average of 10 min, and the measurement data correspond to two years. The database has 105,120 data on each meteorological variable. The data were transformed to an hourly scale, obtaining 17,520 data of each variable. Table 1 shows the sensor characteristics used in the ELSOMET meteorological station.

Step 1: Databases (VARIABLES)
Temixco is a Mexican city located in the state of Morelos. It has warm, sub-humid weather and possesses excellent solar energy potential. The meteorological variables recorded in the ESOLMET station [39] are solar radiation, temperature, relative humidity, wind speed and pressure. This database was chosen thanks to its quality, its small sampling rate and its storage frequency.
The sensors record the data with an average of 10 min, and the measurement data correspond to two years. The database has 105,120 data on each meteorological variable. The data were transformed to an hourly scale, obtaining 17,520 data of each variable. Table 1 shows the sensor characteristics used in the ELSOMET meteorological station.

Collinearity Test
To avoid spurious results, a forecasting model should not be including redundant or irrelevant variables. The collinearity test helps to identify these kinds of inlet variables. Figure 3 shows the collinearity test results; the analyzed meteorological variables are shown in the abscissa axis, while the ordinate axis presents the variance decomposition in a 0 to 1 range. The collinear variables have values above 0.5, and a red circle highlights them.
In this step, different practical techniques for choosing the multivariable forecasting model's input variables are explained: collinearity test, augmented Dickey-Fuller (ADF) test, time series differentiation and causality test. For practical purposes, the electrical power was not treated as one more variable among the input variables analysis because the EP was obtained from SR and T; besides, it has similar behavior to SR.

Collinearity Test
To avoid spurious results, a forecasting model should not be including redundant or irrelevant variables. The collinearity test helps to identify these kinds of inlet variables. Figure 3 shows the collinearity test results; the analyzed meteorological variables are shown in the abscissa axis, while the ordinate axis presents the variance decomposition in a 0 to 1 range. The collinear variables have values above 0.5, and a red circle highlights them.

Augmented Dickey-Fuller Test (ADF)
One of the conditions for applying the Engle-Granger causality test is that the time series must be stationary; therefore, the ADF test must be implemented before the causality test. The augmented Dickey-Fuller test determines whether the time series is stationary. Table 2 shows the results of the ADF test. First, the autocorrelation is verified by analyzing the Durbin-Watson statistic.

Augmented Dickey-Fuller Test (ADF)
One of the conditions for applying the Engle-Granger causality test is that the time series must be stationary; therefore, the ADF test must be implemented before the causality test. The augmented Dickey-Fuller test determines whether the time series is stationary. Table 2 shows the results of the ADF test. First, the autocorrelation is verified by analyzing the Durbin-Watson statistic. Durbin-Watson values, estimated using a 5% significance level, range from 1.85 to 2.15, and conclude that autocorrelation does not exist in the model [25,39]. Second, the p-value was verified to be higher than 0.05, which indicates that the time series has a unit root; thus, it is stationary. Consequently, there was no need to differentiate any time series, and the causality test could be applied directly. Figure 5 shows the flow chart following the results obtained with the ADF test. In case some of the time series have not been stationary, the first difference is calculated, and the ADF test is applied again. Generally, it is enough to differentiate the time series once for passing the ADF test.

Engle-Granger Causality Test Results
In this causality test, the nullity hypothesis establishes that the independent variables cause the dependent variable, so the independent variables contain helpful information for forecasting the behavior of the dependent variable. In these kinds of tests, the significance level is 1% [29]. Table 3 shows the causality test results applied to each group of variables obtained from the collinearity test. From the causality test results, only the WS value is statistically significant, as 0.12 > 0.01. Therefore, in this case, the nullity is rejected. Thus, it is established that for the variable set (SR, T, WS), the independent variable WS does not contain useful information to forecast the dependent variable, SR. Then, this variable is discarded from the input variables, and the unit is used.

Engle-Granger Causality Test Results
In this causality test, the nullity hypothesis establishes that the independent variables cause the dependent variable, so the independent variables contain helpful information for forecasting the behavior of the dependent variable. In these kinds of tests, the significance level is 1% [29]. Table 3 shows the causality test results applied to each group of variables obtained from the collinearity test. From the causality test results, only the WS value is statistically significant, as 0.12 > 0.01. Therefore, in this case, the nullity is rejected. Thus, it is established that for the variable set (SR, T, WS), the independent variable WS does not contain useful information to forecast the dependent variable, SR. Then, this variable is discarded from the input variables, and the unit is used. Figure 6 shows the variable combinations once the causality tests were applied. The red-dotted line indicates the inputs that best describe the behavior of the power, and the bold capital letters indicate the variables with strong collinearity. As previously reported, the first input combination resulting from the collinearity test was SR, T and WS. Once the causality test was applied, SR and T were obtained. The second variable set obtained from the collinearity test was SR, RH and WS; the causality test did not report any change. The causality test also shows no change for the third variable set (SR, WS, P). Therefore, it had input 1 = SR, T; input 2 = SR, RH, WS; and input 3 = SR, WS, P. The proposed methodology focuses on finding the optimal input vector; thus, the NARX model includes only SR and T as input variables for this study case.

Step 3. Lags for the NARX Model (LAGS)
The autocorrelation (ACF) and partial autocorrelation functions (PACF) were implemented to estimate the number of lags in the forecasting models. First, these functions were applied to the time series; then, they were analyzed to find the seasonal patterns that determine the lags number. The resulting interpretation of the ACF and PACF is through their plots, which indicate seasonality. Seasonality is defined as a pattern that repeats itself over fixed intervals in time [28,30].

Step 3: Lags for the NARX Model (LAGS)
The autocorrelation (ACF) and partial autocorrelation functions (PACF) were implemented to estimate the number of lags in the forecasting models. First, these functions were applied to the time series; then, they were analyzed to find the seasonal patterns that determine the lags number. The resulting interpretation of the ACF and PACF is through their plots, which indicate seasonality. Seasonality is defined as a pattern that repeats itself over fixed intervals in time [28,30]. Figure 7 shows the ACF and PACF plots. The ACF presents a sinusoidal behavior, which indicates the seasonality of the SR time series (Figure 7a). Simultaneously, the PACF plot (Figure 7b) shows peaks every 24 h, which means the time series has seasonality every 24 h. Therefore, the lags number defined for the NARX models was 24.

Step 4. Modeling Photovoltaic Systems
The idealized single diode model was used to estimate the electric power from T (°C) and SR (W/m 2 ) time series. A monocrystalline photovoltaic module ISF-250 black [40] was used in order to verify the model's validity. Mechanical and electrical characteristics are shown in Tables 4 and 5, respectively.  Figure 8a shows the characteristic curves of the manufacturer of the monocrystalline module ISF-250, and Figure 8b shows the characteristic curves obtained by the idealized single diode model. A great similitude between both characteristic curves is observed, which indicates the reliability of the electric power time series calculated from the model.

Step 4: Modeling Photovoltaic Systems
The idealized single diode model was used to estimate the electric power from T ( • C) and SR (W/m 2 ) time series. A monocrystalline photovoltaic module ISF-250 black [40] was used in order to verify the model's validity. Mechanical and electrical characteristics are shown in Tables 4 and 5, respectively.   Figure 8a shows the characteristic curves of the manufacturer of the monocrystalline module ISF-250, and Figure 8b shows the characteristic curves obtained by the idealized single diode model. A great similitude between both characteristic curves is observed, which indicates the reliability of the electric power time series calculated from the model. A comparison was also made between the electrical data of the monocrystalline module ISF-250 reported in Table 5 (manufacturer) and Table 6 (calculated). The comparison result is presented in Table 7, where the most significant error was for the maximum current ( ), with an error of 1.52% above the theoretical value. An essential factor to fit the single idealized diode model to the experimental model was the nonideality constant of the diode ( ), which frequently is set as 1.2 [34]. But, for this case study, = 1.8 was the value that the estimated results best fit the data provided by the manufacturer.  In this work, a nonlinear autoregressive exogenous model (NARX) was used for the short-term prediction of solar radiation. NARX model is a nonlinear autoregressive model that has exogenous inputs. The artificial neural network training was carried out using A comparison was also made between the electrical data of the monocrystalline module ISF-250 reported in Table 5 (manufacturer) and Table 6 (calculated). The comparison result is presented in Table 7, where the most significant error was for the maximum current (I max ), with an error of 1.52% above the theoretical value. An essential factor to fit the single idealized diode model to the experimental model was the nonideality constant of the diode (n), which frequently is set as 1.2 [34]. But, for this case study, n = 1.8 was the value that the estimated results best fit the data provided by the manufacturer. In this work, a nonlinear autoregressive exogenous model (NARX) was used for the short-term prediction of solar radiation. NARX model is a nonlinear autoregressive model that has exogenous inputs. The artificial neural network training was carried out using the meteorological variables recorded during the measurement of the first year. From this database, 70% dataset was used to train the NARX model, 15% dataset for the validation, and 15% dataset for the test. The models to obtain the blind forecasting were then implemented; thus, the measurement of the second year that was not used to train the model was selected. Different inputs were generated according to the results obtained in Step 2. The model outputs were the electrical power generated by the photovoltaic module in Step 4.
Generally, a NARX model is formed of an input layer, a hidden layer, and the output layer. For this case, the input layer was conformed by the input vectors previously obtained in Step 2 and the electrical power calculated in Step 4 (EPower), while the output layer was the electrical power (FPower). Figure 9 shows a simplified representation of the NARX models developed from the input vectors and lags obtained in Steps 2 and 3. The letter "I" indicates inputs, whereas "O" denotes the outputs, and 24 is the lag number. and 15% dataset for the test. The models to obtain the blind forecasting were then implemented; thus, the measurement of the second year that was not used to train the model was selected. Different inputs were generated according to the results obtained in Step 2. The model outputs were the electrical power generated by the photovoltaic module in Step 4.
Generally, a NARX model is formed of an input layer, a hidden layer, and the output layer. For this case, the input layer was conformed by the input vectors previously obtained in Step 2 and the electrical power calculated in Step 4 (EPower), while the output layer was the electrical power (FPower). Figure 9 shows a simplified representation of the NARX models developed from the input vectors and lags obtained in Steps 2 and 3. The letter "I" indicates inputs, whereas "O" denotes the outputs, and 24 is the lag number. This work's primary purpose is to improve the forecast of the power generated by a PV generator based on the appropriate selection of the input variables, the lags and the application of the corrective vector multiplier (CVM). Therefore, the default configuration proposed by the Matlab ® program was used. Figure 10 shows the general arrangement of the NARX model, where ( ) represents the neural network inputs, is the number of inputs variables and ( ) is the output variable, is the number of lags, are the weights, are the biases, ℎ is the number of hidden layers and is the number of output variables. This work's primary purpose is to improve the forecast of the power generated by a PV generator based on the appropriate selection of the input variables, the lags and the application of the corrective vector multiplier (CVM). Therefore, the default configuration proposed by the Matlab ® program was used. Figure 10 shows the general arrangement of the NARX model, where x(t) represents the neural network inputs, n is the number of inputs variables and y(t) is the output variable, L is the number of lags, w are the weights, b are the biases, hn is the number of hidden layers and m is the number of output variables.
A test was made first using all variables to determine the effectiveness of the proposed methodology. Then, using the input vectors and lags previously selected, the different architectures of the NARX models are described in Table 7. The simplest model was the H-NARX model, and it is a NARX model with two input neurons (SR and T), ten hidden neurons, 24 lags and one output neuron (electrical power). This simple model was compared with model NARX I (all the variables are used as inputs), NARX II (input neurons with SR, T and WS), NARX III (input neurons with SR, RH and WS), and finally, NARX IV (input neurons with SR, WS and P). This work's primary purpose is to improve the forecast of the power generated by a PV generator based on the appropriate selection of the input variables, the lags and the application of the corrective vector multiplier (CVM). Therefore, the default configuration proposed by the Matlab ® program was used. Figure 10 shows the general arrangement of the NARX model, where ( ) represents the neural network inputs, is the number of inputs variables and ( ) is the output variable, is the number of lags, are the weights, are the biases, ℎ is the number of hidden layers and is the number of output variables. A test was made first using all variables to determine the effectiveness of the proposed methodology. Then, using the input vectors and lags previously selected, the different architectures of the NARX models are described in Table 7. The simplest model was the H-NARX model, and it is a NARX model with two input neurons (SR and T), ten hidden neurons, 24 lags and one output neuron (electrical power). This simple model was compared with model NARX I (all the variables are used as inputs), NARX II (input Forecasting models of solar radiation and photovoltaic power sometimes return values that should not be considered in the final forecasting results. For example, a prediction model can forecast a positive value of the SR or output electric power of a PVG at night, which is wrong. For this reason, the use of the SR under clear sky conditions was proposed to improve the forecasting results of the electrical power through a corrective vector multiplier (CVM).
According to the performance tests, the forecast of the electrical power (CPOWER) results improved when the corrective vector multiplier was applied, where FPOWER is the forecast of electrical energy obtained from the NARX model, and the corrective vector multiplier was built from

Performance Tests
The NARX models were programmed using the ntstool library from Matlab ® . The input vectors were reported in Table 8. According to the proposed methodology, the bestinput vector was formed by SR and T, H-NARX without CVM and H-NARX-CVM once the CVM was applied (see Tables 8 and 9). The input number of neurons is defined by the input vectors, the hidden layer neurons are set up in 10, the output neuron is one and it is defined by the output vector. The number of lags was obtained using the ACF and PACF. The time series were pre/post-processing using the functions removeconstantrows and mapminmax. The first function removes the rows of the input vector that correspond to input elements that always have the same value because these input elements are not providing any useful information to the network, and the second function transforms input data so that all values fall into the interval [−1, 1]; this can speed up the learning networks. The division of data for training, validation and testing was carried out using dividerand; this function divided data randomly; the sample data were split up into 70%, 15% and 15% for training, validation and testing, respectively. The performance function is the mean squared error (mse). The transfer function is the tan-sigmoid defined as tansig(n) = The NARX models trained with the 2017 dataset were used to forecast the 2018 data set from 24 h ahead and updated the existing data until the whole year was completed. Performance tests were applied to the results of the annual forecasting to estimate which ANN architecture was the one that best predicts the behavior of the electrical power. Some of the most used performance tests are the mean squared error (MSE), the root of the mean squared error (RMSE), the mean bias error (MBE) and the coefficient of determination (R 2 ) [29,[41][42][43]. Equations (20)-(23) describe all these performance metrics. Table 8 shows the results of the performance tests for models before applying a corrective vector multiplier (output: FPower). Table 9 shows the results of the performance tests when the CVM (output: CPower). Table 10 shows a comparison between the RMSE and cRMSE and the improvement rate of RMSE for each case. In all cases, forecasting results were improved when CVM was applied. The most significant improvement was obtained for the NARX-CVM I model, where the RMSE was enhanced by 6.7%. By contrast, the lowest improvement was obtained in NARX-CVM IV with a 0.4% improvement.

Comparison between Models with and without CVM
Another essential tool used for evaluating forecasting results is the linear regression plot, where the ordinate axis represents the forecast data while the abscise axis indicates the actual data. Figure 11 shows the linear regression plots of the five evaluated cases, where models without the CVM are at the top of the figure. At the bottom of the figure are the linear regressions that resulted from applying the corrective vector. This figure shows that the linear regression dataset was more homogeneous when the corrective vector was applied. The coefficient of correlation showed slight improvement when the CVM was

Comparison of the H-NARX-CVM Model against Other Models
The forecasting results are often compared with simpler models to identify the Figure 11. Linear regression analysis for the five study cases: models without CVM at the top and models with CVM at the bottom.

Comparison of the H-NARX-CVM Model against Other Models
The forecasting results are often compared with simpler models to identify the proposed model's ability to forecast. From Table 10, the H-NARX-CVM model was the one that provided the best results, that is, the model with the smallest RMSE. Therefore, this was the only model used to compare the results with other prediction models. One of the most widely used models for comparison purposes is the well-known persistence model (Equation (24)) [25,[43][44][45].
Like the persistence model, the nonlinear autoregressive (NAR) neural network has been used by Benmouiza, Cheknane [9,46] and García-Tena et al. [47] to forecast time series or as a benchmark to compare more complex models. Table 11 shows the results of the performance tests for the proposed methodology, NAR and the persistence model. The NARX models obtained with the proposed method outperformed the persistence and NAR models. On the other hand, predictive modeling researchers use forecasting skills as one of the most widely used measures. This compares the developed model with a less complex model, such as persistence [41][42][43]. In this research, the NAR and persistence models have also been used as benchmarks: where RMSE H−NARX−CVM is the RMSE for the H-NARX-CVM model and RMSE Re f is the RMSE for the benchmark models. Figure 12 (cyan-dotted line) shows the results of the forecasting skill using the persistence model as a benchmark, and a key performance indicator (KPI) of 35% was used as a goal. H-NARX-CVM I, II, III and IV models were higher than 35% of the KPI, with the proposed methodology being the one that obtained the best result with a skill forecasting of 42%. Compared with model 1, with a forecasting skill of 31%, it did not exceed the target value. But, when the exercise was performed using the NAR model as a benchmark, the NARX models did not exceed the KPI goal, H-NARX-CVM IV being the best result, with skill forecasting of 34%, as shown in Figure 12 (magenta-dotted line).
Results of the H-NARX-CVM, NAR and Persistence Models Versus the Real Data This section compares the actual electrical power and the forecasting results calculated as follows: (1) The blind prediction of the power obtained using the proposed methodology; (2) The blind prediction using the NAR model; (3) The prediction using the persistence model.
A qualitative comparison was made using plots of time series of randomly chosen days each month (Figure 13), and quantitative analysis was performed to calculate the forecasting errors (Table 12). The days used to compare the mentioned models were obtained using the single diode model algorithm, which allows obtaining a power production daily of the PVG. persistence model as a benchmark, and a key performance indicator (KPI) of 35% was used as a goal. H-NARX-CVM I, II, III and IV models were higher than 35% of the KPI, with the proposed methodology being the one that obtained the best result with a skill forecasting of 42%. Compared with model 1, with a forecasting skill of 31%, it did not exceed the target value. But, when the exercise was performed using the NAR model as a benchmark, the NARX models did not exceed the KPI goal, H-NARX-CVM IV being the best result, with skill forecasting of 34%, as shown in Figure 12 (magenta-dotted line).

Results of the H-NARX-CVM, NAR and Persistence Models Versus the Real Data
This section compares the actual electrical power and the forecasting results calculated as follows: (1) The blind prediction of the power obtained using the proposed methodology; (2) The blind prediction using the NAR model; (3) The prediction using the persistence model.
A qualitative comparison was made using plots of time series of randomly chosen days each month (Figure 13), and quantitative analysis was performed to calculate the forecasting errors ( obtained using the single diode model algorithm, which allows obtaining a power production daily of the PVG.  Finally, a random day was chosen from the 12 days separated from the annual forecast to make a visual and more detailed analysis of the forecast's behavior obtained from the H-NARX-CVM, NAR and persistence. In this case, it turned out to be October 19, as is shown in Figure 14. From this figure, the H-NARX-CVM and the NAR models are the best predictors of the behavior of electrical power. It can also be observed that at 3:00 p.m., the electrical power generated by the PV system was 50.84 W. In comparison, the persistence prediction was 87.22 W, the artificial neural network was 102.36 W and 116.58 W by the H-NARX-CVM model.  Finally, a random day was chosen from the 12 days separated from the annual forecast to make a visual and more detailed analysis of the forecast's behavior obtained from the H-NARX-CVM, NAR and persistence. In this case, it turned out to be October 19, as is shown in Figure 14. From this figure, the H-NARX-CVM and the NAR models are the best predictors of the behavior of electrical power. It can also be observed that at 3:00 p.m., the electrical power generated by the PV system was 50.84 W. In comparison, the persistence prediction was 87.22 W, the artificial neural network was 102.36 W and 116.58 W by the H-NARX-CVM model. According to Figure 14, the best model that predicted the electrical power produced by the photovoltaic system was the NAR model, followed by the proposed methodology. The persistence model had a deficient performance. It is evident that on this particular  According to Figure 14, the best model that predicted the electrical power produced by the photovoltaic system was the NAR model, followed by the proposed methodology. The persistence model had a deficient performance. It is evident that on this particular day (October 19), the NAR model surpassed the NARX-CVM model. However, in general terms, the proposed methodology surpassed the NAR model, thus concluding that its use is more appropriate, as shown in Tables 11-13.

Conclusions
In this work, the authors present a methodology to improve and simplify the NARX models, the input vector is the meteorological variable and the output vector is the electric power. The electric power is estimated with the simplified single diode model. The methodology is divided into two parts. The first one focuses on the input vector and implements the collinearity and Granger causality tests to build the input vector from the available variables. The second part focuses on the output vector and implements solar radiation models to build a CVM, which is applied to the output vector to treat with implying atypical results due to the solar radiation behavior. The collinearity test determines which variables are collinear, and the collinear variables are used to form three variable groups. The first group is formed with SR, T and WS, the second one with SR, RH and WS, and the third group is formed with SR, WS and P. The Granger causality test is applied to the three variables' group; this technique determines which variables have useful information to forecast the dependent variable. According to the collinearity and Granger causality tests, the simpler NARX model is when an input vector forms with SR and T in the H-NARX-CVM model is used.
Four NARX models were developed to validate the results with NARX-CVM I-IV. The first uses all meteorological variables as input vectors, and three more are used as input variables: the variables obtained from the collinearity test. The results indicate that the best model was H-NARX-CVM model, obtained with the proposed methodology, demonstrating the importance that the input vector plays in multivariable models. The skill forecasting using a 35% as a goal (KPI) was determined; as a benchmark, we proposed to use the NAR model and the persistence. The skill forecasting results indicate that the worse NARX model, NARX-CVM I, does not overcome the proposed KPI, whereas the NARX-CVM II, III and IV only overcame the KPI, taking it as a benchmark persistence. The H-NARX-CMV obtained from the proposed methodology overcame the proposed KPI with a value of 42%; using persistence as a benchmark and using the NAR model as a benchmark, the model obtains a result of 34%, 1% above the established goal. According to the previous report, we can conclude that even using only the collinearity tests to prove different vectors, we obtained good results; otherwise, it is important to point out that the usage of all variables gives worse results.
Finally, it is important to point out that the development methodology in this work can be applied anywhere. The only requirement is that the site counts with a considerable solar energy resource. However, due to the significant variability of the meteorological variables in different places, the causal relationship between the variables changes. It is necessary to carry out the whole methodology and fix the proposed NARX model. Acknowledgments: The authors appreciate the Instituto de Energías Renovables de la Universidad Nacional Autónoma de México for providing the database through the "Sistema de Información de datos Solarimétricos y Meteorológicos" and sincerely thank Jesús Quiñones for the collaboration.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The