Prediction of Horizontal Daily Global Solar Irradiation Using Artificial Neural Networks (ANNs) in the Castile and León Region, Spain

This article evaluates horizontal daily global solar irradiation predictive modelling using artificial neural networks (ANNs) for its application in agricultural sciences and technologies. An eight year data series (i.e., training networks period between 2004–2010, with 2011 as the validation year) was measured at an agrometeorological station located in Castile and León, Spain, owned by the irrigation advisory system SIAR. ANN models were designed and evaluated with different neuron numbers in the input and hidden layers. The only neuron used in the outlet layer was the global solar irradiation simulated the day after. Evaluated values of the input data were the horizontal daily global irradiation of the current day [H(t)] and two days before [H(t−1), H(t−2)], the day of the year [J(t)], and the daily clearness index [Kt(t)]. Validated results showed that best adjustment models are the ANN 7 model (RMSE = 3.76 MJ/(m2·d), with two inputs ([H(t), Kt(t)]) and four neurons in the hidden layer) and the ANN 4 model (RMSE = 3.75 MJ/(m2·d), with two inputs ([H(t), J(t)]) and two neurons in the hidden layer). Thus, the studied ANN models had better results compared to classic methods (CENSOLAR typical year, weighted moving mean, linear regression, Fourier and Markov analysis) and are practically easier as they need less input variables.


Introduction
Solar radiation affects all the Earth's processes related to plant growing and the environment, and it plays a fundamental role in the development of human activities. Among these processes, solar radiation influences the primary production of plants by means of the photosynthesis process and the temperature and water evaporation into the atmosphere and, consequently, also humidity of ground and air [1]. Therefore, solar radiation data at ground level are important for a wide range of applications in agricultural sciences [2], particularly for agricultural hydrology, soil physics, monitoring plant growth and disease control, and estimating crop evapotranspiration for irrigation, as well as in meteorology, engineering, and research in the natural sciences.
Many computer simulation models, which predict growth, development, and yield of agronomic and horticultural crops, require daily weather data as input. Evapotranspiration, which is very important for applications in agriculture and environment, can be calculated as a function of several meteorological data including solar radiation [3,4]. Solar radiation is a function of multiple variables. Iqbal [5] details perfectly how to determine quantitatively the amount of solar radiation incident on a surface of the Earth. Existing solar irradiation models can be classified according to different criteria [1]: (1) Output data (direct, diffuse, or global radiation); (2) input data (used predicted variables, which can be meteorological or geographical); (3) model spatial resolution (for a particular location or for large areas); (4) model temporal resolution (according to the application: short term predictions, for example every each minute, used in solar position monitoring devices in concentrated solar installations; time predictions, used for simulations of the solar thermal systems behavior at low temperature; and daily predictions, commonly used in agricultural meteorology); (5) solar radiation wavelength, which is herein of interest; (6) modelling algorithm type (deterministic or algorithm); (7) simulation model type (physical or formal; statistical or empiric); (8) caption surface geometry (horizontal, leaning surfaces or with solar position monitoring); (9) sky types (clear sky, partly cloudy, or cloud cover, based on the cloud presence influence).
On the other hand, different artificial intelligence techniques-such as artificial neuronal networks (ANNs), genetic algorithms, fuzzy logic, or hybrid techniques generated as a result of the previous ones-have been used in different ways to estimate solar irradiation [6], considering different time scales (monthly, daily, and hourly), using different meteorological variables (for example, temperature and air relative humidity, wind speed, cloudiness, daytime period duration, clarity index, and atmospheric pressure), or including geographical coordinates (latitude, longitude, and altitude) from the locations where the predictions are applied. Qazi et al. [7] reviewed different articles that apply ANNs for solar irradiation simulation and they concluded that this technique results in good precision prediction, with errors smaller than 20% based on the input data used and the architecture of the selected ANN. Further, they concluded that it is necessary to do more research into ANN solar modelling in order to provide better adjusted predictions.
Furthermore, results on the ANNs solar irradiation modelling have been compared with those obtained applying other statistical modelling techniques, such as atmospheric sciences [8], which allowed for the identification of ANNs, mainly when the meteorological conditions are complex [9], concluding that ANN models obtained better prediction results. On this matter, Yadav and Chandel [10] verified the best predicted ANN behavior compared with conventional methods, reporting that ANN models adjustment depends on the input parameters combinations (they use sunshine hour and air temperature), training algorithm, and ANN architecture (they used a model with learning machine algorithm).
For their part, Ghimire et al. [11], for the global solar irradiation estimation, used machine learning models based on ANNs, obtaining better results that when other models are used, such as support vector regression (SVR), gaussian process machine learning (GPML), and genetic programming (GP).
The influence of the input variables is an important aspect which was already mentioned in the ANNs solar radiation modelling, in which, explicitly or implicitly, the geographical location factor appears when the simulation is applied. Thus, Shah et al. [12] estimated the diffuse solar radiation monthly average, daily and hourly, under different meteorological conditions measured in India, providing data of nine variables to the input neuronal layer (latitude, longitude, month of the year, temperature and air humidity, rainfall, air speed, and long wavelength solar radiation). However, in the same country, Rao et al. [13] estimated the daily global solar irradiation, monthly average, and found that the best input combination was the difference between the maximum and minimum daily temperatures and the extra-terrestrial solar radiation, obtaining with them a very small prediction error (RMSE = 3.96%). Furthermore, the same authors concluded that ANN models presented better prediction statistic than conventional classical models of simulation, also employing a generally smaller input variable number. Seme et al. [14], for the Slovenian meteorological conditions, estimated the global solar irradiation every 30 min, using five input variables in their ANN model (i.e., extra-terrestrial solar radiation, solar zenith angle, day of the year, temperature, and relative atmospheric pressure), obtaining good prediction results but only for those days when the cloudiness was low.
Mentioning other applications, ANNs have also been used to obtain solar irradiation maps in large territorial surfaces through the generation of solar irradiation data synthetic series, when the quantity and quality of the available measures is not the appropriate one, such as is the case of Sözen et al. [15] in Turkey, Hontoria et al. [16,17] in Spain, or Siqueira et al. [18] in Brazil.
On the other hand, despite the use of ANNs, other modelling techniques, based on artificial intelligence, have also been used to predict solar radiation, its components, and related phenomena associated to solar influence. In this way, the Ångström regression equation has been improved through fuzzy logic techniques working with data from Turkey [19] and ANNs with data from Cyprus [20] or Iran [21,22]. Using meteorological data of Algeria, Mellit et al. [23] presented an application of the adaptive network-fuzzy inference system (ANFIS) and Markov's matrix to estimate the sequences of the monthly average clearness index and the daily global solar irradiation, on the basis of the geographical coordinates of the evaluated positions (i.e., latitude, longitude, and altitude). In the same line of research, López et al. [24] used the bayesian method to evaluate the relative importance of atmospheric variables and radiometric as an input to an ANNs model, which resulted in the finding that the clearness index and the relative air mass as the most important variables to estimate the hourly direct solar irradiation. Equally, for the meteorological conditions of Thailand, Pattanasethanon et al. [25] studied the available solar irradiation in a tropical climate with different sky cloudiness scenarios.
The use of solar radiation data measured in previous days (delay days) highly improves the precision of any simulation model, amongst which we find ANNs. Thus, Mellit and Pavan [26] predicted solar irradiation the day after, in Italian meteorological conditions, with the last 24 h of data collection (using as input data the daily irradiance and the average daily air temperature), proving that the model works perfectly in sunny days but reduces the efficiency in cloudy days. Based on Corsica (France) conditions, Paoli et al. [27], in their ANN model, used daily solar irradiation values from 1 to 15 delay days, as well as a normalization of solar irradiation data in the Earth surfaced in relation to extra-terrestrial solar radiation data, thus improving the measured and predicted data correlation. In Greece, Zervas et al. [28] used six different clearness index levels and previously measured solar irradiance values (based in 10 min test) to predict hourly solar global irradiance and the solar midday. Mellit et al. [29,30] used an ANN hybrid model and Markov matrixes to predict daily global solar irradiation in Algeria, using solar irradiation values with some delay days. For Chinese meteorological conditions, Cao et al. [31][32][33][34] used different wavelength setting neuronal networks to predict solar irradiance, analysing the importance of different influential values, amongst which we can find air mass, clouds, and the wavelength of the incidental solar radiation (atmospheric influence).
Recently, Caldas and Alonso-Suárez [35] studied the prediction of solar irradiance in the short term (1-10 min), combined with all-sky images and irradiance measurements. Paulescu and Paulescu [36] evaluated five statistical models for the immediate prediction of solar irradiance in Romania, without classifying any of the models as the best one. Heydari et al. [37] developed a method based on various sections (wavelet transform, hybrid feature selection, group method of data handling neural network, and modified multi-objective fruit fly optimization algorithm) for the short term prediction of wind speed and solar radiation, with the objective of analyzing the energy consumption in the Favignana island microgrid, in the south of Italy.
In this article, the horizontal daily global solar irradiation for the day after (i.e., [H(t+1)]) is studied though models based on ANNs and using data measured from Mansilla Mayor (León, Spain). The main objective of this work was to achieve improved solar radiation daily predictors with agricultural purposes in order to obtain a better prediction of the [H(t+1)], using the least possible number of inputs, with the aim of facilitating its practical application in irrigation needs processes and evaluation systems or to predict plant growth or disease control. With that purpose, different ANNs were evaluated with solar radiation data with delay days (between 1 to 3), resulting in a similar simulation adjustment to that achieved with classical models, which were significantly improved by adding the year and day number as a predictor, and/or the daily clearness index, which considers the randomness of the cloud's presence event.

Horizontal Daily Global Solar Irradiation Data
Data on the horizontal daily global solar irradiation used in this study (i.e., an eight years period, from 2004 to 2011), were collected at the agrometeorological station that belongs to the SIAR system (Agroclimatic Information System for Irrigation, in Spanish) located in the Mansilla Mayor (León, Castile and León region, center-north of Spain), with geographical coordinates 42 • 30 43" N and 5 • 26 46 O, altitude 791 MAMsl and local time GMT-21.725555. The SIAR system is a project from the Ministry of Environment and Rural Areas and Maritime of Spain, which is managed by ITACyL (Agricultural Technological Institute in Castile and León, in Spanish), through the weather information service InfoRiego, by which farmers obtain advice on irrigation water doses with the objective of its rationalization [38]. Global solar irradiation at the agrometeorological reference station is measured with a Skye SP1110 pyranometer (Campbell Scientific, Inc., North Logan, UT, USA), in which the silica photocell measures the incident solar radiation in the spectrum band between 350-1000 nm. The electronic circuit for the linearization and the amplification of that sensor is situated next to the Vaisala HMP45C probe (Campbell Scientific, Inc., North Logan, UT, USA), to measure temperature and ambient relative humidity, in the ranges of −40 to 60 • C, and 0 to 100%, respectively. Climate classification of the location of the agrometeorological station is Csb [39], according to Koppen-Geiger climate classification, with the following average yearly values (data from 1981-2010):

The Prediction of Horizontal Daily Global Solar Irradiation Using ANNs
Prediction of horizontal daily global solar irradiation of the day after ([H(t+1)], MJ/(m 2 ·d)) was carried out with different empirical models similar to the black-box, implemented ANNs that used data corresponding to the seven year period from 2004 to 2010. The automatic data preprocessing and postprocessing has done with the GUI (graphical user interface) neural network fitting toolbox 'nftool' in MATLAB [40].
To achieve the proposed objective, eight ANNs architectural models were designed (i.e., from ANN 1 to ANN 8), with different input data combinations: [Kt(t)], daily clearness index, adimensional, which is calculated as the relation between the incident global solar irradiation over the earth surface (H), and the extraterrestrial solar irradiation (H 0 ), which is calculated for each particular day as a function of the latitude [3].
In the hidden layer of the different evaluated ANNs, a different number of neurons was used (i.e., 1-10, 20, 30, 40, and 50) with the purpose of knowing their influence in the simulation precision. The output layer in all the tested ANN models has only one neuron that corresponds to the [H(t+1)] prediction during a whole year (year 2011), by calculating the root mean square error (RMSE, MJ/(m 2 ·d)), which was obtained in those predictions regarding the measured data, with the objective of selecting a better predictive behavior ANN model. The evaluated ANNs architectures are shown in Figure 1.  ANNs are created with the feedforwardnet function, fed with the input and output data vectors, which determines the size of the respective layers, generating a multilayer feed-forward perceptron (MLP) ANN with just one hidden layer, where the selected transference function between neurons in the hidden layer is the hyperbolic tangent sigmoid (tansig), whilst the selected transference function for the output layer neuron is lineal (purelin) [41,42].
The back-propagation Levenberg-Marquardt (BP-LM) algorithm is applied to achieve a quick optimization (trainlm), as well as the following options: bias learning function and weight moment with descendent gradient (learngdm); normalized function of squared error (mse); and functions of the processing of input matrix elements, such as data processing to recoding unknown data rows (fixunknowns), repeated data vectors at the inlet, which do not provide useful information (removeconstantrows), and the matrixes processing to normalized vectors with minimum and maximum values in the range of [−1, 1] (mapminmax) [41,42]. ANNs are created with the feedforwardnet function, fed with the input and output data vectors, which determines the size of the respective layers, generating a multilayer feed-forward perceptron (MLP) ANN with just one hidden layer, where the selected transference function between neurons in the hidden layer is the hyperbolic tangent sigmoid (tansig), whilst the selected transference function for the output layer neuron is lineal (purelin) [41,42].
The back-propagation Levenberg-Marquardt (BP-LM) algorithm is applied to achieve a quick optimization (trainlm), as well as the following options: bias learning function and weight moment with descendent gradient (learngdm); normalized function of squared error (mse); and functions of the processing of input matrix elements, such as data processing to recoding unknown data rows (fixunknowns), repeated data vectors at the inlet, which do not provide useful information (removeconstantrows), and the matrixes processing to normalized vectors with minimum and maximum values in the range of [−1, 1] (mapminmax) [41,42].
ANN training was done with the train function, with input and output data vector matrixes over a period of seven years (i.e., from 2004 to 2010, without the consideration of 2011, as it was the year of verification), registering the whole training period (epoch and performance functions). Finally, sim function was used, with previously trained ANNs, to do the prediction of [H(t+1)], with the vector matrix of input data from 2011.

The Prediction of Horizontal Daily Global Solar Irradiation Using Classic Models
Classic models for the daily global solar irradiation prediction over the horizontal surface, which has been considered in this work, are: (1) CENSOLAR typical year; (2) weighted moving mean with partial autocorrelation; (3) linear regression; (4) Fourier analysis; and (5) Markov analysis. Model building was done by taking the horizontal daily global solar irradiation corresponding to the seven year period (i.e., 2004-2010) that was considered for the construction of the models.

CENSOLAR Typical Year
The horizontal global solar irradiation values included in the tables CENSOLAR [43] table characterize an average day of each month in a typical year. This is why this model is mainly applicable to large geographical areas and for long periods of time. The information is available for each of the Spanish provinces.

Weighted Moving Mean with Partial Autocorrelation
Partial autocorrelation makes reference to variable value dependency with the same variable values precedents in time. For that reason, the weighted moving mean (WMM) model with partial autocorrelation is applied, giving more weight to the values closer to the simulation day and less weight to the ones furthest, with the objective that the average value behaves more flexible than when a simple mobile mean model is used.
In this current work, partial autocorrelation coefficients obtained over a seven year period was used, which were calculated with the parcorr MATLAB function, defining the weighted mobile mean with 2 to 20 days' time delays, in 2011. The partial autocorrelation coefficients are applied to the solar irradiation value corresponding to its delay day, doing the summation of those products and dividing by the sum of those coefficients.

Linear Regression
Global solar radiation prediction can be done through a lineal regression, that models the relation between a dependent variable, in this case the global solar irradiance the day after H(t+1), and an independent variable, in this case the global solar irradiance of the current day H(t), together with a calculated random term using MATLAB curve fitting toolbox 'cftool'.

Fourier Analysis
Fourier analysis is applied to the variables that show significant frequencies, such as it occurs with the horizontal daily global solar irradiation. In this case, MATLAB Curve Fitting Toolbox 'cftool' was used to calculate the first eight harmonic coefficients.

Markov Analysis
A Markov chain is a situation or states chain that is generated with a random process in which a series of state change amongst different states of a finite group of possible states.
In this current work, for the data series between 2004 and 2010, states were defined, one per each horizontal daily solar irradiance integer value (MJ), according to the following process:

−
All data series were rounded and a corresponding state was assigned making use of MATLAB 'round' function.

−
Maximum and minimum value of the data series was found for the horizontal daily global solar irradiance by using MATLAB 'max' and 'min' functions, resulting in 33 possible states. − A probability matrix was created using the transitions or change of state of the existing data series. − A Markov transition matrix (MTM) probability matrix was created with a transitions number for each data series state. The state change probability has been calculated for the day after based on the state of the current day, multiplying the state vector of the current day by NMTM vector, ending with a vector in which the position where the highest value is located, will be the state with the highest probability to occur the day after.

Results
In this section, the obtained results of the application of horizontal daily global irradiation prediction models of the day after [H(t+1)] are shown, based on the measurements taken at the agrometeorological station, located in Mansilla Mayor (León, Spain), which belongs to the SIAR network, between 2004 and 2010.
In 2011, measured data at the same station was used in the model validation phase. In order to do the validation and be able to compare the predictive models behavior, the following statistical tools were used: Root mean square error (RMSE, MJ/(m 2 ·d)), with Equation (1a); determination coefficient (R 2 ), model adjustment level indicator, with Equation (1b); Durbin-Watson (DW) coefficient, used to detect the first order autocorrection between the data, with Equation (1c); mean percentage error (MPE), which allows the interpretation the prediction error bias, with Equation (1d); forecast accuracy (FA), used to make a measurement available for the precision of prediction models at short term, with Equation (1e); and Akaike information criterion (AIC), used to select the most adequate model taking into account the number of used variables, penalizing the most complex models, with Equation (1f).
where k is the number of variables.

Results of Simulations with the Artificial Neural Networks Models
First of all, the research was focused in ANN models to identify the network architecture that simulates with more precision [H(t+1)], as a function of the neuron number (i.e., 1-10, 20, 30, 40, and 50) that forms the hidden layer. The statistics used for this validation is the RMSE of the simulated outlet based on the data measured for the same variable.
Obtained results for the ANN 1, ANN 2, and ANN 3 models are shown in Figure 2. The best results for the prediction done for the ANN 1 model were achieved with the (1-1-1) architecture, with RMSE = 4.26 MJ/(m 2 ·d), whilst for ANN 2 model the best results were achieved with the (2-7-1) architecture, with RMSE = 4.12 MJ/(m 2 ·d). Finally, the best behavior for ANN 3 model was obtained with the (3-20-1) architecture, with a RMSE = 3.96 MJ/(m 2 ·d).
Agronomy 2020, 10, x FOR PEER REVIEW 8 of 19 First of all, the research was focused in ANN models to identify the network architecture that simulates with more precision [H(t+1)], as a function of the neuron number (i.e., 1-10, 20, 30, 40, and 50) that forms the hidden layer. The statistics used for this validation is the RMSE of the simulated outlet based on the data measured for the same variable.
Obtained results for the ANN 1, ANN 2, and ANN 3 models are shown in Figure 2. The best results for the prediction done for the ANN 1 model were achieved with the (1-1-1) architecture, with RMSE = 4.26 MJ/(m 2 ·d), whilst for ANN 2 model the best results were achieved with the (2-7-1) architecture, with RMSE = 4.12 MJ/(m 2 ·d). Finally, the best behavior for ANN 3 model was obtained with the (3-20-1) architecture, with a RMSE = 3.96 MJ/(m 2 ·d). Furthermore, the results were obtained with the simulation by using ANN 4, ANN 5, and ANN 6 models ( Figure 3). It can be seen that best results of the prediction done obtained with ANN 4 model were those for the (2-2-1) architecture, RMSE = 3.75 MJ/(m 2 ·d), whilst for ANN 5 and ANN 6 models were achieved with the (3-4-1), RMSE = 3.78 MJ/(m 2 ·d), and (4-4-1), RMSE = 3.80 MJ/(m 2 ·d) architectures, respectively. Furthermore, the results were obtained with the simulation by using ANN 4, ANN 5, and ANN 6 models ( Figure 3). It can be seen that best results of the prediction done obtained with ANN 4 model were those for the (2-2-1) architecture, RMSE = 3.75 MJ/(m 2 ·d), whilst for ANN 5 and ANN 6 models were achieved with the (3-4-1), RMSE = 3.78 MJ/(m 2 ·d), and (4-4-1), RMSE = 3.80 MJ/(m 2 ·d) architectures, respectively. First of all, the research was focused in ANN models to identify the network architecture that simulates with more precision [H(t+1)], as a function of the neuron number (i.e., 1-10, 20, 30, 40, and 50) that forms the hidden layer. The statistics used for this validation is the RMSE of the simulated outlet based on the data measured for the same variable.
Obtained results for the ANN 1, ANN 2, and ANN 3 models are shown in Figure 2. The best results for the prediction done for the ANN 1 model were achieved with the (1-1-1) architecture, with RMSE = 4.26 MJ/(m 2 ·d), whilst for ANN 2 model the best results were achieved with the (2-7-1) architecture, with RMSE = 4.12 MJ/(m 2 ·d). Finally, the best behavior for ANN 3 model was obtained with the (3-20-1) architecture, with a RMSE = 3.96 MJ/(m 2 ·d). Furthermore, the results were obtained with the simulation by using ANN 4, ANN 5, and ANN 6 models ( Figure 3). It can be seen that best results of the prediction done obtained with ANN 4 model were those for the (2-2-1) architecture, RMSE = 3.75 MJ/(m 2 ·d), whilst for ANN 5 and ANN 6 models were achieved with the (3-4-1), RMSE = 3.78 MJ/(m 2 ·d), and (4-4-1), RMSE = 3.80 MJ/(m 2 ·d) architectures, respectively.   Figure 5 and Table 1 show the results of the simulation and the behavior of the studied ANN models, considering the neuron architecture of the hidden layer with which the best prediction results were obtained, for each of the days of the evaluated year.     Figure 5 and Table 1 show the results of the simulation and the behavior of the studied ANN models, considering the neuron architecture of the hidden layer with which the best prediction results were obtained, for each of the days of the evaluated year.   Figure 5 and Table 1 show the results of the simulation and the behavior of the studied ANN models, considering the neuron architecture of the hidden layer with which the best prediction results were obtained, for each of the days of the evaluated year.

Results of the Classic Models
The behavior of the prediction models based on ANNs was compared with the behavior of simulation models that use classic simulation techniques of global solar irradiation. The classical simulation models that were analyzed are: CENSOLAR typical year, weighted moving mean, linear regression, Fourier analysis, and Markov analysis. In every case, the base of the analysis was the same, corresponding to the horizontal daily global solar irradiation of the day after during 2011 and its comparison with the same variable measurements done by the agrometeorological station located in Mansilla Mayor (León, Spain), which belongs to SIAR. Figure 6 shows the results for the predictions for each day of each year. Table 2 summarizes the adjustment analysis between simulated and measured values. In the following sections, the results obtained with each of the simulation classical models are analyzed.

Results of the Classic Models
The behavior of the prediction models based on ANNs was compared with the behavior of simulation models that use classic simulation techniques of global solar irradiation. The classical simulation models that were analyzed are: CENSOLAR typical year, weighted moving mean, linear regression, Fourier analysis, and Markov analysis. In every case, the base of the analysis was the same, corresponding to the horizontal daily global solar irradiation of the day after during 2011 and its comparison with the same variable measurements done by the agrometeorological station located in Mansilla Mayor (León, Spain), which belongs to SIAR. Figure 6 shows the results for the predictions for each day of each year. Table 2 summarizes the adjustment analysis between simulated and measured values. In the following sections, the results obtained with each of the simulation classical models are analyzed.

CENSOLAR Typical Year
Values for the horizontal daily global solar irradiation of a typical year in the province of León (Spain), where the agrometeorological station at Mansilla Mayor is located, are include in the CENSOLAR tables [43], which are shown in Figure 6. Values simulated with this model show a RMSE of 5.1829 MJ/(m 2 ·d) in relation to the measured values. Figure 7 shows the partial correlation coefficients that were the result of using the solar radiation data studied over seven years, with delays between 1 and 20 days. It can be observed that the dependency of the daily global solar irradiance [H(t+1)  Subsequently, the WMM model with the partial autocorrelation coefficients that corresponded to the period between 2 and 20 days delay was used for the [H(t+1)] prediction during 2011, obtaining the simulation errors shown in Figure 8. It can be seen that the best simulation result was achieved for the 11 day delay [H(t−10), H(t)], with a prediction error of 3.9810 MJ/(m 2 ·d). Thus, the model of the weighted mobile mean model was selected with 11 days delay, which is shown in Equation (2): 0.874H + 0.262H + 0.177H + 0.084H + 0.115H + 0,112H + 0.088H + 0.058H + 0.057H + 0.101H + 0.030H Subsequently, the WMM model with the partial autocorrelation coefficients that corresponded to the period between 2 and 20 days delay was used for the [H(t+1)] prediction during 2011, obtaining the simulation errors shown in Figure 8. It can be seen that the best simulation result was achieved for the 11 day delay [H(t−10), H(t)], with a prediction error of 3.9810 MJ/(m 2 ·d). Thus, the model of the weighted mobile mean model was selected with 11 days delay, which is shown in Equation (2):

Linear Regression
The model obtained with a lineal regression with one day of delay is shown in Figure 9 and Equation (3), with an adjustment error of 4.32 MJ/(m 2 ·d). Using this model for the horizontal daily global solar irradiation simulation in 2011, a prediction error RMSE of 4.2434 MJ/(m 2 ·d) was obtained ( Table 2).

Fourier Analysis
The application of the Fourier analysis model with the solar radiation measured data over seven years (2004-2010) has allowed us to get typical annual functions, shown in Table 3, that consider the

Linear Regression
The model obtained with a lineal regression with one day of delay is shown in Figure 9

Linear Regression
The model obtained with a lineal regression with one day of delay is shown in Figure 9 and Equation (3), with an adjustment error of 4.32 MJ/(m 2 ·d). Using this model for the horizontal daily global solar irradiation simulation in 2011, a prediction error RMSE of 4.2434 MJ/(m 2 ·d) was obtained ( Table 2).

Fourier Analysis
The application of the Fourier analysis model with the solar radiation measured data over seven years (2004-2010) has allowed us to get typical annual functions, shown in Table 3, that consider the first to the eighth harmonics; the produced error for each function decreases very slightly with the

Fourier Analysis
The application of the Fourier analysis model with the solar radiation measured data over seven years (2004)(2005)(2006)(2007)(2008)(2009)(2010) has allowed us to get typical annual functions, shown in Table 3, that consider the first to the eighth harmonics; the produced error for each function decreases very slightly with the number of harmonics. Figure 10 shows the achieved adjustment considering the first harmonic, with RMSE error of 4.417 MJ/(m 2 ·d). Prediction of the horizontal daily global solar irradiation the day after [H(t+1)], with date from 2011, gives us a prediction error RMSE which varies from 4.2747 MJ/(m 2 ·d) with the first harmonic only, to 4.2557 MJ/(m 2 ·d) with the first eight harmonics (Table 2).

Markov Analysis
Markov analysis carried out with the procedure described in Section 2.3.5 for the prediction of horizontal daily global solar irradiation the day after in 2011, results in a RMSE prediction error of 4.3653 MJ/(m 2 ·d) ( Table 2).

Discussion
The prediction precision of the horizontal daily global solar irradiation using the tested ANN models, can be rated as high. Prediction errors varied in the year that the validation was done, from 3.7703 MJ/(m 2 ·d) for the model with the best adjustment achieved (i.e., ANN 7), to 4.2609 MJ/(m 2 ·d) for the model with the worst behavior (i.e., ANN 1). These errors are in the range of 11.4% to 12.9% of the maximum value of the incident solar radiation measured.
On the other hand, the adjustment of the prediction using the tested models improved when the number of delay days increased; these numbers were considered in the horizontal daily global solar irradiation during the process of modelling the day after [H(t+1)]. Indeed, when ANN models were involved, the adjustment error obtained with the ANN 3 model, which added two delay days was, 2.7% and 5.9%, respectively, smaller than the errors obtained with models that only considered a day of delay (i.e., ANN 2) or the radiation of the current day (i.e., ANN 1). The same behavior is confirmed with classic models, specifically with the WMM model, in which the smallest prediction error is obtained considering 11 days of delay, and it is 6.9% smaller when two delay days are considered.
However, the performance of ANN 4, ANN 5, and ANN 6 models, which included the dependency of the solar radiation with the time of the year through the variable day of the year [J(t)], did not vary significantly when a higher number of delay days was considered. The prediction error of ANN 5 and ANN 6 models that considered one and two delay days, respectively, were higher than the error from model ANN 4, which only considered the solar radiation measured on the current day, although the difference was small (i.e., prediction error of 1.1% in both cases). Furthermore, the incorporation of the variable [J(t)] meant that the adjustment achieved with these models was better than the one obtained with ANN 1, ANN 2, and ANN 3 models (difference between 5.7% to 12.1%, compared with ANN 4 model). This also showed that it was better than the behavior of the classic methods tested. Therefore, the use of the variable day of the year as a predictor significantly improved the performance of ANN model simulation, because it included the time and season of the year when the prediction was done and, with that, the variation of the incident solar radiation throughout the year, because the extra-terrestrial solar irradiation at the limit of the atmosphere varied annually in a senoidal way. This great influence of the time of year in the incident solar radiation value over the Earth's surface, when thinking about the simulation, was also mentioned by different authors. Citakoglu [4] worked at a monthly temporary scale for the average solar irradiation estimation in Turkey, finding that the most significant explanatory variable was the month, and thus, the ANN model that included this variable behaved better than the multiple linear regression model, the adaptive network-fuzzy inference system model, and the tested empirical equations (Abdalla,

Discussion
The prediction precision of the horizontal daily global solar irradiation using the tested ANN models, can be rated as high. Prediction errors varied in the year that the validation was done, from 3.7703 MJ/(m 2 ·d) for the model with the best adjustment achieved (i.e., ANN 7), to 4.2609 MJ/(m 2 ·d) for the model with the worst behavior (i.e., ANN 1). These errors are in the range of 11.4% to 12.9% of the maximum value of the incident solar radiation measured.
On the other hand, the adjustment of the prediction using the tested models improved when the number of delay days increased; these numbers were considered in the horizontal daily global solar irradiation during the process of modelling the day after [H(t+1)]. Indeed, when ANN models were involved, the adjustment error obtained with the ANN 3 model, which added two delay days was, 2.7% and 5.9%, respectively, smaller than the errors obtained with models that only considered a day of delay (i.e., ANN 2) or the radiation of the current day (i.e., ANN 1). The same behavior is confirmed with classic models, specifically with the WMM model, in which the smallest prediction error is obtained considering 11 days of delay, and it is 6.9% smaller when two delay days are considered.
However, the performance of ANN 4, ANN 5, and ANN 6 models, which included the dependency of the solar radiation with the time of the year through the variable day of the year [J(t)], did not vary significantly when a higher number of delay days was considered. The prediction error of ANN 5 and ANN 6 models that considered one and two delay days, respectively, were higher than the error from model ANN 4, which only considered the solar radiation measured on the current day, although the difference was small (i.e., prediction error of 1.1% in both cases). Furthermore, the incorporation of the variable [J(t)] meant that the adjustment achieved with these models was better than the one obtained with ANN 1, ANN 2, and ANN 3 models (difference between 5.7% to 12.1%, compared with ANN 4 model). This also showed that it was better than the behavior of the classic methods tested. Therefore, the use of the variable day of the year as a predictor significantly improved the performance of ANN model simulation, because it included the time and season of the year when the prediction was done and, with that, the variation of the incident solar radiation throughout the year, because the extra-terrestrial solar irradiation at the limit of the atmosphere varied annually in a senoidal way. This great influence of the time of year in the incident solar radiation value over the Earth's surface, when thinking about the simulation, was also mentioned by different authors. Citakoglu [4] worked at a monthly temporary scale for the average solar irradiation estimation in Turkey, finding that the most significant explanatory variable was the month, and thus, the ANN model that included this variable behaved better than the multiple linear regression model, the adaptive network-fuzzy inference system model, and the tested empirical equations (Abdalla, Ångström, Bahel, and Hargreaves-Samani).
Modelling performance with ANNs improved when the daily clearness index (ANN 7) was considered, but it did not improve as much when the day of the year and the clearness index were used together as predictors (ANN 8), with which the results obtained were similar. As the use of the variable day of the year was easier than the clearness index and the predictive performance was similar, it is advised to use only the day of the year variable. Table 1 shows that ANN 3, ANN 6, and ANN 7 models have the lowest autocorrelation degree, which have been evaluated with the DW statistical method, showing that the autocorrelation degree between the past measured values and the future simulated values for all ANN models is close to value 2 (i.e., autocorrelation nil).
The effectiveness of the simulation was higher for the ANN 7, ANN 6, and ANN 4 models (Table 1) because in them a better result of the statistical MPE was achieved (i.e., values −0.2022, −0.2032, and −0.2154, respectively) and the bias prediction error was valued through the FA statistical method (0.6324, 0.6277, and 0.6196, respectively), which values the prediction in the short term.
Similar conclusions can be made with the RMSE statistic. Indeed, the best efficiency in the simulation was achieved with the ANN 7 and ANN 4 models, obtaining RMSE values of 3.7703 and 3.8012 MJ/(m 2 d), respectively, and AIC values of 3.8118 and 3.8427, respectively. The AIC statistical method was an indicator of the complexity of the prediction because the low values obtained with these two models meant that a smaller number of predictive variables (two inputs for both models) achieved better results for the simulation, which is backed up by the Law of Parsimony or Occam's razor, or by which models were used to obtain better results in the predictions.
The ANN 7 and ANN 4 models did not use the information available in the delay days but only the corresponding information for the clearness index and the day of the year. Thus, it was confirmed that the ANN models offered the best conditions with the least number of necessary inputs, for its use in the horizontal daily global solar irradiation in Castile and León region, Spain. This conclusion agrees with the work of Rao et al. [13] for the election of a smaller number of ANNs input variables and also with the work of Yadav and Chandel, [10] for the appropriate selection of these variables with the aim of predicting solar irradiation more accurately. Furthermore, similar results were obtained by Paoli et al. [27], in which conditions in Corsica (France), using an ANN with a few days of delay with data from 1988-1989, significantly improved the model by using the clearness index, (RMSE = 3.59 MJ/m 2 and R 2 = 0.80). The obtained results presented in this work are similar (worse RMSE but better R 2 ) to those obtained by these authors, but with the advantage that a smaller number of inputs help practical application. In Saudi Arabia conditions, Almaraashi [44] showed the importance of selecting appropriately the most important explicit variables for the predictive modelling of solar radiation, using for that purpose four different algorithms.
The best predictive behavior found that the best classic models of the solar radiation simulation was the WMM model, which achieved an 11 day delay. The worst response was achieved with the CENSOLAR typical year. The strongest partial autocorrelation was produced with the models with a delay day (WMM, linear regression, and Markov analysis), which confirms the solar radiative persistence from one day to the following, where the autocorrelation decreases dramatically for the following days. Table 2 shows that the linear regression models and WMM have two days delay and an average adjustment in the prediction, but a high potential for practical application due to its simplicity and daily solar radiative persistence. However, these models obtained worse results for the other indicators (DW, MPE, and FA). In China, Sun et al. [45] showed strong heteroscedastic persistence for the daily global solar irradiation and its influence in the solar radiation prediction.
The achieved adjustment with the Fourier analysis model was similar to the one obtained with the WMM models and with the two days delay and linear regression. The use of more than one harmonic does not provide a significant improvement in the prediction, making it more complicated for its use and making the results worse (i.e., DW, MPE, and FA).
The obtained adjustment with the Markov analysis model was similar to the one achieved with the Fourier analysis model, except it had better values with the FA indicator (0.6497), which analyzes short term prediction. This obtained the best result of all the classic models after the WMM model with 2 days delay (0.6589), caused by the solar radiative persistence effect from one day to the other (0.8747 as the maximum partial autocorrelation coefficient in the data series of horizontal global solar irradiation for the first delay day and 0.2622 for the second delay day; Figure 7).
As mentioned previously, the model with the best predictive adjustment was the WMM with 11 days delay, because it used partial autocorrelation coefficients that corresponded to the 11 days previous to the prediction, with the smallest RMSE (3.9810 MJ/(m 2 ·d)) error, the highest R 2 (0.8134) coefficient, and the smallest statistical value AIC (4.2283). These results indicate that, although this model used the highest number of predictive variables, they are used effectively, and it is the best classic model.
In summary, comparing prediction results for the horizontal daily global solar irradiance, performed with ANN models ( Figure 5 and Table 1) and classic models ( Figure 6 and Table 2) that considered a seven years learning phase (i.e., 2004-2010) and a one year series for validation (i.e., 2011), and for the meteorological conditions of Castile and León (Spain), it was concluded that ANN models have better simulation adjustments than classical models. Moreover, using a smaller number of predictive variables obtained similar results to the best, but more complex, models available in the literature.

Conclusions
One of the most interesting characteristics of ANNs is the ability to model a process and learn as the process transference function varies with time. In order to benefit from them while predicting incident solar radiation, a number of fixed variables, of which the previous values of the simulated variable are found, are established as ANN inputs, which is the output of the simulated future value of the objective variable.
Predictions regarding global solar irradiation in the terrestrial horizontal surface the day after ([H(t+1)]) is relevant for all sorts of farming applications, particularly for estimating crop evapotranspiration for irrigation and monitoring plant growth and disease control. In this article, a prediction of [H(t+1)] was done using ANNs. We tried to design networks with the simplest architecture and the smallest possible number of inputs to facilitate practical technological application. The only requirement was to have a consistent series of horizontal global solar irradiation data, which could be measured in place or estimated after satellite measurement, or even through other environmental variables (ambient temperature and isolation hours, mainly).
In our predictions, data for the global solar irradiation of the current day and two days delay was used. We obtained models with which the predictive performance improved when compared to the classic models (CENSOLAR typical year, weighted moving mean with two days delay, linear regression, and Fourier and Markov analysis). However, the weighted moving mean model with an 11 day delay obtained a better adjustment. Still, the predictive improvement of ANN models is achieved when the input variable is the day of the year, because the simulation references the time of year when it occurs. In this situation, it can be seen that using solar radiation data with more delay days does not allow for the improvement of predictive behavior models. Furthermore, another way to improve ANN models is to use the daily clearness index, which is an indicator of the daily solar radiation proportion that is absorbed and dispersed to the atmosphere, with which the best simulation adjustments are obtained as compared with the classical models.
In order to continue working in the predictive capacity and the applicability of the daily global solar irradiation over the horizontal surface, the following future work is proposed: (1) Use of other explicative variables, such as humidity, temperature, atmospheric pressure, or cloudiness, which contribute to changes in the evolution of solar radiation, mainly the days in which sudden changes in the weather occurs, and moments when ANN models present their worst results. (2) Division of input data for the different times/seasons of the year that have similar characteristics and the generation of models for each of them. (3) Use of predictions from the national meteorological services as input data for the ANN models, instead of the historic data registered in the area.