Performance Analysis of Daily Global Solar Radiation Models in Peru by Regression Analysis

Abstract: Solar radiation (Rs) is one of the main parameters controlling the energy balance at the Earth’s surface and plays a major role in evapotranspiration, plant growth, snow melting, and environmental studies. This work evaluated the performance of seven empirical models in estimating daily solar radiation over 1990–2004 (calibration) and 2005–2010 (validation) at 13 Peruvian meteorological stations. Using the same variables as the empirical models (temperature) together with two other parameters, namely precipitation and relative humidity, new models were developed by multiple linear regression analysis (proposed models). In calibration of empirical models with the same variables, the lowest estimation errors were 227.1 and 236.3 J•cm⁻²•day⁻¹ at Tacna and Puno stations, and the highest errors were 3958.4 and 3005.7 J•cm⁻²•day⁻¹ at San Ramon and Junin stations, respectively. The poorest-performing empirical models greatly overestimated Rs at most stations. The best performance of a proposed model (in terms of percentage of error reduction) was 73% compared to the average of all empirical models and 93% relative to the poorest result of empirical models, both at San Ramon station. According to root mean square errors (RMSEs) of proposed models, the worst and the best results are achieved at San Martin station (RMSE = 508.8 J•cm⁻²•day⁻¹) and Tacna station (RMSE = 223.2 J•cm⁻²•day⁻¹), respectively.


Introduction
Solar radiation reaching Earth's surface is one of the main sources of clean and renewable energy, optimal use of which can reduce human dependence on fossil fuels that contribute substantially to global warming [1,2]. The energy from Rs is the source of many processes on our planet, to the point that human life depends on it. Rs is an important topic in various areas of study such as hydrology [3][4][5], environmental science [6][7][8][9], water resources management [10,11], water balance modeling [12,13], and plant growth modeling [14][15][16][17]. This parameter is widely used in meteorological forecast models, climate change models, and ecosystem models [18][19][20][21][22]. It also plays a prominent role in such important processes as evaporation, evapotranspiration, and snowmelt.
Knowing daily values of Rs and how they change in the long term can help researchers (in developing theoretical studies) and technologists (in assessing equipment associated with solar energy, for example, in designing photovoltaic systems, solar panels, or thermal systems).
Rs is directly measured in meteorological stations using pyranometers. In contrast to other major meteorological parameters which are readily available (temperature, humidity, and precipitation, for instance), the high cost and difficulty in maintenance and calibration of pyranometers have led to a lack of reliable data for this parameter in many [...]
Khatib [55] integrated the Random Forest (RF) model with the firefly optimization algorithm (FFA) for solar radiation estimation in Malaysia. The proposed new model (RF-FFA) was compared to an ordinary artificial neural network (ANN), a hybrid ANN-FFA, and a standalone RF model; satisfactory results were reported for all machine learning models used. The capability of ANN, genetic programming (GP), and support vector machine coupled with firefly algorithm (SVM-FFA) for predicting Rs was examined in three locations in Nigeria, with the best results (RMSE = 1.866; R² = 0.73) for SVM-FFA [56]. The accuracy of empirical, ANN, and SVM models in estimating Rs using the most influential meteorological parameters in India was assessed by Meenal and Selvakumar [47]. The lowest and highest RMSEs were 0.6387 (sunshine-based models) and 2.4328 (temperature-based models) for empirical models, 0.5814 (hybrid models) and 2.2197 (temperature-based models) for ANN, and 0.4205 (hybrid models) and 1.1434 (temperature-based models) for SVM. Zang et al. [57] assessed the performance of 14 day-of-the-year-based models in 35 Chinese meteorological stations. These included seven empirical models (six from the literature and one proposed model) and seven machine learning models (support vector regression (SVR), Gaussian process regression (GPR), three ANFIS models, and two ANFIS models coupled with chaotic firefly algorithm (CFA) and whale optimization algorithm with simulated annealing and roulette wheel selection (WOASAR)). ANFIS-CFA and ANFIS-WOASAR showed the highest accuracy in 19 and 15 stations, with RMSEs and MAPEs in the ranges 1.203–2.491 MJ•m⁻² and 4.516–18.976%, respectively. A relatively comprehensive review of the application of machine learning methods is provided by Voyant et al. [58].
The particular geographic location of Peru (in the Southern Hemisphere), the reliance of its economy on agriculture, and the absence of a comprehensive study on Rs estimation in this country highlight the importance of accurate estimation of this parameter in Peru. Because recorded sunshine hour data are lacking, the main objective of the present study was to evaluate the performance of seven empirical models (six temperature-based models and one temperature-precipitation based model) in estimating daily Rs values over 1990–2010 at 13 Peruvian meteorological stations. The authors also intended to develop and validate new empirical models based on recorded meteorological data at each station (proposed models) to improve Rs estimation if none of the available models proved suitable.

Study Area
Peru has a total area of 1,280,000 square kilometers, spanning from 0° to 18° S and from 69° to 82° W. Its average altitude is 2650 m above sea level, and it is bounded by the South Pacific Ocean to the west (Figure 1). The Andes mountain range extends from north to south and divides the country into three parts: a mountainous region with sunlit valleys and 6000 m peaks, a narrow desert and lowland zone between the mountains and the Pacific Ocean, and a lowland, wet, and very warm region on the eastern side of the mountainous area. Eastern Peru is covered by tropical rainforests (Amazonia) with very high precipitation. With Lake Titicaca (the highest navigable lake in the world) in the southeastern part along the border with Bolivia, the Atacama Desert (the driest place on Earth) along the border with Chile, and the Sechura Desert in the northwest along the Pacific Ocean coast, the country has a unique weather profile.
The Peruvian economy relies on agriculture. Knowing the quantity, distribution, and dynamics of Rs across the country will therefore contribute significantly to irrigation scheduling and water resources management. Accurate estimation of this parameter will also be useful in sustainable solar energy generation by helping in the design of solar panels, solar thermal systems, and photovoltaic systems.

Empirical Models
Using daily recorded meteorological parameters at 13 Peruvian stations (Table 1), the performance of seven empirical models (Table 2) in estimating Rs from 1990 to 2010 was assessed. For this purpose, measured data from 1 January 1990 to 31 December 2004 (5479 data points for each parameter) and from 1 January 2005 to 31 December 2010 (2191 data points per parameter) were used for calibration and validation of empirical models, respectively. Due to the unavailability of measured sunshine hour data, the empirical models used were selected from among temperature-based models for which acceptable results have been reported at various locations. Regarding data quality assurance, the neighboring stations method was used for quality control of the data. In addition, missing data (gaps in the data set) were filled using the neighboring stations approach; missing data amounted to less than 5% of the whole data set. In these models, ΔT is the difference between the highest and lowest daily temperatures, Tmean is the average daily temperature, Z is the altitude of the weather station, Pt is transformed precipitation, and Ra is extraterrestrial radiation.
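To illustrate the kind of temperature-based model evaluated here, the following minimal sketch computes extraterrestrial radiation Ra with the daily FAO-56 formulation and applies the commonly cited Hargreaves-Samani form Rs = kr·√ΔT·Ra. The function names, the default kr = 0.16, and the choice of the FAO-56 formulation for Ra are assumptions made for illustration; the models actually used in this study are those of Table 2 with the calibrated coefficients of Table 3.

```python
import math

GSC = 0.0820  # solar constant in MJ m^-2 min^-1 (FAO-56 value)


def extraterrestrial_radiation(lat_deg, day_of_year):
    """Daily Ra in MJ m^-2 day^-1; latitude in degrees (south is negative)."""
    phi = math.radians(lat_deg)
    angle = 2.0 * math.pi * day_of_year / 365.0
    dr = 1.0 + 0.033 * math.cos(angle)          # inverse relative Earth-Sun distance
    delta = 0.409 * math.sin(angle - 1.39)      # solar declination (rad)
    ws = math.acos(-math.tan(phi) * math.tan(delta))  # sunset hour angle (rad)
    return (24.0 * 60.0 / math.pi) * GSC * dr * (
        ws * math.sin(phi) * math.sin(delta)
        + math.cos(phi) * math.cos(delta) * math.sin(ws)
    )


def hargreaves_samani(t_max, t_min, ra, kr=0.16):
    """Rs estimate in the units of ra; kr is an illustrative default, not a calibrated value."""
    return kr * math.sqrt(t_max - t_min) * ra
```

In a calibration setting, kr would be fitted per station against measured Rs rather than fixed in advance.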

Evaluating the Performance of Empirical Models
Calibrated coefficients for the best empirical models are given separately for each station in Table 3. According to the results of Tables 4 and 5 [...] [33] model has shown the best performance, with RMSE decreased by 68, 66, and 69% at calibration phase and by 68, 59, and 65% at validation phase, compared to the poorest models, at the above stations, respectively. These findings indicate that there is a noticeable difference in performance between the best and the poorest empirical models at these stations.
Table 3. Calibrated coefficients for the best empirical models for each station.
Table 4. Root mean square errors (RMSEs) (J•cm⁻²•day⁻¹) of solar radiation estimated by empirical models in selected meteorological stations (calibration set). [...] [32]), Chen (Chen et al. [33]), Wu (Wu et al. [36]), Ja1 (Jahani et al. [2], first model), and Ja2 (Jahani et al. [2], second model).
Table 5. RMSEs (J•cm⁻²•day⁻¹) of solar radiation estimated by empirical models in selected meteorological stations (validation set).
[...] [2], and Samani [31] models in most stations (Tables 4 and 5), Jahani et al. 1 [2] proved to be the best performing model in Huanuco station at both calibration and validation phases (with RMSEs of 371.8 and 367.1 J•cm⁻²•day⁻¹, respectively) and Jahani et al. 2 [2] was the best model in Tumbes station at calibration phase (RMSE = 293 J•cm⁻²•day⁻¹). Overall, at validation phase, Wu et al. [36] model (in Tacna station) and Chen et al. [33] model (in Puno station) were again the best, with RMSEs of 227.1 and 236.3 J•cm⁻²•day⁻¹, respectively, and Samani [31] model (in San Ramon station) and Jahani et al. 2 [2] model (in Junin station) had the poorest performance in the study area, with RMSEs of 3958.4 and 3005.7 J•cm⁻²•day⁻¹, respectively.
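The RMSE values and percentage reductions quoted above follow from two simple formulas, sketched below for clarity. The function names are illustrative and not taken from the study; the percentage is expressed relative to the reference (for example, poorest-performing) model.

```python
import math


def rmse(measured, estimated):
    """Root mean square error between paired measured and estimated values."""
    if len(measured) != len(estimated) or len(measured) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(e - m) ** 2 for m, e in zip(measured, estimated)]
    return math.sqrt(sum(squared) / len(squared))


def percent_reduction(rmse_reference, rmse_candidate):
    """Percentage by which rmse_candidate is lower than rmse_reference."""
    return 100.0 * (rmse_reference - rmse_candidate) / rmse_reference
```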

Table header: Models; Stations: Tumbes, Cusco, Arequipa, Lima, Loreto, San Ramon, Puno, Tacna, San Martin, Lambayeque, Junin, Cajamarca, Huanuco.
As can be seen in Figure 2, with the exception of Jahani et al. 1 [2] model in Lambayeque station and Samani [31] model in Tumbes station, the poorest performance of empirical models in the other 11 stations has been associated with severe overestimation of Rs. Another finding of this study was the similarity in performance, in terms of estimation error, between Wu et al. [36] and Chen et al. [33] models.
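Systematic over- or underestimation of the kind described above can be quantified with the mean bias error (MBE), which RMSE alone does not reveal. The sketch below is an illustrative addition; MBE is not reported in the tables of this study, and the function name is hypothetical.

```python
def mean_bias_error(measured, estimated):
    """Mean of (estimated - measured); positive values indicate overestimation."""
    if len(measured) != len(estimated) or len(measured) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(e - m for m, e in zip(measured, estimated)) / len(measured)
```

A large positive MBE together with a large RMSE would correspond to the severe overestimation pattern noted for the poorest-performing models.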

Development of New Models
Given the unsatisfactory results of most existing empirical models, new models were proposed for estimating Rs at each of the 13 stations (Table 6). In addition to the input variables used for the seven empirical models, precipitation and relative humidity, which were measured at all stations, were also employed for development of the proposed solar radiation estimator models. In the proposed models, RH is relative humidity and Pre refers to the amount of precipitation; t*1 and t*2 are the ratios of ΔT to Tmax and Tmin, respectively. Because t*1 and t*2 are dimensionless forms of temperature, they can help generalize these models to different locations. The proposed models were developed on the basis of multiple linear regression analysis using the SPSS software package, with the aim of minimizing the error between measured and estimated radiation values. To develop the models, the ability of the existing models was first analyzed by testing them in different regions of Peru. New regression models with greater accuracy than the existing models were then derived. The proposed models were calibrated for each region separately in the calibration phase, and suitable coefficients were determined for each region. In the next phase, the ability of each new model was analyzed on its validation (test) set. Application of the new structure, changing the form of parameters, and the use of precipitation and relative humidity as well as the dimensionless parameters t*1 and t*2 have all been effective in improving the performance of the proposed models. Radiation values estimated by the proposed models at each station are analyzed in the following sections.
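The regression step can be sketched as ordinary least squares on a feature vector built from ΔT, the dimensionless ratios ΔT/Tmax and ΔT/Tmin, RH, and Pre. This is a minimal pure-Python sketch under assumptions: the study used SPSS, the actual model structures are those of Table 6, and the linear feature set, function names, and solver shown here are hypothetical.

```python
def build_features(t_max, t_min, rh, pre):
    """One feature row: intercept, dT, dT/Tmax, dT/Tmin, RH, Pre (assumed form)."""
    dt = t_max - t_min
    # Note: the ratios are undefined when t_max or t_min equals zero.
    return [1.0, dt, dt / t_max, dt / t_min, rh, pre]


def solve_linear_system(a, b):
    """Gaussian elimination with partial pivoting for a square system a x = b."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; features may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_least_squares(rows, targets):
    """Coefficients minimizing squared error, via the normal equations."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(row, coeffs):
    """Estimated Rs for one feature row."""
    return sum(c * v for c, v in zip(coeffs, row))
```

Coefficients would be fitted on the calibration period for each station and then applied unchanged to the validation period.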
[...] (Table 7), the proposed model has performed better when estimating radiation values above the threshold in both under- and overestimation sets, with the maximum estimation error occurring in the overestimated radiation values below 20 MJ•m⁻²•day⁻¹ (RMSE = 559.6 J•cm⁻²•day⁻¹).

Cajamarca Station
If the averages of measured Rs values at calibration (Mean (Rs)mea = 17.241 MJ•m⁻²•day⁻¹) and validation (Mean (Rs)mea = 16.761 MJ•m⁻²•day⁻¹) phases are taken as thresholds for analyzing the results of the proposed model, there is a direct relation between the number of data points and the magnitude of error in under- and overestimation sets for radiation values lower than the threshold and an inverse relationship for radiation values higher than the threshold (Table 8). The model proposed for this station (Figure 4) has led to higher error rates in the overestimation set at both calibration and validation phases, although the RMSE difference between the two sets at validation phase (ΔRMSE = 164 J•cm⁻²•day⁻¹) is larger than that at calibration phase (ΔRMSE = 81 J•cm⁻²•day⁻¹). The ratio between error rates of over- and underestimation sets for radiation values lower and higher than the average of measured [...] 56, respectively. It can therefore be concluded that the lowest and highest differences in error rates of the proposed model between under- and overestimation sets have occurred at calibration phase (radiation values above average) and validation phase (radiation values below average), respectively, and this conclusion is confirmed by the results shown in Table 8.
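Because thresholds are quoted in MJ•m⁻²•day⁻¹ while errors are quoted in J•cm⁻²•day⁻¹, it helps to keep the conversion explicit: 1 MJ•m⁻² = 10⁶ J / 10⁴ cm² = 100 J•cm⁻². The helper names below are illustrative.

```python
J_CM2_PER_MJ_M2 = 100.0  # 1 MJ m^-2 equals 100 J cm^-2


def mj_m2_to_j_cm2(value):
    """Convert MJ m^-2 (per day) to J cm^-2 (per day)."""
    return value * J_CM2_PER_MJ_M2


def j_cm2_to_mj_m2(value):
    """Convert J cm^-2 (per day) to MJ m^-2 (per day)."""
    return value / J_CM2_PER_MJ_M2
```

For example, the validation ΔRMSE of 164 J•cm⁻²•day⁻¹ corresponds to 1.64 MJ•m⁻²•day⁻¹, which can be compared directly with the 16.761 MJ•m⁻²•day⁻¹ threshold.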

Cusco Station
Although radiation values estimated by the proposed model have an appropriate distribution around the 1:1 line (Figure 5), the results presented in Table 9 show that prediction error in the overestimation set is greater than that in the underestimation set, even though the former has fewer data points. The proposed model has also led to lower prediction errors when estimating radiation values higher than the average measured radiation at both calibration phase (Mean (Rs)mea = 22.53 MJ•m⁻²•day⁻¹) and validation phase (Mean (Rs)mea = 22.66 MJ•m⁻²•day⁻¹) (Table 10).
Table 9. RMSE values (J•cm⁻²•day⁻¹) and number of data points belonging to under- and overestimation sets in calibration and validation phases using proposed model.

Data Set          Calibration         Validation
Underestimated    242.5 (n = 3061)    244 (n = 1170)
Overestimated     324 (n = 2412)      339.5 (n = 1021)

Table 10. RMSE values (J•cm⁻²•day⁻¹) and number of data points belonging to each group separated by a threshold value (mean of measured solar radiation): lower and higher than the mentioned threshold in calibration and validation sets.
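The under- and overestimation sets used throughout the station analyses can be obtained by comparing each estimate with its measured value and computing RMSE within each set. The following is a minimal sketch; the function names are hypothetical, and excluding exact matches from both sets is an assumption, since the study does not state how they were handled.

```python
import math


def split_under_over(measured, estimated):
    """Return (under, over) lists of (measured, estimated) pairs."""
    under, over = [], []
    for m, e in zip(measured, estimated):
        if e < m:
            under.append((m, e))
        elif e > m:
            over.append((m, e))
        # Exact matches (e == m) fall in neither set (assumption).
    return under, over


def rmse_of_pairs(pairs):
    """RMSE over (measured, estimated) pairs; NaN for an empty set."""
    if not pairs:
        return float("nan")
    return math.sqrt(sum((e - m) ** 2 for m, e in pairs) / len(pairs))
```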

Huanuco Station
At this station, the overestimation set has more data points and higher error rates than the underestimation set. Average values of measured radiation at calibration phase (15.037 MJ•m⁻²•day⁻¹) and validation phase (16.247 MJ•m⁻²•day⁻¹) were used as threshold values for analyzing the results of the proposed model. At calibration phase, for measured radiation values lower and higher than average, higher error rates were observed in over- and underestimation sets, respectively, although the difference in error rates between the two sets for radiation values below average (ΔRMSE = 176.9 J•cm⁻²•day⁻¹) was much higher than that for values above average (ΔRMSE = 29.8 J•cm⁻²•day⁻¹). At validation phase, for all radiation values, error rates are higher in the overestimation set (Table 11). Another important result regarding the performance of the proposed model is the inaccurate estimation of some of the relatively high measured Rs values (see Figure 6, in which the mentioned values are enclosed by a blue line). At this station, maximum measured radiation values are 34 MJ•m⁻²•day⁻¹ at calibration phase and 33 MJ•m⁻²•day⁻¹ at validation phase. However, on 25 and 41 days, radiation values estimated by the proposed model are higher than the above maxima at calibration and validation phases, respectively, and error rates for those days are about 847 and 914 J•cm⁻²•day⁻¹, respectively, which are considerable.
Table 11. RMSE values (J•cm⁻²•day⁻¹) and number of data points belonging to under- and overestimation sets for measured solar radiation lower or higher than mean (Rs)mea in calibration and validation phases using proposed model.

Junin Station
Taking average values of measured radiation at calibration phase (25 MJ•m⁻²•day⁻¹) and validation phase (24.611 MJ•m⁻²•day⁻¹) as thresholds for analyzing the results of the proposed model, total error at calibration phase for radiation values below average is about 370.8 J•cm⁻²•day⁻¹, to which the overestimation set (RMSE = 444.2 J•cm⁻²•day⁻¹) has contributed much more than the underestimation set (RMSE = 239.6 J•cm⁻²•day⁻¹). For radiation values above average, however, the underestimation set (RMSE = 240.2 J•cm⁻²•day⁻¹) contributes more to the total error (RMSE = 227.5 J•cm⁻²•day⁻¹) than the overestimation set (Table 12). Validation phase results, for radiation values both above and below the measured average, indicate that a larger portion of the total error of the proposed model in either interval is caused by the overestimation set. Nevertheless, the overestimation set's contribution to the total error for radiation values below average is much greater than that for radiation values above average (Table 12 and Figure 7).
Table 12. RMSE values (J•cm⁻²•day⁻¹) and number of data points belonging to under- and overestimation sets for measured solar radiation lower or higher than mean (Rs)mea in calibration and validation phases using proposed model.
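The relation between a total RMSE and the RMSEs of its under- and overestimation subsets can be made explicit: squared errors pool with weights equal to the subset sizes, so the total always lies between the subset values. The sketch below assumes the subsets are disjoint and cover the interval; the function name is illustrative.

```python
import math


def pooled_rmse(subsets):
    """Combine (rmse, n) pairs of disjoint subsets into the RMSE of their union."""
    total_n = sum(n for _, n in subsets)
    if total_n == 0:
        raise ValueError("at least one subset must contain data points")
    return math.sqrt(sum(n * r ** 2 for r, n in subsets) / total_n)
```

Applied to subset RMSEs and counts such as those reported for Cusco in Table 9, `pooled_rmse([(242.5, 3061), (324.0, 2412)])` gives the RMSE of the combined calibration set.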

Lambayeque Station
As can be seen in Figure 8, the proposed model has larger errors in the overestimation set than in the underestimation set (especially at validation phase), with the error ratio being approximately 0.48 and 0.32 at calibration and validation phases, respectively. For an accurate analysis of results and based on the distribution of points in Figure 8, performance of the proposed model was evaluated in three intervals of measured radiation values (below 10, between 10 and 20, and above 20 MJ•m⁻²•day⁻¹), as shown in Table 13.

Loreto Station
The relative stability of performance between under- and overestimation sets, in terms of error rates and the number of data points, at both calibration (ΔRMSE = 31.7 J•cm⁻²•day⁻¹, Δn = 16) and validation (ΔRMSE = 60 J•cm⁻²•day⁻¹, Δn = 99) phases was an advantage of the proposed model. Across the three intervals of measured radiation (below 10, between 10 and 20, and above 20 MJ•m⁻²•day⁻¹), error rates of radiation estimation show an upward trend in the underestimation set and a downward trend in the overestimation set (Table 15).
According to Figure 10 and Table 15
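The interval-based analysis used for this and other stations amounts to binning days by measured radiation and computing RMSE within each bin. A minimal sketch follows; the function names are hypothetical, and treating each edge as the inclusive lower bound of the next interval is an assumption, since the study does not specify how boundary values were assigned.

```python
import math
from bisect import bisect_right


def interval_index(value, edges=(10.0, 20.0)):
    """0 for value < 10, 1 for 10 <= value < 20, 2 for value >= 20 (MJ m^-2 day^-1)."""
    return bisect_right(edges, value)


def rmse_by_interval(measured, estimated, edges=(10.0, 20.0)):
    """Map interval index to RMSE, binning by the measured value."""
    sums = {}
    for m, e in zip(measured, estimated):
        idx = interval_index(m, edges)
        total, count = sums.get(idx, (0.0, 0))
        sums[idx] = (total + (e - m) ** 2, count + 1)
    return {idx: math.sqrt(total / count) for idx, (total, count) in sums.items()}
```

Passing `edges=(15.0, 30.0)` would reproduce the interval definition used for Puno station.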

Puno Station
Satisfactory performance of the proposed model, especially for underestimated radiation values at both calibration and validation phases, is illustrated by the appropriate distribution of points relative to the 1:1 line in Figure 11. Estimation errors of the proposed model for radiation values below and above the average of measured values at calibration and validation phases (Mean (Rs)mea = 26.5 MJ•m⁻²•day⁻¹) are given in Table 16. As can be seen, the proposed model has led to lower errors for radiation values lower and higher than average in under- and overestimation sets, respectively. Table 17 illustrates the performance of the proposed model in three radiation intervals (below 15, between 15 and 30, and above 30 MJ•m⁻²•day⁻¹). Unsatisfactory performance for the first interval and better performance for the second and third intervals in under- and overestimation sets, respectively, are characteristics of the proposed model for this station.
Table 16. RMSE values (J•cm⁻²•day⁻¹) and number of data points belonging to under- and overestimation sets for measured solar radiation lower or higher than mean (Rs)mea in calibration and validation phases using proposed model.
Table 17. RMSE values (J•cm⁻²•day⁻¹) and number of data points belonging to under- and overestimation sets in three defined intervals of measured solar radiation (MJ•m⁻²•day⁻¹) for calibration and validation phases using proposed model.
Performance of this model was examined from two different perspectives: analyzing error rates of radiation estimation for two intervals, lower and higher than the average of measured radiation values at calibration (Mean (Rs)mea = 16.568 MJ•m⁻²•day⁻¹) and validation (Mean (Rs)mea = 16.586 MJ•m⁻²•day⁻¹) phases (Table 18), and the model's behavior within three radiation intervals (below 10, between 10 and 20, and above 20 MJ•m⁻²•day⁻¹) (Table 19).
The results from Table 18 indicate that for radiation values lower and higher than average, the proposed model has performed better in under- and overestimation sets, respectively, which is confirmed by the appropriate distribution of points in these two sets compared to other radiation values (Figure 12).

At both calibration and validation phases, increased error rates from under- to overestimation set in the below 10 and 10–20 MJ•m⁻²•day⁻¹ intervals, and reduced error rates from under- to overestimation set in the third interval (above 20 MJ•m⁻²•day⁻¹), indicate that the proposed model has performed better for underestimated, low (below 10 and to some extent between 10 and 20 MJ•m⁻²•day⁻¹) radiation values and overestimated, high (above 20 MJ•m⁻²•day⁻¹) radiation values. The maximum radiation values estimated by the proposed model at calibration and validation phases are 24.09 and 23.69 MJ•m⁻²•day⁻¹, respectively. However, there are 628 days at calibration phase (RMSE = 768 J•cm⁻²•day⁻¹) and 316 days at validation phase (RMSE = 735 J•cm⁻²•day⁻¹) on which measured radiation values are greater than the above maxima, which means that measured radiation values higher than those maxima are always underestimated by the proposed model (these days are marked by a blue ellipse in Figure 12).
Table 18. RMSE values (J•cm⁻²•day⁻¹) and number of data points belonging to under- and overestimation sets for measured solar radiation lower or higher than mean (Rs)mea in calibration and validation phases using proposed model.
Table 19. RMSE values (J•cm⁻²•day⁻¹) and number of data points belonging to under- and overestimation sets in three defined intervals of measured solar radiation (MJ•m⁻²•day⁻¹) for calibration and validation phases using proposed model.
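The check described above, counting the days on which measured radiation exceeds the largest value the model ever estimates, can be sketched as follows. The function name and return structure are illustrative assumptions.

```python
import math


def beyond_model_ceiling(measured, estimated):
    """Return (ceiling, n_days, rmse) for days where measured exceeds max(estimated)."""
    ceiling = max(estimated)
    pairs = [(m, e) for m, e in zip(measured, estimated) if m > ceiling]
    if not pairs:
        return ceiling, 0, float("nan")
    error = math.sqrt(sum((e - m) ** 2 for m, e in pairs) / len(pairs))
    return ceiling, len(pairs), error
```

Every day returned by this selection is necessarily underestimated, because no estimate can exceed the ceiling.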

San Ramon Station
It can be inferred from Figure 13 that at both calibration and validation phases, the proposed model has performed better in underestimation sets. At validation phase, error rates of radiation estimation are about 173.4 and 421.1 J•cm⁻²•day⁻¹ in under- and overestimation sets, respectively. In confirmation of the above findings, the results of Table 20 indicate that although overestimation sets have higher error rates in both the above and below 20 MJ•m⁻²•day⁻¹ intervals, the difference in error rates between under- and overestimation sets is more noticeable for radiation values below 20 MJ•m⁻²•day⁻¹ than for those above 20 MJ•m⁻²•day⁻¹ at both calibration (ΔRMSE = 474 J•cm⁻²•day⁻¹) and validation (ΔRMSE = 368 J•cm⁻²•day⁻¹) phases. The percentage of error increment from under- to overestimation sets in the below 20 MJ•m⁻²•day⁻¹ interval is about 180% and 78% at calibration and validation phases, respectively, and the corresponding values for the above 20 MJ•m⁻²•day⁻¹ interval are about 15% and 44%. Accordingly, the proposed model has shown the best performance when estimating radiation values higher than 20 MJ•m⁻²•day⁻¹, especially in underestimation sets.

Tumbes Station
According to the distribution of points in Figure 15, performance of the proposed model has been much better in the underestimation set. At calibration phase, the underestimation set (with 3480 data points and an RMSE of 201 J•cm⁻²•day⁻¹) showed a better performance than the overestimation set (1993 data points, RMSE = 395 J•cm⁻²•day⁻¹). At validation phase, corresponding values are n = 1075 and RMSE = 167 J•cm⁻²•day⁻¹ for the underestimation set and n = 1110 and RMSE = 430 J•cm⁻²•day⁻¹ for the overestimation set. Considering the distribution of points in Figure 15, performance of the proposed model was analyzed within two intervals of measured radiation (below 20 and above 20 MJ•m⁻²•day⁻¹; Table 22). At calibration phase and for radiation values below 20 MJ•m⁻²•day⁻¹, 608 data points from the overestimation set are estimated with a high error rate (RMSE = 649.2 J•cm⁻²•day⁻¹), whereas the underestimation set contains far fewer data points (n = 38, a 94% decrease compared with the overestimation set) and shows a 63% lower RMSE. Within the above 20 MJ•m⁻²•day⁻¹ interval, model performance has been almost the same in under- and overestimation sets, although the number of data points belonging to the underestimation set is about 2.5 times that of the overestimation set. At validation phase, the difference in error rates between under- and overestimation sets in both the below 20 MJ•m⁻²•day⁻¹ interval (ΔRMSE = 557 J•cm⁻²•day⁻¹) and the above 20 MJ•m⁻²•day⁻¹ interval (ΔRMSE = 55 J•cm⁻²•day⁻¹) has increased compared to corresponding values at calibration phase, although the magnitude of the increment in the former interval (147 J•cm⁻²•day⁻¹) is much greater than that in the latter interval (54 J•cm⁻²•day⁻¹).
Figure 16 illustrates the spatial distribution of error rates of Rs estimation using existing and proposed models throughout Peru. It can be concluded from Figure 16 that Hargreaves-Samani [30], Annandale et al. [32], Chen et al. [33], Wu et al. [36], and our proposed model have demonstrated the best results at southern stations (Arequipa, Puno, and Tacna), and estimation error has increased towards the northern parts of the study area, with the largest errors in Cajamarca, San Martin, Loreto, and to some extent in Huanuco station. The highest estimation errors of Samani [31] model occur in stations located in central Peru (San Ramon and Junin stations), although a certain degree of error is also observed at Cusco, Tacna, Cajamarca, San Martin, and Loreto stations. Unlike the five above-mentioned models which exhibit the best results in the southern part of the country, the best results of Samani [31] [...]
Analyzing the performance of the proposed model reveals a relatively consistent trend in the spatial distribution of estimation error throughout the study area: in the southern half of the country (south of 10° S), estimation error increases from the south towards the north, so that the southernmost stations (Tacna, Puno, and Arequipa) have the lowest errors, and the magnitude of error increases towards the north (Cusco, Lima, Junin, and San Ramon). In the northern half (north of 10° S), the highest error rates are observed at Cajamarca, San Martin, and Loreto stations. In other words, error rates follow a west-east increasing trend, probably due in part to the absence of any stations between longitudes 70° and 78° W, which has affected error interpolation.
Figure 16. Interpolated spatial distribution of error in solar radiation estimation at all stations using (a) Hargreaves-Samani [30] and (b) Samani [31] models, (c) Annandale et al. [32] and (d) Chen et al. [33] models, (e) Wu et al. [36] and (f) Jahani et al. 1 [2] models, and (g) Jahani et al. 2 [2] and (h) proposed models.
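The interpolation method behind Figure 16 is not stated in the text. As one common possibility, the sketch below uses inverse distance weighting (IDW) to estimate an error value at an arbitrary point from station values. The choice of IDW, the power parameter, the planar treatment of longitude and latitude, and the function name are all assumptions for illustration, not the authors' documented procedure.

```python
import math


def idw_interpolate(stations, x, y, power=2.0):
    """Inverse-distance-weighted value at (x, y) from (x_i, y_i, value_i) tuples."""
    numerator = 0.0
    denominator = 0.0
    for sx, sy, value in stations:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0.0:
            return value  # query point coincides with a station
        weight = 1.0 / distance ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator
```

With any such scheme, regions far from every station (such as the band between 70° and 78° W noted above) are filled purely from distant values, which is one way a sparse network can shape the apparent spatial trend.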

Conclusions
In some regions of Peru, either there is a lack of weather stations equipped with Rs measurement instruments or, in stations where the required equipment is available, this parameter has not been reliably recorded. Agriculture plays a crucial role in the economic development of Peru, and the importance of accurate estimation of Rs for irrigation scheduling, design and installation of solar panels and photovoltaic systems, and sustainable exploitation of renewable energy sources is self-evident. However, no comprehensive study has been conducted for estimating Rs in this country. Given the lack of sunshine hour data, the present study assessed the performance of seven empirical models available in the literature (six based on temperature and one based on temperature as well as transformed precipitation) in estimating daily Rs values at 13 weather stations in Peru, and a new model was also proposed for each station. Overall, the results showed that in most stations, Wu et al. [36] and Chen et al. [33] models exhibited the best performance, and Samani [31] and Jahani et al. 2 [2] models led to the poorest results. Analyzing the results of the poorest-performing empirical temperature-based models indicates that Rs is greatly overestimated at most stations. RMSEs of proposed models in the 13 stations are given in Table 23, according to which the worst and the best performances are achieved at San Martin station (RMSE = 508.8 J•cm⁻²•day⁻¹) and Tacna station (RMSE = 223.2 J•cm⁻²•day⁻¹), respectively.
Table 23. RMSEs (J•cm⁻²•day⁻¹) of the proposed models and percentage of RMSE reduction by proposed models compared to best, poorest, and average results of empirical models in each of the 13 selected stations at calibration (validation) set.

Table 23 columns: Percentage of RMSE Reduction Compared to the Poorest-Performing Empirical Model; Percentage of RMSE Reduction Compared to the Average of All Empirical Models.
[...] learning machine or their coupling with bioinspired optimization algorithms, for example, the firefly or krill herd algorithm, for improvement of radiation estimation. Solar radiation prediction is an essential task in atmospheric studies, hydrological forecasting, agricultural product management, and energy saving. A simple and accurate time series model for predicting solar radiation from available climate data is therefore necessary for each region. Measuring solar radiation is costly, and the lack of a high-quality solar radiation data set for the whole country is a limitation of this research. In addition, calibrating different models for different regions is another difficulty of this type of study.