1. Introduction
Solar radiation is the primary source of surface energy, driving the carbon and water exchanges between the atmosphere and terrestrial ecosystems [1]. Population growth, limited fossil fuels and environmental pollution have spurred the rapid development of renewable energy sources such as solar and wind power. However, many solar-energy applications require accurate information about the available solar resource [2]. Equipment for measuring solar radiation is much more expensive than that for other meteorological parameters such as temperature, relative humidity and wind speed. More than 2400 weather stations in China record meteorological data, while only about 5% of them observe global solar radiation (Rs). Therefore, models need to be developed to estimate solar radiation at stations with no radiation records [3]. Three main methods are used to calculate daily global solar radiation: satellite-derived, stochastic and meteorology-based approaches [4]. The satellite-derived method retrieves the reflectivity of the Earth's atmosphere, inverts the daily radiation value and can estimate solar radiation over large areas.
Nevertheless, the uncertainty of satellite-based solar-radiation remote sensing can be high in cloudy and polluted areas. Stochastic algorithms depend on history: a statistical summary of past radiation is used to infer the probability of future radiation, which requires high-quality historical radiation observations. Meteorology-based approaches aim to establish relationships between solar radiation and other, more readily available, meteorological elements; this method is by far the most widely used.
Recently, machine-learning models, owing to their strong nonlinear fitting ability, have been widely used to simulate natural phenomena and applied in agriculture, engineering and economics, including Rs prediction and forecasting. Rehman and Mohandes [5] used an artificial neural network (ANN) to estimate solar radiation in Abha, Saudi Arabia, and found that an ANN model with air temperature and relative humidity as inputs can capably estimate Rs. Quej et al. [6] assessed three approaches (SVM, ANN and ANFIS) for predicting daily Rs in Yucatán, México, and reported that SVM models performed well in warm sub-humid regions. Ghimire et al. [7] explored the feasibility of using numerical weather prediction to forecast Rs. Deo et al. [8] fed geo-temporal and satellite-image data into the ELM method to develop an Rs model in Australia; the ELM model outperformed the RF, M5T and MARS methods. Hassan et al. [9] evaluated the ability of four ML algorithms (MLP, ANFIS, SVM and RT) to model Rs. Based on these algorithms, sunshine-, temperature-, meteorological-parameter- and day-number-based models were examined in Egypt, and the MLP algorithm excelled in comparison with the other models. On the other hand, many studies also show that ML is not always better; for example, it can be less precise than the dependency model [10]. Mohammadi et al. [11] compared the performance of an SVM model and ANFIS in predicting Rs from temperature data only, using data from Iran, and found that the SVM model with an RBF kernel function had the highest accuracy. Feng et al. [12] used six machine-learning models to map daily global solar radiation and photovoltaic power in the Loess Plateau of China. In addition, the prediction of Rs by kernel-based machine-learning models has been widely reported in northwest China [12], humid regions of China [13], air-pollution regions of north China [14,15], Algeria [16], Spain [17] and other regions around the world [18], also including diffuse radiation [19]. Kernel-based models have also been used to map the solar photovoltaic potential of China [20,21].
Recently, deep-learning models have gradually been applied to the prediction of solar radiation, including LSTM algorithms, which are good at mining time-series information [22,23,24], and at processing spatial information [25]. In addition, ML models can also be used to identify the most significant input parameters, to better understand the relationship between common meteorological factors and Rs.
Voyant et al. [26] reviewed the different machine-learning technologies used for solar-radiation forecasting. They pointed out that methods such as ANN and SVM were primarily used in the early stage, while methods such as regression trees and boosting trees have been used more recently. Compared with ANN, SVM and ANFIS, the most significant advantage of tree-based methods is that they process larger data sets faster [9]. Sun et al. [27] applied an RF method to estimate Rs in an air-polluted environment. Ibrahim and Khatib [28] coupled an RF model with FFA to predict radiation on an hourly scale. Prasad et al. [29] designed a new approach, named the EEMD-AOC-RF method, for Rs forecasting: the time-lagged (t−1) data were first decomposed into signal and noise components by EEMD, and the data were then fed into an RF model optimized by the AOC algorithm. Wu et al. [13] compared six machine-learning models (M5T, KNEA, MLP, CatBoost, RF and MARS) for predicting Rs in a sub-humid region of China. They found that the KNEA model had the highest accuracy, the MLP model the best stability, and the CatBoost model the fastest speed.
Recently, the National Centers for Environmental Prediction (NCEP) released a new product, the Global Ensemble Forecast System version 12 (GEFSv12) [30]. This product provides Rs forecast data up to 35 days ahead; however, its accuracy has not been evaluated. Here, a new model based on the bat algorithm and KNEA was used to forecast Rs, with input data taken from the GEFSv12 output for 1–3 d ahead. Therefore, the objectives of this study were: (1) to evaluate the 1–3 d ahead solar-radiation-forecasting performance of GEFSv12 at four stations in northwest China; (2) to build a coupled model based on the bat algorithm and KNEA (BA-KNEA); and (3) to compare the newly developed BA-KNEA model with the traditional empirical model and five other machine-learning models.
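The coupling in objective (2) can be sketched as a bat-algorithm search over the hyperparameters of a kernel regressor. The sketch below is not the paper's implementation: kernel ridge regression with an RBF kernel stands in for KNEA, and the population size, iteration count, search ranges and synthetic data are all illustrative assumptions.

```python
# Sketch: bat algorithm (BA) tuning (gamma, lambda) of an RBF kernel regressor,
# a stand-in for the BA-KNEA coupling; all settings are illustrative.
import numpy as np

rng = np.random.default_rng(42)

def rbf_kernel(A, B, gamma):
    # pairwise squared distances, then RBF transform
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def fit_predict(X_tr, y_tr, X_te, gamma, lam):
    # kernel ridge regression: alpha = (K + lam*I)^-1 y
    K = rbf_kernel(X_tr, X_tr, gamma)
    alpha = np.linalg.solve(K + lam * np.eye(len(X_tr)), y_tr)
    return rbf_kernel(X_te, X_tr, gamma) @ alpha

def rmse(y, yhat):
    return float(np.sqrt(np.mean((y - yhat) ** 2)))

def bat_search(X_tr, y_tr, X_va, y_va, n_bats=8, n_iter=20):
    lo = np.array([0.01, 1e-4])        # lower bounds of (gamma, lambda)
    hi = np.array([5.0, 1.0])          # upper bounds
    pos = rng.uniform(lo, hi, (n_bats, 2))
    vel = np.zeros_like(pos)
    fit = np.array([rmse(y_va, fit_predict(X_tr, y_tr, X_va, *p)) for p in pos])
    best = pos[fit.argmin()].copy()
    for t in range(n_iter):
        freq = rng.uniform(0, 1, (n_bats, 1))   # pulse frequencies
        vel = vel + (pos - best) * freq
        cand = np.clip(pos + vel, lo, hi)
        # local random walk around the global best; step shrinks over time
        walk = best + 0.1 * (0.9 ** t) * rng.normal(size=(n_bats, 2))
        use_walk = rng.uniform(size=n_bats) > 0.5
        cand[use_walk] = np.clip(walk[use_walk], lo, hi)
        cfit = np.array([rmse(y_va, fit_predict(X_tr, y_tr, X_va, *p)) for p in cand])
        better = cfit < fit
        pos[better], fit[better] = cand[better], cfit[better]
        best = pos[fit.argmin()].copy()
    return best

# synthetic stand-in for predictor/target data
X = rng.uniform(-1, 1, (80, 3))
y = np.sin(3 * X[:, 0]) + 0.5 * X[:, 1] + 0.1 * rng.normal(size=80)
gamma, lam = bat_search(X[:60], y[:60], X[60:], y[60:])
```

In practice, any metaheuristic (PSO, BA) can wrap any learner in this way; only the fitness function (validation RMSE here) couples the two.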
3. Results
3.1. Empirical Statistics Methods
Table 3 presents the statistical indicators of the GEFSv12 NWP raw Rsf forecast data and of the results from the QM and EDCDFm methods. In general, as the forecast horizon extends, the errors of the raw NWP Rsf data and of the Rsf corrected by the QM and EDCDFm methods gradually increase. In Altay, the performance of the QM and EDCDFm methods was very similar, and both were slightly better than the raw NWP Rsf data. In Kashgar, the error of the raw Rsf data was relatively large. However, the QM and EDCDFm methods were superior to the raw Rsf, with RMSE decreased by 28.2–31% and 28.6–31.5%, and MAE decreased by 27.9–31.1% and 27.7–31.1%, respectively, for 1–3 d ahead. In Ruoqiang, the error of the raw Rsf was large, with an RMSE of more than 5 MJ m−2 d−1. After correction by the QM and EDCDFm methods, RMSE decreased by 17.4–18.5% and 19.7–20.1% for 1–3 d ahead, and MAE decreased by 16–17.7% and 17.7–19.4%, respectively. However, the R2 of the raw Rsf was slightly higher than that of the two statistical methods, which indicates that the statistical methods mainly corrected the overestimation (or underestimation) problem. The performance at Khotan station was similar to that at Ruoqiang station. Compared with the raw Rsf over the four stations, the RMSE and MAE of the QM and EDCDFm models decreased by 20% and 15%, respectively. These results show that empirical statistical methods can improve forecasting accuracy.
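As a concrete illustration of the QM step, empirical quantile mapping can be written in a few lines: each raw forecast value is replaced by the observed value at the same empirical quantile in a training period. This is a minimal sketch with synthetic data, not the paper's code; EDCDFm differs in how the forecast and observed distribution functions are combined.

```python
# Minimal sketch of quantile mapping (QM) bias correction with synthetic data.
import numpy as np

def quantile_map(fcst_train, obs_train, fcst_new):
    """Map each new forecast to the observed value at the same quantile."""
    # empirical quantile of each new forecast within the training forecasts
    q = np.searchsorted(np.sort(fcst_train), fcst_new) / len(fcst_train)
    q = np.clip(q, 0.0, 1.0)
    # invert the observed empirical CDF at those quantiles
    return np.quantile(obs_train, q)

rng = np.random.default_rng(0)
obs = rng.gamma(8.0, 2.5, 1000)            # "observed" Rs, MJ m-2 d-1
fcst = obs * 1.2 + rng.normal(0, 1, 1000)  # hypothetical biased raw forecast
corrected = quantile_map(fcst[:800], obs[:800], fcst[800:])
```

Because the mapping forces the corrected series onto the observed distribution, it removes systematic over- or underestimation while leaving the day-to-day ranking of forecasts unchanged, consistent with the R2 behavior noted above.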
As can be seen from the scatter plot of raw Rsf vs. ground-observed Rs (Figure 4), the discrete points increased slightly from 1 d to 3 d ahead, indicating a slight decrease in accuracy. The forecast Rsf for 1–3 d ahead never exceeded 30 MJ m−2 d−1, which was slightly lower than the observed extreme of Rs. The main problem of the GEFS data set lay in the many overestimated discrete points when the observed value was lower than 25 MJ m−2 d−1. The QM and EDCDFm methods alleviated this problem, and the R2 of the two methods was slightly higher than that of the raw Rsf data.
3.2. Machine-Learning Methods
Table 4 shows the statistical indicators of the Rs forecasts produced by seven different machine-learning methods for 1–3 d ahead. In Altay, the average R2 on the third day decreased by 0.046, and the average RMSE and MAE increased by 13.4% and 13.1%, compared with the first day. Among the seven machine-learning models, the BA-KNEA model was superior to the others on each day: its RMSE, MAE and NRMSE were 2.1–10.3%, 2.5–12.0% and 2.8–12.4% lower than those of the other machine-learning models for 1 d ahead, 1.8–8.8%, 1.7–10.1% and 1.6–9.9% lower for 2 d ahead, and 2.2–8.2%, 2.2–9.6% and 2.0–9.5% lower for 3 d ahead. The BA-SVM model ranked second, followed by the BA-XGBoost, PSO-KNEA, PSO-SVM, LSTM and PSO-XGBoost models.
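The four indicators quoted throughout this section can be computed as below. This is a generic reference implementation, not code from the paper; NRMSE is taken here as RMSE normalized by the observed mean, one common convention.

```python
# Reference implementation of the four indicators used in Tables 3-6:
# R2 (coefficient of determination), RMSE, MAE and NRMSE (RMSE / mean of obs).
import numpy as np

def indicators(obs, pred):
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    err = pred - obs
    rmse = np.sqrt(np.mean(err ** 2))
    mae = np.mean(np.abs(err))
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((obs - obs.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    nrmse = rmse / obs.mean()
    return {"R2": r2, "RMSE": rmse, "MAE": mae, "NRMSE": nrmse}
```

Note that some papers instead report the squared Pearson correlation as "R2"; the two definitions coincide only for unbiased predictions.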
In Kashgar, the BA-KNEA model did not have a significant advantage over the PSO-KNEA model on the first two days, but performed slightly better on the third day. Overall, the BA-KNEA model was superior to the other models: its RMSE, MAE and NRMSE decreased by 3.8–7.0%, 0–6.2% and 0–6.3% for 1 d ahead, by 3.8–8.4%, 2.8–8.6% and 2.4–8.2% for 2 d ahead, and by 5.4–12.5%, 2.3–14.7% and 2.3–14.5% for 3 d ahead. In addition, the BA-XGBoost model slightly outperformed the BA-SVM model.
In Ruoqiang, the BA-KNEA model performed better than the other six models. Compared with the BA-KNEA model, the RMSE, MAE and NRMSE of the other six models increased by 4.6–9.6%, 4.6–10.6% and 4.5–10.5% for 1 d ahead, by 7.1–10.0%, 4.9–9.5% and 6.3–9.7% for 2 d ahead, and by 3.3–4.9%, 2.6–4.6% and 3.2–5.1% for 3 d ahead. The BA-SVM model performed better than the remaining models for 1 d ahead, but its advantage over the models other than BA-KNEA was not obvious on the other two days. In Khotan, the BA-KNEA model also achieved the highest accuracy: the RMSE, MAE and NRMSE of the other models increased by 2.6–6.9%, 4.8–7.1% and 1.3–6.7% for 1 d ahead, by 3.5–9.0%, 1.6–8.4% and 1.2–8.5% for 2 d ahead, and by 3.0–8.5%, 1.8–12.8% and 1.8–11.8% for 3 d ahead. The BA-SVM model again performed better than the remaining models, except for the BA-KNEA model.
The scatter plots of observed Rs vs. Rsf for the seven machine-learning models are shown in Figure 5. Among all the machine-learning models, the BA-KNEA model performed slightly better than the others, followed by the BA-SVM model. The slope of every regression equation in the figure was less than 1 and the intercept greater than 0, which means that all the models underestimate Rs when it is very large and overestimate it when it is very small.
Figure 6 shows the distribution of the absolute error (AE) of the forecast Rs for the different machine-learning models 1–3 d ahead. For 1 d ahead, the proportion of days with AE < 2 MJ m−2 d−1 was around 60% for the six models, with PSO-KNEA and BA-KNEA slightly higher than the others; for the proportion of days with AE > 6 MJ m−2 d−1, the BA-KNEA model had a slight advantage over the other models. The performance for 2 d ahead was slightly worse than for 1 d ahead: the proportion of days with AE < 2 MJ m−2 d−1 fell below 60% for all six models, while the number of days with AE > 6 MJ m−2 d−1 showed little change compared with 1 d ahead, the BA-KNEA model again holding a slight advantage. For 3 d ahead, the accuracy of the six models continued to decline, and the BA-KNEA model had a slightly lower proportion of days with AE > 6 MJ m−2 d−1 than the other models.
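The binning behind Figure 6 amounts to counting the share of days whose absolute error falls below 2 or above 6 MJ m−2 d−1. A minimal sketch with synthetic errors (the 2.5 MJ m−2 d−1 error spread is a made-up illustration, not a result from the paper):

```python
# Sketch of the absolute-error (AE) binning used in Figure 6.
import numpy as np

def ae_shares(obs, pred, low=2.0, high=6.0):
    # fraction of days with AE below `low` and above `high` (MJ m-2 d-1)
    ae = np.abs(np.asarray(pred) - np.asarray(obs))
    return {"share_below": float(np.mean(ae < low)),
            "share_above": float(np.mean(ae > high))}

rng = np.random.default_rng(1)
obs = rng.uniform(5, 30, 500)            # synthetic observed Rs
pred = obs + rng.normal(0, 2.5, 500)     # hypothetical model errors
shares = ae_shares(obs, pred)
```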
Figure 7 shows the Taylor diagrams of the different methods over the four stations. It can be seen that the BA-KNEA model outperformed the other methods at all stations.
3.3. Comparison of Statistical Models and Machine-Learning Models
To evaluate the performance of the different categories of models, we ranked the four statistical indicators of all models over the four stations (Table 5). The model with the highest R2 or the lowest RMSE, MAE or NRMSE ranks first, and so on; when the rankings under different indicators disagree, the model leading on more indicators ranks first. The ranks of the models were the same for 1–3 d ahead: the BA-KNEA model was the best, followed by the BA-SVM, BA-XGBoost, PSO-KNEA, PSO-SVM, LSTM, PSO-XGBoost, EDCDFm and QM models. These results show that the machine-learning models are superior to the empirical-statistical models, and that the new BA-KNEA model has the best accuracy. In addition, the Taylor plots for the different stations on the first day of the forecast period (Figure 7) also show that the results of the BA-KNEA model were the closest to the observations, while the GEFS raw data had the largest error.
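The ranking scheme of Table 5 can be sketched as follows: rank the models per indicator (higher R2 is better; lower RMSE, MAE and NRMSE are better), then order the models by their ranks, approximated here with the mean rank across indicators. The indicator values below are illustrative placeholders, not the paper's numbers.

```python
# Sketch of the Table 5 ranking scheme with placeholder indicator values.
import numpy as np

models = ["BA-KNEA", "BA-SVM", "QM"]
scores = {                    # rows follow the order of `models`
    "R2":    np.array([0.83, 0.81, 0.74]),
    "RMSE":  np.array([3.2, 3.4, 4.1]),
    "MAE":   np.array([2.5, 2.6, 3.2]),
    "NRMSE": np.array([0.15, 0.16, 0.20]),
}

def rank(values, higher_is_better=False):
    # rank 1 = best
    order = np.argsort(-values if higher_is_better else values)
    r = np.empty(len(values), int)
    r[order] = np.arange(1, len(values) + 1)
    return r

ranks = np.vstack([rank(scores["R2"], higher_is_better=True)] +
                  [rank(scores[k]) for k in ("RMSE", "MAE", "NRMSE")])
mean_rank = ranks.mean(axis=0)
overall = [models[i] for i in np.argsort(mean_rank)]
```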
3.4. BA-KNEA with Different Input Combinations
To analyze how different meteorological factors affect the forecasting ability of the model, we set up different input combinations for the BA-KNEA model and examined the contribution of each factor through the results.
Table 6 shows the statistical indicators of the different input combinations of the BA-KNEA model for 1–3 d ahead. When the only input factor was Rsf, the BA-KNEA model was more accurate than the QM and EDCDFm methods with the same input at all four stations (Table 3): relative to the EDCDFm method, its RMSE and MAE were 1.7–7.9% and 1.6–7.6% lower over the 1–3 d forecast period. This model was also better than the model built with air temperature and extraterrestrial radiation as inputs (Combination 5), which shows that the solar-radiation accuracy of the GEFSv12 dataset exceeds that of the traditional temperature-based machine-learning approach. In Altay, when only the maximum and minimum air temperatures were used as input, the error was larger than for the model with Rsf input: R2 was between 0.712 and 0.723, RMSE between 4.705 and 4.812 MJ m−2 d−1, MAE between 3.766 and 3.799 MJ m−2 d−1, and NRMSE between 0.241 and 0.243. Adding RHf, Uf, Tmaxf and Tminf to Rsf improved the prediction accuracy of Rs; wind speed gave the largest improvement, followed by air temperature and, finally, relative humidity. Compared with Combinations 2, 3 and 4, the accuracy of Combination 6 was higher, showing that the multi-factor combination outperformed the two-factor combinations: it contains more nonlinear information related to Rs, which helps improve the model accuracy further. At Kashgar station, adding relative humidity to Rsf did not improve the accuracy significantly, and for forecast periods of 2 and 3 days, adding wind speed to Rsf improved the accuracy only slightly. Adding temperature to Rsf improved the model's accuracy to a certain extent, and its accuracy was close to that of the complete combination (Combination 6), mainly because of the limited contribution of RHf and Uf. The performance of the BA-KNEA model on the first two days at Ruoqiang station was similar to that at Altay, but on the third day Combination 3 outperformed the complete input combination: because the forecast accuracy of wind speed and relative humidity is poor, adding these factors can introduce noise into the model. At Khotan station, the complete combination was close to Combinations 2, 3 and 4 on the first day but superior to them on the other two days. Overall, the complete combination was slightly better than the other combinations over the four stations.
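The input-combination experiment behind Table 6 reduces to training one model per predictor subset and comparing validation errors. The sketch below uses ordinary ridge regression and synthetic data as stand-ins for BA-KNEA and the GEFSv12 fields; the combination labels loosely mirror those discussed above, and all numbers are illustrative.

```python
# Sketch of the Table 6 experiment: one model per predictor subset,
# compared by validation RMSE. Ridge regression stands in for BA-KNEA.
import numpy as np

rng = np.random.default_rng(3)
n = 200
# columns are synthetic stand-ins for Rsf, Tmaxf, Tminf, RHf, Uf
X = rng.normal(size=(n, 5))
y = 0.8 * X[:, 0] + 0.3 * X[:, 1] - 0.2 * X[:, 3] + 0.1 * rng.normal(size=n)

combos = {
    "C1: Rsf":    [0],
    "C2: Rsf+T":  [0, 1, 2],
    "C3: Rsf+RH": [0, 3],
    "C5: T only": [1, 2],
    "C6: all":    [0, 1, 2, 3, 4],
}

def ridge_rmse(cols, lam=1e-2):
    # train on the first 150 samples, validate on the last 50
    A, B = X[:150][:, cols], X[150:][:, cols]
    w = np.linalg.solve(A.T @ A + lam * np.eye(len(cols)), A.T @ y[:150])
    return float(np.sqrt(np.mean((B @ w - y[150:]) ** 2)))

results = {name: ridge_rmse(cols) for name, cols in combos.items()}
```

With the synthetic target above, the subset containing the strongest predictor (the Rsf stand-in) beats the temperature-only subset, and the complete combination does best, mirroring the qualitative pattern reported for the four stations.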
4. Discussion
Different machine-learning models perform differently in solar-radiation prediction, mainly for two reasons. Firstly, different machine-learning models have different sensitivities to the data distribution: kernel-based methods can perform well on low-dimensional data sets [47], tree-based models perform better on high-dimensional data with a large amount of categorical data, and deep-learning models excel at image processing [48]. Secondly, the parameter selection of a machine-learning model may not reach the global optimum. Fan et al. [31] compared the performance of SVM and XGBoost with temperature and precipitation as input factors and found that SVM was slightly better than XGBoost. Ghimire et al. [7] compared ANN, SVR, GPML and GP models for forecasting solar radiation with reanalysis data in Queensland, Australia, and highlighted that the ANN model outperformed the other ML models. Shin et al. [49] used a deep-learning model for short-term forecasting of solar radiation for photovoltaic power generation. Hu et al. [50] used ground-based images and an ANN model to forecast solar radiation. However, there are few studies using weather-forecast products to forecast solar radiation in China. In this study, we evaluated the capability of the GEFSv12 product in a solar-resource-rich region of China. We found that the raw solar-radiation forecast data in GEFSv12 perform poorly and are too uncertain for direct use. Thus, we built a coupled model based on the bat algorithm and the KNEA model; the results show that the newly developed model is superior to the other empirical-statistical and machine-learning models. LSTM has been used to forecast Rs at hourly and other time scales [51,52]; however, we found that the LSTM performed worse than the BA-KNEA model and the other models. Daily Rs fluctuates widely in the arid regions of northwest China, and historical information there is less important than the NWP forecast data for the future period, so the LSTM could not extract enough information to forecast Rs 1–3 d ahead.
Many scholars have found that meteorological factors such as air temperature, relative humidity, wind speed and precipitation are closely related to solar radiation [53,54], but the effects of these factors vary across regions of the globe [55,56]. In northwest China, air temperature is the meteorological variable most closely related to solar radiation [57], and many scholars have therefore established solar-radiation models based on air temperature. Relative humidity and wind speed have also been used to improve the accuracy of solar-radiation prediction [58,59]. Although a forecast data set was used in this study, similar results were obtained, which means the forecast data set and the observation data behave similarly. The most significant difference between them lies in the forecast precision of the individual factors: temperature is forecast with very high accuracy, but relative-humidity and wind-speed forecasts are less accurate, mainly because of a spatial mismatch between the two data sets. That is, the forecast data represent the average over a large area, while the relative humidity and wind speed observed at a weather station are point values. At the four stations in this study, models using the temperature factor were generally more accurate than those using wind speed or relative humidity, and the relative-humidity and wind-speed forecasts of GEFSv12 need to be improved.
5. Conclusions
Accurate forecasting of solar radiation (Rs) is significant for photovoltaic power generation and agricultural management. For the first time, this study evaluated and improved the capability of the newly released National Centers for Environmental Prediction Global Ensemble Forecast System version 12 (NCEP GEFSv12) for short-term forecasting of Rs. To achieve this goal, a new coupled model based on the bat algorithm (BA) and the kernel-based nonlinear extension of Arps decline (KNEA) was established, with data from four solar-radiation stations in Xinjiang, China as the benchmark. The new model was compared with two empirical-statistical methods (quantile mapping and equiratio cumulative-distribution-function matching) and with five machine-learning methods, i.e., support vector machine (SVM), XGBoost, KNEA, BA-SVM and BA-XGBoost. The results show that the forecasting accuracy of all models decreases from 1 d to 3 d ahead. Compared with the GEFS raw Rs data over the four stations, the RMSE and MAE of the QM and EDCDFm models decreased by 20% and 15%, respectively. In addition, the BA-KNEA model was superior to the GEFSv12 raw Rs data and the other post-processing methods, with R2 = 0.782–0.829, RMSE = 3.240–3.685 MJ m−2 d−1, MAE = 2.465–2.799 MJ m−2 d−1, and NRMSE = 0.152–0.173.