1. Introduction
Global interest in renewable energy sources such as solar, wind, hydro, and biomass has grown remarkably owing to the depletion of fossil fuels. In this regard, the deployment of solar energy has focused widely on photovoltaic (PV) systems, solar thermal energy, and concentrated solar energy [1]. In particular, the growth of renewable energy generation has made solar radiation forecasting an important part of the operation and management of the modern smart grid [2,3].
Solar radiation forecasting has become crucial for accurately predicting the efficiency of solar energy conversion systems and ensuring the reliability and safety of the electrical grid. Specifically, global solar radiation is the most critical aspect of solar energy and is essential for implementing renewable solar energy systems and for PV system sizing [4,5]. Moreover, a precise understanding of solar radiation could improve the operation of the electricity network and increase the efficiency of the smart grid. Therefore, appropriate solar radiation forecasting significantly improves solar energy usage, reduces the economic losses due to electrical constraints, and maximizes the return on investment in photovoltaic grids [6]. Solar radiation is considered one of the prime energy resources and is expected to play a significant role in the foreseeable future, particularly in developing countries.
Solar radiation forecasting has been widely discussed in the literature. Several researchers have categorized solar radiation forecasting into four horizons: very-short-, short-, medium-, and long-term forecasting [7]. Appropriate approaches are required to improve the accuracy of solar radiation forecasting models and reduce the negative effects of system fluctuations. These approaches fall into two main groups, namely time series statistical methods and physical techniques, together with hybrid techniques that combine the two [8]. The choice of technique depends entirely on the forecasting horizon, the available data, and the location. The physical techniques are based on various equations that describe the transmission phenomena and thermodynamic mechanisms occurring in the atmosphere and on the surface of the earth. However, the equations become more complex when additional variables such as temperature, wind speed, dust, and humidity are added [
9]. The forecast accuracy of the physical techniques mainly depends on the accuracy of the collected data and on the information available about the location. The time series statistical technique, in contrast, is based purely on the relationship between the past values of the weather parameters and the solar radiation, which is identified and then used in the forecasting process. It includes forecasting methods such as artificial neural networks, support vector machines, Markov chains, auto-regressive models, and regression models. A time series can be described as a series of measurements collected over time at regular intervals; the approach is flexible and requires fewer data inputs, resulting in simple implementation and lower cost. However, the main limitation of time series forecasting is the absence of deterministic causes [10]. Against this background, numerous papers have developed models and techniques for time series analysis based on artificial neural networks (ANNs) [11]. In [12], the authors presented a brief overview of the different techniques used to determine their source and forecast solar radiation using ANNs in Greece. Their results show that the geographical pattern and the different climatic conditions enhance the temporal and spatial variability of clouds. A new model based on an ensemble of spatiotemporal deep learning models and variational Bayesian inference, which uses spatiotemporal information to forecast solar radiation, has also been developed in the literature [
13]. In [
14], the authors compared eleven statistical and machine learning models for hourly solar radiation forecasting at three meteorological locations with different collected data, and they explained that the precision and performance of each model are related to the variability of both the meteorological location and the solar radiation data. In addition, the efficiency of the models was compared in terms of the normalized root mean square error (nRMSE), mean absolute error (MAE), and skill score. For weak variability, the auto-regressive moving average and the multi-layer perceptron are the best predictors. The feedforward backpropagation algorithm was used by [
15] to predict daily global solar radiation in 25 cities across the Kingdom of Morocco. Several meteorological, astronomical, and geographical parameters were employed as input data to predict the output. Multiple parameter combinations were tested to select the most suitable configuration with optimal input data for each study location. According to the statistical metrics, the resulting configurations are twelve inputs for Er-Rachidia, Marrakech, Midelt, Taza, Oujda, Nador, Tetouan, Tangier, Al-Auin, Dakhla, Settat, and Safi; seven inputs for Fes, Ifrane, Beni-Mellal, and Meknes; six inputs for Agadir and Rabat; five inputs for Sidi Ifni, Essaouira, Casablanca, and Kenitra; and four inputs for Ouarzazate, Larache, and Al-Hoceima. In terms of accuracy, the R² of the selected best input parameters varies between 0.9860 and 0.9920, the MBE (%) ranges from −0.1076% to −0.5931%, the RMSE lies between 0.1990% and 0.4580%, the NRMSE lies between 0.0355 and 0.8938, and the MAPE lies between 0.0019% and 0.0060%. This technique could be used to predict other parameters for locations where measurements are unavailable or costly to obtain. In the meantime, the authors of [
16] presented a comparative optimization of daily global solar radiation forecasting with different machine learning and time series methods. The selected methods were compared with the persistence technique and with measured data. Several statistical metrics were assessed to identify the most appropriate method, that is, the one with the lowest error values. The RMSE (%) and MBE (%) values of the models employed in that study were computed and found to be mostly positive. The RMSE (%) and MBE (%) of the selected models varied from 4.64% to 8.87% and from 6% to 22.93%, respectively. Based on all statistical metrics, the lowest errors correspond to the FFBP neural network (6 × 10 × 1), which performs well and is close to the measured data. The authors of [
17] presented a complete and detailed synthesis of solar radiation modeling, forecasting, and solar radiation data using artificial intelligence methods: ANN, fuzzy logic, genetic algorithm, expert system, and a hybrid method. Solar radiation was shown to be a vital factor in PV system performance and sizing. The same researchers also generated horizontal global solar radiation by combining an ANN with a library of Markov transition matrices (MTM), based on three geographical coordinates (longitude, altitude, and latitude); the data were collected from a database of 60 stations in Algeria over 9 years. This prediction is comparatively accurate, with a relative root mean square error (RMSE) of less than 8.2%. On the other hand, López et al. [18] chose the automatic relevance determination (ARD) method based on the ANN model to select the relevant input parameters in direct normal solar irradiance forecasting. The clearness index and relative air mass were found to be the most important input parameters to the neural network. According to J. Lampinen et al. [19], Penny et al. [20], and B. Belmahdi et al. [8], ARD places a prior distribution on the network weights of the ANN and determines the most relevant input parameters by introducing a hyperparameter for each input unit. In ANNs, the prior distribution of the network weights is controlled by these hyperparameters. The ANN is presented with training data, and the posterior distributions of the weights and hyperparameters are calculated using Bayes' rule.
Recently, time series forecasting models have attracted the attention of researchers in different fields for their power and ability to forecast complex systems [11]. In particular, the autoregressive integrated moving average (ARIMA) model has been applied to the COVID-19 pandemic [21], hydroelectricity [22], agriculture [23], and groundwater-based irrigation [24]. Generally, ARIMA models integrate autoregressive (AR) models and moving average (MA) models, which have shown reasonable precision in forecasting stationary time series. They rest on the strong assumption that future data values depend linearly on present and past data values. However, many real-world time series result from dynamic non-linear structures that cannot be accurately modeled by ARIMA. Consequently, artificial neural networks (ANNs) are among the most commonly implemented algorithms for non-linear time series modeling. In other words, ANNs have a range of benefits over ARIMA and other forecasting frameworks because they are capable of representing dynamic non-linear functions. In particular, ANNs can approximate any continuous measurable function to an arbitrary desired accuracy [11]. Moreover, ANNs are flexible and data-driven in design, so ANN models can be adapted to the characteristics of the time series data. Several studies have compared the performance of machine learning (ML) and deep learning (DL) algorithms in forecasting solar radiation [25]. In some cases, ML algorithms have been found to be more accurate than DL algorithms [26], while in others DL methods have been found to be more accurate [27]. In this context, Table 1 summarizes the differences between ARIMA and FFBP in solar radiation forecasting at various locations.
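To make the autoregressive and differencing ideas concrete, the following minimal Python sketch fits an ARIMA(1,1,0)-style model by least squares and produces a one-step forecast. It is an illustration only and not the implementation used in this work; the function names and the toy values are assumptions introduced here.

```python
def difference(series):
    """First-order differencing: d[t] = y[t] - y[t-1] (the 'I' step with d = 1)."""
    return [b - a for a, b in zip(series, series[1:])]


def fit_ar1(series):
    """Least-squares estimate of phi in x[t] = phi * x[t-1] + e[t] (no intercept)."""
    num = sum(a * b for a, b in zip(series, series[1:]))
    den = sum(a * a for a in series[:-1])
    return num / den


def forecast_next(series):
    """One-step forecast: AR(1) on the differenced series, then undo the differencing."""
    diffs = difference(series)
    phi = fit_ar1(diffs)
    return series[-1] + phi * diffs[-1], phi


# Toy values (made up for illustration, not measured GSR data).
toy_gsr = [3.0, 4.0, 4.5, 4.75, 4.875]
forecast, phi = forecast_next(toy_gsr)
print(f"phi = {phi:.3f}, next value = {forecast:.4f}")
# prints: phi = 0.500, next value = 4.9375
```

The forecast is a purely linear function of past values, which is exactly the assumption that limits ARIMA on non-linear series.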
As a novel method for time series forecasting, hybrid ARIMA-FFBP and ARMA-FFBP models have been developed. FFBP is a type of feed-forward neural network that employs log-sigmoid activation functions. Compared with single models and with other hybrid models such as SARIMA-MLP, this hybrid technique has been observed to increase accuracy and minimize errors. The hybrid ARIMA-FFBP and ARMA-FFBP models could be further deployed to forecast the future performance and energy capacity of solar energy systems. Here, both models are tested with different combinations of input parameters in order to select the most adequate model. The daily solar radiation data were collected from two cities in northern Morocco, Tetouan and Tangier. These locations are important because solar energy is used there in photovoltaic and thermal systems and because the region hosts the largest port in the Mediterranean area. Several statistical metrics are calculated to validate the performance of the forecast values.
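One common way to build such a hybrid, sketched below in Python, is to let a linear model capture the linear part of the series and train a small log-sigmoid feed-forward network on its residuals, then add the two forecasts. This section does not detail how the two stages are combined in this work, so the residual scheme, the `TinyFFBP` class, the AR(1) linear stage, the network size, and the synthetic series are all assumptions made for illustration.

```python
import math
import random


def logsig(z):
    """Log-sigmoid activation for the hidden layer."""
    return 1.0 / (1.0 + math.exp(-z))


class TinyFFBP:
    """Feed-forward net: one log-sigmoid hidden layer, linear output, SGD backprop."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def _hidden(self, x):
        return [logsig(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.w1, self.b1)]

    def predict(self, x):
        return sum(w * h for w, h in zip(self.w2, self._hidden(x))) + self.b2

    def train(self, xs, ys, epochs=200, lr=0.05):
        for _ in range(epochs):
            for x, y in zip(xs, ys):
                h = self._hidden(x)
                err = sum(w * hj for w, hj in zip(self.w2, h)) + self.b2 - y
                for j, hj in enumerate(h):
                    # Gradient w.r.t. the hidden pre-activation (uses w2 before its update).
                    grad_h = err * self.w2[j] * hj * (1.0 - hj)
                    self.w2[j] -= lr * err * hj
                    self.b1[j] -= lr * grad_h
                    for i, xi in enumerate(x):
                        self.w1[j][i] -= lr * grad_h * xi
                self.b2 -= lr * err


def mse(net, xs, ys):
    """Mean squared error of the network on (xs, ys)."""
    return sum((net.predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(ys)


# Synthetic series (illustrative only, not measured GSR data).
series = [5.0 + 2.0 * math.sin(0.3 * t) + 0.5 * math.sin(0.9 * t) ** 2 for t in range(120)]

# Stage 1: linear part, here a mean-centred AR(1) fitted by least squares.
mean = sum(series) / len(series)
c = [v - mean for v in series]
phi = sum(a * b for a, b in zip(c, c[1:])) / sum(a * a for a in c[:-1])
linear_fit = [mean + phi * c[t - 1] for t in range(1, len(c))]
residuals = [series[t] - linear_fit[t - 1] for t in range(1, len(series))]

# Stage 2: the network learns each residual from its two previous values.
n_lags = 2
xs = [residuals[t - n_lags:t] for t in range(n_lags, len(residuals))]
ys = residuals[n_lags:]
net = TinyFFBP(n_in=n_lags, n_hidden=4)
mse_before = mse(net, xs, ys)
net.train(xs, ys)
mse_after = mse(net, xs, ys)

# Hybrid one-step forecast = linear forecast + predicted residual.
hybrid_forecast = (mean + phi * c[-1]) + net.predict(residuals[-n_lags:])
print(f"residual MSE: {mse_before:.4f} -> {mse_after:.4f}; forecast = {hybrid_forecast:.3f}")
```

The design choice here is that the network only has to model what the linear stage leaves unexplained, which is the usual motivation for combining a linear model with a neural network.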
The rest of the paper is organized as follows. Section 2 describes the collected data and the proposed forecasting methodology. Section 3 summarizes the statistical metrics employed in this work. Section 4 presents the main findings and simulation results of the short-term daily global solar radiation forecasts. Lastly, the conclusion is presented in Section 4.
3. Results and Discussion
This section assesses the ability of five techniques to forecast the daily GSR in two different cities in Morocco. The first and second methods are purely autoregressive (ARMA and ARIMA), the third is the FFBP neural network, which uses a backpropagation algorithm, and the remaining two are the combined (hybrid) methods. To assess the success of these techniques, numerous statistical metrics commonly used in the literature are discussed. The proposed methodologies were implemented in the MATLAB environment to forecast the daily global solar radiation (GSR). In Table 3, Table 4 and Table 5, the selected models are defined as ARMA (10, 0, 0), ARMA (16, 0, 0), ARIMA (2, 1, 1), ARIMA (2, 2, 1), and FFBP (12, 2, 1), respectively, for the Tetouan and Tangier cities.
Figure 9A–D show the comparisons of the daily GSR forecasted by the proposed ARIMA, ARMA, FFBP, hybrid ARMA-FFBP, and hybrid ARIMA-FFBP models with the measured data of the Tetouan site. The figures also present the error forecasting (EF) of each model, which shows the agreement between the forecasts and the measured data. Considering Figure 4A and Table 6 together, it appears clearly that the R², WIA, SBF, and LCE of the selected ARMA (10.0.0) model indicate significant accuracy, varying between 0.8074 and 0.9939. In terms of BIC and AIC, the selected model presents the worst values after the ARIMA (2.1.0) model. The average daily GSR forecasted by the ARMA (10.0.0) model is lower than the average measured daily GSR. In terms of error forecasting, the selected model shows several observations with large errors, such as observation numbers 63 (EF = 1.878 kW·h/m²), 87 (EF = 2.378 kW·h/m²), 102 (EF = 2.818 kW·h/m²), and 263 (EF = 2.994 kW·h/m²).
Figure 9B shows the daily GSR forecasted by the ARIMA (2.1.0) model and its error forecasting, compared with the measured daily GSR for the Tetouan site. Considering the performance results presented in Table 6, all statistical metrics indicate that the ARIMA (2.1.0) model forecasts well, with low values of the MBE (0.0839%), RMSE (16.6421%), and Sd (12.6704%). In terms of R², the ARIMA (2.1.0) model reaches 0.9628, which indicates better forecasting accuracy than the ARMA (10.0.0) model. A considerable forecasting error of the ARIMA (2.1.0) model was seen at observation number 161 (3.542 kW·h/m²).
Figure 9C shows the daily GSR forecasted by the FFBP (12.2.1) model, its EF, and the measured data. Compared with the two previous models in terms of R², the FFBP (12.2.1) model gives the highest accuracy, estimated at 0.9890. The values of the MBE, RMSE, BIC, and AIC for the FFBP (12.2.1) model are 0.817 kW·h/m², 0.5119 kW·h/m², 991.3442, and 890.6528, respectively. At observation number 161, where the ARIMA (2.1.0) model shows an error of 3.542 kW·h/m², the error forecasting of the FFBP (12.2.1) model is lower than that of the ARIMA (2.1.0) and ARMA (10.0.0) models. It can be concluded that the FFBP (12.2.1) model performed better than the ARMA (10.0.0) and ARIMA (2.1.0) models.
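Since AIC and BIC are used above to rank the models, a short Python sketch of how such criteria can be computed from forecast residuals may help. It assumes the common least-squares forms AIC = n ln(SSE/n) + 2k and BIC = n ln(SSE/n) + k ln(n); the definitions actually used for Table 6 are not given in this section, so the formulas, the function name `aic_bic`, and the toy residuals are assumptions.

```python
import math


def aic_bic(residuals, n_params):
    """Least-squares AIC and BIC for a model with n_params fitted parameters."""
    n = len(residuals)
    sse = sum(e * e for e in residuals)
    log_term = n * math.log(sse / n)
    aic = log_term + 2 * n_params
    bic = log_term + n_params * math.log(n)
    return aic, bic


# Toy residuals (made up): SSE / n = 1, so the log term vanishes.
aic, bic = aic_bic([1.0, -1.0, 1.0, -1.0], n_params=2)
print(f"AIC = {aic:.3f}, BIC = {bic:.3f}")
# prints: AIC = 4.000, BIC = 2.773
```

Under both criteria a lower value is better: the first term rewards a small residual error and the second penalizes the number of parameters.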
Figure 9D shows the daily GSR forecasted by the hybrid ARMA-FFBP model, the measured data, and the error forecasting for the Tetouan site. Because a hybrid model can offset the shortcomings of a single model, the hybrid ARMA-FFBP forecast is close to the measured data. The R², SBF, LCE, and WIA of the hybrid model are 0.9890, 0.9148, 0.9580, and 0.9910, respectively. The R² value is close to one, which indicates good agreement between the hybrid ARMA-FFBP forecast and the measured data. The other statistical metrics of the hybrid ARMA-FFBP model show the lowest values compared with the three previous models, and its error forecasting is also lower. Compared with the ARMA (10.0.0), ARIMA (2.1.0), and FFBP (12.2.1) models, the hybrid ARMA-FFBP is therefore the most suitable model for forecasting the daily GSR.
Figure 9E shows the daily GSR forecasted by the hybrid ARIMA-FFBP model, the measured data, and the error forecasting for the Tetouan site. Among all models, the hybrid ARIMA-FFBP is the most successful and is very close to the measured daily GSR. In terms of R², the hybrid ARIMA-FFBP model is higher by about 0.41% than the hybrid ARMA-FFBP model, 0.53% than the FFBP (12.2.1) model, 4.59% than the ARMA (10.0.0) model, and 3.03% than the ARIMA (2.1.0) model. The computed MBE (%), RMSE (%), Sd (%), SBF, LCE, WIA, BIC, and AIC are very close to those of the FFBP (12.2.1) and hybrid ARMA-FFBP models. The hybrid ARMA-FFBP model is close to the hybrid ARIMA-FFBP model, particularly in R² (%). In terms of error forecasting, the proposed hybrid ARIMA-FFBP can be recognized as the most suitable forecasting model, presenting the lowest values of the error metrics and the highest values of R², SPE, LCE, and WIA. Likewise, at the previously mentioned observation number 161, the error forecasting of the hybrid ARIMA-FFBP was seen to be very low.
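The R² gaps quoted for the Tetouan site can be checked with simple arithmetic. The Python sketch below assumes that the hybrid ARIMA-FFBP R² for Tetouan is 0.9931, the highest value quoted in the text (Table 6 itself is not reproduced here), and uses the ARMA and ARIMA values 0.9472 and 0.9628 given in the text. Under that assumption the quoted 4.59% and 3.03% correspond to differences in percentage points rather than relative improvements.

```python
# R² values quoted in the text for the Tetouan site.
r2_single = {"ARMA (10.0.0)": 0.9472, "ARIMA (2.1.0)": 0.9628}
# Assumption: 0.9931 (the maximum R² quoted in the text) is the hybrid ARIMA-FFBP value.
r2_hybrid = 0.9931

point_gaps = {}
relative_gaps = {}
for name, value in r2_single.items():
    point_gaps[name] = (r2_hybrid - value) * 100             # percentage points
    relative_gaps[name] = (r2_hybrid - value) / value * 100  # relative change in %
    print(f"{name}: {point_gaps[name]:.2f} points, {relative_gaps[name]:.2f}% relative")
# The point gaps are 4.59 and 3.03; the relative gaps are slightly larger.
```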
Figure 10A–D shows the comparisons of the daily GSR forecasted by the proposed ARIMA, ARMA, FFBP, hybrid ARMA-FFBP, and hybrid ARIMA-FFBP models with the measured data of the Tangier site. The figures also present the error forecasting (EF) of each model, which shows the agreement between the forecasts and the measured data. As for the previous site, this study location presents satisfactory results in terms of statistical metrics. The Tangier site is among the Mediterranean locations selected in this study with good prediction results, as can be seen from the forecasted results and errors depicted in Figure 5. Several observations in the forecasted daily GSR show maximal and minimal forecasting errors, which increase and decrease the total forecast error of the selected study location. This is the reason why the Tangier site is less predictable than the Tetouan site. The ARMA (16.0.0) model is the worst forecasting model in terms of R², SPE, LCE, and WIA for the Tangier site, with values ranging between 0.8074 and 0.9601. Compared with the Tetouan site, the ARMA (16.0.0) model performed better than the ARMA (10.0.0) model.
Figure 10B shows the daily GSR forecasted by the ARIMA (2.2.0) model compared with the measured data. In terms of MBE (%), RMSE (%), Sd (%), AIC, and BIC, the proposed model increases the forecast accuracy compared with the ARMA (16.0.0) model. The error forecasting of this model is lower than that of the ARMA (16.0.0) model, particularly at observation numbers 73 (1.98 kW·h/m²), 88 (3.638 kW·h/m²), 89 (2.95 kW·h/m²), 128 (5.368 kW·h/m²), 131 (3.882 kW·h/m²), and 360 (0.6491 kW·h/m²). In terms of MBE, the selected model has the lowest value (0.0042 kW·h/m²) of all the models. It can be concluded that the ARIMA (2.2.0) model outperforms the ARMA (16.0.0) model and presents significant agreement between the forecasted daily GSR and the measured data.
Figure 10C compares the daily GSR forecasted by the FFBP (12.2.1) method with the measured data for the Tangier site. The forecast error is depicted in the same figure, which shows the improvement achieved by the FFBP (12.2.1) model and its agreement with the measured data. As seen from Table 6, in terms of R², the selected model ranks third after the combined models. The FFBP (12.2.1) model reaches low values of about 0.03092 kW·h/m² for MBE, 0.0517% for MBE (%), and 0.79265% for Sd (%). Its WIA is about 0.9891, which is close to 1. As a result, the selected FFBP (12.2.1) model outperforms the ARIMA (2.2.0) and ARMA (16.0.0) models and gives a successful forecast of the daily GSR.
Figure 10D shows the daily GSR forecasted by the hybrid ARMA-FFBP model and its error forecasting, compared with the measured daily GSR for the Tangier site. Considering Figure 5 and Table 6 together, it appears clearly that the hybrid ARMA-FFBP model presents the best statistical metrics compared with the previous models for the selected study location. In addition, the combined ARMA-FFBP model is close to the FFBP (12.2.1) model in terms of R², Sd (%), and Sd. Unlike the ARMA (16.0.0) and ARIMA (2.2.0) models, the hybrid ARMA-FFBP model shows lower error forecasting at observation numbers 73 (1.98 kW·h/m²), 88 (3.638 kW·h/m²), 89 (2.95 kW·h/m²), 128 (5.368 kW·h/m²), 131 (3.882 kW·h/m²), and 360 (0.6491 kW·h/m²).
Figure 10E shows the daily GSR forecasted by the combined ARIMA-FFBP model compared with the measured data for the Tangier site. Compared with the four previous models, the hybrid ARIMA-FFBP is the most successful model, presenting the lowest values of MBE (%), RMSE (%), Sd (%), AIC, and BIC, and the highest values of R², SPE, LCE, and WIA. In terms of R², the hybrid ARIMA-FFBP model is higher by about 1.57% than the ARMA (16.0.0) model, 3% than the ARIMA (2.2.0) model, 0.67% than the FFBP (12.2.1) model, and 0.13% than the hybrid ARMA-FFBP (2.1.0) model. All computed statistical performance metrics are very close to those of the FFBP (12.2.1) and hybrid ARMA-FFBP models. In terms of error forecasting, the proposed hybrid ARIMA-FFBP can be recognized as the most suitable forecasting model. Likewise, at the previously mentioned observation numbers, the error forecasting of the hybrid ARIMA-FFBP was seen to be very low.
The Taylor diagram has been used as a comparison tool to reveal the accuracy of the selected models (forecasted versus measured data). This diagram combines the correlation coefficient (R²), the root mean square error (RMSE), and the standard deviation (Sd) in a single two-dimensional polar diagram. The main objective of this illustration is to closely inspect the forecasted results against the measured data on a particular day. Figure 11 and Figure 12 compare the performance of the most appropriate input combinations based on the statistical error metrics. The figures illustrate the accuracy of the 16 relevant models, which have relatively low errors in terms of standard deviation (Sd) and RMSE (values between 0.1 and 0.8 kWh/m²). In addition, the highest value of R² (99.31%) shows the agreement between the measured and predicted values.
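The three statistics can share one polar plot because they are tied together by the law of cosines: E'² = Sd_f² + Sd_m² − 2·Sd_f·Sd_m·R, where E' is the centered RMSE, Sd_f and Sd_m are the standard deviations of the forecast and the measurements, and R is the correlation coefficient. Note that the standard Taylor construction uses the correlation coefficient R itself, whereas the text labels this quantity R². The Python sketch below checks the identity numerically; the function name `taylor_stats` and the toy values are assumptions made for illustration.

```python
import math


def taylor_stats(forecast, measured):
    """Return (sd_forecast, sd_measured, correlation, centered_rmse)."""
    n = len(measured)
    mean_f = sum(forecast) / n
    mean_m = sum(measured) / n
    dev_f = [f - mean_f for f in forecast]
    dev_m = [m - mean_m for m in measured]
    sd_f = math.sqrt(sum(d * d for d in dev_f) / n)
    sd_m = math.sqrt(sum(d * d for d in dev_m) / n)
    corr = sum(a * b for a, b in zip(dev_f, dev_m)) / (n * sd_f * sd_m)
    crmse = math.sqrt(sum((a - b) ** 2 for a, b in zip(dev_f, dev_m)) / n)
    return sd_f, sd_m, corr, crmse


# Toy values (made up for illustration, not the Tetouan or Tangier data).
measured = [4.0, 5.0, 6.0, 7.0, 5.5]
forecast = [4.2, 4.9, 6.3, 6.6, 5.4]
sd_f, sd_m, corr, crmse = taylor_stats(forecast, measured)

# Law of cosines that underlies the diagram.
law_of_cosines = sd_f ** 2 + sd_m ** 2 - 2.0 * sd_f * sd_m * corr

# Polar position of the forecast point: radius = sd_f, angle = arccos(correlation).
angle_deg = math.degrees(math.acos(min(1.0, corr)))
print(f"Sd_f = {sd_f:.3f}, R = {corr:.3f}, centered RMSE = {crmse:.3f}, angle = {angle_deg:.1f} deg")
```

The measured data sit on the horizontal axis at radius Sd_m, so the distance from that reference point to a model's point equals its centered RMSE.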
Figure 11 illustrates the Taylor diagram for the Tetouan site, in which statistics for the forecasted ARMA (10.0.0), ARIMA (2.1.0), FFBP (12.2.1), hybrid ARMA-FFBP, and hybrid ARIMA-FFBP models were computed (each model comprises 16 candidate configurations from which the best-performing one is selected). Each candidate appearing in the diagram quantifies how closely the forecast matches the measured daily GSR. It is seen from the figure that the centered RMSE is given by the distance from the reference point (the observed data lie on the horizontal axis). The forecasted ARMA (10.0.0) and ARIMA (2.1.0) models revealed the highest RMSE and the lowest R², varying between 16.6421 and 15.6709 and between 0.9472 and 0.9628, respectively. These models are less accurate than the forecasted FFBP (12.2.1) model, which revealed the optimum RMSE (0.5119%) and Sd (9.9852%). The combined hybrid ARMA-FFBP and hybrid ARIMA-FFBP models resulted in lower RMSE values than the single models.
The Taylor diagram for the Tangier site, shown in Figure 12, is generated from the five models, each of which comprises 16 candidate configurations. It appears from the figure that the forecasted FFBP (12.2.1), hybrid ARMA-FFBP, and hybrid ARIMA-FFBP models match the measured daily GSR well. The hybrid ARIMA-FFBP model had the highest value of R² and the lowest value of RMSE compared with the previous models. In this case study, the Taylor correlation increased by about 10% to 15% compared with the Tetouan site.
The regression plots of the daily GSR forecasted by the five most appropriate models for the Tetouan and Tangier sites are given in Figure 13A,B. As seen from the figures, the errors between the forecasted daily GSR and the measured data show a wide dispersion for the ARMA (10.0.0) and ARIMA (2.1.0) models at the Tetouan site and for the ARMA (16.0.0) and ARIMA (2.2.0) models at the Tangier site. The dispersion of the FFBP (12.2.1) model is smaller than that of these two models, and the dispersion of the combined models is smaller still. The agreement between the forecasted and measured data is best for the hybrid ARIMA-FFBP method. It is observed that all data sets are closely fitted to the corresponding line, which confirms that the hybrid ARIMA-FFBP is more accurate than the other methods. In addition, the correlation coefficient (R²) values of the best model for the Tetouan and Tangier sites are close to 1, which indicates a good relationship between the forecasted and measured data.
Table 6 gives the numerical values of the computed statistical metrics for the adopted methodologies, which are used to select the best model, that is, the one with the optimal values. The correlation coefficient (R²) of the forecasted ARIMA, ARMA, FFBP, and hybrid models varies between 0.9472 and 0.9931 depending on the study location and the trained method. The slope of the best-fit line (SPE) varies between 0.8435 and 0.9296. Legates's coefficient of efficiency (LCE) ranges between 0.8954 and 0.9696, and Willmott's index of agreement (WIA) ranges between 0.9491 and 0.9945. These results show that the hybrid ARIMA-FFBP is more reliable in forecasting the daily GSR for the Tetouan and Tangier sites. In this context, the obtained performance is compared and discussed with Table 6 as the reference. Further, compared with the single and combined models, the hybrid ARIMA-FFBP model produced the highest correlation coefficients, 0.9901 for Tetouan city and 0.9831 for Tangier city. In addition, the values of MBE (%), RMSE (%), Sd (%), Akaike information criterion (AIC), and Bayesian information criterion (BIC) for both cities are 0.0297 (%), 0.02101 (%), 9.6917 (%), 9.06742 (%), 8.67911 (%), 6.87613 (%), 792.8625, 765.091 and 756.3418, 504.816, respectively. Overall, the results indicate that the hybrid ARIMA-FFBP model is more accurate and suitable than the other methods for predicting the daily global solar radiation at any location with the same weather conditions.
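For readers who want to reproduce this kind of comparison, the Python sketch below computes four of the metrics named above from a forecast and the corresponding measurements. The exact definitions used for Table 6 are not given in this section, so the formulas are assumptions: MBE as the mean of (forecast − measured), RMSE as the root of the mean squared error, WIA in Willmott's original form, and LCE in the absolute-value form of Legates and McCabe. The function name and the toy values are also illustrative.

```python
import math


def forecast_metrics(forecast, measured):
    """MBE, RMSE, Willmott's index of agreement, and Legates's coefficient of efficiency."""
    n = len(measured)
    mean_m = sum(measured) / n
    errors = [f - m for f, m in zip(forecast, measured)]
    mbe = sum(errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # WIA = 1 - sum (P - O)^2 / sum (|P - mean(O)| + |O - mean(O)|)^2
    potential = sum((abs(f - mean_m) + abs(m - mean_m)) ** 2
                    for f, m in zip(forecast, measured))
    wia = 1.0 - sum(e * e for e in errors) / potential
    # LCE = 1 - sum |O - P| / sum |O - mean(O)|
    lce = 1.0 - sum(abs(e) for e in errors) / sum(abs(m - mean_m) for m in measured)
    return {"MBE": mbe, "RMSE": rmse, "WIA": wia, "LCE": lce}


# Toy values (made up): every forecast is 1 unit too high.
measured = [4.0, 5.0, 6.0]
biased = forecast_metrics([5.0, 6.0, 7.0], measured)
perfect = forecast_metrics(measured, measured)
print(biased)   # MBE = 1.0, RMSE = 1.0, WIA = 1 - 3/11, LCE = -0.5
print(perfect)  # MBE = 0.0, RMSE = 0.0, WIA = 1.0, LCE = 1.0
```

A perfect forecast gives WIA = LCE = 1, while a constant bias lowers both, which is why values close to 1 are read as good agreement.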
In several investigations, hybrid ARIMA-FFBP and hybrid ARMA-FFBP models were compared with deep learning models for solar radiation. One study, for example, compared artificial intelligence (AI) methods for solar radiation forecasting or estimation, including empirical, statistical, physical, and machine learning models [25]. Another study introduced a new hybrid strategy based on deep learning approaches for global solar radiation (GSR) prediction problems [48]. In addition, one study constructed and analyzed two innovative hybrid neural network models for solar irradiance forecasting [49], and another examined the effects of various classic long short-term memory (LSTM) models on hour-ahead solar irradiance forecasting [50]. Finally, a study demonstrated that the choice of input parameters in this hybrid model for daily GSR forecasting improves the performance accuracy compared with the previous models.