Global Solar Radiation Forecasting Based on Hybrid Model with Combinations of Meteorological Parameters: Morocco Case Study

: The adequate modeling and estimation of solar radiation plays a vital role in designing solar energy applications. In fact, unnecessary environmental changes result in several problems with the components of solar photovoltaic and affects the energy generation network. Various computational algorithms have been developed over the past decades to improve the efﬁciency of predicting solar radiation with various input characteristics. This research provides ﬁve approaches for forecasting daily global solar radiation (GSR) in two Moroccan cities, Tetouan and Tangier. In this regard, autoregressive integrated moving average (ARIMA), autoregressive moving average (ARMA), feed forward back propagation neural networks (FFBP), hybrid ARIMA-FFBP, and hybrid ARMA-FFBP were selected to compare and forecast the daily global solar radiation with different combinations of meteorological parameters. In addition, the performance in three approaches has been calculated in terms of the statistical metric correlation coefﬁcient (R 2 ), root means square error (RMSE), stand deviation ( σ ), the slope of best ﬁt (SBF), legate’s coefﬁcient of efﬁciency (LCE), and Wilmott’s index of agreement (WIA). The best model is selected by using the computed statistical metric, which is present, and the optimal value. The R 2 of the forecasted ARIMA, ARMA, FFBP, hybrid ARIMA-FFBP, and ARMA-FFBP models is varying between 0.9472% and 0.9931%. The range value of SPE is varying between 0.8435 and 0.9296. The range value of LCE is 0.8954 and 0.9696 and the range value of WIA is 0.9491 and 0.9945. The outcomes show that the hybrid ARIMA–FFBP and hybrid ARMA–FFBP techniques are more effective than other approaches due to the improved correlation coefﬁcient (R 2 ).


Introduction
The global consideration of renewable energy sources such as solar energy, wind, hydro, and biomass has remarkably increased in terms of sustainable energy due to the reduction of fossil fuels. In this regard, the implementation of solar energy sources has widely focused on the use of photovoltaic (PV) systems, thermal solar energy, and concentrated solar energy [1]. In particular, the emergence of renewable energy has enhanced solar radiation forecasting in the activity and management of the modern smart grid, with renewable energy generation [2,3].
Solar radiation forecasting has become crucial to accurately predict the efficiency of solar energy conversion systems and ensure the electrical grid's reliability and safety. between 0.9860% and 0.9920%, with the range value of MBE (%) being from −0.1076% to −0.5931%, the RMSE between 0.1990 and 0.4580%, the range value of the NRMSE is between 0.0355 and 0.8938, and the lowest value of the MAPE is between 0.0019 and 0.0060%. This technique could be used to predict other parameters for locations where measurement instrumentation is unavailable or costly to obtain. In the meantime, the authors of [16] presented a comparative optimization of daily global solar radiation forecasting with different machine learning and time series methods. The selected methods are compared with the persistence technique and measured data. Several statistical metrics are assessed to obtain the most appropriate method, which presents the lowest value accuracy. The select result is, respectively, the RMSE (%) and MBE (%) values of several models employed in this study, and were computed to be mostly positive. The range value of the selected model measured by RMSE (%) and MBE (%) varied between 4.64% to 8.87% and 6% to 22.93%. Based on all statistical metrics, the lower value of the selected model corresponds to the neural FFBP (6 × 10 × 1) in comparison with the other models. The appropriate one performs well and is close to the measured data. The authors of [17] presented a complete and detailed synthesis of solar radiation modeling, forecasting, and solar radiation data using artificial intelligence methods: ANN, fuzzy logic, genetic algorithm, expert system, and a hybrid method. It is proven that solar radiation is a vital factor in PV system performance and sizing. The same researchers have presented a combination of the above methods for generating horizontal global solar radiation by combining ANN and library of Markov transition metrics (MTM) approaches based on three-parameter coordinates (longitude, altitude, and latitude) and the data were collected from a data basis of 60 stations in Algeria over 9 years. This prediction is comparatively accurate related to the relative Root Mean Square Error (RMSE), which is less than 8.2%. On other hand, López et al. [18] have chosen the automatic relevance determination method (ARD) based on the ANN model to select the relevant input parameters in direct normal solar irradiance forecasting. Clearness index and relative air mass were considered the most important input parameters to the neural network. According to J. Lampinen et al. [19], Penny et al. [20], and B. Belmahdi et al. [8], the ARD and ANN are the priority distribution on the network weights and determine the most relevant input parameters by introducing a hyperparameter for each input unit of the ANN. In ANNs, the important prior distribution of the network weights is controlled by the hyperparameters. Here, the ANN is presented with training data, and posterior weight distribution and hyperparameters are calculated using the Bayes rules.
Recently, time series forecasting models have taken the attention of researchers in different fields for their power and ability to forecast complex systems [11]. Especially, the autoregressive integrated moving average (ARIMA) was utilized in the COVID-2019 pandemic [21], hydroelectricity [22], agriculture [23], and groundwater-based irrigation [24]. Generally, ARIMA models are an integration of autoregressive models (AR) and moving average models (MA), which have proven reasonable precision in the forecasting of stationary time-series information. Further, it is strongly assumed that the prospective data values are linearly dependent on present and past data values. However, several realworld time series data result from dynamic non-linear structures that cannot be accurately modeled by ARIMA. Consequently, artificial neural networks (ANNs) are considered the most commonly implemented algorithms for nonlinear time series modeling. In another word, the ANNs have a range of benefits with respect to ARIMA and other forecasting frameworks, which are capable of executing a dynamic non-linear function. Hence, the ANNs are capable of reconstructing every continuously measurable function with arbitrary desired accuracy [11]. Moreover, ANNs are flexibly data-driven in design, which ensures that ANN models could be modified with the characteristics of time series data information. Several studies have been carried out to compare the performance of machine learning (ML) and deep learning (DL) algorithms in forecasting solar radiation [25]. In some cases, ML algorithms have been found to be more accurate than DL algorithms [26], while DL methods have been found to be more accurate than ML algorithms in others [27]. In this context, Table 1 summarizes the difference between ARIMA and FFBP in solar radiation forecasting in various locations. Table 1. Selected studies on ARIMA and ANN modeling of solar radiation.

References
Simple and Combined Modeling for Short-Term and Long-Term Prediction of Solar Radiation [28] Seasonal ARIMA (0, 1, 2) (1, 0, 1) 30 was found to be a suitable model for predicting daily solar radiation at Reese Research Centre of Lubbock, Texas [29] ARIMA (1, 0, 0) was found reasonable in capturing the autocorrelative structures of the daily average of solar irradiance in Awali, Kingdom of Bahrain.
[8] Hybrid ARIMA-backed propagation does not outperform ARIMA for hourly solar irradiance from National Solar Radiation Database (NRSDB) site from 2008 to 2009.
[31] ARMA (2, 0) and ARMA (4, 0) were identified as appropriate models combined with ANN for the prediction of daily global solar radiation. [32] ARIMA (2, 1, 1) was developed for the prediction of the daily clearness index In Abu Dhabi. [33] Employed ARMA, which revealed that the residuals were best estimated by non-seasonal ARMA (2, 0) for daily solar radiation data over four locations in Malaysia. [34] Employed ANN-BP neural network and multilayered feed-forward neural network As a novel method for time series forecasting, hybrid ARIMA-FFBP and ARMA-FFBP models have been developed. FFBP is a sort of feed-forward neural network that employs log sigmoidal functions. In comparison to single models and other hybrid models like SARIMA-MLP, it has been observed that this hybrid technique increases accuracy and minimizes errors. These hybrid ARIMA-FFBP, and ARMA-FFBP models could be further deployed to forecast the future performance and energy capacity of solar energy systems. Here, both have accompanied a different combination of parameters for inputs to select the adequate model. In addition, the daily solar radiation data is collected from two cities called Tetouan and Tangier in northern Morocco. These locations are essential, as the solar energy intensities are utilized in the photovoltaic and thermal system and the region has the largest port in the Mediterranean area. Several statistical metrics are evaluated and calculated to validate the performance of the forecast values.
The rest of the paper is organized as follows. Section 2 contains collected data and the proposed methodology (forecasting method). Section 3 contains summarizes the statistical metric employed in this work. Section 4 contains the main finding and simulation results of the short-term forecasted daily global solar radiation. Lastly, the conclusion has been presented in Section 4.

Data Collection and Study Sites
The data was analyzed from the 1 January 2015 to the 31 December 2015, and conducted in a meteorological station installed on the rooftop of the Faculty of Science, Abdelmalek Essaadi Tetouan University ( Figure 1). This station consists of pyranometers in Kipp & Zonen brands, model CM-11, for the measurement of global solar radiation. Another identical pyranometer and the shadow ring is operated to measure the diffuse solar radiation. The shadow ring has a width of 7.6 cm, a radius of 31 cm, an anemometer equipped with a wind vane with a horizontal axis turbine for the measurement of the wind speed, and direction-along with a Campbell scientific thermo-hygrometer containing two probes-to measure the temperature and the relative humidity. In addition, the local station uses a rain gauge to measure precipitation. In Figure 2, we plot the daily range of the mean (Tmean), maximum (Tmax), a mum (Tmin), temperature. The three-temperature dataset describes the timescale tion of day-to-day temperature variations in the Tetouan and Tangier sites. Th temperature variation reveals the succession of warm and cold periods over the y analysis is conducted for each day. The recorded Tmax was observed in the July p °C for Tetouan and 36.6 °C for Tangier), while the Tmin was observed in the Januar (6 °C for Tetouan and 2 °C for Tangier). The distribution of the Tmin is close to condition.   In Figure 2, we plot the daily range of the mean (T mean ), maximum (T max ), and minimum (T min ), temperature. The three-temperature dataset describes the timescale distribution of day-to-day temperature variations in the Tetouan and Tangier sites. The annual temperature variation reveals the succession of warm and cold periods over the year. The analysis is conducted for each day. The recorded T max was observed in the July period (38 • C for Tetouan and 36.6 • C for Tangier), while the T min was observed in the January period (6 • C for Tetouan and 2 • C for Tangier). The distribution of the T min is close to normal condition. In Figure 2, we plot the daily range of the mean (Tmean), maximum (Tmax), and minimum (Tmin), temperature. The three-temperature dataset describes the timescale distribution of day-to-day temperature variations in the Tetouan and Tangier sites. The annual temperature variation reveals the succession of warm and cold periods over the year. The analysis is conducted for each day. The recorded Tmax was observed in the July period (38 °C for Tetouan and 36.6 °C for Tangier), while the Tmin was observed in the January period (6 °C for Tetouan and 2 °C for Tangier). The distribution of the Tmin is close to normal condition.  Figure 3A,B show a representative behavior of the daily global solar radiation from January to December 2015 of the Tetouan and Tangier sites. The total solar radiation over the year was 18.63 MJ/m² and 19.99 MJ/m² with a standard deviation of 1.98 kW.h/m² and 1.76 kW.h/m² for the Tetouan and Tangier sites, respectively. In the warm period, the lowest solar radiation deviation was 1.106 kWh/m² for the Tetouan site and 0.7375 kW.h/m² for the Tangier site. The highest solar radiation deviation in the cold period was 7.48 kW.h/m² and 7.413 kW.h/m for Tetouan and Tangier sites, respectively. As for month averages, the highest solar radiation months over the year were from May to July, while the lowest values were from December to January.  .99 MJ/m 2 with a standard deviation of 1.98 kW·h/m 2 and 1.76 kW·h/m 2 for the Tetouan and Tangier sites, respectively. In the warm period, the lowest solar radiation deviation was 1.106 kWh/m 2 for the Tetouan site and 0.7375 kW·h/m 2 for the Tangier site. The highest solar radiation deviation in the cold period was 7.48 kW·h/m 2 and 7.413 kW·h/m for Tetouan and Tangier sites, respectively. As for month averages, the highest solar radiation months over the year were from May to July, while the lowest values were from December to January. In Figure 4A,B shows the daily wind speed (Ws) and relative humidity (Rh) over year for the Tetouan and Tangier sites. The range value of wind speed varies by ab 13.02 m/s and 13.3 m/s, respectively, for the Tetouan and Tangier sites. In the cold per the highest relative humidity for the Tetouan and Tangier sites was 94.5% and 92.58% a warm period, the lowest relative humidity was 49.75% and 55.54%, respectively, for touan and Tangier sites. The manipulations may require other types of meteorological parameters suc ambient temperature (T, °C; maximum, minimum, average, difference, and ratio), number, day length, wind speed (Ws, m/s; average), relative humidity (RH,%), top o mosphere radiation (TOA radiation), and geographic coordinates (Latitude, Altitude, Longitude). These are exogenous data, whose periods and characteristics of meas ments are summarized in Table 2.  In Figure 4A,B shows the daily wind speed (Ws) and relative humidity (Rh) over the year for the Tetouan and Tangier sites. The range value of wind speed varies by about 13.02 m/s and 13.3 m/s, respectively, for the Tetouan and Tangier sites. In the cold period, the highest relative humidity for the Tetouan and Tangier sites was 94.5% and 92.58%. In a warm period, the lowest relative humidity was 49.75% and 55.54%, respectively, for Tetouan and Tangier sites. In Figure 4A,B shows the daily wind speed (Ws) and relative humidity (Rh) over the year for the Tetouan and Tangier sites. The range value of wind speed varies by about 13.02 m/s and 13.3 m/s, respectively, for the Tetouan and Tangier sites. In the cold period, the highest relative humidity for the Tetouan and Tangier sites was 94.5% and 92.58%. In a warm period, the lowest relative humidity was 49.75% and 55.54%, respectively, for Tetouan and Tangier sites. The manipulations may require other types of meteorological parameters such as ambient temperature (T, °C; maximum, minimum, average, difference, and ratio), day number, day length, wind speed (Ws, m/s; average), relative humidity (RH,%), top of atmosphere radiation (TOA radiation), and geographic coordinates (Latitude, Altitude, and Longitude). These are exogenous data, whose periods and characteristics of measurements are summarized in Table 2.  The manipulations may require other types of meteorological parameters such as ambient temperature (T, • C; maximum, minimum, average, difference, and ratio), day number, day length, wind speed (W s , m/s; average), relative humidity (RH,%), top of atmosphere radiation (TOA radiation), and geographic coordinates (Latitude, Altitude, and Longitude). These are exogenous data, whose periods and characteristics of measurements are summarized in Table 2.

ARIMA and ARMA Model
Generally, the ARIMA model could be recognized as the combination of several models named p, d, q, and AR (p) refers to the order of the autoregressive component. In fact, the I(d) integrated refers to the degree of differentiation involved, and MA(q) represents the order of the average movement of the components [35]. Moreover, the AR assumes that each value of the time series depends on the weighted production sum in the previous values and the most residual regression coefficient. An autoregressive model could be considered an autoregressive model of order (p) [36], which represents the previous (lagged) values of the dependent variable. MA is utilized to calculate the mean for a specified set of values and predict the following periods; MA (moving average) refers to the lagged error terms (i.e., residues) created by the model, which is represented by (q) [37]. The elements of the series could be affected by past errors (or a random shock), which are impossible to compute by the autoregressive component. The coupled AR(p) and MA (q) built an ARMA model which is commonly used in forecasting tacks [38]. The general ARIMA model p, d, q can be expressed as Equation (1): Or in a general form as Equation (2): where ϕ i refers to the i-th term autoregressive parameter, θ i refers to the ith term moving average parameter, c is the constant, e t is defined as an error at time t, B p refers to the p-th order backward shift operator, and X(t) is defined as a time series value at time t. The first step in ARIMA and ARMA construction is to identify the time series stationary data as presented in Figure 5, and then significant seasonality in the time series. The assumptions which were accounted for in the ARIMA model are that the time series should be stationary, the flow data must be separated to secure the sequence stationary, the data should vary continuously, and the data has consisted of a logarithmic transition to rendering the variance stable to accommodate the stationary variability. In addition, the two methods which were deployed in the pattern recognition process are the sequence of correlograms called the autocorrelation (ACF) function and the partial autocorrelation function [39][40][41]. The ACF and PACF are considered the most important elements of time series analysis and forecasting. In particular, the ACF determines the amount of the linear dependency around measurements in the time series, while the PACF enables the calculation of the number of autoregressive terms necessary to demonstrate the time lag characteristics. Moreover, the Akaike Information Criterion was used as a provided tool to select the optimal AIC, where the lowest AIC was identified as a parsimonious model [39,40,42].

Artificial Neural Network Model (FFBP)
Neural networks are approximately similar to the human brain's computing and create awareness of basic processing units called neurons. Neurons are divided into layers, and synaptic connections (weight) interconnect the neighboring layers. In particular, it is possible to distinguish three separate types of layers: the input layer, which connects the input information; the output layer, which generates the final output; and one or more hidden layers, which serve as intermediate computational layers between input and output [43]. In addition, the input values are measured by the first weights of the information exchange. Hence, the products are applied to the neuron with a particular parameter designated bias. This is utilized to extend the total number of the products over a reasonable range by becoming inputs for hidden layer nodes, which attach a nonlinear activation structure (a sigmoid unit) to the total above-generated hidden node. These signals are interpreted in the same manner through the corresponding hidden layers (if any) or the output layer, producing the output of the network, which is illustrated in Figure 6.
should be stationary, the flow data must be separated to secure the sequence stationary, the data should vary continuously, and the data has consisted of a logarithmic transition to rendering the variance stable to accommodate the stationary variability. In addition, the two methods which were deployed in the pattern recognition process are the sequence of correlograms called the autocorrelation (ACF) function and the partial autocorrelation function [39][40][41]. The ACF and PACF are considered the most important elements of time series analysis and forecasting. In particular, the ACF determines the amount of the linear dependency around measurements in the time series, while the PACF enables the calculation of the number of autoregressive terms necessary to demonstrate the time lag characteristics. Moreover, the Akaike Information Criterion was used as a provided tool to select the optimal AIC, where the lowest AIC was identified as a parsimonious model [39,40,42].

Artificial Neural Network Model (FFBP)
Neural networks are approximately similar to the human brain's computing and create awareness of basic processing units called neurons. Neurons are divided into layers and synaptic connections (weight) interconnect the neighboring layers. In particular, it is possible to distinguish three separate types of layers: the input layer, which connects the input information; the output layer, which generates the final output; and one or more hidden layers, which serve as intermediate computational layers between input and output [43]. In addition, the input values are measured by the first weights of the information exchange. Hence, the products are applied to the neuron with a particular parameter designated bias. This is utilized to extend the total number of the products over a reasonable range by becoming inputs for hidden layer nodes, which attach a non-linear activation structure (a sigmoid unit) to the total above-generated hidden node. These signals are interpreted in the same manner through the corresponding hidden layers (if any) or the output layer, producing the output of the network, which is illustrated in Figure 6. The connection between the output Y(t) and the input xt-1…. xt-4 could be defined in Equation (3): where the wj and wi represent the connection weights, and S1 and S2 are the activation function.
The most common activation function is the logistic sigmoid function, which is given by the Equation (4): The weights are the principal regulation parameters in every FFBP. The parameter evaluation methods could be considered as training, where optimal relation weights are calculated by minimizing the objective function.
In the following Equation (5), all input parameters are applied to train and validate the FFNN model, and the binomial coefficient has applied for selecting the appropriate input parameters: The connection between the output Y(t) and the input x t−1 . . . . x t−4 could be defined in Equation (3): where the w j and w i represent the connection weights, and S 1 and S 2 are the activation function.
The most common activation function is the logistic sigmoid function, which is given by the Equation (4): The weights are the principal regulation parameters in every FFBP. The parameter evaluation methods could be considered as training, where optimal relation weights are calculated by minimizing the objective function.
In the following Equation (5), all input parameters are applied to train and validate the FFNN model, and the binomial coefficient has applied for selecting the appropriate input parameters: where C is the number of all combinations, and m is the total number of inputs.

Hybrid Model
As already discussed in previous sections, the classification, the prediction, and the forecasting methodologies are the main aspect of the time series models analysis. Time series analysis forecasting is generally achieved through statistical techniques, such as linear and non-linear models. These models are commonly used since their flexibility and their variety of fixed pre-processing datasets.
The main limitation of ARIMA and ARMA methods is estimating the linear relationship through the pre-processing time series input data. The behavior of several input data contains linear and nonlinear patterns. In this context, ARIMA and ARMA models were fitted to simulate the linear pattern, and an artificial neural network with a backpropagation algorithm was used to simulate the nonlinear component. By taking into account the main advantages of the two previous techniques, a hybrid ARIMA-FFBP and ARMA-FFBP were proposed by [8] to simulate both linear and nonlinear patterns. A general mathematical expression of the combined model is as follows: where Ψ t is the linear pattern obtained by ARMA and ARIMA models and Φ t is the residual that can be estimated from the feed-forward with the backpropagation algorithm. The linear pattern is obtained by modeling the ARMA and ARIMA models. After that, we subtract the residual series (R t ) from the forecasted models: We create the FFBP architecture model by using the subtracting residuals series: The forecasted series obtained from ARIMA, ARMA, and FFBP models are aggregated to estimate the proposed time series:Ŷ whereŶ T = +i is the forecasted time series, and Ψ t and Φ t are the linear and nonlinear forecasting of the original series. The adopted methodology of this purpose study is divided into forth sections as depicted in Figure 7: -Data pre-processing in section one (grey color) involves the collection of meteorological, computational, astronomical, and geographical data. These parameters require many corrections of missing data and outlier removal. - The application of multiple combinations of several input parameters in order to select the appropriate architecture executed in section two (gold color) was accomplished by splitting data into two steps, which are training data (80%), testing, and validation (20%) data. - The step of the training (green color) was operating the proposed methodologies. The input parameters were tested by using time series model stationarity (Ljung-Box test). After that, the data stationarity was implemented for ARIMA and ARMA models. In the case of the ARIMA model, that involves the residual generated by the FFBP model, which built the combined ARIMA and FFBP models.
-The models were built and divided into simple (ARMA, ARIMA, and FFBP), and hybrid methods (hybrid ARMA-FFBP and hybrid ARIMA-FFBO models; orange color). - The obtained result (grey color) was evaluated and interpreted by using various statistical metrics in order to choose the best model, which presents the lowest value of MBE (%), RMSE (% Sd (%), AIC, and BIC and the highest values of R 2 , SBF, LCE, WIA. -Data pre-processing in section one (grey color) involves the collection of m ical, computational, astronomical, and geographical data. These paramet many corrections of missing data and outlier removal. - The application of multiple combinations of several input parameters in lect the appropriate architecture executed in section two (gold color) w plished by splitting data into two steps, which are training data (80%), t validation (20%) data. - The step of the training (green color) was operating the proposed methodo input parameters were tested by using time series model stationarity (

Model Selection
According to the fundamental issues observed by modern data, the scientist has introduced meaningful conclusions for a complicated system utilizing data from a single determined parameter. Meanwhile, the most effective treatment of global solar radiation data involves the deployment of neural networks, either with data from each measurement or with data from several successive measurements. The management is describable as Equation (10), where γ represents the neural network forecaster and i is the number of successive observations. Figure 8A-C above show that ACF and PACF can have on the auto-correlogram of the global solar radiation time series. Here the night hours have not considered during the simulations [44], and several measurements taken daily have been made.
According to the fundamental issues observed by modern data, the scientist has introduced meaningful conclusions for a complicated system utilizing data from a single determined parameter. Meanwhile, the most effective treatment of global solar radiation data involves the deployment of neural networks, either with data from each measurement or with data from several successive measurements. The management is describable as Equation (10), where γ represents the neural network forecaster and i is the number of successive observations. Figure 8A-C above show that ACF and PACF can have on the auto-correlogram of the global solar radiation time series. Here the night hours have not considered during the simulations [44], and several measurements taken daily have been made. The parameters of the ARIMA (p, d, q) and ARMA (p, q) models were determined by plotting The Autocorrelation Function (ACF) and the Partial Autocorrelation Function (PACF) for different lags (Lag). Figure 8A-D illustrate the daily global solar radiation data with a 95% confidence interval. According to the figure, the significant Lags in PACF are shown at Lag 7 for Tetouan city, and Lag 16 for Tangier city. Further in ACF, the Lag shows a geometric decrease at each Lag (i.e., 1, 2, 3, 4) and the order of the autoregressive term in the ARMA model for Tetouan and Tangier city is 10 and 16. Concerning the ARIMA (p, d, q) model, the order of the p and d terms is 2, 1 for Tetouan city, and 2, 2 for Tangier city. In addition, the use of the stationarity method (first and second differentiation for the ARIMA model) improves the prediction phenomenon, and the effectiveness of improvements in stationarity with the quality of the prediction would be discussed later in this study.
Tables 3 and 4 compare the FFBP with the ARIMA and ARMA models, and deliver an example for optimization of the ARIMA and ARMA models with the parameters estimated from the two study sites. It is further considered that the prediction of the two techniques is relatively simple and eliminated the moving average order q (q = 0). The parameters of the ARIMA (p, d, q) and ARMA (p, q) models were determined by plotting The Autocorrelation Function (ACF) and the Partial Autocorrelation Function (PACF) for different lags (Lag). Figure 8A-D illustrate the daily global solar radiation data with a 95% confidence interval. According to the figure, the significant Lags in PACF are shown at Lag 7 for Tetouan city, and Lag 16 for Tangier city. Further in ACF, the Lag shows a geometric decrease at each Lag (i.e., 1, 2, 3, 4) and the order of the autoregressive term in the ARMA model for Tetouan and Tangier city is 10 and 16. Concerning the ARIMA (p, d, q) model, the order of the p and d terms is 2, 1 for Tetouan city, and 2, 2 for Tangier city. In addition, the use of the stationarity method (first and second differentiation for the ARIMA model) improves the prediction phenomenon, and the effectiveness of improvements in stationarity with the quality of the prediction would be discussed later in this study.
Tables 3 and 4 compare the FFBP with the ARIMA and ARMA models, and deliver an example for optimization of the ARIMA and ARMA models with the parameters estimated from the two study sites. It is further considered that the prediction of the two techniques is relatively simple and eliminated the moving average order q (q = 0).
In the models, the measurement data set has divided into three parts: a learning set (80%), a validation set (10%), and a test set (10%).
The weight (wij) adjustment is performed by the training set, and the validation set assures whether the error is within a certain range (this set is not adjusted weights directly, instead giving the optimal number of hidden layers or determining a breakpoint for the backscatter algorithm). Finally, the accuracy of the model on the test data facilitates the prediction performance of the model.
The hidden layer could be suggested even for preliminary analysis or modeling. Further, the application of more than one hidden layer has significantly increased the number of measurable variables. However, the improvement in the number of variables would reduce the training phase without increasing network performance. In this analysis, one hidden layer has been implemented, and identifying the correct number of neurons in the hidden layer is critical for effective implementation since it greatly improves neural network efficiency. If the hidden layer has included insufficient neurons, the neural network output could be degrading. On the other side, if the hidden layer has accompanied by excess neurons, it would increase the parameter complexity and the training data set could be over-adjusted. Therefore, the most appropriate approach to determining the correct number of neurons in the hidden layer is through experiments, such as adopting a trialand-error procedure [45]. The analysis of the results below presents the daily global solar radiation prediction set using the FFBP model. The application of multiple architectures has been deployed to select the appropriate model with a low value of root mean square error (RMSE in%). In addition, the selection of input variables is carried out by the combination method. Table 5 shows the prediction result of different FFBP architectures with coefficients of variation for the two cities. The training and validation set along with the change in the number of neurons in the hidden layer are essential factors in selecting the appropriate architecture of the FFBP model. Specifically, the root means square error (RMSE) has functioned as a comparison tool to measure the accuracy of different selected models. Moreover, the low value of RMSE (%) and coefficient of variation indicate the efficiency of the FFBP model. According to the results, the appropriate architecture for models was chosen as twelve input parameters, two neurons in hidden layers, and one output (DGSR) for the two cities. The value of RMSE (%) and the coefficient of variation in Tetouan city are 0.489% and 0.426%, and the RMSE value (%) and the coefficient of variation for Tangier city are 0.406% and 0. 382%.

Performance Criterion
In this study, the important statistical error metric indexes mentioned in the literature were considered to evaluate the precision of the forecasting methods [7,46,47]. The performance of ARIMA and ANN models are validated in terms of root mean square error (RMSE), mean bias error (MBE), standard deviation (σ), and correlation coefficient (R 2 ).
The root means square error (RMSE) is a commonly employed calculation of the variations between the forecasted values by a model or an equation and the experimental values. Further, the RMSE is widely used to compare the actual deviation between the predicted and the measured value, modeling, and regression analysis (Equation (11)): The mean bias error measures the model's average bias and determines further measures to fix the model. The average bias in forecasting is represented by Equation (12): Equation (13) expresses the standard deviation, which is a function of the amount of attenuation in a set of values: In Equation (14), the R 2 defines the appropriate sequential match among measured and forecasted values and the writing: The slope of the best-fit line (SBF) defines the appropriate sequential match among measured and forecasted values and can be expressed by Equation (15): The Willmott's index of agreement (WIA) defines the appropriate sequential match among measured and forecasted values and can be expressed by Equation (16): The Legate's coefficient of efficiency (LCE) defines the appropriate sequential match among measured and forecasted values and can be expressed by Equation (17): (17) where t(x) and a(x) are the i-th forecasted and measured values of daily GSR, (a(x)) presents the average of measured values of daily GSR, and N is a number of experimental data.
Another criterion named the X2 Information Criterion (AIC of order two) and the X2 information criteria (BIC) are explained in Equations (18) and (19), which are computed for both ANN and ARIMA models: Forecasting 2023, 5 186 BIC = m ln(RMSE) + npar * ln(m) (19) where m is the number of input-output models and npar is the number of parameters to identify. Moreover, if the RMSE statistics should gradually improve with the more added parameters, the AIC and BIC statistics penalize the model with more parameters and therefore tend to give rise to more parsimonious models. In this case, the ratio of (m/npar) is less than 60, and the second-order AIC has been evaluated to measure the performance of the model.

Results and Discussion
This section deals with forecast ability via five techniques of the daily GSR in two different cities in Morocco. The first and second methods are purely autoregressive (ARMA and ARIMA), and the third approach is called FFBP, which has utilized a backpropagation algorithm and the combined methods. In order to assess the success of these techniques, numerous statistical metrics which are commonly used in the literature are discussed. The proposed methodologies are widely operated to forecast the daily global solar radiation (GSR) using the MATLAB environment. In Tables 3-5, the selected models could be defined as ARMA (10, 0, 0), ARMA (16, 0, 0), ARIMA (2, 1, 1), ARIMA (2, 2, 1), and FFBP (12, 2, 1), respectively, for Tetouan and Tangier cities. Figure 9A-D show the comparisons of the daily GSR forecasted by the proposed ARIMA, ARMA, FFBP, hybrid ARMA-FFBP, hybrid ARIMA-FFBP, and measured data of the Tetouan site. The Figures present the error forecasting (EF) of each model, which shows the agreement relationship between the forecasted methodologies and measured data. Considering Figure 4A and Table 6 together, it appears clearly that the R 2 , WIA, SBF, and LCE of the selected ARMA  Figure 9B shows the daily GSR forecasted by the ARIMA (2.1.0) model and error forecasting. In addition, the forecasted ARIMA (2.1.0) is compared with the measured daily GSR for the Tetouan site. Considering the performance accuracy result presented in Table 6, it can be seen that the all-statistical metric indicates that the ARIMA (2.1.0) model forecast well through the lowest value of the MBE (0.0839%), RMSE (16.6421%), and Sd (12.6704%). In term of R 2 , the ARIMA (2.1.0) is 0.9628, which presents successful forecasting accuracy compared with the ARMA (10.0.0) model. The considering error forecasting of the ARIMA (2.1.0) model was seen in observation number 161 (3.542 kW·h/m 2 ). Figure 9C shows the daily GSR and EF of both forecasted FFBP (12.2.1) and measured data. By comparing the two previous models and the FFBP (12.  Figure 9D shows the forecasted hybrid ARMA-FFBP model for the daily GSR, measured data, and error forecasting of the Tetouan site. Taking into account the advantage of a hybrid model, which can minimize the shortcomings of a single model, the shown hybrid ARMA-FFNB model is close to the measured data. In terms of R 2 , SBF, LCE, and WIA, the presented value of the hybrid model is 0.9890, 0.9148, 0.9580, and 0.9910, respectively. The R 2 value is close to one, which indicates the good agreement between forecasted hybrid ARMA-FFBP and measured data. The other statistical metric of the hybrid ARMA-FFBP shows the lowest values compared with the three previous models. The error forecasting of the proposed hybrid ARMA-FFBP model shows significant and lower values than other models. Among the three previous models, it was seen that the hybrid ARMA-FFBP is the most suitable model to forecast the daily GSR compared with ARMA (10.0.0), ARIMA (2.1.0), and FFBP (12.2.1) models. Figure 9E shows the forecasted daily GSR generated by the hybrid ARIMA-FFBP model, the measured data, and the error forecasting of the Tetouan site. Among all models, the hybrid ARIMA-FFBP is the most successful model, which is very close to the measured daily GSR. In terms of R 2 , the shown hybrid ARIMA-FFBP is approximately higher by about 0.41% on the hybrid ARMA-FFBP model, 0. 53% on the FFBP (12.     Figure 10A-D shows the comparisons of the daily GSR forecasted by the proposed ARIMA, ARMA, FFBP, hybrid ARMA-FFBP, hybrid ARIMA-FFBP, and measured data of the Tangier site. The Figures present the error forecasting (EF) of each model, which shows the agreement relationship between the forecasted methodologies and measured data. The study location presents satisfactory results in terms of statistical metrics in the same way as the previous one. The Tangier site is among the Mediterranean regions with good prediction results selected in this study. In this regard, it has been seen from the forecasted results and error depicted in Figure 5. Various observation numbers seen from the forecasted daily GSR have maximal and minimal error forecasting. The presented result led to increasing and decreasing the total forecast error of the selected study location. The reason why the Tangier    . In terms of MBE, the selected model is the only one that has the lowest value (0.0042 kW·h/m 2 ) compared to the other models. It can be concluded that the ARIMA (2.2.0) exceeds the ARMA (16.0.0) model and present a significant agreement between the forecasted daily GSR and measured data. Figure 10C shows the comparisons of the forecasted daily GSR generated by the FFBP (12.2.1) method and measured data for the Tangier site. The error forecast is depicted in the same figure, which presents the forecast improvement FFBP (12.2.1), and the performance accuracy between measured data. As seen from Table 6, in term of R 2 , the selected model rank third after the combined models. The lowest value of the FFBP (12.2.1) is by about 0.03092 kW·h/m 2 for MBE, 0.0517% for MBE (%), and 0.79265 (%) for Sd (%) respectively. The highest value of WIA is about 0.9891 and is nearly close to 1. As a result, the selected FFBP (12.2.1) exceeds the ARIMA (2.2.0) and ARMA (16.0.0) models and illustrates a successful forecast of the daily GSR. Figure 10D shows the daily GSR forecasted by the hybrid ARMA-FFBP model and error forecasting. In addition, the forecasted combined model is compared with the measured daily GSR for the Tangier site. Considering the depicted Figure 5 and Table 6 together, it appears clearly that the hybrid ARMA-FFBP presents the highest statistical metric indicator compared with the previous model for the selected study location. In addition, the combined ARMA-FFPB is close to FFBP (12.  Figure 10E shows the forecasted daily GSR implemented by combined ARIMA-FFBP and compared with those measured data for the Tangier site. Among the four previous models, the hybrid ARIMA-FFBP is the most successful model, which presents the lowest values of MBE (%), RMSE (%), Sd (%), AIC, and BIC, and the highest value of R 2 , SPE, LCE, and WIA. In terms of R 2 , the shown hybrid ARIMA-FFBP is approximately higher by about 1.57% on the ARMA (16.0.0) model, 3% on the ARIMA (2.2.0) model, 0.67% on the FFBP (12.2.1) model, and around 0.13% on the hybrid ARMA-FFBP (2.1.0) model. All computed statistical performance metrics showed a very close forecasting success compared with the FFBP (12.2.1) and hybrid ARMA-FFBP models. In terms of error forecasting, the proposed hybrid ARIMA-FFBP can be recognized as "the very most suitable model forecasting". Likewise, in the previous observation number, the presented error forecasting of the hybrid ARIMA-FFBP was seen to be very low.
The Taylor diagram has been utilized as a comparison tool revealing the accuracies of different selected models (forecasted and measured data). Moreover, this diagram combines the correlation coefficient (R 2 ), the root means square error (RMSE), and the standard deviation (Sd) in a polar (two-dimensional) diagram. The main objective of this illustration is to closely inspect the forecasted results and the measured data on a particular day. Figures 11 and 12 compare the performance of the most appropriate inputs, and graphs based on the statistical error metric. The figures illustrate the accuracies of the 16 relevant models, which have relatively lower errors in terms of standard deviation (Sd) and RMSE (value between 0.1 and 0.8 kWh/m 2 ). In addition, the, highest value of R 2 (99.31%) presents the accuracy relationship between the measured and predicted values. Figure 11 illustrates the Taylor diagram for the Tetouan site, in which statistics for the forecasted ARMA (10.0.0), ARIMA (2.1.0), FFBP (12.2.1), hybrid ARMA-FFBP, and hybrid ARIMA-FFBP models (each model contains 16 appropriate models to select the performed one) were computed. Each appropriate model appearing in the diagram quantifies how the forecasted models matched measured daily GSR. It is seen from the figure that the centered RMSE is related to the distance from the reference point (the horizontal axis is considered as observed data). As a result, the forecasted ARMA (10.0.0) and ARIMA (2.1.0) revealed the highest RMSE and the lowest value of R 2 , which is varying between 16.6421~15.6709 and 0.9472~0.9628, respectively. This model is less than the forecasted FFBP (12.2.1) model and revealed the optimum RMSE (0.5119%) and Sd (9.9852%). The forecasted combined hybrid ARMA-FFBP and hybrid-FFBP models resulted in the lowest value of RMSE than the simple models.
The Taylor diagram has been utilized as a comparison tool revealing the accuracies of different selected models (forecasted and measured data). Moreover, this diagram combines the correlation coefficient (R²), the root means square error (RMSE), and the standard deviation (Sd) in a polar (two-dimensional) diagram. The main objective of this illustration is to closely inspect the forecasted results and the measured data on a particular day. Figures 11 and 12 compare the performance of the most appropriate inputs, and graphs based on the statistical error metric. The figures illustrate the accuracies of the 16 relevant models, which have relatively lower errors in terms of standard deviation (Sd) and RMSE (value between 0.1 and 0.8 kWh/m²). In addition, the, highest value of R² (99.31%) presents the accuracy relationship between the measured and predicted values.     The Taylor diagram for the Tangier site shown in Figure 12 is generated by five models, which contain 16 appropriate models. It appears from the figure that the forecasted FFBP (12.2.1), hybrid ARMA-FFBP, and ARIMA-FFBP models proved the appropriate match with the measured daily GSR. The hybrid ARIMA-FFBP model had the highest values in terms of R 2 and the lowest value in terms of RMSE compared with previous models. In this case study, the Taylor correlation increased by about 10% to 15% compared with the Tetouan site.
The regression plot of the forecasted daily GSR generated by the most five appropriate models for the Tetouan and Tangier sites is given in Figure 13A,B. As seen from the figures, the error estimated between the forecasted daily GSR and measured data have a wide dispersion of the ARMA (10.0.0), ARIMA (2.1.0), ARMA (16.0.0), and ARIMA (2.2.0) models for Tetouan and Tangier site, respectively. The dispersion of the FFBP (12.2.1) model is smaller than the other two previous models. The dispersion of the combined models is smaller and less than the FFBP (12.2.1) model. The accuracy between the forecasted and measured data in the hybrid ARIMA-FFBP method is improved. It is observed that all data sets are correctly fitted to the corresponding line, which verifies the hybrid ARIMA-FFBP is more accurate compared to other methods. In addition, the correlation coefficient (R 2 ) values of the best model for the Tetouan and Tangier sites is close to 1, which explains the good relationship between the forecasted and measured data.   Table 6 gives the numerical values of the adopted methodologies by using the computed statistical metric in order to select the best model, which is at present the optimal value. The correlation coefficient (R²) of the forecasted ARIMA, ARMA, FFBP, and hybrid models varies between 0.9472% and 0.9931% depending on the study location and the trained methods. The range value of the slope of the best-fit line (SPE) varies between 0.8435 and 0.9296. The range value of the legate's coefficient of efficiency (LCE) is 0.8954, 0.9696 and the range value of Willmott's index of agreement (WIA) is 0.9491 and 0.9945. These results show that the hybrid ARIMA-FFBP is more reliable in the forecasting of the daily GSR for the Tetouan and Tangier sites. In this context, the obtained performance will be compared and discussed by considering Table 6 as the reference. Further, the results obtained from the hybrid ARIMA-FFBP model compared with single and combined models have exposed the highest correlation coefficient of 0.9901% for Tetouan city and 0.9831% for Tangier city. In addition, the values of MBE (%), RMSE (%), Sd (%), Akaike information criterion (AIC), and Bayesian information criterion (BIC) for both cities are 0.0297 (%), 0.02101 (%), 9.6917 (%), 9.06742 (%), 8.67911 (%), 6.87613 (%), 792.8625, 765.091 and 756.3418, 504.816, respectively. Eventually, the results have defined that the hybrid ARIMA-FFBP model is more accurate and suitable compared with the other methods to predict the daily global solar radiation for any location with the same weather conditions.
In several investigations, the hybrid ARIMA-FFBP and hybrid ARMA-FFBP models were compared to deep learning models for solar radiation. A study, for example, com-  Table 6 gives the numerical values of the adopted methodologies by using the computed statistical metric in order to select the best model, which is at present the optimal value. The correlation coefficient (R 2 ) of the forecasted ARIMA, ARMA, FFBP, and hybrid models varies between 0.9472% and 0.9931% depending on the study location and the trained methods. The range value of the slope of the best-fit line (SPE) varies between 0.8435 and 0.9296. The range value of the legate's coefficient of efficiency (LCE) is 0.8954, 0.9696 and the range value of Willmott's index of agreement (WIA) is 0.9491 and 0.9945. These results show that the hybrid ARIMA-FFBP is more reliable in the forecasting of the daily GSR for the Tetouan and Tangier sites. In this context, the obtained performance will be compared and discussed by considering Table 6 as the reference. Further, the results obtained from the hybrid ARIMA-FFBP model compared with single and combined models have exposed the highest correlation coefficient of 0.9901% for Tetouan city and 0.9831% for Tangier city. In addition, the values of MBE (%), RMSE (%), Sd (%), Akaike information criterion (AIC), and Bayesian information criterion (BIC) for both cities are 0.0297 (%), 0.02101 (%), 9.6917 (%), 9.06742 (%), 8.67911 (%), 6.87613 (%), 792.8625, 765.091 and 756.3418, 504.816, respectively. Eventually, the results have defined that the hybrid ARIMA-FFBP model is more accurate and suitable compared with the other methods to predict the daily global solar radiation for any location with the same weather conditions.
In several investigations, the hybrid ARIMA-FFBP and hybrid ARMA-FFBP models were compared to deep learning models for solar radiation. A study, for example, compared artificial intelligence (AI) methods for solar radiation forecast or estimation, including empirical, statistical, physical, and machine learning models [25]. Another study introduced a new hybrid strategy based on deep learning approaches for Global Solar Radiation (GSR) prediction problems [48]. In addition, one study constructed and analyzed two innovative hybrid neural network models for solar irradiance forecasting [49], and another examined the effects of various classic long short-term memory (LSTM) models on hour-ahead solar irradiance forecasting [50]. Finally, a study demonstrated that using input parameters in this hybrid model for daily GSR forecast proves the performance accuracy compared to the previous models.

Conclusions
This paper has evaluated and compared the forecasting of the daily global solar radiation in two cities by five approaches such as FFBP, ARIMA, ARMA, hybrid ARMA-FFBP, and hybrid ARIMA-FFBP, which were modeled in the MATLAB simulation platform. A different combination of input data was used to achieve our proposed methodologies in terms of statistical metrics. Further, the forecasting accuracy of the FFBP model in various applications and the analysis capability time series of the ARIMA and ARMA models were considered. Moreover, the solar radiation intensity, according to the climatic conditions and geographical coordinates in the two cities, was evaluated by their measured daily GSR dataset.
The hybrid ARIMA-FFBP model obtained the highest correlation coefficient of 0.9901% for Tetouan city and 0.9831% for Tangier city once compared to single and combined models. MBE (%), RMSE (%), Sd (%), Akaike information criterion (AIC), and Bayesian information criterion (BIC) values for both cities are 0.0297 (%), 0.02101 (%), 9.6917 (%), 9.06742 (%), 8.67911 (%), 6.87613 (%), 792.8625, 765.091, and 756.3418, 504.816, respectively. Furthermore, the results reveal that the hybrid ARIMA-FFBP model has been considered an appropriate forecasting tool to predict the daily global solar radiation, which exceeds the hybrid ARMA-FFBP and the other models. This method could be deployed further for modeling different areas in the country for long-term forecasting. Notably, the most appropriate model has corresponded to the lowest value of statistical performance indicators. For instance, the hybrid ARIMA-FFBP model reaches low values compared to the hybrid ARMA-FFBP and simple models. Ultimately, the results verify that the hybrid ARIMA-FFBP model is highly accurate in predicting the daily global solar radiation and could further be substituted for other locations under similar weather conditions in the future.