An Interpretable Machine Learning Model for Daily Global Solar Radiation Prediction

: Machine learning (ML) models are commonly used in solar modeling due to their high predictive accuracy. However, the predictions of these models are difﬁcult to explain and trust. This paper aims to demonstrate the utility of two interpretation techniques to explain and improve the predictions of ML models. We compared ﬁrst the predictive performance of Light Gradient Boosting (LightGBM) with three benchmark models, including multilayer perceptron (MLP), multiple linear regression (MLR), and support-vector regression (SVR), for estimating the global solar radiation ( H ) in the city of Fez, Morocco. Then, the predictions of the most accurate model were explained by two model-agnostic explanation techniques: permutation feature importance (PFI) and Shapley additive explanations (SHAP). The results indicated that LightGBM (R 2 = 0.9377, RMSE = 0.4827 kWh/m 2 , MAE = 0.3614 kWh/m 2 ) provides similar predictive accuracy as SVR, and outperformed MLP and MLR in the testing stage. Both PFI and SHAP methods showed that extraterrestrial solar radiation ( H 0 ) and sunshine duration fraction ( SF ) are the two most important parameters that affect H estimation. Moreover, the SHAP method established how each feature inﬂuences the LightGBM estimations. The predictive accuracy of the LightGBM model was further improved slightly after re-examination of features, where the model combining H 0 , SF , and RH was better than the model with all features.


Introduction
Renewable energy transition will enormously benefit African countries by creating employment opportunities, protecting the environment, and promoting energy security [1]. Morocco is regarded as one of the leading African countries in renewable energy, thanks to its policies encouraging investments in renewable energies. These sources of energy are expected to generate 52% of the country's electricity by 2030 [2]. Moreover, the Moroccan government has established a new climate strategy by ratifying the Paris Agreement, and holding the United Nations Conference of Parties (COP22) in Marrakesh in 2016 [3].
Solar energy is a sustainable energy source used widely for a variety of applications, including electricity generation, water pumping, air or water heating, and water desalination [4,5]. Global solar radiation information is critical for such applications. The most precise method of acquiring this information consists of using a radiometric measurement station running continuously for an extended period. However, in most countries, Tree-based models are ML techniques that use decision trees as a base model [19]. These methods have been successfully applied in many solar radiation related studies. Benouna et al. [28] compared 3 tree-based models (boosted trees, bagged trees, and random forest (RF)), 22 empirical models, and an MLP model, for estimating H in five locations in Morocco. Their results revealed the superiority of the (RF) model in terms of r, normalized mean absolute error (nMAE), and (nRMSE) that were in the range of 0.8753-0.9620, 5.84-11.81%, and 7.85-15.33%, respectively. Fan et al. [21] compared the XGBoost model with the SVR technique for predicting daily H in humid subtropical China. The authors reported that XGBoost (R 2 = 0.7530, RMSE = 0.9238 kWh/m 2 , MAE = 0.6925 kWh/m 2 ) exhibited a similar performance to SVR while outperforming the empirical models.
Fuzzy logic techniques take into account the uncertainty associated with weather conditions to estimate H [29]. Boata et al. [30] proposed a functional fuzzy approach to forecast daily global solar radiation at 12 European stations. The results of RMSE and MAE were between 1.04-1.69 kWh/m 2 and 0.66-1.16 kWh/m 2. , respectively. Rizwan et al. [29] used a fuzzy logic approach to estimate monthly mean H in four Indian stations. They concluded that the proposed model provided a good predictive performance compared to a clear sky and MLP models, with an overall mean absolute percentage error (MAPE) of 5% across all stations.
In 2017, Ke et al. [31] developed a novel tree-based ensemble method named Light-GBM, which is a new variant of gradient boosting with a faster training time and higher prediction capability. This model has been successfully applied in many fields to predict the energy yield of photovoltaic systems [32], protein-ATP binding residues [33], peer-to-peer network loan default [34], and reference evapotranspiration [35]. However, LightGBM was rarely used in global solar radiation estimation. To the best of the authors' knowledge, only Park et al. [36] used the LightGBM algorithm to predict multistep-ahead solar radiation, using data from two regions in South Korea. The findings of this research indicated that the LightGBM algorithm was more efficient than the tree-based ensemble and deep learning methods. A list of representative literature related to the comparison ML models for H prediction is depicted in Table 1.   [21] 4 Sites (India) Fuzzy, clear sky, MLP MLP MAPE = 4.81% [29] As the reviewed literature shows, ML models are powerful tools for global solar radiation estimation. However, most of them are considered black-box models. This means that the user will have difficulty comprehending the internal logic of these models [37] To overcome these limitations, many strategies for interpretable ML have been recently developed, including partial dependence plot (PDP), local interpretable model-agnostic Explanations (LIME), accumulated local effects (ALE), permutation feature importance (PFI), and Shapley additive explanations (SHAP) [38]. These methods can be used to explain model predictions, extract knowledge, and enhance predictive ability [39]. The explanation techniques can be classified into model-agnostic and model-specific. Model-agnostic can be applied to any ML algorithm (e.g., PFI, PDP, SHAP), while model-specific is limited to specific model classes [37]. For instance, the interpretation of regression weights in a linear model is model-specific, and does not apply to any other model [40]. Alternatively, these techniques might be classified by whether they produce global or local interpretations. Global interpretation refers to understanding the overall relationship between features and the target based on the entire model (e.g., PFI), whereas local interpretations focus on explaining the prediction of a single or a subset of instances (e.g., LIME) [38].
In solar modeling, the most popular approach for explaining predictive ML algorithms is feature importance. This strategy aims at identifying the most relevant features for global solar radiation estimation. Alsina et al. [41] used the automatic relevance determination method (ARD) to identify the most significant attributes for an ANN model developed to predict the monthly solar radiation in Italy. The group achieved the best results (mean absolute percentage error (MAPE) equal to 1.67%, an nRMSE of 1.01%, and a mean percentage bias error (MPBE) of 0.03%) with seven inputs, namely top of atmosphere radiation, day length, number of rainy days, rainfall, latitude, period of time, and altitude. Shamshirband et al. [42] conducted a sensitivity analysis for an extreme learning machine (ELM) algorithm to find the most influential input on H estimation. The results indicated that the most critical single input parameter is the relative sunshine duration, and that the optimal combination of two inputs is the sunshine duration and the difference between the maximum and minimum temperatures. Rohani et al. [43] showed, using the GPR model, that sunshine fraction duration, mean temperature, relative humidity, and extraterrestrial radiation are the most important features for daily and monthly H prediction in Mashhad, Iran. Using the random forest (RF) model, Zeng et al. [44] demonstrated that daily sunshine duration, daily maximum land surface temperature, and day of the year are the most effective input variables for H estimating across China.
PFI is a new global model-agnostic explanation technique that was recently used to identify the most relevant features in many fields, such as medicine [45], agriculture [46], and engineering [47]. Similarly, the SHAP method has been applied successfully to interpret local and global ML predictions in several studies in order to predict the risk of water erosion [48], estimate pairwise acquisition [49], investigate the factors that contribute to freight truck-related crashes [50], estimate the occurrence of benthic macroinvertebrate species [51], and predict the fuel properties of the chars [52].
As this brief review indicates, the application of LightGBM for global solar radiation prediction remains limited. In addition, PFI and SHAP methods have not yet been applied in H modeling to the knowledge of the authors. For this reason, this study aims: (1) To compare the performances of the newly LightGBM model to three benchmark machine learning algorithms, namely MLP, MLR, and SVR. (2) To explain the predictions of the best algorithm with PFI and SHAP techniques by quantifying the relevance of inputs, elucidating their impacts on each individual estimation, and highlighting their interaction. (3) To evaluate the efficacy of the two explanation techniques by feature re-examination of the most accurate model.
The remaining of this paper is structured as follows: Section 2 presents an overview of the different models and techniques used in this paper. It also describes the study area, data collection, partitioning process, and statistical indicators. The results are presented and discussed in Section 3. Finally, Section 4 provides the main conclusions. SVRs are kernel ML techniques based on statistical learning theory and the structural risk minimization principle. The fundamental idea behind SVRs is to convert the non-linear relationship between features and the target in the original space into a linear regression in a new higher dimensional space, using a process called the kernel trick [53]. More details about the SVR model can be found in [54]. The SVR model includes hyperparameters that can be tuned to reduce model overfitting and improve prediction efficiency. The hyperparameters considered in this study are (1) (C): the regularization parameter, and (2) (epsilon): the width of the tube around the estimated function.

Multilayer Perceptron (MLP)
MLP models are a subclass of feedforward ANNs that consist of an input layer, an output layer, and one or several hidden layers. The model first propagates the signal forward from the input layer to the hidden layer and finally to the output layer. Further, the error signal is backpropagated to the input layer. A learning algorithm adjusts the network's weights and bias until the error reaches an acceptable level. More details about this model can be found in [55]. For an MLP model with one hidden layer, the optimized hyperparameters include (1) (activation): the activation function for the hidden layer, (2) (solver): the learning algorithm, and (3) (hidden_layers_sizes): the number of neurons in the hidden layer.

Multiple Linear Regression (MLR)
MLR model is based on a linear relationship between the features, x i , and the output variable, Y, given as [56]: where β 0 is the intercept, β 1 , β 2 . . . β M are regression coefficients, and M is the number of features.

Light Gradient Boosting (LightGBM)
LightGBM is a variant of gradient boosting proposed by Ke et al. [31] in 2017. Gradient boosting refers to an ensemble model based on a decision tree as a weak learner. The predictive ability and the computational cost of this algorithm deteriorate when a large amount of data is available, or the attribute dimension is high. The LightGBM model can overcome these limitations by using gradient-based one-side sampling (GOSS) and exclusive feature bundling (EFB) techniques [57]. Furthermore, the LightGBM model grows its trees using the leaf-wise strategy, rather than the level-wise tree technique. This strategy grows the trees vertically, whereas other algorithms grow them horizontally. Figure 1 illustrates the two tree growth strategies. lustrates the two tree growth strategies.
Due to its high sensitivity to overfitting, the LightGBM's hyperparameters should be optimized. The main hyperparameters of this model are (1) (num_leaves): the number of leaves per tree, (2) (learning_rate): the parameter that controls the speed of iteration, and (3) (max_depth): the maximum depth of the tree. More details about the LightGBM algorithm can be found in [31].   [58]. The basic idea is to permute the values of a variable i, and calculate how much the prediction error increases because of this permutation. The computation of the PFI score comprises the following four steps: (1) estimation of the original model error e orig , (2) permutation of the values of the predictor variable i , (3) calculation of the new error e perm , and (4) determination of the permutation feature importance score PFI = e perm − e orig [38]. The error used in this paper is the mean absolute error (MAE) defined by Equation (10).

Shapley Additive Explanations (SHAP)
SHAP is an explanation technique introduced by Lundberg and Lee in 2017, based on cooperative game theory [59]. It calculates the individual contribution of each feature using Shapley values [60], where the Shapley value corresponds to the average marginal impact of a feature value on the predictions over all feasible coalitions [59].
Let f the ML model that needs to be explained, g, is the explanatory model, and x and x denote the input variable and the simplified input, respectively. To interpret the output of the ML model, SHAP utilizes an additive feature attribution [61] as: where M is the number of attributes, ∅ i is the SHAP value of a feature i, and ∅ 0 represents the constant value when all input variables are missing. SHAP can be used for both local and global explanations. The local explanations are aggregated to generate the global explanation by averaging the absolute Shapley values per feature across the data. The SHAP method is also able to identify how each feature influences the estimations (positively or negatively), and to quantify the interaction between two variables, i and j [59,61]. The present study used daily global solar radiation (H), sunshine duration (N), average temperature (T), atmospheric pressure (P), relative humidity (RH), precipitation (P r ), and wind speed (v). These data were collected from a meteorological station in Fez (latitude 33 • 55 58 N, longitude 4 • 58 30 W, altitude 571.3 m) between 2016 and 2017. Figure 2 shows the location of the meteorological station. The daily extraterrestrial solar radiation (H 0 ) and the daily sunshine duration fraction (SF) were added to the database as a part of the feature engineering process. The daily extraterrestrial solar radiation on a horizontal surface H 0 . is computed as [62]: where G sc is the solar constant, assumed equal to 1367 W/m 2 , ϕ is the latitude, n day is the day number, δ is the declination angle, and ω s is sunset hour angle. used in this study, we detected five days with missing values, and four days with incorrect values. Figure 3 illustrates the triangular correlation heatmap showing the Pearson correlation coefficient r between two variables. We can see from this figure the existence of a strong positive linear correlation between and , and a moderate positive linear correlation between both features and , and . On the other hand, there is a weak negative linear association between and , a low negative linear relationship between and , and a moderate negative linear correlation between and . There is no linear relationship between and .
The sunshine duration fraction SF is expressed as: where N and N 0 represent measured and calculated sunshine duration, respectively. The theoretical sunshine duration is given as: The dataset contains some incorrect and missing values that must be removed. To ensure that the developed models are highly accurate, each observation must satisfy the following conditions: the daily clearness index (K t = H H 0 ) and SF should be in the ranges of 0.015 < K t < 1 and 0 ≤ SF ≤ 1 , respectively [25,62]. Among the 731 daily data used in this study, we detected five days with missing H values, and four days with SF incorrect values. Figure 3 illustrates the triangular correlation heatmap showing the Pearson correlation coefficient r between two variables. We can see from this figure the existence of a strong positive linear correlation between H 0 and H, and a moderate positive linear correlation between both features T and SF, and H. On the other hand, there is a weak negative linear association between P and H, a low negative linear relationship between P r and H, and a moderate negative linear correlation between RH and H. There is no linear relationship between v and H.

Data Preprocessing and Performance Criteria
To develop the ML models, the dataset was randomly divided into two subsamples: 60% for training, and 40% for testing. LightGBM does not require data normalization. However, the data used for MLP, MLR, and SVR must be pre-processed. The normalization formula is given as [63]: where X norm , X i , X i,min , X i,max represent the normalized value, the real value, the maximum, and the minimum values, respectively. The Bayesian optimization (BO) approach was used to find the optimal hyperparameters for SVR, MLP, and LightGBM machine learning models. After obtaining the hyperparameter values by running the BO algorithm 30 times, a final model was trained and tested. SVR, MLP, MLR, and PFI were developed using scikit-learn 0.24.2, LightGBM was developed using LightGBM 3.2.1.99, SHAP was implemented using SHAP, and the Bayesian optimization was implemented using scikit-optimize libraries in Python 3.8.9. Three statistical indicators were selected to evaluate the robustness of the developed models, namely root mean square error (RMSE), mean absolute error (MAE), and the coefficient of determination (R 2 ) defined as [64]: where n is the number of observations, H i,c denotes the calculated solar radiation, H i,m represents the measured solar radiation, and H m,avg is the mean of the measured solar radiation values. A model achieves high predictive accuracy when RMSE and MAE are close to 0 and R 2 is close to 1. The methodology used in this study is illustrated in Figure 4.

Results and Discussion
In the next section, we tuned the hyperparameters of the three algorithms, SVR, MLP, and LightGBM, using the Bayesian optimization approach. Then, we compared SVR, MLP, MLR, and LightGBM to choose the most accurate of them. Table 2 summarizes the optimal hyperparameters for the SVR, MLP, and LightGBM models. The values of the statistical indicators obtained by SVR, MLP, MLR, and LightGBM models during the training and testing stages are depicted in Table 3. According to this table, the performances obtained by LightGBM (R 2 = 0.9871, RMSE = 0.2229 kWh/m 2 , MAE = 0.1638 kWh/m 2 ) in the training phase were significantly better than those of MLR (R 2 = 0.9275, RMSE = 0.5290 kWh/m 2 , MAE = 0.3955 kWh/m 2 ), and superior than

Results and Discussion
In the next section, we tuned the hyperparameters of the three algorithms, SVR, MLP, and LightGBM, using the Bayesian optimization approach. Then, we compared SVR, MLP, MLR, and LightGBM to choose the most accurate of them.  Table 3. According to this  table, Figure 5 presents the scatter plots of the predicted daily global solar radiation values using the three ML models against the measured values during training and testing phases. As shown in this figure, the plotted data points in the training stage are generally located near the 1:1 line for MLP and LightGBM algorithms, while they are more scattered in the case of MLR. In the testing stage, the four techniques yielded more scattered estimates, particularly the MLR model.

Predictive Performance of the ML Models
Since LightGBM was the most accurate model, the analysis with PFI and SHAP approaches was restricted to this algorithm. Based on the results of these two techniques, a re-examination of the LightGBM model's features was conducted to enhance its predictive performance.

Feature Importance Using PFI
In this section, we used the PFI technique to determine the importance of each feature for H estimation, and to identify the most effective of them. Figure 6 represents the PFI scores of the seven investigated predictors. Among all features, H 0 and SF were the most important, with PFI scores of 4 and 2.39, respectively. These two variables are widely used in solar modeling, since H 0 and N 0 are deterministic parameters calculated using sun geometry equations, and sunshine duration data are available in most meteorological stations worldwide [65]. Traditionally, these two quantities represented the components of the linear or nonlinear Angström-Prescott models, where SF is used as an independent variable, and H 0 is part of the clearness index ratio [66]. The next three relevant parameters  used in solar modeling, since and are deterministic parameters calculated using sun geometry equations, and sunshine duration data are available in most meteorological stations worldwide [65]. Traditionally, these two quantities represented the components of the linear or nonlinear Angström-Prescott models, where is used as an independent variable, and is part of the clearness index ratio [66]. The next three relevant parameters , , and P had low PFI scores with values of 0.046, 0.019 and 0.01, respectively. The remaining attributes had insignificant PFI scores. Figure 6. PFI scores of the seven studied features. Figure 6. PFI scores of the seven studied features.

Local and Global Explanations Using SHAP
The PFI method offers global explanations by identifying the most important attributes. In contrast, the SHAP method can provide both local and global explanations. SHAP explains the outcomes of individual observations, quantifies the relative importance of features, and elucidates their influence on model prediction. Figure 7 shows the explanation generated by the SHAP method for an instance chosen randomly from the testing dataset. The base value (5.097) represents the mean model prediction over the testing dataset. Predictors that move the estimation higher than the base value (to the right) are in red, while those moving it lower are in blue.

Local and Global Explanations Using SHAP
The PFI method offers global explanations by identifying the most important attributes. In contrast, the SHAP method can provide both local and global explanations. SHAP explains the outcomes of individual observations, quantifies the relative importance of features, and elucidates their influence on model prediction. Figure 7 shows the explanation generated by the SHAP method for an instance chosen randomly from the testing dataset. The base value (5.097) represents the mean model prediction over the testing dataset. Predictors that move the estimation higher than the base value (to the right) are in red, while those moving it lower are in blue. Table 4 shows the feature values for this observation, as well as the contribution of features to the LightGBM model's output value of 7.20.
It can be seen from Figure 8 and Table 4 that demonstrated the highest contribution, whereas showed the lowest one. Moreover, , , , , and pushed the prediction to the right with values of 2.02, 0.05, 0.04, 0.12, and 0.03 respectively. On the other side, and pushed the prediction to the left by −0.09 and −0.07, respectively. As a result, the estimated output value was given as follows:  To generate a global interpretation of the LightGBM predictions, the local interpretations were aggregated by averaging the absolute Shapley values per attribute across the data. Figure 8 represents the SHAP feature importance plot which shows the global effect of each feature on estimation. The SHAP method confirmed that and are the most important features, and that is the least effective feature. This method also confirmed the order of importance of the five features , , , , and . However, it reversed the order of and . We used the SHAP summary plot (Figure 9) to determine the magnitude and direction of each attribute's impact at global and local scales. This plot contains many points, where the y-axis indicates the feature names in decreasing order of relevance, and the xaxis indicates the SHAP values for each input predictor. Red points correspond to higher values of the feature, while blue points to lower ones.
We can observe from this plot that and showed a high positive association with estimation. This is consistent with the scientific knowledge and the results of Sec-  It can be seen from Figure 8 and Table 4 that H 0 demonstrated the highest contribution, whereas v showed the lowest one. Moreover, H 0 , SF, RH, T, and v pushed the prediction to the right with values of 2.02, 0.05, 0.04, 0.12, and 0.03 respectively. On the other side, P and P r pushed the prediction to the left by −0.09 and −0.07, respectively. As a result, the estimated output value was given as follows: 5.97 + 2.02 + 0.05 + 0.04 + 0.12 − 0.09 − 0.07 + 0.03 ≈ 7.20 (12) was inversely related to prediction for the majority of examples, with higher values of this feature leading to lower SHAP values and vice versa. and had comparable influences on estimation, and exhibited a mixed pattern, where different values of these attributes are associated with both high and low impacts on the predictions. Insignificant positive SHAP values were recorded for low values of , and a small negative impact on prediction was produced when increased. The wind speed was the least relevant feature, with no significant effect on prediction.

Feature Dependency Analysis
To get more insight into the LightGBM's behavior when predicting , we used the SHAP partial dependence plots for the seven predictor variables ( Figure 10). Each plot illustrates how the values of a feature (x-axis) affect the prediction (y-axis) of each instance in the dataset [67]. The plots also include another feature that the selected variable interacts most with.
We can see from this figure that had the highest interaction with all other features, with the exception of which interacted strongly with . We can also see from Figure 10a,b the existence of a positive trend between the features and , and estimation. These two features had the highest interaction between them, and generated a positive impact on prediction when > 8.34 kWh/m 2 or > 0.64, and vice versa. To generate a global interpretation of the LightGBM predictions, the local interpretations were aggregated by averaging the absolute Shapley values per attribute across the data. Figure 8 represents the SHAP feature importance plot which shows the global effect of each feature on H estimation. The SHAP method confirmed that H 0 and SF are the most important features, and that v is the least effective feature. This method also confirmed the order of importance of the five features H 0 , SF, RH, P r , and v. However, it reversed the order of P and T.
We used the SHAP summary plot (Figure 9) to determine the magnitude and direction of each attribute's impact at global and local scales. This plot contains many points, where the y-axis indicates the feature names in decreasing order of relevance, and the x-axis indicates the SHAP values for each input predictor. Red points correspond to higher values of the feature, while blue points to lower ones. and : the solar radiation reaching the ground is the fraction of that is transmitted through the atmosphere, and is an indirect index of the sky's state, where higher values of correspond to clear days and lower values correspond to overcast days [65]. This figure also revealed that the highest values of had the maximum positive impact on prediction. On the other hand, the lowest values of had the highest negative impact.
was inversely related to prediction for the majority of examples, with higher values of this feature leading to lower SHAP values and vice versa. and had comparable influences on estimation, and exhibited a mixed pattern, where different values of these attributes are associated with both high and low impacts on the predictions. Insignificant positive SHAP values were recorded for low values of , and a small negative impact on prediction was produced when increased. The wind speed was the least relevant feature, with no significant effect on prediction.

Feature Dependency Analysis
To get more insight into the LightGBM's behavior when predicting , we used the SHAP partial dependence plots for the seven predictor variables (Figure 10). Each plot illustrates how the values of a feature (x-axis) affect the prediction (y-axis) of each instance in the dataset [67]. The plots also include another feature that the selected variable interacts most with.
We can see from this figure that had the highest interaction with all other features, with the exception of which interacted strongly with . We can also see from Figure 10a,b the existence of a positive trend between the features and , and estimation. These two features had the highest interaction between them, and generated a positive impact on prediction when > 8.34 kWh/m 2 or > 0.64, and vice versa. We can observe from this plot that H 0 and SF showed a high positive association with H estimation. This is consistent with the scientific knowledge and the results of Section 2.2.1, which indicate the existence of a direct correlation between these two quantities and H: the solar radiation reaching the ground is the fraction of H 0 that is transmitted through the atmosphere, and SF is an indirect index of the sky's state, where higher values of SF correspond to clear days and lower values correspond to overcast days [65]. This figure also revealed that the highest values of H 0 had the maximum positive impact on H prediction. On the other hand, the lowest values of SF had the highest negative impact.
RH was inversely related to H prediction for the majority of examples, with higher values of this feature leading to lower SHAP values and vice versa. P and T had comparable influences on H estimation, and exhibited a mixed pattern, where different values of these attributes are associated with both high and low impacts on the predictions. Insignificant positive SHAP values were recorded for low values of P r , and a small negative impact on H prediction was produced when P r increased. The wind speed was the least relevant feature, with no significant effect on H prediction.

Feature Dependency Analysis
To get more insight into the LightGBM's behavior when predicting H, we used the SHAP partial dependence plots for the seven predictor variables (Figure 10). Each plot illustrates how the values of a feature (x-axis) affect the prediction (y-axis) of each instance in the dataset [67]. The plots also include another feature that the selected variable interacts most with.   We can see from this figure that H 0 had the highest interaction with all other features, with the exception of P which interacted strongly with RH. We can also see from Figure 10a,b the existence of a positive trend between the features H 0 and SF, and H estimation. These two features had the highest interaction between them, and generated a positive impact on H prediction when H 0 > 8.34 kWh/m 2 or SF > 0.64, and vice versa. The feature interaction caused the spread of SHAP values for these attributes where a larger negative influence was produced when H 0 or SF was smaller, whereas the other predictor was larger, and a high positive impact was generated when these two features varied in the same direction. Figure 10c depicts the relationship between SHAP values and relative humidity. According to this figure, SHAP values were generally positive when RH < 57%, and became completely negative when RH exceeded 65.4%, with the maximum negative impact observed when RH is in the range 70-80% and H 0 > 8 kWh/m 2 .
The influences of P and T on LightGBM predictions are shown in Figure 10d,e. As can be seen from these figures, no obvious relationship exists between P and T and the SHAP values. On the other hand, when P < 943 hPa or 951 hPa < P < 957.2 hPa, a purely positive impact on H estimation was produced, while a purely negative influence was generated when P > 961 hPa. The average temperature had a purely positive impact on global solar radiation estimation when 13.4 • C < T < 19.8°C, and a mixed impact for the other values. Figure 10f indicates that, for low values of precipitation (P r ≈ 0), a small positive effect on global solar radiation estimation was produced; this effect peaked at medium values of H 0 . When P r was above zero, the effect became negative. Figure 10g shows that wind speed engendered a minor purely negative impact on H estimation when v < 2.33 m/s or v > 6.7 m/s. In contrast, a minor purely positive influence was recorded when 3.2 m/s < v < 4.9 m/s or 5.2 m/s < v < 6.4 m/s.

Feature Re-Examination of LigtGBM
The two techniques, PFI and SHAP, showed that the input variables had different impacts on H estimation, and that some of them were redundant. Besides, the SHAP method quantified the interaction between the features. To assess the effectiveness of these two explanation techniques, we compared the LightGBM model with all inputs to seven LightGBM models using the top three most relevant features. Table 5 shows the obtained results during the testing phase. According to this table, H 0 was the best single input, followed by SF, and then RH. This ranking agrees with the results obtained by PFI and SHAP methods. The model integrating H 0 and SF as predictors showed close results (R 2 = 0.9336, RMSE = 0.4984 kWh/m 2 , MAE = 0.3700 kWh/m 2 ) to the model with complete features, and performed significantly better than both models combining the two features (H 0 , RH) or (SF, RH). These results confirmed that H 0 and SF are the most interactive and the most important for H estimation. The model associating H 0 , SF, and RH (R 2 = 0.9382, RMSE = 0.4806 kWh/m 2 , MAE = 0.3602 kWh/m 2 ) slightly outperformed the model with all variables. This model achieved a reasonable prediction accuracy compared to the results reported in the literature (Table 1). For instance, the performances are comparable or better than those obtained in [21,22,[24][25][26]. On the other hand, they are worse than those reported in [23,27].
These findings proved the benefits of the interpretation techniques, particularly the SHAP method, for understanding the inner working of ML models and boosting their predictive capability. Nonetheless, some limitations to our study should be acknowledged, such as the small sample size of the dataset, and its restriction to one geographical location. Thus, further research should be conducted using more data collected over different locations.

Conclusions
In this paper, we started by comparing four ML models for predicting global solar radiation in Fez, Morocco. The results revealed that the LightGBM had comparable performances with SVR, and outperformed the MLP and MLR models. The LightGBM's predictions were then explained by two model-agnostic interpretation techniques. Both the PFI and SHAP methods showed that H 0 and SF are the most important features for estimating global solar radiation. Moreover, the SHAP model was able to highlight the effect of each attribute on H estimation, and to provide local explanations. The predictive ability of the LightGBM model was further slightly improved by feature re-examination based on the results of the two explanation techniques. The findings of this paper proved the utility of using interpretation strategies to explain and enhance the predictive performance of ML models. Additional assessment is required on the applicability of these interpretation approaches to explain the estimations of other predictive ML models, as well as the use of other interpretation strategies, such as local interpretable model-agnostic explanations (LIME), accumulated local effects (ALE), and partial dependence plots (PDP).