Development and Comparison of Two Novel Hybrid Neural Network Models for Hourly Solar Radiation Prediction

Abstract: Many developing countries have inadequate meteorological stations to measure solar radiation. This has been a major drawback for solar power applications in these countries, as the performance of solar-powered systems cannot be accurately forecasted. In this study, two novel hybrid neural networks, namely convolutional neural network/artificial neural network (CNN-ANN) and convolutional neural network/long short-term memory/artificial neural network (CNN-LSTM-ANN), have been developed for hourly global solar radiation prediction. ANN models are also developed, and the performance of the hybrid neural network models is compared with theirs. This study contributes to the search for more accurate solar radiation estimation methods. The hybrid neural network models are trained and tested with data from ten different countries across Africa. Results from this study indicate that the performance of all the hybrid models developed is superior to what has been presented in existing literature, with r values ranging from 0.9662 to 0.9930. The CNN-ANN model is the best for solar radiation forecasting in Southern, Central, and West Africa. CNN-LSTM-ANN is better for East Africa, while both CNN-ANN and CNN-LSTM-ANN are suitable for North Africa. The CNN-ANN application for solar radiation prediction in Chad had the overall best performance, with an r-value, MAE, RMSE, and MAPE of 0.9930, 15.70 W/m², 46.84 W/m², and 4.98%, respectively. The integration of CNN and LSTM algorithms with an ANN model enhanced long-term computational dependency and reduced the error terms of the model.


Introduction
The accurate forecasting of available renewable energy (RE) resources is evolving rapidly, as it is one of the steps towards maximizing RE potential. For instance, the accurate forecasting of photovoltaic (PV) system production makes the use of solar energy more reliable [1]. Hence, there is a need to develop models for system monitoring and renewable energy resource prediction. An innovative multi-layered architecture for heterogeneous automation and monitoring of a PV smart microgrid showed that the developed model can foster the digital transformation of power grids and empower real developments in microgrids [2]. Similarly, the implementation of various forecast approaches for PV in microgrid and multigood demonstrated that PV systems can be used to operate islanded microgrids in safe conditions [3]. The continuous and accurate measurement of solar radiation over a long-term period makes the conversion and utilization of solar power more efficient [4]. However, the measurement of solar radiation is inadequate or unavailable for many (African and developing) countries [5].
In Nigeria (the largest economy and the most populated country in Africa), there are only 54 stations for measuring solar irradiance instead of the required 9000 stations [6]. This is the reality of most African/developing countries. The greenhouse effect and depletion of fossil fuels have concentrated recent research attention on RE resources utilization thereby increasing the demand for solar power [7]. The accurate forecasting of solar PV power generation is important for the following reasons [8]: (a) Accurate solar irradiance prediction can improve solar power utilization, thereby reducing economic losses from electrical restrictions. (b) Solar PV power generation is random, volatile, and intermittent creating a reliability problem in the power grid. Hence, the accurate prediction of solar irradiation can increase solar PV integration and improve the reliability of the power grid. These reasons have made solar irradiance and solar power generation an important subject in the energy research field [9].
In recent years, forecasting techniques such as time-series model-based techniques [10], physical methods, ensemble methods [11], and mapping techniques have been used for different prediction purposes [12]. Furthermore, the use of random forest (RF), artificial neural network (ANN), convolutional neural network (CNN), recurrent neural network (RNN), and other deep learning models have been considered for different RE resources prediction as well as electricity load forecast [13]. A review of different studies on the direct prediction of PV power reported that ANN and SVM models perform well under rapid and varying environmental conditions [14]. A short-term PV power forecast with the GA-SVM hybrid model showed that hybrid models are more robust, accurate, and they require less memory [15]. The adaptative neuro-fuzzy approach was used by Olatomiwa et al. [16] for solar radiation prediction in Nigeria while Pang et al. [17] studied the use of RNN and ANN for solar radiation prediction in Alabama. Their results reflect that both ANN and RNN have good prediction accuracy for this purpose [17]. Furthermore, a comparative study of reliable ensemble learning-based models for solar prediction showed that ensemble models have a consistent and reliable prediction performance when applied to data from different locations [18].
Bendiek et al. [19] proposed the use of a data-driven algorithm and contextual optimization for the forecasting of solar irradiance. Their approach achieved a consistent performance for long and short-term predictions for all the cities considered in their study. Aljanad et al. [20] also used the neural network approach for the prediction of global solar irradiance using particle swarm optimization algorithm considering extremely short-time intervals. From their results, the three days performance profile for the model proposed in the study are 0.0292 of MSE, 0.7537 of MAE, 1.7078 of RMSE, and 31.4348 of MAPE (%) considering a 5 s time interval. Their model outperformed the existing standalone neural networks for solar irradiance prediction [20]. In another study, machine learning and deep learning models were compared for solar irradiance prediction and it was concluded that deep learning models are more viable for this specific forecasting task [21].
Other studies in existing literature [22][23][24] have worked on the forecast of solar radiation of which the use of ANN, RNN, and CNN models was proposed for different case studies. Owing to the under-development in Africa, models that will accurately predict solar radiation are required to encourage solar energy utilization in the continent. Africa as a continent has a high and well-distributed solar energy potential, however, this resource is underutilized and underdeveloped. The energy poverty and lack of access to electricity in many countries in Africa reflect the under-utilization of solar-powered systems. It is estimated that over 600 million Africans lack access to electricity [25].
Beyond Africa, the prediction of solar radiation has been studied in some literature [23], but there is still a need to develop more robust, accurate, and fast predictive models for solar radiation. The long short-term memory (LSTM) model retains long-term computational dependency thereby reducing the error term. Additionally, in comparison to other deep learning algorithms, the CNN algorithm is capable of excellently extracting nonlinear intrinsic features using a convolution process with pooling operations [26]. Consequently, if CNN models are applied in real energy or renewable energy resources forecasting, it will enhance the robustness of the energy design/resource for short-term or long-term forecasts which may be difficult to attain conventionally. To address the gaps in knowledge that advocate for a need to have versatile energy management devices that can boost the integration of solar energy (considering its variability in behavior), the novelty of this paper is to design a new hybrid deep learning forecast model based on the integration of LSTM, ANN, and CNN models.
Two new hybrid neural network models namely, convolutional neural network/artificial neural network (CNN-ANN) and convolution neural network/long short-term memory/artificial neural network (CNN-LSTM-ANN) are developed for accurate solar radiation prediction in Africa. The first model (CNN-ANN) hybridized three hidden layers of CNN to extract the nonlinear intrinsic features of the data. Then a flatten layer, and an ANN model (with three hidden layers) in addition to an input and an output layer are integrated. Model-2 (CNN-LSTM-ANN) hybridizes two layers of CNN, one layer of LSTM, and three layers of ANN in its hidden layer. Although hybrid models such as LSTM-ANN [27] and LSTM-CNN [28] have been used to predict solar radiation and solar PV power production, no study in literature considered the use of the hybrid models presented in this research. This work has been bench-marked against the ANN model as this model is fundamental to the design of the hybrid models. Additionally, in the extant literature, ANN models have been extensively used for solar radiation prediction, therefore, the performance of these hybrid models is compared with that of the developed ANN model. The specific objective of this study is:

- The development of two novel hybrid neural network models for solar radiation prediction.
- Integration of CNN with other models to enhance prediction robustness and accuracy.
- Development of solar radiation predictive hybrid models adaptable to different climatic conditions.
- Comparison of these hybrid models and ANN model performances.
The hybrid models are applied for solar radiation prediction in ten countries from the five geopolitical zones in Africa.
This study is important as the models developed will be instrumental in the calculation/estimation of solar-powered systems' performance and will subsequently increase solar energy utilization for electricity generation and other purposes, thereby significantly reducing the lack of access to electricity in Africa. The development of these models is justified in the subsequent section. The materials and methods (including the data preparation, case study, and model development) used in this study are reported comprehensively in Section 2. The results are discussed and compared with existing literature in Section 3, while the concluding remarks from the entire study are highlighted in Section 4.

Materials and Methods
Neural networks are a class of machine learning techniques that use connections of computational nodes, called neurons, to approximate essentially any linear or non-linear function [29]. In this study, two hybrid neural network models, namely CNN-ANN and CNN-LSTM-ANN, have been created to predict solar radiation in Africa. A flowchart of this research is presented in Figure 1. The performance of these models is also compared to that of an ANN model. In this section, the materials and methods used in building the neural networks are justified. A brief insight into the individual models (ANN, CNN, LSTM) that are hybridized is presented first. Then, the area studied and the data preparation process are briefly introduced. The model development and the metrics used in evaluating these models are justified in the subsequent subsections.

Artificial Neural Network (ANN)
ANN was inspired by the study of biological neural networks (animal brains). The concept was first proposed for solving different complex problems and the first model (McCulloch-Pitts neural model) was developed in 1943 [30]. Since then, hundreds of various ANN models have been developed and optimized for different applications such as data prediction, pattern recognition, image processing, optimization, controls, and associative memory [17]. In this study, ANN will be used for data analysis. ANN has received increased attention in recent years due to its power in data prediction. In recent studies [31,32], it has been used for solar radiation prediction, however, some studies argued that there is still room for improvement in the prediction accuracy of solar radiation data. The ANN architecture adopted in this study is illustrated in Section 2.6.

Convolutional Neural Network (CNN)
CNN models are more suitable for ingesting and processing data or images, as the input and hidden layers for the model consist of neuron layers that are arranged in different dimensions [33]. In this study, a one-dimension CNN model is adopted as the target data exist in the same dimension. The weights in each filter of a layer of CNN are connected to a small region of the layer as it undergoes the convolution process [29]. In literature [34], this model has also been used for solar radiation prediction. In most machine learning/deep learning frameworks such as Tensorflow, Keras, and Pytorch, a 1-D convolution layer is used to convolve layer input over a single spatial (or temporal) dimension such as the data used in our paper. In cases of data with two or three dimensions, the 2-D convolution is more suitable, and for high-dimensional independent data samples, the recurrent neural network is the most suitable.

Long Short-Term Memory (LSTM)
LSTM is a modified type of the recurrent neural network (RNN) used for sequence data processing. The core feature of the RNN model is highlighted with the word "recurrent" which means that the output of the network will remain together with the input of the next moment to determine the output of the next moment [35]. LSTM models have been applied for machine translation, speech recognition, and text generation. Similar in implementation to the neural network that updates parameters by backpropagation, LSTM also optimizes the model along the negative gradient direction. The gradient will gradually reduce and approach zero as the sequence accumulates thereby causing the gradient to disappear [24]. LSTM model has been used for solar radiation prediction [36] and in this study, one layer of the LSTM model is hybridized with a CNN and ANN model to ensure better performance and learning of long-term dependencies.

Area of Study
Africa is the second-most populous and second-largest continent in the world. It covers 6% of the earth's total surface (20% of total land area) with a total area of about 30.3 × 10⁶ km² [37]. Solar energy distribution in Africa is fairly uniform, and the global solar horizontal irradiance for a larger proportion (85%) of the landscape is over 2000 kWh/m²/year. The continent has a solar power generation potential of 1000 GW, which is largely untapped to date [38], and the theoretical estimated solar power production is 60 × 10⁶ TWh/yr [39]. Ten out of the 54 countries in Africa have been selected to test the hybrid neural networks developed in this study. These countries are from the five geopolitical zones in Africa (Nigeria and Ghana from West Africa; Algeria and Egypt from North Africa; South Africa and Namibia from Southern Africa; Ethiopia and Somalia from East Africa; the Central African Republic and Chad from Central Africa) and are highlighted with location tags in Figure 2. The case study details in terms of latitude, longitude, elevation, optimum azimuth, and optimum slope are summarized in Table 1.

Data Preparation
Data preparation involves various processes such as data collection, data division, and data normalization. In this study, the data used in training and testing the models have been obtained from the Photovoltaic Geographical Information System (PVGIS) website [40]. The accuracy of this data set has been confirmed in literature, as it has been used for other machine learning and deep learning tasks [41,42]. According to PVGIS, the following parameters are required to determine/predict PV performance and solar irradiance: ambient temperature, sun elevation, and wind speed at 10 m. These parameters, together with the solar irradiance, are represented as Tamb (°C), AS (°), W10 (m/s), and Gi (W/m²) in this study. While the ambient temperature affects the intensity of solar radiation, sun elevation is directly related to solar irradiance [43]. Additionally, PVGIS uses information about the elevation of the terrain with a resolution of 3 arc seconds (about 90 m). The hourly measurements of these parameters for twelve years (2005-2016) are used in the study. The data collected (105,192 rows) are divided into training and testing sets. A total of 841,536 data points is used in this study for each location. The data split is done in a 9:1 proportion such that 90% of the data is used for training the hybrid neural networks and the remaining 10% is used for testing the models.
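The row and data-point counts quoted above can be checked with a short calculation. The sketch below is only illustrative: it assumes that each hourly row holds eight values (the seven input columns Year, Month, Day, Hour, Tamb, AS, and W10, plus the target Gi) and that the 10% test share is rounded down, which is consistent with the 10,519 h test set quoted in the results.

```python
from datetime import datetime

# Hours between 2005-01-01 00:00 and 2017-01-01 00:00, i.e. twelve full
# years including the leap years 2008, 2012 and 2016.
start = datetime(2005, 1, 1)
end = datetime(2017, 1, 1)
n_rows = int((end - start).total_seconds() // 3600)

# Assumption: 7 input columns plus the target column Gi per hourly row.
n_columns = 8
n_points = n_rows * n_columns

# 9:1 split; the test share is rounded down (assumed rounding rule).
n_test = n_rows // 10
n_train = n_rows - n_test

print(n_rows, n_points, n_train, n_test)
```

Whether the split is chronological or shuffled is not stated in the text, so the sketch only counts rows and does not select them.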
Data normalization is applied to improve accuracy and speed up the convergence of gradient descent. If the data is not normalized, the network often encounters a learning problem because gradient descent becomes complex and does not converge swiftly. This is done in accordance with the machine learning literature. Since normalization mainly serves to standardize the range of the values, the range [0, 1] is used because it reduces computational complexity. In this study, each dimension of the dataset is normalized to values between 0 and 1 using Equations (1) and (2), which follow the min-max form

$$x_{n} = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$

where $x_{\min}$ and $x_{\max}$ represent the minimum and maximum values of variable $x$, respectively, and $x_{n}$ is the normalized value of variable $x$. The statistical summary of the data used in this study is tabulated in Table 2. This data will be insightful in evaluating the performance of all the models developed.

Typically, ANNs exist as organized layers that include interconnected input nodes, output nodes, and hidden layers (Figure 3). They are computing systems that were inspired by the biological/human neural network [44]. They are also a class of feedforward models that accept data into the dense input layer and output prediction results based on the number of neurons in the dense output layer. In the ANN model design, the input layer has 7 nodes representing the 7 input data columns (Year, Month, Day, Hour, Tamb, AS, and W10) that are required to determine/predict solar radiation, and the output layer has 1 node representing the target column (Gi). The network has 1 hidden layer with 2500 neurons, which is followed by an activation function to add nonlinearity to the layer's computation. In this study, the rectified linear unit (ReLU) is applied as the nonlinear activation function for the ANN models. Because this is a regression task, the model loss was calculated using the mean square error, and the minimum of the loss is sought via gradient descent with backpropagation. The Adam optimizer was implemented with a learning rate of 0.001 and a training batch size of 512. The model was carefully designed to learn the features of the data without underfitting or overfitting, which is checked by comparing the evaluation metric scores on the training data with those on the test data. The model overfitting problem is avoided by increasing the number of epochs gradually while checking the model performance. The ANN model was trained for a varying number of epochs depending on the input data. The number of epochs used in training each model is highlighted in Table 3.

Figure 3. ANN model architecture.
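The normalization and the single-hidden-layer ANN described above can be sketched in NumPy as follows. This is a minimal illustration rather than the authors' Keras implementation: the data are random stand-ins, the weights are randomly initialised instead of trained with Adam, and the linear output node is an assumption, since the text does not state the output activation.

```python
import numpy as np

def min_max_scale(x):
    """Scale each column of x to [0, 1] using its own minimum and maximum."""
    x_min = x.min(axis=0)
    x_max = x.max(axis=0)
    return (x - x_min) / (x_max - x_min)

def relu(z):
    """Rectified linear unit."""
    return np.maximum(z, 0.0)

rng = np.random.default_rng(0)

# Hypothetical stand-in data: one batch of 512 rows with 7 input columns
# and an already-normalised target column.
X = rng.uniform(low=-5.0, high=40.0, size=(512, 7))
y = rng.uniform(low=0.0, high=1.0, size=(512, 1))
X_scaled = min_max_scale(X)

# One hidden layer with 2500 neurons and one output node (untrained weights).
W1 = rng.normal(scale=0.01, size=(7, 2500))
b1 = np.zeros(2500)
W2 = rng.normal(scale=0.01, size=(2500, 1))
b2 = np.zeros(1)

hidden = relu(X_scaled @ W1 + b1)        # non-linear hidden activations
y_pred = hidden @ W2 + b2                # assumed linear output for regression
mse = float(np.mean((y_pred - y) ** 2))  # mean square error loss

n_params = W1.size + b1.size + W2.size + b2.size
print(X_scaled.min(), X_scaled.max(), y_pred.shape, n_params)
```

Under these assumptions the network has 7 × 2500 + 2500 weights and biases in the hidden layer and 2500 + 1 in the output layer.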

Hybrid CNN-ANN Architectural Design
The CNN-ANN network combines the feature extraction from both networks. CNN applies the kernel technique to update the filter weights, which helps it learn the feature representation of the training data. The model has a single CNN layer with 5 filters of 2 × 2 stride, which convolves the input data. The CNN model has 3 hidden layers with [32, 64, 32] neurons. The output of the CNN layer is then flattened so that it can be fed to the complementary ANN model. The ANN network has 3 hidden layers with [32, 100, 32] neurons and an output layer consisting of one node. Both models are trained as a single end-to-end network with the mean square error (MSE) as the loss function, which is backpropagated to compute the corresponding derivatives. The model was trained for different numbers of epochs (as seen in Table 3) using the Adam optimizer, a learning rate of 0.001, and a training batch size of 512. The model architecture is illustrated in Figure 4. The neurons in the hidden layers of this hybrid network can be summarized as follows: [32, 64, 32; ( ); 32, 100, 32].
Mathematically, each layer in the one-dimensional (1-D) convolutional neural network extracts patterns in $G_i$ as it relates to the other input variables using Equation (3) [45].

$$h^{k} = f\left(W^{k} * x + b^{k}\right) \tag{3}$$

where $W^{k}$ is the weight of the kernel connected to the $k$th feature map, $f$ is the activation function, $b^{k}$ is the bias, $x$ is the layer input, and the star ($*$) is the operator of the convolutional process. Equation (3) can be re-written as Equation (4), where $c$ is the output $h^{k}$.

$$c = f\left(W^{k} * x + b^{k}\right) \tag{4}$$

In the hybrid model, a flatten layer is used to convert the matrix to a single vector (Equation (5)) so that it is suitable as input to the ANN model.

$$Z = \mathrm{flatten}(c) \tag{5}$$

The output of the flatten layer ($Z$) serves as the input for the ANN model (Equation (6)).

$$\hat{y}(t) = F\left(\sum_{i} w_{i}(t)\, z_{i}(t) + c\right) \tag{6}$$

where $\hat{y}(t)$ is the forecasted $G_i$, $w_{i}(t)$ is the weight that connects neurons in the input layer, $z_{i}(t)$ represents the input variable in discrete time $t$, $c$ is the neuronal bias, and $F(\cdot)$ denotes the hidden transfer function.
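The convolution, flatten, and dense steps described above can be illustrated with a small NumPy sketch. It is not the authors' implementation: the kernel size of 2, the treatment of the seven input features as a length-7 sequence, the ReLU activation, and the identity transfer function at the output are all assumptions made for the example, and the weights are random rather than trained.

```python
import numpy as np

def conv1d(x, kernels, biases):
    """Valid 1-D convolution (cross-correlation form) followed by ReLU.

    x:       (length,) input sequence
    kernels: (n_filters, kernel_size) kernel weights
    biases:  (n_filters,) one bias per feature map
    returns: (length - kernel_size + 1, n_filters) feature maps
    """
    n_filters, k = kernels.shape
    out_len = x.shape[0] - k + 1
    out = np.empty((out_len, n_filters))
    for pos in range(out_len):
        window = x[pos:pos + k]
        out[pos] = kernels @ window + biases
    return np.maximum(out, 0.0)

rng = np.random.default_rng(1)
x = rng.uniform(size=7)             # one normalised sample with 7 input features
kernels = rng.normal(size=(5, 2))   # hypothetical: 5 filters, kernel size 2
biases = np.zeros(5)

c = conv1d(x, kernels, biases)      # feature maps from the convolution
Z = c.reshape(-1)                   # flatten layer: matrix to single vector
w = rng.normal(size=Z.shape[0])     # dense-layer weights (untrained)
bias = 0.1                          # neuronal bias
g_hat = float(w @ Z + bias)         # single-node forecast, identity transfer

print(c.shape, Z.shape, g_hat)
```

With these sizes the convolution produces a 6 × 5 matrix of feature maps, which the flatten step turns into a vector of 30 values for the dense layer.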

Hybrid CNN-LSTM-ANN Architectural Design
The triple hybrid model was designed to compare the efficiency of the model at extracting the necessary features of the data, with the component networks complementing each other to learn both short- and long-term dependencies. As seen in Figure 5, a recurrent neural network that runs in cycles is included in this hybrid model, making it highly adept at analyzing sequence data. In comparison to the CNN-ANN model, the gates of the integrated LSTM help the model retain necessary information from previous hidden states. The input data is fed to the 1-D CNN with 2 hidden layers of [32, 16] neurons, after which it is forwarded to the LSTM network with 32 hidden states and finally to the densely connected network, which produces the overall model predictions. The ANN model for this hybrid has 3 hidden layers with [25, 50, 25] neurons. Both the CNN and ANN components are structured in the same way as in the hybrid CNN-ANN design discussed in the preceding section. The summary of the neurons in the hidden layers of the hybrid model is [32, 16; 32; 25, 50, 25]. The basic calculations embedded in the LSTM are described in the following four steps [46].

1st Step: According to the hidden state $h_{t-1}$ and the new input $q_t$ from Equation (4), the LSTM model determines the information that will be thrown away by the "forget gate", as seen in Equation (7).

$$f_t = \sigma\left(W_f \cdot [h_{t-1}, q_t] + b_f\right) \tag{7}$$

where $W_f$ is the weight matrix, $\sigma(\cdot)$ is the logistic sigmoid function, and $b_f$ is the bias.

2nd Step: In this step, the information that will be stored in the cell state is decided. A new candidate cell ($\tilde{C}_t$) is also generated, and it is scaled by an "input gate" $i_t$.

$$\tilde{C}_t = \tanh\left(W_C \cdot [h_{t-1}, q_t] + b_C\right) \tag{8}$$

$$i_t = \sigma\left(W_i \cdot [h_{t-1}, q_t] + b_i\right) \tag{9}$$

$\tanh(\cdot)$ in Equation (8) is the hyperbolic tangent function.

3rd Step: The new cell $C_t$ is updated with the combination of the previous cell state $C_{t-1}$ and $\tilde{C}_t$. The former cell is scaled by $f_t$, and $\tilde{C}_t$ is scaled by $i_t$.

$$C_t = f_t * C_{t-1} + i_t * \tilde{C}_t \tag{10}$$

4th Step: In the final step, the output process is divided into two parts, and an "output gate" is built to decide which part of the cell state is outputted. $C_t$ activated by the tanh function is filtered by multiplication with $o_t$. The result of the multiplication is the desired output $h_t$.

$$o_t = \sigma\left(W_o \cdot [h_{t-1}, q_t] + b_o\right) \tag{11}$$

$$h_t = o_t * \tanh(C_t) \tag{12}$$

For this hybrid model, the flatten layer converts the matrix (Equation (12)) to a single vector.

$$Z = \mathrm{flatten}(h_t) \tag{13}$$

The output of the flatten layer ($Z$) serves as the input for the ANN model (Equation (6)).
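The four LSTM steps above can be traced in a few lines of NumPy. This is a sketch of a standard LSTM cell, not the authors' Keras layer: the 32 hidden states and the 16-value input (the width of the last CNN layer) come from the text, while the six time steps, the random weights, and the dictionary layout of `params` are hypothetical.

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid function."""
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(q_t, h_prev, c_prev, params):
    """One LSTM time step following the four steps in the text."""
    concat = np.concatenate([h_prev, q_t])
    # Step 1: forget gate decides what to discard from the previous cell.
    f_t = sigmoid(params["W_f"] @ concat + params["b_f"])
    # Step 2: input gate and candidate cell decide what to store.
    i_t = sigmoid(params["W_i"] @ concat + params["b_i"])
    c_tilde = np.tanh(params["W_c"] @ concat + params["b_c"])
    # Step 3: update the cell state.
    c_t = f_t * c_prev + i_t * c_tilde
    # Step 4: output gate filters the activated cell state.
    o_t = sigmoid(params["W_o"] @ concat + params["b_o"])
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t

rng = np.random.default_rng(2)
n_hidden, n_input = 32, 16   # 32 hidden states; 16 features from the CNN

# Hypothetical, untrained parameters for the four gates.
params = {}
for gate in ("f", "i", "c", "o"):
    params["W_" + gate] = rng.normal(scale=0.1, size=(n_hidden, n_hidden + n_input))
    params["b_" + gate] = np.zeros(n_hidden)

h = np.zeros(n_hidden)
c = np.zeros(n_hidden)
for q_t in rng.uniform(size=(6, n_input)):   # six hypothetical time steps
    h, c = lstm_step(q_t, h, c, params)

print(h.shape, c.shape)
```

Because the output gate lies in (0, 1) and tanh lies in (-1, 1), every element of the hidden state stays strictly between -1 and 1, whatever the sequence length.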

Model Training and Implementation
In implementing the hybrid models and the ANN model, the selection of the number of neurons in different layers of the model was strategically determined to ensure optimal convergence and fitting of the models. The regression models were built using the Keras library (which is a library in Python open-source programming package) while mean square error (MSE) is adopted as the loss function. Adam optimizer is used to minimize the cost function, and the rectified linear unit (ReLU) is applied as the nonlinear activation function due to its ability to make the network sparse and efficient. ReLU is one of the non-linear activation functions available and it is specifically used after each layer in a neural network to ensure that the computed output is activated (such that not all the neurons are activated at the same time). This is also adopted to apply non-linearity and overcome the vanishing gradient problem. The supervised learning feature of deep learning models creates room for further improvement of the model (especially when applied in other locations), however, the model overfitting problem is avoided by increasing the number of epochs gradually while checking the model performance. The optimal number of epochs is determined when the reduction in training losses stops. All the models developed were implemented in a Python environment running under core i7, 2.20 GHz system with 16 GB RAM, and GTX1060 6 GB Graphics card. The specifications of the computer used for the simulation are chosen considering the implementation of the models developed and these specs are common in the market nowadays. The optimal number of epochs required for the training of each model for different locations is summarized in Table 3.
The neural network models learn the relationship between the data features by computing and updating the weights and biases of each layer. As the model is trained, the weights and biases are fine-tuned and updated to yield a better prediction at every cycle by comparing the prediction to the data label. The difference between the prediction and the label is computed as the loss function, and its average over the samples is the cost function. The cost function is then minimized using the backpropagation algorithm so that the predicted value is as close to the label as possible.
Specifically, for CNN integrated models, kernels are designed to run convolution on the data to update the weights and bias for refining the model prediction. The model predicted values are compared to the labels and the cost functions are modified by updating the model parameters using backpropagation. In the case of the LSTM model, the input, forget and output gates help the model to determine which values to retain for keeping long-term computational dependencies.
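The epoch-selection rule described in this section (increase the number of epochs gradually and stop once the training loss no longer decreases) can be sketched as a small helper. The function name `select_epochs`, the improvement threshold `min_delta`, and the `patience` window are hypothetical choices for illustration; the text does not specify how a stalled loss was detected.

```python
def select_epochs(train_losses, min_delta=1e-4, patience=3):
    """Return the last epoch at which the training loss still improved.

    train_losses: training loss recorded after each epoch (epoch 1 first)
    min_delta:    smallest decrease that counts as an improvement (assumed)
    patience:     number of non-improving epochs tolerated before stopping (assumed)
    """
    best = float("inf")
    best_epoch = 0
    stale = 0
    for epoch, loss in enumerate(train_losses, start=1):
        if best - loss > min_delta:
            best = loss
            best_epoch = epoch
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best_epoch

# Hypothetical loss history: improvement stops after the fifth epoch.
losses = [0.90, 0.40, 0.20, 0.12, 0.10, 0.10, 0.10, 0.10, 0.10]
print(select_epochs(losses))
```

In practice the same check would also be run against the test metrics, since the text pairs the gradual increase in epochs with monitoring of model performance to avoid overfitting.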

Evaluation Metrics
The performance of the two hybrid neural network models (CNN-ANN and CNN-LSTM-ANN), as well as that of the ANN model, is evaluated with different deep learning statistical indicators and metrics. These include the correlation coefficient (r), mean absolute error (MAE), mean absolute percentage error (MAPE), and root mean square error (RMSE). The mean square error (MSE) is used as the loss function for the training and testing of the models. These metrics have been used in different literature [32,47]. They serve as deterministic measures for evaluating the performance of the models. The mathematical representation of these metrics can be found in Equations (14)-(19). It is noteworthy that the smaller the values of MAE and RMSE, the more accurate the model. Additionally, a model is said to be more accurate as the r value approaches 1. In addition to these metrics, the promoting percentages of MAE ($P_{MAE}$) and RMSE ($P_{RMSE}$) presented in literature [45] are also adopted to check the model performance. Finally, the change in MAE and RMSE (ΔMAE and ΔRMSE) between the training and test tasks is used to check the models' performance. While a small value of ΔRMSE and ΔMAE signifies that the developed model is good for the prediction task, a ΔRMSE and ΔMAE of zero does indicate that the model is not good.
where $y_i$ is the measured value and $\hat{y}_i$ represents the predicted value, and $\bar{y}$/$\bar{\hat{y}}$ are the average values of $y_i$ and $\hat{y}_i$, respectively. MAE1 and RMSE1 are the training performance metrics while MAE2 and RMSE2 are the prediction performance metrics. N is the number of data points used and $P_{MAE}$/$P_{RMSE}$ are the promoting percentages of MAE/RMSE.
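The four headline metrics can be computed as in the sketch below, which uses the standard textbook definitions of r, MAE, RMSE, and MAPE; it is assumed, not confirmed by the text, that these match Equations (14)-(19). Skipping zero-valued measurements in MAPE (for example night-time hours) is also an assumption, and the sample values are made up for illustration.

```python
import numpy as np

def evaluate(y_true, y_pred):
    """Return r, MAE, RMSE and MAPE for measured and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    mae = np.mean(np.abs(err))
    rmse = np.sqrt(np.mean(err ** 2))
    # MAPE is undefined where the measurement is zero, so those samples
    # are skipped here (assumed handling).
    nonzero = y_true != 0
    mape = 100.0 * np.mean(np.abs(err[nonzero] / y_true[nonzero]))
    # Pearson correlation coefficient between measured and predicted values.
    r = np.corrcoef(y_true, y_pred)[0, 1]
    return {"r": float(r), "MAE": float(mae), "RMSE": float(rmse), "MAPE": float(mape)}

# Hypothetical measured and predicted irradiance values in W/m^2.
measured = [0.0, 100.0, 200.0, 400.0]
predicted = [10.0, 110.0, 190.0, 400.0]
scores = evaluate(measured, predicted)
print(scores)
```

Because RMSE squares each error before averaging, a single large miss raises it far more than it raises MAE, which is why the two are reported together.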

Results and Discussion
This work has two main objectives. The first objective is to develop two hybrid neural network models suitable for estimating hourly global solar radiation in Africa. The second objective is to compare the performance of the hybrid neural network models with that of an artificial neural network model developed for the same purpose. Ten different countries in Africa have been selected for testing the developed models. There is no one-size-fits-all indicator for measuring the prediction accuracy of a model. Therefore, eight different evaluation metrics are used to evaluate the performance of the models developed in this study, and the results are presented in Table 3. Here, r is the correlation coefficient of a model; it is a relative measure of fit of the predicted variable in comparison to the test dataset. The higher the r-value of a model, the more accurate the model is for a prediction task. When comparing prediction models for a single time series, or for multiple time series with the same units, MAE and RMSE are popularly used to evaluate performance [48]. Minimizing the MAE leads to a forecast of the median, while minimizing the RMSE leads to a forecast of the mean, so both metrics are crucial in evaluating model predictive performance. In comparison to MAE, RMSE does not treat each error equally; therefore, one large error can lead to a bad RMSE. In this study, the best model is chosen based on the highest r-value; however, the corresponding MAE, RMSE, MAPE, and other metrics should be the least or among the least for it to be categorized as the best. In the subsequent subsections, an overview of the hybrid neural network models' performance is given, and the performance of these models is compared for different countries. The performance of the hybrid models and the ANN model is summarized in Table 4.

Hybrid Neural Network Models' Performance Overview
The three neural network models (ANN, CNN-ANN, and CNN-LSTM-ANN) developed in this study are capable of giving a good solar irradiance prediction. It is noteworthy that the performance of all the models developed in this study is very good: the r values range from 0.9662 to 0.9930. This is superior to the model developed in literature [49] for solar radiation prediction, for which the r-values range between 0.8426 and 0.9356. Additionally, the performance of the models is better for countries with a well-distributed global solar radiation resource. According to the statistical summary of the data used in training and testing the models (Table 2), countries with a good mean value (Chad, Egypt, and Somalia) that corresponds to the maximum solar radiation have a better prediction performance. The RMSE and MAE of the models indicate that the models can accurately predict solar radiation for all the countries across Africa. It is noteworthy that the units of the MAE and RMSE are W/m², which puts the errors above 10 in most cases. Typically, the errors (MAE and RMSE) are presented in MJ/m² or kW/m² [45,49], and this would reduce these values significantly.
Out of all the case studies, the CNN-ANN hybrid model had the best performance for seven countries (Ghana, Nigeria, Central African Republic, Chad, Egypt, South Africa, and Namibia), while CNN-LSTM-ANN had the best performance for three countries (Ethiopia, Somalia, and Algeria) considering the r-value and other metrics (Table 4). The time taken to train and test the hybrid (CNN-ANN and CNN-LSTM-ANN) models is smaller in comparison to the ANN model, and this is an outstanding attribute of these models. The CNN-ANN model for solar radiation prediction in Chad had the best accuracy of all the countries considered in this study. Although the r-value of the ANN model (0.9939) is higher than that of the CNN-ANN (0.9930), the performance of the ANN model considering the MAE, RMSE, MAPE, and other evaluation metrics shows that its prediction accuracy is not as good as that of the CNN-ANN model. The ΔRMSE for the ANN model is 5.843, which reflects that the model has not trained as well as the CNN-ANN model. Furthermore, the ANN model for the Central African Republic had the least performance of all the models in this study (Table 4). To further highlight the strength and accuracy of all the models, the performance of the models within a 24 h period, taken from the entire testing dataset, is plotted (Figures 6-10). These plots show the fitting characteristics of each model. The predictive performance comparison of the three models for hourly solar radiation prediction over 24 h for one West African country is illustrated in Figure 6. The performance of the model also shows that the CNN-ANN can be used for hourly solar radiation prediction for any country in the West African region.

Performance of Hybrid Neural Network Models and Its Comparison for Different Geopolitical Zones
In East Africa, CNN-LSTM-ANN had the best performance and is the most accurate for hourly solar radiation prediction in Ethiopia and Somalia. As seen in Table 4, the performance of the CNN-LSTM-ANN hybrid model is better and more refined than that of the ANN and CNN-ANN models. It also had the highest r-value (0.9800 for Ethiopia; 0.9904 for Somalia) and the lowest MAE, RMSE, and MAPE (25.89 W/m², 72.72 W/m², and 9.31% for Ethiopia; 16.60 W/m², 51.54 W/m², and 5.725% for Somalia). The integration of one LSTM layer with the CNN and ANN models smooths the predictive performance of the model, as seen in Figure 7 (where the predictive performance of the three models is compared based on their hourly solar radiation prediction over a 24-h period).
As discussed in the preceding section, CNN-ANN has the most accurate predictive performance for Chad. This model also has the best performance for the Central African Republic (Table 4). The hourly solar irradiation prediction illustrated in Figure 8 further shows that the CNN-ANN prediction is the closest to the true value. It is, however, noteworthy that CNN-LSTM-ANN had the poorest performance of the three models considered for this region.
Contrary to the model performance in other regions (where one model is best for both countries), each of the two hybrid models had the most accurate predictive performance for one country in North Africa. While CNN-ANN is the best model for Egypt, CNN-LSTM-ANN had the best performance for Algeria. The r-value, MAE, RMSE, and MAPE are 0.9925, 17.25 W/m², 48.33 W/m², and 5.56% for Egypt, and 0.9782, 26.41 W/m², 72.80 W/m², and 9.98% for Algeria. Figure 9 highlights the accuracy of the CNN-LSTM-ANN model when predicting solar radiation on an hourly time step for Algeria.
Similar to West and Central Africa, the CNN-ANN hybrid model had the best predictive accuracy of the three models (Table 4) for countries in the southern part of Africa. As seen in Figure 10, the CNN-ANN hybrid model can predict the hourly global solar radiation better than the ANN and CNN-LSTM-ANN models. The evaluation metrics (r-value, MAE, MAPE, and RMSE) for the two countries studied in this region are similar. This reflects the uniform distribution of solar radiation in the region. While CNN-ANN had the best performance for this region, CNN-LSTM-ANN also performs well for solar radiation prediction. However, ANN had the lowest accuracy of the three models in this region. It should be noted that the performance of the models is compared over a period of 24 h out of the total 10,519 h of test results. This is done to highlight the fine-grained differences in the models' predictive performance.
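A 24 h comparison window of this kind can be extracted from a long test series as in the following sketch. The helper `window_mae`, the synthetic half-sine daily profile, the placeholder names `model_a` and `model_b`, and the start index are all hypothetical. Only the test-set length of 10,519 h comes from the text.

```python
import math

N_TEST_HOURS = 10519  # size of the test set reported in the text


def window_mae(y_true, predictions, start, hours=24):
    """Return the MAE of each model over y_true[start:start + hours]."""
    stop = start + hours
    if start < 0 or stop > len(y_true):
        raise ValueError("window falls outside the test set")
    truth = y_true[start:stop]
    result = {}
    for name, y_pred in predictions.items():
        window = y_pred[start:stop]
        result[name] = sum(abs(p - t) for t, p in zip(truth, window)) / hours
    return result


def synthetic_irradiance(hour_index):
    """Hypothetical daily profile: zero at night, half-sine from 06:00 to 18:00."""
    hour = hour_index % 24
    if 6 <= hour <= 18:
        return 1000.0 * math.sin(math.pi * (hour - 6) / 12)
    return 0.0


# Synthetic "observations" in W/m^2 (illustrative only, not data from the study).
y_true = [synthetic_irradiance(i) for i in range(N_TEST_HOURS)]

# Two placeholder models with known, artificial error patterns.
predictions = {
    "model_a": [v + 10.0 for v in y_true],  # constant bias of +10 W/m^2
    "model_b": [v * 0.95 for v in y_true],  # 5% underestimate
}

# Compare the models over one 24 h window starting at hour 240 of the test set.
mae_24h = window_mae(y_true, predictions, start=240)
print(mae_24h)
```

A single window covers only 24 of the 10,519 test hours, so it shows the fitting behaviour of each model but does not replace the full-test-set metrics in Table 4.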

Performance Comparison of Hybrid Neural Network Models with Existing Literature
Compared with other models presented in the existing literature for solar radiation prediction, the novel CNN-ANN and CNN-LSTM-ANN hybrid deep learning models presented in this study have superior performance, as seen in Table 5. In comparison to the hybrid models presented by Olatomiwa et al. [50] and Feng et al. [51], the CNN-ANN and CNN-LSTM-ANN models have higher r-values and significantly smaller RMSE values. Although the use of MLP and EANN models in another study [52] yielded r-values of 0.9749 and 0.9598, the performance of all the models developed in this study is significantly better, leading to a more accurate solar radiation prediction (Table 5).

Conclusions
In this study, two novel deep learning models, namely CNN-ANN and CNN-LSTM-ANN, have been developed in Keras-Python and used to forecast hourly global solar radiation in Africa. The developed models are also compared to an ANN model developed for the same purpose. Data from ten different African countries within the five geopolitical zones have been used to train and test the models. Variation in weather conditions is one of the main factors that affect the performance of the models, as seen in Table 4. The models performed better for locations with high and well-distributed solar radiation, as in the case of Egypt, Chad, and Somalia. The integration of LSTM helps the model to retain long-term computational dependency, thereby reducing the error term in comparison to ANN, and this is evident in the MAE results in Table 4. Additionally, the addition of CNN to the ANN algorithm boosts the performance of the model, giving a more accurate prediction with a higher r-value (Table 4). The concluding remarks from this study are highlighted as follows.
- The hybrid models were found to predict solar radiation more accurately than the ANN model. While CNN-ANN had the best performance for seven different countries (Ghana, Nigeria, Chad, CAR, Egypt, Namibia, and South Africa), CNN-LSTM-ANN had the best predictive performance for Algeria, Somalia, and Ethiopia.
- Also, the integration of a flatten layer in the CNN-ANN hybrid model enhanced the predictive performance of this model.
- The two hybrid models train faster than the ANN model, making them more desirable for computation in developing countries. Also, the performance of the models was found to be better for countries with well-distributed solar radiation.
- Finally, the performance of the ANN model developed in this study is also very good; however, the large number of neurons (2500) in the hidden layer and the lengthy training period make it undesirable for developing nations. On the other hand, the novel hybrid neural network models presented in this study can achieve a better result with fewer neurons, which makes them more suitable for application in any part of the world.
Overall, it can be concluded that CNN-ANN has the best prediction performance of the two hybrid neural networks due to its simplified methodology. Although the hybrid models in this study have been tested on data from Africa, they can be applied for a similar purpose on other continents. In future studies, the application of these hybrid models to forecasting other renewable energy resources will be considered. Additionally, the development of hybrid models with traditional learning and machine learning tools will be studied. The approach proposed in this study can also be considered for modeling PV-related systems, including the single-diode modeling of PV systems.