Exploring the Potentials of Artificial Neural Network Trained with Differential Evolution for Estimating Global Solar Radiation

Abstract: Solar-powered systems are attracting growing attention owing to technological advances and improving cost-effectiveness. Systems such as photovoltaics, concentrated solar power, concentrator photovoltaics and hydrogen production systems are now commercially available for electricity generation. A major input to these systems is solar radiation data, which are either partially available or unavailable in many remote communities. Predictive models can be used to estimate the amount and pattern of solar radiation at any location. This paper presents the use of an evolutionary algorithm to improve the generalization capability and efficiency of a multilayer feed-forward artificial neural network for predicting solar radiation from meteorological parameters. The parameters used in the evaluation were monthly averages of daily sunshine hours, solar radiation, maximum temperature and minimum temperature. Results show that the proposed model returned an RMSE of 1.1967, an NSE of 0.8137 and an $R^2$ of 0.8254.


Introduction
Using renewable sources to meet daily energy needs has the advantage of environmental friendliness compared with fossil fuel sources [1][2][3]. Renewable energy sources are also reported to be useful for electrifying remote places that are cut off from the grid, either because of terrain barriers or because the consumer base is too small to be economical [4,5]. As a result, research and funding efforts directed at their development are on the increase [6][7][8]. This research and funding have made electrification through renewable energy more competitive with conventional alternatives in recent years [9]. For instance, as of 2018, the international weighted-average cost of electricity dropped by 1% for offshore wind and geothermal, 12% for hydropower, 13% for both onshore wind and photovoltaic (PV), and 14% for bioenergy [10]. Concentrated solar power (CSP) had the largest decline, at 26% [10]. From the foregoing, wind and solar are among the major renewable energy resources that have enjoyed technological advancement and economic breakthroughs in terms of competitiveness with conventional sources. Presently, solar energy is playing an important role in the decarbonization and sustainability of the electricity industry by providing emission-free electricity that is economically attractive. In developing countries with electricity access challenges, it is more common to see solar panels than wind turbines powering homes and businesses. This can be attributed to the modularity of solar panels and their acceptability among consumers. Hydrogen production is also emerging as a technology of interest because of its high efficiency and low negative environmental impact. It has therefore found applications in heating, transportation and the electrification of off-grid remote communities [11,12]. In addition, more studies now concentrate on how hydrogen can be generated from renewable energy sources [11][13][14][15][16].
In the face of anxieties caused by the irreversible depletion of conventional fuel sources and their environmental and health impacts on society, the production of hydrogen from renewable sources can contribute to the ongoing transition from a carbon-intensive electricity economy to more sustainable low-carbon energy production. One of the ways being explored for hydrogen production is the electrolysis of water using solar energy from PV [11,13,17,18]. This process does not contribute to environmental pollution and climate change because no greenhouse gas is emitted [19].
In designing hybrid energy systems that include solar-powered technologies as one of their constituents, it is very important that detailed solar radiation data are obtained and properly processed. However, the availability and cost of equipment, and the shortage of personnel to maintain it, make the gathering of solar radiation data difficult in many regions of the world. For example, in Nigeria, many of the locations with weather stations do not have solar radiation data archives [7]. Where archives exist, the data are either disjointed or available only for short periods because of inappropriate calibration of the measuring instruments deployed. Because of these uncertainties and difficulties, prediction models and algorithms that can estimate solar radiation with very little error are necessary.
One of the benefits of correlation-based methods in solar resource forecasting is evident in energy imbalance markets, where periodic (especially day-ahead) energy sales among participants are essential [20]. Such methods are also useful for data patching, the process of filling in data that are missing for specific periods because of faulty measuring equipment [21]. Missing data can lead to uncertainties that result in inaccurate energy estimation models with negative consequences. For example, when solar radiation data are missing or inaccurately patched, the result can be either a penalty cost on the utility (if the capacity in the bilateral agreement is not met) or an increase in the cost of energy, which is most often borne by consumers. From the standpoint of a grid operator, inaccuracies in solar radiation predictions may translate to loss of load, or to an urgent need to make up for unexpected imbalances between demand and generation using flexible short-term power sources, which may be costlier, especially when estimated per unit. The cost of alternative energy for meeting the shortfall can be appreciated by comparing the total cost of making up for a 15% error on a 5 MW and on a 150 MW solar power plant. Since the estimated output power of solar energy systems depends primarily on the availability of a complete solar radiation data set, there is a continuous need to improve correlation-based methods for solar resource forecasting.

Literature Review
Much of the literature has been dedicated to improving the methodologies used in the prediction of diffuse solar radiation [19][22][23][24][25][26][27][28]. For instance, linear, second-order and third-order predictive models were developed by Yaniktepe and Genc for the estimation of solar radiation [19] based on two parameters: maximum possible sunshine duration and monthly average of daily extraterrestrial radiation on a horizontal surface. In that study, the third-order predictive model performed better than both the linear and the second-order polynomial models. In another work, Feng et al. presented four artificial intelligence methods for estimating diffuse solar radiation in China [23]. The predictive models were based on back-propagation neural networks optimized by genetic algorithm (GANN), generalized regression neural networks (GRNN), random forests (RF) and extreme learning machine (ELM). In terms of performance, the GANN method ranked best, followed by ELM, RF and GRNN. ANN is a powerful predictive tool that has found applications in solar radiation estimation [24][25][26][27]. The performance of ANN in predicting solar radiation is being continuously improved by various researchers. Xue presented an improved ANN model for predicting daily diffuse solar radiation by using genetic algorithm (GA) and particle swarm optimization (PSO) to optimize the ANN [24]. In that study, the BPNN-PSO model performed better than the BPNN-GA and BPNN methods. The efficacy of ANN for monthly average daily global solar radiation estimation has also been presented in the literature [25]. To estimate the monthly average daily global solar radiation for 45 locations in Italy, the study used 13 input parameters.
The results show that the combination of predictive factors that best estimates the monthly average daily global solar radiation includes altitude, day duration, latitude, period, rainfall, rainy days, and top-of-atmosphere radiation. Another study developed four ANN models for estimating global horizontal irradiation in Abu Dhabi [26]. The developed ANN models include the adaptive neuro-fuzzy inference system (ANFIS), multilayer perceptron (MLP), nonlinear autoregressive recurrent exogenous neural network (NARX) and generalized regression neural networks (GRNN). To explore the advantage of hybridizing techniques, Gairaa et al. combined the strengths of the linear autoregressive moving average (ARMA) and ANN to develop a model that approximates daily global solar radiation [27]. Fan et al. proposed a novel solar prediction model that combines multiple sunshine models. The results of these models were compared with those of eight existing and two single sunshine-based empirical models to test their precision and suitability [29]. Generally, the proposed models performed better in all cases considered. A Bayesian neural network has also been combined with empirical models to forecast daily global solar radiation on a horizontal surface [30]. Fan et al. presented a comparative study on the use of two machine learning algorithms (Extreme Gradient Boosting and support vector machine) for the estimation of daily global solar radiation, as against selected empirical methods, under a limited data set [31]. Hussain and Al-Alili analysed solar radiation models using wavelets [32]. Hassan et al. explored the potential of three tree-based ensemble approaches (gradient boosting, bagging and random forest) in predicting daily and hourly solar radiation [33]. Table 1 shows a comparative analysis of various studies related to the prediction of solar radiation.
It is clear from the literature presented that ANN has been widely explored for the prediction of solar radiation because of its superior efficiency and prediction accuracy compared with empirical methods. However, ANN has its own challenges that need to be addressed with respect to solar radiation prediction. These include a tendency to become trapped in local minima, over-fitting, poor generalization and slow convergence, all of which affect model accuracy. Moreover, when ANN produces high accuracy, it often does so with highly complex model architectures, resulting in high computational demands [24,34]. To improve the performance of ANN in solar radiation prediction, there is a need to strike a balance between the accuracy and the complexity of solar radiation predictive models. This could be achieved by hybridizing ANN with global search algorithms such as evolutionary algorithms [24,35,36].
The practicability or otherwise of these solar radiation predictive models has been a source of debate in the literature [21,37,38]. One study states that "objective interpolation of solar radiation measurement is often required for the sites where measurements do not exist; using global solar radiation estimations calculated from sunshine duration data" [37]. This is because sunshine duration data can be obtained more easily than solar radiation data in many regions [37]. Meanwhile, another study explained that many of the empirical models developed for estimating solar radiation are based on correlations between meteorological parameters and the available solar radiation data [21]. The authors then argued that such correlations can only be established in areas where the availability of data (solar and related meteorological) for validation is not a challenge. This implies that these regions may not need such correlations, whereas the regions that do need them lack adequate data to develop the empirical models [21]. While acknowledging that the dilemma stated in [21] exists and needs to be investigated, this study is limited to exploring the predictive capability of differential evolution and ANN in estimating solar radiation, using Nigeria as a case study. As an original contribution to studies on solar radiation prediction, this paper seeks to explore the capabilities of differential evolution (DE) in optimizing the multilayer feed-forward ANN used in solar radiation prediction. To the best of the authors' knowledge, no study has yet reported the use of DE in optimizing a multilayer feed-forward ANN for solar radiation prediction. This is the main objective of this study.

Artificial Neural Networks
Artificial neural networks are biologically inspired: they mimic the structural sophistication and performance of the brain (the biological neural system) for computational machine learning purposes. Artificial neural networks comprise elements that are organized in a way comparable to the brain's anatomy and perform functions similar to those of a biological neuron. As such, models of artificial neural networks learn patterns intrinsic in observations in ways similar to the brain [50,51].
The perceptron is the basic artificial neural model that can learn [50,52,53]. A perceptron is composed of input units (corresponding to the input signals that a biological neuron receives); connection weights that can be trained and adjusted (corresponding to the synapses in the biological neural system); a processing element (consisting of a summation function and an activation function, and corresponding to the soma in the biological neuron); a bias (an additional input into the processing element); and an output unit (corresponding to the output signal or response from the biological neuron). The perceptron is termed a single-layer neural network because it has only one layer of output units [50]. The basic mathematical description of the perceptron is given in Equations (1) and (2).
where $v_i$, $w_i$, $y_0$, $b_o$ and $w^T$ denote the input signals, trainable synaptic weights, output signal, bias and transpose of the synaptic weight vector, respectively.
To generate an output from the perceptron, each input is multiplied by its synaptic weight. In basic ANN models, these products are summed and compared to a threshold before they are fed through the activation function that generates the output [53]. The activation function introduces non-linearity to the output of the artificial neuron. Without the activation function, the output signal would simply be linear, which may not always suit real-world applications [50][51][52][53]. The activation functions used for ANN applications include the threshold logic unit, log sigmoid, tan sigmoid and saturated linear activation functions [50][51][52]. The overall aim of ANN training is to reduce the overall error $E_r$ between the actual and predicted observations, as expressed in Equation (3).
where $D_m$ and $F_m$ are the actual and predicted values for the $m$th output processor, respectively, and $n$ is the total number of training patterns. Training of ANNs can broadly be classified as supervised or unsupervised [53]. In supervised training, input-output pairs are supplied so that the parameters of the network are adjusted in such a way that a specific input gives a target output. In unsupervised learning, the network is supplied only with the input data and self-organizes the data to decipher the collective pattern in it. Variants of these broad categories of learning exist [53,54]. In training artificial neural networks, learning algorithms such as the Levenberg-Marquardt algorithm, backpropagation, Newton's method, conjugate gradient and quasi-Newton methods are frequently used [53,55]. An artificial neural network is said to be well trained if it generalizes reasonably. For effective generalization and enhanced approximation capability of ANN models, it is important that data are not over-fitted during training. Over-fitting can be avoided through complexity regularization by applying standard selection criteria such as the Akaike information criterion (AIC), Bayesian information criterion (BIC) and predictive stochastic complexity (PSC) [53,56]. Details of ANN implementation are available in the literature [34,57]. The structure of the proposed ANN model is shown in Figure 1. This ANN model is trained using DE.
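Equations (1) and (2) do not appear in this copy of the text. A standard form of the perceptron that is consistent with the symbols defined above is sketched below. The activation symbol $f$ and the number of inputs $k$ are assumptions introduced here for illustration, and the authors' exact notation may differ.

```latex
% Assumed standard perceptron form; f (activation) and k (number of inputs) are illustrative symbols.
y_0 = f\left(\sum_{i=1}^{k} w_i v_i + b_o\right) \tag{1}

y_0 = f\left(w^{T} v + b_o\right) \tag{2}
```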
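The perceptron output and the training error described in this section can be illustrated with a short Python sketch. It is not the authors' implementation: the function names are hypothetical, the log-sigmoid is just one of the activation functions listed above, and the error is taken to be the mean of the squared differences between actual and predicted values, which is an assumption about how Equation (3) is normalised.

```python
import math


def log_sigmoid(x):
    """Log-sigmoid activation, mapping any real input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def perceptron(inputs, weights, bias, activation=log_sigmoid):
    """Weighted sum of the inputs plus the bias, passed through the activation."""
    weighted_sum = sum(w * v for w, v in zip(weights, inputs)) + bias
    return activation(weighted_sum)


def training_error(actual, predicted):
    """Mean squared difference between actual (D) and predicted (F) values.

    Assumption: the error is averaged over the n training patterns.
    """
    n = len(actual)
    return sum((d - f) ** 2 for d, f in zip(actual, predicted)) / n


# Illustrative values only (not data from the study).
output = perceptron([0.2, 0.7, 0.5], [0.4, -0.1, 0.3], bias=0.05)
error = training_error([1.0, 2.0], [1.0, 4.0])
```

A training algorithm, whether gradient-based or evolutionary, then searches for the weights and biases that make this error as small as possible.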

Differential Evolution
Differential evolution (DE) is a heuristic, population-based, parallel direct search algorithm for the optimization of continuous functions [36,58]. DE operates on a population of vectors that can be manipulated independently to perform the search. When a trial vector returns a lower objective function value than a predetermined population member, the newly generated trial vector replaces that member and is carried into the next generation for comparison [59]. As this process continues, the population converges and the perturbations become smaller [60]. As a result, DE performs a global search during the initial phase and a local search during the later phase of the process [59,60]. This feature makes DE capable of identifying the optimal weights needed to minimize the error in ANN [36]. The steps involved in DE are presented in Algorithm 1.

Algorithm 1: Classical Differential Evolution Algorithm
Data: NP, F and CR
Result: The best individual in the population
Generate initial population with NP individuals;
while g ≤ number of generations do
    for each individual X_i in population do
        Select three random individuals (X_r1, X_r2, X_r3);
        d_rand ← select a random dimension to mutate;
        for each dimension d do
            if d = d_rand or rand(0, 1) ≤ CR then
                u_i,d ← X_r1,d + F × (X_r2,d − X_r3,d);
            else
                u_i,d ← X_i,d;
        if u_i,fitness ≤ X_i,fitness then
            Add u_i to the offspring;
        else
            Add X_i to the offspring;
    Population ← offspring;
    g ← g + 1;
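Algorithm 1 can be sketched in Python as follows. This is a minimal illustration and not the authors' implementation: the sphere objective, the search bounds, the seed and the small population are assumptions chosen for the demonstration, and the mutation and binomial crossover follow the classical DE/rand/1/bin scheme.

```python
import random


def differential_evolution(objective, bounds, NP=20, F=0.4, CR=0.9,
                           generations=200, seed=0):
    """Classical DE (rand/1/bin) minimising `objective` over box `bounds`."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Initial population: NP vectors drawn uniformly inside the bounds.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(NP)]
    fitness = [objective(x) for x in pop]
    for _ in range(generations):
        offspring, offspring_fitness = [], []
        for i in range(NP):
            # Three distinct individuals, all different from the target i.
            r1, r2, r3 = rng.sample([j for j in range(NP) if j != i], 3)
            d_rand = rng.randrange(dim)  # dimension that always mutates
            trial = list(pop[i])
            for d in range(dim):
                if d == d_rand or rng.random() <= CR:
                    trial[d] = pop[r1][d] + F * (pop[r2][d] - pop[r3][d])
            trial_fitness = objective(trial)
            # Greedy selection: keep the trial only if it is not worse.
            if trial_fitness <= fitness[i]:
                offspring.append(trial)
                offspring_fitness.append(trial_fitness)
            else:
                offspring.append(pop[i])
                offspring_fitness.append(fitness[i])
        pop, fitness = offspring, offspring_fitness
    best = min(range(NP), key=fitness.__getitem__)
    return pop[best], fitness[best]


def sphere(x):
    """Toy objective (assumption for the demo): sum of squares, minimum at 0."""
    return sum(v * v for v in x)


best_x, best_f = differential_evolution(sphere, [(-5.0, 5.0)] * 3)
```

When DE is used to train an ANN, the objective would instead be the network error evaluated for a candidate vector of weights and biases.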

Site Description and Data Collection
The data used in the study were obtained from the national weather station located in Iseyin and managed by the Nigerian Meteorological Agency. Iseyin (7°58′ N, 3°36′ E) is a Nigerian city located in the south-western part of the country and known for its commercial farming and mining activities. It has a population of more than 300 thousand people. The city is characterized by a tropical climate with a mean annual rainfall of 1171 mm and a mean ambient temperature of 26.1 °C. Monthly average daily values of global solar radiation ($H_g$), maximum and minimum temperature ($T_{max}$, $T_{min}$) and sunshine hours ($S_h$) for a 21-year period were obtained from the weather station (Figure 2).

Relation between Extraterrestrial and the Other Factors
The monthly average daily extraterrestrial radiation on a horizontal surface at the weather station can be obtained using Equations (5)-(9) [7,61].
where $I_{sc}$, $w_s$, $d_r$, $H_o$, $\delta$, $N$, $d$ and $\varphi$ are the solar constant, sunset hour angle, inverse relative distance of the sun to earth, monthly mean daily extraterrestrial solar radiation, solar declination, monthly daylight hours, day number, and latitude of the site under consideration, respectively.
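Equations (5)-(9) do not appear in this copy of the text. The sketch below uses the widely used FAO-56 expressions, which involve the same quantities as those listed above. It is an assumption that the paper uses this formulation: the function name, the value and units of the solar constant (0.0820 MJ m^-2 min^-1) and the example day number are illustrative choices, not values taken from the source.

```python
import math

I_SC = 0.0820  # solar constant in MJ m^-2 min^-1 (FAO-56 value, assumed here)


def extraterrestrial_radiation(latitude_deg, d):
    """Return (H_o, N) for day number d at the given latitude.

    H_o: daily extraterrestrial radiation on a horizontal surface (MJ m^-2 day^-1)
    N:   maximum possible daylight hours
    """
    phi = math.radians(latitude_deg)
    # Inverse relative distance of the sun to earth.
    d_r = 1 + 0.033 * math.cos(2 * math.pi * d / 365)
    # Solar declination (radians).
    delta = 0.409 * math.sin(2 * math.pi * d / 365 - 1.39)
    # Sunset hour angle (radians).
    w_s = math.acos(-math.tan(phi) * math.tan(delta))
    H_o = (24 * 60 / math.pi) * I_SC * d_r * (
        w_s * math.sin(phi) * math.sin(delta)
        + math.cos(phi) * math.cos(delta) * math.sin(w_s)
    )
    N = 24 * w_s / math.pi
    return H_o, N


# Iseyin lies at about 7 deg 58 min N; day 15 is an arbitrary mid-January example.
H_o_iseyin, N_iseyin = extraterrestrial_radiation(7 + 58 / 60, d=15)
```

For a monthly value, the function can be evaluated on a representative day of each month or averaged over all days of the month.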

Model Development
The procedural steps used in modelling solar radiation at a monthly timescale are presented in this section. These steps, summarized in the methodological framework illustrated in Figure 3, entail the use of a classic DE algorithm to train an MLF-ANN.

Network Topology and Setup
The architectural layout of the MLF-ANN developed in this study comprised three layers: one input layer, one output layer and a single hidden layer of neurons. A logistic sigmoid activation function (range [0, 1]) was used in the hidden layer of the MLF-ANN, with the inputs rescaled to the interval [0.1, 0.9], while a linear activation function was adopted in the output layer. The monthly averages of daily sunshine duration, minimum temperature and maximum temperature were used as the input layer of the ANN architecture, while the corresponding averages of solar radiation were used as the output layer (i.e., the target output). Consequently, the MLF-ANN was characterized by three input layer neurons and one output layer neuron. Equation (10) describes the functional relationship between solar radiation and the input variables considered in this study.
$$H_g = f\left(S_h, T_{min}, T_{max}\right) \tag{10}$$
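The network layout described above can be sketched as a forward pass over a flat parameter vector, which is the form in which an optimizer such as DE would handle the weights and biases. This is an illustration under assumptions: the function names, the ordering of parameters inside the vector, the min-max form of the rescaling and the example values are all hypothetical.

```python
import math


def rescale(x, x_min, x_max, lo=0.1, hi=0.9):
    """Min-max rescaling of a raw value into [lo, hi] (assumed form)."""
    return lo + (hi - lo) * (x - x_min) / (x_max - x_min)


def forward(params, inputs, n_hidden):
    """Output of an MLF-ANN with one sigmoid hidden layer and a linear output.

    Assumed layout of `params`: for each hidden neuron, its input weights
    followed by its bias; then the output weights followed by the output bias.
    """
    n_in = len(inputs)
    pos = 0
    hidden = []
    for _ in range(n_hidden):
        weights = params[pos:pos + n_in]
        bias = params[pos + n_in]
        pos += n_in + 1
        s = sum(w * x for w, x in zip(weights, inputs)) + bias
        hidden.append(1.0 / (1.0 + math.exp(-s)))  # logistic sigmoid
    out_weights = params[pos:pos + n_hidden]
    out_bias = params[pos + n_hidden]
    return sum(w * h for w, h in zip(out_weights, hidden)) + out_bias


# Three inputs (sunshine hours, minimum and maximum temperature), illustrative ranges.
x = [rescale(6.0, 2.0, 9.0), rescale(22.0, 18.0, 26.0), rescale(31.0, 26.0, 37.0)]
params = [0.0] * 12 + [1.0, 1.0, 1.0, 0.0]  # 3 hidden neurons: 16 parameters
y = forward(params, x, n_hidden=3)
```

With three inputs and one output, a network with h hidden neurons has 5h + 1 parameters in this layout.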

Data Splitting
In data-driven modelling, the standard practice is to split a given set of historical observations into training and testing sets [34]. The total number of observations for the study area from 1987 to 2007 that were used for training and validation (testing) is 250. The data set was divided into two groups that exhibit analogous statistical features, with 200 (80%) observations used for model training and the remaining 50 (20%) observations for testing. Table 2 presents the statistical properties of the training and testing data sets.
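The 200/50 split can be sketched as below. The source does not state how the two groups were formed, so the seeded random partition, the function names and the synthetic series are assumptions made only for illustration. The summary helper shows one way to check that the two groups have comparable statistics.

```python
import random
import statistics


def split_observations(observations, n_train=200, seed=1):
    """Randomly partition the observations into training and testing sets.

    Assumption: a seeded random partition; the study's actual grouping
    procedure is not described in the text.
    """
    indices = list(range(len(observations)))
    random.Random(seed).shuffle(indices)
    train = [observations[i] for i in sorted(indices[:n_train])]
    test = [observations[i] for i in sorted(indices[n_train:])]
    return train, test


def summary(values):
    """Basic statistics for comparing the training and testing sets."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
    }


# Synthetic stand-in for the 250 monthly observations (not the study data).
data = [10.0 + (i % 12) for i in range(250)]
train, test = split_observations(data)
train_stats, test_stats = summary(train), summary(test)
```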

Model Implementation and Optimization
Because network complexity depends on the number of neurons in the hidden layer, the number of hidden layer neurons for the MLF-ANN was not predefined. Instead, using the DE algorithm, the hidden layer was subjected to a sensitivity analysis in which the number of processing neurons was varied from 1 to 10 in steps of one and the error profile was observed at each stage. DE was therefore used to optimize both the network parameters (i.e., the synaptic weights and biases) and the network architecture of the model. In training the MLF-ANN, the DE algorithm was initially run for 100 generations. The parameters governing the algorithm run are the population size (NP), the crossover probability (CR) and the mutation scale factor (F). NP was set at "D × 10" (where D is the number of weights and biases), while a search for the optimal values of CR and F was performed within the range [0.1, 0.9] to determine the best combination of parameter settings. The best parameter setting obtained from the initial run (CR = 0.9 and F = 0.4) was thereafter used in performing a finer search with the DE algorithm run for 10,000 generations. The DE algorithm was therefore aimed at optimizing model accuracy and model complexity simultaneously to balance both objectives. To mitigate the risk of over-fitting, from which ANN models typically suffer, an early-stopping function was introduced; it detects the point at which the error on the test data set stops decreasing and starts to increase, and immediately stops training [36].
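The bookkeeping behind this setup can be sketched as follows. The helper names are hypothetical, and two assumptions are made: every hidden and output neuron carries one bias, and early stopping halts at the first generation whose test error exceeds the lowest test error seen so far.

```python
def n_parameters(n_hidden, n_in=3, n_out=1):
    """D: number of weights and biases in an n_in-n_hidden-n_out network."""
    return (n_in + 1) * n_hidden + (n_hidden + 1) * n_out


def population_size(n_hidden):
    """NP = D x 10, as stated in the text."""
    return 10 * n_parameters(n_hidden)


def early_stop_generation(test_errors):
    """Index of the generation at which training would be halted.

    Training stops at the first generation whose test error is higher than
    the lowest test error observed so far (assumed rule).
    """
    best = float("inf")
    for generation, error in enumerate(test_errors):
        if error > best:
            return generation
        best = error
    return len(test_errors) - 1


# Population size for each candidate hidden-layer size from 1 to 10.
sizes = {h: population_size(h) for h in range(1, 11)}
# Illustrative error profile only (not results from the study).
stop_at = early_stop_generation([1.50, 1.30, 1.20, 1.25, 1.10])
```

Under these assumptions, a 3-3-1 network has D = 16 parameters, so its DE population would contain 160 individuals.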

Model Performance Evaluation
The prediction performance and accuracy of the proposed model are evaluated using three statistical metrics [36,62,63]:

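1. Root-mean-square error:
$$\mathrm{RMSE} = \sqrt{\frac{1}{X}\sum_{i=1}^{X}\left(D_i - F_i\right)^2}$$
2. Coefficient of determination:
$$R^2 = \left[\frac{\sum_{i=1}^{X}\left(D_i - \bar{D}\right)\left(F_i - \bar{F}\right)}{\sqrt{\sum_{i=1}^{X}\left(D_i - \bar{D}\right)^2}\,\sqrt{\sum_{i=1}^{X}\left(F_i - \bar{F}\right)^2}}\right]^2$$
3. Nash-Sutcliffe efficiency index:
$$\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{X}\left(D_i - F_i\right)^2}{\sum_{i=1}^{X}\left(D_i - \bar{D}\right)^2}$$
where the observed and predicted values are given as $D_i$ and $F_i$ respectively, their mean values are given as $\bar{D}$ and $\bar{F}$ respectively, and $X$ is the data size.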
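The three metrics can be computed with the short sketch below. It assumes the standard definitions: RMSE as the root of the mean squared difference, the coefficient of determination as the squared correlation between observed and predicted values, and NSE as one minus the ratio of the squared residuals to the squared deviations of the observations from their mean. The function names and sample values are illustrative.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between observed (D) and predicted (F) values."""
    n = len(observed)
    return math.sqrt(sum((d - f) ** 2 for d, f in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination as the squared Pearson correlation."""
    mean_d = sum(observed) / len(observed)
    mean_f = sum(predicted) / len(predicted)
    cov = sum((d - mean_d) * (f - mean_f) for d, f in zip(observed, predicted))
    var_d = sum((d - mean_d) ** 2 for d in observed)
    var_f = sum((f - mean_f) ** 2 for f in predicted)
    return cov ** 2 / (var_d * var_f)


def nse(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, below 0 is worse than the mean."""
    mean_d = sum(observed) / len(observed)
    residual = sum((d - f) ** 2 for d, f in zip(observed, predicted))
    spread = sum((d - mean_d) ** 2 for d in observed)
    return 1 - residual / spread


# Illustrative values only: predictions offset from the observations by one unit.
obs = [1.0, 2.0, 3.0]
pred = [2.0, 3.0, 4.0]
scores = (rmse(obs, pred), r_squared(obs, pred), nse(obs, pred))
```

The example shows why the metrics are reported together: a constant offset leaves the squared correlation at 1 while RMSE and NSE both expose the bias.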

Results and Discussion
The efficacy of the proposed ANN-DE model was appraised based on two criteria: accuracy and complexity. The results with respect to model accuracy are presented in Table 3. Table 3 shows that a minimal model error occurred during training, producing an RMSE of 1.3292. $R^2$ and NSE values were estimated to be 0.7838 and 0.7835 respectively, signifying high model accuracy during training. During the testing phase, an improvement in the model performance metrics (RMSE = 1.196; $R^2$ = 0.8254; NSE = 0.8134) can be observed. This translates to an 11%, 5.3% and 3.8% improvement in RMSE, $R^2$ and NSE, respectively, during testing.
These results imply that the model not only produced an accurate representation of the solar radiation of the city but also showed good generalization and convergence across the training and testing phases of the simulation. This is an indication that the ANN model did not encounter the over-fitting problems that typically plague conventional ANN models. The remarkable performance of the ANN model can be attributed to the robustness of the DE algorithm, whose genetic operators ensure a productive exploration when estimating the network parameters (weights and biases) of the model. The adoption of the early stopping technique also served as a good complement, as it ensured that training was halted as soon as an increase in error was observed, to prevent over-fitting. With regard to complexity, the DE algorithm ensured that model accuracy was obtained at minimal model complexity, as the optimal number of hidden layer neurons returned after exploring the range [1, 10] is three. Hence, the optimal network architecture of the ANN is 3-3-1. This signifies that the model accuracy was obtained using minimal computational resources. Figures 4 and 5 present a visual representation of the model performance in predicting solar radiation for the city of Iseyin over the study period. Figure 4 shows plots of actual and predicted values of solar radiation during training and testing. The chart indicates that the proposed model closely replicated the solar radiation pattern for Iseyin. The majority of the troughs and crests were reproduced by the model, and only a limited number of under- and over-estimations can be observed. The corresponding scatter plots also show a positive correlation and high accuracy between the predicted and actual values: $R^2$ is 0.7838 for the training phase and 0.8254 for the testing phase (Figure 5).
To further evaluate the performance of the ANN model, the results obtained from this study were compared with results reported in [7]. In their study, the authors employed seven soft computing techniques to estimate monthly solar radiation for the same location and period used in this study. The models reported in their study were developed using the following techniques: SVR with polynomial kernel function (SVR-polynomial), SVR with radial basis function kernel (SVR-radial), ANFIS coupled with ant colony optimization (ANFIS-ACO), ANFIS coupled with differential evolution (ANFIS-DE), ANFIS coupled with genetic algorithm (ANFIS-GA), ANFIS coupled with particle swarm optimization (ANFIS-PSO), and ANFIS. Table 4 presents results from the comparative analysis. It can be observed from the results that, although all the models performed well during training, none of them produced the same or a better performance during testing. All seven models produced higher RMSE values during testing than during training. Similarly, the estimated $R^2$ values during the testing phase were lower than those estimated during training. This indicates that all the models were affected to some degree by over-fitting, resulting in inadequate generalization. In fact, the SVR-radial, ANFIS-PSO and ANFIS models suffered severely from over-fitting. On the other hand, the ANN model developed in this study recorded an improved performance between the training and testing phases. It can be concluded from the comparative analysis that the ANN model developed in this study clearly outperformed the seven models reported in [7].
To further assess the performance of the ANN model developed in this study, its prediction accuracy was compared against eight existing studies in the literature undertaken in different regions across the world (Table 5). The compared models employ both empirical and soft computing techniques.
The performance of the models was evaluated based on $R^2$ estimates. The comparison shows that the ANN model developed in this study (trained using DE) provided results superior to those of the other models compared. The combined use of DE and early stopping can be considered instrumental in enhancing the efficiency of the ANN model, providing high accuracy at minimal complexity. This paper has thus demonstrated the capability of differential evolution in training artificial neural networks for solar radiation prediction.
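The percentage improvements can be checked with a small calculation. The sketch below uses the training and testing values quoted above; the function name and the choice of base are assumptions, since the text does not say which value each change is expressed against. With these inputs, the $R^2$ and NSE changes come to about 5.3% and 3.8% relative to their training values, while the RMSE change is about 10.0% relative to the training value and about 11.1% relative to the testing value.

```python
def relative_change(train_value, test_value, base):
    """Absolute change between phases, as a percentage of `base`."""
    return 100.0 * abs(test_value - train_value) / base


# Values as quoted in the text for the training and testing phases.
rmse_train, rmse_test = 1.3292, 1.196
r2_train, r2_test = 0.7838, 0.8254
nse_train, nse_test = 0.7835, 0.8134

rmse_vs_train = relative_change(rmse_train, rmse_test, base=rmse_train)
rmse_vs_test = relative_change(rmse_train, rmse_test, base=rmse_test)
r2_vs_train = relative_change(r2_train, r2_test, base=r2_train)
nse_vs_train = relative_change(nse_train, nse_test, base=nse_train)
```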

Conclusions
The decarbonization of the energy sector has led to extensive research and technological development with respect to the use of solar energy for electricity generation. As such, solar technologies such as solar photovoltaics (PV), concentrated solar power (CSP), solar hybrid systems (solar-hydrogen, solar-diesel generator, solar-wind, etc.) and concentrator photovoltaics (CPV) are presently being used to generate electricity at both small and commercial scales. For all of these technologies, solar radiation data are an essential input parameter during design and modelling, yet such data are often unavailable or incomplete. One way of obtaining solar radiation data is to estimate them from historical data using predictive tools. To demonstrate the efficacy of the proposed ANN-DE model in predicting solar radiation from a limited data set, three input parameters ($T_{max}$, $T_{min}$ and $S_h$) and three performance metrics ($R^2$, RMSE and NSE) were used. The performance metrics obtained from the proposed ANN-DE model show superior performance, in terms of the error values returned, compared with existing models that have been used in the area under consideration. It is worth noting that the values of these performance metrics may change with location and data. This is one of the drawbacks of all data-driven techniques [21]. The proposed model is capable of predicting solar radiation that can be used as input for various solar-powered projects at the location where the data were recorded. The methodology adopted in this model can be tested at other locations to ascertain its universality. Future studies could also look into the use of DE and other meta-heuristic approaches in training ANN for solar radiation prediction at finer timescales.

Acknowledgments:
The first author appreciates the useful discussions and insightful inputs of Oluwaseun Oyebode. The first author also appreciates Olatomiwa Lanre for helping out with the raw data.

Conflicts of Interest:
The authors declare no conflict of interest.