A Time Series Forecasting of Global Horizontal Irradiance on Geographical Data of Najran Saudi Arabia

Alghamdi, Hisham A.

doi:10.3390/en15030928


by

Hisham A. Alghamdi

Electrical Engineering Department, College of Engineering, Najran University, Najran 55461, Saudi Arabia

Energies 2022, 15(3), 928; https://doi.org/10.3390/en15030928

Submission received: 29 November 2021 / Revised: 18 January 2022 / Accepted: 19 January 2022 / Published: 27 January 2022

(This article belongs to the Section F: Electrical Engineering)


Abstract

Environmentally friendly, renewable energy resources are needed by developed and developing countries alike. Solar energy is one such resource, and accurate forecasts of it are useful to electricity supply companies. This research analyzes the daily global solar radiation (GSR) data of Najran province, Saudi Arabia, and proposes a model for predicting global horizontal irradiance (GHI). The weather data were collected at Najran University. After inspecting the data, the dependent and independent variables for calculating the GHI were identified. A dataset model was trained by creating a tensor of variables belonging to air, wind, peak wind, relative humidity, and barometric pressure. Six machine learning algorithms, namely convolutional neural networks (CNN), K-nearest neighbors (KNN), support vector machines (SVM), logistic regression (LR), random forest classifier (RFC), and support vector classifier (SVC), were then applied to the dataset model to predict the GHI. The results of the proposed models were verified with the following evaluation metrics: determination coefficient (R²), root mean square error (RMSE), relative root mean square error (rRMSE), mean bias error (MBE), mean absolute bias error (MABE), mean absolute percentage error (MAPE), and T-statistic (t-stat). Finally, the work reports that all the methods examined can be used to predict GHI accurately; however, the SVC technique is the most suitable of them, giving the most precise results on the evaluation metrics.

Keywords:

solar energy; machine learning; forecasting; GHI; GSR

1. Introduction

Energy is essential to human existence and has a large impact on daily life. Many sustainable and practically inexhaustible energy resources are available to meet the world's essential needs [1]. Solar energy is one of them: it is the radiant energy emitted by the sun, which can produce heat, drive chemical reactions, and generate electricity. The sun is a free, reliable, and practically limitless source of solar energy radiation (SER), and it has great potential to meet the world's future energy requirements [2]. Solar energy has many applications, such as solar lights, solar appliances, solar heating, and solar ventilation, and solar-powered transportation is one of the major applications for the future [3], as shown in Figure 1. Time series analysis is the process of analyzing and modeling time-ordered data in order to make forecasts and derive rules. Its forecasts are not always precise, because frequent changes in the data variables introduce uncertainty. It is used in a wide range of industrial projects, for example, weather and earthquake forecasting [4].

The prediction of solar radiation in a region is also of great importance for designing energy conversion systems. The architecture of solar energy systems depends largely on the radiation measured in the specific region. Such predictions also help in making future energy investments and policies related to the modeling and design of solar energy conversion systems [5]. To determine the solar energy requirements of an area, it is most important to have information about the solar radiation in that region [6]. Moreover, accurate prediction of solar radiation at ground level generates a large profit for energy suppliers in the market. Accurate short-term radiation prediction is therefore very important for selling solar energy commercially and increasing the revenue margin [7,8].

Solar radiation is usually measured with dedicated measurement instruments. Installing and calibrating these instruments is costly, so they are not readily available worldwide. For this reason, climatic parameters such as wind speed, humidity, temperature, and weather conditions are used to predict solar energy. Numerous types of models have been used for this purpose. They generally rely on mathematical techniques that make solar energy easy to predict [9,10].

However, under rapid and complex environmental changes, such as changes in the weather, the onset of the rainy season, and cloud cover, the existing techniques cannot predict the amount of solar radiation accurately. Hence, it has been reported that these mathematical models cannot satisfactorily predict daily solar radiation around the globe [11,12].

With the evolution of new technologies, artificial intelligence (AI) techniques have come into use in all engineering fields. Alongside the mathematical models, AI techniques such as support vector machines (SVM), artificial neural networks (ANN), and various deep learning (DL) methods have started to be used for the prediction of solar energy. According to previous studies, AI techniques give better solar prediction results than the existing mathematical models [9,13,14]. In [15], the author used three machine learning algorithms, SVM, ANN, and the Adaptive Neuro-Fuzzy Inference System (ANFIS), to predict daily solar radiation at six stations in Mexico. Environmental temperature, rainfall, and other environmental conditions were used to train the algorithms. In another study, the author used an ANN to predict global solar radiation at thirteen different stations, using the maximum and minimum temperatures to train the algorithm [16].

In this paper, the author analyzes the given dataset and determines the dependent and independent variables for feature selection. The selected features are passed to six machine learning algorithms: KNN, CNN, SVM, SVC, RFC, and LR. These algorithms predict the GHI value, and their performance is measured with seven statistical metrics: determination coefficient (R²), root mean square error (RMSE), relative root mean square error (rRMSE), mean bias error (MBE), mean absolute bias error (MABE), mean absolute percentage error (MAPE), and T-statistic (t-stat). The highest error values found were RMSE 37.5 (KNN), rRMSE 19.5 (KNN), MBE 1.27 (RFC), MABE 15.0 (KNN), and MAPE 12.8 (KNN); the lowest were RMSE 6.96 (SVC), rRMSE 2.63 (SVC), MBE −2.68 (CNN), MABE 0.58 (SVC), and MAPE 0.09 (SVC). Based on these lower error values, the proposed SVC method produces the best prediction results for GHI, as shown in the results section.

The remaining sections are organized as follows: Section 2 reviews the literature; Section 3 describes the methodology, study region, data collection, and statistical metrics; Section 4 presents the results and discussion with a comparison; and, finally, Section 5 concludes the study.

2. Literature Review

The sun is a nonpolluting, almost infinite source of energy that is necessary for life on Earth. Several ancient cultures around the world recognized this: they constructed solar architecture, held seasonal ceremonies, and even deified the sun [17]. Many approaches have been used to predict the solar energy reaching the earth. In [15], the author employed three machine learning algorithms, SVM, ANN, and the Adaptive Neuro-Fuzzy Inference System (ANFIS), to forecast daily solar radiation at six Mexican stations. The algorithms were trained on temperature, rainfall, and other environmental factors. The author of another study employed an ANN to forecast global solar radiation at thirteen different locations, training the system on the maximum and minimum temperatures [16].

In a study in Iran, the author used three models, gene expression programming, ANFIS, and ANN, to predict daily solar energy. The ANN was found to be the best model for solar energy prediction, with an R² value of 0.93 [18]. Another author compared the Angstrom model with the ANN technique, and the ANN gave the best results for the prediction of solar energy [19]. In further research, SVM, empirical, and ANN models were presented for predicting solar radiation; the SVM outperformed the others with a correlation of 0.99 [20].

In Turkey, monthly solar energy was predicted for four different stations using an ANN and a regression model. Environmental factors such as humidity, temperature, sunshine, longitude, and the day of the year were used to calculate the statistical values of the solar energy prediction. The ANN model obtained the best statistical values, namely R²: 0.961 and RMSE: 014 [5]. In another study in Turkey, a deep learning technique alone was used to predict monthly global solar energy, and the best R² value obtained was 0.98 [21].

In another study, KNN, an empirical model, and an ANN were used to predict solar radiation. KNN achieved an R² value of 0.96, while another model used to estimate the solar radiation had an R² value of 0.97 [22].

Although there are many meteorological stations around the globe for estimating solar radiation, many of them cannot measure daily global solar radiation accurately because of high maintenance costs and calibration issues [8]. It is therefore necessary to design a method for the accurate prediction of daily global solar radiation. As the earlier studies show, many machine learning models, as well as empirical models, are used for solar radiation prediction, and the main reason for using them is to predict solar radiation accurately. However, a few metrics are insufficient to assess predictions of global solar energy. The present study therefore contributes to the literature in the following ways:

Six machine learning algorithms (convolutional neural networks (CNN), K-nearest neighbors (KNN), support vector machines (SVM), logistic regression (LR), random forest classifier (RFC), and support vector classifier (SVC)) are used to predict daily global solar radiation.
These techniques use the daily solar radiation dataset of a single province to predict potential solar radiation.
The predictions are assessed with seven statistical metrics: determination coefficient (R²), root mean square error (RMSE), relative root mean square error (rRMSE), mean bias error (MBE), mean absolute bias error (MABE), mean absolute percentage error (MAPE), and T-statistic (t-stat).

3. System Model

This section describes the study region, the dataset collection, and the data preprocessing. It also gives detailed information about the machine learning algorithms and presents the statistical evaluation metrics used to obtain the results.

3.1. Focused Study Region

The focus area of this research is Najran University in Saudi Arabia. The province of Najran lies near the border with Yemen in southwestern Saudi Arabia. Its latitude and longitude are 17.5656° N and 44.2289° E, respectively, and its elevation is 1293 m. Najran is the fastest growing province in terms of population, which was around 418,792 in 2021 [23]. The annual sunshine duration is 3245 h and the annual global solar radiation is 7.004 kWh/m². These geographical data show the importance of future solar energy investments in Saudi Arabia. The geographical properties of Najran and its location on the map of Saudi Arabia are shown in Figure 2, which was taken from the Global Solar Atlas online portal [24].

3.2. Dataset Collection and Preprocessing

The dataset used for this study was collected with the support of the Najran University solar station. It covers the Najran province of Saudi Arabia and is used for daily global solar radiation prediction over the period from August 2017 to August 2018. Global horizontal irradiance (GHI) is the total amount of shortwave radiation received by a surface horizontal to the ground. This quantity, which combines Direct Normal Irradiance (DNI) and Diffuse Horizontal Irradiance (DHI), is particularly important for solar installations. Factors such as minimum and maximum temperature, wind speed, wind direction, and other environmental conditions are the values used for the prediction of GHI. The dataset is described in Table 1 and Table 2. In the status column of Table 1, NA marks static-valued features, which are not selected, whereas a tick marks the dynamic features selected to obtain the GHI values.
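For reference, the way GHI combines the two components is commonly written as follows. The solar zenith angle, written here as \theta_{z}, is a symbol introduced only for this illustration and does not appear elsewhere in the paper.

```latex
GHI = DNI \cos(\theta_{z}) + DHI
```

The direct component is projected onto the horizontal plane through \cos(\theta_{z}), while the diffuse component is already defined on the horizontal surface.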

Moreover, five other properties related to solar radiation were compiled from different barometric places or estimated, as previously described: the minimum and maximum daily ambient temperatures, cloud cover, daily solar energy, and day duration. The dataset was randomly shuffled and split, with 70% of the total data used in the training stage and 30% in the testing stage. The same training and testing data were used in all approaches, and the same labels were predicted, to allow a fair comparison among the ML models. The proposed machine learning methods are outlined in Figure 3.
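A shuffled 70%/30% split of this kind could be sketched with scikit-learn as shown below. This is only an illustration under stated assumptions: the arrays are synthetic placeholders, the shape of 261 records by 14 features is borrowed from the KNN settings reported later, and the seed value is arbitrary; none of it reproduces the actual station data or the author's code.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the station data: 261 daily records, 14 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(261, 14))
y = rng.uniform(100.0, 350.0, size=261)  # placeholder GHI targets, not real data

# Shuffle, then keep 70% for training and 30% for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, shuffle=True, random_state=42  # seed is an assumption
)
print(X_train.shape, X_test.shape)
```

Fixing the seed and reusing the same four arrays for every model is one simple way to keep the comparison between algorithms on identical training and testing data.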

Dataset Validation

Dataset validation is an important step for improving prediction accuracy and obtaining verified results. Because the dataset was collected privately with the support of Najran University, its validity was checked in two stages. It was first verified through the King Abdullah City for Atomic and Renewable Energy (KACARE) [25] center and then compared with the solar radiation data available on the Solar Radiation Data (SoDa) website, using a clear-sky model (CAMS-McClear) [26]. This website provides many solar radiation and meteorological datasets to facilitate the production and forecasting of energy. The algorithm used for the comparison is the Copernicus Atmosphere Monitoring Service (CAMS) McClear clear-sky model, which provides time series of the solar irradiation that would be observed at a site under clear-sky conditions. The comparison of the privately collected Najran University dataset with the SoDa and KACARE data showed more than 90 percent similarity in GHI values. This supports the authenticity and validity of the dataset used in this research.

3.3. Machine Learning Techniques

ML is a prominent form of artificial intelligence that continues to grow in popularity and appeal as new application domains emerge. Machine learning gives a system the ability to learn by itself in order to estimate unknown outcomes. The selection of features and the success of training undoubtedly have an important impact on the performance of an ML algorithm. Six different machine learning algorithms were used in this research: CNN, KNN, SVM, SVC, logistic regression, and a random forest classifier, as seen in Figure 4.

3.3.1. K-Nearest Neighbors (KNN)

Among machine learning algorithms, KNN is regarded as one of the most elementary and earliest nonparametric classification methods [27,28]. The classes are obtained by defining a chosen value of k [29], and a new item is assigned to the class most common among its k nearest neighbors. Distance functions such as Euclidean, Manhattan, Minkowski, and Chebyshev [30] can be used to calculate the distances between the new item and its neighbors. When k is large enough, the method is robust to noisy training data [31]. As the dataset grows, the processing time increases significantly, and all the distance computation results must be kept in memory [32,33]. Selecting an appropriate k value is therefore crucial. In this research, KNN is run with 50 neighbors, uniform weights, the auto algorithm, 261 samples, and 14 features. With these settings, KNN gives the best R² and t-stat values.
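The KNN settings listed above map naturally onto scikit-learn's `KNeighborsClassifier`, as in the minimal sketch below. It is an illustration rather than the author's code: the data are random placeholders, the five integer classes stand in for binned GHI labels (the paper does not state how labels were formed), and the 182/79 row counts simply assume a 70%/30% split of 261 samples.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X_train = rng.normal(size=(182, 14))      # placeholder training features
y_train = rng.integers(0, 5, size=182)    # hypothetical binned GHI classes
X_test = rng.normal(size=(79, 14))        # placeholder test features

# 50 neighbors, uniform weights, automatic choice of neighbor-search algorithm.
knn = KNeighborsClassifier(n_neighbors=50, weights="uniform", algorithm="auto")
knn.fit(X_train, y_train)
knn_pred = knn.predict(X_test)
```

With uniform weights every one of the 50 neighbors has an equal vote, so a large k smooths out noisy records at the cost of blurring class boundaries.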

3.3.2. Convolutional Neural Network (CNN)

CNN is a common method for both regression and classification problems, and it is well suited to modeling systems with non-linear relationships, so elaborate mathematical formulas are not required to describe such systems [34,35]. The convolutional neural network (CNN) is a network structure made up of units called neurons that can process information. However, the structure must be trained before it can establish the relationship between inputs and outputs [36]. In this regard, it resembles the information processing of the human brain [37]. After training, such networks can deliver rapid and accurate results even on datasets with noise and missing data. If the system is complicated, the data required for training may be larger than that needed by simpler techniques, and approximate data preparation may be necessary [38]. The number of hidden layers in a CNN can be increased in proportion to the size of the network, which can result in the network memorizing the dataset, including its noise [39]. To avoid this issue and cut down on training time, it is critical to determine the ideal network size.

DL is a common approach for solving complex problems with huge datasets [40,41]. It allows for supervised, semi-supervised, and unsupervised learning [42]. Unlike traditional machine learning algorithms, DL can perform feature extraction on its own, even when the raw dataset is used as input, and it gives better results when a larger dataset is used [43]. The success of DL can be attributed to its neural network topology, which consists of many hidden layers [44].

The CNN in this work is a feed-forward neural network trained with the back-propagation algorithm. Its structure holds 200,977 parameters from 4 one-dimensional convolution layers, 4 dense layers, 1 max-pooling layer, and 1 flatten layer, with Adam as the optimizer and ReLU as the activation function. Trained for 200 epochs, the CNN gives its best results in the rRMSE and MAPE values.
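To make the layer types named above concrete, the following NumPy sketch runs one untrained forward pass through a single one-dimensional convolution, a ReLU activation, a max-pooling step, a flatten step, and one dense output. It is not the 200,977-parameter network described in the text: the kernel width of 3, the 8 filters, the pool size of 2, and the random weights are all assumptions chosen only for illustration.

```python
import numpy as np

def conv1d(x, kernels, bias):
    """Valid 1D convolution: x is (length, in_ch), kernels is (width, in_ch, out_ch)."""
    width, _, out_ch = kernels.shape
    out_len = x.shape[0] - width + 1
    out = np.empty((out_len, out_ch))
    for i in range(out_len):
        window = x[i:i + width]  # slice of shape (width, in_ch)
        out[i] = np.tensordot(window, kernels, axes=([0, 1], [0, 1])) + bias
    return out

def relu(z):
    """Rectified linear unit, applied element-wise."""
    return np.maximum(z, 0.0)

def max_pool1d(x, size=2):
    """Non-overlapping max pooling along the length axis."""
    usable = (x.shape[0] // size) * size
    return x[:usable].reshape(-1, size, x.shape[1]).max(axis=1)

rng = np.random.default_rng(0)
sample = rng.normal(size=(14, 1))       # one record: 14 features, 1 channel
kernels = rng.normal(size=(3, 1, 8))    # hypothetical: width 3, 8 filters
bias = np.zeros(8)

features = max_pool1d(relu(conv1d(sample, kernels, bias)))  # shape (6, 8)
flat = features.reshape(-1)                                 # flatten to 48 values
w_dense = rng.normal(size=(flat.size, 1))                   # untrained dense weights
ghi_estimate = flat @ w_dense                               # single untrained output
```

In a trained network, back-propagation with an optimizer such as Adam would adjust `kernels`, `bias`, and `w_dense`; here they stay random, so the output has no predictive meaning.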

3.3.3. Support Vector Machine

The SVM was initially designed to solve two-class problems but has since been applied effectively to multi-class classification [45,46]. Based on statistical learning theory [47], it is a supervised parametric machine learning technique. In the SVM method, a hyperplane is constructed between the data of different classes to divide them [48]. In two-dimensional space, the structure that separates the classes is a line, while in 3D space it is a plane [49]. The support vectors are the data points nearest to the hyperplane. The method becomes more resistant to noise as the distance between the data points of different classes increases [50]. In this research, an SVM classifier was used with a linear kernel, 100 maximum iterations, no random state or class weights, a tolerance of 0.0001, and 14 features. On the given data, the SVM gives its best result in the MAPE value.

3.3.4. Logistic Regression (LR)

The LR method is used when the dependent variable is categorical, and in that case it is the most appropriate analysis model. Logistic regression is a predictive analysis, like all regression analyses [51]. LR is a statistical method for describing and explaining the relationship between one binary dependent variable and one or more nominal, ordinal, interval, or ratio-level independent variables. It can also be used for multi-class evaluation, calculating the probability of a certain class or event [52]. In this research, logistic regression is run with random state = 0, solver = “lbfgs” (limited-memory Broyden–Fletcher–Goldfarb–Shanno), multi_class = “ovr”, penalty = “l2”, 100 maximum iterations, and 14 features. LR gives excellent results in the R², rRMSE, and MAPE values.

3.3.5. Random Forest Classifier (RFC)

The RFC is a combination of many decision tree models. Each decision tree is built using two separate processes: sampling and complete splitting. The first stage consists of two random sampling procedures, one over the rows and one over the columns of the input data. Row sampling is done with replacement, so the sampled datasets may contain repeated samples. A decision tree is then built from the sampled data using the complete split method [53]. No pruning is needed because the two random sampling steps guarantee the randomness of the process. As the decision tree size grows, the random forest method produces no over-fitting [54]. While growing the trees, the random forest adds further randomness to the model: when splitting a node, it searches for the best feature within a random subset of features rather than the most important feature overall. This produces a wide diversity of trees, which generally leads to a better model. It also calculates a score for each feature and scales the results so that the total importance is equal to one. In this research, RFC is run with 100 estimators, max_depth = 2, no random state, and 14 features. RFC gives its better results in the t-stat and MAPE values.

3.3.6. Support Vector Classifier (SVC)

The SVC is used for pattern classification problems in which the input data are provided sequentially rather than in batches. For multi-class problems, SVC employs a one-vs-one strategy, whereas linear SVC employs a one-vs-rest technique. The SVC fits the supplied data and returns the best-fit hyperplane that divides, or categorizes, the data. Once the hyperplane has been obtained, features can be passed to the classifier to obtain the predicted class. Generally, the features are stored in an array that maps the dependent and independent variables, and this array is fed into the SVC classifier to obtain the required results [55]. In this research, SVC is run with kernel = rbf, gamma = auto_deprecated, class_weight = None, decision_function_shape = ovr, and random_state = None. SVC gives excellent results in all metrics except t-stat.
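The hyperparameters reported for the SVM, LR, RFC, and SVC models in Sections 3.3.3 to 3.3.6 resemble scikit-learn estimator arguments, so one plausible configuration is sketched below. Several choices are assumptions rather than facts from the paper: `LinearSVC` is used for the linear-kernel SVM because its default tolerance matches the reported 0.0001; the one-vs-rest logistic regression is expressed with `OneVsRestClassifier` instead of the `multi_class` argument, which recent scikit-learn releases deprecate; `gamma="auto"` replaces the legacy `auto_deprecated` value, which newer releases no longer accept; and the data and class labels are random placeholders.

```python
import warnings

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC, LinearSVC

rng = np.random.default_rng(0)
X_train = rng.normal(size=(182, 14))      # placeholder training features
y_train = rng.integers(0, 5, size=182)    # hypothetical binned GHI classes
X_test = rng.normal(size=(79, 14))        # placeholder test features

models = {
    # Linear SVM: 100 iterations, tolerance 1e-4, no random state or class weights.
    "SVM": LinearSVC(max_iter=100, tol=1e-4, random_state=None, class_weight=None),
    # One-vs-rest logistic regression with lbfgs; the L2 penalty is the default.
    "LR": OneVsRestClassifier(
        LogisticRegression(solver="lbfgs", max_iter=100, random_state=0)
    ),
    # Random forest: 100 shallow trees of depth 2.
    "RFC": RandomForestClassifier(n_estimators=100, max_depth=2, random_state=None),
    # RBF-kernel SVC with a one-vs-rest shaped decision function.
    "SVC": SVC(kernel="rbf", gamma="auto", class_weight=None,
               decision_function_shape="ovr", random_state=None),
}

predictions = {}
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # 100 iterations may not converge on random data
    for name, model in models.items():
        predictions[name] = model.fit(X_train, y_train).predict(X_test)
```

Keeping the estimators in one dictionary makes it straightforward to fit every model on the same training data and score each set of predictions with the same metrics.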

3.4. Evaluation Criterion

The accuracy of the prediction algorithms is undoubtedly the most important parameter in assessing their performance. Commonly used error metrics are therefore employed to evaluate and compare the results of the prediction models. The metrics used in this study are the determination coefficient (R²), root mean square error (RMSE), relative root mean square error (rRMSE), mean bias error (MBE), mean absolute bias error (MABE), mean absolute percentage error (MAPE), and T-statistic (t-stat). They are defined in Equations (1)–(7) with descriptions. Here, $a_{n}$ and $b_{n}$ denote the actual and predicted values, respectively, $\bar{b_{n}}$ is the mean of the predicted values, and k is the total number of samples.

MBE = \frac{1}{k} \sum_{n = 1}^{k} (a_{n} - b_{n})

(1)

MBE is a crucial indicator of the long-term performance of prediction models. The closer the MBE is to zero, the better the model performs, and zero denotes the ideal case [9].

RMSE = \sqrt{\frac{1}{k} \sum_{n = 1}^{k} {(a_{n} - b_{n})}^{2}}

(2)

RMSE measures how well prediction models perform in the short term. It is always non-negative, and values near zero are desirable [56].

rRMSE = \frac{\sqrt{\frac{1}{k} \sum_{n = 1}^{k} {(a_{n} - b_{n})}^{2}}}{\bar{b_{n}}} \times 100

(3)

The rRMSE is calculated from the RMSE and the average value of the measured data. A low rRMSE value indicates that a prediction model performs better [27].

t\text{-stat} = \sqrt{\frac{(k - 1) {MBE}^{2}}{{RMSE}^{2} - {MBE}^{2}}}

(4)

The t-stat is used to determine whether a model’s predictions are statistically significant. The smaller the t-stat value, the better the prediction model performs. In this approach [57,58], t-critical values are obtained from statistical tables. The goal is to get as near to 0 as possible.

MABE = \frac{1}{k} \sum_{n = 1}^{k} | a_{n} - b_{n} |

(5)

MABE is a measure of a correlation’s quality. The goal is to get as near to 0 as possible. MABE informs us about the prediction models’ long-term performance [59,60].

R^{2} = 1 - \frac{\sum {(b_{n} - a_{n})}^{2}}{\sum {(b_{n} - \bar{b_{n}})}^{2}}

(6)

R² indicates how well a technique can predict a set of measured data. It ranges from 0 to 1, and values near 1 indicate better performance [61].

MAPE = \frac{1}{k} \sum_{n = 1}^{k} \left| \frac{b_{n} - a_{n}}{b_{n}} \right| \times 100

(7)

MAPE is the average ratio of the absolute prediction error to the corresponding dataset value, expressed as a percentage. The smaller the value of MAPE, the better the model’s performance [56,62].
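Equations (1)–(7) can be evaluated together in a few lines of NumPy, as in the sketch below. The function name `evaluation_metrics` and the four-point example arrays are illustrative assumptions. The code follows the equations exactly as printed, so rRMSE and R² are normalized with the predicted values $b_{n}$ and their mean; more conventional definitions normalize with the measured values instead.

```python
import numpy as np

def evaluation_metrics(a, b):
    """Compute the seven metrics of Equations (1)-(7); a = actual, b = predicted."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    k = a.size
    err = a - b
    mbe = err.mean()                                    # Eq. (1)
    rmse = np.sqrt((err ** 2).mean())                   # Eq. (2)
    rrmse = rmse / b.mean() * 100.0                     # Eq. (3)
    # Eq. (4); undefined when RMSE^2 equals MBE^2 (every error identical).
    t_stat = np.sqrt((k - 1) * mbe ** 2 / (rmse ** 2 - mbe ** 2))
    mabe = np.abs(err).mean()                           # Eq. (5)
    r2 = 1.0 - ((b - a) ** 2).sum() / ((b - b.mean()) ** 2).sum()  # Eq. (6)
    mape = np.abs((b - a) / b).mean() * 100.0           # Eq. (7)
    return {"MBE": mbe, "RMSE": rmse, "rRMSE": rrmse, "t-stat": t_stat,
            "MABE": mabe, "R2": r2, "MAPE": mape}

# Small made-up example, not taken from the Najran dataset.
actual = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 11.0, 15.0, 17.0]
metrics = evaluation_metrics(actual, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because MBE keeps the sign of each error, positive and negative errors can cancel, which is why it is read alongside MABE and RMSE rather than on its own.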

4. Results and Discussions

The current paper uses six different machine learning algorithms to forecast the daily solar radiation falling on a horizontal surface in Najran province, Saudi Arabia. The GHI was calculated and predicted in Python in a Jupyter notebook, using the libraries numpy, TensorFlow, statistics, math, random, seaborn, and sklearn. The evaluation metrics MBE, MABE, MAPE, R², RMSE, rRMSE, and t-stat are used for the calculations, as shown in Table 3. The R² varies from 0.63 to 0.99 depending on the location and methodology used. In other words, all algorithms perform well in terms of R² when predicting daily global solar radiation. In Table 3, error values, including t-stat, nearer to zero indicate excellent performance on the given dataset. For R², a value nearer to 1 indicates an exceptional result.

Figure 5 and Figure 6 show the daily solar radiation predictions and the error rate, respectively, for Najran province obtained with the KNN algorithm. The R² value is 0.99, the highest among the applied algorithms. The MAPE of 12.8% is also meaningful, as seen in Table 3. In terms of MBE, the KNN algorithm shows negative values for Najran province. In other words, the KNN prediction results for solar radiation are the highest among the methods.

Figure 7 and Figure 8 show the daily solar radiation predictions and the error rate, respectively, for Najran province obtained with the CNN (DL) algorithm. The R² value is 0.63, the lowest among the applied algorithms. The MAPE of 7.26% is also significant, as seen in Table 3. In terms of MBE, the DL algorithm shows negative values for Najran province. In other words, the DL prediction results for solar radiation are the lowest among the methods.

The daily solar radiation predictions and the error rate for Najran province obtained with the logistic regression (LR) algorithm are shown in Figure 9 and Figure 10, respectively. The R² value is 0.99 and the t-stat is 0.23. The MAPE of 0.43% is also significant, as seen in Table 3. In terms of MBE, the LR algorithm shows positive values for Najran province.

The daily solar radiation predictions and the error rate for Najran province obtained with the SVM algorithm are shown in Figure 11 and Figure 12, respectively. The R² value is 0.96 and the t-stat is 0.26. The MAPE of 9.00% is also important, as seen in Table 3. In terms of MBE, the SVM algorithm shows negative values for Najran province.

Figure 13 and Figure 14 show the daily solar radiation predictions and the error rate, respectively, for Najran province obtained with the random forest classifier. The R² value is 0.98, the third highest among the applied algorithms. The MAPE of 0.09% is also significant, as seen in Table 3. In terms of MBE, the RFC classifier shows positive values for Najran province.

The daily solar radiation predictions and the error rate for Najran province obtained with the SVC classifier are shown in Figure 15 and Figure 16, respectively. The R² value is 0.95 and the t-stat is 0.31. The MAPE of 0.18% is also significant, as seen in Table 3. In terms of MBE, the SVC classifier shows positive values for Najran province.

Table 4 shows a statistical comparison of the current study’s metric results with previously published literature research for the GHI using various algorithms.

SVC performs worse in the RMSE (MJ/m²) value than its counterpart models. This happens when data points fall in the margin area: SVC classifies them as errors, so it has higher error values at certain points;
Owing to the relative memory efficiency of SVC, it shows excellent performance in the MBE (MJ/m²), rRMSE (%), MABE (MJ/m²), R², and MAPE (%) values compared with existing approaches.

Table 4 shows that the number of preferred measures for measuring algorithm prediction success is often limited. Furthermore, the algorithms utilized in most prior studies for the prediction of global solar radiation came from the same categories. As a result, statistical measurements have often shown similar outcomes for different algorithms.

For the MBE value, only ANN [11,74] and the proposed SVC report it;
For the RMSE value, all algorithms [11,63,64,65,66,67,68,69,70,71,72,73,74] and the proposed SVC calculate it;
For the rRMSE value, SVM [63,65], SVR [67], MEA-ANN [68], ANN-ARX [70], KELM [71], WT-SVM [73], ANN [74], and the proposed SVC take it into consideration;
For the t-stat value, only ANN [74] and the proposed SVC show results;
For the MABE value, only SVM [65], SVR [67], ANN-ARX [70], KELM [71], ANN [74], and the proposed SVC consider it;
For the R² value, ANN [64], SVM [65], MLP [66], SVR [67], MEA-ANN [68], ANN [11,69,74], ANN-ARX [70], KELM [71], and the proposed SVC consider it;
For the MAPE value, SVM [65], SVR [67], ANN-ARX [70], WT-SVM [73], ANN [74], and the proposed SVC calculate it.

In this study, the proposed SVC method produces the best prediction results, as shown in Table 4. Geographical advantages, the diversity of the dataset, the absence of missing data, and other factors may have contributed to this result. Moreover, accurate prediction of solar radiation at ground level generates a large profit for energy suppliers in the market, so accurate short-term radiation prediction is very important for selling solar energy commercially and increasing the revenue margin. Finally, the current work reports that all the methods examined can be used to predict GHI accurately; however, the SVC technique is the most suitable of them, giving the most precise results on the evaluation metrics.

5. Conclusions

This study evaluates the performance of six machine learning algorithms (CNN, KNN, SVM, LR, RFC, and SVC) for the daily prediction of solar radiation. A tensor is built from measured variables and their uncertainties, covering air temperature, wind, peak wind, relative humidity, and barometric pressure, to predict solar radiation in Najran province, Saudi Arabia. The evaluation metrics MBE, MABE, MAPE, R², RMSE, rRMSE, and t-stat are used to assess the models. The following conclusions can be drawn from this study. Across the six algorithms, the highest values obtained were R² 0.99, RMSE 37.5, rRMSE 19.5, MBE 1.27, MABE 15.1, t-stat 4.91, and MAPE 12.8, and the lowest values were R² 0.63, RMSE 6.96, rRMSE 2.63, MBE −2.68, MABE 0.58, t-stat 0.03, and MAPE 0.09, as reported in Table 3. In terms of R², all algorithms except CNN (0.63) perform well for Najran province, with values of 0.96 or higher. The MBE values are evenly split in sign: KNN, CNN, and SVM give negative values, whereas LR, RFC, and SVC give positive values. The MABE values range from 0.58 to 15.1 MJ/m², and SVC performs best with 0.58. The RMSE values vary from 6.96 to 37.5 MJ/m²; the lowest error is achieved by SVC and the highest by KNN. The MAPE of SVC is 0.18%, the second lowest after RFC (0.09%). In comparison with the other methods, the error rate of the SVC algorithm is quite low, especially for observations in which solar radiation is lower. That is why SVC shows the best overall results compared with the existing techniques.
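
For readers who wish to reproduce the seven evaluation metrics named above, the following is a minimal sketch in plain Python. It assumes the definitions commonly used in the solar radiation literature: R² as 1 − SS_res/SS_tot, rRMSE as RMSE divided by the mean observation, and t-stat as sqrt((n − 1)·MBE² / (RMSE² − MBE²)). These may differ in detail from the formulas used in this study. The function name `evaluate` and the sample values are hypothetical.

```python
import math


def evaluate(y_true, y_pred):
    """Compute MBE, MABE, RMSE, rRMSE, MAPE, R2, and t-stat.

    Assumes at least two observations, no zero observations (for MAPE),
    and observations that are not all identical (for R2).
    """
    n = len(y_true)
    if n < 2 or n != len(y_pred):
        raise ValueError("need two equally long sequences with n >= 2")

    errors = [p - t for t, p in zip(y_true, y_pred)]  # predicted minus observed
    mean_true = sum(y_true) / n

    mbe = sum(errors) / n                      # mean bias error
    mabe = sum(abs(e) for e in errors) / n     # mean absolute bias error
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)                      # root mean square error
    rrmse = 100.0 * rmse / mean_true           # relative RMSE in percent
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n

    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - (mse * n) / ss_tot              # coefficient of determination

    # RMSE^2 - MBE^2 equals the variance of the errors, so it is never negative.
    spread = mse - mbe ** 2
    if spread > 0:
        t_stat = math.sqrt((n - 1) * mbe ** 2 / spread)
    else:
        t_stat = 0.0 if mbe == 0 else math.inf

    return {"MBE": mbe, "MABE": mabe, "RMSE": rmse, "rRMSE": rrmse,
            "MAPE": mape, "R2": r2, "t-stat": t_stat}


# Hypothetical daily values in MJ/m², chosen only to exercise the function.
observed = [20.0, 25.0, 30.0, 22.0]
predicted = [21.0, 24.0, 31.0, 23.0]
metrics = evaluate(observed, predicted)
for name, value in metrics.items():
    print(f"{name:7s} {value:.4f}")
```

A smaller t-stat indicates that the bias is small relative to the scatter of the errors, so it complements MBE and RMSE rather than duplicating them.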

Funding

The author is thankful to the Deanship of Scientific Research at Najran University for funding this work under the General Research Funding program grant code (NU/-/SERC/10/655).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data could be shared upon request.

Conflicts of Interest

The author declares no conflict of interest.

References

Khanlari, A.; Sözen, A.; Afshari, F.; Şirin, C.; Tuncer, A.D.; Gungor, A. Drying municipal sewage sludge with v-groove triple-pass and quadruple-pass solar air heaters along with testing of a solar absorber drying chamber. Sci. Total Environ. 2020, 709, 136198.
Khanlari, A.; Sözen, A.; Şirin, C.; Tuncer, A.D.; Gungor, A. Performance enhancement of a greenhouse dryer: Analysis of a cost-effective alternative solar air heater. J. Clean. Prod. 2020, 251, 119672.
Available online: https://freedomsolarpower.com/blog/7-uses-of-solar-energy (accessed on 30 August 2021).
Available online: https://www.tableau.com/learn/articles/time-series-forecasting (accessed on 31 August 2021).
Yıldırım, H.B.; Çelik, Ö.; Teke, A.; Barutcu, B. Estimating daily Global solar radiation with graphical user interface in Eastern Mediterranean region of Turkey. Renew. Sustain. Energy Rev. 2018, 82, 1528–1537.
Malik, H.; Garg, S. Long-term solar irradiance forecast using artificial neural network: Application for performance prediction of Indian cities: Applications of Artificial Intelligence Techniques in Engineering. In Advances in Intelligent Systems and Computing; Springer: Berlin/Heidelberg, Germany, 2019; pp. 285–293.
Dong, N.; Chang, J.-F.; Wu, A.-G.; Gao, Z.-K. A novel convolutional neural network framework based solar irradiance prediction method. Int. J. Electr. Power Energy Syst. 2020, 114, 105411.
Gürel, A.E.; Ağbulut, Ü.; Biçen, Y. Assessment of machine learning, time series, response surface methodology and empirical models in prediction of global solar radiation. J. Clean. Prod. 2020, 277, 122353.
Fan, J.; Wang, X.; Wu, L.; Zhang, F.; Bai, H.; Lu, X.; Xiang, Y. New combined models for estimating daily global solar radiation based on sunshine duration in humid regions: A case study in South China. Energy Convers. Manag. 2018, 156, 618–625.
Kisi, O.; Parmar, K.S. Application of least square support vector machine and multivariate adaptive regression spline models in long term prediction of river water pollution. J. Hydrol. 2016, 534, 104–112.
Jahani, B.; Mohammadi, B. A comparison between the application of empirical and ANN methods for estimation of daily global solar radiation in Iran. Theor. Appl. Climatol. 2019, 137, 1257–1269.
Jiang, Y. Prediction of monthly mean daily diffuse solar radiation using artificial neural networks and comparison with other empirical models. Energy Policy 2008, 36, 3833–3837.
Huang, C.; Zhao, Z.; Wang, L.; Zhang, Z.; Luo, X. Point and interval forecasting of solar irradiance with an active Gaussian process. IET Renew. Power Gener. 2020, 14, 1020–1030.
Long, H.; Zhang, Z.; Su, Y. Analysis of daily solar power prediction with data-driven approaches. Appl. Energy 2014, 126, 29–37.
Quej, V.H.; Almorox, J.; Arnaldo, J.A.; Saito, L. ANFIS, SVM and ANN soft-computing techniques to estimate daily global solar radiation in a warm sub-humid environment. J. Atmos. Sol.-Terr. Phys. 2017, 155, 62–70.
Marzo, A.; Trigo-Gonzalez, M.; Alonso-Montesinos, J.; Martínez-Durbán, M.; López, G.; Ferrada, P.; Fuentealba, E.; Cortés, M.; Batlles, F. Daily global solar radiation estimation in desert areas using daily extreme temperatures and extraterrestrial radiation. Renew. Energy 2017, 113, 303–311.
Knowles, R. Ritual House: Drawing on Nature’s Rhythms for Architecture and Urban Design; Island Press: Washington, DC, USA, 2006.
Mehdizadeh, S.; Behmanesh, J.; Khalili, K. Comparison of artificial intelligence methods and empirical equations to estimate daily solar radiation. J. Atmos. Sol.-Terr. Phys. 2016, 146, 215–227.
Tymvios, F.; Jacovides, C.; Michaelides, S.; Scouteli, C. Comparative study of Ångström’s and artificial neural networks’ methodologies in estimating global solar radiation. Sol. Energy 2005, 78, 752–762.
Meenal, R.; Selvakumar, A.I. Assessment of SVM, empirical and ANN based solar radiation prediction models with most influencing input parameters. Renew. Energy 2018, 121, 324–343.
Samadianfard, S.; Majnooni-Heris, A.; Qasem, S.N.; Kisi, O.; Shamshirband, S.; Chau, K.-W. Daily global solar radiation modeling using data-driven techniques and empirical equations in a semi-arid climate. Eng. Appl. Comput. Fluid Mech. 2019, 13, 142–157.
Marzouq, M.; Bounoua, Z.; El Fadili, H.; Mechaqrane, A.; Zenkouar, K.; Lakhliai, Z. New daily global solar irradiation estimation model based on automatic selection of input parameters using evolutionary artificial neural networks. J. Clean. Prod. 2019, 209, 1105–1118.
Available online: https://worldpopulationreview.com/world-cities/najran-population (accessed on 25 August 2021).
Global Solar Atlas. Najran. Available online: https://globalsolaratlas.info/map?c=18.242395,45.686646,8&r=SAU:SAU.12_1 (accessed on 25 August 2021).
King Abdullah City for Atomic and Renewable Energy. Available online: https://www.energy.gov.sa/ar/pages/default.aspx (accessed on 25 August 2021).
Copernicus Atmosphere Monitoring Service. CAMS McClear Service for Irradiation under Clear-Sky. Available online: https://www.soda-pro.com/web-services/radiation/cams-mcclear (accessed on 25 August 2021).
Chen, H.-L.; Huang, C.-C.; Yu, X.-G.; Xu, X.; Sun, X.; Wang, G.; Wang, S.-J. An efficient diagnosis system for detection of Parkinson’s disease using fuzzy k-nearest neighbor approach. Expert Syst. Appl. 2013, 40, 263–271.
Hu, L.-Y.; Huang, M.-W.; Ke, S.-W.; Tsai, C.-F. The distance function effect on k-nearest neighbor classification for medical datasets. SpringerPlus 2016, 5, 1304.
Zhang, S.; Li, X.; Zong, M.; Zhu, X.; Wang, R. Efficient kNN classification with different numbers of nearest neighbors. IEEE Trans. Neural Netw. Learn. Syst. 2017, 29, 1774–1785.
Rodrigues, É.O. Combining Minkowski and Cheyshev: New distance proposal and survey of distance metrics using k-nearest neighbours classifier. Pattern Recognit. Lett. 2018, 110, 66–71.
Myhre, J.N.; Mikalsen, K.Ø.; Løkse, S.; Jenssen, R. Robust clustering using a kNN mode seeking ensemble. Pattern Recognit. 2018, 76, 491–505.
Maillo, J.; Luengo, J.; García, S.; Herrera, F.; Triguero, I. Exact Fuzzy k-Nearest Neighbor Classification for Big Datasets; IEEE: Piscataway, NJ, USA, 2017; pp. 1–6.
Saikia, J.; Yin, S.; Jiang, Z.; Seok, M.; Seo, J.-S. K-Nearest Neighbor Hardware Accelerator Using in-Memory Computing SRAM; IEEE: Piscataway, NJ, USA, 2019; pp. 1–6.
Methaprayoon, K.; Yingvivatanapong, C.; Lee, W.-J.; Liao, J.R. An integration of ANN wind power estimation into unit commitment considering the forecasting uncertainty. IEEE Trans. Ind. Appl. 2007, 43, 1441–1448.
Mabel, M.C.; Fernandez, E. Analysis of wind power generation and prediction using ANN: A case study. Renew. Energy 2008, 33, 986–992.
Palani, S.; Liong, S.-Y.; Tkalich, P. An ANN application for water quality forecasting. Mar. Pollut. Bull. 2008, 56, 1586–1597.
Behrang, M.; Assareh, E.; Ghanbarzadeh, A.; Noghrehabadi, A. The potential of different artificial neural network (ANN) techniques in daily global solar radiation modeling based on meteorological data. Sol. Energy 2010, 84, 1468–1480.
Azlah, M.A.F.; Chua, L.S.; Rahmad, F.R.; Abdullah, F.I.; Wan Alwi, S.R. Review on techniques for plant leaf classification and recognition. Computers 2019, 8, 77.
Ramil, A.; López, A.; Pozo-Antonio, J.; Rivas, T. A computer vision system for identification of granite-forming minerals based on RGB data and artificial neural networks. Measurement 2018, 117, 90–95.
Najafabadi, M.M.; Villanustre, F.; Khoshgoftaar, T.M.; Seliya, N.; Wald, R.; Muharemagic, E. Deep learning applications and challenges in big data analytics. J. Big Data 2015, 2, 1.
Wang, C.; Gong, L.; Yu, Q.; Li, X.; Xie, Y.; Zhou, X. DLAU: A scalable deep learning accelerator unit on FPGA. IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 2016, 36, 513–517.
Bharati, A.; Singh, R.; Vatsa, M.; Bowyer, K.W. Detecting facial retouching using supervised deep learning. IEEE Trans. Inf. Forensics Secur. 2016, 11, 1903–1913.
Lin, Y.-Z.; Nie, Z.-H.; Ma, H.-W. Structural damage detection with automatic feature-extraction through deep learning. Comput.-Aided Civ. Infrastruct. Eng. 2017, 32, 1025–1046.
Hua, Y.; Guo, J.; Zhao, H. Deep Belief Networks and Deep; IEEE: Piscataway, NJ, USA, 2015; pp. 1–4.
Zendehboudi, A.; Baseer, M.; Saidur, R. Application of support vector machine models for forecasting solar and wind energy resources: A review. J. Clean. Prod. 2018, 199, 272–285.
Onel, M.; Kieslich, C.A.; Guzman, Y.A.; Floudas, C.A.; Pistikopoulos, E.N. Big data approach to batch process monitoring: Simultaneous fault detection and diagnosis using nonlinear support vector machine-based feature selection. Comput. Chem. Eng. 2018, 115, 46–63.
Park, H.; Kim, N.; Lee, J. Parametric models and non-parametric machine learning models for predicting option prices: Empirical comparison study over KOSPI 200 Index options. Expert Syst. Appl. 2014, 41, 5227–5237.
Kim, S.; Mun, B.M.; Bae, S.J. Data depth based support vector machines for predicting corporate bankruptcy. Appl. Intell. 2018, 48, 791–804.
Yoon, M.; Yun, Y.; Nakayama, H. A Role of Total Margin in Support Vector Machines; IEEE: Piscataway, NJ, USA, 2003; pp. 2049–2053.
Birzhandi, P.; Kim, K.T.; Lee, B.; Youn, H.Y. Reduction of training data using parallel hyperplane for support vector machine. Appl. Artif. Intell. 2019, 33, 497–516.
Lou, S.; Li, D.H.; Lam, J.C.; Chan, W.W. Prediction of diffuse solar irradiance using machine learning and multivariable regression. Appl. Energy 2016, 181, 367–374.
Jagadeesh, V.; Venkata Subbaiah, K.; Varanasi, J. Forecasting the probability of solar power output using logistic regression algorithm. J. Stat. Manag. Syst. 2020, 23, 1–16.
Criminisi, A.; Shotton, J.; Konukoglu, E. Decision forests: A unified framework for classification, regression, density estimation, manifold learning and semi-supervised learning. Found. Trends® Comput. Graph. Vis. 2012, 7, 81–227.
Liu, J.; Cao, M.; Bai, D.; Zhang, R. Solar Radiation Prediction Based on Random Forest of Feature-Extraction; IOP Publishing: Bristol, UK, 2019; p. 012006.
Lau, K.; Wu, Q. Online training of support vector classifier. Pattern Recognit. 2003, 36, 1913–1920.
Zang, H.; Cheng, L.; Ding, T.; Cheung, K.W.; Wang, M.; Wei, Z.; Sun, G. Application of functional deep belief network for estimating daily global solar radiation: A case study in China. Energy 2020, 191, 116502.
Bakirci, K. Correlations for estimation of daily global solar radiation with hours of bright sunshine in Turkey. Energy 2009, 34, 485–501.
Fan, J.; Wu, L.; Zhang, F.; Cai, H.; Ma, X.; Bai, H. Evaluation and development of empirical models for estimating daily and monthly mean daily diffuse horizontal solar radiation for different climatic regions of China. Renew. Sustain. Energy Rev. 2019, 105, 168–186.
Rehman, S. Solar radiation over Saudi Arabia and comparisons with empirical models. Energy 1998, 23, 1077–1082.
Yang, L.; Cao, Q.; Yu, Y.; Liu, Y. Comparison of daily diffuse radiation models in regions of China without solar radiation measurement. Energy 2020, 191, 116571.
Gouda, S.G.; Hussein, Z.; Luo, S.; Yuan, Q. Model selection for accurate daily global solar radiation prediction in China. J. Clean. Prod. 2019, 221, 132–144.
Ceylan, İ.; Gürel, A.E.; Ergün, A. The mathematical modeling of concentrated photovoltaic module temperature. Int. J. Hydrogen Energy 2017, 42, 19641–19653.
Chen, J.-L.; Li, G.-S.; Wu, S.-J. Assessing the potential of support vector machine for estimating daily solar radiation using sunshine duration. Energy Convers. Manag. 2013, 75, 311–318.
Moreno, A.; Gilabert, M.; Martínez, B. Mapping daily global solar irradiation over Spain: A comparative study of selected approaches. Sol. Energy 2011, 85, 2072–2084.
Mohammadi, K.; Shamshirband, S.; Tong, C.W.; Arif, M.; Petković, D.; Ch, S. A new hybrid support vector machine–wavelet transform approach for estimation of horizontal global solar radiation. Energy Convers. Manag. 2015, 92, 162–171.
Wang, L.; Kisi, O.; Zounemat-Kermani, M.; Salazar, G.A.; Zhu, Z.; Gong, W. Solar radiation prediction using different techniques: Model evaluation and comparison. Renew. Sustain. Energy Rev. 2016, 61, 384–397.
Mohammadi, K.; Shamshirband, S.; Anisi, M.H.; Alam, K.A.; Petković, D. Support vector regression based prediction of global solar radiation on a horizontal surface. Energy Convers. Manag. 2015, 91, 433–441.
Feng, Y.; Gong, D.; Zhang, Q.; Jiang, S.; Zhao, L.; Cui, N. Evaluation of temperature-based machine learning and empirical models for predicting daily global solar radiation. Energy Convers. Manag. 2019, 198, 111780.
Antonopoulos, V.Z.; Papamichail, D.M.; Aschonitis, V.G.; Antonopoulos, A.V. Solar radiation estimation methods using ANN and empirical models. Comput. Electron. Agric. 2019, 160, 160–167.
Shamshirband, S.; Mohammadi, K.; Piri, J.; Petković, D.; Karim, A. Hybrid auto-regressive neural network model for estimating global solar radiation in Bandar Abbas, Iran. Environ. Earth Sci. 2016, 75, 172.
Shamshirband, S.; Mohammadi, K.; Chen, H.-L.; Samy, G.N.; Petković, D.; Ma, C. Daily global solar radiation prediction from air temperatures using kernel extreme learning machine: A case study for Iran. J. Atmos. Sol.-Terr. Phys. 2015, 134, 109–117.
Baser, F.; Demirhan, H. A fuzzy regression with support vector machine approach to the estimation of horizontal global solar radiation. Energy 2017, 123, 229–240.
Deo, R.C.; Wen, X.; Qi, F. A wavelet-coupled support vector machine model for forecasting global incident solar radiation using limited meteorological dataset. Appl. Energy 2016, 168, 568–593.
Ağbulut, Ü.; Gürel, A.E.; Biçen, Y. Prediction of daily global solar radiation using different machine learning algorithms: Evaluation and comparison. Renew. Sustain. Energy Rev. 2021, 135, 110114.

Figure 1. Solar energy applications in home appliances.

Figure 2. Geographical representation of Najran province for global solar radiations.

Figure 3. Research workflow to predict GHI Values.

Figure 4. Hierarchy of machine learning techniques used in this work.

Figure 5. Graphical representation of KNN predictions.

Figure 6. KNN error frequency rate.

Figure 7. Graphical representation of CNN (DL) predictions.

Figure 8. CNN (DL) error frequency rate.

Figure 9. Graphical representation of logistic regression predictions.

Figure 10. Logistic regression error frequency rate.

Figure 11. Graphical representation of SVM predictions.

Figure 12. SVM error frequency rate.

Figure 13. Graphical representation of random forest classifier predictions.

Figure 14. Random forest classifier error frequency rate.

Figure 15. Graphical representation of SVC predictions.

Figure 16. SVC error frequency rate.

Table 1. Dataset description.

No	Feature Name	Description	Status
1	Site	Najran province of Saudi Arabia	NA
2	Latitude	17.5656° N	NA
3	Longitude	44.2289° E	NA
4	Date	August 2017 to August 2018	NA
5	Air Temperature (°C)	The measure of how hot or cold the air is	✓
6	Air Temperature Uncertainty (°C)	The difference in the temperature in a day	✓
7	Wind Direction at 3 m (°N)	The direction of the air in degrees	✓
8	Wind Direction at 3 m Uncertainty (°N)	Uncertainty in direction of the air	✓
9	Wind Speed at 3 m (m/s)	The speed of the air measured at a height of 3 m	✓
10	Wind Speed at 3 m Uncertainty (m/s)	The difference in the wind speed throughout the day	✓
11	Wind Speed at 3 m (std dev) (m/s)	Wind speed standard deviation to intervals	NA
12	Wind Speed at 3 m (std dev) Uncertainty (m/s)	Difference of wind speed standard deviation to intervals	NA
13	DHI (Wh/m²)	Diffused Horizontal Irradiance is the amount of solar energy that does not arrive from the sun in a direct path.	NA
14	DHI Uncertainty (Wh/m²)	Diffused Horizontal Irradiance uncertainty is the difference in the amount of solar energy that does not arrive from the sun in a direct path.	NA
15	Standard Deviation DHI (Wh/m²)	Dispersion of Diffused Horizontal Irradiance over the globe	NA
16	DNI (Wh/m²)	The amount of solar energy received by a surface per unit area	NA
17	DNI Uncertainty (Wh/m²)	Difference of the amount of solar energy received by a surface per unit area	NA
18	Standard Deviation DNI (Wh/m²)	Dispersion of the amount of solar energy received by a surface per unit area	NA
19	GHI (Wh/m²)	The total amount of solar radiation that strikes a horizontal surface	✓
20	GHI Uncertainty (Wh/m²)	The difference in the total amount of solar radiation that strikes a horizontal surface	NA
21	Standard Deviation GHI (Wh/m²)	Dispersion of the total amount of solar radiation that strikes a horizontal surface	NA
22	Peak Wind Speed at 3 m (m/s)	The maximum speed of air	✓
23	Peak Wind Speed at 3 m Uncertainty (m/s)	The difference in wind speed	✓
24	Relative Humidity (%)	The amount of water vapor present in the air	✓
25	Relative Humidity Uncertainty (%)	The difference in the amount of water vapor present in the air	✓
26	Barometric Pressure (mB (hPa equiv))	The pressure of the atmosphere	✓
27	Barometric Pressure Uncertainty (mB (hPa equiv))	The difference in the pressure of the atmosphere	✓

Table 2. Details of the dataset features.

	AT	ATU	WD	WDU	WS	WSU	GHI	PWS	PWSU	RH	RHU	BP	BPU
Count	275	275	275	275	275	275	275	275	275	275	275	275	275
Mean	29.0	0.5	137.1	3.9	2.2	0.02	7106.5	9.2	0.08	20.9	3.0	880.3	4.40
std	4.6	0.0	127.5	0.4	0.5	0.04	824.1	2.1	0.03	10.9	0.0	3.2	0.008
Min	18.5	0.5	0.0	0.0	0.6	0.00	3809.0	5.1	0.0	9.0	3.0	873.4	4.40
25%	25.2	0.5	35.5	4.0	1.9	0.00	6595.5	7.7	0.10	13.8	3.0	877.6	4.40
50%	30.2	0.5	88.0	4.0	2.2	0.00	7274	9.1	0.10	17.8	3.0	880.4	4.40
75%	33.3	0.5	305.0	4.0	2.6	0.00	7692	10.7	0.10	24.9	3.0	882.6	4.40
Max	36.0	0.5	359.0	4.0	4.2	0.10	8600	20.5	0.10	66.4	3.0	888.5	4.50

Air Temperature (AT), Air Temperature Uncertainty (ATU), Wind Direction (WD) at 3 m, Wind Direction at 3 m Uncertainty (WDU), Wind Speed (WS), Wind Speed at 3 m Uncertainty (WSU), Peak Wind Speed (PWS), Peak Wind Speed at 3 m Uncertainty (PWSU), Relative Humidity (RH), Relative Humidity Uncertainty (RHU), Barometric Pressure (BP), Barometric Pressure Uncertainty (BPU).
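
Table 2 reports GHI in Wh/m², whereas the error metrics in Tables 3 and 4 are stated in MJ/m². The following minimal sketch converts the GHI summary statistics of Table 2 between the two units using the exact relation 1 Wh = 3600 J = 0.0036 MJ. The names `GHI_WH_PER_M2` and `wh_to_mj` are illustrative only, and rounding to two decimals is a presentation choice made here.

```python
# Exact unit relation: 1 Wh = 3600 J = 0.0036 MJ.
WH_TO_MJ = 3600.0 / 1_000_000.0

# GHI column of Table 2, in Wh/m².
GHI_WH_PER_M2 = {
    "mean": 7106.5,
    "std": 824.1,
    "min": 3809.0,
    "25%": 6595.5,
    "50%": 7274.0,
    "75%": 7692.0,
    "max": 8600.0,
}


def wh_to_mj(value_wh_per_m2):
    """Convert an energy density from Wh/m² to MJ/m²."""
    return value_wh_per_m2 * WH_TO_MJ


# The conversion is a pure rescaling, so it applies to the standard
# deviation as well as to the location statistics.
ghi_mj = {name: round(wh_to_mj(value), 2) for name, value in GHI_WH_PER_M2.items()}

for name, value in ghi_mj.items():
    print(f"{name:5s} {GHI_WH_PER_M2[name]:8.1f} Wh/m²  =  {value:6.2f} MJ/m²")
```

Expressing the observations and the errors in the same unit is a useful sanity check before interpreting RMSE or MABE against the magnitude of the measured GHI.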

Table 3. Statistical metrics evaluation for all the algorithms in Najran province.

Sr#	Metrics	KNN	CNN (DL)	LR	SVM	RFC	SVC
1	R²	0.99	0.63	0.99	0.96	0.98	0.99
2	RMSE (MJ/m²)	37.5	9.20	34.1	14.5	12.5	6.96
3	rRMSE (%)	19.5	3.11	5.81	11.5	11.1	2.63
4	MBE (MJ/m²)	−0.83	−2.68	0.48	−2.00	1.27	0.13
5	MABE (MJ/m²)	15.1	2.72	2.69	8.57	7.25	0.58
6	t-stat	0.03	4.91	0.23	0.26	0.16	0.31
7	MAPE (%)	12.8	7.26	0.43	9.00	0.09	0.18
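
To read Table 3 at a glance, the following minimal sketch picks the best-scoring algorithm for each metric. The scoring convention is an assumption made here, not one stated in the table: higher is better for R², the MBE closest to zero is best, and lower is better for every other metric. The names `TABLE_3`, `score`, and `best_algorithms` are illustrative.

```python
ALGORITHMS = ["KNN", "CNN (DL)", "LR", "SVM", "RFC", "SVC"]

# Table 3 transcribed row by row, in the column order of ALGORITHMS.
TABLE_3 = {
    "R2": [0.99, 0.63, 0.99, 0.96, 0.98, 0.99],
    "RMSE": [37.5, 9.20, 34.1, 14.5, 12.5, 6.96],
    "rRMSE": [19.5, 3.11, 5.81, 11.5, 11.1, 2.63],
    "MBE": [-0.83, -2.68, 0.48, -2.00, 1.27, 0.13],
    "MABE": [15.1, 2.72, 2.69, 8.57, 7.25, 0.58],
    "t-stat": [0.03, 4.91, 0.23, 0.26, 0.16, 0.31],
    "MAPE": [12.8, 7.26, 0.43, 9.00, 0.09, 0.18],
}


def score(metric, value):
    """Map a metric value to a 'lower is better' score (assumed convention)."""
    if metric == "R2":
        return -value       # higher R2 is better
    if metric == "MBE":
        return abs(value)   # bias closest to zero is better
    return value            # error-type metrics: lower is better


def best_algorithms(table):
    """Return {metric: [algorithms tied for the best score]}."""
    winners = {}
    for metric, values in table.items():
        scores = [score(metric, v) for v in values]
        best = min(scores)
        winners[metric] = [a for a, s in zip(ALGORITHMS, scores) if s == best]
    return winners


winners = best_algorithms(TABLE_3)
for metric, names in winners.items():
    print(f"{metric:7s} best: {', '.join(names)}")
```

Under this convention SVC leads on RMSE, rRMSE, MBE, and MABE and ties for the top R², while KNN has the lowest t-stat and RFC the lowest MAPE.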

Table 4. Comparison of the statistical metrics for GHI with existing methods.

Citation	Models	MBE (MJ/m²)	RMSE (MJ/m²)	rRMSE (%)	t-stat	MABE (MJ/m²)	R²	MAPE (%)
[63]	SVM	-	1.77	13.17	-	-	-	-
[64]	ANN	-	3.17	-	-	-	0.88	-
[65]	SVM	-	1.57	7.92	-	0.85	0.91	6.88
[66]	MLP	-	1.95	-	-	-	0.87	-
[67]	SVR	-	2.001	9.4	-	1.30	0.92	10.42
[68]	MEA-ANN	-	2.91	19.6	-	-	0.89	-
[11]	ANN	0.39	1.9	-	-	-	0.93	-
[69]	ANN	-	3.51	-	-	-	0.89	-
[70]	ANN-ARX	-	1.81	9.91	-	1.54	0.88	9.53
[71]	KELM	-	2.21	11.3	-	1.51	0.83	-
[72]	FRF-SVM	-	1.66	-	-	-	-	-
[73]	WT-SVM	-	2.43	12.8	-	-	-	12.64
[74]	ANN	0.18	2.14	14.2	1.29	1.6	0.93	11.93
Proposed	SVC	0.13	6.96	2.63	0.31	0.58	0.99	0.18

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
