Nowcasting Hourly-Averaged Tilt Angles of Acceptance for Solar Collector Applications Using Machine Learning Models

Challenges in utilising fossil fuels for generating energy call for the adoption of renewable energy sources. This study focuses on modelling and nowcasting the optimum tilt angle of acceptance for solar energy harnessing, using historical time series data collected from one of South Africa's radiometric stations, the USAid Venda station in Limpopo Province. We compared random forest (RF), K-nearest neighbours (KNN), and long short-term memory (LSTM) models in nowcasting the optimum tilt angle, with gradient boosting (GB) used as the benchmark model for comparing predictive accuracy. The performance measures mean absolute error (MAE), mean square error (MSE), root mean square error (RMSE), and R² were used, and the results showed that the LSTM performed best in nowcasting the optimum tilt angle, followed by the RF and GB, whereas the KNN was the worst-performing model.


Introduction

Background
The constant increase in global energy consumption has led to a worsening shortage of primary energy resources such as fossil fuels, coal, and natural gas, with South Africa experiencing electricity load shedding since 2007 due to energy generation and maintenance difficulties [1]. These difficulties, together with rising fossil fuel prices, have spurred developments in renewable energy such as forecasting, which improves the operational performance of solar energy systems and supports the scheduling of solar power maintenance [2][3][4]. Solar renewable energy has been growing rapidly, with total installed PV capacity globally reaching over 67.4 gigawatts by the end of 2011 and 627 gigawatts by the end of 2019, and studies indicate it will keep increasing while fossil fuel use decreases [2].
It has been estimated that the Earth receives solar irradiance of around 1000 W/m², which is estimated to generate around 85,000 TW overall [5]. However, according to the Renewables Global Status Report, solar energy contributed less than one per cent of the world's daily energy demand between 2006 and 2007, rising to over 27% on average by 2019 and 2020 [6]. Performance is determined by the type of semiconductor used to build solar panels, which convert only 15% to 40% of sunlight into electricity, and the number of solar systems is limited by their high cost. To increase PV performance without increasing cost, some studies have focused on estimating the solar panel tilt angles for optimal irradiation and the solar collector's acceptance angle to maximise the amount of solar radiation received [5].
Plenty of studies have focused on forecasting the behaviour of solar irradiance using different techniques [7][8][9][10][11]. An accurate estimate of solar irradiance production and of the optimum solar collector angle of acceptance can increase the stability of solar power performance: predicting future solar power output gives operators insight into the amount of power that can be gained, and with the optimum angle set for the solar collectors, the optimal or required solar power can be obtained, enabling operators to manage their power systems effectively and make more economical decisions. In this study, we focus on modelling and nowcasting the optimum tilt angle of solar energy acceptance using machine learning models and datasets from the Southern African Universities Radiometric Network (SAURAN) USAid Venda station in Limpopo Province [12].

Related Work and Review
Solar collector tilt angles have not been extensively studied with the aim of maximising the continuous solar irradiance received by collectors [13]. Work on forecasting angles of acceptance for solar energy has been reported by Jamil et al. [14], who estimated both solar radiation and optimum tilt angles in Aligarh city using a dataset observed at the Heat Transfer and Solar Energy Laboratory, Department of Mechanical Engineering, Aligarh Muslim University, to estimate the monthly average solar radiation; monthly, seasonal, and annual optimum tilt angles were also estimated. Kim et al. [5] proposed a solar panel tilt angle optimisation model using machine learning algorithms, with the objective of finding a tilt angle that maximises the solar irradiance on PV systems. They made use of linear regression (LR), random forest, SVM, gradient boosting (GB), and the least absolute shrinkage and selection operator (LASSO) on a solar power generation dataset from 22 PV modules involving factors such as weather, dust level, and aerosol level. GB was the best-performing model for forecasting monthly/yearly panel tilt, predicting the best tilt angles with respect to the amount of solar power generated. Another study was carried out by Swarnavo et al. [15], who focused on optimising the tilt angle of solar panels using artificial intelligence and a solar tracking method. An analysis of the optimisation of tilt angles on monthly measurements in Kolkata, India (Lat. 22.5667° N, Lon. 88.3667° E), was made using a genetic algorithm, proving that a significant power gain could be obtained by finding the optimal tilt angle of acceptance. A study by Ramaneti et al. [16] proposed a solution for efficient solar power by tracking the sun's position relative to the Earth and finding the tilt angle of the solar panel using a deep learning method; the proposed method could predict the sun's tilt and orientation angle, increasing solar power by 10.6%. Chih-Chiang Wei [17] used forecasting models such as the multilayer perceptron (MLP), random forests (RF), k-nearest neighbours (KNN), and linear regression (LR) with a satellite remote-sensing dataset to estimate surface solar radiation on an hourly basis and the solar irradiance received by solar panels at different tilt angles in Tainan, Southern Taiwan. The study showed good performance for the MLP compared to the RF and KNN, while the LR performed worst.
Khan et al. [18] studied data accumulated from 16 solar panels with different combinations of tilt angle and direction and used a stacking ensemble learning technique to propose a machine learning model built from XGBoost, CatBoost, and random forest. Sahin [19] used mathematical methods to determine optimum tilt angles with an artificial neural network within the boundaries of Turkey; the study was conducted on fixed solar panel systems and resulted in a 34% increase in the energy obtained from them. Hailu and Fung [20] predicted the optimum tilt angle and orientation of solar panels for capturing maximum solar irradiance using isotropic and anisotropic models. All the models led to the inference that the solar panel tilt should be changed four times a year to receive maximum solar irradiance and that panels should be set up with an orientation west or east of due south with a flatter tilt angle. Ramli et al. [21] studied the performance of SVM and ANN in predicting solar radiation at tilt angles of 16° and 37.5° for PV panels in two different locations, using a dataset accumulated over 360 days. The error measures used were the RMSE and the coefficient of correlation (CC), and the SVM and ANN models were compared based on training speed; the former showed better performance than the latter. Verma and Patil [22] developed a neural network model for predicting solar irradiation using a dataset composed of satellite images in the visible and infrared bands, altitude, longitude, latitude, month, day, time, solar zenith angle, and solar azimuth angle accumulated from Solcast. Benghanem [23] investigated the optimal tilt angle for solar panels to increase solar irradiance in Madinah, Saudi Arabia, focusing on the yearly and monthly average tilt angles and using daily measurements of global and diffuse solar radiation on a horizontal surface with isotropic and anisotropic models. He found that more energy is collected using the monthly average optimum tilt angle than the yearly average. Sharma et al. [24] compared forecasting models for power generation using data from the National Weather Service (NWS) and solar intensity readings from a weather station deployment with machine learning models; the SVM was found to be more accurate than linear least squares. Jawaid and NazirJunejo [25] performed a comparative analysis of an ANN and standard regression algorithms in forecasting solar energy using weather and time-of-year attributes, with and without azimuth and zenith angles; the ANN outperformed all the standard regression models (LR, SVM, and KNN). Božiková et al. [26] investigated the effects of changing a photovoltaic system's tilt angle and azimuth angle on energy production, analysing both angles mathematically to identify the optimal position for photovoltaic system installation in the southern Slovakia regions. It was deduced that both angles significantly influence the total energy balance of the PV system and that it is very important to know the optimal tilt angle and azimuth angle orientation for the area in which the PV system is installed. As a final and most recent example, Kim and Byun [27] sought to improve electricity production by working with solar panels installed in five regions, where the direction and tilt angles of the installation site were used to predict the optimum electricity production with a machine learning XGBoost regression model. A summary of some previous studies is given in Table 1.

Conclusion from the Related Work
Few studies have predicted or estimated the angle of acceptance for solar energy using machine learning and deep learning. We believe more studies using machine learning and deep learning models for forecasting or nowcasting optimum angles of acceptance should be carried out, as the angle of acceptance plays a valuable role in optimising the amount of solar energy absorbed. Our study differs from previous studies in that we focus on modelling and nowcasting tilt angles of acceptance for solar energy using machine learning and deep learning models, namely long short-term memory (LSTM), random forest (RF), and K-nearest neighbours (KNN), benchmarked against a gradient boosting (GB) model, using datasets collected from the Southern African Universities Radiometric Network (SAURAN) website. The models were chosen based on their performance in previous related studies, which concluded that models such as GB, RF, and KNN perform well. Despite this, gaps remain in the research: there are not enough studies on their use, and some lack sufficient performance metrics to determine their effectiveness. KNN and RF performances appear inconsistent, whereas the GB outperforms most other models. Given the superiority of the LSTM on time series problems and its absence from many related studies, this is a good opportunity to evaluate its performance against the other methods. The rest of the work is arranged as follows. Section 2 explains the nature and source of the data used, including the methods and models, followed by the analysis. Section 3 presents the results, while Section 4 provides our discussion. Lastly, in Section 5, we present the conclusion and possible future work.

Data
In this study, we used a minute-averaged solar irradiance dataset, accessed on 13 August 2021 from the USAid Venda station (https://sauran.ac.za/) and recorded using a pyranometer. The station is situated in South Africa in the province of Limpopo, at latitude −23.13100052 and longitude 30.42399979, at an elevation of 628 m [12]. Figure 1

Data Preparation
The collected data were first converted from minute averages to hourly averages, and the hours from 07:00 p.m. to 04:00 a.m. were removed, leaving the models to train on data collected during sunlight hours, from 05:00 a.m. to 06:00 p.m., especially during summertime. Tilt angles greater than 92 degrees were also removed to avoid model inaccuracy, because such angles occur when sunlight is usually unavailable, particularly in winter, when it is already dark at 05:00 a.m. and 06:00 p.m. Further preprocessing dealt with missing data points by removing rows with any missing values, since there were few missing points and the dataset was large enough.
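As a sketch of this preprocessing, assuming minute-averaged records indexed by timestamp (the column name `tilt_angle` and the synthetic values are illustrative, not the paper's data), the hourly aggregation and filtering could look like:

```python
import numpy as np
import pandas as pd

# Hypothetical minute-averaged series: two days of 1-minute tilt-angle readings.
idx = pd.date_range("2019-01-28 00:00", periods=2 * 24 * 60, freq="min")
rng = np.random.default_rng(0)
df = pd.DataFrame({"tilt_angle": rng.uniform(5, 110, size=len(idx))}, index=idx)

# 1) Aggregate minute-averaged records into hourly averages.
hourly = df.resample("h").mean()

# 2) Keep only the daylight hours, 05:00 to 18:00 inclusive.
hourly = hourly[(hourly.index.hour >= 5) & (hourly.index.hour <= 18)]

# 3) Discard tilt angles above 92 degrees and drop rows with missing values.
hourly = hourly[hourly["tilt_angle"] <= 92].dropna()
```

The same resample-then-filter pattern extends to the other measured variables by adding columns to the frame.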
In order to obtain variables with a good relation to the target variable, the least absolute shrinkage and selection operator (LASSO) regression method was used to perform variable selection, because it is more interpretable than other methods. Performing LASSO regression increases the model's accuracy and helps reduce the system's memory use and training time. The cost function of LASSO regression is given by Equation (1):

$$\arg\min_{\beta}\left\{\frac{1}{N}\sum_{i=1}^{N}\left(y_i-\beta_0-\sum_{j}\beta_j x_{ij}\right)^2+\lambda\sum_{j}\left|\beta_j\right|\right\} \quad (1)$$

The error is calculated against the actual y_i for every i-th value, β_0 is a constant coefficient, and N is the total number of training examples in the dataset. β_j represents the weight of the j-th feature, x_ij the j-th feature of the i-th example, and λ the regularisation strength.
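A minimal LASSO variable-selection sketch on synthetic data (the features, target, and alpha value are illustrative; scikit-learn's `alpha` plays the role of λ in Equation (1)):

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
X = rng.normal(size=(500, 6))
# The target depends only on the first three features; the rest are noise.
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.5 * X[:, 2] + rng.normal(scale=0.1, size=500)

# Standardise so the L1 penalty treats all features on the same scale.
X_std = StandardScaler().fit_transform(X)
lasso = Lasso(alpha=0.1).fit(X_std, y)

# Features whose coefficients are shrunk exactly to zero are discarded.
selected = [j for j, beta in enumerate(lasso.coef_) if beta != 0.0]
```

On this toy data, the three informative features survive the L1 penalty while the noise features receive exactly zero coefficients, which is the selection behaviour used in the paper.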

Models

Random Forest
A random forest (RF) is a supervised machine learning technique that utilises ensemble learning to solve regression and classification problems. Leo Breiman and Adele Cutler trademarked it in 2006 as an extension of the model of Tin Kam Ho, who created it in 1995 [28]. It generates multiple decision trees during training by randomly sampling the training data set with replacement (bagging), which adds diversity and reduces the correlation among the ensembled decision trees. The predictions vary depending on the type of problem being solved: for regression, the average over all the ensembled decision trees is taken as the prediction, whereas for classification, the majority vote over the predicted classes is taken as the predicted class. The main steps of RF are as follows:
1. Start with a given set of training data X = x_1, ..., x_n with responses Y = y_1, ..., y_n.
2. Draw a random sample (X_a, Y_a) containing n training examples, with replacement, N times, where a = 1, ..., N.
3. For each random sample (X_a, Y_a), fit and train a regression tree f_a.
4. The final regression prediction f for an unseen sample x is made by averaging the predictions of all the individual regression trees on x, as in Equation (2):

$$\hat{f}(x)=\frac{1}{N}\sum_{a=1}^{N}f_a(x) \quad (2)$$
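The averaging in Equation (2) can be checked directly with scikit-learn: the forest's prediction equals the mean of the individual trees' predictions. The data and hyperparameters below are illustrative only (the paper's grid search used 430 trees of depth 8):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(300, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + rng.normal(scale=0.05, size=300)

# A small forest for illustration; each tree is fitted on a bootstrap sample.
rf = RandomForestRegressor(n_estimators=50, max_depth=8, random_state=0).fit(X, y)

x_new = np.array([[2.0, 3.0]])
# Equation (2): the forest prediction is the average over the individual trees.
per_tree = np.array([tree.predict(x_new)[0] for tree in rf.estimators_])
forest_pred = rf.predict(x_new)[0]
```

`per_tree.mean()` and `forest_pred` agree to floating-point precision, which is exactly the averaging step described above.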

K-Nearest Neighbours
K-nearest neighbour (KNN) is one of the simplest supervised machine learning models and can be used for both regression and classification problems. It is a non-parametric, instance-based algorithm: rather than learning an explicit model from the training data, it stores the data and defers computation until prediction time. It follows the concept of finding the nearest neighbours, i.e., the points closest to the query. Given a training dataset with corresponding labels (X, y), KNN involves the following steps to learn a function h : X → y.
1. Select k and the weighting method.
2. Train the model, which simply stores the training data for later predictions.
3. Calculate the distance from a query example to the labelled examples using a distance measure such as the Minkowski, Hamming, or Euclidean distance.
4. Sort the calculated distances in ascending order.
5. Heuristically find the optimal number k of nearest neighbours, for example by minimising the root mean square error.
6. Finally, vote for the most frequent label in the case of a classification problem, or take the average of the labels for a regression problem.
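The steps above, including the RMSE-based choice of k in step 5, can be sketched with scikit-learn on synthetic data (the data, the range of k, and the split are illustrative assumptions):

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(1)
X = rng.uniform(0, 10, size=(400, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=400)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)

# Step 5: pick the k that minimises the RMSE on held-out data.
rmse = {}
for k in range(1, 16):
    # p=2 selects the Euclidean special case of the Minkowski distance.
    knn = KNeighborsRegressor(n_neighbors=k, p=2).fit(X_tr, y_tr)
    rmse[k] = mean_squared_error(y_te, knn.predict(X_te)) ** 0.5
best_k = min(rmse, key=rmse.get)
```

Averaging the labels of the `best_k` nearest neighbours (step 6) is what `KNeighborsRegressor.predict` does internally for regression.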

Long Short-Term Memory
Long short-term memory (LSTM) is an artificial neural network technique introduced by Hochreiter and Schmidhuber in 1995 and refined and popularised by many researchers [29]. Unlike a standard recurrent neural network (RNN), each layer in the LSTM can be described in four steps, namely the forget gate, the input gate, the update gate (cell state), and the cell output gate, as described below and in the diagram in Figure 2.

1. Forget gate: the model decides what information should be discarded or kept during training; Equation (3) represents the forget gate, with W_f and U_f not dependent on time.
2. Input gate: the input gate in Equation (4) decides what information from the current step is relevant to add, using the sigmoid activation function, and produces the output as a hidden state represented in Equation (5); its weight and bias are likewise not time-dependent.
3. Update gate (cell state): takes the output from the input gate and performs a pointwise addition, with the hyperbolic tangent as the activation function as shown in Equation (6), updating the cell state to the new values that the network finds relevant:

$$\tilde{c}_t = \tanh(x_t U_d + h_{t-1} W_d). \quad (6)$$

4. Cell output gate: as shown in Equations (7) and (8), the cell output state uses the sigmoid function to decide which information should go to the next hidden state.
In Equations (3)-(8), σ is the sigmoid function, tanh is the hyperbolic tangent, x_t is the input vector, h_t is the output vector, c_t is the cell-state vector, U and W represent weights, and f_t, i_t, and o_t are the gates of the block. The following steps are followed when employing an LSTM:

1. Prepare the data and divide them into training and testing sets.
2. Specify the time step and reshape the dataset.
3. Define the network, specifying the input and hidden layers together with the activation functions and dropout rate.
4. Compile the network by specifying the optimiser, loss function, and metrics.
5. Fit the network, where a backpropagation algorithm trains the network according to the compilation, with the number of epochs and batch size specified.
6. Evaluate the network's performance on the fitted dataset.
7. Make predictions: following training and evaluation, predictions are made on the new dataset.
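The gate computations described above can be sketched as a single NumPy LSTM cell step. This follows the common LSTM formulation; the weight shapes, the omitted biases, and the standard cell-state update c_t = f_t ⊙ c_{t−1} + i_t ⊙ c̃_t are assumptions made for illustration, not the paper's exact implementation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, U, W):
    # One LSTM cell step; U maps the input to each gate and W maps the
    # previous hidden state to each gate (biases omitted for brevity).
    f_t = sigmoid(x_t @ U["f"] + h_prev @ W["f"])        # forget gate
    i_t = sigmoid(x_t @ U["i"] + h_prev @ W["i"])        # input gate
    c_tilde = np.tanh(x_t @ U["d"] + h_prev @ W["d"])    # candidate state, Eq. (6)
    c_t = f_t * c_prev + i_t * c_tilde                   # cell-state update
    o_t = sigmoid(x_t @ U["o"] + h_prev @ W["o"])        # output gate
    h_t = o_t * np.tanh(c_t)                             # new hidden state
    return h_t, c_t

rng = np.random.default_rng(0)
n_in, n_hid = 3, 4
U = {g: rng.normal(size=(n_in, n_hid)) for g in "fido"}
W = {g: rng.normal(size=(n_hid, n_hid)) for g in "fido"}
h, c = np.zeros(n_hid), np.zeros(n_hid)
for t in range(5):                     # run a short random input sequence
    h, c = lstm_step(rng.normal(size=n_in), h, c, U, W)
```

Because h_t is the product of a sigmoid output and a tanh, every component of the hidden state stays strictly inside (−1, 1), which is why the inputs to an LSTM are usually scaled before training.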

Gradient Boosting (GB)
Gradient boosting is an ensemble machine learning technique that can be used for both regression and classification problems. The objective is to minimise the error by combining the best possible next model with the previous models, where the gradients of the error between the observed targets and the predictions determine the outcomes each new model is trained on. GB proceeds through the steps below, which Figure 3 summarises.

1. Given prepared data X and y, a base model m_1 is employed to predict y.
2. Find the pseudo residuals from the observed values and the model-1 predictions, S = y − y_1.
3. Employ a new model m_2 using the pseudo residuals as the target variable and the same input variables.
4. Predict the pseudo residuals S_p and add them to the previous predictions, R = S_p + y_1.
5. Calculate the pseudo residuals again using the observed data y and the values R, and repeat these steps until the residual sum becomes constant or over-fitting starts.
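The residual-fitting loop above can be sketched with shallow decision trees; the base model, depth, learning rate, and synthetic data are illustrative assumptions, and a fixed iteration count stands in for the stopping rule of step 5:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(300, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.05, size=300)

# Step 1: base model m_1 -- here simply the mean of the target.
pred = np.full_like(y, y.mean())
lr = 0.1  # shrinkage applied to each correction
for _ in range(100):
    residuals = y - pred                                          # step 2
    tree = DecisionTreeRegressor(max_depth=2).fit(X, residuals)   # step 3
    pred = pred + lr * tree.predict(X)                            # step 4

mse_start = np.mean((y - y.mean()) ** 2)   # error of the base model
mse_final = np.mean((y - pred) ** 2)       # error after boosting
```

Each pass fits a weak learner to what the current ensemble still gets wrong, so the training error shrinks monotonically toward the noise floor; libraries such as scikit-learn's `GradientBoostingRegressor` implement the same idea with proper stopping criteria.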

Performance Measures
The performance of the models was measured using the mean absolute error (MAE), mean square error (MSE), root mean square error (RMSE), and R², where ȳ_i represents the predicted value, y_i the actual value, ȳ the mean of all the values, and m the number of data points during the test period:

$$\mathrm{MAE}=\frac{1}{m}\sum_{i=1}^{m}\left|y_i-\bar{y}_i\right|$$

$$\mathrm{MSE}=\frac{1}{m}\sum_{i=1}^{m}\left(y_i-\bar{y}_i\right)^2$$

$$\mathrm{RMSE}=\sqrt{\frac{1}{m}\sum_{i=1}^{m}\left(y_i-\bar{y}_i\right)^2}$$

$$R^2=1-\frac{\sum_{i=1}^{m}\left(y_i-\bar{y}_i\right)^2}{\sum_{i=1}^{m}\left(y_i-\bar{y}\right)^2}$$

Modelling Flowchart
Figure 4 shows a flowchart of the proposed models. To nowcast solar irradiance acceptance angles, the following modelling stages are followed. The first stage is to collect the dataset from the source, followed by data preprocessing to obtain a proper dataset for training the models. The preprocessed data are then split at the best ratio for training and testing, after which the dataset is fitted to the models with hyper-parameters selected using the grid search method. The last stage is to evaluate the performance of the models and to re-model in order to improve their accuracy.
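The four metrics can be computed directly with NumPy; the small arrays below are illustrative values in degrees, not results from the study:

```python
import numpy as np

def mae(y, yhat):
    # Mean absolute error.
    return np.mean(np.abs(y - yhat))

def mse(y, yhat):
    # Mean squared error.
    return np.mean((y - yhat) ** 2)

def rmse(y, yhat):
    # Root mean square error.
    return np.sqrt(mse(y, yhat))

def r_squared(y, yhat):
    # Coefficient of determination: 1 minus residual over total sum of squares.
    return 1.0 - np.sum((y - yhat) ** 2) / np.sum((y - np.mean(y)) ** 2)

y = np.array([50.0, 60.0, 70.0, 80.0])      # observed tilt angles (degrees)
yhat = np.array([52.0, 58.0, 71.0, 79.0])   # nowcast tilt angles (degrees)
# mae -> 1.5, mse -> 2.5, rmse -> ~1.581, r_squared -> 0.98
```

Note that the MAE, MSE, and RMSE carry the unit (degrees) of the target variable, as in Tables 4 and 5, while R² is dimensionless.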

Results
The research was conducted on a computer with 16 GB of RAM and a 2.59 GHz processor running Windows 10, using the Python programming language with Keras and TensorFlow, together with Scikit-learn, Matplotlib, pandas, and NumPy.

Feature Selections
Before modelling, it is necessary to prepare and analyse the data to ensure the models' accuracy. Table 2 helps identify the variables with significant effects; variables with zero coefficient values are not significant. Figure 5 visualises the significance of the independent variables in nowcasting the tilt angles of acceptance. The selection was made over 15 independent variables: variables with long bars are more important than variables with short bars, while variables without bars are unimportant because their coefficients are equal to zero. Variables such as rain total, wind speed, wind vector magnitude, year, and month were therefore discarded, as they do not significantly affect the accuracy of the models. The models were thus trained on a dataset composed of temperature, relative humidity, wind direction, wind direction StdDev, wind speed max, barometric pressure, logger temperature, week, day, and hour, as shown in Figure 5, where negative bars indicate a negative relationship with the tilt angle and positive bars a positive relationship.

Exploratory Data Analysis
Table 3 below shows the summary statistics of the dependent feature (tilt angle) recorded over the proposed period. The table shows that the smallest angle measured is 5.1 degrees, whereas the maximum angle is 92.0 degrees. The dependent feature is further analysed by visualising its probability density plot with a kernel density estimate (KDE) fitted in Figure 6b. Table 3 also shows that the mean tilt angle is around 53 degrees. From Figure 6b, it can be inferred that the density plot is slightly skewed to the right, since the median is smaller than the mean. Furthermore, the standard deviation is 22.6, and the lower and upper quartiles are visualised in Figure 6a through the box plot. A visualisation of the tilt angles is depicted in Figure 7, which shows the seasonal changes in tilt angle over the three years from 2017 to 2019, with the smallest angles recorded in summer and the highest in winter. The tilt acceptance angle is generally higher during winter, with the smallest angle exceeding 45 degrees, compared to 5.1 degrees during summer. It can be deduced that the morning acceptance angle is high and becomes smaller as midday approaches: at midday, when the maximum GHI is received, the tilt angle of acceptance is at its minimum, and the tilt angle then increases from midday to sunset, reaching its highest point at sunset. The monthly records in Figure 8b show that the tilt angles are mostly high during winter and low during summer. Figure 9 shows the average amount of GHI achieved hourly over the 14 h from 05:00 a.m. to 06:00 p.m., with an accurate tilt angle of acceptance being predicted.

Modelling and Results
A summary of the evaluation metrics for the LSTM, RF, KNN, and the benchmark GB model is tabulated in Tables 4 and 5. Table 4 presents the models' results after training, and Table 5 the results on the testing data used for evaluation.
Figure 10 presents the probability density plots, with kernel density estimates fitted, for the observed testing data and the nowcasts of the KNN, RF, LSTM, and GB models. From the comparison plot in panel (e) of Figure 10, it can be inferred that the KNN aligns worst with the observed testing data, followed by the RF and GB, which align better than the KNN, while the LSTM shows the best alignment. Figures 11-14 present the prediction results for the KNN, RF, LSTM, and the benchmark GB model, and Figure 15 compares their performances. The predictions were made on a small, randomly selected sequence of the testing dataset: panel (a) of each figure shows predictions for three days, from 28 January 2019 to 30 January 2019, and panel (b) shows predictions for 31 days, each day running from 05:00 a.m. to 06:00 p.m. and excluding tilt angles higher than 92 degrees. Figure 15 concludes by comparing the nowcast plots of the proposed models, from which it is deduced that all the models showed good results, with the KNN performing worst compared to the RF, GB, and LSTM.

Discussion
To obtain the best accuracy, analyses of all the models were conducted several times with different split ratios, and the best results were obtained with a split of 70% for training and 30% for testing. The grid search method was utilised for hyperparameter optimisation of the RF, KNN, and GB: the analysis was performed with the number of trees set to 430 for the RF and the maximum decision tree depth set to 8, while 525 estimators were used in GB training, with the rest of the parameters left at their defaults. The KNN was likewise configured according to the grid search results, with the number of neighbours set to eight and the power parameter set to two, and all other parameters left at their defaults. The LSTM was implemented with an input layer and one hidden layer with a dropout rate of 0.2, each layer using the rectified linear activation function. During modelling, different numbers of epochs and early-stopping settings were tried in search of the best performance; the best configuration used 30,000 epochs with an early-stopping patience of 3500, which led training to finish earlier, after 19,284 epochs.
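A scaled-down sketch of the grid search described above, using synthetic data and a much smaller grid than the paper's (which settled on 430 trees of depth 8 for the RF); `GridSearchCV` evaluates every parameter combination by cross-validation:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, train_test_split

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 3))
y = X[:, 0] + np.sin(X[:, 1]) + rng.normal(scale=0.1, size=200)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Each grid point is scored by 3-fold cross-validated RMSE on the training set.
grid = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [4, 8]},
    scoring="neg_root_mean_squared_error",
    cv=3,
)
grid.fit(X_tr, y_tr)
best = grid.best_params_   # the winning combination, refit on all training data
```

The paper's larger grids for the RF, KNN, and GB follow the same pattern, only with more candidate values per parameter.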
As shown in the summary tables of the evaluation metrics, all the models performed well. The KNN had the worst performance by a very large margin, followed by the RF and GB, which gave almost identical results on all performance measures. It can be deduced that the RF and GB performed well because of their advantage in handling multivariate datasets, since they utilise multiple trees for analysis; furthermore, the RF and GB have more parameters to tune for performance optimisation than the KNN, which has few. Overall, the LSTM outperformed all the other models, making it the best-performing model in this study. The LSTM is well known for its good performance on time series data because it predicts future values from previous sequential data with the help of time steps, and its large number of tunable hyperparameters, which allows operators to tune it for accuracy improvements, also played a role in its outstanding performance. The RF, GB, and KNN all achieved good nowcast accuracies as well, and the RF and GB proved reliable, being consistent with other studies such as [5,17]. The probability densities plotted in Figure 10 using the testing data with kernel density estimates show that the density plot of the testing data is bimodal and that, in general, the predicted kernel density estimates of the models tend to align well with it.
Based on the predicted density plots, it is also apparent that the KNN gives the kernel density estimate that aligns worst with the observed testing results. The RF, which comes second, shows visible misalignment at the peaks of the kernel density plot, while the LSTM, as expected, proves more accurate by aligning properly with the observed testing data. Figures 11-15 further show through the nowcast alignments that the LSTM aligns best, followed by the RF, ahead of the GB by a very small margin, with the KNN fourth.

Conclusions
This study presented a comparative analysis of the RF, KNN, and LSTM in nowcasting the optimum tilt angle of acceptance for short-term global horizontal irradiance, benchmarking them against the GB to compare their accuracies. Performance measures such as the MAE, MSE, RMSE, and R-squared were utilised. The optimal results were achieved by performing data preparation and variable selection using the LASSO regression method. The LSTM proved the most reliable, showing the best accuracy in nowcasting the tilt angle of acceptance and being consistent with other studies, followed by the RF and GB, with the KNN the worst. Possible future work should search for further machine learning models with the best accuracy in nowcasting angles of acceptance for solar energy.

Figure 5 .
Figure 5. Selection of variables using LASSO regression.

Figure 6 .
Figure 6. The (a) boxplot and (b) density plot visualising the tilt angles measured over three years.

Figure 8
Figure 8 shows the hourly (a) and monthly (b) tilt angle changes over three years.

Figure 8 .
Figure 8. Tilt angles visualised (a) hourly and (b) monthly for three years.

Figure 9 .
Figure 9. GHI recorded in W/m² for three years, hourly averaged.

Figure 10 .
Figure 10. Probability density plots for the (a) KNN, (b) RF, (c) LSTM, and (d) GB predictions and their comparison in (e), all plotted with the hourly observed testing data.

Figure 11 .
Figure 11. KNN prediction results for a random selection of (a) 3 days and (b) 31 days of the observed dataset.

Figure 12 .
Figure 12. RF prediction results for a random selection of (a) 3 days and (b) 31 days of the observed dataset.

Figure 13 .
Figure 13. LSTM nowcast results for a random selection of (a) 3 days and (b) 31 days of the testing dataset.

Figure 14 .
Figure 14. GB prediction results for a random selection of (a) 3 days and (b) 31 days of the observed dataset.

Figure 15 .
Figure 15. Comparison of the nowcasts from the KNN, RF, GB, and LSTM for the randomly selected 31 days of testing data.

Table 1 .
Summary of some of the related literature.

Table 3 .
Statistical summary of tilt angle.

Table 4 .
Training errors of the RF, KNN, LSTM, and GB models for estimating tilt angle of acceptance for solar energy, with error measured in the degree unit of the target variable.

Table 5 .
Testing errors of the RF, KNN, LSTM, and GB models for predicting the tilt angle of acceptance for solar energy, with error measured in the degree unit of the target variable.