Comparative Analysis of Machine Learning Techniques for Photovoltaic Prediction Using Weather Sensor Data

Over the past few years, solar power has significantly increased in popularity as a renewable energy source. In the context of electricity generation, solar power offers clean and accessible energy, as it is not associated with global warming and pollution. The main challenge of solar power is its uncontrollable fluctuation, since it is highly dependent on weather variables. Thus, forecasting energy generation is important for smart grid operators and solar electricity providers, since they are required to ensure power continuity in order to dispatch and properly prepare to store the energy. In this study, we propose an efficient comparison framework for forecasting the solar power that will be generated 36 h in advance at the Yeongam solar power plant located in South Jeolla Province, South Korea. The results provide a comparative analysis of state-of-the-art techniques for solar power generation forecasting.


Motivation
Nowadays, renewable energy is considered in many countries as a source of electricity production. The principal characteristic of renewable energy is its strong dependence on weather factors, which makes stable energy production difficult and leads to a production level that fluctuates with weather conditions [1]. Furthermore, power companies must guarantee a precise balance between the production and consumption of electricity. They therefore need to maintain stable service to their customers, forestalling unanticipated disturbances in energy production. To do so, power companies must know in advance how much energy is likely to be produced in the next few hours. In general, a time horizon of 36 h ahead allows power companies to make optimal decisions and create a schedule for the generated energy.
Although various types of sensors easily capture weather data, the quantity and quality of that information create a PV forecasting challenge. For instance, solar power generation is highly dependent on the irradiance of the sun, so an accurate prediction requires analyzing other weather factors as well, such as cloud cover or the humidity of the environment. Furthermore, a considerable amount of weather-related information is collected every day from different meteorological stations.
Additionally, it is important to understand which weather data (observational, forecast, or both) should be used, and in which specific cases, to obtain better prediction results. Meteorological stations collect information from different weather sensors to forecast the weather minutes, hours, or days ahead.

Contribution and Paper Structure
In this study, we aim to conduct a comprehensive analysis of various up-to-date machine learning techniques to highlight their pros and cons when applied to solar power generation prediction. Specifically, we consider the problem of predicting solar power generation 36 h ahead using weather data, and we analyze the specific cases in which each approach should be used, such as situations in which only one weather dataset is available in the region. The variables taken into account are the observed and the forecasted weather, used to obtain an accurate prediction that can benefit grid operators and electricity supply companies. We implement and compare fifteen prediction methods (e.g., SVR, random forest, gradient boosting and XGBoost) in three weather information settings (weather observations, weather forecasts and the combination of both), and we analyze and tune the parameters of each model. The value of this paper lies in making predictive methodologies based on weather information easy to adopt and replicate for energy production. The proposed methodology shows how solar power generation can be predicted using different types of weather variables, and it is replicable regardless of geographic location wherever weather data are available, using only historical weather observations, forecasts, or both. In addition, this study presents how the maximum benefit can be obtained using a specific type of weather variable. Moreover, this study can be extended to the analysis of home solar power plants.
This paper is organized as follows: Section 2 presents a brief overview of the utilized framework and the weather data; Section 3 describes the machine learning methods compared in this study; Section 4 presents the outcomes and the key results along with a comparison of the machine learning models; finally, Section 5 presents our discussion and future work.

Research Framework
This section describes the framework and the weather datasets used to compare methods for the prediction of solar power generation. Section 2.1 describes the framework with a brief description of the data sources used in the study. Section 2.2 explains the preprocessing steps that prepare the data for modelling. Section 2.3 presents the cross-validation step. Section 2.4 shows the comparison module utilizing ten-fold cross-validation. Finally, Section 2.5 introduces the subset selection.

Framework and Data Collection
This subsection describes the data and the framework used to evaluate the machine learning methods tested in this study. Figure 1 presents a graphical representation of the evaluation framework. The framework consists of five steps: data collection, data preprocessing, cross-validation (CV), ten-fold cross-validation and subset selection. The data collection step gathers information from four data sources: solar power generation data, solar elevation, and observational and forecast weather data. Solar power generation data is provided by the Korean Open Data Portal (http://www.data.go.kr). Solar elevation data is given by the planetarium software Stellarium (http://www.stellarium.org), and observational and forecast weather data are offered by the Korea Meteorological Administration (KMA) (http://kma.go.kr). The solar power generation data used in this research is provided by the Korean government for Yeongam in South Korea. The solar power data is given in one-hour periods from 1 January 2013 to 31 December 2015. Figure 2 shows the total solar power generation per day over the three-year period, which illustrates the dynamic nature of solar power generation caused by the weather conditions of each day. The graph shows that production levels fluctuate seasonally: energy generation tends to be greatest in spring and lowest in winter.
Along with the solar power generation data, we also collected three additional datasets to be used as inputs of the machine learning models. First, the solar elevation data is taken for the geographical location of the power plant in Yeongam; it represents the position of the sun from the power plant's perspective at its latitude and longitude coordinates. Next, the observational weather records provided by KMA are the measured actual weather, taken from the closest meteorological station, located in Mokpo, approximately 8.5 km from the PV plant. Finally, the forecast weather records announced by KMA are given in three-hour periods starting from 02:00 a.m. every day; in this study, the forecast weather data used are the predictions 36 h ahead. The meteorological observation system utilized by KMA consists of a 10-m-high meteorological tower. At the top, wind direction and wind speed sensors are installed horizontally on the left and right sides. A combined humidity and temperature meter is installed 1.5 m above the ground, with a precipitation sensor on the other side, and a pressure sensor is installed around 50 to 60 cm above the ground.

Data Preprocessing
The preprocessing step prepares the gathered data to be suitable as input for the models we considered. First, in the filtering task, the instances with no sunlight information (00:00-05:00 and 20:00-24:00) are excluded. Next, in the merging task, consecutive instances are merged into single three-hour instances (1: 5 a.m.-8 a.m., 2: 8 a.m.-11 a.m., 3: 11 a.m.-2 p.m., 4: 2 p.m.-5 p.m., 5: 5 p.m.-8 p.m.), making a total of five records per day. In other words, the solar power generation and weather observation data are given in one-hour segments, and to match the weather forecast data, the hourly data is aggregated every three hours. Additionally, the solar power generation is sliced into seven time periods so that the models can forecast each period one by one. Lastly, in the qualitative variables task, all categorical variables are converted into multiple binary variables. Tables 1 and 2 present the independent variables divided into three categories: solar elevation, weather forecast and weather observation.
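As an illustration, the three preprocessing tasks above can be sketched in pandas roughly as follows; the column names and placeholder values are hypothetical, not the paper's actual schema.

```python
import numpy as np
import pandas as pd

# Hypothetical two-day hourly table standing in for the raw sensor records.
hours = pd.date_range("2013-01-01", periods=48, freq="h")
df = pd.DataFrame({
    "timestamp": hours,
    "generation": np.arange(48, dtype=float),   # placeholder kWh values
    "sky": ["clear", "cloudy"] * 24,            # placeholder categorical variable
})

# 1) Filtering: keep daylight hours only, dropping 00:00-05:00 and 20:00-24:00.
day = df[df["timestamp"].dt.hour.between(5, 19)]

# 2) Merging: aggregate hourly rows into the five 3-hour periods per day
#    (1: 5-8, 2: 8-11, 3: 11-14, 4: 14-17, 5: 17-20).
merged = (day.assign(date=day["timestamp"].dt.date,
                     period=(day["timestamp"].dt.hour - 5) // 3 + 1)
             .groupby(["date", "period"], as_index=False)
             .agg(generation=("generation", "sum"), sky=("sky", "first")))

# 3) Qualitative variables: expand the categorical column into binary indicators.
final = pd.get_dummies(merged, columns=["sky"])
```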

Cross-Validation
The cross-validation step is designed to obtain the best hyperparameter values for each model. In this step, different performance metrics are used to evaluate the models. First, all data is divided sequentially into training, validation, and test sets (60%, 20%, 20%). The training set consists of data from 1 January 2013 to 19 October 2014, the validation set of data from 20 October 2014 to 26 May 2015, and the test set of data from 27 May 2015 to 31 December 2015. Second, to obtain the best hyperparameters, the grid-search method is run over the training set and evaluated over the validation set. On the test set, a first comparison of the generated models is made, and the best-performing models are then passed as input to the last step.
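A minimal sketch of this step, using synthetic placeholder data and ridge regression as a single candidate model (the paper tunes fifteen models; the candidate grid shown here is hypothetical):

```python
import numpy as np
from sklearn.linear_model import Ridge

# Placeholder data standing in for the weather features and solar output.
rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 5))
y = X @ np.array([3.0, -1.0, 2.0, 0.0, 1.0]) + rng.normal(scale=0.1, size=1000)

# Sequential (unshuffled) 60/20/20 split preserves chronological order.
n = len(X)
i1, i2 = int(0.6 * n), int(0.8 * n)
X_tr, X_val, X_te = X[:i1], X[i1:i2], X[i2:]
y_tr, y_val, y_te = y[:i1], y[i1:i2], y[i2:]

# Grid search: train on the training set, score each candidate on the validation set.
best_alpha, best_score = None, -np.inf
for alpha in [0.01, 0.1, 1.0, 10.0]:      # hypothetical candidate grid
    score = Ridge(alpha=alpha).fit(X_tr, y_tr).score(X_val, y_val)
    if score > best_score:
        best_alpha, best_score = alpha, score

# The winning model is then compared on the held-out test set.
test_r2 = Ridge(alpha=best_alpha).fit(X_tr, y_tr).score(X_te, y_te)
```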

Ten-Fold CV
In the ten-fold cross-validation step, the complete dataset and the models generated in the previous step are taken as input to analyze and evaluate the solar power prediction of each machine learning model. First, all the input data is randomly partitioned into ten equal segments. Second, one of the ten segments is used as the validation set and the remaining nine segments are used as the training set. Third, this process is repeated ten times, using each segment once as the validation set. Last, the ten results are averaged to give a final result [33].
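The procedure described above can be sketched with scikit-learn on synthetic placeholder data (the paper applies it to all fifteen tuned models):

```python
import numpy as np
from sklearn.model_selection import KFold, cross_val_score
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(size=(500, 4))                       # placeholder features
y = 10 * X[:, 0] + rng.normal(scale=0.1, size=500)   # placeholder target

# Randomly partition into ten segments; each serves once as the validation fold
# while the other nine form the training set; the ten scores are then averaged.
cv = KFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(DecisionTreeRegressor(max_depth=3, random_state=0),
                         X, y, cv=cv, scoring="r2")
mean_r2, std_r2 = scores.mean(), scores.std()
```

The standard deviation of the ten scores is what the paper later uses to compare how biased each model is.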

Subset Selection
Finally, in subset selection, we considered the forward stepwise method to select the best subsets of the independent variables. Forward stepwise has a complexity of O(N^2), smaller than the O(2^N) of the best subset selection method. To choose the optimal model from the forward stepwise method, we analyzed four criteria: Cp, Akaike's information criterion (AIC), Bayesian information criterion (BIC), and adjusted R 2 [34]. The criteria were computed with the training, validation, and test sets from the cross-validation step, and the four approaches were then analyzed. Forward stepwise starts with a model containing no features. It then tests the addition of each remaining feature by calculating Cp, AIC, BIC or adjusted R 2, and adds the feature that yields the best value of the chosen criterion. It repeats this process until all features have been considered, and finally selects the best-performing subset.
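A minimal sketch of forward stepwise selection under one of the four criteria (BIC), on synthetic data where only two of six predictors carry signal; the paper evaluates Cp, AIC and adjusted R 2 in the same way:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data: only predictors 0 and 3 carry signal.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 6))
y = 4 * X[:, 0] + 2 * X[:, 3] + rng.normal(scale=0.1, size=300)
n, p = X.shape

def bic(rss, d):
    # Gaussian BIC up to additive constants: n*log(RSS/n) + log(n)*d.
    return n * np.log(rss / n) + np.log(n) * d

selected, remaining = [], list(range(p))
best_bic = bic(((y - y.mean()) ** 2).sum(), 0)       # start from the null model
while remaining:
    # Test the addition of each remaining feature and keep the best candidate.
    trials = []
    for j in remaining:
        cols = selected + [j]
        pred = LinearRegression().fit(X[:, cols], y).predict(X[:, cols])
        trials.append((bic(((y - pred) ** 2).sum(), len(cols)), j))
    cand_bic, cand_j = min(trials)
    if cand_bic >= best_bic:
        break                                        # no criterion improvement: stop
    best_bic = cand_bic
    selected.append(cand_j)
    remaining.remove(cand_j)
```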

Machine Learning Methods
This section describes the methods tested in this study. Supervised learning algorithms for multiple independent variables are evaluated and categorized into three groups. The single regression models, bagging ensemble methods and boosting ensemble methods analyzed in this study are shown in Table 3.

Single Regression Methods
Linear regression is a simple yet useful approach in many applications, and it is used as a baseline in this study. Using the single regression model, the final PV output prediction for a particular time is calculated as follows:

ŷ = β0 + β1 X1 + β2 X2 + ... + βp Xp

where X j represents the jth independent variable, and β j explains how much of the variation in the outcome can be explained by the variation in that independent variable. Moreover, the Huber, ridge, lasso and elastic net shrinkage methods are included, which fit a linear model using least squares with a technique that regularizes the coefficient estimates towards zero [34]. First, Huber regression is a linear regression model that is robust to outliers, as it minimizes the squared error for small residuals and the absolute loss for large ones. Second, ridge regression focuses on solving the overfitting problem: it is a regularized linear regression suited to multicollinearity that applies an L2 penalty to the squared magnitude of β. Next, the least absolute shrinkage and selection operator (lasso) is another variant of the shrinkage methods and an alternative to ridge regression. The principal problem of ridge is that it keeps all predictors X, because β only converges towards zero; in lasso, β can be exactly zero, so lasso selects a subset of the predictors X. Lasso regression uses the L1 penalty as its regularization method. Finally, elastic net regression arises from a limitation of lasso, which, when the number of predictors is large relative to the number of observations, selects at most n predictors rather than the complete set. To overcome this limitation, elastic net combines the L1 and L2 penalties as regularization.
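All four shrinkage methods are available in scikit-learn; the following sketch contrasts their behavior on synthetic data (the penalty strengths are illustrative only, not the tuned values from Table 4):

```python
import numpy as np
from sklearn.linear_model import Ridge, Lasso, ElasticNet, HuberRegressor

# Synthetic data: only the first three of ten predictors matter.
rng = np.random.default_rng(7)
X = rng.normal(size=(200, 10))
y = X[:, 0] + 0.5 * X[:, 1] - 0.5 * X[:, 2] + rng.normal(scale=0.1, size=200)

ridge = Ridge(alpha=1.0).fit(X, y)      # L2 penalty: shrinks all coefficients, none exactly zero
lasso = Lasso(alpha=0.1).fit(X, y)      # L1 penalty: can set coefficients exactly to zero
enet = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X, y)  # mix of L1 and L2 penalties
huber = HuberRegressor().fit(X, y)      # squared loss near zero, absolute loss for outliers

n_zero_ridge = int(np.sum(ridge.coef_ == 0))   # ridge keeps every predictor
n_zero_lasso = int(np.sum(lasso.coef_ == 0))   # lasso drops the noise predictors
```

This illustrates the distinction drawn above: ridge retains all ten coefficients, while lasso zeroes out the uninformative ones.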
Other simple methods considered in this study are k-NN regression and SVR. k-nearest neighbors (k-NN) regression is a simple algorithm: it does not learn a discriminative function but memorizes all available cases. The method chooses a number of neighbors k and a distance metric, and predicts each new case from its k nearest neighbors under that metric. In other words, k-NN regression predicts solar power as the average of the k most similar cases. On the other hand, SVR is a regression method that maintains all the characteristics of support vector machines. SVR tries to maximize the distance between the separating hyperplane and the support vectors within a threshold value.
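A short sketch of both methods on a synthetic one-dimensional signal (hyperparameter values here are illustrative, not the tuned ones):

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import SVR

# Synthetic signal with mild noise.
rng = np.random.default_rng(3)
X = rng.uniform(0, 10, size=(300, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.05, size=300)

# k-NN: the prediction is the average target of the k most similar training cases.
knn = KNeighborsRegressor(n_neighbors=5).fit(X, y)

# SVR: fits a function within an epsilon-tube, penalizing points that fall outside it.
svr = SVR(kernel="rbf", C=10.0, epsilon=0.01).fit(X, y)

knn_r2, svr_r2 = knn.score(X, y), svr.score(X, y)
```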
Decision tree methods are among the most popular algorithms in supervised learning; their goal is to segment the predictor space into several simple, smaller groups that are useful for interpretation [33]. As an example, for predicting solar power generation 36 h ahead, the decision tree starts splitting the data by considering each predictor X, with the mean of the subset used as the splitting value. One disadvantage of decision trees is that they are unstable under small variations in the predictors and can overfit the generated model. The variance of the model therefore needs to be reduced, which is achieved with ensemble methods such as bagging and boosting.

Bagging Ensemble Methods
The purpose of an ensemble of regressions is to find a more effective model by combining the results of a set of regression models. Specifically, bagging ensemble methods try to decrease the model's variance and train weak models in parallel. The most widely used bagging method is random forest [33], which constructs multiple decision trees at training time and outputs their mean prediction. Another bagging method is extra trees, which, like random forest, ensembles individual decision trees at training time and outputs a mean prediction. The principal differences are that each decision tree uses the full training set and the splits of each decision tree are randomized.
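Both bagging methods can be sketched side by side on synthetic data (the settings are illustrative, not the tuned values from Table 4):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor, ExtraTreesRegressor

rng = np.random.default_rng(5)
X = rng.uniform(size=(400, 4))
y = X[:, 0] * X[:, 1] + X[:, 2] + rng.normal(scale=0.05, size=400)

# Random forest: trees grown on bootstrap samples; the output is the mean prediction.
rf = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# Extra trees: each tree sees the full training set and split thresholds are randomized.
et = ExtraTreesRegressor(n_estimators=100, random_state=0).fit(X, y)

rf_r2, et_r2 = rf.score(X, y), et.score(X, y)
```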

Boosting Ensemble Methods
Boosting ensemble algorithms try to decrease the model's bias by training different models sequentially, each one improving on the previously generated model. In this study, state-of-the-art boosting methods are used: AdaBoost, CatBoost, gradient boosting and XGBoost. AdaBoost utilizes decision trees with a single split.
Gradient boosting differs from AdaBoost in that it fits each new decision tree to the residuals y i - ŷ i left by the previous decision tree model. CatBoost and XGBoost are based on gradient boosting. CatBoost is a method that supports categorical variables as well as numerical variables, while XGBoost, in comparison with gradient boosting, uses a better regularization method to control overfitting.
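The residual-fitting idea can be made concrete with a manual sketch alongside the scikit-learn implementation (synthetic data; XGBoost and CatBoost follow the same principle but live in separate libraries):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(9)
X = rng.uniform(size=(400, 3))
y = 5 * X[:, 0] + np.sin(6 * X[:, 1]) + rng.normal(scale=0.05, size=400)

# Manual sketch: every new shallow tree is fit to the residuals y - ŷ of the
# current ensemble, and its damped prediction is added to the ensemble.
pred = np.full_like(y, y.mean())
for _ in range(50):
    tree = DecisionTreeRegressor(max_depth=2).fit(X, y - pred)
    pred += 0.1 * tree.predict(X)                 # learning rate 0.1
manual_r2 = 1 - ((y - pred) ** 2).sum() / ((y - y.mean()) ** 2).sum()

# The library implementation applies the same principle with further refinements.
gb = GradientBoostingRegressor(n_estimators=50, learning_rate=0.1,
                               max_depth=2, random_state=0).fit(X, y)
```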

Experiment Results
This section presents the results of the machine learning methods described in the previous section. The results are grouped by the input weather data: observational, forecast and the combination of observational-forecast weather data.

Performance Metrics
Section 4.1 presents the performance metrics used in this research. This study performs a two-step analysis using k-fold cross-validation. The principal objective of cross-validation is to understand how well the model will predict unseen data, revealing problems such as bias or overfitting [35].
To compare the algorithms based on the results obtained, it is important to estimate how a predictive model will perform in practice based on several performance metrics. The root mean square error (RMSE) penalizes regression models for how far ŷ i is from y i, by squaring the differences and averaging over N. The mean absolute error (MAE) measures the average distance between ŷ i and y i. The R 2 measures the strength of the correlation between ŷ i and y i. We consider these three metrics because energy power prediction is a regression problem: RMSE penalizes larger differences between the prediction and the actual solar power more heavily, while MAE directly averages the offsets. The performance metrics used in this study are defined as follows:

RMSE = sqrt( (1/N) Σ (y i - ŷ i)^2 )
MAE = (1/N) Σ |y i - ŷ i|
R 2 = 1 - Σ (y i - ŷ i)^2 / Σ (y i - ȳ)^2

where y i is the actual solar power value, ŷ i is the prediction value from the models, ȳ is the mean of y i, and N is the sample size.
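The three metrics can be computed directly; a small worked example with hypothetical power values:

```python
import numpy as np

def rmse(y, yhat):
    # Root mean square error: squares the offsets before averaging over N.
    return np.sqrt(np.mean((y - yhat) ** 2))

def mae(y, yhat):
    # Mean absolute error: averages the absolute offsets.
    return np.mean(np.abs(y - yhat))

def r2(y, yhat):
    # Coefficient of determination: 1 minus residual over total sum of squares.
    ss_res = np.sum((y - yhat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1 - ss_res / ss_tot

y_true = np.array([100.0, 200.0, 300.0, 400.0])   # hypothetical actual solar power
y_pred = np.array([110.0, 190.0, 310.0, 390.0])   # hypothetical model predictions
```

Here every offset is 10, so RMSE and MAE both equal 10, while R 2 is 1 - 400/50000 = 0.992.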
To measure the subsets of the resulting models and to analyze the most important variables that improve the performance of the models, four criteria were compared: Mallows Cp, Akaike's information criterion (AIC), Bayesian information criterion (BIC) and adjusted R 2, which take the following standard forms:

Cp = (1/N) (RSS + 2 d σ̂^2)
AIC = (1/(N σ̂^2)) (RSS + 2 d σ̂^2)
BIC = (1/(N σ̂^2)) (RSS + log(N) d σ̂^2)
Adjusted R 2 = 1 - (RSS/(N - d - 1)) / (TSS/(N - 1))

where RSS = N × MSE is the residual sum of squares (MSE being the square of RMSE), TSS is the total sum of squares, d is the number of features in the model and σ̂^2 is an estimate of the variance of the error. AIC is a metric for comparing models: it rewards goodness of fit while penalizing the model if it is too complex. Cp is a variant of AIC that penalizes the model when additional variables are added, and BIC is a variant of AIC with a stronger penalty for including additional variables.

Cross-Validation
In this section, the data is separated into three segments (training, validation, test). Furthermore, a grid search is performed to obtain the models with the best hyperparameter values over the training set, and these models are compared on the test set.
The three years of data used in the experiments are divided into three segments: training set (60%), validation set (20%) and test set (20%). The training set was used to train the machine learning models, with five-fold cross-validation performed during training as a resampling procedure. Moreover, a grid search was performed to find the best hyperparameter values for each model. The experiments were implemented using scikit-learn and statsmodels in Python 3.6, which offer implementations of these machine learning methods. The hyperparameter tuning processes are identical for all data inputs, as shown in Table A1 in Appendix A. Table 4 lists the tested hyperparameter candidates for each algorithm together with each model's best values. A total of eight single regression algorithms, three bagging ensemble algorithms and five boosting ensemble algorithms were analyzed. It should be noted that the selected hyperparameters are similar across the three datasets. Moreover, among the single regression models on forecast data, a decision tree presents the best R 2 on the test set with a maximum depth of 3. XGBoost presents the best performance on the three datasets with a number of estimators of 80.
Table 4. Best values for the evaluated hyperparameter tuning from grid search.
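A sketch of the grid search with five-fold cross-validation using scikit-learn's GridSearchCV; the grid shown is a small hypothetical one, while the actual candidates are listed in Table 4 and Appendix A:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(11)
X = rng.uniform(size=(300, 4))                    # placeholder features
y = 3 * X[:, 0] + X[:, 1] ** 2 + rng.normal(scale=0.05, size=300)

# Hypothetical candidate grid in the spirit of Table 4 (not the paper's exact grid).
grid = {"n_estimators": [40, 80], "max_depth": [2, 3]}
search = GridSearchCV(GradientBoostingRegressor(random_state=0), grid,
                      cv=5, scoring="neg_root_mean_squared_error")
search.fit(X, y)
best_params = search.best_params_                 # winning hyperparameter combination
```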

Figure 3 presents a summary of the comparative algorithm analysis over observational, forecast, and combined weather data, and the lower part of Table A2 presents the performance of the fifteen models using observational and forecast weather. Regarding the results, in the first group, based on observational weather data, the k-NN model showed the best performance among the single regression models in terms of RMSE, MAE and R 2 on the test set (RMSE = 676.44, MAE = 459.42 and R 2 = 63.1%). Among the ensemble models, gradient boosting and XGBoost yielded better performance than the single regression models, with XGBoost (RMSE = 650.36, MAE = 440.67, and R 2 = 65.9%) performing better than gradient boosting. In the second group, based on forecast weather data, the k-NN model again showed the best performance among the single regression models in the three performance metrics on the test set (RMSE = 529.37, MAE = 334.78, and R 2 = 77.4%). Among the ensemble models, random forest obtained the best MAE on the test set (MAE = 317.40), and XGBoost showed the best performance in terms of RMSE and R 2 on the test set (RMSE = 509.44 and R 2 = 79.1%). Finally, in the third group, using observational-forecast weather data, k-NN showed the best performance among the single regression models on the test set (RMSE = 533.59, MAE = 340.86, and R 2 = 77.1%). Regarding the ensemble models, XGBoost (RMSE = 493.85, MAE = 317.70, and R 2 = 80.4%) performed better than the other models in all performance measures, and it is also the only method with an R 2 value higher than 80%.


10-Fold CV
In the current section, a ten-fold cross-validation was performed over all data using the best models obtained in the previous step. Additionally, a further analysis comparing the variance and the standard deviation of the models' results was conducted over the three weather datasets.
The ten-fold cross-validation was performed over the dataset from 2013 to 2015, and the machine learning models used were those with the best hyperparameters obtained in the previous step. The goal of this step is to obtain the least biased machine learning model. Three analyses were implemented, using weather observations, weather forecasts and both. The analysis was performed over ten runs for each machine learning method, producing comparative statistics for each regression model by obtaining the mean and the standard deviation of each performance metric. Furthermore, the results were separated into single regression models and ensemble models based on their interpretability.
The first analysis, presented in Table 5, compares the statistical information of the regression models from three perspectives, RMSE, MAE and R 2, for which the mean and the standard deviation (STD) were obtained. In Figures 4-6, the red lines represent the medians and the green triangles the means. In our results, among the single regression models, the decision tree obtained the best performance, outperforming k-NN in terms of the means (RMSE = 694.24, MAE = 478.77 and R 2 = 61.6%), while elastic net was the least biased model. Among the ensemble models, gradient boosting obtained the best mean RMSE and R 2 (RMSE = 680.65 and R 2 = 63.1%), in contrast with XGBoost, which presented better results in the previous step; XGBoost was, however, the least biased model based on the standard deviation, and bagging regression showed the best MAE (MAE = 470.1). Figure 4 shows three boxplots with the results of the machine learning models separated by performance metric; in Figure 4a, it can be observed that the predictions of gradient boosting show a negatively skewed distribution around the median.
The second analysis, shown in Table 6, presents the comparative statistics using weather forecast data. Similar to the results obtained from cross-validation, among the single regression models, k-NN presents the best statistics in the three performance metrics based on the means (RMSE = 542.09, MAE = 350.20 and R 2 = 76.6%), and SVR obtained the smallest standard deviation in RMSE and MAE. For the ensemble models, similar to the weather observation data, gradient boosting presents the best mean RMSE and R 2 (RMSE = 531.85 and R 2 = 77.5%), while random forest presents the best mean MAE (MAE = 340.77). Figure 5 illustrates the results gained after applying ten-fold cross-validation. In this experiment, gradient boosting shows better performance than k-NN, but it can be observed that the mean of random forest varies much less than that of gradient boosting; random forest has a more consistent mean on the weather forecast data.
Finally, the third analysis presents the results of using observational and forecast weather data together, as shown in Table 7. Among the single regression models, k-NN gained the best scores, similar to when using forecast weather data alone, yielding the best mean RMSE, MAE and R 2 (RMSE = 547.79, MAE = 358.28 and R 2 = 76.0%). Among the ensemble models, gradient boosting and bagging yielded better performance than the single regression models. Between the ensembles, gradient boosting (RMSE = 517.56, R 2 = 78.6%) performed better than bagging (MAE = 338.03). Figure 6 presents the statistical results; gradient boosting has the best mean RMSE and R 2 compared with the other models. Furthermore, it can be observed in Figure 6b that bagging and gradient boosting have a similar prediction distribution, whereas in Figure 6a the difference is more noticeable.

Subset Selection
This subsection presents the results of the forward stepwise selection. The goal of this step is to find the optimal combination of features for the compared models. As explained previously, RMSE, MAE and R 2 are sensitive to the addition of more features to the model; thus, we use Cp, AIC, BIC and adjusted R 2, which penalize the inclusion of features. The results of all the models are shown in Figure A1. Figure 7 presents the best two algorithms from the cross-validation and ten-fold CV steps. In our experiments, gradient boosting and XGBoost present the best values of adjusted R 2. Gradient boosting obtained the minimum values of Cp = 47831.4, AIC = 1.027 and BIC = 1.040 and the maximum adjusted R 2 = 80.8% with ten predictors. Table 8 presents the ten features used to obtain the optimal gradient boosting model. XGBoost gained the minimum values of Cp = 46528.2, AIC = 0.999 and BIC = 1.018 and the maximum adjusted R 2 = 81.3% with 15 features. Table 9 presents the 13 variables and the 15 features obtained to gain the optimal XGBoost model.

Discussion and Conclusion
In this study, we proposed a comparative analysis of different machine learning techniques for the prediction of solar power generation 36 h ahead. The input and output are in a time frame of 3 h (1: 5 a.m.-8 a.m., 2: 8 a.m.-11 a.m., 3: 11 a.m.-2 p.m., 4: 2 p.m.-5 p.m., 5: 5 p.m.-8 p.m.), making a total of five records per day, and they are the same in all the methods considered in this study. The proposed study uses an input data time frame of 3 h, similar to [1,6,11,16,30], and the region of this study is Asia [1,5,9,15,16,30]. One characteristic considered for predicting solar power 36 h ahead is the weather observation, because it has a negative correlation with solar power generation. For example, a solar power generation prediction made at 5:00 a.m. would be for 5:00 p.m. the next day. Furthermore, the number of weather variables that can be obtained to predict energy must be considered. Therefore, we performed experiments using weather forecasts, weather observations and the combination of both.
Additionally, a framework with five steps is proposed: data collection, data preprocessing, cross-validation, ten-fold CV and selection subset. First, we suggest a comparative using crossvalidation and ten-fold CV because by training the model only with a training set, we cannot be sure of the desired variance and accuracy that will be obtained. In cross-validation, XGBoost performed better than the other algorithms, and in ten-fold CV, gradient boosting gained the best performance and generally in a less biased model. Therefore, these two algorithms must be studied in regard to the implementations of prediction of photovoltaic energy, and they should be compared with future proposals. Moreover, as general interpretations, some differences and similarities in the performances can be observed: for example in Figures 4-6, some single models present a similar performance, and this may be due to the fact that their primal formula is based on linear regression methods such as Huber, ridge, lasso, elastic net and SVR. Additionally, the assumption of the data can be another characteristic that can differentiate the power of prediction, for example, the closeness in time can affect the variance in these models. Furthermore, gradient boosting and XGBoost show the optimal models in the subset selection step. In the subset selection step, the metrics Cp. AIC, BIC and adjusted R 2 are metrics for optimal model selection, and these metrics are used directly with the training set. In our study, we considered analyzing these four metrics in the subset selection to evaluate if there exists any difference, but as it could be seen in our experiments with gradient boosting and XGBoost, the four converged on the same subset features.
In regard to the limitations of this study, the results are based on the weather-relevant data, and previous solar power data is not taken into account because we want to emphasize the power prediction of the weather data. In addition, for this study, only machine learning regression algorithms are analyzed-no other time series algorithms were examined. Furthermore, based on the

Discussion and Conclusions
In this study, we proposed a comparative analysis of different machine learning techniques for the prediction of solar power generation 36 h ahead. The input and output are in a time frame of 3 h, making a total of five records per day, and they are the same in all the methods considered in this study. The proposed study uses an input data time frame of 3 h, similar to [1,6,11,16,30], and the region of this study is Asia [1,5,9,15,16,30]. One characteristic considered for predicting solar power energy 36 h ahead is the weather observation, because it has a negative correlation with solar power generation. For example, solar power generation predictions made at 5:00 a.m. would be for 5:00 p.m. the next day. Furthermore, the number of weather variables available to predict energy must be considered. Therefore, we performed experiments using weather forecasts, weather observations and the combination of both.
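The 36 h-ahead alignment mentioned above (an issue time of 5:00 a.m. maps to 5:00 p.m. the next day) can be illustrated with a few lines of Python; the function name and dates are purely illustrative.

```python
from datetime import datetime, timedelta

HORIZON = timedelta(hours=36)

def target_time(issue_time: datetime) -> datetime:
    # A forecast issued now refers to the solar output 36 h later
    return issue_time + HORIZON

issue = datetime(2020, 6, 1, 5, 0)   # 5:00 a.m.
print(target_time(issue))            # 5:00 p.m. on the next day
```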
Additionally, a framework with five steps is proposed: data collection, data preprocessing, cross-validation, ten-fold CV and subset selection. First, we suggest a comparison using cross-validation and ten-fold CV because, by training a model only with a training set, we cannot be sure of the variance and accuracy that will be obtained. In cross-validation, XGBoost performed better than the other algorithms, and in ten-fold CV, gradient boosting achieved the best performance and, in general, a less biased model. Therefore, these two algorithms should be studied for implementations of photovoltaic energy prediction, and they should be compared with future proposals. Moreover, as general interpretations, some differences and similarities in the performances can be observed: for example, in Figures 4-6, some single models present a similar performance, which may be due to the fact that their primal formulation is based on linear regression, such as Huber, ridge, lasso, elastic net and SVR. Additionally, the assumptions about the data can be another characteristic that differentiates predictive power; for example, closeness in time can affect the variance in these models. Furthermore, gradient boosting and XGBoost yield the optimal models in the subset selection step. In the subset selection step, Cp, AIC, BIC and adjusted R² are metrics for optimal model selection, and they are applied directly to the training set. In our study, we analyzed these four metrics in the subset selection step to evaluate whether any difference exists, but as seen in our experiments with gradient boosting and XGBoost, all four converged on the same feature subset.
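A ten-fold CV comparison of the kind described can be sketched with scikit-learn. This is a toy setup on synthetic data: GradientBoostingRegressor stands in for the paper's gradient boosting, Ridge and Huber represent the linear-regression family, and XGBoost is omitted here because its package may not be installed; the data-generating function is an invented nonlinear stand-in for the weather-to-power mapping.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import HuberRegressor, Ridge
from sklearn.model_selection import KFold, cross_val_score

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(300, 4))
# Mildly nonlinear target so the linear models are at a disadvantage
y = np.sin(3 * X[:, 0]) + X[:, 1] ** 2 + 0.1 * rng.normal(size=300)

models = {
    "ridge": Ridge(),
    "huber": HuberRegressor(max_iter=500),
    "gboost": GradientBoostingRegressor(random_state=0),
}
cv = KFold(n_splits=10, shuffle=True, random_state=0)
rmse = {}
for name, model in models.items():
    # Mean RMSE across the ten folds (sklearn reports it negated)
    scores = cross_val_score(model, X, y, cv=cv,
                             scoring="neg_root_mean_squared_error")
    rmse[name] = -scores.mean()

print(sorted(rmse, key=rmse.get))  # models ordered from best to worst
```

Averaging the fold scores, rather than relying on a single train/test split, is what gives the variance estimate discussed above.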
In regard to the limitations of this study, the results are based on weather-relevant data, and previous solar power data are not taken into account because we want to emphasize prediction from the weather data alone. In addition, only machine learning regression algorithms were analyzed; no other time series algorithms were examined. Furthermore, based on the comparative analysis presented in this study, future studies can analyze other time series algorithms such as ARIMA, or deep learning techniques such as artificial neural networks or long short-term memory. Finally, since a single photovoltaic plant was used, the methodology can easily be replicated for the analysis of other photovoltaic plants in other countries; that is, it can be replicated in places where only one type of weather data can be obtained. Future studies can support solar power forecasting for different time horizons such as 48 or 72 h ahead, or for different input data time frames such as 30 min or 1 h.
Author Contributions: B.C. and K.K. conducted the initial design and developed the framework. B.C. wrote the draft of the manuscript, while K.K. reviewed and proofread the paper. All authors have read and agreed to the published version of the manuscript.

Conflicts of Interest: The authors declare no conflict of interest.
Appendix A
Table A1. Hyperparameter tuning using grid search for the machine learning algorithms.

Input Data | Prediction Models | Evaluated Hyperparameters
Observational data | … | …
The best values are in underlined boldface.
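The grid-search tuning referenced by Table A1 can be sketched with scikit-learn's GridSearchCV. The data and the parameter grid below are hypothetical placeholders, not the grids evaluated in the paper, and GradientBoostingRegressor again stands in for the tuned models.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(200, 3))
y = X[:, 0] ** 2 + 0.1 * rng.normal(size=200)

# Hypothetical grid; the actual per-model grids are listed in Table A1
grid = {
    "n_estimators": [50, 100],
    "max_depth": [2, 3],
    "learning_rate": [0.05, 0.1],
}
search = GridSearchCV(
    GradientBoostingRegressor(random_state=0),
    grid,
    cv=5,                                      # inner CV used to score each combination
    scoring="neg_root_mean_squared_error",
)
search.fit(X, y)
print(search.best_params_)                     # best combination found by the grid search
```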