Short-Term PV Power Forecasting Using a Regression-Based Ensemble Method

One of the most critical aspects of integrating renewable energy sources into the smart grid is photovoltaic (PV) power generation forecasting. An ensemble forecasting technique combines several forecasting models to increase the accuracy of the individual models. This study proposes a regression-based ensemble method for day-ahead PV power forecasting. The general framework consists of three steps: model training, creating the optimal set of weights, and testing the model. In step 1, Random forest (RF) models with different parameters are used as the single forecasting method. Five RF models (RF1, RF2, RF3, RF4, and RF5) and a support vector machine (SVM) for classification are established. The hyperparameters for the regression-based method are the learner (linear regression (LR) or support vector regression (SVR)), the regularization (least absolute shrinkage and selection operator (LASSO) or Ridge), and the penalty coefficient for regularization (λ). Bayesian optimization is performed to find the optimal values of these three hyperparameters by minimizing the objective function. The optimal set of weights is obtained in step 2; each set of weights contains five weight coefficients and a bias. In the final step, the weather forecasting data for the target day is used as input for the five RF models, and the average daily weather forecasting data is used as input for the SVM classification model. The SVM output selects the weather condition, and the corresponding set of weight coefficients from step 2 is combined with the output of each RF model to obtain the final forecasting results. The stacking recurrent neural network (RNN) is used as a benchmark ensemble method for comparison. Historical PV power data for a PV site in the Zhangbin Industrial Area, Taiwan, with a 2000 kWp capacity, is used to test the methodology.
The results for the single best RF model, the stacking RNN, and the proposed method are compared in terms of the mean relative error (MRE), the mean absolute error (MAE), and the coefficient of determination (R²) to verify the proposed method. The MRE results show that the proposed method outperforms the best RF model by 20% and the benchmark method by 2%.


Introduction
Forecasting photovoltaic (PV) power generation is a vital element in the planning and operation of an electric power grid. Renewable energy resources are rapidly being integrated into smart grids [1-3]. The variability and uncertainty of PV power output and availability must be considered in the complex decision-making processes required to balance supply and demand in the power system. A solar generator at ground level is affected by cloud cover, atmospheric aerosol levels, and other atmospheric parameters, so solar power is intermittent and variable [4]. Meteorological features, such as solar irradiance, air temperature, relative humidity, and wind speed, directly or indirectly affect the power generated by a PV system [5]. The intermittent nature of power generation from solar PV systems means that accurate forecasting is required. Another study [26] compared classification methods, such as K-nearest neighbor (KNN) and SVM models, in terms of performance. The results show that an SVM performs well on small sample sizes.
An ensemble learning method, which combines multiple models to make predictions, is used to increase the accuracy of PV power forecasting. One study [27] developed a stacked generalization ensemble model for short-term PV power generation forecasting. This uses base learners such as extreme learning machines (ELM), extremely randomized trees, K-nearest neighbor (KNN), and the Mondrian forest model. A deep belief network is used as a meta learner to generate the final outputs from meteorological features such as global horizontal irradiance (GHI), diffuse horizontal irradiance (DHI), relative humidity, wind direction, and temperature. The MAE, RMSE, mean absolute percentage error (MAPE), and R² values are used as evaluation criteria. The proposed model improves the MAPE by 2.30% relative to the benchmark and the single models.
One study [28] used seasonal time series models to develop a regression-based ensemble forecasting combination. The seasonal time series models use a seasonal autoregressive integrated moving average (SARIMA), exponential smoothing (ETS), a multilayer perceptron (MLP), seasonal trend decomposition, a TBATS model, and a theta model. Eight ensemble forecasting combination methods were used to combine the forecasting results. The normalized root mean squared error (nRMSE), normalized mean bias error (nMBE), forecast skill, and Kolmogorov-Smirnov test integral (KSI) are used to calculate the accuracy. Sometimes the best individual model is more accurate than the ensemble model.
Another study [29] used the bagging ensemble method with Random forest (RF) and extra trees (ET) to predict hourly PV generation, and an SVR was used as the benchmark model. The inputs for the model are solar radiation, air temperature, relative humidity, wind speed, and the previous hourly value of the PV output. The RMSE and MAE values are used for error validation. ET outperforms RF and SVR, with an MAE of 1.0851 kWh. However, this study did not consider different weather conditions, such as sunny, cloudy, or rainy.
One study [30] blended the forecasting results of multiple feedforward neural network (FNN) predictors using the RF model. Meteorological measurements, such as solar irradiance, ambient temperature, and wind speed, were used as model inputs. The method outperforms six benchmark models (persistence, SVR, linear regression (LR), RF, gradient boosting (GB), and extreme GB (XGBoost)) by 40%, but it only produces one-hour-ahead forecasts for very short-term PV power forecasting.
Table 1 shows previous research on PV power forecasting using ensemble methods. Many studies show that an ensemble method can increase the accuracy of a single forecasting method. In fact, the weight coefficients for each weather condition, such as sunny, cloudy, or rainy, differ from one another, so suitable weights must be applied under the proper weather conditions to increase accuracy. This study proposes an ensemble-based model for short-term PV power forecasting to increase the accuracy of short-term PV power output predictions. The proposed model incorporates five RF models for five weather types: sunny, light-cloudy, cloudy, heavy-cloudy, and rainy. It also uses regression-based methods, such as linear regression (LR) and support vector regression (SVR), with LASSO and Ridge regularization to weight and combine the forecasting results. A previous study implemented a stacked generalization ensemble method for short-term PV power forecasting using an RNN meta learner [31]; the stacking RNN method is used as the benchmark ensemble forecasting method for this study. The goal of this study is to improve the accuracy and performance of the individual forecasting models for day-ahead PV power forecasting by implementing a regression-based ensemble method. This study makes the following significant contributions to this field:

• A new PV forecasting structure that incorporates K-means clustering, RF models, and a regression-based method with LASSO and Ridge regularization is used to increase forecasting accuracy.

• Regression-based ensemble learning with Bayesian optimization is used with LASSO and Ridge regularization to calculate the five optimal sets of weight coefficients, which allows us to determine which predictors in the model are significant.

• The regression-based method is easier to implement and has fewer hyperparameters than the stacking RNN method. The results show that the proposed regression-based method outperforms the benchmark stacking RNN by 2%.

The remainder of the paper is structured as follows. Section 2 briefly describes the proposed methodology and model setup. Section 3 explains the ensemble forecasting strategy. Section 4 details the simulation results for the proposed PV power forecasting, and Section 5 presents the conclusions and future applications.

The K-Means Model
The K-means clustering method is used in this study to divide the training set into clusters. K-means clustering is an unsupervised machine-learning technique that is frequently used to divide a dataset into several subgroups. K-means is a traditional clustering method that is simple, fast, and robust [36], and it produces groups with similar internal characteristics that differ significantly from those of other groups.
The K-means clustering minimizes the sum of squared errors (SSE), as in [37]:

SSE = \sum_{j=1}^{k} \sum_{x_i \in C_j} \lVert x_i - c_j \rVert^2 \quad (1)

where k represents the number of clusters, n represents the number of observations, x_i represents the ith observation, and c_j represents the centroid of cluster j.
To iteratively update the centroid of each cluster, Equation (2) is used:

c_j = \frac{1}{|C_j|} \sum_{x_i \in C_j} x_i \quad (2)

where |C_j| represents the total number of points in cluster j.
The steps of K-means clustering are as follows:
1. The initial centers of each group, K samples, are chosen at random. Each feature is normalized using the min-max method to eliminate dimensional effects.
2. Samples are assigned to groups based on their Euclidean distance from the center of each group; the group with the smallest Euclidean distance is chosen for each sample.
3. The centers of each group are recalculated using the sample data for each group, and the results are output if none of the centers change.
4. Steps 2 and 3 are repeated until convergence is achieved.
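The four steps above can be sketched in a few lines of Python. This is an illustrative stand-in, not the authors' implementation; the sample points and seed are invented, and features are assumed to be min-max normalized already.

```python
# Minimal K-means sketch following steps 1-4 above.
import random

def kmeans(points, k, iters=100, seed=0):
    rng = random.Random(seed)
    centers = rng.sample(points, k)          # step 1: random initial centers
    for _ in range(iters):
        # step 2: assign each sample to the nearest center (Euclidean distance)
        groups = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            groups[j].append(p)
        # step 3: recompute each center as the mean of its group
        new_centers = []
        for j, g in enumerate(groups):
            if g:
                new_centers.append(tuple(sum(c) / len(g) for c in zip(*g)))
            else:
                new_centers.append(centers[j])
        if new_centers == centers:           # step 4: stop at convergence
            break
        centers = new_centers
    return centers, groups

pts = [(0.1, 0.2), (0.15, 0.1), (0.9, 0.8), (0.85, 0.95)]
centers, groups = kmeans(pts, k=2)
```

With two well-separated point clouds, the two low points end up in one group and the two high points in the other.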
The elbow method is a common method for determining the optimal number of clusters. This method uses the within-cluster sum of squares (WCSS) value [38], which defines the total variance within a cluster.
The elbow method uses the following steps to determine the optimal number of clusters:
1. K-means clustering is performed on a given dataset for various values of K.
2. The WCSS value is calculated for each value of K.
3. The calculated WCSS values are plotted against the number of clusters K.
4. The point at which the plot bends like an arm gives the best value for K.

The elbow method is defined as:

WCSS = \sum_{K} \sum_{X_i \in C_K} \lVert X_i - C_K \rVert^2

where K denotes the number of clusters, X_i denotes an observation, and C_K denotes the cluster center.
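The WCSS computation behind the elbow method can be sketched as follows. The points and the candidate center sets for each K are hypothetical (in practice, each center set would come from a K-means run for that K); the sketch only shows that WCSS shrinks as K grows, and the "elbow" in the curve marks a good K.

```python
# WCSS for a set of points and cluster centers: each point contributes
# its squared distance to the nearest center.
def wcss(points, centers):
    total = 0.0
    for p in points:
        total += min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers)
    return total

pts = [(0.0,), (0.1,), (1.0,), (1.1,), (2.0,)]
# Hypothetical center sets for K = 1, 2, 3 (in practice from K-means runs)
center_sets = {
    1: [(0.84,)],
    2: [(0.05,), (1.37,)],
    3: [(0.05,), (1.05,), (2.0,)],
}
scores = {k: wcss(pts, cs) for k, cs in center_sets.items()}
```

Plotting `scores` against K and looking for the bend reproduces step 4 above.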

The Random Forest Model
Random forest is a machine-learning approach used for classification and regression problems. A Random forest is an ensemble of decision trees. For classification problems, the Random forest output is the class chosen by the majority of trees; for regression problems, the mean prediction of the individual trees is used.
Figure 1 shows the structure of the Random forest model. Random forest uses the ensemble technique of bagging, also known as bootstrap aggregation. Bagging selects samples at random from the original dataset, so rows are sampled with replacement (bootstrap samples) to construct each model. Each model is trained independently and generates its own results. When all models are combined, a majority vote (classification) or mean decision (regression) is made. Aggregation involves combining all of the results to generate the output. The RF model is robust to missing values and outliers and is less affected by noise. A detailed model of RF is given in [39].
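The bagging-and-averaging idea described above can be sketched as follows. A 1-nearest-neighbor learner stands in for a decision tree to keep the sketch short; the data, seed, and model count are invented, and this is not the authors' RF code.

```python
# Sketch of bootstrap aggregation (bagging) for regression: each base model
# is "trained" on a bootstrap sample (rows drawn with replacement), and the
# predictions are averaged.
import random

def fit_bagged(X, y, n_models=25, seed=0):
    rng = random.Random(seed)
    models = []
    n = len(X)
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]   # bootstrap: sample rows with replacement
        models.append(([X[i] for i in idx], [y[i] for i in idx]))
    return models

def predict(models, x):
    preds = []
    for Xb, yb in models:
        # base learner: 1-nearest-neighbor on the bootstrap sample
        j = min(range(len(Xb)), key=lambda j: abs(Xb[j] - x))
        preds.append(yb[j])
    return sum(preds) / len(preds)                   # aggregate by averaging

X = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [0.0, 1.1, 1.9, 3.2, 3.9]
models = fit_bagged(X, y)
```

Averaging over many bootstrap-trained models smooths out the variance of any single base learner, which is the property the RF model exploits.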

The Stacking RNN Ensemble Method
In this study, the stacking RNN was used as the benchmark method. Stacking RNN is an ensemble method based on stacked generalization: the first-level learners are trained and then combined using a second-level learner to obtain the final forecasting results. A more detailed explanation of the stacking RNN ensemble method can be found in [31].

The Linear Regression (LR) Model
This study uses a linear regression model [41] with ordinary least squares (OLS) to combine the forecasting results of the five RF forecasting models:

\hat{y} = \sum_{i=1}^{5} w_i x_i + b

where x_i denotes the forecast result of model i, w_i denotes the weight coefficient, and b denotes the intercept or bias.
The best fit is determined by minimizing the sum of squared errors:

\min_{w,b} \sum_{j=1}^{n} (y_j - \hat{y}_j)^2

The solution is obtained by solving the corresponding normal equations. To avoid overfitting, a regularization term is used to limit the magnitude of w:

- LASSO regression: LASSO stands for least absolute shrinkage and selection operator. LASSO regression performs L1 regularization by adding a penalty, with coefficient λ, equal to the sum of the absolute values of the coefficients:

\min_{w,b} \sum_{j=1}^{n} (y_j - \hat{y}_j)^2 + \lambda \sum_{i=1}^{5} |w_i|

- Ridge regression: Ridge regression performs L2 regularization by adding a penalty, with coefficient λ, equal to the sum of the squares of the coefficients:

\min_{w,b} \sum_{j=1}^{n} (y_j - \hat{y}_j)^2 + \lambda \sum_{i=1}^{5} w_i^2

where λ represents the penalty coefficient.
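The Ridge-regularized combination has a closed-form solution, which can be sketched as follows (LASSO has no closed form and is typically solved by coordinate descent). The five "RF forecasts" and the target here are synthetic, and the weights are illustrative, not the fitted values from the paper.

```python
# Ridge regression over the five RF forecasts (columns of X) plus a bias.
import numpy as np

def ridge_weights(X, y, lam):
    # augment with a column of ones for the bias b
    Xa = np.hstack([X, np.ones((X.shape[0], 1))])
    I = np.eye(Xa.shape[1])
    I[-1, -1] = 0.0                         # conventionally, do not penalize the bias
    # closed-form ridge solution: (Xa^T Xa + lam * I)^{-1} Xa^T y
    theta = np.linalg.solve(Xa.T @ Xa + lam * I, Xa.T @ y)
    return theta[:-1], theta[-1]            # weights w, bias b

rng = np.random.default_rng(0)
X = rng.random((50, 5))                     # five RF forecasts per sample (synthetic)
y = X @ np.array([0.3, 0.2, 0.2, 0.2, 0.1]) + 0.05   # synthetic "true" PV output
w, b = ridge_weights(X, y, lam=0.1)
```

With a small λ the recovered weights stay close to the generating ones; a larger λ shrinks them toward zero, trading a little bias for lower variance.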

The Support Vector Regression (SVR) Model
Support vector regression (SVR) is used in this study to combine the forecasting results of the five RF models. Figure 2 shows the structure of the SVR.
The function of the SVR is:

f(x) = w^{T} \varphi(x) + b

where f(x) represents the forecast values, φ(x) represents the kernel mapping of the inputs (an RBF kernel is used), and w and b represent the weight coefficients and the bias, respectively.
A penalty function is used to calculate the values of the coefficients w and b:

\min_{w,b,\xi,\xi^{*}} \; \frac{1}{2}\lVert w \rVert^2 + C \sum_{i=1}^{n} (\xi_i + \xi_i^{*}) \quad \text{s.t.} \; y_i - f(x_i) \le \varepsilon + \xi_i, \; f(x_i) - y_i \le \varepsilon + \xi_i^{*}, \; \xi_i, \xi_i^{*} \ge 0

where ‖w‖² represents the regularization term, C represents the penalty coefficient, and ε represents the maximum tolerable error.


Bayesian Optimization
In machine learning, hyperparameters must be tuned to ensure the performance of the prediction model; the best results are obtained using the optimal hyperparameters. Bayesian optimization is a global optimization algorithm that builds a probabilistic model of the function mapping hyperparameter values to the objective, which is then evaluated on a validation set. A detailed description of the Bayesian optimization algorithm can be found in [43].
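The search over the three hyperparameters (learner, regularization, λ) can be sketched as follows. A plain random search stands in for Bayesian optimization here (Bayesian optimization would instead fit a probabilistic surrogate of the validation loss and pick candidates by expected improvement), and the validation loss is a toy stand-in rather than the paper's objective.

```python
# Hypothetical hyperparameter search over (learner, regularization, lambda).
import random

SEARCH_SPACE = {
    "learner": ["LR", "SVR"],
    "regularization": ["LASSO", "Ridge"],
}

def validation_loss(learner, regularization, lam):
    # toy objective: pretend SVR + Ridge with lam near 0.1 validates best
    base = 1.0 if learner == "SVR" else 1.2
    base *= 0.9 if regularization == "Ridge" else 1.0
    return base + (lam - 0.1) ** 2

rng = random.Random(0)
best = None
for _ in range(200):
    cand = {
        "learner": rng.choice(SEARCH_SPACE["learner"]),
        "regularization": rng.choice(SEARCH_SPACE["regularization"]),
        "lam": rng.uniform(0.0, 1.0),
    }
    loss = validation_loss(**cand)
    if best is None or loss < best[0]:
        best = (loss, cand)
```

The output of either search is the same kind of object: the (learner, regularization, λ) triple with the lowest validation loss.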

Setup Modelling

2.5.1. Data Preprocessing
Data preprocessing involves data normalization, cleaning, repair, and splitting. During the data preparation stage, data are normalized using min-max normalization, defined as:

x_n' = \frac{x_n - x_{min}}{x_{max} - x_{min}} \quad (12)

where x_n' is the normalized data, x_n is the original data, and x_max and x_min are the maximum and minimum values of x_n. After normalization, data cleaning removes outliers and data repair replaces missing values using linear interpolation. The good data are then divided into training and testing sets.
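Min-max normalization from Equation (12) is a one-liner; the sample values below are invented for illustration.

```python
# Min-max normalization: rescale each value into [0, 1].
def min_max(xs):
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]

print(min_max([2.0, 4.0, 6.0]))   # -> [0.0, 0.5, 1.0]
```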

Datasets
The PV site for this study is located in the Zhangbin Industrial Area in Taiwan, at a latitude of 24.12809° and a longitude of 120.4281°. The site has ground-mounted panels with a 2000 kWp capacity. The PV power output data for 2020 are used for this study.
Two types of datasets are used for this study: meteorological data obtained from Solcast and measurement data from the PV site. The meteorological data from Solcast is open-access data that contains the real values for irradiance and weather at a 10 min resolution [44]. The meteorological features for this study are solar irradiance (GHI), air temperature, precipitation, relative humidity, and wind speed. These data are averaged to a one-hour resolution to meet the requirements of this study. The measured PV power output from the site's ground-mounted panels in the Zhangbin Industrial Area, Taiwan, is also used as a data feature.
The Pearson correlation coefficient (PCC) and t-statistics are used to select appropriate data features. The PCC is used to calculate the correlation between each weather variable and the PV power output. Its values lie between −1 and 1 [45]: r = 1 indicates a positive correlation, r = 0 indicates no correlation, and r = −1 indicates a negative correlation. The formula for the PCC is:

r = \frac{\sum_{i=1}^{N}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{N}(x_i - \bar{x})^2} \sqrt{\sum_{i=1}^{N}(y_i - \bar{y})^2}}

where r is the Pearson correlation coefficient, \bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i is the mean of x, and \bar{y} = \frac{1}{N}\sum_{i=1}^{N} y_i is the mean of y. The PCC, t-test, and p-value between the weather features and the PV power output are shown in Table 2. Precipitation and wind speed have a low correlation with PV power output, while solar irradiance, air temperature, and relative humidity have a high correlation. Even though precipitation and wind speed have a low correlation, the t-test results show that input variables with p-values less than 0.05 are still significant and can be used as input variables [24]. A previous study [5] also demonstrated that precipitation and wind speed indirectly affect PV power output. Therefore, solar irradiance (GHI), air temperature, precipitation, relative humidity, and wind speed are the weather variables used for this study.
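The PCC defined above can be computed directly from its formula; the irradiance and power values below are synthetic and chosen only to show a strong positive correlation.

```python
# Pearson correlation coefficient from its definition.
import math

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = math.sqrt(sum((x - mx) ** 2 for x in xs) *
                    sum((y - my) ** 2 for y in ys))
    return num / den

irradiance = [100, 200, 300, 400, 500]
pv_power   = [80, 190, 310, 390, 520]    # synthetic, strongly correlated
r = pearson(irradiance, pv_power)
```

For near-linearly related series like these, r comes out close to 1, which is why irradiance dominates the feature-selection table.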

Evaluation Criteria
The mean relative error (MRE), the mean absolute error (MAE), and the coefficient of determination (R²) are used as evaluation criteria to validate the error. The MRE normalizes the absolute error between the actual and forecast values by the nominal capacity of the photovoltaic facility [46]. The MAE represents the accuracy of the prediction [47]. R² is the coefficient of determination, which ranges from 0 to 1 [48]; the higher the value of R², the more accurate the model. The formulas for the MRE, MAE, and R² are:

MRE = \frac{1}{N} \sum_{i=1}^{N} \frac{|y_i - \hat{y}_i|}{N_p} \times 100\%

MAE = \frac{1}{N} \sum_{i=1}^{N} |y_i - \hat{y}_i|

R^2 = 1 - \frac{\sum_{i=1}^{N} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2}

where \hat{y}_i and y_i are the forecast value and the true value of the PV power output at the ith point, respectively, N is the number of prediction points, N_p is the PV site's nominal power capacity, and \bar{y} is the average PV power output.
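The three criteria can be sketched directly from their definitions. The true/forecast values below are synthetic, and N_p is taken as the plant's 2000 kWp nominal capacity from the case study.

```python
# MRE, MAE, and R² evaluation criteria.
def mre(y_true, y_pred, capacity):
    n = len(y_true)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / (n * capacity) * 100

def mae(y_true, y_pred):
    n = len(y_true)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n

def r2(y_true, y_pred):
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

y_true = [0, 500, 1200, 1500, 800]      # kW, synthetic
y_pred = [0, 450, 1300, 1450, 900]      # kW, synthetic
```

Note that the MRE is capacity-normalized, so a 60 kW average error at a 2000 kWp plant reads as an MRE of 3%.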

Ensemble Forecasting Strategy
The clustering method, classification techniques, RF models, and the regression-based ensemble model are used for the proposed PV power ensemble forecasting strategy. Figure 3 shows the overall structure of the ensemble PV power generation forecast. The general framework consists of three steps: model training, creation of the optimal set of weights, and model testing. In step 1, K-means clustering uses the daily average historical PV power output for k different weather conditions, using the optimal number of clusters k. The optimal number of clusters is calculated using the elbow method, which is five in this case. The five clusters are labeled as rainy, heavy-cloudy, cloudy, light-cloudy, and sunny. An RF model was then trained on each cluster using the historical hourly weather data as input and the PV power generation as output. In addition, an SVM classification model was trained using the historical daily average weather data as input and the label defined by K-means clustering as output. Five RF models (RF1, RF2, RF3, RF4, and RF5) and an SVM classification model are obtained in this step. Figure 4 shows the detailed process of step 1.
In step 2, the dataset for each cluster was trained using the RF models obtained in the first step. The PV power output of each Random forest model is then used as an input to the regression-based method to construct the set of weights. The hyperparameters for the regression-based method are the learner (LR or SVR), the regularization (LASSO or Ridge), and λ (the penalty coefficient for regularization). Bayesian optimization is performed to find the optimal values of these three hyperparameters. The optimal set of weights is obtained in this step; each set of weights contains five weight coefficients and a bias. Different weather conditions have different sets of weights to ensure accurate forecasting results. Figure 5 shows the detailed process of step 2.
In order to simulate weather forecasting inaccuracies, random errors of 10%, 20%, 30%, 40%, 50%, and 60% have been applied to actual weather values in load forecasting [49]. In step 3, we assumed that a random error of ±20% is added to the actual weather data to allow it to be used as forecasting data, due to insufficient weather forecasting data. The weather forecasting data for the target day is used as input for the five RF models. The average daily weather forecasting data is also used as input for the SVM classification model obtained in step 1. The SVM output selects the weather condition, and the corresponding set of weight coefficients from step 2 is combined with the output from each RF model to obtain the final forecasting results using (17):

\hat{Y} = w_1 \hat{y}_1 + w_2 \hat{y}_2 + \dots + w_5 \hat{y}_5 + b \quad (17)

where \hat{Y} is the final forecasting result, w_1, w_2, ..., w_5 are the weight coefficients, \hat{y}_1, \hat{y}_2, ..., \hat{y}_5 are the forecasting results of the RF models, and b is the bias. Figure 6 shows the detailed process of step 3.
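The final combination in step 3 is a single weighted sum. The forecasts, weight set, and bias below are hypothetical placeholders, not the fitted values from the paper; in the proposed method, the weight set would be the one selected by the SVM for the target day's weather condition.

```python
# Final ensemble combination: Y_hat = w1*y1 + w2*y2 + ... + w5*y5 + b
def combine(rf_forecasts, weights, bias):
    return sum(w * y for w, y in zip(weights, rf_forecasts)) + bias

rf_forecasts = [1510.0, 1480.0, 1530.0, 1495.0, 1500.0]   # kW, five RF models (synthetic)
weights = [0.1, 0.15, 0.2, 0.25, 0.3]                     # hypothetical "sunny" weight set
bias = 2.0
y_hat = combine(rf_forecasts, weights, bias)
```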

PV Power Forecasting Simulation Results
The simulation uses MATLAB R2021b on a computer with an Intel Core i7 CPU at 3.60 GHz and 8 GB of RAM. The stacking RNN ensemble method is used as a benchmark model against which the one-day-ahead PV power forecasting results are compared; the stacking RNN ensemble method has been shown to give accurate short-term PV power forecasts [31].

Test System
A case study used a 2000 kWp PV farm in the Zhangbin Industrial Area, Taiwan, as a test system to determine the accuracy of the PV power output forecasting. The actual irradiance and weather features from Solcast are used to train the model. Because weather prediction data are lacking, a ±20% random error is applied to the actual data to simulate the weather forecast. The measured PV power generation for this study was obtained from the Zhangbin Industrial Area's PV site in Taiwan.
The test system includes historical data for the PV power output and hourly average values for irradiance, temperature, precipitation, relative humidity, and wind speed. The data for 2020 are used as the dataset for the system. The data preprocessing described in Section 2.5.1 gives 300 days of good data for the simulation. This study uses twelve points per day for the PV power output and the corresponding weather variables: the values from 06:00 to 17:00.
The collected data are classified into five weather conditions using K-means clustering, and the elbow method is used to determine the optimal number of clusters. The weather conditions are sunny, light-cloudy, cloudy, heavy-cloudy, and rainy.
The training and testing datasets for the various weather conditions that are used to train and test the RF model and the ensemble method are shown in Table 3. From the 300-day dataset, 223 days (75%) are used to train the single RF models, and 77 days (25%) are used to test them. The test results for the single RF models are used as datasets to train and test the ensemble learner. Sixty days (80%) are used to train the ensemble learner, and 17 days (20%) are used for testing. A total of 10 days of the ensemble learner's testing data are used to test the proposed method for each model, and the proposed method is tested using the ensemble learner's testing data for seven consecutive days from 14 May to 20 May 2020. Figure 7 shows a detailed illustration of the RF model and the data preparation for the ensemble learner.

Hyperparameters Setting for the RF and Ensemble Models
The hyperparameters for the RF model are the number of trees and the minimum leaf size. The search spaces are 100, 200, 500, and 1000 trees, with minimum leaf sizes of 1, 3, and 5. Table 4 shows the parameters for each single RF model, as determined by experiment. The penalty coefficient (lambda), the learner, and the regularization are the hyperparameters tuned for the proposed ensemble model. Lambda is a positive coefficient. The regression-based learners are linear regression using ordinary least squares (OLS) and support vector regression (SVR), and the regularization methods are LASSO and Ridge regression.
The ensemble model uses Bayesian optimization, a global optimization method [50], to optimize the hyperparameters. The benchmark method is a stacking RNN with the same structure as that of a previous study [31]. Table 5 shows the parameters for the benchmark model and the optimal hyperparameters for the proposed ensemble model, as determined by the optimization process.

Short-Term PV Power Output Forecasting
K-means clustering is used to label the data, and the elbow method is used to determine the optimal number of clusters (k); the plot with the best number of clusters is shaped like an arm. The elbow method gives the results shown in Figure 8: the optimal value for k is 3-5 clusters. Ensemble forecasting requires diverse models [51], so different individual models use different datasets [32] or the same dataset with different parameters [52]; the maximum value is therefore assigned for k, which in this case is 5. The results of a previous study [31] show that an ensemble of five models outperforms an ensemble of three models in terms of accuracy. The five weather conditions for this study are sunny, light-cloudy, cloudy, heavy-cloudy, and rainy. The RF models are trained using these five weather conditions, and five RF models are produced: RF1, RF2, RF3, RF4, and RF5, for rainy, heavy-cloudy, cloudy, light-cloudy, and sunny, respectively. The dataset for each weather condition is trained using these five RF models in order to calculate a set of weights; each set contains five weight coefficients and a bias. Regression-based ensemble learning with Bayesian optimization is then used to calculate the five optimal sets of weight coefficients and a bias for each weather condition, and Equation (17) is used to calculate the final ensemble forecasting results for each weather condition.
The SVM classification model receives the weather forecast as input and outputs the weather condition for the target day, and the corresponding weight set is then used. To simulate weather forecasting, a ±20% random error is added to the real weather values.
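A minimal sketch of this classification step, with synthetic data standing in for the real daily averages and condition labels: the ±20% error simulation follows the paper's description, while the features and label assignments are placeholders.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(2)
real_weather = rng.random((300, 3))        # placeholder daily-average features
labels = rng.integers(0, 5, size=300)      # 0=rainy .. 4=sunny (synthetic labels)

# Train the weather-condition classifier on historical daily averages.
clf = SVC(kernel="rbf").fit(real_weather, labels)

# Simulated forecast: real values perturbed by a uniform +/-20% error.
forecast = real_weather * rng.uniform(0.8, 1.2, real_weather.shape)

# The predicted condition selects which (weights, bias) set from step 2 to apply.
condition = clf.predict(forecast[:1])[0]
print("predicted condition:", condition)
```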
Figure 9 shows the results for the RF models and the proposed regression-based ensemble forecasting method for sunny weather conditions. The RF5 model has the lowest MRE value of the RF models, at 7.91%. The stacking RNN shows that an ensemble method gives more accurate PV power forecasting, with an MRE value of 4.49%. However, the proposed ensemble method provides the most accurate results, with an MRE of 3.49%. Figure 10 shows the results for light-cloudy weather conditions. Compared to the other RF models, the RF5 model has the lowest MRE value, at 5.83%. With an MRE value of 5.61%, the stacking RNN again demonstrates that an ensemble model provides more accurate forecasting, but the proposed ensemble method produces the most accurate results, with an MRE of 5.22%. The results for cloudy conditions are shown in Figure 11. The RF3 model has the lowest MRE value of the RF models, at 6.19%. The stacking RNN outperforms the best RF model, with an MRE value of 5.49%, but the proposed ensemble method again has the lowest MRE, of 4.19%. The results for heavy-cloudy weather conditions are shown in Figure 12. The RF3 model has the lowest MRE value of the RF models, at 4.62%. With an MRE value of 4.4%, the stacking RNN outperforms the best RF model, but the proposed ensemble method has the lowest MRE of all models, at 3.93%. Figure 13 shows the results for rainy conditions. The RF3 model has the lowest MRE value of the RF models, at 1.87%, and the stacking RNN outperforms it with an MRE value of 1.76%; nevertheless, the proposed ensemble method has the best MRE value of 1.59%. The proposed method is also more accurate than the best RF and benchmark methods in terms of the coefficient of determination (R2): it has a higher R2 value for all weather conditions, with the lowest R2 value occurring for rainy conditions. The benchmark stacking RNN is an ensemble method and is much more accurate than a single forecasting method such as an RF model, but the proposed ensemble method is more accurate still, as demonstrated by the results for the sunny, light-cloudy, cloudy, heavy-cloudy, and rainy datasets.
The performance of the proposed regression-based ensemble method for short-term PV power forecasting was also tested using data for seven consecutive days to represent a real industrial application. Figure 14 shows a 7-day comparison of the proposed method, and Figure 15 compares the MRE, MAE, and R2 for the best RF model, the stacking RNN, and the proposed method. The best single RF model has an MRE of 5.611%, an MAE of 112.223 kW, and an R2 of 0.903. The benchmark stacking RNN has an MRE of 4.457%, an MAE of 89.130 kW, and an R2 of 0.93, so it is more accurate than the best RF model. The proposed method has an MRE of 4.362%, an MAE of 87.242 kW, and an R2 of 0.933, so it is the most accurate method. In terms of the MRE, the proposed method is 22% better than the best RF model and 2% better than the benchmark method.
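The three reported metrics can be computed as below. The MAE and R2 are standard; the MRE is assumed here to be the MAE normalized by the 2000 kWp plant capacity, which is consistent with the reported 7-day figures (87.242 kW / 2000 kW ≈ 4.36%).

```python
import numpy as np

CAPACITY_KW = 2000.0  # rated capacity of the Zhangbin PV site

def metrics(y_true, y_pred):
    """Return (MRE in %, MAE in kW, R2) for a forecast series.

    MRE is taken as the MAE normalized by plant capacity -- an assumption
    that matches the numbers reported in the paper.
    """
    err = y_pred - y_true
    mae = float(np.mean(np.abs(err)))
    mre = mae / CAPACITY_KW * 100.0
    ss_res = float(np.sum(err ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot
    return mre, mae, r2

# Small synthetic demonstration (values are illustrative only).
y_true = np.array([100.0, 200.0, 300.0])
y_pred = np.array([110.0, 190.0, 310.0])
mre, mae, r2 = metrics(y_true, y_pred)
print(mre, mae, r2)  # → 0.5 10.0 0.985
```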

Conclusions
To increase the prediction accuracy for a one-day-ahead PV power forecasting strategy, this study has proposed a short-term PV power forecasting algorithm that uses a regression-based ensemble forecasting method. The ensemble model is constructed by combining individual RF forecasting models. K-means clustering and an SVM classification model are also used to increase the accuracy of the proposed method.
The combination strategy for this study uses linear regression (LR) and support vector regression (SVR), with LASSO and Ridge as regularization methods. The simulation results show that the proposed method is 20% more accurate than the best RF model. The benchmark ensemble forecasting method for this study is a stacking RNN, and the proposed method is 2% more accurate than the stacking RNN. The results of this study show that ensemble forecasting strategies, particularly the proposed method, are much more accurate than single forecasting models.
Future studies will involve the use of a metaheuristic optimization method to determine the optimal weighting coefficients and increase the accuracy of the proposed method. Dynamic ensembles will also replace static ensembles to increase the accuracy by recalculating the weights of the individual prediction models for each new input sample [53].

Figure 3 .
Figure 3. The general framework of the one-day-ahead ensemble PV power forecasting strategy.

Figure 4 .
Figure 4. The detailed process of step 1.

Figure 5 .
Figure 5. The detailed process of step 2.

Figure 6 .
Figure 6. The detailed process of step 3.

Figure 7 .
Figure 7. Details of the data preparation procedure.

Figure 8 .
Figure 8. The elbow method for optimizing the number of clusters.

Figure 9 .
Figure 9. Results for one-day-ahead PV power forecasting for sunny weather conditions.

Figure 10 .
Figure 10. Results for one-day-ahead PV power forecasting for light-cloudy weather conditions.

Figure 11 .
Figure 11. Results for one-day-ahead PV power forecasting for cloudy weather conditions.

Figure 12 .
Figure 12. Results for one-day-ahead PV power forecasting for heavy-cloudy weather conditions.

Figure 13 .
Figure 13. Results for one-day-ahead PV power forecasting for rainy weather conditions.

Figure 14 .
Figure 14. The comparison of the one-day-ahead PV power forecasting results for seven consecutive days (14 May 2020-20 May 2020).

Table 1 .
Previous studies of ensemble PV power forecasting.

Table 2 .
The statistical test between weather features and PV power output.

Table 3 .
The number of days that are used for training and testing.

Table 4 .
Parameters for the single RF models.

Table 5 .
Hyperparameters for the ensemble model.

Table 6
compares the proposed method to the RF models and the benchmark method in terms of one-day-ahead observations. The proposed method produces the lowest MRE and MAE values. The MRE for sunny weather conditions is 3.492%, while the MRE values for the stacking RNN and the best RF model are 4.495% and 7.905%, respectively. The proposed method gives an MAE of 69.833 kW, and the best RF and stacking RNN models give MAE values of 158.1 kW and 89.893 kW, respectively. For light-cloudy weather conditions, the proposed method achieves a 5.222% MRE and a 104.434 kW MAE. The best RF and stacking RNN models give MRE values of 5.833% and 5.607%, respectively, and MAE values of 116.651 kW and 112.13 kW.

Table 6 .
Forecasting accuracy for all weather conditions.

For cloudy weather conditions, the proposed method gives an MRE of 4.195% and an MAE of 83.902 kW, values that are lower than those for the stacking RNN model, with its 5.497% MRE and 109.935 kW MAE. The best RF model is less accurate than the stacking RNN model, giving a 6.189% MRE and a 123.781 kW MAE. The proposed method is also more accurate than the best RF and stacking RNN methods for heavy-cloudy conditions, with a 3.934% MRE and an MAE of 78.688 kW. For rainy conditions, the proposed method gives a 1.599% MRE and a 31.976 kW MAE, and the best RF method performs worse, with a 1.871% MRE and an MAE of 37.42 kW. The MRE value for the stacking RNN method is 1.76% and the MAE value is 35.199 kW.