Feasibility Study on the Influence of Data Partition Strategies on Ensemble Deep Learning: The Case of Forecasting Power Generation in South Korea

Abstract: Ensemble deep learning methods have demonstrated significant improvements in forecasting solar panel power generation using historical time-series data. Although many studies have used ensemble deep learning methods with various data partitioning strategies, most have focused only on improving the predictive methods by combining several different models or tuning hyperparameters and their interactions. In this study, we contend that the precision of power generation forecasting can be enhanced by identifying a suitable data partition strategy and establishing the ideal number of partitions and subset sizes. Thus, we propose a feasibility study of the influence of data partition strategies on ensemble deep learning. We selected five time-series data partitioning strategies, namely window, shuffle, pyramid, vertical, and seasonal, that allow us to identify different characteristics and features in the time-series data. We conducted various experiments on two solar panel datasets collected in Seoul and Gyeongju, South Korea. Additionally, LSTM-based bagging ensemble models were applied to combine the advantages of several single LSTM models. The experimental results reveal that the data partition strategies positively influence the forecasting of power generation. Specifically, the results demonstrate that ensemble models with data partition strategies outperform single LSTM models by approximately 4–11% in terms of the coefficient of determination (R²) score.


Introduction
Renewable energy refers to the generation of electricity from natural, sustainable resources such as the sun, wind, and water. Solar energy is one of the most popular renewable energy sources; it supplies electric energy to homes and businesses by capturing sunlight. Countries have recently been paying attention to solar energy development because of its advantages: it is inexhaustible, non-polluting, and cost-competitive, and it reduces dependence on fossil fuels and natural gas, among other benefits [1]. Even the COVID-19 pandemic did not significantly affect solar energy market development, apart from some delays due to lockdowns [2]. Like many other countries, South Korea's government is interested in increasing solar energy usage. More specifically, the government declared the goal of a low-carbon and eco-friendly nation by increasing the renewable energy market to 40% by 2030 from the current 30% [3]. Despite the benefits of solar energy, the provision of electrical energy from solar panels also has some drawbacks: a high initial investment, the ample space required for installing solar panels, and the inefficiency of solar panels [1]. Moreover, solar energy is considered intermittent because solar panels produce energy only from sunlight. Energy storage systems therefore exist to keep the power supply uninterrupted. However, persistent bad weather, such as cloudy, rainy, or snowy weather, can still interrupt the power supply.

The main contributions of this study are as follows:

• First, we propose an accurate methodology for forecasting daily and hourly solar panel power generation using an ensemble deep learning model and data partitioning. The method consists of three steps: partitioning time-series data, training models using partitioned subsets, and aggregating the results of each model to obtain the final forecasted power generation.
• Furthermore, we use five simple data partition strategies, namely window, shuffle, pyramid, vertical, and seasonal, to investigate the influence of each strategy on the accuracy of forecasting the solar panel power generation. The data partition strategies are selected to divide the datasets into effective subsets with different characteristics and features in the time-series data. The ensemble model can comprehend multiple characteristics of the data by learning from these varied subsets. The experiments evaluated both the subset sizes and the number of partitions.

• Finally, we evaluated the proposed data partition strategies through extensive experiments using LSTM to forecast the power generation of the solar panels. The experiments examined each data partition using LSTM models with different hyperparameters and checked the influence of different numbers of partitions and subset sizes. We evaluated the experiments on two independent datasets to demonstrate the applicability of the proposed method.
The remainder of this paper is organized as follows: prior studies on the forecasting of solar panel power generation are explained and discussed in Section 2; the materials and methods used in this study are explained in Section 3; the evaluation methods and evaluation results are presented in Section 4; and finally, Section 5 summarizes and concludes this study and discusses future works.

Related Work
This section reviews related works that propose machine learning and deep learning methods for forecasting power generation from renewable energy sources such as wind, hydropower, and solar panels. We discuss each study under two categories: single methods and ensemble methods. Additionally, the distinctions between our methodology and those of the related studies are discussed.

Single Methods
Lee et al. [5] predicted the daily solar panel power generation using time-sequential predictive methods: RNN, LSTM, and gated recurrent units (GRU). The monitoring system in Tainan, Taiwan, provided the data used in this study, which contain information from three sources: the Central Weather Bureau of Taiwan, the Environmental Protection Administration of Taiwan, and the solar power monitoring systems themselves. Experiments on a single inverter showed an accuracy of 89%. Furthermore, the authors used the generative adversarial network (GAN) method to extend the number of inverters to eight. In those experiments, the accuracy of the bidirectional GRU model (i.e., 93%) outperformed other models, such as GRU and LSTM, by approximately 2–17%. Abdel-Nasser and Mahmoud [7] forecasted hourly solar panel power generation using an LSTM-RNN. The experimental results showed that the forecasting error of LSTM was lower than that of other methods, such as multiple linear regression (MLR), bagged regression trees (BRT), and neural networks. The authors noted that the recurrent architecture and memory units of LSTM are efficient for tracking temporal changes in solar panel power generation. However, they also noted the study's limitations: the effect of outliers was not studied, and environmental features were not incorporated.
Deenadayalan and Vaishnavi [21] forecasted the future power generation of solar panels and wind turbines using fault identification and remediation. Specifically, the proposed deep learning method consists of parameter adjustment using modified grey wolf optimization (MGWO), fault identification using a CNN-based classifier, power generation forecasting using a regression neural network, and fault remediation using a discriminative gradient. The study dataset was obtained from solar panels and wind turbines in India. The performance analysis showed that the proposed system has a lower error rate than other state-of-the-art methods. Wang et al. [22] forecasted solar irradiance using a new direct explainable neural network whose prediction results are easy to interpret. The proposed network can explain the relationship between the input and the output by extracting the nonlinear mapping features in solar irradiance. Experiments conducted on a solar irradiance dataset from Lyon, France, showed better prediction performance and explainability. Zsiborács et al. [23] studied the difference between day-ahead and intraday solar panel power generation forecasts and the actual generation data among the member states' transmission system operators of the European Network of Transmission System Operators. The results show that the intraday forecasts are less skillful than the day-ahead forecasts in all but one of the countries, which highlights the significance of further application-related studies on the intraday horizon. Tu et al. [24] proposed a grey wolf optimization-based general regression neural network for short-term solar power forecasting. The authors claimed that the proposed method provides more accurate predictions with shorter computational times, and their experiments revealed that it can significantly enhance the prediction accuracy of PV systems.

Ensemble Methods
Tan et al. [17] explained that it is challenging to develop an accurate and robust model for forecasting power demand owing to the intense volatility of industrial power loads. Therefore, they proposed a hybrid ensemble method to forecast ultra-short-term industrial power demand. The ensemble method employs different ensemble strategies, such as bagging, random subspace, and boosting. The study evaluated the proposed methods using an open dataset collected from the Australian Energy Market Operator (AEMO), open half-hourly electricity load data from 2013, and a practical dataset from a real steel plant. The experiments demonstrated that the proposed ensemble method achieved greater accuracy and robustness. Wang et al. [18] used an LSTM deep learning model based on the bagging ensemble method to forecast the inflow of hydropower stations. The bagging ensemble method integrates the outputs of member models; among the possible ways to integrate the outputs, this study employs a weighted average, which takes the accuracy of each member model into account. Data from a hydropower station in southern China from 2015 to 2017 were used. In the experiments, the proposed ensemble method outperformed the other individual models by 0.2% (i.e., deep belief network, random forest regression, GBRT, and LSTM) to 18.7% (i.e., support vector regression). Su et al. [25] proposed a modification to improve the ensemble learning framework for forecasting solar power generation. This study implemented a novel adaptive residual compensation (ARC) algorithm and an evolutionary optimization technique. ARC increases the reliability of conventional models by accounting for the residuals caused by prediction errors. The authors aimed to forecast the hourly power generation at three solar panel sites. The experimental results showed that the proposed method improves the traditional ensemble methods by approximately 12% in terms of the R² score.
Lotfi et al. [26] presented a novel ensemble method based on kernel density estimation (KDE) to forecast solar panel power generation. The proposed method forecasts inverter AC power using meteorological variables, such as wind speed, temperature, solar irradiance, precipitation, and humidity. The dataset, covering one year from 15 March 2015 to 15 March 2016, was taken from an actual solar panel site in the vicinity of Coimbra, Portugal. First, the authors identified the most similar cases in the historical dataset using KDE. The results of the individual models were then ensembled based on these similar cases. The suggested method performed better in spring, summer, and fall than the irradiance forecast and neural network methods; however, it could not overcome the limitations of the neural network method in winter. Wen et al. [27] used a hybrid ensemble model to forecast solar panel output intervals. The ensemble model has four individual models: BPNN, radial basis function neural network (RBFNN), extreme learning machine (ELM), and Elman NN. First, the ensemble model forecasts the irradiance, temperature, and wind speed. The authors then proposed a ship motion model to predict the power output based on the forecasted features. In contrast to other solar panel deployments, this study focuses on solar panels deployed on shipboard; the authors emphasized how the location, date, time zone, local time, and rolling angle of the ship affect the solar panel output. The authors designed seven ensemble combination models, and the seventh model, combining the BPNN, RBFNN, ELM, and Elman NN, showed the lowest root mean squared error (RMSE). Zhang et al. [28] presented an ensemble method to forecast day-ahead power generation in solar panel systems. The dataset of this study comes from free data sources, such as the SolrenView server and the North American Mesoscale Forecast System.
The authors combined clustering and blending strategies to improve solar power forecasting accuracy. The proposed forecasting method reduced the normalized RMSE by 13.8–61.21% over the three baseline methods. Kim et al. [29] developed a stacking ensemble SARIMAX-LSTM model to predict power generation for several solar power plants in various regions of South Korea. The authors combined the spatial and temporal characteristics of solar PV generation extracted from satellite images with numerical text data. The experimental results revealed that their proposed model outperformed other state-of-the-art methods, such as SARIMAX, LSTM, and Random Forest.

Discussions
In the field of renewable energy, forecasting power generation benefits from both single and ensemble methods. Although single machine learning methods have been quite effective in forecasting power generation, they may not be robust enough to handle the challenges of time-series data. Ensemble learning aims to overcome these challenges by combining the results of two or more predictive models to create a model that is more stable and accurate than any single predictive model. Although numerous studies have employed ensemble methods with different data partitioning methods, most have focused on enhancing the predictive methods by integrating various models or tuning hyperparameters and interactions. Unlike existing ensemble learning methods, we performed empirical experiments to evaluate the influence of time-series data partition strategies, the number of partitions, and subset sizes.

Overview
This study aimed to forecast solar panel power generation using LSTM and data partitions. Figure 1 illustrates the overall flow of the proposed methodology, which consists of the following steps: data fusion, data partitioning, model training, and model evaluation. We concatenated the datasets from different domains based on the DateTime field and applied data preprocessing methods, such as filling missing hours, filling missing values, filtering hours, and scaling. After preprocessing, we trained the data-based ensemble LSTM models using the various data partitions: window, shuffle, pyramid, vertical, and seasonal. The proposed methodology is evaluated using R², RMSE, and mean absolute error (MAE), which are widely used metrics for regression problems. The proposed method is discussed in detail in the subsequent subsections.

Study Area
This study used datasets from two types of solar panel plants: a testbed and an actual plant (Figure 2). The first location (Site A) was a testbed solar panel plant in Seoul, South Korea, with an installed capacity of 30 kW. The second location (Site B) was an actual solar panel plant in Gyeongju City, South Korea, with an installed capacity of 1500 kW. The datasets of both sites consist of solar panel and weather features, while the dataset of Site A has some additional features, such as power factor and slope. All datasets were provided by Daeyeon C&I [30], a South Korean renewable energy company that has been developing solar power generation and monitoring systems since 1998.


Table 1 shows detailed information on the raw datasets before preprocessing. The dataset of Seoul (Site A) consisted of 12 features and 26,280 samples over three years, and the dataset of Gyeongju (Site B) had eight features and 35,487 samples over four years. Neither original dataset includes missing values. The source, name, abbreviation, and description of all the features of the datasets are listed in Table 2. These features come from two primary sources: solar panels and weather. Moreover, we used two derived features: month and hour. Implementing machine learning or deep learning models on a single dataset might not be convincing, because the chosen dataset could happen to fit the models well by chance. Therefore, we demonstrate the viability of our proposed methodology on two datasets with different locations, features, and characteristics.

Data Preprocessing
The data preprocessing stage comprises two parts: exploratory analysis and normalization. Time-series data are collected over time intervals, such as minutes, hours, and days. In the real world, however, time-series data are frequently intermittent, which makes the daily distribution of our datasets uneven. Specifically, there are usually 24 h of data per day, but on some days only 23 h or fewer are recorded. In the exploratory analysis, we first filled these missing hours with NaN values. Next, we filled in the NaN values using linear interpolation. After the datasets were combined, we extracted the relevant information from the raw data. More specifically, solar panels do not collect power all day; there are only some active hours, such as 6 a.m. to 6 p.m. Therefore, data from the other hours (i.e., 7 p.m. to 5 a.m.) can affect a prediction model adversely; this problem is called "bias in the data" in data analysis. Figure 3 shows the power generation of the solar panels by hour in the Site B dataset. Based on the figure, only data from 7 a.m. to 5 p.m. were used. Figure 4 shows the correlation between the solar panel power generation and time in the datasets. As the figures demonstrate, the data distributions of the two datasets were similar. The rush hours for solar panels are from 10 a.m. to 3 p.m. Additionally, solar panels produce more power from April to June. The solar panel power generation is low in July and August because they are the rainiest months in South Korea. Table 3 describes the statistics of each feature of the datasets after the exploratory analysis. The features in the datasets differ significantly from one another. For example, the range of power generation was from 0 to 1400 at Site B, while the range of temperature was from −13 to 39.
Such large differences between the values of different features increase the uncertainty of the prediction models. Consequently, we scaled the datasets using min-max normalization.
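The preprocessing steps described above (filling missing hours, linear interpolation, active-hour filtering, and min-max scaling) can be sketched in pandas as follows. This is an illustrative sketch, not the authors' code; the `power` column name and the hourly DatetimeIndex are assumptions.

```python
import pandas as pd

def preprocess(df):
    """Fill missing hours, interpolate, keep active hours, and min-max scale.

    Assumes a DataFrame indexed by hourly timestamps with numeric columns;
    column names are hypothetical.
    """
    # Fill missing hours with NaN by reindexing onto a complete hourly range
    full_index = pd.date_range(df.index.min(), df.index.max(), freq="h")
    df = df.reindex(full_index)
    # Fill the NaN values using linear interpolation
    df = df.interpolate(method="linear")
    # Keep only the active solar hours (7 a.m. to 5 p.m.), as in the Site B analysis
    df = df[(df.index.hour >= 7) & (df.index.hour <= 17)]
    # Min-max normalization to [0, 1] per feature
    return (df - df.min()) / (df.max() - df.min())
```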

Data Partition
This study proposes a methodology for ensemble LSTM models using several data partition strategies: window, shuffle, pyramid, vertical, and seasonal. Each data partition strategy reveals different characteristics and features in the time-series data, enabling the ensemble to recognize them. Moreover, different numbers of partitions and subset sizes are assessed in empirical experiments. After preprocessing, the data were first split into 80% for learning and 20% for testing. The learning data were used to extract the training and validation datasets. The testing data were used to evaluate the prediction models and to compare the proposed data partition strategies, numbers of partitions, and subset sizes.
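The chronological 80/20 learning/testing split described above can be written as a small helper; this is a minimal sketch for illustration, with the function name and ratio parameter chosen here rather than taken from the paper:

```python
def split_learning_test(data, test_ratio=0.2):
    """Chronological split: the last `test_ratio` share of samples becomes
    the test data, and the earlier samples become the learning data."""
    cut = int(len(data) * (1 - test_ratio))
    return data[:cut], data[cut:]
```

Keeping the split chronological (rather than random) prevents future samples from leaking into the learning data, which matters for time-series forecasting.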

Window Data Partition
The window data partition divides the learning data into a given number of partitions by moving a fixed-size window through the learning data samples. The extracted subsets have the same size, and each subset contains similar characteristics because the subsets cover similar periods. This data partition strategy is a straightforward method for reducing the noise in large data samples, since a smaller subset contains less noise than the full dataset. Algorithm 1 explains the window data partition procedure. The learning data D, length of the learning data N, length of one partition n, and number of partitions splitN are the inputs for the algorithm. The output of the algorithm is a set of partitions P. The length of a partition n and the number of partitions splitN are initialized in line 1. In line 2, the algorithm calculates the step size stepSize by dividing the difference between the length of the learning data N and the length of one partition n by the difference between the number of partitions splitN and 1. Line 3 selects the partition index from the number of partitions splitN. In lines 4–5, the algorithm calculates the start and end indices for data selection. Then, lines 6–7 select the data between the calculated indices and place them into the set of partitions P. The algorithm is completed in line 8 when the set of partitions is filled by the given number of partitions.

Algorithm 1. Window data partition
Input: D ← learning data, N ← length of learning data, n ← length of a partition, splitN ← number of partitions
Output: P ← set of partitions
Procedure:
1 Initialize n, splitN
2 Calculate step size: stepSize = (N − n)/(splitN − 1)
3 foreach i in range(0, splitN) do
4 Calculate start index: startIndex = i * stepSize
5 Calculate end index: endIndex = startIndex + n
6 Select data between the indices: p = D[startIndex : endIndex]
7 Append p into P
8 end
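The window partition procedure can be sketched in Python as follows. Integer division for the step size is an assumption here, since the paper does not state how fractional step sizes are handled:

```python
def window_partition(D, n, splitN):
    """Window data partition: slide a fixed-size window of length n over the
    learning data D to produce splitN (possibly overlapping) subsets."""
    N = len(D)
    # Step between consecutive window start positions (integer division assumed)
    stepSize = (N - n) // (splitN - 1)
    P = []
    for i in range(splitN):
        startIndex = i * stepSize
        endIndex = startIndex + n
        P.append(D[startIndex:endIndex])
    return P
```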

Shuffle Data Partition
The shuffle data partition divides the learning data into a given number of fixed-size partitions. As in the window data partition, the subsets are of equal size and contain similar characteristics, and the strategy shares the advantage of selecting specific parts of the training dataset. In contrast to the window data partition, however, each partition here covers a random portion of the total data, and it is possible that some part of the total data falls into no partition at all. Algorithm 2 shows the procedure for the shuffle data partition. The inputs and outputs of Algorithm 2 are the same as those of Algorithm 1. In line 1, the length of a partition n and the number of partitions splitN are initialized. Line 2 calculates the highest point that can be selected as a random start index; if an index exceeded this point, a partition of size n could not be selected. In line 3, the loop over the number of partitions begins. Line 4 obtains a random start index below the limit, and line 5 calculates the end index. Then, lines 6–7 select the data between the calculated indices and place them into the set of partitions P. The algorithm is completed in line 8 when the set of partitions is filled by the number of partitions.

Algorithm 2. Shuffle data partition
Input: D ← learning data, N ← length of learning data, n ← length of a partition, splitN ← number of partitions
Output: P ← set of partitions
Procedure:
1 Initialize n, splitN
2 Calculate the limit for the start index: startLimit = N − n
3 foreach i in range(0, splitN) do
4 Get random start index: startIndex = randomInt(0, startLimit)
5 Calculate end index: endIndex = startIndex + n
6 Select data between the indices: p = D[startIndex : endIndex]
7 Append p into P
8 end
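A Python sketch of the shuffle partition; the `seed` parameter is an addition for reproducibility, not part of the original algorithm:

```python
import random

def shuffle_partition(D, n, splitN, seed=None):
    """Shuffle data partition: each subset starts at a random index, so the
    partitions cover random (possibly overlapping) portions of D."""
    rng = random.Random(seed)
    startLimit = len(D) - n  # highest valid start index for a window of size n
    P = []
    for _ in range(splitN):
        startIndex = rng.randint(0, startLimit)
        P.append(D[startIndex:startIndex + n])
    return P
```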

Pyramid Data Partition
The pyramid data partition is a strategy in which the partition size increases from small to large. The first partition is a fixed-size partition taken from the center of the data samples; each subsequent partition is broadened toward both ends of the total dataset. Simply put, this data partitioning strategy has the advantage of producing subsets of different sizes, which the ensemble model can combine. Algorithm 3 shows the procedure for the pyramid data partition. The inputs and outputs of Algorithm 3 are identical to those of Algorithms 1 and 2. In line 1, the length of a partition n and the number of partitions splitN are initialized. In lines 2–3, the first start and end indices are calculated. Line 4 calculates the step size by which the start and end indices are broadened. In line 5, the loop over the number of partitions begins. In lines 6–9, the algorithm selects the data for a partition: if the start index is equal to or lower than 0, the entire learning data is selected as the partition (line 7); otherwise, a partition is selected between the start and end indices. The algorithm then appends the selected data to the set of partitions, updates startIndex by subtracting the step size, and updates endIndex by adding the step size. The algorithm is completed when the set of partitions is filled by the number of partitions.
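A Python sketch of the pyramid partition. The exact formula for the step size is an assumption (chosen so that the last partition spans the whole dataset), since the paper does not give it explicitly:

```python
def pyramid_partition(D, n, splitN):
    """Pyramid data partition: start from a window of size n centred on D and
    widen it symmetrically until it spans all of the learning data."""
    N = len(D)
    startIndex = (N - n) // 2
    endIndex = startIndex + n
    # Amount by which each side grows per partition (assumed formula)
    stepSize = (N - n) // (2 * (splitN - 1))
    P = []
    for _ in range(splitN):
        if startIndex <= 0:
            P.append(D[:])  # the widest partition is the entire learning data
        else:
            P.append(D[startIndex:endIndex])
        startIndex -= stepSize
        endIndex += stepSize
    return P
```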

Vertical Data Partition
The vertical data partition strategy splits the learning dataset vertically rather than horizontally, in contrast to the other data partition strategies. It splits the datasets by selecting subsets of relevant variables, thereby reducing dimensionality; it is inspired by variable selection methods in machine learning. The feature sets are first created manually. Specifically, all the features of the datasets were divided into several subsets. In Site A, the feature sets consist of "Slope, Power Factor, Horizontal Irradiation, PV Temperature, Temperature," "Power Factor, Horizontal Irradiation, PV Temperature, Temperature, Humidity," "Horizontal Irradiation, PV Temperature, Temperature, Humidity, Sunshine," "PV Temperature, Temperature, Humidity, Sunshine, Solar Radiation," and "Temperature, Humidity, Sunshine, Solar Radiation, Cloud." At Site B, the feature sets consisted of "Temperature, Humidity," "Humidity, Dew Point," "Dew Point, Solar Radiation," and "Solar Radiation, Cloud." Additionally, the "Month" and "Hour" features are added to every subset. Algorithm 4 shows the procedure for the vertical data partition. The inputs for the algorithm are the learning data D and the feature sets S. The output of the algorithm is a set of partitions P. In line 1, the feature sets S are initialized. Line 2 iterates over the feature sets. In lines 3–4, the algorithm creates partitions based on the selected features. The algorithm is completed in line 5.

Algorithm 4. Vertical data partition
Input: D ← learning data, S ← feature sets
Output: P ← set of partitions
Procedure:
1 Initialize S
2 foreach s in S do
3 Select data related to the set s: p = D[s]
4 Append p into P
5 end
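With the learning data held in a pandas DataFrame, the vertical partition reduces to column selection. The sketch below appends "Month" and "Hour" to every subset, as described above; the column names are those listed in the text:

```python
import pandas as pd

def vertical_partition(df, feature_sets):
    """Vertical data partition: each subset keeps a different group of
    columns, with the derived 'Month' and 'Hour' features added to all."""
    P = []
    for s in feature_sets:
        P.append(df[s + ["Month", "Hour"]])
    return P
```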

Seasonal Data Partition
Seasonal data partitioning is a two-level data partition strategy that can be used to capture the seasonal features of the datasets. Algorithm 5 presents the procedure for seasonal data partitioning. First, the datasets were divided into subsets by time-logical splitters, such as monthly or hourly. Monthly, we split the datasets by season: winter (December, January, and February), spring (March, April, and May), summer (June, July, and August), and autumn (September, October, and November). Hourly, we split the datasets into three hour ranges: morning (7–10), noon (11–14), and evening (15–17). Each subset was then subjected to the window partition strategy. Splitting the datasets by seasonal factors creates subsets with similar characteristics and improves the accuracy and stability of the model, because the predictive model always learns from the same time ranges, such as winter, summer, morning, or evening.

Algorithm 5. Seasonal data partition
Input: D ← learning data, n ← length of a partition, splitN ← number of partitions
Output: P ← set of partitions
Procedure:
1 Initialize n, splitN
2 Split D into seasonal subsets S by month or hour
3 foreach s in S do
4 Calculate step size: stepSize = (len(s) − n)/(splitN − 1)
5 foreach i in range(0, splitN) do
6 Calculate start index: startIndex = i * stepSize
7 Calculate end index: endIndex = startIndex + n
8 Select data between the indices: p = s[startIndex : endIndex]
9 Append p into P
10 end
11 end
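A Python sketch of the two-level seasonal partition for the monthly case: a first-level split by season, followed by a window partition inside each seasonal subset. The season mapping follows the text; the inner step-size formula and the `Month` column name are assumptions for illustration:

```python
import pandas as pd

def seasonal_partition(df, n, splitN):
    """Seasonal data partition (sketch): split by season first, then apply
    the window partition within each seasonal subset."""
    seasons = {
        "winter": [12, 1, 2], "spring": [3, 4, 5],
        "summer": [6, 7, 8], "autumn": [9, 10, 11],
    }
    P = []
    for months in seasons.values():
        subset = df[df["Month"].isin(months)].reset_index(drop=True)
        N = len(subset)
        stepSize = (N - n) // (splitN - 1)  # inner window partition step
        for i in range(splitN):
            start = i * stepSize
            P.append(subset.iloc[start:start + n])
    return P
```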

Training of LSTM Models
In this study, our principal predictive model was LSTM, an extended type of RNN that overcomes the problem of long-term dependencies. Its ability to learn the important parts of sequence data and forget the less important parts makes LSTM prevalent in time-series forecasting [31–34]. Figure 5 shows the structures of the LSTM models. It illustrates two concepts: single and data partition ensemble LSTM models. We first used the entire learning data to train n single LSTM models with different hyperparameters; that is, the single models were trained on the same data but with different hyperparameters. After that, we used the data partition strategies to create a set containing n training and validation data combinations. Subsequently, the data partition ensemble LSTM methods aggregate the outputs of the single LSTM models.
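The aggregation step of the bagging ensemble can be sketched as a simple average of the member forecasts. This assumes member models exposing a Keras/scikit-learn-style `.predict(X)` method; the paper's exact aggregation rule may differ:

```python
import numpy as np

def ensemble_forecast(models, X):
    """Bagging-style aggregation: average the forecasts of the member LSTM
    models trained on the different data partitions."""
    preds = np.stack([m.predict(X) for m in models])
    return preds.mean(axis=0)
```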


Dataset
This study conducted two types of experiments: single LSTM and data partition ensemble LSTM. The amounts of training and validation data for each experiment differed slightly. However, we used the same test data for all experiments to verify the effectiveness of our methodology. Table 4 summarizes the datasets used in the experiments. At each site, the last 20% of the total data were used as test data. The remaining data were split into training and validation sets according to the methodology. For the single LSTM models, we selected validation data from the training data equal in size to the test data. In the data partition ensemble methods, 20% of the training data were used as validation data.

Evaluation Metrics
We evaluated the experiments in this study using three standard measures for regression problems: R 2 , RMSE, and MAE, given in Equations (1)-(3). Here, i and n are the index of a sample and the number of samples, respectively, while y_i, ŷ_i, and ȳ denote the actual values, the forecasted values, and the mean of the actual values. R 2 measures the accuracy of a regression model with a value between 0 and 1; a value closer to 1 indicates that the model fits the data better. We multiplied R 2 by 100 to represent the accuracy as a percentage. The discrepancies between the actual and forecasted values are the residuals, or prediction errors, and RMSE is the standard deviation of the residuals. MAE measures the errors between the actual and forecasted values without considering their direction. Lower RMSE and MAE values indicate that the forecasted values are closer to the actual ones.

Table 5 lists the hyperparameters of the proposed methods: single LSTM, model-based ensemble LSTM, and data-based ensemble LSTM. Our model consists of two layers, an LSTM layer and a fully connected layer that returns the final prediction value. During several training sessions with different optimizers and learning rates, we found that the Adam optimizer with a learning rate of 0.001 was optimal. The number of epochs indicates how many times the model trains over the entire training dataset. Setting the right number of epochs is crucial: too few epochs can cause underfitting, whereas too many can result in overfitting and prolonged training time. The EarlyStopping function stops training if the accuracy does not improve during a given number (i.e., patience) of epochs. Therefore, we set the number of epochs to 1000 and the early stopping patience to 30. We trained the individual LSTM models with 60, 70, 80, 90, and 100 units.
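For reference, the three metrics follow directly from their standard definitions; a plain-Python sketch (not the evaluation code used in the study) is:

```python
import math

def r2_score(y_true, y_pred):
    # R^2 = 1 - SS_res / SS_tot, where SS_tot is taken around the mean
    # of the actual values.
    y_bar = sum(y_true) / len(y_true)
    ss_res = sum((y - f) ** 2 for y, f in zip(y_true, y_pred))
    ss_tot = sum((y - y_bar) ** 2 for y in y_true)
    return 1 - ss_res / ss_tot

def rmse(y_true, y_pred):
    # Root mean squared error: standard deviation of the residuals.
    return math.sqrt(sum((y - f) ** 2 for y, f in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    # Mean absolute error: magnitude of the errors, ignoring direction.
    return sum(abs(y - f) for y, f in zip(y_true, y_pred)) / len(y_true)
```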
Subsequently, these single LSTM models were compared with the ensemble LSTM models.

Table 6 exhibits the experimental results for hourly power generation forecasting at Site A. The table shows that all data partition strategies improve the accuracy of the single LSTM models. More specifically, the seasonal data partition strategy consistently delivers the best results. The fundamental explanation is that each ensemble member is trained and evaluated only on a particular season, such as the summer. Here, we find that the best model for forecasting the amount of energy per hour is the ensemble model of the seasonal partition with 60 units. This model outperforms the single LSTM models by around 3.4-4.7%.

Figure 6 shows the hourly forecasted power generation results for Site A. In the figure, we selected the results of the last 32 h of the test dataset, where the blue line represents the actual values and the dashed lines represent the best cases of each data partition strategy; it would be difficult to distinguish between the actual and forecasted values if the entire test dataset were shown. The figure illustrates that the results of the data partition strategies match the actual observations more closely than those of the single model.

Table 7 displays the experimental results for daily power generation forecasting at Site A. The table demonstrates that, with the exception of specific vertical data partition cases, all data partition strategies improve the accuracy of the single LSTM models. Like the window data partition, the seasonal data partition strategy performs best in all cases. Here, we find that the best model for forecasting the amount of energy per day is the ensemble model of the seasonal partition with 60 units. This model outperforms the single LSTM models by around 4-11.2%.

Figure 7 shows the daily forecasted power generation results for Site A. In the figure, we selected the results of the last month of the test dataset, where the blue line represents the actual values and the dashed lines represent the best cases of each data partition strategy. From the figure, we can see that the results of the data partition strategies follow the actual observations better than those of the single model.

Table 8 shows the experimental results for hourly power generation forecasting at Site B. The table demonstrates that all data partition strategies increase the accuracy of the single LSTM models. More specifically, the seasonal data partition strategy performs best in all cases. Here, we find that the best model for forecasting the amount of energy per hour is the ensemble model of the seasonal partition with 90 units. This model outperforms the single LSTM models by around 3.9-4.6%.

Figure 8 exhibits the results of hourly power generation forecasting at Site B. We selected the results of the most recent 21 h of the test dataset to exhibit in the figure. The figure demonstrates that the results of the data partition strategies match the actual observations more closely than those of the single model.

Table 9 shows the experimental results for daily power generation forecasting at Site B. The table demonstrates that, with the exception of specific vertical data partition scenarios, all data partition strategies increase the accuracy of the single LSTM models. Like the window data partition, the seasonal data partition strategy performs best in all cases. Here, we find that the best model for forecasting the amount of energy per day is the ensemble model of the seasonal partition with 60 units. This model outperforms the single LSTM models by around 3.6-5.7%.

Figure 9 shows the results of daily power generation forecasting at Site B. In the figure, we selected the results of the last month of the test dataset, where the blue line represents the actual values and the dashed lines represent the best cases of each data partition strategy. From the figure, we can see that the results of the data partition strategies follow the actual observations better than those of the single model.

Comparison of Seasonal Partition
We used two types of seasonal splitters, monthly and hourly, in the seasonal partition. We evaluated these two cases and used the better one in the subsequent experiments. Figure 10 shows the experimental results of the monthly and hourly splits in terms of the R 2 score. The monthly split outperforms the hourly split at all sites except Site A, where the results were similar. Therefore, we used the monthly split in the subsequent experiments.
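A monthly (season-based) splitter can be sketched as follows; the month-to-season mapping and the `(timestamp, power)` record format are our assumptions for illustration:

```python
from collections import defaultdict
from datetime import datetime

# Assumed mapping of calendar months to the four seasons.
SEASON_OF_MONTH = {12: "winter", 1: "winter", 2: "winter",
                   3: "spring", 4: "spring", 5: "spring",
                   6: "summer", 7: "summer", 8: "summer",
                   9: "autumn", 10: "autumn", 11: "autumn"}

def seasonal_partition(records):
    # Group (timestamp, power) records into one subset per season,
    # so that a separate ensemble member can be trained on each.
    subsets = defaultdict(list)
    for ts, power in records:
        subsets[SEASON_OF_MONTH[ts.month]].append((ts, power))
    return dict(subsets)
```

An hourly splitter would group records by hour of day instead of by month, following the same pattern.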


Partition Length and Subset Size
This experiment evaluated the relationship between the forecasting performance and different numbers of partitions and subset sizes. To this end, we ran three data partition strategies (window, shuffle, and pyramid) with combinations of the number of partitions (5, 8, and 10) and subset sizes (60%, 70%, and 80% of the training data). Table 10 presents the detailed results of these experiments. The results specify the optimal data partition strategy, number of partitions, and subset size for each dataset:
• Site A, hourly forecasting: window partition strategy with five partitions and a subset size of 70%.
• Site A, daily forecasting: shuffle partition strategy with ten partitions and a subset size of 80%.
• Site B, hourly forecasting: window partition strategy with ten partitions and a subset size of 80%.
• Site B, daily forecasting: window partition strategy with eight partitions and a subset size of 80%.

In summer, the sun goes higher and stays longer; in addition, the winter months have more stormy and cloudy weather. For these reasons, the collected solar panel power generation data have different features in each season, and training a prediction model for each season helps to reduce high variance and bias. Additionally, we investigated the relationship between performance and the number of partitions as well as the size of the subsets. The results indicated that adding more training data did not improve performance. The experiments proved that the proposed data partition ensemble LSTM methods forecast the hourly and daily solar panel power generation more accurately and reliably. Integrating solar energy monitoring with forecasting models increases the performance of solar panel systems and provides advantages to all participants in the sector, such as governments, businesses, and consumers.
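The grid of combinations explored above and the bagging aggregation step can be sketched as follows (`grid` and `bagging_forecast` are hypothetical helper names for illustration):

```python
from statistics import mean

# Hyperparameter grid explored in the experiments.
N_PARTITIONS = (5, 8, 10)
SUBSET_FRACS = (0.6, 0.7, 0.8)

def grid():
    # All (number of partitions, subset size) combinations.
    return [(n, f) for n in N_PARTITIONS for f in SUBSET_FRACS]

def bagging_forecast(member_predictions):
    # Aggregate the ensemble by averaging the members' forecasts
    # position by position across the forecast horizon.
    return [mean(step) for step in zip(*member_predictions)]
```

For each combination in the grid, one ensemble is trained and its averaged forecast is scored on the shared test set, which is how a table like Table 10 would be populated.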
Using this system, solar energy consumers can reconcile their electricity usage and avoid unexpected power outages and unnecessary costs. Additionally, businesses can offer customers additional options and products. Furthermore, the data generated by the models can be used to improve and develop plans. Governments have been promoting renewable energy and have set time-bound goals; efficient electricity consumption by consumers will help make these goals more realistic.
This study demonstrated that data partitioning positively influences the forecasting of solar panel power generation, even though we used simple strategies. However, unlike methods such as clustering, these strategies cannot account for the relationships within the data. Therefore, in future work, we plan to study more principled data partitioning strategies, so that each subset contains appropriate data points and helps improve the forecasting performance.