# Feasibility Study on the Influence of Data Partition Strategies on Ensemble Deep Learning: The Case of Forecasting Power Generation in South Korea


## Abstract


## 1. Introduction

Solar energy has many benefits, such as reducing CO₂ emissions, among many others [1]. Even the COVID-19 pandemic did not significantly affect the development of the solar energy market, apart from some delays caused by lockdowns [2]. Like many other countries, South Korea is interested in increasing solar energy usage. More specifically, the government declared the goal of becoming a low-carbon and eco-friendly nation by increasing the renewable energy market from the current 30% to 40% by 2030 [3]. Despite its benefits, providing electrical energy from solar panels also has drawbacks: a high initial investment, the ample space required for installing solar panels, and the limited efficiency of the panels themselves [1]. Moreover, solar energy is considered intermittent because solar panels produce energy only from sunlight. Energy storage systems are therefore used to keep the power supply uninterrupted. However, persistently bad weather, such as cloudy, rainy, or snowy conditions, can still result in power outages. Consumers need to monitor the weather, electricity production, and consumption to prevent such outages. Energy production forecasting can aid the government's renewable energy policy, as well as help consumers and businesses plan their consumption and develop new products.

Khan et al. [14] proposed a deep learning-based ensemble stacking approach that improved the coefficient of determination (R²) score by 10–12% over a single LSTM and ANN. Pirbazari et al. [15] also predicted solar panel energy generation and household consumption based on an ensemble method combining several sequence-to-sequence LSTM networks. Experiments on the proposed method showed the potential of the ensemble LSTM to provide more stable and accurate forecasts. Although numerous studies have employed ensemble methods with various data partitioning methods, most have emphasized enhancing the predictive methods by combining many different models or integrating different hyperparameters and interactions. In practice, the performance of ensemble machine learning models is highly dependent on the data partitioning strategy, the number of partitions, and the subset sizes. Choosing a single data partition strategy and subset size may weaken the prediction model by neglecting some fluctuations. Liang et al. [19] and Wang et al. [20] mentioned two problems of ensemble methods: (1) the number of members significantly affects the accuracy and diversity of ensemble methods, and (2) if the similarity between members is high, the ensemble method may perform poorly. Therefore, a feasibility study of data partitioning strategies is essential to effectively reveal the characteristics and features of time-series data and improve the accuracy of power generation forecasting.

- First, we propose an accurate methodology for forecasting daily and hourly solar panel power generation using an ensemble deep learning model and data partitioning. The method consists of three steps: partitioning the time-series data, training models on the partitioned subsets, and aggregating the results of each model to obtain the final forecasted power generation.
- Furthermore, we use five simple data partition strategies, namely window, shuffle, pyramid, vertical, and seasonal, to investigate the influence of each strategy on the accuracy of forecasting solar panel power generation. The data partition strategies are selected to divide the datasets into effective subsets with different characteristics and features of the time-series data; learning from such subsets enables the ensemble model to capture multiple aspects of the data. The experiments also evaluated the subset sizes and the number of partitions.
- Finally, we evaluated the proposed data partition strategies through extensive experiments using LSTM to forecast the power generation of the solar panels. The experiments examined each data partition strategy using LSTM models with different hyperparameters and checked the influence of different numbers of partitions and subset sizes. We evaluated the method on two independent datasets to demonstrate its applicability.
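The three-step methodology above can be sketched end to end. This is a minimal illustration, not the authors' implementation: `train_model` is a hypothetical stand-in for fitting an LSTM on one subset (here it just forecasts the subset mean), and the window partition stands in for any of the five strategies.

```python
# Step 1: partition the time-series data into subsets (window strategy shown).
def partition(data, n, split_n):
    step = (len(data) - n) // (split_n - 1)
    return [data[i * step : i * step + n] for i in range(split_n)]

# Step 2: train one model per subset. A real pipeline would fit an LSTM here;
# this trivial "model" simply forecasts its subset's mean value.
def train_model(subset):
    mean = sum(subset) / len(subset)
    return lambda: mean

# Step 3: aggregate the member forecasts (simple averaging).
def ensemble_forecast(models):
    preds = [m() for m in models]
    return sum(preds) / len(preds)

data = list(range(100))  # placeholder for hourly power-generation readings
models = [train_model(s) for s in partition(data, n=40, split_n=4)]
print(ensemble_forecast(models))
```

Averaging is only one possible aggregation rule; weighted or stacked combinations fit the same skeleton.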

## 2. Related Work

#### 2.1. Single Methods

#### 2.2. Ensemble Methods


#### 2.3. Discussions

## 3. Materials and Methods

#### 3.1. Overview

The performance of the proposed method is evaluated using the coefficient of determination (R²), root mean squared error (RMSE), and mean absolute error (MAE), which are widely used metrics for regression problems. The proposed method is discussed in detail in the subsequent subsections.
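The three evaluation metrics can be implemented directly. The following is a minimal pure-Python sketch (not the authors' code) using the standard definitions:

```python
import math

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_mean = sum(y_true) / len(y_true)
    ss_res = sum((a - p) ** 2 for a, p in zip(y_true, y_pred))
    ss_tot = sum((a - y_mean) ** 2 for a in y_true)
    return 1 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root mean squared error: standard deviation of the residuals."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error, ignoring the direction of the residuals."""
    return sum(abs(a - p) for a, p in zip(y_true, y_pred)) / len(y_true)
```

Library equivalents (e.g., `sklearn.metrics`) compute the same quantities; the explicit forms are shown to match the equations in Section 4.2.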

#### 3.2. Study Area

#### 3.3. Data Collection

#### 3.4. Data Preprocessing

#### 3.5. Data Partition

#### 3.5.1. Window Data Partition

**Algorithm 1.** Window data partition

**Input:** $D$ ← learning data, $N$ ← length of learning data, $n$ ← length of a partition, $splitN$ ← number of partitions

**Output:** $P$ ← set of partitions

**Procedure:**

1. Initialize $n$, $splitN$
2. Calculate step size: $stepSize=\frac{N-n}{splitN-1}$
3. **foreach** $i$ in $range(0, splitN)$ **do**
4. Calculate start index: $startIndex=i \ast stepSize$
5. Calculate end index: $endIndex=startIndex+n$
6. Select data between the indices: $p=D[startIndex:endIndex]$
7. Append $p$ into $P$
8. **end**
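A direct translation of Algorithm 1 into Python might look as follows (a sketch, not the authors' code; indices are truncated to integers for slicing):

```python
def window_partition(D, n, split_n):
    """Algorithm 1: split_n equally spaced, overlapping windows of length n."""
    N = len(D)
    step = (N - n) / (split_n - 1)  # stepSize = (N - n) / (splitN - 1)
    P = []
    for i in range(split_n):
        start = int(i * step)       # startIndex = i * stepSize
        P.append(D[start : start + n])
    return P
```

For example, `window_partition(list(range(10)), 4, 3)` yields three windows starting at indices 0, 3, and 6, so every window has the same length and the first and last windows touch the ends of the data.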

#### 3.5.2. Shuffle Data Partition

**Algorithm 2.** Shuffle data partition

**Input:** $D$ ← learning data, $N$ ← length of learning data, $n$ ← length of a partition, $splitN$ ← number of partitions

**Output:** $P$ ← set of partitions

**Procedure:**

1. Initialize $n$, $splitN$
2. Calculate the limit for the start index: $startLimit=N-n$
3. **foreach** $i$ in $range(0, splitN)$ **do**
4. Get random start index: $startIndex=randomInt(0, startLimit)$
5. Calculate end index: $endIndex=startIndex+n$
6. Select data between the indices: $p=D[startIndex:endIndex]$
7. Append $p$ into $P$
8. **end**
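Algorithm 2 differs from Algorithm 1 only in that start indices are drawn at random. A minimal sketch (the `seed` parameter is an addition for reproducibility, not part of the algorithm):

```python
import random

def shuffle_partition(D, n, split_n, seed=None):
    """Algorithm 2: split_n windows of length n with random start indices."""
    rng = random.Random(seed)
    start_limit = len(D) - n            # startLimit = N - n
    P = []
    for _ in range(split_n):
        start = rng.randint(0, start_limit)  # inclusive bounds, like randomInt
        P.append(D[start : start + n])
    return P
```

Because starts are sampled independently, windows may overlap or repeat; that randomness is what gives the ensemble members their diversity under this strategy.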

#### 3.5.3. Pyramid Data Partition

**Algorithm 3.** Pyramid data partition

**Input:** $D$ ← learning data, $N$ ← length of learning data, $n$ ← length of a partition, $splitN$ ← number of partitions

**Output:** $P$ ← set of partitions

**Procedure:**

1. Initialize $n$, $splitN$
2. Calculate the first start index: $startIndex=N-n$
3. Calculate the first end index: $endIndex=startIndex+n$
4. Calculate the step size: $stepSize=\frac{startIndex}{splitN-1}$
5. **foreach** $i$ in $range(0, splitN)$ **do**
6. **if** $startIndex \le 0$ **then**
7. Select the whole dataset: $p=D$
8. **else**
9. Select data between the indices: $p=D[startIndex:endIndex]$
10. **end if**
11. Append $p$ into $P$
12. Update start index: $startIndex=startIndex-stepSize$
13. Update end index: $endIndex=endIndex+stepSize$
14. **end**
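Algorithm 3 starts from the most recent window and grows it by one step size per iteration until it covers the full dataset. A sketch of this behavior (not the authors' code; Python slicing naturally clips an end index beyond the data length):

```python
def pyramid_partition(D, n, split_n):
    """Algorithm 3: windows anchored at the recent end of the data,
    growing each iteration until the whole dataset is selected."""
    N = len(D)
    start, end = N - n, N                 # first window: the latest n points
    step = start / (split_n - 1)          # stepSize = startIndex / (splitN - 1)
    P = []
    for _ in range(split_n):
        if start <= 0:
            P.append(D[:])                # window has grown to the full dataset
        else:
            P.append(D[int(start) : int(end)])
        start -= step                     # extend the window backwards...
        end += step                       # ...and forwards (clipped by slicing)
    return P
```

The result is a "pyramid" of subsets: the first member sees only recent data, the last member sees everything, so members trade recency against coverage.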

#### 3.5.4. Vertical Data Partition

**Algorithm 4.** Vertical data partition

**Input:** $D$ ← learning data, $S$ ← feature sets

**Output:** $P$ ← set of partitions

**Procedure:**

1. Initialize $S$
2. **foreach** $s$ in $S$ **do**
3. Select data related to the set $s$: $p=D[s]$
4. Append $p$ into $P$
5. **end**
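Unlike the previous strategies, Algorithm 4 splits by columns rather than by time. As a minimal sketch, assuming the learning data is represented as a dict of feature columns (a `pandas.DataFrame` column selection `D[s]` would behave the same way):

```python
def vertical_partition(D, feature_sets):
    """Algorithm 4: one subset per feature set (column-wise split)."""
    # D is a mapping {feature_name: [values...]}; each subset keeps
    # only the columns listed in one feature set s.
    return [{f: D[f] for f in s} for s in feature_sets]
```

For instance, with the abbreviations of Table 2, one feature set might pair power generation with weather features and another with panel-side features, so each ensemble member specializes in one view of the data.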

#### 3.5.5. Seasonal Data Partition

**Algorithm 5.** Seasonal data partition

**Input:** $D$ ← learning data, $N$ ← length of learning data, $S$ ← set of seasonal subsets, $n$ ← length of a partition, $splitN$ ← number of partitions

**Output:** $P$ ← set of partitions

**Procedure:**

1. Initialize: $S$ by splitting $D$ by season (i.e., monthly or hourly)
2. **foreach** $s$ in $S$ **do**
3. Initialize $n$, $splitN$, $N$ based on the subset $s$
4. Calculate step size: $stepSize=\frac{N-n}{splitN-1}$
5. **foreach** $i$ in $range(0, splitN)$ **do**
6. Calculate start index: $startIndex=i \ast stepSize$
7. Calculate end index: $endIndex=startIndex+n$
8. Select data between the indices: $p=s[startIndex:endIndex]$
9. Append $p$ into $P$
10. **end**
11. **end**
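Algorithm 5 is two-level: the data is first split by season, and then the window partition of Algorithm 1 is applied within each seasonal subset. A sketch under the assumption that the seasonal split has already been done and the per-subset window length is given as a fraction (`n_frac` is an illustrative parameter, not from the paper):

```python
def seasonal_partition(seasonal_subsets, n_frac=0.8, split_n=3):
    """Algorithm 5: window partition applied inside each seasonal subset."""
    P = []
    for s in seasonal_subsets:            # e.g., one subset per month
        N = len(s)
        n = int(N * n_frac)               # window length scaled per subset
        step = (N - n) / (split_n - 1)
        for i in range(split_n):
            start = int(i * step)
            P.append(s[start : start + n])
    return P
```

Each season thus contributes its own group of windows, which is what lets the ensemble learn season-specific generation patterns.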

#### 3.6. Training of LSTM Models

## 4. Results

#### 4.1. Dataset

#### 4.2. Evaluation Metrics

The formulas for R², RMSE, and MAE are provided in Equations (1)–(3). Here, $i$ and $n$ are the index of a sample and the number of samples, respectively. Moreover, $y$, $\widehat{y}$, and $\overline{y}$ are the actual values, forecasted values, and mean of the actual values, respectively. R² measures the accuracy of a regression model with a value between 0 and 1; a value closer to 1 indicates that the model fits the data better. We multiplied R² by 100 to represent the accuracy as a percentage. The residuals, or prediction errors, are the discrepancies between the actual and forecasted values, and RMSE is the standard deviation of the residuals. MAE measures the errors between actual and forecasted values without considering their direction. Lower RMSE and MAE values indicate that the actual and forecasted values are closer.
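In the notation above, the three metrics take their standard definitions:

```latex
R^{2} = 1 - \frac{\sum_{i=1}^{n}\left(y_{i}-\widehat{y}_{i}\right)^{2}}{\sum_{i=1}^{n}\left(y_{i}-\overline{y}\right)^{2}} \qquad (1)

\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_{i}-\widehat{y}_{i}\right)^{2}} \qquad (2)

\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_{i}-\widehat{y}_{i}\right| \qquad (3)
```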

#### 4.3. Experimental Results

#### 4.3.1. Hourly Forecasting of Site A

#### 4.3.2. Daily Forecasting of Site A

#### 4.3.3. Hourly Forecasting of Site B

#### 4.3.4. Daily Forecasting of Site B

#### 4.3.5. Comparison of Seasonal Partition

We compared monthly and hourly seasonal splits in terms of the R² score. The monthly split outperformed the hourly split, except at Site A, where the results of the two splits were similar. Therefore, we used the monthly split in subsequent experiments.

#### 4.3.6. Partition Length and Subset Size

- Site A, hourly forecasting: window partition strategy with five partitions and a subset size of 70%.
- Site A, daily forecasting: shuffle partition strategy with ten partitions and a subset size of 80%.
- Site B, hourly forecasting: window partition strategy with ten partitions and a subset size of 80%.
- Site B, daily forecasting: window partition strategy with eight partitions and a subset size of 80%.

## 5. Discussion and Conclusions

First, the single LSTM models achieved R² scores of 93.6–94.9% and 84.4–85.2% for Sites A and B, respectively, in hourly forecasting. For daily forecasting, Sites A and B had R² scores of 86.8–94% and 83.7–85.8%, respectively. Second, the data partition ensemble LSTM model outperformed all single LSTM models in the experimental cases. More specifically, Sites A and B had R² scores of 95.3–98.3% and 85.6–89%, respectively, in hourly forecasting, and 90.5–98% and 82.2–89.3%, respectively, in daily forecasting. In particular, the two-level seasonal data partition strategy showed good performance improvements. Solar panel power generation depends highly on the seasons. Comparing winter and summer, winter days are shorter, so the angle of the sun on the solar panels changes rapidly; in summer, the sun rises higher and daylight lasts longer. Additionally, the winter months have more stormy and cloudy weather. For these reasons, the collected solar panel power generation data have different features in each season, and training a prediction model for each season helps reduce variance and bias. Additionally, we investigated the relationship between performance and the number of partitions, as well as the size of the subsets. The results indicated that adding more training data did not improve performance.

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## References

1. Guangul, F.M.; Chala, G.T. Solar energy as renewable energy source: SWOT analysis. In Proceedings of the 4th MEC International Conference on Big Data and Smart City (ICBDSC), Muscat, Oman, 15–16 January 2019.
2. International Energy Agency. Snapshot of Global PV Markets 2021; Report IEA-PVPS T1-39; International Energy Agency: Paris, France, 2021.
3. Korea Energy Agency. National Survey Report of PV Power Applications in Korea; Korea Energy Agency: Yongin-si, Korea, 2019.
4. Gao, M.; Li, J.; Hong, F.; Long, D. Day-ahead power forecasting in a large-scale photovoltaic plant based on weather classification using LSTM. *Energy* **2019**, 187, 115838.
5. Lee, C.-H.; Yang, H.-C.; Ye, G.-B. Predicting the performance of solar power generation using deep learning methods. *Appl. Sci.* **2021**, 11, 6887.
6. Zheng, J.; Zhang, H.; Dai, Y.; Wang, B.; Zheng, T.; Liao, Q.; Liang, Y.; Zhang, F.; Song, X. Time series prediction for output of multi-region solar power plants. *Appl. Energy* **2020**, 257, 114001.
7. Abdel-Nasser, M.; Mahmoud, K. Accurate photovoltaic power forecasting models using deep LSTM-RNN. *Neural Comput. Appl.* **2019**, 31, 2727–2740.
8. Wang, F.; Xuan, Z.; Zhen, Z.; Li, K.; Wang, T.; Shi, M. A day-ahead PV power forecasting method based on LSTM-RNN model and time correlation modification under partial daily pattern prediction framework. *Energy Convers. Manag.* **2020**, 212, 112766.
9. Wang, K.; Qi, X.; Liu, H. A comparison of day-ahead photovoltaic power forecasting models based on deep learning neural network. *Appl. Energy* **2019**, 251, 113315.
10. Wang, K.; Qi, X.; Liu, H. Photovoltaic power forecasting based LSTM-Convolutional Network. *Energy* **2019**, 189, 116225.
11. Ghimire, S.; Deo, R.C.; Raj, N.; Mi, J. Deep solar radiation forecasting with convolutional neural network and long short-term memory network algorithms. *Appl. Energy* **2019**, 253, 113541.
12. Gensler, A.; Henze, J.; Sick, B.; Raabe, N. Deep learning for solar power forecasting—An approach using AutoEncoder and LSTM neural networks. In Proceedings of the 2016 IEEE International Conference on Systems, Man, and Cybernetics (SMC), Budapest, Hungary, 9–12 October 2016.
13. Dong, X.; Yu, Z.; Cao, W.; Shi, Y.; Ma, Q. A survey on ensemble learning. *Front. Comput. Sci.* **2020**, 14, 241–258.
14. Khan, W.; Walker, S.; Zeiler, W. Improved solar photovoltaic energy generation forecast using deep learning-based ensemble stacking approach. *Energy* **2022**, 240, 122812.
15. Pirbazari, A.M.; Sharma, E.; Chakravorty, A.; Elmenreich, W.; Rong, C. An ensemble approach for multi-step ahead energy forecasting of household communities. *IEEE Access* **2021**, 9, 36218–36240.
16. Singla, P.; Duhan, M.; Saroha, S. An ensemble method to forecast 24-h ahead solar irradiance using wavelet decomposition and BiLSTM deep learning network. *Earth Sci. Inform.* **2022**, 15, 291–306.
17. Tan, M.; Yuan, S.; Li, S.; Su, Y.; Li, H.; He, F.H. Ultra-short-term industrial power demand forecasting using LSTM based hybrid ensemble learning. *IEEE Trans. Power Syst.* **2020**, 35, 2937–2948.
18. Wang, L.; Peng, H.; Tan, M.; Pan, R. A multistep prediction of hydropower station inflow based on bagging-LSTM model. *Discret. Dyn. Nat. Soc.* **2021**, 2021, 1031442.
19. Liang, J.; Wei, P.; Qu, B.; Yu, K.; Yue, C.; Hu, Y.; Ge, S. Ensemble learning based on multimodal multiobjective optimization. In Bio-Inspired Computing: Theories and Applications, Proceedings of the International Conference on Bio-Inspired Computing: Theories and Applications, Zhengzhou, China, 22–25 November 2019; Pan, L., Liang, J., Qu, B., Eds.; Springer: Singapore, 2020; Volume 1159.
20. Wang, X.; Han, T. Transformer fault diagnosis based on stacking ensemble learning. *IEEJ Trans. Electr. Electron. Eng.* **2020**, 15, 1734–1739.
21. Deenadayalan, V.; Vaishnavi, P. Improvised deep learning techniques for the reliability analysis and future power generation forecast by fault identification and remediation. *J. Ambient Intell. Humaniz. Comput.* **2021**, 1–9.
22. Wang, H.; Cai, R.; Zhou, B.; Aziz, S.; Qin, B.; Voropai, N.; Gan, L.; Barakhtenko, E. Solar irradiance forecasting based on direct explainable neural network. *Energy Convers. Manag.* **2020**, 226, 113487.
23. Zsiborács, H.; Pintér, G.; Vincze, A.; Baranyai, H.; Mayer, M.J. The reliability of photovoltaic power generation scheduling in seventeen European countries. *Energy Convers. Manag.* **2022**, 260, 115641.
24. Tu, C.-S.; Tsai, W.-C.; Hong, C.-M.; Lin, W.-M. Short-term solar power forecasting via general regression neural network with grey wolf optimization. *Energies* **2022**, 15, 6624.
25. Su, H.-Y.; Liu, T.-Y.; Hong, H.-H. Adaptive residual compensation ensemble models for improving solar energy generation forecasting. *IEEE Trans. Sustain. Energy* **2020**, 11, 1103–1105.
26. Lotfi, M.; Javadi, M.; Osório, G.J.; Monteiro, C.; Catalão, J.P.S. A novel ensemble algorithm for solar power forecasting based on kernel density estimation. *Energies* **2020**, 13, 216.
27. Wen, S.; Zhang, C.; Lan, H.; Xu, Y.; Tang, Y.; Huang, Y. A hybrid ensemble model for interval prediction of solar power output in ship onboard power systems. *IEEE Trans. Sustain. Energy* **2021**, 12, 14–24.
28. Zhang, X.; Li, Y.; Lu, S.; Hamann, H.F.; Hodge, B.-M.; Lehman, B. A solar time based analog ensemble method for regional solar power forecasting. *IEEE Trans. Sustain. Energy* **2019**, 10, 268–279.
29. Kim, B.; Suh, D.; Otto, M.-O.; Huh, J.-S. A novel hybrid spatio-temporal forecasting of multisite solar photovoltaic generation. *Remote Sens.* **2021**, 13, 2605.
30. Daeyeon C&I Co., Ltd. Available online: http://dycni.com/ (accessed on 26 March 2022).
31. Sagheer, A.; Kotb, M. Time series forecasting of petroleum production using deep LSTM recurrent networks. *Neurocomputing* **2019**, 323, 203–213.
32. Pheng, T.; Chuluunsaikhan, T.; Ryu, G.-A.; Kim, S.-H.; Nasridinov, A.; Yoo, K.-H. Prediction of process quality performance using statistical analysis and long short-term memory. *Appl. Sci.* **2022**, 12, 735.
33. Ai, S.; Chakravorty, A.; Rong, C. Evolutionary ensemble LSTM based household peak demand prediction. In Proceedings of the International Conference on Artificial Intelligence in Information and Communication (ICAIIC), Okinawa, Japan, 11–13 February 2019.
34. Zhao, F.; Zeng, G.Q.; Lu, K.D. EnLSTM-WPEO: Short-term traffic flow prediction by ensemble LSTM, NNCT weight integration, and population extremal optimization. *IEEE Trans. Veh. Technol.* **2019**, 69, 101–113.

**Figure 1.** Overall flow of the proposed methodology. The abbreviations of the features are described in Table 2. R²: coefficient of determination; RMSE: root mean squared error; MAE: mean absolute error.

**Figure 4.** Power generation of (**a**) hourly in Site A, (**b**) monthly in Site A, (**c**) hourly in Site B, and (**d**) monthly in Site B.

| Location | Number of Features | Number of Samples | Date |
|---|---|---|---|
| Site A | 12 | 26,280 | 1 January 2017 to 31 December 2019 |
| Site B | 8 | 35,487 | 1 January 2017 to 31 December 2020 |

| Source | Feature | Abbr. | Site A | Site B | Description |
|---|---|---|---|---|---|
| Solar panel | Power generation | PG | o | o | The power output of the panels (kWh). |
| Solar panel | Power factor | PF | o | - | The ratio between the utilized and generated power. |
| Solar panel | Slope | SL | o | - | The angle at which the panels are positioned relative to a flat surface. |
| Solar panel | Horizontal irradiation | HI | o | - | The total solar radiation incident on a horizontal surface. |
| Solar panel | Module temperature | MT | o | - | The temperature of the solar panels (°C). |
| Weather | Temperature | TE | o | o | Outside temperature (°C). |
| Weather | Humidity | HU | o | o | The concentration of water vapor present in the air (%). |
| Weather | Cloud | CO | o | o | Amount of cloud cover. |
| Weather | Dew point | DP | - | o | Dew point (°C). |
| Weather | Sunshine | SS | o | - | Duration of sunlight reaching the ground without cloud cover. |
| Weather | Solar radiation | SR | o | o | The amount of solar radiation energy on the ground (W/m²). |
| Derived | Month | MO | o | o | Month of the date stamp. |
| Derived | Hour | HO | o | o | Hour of the date stamp. |

| Feature | Site A Count | Site A Mean | Site A Std | Site A Min | Site A Max | Site B Count | Site B Mean | Site B Std | Site B Min | Site B Max |
|---|---|---|---|---|---|---|---|---|---|---|
| Power generation | 12,045 | 8.89 | 7.13 | 0.00 | 25.76 | 16,060 | 525.38 | 373.94 | 0.00 | 1396.85 |
| Power factor | 12,045 | 90.59 | 22.90 | 0.00 | 99.00 | - | - | - | - | - |
| Slope | 12,045 | 353.34 | 257.74 | 0.00 | 942.73 | - | - | - | - | - |
| Horizontal irradiation | 12,045 | 304.26 | 219.16 | 0.00 | 880.52 | - | - | - | - | - |
| Module temperature | 12,045 | 25.09 | 16.00 | −19.79 | 65.25 | - | - | - | - | - |
| Temperature | 12,045 | 16.66 | 11.83 | −16.81 | 42.21 | 16,060 | 16.11 | 373.94 | −12.90 | 39.20 |
| Humidity | 12,045 | 51.28 | 20.46 | 7.00 | 100.00 | 16,060 | 58.59 | 23.34 | 0.00 | 100.00 |
| Cloud | 12,045 | 5.02 | 4.00 | 0.00 | 10.00 | 16,060 | 3.11 | 3.98 | 0.00 | 10.00 |
| Dew point | - | - | - | - | - | 16,060 | 6.81 | 12.14 | −26.90 | 28.00 |
| Sunshine | 12,045 | 0.60 | 0.44 | 0.00 | 1.00 | - | - | - | - | - |
| Solar radiation | 12,045 | 1.18 | 0.90 | 0.00 | 3.59 | 16,060 | 313.00 | 243.47 | 0.00 | 975.00 |

| LSTM | Partition | Site A Train | Site A Validation | Site A Test | Site B Train | Site B Validation | Site B Test |
|---|---|---|---|---|---|---|---|
| Single | - | 7231 | 2407 | 2407 | 9640 | 3210 | 3210 |
| Data-based ensemble | Window | 7200 | 1800 | 2407 | 8000 | 2000 | 3210 |
| Data-based ensemble | Shuffle | 7200 | 1800 | 2407 | 8000 | 2000 | 3210 |
| Data-based ensemble | Pyramid | 8000–9638 | 1600–1928 | 2407 | 8000–12,850 | 1600–2570 | 3210 |
| Data-based ensemble | Vertical | 7231 | 2407 | 2407 | 9640 | 3210 | 3210 |
| Data-based ensemble | Seasonal | 2991–3618 | 318–1000 | 2407 | 1990–2969 | 43–1000 | 3210 |

| LSTM | Optimizer | Learning Rate | Epochs | Batch Size | Patience | Units |
|---|---|---|---|---|---|---|
| Single | ADAM | 0.001 | 1000 | 32 | 30 | 60, 70, 80, 90, 100 |
| Ensemble | ADAM | 0.001 | 1000 | 32 | 30 | 60, 70, 80, 90, 100 |

| Methods | LSTM60 R² | LSTM60 RMSE | LSTM60 MAE | LSTM70 R² | LSTM70 RMSE | LSTM70 MAE | LSTM80 R² | LSTM80 RMSE | LSTM80 MAE | LSTM90 R² | LSTM90 RMSE | LSTM90 MAE | LSTM100 R² | LSTM100 RMSE | LSTM100 MAE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No partition | 94.44 | 1.56 | 0.80 | 94.94 | 1.49 | 0.60 | 94.07 | 1.61 | 0.77 | 93.58 | 1.68 | 0.94 | 93.93 | 1.63 | 0.82 |
| Window | 97.95 | 0.95 | 0.47 | 97.98 | 0.96 | 0.50 | 97.97 | 0.94 | 0.49 | 97.89 | 0.96 | 0.45 | 98.18 | 0.89 | 0.41 |
| Shuffle | 97.94 | 0.95 | 0.45 | 97.74 | 1.00 | 0.51 | 97.99 | 0.94 | 0.44 | 97.93 | 0.95 | 0.45 | 98.19 | 0.89 | 0.38 |
| Pyramid | 96.52 | 1.24 | 0.83 | 96.68 | 1.17 | 0.76 | 96.96 | 1.16 | 0.74 | 96.94 | 1.16 | 0.64 | 97.49 | 1.05 | 0.62 |
| Vertical | 95.30 | 1.44 | 0.74 | 96.56 | 1.23 | 0.68 | 96.17 | 1.30 | 0.68 | 95.98 | 1.33 | 0.68 | 96.23 | 1.29 | 0.66 |
| Seasonal | 98.31 | 0.86 | 0.33 | 98.23 | 0.88 | 0.34 | 98.05 | 0.93 | 0.33 | 98.22 | 0.89 | 0.32 | 98.22 | 0.89 | 0.34 |

| Methods | LSTM60 R² | LSTM60 RMSE | LSTM60 MAE | LSTM70 R² | LSTM70 RMSE | LSTM70 MAE | LSTM80 R² | LSTM80 RMSE | LSTM80 MAE | LSTM90 R² | LSTM90 RMSE | LSTM90 MAE | LSTM100 R² | LSTM100 RMSE | LSTM100 MAE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No partition | 90.09 | 12.84 | 10.10 | 93.98 | 10.00 | 7.15 | 92.98 | 10.80 | 7.99 | 87.84 | 14.22 | 10.86 | 86.81 | 14.81 | 11.73 |
| Window | 95.92 | 8.30 | 5.38 | 95.25 | 8.96 | 5.92 | 94.12 | 9.97 | 6.33 | 95.78 | 8.45 | 5.26 | 94.93 | 9.25 | 5.99 |
| Shuffle | 95.52 | 8.70 | 5.52 | 95.67 | 8.55 | 5.17 | 94.26 | 9.85 | 6.27 | 96.39 | 7.81 | 4.94 | 96.23 | 7.98 | 4.81 |
| Pyramid | 93.91 | 10.14 | 7.21 | 92.30 | 11.40 | 8.79 | 90.49 | 12.68 | 9.22 | 94.00 | 10.07 | 7.22 | 93.61 | 10.39 | 7.30 |
| Vertical | 88.59 | 13.88 | 10.16 | 92.75 | 11.07 | 6.93 | 94.14 | 9.95 | 5.79 | 93.22 | 10.70 | 6.81 | 92.53 | 11.23 | 6.77 |
| Seasonal | 98.00 | 5.77 | 2.88 | 98.00 | 5.78 | 2.91 | 97.49 | 6.47 | 3.16 | 97.89 | 5.93 | 2.92 | 97.73 | 6.15 | 2.95 |

| Methods | LSTM60 R² | LSTM60 RMSE | LSTM60 MAE | LSTM70 R² | LSTM70 RMSE | LSTM70 MAE | LSTM80 R² | LSTM80 RMSE | LSTM80 MAE | LSTM90 R² | LSTM90 RMSE | LSTM90 MAE | LSTM100 R² | LSTM100 RMSE | LSTM100 MAE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No partition | 85.15 | 140.91 | 104.32 | 84.43 | 144.28 | 104.83 | 84.66 | 143.21 | 105.55 | 84.53 | 143.84 | 106.33 | 85.18 | 140.77 | 102.98 |
| Window | 86.85 | 132.61 | 94.09 | 86.87 | 132.50 | 95.01 | 87.11 | 131.28 | 94.05 | 87.21 | 130.77 | 92.85 | 87.18 | 130.92 | 93.09 |
| Shuffle | 87.11 | 131.26 | 91.97 | 87.31 | 130.28 | 91.38 | 87.31 | 130.28 | 91.93 | 87.36 | 130.00 | 91.26 | 87.55 | 129.03 | 90.76 |
| Pyramid | 86.97 | 131.98 | 93.33 | 87.32 | 130.20 | 91.80 | 86.92 | 132.26 | 96.40 | 87.44 | 129.62 | 91.23 | 86.93 | 132.22 | 94.12 |
| Vertical | 85.78 | 137.88 | 100.20 | 85.58 | 138.86 | 99.50 | 86.13 | 136.16 | 98.23 | 85.95 | 137.08 | 98.10 | 86.44 | 134.63 | 95.96 |
| Seasonal | 88.64 | 125.17 | 89.47 | 88.60 | 125.36 | 89.93 | 88.42 | 126.36 | 91.47 | 89.05 | 122.88 | 88.30 | 88.69 | 124.86 | 88.85 |

| Method | LSTM60 R² | LSTM60 RMSE | LSTM60 MAE | LSTM70 R² | LSTM70 RMSE | LSTM70 MAE | LSTM80 R² | LSTM80 RMSE | LSTM80 MAE | LSTM90 R² | LSTM90 RMSE | LSTM90 MAE | LSTM100 R² | LSTM100 RMSE | LSTM100 MAE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No partition | 83.68 | 928.77 | 677.82 | 83.84 | 924.18 | 664.43 | 85.76 | 867.54 | 624.75 | 85.07 | 888.24 | 633.74 | 85.69 | 869.49 | 635.16 |
| Window | 87.19 | 822.94 | 613.09 | 86.47 | 845.68 | 640.72 | 86.34 | 849.73 | 639.83 | 87.36 | 817.29 | 603.58 | 87.90 | 799.55 | 581.59 |
| Shuffle | 86.67 | 839.39 | 615.75 | 86.68 | 838.86 | 625.65 | 86.06 | 858.30 | 639.80 | 87.10 | 825.70 | 612.01 | 87.08 | 826.28 | 591.23 |
| Pyramid | 87.22 | 821.65 | 596.28 | 87.11 | 825.51 | 616.56 | 87.34 | 817.85 | 605.42 | 87.12 | 824.90 | 605.30 | 87.14 | 824.25 | 593.65 |
| Vertical | 83.90 | 922.40 | 693.93 | 84.26 | 912.06 | 693.72 | 82.81 | 953.26 | 725.32 | 82.17 | 970.77 | 735.92 | 83.20 | 942.34 | 725.87 |
| Seasonal | 89.33 | 738.98 | 560.45 | 87.94 | 785.68 | 592.60 | 89.04 | 749.03 | 564.35 | 88.50 | 767.23 | 568.71 | 88.72 | 759.93 | 574.78 |

| Site | Method | 5_60% R² | 5_60% RMSE | 5_60% MAE | 5_70% R² | 5_70% RMSE | 5_70% MAE | 5_80% R² | 5_80% RMSE | 5_80% MAE | 8_80% R² | 8_80% RMSE | 8_80% MAE | 10_80% R² | 10_80% RMSE | 10_80% MAE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| A/Hourly | Window | 97.69 | 1.01 | 0.46 | 98.16 | 0.90 | 0.41 | 97.82 | 0.98 | 0.53 | 97.81 | 0.98 | 0.49 | 97.92 | 0.96 | 0.49 |
| A/Hourly | Shuffle | 95.97 | 1.33 | 0.67 | 97.22 | 1.10 | 0.61 | 97.86 | 0.97 | 0.50 | 97.91 | 0.95 | 0.47 | 98.02 | 0.93 | 0.44 |
| A/Hourly | Pyramid | 98.13 | 0.91 | 0.41 | 98.05 | 0.93 | 0.45 | 98.05 | 0.92 | 0.45 | 97.98 | 0.94 | 0.48 | 98.03 | 0.93 | 0.45 |
| A/Daily | Window | 95.49 | 8.73 | 5.22 | 95.71 | 8.51 | 5.11 | 93.65 | 10.36 | 6.67 | 90.86 | 12.43 | 8.37 | 92.16 | 11.59 | 8.21 |
| A/Daily | Shuffle | 93.78 | 10.25 | 5.41 | 94.96 | 9.23 | 5.57 | 96.68 | 7.49 | 4.72 | 96.66 | 7.50 | 4.65 | 96.85 | 7.30 | 4.44 |
| A/Daily | Pyramid | 96.59 | 7.59 | 4.66 | 95.86 | 8.37 | 5.06 | 96.31 | 7.90 | 4.72 | 96.37 | 7.84 | 4.70 | 95.99 | 8.23 | 4.99 |
| B/Hourly | Window | 87.41 | 129.75 | 91.54 | 87.44 | 129.61 | 91.81 | 85.97 | 137.00 | 95.11 | 87.44 | 129.60 | 91.44 | 87.63 | 128.63 | 89.72 |
| B/Hourly | Shuffle | 86.60 | 133.85 | 94.93 | 86.75 | 133.09 | 95.28 | 86.96 | 132.02 | 94.19 | 87.21 | 130.79 | 92.12 | 87.20 | 130.81 | 92.87 |
| B/Hourly | Pyramid | 87.40 | 129.78 | 90.67 | 87.48 | 129.39 | 90.60 | 87.56 | 128.98 | 90.34 | 87.39 | 129.83 | 90.24 | 87.39 | 129.85 | 90.94 |
| B/Daily | Window | 87.25 | 820.82 | 595.05 | 87.31 | 818.91 | 600.47 | 86.25 | 850.59 | 620.11 | 87.93 | 798.74 | 615.01 | 87.66 | 852.59 | 625.11 |
| B/Daily | Shuffle | 86.84 | 833.96 | 613.85 | 87.14 | 824.47 | 604.91 | 87.09 | 826.10 | 609.20 | 87.25 | 820.96 | 604.14 | 87.03 | 827.99 | 614.89 |
| B/Daily | Pyramid | 87.61 | 809.27 | 592.22 | 87.68 | 806.86 | 583.62 | 87.81 | 802.59 | 588.31 | 87.87 | 800.62 | 595.06 | 87.75 | 804.36 | 596.45 |

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Chuluunsaikhan, T.; Kim, J.-H.; Shin, Y.; Choi, S.; Nasridinov, A. Feasibility Study on the Influence of Data Partition Strategies on Ensemble Deep Learning: The Case of Forecasting Power Generation in South Korea. *Energies* **2022**, *15*, 7482.
https://doi.org/10.3390/en15207482
