An Improved Ensemble Learning Regression Algorithm for Electricity Demand Forecasting with Symmetric Experimental Evaluation

Zhou, Jie; Yan, Peisheng; Bian, Zekang; Jiang, Zhibin; Yu, Donghua

doi:10.3390/sym18010123

by Jie Zhou ¹, Peisheng Yan ¹, Zekang Bian ²,³, Zhibin Jiang ¹ and Donghua Yu ¹,*

¹ Department of Computer Science and Engineering, Shaoxing University, Shaoxing 312000, China
² Department of AI & Computer Science, Jiangnan University, Wuxi 214122, China
³ Jiangsu Key Construction Laboratory of IoT Application Technologies (Taihu), Wuxi 214122, China
* Author to whom correspondence should be addressed.

Symmetry 2026, 18(1), 123; https://doi.org/10.3390/sym18010123

Submission received: 11 December 2025 / Revised: 3 January 2026 / Accepted: 6 January 2026 / Published: 8 January 2026

(This article belongs to the Special Issue Machine Learning and Data Analysis III)


Abstract

Electricity demand forecasting plays a crucial role in energy planning and power system operation. However, it is affected by numerous factors and complex relationships, making accurate prediction challenging. Therefore, from the perspective of sample diversity in the base dataset, we propose an improved stacking-based ensemble regression algorithm to enhance the accuracy of electricity demand forecasting. Firstly, a continuous sampling strategy is constructed between the sample integration selection probability and the base dataset using D²-Sampling and KNN; secondly, multiple base regression models are integrated through stacking to improve the predictive performance. In the electricity demand forecasting experiments conducted on three different datasets and across multiple base models, the proposed improved stacking ensemble learning regression algorithm (DK-Stacking) achieved the best performance. This symmetric experimental evaluation ensured consistent and balanced assessment of the model performance across datasets and models, highlighting the robustness and generalization of the proposed algorithm. Compared to the ANN, SVR, and RF models, its prediction accuracy increased by more than 1 percentage point. Even when compared to the optimized XGBoost model, it showed an improvement of 0.44 percentage points. Overall, the proposed DK-Stacking demonstrates symmetry-inspired robustness in electricity demand forecasting through the balanced treatment of datasets and model integration.

Keywords: electricity demand forecasting; ensemble learning; D²-Sampling; K-Nearest Neighbor Algorithm; symmetric evaluation; XGBoost

1. Introduction

Accurate electricity forecasting is indispensable for power systems, significantly enhancing the ability to manage losses, improving the efficiency of grid scheduling, and providing crucial support for the safe and stable operation of power systems [1,2]. The current demand management for power metering equipment generally adopts the “first-level storage, first-level distribution” model [3]. Traditional prediction methods require a large amount of information, and the modeling process is relatively complex [4]. With the acceleration of informatization in the power system, the large amount of data on loads, temperatures, and humidities has enhanced the accuracy of power demand analysis and prediction but has also increased its complexity. At the same time, the introduction of the price competition mechanism has set higher standards for load forecasting [5]. As we enter a new stage of economic development, power demand forecasting [6,7,8,9] is important for power grid planning, dispatching, enhancing grid efficiency, and improving the economic operation of the grid. Therefore, accurately and rapidly predicting electricity demand has become a valuable research hotspot.

Many research methods have been developed for power prediction applications, with remarkable results. Traditional statistical forecasting methods, such as ARIMA, SARIMA, and gray forecasting models, exhibit limited performance when dealing with the strong nonlinearity, randomness, and multi-scale periodic features commonly present in electricity load data [10,11,12]. Moreover, as the operational complexity of power systems increases, a single statistical model struggles to fully capture the multiple fluctuation patterns in load series [13]. To improve forecasting performance, machine learning methods, such as support vector regression (SVR), artificial neural networks (ANN), and random forest (RF), have enhanced the model’s fitting capability to some extent. However, their forecasting effectiveness is still constrained by the model structure settings, parameter sensitivity, and generalization ability [14]. Additionally, in multi-region and multi-load-type scenarios, these methods often face difficulties in balancing stability and accuracy.

To overcome the limitations of individual models, ensemble learning techniques have been widely applied to electricity demand forecasting tasks [15]. By combining multiple base learners, ensemble learning can effectively enhance the robustness and precision of predictions, addressing the issue of inconsistent performance across different load fluctuation patterns [16]. Common ensemble methods include bagging, boosting, and stacking, all of which have demonstrated significant advantages in electricity demand forecasting. For example, random forest (RF), as a representative of the bagging approach, alleviates overfitting through the integration of multiple decision trees. Gradient Boosting Decision Trees (GBDT) and XGBoost, on the other hand, strengthen the model’s ability to fit complex nonlinear relationships through the boosting strategy. Meanwhile, the stacking method further enhances the overall performance of predictions by combining different models.

However, existing ensemble learning methods still face certain limitations in practical applications. On the one hand, the selection of base models, feature construction, and parameter tuning have a significant impact on the final forecasting performance. On the other hand, electricity load data exhibit time-varying, multi-periodic, and stochastic characteristics, which may lead to instability in the predictive performance of a single ensemble strategy across different load stages. Furthermore, as the data scale grows, and the forecasting scenarios diversify, improving the computational efficiency while ensuring the forecasting accuracy becomes an urgent issue to address.

To address the aforementioned issues, we propose an improved ensemble learning regression algorithm. This method enhances model performance by introducing a diversity-enhancing strategy and integrates various types of base learners. Innovatively, the stacking strategy is employed to organically combine these heterogeneous models, thereby significantly improving the model’s generalization ability. Moreover, experiments on real electricity load data from multiple regions demonstrate that the proposed method exhibits superior forecasting accuracy and stability across different load types and application scenarios, providing reliable data support for power system scheduling, load management, and economic operation.

Building on stacking, we further focus on diversity, a key factor influencing ensemble learning performance, and propose a new DK-Stacking algorithm framework. This includes a novel DK sampling strategy to enhance the diversity of the base dataset samples. Meanwhile, stacking [17,18] integrates multiple base models to reduce the risk of overfitting, improve the prediction accuracy, and enhance the model robustness, ultimately improving the electricity demand forecasting performance.

The main design idea is summarized as follows: Based on D²-Sampling [19] and KNN [20], two sample selection probabilities are proposed, respectively. The former mainly reflects the global distribution information of the samples, while the latter mainly reflects the local distribution information of the samples. Then, bagging [21,22] integrates these two probabilities to form a new sample selection probability. At the same time, the sampling between the base datasets should be continuous, that is, the last sample of the previous base dataset should be the starting sample of the next base dataset. This increases the difference between the base datasets. Then, the base datasets are used to train the base model. Finally, multiple base models are integrated using stacking to prevent overfitting and improve accuracy.

We refer to the improved algorithm as DK-Stacking (D²-Sampling- and KNN-based Stacking) and apply it to actual electricity demand prediction scenarios. The experimental results show that compared with models such as SVR [23,24], ANN [25,26,27], RF, and XGBoost, the prediction accuracy of DK-Stacking has improved, with better performance. Figure 1 illustrates the framework of the proposed DK-Stacking algorithm.

The main contributions of this study are as follows.

(1): We propose a novel DK-Stacking algorithm for electricity load forecasting, which combines D²-Sampling- and KNN-based sampling strategies to enhance the diversity of the base datasets, thereby improving the robustness and generalization ability of the ensemble models.
(2): The DK-Stacking algorithm integrates multiple heterogeneous base learners via the stacking strategy, effectively reducing overfitting and improving the forecasting accuracy across different load types and scenarios.
(3): Extensive experiments on real electricity load datasets from multiple regions demonstrate that the proposed DK-Stacking algorithm outperforms conventional machine learning models (SVR, ANN, RF, XGBoost) and existing ensemble methods in terms of both the prediction accuracy and stability.

The rest of this study is organized as follows. In Section 2, we briefly review the related works. The proposed DK-Stacking algorithm is presented and discussed in Section 3. Extensive experimental studies and performance evaluations are presented in Section 4. Section 5 concludes this study and discusses potential future work.

2. Related Works

2.1. Machine Learning Methods for Electricity Demand Forecasting

In the field of electricity demand forecasting, numerous methods and models have been proposed, covering traditional statistical methods, machine learning, deep learning, and ensemble learning. These approaches have made continual progress in terms of their accuracy and applicability to various scenarios.

Early research mainly relied on traditional statistical models. For instance, Pappas et al. constructed and evaluated an ARMA model for the Hellenic power system, demonstrating its effectiveness in electricity load forecasting [28]. In addition, the optimized gray prediction model proposed by Zhao and Guo exhibited strong performance in annual power load forecasting [29]. These studies indicate that traditional statistical models play an important role in improving the forecasting accuracy. However, they are applicable only when certain statistical assumptions are met, and both are designed for single sequence prediction. With the increase in available data and related variables, multivariate modeling has become an important means of improving the forecasting accuracy. Mohamed and Bodger introduced GDP, electricity prices, and total population numbers to establish a multiple linear regression model to predict electricity consumption [9]. However, this model performed poorly when dealing with data that were either over- or under-distributed.

Machine learning methods were subsequently adopted to further enhance the representational ability and forecasting performance of the models. Such approaches have demonstrated strong predictive performance when the feature space is enriched [30]; examples include the mixed kernel-based extreme learning machine proposed by Chen et al. [31] and the interpretable causal graph neural network proposed by Miraki and Parviainen [32]. The extreme learning machine model uses empirical mode decomposition (EMD) to extract complex features from the load series and denoise the data, and incorporates a mixed kernel function consisting of RBF and UKF kernels to enhance model performance. The causal graph neural network represents the relationships between variables through causal graphs, which improves the prediction performance. However, these methods are more often used for optimization and for exploring feature representation capabilities.

Deep learning for electricity forecasting has also made significant progress. Aprillia et al. proposed a prediction strategy that combined a convolutional neural network (CNN) and the salp swarm algorithm (SSA) to predict the power generation of photovoltaic power plants [33]. They divided historical photovoltaic power generation data and related weather information into five types of weather conditions: rain, heavy cloud, cloudy, light cloud, and sunny. The CNN classification was used to determine the prediction of the next day’s weather type, and five CNN regression models were established to adapt to the prediction of different weather types, making this algorithm more suitable for the actual generation mode. In the same year, Hossain and Mahmood proposed a prediction algorithm for photovoltaic power generation using a long short-term memory (LSTM) neural network (NN) [34]. They employed the K-means algorithm to classify historical irradiance data into dynamic types of sky groups that change hourly in the same season, significantly improving the prediction accuracy and enabling more reliable photovoltaic power generation predictions. Compared with other machine learning engines, this algorithm is superior. However, deep learning requires high computing power support, and it is almost a black-box model with poor interpretability, usually involving more fine-tuning and preprocessing.

Beyond deep learning, ensemble learning methods, such as random forest [35,36] and the XGBoost algorithm [37,38], have also been introduced to further enhance the performance of multiple models in predicting power-related indicators. Compared with other machine learning-based methods, random forest reduces the risk of overfitting and enhances the generalization ability by randomly selecting the base datasets and features. The XGBoost algorithm is an optimized distributed gradient boosting library that can accelerate model training through parallel processing and incorporates regularization terms on this basis, effectively controlling the complexity of the model and avoiding overfitting.

Among the various ensemble strategies, bagging, boosting, and stacking are the three most representative methods. Stacking stands out, particularly in complex scenarios, due to its ability to integrate heterogeneous information from different types of models.

2.2. Stacking Ensemble Learning Algorithm

Stacking is an advanced and efficient method in ensemble learning. By integrating multiple models, it conducts multi-level learning processes, comprehensively leveraging the advantages of various models to achieve better prediction results. Its working principle is to use multiple base learners to learn the original data and then use their outputs as new features to input into the second layer learner; that is, the output of the first layer becomes the input to the second layer.

The stacking ensemble learning framework is shown in Figure 2. It adopts a two-layer structure. The initial dataset is first divided into multiple base datasets, which are then input into multiple base learners in the first layer. Each base learner produces a corresponding prediction result. Then, the output of the first layer is used as the input for the next layer, which is employed to train the meta-learner of the second layer. Finally, the output of the second layer’s meta-learner is used to generate the final prediction result. By generalizing the output results of multiple models and integrating the advantages of different models, the overall prediction accuracy of the stacking learning framework is improved.

3. DK-Stacking Algorithm

In this section, we introduce the DK-Stacking ensemble algorithm for electricity load forecasting. The algorithm first employs the DK-Sampling strategy, which combines D²-Sampling and KNN-based sampling to enhance the diversity of the base datasets. Based on this strategy, a bagging ensemble generates multiple base datasets, which are then integrated using a two-layer stacking structure with heterogeneous base learners.

3.1. D²-Sampling Selection Probability

Let $X$ denote the full sample set and $C$ the base dataset under construction, which is initially empty. Samples are selected from the points of $X$ that are not yet in $C$ and are added to $C$. Let $x_{i}$ be any point of $X$ that is not in $C$, that is, $x_{i} \in X$, $x_{i} \notin C$. A randomly selected sample point is used as the starting point of the first base training set.

The squared distance between the sample point $x_{i}$ and the set $C$ is defined as

$$d(x_{i}, C)^{2} = \min_{c \in C} \| x_{i} - c \|_{2}^{2}$$

(1)

Then the probability of sample point $x_{i}$ being chosen is defined as

$$p_{i}^{'} = \frac{d(x_{i}, C)^{2}}{\sum_{x_{j} \in X \setminus C} d(x_{j}, C)^{2}}$$

(2)

Equation (2) indicates that points farther away from the set of already selected sample points have a higher probability of being chosen. It should be noted that after each update of the set $C$, Equation (2) needs to be recalculated to update the selection probability of each sample point.
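As an illustration of Equations (1) and (2), the following minimal sketch computes the D²-Sampling probabilities in plain Python. The function name `d2_probabilities`, the representation of $C$ as a set of indices into $X$, and the uniform fallback used when all remaining distances are zero are assumptions made for this example; they are not taken from the original implementation.

```python
def d2_probabilities(X, C):
    """Return {index: p'_i} for every point of X whose index is not in C.

    X is a list of feature vectors; C is a non-empty set of indices of the
    points that have already been selected (hypothetical representation).
    """
    def sq_dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    # Equation (1): squared distance from x_i to its nearest selected point.
    d2 = {i: min(sq_dist(X[i], X[c]) for c in C)
          for i in range(len(X)) if i not in C}
    total = sum(d2.values())
    if total == 0:
        # Assumption: if every remaining point coincides with a selected
        # point, fall back to a uniform distribution.
        return {i: 1.0 / len(d2) for i in d2}
    # Equation (2): normalise over the points of X that are not in C.
    return {i: v / total for i, v in d2.items()}


X = [[0.0, 0.0], [1.0, 0.0], [3.0, 0.0]]
probs = d2_probabilities(X, C={0})
print(probs)  # the point at distance 3 is nine times as likely as the one at distance 1
```

Because the probabilities depend on $C$, the function has to be called again after every point added to $C$, which matches the recalculation requirement stated above.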

3.2. KNN Selection Probability

Let $x_{i} \in X$ be any sample point in the sample set $X$, and let $\mathrm{knn}_{i}$ denote the set of $K$ nearest neighbors of point $x_{i}$. Then, the KNN distance of sample point $x_{i}$ can be defined as

$$d(x_{i}, \mathrm{knn}_{i}) = \sum_{j \in \mathrm{knn}_{i}} d(x_{i}, x_{j})$$

(3)

To avoid the scaling problem, the following mean normalization is carried out:

$$\bar{d}(x_{i}, \mathrm{knn}_{i}) = \frac{d(x_{i}, \mathrm{knn}_{i})}{K}$$

(4)

Let $S$ be the sum of the normalized KNN distances of all $N$ points in the sample set:

$$S = \sum_{j=1}^{N} \bar{d}(x_{j}, \mathrm{knn}_{j})$$

(5)

Then the probability of sample point $x_{i}$ being selected based on KNN is defined as

$$p_{i}^{''} = \frac{S - \bar{d}(x_{i}, \mathrm{knn}_{i})}{\sum_{j=1}^{N} \left( S - \bar{d}(x_{j}, \mathrm{knn}_{j}) \right)} = \frac{S - \bar{d}(x_{i}, \mathrm{knn}_{i})}{N \cdot S - S}$$

(6)

Equation (6) indicates that points whose nearest neighbors lie closer, that is, points in denser local regions, have a higher probability of being selected.
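Equations (3) to (6) can be sketched as follows. This is a minimal example, not the authors' code: the function name `knn_probabilities` is hypothetical, the pairwise distance is assumed to be Euclidean (the text does not name the metric), and the brute-force neighbor search assumes a small dataset with at least two distinct points.

```python
import math


def knn_probabilities(X, k):
    """Return the list of KNN-based selection probabilities for all points in X."""
    n = len(X)
    mean_knn = []
    for i in range(n):
        # Distances from x_i to every other point, smallest first.
        dists = sorted(math.dist(X[i], X[j]) for j in range(n) if j != i)
        # Equations (3)-(4): sum of the k smallest distances, divided by k.
        mean_knn.append(sum(dists[:k]) / k)
    S = sum(mean_knn)          # Equation (5)
    denom = n * S - S          # closed-form denominator of Equation (6)
    return [(S - d) / denom for d in mean_knn]


X = [[0.0], [1.0], [2.0], [10.0]]
p = knn_probabilities(X, k=2)
print(p)  # the isolated point at 10.0 receives the smallest probability
```

In this toy example the three clustered points receive most of the probability mass, while the outlier at 10.0 is rarely selected, which is the local-density behavior that Equation (6) is meant to capture.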

3.3. DK’s Sampling Bagging Strategy

We construct the bagging ensemble based on D²-Sampling and KNN. Once the selection probabilities $p_{i}^{'}$ and $p_{i}^{''}$ of the sample points have been determined, the final selection probability $p_{i}$ of sample point $x_{i}$ is

$$p_{i} = p_{i}^{'} \cdot p_{i}^{''}$$

(7)

It should be noted that, in order to ensure the diversity of sampling between different base datasets, the starting sample point $c_{0}$ of every base dataset after the first is the last sample point $c_{n}$ of the previous base dataset. The generation process of multiple base datasets is shown in Figure 3. The base dataset $C$ is initially empty; $m$ represents the number of base datasets, which corresponds to the number of base models in the first layer of stacking; and $n$ represents the number of sample points in the base dataset, which in this study is required to account for approximately 70% of the entire dataset. The sampling process is as follows:

(1): For the first base dataset $C$, each sample point is assigned a selection probability according to Equation (6), and a starting point $c_{0}$ is randomly selected based on this probability. The remaining sample points are then assigned selection probabilities according to Equation (7), and a point $c_{1}$ that is not yet in $C$ is randomly selected as the second point of $C$. This process is repeated to select $c_{2}, \ldots, c_{n}$, until approximately 70% of the points in the training set have entered the base training set $C$.
(2): Once the previous base dataset has been completed, the points of the next base dataset are chosen. The starting point $c_{0}$ of the next base dataset is the last sample point $c_{n}$ of the previous one. The remaining selection operations are the same as those in step (1), and all subsequent base datasets are generated in the same way.
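The two steps above can be sketched as a single routine. This is an illustrative sketch under several assumptions rather than the authors' implementation: the function name `dk_base_datasets` and its `seed` parameter are hypothetical, base datasets are returned as lists of indices into $X$, distances are assumed Euclidean, points are drawn with probability proportional to $p_{i}$ from Equation (7), and sampling stops early if no remaining point has positive weight.

```python
import math
import random


def dk_base_datasets(X, m, n, k, seed=0):
    """Generate m base datasets (lists of indices into X), each of size n."""
    rng = random.Random(seed)
    N = len(X)

    # KNN-based probabilities, Equations (3)-(6); they do not depend on C.
    mean_knn = []
    for i in range(N):
        dists = sorted(math.dist(X[i], X[j]) for j in range(N) if j != i)
        mean_knn.append(sum(dists[:k]) / k)
    S = sum(mean_knn)
    p_knn = [(S - d) / (N * S - S) for d in mean_knn]

    datasets, start = [], None
    for _ in range(m):
        if start is None:
            # Step (1): draw c_0 of the first base dataset from Equation (6).
            start = rng.choices(range(N), weights=p_knn)[0]
        C = [start]
        # Running squared distance to the nearest point of C, Equation (1).
        d2 = [sum((a - b) ** 2 for a, b in zip(X[i], X[start])) for i in range(N)]
        while len(C) < n:
            # Equation (7) up to a constant factor; points already in C
            # have d2 == 0 and therefore weight 0, so they are never redrawn.
            weights = [d2[i] * p_knn[i] for i in range(N)]
            if sum(weights) == 0:
                break  # assumption: stop if no candidate has positive weight
            c = rng.choices(range(N), weights=weights)[0]
            C.append(c)
            # Update Equation (1) after adding c to C.
            d2 = [min(d2[i], sum((a - b) ** 2 for a, b in zip(X[i], X[c])))
                  for i in range(N)]
        datasets.append(C)
        start = C[-1]  # step (2): c_n becomes c_0 of the next base dataset
    return datasets


points = [[float(i), float(i % 3)] for i in range(10)]
sets = dk_base_datasets(points, m=3, n=7, k=5)
print(sets)
```

Passing the weights directly to `random.choices` avoids an explicit normalization step, because that function rescales the weights internally.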

Based on the above descriptions and analysis, the implementation of DK-Stacking is summarized in Algorithm 1.

Algorithm 1: DK-Stacking Algorithm

Input: Dataset $X$ with $N$ samples and $F$ features. Number of base datasets $m$. Size of base dataset $n$ (typically 70% of total). Number of neighbors $K$ for KNN. First-layer base models: ANN, SVR, RF, XGBoost. Second-layer meta-model: XGBoost.
Output: Final prediction $\hat{Y}$ for test samples.
Procedure:
1: Initialize empty base dataset $C = \emptyset$
2: Compute D²-Sampling probabilities $p_{i}^{'}$ using Equations (1)–(2)
3: Compute KNN-based probabilities $p_{i}^{''}$ using Equations (3)–(6)
4: Compute final selection probability $p_{i} = p_{i}^{'} \cdot p_{i}^{''}$ for each sample using Equation (7)
5: for $k = 1$ to $m$, generate $m$ base datasets
6: if $k = 1$ then randomly select starting sample $c_{0}$ based on $p_{i}$
7: else $c_{0}$ of current base dataset = last sample $c_{n}$ of previous base dataset
8: repeat
9: Select sample $c_{j} \notin C$ based on $p_{i}$
10: Add $c_{j}$ to $C$
11: until base dataset $C$ reaches $n$ samples
12: Train first-layer base models (ANN, SVR, RF, XGBoost) on base dataset $C$
13: end for
14: Collect predictions from all first-layer models on training and validation sets
15: Integrate predictions as input features for second-layer XGBoost meta-model
16: Train second-layer XGBoost model using 5-fold cross-validation and grid search
17: Output final prediction $\hat{Y}$ for test samples

The overall framework of the stacking model is shown in Figure 4. The first layer of the stacking integrated model in this study employs four basic regression models as the prediction models, namely, artificial neural network (ANN), support vector regression (SVR), random forest (RF), and XGBoost. The second layer of the prediction model adopts XGBoost, which has the best prediction effect among the four models. Stacking integration enhances the model’s generalization ability by automatically combining the strengths of multiple models. This reduces the risk of overfitting, improves the prediction accuracy and robustness, optimizes the ensemble prediction, and ultimately boosts the overall performance.

It is worth noting that we train the first and second layers of the stacking ensemble model differently. For the first-level prediction model, we use the DK-sampled base training set for training. For the second-level prediction model, the training set and validation set are combined and input into the already trained first-level prediction model to generate the prediction result. Then, the obtained prediction result is integrated and passed as the input feature to the second-level prediction model. This study employed five-fold cross-validation and grid search for optimization and parameter tuning, thereby selecting the optimal combination.

4. Experimental Studies

In this section, we present the datasets, preprocessing procedures, parameter settings, and experimental results used to evaluate the performance of the proposed DK-Stacking algorithm. The experiments were conducted on three datasets from different regions, Region 1, Region 2, and Queensland, to compare the predictive accuracy of DK-Stacking with several mainstream models such as SVR, ANN, RF, and XGBoost. The effectiveness of the proposed method was assessed in terms of the mean absolute percentage error (MAPE). To ensure a fair and consistent comparison across datasets and models, a symmetric experimental evaluation strategy was applied, where identical preprocessing, parameter settings, and evaluation metrics were used for all models and datasets.

4.1. Datasets and Preprocessing

To make the experimental results more convincing, we used data sets from three different regions, Region 1, Region 2, and Queensland (the second largest state in Australia) to conduct experimental comparisons and validations, respectively.

The Region 1 and Region 2 datasets, obtained from the internet, both contain electricity consumption and meteorological factor data from 1 January 2012 to 10 January 2015. Both datasets consist of 1107 samples and 7 features, and include the fields shown in Table 1 below:

Among them, the selected sample features include continuous variables such as the maximum temperature, minimum temperature, average temperature, relative humidity, and rainfall amount. These features will be used to verify the effectiveness of the DK-Stacking algorithm.

We also obtained the electricity consumption dataset of Queensland (the second largest state in Australia) from the Kaggle data website. This dataset consists of 2106 samples and 14 features, and includes the following fields, as shown in Table 2 below:

Among them, the selected sample features included continuous variables, such as the lowest temperature during the day, the highest temperature during the day, the total daily solar energy, and the daily rainfall amount. Additionally, there were two categorical variables: whether the student is at school and whether it is a holiday. These features were also used to verify the effectiveness of the DK-Stacking algorithm.

Before using these datasets, we carried out data cleaning and normalization to ensure their accuracy and appropriateness. Specifically, for the Region 1 and Region 2 datasets, two data samples with missing values were deleted, while for the Queensland electricity consumption dataset, four data samples with missing values were deleted.

In the preprocessing stage, the data is first normalized. A scaler object is created using the MinMaxScaler class from the sklearn library to perform feature normalization, converting the range to [−1, 1]. The normalized results for Region 2 are shown in Figure 5, for Region 1 in Figure 6, and for Queensland in Figure 7.
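The transformation applied in this step can be written out explicitly. The sketch below is a plain-Python illustration of min-max scaling to [−1, 1] for a single feature column; with scikit-learn the same mapping is obtained from `MinMaxScaler(feature_range=(-1, 1))`. The function name `min_max_scale` and the handling of constant columns are assumptions made for this example.

```python
def min_max_scale(column, lo=-1.0, hi=1.0):
    """Linearly map a list of numbers onto the interval [lo, hi]."""
    c_min, c_max = min(column), max(column)
    span = c_max - c_min
    if span == 0:
        # Assumption: a constant column carries no information and is
        # mapped to the lower bound of the target range.
        return [lo for _ in column]
    return [lo + (v - c_min) * (hi - lo) / span for v in column]


temps = [10.0, 20.0, 30.0]
scaled = min_max_scale(temps)
print(scaled)  # [-1.0, 0.0, 1.0]
```

The minimum of each feature is mapped to −1 and the maximum to 1, so features measured in different units, such as temperature and rainfall, end up on a common scale.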

4.2. Parameter Settings and Model Evaluation

The DK-Stacking model in our study involves several predefined parameters. When calculating the selection probability based on KNN, the number of neighbors K is set to 5. Each base dataset contains 70% of the total dataset. The SVR, ANN, and RF models are used as comparison models, with their predefined parameters set to the default recommended values. XGBoost is also used as a comparison model, but in optimized form: it is tuned with the same procedure as the XGBoost model in the second layer of the stacking model, namely 5-fold cross-validation and grid search to select the optimal parameters. The experiments showed that the optimal parameters selected for the three regions were the same. The optimal parameters for the XGBoost comparison model were ‘learning_rate’: 0.1, ‘n_estimators’: 50, while those for the XGBoost model in the second layer of the stacking model were ‘learning_rate’: 0.01, ‘max_depth’: 3, ‘n_estimators’: 300.

We use the mean absolute percentage error (MAPE) as the performance evaluation index, and its expression is

$$\mathrm{MAPE} = \frac{100}{n} \sum_{i=1}^{n} \left| \frac{y_{i} - \hat{y}_{i}}{y_{i}} \right|$$

(8)

In the formula, $n$ represents the sample size, $y_{i}$ refers to the true value, and $\hat{y}_{i}$ refers to the predicted value.
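For reference, Equation (8) can be computed with a few lines of plain Python. The function name `mape` and the sample values are illustrative only; the sketch assumes that no true value is zero, since the metric is undefined in that case.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error in percent, following Equation (8)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)


actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]
score = mape(actual, predicted)
print(round(score, 2))  # 6.67
```

Here the relative errors are 10%, 10%, and 0%, so the MAPE is their mean, about 6.67%. A lower value indicates a more accurate forecast.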

4.3. Experimental Results

We conducted experimental comparisons and validation using datasets from three different regions. Under the condition of maintaining the same settings for each dataset, multiple mainstream models such as SVR, ANN, RF, and XGBoost were employed for comparative experiments.

For Region 2, the comparison results of the prediction performance are shown in Table 3. The DK-Stacking algorithm we propose had an MAPE value of 9.57% in actual power demand prediction, which was the lowest relative error among all the methods. Compared with the optimized XGBoost algorithm, the prediction accuracy of DK-Stacking increased by 0.04 percentage points, a slight gain attributable to the improved stacking strategy. Compared with RF, the prediction accuracy increased by 0.31 percentage points; compared with SVR and ANN, the accuracy improved by more than 1 percentage point.

For Region 1, the comparison results of the prediction performance are shown in Table 4. The DK-Stacking algorithm we propose had an MAPE value of 12.24% in actual power demand prediction, which was the lowest relative error among all methods. Compared with the optimized XGBoost algorithm, the prediction accuracy of DK-Stacking increased by 0.35 percentage points, indicating that the improved stacking strategy can effectively enhance the model performance. Compared with SVR and ANN, the prediction accuracy increased by 0.47 percentage points and 0.61 percentage points, respectively, while compared with RF, the accuracy increased by more than 1 percentage point.

For Queensland, the comparison results of prediction performance are shown in Table 5. From the table, it can be seen that the DK-Stacking algorithm we propose has an MAPE value of 6.97% in the actual power demand prediction, which is the lowest relative error among all methods. Compared with the optimized XGBoost algorithm, the prediction accuracy of DK-Stacking has increased by 0.44 percentage points, indicating that the improved Stacking strategy can effectively enhance the model performance. Moreover, when compared with ANN, SVR, and RF, the accuracy improvement exceeds 1 percentage point.

To further illustrate the predictive performance of the models, we present comparative prediction plots for the three regions (Figure 8, Figure 9 and Figure 10). Each figure consists of six subplots: subplots (a)–(e) show the comparison between the predicted and actual values obtained by the ANN, SVR, RF, XGBoost, and DK-Stacking models, respectively, while subplot (f) overlays the prediction results of all five models within the same coordinate system, allowing for a more intuitive comparison of their fitting performance and error differences.

In subplots (a)–(e) of Figure 8, Figure 9 and Figure 10, the blue lines represent the actual values, while the orange lines depict the predictions generated by each corresponding model. As shown in the figures, although all models were able to capture the overall pattern of load variation to varying degrees, the prediction curve of the DK-Stacking model aligned most closely with the actual load curve. In particular, during periods with pronounced load fluctuations, DK-Stacking demonstrated markedly superior fitting performance and trend-capturing capability compared to the other models.

In subplot (f) of Figure 8, Figure 9 and Figure 10, the blue line represents the actual values, the orange–yellow line depicts the predictions of the ANN model, the green line those of the SVR model, the red line those of the RF model, the purple line those of the XGBoost model, and the brown–red line those of the DK-Stacking model proposed in this study. The plots show that the actual electricity demand contained many peaks and followed a complex pattern. The predictions of each model followed the same trend as the actual values, indicating that all models captured the mean trend to a similar degree. At the peak points, however, the brown–red line fits the blue line more closely than the lines of the other colors; that is, the DK-Stacking predictions are closer to the actual values, particularly at many of the peaks. Therefore, the improved DK-Stacking prediction model has better adaptability and better prediction performance in the complex electricity demand field.

Finally, to summarize the experimental results, Table 6 collects the MAPE values obtained on the three regional datasets.

Table 6 shows that the proposed DK-Stacking algorithm achieved the lowest MAPE on all three datasets. Its best result was an MAPE of 6.97% on the Queensland dataset, which is 0.44 percentage points lower than that of the optimized XGBoost model (7.41%) and more than 1 percentage point lower than those of the ANN, SVR, and RF models. Region 2 was predicted second best, with the DK-Stacking MAPE (9.57%) below 10%; there it improved on ANN and SVR by almost 2 percentage points, but on RF and XGBoost by only 0.31 and 0.04 percentage points, respectively. Region 1 was the hardest to predict; even so, DK-Stacking (12.24%) still outperformed every comparison model, by between 0.35 and 1.33 percentage points.
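The percentage-point comparisons above can be checked directly against Table 6. The following Python snippet is a minimal sketch for that purpose only: the MAPE values are copied from Table 6, while the dictionary layout and the helper name `gaps_vs_dk` are illustrative assumptions rather than part of the original experiments.

```python
# MAPE values (%) copied from Table 6; the data layout is an illustrative assumption.
MAPE = {
    "Region 1": {"ANN": 12.85, "SVR": 12.71, "RF": 13.57,
                 "XGBoost": 12.59, "DK-Stacking": 12.24},
    "Region 2": {"ANN": 11.51, "SVR": 11.52, "RF": 9.88,
                 "XGBoost": 9.61, "DK-Stacking": 9.57},
    "Queensland": {"ANN": 8.73, "SVR": 8.29, "RF": 8.09,
                   "XGBoost": 7.41, "DK-Stacking": 6.97},
}


def gaps_vs_dk(scores, ours="DK-Stacking"):
    """Return (baseline MAPE - DK-Stacking MAPE) in percentage points.

    A positive gap means DK-Stacking has the lower (better) MAPE.
    `gaps_vs_dk` is a hypothetical helper name used only for this sketch.
    """
    return {model: round(value - scores[ours], 2)
            for model, value in scores.items() if model != ours}


if __name__ == "__main__":
    for region, scores in MAPE.items():
        print(region, gaps_vs_dk(scores))
```

Because MAPE is an error measure, a positive gap corresponds to an accuracy gain for DK-Stacking; the gaps are absolute differences in percentage points, not relative improvements.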

5. Conclusions

Electricity demand forecasting plays a crucial role in energy planning and power system operation. However, demand changes in complex ways, and no obvious mechanism or pattern can be readily identified. To address this complex and variable prediction problem, we proposed an improved algorithm named DK-Stacking. At the sample level, an improved bagging strategy was presented. By integrating the sample point selection probabilities based on D²-Sampling and KNN, base datasets that better reflect the entire sample space were constructed. This reduced the probability of repeatedly selecting approximately representative sample points. The proposed continuous sampling method between the base datasets also increased the diversity among them. At the model level, stacking ensemble learning was adopted to integrate multiple regression models, combining the strengths of the different models to improve prediction accuracy and robustness. Unlike most models, DK-Stacking thus improves overall performance from two directions: the diversity of the sample sets and the optimized combination of models. Although the proposed DK-Stacking algorithm improved prediction accuracy, there is still room for further study. In future work, we will include RMSE, MAE, and other relevant metrics to further validate and strengthen the evaluation of our approach. In addition, we will conduct end-to-end deployment experiments to quantitatively evaluate its engineering performance.
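As a rough illustration of the D²-Sampling component mentioned above, the following Python sketch computes the standard D² selection probability, in which a point is picked with probability proportional to its squared distance to the nearest point already selected. It is a minimal sketch under stated assumptions and not the DK sampling procedure itself: the KNN-based probability and the way the two probabilities are combined are not reproduced here, and the function names `d2_probabilities` and `d2_sample` are hypothetical.

```python
import random


def d2_probabilities(points, chosen):
    """Standard D²-Sampling probabilities for `points` given `chosen`.

    Each probability is proportional to the squared Euclidean distance
    from the point to its nearest already-chosen point, so points far
    from the current selection are more likely to be drawn next.
    """
    d2 = [min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in chosen)
          for p in points]
    total = sum(d2)
    if total == 0:  # every point coincides with a chosen point
        return [1.0 / len(points)] * len(points)
    return [d / total for d in d2]


def d2_sample(points, k, seed=0):
    """Draw k points: the first uniformly, the rest by D² weighting."""
    rng = random.Random(seed)
    chosen = [rng.choice(points)]
    while len(chosen) < k:
        probs = d2_probabilities(points, chosen)
        chosen.append(rng.choices(points, weights=probs, k=1)[0])
    return chosen


if __name__ == "__main__":
    pts = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
    # Squared distances to (0, 0) are 0, 1 and 9.
    print(d2_probabilities(pts, [(0.0, 0.0)]))
    print(d2_sample(pts, k=2))
```

A point that has already been selected has zero distance to the selection and hence zero probability of being drawn again, which is the mechanism that discourages repeated selection of near-duplicate samples.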

Author Contributions

Conceptualization, J.Z. and D.Y.; Methodology, J.Z. and D.Y.; Software, P.Y.; Validation, P.Y. and Z.B.; Resources, Z.B.; Data curation, Z.B., Z.J. and D.Y.; Writing—original draft, P.Y.; Writing—review and editing, J.Z., Z.B. and Z.J.; Visualization, Z.J.; Supervision, Z.J. and D.Y.; Funding acquisition, J.Z. and Z.J. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the Soft Science Research Program of Zhejiang Province under Grant No. 2026C35053; by the National Natural Science Foundation of China under Grant Nos. 62206177 and 62106145; and by the Zhejiang Provincial Natural Science Foundation of China under Grant Nos. LY23F020007 and LQ22F020024.

Data Availability Statement

Data is contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

Figure 1. Framework of the proposed DK-Stacking algorithm.

Figure 2. Stacking Ensemble Learning Framework.

Figure 3. Sampling operation of DK sample.

Figure 4. Overall framework of the Stacking model.

Figure 5. Normalization of Region 2 Characteristics.

Figure 6. Normalization of Region 1 Characteristics.

Figure 7. Normalization of Queensland’s Characteristics.

Figure 8. Region 2—Comparison of Prediction Results.

Figure 9. Region 1—Comparison of Prediction Results.

Figure 10. Queensland—Comparison of Prediction Results.

Table 1. Description of Data Sets for Region 1 and Region 2.

Serial Number	Data Field
1	date and time
2	electric load (MV)
3	maximum temperature (°C)
4	minimum temperature (°C)
5	average temperature (°C)
6	relative humidity (average)
7	rainfall (mm)

Table 2. Description of Electricity Consumption Data Set for Queensland.

Data Field	Description
Date	Date; Date variable
Demand	Daily total power demand; continuous variable
RRP	Suggested retail price; continuous variable
demand_pos_RRP	The daily total demand quantity with a positive retail price; continuous variable
RRP_positive	Average positive retail price; continuous variable
demand_neg_RRP	The daily total demand quantity with a negative retail price; continuous variable
RRP_negative	Average negative retail price; continuous variable
frac_at_neg_RRP	The portion of negative retail price transactions; continuous variable
min_temperature	Daytime minimum temperature; continuous variable
max_temperature	Daytime maximum temperature; continuous variable
solar_exposure	Daily total solar energy; continuous variable
rainfall	Daily rainfall amount; continuous variable
school_day	Whether students were at school; categorical variable
holiday	Whether the day was a holiday; categorical variable

Table 3. Comparative Experiment in Region 2.

Method	MAPE of Forecasting Results (%)
ANN	11.51
SVR	11.52
RF	9.88
XGBoost	9.61
DK-Stacking	9.57

The best result is highlighted in bold.

Table 4. Comparative Experiment in Region 1.

Method	MAPE of Forecasting Results (%)
ANN	12.85
SVR	12.71
RF	13.57
XGBoost	12.59
DK-Stacking	12.24

The best result is highlighted in bold.

Table 5. Comparative Experiment in Queensland.

Method	MAPE of Forecasting Results (%)
ANN	8.73
SVR	8.29
RF	8.09
XGBoost	7.41
DK-Stacking	6.97

The best result is highlighted in bold.

Table 6. Comparative Verification of Experimental Results in Three Regions.

Method	Region 1 MAPE (%)	Region 2 MAPE (%)	Region 3 (Queensland) MAPE (%)
ANN	12.85	11.51	8.73
SVR	12.71	11.52	8.29
RF	13.57	9.88	8.09
XGBoost	12.59	9.61	7.41
DK-Stacking	12.24	9.57	6.97

The best results are highlighted in bold.

Share and Cite

MDPI and ACS Style

Zhou, J.; Yan, P.; Bian, Z.; Jiang, Z.; Yu, D. An Improved Ensemble Learning Regression Algorithm for Electricity Demand Forecasting with Symmetric Experimental Evaluation. Symmetry 2026, 18, 123. https://doi.org/10.3390/sym18010123
