Stackade Ensemble Learning for Resilient Forecasting Against Missing Values, Adversarial Attacks, and Concept Drift
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
The problem addressed in this paper is how to forecast power load in smart grids in the presence of missing data, adversarial attacks, and concept drift. The difficulties in studying this problem include: missing data reduces prediction accuracy; adversarial attacks deceive machine learning models and lead to wrong predictions; and concept drift makes the current data distribution inconsistent with the training data distribution, reducing prediction accuracy.
This paper proposes a new method called Stackade Ensemble Learning (StEL) to solve the problem of power load forecasting in smart grids. However, there are some things that need to be explained in this paper:
1. The paper points out that most existing machine learning techniques assume that data is clean and has a consistent distribution, while real-world data often has problems such as missing values, adversarial attacks, and concept drift. This ideal assumption leads to poor performance of the model in practical applications.
2. When multiple solutions are applied sequentially, insufficient corrections in the early steps will lead to error accumulation, which will eventually affect the prediction accuracy. For example, inaccurate missing value interpolation will affect the effect of subsequent adversarial attack detection and concept drift adaptation.
3. Most existing solutions only target a single problem and lack the ability to comprehensively handle multiple abnormal situations.
4. Adversarial attacks can fool machine learning models by adding carefully designed noise, and these perturbations may be masked by missing values, making detection and correction more difficult.
5. Factors such as seasonal changes cause data distribution to change over time, making trained models gradually invalid. Existing methods usually require retraining models, which may be impractical in real-time systems.
6. Existing solutions often cannot effectively deal with missing values, adversarial attacks, and concept drift at the same time. The paper points out that in this case, the prediction accuracy will drop significantly.
7. Although federated learning is proposed as a solution, it focuses on protection during the training phase rather than real-time anomaly handling during the deployment phase. In addition, excluding attacked local models or data may not be feasible in practical applications.
8. More relevant literature is needed; it is recommended to add: Optimizing Insulator Defect Detection with Improved DETR Models, https://doi.org/10.3390/math12101507
Author Response
Comments 1: The paper points out that most existing machine learning techniques assume that data is clean and has a consistent distribution, while real-world data often has problems such as missing values, adversarial attacks, and concept drift. This ideal assumption leads to poor performance of the model in practical applications.
Response 1: Thank you for the comments. Most existing forecasters are designed under the ideal assumption that the input data do not contain missing values, adversarial attacks, or concept drift, an assumption that, when violated, negatively affects the forecasting accuracy.
Comments 2: When multiple solutions are applied sequentially, insufficient corrections in the early steps will lead to error accumulation, which will eventually affect the prediction accuracy. For example, inaccurate missing value interpolation will affect the effect of subsequent adversarial attack detection and concept drift adaptation.
Response 2: Agree. Any inaccuracies from missing value imputation, adversarial attack correction, and concept drift rescaling can accumulate, causing the forecasting accuracy to degrade even if each problem was addressed individually.
Comments 3: Most of the existing solutions only target a single problem and lack the ability to comprehensively handle multiple abnormal situations.
Response 3: This is true due to the difficulty of unifying multiple solutions into one. Additionally, this paper shows that the trivial method of arranging multiple solutions in a cascading manner is inadequate due to error accumulation.
Comments 4: Adversarial attacks can fool machine learning models by adding carefully designed noise, and these perturbations may be masked by missing values, making detection and correction more difficult.
Response 4: Correct. The results on the compounding problem confirm this observation: the perturbation remaining after correction led to erroneous imputations.
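To illustrate the interaction, here is a minimal sketch (not the paper's implementation) showing how an epsilon-bounded perturbation can survive naive imputation when missing values overwrite some of the perturbed points. The load curve, missingness rate, and the sign-flip "attack" standing in for a gradient-based PGD attack are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 24-hour load curve (MW); values are illustrative, not the paper's data.
hours = np.arange(24)
load = 5000 + 1500 * np.sin(2 * np.pi * (hours - 6) / 24)

# Epsilon-bounded perturbation, mimicking the budget of a PGD-style attack
# (a real PGD attack would follow the model's gradient; this is a stand-in).
eps = 0.05
perturbed = load + eps * load.max() * rng.choice([-1.0, 1.0], size=load.shape)

# Missing values (20%) overwrite some perturbed points with NaN, masking them.
mask = rng.random(load.shape) < 0.2
mask[0] = False  # keep the first point so forward fill has a seed value
observed = perturbed.copy()
observed[mask] = np.nan

# Naive forward-fill imputation propagates neighbouring (possibly perturbed)
# values into the gaps, so the attack survives the imputation step.
filled = observed.copy()
for t in range(1, len(filled)):
    if np.isnan(filled[t]):
        filled[t] = filled[t - 1]

residual = np.abs(filled - load)
print(f"max residual after impute: {residual.max():.1f} MW")
```

The point of the sketch is that after imputation no value looks "missing" anymore, yet every filled gap inherits a perturbed neighbour, which is why detection and correction become harder when the two anomalies compound.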
Comments 5: Factors such as seasonal changes cause data distribution to change over time, making trained models gradually invalid. Existing methods usually require retraining models, which may be impractical in real-time systems.
Response 5: This is true. One method used in this paper to handle seasonal concept drift in real time is to adapt the min-max normalization to the current season in a cascading operation, followed by radian scaling in a stacking operation to avoid relying on parameters that could be influenced by the seasonal concept drift.
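The idea of adapting min-max normalization to the current season can be sketched as follows; the rolling-window refit and the 48-sample window are simplifying assumptions for illustration, not the paper's exact procedure:

```python
import numpy as np

def seasonal_minmax(series, window):
    """Rescale each point using the min/max of the most recent `window`
    observations, so the normalization tracks the current season instead
    of a range fixed at training time."""
    out = np.empty_like(series, dtype=float)
    for t in range(len(series)):
        seg = series[max(0, t - window + 1):t + 1]
        lo, hi = seg.min(), seg.max()
        out[t] = 0.0 if hi == lo else (series[t] - lo) / (hi - lo)
    return out

# Summer loads sit well above winter loads; a fixed min-max fitted on winter
# data would saturate above 1 in summer, while the rolling fit stays in [0, 1].
winter = 4000 + 500 * np.sin(np.linspace(0, 4 * np.pi, 48))
summer = 7000 + 900 * np.sin(np.linspace(0, 4 * np.pi, 48))
scaled = seasonal_minmax(np.concatenate([winter, summer]), window=48)
print(scaled.min(), scaled.max())
```

The design point is that the scaling parameters are re-derived from recent data in the cascade, so no model retraining is needed when the seasonal level shifts.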
Comments 6: Existing solutions often cannot effectively deal with missing values, adversarial attacks, and concept drift at the same time. The paper points out that in this case, the prediction accuracy will drop significantly.
Response 6: This is true. To resolve the error accumulation of the trivial method, this paper proposes Stackade Ensemble Learning, which combines the strength of stacking ensemble learning on top of cascading ensemble learning.
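The shape of the combination can be sketched as below: a cascading operation corrects the input step by step (MV, then AA, then CD), and a stacking operation learns to combine base forecasts on top of it. The correction steps, the two base forecasters, and the least-squares meta-model are simplified stand-ins, not the paper's actual models:

```python
import numpy as np

rng = np.random.default_rng(1)

# --- cascading operation: correct the input step by step (MV -> AA -> CD) ---
def impute(x):                       # stand-in for missing-value imputation
    x = x.copy()
    nan = np.isnan(x)
    x[nan] = np.interp(np.flatnonzero(nan), np.flatnonzero(~nan), x[~nan])
    return x

def denoise(x):                      # stand-in for adversarial correction
    return np.convolve(x, np.ones(3) / 3, mode="same")

def rescale(x):                      # stand-in for concept-drift rescaling
    return (x - x.min()) / (x.max() - x.min() + 1e-9)

# --- stacking operation: a meta-model combines base forecasts -----------
y = np.sin(np.linspace(0, 8 * np.pi, 200)) + 0.05 * rng.normal(size=200)
x = y.copy()
x[rng.random(200) < 0.1] = np.nan                          # inject MV
clean = rescale(denoise(impute(x)))                        # cascade

base1 = np.roll(clean, 1)                                  # persistence forecast
base2 = np.convolve(clean, np.ones(5) / 5, mode="same")    # moving-average forecast
X = np.column_stack([base1, base2, np.ones_like(clean)])
w, *_ = np.linalg.lstsq(X, clean, rcond=None)              # meta-model weights
meta = X @ w
print(f"meta MAE: {np.abs(meta - clean).mean():.4f}")
```

Because the meta-model sees the outputs of the base forecasters rather than trusting any single corrected signal, it can compensate for residual errors left by earlier cascade stages, which is the mechanism the paper relies on to curb error accumulation.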
Comments 7: Although federated learning is proposed as a solution, it focuses on protection during the training phase rather than real-time anomaly handling during the deployment phase. In addition, excluding attacked local models or data may not be feasible in practical applications.
Response 7: Though the previously proposed federated learning followed by excluding local models with low credibility works well in their setup for forecasting household load, the implementation does not translate well to zonal smart grid load, because the data heterogeneity requires the global model to be retrained on the local data. Although the previous research matches all the keywords, it focuses only on the training phase, while this paper focuses on the deployment phase.
Comments 8: More relevant literature is needed, and it is recommended to add: Optimizing Insulator Defect Detection with Improved DETR Models, https://doi.org/10.3390/math12101507.
Response 8: Added the reference in Section 2.2.1, line 218, to show that machine learning-based adversarial detection follows the same concept as hardware anomaly detection.
Reviewer 2 Report
Comments and Suggestions for Authors
Please refer to the attachment.
Comments for author File: Comments.pdf
Author Response
Comments 1: In section 2.1, the authors included a sentence that lacks significance. I recommend deleting it.
Response 1: Thank you for the suggestion. We have removed Section 2.1.
Comments 2: Your review draws exclusively on Google Scholar results from 1 January 2021–20 June 2025. This 4.5-year window is too brief to substantiate claims about broader research trends, as it omits foundational work that shapes current directions. Please widen the search period (ideally, ≥10 years or to the field’s onset), include additional databases, and document the search strategy, screening criteria, and update date. Presenting temporal sub-analyses (pre-2021 vs. recent) would allow stronger, evidence-based inferences about the evolution of the literature.
Response 2: Thank you for the suggestion. We have expanded the search period and incorporated ScienceDirect, our institution's subscribed database, into the search. Bibliometric analysis was performed to investigate patterns and trends in enhancing forecasting accuracy in smart grids, specifically against missing values, adversarial attacks, and concept drift. From 1 January 2015 to 31 July 2025, the yearly top 50 most relevant references were gathered from Google Scholar and ScienceDirect, and VOSviewer was used to analyze the patterns and trends. Detailed information on the analysis is in Section 1, lines 70-83.
Comments 3: The manuscript’s literature review is framed as a 4.5-year survey (January 1, 2021, to June 20, 2025; see Fig. 3) to characterize research activity related to forecasting under missing values, adversarial attacks, and concept drift in smart grids. In contrast, the empirical illustrations in Figure 5 (which uses 50% synthetic missing values) and Figure 6 (which applies PGD adversarial perturbation) only depict a single 24-hour segment of load data from New York City (from January 1, 2023, 00:00 to January 2, 2023, 00:00). These panels seem to be short, didactic examples rather than analyses that represent longer operational periods, seasonal variations, or the range of anomaly severities discussed elsewhere in the manuscript. Please provide reasonable explanations for this discrepancy.
Response 3: In this paper, subsets of the data were visualized in 24-hour intervals, while the accuracy was measured over 3-month intervals. As it is difficult to visualize how the missing values and adversarial attacks look over a 3-month interval, we opted to plot the first 24-hour interval instead. An explanation of this discrepancy was added in Section 2.1.1, lines 138-139.
Comments 4: Table 1 presents a comparison of previous studies. However, why were only seven articles chosen? In the Introduction section, the authors mentioned that nine papers matched the keywords “Forecast,” “Smart Grid,” “Missing Values,” “Adversarial Attacks,” and “Concept Drift.” Why did the authors not use all nine articles for comparison? Additionally, are Zhou et al. [31] and Zhou et al. [35] the same article?
Response 4: Although these articles matched the keywords, the content of the articles themselves does not match the context of solving missing values, adversarial attacks, and concept drift (i.e., surveys, reviews, and irrelevant papers). This leaves us with only one paper to compare. This explanation was added in Section 1, lines 84-86. To differentiate between the two authors who worked on two different articles, we have added their first names to the in-text citations.
Comments 5: Equations (6)~(9), the three equations are supposed to represent a sequential data-cleaning (cascading) pipeline that first imputes Missing Values (M), then removes Adversarial perturbations (A), and finally rescales/adapts for Concept Drift (D) across the three correlated zone inputs (NYC, DUNWOD, WEST). The current notation in the manuscript is confusing (and the strike-through formatting you see in the image is not standard); I recommend revising the equations so that the superscripts change after each correction step. For clarity, please consider adopting the following notation, which should be more intuitive for readers.
Response 5: Thank you for your suggestion. We have corrected the notations to follow the suggestion.
Comments 6: As for Equations (9)~(13), please also reconsider them carefully.
Response 6: We have changed the notations in Equations (9)~(13) to match.
Comments 7: In Figure 11, the word "Cascade" appears behind the number 2, and the word "Meta" is not fully visible behind the number 4. Please readjust the picture.
Response 7: We have readjusted the picture to show the word “Cascade.”
Comments 8: I could not find the corresponding explanation for Figure 25. Please include it.
Response 8: We have added the explanation for the surrogate model in Section 2.1.2, rows 161-164, and Section 5.3, rows 541-543.
Comments 9: In rows 498 to 500, the authors mention Figure 9, which indicates that an MV of percentage=0.2 and an AA of ϵ=0.05 were simulated. However, upon reviewing Figure 9, it is difficult to discern its implications or explanations.
Response 9: We have expanded Figure 12 into a stacked line chart, with different coloring to describe how adversarial attacks and missing values impact the original drifted data.
Comments 10: In rows 506~530, “which decreases the MAE score by 303.1366 MW, or 66.9896% of reductions via cascading operation. StEL takes it even further by reducing the MAE score on the trivial solution by 10.7612 MW, or 7.2041% of reductions via stacking operation. By comparing the federated solution against StEL, the MAE score was reduced by 313.8978 MW, or 69.3677% of reduction.” How can they obtain the values? Which tables or figures can provide this evidence?
Response 10: The results were obtained by calculating the score differences. To make this more intuitive, we have improved the results by providing the averaged forecasting accuracy, and added a table showing the averaged accuracy differences against Stackade Ensemble Learning in Section 5.5, Table 21.
Comments 11: In conclusion, the section is too brief. It lacks insightful perspectives, meaningful implications of results, and explanations of its significant contributions. As a result, this oversight diminishes the overall contribution of the research.
Response 11: We have reworked the conclusion to discuss the current research trends, the limitations of the trivial and federated solutions, the strengths of the proposed Stackade Ensemble Learning, and its limitations, which we aim to address in future work.
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
Power load forecasting in smart grids faces multiple data anomaly issues, including missing values (MV), adversarial attacks (AA), and concept drift (CD). These issues can lead to reduced forecast accuracy, impacting grid stability and energy management efficiency. Existing solutions typically address only a single issue, neglecting the compounding problem of multiple anomalies occurring simultaneously, leading to error accumulation and reduced forecast performance.
This paper proposes a novel Stackade Ensemble Learning (StEL) framework that combines the advantages of cascading ensemble learning (CEL) and stacking ensemble learning (SEL). StEL sequentially corrects the MV, AA, and CD in the input data through a cascading operation and aggregates multiple forecast results through a stacking operation to reduce error accumulation.
Although this paper is innovative, some issues remain that require improvement:
1. Traditional missing value imputation has limitations. Single-variable imputation methods cannot handle high proportions of missing data, resulting in errors that grow linearly with the missingness rate.
2. Passive adversarial attack defenses are ineffective. Existing detection-based defenses cannot repair perturbed data in real time and can only reduce attack intensity. Even at ε = 0.02, the MAE still increases by 69.56%.
3. Sliding window or feature scaling updates require waiting for new data to accumulate, making them unable to immediately respond to sudden distribution changes during extreme weather events.
4. Cascading operations can lead to error accumulation. The sequential processing of MV → AA → CD causes errors in the previous modules, ultimately worsening the MAE by 30.37%.
5. Federated learning data is heterogeneous, and local models cannot share features due to privacy constraints, resulting in a 61.52% increase in the RMSE of the global model.
6. Single-problem solutions are siloed. For example, Hou's hybrid interpolation method only addresses MV and cannot collaboratively handle the combined interference of AA or CD.
7. Computational efficiency is limited. Stackade's metamodel training requires multiple iterations, which poses latency challenges in real-time prediction scenarios.
8. References: A Lightweight Double-Deep Q-Network for Energy Efficiency Optimization of Industrial IoT Devices in Thermal Power Plants, https://doi.org/10.3390/electronics14132569
Author Response
Comments 1: Traditional missing value imputation has limitations. Single-variable imputation methods cannot handle high proportions of missing data, resulting in errors that grow linearly with the missingness rate.
Response 1: Thank you for your comments. It is true that a univariate imputation method cannot handle a high percentage of missing values, necessitating additional data to act as a reference for imputing the missing values.
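Reference-based imputation of this kind can be sketched as below: gaps in one zone's load are filled from a correlated, fully observed zone via a simple regression fit. The zone names follow the paper, but the series, the 50% missingness rate, and the linear relationship are synthetic assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

# Two correlated zone loads (MW); names follow the paper's zones, data is synthetic.
west = 3000 + 800 * np.sin(np.linspace(0, 6 * np.pi, 300)) + 30 * rng.normal(size=300)
nyc = 1.6 * west + 500 + 40 * rng.normal(size=300)

# Heavy (50%) missingness in NYC -- too much for univariate interpolation.
mask = rng.random(300) < 0.5
obs = nyc.copy()
obs[mask] = np.nan

# Reference-based imputation: fit NYC ~ WEST on the observed points, then
# predict the gaps from the fully observed reference zone.
A = np.column_stack([west[~mask], np.ones((~mask).sum())])
coef, *_ = np.linalg.lstsq(A, nyc[~mask], rcond=None)
filled = obs.copy()
filled[mask] = west[mask] * coef[0] + coef[1]

mae = np.abs(filled[mask] - nyc[mask]).mean()
print(f"imputation MAE on missing points: {mae:.1f} MW")
```

Because the reference zone supplies independent information about every missing timestamp, the imputation error no longer needs to grow with the missingness rate the way gap-bridging univariate interpolation does.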
Comments 2: Passive adversarial attack defenses are ineffective. Existing detection-based defenses cannot repair perturbed data in real time and can only reduce attack intensity. Even at ε = 0.02, the MAE still increases by 69.56%.
Response 2: The federated learning implementation proposed by Yang Zhou et al. [38] handles data and local models compromised by adversarial attacks by lowering their contributions. Though this method works well during the training phase, it cannot handle live adversarial attacks.
Comments 3: Sliding window or feature scaling updates require waiting for new data to accumulate, making them unable to immediately respond to sudden distribution changes during extreme weather events.
Response 3: This is correct. Although increasing the retraining frequency could potentially solve the problem, a machine learning-based forecasting model could experience catastrophic forgetting, where the previously learned data are forgotten. To avoid this issue, the experimental setup instead retrains the trivial, Stackade, and federated solutions every 3 months. While the trivial and Stackade solutions rely on data preprocessing to rescale the inputs or harden the model against seasonal concept drift, federated learning utilizes the past 2 hours of historical data to help the forecasting model capture the seasonal trends instead.
Comments 4: Cascading operations can lead to error accumulation. The sequential processing of MV → AA → CD causes errors in the previous modules, ultimately worsening the MAE by 30.37%.
Response 4: This is true. To avoid errors from accumulating, it is necessary to deploy a method that can examine the pre-correction and post-correction data and adjust the forecast accordingly.
Comments 5: Federated learning data is heterogeneous, and local models cannot share features due to privacy constraints, resulting in a 61.52% increase in the RMSE of the global model.
Response 5: Though the local models cannot share their features, the global model obtained from averaging the base model weights can be fine-tuned on the local data to improve accuracy. However, there is a limit to how much a forecasting model can improve by relying solely on a single data source, which is why multivariate or multivariable forecasting models often perform better.
Comment 6: Single-problem solutions are siloed. For example, Hou's hybrid interpolation method only addresses MV and cannot collaboratively handle the combined interference of AA or CD.
Response 6: Though it is true that single-purpose solutions can only solve one problem, as long as the solutions are arranged in a missing values imputation → adversarial attacks correction → concept drift rescaling → forecast configuration, which is how the trivial solution is implemented, they can collaboratively handle the combined interference. However, errors can accumulate, which lowers the forecasting accuracy.
Comments 7: Computational efficiency is limited. Stackade's metamodel training requires multiple iterations, which poses latency challenges in real-time prediction scenarios.
Response 7: Stackade Ensemble Learning does not have latency problems due to the centralized nature of its implementation. The main challenge, which is discussed in Section 6, is that it requires a vast amount of data gathered from the base models to learn how to adjust the forecast, which in turn lengthens the training time.
Comments 8: References: A Lightweight Double-Deep Q-Network for Energy Efficiency Optimization of Industrial IoT Devices in Thermal Power Plants, https://doi.org/10.3390/electronics14132569
Response 8: Added on page 29, lines 629-630.
Reviewer 2 Report
Comments and Suggestions for Authors
Their efforts to revise this paper in response to the reviewers' comments have improved its quality. The scholarly community will benefit from the perceptive viewpoints and implications.
Author Response
Comments 1: Their efforts to revise this paper in response to the reviewers' comments have improved its quality. The scholarly community will benefit from the perceptive viewpoints and implications.
Response 1: Thank you for your comments. We appreciate your previous suggestion about conducting bibliographic analysis to uncover research patterns and trends, which led us to learn a better way to understand how past research evolved.