Federated Deep Learning Model for False Data Injection Attack Detection in Cyber Physical Power Systems
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
This submission presents an application of federated deep learning to enhance the detection of FDIAs in cyber-physical power systems. The authors implemented various deep learning models, including GRU, LSTM, and Transformer-based architectures, to evaluate their performance in a federated learning framework. The objective was to improve data privacy and security by training models across decentralized clients without sharing raw data.
While the research has potential and addresses a currently active research topic, I have several comments that I would like the authors to address:
1. The abstract needs to be rewritten to highlight the potential of the study. Avoid personal pronouns (we, our, etc.), as their use is not considered academic style.
2. The article needs a better structure than its current shape. For example, do not include the mathematical description in the introduction section! This should be a section/subsection of its own to maintain a proper flow of information.
3. Following point (2), the authors need to expand the discussion in the introduction, especially since the area of cyberattacks on cyber-physical power systems is relatively new. I therefore recommend that the authors expand the introduction with recent publications highlighting this topic and why it needs to be studied further.
4. Following points (2) and (3), I also recommend that the authors incorporate some discussion of the recently published surveys that have explored this topic. A recommended (not imposed) list could be:
- T. Aljohani and A. Almutairi, A comprehensive survey of cyberattacks on EVs: Research domains, attacks, defensive mechanisms, and verification methods, Defence Technology, https://doi.org/10.1016/j.dt.2024.06.009
- Pinto, S. J., Siano, P., & Parente, M. (2023). Review of cybersecurity analysis in smart distribution systems and future directions for using unsupervised learning methods for cyber detection. Energies, 16(4), 1651.
- Asefi, S., Mitrovic, M., Ćetenović, D., Levi, V., Gryazina, E., & Terzija, V. (2023). Anomaly detection and classification in power system state estimation: Combining model-based and data-driven methods. Sustainable Energy, Grids and Networks, 35, 101116.
and so on. I recommend incorporating 8-12 references, drawn from recent studies and surveys, to enrich your introduction.
5. While the paper discusses the evaluation of model performance, it does not provide the specific equations for the metrics used (e.g., accuracy, precision, recall). The absence of these definitions makes it difficult to understand how the authors quantified the effectiveness of their models.
6. The paper mentions the use of LSTM, GRU, and Simple RNN models for baseline performance metrics. However, it lacks a mathematical justification as to why these specific models were chosen over others.
A more rigorous approach would involve discussing the theoretical underpinnings of these models, including their respective equations, and how they relate to the problem of FDIAs. For instance, the mathematical formulation of the LSTM cell, including the forget, input, and output gates, should be explicitly stated to clarify how these contribute to capturing temporal dependencies in the data.
This must be considered to improve the quality of the submission!
7. The paper does not specify the loss functions used for training the models. The choice of loss function is critical in machine learning, as it directly influences the optimization process.
8. Also, the paper mentions the use of various deep learning algorithms but does not provide a mathematical framework for hyper-parameter tuning.
9. The paper mentions that centralized models like the ExtraTrees Classifier achieved an accuracy of 0.94, which dropped to 0.85 under attack conditions (T5). While this demonstrates the vulnerability of centralized models to FDIAs, the authors should provide a more detailed analysis of the performance metrics across different models and conditions utilized in the work.
10. Figure 7 presents ROC curves for ML models (T5). While ROC curves are a valuable tool for visualizing model performance, the paper should also provide a detailed explanation of how to interpret these curves. Specifically, it should discuss the area under the curve (AUC) values for each model, as these values quantitatively summarize the model's ability to distinguish between classes.
Comments on the Quality of English Language
The paper needs further improvement in its writing style.
Author Response
Comment 1: The abstract needs to be rewritten to highlight the potential of the study. Avoid personal pronouns (we, our, etc.), as their use is not considered academic style.
Response 1: Thank you for your valuable feedback. The abstract has been revised to clearly highlight the potential and contributions of the study.
Comment 2: The article needs a better structure than its current shape. For example, do not include the mathematical description in the introduction section! This should be a section/subsection of its own to maintain a proper flow of information.
Response 2: Thank you for your feedback. We have revised the introduction to enhance its readability and the overall flow of information. In place of the mathematical expressions, we now provide a narrative description of the state estimation process and the attack mechanism. We feel this makes the introduction far more accessible while still conveying the concepts that underpin our work.
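For the reader's convenience, the state estimation process and attack mechanism now described narratively in the introduction follow the standard formulation from the FDIA literature; a compact summary in generic notation (not necessarily the manuscript's own) is:

```latex
% DC state estimation and the stealthy FDIA condition, as commonly stated in
% the literature; the notation here is generic, not the manuscript's own.
\begin{align}
  z &= Hx + e
    && \text{measurements } z,\ \text{Jacobian } H,\ \text{states } x,\ \text{noise } e \\
  \hat{x} &= (H^{\top} W H)^{-1} H^{\top} W z
    && \text{weighted least-squares state estimate} \\
  r &= \lVert z - H\hat{x} \rVert_2
    && \text{residual used by bad-data detection} \\
  a &= Hc \ \Rightarrow\ r(z + a) = r(z)
    && \text{an injection of this form leaves the residual unchanged}
\end{align}
```

The last line is what makes such attacks hard to catch with residual tests alone, and it is the gap that learned detectors of this kind target.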
Comment 3: Following point (2), the authors need to expand the discussion in the introduction, especially since the area of cyberattacks on cyber-physical power systems is relatively new. I therefore recommend that the authors expand the introduction with recent publications highlighting this topic and why it needs to be studied further.
Response 3: Thank you for your insightful comment. The introduction has been expanded to provide a more comprehensive discussion on the topic of cyberattacks in Cyber-Physical Power Systems (CPPS). Recent publications have been incorporated to emphasize the growing significance of this area, particularly in light of the increasing frequency and sophistication of cyberattacks targeting critical infrastructure. These additions highlight the novelty and urgency of addressing False Data Injection Attacks (FDIAs) in CPPS, demonstrating the need for continued research and advanced security solutions. This expanded discussion further contextualizes the importance of our study and aligns with the evolving landscape of CPPS security challenges.
Comment 4:
Following points (2) and (3), I also recommend that the authors incorporate some discussion of the recently published surveys that have explored this topic. A recommended (not imposed) list could be:
- T. Aljohani and A. Almutairi, A comprehensive survey of cyberattacks on EVs: Research domains, attacks, defensive mechanisms, and verification methods, Defence Technology, https://doi.org/10.1016/j.dt.2024.06.009
- Pinto, S. J., Siano, P., & Parente, M. (2023). Review of cybersecurity analysis in smart distribution systems and future directions for using unsupervised learning methods for cyber detection. Energies, 16(4), 1651.
- Asefi, S., Mitrovic, M., Ćetenović, D., Levi, V., Gryazina, E., & Terzija, V. (2023). Anomaly detection and classification in power system state estimation: Combining model-based and data-driven methods. Sustainable Energy, Grids and Networks, 35, 101116.
and so on. I recommend incorporating 8-12 references, drawn from recent studies and surveys, to enrich your introduction.
Response 4: Thank you for your valuable suggestion. We have incorporated 12 new references, including recent surveys and studies, into the introduction section to enrich the discussion on the topic of cyberattacks in Cyber-Physical Power Systems (CPPS).
Comment 5: While the paper discusses the evaluation of model performance, it does not provide the specific equations for the metrics used (e.g., accuracy, precision, recall). The absence of these definitions makes it difficult to understand how the authors quantified the effectiveness of their models.
Response 5: Thank you for your valuable feedback. To address your concern, we have now included the definitions of the evaluation metrics used (accuracy, precision, recall, and F1-score) in the new subsection 4.6, Evaluation Metrics.
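For reference, the standard definitions of these metrics in terms of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN) are:

```latex
% Classification metrics in terms of the confusion-matrix counts.
\begin{align}
  \text{Accuracy}  &= \frac{TP + TN}{TP + TN + FP + FN} \\
  \text{Precision} &= \frac{TP}{TP + FP} \\
  \text{Recall}    &= \frac{TP}{TP + FN} \\
  \text{F1-score}  &= 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}
\end{align}
```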
Comment 6: The paper mentions the use of LSTM, GRU, and Simple RNN models for baseline performance metrics. However, it lacks a mathematical justification as to why these specific models were chosen over others. A more rigorous approach would involve discussing the theoretical underpinnings of these models, including their respective equations, and how they relate to the problem of FDIAs. For instance, the mathematical formulation of the LSTM cell, including the forget, input, and output gates, should be explicitly stated to clarify how these contribute to capturing temporal dependencies in the data. This must be considered to improve the quality of the submission!
Response 6: Thank you for your valuable feedback. In response to your comment regarding the need for a more rigorous mathematical justification of the models used, we have added a detailed subsection 4.1, Model Selection Rationale, to the manuscript. It includes the relevant mathematical equations, with explanations, for the RNN, LSTM, GRU, and GDNN models, and discusses the relevance of each model for FDIA detection.
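For reference, the standard LSTM cell equations of the kind such a subsection presents are given below, where σ is the logistic sigmoid, ⊙ the element-wise product, and [h_{t-1}, x_t] the concatenation of the previous hidden state and the current input:

```latex
% Standard LSTM cell: the forget, input, and output gates control what
% temporal context is kept, added, and exposed at each time step.
\begin{align}
  f_t &= \sigma(W_f [h_{t-1}, x_t] + b_f)          && \text{forget gate} \\
  i_t &= \sigma(W_i [h_{t-1}, x_t] + b_i)          && \text{input gate} \\
  \tilde{c}_t &= \tanh(W_c [h_{t-1}, x_t] + b_c)   && \text{candidate cell state} \\
  c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{cell state update} \\
  o_t &= \sigma(W_o [h_{t-1}, x_t] + b_o)          && \text{output gate} \\
  h_t &= o_t \odot \tanh(c_t)                      && \text{hidden state}
\end{align}
```

The additive cell-state update is what lets gradients flow across long measurement sequences, which is the property that makes such models suited to temporal FDIA patterns.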
Comment 7: The paper does not specify the loss functions used for training the models. The choice of loss function is critical in machine learning, as it directly influences the optimization process.
Response 7: Thank you for your insightful feedback. We agree that specifying the loss function is crucial to understanding the optimization process. In response, we have addressed this point by adding details to the Experimental Setup section, clarifying the loss function used for training the models. The chosen loss function aligns with the binary classification nature of the task and was selected to ensure the models could generalize effectively across different data distributions in both centralized and federated learning frameworks. This addition enhances the clarity of the optimization strategy used in the study.
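Given the binary classification framing stated in the response, the natural candidate is binary cross-entropy; the formula below is offered as a reference for that standard loss, with the manuscript's Experimental Setup remaining the authoritative statement of what was actually used:

```latex
% Binary cross-entropy over N samples, with labels y_i in {0, 1}
% (attack / normal) and predicted attack probabilities \hat{y}_i.
\mathcal{L}_{\mathrm{BCE}} = -\frac{1}{N} \sum_{i=1}^{N}
  \left[ y_i \log \hat{y}_i + (1 - y_i) \log\left(1 - \hat{y}_i\right) \right]
```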
Comment 8: Also, the paper mentions the use of various deep learning algorithms but does not provide a mathematical framework for hyper-parameter tuning.
Response 8: We appreciate your thoughtful suggestions. We agree that details about the process of hyper-parameter tuning are necessary to ensure that our work can be reproduced and to clarify how we optimized our models for performance. We have added the details of hyper-parameter tuning by updating the Experimental Setup subsection.
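As a reference for readers, a minimal grid-search sketch in the spirit of such tuning is shown below. The search space, window length, and model shape are illustrative assumptions, not the manuscript's actual configuration, and the data is synthetic:

```python
# Minimal hyper-parameter grid-search sketch with a Keras LSTM classifier.
# All grid values, shapes, and the synthetic data are illustrative only.
from itertools import product

import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)
X_train = rng.normal(size=(256, 10, 129)).astype("float32")  # (samples, timesteps, features)
y_train = rng.integers(0, 2, size=256).astype("float32")
X_val = rng.normal(size=(64, 10, 129)).astype("float32")
y_val = rng.integers(0, 2, size=64).astype("float32")

def build_model(units: int, lr: float) -> tf.keras.Model:
    """Build a small LSTM binary classifier with the given hyper-parameters."""
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(10, 129)),
        tf.keras.layers.LSTM(units),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=lr),
                  loss="binary_crossentropy", metrics=["accuracy"])
    return model

best_params, best_acc = None, -1.0
for units, lr, batch in product([32, 64], [1e-3, 1e-4], [32, 64]):
    hist = build_model(units, lr).fit(
        X_train, y_train, validation_data=(X_val, y_val),
        epochs=5, batch_size=batch, verbose=0)
    acc = max(hist.history["val_accuracy"])  # best validation accuracy seen
    if acc > best_acc:
        best_params, best_acc = (units, lr, batch), acc
print("best (units, lr, batch):", best_params, "val_acc:", round(best_acc, 3))
```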
Comment 9: The paper mentions that centralized models like the ExtraTrees Classifier achieved an accuracy of 0.94, which dropped to 0.85 under attack conditions (T5). While this demonstrates the vulnerability of centralized models to FDIAs, the authors should provide a more detailed analysis of the performance metrics across different models and conditions utilized in the work.
Response 9: Thank you for your valuable feedback on the section discussing the vulnerability of the ExtraTrees Classifier to False Data Injection Attacks (FDIAs). We acknowledge your concern regarding the accuracy drop mentioned for the ExtraTrees Classifier and have carefully reviewed our results in Table 4, which indeed shows that the accuracy and F1 score for the ExtraTrees Classifier are 0.94 under normal conditions, with no explicit indication of a drop to 0.85 under attack scenarios. Based on this insight, we have revised the section to accurately reflect the performance of the ExtraTrees Classifier as observed in our experiments. We clarified that, while the classifier demonstrated strong performance metrics under standard conditions, there remains a theoretical susceptibility to adversarial attacks. This vulnerability arises not due to a significant drop in accuracy directly observed in our tests, but rather because of the centralized nature of traditional machine learning models, which makes them more prone to targeted data manipulation.
Comment 10: Figure 7 presents ROC curves for ML models (T5). While ROC curves are a valuable tool for visualizing model performance, the paper should also provide a detailed explanation of how to interpret these curves. Specifically, it should discuss the area under the curve (AUC) values for each model, as these values quantitatively summarize the model's ability to distinguish between classes.
Response 10: Thank you for your valuable feedback. We agree that adding a detailed explanation of ROC curves and AUC values will improve the clarity of our results. In response, we have expanded the discussion of Figure 7 to explain how ROC curves are interpreted and the significance of AUC values for comparing model performance.
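For readers, the AUC of a model can be read as the probability that a randomly chosen attack sample is scored higher than a randomly chosen normal sample (0.5 is chance level, 1.0 is perfect separation). A minimal sketch with synthetic scores, not the paper's actual model outputs, shows how the quantities behind such a figure are computed:

```python
# ROC curve points and AUC from detector scores; the scores below are
# synthetic stand-ins used only to illustrate the computation.
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=1000)             # 1 = attack, 0 = normal
y_score = rng.normal(loc=0.8 * y_true, scale=1.0)  # attacks score higher on average

fpr, tpr, thresholds = roc_curve(y_true, y_score)  # one (FPR, TPR) point per threshold
auc = roc_auc_score(y_true, y_score)
print(f"AUC = {auc:.3f}")
```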
Reviewer 2 Report
Comments and Suggestions for Authors
The authors propose a federated deep learning-based architecture to detect false data injection attacks (FDIAs) in cyber-physical power systems.
Questions:
1. Section 2 Related Works needs to be improved. There is a body of recent work that seeks to detect cyberattacks on power systems, such as false data injection attacks, denial-of-service attacks, and deception attacks. Review more papers and present the main weaknesses of existing methods and what the authors aim to improve or solve.
2. The figures are unclear and too small in the article. In particular, I could not read the results; in some graphs it is not possible to distinguish the signals.
3. The false data injection attack model is clear, but it is not clear how the machine learning technique is able to deal with the attacks, as nothing attack-specific is described. The technique does not include any training mechanism that designs the model to deal directly with the attack.
4. The result in Table 4 is not coherent. Is the model output a class (TP, TN, FP, FN) or a real number? Because this table provides class evaluation indices (Accuracy, Precision, Recall, F1-Score) and continuous result evaluation indices (RMSE). Does the output present two forms of results?
5. The authors do not provide information about the test system and the database. How was the database built? How big is this database? How many entries are there for the machine learning techniques? How is the database divided into training and testing stages?
6. The authors do not provide information about the degree of false data injection attacks in the database. How much of the total data is corrupted by false data injection attacks? It would be interesting to present results from the database without attacks and with different levels of false data injection attacks. This would allow us to assess the extent to which the proposed method is effective.
7. The conclusions could be improved. Based on the results achieved, what are the main benefits of the proposed method? Is it possible to extend the proposed method to deal with other types of cyberattacks, such as denial-of-service attacks?
Author Response
Comment-1: "Section 2 Related Works needs to be improved. There is a body of recent work that seeks to detect cyberattacks on power systems, such as false data injection attacks, denial-of-service attacks, and deception attacks. Review more papers and present the main weaknesses of existing methods and what the authors aim to improve or solve."
Response: Thank you for your suggestion. We have expanded Section 2 (Related Works) to include more recent studies on cyberattack detection in power systems, specifically focusing on false data injection attacks (FDIAs), denial-of-service attacks, and deception attacks. We have reviewed additional papers to highlight the main weaknesses of existing detection methods and the specific improvements our proposed approach aims to address.
Comment-2: "The figures are unclear and too small in the article. In particular, I could not read the results; in some graphs it is not possible to distinguish the signals."
Response: We have revised the figures in the manuscript by increasing their size and resolution to enhance clarity.
Comment-3: "The false data injection attack model is clear, but it is not clear how the machine learning technique is able to deal with the attacks, as nothing attack-specific is described. The technique does not include any training mechanism that designs the model to deal directly with the attack."
Response: We have updated the manuscript with a new subsection 4.3, Model Training Strategies for Detecting False Data Injection Attacks, to provide a more detailed explanation of how the machine learning techniques are specifically trained to detect and mitigate FDIAs. We clarified the data representation and feature selection, and explained how the models focus on identifying the anomalies introduced by FDIAs.
Comment-4: "The result in Table 4 is not coherent. Is the model output a class (TP, TN, FP, FN) or a real number? Because this table provides class evaluation indices (Accuracy, Precision, Recall, F1-Score) and continuous result evaluation indices (RMSE). Does the output present two forms of results?"
Response: We have clarified the model's output type in the manuscript, explaining that it can produce both class-based and continuous values, depending on the use case. We restructured Table 4 to present only the classification metrics (Accuracy, Precision, Recall, F1-Score) and removed the regression-based metric (RMSE).
Comment-5: "The authors do not provide information about the test system and the database. How was the database built? How big is this database? How many entries are there for the machine learning techniques? How is the database divided into training and testing stages?"
Response: Thank you for your observation. We have added detailed information regarding the test system and the database in the revised manuscript, with dedicated subsections on data collection, data features, data preprocessing, and the data splitting strategy. The dataset used in this study is provided by Mississippi State University and Oak Ridge National Laboratory and consists of multiple interconnected stations equipped with Phasor Measurement Units (PMUs) and Intelligent Electronic Devices (IEDs). The database includes data from 37 different power system event scenarios, comprising both natural and attack events, and contains 78,377 rows and 129 feature columns after preprocessing. We have added subsection 3.4.1, Data Splitting Strategy, with the details of the data split for the centralized and federated learning models.
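As an illustration of this kind of split, the sketch below holds out a test set and partitions the remaining rows evenly (IID) across federated clients; the 80/20 ratio and the five clients are assumptions for illustration, not the manuscript's stated configuration:

```python
# Hold out a test set, then shard the training rows across federated clients.
# The ratio and client count are illustrative assumptions.
import numpy as np

n_rows, n_clients, test_frac = 78_377, 5, 0.2
rng = np.random.default_rng(42)
idx = rng.permutation(n_rows)

n_test = int(n_rows * test_frac)
test_idx, train_idx = idx[:n_test], idx[n_test:]

# The centralized baseline trains on train_idx directly; each federated
# client receives one shard and shares only model updates, never raw rows.
client_shards = np.array_split(train_idx, n_clients)
print([len(s) for s in client_shards], len(test_idx))
```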
Comment-6: "The authors do not provide information about the degree of false data injection attacks in the database. How much of the total data is corrupted by false data injection attacks? It would be interesting to present results from the database without attacks and with different levels of false data injection attacks. This would allow us to assess the extent to which the proposed method is effective."
Response: We appreciate your suggestion and have now included information regarding the degree of false data injection attacks in the dataset. Approximately 70% of the data represents attack events (55,663 instances), while 30% represents natural events (22,714 instances). We also categorized the attacks into different levels of severity (low, moderate, and high) to better understand the effectiveness of the proposed detection methods. We conducted experiments on datasets with varying levels of attack severity to demonstrate the robustness of our models.
Comment-7: "The conclusions could be improved. Based on the results achieved, what are the main benefits of the proposed method? Is it possible to extend the proposed method to deal with other types of cyberattacks, such as denial-of-service attacks?"
Response: We have revised the conclusion to highlight the main benefits of our proposed federated deep learning-based approach, emphasizing its ability to detect false data injection attacks (FDIAs) effectively while preserving data privacy. Additionally, we have discussed the potential for extending our method to handle other types of cyberattacks, such as denial-of-service (DoS) attacks. We noted that the adaptability of the federated learning framework makes it suitable for a broader range of cybersecurity challenges in critical infrastructure systems. The revised conclusion can be found in Section 6 on Page Z of the updated manuscript.
Reviewer 3 Report
Comments and Suggestions for Authors
1. Revise the introduction to indicate the significance and innovation of this study.
2. It is suggested to correct the formatting in the paper, such as using (a), (b) to mark the subgraphs in Figure 3, fixing the repeated (a) and (b) labels in Figures 4-5, changing "table 4" to "Table 4" in Line 487, fixing the broken cross-reference "Table??" in Line 558, revising "we" in Line 556, etc.
3. It is recommended to provide the formula for data standardization in Line 371.
4. There is a "Transformer" model in Table 6, but not in Figure 8. It is recommended to add the corresponding curve of the model in Figure 8.
Comments on the Quality of English Language
It is suggested to improve the English expression in the paper.
Author Response
Comment-1: "Revise the introduction to indicate the significance and innovation of this study."
Response: We have revised the introduction to highlight the significance and innovation of our study, particularly focusing on the novel application of federated learning for FDIA detection in cyber-physical power systems. We emphasized how our approach addresses existing gaps in cybersecurity and enhances data privacy while maintaining high detection performance.
Comment-2: "It is suggested to modify the format in the paper, such as using (a), (b) to mark subgraphs in Figure 3, modifying the repeated (a) and (b) in Figures 4-5, and correcting minor format errors."
Response: Thank you for the valuable suggestion. The format of Figure 3 has been updated with the appropriate subgraph labels (a), (b), as requested. Additionally, the repeated labels in Figures 4 and 5 have been corrected, and minor formatting issues have been addressed throughout the manuscript.
Comment-3: "It is recommended to provide the formula for data standardization in Line 371."
Response: We have included the formula for data standardization in the manuscript to provide clarity on the data preprocessing steps.
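For reference, the standard z-score formula, which is the usual choice for this kind of standardization, is:

```latex
% Per-feature z-score standardization: the mean \mu and standard deviation
% \sigma should be computed on the training split only, to avoid leakage.
z = \frac{x - \mu}{\sigma}
```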
Comment-4: "There is a 'Transformer' model in Table 6, but not in Figure 8. It is recommended to add the corresponding curve of the model in Figure 8."
Response: Thank you for pointing this out. The Transformer model has been incorporated into Figure 8, and the corresponding curve has been added to ensure consistency with Table 6.
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
The authors have responded to my inquiries in a comprehensive manner. I recommend that the publication be accepted, and I congratulate them on such good work.
Reviewer 2 Report
Comments and Suggestions for Authors
The authors propose a federated deep learning-based architecture to detect false data injection attacks (FDIAs) in cyber-physical power systems.
The article has been improved, the contribution is good, and all questions have been effectively answered.