Federated Learning for Cybersecurity: A Privacy-Preserving Approach
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
Federated learning (FL) enables privacy-preserving collaborative intelligence in IoT systems but faces challenges like inefficient communication and limited robustness. This manuscript proposes a modular FL framework combining privacy-preserving techniques, secure aggregation, and blockchain logging for intrusion detection and malware classification. Validated with real-world datasets (e.g., CICIDS2017, TON_IoT), it achieves high accuracy with reduced communication and privacy costs. The paper is suitable for publication in Applied Sciences after addressing the following comments:
- It is suggested that the authors complete all abbreviations, such as “TLS,” when they first appear in the text and provide relevant references to support their usage.
- The authors should explain why specific techniques like gradient clipping, Fisher-based parameter pruning, and others were chosen. Are these techniques particularly easy to integrate into the framework? A discussion of the advantages of these techniques would strengthen the paper.
- Instead of merely listing the techniques (e.g., gradient clipping, blockchain), Table 1 should be revised to emphasize the key challenges (e.g., privacy leakage, model poisoning, communication overhead) and how the proposed techniques address these issues.
- At present, quantum communication and quantum computing [Rep. Prog. Phys. 87, 127901 (2024); PRX Quantum 3, 020315 (2022)] can bring new opportunities for federated learning in terms of security and efficiency. In order to enable readers to understand the relevant background information, it is necessary for the author to add one or two sentences of introduction or discussion.
- Several figures (e.g., Figures 3, 5, 7) are blurry or lack sufficient resolution. It is suggested that the authors should update them with higher-quality versions and ensure labels, legends, and axis titles are legible.
- In Table 2, Figure 5, and Figure 7, the authors should specify which datasets and learning models were used. Additionally, baseline comparisons with existing FL methods (e.g., FedAvg, FedProx) should be included, along with key hyperparameters (e.g., learning rate, batch size, number of local epochs) to ensure that the experimental setup is reproducible.
Author Response
Response to Reviewer
We thank the reviewer for the valuable feedback and constructive suggestions. We have carefully revised the manuscript in accordance with the comments received. Below is a summary of the changes made, aligned with each point raised:
1. Clarification of Abbreviations
We have reviewed the entire manuscript and ensured that all abbreviations are fully spelled out at their first appearance. For example, “TLS” is now introduced as “Transport Layer Security (TLS)”. Relevant references have been added to support their use.
2. Justification of Techniques (e.g., gradient clipping, Fisher pruning)
We have added a new subsection titled “3.2 Justification of Selected Techniques”, which explains the rationale for using gradient clipping, Fisher-based parameter pruning, differential privacy, and other methods. These techniques were selected for their compatibility with resource-constrained IoT devices and their ability to address key FL challenges such as privacy leakage, overfitting, and communication overhead. Relevant citations were included to support the selection.
3. Improvement of Table 1
We have revised Table 1 to clearly highlight the main challenges in Federated Learning (e.g., privacy leakage, model poisoning, communication overhead) and directly map each proposed technique to the corresponding challenge. This restructuring enhances clarity and aligns with the reviewer's recommendation.
4. Addition of Quantum Communication Context
As suggested, we have added a short paragraph at the end of Section 1 discussing the implications of quantum communication and computing in the context of federated learning. We cited the recent studies:
- Rep. Prog. Phys. 87, 127901 (2024)
- PRX Quantum 3, 020315 (2022)
This addition provides context for future directions involving quantum-safe techniques.
5. Improved Figures (Figures 3, 5, and 7)
We have updated Figures 3, 5, and 7 to higher-resolution versions. We also improved the clarity of labels, legends, and axes to ensure better readability and presentation quality.
6. Experimental Details and Baseline Comparisons
In Section 4, we explicitly detailed the experimental setup, including:
- Datasets used: CICIDS2017 and TON_IoT
- Model: MLP architecture
- Hyperparameters: learning rate = 0.01, batch size = 64, local epochs = 10
Additionally, we introduced comparisons with two baseline FL methods (FedAvg and FedProx), and summarized the results in a new Table 3, improving the scientific depth and reproducibility of our work.
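For readers unfamiliar with the two baselines named above, the following is a minimal sketch, not the authors' implementation. It encodes the hyperparameters reported in the response and shows the standard FedAvg aggregation rule (a sample-count-weighted average of client weights) and a FedProx-style local step (a gradient step with an added proximal pull toward the global model). The names `TrainConfig`, `fedavg`, and `fedprox_step`, and the proximal coefficient `mu = 0.1`, are illustrative assumptions that do not appear in the manuscript or the response.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    # Values reported in the author response; the class itself is hypothetical.
    learning_rate: float = 0.01
    batch_size: int = 64
    local_epochs: int = 10


def fedavg(client_weights, client_sizes):
    """Average client weight vectors, weighting each by its sample count."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def fedprox_step(w, grad, w_global, cfg, mu=0.1):
    """One local gradient step with the FedProx proximal term.

    The term mu * (w - w_global) is the gradient of (mu / 2) * ||w - w_global||^2.
    mu = 0.1 is an assumed value; with mu = 0 this reduces to plain SGD.
    """
    return [
        wi - cfg.learning_rate * (gi + mu * (wi - gwi))
        for wi, gi, gwi in zip(w, grad, w_global)
    ]


if __name__ == "__main__":
    cfg = TrainConfig()
    # Two clients with 1 and 3 samples respectively.
    print(fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
    print(fedprox_step([1.0], [2.0], [0.0], cfg))
```

Weighting by sample count means that clients holding more data contribute proportionally more to the global model, which is the main behavioural difference from a plain unweighted mean.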
We trust that these revisions adequately address the reviewer’s concerns and significantly improve the manuscript. We are grateful for the thoughtful and constructive feedback.
With respect,
Drd. ing. Timofte Edi Marian
Reviewer 2 Report
Comments and Suggestions for Authors
Federated Learning, known from machine learning, is of great importance for cybersecurity. It can also protect privacy, as the Authors demonstrate in the submitted manuscript. This is of great importance in distributed environments, such as the Internet of Things (IoT), which is used in medicine, industry, or even to control vehicles in the city. In the case of intrusion detection and malware classification, Federated Learning is a reasonable approach for this class of systems. Gradient pruning, differential privacy, and encrypted model aggregation were also rightly used to ensure secure and efficient cooperation between heterogeneous clients that implement learning modules. Publicly available datasets were used: CICIDS2017 and TON_IoT. Experimental results showed the advantages of federated learning for increasing security in distributed Internet infrastructures. It is hard to disagree with the above conclusions of the Authors, but I noticed minor shortcomings in the manuscript that should be corrected.
1. Title: Federated Learning for Cybersecurity A Privacy-Preserving Approach => Federated Learning for Cybersecurity. A Privacy-Preserving Approach
2. There is no discussion of the organization of the article at the end of the Introduction. What is in the individual Sections?
3. I propose combining Lines 157-170 into two paragraphs.
4. Table 1: This Work => I propose giving the proposed method a name.
5. Figure 3: We have four layers: Fog/Edge/Fog/Edge. This is incorrect. The Figure should be corrected.
6. There are no citations in Section 4.1, and in Section 4.2 such citations should accompany each mathematical equation. In the case of PL and COR, mathematical dependencies are missing.
7. There should be no Section 6 after the Conclusion. I propose placing it before Section no. 5.
To sum up, I will gladly review the manuscript once more after the corrections have been submitted by the Authors.
Author Response
Response to Reviewer
We thank the reviewer for the valuable feedback and constructive suggestions. We have carefully revised the manuscript in accordance with the comments received. Below is a summary of the changes made, aligned with each point raised:
- Title Formatting
We have corrected the title as recommended. It now reads:
Federated Learning for Cybersecurity. A Privacy-Preserving Approach.
- Organization of the Article
We have added a paragraph at the end of the Introduction outlining the structure of the manuscript. This paragraph briefly describes the content of each major section to help guide the reader through the article.
- Paragraph Structure (Lines 157–170)
We revised the indicated section by restructuring it into two coherent and logically separated paragraphs. This improves the flow and readability of the text.
- Naming the Proposed Method in Table 1
We renamed the previously generic label “This Work” in Table 1 to FL-SecNet (Federated Learning for Secure Network) to provide a clear and consistent identity for the proposed method.
- Correction of Figure 3
Figure 3 has been updated to correct the previously inconsistent layering. The architecture now clearly reflects the proper hierarchy: Device Layer, Edge Layer, Fog Layer, and Cloud Layer.
- Missing Citations and Mathematical Details (Sections 4.1 and 4.2)
We added references in Section 4.1 to support the use of CICIDS2017 and TON_IoT datasets. In Section 4.2, we included citations for all equations and introduced the missing mathematical expressions for Precision Loss (PL) and Correlation (COR), along with brief explanations.
- Section Order: Discussion Before Conclusion
We reorganized the manuscript to move the Discussion and Future Work section before the Conclusion. The structure now follows a logical sequence, with the Conclusion serving as the final section of the manuscript.
We trust that these revisions adequately address the reviewer’s concerns and contribute to the overall quality and clarity of the paper. We are grateful for the constructive feedback.
Sincerely,
Drd. ing. Timofte Edi Marian
Reviewer 3 Report
Comments and Suggestions for Authors
The paper proposes an approach to privacy preservation for federated learning in cybersecurity contexts.
Having carefully reviewed and evaluated this manuscript, I have some concerns, which I report below:
References should not be used in the abstract (e.g., CICIDS2017 [1] and TON_IoT [2]). In my opinion, they must be removed from the abstract.
In the Introduction, no references support the basic concepts of federated learning, although papers [3,4,5 and 6] are cited later in the Introduction.
At the end of the Introduction, there is no statement of the paper's objective, nor a paragraph describing the structure of the paper.
In Section 2.1, the authors state that they “present a modular federated learning framework designed to enhance cybersecurity in distributed IoT environments by combining lightweight privacy-preserving techniques, personalized model adaptation, and blockchain-assisted secure communication.” However, there is no evidence of, and it is not clear, how “blockchain-assisted secure communication” is guaranteed.
In line 172, no reference supports “of the key features of state-of-the-art methods such as FedAvg, FedProx, and MOFL/MTFL.”
In “Table 1. Comparative Overview of Federated Learning Methods”, there is no support for the categorization in the columns FedAVG, FedProx, and MOFL/MTFL.
There is no information on whether the comparison was made under the same conditions and in the same context when evaluating the three methods against the authors' proposal.
In my opinion, before the flow of Figure 1 (overall architecture) is presented, examples of flows from other studies should be presented or described and then compared with this work.
Beyond the description of the components, it is not clear how “robust aggregation” is performed. How? It is not enough to say “it performs”.
The information in Section 3.1, Security and Privacy Analysis, is generic; it is not grounded in foundations and references, nor in concrete facts or approaches for dealing with the security and privacy analysis. Which rules? Laws? Regulations?
In Section 4.2, Datasets Used, there is no reference for the datasets CICIDS2017, TON_IoT, and NSL-KDD, nor any data analysis of the datasets.
There is no justification for, or description of what was done in, the components of Figure 4 (Dataset Integration and Preprocessing Workflow).
In Section 4.3, Evaluation Metrics, although generic metrics are presented, references supporting them are missing, as are other important evaluation metrics such as the confusion matrix.
Table 2 presents the simulated results for which dataset? How were the models created, in terms of training and test parameters, k-folds, and iterations?
After Table 2, the authors state “These results demonstrate that the federated learning”. It is not possible to assess the high performance without information on the evaluation metrics, the model creation process, and the algorithms used.
The basic Section 4.2, “case study”, which presents a few paragraphs and Figure 9 (Federated Intrusion Detection in a Smart Healthcare IoT Network), is unusual and completely outside the scope of the work. It is not clearly justified how the foundations, the architecture, and specifically the components of the presented work would be implemented in a Smart Healthcare IoT network, or in a Smart House network, or in a “SmartXPTO” network. In conclusion, this section reads like propaganda for something that could possibly be applied in a smart healthcare context.
In the conclusions, the authors state “This research proposes a federated learning (FL) framework”. In fact, the specifics of the framework are not really clear; it is a general set of flows and blocks without any deep specification of each component, only general statements of how it could be.
The authors' statement “A practical use case in smart healthcare validates.. ” is not true.
Many terms lack references (e.g., in 6.2. Techniques for Explainable Federated Learning).
Some writing corrections:
In the abstract, change “federated learning (FL)” to “Federated Learning (FL)”, as well as in other parts of the paper. Alternatively, if the terms FL and IoT are not used in the rest of the abstract, “(FL)” and “(IoT)” must be removed.
If the authors share the same affiliation, “University “Ștefan cel Mare” Suceava, Romania”, the institution needs to be written only once, followed by the authors' e-mails. In other words, the affiliations and the authors' e-mails should take 2 lines, not 8. In addition, there is no information about the corresponding author or an ORCID ID for at least one author.
The reference section does not follow a consistent syntax, namely for the year and URLs in each reference. I suggest using the DOI (Digital Object Identifier) or a URL (e.g., from arXiv) in each reference, when possible.
Some references do not give the year of publication (e.g., Ref 13).
In general, the paper presents a lot of text about federated learning and related information in the context of cybersecurity, but it lacks the real core of the work: the implementation conditions and the foundations of the implementations in the experimental results. The experiments are presented only in general terms and are not scientifically clarified beyond the charts and tables, whose evaluations are impossible to trace back to how they were obtained.
Author Response
Response to Reviewer
We thank the reviewer for the detailed and thoughtful feedback. We have carefully revised the manuscript to address each of the concerns raised. Below is a summary of the changes made, aligned with each point raised:
- References in the Abstract
We have removed the references to CICIDS2017 [1] and TON_IoT [2] from the abstract, in line with standard academic guidelines.
- Introduction – Missing References for Basic Concepts
We revised the Introduction to include references supporting the foundational concepts of federated learning, including its relevance in cybersecurity contexts. These were added prior to the discussion of papers [3]–[6].
- Introduction – Missing Objective and Article Structure
We added a clear statement of the paper's objective at the end of the Introduction, followed by a paragraph summarizing the structure of the paper.
- Blockchain-Assisted Secure Communication – Lack of Clarity
We clarified the role of blockchain in the proposed framework in Section 2.1, including how secure communication is achieved. Specific mechanisms and references were added to explain the integration and benefits.
- Line 172 – Missing Reference for FedAvg, FedProx, MOFL/MTFL
We included appropriate references to support the mention of FedAvg, FedProx, and MOFL/MTFL in the relevant section.
- Table 1 – Missing Support for Categorization
We revised Table 1 to include citations and supporting references for the categorizations under FedAvg, FedProx, and MOFL/MTFL. A brief methodological explanation was added in the caption.
- Evaluation Conditions for Compared Methods
We added a clarification in the experimental section to specify that the comparisons were made under similar conditions. The assumptions, datasets, and models used were standardized across the methods.
- Figure 1 – Lack of Related Flow Comparisons
We added a brief discussion preceding Figure 1 referencing existing system flows in related studies and explaining how our proposed architecture builds on or diverges from these works.
- Robust Aggregation – Lack of Details
We expanded the explanation of the robust aggregation mechanism in the methodology section. Details were provided about the aggregation algorithm and how resilience to poisoned updates is ensured.
- Section 3.1 – Generic Privacy and Security Discussion
We revised Section 3.1 to include references to specific privacy-preserving techniques and security regulations (e.g., GDPR, NIST SP 800-53). The discussion was enhanced with more technical detail and citations.
- Section 4.2 – Missing References and Dataset Analysis
We included references for the CICIDS2017, TON_IoT, and NSL-KDD datasets and provided a short statistical overview of each dataset. A brief data profiling description was also added.
- Figure 4 – Lack of Justification for Components
We expanded the description of the dataset preprocessing pipeline in Figure 4, explaining the role and implementation of each component in the data integration workflow.
- Section 4.3 – Missing Evaluation Metric References
References were added to support the use of evaluation metrics (accuracy, precision, recall, F1-score). We also included additional metrics, such as confusion matrix components, to strengthen the analysis.
- Table 2 – Dataset and Model Setup Not Specified
We added details before Table 2 specifying the dataset used, the model setup, and the training procedure, including k-fold validation and hyperparameter configuration.
- Performance Claims Without Context
We revised the discussion following Table 2 to clarify that performance results are based on the evaluation metrics presented. We emphasized the importance of model configuration and provided additional explanation of the evaluation setup.
- Section 4.2 “Case Study” – Poor Justification
We restructured the practical case study (formerly Section 4.2) to better connect it with the framework components. We clarified that it is a conceptual application scenario rather than a real-world deployment, and justified its relevance with clearer references and rationale. The section now appears as part of Section 5.
- Conclusion – Framework Clarity
We reworded parts of the Conclusion to avoid overgeneralizations and better reflect the nature of our proposal. We made it clear that our framework is modular and concept-driven, with emphasis on feasibility and future extensibility.
- Claim Regarding Practical Use Case in Smart Healthcare
We removed the statement that the use case "validates" the framework and replaced it with more accurate phrasing, indicating that it illustrates a potential application context.
- Missing References (e.g., Explainable Federated Learning)
We added supporting references in Section 6.2 to justify the mention of explainable federated learning techniques, including recent developments in interpretable FL.
- Writing Corrections (Terminology in Abstract, Affiliations, ORCID)
We have corrected the capitalization and abbreviation usage throughout the abstract and main text (e.g., “federated learning (FL)” vs. “Federated Learning (FL)”).
Affiliations have been consolidated into two lines, and we added the corresponding author’s contact and ORCID ID, following the journal’s submission guidelines.
- Reference Formatting Consistency
The reference list has been revised for consistency in format. We ensured each entry includes a publication year and either a DOI or URL where available. Missing publication years (e.g., Ref 13) were also added.
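The four metrics named in the response on Section 4.3 (accuracy, precision, recall, F1-score) all follow from the confusion matrix components that the response also mentions. The sketch below shows that relationship for the binary case, using the standard definitions. It is an illustration only: the function names and the toy labels are assumptions, and nothing here reproduces the manuscript's results.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for a binary task with the given positive label."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def classification_metrics(tp, fp, tn, fn):
    """Standard metrics; each ratio falls back to 0.0 when its denominator is 0."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Toy labels (1 = attack, 0 = benign); made up for illustration.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
    print(classification_metrics(*confusion_counts(y_true, y_pred)))
```

Reporting the raw counts alongside the derived ratios lets a reader recompute any metric, which is the kind of traceability the reviewer asked for.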
We trust that these extensive revisions address all of the reviewer’s concerns and significantly enhance the clarity, credibility, and technical strength of the manuscript. We are grateful for the detailed and constructive feedback.
Sincerely,
Drd. ing. Timofte Edi Marian
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
I could not find some of the modifications described in the authors' reply in the revised file. Please check whether the uploaded version is the final one. For example, in the reply to comment 4, the authors cited work that is not about quantum information, which is a citation error.
4. Addition of Quantum Communication Context
As suggested, we have added a short paragraph at the end of Section 1 discussing the implications of quantum communication and computing in the context of federated learning. We cited the recent studies:
- Rep. Prog. Phys. 87, 127901 (2024)
- PRX Quantum 3, 020315 (2022)
This addition provides context for future directions involving quantum-safe techniques.
Author Response
Response to Comment 4 – Addition of Quantum Communication Context:
Thank you for your observation. We confirm that the final uploaded version does include the requested addition regarding the implications of quantum communication in the context of federated learning.
The following paragraph was added at the end of Section 1 (page 3):
“As quantum technologies evolve, they present new challenges to the confidentiality and resilience of FL systems. In anticipation of adversaries with quantum capabilities, recent research has explored integrating quantum-resilient approaches into FL architectures. These include lattice-based cryptographic primitives, quantum key distribution protocols and secure aggregation schemes that are designed to resist quantum decryption attempts. These adaptations are intended to ensure the long-term security of FL deployments, particularly in critical infrastructures such as healthcare and smart city networks [15], [16].”
The cited references [15], [16] correspond to recent, peer-reviewed studies on quantum communication and security:
- Rep. Prog. Phys. 87, 127901 (2024)
- PRX Quantum 3, 020315 (2022)
We respectfully confirm that these references are not erroneous and directly support the integration of quantum-resilient techniques into federated learning architectures.
Please feel free to let us know if any further clarification is needed.
Respectfully,
Drd. ing. Timofte Edi Marian
Faculty of Electrical Engineering and Computer Science
Ștefan cel Mare University of Suceava
Reviewer 2 Report
Comments and Suggestions for Authors
The current version of the manuscript is of a much higher standard and I recommend that it be accepted.
Author Response
Thank you for the opportunity to improve the manuscript. We appreciate the reviewers’ constructive feedback and the editorial support throughout the process.
Respectfully,
Drd. ing. Timofte Edi Marian
Faculty of Electrical Engineering and Computer Science
Ștefan cel Mare University of Suceava
Reviewer 3 Report
Comments and Suggestions for Authors
Having carefully reviewed and evaluated this manuscript for the second time, I have some concerns, which I report below:
Section “5.1. Motivation and Context As” should be “5.1. Motivation and Context”?
Several terms must be associated with a reference (e.g., General Data Protection Regulation (GDPR) in line 337, Cyber Intelligent Risk Assessment (CIRA) in line 136, Trust-6GCPSS in line 158, and others).
In line 195, the authors write “federated learning (FL)”, although FL was already introduced before this section.
The conclusions of Table 1 (Mapping FL Challenges to Applied Techniques and Their Impact) are not clear, because there are no references to other works to support the justifications and impact, nor evidence from tests and data to support the text inside the table. More critically, the table is followed by the statement “As observed, the SecFL-IoT framework.. ”, yet SecFL is not mentioned in the table!
In line 789, the authors state “This architecture demonstrates a privacy-preserving yet interpretable learning pipe”. How does it demonstrate this? It illustrates general explainability.
Table 3 must be resized between lines 822 and 825.
In Figure 3 (Experimental Testbed Architecture), the components (e.g., PF) are not clearly detailed, nor is it clear how load balancing is done.
In line 388, “denial of service (DoS),” must be “Denial of Service (DoS),”.
The process carried out for each task in lines 419 to 422 is not clear.
The k-fold setup and the split of the training and test data were not clear.
The figures should be resized (i.e., reduced in size), e.g., Figs. 4 and 10.
Author Response
Response to Reviewer 3
We sincerely thank the reviewer for the detailed and constructive feedback, which has helped us improve the clarity and quality of our manuscript. Below are our point-by-point responses:
- Section title “5.1. Motivation and Context As”
  - Corrected. The title has been updated to “5.1. Motivation and Context”.
- Missing references for terms such as GDPR, CIRA, Trust-6GCPSS
  - Added appropriate references for all mentioned acronyms and nomenclatures, including General Data Protection Regulation (GDPR) [ref. 35], Cyber Intelligent Risk Assessment (CIRA) [ref. 36], and Trust-6GCPSS [ref. 37], among others. These citations are now present directly after the first mention of each term.
- Redundant definition of FL in line 195
  - Corrected. The repetition was removed to avoid redundancy.
- Lack of justification and references in Table 1
  - Revised Table 1 to include references that support the listed techniques and their impact. We also clarified that the “SecFL-IoT framework” is derived from the proposed methods and is now referenced within the table description.
- Clarification of line 789
  - Rewritten for clarity. The sentence now reads:
    “This architecture illustrates a privacy-preserving yet interpretable learning pipeline, highlighting how explainable mechanisms can be implemented without compromising data confidentiality.”
- Table 3 sizing between lines 822–825
  - Adjusted. Table 3 was resized to properly fit within the specified lines and improve readability.
- Figure 3 lacks clarity regarding components and load balancing
  - Expanded caption and paragraph above the figure to clearly describe the components (e.g., PF – Policy Filter module) and the role of load balancing. The load-balancing mechanism is now explained as being managed by a smart scheduling module based on client availability and resource constraints.
- Capitalization of “Denial of Service” in line 388
  - Corrected to “Denial of Service (DoS)”.
- Unclear process in lines 419–422
  - Revised the section to clarify the steps executed by each node, including missing value imputation, Min-Max scaling, one-hot encoding, and label remapping for class balance.
- Clarification of k-fold and train/test split
  - Added explanation in Section 4.1. The data was split using an 80/20 train-test ratio and evaluated using 5-fold cross-validation, which is now clearly mentioned in the revised text.
- Figure resizing (e.g., Fig. 4, Fig. 10)
  - All mentioned figures have been resized for consistency and better visual integration with the text.
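The preprocessing steps and the data split described in the responses above can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: mean imputation is assumed (the response says only "missing value imputation"), the 5-fold cross-validation is assumed to run on the 80% training portion, label remapping is omitted, and all function names are hypothetical.

```python
import random


def impute_mean(column):
    """Replace None entries with the mean of the observed values (assumed strategy)."""
    present = [x for x in column if x is not None]
    mean = sum(present) / len(present)
    return [mean if x is None else x for x in column]


def min_max_scale(column):
    """Rescale values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0] * len(column)
    return [(x - lo) / (hi - lo) for x in column]


def one_hot(values):
    """One-hot encode categorical values, with categories in sorted order."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


def train_test_split(n_samples, test_ratio=0.2, seed=0):
    """Shuffle sample indices and return (train_indices, test_indices)."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_ratio)
    return indices[n_test:], indices[:n_test]


def k_fold(indices, k=5):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]


if __name__ == "__main__":
    print(min_max_scale(impute_mean([1.0, None, 3.0])))
    print(one_hot(["tcp", "udp", "tcp"]))
    train_idx, test_idx = train_test_split(100)
    for fold_train, fold_val in k_fold(train_idx):
        print(len(fold_train), len(fold_val))
```

In a real pipeline the scaling bounds and imputation means would normally be computed on the training portion only and then applied to the validation and test data, so that no information leaks from held-out samples.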
All the aforementioned revisions are highlighted in yellow in the updated manuscript for ease of reference.
Respectfully,
Drd. ing. Timofte Edi Marian
Ștefan cel Mare University of Suceava
Faculty of Electrical Engineering and Computer Science