Stability-Guaranteed Grant-Free Access for Cyber–Physical System over Space–Air–Ground Integrated Networks
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
The paper proposes a multi-agent deep reinforcement learning (MA-DRL) and hierarchical reinforcement learning (HRL) framework to ensure stability and energy efficiency for cyber-physical systems (CPS) operating over Space–Air–Ground Integrated Networks (SAGIN). The topic is timely and relevant to the Electronics readership.
However, several major issues should be addressed before publication:
1. The system and communication models are simplified (linear dynamics and binary transmission success). Realistic network delay, nonlinearities, and channel uncertainty are not included.
2. The claimed “stability-guaranteed” property is only supported empirically. A formal proof of convergence or Lyapunov-based stability analysis is needed.
3. The simulation scale (3 groups × 4 devices) is small; scalability and performance for larger networks should be demonstrated.
4. The computational and communication cost of MA-DRL training should be analyzed, especially for edge or embedded CPS deployment.
5. Comparison with more advanced baselines (e.g., distributed RL, GNN-based coordination, or optimization-based access control) is recommended.
6. Include a discussion on future work directions (e.g., delay-aware modeling, real-world or hardware-in-the-loop validation).
Comments on the Quality of English Language
The English is generally understandable but includes grammatical inconsistencies and awkward phrasing in technical explanations. Editing for flow and precision would enhance readability.
Author Response
Please see the attachment.
Author Response File:
Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
Comments are in the document.
Comments for author File:
Comments.pdf
Author Response
Please see the attachment.
Author Response File:
Author Response.pdf
Reviewer 3 Report
Comments and Suggestions for Authors
Overall, this paper is excellent. It examines grant-free (GF) access for cyber-physical systems (CPSs) over space-air-ground integrated networks (SAGINs), jointly considering power consumption and system stability. A Markov decision process is used to describe the GF access problem for CPSs over SAGINs, in which preamble sequences are selected to minimize power consumption while ensuring system stability. A distributed multi-agent deep reinforcement learning scheme based on factorization is proposed as a solution. Furthermore, a local network based on hierarchical reinforcement learning is designed to keep the dimension of the action space from exploding, which lowers the computational complexity of the proposed algorithm.
The paper makes two key contributions:
- To overcome preamble sequence selection collisions in GF access communications, an energy-efficient GF access technique based on multi-agent deep reinforcement learning (MA-DRL) is proposed and applied in a real-time setting.
- A distributed MA-DRL framework based on factorization is used to address the problem that the global Q-value is nearly impossible to obtain, and a local network based on hierarchical reinforcement learning (HRL) is designed to mitigate the explosion of action-space dimensionality in the joint training algorithm.
Please reduce the plagiarism/similarity score to below 15% and add a few more citations to this manuscript.
Author Response
Please see the attachment.
Author Response File:
Author Response.pdf
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
- The system and communication models should be enhanced or properly justified; current linear dynamics and binary transmission assumptions must either be supported analytically or extended to include realistic factors such as network delay, stochastic uncertainty, or channel variability.
- The “stability-guaranteed” claim must be theoretically supported with a formal proof (e.g., Lyapunov or convergence analysis) in addition to the empirical simulations currently presented.
- The simulation setup should be expanded to evaluate scalability in larger CPS networks, and the corresponding computational and communication costs of MA-DRL training (e.g., runtime, data exchange, iterations) should be quantified to demonstrate feasibility.
- The proposed approach should be compared against more advanced or recent baselines, including distributed RL, GNN-based coordination, or optimization-based access control, to confirm its technical competitiveness.
- The discussion and presentation of results should be extended slightly to include clearer parameter descriptions, stronger interpretation of outcomes, and a short note on how delay and uncertainty relate to control performance, to support practical applicability.
Author Response
Please see the attachment.
Author Response File:
Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
See attachment for detailed comments.
Comments for author File:
Comments.pdf
Author Response
Please see the attachment.
Author Response File:
Author Response.pdf
Round 3
Reviewer 1 Report
Comments and Suggestions for Authors
1. Include a brief mathematical validation or Lyapunov-style justification of the claimed stability guarantee, rather than relying only on referenced inequalities.
2. Expand the discussion of scalability and provide insights on expected model performance for larger agent populations beyond the 3×4 test setup.
3. Consider including a complexity–accuracy trade-off table summarizing computational and communication overheads during training and inference.
4. Add a short future work subsection describing potential hardware-in-the-loop or delay-aware implementations, as noted in the author responses.
Author Response
Please see the attachment.
Author Response File:
Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
Comments to the Author: The authors have satisfactorily addressed the concerns raised in the previous review rounds.
1. Corrections: The specific typos pointed out have been corrected.
2. Figure Quality: The updated Figure 2 and Figure 3 clearly illustrate the algorithm structure and are now acceptable for publication.
3. Content Improvements: The simulation parameter settings and the discussion on system stability are reasonable and sufficient.
In summary, the manuscript now meets the standards for publication.
Author Response
Please see the attachment.
Author Response File:
Author Response.pdf
