Article
Peer-Review Record

Timeslot Scheduling with Reinforcement Learning Using a Double Deep Q-Network

Electronics 2023, 12(4), 1042; https://doi.org/10.3390/electronics12041042
by Jihye Ryu 1, Juhyeok Kwon 1, Jeong-Dong Ryoo 2,3, Taesik Cheung 2 and Jinoo Joung 4,*
Submission received: 18 January 2023 / Revised: 11 February 2023 / Accepted: 16 February 2023 / Published: 20 February 2023
(This article belongs to the Section Networks)

Round 1

Reviewer 1 Report

The article, "Timeslot Scheduling with Reinforcement Learning Using a Double Deep Q-Network" with a few enhancements to improve the decision-making times seems interesting and novel. I have a few suggestions to improve the paper further as per my knowledge.

1. Though the introduction section provides enough background for the proposed work, the main motivation for the work is not clear. You may want to clearly state the problem you want to solve, the gap you identified in the existing literature, and how your work fills that gap. The performance comparisons are provided only against static algorithms; including a dynamic ML-based algorithm would make your claim stronger. I was also wondering why the performance of DQN is not included. It would be good to see DRL vs. DDQN performance.

2. Thank you for explaining the DDQN elaborately in Section 2.1. However, including the details of the minibatch implementation would be helpful for readers.

3. The caption of Table 1 needs to be better aligned.

4. End-to-end is abbreviated as ET in some places and as E2E in others. Please maintain consistency.

5. The equations at line 308 need to be explained; it is not stated what st, Max(E1), Max(E2), E1, and E2 are.

6. If E1 and E2 are static in a scenario, what are Max(E1) and Max(E2)?

7. The statements in lines 319 to 322 need to be rephrased to make the intention of using R1 and R2 clear to readers.

8. In Figure 6, why is DQN run for only 10,000 episodes while the other methods are run for up to 20,000 episodes? Please explain the reason.

9. Including the x-axis label together with its units is desirable in graphs where time is used on the x-axis, for example in Figures 9 and 13.

10. Please explain the route column in Table 6. Are there source and destination nodes? Please mention them.

11. What is the purpose of Figure 11, and how does it differ from Figure 12?

12. Mentioning the algorithm used for the table lookup would be helpful to readers.

13. The conclusion needs to be enhanced. Stating plans for future work is fine, but stating that other algorithms are better may make your work look weaker.

14. I feel that explaining PPO or referring to other works in the conclusion is not necessary.

15. Providing a list of abbreviations would be helpful.


Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

The article describes a solution for timeslot scheduling using reinforcement learning based on a Double Deep Q-Network (DDQN). The DDQN is an extension of Q-learning that uses two neural networks to improve the stability and performance of the model. In the article, the authors demonstrate that using a DDQN for timeslot scheduling is an effective method that improves the quality of solutions compared with traditional methods.
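For context on the two-network mechanism summarized above, a minimal sketch of the standard DDQN target computation follows. It illustrates the general technique only and is not taken from the paper under review; the function and argument names (ddqn_targets, online_q, target_q) are hypothetical.

```python
import numpy as np

def ddqn_targets(rewards, next_states, dones, gamma, online_q, target_q):
    """Compute DDQN regression targets for a minibatch of transitions.

    online_q and target_q are assumed to be callables mapping a batch of
    states to Q-values of shape (batch, n_actions); rewards and dones are
    1-D arrays over the minibatch.
    """
    # The online network selects the greedy next action ...
    best_actions = np.argmax(online_q(next_states), axis=1)
    # ... and the target network evaluates that action. Decoupling action
    # selection from evaluation is what reduces the overestimation bias of
    # a plain DQN.
    q_next = target_q(next_states)[np.arange(len(best_actions)), best_actions]
    # Bootstrapped one-step target; terminal transitions use the reward only.
    return rewards + gamma * (1.0 - dones) * q_next
```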

Generally, the article presents good research and methodology, with promising results that support the authors' conclusions. However, it would be useful to know whether the results were compared with other timeslot scheduling methods based on machine learning, or whether tests were conducted on real-world data.

In summary, the article presents an interesting and promising solution for the timeslot scheduling problem, but it would be useful to know more about the validation of the results.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report

The paper presents a solution for optimizing timeslot scheduling in a wireless network using reinforcement learning. The authors use a Double Deep Q-Network (DDQN) to model the problem and evaluate the performance of their approach. The results show that the proposed method outperforms traditional scheduling methods and provides a more efficient way to handle the scheduling problem. Overall, the paper provides a valuable contribution to the field of reinforcement learning and its applications to scheduling problems in wireless networks.

The proposed method may not work well in all cases or under all conditions, and further research is needed to improve its generalizability and robustness.

The method may have limitations in terms of computational complexity, memory requirements, or other resources, which could make it infeasible for large-scale problems.

The results of the evaluation may be subject to various sources of bias or uncertainty, and further experiments may be needed to validate the findings.

The method may have theoretical limitations that need to be addressed, such as assumptions about the problem structure or the model's ability to represent the underlying system.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf
