Review Reports - A Fast Resource Allocation Algorithm Based on Reinforcement Learning in Edge Computing Networks Considering User Cooperation

Round 1

Reviewer 1 Report

The objective of this research is to investigate the use of Q-learning in reinforcement learning for optimizing resource allocation in wireless networks. In this paper, a mathematical model is developed to formulate the problem, and a Q-learning-based algorithm is proposed to solve it. The proposed method is evaluated through experiments, and the results demonstrate its effectiveness. While the proposed approach shows promising results, further research is needed to address potential limitations and future extensions of the method.

1. The abstract of this paper falls short of describing the significant contributions of the proposed algorithm for optimizing resource allocation in edge computing networks. Specifically, the abstract does not clearly convey the extent to which the algorithm improves efficiency compared to the current state-of-the-art (SOTA) method. To address this issue, the revised abstract should include a specific and quantifiable measure of the improvement achieved by the proposed algorithm.

2. This paper requires significant improvements in its English writing due to numerous typos and grammatical errors. Specifically, the manuscript includes informal first-person pronouns that should be avoided in scholarly writing. To address these issues, it is recommended that the author seek professional editing services. By doing so, the revised manuscript will meet the standards of academic writing.

3. This paper's introduction (Section 1) lacks clarity regarding the motivation for the research question it addresses. To address this issue, the revised manuscript should clarify the research question and the importance of the problem to be solved in the context of MEC. Additionally, it is suggested to outline the significant research contributions of the study at the end of Section 1.

4. Figure 1 in this paper lacks the quality required for scholarly publication. Specifically, the resolution of the figure should be increased to 300 in accordance with the journal's submission requirements to improve its clarity and comprehensibility. By making these improvements, the revised manuscript will meet the standards required for publication and ensure that readers can fully appreciate the research presented in the figure.

5. Figure 1 in this paper fails to capture the specific optimization problem scenario addressed in this study. Furthermore, the figure lacks a contextual description of network resource allocation, making it difficult for readers to understand its relevance to the research. To address these issues, it is recommended that the author revise the figure to more clearly express the problem to be solved in the research, with a specific focus on the network resource allocation context.

6. Although previous studies have developed a range of optimization algorithms for network resource optimization, the motivation for selecting Q-learning of reinforcement learning as the optimization algorithm in this study is not sufficiently explained. To improve the manuscript's quality, the author should provide a more detailed explanation of the benefits of using Q-learning over other optimization algorithms in addressing the specific problems presented in this research. By providing a more robust justification for the use of Q-learning, the revised manuscript will be better equipped to contribute to the existing body of knowledge on network resource optimization.

7. The manuscript lacks a thorough description of the experimental simulation environment, including simulators, hardware and software specifications, and other relevant details. Additionally, there is no description of the deep learning network environment used for training Q-learning in reinforcement learning, nor is there a clear explanation of the Q-learning deep learning network architecture. To enhance the manuscript's quality, it is recommended that the author provide a more comprehensive and detailed description of the experimental simulation environment, as well as a thorough explanation of the deep learning network environment and Q-learning network architecture used in the study.

8. The units of the y-axis in the subgraphs about time and energy consumption in Figure 7-Figure 10 are missing, which makes it difficult to interpret the results. It is suggested that the author include the units of time and energy consumption to provide readers with a clear understanding of the results. The experimental results in Figure 7-Figure 10 demonstrate the time and energy consumption of the proposed algorithm in comparison with the baseline methods. However, the discussion and analysis of the results lack depth and do not fully address the implications of the findings. To improve the paper, it is recommended that the author provide a more in-depth discussion and analysis of the results.

9. The paper lacks experimental results on the impact of the proposed optimization algorithm on the network transmission delay, which is an important concern for readers. It is recommended that the author conduct additional experiments to investigate the impact of the proposed algorithm on network transmission delay and provide a thorough analysis and discussion of the results in the paper. The inclusion of results on the impact of the proposed algorithm on network transmission delay will enhance the significance and practical value of this study in the field of MEC network resource optimization.

Author Response

Thank you for your advice. My replies and revisions are as follows:

Comment 1. The abstract of this paper falls short of describing the significant contributions of the proposed algorithm for optimizing resource allocation in edge computing networks. Specifically, the abstract does not clearly convey the extent to which the algorithm improves efficiency compared to the current state-of-the-art (SOTA) method. To address this issue, the revised abstract should include a specific and quantifiable measure of the improvement achieved by the proposed algorithm.

Reply to comment 1: To state the specific improvement achieved by the proposed algorithm, I have added the following sentence, starting at line 16: ‘Specifically, compared with heuristic algorithms, such as particle swarm optimization, ant colony algorithm, etc., commonly used to solve such problems, the algorithm proposed in this paper can reduce some aspects of network performance (including delay and user energy consumption) by about 10% in a network dominated by downlink tasks.’

Comment 2. This paper requires significant improvements in its English writing due to numerous typos and grammatical errors. Specifically, the manuscript includes informal first-person pronouns that should be avoided in scholarly writing. To address these issues, it is recommended that the author seek professional editing services. By doing so, the revised manuscript will meet the standards of academic writing.

Reply to comment 2: I have revised the paper as you suggested. In addition, the paper has been professionally edited.

Comment 3. This paper's introduction (Section 1) lacks clarity regarding the motivation for the research question it addresses. To address this issue, the revised manuscript should clarify the research question and the importance of the problem to be solved in the context of MEC. Additionally, it is suggested to outline the significant research contributions of the study at the end of Section 1.

Reply to comment 3: I now describe the specific research problem and the importance of MEC resource allocation, starting at line 73. The contributions of the paper are listed starting at line 149.

Comment 4. Figure 1 in this paper lacks the quality required for scholarly publication. Specifically, the resolution of the figure should be increased to 300 in accordance with the journal's submission requirements to improve its clarity and comprehensibility. By making these improvements, the revised manuscript will meet the standards required for publication and ensure that readers can fully appreciate the research presented in the figure.

Comment 5. Figure 1 in this paper fails to capture the specific optimization problem scenario addressed in this study. Furthermore, the figure lacks a contextual description of network resource allocation, making it difficult for readers to understand its relevance to the research. To address these issues, it is recommended that the author revise the figure to more clearly express the problem to be solved in the research, with a specific focus on the network resource allocation context.

Reply to comments 4 and 5: I have revised Figure 1. To relate the figure to the scenario of the optimization problem, I have also added more detail to it. The current Figure 1 shows the model of the specific problem, including the network architecture, the types of tasks, the power supply of the infrastructure, and the resources for different tasks.

Comment 6. Although previous studies have developed a range of optimization algorithms for network resource optimization, the motivation for selecting Q-learning of reinforcement learning as the optimization algorithm in this study is not sufficiently explained. To improve the manuscript's quality, the author should provide a more detailed explanation of the benefits of using Q-learning over other optimization algorithms in addressing the specific problems presented in this research. By providing a more robust justification for the use of Q-learning, the revised manuscript will be better equipped to contribute to the existing body of knowledge on network resource optimization.

Reply to comment 6: The reasons for using Q-learning in this paper are given starting at line 417.

Comment 7. The manuscript lacks a thorough description of the experimental simulation environment, including simulators, hardware and software specifications, and other relevant details. Additionally, there is no description of the deep learning network environment used for training Q-learning in reinforcement learning, nor is there a clear explanation of the Q-learning deep learning network architecture. To enhance the manuscript's quality, it is recommended that the author provide a more comprehensive and detailed description of the experimental simulation environment, as well as a thorough explanation of the deep learning network environment and Q-learning network architecture used in the study.

Reply to comment 7: I have clarified the software used in the simulation. No deep learning is involved in this paper; I used only Q-learning. The architecture of Q-learning is quite different from that of ANN algorithms. The agent is not trained on a data set; instead, in reinforcement learning (Q-learning), the agent finds its training samples according to the reward function that has been set. Section 2.2.2 has therefore already introduced the algorithm I used clearly. A supplementary explanation is at line 536.
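
For readers less familiar with the distinction drawn in this reply, the following is a minimal sketch of tabular Q-learning, in which the agent fills in a table from rewards it collects itself and no neural network or data set is involved. It is an illustration only and is not taken from the manuscript: the toy chain environment, the function names, and all parameter values are hypothetical.

```python
import random

def q_learning(n_states, n_actions, step, episodes=200, alpha=0.1,
               discount=0.9, epsilon=0.2, max_steps=50, seed=0):
    """Tabular Q-learning: the 'model' is just a table of state-action values."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        state = 0  # every episode starts in state 0
        for _ in range(max_steps):
            # Epsilon-greedy choice: explore at random, otherwise act greedily.
            if rng.random() < epsilon:
                action = rng.randrange(n_actions)
            else:
                action = max(range(n_actions), key=lambda a: q[state][a])
            next_state, reward, done = step(state, action)
            # The training sample (state, action, reward, next_state) comes
            # from interaction with the environment, not from a data set.
            target = reward + (0.0 if done else discount * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q

def chain_step(state, action):
    """Hypothetical toy environment: four states in a row.
    Action 1 moves one state to the right, action 0 stays in place.
    Reaching state 3 ends the episode with reward 1."""
    next_state = min(state + 1, 3) if action == 1 else state
    done = next_state == 3
    return next_state, (1.0 if done else 0.0), done

q_table = q_learning(n_states=4, n_actions=2, step=chain_step)
```

In this toy setting the learned table ends up preferring action 1 (move right) in the state next to the goal, which is the only behaviour the reward function encourages.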

Comment 8. The units of the y-axis in the subgraphs about time and energy consumption in Figure 7-Figure 10 are missing, which makes it difficult to interpret the results. It is suggested that the author include the units of time and energy consumption to provide readers with a clear understanding of the results. The experimental results in Figure 7-Figure 10 demonstrate the time and energy consumption of the proposed algorithm in comparison with the baseline methods. However, the discussion and analysis of the results lack depth and do not fully address the implications of the findings. To improve the paper, it is recommended that the author provide a more in-depth discussion and analysis of the results.

Reply to comment 8: The figures have been revised. New content giving more detail on the simulation results begins at lines 610, 641, 664, and 691, respectively.

Comment 9. The paper lacks experimental results on the impact of the proposed optimization algorithm on the network transmission delay, which is an important concern for readers. It is recommended that the author conduct additional experiments to investigate the impact of the proposed algorithm on network transmission delay and provide a thorough analysis and discussion of the results in the paper. The inclusion of results on the impact of the proposed algorithm on network transmission delay will enhance the significance and practical value of this study in the field of MEC network resource optimization.

Reply to comment 9: In the model of this paper, the delay is composed of the computation delay and the transmission delay. The computation delay depends only on the devices themselves and does not change during the simulation, so a change in the total network delay already reflects the change in the network transmission delay. Therefore, although user cooperation reduces only the transmission delay of the network, this component is not illustrated separately in the simulation but is represented by the curve of the network delay. The related explanation has been added at line 593. A separate simulation of the network transmission delay may therefore not be necessary, because its trend can be seen from the curve of the total network delay.
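
The argument in this reply can be written compactly as follows. The symbols are illustrative and are not necessarily the notation used in the manuscript; the only assumption is the one stated in the reply, namely that the computation delay stays constant during the simulation.

```latex
% Illustrative symbols: T_total (total delay), T_comp (computation delay),
% T_trans (transmission delay). Assumption: T_comp is constant.
T_{\text{total}} = T_{\text{comp}} + T_{\text{trans}},
\qquad
\Delta T_{\text{total}}
  = \underbrace{\Delta T_{\text{comp}}}_{=\,0} + \Delta T_{\text{trans}}
  = \Delta T_{\text{trans}}
```

Under that assumption, the total-delay curve and the transmission-delay curve differ only by a constant vertical offset.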

Reviewer 2 Report

1. For the sake of clarity and scientific soundness, please split the introduction section into the following subsections: 1.1 research motivation, 1.2 research objectives/questions, 1.3 problem statement, 1.4 research contributions.

2. Isolate the related work section from the introduction. At the end of the related work section, identify and describe the research gap, and include the studies below.

2.1 A Robust Light Field Semantic Segmentation Network Combining Contextual and Geometric Features. Frontiers in Environmental Science, 1443. doi: 10.3389/fenvs.2022.996513

2.2 Task Co-Offloading for D2D-Assisted Mobile Edge Computing in Industrial Internet of Things. IEEE Transactions on Industrial Informatics, 19(1), 480-490. doi: 10.1109/TII.2022.3158974

2.3 Joint Task Offloading and Resource Allocation for Energy-Constrained Mobile Edge Computing. IEEE Transactions on Mobile Computing. doi: 10.1109/TMC.2022.3150432

2.4 Robust Online CSI Estimation in a Complex Environment. IEEE Transactions on Wireless Communications, 1. doi: 10.1109/TWC.2022.3165588

2.5 Broadband cancellation method in an adaptive co-site interference cancellation system. International Journal of Electronics, 109(5), 854-874. doi: 10.1080/00207217.2021.1941295

3. Check the equation numbering and presentation; most of the equations are not properly aligned. Also check the paper for typos.

4. The figures have low intensity; increase their quality.

5. A comparison with the benchmark studies is missing.

Author Response

Thank you for your advice. My replies and revisions are as follows:

Comment 1. For the sake of clarity and scientific soundness, please split the introduction section into the following subsections: 1.1 research motivation, 1.2 research objectives/questions, 1.3 problem statement, 1.4 research contributions.

Reply to comment 1: The revised Section I now has the following subtitles.

1.1. research motivation

1.2. research objectives and problem statement

1.3. related works and research gap

1.4. research contributions and paper structure

Comment 2. Isolate the related work section from the introduction. At the end of the related work section, identify and describe the research gap, and include the studies below. (The five recommended materials are not listed here.)

Reply to comment 2: The reference materials provided by the reviewer are indeed of high value, and they can be found in the reference list as [4], [9], [13], [14], and [15]. However, the content on related work and the research gap is not very long. Considering its length, the related work remains a part of Section I, with more detail added.

Comment 3. Check the equation numbering and presentation; most of the equations are not properly aligned. Also check the paper for typos.

Reply to comment 3: The typos have been corrected. The equation numbering is aligned on my computer, and all equations and their numbers are aligned in this paper. In addition, the paper has been professionally edited.

Comment 4. The figures have low intensity; increase their quality.

Reply to comment 4: I have redrawn and replaced all figures.

Comment 5. A comparison with the benchmark studies is missing.

Reply to comment 5: The benchmark has now been added to the simulation. It is represented by the curve with diamond markers in the figures. A brief analysis of the benchmark is also provided at line 557: ‘In particular, the benchmark in all the following figures are the results corresponding to the greedy algorithm with random exploration. This algorithm has the property of simple, fast, but easy to get trapped in the local optimal result’.
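
As a rough illustration of what a greedy algorithm with random exploration can look like, the sketch below repeatedly picks one user, tries either a random option (exploration) or the locally best option (greedy), and keeps the change only if the cost does not get worse. This is one possible reading and not the benchmark implemented in the manuscript: the load-imbalance cost, the function names, and all parameter values are hypothetical.

```python
import random

def greedy_random_exploration(cost, n_users, n_options, rounds=200,
                              explore_prob=0.1, seed=0):
    """Greedy per-user reassignment with occasional random exploration."""
    rng = random.Random(seed)
    assignment = [0] * n_users  # start with every user on option 0
    best = cost(assignment)
    for _ in range(rounds):
        user = rng.randrange(n_users)
        trial = list(assignment)
        if rng.random() < explore_prob:
            # Exploration: try a random option for this user.
            trial[user] = rng.randrange(n_options)
        else:
            # Greedy step: pick the option that is best for this user alone.
            def cost_if(option):
                changed = list(assignment)
                changed[user] = option
                return cost(changed)
            trial[user] = min(range(n_options), key=cost_if)
        trial_cost = cost(trial)
        # Only non-worsening moves are accepted, which keeps the method
        # simple and fast but lets it stall at a local optimum.
        if trial_cost <= best:
            assignment, best = trial, trial_cost
    return assignment, best

def imbalance(assignment, n_options=3):
    """Hypothetical toy cost: gap between the most and least loaded option."""
    loads = [assignment.count(o) for o in range(n_options)]
    return max(loads) - min(loads)

plan, plan_cost = greedy_random_exploration(imbalance, n_users=6, n_options=3)
```

Because each step looks at a single user and never accepts a worse assignment, the search is cheap per iteration, which matches the "simple, fast" property mentioned in the reply.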

Reviewer 3 Report

The paper is generally well written and structured. However, the paper has some shortcomings, as provided below.

1. On lines 14 and 15, the author stated, "We propose a network frame and its resource allocation algorithm, which is based on power consumption, delay, and user cooperation." The user cooperation part lacks detailed description and technical explanation; this part needs to be rewritten in more detail because it distinguishes this work from other works that use RL to solve the same problem.

2. The author did not provide details about the ratio of the values of the normalized weight coefficients (\gamma_t, \gamma_e) used during the simulation.

3. The effects of different priorities of the weight coefficients should be discussed in the results section. For example, how were the results affected when both \gamma_t and \gamma_e were set to 0.5, or when one of them was given higher priority than the other?

4. At line 513, I believe it’s a continuation of the sentence on line 512. The author should consider joining them together and not creating a new paragraph as it is now, or if it’s a new paragraph, consider adding a full stop at the end of line 512 and starting with a capital letter at line 513.

Author Response

Thank you for your advice. My replies and revisions are as follows:

Comment 1. On lines 14 and 15, the author stated, "We propose a network frame and its resource allocation algorithm, which is based on power consumption, delay, and user cooperation." The user cooperation part lacks detailed description and technical explanation; this part needs to be rewritten in more detail because it distinguishes this work from other works that use RL to solve the same problem.

Reply to comment 1: The introduction of the user cooperation model has now been improved. It begins at line 326.

Comment 2. The author did not provide details about the ratio of the values of the normalized weight coefficients (\gamma_t, \gamma_e) used during the simulation.

Reply to comment 2: Some parameters used in the simulation, such as \gamma_t, have now been added to Table 2. As the paper notes (at lines 369-370), the sum of \gamma_t and \gamma_e is always 1, so only \gamma_t is shown in the table.

Comment 3. The effects of different priorities of the weight coefficients should be discussed in the results section. For example, how were the results affected when both \gamma_t and \gamma_e were set to 0.5, or when one of them was given higher priority than the other?

Reply to comment 3: The suggested content has been added, including new simulations, figures, and analysis. The specific revision is at line 706.

Comment 4. At line 513, I believe it’s a continuation of the sentence on line 512. The author should consider joining them together and not creating a new paragraph as it is now, or if it’s a new paragraph, consider adding a full stop at the end of line 512 and starting with a capital letter at line 513.

Reply to comment 4: I have revised it as you advised.

Round 2

Reviewer 1 Report

Thank you for your efforts in revising the manuscript as suggested. However, there are still several issues that require further correction. Specifically, the y-axes in Figures 7-10 are still missing units, which need to be clarified. Additionally, the newly added text in the manuscript lacks a description of the units used. Therefore, I kindly request that you revise the manuscript again to address these issues.

Furthermore, it is important to note that the language used in the manuscript needs improvement to meet the standards of professional academic writing. The author is advised to revise the language, particularly in the newly added content.

Author Response

Thank you for your patience and valuable suggestions.

All figures, including Figures 7-11, have been revised to show units on the y-axis. The units of delay and energy are seconds (s) and joules (J), respectively. However, because this paper considers multiple objectives, the objective function consists of both delay and energy terms. Its value is therefore a function value with no unit.

In addition, text explaining the units used has been added at line 604 and is marked in red: ‘As shown in the optimization problem (a), the objective function consists of terms of both delay and energy. The units of delay and energy are successively second (s) and Joule (J)’.

We used the English editing service provided by MDPI, and we have carefully checked and revised the academic writing again.
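
To make the point about a unitless objective concrete, the sketch below combines delay and energy into a single weighted value. It is an illustration under stated assumptions, not the objective function from the manuscript: dividing each term by a reference value (`delay_ref_s`, `energy_ref_j`) is one hypothetical way to make the terms dimensionless, and the only property taken from this review record is that the two weights sum to 1.

```python
def weighted_objective(delay_s, energy_j, gamma_t,
                       delay_ref_s=1.0, energy_ref_j=1.0):
    """Weighted sum of normalized delay and energy (dimensionless).

    gamma_t weights the delay term; gamma_e = 1 - gamma_t weights the
    energy term. The reference values are hypothetical normalizers.
    """
    if not 0.0 <= gamma_t <= 1.0:
        raise ValueError("gamma_t must lie in [0, 1]")
    gamma_e = 1.0 - gamma_t
    return (gamma_t * (delay_s / delay_ref_s)
            + gamma_e * (energy_j / energy_ref_j))

# Sweep the priority between delay and energy for one made-up operating point.
sweep = {g: weighted_objective(delay_s=2.0, energy_j=4.0, gamma_t=g)
         for g in (0.0, 0.5, 1.0)}
```

With `gamma_t = 0.5` both terms count equally; moving `gamma_t` toward 1 makes the value track delay, and moving it toward 0 makes it track energy.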

Reviewer 2 Report

Author Response

Thank you so much!

Reviewer 3 Report

No further comment.

Author Response

Thank you so much!

Round 3

Reviewer 1 Report

Thank you for your research. I am satisfied with your revised manuscript, and the paper has met the requirements for publication.