Article
Peer-Review Record

Joint Pricing, Server Orchestration and Network Slice Deployment in Mobile Edge Computing Networks

Electronics 2025, 14(5), 841; https://doi.org/10.3390/electronics14050841
by Yijian Hou 1, Kaisa Zhang 1,*, Gang Chuai 1, Weidong Gao 1, Xiangyu Chen 2 and Siqi Liu 3
Reviewer 1:
Reviewer 2: Anonymous
Reviewer 3:
Reviewer 4: Anonymous
Submission received: 7 January 2025 / Revised: 31 January 2025 / Accepted: 18 February 2025 / Published: 21 February 2025
(This article belongs to the Special Issue New Advances in Distributed Computing and Its Applications)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

1. The authors propose a pricing-driven joint MEC server orchestration and network slice deployment (PD-JSOSD) scheme, dividing the system into an InP layer (IPL), a network planning layer (NPL), and a resource allocation layer (RAL), and proposing a three-stage Stackelberg game to describe the relationships among them. Please compare, in detail, the contributions of the proposed PD-JSOSD scheme with related techniques.

2. The results show that the proposed modified BDQ improves convergence by 21.9% and 28.3% compared with the benchmark algorithms. Please elaborate on the possible reasons in detail.

3. In Figure 2, the proposed three-stage Stackelberg game model should be described in more detail.

4. In Figure 3, the architecture of the D3QN should be described in more detail.

5. In Figure 4, the evaluation network of the BDQ should be described in more detail.

6. In Figure 5, the convergence process of D3QN-DMBDQ should be described in more detail.

7. The manuscript has 28 equations; to retain reader engagement, the manuscript should be refined by removing equations that are not essential.

Comments on the Quality of English Language

Please thoroughly revise the language before resubmission.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

In this manuscript, the authors propose an ML-assisted algorithm that handles server orchestration and network slice deployment while also considering the revenue of the infrastructure providers. The authors model the problem in three layers (infrastructure, network planning, and resource allocation) and propose a three-stage Stackelberg game solved by a three-layer deep reinforcement learning (DRL) algorithm (a combination of the double deep Q-network and the branching dueling Q-network). They formulate the optimization problem mathematically and describe the state, action, and reward functions for the DRL schemes, including an illegal-action modification algorithm to ensure the convergence of the BDQ. The authors verify the performance of their approach with simulations, illustrating its efficiency compared to several baseline algorithms. The paper is in general interesting and relevant to the topic of the Special Issue. Some minor comments should be addressed before publication:
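To make the illegal-action modification mentioned in the summary concrete: such handling is often realized by masking invalid actions before greedy selection. Below is a minimal, generic sketch of that idea (the function name and the mask-with-negative-infinity approach are illustrative assumptions and are not taken from the authors' algorithm):

```python
import numpy as np

def select_legal_action(q_values, legal_mask):
    """Pick the greedy action among legal actions only.

    q_values:   1-D array of Q-values, one entry per discrete action.
    legal_mask: boolean array of the same length, True for legal actions.
    """
    # Illegal actions get -inf so argmax can never choose them.
    masked = np.where(legal_mask, q_values, -np.inf)
    return int(np.argmax(masked))

# Toy usage: three candidate deployments, the last one violates a constraint.
q = np.array([0.7, 1.2, 3.5])
legal = np.array([True, True, False])
print(select_legal_action(q, legal))  # -> 1, the best *legal* action
```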

· Equation (1): you do not consider co-channel interference in the wireless area. Why? As far as I understand your system model, interference among the downlink transmissions is possible. Please comment on this and, if applicable, mention it as a limitation of your method (a generic SINR expression that includes interference is sketched after the reference list below).

· Algorithm 1, line 6: how is the new environment state calculated or obtained? I am currently missing this information. Does it depend on the selected action? Please clarify this issue in the manuscript.

· I would suggest including some references on applications of network slicing in order to give the reader a broader perspective on the problem investigated in this work. For instance:

1. "Network slicing for vehicular communication." Transactions on Emerging Telecommunications Technologies 32.1 (2021): e3652.

2. "Smart Mission Critical Service Management: Architecture, Deployment Options, and Experimental Results." IEEE Transactions on Network and Service Management (2024).

3. "Dynamic SDN-based radio access network slicing with deep reinforcement learning for URLLC and eMBB services." IEEE Transactions on Network Science and Engineering 9.4 (2022): 2174-2187.

4. "Customized industrial networks: Network slicing trial at Hamburg seaport." IEEE Wireless Communications 25.5 (2018): 48-55.
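For context on the first comment above: a standard downlink SINR model that does include co-channel interference (generic notation, not taken from the manuscript) has the form

```latex
\gamma_{u,b} = \frac{p_b \, g_{u,b}}{\sum_{b' \neq b} p_{b'} \, g_{u,b'} + \sigma^2},
\qquad
R_{u,b} = W \log_2\!\left(1 + \gamma_{u,b}\right),
```

where p_b is the transmit power of base station b, g_{u,b} is the channel gain from base station b to user u, sigma^2 is the noise power, and W is the bandwidth. Dropping the interference sum in the denominator reduces gamma_{u,b} to the interference-free SNR apparently assumed in Equation (1).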

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

This paper is well-written and presents a novel method for optimizing resource allocation and network performance in MEC networks.

Below are some suggestions for consideration:

1. The novelty of this work is somewhat limited. The use of Stackelberg games and DRL for MEC resource allocation has been extensively studied in prior research. The authors should say more about how their approach differs from existing ones.

2. There are some concerns about the system model. The paper assumes a one-to-one mapping between base stations and MEC servers; however, in practical MEC deployments, multiple base stations may share a single server, or servers may be virtualized across multiple stations.

3. The model does not account for user mobility, which is a critical factor in MEC networks. The static partitioning of the region into subareas assumes fixed user distributions, limiting the applicability of the proposed method in dynamic environments.

4. The delay model may oversimplify real-world conditions. 

5. Adding a detailed figure or flowchart in the system model section could significantly improve understanding.

6. The paper lacks sufficient explanation regarding the implementation of benchmark algorithms and the rationale for their selection.

7. Although pricing optimization is central to this paper, the evaluation does not thoroughly analyze how pricing levels influence tenant behavior or overall system efficiency. This deserves more exploration.

8. While the simulations are useful, it would be valuable to see how the approach performs on a real MEC system.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 4 Report

Comments and Suggestions for Authors

The reviewer has the following concerns.

1) The Abstract is too long; it is advised to streamline it.

2) Please show or discuss the impact of hyperparameters on the DRL training process in the simulations.

3) Were any sparse/delayed reward issues encountered? If so, how did you deal with them?

4) Was any reward-shaping technique used? If not, why not?

5) Why did you not choose a DRL framework that can handle a continuous action space, for example, an actor-critic architecture?

6) The literature review regarding DQN variants, e.g., the D3QN, is not as comprehensive as anticipated. Advanced D3QN variants are not reviewed at all, which limits the comprehensiveness of the Introduction. There are plenty of recent related works in the field of D3QN for wireless communications; a representative example is "Path planning for cellular-connected UAV: A DRL solution with quantum-inspired experience replay."

7) What are the drawbacks of adopting a three-layer DRL architecture?

8) In the implementation, the PER technique is typically sensitive to its hyperparameter settings. Did you encounter such issues, and if so, how did you deal with them? (An illustrative sketch of the sampling step in question appears after this list of comments.)

9) A complexity analysis of the proposed three-layer DRL solution is provided. However, the lack of a complexity comparison between the proposed solution and the baselines in the simulations makes the complexity advantage unclear.

10) If a more dynamic environment is considered, or if the values of M or N differ from those used during training, will the learnt model still work?
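To make the sensitivity raised in point 8 concrete, below is a minimal, generic sketch of proportional prioritized sampling; the priority exponent alpha and the importance-sampling exponent beta (typically annealed toward 1) are the hyperparameters in question. The code is illustrative only and is not the authors' implementation.

```python
import numpy as np

def sample_prioritized(td_errors, batch_size, alpha=0.6, beta=0.4, eps=1e-6):
    """Proportional prioritized experience replay sampling.

    td_errors: absolute TD error of every stored transition.
    alpha:     how strongly priorities skew the sampling (0 = uniform).
    beta:      importance-sampling correction (annealed toward 1 in practice).
    """
    priorities = (np.abs(td_errors) + eps) ** alpha
    probs = priorities / priorities.sum()
    idx = np.random.choice(len(td_errors), size=batch_size, p=probs)
    # Importance-sampling weights correct the bias of non-uniform sampling.
    weights = (len(td_errors) * probs[idx]) ** (-beta)
    weights /= weights.max()  # normalize for numerical stability
    return idx, weights

# Toy usage: 1000 stored transitions, draw a batch of 32.
errors = np.random.rand(1000)
batch_idx, is_weights = sample_prioritized(errors, batch_size=32)
```

Small changes to alpha or to the beta-annealing schedule can noticeably shift which transitions dominate training, which is why these settings deserve a brief discussion.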

 

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

No further comment.

Reviewer 4 Report

Comments and Suggestions for Authors

The authors addressed my concerns.
