Peer-Review Record

Multi-Agent Deep Reinforcement Learning Framework Strategized by Unmanned Aerial Vehicles for Multi-Vessel Full Communication Connection

Remote Sens. 2023, 15(16), 4059; https://doi.org/10.3390/rs15164059
by Jiabao Cao 1, Jinfeng Dou 2,*, Jilong Liu 1, Xuanning Wei 2 and Zhongwen Guo 2
Submission received: 11 July 2023 / Revised: 4 August 2023 / Accepted: 10 August 2023 / Published: 16 August 2023
(This article belongs to the Special Issue Satellite and UAV for Internet of Things (IoT))

Round 1

Reviewer 1 Report

The authors propose a multi-agent deep reinforcement learning framework strategized by unmanned aerial vehicles (UAVs). The UAVs evaluate and navigate the multi-USV cooperation and position adjustment to establish FCC. While ensuring FCC, the authors aim to improve the IoV performance by maximizing the USVs' communication range and movement fairness while minimizing their energy consumption, an objective that cannot be explicitly expressed in a closed-form equation.
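For context, such a multi-criteria objective is typically folded into a scalar per-step reward rather than written in closed form. The following is a minimal sketch of that idea, assuming a weighted combination of coverage gain, Jain's fairness over per-USV movement, and an energy penalty; the terms, weights, and function names are illustrative assumptions, not the paper's actual reward design.

```python
import numpy as np

def step_reward(coverage_gain, distances_moved, energy_used,
                w_cov=1.0, w_fair=0.5, w_energy=0.1):
    """Combine communication coverage, movement fairness, and energy cost
    into one scalar reward (illustrative weights, not the paper's values)."""
    d = np.asarray(distances_moved, dtype=float)
    # Jain's fairness index over per-USV movement (1.0 = perfectly fair).
    fairness = d.sum() ** 2 / (len(d) * (d ** 2).sum() + 1e-8)
    return w_cov * coverage_gain + w_fair * fairness - w_energy * energy_used

# Example: three USVs extend coverage by 2.5 units while moving unevenly.
print(step_reward(coverage_gain=2.5, distances_moved=[1.0, 0.8, 1.2], energy_used=3.0))
```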

I think the article has the merit of being published, but the authors should consider the following major points:

1- In the abstract, the authors do not discuss the simulation results in detail. We recommend that the results be explained more clearly in the abstract.
2- In particular, I recommend that the authors add a comparative table to better position their work against the literature.

3- The system model is poorly explained; please improve that part of the paper.

4- Why was this propagation model chosen? Please justify the choice.

5- Equation (15) is not clear; please explain it.

6- In the design of the action, the action variables do not correspond to the optimization variables of the optimization problem, and no explanation is given.

7- In the design of the reward, it is mentioned that the destination is reached, but the relevant constraints are not given.

8- A comparison with other published results is necessary. In addition, your results are misinterpreted.

9- Please study the complexity of your algorithm and discuss the limitations of your work.

Please cite the following papers:

M. A. Ouamri, G. Barb, D. Singh, A. B. M. Adam, M. S. A. Muthanna and X. Li, "Nonlinear Energy-Harvesting for D2D Networks Underlaying UAV With SWIPT Using MADQN," in IEEE Communications Letters, vol. 27, no. 7, pp. 1804-1808, July 2023.

M. A. Ouamri, R. Alkanhel, D. Singh, E. M. El-kenaway and S. S. M. Ghoneim, "Double Deep Q-Network Method for Energy Efficiency and Throughput in a UAV-Assisted Terrestrial Network," Computer Systems Science and Engineering, vol. 46, no. 1, pp. 73–92, 2023.

Moderate English editing is required.

Author Response

The response to the comments is in the attachment. Thank you.

Author Response File: Author Response.pdf

Reviewer 2 Report

This paper proposes the UST-MADRL framework, which enables UAVs to efficiently navigate the movement of USVs to establish multi-USV FCC, building on MADDPG.
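As background for the comments below, MADDPG follows centralized training with decentralized execution: each agent's critic is conditioned on the joint (global) observations and actions, while each actor uses only its own local observation. A minimal sketch of that structure is given here, assuming PyTorch and illustrative network sizes; it is not the authors' implementation.

```python
import torch
import torch.nn as nn

class Actor(nn.Module):
    """Decentralized actor: maps one agent's local observation to its action."""
    def __init__(self, obs_dim, act_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.ReLU(),
            nn.Linear(64, act_dim), nn.Tanh(),   # continuous action in [-1, 1]
        )

    def forward(self, obs):
        return self.net(obs)

class CentralizedCritic(nn.Module):
    """Centralized critic: scores the joint observation-action of all agents."""
    def __init__(self, n_agents, obs_dim, act_dim):
        super().__init__()
        joint_dim = n_agents * (obs_dim + act_dim)
        self.net = nn.Sequential(
            nn.Linear(joint_dim, 128), nn.ReLU(),
            nn.Linear(128, 1),
        )

    def forward(self, joint_obs, joint_act):
        return self.net(torch.cat([joint_obs, joint_act], dim=-1))

# During training each critic sees the global state (all observations and
# actions); at execution time each actor needs only its own observation.
n_agents, obs_dim, act_dim = 3, 8, 2
actors = [Actor(obs_dim, act_dim) for _ in range(n_agents)]
critics = [CentralizedCritic(n_agents, obs_dim, act_dim) for _ in range(n_agents)]

obs = torch.randn(n_agents, obs_dim)                      # local observations
acts = torch.stack([a(o) for a, o in zip(actors, obs)])   # decentralized actions
q = critics[0](obs.reshape(1, -1), acts.reshape(1, -1))   # centralized Q-value
```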

A major revision is recommended.

 

1. There are several MADRL algorithms available, so why was MADDPG chosen?

2. In a MADDPG structure, each agent needs to know the global state during the training process. Do the authors consider the information exchange required during training?

3. For DRL, the reviewer suggests elaborating on the training process and the testing process separately. Offline training is preferred, and how to design the training dataset and the loss function to guarantee generalization capability is the key issue.

4. Do the training process and the testing process share the same data sets?

5. In this model, does the action of each agent have an impact on the other agents' states?

6. In Section 3, it should be made clear whether the purpose of this paper is to maximize or to minimize; please state the optimization problem explicitly in the form "max ... s.t. ..." (an illustrative template is sketched after this list).

7. Please elaborate on the rationale for the choice of parameters in this paper, or introduce supporting references.

8. It is suggested to add other RL algorithms as baselines, e.g., centralized DDPG and MADQN.
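Regarding point 6, an illustrative template of the kind of explicit formulation requested (placeholder symbols only, not the paper's actual objective or constraints):

```latex
% Illustrative template only; f, g_i, and \mathcal{X} are placeholders.
\begin{align}
\max_{\mathbf{x}} \quad & f(\mathbf{x})
    && \text{(e.g., coverage and fairness minus energy cost)} \\
\text{s.t.} \quad & g_i(\mathbf{x}) \le 0, \quad i = 1,\dots,m
    && \text{(e.g., FCC connectivity and energy constraints)} \\
& \mathbf{x} \in \mathcal{X}
    && \text{(feasible USV positions/actions)}
\end{align}
```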

 

Author Response

The response to comments is in the attachment. Thank you so much.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

The paper has been improved.

English can be improved.

 

Reviewer 2 Report

The authors have answered all my concerns. I recommend accepting the paper in its current form.
