Open Access Article

Peer-Review Record

A Novel Searching Method Using Reinforcement Learning Scheme for Multi-UAVs in Unknown Environments

Appl. Sci. 2019, 9(22), 4964; https://doi.org/10.3390/app9224964

by Wei Yue 1,2, Xianhe Guan 1 and Liyuan Wang 2,*

Reviewer 1: Anonymous

Reviewer 2: Anonymous

Submission received: 28 July 2019 / Revised: 8 November 2019 / Accepted: 14 November 2019 / Published: 18 November 2019

(This article belongs to the Special Issue Unmanned Aerial Vehicles (UAVs))

Round 1

Reviewer 1 Report

This could have been a really nice and interesting paper on an interesting and practically relevant topic. The idea, as far as I understood it, makes sense and could be useful. Unfortunately, the writing / presentation and the evaluation are quite poor. The paper needs a serious re-write, and ideally the authors enlist support from a native speaker.

Writing and presentation:

Section 2 is really hard to follow, several sentences are just confusing. For example, the meaning of \tau is unclear, similarly the meaning of \delta p_mn(k) and the explanations at the end of page 2.

You should have made it clearer up-front that you are actually considering UAVs searching a sea area. I was confused about whether your UAVs are airborne or sea-borne for quite a while.

I cannot resist commenting that the mathematical typesetting is plain ugly, Word just does not produce nice-looking formulas.

Comments on the algorithm design:

It seems that your algorithm requires that each UAV has access to the “global state” s(k) and u(k) (and all the p_mn(k) values) at any time to evaluate Equation (20) and the equations this is dependent on. This means that the UAVs need to communicate constantly to maintain global agreement on these state values. This should be stated explicitly, and it would be good if some discussion of the ensuing communications cost (required bandwidths) would be included. Also, it would be good to discuss what happens if the sea area is large and connectivity is only intermittent.

It seems that your algorithm “converges” in some sense to a tracking behavior, since in your reward function (Equation (20)) you make it worthwhile to go near places where you already had a discovery. This, coupled with the “cooling down” of the system temperature in Equation (17), makes it unlikely for a UAV to be in an area where the known ships are certain not to be. However, new ships may enter the area. It would be good to include some discussion of this.

What is a time step?

Is the motion model (Equation (7)) standard? If so, a reference would be good.

Some comments on the computational complexity of your algorithm would be good.

P7, line 250: how is “the actual distance between UAVs is d” defined?

Methodology and evaluation:

The comparison metric is not defined at all, I do not know what the values in Table 1 represent. Also, there is no real discussion about the reasons for the differences between the algorithms. The statistical significance of results is unclear.

You do not clarify all the parameter values used in the evaluation, e.g. the weights used in Equation (16), or the value of the cooling parameter \lambda in Equation (17).

Author Response

Responses to 1st round review

Response to Reviewer 1:

We thank the reviewer for the time and effort that he has invested in reading and commenting on our work. His important feedback enabled us to improve the quality of our work. We have addressed his concerns as per our responses below. For convenience, we have retyped the comments in italics, followed by our responses in bold. The paper has been revised and edited thoroughly. The modifications in the resubmitted paper are also shown in red font.

Comment 1: The related work needs to include more up to date papers in the area. Also, more references should be included to cover the research area.

Response: As you suggested, we have updated the references in the revised paper, which now include the latest and other important research results.

Comment 2: Regarding the simulation results, more parameters could be examined and more results presented. For instance, how would an increase in the number of warships affect the results, or how would an increase in the area affect the results? All the simulation details/parameters should be included.

Response: As the reviewer suggested, we have given all the simulation details and results in the revised paper, including the weighting parameters w1=0.25, w2=0.15, w3=0.1, and w4=0.5; the sensor parameters Pd=0.9 and Pf=0.1; the search map parameters τ=0.98, τc=0.9, and τH=0.9; and the cooling parameter T.

Furthermore, we have also compared the proposed algorithm with random search and traversal search in detail, and the statistical results are given in Tables 1 and 2, respectively. The main quantity compared is the number of dynamic and static targets found by the different algorithms in the same amount of time.

In subsection 4.3, we have also provided an analysis of the algorithm parameters.
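To illustrate how the four weights quoted in this response might enter the objective, the following minimal sketch assumes that Equation (16) is a plain weighted sum of four component terms. That form, the function name `weighted_objective`, and the component values are assumptions made for illustration; the record gives only the weights themselves.

```python
WEIGHTS = (0.25, 0.15, 0.1, 0.5)  # w1..w4 as quoted in the response


def weighted_objective(components, weights=WEIGHTS):
    """Combine component terms as sum(w_i * c_i).

    The weighted-sum form is an assumption, not taken from the paper.
    """
    if len(components) != len(weights):
        raise ValueError("expected one component per weight")
    return sum(w * c for w, c in zip(weights, components))


# Hypothetical component values, chosen only to exercise the function.
example_components = (1.0, 0.0, 0.5, 0.2)
value = weighted_objective(example_components)
weight_total = sum(WEIGHTS)
```

The quoted weights sum to 1, so under this assumed form the objective is a convex combination of its components, with w4 carrying half of the total weight.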

Comment 3: Discussion should be made about the time/ energy/complexity of the proposed solution.

Response: Following the reviewer’s suggestion, we have discussed the time complexity and space complexity of the algorithm by adding Remark 1 on page 9 of the revised manuscript.

Comment 4: A comparison with another method/ approach would help the reader to better understand the proposed method.

Response: In this paper, the new results are compared with two other methods (random search and traversal search), and the results are shown in Table 1. We updated the manuscript by adding more details on the simulation results.

Comment 5: There are some spelling and grammar errors.

Response: We thank the reviewer for the careful reading. The spelling and grammar errors we found have been corrected.

Once again, we thank the editor for his time in handling our paper and the reviewers for their time and efforts in reviewing and commenting on our work. We look forward to hearing back from you.

Sincere Regards,

The authors: Wei Yue, Xianhe Guan, Liyuan Wang

Author Response File: Author Response.docx

Reviewer 2 Report

The paper copes with a promising topic and presents some interesting results. Some recommendations to improve the paper quality:

The related work needs to include more up to date papers in the area. Also, more references should be included to cover the research area.

Regarding the simulation results, more parameters could be examined and more results presented. For instance, how would an increase in the number of warships affect the results, or how would an increase in the area affect the results? All the simulation details/parameters should be included.

Discussion should be made about the time/energy/complexity of the proposed solution.

A comparison with another method/approach would help the reader to better understand the proposed method.

Author Response

Responses to 1st round review

Response to Reviewer 2:

We thank the reviewer and are grateful for his thorough reading and highly constructive comments on our paper. Below we handle each of the points the reviewer raised individually and explain how we have adjusted the paper to reflect the suggested refinements. For convenience, we have retyped the comments in italics, followed by our responses in bold. The paper has been revised and edited thoroughly. The modifications in the resubmitted paper are also shown in red font.

Comment 1: This could have been a really nice and interesting paper on an interesting and practically relevant topic. The idea, as far as I understood it, makes sense and could be useful. Unfortunately, the writing / presentation and the evaluation are quite poor. The paper needs a serious re-write, and ideally the authors enlist support from a native speaker.

Response: As the reviewer suggested, the paper has been rewritten with the help of a professional English editing service, and the grammar, spelling, and punctuation have been improved.

Comment 2: Section 2 is really hard to follow, several sentences are just confusing. For example, the meaning of \tau is unclear, similarly the meaning of \delta p_mn(k) and the explanations at the end of page 2.

Response: We thank the reviewer for the careful reading; we have redefined these unclear variables in the revised version. \tau denotes a discount factor, which represents the forgetting rate of the probability map. Only a few grids are accessed at the same time, and changes in the probabilities of these grids affect other grids, so \delta p_mn(k) is defined to represent the change in probability in grid (m,n).
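One way to read "forgetting rate of the probability map" is sketched below: grids that no UAV accesses at a step have their stored value multiplied by \tau, while accessed grids are left for the sensor update. The multiplicative form, the choice to decay only unvisited grids, and the name `decay_unvisited` are assumptions; the record does not reproduce Equation (1).

```python
def decay_unvisited(prob_map, visited, tau=0.98):
    """Return a new map in which every grid not in `visited` is scaled by tau.

    prob_map: list of rows of probabilities; visited: set of (m, n) index pairs.
    The multiplicative decay is an assumed reading of the forgetting rate.
    """
    return [
        [p if (m, n) in visited else tau * p for n, p in enumerate(row)]
        for m, row in enumerate(prob_map)
    ]


prob_map = [[0.5, 0.5], [0.5, 0.5]]
updated = decay_unvisited(prob_map, visited={(0, 0)})
```

With \tau = 0.98 (the value quoted later in this response letter), information about a grid that is never revisited fades slowly, which is what makes returning to old areas worthwhile again.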

Comment 3: You should have made it clearer up-front that you are actually considering UAVs searching a sea area. I was confused about whether your UAVs are airborne or sea-borne for quite a while.

Response: We thank the reviewer for pointing this out. Our UAVs are airborne, and we have added more details describing the environmental model in the revised paper.

Comment 4: I cannot resist commenting that the mathematical typesetting is plain ugly, Word just does not produce nice-looking formulas.

Response: We apologize for neglecting the typesetting; we have re-edited all the formulas in the revised manuscript.

Comment 5: It seems that your algorithm requires that each UAV has access to the “global state” s(k) and u(k) (and all the p_mn(k) values) at any time to evaluate Equation (20) and the equations this is dependent on. This means that the UAVs need to communicate constantly to maintain global agreement on these state values. This should be stated explicitly, and it would be good if some discussion of the ensuing communications cost (required bandwidths) would be included. Also, it would be good to discuss what happens if the sea area is large and connectivity is only intermittent.

Response: The reviewer is right: in our proposed scheme, the UAVs need to communicate constantly to maintain global agreement on the state values. However, communication constraints were not considered in this research; that is, we assume that the communication network is ideal. In practice, communication networks may have constraints such as transmission delays, packet dropouts, and bandwidth limits, and addressing these will be a challenging and interesting topic for future work.

We have rewritten the Conclusion in the revised paper and identified communication constraints as future research work.

Comment 6: It seems that your algorithm “converges” in some sense to a tracking behavior, since in your reward function (Equation (20)) you make it worthwhile to go near places where you already had a discovery. This, coupled with the “cooling down” of the system temperature in Equation (17), makes it unlikely for a UAV to be in an area where the known ships are certain not to be. However, new ships may enter the area. It would be good to include some discussion of this.

Response: The reviewer is right: the proposed algorithm exhibits tracking behavior once a UAV has made a discovery. However, the targets considered in this research are moving. Although the “cooling down” of the system temperature increases the use of learned knowledge, when no target is found at a location the reward falls until the UAV flies away from the area and continues searching the sea area for other possible or new targets.
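The interplay between temperature and exploration discussed in Comment 6 can be illustrated with Boltzmann (softmax) action selection under a cooling schedule. The record does not reproduce Equation (17), so the exponential schedule T(k) = T0 * lambda^k, the initial temperature, the action values, and all function names below are assumptions for illustration only.

```python
import math
import random


def cooled_temperature(t0, lam, k):
    """Hypothetical exponential cooling schedule: T(k) = t0 * lam**k."""
    return t0 * lam ** k


def boltzmann_probs(q_values, temperature):
    """Softmax over action values; a lower temperature gives a greedier choice."""
    q_max = max(q_values)  # subtract the maximum for numerical stability
    weights = [math.exp((q - q_max) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def select_action(q_values, temperature, rng=random):
    """Sample an action index according to the Boltzmann probabilities."""
    probs = boltzmann_probs(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


q = [1.0, 0.5, 0.0]  # hypothetical action values
early = boltzmann_probs(q, cooled_temperature(5.0, 0.9, 0))
late = boltzmann_probs(q, cooled_temperature(5.0, 0.9, 50))
```

Early on, the three actions are chosen with similar probability; after many cooling steps, almost all of the probability mass sits on the highest-valued action. This is the mechanism behind the reviewer's concern that a cooled system rarely visits areas it currently values low, even if new ships could appear there.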

Comment 7: What is a time step?

Response: In order to make the definition clear, we have redefined ‘time step’ in the revised paper as ,

Comment 8: Is the motion model (Equation (7)) standard? If so, a reference would be good.

Response: Yes, the motion model (Equation (7)) is standard; we have added References [4] and [21] in the revised paper.

Comment 9: Some comments on the computational complexity of your algorithm would be good.

Response: Following the reviewer’s suggestion, we have discussed the time complexity and space complexity of the algorithm by adding Remark 1 on page 9 of the revised manuscript.

Comment 10: P7, line 250: how is “the actual distance between UAVs is d” defined?

Response: We thank the reviewer for the careful reading; we have given the definition of d in Equation (8), where d is the Euclidean distance between any two UAVs.
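As a small illustration of the definition given in this response, the sketch below computes the Euclidean distance between every pair of UAVs from two-dimensional positions. The position values and the function names are hypothetical, and the sketch does not reproduce Equation (8) itself.

```python
import math
from itertools import combinations


def uav_distance(a, b):
    """Euclidean distance between two UAV positions given as (x, y) tuples."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def pairwise_distances(positions):
    """Map each index pair (i, j) with i < j to the distance between UAV i and UAV j."""
    return {
        (i, j): uav_distance(positions[i], positions[j])
        for i, j in combinations(range(len(positions)), 2)
    }


# Hypothetical positions of three UAVs.
positions = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
distances = pairwise_distances(positions)
```

A collision-avoidance or spacing constraint could then be checked by comparing the smallest value in `distances` against a chosen safety threshold.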

Comment 11: The comparison metric is not defined at all, I do not know what the values in Table 1 represent. There is no real discussion about the reasons for the differences between the algorithms. The statistical significance of results is unclear.

Response: Following the reviewer’s suggestion, we have redefined the comparison metric and given more details about the simulation results in Part 4.

We have also compared the proposed algorithm with random search and traversal search in detail, and the statistical results are given in Tables 1 and 2. The main quantity compared is the number of dynamic and static targets found by the different algorithms in the same amount of time.

Comment 12: You do not clarify all the parameter values used in the evaluation, e.g. the weights used in Equation (16), or the value of the cooling parameter \lambda in Equation (17).

Response: We thank the reviewer for the careful reading; we have clarified all the parameter values in the revised paper, including the weighting parameters w1=0.25, w2=0.15, w3=0.1, and w4=0.5; the sensor parameters Pd=0.9 and Pf=0.1; the search map parameters τ=0.98, τc=0.9, and τH=0.9; and the cooling parameter T.
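To give a feel for what the quoted sensor parameters imply, the sketch below applies the textbook Bayes rule to the probability that a grid contains a target. It reads Pd as the detection probability and Pf as the false-alarm probability, which is an assumption, and it is not necessarily the update equation used in the paper.

```python
P_D = 0.9  # quoted Pd, read here as the detection probability (assumption)
P_F = 0.1  # quoted Pf, read here as the false-alarm probability (assumption)


def bayes_update(prior, detected, p_d=P_D, p_f=P_F):
    """Posterior probability that a grid holds a target after one sensor reading."""
    if not 0.0 <= prior <= 1.0:
        raise ValueError("prior must lie in [0, 1]")
    if detected:
        like_target, like_empty = p_d, p_f
    else:
        like_target, like_empty = 1.0 - p_d, 1.0 - p_f
    numerator = like_target * prior
    return numerator / (numerator + like_empty * (1.0 - prior))


after_hit = bayes_update(0.5, detected=True)
after_miss = bayes_update(0.5, detected=False)
```

Under these assumptions, a grid with an uninformative prior of 0.5 moves to 0.9 after a single detection and to 0.1 after a single miss, so one observation already changes the map substantially.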

Once again, we thank the editor for his time in handling our paper and the reviewers for their time and efforts in reviewing and commenting on our work. We look forward to hearing back from you.

Sincere Regards,

The authors: Wei Yue, Xianhe Guan, Liyuan Wang

Round 2

Reviewer 1 Report

The paper has certainly improved, the writing became better (but still is not great -- again I suggest to go over it with a native speaker), but it is somewhat disappointing that not all of my concerns have been addressed. In particular, I am still missing some (qualitative at least) analysis of the communications requirements, including a discussion of who updates which parts of the state space and environmental model, what is the data rate and what the allowed delay is.

Other comments:

Are N_t, N_m and N_v known a priori?

Equation (1): what is meant by "no access"?

What is the coordinate system the UAVs are working in? I would assume it is pre-understood?

Is v_i a control knob? If not, why does it feature in the cost in Eq (14)?

Is the area covered by the sensor circular or rectangular?

You said you made 500 replications of each experiment. What are the confidence intervals for the averages reported in the table?

Author Response

Responses to 2nd round review

Response to Reviewer 1:

We thank the reviewer for the time and effort that he has invested in reading and commenting on our work. We have addressed your concerns as per our responses below. The paper has been revised and edited thoroughly, and we used the English editing service provided by MDPI to improve the language of the article. The modifications in the resubmitted paper are also shown in red font.

Comment 1: In particular, I am still missing some (qualitative at least) analysis of the communications requirements, including a discussion of who updates which parts of the state space and environmental model, what is the data rate and what the allowed delay is.

Response: In this research, we focus on a new searching method based on a reinforcement learning scheme, without considering the effect of communication networks.

However, as the reviewer suggested, communication networks may have constraints such as transmission delays, packet dropouts, and bandwidth limits, which may degrade performance. Taking these communication constraints into account to derive effective algorithms for multi-UAV cooperative search is an interesting topic that we intend to investigate in the next step of our research.

Comment 2: Are N_t, N_m and N_v known a priori?

Response: N_v is known a priori; N_m and N_t are unknown. We have described this in the revised paper.

Comment 4: Equation (1): what is meant by "no access"?

Response: "No access" means that no UAV detects grid (m, n).

Comment 5: What is the coordinate system the UAVs are working in?

Response: The UAVs work in two-dimensional coordinates. We have described this in the revised paper.

Comment 6: Is v_i a control knob? If not, why does it feature in the cost in Eq (14)?

Response: v_i is the velocity of the UAV, not the control knob. Eq (14) is about the consumption of time during the task, namely denote the cost of time, where is the distance for the ith UAV move from instant k to .

Comment 7: Is the area covered by the sensor circular or rectangular?

Response: The area covered by the sensor is rectangular, as shown in Figure 1.

Comment 8: You said you made 500 replications of each experiment. What are the confidence intervals for the averages reported in the table?

Response: In the simulation part, we ran each experiment 500 times and calculated the average number of discovered targets. There is no need for statistical confidence or confidence intervals. It is meaningless to consider confidence intervals in a search problem.
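For readers weighing this exchange, the quantity the reviewer asks about is conventionally computed as in the sketch below: an approximate 95% confidence interval for a mean over independent replications, using the normal approximation with z = 1.96. The per-run counts are synthetic values generated only to make the sketch runnable; they are not results from the paper.

```python
import math
import random
import statistics


def mean_confidence_interval(samples, z=1.96):
    """Return (mean, lower, upper) using the normal approximation.

    With hundreds of replications the approximation is usually adequate;
    z = 1.96 corresponds to an approximate 95% interval.
    """
    n = len(samples)
    mean = statistics.mean(samples)
    sem = statistics.stdev(samples) / math.sqrt(n)  # standard error of the mean
    return mean, mean - z * sem, mean + z * sem


# Synthetic stand-in for "targets found" in each of 500 replications.
rng = random.Random(0)
found_per_run = [rng.randint(3, 8) for _ in range(500)]
mean, low, high = mean_confidence_interval(found_per_run)
```

Reporting `low` and `high` next to each average would let a reader judge whether a difference between two algorithms exceeds the run-to-run variation.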

Once again, we thank the editor for his time in handling our paper and the reviewers for their time and efforts in reviewing and commenting on our work. We look forward to hearing back from you.

Sincere Regards,

The authors: Wei Yue, Xianhe Guan, Liyuan Wang

Reviewer 2 Report

It would have been helpful if the authors had provided a point-by-point reference relating the initial version of the paper to the changes they made. It seems that the recommendations from Reviewer 1 were not followed.

Author Response

Responses to 2nd round review

Response to Reviewer 2:

We thank the reviewer for the time and effort that he has invested in reading and commenting on our work. We have responded to the questions of Reviewer 1 one by one.

Once again, we thank the editor for his time in handling our paper and the reviewers for their time and efforts in reviewing and commenting on our work. We look forward to hearing back from you.

Sincere Regards,

The authors: Wei Yue, Xianhe Guan, Liyuan Wang

Round 3

Reviewer 2 Report

Some of the figures are not cited in the text; a general proofread is necessary, and more references would make the paper more interesting.
