Deep Reinforcement Learning for Autonomous Driving with an Auxiliary Actor Discriminator
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
This study introduces an autonomous driving technique based on a deep reinforcement learning algorithm with an auxiliary actor discriminator.
The authors combine the soft actor-critic algorithm with the state attention network and the auxiliary actor discriminator for intelligent robotic path planning and obstacle avoidance.
The evaluation results suggest the feasibility and effectiveness of the proposed solution.
Broadly speaking, the work appears interesting and important.
However, the reviewer believes that this research has several issues that need to be clarified, improved, and addressed.
Specifically, the reviewer has the following concerns and suggestions:
1. How is the accuracy of the developed auxiliary actor discriminator ensured? How does the precision of the auxiliary actor discriminator impact the performance of the policy model?
2. The authors are required to provide an introduction to the networks used in this paper, including the number of hidden layers, the number of neurons per layer, and the dimensions of each network's input and output.
3. The critical hyperparameters of the proposed algorithm need to be introduced.
4. How many random seeds were employed for training the policy models in each approach?
5. I recommend that the authors thoroughly review research conducted in the last one or two years on topics related to reinforcement learning-based autonomous vehicles, such as fear-neuro-inspired reinforcement learning for autonomous driving and trustworthy reinforcement learning for autonomous driving.
Comments on the Quality of English Language
The English writing throughout the paper could be polished to enhance readability.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
This paper presents collision-free autonomous navigation of a mobile robot based on Reinforcement Learning (RL) using a State Attention Network (SAN), an auxiliary actor discriminator (AAD), and heuristic knowledge (HK). In general, the paper is well written. The paper's organization looks fine, with some improvements required in the Results discussion and the Conclusions. The authors should consider the following comments for the paper's improvement.
Note: Lxx => line number xx
a) The authors could add 1-2 lines that best summarize the work in the Abstract.
b) It would be better to cite [3-7] individually for the corresponding methods in L29-L30 of the Introduction.
c) The last paragraph of the Introduction section should provide an overview of the following sections of the manuscript.
d) In L137, please also provide the type of CPU and amount of RAM of the computer used for training the model.
e) It is nice to see the performance in terms of success percentage of total trials.
f) The authors should provide a comparison with similar methods in literature or else argue why the comparison is not possible.
g) I would like more discussion about failure cases in the Results discussion.
h) The future perspectives of the current work should be discussed in the conclusion section.
i) Citations 31 and 32 are the same, which should be fixed.
j) I would like to see the experiment in a natural real-world scenario and the source code open-sourced.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Round 2
Reviewer 2 Report
Comments and Suggestions for Authors
The authors have addressed my concerns; some responses could have been more satisfactory, but they are acceptable. However, I still want more discussion of failure cases in the Results discussion, which could add value to this paper. The authors may consider the following comments for the paper's improvement.
a) The term "gap-based" is unclear. Does it mean collision-free space?
b) In Section 2 (Materials and Methods), it is recommended to clarify the methods used for the goal-directed and gap-based navigation strategies.
c) It is recommended to add a few lines about failure cases.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf