Article
Peer-Review Record

Cooperative Multi-Agent Reinforcement Learning with Conversation Knowledge for Dialogue Management

Appl. Sci. 2020, 10(8), 2740; https://doi.org/10.3390/app10082740
by Shuyu Lei *, Xiaojie Wang and Caixia Yuan
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 25 March 2020 / Revised: 8 April 2020 / Accepted: 9 April 2020 / Published: 15 April 2020

Round 1

Reviewer 1 Report

This paper proposes a multi-agent dialogue model in which an end-to-end dialogue manager and a user simulator are optimized simultaneously. For the user simulator's reward function, a reward-shaping technique based on adjacency pairs is proposed so that the simulator learns real user behaviors quickly while learning from scratch. In addition, the paper generalizes the one-to-one learning strategy to a one-to-many learning strategy in which the dialogue manager cooperates with several user simulators obtained by varying the adjacency-pair settings.
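
For illustration only, a minimal sketch of what adjacency-pair-based reward shaping for a user simulator could look like is given below. The dialogue acts, pair table, and bonus values are hypothetical assumptions made for this sketch, not the authors' actual implementation.

```python
# Hypothetical adjacency pairs: for a given system dialogue act, the user acts
# that would plausibly follow in a real conversation (assumed examples).
ADJACENCY_PAIRS = {
    "request(date)": {"inform(date)"},
    "request(room)": {"inform(room)"},
    "confirm(booking)": {"affirm", "deny"},
    "greeting": {"greeting", "inform(intent)"},
}

def shaped_reward(env_reward: float,
                  system_act: str,
                  simulator_act: str,
                  bonus: float = 0.5,
                  penalty: float = -0.1) -> float:
    """Add a small shaping term to the environment reward: the simulator is
    rewarded when its act completes an expected adjacency pair with the
    dialogue manager's last act, and mildly penalized otherwise."""
    expected = ADJACENCY_PAIRS.get(system_act)
    if expected is None:
        return env_reward  # no prior knowledge for this system act
    shaping = bonus if simulator_act in expected else penalty
    return env_reward + shaping

# Example: answering a date request with the matching inform act earns a bonus.
print(shaped_reward(0.0, "request(date)", "inform(date)"))  # 0.5
print(shaped_reward(0.0, "request(date)", "inform(room)"))  # -0.1
```

In this reading, varying the contents of the pair table would yield the different user simulators used in the one-to-many learning strategy.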

The paper’s structure is very good, the concept is scientifically sound.

The methods and methodology have been explained clearly, with graphical representations and relevant comparisons.

English grammar must be checked and revised.  

Author Response

We sincerely thank you for your time and effort. We thank the reviewer for pointing out that the English grammar needed revision. After receiving your comments, we corrected these errors in the revised manuscript.

Reviewer 2 Report

The authors presented the use of multi-agent deep reinforcement learning for dialogue management.

They demonstrated an interesting approach based on the latest artificial intelligence solutions to solve a very difficult problem.

Apart from a solid presentation of the mathematical foundations of the solution, the authors proposed a set of tests and comparisons.
In my opinion, the most valuable is the comparison conducted with real users.

My remarks are rather subtle.

Firstly, I do not know to what extent the fact that this system was implemented in Chinese complicates or simplifies the solution of this issue in relation to English. It would be very beneficial to write at least a sentence about it. The presentation of alternative solutions and the description of related work could be more detailed; there are many places in the paper that cite a long list of other papers without detailed comment. As far as the experiments are concerned, I think it would be valuable to present at least one other scenario (besides meeting room booking). Additionally, in the future, you could consider adding a metric for how natural it is to talk with the prepared solution.

I think that the article presents a very high level of quality, and the comments I have outlined do not significantly reduce its value.

Author Response

We sincerely thank you for your time and effort. Please find our responses to the specific comments below.


Point 1: Firstly, I do not know to what extent the fact that this system was implemented in Chinese complicates or simplifies the solution of this issue in relation to English. It would be very beneficial to write at least a sentence about it.

Response 1: We thank you for pointing out this issue. Since the dialogue process in Chinese and English is basically the same, we believe our approach applies equally to English. The only difference is that characters are used as inputs in Chinese, whereas words are used in English. We added the sentence "It is worth noting that our proposed framework can be directly applied to English tasks by substituting English words for Chinese characters as inputs." to the first paragraph of the Experiment section.
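
For illustration only, the input-unit difference mentioned above could be pictured as a simple tokenization switch; the function name and example utterances below are hypothetical and are not taken from the authors' implementation.

```python
def tokenize(utterance: str, language: str) -> list[str]:
    """Split an utterance into the input units assumed in Response 1:
    characters for Chinese, words for English."""
    if language == "zh":
        # Character-level tokens for Chinese (whitespace removed).
        return [ch for ch in utterance if not ch.isspace()]
    # Word-level tokens for English.
    return utterance.split()

print(tokenize("预订会议室", "zh"))           # ['预', '订', '会', '议', '室']
print(tokenize("book a meeting room", "en"))  # ['book', 'a', 'meeting', 'room']
```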

Point 2: The presentation of alternative solutions and the description of related work could be more detailed. There are many places in the paper that cite a long list of other papers without detailed comment. As far as the experiments are concerned, I think it would be valuable to present at least one other scenario (besides meeting room booking).

Response 2: We agree with you and have incorporated this suggestion: the related work section has been expanded in the revised manuscript.

Point 3: Additionally, in the future, you could consider adding a metric about how natural it is to talk with the prepared solution.

Response 3: Thank you for your suggestion. In the dialogue community, there is no objective criterion for evaluating how natural it is to talk with an agent; thus, we conducted experiments with real users. We also believe that an independent paper proposing such a naturalness metric could promote progress in the dialogue area in the future.
