Open Access Article

Peer-Review Record

Reinforcement Learning with Multi-Policy Movement Strategy for Weakly Supervised Temporal Sentence Grounding

Appl. Sci. 2024, 14(21), 9696; https://doi.org/10.3390/app14219696

by Shan Jiang¹, Yuqiu Kong²,*, Lihe Zhang³ and Baocai Yin¹

Reviewer 1: Anonymous

Reviewer 2: Anonymous

Reviewer 3: Anonymous

Submission received: 6 August 2024 / Revised: 18 October 2024 / Accepted: 20 October 2024 / Published: 23 October 2024

(This article belongs to the Special Issue AI for Multimedia Information Processing)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

Authors need to present the research purpose more clearly in the introduction.

Authors need to present both the theoretical and practical implications of this work in the conclusion section.

Authors need to improve the presentation of Figure 5. It is somewhat poorly presented.

Authors need to present and explain the tables in more detail. Also, authors need to explain the meaning of the Greek letters more clearly in both the results and method sections.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

Due to the new edition of texts, typographical errors have to be checked. For example, some words are hyphenated (probably because they were on two different lines in the original draft). Examples: L. 34 ac-cess, L. 40 increas-ing.

There are many more all over the text, so it is worth checking it.

The introduction provides a solid background, but there are opportunities to improve by including a more detailed review of the state-of-the-art methods and how they relate specifically to the research problem. This would help to better contextualize the research within the broader field.

While the methods are described in detail, some sections could benefit from clearer explanations of the algorithms and the rationale behind certain design choices. Additionally, including more visual aids or diagrams could help in understanding complex concepts.

The results are presented clearly, but there could be more discussion on the significance of the results in the context of the field. Adding more comparative analysis with existing methods would help to highlight the contributions of the research.

The research introduces some novel aspects, particularly in the reinforcement learning approach combined with weak supervision for temporal sentence grounding. However, the integration of existing methods could be seen as more of an incremental improvement rather than a groundbreaking innovation.

The presentation is generally clear, with structured sections and detailed explanations. However, some areas could benefit from additional clarity, particularly in the explanation of complex methods and results. Visual aids and more concise summaries could enhance understanding.

The research is scientifically sound, with a solid methodological approach and thorough experimentation. The conclusions are well-supported by the results, and the use of reinforcement learning in this context is appropriately justified.

The paper would be of interest primarily to researchers and practitioners in the field of computer vision and machine learning, particularly those working on video analysis and natural language processing. Its appeal might be limited outside these specialized areas.

Overall, the paper presents a well-executed study with some novel contributions. However, its incremental nature and the need for clearer presentation may limit its impact. It is a solid contribution to the field but may not be seen as highly transformative.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

The paper is well organized and easy to read. Here are a few suggestions to improve the paper:

- Ln. 110: Please cite the original references for the datasets used, Charades-STA and ActivityNet-Captions.

- Ln. 157: "Inspired by these works" This part is not needed, because section 2 is about related work; the authors do not need to reaffirm their objective here. The same goes for line 172.

- Ln. 223: All equations are misaligned and broken, which made it difficult to assess their validity; this happens throughout the paper.

- Ln. 442: "GTX 3090" I think you meant "RTX 3090"?

- What does runtime mean in Tables 1, 2 and 6? Is it the time to train the models? Please, clarify.

Comments on the Quality of English Language

In general, the language is good; however, there are a few aspects to improve:

- Ln. 12: The paper has some sentences written in a weird manner. For instance, "Temporal grounding is to identify the target" makes little sense. Please, conduct a language review to solve issues like this.

- Ln. 34: In "provides ac- cess to specific" the hyphenation seems wrong. This happens in various parts of the text as well, so please fix this throughout the paper.

- Ln. 70: "Forth" should be "Fourth" (and other minor typos that appear in various parts of the paper)

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

Authors need to emphasize the theoretical and practical contributions of this work in the conclusion section. The current version is very weak.

Authors still need to improve the visibility of figures.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf
