Open Access Article

Peer-Review Record

Human Action Recognition Based on Improved Two-Stream Convolution Network

Appl. Sci. 2022, 12(12), 5784; https://doi.org/10.3390/app12125784

by Zhongwen Wang¹, Haozhu Lu², Junlan Jin¹ and Kai Hu¹,³,*

Reviewer 1: Anonymous

Reviewer 2: Bhupesh Mishra

Submission received: 25 April 2022 / Revised: 19 May 2022 / Accepted: 2 June 2022 / Published: 7 June 2022

(This article belongs to the Special Issue Big Data Analysis and Management Based on Deep Learning)

Round 1

Reviewer 1 Report

In this paper, the authors propose an improved two-stream convolution network. The single-frame recognition mode of the spatial stream is changed to multi-frame image recognition by using a BiGRU network, which addresses the shortcomings of many existing neural networks in perceiving the coherence of action appearance features. The theory of the paper is correct, the structure is rigorous, and the experiments are sufficient.

My suggestions are as follows:

In the Introduction and Section 2.2, please explain why you use the attention mechanism SimAM.
Figures 3 and 6 are classical figures from the original algorithms; please redraw them or cite their sources when you use them in your paper.
In formulas 13–17, please explain the meaning of the left arrow, the right arrow, and the wavy sign over the symbols.
Please check formula 17 (between lines 320 and 321).
In line 387, how many repeated experiments were run to obtain Figure 10?
In Section 4.3, "Results of experiments and analysis", it may be better to divide the content into just three parts: "Ablation experiments", "Comparative experiment", and "Experimental overall analysis".
Applied Sciences is an excellent journal with many excellent related papers; please study and cite them.

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 2 Report

The paper presents an improved version of the two-stream CNN with improved comparative results. However, the following issues need to be addressed in the revised version.

In the abstract (line 5), the authors mention that the paper utilizes the strong mining capabilities of BiGRU. What these strong mining capabilities are and how they have been utilized are not clearly presented in the paper.
In the Introduction (lines 92-93), the authors claim as a contribution that the proposed structure can solve the shortcomings of the original network and opens the possibility of more complex fuzzy recognition tasks. However, there is no clear justification for this claim; it appears to be the authors' assumption rather than a demonstrated contribution.
In Section 2.2 (line 116), the attention mechanism is presented. I suggest presenting it in detail, describing what it is and how it is beneficial.
In Figure 2, three input streams are shown in the spatial stream network. How do these streams differ from one another in feature extraction?
In line 213, the statement beginning "Yang et al. .." is missing its reference.
In line 222, please clarify and explain why the small energy of each neuron is more important.
In Section 4 (line 261), please clarify what the "several independent repeated experiments" are and how they differ from one another.
The authors present five datasets but use only two because of hardware limitations (line 306). What hardware limitation rules out the three remaining datasets?
In line 360, the authors claim a satisfactory result, but it is unclear what level counts as satisfactory.
In the conclusion section, the statement of the paper's contribution is currently weak relative to the summary.

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Round 2

Reviewer 2 Report

The revised version has addressed the issues I have raised.
