Article
Peer-Review Record

An Unsafe Behavior Detection Method Based on Improved YOLO Framework

Electronics 2022, 11(12), 1912; https://doi.org/10.3390/electronics11121912
by Binbin Chen 1, Xiuhui Wang 1,*, Qifu Bao 2,*, Bo Jia 2, Xuesheng Li 2 and Yaru Wang 2
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 19 April 2022 / Revised: 15 June 2022 / Accepted: 17 June 2022 / Published: 20 June 2022
(This article belongs to the Section Artificial Intelligence)

Round 1

Reviewer 1 Report

The paper is well written. The proposed method is clearly described, and its performance has been evaluated appropriately by experiments comparing it with conventional methods.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

This paper proposes a new recognition network, YOLO-AW, to detect smoking and helmet-wearing behaviors. It improves the YOLOv5 framework by introducing a novel adaptive attention embedding model and a new weighted feature pyramid network module. The experiment results show that the average accuracy of YOLO-AW is increased by 3% compared with the traditional target detection frameworks.

This paper is well-written and easy to follow.  

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

The authors present a technique for detecting unsafe behaviour of operators in their workplaces, based on the YOLOv5 framework. Regarding this paper, I would like to make the following remarks.

The abstract should be a total of about 200 words maximum.

Some typos should be checked throughout the text; e.g. the sentence in lines 19-20 could suggest that “early warning of safety accidents” constitutes “the problem”; or lines 21-22 could suggest that “operators’ ... standardized wearing of safety helmets” results in accidents.

Acronyms/Abbreviations/Initialisms should be defined the first time they appear in each of three sections: the abstract; the main text; the first figure or table. E.g. the term “self-attention embedding (ASAE)” is defined three times: in lines 74, 89-90 and 107.

Moreover, after being defined, the acronym/abbreviation/initialism should be used instead of the written-out form. E.g. the authors used “self-attention embedding” instead of “ASAE” in lines 214 and 240.

To preserve the readability of the paper, I recommend that the authors use the same identifiers in both Figure 2 and its explanation in lines 121-126. E.g. they should use C3 (instead of Fin) as the system input, and M3 (instead of Fout) as the system output.

In order to keep consistency, the ending commas of equations (4) to (12) should be suppressed.

In Figure 3, the left-bottom box should be labelled as C3, instead of C3.

In the Experiments section, a description of the vision equipment used for acquiring the images is missing.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 4 Report


The manuscript is well written and easy to follow.

There are, however, some minor linguistic problems and typos that should be corrected. For example, in lines 136-137 there is no verb in the sentence "As shown in Figure 4, the bottom-up fusion method". Also, in line 190, "Since there in open-accessed dataset" should become "Since there is no open-access dataset".

Minor comments:
1) In line 41, the authors mention "artificially designed features". Strictly speaking, all features are artificially designed, whether they are hand-crafted or automatically generated by deep learning models. This needs to be rephrased.

2) It is not clear as described in lines 68-70 what the drawback of R-CNN-based methods is. Please rephrase it.

3) At the beginning of Section 2, it should be mentioned what the input is and what its dimensions are.

4) The font size in figures is very small and very difficult to read, especially in a printed copy of the manuscript.

Major comments:
1) As I understand it, the reported results correspond to a single experiment in which the data are split into training and testing sets (90%-10%). These results might be misleading, since a different split might lead to different results. A 10-fold cross-validation scheme should be performed, and the authors should report results averaged across all folds.

2) The proposed method achieves higher performance than the other methods tested. Is this improvement significant or did it happen by chance? Statistical tests should be performed.

3) Although there are no other datasets for detecting unsafe behavior in industrial sites, the methodology presented here is quite general. Therefore, I think it is very important that it be evaluated on other publicly available behaviour recognition datasets. Otherwise, its impact would be very limited (detecting unsafe behavior related solely to helmets and smoking).
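
The 10-fold cross-validation requested in major comment 1 could be organised roughly as in the following sketch. It is only an illustration, not the authors' pipeline: `train_and_score` is a hypothetical callback that would train a detector on the training indices and return a single score (for example mAP) on the held-out indices, and the shuffling seed is an arbitrary assumption.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding by k gives fold sizes that differ by at most one sample.
    return [indices[i::k] for i in range(k)]


def cross_validate(n_samples, train_and_score, k=10, seed=0):
    """Run k-fold cross-validation and return (mean score, per-fold scores).

    train_and_score(train_idx, test_idx) is a hypothetical user-supplied
    function: it trains on train_idx and returns one score on test_idx.
    """
    folds = k_fold_indices(n_samples, k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        # Every fold except the i-th one is used for training.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        scores.append(train_and_score(train_idx, test_idx))
    return mean(scores), scores
```

Each sample is held out exactly once, so the averaged score no longer depends on one particular 90%-10% split.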

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 4 Report

The authors addressed all my comments except my second major comment, in which I suggested using statistical tests to indicate whether the performance improvement is significant.

Although it is not a red-line comment, I strongly suggest that the authors perform these tests. It is important to know whether the proposed method is better than the other methods it is compared against, and this can only be established by performing statistical tests (e.g. a paired t-test).
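
As a minimal sketch of the suggested paired t-test, the statistic can be computed from matched scores of two methods, for instance one score per cross-validation fold for each method. The function name and its inputs are assumptions made for illustration; no values from the manuscript are used.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees of freedom) for a paired t-test on matched scores.

    scores_a[i] and scores_b[i] must come from the same fold or split.
    """
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two equally long lists with at least 2 pairs")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    sd = stdev(diffs)  # sample standard deviation of the paired differences
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean(diffs) / (sd / sqrt(len(diffs)))
    return t, len(diffs) - 1
```

With ten folds there are 9 degrees of freedom, and the difference is significant at the two-sided 5% level when |t| exceeds about 2.262. If SciPy is available, `scipy.stats.ttest_rel` returns the same statistic together with a p-value.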

Author Response

Please see the attachment.

Author Response File: Author Response.pdf
