Peer-Review Record

Application of YOLO v5 and v8 for Recognition of Safety Risk Factors at Construction Sites

Sustainability 2023, 15(20), 15179; https://doi.org/10.3390/su152015179
by Kyunghwan Kim 1,*, Kangeun Kim 1 and Soyoon Jeong 2
Reviewer 1:
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 24 August 2023 / Revised: 9 October 2023 / Accepted: 20 October 2023 / Published: 23 October 2023

Round 1

Reviewer 1 Report

The authors have employed a machine learning method, specifically YOLO v5 and v8, for risk and safety management at construction sites. The preliminary study established a medium-sized dataset of 10,181 objects from 4,844 images, comprising 2,212 heavy equipment, 5,614 PPE, and 2,355 worker instances. The data size for the initial model is fairly good. The feasibility of using computer vision at construction sites is demonstrated in detail. I would like to recommend the manuscript for publication.

Author Response

Dear Reviewer,

We would like to thank you for your careful review, thoughtful comments, and encouragement on this manuscript. Please note that changes made to reflect the opinions of you and other reviewers are highlighted in red font in the revised manuscript.

Thanks again for your time and help.

Reviewer 2 Report

1. Reduce the length of the title.

2. Can you please list at least two novelties of your proposed approach?

3. Rewrite abstract, give a bit about the methodology and relevant results obtained.

4. "This study aims to improve the object recognition and classification accuracy using 47 two You Only Look Once (YOLO) versions" Can we take tuning parameters' value as substantial improvement? I do not think so.

5. "The findings of this study may facilitate more accurate recognition of 51 situations through videos or photos, which will further enable more effective support for 52 various management tasks at construction sites, including safety management." This does not sound convincing.

6. In Section 1.2, numbering should be like i, ii, iii....

7. Is there any difference between "Existing studies using object detection AI technology are reviewed." and "Background theories relating to object detection AI technology are investigated."?

8. Benchmark dataset is missing.

9. Ensure that all instances of "e.g., i.e., viz., etc., et al." are properly formatted in italics.

10. The related literature is extremely limited. Provide a thorough literature review and gap analysis.

11. Write full-length titles for the figures (Figure 1, 2, and so on).

12. Line 162: Take care with headings and sub-headings; they should not appear alone at the end of a page.

13. Provide a suitable flowchart of the workflow.

14. Provide pseudocode of the algorithm/model used, with the customization that the authors are proposing.

15. In Table 3, the maximum parameter values for "Hyp-med" and "Hyp-high" are the same. How and why?

I strongly believe that the manuscript can be improved further by taking the suggested points into consideration. I suggest the authors invest some more time to refine and revise the manuscript. In its present form, however, the manuscript must be rejected.

Must be revised

Author Response

Dear Reviewer,
We would like to thank you for your careful review and thoughtful comments on this manuscript. Please note that changes made to reflect the opinions of you and other reviewers are highlighted in red font in the revised manuscript. The changes made to reflect your opinion are as follows.

1. Reduce the length of the title.
> The title has been changed to be shorter.

2. Can you please list at least two novelties of your proposed approach?
> First, as explained with the newly added Figure 9 and Table 7, this study has improved accuracy on relatively complex image data compared to previous studies. Second, by applying the same data to YOLO v5 and v8 with the same procedure, it has been confirmed that the results of YOLO v8 are superior. Please refer to the explanation in the last sentence of Section 1.1 and the conclusions (9-iii and 9-v).

3. Rewrite abstract, give a bit about the methodology and relevant results obtained.
> The abstract has been revised to introduce the methodology and the highest test accuracies obtained.

4. "This study aims to improve the object recognition and classification accuracy using 47 two You Only Look Once (YOLO) versions" Can we take tuning parameters' value as substantial improvement? I do not think so.
> As the reviewer is well aware, significant improvements may not be possible by tuning parameter values alone. However, please understand that an improvement of 1-2% over initial results that are already close to or better than the best results of previous studies may be worth reporting. For more discussion, please refer to the newly added Chapter 8 (Discussion).

5. "The findings of this study may facilitate more accurate recognition of 51 situations through videos or photos, which will further enable more effective support for 52 various management tasks at construction sites, including safety management." This does not sound convincing.
> The above sentence has been replaced with another sentence. Please see the last sentence of section 1.1.

6. In Section 1.2, numbering should be like i, ii, iii....
> Based on the reviewer’s opinion, the numbering has been modified to Roman numerals. The numbering in chapter 9 has also been modified to the same format.

7. Is there any difference between "Existing studies using object detection AI technology are reviewed." and "Background theories relating to object detection AI technology are investigated."?
> Since these two sentences cause confusion, 'Existing' has been replaced with 'Previous'. Thus, as the reviewer may well know, a previous study refers to a published research paper. On the other hand, background theories are fundamental concepts in object detection AI technology, such as CNNs, typical object detection models, several concepts of accuracy, and so on.

8. Benchmark dataset is missing.
> Although direct comparison is impossible due to the different data applied, the differences from previous studies are explained in the newly added Discussion chapter and Table 7.

9. Ensure that all instances of "e.g., i.e., viz., etc., et al." are properly formatted in italics.
> Based on the reviewer's opinion, "e.g." and "et al." have been formatted in italics.

10. Related literature is extremely less. Provide thorough literature study and gap analysis.
> Ten references have been added to the text to describe the latest research trends and technologies relevant to this study.

11. Write full length titles for the figures. Like for Figure 1, 2,...and so on.
> Based on the reviewer's opinion, the titles of Figures 1 and 3-5 have been revised to full length. For the sake of brevity, please understand that the word YOLO is retained.

12. line. 162: Take care about headings and sub-headings. These should not come solo at the end of any page.
> I will keep this in mind. As content is being revised, I will make sure to keep it correct in the final version.

13. Provide a suitable flowchart related to the work flow.
> A new flowchart has been added as Figure 7.

14. Provide pseudocode of the algorithm / model used with the customization that authors are proposing.
> Because training was performed using the functions provided by YOLO, it is difficult to present separate pseudocode. Please understand that the newly added flowchart in Figure 7 shows the overall procedure for the customization.

15. In Table 3, the maximum parameter values for "Hyp-med" and "Hyp-high" are the same. How and why?
> As described in Section 6.2.5, YOLO v5 provides three hyperparameter files, each of which has unique preset values for 29 variables. When comparing the Hyp-med and Hyp-high files, there is only one difference, in the 'copy-paste' variable, as shown in Table 3. Please understand that this study mainly applies the default options, since there are too many possible hyperparameter combinations. (A minimal illustrative sketch for comparing the preset files is given below.)
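
For illustration only (this sketch is not part of the manuscript), the preset files could be compared programmatically to list the variables whose values differ. The sketch assumes PyYAML is installed and that local copies of the three files have been downloaded from the Ultralytics repository; the file names follow the repository's naming.

<code>

import yaml  # PyYAML

# Assumed local copies of the three preset files, downloaded from
# https://github.com/ultralytics/yolov5/tree/master/data/hyps
files = ["hyp.scratch-low.yaml", "hyp.scratch-med.yaml", "hyp.scratch-high.yaml"]

presets = {}
for name in files:
    with open(name) as fh:
        presets[name] = yaml.safe_load(fh)

# Print every hyperparameter whose preset value differs between the three files
all_keys = sorted(set().union(*(p.keys() for p in presets.values())))
for key in all_keys:
    values = [presets[name].get(key) for name in files]
    if len(set(values)) > 1:
        print(key, dict(zip(files, values)))

For the file versions used in this study, such a comparison should confirm the statement above that Hyp-med and Hyp-high differ only in the copy-paste value.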

Thanks again for your time and help.

Reviewer 3 Report

The study compared the performance of YOLO v5 and v8 for automatic recognition of safety risk factors at construction sites. The academic contribution of studies based on only one method is limited, and this study is in a similar situation. It would be necessary to seek the best method for solving the related problem; here, however, only the performance of two methods is compared. Therefore, its innovative aspect is weak and the impact of the study on the journal will be low. It is currently in the form of a conference paper and is not at an acceptable level. My other criticisms of the article are as follows:

1) Abstract Section should be supported with numerical information.

2) The structure of the introduction section should be changed. The introduction should begin with a few paragraphs containing general information.

3) There are 40 references in the work, but the Literature section looks weak. The information given about references can be expanded.

4) What is the difference between YOLO v5 and v8? The versions should be compared and explained. More detailed information should be given.

5) Figures should be used to facilitate the understanding of the results.

6) The performance of only two approaches is compared to solve the related problem; therefore, it does not contribute to the literature. The comparison tables should be improved. My suggestion is that the results should also be compared with those of several other AI methods.

7) The interpretation of the tables is insufficient.

8) A discussion section should be created.

Minor editing of English language required.

Author Response

Dear Reviewer,
We would like to thank you for your careful review and thoughtful comments on this manuscript. Please note that changes made to reflect the opinions of you and other reviewers are highlighted in red font in the revised manuscript. The changes made to reflect your opinion are as follows.

1) Abstract Section should be supported with numerical information.
> The abstract has been revised to show numerical information.

2) The structure of the introduction section should be changed. The introduction should begin with a few paragraphs containing general information.
> The beginning of the introduction has been revised to present general information related to construction accidents.

3) There are 40 references in the work, but the Literature section looks weak. The information given about references can be expanded.
> Ten references have been added to the text to describe the latest research trends and technologies relevant to this study.

4) What is the difference between YOLO v5 and v8? The versions should be compared and explained. More detailed information should be given.
> As this paper does not develop a new system but rather utilizes the publicly available YOLO models, the detailed model architecture is omitted. Readers who want to understand the YOLO models in more depth can find the necessary information in the newly cited references at the end of Section 6.1.

5) Figures should be used to facilitate the understanding of the results.
> Figure 7 has been added to show the accuracy improvement process.

6) The performance of only two approaches is compared to solve the related problem; therefore, it does not contribute to the literature. The comparison tables should be improved. My suggestion is that the results should also be compared with those of several other AI methods.
> Table 7 and the newly added Discussion chapter show a comparison to previous studies.

7) The interpretation of the tables is insufficient.
> Based on the reviewer’s opinion, Tables 4, 5, and 6 have been revised.

8) A discussion section should be created.
> A discussion chapter has been added, explaining accuracy improvement features in this study and performance comparison with previous studies.

Thanks again for your time and help.

Round 2

Reviewer 2 Report

The authors still have not responded well to my comments.

For example:

1. Old point no. 12 (line 162): Take care with headings and sub-headings; they should not appear alone at the end of a page. This has now been repeated at line 136.

2. I fail to understand how running a model with preset values only can be considered a value addition to existing work. [Old point 15: In Table 3, the maximum parameter values for "Hyp-med" and "Hyp-high" are the same. How and why?

> As described in Section 6.2.5, YOLO v5 provides three hyperparameter files, each of which has unique preset values for 29 variables. When comparing the Hyp-med and Hyp-high files, there is only one difference, in the 'copy-paste' variable, as shown in Table 3. Please understand that this study mainly applies the default options, since there are too many possible hyperparameter combinations.]

I am still not convinced by the scenario presented, which merely deploys an existing model. This use is very much dated, and the work done here does not yield any substantial outcomes. I recommend a major revision. The authors must devote some more time to the manuscript and work on it further.

English grammar is OK.

Author Response

Dear Reviewer,

All newly revised parts are written in red font. The revised contents and our thoughts are as follows.

1. Yes. We'll be careful about that.

2. The contents of the three default files described in Section 6.2.5 are attached as a file. Only Word or PDF files can be attached, so a PDF file with a screen capture of the files is attached. You can download these files from https://github.com/ultralytics/yolov5/tree/master/data/hyps and open and edit them in a regular text editor.

If no Hyp file is specified when training in YOLO v5, the Hyp-low file is applied by default and the values of the 29 variables set in this file are used during training. The individual values can be checked on the screen displayed after training. To apply the Hyp-med or Hyp-high environment, the corresponding option can be set as an argument to the training function.

For example, to apply the values set in the Hyp-high environment, just add "--hyp ../hyp.scratch-high.yaml" as an argument to the training function. These individual values can again be checked on the screen displayed after YOLO training. The code actually used in Colab for this, the on-screen log, and the values set in hyp.scratch-high.yaml are as follows.

<code>

!python train.py --img {myimg} --batch {myBatch1} --epochs {myEpochs} --data ./MyData/data.yaml --weights yolov5m.pt --cache --patience {myPatience} --cfg ./models/MyCustom_yolov5m.yaml --hyp ./data/hyps/hyp.scratch-high.yaml


<parts of the output log>

train: weights=yolov5m.pt, cfg=./models/MyCustom_yolov5m.yaml, data=./MyData/data.yaml, hyp=./data/hyps/hyp.scratch-high.yaml, epochs=20, batch_size=32, imgsz=608, rect=False, resume=False, nosave=False, noval=False, noautoanchor=False, noplots=False, evolve=None, bucket=, cache=ram, image_weights=False, device=, multi_scale=False, single_cls=False, optimizer=SGD, sync_bn=False, workers=8, project=runs/train, name=myData_resultsSGDmHi3, exist_ok=False, quad=False, cos_lr=False, label_smoothing=0.0, patience=2, freeze=[0], save_period=-1, seed=0, local_rank=-1, entity=None, upload_dataset=False, bbox_interval=-1, artifact_alias=latest


hyperparameters: lr0=0.01, lrf=0.1, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=0.05, cls=0.3, cls_pw=1.0, obj=0.7, obj_pw=1.0, iou_t=0.2, anchor_t=4.0, fl_gamma=0.0, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.9, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, mosaic=1.0, mixup=0.1, copy_paste=0.1


<hyp.scratch-high.yaml>

lr0: 0.01  # initial learning rate (SGD=1E-2, Adam=1E-3)

lrf: 0.1  # final OneCycleLR learning rate (lr0 * lrf)

momentum: 0.937  # SGD momentum/Adam beta1

weight_decay: 0.0005  # optimizer weight decay 5e-4

warmup_epochs: 3.0  # warmup epochs (fractions ok)

warmup_momentum: 0.8  # warmup initial momentum

warmup_bias_lr: 0.1  # warmup initial bias lr

box: 0.05  # box loss gain

cls: 0.3  # cls loss gain

cls_pw: 1.0  # cls BCELoss positive_weight

obj: 0.7  # obj loss gain (scale with pixels)

obj_pw: 1.0  # obj BCELoss positive_weight

iou_t: 0.20  # IoU training threshold

anchor_t: 4.0  # anchor-multiple threshold

# anchors: 3  # anchors per output layer (0 to ignore)

fl_gamma: 0.0  # focal loss gamma (efficientDet default gamma=1.5)

hsv_h: 0.015  # image HSV-Hue augmentation (fraction)

hsv_s: 0.7  # image HSV-Saturation augmentation (fraction)

hsv_v: 0.4  # image HSV-Value augmentation (fraction)

degrees: 0.0  # image rotation (+/- deg)

translate: 0.1  # image translation (+/- fraction)

scale: 0.9  # image scale (+/- gain)

shear: 0.0  # image shear (+/- deg)

perspective: 0.0  # image perspective (+/- fraction), range 0-0.001

flipud: 0.0  # image flip up-down (probability)

fliplr: 0.5  # image flip left-right (probability)

mosaic: 1.0  # image mosaic (probability)

mixup: 0.1  # image mixup (probability)

copy_paste: 0.1  # segment copy-paste (probability)


Since there are so many variables that can be set, as shown above, this study introduced a process to improve accuracy by applying recommended and default values, and as a result, highly accurate results were obtained.
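
For comparison with the v5 command shown above (and purely as an illustrative sketch, not code from the manuscript), the corresponding YOLO v8 training goes through the ultralytics Python package, where hyperparameters are either left at their defaults or passed directly as keyword arguments to train(). The paths and values below are assumptions that mirror the v5 log above.

<code>

# Illustrative sketch only: training YOLO v8 with the ultralytics package,
# using assumed paths/values that mirror the v5 settings logged above.
from ultralytics import YOLO

model = YOLO("yolov8m.pt")        # pretrained medium model
model.train(
    data="MyData/data.yaml",      # same dataset definition file as for v5
    imgsz=608,
    epochs=20,
    batch=32,
    patience=2,                   # early-stopping patience
    # individual hyperparameters (e.g., lr0=0.01, mixup=0.1) could be
    # overridden here instead of editing a separate Hyp file
)
metrics = model.val()             # evaluate on the validation split

Running this sketch would require the ultralytics package (pip install ultralytics) and the dataset referenced by data.yaml.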


To show the superiority of this study's results, they are compared with the results of additional previous studies on construction safety that apply computer vision, as shown in Table 7. Compared to SSD, Faster R-CNN, or self-developed models, as well as other YOLO models, and even compared to previous studies reporting somewhat better results, we believe our results are among the best in terms of reflecting the diversity of classes in a more complex construction site environment.


Thanks again for your time and help.

Author Response File: Author Response.pdf

Reviewer 3 Report

Many suggestions were taken into account by the authors. But the answer given to my most important suggestion is not sufficient. Does the method you use provide the best results for this problem? According to Table 7, this is not the case. As I stated before, the study does not offer any innovation. However, you can also add the results obtained with different deep learning methods to Table 7. You must prove that your work is successful and contributes to the literature.

Author Response

Dear Reviewer,

All newly revised parts are written in red font. The revised contents and our thoughts are as follows.

To show the superiority of this study's results, they are compared with the results of additional previous studies on construction safety that apply computer vision, as shown in Table 7. Compared to SSD, Faster R-CNN, or self-developed models, as well as other YOLO models, and even compared to studies reporting somewhat better results, we believe our results are among the best in terms of reflecting the diversity of classes in a more complex construction site environment.

This study uses the open-source code as is, but changes the default options in a way that is recommended by the developers, since the option combinations available to users are almost infinite. So, as the reviewer pointed out, this is an experimental study utilizing an existing program rather than an innovative study proposing a new method. However, we believe that the concept of major hyperparameters applied in YOLO and the accuracy improvement process introduced in this study will be helpful to other future researchers applying the YOLO model.

Thanks again for your time and help.

Round 3

Reviewer 3 Report

It is at an acceptable level.
