Using a High-Precision YOLO Surveillance System for Gun Detection to Prevent Mass Shootings
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
This paper examines YOLO’s capabilities for gun detection in video surveillance. It makes certain contributions of practical value to public safety, and the amount of work presented is decent, but the presentation itself needs improvement. Therefore, I recommend major revisions throughout the paper on the following issues:
- Although the study itself is presented adequately clearly, the writing style and tone of the paper are off for an academic publication. It has an almost “educative” tone, with many second-person pronouns, and reads like a lecture in many sections, with excessive descriptions of basic concepts. This should be corrected.
- You used multiple versions of YOLO; why did you pick YOLOv8 for your title?
- Line 57 “In conclusion”: Avoid conclusive terms in the introduction; use “To summarize,” etc.
- Fig. 2: The workflow conveyed is not clear in this image. Improve the image quality by being more explicit and specific so it has more readability. For example, “Store on Website”, store what? “Using Drone and Surveillance Cameras”, to do what? “Result”, of what? Readers should be able to see directly from the figure since that is the point of a workflow diagram.
- Figs. 4, 5, 6, etc.: ALL figures and tables should be cited within the text, but I could not find where these are cited. Check the others too.
- Fig. 4, Table 1: These are well known concepts and do not need to be explained.
- Section 3.2: Too lengthy. A brief definition of the metrics you used and the reasons for choosing them would be enough.
- Section 3.3: More details or maybe a little statistic of the dataset would be beneficial. For example, what are the original resolutions? Are the guns displayed isolated in the image or are they held by people?
- Line 318: Irrelevant information.
- Fig. 12, etc.: You show Precision-Recall curves for many of your results and figures, and they look almost identical; why don’t you use the AUC metric?
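When several PR curves look nearly identical, summarizing each as a single scalar makes the comparison concrete. A minimal sketch of average precision (AP), the usual scalar summary of a PR curve; the labels and confidence scores below are hypothetical:

```python
def average_precision(y_true, scores):
    """Summarize a precision-recall curve as one number (AP).

    y_true: list of 0/1 ground-truth labels.
    scores: detector confidence for each sample.
    """
    # Rank predictions by descending confidence.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    total_pos = sum(y_true)
    tp = 0
    ap = 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            tp += 1
            # AP accumulates precision at each recall step (each true positive).
            ap += tp / rank
    return ap / total_pos

# Hypothetical detections: three guns, two background frames.
print(average_precision([1, 1, 0, 1, 0], [0.9, 0.8, 0.7, 0.6, 0.5]))  # ≈ 0.917
```

Two models whose PR curves look the same to the eye can then be ranked directly by their AP values.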
- Fig. 13, etc.: Is this figure cited?
- Editing: I have to ask, why are the figures and tables not placed close to where they are cited? They currently seem to be placed arbitrarily, and the readability is poor, which makes the paper look like it was rushed to be finished.
- Tables: Use academic-style tables.
Author Response
We are incredibly grateful to our reviewers for their valuable comments, which allow us to improve the quality of our manuscript. Our responses to the reviews are attached as a PDF.
Author Response File: Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
This paper looks into using different YOLO object detection models to spot firearms in surveillance footage. The idea is to help prevent mass shootings. The authors compare several YOLO versions (v5 to v11), describe their system setup in detail, and show a live, web-based interface designed for edge devices. The topic is relevant, and the approach is based on solid machine learning and computer vision methods. That said, a few sections need to be clearer and more thorough.
1 The authors say the paper is novel because it evaluates the latest YOLO versions (v9–v11) and uses a web-based edge system. But to back that up, they should explain more clearly how v9–v11 improve on v8 — is it just better mAP, or are there deeper differences? Also, how is their Streamlit setup different from others? Without this context, the novelty isn’t very convincing.
2 Table 4 shows all models have a 100% false positive rate and can’t identify any non-gun images correctly. That’s a serious issue for real-world use. The authors need to dig deeper into why this happened. Were there too few background-only images (only 93)? Were augmentations applied evenly across all classes? Would adding more negative examples help?
3 They only used a holdout split for testing. That’s not ideal. Using k-fold cross-validation would give more reliable results and help check if the models are overfitting.
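The holdout-versus-cross-validation point can be illustrated with a minimal k-fold split. This is a generic sketch, not the authors' pipeline; `kfold_splits` is a hypothetical helper:

```python
import random

def kfold_splits(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)       # shuffle once, reproducibly
    folds = [idx[i::k] for i in range(k)]  # round-robin fold assignment
    for i in range(k):
        test = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, test

# Every sample serves as test data exactly once across the k folds,
# so metrics can be reported as mean ± std instead of a single holdout number.
for train_idx, test_idx in kfold_splits(100, k=5):
    print(len(train_idx), len(test_idx))  # 80 20, five times
```

Large variance in per-fold mAP would be a direct signal of overfitting that a single holdout split cannot reveal.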
4 There’s some inconsistency around YOLO versions. YOLOv12 is mentioned in the methods, but it doesn’t show up in the results. Also, the paper doesn’t describe what makes each YOLO version different. A quick summary of version-to-version changes would help readers follow along.
5 The paper reports mAP, precision, and recall in one table, but accuracy (from the confusion matrix) in another. These should be shown together or clearly explained as separate views. Also, the accuracy numbers (~0.48) are misleading due to class imbalance: most images are positives. Instead, they should use balanced accuracy or something like the Matthews correlation coefficient to paint a more honest picture.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 3 Report
Comments and Suggestions for Authors
The authors proposed a method using a YOLOv8 surveillance system for gun detection to prevent mass shootings. The topic is interesting; after reviewing the paper, I have the following comments.
[1] A survey of the number of past mass-shooting incidents could be included in Section 1 to show the importance of this research topic.
[2] The survey of related work in Section 2.1 could be presented in the form of a table with reference numbers and the advantages and limitations of each method.
[3] The fonts for the labels in Fig. 2 should be enlarged for a better presentation.
[4] For the section of method and implementation, the fonts for the labels in Fig. 3 should be enlarged for a better presentation.
[5] For Section 3.1, You Only Look Once: Unified, Real-Time Object Detection, equation numbers should be added to the equations.
[6] For the Section 3.2 performance metrics, the authors should also explain why Eqs. (1) to (7) are suitable performance measurements for this particular problem. Is there any other performance index that can serve the same purpose?
[7] The fonts for the labels in Fig. 7 should be enlarged for a better presentation.
[8] For the Section 3.3 Dataset, the authors should describe more about the distribution of the dataset and explain why the split was 82%-14%-4% training-validation-testing.
[9] The font labels in Figs. 10, 11, 12, and 13 are too small to read; they should be enlarged for better presentation.
[10] The authors should explain the findings of this study in Section 4.4, Testing Surveillance Capabilities. Can the methods perform well when objects are partially obscured?
[11] The authors should extend the conclusion (Section 5) to describe more about the original contributions of the paper.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 4 Report
Comments and Suggestions for Authors
Brief summary: This interesting study focused on using the Roboflow dataset to assess the classification performance of a YOLO (SSD+CNN) based algorithm. Please find my comments below.
Introduction:
- Line 24 (700+ events in 2023, or all years combined?): please clarify.
- One major comment is that most of the introduction is currently based on a small subset of American grey literature rather than peer-reviewed documents and studies. Considering the number of peer-reviewed studies available on this topic, the authors are encouraged to revise the state-of-the-literature portion of their introduction to enhance clarity for the readership.
- The context and main concepts of this study are barely explored in the introduction. The authors used half the introduction to state what they are going to do rather than justifying the need for and advancements provided by such studies (e.g., lines 49 to 70 belong in Materials and Methods as per MDPI's guidelines).
- Lines 76-80: how was the literature review conducted? No strategy or scientific method (such as PRISMA or other well-recognized approaches) is mentioned. What databases were explored? Please clarify.
- The problem being tackled by this study is unclear, as is its justification.
- The aim of the study and the hypotheses are not explicit. The authors are encouraged to follow the usual introduction structure of original research papers: a few paragraphs about the main concepts, one paragraph on the problem being tackled, and one paragraph on objectives and hypotheses.
Methods:
- Figures 2 and 3 are unreadable (the font is too small).
- No guidelines on AI-reporting were used in this study.
- Algorithms are introduced in great detail, but the justification for choosing them over others remains difficult to appreciate.
- The Roboflow dataset is only briefly detailed in the methods. Please provide a table with the main characteristics found in this dataset.
- The training-testing split is very unusual. Please use a reference from the literature to justify the 82-14-4 approach.
- There is almost nothing mentioned about data analysis. There are long explanations of well-known AI metrics, but nothing is mentioned as to what, precisely, will be analysed and how in this version of the methodology.
Results
- Figure 10 is unreadable.
- Most of the results seem to be overfitted, and this might be due to the 82-14-4 split. Please provide the hyperparameters of the YOLO algorithms.
- Figures 14 and 15 are also unreadable.
Discussion:
- The discussion does not tie at all the results with the current state of the literature.
Minor comment:
- Sentence structures are difficult to follow in this paper (e.g., lines 180-183), with random capital letters after ‘;’. Please revise accordingly.
- Part of the paper appears to be written by AI. Please state such use accordingly (e.g., acronyms redefined many times, long hyphens, AI-specific words, etc.).
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
The authors have carefully revised the paper, and I recommend its publication.
Author Response
Thanks for your comment.
Reviewer 2 Report
Comments and Suggestions for Authors
All my comments/concerns have been addressed.
Author Response
Thank you for your comment.
Reviewer 4 Report
Comments and Suggestions for Authors
Brief summary: This is a second-round revision of this manuscript. The authors have made several changes as per the comments on the first version. Please see my comments below.
Materials and methods:
1. The main comment that remains to be addressed is about the clarity of the methodology. The authors are encouraged to follow the scientific approach rather than a narrative approach (“We did this, then…”, etc.). What I mean is that it should be clear what data were used, why, how the data were analysed, and what constitutes a significant result. In its current state, the study does not make it possible for the readership to follow this information sufficiently to grasp the vast amount of results being reported.
2. To account for my previous comment, the authors are encouraged to use a clear AI-reporting guideline (CONSORT-AI), which is the standard in the field.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Round 3
Reviewer 4 Report
Comments and Suggestions for Authors
The authors responded and improved their manuscript as per the previous comments.