Audiovisual Gun Detection with Automated Lockdown and PA Announcing IoT System for Schools
Abstract
1. Introduction
- Audiovisual detection: Several software solutions [4,5,6,7,8,9] integrate with existing security cameras to monitor live video feeds and detect firearms. An advantage of this approach is that a gun can be detected before the first shot is fired. However, visual detection requires a clear line of sight from the camera to the firearm, so its effectiveness is limited by camera placement, angle, lighting conditions, and occlusion. In contrast, a gunshot can be heard and detected without a line of sight, which makes sound-based systems [10,11,12,13,14,15,16] effective in a wider range of environments. Their shortcoming is that the gun is detected only after the first shot. The proposed system uses both image and microphone sensors. It attempts to detect a gun from camera images before any shot is fired; if the gun is missed because of occlusion or poor lighting, it is detected from the gunshot sound when the first shot is fired, which makes the detection more robust.
- Auto lockdown: The proposed system automates the lockdown process by sending commands to the classroom door locks as soon as a gun is detected, which saves time and effort in implementing the ALICE protocol.
- Auto PA announcement: The system automatically announces over the PA system where the gun was detected and advises people to move away from that area. For instance, if the gun is detected in the northside hallway, people outside the classrooms are advised to move toward the south. Making such announcements is current practice in schools, and automating them can save time and lives.
- Privacy: In [4], camera images are continuously transmitted to cloud-based servers for primary classification, enabling persistent third-party access and raising privacy concerns. In contrast, the proposed system performs all image and audio classification locally within the private network. Images are sent to an external vision model only as a secondary verification step and only when the local classifier detects a potential firearm with high confidence. This event-driven transmission is infrequent and limited to high-risk scenarios and contains the same visual information that is provided to first responders for situational awareness. Therefore, the system does not enable continuous external surveillance and preserves user privacy.
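
The detection, lockdown, and announcement behavior described in the list above can be sketched as plain decision logic. This is a minimal illustration, not the system's actual implementation: the zone names, the `SAFE_DIRECTION` mapping, the door identifiers, and the `locks/<door>/cmd` topic layout are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical mapping from the zone where a gun is detected to the
# direction people outside the classrooms should move toward.
SAFE_DIRECTION = {"north": "south", "south": "north", "east": "west", "west": "east"}


@dataclass(frozen=True)
class Detection:
    zone: str    # e.g. "north" for the northside hallway
    source: str  # "image" or "sound"


def fuse(zone: str, image_hit: bool, sound_hit: bool) -> Optional[Detection]:
    """Combine the two sensors: an image hit can occur before any shot,
    and the gunshot sound acts as the fallback when the camera misses."""
    if image_hit:
        return Detection(zone, "image")
    if sound_hit:
        return Detection(zone, "sound")
    return None


def plan_response(detection: Detection, door_ids: List[str]) -> Tuple[List[Tuple[str, str]], str]:
    """Build (topic, payload) lock commands and the PA announcement text.
    The topic layout here is an assumption for illustration only."""
    lock_commands = [(f"locks/{door}/cmd", "LOCK") for door in door_ids]
    safe = SAFE_DIRECTION[detection.zone]
    announcement = (
        f"Gun detected in the {detection.zone} hallway. "
        f"Move toward the {safe}."
    )
    return lock_commands, announcement


detection = fuse("north", image_hit=False, sound_hit=True)
if detection is not None:
    commands, message = plan_response(detection, ["room101", "room102"])
    print(commands)
    print(message)
```

Keeping the fusion step separate from the response planning means either sensor can trigger the same lockdown and announcement path.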
2. Related Works
2.1. Commercial Products
2.2. Published Literature
2.2.1. Image-Based Gun Detection
2.2.2. Audio-Based Gunshot Detection
2.2.3. Audiovisual Gun Detection
3. Materials and Methods
3.1. Image-Based Gun Detection Using Deep Learning
3.1.1. Custom Dataset Creation
3.1.2. Residual Separable Convolutional Neural Network Architecture
- Input and Normalization: The model accepts input images of size 150 × 200 × 3 (height, width, channels). A Rescaling layer normalizes pixel intensities to the [0, 1] range by dividing them by 255, which improves numerical stability and facilitates faster convergence during training.
- Residual Separable Convolutional Blocks: The model includes three progressively deeper residual blocks, each combining separable convolutions with residual connections [29,30]. These residual connections preserve low-level information and mitigate vanishing gradients, resulting in more stable optimization. Although the input image is initially normalized by a Rescaling layer, batch normalization is still applied inside each block because the internal feature distributions continue to shift as they pass through multiple nonlinear layers. Batch normalization in these deeper layers reduces internal covariate shift, stabilizes activation statistics, allows the use of higher learning rates, and helps gradients propagate more reliably through the network.
- Final Feature Extraction Layer: After the last residual block, a SeparableConv2D layer with 256 filters further enriches the learned feature representations. Batch normalization and ReLU activation are applied afterward.
- Global Feature Aggregation and Dropout: A GlobalAveragePooling2D layer reduces each feature map to a single representative value, producing a compact 256-dimensional vector. A dropout layer is applied to reduce overfitting by preventing the model from relying too heavily on specific activations.
- Output Layer: A Dense layer with sigmoid activation outputs the probability that the input image contains a gun. Values greater than or equal to 0.5 indicate the gun class, while values below 0.5 indicate the non-gun class.
- Loss Function and Optimizer: The model is trained using binary cross-entropy loss, appropriate for two-class classification tasks. The Adam optimizer with a learning rate of 1 × 10^−5 is used.
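
The input and output conventions in the list above (rescaling by 255, global average pooling, the 0.5 sigmoid threshold, and binary cross-entropy) can be illustrated without any deep learning framework. The following is a framework-free sketch for intuition only; the function names are hypothetical, and it assumes the gun class is labeled 1, consistent with outputs of 0.5 or more indicating a gun.

```python
import math


def rescale(pixels):
    """Map 8-bit intensities to the [0, 1] range, as a Rescaling(1/255) layer does."""
    return [p / 255.0 for p in pixels]


def global_average_pool(feature_maps):
    """Reduce each 2-D feature map (a list of rows) to its mean value,
    giving one number per channel."""
    pooled = []
    for fmap in feature_maps:
        total = sum(sum(row) for row in fmap)
        count = sum(len(row) for row in fmap)
        pooled.append(total / count)
    return pooled


def sigmoid(z):
    """Squash a real-valued score into a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def classify(probability, threshold=0.5):
    """Apply the decision rule: >= 0.5 is the gun class, otherwise non-gun."""
    return "gun" if probability >= threshold else "non-gun"


def binary_cross_entropy(y_true, p, eps=1e-7):
    """Loss for one sample; y_true is 1 for gun (assumed labeling), 0 for non-gun.
    The probability is clipped to avoid log(0)."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))


print(rescale([0, 128, 255]))
print(global_average_pool([[[1, 3], [5, 7]]]))
print(classify(sigmoid(2.0)), binary_cross_entropy(1, sigmoid(2.0)))
```

With 256 feature maps after the final SeparableConv2D layer, the pooling step would return a 256-element vector, matching the description above.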
3.1.3. Model Training
3.1.4. Verification with OpenAI
3.2. Sound-Based Gunshot Detection Using Deep Learning
4. Prototype Development
4.1. Audio-Visual Gun Detection and MQTT Broker
4.1.1. MQTT Broker Server
4.1.2. Gun Detection from Image
4.1.3. Gunshot Detection from Sound
4.2. Smart Lock
4.2.1. Hardware
4.2.2. Firmware
4.3. IoT Connected PA
4.3.1. Hardware
4.3.2. Firmware
4.4. Smartphone Application
4.4.1. MQTT Connectivity and System Monitoring
4.4.2. Control of Smart Locks and PA Devices
4.4.3. Gun Image and Gunshot Event Notifications
4.4.4. Navigation Assistance
5. Results
5.1. Deep Learning Model Results
5.1.1. Deep Learning Model Result for Image Based Detection
5.1.2. Deep Learning Model Result for Sound Based Detection
5.2. Prototype Results
5.2.1. Components of the Proposed System
5.2.2. Image-Based Gun Detection Testing Results
5.2.3. Sound-Based Gunshot Detection Testing Results
6. Discussion
7. Conclusions
8. Patents
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Gunfire on School Grounds in the United States. Available online: https://everytownresearch.org/maps/gunfire-on-school-grounds/ (accessed on 4 December 2025).
- Active Shooter Notification Time Costs Lives. Available online: https://guard911.com/active-shooter-notification-time-costs-lives/ (accessed on 4 December 2025).
- ALICE Active Shooter Response Training. Available online: https://www.alicetraining.com/ (accessed on 2 December 2025).
- ZeroEyes. Available online: https://zeroeyes.com/ (accessed on 29 October 2024).
- Jain, A.; Garg, G. Gun Detection with Model and Type Recognition using Haar Cascade Classifier. In 2020 Third International Conference on Smart Systems and Inventive Technology (ICSSIT), Tirunelveli, India; IEEE: New York, NY, USA, 2020; pp. 419–423.
- Alaqil, R.M.; Alsuhaibani, J.A.; Alhumaidi, B.A.; Alnasser, R.A.; Alotaibi, R.D.; Benhidour, H. Automatic Gun Detection from Images Using Faster R-CNN. In 2020 First International Conference of Smart Systems and Emerging Technologies (SMARTTECH), Riyadh, Saudi Arabia; IEEE: New York, NY, USA, 2020; pp. 149–154.
- Debnath, R.; Bhowmik, M.K. Automatic Visual Gun Detection Carried by A Moving Person. In 2020 IEEE 15th International Conference on Industrial and Information Systems (ICIIS), Rupnagar, India; IEEE: New York, NY, USA, 2020; pp. 208–213.
- Mehta, P.; Kumar, A.; Bhattacharjee, S. Fire and Gun Violence based Anomaly Detection System Using Deep Neural Networks. In 2020 International Conference on Electronics and Sustainable Communication Systems (ICESC), Coimbatore, India; IEEE: New York, NY, USA, 2020; pp. 199–204.
- Goenka, A.; Sitara, K. Weapon Detection from Surveillance Images using Deep Learning. In 2022 3rd International Conference for Emerging Technology (INCET), Belgaum, India; IEEE: New York, NY, USA, 2022; pp. 1–6.
- AmberBox. Available online: https://amberbox.com/ (accessed on 29 October 2024).
- Lopez-Morillas, J.; Canadas-Quesada, F.J.; Vera-Candeas, P.; Ruiz-Reyes, N.; Mata-Campos, R.; Montiel-Zafra, V. Gunshot detection and localization based on Non-negative Matrix Factorization and SRP-Phat. In 2016 IEEE Sensor Array and Multichannel Signal Processing Workshop (SAM); IEEE: New York, NY, USA, 2016; pp. 1–5.
- Valenzise, G.; Gerosa, L.; Tagliasacchi, M.; Antonacci, F.; Sarti, A. Scream and gunshot detection and localization for audio-surveillance systems. In 2007 IEEE Conference on Advanced Video and Signal Based Surveillance; IEEE: New York, NY, USA, 2007; pp. 21–26.
- Bajzik, J.; Prinosil, J.; Koniar, D. Gunshot Detection Using Convolutional Neural Networks. In 2020 24th International Conference Electronics; IEEE: New York, NY, USA, 2020; pp. 1–5.
- Morehead, A.; Ogden, L.; Magee, G.; Hosler, R.; White, B.; Mohler, G. Low Cost Gunshot Detection using Deep Learning on the Raspberry Pi. In 2019 IEEE International Conference on Big Data (Big Data); IEEE: New York, NY, USA, 2019; pp. 3038–3044.
- Khan, T.H. A deep learning-based gunshot detection IoT system with enhanced security features and testing using blank guns. Internet Things (IoT) 2025, 6, 5.
- ShotSpotter. Available online: https://www.soundthinking.com/law-enforcement/gunshot-detection-technology (accessed on 31 December 2025).
- Chinnasamy, P.; Sivakrishnaiah, C.; Sathiya, T.; Alam, I.; Degala, D.P. Design and Implementation of an IoT-based Emergency Alert and GPS Tracking System using MQTT and GSM/GPS Module. In 2025 5th International Conference on Trends in Material Science and Inventive Materials (ICTMIM), Kanyakumari, India; IEEE: New York, NY, USA, 2025; pp. 1286–1291.
- Chinnasamy, P.; Subramanian, A.; Nithish Selvam, R.; Kabilash, K.N.; Ibrahim, S.N.M.; Swetha, D.Y. GUARDTRACK: RFID and Wi-Fi based Smart Entry System. In 2025 5th International Conference on Trends in Material Science and Inventive Materials (ICTMIM), Kanyakumari, India; IEEE: New York, NY, USA, 2025; pp. 667–672.
- Shanthi, P.; Manjula, V. A systematic review on CNN-YOLO techniques for face and weapon detection in crime prevention. Discov. Comput. 2025, 28, 204.
- Vallez, N.; Velasco-Mata, A.; Deniz, O. Deep autoencoder for false positive reduction in handgun detection. Neural Comput. Appl. 2021, 33, 5885–5895.
- Shanthi, P.; Manjula, V. Weapon detection with FMR-CNN and YOLOv8 for enhanced crime prevention and security. Sci. Rep. 2025, 15, 26766.
- Abins, A.A.; Priyadharshini, P.; Rohidh, G.; Cheran, R. Weapon recognition in CCTV videos: Deep learning solutions for rapid threat identification. In 2024 Second International Conference on Emerging Trends in Information Technology and Engineering (ICETITE); IEEE: New York, NY, USA, 2024; pp. 1–8.
- Wang, G.; Ding, H.; Duan, M.; Pu, Y.; Yang, Z.; Li, H. Fighting against terrorism: A real-time CCTV autonomous weapons detection based on improved YOLOv4. Digit. Signal Process. 2023, 132, 103790.
- Bushra, S.N.; Shobana, G.; Maheswari, K.U.; Subramanian, N. Smart video surveillance-based weapon identification using YOLOv5. In 2022 International Conference on Electronic Systems and Intelligent Computing (ICESIC); IEEE: New York, NY, USA, 2022; pp. 351–357.
- Khalid, S.; Waqar, A.; Tahir, H.U.A.; Edo, O.C.; Tenebe, I.T. Weapon detection system for surveillance and security. In 2023 International Conference on IT Innovation and Knowledge Discovery (ITIKD); IEEE: New York, NY, USA, 2023; pp. 1–7.
- Yadav, P.; Gupta, N.; Sharma, P.K. Robust weapon detection in dark environments using YOLOv7-DarkVision. Digit. Signal Process. 2024, 145, 104342.
- Chen, C.; Abdallah, A.; Wolf, W. Audiovisual Gunshot Event Recognition. In 2006 IEEE International Conference on Systems, Man and Cybernetics; IEEE: New York, NY, USA, 2006; pp. 4807–4812.
- Replica Guns. Available online: https://www.amazon.com/dp/B01MQS74AT/ (accessed on 4 December 2025).
- Chollet, F. Xception: Deep Learning with Depthwise Separable Convolutions. In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA; IEEE: New York, NY, USA, 2017; pp. 1800–1807.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA; IEEE: New York, NY, USA, 2016; pp. 770–778.
- LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444.
- Ioffe, S.; Szegedy, C. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. In Proceedings of the 32nd International Conference on Machine Learning, Lille, France; PMLR: New York, NY, USA, 2015; pp. 448–456.
- Nair, V.; Hinton, G.E. Rectified linear units improve restricted Boltzmann machines. In Proceedings of the 27th International Conference on Machine Learning (ICML-10); Omnipress: Madison, WI, USA, 2010; pp. 807–814.
- Nagi, J.; Ducatelle, F.; Di Caro, G.A.; Cireşan, D.; Meier, U.; Giusti, A.; Nagi, F.; Schmidhuber, J.; Gambardella, L.M. Max-Pooling Convolutional Neural Networks for Vision-based Hand Gesture Recognition. In IEEE International Conference on Signal and Image Processing Applications (ICSIPA2011); IEEE: New York, NY, USA, 2011.
- Keras: The Python Deep Learning Library. Available online: https://keras.io (accessed on 15 December 2025).
- Convert TensorFlow Models. Available online: https://ai.google.dev/edge/litert/models/convert_tf (accessed on 15 December 2025).
- Khan, T.H. Towards an indoor gunshot detection and notification system using deep learning. Appl. Syst. Innov. (ASI) 2023, 6, 94.
- Youyeetoo Mini Computers. Available online: https://www.amazon.com/youyeetoo-Computers-Windows-Preinstalled-Business/dp/B0D4DQQYYX/ (accessed on 9 December 2025).
- Eclipse Mosquitto. Available online: https://mosquitto.org/ (accessed on 11 December 2025).
- MQTT: The Standard for IoT Messaging. Available online: https://mqtt.org/ (accessed on 9 December 2025).
- Cloud Storage for Firebase. Available online: https://firebase.google.com/docs/storage (accessed on 9 December 2025).
- FFmpeg. Available online: https://www.ffmpeg.org/ (accessed on 11 December 2025).
- ESP32-S3-Tiny. Available online: https://www.waveshare.com/wiki/ESP32-S3-Tiny (accessed on 12 December 2025).
- TB6612FNG Motor Driver. Available online: https://www.sparkfun.com/sparkfun-motor-driver-dual-tb6612fng-with-headers.html (accessed on 12 December 2025).
- Stepper Motor Linear Actuator. Available online: https://www.amazon.com/Stepper-Linear-Actuator-Engraving-Machine/dp/B09BZDSY7V (accessed on 12 December 2025).
- Door Latch. Available online: https://www.amazon.com/JQK-Security-Stainless-Thickened-HBB120-P2/dp/B09Y5MXCDN/ (accessed on 12 December 2025).
- AccelStepper Library for Arduino. Available online: https://www.airspayce.com/mikem/arduino/AccelStepper/index.html (accessed on 12 December 2025).
- Arduino-MQTT Library. Available online: https://github.com/256dpi/arduino-mqtt (accessed on 12 December 2025).
- WM8960 Hi-Fi Sound Card HAT. Available online: https://www.waveshare.com/wm8960-audio-hat.htm (accessed on 12 December 2025).
- HLK PM01 AC DC Converter 220V to 5V. Available online: https://www.amazon.com/EC-Buying-Step-Down-Intelligent-3-3V/dp/B09Z253MQ2 (accessed on 12 December 2025).
- PM2320 AC Wall Plug Enclosure. Available online: https://www.polycase.com/pm2320#PM2320T03XWT (accessed on 12 December 2025).
- How to Run a Raspberry Pi Program on Startup. Available online: https://learn.sparkfun.com/tutorials/how-to-run-a-raspberry-pi-program-on-startup/all#method-2-autostart (accessed on 12 December 2025).
- Paho-mqtt. Available online: https://pypi.org/project/paho-mqtt/ (accessed on 12 December 2025).
- gTTS (Google Text-to-Speech). Available online: https://pypi.org/project/gTTS/ (accessed on 12 December 2025).
- mpg123—Fast MP3 Player for Linux and Unix Systems. Available online: https://www.mpg123.de/ (accessed on 12 December 2025).
- jMQTT Library. Available online: https://www.b4x.com/android/help/jmqtt.html (accessed on 12 December 2025).
- NB6—Notifications Builder. Available online: https://www.b4x.com/android/forum/threads/nb6-notifications-builder.91819/ (accessed on 12 December 2025).
- Gavrilov, A.; Bergaliyev, M.; Tinyakov, S.; Krinkin, K.; Popov, P. Using IoT Protocols in Real-Time Systems: Protocol Analysis and Evaluation of Data Transmission Characteristics. J. Comput. Netw. Commun. 2022, 2022, 7368691.
- Chinnasamy, P.; Yarramsetti, S.; Ayyasamy, R.K.; Rajesh, E.; Vijayasaro, V.; Pandey, D.; Pandey, B.K.; Lelish, M.E. AI-Driven intrusion detection and prevention systems to safeguard 6G networks from cyber threats. Sci. Rep. 2025, 15, 37901.
- Jocher, G.; Qiu, J.; Chaurasia, A. YOLOv8: Ultralytics YOLO. Ultralytics. 2023. Available online: https://github.com/ultralytics/ultralytics (accessed on 28 January 2026).
- Wang, A.; Chen, H.; Liu, L.; Chen, K.; Lin, Z.; Han, J. YOLOv10: Real-Time End-to-End Object Detection. arXiv 2024, arXiv:2405.14458.
- Ultralytics. Ultralytics YOLO Models. 2024. Available online: https://docs.ultralytics.com (accessed on 28 January 2026).

| | ShotSpotter [16] | ZeroEyes [4] | AmberBox [10] | This Work |
|---|---|---|---|---|
| Indoor/Outdoor | Outdoor | Indoor | Indoor | Indoor |
| Detection Method | Sound | Image | Sound | Image + Sound |
| Human in the loop | ✓ | ✓ | ✗ | ✗ |
| Response Time (s) | 60 | 30 | 3.6 | Image: 1.06; Sound: 1.11 |
| Auto Lockdown | ✗ | ✗ | ✗ | ✓ |
| Auto PA announcement | ✗ | ✗ | ✗ | ✓ |

| | Shanthi and Manjula [21] | Alaqil et al. [6] | Debnath et al. [7] | Mehta et al. [8] | Goenka and Sitara [9] | Lopez-Morillas et al. [11] | Valenzise et al. [12] | Bajzik et al. [13] | Morehead et al. [14] | Khan [15] | Chen et al. [27] | This Work |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Modality | Image | Image | Image | Image | Image | Audio | Audio | Audio | Audio | Audio | Image + Audio | Image + Audio |
| Classifier | FMR-CNN–YOLOv8 | Faster R-CNN | Template-matching | YOLOv3 | Mask R-CNN | NMF | GMM | CNN | CNN | CNN | SVM | Image: RS CNN; Audio: CNN |
| Accuracy (%) | 98.7 | - | 95 | 89.3 | 82.76 | - | - | 99 | 99 | 99 | 73.46 | Image: 94.6; Audio: 99 |
| Precision (%) | 90.1 | 82 | - | - | - | - | 93 | - | - | 100 | - | Image: 94.2; Audio: 100 |
| Smartphone notification | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ (using SMS) | ✓ | ✗ | ✓ |
| Plot on map | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ | ✓ |
| Real-time testing with replica and blank gun | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ | ✓ | ✗ | ✓ |

| | Training | Validation | Test |
|---|---|---|---|
| Loss | 0.1119 | 0.097 | 0.1336 |
| Accuracy | 0.9599 | 0.9707 | 0.9464 |

| True/Predicted | Gun | Non-gun |
|---|---|---|
| Gun | 308 | 13 |
| Non-gun | 19 | 332 |

| | Precision | Recall | F1-Score |
|---|---|---|---|
| Gun | 0.9419 | 0.9595 | 0.9506 |
| Non-gun | 0.9623 | 0.9459 | 0.9540 |
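
The per-class precision, recall, and F1-score values above follow directly from the confusion matrix counts (rows are true classes, columns are predicted classes). The short sketch below reproduces that arithmetic; the helper name `per_class_metrics` is hypothetical and introduced only for illustration.

```python
def per_class_metrics(tp, fp, fn):
    """Return (precision, recall, F1) for one class, rounded to 4 decimals."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return round(precision, 4), round(recall, 4), round(f1, 4)


# Counts from the confusion matrix: true class first, predicted class second.
gun_as_gun, gun_as_non_gun = 308, 13
non_gun_as_gun, non_gun_as_non_gun = 19, 332

# For the gun class, false positives are non-gun images predicted as gun,
# and false negatives are gun images predicted as non-gun.
gun = per_class_metrics(tp=gun_as_gun, fp=non_gun_as_gun, fn=gun_as_non_gun)

# For the non-gun class the roles of the off-diagonal cells swap.
non_gun = per_class_metrics(tp=non_gun_as_non_gun, fp=gun_as_non_gun, fn=non_gun_as_gun)

print("Gun:    ", gun)
print("Non-gun:", non_gun)
```
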
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Share and Cite
Khan, T. Audiovisual Gun Detection with Automated Lockdown and PA Announcing IoT System for Schools. IoT 2026, 7, 15. https://doi.org/10.3390/iot7010015
