Electronics
  • Article
  • Open Access

13 August 2025

Analysis of OpenCV Security Vulnerabilities in YOLO v10-Based IP Camera Image Processing Systems for Disaster Safety Management

Department of Computer Engineering, Honam University, Gwangju 62399, Republic of Korea
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
This article belongs to the Special Issue Securing Tomorrow: Human-Centric Security and Privacy in the Fifth Industrial Revolution

Abstract

This paper systematically analyzes the security vulnerabilities that can arise when the OpenCV library is linked with IP cameras in the YOLO v10-based image processing systems used in the disaster safety management field. The use of AI-based real-time image analysis in disaster response and safety management systems has been increasing, but it has been confirmed that security vulnerabilities in open source object detection frameworks and IP cameras can pose serious threats to the reliability and safety of operational systems. In this study, the structure of an image processing system applying the latest YOLO v10 algorithm was analyzed, and the major security threats that can occur during IP camera image collection and processing with OpenCV (e.g., remote code execution, denial of service, data tampering, and authentication bypass) were identified. In particular, the possibility of attacks due to insufficient verification of external inputs (model files, configuration files, image data, etc.), unchanged initial passwords, and insufficient encryption of network communication sections was illustrated with cases. In mission-critical environments such as disaster safety management, these problems can lead to even more serious consequences.

1. Introduction

Due to recent climate change and accelerated urbanization, the frequency of natural and other disasters is increasing. Accordingly, disaster safety management is emerging as a key task for securing safety and minimizing damage throughout society. In particular, the convergence of advanced information and communication technology (ICT) and artificial intelligence (AI) technology is becoming essential in disaster sites where real-time situational awareness and rapid response are required [1].
Within this trend, video-based disaster surveillance systems using IP cameras play an important role in real-time understanding of on-site situations and early detection of risk factors. Recently, YOLO (You Only Look Once) v10, which shows excellent performance in the field of object detection, is being actively introduced to disaster safety management systems [2,3].
These technologies are utilized to collect and process images from IP cameras through open source computer vision libraries such as OpenCV and to automatically detect and analyze disaster situations by applying artificial intelligence models [1].
However, despite the introduction and spread of these technologies, concerns are constantly being raised about security vulnerabilities that can undermine the reliability and safety of such systems. In particular, vulnerabilities in open source libraries such as OpenCV, network security issues in IP cameras, and the possibility of forgery and tampering of the input data fed to AI models can act as real threats to disaster safety management systems. In recent security incidents, hackers have reportedly infiltrated systems or tampered with image data by exploiting unchanged initial passwords on IP cameras, insufficient network encryption, and inadequate verification of external inputs. Such vulnerabilities not only hinder rapid decision-making and response in disaster situations, but can also cause secondary damage through incorrect information, making them particularly dangerous.

The purpose of this study is to systematically analyze the security vulnerabilities that may occur when the OpenCV library is linked with IP cameras in the YOLO v10-based image processing systems widely used in disaster safety management, and to suggest countermeasures for them [4]. To this end, this paper sets the following research scope. First, we analyze the structure and characteristics of the YOLO v10-based image processing system applied to disaster safety management systems. Second, we derive the major security threat scenarios that can occur during IP camera image collection and processing with OpenCV. Third, based on practical operating environments and recent security incidents, we empirically analyze vulnerabilities that can actually occur (e.g., remote code execution, denial of service, data tampering, and authentication bypass). Finally, we propose effective countermeasures and policy implications for the identified vulnerabilities.

This study aims to strengthen the security of AI image processing systems specialized for disaster safety management and to contribute to the establishment of safe and reliable disaster response infrastructure. Through this, we expect to raise awareness of security vulnerabilities and to provide practical security guidelines for the development and operation of disaster safety management systems in the future [5,6].

3. Proposal

3.1. Disaster Safety System for Fire and Smoke Using YOLO v10

The fire and smoke disaster safety system proposed in this study is based on YOLO v10, and the overall flow of the system is shown in the system flow diagram in Figure 5.
Figure 5. System flowchart.
The system begins operation at the Start stage.
In the Indoor CCTV/IP camera video input stage, video input is received from an indoor CCTV or IP camera. Next, in the Detecting Security Vulnerabilities stage, security vulnerabilities such as hacking and unauthorized access are detected for the input video stream. Afterwards, in the Frame Extraction and Preprocessing stage, frames are extracted from the video, and preprocessing such as resizing and normalization is performed for YOLO-based detection.
In the Fire/Smoke Detection with YOLO stage, the YOLO algorithm is used to detect whether fire or smoke occurs in each frame.
In the decision branch, if fire/smoke is not detected (“N”), it returns to the frame extraction and preprocessing stage and repeatedly performs detection.
On the other hand, if fire/smoke is detected (“Y”), it enters the Detected stage and the system recognizes the occurrence of fire or smoke. Finally, in the Notifications (Alerts/Messages) stage, warning notifications or messages are sent to the manager in various ways such as SMS and app notifications based on the detection results. Figure 6 presents the user interface (UI) design of the YOLO v10-based fire and smoke specialized disaster safety system proposed in this study.
Figure 6. User interface for disaster safety system specializing in fire prevention.
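To make the flow of Figure 5 concrete, the following minimal sketch outlines one possible implementation of the capture, preprocessing, detection, and notification loop. It assumes the Ultralytics YOLO Python API for inference; the stream URL, the send_alert() helper, the class names, and the confidence threshold are illustrative assumptions rather than the exact implementation used in this study.

import cv2
from ultralytics import YOLO

STREAM_URL = "rtsp://192.0.2.10:554/stream"  # hypothetical indoor CCTV/IP camera address
model = YOLO("best.pt")                      # trained fire/smoke detection weights

def send_alert(message: str) -> None:
    # Placeholder: connect to SMS or app push notification services here.
    print(f"[ALERT] {message}")

cap = cv2.VideoCapture(STREAM_URL)
try:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break                                # stream ended or failed
        frame = cv2.resize(frame, (640, 640))    # preprocessing: resize for YOLO input
        results = model(frame, verbose=False)    # fire/smoke detection with YOLO
        for box in results[0].boxes:
            label = results[0].names[int(box.cls)]
            conf = float(box.conf)
            if label in ("fire", "smoke") and conf > 0.5:
                send_alert(f"{label} detected (confidence {conf:.2f})")
finally:
    cap.release()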

3.2. OpenCV Vulnerability Improvement for IP Camera

OpenCV (an open-source computer vision library) is widely used as a core image processing and analysis tool in IP Camera systems. However, its open-source nature exposes it to various security vulnerabilities. This paper analyzes the major security vulnerabilities that may arise when using OpenCV in IP Camera environments and presents technical and operational measures to effectively address them.
When externally supplied images and video streams are processed without input validation, OpenCV can be exposed to serious vulnerabilities such as buffer overflows and denial of service (DoS). For example, OpenCV-Python version 4.1.0 can suffer a heap buffer overflow triggered by a specially crafted XML file, leading to arbitrary code execution or system crashes. Therefore, strict validation of all input data, including format, size, and hash values, together with whitelist-based filtering, is required.
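As an illustration of this kind of validation, the sketch below checks the file extension, size, and SHA-256 hash of an incoming file against an allowlist before handing it to OpenCV. The extension list, size limit, and hash allowlist are assumptions for the example, not a prescribed policy.

import hashlib
from pathlib import Path
import cv2

ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png"}       # whitelist of accepted formats (assumed)
MAX_FILE_SIZE = 10 * 1024 * 1024                     # 10 MB upper bound (assumed policy)
TRUSTED_HASHES = {"<sha256-of-known-good-file>"}     # hypothetical allowlist of SHA-256 digests

def load_validated_image(path: str):
    p = Path(path)
    if p.suffix.lower() not in ALLOWED_EXTENSIONS:
        raise ValueError(f"Rejected extension: {p.suffix}")
    if p.stat().st_size > MAX_FILE_SIZE:
        raise ValueError("File exceeds size limit")
    digest = hashlib.sha256(p.read_bytes()).hexdigest()
    if digest not in TRUSTED_HASHES:
        raise ValueError("File hash not in whitelist")
    image = cv2.imread(str(p))          # decode only after validation passes
    if image is None:
        raise ValueError("OpenCV failed to decode the file")
    return image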
Many IP camera systems either do not change the default authentication information (e.g., admin) or lack authentication procedures for the RTSP/ONVIF protocol. There have been frequent reports of authentication being bypassed when directly viewing RTSP streams using OpenCV. RTSP is not encrypted by default, so commands and data are transmitted in plaintext, making it vulnerable to eavesdropping and unauthorized access. It is essential to implement robust authentication methods, such as strong password policies, regular password changes, and digest authentication.
On the network side, streams between IP cameras and OpenCV can be interrupted or hijacked through network attacks such as ARP spoofing and man-in-the-middle (MitM) attacks. Unauthenticated RTSP streams can be accessed and hijacked by anyone on the network, making it easy for attackers to launch denial-of-service (DoS) or image manipulation attacks. To mitigate these vulnerabilities, the following measures are suggested:
Enhance input data validation and integrity. Perform strict validation of all video and image inputs, including format, size, and hashing, and apply whitelist-based filtering to block the inflow of malicious data.
Enhance authentication and access control. Enforce complex password policies and regularly change passwords when accessing IP cameras and OpenCV streams. Use strong authentication methods such as digest authentication for network protocols such as RTSP/ONVIF. Implement privilege separation and fine-grained access control in OpenCV-based applications.
Dependency Management and Sandboxing: Keep OpenCV and related libraries (e.g., FFmpeg) up-to-date with security patches, and run OpenCV processes in containers (e.g., Docker) or sandboxed environments to minimize the impact of vulnerability exploits.
Enhanced Network Security: Use encrypted streams (e.g., RTSP over TLS) between IP cameras and OpenCV servers, and implement firewalls and intrusion detection systems (IDS) on the network to block abnormal traffic in real time (a configuration sketch follows this list).
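The configuration sketch below illustrates the authentication and transport measures above: credentials are taken from environment variables rather than hard-coded, the FFmpeg backend is asked to use TCP transport, and an encrypted (rtsps/TLS) endpoint is used where the camera supports it. The environment variable names, camera address, and port are assumptions for the example.

import os
import cv2

# Credentials supplied via environment variables, not hard-coded (assumed variable names).
user = os.environ["CAMERA_USER"]
password = os.environ["CAMERA_PASS"]

# Force TCP transport for the FFmpeg backend so the stream is not carried over easily
# spoofed UDP; prefer an encrypted rtsps/TLS endpoint if the camera supports one.
os.environ["OPENCV_FFMPEG_CAPTURE_OPTIONS"] = "rtsp_transport;tcp"
stream_url = f"rtsps://{user}:{password}@192.0.2.10:322/stream"  # hypothetical TLS endpoint

cap = cv2.VideoCapture(stream_url, cv2.CAP_FFMPEG)
if not cap.isOpened():
    raise RuntimeError("Failed to open authenticated camera stream")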
In addition to common vulnerabilities, AI-based vulnerabilities may exist. It is important to consider AI-specific threats such as the following:
Adversarial Attacks: Attackers can insert subtle perturbations or special patterns into input data, causing image recognition AI (e.g., YOLO) to misrecognize or fail to detect even normal images. This can severely undermine the reliability of surveillance and disaster detection systems.
Malformed Input Frames Triggering GPU Errors: Maliciously manipulated image/video frames can target vulnerable functions in GPUs or codecs, causing system crashes, freezes, or even remote code execution.
Poisoned Pre-trained Weights: Pre-trained or fine-tuned models can be made to behave incorrectly on specific inputs if they were built from malicious training data or contaminated weights. Therefore, verifying the integrity of model weights (e.g., SHA-256 checksums) and verifying their authenticity are essential.
Strengthening input data validation and integrity: Strict format, size, and hash validation must be implemented for all video/image inputs, along with whitelist-based filtering.
AI model reliability management: Detection of adversarial attacks (e.g., abnormal input detection), integrity verification of pre-trained model weights, and strengthened model training data/source management.
These multi-layered security enhancements can significantly reduce security vulnerabilities in OpenCV in IP Camera environments and contribute to enhanced system stability and reliability.

4. Experimental Results and Analysis

In this experiment, about 35,000 images from Kaggle’s “Fire/Smoke Detection YOLO v9” dataset and some images from AI-Hub’s “Fire Occurrence Prediction Images (Advanced)—Image-based Fire Surveillance and Occurrence Location Detection Data” were extracted and mixed, constructing a dataset of 43,146 images ultimately used for learning. The data were divided into 35,997 images for training, 4893 images for validation, and 2256 images for testing. After directly labeling images taken at actual indoor fire scenes using the LabelImg program, YOLO v10 was applied based on this dataset to evaluate the fire and smoke detection performance. In addition, this study analyzed security vulnerabilities that may occur in the process of utilizing OpenCV, and presented representative examples of vulnerable source codes and improved pseudocodes. Through this, we discuss ways to practically improve security issues such as insufficient input verification and authentication bypass that could occur during data processing and video stream input.
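As a point of reference for how such a training run is typically set up, a minimal YOLO v10 training invocation under the Ultralytics API is sketched below; the checkpoint name, dataset YAML path, epoch count, batch size, and image size are illustrative assumptions rather than the exact settings used in this study.

from ultralytics import YOLO

# Start from a YOLO v10 checkpoint and fine-tune on the fire/smoke dataset.
model = YOLO("yolov10n.pt")
model.train(
    data="fire_smoke.yaml",   # hypothetical dataset config listing train/val/test splits
    epochs=100,               # illustrative value
    imgsz=640,                # input resolution used for resizing
    batch=16,
)
metrics = model.val()         # evaluate on the validation split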

4.1. Experimental Environment

The experimental environment is as shown in Table 1. PyTorch 2.12 was used, and CUDA version 12.4 was used.
Table 1. Experimental environment.

4.2. Results of Indoor Fire Detection Experiment Using YOLO v10

In this experiment, we selected fire image 0001 and smoke image 0034 from the AI-Hub “Fire Occurrence Prediction Image (Advanced)—Image-based Fire Surveillance and Occurrence Location Detection Data” dataset and performed an object detection experiment.
Figure 7 shows the results of detecting fire and smoke disaster situations on the building exterior. Both Figure 7a,b use the same fire situation images, and the real-time processing speed (FPS) is indicated at the top of each image. The FPS of Figure 7a is 50.68, and the FPS of Figure 7b is 51.26, showing real-time performance of about 50 FPS in both cases. The green box and the ‘smoke’ label indicate the detected smoke area, and the number next to each box indicates the confidence score (e.g., 0.74, 0.69, etc.) of the area.
Figure 7. Fire smoke detection on building facades using YOLO v10: (a) 3 smoke boxes, (b) 6 smoke boxes.
In Figure 7a, there are three smoke detection boxes, and they are mainly concentrated near the middle and bottom windows of the building. On the other hand, in Figure 7b, there are six smoke detection boxes, and smoke is additionally detected in wider areas such as the top, middle, and bottom of the building. It can be seen that Figure 7b detects more smoke areas than Figure 7a, and the reliability is also generally high or similar. Both images showed a real-time processing speed of over 50 FPS, proving their performance as a real-time surveillance and detection system.
Figure 8 presents the performance comparison results of fire and smoke detection algorithms in an indoor environment. Both images used video frames captured in the same indoor fire situation. The green boxes marked ‘smoke’ and ‘fire’ represent the detected smoke and fire areas, respectively, and the detection confidence scores are indicated next to the boxes.
Figure 8. Indoor fire and smoke detection using YOLO v10: (a) Fire and smoke detection experiment 1, (b) Fire and smoke detection experiment 2.
The FPS of Figure 8a is 47.58, and the FPS of Figure 8b is 51.94, which confirms a slight increase in processing speed in Figure 8b.
In both experiments, smoke and fire were accurately detected with high confidence; the smoke area was recognized with a confidence of 0.78, and the fire area with a confidence of 0.85.
There was no significant difference between the two experiments in the location, confidence, and box size of the detected object, but Figure 8b showed an improvement in real-time performance. The above results visually compare the real-time fire and smoke detection performance of two algorithms (or settings) in the same fire situation, experimentally demonstrating that the proposed system is capable of rapid and accurate disaster situation detection in a real environment.
Figure 9 shows the fire and smoke detection results using YOLO v10 with security applied.
Figure 9. YOLO v10 learning results applying security to fire and smoke detection.
train/box_loss, train/cls_loss, train/dfl_loss: The box, class, and distribution losses for the training data show gradual decreases, indicating that the model is increasingly effectively learning features.
val/box_loss, val/cls_loss, val/dfl_loss: Similar loss reductions are observed for the validation data, demonstrating stable training without overfitting.
metrics/precision(B), metrics/recall(B): Detection accuracy (precision) and detection rate (recall) steadily increase, demonstrating improved performance as training progresses.
metrics/mAP50(B), metrics/mAP50-95(B): Comprehensive metrics such as mAP50 and mAP50-95 also show increasing trends, ultimately confirming the model’s overall detection performance improvement. Consequently, graphs of the evolution of each loss function and metric visually demonstrate that the model is gradually being optimized.
Figure 10, an “F1-Confidence Curve,” shows the evolution of the F1 score according to the confidence threshold for the classification model’s results.
Figure 10. YOLO v10 F1 curve securely applied to fire and smoke detection.
The x-axis (horizontal axis) represents the confidence threshold. As this value increases, the model only considers more certain predictions as correct.
The y-axis (vertical axis) represents the F1 score. The F1 score is the harmonic mean of precision and recall, and is a comprehensive indicator of classification performance.
The phrase “all classes 0.75 at 0.379” next to this curve indicates that the highest F1 score (0.75) is achieved at a confidence level of 0.379.
The curves show that the F1 score is low at both very low and very high confidence thresholds (the two ends of the graph) and reaches its maximum at intermediate thresholds. For each class, the optimal threshold at which the F1 score peaks differs slightly.
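For completeness, the F1 score plotted in Figure 10 is the harmonic mean of precision P and recall R, F1 = 2 × P × R / (P + R); the reported optimum is simply the confidence threshold at which this quantity peaks, here t* ≈ 0.379 with F1(t*) ≈ 0.75 over all classes.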

4.3. Results of Security Vulnerability Testing on OpenCV Code

In this experiment, we used YOLO and OpenCV to detect major source code security vulnerabilities that might occur in an IP Camera environment.
YOLO v10 and OpenCV were chosen as the targets of this security analysis because the same issues apply broadly to object detection frameworks, including earlier versions such as YOLO v8. The major vulnerabilities are listed in Figure 11.
Figure 11. Security vulnerability source code for OpenCV.
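Since Figure 11 is reproduced as an image, a simplified sketch of the vulnerable pattern it illustrates (not the exact code of the figure) is given below; the individual issues are then discussed in turn.

import cv2
from ultralytics import YOLO

# Vulnerable pattern: the model file is loaded without any integrity verification.
try:
    model = YOLO("best.pt")           # no hash check against a trusted value
except Exception as e:
    print("Model load failed:", e)

# Vulnerable pattern: unauthenticated, unencrypted RTSP stream.
cap = cv2.VideoCapture("rtsp://127.0.0.1:1111/ipcamera")

# Only the connection status is checked; frame format and integrity are never validated.
if cap.isOpened():
    ok, frame = cap.read()
    if ok:
        results = model(frame)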
First, the integrity of the YOLO model file is not verified.
Although there is exception handling when loading YOLO(‘best.pt’), it does not verify whether the model file has been tampered with using hash values, etc.
Although the example uses the local host address 127.0.0.1, the same code becomes exploitable once the system is exposed to an external network such as a WAN.
If an attacker injects a malicious model file, there is a risk of arbitrary code execution. Another problem is that authentication and encryption are not applied to the RTSP stream. For example, if the RTSP stream is opened without authentication information, as in cv2.VideoCapture("rtsp://127.0.0.1:1111/ipcamera"), the stream is not encrypted, which increases the risk of man-in-the-middle (MitM) attacks or unauthorized access on the network. In addition, validation of the stream data is insufficient: only the presence of a stream connection is checked with cap.isOpened(), and the format, normality, and integrity of the actual data are not verified. This can lead to denial of service (DoS) or system failure due to malicious stream input. To address the first issue, a method of applying an SHA-256 hash to guarantee the file integrity of the YOLO training model is presented in Figure 12.
Figure 12. Applying SHA256 hash to YOLO model.
To summarize the main improvement for the YOLO model: file integrity verification uses the SHA-256 hash value to check whether the class/model file has been tampered with before it is loaded.
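A minimal sketch of this improvement, assuming a trusted SHA-256 value distributed out of band, is shown below; the constant name and reference digest are placeholders.

import hashlib
from ultralytics import YOLO

# Reference hash of the legitimate model file, obtained from a trusted source (placeholder).
TRUSTED_MODEL_SHA256 = "<replace-with-trusted-sha256-digest>"

def sha256_of(path: str) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

model_path = "best.pt"
if sha256_of(model_path) != TRUSTED_MODEL_SHA256:
    raise RuntimeError("Model file integrity check failed: hash mismatch")

model = YOLO(model_path)   # loaded only after the integrity check passes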
Second, Figure 13 shows that a security vulnerability has been resolved in OpenCV’s RTSP stream processing. Authentication and encryption were added to the RTSP configuration, following best practices to include authentication information and employ secure protocols such as RTSP over TLS. Exception handling was also strengthened so that failures at each processing stage are detected promptly, with immediate error notification and graceful termination where necessary. These enhancements can substantially improve the real-world security level of OpenCV-based and AI-based image systems.
Figure 13. Fixing OpenCV security vulnerabilities.
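The sketch below illustrates the kind of hardened stream handling described for Figure 13: authenticated, TLS-protected RTSP access with per-stage exception handling and basic frame validation. The environment variable names, camera address, and notification helper are assumptions for the example, not the exact code of the figure.

import os
import cv2

def notify_operator(message: str) -> None:
    # Placeholder for the real alerting channel (SMS, app push, etc.).
    print(f"[SECURITY] {message}")

user = os.environ["CAMERA_USER"]           # assumed environment variables
password = os.environ["CAMERA_PASS"]
url = f"rtsps://{user}:{password}@192.0.2.10:322/ipcamera"  # hypothetical TLS endpoint

cap = cv2.VideoCapture(url, cv2.CAP_FFMPEG)
if not cap.isOpened():
    notify_operator("Failed to open authenticated RTSP stream")
    raise SystemExit(1)

try:
    while True:
        ok, frame = cap.read()
        if not ok or frame is None or frame.size == 0:
            notify_operator("Invalid or missing frame received; stopping capture")
            break
        # Frame has passed basic validity checks and can be handed to the detector.
except cv2.error as e:
    notify_operator(f"OpenCV error during stream processing: {e}")
finally:
    cap.release()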
It is noteworthy that related vulnerabilities—in particular, improper validation or lack of authentication in RTSP stream handling—have been previously disclosed as security issues. For example, CVE-2017-15293 describes a critical weakness in OpenCV’s VideoCapture module, where it was possible to access RTSP streams without proper authentication, leading to potential unauthorized access or information leakage. Although the specific improvement in this study focuses on secure configuration and coding practices (rather than a software patch), the measures adopted directly address, and proactively mitigate, the risks highlighted in CVE-2017-15293 and similar vulnerabilities.

5. Conclusions

In this paper, we systematically analyzed the major security vulnerabilities that may occur in IP Camera video processing systems using YOLO and OpenCV, and presented practical improvement measures for them. The major vulnerabilities identified through experiments included the lack of integrity verification of YOLO model files, lack of authentication and encryption of RTSP streams, and insufficient validation of video stream data. It was confirmed that these vulnerabilities could lead to serious security threats such as arbitrary code execution by malicious model file injection, man-in-the-middle attacks and unauthorized access to the network, and denial of service (DoS) or system failure due to malicious stream data.

To respond to this, this study proposed multi-layered security enhancement measures such as SHA-256 hash-based integrity verification for model files and class files, inclusion of authentication information in RTSP streams and application of encryption protocols (such as RTSP over TLS), strengthening input verification for the format, normality, and integrity of video data, and introducing exception handling and real-time notification systems at each stage. It was experimentally shown that these improvement measures can substantially improve the security level of OpenCV and AI-based video processing systems in actual IP Camera environments.

In future studies, the proposed security enhancement measures should be applied to various real-world cases to verify their effectiveness, and additional research is needed to build a safer and more reliable video surveillance system through convergence with advanced security technologies such as AI-based abnormal behavior detection and automated threat response. It is hoped that the analysis and improvement directions presented in this paper will serve as useful guidelines in both practical and academic terms.

Author Contributions

Conceptualization, D.-Y.J. and N.-H.K.; methodology, D.-Y.J.; software, D.-Y.J. and N.-H.K.; validation, D.-Y.J. and N.-H.K.; formal analysis, D.-Y.J. and N.-H.K.; investigation, D.-Y.J.; resources, N.-H.K.; data curation, D.-Y.J.; writing—original draft preparation, D.-Y.J.; writing—review and editing, D.-Y.J.; visualization, D.-Y.J.; supervision, N.-H.K.; project administration, N.-H.K.; funding acquisition, N.-H.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the Gwangju Regional Innovation System & Education (RISE) Project. This research (paper) used datasets from the Open AI Dataset Project (AI-Hub, S. Korea). All data information can be accessed through AI-Hub (www.aihub.or.kr).

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Irene, S.; Prakash, A.J.; Uthariaraj, V.R. Person search over security video surveillance systems using deep learning methods: A review. Image Vis. Comput. 2024, 143, 104930.
  2. Ganapathy, S.; Ajmera, D. An Intelligent Video Surveillance System for Detecting the Vehicles on Road Using Refined YOLOV4. Comput. Electr. Eng. 2023, 113, 109036.
  3. Sapkota, R.; Flores-Calero, M.; Qureshi, R.; Badgujar, C.; Nepal, U.; Poulose, A.; Zeno, P.; Vaddevolu, U.B.P.; Khan, S.; Shoman, M.; et al. YOLO advances to its genesis: A decadal and comprehensive review of the You Only Look Once (YOLO) series. Artif. Intell. Rev. 2025, 58, 274.
  4. Biondi, P.; Bognanni, S.; Bella, G. Vulnerability assessment and penetration testing on IP camera. In Proceedings of the 2021 8th International Conference on Internet of Things: Systems, Management and Security (IOTSMS), Gandia, Spain, 6–9 December 2021; pp. 1–8.
  5. Moon, J.; Bukhari, M.; Kim, C.; Nam, Y.; Maqsood, M.; Rho, S. Object detection under the lens of privacy: A critical survey of methods, challenges, and future directions. ICT Express 2024, 10, 1124–1144.
  6. Ahmad, H.M.; Rahimi, A. SH17: A dataset for human safety and personal protective equipment detection in manufacturing industry. J. Saf. Sci. Resil. 2025, 6, 175–185.
  7. Stabili, D.; Bocchi, T.; Valgimigli, F.; Marchetti, M. Finding (and exploiting) vulnerabilities on IP Cameras: The Tenda CP3 case study. Adv. Comput. Secur. 2024, 195–210.
  8. Bhardwaj, A.; Bharany, S.; Ibrahim, A.O.; Almogren, A.; Rehman, A.U.; Hamam, H. Unmasking vulnerabilities by a pioneering approach to securing smart IoT cameras through threat surface analysis and dynamic metrics. Egypt. Inform. J. 2024, 27, 100513.
  9. Vennam, P.; T.C., P.; B.M., T.; Kim, Y.G.; B.N., P.K. Attacks and Preventive Measures on Video Surveillance Systems: A Review. Appl. Sci. 2021, 11, 5571.
  10. Wang, X.; Cai, L.; Zhou, S.; Jin, Y.; Tang, L.; Zhao, Y. Fire Safety Detection Based on CAGSA-YOLO Network. Fire 2023, 6, 297.
  11. Li, X.; Liang, Y. Fire-RPG: An Urban Fire Detection Network Providing Warnings in Advance. Fire 2024, 7, 214.
  12. Hussain, M. Yolov5, yolov8 and yolov10: The go-to detectors for real-time vision. arXiv 2024, arXiv:2407.02988.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
