Comparative Evaluation of YOLO Models for Human Position Recognition with UAVs During a Flood
Abstract
1. Introduction
2. Methods
2.1. Datasets and Data Preparation
2.1.1. SARD Dataset (Search and Rescue Image Dataset)
2.1.2. SeaDronesSee Dataset (Maritime UAV Dataset)
2.1.3. C2A (Combination-to-Application Dataset)
2.1.4. SynBASe—Synthetic Bodies at Sea
2.1.5. Data Preparation and Preprocessing
2.2. Simulation Research
2.3. Hybrid Method
2.3.1. Optimizing Video Data and Object Detection
2.3.2. Joint Detection Movement and Position Analysis
2.3.3. Filtered Data Refinement
2.4. Training and Testing Neural Network Procedure
2.5. Ablative Research of YOLO Architectures
2.6. Justification for Choosing the YOLO12 Model
3. Results
3.1. Human Recognition
3.2. Evaluation Metrics and Comparative Analysis of YOLO Models
3.3. Ablation Research of Hybrid Method Components
3.4. Stress Testing of UAVs in Real Conditions
4. Discussion
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| AI | Artificial Intelligence |
| SARD | Search and Rescue Dataset |
| UAV | Unmanned Aerial Vehicle |
| YOLO | You Only Look Once |









| Dataset | Data Type | Number of Images | Video | Number of Sequences | Duration | Classes |
|---|---|---|---|---|---|---|
| SARD | RGB images | 1981 | No | - | - | 6 types of human poses |
| SeaDronesSee (images) | RGB images | 14,227 | Partially | 10 | 20–45 s | swimmers, life jackets, boards, boats |
| SeaDronesSee (videos) | Video | - | Yes | 10 | 20–45 s | swimmers, life jackets, boards, boats |
| C2A | Synthetic | 10,215 | No | - | - | 5 types of poses + scenes of the elements |
| SynBASe | Synthetic | 1295 | No | - | - | swimmers |
| Custom Dataset | Real video | ~21,600 frames * | Yes | 5 | ~18 min | swimmers, life jackets, debris |

| Model | Dataset | mAP@0.5 | Recall | FLOPs (G) | FPS (GPU) |
|---|---|---|---|---|---|
| YOLOv4 | SARD | 0.75 | 0.72 | 96 | 73 |
| YOLOv4 | SeaDronesSee | 0.77 | 0.73 | 96 | 73 |
| YOLOv4 | C2A | 0.78 | 0.74 | 96 | 73 |
| YOLOv4 | SynBASe | 0.78 | 0.75 | 96 | 73 |
| YOLOv4 | Combined | 0.79 | 0.75 | 96 | 73 |
| YOLOv8-s | SARD | 0.84 | 0.81 | 28.6 | 160 |
| YOLOv8-s | SeaDronesSee | 0.87 | 0.84 | 28.6 | 160 |
| YOLOv8-s | C2A | 0.88 | 0.86 | 28.6 | 160 |
| YOLOv8-s | SynBASe | 0.89 | 0.86 | 28.6 | 160 |
| YOLOv8-s | Combined | 0.90 | 0.87 | 28.6 | 160 |
| YOLO11-s | SARD | 0.88 | 0.85 | 23 | 175 |
| YOLO11-s | SeaDronesSee | 0.92 | 0.90 | 23 | 175 |
| YOLO11-s | C2A | 0.92 | 0.89 | 23 | 175 |
| YOLO11-s | SynBASe | 0.93 | 0.90 | 23 | 175 |
| YOLO11-s | Combined | 0.94 | 0.92 | 23 | 175 |
| YOLO12-s | SARD | 0.89 | 0.86 | 20 | 190 |
| YOLO12-s | SeaDronesSee | 0.93 | 0.91 | 20 | 190 |
| YOLO12-s | C2A | 0.93 | 0.90 | 20 | 190 |
| YOLO12-s | SynBASe | 0.94 | 0.91 | 20 | 190 |
| YOLO12-s | Combined | 0.95 | 0.93 | 20 | 190 |

| Metric | YOLOv4 | YOLOv8-s | YOLO11-s | YOLO12-s |
|---|---|---|---|---|
| mAP@0.5 | 0.79 | 0.90 | 0.94 | 0.95 |
| Recall | 0.75 | 0.87 | 0.92 | 0.93 |
| Precision | 0.77 | 0.88 | 0.93 | 0.94 |
| F1-score | 0.76 | 0.875 | 0.925 | 0.935 |
| mAP@0.5:0.95 | 0.42 | 0.58 | 0.65 | 0.69 |

| Metric | YOLOv4 | YOLOv8-s | YOLO11-s | YOLO12-s |
|---|---|---|---|---|
| mAP@0.5 | 0.77 | 0.89 | 0.93 | 0.95 |
| Recall | 0.73 | 0.86 | 0.91 | 0.93 |
| Precision | 0.75 | 0.88 | 0.92 | 0.94 |
| F1-score | 0.74 | 0.87 | 0.915 | 0.935 |
| mAP@0.5:0.95 | 0.40 | 0.57 | 0.64 | 0.68 |

| Type of Error | YOLOv4 | YOLOv8-s | YOLO11-s | YOLO12-s |
|---|---|---|---|---|
| Partial | 3% | 4% | 3% | 4% |
| Misclassified (FP) | 14% | 10% | 8% | 6% |
| Missed (FN) | 17% | 15% | 10% | 7% |

| Configuration | mAP@0.5 | Precision | Recall | F1-Score | FPS (Raspberry Pi 5) |
|---|---|---|---|---|---|
| YOLO12-s only | 0.95 | 0.92 | 0.93 | 0.925 | 23 |
| +Optical Flow | 0.95 | 0.93 | 0.93 | 0.93 | 22 |
| +Kalman Filter | 0.95 | 0.93 | 0.94 | 0.935 | 22 |
| +BlazePose | 0.95 | 0.94 | 0.93 | 0.935 | 21 |
| Hybrid (All modules) | 0.95 | 0.95 | 0.94 | 0.945 | 21 |

| Metric | Value |
|---|---|
| Precision | 0.94 |
| Recall | 0.93 |
| F1-score | 0.935 |
| False positives per 1000 frames | 5.4 |
| Average FPS (Raspberry Pi 5, full pipeline) | 21.1 |
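
The rows of the table above are related by simple arithmetic, which the following minimal sketch makes explicit. It assumes the standard harmonic-mean definition of the F1-score; the source does not state which formula was used, and for precision and recall this close the arithmetic mean gives almost the same number. The per-frame latency and the false-positive rate per minute are derived quantities that do not appear in the tables, and the function names are hypothetical, chosen only for illustration.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition, assumed here)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def frame_latency_ms(fps: float) -> float:
    """Average time budget per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


def false_positives_per_minute(fp_per_1000_frames: float, fps: float) -> float:
    """Convert a per-1000-frames false-positive count into an expected per-minute count."""
    frames_per_minute = fps * 60.0
    return fp_per_1000_frames * frames_per_minute / 1000.0


# Values copied from the table above.
precision, recall = 0.94, 0.93
fps = 21.1
fp_per_1000 = 5.4

print(f"F1 = {f1_score(precision, recall):.3f}")
print(f"latency = {frame_latency_ms(fps):.1f} ms/frame")
print(f"false positives = {false_positives_per_minute(fp_per_1000, fps):.1f} per minute")
```

With the tabulated precision of 0.94 and recall of 0.93, the computed F1 rounds to 0.935, in agreement with the reported value. The same function can be applied to the precision and recall rows of the other result tables as a consistency check.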

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Published by MDPI on behalf of the International Institute of Knowledge Innovation and Invention. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Bilous, N.; Malko, V.; Ahekian, I.; Korobiichuk, I.; Ivanichev, V. Comparative Evaluation of YOLO Models for Human Position Recognition with UAVs During a Flood. Appl. Syst. Innov. 2026, 9, 6. https://doi.org/10.3390/asi9010006

