A Vision-Based Sensing Framework for PPE Detection and Safety Harness Compliance Recognition in High-Formwork Construction Environments Using YOLO-ILB
Abstract
1. Introduction
- (1) YOLO-ILB, a PPE detector for HFSS built on YOLO11n, is proposed. It integrates three targeted improvements: C3k2_IDWC for enhanced multi-scale feature extraction at reduced computational cost; SPPF_LSKA for stronger global context awareness at the backbone tail; and a BiFPN neck for bidirectional, weighted cross-scale feature fusion.
- (2) A dedicated HFSS PPE dataset is constructed from 2700 drone-captured images collected across 17 real construction sites and multiple project types. Five categories are annotated (helmet, no-helmet, harness, no-harness, and safety hook), covering both the presence and absence of helmets and harnesses.
- (3) A geometry-constrained harness compliance detection method is proposed. It classifies harness usage into three states: correct high-point anchoring, incorrect low-point anchoring, and a dangerous state in which the hook is unclipped or too far away. The method achieves 90.82% overall accuracy on 305 field instances without additional sensors or annotations.
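
A minimal sketch of what a geometry-constrained rule of this kind could look like when it works only on detector bounding boxes. The rule itself (comparing the hook's vertical image position with the harness box, plus a distance cap scaled by the harness box height) and every name and threshold (`Box`, `classify_anchoring`, `max_reach_ratio`) are assumptions made for illustration; they are not the criteria used in the paper.

```python
from dataclasses import dataclass
from math import hypot
from typing import Optional, Tuple


@dataclass(frozen=True)
class Box:
    """Axis-aligned box in image pixels; y grows downward."""
    x1: float
    y1: float
    x2: float
    y2: float

    @property
    def center(self) -> Tuple[float, float]:
        return ((self.x1 + self.x2) / 2, (self.y1 + self.y2) / 2)

    @property
    def height(self) -> float:
        return self.y2 - self.y1


def classify_anchoring(harness: Box, hook: Optional[Box],
                       max_reach_ratio: float = 2.0) -> str:
    """Return 'high', 'low', or 'unclipped_or_too_far' (hypothetical rule)."""
    if hook is None:
        # No hook detected near the worker: treat as unclipped.
        return "unclipped_or_too_far"
    hx, hy = hook.center
    cx, cy = harness.center
    # Normalize distance by harness box height so the rule is scale-free.
    if hypot(hx - cx, hy - cy) > max_reach_ratio * harness.height:
        return "unclipped_or_too_far"
    # A smaller y value means the hook sits higher in the image.
    return "high" if hy < cy else "low"
```

Image-space height is only a proxy for physical height under an oblique drone view, so a rule like this would need calibration against field data before use.
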
2. Literature Review
3. Methods
3.1. Dataset Construction
3.2. YOLO-ILB Model Architecture
- (1) C3k2_IDWC: Multi-Scale Feature Extraction
- (2) SPPF_LSKA: Global Context Enhancement
- (3) BiFPN: Bidirectional Weighted Feature Fusion
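
The weighted fusion step of a BiFPN node can be illustrated with the fast normalized fusion introduced in EfficientDet, where each input gets a nonnegative weight and the weights are normalized by their sum: O = Σ wᵢ·Iᵢ / (ε + Σ wⱼ). The sketch below is dependency-free and treats feature maps as flat lists of floats; the function name is illustrative, and whether YOLO-ILB uses exactly this variant is an assumption. In a real network the weights are learned and the fused map is passed through a convolution.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse equally sized feature vectors with nonnegative, normalized weights."""
    # ReLU keeps every weight nonnegative.
    relu_w = [max(w, 0.0) for w in weights]
    # eps avoids division by zero when all weights are clipped to zero.
    total = eps + sum(relu_w)
    norm_w = [w / total for w in relu_w]
    size = len(features[0])
    return [sum(w * f[i] for w, f in zip(norm_w, features)) for i in range(size)]


# Two inputs with equal weights: the output is close to their average.
fused = fast_normalized_fusion([[1.0, 1.0], [3.0, 3.0]], [1.0, 1.0])
```

Because the normalization is a plain division rather than a softmax, it is cheap to evaluate while still bounding each normalized weight between 0 and 1.
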
3.3. Recognition of High-Anchoring and Low-Anchoring
4. Experiment
4.1. Comparative Experiment
4.2. Ablation Experiment
4.3. Results and Analysis of Safety Harness High-Anchoring Recognition
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References













| Category | Specification | Parameter |
|---|---|---|
| UAV | Takeoff weight | <249 g |
| | Dimensions (L × W × H) | 251 × 362 × 70 mm |
| | Max flight time | 47 min |
| | Max wind resistance | 10.7 m/s |
| | Max tilt angle | 40° (forward)/35° (backward) |
| Camera | Image sensor | 1/1.3 inch, 48 MP |
| | Focal length (equiv.) | 24 mm |
| | Aperture | f/1.7 |
| | Field of view | 82.1° |
| | Shutter speed | 1/8000 s–2 s |
| | Image format | JPEG/DNG (RAW) |
| Gimbal | Tilt range | −90° to +60° |
| | Max control speed | 100°/s |

| Version | Base | Key Innovations |
|---|---|---|
| YOLOv8 | YOLOv5 | C2f module; optimized loss function |
| YOLOv10 | YOLOv8 | Dual-label assignment; PSA module; C2fCIB |
| YOLO11 | YOLOv8 | Dual-label assignment; C2PSA; C3k2 module |
| YOLO12 | YOLOv8 | Dual-label assignment; A2C2f + C3k2 modules |

| Model | mAP50 | Params (M) | FLOPs (G) | FPS (img/s) |
|---|---|---|---|---|
| YOLOv8n | 0.906 | 2.685 | 6.8 | 281.2 |
| YOLOv8s | 0.908 | 9.830 | 23.4 | 177.3 |
| YOLOv9t | 0.901 | 1.731 | 6.4 | 249.9 |
| YOLOv9s | 0.909 | 6.196 | 22.1 | 164.9 |
| YOLOv10n | 0.899 | 2.266 | 6.5 | 281.4 |
| YOLOv10s | 0.902 | 7.220 | 21.4 | 167.1 |
| YOLO11n | 0.910 | 2.583 | 6.3 | 256.3 |
| YOLO11s | 0.911 | 9.415 | 21.3 | 159.2 |
| YOLO-ILB | 0.939 | 1.923 | 5.7 | 262.3 |
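
The trade-off between YOLO-ILB and its YOLO11n baseline can be read directly off the two corresponding rows. The short sketch below computes the signed relative change for each column; the helper name `relative_change` and the dictionary keys are illustrative, and the numbers are copied from the table.

```python
def relative_change(new: float, base: float) -> float:
    """Signed percentage change of `new` relative to `base`."""
    return (new - base) / base * 100.0


# Rows copied from the comparison table.
baseline = {"mAP50": 0.910, "params_M": 2.583, "GFLOPs": 6.3, "FPS": 256.3}  # YOLO11n
proposed = {"mAP50": 0.939, "params_M": 1.923, "GFLOPs": 5.7, "FPS": 262.3}  # YOLO-ILB

changes = {k: round(relative_change(proposed[k], baseline[k]), 2) for k in baseline}
# Absolute mAP50 gain in percentage points.
map_gain_points = round((proposed["mAP50"] - baseline["mAP50"]) * 100, 1)

print(changes)          # params_M: -25.55, GFLOPs: -9.52
print(map_gain_points)  # 2.9
```

On these figures, YOLO-ILB gains 2.9 mAP50 points with roughly a quarter fewer parameters and about 9.5% fewer FLOPs than YOLO11n.
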

| Category | Precision | Recall | AP50 |
|---|---|---|---|
| helmet | 0.961 | 0.955 | 0.975 |
| no-helmet | 0.934 | 0.929 | 0.949 |
| harness | 0.956 | 0.950 | 0.968 |
| no-harness | 0.925 | 0.921 | 0.952 |
| safety hook | 0.884 | 0.846 | 0.851 |
| Mean | 0.932 | 0.920 | 0.939 |
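
The Mean row is consistent with an unweighted (macro) average over the five categories. The sketch below recomputes it from the per-class rows; the variable names are illustrative and the values are copied from the table.

```python
from statistics import mean

# (precision, recall, AP50) per category, copied from the table.
per_class = {
    "helmet":      (0.961, 0.955, 0.975),
    "no-helmet":   (0.934, 0.929, 0.949),
    "harness":     (0.956, 0.950, 0.968),
    "no-harness":  (0.925, 0.921, 0.952),
    "safety hook": (0.884, 0.846, 0.851),
}

# Column-wise macro average: every class counts equally.
macro = tuple(round(mean(col), 3) for col in zip(*per_class.values()))
# Category with the lowest AP50.
weakest = min(per_class, key=lambda name: per_class[name][2])

print(macro)    # (0.932, 0.92, 0.939)
print(weakest)  # safety hook
```

The safety hook is the weakest category on all three metrics, so it is the one that pulls the mean down.
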

| Model | C3k2_IDWC | SPPF_LSKA | BiFPN | mAP50 | Params (M) | FLOPs (G) | FPS (img/s) |
|---|---|---|---|---|---|---|---|
| YOLO11n | × | × | × | 0.910 | 2.583 | 6.3 | 256.3 |
| YOLO11n-I | √ | × | × | 0.916 | 2.400 | 6.1 | 270.5 |
| YOLO11n-L | × | √ | × | 0.919 | 2.856 | 6.5 | 265.6 |
| YOLO11n-B | × | × | √ | 0.922 | 1.834 | 5.7 | 268.9 |
| YOLO11n-IL | √ | √ | × | 0.929 | 2.673 | 6.3 | 263.0 |
| YOLO-ILB | √ | √ | √ | 0.939 | 1.923 | 5.7 | 262.3 |
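
One way to read the ablation rows is as mAP50 gains, in percentage points, over the YOLO11n baseline. The sketch below computes them from the table values; the variable names are illustrative.

```python
# mAP50 per ablation variant, copied from the table.
ablation = {
    "YOLO11n":    0.910,
    "YOLO11n-I":  0.916,
    "YOLO11n-L":  0.919,
    "YOLO11n-B":  0.922,
    "YOLO11n-IL": 0.929,
    "YOLO-ILB":   0.939,
}

base = ablation["YOLO11n"]
gain_points = {
    name: round((score - base) * 100, 1)
    for name, score in ablation.items()
    if name != "YOLO11n"
}
# Sum of the three single-module gains, for comparison with the full model.
single_sum = round(sum(gain_points[n] for n in ("YOLO11n-I", "YOLO11n-L", "YOLO11n-B")), 1)

print(gain_points)  # I: 0.6, L: 0.9, B: 1.2, IL: 1.9, ILB: 2.9
print(single_sum)   # 2.7
```

The full model's 2.9-point gain is slightly larger than the 2.7-point sum of the three single-module gains, which suggests the modules do not interfere with one another, though a difference this small may lie within run-to-run variation.
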

| Recognition Type | Sample Size | Correctly Recognized | Individual Accuracy (%) | Overall Accuracy (%) |
|---|---|---|---|---|
| High-Anchoring | 138 | 124 | 89.86 | 90.82 |
| Low-Anchoring | 94 | 85 | 90.43 | |
| Unclipped/Hook Too Far | 73 | 68 | 93.15 | |

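
The individual and overall accuracies follow from the sample counts: each individual accuracy is the number of correctly recognized instances divided by the sample size, and the overall accuracy pools all 305 instances. A short check, with illustrative variable names and counts copied from the table:

```python
# (sample size, correctly recognized) per recognition type, copied from the table.
results = {
    "High-Anchoring":         (138, 124),
    "Low-Anchoring":          (94, 85),
    "Unclipped/Hook Too Far": (73, 68),
}

per_type = {name: round(c / n * 100, 2) for name, (n, c) in results.items()}
total_n = sum(n for n, _ in results.values())
total_c = sum(c for _, c in results.values())
# Pooled accuracy over all instances, not the mean of the three percentages.
overall = round(total_c / total_n * 100, 2)

print(per_type)                   # 89.86, 90.43, 93.15
print(total_n, total_c, overall)  # 305 277 90.82
```

The overall figure is weighted by sample size, so it sits closer to the High-Anchoring accuracy than a plain average of the three percentages would.
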
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Yao, G.; Liu, L.; Yang, Y.; Cai, X. A Vision-Based Sensing Framework for PPE Detection and Safety Harness Compliance Recognition in High-Formwork Construction Environments Using YOLO-ILB. Sensors 2026, 26, 4147. https://doi.org/10.3390/s26134147

