PSD-YOLO: An Enhanced Real-Time Framework for Robust Worker Detection in Complex Offshore Oil Platform Environments
Abstract
1. Introduction
- (1) This paper proposes PSD-YOLO, a YOLOv10s-based framework optimized for worker detection on offshore oil platforms. The architecture is designed to balance high accuracy and real-time performance within a lightweight structure, enabling rapid and precise worker detection in complex offshore environments.
- (2) To enhance long-range contextual modeling, the Channel Attention-Aware (CAA) module is integrated after the C2fCIB block in the backbone network. By dynamically reweighting target features, it sharpens the focus on targets, suppresses background noise, and improves detection accuracy against complex backgrounds. In the neck network, the newly designed C2fCIB_Conv2Former module replaces the original two C2f modules. Using large-kernel depth-wise convolution, it strengthens multi-scale feature fusion and improves the model’s ability to detect small targets.
- (3) To address missed detections in occlusion scenarios on offshore platforms, we incorporate the Soft Non-Maximum Suppression (Soft-NMS) algorithm. Unlike traditional NMS, Soft-NMS gradually decays the scores of overlapping boxes instead of discarding them outright, which reduces missed detections and improves detection accuracy and robustness.
- (4) To achieve robust adaptation in offshore environments, a strategy of Domain Adaptation via Cross-Domain Knowledge Transfer is employed. By leveraging foundational knowledge from pre-trained models, this approach improves the model’s robustness in the unique and complex environment of offshore platforms.
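The score decay described in contribution (3) can be illustrated with a short NumPy sketch. It assumes the Gaussian penalty from the original Soft-NMS paper by Bodla et al.; the text above does not state whether PSD-YOLO uses the linear or the Gaussian variant, and the function names, `sigma`, and `score_thresh` are illustrative choices rather than the authors' settings.

```python
import numpy as np


def iou_one_to_many(box, boxes):
    """IoU between one box and an (N, 4) array of boxes in [x1, y1, x2, y2] format."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)


def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS (assumed variant).

    Returns the original indices of the kept boxes, in selection order,
    together with their (possibly decayed) scores.
    """
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float).copy()
    remaining = list(range(len(scores)))
    keep, kept_scores = [], []
    while remaining:
        # Select the highest-scoring box among those still under consideration.
        best = max(remaining, key=lambda i: scores[i])
        keep.append(best)
        kept_scores.append(float(scores[best]))
        remaining.remove(best)
        if not remaining:
            break
        rest = np.array(remaining)
        overlaps = iou_one_to_many(boxes[best], boxes[rest])
        # Decay neighbours by overlap instead of deleting them (hard NMS would delete).
        scores[rest] *= np.exp(-(overlaps ** 2) / sigma)
        # Only boxes whose score has decayed below the threshold are dropped.
        remaining = [i for i in remaining if scores[i] >= score_thresh]
    return keep, kept_scores
```

For two heavily overlapping boxes (IoU of about 0.68) scored 0.9 and 0.8, this sketch keeps both and lowers the second score to roughly 0.32, whereas hard NMS with a 0.5 IoU threshold would discard the second box. That retained, down-weighted box is what helps when one worker partially occludes another.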
2. Related Work
3. The Proposed PSD-YOLO Algorithm
3.1. The CAA Module
3.2. C2fCIB_Conv2Former Module
3.3. Soft-NMS Algorithm
3.4. Domain Adaptation via Cross-Domain Knowledge Transfer
4. Experiments Analysis
4.1. Data Set
4.2. Environment Setup
4.3. Evaluation Metrics
4.4. Analysis of Experimental Results
4.4.1. Ablation Study
4.4.2. Comparative Experiments
- (1) Performance Validation of the C2fCIB_Conv2Former Module. To validate the effectiveness of the proposed C2fCIB_Conv2Former module, we conducted a comparative analysis. The baseline is the original YOLOv10s-based model containing only the basic C2fCIB structure; its performance is given in the first row of Table 4. For a fair comparison, our module and several state-of-the-art attention mechanisms were each integrated into the same baseline architecture. The competing modules are Monte Carlo Attention (MCAttn) [46], Dynamic Convolution (DynamicConv) [47], Single-Head Self-Attention (SHSA) [48], BiFormer (BiF) [49], and Parallelized Patch-Aware Attention (PPA) [50]. The detailed results are presented in Table 4, with the best result highlighted in bold.

  As shown in Table 4, the C2fCIB_Conv2Former model achieves the highest mAP@0.5 (81.3%), mAP@0.5-0.95 (55.8%), and recall (73.2%) among the compared variants for worker detection against complex offshore drilling platform backgrounds, while reducing the computational load of the baseline from 24.4 to 24.0 GFLOPs. Its precision of 89.1% is within 0.1 percentage points of the best value (89.2%, C2fCIB_DynamicConv). These results indicate that the C2fCIB_Conv2Former module improves detection accuracy for small targets in complex backgrounds and occlusion scenarios, and they demonstrate its potential for practical deployment.
- (2) Performance Validation of the CAA Module. To validate the effectiveness of the CAA module in improving worker detection on offshore oil platforms, we compare it with several mainstream attention mechanisms: Squeeze-and-Excitation (SE) [51], the Convolutional Block Attention Module (CBAM) [52], Efficient Channel Attention (ECA) [53], Coordinate Attention (CA) [54], the Simple Parameter-Free Attention Module (SimAM) [55], the Global Attention Mechanism (GAM) [56], Efficient Multi-Scale Attention (EMA) [57], ShuffleAttention [58], and the Normalization-based Attention Module (NAM) [59]. The unmodified YOLOv10s model serves as the baseline (first row of Table 5), against which we evaluate the change in performance after integrating each attention module. The detailed results are presented in Table 5.

  As shown in Table 5, the CAA module achieves the highest precision (0.888), recall (0.736), mAP@0.5 (0.817), and mAP@0.5-0.95 (0.550) among all compared methods, suggesting that CAA captures targets more comprehensively and improves multi-scale detection robustness. Its mAP@0.5 is 0.4 percentage points above the next-best results (0.813, GAM and ShuffleAttention), and its recall of 0.736 indicates a stronger ability to detect targets against complex backgrounds.

  Its inference speed (256.41 FPS) is lower than that of the baseline YOLOv10s (285.71 FPS) and equal to that of SE, SimAM, and GAM. Although ShuffleAttention and GAM are close to CAA in mAP@0.5, their mAP@0.5-0.95 scores are 1.1 and 0.9 percentage points lower, respectively, highlighting CAA’s stronger adaptability to complex scenarios.

  Overall, by optimizing the channel attention mechanism, CAA improves detection performance while preserving real-time capability, making it well suited to the precise identification of small targets in dynamic offshore environments.
- (3) Comparative Experiments with Mainstream Algorithms. To objectively evaluate detection performance, this study compares PSD-YOLO with several mainstream detection models on the offshore oil drilling platform worker detection dataset: the two-stage detector Faster R-CNN, known for its high detection accuracy, and the single-stage detectors SSD, YOLOv5s, YOLOv8s, YOLOv10s, and YOLOv11s.

  The detection results are presented in Table 6. Although some models lead on a single metric (SSD in precision; YOLOv5s and YOLOv11s in FPS), each trails PSD-YOLO in both mAP@0.5 and recall, making them less suitable for a task that requires balanced performance. In contrast, the proposed PSD-YOLO achieves the highest detection accuracy (mAP@0.5) and recall, together with the second-best precision. This demonstrates that PSD-YOLO strikes a superior balance between accuracy and efficiency, making it an effective and robust solution for personnel detection in dynamic marine environments. As the FPS column of Table 6 shows, models such as YOLOv5s have a higher raw speed but lower accuracy; PSD-YOLO achieves the highest mAP@0.5 (0.825) while maintaining an inference speed of 232.56 FPS, which fully meets the requirements for real-time monitoring.

  Figure 7 illustrates the mAP@0.5 convergence curves of the different models during training. PSD-YOLO converges the fastest and exhibits the most stable performance: its mAP rises rapidly within the first 50 epochs and stabilizes at around 100 epochs, outperforming all other models. The mAP values of YOLOv5s, YOLOv8s, YOLOv10s, and YOLOv11s are also relatively high, but they remain slightly below PSD-YOLO in the final stages of training, indicating weaker adaptability to this complex detection task. The mAP curves of SSD and Faster R-CNN remain at a low level throughout training. Specifically, the mAP of SSD rises rapidly in the early stages and then stabilizes at around 0.6, while the mAP of Faster R-CNN stabilizes at around 0.74, indicating the limited capability of these two models in handling complex object detection tasks.
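The real-time claim in item (3) is easier to judge as per-frame latency. The sketch below converts the FPS values reported in Table 6 into milliseconds per frame and checks them against a camera frame rate. The 30 FPS camera rate is a hypothetical target, not a figure from the paper, and the text does not state whether the reported FPS includes pre- and post-processing.

```python
# FPS values copied from Table 6.
FPS_TABLE6 = {
    "SSD": 38.35,
    "YOLOv5s": 526.32,
    "YOLOv8s": 500.00,
    "YOLOv10s": 285.71,
    "YOLOv11s": 526.32,
    "Faster R-CNN": 26.49,
    "PSD-YOLO": 232.56,
}


def latency_ms(fps):
    """Average time spent on one frame, in milliseconds."""
    return 1000.0 / fps


def meets_budget(fps, camera_fps=30.0):
    """True if the detector keeps up with a camera running at camera_fps.

    camera_fps is a hypothetical deployment target, not a value from the paper.
    """
    return fps >= camera_fps


for name, fps in FPS_TABLE6.items():
    status = "ok" if meets_budget(fps) else "too slow"
    print(f"{name:>12}: {latency_ms(fps):6.2f} ms/frame ({status})")
```

Under these assumptions PSD-YOLO needs about 4.3 ms per frame, far inside the roughly 33 ms budget of a 30 FPS stream, while Faster R-CNN at 26.49 FPS falls short of that budget.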
4.4.3. Visualization Results
5. Discussion
5.1. Interpretation of Results and Comparison with Related Work
5.2. Limitations and Practical Deployment Challenges
5.3. Future Work
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Vidal, P.d.C.J.; González, M.O.A.; de Vasconcelos, R.M.; de Melo, D.C.; de Oliveira Ferreira, P.; Sampaio, P.G.V.; da Silva, D.R. Decommissioning of offshore oil and gas platforms: A systematic literature review of factors involved in the process. Ocean Eng. 2022, 255, 111428.
- Hatefi, M.A.; Balilehvand, H.R. Risk assessment of oil and gas drilling operation: An empirical case using a hybrid GROC-VIMUN-modified FMEA method. Process Saf. Environ. Prot. 2023, 170, 392–402.
- Wang, H.; Huang, H.; Bi, W.; Ji, G.; Zhou, B.; Zhuo, L. Deep and ultra-deep oil and gas well drilling technologies: Progress and prospect. Nat. Gas Ind. B 2022, 9, 141–157.
- Ye, H.; Jiang, C.; Zu, F.; Li, S. Design of a structural health monitoring system and performance evaluation for a jacket offshore platform in East China Sea. Appl. Sci. 2022, 12, 12021.
- Xia, S.; Qin, R.; Lu, Y.; Ma, L.; Liu, Z. A Monocular Vision-Based Safety Monitoring Framework for Offshore Infrastructures Utilizing Grounded SAM. J. Mar. Sci. Eng. 2025, 13, 340.
- Fang, W.; Ding, L.; Luo, H.; Love, P.E. Falls from heights: A computer vision-based approach for safety harness detection. Autom. Constr. 2018, 91, 53–61.
- Park, M.W.; Brilakis, I. Construction worker detection in video frames for initializing vision trackers. Autom. Constr. 2012, 28, 15–25.
- Cai, L.; Qian, J. A method for detecting miners based on helmets detection in underground coal mine videos. Min. Sci. Technol. (China) 2011, 21, 553–556.
- Sun, Y. Research on, Design, and Implementation of an Intelligent Detection System for Unsafe Behaviors of Construction Workers. Master’s Thesis, Shenzhen University, Shenzhen, China, 2020.
- Fang, L.; Zhao, Z.; Yan, Z.; Dai, Z.; Chen, G. A Construction Worker Detection Algorithm Based on Single-Stage Semi-Supervised Object Detection. Microelectron. Comput. 2025, 42, 20–30.
- Roberts, D.; Torres Calderon, W.; Tang, S.; Golparvar-Fard, M. Vision-based construction worker activity analysis informed by body posture. J. Comput. Civ. Eng. 2020, 34, 04020017.
- Zhang, M.; Cao, Z.; Yang, Z.; Zhao, X. Utilizing computer vision and fuzzy inference to evaluate level of collision safety for workers and equipment in a dynamic environment. J. Constr. Div. Am. Soc. Civ. Eng. 2020, 146, 04020051.
- Shukla, A.; Karki, H. Application of robotics in onshore oil and gas industry - A review Part I. Robot. Auton. Syst. 2016, 75, 490–507.
- Amaechi, C.V.; Reda, A.; Butler, H.O.; Ja’e, I.A.; An, C. Review on fixed and floating offshore structures. Part I: Types of platforms with some applications. J. Mar. Sci. Eng. 2022, 10, 1074.
- Amaechi, C.V.; Reda, A.; Butler, H.O.; Ja’e, I.A.; An, C. Review on fixed and floating offshore structures. Part II: Sustainable design approaches and project management. J. Mar. Sci. Eng. 2022, 10, 973.
- Ji, X.; Gong, F.; Yuan, X.; Wang, N. A high-performance framework for personal protective equipment detection on the offshore drilling platform. Complex Intell. Syst. 2023, 9, 5637–5652.
- Gong, F.; Ma, Y.; Zheng, P.; Song, T. A deep model method for recognizing activities of workers on offshore drilling platform by multistage convolutional pose machine. J. Loss Prev. Process Ind. 2020, 64, 104043.
- Gong, F.; Ji, X.; Gong, W.; Yuan, X.; Gong, C. Deep learning based protective equipment detection on offshore drilling platform. Symmetry 2021, 13, 954.
- Yang, H.; Ling, Y.; Zhang, D. Research on a Personnel Localization Algorithm for Offshore Drilling Platforms Based on the YOLO Algorithm. Mod. Transm. 2024, 41–44.
- Li, Z.; Zhang, H.; Gao, D.; Wu, Z.; Zhang, Z.; Du, L. ACD-Net: An Abnormal Crew Detection Network for Complex Ship Scenarios. Sensors 2024, 24, 7288.
- Pham, V.T.; Le, Q.B.; Nguyen, D.A.; Dang, N.D.; Huynh, H.T.; Tran, D.T. Multi-sensor data fusion in a real-time support system for on-duty firefighters. Sensors 2019, 19, 4746.
- Ma, J.; Li, H.; Wang, L.; Yu, X.; Huang, X. Multimodal fusion for monitoring worker fatigue in elevated work environments. Adv. Eng. Inform. 2025, 67, 103565.
- Chen, X.; Yu, Y.; Wang, Y.; Hu, Z.Z. Multimodal data fusion for ergonomic assessment of construction workers in visually obstructed environments. Autom. Constr. 2025, 179, 106495.
- Wang, Z.; Yan, J. Multi-sensor fusion based industrial action recognition method under the environment of intelligent manufacturing. J. Manuf. Syst. 2024, 74, 575–586.
- Sim, H.; Kim, H. Establishment of a Real-Time Risk Assessment and Preventive Safety Management System in Industrial Environments Utilizing Multimodal Data and Advanced Deep Reinforcement Learning Techniques. Int. J. Adv. Sci. Eng. Inf. Technol. 2025, 15, 328–337.
- Zhao, X.; Wang, L.; Zhang, Y.; Han, X.; Deveci, M.; Parmar, M. A review of convolutional neural networks in computer vision. Artif. Intell. Rev. 2024, 57, 99.
- Du, L.; Zhang, R.; Wang, X. Overview of two-stage object detection algorithms. J. Phys. Conf. Ser. 2020, 1544, 012033.
- Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23 June 2014; pp. 580–587.
- Girshick, R. Fast R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 1440–1448.
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 2016, 39, 1137–1149.
- Terven, J.; Córdova-Esparza, D.M.; Romero-González, J.A. A comprehensive review of YOLO architectures in computer vision: From YOLOv1 to YOLOv8 and YOLO-NAS. Mach. Learn. Knowl. Extr. 2023, 5, 1680–1716.
- Alhaila, S.; Almelhem, I.; Alsnafi, A.; Alameerah, M.; Alrumaidhi, K.; Nadeem, M. YOLO-Based Helmet Detection System for Safety Compliance in Oil and Gas Industry. In Proceedings of the 2024 5th International Conference on Industrial Engineering and Artificial Intelligence (IEAI), IEEE, Chiang Mai, Thailand, 24–26 April 2024; pp. 16–23.
- Wang, Y.; Zhang, J.; Zhu, J.; Ge, Y.; Zhai, G. Research on the Visual Perception of Ship Engine Rooms Based on Deep Learning. J. Mar. Sci. Eng. 2023, 11, 1450.
- Yang, J. Research and Application of Small Object Detection Algorithms in Complex Environments of Drilling Sites. Master’s Thesis, China University of Petroleum (East China), Qingdao, China, 2022.
- Li, Y.; Zhang, B.; Liu, Y.; Wang, H.; Zhang, S. Personnel Monitoring in Shipboard Surveillance Using Improved Multi-Object Detection and Tracking Algorithm. Sensors 2024, 24, 5756.
- Sun, F.; Zhang, S.; Qu, C.; Li, Z. Deep learning-based method for detecting the association between the personnel operating attitudes and the operational targets on offshore platform. Neural Comput. Appl. 2025, 37, 16445–16460.
- Wang, C.Y.; Bochkovskiy, A.; Liao, H.Y.M. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 18–22 June 2023; pp. 7464–7475.
- Sohan, M.; Sai Ram, T.; Rami Reddy, C.V. A review on YOLOv8 and its advancements. In Proceedings of the International Conference on Data Intelligence and Cognitive Informatics, Tirunelveli, India, 18–20 November 2024; Springer: Berlin/Heidelberg, Germany, 2024; pp. 529–545.
- Cai, X.; Lai, Q.; Wang, Y.; Wang, W.; Sun, Z.; Yao, Y. Poly kernel inception network for remote sensing detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 17–21 June 2024; pp. 27706–27716.
- Chollet, F. Xception: Deep learning with depthwise separable convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 1251–1258.
- Hou, Q.; Lu, C.Z.; Cheng, M.M.; Feng, J. Conv2Former: A simple transformer-style convnet for visual recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2024, 46, 8274–8283.
- Wang, X.; Zhang, B.; Li, C.; Ji, R.; Han, J.; Cao, X.; Liu, J. Modulated convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 840–848.
- Guo, Y.; Li, Y.; Wang, L.; Rosing, T. Depthwise convolution is all you need for learning multiple visual domains. In Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA, 27 January–1 February 2019; Volume 33, pp. 8368–8375.
- Bodla, N.; Singh, B.; Chellappa, R.; Davis, L.S. Soft-NMS–improving object detection with one line of code. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 5561–5569.
- Weiss, K.; Khoshgoftaar, T.M.; Wang, D. A survey of transfer learning. J. Big Data 2016, 3, 9.
- Dai, W.; Liu, R.; Wu, Z.; Wu, T.; Wang, M.; Zhou, J.; Yuan, Y.; Liu, J. Exploiting scale-variant attention for segmenting small medical objects. arXiv 2024, arXiv:2407.07720.
- Han, K.; Wang, Y.; Guo, J.; Wu, E. ParameterNet: Parameters are all you need for large-scale visual pretraining of mobile networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 17–21 June 2024; pp. 15751–15761.
- Yun, S.; Ro, Y. SHViT: Single-head vision transformer with memory efficient macro design. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 17–21 June 2024; pp. 5756–5767.
- Zhu, L.; Wang, X.; Ke, Z.; Zhang, W.; Lau, R.W. BiFormer: Vision transformer with bi-level routing attention. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 18–22 June 2023; pp. 10323–10333.
- Xu, S.; Zheng, S.; Xu, W.; Xu, R.; Wang, C.; Zhang, J.; Teng, X.; Li, A.; Guo, L. HCF-Net: Hierarchical context fusion network for infrared small object detection. In Proceedings of the 2024 IEEE International Conference on Multimedia and Expo (ICME), Niagara Falls, ON, Canada, 15–19 July 2024; pp. 1–6.
- Hu, J.; Shen, L.; Sun, G. Squeeze-and-excitation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 7132–7141.
- Woo, S.; Park, J.; Lee, J.Y.; Kweon, I.S. CBAM: Convolutional block attention module. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 3–19.
- Wang, Q.; Wu, B.; Zhu, P.; Li, P.; Zuo, W.; Hu, Q. ECA-Net: Efficient channel attention for deep convolutional neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Virtual Event, 13–19 June 2020; pp. 11534–11542.
- Hou, Q.; Zhou, D.; Feng, J. Coordinate attention for efficient mobile network design. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Virtual Event, 19–25 June 2021; pp. 13713–13722.
- Yang, L.; Zhang, R.Y.; Li, L.; Xie, X. SimAM: A simple, parameter-free attention module for convolutional neural networks. In Proceedings of the International Conference on Machine Learning, PMLR, Virtual Event, 18–24 July 2021; pp. 11863–11874.
- Liu, Y.; Shao, Z.; Hoffmann, N. Global attention mechanism: Retain information to enhance channel-spatial interactions. arXiv 2021, arXiv:2112.05561.
- Ouyang, D.; He, S.; Zhang, G.; Luo, M.; Guo, H.; Zhan, J.; Huang, Z. Efficient multi-scale attention module with cross-spatial learning. In Proceedings of the ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, Rhodes Island, Greece, 4–10 June 2023; pp. 1–5.
- Zhang, Q.L.; Yang, Y.B. SA-Net: Shuffle attention for deep convolutional neural networks. In Proceedings of the ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, Virtual Event, 6–11 June 2021; pp. 2235–2239.
- Liu, Y.; Shao, Z.; Teng, Y.; Hoffmann, N. NAM: Normalization-based attention module. arXiv 2021, arXiv:2111.12419.

Table 1. Experimental environment.

| Component | Specification |
|---|---|
| Operating System | Ubuntu 20.04 |
| GPU | NVIDIA GeForce RTX 3090 (24 GB) |
| CPU | 15 vCPU Intel® Xeon® Platinum 8362 @ 2.80 GHz |
| Framework | PyTorch 1.11, CUDA 11.3 |

Table 2. Training hyperparameters.

| Hyperparameter | Value |
|---|---|
| Optimizer | Stochastic Gradient Descent (SGD) |
| Input Image Size | 960 × 540 pixels |
| Batch Size | 8 |
| Total Epochs | 300 |
| Initial Learning Rate | 0.01 |
| Learning Rate Scheduler | Cosine annealing |
| Learning Rate Decay Factor | 0.937 |
| Momentum Factor | 0.937 |
| Weight Decay | 0.0005 |
| Warm-up Phase | 20 epochs |
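
The hyperparameter table combines a 20-epoch warm-up with cosine annealing over 300 epochs from an initial learning rate of 0.01. A minimal sketch of such a schedule is given below. It assumes a linear warm-up and a hypothetical floor `final_lr_fraction`; the table's decay factor is not used because the source does not say how it enters the schedule, so this should not be read as the authors' exact training configuration.

```python
import math


def learning_rate(epoch, lr0=0.01, total_epochs=300, warmup_epochs=20,
                  final_lr_fraction=0.01):
    """Learning rate for a zero-based epoch index.

    lr0, total_epochs and warmup_epochs follow the hyperparameter table.
    final_lr_fraction (final LR as a fraction of lr0) and the linear shape
    of the warm-up are assumptions made for this sketch.
    """
    if epoch < warmup_epochs:
        # Linear warm-up: reaches lr0 at the last warm-up epoch.
        return lr0 * (epoch + 1) / warmup_epochs
    # Cosine annealing from lr0 down to lr0 * final_lr_fraction.
    progress = (epoch - warmup_epochs) / max(1, total_epochs - warmup_epochs - 1)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return lr0 * (final_lr_fraction + (1.0 - final_lr_fraction) * cosine)


# A few sample points along the schedule.
for e in (0, 19, 20, 160, 299):
    print(f"epoch {e:3d}: lr = {learning_rate(e):.6f}")
```

With these assumptions the rate climbs from 0.0005 to 0.01 during warm-up and then decays smoothly to 0.0001 at the final epoch.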

Table 3. Ablation study (✓ = component enabled; - = component not used).

| Model | Transfer Learning (TL) | Soft-NMS | CAA | C2fCIB_Conv2Former | mAP@0.5 | P | R | FPS | GFLOPs |
|---|---|---|---|---|---|---|---|---|---|
| 1 | - | - | - | - | 0.805 | 0.875 | 0.721 | 285.71 | 24.4 |
| 2 | ✓ | - | - | - | 0.806 | 0.887 | 0.723 | 270.27 | 24.8 |
| 3 | ✓ | ✓ | - | - | 0.810 | 0.881 | 0.731 | 270.27 | 24.8 |
| 4 | ✓ | ✓ | ✓ | - | 0.819 | 0.892 | 0.732 | 270.27 | 25.2 |
| 5 | ✓ | ✓ | ✓ | ✓ | 0.825 | 0.895 | 0.744 | 232.56 | 24.2 |

Table 4. Comparison of C2fCIB module variants (best result per column in bold).

| Model | P | R | mAP@0.5 | mAP@0.5-0.95 | FPS | GFLOPs |
|---|---|---|---|---|---|---|
| C2fCIB (Baseline) | 0.875 | 0.721 | 0.805 | 0.538 | 285.71 | 24.4 |
| C2fCIB_MCAttn [46] | 0.889 | 0.709 | 0.794 | 0.538 | 256.41 | 24.9 |
| C2fCIB_DynamicConv [47] | **0.892** | 0.728 | 0.810 | 0.556 | **294.12** | 24.4 |
| C2fCIB_SHSA [48] | 0.890 | 0.722 | 0.805 | 0.548 | **294.12** | **23.8** |
| C2fCIB_BiF [49] | 0.882 | 0.729 | 0.812 | 0.555 | 285.71 | **23.8** |
| C2fCIB_PPA [50] | 0.884 | 0.727 | 0.801 | 0.550 | 250.00 | 26.3 |
| C2fCIB_Conv2Former | 0.891 | **0.732** | **0.813** | **0.558** | 270.27 | 24.0 |

Table 5. Comparison of attention mechanisms on the YOLOv10s baseline.

| Model | P | R | mAP@0.5 | mAP@0.5-0.95 | FPS |
|---|---|---|---|---|---|
| YOLOv10s (Baseline) | 0.875 | 0.721 | 0.805 | 0.538 | 285.71 |
| SE [51] | 0.876 | 0.727 | 0.809 | 0.539 | 256.41 |
| CBAM [52] | 0.873 | 0.703 | 0.787 | 0.524 | 270.27 |
| ECA [53] | 0.872 | 0.721 | 0.805 | 0.533 | 285.71 |
| CA [54] | 0.883 | 0.713 | 0.804 | 0.531 | 285.71 |
| SimAM [55] | 0.883 | 0.711 | 0.799 | 0.529 | 256.41 |
| GAM [56] | 0.887 | 0.724 | 0.813 | 0.541 | 256.41 |
| EMA [57] | 0.881 | 0.715 | 0.797 | 0.530 | 270.27 |
| ShuffleAttention [58] | 0.877 | 0.722 | 0.813 | 0.539 | 285.71 |
| NAM [59] | 0.881 | 0.703 | 0.785 | 0.528 | 270.27 |
| CAA | 0.888 | 0.736 | 0.817 | 0.550 | 256.41 |
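
The mAP@0.5-0.95 gaps quoted in the text for ShuffleAttention and GAM are absolute differences, that is, percentage points. The short calculation below reproduces them from the Table 5 values and contrasts them with the relative drop, which is a different quantity; the helper names are illustrative.

```python
# mAP@0.5-0.95 values copied from Table 5.
MAP_50_95 = {"CAA": 0.550, "GAM": 0.541, "ShuffleAttention": 0.539}


def gap_in_points(reference, other):
    """Absolute gap between two scores in [0, 1], in percentage points."""
    return round((reference - other) * 100, 1)


def relative_drop_percent(reference, other):
    """Drop of `other` relative to `reference`, in percent of the reference."""
    return round((reference - other) / reference * 100, 1)


for name in ("ShuffleAttention", "GAM"):
    points = gap_in_points(MAP_50_95["CAA"], MAP_50_95[name])
    relative = relative_drop_percent(MAP_50_95["CAA"], MAP_50_95[name])
    print(f"{name}: {points} points below CAA ({relative}% relative)")
```

The absolute gaps come out at 1.1 and 0.9 points, matching the figures in the text, while the corresponding relative drops are 2.0% and 1.6%.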

Table 6. Comparison with mainstream detection algorithms.

| Model | P | R | mAP@0.5 | FPS | GFLOPs |
|---|---|---|---|---|---|
| SSD | 0.909 | 0.470 | 0.670 | 38.35 | 30.43 |
| YOLOv5s | 0.892 | 0.717 | 0.818 | 526.32 | 16.4 |
| YOLOv8s | 0.884 | 0.686 | 0.779 | 500.00 | 28.4 |
| YOLOv10s | 0.875 | 0.721 | 0.805 | 285.71 | 24.4 |
| YOLOv11s | 0.888 | 0.703 | 0.792 | 526.32 | 21.3 |
| Faster R-CNN | 0.528 | 0.739 | 0.753 | 26.49 | 470.86 |
| PSD-YOLO (Ours) | 0.895 | 0.744 | 0.825 | 232.56 | 24.2 |
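
One way to check the balance between precision and recall in the comparison with mainstream detectors is the F1 score, the harmonic mean of P and R. F1 is not reported in the source; the sketch below simply derives it from the P and R columns of that table as an illustration.

```python
# (precision, recall) pairs copied from the comparison with mainstream detectors.
PR_TABLE6 = {
    "SSD": (0.909, 0.470),
    "YOLOv5s": (0.892, 0.717),
    "YOLOv8s": (0.884, 0.686),
    "YOLOv10s": (0.875, 0.721),
    "YOLOv11s": (0.888, 0.703),
    "Faster R-CNN": (0.528, 0.739),
    "PSD-YOLO": (0.895, 0.744),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall (derived here, not reported in the paper)."""
    return 2 * precision * recall / (precision + recall)


F1_SCORES = {name: f1(p, r) for name, (p, r) in PR_TABLE6.items()}

# Rank models from best to worst F1.
for name, score in sorted(F1_SCORES.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:>12}: F1 = {score:.3f}")
```

Under this derived metric PSD-YOLO ranks first at about 0.813, while SSD and Faster R-CNN, each strong on only one of the two metrics, fall to about 0.62.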
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Qin, Y.; Dong, J.; Li, W.; Zhang, L.; Feng, K.; Wang, Z. PSD-YOLO: An Enhanced Real-Time Framework for Robust Worker Detection in Complex Offshore Oil Platform Environments. Sensors 2025, 25, 6264. https://doi.org/10.3390/s25206264

