TFDF-YOLO: A Position Detection Model for Underwater Wireless Power Transfer Docking
Abstract
1. Introduction
- A spatial–frequency decoupling (SFD) module is proposed, in which Fourier-based degradation cues guide Top-K proxy attention to strengthen the extraction of blurred edges. Proxy tokens and Top-K sparsity reduce the complexity of global relation modeling.
- A relevance-difference fusion (RD-Fusion) mechanism is proposed to handle imbalanced multi-scale features. It augments global channel attention with dynamically weighted difference features to improve the recognition of small objects.
- A new U-CIoU loss function with an illumination-background adaptive weighting strategy is designed specifically for underwater target recognition.
- The proposed framework integrates lightweight modules and efficient feature-fusion mechanisms for underwater docking detection; its effectiveness and computational efficiency are evaluated on a multi-source underwater dataset.
2. Methodology
2.1. Spatial–Frequency Decoupling (SFD) Module
2.1.1. Edge-Enhancement Preprocessing Layer
2.1.2. Multi-Scale Spatial Feature Extraction Module
2.1.3. Frequency-Guided Sparsity for Dual-Stage Spatial Attention
1. Proxy token construction
2. Degradation indicator and sparsity ratio assignment
3. Dual-stage attention: proxy aggregation and broadcast refinement (sketched below)
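A minimal PyTorch sketch of these three steps is given below: proxy tokens are average-pooled from the feature map, a Fourier-based indicator estimates per-window degradation and sets each proxy's Top-K token budget, and attention then runs in two stages (proxy aggregation followed by broadcast refinement). This is an illustration under stated assumptions, not the authors' implementation: the class name, the low-frequency mask, the budget mapping ρ·(1 + d), and the shared Q/K/V projections are all illustrative choices.

```python
# Sketch of frequency-guided Top-K proxy attention (steps 1-3 above).
# Assumes H and W are divisible by the proxy grid side and windows are >= 2x2.
import torch
import torch.nn.functional as F
from torch import nn


class FrequencyGuidedProxyAttention(nn.Module):
    def __init__(self, dim: int, num_proxies: int = 16, rho: float = 1 / 3):
        super().__init__()
        self.num_proxies = num_proxies  # perfect square: proxies form a grid
        self.rho = rho                  # base sparsity ratio (1/3 in the single-rho ablation)
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)
        self.scale = dim ** -0.5

    @staticmethod
    def degradation_indicator(patches: torch.Tensor) -> torch.Tensor:
        """Share of low-frequency spectral energy per proxy patch;
        values near 1 indicate blurred (degraded) regions."""
        spec = torch.fft.rfft2(patches, norm="ortho").abs()  # (B, P, C, h, w//2+1)
        total = spec.sum(dim=(-3, -2, -1)) + 1e-6
        low = spec[..., :2, :2].sum(dim=(-3, -2, -1))        # crude low-pass bins
        return low / total                                   # (B, P)

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        B, C, H, W = feat.shape
        g = int(self.num_proxies ** 0.5)

        # Step 1: proxy token construction by window average pooling.
        proxies = F.adaptive_avg_pool2d(feat, g).flatten(2).transpose(1, 2)  # (B, P, C)
        tokens = feat.flatten(2).transpose(1, 2)                             # (B, N, C)

        # Step 2: Fourier degradation indicator per pooling window, mapped to
        # a per-proxy Top-K budget (here, more degraded regions keep more tokens).
        h, w = H // g, W // g
        patches = F.unfold(feat, kernel_size=(h, w), stride=(h, w))          # (B, C*h*w, P)
        patches = patches.transpose(1, 2).reshape(B, g * g, C, h, w)
        d = self.degradation_indicator(patches)                              # (B, P)
        n = tokens.shape[1]
        k_keep = (self.rho * (1 + d) * n).long().clamp(1, n)                 # (B, P)

        # Stage I: proxies aggregate spatial tokens under Top-K sparsity.
        attn = (self.q(proxies) @ self.k(tokens).transpose(1, 2)) * self.scale
        kmax = int(k_keep.max())
        topv, topi = attn.topk(kmax, dim=-1)
        ranks = torch.arange(kmax, device=feat.device).view(1, 1, -1)
        topv = topv.masked_fill(ranks >= k_keep.unsqueeze(-1), float("-inf"))
        sparse = torch.full_like(attn, float("-inf")).scatter(-1, topi, topv)
        proxies = F.softmax(sparse, dim=-1) @ self.v(tokens)                 # (B, P, C)

        # Stage II: broadcast the refined proxies back to every spatial token.
        attn2 = (self.q(tokens) @ self.k(proxies).transpose(1, 2)) * self.scale
        out = F.softmax(attn2, dim=-1) @ self.v(proxies)                     # (B, N, C)
        return out.transpose(1, 2).reshape(B, C, H, W)
```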
2.1.4. Local–Global Feature Fusion Layer
2.2. Relevance-Difference Fusion (RD-Fusion) Strategy
2.2.1. Global Channel Attention Generation
2.2.2. Difference Feature Weight Generation
2.2.3. Dynamic Weighted Fusion
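Sections 2.2.1–2.2.3 together define a three-stage fusion: global channel attention generated from the concatenated cross-layer feature, a dynamic weight map for the shallow–deep difference feature, and a weighted combination of both terms. Below is a minimal sketch, assuming the shallow and deep inputs have already been brought to a common channel count and resolution; the reduction ratio and the 3×3 convolution producing the difference-weight map are illustrative choices, not the paper's exact layers.

```python
import torch
from torch import nn


class RDFusion(nn.Module):
    """Sketch of relevance-difference fusion for one shallow/deep feature pair."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        hidden = max(channels // reduction, 8)
        # 2.2.1: global channel attention over the concatenated cross-layer feature
        self.channel_attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                # GAP
            nn.Conv2d(2 * channels, hidden, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(hidden, 2 * channels, 1),
            nn.Sigmoid(),
        )
        # 2.2.2: dynamic spatial weight for the difference feature
        self.diff_weight = nn.Sequential(
            nn.Conv2d(channels, 1, kernel_size=3, padding=1),
            nn.Sigmoid(),
        )

    def forward(self, shallow: torch.Tensor, deep: torch.Tensor) -> torch.Tensor:
        cat = torch.cat([shallow, deep], dim=1)     # (B, 2C, H, W)
        w = self.channel_attn(cat)                  # (B, 2C, 1, 1)
        w_s, w_d = w.chunk(2, dim=1)                # per-branch channel weights
        diff = shallow - deep                       # difference feature
        m = self.diff_weight(diff)                  # (B, 1, H, W) dynamic weight map
        # 2.2.3: dynamic weighted fusion of relevance and difference terms
        return w_s * shallow + w_d * deep + m * diff
```

For example, `RDFusion(256)(p3, p4_up)` would fuse a shallow P3 feature with a deep P4 feature after the latter is upsampled and projected to 256 channels.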
2.3. U-CIoU (Underwater-CIoU) Loss Function
2.3.1. CIoU Loss Function
2.3.2. Illumination-Induced Gradient Drift
2.3.3. U-CIoU Loss Function
1. Adaptive weight term w_i
2. Anchor regularization term λ_reg (both terms appear in the schematic sketch after this list)
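The closed-form definitions of w_i and λ_reg belong to this subsection's equations; the sketch below therefore only shows one plausible composition: the standard CIoU loss of Section 2.3.1, scaled per box by an illumination-adaptive weight computed from bounding-box mean intensity (the best-performing strategy in Section 4.2.2), plus an anchor regularization term scaled by λ_reg. Only the CIoU part follows the published formula; the weight form 1 + γ(1 − Ī) and the aspect-ratio regularizer are placeholders.

```python
# Schematic U-CIoU composition; w_i and the regularizer are assumptions.
import math

import torch


def ciou_loss(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-7) -> torch.Tensor:
    """Standard CIoU loss for (N, 4) boxes in (x1, y1, x2, y2) format."""
    lt = torch.max(pred[:, :2], target[:, :2])
    rb = torch.min(pred[:, 2:], target[:, 2:])
    wh = (rb - lt).clamp(min=0)
    inter = wh[:, 0] * wh[:, 1]
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    iou = inter / (area_p + area_t - inter + eps)
    # squared center distance over squared enclosing-box diagonal
    c_lt = torch.min(pred[:, :2], target[:, :2])
    c_rb = torch.max(pred[:, 2:], target[:, 2:])
    c2 = ((c_rb - c_lt) ** 2).sum(dim=1) + eps
    ctr_p = (pred[:, :2] + pred[:, 2:]) / 2
    ctr_t = (target[:, :2] + target[:, 2:]) / 2
    rho2 = ((ctr_p - ctr_t) ** 2).sum(dim=1)
    # aspect-ratio consistency term
    w_p, h_p = pred[:, 2] - pred[:, 0], pred[:, 3] - pred[:, 1]
    w_t, h_t = target[:, 2] - target[:, 0], target[:, 3] - target[:, 1]
    v = (4 / math.pi ** 2) * (torch.atan(w_t / (h_t + eps)) - torch.atan(w_p / (h_p + eps))) ** 2
    alpha = (v / (1 - iou + v + eps)).detach()  # treated as a constant factor
    return 1 - iou + rho2 / c2 + alpha * v


def u_ciou_loss(pred, target, box_mean_intensity, gamma: float = 0.5, lam_reg: float = 0.01):
    """Placeholder U-CIoU: illumination-adaptive weight times CIoU plus
    lam_reg times an anchor regularizer. box_mean_intensity is in [0, 1]."""
    w_i = 1.0 + gamma * (1.0 - box_mean_intensity)   # darker boxes weigh more (assumption)
    w_p = pred[:, 2] - pred[:, 0]
    h_p = pred[:, 3] - pred[:, 1]
    reg = torch.log(w_p / (h_p + 1e-7) + 1e-7) ** 2  # penalize extreme aspect ratios
    return (w_i * ciou_loss(pred, target) + lam_reg * reg).mean()
```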
2.3.4. Parameter Sensitivity Discussion
3. UWPT Platform and Dataset Introduction
3.1. UWPT Platform
3.2. Multi-Source Dataset
3.2.1. Experimental Data
3.2.2. Simulated Data
3.2.3. Public Data
3.2.4. Typical Scenarios in Multi-Source Data
3.3. Data Analysis
4. Result Analysis
4.1. Experimental Setup
4.1.1. Implementation Details
4.1.2. Evaluation Metrics
4.2. Comparative Analysis
4.2.1. Model Analysis
1. Image quality analysis
2. Recognition accuracy analysis
4.2.2. Optimal Setting Analysis
1. Ablation on kernel configuration settings
2. Ablation on single vs. grouped sparsity ratios
3. Ablation on illumination intensity computation
4. Feature extraction module
5. Feature fusion strategy
6. Loss function
4.2.3. Ablation Experiment
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
Variables

| Symbol | Description |
|---|---|
|  | Input feature map to the SFD module |
|  | Preprocessed feature map after edge enhancement |
|  | Local spatial feature map after multi-scale DW convolutions |
|  | Refined spatial tokens after Stage-II refinement |
|  | Global interaction feature reshaped from the refined spatial tokens |
|  | Final output feature of the SFD module after local–global fusion |
|  | Shallow and deep input features |
|  | Concatenated cross-layer features |
|  | Difference feature between shallow and deep layers |
|  | Fused output feature from RD-Fusion |
|  | Underwater-optimized CIoU loss function |
| Q, K, V | Query/Key/Value token embeddings |
|  | Pooling window corresponding to the i-th proxy token |
|  | Local patch aligned with the i-th pooling window, used for degradation estimation |
|  | Fourier-based degradation indicator of the i-th proxy region |
| ρ | Sparsity ratio |
| w_i | Illumination-background adaptive weight |
| λ_reg | Anchor regularization coefficient |
|  | Dynamic weight matrix for difference features in RD-Fusion |
|  | Channel attention weights for shallow and deep features |

Acronyms

| Acronym | Expansion |
|---|---|
| UWPT | Underwater Wireless Power Transfer |
| UUVs | Unmanned Underwater Vehicles |
| YOLO | You Only Look Once |
| SFD module | Spatial–Frequency Decoupling Module |
| RD-Fusion | Relevance-Difference Fusion |
| U-CIoU | Underwater-CIoU |
| DW Conv | Depthwise Convolution |
| PW Conv | Pointwise Convolution |
| GAP | Global Average Pooling |
| DS Conv | Depthwise Separable Convolution |
| SGD | Stochastic Gradient Descent |
| P | Precision |
| R | Recall |
| mAP | Mean Average Precision |
| GFLOPs | Giga Floating-Point Operations |
| TPs/FPs/FNs | True Positives, False Positives, False Negatives |
| PANet | Path Aggregation Network |
| CBAM | Convolutional Block Attention Module |
| URPC | Underwater Robot Professional Competition |
| UDID | Underwater Docking Identification Dataset |
References
- Wang, L.; Zhu, D.; Pang, W.; Liu, X. A survey of underwater search for multi-target using multi-AUV: Task allocation, path planning and formation control. Ocean Eng. 2023, 278, 114393.
- Zhang, B.; Jiang, C.; Yang, F.; Chen, C.; Lu, Y.; Zhou, J. An anti-rotation wireless power transfer system with a flexible magnetic coupler for autonomous underwater vehicles. IEEE Trans. Power Electron. 2025, 40, 2593–2603.
- Wang, D.; Zhang, J.; Cui, S.; Bie, Z.; Chen, F.; Zhu, C. The state-of-the-art of underwater wireless power transfer: A comprehensive review and new perspectives. Renew. Sustain. Energy Rev. 2024, 189, 113910.
- Lin, M.; Lin, R.; Yang, C.; Li, D.; Zhang, Z.; Zhao, Y.; Ding, W. Docking to an underwater suspended charging station: Systematic design and experimental tests. Ocean Eng. 2022, 249, 110766.
- Liu, J.; Yu, F.; He, B.; Guedes Soares, C. A review of underwater docking and charging technology for autonomous vehicles. Ocean Eng. 2024, 297, 117154.
- Lin, R.; Zhao, Y.; Li, D.; Lin, M.; Yang, C. Underwater electromagnetic guidance based on the magnetic dipole model applied in AUV terminal docking. J. Mar. Sci. Eng. 2022, 10, 995.
- Yang, Q.; Liu, H.; Hong, L.; Yu, X.; Chen, J.; Chen, B. Anti-disturbance control strategy in capture stage for AUV dynamic-base docking with optical-guided constraints. Ocean Eng. 2024, 311, 118946.
- Watt, G.; Roy, A.; Currie, J.; Gillis, C.; Giesbrecht, J.; Heard, G.; Birsan, M.; Seto, M.; Carretero, J.; Dubay, R.; et al. A concept for docking a UUV with a slowly moving submarine under waves. IEEE J. Ocean. Eng. 2016, 41, 471–498.
- Li, Y.; Sun, K.; Han, Z.; Lang, J. Deep learning-based docking scheme for autonomous underwater vehicles with an omnidirectional rotating optical beacon. Drones 2024, 8, 697.
- Palomeras, N.; Vallicrosa, G.; Mallios, A.; Bosch, J.; Vidal, E.; Hurtós, N.; Carreras, M.; Ridao, P. AUV homing and docking for remote operations. Ocean Eng. 2018, 154, 106–120.
- Yan, Z.; Gong, P.; Zhang, W.; Li, Z.; Teng, Y. Autonomous underwater vehicle vision-guided docking experiments based on L-shaped light array. IEEE Access 2019, 7, 72567–72576.
- Zhong, L.; Li, D.; Lin, M.; Lin, R.; Yang, C. A fast binocular localisation method for AUV docking. Sensors 2019, 19, 1735.
- Gomes, D.; Saif, A.; Nandi, D. Robust underwater object detection with autonomous underwater vehicle: A comprehensive study. In Proceedings of the International Conference on Computer Advancements (ICCA 2020), Dhaka, Bangladesh, 10–12 January 2020; Article 17.
- LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444.
- Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2014), Columbus, OH, USA, 23–28 June 2014; pp. 580–587.
- Girshick, R. Fast R-CNN. In Proceedings of the IEEE International Conference on Computer Vision (ICCV 2015), Santiago, Chile, 13–16 December 2015; pp. 1440–1448.
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 1137–1149.
- He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask R-CNN. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 42, 386–397.
- Cai, Z.; Vasconcelos, N. Cascade R-CNN: Delving into high quality object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2018), Salt Lake City, UT, USA, 18–22 June 2018; pp. 6154–6162.
- Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.; Berg, A. SSD: Single shot multibox detector. In Proceedings of the European Conference on Computer Vision (ECCV 2016), Amsterdam, The Netherlands, 11–14 October 2016; Volume 9905, pp. 21–37.
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2016), Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788.
- Zhang, M.; Xu, S.; Song, W.; He, Q.; Wei, Q. Lightweight underwater object detection based on YOLOv4 and multi-scale attentional feature fusion. Remote Sens. 2021, 13, 4706.
- Lu, D.; Yi, J.; Wang, J. Enhanced YOLOv7 for improved underwater target detection. J. Mar. Sci. Eng. 2024, 12, 1127.
- Lei, F.; Tang, F.; Li, S. Underwater target detection algorithm based on improved YOLOv5. J. Mar. Sci. Eng. 2022, 10, 310.
- Yan, J.; Zhou, Z.; Zhou, D.; Su, B.; Xuanyuan, Z.; Tang, J.; Lai, Y.; Chen, J.; Liang, W. Underwater object detection algorithm based on attention mechanism and cross-stage partial fast spatial pyramidal pooling. Front. Mar. Sci. 2022, 9, 1056300.
- Chang, Y.; Li, D.; Gao, Y.; Su, Y.; Jia, X. An improved YOLO model for UAV fuzzy small target image detection. Appl. Sci. 2023, 13, 5409.
- Zheng, Z.; Yu, W. RG-YOLO: Multi-scale feature learning for underwater target detection. Multimed. Syst. 2025, 31, 26.
- Wang, Z.; Ruan, Z.; Chen, C. DyFish-DETR: Underwater fish image recognition based on detection transformer. J. Mar. Sci. Eng. 2024, 12, 864.
- Han, Z.; Yue, Z.; Liu, L. 3L-YOLO: A lightweight low-light object detection algorithm. Appl. Sci. 2025, 15, 90.
- Gao, J.; Zhang, Y.; Geng, X.; Tang, H.; Bhatti, U. PE-Transformer: Path enhanced transformer for improving underwater object detection. Expert Syst. Appl. 2024, 246, 123253.
- Zhang, Y.; Ren, W.; Zhang, Z.; Jia, Z.; Wang, L.; Tan, T. Focal and efficient IoU loss for accurate bounding box regression. Neurocomputing 2022, 506, 146–157.
- Rajafillah, C.; El Moutaouakil, K.; Patriciu, A.-M.; Yahyaouy, A.; Riffi, J. INT-FUP: Intuitionistic fuzzy pooling. Mathematics 2024, 12, 1740.
- Liu, S.; Ozay, M.; Okatani, T.; Xu, H.; Sun, K.; Lin, Y. Detection and pose estimation for short-range vision-based underwater docking. IEEE Access 2019, 7, 2720–2749.
- Zhao, Y.; Lv, W.; Xu, S.; Wei, J.; Wang, G.; Dang, Q.; Liu, Y.; Chen, J. DETRs beat YOLOs on real-time object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2024), Seattle, WA, USA, 16–22 June 2024; pp. 16965–16974.
- Han, K.; Wang, Y.; Guo, J.; Wu, E. ParameterNet: Parameters are all you need for large-scale visual pretraining of mobile networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2024), Seattle, WA, USA, 16–22 June 2024; pp. 15751–15761.
- Chen, J.; Kao, S.; He, H.; Zhuo, W.; Wen, S.; Lee, C.; Chan, S. Run, don't walk: Chasing higher FLOPS for faster neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023), Vancouver, BC, Canada, 17–24 June 2023; pp. 12021–12031.
- Ma, X.; Dai, X.; Bai, Y.; Wang, Y.; Fu, Y. Rewrite the stars. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2024), Seattle, WA, USA, 16–22 June 2024; pp. 5694–5703.
- Qi, Y.; He, Y.; Qi, X.; Zhang, Y.; Yang, G. Dynamic snake convolution based on topological geometric constraints for tubular structure segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV 2023), Paris, France, 1–6 October 2023; pp. 6070–6079.
- Finder, S.; Amoyal, R.; Treister, E.; Freifeld, O. Wavelet convolutions for large receptive fields. In Proceedings of the European Conference on Computer Vision (ECCV 2024), Milan, Italy, 29 September–4 October 2024; Volume 15112, pp. 363–380.
- Li, H.; Li, J.; Wei, H.; Liu, Z.; Zhan, Z.; Ren, Q. Slim-neck by GSConv: A lightweight-design for real-time detector architectures. J. Real-Time Image Process. 2024, 21, 62.
- Tan, M.; Pang, R.; Le, Q. EfficientDet: Scalable and efficient object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2020), Seattle, WA, USA, 13–19 June 2020; pp. 10778–10787.
- Kang, M.; Ting, C.; Ting, F.; Phan, R. ASF-YOLO: A novel YOLO model with attentional scale sequence fusion for cell instance segmentation. Image Vis. Comput. 2024, 147, 105057.
- Xu, X.; Jiang, Y.; Chen, W.; Huang, Y.; Zhang, Y.; Sun, X. DAMO-YOLO: A report on real-time object detection design. arXiv 2022.
- Zheng, Z.; Wang, P.; Liu, W.; Li, J.; Ye, R.; Ren, D. Distance-IoU loss: Faster and better learning for bounding box regression. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI 2020), New York, NY, USA, 7–12 February 2020; Volume 34, pp. 12993–13000.
- Rezatofighi, H.; Tsoi, N.; Gwak, J.; Sadeghian, A.; Reid, I.; Savarese, S. Generalized intersection over union: A metric and a loss for bounding box regression. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2019), Long Beach, CA, USA, 16–20 June 2019; pp. 658–666.
- Zhang, H.; Zhang, S. Shape-IoU: More accurate metric considering bounding box shape and scale. arXiv 2024, arXiv:2312.17663.
| Subset | Image Count | UUV (Instances) | Dock (Instances) | Light (Instances) |
|---|---|---|---|---|
| Training set | 6592 | 1479 | 2451 | 4636 |
| Validation set | 824 | 295 | 490 | 963 |
| Test set | 824 | 247 | 445 | 927 |
| Model | P (%) | R (%) | mAP@0.5 (%) | mAP@0.5:0.95 (%) | Params (M) | GFLOPs |
|---|---|---|---|---|---|---|
| Faster-RCNN | 77.8 ± 0.5 (77.2–78.4) | 72.5 ± 0.6 (71.8–73.2) | 69.1 ± 0.5 (68.5–69.7) | 56.3 ± 0.5 (55.7–56.9) | 40.7 | 211.4 |
| SSD | 79.6 ± 0.4 (79.1–80.1) | 73.3 ± 0.4 (72.8–73.8) | 70.3 ± 0.5 (69.7–70.9) | 57.2 ± 0.4 (56.7–57.7) | 25.6 | 36.2 |
| RT-DETR-R18 | 84.2 ± 0.4 (83.7–84.7) | 78.6 ± 0.4 (78.1–79.1) | 85.1 ± 0.4 (84.6–85.6) | 60.2 ± 0.4 (59.7–60.7) | 39.4 | 58.8 |
| YOLOv5s | 82.3 ± 0.3 (81.9–82.7) | 76.2 ± 0.3 (75.8–76.6) | 82.9 ± 0.4 (82.4–83.4) | 58.7 ± 0.4 (58.2–59.2) | 7.1 | 16.7 |
| YOLOv8s | 83.6 ± 0.2 (83.3–83.9) | 77.8 ± 0.2 (77.5–78.1) | 84.2 ± 0.3 (83.8–84.6) | 60.3 ± 0.3 (59.9–60.7) | 10.6 | 28.5 |
| YOLOv9s | 80.7 ± 0.4 (80.2–81.2) | 75.2 ± 0.4 (74.7–75.7) | 81.3 ± 0.4 (80.8–81.8) | 59.4 ± 0.4 (58.9–59.9) | 6.4 | 22.6 |
| YOLOv10s | 83.1 ± 0.3 (82.7–83.5) | 76.4 ± 0.3 (76.0–76.8) | 82.3 ± 0.3 (81.9–82.7) | 59.3 ± 0.3 (58.9–59.7) | 8.4 | 24.5 |
| YOLOv11s | 84.9 ± 0.2 (84.7–85.1) | 79.1 ± 0.2 (78.9–79.3) | 85.9 ± 0.2 (85.7–86.1) | 61.1 ± 0.2 (60.9–61.3) | 10.1 | 21.5 |
| TFDF-YOLO | 91.5 ± 0.2 (91.3–91.7) | 87.2 ± 0.2 (87.0–87.4) | 92.7 ± 0.2 (92.5–92.9) | 67.5 ± 0.2 (67.3–67.7) | 10.5 | 21.1 |
| Model | AP-UUV (%) | AP-Docking (%) | AP-Light (%) | mAP (%) |
|---|---|---|---|---|
| Faster-RCNN | 72.5 ± 0.5 (71.9–73.1) | 70.3 ± 0.5 (69.7–70.9) | 64.5 ± 0.5 (63.9–65.1) | 69.1 ± 0.5 (68.5–69.7) |
| SSD | 74.2 ± 0.4 (73.7–74.7) | 71.8 ± 0.4 (71.3–72.3) | 66.7 ± 0.4 (66.2–67.2) | 70.3 ± 0.5 (69.7–70.9) |
| RT-DETR-R18 | 86.3 ± 0.4 (85.8–86.8) | 84.7 ± 0.4 (84.2–85.2) | 80.5 ± 0.4 (80.0–80.7) | 85.1 ± 0.4 (84.6–85.6) |
| YOLOv5s | 83.6 ± 0.3 (83.2–84.0) | 81.9 ± 0.3 (81.5–82.3) | 78.2 ± 0.3 (77.8–78.6) | 82.9 ± 0.4 (82.4–83.4) |
| YOLOv8s | 85.1 ± 0.2 (84.8–85.4) | 83.5 ± 0.2 (83.2–83.8) | 79.6 ± 0.2 (79.3–79.9) | 84.2 ± 0.3 (83.8–84.6) |
| YOLOv9s | 82.4 ± 0.4 (81.9–82.9) | 80.7 ± 0.4 (80.2–81.2) | 76.9 ± 0.4 (76.4–77.4) | 81.3 ± 0.4 (80.8–81.8) |
| YOLOv10s | 84.5 ± 0.3 (84.1–84.9) | 82.8 ± 0.3 (82.4–83.2) | 79.1 ± 0.3 (78.7–79.5) | 82.3 ± 0.3 (81.9–82.7) |
| YOLOv11s | 87.2 ± 0.2 (87.0–87.4) | 85.6 ± 0.2 (85.4–85.8) | 81.3 ± 0.2 (81.1–81.5) | 85.9 ± 0.2 (85.7–86.1) |
| TFDF-YOLO | 93.5 ± 0.2 (93.3–93.7) | 91.8 ± 0.2 (91.6–92.0) | 88.6 ± 0.2 (88.4–88.8) | 92.7 ± 0.2 (92.5–92.9) |
| Kernel Size | P (%) | R (%) | mAP@0.5 (%) | mAP@0.5:0.95 (%) | Params (M) | GFLOPs |
|---|---|---|---|---|---|---|
| — | 84.9 | 79.1 | 85.9 | 61.1 | 10.1 | 21.5 |
| 3 | 85.1 | 81.5 | 86.8 | 63.5 | 10.2 | 20.8 |
| 5 | 84.6 | 79.8 | 85.7 | 62.3 | 10.5 | 21.6 |
| 7 | 83.2 | 77.1 | 82.7 | 61.7 | 10.7 | 22.5 |
| 3, 5, 7 | 83.6 | 79.4 | 86.3 | 62.8 | 11.4 | 24.1 |
| 1, 3, 7 | 86.5 | 82.9 | 87.9 | 64.1 | 10.9 | 22.3 |
| 1, 5, 7 | 84.2 | 80.1 | 85.7 | 62.3 | 11.2 | 23.5 |
| 1, 3, 5 | 87.8 | 83.8 | 89.4 | 65.8 | 10.7 | 21.8 |
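The best row above combines depthwise kernels of sizes 1, 3, and 5, matching the multi-scale spatial feature extraction of Section 2.1.2. A minimal sketch of such a branch follows; summing the parallel branch outputs before a pointwise (PW) fusion convolution is an assumption made to keep the sketch channel-neutral, since the exact aggregation is not reproduced here.

```python
import torch
from torch import nn


class MultiScaleDWBranch(nn.Module):
    """Parallel depthwise convolutions with the best kernel set (1, 3, 5)."""

    def __init__(self, channels: int, kernels=(1, 3, 5)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, k, padding=k // 2, groups=channels)
            for k in kernels
        )
        self.fuse = nn.Conv2d(channels, channels, 1)  # pointwise fusion

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.fuse(sum(branch(x) for branch in self.branches))
```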
| Setting (Single ρ) | P (%) | R (%) | mAP@0.5 (%) | mAP@0.5:0.95 (%) | Params (M) | GFLOPs |
|---|---|---|---|---|---|---|
| — | 84.9 | 79.1 | 85.9 | 61.1 | 10.1 | 21.5 |
| ρ = 1/6 | 82.7 | 77.7 | 84.1 | 58.4 | 10.2 | 21.5 |
| ρ = 1/4 | 83.6 | 78.5 | 84.9 | 59.8 | 10.4 | 21.6 |
| ρ = 1/3 | 85.9 | 82.5 | 87.5 | 64.3 | 10.5 | 21.8 |
| ρ = 1/2 | 85.5 | 81.8 | 87.3 | 64.5 | 10.7 | 22.1 |
| ρ = 2/3 | 85.1 | 81.2 | 86.6 | 64.1 | 10.7 | 22.3 |
| ρ = 1.0 | 84.7 | 80.5 | 86.3 | 63.9 | 10.9 | 22.3 |
| Setting (Grouped ρ) | P (%) | R (%) | mAP@0.5 (%) | mAP@0.5:0.95 (%) | Params (M) | GFLOPs |
|---|---|---|---|---|---|---|
| ρ = 1/3 | 85.9 | 82.5 | 87.5 | 64.3 | 10.5 | 21.8 |
| (1/6,1/3,2/3) | 86.8 | 83.6 | 87.9 | 64.7 | 10.6 | 21.9 |
| (1/4,1/3,2/3) | 87.4 | 84.0 | 88.5 | 65.2 | 10.8 | 22.1 |
| (1/4,1/3,1/2) | 87.8 | 83.8 | 89.4 | 65.8 | 10.7 | 21.8 |
| Strategies | P (%) | R (%) | mAP@0.5 (%) | mAP@0.5:0.95 (%) |
|---|---|---|---|---|
| Global image averaging intensity | 88.3 | 83.5 | 89.1 | 64.2 |
| Bounding-box median intensity | 90.7 | 86.1 | 91.6 | 66.8 |
| Histogram-peak intensity | 91.0 | 86.5 | 92.0 | 66.9 |
| Bounding-box mean intensity | 91.5 | 87.2 | 92.7 | 67.5 |
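The winning strategy uses the mean gray-level inside each box as the illumination cue for the adaptive weight w_i. A minimal sketch, assuming an 8-bit grayscale image, non-empty boxes in (x1, y1, x2, y2) pixel coordinates, and normalization to [0, 1]:

```python
import torch


def box_mean_intensity(gray: torch.Tensor, boxes: torch.Tensor) -> torch.Tensor:
    """Mean intensity inside each box, normalized to [0, 1].

    gray: (H, W) uint8 image; boxes: (N, 4) float tensor in xyxy pixels.
    """
    vals = []
    for x1, y1, x2, y2 in boxes.round().long().tolist():
        vals.append(gray[y1:y2, x1:x2].float().mean() / 255.0)
    return torch.stack(vals)
```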
| Model | P (%) | R (%) | mAP@0.5 (%) | mAP@0.5:0.95 (%) | Params (M) | GFLOPs |
|---|---|---|---|---|---|---|
| C3k2 | 84.9 | 79.1 | 85.9 | 61.1 | 10.1 | 21.5 |
| C3k2-GhostDynamicConv | 84.1 | 78.6 | 85.7 | 60.7 | 8.9 | 21.7 |
| C3k2-Faster | 85.5 | 80.8 | 84.8 | 63.3 | 9.3 | 23.2 |
| C3k2-DySnakeConv | 86.3 | 82.7 | 86.9 | 63.5 | 10.4 | 22.1 |
| C3k2-Star | 86.7 | 81.8 | 86.3 | 64.2 | 11.2 | 22.4 |
| C3k2-WTConv | 87.2 | 82.2 | 87.6 | 64.9 | 9.6 | 20.1 |
| SFD Module (Ours) | 87.8 | 83.8 | 89.4 | 65.8 | 10.7 | 21.8 |
| Model | P (%) | R (%) | mAP@0.5 (%) | mAP@0.5:0.95 (%) | Params (M) | GFLOPs |
|---|---|---|---|---|---|---|
| PA-FPN | 84.9 | 79.1 | 85.9 | 61.1 | 10.1 | 21.5 |
| SlimNeck | 85.3 | 79.3 | 83.9 | 59.3 | 9.8 | 20.0 |
| Bi-FPN | 86.5 | 81.1 | 86.0 | 62.7 | 9.7 | 20.9 |
| ASF | 86.9 | 82.2 | 87.5 | 62.9 | 10.0 | 21.3 |
| G-FPN | 87.5 | 84.8 | 87.9 | 63.5 | 10.3 | 21.1 |
| RD-Fusion (Ours) | 87.1 | 85.5 | 89.1 | 64.7 | 9.8 | 20.7 |
| Model | P (%) | R (%) | mAP@0.5 (%) | mAP@0.5:0.95 (%) |
|---|---|---|---|---|
| CIoU | 84.9 | 79.1 | 85.9 | 61.1 |
| DIoU | 83.7 | 78.3 | 85.5 | 59.3 |
| GIoU | 85.3 | 79.6 | 85.7 | 61.3 |
| Shape-IoU | 85.5 | 80.6 | 86.1 | 61.6 |
| U-CIoU (Ours) | 85.7 | 80.8 | 86.6 | 61.5 |
| SFD | RD-F | U-CIoU | P (%) | R (%) | mAP@0.5 (%) | mAP@0.5:0.95 (%) | Params (M) | GFLOPs |
|---|---|---|---|---|---|---|---|---|
|  |  |  | 84.9 | 79.1 | 85.9 | 61.1 | 10.1 | 21.5 |
| ✓ |  |  | 87.8 | 83.8 | 89.4 | 65.8 | 10.7 | 21.8 |
|  | ✓ |  | 87.1 | 85.5 | 89.1 | 64.7 | 9.8 | 20.7 |
|  |  | ✓ | 85.7 | 80.8 | 86.6 | 61.5 | 10.1 | 21.5 |
| ✓ | ✓ |  | 90.2 | 86.4 | 91.4 | 67.3 | 10.5 | 21.1 |
| ✓ |  | ✓ | 88.9 | 86.0 | 90.1 | 66.9 | 10.7 | 21.8 |
|  | ✓ | ✓ | 88.0 | 86.9 | 89.5 | 66.6 | 9.8 | 20.7 |
| ✓ | ✓ | ✓ | 91.5 (↑6.6) | 87.2 (↑8.1) | 92.7 (↑6.8) | 67.5 (↑6.4) | 10.5 (↑0.4) | 21.1 (↓0.4) |