Precision Without Complexity: A Comparative Study of YOLO26 Pose Variants for Distal Arm Landmark Detection
Abstract
1. Introduction
2. Materials and Methods
2.1. Study Design and Objective
2.2. Image Acquisition and Landmark Annotation
2.3. YOLO26 Pose-Estimation Framework
2.4. Model Training, Configuration, and Evaluations
3. Results and Analysis
3.1. Keypoint Detection Performance
3.2. Effect of Model Scale on Localization Accuracy
3.3. Landmark-Error Distribution and Analysis
3.4. Computational Efficiency
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| YOLO | You Only Look Once |
| mAP | mean Average Precision |
| IoU | Intersection over Union |
| OKS | Object Keypoint Similarity |
| COCO | Common Objects in Context |
| RGB | Red, Green, Blue |
| GFLOPs | Giga Floating Point Operations |
| VRAM | Video Random Access Memory |
| SGD | Stochastic Gradient Descent |
| HSV | Hue, Saturation, Value |
| IQR | Interquartile Range |
| SD | Standard Deviation |
| TEAM | Traditional East Asian Medicine |
| AI | Artificial Intelligence |
| 2D | Two-Dimensional |
References
- Payer, C.; Štern, D.; Bischof, H.; Urschler, M. Integrating Spatial Configuration into Heatmap Regression Based CNNs for Landmark Localization. Med. Image Anal. 2019, 54, 207–219.
- Noh, S.H.; Lee, G.; Bae, H.J.; Han, J.Y.; Son, S.J.; Kim, D.; Park, J.Y.; Choi, S.K.; Cho, P.G.; Kim, S.H.; et al. Deep Learning Method for Precise Landmark Identification and Structural Assessment of Whole-Spine Radiographs. Bioengineering 2024, 11, 481.
- Tajbakhsh, N.; Jeyaseelan, L.; Li, Q.; Chiang, J.N.; Wu, Z.; Ding, X. Embracing Imperfect Datasets: A Review of Deep Learning Solutions for Medical Image Segmentation. Med. Image Anal. 2020, 63, 101693.
- Yang, F.; Zamzmi, G.; Angara, S.; Rajaraman, S.; Aquilina, A.; Xue, Z.; Jaeger, S.; Papagiannakis, E.; Antani, S.K. Assessing Inter-Annotator Agreement for Medical Image Segmentation. IEEE Access 2023, 11, 21300.
- Serafin, M.; Baldini, B.; Cabitza, F.; Carrafiello, G.; Baselli, G.; Del Fabbro, M.; Sforza, C.; Caprioglio, A.; Tartaglia, G.M. Accuracy of Automated 3D Cephalometric Landmarks by Deep Learning Algorithms: Systematic Review and Meta-Analysis. Radiol. Med. 2023, 128, 544–555.
- Deep Learning-Based Human Pose Estimation: A Survey. Available online: https://www.researchgate.net/publication/347881067_Deep_Learning-Based_Human_Pose_Estimation_A_Survey (accessed on 23 March 2026).
- Lin, Y.; Liao, Y.; Zeng, W.; Wei, Y.; Chen, D.; Yuan, X.; Li, Y.; Erkan, U.; Toktas, A.; Zhang, C.; et al. 3D Non-Degenerate Hyperchaos: Design, Analysis, and Application in Image Encryption. IEEE Trans. Consum. Electron. 2026, 1.
- Arik, S.Ö.; Ibragimov, B.; Xing, L. Fully Automated Quantitative Cephalometry Using Convolutional Neural Networks. J. Med. Imaging 2017, 4, 014501.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
- Sun, K.; Xiao, B.; Liu, D.; Wang, J. Deep High-Resolution Representation Learning for Human Pose Estimation. In 2019 IEEE Computer Society Conference on Computer Vision and Pattern Recognition; IEEE: Piscataway, NJ, USA, 2019; pp. 5686–5696.
- Newell, A.; Yang, K.; Deng, J. Stacked Hourglass Networks for Human Pose Estimation. In Computer Vision—ECCV 2016; Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Springer: Cham, Switzerland, 2016; Volume 9912, pp. 483–499.
- Zhang, J.; Liu, M.; Shen, D. Detecting Anatomical Landmarks From Limited Medical Imaging Data Using Two-Stage Task-Oriented Deep Neural Networks. IEEE Trans. Image Process. 2017, 26, 4753–4764.
- Noothout, J.M.H.; De Vos, B.D.; Wolterink, J.M.; Postma, E.M.; Smeets, P.A.M.; Takx, R.A.P.; Leiner, T.; Viergever, M.A.; Išgum, I. Deep Learning-Based Regression and Classification for Automatic Landmark Localization in Medical Images. IEEE Trans. Med. Imaging 2020, 39, 4011–4022.
- Cao, Z.; Simon, T.; Wei, S.E.; Sheikh, Y. Realtime Multi-Person 2D Pose Estimation Using Part Affinity Fields. In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR); IEEE: Piscataway, NJ, USA, 2017; pp. 1302–1310.
- Maji, D.; Nagori, S.; Mathew, M.; Poddar, D. YOLO-Pose: Enhancing YOLO for Multi Person Pose Estimation Using Object Keypoint Similarity Loss. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, New Orleans, LA, USA, 18–24 June 2022.
- Lin, T.-Y.; Maire, M.; Belongie, S.; Bourdev, L.; Girshick, R.; Hays, J.; Perona, P.; Ramanan, D.; Zitnick, C.L.; Dollár, P. Microsoft COCO: Common Objects in Context. In Computer Vision—ECCV 2014; Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Springer: Cham, Switzerland, 2014; Volume 8693, pp. 740–755.
- Dong, C.; Du, G. An Enhanced Real-Time Human Pose Estimation Method Based on Modified YOLOv8 Framework. Sci. Rep. 2024, 14, 8012.
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016.
- Wang, C.Y.; Bochkovskiy, A.; Liao, H.Y.M. YOLOv7: Trainable Bag-of-Freebies Sets New State-of-the-Art for Real-Time Object Detectors. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 18–22 June 2023; pp. 7464–7475.
- Wang, C.-Y.; Yeh, I.-H.; Liao, H.-Y.M. YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information. In Computer Vision—ECCV 2024; Springer: Cham, Switzerland, 2024.
- Sapkota, R.; Karkee, M. Ultralytics YOLO evolution: An overview of YOLO26, YOLO11, YOLOv8 and YOLOv5 object detectors for computer vision and pattern recognition. arXiv 2025, arXiv:2510.09653.
- Terven, J.; Córdova-Esparza, D.M.; Romero-González, J.A. A Comprehensive Review of YOLO Architectures in Computer Vision: From YOLOv1 to YOLOv8 and YOLO-NAS. Mach. Learn. Knowl. Extr. 2023, 5, 1680–1716.
- Garrido-Jurado, S.; Muñoz-Salinas, R.; Madrid-Cuevas, F.J.; Marín-Jiménez, M.J. Automatic Generation and Detection of Highly Reliable Fiducial Markers under Occlusion. Pattern Recognit. 2014, 47, 2280–2292.
- Malekroodi, H.S.; Seo, S.D.; Choi, J.; Na, C.S.; Lee, B.I.; Yi, M. Real-Time Location of Acupuncture Points Based on Anatomical Landmarks and Pose Estimation Models. Front. Neurorobot. 2024, 18, 1484038.
- Yuan, Z.; Shao, P.; Li, J.; Wang, Y.; Zhu, Z.; Qiu, W.; Chen, B.; Tang, Y.; Han, A. YOLOv8-ACU: Improved YOLOv8-Pose for Facial Acupoint Detection. Front. Neurorobot. 2024, 18, 1355857.
- Wang, H.; Liu, L.; Wang, Y.; Du, S. Hand Acupuncture Point Localization Method Based on a Dual-Attention Mechanism and Cascade Network Model. Biomed. Opt. Express 2023, 14, 5965.
- Seo, S.-D.; Madusanka, N.; Malekroodi, H.S.; Na, C.-S.; Yi, M.; Lee, B. Accurate Acupoint Localization in 2D Hand Images: Evaluating HRNet and ResNet Architectures for Enhanced Detection Performance. Curr. Med. Imaging 2024, 20, e15734056315235.
- Wang, Y.; Lan, T.; Dou, W.; Chen, Z.; Zhang, S.; Chen, G. Structure-Guided Deep Learning for Back Acupoint Localization via Bone-Measuring Constraints. Front. Physiol. 2025, 16, 1662104.
- Shorten, C.; Khoshgoftaar, T.M. A Survey on Image Data Augmentation for Deep Learning. J. Big Data 2019, 6, 60.
- Litjens, G.; Kooi, T.; Bejnordi, B.E.; Setio, A.A.A.; Ciompi, F.; Ghafoorian, M.; van der Laak, J.A.W.M.; van Ginneken, B.; Sánchez, C.I. A Survey on Deep Learning in Medical Image Analysis. Med. Image Anal. 2017, 42, 60–88.
- Ji, Y.C. Improving the Lightweight Pose Detection Model Based on YOLOpose. In Proceedings of the 3rd International Conference on Machine Learning, Cloud Computing and Intelligent Mining (MLCCIM2024) Lecture Notes in Electrical Engineering; Springer: Singapore, 2025; Volume 1326, pp. 31–41.







| YOLO Variants | Params (M) | FLOPs (G) | COCO mAP@0.5 | COCO mAP@0.5:0.95 | Distal Arm mAP@0.5 | Distal Arm mAP@0.5:0.95 |
|---|---|---|---|---|---|---|
| YOLO26N | 2.9 | 7.5 | 83.3 | 57.2 | 99.5 | 99.2 |
| YOLO26S | 10.4 | 23.9 | 86.6 | 63.0 | 99.5 | 99.4 |
| YOLO26M | 21.5 | 73.1 | 89.6 | 68.8 | 99.5 | 99.3 |
| YOLO26L | 25.9 | 91.3 | 90.5 | 70.4 | 99.5 | 99.2 |
| YOLO26X | 57.6 | 201.7 | 91.6 | 71.6 | 99.5 | 99.2 |

| YOLO Variants | Mean (mm) | Median (mm) | SD (mm) | P75 (mm) | P90 (mm) | <4 mm (%) |
|---|---|---|---|---|---|---|
| YOLO26N | 2.76 | 2.65 | 0.96 | 3.30 | 4.04 | 88.0 |
| YOLO26S | 3.35 | 3.05 | 1.46 | 4.12 | 5.84 | 74.0 |
| YOLO26M | 3.11 | 3.02 | 1.05 | 3.86 | 4.25 | 80.0 |
| YOLO26L | 2.96 | 2.78 | 0.96 | 3.61 | 4.15 | 86.0 |
| YOLO26X | 4.08 | 3.05 | 2.59 | 4.41 | 9.18 | 72.0 |
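
The column headings of the table above (mean, median, SD, 75th and 90th percentiles, and the share of errors under 4 mm) can be reproduced from a list of per-measurement errors. The following is a minimal sketch, not the authors' evaluation code: the function names, the linear-interpolation percentile, the sample (n - 1) standard deviation, and the example values are all assumptions made for illustration.

```python
import math
import statistics


def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a non-empty sequence."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = math.floor(pos)
    upper = math.ceil(pos)
    frac = pos - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


def summarize_errors(errors_mm, threshold_mm=4.0):
    """Summarize localization errors (mm) with the statistics named in the table."""
    within = sum(1 for e in errors_mm if e < threshold_mm)
    return {
        "mean": statistics.mean(errors_mm),
        "median": statistics.median(errors_mm),
        "sd": statistics.stdev(errors_mm),  # sample SD (n - 1); an assumption
        "p75": percentile(errors_mm, 75),
        "p90": percentile(errors_mm, 90),
        "pct_within": 100.0 * within / len(errors_mm),
    }


# Hypothetical errors in mm, for illustration only (not study data).
example_errors = [1.0, 2.0, 3.0, 4.0, 5.0]
summary = summarize_errors(example_errors)
```

Reporting P90 alongside the mean is useful here because a heavy tail, such as the 9.18 mm P90 for YOLO26X, can be hidden by a median that is close to the other variants.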

Mean localization error (mm) by landmark

| YOLO Variants | LI11 (mm) | LI10 (mm) | TE5 (mm) | LI4 (mm) | TE3 (mm) |
|---|---|---|---|---|---|
| YOLO26N | 3.16 | 3.27 | 2.44 | 2.21 | 2.75 |
| YOLO26S | 4.31 | 4.25 | 3.07 | 2.16 | 2.96 |
| YOLO26M | 3.55 | 3.85 | 2.91 | 2.07 | 3.15 |
| YOLO26L | 3.36 | 3.58 | 2.89 | 2.26 | 2.72 |
| YOLO26X | 5.35 | 5.54 | 3.09 | 2.98 | 3.46 |
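
As a quick consistency check, the five per-landmark means in the table above can be averaged for each variant and compared with the overall mean error reported in the preceding summary table. The sketch below copies the values from both tables; the variable names are illustrative, and the check assumes each landmark contributes the same number of measurements, which is not stated in this section.

```python
# Per-landmark mean errors (mm), in the order LI11, LI10, TE5, LI4, TE3.
LANDMARKS = ["LI11", "LI10", "TE5", "LI4", "TE3"]
per_landmark_mm = {
    "YOLO26N": [3.16, 3.27, 2.44, 2.21, 2.75],
    "YOLO26S": [4.31, 4.25, 3.07, 2.16, 2.96],
    "YOLO26M": [3.55, 3.85, 2.91, 2.07, 3.15],
    "YOLO26L": [3.36, 3.58, 2.89, 2.26, 2.72],
    "YOLO26X": [5.35, 5.54, 3.09, 2.98, 3.46],
}

# Overall mean errors (mm) from the summary-statistics table.
overall_mean_mm = {
    "YOLO26N": 2.76,
    "YOLO26S": 3.35,
    "YOLO26M": 3.11,
    "YOLO26L": 2.96,
    "YOLO26X": 4.08,
}

# Absolute gap between the average of landmark means and the reported mean.
discrepancy_mm = {
    name: abs(sum(values) / len(values) - overall_mean_mm[name])
    for name, values in per_landmark_mm.items()
}

# Landmark with the largest mean error for each variant.
worst_landmark = {
    name: LANDMARKS[values.index(max(values))]
    for name, values in per_landmark_mm.items()
}

for name in per_landmark_mm:
    print(f"{name}: gap {discrepancy_mm[name]:.3f} mm, worst {worst_landmark[name]}")
```

With the tabulated values, every gap stays below 0.01 mm, which is consistent with rounding to two decimals, and the largest per-landmark error falls on LI10 or LI11 for every variant.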

| Study | Year | Body Region | Model/Method | Dataset (Images) | mAP@0.5 | Localization Error Metric | Best Reported Result | Physical Calibration (mm) |
|---|---|---|---|---|---|---|---|---|
| Wang et al. [26] | 2023 | Hand (21 keypoints) | SC-YOLOv5 + HRNet (cascade, dual-attention) | Custom (real scene) | mAP@0.5 = 97.15% | Average offset error (AOE) | AOE = 0.0269 (>40% lower than others) | No (normalized units; d = 18 cm denominator, result is dimensionless) |
| Malekroodi et al. [24] | 2024 | Distal arm (LI11, LI10, TE5, LI4, TE3) | YOLOv8l-pose (transfer learning, fine-tuned on custom dataset) | 5997 images/194 participants | mAP@0.5 = 0.99 | Euclidean distance, reported in mm | Mean error <5 mm (mm-calibrated) | Partial (fixed global conversion factor via 80 cm reference sheet) |
| Seo et al. [27] | 2024 | Hand (forearm acupoints) | HRNet-w48 vs. ResNet (top-down) | 940 images/94 participants (PK dataset); test set = 180 images | no single mAP@0.5 value stated | Mean distance error (pixels) | HRNet-w48 surpassed expert annotators | No (ArUco used for perspective correction; error reported in pixels only) |
| Yuan et al. [25] | 2024 | Face (facial acupoints) | YOLOv8-ACU (ECA attention + Slim-neck + GIoU loss) | Self-constructed (facial acupoint) | 97.5% | mAP@0.5:0.95 = 76.9% (validation); 80.7% on external test set | mAP@0.5 = 99.5% on external test set | No (no mm reported) |
| Present Study (YOLO26N) | 2026 | Distal arm (LI11, LI10, TE5, LI4, TE3) | YOLO26N (smallest variant; single-stage) | 3679 images/262 participants | 99.5% | Physical mm (ArUco-calibrated) | Mean error 2.76 ± 0.96 mm (88% within 4 mm) | Yes (ArUco marker) |
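
The "Physical Calibration" column distinguishes studies that report errors in millimetres from those that report pixels or normalized units. One common way to obtain a physical scale is to use a square fiducial marker of known side length that is visible in the image. The sketch below illustrates that idea only; it is not the calibration procedure of any study in the table. It assumes the marker corners have already been detected, that the marker lies roughly in the same plane as the landmarks and faces the camera, and it uses a hypothetical 50 mm marker and made-up coordinates.

```python
import math


def mm_per_pixel(marker_corners_px, marker_side_mm):
    """Estimate image scale from the four corners of a square marker.

    Corners must be ordered around the square so that consecutive
    corners share an edge.
    """
    side_lengths_px = [
        math.dist(marker_corners_px[i], marker_corners_px[(i + 1) % 4])
        for i in range(4)
    ]
    mean_side_px = sum(side_lengths_px) / 4.0
    return marker_side_mm / mean_side_px


def localization_error_mm(predicted_px, annotated_px, scale_mm_per_px):
    """Euclidean distance between predicted and annotated points, in mm."""
    return math.dist(predicted_px, annotated_px) * scale_mm_per_px


# Hypothetical marker corners (pixels) and side length (mm); not study data.
corners = [(100.0, 100.0), (300.0, 100.0), (300.0, 300.0), (100.0, 300.0)]
scale = mm_per_pixel(corners, marker_side_mm=50.0)

# Hypothetical predicted and annotated landmark positions (pixels).
error_mm = localization_error_mm((412.0, 250.0), (400.0, 245.0), scale)
print(f"scale = {scale:.3f} mm/px, error = {error_mm:.2f} mm")
```

Averaging the four side lengths gives some tolerance to small detection noise, but it does not correct for perspective distortion; a per-image scale of this kind is what separates a marker-based calibration from a single fixed global conversion factor.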

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Padmanabha, P.; Herath, H.M.K.K.M.B.; Madusanka, N.; Park, H.-J.; Na, C.-S.; Yi, M.; Lee, B.-i. Precision Without Complexity: A Comparative Study of YOLO26 Pose Variants for Distal Arm Landmark Detection. Appl. Sci. 2026, 16, 3968. https://doi.org/10.3390/app16083968

