Can YOLO Detect Retinal Pathologies? A Step Towards Automated OCT Analysis
Abstract
1. Introduction
2. Materials and Methods
2.1. YOLO
2.2. YOLOv12
2.3. Other Object Detection Models
2.4. Performance Evaluation
2.5. Data
2.5.1. AROI Dataset
2.5.2. OCT5k Dataset
3. Results
3.1. Performance Analysis of YOLO Versions on the AROI Dataset
3.2. Performance Analysis of YOLO Versions on the OCT5k Dataset
4. Discussion and Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
Abbreviation | Definition |
---|---|
mAP | Mean Average Precision |
mAP@50 | Mean Average Precision at an IoU threshold of 0.5 |
mAP@50-95 | Mean Average Precision averaged over IoU thresholds from 0.5 to 0.95 in steps of 0.05 |
IoU | Intersection over Union |
OCT | Optical Coherence Tomography |
PED | Pigment Epithelial Detachment |
SRF | Subretinal Fluid |
IRF | Intraretinal Fluid |
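
The IoU and mAP threshold definitions above can be illustrated with a minimal sketch. It assumes axis-aligned boxes given as (x_min, y_min, x_max, y_max) and the common convention that a prediction counts as a match when its IoU with a ground-truth box is at least the threshold. The names `iou`, `IOU_THRESHOLDS`, and `thresholds_passed` are hypothetical helpers, not part of the evaluation code used in the paper.

```python
from typing import List, Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (if any).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# The ten IoU thresholds behind mAP@50-95: 0.50, 0.55, ..., 0.95.
# mAP@50 uses only the first of them.
IOU_THRESHOLDS: List[float] = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(pred: Box, truth: Box) -> List[float]:
    """Thresholds at which `pred` would count as a match for `truth`."""
    value = iou(pred, truth)
    return [t for t in IOU_THRESHOLDS if value >= t]


if __name__ == "__main__":
    predicted = (0.0, 0.0, 10.0, 10.0)
    annotated = (0.0, 0.0, 10.0, 7.2)
    print(iou(predicted, annotated))
    print(thresholds_passed(predicted, annotated))
```

A loosely localized box can therefore still count towards mAP@50 while failing the stricter thresholds, which is one reason mAP@50-95 values are lower than mAP@50 values for the same model.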

Metric | YOLOv8 | YOLOv9 | YOLOv10 | YOLOv11 | YOLOv12 | RT-DETR | YOLOv8-Worldv2 | YOLOE |
---|---|---|---|---|---|---|---|---|
mAP@50 | 0.691 | 0.663 | 0.674 | 0.694 | 0.712 | 0.566 | 0.697 | 0.725 |
mAP@50-95 | 0.462 | 0.455 | 0.466 | 0.473 | 0.485 | 0.267 | 0.444 | 0.495 |
Mean F1 | 0.642 | 0.609 | 0.629 | 0.643 | 0.661 | 0.613 | 0.632 | 0.675 |
Mean Precision | 0.789 | 0.763 | 0.747 | 0.797 | 0.775 | 0.689 | 0.836 | 0.786 |
Mean Recall | 0.543 | 0.514 | 0.545 | 0.541 | 0.581 | 0.552 | 0.516 | 0.597 |
Inference Time [ms] | 4.9 | 6.8 | 6.3 | 5.7 | 4.9 | 15.3 | 5.9 | 4.1 |
Computational Cost [GFLOPs] | 28.4 | 26.7 | 24.5 | 21.3 | 21.2 | 103.4 | 32.6 | 35.3 |

Class | Images | Instances | Precision | Recall | mAP@50 | mAP@50-95 |
---|---|---|---|---|---|---|
all | 107 | 420 | 0.762 | 0.593 | 0.712 | 0.485 |
PED | 104 | 157 | 0.786 | 0.631 | 0.749 | 0.528 |
SRF | 63 | 80 | 0.831 | 0.738 | 0.834 | 0.626 |
IRF | 26 | 183 | 0.67 | 0.41 | 0.553 | 0.3 |

Class | Images | Instances | Precision | Recall | mAP@50 | mAP@50-95 |
---|---|---|---|---|---|---|
all | 107 | 420 | 0.786 | 0.597 | 0.725 | 0.495 |
PED | 104 | 157 | 0.764 | 0.618 | 0.742 | 0.541 |
SRF | 63 | 80 | 0.868 | 0.738 | 0.847 | 0.642 |
IRF | 26 | 183 | 0.727 | 0.437 | 0.586 | 0.302 |
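
The relationship between the per-class precision and recall values and a mean F1 score can be sketched as follows. The numbers are copied from the per-class table directly above (its "all" row matches the YOLOE column of the model comparison). The source does not state how Mean F1 was aggregated, so the unweighted (macro) average used here is an assumption, and `f1` is a hypothetical helper.

```python
from statistics import mean

# (precision, recall) per class, copied from the per-class table above.
per_class = {
    "PED": (0.764, 0.618),
    "SRF": (0.868, 0.738),
    "IRF": (0.727, 0.437),
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0 when both are 0."""
    total = precision + recall
    return 2 * precision * recall / total if total > 0 else 0.0


per_class_f1 = {name: f1(p, r) for name, (p, r) in per_class.items()}

# Assumption: Mean F1 is the unweighted average of the per-class F1 scores.
macro_f1 = mean(list(per_class_f1.values()))

if __name__ == "__main__":
    for name, score in per_class_f1.items():
        print(f"{name}: {score:.3f}")
    print(f"macro F1: {macro_f1:.3f}")
```

Under this assumption the macro average comes out at about 0.676, within rounding distance of the Mean F1 of 0.675 listed for YOLOE in the comparison table. The small gap is consistent with the inputs already being rounded to three decimals.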

Metric | YOLOv8 | YOLOv9 | YOLOv10 | YOLOv11 | YOLOv12 | RT-DETR | YOLOv8-Worldv2 | YOLOE |
---|---|---|---|---|---|---|---|---|
mAP@50 | 0.301 | 0.282 | 0.228 | 0.304 | 0.301 | 0.180 | 0.292 | 0.355 |
mAP@50-95 | 0.120 | 0.102 | 0.095 | 0.132 | 0.111 | 0.057 | 0.117 | 0.135 |
Mean F1 | 0.278 | 0.253 | 0.228 | 0.268 | 0.268 | 0.196 | 0.245 | 0.322 |
Mean Precision | 0.371 | 0.340 | 0.315 | 0.381 | 0.382 | 0.391 | 0.414 | 0.445 |
Mean Recall | 0.239 | 0.208 | 0.191 | 0.222 | 0.210 | 0.270 | 0.185 | 0.269 |
Inference Time [ms] | 10.6 | 13.2 | 13.6 | 12.5 | 10.4 | 25.8 | 11.5 | 10.3 |
Computational Cost [GFLOPs] | 28.5 | 26.7 | 24.5 | 21.3 | 21.2 | 103.5 | 34.0 | 35.3 |

Class | Images | Instances | Precision | Recall | mAP@50 | mAP@50-95 |
---|---|---|---|---|---|---|
All | 57 | 575 | 0.382 | 0.21 | 0.301 | 0.111 |
Choroidalfolds | 5 | 10 | 0.429 | 0.3 | 0.397 | 0.082 |
Fluid | 2 | 3 | 0 | 0 | 0 | 0 |
Geographicatrophy | 6 | 9 | 0.833 | 0.556 | 0.722 | 0.182 |
Harddrusen | 33 | 119 | 0.429 | 0.252 | 0.339 | 0.112 |
Hyperfluorescentspots | 5 | 8 | 0 | 0 | 0 | 0 |
PRlayerdisruption | 23 | 101 | 0 | 0 | 0 | 0 |
Reticulardrusen | 9 | 33 | 0.312 | 0.152 | 0.191 | 0.0672 |
Softdrusen | 38 | 265 | 0.739 | 0.374 | 0.56 | 0.279 |
SoftdrusenPED | 9 | 27 | 0.7 | 0.259 | 0.503 | 0.279 |

Class | Images | Instances | Precision | Recall | mAP@50 | mAP@50-95 |
---|---|---|---|---|---|---|
All | 57 | 575 | 0.445 | 0.269 | 0.355 | 0.135 |
Choroidalfolds | 5 | 10 | 0.571 | 0.4 | 0.501 | 0.132 |
Fluid | 2 | 3 | 0 | 0 | 0 | 0 |
Geographicatrophy | 6 | 9 | 0.75 | 0.667 | 0.713 | 0.236 |
Harddrusen | 33 | 119 | 0.468 | 0.37 | 0.402 | 0.139 |
Hyperfluorescentspots | 5 | 8 | 0 | 0 | 0 | 0 |
PRlayerdisruption | 23 | 101 | 0.65 | 0.129 | 0.391 | 0.111 |
Reticulardrusen | 9 | 33 | 0.143 | 0.0303 | 0.0744 | 0.0372 |
Softdrusen | 38 | 265 | 0.752 | 0.457 | 0.599 | 0.29 |
SoftdrusenPED | 9 | 27 | 0.667 | 0.37 | 0.517 | 0.266 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Ardelean, A.-I.; Ardelean, E.-R.; Marginean, A. Can YOLO Detect Retinal Pathologies? A Step Towards Automated OCT Analysis. Diagnostics 2025, 15, 1823. https://doi.org/10.3390/diagnostics15141823