Cup and Disc Segmentation in Smartphone Handheld Ophthalmoscope Images with a Composite Backbone and Double Decoder Architecture
Abstract
1. Introduction
2. Related Works
3. Materials and Methods
3.1. Image Acquisition
3.2. Preprocessing
3.3. Architecture Construction
3.4. Evaluation
4. Results
Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Ghanem, G.O.B.; Wareham, L.K.; Calkins, D.J. Addressing neurodegeneration in glaucoma: Mechanisms, challenges, and treatments. Prog. Retin. Eye Res. 2024, 100, 101261. [Google Scholar] [CrossRef] [PubMed]
- Bourne, R.R.A.; Jonas, J.B.; Friedman, D.; Nangia, V.; Bron, A.; Tapply, I.; Fernandes, A.G.; Cicinelli, M.V.; Arrigo, A.; Leveziel, N.; et al. Global estimates on the number of people blind or visually impaired by glaucoma: A meta-analysis from 2000 to 2020. Eye 2024, 38, 2036–2046. [Google Scholar] [CrossRef]
- Bragança, C.P.; Torres, J.M.; Macedo, L.O.; Soares, C.P.d.A. Advancements in Glaucoma Diagnosis: The Role of AI in Medical Imaging. Diagnostics 2024, 14, 530. [Google Scholar] [CrossRef] [PubMed]
- Tham, Y.C.; Li, X.; Wong, T.Y.; Quigley, H.A.; Aung, T.; Cheng, C.Y. Global Prevalence of Glaucoma and Projections of Glaucoma Burden through 2040: A Systematic Review and Meta-Analysis. Ophthalmology 2014, 121, 2081–2090. [Google Scholar] [CrossRef]
- Kalita, N.; Borgohain, S.K. An Ocular Feature-Based Novel Biomarker Determination for Glaucoma Diagnosis Using Supervised Machine Learning and Fundus Imaging. IEEE Sens. Lett. 2024, 8, 6014504. [Google Scholar] [CrossRef]
- Lu, S.; Zhao, H.; Liu, H.; Li, H.; Wang, N. PKRT-Net: Prior knowledge-based relation transformer network for optic cup and disc segmentation. Neurocomputing 2023, 538, 126183. [Google Scholar] [CrossRef]
- Hervella, Á.S.; Rouco, J.; Novo, J.; Ortega, M. End-to-end multi-task learning for simultaneous optic disc and cup segmentation and glaucoma classification in eye fundus images. Appl. Soft Comput. 2022, 116, 108347. [Google Scholar] [CrossRef]
- Zhao, A.; Su, H.; She, C.; Huang, X.; Li, H.; Qiu, H.; Jiang, Z.; Huang, G. Joint optic disc and cup segmentation based on elliptical-like morphological feature and spatial geometry constraint. Comput. Biol. Med. 2023, 158, 106796. [Google Scholar] [CrossRef]
- Bragança, C.P.; Torres, J.M.; Soares, C.P.d.A.; Macedo, L.O. Detection of Glaucoma on Fundus Images Using Deep Learning on a New Image Set Obtained with a Smartphone and Handheld Ophthalmoscope. Healthcare 2022, 10, 2345. [Google Scholar] [CrossRef]
- Guo, S. Fundus image segmentation via hierarchical feature learning. Comput. Biol. Med. 2021, 138, 104928. [Google Scholar] [CrossRef]
- Iqbal, S.; Khan, T.M.; Naveed, K.; Naqvi, S.S.; Nawaz, S.J. Recent trends and advances in fundus image analysis: A review. Comput. Biol. Med. 2022, 151, 106277. [Google Scholar] [CrossRef] [PubMed]
- Al-Bander, B.; Williams, B.M.; Al-Nuaimy, W.; Al-Taee, M.A.; Pratt, H.; Zheng, Y. Dense Fully Convolutional Segmentation of the Optic Disc and Cup in Colour Fundus for Glaucoma Diagnosis. Symmetry 2018, 10, 87. [Google Scholar] [CrossRef]
- Yu, S.; Xiao, D.; Frost, S.; Kanagasingam, Y. Robust optic disc and cup segmentation with deep learning for glaucoma detection. Comput. Med. Imaging Graph. 2019, 74, 61–71. [Google Scholar] [CrossRef] [PubMed]
- Meas, C.; Guo, W.; Miah, M.H. Multi-Scale Attention U-Net for Optic Disc and Optic Cup Segmentation in Retinal Fundus Images. In Proceedings of the 2024 2nd International Conference on Advancement in Computation & Computer Technologies (InCACCT), Gharuan, India, 2–3 May 2024; pp. 760–765. [Google Scholar] [CrossRef]
- Alam, A.U.; Islam, S.P.; Mahedy Hasan, S.M.; Srizon, A.Y.; Faruk, M.F.; Mamun, M.A.; Hossain, M.R. Optic Disc and Cup Segmentation via Enhanced U-Net with Residual and Attention Mechanisms. In Proceedings of the 2024 6th International Conference on Electrical Engineering and Information & Communication Technology (ICEEICT), Dhaka, Bangladesh, 2–4 May 2024; pp. 329–334. [Google Scholar] [CrossRef]
- Liu, M.; Wang, Y.; Li, Y.; Hu, S.; Wang, G.; Wang, J. A Novel Edge-Enhanced Networks for Optic Disc and Optic Cup Segmentation. Int. J. Imaging Syst. Technol. 2025, 35, e70019. [Google Scholar] [CrossRef]
- Zedan, M.J.M.; Raihanah Abdani, S.; Lee, J.; Zulkifley, M.A. RMHA-Net: Robust Optic Disc and Optic Cup Segmentation Based on Residual Multiscale Feature Extraction With Hybrid Attention Networks. IEEE Access 2025, 13, 7715–7735. [Google Scholar] [CrossRef]
- Kumar, G.B.; Kumar, S. Enhanced segmentation of optic disc and cup using attention-based U-Net with dense dilated series convolutions. Neural Comput. Appl. 2025, 37, 6831–6847. [Google Scholar] [CrossRef]
- Wang, S.; Yu, L.; Yang, X.; Fu, C.W.; Heng, P.A. Patch-Based Output Space Adversarial Learning for Joint Optic Disc and Cup Segmentation. IEEE Trans. Med. Imaging 2019, 38, 2485–2495. [Google Scholar] [CrossRef]
- Tian, Z.; Zheng, Y.; Li, X.; Du, S.; Xu, X. Graph convolutional network based optic disc and cup segmentation on fundus images. Biomed. Opt. Express 2020, 11, 3043–3057. [Google Scholar] [CrossRef]
- Yang, Y.; Yang, G.; Wang, Y.; Liu, X.; Zhao, J.; Ding, D. A geometry-aware multi-coordinate transformation fusion network for optic disc and cup segmentation. Appl. Intell. 2024, 54, 6701–6717. [Google Scholar] [CrossRef]
- Chen, C.; Zou, B.; Chen, Y.; Zhu, C. Optic disc and cup segmentation based on information aggregation network with contour reconstruction. Biomed. Signal Process. Control 2025, 104, 107179. [Google Scholar] [CrossRef]
- Virbukaitė, S.; Bernatavičienė, J.; Imbrasienė, D. Glaucoma Identification Using Convolutional Neural Networks Ensemble for Optic Disc and Cup Segmentation. IEEE Access 2024, 12, 82720–82729. [Google Scholar] [CrossRef]
- Tadisetty, S.; Chodavarapu, R.; Jin, R.; Clements, R.J.; Yu, M. Identifying the Edges of the Optic Cup and the Optic Disc in Glaucoma Patients by Segmentation. Sensors 2023, 23, 4668. [Google Scholar] [CrossRef] [PubMed]
- He, Y.; Kong, J.; Li, J.; Zheng, C. Entropy and distance-guided super self-ensembling for optic disc and cup segmentation. Biomed. Opt. Express 2024, 15, 3975–3992. [Google Scholar] [CrossRef]
- Jiang, J.X.; Li, Y.; Wang, Z. Structure-aware single-source generalization with pixel-level disentanglement for joint optic disc and cup segmentation. Biomed. Signal Process. Control 2025, 99, 106801. [Google Scholar] [CrossRef]
- Buslaev, A.; Iglovikov, V.I.; Khvedchenya, E.; Parinov, A.; Druzhinin, M.; Kalinin, A.A. Albumentations: Fast and Flexible Image Augmentations. Information 2020, 11, 125. [Google Scholar] [CrossRef]
- Liang, T.; Chu, X.; Liu, Y.; Wang, Y.; Tang, Z.; Chu, W.; Chen, J.; Ling, H. CBNet: A Composite Backbone Network Architecture for Object Detection. IEEE Trans. Image Process. 2022, 31, 6893–6906. [Google Scholar] [CrossRef]
- Eelbode, T.; Bertels, J.; Berman, M.; Vandermeulen, D.; Maes, F.; Bisschops, R.; Blaschko, M.B. Optimization for Medical Image Segmentation: Theory and Practice When Evaluating With Dice Score or Jaccard Index. IEEE Trans. Med. Imaging 2020, 39, 3679–3690. [Google Scholar] [CrossRef]
- Akiba, T.; Sano, S.; Yanase, T.; Ohta, T.; Koyama, M. Optuna: A Next-generation Hyperparameter Optimization Framework. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD ’19), New York, NY, USA, 4–8 August 2019; pp. 2623–2631. [Google Scholar] [CrossRef]
- Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Proceedings of the Medical Image Computing and Computer-Assisted Intervention—MICCAI 2015, Munich, Germany, 5–9 October 2015; Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F., Eds.; Springer: Cham, Switzerland, 2015; pp. 234–241. [Google Scholar]
- Chen, L.C.; Zhu, Y.; Papandreou, G.; Schroff, F.; Adam, H. Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation. In Proceedings of the Computer Vision—ECCV 2018, Munich, Germany, 8–14 September 2018; Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y., Eds.; Computer Vision Foundation: Cham, Switzerland, 2018; pp. 833–851. [Google Scholar]
- Xie, E.; Wang, W.; Yu, Z.; Anandkumar, A.; Alvarez, J.M.; Luo, P. SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers. In Advances in Neural Information Processing Systems; Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P., Vaughan, J.W., Eds.; Curran Associates, Inc.: Newry, UK, 2021; Volume 34, pp. 12077–12090. [Google Scholar]
- Zhang, Z.; Yin, F.S.; Liu, J.; Wong, W.K.; Tan, N.M.; Lee, B.H.; Cheng, J.; Wong, T.Y. ORIGA-light: An online retinal fundus image database for glaucoma analysis and research. In Proceedings of the 2010 Annual International Conference of the IEEE Engineering in Medicine and Biology, Buenos Aires, Argentina, 31 August–4 September 2010; pp. 3065–3068. [Google Scholar] [CrossRef]
- Wong, T. Prediction of diseases via ocular imaging: The Singapore retinal archival and analysis imaging network. In Proceedings of the Inaugural Ocular Imaging Symposium, Hong Kong, China, 28 June–2 July 2008. [Google Scholar]
- Vinogradova, K.; Dibrov, A.; Myers, G. Towards Interpretable Semantic Segmentation via Gradient-Weighted Class Activation Mapping (Student Abstract). Proc. AAAI Conf. Artif. Intell. 2020, 34, 13943–13944. [Google Scholar] [CrossRef]
- Selvaraju, R.R.; Cogswell, M.; Das, A.; Vedantam, R.; Parikh, D.; Batra, D. Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization. In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 618–626. [Google Scholar] [CrossRef]
Work | Proposition | Cup Dice | Disc Dice | Cup IoU | Disc IoU | Datasets
---|---|---|---|---|---|---
[12] | FC-DenseNet network | 86.59% | 96.53% | 76.88% | 93.34% | ORIGA, DRIONS-DB, Drishti-GS, ONHSD, RIM-ONE
[13] | ResNet-34 encoder with 2 steps | 88.77% | 97.38% | 80.42% | 94.92% | RIGA, DRISHTI-GS, RIM-ONE
[14] | Multi-Scale Attention U-Net | 93.4% | 96.4% | 87.5% | 92.8% | REFUGE, ORIGA
[15] | U-Net with residual and attention mechanisms | 93.48% | 97.48% | 87.77% | 95.09% | REFUGE, RIM-ONE, Drishti-GS
[16] | EE-TransUNet | 90.68% | 97.74% | 84.10% | 95.59% | RIM-ONE, REFUGE, DRISHTI-GS
[17] | RMHA-Net | 87.87% | 95.15% | 86.75% | 85.28% | Drishti-GS, ORIGA, PAPILA, Chaksu, REFUGE
[18] | Attention-based U-Net with dense dilated series convolutions | 88.7% | 95.95% | 79.72% | 92.22% | REFUGE, PAPILA, ORIGA, Drishti-GS, G1020, CRFO
[19] | Patch-based output space adversarial learning | 88.26% | 96.02% | — | — | DRISHTI-GS, RIM-ONE, REFUGE
[8] | Framework based on Fast R-CNN | 90.27% | 96.34% | — | — | REFUGE, ORIGA
[20] | Graph convolutional network | 95.58% | 97.76% | 91.60% | 95.64% | REFUGE, Drishti-GS
[21] | Sector association and multi-coordinate transformation fusion | 90.32% | 96.20% | — | — | REFUGE, Drishti-GS, private dataset from Beijing Tongren Hospital
[22] | HR-Net with contour reconstruction | 91.78% | 97.65% | — | — | ORIGA, DRISHTI-GS
[23] | Ensemble learning | 89.4% | 96.1% | 80.8% | 92.5% | REFUGE, RIM-ONE, Drishti-GS
[24] | Post-processing with edge detection | 90.2% | 96.5% | 82.4% | 93.3% | Drishti-GS, ORIGA, RIM-ONE, REFUGE
[25] | Unsupervised domain adaptation | 95.44% | 87.63% | — | — | RIGA+, REFUGE
[26] | Single-source domain generalization | 83.07% | 93.71% | — | — | RIGA+, REFUGE
Brazil Glaucoma Dataset

Fold | OD Dice | OC Dice | OD IoU | OC IoU
---|---|---|---|---
1 | 95.81% | 85.59% | 92.05% | 75.85%
2 | 96.09% | 85.69% | 92.53% | 76.18%
3 | 95.92% | 85.30% | 92.22% | 75.68%
4 | 96.01% | 84.72% | 92.38% | 74.90%
5 | 96.31% | 84.23% | 92.92% | 74.49%
Average | 96.03% | 85.11% | 92.42% | 75.42%
Results with Baseline Architectures

Network | OD Dice | OC Dice | OD IoU | OC IoU
---|---|---|---|---
U-Net | 86.96% | 84.61% | 77.74% | 74.86%
DeepLab V3 | 87.27% | 84.60% | 78.22% | 74.79%
SegFormer | 86.98% | 84.56% | 77.79% | 74.87%
Composite Encoder Double Decoder | 95.92% | 85.30% | 92.22% | 75.68%
ORIGA Dataset

Fold | OD Dice | OC Dice | OD IoU | OC IoU
---|---|---|---|---
1 | 96.03% | 86.56% | 92.48% | 77.24%
2 | 95.52% | 87.44% | 91.51% | 78.20%
3 | 95.98% | 86.40% | 92.33% | 76.86%
4 | 95.59% | 85.07% | 91.66% | 74.93%
5 | 95.91% | 86.36% | 92.26% | 77.07%
Average | 95.81% | 86.37% | 92.05% | 76.86%
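The Dice and IoU figures reported in the tables above follow the standard overlap definitions for binary segmentation masks, computed separately for the optic disc (OD) and optic cup (OC). The snippet below is a minimal sketch of these metrics in Python/NumPy, not the authors' evaluation code; `pred_mask` and `gt_mask` are hypothetical boolean arrays for a single structure, and per-fold scores would be averaged as in the tables.

```python
import numpy as np

def dice_iou(pred_mask: np.ndarray, gt_mask: np.ndarray, eps: float = 1e-7):
    """Dice and IoU for one binary mask pair (e.g., optic disc or optic cup)."""
    pred = pred_mask.astype(bool)
    gt = gt_mask.astype(bool)
    intersection = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    # eps guards against division by zero when both masks are empty.
    dice = (2.0 * intersection + eps) / (pred.sum() + gt.sum() + eps)
    iou = (intersection + eps) / (union + eps)
    return float(dice), float(iou)

if __name__ == "__main__":
    # Hypothetical usage with random masks; in practice, OD and OC masks
    # are evaluated separately and scores averaged over each test fold.
    rng = np.random.default_rng(0)
    pred = rng.random((512, 512)) > 0.5
    gt = rng.random((512, 512)) > 0.5
    print(dice_iou(pred, gt))
```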
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).