An Interpretable Clinical Decision Support System Aims to Stage Age-Related Macular Degeneration Using Deep Learning and Imaging Biomarkers
Abstract
1. Introduction
Technical Contributions
2. Modeling the Cognitive Processes of Experts in Staging Age-Related Macular Degeneration Based on Imaging Biomarkers
2.1. Presentation of OCT Data as an Image and a Set of Imaging Biomarkers
2.2. Analysis of the Primary OCT Data Set
- Retinal OCT scans without AMD (N) (25% of the total data set),
- Early stage AMD (S) (8% of the total data set),
- Intermediate stage of AMD between dry and wet stage (P) (34% of the total data set),
- Late stage of dry (atrophic) AMD (SI) (7% of the total data set),
- Late stage of wet (neovascular) AMD (V) (17% of the total data set),
- Late stage of AMD with subretinal fibrosis (VI; IB sr) (9% of the total data set).
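The class shares listed above are strongly unbalanced (34% for P against 7% for SI). As a minimal sketch, assuming simple inverse-frequency weighting (an illustrative choice, not necessarily the weighting used in this work), the shares can be turned into per-class loss weights as follows. The names `CLASS_SHARE` and `inverse_frequency_weights` are hypothetical.

```python
# Class shares copied from the list above; the weighting scheme is an assumption.
CLASS_SHARE = {"N": 0.25, "S": 0.08, "P": 0.34, "SI": 0.07, "V": 0.17, "VI": 0.09}


def inverse_frequency_weights(shares):
    """Weight each class by 1/share, normalised so the weights average to 1."""
    raw = {name: 1.0 / share for name, share in shares.items()}
    mean_raw = sum(raw.values()) / len(raw)
    return {name: value / mean_raw for name, value in raw.items()}


weights = inverse_frequency_weights(CLASS_SHARE)
# Ratio between the most and the least frequent class (P versus SI).
imbalance_ratio = max(CLASS_SHARE.values()) / min(CLASS_SHARE.values())

for name, weight in weights.items():
    print(f"{name:>2}: share={CLASS_SHARE[name]:.2f} weight={weight:.2f}")
print(f"imbalance ratio: {imbalance_ratio:.2f}")
```

Rare classes such as SI and S receive the largest weights, which is the usual motivation for class-balanced training on data sets of this shape.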
2.3. Assessing the Efficiency of Direct Classification and Pattern Detection in OCT Images
- Accuracy (Acc): Reflects the DNN’s overall correctness in identifying IBs across all AMD stages.
- Precision: Indicates the reliability of the DNN’s positive IB detections.
- F1-score: Balances precision and recall, which is crucial for handling rare or imbalanced IBs.
- Specificity (SP): Measures the correct identification of an IB’s absence, reducing false positives.
- Sensitivity (SN/Recall): Ensures that critical IBs are not overlooked, which is vital for detecting early signs of AMD progression.
- Area Under the ROC Curve (AUC): Threshold-independent summary of discriminative performance; equals the probability that a randomly chosen positive is ranked above a randomly chosen negative and is widely used alongside TPR/FPR, sensitivity, and specificity.
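The metrics listed above can be computed from the confusion counts of one binary IB detector. The following is a minimal sketch with hypothetical counts and scores. The AUC is computed directly from its ranking interpretation (the probability that a positive is scored above a negative, with ties counted as one half), which is only practical for small samples.

```python
def binary_metrics(tp, fp, tn, fn):
    """Accuracy, precision, sensitivity, specificity and F1 from confusion counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


def auc_by_ranking(scores, labels):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical counts for a single IB detector.
metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
# Hypothetical detector scores and ground-truth labels.
auc = auc_by_ranking([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0])
print(metrics, auc)
```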
- High Clinical + High Statistical Significance: The IB is highly relevant for diagnosing AMD stages and is supported by sufficient data to train an effective classifier. These are ideal candidates for inclusion.
- Low Clinical + High Statistical Significance: The IB is statistically sound but has low clinical relevance for AMD staging, making its inclusion potentially redundant.
- High Clinical + Low Statistical Significance: The IB is clinically critical but rare in the dataset. This low statistical representation hinders the development of an effective classifier without techniques like data augmentation.
- Low Clinical + Low Statistical Significance: The IB lacks both clinical and statistical relevance, suggesting it should be excluded from the dataset.
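The four combinations above amount to a two-by-two decision table. A minimal sketch follows, assuming both significance scores are scaled to [0, 1] and using hypothetical cut-offs of 0.5. The function name `triage_ib` and the returned labels are illustrative only.

```python
def triage_ib(clinical_score, statistical_score,
              clinical_cut=0.5, statistical_cut=0.5):
    """Map an IB to one of the four quadrants described above.

    The cut-offs are hypothetical; the text does not fix them at this point.
    """
    high_clinical = clinical_score >= clinical_cut
    high_statistical = statistical_score >= statistical_cut
    if high_clinical and high_statistical:
        return "include"
    if high_statistical:
        return "potentially redundant"
    if high_clinical:
        return "needs augmentation or more data"
    return "exclude"


# Hypothetical scores for four imaginary biomarkers.
examples = {
    "ib_a": (0.9, 0.8),
    "ib_b": (0.2, 0.8),
    "ib_c": (0.9, 0.1),
    "ib_d": (0.1, 0.1),
}
decisions = {name: triage_ib(c, s) for name, (c, s) in examples.items()}
print(decisions)
```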
2.4. External Validation Protocol
3. Creation of a Dataset and a Classification Algorithm Based on the Patterns of the Target Class
3.1. Calculating the Statistical and Clinical Significance of IBs
3.2. Optimal Selection of IBs Based on Their Statistical and Clinical Significance
- Ensuring maximum classifier performance: High performance characteristics of binary classifiers are preferred.
- Ensuring maximum clinical value: Preference should be given to IBs that are assessed as “Present,” “Common,” or “Defining features” for at least one stage of AMD.
- Statistical Performance: In the context of diagnostics, a minimum acceptable level of sensitivity and specificity is considered to be 80% [90,91]. The threshold value for statistical performance was determined to be . The steepness parameter is defined as because a steeper sigmoid results in a larger derivative near the threshold. This ensures that slight deviations in performance are represented as significant changes in the transformed output. Such an approach provides meaningful gradients for optimization, which is essential for robust parameter estimation and model convergence [92]. From the equation , it follows that in order to penalize a low J via the transformed J term , it is required that M > 1. Since statistical efficiency is a more critical factor for enabling the training of the IB search classifier on OCT images, the penalty value was chosen to be greater than the corresponding element of the equation used for calculating clinical significance: .
- Clinical Significance:
- Performance Threshold. Ensures the cumulative statistical performance exceeds a minimum threshold:
- Stage Coverage. Ensures adequate clinical coverage across all disease stages:
- shape the statistical objective via a transformed J with a steep sigmoid around to magnify small but clinically meaningful improvements near the acceptable sensitivity/specificity operating point;
- add a stage-wise clinical coverage term using so that every AMD stage maintains a minimum level of supported evidence;
- enforce hard constraints and to rule out Pareto-optimal yet clinically unsafe subsets.

This approach ensures that the selection process focuses on subsets that remain learnable (high J), maintain clinical balance (coverage across different stages), and are robust against class imbalance. To our knowledge, this combined strategy has not been previously applied in IB selection using multi-objective evolutionary algorithms (MOEAs).
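The three ingredients above (a sigmoid-transformed J, a stage-wise coverage term, and hard constraints) can be sketched in a few lines. Every numeric value here is an assumption made for illustration: the threshold 0.6 is simply Youden's J at 80% sensitivity and 80% specificity (0.8 + 0.8 − 1), while the steepness of 20, both constraint levels, the per-IB J values, and the relevance scores are hypothetical. The penalty factor M is omitted.

```python
import math


def youden_j(sensitivity, specificity):
    """Youden's index J = SN + SP - 1."""
    return sensitivity + specificity - 1.0


def transformed_j(j, threshold=0.6, steepness=20.0):
    """Steep sigmoid around the threshold (both parameters are assumptions)."""
    return 1.0 / (1.0 + math.exp(-steepness * (j - threshold)))


def is_feasible(subset, j_by_ib, relevance, stages,
                min_performance=0.5, min_coverage=0.5):
    """Check the two hard constraints for a candidate IB subset.

    Performance: mean transformed J over the subset.
    Coverage: each stage needs at least one selected IB whose clinical
    relevance for that stage reaches min_coverage.
    """
    if not subset:
        return False
    performance = sum(transformed_j(j_by_ib[ib]) for ib in subset) / len(subset)
    covered = all(
        max(relevance[ib].get(stage, 0.0) for ib in subset) >= min_coverage
        for stage in stages
    )
    return performance >= min_performance and covered


# Hypothetical per-IB J values and stage relevance scores.
j_by_ib = {"td": 0.75, "sr": 0.70, "gv": 0.40}
relevance = {
    "td": {"S": 1.0},
    "sr": {"VI": 1.0},
    "gv": {"S": 0.2, "VI": 0.2},
}
stages = ["S", "VI"]

print(is_feasible(["td", "sr"], j_by_ib, relevance, stages))
print(is_feasible(["gv"], j_by_ib, relevance, stages))
```

With a steep sigmoid, a J of 0.7 maps to about 0.88 and a J of 0.5 to about 0.12, so a difference of 0.1 on either side of the threshold dominates the transformed score.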
- Selection: [td, md, gv, ga, fopes, irzh, srzh, sr];
- Aggregated Transformed Performance: 0.0000;
- Aggregated Clinical Significance: 1.0000.
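A selection such as the one reported above is one point in a two-objective space (aggregated transformed performance, aggregated clinical significance). As a minimal sketch of how candidate subsets are compared in multi-objective search, the following filters a set of candidates down to its non-dominated front. The candidate names and objective values are hypothetical, and this brute-force filter only illustrates the dominance relation; it is not the NSGA-II procedure itself.

```python
def dominates(a, b):
    """True if a is at least as good as b everywhere and better somewhere.

    Both objectives are maximised.
    """
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def pareto_front(candidates):
    """Keep only the candidates that no other candidate dominates."""
    front = {}
    for name, objectives in candidates.items():
        dominated = any(
            dominates(other, objectives)
            for other_name, other in candidates.items()
            if other_name != name
        )
        if not dominated:
            front[name] = objectives
    return front


# Hypothetical (performance, clinical significance) pairs.
candidates = {
    "subset_a": (0.9, 0.6),
    "subset_b": (0.7, 0.9),
    "subset_c": (0.6, 0.5),
}
front = pareto_front(candidates)
print(sorted(front))
```

Hard constraints are applied on top of this filter: a subset can sit on the front and still be rejected if it fails the performance or coverage check.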
3.3. Fuzzy Logic-Based Interpretable AMD Stage Classification
3.3.1. Architecture Integration and Confidence Calibration
3.3.2. Fuzzy Logic Implementation for Expert Rule Modeling
3.3.3. Hyper-Parameter Sensitivity of Fuzzy Membership
- the membership midpoint around the clinical “Present” operating point;
- the steepness ;
- the “Defining feature” amplification ;
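The three hyper-parameters above can be illustrated with a sigmoid membership function. This is a sketch under stated assumptions: the sigmoid form, the weighted-average aggregation, the role labels, and all numeric values (midpoint 0.5, steepness 10, amplification 2) are hypothetical stand-ins rather than the values used in this work.

```python
import math


def membership(confidence, midpoint=0.5, steepness=10.0):
    """Sigmoid membership of a calibrated IB confidence in the fuzzy set 'present'."""
    return 1.0 / (1.0 + math.exp(-steepness * (confidence - midpoint)))


def stage_score(confidences, rule, midpoint=0.5, steepness=10.0,
                defining_gain=2.0):
    """Weighted average of memberships for one stage rule.

    rule maps an IB name to 'defining', 'present' or 'absent'.
    'absent' uses the complement of the membership; 'defining' IBs are
    amplified by defining_gain.
    """
    total = 0.0
    weight_sum = 0.0
    for ib, role in rule.items():
        mu = membership(confidences[ib], midpoint, steepness)
        if role == "absent":
            mu = 1.0 - mu
        weight = defining_gain if role == "defining" else 1.0
        total += weight * mu
        weight_sum += weight
    return total / weight_sum


# Hypothetical calibrated confidences and two simplified stage rules.
confidences = {"sr": 0.9, "ga": 0.2}
rule_vi = {"sr": "defining", "ga": "absent"}
rule_si = {"ga": "defining", "sr": "absent"}

score_vi = stage_score(confidences, rule_vi)
score_si = stage_score(confidences, rule_si)
print(f"VI: {score_vi:.3f}  SI: {score_si:.3f}")
```

Shifting the midpoint moves the confidence at which an IB counts as half present, the steepness controls how abruptly that transition happens, and the amplification controls how much a defining feature outweighs the other IBs in a rule.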
3.4. Interpretable Visualization Framework
- Bar chart representation of IB confidence scores, enabling clinicians to quickly assess which image features were detected with high reliability;
- Radar chart visualization of AMD stage probabilities, providing a unique “diagnostic fingerprint” for each case that facilitates comparison across different stages.
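Both views reduce to simple geometry before any plotting library is involved. The sketch below, which uses only the standard library, computes the polygon vertices of a radar chart from stage probabilities and renders a confidence bar as text. The stage order, the probabilities, and the helper names are hypothetical.

```python
import math

STAGES = ["N", "S", "P", "SI", "V", "VI"]


def radar_vertices(probabilities, stages=STAGES):
    """Return (stage, x, y) for each axis of the radar chart.

    Axes are spaced evenly around the circle; the radius is the probability.
    """
    count = len(stages)
    vertices = []
    for index, stage in enumerate(stages):
        angle = 2.0 * math.pi * index / count
        radius = probabilities[stage]
        vertices.append((stage, radius * math.cos(angle), radius * math.sin(angle)))
    return vertices


def text_bar(name, confidence, width=20):
    """Render one IB confidence as a fixed-width text bar."""
    filled = round(confidence * width)
    return f"{name:<6}|{'#' * filled}{'.' * (width - filled)}| {confidence:.2f}"


# Hypothetical stage probabilities and IB confidences for one case.
stage_probabilities = {"N": 0.05, "S": 0.10, "P": 0.60, "SI": 0.05, "V": 0.15, "VI": 0.05}
ib_confidences = {"td": 0.5, "md": 0.85, "sr": 0.1}

vertices = radar_vertices(stage_probabilities)
for name, confidence in ib_confidences.items():
    print(text_bar(name, confidence))
```

The vertex list can be passed to any plotting backend to draw the closed polygon that forms the "diagnostic fingerprint".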
4. Results
4.1. Overall System Performance
4.2. Comparison with Strong End-to-End Baselines
4.3. Per-Stage Performance Analysis
- Early AMD classification accuracy increased from 7.1% to 84.8%, with correct identification of 84 out of 99 cases compared to only 7 in the baseline model;
- Normal case identification improved from 48% to 95.1% accuracy, virtually eliminating false positives;
- Late atrophic AMD classification improved from 53.0% to 86.7% accuracy;
- Intermediate AMD accuracy increased from 70.1% to 92.7%;
- Two categories showed slight performance decreases: late neovascular AMD (from 96.7% to 88.3%) and late fibrosis AMD (from 93.1% to 89.7%), suggesting potential overlapping features between advanced disease stages that require further refinement.
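The per-stage figures above can be checked with simple arithmetic. The sketch below reproduces the early-AMD percentages from the stated counts (84 and 7 correct out of 99) and tabulates the change in percentage points for each stage. All numbers are copied from the list above.

```python
def accuracy_pct(correct, total):
    """Per-stage accuracy as a percentage, rounded to one decimal."""
    return round(100.0 * correct / total, 1)


early_two_stage = accuracy_pct(84, 99)
early_baseline = accuracy_pct(7, 99)

# (baseline %, two-stage %) per stage, as reported above.
reported = {
    "S": (7.1, 84.8),
    "N": (48.0, 95.1),
    "SI": (53.0, 86.7),
    "P": (70.1, 92.7),
    "V": (96.7, 88.3),
    "VI": (93.1, 89.7),
}
# Change in percentage points; negative values are regressions.
delta = {stage: round(after - before, 1) for stage, (before, after) in reported.items()}

print(early_baseline, early_two_stage)
for stage, change in delta.items():
    print(f"{stage:>2}: {change:+.1f} pp")
```

Four stages gain between 22.6 and 77.7 percentage points, while V and VI lose 8.4 and 3.4 points respectively.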
4.4. Failure Analysis: Grouped Error Modes and Rule/IB
4.5. Temperature Calibration Analysis
4.6. IB Importance Ablations (LOBO and Top-K)
4.7. External Validation Across Scanner Types
4.8. Computational Efficiency Analysis
4.9. Clinical Impact and Interpretability
5. Discussion
6. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Conflicts of Interest
References
- Madi, H.A.; Keller, J. Increasing frequency of hospital admissions for retinal detachment and vitreo-retinal surgery in England 2000–2018. Eye 2020, 34, 1584–1590.
- Stålhammar, G.; Lardner, E.; Georgsson, M.; Seregard, S. Increasing demand for ophthalmic pathology: Time trends in a laboratory with nationwide coverage. BMC Ophthalmol. 2023, 23, 88.
- Shah, V. To study the morbidity pattern of patients attending the ophthalmology OPD of tertiary eye care centre with reference to age. Open Access J. Ophthalmol. 2023, 8, 1–5.
- Aznabaev, B.M.; Mukhamadeev, T.R.; Dibaev, T.I. Optical Coherence Tomography + Angiography in the Diagnosis, Therapy and Surgery of Eye Diseases; August Borg: Moscow, Russia, 2019.
- Hu, Y.; Gao, Y.; Gao, W.; Luo, W.; Yang, Z.; Xiong, F.; Chen, Z.; Lin, Y.; Xia, X.; Yin, X.; et al. AMD-SD: An optical coherence tomography image dataset for wet AMD lesions segmentation. Sci. Data 2024, 11, 1014.
- Lopukhova, E.A.; Ibragimova, R.R.; Idrisova, G.M.; Lakman, I.A.; Mukhamadeev, T.R.; Grakhova, E.P.; Bilyalov, A.R.; Kutluyarov, R.V. Machine learning algorithms for the analysis of age-related macular degeneration based on optical coherence tomography: A systematic review. J. Biomed. Photonics Eng. 2023, 9, 020202.
- Victor, A.A. The role of imaging in age-related macular degeneration. Retin. Physician 2019, 16, 38–42.
- Chen, J.Y.; Vedantham, S.; Lexa, F.J. Burnout and work-work imbalance in radiology—Wicked problems on a global scale: A baseline pre-COVID-19 survey of US neuroradiologists compared to international radiologists and adjacent staff. Eur. J. Radiol. 2022, 155, 110153.
- Duncan, J.R. Information overload: When less is more in medical imaging. Diagnosis 2017, 4, 179–183.
- Ergün Sahin, B.U.; Güneş, E.D.; Kocabıyıkoğlu, A.; Keskin, A. How does workload affect test ordering behavior of physicians? An empirical investigation. Prod. Oper. Manag. 2022, 31, 2664–2680.
- Winder, M.; Owczarek, A.J.; Chudek, J.; Pilch-Kowalczyk, J.; Baron, J. Are we overdoing it? Changes in diagnostic imaging workload during the years 2010–2020 including the impact of the SARS-CoV-2 pandemic. Healthcare 2021, 9, 1557.
- Jain, B.I. Enhancing diagnostic: Machine learning in medical image analysis. Int. J. Sci. Res. Eng. Manag. 2024, 8, 1–5.
- Li, J. Reliability and efficiency of human-automation interaction in automated decision support systems. Highlights Sci. Eng. Technol. 2024, 106, 431–435.
- Lukmanov, I.; Agaev, V.; Tsypkin, D. Automation in healthcare: Advantages, prospects, perceptual barriers. City Healthc. 2024, 5, 181–188.
- Amaral, A.C.K.; Cuthbertson, B.H. The efficiency of computerised clinical decision support systems. Lancet 2024, 403, 410–411.
- Belov, K.S.; Kharitonov, A.S.; Chernova, S.V. At the Crossroads of Technology and Medicine: Prospects of automation in medical practice with the use of neural networks. Infokommunikacionnye Tehnol. 2023, 21, 89–93.
- Przystalski, K.; Thanki, R.M. Computer vision for medical data analysis. In Explainable Machine Learning in Medicine; Springer: Cham, Switzerland, 2024; pp. 53–66.
- Sheelavathi, A.; Shanmugapriya, P.; Sangeethapriya, J.; Muthukarupaee, K. A roadmap to smart healthcare automation sensors and technologies. In Futuristic Trends in IOT; Volume 2, Book 15, Part 1; Iterative International Publishers: Chikmagalur, Karnataka, India, 2022; pp. 43–51. ISBN 978-93-5747-350-7.
- Sindhu, P.; Sivakumar, M. Healthcare integrating automation and robotics-based industry 5.0 advancement. In Advances in Medical Technologies and Clinical Practice; IGI Global: Hershey, PA, USA, 2024; pp. 254–264.
- Umare Thool, K.B.; Wankhede, P.A.; Yella, V.R.; Tamijeselvan, S.; Suganthi, D.; Rastogi, R. Artificial intelligence in medical imaging data analytics using CT images. In Proceedings of the 2023 4th International Conference on Electronics and Sustainable Communication Systems (ICESC), Coimbatore, India, 6–8 July 2023; pp. 1–6.
- Hayat, M.; Aramvith, S.; Bhattacharjee, S.; Ahmad, N. Attention GhostUNet++: Enhanced segmentation of adipose tissue and liver in CT images. Med. Image Anal. 2025, 89, 102913.
- Sarvakar, K.; Yadav, R.; Patel, A.; Patel, C.D.; Rana, K.; Borisagar, V. Advanced analytics and machine learning algorithms for healthcare decision support systems: A study. In Advances in Healthcare Information Systems and Administration; IGI Global: Hershey, PA, USA, 2024; pp. 16–50.
- Chaddad, A.; Peng, J.; Xu, J.; Bouridane, A. Survey of explainable AI techniques in healthcare. Sensors 2023, 23, 634.
- Badhoutiya, A.; Verma, R.P.; Shrivastava, A.; Laxminarayanamma, K.; Rao, A.L.N.; Khan, A.K. Random Forest Classification in Healthcare Decision Support for Disease Diagnosis. In Proceedings of the 2023 International Conference on Artificial Intelligence for Innovations in Healthcare Industries (ICAIIHI), Raipur, India, 29–30 December 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 1–6.
- Choi, H.; Abdirayimov, S. Demonstrating the power of SHAP values in AI-driven classification of Marvel characters. J. Multimed. Inf. Syst. 2024, 11, 167–172.
- Daram, S. Explainable AI in healthcare: Enhancing trust, transparency, and ethical compliance in medical AI systems. AI Ethics 2025, 5, 1–20.
- Chauvie, S.; Mazzoni, L.N. A review on the use of imaging biomarkers in oncology clinical trials: Quality assurance strategies for technical validation. Tomography 2023, 9, 1876–1902.
- Chiu, F.; Yen, Y. Imaging biomarkers for clinical applications in neuro-oncology: Current status and future perspectives. Biomark. Res. 2023, 11, 35.
- Cho, W.C.; Zhou, F.; Li, J.; Hua, L.; Liu, F. Editorial: Biomarker detection algorithms and tools for medical imaging or omics data. Front. Genet. 2022, 13, 919390.
- Pai, S.; Bontempi, D.; Hadzic, I.; Prudente, V.; Sokač, M.; Chaunzwa, T.L.; Bernatz, S.; Hosny, A.; Mak, R.H.; Birkbak, N.J.; et al. Foundation model for cancer imaging biomarkers. Nat. Mach. Intell. 2024, 6, 354–367.
- Deshmukh, A. Artificial intelligence in medical imaging: Applications of deep learning for disease detection and diagnosis. Univers. Res. Rep. 2024, 11, 31–36.
- Ltifi, H.; Benmohamed, E.; Kolski, C.; Ben Ayed, M. Adapted visual analytics process for intelligent decision-making: Application in a medical context. Int. J. Inf. Technol. Decis. Mak. 2020, 19, 241–282.
- Myrou, A.; Barmpagiannos, K.; Ioakimidou, A.; Savopoulos, C. Molecular biomarkers in neurological diseases: Advances in diagnosis and prognosis. Int. J. Mol. Sci. 2025, 26, 2231.
- Rasouli, S.; Alkurdi, D.; Jia, B. The role of artificial intelligence in modern medical education and practice: A systematic literature review. BMC Med. Educ. 2024, 24, 456.
- Reeja, S.R.; Kavitha, G. Biomarkers classification for various diseases using machine learning approaches: A review. Int. J. Adv. Comput. Sci. Appl. 2023, 14, 123–135.
- Trueblood, J.S.; Holmes, W.R.; Seegmiller, A.C.; Douds, J.; Compton, M.; Szentirmai, E.; Woodruff, M.; Huang, W.; Stratton, C.; Eichbaum, Q. The impact of speed and bias on the cognitive processes of experts and novices in medical image decision-making. Cogn. Res. Princ. Implic. 2018, 3, 28.
- Stahl, A. The diagnosis and treatment of age-related macular degeneration. Dtsch. Ärzteblatt Int. 2020, 117, 513–520.
- Wong, T.Y.; Lanzetta, P.; Bandello, F.; Eldem, B.; Navarro, R.; Lövestam-Adrian, M.; Loewenstein, A. Current concepts and modalities for monitoring the fellow eye in neovascular age-related macular degeneration: An expert panel consensus. Retina 2020, 40, 599–611.
- Ferris, F.L.; Wilkinson, C.; Bird, A.; Chakravarthy, U.; Chew, E.; Csaky, K.; Sadda, S.R. Clinical classification of age-related macular degeneration. Ophthalmology 2013, 120, 844–851.
- Rudnicka, A.R.; Kapetanakis, V.V.; Jarrar, Z.; Wathern, A.K.; Wormald, R.; Fletcher, A.E.; Cook, D.G.; Owen, C.G. Incidence of late-stage age-related macular degeneration in American whites: Systematic review and meta-analysis. Am. J. Ophthalmol. 2015, 160, 85–93.
- Handa, J.T.; Bowes Rickman, C.; Dick, A.D.; Gorin, M.B.; Miller, J.W.; Toth, C.A.; Ueffing, M.; Zarbin, M.; Farrer, L.A. A systems biology approach towards understanding and treating non-neovascular age-related macular degeneration. Nat. Commun. 2019, 10, 3347.
- Lopukhova, E.A.; Yusupov, E.S.; Ibragimova, R.R.; Idrisova, G.M.; Mukhamadeev, T.R.; Grakhova, E.P.; Kutluyarov, R.V. Hybrid intelligent staging of age-related macular degeneration for decision-making on patient management tactics. Biomed. Signal Process. Control 2025, 87, 105456.
- Deb, K.; Pratap, A.; Agarwal, S.; Meyarivan, T. A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans. Evol. Comput. 2002, 6, 182–197.
- Guo, C.; Pleiss, G.; Sun, Y.; Weinberger, K.Q. On Calibration of Modern Neural Networks. In Proceedings of the 34th International Conference on Machine Learning—PMLR, Sydney, Australia, 6–11 August 2017; Volume 70, pp. 1321–1330.
- Bahani, K.; Moujabbir, M.; Ramdani, M. An accurate fuzzy rule-based classification systems for heart disease diagnosis. Int. J. Adv. Comput. Sci. Appl. 2021, 12, 491–498.
- Sirocchi, C.; Bogliolo, A.; Montagna, S. Medical-informed machine learning: Integrating prior knowledge into medical decision systems. BMC Med. Inform. Decis. Mak. 2024, 24 (Suppl. S4), 186.
- Liang, S.; Li, Y.; Srikant, R. Enhancing the reliability of out-of-distribution image detection in neural networks. arXiv 2017, arXiv:1706.02690.
- Age-Related Eye Disease Study Research Group. A randomized, placebo-controlled, clinical trial of high-dose supplementation with vitamins C and E, beta carotene, and zinc for age-related macular degeneration and vision loss: AREDS report no. 8. Arch. Ophthalmol. 2001, 119, 1417–1436.
- Lad, E.M.; Finger, R.P.; Guymer, R. Biomarkers for the progression of intermediate age-related macular degeneration. Ophthalmol. Ther. 2023, 12, 2917–2941.
- Vallino, V.; Berni, A.; Coletto, A.; Serafino, S.; Bandello, F.; Reibaldi, M.; Borrelli, E. Structural OCT and OCT angiography biomarkers associated with the development and progression of geographic atrophy in AMD. Surv. Ophthalmol. 2024, 69, 405–420.
- Garcia-Layana, A.; Cabrera-López, F.; García-Arumí, J.; Arias-Barquet, L.; Ruiz-Moreno, J.M. Early and intermediate age-related macular degeneration: Update and clinical review. Clin. Interv. Aging 2017, 12, 1579–1587.
- Waldstein, S.M.; Vogl, W.; Bogunovic, H.; Sadeghipour, A.; Riedl, S.; Schmidt-Erfurth, U. Characterization of drusen and hyperreflective foci as biomarkers for disease progression in age-related macular degeneration using artificial intelligence in optical coherence tomography. JAMA Ophthalmol. 2020, 138, 740–747.
- Romano, F.; Ding, X.; Yuan, M.; Vingopoulos, F.; Garg, I.; Choi, H.; Alvarez, R.; Tracy, J.H.; Finn, M.; Razavi, P.; et al. Progressive choriocapillaris changes on optical coherence tomography angiography correlate with stage progression in AMD. Ophthalmol. Retin. 2024, 8, 654–663.
- Asani, B.; Holmberg, O.; Schiefelbein, J.B.; Hafner, M.; Herold, T.; Spitzer, H.; Siedlecki, J.; Kern, C.; Kortuem, K.U.; Frishberg, A.; et al. Evaluation of OCT biomarker changes in treatment-naive neovascular AMD using a deep semantic segmentation algorithm. Sci. Rep. 2024, 14, 8140.
- Tenbrock, L.; Wolf, J.; Boneva, S.; Schlecht, A.; Agostini, H.; Wieghofer, P.; Schlunck, G.; Lange, C. Subretinal fibrosis in neovascular age-related macular degeneration: Current concepts, therapeutic avenues, and future perspectives. Cell Tissue Res. 2022, 387, 361–375.
- Bird, A.C.; Phillips, R.L.; Hageman, G.S. Geographic atrophy: A histopathological assessment. JAMA Ophthalmol. 2014, 132, 338–345.
- Fang, V.; Gomez-Caraballo, M.; Lad, E.M. Biomarkers for nonexudative age-related macular degeneration and relevance for clinical trials: A systematic review. Mol. Diagn. Ther. 2021, 25, 691–713.
- Flores, R.; Carneiro, Â.; Tenreiro, S.; Seabra, M.C. Retinal progression biomarkers of early and intermediate age-related macular degeneration. Life 2021, 12, 36.
- Gill, K.; Yoo, H.; Chakravarthy, H.; Granville, D.J.; Matsubara, J.A. Exploring the role of granzyme B in subretinal fibrosis of age-related macular degeneration. Investig. Ophthalmol. Vis. Sci. 2024, 65, 12.
- Latifi-Navid, H.; Barzegar Behrooz, A.; Jamehdor, S.; Davari, M.; Latifinavid, M.; Zolfaghari, N.; Piroozmand, S.; Taghizadeh, S.; Bourbour, M.; Shemshaki, G.; et al. Construction of an exudative age-related macular degeneration diagnostic and therapeutic molecular network using multi-layer network analysis, a fuzzy logic model, and deep learning techniques: Are retinal and brain neurodegenerative disorders related? Pharmaceuticals 2023, 16, 1555.
- Saha, S.; Nassisi, M.; Wang, M.; Lindenberg, S.; Kanagasingam, Y.; Sadda, S.; Hu, Z.J. Automated detection and classification of early AMD biomarkers using deep learning. Sci. Rep. 2019, 9, 10990.
- Sharma, A.; Parachuri, N.; Kumar, N.; Bandello, F.; Kuppermann, B.D.; Loewenstein, A.; Regillo, C.; Chakravarthy, U. Fluid-based prognostication in n-AMD: Type 3 macular neovascularisation needs an analysis in isolation. Eye 2021, 35, 2370–2379.
- Vinković, M.; Kopić, A.; Benašić, T. Anti-VEGF treatment and optical coherence tomography biomarkers in wet age-related macular degeneration. Acta Clin. Croat. 2022, 61, 285–292.
- Miladinović, A.; Biscontin, A.; Ajčević, M.S.; Kresevic, S.; Accardo, A.; Marangoni, D.; Tognetto, D.; Inferrera, L. Evaluating deep learning models for classifying OCT images with limited data and noisy labels. Sci. Rep. 2024, 14, 30321.
- Wu, Z.; Zhuo, R.; Liu, X.; Wu, B.; Wang, J. Enhancing surgical decision-making in NEC with ResNet18: A deep learning approach to predict the need for surgery through X-ray image analysis. Front. Pediatr. 2024, 12, 1405780.
- Ayyachamy, S.; Alex, V.; Khened, M.; Krishnamurthi, G. Medical image retrieval using Resnet-18 for clinical diagnosis. In Proceedings of the Medical Imaging 2019: Imaging Informatics for Healthcare, Research, and Applications, San Diego, CA, USA, 16–21 February 2019; Chen, P.-H., Bak, P.R., Eds.; SPIE: Bellingham, WA, USA, 2019; Volume 10954, p. 1095410.
- Rahman Siddiquee, M.M.; Shah, J.; Chong, C.; Nikolova, S.; Dumkrieger, G.; Li, B.; Wu, T.; Schwedt, T.J. Headache classification and automatic biomarker extraction from structural MRIs using deep learning. Brain Commun. 2022, 5, fcac311.
- Fawcett, T. An introduction to ROC analysis. Pattern Recognit. Lett. 2006, 27, 861–874.
- Cui, Y.; Jia, M.; Lin, T.; Song, Y.; Belongie, S. Class-balanced loss based on effective number of samples. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 9268–9277.
- Lin, T.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal loss for dense object detection. In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 2980–2988.
- Bradshaw, T.J.; Huemann, Z.; Hu, J.; Rahmim, A. A guide to cross-validation for artificial intelligence in medical imaging. Radiol. Artif. Intell. 2023, 5, e220232.
- Guan, H.; Liu, M. Domain adaptation for medical image analysis: A survey. IEEE Trans. Biomed. Eng. 2022, 69, 1173–1185.
- Collins, G.S.; Moons, K.G.M.; Dhiman, P.; Riley, R.D.; Beam, A.L.; Van Calster, B.; Ghassemi, M.; Liu, X.; Reitsma, J.B.; van Smeden, M.; et al. TRIPOD+AI statement: Updated guidance for reporting clinical prediction models that use regression or machine learning methods. BMJ 2024, 385, e078378.
- Yu, A.C.; Mohajer, B.; Eng, J. External validation of deep learning algorithms for radiologic diagnosis: A systematic review. Radiol. Artif. Intell. 2022, 4, e210064.
- Imran, H.M.; Asad, M.A.A.; Abdullah, T.A.; Chowdhury, S.I.; Alamin, M. Few shot learning for medical imaging: A review of categorized images. IEEE Access 2023, 11, 75055–75090.
- Malhotra, A. Single-shot image recognition using siamese neural networks. In Proceedings of the 2023 3rd International Conference on Advance Computing and Innovative Technologies in Engineering (ICACITE), Greater Noida, India, 12–13 May 2023; pp. 2156–2160.
- Xian, Y.; Lampert, C.; Schiele, B.; Akata, Z. Zero-shot learning—A comprehensive evaluation of the good, the bad and the ugly. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 41, 2251–2265.
- Xu, C.; Zheng, H.; Liu, K.; Chen, Y.; Ye, C.; Niu, C.; Jin, S.; Li, Y.; Gao, H.; Hu, J.; et al. Deep learning for retina structural biomarker classification using OCT images. Biomed. Opt. Express 2024, 15, 2190–2205.
- Lu, C.; Wang, X.; Yang, A.; Liu, Y.; Dong, Z. A few-shot-based model-agnostic meta-learning for intrusion detection in security of Internet of Things. Comput. Netw. 2023, 228, 109724.
- Wang, H.; Tong, X.; Wang, P.; Xu, Z.; Song, L. Few-shot transfer learning method based on meta-learning and graph convolution network for machinery fault diagnosis. IEEE Trans. Ind. Inform. 2023, 19, 6073–6083.
- Zhao, Z.; Ding, H.; Cai, D.; Yan, Y. Gated multi-scale attention transformer for few-shot medical image segmentation. Med. Image Anal. 2024, 93, 103084.
- Tang, S.; Yan, S.; Qi, X.; Gao, J.; Ye, M.; Zhang, J.; Zhu, X. Few-shot medical image segmentation with high-fidelity prototypes. Med. Image Anal. 2025, 89, 102897.
- Wang, J.; Wang, T.; Xu, J.; Zhang, Z.; Wang, H.; Li, H. Zero-shot diagnosis of unseen pulmonary diseases via spatial domain adaptive correction and guidance by ChatGPT-4o. Med. Image Anal. 2024, 95, 103189.
- Flanagan, A.R.; Glavin, F.G. A systematic review of multi-class and one-vs-rest classification techniques for near-infrared spectra of crop cultivars. Comput. Electron. Agric. 2023, 210, 107900.
- Hong, J.; Cho, S. A probabilistic multi-class strategy of one-vs.-rest support vector machines for cancer classification. Neurocomputing 2008, 71, 3275–3281.
- Jang, J.; Kim, C. One-vs-rest network-based deep probability model for open set recognition. IEEE Access 2020, 8, 113493–113506.
- Youden, W.J. Index for rating diagnostic tests. Cancer 1950, 3, 32–35.
- Shreffler, J.; Huecker, M.R. Diagnostic testing accuracy: Sensitivity, specificity, predictive values and likelihood ratios. In StatPearls; StatPearls Publishing: Tampa, FL, USA, 2025.
- Nebro, A.J.; Galeano-Brajones, J.; Luna, F.; Coello Coello, C.A. Is NSGA-II ready for large-scale multi-objective optimization? Math. Comput. Appl. 2022, 27, 103.
- Glascoe, F.P. Screening for developmental and behavioral problems. Ment. Retard. Dev. Disabil. Res. Rev. 2005, 11, 173–179.
- Vanderheyden, A.M. Technical adequacy of response to intervention decisions. Except. Child. 2011, 77, 335–350.
- McDowall, L.M.; Dampney, R.A.L. Calculation of threshold and saturation points of sigmoidal baroreflex function curves. Am. J. Physiol.-Heart Circ. Physiol. 2006, 291, H2003–H2007.
- Leão, W. Attended temperature scaling: A practical approach for calibrating deep neural networks. arXiv 2018, arXiv:1810.11586.
- Mukhoti, J.; Kulharia, V.; Sanyal, A.; Golodetz, S.; Torr, P.H.S.; Dokania, P.K. Calibrating deep neural networks using focal loss. arXiv 2020, arXiv:2002.09437.
- Dabah, L.; Tirer, T. On temperature scaling and conformal prediction of deep classifiers. Neural Netw. 2025, 163, 1–15.
- Mamalakis, M.; de Vareilles, H.E.I.; Murray, G.; Lio, P.; Suckling, J. The explanation necessity for healthcare AI. Nat. Mach. Intell. 2024, 6, 410–420.
- Metta, C.; Beretta, A.; Pellungrini, R.; Rinzivillo, S.; Giannotti, F. Towards transparent healthcare: Advancing local explanation methods in explainable artificial intelligence. Bioengineering 2024, 11, 369.
Model | Accuracy (%) | Macro-AUROC | ECE (%) | Brier |
---|---|---|---|---|
Strong end-to-end ResNet-18 (CB + focal + T) | 80.6 ± 1.6 | 0.921 ± 0.013 | 5.3 ± 0.6 | 0.126 ± 0.005 |
Strong end-to-end ResNet-34 (CB + focal + T) | 86.8 ± 1.5 | 0.934 ± 0.017 | 4.6 ± 0.6 | 0.118 ± 0.004 |
Strong end-to-end ConvNeXt-Tiny (CB + focal + T) | 88.4 ± 1.2 | 0.937 ± 0.007 | 4.3 ± 0.5 | 0.115 ± 0.004 |
Strong end-to-end DeiT-Tiny (CB + focal + T) | 87.4 ± 1.4 | 0.939 ± 0.012 | 4.4 ± 0.5 | 0.116 ± 0.004 |
Two-stage (IB + fuzzy, ) | 90.4 ± 1.9 | 0.962 ± 0.016 | 2.1 ± 0.4 | 0.082 ± 0.003 |
Failure Mode | Observed Pattern | Likely Driver(s) | Mitigation in Rules/IB Set |
---|---|---|---|
Early (S) vs. Intermediate (P) | Borderline drusen size; td high but md near threshold; occasional gv biases toward later staging | High td with moderate md; weak/absent sd; calibrated > but | Slightly raise td midpoint for S (e.g., m: ); require md or sd contribution for P; add a mild negative term for td in P when sr/ga are absent, preserving clinical semantics of “Present” at the midpoint |
Atrophic (SI) vs. Fibrosis (VI) | Confluent GA with hyperreflective inclusions versus fibrotic scarring; gv common; residual low fluids | Moderate ga with low/borderline sr; gv present; – induces ambiguity | Increase defining-feature weight for sr in VI (: ) and strengthen negative membership for sr in SI; require ga > 0.5 for SI when sr < 0.5 to reflect defining status of ga in SI |
Neovascular (V) vs. Fibrosis (VI) | Exudation with early fibrovascular change; fopes/irzh present but incipient sr | > and > with | Boost fopes (and, if needed, irzh) weights for V and apply a small negative term for sr in V; add a tie-breaker: if then prefer VI even with fopes present, consistent with LOBO sensitivities |
Normal (N) vs. Early (S) | False-positive td from subtle undulations/artefacts without md/gv corroboration | just over 0.5 without corroborating IBs | Raise td midpoint for S to 0.52; require corroboration by md or multi-B-scan consistency of td; rely on temperature scaling to suppress isolated over-confidence |
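Two of the mitigations in the table above are explicit enough to write as rule gates. The sketch below encodes the SI requirement (ga above 0.5 when sr is below 0.5) and the raised td midpoint of 0.52 with md corroboration. The md cut-off of 0.5 and the function names are assumptions. The case sr ≥ 0.5 is left undecided, because the table does not specify it.

```python
def si_gate(ga, sr):
    """SI mitigation: when sr is below 0.5, SI requires ga above 0.5.

    Returns None when sr >= 0.5, a case the table's rule does not cover.
    """
    if sr >= 0.5:
        return None
    return ga > 0.5


def early_gate(td, md, td_midpoint=0.52, md_cut=0.5):
    """N-versus-S mitigation: td must exceed the raised midpoint and be
    corroborated by md. The md cut-off is a hypothetical value."""
    return td > td_midpoint and md > md_cut


# Hypothetical calibrated confidences.
print(si_gate(ga=0.7, sr=0.2))      # ga defining, little fibrosis evidence
print(si_gate(ga=0.4, sr=0.2))      # ga too weak for SI
print(early_gate(td=0.55, md=0.1))  # isolated td without corroboration
print(early_gate(td=0.55, md=0.6))  # td corroborated by md
```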
Model | T | Accuracy (%) | Macro-AUROC | ECE (%) | Brier |
---|---|---|---|---|---|
End-to-end ResNet-18 (CB + focal) | 0.7 | 76.1 ± 3.5 | 0.898 ± 0.030 | 11.7 ± 1.5 | 0.201 ± 0.011 |
End-to-end ResNet-18 (CB + focal) | 1.0 | 77.5 ± 2.4 | 0.911 ± 0.019 | 9.8 ± 1.1 | 0.194 ± 0.009 |
End-to-end ResNet-18 (CB + focal) | 1.3 | 78.6 ± 3.0 | 0.921 ± 0.025 | 6.9 ± 1.0 | 0.185 ± 0.013 |
End-to-end ResNet-18 (CB + focal) | 1.6 | 76.0 ± 3.1 | 0.905 ± 0.033 | 8.5 ± 1.6 | 0.193 ± 0.015 |
End-to-end ResNet-18 (CB + focal) | 2.0 | 74.9 ± 4.1 | 0.889 ± 0.035 | 10.1 ± 1.4 | 0.199 ± 0.014 |
Two-stage (IB encoder + fuzzy) | 0.7 | 86.1 ± 4.0 | 0.938 ± 0.038 | 4.5 ± 1.2 | 0.131 ± 0.007 |
Two-stage (IB encoder + fuzzy) | 1.0 | 86.5 ± 3.5 | 0.945 ± 0.030 | 3.9 ± 0.9 | 0.128 ± 0.008 |
Two-stage (IB encoder + fuzzy) | 1.3 | 90.4 ± 1.9 | 0.951 ± 0.029 | 2.9 ± 0.7 | 0.121 ± 0.006 |
Two-stage (IB encoder + fuzzy) | 1.6 | 86.8 ± 3.9 | 0.947 ± 0.031 | 3.6 ± 0.9 | 0.125 ± 0.007 |
Two-stage (IB encoder + fuzzy) | 2.0 | 86.3 ± 4.2 | 0.935 ± 0.036 | 4.1 ± 1.1 | 0.129 ± 0.008 |
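The temperature sweep above varies a single scalar T that divides the logits before the softmax, and reports the expected calibration error (ECE) for each setting. A minimal standard-library sketch of both operations follows. The logits, confidences, and the choice of 10 equal-width bins are hypothetical.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Softmax of logits / T; T > 1 softens and T < 1 sharpens the distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: bin-size-weighted mean gap between confidence and accuracy."""
    n = len(confidences)
    ece = 0.0
    for b in range(n_bins):
        low, high = b / n_bins, (b + 1) / n_bins
        members = [
            i for i, c in enumerate(confidences)
            if (low < c <= high) or (b == 0 and c == 0.0)
        ]
        if not members:
            continue
        mean_conf = sum(confidences[i] for i in members) / len(members)
        accuracy = sum(correct[i] for i in members) / len(members)
        ece += len(members) / n * abs(mean_conf - accuracy)
    return ece


# Hypothetical logits for one image and hypothetical validation results.
p_raw = softmax_with_temperature([2.0, 1.0, 0.0], temperature=1.0)
p_soft = softmax_with_temperature([2.0, 1.0, 0.0], temperature=1.3)
ece = expected_calibration_error([0.9, 0.9, 0.6, 0.6], [1, 1, 1, 0])
print(p_raw, p_soft, ece)
```

In practice T is chosen on a held-out validation split, for example by a grid search over values like those in the table.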
Direction | Model | Accuracy (%) | Macro-AUROC | ECE (%) | Brier |
---|---|---|---|---|---|
Train Avanti XR (n = 1180) → Test REVO NX (n = 748) | Strong end-to-end ResNet-18 (CB + focal + T) | 71.0 ± 2.7 | 0.882 ± 0.021 | 8.4 ± 1.1 | 0.176 ± 0.010 |
Train Avanti XR (n = 1180) → Test REVO NX (n = 748) | Strong end-to-end ResNet-34 (CB + focal + T) | 76.9 ± 2.2 | 0.906 ± 0.017 | 6.3 ± 0.9 | 0.159 ± 0.008 |
Train Avanti XR (n = 1180) → Test REVO NX (n = 748) | Strong end-to-end ConvNeXt-Tiny (CB + focal + T) | 78.8 ± 1.9 | 0.913 ± 0.014 | 5.8 ± 0.7 | 0.152 ± 0.007 |
Train Avanti XR (n = 1180) → Test REVO NX (n = 748) | Two-stage (IB + fuzzy, ) | 86.1 ± 2.3 | 0.946 ± 0.018 | 2.9 ± 0.6 | 0.124 ± 0.006 |
Train REVO NX (n = 748) → Test Avanti XR (n = 1180) | Strong end-to-end ResNet-18 (CB + focal + T) | 70.3 ± 2.9 | 0.878 ± 0.022 | 8.7 ± 1.2 | 0.179 ± 0.011 |
Train REVO NX (n = 748) → Test Avanti XR (n = 1180) | Strong end-to-end ResNet-34 (CB + focal + T) | 75.9 ± 2.4 | 0.900 ± 0.019 | 6.5 ± 0.8 | 0.161 ± 0.009 |
Train REVO NX (n = 748) → Test Avanti XR (n = 1180) | Strong end-to-end ConvNeXt-Tiny (CB + focal + T) | 77.5 ± 2.0 | 0.908 ± 0.015 | 6.1 ± 0.8 | 0.156 ± 0.008 |
Train REVO NX (n = 748) → Test Avanti XR (n = 1180) | Two-stage (IB + fuzzy, ) | 84.7 ± 2.6 | 0.939 ± 0.020 | 3.1 ± 0.7 | 0.129 ± 0.007 |
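The cross-scanner results can be read as an accuracy drop relative to the baseline comparison table reported earlier. The sketch below computes that drop in percentage points from the reported mean accuracies. It assumes the two tables are directly comparable, which may not hold if the splits differ, and it ignores the reported standard deviations.

```python
# Mean accuracies (%) copied from the baseline comparison and cross-scanner tables.
IN_DOMAIN = {"ResNet-18": 80.6, "ResNet-34": 86.8, "ConvNeXt-Tiny": 88.4, "Two-stage": 90.4}
AVANTI_TO_REVO = {"ResNet-18": 71.0, "ResNet-34": 76.9, "ConvNeXt-Tiny": 78.8, "Two-stage": 86.1}
REVO_TO_AVANTI = {"ResNet-18": 70.3, "ResNet-34": 75.9, "ConvNeXt-Tiny": 77.5, "Two-stage": 84.7}


def accuracy_drop(in_domain, external):
    """Drop in percentage points when moving to the other scanner."""
    return {model: round(in_domain[model] - external[model], 1) for model in external}


drop_a2r = accuracy_drop(IN_DOMAIN, AVANTI_TO_REVO)
drop_r2a = accuracy_drop(IN_DOMAIN, REVO_TO_AVANTI)

for model in IN_DOMAIN:
    print(f"{model:<14} Avanti→REVO: -{drop_a2r[model]:.1f} pp   "
          f"REVO→Avanti: -{drop_r2a[model]:.1f} pp")
```

Under this reading, the end-to-end baselines lose roughly 10 percentage points in either direction, while the two-stage system loses 4.3 and 5.7.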
Component | Params (M) | GFLOPs | VRAM (GB) | Latency (GPU) | Latency (CPU) |
---|---|---|---|---|---|
IB encoder (ResNet-18) | 11.7 | ∼2.0 | 0.70 | 2.4 ms | 25.0 ms |
Temperature scaling | ≈0 | < 0.001 | < 0.01 | 0.01 ms | 0.03 ms |
Fuzzy solver | 0 | < 0.001 | < 0.01 | 0.02 ms | 0.05 ms |
Total | 11.7 | ∼2.0 | ∼0.70 | 2.43 ms | 25.08 ms |
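The totals in the table above follow from adding the per-component latencies. The sketch below repeats that sum and converts it to an approximate throughput, assuming the three components run sequentially on one image at a time with no batching or overlap (an assumption, since the table reports only latencies).

```python
# Per-component latencies in milliseconds, copied from the table above.
GPU_MS = {"IB encoder": 2.4, "Temperature scaling": 0.01, "Fuzzy solver": 0.02}
CPU_MS = {"IB encoder": 25.0, "Temperature scaling": 0.03, "Fuzzy solver": 0.05}


def total_ms(components):
    """End-to-end latency as the sum of sequential component latencies."""
    return round(sum(components.values()), 2)


def images_per_second(latency_ms):
    """Throughput for one-at-a-time processing."""
    return 1000.0 / latency_ms


gpu_total = total_ms(GPU_MS)
cpu_total = total_ms(CPU_MS)
# Share of the GPU latency spent in the encoder.
encoder_share = GPU_MS["IB encoder"] / gpu_total

print(f"GPU: {gpu_total} ms, about {images_per_second(gpu_total):.0f} images/s")
print(f"CPU: {cpu_total} ms, about {images_per_second(cpu_total):.0f} images/s")
print(f"encoder share of GPU latency: {encoder_share:.1%}")
```

The encoder accounts for nearly all of the latency, so calibration and the fuzzy solver add almost no overhead to the forward pass.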
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Lopukhova, E.A.; Yusupov, E.S.; Ibragimova, R.R.; Idrisova, G.M.; Mukhamadeev, T.R.; Grakhova, E.P.; Kutluyarov, R.V. An Interpretable Clinical Decision Support System Aims to Stage Age-Related Macular Degeneration Using Deep Learning and Imaging Biomarkers. Appl. Sci. 2025, 15, 10197. https://doi.org/10.3390/app151810197