Using Shapley Values to Explain the Decisions of Convolutional Neural Networks in Glaucoma Diagnosis
Abstract
1. Introduction
2. Related Work
- It is approximate: built on DeepLIFT [15], it estimates rather than computes exact Shapley values, which can introduce inaccuracies, especially in complex models.
- Feature independence is assumed: it does not handle feature correlations well, which may lead to misleading explanations.
- Sensitivity to background data: the quality of the explanations hinges heavily on the choice of representative background samples (see the sketch after this list).
- Model limitations: it only works with differentiable models that support backpropagation, making it unsuitable for many non-standard architectures.
- Harder to generalize: local explanations do not always aggregate neatly into global insights, which complicates interpretation.
- Computational cost: while faster than some alternatives, it can still be taxing on large-scale models.
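To make the background-data dependence concrete, the following is a minimal sketch of typical DeepSHAP usage via the shap library's DeepExplainer (the DeepLIFT-based approximation discussed above). The toy model and random arrays are illustrative stand-ins, not the paper's models or data, and exact behavior depends on the installed shap/TensorFlow versions:

```python
import numpy as np
import shap
import tensorflow as tf

# Toy stand-in model and data; in practice these would be a trained CNN
# and real fundus images.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(4, 3, activation="relu", input_shape=(32, 32, 3)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
background_images = np.random.rand(20, 32, 32, 3).astype("float32")
test_images = np.random.rand(2, 32, 32, 3).astype("float32")

# DeepExplainer builds on DeepLIFT; its attributions depend on the chosen
# background samples, which is exactly the sensitivity noted above.
explainer = shap.DeepExplainer(model, background_images)
shap_values = explainer.shap_values(test_images)
```

Re-running the same explainer with a differently chosen background set can yield noticeably different attributions for the same test images.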
3. Materials and Methods
3.1. Datasets
3.2. Deep Learning Models
3.3. Shapley Values Implementation
- For each input image I, generate all possible combinations of occluded sectors for I, taking into account whether the image belongs to a left or right eye to properly locate the sectors.
- For each sector j of I and for each possible coalition S of sectors not containing j, calculate the model's probability for the glaucoma class using as input I_{S∪{j}} and I_S, the images generated from I with that coalition of sectors present in the image, with and without sector j, respectively.
- With the above probabilities, calculate the Shapley value for sector j following Equation (2) (see the sketch after this list).
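Equation (2) is the standard Shapley formula, φ_j = Σ_{S ⊆ N∖{j}} [|S|! (|N| − |S| − 1)! / |N|!] · (v(S ∪ {j}) − v(S)), where N is the set of sectors and v(S) denotes the model's glaucoma-class probability for the image in which exactly the sectors in S are visible. A minimal Python sketch of this exact computation, assuming a caller-supplied characteristic function v (the helper name is illustrative):

```python
from itertools import combinations
from math import factorial

def shapley_value(j, sectors, v):
    """Exact Shapley value of sector j, following Equation (2).

    sectors: all sector indices (the 'players' of the cooperative game).
    v: characteristic function; v(S) returns the model's glaucoma-class
       probability for the image with exactly the sectors in S visible.
       Caching v is advisable, since each coalition is reused across sectors.
    """
    n = len(sectors)
    others = [s for s in sectors if s != j]
    phi = 0.0
    for size in range(len(others) + 1):
        # Combinatorial weight of coalitions of this size.
        weight = factorial(size) * factorial(n - size - 1) / factorial(n)
        for S in combinations(others, size):
            coalition = frozenset(S)
            # Marginal contribution of sector j to this coalition.
            phi += weight * (v(coalition | {j}) - v(coalition))
    return phi
```

Since the number of coalitions grows as 2^|N|, exact enumeration is only tractable for a small number of image sectors, which is why the procedure above occludes sectors rather than individual pixels.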
- Re-train only the last layer, i.e., the classification layer of the original models, keeping the convolutional base frozen and using as training data the different possible combinations of sectors. This results in a model adapted to each combination without modifying the original feature extraction.
- Re-train all layers of the original models, from start to finish, using the same data as in the previous scenario, resulting in new models adapted to each combination (see the sketch after this list).
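A minimal sketch of how these two scenarios could be set up in TensorFlow/Keras, the framework used in Section 3.2. The optimizer, loss, and learning rate here are illustrative assumptions, not the paper's exact training configuration:

```python
import tensorflow as tf

def prepare_retrain_model(original_model, retrain_all=False, lr=1e-4):
    """Copy an original model and choose which layers to re-train.

    retrain_all=False: freeze every layer except the final classification
    layer, preserving the original feature extraction (first scenario).
    retrain_all=True: keep all layers trainable (second scenario).
    """
    model = tf.keras.models.clone_model(original_model)
    model.set_weights(original_model.get_weights())
    if not retrain_all:
        for layer in model.layers[:-1]:
            layer.trainable = False
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=lr),
        loss="binary_crossentropy",  # assumed two-class (healthy/glaucoma) setup
        metrics=["accuracy"],
    )
    return model
```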
- Given a set of images with a combination O of occluded sectors, the training and test sets were divided using exactly the same division as in the corresponding original set. That is, if an image was originally in the training set, the same image with the combination O of occluded sectors remained in the training set.
- Similarly, the training set was subdivided into the same 5 training and validation partitions as the respective unoccluded set.
- For each of the 5 partitions we built a new model, initializing it with the weights of the original model corresponding to that partition, and we re-trained it following the methodology described in Section 3.2, but only for 50 epochs, as we empirically observed that this was sufficient.
- Finally, we selected the models for the epoch that achieved the highest average validation accuracy among the 5 partitions. This results in 5 models per network architecture for each combination O of occluded sectors (a sketch of this epoch-selection step follows this list).
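A sketch of the final selection step, assuming per-epoch validation accuracies have been recorded for each of the 5 partitions (array names are illustrative):

```python
import numpy as np

def select_best_epoch(val_acc: np.ndarray) -> int:
    """val_acc has shape (5, n_epochs): validation accuracy per epoch for
    each of the 5 partitions. Returns the epoch whose mean accuracy across
    partitions is highest; the 5 models are then taken at that epoch."""
    return int(np.argmax(val_acc.mean(axis=0)))

# Example: histories from 5 re-training runs of 50 epochs each.
# best = select_best_epoch(np.stack(histories))  # histories: list of 5 arrays
```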
3.4. Evaluation Methodology
- Agreement between Shapley values for re-trained models and randomly initialized models.
- Agreement between models with the same architecture.
- Agreement between models with different architectures (a sketch of the agreement measure follows this list).
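Sections 4.3 and 4.4 quantify these agreements as correlations between per-sector Shapley values; consistent with the cited Zar reference on the Spearman rank correlation coefficient, a minimal sketch assuming Spearman's ρ as the agreement measure:

```python
from scipy.stats import spearmanr

def shapley_agreement(phi_model_a, phi_model_b):
    """Spearman rank correlation between two models' per-sector Shapley
    values; rho near 1 means the two models rank the sectors' importance
    almost identically."""
    rho, p_value = spearmanr(phi_model_a, phi_model_b)
    return rho, p_value
```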
4. Experimental Results and Discussion
4.1. Performance Evaluation of the Deep Learning Models
4.2. Randomization of Model Weights
4.3. Correlation Between Shapley Values for Models with the Same Architecture
4.4. Correlation Between Shapley Values for Models with Different Architectures
4.5. Relevance of Sectors According to Shapley Values
4.6. Comparison Between Shapley Values for Different Datasets
5. Conclusions
Author Contributions
Funding
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Molnar, C. Interpretable Machine Learning, 2nd ed.; Lulu Press, Inc.: Morrisville, NC, USA, 2022.
- Hemelings, R.; Elen, B.; Barbosa-Breda, J.; Blaschko, M.B.; De Boever, P.; Stalmans, I. Deep learning on fundus images detects glaucoma beyond the optic disc. Sci. Rep. 2021, 11, 20313.
- Loh, H.W.; Ooi, C.P.; Seoni, S.; Barua, P.D.; Molinari, F.; Acharya, U.R. Application of explainable artificial intelligence for healthcare: A systematic review of the last decade (2011–2022). Comput. Methods Programs Biomed. 2022, 226, 107161.
- Shapley, L.S. A Value for N-Person Games; RAND Corporation: Santa Monica, CA, USA, 1952.
- Aas, K.; Jullum, M.; Løland, A. Explaining individual predictions when features are dependent: More accurate approximations to Shapley values. Artif. Intell. 2021, 298, 103502.
- Sigut, J.; Fumero, F.; Estévez, J.; Alayón, S.; Díaz-Alemán, T. In-Depth Evaluation of Saliency Maps for Interpreting Convolutional Neural Network Decisions in the Diagnosis of Glaucoma Based on Fundus Imaging. Sensors 2024, 24, 239.
- Roth, A.E. The Shapley Value: Essays in Honor of Lloyd S. Shapley; Cambridge University Press: Cambridge, UK, 1988.
- Štrumbelj, E.; Kononenko, I. Explaining prediction models and individual predictions with feature contributions. Knowl. Inf. Syst. 2014, 41, 647–665.
- Lundberg, S.M.; Lee, S.I. A Unified Approach to Interpreting Model Predictions. In Advances in Neural Information Processing Systems; Guyon, I., Luxburg, U.V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., Garnett, R., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2017; Volume 30.
- Chen, H.; Lundberg, S.; Lee, S.I. Explaining Models by Propagating Shapley Values of Local Components. arXiv 2019, arXiv:1911.11888.
- Tham, Y.C.; Li, X.; Wong, T.Y.; Quigley, H.A.; Aung, T.; Cheng, C.Y. Global Prevalence of Glaucoma and Projections of Glaucoma Burden through 2040: A Systematic Review and Meta-Analysis. Ophthalmology 2014, 121, 2081–2090.
- European Glaucoma Society. European Glaucoma Society Terminology and Guidelines for Glaucoma, 5th Edition. Br. J. Ophthalmol. 2021, 105, 1–169.
- Singh, A.; Jothi Balaji, J.; Rasheed, M.A.; Jayakumar, V.; Raman, R.; Lakshminarayanan, V. Evaluation of explainable deep learning methods for ophthalmic diagnosis. Clin. Ophthalmol. 2021, 15, 2573–2581.
- Shorfuzzaman, M.; Hossain, M.S.; El Saddik, A. An Explainable Deep Learning Ensemble Model for Robust Diagnosis of Diabetic Retinopathy Grading. ACM Trans. Multimed. Comput. Commun. Appl. 2021, 17, 113:1–113:24.
- Shrikumar, A.; Greenside, P.; Kundaje, A. Learning important features through propagating activation differences. In Proceedings of the 34th International Conference on Machine Learning (ICML'17), Sydney, Australia, 6–11 August 2017; pp. 3145–3153.
- Mehta, P.; Petersen, C.A.; Wen, J.C.; Banitt, M.R.; Chen, P.P.; Bojikian, K.D.; Egan, C.; Lee, S.I.; Balazinska, M.; Lee, A.Y.; et al. Automated Detection of Glaucoma with Interpretable Machine Learning Using Clinical Data and Multimodal Retinal Images. Am. J. Ophthalmol. 2021, 231, 154–169.
- Hasan, M.M.; Phu, J.; Wang, H.; Sowmya, A.; Kalloniatis, M.; Meijering, E. OCT-based diagnosis of glaucoma and glaucoma stages using explainable machine learning. Sci. Rep. 2025, 15, 3592.
- Oh, S.; Park, Y.; Cho, K.J.; Kim, S.J. Explainable Machine Learning Model for Glaucoma Diagnosis and Its Interpretation. Diagnostics 2021, 11, 510.
- Tao, S.; Ravindranath, R.; Wang, S.Y. Predicting Glaucoma Progression to Surgery with Artificial Intelligence Survival Models. Ophthalmol. Sci. 2023, 3, 100336.
- Ravindranath, R.; Naor, J.; Wang, S.Y. Artificial Intelligence Models to Identify Patients at High Risk for Glaucoma Using Self-reported Health Data in a United States National Cohort. Ophthalmol. Sci. 2025, 5, 100685.
- Christopher, M.; Gonzalez, R.; Huynh, J.; Walker, E.; Radha Saseendrakumar, B.; Bowd, C.; Belghith, A.; Goldbaum, M.H.; Fazio, M.A.; Girkin, C.A.; et al. Proactive Decision Support for Glaucoma Treatment: Predicting Surgical Interventions with Clinically Available Data. Bioengineering 2024, 11, 140.
- Wang, S.Y.; Ravindranath, R.; Stein, J.D.; Amin, S.; Edwards, P.A.; Srikumaran, D.; Woreta, F.; Schultz, J.S.; Shrivastava, A.; Ahmad, B.; et al. Prediction Models for Glaucoma in a Multicenter Electronic Health Records Consortium: The Sight Outcomes Research Collaborative. Ophthalmol. Sci. 2024, 4, 100445.
- Wang, R.; Bradley, C.; Herbert, P.; Hou, K.; Ramulu, P.; Breininger, K.; Unberath, M.; Yohannan, J. Deep learning-based identification of eyes at risk for glaucoma surgery. Sci. Rep. 2024, 14, 599.
- Yoon, J.S.; Kim, Y.E.; Lee, E.J.; Kim, H.; Kim, T.W. Systemic factors associated with 10-year glaucoma progression in South Korean population: A single center study based on electronic medical records. Sci. Rep. 2023, 13, 530.
- Fumero, F.; Diaz-Aleman, T.; Sigut, J.; Alayon, S.; Arnay, R.; Angel-Pereira, D. RIM-ONE DL: A Unified Retinal Image Database for Assessing Glaucoma Using Deep Learning. Image Anal. Stereol. 2020, 39, 161–167.
- Fumero, F.; Alayon, S.; Sanchez, J.L.; Sigut, J.; Gonzalez-Hernandez, M. RIM-ONE: An open retinal image database for optic nerve evaluation. In Proceedings of the 2011 24th International Symposium on Computer-Based Medical Systems (CBMS), Bristol, UK, 27–30 June 2011; pp. 1–6.
- Orlando, J.I.; Fu, H.; Barbosa Breda, J.; van Keer, K.; Bathula, D.R.; Diaz-Pinto, A.; Fang, R.; Heng, P.A.; Kim, J.; Lee, J.; et al. REFUGE Challenge: A unified framework for evaluating automated methods for glaucoma assessment from fundus photographs. Med. Image Anal. 2020, 59, 101570.
- Sivaswamy, J.; Chakravarty, A.; Joshi, G.D.; Ujjwal; Syed, T.A. A Comprehensive Retinal Image Dataset for the Assessment of Glaucoma from the Optic Nerve Head Analysis. JSM Biomed. Imaging Data Pap. 2015, 2, 1004.
- Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv 2015, arXiv:1409.1556.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
- Szegedy, C.; Vanhoucke, V.; Ioffe, S.; Shlens, J.; Wojna, Z. Rethinking the Inception Architecture for Computer Vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 2818–2826.
- Chollet, F. Xception: Deep Learning with Depthwise Separable Convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 1800–1807.
- Abadi, M.; Agarwal, A.; Barham, P.; Brevdo, E.; Chen, Z.; Citro, C.; Corrado, G.S.; Davis, A.; Dean, J.; Devin, M.; et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems. 2015. Available online: www.tensorflow.org (accessed on 16 May 2025).
- Chalkiadakis, G.; Elkind, E.; Wooldridge, M. Basic Concepts. In Computational Aspects of Cooperative Game Theory; Chalkiadakis, G., Elkind, E., Wooldridge, M., Eds.; Synthesis Lectures on Artificial Intelligence and Machine Learning; Springer International Publishing: Cham, Switzerland, 2012; pp. 11–35.
- Campbell, T.W.; Roder, H.; Georgantas, R.W., III; Roder, J. Exact Shapley values for local and model-true explanations of decision tree ensembles. Mach. Learn. Appl. 2022, 9, 100345.
- Campbell, T.W.; Wilson, M.P.; Roder, H.; MaWhinney, S.; Georgantas, R.W.; Maguire, L.K.; Roder, J.; Erlandson, K.M. Predicting prognosis in COVID-19 patients using machine learning and readily available clinical data. Int. J. Med. Inform. 2021, 155, 104594.
- Štrumbelj, E.; Kononenko, I.; Robnik Šikonja, M. Explaining instance classifications with interactions of subsets of feature values. Data Knowl. Eng. 2009, 68, 886–904.
- Garway-Heath, D.F.; Poinoosawmy, D.; Fitzke, F.W.; Hitchings, R.A. Mapping the visual field to the optic disc in normal tension glaucoma eyes. Ophthalmology 2000, 107, 1809–1815.
- Van Craenendonck, T.; Elen, B.; Gerrits, N.; De Boever, P. Systematic Comparison of Heatmapping Techniques in Deep Learning in the Context of Diabetic Retinopathy Lesion Detection. Transl. Vis. Sci. Technol. 2020, 9, 64.
- Covert, I.; Lundberg, S.; Lee, S.I. Explaining by Removing: A Unified Framework for Model Explanation. arXiv 2022, arXiv:2011.14878.
- Hooker, S.; Erhan, D.; Kindermans, P.J.; Kim, B. A Benchmark for Interpretability Methods in Deep Neural Networks. In Advances in Neural Information Processing Systems; Wallach, H., Larochelle, H., Beygelzimer, A., d’Alché-Buc, F., Fox, E., Garnett, R., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2019; Volume 32.
- Chen, H.; Janizek, J.D.; Lundberg, S.; Lee, S.I. True to the Model or True to the Data? arXiv 2020, arXiv:2006.16234.
- Lundberg, S.M.; Erion, G.; Chen, H.; DeGrave, A.; Prutkin, J.M.; Nair, B.; Katz, R.; Himmelfarb, J.; Bansal, N.; Lee, S.I. From Local Explanations to Global Understanding with Explainable AI for Trees. Nat. Mach. Intell. 2020, 2, 56–67.
- Zar, J.H. Significance Testing of the Spearman Rank Correlation Coefficient. J. Am. Stat. Assoc. 1972, 67, 578–580.
- Adebayo, J.; Gilmer, J.; Muelly, M.; Goodfellow, I.; Hardt, M.; Kim, B. Sanity Checks for Saliency Maps. In Advances in Neural Information Processing Systems; Bengio, S., Wallach, H., Larochelle, H., Grauman, K., Cesa-Bianchi, N., Garnett, R., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2018; Volume 31.
- Brodersen, K.H.; Ong, C.S.; Stephan, K.E.; Buhmann, J.M. The Balanced Accuracy and Its Posterior Distribution. In Proceedings of the 2010 20th International Conference on Pattern Recognition, Istanbul, Turkey, 23–26 August 2010; pp. 3121–3124.
- Glorot, X.; Bengio, Y. Understanding the difficulty of training deep feedforward neural networks. In Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, Chia Laguna Resort, Sardinia, Italy, 13–15 May 2010; Volume 9, pp. 249–256.
Dataset | No. of Healthy Images | No. of Glaucoma Images | Min. Resolution | Max. Resolution |
---|---|---|---|---|
RIM-ONE DL | 313 | 172 | 274 × 274 | 793 × 793 |
HUC RGB | 63 | 191 | 520 × 520 | 1035 × 1035 |
REFUGE | 1080 | 120 | 268 × 268 | 562 × 562 |
DRISHTI-GS1 | 31 | 70 | 450 × 450 | 722 × 722 |
Network | Fold | Sensitivity | Specificity | Accuracy | B. Accuracy | F1 Score |
---|---|---|---|---|---|---|
VGG19 | 1 | 0.8767 | 0.9467 | 0.9122 | 0.9117 | 0.9078 |
VGG19 | 2 | 0.9726 | 0.9467 | 0.9595 | 0.9596 | 0.9595 |
VGG19 | 3 | 0.9041 | 0.9733 | 0.9392 | 0.9387 | 0.9362 |
VGG19 | 4 | 0.9863 | 0.9467 | 0.9662 | 0.9665 | 0.9664 |
VGG19 | 5 | 1.0000 | 0.9200 | 0.9595 | 0.9600 | 0.9605 |
VGG19 | M ± SD | 0.9479 ± 0.0543 | 0.9467 ± 0.0189 | 0.9473 ± 0.0221 | 0.9473 ± 0.0225 | 0.9461 ± 0.0243 |
ResNet50 | 1 | 0.9315 | 0.9067 | 0.9189 | 0.9191 | 0.9189 |
ResNet50 | 2 | 0.9589 | 0.9867 | 0.9730 | 0.9728 | 0.9722 |
ResNet50 | 3 | 0.9863 | 0.9600 | 0.9730 | 0.9732 | 0.9730 |
ResNet50 | 4 | 0.9178 | 0.9467 | 0.9324 | 0.9322 | 0.9306 |
ResNet50 | 5 | 0.8904 | 0.9200 | 0.9054 | 0.9052 | 0.9028 |
ResNet50 | M ± SD | 0.9370 ± 0.0370 | 0.9440 ± 0.0318 | 0.9405 ± 0.0311 | 0.9405 ± 0.0311 | 0.9395 ± 0.0318 |
InceptionV3 | 1 | 0.9589 | 0.8800 | 0.9189 | 0.9195 | 0.9211 |
InceptionV3 | 2 | 0.9452 | 0.8667 | 0.9054 | 0.9059 | 0.9079 |
InceptionV3 | 3 | 0.9726 | 0.9067 | 0.9392 | 0.9396 | 0.9404 |
InceptionV3 | 4 | 0.9178 | 0.9333 | 0.9257 | 0.9256 | 0.9241 |
InceptionV3 | 5 | 0.9178 | 0.9200 | 0.9189 | 0.9189 | 0.9178 |
InceptionV3 | M ± SD | 0.9425 ± 0.0245 | 0.9013 ± 0.0276 | 0.9216 ± 0.0123 | 0.9219 ± 0.0122 | 0.9223 ± 0.0118 |
Xception | 1 | 0.9452 | 0.9067 | 0.9257 | 0.9259 | 0.9262 |
Xception | 2 | 0.9315 | 0.8933 | 0.9122 | 0.9124 | 0.9128 |
Xception | 3 | 0.9452 | 0.8267 | 0.8851 | 0.8859 | 0.8903 |
Xception | 4 | 0.9178 | 0.9067 | 0.9122 | 0.9122 | 0.9116 |
Xception | 5 | 0.9315 | 0.8133 | 0.8716 | 0.8724 | 0.8774 |
Xception | M ± SD | 0.9342 ± 0.0115 | 0.8693 ± 0.0456 | 0.9014 ± 0.0222 | 0.9018 ± 0.0219 | 0.9036 ± 0.0195 |
Network | Fold | Sensitivity | Specificity | Accuracy | B. Accuracy | F1 Score |
---|---|---|---|---|---|---|
VGG19 | 1 | 0.8000 | 0.8769 | 0.8692 | 0.8384 | 0.5501 |
VGG19 | 2 | 0.8167 | 0.8472 | 0.8442 | 0.8319 | 0.5117 |
VGG19 | 3 | 0.7333 | 0.9269 | 0.9075 | 0.8301 | 0.6132 |
VGG19 | 4 | 0.7833 | 0.8537 | 0.8467 | 0.8185 | 0.5054 |
VGG19 | 5 | 0.8833 | 0.8898 | 0.8892 | 0.8866 | 0.6145 |
VGG19 | M ± SD | 0.8033 ± 0.0545 | 0.8789 ± 0.0319 | 0.8713 ± 0.0273 | 0.8411 ± 0.0264 | 0.5590 ± 0.0529 |
ResNet50 | 1 | 0.7250 | 0.9620 | 0.9383 | 0.8435 | 0.7016 |
ResNet50 | 2 | 0.8417 | 0.9009 | 0.8950 | 0.8713 | 0.6159 |
ResNet50 | 3 | 0.8000 | 0.8981 | 0.8883 | 0.8491 | 0.5890 |
ResNet50 | 4 | 0.7833 | 0.9296 | 0.9150 | 0.8565 | 0.6483 |
ResNet50 | 5 | 0.8083 | 0.8065 | 0.8067 | 0.8074 | 0.4554 |
ResNet50 | M ± SD | 0.7917 ± 0.0429 | 0.8994 ± 0.0580 | 0.8887 ± 0.0498 | 0.8456 ± 0.0237 | 0.6020 ± 0.0921 |
InceptionV3 | 1 | 0.7500 | 0.9435 | 0.9242 | 0.8468 | 0.6642 |
InceptionV3 | 2 | 0.8333 | 0.9065 | 0.8992 | 0.8699 | 0.6231 |
InceptionV3 | 3 | 0.8500 | 0.9389 | 0.9300 | 0.8944 | 0.7083 |
InceptionV3 | 4 | 0.6750 | 0.9843 | 0.9533 | 0.8296 | 0.7431 |
InceptionV3 | 5 | 0.7750 | 0.9426 | 0.9258 | 0.8588 | 0.6764 |
InceptionV3 | M ± SD | 0.7767 ± 0.0701 | 0.9431 ± 0.0276 | 0.9265 ± 0.0193 | 0.8599 ± 0.0244 | 0.6830 ± 0.0454 |
Xception | 1 | 0.7500 | 0.9083 | 0.8925 | 0.8292 | 0.5825 |
Xception | 2 | 0.8083 | 0.8963 | 0.8875 | 0.8523 | 0.5897 |
Xception | 3 | 0.7000 | 0.9241 | 0.9017 | 0.8120 | 0.5874 |
Xception | 4 | 0.7333 | 0.9148 | 0.8967 | 0.8241 | 0.5867 |
Xception | 5 | 0.9250 | 0.6898 | 0.7133 | 0.8074 | 0.3922 |
Xception | M ± SD | 0.7833 ± 0.0884 | 0.8667 ± 0.0994 | 0.8583 ± 0.0812 | 0.8250 ± 0.0176 | 0.5477 ± 0.0870 |
Network | Fold | Sensitivity | Specificity | Accuracy | B. Accuracy | F1 Score |
---|---|---|---|---|---|---|
VGG19 | 1 | 0.8429 | 0.7742 | 0.8218 | 0.8085 | 0.8676 |
VGG19 | 2 | 0.8857 | 0.7419 | 0.8416 | 0.8138 | 0.8857 |
VGG19 | 3 | 0.8429 | 0.8065 | 0.8317 | 0.8247 | 0.8741 |
VGG19 | 4 | 0.8857 | 0.7419 | 0.8416 | 0.8138 | 0.8857 |
VGG19 | 5 | 0.9571 | 0.6452 | 0.8614 | 0.8012 | 0.9054 |
VGG19 | M ± SD | 0.8829 ± 0.0467 | 0.7419 ± 0.0603 | 0.8396 ± 0.0147 | 0.8124 ± 0.0086 | 0.8837 ± 0.0144 |
ResNet50 | 1 | 0.9286 | 0.7742 | 0.8812 | 0.8514 | 0.9155 |
ResNet50 | 2 | 0.8571 | 0.7419 | 0.8218 | 0.7995 | 0.8696 |
ResNet50 | 3 | 0.8143 | 0.7419 | 0.7921 | 0.7781 | 0.8444 |
ResNet50 | 4 | 0.9143 | 0.7419 | 0.8614 | 0.8281 | 0.9014 |
ResNet50 | 5 | 0.9429 | 0.6774 | 0.8614 | 0.8101 | 0.9041 |
ResNet50 | M ± SD | 0.8914 ± 0.0540 | 0.7355 ± 0.0353 | 0.8436 ± 0.0360 | 0.8135 ± 0.0279 | 0.8870 ± 0.0293 |
InceptionV3 | 1 | 0.8429 | 0.8065 | 0.8317 | 0.8247 | 0.8741 |
InceptionV3 | 2 | 0.8571 | 0.7419 | 0.8218 | 0.7995 | 0.8696 |
InceptionV3 | 3 | 0.9000 | 0.7097 | 0.8416 | 0.8048 | 0.8873 |
InceptionV3 | 4 | 0.8571 | 0.7742 | 0.8317 | 0.8157 | 0.8759 |
InceptionV3 | 5 | 0.8857 | 0.7742 | 0.8515 | 0.8300 | 0.8921 |
InceptionV3 | M ± SD | 0.8686 ± 0.0235 | 0.7613 ± 0.0368 | 0.8356 ± 0.0113 | 0.8149 ± 0.0128 | 0.8798 ± 0.0095 |
Xception | 1 | 0.8429 | 0.7097 | 0.8020 | 0.7763 | 0.8551 |
Xception | 2 | 0.8714 | 0.6774 | 0.8119 | 0.7744 | 0.8652 |
Xception | 3 | 0.8286 | 0.7419 | 0.8020 | 0.7853 | 0.8529 |
Xception | 4 | 0.9143 | 0.6452 | 0.8317 | 0.7797 | 0.8828 |
Xception | 5 | 0.8857 | 0.6774 | 0.8218 | 0.7816 | 0.8732 |
Xception | M ± SD | 0.8686 ± 0.0341 | 0.6903 ± 0.0368 | 0.8139 ± 0.0129 | 0.7794 ± 0.0043 | 0.8659 ± 0.0125 |
Architecture | Fold 1 | Fold 2 | Fold 3 | Fold 4 | Fold 5 | M ± SD |
---|---|---|---|---|---|---|
VGG19 | 0.9254 | 0.9322 | 0.9387 | 0.9459 | 0.9663 | 0.9417 ± 0.0157 |
ResNet50 | 0.9052 | 0.9728 | 0.9528 | 0.9391 | 0.8985 | 0.9337 ± 0.0315 |
InceptionV3 | 0.9391 | 0.9256 | 0.9185 | 0.9256 | 0.9393 | 0.9296 ± 0.0092 |
Xception | 0.8995 | 0.9187 | 0.8789 | 0.9393 | 0.8791 | 0.9031 ± 0.0261 |