Beyond Accuracy: Explainable Deep Learning for Alzheimer’s Disease Detection Using Structural MRI Data
Abstract
1. Introduction
- Provide an optimized reference model for 3D AD classification using ADNI 3D structural MRI images.
- Conduct a systematic comparison of interpretation methods under a consistent implementation and parameterization.
- Establish a quantitative and qualitative evaluation protocol that incorporates sanity checks and comparisons of brain regions against established AD biomarkers.
2. Related Work
2.1. AD Diagnosis and MRI Biomarkers
2.2. Deep Learning Models for AD Classification
2.3. Interpretability of Medical Imaging: Intrinsic vs. Post Hoc
2.3.1. Gradient-Based Methods
2.3.2. Model-Agnostic Perturbation Methods
2.3.3. Limitations and Pitfalls of Saliency Maps
3. Datasets
4. Framework
4.1. Image Preprocessing
- 1. Finding Target Size: All segmented brain masks initially shared the same shape. The first step therefore determined the minimum dimensions needed to fit every brain region in the dataset without any spatial transformation, such as zooming or interpolation. For each MRI scan, we measured the extent of the brain region along each axis and kept the dataset-wide maximum per axis; each axis must be assessed individually because the extent of the brain region differs across the three planes. These per-axis maxima define the final target shape. We did not encounter any FreeSurfer post-processed brain mask available on ADNI that exceeds these dimensions.
- 2. Segmenting Brain Region: In the second step, we removed the background (zero-valued) voxels to isolate the brain region. We first extracted the non-zero coordinates of the image and then cropped the brain region to the minimum and maximum non-zero coordinates along each axis. We also updated the header and affine of the raw image to match the segmented image. A sample output of this step is shown in Figure 2.
- 3. Padding for Uniformity: The MRI shapes after Step 2 were not uniform across the dataset. To make them uniform, we used the target shape computed in Step 1: if an MRI was smaller than the target shape along any axis, we added symmetric zero-padding on both sides of the brain region along that axis. Before saving, we again updated the header and affine of the images. These adjustments give every image in the dataset a uniform shape, making it suitable for further use. Steps 2 and 3 are summarized in Algorithm 1, and a minimal code sketch of Steps 1–3 follows it.
Algorithm 1: Process 3D MRI Images
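As a minimal sketch of Steps 1–3, the following Python code uses nibabel and NumPy under stated assumptions: the `adni_brainmasks/` directory is hypothetical, the pad value of zero matches the background, and the paper's exact target shape is computed rather than hard-coded.

```python
import glob
import nibabel as nib
import numpy as np

def brain_bbox(volume):
    """Return (min, max+1) indices of the non-zero region along each axis."""
    coords = np.argwhere(volume != 0)
    return coords.min(axis=0), coords.max(axis=0) + 1

# Step 1: scan the dataset for the largest brain extent along each axis.
target_shape = np.zeros(3, dtype=int)
for path in glob.glob("adni_brainmasks/*.nii.gz"):  # hypothetical location
    vol = nib.load(path).get_fdata()
    lo, hi = brain_bbox(vol)
    target_shape = np.maximum(target_shape, hi - lo)

def crop_and_pad(path, target_shape):
    # Step 2: crop to the bounding box of the non-zero voxels.
    img = nib.load(path)
    vol = img.get_fdata()
    lo, hi = brain_bbox(vol)
    cropped = vol[lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2]]
    # Step 3: zero-pad symmetrically up to the dataset-wide target shape.
    pads = []
    for axis in range(3):
        total = target_shape[axis] - cropped.shape[axis]
        pads.append((total // 2, total - total // 2))
    padded = np.pad(cropped, pads, mode="constant", constant_values=0)
    # Update the affine so the header matches the shifted voxel origin.
    new_affine = img.affine.copy()
    offset = lo - np.array([p[0] for p in pads])
    new_affine[:3, 3] += img.affine[:3, :3] @ offset
    return nib.Nifti1Image(padded, new_affine)
```

Cropping before padding avoids any interpolation, so voxel intensities are preserved exactly while every output volume reaches the same shape.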
4.2. Reference Classifiers
4.2.1. 3D CNN and DenseNet Variants
- Baseline 3D CNN: The baseline model, illustrated in Figure 3, consists of four convolutional blocks and three fully connected layers [3]. Each block includes a convolution layer with a kernel size of 3 × 3 × 3, followed by batch normalization, a ReLU activation function, and max pooling. The convolutional channels progress from 1 to 8, 8 to 16, 16 to 32, and 32 to 64, with pooling factors of 2, 3, 2, and 3, respectively. After the convolutional blocks, the learned feature representations are flattened and passed through three fully connected layers (128, 64, and 2 neurons, respectively) with ReLU activations. Dropout regularization (rate = 0.8) is applied after the first dense layer. The final output layer produces logits for binary classification, optimized with cross-entropy loss. A PyTorch sketch of both reference models is given after this list.
- Custom DenseNet: We experimented with a custom three-dimensional DenseNet variant, trained from scratch on volumetric MRI inputs with a growth rate of 8 and a compression factor of 0.5 [33]. The architecture (Figure 4) begins with an initial convolutional block: a 3 × 3 × 3 convolution with a stride of 1, followed by max pooling, batch normalization, and ReLU activation. The feature extractor consists of two Dense Blocks, each containing three composite layers arranged as Batch Normalization (BN) → ReLU → 1 × 1 × 1 convolution → BN → ReLU → 3 × 3 × 3 convolution. After the first Dense Block, a transition layer (BN → ReLU → 1 × 1 × 1 convolution → average pooling) reduces both the channel dimension (by the compression factor of 0.5) and the spatial resolution. The second Dense Block, with the same pattern as the first, is then applied, followed by a max-pooling transition layer. The classifier head consists of a global flattening operation, dropout regularization (p = 0.5), and a fully connected linear layer mapping to the final output. ReLU activation is used throughout except at the final layer, which is trained with cross-entropy loss, where softmax is applied implicitly.
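The first sketch below is a minimal PyTorch rendering of the baseline 3D CNN described above; the convolution padding of 1 and the use of `nn.LazyLinear` (to avoid hard-coding the flattened feature size, which depends on the input volume) are assumptions.

```python
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch, pool):
    return nn.Sequential(
        nn.Conv3d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm3d(out_ch),
        nn.ReLU(inplace=True),
        nn.MaxPool3d(pool),
    )

class Baseline3DCNN(nn.Module):
    def __init__(self, num_classes=2):
        super().__init__()
        # Channel progression 1->8->16->32->64 with pool factors 2, 3, 2, 3.
        self.features = nn.Sequential(
            conv_block(1, 8, 2),
            conv_block(8, 16, 3),
            conv_block(16, 32, 2),
            conv_block(32, 64, 3),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.LazyLinear(128),          # infers flattened size at first call
            nn.ReLU(inplace=True),
            nn.Dropout(p=0.8),           # dropout after the first dense layer
            nn.Linear(128, 64),
            nn.ReLU(inplace=True),
            nn.Linear(64, num_classes),  # logits for cross-entropy loss
        )

    def forward(self, x):                # x: (batch, 1, D, H, W)
        return self.classifier(self.features(x))
```

A corresponding sketch of the custom 3D DenseNet follows, assuming the stated growth rate of 8 and compression factor of 0.5; the bottleneck width (4 × growth rate) and the initial channel count of 16 are assumptions not given in the text.

```python
class CompositeLayer(nn.Module):
    """BN -> ReLU -> 1x1x1 conv -> BN -> ReLU -> 3x3x3 conv."""
    def __init__(self, in_ch, growth_rate):
        super().__init__()
        mid = 4 * growth_rate  # bottleneck width (assumption)
        self.net = nn.Sequential(
            nn.BatchNorm3d(in_ch), nn.ReLU(inplace=True),
            nn.Conv3d(in_ch, mid, kernel_size=1, bias=False),
            nn.BatchNorm3d(mid), nn.ReLU(inplace=True),
            nn.Conv3d(mid, growth_rate, kernel_size=3, padding=1, bias=False),
        )

    def forward(self, x):
        # Dense connectivity: concatenate new features onto the input.
        return torch.cat([x, self.net(x)], dim=1)

def dense_block(in_ch, growth_rate, n_layers=3):
    layers, ch = [], in_ch
    for _ in range(n_layers):
        layers.append(CompositeLayer(ch, growth_rate))
        ch += growth_rate
    return nn.Sequential(*layers), ch

class Custom3DDenseNet(nn.Module):
    def __init__(self, num_classes=2, growth_rate=8, compression=0.5):
        super().__init__()
        self.stem = nn.Sequential(  # initial conv block
            nn.Conv3d(1, 16, kernel_size=3, stride=1, padding=1, bias=False),
            nn.MaxPool3d(2), nn.BatchNorm3d(16), nn.ReLU(inplace=True),
        )
        block1, ch = dense_block(16, growth_rate)
        out_ch = int(ch * compression)
        transition = nn.Sequential(  # BN -> ReLU -> 1x1x1 conv -> avg pool
            nn.BatchNorm3d(ch), nn.ReLU(inplace=True),
            nn.Conv3d(ch, out_ch, kernel_size=1, bias=False),
            nn.AvgPool3d(2),
        )
        block2, _ = dense_block(out_ch, growth_rate)
        self.features = nn.Sequential(block1, transition, block2,
                                      nn.MaxPool3d(2))  # max-pool transition
        self.classifier = nn.Sequential(
            nn.Flatten(), nn.Dropout(p=0.5), nn.LazyLinear(num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(self.stem(x)))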
4.2.2. ResNet-18 Pretrained on ImageNet
4.2.3. CNN with ResNet and Swin Transformer Pre-Training
4.3. Interpretation Methods
4.3.1. Gradient-Based Techniques
- Grad-CAM (Gradient-weighted Class Activation Mapping) is a gradient-based visualization method that uses the gradients of a target class flowing into the final convolutional layers to create coarse localization heatmaps [19]. It offers class-specific visual explanations by highlighting the areas of the input image that most influence the model's decision, which makes it especially useful in medical imaging, where identifying discriminative anatomical regions is crucial. Because it needs only a single forward and backward pass, it is computationally light and easily integrated into existing CNN designs (a minimal 3D sketch is given after this list).
- Grad-CAM++ is an improved gradient-based visualization method that uses higher-order gradients to handle scenarios with fine-grained features or numerous object instances [20]. It is appropriate for tasks requiring fine detail, such as medical image analysis, because it produces localization maps that are sharper and more precise. Its efficiency comes from using both first- and second-order gradients to capture more subtle spatial importance.
- HiResCAM (High-Resolution Class Activation Mapping) is a gradient-based visualization approach that improves the faithfulness of Grad-CAM [21]. Instead of globally average-pooling the gradients into per-channel weights, it multiplies the gradients element-wise with the feature maps before summing over channels, so the resulting map reflects the locations the model actually used to increase the class score. This avoids the blurring and misattribution introduced by gradient averaging, making it effective for high-resolution explanations, especially in medical imaging.
- Backpropagation is a gradient-based visualization technique that calculates the gradient of the output class score with respect to the input image [7]. By highlighting each input pixel's sensitivity to the prediction, it shows which areas of the image have the greatest effect on the model's output. It is efficient because it is straightforward and uses the gradient directly, yet it frequently generates noisy attribution maps that lack spatial context.
- Guided Backpropagation is a gradient-based visualization method that modifies the backward pass through ReLU layers to improve on plain backpropagation [18]. Sharper and more focused saliency maps are produced by restricting the gradient flow to neurons that had positive activations in the forward pass. It does not provide class-discriminative localization, but it is effective at bringing out fine details in the input image.
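To make this family concrete, here is a minimal 3D Grad-CAM sketch (referenced in the first bullet above). It assumes a PyTorch model whose final convolutional module is supplied as `target_layer`; Grad-CAM++ and HiResCAM differ only in how the gradients are weighted and combined, and Guided Backpropagation would instead override the ReLU backward pass itself.

```python
import torch
import torch.nn.functional as F

def grad_cam_3d(model, volume, target_layer, class_idx):
    """volume: (1, 1, D, H, W) tensor; returns a (D, H, W) heatmap."""
    activations, gradients = {}, {}

    def fwd_hook(module, inputs, output):
        activations["a"] = output

    def bwd_hook(module, grad_in, grad_out):
        gradients["g"] = grad_out[0]

    h1 = target_layer.register_forward_hook(fwd_hook)
    h2 = target_layer.register_full_backward_hook(bwd_hook)
    try:
        model.zero_grad()
        score = model(volume)[0, class_idx]  # class score (logit)
        score.backward()
    finally:
        h1.remove()
        h2.remove()

    # Grad-CAM: global-average-pool the gradients into per-channel weights,
    # take a weighted sum of the feature maps, and clamp with ReLU.
    weights = gradients["g"].mean(dim=(2, 3, 4), keepdim=True)
    cam = F.relu((weights * activations["a"]).sum(dim=1))[0]
    # Upsample the coarse map back to the input resolution.
    cam = F.interpolate(cam[None, None], size=volume.shape[2:],
                        mode="trilinear", align_corners=False)[0, 0]
    return cam / (cam.max() + 1e-8)
```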
4.3.2. Model-Agnostic Techniques
- SHAP (SHapley Additive exPlanations) is a unified framework for interpreting machine learning models by computing the contribution of each feature to the model’s prediction [23]. Kernel SHAP is a model-agnostic variant that estimates SHAP values using a weighted linear regression approach based on Shapley values from cooperative game theory. It is particularly useful when model internals are inaccessible or opaque.
- LIME (Local Interpretable Model-agnostic Explanations) explains individual predictions by fitting a local surrogate model around the prediction of interest [22]. It perturbs the input data and observes changes in model outputs to learn the importance of input features. This method is model-agnostic and widely used for interpreting complex black-box models.
- RISE (Randomized Input Sampling for Explanation) is a saliency mapping technique designed to interpret deep neural networks, particularly in vision tasks [24]. It generates heatmaps by applying random masks to the input image and observing the resulting changes in prediction scores. The final explanation is computed as a weighted sum of these masks, weighted by the model's output confidence for each masked input (a compact 3D sketch follows this list).
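As a compact illustration of the model-agnostic family, here is a 3D RISE sketch (referenced above), chosen because its mask-and-reweigh loop is the simplest of the three to show; the mask count, grid size, and keep probability are assumptions, not the paper's settings.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def rise_3d(model, volume, class_idx, n_masks=500, grid=7, p_keep=0.5):
    """volume: (1, 1, D, H, W) tensor; returns a (D, H, W) importance map."""
    d, h, w = volume.shape[2:]
    saliency = torch.zeros(d, h, w)
    for _ in range(n_masks):
        # Random binary grid, upsampled into a smooth soft mask in [0, 1].
        coarse = (torch.rand(1, 1, grid, grid, grid) < p_keep).float()
        mask = F.interpolate(coarse, size=(d, h, w),
                             mode="trilinear", align_corners=False)[0, 0]
        # Weight each mask by the model's confidence on the masked input.
        prob = torch.softmax(model(volume * mask), dim=1)[0, class_idx]
        saliency += prob * mask
    return saliency / (n_masks * p_keep)  # RISE's Monte Carlo normalization
```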
5. Implementation
5.1. Experimental Setup
5.2. Performance Metrics on Classification
5.3. Evaluation Protocol for Interpretation Methods
- Faithfulness Tests—To verify that our interpretation methods were learning meaningful features rather than detecting model artifacts, we employed a model reinitialization sanity check [39]. For this check, we calculated the Pearson [40] and Spearman's rank [41] correlation coefficients between the saliency map from the fully trained model and a map generated from the same model after its parameters were randomly reinitialized. A faithful saliency method is expected to yield a CC score close to 0, confirming that the map's structural patterns were derived from the model's learned weights rather than from input artifacts or architectural biases. Conversely, a score close to $+1$ or $-1$ would indicate a strong positive or negative correlation, suggesting that the saliency map is heavily influenced by the model's architecture or initialization instead of its learned weights (see the sketches after this list).
- Robustness Checks—To evaluate the robustness of the interpretation methods, we perturbed the input MRI samples by injecting additive Gaussian noise and by applying a low-pass Gaussian filter. Noise is added to each voxel $x$ of the input volume as

$\tilde{x} = x + N,$

where $N$ is a random value sampled from a Gaussian distribution with mean zero and standard deviation $\sigma$. The probability density function of this distribution is

$f(n) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{n^2}{2\sigma^2}\right).$

For the Gaussian filter, the MRI sample is convolved with a 3D Gaussian kernel, defined as

$G(x, y, z) = \frac{1}{(2\pi)^{3/2}\sigma^3} \exp\left(-\frac{x^2 + y^2 + z^2}{2\sigma^2}\right),$

where $(x, y, z)$ are the spatial coordinates relative to the center of the kernel and $\sigma$ is the standard deviation, which controls the amount of blurring. Finally, we compared the saliency maps from the perturbed inputs with the baseline (unperturbed) maps using Spearman's rank correlation coefficient [41]. Robust interpretation methods are expected to produce saliency maps that remain highly similar between the baseline and perturbed inputs (see the sketches after this list).
- Usefulness Verification—To provide anatomical interpretability, we mapped saliency maps to standard brain regions using the Automated Anatomical Labeling (AAL) atlas [42]. Region-wise hit rates were computed for structures commonly associated with Alzheimer's disease (e.g., hippocampus, entorhinal cortex, temporal lobe), as suggested in prior neuroimaging studies [43]. A sketch of this computation also follows this list.
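The three checks above can be made concrete with short sketches under stated assumptions: the first two assume a `saliency_fn(model, volume)` wrapper that returns a 3D torch tensor (e.g., wrapping the Grad-CAM sketch in Section 4.3.1) and CPU tensors, and all thresholds and $\sigma$ values shown are illustrative rather than the paper's settings.

A sketch of the model-reinitialization sanity check:

```python
import copy
from scipy.stats import pearsonr, spearmanr

def reinit_sanity_check(model, volume, saliency_fn):
    base_map = saliency_fn(model, volume).flatten().detach().numpy()
    # Re-draw every learnable parameter of a deep copy of the trained model.
    rand_model = copy.deepcopy(model)
    for module in rand_model.modules():
        if hasattr(module, "reset_parameters"):
            module.reset_parameters()
    rand_map = saliency_fn(rand_model, volume).flatten().detach().numpy()
    # A faithful method should show near-zero correlation between the maps.
    r, _ = pearsonr(base_map, rand_map)
    rho, _ = spearmanr(base_map, rand_map)
    return r, rho
```

A sketch of the robustness check, implementing the additive-noise and Gaussian-filter perturbations defined above:

```python
import torch
from scipy.ndimage import gaussian_filter
from scipy.stats import spearmanr

def robustness_check(model, volume, saliency_fn,
                     noise_sigma=0.1, blur_sigma=1.0):
    base = saliency_fn(model, volume).flatten().detach().numpy()
    # Additive noise: x + N with N ~ N(0, noise_sigma^2) drawn per voxel.
    noisy = volume + noise_sigma * torch.randn_like(volume)
    # Low-pass filtering: convolve the volume with a 3D Gaussian kernel.
    blurred = gaussian_filter(volume[0, 0].numpy(), sigma=blur_sigma)
    blurred = torch.from_numpy(blurred).float()[None, None]
    results = {}
    for name, perturbed in (("noise", noisy), ("blur", blurred)):
        pert_map = saliency_fn(model, perturbed).flatten().detach().numpy()
        rho, _ = spearmanr(base, pert_map)
        results[name] = rho  # high rho -> robust to this perturbation
    return results
```

And a sketch of the region-wise hit-rate computation, assuming the saliency map has already been registered to the AAL atlas grid and that nilearn's atlas fetcher is available:

```python
import numpy as np
import nibabel as nib
from nilearn import datasets

def region_hit_rates(saliency, top_fraction=0.05):
    """saliency: 3D numpy array aligned to the AAL atlas grid."""
    aal = datasets.fetch_atlas_aal()
    atlas = nib.load(aal.maps).get_fdata()
    # Keep only the most salient voxels (top 5% by default).
    hits = saliency >= np.quantile(saliency, 1.0 - top_fraction)
    rates = {}
    for label, index in zip(aal.labels, aal.indices):
        region = atlas == int(index)
        if region.any():
            # Share of all top-salient voxels falling inside this region.
            rates[label] = hits[region].sum() / hits.sum()
    return dict(sorted(rates.items(), key=lambda kv: -kv[1]))
```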
6. Results
6.1. Performance Analysis of the Classifiers
6.2. Qualitative Analysis of the Interpretations
6.2.1. Interpretation of DenseNet
6.2.2. Interpretation of ResNet-18 (Pre-Trained)
6.2.3. Interpretation of Hybrid Swin Transformer
6.3. Quantitative Analysis of the Interpretations
6.3.1. Faithfulness Tests
6.3.2. Robustness Checks
6.3.3. Usefulness Verification
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| CNN | Convolutional Neural Network |
| XAI | Explainable Artificial Intelligence |
| MRI | Magnetic Resonance Imaging |
| AD | Alzheimer’s Disease |
| CN | Cognitively Normal |
| MCI | Mild Cognitive Impairment |
| ROI | Region-of-Interest |
| ReLU | Rectified Linear Unit |
| BN | Batch Normalization |
| QC | Quality Control |
| AAL | Automated Anatomical Labeling |
| Grad-CAM | Gradient-weighted Class Activation Mapping |
| HiResCAM | High-Resolution Class Activation Mapping |
| LIME | Local Interpretable Model-agnostic Explanations |
| RISE | Randomized Input Sampling for Explanation |
| SHAP | SHapley Additive exPlanations |
| SHARCNET | Shared Hierarchical Academic Research Computing Network |
| CC | Correlation Coefficient |
References
- Steiner, A.B.Q.; Jacinto, A.F.; Mayoral, V.F.S.; Brucki, S.M.D.; Citero, V.A. Mild cognitive impairment and progression to dementia of Alzheimer’s disease. Rev. Assoc. Méd. Bras. 2017, 63, 651–655.
- Amoroso, N.; Quarto, S.; La Rocca, M.; Tangaro, S.; Monaco, A.; Bellotti, R. An eXplainability artificial intelligence approach to brain connectivity in Alzheimer’s disease. Front. Aging Neurosci. 2023, 15, 1238065.
- Rieke, J.; Eitel, F.; Weygandt, M.; Haynes, J.D.; Ritter, K. Visualizing convolutional networks for MRI-based diagnosis of Alzheimer’s disease. In Proceedings of the Understanding and Interpreting Machine Learning in Medical Image Computing Applications, Cham, Switzerland, 16–20 September 2018; pp. 24–31.
- Ebrahimighahnavieh, A.; Luo, S.; Chiong, R. Deep learning to detect Alzheimer’s disease from neuroimaging: A systematic literature review. Comput. Methods Programs Biomed. 2020, 187, 105242.
- Hassan, N.; Miah, A.S.M.; Suzuki, T.; Shin, J. Gradual variation-based dual-stream deep learning for spatial feature enhancement with dimensionality reduction in early Alzheimer’s disease detection. IEEE Access 2025, 13, 31701–31717.
- Zeiler, M.D.; Fergus, R. Visualizing and understanding convolutional networks. In Proceedings of the European Conference on Computer Vision, Zurich, Switzerland, 6–12 September 2014; pp. 818–833.
- Simonyan, K.; Vedaldi, A.; Zisserman, A. Deep inside Convolutional Networks: Visualising image classification models and saliency maps. arXiv 2013, arXiv:1312.6034.
- Dandl, S.; Binder, M.; Auer, A.; Bischl, B. Multi-Objective Counterfactual Explanations. In Proceedings of the Parallel Problem Solving from Nature—PPSN XVI, Granada, Spain, 5–9 September 2020; pp. 448–469.
- Jack, C.R.; Knopman, D.S.; Jagust, W.J.; Petersen, R.C.; Weiner, M.W.; Aisen, P.S.; Shaw, L.M.; Vemuri, P.; Wiste, H.J.; Weigand, S.D.; et al. Tracking pathophysiological processes in Alzheimer’s disease: An updated hypothetical model of dynamic biomarkers. Lancet Neurol. 2013, 12, 207–216.
- Jack, C.R.; Wiste, H.J.; Vemuri, P.; Weigand, S.D.; Senjem, M.L.; Zeng, G.; Bernstein, M.A.; Gunter, J.L.; Pankratz, V.S.; Aisen, P.S.; et al. Brain beta-amyloid measures and magnetic resonance imaging atrophy both predict time-to-progression from mild cognitive impairment to Alzheimer’s disease. Brain 2010, 133, 3336–3348.
- Fischl, B. FreeSurfer. NeuroImage 2012, 62, 774–781.
- Schwarz, C.G.; Gunter, J.L.; Wiste, H.J.; Przybelski, S.A.; Weigand, S.D.; Ward, C.P.; Senjem, M.L.; Vemuri, P.; Murray, M.E.; Dickson, D.W.; et al. A large-scale comparison of cortical thickness and volume methods for measuring Alzheimer’s disease severity. NeuroImage Clin. 2016, 11, 802–812.
- Liu, S.; Masurkar, A.V.; Rusinek, H.; Chen, J.; Zhang, B.; Zhu, W.; Fernandez-Granda, C.; Razavian, N. Generalizable deep learning model for early Alzheimer’s disease detection from structural MRIs. Sci. Rep. 2022, 12, 17106.
- Alahmed, H.; Al-Suhail, G. AlzONet: A deep learning optimized framework for multiclass Alzheimer’s disease diagnosis using MRI brain imaging. J. Supercomput. 2025, 81, 423.
- Ebrahimi, A.; Luo, S.; Alzheimer’s Disease Neuroimaging Initiative. Convolutional neural networks for Alzheimer’s disease detection on MRI images. J. Med. Imaging 2021, 8, 024503.
- Farahani, F.V.; Fiok, K.; Lahijanian, B.; Karwowski, W.; Douglas, P.K. Explainable AI: A review of applications to neuroimaging data. Front. Neurosci. 2022, 16, 906290.
- Wang, S.H.; Han, X.; Du, J.; Wang, Z.; Yuan, C.; Chen, Y.; Zhu, Y.; Dou, X.; Xu, X.; Xu, H.; et al. Saliency-based 3D convolutional neural network for categorising common focal liver lesions on multisequence MRI. Insights Imaging 2021, 12, 173.
- Springenberg, J.T.; Dosovitskiy, A.; Brox, T.; Riedmiller, M. Striving for simplicity: The all convolutional net. arXiv 2014, arXiv:1412.6806.
- Selvaraju, R.R.; Cogswell, M.; Das, A.; Vedantam, R.; Parikh, D.; Batra, D. Grad-CAM: Visual explanations from deep networks via gradient-based localization. Int. J. Comput. Vis. 2020, 128, 336–359.
- Chattopadhay, A.; Sarkar, A.; Howlader, P.; Balasubramanian, V.N. Grad-CAM++: Generalized gradient-based visual explanations for deep convolutional networks. In Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), Lake Tahoe, NV, USA, 12–15 March 2018; pp. 839–847.
- Draelos, R.L.; Carin, L. Use HiResCAM instead of Grad-CAM for faithful explanations of convolutional neural networks. arXiv 2020, arXiv:2011.08891.
- Ribeiro, M.T.; Singh, S.; Guestrin, C. “Why Should I Trust You?”: Explaining the Predictions of Any Classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), San Francisco, CA, USA, 13–17 August 2016; pp. 1135–1144.
- Lundberg, S.M.; Lee, S.I. A Unified Approach to Interpreting Model Predictions. In Proceedings of the 31st International Conference on Neural Information Processing Systems (NeurIPS), Long Beach, CA, USA, 4–9 December 2017; pp. 4765–4774.
- Petsiuk, V.; Das, A.; Saenko, K. RISE: Randomized Input Sampling for Explanation of Black-box Models. In Proceedings of the British Machine Vision Conference (BMVC), Newcastle upon Tyne, UK, 3–6 September 2018; BMVA Press: Guildford, UK, 2018.
- Jin, W.; Li, X.; Hamarneh, G. One map does not fit all: Evaluating saliency map explanation on multi-modal medical images. arXiv 2021, arXiv:2107.05047.
- Wiśniewski, M.; Giulivi, L.; Boracchi, G. SE3D: A framework for saliency method evaluation in 3D imaging. In Proceedings of the 2024 IEEE International Conference on Image Processing (ICIP), Abu Dhabi, United Arab Emirates, 27–30 October 2024; pp. 89–95.
- Brima, Y.; Atemkeng, M. Saliency-driven explainable deep learning in medical imaging: Bridging visual explainability and statistical quantitative analysis. BioData Min. 2024, 17, 18.
- Alzheimer’s Disease Neuroimaging Initiative. Available online: https://adni.loni.usc.edu/ (accessed on 7 August 2025).
- Alzheimer’s Disease Neuroimaging Initiative. UCSF FreeSurfer Methods Summary. Available online: https://adni.bitbucket.io/reference/docs/UCSFFRESFR/UCSFFreeSurferMethodsSummary.pdf (accessed on 15 August 2025).
- Ebrahimi, A.; Luo, S.; Chiong, R.; Alzheimer’s Disease Neuroimaging Initiative. Deep sequence modelling for Alzheimer’s disease detection using MRI. Comput. Biol. Med. 2021, 134, 104537.
- Guan, Z.; Kumar, R.; Fung, Y.R.; Wu, Y.; Fiterau, M. A Comprehensive Study of Alzheimer’s Disease Classification Using Convolutional Neural Networks. arXiv 2019, arXiv:1904.07950.
- Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K.Q. Densely Connected Convolutional Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 4700–4708.
- Singh, D.; Dyrba, M. Comparison of CNN Architectures for Detecting Alzheimer’s Disease using Relevance Maps. In Bildverarbeitung für die Medizin 2023; Deserno, T.M., Handels, H., Maier, A., Maier-Hein, K., Palm, C., Tolxdorff, T., Eds.; Springer Vieweg: Wiesbaden, Germany, 2023.
- Russakovsky, O.; Deng, J.; Su, H.; Krause, J.; Satheesh, S.; Ma, S.; Huang, Z.; Karpathy, A.; Khosla, A.; Bernstein, M.; et al. ImageNet Large Scale Visual Recognition Challenge. Int. J. Comput. Vis. 2015, 115, 211–252.
- Carreira, J.; Zisserman, A. Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 4724–4733.
- Ebrahimi-Ghahnavieh, A.; Luo, S.; Chiong, R. Transfer learning for Alzheimer’s disease detection on MRI images. In Proceedings of the 2019 IEEE International Conference on Industry 4.0, Artificial Intelligence, and Communications Technology (IAICT), Bali, Indonesia, 1–3 July 2019; pp. 133–138.
- Tamal. xai-for-ad. GitHub. 2025. Available online: https://github.com/tamal3472/xai-for-ad (accessed on 27 November 2025).
- SHARCNET. SHARCNET: Shared Hierarchical Academic Research Computing Network. 2024. Available online: https://www.sharcnet.ca (accessed on 3 October 2025).
- Adebayo, J.; Gilmer, J.; Muelly, M.; Goodfellow, I.; Hardt, M.; Kim, B. Sanity Checks for Saliency Maps. In Proceedings of the 32nd International Conference on Neural Information Processing Systems (NeurIPS), NeurIPS’18, Red Hook, NY, USA, 2–7 December 2018; pp. 9525–9536.
- Pearson, K. VII. Mathematical contributions to the theory of evolution—III. Regression, heredity, and panmixia. Philos. Trans. R. Soc. Lond. Ser. Contain. Pap. Math. Phys. Character 1896, 187, 253–318.
- Spearman, C. The Proof and Measurement of Association between Two Things. Am. J. Psychol. 1904, 15, 72–101.
- Tzourio-Mazoyer, N.; Landeau, B.; Papathanassiou, D.; Crivello, F.; Etard, O.; Delcroix, N.; Mazoyer, B.; Joliot, M. Automated Anatomical Labeling of Activations in SPM Using a Macroscopic Anatomical Parcellation of the MNI MRI Single-Subject Brain. NeuroImage 2002, 15, 273–289.
- Rathore, S.; Habes, M.; Iftikhar, M.A.; Shacklett, A.; Davatzikos, C. A review on neuroimaging-based classification studies and associated feature extraction methods for Alzheimer’s disease and its prodromal stages. NeuroImage 2017, 155, 530–548.
- Schallner, L.; Rabold, J.; Scholz, O.; Schmid, U. Effect of Superpixel Aggregation on Explanations in LIME—A Case Study with Biological Data. In Communications in Computer and Information Science, Proceedings of the Machine Learning and Knowledge Discovery in Databases, Würzburg, Germany, 16–20 September 2019; Cellier, P., Driessens, K., Eds.; Springer International Publishing: Cham, Switzerland, 2020; Volume 1167, pp. 147–158.
- Tempel, F.; Ihlen, E.A.F.; Adde, L.; Strümke, I. Explaining Human Activity Recognition with SHAP: Validating insights with perturbation and quantitative measures. Comput. Biol. Med. 2025, 188, 109838.
- Hendrycks, D.; Dietterich, T. Benchmarking Neural Network Robustness to Common Corruptions and Perturbations. arXiv 2019, arXiv:1903.12261.
- Convit, A.; Leon, M.J.D.; Tarshish, C.; Santi, S.D.; Tsui, W.; Rusinek, H.; George, A. Specific Hippocampal Volume Reductions in Individuals at Risk for Alzheimer’s Disease. Neurobiol. Aging 1997, 18, 131–138.
- Clerx, L.; van Rossum, I.A.; Burns, L.; Knol, D.L.; Scheltens, P.; Verhey, F.; Aalten, P.; Lapuerta, P.; van de Pol, L.; van Schijndel, R.; et al. Measurements of medial temporal lobe atrophy for prediction of Alzheimer’s disease in subjects with mild cognitive impairment. Neurobiol. Aging 2013, 34, 2003–2013.
- Desikan, R.S.; Cabral, H.J.; Settecase, F.; Hess, C.P.; Dillon, W.P.; Glastonbury, C.M.; Weiner, M.W.; Schmansky, N.J.; Salat, D.H.; Fischl, B. Automated MRI measures predict progression to Alzheimer’s disease. Neurobiol. Aging 2010, 31, 1364–1374.
- Miller, E.K.; Cohen, J.D. An Integrative Theory of Prefrontal Cortex Function. Annu. Rev. Neurosci. 2001, 24, 167–202.
- Morgan, C.T. The Cerebral Cortex of Man: A Clinical Study of Localization of Function. Science 1950, 112, 567.
- Stoodley, C.J.; Valera, E.M.; Schmahmann, J.D. Functional topography of the cerebellum for motor and cognitive tasks: An fMRI study. NeuroImage 2012, 59, 1560–1570.
- Chauveau, L.; Kuhn, E.; Palix, C.; Felisatti, F.; Ourry, V.; de La Sayette, V.; Chételat, G.; de Flores, R. Medial Temporal Lobe Subregional Atrophy in Aging and Alzheimer’s Disease: A Longitudinal Study. Front. Aging Neurosci. 2021, 13, 750154.
- Braak, H.; Braak, E. Neuropathological staging of Alzheimer-related changes. Acta Neuropathol. 1991, 82, 239–259.
| Model | Accuracy (%) | F1 Score (WA) | F1 Score (Class CN/AD) |
|---|---|---|---|
| Baseline CNN (scratch) | 81.33 | 0.81 | 0.80/0.83 |
| DenseNet (scratch) | 92.00 | 0.92 | 0.93/0.90 |
| ResNet-18 (scratch) | 80.67 | 0.81 | 0.82/0.80 |
| ResNet-18 (pre-trained) | 95.33 | 0.95 | 0.96/0.94 |
| Hybrid Swin Transformer | 92.67 | 0.93 | 0.94/0.91 |
| Model | Grad-CAM (%) | Grad-CAM++ (%) | HiResCAM (%) | Backpropagation (%) | Guided Backpropagation (%) |
|---|---|---|---|---|---|
| DenseNet | Frontal_Mid (5.97) Temporal_Mid (5.83) Temporal_Inf (4.22) Precentral (4.18) Frontal_Sup (4.05) Postcentral (3.90) | Temporal_Mid (5.46) Temporal_Inf (4.77) Frontal_Mid (4.52) Precuneus (4.20) Postcentral (3.60) Precentral (3.33) | Frontal_Mid (6.75) Temporal_Mid (5.47) Temporal_Inf (5.06) Frontal_Sup (4.21) Precentral (4.03) Postcentral (3.84) | Frontal_Mid (7.60) Temporal_Mid (5.50) Frontal_Sup (5.08) Precentral (4.67) Temporal_Inf (3.96) Temporal_Sup (3.78) | Frontal_Mid (7.77) Temporal_Mid (5.39) Frontal_Sup (5.09) Precentral (4.70) Temporal_Inf (3.81) Temporal_Sup (3.79) |
| ResNet-18 (Pre-trained) | Frontal_Sup_Medial (7.95) Frontal_Mid (6.74) Frontal_Sup (6.07) Frontal_Inf_Orb (5.32) Cingulum_Ant (4.80) Temporal_Pole_Sup (3.93) | Frontal_Mid (19.16) Frontal_Sup_Medial (12.73) Frontal_Sup (11.83) Frontal_Inf_Orb (6.53) Frontal_Inf_Tri (6.16) Cingulum_Ant (5.93) | Frontal_Sup_Medial (15.72) Frontal_Mid (15.06) Frontal_Sup (12.51) Cingulum_Ant (6.80) Frontal_Inf_Orb (5.30) Frontal_Sup_Orb (4.36) | Frontal_Mid (15.09) Precentral (7.23) Frontal_Sup (7.20) Frontal_Inf_Tri (5.94) Postcentral (5.79) Frontal_Sup_Medial (5.56) | Frontal_Mid (15.75) Frontal_Sup (7.13) Precentral (7.00) Frontal_Inf_Tri (6.01) Postcentral (5.67) Frontal_Sup_Medial (5.55) |
| Model | Grad-CAM (%) | Grad-CAM++ (%) | HiResCAM (%) | Backpropagation (%) | Guided Backpropagation (%) |
|---|---|---|---|---|---|
| DenseNet | Frontal_Mid (6.76) Frontal_Sup (5.25) Cerebelum_Crus1 (4.56) Frontal_Sup_Medial (4.07) Cerebelum_Crus2 (3.96) Temporal_Inf (3.87) | Temporal_Mid (5.20) Precuneus (4.77) Temporal_Inf (4.48) Fusiform (3.76) Postcentral (3.41) Temporal_Sup (3.35) | Frontal_Sup_Medial (5.05) Frontal_Mid (4.81) Cingulum_Mid (4.64) Temporal_Mid (4.54) Cerebelum_Crus1 (4.43) Cerebelum_Crus2 (4.11) | Frontal_Mid (7.42) Temporal_Mid (4.91) Frontal_Sup (4.56) Precentral (3.95) Cingulum_Mid (3.83) Postcentral (3.78) | Frontal_Mid (7.65) Temporal_Mid (4.84) Frontal_Sup (4.39) Precentral (3.99) Postcentral (3.83) Cingulum_Mid (3.77) |
| ResNet-18 (Pre-trained) | Postcentral (7.23) Frontal_Mid (6.99) Precuneus (6.62) Precentral (5.46) Temporal_Mid (5.27) Frontal_Sup (4.10) | Postcentral (7.44) Frontal_Mid (7.33) Precentral (6.35) Precuneus (5.61) Temporal_Mid (5.04) Frontal_Sup (4.19) | Frontal_Mid (9.72) Precentral (7.69) Postcentral (7.39) Frontal_Sup (5.15) Supp_Motor_Area (4.11) Cingulum_Mid (4.03) | Frontal_Mid (9.21) Precentral (6.05) Insula (5.93) Frontal_Inf_Oper (5.42) Frontal_Inf_Tri (4.94) Postcentral (4.61) | Frontal_Mid (8.61) Precentral (6.17) Insula (5.84) Frontal_Inf_Oper (4.99) Frontal_Inf_Tri (4.94) Postcentral (4.81) |