Diagnosing Shortcut Learning in CNN-Based Photovoltaic Fault Recognition from RGB Images: A Multi-Method Explainability Audit
Abstract
1. Introduction
1.1. Objectives of the Study
1.2. Novelty and Contributions
- (1) A multi-method explainability audit for PV fault recognition combining surrogate-based (LIME), perturbation-based (occlusion), and gradient-based (IG) explanations in a unified protocol;
- (2) Quantitative reliability reporting for explanations, including kernel-weighted LIME surrogate fidelity, occlusion-based localization and concentration metrics (IoU@Top10%, entropy, Hoyer sparsity), and IG deletion–insertion faithfulness with a Faithfulness Gap;
- (3) A class-level performance–faithfulness coupling analysis to highlight categories prone to context-driven shortcuts despite high accuracy;
- (4) Practical guidance for dataset curation and model selection in vision-based PV monitoring.
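The occlusion-based localization and concentration metrics named in contribution (2) can be illustrated with a minimal NumPy sketch. The definitions below are common textbook forms and are assumptions, not necessarily the exact formulas used in the study: the reference mask passed to `iou_at_top`, the use of the natural logarithm for entropy, and all function names are hypothetical.

```python
import numpy as np

def top_fraction_mask(sal, fraction=0.10):
    """Boolean mask of the highest-magnitude `fraction` of pixels in a saliency map."""
    mag = np.abs(sal)
    k = max(1, int(round(fraction * mag.size)))
    thresh = np.partition(mag.ravel(), -k)[-k]  # k-th largest value
    return mag >= thresh  # ties at the threshold may admit a few extra pixels

def iou_at_top(sal, ref_mask, fraction=0.10):
    """IoU between the top-`fraction` saliency region and a reference mask (assumed given)."""
    top = top_fraction_mask(sal, fraction)
    ref = np.asarray(ref_mask, dtype=bool)
    union = np.logical_or(top, ref).sum()
    return float(np.logical_and(top, ref).sum() / union) if union else 0.0

def map_entropy(sal, eps=1e-12):
    """Shannon entropy (nats) of the map normalized to a probability distribution."""
    p = np.abs(sal).ravel()
    p = p / (p.sum() + eps)
    return float(-(p * np.log(p + eps)).sum())

def hoyer_sparsity(sal, eps=1e-12):
    """Hoyer sparsity: 0 for a uniform map, 1 for a single non-zero pixel."""
    x = np.abs(sal).ravel()
    n = x.size
    ratio = x.sum() / (np.linalg.norm(x) + eps)
    return float((np.sqrt(n) - ratio) / (np.sqrt(n) - 1))
```

Under these definitions, a diffuse map yields high entropy and low Hoyer sparsity, whereas a map concentrated on a few pixels yields the opposite pattern.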
2. Materials and Methodology
2.1. Dataset and Preprocessing
2.2. Architectural Framework of the Deep Learning Models
3. Explainability Framework
3.1. LIME-Based Explainability for Image Classification Models
Kernel-Weighted R² for LIME Surrogate Fidelity
- (i) the number of explained validation images per model;
- (ii) the mean and the 10th percentile of R², where R² is the kernel-weighted coefficient of determination between the architecture’s response on the perturbations and the surrogate prediction, computed with the LIME locality weights (thus capturing fidelity in the local neighborhood emphasized by LIME);
- (iii) the mean predicted-class probability on the unperturbed images, included to contextualize surrogate fidelity with respect to model confidence;
- (iv) the mean number of superpixels produced by the chosen segmentation settings (quickshift), serving as a proxy for explanation granularity and for the complexity of the surrogate feature space;
- (v) the fraction of a model’s instances falling below the global low-fidelity threshold (defined as the bottom decile of the kernel-weighted R² across all image–model pairs), indicating how often LIME explanations for that architecture enter a regime where linear surrogates are unreliable; and
- (vi) the fraction of instances exceeding the global high-fragmentation threshold (top decile of the superpixel count), indicating how frequently segmentation produces highly fragmented partitions that can destabilize coefficient-based attributions.
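The kernel-weighted coefficient of determination in item (ii) can be sketched as a weighted R² in which both the residual and the total sum of squares use the locality weights. This is a minimal illustration under assumptions: the argument names (`f` for the model response on the perturbations, `g` for the surrogate prediction, `w` for the locality weights) are hypothetical, and the study's exact implementation may differ.

```python
import numpy as np

def kernel_weighted_r2(f, g, w):
    """Weighted R^2 = 1 - sum(w*(f-g)^2) / sum(w*(f-f_bar_w)^2).

    f: model outputs on the perturbed samples
    g: surrogate predictions on the same samples
    w: non-negative locality (kernel) weights, one per sample
    """
    f = np.asarray(f, dtype=float)
    g = np.asarray(g, dtype=float)
    w = np.asarray(w, dtype=float)
    f_bar = np.average(f, weights=w)          # weighted mean of the model response
    ss_res = np.sum(w * (f - g) ** 2)         # weighted residual sum of squares
    ss_tot = np.sum(w * (f - f_bar) ** 2)     # weighted total sum of squares
    if ss_tot == 0:
        return float("nan")                   # constant response: R^2 is undefined
    return float(1.0 - ss_res / ss_tot)
```

Because samples close to the explained image carry larger weights, a surrogate that fits well only far from the instance receives a low score, which is the behavior the locality weighting is meant to enforce.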
3.2. Occlusion Sensitivity Quantitative Analysis
3.3. Integrated Gradients
3.3.1. General Theory of Integrated Gradients
3.3.2. Faithfulness of IG Explanations (Deletion-Insertion)
3.4. XAI Hyperparameter Choices and Rationale
3.4.1. LIME Parameters
3.4.2. OS Parameters
3.4.3. IG Parameters
4. Results
4.1. Performance
4.1.1. Metrics
4.1.2. Cross Validation and Robustness to Partitioning
4.2. LIME Explainability
4.2.1. LIME Quantitative Results
4.2.2. Selection Policy for Representative and Failure-Mode LIME Examples
4.3. Functional Interpretability Through Occlusion Sensitivity
4.3.1. Occlusion Sensitivity Maps
4.3.2. Model-Level Interpretability Metrics
4.3.3. Per-Class Interpretability Analysis
4.4. Integrated Gradients
5. Discussion
5.1. Performance-Interpretability Coupling Across Architectures
5.2. Overall XAI Practical Implications
5.3. Rationale for a Quantitative, Multi-Method XAI Audit
5.4. Scope of Architectural Comparison and Outlook Toward ViT-Based Models
5.5. Mechanistic Interpretation of Shortcut Vulnerability in Clean and Electrical-Damage
5.6. Limitations and Future Work
6. Conclusions
Supplementary Materials
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest

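| Architecture | Mean R² | 10th-percentile R² | Mean predicted probability | Low-fidelity fraction | Mean superpixels | High-fragmentation fraction |
|---|---|---|---|---|---|---|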
| VGG16 | 0.289 | 0.171 | 0.88 | 0.425 | 159 | 0 |
| ResNet50 | 0.4159 | 0.293 | 0.81 | 0.078 | 159 | 0 |
| InceptionV3 | 0.476 | 0.386 | 0.69 | 0 | 279 | 0.51 |
| EfficientNetB0 | 0.558 | 0.457 | 0.67 | 0 | 159 | 0 |
| Baseline_CNN | 0.915 | 0.853 | 0.60 | 0 | 159 | 0 |
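The low-fidelity and high-fragmentation fractions defined in Section 3.1 depend on thresholds pooled over all image–model pairs, so they can only be computed after every model has been explained. The sketch below shows one way to derive them; the function name, the dictionary layout, and the use of strict inequalities are assumptions made for illustration.

```python
import numpy as np

def lime_fidelity_summary(r2_by_model, superpixels_by_model, decile=10):
    """Per-model LIME summary using thresholds pooled across all models.

    r2_by_model / superpixels_by_model: dict mapping a model name to a
    1-D array with one value per explained image (hypothetical layout).
    """
    pooled_r2 = np.concatenate([np.asarray(v, float) for v in r2_by_model.values()])
    pooled_sp = np.concatenate([np.asarray(v, float) for v in superpixels_by_model.values()])
    low_fid_thr = np.percentile(pooled_r2, decile)          # bottom decile of R^2
    high_frag_thr = np.percentile(pooled_sp, 100 - decile)  # top decile of superpixel count

    summary = {}
    for name, r2 in r2_by_model.items():
        r2 = np.asarray(r2, float)
        sp = np.asarray(superpixels_by_model[name], float)
        summary[name] = {
            "n_images": int(r2.size),
            "mean_r2": float(r2.mean()),
            "p10_r2": float(np.percentile(r2, 10)),
            "mean_superpixels": float(sp.mean()),
            "low_fidelity_fraction": float(np.mean(r2 < low_fid_thr)),
            "high_fragmentation_fraction": float(np.mean(sp > high_frag_thr)),
        }
    return summary
```

With equally sized per-model samples, the fractions averaged over models come out close to 0.1 by construction, so a value well above 0.1 for one architecture indicates that it contributes a disproportionate share of the unreliable explanations.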

| Model | Pick | Ground Truth | R² |  | Prob | Predicted | r2_Rank | r2_p |
|---|---|---|---|---|---|---|---|---|
| 1 | B | Clean | 0.974 | 1.01 × 10⁻⁴ | 0.812 | Clean | 2 | 0.993 |
| 1 | W | Snow-covered | 0.772 | 8.06 × 10⁻⁴ | 0.811 | Snow-covered | 139 | 0.014 |
| 2 | B | Clean | 0.727 | 1.28 × 10⁻³ | 0.909 | Clean | 1 | 1.000 |
| 2 | W | Snow-covered | 0.427 | 6.40 × 10⁻⁴ | 0.944 | Snow-covered | 134 | 0.050 |
| 3 | B | Snow-covered | 0.623 | 7.10 × 10⁻³ | 0.862 | Snow-covered | 3 | 0.986 |
| 3 | W | Bird-drop | 0.366 | 9.33 × 10⁻³ | 0.935 | Bird-drop | 135 | 0.043 |
| 4 | B | Clean | 0.620 | 1.04 × 10⁻² | 0.858 | Clean | 3 | 0.986 |
| 4 | W | Snow-covered | 0.185 | 4.75 × 10⁻⁴ | 1.000 | Snow-covered | 140 | 0.007 |
| 5 | B | Bird-drop | 0.492 | 6.27 × 10⁻² | 0.956 | Bird-drop | 1 | 1.000 |
| 5 | W | Snow-covered | 0.101 | 1.14 × 10⁻⁶ | 1.000 | Snow-covered | 141 | 0.000 |

| Model | VGG16 | ResNet50 | InceptionV3 | EfficientNetB0 | Baseline_CNN |
|---|---|---|---|---|---|
| No. of images | 141 | 141 | 141 | 141 | 141 |
| IoU@Top10% | 0.172 ± 0.145 | 0.130 ± 0.114 | 0.111 ± 0.064 | 0.096 ± 0.051 | 0.083 ± 0.030 |
| Entropy | 8.391 ± 2.700 | 9.321 ± 3.157 | 10.321 ± 2.209 | 9.760 ± 0.555 | 9.994 ± 0.368 |
| HoyerSparsity | 0.520 ± 0.277 | 0.183 ± 0.252 | 0.013 ± 0.094 | 0.449 ± 0.146 | 0.385 ± 0.115 |
| PredProb | 0.887 ± 0.165 | 0.804 ± 0.186 | 0.674 ± 0.195 | 0.658 ± 0.197 | 0.550 ± 0.202 |

| Model | Deletion | Insertion | Faithfulness Gap |
|---|---|---|---|
| Baseline_CNN | 0.22594 | 0.23654 | 0.0106 |
| EfficientNetB0 | 0.22772 | 0.24688 | 0.0192 |
| ResNet50 | 0.25782 | 0.27314 | 0.0153 |
| InceptionV3 | 0.24359 | 0.24853 | 0.0049 |
| VGG16 | 0.26182 | 0.26949 | 0.0077 |
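
In the table above, the third numeric column equals the second minus the first (for Baseline_CNN, 0.23654 − 0.22594 ≈ 0.0106), which is consistent with a Faithfulness Gap computed as an insertion score minus a deletion score. The sketch below illustrates a generic deletion–insertion procedure under that assumption; the `predict` callable, the baseline, the step count, and the trapezoidal area are all hypothetical choices, not the settings of the study.

```python
import numpy as np

def _area(y, x):
    """Trapezoidal area under the curve y(x)."""
    y = np.asarray(y, float)
    return float(np.sum((y[1:] + y[:-1]) / 2.0 * np.diff(x)))

def deletion_insertion_gap(predict, x, attribution, baseline, steps=20):
    """Deletion and insertion areas for one input, plus their difference.

    predict:     callable mapping an array shaped like `x` to a scalar class score
    attribution: array shaped like `x`; larger values mean more important features
    baseline:    array shaped like `x` holding the "removed" feature values
    """
    x_flat = np.asarray(x, float).ravel()
    b_flat = np.asarray(baseline, float).ravel()
    order = np.argsort(-np.asarray(attribution, float).ravel(), kind="stable")
    cuts = np.linspace(0, order.size, steps + 1).astype(int)

    del_scores, ins_scores = [], []
    for c in cuts:
        idx = order[:c]                       # the c most important features
        deleted = x_flat.copy()
        deleted[idx] = b_flat[idx]            # deletion: remove them from the input
        inserted = b_flat.copy()
        inserted[idx] = x_flat[idx]           # insertion: add them to the baseline
        del_scores.append(predict(deleted.reshape(np.shape(x))))
        ins_scores.append(predict(inserted.reshape(np.shape(x))))

    frac = cuts / order.size
    del_area, ins_area = _area(del_scores, frac), _area(ins_scores, frac)
    return {"deletion": del_area, "insertion": ins_area, "gap": ins_area - del_area}
```

A faithful attribution makes the score fall quickly under deletion and rise quickly under insertion, giving a clearly positive gap; a gap near zero suggests that the ranking is not much more informative than an arbitrary ordering of the features.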
© 2026 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Diaconu, B.M. Diagnosing Shortcut Learning in CNN-Based Photovoltaic Fault Recognition from RGB Images: A Multi-Method Explainability Audit. AI 2026, 7, 94. https://doi.org/10.3390/ai7030094
