Classifying XAI Methods to Resolve Conceptual Ambiguity
Abstract
1. Introduction
2. Definitions and Concepts
3. Common Approaches for Understanding the Decision Making of Machine Learning Models
3.1. Interpretable Models
- Shallow Rule-Based Models: These models use simple logical rules for classification. For instance, a two-layer model was proposed in which each neuron represents a simple rule, and the output is a disjunction of these rules, thereby facilitating interpretation [30]. Classifiers based on association rules were also introduced to ensure efficiency with sparse data [3,31].
- Linear Models: These models express predictions as weighted sums of the input features, which makes their coefficients directly readable; extensions can also capture interactions between features. A hybrid model combining rough sets and a generalized linear model was developed to enhance interaction detection and feature selection [32].
- Decision Trees: These models are highly interpretable due to their hierarchical structure, which explicitly displays decision rules and feature importance. Features closer to the root have a stronger influence on predictions.
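To make this concrete, the following minimal sketch (an illustration added here, not part of the original experiments; it assumes scikit-learn and uses the Breast Cancer dataset studied later in the paper) trains a shallow decision tree and prints its explicit decision rules and feature importances.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_breast_cancer()
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(data.data, data.target)

# The hierarchical structure exposes the decision rules explicitly.
print(export_text(tree, feature_names=list(data.feature_names)))

# Features used near the root tend to carry the largest importance scores.
ranked = sorted(zip(data.feature_names, tree.feature_importances_),
                key=lambda pair: pair[1], reverse=True)
for name, score in ranked[:5]:
    print(f"{name}: {score:.3f}")
```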
3.2. XAI Approaches
- Feature Contribution Analysis: This approach aims to understand and illustrate the impact of each feature on a model’s predictions. It relies on techniques such as SHAP (SHapley Additive exPlanations) values, which quantify and visualize the individual contribution of each variable to a given prediction using summary plots, dependence plots, or heatmaps based on SHAP values [29]. Dependence plots, for instance, show how predictions change depending on the variation of one or more features, thus helping identify feature interactions [33]. Furthermore, permutation importance scores are used to evaluate the overall influence of features on the model by identifying the attributes with the most significant impact on predictions [34]. These tools are essential for interpreting and explaining model decisions by making the relationships between input data and outputs more transparent (a short sketch of these tools is given after this list).
- Surrogate Models: The simplification of complex models, also known as surrogate modeling, involves replacing a complex and opaque model with a simpler one that mimics its behavior. The goal is to enhance explainability while maintaining satisfactory accuracy. For example, model distillation trains a simplified model from the predictions of a complex model, so that for any data input X, the simplified model approximates the behavior of the original model [35]. Surrogate models can be applied globally, replicating the predictions across the entire input space, or locally, focusing on explaining a specific prediction. A global surrogate offers interpretability across the full domain, while a local surrogate provides a targeted explanation for a specific decision. This approach reconciles the performance of advanced models with interpretability requirements, thereby enhancing user understanding and trust (a global-surrogate sketch is also given after this list).
- Post hoc and Agnostic Explainability Methods: These methods are applied after training complex models, without requiring access to their internal structure, to interpret predictions and improve model transparency. Among the most common techniques, LIME (Local Interpretable Model-agnostic Explanations) builds a simple, interpretable local model around a specific prediction, highlighting the most influential features behind the decision [15]. SHAP, on the other hand, is based on Shapley values from cooperative game theory [36], which fairly distribute the total gain (i.e., prediction) among all features by considering all possible coalitions of features. Each feature’s contribution is computed as its average marginal effect across all possible subsets of features [37]. This ensures consistency and additivity, providing both local and global explanations of feature importance. These agnostic techniques can be applied to any machine learning model and are essential tools for fostering the trust, understanding, and validation of AI systems.
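As a first sketch of the feature-contribution tools described above, the code below (an assumed setup, not the paper's protocol) fits a random forest on the Breast Cancer dataset, computes mean absolute SHAP values with the shap library, and compares them with permutation importance scores from scikit-learn.

```python
import numpy as np
import shap
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

data = load_breast_cancer()
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(data.data, data.target)

# SHAP values: per-instance contribution of each feature to the prediction.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(data.data)
# Depending on the SHAP version, the result is a list (one array per class)
# or a single 3-D array; keep the contributions toward the positive class.
sv = shap_values[1] if isinstance(shap_values, list) else shap_values[..., 1]
mean_abs_shap = np.abs(sv).mean(axis=0)

# Permutation importance: global influence measured as the score drop
# when a feature's values are randomly shuffled.
perm = permutation_importance(model, data.data, data.target, n_repeats=10, random_state=0)

for i in np.argsort(mean_abs_shap)[::-1][:5]:
    print(f"{data.feature_names[i]}: mean |SHAP| = {mean_abs_shap[i]:.3f}, "
          f"permutation importance = {perm.importances_mean[i]:.3f}")
```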
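The surrogate-model idea can be sketched as follows (again an assumed setup): a shallow decision tree is trained on the predictions of an opaque random forest, yielding a global surrogate whose fidelity to the black box can be measured directly.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# Opaque "teacher" model.
black_box = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Global surrogate: a shallow tree trained on the teacher's predictions,
# so that for any input it approximates the black box's behavior.
y_bb = black_box.predict(X)
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y_bb)

# Fidelity: how closely the surrogate reproduces the black-box predictions.
print("fidelity:", accuracy_score(y_bb, surrogate.predict(X)))
```

A local surrogate would instead be fit only on perturbed samples around the instance of interest, which is essentially what LIME does.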
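To illustrate the agnostic, post hoc case, the sketch below applies Kernel SHAP, which only needs the prediction function and a background sample; the SVM model and dataset are assumptions made purely for illustration.

```python
import shap
from sklearn.datasets import load_breast_cancer
from sklearn.svm import SVC

data = load_breast_cancer()
# Any black box works: Kernel SHAP only queries the prediction function.
model = SVC(probability=True, random_state=0).fit(data.data, data.target)

background = shap.kmeans(data.data, 10)  # summarized background distribution
explainer = shap.KernelExplainer(model.predict_proba, background)

# Explain a single prediction by approximating Shapley values over feature coalitions.
sv = explainer.shap_values(data.data[:1], nsamples=200)
# Depending on the SHAP version, sv is a list (one array per class) or a 3-D array.
contrib = sv[1][0] if isinstance(sv, list) else sv[0, :, 1]
for name, value in sorted(zip(data.feature_names, contrib),
                          key=lambda p: abs(p[1]), reverse=True)[:5]:
    print(f"{name}: {value:+.3f}")
```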
4. Classification of Methods in Explainable AI
4.1. Limitations of Existing Classifications
4.2. Toward a Formal Classification of Explainability and Interpretability Methods
5. Application to Classical Machine Learning Models
5.1. Results with Decision Tree
5.2. Results with Linear Regression
- K-Lasso linear regression with K = 5, selecting the five most influential features.
- Weights π_x(z) = exp(−D(x, z)²/σ²) (with D being the Euclidean distance and σ the kernel width).
- N perturbed samples, with N adapted to the size of the Breast Cancer dataset.
- The representation-mapping and sampling functions as defined by Ribeiro: the former preserves the original representation, and the latter randomly selects subsets of attributes (a sketch of this configuration is given after this list).
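A hypothetical sketch of this configuration is given below. It assumes the lime package, standardized features, and a logistic regression classifier standing in for the study's LR model; the kernel width and the number of perturbed samples are placeholder values, since their exact settings are not restated here.

```python
import numpy as np
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

data = load_breast_cancer()
X = StandardScaler().fit_transform(data.data)
model = LogisticRegression(max_iter=5000).fit(X, data.target)

explainer = LimeTabularExplainer(
    X,
    feature_names=list(data.feature_names),
    mode="classification",
    kernel_width=np.sqrt(X.shape[1]) * 0.75,  # exponential kernel on Euclidean distance (assumed width)
    feature_selection="lasso_path",           # K-Lasso feature selection
)
exp = explainer.explain_instance(
    X[0], model.predict_proba,
    num_features=5,      # K = 5 most influential features
    num_samples=5000,    # N perturbed samples (assumed value)
)
print(exp.as_list())     # local weights of the five selected features
```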
5.3. Results with Random Forest
5.4. Cross-Evaluation of Interpretability and Explainability of the Studied Models
6. Limitations and Future Work
6.1. Summary of the Contributions of the Proposed Classification
6.2. Limitations of the Proposed Classification
6.3. Limitations of XAI in the Literature
6.4. Current Challenges and Open Issues
6.5. Societal and Ethical Impacts
6.6. Research Perspectives
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Haar, L.V.; Elvira, T.; Ochoa, O. An analysis of explainability methods for convolutional neural networks. Eng. Appl. Artif. Intell. 2023, 117, 105606. [Google Scholar] [CrossRef]
- Esterhuizen, J.A.; Goldsmith, B.R.; Linic, S. Interpretable machine learning for knowledge generation in heterogeneous catalysis. Nat. Catal. 2022, 5, 175–184. [Google Scholar] [CrossRef]
- Rudin, C.; Chen, C.; Chen, Z.; Huang, H.; Semenova, L.; Zhong, C. Interpretable machine learning: Fundamental principles and 10 grand challenges. Stat. Surv. 2022, 16, 1–85. [Google Scholar] [CrossRef]
- Zhang, Y.; Tiňo, P.; Leonardis, A.; Tang, K. A survey on neural network interpretability. IEEE Trans. Emerg. Top. Comput. Intell. 2021, 5, 726–742. [Google Scholar] [CrossRef]
- Risch, J.; Ruff, R.; Krestel, R. Explaining offensive language detection. J. Lang. Technol. Comput. Linguist. (JLCL) 2020, 34, 1–19. [Google Scholar] [CrossRef]
- Guidotti, R.; Monreale, A.; Ruggieri, S.; Turini, F.; Giannotti, F.; Pedreschi, D. A survey of methods for explaining black box models. ACM Comput. Surv. 2018, 51, 639–662. [Google Scholar] [CrossRef]
- Dib, L. Formal Definition of Interpretability and Explainability in XAI. In Intelligent Systems And Applications: Proceedings of The 2024 Intelligent Systems Conference (IntelliSys); Springer: Cham, Switzerland, 2024; Volume 3, p. 133. [Google Scholar]
- Carvalho, D.V.; Pereira, E.M.; Cardoso, J.S. Machine learning interpretability: A survey on methods and metrics. Electronics 2019, 8, 832. [Google Scholar] [CrossRef]
- Chuang, J.; Ramage, D.; Manning, C.; Heer, J. Interpretation and trust: Designing model-driven visualizations for text analysis. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Austin, TX, USA, 5–10 May 2012; pp. 443–452. [Google Scholar]
- Doshi-Velez, F.; Kim, B. Towards a Rigorous Science of Interpretable Machine Learning. arXiv 2017, arXiv:1702.08608. [Google Scholar] [CrossRef]
- Miller, T. Explanation in Artificial Intelligence: Insights from the Social Sciences. Artif. Intell. 2019, 267, 1–38. [Google Scholar] [CrossRef]
- Lipton, Z.C. The mythos of model interpretability: In machine learning, the concept of interpretability is both important and slippery. Queue 2018, 16, 31–57. [Google Scholar] [CrossRef]
- Gilpin, L.; Bau, D.; Yuan, B.; Bajwa, A.; Specter, M.; Kagal, L. Explaining explanations: An overview of interpretability of machine learning. In Proceedings of the 2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA), Turin, Italy, 1–4 October 2018; pp. 80–89. [Google Scholar]
- Buhrmester, V.; Münch, D.; Arens, M. Analysis of explainers of black box deep neural networks for computer vision: A survey. Mach. Learn. Knowl. Extr. 2021, 3, 966–989. [Google Scholar] [CrossRef]
- Ribeiro, M.; Singh, S.; Guestrin, C. “Why should i trust you?” Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 1135–1144. [Google Scholar]
- Došilović, F.; Brčić, M.; Hlupić, N. Explainable artificial intelligence: A survey. In Proceedings of the 2018 41st International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO), Opatija, Croatia, 21–25 May 2018; pp. 0210–0215. [Google Scholar]
- Monroe, D. AI, explain yourself. Commun. ACM. 2018, 61, 11–13. [Google Scholar] [CrossRef]
- Montavon, G.; Lapuschkin, S.; Binder, A.; Samek, W.; Müller, K.R. Explaining nonlinear classification decisions with deep taylor decomposition. Pattern Recognit. 2017, 65, 211–222. [Google Scholar] [CrossRef]
- Mohseni, S.; Zarei, N.; Ragan, E. A survey of evaluation methods and measures for interpretable machine learning. arXiv 2018, arXiv:1811.11839. [Google Scholar]
- Murdoch, W.; Singh, C.; Kumbier, K.; Abbasi-Asl, R.; Yu, B. Interpretable machine learning: Definitions, methods, and applications. arXiv 2019, arXiv:1901.04592. [Google Scholar] [CrossRef]
- Piltaver, R.; Luštrek, M.; Gams, M.; Martinčić-Ipšić, S. What makes classification trees comprehensible? Expert Syst. Appl. 2016, 62, 333–346. [Google Scholar] [CrossRef]
- Zhou, Z. Comprehensibility of data mining algorithms. In Encyclopedia of Data Warehousing and Mining; IGI Global Scientific Publishing: Hershey, PA, USA, 2005; pp. 190–195. [Google Scholar]
- Lou, Y.; Caruana, R.; Gehrke, J. Intelligible models for classification and regression. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Beijing, China, 12–16 August 2012; pp. 150–158. [Google Scholar]
- Chiticariu, L.; Li, Y.; Reiss, F. Transparent machine learning for information extraction: State-of-the-art and the future (tutorial). In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, Lisbon, Portugal, 17–21 September 2015; pp. 4–6. [Google Scholar]
- Arrieta, A.; Díaz-Rodríguez, N.; Del Ser, J.; Bennetot, A.; Tabik, S.; Barbado, A.; García, S.; Gil-López, S.; Molina, D.; Benjamins, R.; et al. Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI. Inf. Fusion 2020, 58, 82–115. [Google Scholar] [CrossRef]
- Fails, J.; Olsen, D., Jr. Interactive machine learning. In Proceedings of the 8th International Conference on Intelligent User Interfaces, Miami, FL, USA, 12–15 January 2003; pp. 39–45. [Google Scholar]
- Kulesza, T.; Burnett, M.; Wong, W.; Stumpf, S. Principles of explanatory debugging to personalize interactive machine learning. In Proceedings of the 20th International Conference on Intelligent User Interfaces, Atlanta, GA, USA, 29 March–1 April 2015; pp. 126–137. [Google Scholar]
- Holzinger, A.; Plass, M.; Holzinger, K.; Crişan, G.; Pintea, C.; Palade, V. Towards interactive Machine Learning (iML): Applying ant colony algorithms to solve the traveling salesman problem with the human-in-the-loop approach. In Proceedings of the International Conference on Availability, Reliability, and Security, Salzburg, Austria, 31 August–2 September 2016; pp. 81–95. [Google Scholar]
- Lundberg, S.; Lee, S. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 2017, 30, 4768–4777. [Google Scholar]
- Gacto, M.J.; Alcalá, R.; Herrera, F. Interpretability of linguistic fuzzy rule-based systems: An overview of interpretability measures. Inf. Sci. 2011, 181, 4340–4360. [Google Scholar] [CrossRef]
- Rudin, C.; Letham, B.; Madigan, D. Learning theory analysis for association rules and sequential event prediction. J. Mach. Learn. Res. 2013, 14, 3441–3492. [Google Scholar]
- Kega, I.; Nderu, L.; Mwangi, R.; Njagi, D. Model interpretability via interaction feature detection using rough sets in a generalized linear model for weather prediction in Kenya. Authorea Preprints 2023. [Google Scholar] [CrossRef]
- Caruana, R.; Lou, Y.; Gehrke, J.; Koch, P.; Sturm, M.; Elhadad, N. Intelligible models for healthcare: Predicting pneumonia risk and hospital 30-day readmission. In Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Sydney, NSW, Australia, 10–13 August 2015; pp. 1721–1730. [Google Scholar]
- Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef]
- Müller, R.; Kornblith, S.; Hinton, G. Subclass distillation. arXiv 2020, arXiv:2002.03936. [Google Scholar] [CrossRef]
- Shapley, L. A value for n-person games. Contrib. Theory Games 1953, 2, 307–317. [Google Scholar]
- Dave, D.; Naik, H.; Singhal, S.; Patel, P. Explainable ai meets healthcare: A study on heart disease dataset. arXiv 2020, arXiv:2011.03195. [Google Scholar] [CrossRef]
- Lipton, Z.C.; Kale, D.C.; Elkan, C.; Wetzel, R. Learning to diagnose with LSTM recurrent neural networks. arXiv 2015, arXiv:1511.03677. [Google Scholar]
- Yang, G.; Ye, Q.; Xia, J. Unbox the black-box for the medical explainable AI via multi-modal and multi-centre data fusion: A mini-review, two showcases and beyond. Inf. Fusion 2022, 77, 29–52. [Google Scholar] [CrossRef]
- Jeyasothy, A. Génération D’Explications Post-Hoc Personnalisées; Sorbonne Université: Paris, France, 2024. [Google Scholar]
- Molnar, C.; Casalicchio, G.; Bischl, B. Interpretable machine learning–A brief history, state-of-the-art and challenges. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases; Springer International Publishing: Cham, Switzerland, 2020; pp. 417–431. [Google Scholar]
- Kim, B.; Wattenberg, M.; Gilmer, J.; Cai, C.; Wexler, J.; Viegas, F.; et al. Interpretability beyond feature attribution: Quantitative testing with Concept Activation Vectors (TCAV). In Proceedings of the 35th International Conference on Machine Learning, Stockholm, Sweden, 10–15 July 2018; pp. 2668–2677. [Google Scholar]
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.; Kaiser, Ł; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30. [Google Scholar]
- Ribeiro, M.; Singh, S.; Guestrin, C. Anchors: High-precision model-agnostic explanations. In Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018; Volume 32. [Google Scholar]
- Du, Y.; Rafferty, A.; McAuliffe, F.; Wei, L.; Mooney, C. An explainable machine learning-based clinical decision support system for prediction of gestational diabetes mellitus. Sci. Rep. 2022, 12, 1170. [Google Scholar] [CrossRef]
- Freitas, A. Comprehensible classification models: A position paper. ACM SIGKDD Explor. Newsl. 2014, 15, 1–10. [Google Scholar] [CrossRef]
- Rudin, C.; Radin, J. Why are we using black box models in AI when we don’t need to? A lesson from an explainable AI competition. Harv. Data Sci. Rev. 2019, 1, 1–9. [Google Scholar] [CrossRef]
- Craven, M.; Shavlik, J. Extracting tree-structured representations of trained networks. Adv. Neural Inf. Process. Syst. 1995, 8, 24–30. [Google Scholar]
- Giudici, P.; Raffinetti, E. Shapley-Lorenz eXplainable artificial intelligence. Expert Syst. Appl. 2021, 167, 114104. [Google Scholar] [CrossRef]
- Ho, L.V.; Aczon, M.; Ledbetter, D.; Wetzel, R. Interpreting a recurrent neural network’s predictions of ICU mortality risk. J. Biomed. Inform. 2021, 114, 103672. [Google Scholar] [CrossRef]
- Antwarg, L.; Miller, R.M.; Shapira, B.; Rokach, L. Explaining anomalies detected by autoencoders using Shapley Additive Explanations. Expert Syst. Appl. 2021, 186, 115736. [Google Scholar] [CrossRef]
- Dandolo, D.; Masiero, C.; Carletti, M.; Dalle Pezze, D.; Susto, G.A. AcME—Accelerated model-agnostic explanations: Fast whitening of the machine-learning black box. Expert Syst. Appl. 2023, 214, 119115. [Google Scholar] [CrossRef]
- Choi, E.; Bahadori, M.; Kulas, J.; Schuetz, A.; Stewart, W.; Sun, J. RETAIN: An interpretable predictive model for healthcare using reverse time attention mechanism. Adv. Neural Inf. Process. Syst. (NeurIPS) 2016, 29, 3504–3512. [Google Scholar] [CrossRef]
- Samek, W.; Wiegand, T.; Müller, K. Explainable artificial intelligence: Understanding, visualizing and interpreting deep learning models. arXiv 2017, arXiv:1708.08296. [Google Scholar] [CrossRef]
- Serradilla, O.; Zugasti, E.; Cernuda, C.; Aranburu, A.; Okariz, J.; Zurutuza, U. Interpreting remaining useful life estimations combining explainable artificial intelligence and domain knowledge in industrial machinery. In Proceedings of the 2020 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), Glasgow, UK, 19–24 July 2020; pp. 1–8. [Google Scholar]
- Chen, L.; Lou, S.; Zhang, K.; Huang, J.; Zhang, Q. Harsanyinet: Computing accurate shapley values in a single forward propagation. arXiv 2023, arXiv:2304.01811. [Google Scholar] [CrossRef]
- Alvarez-Melis, D.; Jaakkola, T. On the Robustness of Interpretability Methods. arXiv 2018, arXiv:1806.08049. [Google Scholar] [CrossRef]
- Hinton, G.; Vinyals, O.; Dean, J. Distilling the knowledge in a neural network. arXiv 2015, arXiv:1503.02531. [Google Scholar] [CrossRef]
- Mehrabi, N.; Morstatter, F.; Saxena, N.; Lerman, K.; Galstyan, A. A survey on bias and fairness in machine learning. ACM Comput. Surv. (CSUR). 2021, 54, 1–35. [Google Scholar] [CrossRef]
Dimension | Interpretability | Explainability |
---|---|---|
Local | Direct understanding of a decision through a simple transparent model (e.g., shallow decision tree). | Justifies a specific prediction via feature contributions (e.g., LIME, SHAP). |
Global | Understanding of the model’s overall behavior through its structure (e.g., linear regression). | Provides a global view of the model by analyzing feature attributions (e.g., DeepRED, Global Surrogate). |
Intrinsic | Models that are naturally understandable without external explanation (e.g., shallow decision trees, simple linear models). | Models whose decisions are inherently understandable through their internal structure (e.g., decision trees, rule-based systems). |
Extrinsic | Opaque models interpreted using global post hoc methods (e.g., rule aggregation, global approximations). | Local or global post hoc methods applied to black-box models (e.g., LIME, SHAP, ANCHOR). |
Specific | Utilization of the model’s internal structure to understand its mechanisms. Examples: Grad-CAM, LRP. | Methods designed for specific architectures that leverage model transparency. Examples: decision trees, Gini importance, fuzzy logic. |
Agnostic | Understanding without access to internal mechanisms. Examples: LIME, Anchors. | Post hoc approaches independent of model internals, applicable to any model. Examples: SHAP, Kernel SHAP, AcME. |
Method | Local/Global | Model Dependence | Intrinsic/Extrinsic |
---|---|---|---|
LIME | Local | Agnostic | Extrinsic |
SHAP (TreeExplainer) | Local/Global | Specific | Extrinsic |
SHAP (Kernel) | Local/Global | Agnostic | Extrinsic |
Grad-CAM | Local | Specific | Extrinsic |
Feature Importance (Gini) | Global | Specific | Intrinsic |
Coefficients (Logistic Regression) | Global | Specific | Intrinsic |
Decision Paths (Decision Tree) | Local | Specific | Intrinsic |
Anchors | Local | Agnostic | Extrinsic |
Attention Mechanisms | Local/Global | Specific | Intrinsic |
TCAV (Concept-based explanations) | Global | Specific | Extrinsic |
ID | Attribute | LR Coefficient | DT Importance | RF Importance |
---|---|---|---|---|
21 | texture_worst | 1.342 | 0.059 | 0.027 |
10 | radius_se | 1.271 | 0.000 | 0.008 |
28 | symmetry_worst | 1.208 | 0.000 | 0.008 |
7 | concave points_mean | 1.117 | 0.000 | 0.102 |
27 | concave points_worst | 0.945 | 0.000 | 0.088 |
13 | area_se | 0.911 | 0.000 | 0.014 |
20 | radius_worst | 0.878 | 0.000 | 0.121 |
23 | area_worst | 0.846 | 0.000 | 0.113 |
6 | concavity_mean | 0.796 | 0.796 | 0.087 |
26 | concavity_worst | 0.773 | 0.075 | 0.043 |
12 | perimeter_se | 0.609 | 0.005 | 0.005 |
22 | perimeter_worst | 0.587 | 0.000 | 0.131 |
24 | smoothness_worst | 0.551 | 0.000 | 0.012 |
3 | area_mean | 0.464 | 0.000 | 0.013 |
0 | radius_mean | 0.427 | 0.000 | 0.026 |
1 | texture_mean | 0.393 | 0.000 | 0.010 |
2 | perimeter_mean | 0.389 | 0.000 | 0.081 |
17 | concave points_se | 0.317 | 0.000 | 0.005 |
14 | smoothness_se | 0.312 | 0.000 | 0.004 |
29 | fractal_dimension_worst | 0.154 | 0.000 | 0.003 |
4 | smoothness_mean | 0.066 | 0.000 | 0.007 |
25 | compactness_worst | −0.005 | 0.000 | 0.023 |
9 | fractal_dimension_mean | −0.076 | 0.000 | 0.004 |
16 | concavity_se | −0.181 | 0.000 | 0.016 |
11 | texture_se | −0.188 | 0.000 | 0.006 |
8 | symmetry_mean | −0.235 | 0.003 | 0.004 |
18 | symmetry_se | −0.499 | 0.000 | 0.002 |
5 | compactness_mean | −0.542 | 0.000 | 0.022 |
19 | fractal_dimension_se | −0.613 | 0.060 | 0.006 |
15 | compactness_se | −0.685 | 0.000 | 0.004 |
ID | Variable | LIME Coefficient | Rank |
---|---|---|---|
10 | radius_error | 0.24 | 7 |
7 | concave_points_mean | 0.21 | 4 |
23 | area_error | 0.17 | 8 |
6 | concavity_mean | 0.16 | 9 |
25 | compactness_error | −0.14 | 22 |
Model | Interpretability: Local | Interpretability: Global | Interpretability: Intrinsic | Explainability: Local (Agnostic/Specific) | Explainability: Global (Agnostic/Specific) | Explainability: Extrinsic |
---|---|---|---|---|---|---|
LR | ✓ Fixed coefficients per instance | ✓ Coefficients of the variables | ✓ Linear structure | ✓ LIME (agnostic) possible but of limited value | ✗ Not needed (analysis via coefficients) | ✗ No post hoc method required |
DT | ✓ A single decision path | ✓ Variable importance | ✓ Explicit rules | ✓ LIME possible (agnostic), rarely necessary | ✗ Few external explanations used | ✗ Few external adjustments |
RF | ✗ Not directly accessible | ✗ Not directly accessible | ✗ Too complex | ✓ LIME/SHAP (agnostic), treeinterpreter (specific) | ✓ SHAP (agnostic), Gini importance (specific) | ✓ Requires post hoc methods |
Criterion | Interpretability | Explainability | Critical Analysis |
---|---|---|---|
Local | Instance-level understanding | Black-box analysis | Local interpretability focuses on individual prediction analysis (e.g., LIME, Anchors), whereas local explainability aims to explain a black box, often via post hoc models (e.g., SHAP, LIME). |
 | Enables explanation of an individual decision. | Uses external models to justify a decision. | Local explanations rely on auxiliary models (e.g., LIME, SHAP) to approximately interpret a given prediction. |
Global | Overview perspective | General justification | Global interpretability seeks to understand the entire model, as achieved with methods like SHAP or decision trees. Global explainability aims to extract rules and trends from a complex model (e.g., DeepRED, GIRP). |
 | Explains the overall functioning of the model. | Identifies patterns and trends explaining decisions. | DeepRED and GIRP are suitable for neural networks to produce global explanations. |
Intrinsic | Transparent model | Understandable model | Intrinsically interpretable models (e.g., trees, linear regressions) are also explainable as their rules are directly accessible and do not require post hoc methods. |
 | No need for post hoc explanation. | The model’s rules are directly interpretable. | These models offer immediate transparency, unlike black-box models. |
Extrinsic | No guaranteed transparency | Requires approximate explanations | Complex models (e.g., neural networks, SVM, XGBoost) have low transparency and require post hoc methods like SHAP or LIME. |
 | Depends on the model and its structure. | Uses approximation techniques. | Extrinsic explainability relies on approximations (e.g., SHAP, LIME) or locally aggregated methods. |
Specific | Internal analysis | Tailored methods | These methods are specific to certain model families: Gini importance for trees, Grad-CAM and LRP for neural networks. |
 | Applies only to specific model types. | Used to visualize decisions in neural networks. | Unlike agnostic methods, these cannot be generalized to all models. |
Agnostic | Model-independent | General explanations | These methods work independently of the underlying model structure (e.g., LIME, SHAP, Anchors). |
 | Variable applicability. | Works on various models without specific constraints. | Kernel SHAP is more computationally expensive than LIME, which assumes local linearity. |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).