Interpretable Diagnostics with SHAP-Rule: Fuzzy Linguistic Explanations from SHAP Values
Abstract
1. Introduction
- A novel XAI method, termed SHAP-Rule, is proposed to automatically convert numeric SHAP values into interpretable fuzzy linguistic rules with quantified activation strength. Its novelty lies in the automated generation of linguistic rules directly from feature attributions without manual definition, ensuring consistent interpretability for any model type.
- A rigorous comparative analysis is conducted, benchmarking SHAP-Rule against standard numeric SHAP and AnchorTabular explanations across multiple datasets.
- The methodological innovation lies in bridging quantitative feature attributions and symbolic fuzzy representation through a unified, computationally efficient pipeline that automatically converts SHAP outputs into interpretable linguistic rules.
2. Related Works
2.1. Local Rule-Based Explanation Methods
2.2. Fuzzy Logic and Linguistic Explanations in XAI
2.3. Identified Gaps of Existing Studies
- Current rule-based explanation methods such as Anchors, RuleFit, and TREPAN typically rely on crisp rules and thresholds, which struggle to capture continuous feature variations and thus compromise interpretability in diagnostic settings.
- Numeric feature attribution methods such as SHAP provide accurate quantitative explanations, but their purely numeric outputs remain difficult for domain experts to interpret quickly, which matters most in time-critical scenarios.
- Fuzzy logic-based systems traditionally demand significant expert input for rule creation, hindering their scalability and practicality in complex, high-dimensional diagnostic tasks.
3. Methodology
3.1. Overview of SHAP-Rule Method
- (Preparatory stage.) Create a number of linguistic variables for each feature.
- Compute numeric SHAP feature importance values using a trained black-box model for a processed input sample.
- Normalize the SHAP values and select the top 3–5 features with the greatest absolute values of SHAP-based feature importance.
- Determine linguistic fuzzy terms for each feature of the processed input sample.
- Generate fuzzy IF–THEN rules using the selected top SHAP features and linguistic fuzzy terms.
3.2. Automatic Generation of Fuzzy Terms via Arcsinh-Based Quartile Partitioning
- The “average” term (centered at the median) is always included.
- For each index to the left of the median (i = 2, 1, 0), the term with index i is included only if Bi < Bi+1; if two adjacent boundaries coincide, the loop stops and no further left-side terms are added.
- Boundaries to the right of the median are processed analogously (see the sketch below).
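To make the partitioning concrete, the following is a minimal sketch of how such term boundaries could be derived, assuming quartiles of the arcsinh-transformed training values with Tukey fences as outer boundaries and midpoints as intermediate ones; the candidate grid, the labels, and the function name `term_boundaries` are illustrative assumptions rather than the exact construction used in the paper.

```python
import numpy as np

def term_boundaries(train_values):
    """Illustrative boundary construction: arcsinh transform, quartiles,
    and Tukey fences, with redundant (coinciding) boundaries pruned as
    described above. The candidate grid and labels are assumptions."""
    z = np.arcsinh(np.asarray(train_values, dtype=float))
    q1, q2, q3 = np.percentile(z, [25, 50, 75])
    iqr = q3 - q1
    # Candidate boundaries B0..B6: lower Tukey fence, Q1, mid(Q1, median),
    # median, mid(median, Q3), Q3, upper Tukey fence.
    b = [q1 - 1.5 * iqr, q1, (q1 + q2) / 2, q2, (q2 + q3) / 2, q3, q3 + 1.5 * iqr]

    labels = ["very low", "low", "below average", "average",
              "above average", "high", "very high"]
    terms = {"average": q2}                 # the median term is always kept
    for i in range(2, -1, -1):              # left of the median: i = 2, 1, 0
        if b[i] < b[i + 1]:
            terms[labels[i]] = b[i]
        else:                               # coinciding boundaries: stop
            break
    for i in range(4, 7):                   # right of the median, symmetrically
        if b[i] > b[i - 1]:
            terms[labels[i]] = b[i]
        else:
            break
    return terms                            # term label -> boundary in arcsinh space
```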
3.3. Fuzzy Membership Function Construction
3.4. Generation of Fuzzy IF–THEN Rules
- Sort the features by descending |wj|.
- Find the smallest m such that the cumulative sum of the top-m normalized importances reaches the threshold T (|w(1)| + … + |w(m)| ≥ T).
- Take the q = min(m, K) top features (a selection sketch is given below).
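A minimal sketch of this selection rule, assuming the normalized importances wj are the absolute SHAP values scaled to sum to one, with illustrative defaults T = 0.8 and K = 5:

```python
import numpy as np

def select_top_features(shap_values, T=0.8, K=5):
    """Pick the smallest set of features whose cumulative normalized
    importance reaches T, capped at K features (T and K are illustrative)."""
    w = np.abs(np.asarray(shap_values, dtype=float))
    w = w / w.sum()                               # normalized importances w_j
    order = np.argsort(w)[::-1]                   # sort by descending |w_j|
    cumulative = np.cumsum(w[order])
    m = int(np.searchsorted(cumulative, T) + 1)   # smallest m with sum >= T
    q = min(m, K)
    return order[:q]                              # indices of the q selected features
```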
3.5. Activation Strength of Rules
3.6. SHAP-Rule Algorithm (Pseudocode)
| Algorithm 1. SHAP-Rule Algorithm | |
| Input: Trained model M, set of fuzzy membership functions MF, instance x, number of top features q and threshold T. | |
| Output: Fuzzy IF–THEN rule | |
| Begin SHAP-Rule Algorithm | |
| 1 | Compute SHAP values φ = SHAP(M, x) |
| 2 | Normalize SHAP values wj. |
| 3 | Select top q features based on normalized SHAP values. |
| 4 | Compute membership degrees for each feature i in top features: MF(i, xi). |
| 5 | Formulate fuzzy rule: (feature1 is term1 AND … AND featureq is termq) : M(x). |
| 6 | Calculate activation strength α. |
| End SHAP-Rule Algorithm | |
- (1) Start with a trained model and a specific instance to explain. Compute SHAP contributions for that instance, i.e., how strongly each feature influenced the model’s decision.
- (2) Normalize the SHAP values so that the next step (top-feature selection) can be applied.
- (3) To keep the explanation short, preserve only the most influential features: sort features by contribution and take them from the top until a cumulative importance threshold T is reached, capping the total number at K. This guarantees a compact, readable rule.
- (4) Convert each selected feature’s numeric value into a human-friendly term (e.g., “very low,” “below average,” “average,” “above average,” “high,” “very high”). These terms are created automatically from the training data using a robust transformation and quartile-based boundaries; triangular membership functions define how strongly the value matches the term.
- (5) Assemble everything into a single IF–THEN sentence: “IF (feature1 is term1) AND (feature2 is term2) … THEN class = model’s prediction.” In other words, the numeric SHAP vector becomes a concise linguistic rule.
- (6) Compute a single “activation” score for the rule that reflects both (a) each feature’s importance and (b) how strongly the current value fits its term. Intuitively: the more important the feature and the stronger its membership, the higher the activation. An end-to-end sketch of steps (1)–(6) follows this list.
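To tie steps (1)–(6) together, the sketch below assembles a rule from a precomputed SHAP vector, assuming triangular membership functions and an importance-weighted average as the activation score; the helper names, the aggregation formula, and the defaults T and K are illustrative assumptions, not the authors’ exact implementation.

```python
import numpy as np

def triangular_mf(v, a, b, c):
    """Triangular membership: 0 outside (a, c), peaking at 1 when v == b."""
    if v <= a or v >= c:
        return 0.0
    return (v - a) / (b - a) if v <= b else (c - v) / (c - b)

def shap_rule(phi, x, feature_names, terms, prediction, T=0.8, K=5):
    """Build a fuzzy IF-THEN rule from a SHAP vector `phi` for instance `x`.
    `terms` is a list where terms[j] maps a linguistic label to triangular
    parameters (a, b, c) for feature j, defined in the same space as x."""
    w = np.abs(phi) / np.abs(phi).sum()                  # step 2: normalize
    order = np.argsort(w)[::-1]
    m = int(np.searchsorted(np.cumsum(w[order]), T) + 1)
    top = order[:min(m, K)]                              # step 3: top-q features

    antecedents, weights, memberships = [], [], []
    for j in top:                                        # step 4: linguistic terms
        label, mu = max(((lbl, triangular_mf(x[j], *abc))
                         for lbl, abc in terms[j].items()),
                        key=lambda t: t[1])
        antecedents.append(f"{feature_names[j]} is {label}")
        weights.append(w[j])
        memberships.append(mu)

    rule = "IF " + " AND ".join(antecedents) + f" THEN class = {prediction}"  # step 5
    alpha = float(np.dot(weights, memberships) / np.sum(weights))             # step 6
    return rule, alpha
```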
3.7. Benchmark Methods for Comparison
- TreeSHAP: for tree-based ensembles, SHAP values were computed with the TreeExplainer algorithm, an efficient implementation of the classical SHAP method for tree models [33] (denoted “SHAP”).
- SHAP (TreeExplainer) with the number of displayed features limited according to the same selection rules described above (denoted “SHAP-L”).
- AnchorTabular, which generates local IF–THEN rules with high precision for model predictions [12] (denoted “AnchorT”).
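The baselines can be invoked through the shap and alibi packages referenced in the Implementation Details. The snippet below is a minimal sketch under the assumption that a fitted tree model (`model`), the data matrices (`X_train`, `X_test`), and `feature_names` are already defined; only the precision threshold of 0.95 comes from Section 4.4, the remaining settings are library defaults.

```python
import shap
from alibi.explainers import AnchorTabular

# TreeSHAP baseline ("SHAP"): SHAP values for a fitted tree ensemble.
tree_explainer = shap.TreeExplainer(model)
shap_values = tree_explainer.shap_values(X_test)

# AnchorTabular baseline ("AnchorT"): local IF-THEN rules targeting
# a precision of at least 0.95 (the default threshold noted in Section 4.4).
anchor = AnchorTabular(predictor=model.predict, feature_names=feature_names)
anchor.fit(X_train)
explanation = anchor.explain(X_test[0], threshold=0.95)
print(" AND ".join(explanation.anchor))   # human-readable anchor conditions
```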
4. Experimental Setup
4.1. Datasets
- CWRU Bearing dataset [34]. A data-driven fault-diagnosis dataset containing ball bearing test data for normal and faulty bearings.
- Domain: equipment diagnostics.
- Samples: 2300 time series segments of 2048 points (0.04 s at the 48 kHz accelerometer sampling frequency).
- Features: mean, skewness, kurtosis, and crest factor (4 numeric features without missing values).
- Task: binary classification (fault or no fault, 90%/10%).
- Power transformer diagnostics dataset [35,36]. It contains oil sampling and dissolved gas analysis results for power transformers.
- Domain: equipment diagnostics.
- Samples: 470 oil sampling results.
- Features: concentrations of gases dissolved in the oil, dibenzyl disulfide (DBDS) concentration, breakdown voltage (BV), and water content (WC) (14 numeric features without missing values).
- Task: binary classification. The dataset provides a health index (lower is better); transformers with an index above 40 are conditionally labeled as bad (25%) and the rest as good (75%) [37].
- Pima Indians Diabetes database [38]. It originates from the National Institute of Diabetes and Digestive and Kidney Diseases.
- Domain: medical diagnostics.
- Samples: 768 values of tests and measurements from patients.
- Features: 8 diagnostic features such as pregnancies, OGTT (Oral Glucose Tolerance Test), blood pressure, skin thickness, insulin, BMI (Body Mass Index), age, and diabetes pedigree function.
- Task: binary classification (diabetic or non-diabetic, 35%/65%).
4.2. Black-Box Models
- Extreme Gradient Boosting (XGBoostClassifier).
- Random Forest (RandomForestClassifier).
4.3. Procedures for Model Training and Generating Explanations
- Random Forest: number of trees: 50, 100, or 200;
- Maximum tree depth: 3, 5, or no limit;
- Minimum number of samples required to split a node: 3, 4, 5, or 6.
- XGBoost: number of trees: 50, 100, or 200;
- Maximum tree depth: 3, 5, or no limit;
- Learning rate: 0.01, 0.1, or 0.2 (a tuning sketch is given below).
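The hyperparameter grids are specified above, but the search procedure itself is not; the following is a minimal sketch assuming an exhaustive grid search with scikit-learn’s GridSearchCV, where the fold count, scoring, and fixed arguments (random_state, tree_method, eval_metric) and the data placeholders `X_train`, `y_train` are assumptions rather than reported settings.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from xgboost import XGBClassifier

rf_grid = {"n_estimators": [50, 100, 200],
           "max_depth": [3, 5, None],            # None = no depth limit
           "min_samples_split": [3, 4, 5, 6]}
xgb_grid = {"n_estimators": [50, 100, 200],
            "max_depth": [3, 5, 0],              # 0 = no limit for the hist method
            "learning_rate": [0.01, 0.1, 0.2]}

# Assumed search setup: 5-fold CV with weighted F1, matching the reported metrics.
rf_search = GridSearchCV(RandomForestClassifier(random_state=0),
                         rf_grid, scoring="f1_weighted", cv=5)
xgb_search = GridSearchCV(XGBClassifier(tree_method="hist", eval_metric="logloss"),
                          xgb_grid, scoring="f1_weighted", cv=5)
rf_search.fit(X_train, y_train)
xgb_search.fit(X_train, y_train)
print(rf_search.best_params_, xgb_search.best_params_)
```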
4.4. Evaluation Metrics
- Execution time: Mean and standard deviation of explanation generation time per instance.
- Complexity: Number of conditions or features explicitly used in generated explanations.
- Coverage:
- AnchorTabular: Proportion of test instances with anchors meeting a precision threshold ≥ 0.95. This is the default threshold for AnchorTabular [39].
- SHAP-Rule: Proportion of instances where fuzzy rule activation strength (α) ≥ 0.33. The thresholds 0.95 and 0.33 were selected empirically to ensure a balance between fidelity and coverage.
- Fidelity: the agreement rate between the class indicated by the explanation and the prediction of the original model.
- Consistency (robustness): in the DSS, end users primarily interact with explanations presented in textual form, so consistency should reflect how similarly users perceive explanations for slightly perturbed input data. Comparing explanations at the textual level with the Levenshtein distance captures changes that directly affect user perception, rather than only changes in sets of numeric characteristics. The normalized Levenshtein similarity quantifies how closely the explanation texts match, measuring the cognitive coherence experienced by users (a computational sketch is given after this list). The procedure is as follows:
- Generate the original explanation text for a given input instance.
- Generate 10 slightly perturbed copies of the input instance by adding small Gaussian noise (1% of each feature’s standard deviation).
- For each perturbed input, generate its corresponding textual explanation.
- Compute the normalized Levenshtein similarity between the original and each perturbed explanation text, sim(a, b) = 1 − Lev(a, b)/max(|a|, |b|).
- Average these similarity values for each instance, then average them again across all test instances.
- Subjective expert assessment: an expert survey conducted by subject-area specialists (10 experts per dataset domain). The evaluation criterion was the simplicity and speed of understanding of the explanation, recorded on a 7-point Likert scale (from 1, simplicity and speed are very low, to 7, simplicity and speed are very high). Experts rated the explanations produced by each method, presented in random order, for 10 random examples from the test set (the same examples for each expert).
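As a concrete reference for the consistency metric above, the sketch below implements the normalized Levenshtein similarity directly; the perturbation settings follow the text (10 perturbed copies, Gaussian noise at 1% of the feature standard deviation), while `explain_text`, the function that renders an explanation as a string, is a hypothetical placeholder.

```python
import numpy as np

def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def consistency(x, explain_text, feature_std, n_perturb=10, rng=None):
    """Mean normalized Levenshtein similarity between the explanation of x
    and the explanations of n_perturb Gaussian-perturbed copies of x."""
    rng = rng or np.random.default_rng(0)
    base = explain_text(x)
    sims = []
    for _ in range(n_perturb):
        x_pert = x + rng.normal(0.0, 0.01 * feature_std, size=x.shape)
        pert = explain_text(x_pert)
        dist = levenshtein(base, pert)
        sims.append(1.0 - dist / max(len(base), len(pert), 1))
    return float(np.mean(sims))
```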
4.5. Implementation Details
5. Results
5.1. ML Model Tuning
5.2. Text Formats of Explanations and Their Assessment by Experts
- SHAP: Skewness is −0.225 (value 0.550) AND Kurtosis is −0.047 (value 0.324) AND Crest is 3.318 (value 0.097) AND Mean is 0.016 (value 0.029): Defect detected.
- SHAP-L: Skewness is −0.225 (value 0.550) AND Kurtosis is −0.047 (value 0.324): Defect detected.
- AnchorT: Skewness ≤ −0.11 AND Kurtosis ≤ −0.02: Defect detected.
- SHAP-R: Skewness is below average AND Kurtosis is below average: Defect detected.
- SHAP-L: Hydrogen is 590 (value 0.245) AND Acethylene is 582 (value 0.216) AND Methane is 949 (value 0.210) AND Ethylene is 828 (value 0.136): State is bad.
- AnchorT: Acethylene > 0.00 AND Methane > 8.00 AND Ethane > 72.00 AND Water_content ≤ 11.00: State is bad.
- SHAP-R: Hydrogen is high AND Acethylene is average AND Methane is high AND Ethylene is high: State is bad.
- SHAP-L: Glucose is 162 (value −0.582) AND Pregnancies is 10 (value −0.119) AND BMI is 27.70 (value 0.109).
- AnchorT: Glucose > 140.00 AND Pregnancies > 6.00 AND BMI > 27.50: Diabetes.
- SHAP-R: Glucose is above average AND Pregnancies is above average AND BMI is below average.
5.3. Comparative Analysis of Explanation Methods
5.4. Time Complexity
6. Discussion
6.1. Interpretability and Communication
6.2. Coverage–Fidelity Profile
6.3. Robustness and Consistency
6.4. Computational Efficiency and Complexity
6.5. Comparison with LLM-Aided Explanations
6.6. Generalization and Limitations
7. Conclusions
- A deterministic post hoc pipeline with calibrated rules is proposed. SHAP-Rule automatically converts numeric SHAP attributions into compact fuzzy IF–THEN rules with quantified activation, selecting antecedents by a cumulative threshold, T, with a hard cap, K. The procedure is model-agnostic, auditable, and very fast.
- Linguistic terms are derived directly from data via an arcsinh transform and Tukey fences with redundancy pruning. The same linguistic transformation is drop-in extensible to alternative attribution backends (e.g., LIME and permutation importance) and to grouped/interaction attributions for high-dimensional settings.
- The research introduces a metric suite for complexity, coverage, fidelity, consistency, and statistically validated expert estimations. This suite enables assessing the generated explanation in its native form (such as a user-facing textual message within a decision-support system) rather than only as numeric artifacts.
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Lundberg, S.M.; Lee, S.-I. A Unified Approach to Interpreting Model Predictions. In Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS’17), Long Beach, CA, USA, 4–9 December 2017; pp. 4768–4777. [Google Scholar] [CrossRef]
- Ribeiro, M.T.; Singh, S.; Guestrin, C. Why Should I Trust You? Explaining the Predictions of Any Classifier. In Proceedings of the North American Chapter of the Association for Computational Linguistics (NAACL), San Diego, CA, USA, 10–15 February 2016; pp. 97–101. [Google Scholar] [CrossRef]
- Alexander, Z.; Chau, D.H.; Saldana, C. An Interrogative Survey of Explainable AI in Manufacturing. IEEE Trans. Ind. Inf. 2024, 20, 7069–7081. [Google Scholar] [CrossRef]
- Cacao, J.; Santos, J.; Antunes, M. Explainable AI for industrial fault diagnosis: A systematic review. J. Ind. Inf. Integr. 2025, 47, 100905. [Google Scholar] [CrossRef]
- Brito, L.C.; Susto, G.A.; Brito, J.N.; Duarte, M.A.V. An Explainable Artificial Intelligence Approach for Unsupervised Fault Detection and Diagnosis in Rotating Machinery. arXiv 2021, arXiv:2102.11848. [Google Scholar] [CrossRef]
- Zereen, A.N.; Das, A.; Uddin, J. Machine Fault Diagnosis Using Audio Sensors Data and Explainable AI Techniques—LIME and SHAP. Comput. Mater. Contin. 2024, 80, 3463–3484. [Google Scholar] [CrossRef]
- Zeng, X. Enhancing the Interpretability of SHAP Values Using Large Language Models. arXiv 2024, arXiv:2409.00079. [Google Scholar] [CrossRef]
- Gosiewska, A.; Biecek, P. Do Not Trust Additive Explanations. arXiv 2019, arXiv:1903.11420. [Google Scholar] [CrossRef]
- Cao, J.; Zhou, T.; Zhi, S. Fuzzy Inference System with Interpretable Fuzzy Rules: Advancing Explainable Artificial Intelligence for Disease Diagnosis—A Comprehensive Review. Inf. Sci. 2024, 662, 120212. [Google Scholar] [CrossRef]
- Pickering, L.; Cohen, K.; De Baets, B. A Narrative Review on the Interpretability of Fuzzy Rule-Based Models from a Modern Interpretable Machine Learning Perspective. Int. J. Fuzzy Syst. 2025, 1–20. [Google Scholar] [CrossRef]
- Niskanen, V.A. Methodological Aspects on Integrating Fuzzy Systems with Explainable Artificial Intelligence. In Advances in Artificial Intelligence-Empowered Decision Support Systems; Learning and Analytics in Intelligent Systems; Springer: Cham, Switzerland, 2024; p. 39. [Google Scholar] [CrossRef]
- Ribeiro, M.T.; Singh, S.; Guestrin, C. Anchors: High-Precision Model-Agnostic Explanations. Proc. AAAI Conf. Artif. Intell. 2018, 32, 1527–1535. [Google Scholar] [CrossRef]
- Friedman, J.H.; Popescu, B.E. Predictive Learning via Rule Ensembles. Ann. Appl. Stat. 2008, 2, 916–954. [Google Scholar] [CrossRef]
- Letham, B.; Rudin, C.; McCormick, T.H.; Madigan, D. Interpretable Classifiers Using Rules and Bayesian Analysis: Building a Better Stroke Prediction Model. Ann. Appl. Stat. 2015, 9, 1350–1371. [Google Scholar] [CrossRef]
- Craven, M.W.; Shavlik, J.W. Extracting Tree-Structured Representations of Trained Networks. In Advances in Neural Information Processing Systems (NeurIPS); MIT Press: Denver, CO, USA, 1996; pp. 24–30. [Google Scholar]
- Wang, J.; Chen, Y.; Giudici, P. Group Shapley with Robust Significance Testing and Its Application to Bond Recovery Rate Prediction. arXiv 2025, arXiv:2501.03041. [Google Scholar] [CrossRef]
- Jullum, M.; Redelmeier, A.; Aas, K. GroupShapley: Efficient Prediction Explanation with Shapley Values for Feature Groups. arXiv 2021, arXiv:2106.12228. [Google Scholar] [CrossRef]
- Matrenin, P.V.; Gamaley, V.V.; Khalyasmaa, A.I.; Stepanova, A.I. Solar Irradiance Forecasting with Natural Language Processing of Cloud Observations and Interpretation of Results with Modified Shapley Additive Explanations. Algorithms 2024, 17, 150. [Google Scholar] [CrossRef]
- Tatset, H.; Shater, A. Beyond Black Box: Enhancing Model Explainability with LLMs and SHAP. arXiv 2025, arXiv:2505.24650. [Google Scholar] [CrossRef]
- Khediri, A.; Slimi, H.; Yahiaoui, A.; Derdour, M.; Bendjenna, H.; Ghenai, C.E. Enhancing Machine Learning Model Interpretability in Intrusion Detection Systems through SHAP Explanations and LLM-Generated Descriptions. In Proceedings of the 6th International Conference on Pattern Analysis and Intelligent Systems (PAIS), El Oued, Algeria, 24–25 April 2024. [Google Scholar] [CrossRef]
- Lim, B.; Huerta, R.; Sotelo, A.; Quintela, A.; Kumar, P. EXPLICATE: Enhancing Phishing Detection through Explainable AI and LLM-Powered Interpretability. arXiv 2025, arXiv:2503.20796. [Google Scholar] [CrossRef]
- Zhao, H.; Chen, H.; Yang, F.; Liu, N.; Deng, H.; Cai, H.; Wang, S.; Yin, D.; Du, M. Explainability for Large Language Models: A Survey. ACM Trans. Intell. Syst. Technol. 2024, 15, 1–38. [Google Scholar] [CrossRef]
- Goldshmidt, R.; Horovicz, M. TokenSHAP: Interpreting Large Language Models with Monte Carlo Shapley Value Estimation. arXiv 2024, arXiv:2407.10114. [Google Scholar] [CrossRef]
- Aghaeipoor, F.; Sabokrou, M.; Fernández, A. Fuzzy Rule-Based Explainer Systems for Deep Neural Networks: From Local Explainability to Global Understanding. IEEE Trans. Fuzzy Syst. 2023, 31, 3069–3083. [Google Scholar] [CrossRef]
- Buczak, A.L.; Baugher, B.D.; Zaback, K. Fuzzy Rules for Explaining Deep Neural Network Decisions (FuzRED). Electronics 2025, 14, 1965. [Google Scholar] [CrossRef]
- Mendel, J.M.; Bonissone, P.P. Critical Thinking about Explainable AI (XAI) for Rule-Based Fuzzy Systems. IEEE Trans. Fuzzy Syst. 2021, 29, 3579–3593. [Google Scholar] [CrossRef]
- Ferdaus, M.M.; Dam, T.; Alam, S.; Pham, D.-T. X-Fuzz: An Evolving and Interpretable Neuro-Fuzzy Learner for Data Streams. IEEE Trans. Artif. Intell. 2024, 5, 4001–4012. [Google Scholar] [CrossRef]
- Singh, B.; Doborjeh, M.; Doborjeh, Z. Constrained Neuro Fuzzy Inference Methodology for Explainable Personalised Modelling with Applications on Gene Expression Data. Sci. Rep. 2023, 13, 456. [Google Scholar] [CrossRef] [PubMed]
- Gokmen, O.B.; Guven, Y.; Kumbasar, T. FAME: Introducing Fuzzy Additive Models for Explainable AI. arXiv 2025, arXiv:2504.07011. [Google Scholar] [CrossRef]
- Gacto, M.J.; Alcalá, R.; Herrera, F. Interpretability of Linguistic Fuzzy Rule-Based Systems: An Overview of Interpretability Measures. Inf. Sci. 2011, 181, 4340–4360. [Google Scholar] [CrossRef]
- Ouifak, H.; Idri, A. A comprehensive review of fuzzy logic based interpretability and explainability of machine learning techniques across domains. Neurocomputing 2025, 647, 130602. [Google Scholar] [CrossRef]
- Tukey, J.W. Exploratory Data Analysis; Addison-Wesley: Reading, MA, USA, 1977. [Google Scholar]
- Lundberg, S.M.; Erion, G.; Chen, H.; DeGrave, A.; Prutkin, J.M.; Nair, B.; Katz, R.; Himmelfarb, J.; Bansal, N.; Lee, S.I. Explainable AI for Trees: From Local Explanations to Global Understanding. arXiv 2019, arXiv:1905.04610. [Google Scholar] [CrossRef]
- Kaggle. CWRU Bearing Datasets. Available online: https://www.kaggle.com/datasets/brjapon/cwru-bearing-datasets (accessed on 5 June 2025).
- Velasquez, A.M.R.; Lara, J.V.M. Data for: Root Cause Analysis Improved with Machine Learning for Failure Analysis in Power Transformers. Eng. Fail. Anal. 2024, 115, 104684. [Google Scholar] [CrossRef]
- Kaggle. Failure Analysis in Power Transformers Dataset. Available online: https://www.kaggle.com/datasets/shashwatwork/failure-analysis-in-power-transformers-dataset (accessed on 5 June 2025).
- Khalyasmaa, A.I.; Matrenin, P.V.; Eroshenko, S.A. Assessment of Power Transformer Technical State Using Explainable Artificial Intelligence. Probl. Reg. Energetics 2024, 4, 1–9. [Google Scholar] [CrossRef]
- Kaggle. Pima Indians Diabetes Database. Available online: https://www.kaggle.com/datasets/uciml/pima-indians-diabetes-database (accessed on 5 June 2025).
- GitHub. SeldonIO/Alibi. Available online: https://github.com/SeldonIO/alibi (accessed on 15 August 2025).
- GitHub. Shap/Shap. Available online: https://github.com/shap/shap (accessed on 15 August 2025).
- GitHub. Scikit-Learn/Scikit-Learn. Available online: https://github.com/scikit-learn/scikit-learn (accessed on 15 August 2025).
- GitHub. Dmlc/Xgboost. Available online: https://github.com/dmlc/xgboost (accessed on 15 August 2025).
- Khalyasmaa, A.I.; Matrenin, P.V.; Eroshenko, S.A.; Manusov, V.Z.; Bramm, A.M.; Romanov, A.M. Data Mining Applied to Decision Support Systems for Power Transformers’ Health Diagnostics. Mathematics 2022, 10, 2486. [Google Scholar] [CrossRef]
| Dataset | Model | Hyperparameters |
|---|---|---|
| CWRU Bearing | RF | max_depth: no limit, min_samples_split: 5, n_estimators: 100 |
| CWRU Bearing | XGB | learning_rate: 0.01, max_depth: 3, n_estimators: 100 |
| Transformer Diagnostics | RF | max_depth: no limit, min_samples_split: 5, n_estimators: 100 |
| Transformer Diagnostics | XGB | learning_rate: 0.01, max_depth: 3, n_estimators: 200 |
| Pima Indians Diabetes | RF | max_depth: 5, min_samples_split: 4, n_estimators: 50 |
| Pima Indians Diabetes | XGB | learning_rate: 0.01, max_depth: 5, n_estimators: 50 |
| Dataset | Model | Recall (Weighted Avg) | Precision (Weighted Avg) | F1 (Weighted Avg) |
|---|---|---|---|---|
| CWRU Bearing | RF | 0.97 | 0.97 | 0.97 |
| CWRU Bearing | XGB | 0.97 | 0.96 | 0.96 |
| Transformer Diagnostics | RF | 0.95 | 0.95 | 0.95 |
| Transformer Diagnostics | XGB | 0.95 | 0.95 | 0.95 |
| Pima Indians Diabetes | RF | 0.74 | 0.73 | 0.73 |
| Pima Indians Diabetes | XGB | 0.77 | 0.77 | 0.77 |
| Dataset | Method | Expert Assessment (Median) | Expert Assessment (Mean) | Expert Assessment (Std) |
|---|---|---|---|---|
| CWRU Bearing | SHAP | 2.0 | 2.4 | 0.516 |
| CWRU Bearing | SHAP-L | 4.0 | 3.8 | 0.632 |
| CWRU Bearing | AnchorT | 5.0 | 5.1 | 0.316 |
| CWRU Bearing | SHAP-R | 6.0 | 6.4 | 0.516 |
| Transformer Diagnostics | SHAP | 3.0 | 3.3 | 0.675 |
| Transformer Diagnostics | SHAP-L | 4.0 | 3.5 | 0.707 |
| Transformer Diagnostics | AnchorT | 5.0 | 4.9 | 0.994 |
| Transformer Diagnostics | SHAP-R | 5.5 | 5.6 | 0.996 |
| Pima Indians Diabetes | SHAP | 2.0 | 1.9 | 0.316 |
| Pima Indians Diabetes | SHAP-L | 2.0 | 2.1 | 0.316 |
| Pima Indians Diabetes | AnchorT | 4.0 | 4.2 | 0.632 |
| Pima Indians Diabetes | SHAP-R | 5.0 | 4.7 | 0.483 |
| Dataset | H | p-Value | ε² |
|---|---|---|---|
| CWRU Bearing | 35.846 | 8.070 × 10⁻⁸ | 0.842 |
| Transformer Diagnostics | 33.951 | 2.029 × 10⁻⁷ | 0.578 |
| Pima Indians Diabetes | 25.529 | 1.197 × 10⁻⁵ | 0.794 |
| Method | TreeSHAP | SHAP-L | AnchorTabular | SHAP-Rule |
|---|---|---|---|---|
| TreeSHAP | 1 | 0.00055 | 0.00038 | 0.00053 |
| SHAP-L | 0.00055 | 1 | 0.00053 | 0.00053 |
| AnchorTabular | 0.00038 | 0.00053 | 1 | 0.00053 |
| SHAP-Rule | 0.00055 | 0.00053 | 0.00053 | 1 |
| Method | TreeSHAP | SHAP-L | AnchorTabular | SHAP-Rule |
|---|---|---|---|---|
| TreeSHAP | 1 | 0.19160 | 0.00034 | 0.00034 |
| SHAP-L | 0.19160 | 1 | 0.00034 | 0.00034 |
| AnchorTabular | 0.00034 | 0.00034 | 1 | 0.14662 |
| SHAP-Rule | 0.00034 | 0.00034 | 0.14662 | 1 |
| Method | TreeSHAP | SHAP-L | AnchorTabular | SHAP-Rule |
|---|---|---|---|---|
| TreeSHAP | 1 | 0.47582 | 0.00414 | 0.00147 |
| SHAP-L | 0.47582 | 1 | 0.00692 | 0.00148 |
| AnchorTabular | 0.00414 | 0.00692 | 1 | 0.22650 |
| SHAP-Rule | 0.00147 | 0.00148 | 0.22650 | 1 |
| ML Model | Method | Execution Time (Mean ± Std), ms | Complexity | Coverage (Mean ± Std) | Fidelity (Median ± Std) | Consistency (Mean ± Std) | Experts' Assessment (Median ± Std) |
|---|---|---|---|---|---|---|---|
| RF | SHAP | 1.89 ± 0.32 | 14 ± 0.00 | - | - | 0.678 ± 0.056 | - |
| RF | SHAP-L | 1.79 ± 0.44 | 3.95 ± 0.22 | - | - | 0.698 ± 0.065 | - |
| RF | AnchorT | 2281 ± 2963 | 3.28 ± 2.41 | 0.904 ± 0.013 | 1.0–0.0658 | 0.752 ± 0.149 | - |
| RF | SHAP-R | 1.89 ± 0.28 | 3.95 ± 0.22 | 0.819 ± 0.00 | 1.0–0.164 | 0.718 ± 0.106 | - |
| XGB | SHAP | 2.01 ± 0.97 | 14 ± 0.00 | - | - | 0.796 ± 0.045 | 2.0 ± 0.516 |
| XGB | SHAP-L | 1.89 ± 0.85 | 3.69 ± 0.74 | - | - | 0.711 ± 0.069 | 4.0 ± 0.632 |
| XGB | AnchorT | 1934 ± 2472 | 3.32 ± 2.49 | 0.904 ± 0.0082 | 1.0–0.0387 | 0.749 ± 0.157 | 5.0 ± 0.316 |
| XGB | SHAP-R | 1.99 ± 0.55 | 3.69 ± 0.74 | 0.883 ± 0.00 | 1.0–0.158 | 0.731 ± 0.101 | 6.0 ± 0.516 |
| ML Model | Method | Execution Time (Mean ± Std), ms | Complexity | Coverage (Mean ± Std) | Fidelity (Median ± Std) | Consistency (Mean ± Std) | Experts' Assessment (Median ± Std) |
|---|---|---|---|---|---|---|---|
| RF | SHAP | 1.89 ± 0.32 | 14 ± 0.00 | - | - | 0.678 ± 0.056 | - |
| RF | SHAP-L | 1.79 ± 0.44 | 3.95 ± 0.22 | - | - | 0.698 ± 0.065 | - |
| RF | AnchorT | 2281 ± 2963 | 3.28 ± 2.41 | 0.904 ± 0.013 | 1.0–0.0658 | 0.752 ± 0.149 | - |
| RF | SHAP-R | 1.89 ± 0.28 | 3.95 ± 0.22 | 0.819 ± 0.00 | 1.0–0.164 | 0.718 ± 0.106 | - |
| XGB | SHAP | 2.01 ± 0.97 | 14 ± 0.00 | - | - | 0.796 ± 0.045 | 3.0 ± 0.675 |
| XGB | SHAP-L | 1.89 ± 0.85 | 3.69 ± 0.74 | - | - | 0.711 ± 0.069 | 4.0 ± 0.707 |
| XGB | AnchorT | 1934 ± 2472 | 3.32 ± 2.49 | 0.904 ± 0.0082 | 1.0–0.0387 | 0.749 ± 0.157 | 5.0 ± 0.994 |
| XGB | SHAP-R | 1.99 ± 0.55 | 3.69 ± 0.74 | 0.883 ± 0.00 | 1.0–0.158 | 0.731 ± 0.101 | 5.5 ± 0.994 |
| ML Model | Method | Execution Time (Mean ± Std), ms | Complexity | Coverage (Mean ± Std) | Fidelity (Median ± Std) | Consistency (Mean ± Std) | Experts' Assessment (Median ± Std) |
|---|---|---|---|---|---|---|---|
| RF | SHAP | 1.83 ±1.01 | 8 ± 0.00 | - | - | 0.882 ± 0.038 | - |
| RF | SHAP-L | 1.18 ± 1.21 | 3.68 ± 0.55 | - | - | 0.872 ± 0.049 | - |
| RF | AnchorT | 331 ± 345 | 2.49 ± 1.34 | 0.924 ± 0.0065 | 1.0–0.0302 | 0.761 ± 0.153 | - |
| RF | SHAP-R | 1.16 ± 0.82 | 3.68 ± 0.55 | 0.883 ± 0.00 | 1.0–0.166 | 0.961 ± 0.054 | - |
| XGB | SHAP | 1.86 ± 0.79 | 8 ± 0.00 | - | - | 0.812 ± 0.072 | 2.0 ± 0.316 |
| XGB | SHAP-L | 1.90 ± 0.66 | 3.46 ± 0.63 | - | - | 0.810 ± 0.089 | 2.0 ± 0.316 |
| XGB | AnchorT | 342 ± 430 | 3.14 ± 1.60 | 0.909 ± 0.0072 | 1.0–0.0968 | 0.754 ± 0.168 | 4.0 ± 0.632 |
| XGB | SHAP-R | 1.89 ± 0.47 | 3.46 ± 0.63 | 0.88 ± 0.00 | 1.0–0.186 | 0.891 ± 0.110 | 5.0 ± 0.483 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).