Comparative Analysis of Explainable AI Methods for Manufacturing Defect Prediction: A Mathematical Perspective
Abstract
1. Introduction
2. Related Work
2.1. Deep Learning + Fault Detection + Manufacturing
2.2. Explainable AI (XAI) + Manufacturing
2.3. Fuzzy C-Means + Product Classification OR Quality
2.4. Joint Use of XAI + Fuzzy Clustering + ML in Industry
3. Materials and Methods
3.1. Dataset and Preprocessing
3.1.1. Feature Normalization
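A minimal sketch of this step, assuming min-max scaling to [0, 1] (consistent with the normalized feature ranges discussed in Section 4.1.3); the file path and variable names are illustrative, not taken from the study's code:

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical path to the Kaggle manufacturing-defects CSV (adjust to your local copy).
df = pd.read_csv("manufacturing_defect_dataset.csv")

X_raw = df.drop(columns=["DefectStatus"])   # predictors listed in the features table
y = df["DefectStatus"]                      # binary target (1 = defective)

# Assumed min-max normalization of every predictor to the [0, 1] range.
scaler = MinMaxScaler()
X = pd.DataFrame(scaler.fit_transform(X_raw), columns=X_raw.columns)
```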
3.1.2. Partitioning Strategy
- Supervised learning (XGBoost): an 80/20 stratified split was performed to produce the training and testing sets, preserving the proportion of defective and non-defective samples (a split sketch follows this list).
- Unsupervised learning (Fuzzy C-Means): the entire feature matrix was used for clustering, ignoring the target variable.
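A minimal sketch of the supervised split, reusing X and y from the Section 3.1.1 sketch; the test fraction follows the text, while the random seed is an assumption:

```python
from sklearn.model_selection import train_test_split

# 80/20 stratified split that preserves the defective / non-defective ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=42
)

# The Fuzzy C-Means stage, by contrast, clusters the full matrix X and never sees y.
```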
3.1.3. Dimensionality Check
3.2. Dual Modeling Framework
3.2.1. Supervised Defect Prediction (XGBoost)
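A minimal training sketch using the xgboost scikit-learn wrapper; the hyperparameter values shown are placeholders, not the tuned values used in the study:

```python
from xgboost import XGBClassifier
from sklearn.metrics import classification_report

# Placeholder hyperparameters; the study's tuned values are not reproduced here.
model = XGBClassifier(
    n_estimators=300,
    max_depth=4,
    learning_rate=0.1,
    eval_metric="logloss",
    random_state=42,
)
model.fit(X_train, y_train)

print(classification_report(y_test, model.predict(X_test), digits=3))
```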
3.2.2. Unsupervised Fuzzy Clustering (Fuzzy C-Means)
- $v_j$ is the centroid of cluster $j$,
- $m > 1$ is the fuzzifier parameter controlling the softness of the membership assignments,
- $\|\cdot\|$ denotes the Euclidean norm (a clustering sketch follows this list).
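A minimal sketch of the fuzzy partitioning step with the scikit-fuzzy implementation; the number of clusters, the fuzzifier value, and the stopping criteria below are illustrative assumptions:

```python
import numpy as np
import skfuzzy as fuzz

# scikit-fuzzy expects data shaped (n_features, n_samples), hence the transpose.
data = np.asarray(X).T

centers, u, u0, d, jm, n_iter, fpc = fuzz.cluster.cmeans(
    data,
    c=3,          # assumed number of clusters
    m=2.0,        # assumed fuzzifier value
    error=1e-5,   # convergence tolerance
    maxiter=1000,
    seed=42,
)

# u has shape (c, n_samples): soft membership degrees that sum to 1 for each sample.
hard_labels = np.argmax(u, axis=0)   # crisp assignment used for cluster profiling
```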
3.3. Explainability Layer
3.3.1. SHAP (Shapley Additive Explanations)
- Local accuracy (additivity): $f(x) = \phi_0 + \sum_{i=1}^{M} \phi_i$, i.e., the base value plus the feature attributions reproduces the model output for each instance (verified numerically in the sketch after this list).
- Consistency: if a model changes such that a feature’s contribution increases or remains the same regardless of other features, its SHAP value will not decrease.
- Missingness: if a feature is missing in all coalitions (not used in the model), its SHAP value is zero.
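The local-accuracy property can be checked directly on the trained classifier; a sketch with the shap package, in which the margin-space comparison and the tolerance are implementation assumptions:

```python
import numpy as np
import shap

# Exact SHAP values for the tree ensemble; attributions live in margin (log-odds) space.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)          # shape: (n_samples, n_features)

# Local accuracy: base value + sum of attributions should reproduce the model margin.
margin = model.predict(X_test, output_margin=True)
reconstruction = explainer.expected_value + shap_values.sum(axis=1)
print(np.allclose(margin, reconstruction, atol=1e-3))
```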
3.3.2. LIME (Local Interpretable Model-Agnostic Explanations)
- Instability: small perturbations in x or the sampling process can lead to different explanations.
- Sensitivity to kernel width: the choice of the proximity kernel $\pi_x$ (often exponential or Gaussian) and of its width heavily influences the local behavior of the surrogate (illustrated in the sketch after this list).
- Approximation error: g may fail to capture complex interactions present in f, especially in high-dimensional spaces.
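A sketch illustrating the kernel-width sensitivity noted above, using lime's tabular explainer; the two width values are arbitrary, and the pipeline objects come from the earlier sketches:

```python
import numpy as np
from lime.lime_tabular import LimeTabularExplainer

# Two explainers that differ only in kernel width, to probe the stability of the local surrogate.
for width in (0.25, 0.75):
    lime_explainer = LimeTabularExplainer(
        training_data=np.asarray(X_train),
        feature_names=list(X.columns),
        class_names=["non-defective", "defective"],
        mode="classification",
        kernel_width=width,
    )
    exp = lime_explainer.explain_instance(
        np.asarray(X_test)[0], model.predict_proba, num_features=5
    )
    print(f"kernel_width={width}:", exp.as_list())
```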
3.3.3. ELI5 (Explain Like I’m Five)
3.3.4. PDP (Partial Dependence Plots)
3.3.5. ICE (Individual Conditional Expectation)
3.3.6. Interpretation Strategy
3.4. Comparative and Mathematical Evaluation (Conceptual Framework)
3.4.1. Consistency and Convergence
3.4.2. Formal Properties
- Additivity (e.g., SHAP)
- Local Fidelity (e.g., LIME)
- Implementation Invariance
- Feature Interaction Awareness
3.4.3. Quantitative Metrics
- Entropy of feature attributions, as defined in Equation (16), is used to quantify the concentration or dispersion of importance scores across features.
- Rank correlation (Spearman’s $\rho$) across methods and across perturbed inputs, as a measure of cross-method coherence and robustness (see the sketch after this list).
- Stability Index to evaluate resistance to minor input fluctuations.
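A sketch of two of these metrics under simple assumptions: Shannon entropy of the normalized absolute attributions, and Spearman's $\rho$ between the importance vectors produced by two methods. The example vectors are illustrative, not results from the study:

```python
import numpy as np
from scipy.stats import entropy, spearmanr

def attribution_entropy(attributions: np.ndarray) -> float:
    """Shannon entropy of the normalized absolute attributions of one instance.

    Low entropy: importance concentrated on a few features; high entropy: dispersed."""
    p = np.abs(attributions)
    p = p / p.sum()
    return float(entropy(p, base=2))

def rank_agreement(importance_a: np.ndarray, importance_b: np.ndarray) -> float:
    """Spearman's rho between two feature-importance vectors."""
    rho, _ = spearmanr(importance_a, importance_b)
    return float(rho)

# Illustrative importance vectors for the same instance from two XAI methods.
shap_attr = np.array([0.40, 0.05, 0.30, 0.10, 0.15])
lime_attr = np.array([0.35, 0.10, 0.25, 0.12, 0.18])
print(attribution_entropy(shap_attr))        # dispersion of the SHAP attribution
print(rank_agreement(shap_attr, lime_attr))  # cross-method rank coherence
```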
4. Results
4.1. Supervised Defect Prediction and Explainability
4.1.1. Exploratory Data Analysis (EDA)
- The dataset is complete and well-structured, with no missing values or inconsistent formats.
- The strongest positive correlations with the target variable (“DefectStatus”) are observed for “MaintenanceHours” (0.30) and “DefectRate” (0.25).
- The strongest negative correlations with “DefectStatus” are “QualityScore” (−0.20) and “ProductionVolume” (−0.13).
- Most variables exhibit low pairwise correlations, suggesting the presence of non-linear or multivariate interactions, which supports the use of tree-based models and post hoc interpretability techniques.
- No feature pair displays strong collinearity; all pairwise correlations remain below 0.35 in absolute value, indicating minimal redundancy and no multicollinearity concerns.
- Features with minimal variance or negligible association with the target may be considered for exclusion or dimensionality reduction prior to training.
- Higher values of “MaintenanceHours” and “DefectRate” are consistently associated with increased defect probability, while lower “QualityScore” is frequently linked to defective outputs.
- “SafetyIncidents” and “WorkerProductivity” show weak direct correlation with “DefectStatus” but may exert influence through interactions within specific clusters, as explored in later sections.
- Energy-related variables, such as “EnergyEfficiency” and “EnergyConsumption”, are inversely correlated, confirming their internal consistency and measurement alignment (the correlation structure can be reproduced with the sketch after this list).
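The correlation figures above can be reproduced directly from the dataframe; a sketch reusing df from the Section 3.1.1 sketch:

```python
import numpy as np

# Pearson correlation of every variable with the binary target.
corr_with_target = df.corr(numeric_only=True)["DefectStatus"].sort_values(ascending=False)
print(corr_with_target)

# Largest absolute pairwise correlation among predictors (the text reports all values below 0.35).
predictors = df.drop(columns=["DefectStatus"]).corr(numeric_only=True)
off_diagonal = predictors.where(~np.eye(len(predictors), dtype=bool))
print(off_diagonal.abs().max().max())
```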
4.1.2. Defect Prediction Using Machine Learning Models
4.1.3. Interpretability
- Product 0 classified as defective with a predicted probability of 0.95.
- Product 200 classified as non-defective with a predicted probability of 0.93.
- “DowntimePercentage” = 0.88, with a strong positive contribution.
- “InventoryTurnover” = 0.41, and “MaintenanceHours” = 0.43, both with moderate effects.
- In contrast, “DefectRate” = 0.12 and “QualityScore” = 0.30 appear to have slightly negative contributions, suggesting a minor mitigating effect on the prediction.
- “QualityScore” = 0.88, showing a strong negative contribution (protective),
- “WorkerProductivity” = 0.37, and
- “AdditiveMaterialCost” = 0.89, both supporting the non-defect classification.
- For “MaintenanceHours”, there is a clear positive non-linear relationship; once the normalized value exceeds ~0.4, the predicted probability of defect increases steeply. The ICE curves confirm this trend across nearly all instances, indicating consistent model behavior.
- Similarly, for “DefectRate”, low values are associated with low predicted risk, but from ~0.6 onwards, the model output rises rapidly. The ICE variability increases slightly for higher defect rates, suggesting greater sensitivity in that range (see the sketch below).
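The PDP and ICE behavior described above can be reproduced with scikit-learn's inspection utilities; a sketch in which the subsampling and layout choices are assumptions:

```python
import matplotlib.pyplot as plt
from sklearn.inspection import PartialDependenceDisplay

# kind="both" overlays the average PDP curve on the per-instance ICE curves.
PartialDependenceDisplay.from_estimator(
    model,
    X_train,
    features=["MaintenanceHours", "DefectRate"],
    kind="both",
    subsample=100,       # plot a readable subset of ICE curves
    random_state=42,
)
plt.tight_layout()
plt.show()
```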
4.2. Fuzzy Clustering and Profile Interpretation
4.2.1. Cluster Optimization Using Silhouette Index
4.2.2. Fuzzy Partitioning and Cluster Profile Analysis
4.2.3. Clustering Based on K-Means (k = 3)
- Cluster 0: 1066 samples
- Cluster 1: 1057 samples
- Cluster 2: 1117 samples
- Cluster 0 exhibits high values in “WorkerProductivity” and “QualityScore”, along with low “StockoutRate”. This profile suggests an efficient and stable production environment, likely characterized by effective resource utilization and fewer supply-related disruptions.
- Cluster 1 is defined by the highest levels of “DefectRate”, “SafetyIncidents”, and “EnergyConsumption”. These patterns indicate operational inefficiencies and potential safety risks, pointing to a high-risk and resource-intensive segment.
- Cluster 2 displays a more balanced configuration, with above-average “EnergyEfficiency” and low “DefectRate”. These traits are indicative of a sustainable and quality-oriented production profile, potentially driven by well-calibrated processes and controlled inputs.
- K-means can reveal meaningful operational distinctions even in high-dimensional manufacturing datasets, especially when aided by post hoc interpretability tools.
- The identified clusters provide a foundation for intermediate labeling, which can be further exploited in supervised tasks such as quality forecasting or process optimization.
- Compared with the FCM results (Section 4.2.2), K-means clustering demonstrates greater internal contrast and operational differentiation, making it a more informative tool for segment-based analysis in this case (a reproduction sketch follows this list).
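A sketch of the K-means segmentation and its silhouette-based validation, with k = 3 as reported above; the seed and initialization settings are assumptions:

```python
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
km_labels = kmeans.fit_predict(X)    # full normalized feature matrix, target excluded

print("Silhouette index:", silhouette_score(X, km_labels))

# Per-cluster feature means underpin the qualitative profiles described above.
profiles = X.assign(cluster=km_labels).groupby("cluster").mean()
print(profiles.round(2))
```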
4.2.4. Cluster Membership Prediction with XGBoost and Explainable AI
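A minimal sketch of this step, assuming the K-means labels from the Section 4.2.3 sketch serve as the multiclass target; the model settings are illustrative:

```python
import numpy as np
import shap
from xgboost import XGBClassifier

# Surrogate classifier: predict K-means cluster membership from the same features.
cluster_model = XGBClassifier(
    n_estimators=200, max_depth=4, learning_rate=0.1, random_state=42
)
cluster_model.fit(X, km_labels)

# Per-cluster SHAP attributions indicate which variables drive membership in each segment.
cluster_explainer = shap.TreeExplainer(cluster_model)
cluster_shap = cluster_explainer.shap_values(X)

# Normalize to a list of per-class matrices; the return type depends on the shap version.
if isinstance(cluster_shap, np.ndarray) and cluster_shap.ndim == 3:
    cluster_shap = [cluster_shap[:, :, k] for k in range(cluster_shap.shape[2])]
shap.summary_plot(cluster_shap, X, class_names=["Cluster 0", "Cluster 1", "Cluster 2"])
```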
4.3. Summary and Final Remarks
- Cluster 0: high efficiency and quality, with high “WorkerProductivity” and “SupplierQuality”, and low “StockoutRate”.
- Cluster 1: marked by inefficiency and risk, with high “SafetyIncidents”, “DefectRate”, and “EnergyConsumption”.
- Cluster 2: balanced and sustainable, with above-average “EnergyEfficiency” and low “DowntimePercentage”.
5. Discussion and Future Works
5.1. Supervised Defect Prediction and Explainability
5.2. Clustering-Based Segmentation and Interpretability
- Cluster 0: most influenced by “SupplierQuality”, “DowntimePercentage”, and “StockoutRate”, consistent with a lean and input-stable process.
- Cluster 1: dominated by “SafetyIncidents”, “EnergyConsumption”, and “ProductionVolume”, reflecting stress in production intensity and safety protocols.
- Cluster 2: defined by “EnergyEfficiency”, “DowntimePercentage”, and “SafetyIncidents”, capturing a scenario of optimized and controlled operations.
5.3. Future Work
- Integration of temporal data: incorporating time series from production lines could enhance both prediction and segmentation capabilities, enabling dynamic monitoring of quality drifts and transitions between operational states.
- Hybrid clustering approaches: combining fuzzy logic with supervised guidance, such as semi-supervised learning or constrained FCM, may improve cluster separability while preserving the benefits of soft membership.
- Multimodal data fusion: enriching the model with additional modalities such as maintenance logs (text), product images, or real-time sensor data could support a more comprehensive and context-aware pipeline.
- Prescriptive analytics: moving beyond prediction, the identified clusters and explainability outputs could inform prescriptive actions, such as maintenance scheduling, energy optimization, or quality assurance policies.
- Deployment in real-time environments: validating the framework within streaming architectures or digital twin systems would help test its scalability, responsiveness, and operational feasibility in real-world settings.
- Implementation of the proposed XAI evaluation framework: while Section 3.4 introduces a mathematically grounded evaluation structure for explainability methods, its full implementation remains open. Applying these formal metrics will allow robust benchmarking of interpretability, fidelity, and stability across techniques.
5.4. Limitations
- Synthetic nature of the dataset: although the dataset is grounded in empirical industrial distributions and reflects realistic operational dynamics, it remains synthetic. Consequently, the generalizability of results to specific industrial environments should be validated with real-world production data.
- Clustering assumptions and structure: the Fuzzy C-Means algorithm assumes spherical geometry and Euclidean distance, which may not be well-suited for capturing non-linear or intertwined relationships between process variables. This limitation became evident in the minimal separation between clusters, especially when compared to the supervised results.
- Absence of temporal or sequential data: the current approach is based on static observations. In real manufacturing systems, quality deviations often emerge over time. The integration of temporal data streams or event logs would enable more dynamic and context-aware predictions.
- Interpretability remains qualitative: while a mathematical framework for evaluating explainability was proposed, the actual application of quantitative metrics (e.g., attribution entropy, ranking consistency, fidelity scores) remains a subject for future work. At present, the interpretability analysis relies on visual coherence across XAI techniques (SHAP, LIME, etc.), which, though informative, is inherently subjective.
6. Conclusions
Funding
Data Availability Statement
Conflicts of Interest
References
Features | |
---|---|
ProductionVolume | StockoutRate |
ProductionCost | WorkerProductivity |
SupplierQuality | SafetyIncidents |
DeliveryDelay | EnergyConsumption |
DefectRate | EnergyEfficiency |
QualityScore | AdditiveProcessTime |
MaintenanceHours | AdditiveMaterialCost |
DowntimePercentage | DefectStatus |
InventoryTurnover | |