Hybrid AI Systems for Tool Wear Monitoring in Manufacturing: A Systematic Review

Saatci, Busra Tan; Ulas, Mustafa; Gurgenc, Turan

doi:10.3390/app16010208


by Busra Tan Saatci ¹, Mustafa Ulas ²,³,* and Turan Gurgenc ⁴,*

¹ Department of Information Technologies, Institute of Science, Fırat University, 23119 Elazig, Türkiye
² Department of Artificial Intelligence and Data Engineering, Faculty of Engineering, Fırat University, 23119 Elazig, Türkiye
³ Prodrom Information and Communication Technologies, Firat Technopark TGB, 23050 Elazig, Türkiye
⁴ Department of Automotive Engineering, Faculty of Technologies, Fırat University, 23119 Elazig, Türkiye
* Authors to whom correspondence should be addressed.

Appl. Sci. 2026, 16(1), 208; https://doi.org/10.3390/app16010208

Submission received: 15 October 2025 / Revised: 3 December 2025 / Accepted: 18 December 2025 / Published: 24 December 2025


Abstract

Tool wear is critical to quality, productivity, and sustainability in manufacturing processes. Therefore, accurately monitoring and predicting wear is one of the primary goals of smart manufacturing systems. While AI-based approaches have achieved significant success in this area in recent years, issues such as physical inconsistency, limited generalizability, and low interpretability associated with solely data-driven methods have necessitated the development of hybrid approaches. This study systematically examines the literature published between 2020 and 2025 and comprehensively analyzes hybrid AI systems used in tool wear monitoring. Hybrid systems are categorized into four main groups: physics-based hybrids, knowledge-driven hybrids, transfer learning-based hybrids, and heterogeneous model hybrids. This classification holistically evaluates the synergistic effects and performance gains achieved by combining different methods. The findings demonstrate that the combined use of physical models, expert knowledge, and data-driven learning approaches provides significant advantages in terms of both accuracy and explainability. However, challenges such as data shortage, model complexity, and computational cost remain limitations to widespread industrial use of hybrid systems. The study demonstrates that hybrid AI systems represent a new research direction enabling the development of more reliable, transparent, and efficient solutions in smart manufacturing.

Keywords: artificial intelligence; hybrid systems; hybrid AI; manufacturing engineering; machine learning; deep learning; physics-based learning; explainable artificial intelligence (XAI); tool wear monitoring; hybrid manufacturing systems

1. Introduction

Industrial revolutions have been shaped by technological innovations that have radically transformed production processes. While steam power, electricity, and computer technologies define the first three revolutions, respectively, the Fourth Industrial Revolution stands out with the integration of artificial intelligence, digital transformation, cloud computing, and the Industrial Internet of Things [1]. The technological pressure created by this transformation has reshaped both production processes and countries’ competitive strategies: the United States (US) 2016 strategic plan emphasizes the potential of artificial intelligence in manufacturing [2,3,4], and China has adopted the goal of leadership in smart manufacturing between 2025 and 2035 [5,6]. The Industry 4.0 approach, introduced by Germany in 2011, also aims to develop flexible and self-adaptive production systems through digitalization [7,8].

The increasing complexity of production systems, a natural consequence of this large-scale transformation, increasingly highlights the limitations of approaches based on a single Artificial Intelligence (AI) paradigm. Whether purely data-driven Deep Learning models or purely physics-based simulations, non-hybrid methods face significant challenges such as poor generalization, high data dependency, limited interpretability (the “black box” problem), and physically inconsistent predictions. These limitations, in turn, highlight the critical need for more flexible, explainable, and robust hybrid AI systems [9,10].

A review of the existing literature reveals that there are studies on the use of AI in the manufacturing field (e.g., “Tool wear monitoring with AI methods”) or specific hybrid techniques (e.g., “AI-based hybrid predictive models”). However, a significant gap exists in the literature. To date, there is no comprehensive study that systematically defines, classifies, and analyzes hybrid AI systems as a holistic paradigm in the context of tool wear and manufacturing applications. Most of the existing literature focuses on single applications or superficially defines hybrid models without an analytical framework. This review aims to fill this gap by examining the literature published between 2020 and 2025 within a principled analysis framework. The main contribution of the study is twofold:

A New Definition and Conceptual Framework: This study not only compiles so-called “hybrid” approaches but also proposes a unique definition based on the core principles of integration: multi-paradigm, complementary interaction, and synergistic performance. This framework allows us to evaluate studies against consistent criteria and establish a clear classification of hybrid systems.

Structured Analytical Review: Using the proposed framework, hybrid approaches are examined under four main categories: Physics-Informed, Knowledge-Guided, Transfer Learning-Based, and Heterogeneous Model Hybrids. This classification allows us to systematically compare integration logics, achieved gains, and encountered limitations.

One of the most significant impacts of this transformation is the evolution of production structures from mechanical to knowledge-driven systems. Artificial intelligence, and particularly machine learning, plays a critical role in areas such as design, planning, quality control, and predictive maintenance [11,12,13,14]. AI-based applications supported by extensive data sources and physical simulations also make significant contributions to error detection, process improvement, tool condition monitoring, and near-zero-defect manufacturing goals [15,16,17,18,19]. Deep learning, particularly through Convolutional Neural Network (CNN)-based structures, enables high accuracy in tasks such as Tool Condition Monitoring (TCM) and image analytics [20,21].

These developments supported the manufacturing industry’s global goals of reducing costs and increasing productivity, and they accelerated the move toward more autonomous production by minimizing human intervention. This capability led to significant emphasis on unattended machining on Computer Numerical Control (CNC) machines and to research into automatic tool changing systems. As an extension of this progress, automatic tool changing technologies based on detailed tool wear monitoring emerged in the late 1980s and early 1990s [22].

At this point, cutting tool wear is considered a fundamental problem in manufacturing because it determines both tool life and product quality. Since wear caused by high cutting temperatures negatively affects process efficiency and surface quality, monitoring tool condition is critical [23,24]. Direct methods for determining tool wear rely on physical examination using microscopes or laser measurements, while indirect methods evaluate outputs such as cutting force, torque, and surface roughness [24,25]. Temperature monitoring is also a common indirect method, measuring the temperature in the tool-chip area using thermocouples or thermal cameras; thus, the relationship between processing parameters, wear rate, and temperature is revealed [26]. High temperatures are known to accelerate wear by causing microstructural changes in the tool material [27].

In addition to all these methods, analyzing chip formation characteristics is another important approach for tool wear prediction. The proliferation of indirect and hybrid methods allows for greater automation of production, providing significant advantages such as higher surface quality, lower labor requirements, and reduced time and material losses [28,29,30,31]. However, since limitations such as model training, lack of data, and integration of multiple-source data persist, the current trend is toward the development of integrated artificial intelligence systems that hybridize artificial neural networks with methods such as fuzzy logic and genetic algorithms, improving decision support and prediction performance [32].

Consequently, this study provides researchers and practitioners with a comprehensive conceptual framework for understanding, designing, and critically evaluating hybrid AI systems. Thus, it aims to contribute to the development of more reliable, explainable, and effective AI solutions in the field of smart manufacturing. The article is structured as follows: Section 2 presents the basic definition, evaluation criteria, and analytical framework of hybrid AI systems, categorizing these systems into four different types and presenting case studies for each. Section 3 examines in detail the limitations of traditional AI methods and highlights the motivations for developing hybrid systems through a qualitative comparison. Section 4 outlines strategic research directions to overcome the major obstacles to the industrial-scale adoption of hybrid systems. The conclusion summarizes the main findings and contributions of the study, highlighting its importance for the future development of hybrid AI.

Methodology of the Review

In this study, the literature review was conducted using Google Scholar, Web of Science, and Scopus databases. An initial comprehensive search using the term “Hybrid in Manufacturing” returned approximately 130,000 results across three databases, reflecting the broad scope of the topic. The search process was conducted according to specific inclusion and exclusion criteria. Inclusion criteria included studies published between 2020 and 2025, published in peer-reviewed journals, written in English, and focusing on AI applications in manufacturing. Exclusion criteria included book chapters and non-peer-reviewed content (Figure 1).

The resulting publications were subjected to a preliminary review at the title and abstract level; studies not focused on manufacturing were excluded. At this stage, 70 studies were excluded, leaving 105 studies for full-text evaluation. Some of the reviewed publications were used to support the study’s theoretical framework, while others were evaluated within the scope of thematic analysis.

To ensure a systematic analysis, each study was examined according to the following guiding questions:

(i): What is the scope of the study?
(ii): What methods were used?
(iii): What data types and datasets were used?
(iv): What are the key performance results?
(v): What are the contributions and limitations of the study?

As a result of the analysis, the reviewed studies were categorized under four main themes:

Studies using classical machine learning (ML) algorithms,
Studies using deep learning (DL) algorithms,
Hybrid studies,
Current challenges and future research trends.

This structured classification provided a holistic assessment of the literature, highlighting the strengths and areas for improvement of existing studies. The following keywords were used in the literature search: “Hybrid in Manufacturing,” “Tool Wear in Artificial Intelligence,” “Tool Condition Monitoring (TCM) in Artificial Intelligence,” “Wear Loss,” “Predictive Maintenance,” and “AI in Manufacturing.”

In the first stage, a search using the term “Hybrid in Manufacturing” was conducted to obtain an overview. Next, studies containing the terms “Tool Condition Monitoring (TCM),” “Wear Loss,” “Predictive Maintenance,” “Artificial Intelligence in Manufacturing,” and “Zero Defect Manufacturing” were selected to identify hybrid approaches related to manufacturing processes. Publications unrelated to these areas were excluded. Some studies identified hybrid approaches in their content, even though the term “hybrid” was not directly included in the keywords. Therefore, these terms were re-searched in the second stage, and these studies were included in the analysis.

A total of 175 studies were evaluated in the initial screening. Of the 105 full-text articles assessed, 35 were excluded due to their predominantly materials-science focus, and 32 were excluded for other reasons (insufficient methodological detail or out of scope). In total, 38 studies were included in the final synthesis, comprising 28 hybrid studies and 10 non-hybrid AI-based manufacturing studies. This multi-stage screening approach considered studies that holistically identified and analyzed the field of manufacturing and tool wear. This method provided a solid and methodologically consistent basis for further evaluation.
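The screening counts reported in this section can be cross-checked with a few lines of arithmetic. The short Python sketch below uses only the numbers stated in the text; the variable names are illustrative and not part of the review.

```python
# Screening counts as reported in the methodology of this review.
initial_screened = 175         # studies evaluated in the initial screening
excluded_title_abstract = 70   # removed at the title/abstract level
excluded_materials = 35        # full texts with a predominantly materials-science focus
excluded_other = 32            # insufficient methodological detail or out of scope

# Studies remaining for full-text evaluation, then for the final synthesis.
full_text_assessed = initial_screened - excluded_title_abstract
included = full_text_assessed - excluded_materials - excluded_other

# Reported split of the final synthesis.
hybrid, non_hybrid = 28, 10

print(full_text_assessed)            # 105
print(included)                      # 38
print(hybrid + non_hybrid == included)  # True
```

The tally confirms that the reported stages are mutually consistent: 175 screened, 105 assessed in full text, and 38 included.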

2. Defining Hybrid Approaches: Integration Principles and Applications

For this review, a hybrid AI system is defined as an architecture that integrates methodological components from fundamentally different paradigms to overcome the limitations of a single component. These systems aim to effectively solve multidimensional, interactive problems difficult to solve with traditional methods by combining the advantages of different approaches. Integration ensures that components work in harmony while also allowing them to co-evolve and transform. This interactive process enables the emergence of new functions that no single component can provide, improving the overall performance of the system [32,33,34,35].

Hybrid approaches aim to create more balanced, generalizable, and interpretable systems by combining techniques such as deep learning, traditional machine learning, expert knowledge-based models, physical modeling, and ensemble learning. Definitions in the literature support this fundamental idea:

Jia et al. (2023) define hybrid as an approach that combines the powerful representational capabilities of deep learning with the interpretability or statistical robustness of other methods [36].

Another study (2024) states that hybrid structures created by combining models such as Deep Neural Networks (DNN) and Random Forest Regression (RFR) provide more robust and balanced results [37].

Shah et al. (2025) define a hybrid approach as modeling that uses both Physically Based Models (FPM) and Data Driven Methods (DDM) to describe the behavior of a process or system [38].

Asiedu et al. (2024) consider it as an approach that aims to provide higher prediction accuracy by combining the strengths of multiple machine learning algorithms [39].

Samparthi V. S. Kumar et al. (2024) define hybrid as an approach that combines expert knowledge-based guidance with random discovery [40].

Xie et al. (2025) consider it as a method that aims to improve prediction performance and generalizability, consisting of a combination of methods such as transfer learning and ensemble learning [41].

Zhou et al. (2019) define it as an approach that integrates multiple learning paradigms to facilitate knowledge transfer between source and target domains [42].

In this study, the term “hybrid approaches” refers to systems that purposefully and interactively integrate at least two different paradigms or modeling methodologies (e.g., a combination of physics-based models and data-driven models). To be classified as a hybrid system for the purposes of this review, a study must meet the following criteria:

Multi-Paradigm: It must contain components from at least two different methodological paradigms (e.g., a physics-based model and a data-driven model, or machine learning and deep learning).

Complementary Interaction: These components must interact in a complementary manner, where the output of one directly affects the operation or outcome of the other.

Synergistic Performance: The integrated system must demonstrate performance or functionality (e.g., accuracy, data efficiency, interpretability) that neither component alone can achieve.

According to these criteria, heterogeneous ensemble methods (e.g., stacking or voting schemes that combine different types of algorithms) are considered valid hybrid approaches when they strategically combine different underlying learners (e.g., a Support Vector Machine, Decision Tree, and Neural Network). Homogeneous ensembles (combining multiple models of the same type, as Random Forest and Gradient Boosting do with decision trees) are a powerful technique but are not the focus of this review, which emphasizes the integration of fundamentally different paradigms.
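To make the distinction concrete, the following is a minimal sketch, in plain Python, of a heterogeneous ensemble: it averages a parametric learner (a least-squares line) with an instance-based learner (k-nearest neighbours). It is not taken from any reviewed study; the function names, the equal weighting, and the wear data are hypothetical and serve only as an illustration.

```python
from statistics import mean

def fit_linear(xs, ys):
    """Ordinary least-squares line y = a*x + b (parametric learner)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = my - a * mx
    return lambda x: a * x + b

def fit_knn(xs, ys, k=2):
    """k-nearest-neighbour regressor (instance-based learner)."""
    pairs = list(zip(xs, ys))

    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return mean(y for _, y in nearest)

    return predict

def heterogeneous_ensemble(models, weights=None):
    """Weighted average of predictions from learners of different families."""
    weights = weights or [1.0 / len(models)] * len(models)
    return lambda x: sum(w * m(x) for w, m in zip(weights, models))

# Synthetic illustration only: cutting time (min) vs. flank wear (um).
times = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
wear = [0.0, 22.0, 38.0, 61.0, 82.0, 99.0]

linear = fit_linear(times, wear)
knn = fit_knn(times, wear, k=2)
ensemble = heterogeneous_ensemble([linear, knn])

print(round(ensemble(5.0), 2))
```

A homogeneous ensemble would instead pass several models of the same family to `heterogeneous_ensemble`, for example several k-NN regressors with different `k`. Under the criteria above, that alone would not qualify as hybrid.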

To address hybrid systems in an analytical framework rather than a descriptive one, we categorize hybrid approaches in manufacturing as follows:

Physics-Informed Hybrids: Integrate data-driven models (e.g., neural networks) with mathematical representations of physical laws (e.g., differential equations, conservation laws). Physical knowledge serves as a regulator, ensuring that predictions are scientifically plausible.

Knowledge-Guided Hybrids: Incorporate explicit expert knowledge, rules, or logical constraints into a data-driven learning process. This typically involves fuzzy logic systems or Explainable Artificial Intelligence (XAI) techniques to make black-box models interpretable and compatible with human understanding.

Heterogeneous Model Hybrids: Combine models from different algorithmic families (e.g., a Convolutional Neural Network and a Support Vector Machine) or utilize heterogeneous ensemble methods that strategically combine predictions from several base learners.

Transfer Learning Hybrids: Adapt a model pre-trained on a large-scale source task (e.g., visual recognition on ImageNet) to a data-limited target task (e.g., defect detection) in manufacturing. Hybridity arises from the combination of general features learned from the source domain and task-specific features learned from the target domain.

This analytical framework allows for a structured comparison of how different hybrid systems achieve their synergistic benefits. The following table (Table 1) provides concrete examples from the manufacturing domain, illustrating how the core components are integrated under each category and the specific synergistic performance gains realized, as required by our definition.

The findings suggest that, under appropriate conditions, the examined approaches can be considered complementary strategies rather than independent, standalone solutions. In general, physics-informed ML and ensemble methods are more frequently emphasized in the literature due to their contribution to accuracy and generalizability. For example, the work of Pashmforoush et al. (2025) [47] demonstrates the practical applicability of hybrid solutions. This research integrates a thermomechanical force model with several machine learning algorithms (Least Squares Boosting (LSBoost), Support Vector Regression (SVR), and Random Forest (RF)), so that the physical model and the data-driven methods complement each other. Compared to pure machine learning methods, up to 16% higher accuracy is achieved, and the approach shows promise for real-time monitoring based on the current and torque measurements available in CNC systems. However, the results were validated only for specific tool–material combinations, which necessitates careful consideration of their generalizability. On the other hand, transfer learning and vision-based Decision Support Systems can offer advantages under certain conditions, while the integration of expert knowledge plays a supporting role, particularly in improving interpretability.

Hybrid approaches represent not only a taxonomy but also a search for balance between the strengths and weaknesses of different methods. The comparisons and case study presented here suggest that hybrid systems offer flexibility, adaptability, and multidimensional problem-solving capacity. Thus, rather than providing a definitive solution, hybrid approaches may be regarded as a flexible research orientation that can be adapted to diverse manufacturing contexts. In this context, artificial intelligence approaches are widely used in the creation of hybrid systems, the development of intelligent quality control mechanisms, and the optimization of zero-defect manufacturing processes [18,35,48,49,50]. Artificial intelligence applications are becoming increasingly common in manufacturing, but data-driven methods alone can ignore physical constraints, expert knowledge, or systematic relationships [51]. Therefore, hybrid systems that integrate physics-based modeling, expert knowledge, transfer learning, and multi-model combinations with artificial intelligence have emerged as promising solutions for developing more accurate, generalizable, and interpretable approaches.

2.1. Physics-Informed Hybrid Applications

The success of machine learning (ML) models depends largely on the quality and availability of the datasets on which they are trained. However, in many fields such as engineering applications, obtaining sufficient, clean, and representative data is often challenging due to time, cost, and resource constraints [52]. Insufficient or unrepresentative datasets can cause model outputs to deviate from physical reality, leading to erroneous conclusions. Physics-based modeling methods, although grounded in the underlying principles of the system, can be limited in practice by the computational burden of solving complex equations and by an incomplete understanding of the system [53]. In this context, the integration of physical knowledge with data-driven modeling stands out as an important approach for predicting and modeling system behavior. The concept of Physics-Informed Machine Learning (PIML) was first introduced by Lagaris et al. (1998) with the use of artificial neural networks in solving differential equations [54]. Subsequently, a framework based on the integration of data science and physical theories was systematically presented by Karpatne et al. (2017) [55]. With this approach, physical governing equations can be integrated into data-driven learning processes to develop models that maintain physical consistency while remaining flexible enough to fit observations [56].

Physics-based machine learning approaches enable the integration of mathematical models of physical processes (e.g., thermal, mechanical, and acoustic systems) with artificial intelligence techniques in manufacturing systems. Figure 2 outlines how physical simulations, experimental data, and machine learning models are integrated. As a result, physics-based machine learning approaches not only increase accuracy but also improve model interpretability and overall physical validity, contributing to the creation of more reliable and sustainable systems in manufacturing processes.

Zhu and colleagues (2024) [43] developed a physics-informed deep learning approach for monitoring cutting tool wear. The study employed attention-based dual-scale hierarchical LSTM (ADHL), multi-layer attention-based feature extraction, a multi-task loss function, and physics-based residual modeling. In high-speed milling experiments, the model was shown to improve mAPE, mFPE, and MFPE by 42%, 43%, and 63%, respectively, significantly increasing prediction accuracy compared to previous data-driven or physics-based methods. A key strength of the study is its incorporation of physical information (e.g., cutting parameters, statistical properties, Taylor equations) into the model, which allows for generalization under limited data conditions and provides more consistent results. However, limitations include the model’s sensitivity to labels with a low signal-to-noise ratio across different machining conditions, the failure of the residual learning method to deliver the expected improvement, and the fact that scalability and adaptability to other production types (e.g., turning, drilling) have not been fully demonstrated. For future work, it is recommended to integrate more machining types and physical process information (such as vibration and cutting force mechanisms) into the network and to extend it to industrial optimization processes. In the context of industrial applications, the high-speed milling experiments are consistent with industrial standards, and the accuracy and stability of the model are found to be sufficient for practical production.
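The text above lists Taylor equations among the physical information fed to the model, without giving their form. As a hedged illustration, the sketch below assumes the classic Taylor tool-life relation V·T^n = C and derives a simple physics-based feature from it. The function names and the constants n and C are hypothetical; this is not the formulation used in [43].

```python
def taylor_tool_life(cutting_speed, n, c):
    """Tool life T (min) from the classic Taylor relation V * T**n = C.

    cutting_speed: cutting speed V (m/min)
    n, c: empirical constants for a tool/workpiece pair (assumed values here)
    """
    if cutting_speed <= 0 or n <= 0 or c <= 0:
        raise ValueError("cutting_speed, n and c must be positive")
    return (c / cutting_speed) ** (1.0 / n)

def life_fraction_used(cut_time, cutting_speed, n, c):
    """Physics-based feature: share of the predicted tool life already consumed."""
    return cut_time / taylor_tool_life(cutting_speed, n, c)

# Hypothetical constants, chosen only so the arithmetic is easy to follow.
N_EXP, C_CONST = 0.25, 400.0

print(taylor_tool_life(200.0, N_EXP, C_CONST))         # 16.0
print(life_fraction_used(4.0, 200.0, N_EXP, C_CONST))  # 0.25
```

A feature such as `life_fraction_used` could be appended to sensor-derived inputs so that a data-driven model receives a physically motivated prior on wear progression.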

Pashmforoush, Ebrahimi Araghizad, & Budak (2025) [47] developed a Physics-Informed Machine Learning (PIML) approach for monitoring and predicting tool wear during milling in machining processes. In this study, a thermo-mechanical force model that accounts for wear is integrated with regression-based ML algorithms such as LSBoost, SVR, and Random Forest. This hybrid model demonstrated higher accuracy compared to approaches relying solely on experimental data. For example, for force predictions, R² values exceeded 98%, with a Root Mean Square Error (RMSE) in the range of 10–14 N; for tool wear prediction, R² was 95%, with an RMSE of less than 8 µm. These findings suggest that PIML provides up to 16% higher accuracy compared to models using only ML. A strength of the study is its ability to produce highly reliable results with less experimental data by physically integrating wear effects into the cutting force model. Furthermore, the inverse modeling strategy allowed for direct prediction of tool wear length from cutting parameters and forces. A limitation is that the model’s experimental validation was conducted with a specific material (Steel 1050) and tool geometry, limiting its generalizability to different materials and conditions. Furthermore, it is emphasized that, because the wear-cutting force relationship is complex, a larger dataset may be needed. Future research is recommended to ensure broader applicability at the industrial scale by testing the model with different materials and tool types. The study has potential for industrial applications, particularly its integration with existing current/torque measurements in CNC systems, and its adaptability to real-time tool condition monitoring solutions.
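For readers comparing the figures above, the two reported metrics can be computed as in the following sketch. The definitions are the standard ones for RMSE and the coefficient of determination; the measured and predicted wear values are hypothetical and are not data from [47].

```python
from math import sqrt

def rmse(y_true, y_pred):
    """Root Mean Square Error, in the same unit as the target."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical flank-wear values in micrometres, for illustration only.
measured = [50.0, 80.0, 120.0, 160.0, 210.0]
predicted = [54.0, 77.0, 125.0, 155.0, 214.0]

print(round(rmse(measured, predicted), 2))       # 4.27
print(round(r_squared(measured, predicted), 4))  # 0.9944
```

Because RMSE carries the unit of the target (N for force, µm for wear) while R² is dimensionless, the two are usually reported together, as in the study above.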

In the study conducted by Rama Karthik and Babu Rao (2025) [57], a physics-based and data-driven approach was developed to predict the temperature fields and cutting forces occurring during the machining of nickel-based superalloy IN625. The study is based on the integration of data obtained from finite element (FE) simulations with different machine learning algorithms (AdaBoost, SVR, Random Forest, Gaussian Process, etc.). The FE model was created using the Johnson–Cook material model, the Coulomb–Tresca friction model, and the Arbitrary Lagrangian–Eulerian (ALE) adaptive re-mesh structure and validated with experimental measurements. Thus, errors of less than 4% were achieved in the cutting force and temperature predictions. In the performance comparison of machine learning algorithms, it was reported that AdaBoost provided the highest accuracy in cutting temperature prediction, while SVR and Gaussian Process provided the highest accuracy in cutting force prediction. In addition, the feature importance analysis results were found to be consistent with the effects of theoretically expected cutting parameters. This indicates that the integration of physics-based simulation knowledge with machine learning increases the predictive ability and reliability. However, high mesh density increases simulation times, and appropriate hyperparameter selection plays a critical role in model performance. This study has industrial potential for predicting real-time processing conditions and determining optimal parameters within the context of smart manufacturing and Industry 4.0.

Chen et al. (2025) [58] developed a new physics-enriched deep learning model called Physics-Informed Dynamic Gated Graph Convolutional Network (PIDGGCN). The study focused on cutting tool wear monitoring, and the model was tested with data obtained from the machining of both titanium alloy thin-walled parts and Carbon-Fiber Reinforced Polymer (CFRP) composites. The model was able to learn complex spatio-temporal dependencies thanks to the dynamically updated trajectory matrix and gated graph convolutional layers, and physics-based regularizers encouraged it to respect monotonic wear progression. According to the evaluation metrics, PIDGGCN provided higher accuracy than all compared baseline models; on the HMoTP dataset, it achieved a 30.83% reduction in RMSE, a 32.17% reduction in Mean Absolute Error (MAE), and a 2.47% increase in R². Its strengths are high accuracy combined with an extremely lightweight architecture (only 66 K parameters, 0.07 MFLOPs, and 258 KB memory usage), which makes it suitable for real-time industrial applications. However, limitations include the model’s continued reliance on labeled data when signals are extremely noisy across different cutting conditions, its limited ability to interpret complex wear behaviors, and data selection/normalization stages that still rely on specific assumptions. Suggestions for future work include improving the model’s self-supervised learning capacity in low-label data scenarios and making it scalable to more complex multi-material machining conditions. The model was applied directly in an industrial setting, and its applicability with high accuracy and low memory consumption was demonstrated in real-world cutting experiments.

Dong et al. (2024) [59] proposed a hybrid framework, MultiCNN-Attention-GRU (MCAG), for tool wear prediction. The model integrates CNN-based feature extraction, dynamic weighting via Multi-Head Attention (MHA), and long-term dependency capture through a Gated Recurrent Unit (GRU). A custom Monotonicity Loss was introduced to embed the physical principle that tool wear should not decrease over time. Evaluated on the Prognostics and Health Management 2010 (PHM2010) dataset, MCAG achieved 1.6% lower MAE than the second-best model, highlighting the potential of hybrid systems for both accuracy and explainability in complex manufacturing processes.
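The idea of using physics as a regularizer in the loss function can be sketched as follows. This review does not give the exact form of the Monotonicity Loss in [59], so the hinge-style penalty, the weighting factor `lam`, and all names below are assumptions chosen for illustration.

```python
def mse(y_true, y_pred):
    """Mean squared error: the purely data-driven part of the loss."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def monotonicity_penalty(y_pred):
    """Mean squared size of decreases between consecutive wear predictions.

    Zero when the predicted wear never decreases; positive otherwise.
    """
    drops = [max(0.0, prev - nxt) for prev, nxt in zip(y_pred, y_pred[1:])]
    return sum(d * d for d in drops) / len(drops)

def physics_regularised_loss(y_true, y_pred, lam=1.0):
    """Data term plus lam times the physical-consistency term (lam is assumed)."""
    return mse(y_true, y_pred) + lam * monotonicity_penalty(y_pred)

# Hypothetical wear sequences (um): the second prediction dips from 12 to 11.
target = [10.0, 12.0, 13.0, 15.0]
monotone_pred = [10.0, 12.0, 13.0, 15.0]
dipping_pred = [10.0, 12.0, 11.0, 15.0]

print(physics_regularised_loss(target, monotone_pred))  # 0.0
print(physics_regularised_loss(target, dipping_pred))
```

In a deep learning framework, the same penalty would be written with differentiable tensor operations so that gradients push the network toward non-decreasing wear curves.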

Physics-informed machine learning (PIML) methods aim to make data-driven models more consistent, interpretable, and generalizable by enriching them with physical theories. Such approaches prevent biases that can arise from ignoring physical realities, particularly in production processes, and provide reliable results despite data insufficiency. Various PIML-based studies in the literature demonstrate how different methodological frameworks (e.g., PINN, PIBO, Active Learning) are integrated with physical information and demonstrate their success across application areas. Table 2 below summarizes the key features of these studies, the physical information sources used, the modeling types, and the limitations they face.

The reviewed studies demonstrate the systematic solutions offered by physics-informed hybrid approaches (PIHML) to the most challenging modeling problems in manufacturing. A common theme is that these methods overcome the “black-box” limitations of purely data-based models by focusing on physical consistency and causality. Studies [43,47,59] have demonstrated that integrating the laws of physics into the model provides accuracy improvements of up to 16% and strong generalization capabilities even under limited data conditions. Integration strategies—using physics as a regularizer in the loss function [43,58,59], as a data source through simulations [57], or as a framework for inverse modeling [47]—demonstrate the flexibility and power of the approach.

However, these positive results should not overshadow the fundamental challenges facing the method. Most studies acknowledge that their performance is limited by specific materials, tools, and cutting conditions. Furthermore, as [58] points out, the interpretation of model outputs and the dependence on labeled data in extremely noisy environments are ongoing challenges. The computational cost of complex hybrid models, as noted in [59], also remains a practical concern for industrial deployment. Consequently, while physics-informed hybrid systems offer a superior balance between accuracy, physical consistency, and data efficiency, further research on generalizability, interpretability, and computational efficiency is required for their maturity.

In addition to the studies reviewed above, practical applications from industrial settings illustrate how these approaches perform under real production constraints.

Case Study: Physics-Informed Hybrid Tool Wear Monitoring Model (PSSM) [60]: In this case, tool wear during CNC milling was monitored using cutting force (Fx, Fy, Fz), vibration (Vx, Vy, Vz), acoustic emission (AE), and optical microscopy data. The hybrid framework combined the physical degradation model (Wiener-based wear model) with Gaussian Process (GP) and Particle Filter (PF) methods to create a Probabilistic State Space Model (PSSM).

Noisy Data Environment:

Because vibration and AE signals contained high noise, the signal-to-noise ratio decreased in the early wear stages. The GP-based probabilistic framework modeled these sensor errors, preventing distortion of the physical degradation curve.

Differing Sensitivities of Sensors:

Cutting force tended to increase steadily with wear, while vibration and AE signals became significant only when wear accelerated. This demonstrates why sensor fusion is essential in production environments.

Hybrid Decision Mechanism (Physics + Data):

While data-based models alone (CNN, LSTM, SVR) incorrectly classified wear transitions, the combination of physical model, GP, and PF significantly increased decision accuracy, reaching 97.7% wear prediction accuracy. The acceleration rate signal provided by PF enabled the timely detection of the Failure Threshold (FT).
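The role of the particle filter in such a scheme can be sketched as follows. This is a simplified illustration, not the PSSM of [60]: it omits the Gaussian Process component, and the drift, diffusion, measurement noise, and threshold values are assumptions made only to obtain a runnable example of filtering a Wiener-type wear state.

```python
import math
import random


def particle_filter(observations, n_particles=500, drift=1.0,
                    diffusion=0.3, obs_std=2.0, seed=0):
    """Bootstrap particle filter for a Wiener-type wear state.

    State model:  x_t = x_{t-1} + drift + diffusion * N(0, 1)
    Observation:  y_t = x_t + obs_std * N(0, 1)
    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    particles = [0.0] * n_particles
    estimates = []
    for y in observations:
        # 1) Propagate each particle through the physical degradation model.
        particles = [x + drift + diffusion * rng.gauss(0.0, 1.0)
                     for x in particles]
        # 2) Weight particles by the Gaussian likelihood of the measurement.
        weights = [math.exp(-0.5 * ((y - x) / obs_std) ** 2) + 1e-300
                   for x in particles]
        # 3) Resample in proportion to the weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
        estimates.append(sum(particles) / n_particles)
    return estimates


def first_crossing(estimates, threshold):
    """Index of the first estimate at or above the failure threshold."""
    for t, x in enumerate(estimates):
        if x >= threshold:
            return t
    return None


# Synthetic demonstration: true wear grows by one unit per step.
noise_rng = random.Random(1)
true_wear = [float(t + 1) for t in range(30)]
noisy = [w + noise_rng.gauss(0.0, 2.0) for w in true_wear]
est = particle_filter(noisy)
print(first_crossing(est, threshold=20.0))
```

Because the state is propagated through the degradation model before each measurement is weighed, a single noisy reading cannot pull the estimate far from the physically expected curve.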

Operational Observations:

Installation of high-precision sensors such as dynamometers created additional workload, increasing the field integration costs of hybrid systems. Because an optical microscope can only be used at the end of a process, real-time wear prediction was performed entirely using PSSM. The RMS value of spindle current did not show a significant correlation with wear, indicating that some low-cost sensors may not provide the expected information value.
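A simple way to screen a sensor for information value is to correlate a window-level feature with measured wear. The sketch below assumes RMS as the feature and Pearson correlation as the criterion; the numerical values are hypothetical and are not data from [60].

```python
import math


def rms(samples):
    """Root mean square of one signal window."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical per-cut features: wear in mm, force RMS in N, current RMS in A.
wear = [0.05, 0.10, 0.15, 0.20, 0.25]
force_rms = [100.0, 110.0, 121.0, 129.0, 140.0]
current_rms = [5.1, 4.9, 5.0, 5.1, 4.9]
print(pearson(wear, force_rms), pearson(wear, current_rms))
```

A feature whose correlation with wear stays near zero, as the hypothetical current values do here, adds little to a fusion model regardless of how cheap the sensor is.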

Industrial Evaluation:

This case clearly illustrates the typical challenges hybrid approaches face in the industrial field (noise, sensor heterogeneity, installation costs). The combination of physical model and data-driven prediction identified the critical “severe” phase of wear more reliably, contributing to lower maintenance costs by reducing unnecessary tool changes.

2.2. Knowledge-Guided Hybrid Applications

Models trained solely on data can sometimes make inaccurate generalizations. Therefore, manufacturing engineers' expert knowledge of the process is integrated into deep learning or machine learning algorithms in rule-based or feature-based form, or to support explainability. Integrating expert knowledge aims to increase both model accuracy and interpretability by combining data science and field experience [61].

As the decisions made by AI-based systems become increasingly complex, a transparent and understandable presentation of outputs has become critical. Explainability is fundamental to model performance and user confidence. Within the scope of XAI (Explainable Artificial Intelligence), the following types of explanation are distinguished [62,63]:

Local explanations, which explain the decision in a single example;
Global explanations, which explain the overall structure of the model (visual/mathematical);
Contrastive explanations, which explain why a particular output was chosen;
What-if explanations, which explain the effect of input changes on the outcome;
Counterfactual explanations, which explain how changing assumptions can affect the outcome;
Example-based explanations, which explain model behavior with concrete data examples.

Model-agnostic methods do not rely on model internals. They provide post hoc explanations, which make outputs transparent after a prediction has been made, and pre-hoc explanations, which make the decision process transparent from the outset. Advanced Artificial Intelligence and XAI approaches integrated within Industry 4.0 offer high accuracy, reliability, and explainability in areas such as manufacturing, quality control, and predictive maintenance. Explainable architectures that can work with big data play a significant role in digital transformation [62,63].
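As an illustration of a model-agnostic, post hoc explanation, the sketch below computes permutation importance: each input feature is shuffled in turn and the resulting increase in prediction error is recorded. The technique is chosen here only as a simple example and is not attributed to [62,63]; the `black_box` model and the data are hypothetical.

```python
import random


def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean increase in absolute error when one feature column is shuffled."""
    rng = random.Random(seed)

    def error(rows):
        return sum(abs(predict(r) - t) for r, t in zip(rows, y)) / len(y)

    baseline = error(X)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            shuffled = [row[:j] + [v] + row[j + 1:]
                        for row, v in zip(X, column)]
            increases.append(error(shuffled) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def black_box(row):
    # Hypothetical model: predicted wear depends on force (feature 0) only.
    return 0.002 * row[0]


# Hypothetical inputs: [cutting force, an unrelated second feature].
X = [[100.0 + 10 * i, float((7 * i) % 5)] for i in range(20)]
y = [black_box(r) for r in X]
print(permutation_importance(black_box, X, y))
```

The function only calls `predict`, so it can be applied to any model without access to its internals, which is what makes the explanation model-agnostic.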

Lin and Chen (2024) [64] developed a decision support system (DSS) integrated with an explainable machine learning (ML) model to advance zero-defect manufacturing (ZDM) in injection molding. The proposed PSO + C4.5 approach achieved superior defect classification performance (G-mean: 0.9902, accuracy: 0.9889) compared to SVM, BPNN, RBF, and SOM, while sensor optimization for eight critical variables reduced costs and enhanced industrial applicability. Key strengths include explainability, real-time monitoring, and factory-validated high accuracy, whereas limitations involve PSO’s risk of premature convergence and testing limited to symmetric molds. Future research should extend the framework to diverse mold types and additional sensor data.
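The G-mean reported above is useful because defect data are usually imbalanced. A minimal sketch for the binary case, where G-mean is the geometric mean of sensitivity and specificity, is given below; the confusion-matrix counts are hypothetical and are not results from [64].

```python
import math


def accuracy(tp, fn, tn, fp):
    """Share of all samples classified correctly."""
    return (tp + tn) / (tp + fn + tn + fp)


def g_mean(tp, fn, tn, fp):
    """Geometric mean of sensitivity and specificity (binary case)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return math.sqrt(sensitivity * specificity)


# Hypothetical counts: 10 defective parts, 990 good parts, half the defects missed.
counts = dict(tp=5, fn=5, tn=990, fp=0)
print(accuracy(**counts), g_mean(**counts))
```

With these counts the accuracy stays high while the G-mean drops sharply, which shows why accuracy alone can hide poor detection of the rare defect class.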

Hajgató et al. (2022) [65] introduced PredMaX, a predictive maintenance framework designed for high-dimensional, unlabeled time series data using explainable deep learning. By combining a deep convolutional autoencoder (DCAE) and PCA, the model achieved effective data compression, while automatic clustering in the DCAE latent space outperformed PCA-based clustering. Using Integrated Gradients (IG), PredMaX identified critical sensor channels linked to oil degradation in a gearbox under MW load, offering explainable and reliable fault detection without direct performance metrics. Despite its sensitivity to DCAE hyperparameters and reliance on predefined cluster numbers, the model demonstrated strong industrial potential by detecting fault types typically requiring disassembly. Future work should focus on integrating dynamic, online learning for broader fault classification.

Nasr et al. (2020) [66] proposed an ANFIS-MOPSO hybrid model for predicting and optimizing machining outputs, including surface roughness, cutting force, and depth force during milling of Ti6Al4V/GNP nanocomposites. ANFIS accurately predicted outputs from experimental data, while MOPSO enabled multi-objective optimization within the same framework. The approach significantly reduced prediction errors compared to traditional mathematical models, demonstrating its potential for complex parameter optimization in industrial manufacturing. However, the model was validated only within narrow ranges of GNP ratios and machining parameters, leaving scalability and cross-material applicability unverified. Future research suggests extending the method to other composites and integrating it into real-time process control.

Coutinho et al. (2024) [44] introduced the 0-DMF (Zero Defect Manufacturing Framework), a data-driven decision support system for ZDM. Using supervised learning algorithms (CatBoost, XGBoost, Random Forest) with 72,000 samples from melamine-faced wood panel production, they achieved a high recall (96.6%) after class imbalance correction via Synthetic Minority Oversampling Technique (SMOTE). The distinctive contribution lies in integrating explainable AI (XAI) methods such as SHAP and LIME, which enhance transparency at the operator level and facilitate intervention based on both data and expert knowledge. Beyond error prediction, the framework incorporates optimization algorithms (Powell, Dual Annealing) for real-time process adjustments, thus enabling preventative strategies. This combination provides an expert-friendly, interpretable decision support system that surpasses conventional “black box” models.
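The oversampling step can be sketched in a few lines. The code below is a simplified illustration of the SMOTE idea (interpolating between a minority sample and one of its nearest minority neighbors), not the implementation used in [44]; the sample values and the choice of `k=2` are assumptions suited to a very small example.

```python
import random


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating toward near neighbors."""
    rng = random.Random(seed)

    def dist2(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbors of the chosen sample (excluding itself).
        neighbors = sorted((p for p in minority if p is not base),
                           key=lambda p: dist2(base, p))[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # position along the segment base -> neighbor
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


# Hypothetical minority-class (defect) samples with two process features.
defects = [[1.0, 2.0], [1.2, 2.1], [0.9, 1.8], [1.1, 2.3]]
new_points = smote(defects, n_new=6)
print(len(new_points))
```

Because every synthetic point lies on a segment between two real minority samples, the method enlarges the minority class without leaving the region it already occupies.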

Schmetz et al. (2021) [67] developed an explainable ML-based decision support system for predicting tool wear in ultra-precision lathe operations. Using Random Forest (RF) on 60,550 AE spectrograms, the model achieved 83% accuracy, with misclassifications mainly between adjacent wear classes. Unlike deep learning approaches (e.g., LSTM), RF was preferred for its interpretability and stability. Feature importance analysis (Tree Interpreter, perturbation tests) provided visual and auditory insights, enhancing user trust and training. The study highlights the industrial relevance of real-time analysis and interpretable models, though limited user validation due to COVID-19 and challenges in acoustic representation beyond the human hearing range remain constraints.

Holst et al. (2022) [68] developed an image processing pipeline combining deep learning and rule-based algorithms to detect and measure tool wear in metal cutting. A CNN achieved 100% accuracy in tool presence detection, while a U-Net-based Fully Convolutional Network (FCN) segmented wear areas with 85% Dice coefficient accuracy. The automated measurements strongly correlated with manual inspection (R² = 0.99). The method’s strengths include end-to-end automation, effective performance with limited data, and applicability in industrial settings. Limitations are the small dataset, focus on flank wear only, and limited explainability of segmentation results. Future work should expand datasets, model multiple wear types, and connect segmentation outcomes with final part properties such as surface finish.
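The two stages described above, learned segmentation followed by a deterministic measurement rule, can be illustrated on toy masks. The Dice formula is standard; the measurement rule `max_wear_height_px` and the masks are hypothetical and do not reproduce the pipeline of [68].

```python
def dice(pred_mask, true_mask):
    """Dice = 2 * |A and B| / (|A| + |B|) for binary masks (nested 0/1 lists)."""
    inter = size_p = size_t = 0
    for row_p, row_t in zip(pred_mask, true_mask):
        for p, t in zip(row_p, row_t):
            inter += p * t
            size_p += p
            size_t += t
    if size_p + size_t == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * inter / (size_p + size_t)


def max_wear_height_px(mask):
    """Hypothetical rule: largest number of wear pixels in any image column."""
    return max(sum(col) for col in zip(*mask))


# Toy 2 x 4 masks; 1 marks a pixel labeled as wear.
true_mask = [[0, 1, 1, 0],
             [0, 1, 1, 0]]
pred_mask = [[0, 1, 1, 1],
             [0, 1, 0, 0]]
print(dice(pred_mask, true_mask), max_wear_height_px(true_mask))
```

Multiplying the pixel count by a calibrated pixel size would convert such a rule-based measurement into physical units.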

Such approaches, particularly when field experience is integrated into numerical models, enable engineering insights beyond model performance to be derived. This integrated structure is conceptually presented in Figure 3. This structure is a multi-layered system encompassing sensor-based data collection, preprocessing, artificial intelligence modeling, an explainability layer, and expert-supported decision-making. This structure, where rule-based domain knowledge influences both modeling and interpretation, demonstrates the integration of engineering intuition into AI systems.

Integrative hybrid architectures that combine expert knowledge with explainable machine learning are used in the literature [67]. Figure 4 shows a multi-layered workflow encompassing sensor-based data collection, preprocessing, an AI system, explainability layers, and expert decision-making processes connected in loops. Such an architecture combines data-driven methods with domain expertise, providing both high model performance and interpretability.

For AI systems to be safely implemented in production processes, not only high accuracy but also explainability and structures open to expert judgment are required. Integrating expert knowledge enhances the interpretability of algorithmic decisions, strengthening human–machine collaboration and enabling engineering intuition to contribute to system performance. In this context, several studies have integrated expert knowledge into machine learning models alongside explainable artificial intelligence (XAI) techniques, achieving successful results. Table 3 systematically summarizes these approaches according to the explainability methods they use, their integration types, and their application areas.

The reviewed studies on knowledge-enabled hybrid approaches demonstrate a paradigm shift from pure data-driven prediction towards interpretable, actionable, and trustworthy decision-support in manufacturing. A common theme is that these methods successfully address the “black-box” problem by integrating expert knowledge and explainable AI (XAI) techniques, thereby building operator trust and enabling human-AI collaboration. Studies [44,64,67,68] have shown that this integration not only maintains high accuracy but also provides crucial benefits in transparency, leading to more reliable and adoptable systems in industrial settings. Integration strategies range from using intrinsically interpretable models such as C4.5 decision trees [64] and Random Forests [67], to applying post hoc explanation tools such as SHAP and LIME to high-performance “black-box” models [44], to the structured combination of deep learning with deterministic, rule-based algorithms for measurement [68]; together they demonstrate the methodological flexibility and practical power of the approach.

However, the promising results of these knowledge-guided systems should not overshadow their inherent challenges and limitations. A critical issue across several studies is the scalability and generalizability of the solutions. The performance of these hybrid models is often validated within narrow constraints, such as specific mold geometries [64], a particular production line [44], a limited range of material compositions and process parameters [66], or, as in [68], a small dataset focusing on a single wear type (flank wear). This raises questions about their adaptability to diverse manufacturing environments without significant re-engineering.

Furthermore, the integration of expert knowledge itself presents a bottleneck. The process of formally encoding human intuition into a computational model (e.g., in ANFIS [66]) or into precise measurement rules (e.g., in [68]) can be time-consuming and subjective. While XAI methods like SHAP [44] and Integrated Gradients [65] excel at explaining model outputs, they do not automatically translate those explanations into automated, proactive decisions; the final intervention still heavily relies on human judgment. In a similar vein, the U-Net model in [68] provided accurate segmentation masks but offered limited intrinsic explainability for why it segmented the wear area in a particular way, leaving a gap between the result and full human understanding. Additionally, as noted in [65,67], these systems can be sensitive to model hyperparameters and often lack comprehensive validation with end-users, leaving the human-factor engineering partially unaddressed.

Consequently, while knowledge-enabled hybrid systems offer a superior framework for building accurate, interpretable, and human-centric AI tools, further research is crucial to mature the technology. Future work must focus on developing more systematic and scalable methods for knowledge integration, incorporating online and incremental learning capabilities to adapt to dynamic shop-floor conditions [65], conducting rigorous, long-term user studies to optimize the human-AI collaboration loop [67], and expanding the scope of problems these systems address, for example by connecting AI outputs to final part quality, as suggested for tool wear in [68].

Case Study: Current-Based Knowledge-Guided Hybrid Tool Wear Monitoring System [69]: In this case, tool wear on a CNC lathe was monitored through the feed-cutting force (CF), which was estimated from motor current (Current) and feed rate (Feed Speed). The approach is based on a Knowledge-Guided Hybrid architecture that combines relationships derived from the physical motor model with expert knowledge-based fuzzy rules implemented in an Adaptive Neuro-Fuzzy Inference System (ANFIS). This hybrid approach aims to align data-driven prediction with the logical structure of human expertise.

Sensor Limitations and Noisy Signals:

Because only a portion of the total motor current represents the cutting load, separating the no-load and cutting current was critical. The noise in the current signal at low feed rates made it difficult to distinguish small changes in wear and paralleled the “loss of interpretability under noise” problem in the literature.

Sensor Sensitivity and Physical Behavior:

While cutting force tends to increase steadily as wear progresses, the limited response of the current signal to low wear levels highlights the critical importance of sensor selection and data quality for decision support systems.

Impact of a Knowledge-Guided Hybrid Decision Structure:

Error increased when using a pure mathematical model; however, combining the motor model, which represents the physical relationship, with the expert knowledge-based ANFIS increased force estimation accuracy to 95%. Fuzzy classification processes following this estimation successfully distinguished tool wear into six categories and demonstrated that embedding expert rules in the model produces more stable results than single black-box models in PdM (Predictive Maintenance) applications.
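The chain from motor current to a fuzzy wear category can be sketched as follows. This is a hand-written illustration rather than the trained ANFIS of [69]: the linear gain, the three fuzzy sets (the case study distinguishes six categories), and all numerical values are assumptions.

```python
def triangular(x, a, b, c):
    """Triangular membership function with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def cutting_current(total_current, no_load_current):
    """Portion of the motor current attributable to the cutting load."""
    return max(0.0, total_current - no_load_current)


# Hypothetical fuzzy sets over estimated feed force in newtons.
# The very large right foot of "worn" approximates an open-ended shoulder.
WEAR_SETS = {
    "fresh": (-1.0, 0.0, 150.0),
    "moderate": (100.0, 200.0, 300.0),
    "worn": (250.0, 400.0, 1e9),
}


def classify_wear(force_n):
    """Return the category with the highest membership and all memberships."""
    memberships = {name: triangular(force_n, *abc)
                   for name, abc in WEAR_SETS.items()}
    return max(memberships, key=memberships.get), memberships


# Hypothetical linear motor model: force = gain * cutting current.
GAIN_N_PER_A = 80.0
force = GAIN_N_PER_A * cutting_current(total_current=6.5, no_load_current=4.0)
print(force, classify_wear(force)[0])
```

In an ANFIS the membership parameters would be tuned from data instead of being fixed by hand, while the rule structure remains readable to the process expert.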

Operational Realities:

The operational advantage of using a current sensor is evident because the installation of high-accuracy sensors (e.g., dynamometers) is complex and costly. Because the system indirectly monitors wear, real-time quality assessment remains limited. While the current frequency-feed rate relationship is mostly linear, under some conditions, this relationship weakens, affecting model generalization.

Industrial Assessment:

This case demonstrates why Knowledge-Guided Hybrid approaches are essential in industrial applications:

Sensor fusion is inevitable in the long term because individual sensors are not equally sensitive to all wear levels. The combination of constraints derived from physical relationships and expert knowledge-based rules confirms the practical value of the PIHML approach. Explainability in the decision structure, in particular, enables more consistent detection of the Worn state, reducing unnecessary tool changes and offering the potential to reduce maintenance costs.

2.3. Transfer Learning Hybrid Applications

In machine learning, traditional algorithms are task-specific and lose performance when the task or data distribution changes. Transfer learning addresses this by retraining pre-trained models with new data. Widely used in deep learning, it reduces training time and computational cost. It is effective not only in computer vision tasks like image recognition but also in industrial applications such as fault diagnosis and equipment lifetime prediction, showing strong potential for predictive modeling in additive manufacturing processes [70,71,72,73,74].

In this context, approaches that integrate transfer learning with deep neural networks, such as convolutional neural networks (CNNs), can be regarded as hybrid systems, as they combine a pre-trained feature extraction component with a domain-specific learning module, achieving superior performance compared to either method alone.

Transfer learning allows the reuse of models pre-trained on large-scale datasets in industrial fields operating with limited data. This method is widely preferred, particularly in manufacturing processes where data collection and labeling are challenging. The ImageNet dataset, frequently used in this context, is a comprehensive visual recognition dataset containing millions of labeled images. Deep convolutional neural network models trained on ImageNet can learn general visual features through various layer structures and regularization techniques, enabling successful transfer to data-constrained fields such as manufacturing [75,76].
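The division of labor between a reused feature extractor and a task-specific module can be shown with a deliberately small sketch. A fixed two-unit ReLU layer stands in for convolutional layers pre-trained on a large source dataset, and only a logistic head is fitted to the target data. The weights, the data, and the training settings are hypothetical; this is not a reproduction of any model in [75,76].

```python
import math

# Frozen "pre-trained" extractor: hypothetical fixed weights, never updated.
FROZEN_W = [[1.0, 1.0], [-1.0, -1.0]]


def extract_features(x):
    """ReLU(W x) computed with the frozen weights."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in FROZEN_W]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_head(X, y, lr=0.5, epochs=300):
    """Fit only the task-specific logistic head on top of frozen features."""
    feats = [extract_features(x) for x in X]
    w, b = [0.0] * len(feats[0]), 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for f, t in zip(feats, y):
            err = sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b) - t
            grad_w = [g + err * fi for g, fi in zip(grad_w, f)]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(x, w, b):
    f = extract_features(x)
    return int(sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b) >= 0.5)


# Small hypothetical target dataset (label 1 = worn, 0 = usable).
X = [[1, 1], [2, 0.5], [0.5, 1.5], [1.5, -0.5],
     [-1, -1], [-2, 0.5], [-0.5, -1.5], [0.5, -2]]
y = [1, 1, 1, 1, 0, 0, 0, 0]
w, b = train_head(X, y)
print(sum(predict(x, w, b) == t for x, t in zip(X, y)) / len(X))
```

Only three parameters are learned from the eight target samples, which is the reason such a split remains trainable when labeled manufacturing data are scarce.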

Marei et al. (2021) [75] investigated the prediction of cutting tool health in CNC machining using transfer learning-based Convolutional Neural Networks (CNNs). The ResNet-18 model, retrained with 327 microscopic tool images, predicted wear levels as continuous values between 0 and 1. Achieving 84% accuracy with MAE of 0.0773 and RMSE of 0.1654, ResNet-18 outperformed alternative architectures. The strengths of this approach lie in its ability to achieve high accuracy with limited data, effectiveness in early wear detection, and adaptability through transfer learning. Key limitations include dataset imbalance and the extended training times of deeper models. Future work suggests sensor fusion and reinforcement learning integration. While industrial applications are not yet demonstrated, the method shows strong potential for integration into real-time PHM modules on CNC machines.

Xu Liu et al. (2023) [77] proposed the CDAR model, a deep transfer learning (DTL)-based approach to address conditional distribution shift in Machine Health Monitoring Systems (MHMS). By combining mean square error (MSE) and Conditional Embedding Operator Discrepancy (CEOD) losses, the model aims to improve prediction accuracy while preserving global conditional distribution properties. Tested on NASA and PHM Society datasets, CDAR outperformed existing methods (TCA, DAN, FA) with lower error rates and higher R² (89–92%). Strengths include strong generalization with limited labeled data and the novel integration of CEOD. Limitations involve low explainability, high computational cost, and complex hyperparameter tuning, which may hinder scalability. Future research is suggested to incorporate active learning for more efficient data selection and enhanced interpretability. The model has also been successfully applied to industrial cases such as tool wear prediction and battery health monitoring.

Papenberg et al. (2023) [45] proposed a CNN (EfficientNetB0) with transfer learning to classify milling tool wear, achieving 94.41% accuracy from a relatively small dataset. The use of Grad-CAM enhanced explainability by highlighting tool surfaces and cutting edges, though cracks remained underrepresented. The study demonstrates the strength of hybrid AI methods in tool condition monitoring; however, limitations include sensitivity to lighting, incomplete detection of wear indicators, and lack of domain knowledge integration. While industrially relevant, the model has yet to be validated in real-time manufacturing systems, making its contribution promising but still preliminary.

Yan, Zhu, and Dun (2021) [78] investigated real-time tool wear prediction in milling TC4 titanium alloy using multi-channel force and acceleration signals within a data-driven hybrid framework. By applying STFT-transformed (Short Time Fourier Transform) data to a ResNet-18 model, the study demonstrated greater stability and lower errors than CNN-1D (MSE as low as 6.99; maximum error ≤ 8.5%). Generalizability was validated across varying signal lengths and channel settings, with fine-tuning enabling accurate predictions (MSE < 500). Key strengths include a large, unique dataset and adaptability through transfer learning, while limitations involve poor explainability, low AE signal quality, and fluctuations under changing input conditions.
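The conversion of a one-dimensional sensor signal into a time-frequency image, which is what allows an image network such as ResNet-18 to be reused, can be sketched with a naive short-time Fourier transform. The window length, hop size, Hann window, and synthetic signal are assumptions for illustration and are not the settings of [78].

```python
import cmath
import math


def stft_magnitude(signal, window=16, hop=8):
    """Magnitude spectrogram: one list of frequency bins per time frame."""
    hann = [0.5 - 0.5 * math.cos(2 * math.pi * n / (window - 1))
            for n in range(window)]
    frames = []
    for start in range(0, len(signal) - window + 1, hop):
        seg = [signal[start + n] * hann[n] for n in range(window)]
        # Naive DFT of the windowed segment, non-negative frequencies only.
        spectrum = [abs(sum(seg[n] * cmath.exp(-2j * math.pi * k * n / window)
                            for n in range(window)))
                    for k in range(window // 2 + 1)]
        frames.append(spectrum)
    return frames


# Synthetic force-like signal with 4 cycles per 16-sample window.
signal = [math.sin(2 * math.pi * 4 * n / 16) for n in range(64)]
spec = stft_magnitude(signal)
print(len(spec), len(spec[0]))
```

Stacking the frames produces a two-dimensional array (time by frequency) per channel, so multi-channel force and acceleration signals can be arranged like the color channels of an image.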

All these studies show that both visual and signal-based models pre-trained on large-scale datasets using a transfer learning approach offer high accuracy, low error rates, and strong generalization capabilities even with limited data in the production domain. This demonstrates their successful reuse in applications such as tool wear prediction, surface quality analysis, and industrial visual inspection. Figure 5 presents a flowchart illustrating the general process for adapting transfer learning to these manufacturing applications.

Manufacturing processes are typically characterized by low volumes and limited labeled data. Therefore, the transfer learning (TL) approach, which reuses models pre-trained on large datasets, offers a valuable solution for industry. Table 4 below supports this analysis by summarizing the transfer learning architectures used in various studies, the transferred data sources, the target application areas, and the limitations encountered.

The reviewed studies reveal that Transfer Learning Hybrid approaches offer one of the most effective solutions to data scarcity, one of the most critical challenges in manufacturing. Consistent with our definition, this method achieves synergistic performance by integrating general feature extraction capabilities learned from a broad range of source tasks (e.g., ImageNet) with a fine-tuning module specific to the target task (e.g., tool wear). Studies [45,75,77,78] have demonstrated that this integration yields high accuracy, efficiency, and generalization capabilities at a level not achieved by either component alone.

However, this promising picture does not conceal significant limitations. A common weakness of the reviewed studies is their lack of explainability and inadequate integration of physical and expert knowledge. The models remain “black boxes” [77,78], and while they can be partially elucidated by methods such as Grad-CAM [45], the failure to systematically incorporate manufacturing expertise into the decision-making process remains a significant obstacle. Furthermore, most of these methods have been validated only under narrow laboratory conditions; their performance in noisy, dynamic real-time production environments [45,75] and the scalability of their complex hyperparameter settings [77] remain open questions.

In conclusion, while Transfer Learning Hybrids stand out as a powerful solution for data scarcity, they need to reach the next level of maturity and confidence. Future research should combine these methods with Physics-Informed or Expert-Informed Hybrid frameworks to increase explainability and physical fidelity, further enhance data efficiency with active learning, and ultimately focus on real-world validation.

Case Study: Force-Based Tool Wear Monitoring on Two Production Lines (AL + TL Hybrid Architecture) [79]: In this case, tool wear during turning at two different production facilities was predicted from measured force components (Fx–Fz) using Active Learning (AL) and Transfer Learning (TL). The datasets showed significant distribution variability due to sensor placement and material differences.

Data Environment and Sensor Effects:

Force data was more linear and noisy in Company A, while in Company B, it exhibited a non-linear but more separable structure. The mid-range wear, especially in Company A, was the area where the model struggled the most. Sensor location (tool holder/cutting tool) directly affected signal amplitude and noise, confirming the importance of sensor positioning in industrial environments.

Hybrid Decision Architecture (AL + TL):

With AL, only 30% of the data was initially used, and then uncertain samples were added, reducing the data requirement by 40–60%.

Company A: 480 → 210 data points, accuracy 0.92.

Company B: 229 → 126 data points, accuracy 1.00.
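The reported savings follow directly from the data-point counts, and the selection of uncertain samples can be illustrated with a basic query rule. The case study does not specify the exact query strategy, so the binary uncertainty-sampling function and the probability values below are assumptions.

```python
def data_reduction(initial, final):
    """Fraction of labeled samples saved by active learning."""
    return 1.0 - final / initial


def select_most_uncertain(probabilities, batch_size):
    """Indices of unlabeled samples whose predicted class probability is
    closest to 0.5 (binary uncertainty sampling)."""
    ranked = sorted(range(len(probabilities)),
                    key=lambda i: abs(probabilities[i] - 0.5))
    return ranked[:batch_size]


print(round(100 * data_reduction(480, 210), 1))  # Company A, percent saved
print(round(100 * data_reduction(229, 126), 1))  # Company B, percent saved

# Hypothetical model outputs for five unlabeled force samples.
print(select_most_uncertain([0.95, 0.52, 0.10, 0.45, 0.70], batch_size=2))
```

The counts correspond to savings of roughly 56% for Company A and 45% for Company B, both within the stated 40–60% range.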

Intercompany data transfer was carried out with TL; accuracy increased in the B → A direction, while data requirements increased in the A → B direction. This demonstrated that distribution differences determine TL performance.

Operational Findings:

Dynamometer-based sensors created additional installation costs. Because optical wear measurement can only be performed offline, real-time decisions were made using a hybrid model based on force signals. Because some signals (e.g., spindle current) did not show a significant correlation with wear, sensor selection should be based on their information value.

Industrial Assessment:

This case demonstrates that the hybrid AL + TL (Transfer Learning Hybrid) approach produces more reliable predictions in noisy and variable sensor environments. In particular, the more reliable detection of the worn state reduced unnecessary tool changes, contributing to lower maintenance costs.

2.4. Heterogeneous Model Hybrid Applications

Ensemble approaches aim to improve system performance by using multiple machine learning algorithms together or sequentially. These methods are preferred to improve model stability, accuracy, and generalization ability, especially in complex production processes [38]. Ensemble models are generally divided into two basic groups: homogeneous and heterogeneous structures. Homogeneous ensembles are created by combining multiple instances of the same machine learning algorithm, while heterogeneous ensembles are created by using different types of algorithms together. In addition, by combining the outputs of different algorithms, ensemble methods reduce model variance and thus the risk of overfitting [80]. In this context, various applications where different algorithms are used together are encountered in the literature.
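A heterogeneous ensemble in its simplest form combines the votes of structurally different models. The sketch below uses a nearest-neighbor classifier, a threshold rule, and a linear model with majority voting; the three base learners, their parameters, and the data are hypothetical and do not correspond to a specific study.

```python
from collections import Counter


def knn_predict(train, x, k=3):
    """Majority label among the k training samples closest to x."""
    nearest = sorted(train, key=lambda s: abs(s[0] - x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def threshold_predict(x, limit=150.0):
    """Expert-style rule on the raw force value."""
    return "worn" if x > limit else "ok"


def linear_predict(x, slope=0.002, wear_limit=0.29):
    """Linear wear estimate compared against a wear limit."""
    return "worn" if slope * x > wear_limit else "ok"


def ensemble_predict(train, x):
    """Majority vote over three different model types."""
    votes = [knn_predict(train, x), threshold_predict(x), linear_predict(x)]
    return Counter(votes).most_common(1)[0][0], votes


# Hypothetical training data: (cutting force in N, tool state).
train = [(100.0, "ok"), (120.0, "ok"), (140.0, "ok"),
         (160.0, "worn"), (180.0, "worn"), (200.0, "worn")]
print(ensemble_predict(train, 148.0))
```

Near the decision boundary the base models can disagree, and the vote prevents any single model's error from determining the outcome.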

Lin and Hsieh (2024) [46] developed an ensemble learning-based model for tool wear prediction in SKD11 steel milling. The study highlights the hybrid model’s contribution to cutting performance and the integration of multiple models as major strengths. Its limitations lie in being tested on a single tool–workpiece combination and lacking real-time validation. The approach shows strong potential for industrial use, especially when integrated with automatic tool-changing systems. Future research should focus on generalizability across different material–tool combinations and comparative analyses with deep learning models.

Demircioğlu Diren et al. (2023) [81] investigated the optimization of cutting parameters in end milling of AISI 321 stainless steel using statistical and machine learning (ML) approaches. Cutting speed, feed per tooth, and depth of cut were optimized via RSM (Response Surface Methodology), and ANOVA identified depth of cut as dominant for Fx and Fy, while cutting speed was most influential for Ra. Comparative evaluation of ANN, DT, K-NN, and an ensemble model revealed that no single algorithm performed best for all outputs; the ensemble model yielded the lowest error for Fx, while ANN was most accurate for Fy and Ra. The study’s strength lies in combining statistical design with ML for predictive modeling. However, the reliance on only three ML methods and a small dataset (27 experiments) limits generalizability. Moreover, the lack of explainability of the models reduces industrial reliability. Despite these limitations, the work highlights the potential of AI-based predictive models to reduce experimental costs and enhance manufacturing efficiency.

In the study by Gertrude David et al. (2022) [82], a hybrid model integrating the Improved Dragonfly Optimization Algorithm (IDOA) and Deep Belief Network (DBN) was developed for tool wear classification. Through dimensionality reduction and feature extraction, the model achieved high accuracy (98.83%) with notable computational efficiency. The key strengths lie in its strong classification performance and optimization effectiveness, while limitations include the relatively small dataset and lack of validation under varied machine conditions. Furthermore, the deep learning architecture poses challenges regarding explainability. The authors recommend future research on larger datasets, real-time industrial integration, and the use of explainable artificial intelligence (XAI). Although not yet practically deployed, the model shows potential for autonomous maintenance systems in CNC machines.

Ou et al. (2021) [83] introduce a hybrid approach combining a deep kernel autoencoder (DKAE) with gray wolf optimization (GWO) for tool wear detection using spindle motor current signals. By applying compressed sensing and kernel-based feature extraction, the method achieves higher accuracy (+8%) and lower processing time compared to traditional classifiers (SVM, BPNN, KNN, ELM). The study’s strengths lie in real-world data use, signal compression, and interpretability. However, its reliance on a limited dataset and a single sensor type (spindle current) restricts generalizability. Future directions highlight multi-sensor fusion and adaptability to variable cutting conditions. Despite these limitations, the industrial validation with an STC1600 CNC machine confirms practical applicability in tool life management.

He et al. (2021) [84] proposed a deep learning-based method for tool wear prediction using temperature signals. The SSAE-BPNN hybrid model outperformed traditional approaches by enabling automatic feature extraction and demonstrated adaptability to varying cutting parameters with real CNC lathe data. However, the need for specialized sensors, high computational cost, and lack of testing across different materials/conditions limit its industrial applicability. Although no industrial case study was provided, the study highlights the potential of this method for real-time tool life prediction in smart manufacturing.

Xie et al. (2023) [85] propose a hybrid deep learning model (CIEBM), integrating CNN, Informer encoder, and BiLSTM, for tool wear (TW) monitoring. The model achieved 99% accuracy on the IEEE PHM2010 dataset, outperforming CNN (by 17.42%) and BiLSTM (by 2.05%). Its main strengths are automated feature extraction, explainability via attention heatmaps, and industrial potential. However, limitations include computational complexity, questionable generalization under variable cutting conditions, and lack of industrial case studies. The authors suggest combining hybrid approaches with physical and adaptive models for broader applicability.

Echeverria-Rios & Green (2024) [86] propose a scalable and robust Gaussian Process (GP) method for predicting product quality in continuous manufacturing. The core contribution lies in combining Dirichlet Process (DP)-based clustering with GP regression (the DPGP model), which effectively filters faulty data and improves prediction accuracy. Compared to standard GP and robust GP (RGP), the model achieved lower RMSE values and successfully excluded sensor-related outliers in industrial applications. Key strengths include scalability, uncertainty estimation, and data lifecycle management, while weaknesses involve reliance on maximum-likelihood hyperparameter estimates, which do not capture hyperparameter uncertainty, and limited real-time efficiency on large datasets. Future improvements are suggested through sampling-based uncertainty propagation, overfitting prevention, and faster sparse GP strategies.

Individual models cannot always adequately represent the complex nature of production processes, which can lead to increased model variance or weakened generalizability. Ensemble modeling approaches aim to provide more balanced, accurate, and stable results by combining the output of multiple algorithms. Heterogeneous ensemble structures, in particular, combine the strengths of different algorithms to create structures more resilient to data diversity. Table 5 presents examples of ensemble models from the literature, listing their component algorithms, application areas, and success indicators, and outlines the practical impact of this approach in the manufacturing sector.

In conclusion, examples presented in the literature demonstrate the successful application of ensemble and hybrid modeling approaches to production processes. These models increase accuracy and generalization capabilities while also improving system stability. Combining homogeneous and heterogeneous ensemble structures with different algorithms yields more balanced and reliable results. In hybrid systems, explainability is achieved by combining physics-based information with data-driven approaches. The use of these methods, particularly in complex production environments, contributes to the smartening of processes. Ensemble structures have become a key tool for Industry 4.0 applications. Figure 6 summarizes the flow from data sources used in manufacturing processes to modeling layers and output targets in the context of hybrid systems and ensemble methods. The figure visually illustrates how physics-based and experimental data are combined with hybrid and ensemble algorithms in the model integration layer, leading to classification or regression models, and how these models contribute to final output targets (e.g., surface roughness prediction, tool wear classification, zero-defect manufacturing).

The reviewed applications of Heterogeneous Model Hybrids underscore their primary strength: creating balanced and robust predictive systems by strategically combining the unique strengths of diverse algorithms. As defined by our criteria, this integration achieves synergistic performance, where the ensemble’s accuracy and generalizability surpass what any single constituent model could deliver. Studies leverage this through parallel ensembles [46,81], optimization-model hybrids [82], and deep architectural fusions [85] to effectively tackle complex manufacturing phenomena like tool wear and product quality.

Despite this demonstrated effectiveness, the analysis reveals consistent challenges that hinder broader industrial adoption. A significant limitation across multiple studies is the “black-box” nature of these complex ensembles [81,82,85], which undermines trust and operational reliability. Furthermore, the generalizability of these models is frequently questioned, as they are often validated on limited datasets or a single tool-workpiece combination [46,83,84]. The computational complexity of integrating multiple sophisticated models also poses a barrier to real-time implementation in production environments [84,85,86].

In conclusion, while Heterogeneous Model Hybrids are a powerful paradigm for improving predictive performance, their path to maturity requires addressing key gaps such as limited generalizability arising from small and homogeneous datasets, lack of explainability in complex hybrid architectures, high computational cost restricting real-time deployment, and the absence of standardized design frameworks for the integration of heterogeneous models in industrial environments. Future research must prioritize explainable AI (XAI) techniques to open the “black box,” validate models on larger and more diverse industrial datasets to prove generalizability, and optimize architectures for computational efficiency to enable real-time deployment.

Case Study: Image-Based (YOLOv3) + Multi-Sensor (Force–Vibration–Current) Hybrid Tool Wear Monitoring System [87]: In this study, tool wear was monitored simultaneously using both direct image analysis (microscope + YOLOv3, CNN) and indirect sensor data (cutting force F, vibration A, spindle current C). This structure highlights the typical data integration challenges that hybrid approaches face in a production environment.

Sensor Behavior and Data Noise:

The vibration sensor (A) exhibited high variance at low wear levels, with signal separation becoming apparent only in the Worn region (11% increase). In contrast, the force signal (F) increased consistently with wear and was the most reliable sensor. Spindle current (C) did not produce a significant correlation with wear at the RMS level.
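The RMS-level comparison described above can be sketched as follows. This is a minimal illustration rather than the procedure used in [87]: the signal windows, wear values, and the helper names `rms` and `pearson` are all hypothetical, and the data are constructed so that one signal tracks wear while the other does not.

```python
import math

def rms(signal):
    """Root-mean-square amplitude of one signal window."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # a constant series carries no linear relationship
    return cov / (sx * sy)

# Hypothetical data: one signal window per cut, paired with measured wear (mm).
wear_mm = [0.05, 0.10, 0.15, 0.20, 0.25]
force_windows = [[10, -10, 10], [12, -12, 12], [14, -14, 14],
                 [16, -16, 16], [18, -18, 18]]
current_windows = [[5, -5, 5], [5, 5, -5], [-5, 5, 5],
                   [5, -5, -5], [5, 5, 5]]

force_rms = [rms(w) for w in force_windows]
current_rms = [rms(w) for w in current_windows]

r_force = pearson(force_rms, wear_mm)      # amplitude grows with wear
r_current = pearson(current_rms, wear_mm)  # amplitude is flat across cuts
```

A check of this kind only captures linear association at the RMS level; a signal that fails it may still carry wear information in other features (for example, spectral content), which the sketch does not test.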

Effect of Image + Sensor Fusion:

Using YOLOv3-based image classification (CNN) alone, Good → Worn errors reached 15%, while accuracy remained at 60% in the Moderate class. When force–vibration regression models (polynomial regression) were added to the system, overall accuracy increased to 81.1%, and precision, particularly in the Worn state, increased to 83.3%. The critical Good ↔ Worn errors were completely eliminated.
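One simple way to picture this kind of image-plus-sensor decision is late fusion, where the image classifier's class probabilities are blended with a class derived from a polynomial regression on a sensor signal. The sketch below is an assumption-laden illustration and not the fusion rule of [87]: the regression coefficients, wear thresholds, blend weight, and all function names are hypothetical.

```python
CLASSES = ("Good", "Moderate", "Worn")

def wear_from_force(force_n, coeffs=(0.0, 0.001, 0.000002)):
    """Hypothetical quadratic regression: wear (mm) = c0 + c1*F + c2*F^2."""
    c0, c1, c2 = coeffs
    return c0 + c1 * force_n + c2 * force_n ** 2

def class_from_wear(wear_mm, moderate_at=0.1, worn_at=0.3):
    """Map a wear estimate to a class using illustrative thresholds."""
    if wear_mm >= worn_at:
        return "Worn"
    if wear_mm >= moderate_at:
        return "Moderate"
    return "Good"

def fuse(cnn_probs, force_n, cnn_weight=0.5):
    """Late fusion: blend CNN class probabilities with a one-hot sensor vote."""
    sensor_class = class_from_wear(wear_from_force(force_n))
    scores = {}
    for name in CLASSES:
        sensor_vote = 1.0 if name == sensor_class else 0.0
        scores[name] = cnn_weight * cnn_probs[name] + (1.0 - cnn_weight) * sensor_vote
    return max(CLASSES, key=lambda name: scores[name])

# The image model alone would call this tool "Good"; a high cutting force
# pushes the fused decision to "Worn".
decision = fuse({"Good": 0.55, "Moderate": 0.10, "Worn": 0.35}, force_n=300.0)
```

With equal weights, a confident sensor vote can overturn a marginal image prediction, which is the mechanism by which critical Good versus Worn confusions could be suppressed. How the weights should be set in practice is an open design choice that this sketch does not address.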

Operational Realities:

A dynamometer (high-precision F-sensor) creates installation costs and movement restrictions for industrial use. Because surface roughness (Ra) measurements can only be made at the end of experiments, real-time quality estimation was performed indirectly using the hybrid model. Because the system currently operates offline, real-time data processing must be added before it can be transferred to a real production environment.

Industrial Evaluation:

This case demonstrates that hybrid systems provide more reliable decisions than single models. The varying sensitivities of sensors in different wear zones necessitate sensor fusion. The hybrid decision mechanism (CNN + F–A regression) more reliably determined the Worn state of the tool, reducing unnecessary tool changes and providing significant benefits for PdM.

3. The Limitations of Non-Hybrid Approaches

While AI models such as SVM, Random Forests, and CNNs have demonstrated high accuracy in specific manufacturing tasks (e.g., 97–99% in tool wear prediction), they continue to face structural limitations that may restrict their robustness and industrial applicability. These challenges constitute a key motivation for the exploration of hybrid AI systems that combine multiple paradigms and sources of knowledge. Although conventional machine learning and deep learning models have achieved remarkable predictive performance in controlled experiments, their generalization to dynamic, real-world manufacturing environments remains limited. These systems often depend heavily on data-specific patterns, lacking mechanisms to incorporate domain knowledge or physical constraints. As a result, while they may achieve outstanding numerical accuracy under specific laboratory conditions, their stability, interpretability, and adaptability under changing operational scenarios are often reduced.

To overcome these persistent limitations, recent studies have increasingly focused on the development of hybrid AI systems, that is, frameworks that integrate data-driven learning with complementary sources of information, such as physics-based modeling, expert knowledge, and diverse algorithmic structures. Such integration enables models not only to recognize statistical patterns within data but also to maintain physical consistency and provide transparent, interpretable decision mechanisms.

The conceptual and methodological distinctions between non-hybrid and hybrid systems are summarized in Table 6 to clarify their structural differences and underlying integration principles.

The major constraints associated with non-hybrid approaches can be discussed under the following categories:

Poor Generalization: Non-hybrid AI models often show impressive predictive capability when tested under controlled laboratory conditions. However, their performance may deteriorate significantly when exposed to new or unseen conditions, such as different material batches, machine types, or environmental variables.

For instance, Random Forest (RF) models have achieved high accuracy levels (e.g., 97.3% in tool wear prediction) in limited experimental setups ([88]), yet these models were trained on small datasets comprising fewer than 20 samples. Such restricted data diversity can reduce the model’s ability to adapt to other machining processes or material types. Similarly, Adaptive CNN-based methods used for tool wear prediction in milling tasks ([91]) have shown strong accuracy gains within specific parameter ranges, but their performance tends to decline when machining conditions differ significantly.

These examples suggest that, while non-hybrid models can be effective under narrowly defined circumstances, their generalization capacity remains conditional on the similarity between training and operational data.

Data Dependency and Cost: A common limitation in the application of AI to manufacturing lies in its heavy reliance on large, high-quality, and labeled datasets. In industrial environments, collecting such data is often costly, time-consuming, and sometimes impractical.

Studies using deep learning architectures such as CNNs have achieved very high classification accuracy (above 99%) in tool wear or defect detection ([89,92]), yet these models were typically trained on relatively small image sets of only a few thousand samples. This limited data scope may increase the risk of overfitting and reduce the models’ ability to handle variability in lighting, surface texture, or cutting conditions.

Similarly, regression-based models such as XGBoost have reported R² values close to 0.99 in wear prediction tasks ([90]), but these outcomes were obtained from datasets of fewer than 40 samples and under uniform dry-sliding conditions.

Taken together, these findings imply that the data dependency of non-hybrid models remains a major bottleneck and that insufficient or homogeneous data can compromise their stability and scalability in real manufacturing environments.

The “Black Box” Problem: Another recurring limitation concerns the interpretability of non-hybrid AI models. Deep neural networks and complex ensemble techniques often provide little transparency regarding the reasoning behind their predictions.

For example, models developed for predictive maintenance using gradient-boosting approaches such as LightGBM have achieved high accuracy levels (≈99%) under limited data conditions ([93]), yet their internal decision mechanisms are difficult to interpret. This lack of explainability can make it challenging for engineers and operators to trust AI-generated insights or integrate them into operational decision-making.

Similarly, in tool condition monitoring applications, ensemble methods such as Random Forest and Extra Trees ([94]) have delivered strong performance metrics (up to 98–99% accuracy), but their complex internal voting mechanisms may obscure the contribution of individual variables or sensors.

Thus, despite their success in pattern recognition, these models may not provide the interpretive clarity required for high-stakes industrial deployment, where explainability and accountability are critical.

Ignoring Physical Laws and Expert Knowledge: Purely data-driven approaches operate primarily through statistical correlations within datasets, without explicit consideration of the physical mechanisms or expert rules that govern manufacturing processes. As a result, such models can occasionally generate predictions that are physically implausible or inconsistent with established engineering knowledge.

In certain studies employing Weighted Extreme Learning Machine (WELM) frameworks for wear prediction in coated steel surfaces ([95]), the models have achieved R² values above 0.97. However, these methods are typically developed under simplified physical assumptions and limited parameter ranges, suggesting that physical process interactions may not be fully represented.

Similarly, neural-network-based approaches in composite wear prediction ([96]) have achieved very high correlation coefficients, yet they were trained on small datasets and tested under narrow experimental conditions (e.g., dry sliding only). This may indicate that the absence of explicit physics integration or expert constraints can restrict the realism of the predicted outcomes.

Therefore, the tendency of purely data-driven models to overlook domain-specific principles highlights a potential need for hybrid frameworks that combine data learning with physics-based reasoning and expert knowledge.

A Clarification on Ensemble Methods: Ensemble learning techniques can improve predictive performance by combining multiple models, yet not all ensemble approaches qualify as hybrid. Homogeneous ensembles, such as Random Forest or XGBoost, aggregate multiple versions of the same type of learner (e.g., decision trees) and thus remain within a single methodological paradigm. In contrast, heterogeneous ensembles, which integrate fundamentally different algorithms, such as SVMs, neural networks, and regression models, can be considered hybrid, as they combine diverse learning principles.
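The distinction can be made concrete with a small sketch of a heterogeneous ensemble: three learners built on different principles (class centroids, a single threshold rule, and nearest neighbours) vote on one sample. Everything here is illustrative; the two-feature data, the binary labels (0 = not worn, 1 = worn), and the function names are assumptions and do not reproduce any reviewed study.

```python
from collections import Counter

def fit_centroid(X, y):
    """Nearest-centroid learner: store the mean feature vector per class."""
    sums, counts = {}, Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, v in enumerate(row):
            acc[i] += v
    centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}
    def predict(row):
        return min(centroids,
                   key=lambda c: sum((a - b) ** 2 for a, b in zip(row, centroids[c])))
    return predict

def fit_stump(X, y, feature=0):
    """Decision stump: best single threshold on one feature (labels 0/1)."""
    best = None
    for t in sorted({row[feature] for row in X}):
        hits = sum((row[feature] >= t) == bool(label) for row, label in zip(X, y))
        if best is None or hits > best[0]:
            best = (hits, t)
    threshold = best[1]
    return lambda row: int(row[feature] >= threshold)

def fit_knn(X, y, k=3):
    """k-nearest-neighbour learner with squared Euclidean distance."""
    def predict(row):
        nearest = sorted(zip(X, y),
                         key=lambda p: sum((a - b) ** 2 for a, b in zip(row, p[0])))[:k]
        return Counter(label for _, label in nearest).most_common(1)[0][0]
    return predict

def heterogeneous_vote(models, row):
    """Majority vote across learners of different types."""
    return Counter(m(row) for m in models).most_common(1)[0][0]

# Hypothetical features: (normalised force, normalised vibration).
X = [[1.0, 0.2], [1.2, 0.3], [1.1, 0.25], [2.8, 0.9], [3.0, 1.0], [3.1, 1.1]]
y = [0, 0, 0, 1, 1, 1]
models = [fit_centroid(X, y), fit_stump(X, y), fit_knn(X, y)]
```

A homogeneous ensemble would instead contain many copies of one learner type (for example, many stumps trained on resampled data), so its members tend to share the same kind of error. Here the three learners make different assumptions about the decision boundary, which is what the hybrid label refers to.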

In the reviewed literature, homogeneous ensembles ([88,94]) have consistently demonstrated strong performance but with persistent issues related to limited generalization, lack of feature diversity, and constrained validation in industrial conditions. Conversely, studies combining distinct algorithmic approaches ([97]) indicate that diversifying model structures may improve stability and resilience against variations in operating conditions.

These observations support the view that hybrid systems, by integrating complementary modeling paradigms, could offer more adaptive and balanced solutions to complex manufacturing challenges.

Overall, non-hybrid AI methods in manufacturing have achieved high levels of predictive accuracy in specific, often constrained scenarios. However, recurring limitations such as restricted generalization, strong data dependence, interpretability challenges, and the absence of physical or expert-informed reasoning suggest that these approaches may not always provide the robustness or reliability required for real-world industrial deployment.

In this context, hybrid AI systems emerge as a promising evolution. By integrating data-driven algorithms with physics-based models, expert knowledge, and multiple methodological paradigms, hybrid frameworks have the potential to enhance generalization, interpretability, and industrial applicability, while maintaining competitive predictive performance.

4. Challenges and Future Directions

Based on our analysis of the hybrid AI landscape in manufacturing, it is evident that these systems, while promising, face a critical “lab-to-factory” gap. The transition from controlled validation to robust industrial deployment is hindered by a set of interconnected challenges. This section synthesizes these barriers from a forward-looking perspective and outlines pivotal research trajectories that, in our view, are essential for the field’s maturation.

4.1. Major Challenges

The Scalability and Generalization Bottleneck: A recurring weakness across all hybrid categories is their limited operational scope. For instance, while the PIML model by Pashmforoush et al. [47] achieves a remarkable 16% accuracy gain, its validation on a single material-tool pair raises serious concerns about its broader applicability. Similarly, the physics-informed model by Zhu et al. [43] lacks demonstrated scalability to other processes like turning or drilling. We argue that this narrow validation paradigm is the single greatest impediment to industrial trust and adoption.

Across hybrid AI methods, a persistent limitation is their inability to scale beyond narrowly defined operating conditions. While many hybrid frameworks demonstrate superior accuracy under controlled environments, the industrial case studies clearly show that this performance does not easily generalize across machines, materials, sensor configurations, or production lines. In the Physics-Informed State Space Model (PSSM) [60], for instance, the model achieved 97.7% accuracy only under a specific multi-sensor configuration (cutting forces, vibration, acoustic emission, and optical measurements). When low-cost alternatives such as spindle current were evaluated, they failed to correlate with wear, illustrating that a model validated in one sensor-rich environment does not necessarily generalize to more cost-efficient setups. Similarly, the current-based Knowledge-Guided Hybrid system [69] generalized reliably only under operational ranges where the current–force relationship remained linear; the predictive performance degraded markedly at low feed rates or under irregular current response, showing that even physically grounded hybrids are restricted by narrow applicability domains. The Active Learning + Transfer Learning (AL + TL) hybrid architecture deployed across two production lines [79] further underscored this bottleneck: transfer from factory B to A improved accuracy, while the reverse increased data requirements and reduced robustness. This asymmetry demonstrates that distribution differences in sensor placement, material heterogeneity, and signal linearity can prevent hybrid systems from generalizing across real industrial sites. Together, these findings reinforce that scalability is not merely a modeling challenge but a foundational barrier to industrial trust.

Balancing Accuracy, Complexity, and Transparency: The pursuit of higher accuracy often leads to architectures that are computationally prohibitive or opaque. While deep ensembles like the MultiCNN-Attention-GRU model [59] are accurate, their high cost makes real-time deployment challenging. Conversely, simpler, explainable models may lack predictive power. Furthermore, we observe that even successful XAI integrations [44,67] often provide post hoc rationalizations rather than intrinsic transparency, failing to fully bridge the trust gap with operators.

Hybrid architectures strive to combine the predictive accuracy of data-driven models with the interpretability of physics or knowledge-based components, yet industrial evidence reveals that this balance remains delicate. In the PSSM system [60], although the integration of a physical degradation model with Gaussian Processes and Particle Filtering substantially improved failure-threshold detection, the resulting probabilistic pipeline increased computational complexity and reduced interpretability for operators, especially regarding PF-derived acceleration signals. The current-based Knowledge-Guided Hybrid model [69] demonstrated that embedding expert rules significantly stabilizes decision quality and increases force estimation accuracy to 95%; however, this came at the cost of maintaining and validating a fuzzy rule base that requires expert intervention and domain knowledge, limiting scalability across facilities lacking such expertise. In the AL + TL architecture [79], the uncertainty-based sample selection of Active Learning and the domain transformation layers of Transfer Learning improved accuracy while reducing labeling costs, but they also obscured the rationale behind sample prioritization and domain adaptation, leaving operators unable to interpret why specific decisions were made. Across all cases, the hybrid systems attempt to boost accuracy, often introducing layers of complexity that diminish intrinsic transparency, revealing that achieving all three simultaneously remains an unresolved challenge.
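To illustrate what uncertainty-based sample selection means in the Active Learning setting mentioned above, the following sketch ranks unlabeled samples by the Shannon entropy of their predicted class probabilities and returns the most uncertain ones. This is generic entropy sampling, offered as an assumption about the general idea rather than the specific strategy of [79]; the probability vectors and function names are hypothetical.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_for_labeling(pool_probs, budget):
    """Return indices of the `budget` most uncertain unlabeled samples."""
    ranked = sorted(range(len(pool_probs)),
                    key=lambda i: entropy(pool_probs[i]),
                    reverse=True)
    return ranked[:budget]

# Hypothetical model outputs over (Good, Moderate, Worn) for four unlabeled cuts.
pool = [
    [0.98, 0.01, 0.01],  # confident prediction, little value in labeling
    [0.40, 0.35, 0.25],  # ambiguous
    [0.70, 0.20, 0.10],  # fairly confident
    [0.34, 0.33, 0.33],  # almost uniform, most uncertain
]
chosen = select_for_labeling(pool, budget=2)
```

Exposing the entropy score next to each selected sample is one inexpensive way to make the prioritization rationale visible to operators, although it explains only why a sample was queried and not why the model is uncertain about it.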

The Data Scarcity and Domain Shift Paradox: Hybrid systems are often proposed as a solution to data scarcity, yet they can be vulnerable to it. Transfer learning, a key strategy here, is itself hampered by domain shift. The high accuracy of ResNet-18 in tool wear prediction [75] is tempered by its sensitivity to dataset imbalance and training time. Similarly, the CDAR model’s [77] success in handling distribution shift is offset by its own data-hungry hyperparameter tuning and low explainability.

Hybrid AI methods are often proposed as remedies for data scarcity, yet the industrial case studies illustrate that these approaches remain vulnerable to uneven data distributions and shifting domains. In the PSSM framework [60], early-stage wear signals were highly noisy, particularly vibration and AE data, creating an information-poor regime in which even physics-informed models struggled to establish reliable degradation signatures. This reflects a form of intra-process domain shift, where the statistical properties of signals differ significantly between early and severe wear phases. In the current-based hybrid system [69], only a fraction of total motor current corresponds to actual cutting load, leading to inherently low information density and making small wear changes nearly indistinguishable, especially at low feed rates where noise dominates. The two-site AL + TL deployment [79] provided the strongest evidence of domain shift: Company A exhibited linear but noisy force signals, while Company B showed nonlinear yet more separable patterns. Transfer Learning was therefore effective asymmetrically, working from B to A but not vice versa, which demonstrates that domain discrepancies between factories cannot be assumed symmetric or easily corrected. These cases collectively reveal that data scarcity is not limited to insufficient quantity; it also arises from noise-dominant conditions, imbalanced wear stages, sensor sensitivity differences, and cross-site distribution shifts that hybrid models are not yet robust enough to fully overcome.

Sensor Infrastructure, Integration Costs, and Practical Deployment Constraints: Beyond algorithmic limitations, the industrial case studies highlight a distinct class of constraints stemming from sensor infrastructure, deployment costs, and practical integration challenges that are not fully captured by the previous three categories. The PSSM case [60] exposed the significant operational burden of deploying high-precision sensors such as dynamometers and AE transducers, which increase installation complexity and cost, limiting the feasibility of large-scale industrial adoption. Additionally, optical microscopy, while essential for ground-truth labeling, can only be performed offline, preventing real-time correction and creating a disconnect between measurement and decision phases. In the current-based hybrid system [69], reliance on inexpensive current sensors reduced installation costs but introduced severe noise-related limitations, requiring additional preprocessing (no-load vs. load separation) and engineering effort, demonstrating that low-cost solutions are not substitutes for reliable measurement. The force-based AL + TL deployment across two production lines [79] further revealed the sensitivity of hybrid models to sensor positioning: differences in mounting points (tool holder vs. cutting tool) produced distinct amplitude and noise characteristics, directly influencing model performance and complicating cross-facility transfer. Together, these findings confirm that hybrid AI systems must navigate not only algorithmic challenges but also the physical realities of sensor selection, integration, calibration, maintenance, and cost—an often overlooked yet critical barrier to industrial deployment.

4.2. Future Research Directions

To overcome these challenges, we posit that future research must move beyond incremental improvements and embrace more holistic and ambitious goals.

1. Prioritize “Generalization by Design”: Future hybrid systems must be built with adaptability as a core principle. This involves:

- Invariant Feature Learning for Sensor Variability: As highlighted by the industrial deployment case study in [79], transfer learning can fail asymmetrically (e.g., successful transfer from Factory B to A, but failure from A to B) due to variations in sensor placement (e.g., tool holder vs. cutting tool). Future hybrid models must incorporate invariant feature learning techniques that decouple the monitoring signal from the specific hardware configuration. Research should focus on learning “sensor-agnostic” representations that remain robust even when signal amplitude or noise profiles change due to physical sensor relocation.
- Physics-Informed Transfer Learning: Developing frameworks that transfer not just data features but also underlying physical principles (e.g., mechanics of wear) from data-rich to data-poor domains, directly addressing the limitations seen in [43,47].
- Meta-Learning and Automated Configuration: Creating systems that can automatically reconfigure their hybrid architecture (e.g., the weight of a physics-based loss function) based on the target task, moving beyond the static designs prevalent today.
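The phrase "weight of a physics-based loss function" can be illustrated with a minimal sketch. It assumes one simple physical prior, namely that accumulated wear does not decrease over time, and adds a penalty for violations to an ordinary data-fit term. The prior, the function names, and the numbers are illustrative assumptions and are not taken from the reviewed models.

```python
def data_loss(pred, target):
    """Mean squared error between predicted and measured wear."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def monotonicity_penalty(pred):
    """Physics prior: wear accumulates, so penalise any decrease over time."""
    drops = [max(0.0, earlier - later) for earlier, later in zip(pred, pred[1:])]
    return sum(d ** 2 for d in drops) / max(1, len(drops))

def hybrid_loss(pred, target, physics_weight=1.0):
    """Data-fit term plus a weighted physics-consistency term."""
    return data_loss(pred, target) + physics_weight * monotonicity_penalty(pred)

# Hypothetical wear curves (mm) over four consecutive cuts.
target = [0.10, 0.12, 0.15, 0.19]
pred_consistent = [0.11, 0.12, 0.14, 0.20]  # never decreases
pred_violating = [0.11, 0.14, 0.12, 0.20]   # drops between cuts 2 and 3

loss_data_only = hybrid_loss(pred_violating, target, physics_weight=0.0)
loss_with_physics = hybrid_loss(pred_violating, target, physics_weight=1.0)
```

In this framing, `physics_weight` is the quantity a meta-learning or automated configuration scheme would adjust per task: a larger value favours physical consistency when data are scarce or noisy, while a smaller value lets the data dominate when measurements are plentiful and reliable.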

2. Bridge the Real-Time Explainability Gap: Accuracy alone is insufficient for shop-floor acceptance. We believe the next frontier is the co-design of efficiency and interpretability.

- Develop Lightweight, Intrinsically Interpretable Hybrids: Following the precedent of lightweight models like PIDGGCN [58], future work should focus on architectures that are both computationally efficient and transparent by design, perhaps through simplified model structures or rule-based components.
- Establish Human-Centered XAI Protocols: Technical XAI metrics are not enough. We recommend adopting human-centered evaluation methods that measure an explanation’s impact on operator decision-speed, intervention accuracy, and trust, building upon the preliminary steps taken in [44,67].

3. Embrace Causal and Generative Frameworks: The field must evolve from predictive diagnostics to prescriptive and generative solutions.

- Integrate Causal Discovery: Moving beyond correlations to identify root causes of faults or wear will make hybrid systems more robust and actionable.
- Develop Hybrid Digital Twins: Combining physics-based models, real-time data, and AI into dynamic digital twins can create a powerful platform for simulation, optimization, and proactive decision-making, ultimately leading to autonomous self-optimizing manufacturing systems.

4. Develop Cost-Aware and Noise-Resilient Hybrid Architectures: The transition from lab to factory is often stalled by the cost of high-precision sensors. The case study of the Physics-Informed State Space Model (PSSM) [60] demonstrated that while multi-sensor fusion (force, vibration, AE) yields high accuracy, the installation complexity and cost of dynamometers are prohibitive for widespread use. Conversely, the knowledge-guided study in [69] revealed that low-cost alternatives like motor current sensors suffer from low signal-to-noise ratios, particularly at low feed rates.

Therefore, a critical future direction is the development of “Cost-Aware Hybrids”. These systems should be designed around two capabilities:

- Virtual Sensing: Use hybrid models to infer high-fidelity outputs (like cutting force) from low-fidelity, low-cost inputs (like current or audio) by leveraging physics-based transformations.
- Noise-Robust Integration: Specifically target the “interpretability under noise” problem identified in [69], utilizing signal processing layers that can extract weak wear signatures from noisy industrial data without relying on expensive hardware.
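A minimal sketch of the virtual sensing idea follows. It assumes the simplest physics-based transformation consistent with the discussion of [69]: the no-load (idle) current is subtracted from the total spindle current, and the remaining load current is mapped to cutting force with a linear fit obtained from a calibration run against a dynamometer. The calibration values, variable names, and the linear form itself are assumptions for illustration only.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def load_current(total_current_a, no_load_current_a):
    """Remove the idle component so only cutting-related current remains."""
    return [max(0.0, i - no_load_current_a) for i in total_current_a]

# Hypothetical calibration run: spindle current (A) logged with a dynamometer (N).
no_load_a = 4.0
calib_current_a = [4.5, 5.0, 5.5, 6.0]
calib_force_n = [50.0, 100.0, 150.0, 200.0]

slope, intercept = fit_line(load_current(calib_current_a, no_load_a), calib_force_n)

def virtual_force(current_a):
    """Estimate cutting force (N) from spindle current alone."""
    return slope * max(0.0, current_a - no_load_a) + intercept

estimated_force_n = virtual_force(5.2)
```

Such a mapping is only trustworthy inside the calibrated range and where the current to force relationship stays approximately linear. As noted earlier for [69], that assumption weakens at low feed rates, where the load current is small relative to noise.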

5. Asynchronous Multi-Modal Fusion Strategies: Hybrid systems that combine visual inspection with sensor data offer superior performance but face synchronization challenges. The heterogeneous model case study in [87] showed that while integrating image analysis (YOLOv3) with force/vibration signals eliminated critical classification errors, it also highlighted a temporal disconnect: sensor data is continuous and real-time, while optical inspection is often discrete or offline.

Future research must focus on Asynchronous Multi-Modal Learning frameworks that can fuse:

- High-frequency time-series data (force/vibration) for immediate anomaly detection.
- Low-frequency spatial data (images/surface roughness) for periodic recalibration of the model.

Developing hybrid architectures that can update their internal state using these disparate data streams without requiring perfect temporal alignment will be key to realizing “Zero-Defect Manufacturing” in practice.
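One way to picture state updates from misaligned data streams is a tracker with two entry points: a fast path that integrates a sensor-derived wear rate, and a slow path that pulls the estimate toward an occasional optical measurement. The sketch below is a deliberately simple assumption (a fixed blend factor rather than a learned or statistically derived gain); the class name, parameters, and numbers are hypothetical.

```python
class AsyncWearTracker:
    """Tracks wear from fast sensor updates and sparse image corrections."""

    def __init__(self, initial_wear_mm=0.0, image_trust=0.8):
        self.wear_mm = initial_wear_mm
        self.image_trust = image_trust  # hypothetical blend factor in [0, 1]
        self.last_time_s = 0.0

    def on_sensor(self, time_s, wear_rate_mm_per_s):
        """High-frequency path: integrate the sensor-derived wear rate."""
        dt = max(0.0, time_s - self.last_time_s)  # ignore out-of-order samples
        self.wear_mm += wear_rate_mm_per_s * dt
        self.last_time_s = max(self.last_time_s, time_s)
        return self.wear_mm

    def on_image(self, measured_wear_mm):
        """Low-frequency path: move the estimate toward an optical measurement."""
        self.wear_mm += self.image_trust * (measured_wear_mm - self.wear_mm)
        return self.wear_mm

tracker = AsyncWearTracker()
for second in range(1, 11):
    tracker.on_sensor(float(second), wear_rate_mm_per_s=0.01)
wear_before_image = tracker.wear_mm
wear_after_image = tracker.on_image(measured_wear_mm=0.15)
```

The two paths never need to arrive together, so no temporal alignment is required. A limitation of this sketch is that it treats the image measurement as current at the moment it arrives; handling inspection results that refer to an earlier point in time would need additional bookkeeping.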

5. Conclusions

This study examines the literature between 2020 and 2025 to demonstrate the transformative impact of hybrid AI systems in smart manufacturing, particularly in tool condition monitoring (TCM) applications focused on tool wear. According to the conceptual framework developed, hybrid approaches are defined as the complementary integration of physical laws, expert knowledge, data-driven learning, and heterogeneous model ensembles, and are grouped into four categories: Physics-Informed, Knowledge-Guided, Transfer Learning, and Heterogeneous Model Hybrids.

The analysis demonstrates that issues such as noise sensitivity, physical inconsistency, poor generalizability, and lack of interpretability observed in single data-driven models can be largely overcome with hybrid architectures. In particular, supporting indirect signals such as cutting force, vibration, and temperature with physical models increases the consistency of predictions; some studies report a 10–16% increase in accuracy compared to pure data-driven models. Expert knowledge and XAI layers increase decision transparency and, in turn, operator confidence. Transfer learning and heterogeneous model ensembles make models more robust against data insufficiency and domain shift.

However, high sensor costs, data distribution variations, computational complexity, and real-time explainability requirements make industrial scaling of these systems challenging. Furthermore, sensor calibration, data standardization, and integration processes during the transfer of laboratory prototypes to the factory environment perpetuate the “lab-to-factory” gap.

In future studies, it is crucial to embrace the principle of “generalizability by design” rather than purely increasing accuracy, and to leverage causal AI, domain transfer, and real-time digital twin integrations. Digital twins increase the data efficiency and closed-loop optimization capacity of hybrid models by synchronizing real data with physics-based simulations.

In conclusion, hybrid systems stand out as the most effective solution for sustainable manufacturing, offering an explainable and generalizable structure that optimizes tool life, reduces unplanned downtime, and is compatible with physical reality.

Author Contributions

Conceptualization was carried out by B.T.S., M.U. and T.G. Methodology, software development, formal analysis, investigation, data curation, visualization, and writing of the original draft were performed by B.T.S. Validation and manuscript review and editing were conducted by B.T.S., M.U. and T.G. Resources, supervision, and project administration were provided by M.U. and T.G. Funding acquisition was carried out by M.U. All authors have read and agreed to the published version of the manuscript.

Funding

This study is based on the doctoral research conducted by the first author. This work was supported by the Scientific and Technological Research Council of Türkiye (TÜBİTAK) under Grant No. 222M371. The authors also gratefully acknowledge the financial support provided by the Fırat University Scientific Research Projects Coordination Unit through Project No. FUBAP-ADEP.22.06. In addition, the authors acknowledge the support of the Fırat University Scientific Research Projects Coordination Unit for covering the open access publication fee of this article under Project No. MF.25.145.

Data Availability Statement

No new data were generated or analyzed during the course of this review; therefore, data sharing is not applicable.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Yang, T.; Yi, X.; Lu, S.; Johansson, K.H.; Chai, T. Intelligent Manufacturing for the Process Industry Driven by Industrial Artificial Intelligence. Engineering 2021, 7, 1224–1230.
2. Baru, C.; Daimler, E.; Ferguson, R.; Forbe, N.; Harder, E.; Ferguson, R. The National Artificial Intelligence Research and Development Strategic Plan; U.S. NSTC: Washington, DC, USA, 2016.
3. Executive Office of the President. Fiscal Year 2020 Administration Research and Development Budget Priorities: Memorandum for the Heads of Executive Departments and Agencies; White House Office of Management and Budget: Washington, DC, USA, 2018.
4. Executive Office of the President. Fiscal Year 2021 Administration Research and Development Budget Priorities: Memorandum for the Heads of Executive Departments and Agencies; Executive Office of the President: Washington, DC, USA, 2019.
5. The Research Group for Research on Intelligent Manufacturing Development Strategy. Research on Intelligent Manufacturing Development Strategy in China. Strateg. Stud. Chin. Acad. Eng. 2018, 20, 1–8.
6. Zhou, J.; Li, P.; Zhou, Y.; Wang, B.; Zang, J.; Meng, L. Toward New-Generation Intelligent Manufacturing. Engineering 2018, 4, 11–20.
7. Molina, L.M.; Teti, R.; Alvir, E.M.R. Quality, Efficiency and Sustainability Improvement in Machining Processes Using Artificial Intelligence. Procedia CIRP 2023, 118, 501–506.
8. Tran, M.-Q.; Doan, H.-P.; Vu, V.Q.; Vu, L.T. Machine Learning and IoT-Based Approach for Tool Condition Monitoring: A Review and Future Prospects. Measurement 2023, 207, 112351.
9. Gao, R.X.; Krüger, J.; Merklein, M.; Möhring, H.-C.; Váncza, J. Artificial Intelligence in Manufacturing: State of the Art, Perspectives, and Future Directions. CIRP Ann. 2024, 73, 723–749.
10. Geurts, A.; Gutknecht, R.; Warnke, P.; Goetheer, A.; Schirrmeister, E.; Bakker, B.; Meissner, S. New Perspectives for Data-supported Foresight: The Hybrid AI-expert Approach. Futures Foresight Sci. 2022, 4, e99.
11. Peres, R.S.; Jia, X.; Lee, J.; Sun, K.; Colombo, A.W.; Barata, J. Industrial Artificial Intelligence in Industry 4.0—Systematic Review, Challenges and Outlook. IEEE Access 2020, 8, 220121–220139.
12. Ordenshiya, K.M.; Revathi, G.K. Hybrid FCMG-OP-FIS Model Approach to Convert Regression into Classification Data for Machine Learning-Based AQI Prediction. Heliyon 2024, 10, e39759.
13. Kasilingam, S.; Yang, R.; Singh, S.K.; Farahani, M.A.; Rai, R.; Wuest, T. Physics-Based and Data-Driven Hybrid Modeling in Manufacturing: A Review. Prod. Manuf. Res. 2024, 12, 2305358.
14. Vijay Kumar, V.; Shahin, K. Artificial Intelligence and Machine Learning for Sustainable Manufacturing: Current Trends and Future Prospects. Intell. Sustain. Manuf. 2025, 2, 10002.
15. Macit, C.K.; Saatci, B.T.; Albayrak, M.G.; Ulas, M.; Gurgenc, T.; Ozel, C. Prediction of Wear Amounts of AZ91 Magnesium Alloy Matrix Composites Reinforced with ZnO-HBN Nanocomposite Particles by Hybridized GA-SVR Model. J. Mater. Sci. 2024, 59, 17456–17490.
16. Kalidindi, S.R. Feature Engineering of Material Structure for AI-Based Materials Knowledge Systems. J. Appl. Phys. 2020, 128, 041103.
17. Hu, M. Unleashing the Power of Artificial Intelligence in Phonon Thermal Transport: Current Challenges and Prospects. J. Appl. Phys. 2024, 135, 170904.
18. Ejjami, R.; Boussalham, K. Industry 5.0 in Manufacturing: Enhancing Resilience and Responsibility through AI-Driven Predictive Maintenance, Quality Control, and Supply Chain Optimization. Int. J. Multidiscip. Res. 2024, 6, 25733.
19. Liu, D.; Liu, Z.; Wang, B.; Song, Q.; Wang, H.; Zhang, L. Leveraging Artificial Intelligence for Real-Time Indirect Tool Condition Monitoring: From Theoretical and Technological Progress to Industrial Applications. Int. J. Mach. Tools Manuf. 2024, 202, 104209.
20. Rangwala, S.; Dornfeld, D. Sensor Integration Using Neural Networks for Intelligent Tool Condition Monitoring. J. Eng. Ind. 1990, 112, 219–228.
21. Kuntoğlu, M.; Aslan, A.; Pimenov, D.Y.; Usca, Ü.A.; Salur, E.; Gupta, M.K.; Mikolajczyk, T.; Giasin, K.; Kapłonek, W.; Sharma, S. A Review of Indirect Tool Condition Monitoring Systems and Decision-Making Methods in Turning: Critical Analysis and Trends. Sensors 2020, 21, 108.
22. Dimla, D.E. Sensor Signals for Tool-Wear Monitoring in Metal Cutting Operations—A Review of Methods. Int. J. Mach. Tools Manuf. 2000, 40, 1073–1098.
23. Pagani, L.; Parenti, P.; Cataldo, S.; Scott, J.; Annoni, M. Indirect Cutting Tool Wear Classification Using Deep Learning and Chip Colour Analysis. Int. J. Adv. Manuf. Technol. 2020, 111, 1099–1114.
24. Yavuz, M.; Gökçe, H.; Şeker, U. Matkap Geometrisinin Takım Aşınması ve Talaş Oluşumu Üzerine Etkisinin Araştırılması. Gazi Mühendislik Bilim. Derg. 2017, 3, 11–19.
25. Mir, A.; Luo, X.; Cheng, K.; Cox, A. Investigation of Influence of Tool Rake Angle in Single Point Diamond Turning of Silicon. Int. J. Adv. Manuf. Technol. 2018, 94, 2343–2355.
26. Siddhpura, A.; Paurobally, R. A Review of Flank Wear Prediction Methods for Tool Condition Monitoring in a Turning Process. Int. J. Adv. Manuf. Technol. 2013, 65, 371–393.
27. Saez-de-Buruaga, M.; Soler, D.; Aristimuño, P.X.; Esnaola, J.A.; Arrazola, P.J. Determining Tool/Chip Temperatures from Thermography Measurements in Metal Cutting. Appl. Therm. Eng. 2018, 145, 305–314.
28. Marani, M.; Zeinali, M.; Songmene, V.; Mechefske, C.K. Tool Wear Prediction in High-Speed Turning of a Steel Alloy Using Long Short-Term Memory Modelling. Measurement 2021, 177, 109329.
29. García-Ordás, M.T.; Alegre, E.; González-Castro, V.; Alaiz-Rodríguez, R. A Computer Vision Approach to Analyze and Classify Tool Wear Level in Milling Processes Using Shape Descriptors and Machine Learning Techniques. Int. J. Adv. Manuf. Technol. 2017, 90, 1947–1961.
30. Chen, S.-H.; Luo, Z.-R. Study of Using Cutting Chip Color to the Tool Wear Prediction. Int. J. Adv. Manuf. Technol. 2020, 109, 823–839.
31. SK, T.; Shankar, S.; T, M.; K, D. Tool Wear Prediction in Hard Turning of EN8 Steel Using Cutting Force and Surface Roughness with Artificial Neural Network. Proc. Inst. Mech. Eng. Part C J. Mech. Eng. Sci. 2019, 234, 329–342.
32. Zgurovsky, M.Z.; Sineglazov, V.M.; Olena, I.C. Artificial Intelligence Systems Based on Hybrid Neural Networks; Springer: Berlin/Heidelberg, Germany, 2020; Volume 390.
33. Akata, Z.; Balliet, D.; de Rijke, M.; Dignum, F.; Dignum, V.; Eiben, G.; Fokkens, A.; Grossi, D.; Hindriks, K.; Hoos, H.; et al. A Research Agenda for Hybrid Intelligence: Augmenting Human Intellect With Collaborative, Adaptive, Responsible, and Explainable Artificial Intelligence. Computer 2020, 53, 18–28.
34. Moravec, H. Mind Children: The Future of Robot and Human Intelligence; Harvard University Press: Cambridge, MA, USA, 1988; ISBN 0674576187.
35. Dellermann, D.; Ebel, P.; Söllner, M.; Leimeister, J.M. Hybrid Intelligence. Bus. Inf. Syst. Eng. 2019, 61, 637–643.
36. Jia, J.; Liang, W.; Liang, Y. A Review of Hybrid and Ensemble in Deep Learning for Natural Language Processing. arXiv 2023, arXiv:2312.05589.
37. Bavarchee, A. A Hybrid Deep Learning Model for Optimizing Particle Identification Systems. Comput. Phys. Commun. 2024, 303, 109277.
38. Shah, P.; Pahari, S.; Bhavsar, R.; Kwon, J.S.-I. Hybrid Modeling of First-Principles and Machine Learning: A Step-by-Step Tutorial Review for Practical Implementation. Comput. Chem. Eng. 2025, 194, 108926.
39. Asiedu, S.T.; Nyarko, F.K.A.; Boahen, S.; Effah, F.B.; Asaaga, B.A. Machine Learning Forecasting of Solar PV Production Using Single and Hybrid Models over Different Time Horizons. Heliyon 2024, 10, e28898.
40. Kumar, S.V.S.; Kondaveeti, H.K. Bird Species Recognition Using Transfer Learning with a Hybrid Hyperparameter Optimization Scheme (HHOS). Ecol. Inform. 2024, 80, 102510.
41. Xie, Y.; Zhu, J.; Dai, Y.; Zhang, C. Multicondition Tool Wear Assessment for Cutting Tools Based on Kernel Principal Component Analysis and Integrated Transfer Learning. IEEE Trans. Instrum. Meas. 2025, 74, 2519813.
42. Zhou, J.T.; Pan, S.J.; Tsang, I.W. A Deep Learning Framework for Hybrid Heterogeneous Transfer Learning. Artif. Intell. 2019, 275, 310–328.
43. Zhu, K.; Guo, H.; Li, S.; Lin, X. Physics-Informed Deep Learning for Tool Wear Monitoring. IEEE Trans. Ind. Inform. 2024, 20, 524–533.
44. Coutinho, B.; Pereira, E.; Gonçalves, G. 0-DMF: A Decision-Support Framework for Zero Defects Manufacturing. In Proceedings of the 21st International Conference on Informatics in Control, Automation and Robotics, Porto, Portugal, 18–20 November 2024; SCITEPRESS—Science and Technology Publications: Setúbal, Portugal, 2024; pp. 253–260.
45. Papenberg, B.; Hogreve, S.; Tracht, K. Visualization of Relevant Areas of Milling Tools for the Classification of Tool Wear by Machine Learning Methods. Procedia CIRP 2023, 118, 525–530.
46. Lin, S.-Y.; Hsieh, C.-J. Construction of a Cutting-Tool Wear Prediction Model through Ensemble Learning. Appl. Sci. 2024, 14, 3811.
47. Pashmforoush, F.; Ebrahimi Araghizad, A.; Budak, E. Tool Wear Prediction in Milling Process Using Physics-Informed Machine Learning and Thermo-Mechanical Force Model with Monitoring Applications. J. Manuf. Syst. 2025, 82, 1192–1212.
48. Dellermann, D.; Calma, A.; Lipusch, N.; Weber, T.; Weigel, S.; Ebel, P. The Future of Human-AI Collaboration: A Taxonomy of Design Knowledge for Hybrid Intelligence Systems. arXiv 2021, arXiv:2105.03354.
49. Mokhtarzadeh, M.; Rodríguez-Echeverría, J.; Semanjski, I.; Gautama, S. Hybrid Intelligence Failure Analysis for Industry 4.0: A Literature Review and Future Prospective. J. Intell. Manuf. 2025, 36, 2309–2334.
50. Leberruyer, N.; Bruch, J.; Ahlskog, M.; Afshar, S. Toward Zero Defect Manufacturing with the Support of Artificial Intelligence—Insights from an Industrial Application. Comput. Ind. 2023, 147, 103877.
51. Lee, H.; Moon, H.; Lee, J.; Ryu, S. Toward Knowledge-Guided AI for Inverse Design in Manufacturing: A Perspective on Domain, Physics, and Human-AI Synergy. Adv. Intell. Discov. 2025, e202500107.
52. L’heureux, A.; Grolinger, K.; Elyamany, H.F.; Capretz, M.A.M. Machine Learning with Big Data: Challenges and Approaches. IEEE Access 2017, 5, 7776–7797.
53. Jia, X.; Willard, J.; Karpatne, A.; Read, J.; Zwart, J.; Steinbach, M.; Kumar, V. Physics Guided RNNs for Modeling Dynamical Systems: A Case Study in Simulating Lake Temperature Profiles. In Proceedings of the 2019 SIAM International Conference on Data Mining, Calgary, AB, Canada, 2–4 May 2019; SIAM: Philadelphia, PA, USA, 2019; pp. 558–566.
54. Lagaris, I.E.; Likas, A.; Fotiadis, D.I. Artificial Neural Networks for Solving Ordinary and Partial Differential Equations. IEEE Trans. Neural Netw. 1998, 9, 987–1000.
55. Karpatne, A.; Atluri, G.; Faghmous, J.H.; Steinbach, M.; Banerjee, A.; Ganguly, A.; Shekhar, S.; Samatova, N.; Kumar, V. Theory-Guided Data Science: A New Paradigm for Scientific Discovery from Data. IEEE Trans. Knowl. Data Eng. 2017, 29, 2318–2331.
56. Wu, Y.; Sicard, B.; Gadsden, S.A. Physics-Informed Machine Learning: A Comprehensive Review on Applications in Anomaly Detection and Condition Monitoring. Expert Syst. Appl. 2024, 255, 124678.
57. Karthik, M.R.; Rao, T.B. Physics-Informed Data-Driven Ensemble and Transfer Learning Approaches for Prediction of Temperature Field and Cutting Force during Machining IN625 Superalloy. Int. J. Interact. Des. Manuf. 2025, 19, 7027–7060.
58. Chen, Z.; Zhang, J.; Qin, F.; Tao, G.; Li, D.; Cao, H. PIDGGCN: A Novel Physics-Informed Deep Learning Framework for Tool Wear Monitoring. Adv. Eng. Inform. 2025, 68, 103790.
59. Dong, C.; Zhao, J. An Augmented AutoEncoder With Multi-Head Attention for Tool Wear Prediction in Smart Manufacturing. IEEE Access 2024, 12, 79128–79137.
60. Ma, Z.; Zhao, M.; Dai, X.; Chen, Y. A Hybrid-Driven Probabilistic State Space Model for Tool Wear Monitoring. Mech. Syst. Signal Process. 2023, 200, 110599.
61. Dash, T.; Chitlangia, S.; Ahuja, A.; Srinivasan, A. A Review of Some Techniques for Inclusion of Domain-Knowledge into Deep Neural Networks. Sci. Rep. 2022, 12, 1040.
62. Ahmed, I.; Jeon, G.; Piccialli, F. From Artificial Intelligence to Explainable Artificial Intelligence in Industry 4.0: A Survey on What, How, and Where. IEEE Trans. Ind. Inform. 2022, 18, 5031–5042.
63. Chen, T.-C.T. Explainable Artificial Intelligence (XAI) in Manufacturing. In Explainable Artificial Intelligence (XAI) in Manufacturing: Methodology, Tools, and Applications; Springer International Publishing: Cham, Switzerland, 2023; pp. 1–11.
64. Lin, J.-S.; Chen, K.-H. A Novel Decision Support System Based on Computational Intelligence and Machine Learning: Towards Zero-Defect Manufacturing in Injection Molding. J. Ind. Inf. Integr. 2024, 40, 100621.
65. Hajgató, G.; Wéber, R.; Szilágyi, B.; Tóthpál, B.; Gyires-Tóth, B.; Hős, C. PredMaX: Predictive Maintenance with Explainable Deep Convolutional Autoencoders. Adv. Eng. Inform. 2022, 54, 101778.
66. Nasr, M.M.; Anwar, S.; Al-Samhan, A.M.; Ghaleb, M.; Dabwan, A. Milling of Graphene Reinforced Ti6Al4V Nanocomposites: An Artificial Intelligence Based Industry 4.0 Approach. Materials 2020, 13, 5707.
67. Schmetz, A.; Vahl, C.; Zhen, Z.; Reibert, D.; Mayer, S.; Zontar, D.; Garcke, J.; Brecher, C. Decision Support by Interpretable Machine Learning in Acoustic Emission Based Cutting Tool Wear Prediction. In Proceedings of the 2021 IEEE International Conference on Industrial Engineering and Engineering Management (IEEM), Singapore, 13–16 December 2021; IEEE: New York, NY, USA, 2021; pp. 629–633.
68. Holst, C.; Yavuz, T.B.; Gupta, P.; Ganser, P.; Bergs, T. Deep Learning and Rule-Based Image Processing Pipeline for Automated Metal Cutting Tool Wear Detection and Measurement. IFAC-PapersOnLine 2022, 55, 534–539.
69. Li, X.; Li, H.-X.; Guan, X.-P.; Du, R. Fuzzy Estimation of Feed-Cutting Force From Current Measurement—A Case Study on Intelligent Tool Wear Condition Monitoring. IEEE Trans. Syst. Man Cybern. Part C (Appl. Rev.) 2004, 34, 506–512.
70. Liu, J.; Guo, F.; Gao, H.; Li, M.; Zhang, Y.; Zhou, H. Defect Detection of Injection Molding Products on Small Datasets Using Transfer Learning. J. Manuf. Process. 2021, 70, 400–413.
71. Pandiyan, V.; Drissi-Daoudi, R.; Shevchik, S.; Masinelli, G.; Le-Quang, T.; Logé, R.; Wasmer, K. Deep Transfer Learning of Additive Manufacturing Mechanisms across Materials in Metal-Based Laser Powder Bed Fusion Process. J. Mater. Process. Technol. 2022, 303, 117531.
72. Do, S.; Song, K.D.; Chung, J.W. Basics of Deep Learning: A Radiologist’s Guide to Understanding Published Radiology Articles on Deep Learning. Korean J. Radiol. 2020, 21, 33–41.
73. Yang, B.; Lei, Y.; Jia, F.; Xing, S. An Intelligent Fault Diagnosis Approach Based on Transfer Learning from Laboratory Bearings to Locomotive Bearings. Mech. Syst. Signal Process. 2019, 122, 692–706.
74. Sun, C.; Ma, M.; Zhao, Z.; Tian, S.; Yan, R.; Chen, X. Deep Transfer Learning Based on Sparse Autoencoder for Remaining Useful Life Prediction of Tool in Manufacturing. IEEE Trans. Ind. Inform. 2019, 15, 2416–2425.
75. Marei, M.; El Zaatari, S.; Li, W. Transfer Learning Enabled Convolutional Neural Networks for Estimating Health State of Cutting Tools. Robot. Comput. Integr. Manuf. 2021, 71, 102145.
76. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet Classification with Deep Convolutional Neural Networks. Commun. ACM 2017, 60, 84–90.
77. Liu, X.; Li, Y.; Meng, Q.; Chen, G. Deep Transfer Learning for Conditional Shift in Regression. Knowl.-Based Syst. 2021, 227, 107216.
78. Yan, B.; Zhu, L.; Dun, Y. Tool Wear Monitoring of TC4 Titanium Alloy Milling Process Based on Multi-Channel Signal and Time-Dependent Properties by Using Deep Learning. J. Manuf. Syst. 2021, 61, 495–508.
79. Papacharalampopoulos, A.; Alexopoulos, K.; Catti, P.; Stavropoulos, P.; Chryssolouris, G. Learning More with Less Data in Manufacturing: The Case of Turning Tool Wear Assessment through Active and Transfer Learning. Processes 2024, 12, 1262.
80. Sarker, B.; Chakraborty, S.; Čep, R.; Kalita, K. Development of Optimized Ensemble Machine Learning-Based Prediction Models for Wire Electrical Discharge Machining Processes. Sci. Rep. 2024, 14, 23299.
81. Demircioglu Diren, D.; Ozsoy, N.; Ozsoy, M.; Pehlivan, H. Optimization of Cutting Parameters and Result Predictions with Response Surface Methodology, Individual and Ensemble Machine Learning Algorithms in End Milling of AISI 321. Arab. J. Sci. Eng. 2023, 48, 12075–12089.
82. David, L.G.; Patra, R.K.; Falkowski-Gilski, P.; Divakarachari, P.B.; Antony Marcilin, L.J. Tool Wear Monitoring Using Improved Dragonfly Optimization Algorithm and Deep Belief Network. Appl. Sci. 2022, 12, 8130.
83. Ou, J.; Li, H.; Huang, G.; Liu, B.; Wang, Z. Tool Wear Recognition Based on Deep Kernel Autoencoder With Multichannel Signals Fusion. IEEE Trans. Instrum. Meas. 2021, 70, 3521909.
84. He, Z.; Shi, T.; Xuan, J.; Li, T. Research on Tool Wear Prediction Based on Temperature Signals and Deep Learning. Wear 2021, 478–479, 203902.
85. Xie, X.; Huang, M.; Sun, W.; Li, Y.; Liu, Y. Intelligent Tool Wear Monitoring Method Using a Convolutional Neural Network and an Informer. Lubricants 2023, 11, 389.
86. Echeverria-Rios, D.; Green, P.L. Predicting Product Quality in Continuous Manufacturing Processes Using a Scalable Robust Gaussian Process Approach. Eng. Appl. Artif. Intell. 2024, 127, 107233.
87. Herrera-Granados, G.; Misaka, T.; Herwan, J.; Komoto, H.; Furukawa, Y. An Experimental Study of Multi-Sensor Tool Wear Monitoring and Its Application to Predictive Maintenance. Int. J. Adv. Manuf. Technol. 2024, 133, 3415–3433.
88. Srivastava, A.K.; Singh, B.K.; Gupta, S. Prediction of Tool Wear Using Machine Learning Approaches for Machining on Lathe Machine. Evergreen 2023, 10, 1357–1365.
89. Ross, N.S.; Sheeba, P.T.; Shibi, C.S.; Gupta, M.K.; Korkmaz, M.E.; Sharma, V.S. A Novel Approach of Tool Condition Monitoring in Sustainable Machining of Ni Alloy with Transfer Learning Models. J. Intell. Manuf. 2024, 35, 757–775.
90. Mukunda, S.G.; Srivastava, A.; Boppana, S.B.; Dayanand, S.; Yeshwanth, D. Wear Performance Prediction of MWCNT-Reinforced AZ31 Composite Using Machine Learning Technique. J. Bio- Tribo-Corros. 2023, 9, 27.
91. Li, Z.; Meurer, M.; Bergs, T. Deep Learning Approach for Enhanced Transferability and Learning Capacity in Tool Wear Estimation. Procedia CIRP 2024, 126, 360–365.
92. Molitor, D.A.; Kubik, C.; Hetfleisch, R.H.; Groche, P. Workpiece Image-Based Tool Wear Classification in Blanking Processes Using Deep Convolutional Neural Networks. Prod. Eng. 2022, 16, 481–492.
93. Hrnjica, B.; Softic, S. Explainable AI in Manufacturing: A Predictive Maintenance Case Study. In IFIP International Conference on Advances in Production Management Systems; Springer International Publishing: Cham, Switzerland, 2020; pp. 66–73.
94. Schueller, A.; Saldaña, C. Indirect Tool Condition Monitoring Using Ensemble Machine Learning Techniques. J. Manuf. Sci. Eng. 2023, 145, 011006.
95. Ulas, M.; Altay, O.; Gurgenc, T.; Özel, C. A New Approach for Prediction of the Wear Loss of PTA Surface Coatings Using Artificial Neural Network and Basic, Kernel-Based, and Weighted Extreme Learning Machine. Friction 2020, 8, 1102–1116.
96. Thankachan, T.; Soorya Prakash, K.; Kavimani, V.; Silambarasan, S.R. Machine Learning and Statistical Approach to Predict and Analyze Wear Rates in Copper Surface Composites. Met. Mater. Int. 2021, 27, 220–234.
97. Aydin, F. The Investigation of the Effect of Particle Size on Wear Performance of AA7075/Al₂O₃ Composites Using Statistical Analysis and Different Machine Learning Methods. Adv. Powder Technol. 2021, 32, 445–463.

Figure 1. PRISMA flow diagram illustrating the literature screening process.

Figure 2. Physics-based machine learning approaches in hybrid systems.

Figure 3. Integration of Rule-Based Domain Knowledge into AI-Based Decision Support Systems for Manufacturing.

Figure 4. Example of a hybrid XAI-based DSS workflow.

Figure 5. Flowchart of the transfer learning process for adapting pretrained models to specific tasks.

Figure 6. Modeling flow of ensemble and hybrid approaches in manufacturing processes.

Table 1. Hybrid Approach Types and Example Studies.

Hybrid Category	Core Components	Integration Architecture	Synergistic Benefit	Example Study
Physics-Informed Hybrids	1. Deep Learning (LSTM) 2. Physical Models (Taylor’s equations)	Physical equations and cutting parameters were integrated into the deep learning model’s loss function and input features.	Generalization with limited data and physical consistency: The model demonstrated a significant improvement in prediction accuracy and provided more consistent results under varying conditions compared to purely data-driven models.	Zhu et al. (2024): Developed a model for cutting tool wear prediction by integrating physical laws with deep learning [43].
Knowledge-Guided Hybrids	1. Machine Learning (CatBoost, XGBoost) 2. Explainable AI (XAI—SHAP, LIME)	Model predictions were explained to operators using SHapley Additive exPlanations (SHAP) and Local Interpretable Model-agnostic Explanations (LIME), guiding expert intervention based on these explanations.	Transparency, trust, and actionable insights: The “black box” model was transformed into a decision-support system that operators could understand and trust, achieving a very high defect detection rate.	Coutinho et al. (2024): Proposed a Zero Defect Manufacturing framework that predicts faults and explains them using XAI [44].
Transfer Learning Hybrids	1. Pre-trained CNN (on ImageNet) 2. Target Task (Tool Wear)	A CNN (EfficientNetB0) pre-trained on general image features was fine-tuned on a relatively small dataset of milling tools.	High performance with limited data: Enabled achieving a very high classification accuracy from a small, domain-specific dataset, which would be difficult with training from scratch.	Papenberg et al. (2023): Utilized transfer learning to classify milling tool wear [45].
Heterogeneous Model Hybrids	1. Ensemble Learning 2. Multiple Algorithms (SVM, DT, LR, LDA)	An ensemble was created by strategically combining predictions from different base algorithms (SVM, Decision Trees, etc.).	Enhanced robustness and reliability: Mitigated the weaknesses of any single model by leveraging the strengths of diverse algorithms, leading to more stable and reliable predictions.	Lin and Hsieh (2024): Developed a heterogeneous ensemble learning model for tool wear prediction [46].
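The integration logic in the last row of Table 1 can be illustrated with a minimal majority-voting sketch. It is not the model of Lin and Hsieh [46]: the three base learners below are hypothetical threshold rules standing in for trained classifiers (SVM, decision tree, etc.), and the feature names and thresholds are assumptions chosen only for illustration.

```python
from collections import Counter

# Hypothetical base learners: each maps a feature dict to a wear label.
# In practice these would be trained, structurally different models.
def force_rule(x):
    return "worn" if x["force_n"] > 400 else "sharp"

def vibration_rule(x):
    return "worn" if x["vib_rms"] > 1.5 else "sharp"

def temperature_rule(x):
    return "worn" if x["temp_c"] > 300 else "sharp"

def majority_vote(models, x):
    """Return the label predicted by most base models."""
    votes = [model(x) for model in models]
    return Counter(votes).most_common(1)[0][0]

models = [force_rule, vibration_rule, temperature_rule]
sample = {"force_n": 450, "vib_rms": 1.2, "temp_c": 320}
prediction = majority_vote(models, sample)
print(prediction)
```

With an odd number of base learners and two classes, no tie can occur; a single learner that misjudges one signal (here, vibration) is outvoted by the other two.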

Table 2. Summary of Physics-Informed Hybrid Machine Learning (PIHML) Approaches.

Study	Method	Source of Physical Knowledge	Hybrid Integration Logic	Target Application	Hybrid Rationale	Limitations
[43]	Physics-informed deep learning (ADHL-LSTM)	Cutting parameters, Taylor equations, physical residual modeling	Physics-Informed Hybrid: Physical equations (Taylor) and residual modeling are incorporated into the LSTM’s architecture and multi-task loss function, enhancing the model’s adherence to physical processes and its generalization ability under limited data conditions.	Cutting tool wear monitoring in high-speed milling	Markedly higher prediction accuracy than purely data-driven or physics-based methods. Robust results with limited data due to the integration of physical knowledge.	Sensitivity to labels with a low signal-to-noise ratio; residual learning method failing to provide the expected improvement; scalability to other machining processes (e.g., turning, drilling) not fully demonstrated.
[47]	PIML	Thermo-mechanical force model that accounts for wear effects	Physics-Informed Hybrid: A physical force and wear model is combined with machine learning algorithms (LSBoost, SVR, etc.), and an inverse modeling strategy enables the direct prediction of tool wear from cutting parameters.	Milling tool wear & force prediction	The combination of the explainability of the physical model with the predictive power of ML. High reliability with less experimental data.	Narrow validation (only Steel 1050 and a specific tool geometry); potential need for a larger dataset due to the complex wear-force relationship.
[57]	Hybrid physics-based & data-driven approach	Finite Element (FE) simulations (Johnson–Cook material model, Coulomb–Tresca friction, ALE re-mesh)	Physics-Informed Hybrid: Physically validated simulations are used as a rich and reliable data source to train various ML algorithms (AdaBoost, SVR, etc.), ensuring predictions are grounded in physical principles.	Machining of Ni-based superalloy IN625 (cutting temperature & force prediction)	Speed and lower cost compared to expensive physical experiments; increased reliability due to results consistent with theoretical expectations.	High simulation cost; hyperparameter sensitivity.
[58]	PIDGGCN	Physics-based regularizers; spatio-temporal dependencies in machining data	Physics-Informed Hybrid: Physical principles (regularizers) and spatio-temporal relationships are integrated into the deep learning model’s graph structure, creating a more accurate and lightweight model that observes monotonic wear progression.	Tool wear monitoring in machining of titanium alloy thin-walled parts & CFRP composites	The combination of high accuracy and low computational cost (lightweight model) makes it ideal for real-time industrial applications.	Continued reliance on labeled data for noisy data across different cutting conditions; limited ability to interpret complex wear behaviors.
[59]	Physics-Informed MCAG	Principle of progressive tool wear	Physics-Informed Hybrid: A custom monotonicity loss function is added to the training objective, penalizing non-increasing predictions and hard-coding the physical law of wear progression into the data-driven model.	Tool wear prediction	Ensures predictions are physically plausible and increases model reliability, especially in data-sparse regimes.	High computational cost; reliance on the strict monotonicity assumption, which might not hold in all scenarios (e.g., tool changes, run-in periods).
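The monotonicity constraint summarized in the last row of Table 2 can be sketched as an extra loss term. The following is a minimal illustration, not the loss used in [59]: the squared-decrease form of the penalty, the `weight` parameter, and the sample wear values are all assumptions.

```python
def monotonicity_penalty(predictions):
    """Sum of squared decreases between consecutive wear predictions.

    Only strict decreases are penalized; equal consecutive values cost nothing.
    """
    return sum(max(0.0, prev - curr) ** 2
               for prev, curr in zip(predictions, predictions[1:]))

def hybrid_loss(predictions, targets, weight=1.0):
    """Mean squared error plus a weighted monotonicity penalty."""
    mse = sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)
    return mse + weight * monotonicity_penalty(predictions)

# Hypothetical flank wear values (mm) over four consecutive cuts.
targets = [0.10, 0.12, 0.15, 0.19]
monotone = [0.10, 0.13, 0.15, 0.18]   # never decreases: no penalty
dipping = [0.10, 0.13, 0.11, 0.18]    # wear "heals" at step 3: penalized

print(hybrid_loss(monotone, targets), hybrid_loss(dipping, targets))
```

Because the penalty depends only on the predictions, it can also be evaluated on unlabeled sequences, which is one reason such terms help in data-sparse regimes. As the table notes, the assumption breaks down across tool changes, so sequences would need to be split at those points.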

Table 3. Knowledge-Enabled Hybrid Approaches.

Study	Hybrid Category	Integration Architecture	Application Domain	Hybrid Rationale	Limitations and Future Work
[64]	Knowledge-Guided Hybrid	Integration of a heuristic optimizer (PSO) with a transparent classifier (C4.5 decision tree) to encode decision rules.	Zero-Defect Manufacturing (ZDM) in Injection Molding	Structural Transparency and Actionable Guidance: The model’s intrinsic interpretability builds operator trust and provides a concrete basis for preventive interventions.	Risk of optimizer convergence to local minima; generalizability is constrained to symmetric mold geometries.
[65]	Knowledge-Guided Hybrid	Combination of a Deep Convolutional Autoencoder (DCAE) and Principal Component Analysis (PCA) for anomaly detection. Model outputs are explained using Integrated Gradients (IG).	Lubricant Degradation Detection in a Gearbox	Diagnosable Anomaly Detection: The system not only identifies faults but also pinpoints the critical sensor channels responsible, enabling informed maintenance decisions.	High sensitivity to DCAE hyperparameters; requires pre-definition of the number of clusters.
[66]	Knowledge-Guided Hybrid	An Adaptive Neuro-Fuzzy Inference System (ANFIS) encodes expert knowledge, which is then optimized using Multi-Objective Particle Swarm Optimization (MOPSO) to identify feasible process parameters.	Milling of Ti6Al4V/GNP Nanocomposites	Expert Knowledge Capture & Multi-Objective Optimization: The fuzzy logic base provides interpretable predictions, while the optimization balances conflicting objectives (e.g., surface quality vs. cutting force).	Validated within a narrow range of material compositions and process parameters; scalability to other material systems remains unverified.
[44]	Knowledge-Guided Hybrid	Predictions from high-performance gradient boosting models (CatBoost, XGBoost) are explained globally and locally using model-agnostic XAI methods (SHAP, LIME).	Zero-Defect Manufacturing in Wood Panel Production	Operator-Level Transparency and System Trust: “Black-box” predictions are translated into intuitive explanations that operators can reconcile with their expert intuition, strengthening human–machine collaboration.	Reliance on SMOTE for class imbalance; the application scope is limited to a specific production line.
[67]	Knowledge-Guided Hybrid	A Random Forest (RF) model was selected for its inherent interpretability; decision contributions are communicated to operators visually and audibly using a Tree Interpreter.	Tool Wear Prediction in Ultra-Precision Turning	Operator Training and Decision Verification: Visualizing the model’s decision-making process aids operator understanding of process dynamics and increases trust in model outputs.	Lack of comprehensive user studies; challenges in effectively representing frequencies beyond the human auditory range.
[68]	Knowledge-Guided Hybrid	CNN (for detection) + U-Net (for segmentation) + rule-based algorithm (for measurement)	Automated Tool Wear Measurement in Metal Cutting	Automation & Interpretability: The measurement rule transforms the black-box model output into a tangible, meaningful, and reliable metric, enhancing interpretability and trust.	Small dataset; focus on flank wear only; limited explainability of segmentation. Future work: expand datasets, model multiple wear types, and connect segmentation outcomes with final part properties.
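
The detection, segmentation, and measurement chain summarized for [68] can be illustrated with a minimal sketch of the final rule-based step. The measurement rule used here (column-wise pixel counts on a binary mask), the calibration factor `um_per_px`, and the replacement threshold `limit_um` are assumptions made for illustration; the actual rule used in [68] is not described in this table.

```python
from typing import List, Tuple


def flank_wear_width(mask: List[List[int]], um_per_px: float) -> Tuple[float, float]:
    """Return (mean, max) wear width in micrometres from a binary mask.

    Assumed rule (hypothetical, not taken from [68]): the wear width in each
    image column is the number of pixels labelled 1 in that column, scaled by
    the calibration factor um_per_px. Columns without wear are ignored.
    """
    if not mask:
        return 0.0, 0.0
    widths = []
    for col in range(len(mask[0])):
        worn_px = sum(1 for row in mask if row[col])
        if worn_px:
            widths.append(worn_px * um_per_px)
    if not widths:
        return 0.0, 0.0
    return sum(widths) / len(widths), max(widths)


def tool_status(vb_max_um: float, limit_um: float) -> str:
    """Interpretable decision rule: flag the tool once the limit is reached."""
    return "replace" if vb_max_um >= limit_um else "ok"


# Toy mask standing in for a segmentation output (1 = worn pixel).
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 0],
]
vb_mean, vb_max = flank_wear_width(mask, um_per_px=2.0)
status = tool_status(vb_max, limit_um=5.0)  # threshold is illustrative only
```

Because the final value is a plain geometric quantity, an operator can check it against a manual microscope reading, which is the interpretability benefit the table attributes to the rule-based stage.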

Table 4. Summary of Transfer Learning-Based Hybrid Approaches.

Study	Hybrid Category	Integration Architecture	Application Domain	Hybrid Rationale	Limitations and Future Work
[75]	Transfer Learning Hybrids	Fine-tuning of a ResNet-18 model, pre-trained on ImageNet, with a limited set of microscopic tool images.	CNC Tool Wear Prediction	Adapts features learned from a large-scale, general image dataset to a data-constrained manufacturing problem, achieving high accuracy.	Dataset imbalance; long training times for deep models; not yet validated in real-time systems. Future work suggests sensor fusion and reinforcement learning integration.
[77]	Transfer Learning Hybrids	A Deep Transfer Learning (DTL) architecture (CDAR model) combining MSE and CEOD loss functions to mitigate conditional distribution shift between source and target domains.	Machine Health Monitoring (Prognostics and Health Management, PHM)	Enables strong generalization with limited labeled data; novel loss integration improves accuracy by preserving global conditional distribution properties.	Low explainability; high computational cost; complex hyperparameter tuning. Future research suggests incorporating active learning and enhancing interpretability.
[45]	Transfer Learning Hybrids	Fine-tuning of an EfficientNetB0 CNN, pre-trained on ImageNet, for milling tool wear classification, supplemented with Grad-CAM for explainability.	Tool Condition Monitoring	Achieves high classification accuracy from a small dataset; provides partial interpretability by visualizing the model’s decision points.	Sensitivity to lighting conditions; incomplete detection of some wear indicators (e.g., cracks). Future work requires a new approach that integrates domain-specific knowledge (e.g., focus on the cutting edge) into the decision process.
[78]	Transfer Learning Hybrids	Multi-channel force and acceleration signals transformed into spectrograms via STFT and fed into a ResNet-18 model pre-trained on ImageNet.	Real-time Tool Wear Prediction in Milling	Leverages a large, unique dataset for stable and low-error predictions; transfer learning allows adaptation to varying signal lengths and channel settings.	Low explainability; poor Acoustic Emission (AE) signal quality; prediction fluctuations with input changes. Future work: acquire more data to supplement the existing dataset and study further fine-tuning strategies for other processing environments and conditions.
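
The signal-to-image step listed for [78] can be sketched as follows. This is a minimal pure-Python short-time Fourier transform (STFT) for illustration only: the frame length, hop size, Hann window, and the synthetic test signal are assumptions, not settings reported in [78], and a practical pipeline would normally use an optimized signal-processing library before passing the spectrogram to the pre-trained CNN.

```python
import cmath
import math
from typing import List


def stft_magnitude(signal: List[float], frame_len: int, hop: int) -> List[List[float]]:
    """Magnitude spectrogram as a list of frames, each with frame_len//2 + 1 bins."""
    # Periodic Hann window to reduce spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len) for n in range(frame_len)]
    n_bins = frame_len // 2 + 1
    spectrogram = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        row = []
        for k in range(n_bins):
            # Direct DFT of the windowed frame at frequency bin k.
            acc = sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            row.append(abs(acc))
        spectrogram.append(row)
    return spectrogram


# Synthetic stand-in for one sensor channel: 4 cycles per 16-sample frame.
signal = [math.sin(2 * math.pi * 4 * n / 16) for n in range(64)]
spec = stft_magnitude(signal, frame_len=16, hop=8)
peak_bins = [row.index(max(row)) for row in spec]  # dominant bin in each frame
```

The result is time-major (frames by frequency bins). To feed a network pre-trained on three-channel images, one plausible arrangement is to stack the spectrograms of several sensor channels as image channels, although the exact input layout used in [78] is not stated here.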

Table 5. Summary of Heterogeneous Model Hybrid Approaches.

Study	Hybrid Category	Component Algorithms/Methods	Hybrid Integration Logic	Target Application	Qualitative Benefit/Hybrid Purpose	Limitations
[46]	Heterogeneous Model Hybrid	10 different ML models + Voting Regressor	Parallel Ensemble: Combines diverse base learners through a voting mechanism to average out errors and reduce variance.	Tool wear prediction in SKD11 steel milling	Robust & Generalizable Predictions: Mitigates the bias of any single model, leading to more stable and reliable predictions.	Tested on a single tool-workpiece combination.
[81]	Heterogeneous Model Hybrid	ANN + Decision Tree + k-NN	Comparative Ensemble (Soft Hybrid): Uses multiple, fundamentally different algorithms in parallel; the best-performing model for each output variable is selected, leveraging their inherent strengths.	Cutting force and surface roughness prediction in milling	Complementary Strength Utilization: Acknowledges that no single algorithm is best for all tasks; selects the most suitable model for each specific output.	Small dataset; lack of model explainability.
[82]	Heterogeneous Model Hybrid	IDOA + Deep Belief Network (DBN)	Optimization-Model Hybrid: Uses a metaheuristic algorithm (IDOA) to optimize the hyperparameters and feature space for a deep learning model (DBN).	Tool wear classification from tool edge images	Performance & Efficiency: The optimization algorithm ensures the DBN operates at its peak potential.	Small dataset; “black-box” nature.
[83]	Heterogeneous Model Hybrid	Deep Kernel Autoencoder (DKAE) + Gray Wolf Optimization (GWO)	Feature Extraction-Classifier Hybrid: Uses an unsupervised DKAE for non-linear, kernel-based feature compression, followed by GWO-optimized classification.	Tool wear detection from spindle motor current signals	Interpretability & Efficiency: The kernel-based feature extraction provides a more interpretable signal structure than raw data.	Reliance on a limited dataset and a single sensor type.
[84]	Heterogeneous Model Hybrid	Stacked Sparse Autoencoder (SSAE) + BPNN	Deep Feature Extraction Hybrid: Uses an unsupervised SSAE for automatic feature learning from sensor signals, the outputs of which are then fed into a BPNN for supervised regression.	Tool wear prediction on a CNC lathe	Automatic Feature Engineering: Eliminates the need for manual feature engineering by leveraging the SSAE’s ability to learn robust features directly from raw data.	Requires specialized sensors; high computational cost.
[85]	Heterogeneous Model Hybrid	CNN + Informer Encoder + BiLSTM (CIEBM)	Deep Architectural Hybrid: Integrates three distinct deep learning architectures: CNN for local features, Informer for long-sequence dependencies, and BiLSTM for bidirectional temporal context.	Tool wear monitoring	Multi-scale Temporal Feature Fusion: Combines strengths for both short-term and very long-term pattern recognition, leading to superior accuracy.	High computational complexity.
[86]	Heterogeneous Model Hybrid	Dirichlet Process Clustering + Gaussian Process Regression (DPGP)	Pre-processing-Model Hybrid: Uses a non-parametric clustering method to automatically identify and filter out faulty data, creating a clean dataset for a robust regression model.	Product quality prediction in continuous manufacturing	Robustness & Uncertainty Quantification: The hybrid system is inherently robust to sensor faults and provides reliable predictions with well-calibrated uncertainty estimates.	Limited real-time efficiency on large datasets.
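
The parallel-ensemble logic described for [46] can be sketched with a small averaging regressor. Only three toy base learners are used here in place of the ten models in [46], and the data, function names, and one-dimensional input are hypothetical; the sketch shows only how a voting mechanism averages the predictions of dissimilar learners.

```python
from statistics import mean
from typing import Callable, List, Sequence

Predictor = Callable[[float], float]


def fit_mean(x: Sequence[float], y: Sequence[float]) -> Predictor:
    """Baseline learner: always predicts the training mean."""
    m = mean(y)
    return lambda _q: m


def fit_linear(x: Sequence[float], y: Sequence[float]) -> Predictor:
    """Ordinary least-squares line through the training points."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    return lambda q: slope * q + intercept


def fit_1nn(x: Sequence[float], y: Sequence[float]) -> Predictor:
    """1-nearest-neighbour learner: returns the target of the closest sample."""
    pairs = list(zip(x, y))
    return lambda q: min(pairs, key=lambda p: abs(p[0] - q))[1]


def voting_regressor(models: List[Predictor]) -> Predictor:
    """Average the base predictions with equal weights."""
    return lambda q: mean([m(q) for m in models])


# Made-up data: cutting time (min) against flank wear (micrometres).
time_min = [1.0, 2.0, 3.0, 4.0]
wear_um = [10.0, 20.0, 30.0, 40.0]
models = [f(time_min, wear_um) for f in (fit_mean, fit_linear, fit_1nn)]
ensemble = voting_regressor(models)
prediction = ensemble(3.0)  # average of 25, 30 and 30
```

The comparative ensemble of [81] differs in that it would keep the single best base learner per output variable rather than averaging all of them.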

Table 6. Qualitative comparison between non-hybrid and hybrid approaches in manufacturing.

Dimension	Non-Hybrid Example	Hybrid Example	Conceptual Distinction
Model Type	Random Forest (RF) [88]	Physics-Informed Deep Learning (ADHL-LSTM) [43]	RF relies purely on statistical correlations among input variables, whereas ADHL-LSTM embeds physical residuals (Taylor equations) within a deep network, aligning predictions with physical processes.
Integration Principle	CNN (ResNet, VGG-16) [89]: single deep learning paradigm trained only on image data	Knowledge-Guided Hybrid (DCAE + PCA + IG) [65]: combines autoencoder-based learning with explainable AI (Integrated Gradients)	Hybrid models integrate data-driven learning and expert-guided interpretability, while non-hybrid CNNs extract patterns without incorporating domain knowledge.
Learning Scope	XGBoost [90]: homogeneous ensemble of decision trees	Transfer Learning Hybrid (CDAR model) [77]: combines source-domain representation learning with target-domain adaptation	Hybrid systems leverage cross-domain knowledge transfer, while non-hybrid ensembles remain limited to a single dataset or environment.
Algorithmic Diversity	Adaptive CNN [91]: self-updating within one algorithmic family	Heterogeneous Model Hybrid (ANN + DT + kNN) [81]: integrates fundamentally different algorithms in parallel to exploit their complementary strengths	Hybrids combine diverse learning paradigms to balance accuracy and robustness, whereas non-hybrids rely on a single methodological family.
Knowledge Incorporation	Data-driven models trained solely on experimental data without explicit physics or expert rules	Hybrid frameworks integrate physical laws, rule-based logic, or domain expertise to guide learning and enhance interpretability	Non-hybrid systems depend mainly on statistical correlations, whereas hybrid systems combine data-driven learning with physical or expert knowledge to ensure interpretability and physical realism.
Expected Strength	High accuracy under narrow experimental conditions; limited adaptability	Enhanced generalization, interpretability, and robustness under variable industrial conditions	Hybrid systems are designed to overcome the structural fragility and context dependence of non-hybrid methods.
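
The "Model Type" row of Table 6 refers to embedding physical residuals from the Taylor equation in a deep network. A minimal sketch of that idea is given below, assuming the classical Taylor tool-life relation V·T^n = C and a loss that adds a weighted physics penalty to the usual data error. The loss form, the weight `lam`, and all numerical values are illustrative assumptions; this is not the ADHL-LSTM formulation of [43].

```python
import math
from typing import Sequence


def taylor_residual(v: float, t_pred: float, n: float, c: float) -> float:
    """Log-form residual of Taylor's tool-life equation V * T**n = C.

    Zero when the predicted tool life t_pred is consistent with the equation.
    """
    return math.log(v) + n * math.log(t_pred) - math.log(c)


def hybrid_loss(
    t_pred: Sequence[float],
    t_obs: Sequence[float],
    speeds: Sequence[float],
    n: float,
    c: float,
    lam: float = 1.0,
) -> float:
    """Mean squared data error plus lam times the mean squared physics residual."""
    count = len(t_pred)
    data_term = sum((p - o) ** 2 for p, o in zip(t_pred, t_obs)) / count
    physics_term = sum(
        taylor_residual(v, p, n, c) ** 2 for v, p in zip(speeds, t_pred)
    ) / count
    return data_term + lam * physics_term


# Illustrative constants: with n = 0.25 and C = 400, a cutting speed of
# 100 m/min gives a tool life of (400 / 100) ** (1 / 0.25) = 256 min.
consistent = hybrid_loss([256.0], [256.0], [100.0], n=0.25, c=400.0)
inconsistent = hybrid_loss([300.0], [300.0], [100.0], n=0.25, c=400.0)
```

In the second call the prediction matches the observation exactly, so the data term is zero, yet the loss stays positive because the prediction violates the assumed physical relation. A purely statistical model such as a Random Forest has no comparable term in its objective.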
