1. Introduction
Artificial intelligence (AI) is increasingly redefining the landscape of machinery health monitoring (MHM), enabling predictive maintenance, fault diagnosis, and operational resilience across industrial domains. Despite considerable progress, the literature reveals unresolved issues that constrain large-scale industrial adoption. Specifically, robust AI-based MHM faces systemic barriers in (i) overcoming data scarcity and class imbalance caused by rare fault events, (ii) managing evolving degradation modes and unknown-unknown fault types, and (iii) integrating heterogeneous multimodal data streams into scalable, transparent decision systems. These challenges undermine model generalizability, interpretability, and trustworthiness, thus delaying the transition from laboratory prototypes to production-grade systems [1,2,3,4,5,6,7,8,9,10,11].
Recent work underscores that while transfer learning [
12], generative AI [
5,
13], and multimodal fusion [
3,
4,
14,
15] offer partial remedies, critical research gaps remain. First, standardization and benchmarking protocols are notably underexplored, with only four references addressing the problem [
9,
16,
17,
18,
19,
20], thus limiting reproducibility and industrial benchmarking. Second, cybersecurity, privacy, and resilience within IoT-enabled environments are poorly addressed, despite being central to Industry 4.0 integration [
8,
21,
22]. Third, the interface between human expertise and AI (“human-in-the-loop” approaches) has gained attention [
23,
24] but lacks cohesive frameworks for trust and decision transparency. Finally, while hybridization with physics-based models improves interpretability [
7,
25,
26,
27], methodological rigor for integrating causal reasoning and mechanistic insights is still emergent. Addressing these research voids is essential for establishing AI-driven MHM as a reliable industrial standard.
To map these gaps, this paper employs a systematic review using a modified PRISMA protocol, encompassing Scopus, Web of Science, and Litmaps. From 3235 initial records, 85 peer-reviewed studies were retained after multi-stage screening. Data were thematically analyzed across 13 research questions grouped into three meta-themes: (1)
Data-Centric Challenges & Innovations, (2)
Model Design, Transparency, and Hybridization, and (3)
Deployment, Operations, and Decision Intelligence. The bibliometric analysis reveals an accelerating research trajectory—from only 1 publication in 2018 to 34 in 2025 (3300% growth)—and highlights concentration around predictive maintenance (10 studies) and edge computing (9 studies). Notably, techniques such as transfer learning [
12,
27] and generative modeling [
5,
13,
28,
29] are emerging as dominant paradigms for alleviating scarcity, while explainable AI [
6,
30] and human-AI collaboration [
23,
24] are foundational for industrial trust.
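The reported growth figure can be reproduced with the usual relative-change formula, assuming the single 2018 publication is taken as the baseline (the symbols $N_{2018}$ and $N_{2025}$ are introduced here only for illustration):

```latex
\[
\text{growth}
  = \frac{N_{2025} - N_{2018}}{N_{2018}} \times 100\%
  = \frac{34 - 1}{1} \times 100\%
  = 3300\%
\]
```

Because the baseline is a single paper, the percentage is very sensitive to that starting value and is best read alongside the absolute counts.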
Figure 1 encapsulates the conceptual schema of AI-enabled MHM. It illustrates a layered ecosystem beginning with multimodal data acquisition (acoustic, vibration, thermal, visual) and progressing through data pre-processing, AI-driven fault detection/prognostics (deep learning, transfer learning, generative AI), interpretability layers (xAI, hybrid physics-AI), and deployment mechanisms (edge computing, cybersecurity, and decision intelligence). Importantly, the diagram foregrounds human operators and decision-support systems, highlighting the socio-technical integration necessary for real-world impact.
This literature review establishes a comprehensive, evidence-driven framework for next-generation machine health monitoring. It synthesizes 85 studies to align three core domains (data strategies, model design, and deployment intelligence), prioritizes predictive maintenance and edge AI, advances hybrid and generative approaches, and prescribes standardization and human-in-the-loop mechanisms to meet Industry 4.0 and 5.0 demands. In summary, this review contributes the following quantitative and qualitative insights:
The study consolidates evidence from 85 rigorously selected papers, covering 13 research questions, producing a thematic map of innovations and challenges.
Bibliometric evidence shows publication growth of 3300% between 2018 and 2025, demonstrating rapid maturation of the field.
Critical themes are identified, with predictive maintenance (12 papers), edge/embedded AI (10 papers), and generative AI for rare events (7 papers) emerging as strategic levers for industrial resilience.
Underrepresented areas (standards/benchmarking: 4 papers; cybersecurity/privacy: 7 papers) indicate actionable research gaps.
2. Materials and Methods
This systematic literature review was conducted to identify, analyze, and synthesize the current state of research on AI-based machine health monitoring (MHM). The review followed a structured, systematic methodology to ensure objectivity, transparency, and reproducibility of the findings. The process is outlined in a modified PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) flow diagram (see
Figure 2).
2.1. Search Strategy and Information Sources
A comprehensive search was performed across three primary academic databases: Scopus, Web of Science, and Litmaps. The search was conducted using a combination of keywords and Boolean operators to capture the breadth of the research field. The keywords included terms such as “AI,” “machine learning,” “deep learning,” “prognostics and health management (PHM),” “machinery health monitoring,” “fault diagnosis,” and “predictive maintenance.” A total of 3235 records were initially identified from these databases.
2.2. Study Selection and Eligibility Criteria
The selection process involved a multi-stage screening of the identified records.
Duplicate Removal: An initial screening removed 596 duplicate records, leaving 2639 unique papers.
Title and Abstract Screening: The remaining records were screened based on their titles and abstracts to assess their relevance to the review’s core research questions. A paper was considered eligible if its abstract indicated a focus on the application of AI, machine learning, or deep learning in the context of machinery health, fault diagnosis, or predictive maintenance; papers that did not directly contribute to answering at least one of the 13 research questions were excluded. This stage resulted in 163 potentially relevant articles.
Full-Text Assessment: The full texts of these 163 articles were then retrieved and assessed against the pre-defined research questions. This final stage yielded 85 studies that were included in the final review. A total of 78 articles were excluded for not addressing the research questions or for other reasons (e.g., poster papers with limited technical detail, conference papers of fewer than four pages, and non-English papers).
The initial database search identified 3235 records owing to the deliberately broad use of keywords such as “AI machine health diagnostics,” “ML-based equipment monitoring,” “AI-based fault diagnosis,” “machine prognostics,” and “intelligent condition monitoring.” While this strategy was intended to capture a comprehensive body of literature, it inevitably retrieved a large number of studies outside the scope of this review. In particular, many articles focused on AI applications for human health monitoring (e.g., mental health, ECG analysis, stress detection) rather than machinery and equipment health. During the multi-stage screening process, these studies were excluded as they did not address any of the 13 research questions defined for AI-enabled machine health monitoring. Consequently, the final set of 85 studies reflects a rigorously filtered corpus that is thematically precise, methodologically relevant, and aligned with the contribution of this review.
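The screening counts can be cross-checked with a few lines of arithmetic. The sketch below assumes that the 596 duplicates are subtracted from the 3235 identified records and that the number of full texts assessed equals the included plus excluded studies; the function and key names are illustrative only.

```python
# Record counts reported in Section 2.2; the subtraction steps are assumptions.
identified = 3235        # records returned by the database search
duplicates = 596         # duplicate records removed
included = 85            # studies retained after full-text assessment
excluded_full_text = 78  # articles excluded at full-text assessment


def prisma_flow(identified, duplicates, included, excluded_full_text):
    """Return the record count implied at each screening stage."""
    unique = identified - duplicates
    full_text_assessed = included + excluded_full_text
    return {
        "identified": identified,
        "after_duplicate_removal": unique,
        "excluded_on_title_abstract": unique - full_text_assessed,
        "full_text_assessed": full_text_assessed,
        "included": included,
    }


flow = prisma_flow(identified, duplicates, included, excluded_full_text)
for stage, count in flow.items():
    print(f"{stage}: {count}")
```

Under these assumptions, the count of records excluded at the title and abstract stage is derived rather than reported, so it should be treated as an implied value.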
2.3. Data Extraction and Synthesis
Data from the 85 included studies were extracted and organized based on a thematic framework derived from the 13 core research questions of the study. For each selected paper, the following information was extracted and categorized:
Thematic Focus: The primary research question(s) the paper addressed (e.g., Data challenges, Multimodal Fusion, Generative AI).
Methodology: The specific AI techniques or approaches employed (e.g., Deep Learning, Transfer Learning, Generative Adversarial Networks).
Key Findings: The main contributions and results of the study.
The extracted data were then synthesized to identify trends, research gaps, and key insights within each thematic area. A bibliometric analysis was also conducted to provide a quantitative overview of the research landscape, including publication trends over time and a preliminary author/journal analysis. This synthesis formed the basis for the critical analysis of the findings presented in the subsequent sections of this paper.
3. Thirteen Research Questions Within Three Core Themes
The 13 research questions distilled in
Section 3 provide the structural backbone of this review, organizing the field of AI-driven machine health monitoring into three interdependent thematic domains: data-centric challenges and innovations, model design with transparency and hybridization, and deployment, operations, and decision intelligence. The mapping between these research questions and the thematic domains is shown in
Table 1,
Table 2 and
Table 3. Collectively, these domains encompass the entire lifecycle of AI integration in machinery health, from addressing fundamental issues of data scarcity, imbalance, and multimodal fusion to ensuring explainability, human-AI collaboration, and scalable industrial adoption.
Figure 3 visually encapsulates this conceptual triad, illustrating how each theme connects to a set of sub-questions that interrogate critical bottlenecks and emerging opportunities. By framing the literature through these lenses, the review not only clarifies the technological trajectory of the field but also highlights cross-cutting concerns such as trust, cybersecurity, and standardization that underpin successful Industry 4.0 and 5.0 implementations.
The thirteen research questions were derived through a structured synthesis of the 85 selected studies, where recurring gaps and challenges were clustered into three meta-themes. This iterative process ensured that the questions captured both established bottlenecks (e.g., data scarcity, explainability) and emerging priorities (e.g., generative AI, standardization). They served not only as an organizing framework for the review but also as a roadmap that directly structures its contribution, enabling the paper to move beyond descriptive synthesis toward actionable, theme-driven insights for advancing AI-enabled machine health monitoring.
3.1. Theme 1- Data-Centric Challenges & Innovations
This section critically examines the inherent complexities associated with data acquisition, integration, and quality in AI-driven machinery health monitoring. It highlights the prevalence of sparse, imbalanced, and noisy datasets and underscores innovative strategies—such as synthetic data generation [
31,
32], transfer learning, and multimodal sensor fusion—that are reshaping the landscape of predictive maintenance and anomaly detection [
33,
34].
Table 1 maps the research questions to their sub-themes and associated references from the current body of knowledge.
Figure 4 illustrates the flow from key data-centric challenges such as data scarcity, imbalance, sensor noise, and vision-based sensing toward solution strategies including synthetic data generation, transfer learning, multimodal fusion, and structural sensor anomaly detection, which in turn enable advanced innovations like generative AI and physics-informed learning for robust machine health monitoring.
Table 1.
Five research questions under “Data-Centric Challenges & Innovations”.
Research Question | Sub-Thematic | References |
---|---|---|
1. How to Make AI Models Robust to Data Scarcity, Imbalance, and Noise in Machinery Health Datasets? | Data challenges and robustness | [1,35,36,37,38,39,40,41,42,43] |
2. What methods can enable AI models to handle evolving degradation modes and previously unseen fault types (unknown-unknowns)? | Adaptation to unknown faults and concept drift | [2,12,44,45] |
3. How can multimodal data (acoustic, vibration, thermal, visual, operational logs) be fused effectively in AI-based MHM? | Multimodal and sensor fusion | [3,4,14,15,46,47] |
4. What role can generative AI play in simulating rare failure events and augmenting training data? | Generative AI for rare events and data augmentation | [5,13,28,29,42,48,49] |
5. How can AI-augmented sensor frameworks improve fidelity and adaptability in real-time condition monitoring? | Camera/vision-based sensor integration and structural sensor-based anomaly detection | [10,11,14,46,47] |
3.1.1. How to Make AI Models Robust to Data Scarcity, Imbalance, and Noise in Machinery Health Datasets
Industrial machinery often operates under normal conditions for long periods, producing vast “healthy” data and relatively few fault instances. The problem of imbalanced learning and domain shift due to noise or sensor degradation remains a major barrier. To address the challenges of data scarcity, imbalance, and noise in machinery health datasets, several strategies can be employed, as shown in Figure 5.
Generating synthetic samples can help balance the dataset and improve the model’s ability to generalize. This technique, referred to as data augmentation, is particularly useful when abnormal conditions are underrepresented, as shown by Dai et al., 2024 [1].
Class imbalance is ubiquitous in machinery datasets, where abnormal states represent only a fraction of the data. This skews learning toward normal conditions, impairing the detection of faults. Techniques such as data augmentation and sampling, including oversampling of minority classes or balanced mini-batch learning, can mitigate this issue and improve generalization [
37,
38,
40,
42]. Fleet-based anomaly detection frameworks offer an unsupervised alternative: by comparing machines within a fleet—assuming most are healthy—these models can detect deviations that signal faults, reducing the need for large annotated datasets [
18].
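As a concrete illustration of oversampling minority classes, the following minimal sketch duplicates randomly chosen minority samples until all classes are the same size. It is not taken from any of the cited studies; the function name, the toy feature values, and the class labels are all hypothetical.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples until every class matches the largest."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(group) for group in by_class.values())
    out_x, out_y = [], []
    for y, group in by_class.items():
        # Draw the missing samples with replacement from the existing ones.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for x in group + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


# Hypothetical toy data: 8 healthy vibration features and 2 faulty ones.
features = [[0.1 * i] for i in range(8)] + [[2.5], [2.7]]
labels = ["healthy"] * 8 + ["fault"] * 2
balanced_x, balanced_y = random_oversample(features, labels)
print(Counter(balanced_y))
```

Plain duplication adds no new information, so it is usually applied to the training split only, never to validation or test data.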
AI integrated with big data analytics has proven effective in handling the dynamic, massive, nonlinear, and noisy data streams generated by industrial equipment. Preprocessing methods such as noise filtering and data cleaning can further improve data quality, enabling models to extract meaningful patterns despite heterogeneity [
39,
41].
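One simple form of noise filtering is a sliding median, which suppresses isolated spikes while leaving the underlying level largely intact. The sketch below is only an illustrative assumption about what such a preprocessing step might look like; the window size and the sample signal are made up.

```python
from statistics import median


def median_filter(signal, window=3):
    """Replace each sample by the median of a centered window.

    Samples near the edges use a shorter, truncated window.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    return [
        median(signal[max(0, i - half): i + half + 1])
        for i in range(len(signal))
    ]


# Hypothetical sensor trace with one spurious spike at index 2.
raw = [1.0, 1.1, 9.0, 1.2, 1.0, 1.1]
smoothed = median_filter(raw, window=3)
print(smoothed)
```

A wider window removes longer bursts but also blurs genuine short transients, which may themselves be fault signatures.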
To ensure long-term applicability, flexible and extensible AI architectures are essential. These frameworks can integrate diverse data modalities and adapt to new fault types or operating conditions [
43]. In addition, causal reasoning and expert validation safeguard against spurious correlations, while interpretable frameworks provide transparent explanations that improve operator trust in safety-critical contexts [
18,
40,
42].
Implementing Explainable AI (xAI) can improve fault diagnosis by integrating different data modalities and providing understandable explanations for decisions [
50]. This enhances transparency, traceability, and causability, which are crucial for maintaining reliability and durability in machinery health monitoring (Lahmiri, 2023) [
35]. Enhancing the tagging quality of time series data can reduce the proportion of signals unrelated to tasks. This involves processing signals twice to improve the tagging degree, although manual tagging can be costly and inconsistent as shown in Li et al., 2022 [
36].
These strategies collectively contribute to developing robust AI models capable of handling data scarcity, imbalance, and noise in machinery health datasets.
3.1.2. Methods to Handle Evolving Degradation Modes and Unknown-Unknown Fault Types
Industrial machinery often exhibits novel, unmodeled failure modes over long lifecycles. Addressing concept drift, open-set fault diagnosis, and continual learning is critical for robust long-term monitoring. Several artificial intelligence (AI) methods can enable models to handle evolving degradation modes and previously unseen fault types, as shown in Figure 6.
Deep learning techniques are capable of automatically extracting features, handling nonlinear relationships, integrating multi-source information, and providing adaptive learning and intelligent decision support. This makes them suitable for dealing with complex and evolving degradation modes as demonstrated by Dong et al., 2025 [
2]. Transfer learning allows AI models to leverage knowledge from previously learned tasks to improve performance on new, unseen tasks. This can be particularly useful for handling unknown-unknown fault types by transferring insights from similar known faults [
12]. Adversarial Neural Networks can be trained to recognize and adapt to new fault types by simulating various fault scenarios during training, thus preparing the model for previously unseen faults [
12]. Artificial Immune Systems (AIS)-based strategies have been proposed to overcome limitations of current fault diagnosis methods. These systems mimic the adaptive and learning capabilities of biological immune systems, making them effective in handling evolving degradation modes and unknown faults [
44]. AI models that utilize condition monitoring data collected from sensors can learn and predict system behavior and degradation. These Data-Driven approaches are adaptive and suitable for real-world applications where models are not available, thus helping in identifying unknown-unknowns (Nguyen, Liu & Zio, 2020) [
45].
In summary, the key methods for handling evolving degradation modes and unknown-unknown fault types are as follows (as shown in
Figure 6):
Deep Learning: Handles nonlinear relationships and integrates multi-source information (Dong et al., 2025) [2].
Transfer Learning: Utilizes knowledge from previous tasks for new, unseen tasks (Zhou & Chen, 2025) [12].
Adversarial Neural Networks: Simulate fault scenarios for better adaptation [12].
Artificial Immune Systems: Mimic biological immune systems for adaptive learning (Ming & Zhao, 2018) [44].
Data-Driven Approaches: Use sensor data for adaptive predictions [45].
These methods collectively enhance the capability of AI models to manage evolving degradation modes and previously unseen fault types effectively.
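To make the idea of open-set fault diagnosis concrete, the sketch below uses a nearest-centroid classifier that rejects samples lying too far from every known class. This is a deliberately simple stand-in, not the approach of any cited study; the class names, feature values, and the rejection distance are assumptions chosen for illustration.

```python
import math


def centroid(rows):
    """Mean feature vector of a list of equal-length rows."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def fit_centroids(samples, labels):
    """Compute one centroid per known class."""
    grouped = {}
    for x, y in zip(samples, labels):
        grouped.setdefault(y, []).append(x)
    return {y: centroid(rows) for y, rows in grouped.items()}


def classify_open_set(x, centroids, reject_distance):
    """Return the nearest known class, or 'unknown' if it is too far away."""
    label, dist = min(
        ((y, math.dist(x, c)) for y, c in centroids.items()),
        key=lambda pair: pair[1],
    )
    return label if dist <= reject_distance else "unknown"


# Hypothetical two-feature training set with two known conditions.
train_x = [[0.0, 0.0], [0.2, 0.1], [3.0, 3.0], [3.1, 2.9]]
train_y = ["healthy", "healthy", "bearing_fault", "bearing_fault"]
centroids = fit_centroids(train_x, train_y)

known = classify_open_set([0.1, 0.1], centroids, reject_distance=1.0)
novel = classify_open_set([10.0, -5.0], centroids, reject_distance=1.0)
print(known, novel)
```

In practice the rejection distance would need to be tuned on held-out data, and samples flagged as unknown would be routed to an expert for labeling before being added as a new class.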
3.1.3. Effective Fusion of Multimodal Data in AI-Based MHM
To effectively fuse multimodal data (acoustic, vibration, thermal, visual, operational logs) in AI-based machinery health monitoring (MHM), several strategies can be employed: data fusion algorithms, handling of heterogeneous data, interactive validation, adaptive weighting strategies, continuous learning models, simulation-augmented learning, and multitask learning and fusion strategies.
Developing robust data fusion algorithms is crucial. These algorithms should be capable of merging and registering information from multiple sensors to provide a comprehensive assessment of equipment health [3,4,14]. Heterogeneous data must also be handled carefully: each type of sensor data may require different preprocessing, feature extraction, and modeling approaches, so unified models that can seamlessly handle multimodal inputs are essential for improving the accuracy and reliability of condition monitoring systems [4,46]. Utilizing interactive validation across multimodal data can enhance the completeness of key fault features, thereby improving the robustness and stability of diagnostic approaches [4,14]. Implementing adaptive weighting strategies for multi-sensor data fusion allows for the dynamic extraction of key information and the assignment of optimal weights to different sensor inputs (Ying et al., 2025) [4]. AI models that continuously learn and adapt to new data can capture the evolution of structural conditions and environmental changes over time, ensuring real-time monitoring accuracy [3]. Simulation-based data augmentation can address issues of data scarcity and missing fault types, enhancing the training of deep learning models under constrained real-world conditions (Konecny et al., 2026) [15]. Multitask learning and fusion strategies can enhance classification accuracy under variable operating conditions by integrating data from multiple sensors, such as temperature, vibration, and acoustic emissions (Konecny et al., 2026) [15].
By leveraging these techniques, AI-based MHM systems can achieve more accurate and comprehensive fault diagnosis, ultimately leading to improved equipment reliability and reduced maintenance costs.
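One elementary way to assign weights to different sensor inputs is inverse-variance weighting, in which sensors with noisier recent readings contribute less to the fused estimate. The sketch below illustrates that general idea only; it is not the adaptive weighting scheme of the cited work, and the sensor names and normalized health-indicator values are hypothetical.

```python
from statistics import pvariance


def inverse_variance_weights(recent_readings, eps=1e-9):
    """Weight each sensor by the inverse of its recent variance.

    The weights are normalized so that they sum to 1.
    """
    inv = {
        name: 1.0 / (pvariance(values) + eps)
        for name, values in recent_readings.items()
    }
    total = sum(inv.values())
    return {name: w / total for name, w in inv.items()}


def fuse(latest, weights):
    """Weighted average of the latest reading from each sensor."""
    return sum(weights[name] * latest[name] for name in weights)


# Hypothetical normalized health indicators from three modalities.
recent = {
    "vibration": [0.50, 0.52, 0.48, 0.51],
    "acoustic": [0.20, 0.90, 0.40, 0.70],   # noisy channel
    "thermal": [0.49, 0.50, 0.51, 0.50],
}
latest = {name: values[-1] for name, values in recent.items()}
weights = inverse_variance_weights(recent)
fused = fuse(latest, weights)
print(weights, fused)
```

Recomputing the weights over a sliding window makes the scheme adaptive, although it implicitly assumes that high variance reflects noise rather than a developing fault.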
3.1.4. Role of Generative AI in Simulating Rare Failure Events and Augmenting Training Data
Failures are rare but costly. Generative models (e.g., GANs, diffusion models, synthetic digital twins) could simulate fault scenarios to enrich datasets. However, questions about fidelity, reliability, and validation of synthetic data remain unresolved. Generative AI plays a significant role in simulating rare failure events and augmenting training data across various domains. Some of the key contributions from the existing body of knowledge are “Synthetic Data Creation”, “Training Simulations”, “Data Augmentation”, “Early Anomaly Detection”, and “Customized Educational Materials”.
Generative AI can create synthetic datasets to improve the training of machine learning models, especially for rare events. This helps in addressing the data imbalance problem by generating samples similar to actual failure samples through unsupervised learning [
5,
13,
28]. Generative AI can develop realistic scenarios for training simulations, enhancing preparedness for emergencies and complex maneuvers in risk-free environments. This is particularly useful in fields like healthcare, autonomous systems, and military training [
5,
48]. Generative adversarial networks (GANs) are popular for data augmentation, generating new data that mimics the characteristics of real-world samples. This is crucial for testing and improving diagnostic algorithms in manufacturing, healthcare, and system maintenance [
13,
28]. By learning normal operating data patterns, generative AI can detect deviations and potential anomalies early, providing timely warnings and improving maintenance efficiency (Leng et al., 2025) [
29]. Generative AI can create customized educational materials to better inform the public about specific technologies, enhancing overall knowledge and preparedness (Madani et al., 2025) [
5]. Despite its benefits, generative AI faces challenges such as discrepancies between generated data and real data, and the limited training data available for the generative model itself (Zhai et al., 2025) [
49].
In summary, generative AI is a powerful tool for simulating rare failure events and augmenting training data, offering solutions across multiple fields while also facing certain limitations [
5,
13,
28,
29,
48,
49].
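The workflow of fitting a generative model to scarce fault data, sampling synthetic examples, and checking their fidelity can be sketched without a full GAN. The example below substitutes a per-feature Gaussian for the generative model, which is a strong simplifying assumption; the feature values and function names are hypothetical.

```python
import random
from statistics import mean, stdev


def fit_gaussian(rows):
    """Estimate a per-feature mean and standard deviation from real samples."""
    return [(mean(col), stdev(col)) for col in zip(*rows)]


def sample_synthetic(params, n, seed=0):
    """Draw n synthetic rows, treating features as independent Gaussians."""
    rng = random.Random(seed)
    return [[rng.gauss(mu, sigma) for mu, sigma in params] for _ in range(n)]


def mean_gap(real, synthetic):
    """Largest absolute difference between real and synthetic feature means."""
    real_means = [mean(col) for col in zip(*real)]
    synth_means = [mean(col) for col in zip(*synthetic)]
    return max(abs(r - s) for r, s in zip(real_means, synth_means))


# Hypothetical fault samples: (vibration RMS, normalized temperature).
real_faults = [[2.4, 0.80], [2.6, 0.85], [2.5, 0.78], [2.7, 0.90]]
params = fit_gaussian(real_faults)
synthetic_faults = sample_synthetic(params, n=200)
gap = mean_gap(real_faults, synthetic_faults)
print(f"largest mean gap: {gap:.4f}")
```

The mean gap is only a crude fidelity check; with four real samples, the fitted model inherits the same data limitation that the text identifies as a challenge for generative approaches.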
3.2. Theme 2—Model Design, Transparency and Hybridization
Here, we explore the architectural evolution of AI models tailored for machinery health applications, with a focus on balancing predictive accuracy, interpretability, and scalability. The discussion emphasizes the growing relevance of explainable AI, physics-informed modeling, and hybrid approaches that integrate data-driven and mechanistic insights, fostering trust and operational reliability.
Figure 7 presents the four pillars of model design for machine health monitoring: generalization and robustness, model transparency, physics–AI hybridization, and human–AI collaboration. Each pillar is annotated with representative methods, and the arrows illustrate how advances in model development ultimately converge toward human–AI collaboration, which serves as the integrative foundation for trustworthy and practical deployment.
Table 2 maps the research questions to their sub-themes and associated references from the current body of knowledge.
Table 2.
Four research questions under “Model Design, Transparency and Hybridization”.
Research Questions | Thematic | References |
---|---|---|
6. What strategies enable AI-driven health monitoring systems to generalize across machinery types, operating conditions, and industries? | Generalization and scalability | [51,52,53,54,55,56] |
7. How can explainability and trustworthiness of AI models be enhanced for industrial adoption? | Trust, Transparency, and Explainability | [6,10,30,57,58,59] |
8. What is the optimal integration of physics-based models with AI for machinery prognostics? | Hybrid modelling and Physics-AI integration | [7,25,26,27,60,61,62] |
9. What frameworks ensure human-AI collaboration in decision-making for machinery health management? | Human-AI collaboration | [23,24,40,63,64] |
3.2.1. Strategies for Generalizing AI-Driven Health Monitoring Systems
Most models are trained on specific machines (e.g., turbines, motors, pumps) under controlled settings. A critical challenge lies in developing transfer learning, domain adaptation, or foundation model approaches to ensure cross-domain generalizability. To enable AI-driven health monitoring systems to generalize across different machinery types, operating conditions, and industries, several strategies can be employed, such as “Data-Driven Approaches”, “Standardization and Reusability”, “Handling Variable Operating Conditions”, “Integration of IoT and AI”, and “Human-Centric Approaches”.
Deep Learning-Based Fault Diagnosis (DLFD) methods are highly accurate and can process large volumes of data, making them suitable for diverse applications [
51]. Utilizing AI and sensing technologies, Prognostics and Health Management (PHM) facilitates condition-based and predictive maintenance, which can be adapted across various sectors (Llasag Rosero et al., 2025) [
52]. Developing reusable models and resources that can be applied across different machines and environments is crucial. Standardized health monitoring resources, as shown by Toothman et al., 2023, avoid the need for individual models for each machine, enhancing scalability and efficiency [
53]. Addressing the domain shift phenomenon caused by varying working conditions and cross-equipment interference is essential. As shown by Zhao et al., 2025, this ensures that models maintain high performance despite changes in data distribution [
54]. Leveraging IoT for data collection and AI for analysis enables intelligent decision-making systems that can generalize across different industries, such as healthcare, manufacturing, and transportation [
55]. Emphasizing human-centric EHM approaches ensures that AI systems are designed to complement human expertise, which is particularly important in complex and variable environments as advocated by Dang, Chen & Huang, 2025 [
56].
By implementing these strategies, AI-driven health monitoring systems can achieve greater generalization and adaptability across various machinery types, operating conditions, and industries.
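A basic way to reduce domain shift between machines is to standardize each machine's features with its own statistics before feeding them to a shared model. The sketch below shows this per-domain normalization under the assumption that the shift is mainly an offset and scale difference; the machine names and values are hypothetical, and more elaborate domain adaptation methods exist.

```python
from statistics import mean, pstdev


def standardize_per_domain(features_by_machine):
    """Z-score each machine's feature values using that machine's own statistics."""
    aligned = {}
    for machine, values in features_by_machine.items():
        mu, sigma = mean(values), pstdev(values)
        sigma = sigma if sigma > 0 else 1.0  # guard against constant signals
        aligned[machine] = [(v - mu) / sigma for v in values]
    return aligned


# Hypothetical RMS vibration levels: pump_b reads higher and with a wider spread.
raw_features = {
    "pump_a": [1.0, 1.2, 0.9, 1.1],
    "pump_b": [5.0, 5.6, 4.7, 5.3],
}
aligned = standardize_per_domain(raw_features)
print(aligned)
```

After alignment, both pumps occupy the same numeric range, so a model trained on one is less likely to misread the other's baseline as a fault.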
3.2.2. Enhancing Explainability and Trustworthiness of AI Models for Industrial Adoption
While deep learning achieves high accuracy in fault detection and remaining useful life (RUL) prediction, these models are often “black boxes.” Developing explainable AI (XAI) methods for fault diagnostics that align with engineers’ intuition is vital for trust and decision-making. To enhance the explainability and trustworthiness of AI models for industrial adoption, several strategies can be employed, such as “Implement XAI”, “Improve Interpretability”, “Address Black-box Nature”, “Regulatory Compliance”, and “Visualization and Communication”.
XAI clarifies AI-driven decisions, enabling stakeholders to understand the reasoning behind outputs, which is crucial in sectors like manufacturing and transportation [
6]. By explaining AI-driven adjustments in production processes and predictive maintenance, XAI ensures operators are informed, maintaining high-quality standards and compliance with safety regulations (as discussed by Dubey & Kumar, 2025) [
6]. Interpretability allows operators to understand which variables impact the occurrence of failures, providing insights into the mechanisms leading to faults [
57]. Developing AI models that explain their decision-making process in real-time can significantly improve reliability and trustworthiness in high-stakes environments (as highlighted by Keramati Feyz Abadi et al., 2025) [
58]. Integrating humans into the AI decision-making process allows for oversight and intervention, enhancing decision-making in complex systems [
58]. Techniques like SHAP and GradCAM are used to convert complex models into more understandable forms, balancing fault detection performance with explainability [
30,
59]. Ensuring AI models comply with regulations like GDPR and the EU AI Act by incorporating XAI methods enhances trust and facilitates adoption in critical industrial tasks [
30,
59]. Using graphical explanation tools to convey XAI outputs to end-users helps in understanding and trust-building [
30]. Tailoring XAI techniques to specific industrial use cases, such as fault diagnosis and predictive maintenance, enhances interpretability and user trust [
18,
40,
42].
By focusing on these strategies, industries can significantly enhance the explainability and trustworthiness of AI models, facilitating their adoption and ensuring robust, reliable, and transparent AI-driven operations.
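To show how a model-agnostic explanation can reveal which variables drive a fault prediction, the sketch below implements permutation importance, a simpler relative of attribution methods such as SHAP. It is an illustrative assumption rather than the procedure of any cited study; the toy classifier, feature names, and data are hypothetical.

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows for which the model output matches the label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(model, rows, labels, n_repeats=20, seed=0):
    """Average accuracy drop when one feature column is shuffled at a time."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(model, shuffled, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


def toy_model(row):
    """Hypothetical stand-in for a trained classifier: it only uses vibration."""
    vibration_rms, ambient_temp = row
    return "fault" if vibration_rms > 1.0 else "healthy"


rows = [[0.4, 20.0], [0.6, 35.0], [0.5, 28.0],
        [1.8, 22.0], [2.1, 31.0], [1.6, 25.0]]
labels = [toy_model(r) for r in rows]
scores = permutation_importance(toy_model, rows, labels)
print(dict(zip(["vibration_rms", "ambient_temp"], scores)))
```

A feature the model ignores receives an importance of zero, which gives operators a direct check on whether the model relies on physically plausible signals.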
3.2.3. Optimal Integration of Physics-Based Models with AI for Machinery Prognostics
Traditional physics-driven models capture degradation mechanisms, while AI models exploit data-driven patterns. The unresolved research question is how to synergistically combine these paradigms into hybrid models that balance interpretability, precision, and reliability. To achieve optimal integration of physics-based models with AI for machinery prognostics, several key strategies can be employed, such as “Hybrid Approaches”, “Causal Machine Learning”, “Transfer Learning”, and “Statistical and Stochastic Methods”.
Combining physics-based models with AI techniques can enhance the generalizability and accuracy of predictive models. Physics-based models provide a deep understanding of mechanical components and their degradation mechanisms, while AI models, particularly deep learning, can handle complex scenarios and learn from vast amounts of condition monitoring data [
7,
25,
26]. Integrating causal reasoning into AI models can address limitations in interpretability and robustness. This approach helps in understanding the cause-effect relationships in machinery failures, which is crucial for accurate prognostics (Paredes & Reis, 2025) [
60]. Utilizing transfer learning can overcome challenges such as high algorithm complexity and the need for extensive modeling data. This technique allows AI models to leverage pre-existing knowledge, improving their performance and generality in fault diagnosis and RUL prediction [
27]. Incorporating statistical algorithms (e.g., Trend Extrapolation, ARMA) and stochastic algorithms (e.g., Bayesian Network, Particle filter) with AI models can further refine predictions by capturing uncertainties inherent in machine learning predictions [
26].
The benefits of integrating physics-based models with AI for machinery prognostics include the following:
Combining the strengths of both approaches leads to higher prediction accuracy.
Physics-based models provide a clear understanding of the underlying mechanisms, while AI models offer powerful data-driven insights.
Hybrid models can better handle complex and nonlinear systems, improving reliability in diverse working conditions.
On the other hand, integrating physics-based models with AI for machinery prognostics poses the following challenges:
Complexity: Integrating these models requires sophisticated techniques and comprehensive knowledge of both domains.
Data Requirements: High-quality and extensive data are essential for training effective AI models.
In summary, the optimal integration of physics-based models with AI for machinery prognostics involves leveraging hybrid approaches, causal machine learning, transfer learning, and statistical methods to enhance accuracy, interpretability, and robustness [7,25,26,27,60].
3.2.4. Frameworks for Human-AI Collaboration in Machinery Health Management
In practice, engineers and operators remain central to diagnosis and maintenance. The unresolved issue is designing human-in-the-loop AI systems that combine algorithmic predictions with domain expertise, improving trust, usability, and adoption. Key Frameworks and Approaches include “Human-in-the-Loop (HITL) Approaches”, “MELDS Framework”, “Human-Machine Cooperation”, “Human-Machine Collaborative Decision-making (HMCD) Systems” and “Adaptive Confidence Penalty Method”.
Human-in-the-Loop (HITL) Approaches emphasize human control over decisions, ensuring that AI systems provide explanations that align with human reasoning and expertise. This maintains effective oversight and enhances decision quality (Herrera, 2025) [23]. Human-Machine Cooperation involves shared control and high levels of communication and interaction with complex information. This approach uses machine learning to analyze data in conjunction with human opinions, enhancing decision-making processes (Zhao et al., 2024) [63]. The Adaptive Confidence Penalty Method provides calibrated confidence estimates, serving as a bridge between humans and intelligent models for collaborative fault diagnosis in mechanical systems [64].
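One simple way to picture how confidence estimates can mediate between a model and an engineer is a confidence-gated hand-off, sketched below. This is a generic illustration and not the Adaptive Confidence Penalty Method itself. The class labels, the raw scores, and the 0.8 threshold are hypothetical, and the sketch assumes the model's probabilities are already reasonably calibrated.

```python
import math

def softmax(scores):
    """Convert raw class scores into probabilities."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_diagnosis(scores, labels, threshold=0.8):
    """Accept the model's label when it is confident, otherwise defer to an engineer."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    decision = "auto" if probs[best] >= threshold else "human_review"
    return labels[best], probs[best], decision

# Hypothetical fault classes and model scores.
labels = ["healthy", "bearing_fault", "misalignment"]
confident = route_diagnosis([4.0, 0.5, 0.2], labels)
uncertain = route_diagnosis([1.0, 0.9, 0.8], labels)
print(confident)
print(uncertain)
```

The first case has one dominant score and is handled automatically. The second has nearly equal scores, so the diagnosis is routed to a human reviewer.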
Trust, communication, and shared understanding between humans and AI are crucial. Explainability and interpretability of AI decisions are essential for successful collaboration [40,65]. Tailored frameworks that account for sector-specific nuances, such as those in healthcare, infrastructure, and defense, provide more accurate and actionable insights [40,65].
These frameworks ensure that human-AI collaboration in machinery health management is effective, reliable, and enhances decision-making processes by leveraging the strengths of both human judgment and AI capabilities.
3.3. Theme 3: Deployment, Operations, and Decision Intelligence
This subsection addresses the translational challenges of implementing AI solutions in industrial settings, moving from laboratory prototypes to robust, real-world systems. It delineates the operational, cybersecurity, and decision-support considerations necessary for effective deployment, while also highlighting the role of edge computing, real-time analytics, and human-in-the-loop systems in driving actionable intelligence.
Figure 8 illustrates the deployment and operational layer of machine health monitoring, where inputs from sensors, IoT devices, edge AI, and cybersecurity safeguards are processed through analytics and decision support systems to produce actionable outputs in the form of predictive maintenance and operational optimization.
Table 3 provides the mapping among research questions, sub-thematic and associated references from the current body of knowledge.
Table 3.
Four research questions under “Deployment, operation, and decision intelligence”.
Research Question | Sub-Thematic | References |
---|---|---|
10. How can AI-based machinery health monitoring ensure cybersecurity, privacy, and resilience in Industry 4.0/IoT environments? | Security, privacy, and industrial resilience | [8,11,18,21,22,65,66,67] |
11. What benchmarks, standards, and evaluation protocols are needed to accelerate AI adoption in machine health monitoring? | Benchmarking, standards, and validation | [9,16,17,18,19,20] |
12. How can AI enable real-time, resource-efficient machine health monitoring on edge and embedded devices? | Edge, embedded, and real-time systems | [68,69,70,71,72,73,74,75,76,77,78,79] |
13. How can AI-driven health monitoring be integrated into predictive maintenance strategies for cost-benefit optimization? | Predictive maintenance and decision optimization | [20,47,80,81,82,83,84,85,86,87,88,89] |
3.3.1. Ensuring Cybersecurity, Privacy, and Resilience in AI-Based Machinery Health Monitoring for Industry 4.0/IoT Environments
The increasing reliance on edge AI, cloud computing, and IoT sensors introduces vulnerabilities such as adversarial attacks, spoofed signals, and data leakage. Ensuring secure, privacy-preserving, and attack-resilient MHM systems is an urgent research frontier.
Cybersecurity Measures include “Encryption and Access Controls”, “Continuous Monitoring and Incident Response”, and “Intrusion Detection Systems (IDS)”. Strong encryption and access limits are essential to protect data and networks from cyberattacks in Industry 4.0 environments [8]. Implementing ongoing monitoring and having robust incident response plans can help detect and mitigate cyber threats promptly [8]. IDS are crucial for defending Industrial Internet of Things (IIoT) networks against cyber threats, ensuring the security of industrial data [21].
Privacy Protection includes “Sensitive Data Handling” and “Nonrepudiation and Privacy Principles”. AI-powered health monitoring systems collect and analyze sensitive operational data, necessitating privacy protection techniques and distributed learning strategies to safeguard this information [21]. Adhering to principles such as confidentiality, integrity, availability, authenticity, nonrepudiation, and privacy is vital in maintaining secure and private industrial operations [8].
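To make the integrity and authenticity principles above concrete, the sketch below signs a sensor reading with a keyed hash (HMAC-SHA256) so that a tampered or spoofed message can be rejected. The key, the sensor name, and the message format are hypothetical. A shared-key scheme like this one does not provide nonrepudiation, which would need asymmetric signatures.

```python
import hashlib
import hmac
import json

# Hypothetical shared secret; a real deployment would provision and rotate keys securely.
SHARED_KEY = b"demo-key-not-for-production"

def sign_reading(reading, key=SHARED_KEY):
    """Serialize a reading and compute an authentication tag over the bytes."""
    payload = json.dumps(reading, sort_keys=True).encode("utf-8")
    tag = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return payload, tag

def verify_reading(payload, tag, key=SHARED_KEY):
    """Recompute the tag and compare in constant time."""
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, tag)

payload, tag = sign_reading({"sensor": "vib-01", "rms": 0.42})
tampered = payload.replace(b"0.42", b"0.05")  # simulate a spoofed value in transit
print(verify_reading(payload, tag), verify_reading(tampered, tag))
```

The untouched message verifies, while the altered one fails the check, so a monitoring pipeline could discard it before it reaches the diagnostic model.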
Resilience Enhancement includes “Predictive Maintenance”, “Supply Chain Agility”, and “Human-Centric Approaches”. Utilizing AI and IoT for real-time health monitoring of machinery can enhance predictive maintenance, reducing downtime and optimizing production schedules [65]. Integrating Industry 4.0 technologies can fortify supply chains, providing real-time visibility and enabling rapid detection and response to disruptions [18,65]. Emphasizing human-centric approaches in Industry 5.0 can further enhance resilience by ensuring safe and inclusive workplaces [66].
Technological Integration includes “IoT and Big Data Analytics” and “Cloud Computing”. IoT and Big Data Analytics technologies facilitate real-time data exchange and monitoring, driving decision-making and predictive maintenance (Saleh & AlShafeey, 2025) [67]. Cloud Computing provides scalable resources for extensive data storage and processing, supporting the health monitoring systems [67].
By implementing these measures, AI-based machinery health monitoring can ensure cybersecurity, privacy, and resilience in Industry 4.0/IoT environments, thereby maintaining operational integrity and competitive advantage.
3.3.2. Benchmarks, Standards, and Evaluation Protocols for AI in Machine Health Monitoring
The large-scale adoption of AI-driven machine health monitoring (MHM) is constrained by the lack of universally accepted benchmarks and validation protocols. While numerous datasets and evaluation methods exist, most remain fragmented, domain-specific, or insufficiently standardized, thereby limiting reproducibility and comparability across industrial sectors. As highlighted in prior work [9], without cohesive benchmarks, AI-based prognostics and health management tools cannot achieve the level of trust and regulatory compliance necessary for enterprise-scale deployment.
Robustness Evaluation and Real-World Validation: Robustness remains a critical metric, as AI systems must sustain performance across heterogeneous assets and operating conditions, for example in turbines, aircraft, or nuclear reactors. Benchmarking efforts increasingly emphasize real-world robustness, validating AI against dynamic, noisy, and evolving industrial environments. Evaluating system behavior under such scenarios ensures that AI-based MHM transcends laboratory prototypes and provides reliable, real-time decision support in critical infrastructure (Vanraj et al., 2018) [16].
Regulatory Standards and Compliance: The introduction of the European Union’s Artificial Intelligence Act in 2024 underscores the urgency of adopting compliance frameworks for industrial AI. For MHM, this translates into structured evaluation protocols encompassing setup phases, baseline data collection, stability assessments, periodic recalibration, and long-term system reassessment. By aligning machine health benchmarks with emerging regulatory mandates, industries can ensure transparency, fairness, and operational safety (Duan et al., 2024) [17].
Fleet-Based and Comparative Benchmarking: A practical approach for benchmarking in industrial contexts is fleet-based comparison, where the collective behavior of similar machines is used as a baseline. Healthy operating conditions are assumed for the majority of machines, with deviations signaling potential faults. This method reduces dependence on large historical datasets, accelerates validation, and empowers domain experts to define context-specific measures of performance (Hendrickx et al., 2020) [18].
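The fleet-based idea can be sketched with a robust outlier rule: take the fleet median as the healthy baseline and flag machines that deviate strongly from it. The modified z-score (scaling constant 0.6745 and cutoff 3.5) is a common heuristic chosen here for illustration. It is not necessarily the measure used in the cited work, and the machine names and readings are invented.

```python
from statistics import median

def fleet_outliers(readings, threshold=3.5):
    """Flag machines whose reading deviates strongly from the fleet median."""
    values = list(readings.values())
    center = median(values)
    # Median absolute deviation: a spread estimate that a single faulty machine barely moves.
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        return []  # all machines behave identically; nothing to flag
    return [
        name
        for name, value in readings.items()
        if abs(0.6745 * (value - center) / mad) > threshold
    ]

# Hypothetical vibration RMS values (mm/s) for five similar pumps.
fleet = {"pump_a": 2.1, "pump_b": 2.3, "pump_c": 2.0, "pump_d": 2.2, "pump_e": 4.8}
flagged = fleet_outliers(fleet)
print(flagged)
```

Because the baseline comes from the fleet at a single point in time, no historical fault data is required, which mirrors the reduced dependence on large datasets noted above.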
Towards Predictive and Proactive Maintenance: Recent advances in machine learning benchmarks extend beyond static fault classification to predictive maintenance frameworks. AI-driven models process continuous streams of sensor data to diagnose faults in real time and forecast potential degradations. These predictive benchmarking practices represent a shift from reactive diagnostics to proactive and resource-optimized maintenance strategies, as evidenced in manufacturing contexts (Zhang et al., 2025; Bellido-Lopez et al., 2025) [19,20].
Emerging Trends in Continuous Benchmarking: Contemporary perspectives argue that benchmarking should not be a one-off validation exercise but a continuous process. This includes ongoing system reassessment, policy recommendations, and adaptive recalibration of models, ensuring that benchmarks remain effective under evolving industrial and regulatory landscapes (Duan et al., 2024) [17]. Moreover, integration of AI evaluation with structural health monitoring (SHM) systems offers a path toward holistic, ICT-enabled monitoring frameworks that unify predictive maintenance, fault diagnosis, and safety compliance (Vanraj et al., 2018) [16].
In summary, advancing AI benchmarks for machine health monitoring requires a dual emphasis on regulatory compliance and technical robustness. Regulatory frameworks such as the EU AI Act provide governance structures, while industrially grounded approaches—such as fleet-based comparisons, real-world robustness testing, and predictive maintenance benchmarking—address practical validation needs. Together, these emerging standards and protocols chart the path toward scalable, trustworthy, and resource-efficient AI-enabled MHM.
3.3.3. How AI Enables Real-Time, Resource-Efficient Machine Health Monitoring on Edge and Embedded Devices
Within the existing body of knowledge, AI and edge computing integration centers on “Real-time Monitoring”, “Privacy Protection”, and “Energy Efficiency”. The convergence of artificial intelligence and edge computing has transformed machinery health monitoring by enabling real-time, resource-efficient operations directly at the device level. Unlike traditional cloud-centric approaches, which incur latency and bandwidth costs, edge AI processes data locally, ensuring that health assessments and predictive analytics are executed with minimal delay (as seen in Figure 9). This architecture is crucial for industrial cyber-physical systems where rapid fault prediction and intervention can prevent costly failures and production downtime (Singh & Reddy, 2025) [72].
Local Data Processing for Real-Time Monitoring: By deploying AI models on embedded devices, sensor data can be analyzed at the network edge without requiring continuous cloud interaction. This reduces reliance on centralized servers, improves responsiveness, and supports early detection of mechanical anomalies. Real-time monitoring ensures that maintenance actions can be initiated proactively, mitigating risk before fault escalation (Saini & Raj, 2022) [68,73].
Latency Reduction and Bandwidth Efficiency: Processing near the data source minimizes transmission delays and reduces the need to offload massive sensor streams to remote infrastructures. This not only accelerates decision-making but also lowers network strain, making the system viable in bandwidth-constrained or intermittently connected environments (Zhang et al., 2025; Gao et al., 2025) [69,74,75].
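The bandwidth argument can be illustrated with a small sketch in which an edge node reduces a raw vibration window to a few condition indicators and transmits only the summary. The signal, window length, chosen features, and JSON payload format are assumptions made for illustration.

```python
import json
import math

def summarize_window(samples):
    """Reduce a raw vibration window to a few condition indicators."""
    n = len(samples)
    rms = math.sqrt(sum(x * x for x in samples) / n)
    peak = max(abs(x) for x in samples)
    return {"rms": round(rms, 4), "peak": round(peak, 4), "crest": round(peak / rms, 4)}

# Synthetic window (assumed): 1024 samples of a 50 Hz sine sampled at 1 kHz.
window = [math.sin(2 * math.pi * 50 * i / 1000) for i in range(1024)]

# Compare the size of shipping every sample against shipping the summary only.
raw_bytes = len(json.dumps(window).encode("utf-8"))
summary = summarize_window(window)
summary_bytes = len(json.dumps(summary).encode("utf-8"))
print(summary, f"payload reduced about {raw_bytes / summary_bytes:.0f}x")
```

The trade-off is that the cloud side no longer sees the raw waveform, so in practice an edge node might still upload full windows when an indicator crosses an alert level.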
Optimized Resource Utilization: Edge-deployed AI models are often lightweight, specifically designed for constrained environments with limited memory, computation, and energy capacity. Tools and frameworks streamline deployment, ensuring that models remain up-to-date and efficient even under resource restrictions (Himeur et al., 2024; Bhoj & Bhadoria, 2022) [76,77].
Intelligent Decision-Making and Predictive Maintenance: By leveraging machine learning for pattern recognition in streaming data, edge devices can forecast degradation trajectories and predict failures before they occur. Such localized predictive capabilities support adaptive maintenance strategies, allowing interventions only when necessary, thereby optimizing costs and extending equipment life (Ahmed Murtaza et al., 2024; Gao et al., 2025) [75,78]. Importantly, embedded AI algorithms can dynamically adjust operational parameters and even facilitate autonomous decision-making in complex environments. Recent studies in intelligent sensing and measurement for robotics demonstrate how embedded AI can regulate actions in real time, providing a useful analogy for industrial health monitoring systems (Wang et al., 2025) [79].
Enhanced Responsiveness with Digital Twins: The integration of digital twin technology with edge nodes further augments system fidelity. Digital replicas of machinery combined with AI-driven analysis enable real-time simulation of operational states, improving the accuracy of anomaly detection and dynamic response mechanisms (Zhang et al., 2025) [74].
In sum, edge-enabled AI ensures privacy-preserving, low-latency, and resource-efficient health monitoring. It allows industrial assets to function as intelligent, self-sufficient nodes within wider IoT ecosystems, thereby improving resilience, reducing downtime, and achieving operational efficiency in Industry 4.0 and 5.0 environments [68,69,72,73,74,75,76,77,78,79].
3.3.4. Integration of AI-Driven Health Monitoring into Predictive Maintenance Strategies
Moving from fault detection to actionable maintenance decisions requires linking AI outputs with maintenance scheduling, downtime costs, and spare-parts logistics. The challenge is optimizing AI systems not only for accuracy but also for economic and operational impact. AI-driven health monitoring can be effectively integrated into predictive maintenance strategies to optimize cost-benefit outcomes. The key points are portrayed in Figure 10.
AI methods, including machine learning models, enhance equipment health monitoring by using real-time data and advanced analytics to anticipate failures, thus improving operational efficiency and system reliability [20,80]. AI integration into maintenance strategies allows for predictive analytics, which can forecast equipment failures and optimize maintenance schedules. This proactive approach minimizes downtime and extends equipment lifespan [81,82,83]. Embedding AI-driven predictions into cost-optimization models improves accuracy and practical applicability. AI-based techniques manage data uncertainty and integrate multiple subsystems, optimizing maintenance using various health indicators, including economic factors (as discussed in Alabdullh et al., 2024) [84]. Predictive maintenance facilitated by AI reduces energy use, waste, and environmental impact, contributing to more sustainable operations. This is particularly beneficial for small and medium-sized enterprises (SMEs) by mitigating costs and reducing the need for large-scale investments [82,85]. The use of IoT sensors, digital twins, and AI in predictive maintenance provides real-time insights into equipment conditions, enabling adaptive maintenance strategies and supporting decision-making processes [85,86].
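A minimal way to connect a predicted failure probability to a maintenance decision is an expected-cost comparison, sketched below. The single-period view, the assumption that preventive service fully averts the failure, and all cost figures are simplifications introduced here for illustration.

```python
def expected_costs(p_fail, cost_failure, cost_preventive):
    """Expected cost of waiting versus servicing now (single-period view)."""
    return {"wait": p_fail * cost_failure, "service": cost_preventive}

def recommend(p_fail, cost_failure, cost_preventive):
    """Choose the action with the lower expected cost."""
    costs = expected_costs(p_fail, cost_failure, cost_preventive)
    return "service" if costs["service"] < costs["wait"] else "wait"

def break_even(cost_failure, cost_preventive):
    """Failure probability above which servicing becomes the cheaper option."""
    return cost_preventive / cost_failure

# Hypothetical figures: an unplanned failure costs 50,000 and a planned service costs 4,000.
low_risk = recommend(0.05, 50_000, 4_000)
high_risk = recommend(0.15, 50_000, 4_000)
print(low_risk, high_risk, break_even(50_000, 4_000))
```

With these numbers the break-even probability is 0.08, so a model output of 0.05 favors waiting and 0.15 favors servicing. Downtime, spare-parts lead times, and scheduling constraints would enter a fuller model as additional cost terms.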
AI-driven predictive maintenance enhances quality, safety, availability, and cost reduction in industrial plants by enabling proactive and accurate maintenance decisions based on real-time health assessments and risk evaluations [87,88].
By leveraging these AI-driven techniques, industries can achieve significant cost savings, improved productivity, and enhanced overall performance in their maintenance operations.
4. Comprehensive and Critical Bibliometric Analysis
This section presents a critical bibliometric analysis of the reference list, using specific metrics and examples to highlight its analytical value.
4.1. Publication Trend Analysis
The temporal distribution of the selected studies demonstrates a clear growth trajectory in AI-driven machine health monitoring research. Based on the final set of 87 references, the publication counts per year were systematically calculated by extracting the publication year from each reference entry (excluding years appearing within DOIs or page ranges to avoid duplication errors). The resulting distribution shows: 1 paper in 2012, 1 paper in 2018, 3 papers in 2019, 3 papers in 2020, 4 papers in 2021, 9 papers in 2022, 7 papers in 2023, 22 papers in 2024, 35 papers in 2025, and 2 papers in 2026.
This trend highlights that the field has only recently accelerated, with a modest number of publications prior to 2020, followed by a steep increase in the last three years. The surge in 2024 and 2025 indicates that machine learning, explainable AI, and predictive maintenance applications for machinery are emerging as dominant focal areas. The early publications (2012 and 2018) represent isolated contributions, but from 2019 onwards, consistent growth reflects the consolidation of AI methods into mainstream machine health monitoring research.
Figure 11 visualizes this trend, showing both the early scarcity of contributions and the rapid expansion of literature in the most recent years.
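The yearly counts reported above can be checked with a short tally. The sketch below only re-adds the figures stated in the text; the variable names are illustrative.

```python
# Publication counts per year, as reported in the text above.
counts = {
    2012: 1, 2018: 1, 2019: 3, 2020: 3, 2021: 4,
    2022: 9, 2023: 7, 2024: 22, 2025: 35, 2026: 2,
}

total = sum(counts.values())
recent = counts[2024] + counts[2025]
share = recent / total
before_2020 = sum(n for year, n in counts.items() if year < 2020)

print(f"total references: {total}")
print(f"published in 2024-2025: {recent} ({share:.1%})")
print(f"published before 2020: {before_2020}")
```

The counts add up to the 87 references stated above, with 57 of them (about 65.5%) published in 2024 and 2025 and only 5 before 2020.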
4.2. Thematic Focus and Research Concentration
A critical analysis of the themes reveals where research efforts are most concentrated and which areas are potentially under-explored. The reference list is organized around 13 key research questions, with a total of 85 references.
The most researched topics are:
Predictive Maintenance and Decision Optimization: With 10 references, this is the most heavily cited theme, highlighting its importance for cost-benefit optimization.
Edge, Embedded, and Real-Time Systems: This topic has 9 references, reflecting the growing need for efficient and low-latency health monitoring solutions on local devices.
Generative AI for Rare Events and Data Augmentation and Human-AI Collaboration: Each of these themes has 7 references, indicating their emerging importance in addressing data scarcity and integrating human expertise with AI systems.
In contrast, topics with fewer references, such as Data challenges and robustness (4 references) and Benchmarking, standards, and validation (4 references), suggest these areas are less explored. This is a critical finding, as this review itself notes that the lack of standardized protocols “hampers industry-wide benchmarking and real-world deployment”.
4.3. Key Authors and Publication Venues
An examination of the authors and publication venues reveals several key players and preferred outlets for this research.
4.3.1. Prominent Authors
Several authors appear in multiple references, indicating their significant contribution to the field. For example, Zamanian et al. and Zhou & Chen each appear in two different publications, while Ying et al. is cited in three papers on multimodal data fusion alone. Madani et al. and Li et al. are also prominent with multiple citations.
Figure 12 shows the co-authorship analysis of 332 authors, with the minimum number of documents per author set to 1. This results in 76 clusters with 762 links. As shown in Figure 12, the largest cluster, comprising 15 authors, is colored red. Figure 13 provides a closer view of these 15 authors and the 105 links among them.
The inclusion of this co-authorship analysis is intended to provide insights into the collaborative structure of the research community in AI-driven machine health monitoring. By visualizing author networks and clusters (as shown in Figure 12 and Figure 13), the analysis identifies key contributors, research groups, and emerging collaborations, thereby contextualizing the thematic findings of the review within the dynamics of scholarly activity. This perspective is important because sustained progress in this field depends not only on technical innovations but also on the formation of collaborative research ecosystems that bridge academia, industry, and disciplines. Thus, the co-authorship analysis complements the thematic synthesis by highlighting who is driving the field, how expertise is clustered, and where potential gaps in collaboration may exist.
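For readers unfamiliar with co-authorship networks, the sketch below shows how links and groups can be derived from author lists: every pair of authors on the same paper forms a link, and authors connected through any chain of links fall into one group. Connected components are used here as a simple stand-in for clusters; the tool behind Figures 12 and 13 may apply a different clustering rule. The author labels are placeholders.

```python
from itertools import combinations

def coauthor_network(papers):
    """Return (number of authors, number of links, number of connected groups)."""
    links = set()
    parent = {}

    def find(author):
        # Union-find lookup with path halving.
        while parent[author] != author:
            parent[author] = parent[parent[author]]
            author = parent[author]
        return author

    for authors in papers:
        for author in authors:
            parent.setdefault(author, author)
        # Every pair of co-authors on one paper is linked.
        for a, b in combinations(sorted(set(authors)), 2):
            links.add((a, b))
            parent[find(a)] = find(b)

    groups = {find(author) for author in parent}
    return len(parent), len(links), len(groups)

# Placeholder author lists for four papers.
papers = [["A", "B", "C"], ["C", "D"], ["E", "F"], ["G"]]
print(coauthor_network(papers))
```

As a consistency check on the figures quoted above, 15 authors who are all mutually linked share 15 × 14 / 2 = 105 links, which matches the link count reported for the largest cluster.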
4.3.2. Core Journals
The concentration of papers in specific journals suggests they are central to this research domain. Examples of high-activity journals include Heliyon and Advanced Engineering Informatics, which appear multiple times in the reference list.
This analysis shows that the research community is coalescing around key researchers and a select group of publication venues, a common characteristic of a maturing field.
5. Discussion
The findings of this systematic review underscore the transformative potential of artificial intelligence (AI) in advancing machinery health monitoring (MHM) within the paradigm of Industry 4.0. By synthesizing evidence from 85 peer-reviewed studies, this work highlights critical methodological and application-oriented innovations that collectively move the field towards more resilient, adaptive, and scalable solutions.
Table 4 consolidates critical analysis of the 85 included studies, showing how the literature was synthesized into themes, methods, insights, and gaps to inform a forward-looking research agenda. It highlights representative methods from the literature, the key insights gained, and the gaps that remain to be addressed. This table offers a concise, at-a-glance overview of the field, complementing the more detailed thematic discussions in the manuscript.
5.1. Implications of the Research
This study elucidates that addressing core challenges such as data scarcity, imbalance, and noise through techniques like generative AI [5,13], transfer learning [12,27], and multimodal fusion [3,4,14] is paramount for enhancing fault diagnosis and prognostics. These insights have immediate industrial relevance; for example, the ability to simulate rare failure events can mitigate costly downtimes, while robust data fusion can improve early anomaly detection across heterogeneous sensor networks.
Table 4.
Comparative Summary of Research Themes, Representative Methods, Key Insights, and Gaps in AI-Based Machine Health Monitoring.
Research Theme | Representative Methods/Approaches | Consolidated Key Insights | Gaps & Priorities |
---|---|---|---|
Data-Centric Challenges & Innovations | Generative AI for rare events and augmentation ([5,13,28,29,48,49]); Transfer/few-shot learning; Multimodal fusion ([3,4,14,15,46,47]); AI-augmented sensing (vision/structural) ([10,11]) | Synthetic data, transfer learning, and multimodal fusion mitigate scarcity/imbalance, while AI-augmented sensing improves fidelity of anomaly detection. | Limited benchmarking of synthetic vs. real data; validation remains weak |
Model Design, Transparency & Hybridization | XAI (SHAP, Grad-CAM, LIME) ([6,10,30,57,58,59]); Physics–AI hybrid models ([7,25,26,27,43,60,61,62]); Domain adaptation & generalization ([51,52,53,54,55,56]); Human-in-the-loop frameworks ([23,24,63,64]) | XAI and hybridization enhance interpretability and trust; human-in-the-loop designs improve accountability; initial evidence of cross-domain generalization is emerging. | Lack of standardized XAI protocols, limited causal/physics integration in deep models, few foundation-scale cross-asset studies, and limited industrial evaluation of human-in-the-loop frameworks. |
Deployment, Operations & Decision Intelligence | Edge/embedded real-time AI ([68,69,70,71,75,76,77,78,79]); Cybersecurity, privacy, resilience ([8,21,22,65,66,67]); Standards & benchmarking ([9,16,17,18,19,20]); Predictive maintenance optimization ([20,47,80,81,82,83,84,85,86,87,88,89]) | Edge AI reduces latency and enables on-asset inference; cybersecurity/privacy are increasingly foregrounded; predictive-maintenance studies connect diagnostics to cost/risk; standards discussions gaining momentum. | Industrial-scale deployment trials are rare; security-by-design practices not systematic; benchmarking datasets and KPIs remain fragmented; economic/sustainability factors under-integrated. |
Cross-Cutting Concerns | Standards & protocols (ISO, IEC, ITU/WHO) ([9,16,17,18,19,20]); Benchmark datasets (C-MAPSS, XJTU-SY); Reproducibility tools; Socio-technical trust frameworks | Standards and reference datasets enable comparability and reproducibility; bibliometric evidence shows fragmentation across datasets and metrics; reproducibility tools and trust frameworks are emerging. | No unified industrial dataset or KPI suite; limited socio-technical evaluation (trust, regulation, compliance); need for open-source testbeds and shared industrial benchmarks. |
Furthermore, the integration of explainable AI (XAI) techniques [6,18,30,40,42] directly addresses trust and regulatory hurdles, thereby accelerating adoption in safety-critical domains. The hybridization of AI with physics-based models [7,26,60] offers a pathway to balance interpretability and predictive accuracy, an area crucial for long-term industry compliance and reliability.
AI provides a distinct advantage over conventional machine-learning-based sensing approaches by enabling multimodal data fusion, generative augmentation, and adaptive learning in dynamic environments. Traditional machine learning sensors often rely on fixed feature extraction pipelines and struggle with evolving degradation modes or rare failure events. In contrast, AI-enhanced sensing frameworks incorporate deep learning and transfer learning to automatically extract hierarchical features from complex data such as vibration, acoustic, and vision streams, thereby improving fidelity and adaptability in real-time monitoring. Moreover, AI methods can integrate generative modeling to simulate underrepresented fault scenarios, which standard machine learning sensors cannot achieve, thus reducing the imbalance between normal and abnormal samples. Finally, AI-augmented sensors can embed explainability layers (for example, Class Activation Maps or attention mechanisms) that not only enhance diagnostic accuracy but also provide transparent decision support for engineers. These capabilities demonstrate that AI-based sensors not only offer potential but have already shown measurable improvements in robustness, generalizability, and interpretability compared with classical machine-learning-driven sensing systems.
5.2. Significance of the Research
The significance of this review lies in its comprehensive mapping of research questions to emerging solutions. The identification of underexplored themes such as standardization and benchmarking [9,16,17,18,19,20], and cybersecurity and privacy in AI-driven MHM [8,21,22], signals critical gaps that must be addressed to transition from experimental prototypes to enterprise-wide deployment. Moreover, this work validates the growing emphasis on human-AI collaboration [23,24], reflecting a shift towards socio-technical systems where operators remain central actors in decision-making. The bibliometric analysis further illustrates a rapidly expanding field, with a notable concentration of scholarship in predictive maintenance and edge computing, confirming their strategic importance for achieving real-time operational resilience [20,68,72,73,74,75,76,77,78,79,87].
5.3. Limitations
Although this study provides a comprehensive synthesis of AI-driven machine health monitoring literature, several limitations must be acknowledged. First, the review was restricted to peer-reviewed, English-language publications, which may introduce a language and regional bias by excluding relevant studies from non-English or industry-specific sources. Second, the reliance on major databases such as Scopus and Web of Science, while ensuring quality, may omit grey literature, technical reports, or emerging conference proceedings that could offer valuable insights. Third, publication bias inherent in high-impact venues may overrepresent successful applications and underreport negative or inconclusive findings. Future work can mitigate these limitations by incorporating multilingual search strategies, expanding to diverse databases and repositories, and integrating practitioner perspectives or industry case studies to enhance representativeness and practical relevance.
5.4. Future Directions
While AI-augmented sensors offer significant advantages in fidelity and adaptability, they also present notable drawbacks. These include high implementation costs, increased energy consumption on edge devices, data privacy risks, and vulnerability to adversarial noise or sensor drift, all of which may compromise reliability in long-term industrial deployment. Building on these findings, several avenues for future inquiry emerge. First, research must prioritize the creation of standardized datasets and open benchmarking protocols to enhance reproducibility and comparability [9,16,17,18,19,20]. Second, more emphasis should be placed on developing energy-efficient, explainable models suitable for edge and embedded environments [69,72,73,74,75,76,77,78,79]. Third, there is an urgent need to strengthen the cybersecurity and privacy frameworks for AI-based MHM, particularly as systems become increasingly interconnected and data-sensitive [8,21,66]. Lastly, future studies should integrate economic, environmental, and human factors into predictive maintenance frameworks, linking technical insights with broader organizational and sustainability goals [82,84,88]. These directions collectively underscore the need for multidisciplinary efforts, combining expertise from data science, engineering, human factors, and policy to fully realize AI’s potential in machinery health management.
In conclusion, this review not only consolidates the current knowledge base but also delineates a forward-looking research agenda, positioning AI-enabled MHM as a cornerstone for resilient, transparent, and sustainable industrial operations.
6. Conclusions
This systematic review consolidates evidence from 85 rigorously screened studies, demonstrating the rapid maturation of AI-driven machine health monitoring. The bibliometric analysis revealed a steep rise in publications, beginning with 1 study in 2012 and reaching 35 in 2025. The pronounced surge in the last three years highlights the transition of the field from niche experimentation to mainstream industrial relevance.
The review identifies predictive maintenance, edge and embedded intelligence, and generative AI for rare events as the most mature and impactful research streams, while also highlighting underexplored yet strategically vital areas such as standardization, benchmarking, and cybersecurity resilience. By employing 13 structured research questions, this study moves beyond descriptive synthesis and provides an evaluative framework that links methodological innovation with unresolved industrial challenges.
The critical contribution of this paper lies in its ability to catalogue existing methods while simultaneously interrogating their limitations, including insufficient validation of synthetic data, lack of cross-asset generalization, and the scarcity of industry-scale deployment trials. By foregrounding these limitations, the review outlines a forward-looking agenda where reproducible benchmarks, socio-technical trust frameworks, and sustainability metrics become essential for future advancement.
In conclusion, this paper offers both a consolidated evidence base and a critical interpretive lens, positioning AI-enabled machine health monitoring as a cornerstone of Industry 4.0 and Industry 5.0 transitions. It provides actionable insights for both scholars and practitioners by clarifying where the field has advanced, where it remains fragmented, and where collaborative innovation is most urgently required.