Condition Monitoring and Predictive Maintenance in Industrial Equipment: An NLP-Assisted Review of Signal Processing, Hybrid Models, and Implementation Challenges

Garcia, Jose; Rios-Colque, Luis; Peña, Alvaro; Rojas, Luis

doi:10.3390/app15105465

Open Access Systematic Review

¹ Escuela de Ingeniería de Construcción y Transporte, Pontificia Universidad Católica de Valparaíso, Valparaíso 2362804, Chile

² Doctorado en Industria Inteligente, Facultad de Ingeniería, Pontificia Universidad Católica de Valparaíso, Valparaíso 2340000, Chile

* Author to whom correspondence should be addressed.

Appl. Sci. 2025, 15(10), 5465; https://doi.org/10.3390/app15105465

Submission received: 18 April 2025 / Revised: 8 May 2025 / Accepted: 9 May 2025 / Published: 13 May 2025

Abstract

Failures in critical industrial components (bearings, compressors, and conveyor belts) often lead to unplanned downtime, high costs, and safety concerns. Traditional diagnostic approaches underperform in noisy or changing environments due to heavy reliance on manual feature engineering and rule-based systems. In response, advanced machine learning, deep learning, and sophisticated signal processing techniques have emerged as transformative solutions for fault detection and predictive maintenance. To address the complexity of these advancements and their practical implications, this review combines analyses from large language models with expert validation to categorize key methodologies—spanning classical machine learning models, deep neural networks, and hybrid physics–data approaches. It also explores essential signal processing tools (e.g., Fast Fourier Transform (FFT), wavelets, and Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN)) and methods for estimating Remaining Useful Life (RUL) while highlighting major challenges such as the scarcity of labeled data, the need for model explainability, and adaptation to evolving operational conditions. By synthesizing these insights, this article offers a path forward for the adoption of new technologies (deep learning, IoT/Industry 4.0, etc.) in complex industrial contexts, anticipating the collaborative and sustainable paradigms of Industry 5.0, where human–machine collaboration and sustainability play central roles.

Keywords:

predictive maintenance; fault diagnosis; machine learning; deep learning; signal processing; Industry 5.0

1. Introduction

Fault diagnosis in critical industrial assets—including bearings, compressors, hydraulic pumps, and conveyor belts—remains essential to minimise downtime, financial losses, and safety hazards [1,2,3]. Predictive maintenance seeks to detect incipient faults before catastrophic failure [4], yet traditional diagnostic techniques face persistent obstacles such as data scarcity, non-stationary operating conditions, and elevated noise levels [5].

From a macro-economic standpoint, the International Energy Agency (IEA) estimates that unplanned outages cost energy-intensive industries nearly USD 50 billion every year, with individual facilities losing 3–5% of their annual production capacity as a direct consequence of equipment failures [6]. The same analysis shows that data-driven predictive-maintenance programmes—combining high-frequency sensing with machine learning analytics—can cut downtime by 10–20%, yielding global savings of USD 8–15 billion per annum while simultaneously curbing energy waste and associated CO₂ emissions. Quantifying these losses highlights the concrete economic value of predictive maintenance in reducing capacity shortfalls and provides a compelling motivation for the systematic review presented here.

A major limitation is the shortage of representative fault data, especially for specialised systems like rolling bearings and hydraulic circuits [7]. In addition, variable speeds and loads, together with environmental interference, complicate the extraction of reliable features [8,9,10]. These problems are amplified in real-world plants, where fluctuating ambient conditions can mask fault signatures [11]. Moreover, manual, expert-driven feature engineering is time-consuming, subjective, and difficult to scale across heterogeneous machinery [12]. To address these issues, recent frameworks such as that proposed by Wang et al. [13] introduce refined entropy measures and support vector classifiers to enhance diagnostic accuracy under low-data conditions.

Maintenance practice spans a continuum from time-based servicing, where assets are replaced or overhauled at fixed intervals; through condition-based maintenance (CBM), which triggers work when measurable indicators cross preset thresholds; to predictive maintenance (PdM), which forecasts failures in advance. Recent surveys on industrial equipment highlight three dominant technical streams. First, signal-driven CBM schemes still rely on straightforward statistical limits applied to vibration or current signatures collected from bearings, pumps, or conveyor drives [2,3]. Second, physics-driven approaches embed mechanistic wear or leakage models, e.g., for rolling bearings or hydraulic circuits, into Bayesian or particle-filter frameworks to refine health estimates [7]. Third, hybrid data-driven PdM combines advanced analytics such as convolutional or temporal CNN-LSTM networks with engineered features or wavelet packets, delivering early fault recognition and remaining-useful-life prediction across diverse machinery [12,14].

Machine learning (ML) and deep learning (DL) techniques have been widely adopted to overcome these drawbacks [7,14]. Convolutional and recurrent architectures automatically learn informative representations from raw sensor streams, capturing complex temporal–spatial patterns in noisy, high-dimensional data [15]. These models can adapt to changing regimes and detect subtle degradation patterns earlier than classical methods.

Nevertheless, progress is still hindered by the limited availability of diverse datasets and by the “black-box” nature of many DL models, which restricts interpretability and trust. Recent work has integrated explainable artificial intelligence (XAI) and hybrid physics-informed models to address these issues without sacrificing predictive power.

Given the rapid pace of innovation, systematic synthesis is indispensable. Recent reviews have used large language models (LLMs) to accelerate topic discovery and mapping [16,17]. Building on these advances, this study presents a hybrid systematic literature review (SLR) that couples LLM-assisted screening with expert validation. LLMs first reveal high-impact themes; experts then refine and contextualise the findings, yielding an in-depth portrait of state-of-the-art algorithms, challenges, and industrial relevance.

This review addresses five research questions: (i) Which ML and DL methods are most effective for fault diagnosis and predictive maintenance in industrial equipment? (ii) What principal challenges persist, and how can they be mitigated? (iii) How do big data, IoT, and Industry 4.0 technologies improve scalability, speed, and reliability? (iv) How do hybrid models and XAI enhance transparency and trust? (v) What bibliometric insights illuminate the trajectory of sustainable infrastructure and its link to predictive maintenance?

By combining automated text mining with domain expertise, this work delivers both breadth (spanning signal processing, classical machine learning, and deep neural approaches) and depth (addressing real-time deployment, interpretability, and Industry 5.0 principles). This synthesis aims to support the development of safer, more energy-efficient, and human-centric maintenance solutions while identifying research gaps to motivate future work. To facilitate reuse and comparison, Supplementary summary tables were compiled for each reviewed study, detailing the applied techniques, application domains, performance metrics, and reported limitations. These resources are provided as machine-readable material to assist both researchers and practitioners.

Unlike earlier surveys that relied exclusively on manual screening or fully automated text mining, this study employed a hybrid, two-stage pipeline. In the first stage, transformer-based embeddings—combined with UMAP–HDBSCAN and BERTopic—were used to cluster over 19,918 abstracts into coherent topic groups. What sets this approach apart is its explicit focus on industrial equipment: the resulting clusters revolve around five core asset classes—compressors, bearings, hydraulic pumps, conveyor systems, and cable infrastructure—allowing for targeted analysis of condition monitoring strategies across equipment types. In the second stage, a panel of domain experts refined this thematic landscape by discarding marginal topics, merging overlaps, and selecting high-quality studies that exemplify each retained cluster. This expert-in-the-loop curation ensures that the final corpus combines methodological diversity with industrial relevance. The synthesised findings are captured in structured Supplementary Tables, which catalogue the applied techniques, main application, reported metrics and outcomes, and acknowledged limitations for each study. These tables not only enable systematic cross-comparison across asset types but also serve as a reusable evidence base for both researchers and practitioners seeking to deploy predictive maintenance strategies in complex, real-world environments.

The remainder of this paper is organised as follows: Section 2 details the methodology, Section 3 provides a bibliometric overview, Section 4 reports qualitative findings, Section 5 discusses implications and future directions, and Section 6 concludes the study.

2. Methodology

Figure 1 provides an overview of the semi-automated systematic review pipeline that combines NLP and expert validation to identify, refine, and analyse key topics in condition monitoring and predictive maintenance. Figure 2 details the NLP stages described in Section 2. After an initial expert check, each topic is re-queried, and a second screening is performed against predefined relevance and quality criteria. This multi-stage selection and synthesis process adheres to the PRISMA 2020 guidelines for systematic reviews, ensuring transparency and reproducibility. A detailed compliance checklist is provided as Supplementary Material.

Step 1—Structured search. A Scopus query (below) retrieves English-language journal, conference, chapter, and review papers (2017–2024) on predictive maintenance, condition monitoring, and related AI techniques.

    TITLE-ABS-KEY("predictive maintenance" OR "condition monitoring" OR
    "prognostics" OR "failure prediction" OR "asset management")
    AND ("machine learning" OR "artificial intelligence" OR
    "deep learning" OR "neural networks")
    AND PUBYEAR > 2016 AND PUBYEAR < 2025
    AND LIMIT-TO(LANGUAGE,"English")
    AND LIMIT-TO(DOCTYPE,"ar","cp","ch","re")

Step 2—NLP-based topic discovery. Embeddings are generated with the bge-large-en-v1.5 transformer; UMAP [18] reduces dimensionality; HDBSCAN [19] clusters the points; c-TF-IDF refines keywords; and BERTopic, enhanced with SOLAR-10.7B-Instruct [20], selects representative papers [16,17].

Step 3—Expert filtering and validation. After the NLP-driven discovery phase, two domain specialists independently screen every candidate record in Zotero using the following a priori quality gate:

Citation strength. A dynamic threshold is applied to avoid bias against recent work: an article must be in at least the 60th percentile of citations for its publication year and document type (journal, conference, or review). For articles published in 2023–2024—where absolute counts are still low—we accept ≥3 citations in Web of Science.
Thematic pertinence. A paper must (i) explicitly address at least one target asset class (bearings, pumps, compressors, conveyor belts, or cables) and (ii) apply an AI/ML, signal processing, or hybrid physics–data technique to condition monitoring, fault diagnosis, or RUL estimation.
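
The citation-strength rule above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' screening script: the record keys (id, year, doctype, citations), the nearest-rank percentile definition, and the reading of the 2023–2024 clause as a replacement for the percentile test are all assumptions, and the sample records are invented.

```python
from collections import defaultdict

RECENT_YEARS = {2023, 2024}   # years handled by the fixed-count rule
RECENT_MIN_CITATIONS = 3      # ">= 3 citations" rule from the text
PERCENTILE = 60               # 60th-percentile rule from the text


def nearest_rank_percentile(values, percent):
    """Nearest-rank percentile of a non-empty list (an assumed definition)."""
    ordered = sorted(values)
    rank = max(1, -(-percent * len(ordered) // 100))  # ceiling division, 1-based
    return ordered[rank - 1]


def passes_citation_gate(records):
    """Map each record id to True/False under the citation-strength rule.

    Each record is a dict with the hypothetical keys
    'id', 'year', 'doctype' and 'citations'.
    """
    # Group citation counts by publication year and document type.
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["year"], rec["doctype"])].append(rec["citations"])
    cutoffs = {key: nearest_rank_percentile(vals, PERCENTILE)
               for key, vals in groups.items()}

    decisions = {}
    for rec in records:
        if rec["year"] in RECENT_YEARS:
            # Recent papers: fixed minimum instead of the percentile cutoff.
            decisions[rec["id"]] = rec["citations"] >= RECENT_MIN_CITATIONS
        else:
            cutoff = cutoffs[(rec["year"], rec["doctype"])]
            decisions[rec["id"]] = rec["citations"] >= cutoff
    return decisions


# Invented records, for illustration only.
records = [
    {"id": "a", "year": 2019, "doctype": "journal", "citations": 40},
    {"id": "b", "year": 2019, "doctype": "journal", "citations": 12},
    {"id": "c", "year": 2019, "doctype": "journal", "citations": 5},
    {"id": "d", "year": 2019, "doctype": "journal", "citations": 25},
    {"id": "e", "year": 2019, "doctype": "journal", "citations": 2},
    {"id": "f", "year": 2024, "doctype": "conference", "citations": 3},
    {"id": "g", "year": 2024, "doctype": "conference", "citations": 1},
]
decisions = passes_citation_gate(records)
```

Grouping by year and document type before computing the cutoff is what keeps the threshold "dynamic": a 2018 journal article is compared only with other 2018 journal articles.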

Step 4—Mixed quantitative/qualitative analysis. Bibliometrics (e.g., co-authorship, citations, and topic prevalence) complements an expert thematic review of methods and challenges, producing the evidence base for Sections 3, 4, and 5.

Embedding Generation and NLP Approaches in the SLR Framework

Stage 1—Embeddings. The bge-large-en-v1.5 model captures contextual semantics more effectively than classic Word2Vec or GloVe [21,22], outperforming larger alternatives on MTEB [23].

Stage 2—Dimensionality reduction. UMAP retains both local and global structure while remaining faster and more scalable than t-SNE [24].

Stage 3—Clustering. HDBSCAN discovers clusters of varying density and flags noise without requiring a preset cluster count, yielding coherent topics for diverse, real-world literature [25,26,27].

Stage 4—Topic keywords. c-TF-IDF [28] highlights discriminative terms at the cluster level, sharpening thematic separation compared with standard TF-IDF.

Stage 5—Topic representation. BERTopic integrates SOLAR-10.7B-Instruct [20] for instruction-following summarisation, ensuring representative document selection and concise topic labels.

Stage 6—Topic assessment. Citation impact, venue quality, and thematic fit govern article selection for each topic, providing a high-quality corpus for subsequent analysis.

This streamlined, expert-in-the-loop workflow delivers a rigorous, scalable review that captures both quantitative trends and qualitative insights across the rapidly evolving predictive maintenance literature.
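
To make the keyword stage (Stage 4) more concrete, the sketch below computes a class-based TF-IDF score in plain Python. It follows the weighting commonly documented for BERTopic, tf(term, cluster) × log(1 + A / f(term)); the exact normalisation, the function names, and the toy clusters are assumptions made for illustration and are not taken from the review's pipeline.

```python
from collections import Counter
from math import log


def class_tfidf(cluster_docs):
    """Score terms per cluster with a class-based TF-IDF.

    cluster_docs maps a cluster label to a list of tokenised documents.
    tf is the term's share of the cluster's tokens, A is the average
    token count per cluster, and f(term) is the term's total count
    over all clusters.
    """
    # Treat each cluster as one long document and count its terms.
    counts = {label: Counter(tok for doc in docs for tok in doc)
              for label, docs in cluster_docs.items()}
    total_per_term = Counter()
    for cluster_counts in counts.values():
        total_per_term.update(cluster_counts)
    avg_tokens = sum(total_per_term.values()) / len(counts)

    scores = {}
    for label, cluster_counts in counts.items():
        n_tokens = sum(cluster_counts.values())
        scores[label] = {
            term: (freq / n_tokens) * log(1 + avg_tokens / total_per_term[term])
            for term, freq in cluster_counts.items()
        }
    return scores


def top_terms(scores, label, k=3):
    """Return the k highest-scoring terms of a cluster (ties broken by name)."""
    ranked = sorted(scores[label].items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


# Toy clusters, invented for illustration.
toy = {
    "bearings": [["bearing", "fault", "vibration"], ["bearing", "cnn", "fault"]],
    "cables": [["cable", "fault", "reflectometry"], ["cable", "insulation", "fault"]],
}
scores = class_tfidf(toy)
```

In the toy data, "fault" appears in both clusters and is therefore down-weighted relative to "bearing" or "cable", which is the cluster-level discrimination that Stage 4 relies on.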

3. Quantitative Analysis of Condition Monitoring and Predictive Maintenance

This section provides a bibliometric analysis centred on key topics identified through a systematic query designed to examine condition monitoring and predictive maintenance in industrial equipment. The selected topics include hydraulic pump condition monitoring and fault diagnosis, predictive maintenance for conveyor belt systems, remaining-useful-life estimation, bearing fault diagnosis using deep learning, and cable monitoring and predictive maintenance. These topics cover essential aspects of predictive maintenance, focusing on challenges such as fault diagnosis accuracy, system reliability, and the application of advanced technologies like machine learning, deep learning, and signal processing. Each topic is analysed in detail, using bibliometric tools such as Bibliometrix to identify trends, key contributions, and research gaps. This approach offers valuable insights into the current state and future directions of predictive maintenance strategies for industrial systems.

The dendrogram in Figure 3 illustrates the hierarchical clustering of topics generated from the bibliographic query, covering key areas in condition monitoring and predictive maintenance for industrial equipment. Six closely related topics are highlighted within the dendrogram, reflecting their thematic similarity: hydraulic pump condition monitoring and fault diagnosis, predictive maintenance for conveyor belt systems, remaining-useful-life estimation for predictive maintenance, bearing fault diagnosis using deep learning and signal processing, cable monitoring and predictive maintenance, and machine learning for fault diagnosis in air compressors. These topics represent critical areas of research with high interconnectivity, focusing on fault diagnosis accuracy, predictive capabilities, and system reliability.

Other topics, including cardiac arrest outcomes prediction models, evaluation and analysis of AI chatbots in cancer care, Unmanned Aerial Vehicle (UAV) flight monitoring and failure prediction, and machine learning-based fault detection in DC-DC power converters, are more dispersed, indicating distinct research directions. This dendrogram provides a structured representation of thematic relationships, enabling a focused exploration of the most relevant topics in this study.

It is important to clarify that the dendrogram reflects semantic similarity between documents—based on term co-occurrence and conceptual proximity—rather than strictly delineated application domains or fault types. As such, clusters may contain articles addressing multiple failure modes or technologies, requiring complementary expert interpretation for accurate topic synthesis. This rationale underpins the manual clustering and thematic validation strategy discussed later in the manuscript.

Figure 4 provides an overview of the bibliometric data collected for this study. It highlights key metrics, including the timespan of publications (2017–2024) and the number of sources (242), documents (354), and references (10,556). The annual growth rate of publications is 31.36%, indicating a rapid increase in research activity within this period. The average document age is 3.29 years, and the average number of citations per document is 15.95, reflecting the relatively recent yet impactful nature of the research.
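
An annual growth rate of this kind is usually a compound rate between the first and last year of the timespan. The sketch below shows that calculation; it assumes the compound-growth definition (the source does not state the formula used by the bibliometric tool), and the yearly counts are hypothetical rather than the review's data.

```python
def annual_growth_rate(counts_by_year):
    """Compound annual growth rate (%) between the first and last year."""
    years = sorted(counts_by_year)
    first = counts_by_year[years[0]]
    last = counts_by_year[years[-1]]
    periods = years[-1] - years[0]
    if periods <= 0 or first <= 0:
        raise ValueError("need at least two years and a positive first count")
    return ((last / first) ** (1 / periods) - 1) * 100


# Hypothetical yearly publication counts, NOT the review's data.
example_counts = {2017: 10, 2018: 14, 2019: 20, 2020: 27,
                  2021: 36, 2022: 48, 2023: 60, 2024: 80}
rate = annual_growth_rate(example_counts)
```

Only the first and last counts enter the formula, so the figure summarises the overall trend and says nothing about year-to-year fluctuations.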

In terms of authorship, there are 1166 contributing authors, with only 5 single-author documents, showcasing a high level of collaboration (an average of 4.07 co-authors per document). International co-authorships account for 19.77%, emphasizing the global scope of the research community. The dataset encompasses a diverse range of document types, including 186 articles, 144 conference papers, and 12 reviews, as well as smaller contributions such as book chapters (6), data papers (2), and short surveys (1). This visualization underscores the breadth and depth of the dataset, emphasizing its relevance to a comprehensive analysis of condition monitoring and predictive maintenance. The diverse range of contributions highlights the interdisciplinary and collaborative nature of research in this domain.

The cumulative line chart in Figure 5 illustrates the top sources contributing to the bibliometric analysis of condition monitoring and predictive maintenance research over time. “Sensors” leads, with the highest cumulative occurrences, reflecting its significant role in the field, followed by “IEEE Access” and “Measurement: Journal of the International Measurement Confederation”. Other notable contributions include “IEEE Transactions on Instrumentation and Measurement” and “Mechanical Systems and Signal Processing”, which exhibit a steady increase in publications. Additional sources, such as “Measurement Science and Technology”, “Lecture Notes in Mechanical Engineering”, and “Energies”, demonstrate the diversity of publication venues in this area. Furthermore, the inclusion of “Lecture Notes in Networks and Systems” underscores the interdisciplinary nature of the research. This cumulative visualization highlights the progressive growth and collaborative efforts within the domain of condition monitoring and predictive maintenance.

The analysis of key authors in the field of predictive maintenance and condition monitoring reveals significant contributions (see Figure 6). Wang Y, Wang X, and Wang H stand out as leading contributors, consistently producing research that focuses on fault diagnosis and predictive methodologies using advanced machine learning techniques. Their work often intersects with applications in critical industrial equipment such as compressors, bearings, and conveyor systems, providing robust frameworks for predictive maintenance.

Piccirilli MC demonstrates a specialized focus on the integration of signal processing techniques with deep learning for fault diagnosis, particularly in environments with noisy data. Luchetta A and Liu Y emphasize the use of hybrid models, combining physical principles with data-driven approaches to improve the accuracy and interpretability of predictive systems. Their work addresses challenges such as concept drift and varying operating conditions in industrial processes.

Grasso F and Bindi M, though emerging contributors, showcase innovative applications of AI and IoT in asset management and real-time monitoring. Their efforts highlight the growing importance of interdisciplinary approaches in enhancing equipment reliability and operational efficiency. Collectively, these authors form a diverse and collaborative research landscape, driving advancements in predictive maintenance and its application across various industrial domains.

In Figure 7, the global distribution of research contributions is visualized, showcasing the frequency of publications by country. The map highlights the significant role of China, with the highest frequency (439), followed by India (123), the USA (87), and Italy (80). These leading contributors are represented with darker shades, indicating their dominant position in research productivity. Other countries, such as South Korea (68), the UK (49), and Brazil (48), also demonstrate notable contributions, further emphasizing the international and collaborative nature of the research landscape. The colour intensity correlates with the publication frequency, and the numeric values for the top contributors are displayed on the map to provide additional clarity. This visualization underscores the geographic diversity and the varying levels of research activity across the globe.

Figure 8 offers a comprehensive visualization of term dynamics over time in the realm of predictive maintenance and condition monitoring research. The chart displays critical terms on the vertical axis, representing key research topics, while the horizontal axis spans from 2019 to 2024. Each term is associated with a horizontal line, extending from its first quartile year (Year Q1) to its third quartile year (Year Q3), illustrating the temporal distribution of its relevance. The median year is marked with a blue circle, the size of which corresponds to the frequency of the term in the analysed literature. From the data, we observe that terms like “condition monitoring” and “fault detection” show the highest frequencies (148 and 128, respectively), with a consistent presence in recent years (median in 2022). This underscores their critical importance in the field. Emerging terms such as “deep learning” and “machine learning” have median years of 2022 and 2023, respectively, highlighting their growing prominence as research priorities in recent years.

Conversely, terms like “signal processing” and “classification accuracy” show relatively lower frequencies (22 and 10, respectively) and earlier median years (2020), suggesting a narrower or more specialized focus in earlier research efforts. Notably, terms such as “prediction accuracy” and “data mining” reflect the latest research developments, with all quartile years clustering around 2023 and 2024. This graph provides insights into how research priorities have shifted over time, illustrating the enduring relevance of foundational terms like “neural networks” and “convolution” while capturing the rise of emerging concepts like “faults diagnosis” and “predictive maintenance”. These observations reveal the dynamic evolution of research in this field, highlighting key areas of focus and suggesting avenues for future exploration.
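
The quartile-based reading of Figure 8 can be reproduced for a single term as follows. This is a sketch under assumptions: the function name is hypothetical, the "inclusive" quantile method is one of several reasonable choices, and the list of years is invented.

```python
from statistics import quantiles


def term_time_span(years):
    """Summarise when a term occurs: Q1 year, median year, Q3 year, frequency.

    `years` holds one entry per document in which the term appears.
    """
    q1, median, q3 = quantiles(years, n=4, method="inclusive")
    return {"q1": q1, "median": median, "q3": q3, "frequency": len(years)}


# Invented publication years for one term, for illustration only.
span = term_time_span([2019, 2020, 2021, 2021, 2022, 2022, 2022, 2023, 2024])
```

In the figure's terms, the horizontal line would run from span["q1"] to span["q3"], with the circle placed at span["median"] and sized by span["frequency"].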

In Figure 9, the analysis of connected terms within the field of predictive maintenance and condition monitoring is presented. The visualization groups related terms into distinct clusters, highlighting their interconnections and thematic focus areas.

Green Cluster—Fault Diagnosis and Feature Extraction: This cluster emphasizes technical methodologies for identifying and diagnosing equipment faults. Terms such as “fault diagnosis”, “bearing”, “convolutional neural network”, and “signal” are prominent, underscoring the importance of machine learning and signal processing in this area. Connections between “feature extraction”, “vibration signal”, and “neural network” suggest a focus on leveraging advanced models to analyse and interpret diagnostic data.

Red Cluster—Predictive Maintenance and Failure Prediction: Terms like “predictive maintenance”, “useful life”, “failure”, and “RUL prediction” dominate this cluster, reflecting the industry’s emphasis on forecasting equipment degradation and optimizing maintenance schedules. The strong interconnections highlight the integration of cost analysis, uncertainty management, and real-world applications in industrial settings.

Blue Cluster—Hydraulic Systems and Pumps: The blue cluster focuses on specific applications such as “centrifugal pump”, “hydraulic system”, and “vibration”, indicating the relevance of predictive maintenance to specialized mechanical systems. Terms like “classification accuracy” and “support vector machine” reveal ongoing efforts to refine model accuracy and efficiency in these applications.

Yellow Cluster—Emerging Techniques and Methodologies: The yellow cluster represents the application of advanced computational techniques and novel methodologies in predictive maintenance and condition monitoring. Terms such as “deep learning”, “convolutional neural network”, “fault detection”, “physics”, and “proposed method” are prominent, highlighting the focus on implementing state-of-the-art approaches to address complex challenges in the field. Connections between “robustness”, “experimental result”, and “CNN model” suggest a strong emphasis on validating these methodologies through empirical studies and practical implementations. This cluster bridges foundational topics from the green cluster and application-driven insights from the red cluster, serving as a pivot for innovation in predictive maintenance technologies.

The analysis underscores the interconnected nature of the research landscape, with each cluster addressing a critical aspect of predictive maintenance and condition monitoring while contributing to the overall development of the field.

In summary, the bibliometric analysis confirms the significant growth of research in predictive maintenance and condition monitoring, highlighting the convergence of machine learning, deep learning, and signal processing techniques to address failures in industrial equipment. The findings reveal a high level of international collaboration, with China, India, and the United States as the leading contributors. Influential publications and emerging approaches are concentrated in areas such as fault diagnosis using convolutional neural networks, remaining-useful-life estimation, and the integration of IoT sensors. Building on this quantitative overview, the next section delves into a qualitative analysis of the most representative articles, discussing the applied methods, identified limitations, and future research prospects within each relevant topic.

4. Qualitative Analysis of Fault Diagnosis and Predictive Maintenance

To bridge the gap between theory and engineering practice, this section presents real-world applications of predictive maintenance techniques across key industrial equipment. Each subsection highlights representative studies selected through LLM–expert curation, showcasing the methods used, implementation contexts, and practical insights drawn from deployment challenges and outcomes. Sections 4.1, 4.2, 4.4, and 4.6 describe specific technologies for compressors, cables, hydraulic pumps, and conveyor belts. Two topics, remaining useful life (RUL) estimation and bearing fault diagnosis, cut across all these equipment types; they are therefore addressed separately in Sections 4.3 and 4.5, while concrete examples are cited within each asset-specific subsection.

4.1. Industrial Compressors: State of the Art and Emerging Trends

Compressors underpin energy transfer in petrochemical plants, refineries, refrigeration cycles, and diverse manufacturing lines; their failure can halt production and sharply increase operating costs. Consequently, predictive strategies now blend advanced analytics with domain knowledge to anticipate incipient faults and optimise intervention schedules. Based on our expert–LLM screening, the body of work can be organised into five inter-related technique families (Figure 1; Tables S1 and S2).

Machine learning (ML). Classical models—linear/polynomial regression, random forest, gradient boosting, kNN, decision trees, SVM, and LDA/QDA—and neural baselines such as MLP and PNN dominate early-stage deployments [29,30,31,32]. Recent studies favour ensembles or stacking (e.g., kNN + gradient boosting with a ridge meta-learner) and anomaly pipelines that combine isolation forest with regression refinements.
Deep learning (DL). CNNs extract hierarchical features from raw vibration or spectrogram inputs, sequence models (LSTM, GRU) capture temporal dependencies, and hybrid layouts—TCN–LSTM and attention-augmented autoencoders—boost sensitivity to subtle degradation patterns [33,34,35]. Transfer learning from large acoustic datasets is gaining traction where labelled compressor data are scarce.
Signal processing and feature engineering. Wavelet packets (MODWPT, DWT), CEEMDAN, and FFT/STFT remain indispensable for denoising and isolating fault-signature frequencies. Statistical descriptors (RMS, skewness, kurtosis, and variance) feed ML pipelines, while GA, PCA, and kernel PCA support feature selection and dimensionality reduction [36,37,38].
Physics-based and hybrid models. Thermo-fluid-dynamic simulations of heat-pump or refrigeration cycles provide virtual sensors for quantities that are difficult to measure in situ; polytropic-exponent or $p$–$V$ correlations help pinpoint valve and leakage defects. Neural surrogates accelerate what-if analyses, replacing expensive CFD or finite-element solvers [38,39,40].
Anomaly detection and diagnostics. One-Class SVM, isolation forest, and sequence prediction LSTMs flag outliers and degradation trajectories, enabling early-warning dashboards [41,42]. Probabilistic outputs are increasingly paired with XAI explanations (e.g., SHAP) to build operator trust.
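
The statistical descriptors and spectral analysis named in the signal-processing family above can be illustrated with a short, dependency-free sketch. It is not drawn from any of the cited studies: the function names are hypothetical, a naive discrete Fourier transform stands in for an FFT (same result, slower), and the test signal is a synthetic 50 Hz sine rather than real compressor data.

```python
import cmath
import math


def time_domain_features(signal):
    """RMS, variance, skewness and kurtosis of a sampled signal."""
    n = len(signal)
    mean = sum(signal) / n
    centred = [x - mean for x in signal]
    variance = sum(c * c for c in centred) / n        # population variance
    std = math.sqrt(variance)
    rms = math.sqrt(sum(x * x for x in signal) / n)
    skewness = sum(c ** 3 for c in centred) / (n * std ** 3)
    kurtosis = sum(c ** 4 for c in centred) / (n * std ** 4)  # 3.0 for a Gaussian
    return {"rms": rms, "variance": variance,
            "skewness": skewness, "kurtosis": kurtosis}


def dominant_frequency(signal, sample_rate):
    """Frequency (Hz) of the largest non-DC bin of a naive DFT."""
    n = len(signal)
    best_bin, best_mag = 1, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(signal))
        if abs(coeff) > best_mag:
            best_bin, best_mag = k, abs(coeff)
    return best_bin * sample_rate / n


# Synthetic signal: ten full cycles of a 50 Hz sine sampled at 1 kHz.
sample_rate = 1000
signal = [math.sin(2 * math.pi * 50 * i / sample_rate) for i in range(200)]
features = time_domain_features(signal)
peak_hz = dominant_frequency(signal, sample_rate)
```

Impulsive faults tend to raise kurtosis well above the value of a clean sinusoid, which is one reason this descriptor is so often fed to the ML pipelines listed above; in practice the features would be computed per window of the measured signal.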

Research converges on six practical themes: (i) fault diagnosis/classification of rotary, reciprocating, centrifugal, and screw units; (ii) predictive maintenance scheduling to minimise downtime; (iii) energy-efficiency optimisation by detecting fouling, leaks, or valve wear; (iv) component degradation profiling of valves, pistons, seals, and bearings; (v) real-time multivariate monitoring in harsh industrial settings; and (vi) IoT/Industry 4.0 integration—edge devices stream high-frequency data to cloud analytics, with results rendered via interactive dashboards.

Studies increasingly exploit edge-to-cloud architectures, Apache Kafka pipelines, and time-series databases to handle terabyte-scale vibration/acoustic streams. DL models are compressed with pruning or knowledge distillation for low-latency inference on embedded GPUs. Explainable AI techniques, such as Grad-CAM on spectrogram CNN outputs and feature importance scores for ensemble trees, facilitate root-cause analysis and regulatory compliance.

Key challenges include (i) limited availability of labelled datasets covering the full spectrum of compressor types and fault modes, (ii) non-stationary operating conditions that degrade model robustness, and (iii) deployment constraints (latency, memory, and cybersecurity). Promising directions encompass transfer or meta-learning for cross-plant adaptability, domain adaptation methods to bridge synthetic–real data gaps, and digital twins that fuse vibration, acoustic, thermal, and maintenance-log data streams. Such twins may ultimately support prescriptive analytics, suggesting optimal load sharing or valve-timing adjustments, thereby reducing energy intensity and unplanned outages.

In sum, the synergy of ML/DL, sophisticated signal processing, and physics-aware modelling, embedded within IoT ecosystems and augmented by XAI, continues to push compressor maintenance toward safer, greener, and more cost-effective operation; yet considerable room remains for generalisable, trustworthy, and real-time solutions.
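
As an illustration of the polytropic-exponent correlations mentioned among the physics-based models in this subsection, the sketch below estimates the exponent n in p·Vⁿ = constant by a least-squares fit of ln p against ln V. It is a generic textbook calculation under stated assumptions, not the method of any cited study; the function name and the synthetic compression data are hypothetical.

```python
import math


def polytropic_exponent(pressures, volumes):
    """Estimate n in p * V**n = const from paired pressure/volume samples.

    Taking logarithms gives ln p = ln C - n * ln V, so n is the negative
    slope of a straight-line fit of ln p against ln V.
    """
    xs = [math.log(v) for v in volumes]
    ys = [math.log(p) for p in pressures]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx


# Synthetic compression stroke generated with n = 1.3 (illustrative values).
volumes = [1.0, 0.8, 0.6, 0.4]
pressures = [100.0 * v ** -1.3 for v in volumes]
n_est = polytropic_exponent(pressures, volumes)
```

A fitted exponent that drifts away from the value observed on a healthy machine is the kind of indicator such correlations provide; the source does not give thresholds, so any alarm limit would have to be calibrated per compressor.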

4.2. Machine Learning-Driven Fault Detection and Maintenance in Cable Systems

Cables for power distribution, submarine communication, and robotic harnesses are pivotal in modern infrastructure, yet they are vulnerable to faults that can cause outages, safety hazards, and costly repairs. Recent studies have combined data-driven models with physics-informed insights to detect incipient defects, monitor operating conditions, and optimise maintenance schedules across diverse environments, from deep-sea installations to automated factories.

Researchers first employ classical machine-learning algorithms—including k-Nearest Neighbours, random forest, decision trees, support vector machines, and isolation forests—to classify partial-discharge signals, temperature profiles, and power-line communication data, often within ensemble frameworks that reduce false alarms [43,44,45,46]. Deep learning approaches extend these efforts: convolutional and recurrent architectures, such as CNN, LSTM, and GRU, automatically capture complex spatial–temporal patterns; autoencoders and attention mechanisms bolster anomaly detection when labelled data are scarce; and complex-valued or multi-valued neurons adapt neural networks to frequency-response and phasor inputs in high-voltage systems [43,47,48].

Signal processing and reflectometry methods remain essential for feature extraction. Frequency- and time-domain reflectometry (FDR and TDR, respectively), Fourier and wavelet transforms, and empirical-mode decomposition isolate transient phenomena, while phase-diagram and Lissajous analyses reveal insulation ageing; the resulting descriptors feed both conventional and deep classifiers [43,44,45,46]. Complementing these techniques, IoT-enabled sensors stream voltage, current, and thermal data to cloud platforms that combine machine learning predictions with physical rules in digital-twin frameworks, thereby supporting real-time remaining-useful-life estimation and cost-effective asset management [49].

Practical applications concentrate on four themes: fault diagnosis and classification of partial discharges, insulation decay, and mechanical damage in underground and overhead cables; predictive maintenance scheduling based on health indices and failure-probability scores that mitigate downtime; thermal analysis and overheating detection using reflectometry or power-line communication signals to pre-empt insulation breakdown; and infrastructure-wide reliability assessment, where models trained on maintenance logs, weather data, and sensor measurements guide renewal priorities and investment planning.

Despite these advances, several challenges limit widespread deployment. Many asset registers lack comprehensive failure histories, asset ages, or location details, which constrains model generalisation; richer data collection strategies and auxiliary sources are needed. Environmental variability and fluctuating operating regimes also degrade algorithm robustness; transfer learning and domain adaptation represent promising remedies. Integrating data-driven predictions with physics-based digital twins can improve interpretability and reliability, yet such hybrids demand scalable edge-to-cloud architectures for low-latency inference. Finally, transparent models and standardised monitoring protocols are essential for regulatory acceptance and operator trust, motivating continued research into explainable AI and benchmarking frameworks. Overall, the literature demonstrates considerable progress in applying machine learning, deep learning, advanced signal processing, and IoT technologies to cable-system health management while underscoring the need for comprehensive datasets, real-time integration, and physics-aware modelling to realise a resilient, efficient future grid.

4.3. Remaining-Useful-Life Estimation: A Cross-Cutting Framework and Key Examples

Accurately estimating the remaining useful life (RUL) of components is essential in aerospace, manufacturing, oil and gas, and energy industries because well-timed interventions minimise downtime, reduce costs, and avert safety incidents. Research spans deep learning, classical machine learning, statistical forecasting, and hybrid physics-based models, all seeking to improve predictive accuracy under changing operating conditions and to provide data-driven strategies for complex assets.

Deep learning dominates recent work. Convolutional neural networks and their variants extract features directly from raw signals, while recurrent architectures such as LSTM, Bi-LSTM, GRU, and CNN–LSTM hybrids capture temporal degradation patterns. Autoencoders and variational autoencoders support anomaly detection, data augmentation, and semi-supervised learning [50,51,52,53,54,55,56,57].

Traditional machine learning methods—including linear or polynomial regression, random forests, support vector regression, ARIMA, and Bayesian updating—remain influential, owing to their interpretability and modest computational demands [53,54,58,59,60,61]. Semi- and self-supervised approaches exploit unlabelled data through pretext or auxiliary tasks, enabling relative RUL estimation where run-to-failure records are scarce [57,62].

Concept-drift handling is addressed by models that adapt parameters via gradient re-weighting and feedback loops, maintaining accuracy as machines age or operating conditions shift [52,55]. Hybrid physics–data frameworks integrate mechanistic damage equations with machine learning, often using Bayesian techniques to refine exponential, Weibull, or Gamma degradation laws [50,51,52,56,59,62,63]. Ensemble learning combines multiple predictors to lower error and bolster robustness [58,64]. Incremental and cloud–edge schemes keep models current for real-time deployment [53,59,61,63]. Uncertainty quantification with Gaussian processes and Bayesian inference provides confidence bounds that inform risk-based maintenance [50,60,61,63,64], while emerging deep reinforcement learning frameworks support adaptive control policies [65].
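To make the notion of a degradation law concrete, the following sketch fits an exponential law h(t) = a·exp(b·t) to a health indicator and extrapolates it to a failure threshold. It is a simplified stand-in, not the Bayesian refinement used in the cited works: ordinary least squares on log(h) replaces Bayesian updating, and the function name `estimate_rul`, the synthetic indicator, and the threshold value are all assumptions made for illustration.

```python
import numpy as np

def estimate_rul(t, health, threshold):
    """Fit h(t) = a * exp(b * t) by least squares on log(h) and
    return the time remaining until h reaches `threshold`."""
    t = np.asarray(t, dtype=float)
    log_h = np.log(np.asarray(health, dtype=float))
    b, log_a = np.polyfit(t, log_h, deg=1)    # slope, intercept
    if b <= 0:
        return np.inf                          # no upward degradation trend
    t_fail = (np.log(threshold) - log_a) / b
    return max(t_fail - t[-1], 0.0)

# Synthetic, noise-free indicator with made-up parameters a = 0.5, b = 0.02 per hour
t = np.arange(0.0, 100.0, 1.0)
h = 0.5 * np.exp(0.02 * t)
rul = estimate_rul(t, h, threshold=10.0)       # hours left after the last sample
```

A point estimate of this kind carries no confidence bounds; the Gaussian-process and Bayesian approaches cited above address that gap.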

These methods are applied widely. In oil and gas, RUL models predict failure dates of safety valves despite the sparsity of sensors. Aeronautics research—often using NASA’s C-MAPSS dataset—focuses on turbofan engines. Advanced manufacturing relies on RUL forecasts for bearings, spindles, and CNC systems, accounting for tool wear and process shifts. Complex assets such as pumps, motors, transmissions, and rotating shafts benefit from alarm segmentation and modality-based subpopulation analysis. Fleet management for forklifts and industrial vehicles combines physical insights and data-driven models to schedule service. The energy sector monitors wind turbines, generators, and transformers, whereas battery systems use RUL estimation to anticipate capacity fade and optimise storage operations.

Across these studies, several gaps persist. High-quality run-to-failure datasets are scarce; data augmentation, domain adaptation, and federated learning can alleviate this constraint. Continuous recalibration is needed to cope with concept drift, yet real-time updates are computationally intensive. Explainability remains critical for regulatory confidence, especially in safety-critical domains. Deploying deep or Bayesian models at scale requires efficient edge solutions and streamlined model-update protocols. Future work should pursue physics-informed architectures, transfer learning across machinery types, and more comprehensive benchmarks—particularly for under-represented sectors such as automotive transmissions, large-scale vehicles, and battery storage—so that predictive maintenance can reach its full potential in diverse industrial settings.

4.4. Emerging Trends in Hydraulic Pump Condition Monitoring and Fault Diagnosis

Hydraulic pumps and related equipment are critical to industrial productivity, yet their exposure to cavitation, wear, and overheating poses substantial risks of downtime and energy loss. Recent studies have integrated machine learning (ML), deep learning (DL), advanced signal processing, and physics-based insights to reduce these risks and to align condition monitoring with big data and Industry 4.0 strategies.

Current research converges on five technique families. First, classical ML models (linear and polynomial regression, decision trees, random forest, k-nearest neighbours, and multilayer perceptrons) often appear in ensemble form to boost robustness [66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83]. Second, DL architectures, such as CNN, LSTM, GRU, and their hybrids (e.g., temporal-CNN with attention), automatically learn spatial–temporal features from vibration or acoustic sequences, improving fault discrimination and cavitation detection [70,84,85,86,87,88,89,90,91,92,93]. Third, signal processing methods (including FFT, short-time Fourier transform, MODWPT, DWT, and CEEMDAN) extract frequency components and statistical descriptors such as RMS, kurtosis, and variance, with PCA or kernel PCA refining feature sets [70,78,82,84,85,86,87,89,90,91,94]. Fourth, physics-based and hybrid approaches couple thermo-fluid-dynamic or pressure–volume models with data-driven surrogates to capture underlying failure mechanics while reducing simulation cost [69,95]. Finally, anomaly detection pipelines employ one-class SVM, isolation forests, and sequence-based predictors to flag outliers and anticipate degradation trajectories [71,72,74,75,76,80,81,83,86,88,90,94,95,96,97].

These tools underpin diverse applications: early identification and classification of faults such as valve leaks, bearing damage, and abnormal vibration or overheating in axial, centrifugal, screw, and reciprocating pumps; predictive maintenance scheduling that minimises unplanned outages; energy consumption optimisation by detecting blockages or leaks; degradation analysis of valves, pistons, seals, and heat exchangers; real-time multivariate supervision in petrochemical and refrigeration plants; and large-scale IoT platforms that stream sensor data to cloud analytics for dashboard visualisation.

Despite notable advances, data scarcity and limited coverage of operating regimes hinder model generalisation, while evolving control strategies introduce concept drift, demanding continual recalibration. Computationally intensive DL or ensemble models complicate real-time deployment, and opaque decision logic challenges operator trust. Future research should explore federated learning, synthetic data, and improved curation to enrich training corpora; domain adaptation and transfer learning schemes to re-purpose models for new machines; hybrid physics-informed architectures that embed design constraints; and systematic benchmarking across pump types and duty cycles to foster reproducibility and accelerate industrial uptake.

4.5. Bearing Fault Diagnosis as a Common Function in Rotating Equipment

Bearings are ubiquitous in rotating machinery, and their premature failure can jeopardise safety, inflate energy consumption, and interrupt production. Recent studies have fused signal processing, machine learning (ML), and deep learning (DL) to detect incipient defects under noisy, variable loads. A concise survey clarifies how these methods complement one another, highlights persisting gaps, and outlines promising research directions.

Classical ML remains a staple for supervised health assessment. Algorithms such as random forest, support vector machines, and k-nearest neighbours discriminate between fault types using statistical descriptors—root mean square, kurtosis, or variance—or transforms such as FFT and wavelets; feature selection tools like PCA or genetic algorithms improve robustness [98,99,100,101]. Ensemble and hybrid schemes further curb misclassification.

Deep learning architectures automate feature learning from raw vibration streams or time–frequency images. One- and two-dimensional CNNs with multiscale filters, attention blocks, and residual paths excel at spatial pattern recognition, whereas LSTM, GRU, and CNN–LSTM hybrids capture temporal dependencies and cope with variable speeds [7,11,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125].

Signal processing and sensor fusion techniques—STFT, CWT, wavelet packets, EEMD/CEEMDAN, and denoising filters—extract salient components, while multi-sensor combinations of vibration, acoustics, and currents offer a system-level view [102,105,107,109,112,114,116,117,118,120,123,125,126]. When labelled failures are scarce, anomaly detection with one-class SVM or isolation forests, domain adaptation, and transfer learning enhance generalisability, and reinforcement learning tackles class imbalance or feature selection [11,106,117,126]. Physics-informed networks embed bearing kinematics or impulse periodicity into loss functions, boosting interpretability and trust [7,98,104,108,110,113,119,121].

Industry deploys these tools for early crack or spall detection, predictive maintenance scheduling, and online monitoring under fluctuating loads. Research has also targeted robustness to electromagnetic interference, lightweight models for edge devices, and multi-modal health indices that combine vibration, current, image, and acoustic streams. Tables S11–S13 synthesise representative studies, techniques, results, and limitations.

Several obstacles still impede large-scale adoption. Most public datasets are laboratory-based and lack the diversity of real operating regimes, while high-quality field labelling remains labour-intensive. Concept drift induced by evolving processes demands continual recalibration, and deep and ensemble models can exceed the computational budgets of real-time systems. Moreover, engineers require transparent diagnostics, yet explainable AI toolkits are only beginning to gain traction.

Future work should expand data acquisition through digital-twin simulation, federated learning, and standardised labelling protocols; develop quantised or pruned models for embedded deployment; and advance hybrid, physics-guided architectures that reconcile data-driven accuracy with domain knowledge. Addressing these challenges will accelerate the transition of bearing diagnosis research from controlled experiments to resilient, cost-effective solutions in petrochemical plants, transportation, and advanced manufacturing.

4.6. Artificial Intelligence and Predictive Maintenance in Conveyor Belt Systems

Conveyor belts are critical to bulk material handling in mining, logistics, and manufacturing, yet they operate under abrasive, dusty, and highly variable conditions that accelerate wear and heighten failure risk. Recent work has blended classical machine learning (ML), deep learning (DL), advanced signal processing, and physics-informed modelling to anticipate defects, schedule maintenance, and optimise energy use.

Classical and hybrid ML models—linear or polynomial regression, decision trees, k-nearest neighbours, random forest (RF), gradient boosting (GB), linear/quadratic discriminant analysis, and support vector machines—remain popular for the classification of roller wear, belt damage, and motor faults. Ensemble schemes often embed meta-learners such as ridge regression, while multilayer perceptrons and probabilistic neural networks address non-linear patterns [127,128,129,130,131].

Deep learning methods dominate recent studies: CNN, LSTM, GRU, and temporal CNN–LSTM hybrids, frequently augmented with autoencoders or attention blocks, excel at feature extraction and time-series forecasting under varying loads and speeds [132,133,134,135,136].

Signal processing and feature extraction pipelines employ MODWPT, DWT, CEEMDAN, and FFT to isolate vibration or acoustic signatures; statistical descriptors (RMS, kurtosis, and variance) and optimisation tools such as genetic algorithms, PCA, and Kernel PCA refine the inputs [129,133,136].

Physics-based and hybrid models integrate thermodynamic or mechanical simulations (e.g., fluid-dynamic analyses and pressure–volume correlations) with neural surrogates to accelerate what-if studies and improve failure prediction [131,134].

Anomaly detection frameworks rely on one-class SVM, isolation forests, and sequence models (LSTM and RNN) to flag outliers and track degradation trends, enabling proactive maintenance planning [130,137,138].

These techniques underpin practical applications: fault diagnosis and classification of roller misalignment, belt tears, and material build-up; predictive maintenance scheduling that reduces unplanned outages; energy consumption optimisation by detecting overloads or blockages; degradation analysis of rollers, bearings, and drives; and real-time multivariate monitoring of speed, load, vibration, and temperature. IoT sensors stream high-frequency data to cloud platforms, where big data analytics and interactive dashboards support agile decision making.

Persistent gaps hinder industrial uptake. High-quality, labelled datasets that capture the diversity of real operating regimes are scarce, as field instrumentation and granular annotation are labour-intensive. Advanced DL models can be computationally burdensome for near-real-time diagnostics unless pruned or quantised for edge deployment. Limited model interpretability slows acceptance among engineers, and heterogeneous sensor protocols complicate cross-site transfer.

Therefore, research should pursue scalable data collection strategies—federated learning, digital-twin augmentation, and standardised labelling—to enrich training corpora while preserving confidentiality. Lightweight, physics-guided architectures and edge-computing pipelines can balance accuracy with latency constraints. Explainable AI techniques such as Layer-wise relevance propagation or DeepSHAP, coupled with hybrid physics–data models, can bolster trust and aid in root-cause analysis. Finally, transfer learning and domain adaptation methods are needed to port models across industries and environments, enabling robust, energy-efficient predictive maintenance for conveyor systems worldwide.

5. Discussion and Future Research Directions

Predictive maintenance systems rely on a complex interplay of data sources, signal processing techniques, advanced analytics, and practical implementation strategies to optimize industrial asset performance. This section explores the foundational components of these systems, including the variety of data inputs, the essential steps of signal processing and data preparation, the application of machine learning and advanced analytics, and the challenges that must be addressed for effective deployment. Figures 10–13 provide visual summaries of these aspects, highlighting key concepts and their relationships within predictive maintenance frameworks.

5.1. Data Sources and Their Integration

Data sources in predictive maintenance systems encompass a wide range of inputs that enable comprehensive monitoring and analysis of industrial assets. Real-time data are collected from sensors measuring vibration, acoustic signals, temperature, current, pressure, and humidity, often enhanced by IoT-enabled devices for continuous tracking. Historical records—such as maintenance logs, failure histories, equipment usage profiles, and long-term environmental monitoring—provide crucial context for analytics. In parallel, intermediate systems like SCADA, PLCs, and MES support real-time control and decision-making, while external platforms—including cloud services, third-party providers, and vendor-specific tools—offer advanced diagnostics and analytical capabilities. Collectively, these data layers form a robust foundation for the deployment of predictive maintenance strategies (see Figure 10).

Despite these advances, multiple surveys and field studies indicate that industrial datasets often lack sufficient breadth or detailed labelling, especially for rare failure events. Consequently, many models become overfit to well-documented faults or underperform due to incompleteness of the input. To mitigate these limitations, researchers are exploring strategies such as data augmentation, synthetic data generation through the use of digital twins, and federated learning. Nonetheless, further investigation is needed to address challenges related to data privacy, cross-organizational collaboration, and the operational constraints of sectors where intellectual property is tightly protected. To move from raw, fragmented logs toward usable data assets, a systematic approach to data curation and publication is required. One such approach is described below.

From Raw Logs to a Shareable Data Mining Repository

A critical step in addressing the aforementioned challenges involves structuring the raw data into interoperable, analysis-ready formats. In practice, converting heterogeneous mining logs into a research-ready database involves four recurring steps.

(i) Ingest: Time-stamped vibration, current, thermographic, and operational tags are streamed into a staging area via OPC UA or MQTT brokers.

(ii) Standardise: All records are mapped to a common time base (e.g., 1 kHz for fast channels and 1 Hz for SCADA tags) and stored in a column-oriented time-series table, while contextual information—equipment ID, duty cycle, ore hardness, and ambient conditions—is preserved in a metadata table compliant with ISA-95 or the Asset Administration Shell.

(iii) Label: Maintenance tickets, inspection notes, and failure codes are linked to their signal intervals through a simple event table containing {start_time, end_time, fault_type, severity}, making supervised learning possible, even when faults are rare.

(iv) Publish: The three tables (“signals”, “metadata”, and “events”) are exported as Apache Parquet or Feather files, together with a JSON schema, so that external researchers can query the whole asset history with a single SQL-like view.

A well-structured repository of this kind acts as an enabler for scalable and reproducible AI applications.
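The standardise and label steps listed above can be sketched in a few lines. The example below is illustrative only: the sampling rates are scaled down (10 Hz and 1 Hz rather than 1 kHz and 1 Hz), the temperature ramp and the two events are invented, and `label_samples` is a hypothetical helper. Only the event fields {start_time, end_time, fault_type, severity} come from the text. The publish step is omitted because writing Parquet or Feather files would require an additional library.

```python
import numpy as np

# (ii) Standardise: place a slow 1 Hz tag on the time base of a faster channel.
fast_t = np.arange(100) / 10.0                 # 10 Hz base, 0.0 .. 9.9 s
slow_t = np.arange(0.0, 11.0, 1.0)             # 1 Hz tag timestamps
slow_v = 20.0 + 0.5 * slow_t                   # made-up temperature ramp
slow_on_fast = np.interp(fast_t, slow_t, slow_v)

# (iii) Label: event table with the fields named in the text (values invented).
events = [
    {"start_time": 2.0, "end_time": 4.0, "fault_type": "bearing", "severity": 2},
    {"start_time": 7.5, "end_time": 8.0, "fault_type": "misalignment", "severity": 1},
]

def label_samples(timestamps, events, default="healthy"):
    """Give each timestamp the fault_type of the half-open interval
    [start_time, end_time) that contains it, or `default` otherwise."""
    labels = np.full(timestamps.shape, default, dtype=object)
    for ev in events:
        mask = (timestamps >= ev["start_time"]) & (timestamps < ev["end_time"])
        labels[mask] = ev["fault_type"]
    return labels

labels = label_samples(fast_t, events)
n_faulty = int(np.sum(labels != "healthy"))    # samples covered by any event
```

Keeping the events in a separate table, rather than writing labels into the signal table, lets maintenance records be corrected later without rewriting the raw measurements.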

5.2. Signal Processing and Data Preparation

Signal processing and data preparation are critical steps in predictive maintenance, ensuring that raw data are transformed into meaningful insights. Frequency and time-frequency transformations, such as FFT, STFT, DWT, CWT, WPT, and EMD, enable the extraction of spectral and temporal features essential for fault detection and diagnosis. Feature extraction techniques include basic statistics, envelope methods, and advanced dimensionality reduction, while feature fusion combines multiple data sources for robust analysis. Data cleaning processes, such as interpolation, normalization, resampling, and temporal alignment, enhance data quality and consistency. Expert knowledge integration through knowledge-based rules or synthetic signal generation (data augmentation) also enriches datasets for improved model performance (see Figure 11).
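The basic statistical descriptors mentioned here (RMS, variance, and kurtosis) can be computed directly from a signal window. The sketch below assumes a synthetic 50 Hz sine as the healthy signal and adds invented periodic impacts to mimic a fault; the function name `time_domain_features` is hypothetical. Kurtosis is reported in its non-excess form, which equals 3 for Gaussian noise and 1.5 for a pure sine.

```python
import numpy as np

def time_domain_features(x):
    """Return RMS, population variance, and non-excess kurtosis of a window."""
    x = np.asarray(x, dtype=float)
    mean = x.mean()
    var = x.var()                               # population variance
    rms = np.sqrt(np.mean(x ** 2))
    kurt = np.mean((x - mean) ** 4) / var ** 2  # fourth moment over variance squared
    return {"rms": rms, "variance": var, "kurtosis": kurt}

fs = 1000                                       # assumed sampling rate in Hz
t = np.arange(fs) / fs
healthy = np.sin(2 * np.pi * 50 * t)
faulty = healthy.copy()
faulty[::100] += 5.0                            # invented periodic impacts

f_healthy = time_domain_features(healthy)
f_faulty = time_domain_features(faulty)
```

In this toy case the impacts raise the kurtosis far more than the RMS, which is why kurtosis is a common indicator of impulsive faults.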

5.2.1. Fourier-Based Methods

Fast Fourier Transform (FFT) remains a fundamental tool for identifying dominant frequency components in (quasi-)stationary signals, such as rotating machinery operating at a fixed speed. Early-stage or repetitive fault signatures often manifest as characteristic frequency peaks or sidebands in the FFT spectrum. However, the effectiveness of FFT-based approaches can degrade under non-stationary conditions or varying load profiles. Short-Time Fourier Transform (STFT) alleviates this by introducing a time window for localized analysis, but optimal window sizing can be challenging, leading to trade-offs between time and frequency resolution.
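A short sketch of FFT-based peak identification on a stationary signal follows. The sampling rate, record length, and the two tone frequencies (a 30 Hz shaft component and a weaker 157 Hz tone standing in for a fault signature) are assumptions made for the example. The frequency resolution is fs/n, so the 2 s record used here gives 0.5 Hz bins.

```python
import numpy as np

fs = 2000                                  # sampling rate in Hz (assumed)
n = 4000                                   # 2 s record
t = np.arange(n) / fs
# Synthetic signal: strong 30 Hz component plus a weaker 157 Hz tone
x = 1.0 * np.sin(2 * np.pi * 30 * t) + 0.3 * np.sin(2 * np.pi * 157 * t)

window = np.hanning(n)                     # Hann window to limit leakage
mag = np.abs(np.fft.rfft(x * window))
freqs = np.fft.rfftfreq(n, d=1 / fs)
delta_f = fs / n                           # frequency resolution in Hz

# Local maxima of the magnitude spectrum, then the two strongest of them
is_peak = (mag[1:-1] > mag[:-2]) & (mag[1:-1] > mag[2:])
peak_idx = np.flatnonzero(is_peak) + 1
strongest = peak_idx[np.argsort(mag[peak_idx])[-2:]]
peak_freqs = np.sort(freqs[strongest])
```

The same arithmetic shows the STFT trade-off noted above: at this sampling rate a 256-sample window localises events to 0.128 s but widens the bins to about 7.8 Hz.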

5.2.2. Wavelet Transform and Time-Frequency Analysis

Wavelet-based approaches (e.g., DWT, CWT, and wavelet packet transform) have been widely adopted to capture transient or impact-type events, which are common in bearing faults, cavitation phenomena, and gearbox defects. Compared to FFT, wavelets are more effective for non-stationary signals, providing multi-resolution detail. However, parameters such as the choice of mother wavelet, the number of decomposition levels, and thresholding strategies significantly influence detection accuracy and computational load. These trade-offs are especially critical for real-time applications on edge devices, where memory and processing capabilities may be limited. Recent extensions such as variational mode decomposition (VMD) and complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) aim to adaptively decompose signals into intrinsic mode functions under varying speed or load conditions, although their higher complexity and parameter tuning may reduce feasibility in online scenarios.
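To show how a wavelet decomposition localises a transient, the sketch below implements one level of the orthonormal Haar DWT by hand. The Haar wavelet is chosen only because it fits in a few lines; the studies reviewed here typically use other mother wavelets and multiple levels through a dedicated library. The signal, the spike position, and the function name `haar_dwt` are assumptions.

```python
import numpy as np

def haar_dwt(x):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    x = np.asarray(x, dtype=float)
    if x.size % 2:
        raise ValueError("length must be even")
    even, odd = x[0::2], x[1::2]
    approx = (even + odd) / np.sqrt(2.0)   # low-pass branch
    detail = (even - odd) / np.sqrt(2.0)   # high-pass branch
    return approx, detail

fs = 1024                                  # assumed sampling rate in Hz
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 5 * t)              # slow background component
x[600] += 2.0                              # single impact-like spike (invented)

approx, detail = haar_dwt(x)
# Each detail coefficient covers two input samples, so map the index back
impact_sample = 2 * int(np.argmax(np.abs(detail)))
```

The slow sine contributes almost nothing to the detail coefficients, so the spike dominates them and its position is recovered to within one coefficient (two samples).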

5.2.3. Hybrid Approaches and Emerging Perspectives

Hybrid methods that combine Fourier or wavelet transforms with advanced machine learning architectures (e.g., CNNs) or statistical tests (e.g., kurtosis-based indices) have shown improved robustness against noise. For instance, wavelet energy features can feed into a CNN classifier, directly capturing localized impulse-like behaviours caused by fault signatures. Likewise, envelope detection (e.g., Hilbert transform) can emphasize modulated fault frequencies before applying wavelet-based techniques. Nevertheless, the rising popularity of deep learning—where models ingest raw time-series data or spectrograms—shifts the emphasis onto automated feature extraction, potentially reducing manual signal processing but increasing reliance on large labelled datasets and higher computational budgets.
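Envelope detection can be sketched with an FFT-based analytic signal, which is one standard way to compute the Hilbert-transform envelope. In the example below, a 3 kHz carrier is amplitude-modulated at 87 Hz to imitate a resonance excited by a repetitive fault; both frequencies, the sampling rate, and the function name `envelope` are illustrative assumptions.

```python
import numpy as np

def envelope(x):
    """Amplitude envelope via the FFT-based analytic signal (Hilbert transform)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    spectrum = np.fft.fft(x)
    h = np.zeros(n)                        # one-sided spectral weighting
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(spectrum * h))

fs = 10_000                                # assumed sampling rate in Hz
t = np.arange(fs) / fs                     # 1 s record, 1 Hz bins
carrier_hz, fault_hz = 3000.0, 87.0        # invented values
x = (1.0 + 0.5 * np.cos(2 * np.pi * fault_hz * t)) * np.sin(2 * np.pi * carrier_hz * t)

env = envelope(x)
env_spectrum = np.abs(np.fft.rfft(env - env.mean()))
freqs = np.fft.rfftfreq(env.size, d=1 / fs)
detected_hz = float(freqs[np.argmax(env_spectrum)])
```

The raw spectrum of `x` contains only the carrier and its sidebands at 2913 Hz and 3087 Hz; the modulation frequency itself appears only after demodulation, which is the motivation for envelope analysis.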

5.2.4. Challenges and Future Directions in Signal Processing

Although wavelets and FFT-based methods have been well-studied in the literature, real-world deployment in industrial environments faces practical hurdles. Non-stationary speed profiles, time-varying loads, and multi-fault interactions create complex signal patterns that may require adaptive transforms or online parameter tuning. Additionally, model interpretability remains a challenge; while wavelet coefficients or frequency spectra can be visually inspected by domain experts, fully automated pipelines might obscure the relationship between raw signals and final diagnoses. Future research could explore lightweight wavelet implementations, real-time approximate transforms, or even learned wavelet dictionaries that adapt to the specific signal characteristics of each machine. Furthermore, collaborative platforms (e.g., federated or cloud–edge computing) can balance the computational overhead of advanced transformations with the real-time constraints of condition monitoring on plant floors.

A recurring research challenge in this domain is striking a balance between transformation complexity and real-time applicability. Methods based on wavelets or EMD capture non-stationary signals effectively but often require high computational power, potentially exceeding the limits of edge devices. Future work might explore lightweight versions, such as approximated wavelets or adaptive sampling, to achieve near-online signal processing without compromising core diagnostics. Another issue is sensor synchronization and alignment of signals from multiple channels or components. Though cross-correlation and advanced synchronization approaches exist, their application in large-scale industrial settings remains limited, calling for more systematic validation and best-practice guidelines.

5.3. Machine Learning, Advanced Analytics, and Explainability

Machine learning and advanced analytics play a pivotal role in predictive maintenance, enabling intelligent decision making through data-driven insights. Core techniques include classical models such as regression, decision trees, random forests, and ensemble methods, as well as deep learning architectures like CNNs, LSTMs, GRUs, and hybrid networks that capture spatial and temporal dependencies. Specialized models address prognostics and anomaly detection, incorporating physics-based constraints or methods like one-class SVM and isolation forests. Optimization strategies such as hyperparameter tuning and transfer learning further enhance model adaptability and scalability. Explainability and hybrid intelligence, with approaches like SHAP, LIME, and rule-based systems, offer greater transparency and help build trust in automated diagnostics (see Figure 12).

Notwithstanding these advancements, two major concerns curtail their wider deployment. First, concept drift arises when equipment undergoes continuous changes in load, wear, or operating regimes. Models that are not updated to reflect these variations are prone to degrade in accuracy over time. Incremental learning and domain adaptation have shown potential, but practical demonstrations are still scarce. Second, computational overhead remains high when dealing with large, high-frequency datasets. Pruning, quantization, and more efficient architectures represent viable paths to address latency constraints in real-world scenarios. Further exploration of hybrid (physics–data) approaches could increase reliability and interpretability, especially in safety-critical settings where purely data-driven black-box models face scepticism.
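Quantization, one of the latency remedies named above, can be illustrated with symmetric per-tensor int8 quantization of a weight matrix. This is a generic sketch rather than a scheme from the reviewed studies; the function names, the layer shape, and the weight distribution are assumptions, and real deployments usually rely on the quantization tooling of their inference framework.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor quantisation of float weights to int8."""
    scale = float(np.max(np.abs(w))) / 127.0
    if scale == 0.0:
        scale = 1.0                        # all-zero tensor: any scale works
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Map int8 codes back to approximate float32 weights."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(1)
w = rng.normal(scale=0.1, size=(128, 64)).astype(np.float32)  # hypothetical layer
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
max_err = float(np.max(np.abs(w - w_hat)))  # bounded by about scale / 2
```

Storage drops by a factor of four relative to float32, at the cost of a rounding error of at most half a quantization step per weight.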

5.4. Hybrid Physics–Data Models: A Bridge Between First-Principles Insight and Statistical Learning

As highlighted throughout the quantitative (Section 3) and qualitative (Section 4) analyses, an increasingly visible research strand blends mechanistic insight with modern machine learning pipelines. Roughly 11% of the 354 primary studies selected in our SLR employ some form of hybrid physics–data modelling—by either (i) embedding governing equations or conservation laws in the loss function of a neural network, (ii) coupling a fast data-driven surrogate with a high-fidelity simulator, or (iii) enforcing physically meaningful constraints (e.g., energy balance, cyclic symmetry, or resonance period) during feature engineering. Representative examples are summarised in Tables S2, S9, S12 and S13. Below, we distil the main patterns, benefits, and open questions that emerged from this sub-corpus.

Taxonomy of hybrid approaches.

Physics-informed neural networks (PINNs). Recent studies [110,113] minimise a joint loss that combines data misfit with residuals of the underlying motion or wear equations. By forcing the latent representation to satisfy kinematic periodicity or impulse-response decay, PINNs raise diagnostic accuracy by 2–5 pp compared with purely data-driven CNN baselines while requiring only ∼50% of the original labelled samples.
Grey-box digital twins with learned surrogates. In compressor and pump studies (e.g., those by Du et al. [40] and Rousseau et al. [39]), thermodynamic or CFD simulations generate synthetic fault trajectories that feed a lightweight surrogate network (LSTM, RBF, or autoencoder). Once calibrated, the surrogate runs three to four orders of magnitude faster than the full solver, enabling real-time residual generation for anomaly scoring.
Physics-guided feature engineering. Works on cavitation and bearing spalls (e.g., those by Wu et al. [84] and Huang et al. [37]) first isolate carrier or modulated frequencies predicted by fluid–structure interaction theory, then feed those physically meaningful sub-bands to classical classifiers (RF or SVM) or to compact CNNs. This grey-box preprocessing typically cuts feature dimensionality by 60–80% and improves class-imbalance robustness.
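The joint loss behind the first category can be illustrated without a neural network. The sketch below is a linear least-squares analogue, not the method of the cited PINN studies: the unknowns are the signal values on a time grid, the data term uses only three noisy observations, and the physics term penalises the forward-difference residual of a decay law du/dt = −k·u. The decay rate, the weight `lam`, the grid, and the function name `fit_with_physics` are all assumptions.

```python
import numpy as np

def fit_with_physics(t, obs_idx, obs_val, k, lam):
    """Minimise  sum_i (u[i] - y_i)^2  +  lam * sum_j r_j^2,
    where r_j = (u[j+1] - u[j]) / dt + k * u[j] is the residual of du/dt = -k u."""
    n = t.size
    dt = t[1] - t[0]
    # Data rows: select the observed grid points
    A_data = np.zeros((len(obs_idx), n))
    A_data[np.arange(len(obs_idx)), obs_idx] = 1.0
    # Physics rows: forward-difference residual of the decay law
    A_phys = np.zeros((n - 1, n))
    j = np.arange(n - 1)
    A_phys[j, j] = -1.0 / dt + k
    A_phys[j, j + 1] = 1.0 / dt
    A = np.vstack([A_data, np.sqrt(lam) * A_phys])
    b = np.concatenate([obs_val, np.zeros(n - 1)])
    u, *_ = np.linalg.lstsq(A, b, rcond=None)
    return u

t = np.linspace(0.0, 1.0, 201)
k_true = 3.0                                   # assumed decay rate
u_true = np.exp(-k_true * t)
obs_idx = np.array([0, 40, 80])                # only three labelled points
rng = np.random.default_rng(0)
obs_val = u_true[obs_idx] + rng.normal(scale=0.01, size=obs_idx.size)

u_hybrid = fit_with_physics(t, obs_idx, obs_val, k=k_true, lam=1.0)
rmse = float(np.sqrt(np.mean((u_hybrid - u_true) ** 2)))
```

With three observations alone the 201 grid values would be underdetermined; the physics rows supply the missing structure, which mirrors the label-efficiency argument made for PINNs. The weight `lam` plays the role of the physics-versus-data balance term.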

Why does the hybrid route matter?

Label scarcity. Most industrial assets fail rarely; embedding conservation constraints or simulator priors substitutes for the thousands of fault cycles that deep learners usually demand.
Extrapolation and trust. When operating conditions drift outside the training envelope, hard-coded physical laws act as safety rails that curb unphysical predictions and ease regulatory acceptance—an advantage repeatedly stressed by domain experts during our validation workshops.
Computational frugality. Grey-box surrogates such as the self-adaptive RBF used by Du et al. [40] reduce inference latency to below 100 ms on edge-grade CPUs, satisfying the cycle times of most PLC-controlled plants.

Frontier challenges.

Despite these benefits, the reviewed papers reveal three recurring pain points:

Automatic weighting of physics vs. data losses. Nearly every PINN study tunes the balance term via grid search; adaptive schemes (e.g., dynamic gradient harmonisation) remain to be tested on industrial datasets.
Uncertainty quantification. Only two hybrid works—that of Huang et al. [37] and the Bayesian ensemble of Byun et al. [64]—provide predictive intervals. Extending probabilistic PINNs or ensemble surrogates to multi-fault settings is an open avenue.
Standardised benchmarks. Cross-study comparability is hampered by heterogeneous simulators and proprietary datasets. A curated, open repository pairing raw signals with physics metadata (steel grade, lubricant viscosity, and ambient pressure) would accelerate method transfer across domains.
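
As an illustration of the uncertainty-quantification gap noted above, predictive intervals can be obtained from an ensemble of simple surrogates. The sketch below is a generic bootstrap ensemble of linear trend fits on a synthetic health indicator; it is not the method of Huang et al. [37] or Byun et al. [64]. The function name, the 90% level, and the synthetic degradation data are all assumptions made for the example.

```python
import numpy as np

def ensemble_interval(t, hi, t_query, n_members=200, level=0.90, seed=0):
    """Bootstrap ensemble of linear fits; returns (median, lower, upper) at t_query."""
    rng = np.random.default_rng(seed)
    preds = np.empty(n_members)
    for m in range(n_members):
        sample = rng.integers(0, t.size, size=t.size)   # resample with replacement
        slope, intercept = np.polyfit(t[sample], hi[sample], deg=1)
        preds[m] = slope * t_query + intercept
    alpha = (1.0 - level) / 2.0
    lower, upper = np.quantile(preds, [alpha, 1.0 - alpha])
    return np.median(preds), lower, upper

# Synthetic health indicator: linear degradation plus measurement noise (hypothetical).
rng = np.random.default_rng(1)
t = np.arange(51, dtype=float)
hi = 1.0 - 0.01 * t + rng.normal(0.0, 0.02, t.size)
center, lower, upper = ensemble_interval(t, hi, t_query=80.0)
print(f"predicted indicator at t=80: {center:.3f} [{lower:.3f}, {upper:.3f}]")
```

The interval width reflects only the spread of the ensemble members; it does not account for model misspecification, which is one reason the multi-fault case remains open.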

Research agenda and practical tips.

Building on the gaps mentioned above, we propose four actionable directions for the community:

Develop lightweight PINN kernels that leverage fast spectral convolutions, enabling on-device training for assets with intermittent connectivity.
Combine active learning with simulator-in-the-loop sampling to target high-value regions of the state space, reducing synthetic data redundancy.
Integrate explainable AI overlays such as DeepSHAP in physics-guided layers (see Keleko et al. [91]) so that operators can trace residual peaks to physically interpretable variables (e.g., pressure ratio and specific damping).
Report energy and memory footprints alongside accuracy, following the guidelines of the IEEE P2802 standard [139], to facilitate deployment decisions in resource-constrained industrial environments.

Taken together, these observations confirm that hybrid physics–data models are not merely a theoretical curiosity but a pragmatic pathway toward resilient, data-efficient, and operator-trusted predictive maintenance solutions. As industrial practitioners advance toward the human-centric paradigm of Industry 5.0, we expect such grey-box strategies to become the default choice for high-stakes assets where pure black-box models remain difficult to justify.

5.5. Challenges and Industry 5.0 Outlook

Despite recent advances, AI- and ML-driven predictive maintenance systems continue to face persistent challenges that limit their adoption and scalability. A major barrier is the scarcity of labelled data, as most industrial datasets are highly imbalanced and costly to annotate—particularly for rare or incipient faults. Concept drift further complicates model reliability: operational conditions change, components wear out, and signal characteristics evolve over time, requiring constant model adaptation.
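
To make the concept-drift problem concrete, the following sketch implements a basic Page-Hinkley test, a classical change detector for an upward shift in the mean of a monitored statistic. It is offered as one possible drift trigger, not as a method used in the reviewed studies; the parameter values `delta` and `threshold` and the synthetic streams are assumptions chosen for the example.

```python
def page_hinkley(stream, delta=0.05, threshold=5.0):
    """Return the 0-based index at which an upward mean shift is flagged, or None."""
    mean = 0.0       # running mean of the stream
    cum = 0.0        # cumulative deviation from the running mean, minus the tolerance delta
    cum_min = 0.0    # smallest cumulative deviation seen so far
    for i, x in enumerate(stream, start=1):
        mean += (x - mean) / i
        cum += x - mean - delta
        cum_min = min(cum_min, cum)
        if cum - cum_min > threshold:
            return i - 1
    return None

stationary = [0.0] * 200                 # no change in operating regime
shifted = [0.0] * 100 + [1.0] * 100      # mean shift after sample 100
no_alarm = page_hinkley(stationary)
alarm_at = page_hinkley(shifted)
print(no_alarm, alarm_at)
```

In practice such a detector would watch a model-level quantity (for example, prediction error or a residual norm), and an alarm would trigger recalibration or retraining rather than a maintenance action.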

In addition, the computational demands of deep learning—especially for real-time inference on resource-constrained hardware—pose limitations in edge deployments. The need for explainability is another critical factor, especially in safety-critical or regulated industries, where model transparency underpins compliance and operational trust. Integrating these systems into industrial environments also raises challenges related to IoT heterogeneity, privacy concerns, and large-scale data orchestration. Lastly, scalability and robustness must be ensured to withstand harsh operating conditions and deliver tangible returns on investment (see Figure 13).

Looking forward, aligning predictive maintenance with the guiding principles of Industry 5.0 presents a unique opportunity to address these challenges while promoting more sustainable and human-centred technologies. In this paradigm, automation and efficiency are complemented by human expertise, collaboration, and responsible innovation. Operators and domain experts are dynamically integrated into the feedback loop via intuitive dashboards and explainable AI overlays, enhancing decision making and interpretability.

Emerging tools such as digital twins and virtual commissioning enable proactive simulation of failure scenarios, including multi-fault conditions, before they occur in production environments. Furthermore, predictive maintenance can contribute directly to sustainability goals by reducing energy consumption, minimizing unnecessary part replacements, and extending equipment life cycles.

Figure 14 synthesizes the proposed implementation framework for Industry 5.0. Raw sensor data from physical assets are first processed by an edge/fog AI layer with OT-level control, enabling fast, localized inference and closed-loop actuation. Aggregated insights are then sent to a cloud-based federated learning layer for cross-site model retraining, with updated models and policies returned to the edge.

Simultaneously, preprocessed data feed a digital thread or Product Lifecycle Management (PLM) backbone, enriching it with context such as maintenance history and design constraints. This persistent digital infrastructure supports traceability and decision making across the asset lifecycle. Both the edge and the digital thread push real-time KPIs and alerts to human-centric UIs, closing the loop through intuitive operator interfaces. This layered flow directly addresses the AI/ML challenges discussed earlier—data scarcity, concept drift, computational overhead, explainability, and integration heterogeneity—by combining edge–cloud intelligence with traceable lifecycle data and human-in-the-loop control.
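
The cloud-side aggregation step of the federated learning layer described above can be sketched as a sample-weighted average of the parameters reported by each site, in the style of federated averaging. The article does not specify the aggregation rule used in Figure 14, so this choice, the function name, and the example values are assumptions.

```python
import numpy as np

def federated_average(site_params, site_sizes):
    """Sample-weighted average of per-site parameter vectors (FedAvg-style)."""
    sizes = np.asarray(site_sizes, dtype=float)
    stacked = np.stack(site_params)              # shape: (n_sites, n_params)
    return (sizes[:, None] * stacked).sum(axis=0) / sizes.sum()

# Two hypothetical plants report locally trained parameters and their sample counts.
plant_a = np.array([1.0, 2.0])
plant_b = np.array([3.0, 4.0])
global_params = federated_average([plant_a, plant_b], site_sizes=[1, 3])
print(global_params)
```

Only parameter vectors and sample counts leave each site in this scheme; raw sensor data stay at the edge, which is the property that makes it attractive for cross-site retraining.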

5.6. Concluding Remarks on Future Directions

In summary, predictive maintenance systems combine diverse data sources, advanced signal processing, and cutting-edge machine learning techniques to enable proactive management of industrial assets. While these systems offer significant benefits, including reduced downtime and optimized performance, they also face challenges related to data shortages, evolving operational conditions, computational overhead, and explainability. Addressing these gaps will require ongoing innovation in algorithmic design, infrastructure optimization, and user-centred solutions.

A viable path forward consists of four key strategies. First, scalable data sharing and synthetic augmentation methods can enrich datasets without compromising sensitive information, thereby widening the scope of model training. Second, adaptive and hybrid models offer a means to handle concept drift while improving interpretability by incorporating domain knowledge into learning pipelines. Third, synergistic edge–cloud architectures could distribute computational loads effectively, ensuring faster local decisions while maintaining a robust historical analysis in the cloud. Finally, a human-in-the-loop perspective aligns with Industry 5.0 values, fostering greater trust, transparency, and sustainability. Through these avenues, predictive maintenance can evolve from proof-of-concept implementations to core industrial practices, ultimately contributing to safer, more efficient, and sustainable operations.

5.7. Integration with Classical Maintenance Management Models

The adoption of machine learning, deep learning, and hybrid modelling techniques in industrial maintenance is not isolated. These technologies are being progressively integrated into classical approaches such as Reliability-Centred Maintenance (RCM), Condition-Based Maintenance (CBM), and Total Productive Maintenance (TPM). The methodologies and findings presented throughout this work—particularly those summarized in the tables related to fault diagnosis, signal processing, deep learning, and hybrid approaches—support and extend the application of these established maintenance strategies.

RCM focuses maintenance strategies on system and component criticality and associated risk. Traditionally, it relies on failure mode and effects analysis (FMEA/FMECA) to guide preventive or corrective actions based on the probability and impact of failures. Studies such as those by Azizi et al. [78] and Shi et al. [34] address early detection of critical failures (e.g., pump cavitation, bearing degradation, and soot build-up in boilers), allowing maintenance to concentrate on components whose failure could lead to major shutdowns or safety hazards. In this way, predictive models strengthen RCM by identifying early signs of failure in critical components, improving the effectiveness of inspections and interventions. Moreover, the use of automatic classification methods (e.g., convolutional neural networks or ensemble algorithms) enables FMEA to incorporate real-time operational data. Hybrid models that combine signal processing and ML, such as those proposed by Saravanakumar et al. [105] and Soualhi et al. [50], dynamically estimate failure probabilities instead of assuming static values. This continuous assessment of risk allows the criticality matrix (probability/impact) to be updated according to changing operating conditions and maintenance resources to be reallocated accordingly. However, since RCM heavily depends on expert validation of failure modes and consequences, scepticism may arise when AI models generate difficult-to-interpret outputs. To address this, explainable AI (XAI) methods such as LIME, SHAP, and DeepSHAP, as demonstrated in works by Solís-Martín et al. [52] and Keleko et al. [91], can improve transparency and trust between advanced analytics and traditional risk analysis.
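
The dynamic update of the criticality matrix described above can be sketched as follows. Risk is taken here as the simple product of probability and impact, which is a common simplification and an assumption of this example; the failure modes, scores, and the model-estimated probability are hypothetical values, not results from the cited studies.

```python
from dataclasses import dataclass

@dataclass
class FailureMode:
    name: str
    probability: float   # failure probability over the planning horizon
    impact: int          # consequence score, e.g. 1 (minor) to 5 (severe)

    @property
    def risk(self) -> float:
        return self.probability * self.impact

def rank_by_risk(modes):
    """Order failure modes from highest to lowest risk."""
    return sorted(modes, key=lambda m: m.risk, reverse=True)

def update_probabilities(modes, estimates):
    """Replace static probabilities with model estimates where available."""
    return [FailureMode(m.name, estimates.get(m.name, m.probability), m.impact)
            for m in modes]

static = [
    FailureMode("pump cavitation", 0.05, 4),
    FailureMode("bearing degradation", 0.10, 3),
    FailureMode("soot build-up", 0.02, 2),
]
static_order = [m.name for m in rank_by_risk(static)]

# A diagnostic model now reports a raised cavitation probability (hypothetical value).
updated = update_probabilities(static, {"pump cavitation": 0.30})
updated_order = [m.name for m in rank_by_risk(updated)]
print(static_order, updated_order)
```

With the static values, bearing degradation ranks first; once the model-estimated cavitation probability is injected, pump cavitation moves to the top, which is the reallocation effect described in the text.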

CBM is based on continuous monitoring of equipment health indicators (e.g., temperature, vibration, and pressure) to trigger maintenance when thresholds are exceeded. Historically, these thresholds were fixed, but the techniques reviewed here show that indicators can be refined using predictive and diagnostic models. For example, vibration signals processed through Fourier or wavelet transforms can extract features that are then evaluated using ML techniques (e.g., SVM and random forest), as in the work of Lv et al. [38], enabling probabilistic fault diagnosis rather than relying on fixed alert levels. Real-time monitoring using IoT sensors, as seen in the works of Hu et al. [70] and Bruinsma et al. [80], allows the condition of the asset to be dynamically updated. When progressive degradation is detected, the system can adapt the alert criteria based on signal evolution. Advanced techniques like LSTM and GRU networks model the future trajectory of health indicators, aiding in maintenance scheduling (e.g., the works of Miao et al. [109] and Son and Oh [51]). Additionally, combining multiple sensor modalities (vibration, acoustics, and temperature) leads to synthetic condition indicators, such as the Bayesian approach proposed by Martinsen et al. [130], which detects early risk events in mining applications. This reinforces CBM by reducing dependence on single metrics.
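
A minimal sketch of the Fourier-based indicator described above is given below: the spectral energy in a band around an assumed fault frequency is computed from the FFT, and the alert level is derived from healthy baseline windows rather than fixed by hand. The sampling rate, the 120 Hz fault frequency, the band limits, and the mean-plus-three-standard-deviations rule are all assumptions for illustration; a deployed system would typically pass such features to a classifier, as in the reviewed works.

```python
import numpy as np

def band_energy(signal, fs, f_lo, f_hi):
    """Spectral energy of `signal` inside [f_lo, f_hi] Hz, from the one-sided FFT."""
    power = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fs)
    in_band = (freqs >= f_lo) & (freqs <= f_hi)
    return power[in_band].sum() / signal.size

fs, n = 1000, 1000                      # 1 kHz sampling, 1 s windows (assumed)
rng = np.random.default_rng(0)
time = np.arange(n) / fs

# Baseline: band energy of 50 healthy windows, modelled here as white noise.
baseline = np.array([band_energy(rng.normal(0.0, 1.0, n), fs, 110, 130)
                     for _ in range(50)])
threshold = baseline.mean() + 3.0 * baseline.std()

# Faulty window: the same noise level plus a 120 Hz component (hypothetical fault tone).
faulty = rng.normal(0.0, 1.0, n) + np.sin(2 * np.pi * 120.0 * time)
faulty_energy = band_energy(faulty, fs, 110, 130)
alarm = faulty_energy > threshold
print(f"threshold={threshold:.1f}, faulty band energy={faulty_energy:.1f}, alarm={alarm}")
```

Recomputing the baseline statistics over a sliding window of recent healthy data would let the alert level follow slow changes in operating conditions, in the spirit of the adaptive criteria mentioned in the text.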

TPM emphasizes operator autonomy, proactive failure management, and continuous improvement, promoting a culture in which personnel participate in early detection and the resolution of issues to minimize downtime. Operators are seen as “owners” of the equipment, responsible for routine maintenance and anomaly detection. For this framework to benefit from AI tools, interpretability is critical. Studies such as that by Keleko et al. [91], which used DeepSHAP, and that by Kim et al. [83], which visualized feature importance, demonstrate how transparent decision support systems empower operators by clarifying why an alarm is triggered. Similarly, implementations by Robatto et al. [137] and Pulcini et al. [131] emphasize the need for dashboards and user-friendly interfaces that combine operational indicators with AI-generated diagnostics. In TPM contexts, such visualization enhances production-line decision making and enables non-expert operators to engage actively with machine health data.

The implementation of advanced monitoring and diagnostic methodologies—such as those reviewed in this paper for bearing failure detection [112,113], pump cavitation identification [82,86], and RUL estimation [54,61]—naturally aligns with the objectives of RCM, CBM, and TPM. RCM benefits from analytics to identify risk “hot spots” and justify maintenance prioritization. CBM is enhanced by predictive models capable of anticipating intervention timing based on trends rather than static thresholds. TPM gains value by democratizing access to predictive intelligence, allowing in situ operators to understand and act on health assessments, increasing ownership and engagement.

Ultimately, the convergence of the reviewed methods with established maintenance models supports the development of flexible, explainable, and risk-aware systems that improve industrial decision making. However, further efforts are needed in three key areas: (1) standardization of data and process integration, as sensor infrastructure and communication protocols remain fragmented in many industries; (2) assurance of trust and return on investment, particularly in TPM contexts where operator motivation may decline if AI is perceived as a black box; and (3) the development of unified evaluation frameworks and metrics, since measuring impact (e.g., reduced downtime, cost savings, and risk mitigation) requires consistent and transparent indicators across RCM, CBM, and TPM.

Future Work—Key Questions (Summary Box)

To conclude the discussion, we highlight a set of forward-looking questions that emerged from our systematic review and may guide future research in predictive maintenance. These topics address critical gaps in lightweight deployment, model robustness, and operator interaction.

Lightweight transforms: Can approximate wavelet/EMD variants run in real time on edge MCUs?
Concept-drift handling: How often must models be retrained, and which incremental schemes work best in safety-critical plants?
Physics-guided explainability: What level of physical fidelity is sufficient to gain operator trust while keeping models compact?
Federated data sharing: Which privacy-preserving protocols are acceptable for cross-company PdM benchmarks?
Unified ROI metrics: How should downtime, energy, and carbon savings be normalised across industries?
Human-centric UIs: Which visual cues (e.g., causal graphs and SHAP heat maps) most effectively support TPM operators?

6. Conclusions

This systematic review combined large language model text mining with expert appraisal to map the state of predictive maintenance research for industrial equipment. A structured search retrieved and filtered articles on bearings, compressors, hydraulic pumps, conveyor belts, and cable systems. Dimensionality reduction and clustering techniques revealed the most influential methodological threads—deep learning, physics-informed hybrids, advanced signal processing, and classical machine learning models—while expert validation confirmed topic relevance and research quality.

Four trends dominate contemporary work. First, deep neural architectures increasingly deliver early fault detection and fine-grained classification across diverse assets. Second, IoT and big data infrastructure enable high-frequency, large-scale data acquisition for real-time analytics. Third, hybrid physics–data approaches improve interpretability and resilience, especially where labelled data are scarce. Fourth, traditional statistical and ML algorithms (random forest, SVM, and k-NN) persist in small-sample or resource-constrained settings, providing lightweight baselines and interpretable benchmarks. Remaining-useful-life estimation—often via supervised or Bayesian models—has gained prominence alongside rising interest in explainability and concept-drift adaptation.

Reporting of Physical and Environmental Metadata

Several studies reviewed here provide only limited detail on the physical properties of critical components—such as bearing steel grade, casing geometry, and cable insulation thickness—and on operational conditions like ambient temperature, humidity, and electromagnetic interference. Explicitly documenting these parameters would facilitate stronger cross-study comparisons and enable researchers to link model performance to underlying material and environmental factors. We therefore recommend that future work adopt standard metadata tables (e.g., component material, tolerance class, and IEC/ISO environment ratings) alongside signal and performance data to strengthen the empirical foundation of industrial predictive maintenance research.

Despite clear progress, practical deployment still faces obstacles: limited availability of high-quality labels; the overhead associated with deploying complex models on edge devices; and the absence of standardised, operator-friendly protocols. Future advances should prioritise (i) synthetic and federated data generation through the use of digital twins and collaborative learning to enrich training corpora; (ii) continuous learning frameworks that track concept drift and recalibrate models online; (iii) explainable AI toolkits (e.g., SHAP, Grad-CAM) that build user trust and satisfy safety regulations; (iv) lightweight, edge-optimised architectures combining cloud and on-device inference for low-latency decisions; (v) alignment with Industry 5.0 goals of human–AI collaboration and sustainability; and (vi) large-scale pilot studies in heterogeneous sectors (mining, automotive, wind energy, and food production) to validate returns on investment and refine cross-sector benchmarks.

Addressing these priorities will accelerate the industrial uptake of predictive maintenance solutions, enhancing reliability, energy efficiency, and operational safety across increasingly complex and interconnected production systems.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/app15105465/s1. Reference [140] is cited in the supplementary materials.

Funding

This research was supported by the VINCI-DI initiative at Pontificia Universidad Católica de Valparaíso (PUCV), under project number 039.706/2025.

Acknowledgments

The authors express their gratitude to the institutions and funding bodies that supported this research. Special thanks to the Pontificia Universidad Católica de Valparaíso (PUCV) for its continuous academic support, particularly through the Doctorate in Intelligent Industry program. Luis A. Rios thanks AGCID Chile’s 2023 South-South Cooperation Scholarship Program and Doctorado en Industria Inteligente, Facultad de Ingeniería, Pontificia Universidad Católica de Valparaíso for support provided.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

AE	Acoustic Emission or Autoencoder (depending on context)
AI	Artificial Intelligence
AL	Active Learning
ANN	Artificial Neural Network
ARIMA	AutoRegressive Integrated Moving Average
BDL	Bayesian Deep Learning
BN	Bayesian Network
CAD	Computer-Aided Design
CBM	Condition-Based Maintenance
CFD	Computational Fluid Dynamics
CLAHE	Contrast-Limited Adaptive Histogram Equalization
CNN	Convolutional Neural Network
COT	Computed Order Tracking
CWT	Continuous Wavelet Transform
DT	Digital Twin or Decision Tree (depending on context)
DTW	Dynamic Time Warping
DWT	Discrete Wavelet Transform
EDA	Encoder–Decoder with Attention
EMD	Empirical Mode Decomposition
FDR	Frequency-Domain Reflectometry
FFT	Fast Fourier Transform
FMECA	Failure Mode, Effects, and Criticality Analysis
GA	Genetic Algorithm
GB	Gradient Boosting
GP	Gaussian Process
GRU	Gated Recurrent Unit
HDBSCAN	Hierarchical Density-Based Spatial Clustering of Applications with Noise
HI	Health Index
HSA	Hierarchical Symbolic Analysis
IoT	Internet of Things
JL-CNN	Joint Learning Convolutional Neural Network
kNN	k-Nearest Neighbours
KPCA	Kernel Principal Component Analysis
LDA	Linear Discriminant Analysis
LLM	Large Language Model
LMD	Local Mean Decomposition
LSTM	Long Short-Term Memory
MC	Monte Carlo (e.g., MC Dropout)
MFCC	Mel-Frequency Cepstral Coefficient
ML	Machine Learning
MLP	Multi-Layer Perceptron
MODWPT	Maximal Overlap Discrete Wavelet Packet Transform
MVN	Multi-Valued Neuron
NARX	Nonlinear AutoRegressive Model with eXogenous Inputs
NLP	Natural Language Processing
PCA	Principal Component Analysis
PIResNet	Physics-Informed Residual Network
PSO	Particle Swarm Optimization
QDA	Quadratic Discriminant Analysis
RBF	Radial Basis Function
RF	Random Forest
RL	Reinforcement Learning
RNN	Recurrent Neural Network
RUL	Remaining Useful Life
SAE	Stacked Autoencoder
SCR	Squared Current Ratio
SDMS	Self-Diagnostic Monitoring System
SHAP	SHapley Additive exPlanation
SI	State Indicator
SNR	Signal-to-Noise Ratio
SVM	Support Vector Machine
TDR	Time-Domain Reflectometry
t-SNE	t-distributed Stochastic Neighbour Embedding
TSA	Time Synchronous Average
UMAP	Uniform Manifold Approximation and Projection
VGG	Visual Geometry Group (CNN family)
VMD	Variational Mode Decomposition
VRBG	Variable Reluctance Bearing Generator
XAI	Explainable Artificial Intelligence
XGBoost	Extreme Gradient Boosting
ZSC	Zero Sequence Current

References

Sahu, A.R.; Palei, S.K.; Mishra, A. Data-driven fault diagnosis approaches for industrial equipment: A review. Expert Syst. 2024, 41, e13360.
Kumar, S.; Goyal, D.; Dang, R.K.; Dhami, S.S.; Pabla, B. Condition based maintenance of bearings and gears for fault detection–A review. Mater. Today Proc. 2018, 5, 6128–6137.
Sheikh, M.A.; Bakhsh, S.T.; Irfan, M.; Nor, N.b.M.; Nowakowski, G. A review to diagnose faults related to three-phase industrial induction motors. J. Fail. Anal. Prev. 2022, 22, 1546–1557.
Shahin, M.; Chen, F.F.; Hosseinzadeh, A.; Zand, N. Using machine learning and deep learning algorithms for downtime minimization in manufacturing systems: An early failure detection diagnostic service. Int. J. Adv. Manuf. Technol. 2023, 128, 3857–3883.
Nunes, P.; Santos, J.; Rocha, E. Challenges in predictive maintenance—A review. CIRP J. Manuf. Sci. Technol. 2023, 40, 53–67.
International Energy Agency. Digitalisation and Energy. 2017. Available online: https://www.iea.org/reports/digitalisation-and-energy (accessed on 7 May 2025).
Ni, Q.; Ji, J.; Halkon, B.; Feng, K.; Nandi, A.K. Physics-Informed Residual Network (PIResNet) for rolling element bearing fault diagnostics. Mech. Syst. Signal Process. 2023, 200, 110544.
Dong, Y.; Li, Y.; Zheng, H.; Wang, R.; Xu, M. A new dynamic model and transfer learning based intelligent fault diagnosis framework for rolling element bearings race faults: Solving the small sample problem. ISA Trans. 2022, 121, 327–348.
Lu, Z.; Liang, L.; Zhu, J.; Zou, W.; Mao, L. Rotating Machinery Fault Diagnosis Under Multiple Working Conditions via A Time Series Transformer Enhanced by Convolutional Neural Network. IEEE Trans. Instrum. Meas. 2023, 72, 3533611.
Sang, K.X.; Shang, J.; Lin, T.R. Synchroextracting transform and deep residual network for varying speed bearing fault diagnostic. J. Vib. Eng. Technol. 2023, 11, 343–353.
Wang, Y.; Ning, D.; Feng, S. A novel capsule network based on wide convolution and multi-scale convolution for fault diagnosis. Appl. Sci. 2020, 10, 3659.
Chen, J.; Li, D.; Huang, R.; Chen, Z.; Li, W. Multi-Scale Dilated Convolutional Auto-Encoder Network for Weak Feature Extraction and Health Condition Detection. In Proceedings of the 2024 IEEE International Instrumentation and Measurement Technology Conference (I2MTC), Glasgow, UK, 20–23 May 2024; pp. 1–6.
Wang, Z.; Zhang, M.; Chen, H.; Li, J.; Li, G.; Zhao, J.; Yao, L.; Zhang, J.; Chu, F. A generalized fault diagnosis framework for rotating machinery based on phase entropy. Reliab. Eng. Syst. Saf. 2025, 256, 110745.
Liu, S.; Ji, Z.; Zhang, Z.; Wang, Y. An improved deep transfer learning method for rotating machinery fault diagnosis based on time-frequency diagram and pre-training model. IEEE Trans. Instrum. Meas. 2024, 73, 2507512.
Mushtaq, S.; Islam, M.M.; Sohaib, M. Deep learning aided data-driven fault diagnosis of rotatory machine: A comprehensive review. Energies 2021, 14, 5150.
Garcia, J.; Villavicencio, G.; Altimiras, F.; Crawford, B.; Soto, R.; Minatogawa, V.; Franco, M.; Martínez-Muñoz, D.; Yepes, V. Machine learning techniques applied to construction: A hybrid bibliometric analysis of advances and future directions. Autom. Constr. 2022, 142, 104532.
García, J.; Leiva-Araos, A.; Diaz-Saavedra, E.; Moraga, P.; Pinto, H.; Yepes, V. Relevance of Machine Learning Techniques in Water Infrastructure Integrity and Quality: A Review Powered by Natural Language Processing. Appl. Sci. 2023, 13, 12497.
McInnes, L.; Healy, J.; Melville, J. UMAP: Uniform manifold approximation and projection for dimension reduction. arXiv 2018, arXiv:1802.03426.
Campello, R.J.; Moulavi, D.; Sander, J. Density-based clustering based on hierarchical density estimates. In Proceedings of the Advances in Knowledge Discovery and Data Mining: 17th Pacific-Asia Conference, PAKDD 2013, Gold Coast, Australia, 14–17 April 2013; Proceedings, Part II 17. Springer: Berlin/Heidelberg, Germany, 2013; pp. 160–172.
Kim, D.; Park, C.; Kim, S.; Lee, W.; Song, W.; Kim, Y.; Kim, H.; Kim, Y.; Lee, H.; Kim, J.; et al. SOLAR 10.7B: Scaling large language models with simple yet effective depth up-scaling. arXiv 2023, arXiv:2312.15166.
Mikolov, T. Efficient estimation of word representations in vector space. arXiv 2013, arXiv:1301.3781.
Pennington, J.; Socher, R.; Manning, C.D. GloVe: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, 25–29 October 2014; pp. 1532–1543.
Muennighoff, N.; Tazi, N.; Magne, L.; Reimers, N. MTEB: Massive text embedding benchmark. arXiv 2022, arXiv:2210.07316.
Van der Maaten, L.; Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 2008, 9, 2579–2605.
Asyaky, M.S.; Mandala, R. Improving the performance of HDBSCAN on short text clustering by using word embedding and UMAP. In Proceedings of the 2021 8th International Conference on Advanced Informatics: Concepts, Theory and Applications (ICAICTA), Online, 29–30 September 2021; pp. 1–6.
David, U.; Karabatak, M. Text clustering of COVID-19 vaccine tweets. In Proceedings of the 2022 10th International Symposium on Digital Forensics and Security (ISDFS), İstanbul, Turkey, 6–7 June 2022; pp. 1–6.
Gelar, T.; Sari, A.N. BERTopic and NER Stop Words for Topic Modeling on Agricultural Instructional Sentences. In Proceedings of the International Conference on Applied Science and Technology on Engineering Science 2023 (iCAST-ES 2023); Atlantis Press: Tarakan, Indonesia, 2024; pp. 129–140.
Bafna, P.; Pramod, D.; Vaidya, A. Document clustering: TF-IDF approach. In Proceedings of the 2016 International Conference on Electrical, Electronics, and Optimization Techniques (ICEEOT), Chennai, India, 3–5 March 2016; pp. 61–66.
Shar, M.A.; Muhammad, M.B.; Mokhtar, A.A.B.; Soomro, M. A Novel Energy Performance-Based Diagnostic Model for Centrifugal Compressor using Hybrid ML Model. Arab. J. Sci. Eng. 2024, 49, 14835–14853.
Lv, Q.; Yu, X.; Ma, H.; Ye, J.; Wu, W.; Wang, X. Applications of Machine Learning to Reciprocating Compressor Fault Diagnosis: A Review. Processes 2021, 9, 909.
Ahn, B.; Kim, J.; Choi, B. Artificial intelligence-based machine learning considering flow and temperature of the pipeline for leak early detection using acoustic emission. Eng. Fract. Mech. 2019, 210, 381–392.
Choi, Y.; Kwun, H.; Kim, D.; Lee, E.; Bae, H. Residual life prediction for induction furnace by sequential encoder with s-convolutional LSTM. Processes 2021, 9, 1121.
Liu, Y.; Duan, L.; Yuan, Z.; Wang, N.; Zhao, J. An intelligent fault diagnosis method for reciprocating compressors based on LMD and SDAE. Sensors 2019, 19, 1041.
Shi, Y.; Li, M.; Wen, J.; Yang, Y.; Zeng, J. Deep Learning-Based Approach for Heat Transfer Efficiency Prediction with Deep Feature Extraction. ACS Omega 2022, 7, 31013–31035.
Jeon, S.H.; Yoo, S.; Yoo, Y.S.; Lee, I.W. ML-and LSTM-Based Radiator Predictive Maintenance for Energy Saving in Compressed Air Systems. Energies 2024, 17, 1428.
Yin, Y.; Liu, X.; Huang, W.; Liu, Y.; Hu, S. Gas face seal status estimation based on acoustic emission monitoring and support vector machine regression. Adv. Mech. Eng. 2020, 12, 1687814020921323.
Huang, X.y.; Xia, H.; Liu, Y.k.; Miyombo, M.E. Improved fault diagnosis method of electric gate valve in nuclear power plant. Ann. Nucl. Energy 2023, 194, 109996.
Lv, Q.; Cai, L.; Yu, X.; Ma, H.; Li, Y.; Shu, Y. An automatic fault diagnosis method for the reciprocating compressor based on HMT and ANN. Appl. Sci. 2022, 12, 5182.
Rousseau, P.; Laubscher, R. A Condition-Monitoring Methodology Using Deep Learning-Based Surrogate Models and Parameter Identification Applied to Heat Pumps. Math. Comput. Appl. 2024, 29, 52.
Du, J.; Zhang, J.; Yang, L.; Li, X.; Guo, L.; Song, L. Mechanism analysis and self-adaptive RBFNN based hybrid soft sensor model in energy production process: A case study. Sensors 2022, 22, 1333. [Google Scholar] [CrossRef] [PubMed]
Patil, A.; Soni, G.; Prakash, A. A BMFO-KNN based intelligent fault detection approach for reciprocating compressor. Int. J. Syst. Assur. Eng. Manag. 2022, 13, 797–809. [Google Scholar] [CrossRef]
Mobtahej, P.; Naddaf-Sh, S.; Hamidi, M.; Zargarzadeh, H.; Liu, X. Anomaly Detection by Employing Root Cause Analysis and Machine Learning-based Approach Using Compressors Timeseries Data. In Proceedings of the IIE Annual Conference. Proceedings. Institute of Industrial and Systems Engineers (IISE), Institute of Industrial and Systems Engineers (IISE). Seattle, WA, USA, 21–24 May 2022; pp. 1–6. [Google Scholar]
Martinez Ricardo, D.M.; Castañeda Jimenez, G.E.; Vaqueiro Ferreira, J.; de Oliveira Nobrega, E.G.; de Lima, E.R.; de Almeida, L.M. Evaluation of machine learning methods for monitoring the health of guyed towers. Sensors 2021, 22, 213. [Google Scholar] [CrossRef]
Aizenberg, I.; Belardi, R.; Bindi, M.; Grasso, F.; Manetti, S.; Luchetta, A.; Piccirilli, M.C. Failure prevention and malfunction localization in underground medium voltage cables. Energies 2020, 14, 85. [Google Scholar] [CrossRef]
Coutinho, M.; Novo, L.L.; De Melo, M.; De Medeiros, L.; Barbosa, D.; Alves, M.; Tarragô, V.; Dos Santos, R.; Neto, H.L.; Gama, P. Machine learning-based system for fault detection on anchor rods of cable-stayed power transmission towers. Electr. Power Syst. Res. 2021, 194, 107106. [Google Scholar] [CrossRef]
Stanescu, D.; Digulescu, A.; Ioana, C.; Serbanescu, A. Transient power grid phenomena classification based on phase diagram features and machine learning classifiers. In Proceedings of the 2022 30th European Signal Processing Conference (EUSIPCO), Belgrade, Serbia, 29 August–2 September 2022; pp. 1676–1680. [Google Scholar]
Belardi, R.; Bindi, M.; Grasso, F.; Luchetta, A.; Manetti, S.; Piccirilli, M. A complex neural classifier for the fault prognosis and diagnosis of overhead electrical lines. In IOP Conference Series: Earth and Environmental Science, Proceedings of the 2020 International Conference on Advanced Electrical and Energy Systems 18–21 August 2020, Osaka, Japan; IOP Publishing: Bristol, UK, 2020; Volume 582, p. 012001. [Google Scholar]
Kim, H.; Lee, H.; Kim, S.; Kim, S.W. Attention recurrent neural network-based severity estimation method for early-stage fault diagnosis in robot harness cable. Sensors 2023, 23, 5299. [Google Scholar] [CrossRef]
Boschetti, G.; Minto, R. A sensorless approach for cable failure detection and identification in cable-driven parallel robots. Robot. Auton. Syst. 2025, 183, 104855. [Google Scholar] [CrossRef]
Soualhi, M.; Nguyen, K.T.; Medjaher, K.; Nejjari, F.; Puig, V.; Blesa, J.; Quevedo, J.; Marlasca, F. Dealing with prognostics uncertainties: Combination of direct and recursive remaining useful life estimations. Comput. Ind. 2023, 144, 103766. [Google Scholar] [CrossRef]
Son, S.; Oh, K.Y. Integrated framework for estimating remaining useful lifetime through a deep neural network. Appl. Soft Comput. 2022, 122, 108879. [Google Scholar] [CrossRef]
Solís-Martín, D.; Galán-Páez, J.; Borrego-Díaz, J. On the soundness of XAI in prognostics and health management (PHM). Information 2023, 14, 256. [Google Scholar] [CrossRef]
Kim, S.; Kim, N.H.; Choi, J.H. Prediction of remaining useful life by data augmentation technique based on dynamic time warping. Mech. Syst. Signal Process. 2020, 136, 106486. [Google Scholar] [CrossRef]
Fathi, K.; van de Venn, H.W.; Honegger, M. Predictive maintenance: An autoencoder anomaly-based approach for a 3 DoF delta robot. Sensors 2021, 21, 6979. [Google Scholar] [CrossRef] [PubMed]
Soualhi, M.; Nguyen, K.; Medjaher, K.; Zerhouni, N. Remaining useful life estimation of turbofan engines using adaptive fault detection learning. In Proceedings of the Annual Conference of the PHM Society, London, UK, 27–29 May 2022; Volume 14. [Google Scholar]
Nguyen, T.K.; Ahmad, Z.; Nguyen, D.T.; Kim, J.M. A remaining useful lifetime prediction model for concrete structures using Mann-Whitney U test state indicator and deep learning. Mech. Syst. Signal Process. 2025, 222, 111795. [Google Scholar] [CrossRef]
Ellefsen, A.L.; Ushakov, S.; Æsøy, V.; Zhang, H. Validation of data-driven labeling approaches using a novel deep network structure for remaining useful life predictions. IEEE Access 2019, 7, 71563–71575. [Google Scholar] [CrossRef]
Trinh, H.C.; Kwon, Y.K. A data-independent genetic algorithm framework for fault-type classification and remaining useful life prediction. Appl. Sci. 2020, 10, 368. [Google Scholar] [CrossRef]
Mochammad, S.; Noh, Y.; Kim, N.H. Enhancing Realistic Remaining Useful Life Prediction through Multi-fidelity Physics-Informed Approaches. In Proceedings of the 15th Annual Conference of the Prognostics and Health Management Society, PHM 2023, Prognostics and Health Management Society, Salt Lake City, UT, USA, 28 October–2 November 2023. [Google Scholar]
Gebraeel, N.; Lei, Y.; Li, N.; Si, X.; Zio, E. Prognostics and remaining useful life prediction of machinery: Advances, opportunities and challenges. J. Dyn. Monit. Diagn. 2023, 2, 1–12. [Google Scholar]
Ramezani, S.B.; Cummins, L.; Killen, B.; Carley, R.; Amirlatifi, A.; Rahimi, S.; Seale, M.; Bian, L. Scalability, explainability and performance of data-driven algorithms in predicting the remaining useful life: A comprehensive review. IEEE Access 2023, 11, 41741–41769. [Google Scholar] [CrossRef]
Cofré-Martel, S.; Droguett, E.; Modarres, M. Physics-Informed Neural Networks for Remaining Useful Life Estimation of a Vapor Recovery Unit System. In Proceedings of the Probabilistic Safety Assessment and Management, PSAM 2022, Honolulu, HI, USA, 26 June–1 July 2022. [Google Scholar]
Rebaiaia, M.L.; Ait-Kadi, D. A remaining useful life model for optimizing maintenance cost and spare-parts replacement of production systems in the context of sustainability. IFAC-PapersOnLine 2022, 55, 1562–1568. [Google Scholar] [CrossRef]
Byun, J.; Min, S.; Kang, J. RUL Prognostics: Recursive Bayesian Ensemble Prediction with Combining Artificial Degradation Patterns. Int. J. Progn. Health Manag. 2023, 14. [Google Scholar]
Zhang, C.; Gong, D.; Xue, G. An uncertainty-incorporated active data diffusion learning framework for few-shot equipment RUL prediction. Reliab. Eng. Syst. Saf. 2025, 254, 110632. [Google Scholar] [CrossRef]
Crespo, M.A.; Candón, E.; Gómez, J.; Serra, J. A comparison of machine learning techniques for LNG pumps fault prediction in regasification plants. IFAC-PapersOnLine 2020, 53, 125–130. [Google Scholar]
Rojek, M.; Blachnik, M. A Dataset and a Comparison of Classification Methods for Valve Plate Fault Prediction of Piston Pump. Appl. Sci. 2024, 14, 7183. [Google Scholar] [CrossRef]
Borriello, P.; Tessicini, F.; Ricucci, G.; Frosina, E.; Senatore, A. A fault detection strategy for an ePump during EOL tests based on a knowledge-based vibroacoustic tool and supervised machine learning classifiers. Meccanica 2024, 59, 279–304. [Google Scholar] [CrossRef]
Azeez, A.A.; Mazzei, P.; Minav, T.; Frosina, E.; Senatore, A. A New Approach to Study the Effect of Complexity on an External Gear Pump Model to Generate Data Source for AI-Based Condition Monitoring Application. Actuators 2023, 12, 401. [Google Scholar] [CrossRef]
Hu, Q.; Ohata, E.F.; Silva, F.H.; Ramalho, G.L.; Han, T.; Reboucas Filho, P.P. A new online approach for classification of pumps vibration patterns based on intelligent IoT system. Measurement 2020, 151, 107138. [Google Scholar] [CrossRef]
Indriawati, K.; Yugoputra, G.F.; Habibah, N.N.; Yudhanto, R. Artificial Neural Network-Based Fault Detection System with Residual Analysis Approach on Centrifugal Pump: A Case Study. Int. J. Automot. Mech. Eng. 2023, 20, 10285–10297. [Google Scholar] [CrossRef]
Salim, K.; Hebri, R.S.A.; Besma, S. Classification predictive maintenance using XGBoost with genetic algorithm. Rev. D’Intelligence Artif. 2022, 36, 833. [Google Scholar] [CrossRef]
Alenany, A.; Helmi, A.M.; Nasef, B.M. Comprehensive analysis for sensor-based hydraulic system condition monitoring. Int. J. Adv. Comput. Sci. Appl. 2021, 12, 133–140. [Google Scholar] [CrossRef]
Keller, N.; Sciancalepore, A.; Vacca, A. Condition monitoring of an axial piston pump on a mini excavator. Int. J. Fluid Power 2023, 24, 171–206. [Google Scholar] [CrossRef]
Keller, N.; Sciancalepore, A.; Vacca, A. Demonstrating a condition monitoring process for axial piston pumps with damaged valve plates. Int. J. Fluid Power 2022, 23, 205–236. [Google Scholar] [CrossRef]
Kim, W.; Lim, C.; Chai, J. Development of a SDMS (self-diagnostic monitoring system) with prognostics for a reciprocating pump system. Nucl. Eng. Technol. 2020, 52, 1188–1200. [Google Scholar] [CrossRef]
Ranawat, N.S.; Kankar, P.K.; Miglani, A. Fault diagnosis in centrifugal pump using support vector machine and artificial neural network. J. Eng. Res. 2021, EMSME Special Issue, 99–111. [Google Scholar] [CrossRef]
Azizi, R.; Attaran, B.; Hajnayeb, A.; Ghanbarzadeh, A.; Changizian, M. Improving accuracy of cavitation severity detection in centrifugal pumps using a hybrid feature selection technique. Measurement 2017, 108, 9–17. [Google Scholar] [CrossRef]
Orrù, P.F.; Zoccheddu, A.; Sassu, L.; Mattia, C.; Cozza, R.; Arena, S. Machine learning approach using MLP and SVM algorithms for the fault prediction of a centrifugal pump in the oil and gas industry. Sustainability 2020, 12, 4776. [Google Scholar] [CrossRef]
Bruinsma, S.; Geertsma, R.; Loendersloot, R.; Tinga, T. Motor current and vibration monitoring dataset for various faults in an E-motor-driven centrifugal pump. Data Brief 2024, 52, 109987. [Google Scholar] [CrossRef]
Loukatos, D.; Kondoyanni, M.; Alexopoulos, G.; Maraveas, C.; Arvanitis, K.G. On-Device Intelligence for Malfunction Detection of Water Pump Equipment in Agricultural Premises: Feasibility and Experimentation. Sensors 2023, 23, 839. [Google Scholar] [CrossRef]
Panda, A.K.; Rapur, J.S.; Tiwari, R. Prediction of flow blockages and impending cavitation in centrifugal pumps using Support Vector Machine (SVM) algorithms based on vibration measurements. Measurement 2018, 130, 44–56. [Google Scholar] [CrossRef]
Kim, D.; Heo, T.Y. Anomaly detection with feature extraction based on machine learning using hydraulic system IoT sensor data. Sensors 2022, 22, 2479. [Google Scholar] [CrossRef]
Wu, K.; Xing, Y.; Chu, N.; Wu, P.; Cao, L.; Wu, D. A carrier wave extraction method for cavitation characterization based on time synchronous average and time-frequency analysis. J. Sound Vib. 2020, 489, 115682. [Google Scholar] [CrossRef]
Kim, S.; Akpudo, U.E.; Hur, J.W. A cost-aware DNN-based FDI technology for solenoid pumps. Electronics 2021, 10, 2323. [Google Scholar] [CrossRef]
Ali, H. Cavitation Analysis in Centrifugal Pumps Based on Vibration Bispectrum and Transfer Learning. Shock Vib. 2021, 2021, 6988949. [Google Scholar]
Samanipour, P.; Poshtan, J.; Sadeghi, H. Cavitation detection in centrifugal pumps using pressure time-domain features. Turk. J. Electr. Eng. Comput. Sci. 2017, 25, 4287–4298. [Google Scholar] [CrossRef]
Sunal, C.E.; Velisavljevic, V.; Dyo, V.; Newton, B.; Newton, J. Centrifugal Pump Fault Detection with Convolutional Neural Network Transfer Learning. Sensors 2024, 24, 2442. [Google Scholar] [CrossRef]
Kim, K.; Jeong, J. Deep learning-based data augmentation for hydraulic condition monitoring system. Procedia Comput. Sci. 2020, 175, 20–27. [Google Scholar] [CrossRef]
Karagiovanidis, M.; Pantazi, X.E.; Papamichail, D.; Fragos, V. Early detection of cavitation in centrifugal pumps using low-cost vibration and sound sensors. Agriculture 2023, 13, 1544. [Google Scholar] [CrossRef]
Keleko, A.T.; Kamsu-Foguem, B.; Ngouna, R.H.; Tongne, A. Health condition monitoring of a complex hydraulic system using Deep Neural Network and DeepSHAP explainable XAI. Adv. Eng. Softw. 2023, 175, 103339. [Google Scholar] [CrossRef]
Barraza, J.F.; Bräuning, L.F.G.; Droguett, E.L.; Martins, M.R. Long short-term memory network for future-state prediction in water injection pump. In Proceedings of the 30th European Safety and Reliability Conference and the 15th Probabilistic Safety Assessment and Management Conference, Venice, Italy, 1–5 November 2020; pp. 1–5. [Google Scholar]
König, C.; Helmi, A.M. Sensitivity analysis of sensors in a hydraulic condition monitoring system using CNN models. Sensors 2020, 20, 3307. [Google Scholar] [CrossRef]
Lee, G.H.; Akpudo, U.E.; Hur, J.W. FMECA and MFCC-based early wear detection in gear pumps in cost-aware monitoring systems. Electronics 2021, 10, 2939. [Google Scholar] [CrossRef]
Wang, L.; Liu, Y.; Yin, H.; Sun, W. Fault diagnosis and predictive maintenance for hydraulic system based on digital twin model. AIP Adv. 2022, 12, 065213. [Google Scholar] [CrossRef]
Taser, P.Y. An Ordinal Multi-Dimensional Classification (OMDC) for Predictive Maintenance. Comput. Syst. Sci. Eng. 2023, 44, 1499–1516. [Google Scholar] [CrossRef]
Askari, B.; Carli, R.; Cavone, G.; Dotoli, M. Data-driven fault diagnosis in a complex hydraulic system based on early classification. IFAC-PapersOnLine 2022, 55, 187–192. [Google Scholar] [CrossRef]
Zuo, L.; Xu, F.; Zhang, C.; Xiahou, T.; Liu, Y. A multi-layer spiking neural network-based approach to bearing fault diagnosis. Reliab. Eng. Syst. Saf. 2022, 225, 108561. [Google Scholar] [CrossRef]
Alhams, A.; Abdelhadi, A.; Badri, Y.; Sassi, S.; Renno, J. Enhanced Bearing Fault Diagnosis Through Trees Ensemble Method and Feature Importance Analysis. J. Vib. Eng. Technol. 2024, 12, 109–125. [Google Scholar] [CrossRef]
Velásquez, R.M.A. Bearings faults and limits in wind turbine generators. Results Eng. 2024, 21, 101891. [Google Scholar] [CrossRef]
Liu, J.; Zuo, H. Failure prediction with statistical analysis of bearing using deep forest model and change point detection. Eng. Appl. Artif. Intell. 2024, 133, 108504. [Google Scholar] [CrossRef]
Wang, H.; Liu, Z.; Peng, D.; Cheng, Z. Attention-guided joint learning CNN with noise robustness for bearing fault diagnosis and vibration signal denoising. ISA Trans. 2022, 128, 470–484. [Google Scholar] [CrossRef]
Zhang, D.; Stewart, E.; Entezami, M.; Roberts, C.; Yu, D. Intelligent acoustic-based fault diagnosis of roller bearings using a deep graph convolutional network. Measurement 2020, 156, 107585. [Google Scholar] [CrossRef]
Xu, Y.; Li, Z.; Wang, S.; Li, W.; Sarkodie-Gyan, T.; Feng, S. A hybrid deep-learning model for fault diagnosis of rolling bearings. Measurement 2021, 169, 108502. [Google Scholar] [CrossRef]
Saravanakumar, R.; Krishnaraj, N.; Venkatraman, S.; Sivakumar, B.; Prasanna, S.; Shankar, K. Hierarchical symbolic analysis and particle swarm optimization based fault diagnosis model for rotating machineries with deep neural networks. Measurement 2021, 171, 108771. [Google Scholar] [CrossRef]
Zuo, L.; Zhang, L.; Zhang, Z.H.; Luo, X.L.; Liu, Y. A spiking neural network-based approach to bearing fault diagnosis. J. Manuf. Syst. 2021, 61, 714–724. [Google Scholar] [CrossRef]
Guan, Y.; Meng, Z.; Sun, D.; Liu, J.; Fan, F. Rolling bearing fault diagnosis based on information fusion and parallel lightweight convolutional network. J. Manuf. Syst. 2022, 65, 811–821. [Google Scholar] [CrossRef]
Sinitsin, V.; Ibryaeva, O.; Sakovskaya, V.; Eremeeva, V. Intelligent bearing fault diagnosis method combining mixed input and hybrid CNN-MLP model. Mech. Syst. Signal Process. 2022, 180, 109454. [Google Scholar] [CrossRef]
Miao, Y.; Gao, S.; Kong, Y.; Jiang, Z.; Han, Q.; Chu, F. Variable reluctance bearing generators applicable in condition monitoring of bearing cages. Mech. Syst. Signal Process. 2023, 194, 110249. [Google Scholar] [CrossRef]
Shen, S.; Lu, H.; Sadoughi, M.; Hu, C.; Nemani, V.; Thelen, A.; Webster, K.; Darr, M.; Sidon, J.; Kenny, S. A physics-informed deep learning approach for bearing fault detection. Eng. Appl. Artif. Intell. 2021, 103, 104295. [Google Scholar] [CrossRef]
Kumar, A.; Zhou, Y.; Gandhi, C.; Kumar, R.; Xiang, J. Bearing defect size assessment using wavelet transform based Deep Convolutional Neural Network (DCNN). Alex. Eng. J. 2020, 59, 999–1012. [Google Scholar] [CrossRef]
Lyu, P.; Zhang, K.; Yu, W.; Wang, B.; Liu, C. A novel RSG-based intelligent bearing fault diagnosis method for motors in high-noise industrial environment. Adv. Eng. Inform. 2022, 52, 101564. [Google Scholar] [CrossRef]
Ruan, D.; Wang, J.; Yan, J.; Gühmann, C. CNN parameter design based on fault signal analysis and its application in bearing fault diagnosis. Adv. Eng. Inform. 2023, 55, 101877. [Google Scholar] [CrossRef]
Yang, Z.B.; Zhang, J.P.; Zhao, Z.B.; Zhai, Z.; Chen, X.F. Interpreting network knowledge with attention mechanism for bearing fault diagnosis. Appl. Soft Comput. 2020, 97, 106829. [Google Scholar] [CrossRef]
González-Muñiz, A.; Díaz, I.; Cuadrado, A.A. DCNN for condition monitoring and fault detection in rotating machines and its contribution to the understanding of machine nature. Heliyon 2020, 6, e03395. [Google Scholar] [CrossRef]
Wang, J.; Mo, Z.; Zhang, H.; Miao, Q. A deep learning method for bearing fault diagnosis based on time-frequency image. IEEE Access 2019, 7, 42373–42383. [Google Scholar] [CrossRef]
Xie, J.; Du, G.; Shen, C.; Chen, N.; Chen, L.; Zhu, Z. An end-to-end model based on improved adaptive deep belief network and its application to bearing fault diagnosis. IEEE Access 2018, 6, 63584–63596. [Google Scholar] [CrossRef]
Ding, X.; Wang, H.; Cao, Z.; Liu, X.; Liu, Y.; Huang, Z. An edge intelligent method for bearing fault diagnosis based on a parameter transplantation convolutional neural network. Electronics 2023, 12, 1816. [Google Scholar] [CrossRef]
Dai, J.; Tian, L.; Chang, H. An Intelligent Diagnostic Method for Wear Depth of Sliding Bearings Based on MGCNN. Machines 2024, 12, 266. [Google Scholar] [CrossRef]
Xu, G.; Liu, M.; Jiang, Z.; Söffker, D.; Shen, W. Bearing fault diagnosis method based on deep convolutional neural network and random forest ensemble learning. Sensors 2019, 19, 1088. [Google Scholar] [CrossRef]
Li, Z.; Li, Y.; Sun, Q.; Qi, B. Bearing fault diagnosis method based on convolutional neural network and knowledge graph. Entropy 2022, 24, 1589. [Google Scholar] [CrossRef]
van den Hoogen, J.; Bloemheuvel, S.; Atzmueller, M. Classifying multivariate signals in rolling bearing fault detection using adaptive wide-kernel CNNs. Appl. Sci. 2021, 11, 11429. [Google Scholar] [CrossRef]
Kahr, M.; Kovács, G.; Loinig, M.; Brückl, H. Condition monitoring of ball bearings based on machine learning with synthetically generated data. Sensors 2022, 22, 2490. [Google Scholar] [CrossRef]
Mahesh, T.; Saravanan, C.; Ram, V.A.; Kumar, V.V.; Vivek, V.; Guluwadi, S. Data-driven intelligent condition adaptation of feature extraction for bearing fault detection using deep responsible active learning. IEEE Access 2024, 12, 45381–45397. [Google Scholar] [CrossRef]
Piltan, F.; Duong, B.P.; Kim, J.M. Deep learning-based adaptive neural-fuzzy structure scheme for bearing fault pattern recognition and crack size identification. Sensors 2021, 21, 2102. [Google Scholar] [CrossRef]
Shenfield, A.; Howarth, M. A novel deep learning model for the detection and identification of rolling element-bearing faults. Sensors 2020, 20, 5112. [Google Scholar] [CrossRef] [PubMed]
Rumin, P.; Kotowicz, J.; Hogg, D.; Zastawna-Rumin, A. Utilization of measurements, machine learning, and analytical calculation for preventing belt flip over on conveyor belts. Measurement 2023, 218, 113157. [Google Scholar] [CrossRef]
Elahi, M.; Afolaranmi, S.O.; Mohammed, W.M.; Martinez Lastra, J.L. FASTory assembly line power consumption data. Data Brief 2023, 48, 109160. [Google Scholar] [CrossRef] [PubMed]
Elahi, M.; Afolaranmi, S.O.; Mohammed, W.M.; Martinez Lastra, J.L. Energy-based prognostics for gradual loss of conveyor belt tension in discrete manufacturing systems. Energies 2022, 15, 4705. [Google Scholar] [CrossRef]
Martinsen, M.; Fentaye, A.D.; Dahlquist, E.; Zhou, Y. Holistic approach promotes failure prevention of smart mining machines based on Bayesian networks. Machines 2023, 11, 940. [Google Scholar] [CrossRef]
Pulcini, V.; Modoni, G. Machine learning-based digital twin of a conveyor belt for predictive maintenance. Int. J. Adv. Manuf. Technol. 2024, 133, 6095–6110. [Google Scholar] [CrossRef]
Dayo-Olupona, O.; Genc, B.; Celik, T.; Bada, S. Adoptable approaches to predictive maintenance in mining industry: An overview. Resour. Policy 2023, 86, 104291. [Google Scholar] [CrossRef]
Siami, M.; Barszcz, T.; Wodecki, J.; Zimroz, R. Automated identification of overheated belt conveyor idlers in thermal images with complex backgrounds using binary classification with CNN. Sensors 2022, 22, 10004. [Google Scholar] [CrossRef]
Souza, F.M.d.C.; Filho, G.P.R.; Guimarães, F.G.; Meneguette, R.I.; Pessin, G. Navigating Market Sentiments: A Novel Approach to Iron Ore Price Forecasting with Weighted Fuzzy Time Series. Information 2024, 15, 251. [Google Scholar] [CrossRef]
Parmar, P.; Jurdziak, L.; Rzeszowska, A.; Burduk, A. Predictive Modeling of Conveyor Belt Deterioration in Coal Mines Using AI Techniques. Energies 2024, 17, 3497. [Google Scholar] [CrossRef]
Siami, M.; Barszcz, T.; Wodecki, J.; Zimroz, R. Semantic segmentation of thermal defects in belt conveyor idlers using thermal image augmentation and U-Net-based convolutional neural networks. Sci. Rep. 2024, 14, 5748. [Google Scholar] [CrossRef] [PubMed]
Robatto Simard, S.; Gamache, M.; Doyon-Poulin, P. Development and Usability Evaluation of VulcanH, a CMMS Prototype for Preventive and Predictive Maintenance of Mobile Mining Equipment. Mining 2024, 4, 326–351. [Google Scholar] [CrossRef]
Wijaya, H.; Rajeev, P.; Gad, E.; Vivekanamtham, R. Distributed optical fibre sensor for condition monitoring of mining conveyor using wavelet transform and artificial neural network. Struct. Control Health Monit. 2021, 28, e2827. [Google Scholar] [CrossRef]
IEEE P2802; Standard for the Performance and Safety Evaluation of Artificial Intelligence Based Medical Device: Terminology. IEEE: New York, NY, USA, 2023.
Tan, T. Working Process Analysis of Submarine Cable Detection Robot Using Machine Vision and Reinforcement Learning. Comput.-Aided Des. Appl. 2025, S7, 285–298. [Google Scholar] [CrossRef]

Figure 1. Semi-automated literature review framework outlining key steps, from document retrieval to topic validation and analysis.

Figure 2. Flow chart depicting the NLP-based framework used to process abstracts collected from Scopus.

Figure 3. Dendrogram illustrating the hierarchical clustering of topics, with the first six topics grouped closely together and highlighted for detailed analysis.

Figure 4. Overview of the bibliometric data analysed in this study, showcasing key metrics such as publication timespan (2017–2024); the number of sources, documents, and references; and collaboration statistics.

Figure 5. Visualization of top sources contributing to the dataset, highlighting key journals and conferences on condition monitoring and predictive maintenance.

Figure 6. Author contributions and citations over time, illustrating the number of articles (bubble size) and yearly citations (colour scale) for each author in predictive maintenance and condition monitoring research.

Figure 7. Global distribution of research contributions, illustrating the frequency of publications by country in the field of predictive maintenance and condition monitoring. Labels highlight the top contributing countries, while the colour scale reflects the relative research intensity.

Figure 8. Term analysis over time, illustrating the temporal distribution of terms related to predictive maintenance and condition monitoring.

Figure 9. Concept network analysis of predictive maintenance and condition monitoring, illustrating thematic clusters and key term relationships.

Figure 10. Illustration of data sources for predictive maintenance, highlighting real-time sensors, historical data, intermediate systems, and external platforms.

Figure 11. Overview of signal processing and data preparation techniques, including transformations, feature extraction, data cleaning, and augmentation.

Figure 12. Overview of machine learning and advanced analytics in predictive maintenance, highlighting core techniques, specialized models, optimization methods, and explainability tools.

Figure 13. Summary of key challenges in AI/ML-based predictive maintenance, including data limitations, concept drift, computational costs, explainability, IoT integration, and scalability.

Figure 14. Industry 5.0 implementation framework integrating physical assets, edge/fog AI, cloud learning, and the digital thread with human-in-the-loop feedback.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Garcia, J.; Rios-Colque, L.; Peña, A.; Rojas, L. Condition Monitoring and Predictive Maintenance in Industrial Equipment: An NLP-Assisted Review of Signal Processing, Hybrid Models, and Implementation Challenges. Appl. Sci. 2025, 15, 5465. https://doi.org/10.3390/app15105465

