1. Introduction
Industry 4.0 reconceptualises machines as interconnected cyber–physical systems in which value comes not only from hardware performance but also from a system’s ability to sense, understand, and react to its own condition over time [
1,
2,
3,
4]. In practice, this requires sensor technologies that offer complementary views of operation, such as embedded electrical sensing within drives alongside vibration, acoustic, and thermal measurements on the actuator structure, in addition to data analytics that transform these data streams into actionable decisions like health indicators, early fault warnings, and planned maintenance rather than reactive repairs [
5,
6,
7,
8]. As these ideas move beyond manufacturing into safety-critical domains, they directly pertain to medical robotic devices where stable and repeatable actuation must be maintained over hundreds of sessions with minutes of acceptable downtime [
2].
Servomotor-driven robotic platforms have become integral to post-stroke and neurorehabilitation programs because repeatable, precise motion can be delivered safely over long sessions. As these systems move from research labs to daily clinical use, reliability becomes a safety constraint and a clinical duty: therapy must not be interrupted, and unintended forces must not reach the patient. Failures frequently cluster in the actuation chain (servomotors, transmissions, and drives), where low-speed reversals, intermittent loading, and heat accelerate wear and drift. Recent reviews of rehabilitation robotics and clinical adoption echo this need, calling reliability a key barrier to scale [
9,
10,
11,
12,
13]. Against this backdrop, a predictive-maintenance viewpoint centered on servomotors is therefore warranted to preserve both safety and therapy continuity.
Predictive maintenance (PdM) refers to data-driven strategies that estimate health, diagnose faults early, and predict remaining useful life (RUL) to plan maintenance before failure. Modern PdM combines physics-aware features with machine learning and runs close to the robot—often on edge hardware to meet real-time constraints [
14,
15,
16,
17,
18]. In robotic actuation, the most observable signals originate from servomotors and their drives: currents, voltages, speeds, temperatures, and vibration or sound from bearings and gear stages. These signals form the basis for condition indicators that are stable across different patients and therapy tasks. In this setting, a proactive view of component health becomes essential to avoid unexpected interruptions and sustain safe operation. Recent studies have moved beyond single-sensor settings by (i) using multi-sensor fusion networks (CNN/Transformer hybrids) to improve fault classification under changing operating modes, and (ii) adding uncertainty-aware RUL prediction so maintenance decisions are more reliable. Examples include multi-sensor sparse Transformer fusion for intelligent fault diagnosis [
19], ensemble Transformer-based motor fault diagnosis with multi-mode time series [
20], and feature-fusion diagnosis models that emphasize spatiotemporal consistency [
21]. On the prognostics side, recent work highlights calibrated prediction intervals via conformal prediction and related uncertainty-quantification strategies for online RUL prediction [
22,
23,
24,
25].
The rehabilitation context given in
Figure 1 shapes both failure modes and sensing. Servomotors must produce smooth torque at low speeds and frequent direction changes. The robot mechanics (links, transmissions, and compliance) filter and mix excitations, while patient–robot interaction adds variable contact stiffness and voluntary or reflexive disturbances. This triad (servomotor, robot, rehabilitation task) lowers the fault signal-to-noise ratio, shifts spectral content to very low frequencies, and makes single-sensor methods brittle. As a result, robust PdM favors multi-sensor fusion (current, vibration, acoustic, temperature), interpretable health indicators, and learning that tolerates domain shifts across tasks and patients [
26,
27,
28,
29].
These practical constraints have motivated a substantial body of work on reliability and maintenance planning for robotic systems, especially in settings where long-term safety and availability are critical. System-level reliability and maintenance optimization were studied for industrial and rehabilitation robots, including importance-measure-based preventive plans and exoskeleton-specific reliability models that account for cost, safety, and availability [
12,
13]. Clinical and technical reviews continue to stress reliability and safety as adoption bottlenecks [
9,
10,
11].
Nonintrusive motor current signature analysis (MCSA) enables diagnostics without modifying the drive wiring [
26]. For permanent-magnet synchronous machines (PMSMs) common in precision robotics, drive-side monitoring (currents/voltages) and thermal cues have been shown to detect demagnetization, inter-turn short, and bearing degradation [
30,
31]. Application-oriented studies for servo-bearing faults report improvements through feature learning and lightweight deep models [
32,
33]. In parallel, time–frequency representations combined with lightweight CNN designs can improve motor fault recognition, especially when signals vary across operating regimes [
34]. For servo-drive fault scenarios, recent work also reports phase-voltage-based diagnosis strategies for fault-tolerant multi-phase permanent-magnet servo motor drive systems [
35].
Because rehabilitation robots operate at low speed with frequent reversals, classic high-speed vibration markers lose strength. Recent work fuses sound and vibration to recover separability under variable speed/loads and to improve robustness [
27,
28,
29]. Acoustic-feature enhancement and condition-adaptive time–frequency imaging were also explored for early bearing faults under noise and nonstationarity [
36,
37,
38].
Few-shot and transfer/meta-learning have been advanced to cope with domain shifts across operating conditions, tasks, and hardware, frequently with explicit multi-sensor fusion in the learning pipeline [
16,
17]. RUL prediction with monotonic, trendable health indicators has been emphasized to make decisions explainable and stable across sessions [
39,
40]. MDPI and Elsevier studies also report practical pipelines that use hybrid features with attention or diffusion-based augmentation to sustain accuracy under limited labeled data [
16,
41].
Edge/embedded constraints drive interest in observers and compact deep models for real-time fault detection in robot drives [
31,
32]. These studies target on-board inference latency, power limits, and maintainability—all essential in clinical settings where downtime is costly and patient-facing safety margins must remain conservative.
Although experimental validation is performed on a rehabilitation robot, the proposed framework targets generic multi-mode industrial servo systems such as CNC feed drives, robotic machining cells and automated production lines, where frequent mode switching, low-speed operation and variable loads are common. In light of the above, this study is positioned to address gaps that appear when PdM is transferred to rehabilitation robots: (i) low-speed, reversal-rich trajectories; (ii) multi-sensor streams that must align with patient–robot interaction; and (iii) edge execution. The following contributions are provided:
A servomotor-centered sensing design for rehabilitation robots that combines currents/voltages, vibration, airborne acoustics, and temperatures, with synchronized acquisition under therapy-like trajectories.
A fault-indicator construction that favors monotonicity, trendability, and prognosability for use in RUL estimation and maintenance scheduling in clinical duty cycles [
39,
40].
A multi-sensor fusion pipeline that remains robust across operating conditions and tasks, drawing on recent domain-adaptation and few-shot advances [
16,
17].
An embedded-friendly implementation path to satisfy real-time constraints observed in clinical operation [
31,
32].
2. System Description and Data Acquisition
Figure 2 summarizes the complete workflow of the proposed framework. Raw multi-sensor signals are first pre-processed and segmented into fixed-length windows, and then features are extracted from each window. Next, an unsupervised health indicator is computed as the Servomotor Health Score (SHS) using an autoencoder-based reconstruction error followed by isotonic regression to obtain a monotone health trend. Finally, the resulting SHS is used as an input for two downstream tasks: RUL prediction with Gradient Boosting and fault classification with Random Forest.
The rehabilitation robot is used as a representative cyber–physical testbed for multi-mode industrial servo drives operating under variable load and frequent mode transitions. This study focuses on the servomotor actuation units of an upper-limb rehabilitation robot designed for guided exercise (
Figure 3). Each actuator includes a permanent-magnet synchronous servomotor (PMSM), a harmonic reducer, and a high-resolution encoder. Position tracking and force regulation are performed through an impedance/admittance-based control framework, and safe human–robot interaction is supported through software limits and torque constraints [
10].
The motors are driven through field-oriented control (FOC). Phase currents, DC-bus voltage, estimated torque, and speed are collected from the drive. Faults in mechanical transmission components—such as bearings, harmonic gear elements, or couplings—tend to appear early in both electrical and vibro-acoustic signals [
26,
27].
The sensing set includes electrical, vibration, acoustic, thermal, and kinematic signals.
Table 1 summarizes sensor locations, measured quantities, units, nominal ranges, and sampling rates. This selection supports the extraction of time, frequency, and time–frequency features as well as cross-channel coherence measures [
16,
28,
42], which are later used in
Section 3.
Data are collected through four scenarios designed to reflect typical clinical use: (1) nominal tracking with sinusoidal and trapezoidal speed profiles; (2) patient-interaction emulation using external elastic loads; (3) varying torque and speed through multi-level operating points; (4) fault-simulated conditions created by increasing mechanical friction, introducing misalignment, enlarging gear backlash, and applying controlled electrical imbalance.
Sliding windows of length
s and 50% overlap are used.
s was selected because it captures at least one full mechanical revolution at typical therapy speeds (up to 150 rpm) while keeping the signals within a window approximately stationary for time–frequency descriptors. The 50% overlap increases the number of training samples and yields smoother SHS trajectories, which is helpful for early-warning decisions. Because the split is performed at the unit level (
Section 4.1), overlapping windows do not cause information leakage across train/validation/test subsets. This structure matches the time–frequency feature extraction and coherence analysis described in
Section 3 [
16,
27,
31]. The resulting multi-sensor dataset enables the construction of a unified SHS with properties suitable for both early warning and RUL estimation [
14,
16,
39].
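The windowing step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `sliding_windows` and its parameters (`fs_hz`, `window_s`, `overlap`) are hypothetical, and the window length must be supplied by the user because its value is not reproduced here.

```python
import numpy as np

def sliding_windows(signal, fs_hz, window_s, overlap=0.5):
    """Split a 1-D signal into fixed-length windows with fractional overlap.

    Returns an array of shape (n_windows, samples_per_window).
    """
    x = np.asarray(signal, dtype=float)
    win = int(round(window_s * fs_hz))            # samples per window
    if win <= 0 or not 0.0 <= overlap < 1.0:
        raise ValueError("window must be positive and overlap in [0, 1)")
    hop = max(1, int(round(win * (1.0 - overlap))))  # step between window starts
    if x.size < win:
        return np.empty((0, win))
    n_win = 1 + (x.size - win) // hop
    # Index matrix: row k holds the sample indices of window k.
    idx = np.arange(win)[None, :] + hop * np.arange(n_win)[:, None]
    return x[idx]
```

With 50% overlap, consecutive windows share half of their samples, so train/validation/test splits should be made per unit before windowing, as the text notes.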
3. Methodology
3.1. Feature Extraction from Multi-Sensor Signals
Servomotor behaviour is reflected in several physical domains, including vibration, electrical current, rotational speed, torque, and temperature. A single sensor cannot capture all degradation mechanisms; therefore, a multi-sensor feature extraction strategy is adopted, which is consistent with recent reviews on vibration analysis and multi-sensor fusion for equipment fault diagnosis [
43,
44]. The goal is to transform raw measurements into a compact set of numerical descriptors that are sensitive to fault evolution but robust to operating variability.
Let the set of synchronised sensor channels be $\mathcal{S}$. For each sensor $s \in \mathcal{S}$, the raw signal is represented by $x_s(t)$. The signal is divided into overlapping windows $x_{s,n}$ of length $L$ samples, where $n$ is the window index. Before extracting features, each channel is normalized using a robust scaling procedure to reduce the influence of impulsive noise and slow drift, which is commonly adopted in vibration pre-processing [
43].
The transformation in Equation (1) uses the constant 1.4826. This standard factor converts the Median Absolute Deviation (MAD) into a scale consistent with the standard deviation for Gaussian data.
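A minimal sketch of median/MAD robust scaling is given below. It assumes the common form (sample minus median, divided by 1.4826 times the MAD); the exact expression of Equation (1) is not reproduced here, and the small constant `eps` is a hypothetical safeguard against division by zero.

```python
import numpy as np

MAD_TO_STD = 1.4826  # makes the MAD consistent with the std for Gaussian data

def robust_scale(x, eps=1e-12):
    """Center by the median and scale by the MAD-based spread estimate."""
    x = np.asarray(x, dtype=float)
    med = np.median(x)
    mad = np.median(np.abs(x - med))
    return (x - med) / (MAD_TO_STD * mad + eps)
```

Because the median and the MAD are insensitive to a few extreme samples, an isolated spike changes the scaling far less than it would with mean/standard-deviation normalization.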
Time-domain statistical features are computed for each window of every channel. These descriptors, such as the mean in Equation (
A1), the standard deviation in Equation (
A2), the root mean square (RMS) in Equation (
A3), the skewness in Equation (
A4), the kurtosis in Equation (
A5), and the crest factor in Equation (
A6), are widely used in bearing and motor fault diagnosis [
45,
46]. To improve readability, the explicit equations for feature definitions (Equations (
A1)–(
A12)) are moved to
Appendix A.
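For illustration, the six time-domain descriptors can be computed per window as sketched below. The conventions are assumptions, since Appendix A is not reproduced here: population (biased) moments are used, the kurtosis is not excess kurtosis, and `eps` is a hypothetical guard for constant windows.

```python
import numpy as np

def time_domain_features(window, eps=1e-12):
    """Mean, std, RMS, skewness, kurtosis and crest factor of one window."""
    w = np.asarray(window, dtype=float)
    mu = w.mean()
    sigma = w.std()                      # population standard deviation
    rms = np.sqrt(np.mean(w ** 2))
    centered = w - mu
    return {
        "mean": mu,
        "std": sigma,
        "rms": rms,
        "skewness": np.mean(centered ** 3) / (sigma ** 3 + eps),
        "kurtosis": np.mean(centered ** 4) / (sigma ** 4 + eps),
        "crest_factor": np.max(np.abs(w)) / (rms + eps),
    }
```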
Frequency-domain features are obtained from the discrete Fourier transform (DFT) of each window. Power-spectrum-based descriptors, such as band power, spectral centroid and spectral entropy, are effective for rotating machinery diagnostics [
45,
46]. Let $X_{s,n}[k]$ denote the DFT of the signal from sensor $s$ in window $n$, with $k$ the frequency-bin index. The power spectrum is given by Equation (
A7).
The band power in a frequency band $[f_1, f_2]$ is defined in Equation (
A8). This measure represents the total signal energy concentrated within a selected frequency interval, and it is particularly useful for identifying fault-related activity that appears in specific harmonic or sideband regions.
The spectral centroid is given by Equation (
A9). It reflects the distribution of energy across the spectrum and shifts toward higher frequencies when the signal contains sharper or more impulsive components, which may indicate mechanical degradation.
Using the normalized power spectrum, scaled so that its bins sum to one, the spectral entropy is defined in Equation (
A10). This descriptor quantifies the irregularity or disorder of the spectral content, with higher entropy typically associated with more complex or noise-like vibration patterns.
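The three frequency-domain descriptors can be sketched with a one-sided FFT as follows. The scaling of the power spectrum (squared magnitude divided by the window length) and the natural-logarithm entropy are assumptions, because Equations (A7)–(A10) are not reproduced here; `band_hz` is a hypothetical parameter naming the band edges.

```python
import numpy as np

def spectral_features(window, fs_hz, band_hz, eps=1e-12):
    """Band power, spectral centroid and spectral entropy of one window."""
    w = np.asarray(window, dtype=float)
    power = np.abs(np.fft.rfft(w)) ** 2 / w.size      # one-sided power spectrum
    freqs = np.fft.rfftfreq(w.size, d=1.0 / fs_hz)    # bin frequencies in Hz
    lo, hi = band_hz
    in_band = (freqs >= lo) & (freqs <= hi)
    total = power.sum() + eps
    p = power / total                                  # normalized spectrum, sums to ~1
    return {
        "band_power": power[in_band].sum(),
        "centroid_hz": (freqs * power).sum() / total,
        "entropy": -np.sum(p * np.log(p + eps)),
    }
```

For a pure tone that falls exactly on a frequency bin, the centroid equals the tone frequency and the entropy is close to zero; broadband or noise-like windows give a higher entropy.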
Cross-sensor relationships between mechanical and electrical domains are captured through coherence features. When vibration and current are monitored simultaneously, coherence can reveal electromechanical coupling effects and misalignment phenomena [
47,
48]. The magnitude-squared coherence between a vibration sensor $v$ and a current sensor $i$ is given by Equation (
A11).
The averaged coherence in a frequency band $[f_1, f_2]$ is defined in Equation (
A12). This metric summarizes the degree of linear coupling between vibration and current signals over the selected band, and higher values typically indicate stronger electromechanical interaction associated with misalignment, eccentricity or load-dependent effects.
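A band-averaged coherence feature can be sketched with a Welch-style estimate (Hann window, 50% segment overlap) written in plain NumPy. The segment length `nperseg`, the overlap, and the function names are assumptions for illustration; the estimator settings used in the study are not stated here.

```python
import numpy as np

def magnitude_squared_coherence(x, y, fs_hz, nperseg=256):
    """Welch-style magnitude-squared coherence between two equal-rate signals."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    hop = nperseg // 2
    n_seg = 1 + (min(x.size, y.size) - nperseg) // hop
    if n_seg < 2:
        # With a single segment the estimate is identically one and carries no information.
        raise ValueError("at least two segments are required")
    win = np.hanning(nperseg)
    sxx, syy, sxy = 0.0, 0.0, 0.0j
    for k in range(n_seg):
        seg = slice(k * hop, k * hop + nperseg)
        xf = np.fft.rfft(win * (x[seg] - x[seg].mean()))
        yf = np.fft.rfft(win * (y[seg] - y[seg].mean()))
        sxx = sxx + np.abs(xf) ** 2      # accumulated auto-spectrum of x
        syy = syy + np.abs(yf) ** 2      # accumulated auto-spectrum of y
        sxy = sxy + xf * np.conj(yf)     # accumulated cross-spectrum
    freqs = np.fft.rfftfreq(nperseg, d=1.0 / fs_hz)
    return freqs, np.abs(sxy) ** 2 / (sxx * syy + 1e-20)

def band_coherence(freqs, coh, band_hz):
    """Average coherence over the bins inside [lo, hi] Hz."""
    lo, hi = band_hz
    mask = (freqs >= lo) & (freqs <= hi)
    return float(coh[mask].mean())
```

Two linearly related signals give a band coherence near one, while independent noise gives a small positive value that shrinks as more segments are averaged.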
Finally, all features from all sensors and domains are concatenated into a single feature vector as expressed in Equation (
2), which serves as the input to the subsequent health-score and classification models. This feature-level fusion is consistent with current multi-sensor fault diagnosis frameworks that combine vibration, acoustic and electrical measurements for higher robustness and accuracy [
44,
49,
50,
51,
52,
53].
In this study, a fixed and physics-motivated feature set was used rather than an additional data-driven feature selection step. This choice keeps the pipeline simple and reproducible, while the autoencoder in the SHS module provides nonlinear dimensionality reduction through its latent representation.
Across the sensing suite in
Table 2, 12 signal channels (3-axis vibration, 1 acoustic, 3-phase current, DC-link voltage, speed, torque, encoder speed, and temperature) were obtained. For each channel, 6 time-domain and 3 frequency-domain features (9 features per channel) were computed. In addition, 9 cross-sensor coherence features, one for each pairing of the 3 vibration axes with the 3 current phases, were calculated. Therefore, the final feature dimension is $12 \times 9 + 9 = 117$.
3.2. Servomotor Health Score (SHS) Definition
While the feature vector in Equation (
2) captures detailed information from multiple sensors, it is often convenient for monitoring and decision making to reduce this high-dimensional representation to a single scalar health indicator that summarizes the overall condition of the servomotor. In this work, this indicator is referred to as the SHS. The SHS is designed to take values in the interval $[0, 1]$, with values close to one corresponding to healthy behaviour and values approaching zero indicating severe deviation from normal operation. To construct such a score, an autoencoder model is trained exclusively on data collected under healthy conditions, following recent practice in health indicator construction for rotating machinery and other safety-critical systems [
54,
55,
56,
57].
Let $f_{\theta}$ and $g_{\theta}$ denote the encoder and decoder mappings of the autoencoder, parameterized by $\theta$. For each feature vector $\mathbf{x}_n$, the autoencoder produces a latent representation and its reconstruction as given in Equation (
3). Here, the encoder compresses the multi-sensor information into a low-dimensional latent space, while the decoder attempts to reconstruct the original feature vector.
The autoencoder parameters are obtained by minimizing the reconstruction error over the set of healthy training windows. A regularized loss function is used, as shown in Equation (
4), where the first term enforces accurate reconstruction and the second term encourages sparsity in the latent representation to improve interpretability and robustness.
Once the autoencoder has been trained, the reconstruction error for each window $n$ is computed. Because the feature vector in Equation (2) is composed of contributions from different sensors, a weighted error measure is used to reflect their relative importance and reliability. The resulting weighted reconstruction error is defined in Equation (5), where $w_s$ are non-negative sensor weights that sum to one. In this study, uniform weights, $w_s = 1/S$ with $S$ the number of sensors, were used to avoid introducing extra tuning parameters and to keep the SHS comparable across units. In general, $w_s$ can be adapted (e.g., based on sensor reliability or noise level) to emphasize specific sensing channels, which directly changes each sensor's contribution to the aggregate reconstruction error.
To make the magnitude of the weighted reconstruction error comparable across different operating conditions and experiments, the error is standardized using the mean and standard deviation estimated from healthy data. The standardized error is given in Equation (6). This step ensures that the subsequent mapping to the SHS uses a dimensionless and normalized quantity.
The SHS is then obtained by applying a logistic transformation to the standardized error, as shown in Equation (7). The scaling factor of the logistic function controls the steepness of the transition between healthy and degraded states. This mapping compresses the unbounded standardized error into the interval $(0, 1)$ and yields a monotonically decreasing function of the deviation from normal behaviour.
Unless otherwise stated, the scaling factor was held at a single fixed value in all experiments. Because the input to the logistic mapping is a standardized reconstruction error, the scaling factor mainly controls the slope of the mapping (how quickly the SHS saturates near 0 or 1) and does not change the ranking of windows by abnormality. For this reason, the scaling factor should be considered jointly with any decision threshold defined on the SHS.
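The chain from reconstruction error to raw SHS can be sketched as below. The specific forms are assumptions, since Equations (5)–(7) are not reproduced here: the per-sensor error is taken as a mean squared error over that sensor's features, and the logistic map is written as 1 / (1 + exp(kappa (z - z_offset))). The names `groups`, `kappa` and `z_offset` are hypothetical, and `z_offset` is included only because the exact centering of the published mapping is unknown.

```python
import numpy as np

def weighted_recon_error(x, x_hat, groups, weights=None):
    """Weighted per-window reconstruction error.

    x, x_hat: arrays of shape (n_windows, n_features).
    groups:   list of feature-index lists, one per sensor.
    """
    x = np.asarray(x, dtype=float)
    x_hat = np.asarray(x_hat, dtype=float)
    if weights is None:
        weights = np.full(len(groups), 1.0 / len(groups))   # uniform, sums to one
    err = np.zeros(x.shape[0])
    for w_s, idx in zip(weights, groups):
        err += w_s * np.mean((x[:, idx] - x_hat[:, idx]) ** 2, axis=1)
    return err

def shs_from_error(err, mu_healthy, sigma_healthy, kappa=1.0, z_offset=0.0):
    """Standardize the error with healthy statistics, then apply a decreasing logistic map."""
    z = (np.asarray(err, dtype=float) - mu_healthy) / (sigma_healthy + 1e-12)
    arg = np.clip(kappa * (z - z_offset), -50.0, 50.0)       # avoid overflow in exp
    return 1.0 / (1.0 + np.exp(arg))
```

For any positive `kappa` the map is strictly decreasing in the standardized error, so changing `kappa` rescales the scores without reordering the windows, which is why the scaling factor and the decision threshold have to be chosen together.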
In practice, even under gradual degradation, short-term fluctuations in operating conditions can cause small local increases in the raw SHS. For RUL estimation and trend analysis, it is often desirable to enforce a non-increasing health trajectory. To this end, an isotonic regression step is applied to the raw SHS sequence, as formulated in Equation (8). The solution is the closest non-increasing sequence to the original scores in a least-squares sense and is adopted as the final SHS.
Here, the feasible set in Equation (8) is the set of non-increasing sequences. The resulting sequence is generally smoother and more suitable for prognostics than the raw scores, while still reflecting the information learned by the autoencoder. To quantify the suitability of the final SHS as a health indicator, three standard metrics are considered: monotonicity, trendability and prognosability [55,56]. The monotonicity index in Equation (9) measures the proportion of time steps where the score does not increase, using the indicator function $\mathbb{1}(\cdot)$.
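The least-squares projection onto non-increasing sequences can be computed with the pool-adjacent-violators algorithm. The sketch below is a plain-Python illustration with unit weights; it is not taken from the study, and the function name is hypothetical.

```python
def isotonic_nonincreasing(scores):
    """Closest non-increasing sequence to `scores` in the least-squares sense (PAVA)."""
    blocks = []                      # each block is [sum of values, number of values]
    for v in scores:
        blocks.append([float(v), 1])
        # Merge while the newest block mean exceeds the previous block mean,
        # i.e. while the non-increasing constraint is violated.
        while len(blocks) > 1 and (
            blocks[-1][0] / blocks[-1][1] > blocks[-2][0] / blocks[-2][1]
        ):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)   # every member takes its block mean
    return fitted
```

For example, the raw scores 1.0, 0.8, 0.9, 0.5 contain one local increase; the two offending values are pooled to their mean, giving approximately 1.0, 0.85, 0.85, 0.5.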
Trendability reflects the strength of the relationship between the SHS and time. It is expressed in Equation (10) as the absolute value of the Spearman correlation coefficient between the final SHS and the window index $n$.
Finally, prognosability captures how tightly clustered the SHS values are near the failure point across different degradation trajectories. It is computed from the set of SHS values observed close to failure over multiple runs. The prognosability index is defined in Equation (11); values closer to one indicate that the end-of-life SHS distribution is narrow and therefore easier to use as a failure threshold.
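The three indicator-quality metrics can be sketched in plain Python as follows. Monotonicity and trendability follow the verbal definitions given above. The prognosability formula is an assumption: the widely used form exp(-std(end values) / mean(|start - end|)) is shown because Equation (11) is not reproduced here, and all function names are hypothetical.

```python
import math

def monotonicity(h):
    """Fraction of steps in which the score does not increase."""
    steps = len(h) - 1
    if steps < 1:
        return 1.0
    return sum(1 for a, b in zip(h, h[1:]) if b <= a) / steps

def _average_ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks

def trendability(h):
    """Absolute Spearman correlation between the score and the window index."""
    n = len(h)
    rh = _average_ranks(list(h))
    rt = [float(i + 1) for i in range(n)]
    mh, mt = sum(rh) / n, sum(rt) / n
    cov = sum((a - mh) * (b - mt) for a, b in zip(rh, rt))
    vh = sum((a - mh) ** 2 for a in rh)
    vt = sum((b - mt) ** 2 for b in rt)
    if vh == 0.0 or vt == 0.0:
        return 0.0
    return abs(cov / math.sqrt(vh * vt))

def prognosability(end_values, start_values):
    """exp(-spread of end-of-life values / mean start-to-end range), one value per run."""
    n = len(end_values)
    mean_end = sum(end_values) / n
    std_end = math.sqrt(sum((v - mean_end) ** 2 for v in end_values) / n)
    mean_range = sum(abs(s - e) for s, e in zip(start_values, end_values)) / n
    return math.exp(-std_end / mean_range) if mean_range > 0.0 else 0.0
```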
A health indicator with high monotonicity, strong trendability and good prognosability is particularly suitable for RUL estimation, because its evolution over time closely reflects the underlying degradation process. The SHS constructed through Equations (
3)–(
11), therefore, serves as the main input to the RUL models described later in the methodology.
3.3. Fault Classification Framework
The SHS provides a compact indication of how far the servomotor has moved away from its healthy state, but maintenance actions usually require more specific information about what kind of fault is developing. The fault classification framework, therefore, complements the SHS by assigning each time window to a small set of condition classes, such as healthy operation or specific fault modes (for example, increased friction, misalignment, or gearbox-related issues). The goal is to keep the decision process simple and easy to interpret, while still exploiting the information contained in the multi-sensor features and the SHS.
The structure of the framework is illustrated in
Figure 4. On the left, the raw signals from the different sensors (such as vibration, current, speed and temperature) are processed by the feature extraction stage, which transforms each time window into a numerical feature vector. In parallel, the SHS computation module uses the same feature sequence to produce a scalar health score that decreases as the motor degrades. These two outputs are then combined into a joint representation that captures both detailed signal characteristics and the global health trend.
In the middle, this joint representation is fed into a classifier block, which has been trained beforehand using labeled data where the operating condition of the motor is known. The classifier converts its input into a set of class scores or probabilities, one for each defined condition class. Typical models that can be used here include gradient-boosted trees, support vector machines, or lightweight neural networks, depending on the computational budget and the amount of training data available. The internal details of the classifier are not critical for the framework; what matters is that it can learn to distinguish between the different fault patterns present in the feature space.
On the right-hand side, a decision and alarm logic block interprets the classifier output together with the SHS. When the SHS indicates a clearly healthy regime, only strong and consistent classifier evidence is translated into a warning, which helps avoid false alarms during regular operation. As the SHS decreases and approaches predefined warning or critical levels, the system becomes more sensitive: persistent predictions of a specific fault class are more likely to trigger maintenance recommendations or closer inspection. In this way, the framework uses the SHS as a health-aware context for the classification results, combining continuous condition monitoring with discrete fault labels in a single, coherent structure that can be deployed on an embedded controller or supervisory computer.
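The health-aware alarm logic described above can be sketched as a small rule set. Every numeric value and name in this example (`shs_warning`, `shs_critical`, `strict_prob`, `relaxed_prob`, `persistence`, the class labels) is a hypothetical placeholder chosen for illustration; none of them is reported in the study.

```python
def alarm_decision(shs, recent_probs, healthy_label="healthy",
                   shs_warning=0.6, shs_critical=0.3,
                   strict_prob=0.9, relaxed_prob=0.6, persistence=3):
    """Combine the current SHS with recent classifier outputs.

    recent_probs: list of {label: probability} dicts, newest window last.
    Returns (health_level, fault_label_or_None).
    """
    if shs <= shs_critical:
        level = "critical"
    elif shs <= shs_warning:
        level = "warning"
    else:
        level = "healthy"

    # In the healthy regime, demand stronger evidence over a longer run of windows.
    min_prob = strict_prob if level == "healthy" else relaxed_prob
    needed = 2 * persistence if level == "healthy" else persistence

    tail = recent_probs[-needed:]
    if len(tail) < needed:
        return level, None

    top_labels = [max(p, key=p.get) for p in tail]
    fault = top_labels[-1]
    consistent = (
        fault != healthy_label
        and all(label == fault for label in top_labels)
        and all(p[fault] >= min_prob for p in tail)
    )
    return level, (fault if consistent else None)
```

The same three windows predicting "misalignment" with probability 0.8 would raise a fault at an SHS of 0.5 but not at an SHS of 0.9, which mirrors the intended behavior: sensitivity increases as the health score falls.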
3.4. RUL Estimation Based on SHS Degradation
The RUL is defined as the remaining number of operational cycles until end-of-life in the run-to-failure experiments. In this study, RUL is estimated using a supervised Gradient Boosting regression model trained on failing units, with the multi-sensor feature vector and the SHS as inputs (Section 4). This learning-based RUL module does not require manually selecting a critical SHS threshold. For completeness, a simple SHS-trend extrapolation approach is also described that can be used when run-to-failure labels are unavailable; in that case, a critical SHS threshold is required. The monotonic SHS sequence obtained from Equation (8) is treated as a one-dimensional degradation signal whose downward trend reflects the gradual loss of health. This signal is used as the basis for RUL estimation, in line with recent health-indicator-based prognostics frameworks that first construct a robust health index and then fit a degradation model on top of it [39,58,59]. Each feature window $n$ is associated with a time stamp and an SHS value, and the critical threshold is chosen to represent the boundary between acceptable and unacceptable operation. In this study, the threshold was set using the percentile-based rule described in Section 4.2. It is interpreted as an actionable maintenance boundary (early intervention) rather than catastrophic failure.
A simple and widely used strategy is to approximate the evolution of the health index by a smooth parametric trend fitted to the most recent part of the SHS trajectory [
60,
61]. When the degradation appears approximately exponential, the SHS is modeled as a decaying function of time as expressed in Equation (12), where the two model parameters are identified from data by least-squares or robust regression:
Given the current time and the current SHS value, the predicted RUL under the exponential model is obtained by finding the future time at which the fitted trend reaches the critical threshold; this leads to the closed-form expression in Equation (
13), which is straightforward to implement even on resource-limited hardware [
60]:
In other applications, the SHS decreases in an almost linear manner over the relevant time range, for example, when wear progresses at a nearly constant rate; in such cases, a linear degradation model can be more appropriate and easier to interpret [
61]. The SHS is then approximated by the affine function in Equation (
14), where the slope $a$ and the intercept are fitted coefficients:
The corresponding RUL estimate is obtained by solving for the time at which the linear trend reaches the critical threshold, which gives Equation (
15):
Both trend models in Equations (
12)–(
15) keep the relationship between the SHS trajectory and the RUL estimate transparent, so that the predicted lifetime can be explained to end users and the fitted parameters can be checked for consistency with engineering expectations. Calibration of the critical threshold and of the exponential or linear model parameters is typically carried out offline using run-to-failure datasets or long-term field data, and can be updated as more operational histories become available [
58,
61]. More advanced approaches, such as deep learning and domain-adaptation models that directly map health-index sequences across operating conditions to RUL predictions and provide uncertainty-aware outputs [
39,
62], can be embedded into the same framework at a later stage; however, the simple exponential and linear trends in Equations (
12)–(
15) already offer a practical and computationally efficient solution that is consistent with recent practice in RUL estimation based on health indicators for rotating machinery and related electromechanical systems [
60,
61].
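Both trend-extrapolation variants can be sketched with an ordinary least-squares line fit. The parameterizations are assumptions, since Equations (12)–(15) are not reproduced here: the exponential trend is written as alpha * exp(-beta * t) and fitted on log(SHS), and the linear trend as a * t + b. Function names and the choice to return infinity for a non-degrading trend are hypothetical.

```python
import math

def fit_line(t, y):
    """Ordinary least-squares slope and intercept of y against t."""
    n = len(t)
    mt, my = sum(t) / n, sum(y) / n
    sxx = sum((ti - mt) ** 2 for ti in t)
    if sxx == 0.0:
        raise ValueError("time stamps must not all be equal")
    slope = sum((ti - mt) * (yi - my) for ti, yi in zip(t, y)) / sxx
    return slope, my - slope * mt

def rul_linear(t, shs, shs_crit):
    """RUL from a linear trend shs ~ a*t + b, measured from the last time stamp."""
    a, b = fit_line(t, shs)
    if a >= 0.0:
        return math.inf                      # no downward trend, no crossing predicted
    return max(0.0, (shs_crit - b) / a - t[-1])

def rul_exponential(t, shs, shs_crit):
    """RUL from an exponential trend shs ~ alpha*exp(-beta*t), fitted on log(shs)."""
    slope, log_alpha = fit_line(t, [math.log(v) for v in shs])
    beta = -slope
    if beta <= 0.0:
        return math.inf
    t_crit = (log_alpha - math.log(shs_crit)) / beta
    return max(0.0, t_crit - t[-1])
```

In practice the fit would be restricted to the most recent part of the SHS trajectory, as the text describes, so that the estimate tracks the current degradation rate. The log-domain fit requires strictly positive SHS values, which the logistic mapping guarantees.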
5. Discussion
5.1. Clinical and Industrial Significance of the Results
From an industrial perspective, the proposed SHS-centered framework directly addresses the requirements of process monitoring and fault diagnosis in multi-mode manufacturing systems, where servomotor health critically determines product quality, throughput and downtime. The proposed framework was designed for servomotor units in an upper-limb rehabilitation robot, where unplanned downtime can interrupt therapy and pose safety concerns. The combined use of a learning-based SHS, fault classification, and RUL prediction provides a layered view of condition that is directly relevant to such clinical settings as well as to industrial servo-driven machinery.
The SHS exhibited smooth, predominantly monotonic degradation trajectories across units and fault types, with accelerated decline near end-of-life. The use of a fixed decision threshold at produced an early-warning mechanism with a median early-warning horizon of about cycles on test units, while no false alarms were raised on healthy windows. This behavior suggests that maintenance actions can be scheduled with meaningful lead time, for example, by rescheduling patients to other devices or planning service outside therapy hours in clinical environments, or by aligning maintenance with planned production breaks in industrial settings. The absence of false positives on healthy data is particularly important in rehabilitation robots, where unnecessary alarms or shutdowns can disrupt treatment continuity.
The fault classification stage complements the SHS by providing more specific diagnostic information. On the held-out test set, an overall multi-class accuracy of 79.9% and a macro F1-score of 79.6% were obtained. The healthy class was recognized perfectly, indicating that normal operation can be reliably distinguished from degraded conditions. Misclassifications were concentrated among mechanically related faults, such as eccentricity, bearing degradation and misalignment, which share similar physical signatures and often require similar maintenance actions (inspection of mechanical components, adjustment, lubrication or replacement). From a practical standpoint, this pattern implies that the classifier separates normal versus abnormal conditions reliably, while residual ambiguity is confined to fault types that are also closely related from a maintenance perspective.
RUL prediction based on Gradient Boosting and multi-sensor features augmented with the SHS achieved a mean absolute error of cycles and a root-mean-square error of cycles on unseen failing units. Phase-wise analysis showed that errors were larger earlier in life and became progressively smaller as failure approached, which is consistent with the intuition that late-life behavior carries stronger prognostic information. In clinical use, such accuracy in terms of cycles of the exercised trajectory can support decisions such as whether a device can safely complete a planned therapy block before maintenance. In industrial use, the same information can guide whether operation can continue until the next scheduled stop, or whether an earlier intervention is required.
The sensor attribution and ablation analyses provide further practical insight. Aggregated feature importances showed that current and temperature channels contributed the largest fraction of explained importance, while vibration and acoustic channels provided complementary information, especially for difficult discriminations among mechanical fault types. Training the classifier with a single sensor group yielded test-set accuracies of approximately 69.6% for current-derived features, and substantially lower values for vibration, acoustic and temperature groups when used alone, whereas the full multi-sensor feature set increased accuracy to 79.9% and improved separation of mechanical faults. These findings indicate that existing electrical and thermal measurements, which are already available in many commercial drives, can support a useful baseline diagnostic capability, while the addition of vibro-acoustic sensing offers clearer fault separation when the installation allows it. Overall, the results suggest that the proposed SHS-centered framework can be integrated into predictive maintenance strategies for both clinical rehabilitation robots and industrial servo systems, offering early warning, coarse fault categorization and quantitative RUL estimates without excessive computational burden.
5.2. Methodological Limitations
Several limitations of the methodology need to be considered when interpreting these results and when planning deployment beyond the studied setup. First, the data were collected from the servomotor actuation units of a specific upper-limb rehabilitation robot with a particular motor–drive–reducer configuration and field-oriented control scheme. Operating speeds, torque ranges and motion profiles were selected to reflect typical therapeutic use rather than the full range of possible conditions. As a consequence, the learned SHS, fault classifier and RUL regressor are tuned to this configuration. Different motor ratings, gearbox designs, load characteristics or control strategies may lead to different signal patterns and may require re-training, re-scaling of the SHS and adjustment of the decision threshold.
Second, the fault scenarios were created under controlled conditions. Mechanical and electrical degradations were induced by, for example, increasing mechanical friction, introducing misalignment, adding gear backlash and applying controlled electrical imbalance. These scenarios approximate common degradation mechanisms but do not cover the full variety of natural wear processes, combined faults or environment-related effects (e.g., contamination, temperature extremes, vibration from the larger robot structure). Furthermore, the sample distribution across fault types was imbalanced. Certain mechanical faults, such as bearing wear, were represented by many more windows than some electrical faults. This imbalance is reflected in the confusion matrix, where rare fault types are more prone to misclassification. Therefore, performance estimates for underrepresented classes should be interpreted with caution.
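The effect of class imbalance on the reported metrics can be made visible by reading per-class support, recall and F1 directly from a confusion matrix. The following is a minimal sketch under the assumption that rows index the true class and columns the predicted class; the function names and the example matrix are hypothetical and do not reproduce the confusion matrix of this study.

```python
def per_class_metrics(confusion):
    """Compute support, precision, recall and F1 for each class.

    Assumption: confusion[i][j] is the number of windows whose true class
    is i and whose predicted class is j.
    """
    n = len(confusion)
    metrics = []
    for c in range(n):
        tp = confusion[c][c]
        support = sum(confusion[c])                       # true windows of class c
        predicted = sum(confusion[r][c] for r in range(n))  # windows predicted as c
        recall = tp / support if support else 0.0
        precision = tp / predicted if predicted else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics.append({"support": support, "precision": precision,
                        "recall": recall, "f1": f1})
    return metrics


def macro_f1(confusion):
    """Unweighted mean of per-class F1, so rare classes count equally."""
    metrics = per_class_metrics(confusion)
    return sum(m["f1"] for m in metrics) / len(metrics)


# Hypothetical 3-class example: the third class has only 10 windows.
example = [[90, 10, 0],
           [5, 40, 5],
           [0, 4, 6]]
for label, m in enumerate(per_class_metrics(example)):
    print(label, m["support"], round(m["recall"], 3), round(m["f1"], 3))
print("macro F1:", round(macro_f1(example), 3))
```

Because the macro average weights every class equally, a small number of misclassified windows in a rare class can shift it noticeably, which is one reason the estimates for underrepresented classes carry wider uncertainty.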
Third, the amount and structure of the data impose limitations. The RUL model was trained only on units that were driven to failure under the experimental protocol, with RUL expressed in cycles of the exercised trajectory. The number of failing units per fault type, as well as the number of units exhibiting a clear SHS threshold crossing before failure, was limited (for example, only a subset of test units contributed to the early-warning horizon statistics). As a result, the reported mean absolute error and root-mean-square error for RUL, and the median early-warning horizon of approximately 164.5 cycles, are conditioned on this specific protocol. Extrapolation to much longer horizons, different duty cycles, or mixed usage patterns outside the tested regime is not guaranteed.
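To make explicit why only a subset of units contributes to the early-warning statistics, the horizon computation can be sketched as follows. The sketch assumes that the horizon is the number of cycles between the first SHS threshold crossing and the failure cycle, and that a higher SHS indicates stronger degradation; both the score direction (exposed as the parameter `higher_is_worse`) and the dictionary layout of each unit are assumptions made for illustration.

```python
from statistics import median


def first_crossing(cycles, shs, threshold, higher_is_worse=True):
    """Return the first cycle at which the SHS crosses the threshold, or None."""
    for cycle, score in zip(cycles, shs):
        crossed = score >= threshold if higher_is_worse else score <= threshold
        if crossed:
            return cycle
    return None


def early_warning_horizons(units, threshold, higher_is_worse=True):
    """Collect horizons (failure cycle minus first crossing) per unit.

    Units that never cross the threshold before failure are skipped,
    so the returned list can be shorter than the list of units.
    """
    horizons = []
    for unit in units:
        crossing = first_crossing(unit["cycles"], unit["shs"],
                                  threshold, higher_is_worse)
        if crossing is not None and crossing < unit["failure_cycle"]:
            horizons.append(unit["failure_cycle"] - crossing)
    return horizons


# Synthetic trajectories for illustration only (not data from this study).
units = [
    {"cycles": [0, 100, 200, 300], "shs": [0.1, 0.2, 0.6, 0.9], "failure_cycle": 400},
    {"cycles": [0, 100, 200], "shs": [0.1, 0.1, 0.2], "failure_cycle": 250},
    {"cycles": [0, 50, 100, 150], "shs": [0.2, 0.55, 0.7, 0.9], "failure_cycle": 150},
]
horizons = early_warning_horizons(units, threshold=0.5)
print(len(horizons), "of", len(units), "units contribute; median =", median(horizons))
```

In this toy example the second unit never crosses the threshold and is excluded, which mirrors how the median horizon is conditioned on the units that produced a warning at all.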
Fourth, the modeling choices introduce additional constraints. The SHS is obtained from an autoencoder trained exclusively on healthy windows, assuming that the training set covers the diversity of normal operation. If new operating modes become common in practice (e.g., different therapy exercises, altered motion ranges, different patient interaction profiles), healthy data from these modes could initially be scored as abnormal until the SHS model is updated. The feature extraction pipeline uses a fixed window length of s with 50% overlap and a predefined set of time and frequency-domain features. Phenomena that evolve at time scales much shorter or much longer than this window, or that are only visible in more specialized features, may therefore be underrepresented. In addition, the Random Forest and Gradient Boosting models are trained offline and kept fixed at deployment; concept drift caused by hardware aging, sensor replacement, or changes in control tuning is not addressed.
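The fixed-window constraint discussed above can be illustrated with a minimal segmentation sketch. The window length is given here in samples and the RMS feature is only one example of a time-domain feature; the function names, the sample counts and the choice of feature are assumptions for illustration, not the feature set of this study.

```python
import math


def sliding_windows(samples, window_len, overlap=0.5):
    """Split a sequence into fixed-length windows with fractional overlap.

    Trailing samples that do not fill a complete window are discarded.
    """
    if window_len <= 0:
        raise ValueError("window_len must be positive")
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be in [0, 1)")
    # With 50% overlap, each window starts half a window after the previous one.
    step = max(1, int(round(window_len * (1 - overlap))))
    return [samples[i:i + window_len]
            for i in range(0, len(samples) - window_len + 1, step)]


def rms(window):
    """Root-mean-square value of one window (a simple time-domain feature)."""
    return math.sqrt(sum(x * x for x in window) / len(window))


signal = list(range(10))  # placeholder samples, not measured data
windows = sliding_windows(signal, window_len=4, overlap=0.5)
print([w[0] for w in windows])          # start sample of each window
print([round(rms(w), 3) for w in windows])
```

Any event shorter than the step between windows is averaged into one or two feature values, and any trend longer than a window appears only as a slow change across many windows, which is why phenomena at very different time scales may be underrepresented.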
Finally, practical integration aspects were not fully explored in this work. The present study focused on offline analysis, and real-time implementation constraints on embedded hardware or drive controllers were not systematically quantified. Synchronization of high-rate vibration and acoustic signals with electrical and telemetry channels, as well as the data volume associated with dense multi-sensor logging, may pose challenges in large installations or in clinics with limited data infrastructure. These aspects will need careful consideration before wide-scale deployment.
5.3. Future Work
Several directions are suggested by the current findings and limitations. A first line of work concerns data expansion and fault coverage. Longer-term monitoring of a larger number of servomotor units under routine clinical operation would permit observation of naturally occurring degradation, including combined faults and slow wear processes that were not fully represented in the present experiments. Additional fault types, such as reducer tooth wear, brake malfunction, persistent overload, sensor degradation and drive-electronics faults, could be incorporated to obtain a more complete and realistic fault taxonomy.
A second direction involves validation across different robotic platforms and motor-drive architectures. The general structure of the SHS, the fault classifier and the RUL regressor is not restricted to rehabilitation robots and could be applied to exoskeletons, collaborative manipulators and other servo-driven systems. Future studies may therefore assess how well the trained SHS and models transfer between robots with different gear ratios, load inertias and control loops. Domain adaptation or transfer learning strategies could be used to re-use a core SHS representation while adapting the final classification and regression layers to each platform.
Third, online and continual learning approaches can be explored. In the current framework, the SHS autoencoder and supervised models are trained once and remain static thereafter. In practice, new healthy data and new failure cases will accumulate over time. Incremental or streaming variants of the autoencoder and tree-based models, combined with drift detection mechanisms, could allow gradual adaptation of the SHS distribution, the decision threshold, and the fault and RUL models to evolving operating conditions. Semi-supervised schemes, in which only a subset of windows are labeled by experts, may further reduce the annotation burden while maintaining diagnostic performance.
Fourth, integration into clinical and industrial workflows can be refined. The SHS trajectory, fault probabilities and RUL estimates may be presented to clinicians or maintenance engineers through simple visual interfaces (for example, traffic-light indicators combined with trend plots) that support risk-aware decisions without requiring expertise in signal processing. Coupling the predictive maintenance outputs with scheduling modules could enable automatic suggestions for when a device should be taken out of service, reassigned, or inspected, taking into account patient bookings or production plans.
Finally, methodological refinements may be investigated. Alternative health indicators that exploit temporal models (e.g., sequence-based encoders) or probabilistic formulations could be compared with the current SHS in terms of monotonicity, trendability and prognosability. Hybrid schemes that combine the SHS with physics-informed features of the drive and mechanical transmission may improve interpretability and robustness. In addition, systematic benchmarking against deep learning baselines that operate directly on raw multi-sensor time series would clarify the trade-offs between accuracy, computational cost and data requirements. Through these extensions, the proposed framework could be further strengthened and generalized for predictive maintenance of servomotor systems in both clinical rehabilitation and broader industrial environments.
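As an example of how such a comparison of health indicators might be set up, the sketch below computes a monotonicity score for a single trajectory. It uses one commonly used definition, the absolute difference between the numbers of positive and negative consecutive increments divided by the number of increments; other definitions exist, and trendability and prognosability are not covered here. The function name is hypothetical.

```python
def monotonicity(trajectory):
    """Monotonicity of a health-indicator trajectory, in the range [0, 1].

    Assumed definition: |#positive increments - #negative increments|
    divided by the number of increments. A value of 1 means strictly
    monotonic; a value near 0 means no consistent direction.
    """
    if len(trajectory) < 2:
        raise ValueError("at least two points are required")
    diffs = [b - a for a, b in zip(trajectory, trajectory[1:])]
    positive = sum(1 for d in diffs if d > 0)
    negative = sum(1 for d in diffs if d < 0)
    return abs(positive - negative) / len(diffs)


# Synthetic trajectories for illustration only.
print(monotonicity([0.1, 0.2, 0.3, 0.4]))       # strictly increasing
print(monotonicity([0.1, 0.3, 0.2, 0.4, 0.5]))  # one reversal
print(monotonicity([1, 0, 1, 0, 1]))            # oscillating
```

Averaging this score over all run-to-failure units would give a single number per candidate indicator, which could then be placed alongside the other criteria when comparing alternatives to the current SHS.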
6. Conclusions
A multi-sensor predictive-maintenance framework for the servomotor units of an upper-limb rehabilitation robot was developed and evaluated. The first key finding is that the learning-based SHS provided a smooth and mostly monotonic description of degradation over time, with a simple fixed threshold at yielding early warnings without false alarms on healthy windows. A median early-warning horizon of approximately 164.5 cycles was obtained on test units that crossed the threshold before failure, indicating that the SHS can offer practical lead time for planned interventions. The second key finding is that, on this basis, multi-class fault classification and RUL estimation reached levels of performance that are useful for decision making. The Random Forest classifier trained on multi-sensor features achieved an overall accuracy of 79.9% and a macro F1-score of 79.6% on the test set, with perfect recognition of healthy windows and confusions concentrated among mechanically related faults. The Gradient Boosting RUL model reached a mean absolute error of cycles and a root-mean-square error of cycles on unseen failing units, with smaller errors closer to the end of life. The third key finding is that the feature-importance and ablation analyses showed that current and temperature channels already support a strong baseline performance, and that vibration and acoustic signals provide complementary information that improves separation between fault families, confirming the value of the proposed multi-sensor design.
These findings suggest that servomotor-centered predictive maintenance can be turned into a practical tool for both clinical and industrial environments that rely on servo-driven robots. In rehabilitation robots, the SHS, fault probabilities and RUL estimates can be used to protect therapy continuity and safety by indicating when a device should be inspected or temporarily removed from clinical use before a failure occurs. The same framework can be embedded near the drive electronics or at the edge and can operate mainly on signals that are already available in many commercial servo systems, which reduces integration effort. When additional vibro-acoustic sensing is feasible, more detailed fault separation becomes possible and can support more targeted maintenance actions. The results indicate that the proposed approach provides a coherent health indicator, a reliable distinction between healthy and faulty behavior, and an interpretable estimate of remaining useful life. It therefore has strong potential as a building block for predictive-maintenance strategies in multi-mode industrial servo systems with similar actuation chains, including CNC feed drives, robotic machining platforms, and cyber–physical manufacturing cells.