1. Introduction
Traditionally, reliability engineering has focused on quantifying the likelihood that a system will perform as intended within a given time frame [1]. Over the past few years, reliability modeling has advanced from relying solely on parametric lifetime distributions to also including data-driven methods, hybrid physics–machine learning techniques, and performance-based reliability metrics [2]. Recent research has proposed combining neural-network hazard estimation with a mixture distribution structure to improve the modeling of complex electromechanical systems, providing a more realistic representation of non-stationary degradation behavior [3]. In addition, current work increasingly focuses on degradation mechanisms rather than time-to-failure abstractions [4]. Despite these methodological advances, failure is still defined mainly by physical breakdown rather than by loss of functional control authority in dynamically controlled systems.
Classical reliability models are based on survival analysis and stochastic failure theory, in which failure occurs as the physical breakdown of a material through degradation, fatigue, wear, or random shocks [5,6]. Reliability is generally expressed through failure time distributions or hazard rates, most commonly the exponential, Weibull, and lognormal distributions, which form the analytical basis of reliability assessment [7]. These methods have proven useful for estimating failure probability, scheduling maintenance, and supporting risk-based decision-making across a wide variety of engineering tasks.
Modern mechanical systems are increasingly designed to be control-oriented, with closed-loop control mechanisms operating so that the systems meet high standards of performance, safety, and efficiency [8]. In cyber–physical robotic and electromechanical systems, actuators can degrade and sensors can drift out of their specified tolerances, diminishing control authority well before actual structural failure occurs [9]. Recent research on estimating the degradation of electronically controlled electromechanical actuators has demonstrated that performance loss precedes physical actuator failure under dynamic loading conditions [10]. Such systems rely not only on the physical integrity of their components but also on the presence of sufficient control authority to bring the system to a stable state and to keep it operating within safe limits [11]. Actuator wear, sensor errors, and parameter drift can cause controllability to degrade slowly, well before any physical fault is noticed [10]. Classical reliability formulations assume that control action remains available and effective until physical failure occurs [12,13]; as a result, loss of stabilizability, a crucial precursor to functional failure, is not explicitly modeled or predicted in standard reliability models [14,15]. Even in fault-tolerant and game-theoretic control formulations, system stabilizability is implicitly assumed to remain structurally preserved [13]. Consequently, current frameworks do not quantify the reliability of the controllability structure itself under progressive degradation, despite recent advances in degradation-aware estimation and fault-tolerant tracking control [10,13].
Actuator-driven mechanical systems, such as robotic manipulators, require both structural integrity and sufficient control authority to operate in a stable condition. Actuators, sensors, and power electronics wear over time and can lose their ability to perform effectively while still functioning as structural components. If a system loses its ability to be controlled or stabilized while its structure remains intact, it has failed to achieve one of its intended functions, even though no physical failure has occurred.
The system architecture considered in this research is visualized with the aid of a reliability block diagram of an industrial robotic arm, as shown in Figure 1. The robotic arm is modeled as a series of components (e.g., the motor, gearbox, encoder, and control board with their associated power electronics). Because the failure of any single component in a series configuration causes system failure, total reliability is expressed as the product of the reliability functions of all components. Reliability block diagrams have been used extensively to model electromechanical systems for reliability analysis.
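To make the series-configuration computation concrete, the following minimal Python sketch multiplies component reliabilities to obtain the system reliability. The component failure rates and the assumption of exponential lifetimes are purely illustrative and do not come from the paper:

```python
import numpy as np

# Hypothetical constant failure rates (failures per hour) for the series
# blocks named in the text: motor, gearbox, encoder, control board.
failure_rates = {"motor": 2e-5, "gearbox": 1e-5,
                 "encoder": 3e-5, "control_board": 1.5e-5}

def series_reliability(t_hours: float) -> float:
    """Series RBD: the system survives only if every block survives,
    so R_sys(t) is the product of the component reliabilities."""
    r = 1.0
    for lam in failure_rates.values():
        r *= np.exp(-lam * t_hours)  # exponential lifetime assumed per block
    return r

print(series_reliability(10_000.0))  # equals exp(-total_rate * t)
```

For exponential components the product collapses to a single exponential with the summed rate, which is why series systems are often modeled with one aggregate failure rate.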
Control theory, in turn, offers rigorous methods for studying the stability, controllability, and performance of dynamic systems. Concepts such as controllability, stabilizability [16], and reachability [17] describe whether a system can be driven to desired states given admissible control inputs. Most research establishing safety through reachability-based methods provides formal verification of constraint satisfaction but does not quantify reliability based on probabilistic degradation processes. The relationship between degradation-induced erosion of controllability and reliability has not yet received analytical treatment, since most current work has focused on safety verification and constraint enforcement rather than degradation-based reliability modeling [17]. Although these notions are central to control design and analysis, they are usually discussed as performance or feasibility properties, not as determinants of system reliability. In addition, classical control formulations typically assume fixed system parameters [5,18] and do not account for long-term degradation that affects control authority and actuator effectiveness. This has led control theory and reliability engineering to develop separately, with minimal integration between the two fields.
Robust and optimal control strategies for industrial robotic systems have received much attention for improving performance under disturbance and uncertainty. Sliding mode control (SMC) methods have been popular because of their intrinsic robustness to matched uncertainties and actuator failure [19]. In particular, long-duration dissipative sliding mode approaches have proven highly resilient to actuator attacks in power electronic and grid systems [20]. Sliding mode control, predictive functional control, and AI-based servo control can improve robustness against matched uncertainty and bounded disturbances [19,21], but all of these methods rely on the premise that the system remains stabilizable at all times. Thus, while these techniques improve performance provided the system is stabilizable, none of them indicates whether the system's actual ability to be stabilized has changed over time.
Similarly, predictive functional control (PFC) and model predictive control (MPC) have been effectively implemented in high-performance electromechanical systems, such as the in-wheel motors of new energy vehicles. These methods maximize control performance over a finite prediction horizon and directly account for system constraints.
Although these sophisticated control techniques contribute greatly to tracking performance and disturbance rejection, they largely assume that the underlying system is stabilizable [21,22]. That is, they rest on the implicit assumption that the controllability structure of the system persists even as it degrades. However, in the harsh operating environments of an industrial robotic arm, control authority can be eroded over time by actuator degradation, aging of electronic components, and sensor faults. Once the controllability Gramian has diminished to a critical level, even a well-designed robust controller cannot stabilize the system. From a systems-theoretic perspective, the minimum eigenvalue of the controllability Gramian represents the minimum control energy required to steer the system across the state space, as established in linear systems theory [16]. A progressive reduction in this eigenvalue signifies structural erosion of control authority. However, no probabilistic reliability framework currently maps this degradation into a formal reliability metric.
Thus, the current study does not propose an alternative controller; rather, it proposes a reliability-aware stabilizability framework that evaluates the structural ability of the robotic arm to remain controllable under progressive degradation.
Similarly, recent developments in predictive maintenance, digital twins, and autonomous systems have further exposed the shortcomings of this approach. Digital twin-based predictive maintenance frameworks increasingly model degradation trajectories in real time [23]. Nevertheless, these models focus on remaining useful life (RUL) estimation and component survival probability rather than the structural feasibility of stabilization under degraded actuation capability [4]. The degradation mechanisms of industrial systems [23], e.g., turbines, robotic manipulators, aerospace structures, and process plants, tend to degrade actuation capability without necessarily causing structural failure [4,24,25]. In such cases, the systems are not destroyed but become progressively harder to manage safely. Conventional reliability measures may still indicate high survival rates even as the system approaches a regime in which stabilization can no longer be guaranteed. This disconnect creates a severe gap: reliability assessment tools that ignore controllability can easily overestimate a system's survivability and operational safety.
To fill this gap, more recent studies have started to investigate reliability beyond component survival, including functional reliability and performance-based definitions of failure [12,26,27]. However, such approaches are largely empirical and do not develop a formal connection between system dynamics, control authority, and reliability. In particular, no single framework explicitly defines failure as loss of controllability or stabilizability and uses that definition to quantify reliability as a function of the system's capacity to reach and hold safe operating conditions during degradation.
Therefore, a fundamental gap exists at the intersection of reliability engineering and control theory: there is no unified framework that (i) defines failure explicitly as loss of stabilizability, (ii) quantifies degradation of control authority probabilistically, and (iii) integrates this degradation into a reliability function consistent with stochastic reliability theory. This motivates a controllability-based reliability paradigm in which failure is determined not only by physical breakdown but also by the inability to assure safe and stable operation. Such a framework can capture early degradation effects and predict failure before it leads to catastrophic breakdown. Moreover, when coupled with reliability theory, control design and maintenance interventions can be used to improve reliability instead of structural redundancy. The Controllability–Reliability Coupling (CRC) framework presented in this paper is formulated to provide a unified and physically meaningful definition of reliability for controlled mechanical systems. The approach complements existing optimal and robust control and reliability evaluation methods by providing:
- (i) A quantitative measure of control authority degradation;
- (ii) A functional failure criterion based on the minimum eigenvalue of the controllability Gramian;
- (iii) A reliability metric that integrates structural and control-theoretic perspectives.
Unlike fault-tolerant control methods, the CRC framework does not redesign controllers but evaluates the structural feasibility of control under degradation. Unlike classical reliability theory, it does not equate failure solely with physical breakdown. Instead, it formally defines functional failure as the condition in which the minimum eigenvalue of the controllability Gramian falls below a critical threshold required for safe stabilization. Although sliding mode and predictive control can improve system performance under uncertainty, the methodology proposed here analyzes whether the system can be controlled at all, a prerequisite for any advanced control strategy. Establishing the notion of reliability-based stabilizable (RBS) static and dynamic industrial robotic systems represents a significant advance over previous research on the influence of uncertain controllability and predictive control on closed-loop industrial robotic systems. The proposed reliability-based measure of stabilizability degradation adopts a Controllability–Reliability Coupling (CRC) approach that combines stochastic reliability modeling with linear systems analysis, establishing a probabilistically grounded method for assessing the reliability of controlled mechanical systems that undergo progressive degradation.
Furthermore, this work extends functional reliability by providing an analytical definition of reliability for controlled mechanical systems under progressive degradation, yielding a definition that is both statistically rigorous and physically meaningful.
2. Research Methodology
2.1. Traditional Reliability Modeling Framework (Baseline)
In conventional reliability engineering models, failure is normally treated as a stochastic event governed by component degradation or random shocks, essentially independent of the system's controllability. Reliability is defined as the probability that a system performs its intended function over a time interval [3,27,28]:

R(t) = P(T > t),   (1)

where T is the random time to failure. For a constant failure rate λ, the reliability function is given as follows:

R(t) = e^(−λt).   (2)

The corresponding hazard (failure rate) function is given as follows:

h(t) = f(t) / R(t),   (3)

where f(t) is the probability density function of the failure time. For a degrading system, a time-varying hazard rate is often assumed, giving

R(t) = exp(−∫₀ᵗ h(τ) dτ).   (4)

In these models, failure is implicitly characterized as physical failure, i.e., fracture, wear-out, or parameter drift beyond limits. Control action is considered external and does not affect the failure definition. Consequently, failure is observed only when function is lost, and reliability improvement depends mainly on redundancy. This separation motivates the need for a unified controllability–reliability formulation.
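The baseline quantities of Equations (1)–(3) can be sketched in a few lines of Python. The rate λ = 0.01 and the helper names are illustrative assumptions only; the sketch checks the textbook identity that, for an exponential lifetime, the hazard h(t) = f(t)/R(t) reduces to the constant failure rate:

```python
import numpy as np

def reliability_exponential(t, lam):
    """R(t) = exp(-lam * t): reliability under a constant failure rate."""
    return np.exp(-lam * t)

def reliability_weibull(t, k, eta):
    """R(t) = exp(-(t/eta)**k): Weibull reliability; k > 1 models wear-out."""
    return np.exp(-(t / eta) ** k)

def hazard(t, f, R):
    """h(t) = f(t) / R(t): instantaneous failure rate given survival to t."""
    return f(t) / R(t)

lam = 0.01
f = lambda t: lam * np.exp(-lam * t)           # failure-time pdf
R = lambda t: reliability_exponential(t, lam)  # survival function
print(hazard(50.0, f, R))  # constant hazard, equal to lam up to rounding
```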
2.2. Proposed Controlled System Model Under Degradation and the Controllability Metric Under Safe-State Reachability
Figure 2 depicts the controllability–reliability coupling framework utilized in this study. A robot joint is modeled as a closed-loop electromechanical system in which actuator deterioration over time reduces control authority. The impact of deterioration on the robot joints and their associated control channels is modeled using a time-varying input matrix B(t). The resulting reduction in controllability is quantified through the controllability Gramian, from which the CRC reliability index is defined in terms of its minimum singular value. Overall, this graphical representation provides an overview of how physical deterioration, control authority, and functional reliability relate to one another within the proposed methodology.
The system is modeled as a linear time-varying (LTV) controlled dynamic system in which actuation effectiveness degrades over time [25,29]:

ẋ(t) = A x(t) + B(t) u(t),   (5)

where x(t) is the state vector, u(t) is the control input, A is the system matrix, and B(t) is the time-varying degraded control input (actuation) matrix. The state matrix A is assumed constant, while the input matrix is time-dependent due to progressive degradation effects.
The state-space model (5) represents the linearization of a nonlinear electromechanical robotic manipulator described by second-order Euler–Lagrange dynamics of the form

M(q) q̈ + C(q, q̇) q̇ + g(q) = τ,   (6)

where q denotes the generalized coordinates and τ the actuator torques. Linearization about a nominal operating point yields a state-space representation that can be expressed as

ẋ = A x + B₀ u,   (7)

where the matrix A captures the linearized system dynamics, while B₀ represents the nominal actuator effectiveness. The linear model is obtained by linearizing the nonlinear manipulator dynamics about a nominal equilibrium configuration, assuming small perturbations around the operating point. The resulting matrices A and B₀ therefore represent the local dynamic behavior of the system in the neighborhood of this equilibrium.
The linear time-invariant representation is obtained via local linearization around a nominal operating point. Therefore, the CRC formulation is valid within a bounded neighborhood where higher-order nonlinear terms remain negligible, and the system operates within its admissible state and input limits. The approach assumes moderate deviations from equilibrium and sufficiently smooth nonlinear dynamics to justify a first-order approximation. For large excursions or strongly nonlinear regimes, re-linearization or nonlinear controllability analysis is required.
To model actuator and electronic degradation, the input matrix is parameterized as follows:

B(t) = α(t) B₀,   (8)

with α(t) representing progressive loss of control authority due to wear, corrosion, or actuator fatigue. B₀ is the nominal (undegraded) control matrix, corresponding to the system's full control authority at the initial time t = 0, while α(t) is a scalar degradation factor that models the loss of control authority over time due to physical degradation mechanisms such as actuator wear, corrosion, thermal aging, fatigue, or loss of hydraulic/electrical efficiency.

When α(t) = 1, the system has full, healthy control authority. As α(t) → 0, severe degradation is approached, which can lead to loss of controllability, while dα(t)/dt ≤ 0 indicates that α(t) is a monotonically decreasing function of time, reflecting irreversible or progressively worsening degradation in the actuator or control channel. In the case of maintenance recovery or overcompensation, α(t) may temporarily satisfy α(t) > 1, representing performance enhancement beyond nominal conditions.

While a scalar degradation factor is adopted for clarity and analytical transparency, the formulation naturally extends to a matrix-valued degradation representation of the form B(t) = B₀ D(t), where D(t) captures heterogeneous and coupled degradation among multiple actuators or control channels. In such a case, actuator-specific wear, cross-channel coupling effects, or asymmetric efficiency loss can be explicitly modeled. The scalar form used in this study represents a special case corresponding to uniform degradation across control channels.
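As an illustration of the parameterization B(t) = α(t)B₀ and its matrix-valued generalization, the Python sketch below uses a hypothetical two-state, single-input joint with a linear wear profile. The nominal matrices and wear rates are assumptions for demonstration, not values from the study:

```python
import numpy as np

B0 = np.array([[0.0], [1.0]])  # hypothetical nominal input matrix (2 states, 1 input)

def alpha_linear(t, rate=0.004):
    """Scalar degradation factor: monotone linear wear, clipped at zero."""
    return max(0.0, 1.0 - rate * t)

def B_degraded(t):
    """Uniform loss of control authority: B(t) = alpha(t) * B0."""
    return alpha_linear(t) * B0

def B_degraded_matrix(t, B0_multi, rates):
    """Matrix-valued generalization B(t) = B0 @ D(t), with per-channel decay
    rates modeling heterogeneous actuator degradation."""
    D = np.diag([max(0.0, 1.0 - r * t) for r in rates])
    return B0_multi @ D

print(B_degraded(100.0))  # alpha(100) = 0.6, so B(100) = [[0.0], [0.6]]
```

The scalar path is recovered from the matrix form by choosing equal decay rates on all channels, i.e., D(t) = α(t)·I.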
Therefore, system controllability is quantified using the finite-horizon controllability Gramian [25]. At each time instant t, the system is treated as a frozen-parameter LTI system with input matrix B(t). This formulation evaluates instantaneous controllability under slowly varying degradation, assuming α(t) evolves on a slower time scale than the system dynamics:

W(t) = ∫₀ᵀ e^(Aτ) B(t) B(t)ᵀ e^(Aᵀτ) dτ.   (9)

The CRC reliability index is defined as follows:

V(t) = σ_min(W(t)) / σ_min(W(0)),   (10)

where W(t) denotes the controllability Gramian over the time interval [0, T], σ_min(·) denotes the minimum singular value, and V(t) is interpreted as a normalized safe-state reachability volume. Functional failure is assumed to occur when V(t) < ε, where ε is a controllability threshold.
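A numerical sketch of the Gramian and the normalized index is given below. Rather than evaluating the integral directly, it integrates the differential Lyapunov equation Ẇ = AW + WAᵀ + BBᵀ, whose solution at time T equals the finite-horizon Gramian, with a simple Euler scheme. The joint model, horizon, and step count are illustrative assumptions; for uniform degradation B = αB₀ the index reduces analytically to V = α², which the sketch reproduces:

```python
import numpy as np

def gramian_finite_horizon(A, B, T, n_steps=5000):
    """Finite-horizon controllability Gramian via Euler integration of the
    differential Lyapunov equation dW/dt = A W + W A^T + B B^T, W(0) = 0."""
    W = np.zeros((A.shape[0], A.shape[0]))
    dt = T / n_steps
    BBt = B @ B.T
    for _ in range(n_steps):
        W = W + dt * (A @ W + W @ A.T + BBt)
    return W

def crc_index(A, B_t, B0, T=5.0):
    """Normalized safe-state reachability volume:
    V = sigma_min(W(B(t))) / sigma_min(W(B0)), frozen-parameter LTI at time t."""
    w_now = gramian_finite_horizon(A, B_t, T)
    w_nom = gramian_finite_horizon(A, B0, T)
    s_now = np.linalg.svd(w_now, compute_uv=False)[-1]  # smallest singular value
    s_nom = np.linalg.svd(w_nom, compute_uv=False)[-1]
    return s_now / s_nom

# Hypothetical single-joint model: position/velocity states, viscous damping.
A = np.array([[0.0, 1.0], [0.0, -0.5]])
B0 = np.array([[0.0], [1.0]])
print(crc_index(A, 0.6 * B0, B0))  # ~0.36, since W scales with alpha**2
```

Because W is quadratic in B, uniform degradation by α scales every singular value by α², so the normalized index is insensitive to the integration error, which cancels in the ratio.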
The finite time horizon T represents the practical control-action window within which safe-state reachability must be ensured. In contrast to infinite-horizon controllability, the finite-horizon formulation reflects realistic operational constraints, such as response-time requirements and actuator bandwidth limitations. The value of T is chosen to be sufficiently large relative to the dominant system time constants to capture effective control authority, while remaining consistent with practical stabilization intervals. Sensitivity to the choice of T is evaluated in Section 3.
The stabilizability threshold is denoted by ε, and functional failure is defined when V(t) < ε. The parameter ε represents the minimum admissible controllability margin required to maintain practical stabilizability.
2.3. Failure Definition via Loss of Stabilizability and the Reliability Reformulation Based on Controllability
In traditional reliability theory, the reliability function R(t) is derived from statistical lifetime distributions (e.g., Weibull). In the proposed formulation, however, failure is defined as the loss of guaranteed stabilizability:

V(t) < ε,   (11)

where ε is the minimum controllability threshold required to ensure closed-loop stability under bounded disturbances.
In this study, ε = 0.02 is selected as a conservative stabilizability margin corresponding to severe degradation of safe-state reachability (i.e., a 98% reduction relative to nominal controllability). This value represents a regime in which the minimum control energy required for stabilization increases sharply, indicating practical loss of control authority despite residual actuation capability. The threshold is not arbitrary but is selected to represent the onset of energetically impractical stabilization.
Stabilizability is the property of a linear dynamic system whereby a controller can drive the state to zero and stabilize the system even when some modes cannot be directly controlled, provided all uncontrollable modes are stable. Stated differently, a system is stabilizable when all of its uncontrollable modes (subsystems) are intrinsically stable: although some components cannot be directly influenced by the inputs, the uncontrolled dynamics decay asymptotically to zero rather than grow. The predicted failure time is therefore given as follows:

T_f = inf{ t ≥ 0 : V(t) < ε }.   (12)
Reliability is reformulated as a deterministic function of controllability, denoted R_CRC(t). It is a deterministic reliability indicator derived from the system dynamics rather than a probabilistic survival function; it therefore complements classical stochastic reliability by identifying the deterministic onset of functional failure due to loss of controllability. It is defined as follows:

R_CRC(t) = 1 if V(t) ≥ ε;  R_CRC(t) = 0 otherwise.   (13)
This definition removes dependence on empirical failure rates and explicitly couples reliability to system dynamics and control authority. Controllability here refers to the capacity of an operator or control system to steer the internal state of a system to a preferred final state within a given finite time. Where traditional reliability is concerned with the likelihood of failure, controllability is concerned with the ability of a system to be redirected to a safe or functioning state, by manipulating external inputs (control actions), if it begins to drift toward an undesired or failed state.
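The indicator and the first-crossing failure time can be sketched as follows. The threshold ε = 0.02 matches the value used in this study, while the quadratic margin trajectory is purely illustrative (for linear wear, the Gramian, and hence V(t), scales with α(t)²):

```python
import numpy as np

EPS = 0.02  # stabilizability threshold used in this study

def r_crc(v, eps=EPS):
    """Deterministic CRC reliability indicator: 1 while V(t) >= eps, else 0."""
    return 1.0 if v >= eps else 0.0

def predicted_failure_time(t_grid, v_grid, eps=EPS):
    """T_f = inf{ t : V(t) < eps }; returns None if stabilizability
    is never lost on the sampled horizon."""
    for t, v in zip(t_grid, v_grid):
        if v < eps:
            return t
    return None

# Illustrative margin trajectory: quadratic decay of V(t) under linear wear.
t = np.linspace(0.0, 100.0, 1001)
v = np.maximum(0.0, 1.0 - 0.01 * t) ** 2
tf = predicted_failure_time(t, v)
print(tf, r_crc(v[0]), r_crc(v[-1]))  # failure time, then indicator 1.0 and 0.0
```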
The proposed framework considers three categories of faults:
- Progressive degradation faults: the gradual reduction in actuator effectiveness, modeled as monotonic decay of α(t).
- Accelerated fatigue faults: nonlinear degradation with an increasing rate.
- Shock or intermittent faults: abrupt, temporary reductions in α(t).
Therefore, the degradation factor α(t) serves as a unified representation of mechanical wear, electronic efficiency loss, and actuator faults.
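The three fault categories can be represented by simple α(t) profiles; the rates, shock time, and drop magnitude below are assumptions chosen for demonstration, not the study's parameters:

```python
import numpy as np

def alpha_progressive(t, rate=0.004):
    """Progressive degradation: monotone linear decay of alpha(t)."""
    return np.clip(1.0 - rate * np.asarray(t, dtype=float), 0.0, 1.0)

def alpha_fatigue(t, k=4e-5):
    """Accelerated fatigue: nonlinear decay whose rate grows with time."""
    return np.clip(1.0 - k * np.asarray(t, dtype=float) ** 2, 0.0, 1.0)

def alpha_shock(t, t_shock=60.0, drop=0.5, rate=0.002):
    """Shock fault: slow wear plus an abrupt loss of authority at t_shock."""
    t = np.asarray(t, dtype=float)
    a = 1.0 - rate * t
    a = np.where(t >= t_shock, a - drop, a)  # sudden drop after the shock
    return np.clip(a, 0.0, 1.0)

t = np.array([0.0, 50.0, 100.0])
print(alpha_progressive(t), alpha_fatigue(t), alpha_shock(t))
```

Each profile can be substituted into B(t) = α(t)B₀ to drive the Gramian-based margin, which is how the degradation scenarios of the results section are typically generated.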
Table 1 presents the differences between the traditional reliability modeling framework (baseline) and the proposed controllability-based unified reliability–control framework.
This approach develops a unified reliability–control framework: it takes traditional reliability modeling as a baseline and extends it by introducing controllability-based failure definitions. The method shows that reliability degradation can be predicted before physical breakdown and that system reliability can be increased through control design rather than structural redundancy.
3. Results and Discussion of the Controllability–Reliability Coupling
All results presented in this section are obtained by numerically integrating the time-varying state-space model given in Equation (5). The degradation process is introduced through the time-dependent input matrix B(t), which modifies the system's actuation effectiveness while preserving the underlying system dynamics described by the matrix A. The state trajectories and controllability Gramian are computed directly from this governing differential equation, ensuring that the reported controllability margins are dynamically consistent with the system model. The controllability Gramian was computed over a finite time horizon using the time-varying system matrices derived from Equation (5), and its minimum singular value was used as the controllability margin.
The findings presented in Figure 3 indicate that the proposed Controllability–Reliability Coupling (CRC) framework provides a dynamic indicator of functional failure earlier than conventional reliability metrics. The system starts in full health with full controllability, indicated by an initial controllability margin of 1.000. Degradation is introduced as a gradual loss of control authority, which is realistic for actuator wear, as opposed to sudden component failure.

The top plot shows the degradation path α(t), which decreases almost linearly over time with minor stochastic variation. This behavior is symptomatic of progressive aging and operational uncertainty; notably, there are no sharp transitions that would conventionally be construed as failure. The classical view of reliability would consider the system functional for most of this period.
Figure 3.
Evolution of system degradation and loss of control authority over time.
The middle plot shows the evolution of the normalized safe-state reachability V(t), the fundamental controllability measure of the CRC framework. Although the degradation process is smooth, the controllability margin decreases at an accelerating rate, indicating that the system's ability to direct states into safe operating conditions deteriorates faster than its physical health alone would suggest. This separation between "health" and "controllability" is critical: it reveals that loss of stabilizing authority emerges well before total loss of functionality. The system is declared failed when V(t) crosses the stabilizability threshold ε. This predicted failure time corresponds to the point at which closed-loop stabilization can no longer be guaranteed, even though some control effectiveness still remains.
The bottom plot directly compares conventional reliability with CRC reliability. The classical reliability curve, derived from statistical lifetime models describing failure occurrence, does not explicitly capture control-energy degradation or stabilizability loss. Nor does it identify a definite failure event; it merely indicates declining confidence over time. In contrast, the CRC reliability remains at unity until stabilizability is lost, at which point it drops abruptly to zero at the predicted failure time. This sharp transition highlights an important conceptual shift: failure is not a gradual fading of probability but a definite inability to control.
Taken together, these findings indicate that loss of controllability is a physically meaningful and earlier measure of failure than traditional reliability metrics. The CRC framework declares failure when stabilizability is lost, not when the system is totally degraded. This supports the paper's central premise that controllability is a key predictor of reliability, and it demonstrates that reliability can be precisely defined and predicted from system dynamics and control authority, without empirical failure-rate estimates or physical redundancy.
Furthermore, the results shown in Figure 4 and the corresponding quantitative results in Table 1 indicate how the Controllability–Reliability Coupling (CRC) framework responds to various realistic mechanical degradation mechanisms. In every case, the system starts with complete control authority, demonstrated by an initial controllability margin of 1.000. The failure threshold is set at ε = 0.02, providing a common stabilizability reference across all cases.
In the linear wear case, the controllability margin decreases smoothly and monotonically, resembling gradual actuator aging or homogeneous material aging. The system endures a long period of constant degradation, and failure is predicted late in the operating horizon. This is characteristic of typical industrial systems in which slow wear does not immediately cause loss of control but ultimately results in loss of stabilizability once control authority is no longer adequate. The controllability margin eventually approaches zero, indicating loss of effective control.
A much steeper decline in controllability is evident in the accelerated fatigue case. This case represents fatigue-driven effects, such as thermal cycling or high-frequency loading, in which deterioration accelerates with time. The rapid decrease in the controllability margin leads to an early failure prediction, much earlier than in the linear wear case. This finding demonstrates the sensitivity of the CRC metric to nonlinear degradation rates and confirms that the CRC measure can detect early failures in fatigue-dominated systems.
The intermittent faults case exhibits non-monotonic behavior in the controllability margin, with sudden decreases followed by partial recoveries. These oscillations are typical of processes such as stiction, backlash, or intermittent valve sticking. Despite repeated near-failure states, stabilizability is preserved over the full horizon. Notably, the CRC framework reveals these concealed vulnerabilities, phases in which the controllability margin approaches but does not cross the failure threshold, which traditional reliability models would not detect.
In the shock damage case, the controllability margin decreases steadily until a sharp drop occurs after an impact or overload. This discontinuity indicates sudden actuator ineffectiveness or structural damage. After the shock, the system retains only marginal controllability as it approaches the failure threshold. This behavior shows that the CRC model can capture both slow deterioration and instantaneous damage within a single framework.
Figure 4.
Functional failure preceding material failure due to actuator saturation.
The maintenance recovery scenario provides a critical contrast to the other cases. Here, control capability is partially restored mid-life, simulating a maintenance intervention or adaptive control reconfiguration. The controllability margin increases significantly at recovery and never approaches the stabilizability threshold within the time horizon, so no failure is predicted. This finding is especially important: it shows that reliability can be maintained, or even restored, without physical redundancy, simply by regaining control authority, something the usual reliability formulations do not permit.
In general, the comparative outcomes verify that the CRC framework distinguishes degradation mechanisms not only by severity but also by their effect on stabilizability: accelerated fatigue produces early failure, linear wear a delayed but unavoidable failure, intermittent faults hidden near-failure regimes, shock damage abrupt controllability loss, and maintenance recovery an averted failure. These results strongly support the central premise that controllability is a key predictor of reliability, and that failure is not merely a consequence of physical degradation but of loss of stabilizability.
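To make the margin computation concrete, the following minimal Python sketch evaluates the smallest singular value of a finite-horizon controllability Gramian under a linear input-effectiveness decay. The plant matrices, the 2 s Gramian horizon, and the decay law β(t) = 1 - t/25 are illustrative assumptions, not the exact model used in this study.

```python
import numpy as np

# Illustrative second-order electromechanical plant (assumed, not the paper's).
A = np.array([[0.0, 1.0],
              [-4.0, -0.8]])
B0 = np.array([0.0, 1.0])        # nominal input direction

def gramian_margin(beta, horizon=2.0, dt=0.005):
    """Smallest singular value of the finite-horizon controllability Gramian
    W = int_0^T e^{A s} B B^T e^{A^T s} ds with B = beta * B0, the degradation
    factor frozen over the short horizon. phi(s) = e^{A s} B is propagated by
    explicit Euler integration."""
    phi = beta * B0
    W = np.zeros((2, 2))
    for _ in range(int(horizon / dt)):
        W += np.outer(phi, phi) * dt
        phi = phi + (A @ phi) * dt
    return np.linalg.svd(W, compute_uv=False).min()

eps = 0.02                                    # stabilizability threshold
times = np.arange(0.0, 30.0, 0.5)
margins = np.array([gramian_margin(max(0.0, 1.0 - t / 25.0)) for t in times])
margins /= margins[0]                         # normalize: healthy margin = 1.000
crossed = times[margins < eps]
print(f"functional failure predicted at t = {crossed[0]:.1f} s")
```

Because the Gramian scales with β², the normalized margin crosses ε = 0.02 near 21.5 s under these assumed rates, consistent with the linear-wear row of Table 3.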
To further validate the proposed reliability assessment approach and show its importance, the comparative plots in Figure 5 and Table 2 show the time-varying evolution of the CRC-based controllability margin and the conventional reliability function under five realistic mechanical degradation conditions. In each case, the failure threshold is set at ε = 0.02, which provides a consistent reference for interpreting the loss of acceptable system performance. A systematic threshold sensitivity analysis is provided in Figure 10, where ε is varied between 0.02 and 0.10; the predicted failure time varies smoothly and monotonically, demonstrating that CRC-based failure prediction is structurally stable with respect to reasonable threshold selection.
To evaluate robustness with respect to the finite time horizon, the controllability Gramian was recomputed for multiple horizon values. The resulting controllability margin trajectories (Table 3) exhibit consistent monotonic degradation behavior, and the predicted failure times vary smoothly without qualitative changes. This confirms that CRC-based failure prediction is structurally stable with respect to reasonable variations of the evaluation horizon.
In the linear wear case, both metrics show a slow, steady decrease consistent with steady material degradation. The CRC controllability margin starts at 1.000 and drops to a minimum of 0.000, crossing the failure threshold at 21.5 s. By comparison, the traditional reliability function stays above the threshold until 24.6 s, with a minimum value of 0.004. The plots clearly show that loss of controllability occurs about 3.1 s before the time predicted by conventional reliability; thus, loss of control authority occurs first in systems characterized by gradual wear.
The divergence between the two approaches is even more vivid under accelerated fatigue. Approaching failure, the CRC margin drops quickly, reaching a minimum of 0.001 and crossing the failure threshold at 13.1 s. Conversely, the conventional reliability curve declines more slowly, with a minimum value of 0.024, and never crosses the failure threshold within the simulation horizon. The associated figure shows that, although the system appears statistically reliable, it becomes uncontrollable at a significantly earlier stage, demonstrating the superior sensitivity of CRC to the rapid degradation of stiffness and damping characteristic of fatigue-driven failure modes.
Table 3.
Comparison of CRC and traditional reliability results under mechanical degradation.
| Scenario | Min CRC Controllability | CRC Failure Time (s) | Min Traditional Reliability R(t) | Traditional Failure Time (s) |
|---|---|---|---|---|
| Linear Wear | 0.000 | 21.5 | 0.004 | 24.6 |
| Accelerated Fatigue | 0.001 | 13.1 | 0.024 | None |
| Intermittent Faults | 0.011 | 23.7 | 0.103 | None |
| Shock Damage | 0.016 | 24.4 | 0.127 | None |
| Maintenance Recovery | 0.065 | None | 0.254 | None |
In the case of intermittent faults, the plots show step-like drops corresponding to repetitive faults. The CRC margin decreases from 1.000 to a minimum of 0.011 and crosses the failure threshold at 23.7 s. Meanwhile, the conventional reliability function remains relatively large, with a minimum value of 0.103, and does not forecast failure. The difference between the 0.011 (CRC) and 0.103 (traditional reliability) minima is significant: the former quantifies the cumulative effect of recurring faults, whereas the latter averages these effects away and oversimplifies functional degradation.
In the shock damage case, there is a sharp loss of controllability due to a sudden degradation event. The CRC margin is reduced to a minimum of 0.016 and crosses the failure threshold at 24.4 s. The conventional reliability function, by contrast, stays far above the limit, with a minimum value of 0.127, and again predicts no failure. The figure demonstrates that CRC directly captures the abrupt loss of control caused by discrete damage, even while survival-based metrics indicate acceptable system health.
The maintenance recovery case shows that CRC can indicate performance restoration. The CRC margin initially decreases, but after the maintenance intervention it recovers and stays well above the failure threshold throughout the simulation, with a minimum value of 0.065. As a result, no failure is predicted by CRC, nor by traditional reliability, whose minimum of 0.254 remains relatively high. The plots verify that CRC can model reversible degradation and measure the efficacy of maintenance interventions in extending functional life.
Across all degradation conditions, CRC forecasts failure or unacceptable performance sooner than the conventional reliability measures, with the exception of maintenance recovery, where both models correctly forecast sustained system health. CRC predicts failure 3–11 s earlier in progressive and fatigue-driven scenarios and is the only metric to predict failure in scenarios that traditional reliability misses entirely. These findings indicate that the proposed CRC approach offers a more conservative and operationally useful measure of system reliability by explicitly relating degradation to loss of controllability rather than to survival probability, as traditional reliability methods do.
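The gap between the two failure predictions can be illustrated numerically. The sketch below assumes the same linear wear law as above (normalized margin ∝ β², with β reaching zero at 25 s) and a Weibull survival curve whose shape and scale are chosen purely for illustration so that it crosses the threshold late; neither is a fitted model from this study.

```python
import math

def weibull_R(t, eta=12.4, k=2.0):
    # Illustrative Weibull survival function (shape/scale are assumptions).
    return math.exp(-((t / eta) ** k))

def crc_margin(t):
    # Normalized controllability margin for linear wear: scales as beta(t)^2.
    beta = max(0.0, 1.0 - t / 25.0)
    return beta * beta

def first_crossing(f, thresh, t_max=30.0, dt=0.1):
    # First grid time at which f drops below the threshold, or None.
    t = 0.0
    while t <= t_max:
        if f(t) < thresh:
            return round(t, 1)
        t += dt
    return None

eps = 0.02
print("CRC functional failure:", first_crossing(crc_margin, eps), "s")
print("Weibull structural failure:", first_crossing(weibull_R, eps), "s")
```

With these assumed parameters the printed crossings land at 21.5 s and 24.6 s, reproducing the roughly 3 s lead of the functional failure prediction over the structural one reported in the linear-wear row of Table 3.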
3.1. Case Study of an Industrial Robotic Joint Reliability
To demonstrate the industrial applicability of the proposed CRC, it was applied to a servo-driven robotic joint representative of articulated manipulators used in automated manufacturing. The joint, comprising an electric motor, gearbox, and rigid link, was modeled as a second-order electromechanical system subjected to actuator degradation. In practice, the progressive loss of torque in such joints is caused by thermal aging, gear wear, lubrication breakdown, and increased friction. These processes reduce the effective input gain and directly impair control, even in the absence of structural damage.
Degradation was incorporated as a time-dependent reduction in input effectiveness. The CRC metric, calculated from the minimum singular value of the controllability Gramian, was evaluated over a 40 s operating interval and compared to a Weibull structural reliability model. The normalized controllability margin, presented in the first plot of
Figure 6, decreases monotonically with degradation. At 22.5 s, the maintenance threshold is reached, and at 27.0 s, functional failure occurs as the controllability margin approaches the specified stabilizability limit. Conversely, in the third plot of
Figure 6, the Weibull reliability curve exhibits slower decay and does not reach the structural failure point in the same time frame. Although conventional reliability modeling considers the joint operational, CRC identifies the loss of functional capability much earlier, affirming that in control-critical systems, functional degradation precedes structural failure.
The energetic consequences of this loss of controllability are represented in the second plot of
Figure 6, where the minimum stabilizability energy grows nonlinearly as the controllability margin decreases. Since the energy required to achieve control is inversely proportional to the smallest eigenvalue of the controllability Gramian, the energy demand increases slowly at low degradation levels and escalates rapidly toward the end of degradation at 25–27 s. This acceleration marks the onset of a high-risk operational regime. In industrial terms, it translates into higher motor currents, higher winding temperatures, greater stress on power electronics, and reduced disturbance rejection capability. The joint thus becomes energetically inefficient and thermally overstressed before structural failure is forecast.
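The inverse relationship between the Gramian's smallest eigenvalue and the worst-case minimum control energy can be sketched directly. For a target state x_f, the minimum-energy input over the horizon satisfies E*(x_f) = x_f^T W^{-1} x_f, which peaks at 1/λ_min(W) along the least-controllable direction. The healthy Gramian W0 below is a hypothetical value chosen for illustration.

```python
import numpy as np

# Hypothetical healthy controllability Gramian (symmetric positive definite).
W0 = np.array([[0.10, 0.02],
               [0.02, 0.60]])

def worst_case_energy(W):
    # Worst-case minimum control energy: 1 / lambda_min(W), attained when the
    # target state lies along the least-controllable eigendirection.
    return 1.0 / np.linalg.eigvalsh(W).min()

# If the input matrix scales with effectiveness beta, the Gramian scales with
# beta^2, so the energy demand grows as 1/beta^2 as degradation advances.
for beta in (1.0, 0.5, 0.2, 0.1):
    W = beta ** 2 * W0
    print(f"beta={beta:.1f}  margin={np.linalg.eigvalsh(W).min():.4f}  "
          f"energy={worst_case_energy(W):.1f}")
```

The printed values show the nonlinear escalation described above: halving the input effectiveness quadruples the worst-case energy, and at β = 0.1 the demand is two orders of magnitude above the healthy level.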
Taken together, the plots of Figure 6 reveal the shortcomings of survival-based reliability measures for cyber–physical systems. Whereas the third plot shows that structural integrity is maintained, the first plot shows that stabilizability is lost, and the second plot quantifies the growth in control energy requirements. Hence, the CRC framework offers a control-aware reliability metric that captures operational risk that classical Weibull analysis cannot measure.
Therefore, system maintenance should be considered at 22.5 s, as shown in the first plot, well before functional collapse at 27.0 s and much earlier than any structural failure forecast in the third plot. In practice, the degradation parameter can be estimated from measurable system quantities, such as motor current, torque response, or tracking error, allowing CRC-based monitoring to be embedded within a digital twin or real-time supervisory systems.
Overall, the robotic joint investigation, as evidenced by the trends in Figure 6, indicates that CRC complements classical reliability theory by characterizing the ability to perform a function rather than material structural soundness. It provides an energy-based indicator of degradation and facilitates proactive maintenance decisions in actuator-driven industrial systems where performance degradation is slow but operationally significant.
Figure 6.
The degradation behavior of the industrial robotic joint.
3.2. Case Study of an Advanced Industrial Robotic Arm Reliability
The behavior of an advanced industrial robotic arm demonstrates a significant distinction between structural survival and functional performance. Under the traditional concept of reliability, the robotic arm appears to degrade monotonically. The structural reliability, calculated from a series Weibull model of the motor, gearbox, encoder, power electronics, and control board, decreases with time, as shown in Figure 7. At about 31–32 s, the reliability curve crosses the suggested failure threshold of 0.2, which constitutes structural failure in the traditional sense. On this view, the robotic arm is considered operational until that point.
The functional analysis, however, tells a more subtle story. The controllability index produced by CRC in Figure 8 does not decay monotonically. Rather, it reflects a dynamic system responding to several interacting degradation processes: loss of mechanical torque, loss of electronic efficiency, sensor feedback degradation, and exogenous disturbances. At approximately 25 s, a conspicuous perturbation appears in the controllability margin, corresponding to a modeled shock event. This perturbation does not show up in the structural reliability curve of Figure 7, which remains smooth and continuous. In control terms, however, the system suffers a quantifiable loss of margin, making it operationally more vulnerable.
The minimum control energy requirement (Figure 8) gives additional insight into this vulnerability. Since control energy is proportional to the inverse of the minimum eigenvalue of the controllability Gramian, any reduction in controllability appears as increased energy demand. The energy needed to perform the same motion grows gradually as degradation accumulates. At the shock event (~25 s), an observable deviation indicates the abrupt loss of effective actuation authority. Later, at approximately 40 s, a simulated maintenance intervention partially restores system effectiveness. This recovery is clearly visible in the controllability margin (Figure 9) as well as in the decrease in required control energy. The structural reliability curve, by contrast, continues to decrease monotonically, since classical Weibull modeling does not account for maintenance-induced recovery without a renewal reformulation.
Figure 8.
The minimum control energy requirement.
This divergence points to one of the main findings of the work: structural reliability and functional reliability characterize different phenomena in system behavior. Structural reliability answers the question of component survival, i.e., whether the system has physically failed. Functional reliability, as measured by CRC, asks whether the system can deliver its control objectives efficiently and robustly. In contemporary industrial robots, this difference is crucial: a robot can remain structurally intact while demanding much larger motor currents, imposing larger thermal loads on power electronics, and suffering much reduced positioning accuracy.
Figure 9.
The CRC-based functional reliability index.
This interpretation is further strengthened by the cyber–physical coupling captured in the CRC framework. Specifically, degradation of mechanical components (reducing effective torque generation), degradation of electronics (reducing actuation efficiency), and deterioration of sensors (reducing feedback authority) cumulatively alter the closed-loop dynamics through the time-varying system matrix. Such coupled degradation cannot be explained using independent component survival probabilities. The CRC index therefore acts as a mediator between mechanical wear, electronic aging, and control performance.
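One simple way to sketch this coupling is to compose the three degradation channels into a single input-effectiveness factor. The multiplicative form and the individual decay rates below are illustrative assumptions, not the model used in the case study.

```python
# Hypothetical composition of degradation channels into one effectiveness
# factor feeding the time-varying dynamics; rates are illustrative only.
def input_effectiveness(t):
    mech = max(0.0, 1.0 - 0.015 * t)   # mechanical torque-generation loss
    elec = max(0.0, 1.0 - 0.008 * t)   # drive-electronics aging
    sens = max(0.0, 1.0 - 0.005 * t)   # sensor feedback-authority drift
    return mech * elec * sens          # coupled effect on actuation authority

for t in (0, 10, 20, 30):
    print(f"t={t:2d} s  effectiveness={input_effectiveness(t):.3f}")
```

The point of the multiplicative structure is that the combined factor falls faster than any single channel, which is precisely the coupled behavior that independent component survival probabilities cannot represent.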
The industrial consequences are significant: in automated manufacturing tasks such as welding, semiconductor handling, or precision assembly, operational risks develop at a very early stage, well before structural failure. Higher control energy entails increased electrical consumption, accelerated component heating, and increased stress on drive electronics. These changes may eventually degrade product quality or throughput efficiency even while the structural reliability measure still indicates a functional system.
Taken together, these findings show that classical structural reliability gives only a partial picture of system health. The CRC-based functional reliability framework extends reliability analysis into the dynamic domain, encompassing operational sensitivity, disturbance response, maintenance recovery, and energy escalation. Such a dynamic perspective on reliability is not merely useful but essential for advanced industrial robotic arms, where mechanical, electronic, and control subsystems are inseparably coupled.
Finally, a systematic robustness and sensitivity analysis was performed to determine the effect of the main design parameters on the forecasted functional reliability of the industrial robotic joint. The findings, shown in Figure 10, Figure 11 and Figure 12, offer quantitative evidence on threshold sensitivity, degradation sensitivity, and disturbance robustness within the controllability–reliability framework.
The impact of the stabilizability threshold ε is shown in Figure 10. As the threshold is raised from 0.02 to 0.10, the predicted functional failure time declines continuously from about 33 s to about 22.5 s; at ε = 0.05, failure is predicted near 25.5 s. This monotonic, near-linear decline confirms that tighter stabilizability criteria lead to earlier declarations of functional failure. Notably, the change is gradual rather than abrupt, meaning that CRC-based failure prediction is not overly sensitive to the choice of any reasonable threshold. The absence of discontinuities or sharp transitions indicates that the reliability criterion is structurally stable with respect to the modeling assumptions.
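The threshold sweep can be reproduced in closed form for the simple linear-wear margin used in the earlier sketches, margin(t) = (1 - t/25)², whose crossing with ε inverts analytically. This is an illustrative model, not the robotic-joint simulation behind Figure 10, so the absolute times differ; the smooth monotonic trend is the point.

```python
import numpy as np

def failure_time(eps, t_wear_out=25.0):
    # For margin(t) = (1 - t/t_wear_out)^2, solve margin(t) = eps for t.
    # Larger eps (tighter criterion) yields an earlier predicted failure.
    return t_wear_out * (1.0 - np.sqrt(eps))

for eps in np.linspace(0.02, 0.10, 5):
    print(f"eps={eps:.2f}  predicted failure at {failure_time(eps):5.2f} s")
```

The sweep shows the same qualitative behavior as the sensitivity analysis: the predicted failure time decreases smoothly and monotonically with ε, with no discontinuities.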
Figure 10.
The impact of stabilizability threshold (sensitivity).
Figure 11 examines how scaling of the degradation rate affects failure time. As the degradation scaling factor increases toward 2.0, the forecasted failure time drops from about 37.5 s to about 20 s; at the nominal scaling factor of 1.0, failure occurs at 26 s. This nonlinear decrease in lifetime indicates that faster torque-constant decay and friction growth directly erode actuator authority and controllability margin. The convex trend also shows that an increased degradation rate reduces functional life disproportionately. This behavior is physically consistent, validates the proposed framework, and adequately captures the impact of degradation dynamics on system performance.
Figure 12, in turn, shows robustness to actuator disturbance. The mean normalized controllability margin declines gradually from 0.283 to 0.266 as the disturbance intensity rises from 0 to 0.10; at a disturbance level of 0.05, the margin is approximately 0.277. Although stochastic fluctuations are observed at intermediate disturbance levels, the overall shrinkage is small and does not indicate structural instability. The controllability measure degrades gracefully rather than catastrophically, showing that the CRC-based reliability measure remains consistent under moderate actuator uncertainty.
Collectively, the findings verify that the proposed method has predictable threshold sensitivity, high physical interpretability, and moderate resistance to disturbance. The predictions of the failure time are structurally sound in response to parameter variations, thus justifying the versatility and analytical consistency of the controllability-based reliability framework.
Comparing the proposed model with classical models, the study concludes that the classical model predicts failure at a statistical or structural failure time derived from lifetime distribution modeling such as Weibull, whereas the proposed CRC-based model predicts functional failure at the time when controllability degrades below the stabilizability criterion. The difference in how these failure times are derived indicates that structural degradation and loss of functional controllability do not necessarily occur simultaneously. While the Weibull model characterizes the statistical end-of-life behavior of the system, the CRC-based criterion detects the degradation of control authority, which may precede or follow structural failure. To put this in perspective, Table 4 provides a detailed comparison of the model with the classical Weibull model and other control-based models.
The numerical comparison, shown in Figure 13, reveals a progressive separation between structural reliability, stability, robustness, and controllability as degradation proceeds. The Weibull model predicts no structural failure within the 20 s simulation: the reliability function never drops below the chosen threshold. The Lyapunov stability margin likewise remains unchanged over the simulation period, and no increase is recorded in the H∞ gain, indicating no change in robustness to disturbances. From these classical perspectives, the system appears fully operational. The CRC metric, by contrast, reaches its threshold within the simulation, indicating loss of effective controllability. Although the system remains structurally intact and mathematically stable, its control authority has degraded to a critical level.
This comparison indicates that CRC detects a functional failure mode that is invisible to statistical reliability, Lyapunov stability, and H∞ robustness metrics, justifying the need and use of a control-aware reliability measure.
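The regime in question can be demonstrated with a few lines: scaling the input matrix by an effectiveness factor β leaves the eigenvalues of the closed-loop matrix A untouched (so Lyapunov stability is preserved), while the normalized controllability margin, which scales as β² under this input scaling, collapses below the threshold. The matrix and threshold below are illustrative assumptions.

```python
import numpy as np

# Assumed stable (Hurwitz) closed-loop matrix; input degradation does not
# alter A, so classical stability analysis sees no change at all.
A = np.array([[0.0, 1.0],
              [-4.0, -0.8]])
eps = 0.02   # stabilizability threshold on the normalized margin

for beta in (1.0, 0.5, 0.1):
    hurwitz = bool(np.all(np.linalg.eigvals(A).real < 0))  # stability: unchanged
    crc = beta ** 2                                        # normalized margin
    status = "controllable" if crc >= eps else "functional failure"
    print(f"beta={beta:.1f}  stable={hurwitz}  margin={crc:.4f}  -> {status}")
```

At β = 0.1 the system is still reported stable, yet the margin (0.01) is below ε: a mathematically stable system that is no longer practically controllable, which is exactly the condition CRC flags and the classical metrics miss.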
Figure 11.
The impact of degradation rate on failure time (sensitivity).
Figure 12.
Robustness to actuator disturbance.
Figure 13.
Comparison of CRC with classical models.
The present study focuses on the theoretical development and numerical validation of the CRC framework using a physically grounded electromechanical degradation model. The degradation factor is derived from actuator effectiveness loss mechanisms commonly found in robotic and mechatronic systems. Although the simulations are grounded in physically interpretable parameters, experimental verification on hardware platforms or industrial data remains a necessary step for future research. The degradation scenarios (progressive, accelerated fatigue, and shock-type faults) are realistic relative to degradation behavior actually observed in industrial servo systems; thus, the CRC predictions rest on physically meaningful degradation behavior. The results obtained are valid for locally linearized dynamics; to preserve model fidelity, adaptive or piecewise CRC analysis would be needed for systems undergoing large nonlinear excursions.
4. Conclusions
This paper proposes a Controllability–Reliability Coupling (CRC) framework that links system dynamics and reliability through a controllability-based failure criterion. In contrast to conventional reliability models, which describe failure as a probabilistic structural event governed by lifetime distributions such as the Weibull or exponential laws, the proposed concept of failure is based on the loss of stabilizability due to the degradation of control authority.
The main contribution of the study lies in explicitly coupling system dynamics with reliability assessment through the controllability Gramian. By introducing a time-dependent input effectiveness factor for actuator degradation and tracking the minimum singular value of the finite-horizon controllability Gramian, failure can be mathematically defined as the point at which practical stabilizability can no longer be guaranteed.
The findings consistently show that functional failure may precede structural failure. In scenarios involving progressive wear, accelerated fatigue, intermittent faults, and shock damage, the CRC framework predicts failure earlier, or identifies it as it emerges, relative to conventional reliability models. For example, in many instances traditional Weibull modeling indicates an acceptable survival probability for an item in service, while the CRC metric indicates that the item will soon lose controllability. This demonstrates that structural integrity and functional controllability are fundamentally different properties of control-critical systems.
The case studies on the industrial robotic joint and robotic arm also confirm the practical usefulness of the framework. In both systems, the required control energy increases nonlinearly well before structural failure is predicted. CRC captures the escalation of operational risk, disturbance sensitivity, maintenance recovery, and energy inefficiencies that are imperceptible to survival metrics. The framework consequently offers practical early-warning information for predictive maintenance, digital twins, and supervisory control frameworks.
The sensitivity and robustness analysis indicates that the proposed formulation has predictable threshold behavior, physically consistent sensitivity to degradation, and moderate robustness to disturbance uncertainty. The smooth variation in predicted failure time with stabilizability thresholds and degradation rates confirms the structural soundness of the method.
The methodological distinction is further clarified by the comparative assessment against Lyapunov stability and H∞ robustness metrics. Stability and robustness can be maintained even while controllability is lost; CRC is the only metric that identifies this intermediate regime, in which a mathematically stable system is no longer practically controllable.
In summary, this paper shows that the contemporary concept of reliability for cyber–physical and actuator-driven mechanical systems cannot be fully captured by a structural survival probability; rather, loss of control authority is the critical, and often earlier, failure mechanism. The CRC framework offers a unified reliability–control paradigm that does the following:
- (i)
Measures loss of control authority;
- (ii)
Characterizes failure as loss of stabilizability;
- (iii)
Bridges reliability engineering and control theory;
- (iv)
Allows proactive maintenance decisions that are energy conscious.
By moving the concept of reliability into the dynamic domain of controllability, this work provides a principled, operationally significant basis for the reliability analysis of industrial robotic systems and other control-critical engineered systems. Future work will extend the approach toward nonlinear controllability formulations, disturbance-aware reliability analysis, scalable large-scale implementations, and experimental validation.