1. Introduction
Predicting the remaining useful life (RUL) of engineering systems is a central problem in prognostics and health management (PHM), with direct implications for condition-based maintenance, safety, and operational efficiency. In recent years, both data-driven and model-based approaches have been developed to exploit condition-monitoring (CM) signals collected from sensors embedded in industrial assets. Classical stochastic degradation models, including Wiener-process- and Gamma-process-based formulations [
1], have been successfully applied to soft-failure scenarios, where failure is defined by a known degradation threshold. However, such models typically rely on smooth and monotonic degradation trajectories and may be inadequate for systems exhibiting abrupt failures, heterogeneous degradation rates, or latent failure mechanisms. These limitations have motivated increasing interest in hard-failure prognostics, where the relationship between observed signals and failure time is indirect or unobservable [
2].
In many practical PHM applications, high-dimensional CM data are first compressed into a univariate health indicator (HI) through feature extraction or dimensionality reduction, after which RUL prediction is performed on the resulting trajectory. Traditional machine learning approaches operating on such health indicators often rely on extensive feature engineering, which requires substantial expert knowledge and manual design effort, making the modeling process costly and difficult to scale across heterogeneous systems [
3]. These limitations have motivated increasing interest in end-to-end data-driven models that operate directly on raw sensor sequences. However, irrespective of whether features are manually engineered or automatically learned, RUL prediction methods that rely on explicit failure thresholds become difficult to justify when such thresholds are unknown or system-dependent. State-based degradation models provide a principled alternative by representing the degradation process through discrete latent health states that evolve stochastically over time [
4]. Hidden Markov models (HMMs) are particularly attractive in this setting as they decouple latent degradation dynamics from noisy observations and enable scalable inference via efficient forward–backward recursions [
5,
6,
7].
The HMM-based joint modeling framework proposed by Deep et al. [
8] demonstrated that integrating CM signals with failure-event information through an absorbing-state structure enables threshold-free RUL estimation, as failure is defined by entry into a latent absorbing state rather than by crossing an explicit degradation threshold. Under this formulation, the expected remaining lifetime can be computed analytically using state posterior probabilities and state-dependent survival characteristics. This has established HMMs as a powerful tool for hard-failure prognostics and has inspired a growing body of follow-up work.
Despite this appeal, recent studies have increasingly emphasized that the practical reliability of HMM-based RUL predictors is often limited not by model structure but by estimation stability [
8,
9]. Related challenges have also motivated Bayesian filtering approaches for RUL prediction, including Kalman and particle filtering methods, which provide a principled framework for sequential state estimation and uncertainty quantification [
10]. Nevertheless, their practical performance can be sensitive to prior specification, noise model assumptions, and computational burden, particularly under non-Gaussian disturbances and limited run-to-failure data. First, parameter estimation via maximum likelihood or expectation–maximization (EM) can suffer from severe variance inflation when the effective sample size assigned to individual latent states—implicitly inferred through posterior-state probabilities—is small, a common situation in run-to-failure datasets with short trajectories or unbalanced state occupancy [
11,
12]. Second, CM-derived health indicators frequently exhibit outliers, transient spikes, and local irregularities arising from operating condition variability or sensor drift, which violate Gaussian noise assumptions and can destabilize parameter estimates [
13,
14,
15]. Third, under a limited number of run-to-failure units, reliance on a single train–test partition without additional variance-control mechanisms may lead to high-variance performance estimates, making model comparison sensitive to the chosen dataset split [
9].
Recent advances in robust and regularized learning offer promising avenues for addressing these challenges, particularly in contrast to physics-based prognostic models. While physics-based approaches can offer strong interpretability, their practical performance is often highly sensitive to the availability and precision of domain knowledge, which may be incomplete or unreliable under complex operating conditions and noisy environments [
16]. Moreover, many physics-based models lack the flexibility to be updated online using streaming sensor data, limiting their robustness and adaptability in real-world PHM applications. In this context, robust estimation techniques based on Huber’s seminal work [
13,
14] have been shown to substantially reduce the influence of anomalous observations and transient outliers in time-series regression and state-space models, providing a principled mechanism to improve estimation stability under non-Gaussian noise. In parallel, regularized EM formulations for latent-variable models have gained attention as a means of controlling estimator variance under limited sample sizes. In particular, ridge-regularized expectation–maximization formulations have been shown to improve numerical conditioning and variance control in regression-based parameter updates for latent-variable models [
12]. Relatedly, recent studies integrating robust loss principles with latent-state inference highlight the importance of controlling estimator variance and mitigating the influence of anomalous observations in sequential decision-making problems [
15,
17]. However, existing approaches typically address these issues in isolation: regularization-based methods focus on stabilizing regression estimates under data scarcity, while robust estimation techniques primarily target resilience to outliers in the observation model. In HMM-based prognostics trained via EM, these two sources of instability interact non-trivially through posterior-state assignments, so addressing only one of them is often insufficient. This motivates a unified treatment that jointly stabilizes state-dependent slope estimation and residual variance updates within the EM loop.
Motivated by these developments, this paper focuses on stabilizing the estimation of emission parameters in HMM-based hard-failure prognostics. Rather than modifying the latent-state inference mechanism, we target the dominant source of estimation variability: the state-wise regression models used during the parameter update stage of the EM algorithm. Specifically, we develop a ridge-regularized EM framework in which only slope parameters are penalized, a standard practice in regularized regression to preserve interpretable intercepts while controlling variance inflation due to collinearity or limited effective sample size [
18,
19]. Regularization strengths are selected via cross-validation to avoid ad hoc tuning [
20]. In addition, we incorporate a Huber-based robust scale estimator for residual variance estimation, mitigating the impact of outliers and transient anomalies in the health indicator without altering the underlying EM optimization structure.
The resulting model is formulated within a simple-failure HMM structure, where failure is represented as an absorbing state. RUL estimation is performed analytically as the posterior-weighted expected hitting time of the absorbing state using smoothed posterior-state probabilities and transient-state survival characteristics, yielding a low-variance and computationally efficient predictor. Importantly, the proposed approach preserves the interpretability and computational advantages of classical HMM-based joint modeling while explicitly addressing bias–variance trade-offs and robustness at the parameter-estimation level.
The proposed methodology is evaluated through extensive Monte Carlo simulations and a real-world case study. The results demonstrate that the proposed ridge-regularized and robust EM approach consistently improves parameter stability and RUL prediction accuracy relative to the baseline weighted least squares EM (WLS-EM) method, particularly under limited sample sizes and noisy observation regimes.
The main contributions of this work are summarized as follows:
We develop a ridge-regularized EM algorithm for HMM-based degradation modeling, employing slope-only penalization with cross-validated regularization strengths to stabilize posterior-weighted emission regressions for latent states under data scarcity.
We incorporate a Huber-based robust residual variance estimator into the M-step, enhancing resistance to outliers and local irregularities in sensor-derived health indicators.
We retain an analytically tractable state-based RUL estimator within an HMM framework, enabling efficient and low-variance remaining-life prediction.
We provide comprehensive validation on simulated and real-world data, demonstrating consistent improvements over classical WLS-EM-based approaches.
The remainder of the paper is organized as follows.
Section 2 introduces the proposed modeling framework, including the health indicator construction, the HMM specification, and the regularized and robust parameter estimation procedure.
Section 3 presents a simulation study under varying sample sizes to assess estimation stability and predictive performance.
Section 4 and
Section 5 report the benchmark results on the real dataset.
Section 6 discusses the statistical implications, robustness and practical significance of the proposed approach. Finally,
Section 7 concludes the paper.
2. Materials and Methods
This section presents the complete methodological pipeline of the proposed robust hidden Markov model (HMM) framework in three major components: the model formulation, the EM estimation procedure with ridge–WLS updates and robust scale estimation, and the weighted RUL calculation.
2.1. Mathematical Model
We begin by formalizing the statistical structure of the degradation and failure process within a hidden Markov model (HMM) framework. The HMM formulation provides a natural representation for systems whose degradation evolves through latent health states and is observed indirectly through noisy condition-monitoring (CM) signals. This perspective is consistent with state-based prognostic models used in PHM applications [
8] and is rooted in classical probabilistic modeling introduced by Baum and colleagues in the 1960s [
13] as extensions of Markov’s early work on sequential dependence.
2.1.1. Degradation and Failure Behavior
Consider a system with
N identical units operating under similar conditions and subject to only corrective maintenance (run-to-failure). For each unit
, a monitoring system collects degradation indicators periodically. Let
denote a scalar degradation indicator observed at period
. Such indicators are assumed to summarize the dominant degradation dynamics from multivariate condition-monitoring signals and are treated as observed inputs to the prognostic model. The observation horizon
varies across units, and each trajectory terminates upon failure. While failure time
is observed, the underlying degradation level that triggers failure is latent, and it is necessary to infer degradation stages from
and failure times [
21].
2.1.2. Hidden Markov Model Structure
The latent degradation process is modeled as a discrete-time hidden Markov model (HMM) with finite state space {1, …, K}. Let denote the latent degradation state of unit i at time t. States 1 to K − 1 correspond to progressive degradation levels, while state K represents the failure point, modeled as an absorbing state. All units are assumed to start at the initial healthy state, .
The state dynamics follow a first-order Markov chain with transition matrix
, where
The failure state
K is absorbing, i.e.,
,
for
, and
for
. The failure time, denoted by
, is defined as the first hitting time to the absorbing state, starting from state 1 at time 1:
The relationship between the transition matrix P and the absorption-time quantities underlying remaining useful life (RUL) estimation is classical.
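To make the simple-failure structure concrete, the following sketch builds such a transition matrix in Python. The function name and the stay probabilities are illustrative choices, not values from the paper: each transient state either persists or moves to the next degradation level, and the final state is absorbing.

```python
import numpy as np

def simple_failure_transition_matrix(stay_probs):
    """Build a K x K simple-failure transition matrix.

    stay_probs: illustrative self-transition probabilities for the K-1
    transient states; each transient state k stays with probability p_kk
    or moves to state k+1, and the last state K is absorbing.
    """
    K = len(stay_probs) + 1
    P = np.zeros((K, K))
    for k, p in enumerate(stay_probs):
        P[k, k] = p
        P[k, k + 1] = 1.0 - p
    P[K - 1, K - 1] = 1.0  # absorbing failure state
    return P

P = simple_failure_transition_matrix([0.95, 0.90, 0.85])
```

Each row sums to one, and the last row places all mass on the failure state, which is exactly the constraint enforced after every M-step row update.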
2.1.3. State-Dependent Emission Model
Conditioned on the latent state
, the degradation signal
follows a Gaussian emission distribution with a state-specific linear mean structure. For each state
and time
,
where
and
denote the state-dependent intercept and slope, respectively, and
is the corresponding noise variance.
This state-wise linear formulation captures gradual but potentially heterogeneous degradation behavior across latent states. Following the HMM-based joint modeling framework for condition-monitoring signals and failure events [
8], we approximate the degradation dynamics within each latent health state by a simple linear trend, which provides a parsimonious yet interpretable representation of state-dependent degradation behavior.
The emission probability density associated with state
k is therefore given by
and the emission parameters for each state are collected as
Under the standard conditional independence assumption of hidden Markov models, the joint likelihood of the observation sequence given the latent-state path factorizes as
where
denotes the Gaussian emission density associated with state
k.
This emission model integrates naturally into the Baum–Welch EM framework, where posterior-state probabilities obtained from the forward–backward algorithm are used to perform weighted estimation of the state-dependent parameters. This formulation provides the foundation for the regularized and robust emission updates introduced in the subsequent sections.
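The state-dependent Gaussian emission density with a linear mean can be sketched as follows; the argument names (intercept `a_k`, slope `b_k`, noise scale `sigma_k`) mirror the state-dependent parameters described above but are illustrative identifiers.

```python
import math

def emission_density(y, t, a_k, b_k, sigma_k):
    """Gaussian emission density with state-specific linear mean:
    y ~ N(a_k + b_k * t, sigma_k^2), evaluated at observation y."""
    mu = a_k + b_k * t            # state-dependent linear trend
    z = (y - mu) / sigma_k
    return math.exp(-0.5 * z * z) / (sigma_k * math.sqrt(2.0 * math.pi))
```

Evaluating this density for every state and time point produces the emission terms consumed by the forward–backward recursion in the E-step.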
2.2. Complete Likelihood Representation
Under the hidden Markov model formulation, the joint distribution of the latent state sequence
and the observed degradation signal
admits a standard factorization into transition and emission components. This structure enables parameter estimation via the expectation–maximization (EM) algorithm, where latent states are treated as missing data [
22].
For a single unit with latent states
and observations
, the complete-data likelihood is given by
where
denotes the initial state distribution, the transition probabilities
are defined in
Section 2.1.2. This likelihood corresponds to the classical complete-data formulation for HMMs [
5,
6].
Failure information in the present work is encoded via an absorbing-state HMM, which preserves the standard likelihood while enabling RUL estimation through absorption-time properties of the transition matrix. While transition probabilities admit stable closed-form updates, estimation of the linear emission parameters may become ill-conditioned when only a limited number of observations effectively contribute to a given latent state. We therefore adopt a ridge-regularized weighted least squares surrogate in the M-step (
Section 2.3.3).
2.3. Robust HMM Framework for Hard-Failure Prognostics
Our proposed framework builds upon the standard absorbing-state HMM formulation. Rather than reiterating the classical HMM and EM machinery, we focus on robustness under limited and noisy degradation data.
With the state-space formulation specified in
Section 2, the degradation process is fully characterized by (i) a first-order Markov transition model with an absorbing failure state and (ii) a state-dependent Gaussian emission model. Together, these components define a complete-data likelihood that admits a natural decomposition into transition and emission terms. This decomposition is central to likelihood-based inference for HMMs and leads directly to an expectation–maximization (EM) estimation scheme. We therefore proceed by deriving the EM updates, beginning with the computation of the smoothed posterior-state probabilities via the forward–backward (Baum–Welch) recursion.
2.3.1. E-Step: Forward–Backward Smoothing
For each unit
, let
denote the degradation signal observed at time
, and let
denote the corresponding latent degradation state. The initial state distribution is denoted by
, where
and, in the hard-failure setting considered here, all units are assumed to start in the healthy initial state,
The E-step evaluates the conditional expectations of latent-state indicators given the current parameter estimates. This is carried out using the classical forward–backward recursion (Baum–Welch algorithm) [
5,
6], summarized below.
Forward variables. For unit
i, the forward variables
are computed recursively as
where
denotes the Gaussian emission density associated with latent state
k, as defined in (
2). The summation term aggregates the probability of transitioning into state
k at time
t from all possible latent states at time
, while the factor
accounts for the likelihood of observing
in state
k.
Backward variables. The backward variables
are initialized at the final time and propagated backwards as
Posterior-state probabilities (smoothing). For each unit
i, the posterior probability of being in state
k at time
t is
The denominator corresponds to the marginal likelihood
and does not depend on
t. Consequently,
for all
t.
Joint posterior of successive states. For each unit
i, the joint posterior probability of being in state
k at time
t and state
ℓ at time
is
Using the forward–backward variables, this quantity can be expressed as
The smoothed probabilities quantify the unit- and time-specific responsibility of latent state k for each observation and are used as weights in the update of the emission regression parameters. The joint probabilities aggregate expected transition counts and yield closed-form updates of the transition probability matrix in the parameter estimation step.
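The scaled forward–backward recursion described above can be sketched as a standard textbook implementation (this is not the authors' code): `B[t, k]` holds the emission density of observation t under state k, the scaling constants `c[t]` prevent numerical underflow, and their log-sum recovers the marginal log-likelihood.

```python
import numpy as np

def forward_backward(P, pi0, B):
    """Scaled forward-backward smoothing for one unit.

    P: (K, K) transition matrix; pi0: (K,) initial state distribution;
    B: (T, K) emission densities evaluated at each observation.
    Returns smoothed posteriors gamma (T, K) and the log-likelihood.
    """
    T, K = B.shape
    alpha = np.zeros((T, K))
    beta = np.zeros((T, K))
    c = np.zeros(T)
    alpha[0] = pi0 * B[0]
    c[0] = alpha[0].sum()
    alpha[0] /= c[0]
    for t in range(1, T):                 # scaled forward pass
        alpha[t] = (alpha[t - 1] @ P) * B[t]
        c[t] = alpha[t].sum()
        alpha[t] /= c[t]
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):        # scaled backward pass
        beta[t] = (P @ (B[t + 1] * beta[t + 1])) / c[t + 1]
    gamma = alpha * beta                  # smoothed state posteriors
    gamma /= gamma.sum(axis=1, keepdims=True)
    return gamma, np.log(c).sum()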
2.3.2. M-Step: Transition Matrix and Weighted Least Squares (WLS) Emission Update
Given the smoothed state posteriors
and joint transition posteriors
obtained from the E-step, the parameter estimation step updates the transition probabilities by maximizing the expected complete-data log-likelihood. For a first-order Markov chain, this leads to a closed-form estimator in which each row of the transition matrix corresponds to normalized expected transition counts. This result is standard in EM estimation for hidden Markov models and follows directly from the structure of the complete-data log-likelihood (see, e.g., [
5,
6,
23]).
The absorbing-state constraints are enforced after each row update to ensure that state K remains absorbing under the simple-failure structure.
As a reference (unregularized) estimator, emission parameters are updated using a standard weighted least squares (WLS) procedure within the EM framework. Conditioned on the smoothed state posteriors
obtained from the E-step and the linear Gaussian emission model defined in (
1), the state-wise WLS update for each transient state
is obtained by solving
Given the resulting fitted values
, the state-specific noise variance is updated by
The closed-form expressions for the WLS estimators follow directly from the normal equations and are standard in HMM-based regression models (e.g., [
5,
6,
8]). In the proposed framework, the WLS-EM formulation is used solely as a baseline to assess the benefits of the ridge-regularized and robust emission updates introduced in the next subsection. For completeness, the explicit closed-form solutions are reported in
Appendix A.
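A posterior-weighted least squares fit of this kind can be sketched as follows, under the assumption that the smoothed posteriors enter as regression weights; the function name is illustrative and the closed form follows the standard weighted normal equations rather than reproducing Appendix A verbatim.

```python
import numpy as np

def wls_emission_update(y, t, w):
    """Posterior-weighted least squares fit y_t ~ a + b * t for one state.

    w: nonnegative weights playing the role of the smoothed posteriors.
    Returns intercept a, slope b, and the weighted residual variance.
    """
    W = np.sum(w)
    tbar = np.sum(w * t) / W              # weighted means
    ybar = np.sum(w * y) / W
    b = np.sum(w * (t - tbar) * (y - ybar)) / np.sum(w * (t - tbar) ** 2)
    a = ybar - b * tbar
    resid = y - (a + b * t)
    sigma2 = np.sum(w * resid ** 2) / W   # weighted residual variance
    return a, b, sigma2
```

When the effective weight mass assigned to a state is small, the denominator of the slope estimate shrinks and the fit becomes unstable, which is precisely the failure mode the ridge-regularized update in the next subsection targets.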
2.3.3. M-Step: Ridge-Regularized Emission Update
To improve numerical stability of the emission parameter estimates under limited effective state-specific sample support, we replace the unregularized WLS update with a ridge-regularized weighted least squares formulation. Here, the effective sample size for each state is induced by the posterior-state probabilities obtained in the E-step rather than by directly observed state-labeled data. The regularization is applied only to the slope parameter in order to stabilize the estimated degradation rate while preserving the interpretability of the state-dependent intercept.
For each transient state
, the emission parameters
are obtained by minimizing the penalized objective
where
is a state-specific regularization parameter. The resulting penalized objective coincides with the classical ridge regression criterion, which augments the weighted least squares loss with a quadratic
penalty on the regression coefficients to improve numerical stability [
18]. The factor of 1/2 is a matter of convention and simplifies derivatives.
Let
denote the effective number of observations associated with state
k. Define the design matrix
with rows
, the observation vector
collecting the corresponding measurements
, and the diagonal weight matrix
which aggregates the posterior-state probabilities obtained from the E-step. Under this notation, (
15) can be written compactly as
where
and
enforces slope-only penalization. This formulation shows explicitly that the emission update corresponds to a posterior-weighted ridge regression problem, in which uncertainty about latent-state assignments is accounted for through
.
The resulting estimator is obtained from the modified normal equations
In implementation, these quantities are computed from weighted sufficient statistics rather than explicit construction of
X and
, improving both numerical stability and computational efficiency.
2.3.4. Selection of the Regularization Strength via Cross-Validation
For each latent state k, the ridge regularization parameter is selected in a data-driven manner using K-fold cross-validation, performed exclusively within the training set; all reported performance metrics are computed on a held-out test set. The optimal amount of regularization depends on the effective sample size and noise level within each state and therefore cannot be fixed a priori.
State-specific weighted observation set. The state-specific weighted observation set associated with state
k is defined as
where each observation
is weighted by its posterior responsibility
.
K-fold partitioning. The index set
is randomly partitioned into
K disjoint folds,
Training loss. For a given candidate value
and fold
j, the ridge-regularized weighted least squares estimator is obtained by minimizing the training objective
where the ridge penalty is applied only during training.
Validation loss. The corresponding validation loss on the held-out fold
is defined as
which does not include any regularization term.
Cross-validation score. Averaging the validation loss across folds yields the cross-validation score
Standard error estimation. The variability of the validation loss across folds is quantified by
where
measures the uncertainty of the estimated mean validation loss.
One-standard-error (1 − SE) rule. Let
denote the mean K-fold cross-validation loss (weighted mean squared error) computed over the state-specific weighted observation set, and let
denote the value minimizing the cross-validation score. The admissible set of regularization parameters is defined as
Finally, we select
which favors the strongest regularization that remains statistically indistinguishable from the minimum validation error. This choice yields a more stable estimator and mitigates overfitting under limited effective sample sizes [
18].
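Given a candidate grid with per-fold mean losses and standard errors, the one-standard-error selection reduces to a few lines; the sketch below assumes these summaries have already been computed, and the function name is illustrative.

```python
import numpy as np

def one_se_lambda(lams, cv_mean, cv_se):
    """One-standard-error rule: among candidate penalties, return the
    largest lambda whose mean CV loss lies within one standard error
    of the minimum mean CV loss."""
    j = int(np.argmin(cv_mean))            # index of the CV minimizer
    threshold = cv_mean[j] + cv_se[j]      # 1-SE admissibility bound
    admissible = [lam for lam, m in zip(lams, cv_mean) if m <= threshold]
    return max(admissible)                 # strongest admissible penalty
```

Choosing the largest admissible penalty rather than the exact minimizer trades a statistically negligible increase in validation loss for a markedly more stable estimator, which is the rationale stated above.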
Once the state-specific regularization strengths have been selected and the ridge-regularized emission parameters are obtained, the remaining component of the M-step concerns the estimation of the state-dependent noise variance.
2.3.5. Robust Variance Estimation
Given the ridge-regularized emission estimates , we update the state-specific noise variance using a robust M-estimator of scale in order to limit the influence of atypical residuals and local deviations from the assumed Gaussian noise model.
Specifically, we define the fitted values
as
and update the state-specific noise variance using a Huber-type robust M-estimator of scale [
13,
14,
24]. Let
denote the residuals, and let
be the weighted median absolute deviation of the set
with weights
. An initial robust scale is obtained through the standard normal consistency factor
, i.e.,
To achieve 95% asymptotic efficiency under Gaussian noise, we adopt the classical Huber threshold of
[
13]. We therefore set
with a small
to prevent degeneracy. The Huber loss function is
which yields a bounded-influence scale update. The state-wise noise variance is estimated by
This robust variance update reduces the influence of occasional large residuals that can disproportionately affect the standard WLS variance estimator, particularly in the presence of local anomalies or short-lived deviations from the assumed linear degradation trend. When combined with ridge-regularized slope estimation, the resulting M-step achieves a more favorable bias–variance trade-off by stabilizing both the mean and variance updates without sacrificing efficiency for nominal Gaussian noise [
18].
The practical impact of this robustification is illustrated in the simulation study (
Section 3), where residual-wise contributions to the variance update under the standard WLS and Huber losses are contrasted.
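A bounded-influence scale update of this flavor can be sketched as follows. This is a simplified variant, not the paper's exact estimator: it uses a weighted median absolute deviation for the initial scale (with the 1.4826 Gaussian consistency factor) and then winsorizes residuals at the classical Huber threshold c = 1.345 before forming the weighted variance, so a single large residual cannot dominate the update.

```python
import numpy as np

def huber_scale(resid, w, c=1.345, eps=1e-8):
    """Huber-type robust scale for posterior-weighted residuals (sketch).

    Initial scale: weighted MAD times 1.4826 for Gaussian consistency.
    Residuals are then winsorized at c * sigma0, bounding the influence
    of outliers on the weighted variance.
    """
    order = np.argsort(np.abs(resid))
    cw = np.cumsum(w[order]) / np.sum(w)           # weighted CDF of |r|
    mad = np.abs(resid)[order][np.searchsorted(cw, 0.5)]
    sigma0 = max(1.4826 * mad, eps)                # eps avoids degeneracy
    clipped = np.clip(resid, -c * sigma0, c * sigma0)
    return np.sum(w * clipped ** 2) / np.sum(w)
```

On clean Gaussian residuals the clipping is rarely active and the estimate stays close to the ordinary weighted variance, which is how near-full efficiency under nominal noise is retained.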
The complete estimation procedure, integrating the E-step with the ridge-regularized and Huber-robust M-step updates, is summarized in Algorithm 1. The algorithm preserves the classical EM structure while incorporating penalized and robust components in a closed-form and computationally efficient manner. This formulation ensures stable parameter updates across iterations and provides the foundation for the analytical RUL estimator derived in the subsequent section.
| Algorithm 1 Penalized EM Algorithm for the Robust Absorbing-State HMM. |
Require: Observed trajectories for ; number of states K (state K absorbing); ridge penalties .
1: Initialize parameters (e.g., simple-failure and a global linear regression for emissions).
2: repeat
3:   E-step (forward–backward).
4:   Run the scaled forward–backward recursion to obtain for and for (see (6)–(11)).
5:   M-step: transition matrix.
6:   Update the transition matrix from normalized expected transition counts and re-enforce the absorbing-state constraints.
7:   M-step: emission parameters (ridge-regularized).
8:   Select by weighted K-fold cross-validation under the current weights .
9:   for each transient state do
10:    Update by slope-only ridge regression.
11:    Update using a Huber-type robust scale estimator based on the weighted residuals.
12:   end for
13: until convergence of the log-likelihood and parameter updates.
2.4. RUL Estimation via Expected Hitting Time
Under the absorbing-state hidden Markov model, system failure is defined as the first hitting time of the absorbing state
K. For unit
i, the failure time is given by
where
denotes the latent health state at time
t. The remaining useful life (RUL) at time
is then defined as the expected remaining time until absorption,
.
For a first-order Markov chain with a single absorbing state, classical results for absorbing Markov chains apply. Let
denote the transient-state submatrix of the estimated transition matrix
P, and define the fundamental matrix [
25]
If the latent state at time
were known to be
, the expected remaining time to absorption would be
which corresponds to the expected number of future visits to transient states starting from state
k.
In practice, the latent state is not observed. Instead, the EM algorithm provides the smoothed posterior probabilities
. Taking the expectation of (
32) with respect to these posteriors yields the RUL estimator
Equivalently, this expression can be written in compact vector form as
where
collects the posterior probabilities over the transient states and
denotes the all-ones vector. This closed-form computation is efficient and depends only on the estimated transition matrix
P.
The absorbing-state (simple-failure) structure ensures that Q is strictly sub-stochastic, so that the fundamental matrix is well defined and invertible, guaranteeing numerical stability of the RUL estimator.
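The closed-form posterior-weighted estimator can be sketched in a few lines; the convention below places the absorbing state at the last index, and the function name is illustrative.

```python
import numpy as np

def expected_rul(P, gamma_t):
    """Posterior-weighted expected hitting time of the absorbing state.

    P: (K, K) transition matrix with the absorbing state at the last index.
    gamma_t: (K,) smoothed posterior over states at the current time.
    """
    Q = P[:-1, :-1]                             # transient-state submatrix
    N = np.linalg.inv(np.eye(Q.shape[0]) - Q)   # fundamental matrix
    mu = N @ np.ones(Q.shape[0])                # expected time to absorption
    return gamma_t[:-1] @ mu                    # posterior-weighted average
```

For a two-state chain that leaves the healthy state with probability 0.5 per period, the expected time to absorption is geometric with mean 2, which the sketch reproduces.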
To sum up, in the proposed framework, each EM iteration consists of (i) an E-step computing the forward–backward recursions and the associated smoothed quantities
and
, and (ii) an M-step updating the transition matrix
P and the state-wise emission parameters
. While the standard EM algorithm monitors the ascent of the (unpenalized) log-likelihood
the ridge-regularized M-step instead maximizes the penalized surrogate objective
reflecting the slope-only regularization applied to the emission trends. Accordingly, convergence is assessed using both (i) the absolute change in
and (ii) the maximum change in the emission parameters across iterations. For interpretability and for comparison with the unregularized WLS-EM baseline, we also record the unpenalized log-likelihood trace
at each iteration
r.
The selection of the ridge penalties
is carried out during the early phase of the EM procedure via weighted cross-validation on the state-wise prediction loss using the one-standard-error rule [
18]. The resulting values are then kept fixed to stabilize the subsequent emission updates.
Because EM algorithms for latent-state models can be sensitive to initialization, we follow the general guidance of Wu [
26] and adopt a carefully designed structured initialization for the transition matrix and regression coefficients. This initialization is kept fixed across runs as the imposed monotonicity and absorbing-state constraints provide sufficient numerical stability and ensure full reproducibility of the final estimates.
The complete posterior-weighted RUL estimation procedure is summarized in Algorithm 2. This formulation provides a systematic and computationally straightforward way to obtain remaining useful life predictions from the estimated HMM parameters. The predictive performance of the proposed framework is evaluated next through Monte Carlo experiments under controlled simulation settings.
| Algorithm 2 Posterior-Weighted RUL Estimation. |
Require: Estimated parameters ; absorbing state K; observed degradation trajectory for a new unit i.
1: Emission Evaluation: Compute Gaussian emission densities for all and using the linear emission model in (1).
2: Forward–Backward Recursions: Run the scaled forward and backward recursions ((6)–(8)) to obtain and . Compute smoothed posteriors via (9).
3: Fundamental Matrix: Extract the transient submatrix . Compute the fundamental matrix .
4: Compute RUL Path: For each , evaluate using (34).
5: return Full trajectory and the current estimate .
3. Monte Carlo Simulation Study
This section investigates the statistical properties of the proposed robust HMM estimators under controlled synthetic conditions. The goal is to assess (i) parameter stability, (ii) variance reduction under regularization, and (iii) improvements in RUL prediction accuracy relative to the unregularized WLS-EM baseline.
3.1. Simulation Setup
Synthetic run-to-failure trajectories were generated from a
K-state simple-failure HMM with linear Gaussian emissions as defined in (
1) using ground-truth parameters chosen to reproduce monotone degradation patterns typical of PHM applications. The latent-state dynamics follow a progressive structure in which units may remain in the same state or transition only to the next degradation level (i.e., from state
k to
), with state
K modeled as absorbing. The true transition matrix used in the simulations is provided in
Table 1, and the emission parameters
for each state are listed in
Table 2.
For each Monte Carlo replication, N independent units were simulated until absorption, with and repetitions per setting. To assess robustness, the simulations incorporated two controlled perturbations:
- Effectively heavy-tailed noise: rare large residuals were introduced through occasional high-magnitude deviations in the emission noise, resulting in heavier-than-Gaussian tails without explicitly changing the nominal noise model.
- Heterogeneous degradation rates: unit-to-unit variability in degradation speed was induced by state-dependent emission slopes, causing different progression rates through the latent health states while preserving the same transition matrix.
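The generative process above can be sketched as follows. This is a hypothetical illustration: the transition probabilities, slopes, intercepts, and contamination settings are made up for the example and do not reproduce Tables 1 and 2.

```python
import numpy as np

def simulate_unit(P, slopes, intercepts, sigma, rng,
                  contam_prob=0.02, contam_scale=5.0, max_T=500):
    """Simulate one run-to-failure trajectory from a progressive HMM.

    Linear Gaussian emissions y_t = a_k + b_k * t + noise, with rare
    high-magnitude residuals to induce heavier-than-Gaussian tails.
    """
    K = P.shape[0]
    state, y = 0, []
    for t in range(max_T):
        # occasional inflated noise scale mimics effectively heavy tails
        scale = sigma[state] * (contam_scale if rng.random() < contam_prob else 1.0)
        y.append(intercepts[state] + slopes[state] * t + rng.normal(0.0, scale))
        if state == K - 1:            # absorbing failure state reached
            break
        state = rng.choice([state, state + 1],
                           p=[P[state, state], P[state, state + 1]])
    return np.array(y)

rng = np.random.default_rng(0)
P = np.array([[0.95, 0.05, 0.0],
              [0.0, 0.9, 0.1],
              [0.0, 0.0, 1.0]])
traj = simulate_unit(P, slopes=np.array([0.01, 0.05, 0.2]),
                     intercepts=np.array([0.0, 0.5, 2.0]),
                     sigma=np.array([0.1, 0.1, 0.2]), rng=rng)
```

Each simulated unit yields one observation per cycle until absorption, so a fleet is generated by repeated calls with independent streams.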
3.2. Estimation and Evaluation Metrics
Model estimation is performed using the EM algorithm described in Section 2, with either weighted least squares (WLS) or ridge-regularized updates for the state-dependent emission parameters. The regularization strengths are selected in a data-driven manner via K-fold cross-validation on the weighted prediction loss in (15), using the one-standard-error (1 − SE) rule. Candidate values are searched on a fixed logarithmic grid with 30 evenly spaced points on the logarithmic scale, shared across all states to ensure comparability of the regularization path. As illustrated in Figure 1, the cross-validation loss curve typically exhibits a flat minimum over a wide range of values, enabling the 1 − SE rule to select a stable regularization strength. Consequently, for each state k, we select the largest penalty whose validation loss is statistically indistinguishable from the minimum, producing a conservative state-dependent choice that prioritizes numerical stability under limited effective sample sizes.
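The 1 − SE selection step can be sketched as follows (a hypothetical helper with illustrative loss values, not code or numbers from the paper):

```python
import numpy as np

def one_se_rule(lambdas, cv_mean, cv_se):
    """Select the largest lambda whose CV loss is within one standard
    error of the minimum (conservative, stability-oriented choice)."""
    lambdas, cv_mean, cv_se = map(np.asarray, (lambdas, cv_mean, cv_se))
    best = np.argmin(cv_mean)
    threshold = cv_mean[best] + cv_se[best]
    eligible = lambdas[cv_mean <= threshold]
    return eligible.max()

# flat-minimum example: several lambdas are statistically indistinguishable
lams = np.logspace(-4, 2, 7)
mean = np.array([1.0, 0.8, 0.5, 0.45, 0.46, 0.7, 1.2])
se = np.full(7, 0.05)
lam_1se = one_se_rule(lams, mean, se)
```

Because the minimum at 0.45 has a one-SE band reaching 0.50, three grid points qualify, and the largest of them is returned.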
The performance of the model in the Monte Carlo experiments is evaluated using complementary metrics that capture both estimation accuracy and predictive reliability. Specifically, we report: (i) parameter-space mean squared error (MSE) for the emission parameters and transition probabilities; (ii) predictive MSE of the one-step-ahead emission means; and (iii) RUL prediction accuracy, assessed via the RMSE of the predicted RUL over time and across Monte Carlo replications, together with a bias–variance decomposition of the RUL MSE.
3.3. Simulation Results
We begin by examining the impact of regularization on the accuracy and stability of emission parameter estimation.
Figure 2 reports a Monte Carlo bias–variance decomposition of the emission parameter estimates as a function of the training-fleet size N, where bias, variance, and expected prediction error (EPE) are averaged over all states and emission parameters.
Across all sample sizes, the ridge-EM estimator exhibits a substantial reduction in estimator variance relative to the unregularized WLS-EM baseline at the cost of a moderate bias for small N. This bias–variance trade-off is most pronounced in data-scarce regimes, where WLS-EM suffers from large variability due to poorly conditioned weighted design matrices. As N increases, the bias induced by the ridge penalty diminishes, while the variance reduction persists, yielding a uniformly lower EPE for ridge-EM across all configurations.
These results demonstrate that the proposed penalized M-step effectively stabilizes state-wise emission trend estimation under limited effective sample sizes, providing a more reliable parameter foundation for downstream prognostic tasks.
We next consider the accuracy of the full emission parameter vector. Figure 3 reports the parameter-space mean squared error (MSE), defined as the squared error of the complete emission parameter vector aggregated over all states. Here, WLS-EM corresponds to the unregularized maximum-likelihood benchmark, while ridge-EM introduces bias in exchange for improved numerical conditioning and variance reduction. Across all sample sizes, ridge-EM achieves lower parameter-space MSE than WLS-EM, with the largest gains observed for small training fleets. Notably, the improvement remains non-negligible even at the largest fleet sizes considered, indicating that ridge regularization remains beneficial beyond extremely data-limited regimes.
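The ridge-regularized emission update behind these results amounts to a penalized weighted least-squares solve. Below is a minimal sketch for one state, penalizing only the slope, consistent with the slope-only penalty adopted in the real-data experiments (names and data are illustrative):

```python
import numpy as np

def ridge_wls_update(t, y, w, lam):
    """Posterior-weighted ridge update for one state's linear emission.

    Solves min_beta sum_i w_i (y_i - beta0 - beta1 * t_i)^2 + lam * beta1^2,
    leaving the intercept unpenalized.
    """
    X = np.column_stack([np.ones_like(t), t])
    W = np.diag(w)
    pen = np.diag([0.0, lam])              # no penalty on the intercept
    beta = np.linalg.solve(X.T @ W @ X + pen, X.T @ W @ y)
    return beta                            # (intercept, slope)

t = np.arange(10.0)
y = 2.0 + 0.5 * t                          # noiseless toy degradation trend
w = np.ones(10)
b_wls = ridge_wls_update(t, y, w, lam=0.0)
b_ridge = ridge_wls_update(t, y, w, lam=50.0)
```

With lam = 0 the update recovers the exact WLS solution; increasing lam shrinks the slope toward zero, which is the variance-reducing mechanism discussed above.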
In addition to ridge penalization, we employ robust variance estimation to control the influence of large residuals. Specifically, we adopt a Huber-type loss in the variance update step, which behaves quadratically for small residuals and transitions to linear growth beyond a threshold. This mechanism preserves efficiency under nominal noise while preventing individual large deviations from dominating the variance estimate.
Figure 4 illustrates the contrast between the standard quadratic WLS loss and the Huber loss as a function of the absolute residual magnitude.
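The contrasted loss can be written compactly as follows (a sketch; the threshold delta = 1.345 is the classical 95%-efficiency choice for Gaussian data and may differ from the tuning constant used in the paper):

```python
import numpy as np

def huber(r, delta=1.345):
    """Huber loss: quadratic for |r| <= delta, linear growth beyond."""
    r = np.abs(np.asarray(r, dtype=float))
    return np.where(r <= delta, 0.5 * r**2, delta * (r - 0.5 * delta))

r = np.array([0.5, 1.0, 3.0, 10.0])
loss = huber(r)
```

A residual of magnitude 10 contributes about 12.5 to the Huber objective versus 50 under the quadratic loss, which is exactly the bounded-influence behavior exploited in the robust variance update.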
We next evaluate prognostic performance using the weighted RUL estimator introduced in Section 2.4. Figure 5 reports the mean RUL trajectories across Monte Carlo runs. The unregularized WLS-EM exhibits a systematic tendency to overestimate the remaining life, most prominently in the mid-life region of the degradation process, where prediction uncertainty is highest. This behavior reflects a positive bias in the average RUL estimates, arising from instability in the emission parameter estimates and the resulting latent-state posteriors. By contrast, ridge-EM yields mean trajectories that remain closer to the true RUL curve and produces noticeably narrower variability bands, indicating both improved stability and reduced dispersion in the predictions.
Table 3 summarizes RUL prediction accuracy across all training fleet sizes using the root mean squared error (RMSE) between the predicted and true RUL trajectories, averaged over Monte Carlo replications. Ridge-EM improves prediction accuracy by approximately 2–3 units for all values of N, with the largest relative gains at moderate fleet sizes.
To better understand this improvement, Figure 6 presents the distribution of RUL prediction errors at a representative time point. The WLS-EM baseline yields a wider error distribution with several large negative outliers (severe underestimation of remaining life), whereas ridge-EM produces a more concentrated and symmetric error profile, indicating improved robustness against local anomalies in the simulated trajectories. Beyond visual comparison, a paired Monte Carlo analysis demonstrates that the reduction in RUL prediction error achieved by ridge-EM at this setting is statistically significant, with a 95% confidence interval for the RMSE difference that does not include zero.
Finally, we examine the bias–variance decomposition of the RUL mean squared error. Across all simulation settings, WLS-EM exhibits relatively small bias but substantially larger variance, particularly for small and moderate training fleet sizes. This variance inflation dominates the overall error, leading to higher RUL MSE. In contrast, ridge-EM yields a pronounced reduction in variance while also stabilizing the bias component, resulting in a lower overall expected prediction error (EPE).
This behavior is summarized quantitatively in Table 4 for a representative case, where ridge-EM achieves a substantial reduction in both variance and total MSE.
Overall, RUL errors under WLS-EM are dominated by estimator variance, whereas ridge-EM achieves lower total error by substantially stabilizing RUL predictions at the cost of a small bias increase.
4. Real-World Case Study: NASA C-MAPSS
To demonstrate the practical applicability of the proposed ridge-regularized robust HMM framework, we evaluate it on the widely used NASA C-MAPSS turbofan degradation dataset (FD001). This dataset is commonly employed in prognostics benchmarks due to its single-operating-condition structure and monotone degradation behavior, making it well-suited for simple-failure latent-state modeling.
4.1. Dataset and Preprocessing
FD001 contains multivariate run-to-failure trajectories from 100 training and 100 test engines. Each record includes the unit identifier, cycle index, three operating condition variables, and 21 sensor channels. In this study, we use all 21 sensor channels to construct a univariate health indicator (HI). The operating condition variables are excluded from the analysis as FD001 corresponds to a single operating regime and these variables do not provide additional discriminative information.
Sensor channels are standardized using z-score normalization based on the training-fleet statistics. Specifically, for each sensor channel, the training mean and standard deviation are computed and the same transformation is applied to both training and test units.
A univariate HI is then constructed via principal component analysis (PCA) trained on the normalized training sensor matrix. The first principal component is retained, and its sign is oriented using its correlation with the cycle index so that the HI reflects a consistent direction of degradation over time. PCA is employed here as a health indicator construction approach widely used in prior FD001 studies to capture dominant degradation trends, while the focus of this work remains on estimation robustness rather than an optimized feature-extraction strategy. For the test units, the HI is obtained by applying the learned loading vector to the normalized sensor matrix. The resulting HI trajectories exhibit the expected slow decline during healthy operation, followed by a sharp descent near failure.
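The HI pipeline described above (training-fleet z-scoring, first principal component, sign orientation against the cycle index) can be sketched as follows. Function names are hypothetical, and synthetic data stand in for the FD001 sensor matrix:

```python
import numpy as np

def build_health_indicator(X_train, cycles_train, X_test):
    """Univariate HI via z-score normalization + first principal component.

    Training statistics and loadings are reused on the test data; the sign
    of the component is oriented by its correlation with the cycle index.
    """
    mu, sd = X_train.mean(axis=0), X_train.std(axis=0)
    sd[sd == 0] = 1.0                              # guard constant channels
    Z_train = (X_train - mu) / sd
    Z_test = (X_test - mu) / sd
    # first right singular vector = first PCA loading of the centered matrix
    _, _, Vt = np.linalg.svd(Z_train - Z_train.mean(axis=0), full_matrices=False)
    v = Vt[0]
    hi_train = Z_train @ v
    if np.corrcoef(hi_train, cycles_train)[0, 1] < 0:
        v = -v                                     # orient HI consistently with time
        hi_train = -hi_train
    return hi_train, Z_test @ v

rng = np.random.default_rng(1)
cycles = np.arange(200.0)
# synthetic stand-in: 8 channels sharing a common cycle-driven trend
X = np.outer(cycles, rng.normal(size=8)) + rng.normal(scale=5.0, size=(200, 8))
hi_tr, hi_te = build_health_indicator(X, cycles, X)
```

The sign check is essential: SVD returns loadings with arbitrary sign, so without it half the runs would produce an HI that improves over time.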
Each unit is modeled using a K-state simple-failure HMM, where latent states represent progressive degradation levels and state K is absorbing. For the real-data analysis, the number of latent states is selected in a data-driven manner using the Bayesian information criterion (BIC), which is widely adopted for model order selection in HMM-based prognostics. To compare competing models with different values of K, the BIC is computed from the unpenalized log-likelihood, consistent with standard practice for penalized likelihood models.
For the simple-failure HMM considered here, the total number of free parameters is p = (K − 1) + 3K, corresponding to K − 1 free transition probabilities under the absorbing-state structure and three emission parameters per state. The BIC is therefore defined as BIC = −2 log L̂ + p log N_obs, where N_obs denotes the total number of observed time points across all units. This formulation is consistent with prior HMM-based condition monitoring and prognostic studies [27,28,29]. Based on this criterion, the value of K achieving the lowest BIC among the candidate models is adopted for all subsequent real-data experiments. The corresponding log-likelihood and BIC values for the candidate models are summarized in Table 5.
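The selection rule reduces to a few lines. In the sketch below the log-likelihood values, sample size, and resulting choice of K are purely illustrative, not the values reported in Table 5:

```python
import numpy as np

def bic_simple_failure_hmm(loglik, K, n_obs):
    """BIC for a K-state simple-failure HMM with linear Gaussian emissions.

    Free parameters: K-1 transition probabilities (stay/advance structure
    with an absorbing final state) plus 3 emission parameters per state.
    """
    p = (K - 1) + 3 * K
    return -2.0 * loglik + p * np.log(n_obs)

# model order selection: pick the K with the lowest BIC
logliks = {3: -5200.0, 4: -5100.0, 5: -5095.0}   # illustrative values
n_obs = 20000
best_K = min(logliks, key=lambda K: bic_simple_failure_hmm(logliks[K], K, n_obs))
```

Here the marginal log-likelihood gain from K = 4 to K = 5 does not offset the four extra parameters, so K = 4 is selected.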
4.2. Model Training and State Progression
Model parameters are learned using both WLS-EM and ridge-EM from a structured and deterministic initialization. The initial state distribution is concentrated on the healthiest state, and the transition probability matrix is initialized with a monotone stay-forward structure and an absorbing failure state. Emission parameters are initialized from a global linear regression fit to the available data; small deterministic intercept offsets are introduced across states to avoid identical initial emissions, while the slope and noise parameters are kept common across states. Ridge regularization is applied only to the slope parameters, with penalty strengths selected via the weighted 1 − SE cross-validation rule described in Section 3.2 (see Appendix C).
Figure 7 shows posterior-state probabilities for a representative training unit. The unit progresses monotonically through the latent degradation levels before reaching the absorbing failure state. Importantly, WLS-EM and ridge-EM produce nearly identical state segmentation, demonstrating that ridge penalization stabilizes parameter estimation without altering the underlying physical degradation structure. However, ridge-EM yields smoother and less noisy posterior trajectories, especially within late-stage states where the WLS solution exhibits small but noticeable fluctuations. This improved stability directly influences RUL estimation, as RUL predictions depend on the posterior distribution over the transient degradation states.
4.3. RUL Prediction Results
We evaluate the RUL prediction performance of the proposed methods on the FD001 test set using standard accuracy and reliability metrics.
Table 6 summarizes the test RMSE and MAE, together with empirical coverage rates at several error tolerances. Let e_j denote the prediction error for test unit j. The coverage rate at tolerance τ is defined as the fraction of test units whose absolute prediction error satisfies |e_j| ≤ τ.
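The coverage metric can be computed directly from the per-unit errors (illustrative values, not the FD001 results):

```python
import numpy as np

def coverage_rate(rul_pred, rul_true, tol):
    """Fraction of test units with |prediction error| <= tol cycles."""
    err = np.asarray(rul_pred) - np.asarray(rul_true)
    return np.mean(np.abs(err) <= tol)

pred = np.array([100, 80, 55, 30, 10])
true = np.array([95, 90, 50, 28, 25])
cov10 = coverage_rate(pred, true, tol=10)   # 4 of 5 units within 10 cycles
cov20 = coverage_rate(pred, true, tol=20)   # all units within 20 cycles
```

Reporting coverage at several tolerances complements RMSE by exposing calibration: two estimators with similar RMSE can differ sharply in how often they land within a practically useful error band.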
Across all metrics, ridge-EM consistently outperforms the unregularized WLS-EM baseline. In particular, ridge-EM achieves substantially lower RMSE and MAE while also exhibiting higher empirical coverage at all error thresholds, indicating improved accuracy and calibration of the RUL predictions.
Figure 8 complements these aggregate metrics by showing the empirical cumulative distribution function (CDF) of the absolute RUL prediction errors. In addition to WLS-EM and the proposed ridge-EM, the figure includes two fundamental benchmark models: a linear degradation regression and a simple Kalman filter-based state-space estimator. These baselines represent classical trend-based and state-space RUL estimation approaches that are widely adopted in industrial predictive maintenance workflows (e.g., MathWorks Predictive Maintenance Toolbox [30]). Across the full error range, the ridge-EM curve dominates that of WLS-EM, indicating a uniformly higher probability of achieving smaller prediction errors.
Figure 9 shows the RUL trajectories for a representative FD001 test unit. Both WLS-EM and ridge-EM capture the overall degradation trend and the countdown toward failure. Differences are most pronounced around inferred state transitions, where WLS-EM exhibits abrupt changes in the predicted RUL. In contrast, ridge-EM yields less variable transitions, reflecting increased stability of the underlying emission parameter estimates. This behavior is consistent with the variance-reducing effect of ridge regularization under state-wise data sparsity rather than a systematic shift in bias.
Finally, Figure 10 summarizes the distribution of prediction errors across all test engines, showing a tighter and more symmetric error profile for ridge-EM.
Overall, the results demonstrate that the proposed ridge-regularized EM framework provides a more stable and reliable basis for RUL prediction under limited and noisy run-to-failure data. Performance is evaluated using complementary metrics, including RMSE, MAE, error-coverage probabilities, and bias–variance decomposition, which jointly assess predictive accuracy, robustness, and estimator stability. Across both simulation studies and real-data experiments, the proposed approach consistently improves prediction reliability relative to the unregularized WLS-EM baseline while preserving analytical tractability and model transparency.
5. Results
This section presents the empirical evaluation of the proposed methodology based on both Monte Carlo simulation studies and the NASA C-MAPSS FD001 benchmark dataset. The simulation experiments focus on parameter estimation accuracy, cross-validation behavior, EM convergence properties, and robustness under varying noise and fleet-size conditions, with detailed results reported in Section 3.3. Complementarily, the real-data analysis investigates posterior-state evolution, unit-specific remaining useful life (RUL) trajectories, and aggregate prediction performance, as discussed in Section 4.3.
Figure 11 compares the RUL prediction performance of the proposed ridge-EM approach against the WLS-EM baseline and a linear regression benchmark on the FD001 dataset, evaluated at the last available cycle of each test unit. The left panel shows predicted versus true RUL values, together with the identity line and tolerance bands, while the right panel reports sorted absolute prediction errors, facilitating a direct comparison of error distributions across methods.
The results indicate that ridge-EM achieves a tighter concentration around the identity line and substantially reduced dispersion relative to the WLS-EM estimator. In contrast, the linear regression benchmark exhibits pronounced bias and large prediction variance, particularly for medium and long RUL horizons. The sorted absolute error curves further confirm that ridge-EM consistently dominates the competing methods across the full range of test units, yielding higher proportions of predictions within practical tolerance thresholds.
Overall, these findings demonstrate that the proposed ridge-EM estimator provides (i) improved numerical stability during EM iterations, (ii) reduced variance in emission-parameter estimates, and (iii) more reliable short- and medium-horizon RUL predictions compared with WLS-EM. Importantly, these advantages persist across different fleet sizes, noise levels, and real-world sensor trajectories, supporting the robustness and practical relevance of the proposed approach.
6. Discussion
The real-data analysis and the Monte Carlo experiments together provide a coherent picture of how ridge-regularized and Huber-robust updates improve the statistical behavior of HMM-based prognostic models.
First, ridge penalization stabilizes the estimation of state-dependent slopes and variances, reducing the sensitivity of the EM updates to early- or late-stage data scarcity. This manifests most clearly in the posterior-state trajectories, which become smoother and less noisy without altering the underlying degradation segmentation. Because the weighted RUL estimator is a linear functional of these posteriors, this variance reduction directly translates into improved RUL stability.
Second, both the simulation and the FD001 analysis show that the regularized estimator attains similar or slightly improved mean accuracy while substantially reducing severe underestimation errors. From a bias–variance perspective, the Monte Carlo experiments confirm that ridge-EM reduces variance markedly while introducing only negligible bias, leading to consistently lower expected prediction error (EPE) across all training-fleet sizes. This property is particularly important in safety-critical prognostics, where controlling the variability of short-horizon predictions is often more crucial than marginal reductions in mean error. In particular, while conservative RUL underestimation may be preferred to avoid overly optimistic decisions, high-variance estimators can produce erratic and unpredictable predictions, including extreme errors driven by local anomalies. By suppressing such variability and extreme outliers, the proposed approach yields more reliable RUL estimates while remaining compatible with conservative maintenance decision rules.
Third, the proposed algorithm enhances numerical robustness while preserving the standard computational structure of the classical HMM-EM framework. The ridge-regularized M-step modifies the closed-form emission updates in a principled manner, improving the conditioning of the estimation problem without introducing additional iterative layers or auxiliary optimization routines. Importantly, ridge penalization alone alleviates ill-conditioning due to limited effective sample sizes but remains sensitive to outliers, while robust variance estimation mitigates outlier influence without resolving instability in slope estimation under state-dependent data scarcity. By integrating state-dependent slope penalization with robust variance estimation inside the EM loop, the proposed framework jointly stabilizes parameter estimation and latent-state inference. As a result, the method improves estimator stability and reliability while retaining the practical simplicity and interpretability of conventional HMM-based approaches.
Building on these considerations, a related modeling choice concerns the use of linear state-dependent emission models. In this work, linear emissions were deliberately adopted to preserve parameter identifiability, numerical stability, and analytical tractability within the EM framework, particularly under limited run-to-failure data. While real degradation processes may exhibit nonlinear trends, such behavior can often be approximated through transitions among multiple latent health states, yielding an effective piecewise-linear representation. Alternatively, the proposed framework can be extended to incorporate nonlinear emission functions, such as polynomial or basis-expansion models, within each state, as well as alternative health indicator constructions, provided that estimation stability and interpretability are preserved. Investigating these extensions and comparative evaluations across different feature representations and their impact on estimation stability and robustness constitutes an important direction for future research.
Similarly, although ridge regularization improves estimator stability, more flexible penalty structures, such as adaptive or hierarchical formulations, may offer additional gains in highly heterogeneous fleets. The FD001 dataset serves as a benchmark in this study, and extending the evaluation to other real-world systems would help to further assess the generalization of the proposed framework.
In summary, the proposed ridge-regularized and Huber-robust HMM provides a principled modeling framework for improving parameter stability, latent-state smoothness, and short-horizon RUL reliability while remaining fully compatible with standard HMM-based prognostic pipelines. By jointly addressing slope ill-conditioning and variance inflation within the EM procedure, the proposed approach offers a robust and practically applicable enhancement to classical HMM-based degradation modeling.
7. Conclusions
This work presented a statistically principled extension of HMM-based hard-failure prognostics by introducing ridge-regularized emission estimation and a Huber-type robust variance update within the EM framework. Rather than modifying the latent-state structure or inference mechanism, the proposed approach directly targets a key but often overlooked source of instability in HMM-based RUL prediction: the high variance and poor numerical conditioning of posterior-weighted state-wise regression updates under limited run-to-failure data.
Through comprehensive Monte Carlo experiments and a real-world evaluation on the NASA C–MAPSS FD001 dataset, the ridge-regularized EM formulation was shown to substantially reduce estimator variance while introducing negligible bias. These improvements translate into smoother latent-state posterior trajectories and markedly more stable short-horizon RUL predictions. In contrast, the conventional WLS-EM approach remains sensitive to data scarcity and local measurement irregularities, which can lead to erratic parameter updates and unreliable remaining-life estimates.
By preserving the probabilistic structure, analytical RUL formulation, and model transparency of classical HMM-based joint models, the proposed method offers a minimal yet effective alternative to more complex or black-box prognostic approaches, particularly in small fleets and safety-critical settings. From a practical perspective, the improved stability of short-horizon RUL predictions directly supports more reliable maintenance scheduling and reduces the risk of overly conservative or delayed interventions under limited data availability. Future research may incorporate nonlinear emission structures, covariate-dependent transition dynamics, or online (recursive) extensions to further enhance applicability in real-time PHM settings and heterogeneous fleets.
Author Contributions
Conceptualization, H.B.K. and G.K.; methodology, H.B.K.; software, H.B.K.; validation, H.B.K. and M.H.; formal analysis, H.B.K.; investigation, H.B.K.; resources, G.K. and M.H.; data curation, H.B.K.; writing—original draft preparation, H.B.K.; writing—review and editing, H.B.K., G.K. and M.H.; visualization, H.B.K.; supervision, G.K. and M.H.; project administration, G.K.; funding acquisition, G.K. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Institutional Review Board Statement
Not applicable.
Data Availability Statement
The NASA C–MAPSS FD001 turbofan engine degradation dataset used in this study is publicly available through the Prognostics Center of Excellence (PCoE) at NASA Ames Research Center:
https://www.nasa.gov/intelligent-systems-division/discovery-and-systems-health/pcoe/pcoe-data-set-repository/ (accessed on 13 February 2026). All simulation codes, parameter settings, and scripts used for generating the results of the Monte Carlo experiments are available from the corresponding author upon reasonable request. No proprietary or confidential data were used in this study.
Acknowledgments
The authors gratefully acknowledge the guidance and feedback provided by Gokhan Kırkil and Mustafa Hekimoğlu throughout the study. During the preparation of this manuscript, the authors used OpenAI’s ChatGPT (GPT-5.1) to assist with English-language refinement and formatting consistency. All scientific content, mathematical derivations, modeling decisions, simulation design, data analysis, and interpretation of results were entirely performed and verified by the authors. The authors have reviewed and edited all AI-assisted text and take full responsibility for the final content of this publication.
Conflicts of Interest
The authors declare no conflicts of interest.
Abbreviations
The following abbreviations are used in this manuscript:
| HMM | Hidden Markov model |
| EM | Expectation–maximization |
| WLS | Weighted least squares |
| RUL | Remaining useful life |
| TPM | Transition probability matrix |
| CM | Condition monitoring |
| HI | Health indicator |
| MAD | Median absolute deviation |
| BIC | Bayesian information criterion |
| CV | Cross-validation |
| RMSE | Root mean squared error |
| MAE | Mean absolute error |
| EPE | Expected prediction error |
| NASA C–MAPSS | NASA Commercial Modular Aero-Propulsion System Simulation dataset |
Appendix A. Closed-Form WLS Emission Update
For reference, we summarize the closed-form weighted least squares (WLS) estimators used as the unregularized baseline in the EM algorithm.
For each transient state $k$, let $\gamma_{it}(k)$ denote the smoothed posterior weight for unit $i$ at time $t$, and define the weighted sample means
$$\bar{t}_k = \frac{\sum_{i,t}\gamma_{it}(k)\,t}{\sum_{i,t}\gamma_{it}(k)}, \qquad \bar{y}_k = \frac{\sum_{i,t}\gamma_{it}(k)\,y_{it}}{\sum_{i,t}\gamma_{it}(k)}.$$
The WLS estimators of the slope and intercept are given by
$$\hat{b}_k = \frac{\sum_{i,t}\gamma_{it}(k)\,(t-\bar{t}_k)(y_{it}-\bar{y}_k)}{\sum_{i,t}\gamma_{it}(k)\,(t-\bar{t}_k)^2}, \qquad \hat{a}_k = \bar{y}_k - \hat{b}_k\,\bar{t}_k.$$
Given the fitted values $\hat{y}_{it,k} = \hat{a}_k + \hat{b}_k\,t$, the state-specific noise variance is estimated as
$$\hat{\sigma}_k^2 = \frac{\sum_{i,t}\gamma_{it}(k)\,(y_{it}-\hat{y}_{it,k})^2}{\sum_{i,t}\gamma_{it}(k)}.$$
These expressions correspond to the standard WLS solution obtained from the state-wise maximization of the expected complete-data log-likelihood under Gaussian emissions.
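These closed-form updates translate directly into code. The sketch below assumes the posterior weights for one state are given (names are hypothetical; the noiseless toy data exist only to show that the exact trend is recovered):

```python
import numpy as np

def wls_emission_update(t, y, gamma_k):
    """Closed-form posterior-weighted LS update for one transient state.

    gamma_k are smoothed posterior weights for state k; returns
    (intercept, slope, noise variance) of the linear emission model.
    """
    w = gamma_k / gamma_k.sum()
    t_bar, y_bar = w @ t, w @ y                  # weighted sample means
    b = (w @ ((t - t_bar) * (y - y_bar))) / (w @ (t - t_bar) ** 2)
    a = y_bar - b * t_bar
    resid = y - (a + b * t)
    sigma2 = w @ resid**2                        # weighted residual variance
    return a, b, sigma2

t = np.arange(6.0)
y = 1.0 + 2.0 * t                                # noiseless linear trend
a, b, s2 = wls_emission_update(t, y, gamma_k=np.ones(6))
```

In the full EM loop this update runs once per transient state per M-step, with the weights refreshed by the forward–backward pass of the preceding E-step.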
Appendix B. Bias–Variance Decomposition for the RUL Estimator
This appendix provides the theoretical background underlying the bias–variance analysis of the remaining useful life (RUL) estimator, as illustrated empirically in Section 3.3. The goal is to clarify how ridge regularization reduces the variance of RUL predictions while introducing only minimal bias.
Let $R_{it}$ denote the true (latent) remaining useful life of unit $i$ at time $t$, and let $\hat{R}_{it}$ denote its estimator defined in (34). The mean squared prediction error admits the decomposition
$$\mathbb{E}\big[(\hat{R}_{it}-R_{it})^2\big] = \big(\mathbb{E}[\hat{R}_{it}]-R_{it}\big)^2 + \mathrm{Var}\big(\hat{R}_{it}\big).$$
In the proposed HMM framework, the RUL estimator is given by the weighted hitting-time expression introduced in Section 2.4,
$$\hat{R}_{it} = \sum_{k<K} \gamma_{it}(k)\,\nu_k,$$
where $\gamma_{it}$ is the smoothed posterior-state vector and $\nu = (I-Q)^{-1}\mathbf{1}$ is obtained from the fundamental matrix of the transient-state transition submatrix $Q$. The vector $\nu$ contains the expected numbers of future steps until absorption starting from each latent degradation state.
Because the estimator in (A6) depends linearly on both the posterior probabilities and the emission parameters, variance in either component propagates to the RUL prediction. Ridge regularization reduces the dispersion of the emission-parameter estimates, which in turn stabilizes the posteriors and yields lower overall prediction variance. This theoretical mechanism is consistent with the empirical results reported in Section 3.3.
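The exactness of this decomposition, and the way estimator dispersion drives total error, can be checked numerically. The sketch below uses a synthetic estimator with an assumed bias of 3 cycles and standard deviation of 5 cycles; all values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
true_rul = 40.0
# hypothetical Monte Carlo replications of an RUL estimator
estimates = true_rul + 3.0 + rng.normal(0.0, 5.0, size=100_000)

mse = np.mean((estimates - true_rul) ** 2)
bias_sq = (np.mean(estimates) - true_rul) ** 2
var = np.var(estimates)
# MSE decomposes exactly into squared bias plus variance (for the sample
# moments with ddof=0, the identity holds up to floating-point rounding)
```

With these settings variance (25) dominates squared bias (9), mirroring the regime observed for WLS-EM, where shrinking the variance term, even at the cost of a small extra bias, lowers the total error.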
Appendix C. Diagnostic Analysis of State-Dependent Regularization
This appendix provides additional diagnostic results to aid the interpretation of the state-dependent regularization strengths selected in the real-world case study. As described in the main text, the regularization parameters are selected solely based on validation performance using the weighted 1 − SE cross-validation rule. The analyses presented here are post hoc diagnostics and do not influence model training or hyperparameter selection. Their purpose is to provide intuition for why certain states benefit from stronger regularization than others.
Figure A1 illustrates how the state-dependent regularization strengths selected by cross-validation vary across degradation states in the real-world case study. Panel (a) shows that earlier transient states are assigned stronger regularization, while later states require progressively weaker penalties. Panel (b) relates the selected penalty values to the stability of the corresponding slope estimates across cross-validation splits, indicating that states with less stable slope estimates tend to be associated with larger regularization strengths. This suggests that stronger regularization helps to stabilize emission parameter estimation in states that are more difficult to estimate reliably.
Figure A1.
Diagnostic explanation of state-dependent regularization strength in the real-world case study. (a) State-specific regularization strengths selected via the weighted 1 − SE cross-validation rule. (b) Relationship between the selected values and the cross-validation variability of the corresponding slope estimates. The dashed line is shown as a visual guide.
These diagnostic results complement the main analysis by providing additional insight into the behavior of state-dependent regularization in the real-world case study.
References
1. Zhang, Z.; Si, X.; Hu, C.; Lei, Y. Degradation data analysis and remaining useful life estimation: A review on Wiener-process-based methods. Eur. J. Oper. Res. 2018, 271, 775–796.
2. Zhou, Q.; Son, J.; Zhou, S.; Mao, X.; Salman, M. Remaining useful life prediction of individual units subject to hard failure. IIE Trans. 2014, 46, 1017–1030.
3. Zhao, R.; Yan, R.; Wang, J.; Mao, K. Learning to Monitor Machine Health with Convolutional Bi-Directional LSTM Networks. Sensors 2017, 17, 273.
4. Florescu, I.; Tudor, C. Probability and Stochastic Processes; Wiley: Hoboken, NJ, USA, 2014.
5. Cappé, O.; Moulines, E.; Rydén, T. Inference in Hidden Markov Models; Springer: New York, NY, USA, 2005.
6. Rabiner, L.R. A tutorial on hidden Markov models and selected applications in speech recognition. Proc. IEEE 1989, 77, 257–286.
7. Jurafsky, D.; Martin, J.H. Speech and Language Processing, 3rd ed.; Online Draft; 2024. Available online: https://web.stanford.edu/~jurafsky/slp3/ (accessed on 13 February 2026).
8. Deep, A.; Zhou, S.; Veeramani, D.; Chen, Y. HMM-Based Joint Modeling of Condition Monitoring Signals and Failure Event Data for Prognosis. IEEE Trans. Reliab. 2022, 71, 598–610.
9. Hu, C.; Youn, B.D.; Wang, P.; Yoon, J.T. Ensemble of data-driven prognostic algorithms for robust prediction of remaining useful life. Reliab. Eng. Syst. Saf. 2012, 103, 120–135.
10. Orchard, M.E.; Vachtsevanos, G.J. A Particle Filtering Approach for On-Line Fault Diagnosis and Failure Prognosis. IEEE Trans. Instrum. Meas. 2009, 58, 370–379.
11. McLachlan, G.; Krishnan, T. The EM Algorithm and Its Extensions, 2nd ed.; Wiley: New York, NY, USA, 2008.
12. Obakrim, A.; El Qachchachi, N.; Hammouch, A. Regularized expectation–maximization algorithms for latent-variable models. Signal Process. 2024, 210, 109020.
13. Huber, P.J. Robust estimation of a location parameter. Ann. Math. Stat. 1964, 35, 73–101.
14. Huber, P.J.; Ronchetti, E.M. Robust Statistics, 2nd ed.; Wiley: Hoboken, NJ, USA, 2009.
15. Zhang, Y.; Li, X.; Chen, Z.; Wang, H. Robust learning under heavy-tailed noise with bounded-influence loss functions. arXiv 2024, arXiv:2405.13453.
16. Ribeiro, M.; Bessa, I.; Silva, A.; Guedes Soares, C. Physics-Informed Machine Learning for Prognostics and Health Management: A Review. Reliab. Eng. Syst. Saf. 2020, 193, 106587.
17. Liu, J.; Sun, Q.; Zhao, Y. Robust estimation for sequential models under contaminated observations. arXiv 2023, arXiv:2312.13257.
18. Hastie, T.; Tibshirani, R.; Friedman, J. The Elements of Statistical Learning, 2nd ed.; Springer: New York, NY, USA, 2009.
19. Hoerl, A.E.; Kennard, R.W. Ridge regression: Biased estimation for nonorthogonal problems. Technometrics 1970, 12, 55–67.
20. Friedman, J.; Hastie, T.; Tibshirani, R. Regularization paths for generalized linear models via coordinate descent. J. Stat. Softw. 2010, 33, 1–22.
21. Lu, C.J.; Meeker, W.Q. Using degradation measures to estimate a time-to-failure distribution. Technometrics 1993, 35, 161–174.
22. Dempster, A.P.; Laird, N.M.; Rubin, D.B. Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. B 1977, 39, 1–38.
23. Vouma Lekoundji, J.-B. Modèles de Markov Cachés. Master's Thesis, Université du Québec à Montréal, Montréal, QC, Canada, 2014.
24. Hampel, F.R.; Ronchetti, E.M.; Rousseeuw, P.J.; Stahel, W.A. Robust Statistics: The Approach Based on Influence Functions; Wiley: New York, NY, USA, 1986.
25. Kemeny, J.G.; Snell, J.L. Finite Markov Chains; Springer: New York, NY, USA, 1976.
26. Wu, C.J. On the convergence properties of the EM algorithm. Ann. Stat. 1983, 11, 95–103.
27. Moghaddass, R.; Zuo, M.J. An integrated framework for online diagnostic and prognostic health monitoring using a multistate deterioration process. Reliab. Eng. Syst. Saf. 2014, 124, 92–104.
28. Wang, F.; Tan, S.; Yang, Y.; Shi, H. Hidden Markov model-based fault detection approach for a multimode process. Ind. Eng. Chem. Res. 2016, 55, 4613–4621.
29. Zhang, D.; Bailey, A.D.; Djurdjanovic, D. Bayesian identification of hidden Markov models and their use for condition-based monitoring. IEEE Trans. Reliab. 2016, 65, 1471–1482.
30. MathWorks. RUL Estimation Using RUL Estimator Models. Predictive Maintenance Toolbox Documentation. Available online: https://www.mathworks.com/help/predmaint/ug/rul-estimation-using-rul-estimator-models.html (accessed on 1 January 2026).
Figure 1.
Cross-validation loss curve as a function of the regularization strength for a representative latent state. The blue curve shows the mean K-fold validation loss, the star marks its minimizer, and the horizontal dashed line denotes the one-standard-error (1 − SE) threshold. The selected regularization strength, marked by the vertical line, is the largest value whose validation loss remains within the 1 − SE criterion.
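The 1 − SE selection rule summarized in the caption of Figure 1 can be sketched in a few lines. The following is an illustrative implementation for a generic ridge regression, not the paper's actual code; the function name and data shapes are assumptions.

```python
import numpy as np

def select_lambda_1se(X, y, lambdas, k_folds=5, seed=0):
    """K-fold CV for ridge regression with the one-standard-error rule:
    pick the largest lambda whose mean validation loss stays within one
    standard error of the minimum. Illustrative sketch only."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    folds = np.array_split(idx, k_folds)
    losses = np.zeros((len(lambdas), k_folds))
    for j, lam in enumerate(lambdas):
        for f, val in enumerate(folds):
            tr = np.setdiff1d(idx, val)
            Xtr, ytr = X[tr], y[tr]
            # closed-form ridge solution: (X'X + lam*I)^{-1} X'y
            p = X.shape[1]
            beta = np.linalg.solve(Xtr.T @ Xtr + lam * np.eye(p), Xtr.T @ ytr)
            resid = y[val] - X[val] @ beta
            losses[j, f] = np.mean(resid**2)
    mean = losses.mean(axis=1)
    se = losses.std(axis=1, ddof=1) / np.sqrt(k_folds)
    j_min = int(np.argmin(mean))
    threshold = mean[j_min] + se[j_min]
    # 1-SE rule: largest lambda whose mean CV loss stays under the threshold
    eligible = [j for j in range(len(lambdas)) if mean[j] <= threshold]
    j_1se = max(eligible, key=lambda j: lambdas[j])
    return lambdas[j_min], lambdas[j_1se]
```

Because the minimizer itself always satisfies the threshold, the 1 − SE choice is never smaller than the minimizing regularization strength, which is exactly the bias toward stronger shrinkage shown in the figure.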
Figure 2.
Monte Carlo bias–variance decomposition of emission parameter estimates under WLS and ridge. The mean squared bias, variance, and expected prediction error (EPE) as functions of the number of training units N are shown.
Figure 3.
Parameter-space MSE of the emission parameters under WLS-EM and ridge-EM as a function of the number of training units N. WLS-EM serves as the unregularized benchmark, while ridge-EM yields consistent reductions in MSE, particularly for small training fleets.
Figure 4.
Comparison of the quadratic WLS loss and the Huber loss as a function of the absolute residual magnitude. The Huber loss coincides with the quadratic loss for small residuals but grows linearly beyond a fixed threshold, thereby capping the influence of large deviations in the variance update and improving robustness.
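The two losses compared in Figure 4 are straightforward to write down. The sketch below, with an assumed threshold parameter `delta`, shows the Huber loss and its bounded influence function (derivative); it is an illustration of the standard definitions, not the paper's implementation.

```python
import numpy as np

def huber_loss(r, delta=1.0):
    """Huber loss: quadratic for |r| <= delta, linear beyond.
    `delta` plays the role of the robustness threshold."""
    r = np.asarray(r, dtype=float)
    quad = 0.5 * r**2
    lin = delta * (np.abs(r) - 0.5 * delta)
    return np.where(np.abs(r) <= delta, quad, lin)

def huber_psi(r, delta=1.0):
    """Influence function (derivative of the loss): bounded by +/- delta,
    which is what caps the effect of large residuals in the variance update."""
    return np.clip(np.asarray(r, dtype=float), -delta, delta)
```

The bounded influence function is the key property: under the quadratic loss the derivative grows without limit, so a single outlying residual can dominate the update, whereas under the Huber loss its contribution saturates at `delta`.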
Figure 5.
Mean RUL trajectories over 100 simulations. The dashed line denotes the true RUL; solid lines show the mean predictions for WLS-EM and ridge-EM, with shaded bands indicating empirical variability.
Figure 6.
Distribution of RUL prediction errors over all Monte Carlo runs. Ridge-EM reduces the spread of errors and suppresses extreme negative outliers compared with WLS-EM.
Figure 7.
Posterior-state probabilities for a representative FD001 unit. The monotone progression across hidden states is consistent with the simple-failure assumption. Ridge regularization produces stable parameter estimates without altering latent-state segmentation.
Figure 8.
Empirical CDF of absolute RUL prediction errors on the FD001 test set. Ridge-EM consistently attains higher coverage at all error levels compared to WLS-EM and the considered benchmark methods.
Figure 9.
RUL trajectories for a representative FD001 test unit. Ridge-EM stabilizes predictions near state transitions while preserving global accuracy.
Figure 10.
Distribution of RUL prediction errors for FD001. Ridge-EM reduces the frequency and severity of large negative prediction errors.
Figure 11.
Benchmark comparison on the NASA C-MAPSS FD001 dataset. (Left): Predicted versus true RUL at the last cycle for WLS-EM, ridge-EM, and linear regression, including the identity line and tolerance bands. (Right): Sorted absolute RUL prediction errors, highlighting the relative error distributions of the competing methods. Ridge-EM exhibits reduced dispersion and improved accuracy compared with WLS-EM and the linear baseline.
Table 1.
True transition probability matrix used in the simulation study.
| | State 1 | State 2 | State 3 | State 4 (Absorbing) |
|---|---|---|---|---|
| State 1 | 0.9 | 0.1 | 0 | 0 |
| State 2 | 0 | 0.9 | 0.1 | 0 |
| State 3 | 0 | 0 | 0.9 | 0.1 |
| State 4 | 0 | 0 | 0 | 1 |
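As a sanity check on Table 1, the fundamental matrix for absorbing Markov chains (Kemeny and Snell) gives the expected number of cycles before absorption from each transient state. A minimal sketch:

```python
import numpy as np

# Transition matrix from Table 1 (states 1-3 transient, state 4 absorbing).
P = np.array([[0.9, 0.1, 0.0, 0.0],
              [0.0, 0.9, 0.1, 0.0],
              [0.0, 0.0, 0.9, 0.1],
              [0.0, 0.0, 0.0, 1.0]])

Q = P[:3, :3]                          # transient-to-transient block
N = np.linalg.inv(np.eye(3) - Q)       # fundamental matrix N = (I - Q)^{-1}
expected_lifetime = N.sum(axis=1)      # expected cycles to absorption per start state
# → array([30., 20., 10.])
```

This matches the intuition that each transient state has a geometric sojourn time with exit probability 0.1, hence a mean of 10 cycles per state and an expected lifetime of 30 cycles from the healthy state.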
Table 2.
True emission parameters for each latent state.
| State k | | | |
|---|---|---|---|
| 1 | 1.0 | 0.1 | 0.5 |
| 2 | 2.0 | 0.2 | 0.5 |
| 3 | 3.0 | 0.3 | 0.5 |
| 4 | 4.0 | 0.4 | 0.5 |
Table 3.
Average RUL RMSE over 100 Monte Carlo runs.
| N | RMSE (WLS) | RMSE (Ridge) | Improvement (Ridge–WLS) |
|---|---|---|---|
| 5 | 11.18 | 9.58 | |
| 10 | 11.67 | 9.58 | |
| 20 | 12.81 | 9.27 | |
| 50 | 10.96 | 8.61 | |
| 100 | 11.25 | 9.14 | |
Table 4.
Bias–variance decomposition of RUL prediction error. The time-averaged squared bias, variance, and mean squared error (MSE) over all Monte Carlo runs are shown.
| Method | Bias² | Variance | MSE |
|---|---|---|---|
| WLS-EM | 28.90 | 63.80 | 143.30 |
| Ridge-EM | 20.60 | 44.45 | 96.40 |
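For reference, a time-averaged decomposition of this kind can be computed from a matrix of Monte Carlo predictions as follows. The function below is an illustrative sketch, not the paper's evaluation code; it uses the population variance, under which the pointwise identity MSE = bias² + variance holds exactly.

```python
import numpy as np

def bias_variance_decomposition(preds, truth):
    """Time-averaged bias-variance decomposition of prediction error.
    preds: (n_runs, T) array of Monte Carlo RUL predictions.
    truth: (T,) array of true RUL values.
    At each time t: MSE_t = bias_t**2 + var_t; time averages are returned."""
    bias2 = (preds.mean(axis=0) - truth) ** 2   # squared bias per time step
    var = preds.var(axis=0)                     # population variance per time step
    mse = ((preds - truth) ** 2).mean(axis=0)   # MSE per time step
    return bias2.mean(), var.mean(), mse.mean()
```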
Table 5.
Model order selection for the FD001 dataset using BIC.
| Number of States (K) | Log-Likelihood | BIC |
|---|---|---|
| 4 | | |
| 5 | | |
| 6 | | |
Table 6.
RUL prediction performance on the FD001 test set. The root mean squared error (RMSE), mean absolute error (MAE), and empirical coverage rates are reported, defined as the fraction of test units whose absolute prediction error does not exceed the specified tolerance.
| Method | RMSE | MAE | | | |
|---|---|---|---|---|---|
| WLS-EM | 50.54 | 40.33 | 10.0% | 22.0% | 48.0% |
| Ridge-EM | 37.51 | 29.37 | 12.0% | 23.0% | 52.0% |
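Metrics of the kind reported in Table 6 can be reproduced from predicted and true RULs with a short helper. The sketch below uses illustrative tolerance values, since the tolerances of Table 6 are not specified here.

```python
import numpy as np

def rul_metrics(y_true, y_pred, tolerances=(10, 20, 30)):
    """RMSE, MAE, and empirical coverage: the fraction of test units whose
    absolute prediction error does not exceed each tolerance (in cycles).
    Tolerance values here are placeholders, not those of Table 6."""
    err = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    rmse = float(np.sqrt(np.mean(err**2)))
    mae = float(np.mean(np.abs(err)))
    coverage = {t: float(np.mean(np.abs(err) <= t)) for t in tolerances}
    return rmse, mae, coverage
```

Note that coverage is a per-unit quantile-style summary, so it is less sensitive than RMSE to a few units with very large errors, which is why the two views are reported together.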