Anomaly Detection and Fault Diagnosis Based on Action States for Excavators

Soh, Jaehyun; Lee, Changmin; Kim, Wonkyung; Kang, Byungmun; Kim, DaeEun

doi:10.3390/app16052414

by Jaehyun Soh ¹, Changmin Lee ², Wonkyung Kim ³, Byungmun Kang ¹ and DaeEun Kim ¹,*

¹ Department of Electrical and Electronic Engineering, Yonsei University, Seoul 03722, Republic of Korea

² Research Institute of Future City and Society, Yonsei University, Seoul 03722, Republic of Korea

³ Field Solutions Technology Team, HD Hyundai Xitesolution, Seongnam-si 13553, Republic of Korea

* Author to whom correspondence should be addressed.

Appl. Sci. 2026, 16(5), 2414; https://doi.org/10.3390/app16052414

Submission received: 26 January 2026 / Revised: 19 February 2026 / Accepted: 28 February 2026 / Published: 2 March 2026

(This article belongs to the Special Issue Mechanical Fault Diagnosis and Signal Processing)

Abstract

Anomaly detection has been a challenging subject in many industrial fields. In industrial machinery such as hydraulic excavators, sensor data distributions are inherently multimodal because different operating conditions produce distinct sensor signatures, and conventional algorithms struggle to establish clear normal–abnormal boundaries when these conditions are mixed. We propose an action-state decomposition framework that partitions multimodal sensor data into homogeneous subsets based on discretized control inputs, thereby reducing the ambiguity of normal–abnormal boundaries by learning state-conditional distributions. The approach comprises a reactive method that evaluates each sample within its action state, and a history-based method that incorporates temporal context from previous action states. This decomposition is algorithm-agnostic and can improve detection performance across diverse anomaly detection algorithms. The framework is further extended to Bayesian fault diagnosis that identifies the root cause of failures using action-state-conditional detection probabilities. Experiments on simulated excavator data and two real-world benchmark datasets (UCI Hydraulic Systems and SKAB) demonstrate the generalizability of the proposed mode decomposition and provide insights into factors that may influence its effectiveness. The history-based method achieves a mean AUC of 0.89 across sensor fault types, outperforming all baseline algorithms, and the Bayesian fault diagnosis achieves 86.7% accuracy in identifying the root cause among six action fault types. For the proposed GMM-based methods, the decomposition also substantially reduces per-sample inference time by approximately 10× (from 8.68 μs to 0.75 μs), enabling real-time deployment in industrial settings.

Keywords: anomaly detection; action state; construction machinery; excavator; fault diagnosis; Gaussian mixture model (GMM)

1. Introduction

Modern industrial systems rely heavily on automated machinery, making early fault detection and diagnosis essential for operational safety and cost efficiency. Developing techniques for analyzing and predicting failures has become a priority across industries, because timely identification can avoid or minimize the economic and human harm that system failures cause.

Anomaly detection is applied in many fields, including industrial machine monitoring, medicine and health care, fraud detection in banking and the stock market, intrusion detection, and defect detection in image patterns [1]. Researchers have approached it with a range of techniques, such as classification-based, nearest-neighbor-based, clustering-based, probability-based, and information-theoretic anomaly detection [2]. In the domain of sequential data, deep learning methods have been extensively applied to time series anomaly detection, with forecasting-based, reconstruction-based, and representation-based approaches being the major paradigms [3].

Because fault samples are scarce in many industrial settings, anomaly detection is often formulated under normal-only supervision. One-class classification, which learns a model of normality using only normal data, is the most prevalent strategy in this context.

Various algorithms have been studied for anomaly detection and fault diagnosis. The main purpose of these algorithms is to find a more accurate boundary between normal and abnormal. Isolation forest distinguishes between normal and abnormal behavior by using tree depth [4], and its recent extension, deep isolation forest (DIF), incorporates neural representations to improve performance on high-dimensional and complex data distributions [5]. One-class support vector machine (OCSVM) applies an SVM to find outliers [6], and local outlier factor (LOF) considers the relative density of data for outlier detection [7]. Additional techniques include angle-based outlier detection (ABOD), copula-based outlier detection (COPOD), rotation-based outlier detection (ROD), and empirical-cumulative-distribution-based outlier detection (ECOD) [8,9,10,11]. More recently, anomaly detection research based on the Gaussian mixture model (GMM) has been actively conducted [12].

Deep learning has also been applied to anomaly detection. Variational autoencoder (VAE) and deep support vector data description (Deep SVDD) are well-known deep learning-based methods commonly used for anomaly detection [13,14]. Since general deep learning methods are developed for purposes other than anomaly detection, they often suffer from suboptimal performance when applied to anomaly detection tasks [15]. In this work, we focus on multivariate sensor time-series from excavator hydraulics, where operating-condition variability is a primary challenge.

In the domain of construction and heavy machinery, recent work has addressed fault detection in excavator gearboxes under non-stationary operating conditions [16], LSTM-based integrated sensor–actuator fault diagnosis for active steering systems [17], and integrated actuator–sensor fault detection in active suspension systems using unknown input observers [18]. Comprehensive reviews have examined compound fault diagnosis methods for rolling bearings [19], Transformer-based fault diagnosis for mechanical equipment [20], and the broader challenges of fault detection in Industry 4.0 environments [21]. Performance degradation assessment has been studied using stochastic process models [22,23]. Condition monitoring of hydraulic systems using multivariate statistics has also been explored [24]. Table 1 summarizes the key methods and their characteristics.

Despite the rapid progress in anomaly detection algorithms, a fundamental challenge remains in industrial machinery monitoring. Sensor data distributions become highly multimodal when observations from multiple operating conditions are mixed together. For example, a hydraulic excavator performing boom-up, arm-in, and bucket-dig operations produces vastly different sensor signatures, yet most existing algorithms treat all operating conditions uniformly. This mixture of operating modes obscures the normal–abnormal boundary and degrades detection performance. While some studies have addressed operating condition variability through regime-based modeling or transfer learning, these approaches typically define operating modes in a domain- or algorithm-specific manner and evaluate performance with only one or two detection methods. In contrast, a systematic, algorithm-agnostic framework for decomposing sensor data according to discrete control-input-defined operating modes and quantifying its impact across diverse anomaly detection algorithms has not been established.

In this paper, we model an actual excavator (the HX300AL model from Hyundai Construction Equipment) using real data from its sensors. Because anomalous data are difficult to collect in the field, we build a model of the excavator in normal condition from extensive normal data and then generate various types of anomalous data by modifying the model parameters. We conduct experiments on the proposed algorithm using the simulated data and analyze its characteristics.

To address the multimodal distribution challenge, we introduce action states, discrete operating modes derived from control inputs (pilot pressures), and train state-conditional models. This factorization transforms a complex multimodal distribution into simpler, more homogeneous subsets, yielding a clearer normal boundary within each mode. The proposed action-state-based anomaly detection framework decomposes the entire data into discretized states using action parameters and then applies an anomaly detection algorithm within each action state.

This simple but effective decomposition method can be adapted to any anomaly detection algorithm. Moreover, since each action state model is trained on a smaller, more homogeneous subset of data, the computational complexity of the proposed GMM-based methods is significantly reduced, enabling real-time anomaly detection suitable for industrial applications. We propose two representative anomaly detection methods using the action state with a Gaussian mixture model (GMM). The first method is a reactive action state method. This method detects anomalies using only the current input data. The second method is the history-based method. Unlike the reactive method, anomalies are detected using the action state history as well as the current data. We compared the performance of various anomaly detection algorithms and quantified the effect of action state decomposition. In addition, it is possible to diagnose faults beyond anomaly detection using a Gaussian distribution model and a statistical approach based on anomaly detection probability in the action state.

The main contributions of this paper are summarized as follows:

(1): Action state decomposition framework. We propose a general-purpose preprocessing framework that decomposes multimodal sensor data into homogeneous subsets based on discretized control inputs. This framework is algorithm-agnostic and can be combined with any anomaly detection method. It generally improves detection performance on the excavator data and UCI Hydraulic, though the benefit is more limited on SKAB, where fault-induced mode imbalance may reduce its effectiveness.
(2): History-based anomaly detection. We extend the reactive (single-sample) detection method with a history-based approach that leverages temporal context from previous action states. This method achieves a mean AUC of 0.89, outperforming all baseline algorithms.
(3): Simulation-based fault generation and Bayesian fault diagnosis. We develop a physics-based simulation model calibrated from actual excavator sensor data to generate diverse fault scenarios. Using the action-state-conditional detection probabilities, we formulate a Bayesian fault diagnosis method that identifies the root cause of failures.
(4): Computational efficiency for real-time deployment. For the proposed GMM-based reactive and history-based methods, action state decomposition reduces per-sample inference time by approximately 10× (from 8.68 μs to 0.75 μs), enabling real-time anomaly detection in industrial settings.

A natural question is whether the control inputs could simply be included as additional continuous features rather than used to define discrete operating modes. We argue that discretized action states are architecturally essential for two reasons. First, the relationship between pilot pressure inputs and sensor outputs is fundamentally mode-dependent. When a particular action is active, its corresponding pilot pressure drives the pump dynamics, whereas an inactive action contributes no signal. This creates discontinuous regime boundaries that continuous features cannot capture. Second, training separate per-state models ensures that the normal boundary for each operating regime is learned from homogeneous data, avoiding the problem where a single model must simultaneously accommodate the distinct distributions of 27 operating modes. In contrast, simply appending pilot pressures as features would require algorithms to jointly learn mode identification and anomaly scoring within a single model. The experimental results support this argument: even strong traditional and deep-learning-based algorithms that are inherently robust to multimodal distributions still benefit from explicit mode decomposition.

2. Methods

2.1. Excavator Modeling

For modeling, we used data from Hyundai Construction Equipment’s HX300AL excavator. Figure 1 shows the HX300AL and the actions of its boom, arm, and bucket. Although the data from the actual excavator cannot be disclosed for reasons of industrial confidentiality, the model was built from the actual data and the structural characteristics of the main pumps. The model equations were obtained by fitting the measured hydraulic data and analyzing the excavator’s real hydraulic system.

The excavator’s boom, arm, and bucket are controlled by the lever in the driver’s seat. The pilot pressures associated with the boom, arm, and bucket are referred to as action parameters and change continuously as the lever is operated. The action parameters are shown in Figure 2. Four principal factors can affect the pump parameters in Figure 2. The first is the pilot pressure, which is the main input to our model, and the second is the relief valve, which is also included in the model. The third is the excavator’s load (for example, the load that occurs when lifting a heavy object), and the fourth is the controller pump setting. We kept the models of the main pump and the main pump regulator as simple as possible. The main pump was modeled by collecting data for all operations and analyzing single-motion operations. The data were collected while the excavator performed various operations with the controller pump setting fixed and no load applied, so the model does not include load. Including load is very difficult because the pressure varies greatly with it. This no-load assumption is a deliberate simplification. Under loaded conditions, the main pump pressures increase substantially and nonlinearly depending on the excavated material weight, bucket fill level, and arm geometry, which would require a significantly more complex model. The no-load model captures the fundamental hydraulic dynamics and action-state structure of the excavator, and serves as a necessary first step before extending to loaded conditions. We discuss this limitation and its implications in Section 4.

The dataset X is an L × 10 matrix that holds the complete collection of sensor observations, as illustrated in Figure 2. L is the number of samples, and the ten dimensions consist of six action parameters (BU, BD, AI, AO, BI, BO) and four pump parameters (MPa, MPb, MPRa, MPRb). The data features and abbreviations are summarized in Figure 2. The following equations were obtained by modeling the excavator’s main pump and main pump regulator:

$$\mathrm{MPa} = 250 \cdot \tanh(0.005 \cdot y_{1}) \tag{1}$$

$$\mathrm{MPb} = 250 \cdot \tanh(0.005 \cdot y_{2}) \tag{2}$$

$$\mathrm{MPRa} = 35 \cdot \tanh(0.03 \cdot (y_{3} - 40)) + 40 \tag{3}$$

$$\mathrm{MPRb} = 35 \cdot \tanh(0.03 \cdot (y_{4} - 40)) + 40 \tag{4}$$

The definitions of $y_{1}$, $y_{2}$, $y_{3}$, and $y_{4}$ are given below.

$$
\begin{aligned}
\tau_{1} \frac{d y_{1}}{d t} = {} & - y_{1} + 5.488 \cdot \mathrm{BU} - 1.823 \cdot \mathrm{BD} + 3.689 \cdot \mathrm{AI} \\
& + 5.169 \cdot \mathrm{AO} + 4.928 \cdot \mathrm{BI} - 1.789 \cdot \mathrm{BO} \\
& - 0.169 \cdot \mathrm{BU} \cdot \mathrm{BI} - 0.261 \cdot \mathrm{AI} \cdot \mathrm{BI} \\
& - 0.235 \cdot \mathrm{AO} \cdot \mathrm{BI} + 59.584
\end{aligned}
\tag{5}
$$

$$
\begin{aligned}
\tau_{2} \frac{d y_{2}}{d t} = {} & - y_{2} + 5.509 \cdot \mathrm{BU} - 0.762 \cdot \mathrm{BD} + 0.354 \cdot \mathrm{AI} \\
& + 1.432 \cdot \mathrm{AO} + 3.139 \cdot \mathrm{BI} + 1.652 \cdot \mathrm{BO} \\
& - 0.252 \cdot \mathrm{BU} \cdot \mathrm{BI} - 0.122 \cdot \mathrm{AI} \cdot \mathrm{BI} \\
& - 0.119 \cdot \mathrm{AO} \cdot \mathrm{BI} + 66.436
\end{aligned}
\tag{6}
$$

$$
\begin{aligned}
\tau_{3} \frac{d y_{3}}{d t} = {} & - y_{3} - 0.651 \cdot \mathrm{BU} + 0.182 \cdot \mathrm{BD} - 1.102 \cdot \mathrm{AI} \\
& - 0.923 \cdot \mathrm{AO} - 0.839 \cdot \mathrm{BI} + 0.166 \cdot \mathrm{BO} \\
& + 0.021 \cdot \mathrm{BU} \cdot \mathrm{BI} + 0.037 \cdot \mathrm{AI} \cdot \mathrm{BI} \\
& + 0.035 \cdot \mathrm{AO} \cdot \mathrm{BI} + 35.209
\end{aligned}
\tag{7}
$$

$$
\begin{aligned}
\tau_{4} \frac{d y_{4}}{d t} = {} & - y_{4} - 0.853 \cdot \mathrm{BU} - 0.015 \cdot \mathrm{BD} - 0.022 \cdot \mathrm{AI} \\
& - 0.013 \cdot \mathrm{AO} - 0.917 \cdot \mathrm{BI} - 0.824 \cdot \mathrm{BO} \\
& + 0.029 \cdot \mathrm{BU} \cdot \mathrm{BI} - 0.010 \cdot \mathrm{AI} \cdot \mathrm{BI} \\
& + 0.009 \cdot \mathrm{AO} \cdot \mathrm{BI} + 35.3240
\end{aligned}
\tag{8}
$$

The time constants $\tau_{1}$, $\tau_{2}$, $\tau_{3}$, and $\tau_{4}$ are set to small values. The pilot pressures (e.g., boom up and boom down) drive the main pump through the main control valve (MCV), so the pump parameters tend to be proportional or inversely proportional to the pilot pressures. The hyperbolic tangent in Equations (1)–(4) reflects the characteristics of the relief valve.
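To illustrate how Equations (1)–(8) produce pump signals from a pilot pressure sequence, the following is a minimal sketch that integrates the first-order dynamics with a forward Euler step. The coefficients are taken from Equations (5)–(8); the time constant `tau`, the step size `dt`, the function names, and the choice to start from the steady state are assumptions made for illustration, since the paper states only that the time constants are small.

```python
import math

# Coefficients of Equations (5)-(8), ordered as
# [BU, BD, AI, AO, BI, BO, BU*BI, AI*BI, AO*BI, constant].
COEF = {
    "y1": [5.488, -1.823, 3.689, 5.169, 4.928, -1.789, -0.169, -0.261, -0.235, 59.584],
    "y2": [5.509, -0.762, 0.354, 1.432, 3.139, 1.652, -0.252, -0.122, -0.119, 66.436],
    "y3": [-0.651, 0.182, -1.102, -0.923, -0.839, 0.166, 0.021, 0.037, 0.035, 35.209],
    "y4": [-0.853, -0.015, -0.022, -0.013, -0.917, -0.824, 0.029, -0.010, 0.009, 35.3240],
}
KEYS = ("y1", "y2", "y3", "y4")


def drive(c, pilot):
    """Right-hand side of Eqs. (5)-(8) without the -y term."""
    bu, bd, ai, ao, bi, bo = pilot
    terms = [bu, bd, ai, ao, bi, bo, bu * bi, ai * bi, ao * bi, 1.0]
    return sum(ci * ti for ci, ti in zip(c, terms))


def outputs(y):
    """Equations (1)-(4): (MPa, MPb, MPRa, MPRb) from (y1, y2, y3, y4)."""
    return (
        250 * math.tanh(0.005 * y[0]),
        250 * math.tanh(0.005 * y[1]),
        35 * math.tanh(0.03 * (y[2] - 40)) + 40,
        35 * math.tanh(0.03 * (y[3] - 40)) + 40,
    )


def simulate(pilot_seq, tau=0.05, dt=0.01, coef=COEF):
    """Forward-Euler integration; tau and dt are assumed values."""
    # Assumption: start from the steady state of the first pilot input.
    y = [drive(coef[k], pilot_seq[0]) for k in KEYS]
    out = []
    for pilot in pilot_seq:
        for i, k in enumerate(KEYS):
            y[i] += dt / tau * (drive(coef[k], pilot) - y[i])
        out.append(outputs(y))
    return out


# Idle, then a boom-up step with an illustrative pilot value of 10.
idle = (0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
boom_up = (10.0, 0.0, 0.0, 0.0, 0.0, 0.0)
trace = simulate([idle] * 10 + [boom_up] * 300)
```

With `dt / tau = 0.2` the Euler update is stable, and each state relaxes toward the value set by the current pilot pressures before the `tanh` saturation is applied.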

2.2. Action States Mapping Function and Data Generation

We defined the action state mapping function $f_{SM}$ that accepts action parameters as input and outputs the action state. The action state mapping function binarizes the action parameters, which have continuous values, and assigns an index to each action state according to the combination of action parameters as shown in Table 2. The preprocessing pipeline is as follows: (i) Raw pilot pressure signals from six channels are collected at the sensor sampling rate. (ii) Each channel is normalized to $[0, 1]$ using min-max normalization based on the operating range. (iii) The normalized values are binarized with a threshold of 0.1; that is, if the normalized pilot pressure exceeds 10% of the operating range, the corresponding action is considered active (1), and otherwise inactive (0). (iv) The resulting 6-bit binary vector is mapped to one of 27 valid action states using a lookup table (Table 2). Various actions can be distinguished using this combination. When the lever is moved in an actual excavator, the pilot pressure related to the lever instantaneously increases.

Since there are six action parameters, the binary combinations yield a total of $2^{6} = 64$ action states. However, some actions cannot be executed using the actual excavator lever. For example, boom up and boom down cannot be simultaneously performed by the excavator’s lever. Excluding such actions, only 27 action states out of 64 action states are possible, and the corresponding states are shown in Table 2.

During transitions between operational modes (e.g., switching from boom-up to arm-in), brief transient periods occur where the pilot pressures are changing between their steady-state values. In the proposed framework, these transient samples are handled naturally by the binarization threshold: a pilot pressure that is ramping up or down but has not yet exceeded 10% of the operating range is classified as inactive, and the sample is assigned to the current (pre-transition) action state. Because the pilot pressure response in hydraulic excavators is fast (the lever-to-valve response time is on the order of tens of milliseconds), the transient period is typically short relative to the duration of each action state. Any transient samples that are momentarily assigned to an intermediate or incorrect action state are evaluated against the corresponding per-state GMM model and may produce slightly elevated anomaly scores, but these isolated transient scores are effectively filtered by the history-based method (described in Section 2.3), which aggregates evidence over multiple consecutive action states.
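The four-step pipeline above can be sketched as follows. This is an illustration under stated assumptions: the excluded combinations are taken to be exactly those in which both directions of the same joint are active (which leaves 3 × 3 × 3 = 27 states), the state indices are assigned in enumeration order and do not reproduce the numbering of Table 2, and the operating range of 0 to 40 used in the example is hypothetical.

```python
from itertools import product

CHANNELS = ("BU", "BD", "AI", "AO", "BI", "BO")
THRESHOLD = 0.1  # 10% of the normalized operating range


def build_lookup():
    """Enumerate 6-bit vectors in which no opposing pair is active together."""
    table = {}
    for bits in product((0, 1), repeat=6):
        bu, bd, ai, ao, bi, bo = bits
        if (bu and bd) or (ai and ao) or (bi and bo):
            continue  # assumed to be the combinations the lever cannot produce
        table[bits] = len(table) + 1  # hypothetical index, not Table 2's numbering
    return table


LOOKUP = build_lookup()


def f_sm(raw, lo, hi):
    """Map six raw pilot pressures to an action-state index.

    raw, lo, hi: sequences of length 6 (signal, range minimum, range maximum).
    Returns None for a combination that is not in the lookup table.
    """
    bits = tuple(
        int((x - a) / (b - a) > THRESHOLD) for x, a, b in zip(raw, lo, hi)
    )
    return LOOKUP.get(bits)


# Hypothetical operating range of 0 to 40 for every channel.
lo, hi = [0.0] * 6, [40.0] * 6
idle_state = f_sm([3.0, 0.0, 0.0, 0.0, 0.0, 0.0], lo, hi)   # 3/40 < 0.1, so inactive
boom_up_state = f_sm([20.0, 0.0, 0.0, 0.0, 0.0, 0.0], lo, hi)
invalid = f_sm([20.0, 20.0, 0.0, 0.0, 0.0, 0.0], lo, hi)    # BU and BD together
```

Returning `None` for an unlisted combination keeps the handling of unexpected inputs explicit; how the original system treats such samples is not stated in the text.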

We assumed two situations to generate data. The first randomly operates the lever, and the second operates the lever in a sequence that can be used in actual work. We defined the data created by assuming random movement of the lever as a random dataset and the data created under the assumption that the lever moved in a practical sequence as a scenario dataset. Figure 3 visualizes an example of the action sequence of the scenario dataset. The arrows indicate the boom, arm, and bucket that move according to the action state.

We generated normal data using the results from the excavator modeling with noise, where Gaussian noise with a standard deviation of 8% of each parameter’s operating range was added to simulate sensor measurement uncertainty. We also created failure data by changing the coefficients in Equations (5)–(8). All failure types and related parameters are shown in Table 3. For example, in Equations (5)–(8), the coefficients related to the boom-up pilot pressure are 5.488 in $y_{1}$, 5.509 in $y_{2}$, −0.651 in $y_{3}$, and −0.853 in $y_{4}$. If the boom-up pilot, which is one of the action parameters, is broken, the coefficients related to the boom-up pilot are changed by a certain proportion; this is called a type $A_{1}$ failure. We defined the type of failure in which the sensor parameter failed as type S. A type $S_{1}$ failure occurs when the $y_{1}$ coefficients have changed by a certain proportion. By constructing the model so that the coefficients change, failure data can be created, and the failure data can be classified by the failure type.

Physically, type A faults correspond to degradation or malfunction in the pilot pressure control path. For instance, a type $A_{1}$ (boom-up) fault may arise from pilot valve spool sticking, lever linkage wear, or pilot pressure hose leakage, all of which alter the effective pilot pressure signal transmitted to the main control valve. Dai et al. [26] noted that pilot valve degradation is a common and challenging fault mode in hydraulic systems due to its strong concealment characteristics. Similarly, type S faults correspond to anomalies in the main pump output or regulator response. A type $S_{1}$ fault (main pump A) may result from piston shoe wear, swash plate degradation, or internal leakage in the pump [27], while types $S_{3}$ and $S_{4}$ (regulator faults) may arise from regulator spring fatigue or servo piston seal degradation. Bergada et al. [28] provided a quantitative analysis of all major leakage paths in axial piston pumps, confirming that internal leakage is the dominant cause of performance degradation. These physical mechanisms are modeled as proportional coefficient changes because the dominant effect of such degradation is a gain shift in the input–output relationship. Notably, by varying $\beta$ continuously within the defined range, our simulation generates a broader spectrum of fault severities than is typically observed in individual field incidents, enabling a comprehensive evaluation of anomaly detection algorithms across diverse degradation levels.

The fault coefficients are perturbed by a factor $\beta \in [0.2, 0.7] \cup [1.3, 1.8]$, meaning the original coefficient is multiplied by $\beta$. This range was determined based on two criteria. First, regarding the sensor noise margin, the 8% sensor noise corresponds to $\beta = 0.92$–$1.08$ under normal conditions, and the minimum fault level ($\beta = 0.7$ or $1.3$) is separated from the noise boundary by at least 22 percentage points, ensuring reliable detection without false alarms. Values of $\beta$ close to 1.0 (e.g., $\beta \in [0.8, 1.2]$) produce perturbations within the noise margin and are indistinguishable from normal operation. Second, regarding consistency with reported hydraulic system degradation, studies on hydraulic piston pump faults report volumetric efficiency losses of 20–70% due to common failure modes such as slipper wear, swash plate degradation, and valve plate wear [27,28], corresponding to $\beta = 0.3$–$0.8$. The lower range ($\beta \in [0.2, 0.7]$) covers these degradation levels, while the upper range ($\beta \in [1.3, 1.8]$) represents over-response scenarios such as abnormal valve activation due to electrical faults. Parametric fault injection by scaling model coefficients is a well-established approach in hydraulic system fault simulation [29,30]. Shen and Zhao [30] similarly defined multiple fault severity levels for hydraulic system simulation, validating the use of scaled parameters to represent different fault degrees.
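As a rough illustration of this parametric fault injection, the sketch below draws $\beta$ from the two intervals and scales the boom-up coefficients quoted earlier to produce a type $A_{1}$ fault. Several details are assumptions: the text does not state how $\beta$ is distributed within the range (an equal chance of either interval and a uniform draw inside it are used here), it does not say whether the BU·BI interaction terms are scaled as well (they are left out), and the function names are hypothetical.

```python
import random

# Linear boom-up coefficients in y1..y4, as quoted from Equations (5)-(8).
BU_COEF = {"y1": 5.488, "y2": 5.509, "y3": -0.651, "y4": -0.853}

BETA_RANGES = [(0.2, 0.7), (1.3, 1.8)]


def sample_beta(rng):
    """Pick one of the two intervals, then draw uniformly inside it (assumed)."""
    low, high = rng.choice(BETA_RANGES)
    return rng.uniform(low, high)


def inject_a1_fault(coef, beta):
    """Return a copy of the boom-up coefficients multiplied by beta."""
    return {name: beta * value for name, value in coef.items()}


rng = random.Random(0)  # fixed seed so the example is repeatable
beta = sample_beta(rng)
faulty = inject_a1_fault(BU_COEF, beta)
half_gain = inject_a1_fault(BU_COEF, 0.5)  # a fixed severity, for comparison
```

A type S fault could be sketched in the same way by scaling every coefficient of one of the $y$ equations rather than one pilot's coefficients across all four.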

Table 4 summarizes the dataset configuration. The length of the training dataset is approximately 100,000, and the training dataset contains only normal data. For each failure type, 20 datasets were created for the experiments (ten random datasets and ten scenario datasets). The length of the test dataset is approximately 6000, and an identical sequence of action states is repeated twice. The first action state sequence is a normal condition, and the second action state sequence is a fault condition. After creating the failure data in this way, we labeled the data as fault data when the actions related to fault coefficients were executed. For example, when failure data were created by changing the coefficient of the boom-up parameter, the data in action states 2, 8, 9, 10, 11, 20, 21, 24, and 25 were labeled as abnormal data, whereas the remaining action states were labeled as normal data.

All models were trained using normal-only data. Faulty test datasets were generated by perturbing the fitted model parameters in Equations (5)–(8), and labels were assigned to samples collected under action states that are directly affected by the perturbed coefficients. This simulation-based approach overcomes the practical difficulty of collecting labeled fault data from actual industrial machines while preserving the essential sensor dynamics of the excavator system.
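The labeling rule for the test data can be sketched as follows, assuming each test set is one action-state sequence played twice (first normal, then faulty) as described above. The set of boom-up states is the one listed in the text; the function name and the short example sequence are hypothetical.

```python
# Action states in which the boom-up action is active, as listed in the text.
BOOM_UP_STATES = {2, 8, 9, 10, 11, 20, 21, 24, 25}


def label_test_sequence(states, affected):
    """Build the doubled test sequence and its labels.

    states:   action-state index of each sample in ONE pass of the sequence.
    affected: set of action states touched by the perturbed coefficients.
    Returns (all_states, labels) with 0 = normal and 1 = abnormal.
    """
    first_pass = [0] * len(states)                        # normal condition
    second_pass = [1 if s in affected else 0 for s in states]  # fault condition
    return list(states) + list(states), first_pass + second_pass


all_states, labels = label_test_sequence([1, 2, 3, 8], BOOM_UP_STATES)
```

Samples in the faulty pass that fall in unaffected states keep the normal label, since the perturbed coefficients are multiplied by an inactive pilot pressure there.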

2.3. Anomaly Detection Algorithm

The training data were partitioned by action state, and a separate model was trained on each partition, yielding 27 models. Each test sample was then assigned to its action state and evaluated with the reactive method or the history-based method using the model of that state.

2.3.1. Action States in Anomaly Detection

Action states are applied to an anomaly detection algorithm through the following steps:

1. Select control-related features as action parameters. We used the pilot pressures of the excavator as action parameters.
2. Define action states as binary combinations of action parameters (Table 2) and compute the action state mapping function:

$$f_{SM}(\mathrm{BU}, \mathrm{BD}, \mathrm{AI}, \mathrm{AO}, \mathrm{BI}, \mathrm{BO}) = s \tag{9}$$

where BU, BD, AI, AO, BI, and BO are abbreviations of action parameters and have continuous values, and $s$ denotes an action state. The action state mapping function $f_{SM}$ is described in detail in Section 2.2.
3. Split the training data by action state using $f_{SM}$:

$$X = \{X_{1}, X_{2}, \dots, X_{27}\} \tag{10}$$

where $X$ is represented in Figure 2, and $X_{s}$ is the data belonging to action state $s$.
4. Train a separate anomaly detection model for each state:

$$\Psi = \{\Psi_{1}, \Psi_{2}, \dots, \Psi_{27}\} \tag{11}$$

where $\Psi_{s}$ is the anomaly detection model trained with data $X_{s}$. For example, when using GMM as the anomaly detection model, $\Psi_{s}$ is fitted to $X_{s}$ via the expectation–maximization (EM) algorithm with $K$ Gaussian components. Each model $\Psi_{s}$ learns the mean $\mu_{k}$, covariance $\Sigma_{k}$, and mixture proportion $\phi_{k}$ for $k = 1, \dots, K$ clusters. For other algorithms (e.g., OCSVM, IForest, VAE, DeepSVDD), the same state-wise training scheme applies: each $\Psi_{s}$ is trained independently on $X_{s}$ using the algorithm’s standard fitting procedure.
5. At inference time, determine the current state $s$ using $f_{SM}$ and evaluate the sample using $\Psi_{s}$.
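The steps above can be sketched with any detector that exposes a fit and a score operation. The example below uses a deliberately simple per-feature z-score model as a stand-in; it is not the GMM used in the paper, and the class and function names are hypothetical. It is meant only to show the state-wise split, the per-state training, and the routing at inference time.

```python
from collections import defaultdict
from statistics import mean, pstdev


class ZScoreModel:
    """Stand-in detector (not the paper's GMM): per-feature mean and std."""

    def fit(self, rows):
        cols = list(zip(*rows))
        self.mu = [mean(c) for c in cols]
        self.sd = [pstdev(c) or 1.0 for c in cols]  # guard against zero spread
        return self

    def score(self, x):
        """Largest absolute z-score over the features; higher is more abnormal."""
        return max(abs(v - m) / s for v, m, s in zip(x, self.mu, self.sd))


def fit_per_state(samples, states, make_model=ZScoreModel):
    """Steps 3 and 4: split the training data by state and fit one model each."""
    groups = defaultdict(list)
    for x, s in zip(samples, states):
        groups[s].append(x)
    return {s: make_model().fit(rows) for s, rows in groups.items()}


def score_sample(models, x, s):
    """Step 5: evaluate the sample with the model of its own action state."""
    return models[s].score(x)


# Two states whose normal data sit in very different ranges.
train = [[0.0], [1.0], [2.0], [99.0], [100.0], [101.0]]
train_states = [1, 1, 1, 2, 2, 2]
models = fit_per_state(train, train_states)

in_state_2 = score_sample(models, [100.0], 2)  # typical for state 2
in_state_1 = score_sample(models, [100.0], 1)  # far from state 1's normal data
```

The same value is unremarkable in one state and strongly abnormal in another, which is the situation a single model trained on the mixed data has to accommodate.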

2.3.2. Cumulative Probability and Reactive Method

The reactive method determines whether the test data is normal or abnormal according to the current action state. The key intuition is that within each action state, the normal data forms a relatively compact cluster in the feature space. By measuring how far a test sample deviates from this cluster using probabilistic distance, we can quantify its abnormality. The multivariate Gaussian distribution can be expressed as

$$\mathcal{N}(x \mid \mu, \Sigma)$$

where $\mu$ is the mean, $\Sigma$ is the covariance, and $x$ is a 10-dimensional vector.

The multivariate Gaussian mixture distribution $p(x)$ can be expressed as follows:

$$p(x) = \sum_{k=1}^{K} \phi_{k} \, \mathcal{N}(x \mid \mu_{k}, \Sigma_{k}), \quad \sum_{k=1}^{K} \phi_{k} = 1, \quad 0 \leq \phi_{k} \leq 1 \tag{12}$$

where $k = 1, \dots, K$, and $K$ denotes the number of clusters. $\mu_{k}$ is the mean of the $k$-th cluster, $\Sigma_{k}$ is the covariance of the $k$-th cluster, and $\phi_{k}$ is the mixture proportion of the $k$-th cluster. The Mahalanobis distance is used to evaluate the test data after learning the GMM.

$$d_{ik} = \sqrt{(x_{i} - \mu_{k})^{T} \Sigma_{k}^{-1} (x_{i} - \mu_{k})} \tag{13}$$

where $i = 1, \dots, L$, $L$ is the number of test samples, $x_{i}$ is the $i$-th test sample, and $d_{ik}$ is the Mahalanobis distance between cluster $k$ and the $i$-th sample. The Mahalanobis distance of the Gaussian model is converted into a cumulative probability using $F$ [31]. The cumulative distribution function (CDF) for the Mahalanobis distance is defined as follows:

$$F(d_{ik}) = P(X \leq d_{ik}), \quad \lim_{d_{ik} \to \infty} F(d_{ik}) = 1 \tag{14}$$

where $F(d_{ik})$ gives the cumulative probability of $x_{i}$ measured from the center of the Gaussian distribution using the Mahalanobis distance. Accordingly, as $d_{ik}$ grows without bound, the cumulative probability $F(d_{ik})$ approaches 1. Weighting the cumulative probability $F_{k}$ of each cluster by its mixture proportion $\phi_{k}$ gives

$$\lim_{d_{ik} \to \infty} \phi_{k} F_{k}(d_{ik}) = \phi_{k}, \quad \sum_{k=1}^{K} \lim_{d_{ik} \to \infty} \phi_{k} F_{k}(d_{ik}) = 1. \tag{15}$$

Therefore, the cumulative probability considering the mixture proportion can be expressed as follows:

$$\Phi(x_{i}) = \sum_{k=1}^{K} \phi_{k} F_{k}(d_{ik}) \tag{16}$$

where $\Phi(x_{i})$ is the anomaly detection value of the $i$-th test data $x_{i}$ in the reactive method. When the action state of the test data $x_{i}$ is $j$, the anomaly detection value $\Phi(x_{i})$ can be expressed as $\Phi(x_{i} \mid j)$, which includes the action state.

$$
\begin{cases}
\Phi(x_{i} \mid j) > \text{threshold}: & \text{abnormal data} \\
\Phi(x_{i} \mid j) \leq \text{threshold}: & \text{normal data}
\end{cases}
\tag{17}
$$

where $j = 1, \dots, 27$. This evaluation method is defined as the reactive method. For each action state $j$, the threshold is set as the $q$-quantile (e.g., $q = 0.99$) of the anomaly scores computed on the normal training data in that state. This calibration ensures a controlled false alarm rate while adapting to the score distribution of each action state.
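A minimal sketch of the reactive score in Equation (16) and the quantile threshold is given below. It rests on assumptions that the text does not spell out: $F$ is taken to be the chi-square CDF of the squared Mahalanobis distance (the standard result for a Gaussian in $D$ dimensions; the exact form used in [31] is not reproduced here), the covariance is restricted to a diagonal so that no matrix inverse is needed, the closed-form CDF used is valid only for an even number of dimensions (such as the 10 used in the paper), and the quantile is computed with a simple order-statistic rule. All function names are hypothetical.

```python
import math


def chi2_cdf_even(x, dof):
    """Chi-square CDF for an even number of degrees of freedom (closed form)."""
    if dof % 2 != 0:
        raise ValueError("closed form used here requires even dof")
    h = x / 2.0
    tail = math.exp(-h) * sum(h ** j / math.factorial(j) for j in range(dof // 2))
    return 1.0 - tail


def mahalanobis_diag(x, mu, var):
    """Equation (13) for a diagonal covariance given as per-feature variances."""
    return math.sqrt(sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mu, var)))


def reactive_score(x, components):
    """Equation (16). components: list of (phi_k, mu_k, var_k) for one state."""
    dof = len(x)
    return sum(
        phi * chi2_cdf_even(mahalanobis_diag(x, mu, var) ** 2, dof)
        for phi, mu, var in components
    )


def quantile_threshold(train_scores, q=0.99):
    """q-quantile of the normal training scores (order-statistic rule, assumed)."""
    s = sorted(train_scores)
    idx = max(0, min(len(s) - 1, math.ceil(q * len(s)) - 1))
    return s[idx]


# One two-dimensional component with unit variances, for illustration.
gmm = [(1.0, [0.0, 0.0], [1.0, 1.0])]
at_center = reactive_score([0.0, 0.0], gmm)
far_away = reactive_score([10.0, 10.0], gmm)
median_thr = quantile_threshold([3.0, 1.0, 2.0, 4.0], q=0.5)
```

In use, `quantile_threshold` would be called once per action state on the scores of that state's normal training data, and a test sample would be flagged when its score exceeds the threshold of its own state.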

2.3.3. History-Based Method

We extended the reactive method and propose the history-based method, which uses the previous action states. The underlying intuition is that faults typically persist across multiple action states. If anomalies were detected in recent action states, the current observation in the same action state is more likely to be abnormal as well. This temporal context helps detect subtle anomalies that might be missed when evaluating each sample independently. To extract the appropriate anomaly detection information from the previous action state, the results of the reactive method in the previous action state are averaged, and the sensitivity constant $\alpha$ is added. This anomaly detection information is defined as the history anomaly factor of the $m$-th previous action state. It can be expressed by the following formula:

$$\bar{\Phi}(q_{t-m}) = \left[ \frac{1}{N_{q_{t-m}}} \sum_{n=1}^{N_{q_{t-m}}} \Phi(x_{n} \mid q_{t-m}) \right] + \alpha, \quad 0 \leq \alpha < 1 \tag{18}$$

where

q_{t - m}

represents the m-th previous action state,

N_{q_{t - m}}

is the number of data points in the m-th previous action state, and

x_{n}

is a data point in the m-th previous action state.

α

is the sensitivity constant.

f (\bar{Φ} (q_{t - m})) = \begin{cases} \bar{Φ} (q_{t - m}), & \bar{Φ} (q_{t - m}) > 1 \\ 1, & \bar{Φ} (q_{t - m}) \leq 1 \end{cases}

(19)

where

f (\bar{Φ} (q_{t - m}))

is a kind of anomaly activation function. If the history anomaly factor

\bar{Φ} (q_{t - m})

is greater than 1, the current data is also assigned a greater abnormal weight. Otherwise, the activation is fixed at 1 and the current score is left unchanged, so that normal history does not influence it. Assuming that anomalies are independent in each action state, the history-based method is defined by multiplying the anomaly detection value of the current data by the activation functions of the previous action states. The formula for the history-based method is expressed below.

\begin{aligned} H_{α, M} (x_{i} ∣ j) & = Φ (x_{i} ∣ j) f (\bar{Φ} (q_{t - 1})) f (\bar{Φ} (q_{t - 2})) \cdots f (\bar{Φ} (q_{t - M})) \\ & = Φ (x_{i} ∣ j) \prod_{m = 1}^{M} f (\bar{Φ} (q_{t - m})) \end{aligned}

(20)

where

H_{α, M} (x_{i} ∣ j)

is the anomaly detection value of the i-th test data

x_{i}

when using the history-based method. The current action state j is the t-th action state, and it uses up to the M-th previous action state from the current action state. M must be smaller than t. The values of

α

and M are selected via grid search. In Section 3.2, we evaluate various combinations and set

α = 0.1

and

M = 15

. History amplification is applied only when the current action state appears among the previous M states that satisfy

\bar{Φ} (q_{t - m}) > 1

. If no such match exists,

H_{α, M}

reduces to the reactive score

Φ (x_{i} ∣ j)

. For example, an anomaly detected during a boom-up operation should not inflate the score of an unrelated bucket-in operation.

\begin{cases} H_{α, M} (x_{i} ∣ j) > \text{threshold}: & \text{abnormal data} \\ H_{α, M} (x_{i} ∣ j) \leq \text{threshold}: & \text{normal data} . \end{cases}

(21)

This rule means that if a possible failure was detected in a previous occurrence of the same action state, anomalies in the current action state are detected more sensitively. Anomalies are difficult to detect when the data lie near the boundary between normal and abnormal, and an ambiguous fault can be misrecognized as normal. Therefore, when the current sample is ambiguous and an anomaly was detected in a previous occurrence of the same action state, the current score is weighted toward the abnormal side.
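A compact Python sketch of Equations (18)–(20) is given below. It is illustrative only and rests on several assumptions: the history is represented as a list of `(state, scores)` segments ordered from oldest to newest, only segments whose state matches the current one contribute (following the textual description above), and empty segments are skipped. The function names are hypothetical.

```python
def history_factor(segment_scores, alpha):
    """History anomaly factor of Equation (18): mean reactive score plus alpha."""
    return sum(segment_scores) / len(segment_scores) + alpha


def activation(factor):
    """Anomaly activation of Equation (19): amplify only when the factor exceeds 1."""
    return factor if factor > 1.0 else 1.0


def history_score(phi_current, current_state, previous_segments, alpha=0.1, M=15):
    """History-based score in the spirit of Equation (20).

    phi_current       : reactive score of the current sample
    current_state     : action state j of the current sample
    previous_segments : list of (state, [reactive scores]), oldest first
    """
    recent = previous_segments[-M:] if M > 0 else []
    score = phi_current
    for state, seg_scores in recent:
        # Assumption: amplification is restricted to earlier occurrences of
        # the same action state, and empty segments carry no information.
        if state != current_state or not seg_scores:
            continue
        score *= activation(history_factor(seg_scores, alpha))
    return score
```

With `alpha = 0.1`, a matching previous segment amplifies the current score only when its mean reactive score exceeds 0.9, which keeps ordinary normal history from inflating the result.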

2.4. Fault Diagnosis Algorithm

Using the action states makes it possible to extend fault diagnosis beyond anomaly detection. We modeled a failure diagnosis method using the probability that an anomaly of failure type

A_{c}

is detected in each action state. In this paper, we focus on diagnosing action parameter faults (types

A_{1}

to

A_{6}

) because these faults exhibit distinct detection patterns across different action states, enabling effective Bayesian inference. Sensor parameter faults (types

S_{1}

to

S_{4}

) affect all action states similarly and require different diagnostic approaches, which we leave for future work. The probability that the failure data from fault type

A_{c}

will be detected abnormally in action state q is

p (x | q, A_{c})

. Using this conditional distribution, it is possible to analyze which action parameter is the cause of a failure. We represent the conditional probabilities as follows.

P (X ∣ Q, A_{c}) = \prod_{t = 1}^{T} p (x_{t} ∣ q_{t}, A_{c})

(22)

where X = {x_{1}, x_{2}, …, x_{T}} is the sequence of observations at which an abnormality is detected, and Q = {q_{1}, q_{2}, …, q_{T}} is the corresponding sequence of action states. Since the likelihood of each

A_{c}

is known, the posterior can be calculated using Bayes’ theorem.

\begin{aligned} P (A_{c} ∣ Q, X) & = \frac{P (X ∣ Q, A_{c}) P (A_{c})}{P (X ∣ Q)} \\ & = \frac{P (X ∣ Q, A_{c}) P (A_{c})}{\sum_{c = 1}^{6} P (X ∣ Q, A_{c}) P (A_{c})} . \end{aligned}

(23)

Assuming that each failure type is equally likely, the prior

P (A_{c})

is the same for each c. We adopt a uniform prior

P (A_{c}) = 1 / 6

for two reasons: (i) a uniform prior represents the most conservative (maximum-entropy) baseline, ensuring that the diagnosis outcome is driven entirely by the observed likelihood rather than prior assumptions; and (ii) when domain-specific fault statistics become available, the prior

P (A_{c})

can be straightforwardly updated to reflect the actual fault frequencies, and the posterior will be adjusted accordingly through Bayes’ theorem. Therefore, the equation can be summarized as follows:

P (A_{c} ∣ Q, X) = \frac{P (X ∣ Q, A_{c})}{\sum_{c = 1}^{6} P (X ∣ Q, A_{c})} .

(24)

The failure is expected to be caused by the failure type

A_{c}

with the highest probability among

P (A_{c} ∣ Q, X)

, which can be expressed as follows:

\text{Fault cause type } A_{c} = \underset{c}{\operatorname{arg\,max}} P (A_{c} ∣ Q, X) .

(25)

Equation (25) can be used to predict which action parameter caused the failure.
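The Bayesian update of Equations (22)–(25) under a uniform prior can be sketched as follows. This is a hedged illustration: the detection probabilities are assumed to be available as nested dictionaries, the computation is done in log space for numerical stability, and a small probability floor (an assumption, not part of the paper) avoids taking the logarithm of zero. The name `diagnose` is hypothetical.

```python
import math


def diagnose(detected_states, detect_prob, floor=1e-12):
    """Return (most probable fault type, posterior over fault types).

    detected_states : action states q_1..q_T at which anomalies were detected
    detect_prob     : {fault_type: {action_state: p(x | q, A_c)}}
    floor           : assumed lower bound on probabilities to avoid log(0)
    """
    # Equation (22): product of per-state detection probabilities, in log space.
    log_lik = {}
    for fault, table in detect_prob.items():
        log_lik[fault] = sum(
            math.log(max(table.get(q, 0.0), floor)) for q in detected_states
        )
    # Equation (24): with a uniform prior the posterior is the normalized likelihood.
    peak = max(log_lik.values())
    unnorm = {fault: math.exp(v - peak) for fault, v in log_lik.items()}
    total = sum(unnorm.values())
    posterior = {fault: v / total for fault, v in unnorm.items()}
    # Equation (25): select the fault type with the highest posterior.
    best = max(posterior, key=posterior.get)
    return best, posterior
```

For example, with two hypothetical fault types whose detection probabilities in states 2 and 3 are (0.8, 0.1) and (0.2, 0.4), the detection sequence `[2, 2, 3]` gives likelihoods 0.064 and 0.016, so the first type receives a posterior of 0.8.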

3. Experimental Results

3.1. Action State Analysis

The t-SNE algorithm was used to visualize the ten-dimensional data in Figure 2. t-SNE is a nonlinear dimensionality reduction technique that preserves local pairwise similarities to project high-dimensional data into two dimensions for visualization [32]. Figure 4 shows the two-dimensional t-SNE embedding colored by action state. Data points belonging to the same action state form distinct clusters, indicating that action-state decomposition partitions the multimodal distribution into more homogeneous subsets and facilitates state-conditional boundary estimation.

Figure 5 shows the normal and abnormal data visualized by t-SNE when the action state changes in the order 3 → 1 → 9 → 15 → 20 → 14. In action states 20 and 14, which are related to the actual failure, the normal and abnormal data distributions differ substantially. The difference between normal and abnormal data becomes visible once the data are separated by action state.

The number of clusters is one of the most influential choices when configuring the GMM. We therefore used the Calinski–Harabasz criterion to find the optimal number of clusters for each action state [33]. Depending on the data distribution of each action state, the optimal number ranged from two to six. In this paper, we used GMMs with three clusters for every action state, since learning with three clusters gave results that differed little from learning with the per-state optimum estimated by the Calinski–Harabasz criterion.
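For reference, the Calinski–Harabasz index is the ratio of between-cluster to within-cluster dispersion, each normalized by its degrees of freedom, and a larger value indicates better-separated clusters. The sketch below computes it for a given labeling. It assumes the cluster labels have already been produced by a separate clustering step (not shown), and the function name is hypothetical.

```python
import math


def calinski_harabasz(points, labels):
    """Calinski-Harabasz index for points (lists of floats) with given cluster labels."""
    n = len(points)
    dim = len(points[0])
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    k = len(clusters)
    if k < 2 or k >= n:
        raise ValueError("need 2 <= number of clusters < number of points")

    overall = [sum(p[d] for p in points) / n for d in range(dim)]
    between = 0.0
    within = 0.0
    for members in clusters.values():
        center = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        # Between-cluster dispersion: size-weighted squared distance to the global mean.
        between += len(members) * sum((c - o) ** 2 for c, o in zip(center, overall))
        # Within-cluster dispersion: squared distances to the cluster center.
        within += sum(
            sum((p[d] - center[d]) ** 2 for d in range(dim)) for p in members
        )
    if within == 0.0:
        return math.inf
    return (between / (k - 1)) / (within / (n - k))
```

To choose the number of clusters for one action state, the index would be evaluated for each candidate count (two to six in the setting above) and the count with the largest value retained.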

3.2. Anomaly Detection Performance

In this section, we perform anomaly detection for various failure types and evaluate the results using the area under the receiver operating characteristic curve (AUC). For each failure type, we report the average AUC over 20 datasets. Figure 6 shows the AUC of the reactive method in each action state for failure types

S_{1}

and

S_{2}

. Since failures occurred in all action states except for action state 1 (the idle state), we evaluated the anomaly detection performance using the AUC for action states 2 to 27.
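As a reminder of what the AUC measures, the following sketch computes it directly as the probability that a randomly chosen abnormal sample receives a higher score than a randomly chosen normal sample, with ties counted as one half. It is a simple pairwise implementation intended for illustration, not for large datasets, and the function name is hypothetical.

```python
def roc_auc(scores, labels):
    """AUC via pairwise comparison; labels are 1 (abnormal) or 0 (normal)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5   # ties contribute half a win
    return wins / (len(pos) * len(neg))
```

Averaging this quantity over the action states with fault samples would give a per-state mean of the kind reported in this section.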

Figure 7 depicts the AUC results for the history-based method for various parameter combinations. We performed a grid search over

α \in \{0.01, 0.05, 0.1, 0.15, 0.2\}

and

M \in \{1, 3, 5, 10, 15\}

and evaluated each combination using the mean AUC across all fault types on a held-out validation set. In Figure 7, as M and

α

increase, Figure 7a slopes upward overall and Figure 7b slopes downward overall, reflecting a trade-off. A larger

α

increases sensitivity to persistent faults but also raises the false alarm rate on normal data, while a larger M incorporates more temporal context but may introduce stale information from distant action states. The grid search results in Figure 7 show that performance improves up to

M = 15

and begins to degrade for

α > 0.15

due to increased false positives. The history-based method is therefore sensitive to its parameters, which should be tuned for the target system. In this study, we set

α

= 0.1 and M = 15, which achieve the best trade-off between detection sensitivity and false alarm rate in the grid search.
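The parameter selection described above amounts to an exhaustive search over the two grids. A minimal sketch is shown below, where `evaluate` is a hypothetical callback that returns the mean validation AUC for a given pair of parameters. The actual evaluation pipeline is not reproduced here.

```python
from itertools import product


def grid_search(alphas, history_lengths, evaluate):
    """Return (best_score, best_alpha, best_M) over all parameter combinations.

    evaluate(alpha, M) is assumed to return a scalar where larger is better,
    for example the mean AUC on a held-out validation set.
    """
    best = None
    for alpha, M in product(alphas, history_lengths):
        score = evaluate(alpha, M)
        if best is None or score > best[0]:
            best = (score, alpha, M)
    return best
```

A call such as `grid_search([0.01, 0.05, 0.1, 0.15, 0.2], [1, 3, 5, 10, 15], evaluate)` would cover the 25 combinations listed above.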

We compare our method with representative one-class and unsupervised baselines, including OCSVM, Isolation Forest (IForest), DIF, ABOD, VAE, DeepSVDD, COPOD, ROD, ECOD, and the recent time-series anomaly detection method SensitiveHUE [25]. The performance results are presented in Table 5. For IForest, we used 100 trees with a

2^{15}

subsampling size [34]. OCSVM uses MATLAB’s fitcsvm function with an RBF kernel (KernelFunction=’rbf’) and standardization. COPOD, ROD, and ECOD are parameter-free algorithms. For ABOD, we used

k = 20

neighbors. For VAE, the encoder/decoder sizes were (16, 8, 4) and (4, 8, 16), with latent dimension 4 and 50 epochs. For DeepSVDD, we used hidden neurons (32, 16) and 50 epochs. For DIF, we used

n_ensemble = 50

and max_samples=’auto’. For SensitiveHUE, we used sequence length 10 and 50 training epochs.

The cumulative probability method in Table 5 is the reactive method applied without separating the action states. The two methods are otherwise nearly identical: the reactive method learns and evaluates each action state separately, whereas the cumulative probability method learns and evaluates all of the data at once. The cumulative probability method uses 50 GMM clusters, and using more than 50 makes little difference in performance. The reactive and history-based methods use three clusters for each action state, and performance did not differ significantly for any setting of two or more clusters. Both methods yield higher AUC values than the cumulative probability method. The other algorithms were also applied after separating the data by action state, in the same way as the proposed action-state-based algorithms, and the outcomes are reported in Table 5. In Table 5, ‘Action state’ indicates that the algorithm was applied to each action state separately. For the action-state variants, AUC is computed per action state and averaged across states (excluding the idle state, which has no samples). Most algorithms performed better after separation, so the action state concept can be applied to various algorithms. In this experimental setting, the proposed history-based action-state method achieves the highest mean AUC among the compared approaches.
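The algorithm-agnostic nature of the decomposition can be illustrated with a thin wrapper that trains one detector per action state and dispatches each test sample to the detector of its own state. The sketch below is an assumption-laden toy: `PerStateDetector` and `ZScoreModel` are hypothetical names, and the one-dimensional z-score model merely stands in for any detector exposing `fit` and `score`.

```python
from collections import defaultdict


class ZScoreModel:
    """Toy one-dimensional detector used only to exercise the wrapper."""

    def fit(self, xs):
        n = len(xs)
        self.mean = sum(xs) / n
        var = sum((x - self.mean) ** 2 for x in xs) / n
        self.std = max(var ** 0.5, 1e-9)   # guard against zero spread
        return self

    def score(self, x):
        return abs(x - self.mean) / self.std


class PerStateDetector:
    """Trains a separate model for every action state seen in training."""

    def __init__(self, make_model):
        self.make_model = make_model
        self.models = {}

    def fit(self, samples, states):
        groups = defaultdict(list)
        for x, s in zip(samples, states):
            groups[s].append(x)
        self.models = {s: self.make_model().fit(xs) for s, xs in groups.items()}
        return self

    def score(self, x, state):
        # Each sample is evaluated only against the model of its own state.
        return self.models[state].score(x)
```

In this toy setting a reading of 101 is unremarkable in a state whose normal data cluster around 101 but highly anomalous in a state whose normal data cluster around 1, which a single pooled model would struggle to express.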

The improvement from action state decomposition varies significantly across algorithms. VAE and DeepSVDD show substantial improvement (from 0.54 to 0.75 and from 0.49 to 0.67, respectively), because these methods are sensitive to the complexity of the data distribution, and when trained on mixed multi-modal data, they struggle to learn a meaningful normal boundary. SensitiveHUE also shows a large improvement (from 0.63 to 0.84, +33%). In contrast, methods like ABOD that rely on local geometric properties show smaller improvement, as they are inherently more robust to multi-modal distributions. OCSVM and IForest show moderate improvement (3–5%), indicating that action state decomposition provides benefits across different algorithm families. Moreover, action-state models substantially reduce inference cost. The per-sample inference time is 0.75 μs for the reactive method and 0.87 μs for the history-based method, compared to 8.68 μs for the cumulative probability method (averaged over 10 runs, measured in MATLAB R2021a on an AMD Ryzen 9 5900X CPU). This approximately 10× speedup is achieved because each action-state model processes a smaller subset of data with fewer GMM clusters (3 per state vs. 50 for the cumulative method).

The proposed GMM-based methods achieve the highest AUC on the excavator data (Table 5) because the excavator simulation model produces nearly linear relationships between action parameters and sensor outputs. Within each action state, the normal data forms compact, approximately Gaussian clusters in the 10-dimensional feature space, which the GMM captures precisely. The Mahalanobis distance combined with the chi-squared CDF (Equation (14)) provides a principled probabilistic anomaly score between 0 and 1: under the Gaussian cluster model, the score is the probability that a sample drawn from the learned normal cluster would deviate less than the observed sample, so values close to 1 indicate deviations that are unlikely to arise by chance. This scoring mechanism offers interpretable diagnostics. When an anomaly is detected, the operator can identify which GMM cluster was most violated, quantify the deviation magnitude in standard deviations, and trace the anomaly back to specific sensor dimensions through the covariance structure. Such interpretability is critical for industrial deployment, where maintenance decisions require understanding why an alarm was triggered, not merely that it was triggered. In contrast, black-box methods such as VAE (reconstruction error) and DeepSVDD (hypersphere distance) lack this direct probabilistic interpretation. Although ABOD achieves competitive AUC without action state decomposition (0.86), its angle-based scores do not provide the same level of physical interpretability as the GMM’s probabilistic framework.

To assess the sensitivity of deep learning methods to hyperparameters and architecture choices, we conducted extended experiments on the excavator dataset with VAE and DeepSVDD across multiple encoder architectures and training epochs. Table 6 summarizes the results. For VAE, three encoder/decoder sizes were tested: small (16, 8, 4), medium (64, 32, 16), and large (128, 64, 32), with 10 and 50 epochs. For DeepSVDD, hidden layer configurations of (32, 16), (64, 32), and (128, 64, 32) were tested similarly.

Several findings emerge from Table 6. First, the hyperparameters used in Table 5 (VAE: encoder (16, 8, 4), 50 epochs, and DeepSVDD: hidden (32, 16), 50 epochs) achieve near-optimal performance, as increasing epochs from 10 to 50 yields negligible improvement for VAE (0.746→0.747 with AS) while providing modest gains for DeepSVDD (0.656→0.676 with AS). We adopted 50 epochs for all tables to ensure consistent convergence across datasets. Second, larger architectures provide modest gains with action state decomposition (VAE: 0.747→0.766, DeepSVDD: 0.676→0.694) but can degrade without it (DeepSVDD: 0.545→0.515), suggesting that larger networks overfit when trained on mixed multimodal data. For SensitiveHUE, longer sequence lengths improve performance both without AS (0.575→0.627→0.710) and with AS (0.817→0.840→0.862), but action state decomposition provides a much larger improvement than increasing seq_len. Even the shortest configuration with AS (seq_len = 5, 0.817) surpasses the best configuration without AS (seq_len = 20, 0.710). Increasing epochs from 10 to 50 yields only marginal gains at both seq_len = 10 (0.627→0.632 without AS, 0.840→0.843 with AS) and seq_len = 20 (0.710→0.716 without AS, 0.862→0.865 with AS), confirming rapid convergence. We adopted 50 epochs for SensitiveHUE in Table 5 and the subsequent benchmark experiments to match the epoch count used for VAE and DeepSVDD. The improvement from AS is largest for shorter sequences (+0.242 at seq_len = 5 vs. +0.152 at seq_len = 20, both at 10 epochs), suggesting that action state decomposition compensates for limited temporal context. For DIF, increasing

n_{ens}

from 10 to 50 improves both conditions, but further increasing to 100 shows no additional gain. These results suggest that the action state decomposition is generally more impactful than hyperparameter tuning for improving anomaly detection across the tested methods.

To assess statistical robustness, all baseline algorithms were evaluated across 20 independent test sequences, each containing different anomaly onset timings. The standard deviation of AUC across sequences was 0.02–0.07 for all algorithms, indicating stable performance. Algorithms with higher mean AUC (GMM, ABOD) tend to exhibit larger absolute standard deviations (±0.04–0.07) due to sensitivity to anomaly timing, while near-random algorithms (COPOD, ECOD) show smaller variability (±0.02–0.04).

We additionally evaluated all algorithms on type A (action parameter) faults to assess generalizability. Type A faults are inherently harder to detect because each fault affects only one of six action parameters, creating subtle shifts in a subset of the 10-dimensional feature space. Without action state decomposition, the best-performing algorithms for type A faults are IForest (mean AUC = 0.58) and COPOD (0.57), while GMM and DeepSVDD drop to near-random (0.48 and 0.49, respectively). Notably, action state decomposition provides inconsistent benefits for type A faults—some algorithms improve (e.g., DIF: 0.47→0.57) while others degrade (e.g., IForest: 0.58→0.50, ABOD: 0.53→0.45). This is because type A faults only affect action states that involve the faulty action parameter. In unaffected states, the data appears normal, and per-state models assign low anomaly scores. The proposed Bayesian fault diagnosis framework (Section 3.3) addresses this limitation by exploiting the pattern of anomaly detection across action states rather than aggregate anomaly scores.

3.3. Fault Diagnosis

In real industrial settings, it is difficult to collect sufficient data for each cause of failure. We therefore propose a fault diagnosis method that uses a simulation model to generate and evaluate a large amount of data for each cause of failure. Using the action state makes it possible not only to detect anomalies but also to diagnose failures: a type A failure can be diagnosed based on the conditional probabilistic model of an action state.

We analyze which action parameter caused each failure. Figure 8a shows the anomaly detection probabilities for each action state when the fault was caused by the boom-up pilot (type

A_{1}

). The action states that involve the faulty action parameter have a relatively high anomaly detection probability. If the anomaly detection probability is known for each fault type

A_{c}

as shown in Figure 8a,

P (A_{c} ∣ Q, X)

can be calculated using Equations (22)–(24). The

P (A_{c} ∣ Q, X)

values are shown in Figure 8b. Figure 8b shows the average value of the results for 40 test datasets for which

A_{1}

is the cause of the failure. If the amount of anomaly detection data is sufficient, the cause of the failure is accurately predicted.

To comprehensively evaluate the fault diagnosis capability, we extended the Bayesian diagnosis to all six action fault types (

A_{1}

–

A_{6}

). For each true fault type, 40 test datasets (20 random-sequence and 20 scenario-sequence) were generated, and the posterior probability

P (A_{c} ∣ Q, X)

was computed by sequentially accumulating anomaly detections across action states. The diagnosis was made by selecting the fault type with the highest posterior after a given number of detected anomalies.

Table 7 shows the resulting

6 \times 6

confusion matrix after 20 detected anomalies. The overall diagnosis accuracy is 86.7%. Five out of six fault types achieve accuracy above 92%, with

A_{1}

(boom up),

A_{3}

(arm in), and

A_{6}

(bucket out) reaching 97.5%. The diagnosis accuracy improves with the number of detected anomalies, from 27.5% after 3 detections, to 43.8% after 5, 77.5% after 10, 82.9% after 15, and 86.7% after 20. This convergence demonstrates that the Bayesian framework effectively accumulates evidence from multiple action states to identify the root cause.

The fault type

A_{2}

(boom down) shows the lowest diagnosis accuracy (40.0%). This is because the boom-down pilot pressure contributes relatively uniformly to sensor responses across action states, making it difficult to distinguish it from other fault types based on action-state-conditional detection patterns. In contrast, fault types that activate specific subsets of action states (e.g.,

A_{1}

boom up,

A_{5}

bucket in) produce distinctive detection patterns that enable high diagnosis accuracy. More specifically, the Bayesian fault diagnosis relies on the action-state-conditional detection probability

p (x ∣ q, A_{c})

(Equation (22)), where each fault type

A_{c}

ideally produces a unique pattern of detection probabilities across the 27 action states. For most fault types, the perturbed pilot pressure affects only action states where the corresponding action is active, yielding a sparse and distinctive pattern. However, the boom-down parameter (BD) appears in Equations (5)–(8) with relatively smaller coefficients in several of these equations compared to other actions, and its influence propagates to the pump responses in a manner similar to several other fault types. As a result, the detection probability column for

A_{2}

exhibits high similarity with those of other fault types (particularly

A_{3}

,

A_{5}

, and

A_{6}

), reducing the discriminative power of the Bayesian update. The confusion matrix confirms this: the 24 misdiagnosed

A_{2}

trials are spread across all five other fault types rather than concentrated on a single type, indicating a diffuse rather than systematic misclassification. Potential improvements could include incorporating the magnitude of anomaly scores (rather than binary detection) into the Bayesian framework, or augmenting the diagnosis with temporal patterns of consecutive anomaly detections across action states.

It should be noted that the current fault diagnosis framework assumes single-fault scenarios and requires the true fault type to be among the six predefined action faults. In practice, the simultaneous occurrence of multiple faults would produce a superposition of detection patterns, potentially degrading diagnosis accuracy. For unknown fault types, the maximum posterior probability can serve as a confidence measure. A low maximum posterior (e.g., below 0.5 after 20 detections) could trigger a “reject” option, indicating that the observed fault pattern does not match any predefined type.
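The reject option mentioned above can be expressed as a simple rule on the posterior. The sketch below is only one possible reading: the function name is hypothetical, and the 0.5 confidence level is the example value given in the text, not a tuned setting.

```python
def decide_with_reject(posterior, min_confidence=0.5):
    """Return the most probable fault type, or None to signal a rejection.

    posterior      : {fault_type: probability}, assumed to sum to 1
    min_confidence : maximum posteriors below this value are rejected
    """
    best = max(posterior, key=posterior.get)
    if posterior[best] < min_confidence:
        return None   # pattern does not clearly match any predefined type
    return best
```

A returned `None` would indicate that the observed detection pattern should be treated as an unknown or possibly combined fault and handled outside the six predefined types.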

3.4. Generalizability to Other Domains

To validate whether the feature-based decomposition concept—the core idea behind action state—generalizes beyond excavator systems, we conducted experiments on two public benchmark datasets, UCI Hydraulic Systems [24] and Skoltech Anomaly Benchmark (SKAB) [35]. In the excavator case, action states are derived from pilot pressure signals that directly control motor actions. However, the decomposition concept need not rely on action-specific features. Any process variable that partitions the data into distinct operating regimes can serve as a decomposition basis. To test this broader hypothesis, we selected datasets where a single process variable—unrelated to any action or control input—naturally partitions operating conditions. If performance improvement is observed—even for a subset of algorithms—it demonstrates that the feature decomposition concept is applicable beyond action-specific features, and the pattern of which algorithms benefit reveals practical conditions for effective decomposition. Unlike turbofan engines [36] and chemical processes [37], where per-mode modeling has been previously explored [38], feature-based mode decomposition has not been systematically evaluated on hydraulic system or industrial process monitoring datasets.

The UCI Hydraulic Systems dataset monitors a hydraulic test rig with 17 sensors sampling at 1–100 Hz across 60-s duty cycles. We selected 10 low-frequency sensors (4 temperature, 2 flow, 1 vibration, and 3 efficiency/cooling sensors) sampled at 1–10 Hz, excluding the 6 pressure sensors and 1 motor power sensor that operate at 100 Hz, because their rapid intra-cycle dynamics are poorly summarized by cycle-level statistics. We computed 2 statistical features per sensor (mean and standard deviation), yielding 20-dimensional feature vectors from 2205 duty cycles. After filtering for stable operating conditions (1449 cycles), the accumulator pressure condition serves as the anomaly target. Cycles with optimal pressure (130 bar) are labeled normal, while reduced/degraded/failure conditions (115/100/90 bar) are labeled anomalous. Training uses 70% of normal-only samples, with the remaining 30% normal plus all anomaly samples forming the test set. For mode definition, we discretized the TS1 temperature sensor into 3 levels (low/medium/high) based on tercile boundaries.

The SKAB dataset contains multivariate time-series data from a water circulation testbed with eight features (two vibration, two temperature, two electrical, one pressure, and one flow measurement). The dataset provides pre-defined train/test splits with anomaly-free training samples and test samples from 34 anomaly scenarios. For mode definition, we discretized the Volume Flow RateRMS feature (coolant flow rate) into three levels based on tercile boundaries of the training data, partitioning the data into low-, medium-, and high-flow operating regimes.
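The tercile-based mode definition used for both benchmarks can be sketched as follows. The boundaries are estimated from training data only and then applied unchanged to test data. The order-statistic quantile convention and the function names are assumptions made for illustration.

```python
def tercile_boundaries(train_values):
    """Lower and upper tercile boundaries estimated from training data."""
    ordered = sorted(train_values)
    n = len(ordered)
    return ordered[n // 3], ordered[(2 * n) // 3]


def assign_mode(value, boundaries):
    """Map a value to mode 0 (low), 1 (medium), or 2 (high)."""
    low, high = boundaries
    if value < low:
        return 0
    if value < high:
        return 1
    return 2


def mode_shares(values, boundaries):
    """Fraction of samples falling into each of the three modes."""
    counts = [0, 0, 0]
    for v in values:
        counts[assign_mode(v, boundaries)] += 1
    return [c / len(values) for c in counts]
```

Comparing `mode_shares` on training and test data is a quick way to check whether the decomposition feature keeps a similar distribution under faults, or whether most test samples collapse into a single mode.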

Table 8 compares anomaly detection performance (AUC) with and without feature-based mode decomposition. All algorithms use the same hyperparameters as in Table 5. Without mode decomposition, a single model is trained on all data. With mode decomposition, separate models are trained per mode.

The results demonstrate that feature-based mode decomposition can improve anomaly detection on both datasets, though the benefit varies substantially with dataset characteristics. On UCI Hydraulic, 7 out of 10 comparable algorithms improve with mode decomposition. ROD shows the most dramatic gain (0.573 to 0.891, +55.5%), followed by VAE (0.599 to 0.928, +54.9%), SensitiveHUE (0.603 to 0.860, +42.6%), OCSVM (0.717 to 0.976, +36.1%), and DeepSVDD (0.610 to 0.719, +17.9%). IForest also shows substantial improvement (0.836 to 0.915, +9.4%). COPOD (0.468 to 0.257, −45.1%), ECOD (0.496 to 0.231, −53.4%), and DIF (0.650 to 0.356, −45.2%) degrade, likely because these distribution-based methods are sensitive to the reduced sample size per mode. The history-based method yields the same AUC as the reactive method (both 0.765), because each duty cycle is an independent observation without temporal ordering between cycles. Consequently, there is no sequential context for the history method to accumulate.

On SKAB, 5 out of 10 comparable algorithms improve with mode decomposition: SensitiveHUE shows the largest gain (0.600 to 0.673, +12.2%), followed by DIF (0.528 to 0.585, +10.8%), COPOD (0.585 to 0.623, +6.5%), IForest (0.513 to 0.546, +6.4%), and ROD (0.622 to 0.656, +5.5%). In contrast, VAE (0.676 to 0.543, −19.7%) and DeepSVDD (0.657 to 0.512, −22.1%) show substantial degradation. The more modest improvement on SKAB compared to UCI Hydraulic may be partly attributed to a structural difference. Fault-induced feature shifts can cause test-mode imbalance. In UCI Hydraulic, the TS1 temperature remains stable whether the system is normal or faulty, so training-data-based tercile boundaries produce balanced test modes (33%/34%/33%). In SKAB, however, many anomaly scenarios (e.g., water leakage, pump failures) directly alter the coolant flow rate, causing the test distribution to deviate substantially from the training distribution (the test tercile of 74%/6%/20% vs. the training tercile of 33%/33%/33%). The dominant test mode (74%) contains a mixture of normal and diverse anomalous samples that a single per-mode model struggles to separate, limiting the decomposition benefit. This observation suggests a practical consideration. Decomposition may be more effective when the selected feature’s distribution remains relatively stable between normal and anomalous conditions.

An important observation across the three datasets is that the best-performing algorithm differs depending on the dataset characteristics, and the effectiveness of mode decomposition appears to depend, among other factors, on how well the decomposition feature separates operating conditions. On the excavator data (Table 5), the proposed GMM-based history method achieves the highest AUC (0.890). This is because the excavator simulation model produces nearly Gaussian per-state distributions due to the linearity of the hydraulic system under no-load conditions, which aligns well with GMM’s parametric assumptions. Moreover, the GMM-based scoring provides interpretable anomaly explanations through the Mahalanobis distance and chi-squared CDF framework. On UCI Hydraulic, OCSVM achieves the highest AUC (0.976) with mode decomposition, outperforming the proposed GMM methods (reactive: 0.765). The UCI dataset has relatively few training samples per mode, a regime where the RBF kernel of OCSVM effectively captures the decision boundary without requiring large samples, while the GMM’s Mahalanobis-based scoring is less suited to the non-Gaussian feature distributions derived from statistical summaries (mean/std) of raw sensor signals. On SKAB, SensitiveHUE achieves the best performance with mode decomposition (0.673), followed by ROD (0.656).

These results confirm that the feature-based decomposition concept generalizes beyond the excavator’s action-state framework. The decomposition features used in UCI Hydraulic (temperature) and SKAB (flow rate) are process variables unrelated to any control action or input command, yet mode decomposition improves performance for the majority of algorithms on UCI Hydraulic and for half of the algorithms on SKAB. The contrasting results between the two datasets suggest that one potential factor influencing decomposition effectiveness is the stability of the decomposition feature’s distribution across normal and anomalous conditions. In UCI Hydraulic, where the temperature feature is largely unaffected by hydraulic faults, the decomposition produces balanced modes and substantial improvement. In SKAB, where faults alter the flow rate used for decomposition, the resulting mode imbalance may contribute to the reduced benefit. However, other dataset-specific factors (such as data dimensionality, anomaly diversity, and per-mode sample size) may also play a role. This analysis suggests that feature-based mode decomposition is broadly applicable and that it may be advantageous to select decomposition features that characterize operating conditions rather than fault symptoms.

4. Conclusions

In this paper, we propose an action-state decomposition framework for anomaly detection and fault diagnosis in hydraulic excavators. By discretizing continuous pilot pressure signals into 27 action states, the framework transforms a complex multimodal sensor distribution into homogeneous subsets, enabling more precise normal–abnormal boundary estimation. A physics-based simulation model, calibrated from actual HX300AL excavator sensor data, was used to generate diverse fault scenarios. The proposed history-based method, which leverages temporal context from previous action states, achieved a mean AUC of 0.89 across sensor fault types, outperforming ten baseline algorithms. The action state decomposition generally improved detection performance on the excavator data. Furthermore, the decomposition reduced per-sample inference time by approximately 10× (from 8.68 μs to 0.75 μs), demonstrating the feasibility of real-time deployment. The Bayesian fault diagnosis method successfully identified the root cause of action parameter faults with 86.7% accuracy using action-state-conditional detection probabilities. Experiments on two additional benchmark datasets—UCI Hydraulic Systems and SKAB—further validated the generalizability of the mode decomposition concept. By defining operating modes from a single process variable (temperature for UCI Hydraulic, flow rate for SKAB), analogous to the excavator’s pilot pressure signals, anomaly detection performance improved for the majority of algorithms on UCI Hydraulic, and for half of the algorithms on SKAB. The contrasting results between the two benchmark datasets suggest that the effectiveness of mode decomposition may be influenced by the stability of the decomposition feature across normal and anomalous conditions, among other dataset-specific factors.

Regarding practical deployment, the per-sample inference time of 0.75 μs on a desktop CPU (with action state decomposition) suggests feasibility for real-time monitoring, although performance on embedded targets will depend on the specific hardware and implementation. For long-term deployment, a periodic model update strategy is recommended to account for equipment wear and aging. As hydraulic components degrade over time, the normal operating envelope shifts, potentially increasing false alarm rates. A practical approach is to schedule model recalibration at regular maintenance intervals (e.g., every 500–1000 operating hours), using newly collected normal-condition data to retrain the per-state GMM models. Since the action state definitions are determined by the physical lever-to-valve mapping and do not change with wear, only the per-state distribution parameters require updating, which can be performed efficiently. Additionally, an exponentially weighted moving average of anomaly scores can be monitored to detect gradual model drift between recalibration cycles.
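The drift-monitoring idea mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not part of the evaluated system: the smoothing factor `alpha`, the alarm level `limit`, and the function names are hypothetical and would need to be tuned on normal-condition data.

```python
from typing import Iterable, List, Optional


def ewma(scores: Iterable[float], alpha: float = 0.1) -> List[float]:
    """Exponentially weighted moving average: m_t = alpha*s_t + (1-alpha)*m_{t-1}."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed: List[float] = []
    mean: Optional[float] = None
    for s in scores:
        # The first sample initializes the average; later samples update it.
        mean = s if mean is None else alpha * s + (1.0 - alpha) * mean
        smoothed.append(mean)
    return smoothed


def first_drift_index(scores: Iterable[float],
                      alpha: float = 0.1,
                      limit: float = 0.5) -> Optional[int]:
    """Index of the first sample whose EWMA exceeds `limit`, or None if it never does."""
    for i, m in enumerate(ewma(scores, alpha)):
        if m > limit:
            return i
    return None
```

A persistent rise of the smoothed score above the chosen level, without isolated fault alarms, would then be a cue to bring the next recalibration forward.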

Several limitations of this work should be acknowledged. First, the simulation model assumes no-load operating conditions. Under loaded conditions, the main pump pressures change substantially depending on the excavated material, which may alter the action state dynamics and require model extension. Second, the current fault diagnosis framework assumes single-fault scenarios. Simultaneous occurrence of multiple faults, or faults of unknown types, may degrade diagnosis accuracy and require additional detection mechanisms (e.g., a reject option when the maximum posterior probability falls below a confidence threshold). Third, the action state definition relies on a manual binarization threshold, which may not be optimal for all systems. Future work will focus on: (i) extending the model to include load-dependent dynamics, (ii) developing automatic action state discovery algorithms (e.g., using clustering or change-point detection on control signals) to generalize the approach beyond excavators to other cyber-physical systems where control inputs define distinct operating modes, and (iii) investigating multi-fault diagnosis scenarios.
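The reject option mentioned in the second limitation could look roughly like the following Python sketch. It uses a generic Bayes update that treats detections as conditionally independent given the fault type, which is an assumption and not necessarily the formulation of Section 2.4; the table `detect_prob`, the threshold `min_confidence`, and the function names are hypothetical.

```python
from typing import Dict, List, Optional, Sequence


def update_posterior(prior: Sequence[float],
                     likelihoods: Sequence[float]) -> List[float]:
    """One Bayes step: posterior_k is proportional to prior_k * likelihood_k."""
    unnormalized = [p * l for p, l in zip(prior, likelihoods)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under every fault type")
    return [u / total for u in unnormalized]


def diagnose(detect_prob: Dict[int, Sequence[float]],
             observed_states: Sequence[int],
             n_faults: int,
             min_confidence: float = 0.8) -> Optional[int]:
    """Return the most probable fault index, or None to reject the diagnosis.

    detect_prob[s][k] is the (assumed known) probability that an anomaly is
    detected in action state s when fault type k is present.
    """
    posterior = [1.0 / n_faults] * n_faults  # uniform prior over fault types
    for state in observed_states:
        posterior = update_posterior(posterior, detect_prob[state])
    best = max(range(n_faults), key=posterior.__getitem__)
    # Reject when even the best hypothesis is not confident enough.
    return best if posterior[best] >= min_confidence else None
```

Returning `None` instead of the arg-max leaves room for a separate handling path for unknown or simultaneous faults.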

Author Contributions

Conceptualization, J.S. and D.K.; methodology, J.S. and C.L.; software, J.S.; validation, J.S.; formal analysis, J.S. and B.K.; investigation, J.S., C.L. and W.K.; resources, W.K.; data curation, J.S. and B.K.; writing—original draft preparation, J.S.; writing—review and editing, J.S. and D.K.; visualization, J.S.; supervision, D.K.; project administration, D.K.; funding acquisition, D.K. All authors have read and agreed to the published version of the manuscript.

Funding

This work was funded by the National Research Foundation of Korea (NRF) through a grant provided by the Korean government (MSIT) (Grant No. 2020R1A2B5B01002395).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available upon reasonable request from the corresponding author. The data are not publicly available due to industrial confidentiality agreements with Hyundai Construction Equipment.

Acknowledgments

The authors thank Hyundai Construction Equipment for providing the excavator data and technical specifications used in this study.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Aggarwal, C.C. An introduction to outlier analysis. In Outlier Analysis; Springer International Publishing: Cham, Switzerland, 2016; pp. 1–34.
2. Chandola, V.; Banerjee, A.; Kumar, V. Anomaly detection: A survey. ACM Comput. Surv. 2009, 41, 1–58.
3. Darban, Z.Z.; Webb, G.I.; Pan, S.; Aggarwal, C.C.; Salehi, M. Deep learning for time series anomaly detection: A survey. ACM Comput. Surv. 2024, 57, 1–42.
4. Liu, F.T.; Ting, K.M.; Zhou, Z.-H. Isolation forest. In Proceedings of the 2008 Eighth IEEE International Conference on Data Mining, Pisa, Italy, 15–19 December 2008; IEEE: Piscataway, NJ, USA, 2008; pp. 413–422.
5. Xu, H.; Pang, G.; Wang, Y.; Wang, Y. Deep isolation forest for anomaly detection. IEEE Trans. Knowl. Data Eng. 2023, 35, 12591–12604.
6. Manevitz, L.M.; Yousef, M. One-class SVMs for document classification. J. Mach. Learn. Res. 2001, 2, 139–154.
7. Breunig, M.M.; Kriegel, H.-P.; Ng, R.T.; Sander, J. LOF: Identifying density-based local outliers. In Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, Dallas, TX, USA, 16–18 May 2000; ACM: New York, NY, USA, 2000; pp. 93–104.
8. Kriegel, H.-P.; Schubert, M.; Zimek, A. Angle-based outlier detection in high-dimensional data. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Las Vegas, NV, USA, 24–27 August 2008; ACM: New York, NY, USA, 2008; pp. 444–452.
9. Li, Z.; Zhao, Y.; Botta, N.; Ionescu, C.; Hu, X. Copula-based outlier detection. In Proceedings of the 2020 IEEE International Conference on Data Mining (ICDM), Sorrento, Italy, 17–20 November 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 1118–1123.
10. Almardeny, Y.; Boujnah, N.; Cleary, F. A novel outlier detection method for multivariate data. IEEE Trans. Knowl. Data Eng. 2022, 34, 4052–4062.
11. Li, Z.; Zhao, Y.; Hu, X.; Botta, N.; Ionescu, C.; Chen, G. ECOD: Unsupervised outlier detection using empirical cumulative distribution functions. IEEE Trans. Knowl. Data Eng. 2023, 35, 12181–12193.
12. Luo, Z.; He, K.; Yu, Z. A robust unsupervised anomaly detection framework. Appl. Intell. 2022, 52, 6022–6036.
13. Kingma, D.P.; Welling, M. Auto-encoding variational bayes. In Proceedings of the 2nd International Conference on Learning Representations (ICLR), Banff, AB, Canada, 14–16 April 2014.
14. Ruff, L.; Vandermeulen, R.A.; Görnitz, N.; Deecke, L.; Siddiqui, S.A.; Binder, A.; Müller, E.; Kloft, M. Deep one-class classification. In Proceedings of the 35th International Conference on Machine Learning (ICML), Stockholm, Sweden, 10–15 July 2018; PMLR: Stockholm, Sweden; Volume 80, pp. 4393–4402.
15. Pang, G.; Shen, C.; Cao, L.; van den Hengel, A. Deep learning for anomaly detection: A review. ACM Comput. Surv. 2021, 54, 1–38.
16. Na, K.; Kim, K.; Yoo, J.; Lee, J.; Youn, B.D. Optimized relative entropy for robust fault detection in excavator traveling gearboxes via smeared envelope spectrum analysis of cyclo-non-stationary signals. Expert Syst. Appl. 2025, 266, 126110.
17. Meléndez-Useros, M.; Viadero-Monasterio, F.; Jiménez-Salas, M.; López-Boada, M.J. Active steering fault diagnosis via integrated LSTM-based sensor detection and robust actuator fault estimation. Reliab. Eng. Syst. Safety 2025, 111573.
18. Meléndez-Useros, M.; Jiménez-Salas, M.; Viadero-Monasterio, F.; López-Boada, M.J. Novel methodology for integrated actuator and sensors fault detection and estimation in an active suspension system. IEEE Trans. Reliab. 2024, 74, 2171–2184.
19. Li, S.; Wang, H.; Yan, C.; Hou, Y.; Wu, L. A systematic review of diagnosis methods for rolling bearing compound faults: Research status, challenges, and future prospects. Meas. Sci. Technol. 2025, 36, 012008.
20. Wang, R.; Dong, E.; Cheng, Z.; Liu, Z.; Jia, X. Transformer-based intelligent fault diagnosis methods of mechanical equipment: A survey. Open Phys. 2024, 22, 20240015.
21. Leite, D.; Andrade, E.; Rativa, D.; Maciel, A.M.A. Fault detection and diagnosis in Industry 4.0: A review on challenges and opportunities. Sensors 2024, 25, 60.
22. Li, Q.; Li, H.; Ma, Z.; Liu, X.; Guan, X.; Zhang, X. Performance degradation assessment for mechanical system based on semi-analytical solution of self-similar stable distribution process. Struct. Health Monitor. 2024, 23, 1358–1382.
23. Li, Q.; Liang, C.; Yan, C.; Li, H.; Ma, Z.; Guan, X. Performance degradation assessment of mechanical system based on dual adaptive drift coefficient state-space model with autocorrelation prediction error correction. Mech. Syst. Signal Process. 2026, 244, 113804.
24. Helwig, N.; Pignanelli, E.; Schütze, A. Condition monitoring of a complex hydraulic system using multivariate statistics. In Proceedings of the 2015 IEEE International Instrumentation and Measurement Technology Conference (I2MTC), Pisa, Italy, 11–14 May 2015; IEEE: Piscataway, NJ, USA, 2015; pp. 210–215.
25. Feng, Y.; Zhang, W.; Fu, Y.; Jiang, W.; Zhu, J.; Ren, W. SensitiveHUE: Multivariate time series anomaly detection by enhancing the sensitivity to normal patterns. In Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Barcelona, Spain, 25–29 August 2024; ACM: New York, NY, USA, 2024; pp. 782–793.
26. Dai, J.; Tang, J.; Huang, S.; Wang, Y. Signal-based intelligent hydraulic fault diagnosis methods: Review and prospects. Chin. J. Mech. Eng. 2019, 32, 75.
27. Zhu, Y.; Li, G.; Wang, R.; Tang, S.; Su, H.; Cao, K. Intelligent fault diagnosis of hydraulic piston pump based on wavelet analysis and improved AlexNet. Sensors 2021, 21, 549.
28. Bergada, J.M.; Kumar, S.; Davies, D.L.; Watton, J. A complete analysis of axial piston pump leakage and output flow ripples. Appl. Math. Model. 2012, 36, 1731–1751.
29. Liu, S.; Zhang, G. The fault simulation methods of hydraulic system based on AMESim. In Proceedings of the 2015 6th International Conference on Manufacturing Science and Engineering, Guangzhou, China, 28–29 November 2015; Atlantis Press: Dordrecht, The Netherlands, 2015; pp. 1139–1143.
30. Shen, K.; Zhao, D. Fault analysis and fault degree evaluation via an improved ResNet method for aircraft hydraulic system. Sci. Rep. 2025, 15, 4132.
31. Bensimhoun, M. N-Dimensional Cumulative Function, and Other Useful Facts About Gaussians and Normal Densities. Technical Report, Jerusalem, Israel. 2009. pp. 1–8. Available online: https://upload.wikimedia.org/wikipedia/commons/a/a2/Cumulative_function_n_dimensional_Gaussians_12.2013.pdf (accessed on 27 February 2026).
32. Van der Maaten, L.; Hinton, G. Visualizing Data using t-SNE. J. Mach. Learn. Res. 2008, 9, 2579–2605.
33. Caliński, T.; Harabasz, J. A dendrite method for cluster analysis. Commun. Statist.-Theory Methods 1974, 3, 1–27.
34. Liu, F.T.; Ting, K.M.; Zhou, Z.-H. Isolation-based anomaly detection. ACM Trans. Knowl. Discov. Data 2012, 6, 1–39.
35. Katser, I.D.; Kozitsin, V.O. Skoltech Anomaly Benchmark (SKAB). Kaggle. 2020. Available online: https://www.kaggle.com/datasets/yuriykatser/skoltech-anomaly-benchmark-skab (accessed on 27 February 2026).
36. Saxena, A.; Goebel, K.; Simon, D.; Eklund, N. Damage propagation modeling for aircraft engine run-to-failure simulation. In Proceedings of the 2008 International Conference on Prognostics and Health Management, Denver, CO, USA, 6–9 October 2008; IEEE: Piscataway, NJ, USA, 2008; pp. 1–9.
37. Downs, J.J.; Vogel, E.F. A plant-wide industrial process control problem. Comput. Chem. Eng. 1993, 17, 245–255.
38. Yu, J.; Qin, S.J. Multimode process monitoring with Bayesian inference-based finite Gaussian mixture models. AIChE J. 2008, 54, 1811–1829.

Figure 1. The boom, arm, and bucket of an excavator.

Figure 2. Description of the excavator’s parameters obtained by sensors and data dimensions. The dataset X is an $L \times 10$ matrix consisting of 6 action parameters (pilot pressures: BU, BD, AI, AO, BI, BO) and 4 pump parameters (MPa, MPb, MPRa, MPRb).

Figure 3. An example of the action state sequence used to generate the scenario dataset.

Figure 4. Data distribution in two dimensions using t-SNE, colored by action state. Dimension 1 and Dimension 2 are the two components obtained by applying t-SNE to reduce the original 10-dimensional feature space to two dimensions for visualization purposes. Each color represents one of the 27 action states defined in Table 2. The plot shows that data points from the same action state tend to form distinct clusters, demonstrating that action state decomposition transforms the complex multimodal distribution into more separable subsets.

Figure 5. Data distribution as visualized with t-SNE according to the action state change. Dimension 1 and Dimension 2 are two-dimensional t-SNE projections of the original 10-dimensional sensor data. ‘Action 3’ represents the data corresponding to action state 3. The arrows track changes in the data as the action state changes. (a) Normal data. (b) Data in which faults occurred in action states 14 and 20. The clear separation between normal and fault-affected action states validates the effectiveness of action state-based anomaly detection.

Figure 6. AUC performance analysis for each action state using the reactive method. (a) Failure type $S_{1}$. (b) Failure type $S_{2}$.

Figure 7. AUC results for the history-based method; (a,b) depict two aspects of performance change according to the parameters of the history-based method.

Figure 8. Fault diagnosis prediction result. (a) Anomaly detection probabilities at each action state according to the failure type $A_{1}$. (b) Fault diagnosis prediction when a fault is caused by type $A_{1}$.

Table 1. Summary of representative anomaly detection and fault diagnosis methods in related work.

Reference	Year	Method	Key Contribution
Manevitz & Yousef [6]	2001	OCSVM	One-class SVM boundary learning
Kriegel et al. [8]	2008	ABOD	Angle-based outlier detection
Liu et al. [4]	2008	Isolation Forest	Tree-based anomaly isolation
Kingma & Welling [13]	2014	VAE	Reconstruction-based generative model
Ruff et al. [14]	2018	Deep SVDD	Hypersphere boundary in latent space
Li et al. [9]	2020	COPOD	Copula-based empirical outlier detection
Almardeny et al. [10]	2022	ROD	Rotation-based outlier detection
Li et al. [11]	2023	ECOD	Empirical CDF-based outlier detection
Xu et al. [5]	2023	DIF	Deep neural representation + isolation
Feng et al. [25]	2024	SensitiveHUE	Transformer sensitivity to normal patterns
This work	2026	Action state decomp.	Algorithm-agnostic mode decomposition

Table 2. Definition of all operable action states (1∼27) and the meaning of each action. Each column corresponds to an action state index, and each row indicates whether the corresponding pilot pressure channel (boom up/down, arm in/out, bucket in/out) is active (1) or inactive (0).

Idle and single motion
Action state	1	2	3	4	5	6	7
Idle	1	0	0	0	0	0	0
Boom up	0	1	0	0	0	0	0
Boom down	0	0	1	0	0	0	0
Arm in	0	0	0	1	0	0	0
Arm out	0	0	0	0	1	0	0
Bucket in	0	0	0	0	0	1	0
Bucket out	0	0	0	0	0	0	1
Double motion
Action state	8	9	10	11	12	13	14	15	16	17	18	19
Boom up	1	1	1	1	0	0	0	0	0	0	0	0
Boom down	0	0	0	0	1	1	1	1	0	0	0	0
Arm in	1	0	0	0	1	0	0	0	1	1	0	0
Arm out	0	1	0	0	0	1	0	0	0	0	1	1
Bucket in	0	0	1	0	0	0	1	0	1	0	1	0
Bucket out	0	0	0	1	0	0	0	1	0	1	0	1
Triple motion
Action state	20	21	22	23	24	25	26	27
Boom up	1	1	0	0	1	1	0	0
Boom down	0	0	1	1	0	0	1	1
Arm in	1	0	1	0	1	0	1	0
Arm out	0	1	0	1	0	1	0	1
Bucket in	1	1	1	1	0	0	0	0
Bucket out	0	0	0	0	1	1	1	1
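
The mapping in Table 2 can be expressed as a small lookup, sketched below in Python. The state numbering follows the table; the channel abbreviations follow Figure 2. The binarization threshold of 0.5 on min-max-normalized pilot pressures and the function name `action_state` are assumptions for illustration only, since the threshold is chosen manually in practice.

```python
from typing import Optional, Sequence

# Pilot pressure channels: boom up/down, arm in/out, bucket in/out.
CHANNELS = ("BU", "BD", "AI", "AO", "BI", "BO")

# Active channels for action states 1..27, in the order given in Table 2.
ACTIVE_CHANNELS = [
    "",                                           # 1: idle
    "BU", "BD", "AI", "AO", "BI", "BO",           # 2-7: single motion
    "BU AI", "BU AO", "BU BI", "BU BO",           # 8-11: double motion
    "BD AI", "BD AO", "BD BI", "BD BO",           # 12-15
    "AI BI", "AI BO", "AO BI", "AO BO",           # 16-19
    "BU AI BI", "BU AO BI", "BD AI BI", "BD AO BI",  # 20-23: triple motion
    "BU AI BO", "BU AO BO", "BD AI BO", "BD AO BO",  # 24-27
]

STATE_OF = {frozenset(s.split()): i + 1 for i, s in enumerate(ACTIVE_CHANNELS)}


def action_state(pilot_pressures: Sequence[float],
                 threshold: float = 0.5) -> Optional[int]:
    """Binarize six pilot pressures and return the action state index (1..27).

    Returns None for combinations that Table 2 does not define,
    e.g. boom up and boom down active at the same time.
    """
    active = frozenset(ch for ch, p in zip(CHANNELS, pilot_pressures)
                       if p > threshold)
    return STATE_OF.get(active)
```

The count of 27 states corresponds to three positions (inactive or one of two directions) for each of the boom, arm, and bucket.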

Table 3. Definition of failure types used in this paper. The “Sequence” column indicates the lever operation pattern used to generate the test data: Random (R) denotes uniformly random lever movements, while Scenario (S) denotes realistic work-cycle sequences (e.g., dig–swing–dump).

Failure Type	Related Parameter	Sequence
$A_{1}$	Boom-up pilot	Random (R) or Scenario (S)
$A_{2}$	Boom-down pilot	Random (R) or Scenario (S)
$A_{3}$	Arm in pilot	Random (R) or Scenario (S)
$A_{4}$	Arm out pilot	Random (R) or Scenario (S)
$A_{5}$	Bucket in pilot	Random (R) or Scenario (S)
$A_{6}$	Bucket out pilot	Random (R) or Scenario (S)
$S_{1}$	Main pump A	Random (R) or Scenario (S)
$S_{2}$	Main pump B	Random (R) or Scenario (S)
$S_{3}$	Main pump Regulator A	Random (R) or Scenario (S)
$S_{4}$	Main pump Regulator B	Random (R) or Scenario (S)

Table 4. Summary of the dataset configuration.

	Training	Test (per Fault Type)
Samples	∼100,000	∼6000
Labels	Normal only	Normal + Fault
Structure	–	1st half: normal	2nd half: fault
Sequence type	Random	Random (10 sets)	Scenario (10 sets)
Dimensions	10	10
Normalization	Min-max per dimension
Noise level	Gaussian, $σ$ = 8% of operating range

Table 5. AUC results of various anomaly detection algorithms. Bold values indicate the best performance.

	Excavator Modeling Data
Anomaly Detector	Type S₁	Type S₂	Type S₃	Type S₄	Mean
OCSVM [6]	0.8309	0.8443	0.8303	0.8208	0.8316
OCSVM + Action state	0.8668	0.8722	0.8577	0.8467	0.8608
IForest [34]	0.7405	0.7514	0.7357	0.7167	0.7361
IForest + Action state	0.7808	0.7842	0.7678	0.7569	0.7724
ABOD [8]	0.8637	0.8660	0.8564	0.8493	0.8589
ABOD + Action state	0.8622	0.8716	0.8610	0.8571	0.8630
VAE [13]	0.5333	0.5437	0.5499	0.5485	0.5438
VAE + Action state	0.7462	0.7548	0.7520	0.7330	0.7465
DeepSVDD [14]	0.4895	0.4696	0.5033	0.5121	0.4936
DeepSVDD + Action state	0.6775	0.6751	0.6606	0.6515	0.6662
COPOD [9]	0.5251	0.5507	0.5380	0.5285	0.5356
COPOD + Action state	0.6012	0.6081	0.5843	0.5869	0.5951
ROD [10]	0.5340	0.5331	0.5539	0.5449	0.5415
ROD + Action state	0.6936	0.6993	0.6826	0.6634	0.6847
ECOD [11]	0.5399	0.5310	0.5449	0.5378	0.5384
ECOD + Action state	0.5963	0.5942	0.6006	0.5906	0.5954
DIF [5]	0.6278	0.6620	0.6676	0.6797	0.6593
DIF + Action state	0.7223	0.6947	0.7279	0.7133	0.7146
SensitiveHUE [25]	0.6239	0.6508	0.6340	0.6183	0.6317
SensitiveHUE + Action state	0.8436	0.8681	0.8416	0.8166	0.8425
Cumulative probability method	0.8792	0.8770	0.8644	0.8563	0.8692
Reactive method	0.8858	0.8839	0.8748	0.8685	0.8783
History-based method	0.9002	0.8957	0.8864	0.8782	0.8901

Table 6. Hyperparameter sensitivity analysis for deep learning and ensemble methods. Mean AUC across four sensor fault types (S1–S4) is reported for each configuration, with and without action state decomposition.

Method	Hyperparameter	Without AS	With AS
VAE	(16, 8, 4), 10 ep.	0.545	0.746
VAE	(16, 8, 4), 50 ep.	0.544	0.747
VAE	(64, 32, 16), 50 ep.	0.586	0.760
VAE	(128, 64, 32), 50 ep.	0.577	0.766
DeepSVDD	(32, 16), 10 ep.	0.518	0.656
DeepSVDD	(32, 16), 50 ep.	0.545	0.676
DeepSVDD	(64, 32), 50 ep.	0.515	0.694
DeepSVDD	(128, 64, 32), 50 ep.	0.530	0.693
SensitiveHUE	seq_len = 5, 10 ep.	0.575	0.817
SensitiveHUE	seq_len = 10, 10 ep.	0.627	0.840
SensitiveHUE	seq_len = 10, 50 ep.	0.632	0.843
SensitiveHUE	seq_len = 20, 10 ep.	0.710	0.862
SensitiveHUE	seq_len = 20, 50 ep.	0.716	0.865
DIF	$n_{ens}$ = 10	0.619	0.678
DIF	$n_{ens}$ = 50	0.659	0.715
DIF	$n_{ens}$ = 100	0.656	0.712

Table 7. Confusion matrix for Bayesian fault diagnosis across all six action fault types ($A_{1}$–$A_{6}$) after 20 detected anomalies. Each row represents the true fault type, and each column represents the diagnosed fault type (40 trials per fault type). Bold values on the diagonal indicate correct diagnoses.

	$A_{1}$	$A_{2}$	$A_{3}$	$A_{4}$	$A_{5}$	$A_{6}$	Acc.
$A_{1}$	39	0	1	0	0	0	97.5%
$A_{2}$	3	16	6	2	6	7	40.0%
$A_{3}$	1	0	39	0	0	0	97.5%
$A_{4}$	0	0	1	38	0	1	95.0%
$A_{5}$	2	0	1	0	37	0	92.5%
$A_{6}$	0	0	1	0	0	39	97.5%
Overall accuracy							86.7%
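
The accuracy column and the overall figure follow directly from the counts in Table 7, as the short Python check below shows. The variable names are illustrative; the counts are copied from the table.

```python
# Rows: true fault type A1..A6; columns: diagnosed fault type A1..A6.
CONFUSION = [
    [39, 0, 1, 0, 0, 0],
    [3, 16, 6, 2, 6, 7],
    [1, 0, 39, 0, 0, 0],
    [0, 0, 1, 38, 0, 1],
    [2, 0, 1, 0, 37, 0],
    [0, 0, 1, 0, 0, 39],
]

# Per-class accuracy: correct diagnoses divided by trials for that fault type.
per_class = [row[i] / sum(row) for i, row in enumerate(CONFUSION)]

# Overall accuracy: all correct diagnoses divided by all trials.
correct = sum(row[i] for i, row in enumerate(CONFUSION))
total = sum(sum(row) for row in CONFUSION)
overall = correct / total

print([f"{100 * a:.1f}%" for a in per_class])
print(f"{correct}/{total} = {100 * overall:.1f}%")
```

The diagonal sums to 208 of 240 trials, which gives the reported 86.7%; most of the errors come from the boom-down fault $A_{2}$, which is correctly diagnosed in only 16 of 40 trials.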

Table 8. AUC comparison with and without feature-based mode decomposition on two benchmark datasets. Process variables define the operating modes: TS1 temperature sensor (3 levels) for UCI Hydraulic and Volume Flow RateRMS (3 levels) for SKAB. “No” and “With” denote training without and with mode decomposition, respectively. Bold values indicate better performance between No and With for each dataset. “—” indicates that the algorithm is not applicable (Cumulative is only for No, and Reactive and History are only for With).

	UCI Hydraulic (3 Modes)		SKAB (3 Modes)
Algorithm	No	With	No	With
OCSVM [6]	0.717	0.976	0.567	0.556
IForest [34]	0.836	0.915	0.513	0.546
ABOD [8]	0.762	0.824	0.672	0.628
VAE [13]	0.599	0.928	0.676	0.543
DeepSVDD [14]	0.610	0.719	0.657	0.512
COPOD [9]	0.468	0.257	0.585	0.623
ROD [10]	0.573	0.891	0.622	0.656
ECOD [11]	0.496	0.231	0.653	0.637
DIF [5]	0.650	0.356	0.528	0.585
SensitiveHUE [25]	0.603	0.860	0.600	0.673
Cumulative	0.604	—	0.567	—
Reactive	—	0.765	—	0.577
History	—	0.765	—	0.575

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Soh, J.; Lee, C.; Kim, W.; Kang, B.; Kim, D. Anomaly Detection and Fault Diagnosis Based on Action States for Excavators. Appl. Sci. 2026, 16, 2414. https://doi.org/10.3390/app16052414
