Modeling of Distorted Degradation Data Based on Oil Analysis

Chen, Yue; Shi, Jian

doi:10.3390/app15126531

by Yue Chen ¹ and Jian Shi ²,³,*

¹ School of Statistics, Capital University of Economics and Business, Beijing 100070, China

² Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100864, China

³ School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China

* Author to whom correspondence should be addressed.

Appl. Sci. 2025, 15(12), 6531; https://doi.org/10.3390/app15126531

Submission received: 6 May 2025 / Revised: 5 June 2025 / Accepted: 9 June 2025 / Published: 10 June 2025

Abstract

Degradation data are important in judging a machine’s health condition and providing early warning of machine failure. However, interference factors (e.g., oil top-ups) may distort degradation observations, causing the observed data to deviate from the actual physical degradation curve that engineers rely on. To address this distortion problem, this paper proposes a statistical correction framework to recover the actual degradation curve. The main contributions are as follows: First, we developed a degradation correction model that automatically identifies oil top-up events while globally rectifying distorted data, achieving an accurate reconstruction of physical degradation curves. Second, we developed a three-step explosion search algorithm for robust parameter estimation. Notably, though our methodology was initially developed for wear degradation analysis, this data-driven framework demonstrates adaptability to broader degradation scenarios. Finally, numerical simulations and case studies confirmed the practical effectiveness of the proposed method.

Keywords: Prognostics and Health Management (PHM); condition monitoring; oil top-up; wear

1. Introduction

Prognostics and Health Management (PHM) focuses on condition monitoring and the extrapolation of degradation trends based on recent observations [1,2,3]. Among the established techniques for monitoring the state of machinery, oil analysis is the most popular. Oil monitoring is mainly concerned with the analysis of sampled oil to detect wear-related problems [4,5,6,7]. An advisory report by Shell indicates that oil contamination causes approximately 70% of in-service failures in diesel engines, of which 50% are the result of wear-related problems [8]. Wear is complex and involves multiple friction pairs, such as wet clutches, transmission gears, sealing rings, bearings, and confluence planetary gear trains.

The wear describes the degradation of mechanical systems to some degree. However, unless the wear is directly observable, such as in the case of a brake pad, it is difficult to quantify it in general. For most mechanical systems, wear is not directly observable and can only be assessed via other measured condition information data such as the concentrations of metal elements. The concentration of metal within oil samples is a good indicator of wear and has been widely used for wear evaluation [8,9,10,11,12]. Wang [8] used metal concentrations to describe the deterioration of a maintained plant. Vališ et al. [9] considered the metal concentration as a potential failure indicator and then proposed a linear regression model to determine a linear course of metal particle generation. Zheng et al. [10] used the Wiener process to model metal concentrations and thus optimized the planned maintenance interval of the power shift steering transmission.

However, in practical scenarios, many factors such as oil top-ups, wear particle removal, and different measurement methods can disturb the observation of the metal concentration, thereby causing the observed data to deviate from the actual physical degradation. This distortion (deviation) indicates that it is infeasible to directly use the observed data to evaluate the wear condition. In other words, it is necessary to recover the actual physical degradation curve based on the observed data.

Macián et al. [13] considered the influence of oil consumption and replenishment and developed an analytical approach to the wear rate of internal combustion engines based on spectrometric oil analysis. Bagshaw et al. [14] used oil volume correction to transform the observed data into absolute wear values. Feng et al. [15] considered the effect of the time-varying wear coefficient. A model of wear debris concentration was built based on Kragelsky’s method with different wear coefficients in corresponding wear stages. However, the literature mentioned above assumes that the refilling is carried out at predetermined maintenance intervals [13,14,15,16] and does not consider dynamic, non-periodic oil top-ups driven by real-time operational conditions. Moreover, the above models were established based on particular mechanical structures such as an internal combustion engine, a continuous flow stirred tank reactor (CFSTR), or a gearbox, thereby restricting their applications.

Considering the disturbance caused by non-periodic oil top-ups, a piecewise linear method was employed to address the resulting variations in oil spectrum analysis [17,18]. The Hermite interpolation approach was used to model the wear characteristics [19]. The grey model and AR time series model were also adopted [20]. However, these methods rely on subjective judgments to identify oil top-up points and use local observed information to correct the data. This indicates that these methods not only lack objectivity but also do not make full use of existing observation information.

This paper focuses on rectifying observed degradation data distorted by non-periodic oil top-ups. As shown in Figure 1 (Vehicle 1 dataset), the observed data exclusively contain Cu (copper) concentration measurements with the corresponding sampling time but lack documentation of oil top-up times and quantities. Detailed experimental information is provided in Section 4. It is evident that the acquired Cu concentration exhibits non-monotonic behavior despite the inherent monotonicity of mechanical wear processes. This discrepancy stems from dilution effects caused by fresh oil additions. The contributions of this paper can be summarized as follows:

A statistical correction model is proposed to process degradation data that are disturbed by non-periodic oil top-ups. The proposed model can not only automatically identify the locations of oil top-up points but also globally correct the observed degradation data.
A three-step explosion search algorithm is developed, enabling robust parameter estimation through both parametric and non-parametric approaches.
The corrected degradation data can reflect the actual physical degradation; thus, these corrected data can be used to assess system conditions and predict a failure.

Notably, while the methodology is developed through wear degradation analysis, its data-driven architecture ensures generalizability to various degradation monitoring scenarios requiring measurement distortion correction.

The rest of the paper is organized as follows: The proposed methodology is presented in detail in Section 2. In Section 3, we perform some numerical simulations. Case studies are shown in Section 4. Section 5 provides discussion and conclusions.

2. Methodology

2.1. Data and Model

The actual physical degradation serves as a “health indicator” for system wear; however, the observation of the physical degradation process is affected by oil top-ups. Specifically, once fresh oil is added, the observed degradation data, such as the concentration of metal elements, are diluted. Therefore, it is necessary to correct the observed degradation data. Before establishing the model, we make the following assumption: if the mass of the added oil is unknown, the observed degradation data, namely, the concentrations of metal elements, are diluted by the same ratio each time fresh oil is added. This means that the oil top-up effect can be described by a constant.

The definitions of all variables and mathematical symbols are listed in Table 1. If the observed error cannot be neglected, then the observed degradation is random, namely,

$$y_{i} = f_{i} + ε_{i}, \tag{1}$$

where $y_{i}$ is the observed degradation at the $i$th sampling time $t_{i}$, $f_{i}$ is the observed degradation without measurement error, and $ε_{i}$ is the random error, which follows a normal distribution with zero mean and standard deviation $σ$.

We established a model to describe the relationship between observed degradation data without measurement errors and the actual physical degradation process. Typically, we consider three cases: complete observation data, partially missing data, and completely missing data.

Case I: The observed data consist of $\{(t_{i}, y_{i}, z_{i}, δ_{i})\}_{i = 1}^{n}$, representing complete observations. If fresh oil is added after the $(i - 1)$th observation, then the concentration becomes $z_{i} f_{i - 1}$; otherwise, it remains $f_{i - 1}$. Therefore, neglecting measurement errors, the observed degradation $f_{i}$ consists of two components: the cumulative degradation $z_{i}^{δ_{i}} f_{i - 1}$ and the increment in the degradation process $h_{i} - h_{i - 1}$. Formally, the correction model can be expressed as follows:

$$f_{i} = \begin{cases} h_{i}, & i = 1; \\ h_{i} - h_{i - 1} + z_{i}^{δ_{i}} f_{i - 1}, & i = 2, \dots, n. \end{cases} \tag{2}$$

Case II: The observed data consist of $\{(t_{i}, y_{i}, δ_{i})\}_{i = 1}^{n}$, which are partially missing observations. Since the individual effects $z_{i}$ of the oil top-ups are unknown, their mean value $α$ is adopted. Therefore, in this case, the model can be represented as follows:

$$f_{i} = \begin{cases} h_{i}, & i = 1; \\ h_{i} - h_{i - 1} + α^{δ_{i}} f_{i - 1}, & i = 2, \dots, n. \end{cases} \tag{3}$$

Case III: The observed data consist of $\{(t_{i}, y_{i})\}_{i = 1}^{n}$, which are completely missing observations: neither the top-up indicators nor the top-up effects are recorded. The model is again given by Equation (3); however, $δ_{i}$ is unknown in this case.
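The recursion in Equation (3) can be illustrated with a short sketch. The function name `observed_path` and the toy numbers are illustrative assumptions, not part of the paper; the sketch only shows how a monotone degradation sequence becomes non-monotone once a top-up dilutes the accumulated concentration.

```python
from typing import List, Sequence


def observed_path(h: Sequence[float], delta: Sequence[int], alpha: float) -> List[float]:
    """Error-free observed degradation f computed from actual degradation h.

    Implements f_1 = h_1 and f_i = h_i - h_{i-1} + alpha**delta_i * f_{i-1}.
    delta[i] == 1 means fresh oil was added after the previous observation;
    delta[0] is ignored because the first observation is never diluted.
    """
    if len(h) != len(delta):
        raise ValueError("h and delta must have the same length")
    f = [h[0]]
    for i in range(1, len(h)):
        dilution = alpha if delta[i] == 1 else 1.0  # alpha ** delta_i
        f.append(h[i] - h[i - 1] + dilution * f[i - 1])
    return f


# Toy example: monotone wear, one top-up before the third observation.
h = [10.0, 12.0, 14.0, 16.0]
delta = [0, 0, 1, 0]
print(observed_path(h, delta, alpha=0.5))  # [10.0, 12.0, 8.0, 10.0]
```

In the toy output the third value drops from 12 to 8 although the actual degradation keeps increasing, which is the distortion the correction model is meant to undo.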

2.2. Parameter Estimation

Engineers are primarily interested in the actual degradation data $h_{i}$. Case II serves as a bridge between Case I and Case III; thus, we start with Case II to explore the parameter estimation approaches.

2.2.1. Parametric Estimation Approach

If the parametric representation of the physical degradation process can be derived based on available physical theories, then the parametric estimation approach is a good choice. Here, the physical degradation $h$ is a function of the observed time $t$ and the unknown parameter vector $θ$, given by $h (t, θ)$.

Through a simple derivation, Equation (3) for Case II can be reformulated as follows:

$$f (t_{i}, θ, α) = \begin{cases} h (t_{i}, θ), & i = 1, \\ h (t_{i}, θ) + (α^{δ_{i}} - 1) h (t_{i - 1}, θ), & i = 2, \\ h (t_{i}, θ) + (α^{δ_{i}} - 1) h (t_{i - 1}, θ) + \sum_{k = 1}^{i - 2} \left[ (α^{δ_{k + 1}} - 1) h (t_{k}, θ) \prod_{j = k + 2}^{i} α^{δ_{j}} \right], & \text{otherwise}, \end{cases}$$

where $f (t_{i}, θ, α) \equiv f_{i}$ and $h (t_{i}, θ) \equiv h_{i}$. Under the assumption that the observed error follows the normal distribution, the likelihood function is given by

$$L_{n} (θ, α, σ) = \frac{1}{(2 π)^{n / 2} σ^{n}} \exp \left( - \frac{\sum_{i = 1}^{n} (f (t_{i}, θ, α) - y_{i})^{2}}{2 σ^{2}} \right). \tag{4}$$

We use traditional optimization methods such as the Gauss–Newton iteration method to maximize this likelihood function. The resulting maximum likelihood estimates of the unknown parameters are denoted by $(\hat{θ}, \hat{α}, \hat{σ})$.

Similarly, we can obtain the maximum likelihood estimates $(\hat{θ}, \hat{σ})$ for Case I. Compared with Case I, Case II has one additional parameter, $α$.
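For normally distributed errors, maximizing the likelihood over $(θ, α)$ is equivalent to minimizing the residual sum of squares. The sketch below illustrates this for Case II under two assumptions that are not taken from the paper: the degradation is linear, $h(t) = a + bt$, and a grid over $α$ combined with closed-form least squares replaces the Gauss–Newton iteration named in the text. All function and variable names are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def forward(h: Sequence[float], delta: Sequence[int], alpha: float) -> List[float]:
    """Error-free observations f from degradation values h via Equation (3)."""
    f = [h[0]]
    for i in range(1, len(h)):
        factor = alpha if delta[i] else 1.0  # alpha ** delta_i
        f.append(h[i] - h[i - 1] + factor * f[i - 1])
    return f


def fit_linear_case2(t: Sequence[float], y: Sequence[float],
                     delta: Sequence[int], grid_size: int = 200) -> Dict[str, float]:
    """Profile least squares for h(t) = a + b*t with known top-up indicators."""
    n = len(t)
    best = None
    for g in range(1, grid_size + 1):
        alpha = g / grid_size
        # Equation (3) is linear in h, so f = a*u + b*v for a fixed alpha,
        # where u and v are the model responses to h = 1 and h = t.
        u = forward([1.0] * n, delta, alpha)
        v = forward(list(t), delta, alpha)
        suu = sum(ui * ui for ui in u)
        suv = sum(ui * vi for ui, vi in zip(u, v))
        svv = sum(vi * vi for vi in v)
        suy = sum(ui * yi for ui, yi in zip(u, y))
        svy = sum(vi * yi for vi, yi in zip(v, y))
        det = suu * svv - suv * suv
        if abs(det) < 1e-12:
            continue  # skip a degenerate design
        a = (suy * svv - svy * suv) / det
        b = (svy * suu - suy * suv) / det
        rss = sum((a * ui + b * vi - yi) ** 2 for ui, vi, yi in zip(u, v, y))
        if best is None or rss < best["rss"]:
            best = {"a": a, "b": b, "alpha": alpha, "rss": rss}
    if best is None:
        raise ValueError("no admissible alpha on the grid")
    best["sigma"] = math.sqrt(best["rss"] / n)  # maximum likelihood estimate of sigma
    return best


# Noise-free toy data generated with a = 10, b = 30, alpha = 0.6.
t = [i / 29 for i in range(30)]
delta = [1 if i in (5, 12, 20) else 0 for i in range(30)]
y = forward([10.0 + 30.0 * ti for ti in t], delta, 0.6)
est = fit_linear_case2(t, y, delta)
print(round(est["a"], 6), round(est["b"], 6), est["alpha"])
```

For a non-linear $h(t, θ)$ the inner step would no longer be a closed-form solve, and an iterative optimizer such as the one named in the text would be needed instead.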

Under Case III, the locations of oil top-up points are unknown; therefore, it is necessary to identify them. Inspired by the monotonic non-decreasing nature of the physical degradation, we propose an explosion search algorithm, which detects the oil top-up points from the decreased points. Note that the decreased point refers to the point where the observed degradation is lower than the previous point.

More specifically, the indicator of whether fresh oil is added may be 1 or 0 at a decreased point, while it is fixed to 0 at an increased point. Therefore, with $d$ decreased points, there are $2^{d}$ candidate degradation paths. For each candidate degradation path, we calculate its maximum likelihood estimate based on Equation (4). We thus obtain $2^{d}$ estimates and select the one with the minimum residual sum of squares as the final estimate. The procedure for the parametric estimation approach for Case III is described in detail by Algorithm 1.

Algorithm 1: The three-step explosion search algorithm

Step 1: Find the candidate oil top-up paths:
    Find the locations of the decreased points;
    Calculate the number of the decreased points, denoted by $d$;
    Define the candidate oil top-up paths $δ_{i}$, $i = 1, \dots, 2^{d}$.
Step 2: Calculate the maximum likelihood estimation of each candidate oil top-up path:
    for all $i = 1, \dots, 2^{d}$ do
        Calculate the maximum likelihood estimates $({\hat{θ}}_{i}, {\hat{α}}_{i}, {\hat{σ}}_{i})$ by Equation (4), where $δ = δ_{i}$;
        Calculate the residual sum of squares, $re_{i} = \sum_{j = 1}^{n} {(y_{j} - f (t_{j}, {\hat{θ}}_{i}, {\hat{α}}_{i}))}^{2}$.
    end for
Step 3: Obtain the final estimation:
    $j = \arg \min_{i} re_{i}$, $\hat{θ} = {\hat{θ}}_{j}$, $\hat{α} = {\hat{α}}_{j}$, $\hat{σ} = {\hat{σ}}_{j}$.
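The three steps of Algorithm 1 can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the fitting routine is passed in as a hypothetical callback `fit(delta)` that returns a residual sum of squares and the fitted parameters, and the toy callback at the end assumes a known degradation sequence and a fixed dilution of 0.5 purely to exercise the search.

```python
from itertools import product
from typing import Callable, Iterator, List, Sequence, Tuple


def decreased_points(y: Sequence[float]) -> List[int]:
    """Indices i where the observation is lower than the previous one."""
    return [i for i in range(1, len(y)) if y[i] < y[i - 1]]


def candidate_paths(y: Sequence[float]) -> Iterator[List[int]]:
    """Step 1: enumerate all 2**d indicator vectors over the d decreased points."""
    dec = decreased_points(y)
    for bits in product((0, 1), repeat=len(dec)):
        delta = [0] * len(y)  # increased points are fixed to 0
        for idx, bit in zip(dec, bits):
            delta[idx] = bit
        yield delta


def explosion_search(y: Sequence[float],
                     fit: Callable[[List[int]], Tuple[float, dict]]):
    """Steps 2 and 3: fit every candidate path and keep the smallest RSS."""
    best = None
    for delta in candidate_paths(y):
        rss, params = fit(delta)
        if best is None or rss < best[0]:
            best = (rss, delta, params)
    return best


# Toy usage: h and the dilution 0.5 are assumed known, only delta is searched.
h_known = [10.0, 12.0, 14.0, 16.0, 18.0]
y_obs = [10.0, 12.0, 8.0, 10.0, 7.0]


def toy_fit(delta: List[int]) -> Tuple[float, dict]:
    f = [h_known[0]]
    for i in range(1, len(h_known)):
        f.append(h_known[i] - h_known[i - 1] + (0.5 if delta[i] else 1.0) * f[i - 1])
    rss = sum((fi - yi) ** 2 for fi, yi in zip(f, y_obs))
    return rss, {"alpha": 0.5}


rss, delta_hat, params = explosion_search(y_obs, toy_fit)
print(delta_hat)  # [0, 0, 1, 0, 1]
```

The number of candidate paths doubles with every decreased point, so the exhaustive search is practical only when $d$ is moderate.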

In addition to point estimation, engineers may also be interested in interval estimation. We take Case II as an example to illustrate how to use the percentile bootstrap to construct confidence intervals.

(1): Generate bootstrap sample $y_{b}^{*} = (y_{1 b}^{*}, y_{2 b}^{*}, \dots, y_{n b}^{*})$, $b = 1, \dots, B$, where $y_{i b}^{*} = f (t_{i}, \hat{θ}, \hat{α}) + ε_{i b}^{*}$, $i = 1, \dots, n$, and $ε_{1 b}^{*}, \dots, ε_{n b}^{*}$ are independent and identically distributed errors generated from $N (0, {\hat{σ}}^{2})$.
(2): Calculate the maximum likelihood estimate ${\hat{θ}}_{b}^{*}$ of the $b$th bootstrap sample $y_{b}^{*}$, $b = 1, \dots, B$.
(3): The $100 (1 - α) \%$ percentile bootstrap interval of $h (t_{i}, θ)$ is given by

$[h^{l} (t_{i}), h^{u} (t_{i})],$

where $h^{l} (t_{i})$ and $h^{u} (t_{i})$ are the lower and upper $α / 2$ quantiles of $\{h (t_{i}, {\hat{θ}}_{b}^{*}), b = 1, \dots, B\}$, respectively. Here, $α$ denotes the significance level of the interval rather than the mean top-up effect in Equation (3).
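The three bootstrap steps can be sketched generically. In this sketch the refitting routine is a hypothetical callback `refit(y_star)` that returns the estimated degradation curve for one bootstrap sample; the quantile rule (linear interpolation between order statistics) and the toy constant-mean model in the usage lines are assumptions made for illustration.

```python
import random
from typing import Callable, List, Sequence, Tuple


def quantile(sorted_vals: Sequence[float], q: float) -> float:
    """Quantile of pre-sorted values by linear interpolation."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1.0 - frac) + sorted_vals[hi] * frac


def percentile_bootstrap(fitted: Sequence[float], sigma_hat: float,
                         refit: Callable[[List[float]], List[float]],
                         B: int = 500, level: float = 0.95,
                         seed: int = 0) -> Tuple[List[float], List[float]]:
    """Point-wise percentile bootstrap interval for the degradation curve.

    fitted holds f(t_i, theta_hat, alpha_hat); refit maps a bootstrap sample
    to the re-estimated curve h(t_i, theta_star_b).
    """
    rng = random.Random(seed)
    curves = []
    for _ in range(B):
        # Step (1): parametric resampling around the fitted values.
        y_star = [fi + rng.gauss(0.0, sigma_hat) for fi in fitted]
        # Step (2): re-estimate on the bootstrap sample.
        curves.append(refit(y_star))
    tail = (1.0 - level) / 2.0
    lower, upper = [], []
    for i in range(len(fitted)):
        # Step (3): point-wise quantiles over the B re-estimated curves.
        vals = sorted(curve[i] for curve in curves)
        lower.append(quantile(vals, tail))
        upper.append(quantile(vals, 1.0 - tail))
    return lower, upper


# Toy usage: a constant-mean model stands in for the real estimator.
def mean_refit(y_star: List[float]) -> List[float]:
    m = sum(y_star) / len(y_star)
    return [m] * len(y_star)


lower, upper = percentile_bootstrap([5.0] * 20, 1.0, mean_refit)
print(lower[0] < 5.0 < upper[0])
```

In practice `refit` would wrap the Case II estimator, so each bootstrap replicate costs one full model fit.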

2.2.2. Non-Parametric Estimation Approach

To avoid possible model misspecification in a parametric analysis, we consider an alternative approach, namely, the non-parametric estimation approach.

Under Case II, we use a natural cubic smoothing spline [21] to fit the actual physical degradation curve. The natural cubic smoothing spline is a twice continuously differentiable curve that minimizes the penalized sum of squares. The penalized sum of squares consists of two parts: the residual sum of squares that quantifies the goodness-of-fit to the data, and the roughness penalty term based on the second derivative. More specifically, we have the following optimization problem:

$$\begin{aligned} \min_{α, h} \quad & \sum_{i = 1}^{n} (f_{i} - y_{i})^{2} + λ \int (h'' (t))^{2} \, d t \\ \text{subject to} \quad & h_{i} - h_{i - 1} \geq 0, \quad i = 2, \dots, n, \\ & 0 \leq α \leq 1, \end{aligned} \tag{5}$$

where $h = (h_{1}, \dots, h_{n})^{T}$, $λ$ is a positive smoothing parameter that controls the trade-off between fidelity to the data and the roughness of the function estimate, and the constraint $h_{i} - h_{i - 1} \geq 0$ ensures, to some extent, that the estimate of the actual physical degradation curve is monotonically non-decreasing. The interior-point method is then used to solve this problem (see Appendix A for more details).

Under Case III, due to the absence of $δ = (δ_{1}, \dots, δ_{n})^{T}$, the three-step explosion search algorithm in the previous subsection is used in the non-parametric estimation approach, with its second step replaced by the non-parametric estimation based on Equation (5). Moreover, similarly to the parametric estimation approach, we can compute the bootstrap confidence interval for the non-parametric estimate.

2.3. Failure Prediction

The estimate of the actual physical degradation curve, which is also a correction to the observed degradation, can reflect the actual physical degradation well. Consequently, this estimate can be used to assess the system state and predict a failure. A failure occurs when the degradation curve $h (t)$ hits the failure threshold $ω$, which is empirically predetermined by engineers. According to the concept of First Hitting Time (FHT), the lifetime $T$ of the system is defined as

$$T = \inf \{ t : h (t) \geq ω \mid h (0) \leq ω \}.$$

Moreover, the Remaining Useful Life (RUL) at the last observation time $t_{n}$ is predicted as $T - t_{n}$.
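For a monotone parametric curve, the first hitting time has a closed form. The sketch below assumes the power-law form $h(t) = a + b t^{c}$ with $b > 0$ and $c > 0$, which is the form used later in the simulations and case studies; the function names and the numbers in the usage lines are illustrative only.

```python
def first_hitting_time(a: float, b: float, c: float, omega: float) -> float:
    """Smallest t >= 0 with a + b * t**c >= omega, assuming b > 0 and c > 0."""
    if b <= 0 or c <= 0:
        raise ValueError("b and c must be positive for an increasing curve")
    if omega <= a:
        return 0.0  # the threshold is already reached at t = 0
    return ((omega - a) / b) ** (1.0 / c)


def remaining_useful_life(a: float, b: float, c: float,
                          omega: float, t_n: float) -> float:
    """RUL = T - t_n, where t_n is the last observation time."""
    return first_hitting_time(a, b, c, omega) - t_n


# Illustrative numbers: h(t) = 12 + 28 * sqrt(t) reaches omega = 40 at t = 1.
print(first_hitting_time(12.0, 28.0, 0.5, 40.0))            # 1.0
print(remaining_useful_life(12.0, 28.0, 0.5, 40.0, 0.25))   # 0.75
```

For a non-parametric estimate of the curve, the hitting time would instead be located numerically, for example by extrapolating the fitted curve and searching for the first crossing of $ω$.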

2.4. Implementation Procedure

In order to illustrate how to use our methodology, we show its implementation procedure in Figure 2.

3. Numerical Simulations

To compare the proposed methodology with the piecewise linear method [17], two kinds of degradation functions are considered in this section.

Specifically, the observed degradation $y_{i}$ is given by $y_{i} = f_{i} + ε_{i}$, where $f_{i} = h_{i} - h_{i - 1} + z_{i}^{δ_{i}} f_{i - 1}$. Here, the actual degradation $h_{i}$ equals $h (t_{i})$. The observation times $t_{i}$ $(i = 1, \dots, n)$ with $n = 30$ are equally spaced in $[0, 1]$. The measurement error $ε_{i}$ follows a normal distribution with zero mean and standard deviation $σ = 0.5$. The top-up indicator $δ_{i}$ follows a Bernoulli distribution with parameter $p = 0.3$. The top-up effect $z_{i}$ follows a normal distribution with mean $α$ and variance $0.01$, and we set $α = 0.6$. We simulate 1000 datasets based on the above parameter configuration and calculate the estimates for each dataset.

Since it is impossible to enumerate all degradation patterns, here, we will only choose certain representative patterns for illustration. We consider the following two degradation scenarios for all the three cases defined in Section 2:

Scenario (I): The actual degradation $h (t)$ is linear given by $h (t) = a + b t$ where $a = 10, b = 30$ .
Scenario (II): The actual degradation $h (t)$ is non-linear given by $h (t) = a + b t^{c}$ where $a = 12, b = 28, c = 0.5$ .
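The data-generating configuration above can be sketched as a small simulator. The function name, the seed handling, and the convention that the first observation is never diluted ($δ_{1} = 0$) are assumptions of this sketch; the distributions and parameter values are those stated in the text.

```python
import math
import random
from typing import Callable, Dict, List, Optional


def simulate_dataset(h: Callable[[float], float], n: int = 30, sigma: float = 0.5,
                     p: float = 0.3, alpha: float = 0.6, z_var: float = 0.01,
                     seed: Optional[int] = None) -> Dict[str, List[float]]:
    """Generate one dataset (t, y, z, delta) plus the true curve h and f."""
    rng = random.Random(seed)
    t = [i / (n - 1) for i in range(n)]          # equally spaced in [0, 1]
    h_vals = [h(ti) for ti in t]
    # delta_1 is set to 0 here (assumption); later indicators are Bernoulli(p).
    delta = [0] + [1 if rng.random() < p else 0 for _ in range(n - 1)]
    z = [rng.gauss(alpha, math.sqrt(z_var)) for _ in range(n)]
    f = [h_vals[0]]
    for i in range(1, n):
        factor = z[i] if delta[i] else 1.0       # z_i ** delta_i
        f.append(h_vals[i] - h_vals[i - 1] + factor * f[i - 1])
    y = [fi + rng.gauss(0.0, sigma) for fi in f]  # add measurement error
    return {"t": t, "y": y, "z": z, "delta": delta, "h": h_vals, "f": f}


# Scenario (I): linear degradation; Scenario (II): non-linear degradation.
linear = simulate_dataset(lambda t: 10.0 + 30.0 * t, seed=1)
nonlinear = simulate_dataset(lambda t: 12.0 + 28.0 * t ** 0.5, seed=1)
print(len(linear["y"]), linear["h"][0], nonlinear["h"][-1])
```

Repeating the call with 1000 different seeds would give the replicated datasets used in the study; Case I keeps all four fields, Case II drops `z`, and Case III drops both `z` and `delta`.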

For the parameter estimations obtained by the parametric approach, we calculate the mean of these 1000 estimations, as well as the standard deviation (SD) and the mean square error (MSE). The results of Scenario (I) and those of Scenario (II) are shown in Table 2 and Table 3, respectively. We can see that the SD and MSE are very small for each parameter. Moreover, the mean values of estimations are very close to the true values. These results show that the proposed parametric approach indeed works well.
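The three summary statistics can be computed as in the sketch below. Two details are assumptions, since the text does not state them: the SD uses the sample (n − 1) denominator, and the MSE is the average squared deviation from the true value. The three-element input is a made-up example, not a result from the paper.

```python
import statistics
from typing import Sequence, Tuple


def summarize(estimates: Sequence[float], true_value: float) -> Tuple[float, float, float]:
    """Return (mean, SD, MSE) of replicated estimates of one parameter."""
    mean = statistics.fmean(estimates)
    sd = statistics.stdev(estimates)  # sample standard deviation (assumption)
    mse = statistics.fmean([(e - true_value) ** 2 for e in estimates])
    return mean, sd, mse


mean, sd, mse = summarize([9.0, 10.0, 11.0], true_value=10.0)
print(mean, sd, mse)
```

Since the MSE combines squared bias and variance, a small MSE together with a mean close to the true value indicates that both are small.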

The results obtained by the non-parametric approach in Scenario (I) and Scenario (II) are plotted in Figure 3 and Figure 4, respectively. The top, middle, and bottom rows display the non-parametric estimates for Case I, Case II, and Case III, respectively. The 95% point-wise confidence intervals are computed by taking the 2.5th percentile and the 97.5th percentile of the 1000 estimates at each observed time point. It can be seen that the mean values of the non-parametric estimates are very close to the actual degradation curve and the confidence intervals are narrow. These results indicate that the proposed non-parametric approach performs well.

Moreover, the piecewise linear method [17] is also used in our simulation, and its results are shown in Figure 3 and Figure 4 as well. Two observations can be made: (a) The mean values of the estimates obtained by the piecewise linear method are farther away from the actual degradation curve than those obtained by our non-parametric approach. (b) The confidence intervals obtained by the piecewise linear method are wider than those obtained by our non-parametric approach. These results imply that our non-parametric approach performs better than the piecewise linear method.

In addition, from Table 2 and Table 3 and Figure 3 and Figure 4, we can conclude that the estimates for Case I are more precise than those for Case II, and the estimates for Case II are better than those for Case III. This is reasonable, because the more complete the observed data, the more accurate the obtained estimates. Moreover, we computed the coverage probabilities for the 95% bootstrap confidence intervals, and we list them in Table 4 and Table 5. We can observe that the coverage probabilities are all around 95%. This indicates that our model and the corresponding estimation approaches are effective.

Overall, simulation studies show that our model and the corresponding estimation approaches perform well whether the degradation is linear or non-linear. Moreover, our methodology can recover the actual degradation curve better than the piecewise linear method; therefore, we can conclude that our methodology is superior to the piecewise linear method.

4. Case Studies

In this section, we validate the practical applicability of our proposed degradation model by investigating wear degradation data in the power-shift steering transmission (PSST) of tracked vehicles. PSST is characterized by its high loading capacity. The wear in PSST is complex, occurring by the plastic displacement of surface and near-surface material and by the detachment of particles. The metal particles of various shapes and sizes that are formed by the wear are distributed in the oil as contaminants.

Our experiment is an on-road test rather than a laboratory test. To be specific, the tracked vehicle runs on the road and the oil is sampled at discrete time points. Then, the spectrometric oil analysis technique is used to measure concentrations of metal particles within oil samples. In consideration of oil consumption, at some sampling points, fresh oil is added to the PSST after sampling. The Cu (copper) element represents the overall wear level. Therefore, the observations for Cu in the PSST of Vehicle 1 are shown in Figure 1 and listed in Table 6. We can see that the concentrations of Cu are non-monotone although the actual wear process is monotone. This is due to the fact that the addition of fresh oil reduces the concentration of Cu in the PSST.

Consequently, it is necessary to correct the observed data and thus recover the actual degradation curve. Unfortunately, as shown in Table 6, the data obtained only include the sampling time and the observed concentration, without the oil top-up time and the oil top-up quantity. Thus, we model these observed data based on Case III defined in Section 2 and then, respectively, use the parametric and non-parametric approaches to estimate the parameters of the model.

In the parametric approach, the parametric representation of the physical degradation process is critical. The fresh oil is added after running for some time; therefore, the first few observed concentrations are not affected by the fresh oil and can represent the actual physical degradation. According to the first few observed concentrations in Figure 1, we can assume that the running time and actual physical degradation are non-linearly expressed as follows:

\begin{matrix} h (t) = a + b t^{c}, \end{matrix}

where t is the time and

a, b, c

are unknown parameters.

In the non-parametric approach, as introduced in Section 2, we use a natural cubic smoothing spline to fit the actual physical degradation curve.

Figure 5 shows the results for the parametric and non-parametric approaches, including the observed data, the detected oil top-up points, the actual degradation (the corrected curve), and the bootstrap confidence intervals. We can see that the number of oil top-ups is relatively large. Frequent oil top-ups may be due to high-intensity work or a high leakage rate. Moreover, we notice that there is a big gap between the observed degradation and the corrected one. The “uncorrected” (observed) data are non-monotone and relative while the “corrected” (actual) data are monotone and absolute. This implies that the corrected curve reflects the actual physical degradation process better than the observed data. Therefore, it is meaningful to use our methodology to correct the observed data.

Based on actual experience, the failure threshold of Cu is set to 160 ppm. Figure 5 also shows the FHT for failure occurrence based on parametric and non-parametric approaches. Furthermore, we can see that the actual wear degradation has not reached the failure threshold yet, and thus Vehicle 1 can still work normally for a while.

The observed degradation data in the PSST of Vehicle 2 are shown in Table 7. Figure 6 shows the results obtained, respectively, by the parametric approach and the non-parametric approach. It can be seen that the number of oil top-ups is relatively small, and there is a small gap between the observed degradation and the corrected one.

According to the above two examples, our methodology can correct the distorted data and obtain an estimate of the actual degradation curve. The “uncorrected” (observed) concentration only indicates that wear is taking place, while the “corrected” (actual) concentration reflects the degree of wear. Therefore, our methodology can avoid the misdiagnosis of the wear condition to some degree.

Furthermore, we compared our methodology with the piecewise linear method [17]. The results shown in Figure 7 and Figure 8 can be summarized as follows:

The estimate of the actual degradation obtained by the piecewise linear method is the minimum among the three methods. This indicates that the possibility of a missed alarm is relatively high and the possibility of a false alarm is relatively low if the piecewise linear method is used. Therefore, if the engineers’ goal is to minimize the possibility of false alarms, then the piecewise linear method is a good choice.
The estimate of the actual degradation obtained by the non-parametric approach is the maximum among the three methods. This means that the possibility of a missed alarm is relatively low and the possibility of a false alarm is relatively high if the non-parametric approach is used. Consequently, when engineers want to pay more attention to the possibility of a missed alarm, the non-parametric approach should be used.
The estimation of the actual degradation obtained by the parametric approach is located in the middle of the three methods. Therefore, if engineers are very certain about the parametric representation of the physical degradation process, it is recommended to use the parametric approach.

It should be noted that the piecewise linear method has three drawbacks: Firstly, it corrects the observed degradation piece by piece using local observations, rather than global observations. Secondly, it relies on engineers’ subjective judgment to identify the disturbed points, thereby lacking objectivity. Thirdly, it assumes that the actual degradation curve is linear, whereas the actual degradation is non-linear in many scenarios.

In conclusion, compared with the piecewise linear method, our model and the corresponding estimation approaches have some obvious advantages, as follows:

Our model is established based on the global observed data, so it can locate more valuable information in the observed data.
Our model can identify the disturbed points automatically, thereby avoiding subjective judgment and improving the objectivity and accuracy.
Both the parametric approach and the non-parametric approach can describe various kinds of degradation patterns, whether linear or non-linear. Therefore, our approaches have wider applicability and practicability.

5. Discussion and Conclusions

Most actual degradation processes are monotone. However, when the observation is disturbed by external factors, the observed degradation data lose their monotonicity and become distorted. In this paper, we propose a model to characterize this distortion and develop the corresponding approaches for estimating the model.

Compared with the piecewise linear method [17], the simulation results demonstrate that our methodology can automatically identify disturbed points, thereby avoiding the subjective judgment required for the piecewise linear method, and can also more accurately recover the actual degradation curves. Importantly, our method is flexible enough to model both linear and non-linear degradation patterns, which expands its applicability and practical utility beyond that of the piecewise linear method. Case studies demonstrate that our methodology can describe the effect of disturbing factors, recover the actual degradation curve, and automatically identify the disturbed points. This is valuable for condition monitoring, because it avoids a false diagnosis of degradation to some degree.

We remark that our proposed model can be generalized to a situation where the physical degradation process is monotone while the observed process is not monotone due to the disturbance of a point process. Specifically, without observed error, the observation is a perturbation of the actual physical degradation process, as follows:

$$f (t) = h (t) + \int_{0}^{t} f (τ) \, d N (τ), \tag{6}$$

where the actual physical degradation $h (t)$ is a monotone process, such as a Gamma process or inverse Gaussian process; $N (t) = \sum_{t_{i} \leq t} (z_{i}^{δ_{i}} - 1)$ is a marked point process; the random variable $z_{i}$ indicates the influence of the disturbance; and the random variable $δ_{i}$ equals 1 when a disturbance occurs at the $i$th observed time $t_{i}$ and 0 otherwise. If the observed error cannot be neglected, then the observed degradation is as follows:

$$y (t) = f (t) + ε (t), \tag{7}$$

where $ε (t)$ is the stochastic observed error with $E (ε (t)) = 0$.

Particle morphology influences wear mechanism identification. Therefore, these morphological parameters should be incorporated into future investigations. Furthermore, while the current validation is limited to the PSST, extending model verification to multiple machinery types would strengthen the generalizability. Additionally, integrating data-driven methodology with equipment-specific engineering knowledge represents a critical direction for subsequent research.

Author Contributions

Conceptualization, Y.C.; methodology, Y.C.; supervision, J.S.; writing—review and editing, Y.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Youth Academic Innovation Team Construction project of Capital University of Economics and Business, grant No. QNTD202303.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Based on Equation (3), with a straightforward derivation, the first term in Equation (5) can be rewritten as $(f - y)^{T} (f - y) = (A_{α} h - y)^{T} (A_{α} h - y)$, where $f = (f_{1}, \dots, f_{n})^{T}$ and $A_{α}$ is an $n \times n$ matrix depending on $α$, given by

$$A_{α} = \begin{pmatrix} 1 & 0 & \dots & 0 & 0 \\ α^{δ_{2}} - 1 & 1 & \dots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ (α^{δ_{2}} - 1) \prod_{j = 3}^{n - 1} α^{δ_{j}} & (α^{δ_{3}} - 1) \prod_{j = 4}^{n - 1} α^{δ_{j}} & \dots & 1 & 0 \\ (α^{δ_{2}} - 1) \prod_{j = 3}^{n} α^{δ_{j}} & (α^{δ_{3}} - 1) \prod_{j = 4}^{n} α^{δ_{j}} & \dots & α^{δ_{n}} - 1 & 1 \end{pmatrix}.$$
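The matrix form can be checked numerically against the recursion in Equation (3). The sketch below builds $A_{α}$ entry by entry from the displayed pattern and multiplies it with a toy vector $h$; the function names and the toy numbers are illustrative assumptions.

```python
from typing import List, Sequence


def build_A(delta: Sequence[int], alpha: float) -> List[List[float]]:
    """Lower-triangular matrix A_alpha such that f = A_alpha @ h.

    With 1-based indices, A[i][k] = (alpha**delta_{k+1} - 1) *
    prod_{j=k+2..i} alpha**delta_j for k < i, and A[i][i] = 1.
    """
    n = len(delta)
    # d[m] holds alpha ** delta for the (m+1)th observation; d[0] is unused.
    d = [1.0] + [alpha if delta[m] else 1.0 for m in range(1, n)]
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = 1.0
        prod = 1.0  # running product of the dilution factors to the right
        for k in range(i - 1, -1, -1):
            A[i][k] = (d[k + 1] - 1.0) * prod
            prod *= d[k + 1]
    return A


def matvec(A: Sequence[Sequence[float]], x: Sequence[float]) -> List[float]:
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


# Same toy values as a direct evaluation of Equation (3) would give.
h = [10.0, 12.0, 14.0, 16.0]
delta = [0, 0, 1, 0]
print(matvec(build_A(delta, 0.5), h))  # [10.0, 12.0, 8.0, 10.0]
```

Because $A_{α}$ is linear in $h$ for a fixed $α$, the objective in Equation (5) is a quadratic function of $h$ once $α$ is held fixed.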

Note that we use a natural cubic smoothing spline to fit the data. Therefore, the roughness penalty has the form

$$\int (h''(t))^{2} \, dt = h^{T} K h,$$

where $K$ is an $n \times n$ matrix given by $K = Q^{T} R^{-1} Q$. Here, $Q$ is an $(n - 2) \times n$ matrix of second differences with elements $Q_{ii} = 1 / τ_{i}$, $Q_{i, i + 1} = -1 / τ_{i} - 1 / τ_{i + 1}$, and $Q_{i, i + 2} = 1 / τ_{i + 1}$; $R$ is an $(n - 2) \times (n - 2)$ symmetric tri-diagonal matrix with elements $R_{i - 1, i} = R_{i, i - 1} = τ_{i} / 6$ and $R_{ii} = (τ_{i} + τ_{i + 1}) / 3$, where $τ_{i} = t_{i + 1} - t_{i}$ is the distance between successive knots.
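As a minimal sketch of how this penalty can be evaluated, the code below computes $h^{T} K h = (Q h)^{T} R^{-1} (Q h)$ without forming $K$ explicitly. It is an illustration under stated assumptions rather than the authors' implementation: the function name is hypothetical, and the tridiagonal system is solved with the standard Thomas algorithm, which is our choice and is not mentioned in the text.

```python
def roughness_penalty(t, h):
    """Evaluate h^T K h with K = Q^T R^{-1} Q for knots t and values h (n >= 3)."""
    n = len(t)
    m = n - 2
    tau = [t[i + 1] - t[i] for i in range(n - 1)]

    # gamma = Q h: one scaled second difference per interior knot.
    gamma = [
        h[i] / tau[i]
        - h[i + 1] * (1.0 / tau[i] + 1.0 / tau[i + 1])
        + h[i + 2] / tau[i + 1]
        for i in range(m)
    ]

    # R is symmetric tridiagonal: diagonal (tau_i + tau_{i+1}) / 3,
    # off-diagonal tau_i / 6 coupling rows i-1 and i.
    diag = [(tau[i] + tau[i + 1]) / 3.0 for i in range(m)]
    off = [tau[i] / 6.0 for i in range(1, m)]

    # Thomas algorithm: solve R x = gamma.
    c = [0.0] * m
    d = [0.0] * m
    c[0] = off[0] / diag[0] if m > 1 else 0.0
    d[0] = gamma[0] / diag[0]
    for i in range(1, m):
        denom = diag[i] - off[i - 1] * c[i - 1]
        c[i] = off[i] / denom if i < m - 1 else 0.0
        d[i] = (gamma[i] - off[i - 1] * d[i - 1]) / denom
    x = [0.0] * m
    x[-1] = d[-1]
    for i in range(m - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]

    # Penalty = gamma^T R^{-1} gamma.
    return sum(g * xi for g, xi in zip(gamma, x))


# A straight line has zero curvature, so its penalty vanishes.
t_line = [0.0, 13.5, 26.0, 41.0, 50.0]
penalty_line = roughness_penalty(t_line, [2.0 * ti + 1.0 for ti in t_line])

# Small hand-checkable cases on unit-spaced knots.
penalty_peak = roughness_penalty([0.0, 1.0, 2.0], [0.0, 1.0, 0.0])
penalty_zigzag = roughness_penalty([0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 0.0, 1.0])
```

For the three-knot case, the natural cubic spline through $(0,0)$, $(1,1)$, $(2,0)$ has second derivative $-3$ at the middle knot, which gives a penalty of 6.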

Above all, Equation (5) can be transformed into the following form:

$$\begin{aligned} \min_{α, h} \quad & (A_{α} h - y)^{T} (A_{α} h - y) + λ h^{T} K h \\ \text{subject to} \quad & B h \geq 0, \\ & 0 \leq α \leq 1, \end{aligned}$$

where $B$ is an $(n - 1) \times n$ matrix given by

$$B = \begin{pmatrix} -1 & 1 & 0 & \dots & 0 & 0 \\ 0 & -1 & 1 & \dots & 0 & 0 \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ 0 & 0 & 0 & \dots & -1 & 1 \end{pmatrix}.$$

Obviously, the interior-point method can be applied to the above optimization problem. In addition, a grid search is used to select the parameter $λ$.
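To make the structure of this problem explicit, the objective can be expanded for a fixed value of $α$. The expansion below is a routine identity added for illustration; it is not part of the original derivation.

```latex
\begin{aligned}
(A_{α} h - y)^{T} (A_{α} h - y) + λ h^{T} K h
  &= h^{T} \left( A_{α}^{T} A_{α} + λ K \right) h
     - 2\, y^{T} A_{α} h + y^{T} y .
\end{aligned}
```

Since $A_{α}^{T} A_{α}$ and $K = Q^{T} R^{-1} Q$ are both positive semi-definite and $λ \geq 0$, the problem is a convex quadratic program in $h$ with the linear constraints $B h \geq 0$ whenever $α$ is held fixed. One possible reading, stated here as an assumption rather than as the authors' procedure, is that the scalar $α \in [0, 1]$ could then be handled by an outer one-dimensional search.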

References

1. Sikorska, J.Z.; Hodkiewicz, M.; Ma, L. Prognostic modelling options for remaining useful life estimation by industry. Mech. Syst. Signal Process. 2011, 25, 1803–1836.
2. Zhang, Z.; Si, X.; Hu, C.; Lei, Y. Degradation data analysis and remaining useful life estimation: A review on Wiener-Process-based methods. Eur. J. Oper. Res. 2018, 271, 775–796.
3. Cao, W.; Dong, G.; Chen, W.; Wu, J.; Xie, Y.B. Correction strategies of debris concentration for engine wear monitoring via online visual ferrograph. Proc. Inst. Mech. Eng. Part J J. Eng. Tribol. 2015, 229, 1319–1329.
4. Zhu, X.; Zhong, C.; Zhe, J. Lubricating oil conditioning sensors for online system health monitoring—A review. Tribol. Int. 2017, 109, 473–484.
5. Zhu, J.; Yoon, J.; He, D.; Qiu, B.; Bechhoefer, E. Online condition monitoring and remaining useful life prediction of particle contaminated lubrication oil. In Proceedings of the IEEE Conference on Prognostics and Health Management (PHM), Gaithersburg, MD, USA, 24–27 June 2013; pp. 1–14.
6. Du, Y.; Wu, T.; Makis, V. Parameter estimation and remaining useful life prediction of lubricating oil with HMM. Wear 2017, 376–377, 1227–1233.
7. Fan, B.; Li, B.; Feng, S.; Mao, J.; Xie, Y.B. Modeling and experimental investigations on relationship between wear debris concentration and wear rate lubrication systems. Tribol. Int. 2017, 109, 114–123.
8. Wang, W. A prognosis model for wear prediction based on oil-based monitoring. J. Oper. Res. Soc. 2007, 58, 887–893.
9. Vališ, D.; Žák, L.; Pokora, O. Failure prediction of diesel engine based on occurrence of selected wear particles in oil. Eng. Fail. Anal. 2015, 256, 501–511.
10. Zheng, C.S.; Liu, P.; Liu, Y.; Zhang, Z.L. Oil-based maintenance interval optimization for power-shift steering transmission. Adv. Mech. Eng. 2018, 10, 1–8.
11. Pradhan, D.; Mishra, A.K. Analysis of ISO VG 68 bearing oil for condition monitoring collected from an externally pressurized ball bearing system. Mater. Today Proc. 2021, 44, 4602–4606.
12. Wang, W.; Hussin, B. Plant residual time modelling based on observed variables in oil samples. J. Oper. Res. Soc. 2009, 60, 789–796.
13. Macián, V.; Tormos, B.; Olmeda, P.; Montoro, L. Analytical approach to wear rate determination for internal combustion engine condition monitoring based on oil analysis. Tribol. Int. 2003, 36, 771–776.
14. Bagshawa, J.A.; Fox, M.F.; Jones, C.J.; Picken, D.J.; Seare, K.D.R. The continuous flow stirred tank reactor (CFSTR) model and used oil volume corrections in condition monitoring. Tribol. Int. 1997, 30, 271–274.
15. Feng, S.; Fan, B.; Mao, J.; Xie, Y. Prediction on wear of a spur gearbox by on-line wear debris concentration monitoring. Wear 2015, 336–337, 1–8.
16. Wang, C.J.; Hu, Q.P.; Yu, D. Reliability evaluation under accelerated degradation testing with recovery capability considered. In Proceedings of the 2019 Annual Reliability and Maintainability Symposium (RAMS), Orlando, FL, USA, 28–31 January 2019; pp. 1–5.
17. Xu, T.F.; Yan, X.P.; Sheng, C.X. The research of avoiding the disturbing of oil changing in spectroscopic analysis on diesel engine oil. Lub. Eng. 2006, 7, 71–75.
18. Ji, S.D.; Yang, T.J.; Liu, X.H.; Pei, W. The treatment of changing oil disturbance in diesel engine oil spectrum analysis. Veh. Eng. 2011, 2, 86–89.
19. Ji, S.D.; Xu, S.Y.; Liu, X.H.; Yang, T.J. The pretreatment of spectral analysis data in diesel engine oil diagnosis. Lub. Eng. 2011, 36, 105–108.
20. Gao, J.W.; Zhang, Y.T.; Ren, G.Q.; Zhang, X. Study on diesel engine oil spectrum analysis and prediction model. Des. Manuf. Diesel Engine 2004, 3, 26–28.
21. Gu, C. Smoothing Spline ANOVA Models; Springer: Berlin/Heidelberg, Germany, 2002.

Figure 1. The concentration of Cu in the power shift steering transmission of Vehicle 1.

Figure 2. The implementation procedure of our methodology.

Figure 3. Scenario (I): The results obtained by the non-parametric approach and the piecewise linear method.

Figure 4. Scenario (II): The results obtained by the non-parametric approach and the piecewise linear method.

Figure 5. Vehicle 1: the relationship between time and degradation.

Figure 6. Vehicle 2: the relationship between time and degradation.

Figure 7. The performance comparison for Vehicle 1.

Figure 8. The performance comparison for Vehicle 2.

Table 1. Nomenclature: summary of all variables and mathematical symbols.

Description	Symbol
The ith observed time	$t_{i}$
Actual physical degradation at ith observed time	$h_{i}$
Observed degradation with observed error at ith observed time	$y_{i}$
Observed degradation without observed error at ith observed time	$f_{i}$
Observed error at ith observed time	$ε_{i}$
Indicator of whether the fresh oil is added	$δ_{i}$
Dilution effect of fresh oil at ith observed time	$z_{i}$
Mean of the dilution effect	$α$
Number of observations	$n$
Standard deviation of observed errors	$σ$

Table 2. Scenario (I): The results obtained by the parametric approach for Case I, Case II, and Case III.

Data Type	Parameter	True Value	Mean of Estimations	SD	MSE
Case I	a	10	9.9991	0.2611	0.0682
	b	30	29.9161	0.4583	0.2171
	$σ$	0.5	0.4788	0.0568	0.0037
Case II	a	10	10.0175	0.2962	0.0881
	b	30	30.0401	1.2964	1.6823
	$α$	0.6	0.5985	0.0158	0.0003
	$σ$	0.5	0.4789	0.0551	0.0035
Case III	a	10	10.0277	0.3053	0.0939
	b	30	29.5139	2.5023	6.4975
	$α$	0.6	0.6004	0.0189	0.0003
	$σ$	0.5	0.4907	0.0681	0.0047
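The MSE column can be read through the standard decomposition MSE = SD² + bias², where the bias is the mean of the estimates minus the true value. The short check below is a sketch for orientation only: the rows are copied from Table 2, the function name is hypothetical, and small discrepancies are expected because the table entries are rounded and the SD convention (divisor $N$ or $N - 1$) is not stated.

```python
# (label, true value, mean of estimations, SD, reported MSE), copied from Table 2.
table2_rows = [
    ("Case I, a", 10.0, 9.9991, 0.2611, 0.0682),
    ("Case I, b", 30.0, 29.9161, 0.4583, 0.2171),
    ("Case II, b", 30.0, 30.0401, 1.2964, 1.6823),
    ("Case II, alpha", 0.6, 0.5985, 0.0158, 0.0003),
    ("Case III, b", 30.0, 29.5139, 2.5023, 6.4975),
    ("Case III, sigma", 0.5, 0.4907, 0.0681, 0.0047),
]


def mse_from_bias_and_sd(true_value, mean_estimate, sd):
    """Mean squared error as variance plus squared bias."""
    bias = mean_estimate - true_value
    return sd ** 2 + bias ** 2


# Largest gap between the recomputed and the reported MSE over the rows above.
max_gap = max(
    abs(mse_from_bias_and_sd(true, mean, sd) - reported)
    for _, true, mean, sd, reported in table2_rows
)
```

For Case III, the MSE of $b$ is dominated by the variance term (about 6.26), with the squared bias contributing roughly 0.24.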

Table 3. Scenario (II): The results obtained by the parametric approach for Case I, Case II, and Case III.

Data Type	Parameter	True Value	Mean of Estimations	SD	MSE
Case I	a	12	12.0495	0.4588	0.2129
	b	28	27.8762	0.5891	0.3623
	c	0.5	0.5017	0.0175	0.0003
	$σ$	0.5	0.4699	0.0554	0.0040
Case II	a	12	12.0451	0.4679	0.2210
	b	28	27.8883	1.4627	2.1520
	c	0.5	0.5018	0.0211	0.0004
	$α$	0.6	0.6005	0.0148	0.0002
	$σ$	0.5	0.4704	0.0535	0.0037
Case III	a	12	12.0435	0.4871	0.2392
	b	28	27.5232	2.5541	6.7508
	c	0.5	0.5227	0.1092	0.0124
	$α$	0.6	0.6019	0.0156	0.0002
	$σ$	0.5	0.4905	0.1077	0.0117

Table 4. Scenario (I): The coverage probabilities for $h(t)$ with $t = 0.25$, $0.50$, $0.75$, and $1.00$.

Time	Parametric Approach, Case I	Parametric Approach, Case II	Non-Parametric Approach, Case I	Non-Parametric Approach, Case II
0.25	0.951	0.961	0.948	0.946
0.50	0.960	0.951	0.953	0.946
0.75	0.970	0.961	0.951	0.949
1.00	0.962	0.952	0.952	0.947

Table 5. Scenario (II): The coverage probabilities for $h(t)$ with $t = 0.25$, $0.50$, $0.75$, and $1.00$.

Time	Parametric Approach, Case I	Parametric Approach, Case II	Non-Parametric Approach, Case I	Non-Parametric Approach, Case II
0.25	0.950	0.960	0.948	0.945
0.50	0.953	0.959	0.952	0.948
0.75	0.962	0.956	0.950	0.951
1.00	0.950	0.952	0.965	0.943

Table 6. The observed data for Vehicle 1.

Serial Number	Sampling Time [h]	Observed Concentration [ppm]
1	0	27.7000
2	13.5	40.1500
3	26	49.6500
4	41	57.9000
5	50	47.3000
6	63.5	43.9500
7	76	42.9500
8	91	44.7500
9	100	45.9000
10	113.5	23.6500
11	126	26.6500
12	141	25.4500
13	150	34.2500
14	163.5	23.7500
15	176	27.7000
16	191	30.0500
17	200	37.2500
18	213.5	23.0000
19	226	25.5000
20	241	19.1000
21	250	22.7000
22	263.5	23.6000
23	276	18.8000
24	291	19.8000
25	300	20.4000
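Because the constraint $B h \geq 0$ in Appendix A keeps the actual degradation non-decreasing, a marked drop between successive observations must come from dilution or observed error. The sketch below is a crude screening of the Table 6 data for such drops. It is not the paper's estimation procedure, and the function name and the 5 ppm threshold are hypothetical choices made only for illustration.

```python
# (sampling time [h], observed concentration [ppm]), copied from Table 6.
vehicle1 = [
    (0, 27.70), (13.5, 40.15), (26, 49.65), (41, 57.90), (50, 47.30),
    (63.5, 43.95), (76, 42.95), (91, 44.75), (100, 45.90), (113.5, 23.65),
    (126, 26.65), (141, 25.45), (150, 34.25), (163.5, 23.75), (176, 27.70),
    (191, 30.05), (200, 37.25), (213.5, 23.00), (226, 25.50), (241, 19.10),
    (250, 22.70), (263.5, 23.60), (276, 18.80), (291, 19.80), (300, 20.40),
]


def flag_drops(samples, min_drop):
    """Return 1-based serial numbers whose value fell by more than min_drop."""
    flagged = []
    for k in range(1, len(samples)):
        drop = samples[k - 1][1] - samples[k][1]
        if drop > min_drop:
            flagged.append(k + 1)
    return flagged


# Hypothetical threshold of 5 ppm; smaller dips are treated as noise here.
candidates = flag_drops(vehicle1, min_drop=5.0)
candidate_times = [vehicle1[s - 1][0] for s in candidates]
```

With this threshold, the flagged samples are serial numbers 5, 10, 14, 18, and 20. A fixed threshold cannot separate dilution from noise reliably, which is the gap the model-based identification of top-up events is meant to close.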

Table 7. The observed data for Vehicle 2.

Serial Number	Sampling Time [h]	Observed Concentration [ppm]
1	2.5	15.0000
2	15	27.4000
3	18	32.1250
4	24	41.1500
5	30	39.8500
6	37.5	48.2750
7	42.5	50.6500
8	45	47.2000
9	50	50.1250
10	57.5	56.8000
11	65	59.2500
12	68	60.3000
13	80	67.1250
14	82.5	67.2250
15	87.5	70.1500
16	92.5	70.5750
17	95	73.0250
18	100	72.4750
19	102.5	76.9500
20	107.5	79.7667
21	115	67.4750
22	130	71.1250
23	132.5	74.7250
24	137.5	72.5250
25	142.5	78.6750
26	145	75.4000
27	150	80.1500
28	152.5	82.3750
29	157.5	85.2250
30	165	82.1750
31	180	95.5500
32	182.5	104.7750

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
