Remaining Useful Life Prediction of Roller Bearings Based on Fractional Brownian Motion

: Roller bearing degradation features fractal characteristics such as self-similarity and long-range dependence (LRD). However, the existing remaining useful life (RUL) prediction models are memoryless or short-range dependent. To this end, we propose a RUL prediction model based on fractional Brownian motion (FBM). Bearing faults can happen in different places, and thus their degradation features are difﬁcult to extract accurately. Through variational mode decomposition (VMD), the original degradation feature is decomposed into several components of different frequencies. The monotonicity, robustness and trends of the different components are calculated. The frequency component with the best metric values is selected as the training data. In this way, the performance of the prediction model is hugely improved. The unknown parameters in the degradation model are estimated by the maximum likelihood algorithm. The Monte Carlo method is applied to predict the RUL. A case study of a bearing is presented and the prediction performance is evaluated using multiple indicators.


Introduction
The prediction of the remaining useful life (RUL) of roller bearings in machinery has recently garnered interest [1] due to its significance in both mechanical engineering and the development of sophisticated mathematical models.RUL is the time range between the current time of perfect functioning and the failure [2] expected of all machineries due to their intensive usage and consumption.The accurate RUL prediction of roller bearings can be used to plan and forecast the repairment and maintenance of many industrial machineries, like, e.g., wind turbines, helicopters and all vehicle engines [3].However, degradation uncertainty modelling remains a bottleneck for the improvement of the prediction accuracy of RUL, which cannot be addressed by current deep learning approaches [4].
Due to the randomness of the degradation process [5], degradation models based on stochastic processes are becoming more and more popular and preferred by researchers [6].These models efficiently characterize the randomness of degradation, by, e.g., the Wiener process (Brownian motion) [7][8][9], the Gamma process [10][11][12] and the inverse Gaussian process [13,14].Li et al. [15] discussed the Wiener process model in a small sample case.Zhang et al. [16] summarized various RUL predictions based on the Wiener process.However, the above processes are memoryless processes.That is, their increments are independent.
Actually, long memory effects are very common in bearing degradation, which means the degradation states of different time points are highly correlated.This property is also called long-range dependence (LRD).Moreover, the autocorrelation function (ACF) of the degradation process with LRD is non-integrable [17].In this paper, fractional Brownian motion (FBM) [18] is proposed to deal with the long memory effects of these degradation data.
Considered as a fractional generalization of Brownian motion, the FBM is a Gaussian stochastic process with LRD and self-similarity properties.The Hurst index H, also known as the self-similarity parameter [19], is used to express the self-similarity characteristics of time series and to measure their degree of LRD.The self-similarity parameter range is (0, 1).
In particular, if H= 1/2, the process is memoryless Brownian motion.For 0.5 < H < 1, the process is persistent, and thus one of LRD.In such a case, the future changing trend is the same as the current fluctuation trend.When 0 < H < 0.5, the local process is anti-persistent.
Bearing faults can happen in the outer race and/or inner race.If the fault only happens in the outer race, the vibration transmission path from the fault point to the sensor remains unchanged.The amplitude of the shock response remains similar.Therefore, feature extraction is convenient.When the fault only happens in the inner race, the transmission path for the vibration changes periodically due to the rotation of the fault point.The shock response amplitude is also interfered with by the rotational frequency [20].Meanwhile, a large amount of energy is lost during the transmission process.Therefore, feature extraction becomes difficult.If faults happen in both areas, these two kinds of vibration signals become mixed in their characteristics, which makes vibration analysis more difficult.In this work, variational mode decomposition (VMD) [21] is proposed to improve the accuracy of feature extraction in roller bearing faults.
After the vibration signal decomposition, several different degradation components are constructed.In order to improve the proposed model, several statistical metrics are calculated for the different components.The importance of the degradation components can be visualized from their metric values.The best component will be utilized for model training.
In this paper, a novel RUL prediction model containing three different steps is proposed.In the degradation feature decomposition step, the extracted feature is decomposed into several degradation components through the VMD algorithm.In the component selection step, different metrics are used to evaluate the different degradation components.The best component is selected as the health indicator (HI).In the last step, the Monte Carlo algorithm is employed to obtain the RUL prediction.FBM with self-similarity and long-range dependence is employed to drive the degradation simulations.Based on the observation data, the unknown parameters are estimated by maximum likelihood estimation (MLE) [22].
The remaining sections of the paper are structured as follows: in Section 2, the fractal characteristics of the FBM are discussed.Section 3 provides the degradation modelling method based on FBM.In Section 4, a RUL prediction framework with three different steps is described.In Section 5, a set of actual bearing degradation data is used for RUL prediction, and the results are evaluated.Section 6 concludes the paper.

Long-Range Dependence and Fractional Brownian Motion
In this section, some fundamental properties of LRD and FBM are shortly summarized, by focusing, in particular, on the self-similarity property of FBM.

Fractional Brownian Motion
Brownian motion B(t) or the Wiener process is a stochastic process with the following properties: (1) Its increments B(t) − B(s), s < t are independent.
(2) Its increments B(t) − B(s), s < t follows a normal distribution with a zero mean.
To describe the fractal dimensions of some real data, Mandelbrot proposed a fractional extension of the Weiner process, i.e., FBM [23].For the kernel function (t − s) H−0.5 with s ≤ t, 0 ≤ t, FBM is described as a stochastic integral driven by Brownian motion.
In Figure 1, the time units represented by the X-axis are not standard time units such as seconds and minutes.This is because our research focuses on simulating the path of FBM, where each unit on the X-axis essentially represents a step or stage in the simulation process.This specific time unit selection, although non-standard, is crucial for this study as it allows us to express and analyze the temporal dynamic characteristics of FBM paths in a more precise way.
Brownian motion ( ) B t or the Wiener process is a stochastic process with the fol- lowing properties: 1) Its increments ( ) ( ), B t B s s t − < are independent.2) Its increments ( ) ( ), B t B s s t − < follows a normal distribution with a zero mean.
3) The stochastic process is continuous.
To describe the fractal dimensions of some real data, Mandelbrot proposed a fractional extension of the Weiner process, i.e., FBM [23].For the kernel function In Figure 1, the time units represented by the X-axis are not standard time units such as seconds and minutes.This is because our research focuses on simulating the path of FBM, where each unit on the X-axis essentially represents a step or stage in the simulation process.This specific time unit selection, although non-standard, is crucial for this study as it allows us to express and analyze the temporal dynamic characteristics of FBM paths in a more precise way.

Fractional Brownian Motion
We can see from Figure 1 that the irregular behavior of FBM in a small time interval resembles the whole behavior of the FBM, such that FBM is a self-similar stochastic process [24].The autocorrelation function (ACF) of FBM is expressed as 1 2 When the Hurst index is lower than 0.5, the decaying rate of the ACF is close to exponential.In such cases, the random process is short-range dependent.Brownian motion

Fractional Brownian Motion
We can see from Figure 1 that the irregular behavior of FBM in a small time interval resembles the whole behavior of the FBM, such that FBM is a self-similar stochastic process [24].The autocorrelation function (ACF) of FBM is expressed as Considering τ = t − s ≥ 0, we have When the Hurst index is lower than 0.5, the decaying rate of the ACF is close to exponential.In such cases, the random process is short-range dependent.Brownian motion is a special case of FBM where the Hurst exponent is 0.5 and the increments are independent.Only if the Hurst exponent is greater than 0.5 is FBM long-range dependent.

Stochastic Model Based on FBM
For a degradation process with independent increments, the degradation model is represented by [25] dx where dx(t) is the increment of the degradation process, η(t; Ω) is the drift coefficient, Ω is a vector with unknown parameters, and σ is a constant diffusion coefficient.
To describe a degradation process with self-similarity and LRD characteristics, we propose that FBM replaces Brownian motion in the diffusion term.The degradation model is rewritten as where B H (t) is FBM.The drift coefficient η(t; Ω) in ( 5) is given by aϕ(t), and ϕ(t) has three types of formulations [24]: where M 1 , M 2 and M 3 represent linear, power law and exponential drifts, respectively.

Parameter Estimation
Referring to (5), the degradation process is described as The observed data can be expressed as X = (x 1 , x 2 , . . . ,x N ) T , where X 0 is the initial value.The function vector and FBM vector are denoted as ϕ = (ϕ(t 1 ), ϕ(t 2 ), . . . ,ϕ(t N )) T  and 9) is rewritten as X obeys X ∼ N(X 0 + aϕ, σ 2 Q), where Then, the density function of X is The log-likelihood function is obtained as follows: ( The logarithmic likelihood function of parameter b in the drift function is represented as To obtain the maximum likelihood estimation (MLE) of parameters a and σ 2 , the partial derivative of ( 21) is set to be zero, and then Due to the nonlinearity of the exponential and power law drifts, obtaining the analytical forms of b is a difficult task [26].Thus, the fminsearch function in MATLAB, employing the Nelder-Mead simplex algorithm, is designed to find the minimum of an unconstrained function.This tool is instrumental in optimizing parameters or solving problems where the objective is to minimize a specific criterion without predefined bounds [27].The estimated values of a and σ 2 can be calculated by ( 15) and ( 16).Computing a function vector based on such a drift function involves evaluating the drift function at various points or conditions to generate a series of values that constitute the vector.

Three-Step RUL Prediction Model for Roller Bearings
In this section, we propose a RUL prediction model with three different steps: degradation feature decomposition, degradation component selection and Monte Carlo RUL prediction.A flow chart of the RUL prediction model is provided in Figure 2.

Degradation Feature Decomposition
VMD is a new method of signal decomposition, proposed by Dragomiretskiy et al. [28] in 2014.This method is to decompose the original signal into a K limited bandwidth IMFS (intrinsic mode function).It can be described as follows: Each IMF is concentrated at the center frequency k ω , and the bandwidth can be es- timated by its Gaussian smoothed offset signal.Due to the sparseness of VMD，it can be deemed a constrained variational problem, as expressed by In the degradation feature decomposition stage, the variational mode decomposition (VMD) algorithm is proposed to separate different degradation feature components from the vibration signal.In the degradation component selection stage, different metrics are calculated to select the best component.The selected component will be considered the HI, which is the training data for the degradation simulation driven by FBM.In the Monte Carlo RUL prediction stage, multiple degradation simulations are conducted and their mode is considered the point prediction.

Degradation Feature Decomposition
VMD is a new method of signal decomposition, proposed by Dragomiretskiy et al. [28] in 2014.This method is to decompose the original signal into a K limited bandwidth IMFS (intrinsic mode function).It can be described as follows: where ω k (t) is the instantaneous phase of the k-th intrinsic mode function, A k (t) is the instantaneous amplitude of the k-th intrinsic mode function and y(t) is the value of the original signal at time t, represented by the A k (t) and ω k (t) determinants.Each IMF is concentrated at the center frequency ω k , and the bandwidth can be estimated by its Gaussian smoothed offset signal.Due to the sparseness of VMD, it can be deemed a constrained variational problem, as expressed by min where {u k } is the kth IMF and {ω k } is the frequency center of different components.u k satisfies To solve the constrained variational problem, an augmented Lagrange function is introduced [29].The problem can be written in the form where a is the penalty factor and λ is the Lagrange multiplier.The IMFs u k and corresponding center frequencies ω k can be updated using ( 21) and ( 22): The Lagrange multiplier λ is updated into The update process is carried out repeatedly until ( 24) is satisfied.
The convergence accuracy is typically set as 10-6.

Metric-Based Degradation Component Selection Algorithm
In order to choose a suitable component, monotonicity, robustness and trendability are applied for this evaluation.The monotonicity metric is described as follows [30][31][32]: where X is the HI (health indicator) sequence, K represents the number of HI values and d/dx = x k+1 − x k and d 2 /d 2 x denote the difference and the second-order derivatives of X, respectively.No. o f corresponds to the number of the positive and the negative difference.Mon + (X) and Mon − (X) represent the positive monotonicity and the negative monotonicity, respectively.The robustness is expressed as [33] Rob where x k is the HI value at t k and x T k is the mean trend value at t k , calculated by smooth method.The trendability is denoted as [34] Tre(X) = Higher values of these metrics represent a better performance.

Monte Carlo Simulation for RUL Prediction
In general, the RUL at t k is defined as the first time when the degradation process reaches the failure threshold ω [35]: where L k represents the RUL at the kth observed time.Because the result of the model is not fixed; one calculation result is not necessarily correct.In order to come close to the actual value, the Monte Carlo method is used to predict the RUL, generating a large number of simulation paths.The mode of the Monte Carlo simulation is considered the predicted value.
The increment at interval ∆t can be expressed as follows: The future path after j intervals at t k is where B H (∆l k ) ∼ 0, (∆l k ) 2H .The principles of the Monte Carlo RUL prediction model are depicted in Figure 3.
Fractal Fract.2024, 8, x FOR PEER REVIEW 8 of 14 where k L represents the RUL at the kth observed time.Because the result of the model is not fixed; one calculation result is not necessarily correct.In order to come close to the actual value, the Monte Carlo method is used to predict the RUL, generating a large number of simulation paths.The mode of the Monte Carlo simulation is considered the predicted value.The increment at interval t Δ can be expressed as follows: ) The future path after j intervals at k t is ) , . The principles of the Monte Carlo RUL prediction model are depicted in Figure 3.

Case Study
In this section, a concrete test example is given to show the efficiency and reliability of our method.

Data Description
In this section, the degradation process of a bearing data set is provided to test the

Case Study
In this section, a concrete test example is given to show the efficiency and reliability of our method.

Data Description
In this section, the degradation process of a bearing data set is provided to test the validity of our proposed theory.The data were generated by the Intelligent Maintenance Systems (IMS) [36].Four bearings are fixed on a shaft.The speed is constant at 2000 RPM.A 6000 lbs radial load is applied onto the bearing and shaft.For each bearing, data set 1 has two accelerometers, and data sets 2 and 3 have one accelerometer.The installation location is as shown in Figure 4.Each data set includes a run-to-failure experiment and its sampling rate is 20 kHz.The whole operational life data of bearing 3-Ch5 in set No. 1 is adopted.The first 43 files were recorded every 5 min and the remaining files were recorded every 10 min.The recording time was from 22 October 2003 12:06:24 to 25 November 2003 23:39:56.Figure 5 shows the vibration signal.

HI Construction of the Experimental Bearing Data
In order to extract the degradation trend, the vibration signal is decomposed into eight degradation components by VMD [37].These degradation components are shown in Figure 6.Monotonicity, robustness and trendability are calculated for these eight components and the results are shown in Table 1.

HI Construction of the Experimental Bearing Data
In order to extract the degradation trend, the vibration signal is decomposed into eight degradation components by VMD [37].These degradation components are shown in Figure 6.Monotonicity, robustness and trendability are calculated for these eight components and the results are shown in Table 1.

HI Construction of the Experimental Bearing Data
In order to extract the degradation trend, the vibration signal is decomposed into eight degradation components by VMD [37].These degradation components are shown in Figure 6.Monotonicity, robustness and trendability are calculated for these eight components and the results are shown in Table 1.

HI Construction of the Experimental Bearing Data
In order to extract the degradation trend, the vibration signal is decomposed into eight degradation components by VMD [37].These degradation components are shown in Figure 6.Monotonicity, robustness and trendability are calculated for these eight components and the results are shown in Table 1.[38].The failure threshold is set as and the RUL prediction starts from the 1800th time point.

RUL Prediction and Performance Evaluation
The self-similar parameter is estimated using the R/S method and its value is 0.86.Therefore, the HI is both self-similar and LRD.Other parameters are estimated by MLE.Through the Monte Carlo method, the probability density distribution (PDF) of the RUL can be simulated and the selected degradation model is M2.Obviously, the degradation process of the bearing is nonlinear and non-exponential.The utilization of M1 and M3 is inappropriate.
Figure 7 shows the PDF prediction of the proposed model.In the early stages, the distribution of its prediction results is scattered.As the RUL decreases, the distribution of the prediction results becomes more concentrated.
A Long short-term memory neural network (LSTM) model is selected for a model comparison.Its prediction results can be found in Figure 8.The RUL prediction model based on Brownian motion neglected the self-similarity and LRD of the degradation, which means it has an inferior approach.[38].The failure threshold is set as ω = 200 and the RUL prediction starts from the 1800th time point.

RUL Prediction and Performance Evaluation
The self-similar parameter is estimated using the R/S method and its value is 0.86.Therefore, the HI is both self-similar and LRD.Other parameters are estimated by MLE.Through the Monte Carlo method, the probability density distribution (PDF) of the RUL can be simulated and the selected degradation model is M2.Obviously, the degradation process of the bearing is nonlinear and non-exponential.The utilization of M1 and M3 is inappropriate.
Figure 7 shows the PDF prediction of the proposed model.In the early stages, the distribution of its prediction results is scattered.As the RUL decreases, the distribution of the prediction results becomes more concentrated.Figure 9 shows the estimated RUL and the actual RUL for both prediction algorithms.The exponential transformed accuracy (ETA) [39] is applied to evaluate their performance.The formulas are expressed as A Long short-term memory neural network (LSTM) model is selected for a model comparison.Its prediction results can be found in Figure 8.The RUL prediction model based on Brownian motion neglected the self-similarity and LRD of the degradation, which means it has an inferior approach.Figure 9 shows the estimated RUL and the actual RUL for both prediction algorithms.The exponential transformed accuracy (ETA) [39] is applied to evaluate their performance.The formulas are expressed as where i Er is the percent error, defined by Figure 9 shows the estimated RUL and the actual RUL for both prediction algorithms.The exponential transformed accuracy (ETA) [39] is applied to evaluate their performance.The formulas are expressed as ETA i = exp(− ln(0.5)Eri /5) if Er i ≤ 0 exp(ln(0.5)Eri /20) if Er i > 0 , (31) where Er i is the percent error, defined by The final score is the average of all results.A higher score represents better performance.Furthermore, the root mean squared error (RMSE) and mean absolute percentage error (MAPE) are also used for this evaluation and their results are shown in Table 2.

Figure 2 .
Figure 2. Step-by-step flowchart of the RUL prediction model.

) k tω
is the instantaneous phase of the k-th intrinsic mode function， ( ) k A t is the instantaneous amplitude of the k-th intrinsic mode function and y(t) is the value of the original signal at time t, represented by the ( )

Figure 2 .
Figure 2. Step-by-step flowchart of the RUL prediction model.

Figure 3 .
Figure 3. Plot of the principles of an RUL prediction model based on a Monte Carlo simulation.

Figure 3 .
Figure 3. Plot of the principles of an RUL prediction model based on a Monte Carlo simulation.

Figure 4 .
Figure 4. Bearing and sensor installation instructions.

Figure 4 .
Figure 4. Bearing and sensor installation instructions.

Figure 7 .
Figure 7. PDF prediction of the proposed model.Red line for the estimated PDF of the RUL.Asterisk for the maximum of the PDF.Blue line for the actual RUL.

Figure 8 .
Figure 8. PDF prediction based on LSTM.Red line for the estimated PDF of the RUL.Asterisk for the maximum of the PDF.Blue line for the actual RUL.

Figure 7 .
Figure 7. PDF prediction of the proposed model.Red line for the estimated PDF of the RUL.Asterisk for the maximum of the PDF.Blue line for the actual RUL.

Fractal 14 Figure 7 .
Figure 7. PDF prediction of the proposed model.Red line for the estimated PDF of the RUL.Asterisk for the maximum of the PDF.Blue line for the actual RUL.

Figure 8 .
Figure 8. PDF prediction based on LSTM.Red line for the estimated PDF of the RUL.Asterisk for the maximum of the PDF.Blue line for the actual RUL.

Figure 8 .
Figure 8. PDF prediction based on LSTM.Red line for the estimated PDF of the RUL.Asterisk for the maximum of the PDF.Blue line for the actual RUL.

Table 1 ,
IMF4 is selected as the training data, because it performed the best in all three indexes