Remaining Useful Life Prediction of Electric Drive Bearings in New Energy Vehicles: Based on Degradation Assessment and Spatiotemporal Feature Fusion

Yang, Fang; Dong, En; Zhong, Zhidan; Zhang, Weiqi; Cui, Yunhao; Ye, Jun

doi:10.3390/machines13100914


by Fang Yang ^1,2,*, En Dong ^1,2, Zhidan Zhong ^1,2, Weiqi Zhang ^1,2, Yunhao Cui ^1,2 and Jun Ye ^1,2

¹ School of Mechatronics Engineering, Henan University of Science and Technology, Luoyang 471003, China

² Collaborative Innovation Center of Henan Province for High-End Bearing, Henan University of Science and Technology, Luoyang 471000, China

^* Author to whom correspondence should be addressed.

Machines 2025, 13(10), 914; https://doi.org/10.3390/machines13100914

Submission received: 4 September 2025 / Revised: 25 September 2025 / Accepted: 2 October 2025 / Published: 3 October 2025

(This article belongs to the Special Issue Data-Driven RUL Prediction: Innovations in Generalization, Uncertainty, and Efficiency for Industrial PHM)


Abstract

Accurate prediction of the RUL of electric drive bearings over the entire service life cycle for new energy vehicles optimizes maintenance strategies and reduces costs, addressing clear application needs. Full life data of electric drive bearings exhibit long time spans and abrupt degradation, complicating the modeling of time dependent relationships and degradation states; therefore, a piecewise linear degradation model is appropriate. An RUL prediction method is proposed based on degradation assessment and spatiotemporal feature fusion, which extracts strongly time correlated features from bearing vibration data, evaluates sensitive indicators, constructs weighted fused degradation features, and identifies abrupt degradation points. On this basis, a piecewise linear degradation model is constructed that uses a path graph structure to represent temporal dependencies and a temporal observation window to embed temporal features. By incorporating GAT-LSTM, RUL prediction for bearings is performed. The method is validated on the XJTU-SY dataset and on a loaded ball bearing test rig for electric vehicle drive motors, yielding comprehensive vibration measurements for life prediction. The results show that the method captures deep degradation information across the full bearing life cycle and delivers accurate, robust predictions, providing guidance for the health assessment of electric drive bearings in new energy vehicles.

Keywords:

new energy vehicles; electric drive bearing; remaining useful life; graph attention network

1. Introduction

The new energy vehicle sector has undergone rapid growth in recent years, accompanied by accelerated technological iteration. These trends have tightened performance requirements on vehicle powertrains [1]. Within these systems, the traction motor, a key component, must meet higher standards for reliability, speed regulation, and overload tolerance [2]. Its core elements, the electric drive bearings, provide rotational support and transmit loads. Accurate prediction of their remaining useful life (RUL) over the full service cycle, together with robust assessment of degradation states, is essential for devising cost-effective maintenance strategies. These capabilities directly improve powertrain stability and reliability [3].

Electric drive bearings typically experience three phases over their service life: the stable wear phase, the degradation phase, and the failure phase [4]. During the stable wear phase, degradation indicators—such as vibration, temperature, and envelope spectrum energy—are affected by both normal wear and background noise, resulting in low amplitude, gradual changes that make significant trends difficult to discern. Once the degradation phase begins, damage accumulation accelerates, and the indicators often exhibit approximately exponential growth. The transition from slow to rapid change occurs over a very short interval and is markedly abrupt. If a linear degradation model with a constant degradation rate is applied across the entire service cycle, extrapolating from early, subtle changes to forecast the subsequent rapid consumption of RUL will yield overly optimistic estimates of the safety boundary and RUL, thereby increasing downstream maintenance costs. A more reliable approach is to characterize the degradation pattern, identify the change points, and apply appropriate linear models—specifically, piecewise linear degradation models—before and after those points.

Accurately identifying bearing degradation features and determining the first prediction time (FPT) are critical to improving the precision of RUL estimates [5]. Recent studies have increased detection accuracy and efficiency by proposing evaluation metrics and designing detection models. Pei et al. [6] introduced the feature-to-noise energy ratio as a bearing performance-degradation indicator. Mi et al. [7] proposed a degradation-thresholding method that jointly leverages root mean square and kurtosis, together with a nonparametric cumulative approach and an extended forest model. Ding et al. [8] applied fuzzy C-means clustering to delineate degradation stages by integrating degradation indicators with operating condition information.

Although these methods support coarse localization of degradation points, opportunities remain to reduce false alarm and miss detection rates and to strengthen generalization. Moreover, degradation point identification can be improved using feature engineering-based models. Chen et al. [9] developed a deep residual shrinkage relational network that classifies degradation stages by exploiting sample correlations. Zhu et al. [10] employed a random-matrix model to derive feature indicators and fused them via principal component analysis to assess degradation states. These methods require less prior knowledge and often generalize well; however, their performance is sensitive to the quality of feature extraction and to sample size, indicating a need for further improvements in stability.

Currently, RUL prediction methods are commonly categorized into failure mechanism and data-driven approaches. Failure mechanism-based methods are constrained by complex and variable operating conditions and rely heavily on expert knowledge, which limits their applicability in engineering practice. Consequently, research has increasingly shifted toward data-driven methods, especially deep learning [11].

Vibration data collected over a bearing’s full life are time-series data. Sequence models such as recurrent neural networks (RNNs) [12], gated recurrent units (GRUs) [13], and long short-term memory networks (LSTM) [14] are widely used for bearing RUL prediction because they capture temporal dependencies effectively. For example, Shen et al. [15] constructed a nonlinear degradation index from time frequency features and predicted the remaining life of rolling bearings using BiLSTM. Wang et al. [16] combined multilevel convolutional autoencoders with LSTM and introduced a bias-correction mechanism for RUL prediction. Nevertheless, when modeling long-horizon degradation over the full life cycle with very long sequences, these models can suffer temporal feature dilution, and their inherently sequential computation limits parallelization, leading to limited real-time performance.

In response, some studies have leveraged convolutional neural networks (CNNs) for their strong feature extraction capability in RUL research. Zhu et al. [17] proposed a deep feature learning approach for bearing RUL prediction based on time frequency representations and multiscale CNN. Cao et al. [18] developed an end-to-end bearing RUL model using a temporal convolutional network with residual attention, integrating both time frequency and temporal information. Although such methods capture salient degradation features, they still struggle to model inter-signal dependencies and to track feature evolution across degradation stages.

In bearing RUL studies, monitoring data exhibit pronounced non-Euclidean structure and explicit temporal dependence. Graph representations can encode these dependencies via edges and have been used to mine latent degradation information for bearing RUL prediction. Kumar et al. [19] employed graph neural networks (GNNs) to align graph features via maximum mean discrepancy and introduced a weighted Huber loss to improve prediction accuracy. The multilevel, multiscale relational structure of graph data provides a suitable foundation for GNN-based RUL prediction. Graph convolutional networks (GCNs) learn node attributes, edge relationships, and global graph characteristics from graph topology. Moreover, coupling GCNs with recurrent architectures captures both the temporal dependencies in time series and complex interaction patterns, which improves RUL prediction accuracy. However, bearing degradation is multifactorial and exhibits irregular intervals and nonlinear temporal dependencies; under such conditions, standard GCNs often fail to capture evolving relational dynamics [20].

Motivated by the success of Transformers in natural language processing, self-attention mechanisms have been extended to graph data [21]. Conventional GCN aggregate features from local neighborhoods and may overlook long-range dependencies and global information. For complex structured data, self-attention enables context-aware processing that better captures global relationships, thereby enhancing model expressiveness and performance. Wei et al. [22] proposed an adaptive GCN with a self-attention module to capture temporal relevance without relying on recurrent units. Transformers also inspired Graph Attention Networks, which mitigate limitations of conventional graph convolutions by applying attention mechanisms to graph-structured data. This fusion of attention architectures with graph-based deep learning substantially improves sequential pattern recognition. For bearing RUL prediction, GAT focuses on salient features and critical time periods by weighting neighboring nodes according to learned similarities and aggregating their information, thereby enhancing prediction accuracy. Chen and Zeng [23] constructed nodes with a sliding window, learned embeddings, and employed a graph attention network to capture spatiotemporal correlations. Liang et al. [24] proposed a deep adaptive Transformer augmented with a GAT to build strongly correlated graphs and integrate node information for temporal feature extraction in RUL prediction. Nevertheless, constructing high-quality graph data that reflects the abrupt degradation of electric drive bearings across degradation states in new energy vehicles and effectively modeling temporal dependencies remain open problems.

To address these challenges in the context of new energy vehicles’ electric drive bearings, a degradation assessment and spatiotemporal feature fusion method is proposed for accurate RUL prediction. Using a test rig for ball bearings in electric vehicle drive motors, combined axial and radial loads are applied to acquire full-life vibration data. From these data, a feature-weighted fusion scheme is devised to determine the FPT, with an adaptive, temporally correlation-sensitive degradation index utilized to characterize degradation and accurately detect abrupt degradation points. Temporal dependencies in vibration signals are modeled with graph structures tailored to different lifecycle stages, with appropriate temporal observation windows constructed for each degradation phase. A graph attention network aggregates node information to extract spatial features, and an LSTM captures temporal dynamics, yielding a fused spatiotemporal representation for RUL prediction of electric drive bearings. The main contributions of this paper are as follows:

(1): A feature-weighted fusion method for FPT determination is proposed. Temporally correlated features are extracted from bearing vibration signals, and a comprehensive temporal correlation assessment identifies sensitive indicators. Adaptive weighting is then used to fuse degradation features, and abrupt degradation points are detected via change point analysis. On this basis, a piecewise linear degradation model is formulated.
(2): After identifying degradation points, a path graph representation is constructed to capture temporal dependencies across the full lifecycle of bearing vibration signals, with stage-specific construction strategies for different degradation phases. Using the resulting graph-structured data, a temporal observation window embeds temporal attributes and enables the propagation and aggregation of structural degradation features among nodes. GAT-LSTM is employed to predict the remaining useful life of electric drive bearings.
(3): A dedicated test rig for ball bearings in electric vehicle drive motors is used to emulate real operating conditions. Composite axial and radial loads are applied to collect comprehensive lifecycle vibration data that captures bearing degradation, and the proposed method is validated using these data.

2. RUL Prediction Via Joint Degradation Point-Driven Graph and Time-Series Data Introduction

A RUL prediction method for electric drive bearings in new energy vehicles is proposed, based on degradation assessment and spatiotemporal feature fusion. The main theoretical contributions are the identification of abrupt degradation points and the construction of graph-structured temporal data. Determining abrupt degradation points comprises three steps: computing temporal degradation indicators, evaluating temporal sensitivity, and identifying change points from a weighted fusion of features. Constructing graph-structured temporal data entails two components: creating path graph data and fusing spatiotemporal features. The overall methodological framework is illustrated in Figure 1.

2.1. Feature Weighted Fusion-Based FPT Determination Method

The degradation of electric drive bearings exhibits pronounced abrupt changes and brief transition phases, making linear degradation models prone to misestimating safe service limits. A piecewise linear degradation model better represents bearing degradation and improves life prediction accuracy, rendering it more suitable for RUL prediction of electric drive bearings in new energy vehicles. Using comprehensive degradation data, feature-weighted temporal sensitivity indicators with strong temporal correlation are employed to characterize the degradation state and accurately capture abrupt degradation points. The methodological workflow comprises three stages: computing temporal degradation indicators, evaluating temporal sensitivity, and identifying change points from a weighted fusion of features. The specific workflow is shown in Figure 2.

2.1.1. Calculation of Temporal Degradation Indicators

Given the full life cycle vibration monitoring signal $X = (x_{1}, x_{2}, \dots, x_{N})$ of electric drive bearings in new energy vehicles, time domain degradation indicators are extracted, where $N$ denotes the signal length. Because the full life cycle contains a large number of data points, the signal $X$ is first partitioned into samples using a fixed-length window of $M$ points. To balance data volume and temporal resolution, the window length is set to 4096. This segmentation yields $X = (X_{1}, X_{2}, \dots, X_{i})$, where $i$ denotes the total number of samples.

Five time domain degradation indicators are adopted: mean, standard deviation, peak-to-peak value, root mean square (RMS), and kurtosis. The mean and standard deviation characterize the average signal level and its fluctuations, respectively. The peak-to-peak value captures the amplitude range and is sensitive to impulsive faults. The RMS reflects changes in overall energy or amplitude. Kurtosis serves as a commonly used indicator of pitting-type damage [25].

For each full life cycle sample, with points $x_{1}, x_{2}, \dots, x_{M}$, the mean, standard deviation, peak-to-peak value, RMS, and kurtosis are computed as:

F_{mean} = \frac{1}{M} \sum_{i = 1}^{M} x_{i}

(1)

F_{std} = \sqrt{\frac{\sum_{i = 1}^{M} (x_{i} - F_{mean})^{2}}{M}}

(2)

F_{p2p} = \max_{i} x_{i} - \min_{i} x_{i}

(3)

F_{rms} = \sqrt{\frac{1}{M} \sum_{i = 1}^{M} x_{i}^{2}}

(4)

F_{kurtosis} = \frac{1}{M} \sum_{i = 1}^{M} \frac{(x_{i} - F_{mean})^{4}}{F_{std}^{4}}

(5)

In the formulas, $F_{mean}$ represents the mean of the signal, $F_{std}$ denotes the standard deviation of the signal, $F_{p2p}$ indicates the peak-to-peak value of the signal, $F_{rms}$ refers to the root mean square value of the signal, and $F_{kurtosis}$ signifies the kurtosis of the signal.
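
The segmentation and the five indicators can be sketched as follows. This is a minimal, standard-library-only illustration rather than the authors' implementation: the function names `split_into_samples` and `time_domain_indicators` are hypothetical, dropping a trailing remainder shorter than one window is an assumption, and kurtosis is taken in its standard form (fourth central moment divided by the fourth power of the standard deviation).

```python
import math
from statistics import fmean, pstdev


def split_into_samples(signal, window=4096):
    """Cut the signal into consecutive, non-overlapping samples of `window` points.

    Assumption: a trailing remainder shorter than `window` is dropped.
    """
    n_samples = len(signal) // window
    return [signal[i * window:(i + 1) * window] for i in range(n_samples)]


def time_domain_indicators(sample):
    """Return (mean, std, peak-to-peak, RMS, kurtosis) for one sample."""
    mean = fmean(sample)
    std = pstdev(sample, mu=mean)  # population form: divides by the sample length
    p2p = max(sample) - min(sample)
    rms = math.sqrt(fmean(x * x for x in sample))
    # Kurtosis is undefined for a constant sample; return 0.0 there (assumption).
    kurt = fmean(((x - mean) / std) ** 4 for x in sample) if std > 0 else 0.0
    return mean, std, p2p, rms, kurt
```

Applying `time_domain_indicators` to every element returned by `split_into_samples` gives one five-element feature vector per sample, which is the input to the sensitivity assessment in the next subsection.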

2.1.2. Evaluation of Temporal Sensitivity Indicators

After calculating the time-series degradation indicators, each full-cycle sample obtains a time feature vector. To further highlight the degradation characteristics, a weighted feature fusion of the feature vector $F = [F_{mean}, F_{std}, F_{p2p}, F_{rms}, F_{kurtosis}]$ is performed. Before fusion, it is necessary to determine the weighting coefficients.

Features exhibiting strong temporal correlation and monotonicity provide more faithful representations of bearing degradation trajectories in RUL prediction. Bearing degradation progresses over time: as the healthy state deteriorates, diagnostic features emerge that follow time-dependent patterns and change consistently with the underlying degradation. These patterns are the core signals that weight allocation should prioritize. Pearson correlation and Spearman rank-based monotonicity analyses are appropriate for quantifying these properties. Temporal correlation evaluates the association between feature sequences and time. Pearson correlation quantifies linear relationships and is suitable when health indicators exhibit significant linear dependence on operating time. Monotonicity characterizes features that consistently increase or decrease. Spearman's rank correlation captures trend consistency without assuming linearity, and thus better reflects the irreversible monotonic behavior of bearing degradation.

(1) Temporal Correlation Assessment:

Pearson correlation is widely used to measure the linear association between two variables. The population Pearson correlation coefficient of two variables is defined as their covariance divided by the product of their standard deviations. By estimating the covariance and standard deviations from the samples, the Pearson correlation coefficient can be obtained, expressed as:

r = \frac{\sum_{i = 1}^{n} (x_{i} - \bar{X}) (y_{i} - \bar{Y})}{\sqrt{\sum_{i = 1}^{n} (x_{i} - \bar{X})^{2}} \sqrt{\sum_{i = 1}^{n} (y_{i} - \bar{Y})^{2}}}

(6)

In the formula, $\bar{X}$ and $\bar{Y}$ represent the sample means of the two variables, $r = \pm 1$ indicates a perfect linear correlation, and $r = 0$ signifies the absence of a linear correlation.

(2) Monotonicity Assessment:

The Spearman method is commonly used to measure the strength of the monotonic relationship between two variables, defined as follows:

p = 1 - \frac{6 \sum d_{i}^{2}}{n (n^{2} - 1)}

(7)

In the formula, $d_{i}$ represents the difference in rank values for the $i$-th data pair, $n$ denotes the total number of observations, $p = \pm 1$ indicates a perfectly monotonic relationship, and $p = 0$ represents the absence of a monotonic trend.

By conducting a temporal sensitivity assessment on the five time-series indicators, we obtain the temporal correlation weight vector $R = [r_{mean}, r_{std}, r_{p2p}, r_{rms}, r_{kurtosis}]$ and the monotonicity weight vector $P = [p_{mean}, p_{std}, p_{p2p}, p_{rms}, p_{kurtosis}]$. At this point, the assessment of temporal sensitivity indicators is completed.
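
The two weight vectors can be illustrated with a short standard-library sketch. It follows Equations (6) and (7) with time represented by the sample index 1, 2, ..., which is an assumption about how "time" enters the correlation. The function names are hypothetical, and ties in the ranking are broken by order of appearance, a simplification under which Equation (7) is only approximate when ties occur.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient, Equation (6)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def ranks(values):
    """1-based ranks; ties are broken by order of appearance (a simplification)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman(x, y):
    """Spearman rank correlation via the rank-difference formula, Equation (7)."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))


def sensitivity_weights(feature_series):
    """Map {indicator name: per-sample values} to the weight dicts (R, P)."""
    R, P = {}, {}
    for name, series in feature_series.items():
        t = list(range(1, len(series) + 1))  # sample index stands in for time
        R[name] = pearson(series, t)
        P[name] = spearman(series, t)
    return R, P
```

For an exponentially growing indicator such as `[1, 2, 4, 8, 16]`, the Spearman weight is exactly 1 while the Pearson weight is lower, which shows why the two measures are used together.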

2.1.3. Identification of Change Points Based on Weighted Fusion Features

Using the temporal correlation weight vector and the monotonicity weight vector, weighted fusion is applied across all full life cycle samples. The fused features are then used to quantify degradation states, and change point detection is subsequently applied to identify abrupt change points in the degradation process.

(1) Feature Weighted Fusion:

There may be differences in magnitude among various feature indicators. Therefore, min-max normalization is first applied for standardization. Subsequently, the feature vector $F$ is fused based on the weight vectors $R$ and $P$, as follows:

\begin{aligned} F_{mex} &= \sqrt{(R + P) \cdot F^{2}} \\ F &= [F_{mean}, F_{std}, F_{p2p}, F_{rms}, F_{kurtosis}] \\ P &= [p_{mean}, p_{std}, p_{p2p}, p_{rms}, p_{kurtosis}] \\ R &= [r_{mean}, r_{std}, r_{p2p}, r_{rms}, r_{kurtosis}] \end{aligned}

(8)

In the formula, $F_{mex} = (F_{mex}^{1}, F_{mex}^{2}, \dots, F_{mex}^{i})$ represents the fused features, and $i$ denotes the number of sample partitions. The subsequent change point detection is conducted based on $F_{mex}$.
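
A minimal sketch of the fusion step is given below. It reads Equation (8) as a weighted sum of the element-wise squared, min-max-normalized indicators under a square root, which is an interpretation rather than something the text spells out. Taking absolute values of the weights is a further assumption, made so that the radicand cannot become negative when an indicator decreases over time. The names `min_max` and `fuse_features` are hypothetical.

```python
import math


def min_max(series):
    """Scale a series to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(series), max(series)
    span = hi - lo
    return [(v - lo) / span if span > 0 else 0.0 for v in series]


def fuse_features(feature_series, R, P):
    """Return one fused value per sample following Equation (8).

    feature_series: {indicator name: per-sample values}
    R, P: {indicator name: weight} from the sensitivity assessment
    """
    names = list(feature_series)
    normed = {k: min_max(feature_series[k]) for k in names}
    # Assumption: absolute weights keep the quantity under the root non-negative.
    weight = {k: abs(R[k]) + abs(P[k]) for k in names}
    n_samples = len(normed[names[0]])
    return [
        math.sqrt(sum(weight[k] * normed[k][s] ** 2 for k in names))
        for s in range(n_samples)
    ]
```

The result is a one-dimensional series with one fused value per sample, which is the form expected by the change point detection that follows.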

(2) Change Point Detection:

In the FPT task for predicting the remaining useful life of bearings, change point detection is employed to identify the starting points of performance degradation within the time series. The fused feature $F_{mex}$ constitutes a typical one-dimensional time series, making it suitable for change point detection using the Pruned Exact Linear Time (PELT) algorithm. The objective of this detection is to pinpoint degradation points that partition the time series into two segments, each exhibiting relatively stable statistical properties. The PELT algorithm efficiently performs change point detection through dynamic programming and pruning strategies. Its core principle involves constructing a cost function and identifying the combination of change points that minimizes the total cost. The algorithm processes each point in the time series sequentially, computes the cumulative cost for each point as a potential change point, and employs a pruning strategy to eliminate candidate locations that cannot be optimal. Ultimately, it returns the set of change points that minimizes the overall cost. For the fused feature $F_{mex} = (F_{mex}^{1}, F_{mex}^{2}, \dots, F_{mex}^{i})$, the FPT cost function is defined as follows:

C(F_{mex}^{1:i}) = \sum_{j = 1}^{m + 1} \mathrm{cost}(F_{mex}^{\tau_{j - 1} + 1 : \tau_{j}}) + \beta m

(9)

In the formula, $\mathrm{cost}(F_{mex}^{\tau_{j - 1} + 1 : \tau_{j}})$ represents the cost of a segment of data (from point $\tau_{j - 1} + 1$ to change point $\tau_{j}$, with $\tau_{0} = 0$ and $\tau_{m + 1} = i$), $m$ denotes the number of change points, and $\beta$ is a positive penalty term that controls the number of change points to avoid overfitting.

Through PELT change point detection, the index of the change point in the fused features $F_{mex}$ can be identified and designated as $l$. Because each fused feature value corresponds to one sample of $M$ points under the full-cycle sample partitioning principle, the approximate change point in the full-cycle vibration monitoring data $X$ can be derived as follows:

n = l \cdot M

(10)

In the formula, $n$ indicates the change point in the full-cycle vibration monitoring data. At this point, the FPT determination based on feature-weighted fusion has been achieved, with the specific process illustrated in Figure 3.
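
To make the dynamic programming and pruning concrete, the sketch below implements PELT for a one-dimensional series in plain Python. Several choices are assumptions rather than details given in the text: the segment cost is the sum of squared deviations from the segment mean, the pruning constant is zero, and the names `pelt_l2` and `to_raw_index` are hypothetical. The initial value `-penalty` makes the total equal to the summed segment costs plus the penalty times the number of change points, as in Equation (9).

```python
from itertools import accumulate


def pelt_l2(series, penalty):
    """Return change-point positions (end index of each segment except the last)."""
    n = len(series)
    cs = [0.0] + list(accumulate(series))
    cs2 = [0.0] + list(accumulate(v * v for v in series))

    def cost(s, t):
        """Sum of squared deviations of series[s:t] from its own mean."""
        seg_sum = cs[t] - cs[s]
        return (cs2[t] - cs2[s]) - seg_sum * seg_sum / (t - s)

    best = [0.0] * (n + 1)   # best[t]: minimal penalized cost of series[:t]
    best[0] = -penalty       # so that the first segment adds no penalty
    prev = [0] * (n + 1)     # prev[t]: start of the last segment in the optimum
    candidates = [0]
    for t in range(1, n + 1):
        scores = [best[s] + cost(s, t) for s in candidates]
        i_min = min(range(len(scores)), key=scores.__getitem__)
        best[t] = scores[i_min] + penalty
        prev[t] = candidates[i_min]
        # Pruning: a start s that is already worse than best[t] can never win later.
        candidates = [s for s, sc in zip(candidates, scores) if sc <= best[t]]
        candidates.append(t)

    change_points, t = [], n
    while prev[t] > 0:       # walk back through the optimal segmentation
        t = prev[t]
        change_points.append(t)
    return change_points[::-1]


def to_raw_index(sample_index, window=4096):
    """Map a change point in the fused series to the raw signal (index times window)."""
    return sample_index * window
```

On a series that is flat at 0 for 50 samples and then flat at 5 for 50 samples, `pelt_l2(series, 10.0)` returns `[50]`. The penalty plays the role of $β$: larger values return fewer change points, and its value has to be tuned for the data at hand.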

2.2. Graph-Structured Temporal Data Construction

Based on the identification of degradation points, path graph structure data representing temporal dependencies is constructed. A temporal observation window is established to embed temporal features, thereby completing spatiotemporal feature fusion. GAT is utilized to acquire spatial aggregation information between nodes, while LSTM is employed to capture temporal features. The methodological process encompasses two parts: the construction of path graph structure data and the fusion of spatiotemporal features.

2.2.1. Path Graph Structure Data Construction

Graph-structured data, as a type of non-Euclidean data, enables explicit modeling of complex systems via node feature information and inter-node connections. For a path graph $G = G(X, A, E)$, the node feature matrix is denoted by $X \in \mathbb{R}^{n \times d}$, where $n$ represents the number of nodes and $d$ denotes the feature dimension. The adjacency matrix $A$ is used to characterize the connection relationships between nodes, while $E$ denotes the set of edges. The core of constructing graph-structured data lies in determining the nodes themselves, their corresponding feature information, and the connections between nodes.

Nodes and their features are the primary attributes of a path graph. The node set determines the incidence structure of edges, whereas node features encode the operational state of nodes. In this study, path graph nodes are derived from bearing vibration data. A fixed-length sliding window of size $l$ is applied to the vibration signal $X = (x_{1}, x_{2}, \dots, x_{N})$ to segment it into non-overlapping windows, each treated as a node, thereby constructing the node sequence of the path graph, as illustrated in Figure 4.

From each segmented node signal $X_{i}^{l} = (x_{i}, x_{i + 1}, \dots, x_{i + k - 1})$, sixteen features are extracted to characterize bearing degradation, which serve as node features.

To assign node features, we computed 16 indicators that systematically characterize the signal in both the time and frequency domains. In the time domain, the 13 indicators are the peak-to-peak value, absolute mean, square-root amplitude, variance, standard deviation, effective (root-mean-square) value, kurtosis, skewness, waveform factor, peak factor, impulse factor, margin factor, and gap factor. These indicators describe the signal's amplitude, variability, distributional characteristics, and impulsive behavior. Specifically, the peak-to-peak and effective values reflect signal strength, whereas variance and standard deviation quantify variability. Kurtosis and skewness characterize distribution shape, and the five factors (waveform, peak, impulse, margin, and gap) are used to assess impulsive behavior. In the frequency domain, we use three indicators: mean frequency, centroid frequency, and root-mean-square frequency. Together, these indicators characterize the spectral distribution from complementary perspectives, including central tendency, energy concentration, and dispersion. Collectively, they provide a comprehensive, multidimensional feature representation. The formulas for each indicator are given below:

Peak-to-peak value:	$T_{1} = \max X_{i}^{l} - \min X_{i}^{l}$	Absolute mean:	$T_{2} = \frac{1}{k} \sum_{i = 1}^{k} \lvert X_{i}^{l}(i) \rvert$
Square-root amplitude:	$T_{3} = \left(\frac{1}{k} \sum_{i = 1}^{k} \sqrt{\lvert X_{i}^{l}(i) \rvert}\right)^{2}$	Variance:	$T_{4} = \frac{1}{k - 1} \sum_{i = 1}^{k} [X_{i}^{l}(i) - \bar{X_{i}^{l}}]^{2}$
Standard deviation:	$T_{5} = \sqrt{\frac{1}{k} \sum_{i = 1}^{k} [X_{i}^{l}(i) - \bar{X_{i}^{l}}]^{2}}$	Effective value:	$T_{6} = \sqrt{\frac{1}{k} \sum_{i = 1}^{k} [X_{i}^{l}(i)]^{2}}$
Kurtosis:	$T_{7} = \frac{\sum_{i = 1}^{k} [X_{i}^{l}(i) - \bar{X_{i}^{l}}]^{4}}{(k - 1) T_{5}^{4}}$	Skewness:	$T_{8} = \frac{\sum_{i = 1}^{k} [X_{i}^{l}(i) - \bar{X_{i}^{l}}]^{3}}{(k - 1) T_{5}^{3}}$
Waveform factor:	$T_{9} = \frac{T_{6}}{T_{2}}$	Peak factor:	$T_{10} = \frac{\max \lvert X_{i}^{l} \rvert}{T_{6}}$
Impulse factor:	$T_{11} = \frac{\max \lvert X_{i}^{l} \rvert}{T_{2}}$	Margin factor:	$T_{12} = \frac{\max \lvert X_{i}^{l} \rvert}{T_{3}}$
Gap factor:	$T_{13} = \frac{\max \lvert X_{i}^{l} \rvert}{T_{6}^{2}}$	Mean frequency:	$T_{14} = \frac{1}{H} \sum_{j = 1}^{H} s(j)$
Centroid frequency:	$T_{15} = \frac{\sum_{j = 1}^{H} f(j) s(j)}{\sum_{j = 1}^{H} s(j)}$	Root-mean-square frequency:	$T_{16} = \sqrt{\frac{\sum_{j = 1}^{H} f^{2}(j) s(j)}{\sum_{j = 1}^{H} s(j)}}$

In the formulas, $k$ represents the number of sample points for a single node, $H$ indicates the number of frequency spectrum lines obtained from the Fourier transform of the sample data sequence within a single node, $f(j)$ is the frequency of the $j$-th spectrum line, and $s(j)$ denotes the amplitude of the $j$-th spectrum line. $T_{1}$ to $T_{13}$ represent time domain indicators, while $T_{14}$ to $T_{16}$ represent frequency domain indicators. Together, they constitute the node features $T = (T_{1}, T_{2}, \dots, T_{16})$ of the path graph.
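
As an illustration of how the dimensionless factors build on the amplitude indicators, the following sketch computes the waveform, peak, impulse, margin, and gap factors for one node. It is a sketch under stated assumptions: the square-root amplitude is taken as the squared mean of the square roots of the absolute values, the node signal is assumed not to be identically zero, and the function name is hypothetical. The remaining time domain indicators and the three frequency domain indicators (which require a Fourier transform) are omitted here.

```python
import math


def dimensionless_factors(node):
    """Return the five impulsiveness factors for one node signal."""
    k = len(node)
    peak = max(abs(x) for x in node)
    abs_mean = sum(abs(x) for x in node) / k                      # absolute mean
    sq_root_amp = (sum(math.sqrt(abs(x)) for x in node) / k) ** 2  # square-root amplitude
    rms = math.sqrt(sum(x * x for x in node) / k)                 # effective value
    return {
        "waveform": rms / abs_mean,
        "peak": peak / rms,
        "impulse": peak / abs_mean,
        "margin": peak / sq_root_amp,
        "gap": peak / rms ** 2,
    }
```

For a square-like signal such as `[2.0, -2.0, 2.0, -2.0]` the first four factors all equal 1, while a single large impulse in an otherwise quiet window drives the peak, impulse, and margin factors up sharply.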

After determining the nodes and their features, the path graph is utilized to model the temporal dependencies between adjacent nodes. This means that edge connections exist only between neighboring nodes, reflecting the chronological order of the signals. The adjacency matrix is expressed as follows:

A_{ij} = \begin{cases} 1, & \text{if } |i - j| = 1 \\ 0, & \text{otherwise} \end{cases}

(11)

In the formula, $A_{ij} = 1$ indicates that there is an edge connection between node $i$ and node $j$, while $A_{ij} = 0$ signifies that there is no edge connection between these nodes.
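
Equation (11) translates directly into code. The sketch below builds the adjacency matrix and the equivalent edge list of a path graph with `n` nodes; both function names are hypothetical.

```python
def path_graph_adjacency(n):
    """Adjacency matrix of a path graph: 1 only where node indices differ by one."""
    return [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]


def path_graph_edges(n):
    """Undirected edges (i, i + 1) linking each node to its successor in time."""
    return [(i, i + 1) for i in range(n - 1)]
```

The matrix is symmetric with an empty diagonal, so each interior node has exactly two neighbors and the two end nodes have one.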

For the vibration monitoring data of bearings throughout their entire life cycle, the monitoring data before and after the FPT follow the same node division sliding window l, while implementing different graph structure data construction strategies with a specified number of nodes s for each graph. Before the degradation point, the variation range of each degradation indicator is relatively small, and the data volume is large. Therefore, a broader node observation scale is pursued, meaning that each graph contains a larger number of nodes to facilitate the aggregation of node information over a wider range. Conversely, after the degradation point, the indicators evolve sharply, necessitating a narrower node observation scale, where each graph contains fewer nodes to more accurately represent the degradation evolution information of the bearings.
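
The stage-specific strategy can be sketched as a grouping of the node sequence into graphs of different sizes on either side of the FPT. The text does not state whether consecutive graphs share nodes, so non-overlapping grouping is an assumption here, as is dropping an incomplete trailing group. The names `build_graphs`, `fpt_node`, `s_before`, and `s_after` are hypothetical.

```python
def build_graphs(node_features, fpt_node, s_before, s_after):
    """Group consecutive nodes into path graphs.

    Nodes before index `fpt_node` are grouped `s_before` at a time (wider view),
    nodes from `fpt_node` onward are grouped `s_after` at a time (narrower view).
    """
    def chunk(nodes, size):
        # Assumption: non-overlapping groups; an incomplete last group is dropped.
        return [nodes[i:i + size] for i in range(0, len(nodes) - size + 1, size)]

    return chunk(node_features[:fpt_node], s_before) + chunk(node_features[fpt_node:], s_after)
```

With ten nodes, `fpt_node=6`, `s_before=3`, and `s_after=2`, the call yields two three-node graphs before the degradation point and two two-node graphs after it.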

2.2.2. Spatiotemporal Feature Fusion

Based on the constructed graph structure data $G = G(X, A, E)$, a graph dataset $G = \{(G_{1}, Z_{1}), (G_{2}, Z_{2}), \dots, (G_{m}, Z_{m})\}$ is formed, where $G_{m}$ represents a single graph and $Z_{m}$ is the corresponding RUL for $G_{m}$. The task of predicting the remaining life of the bearings involves learning a function $f : G \to Z$ to obtain the RUL labels.

In bearing RUL prediction, the RUL for the next time period is predicted based on the signals from the current time period, meaning that the RUL of the next sample is estimated using the current sample. To achieve this, a temporal observation window of length $w$ is introduced and slid over the graph dataset $G$ to generate a temporally fused dataset $D = \{(D_{1}, Z_{w}), (D_{2}, Z_{w + 1}), \dots, (D_{m - w + 1}, Z_{m})\}$, with $D_{1} = (G_{1}, G_{2}, \dots, G_{w}, Z_{w})$. For each window, the RUL of the last sample in the window is assigned as the label for the corresponding sample sequence. Distinct window scales are adopted across degradation stages: a small window before the degradation point and a large window after it, enhancing the representation of degradation evolution, as shown in Figure 5. Based on the fused dataset $D$, GAT aggregates spatial degradation features across nodes, after which an LSTM model captures temporal dependencies, achieving spatiotemporal feature fusion and enabling RUL prediction for electric drive bearings.
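
The windowing step can be sketched as below, assuming the window advances by one graph at a time (which yields $m - w + 1$ windows) and that each window is labeled with the RUL of its last graph, as described above. The function name `temporal_windows` is hypothetical.

```python
def temporal_windows(graphs, rul, w):
    """Return [(window of w consecutive graphs, RUL of the last graph in it), ...]."""
    if len(graphs) != len(rul):
        raise ValueError("graphs and rul must have the same length")
    return [(graphs[j:j + w], rul[j + w - 1]) for j in range(len(graphs) - w + 1)]
```

Each returned window is an ordered sequence of graphs: the graph attention step turns every graph into one vector, and the resulting sequence of `w` vectors is what a recurrent model would consume.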

During the data transmission process, the GAT introduces an attention mechanism to calculate the weights of each node’s neighboring nodes within a single graph. This allows for the extraction of output features for each node, which are then aggregated to obtain the consolidated information of the graph structure data. For the node feature matrix

X = {X_{1}, X_{2}, \dots, X_{n}}, X_{i} \in R^{s}

of a single graph, where

n

represents the number of nodes and

s

denotes the dimensionality of the node features, the self-attention mechanism of GAT is used to obtain the attention coefficients for each node, expressed as follows:

e_{i j} = α ({Wh}_{i}, {Wh}_{j})

(12)

In the formula,

W \in R^{s^{'} \times s}

is the weight matrix obtained from training,

α (\cdot, \cdot)

represents a single-layer feedforward neural network, and

h_{i} \in R^{s}

and

h_{j} \in R^{s}

denote the

i

-th and

j

-th nodes of the graph, respectively. Equation (12) indicates the importance of the features of node

V_{j}

for node

V_{i}

. To facilitate the comparison of attention coefficients across different nodes, the coefficients are normalized with a softmax function:

e_{i j}^{'} = \frac{\exp (LeakyReLU (a^{T} [{Wh}_{i} \| {Wh}_{j}]))}{\sum_{k \in Ω_{i}} \exp (LeakyReLU (a^{T} [{Wh}_{i} \| {Wh}_{k}]))}

(13)

In the formula,

Ω_{i}

is the set of neighboring nodes of node

V_{i}

,

a^{T}

is the weight vector, and

LeakyReLU (\cdot)

is the leaky rectified linear unit, an activation function that improves on the standard rectified linear unit (ReLU). It mitigates the potential issue of neuron death during training by introducing a small, non-zero gradient in the negative value region.

Through the above calculations, the attention coefficients for each node are obtained, yielding the output features for each node, specifically:

X_{i}^{'} = σ (\sum_{j \in Ω_{i}} e_{i j}^{'} {Wh}_{j})

(14)

In the formula,

σ (\cdot)

represents the nonlinear activation function. The output features

X^{'} = {X_{1}^{'}, X_{2}^{'}, \dots, X_{n}^{'}}

of all nodes in the graph are aggregated, where

n

is the number of nodes, resulting in the output features of the graph, specifically:

G^{'} = \frac{1}{n} \sum_{i = 1}^{n} X_{i}^{'}

(15)

At this point, spatial information aggregation among the nodes has been accomplished using GAT, yielding the spatially aggregated graph dataset

D^{'} = {(D_{1}^{'}, Z_{w}), (D_{2}^{'}, Z_{2 w}), \dots, (D_{m - w + 1}^{'}, Z_{(m - w + 1) w})}

,

D_{1}^{'} = (G_{1}^{'}, G_{2}^{'}, \dots, G_{w}^{'}, Z_{w})

.

G_{m}^{'}

denotes the aggregated output features of the graph

G_{m}

, while

Z_{m}

represents the corresponding RUL of

G_{m}^{'}

. Subsequently, this data is fed into the LSTM to further capture temporal features, completing the spatiotemporal feature fusion for predicting the remaining useful life of electric drive bearings.
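
To make Equations (12)–(15) concrete, the following is a minimal single-head sketch written in plain Python rather than in the PyTorch implementation used for the experiments. Several details are assumptions of this sketch and are not specified in the text: the nonlinear activation is taken to be ELU, the LeakyReLU slope is 0.2, and each node's neighborhood on the path graph includes the node itself. All function names are hypothetical.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(W: Matrix, x: Vector) -> Vector:
    """Return the matrix-vector product W x."""
    return [sum(w_k * x_k for w_k, x_k in zip(row, x)) for row in W]


def leaky_relu(z: float, slope: float = 0.2) -> float:
    """Leaky ReLU; the slope of 0.2 is an assumption."""
    return z if z > 0 else slope * z


def elu(z: float) -> float:
    """ELU, used here as the nonlinear activation (an assumption)."""
    return z if z > 0 else math.exp(z) - 1.0


def attention(Wh: List[Vector], i: int, nbrs: List[int], a: Vector) -> List[float]:
    """Eq. (13): softmax over LeakyReLU(a^T [W h_i || W h_j]) for j in nbrs."""
    scores = [
        leaky_relu(sum(a_k * c_k for a_k, c_k in zip(a, Wh[i] + Wh[j])))
        for j in nbrs
    ]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gat_graph_embedding(
    X: List[Vector], neighbors: List[List[int]], W: Matrix, a: Vector
) -> Vector:
    """Eqs. (14)-(15): attention-weighted node update followed by a mean readout."""
    Wh = [matvec(W, x) for x in X]
    dim = len(Wh[0])
    updated = []
    for i, nbrs in enumerate(neighbors):
        alpha = attention(Wh, i, nbrs, a)
        agg = [sum(al * Wh[j][d] for al, j in zip(alpha, nbrs)) for d in range(dim)]
        updated.append([elu(v) for v in agg])
    n = len(updated)
    return [sum(node[d] for node in updated) / n for d in range(dim)]


def path_neighbors(n: int) -> List[List[int]]:
    """Neighbors on a path graph; including self-loops is an assumption."""
    return [[j for j in (i - 1, i, i + 1) if 0 <= j < n] for i in range(n)]


X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # n = 3 nodes, s = 2 features per node
W = [[1.0, 0.0], [0.0, 1.0]]              # identity weight matrix, so s' = 2
a = [0.0, 0.0, 0.0, 0.0]                  # zero vector, giving uniform attention
embedding = gat_graph_embedding(X, path_neighbors(3), W, a)
```

The zero attention vector is chosen only so that the result can be checked by hand: every neighbor then receives the same weight, and the graph embedding reduces to the mean of neighborhood averages. In training, W and a would be learned.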

2.3. Prediction Process of Remaining Useful Life for Bearings

The specific process for predicting the RUL of electric drive bearings proposed in this paper is as follows:

(1): Full-life bearing vibration data are acquired, FPT-based sample segmentation is performed, and mean, standard deviation, peak-to-peak value, root-mean-square value, and kurtosis are computed for each sample as time-series degradation indicators. Temporal correlation and monotonicity are evaluated to identify sensitive indicators, weights are assigned, and temporal features are fused. The PELT algorithm is then applied to the fused sequence to detect degradation points and partition service stages.
(2): Based on the full life data, signals are segmented into path graph nodes; feature indicators are computed for each node; and a sliding window over nodes is applied to define edges in the path graph. Using the path graph dataset together with the temporal observation window, a temporally fused path graph dataset is constructed. Stage-dependent node sliding windows and temporal observation windows are adopted for different service stages.
(3): A GAT-LSTM model is employed to extract spatial and temporal features from the path graph, estimate sample RUL, and retain the best-performing checkpoint. The trained model is subsequently evaluated on the test set using standard performance metrics.
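
The five time-domain indicators named in step (1) can be computed per sample as in the sketch below. The use of the population standard deviation and of non-excess kurtosis (fourth central moment divided by the squared variance) are assumptions, since the text does not state which conventions are used; the function name `time_domain_indicators` is hypothetical.

```python
import math
from typing import Dict, Sequence


def time_domain_indicators(x: Sequence[float]) -> Dict[str, float]:
    """Compute mean, standard deviation, peak-to-peak, RMS, and kurtosis of one sample."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n      # population variance (assumption)
    m4 = sum((v - mean) ** 4 for v in x) / n       # fourth central moment
    return {
        "mean": mean,
        "std": math.sqrt(var),
        "peak_to_peak": max(x) - min(x),
        "rms": math.sqrt(sum(v * v for v in x) / n),
        # Non-excess kurtosis (assumption); undefined for a constant signal.
        "kurtosis": m4 / var ** 2 if var > 0 else float("nan"),
    }


# A square-wave-like toy signal, chosen so that the values are easy to verify.
indicators = time_domain_indicators([1.0, -1.0, 1.0, -1.0])
```

Applying the function to every vibration record in sequence yields one time series per indicator, which is the input to the sensitivity assessment.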

2.4. Evaluation Metrics

To validate the effectiveness of the model, the prediction performance is evaluated using Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and the coefficient of determination (R²), defined as follows:

M A E = \frac{1}{m} \sum_{i = 1}^{m} | y_{i} - {\hat{y}}_{i} |

(16)

R M S E = \sqrt{\frac{1}{m} \sum_{i = 1}^{m} {(y_{i} - {\hat{y}}_{i})}^{2}}

(17)

R^{2} = 1 - \frac{\sum_{i} {({\hat{y}}_{i} - y_{i})}^{2}}{\sum_{i} {(\bar{y} - y_{i})}^{2}}

(18)

where

m

is the number of samples,

{\hat{y}}_{i}

is the predicted label,

y_{i}

is the actual label, and

\bar{y}

is the mean of the actual labels. Smaller values of MAE and RMSE, and an

R^{2}

value closer to 1, indicate higher prediction accuracy.
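
For reference, the three metrics can be computed as in the following sketch. The argument names `actual` and `predicted` are used in place of the symbols in Equations (16)–(18), and the mean in the denominator of R² is taken over the actual labels; the toy values are illustrative only and are not results from the paper.

```python
import math
from typing import Sequence


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error, Eq. (16)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error, Eq. (17)."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def r2(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination, Eq. (18); undefined if all actual labels are equal."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Illustrative normalized RUL labels and predictions.
actual = [1.0, 0.5, 0.0]
predicted = [0.9, 0.5, 0.2]
scores = (mae(actual, predicted), rmse(actual, predicted), r2(actual, predicted))
```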

3. Experimental Verification

To validate the effectiveness of the proposed RUL method, experiments were first conducted using the XJTU-SY dataset [26]. Subsequently, real measurement data were obtained using a test rig for ball bearings used in electric vehicle drive motors, allowing for further validation of the proposed method. The method was implemented in a Python 3.8 and PyTorch 2.4 environment, with the experimental configuration consisting of an Intel Core i7-13650HX (Intel, Santa Clara, CA, USA) and NVIDIA RTX 4060 (NVIDIA, Santa Clara, CA, USA).

3.1. XJTU-SY Dataset

3.1.1. Data Sources

The XJTU-SY dataset, released by Xi’an Jiaotong University, contains full life-cycle vibration signals from 15 bearings measured under three operating conditions, as shown in Figure 6 [27]. Each record comprises 32,768 data points acquired at one-minute intervals. The dataset includes both horizontal and vertical acceleration measurements; the horizontal channel provides rich degradation-related information. Accordingly, the horizontal acceleration data were used to formulate three prediction tasks. The division of the experimental dataset is summarized in Table 1.

3.1.2. Experimental Setup

The GAT-LSTM model is composed of three graph attention network layers, two long short-term memory layers, and one fully connected (FC) layer. The constructed path graph dataset is fed as the input to the GAT-LSTM, while the model’s output corresponds to the predicted RUL for each path graph sample. The architectural details of the GAT-LSTM model are summarized in Table 2. In this table, the size parameter

n_{1} \times n_{2}

denotes both the number of input nodes

n_{1}

and the feature dimension

n_{2}

. For model training, the mean squared error is adopted as the loss function, with a batch size of 64 configured for training iterations. Parameter optimization and updates are implemented using the Adam optimizer, where the learning rate is set to 0.003 and the total number of training epochs is fixed at 200.

3.1.3. Degradation Points of Bearing Service State

Accurate identification of FPT points is critical for RUL prediction, as it enables more precise extraction of bearing degradation features. The proposed method was therefore applied to identify FPT points in the XJTU-SY dataset. Piecewise linear RUL labels were constructed based on the degradation points of individual bearings. The identification results are summarized in Table 3. Compared with the reference points, the proposed method achieves high identification accuracy and effectively discriminates bearing service states, indicating strong versatility.
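
As an illustration of how piecewise linear labels can be built from a detected degradation point, the sketch below generates normalized RUL labels with one linear segment before the point and another after it. The exact label definition is the one given in the method section of the paper; the parameterization used here, in particular the argument `rul_at_fpt` that fixes the label value at the degradation point, is a hypothetical choice made only for this example.

```python
from typing import List


def piecewise_linear_rul(n: int, fpt: int, rul_at_fpt: float) -> List[float]:
    """Two-segment normalized RUL labels for n samples (hypothetical parameterization).

    Labels fall linearly from 1.0 at index 0 to rul_at_fpt at index fpt,
    then linearly from rul_at_fpt to 0.0 at the last index.
    """
    if not 0 < fpt < n - 1:
        raise ValueError("fpt must lie strictly inside the sample range")
    if not 0.0 <= rul_at_fpt <= 1.0:
        raise ValueError("rul_at_fpt must lie in [0, 1]")
    labels = []
    for t in range(n):
        if t <= fpt:
            labels.append(1.0 - (1.0 - rul_at_fpt) * t / fpt)
        else:
            labels.append(rul_at_fpt * (n - 1 - t) / (n - 1 - fpt))
    return labels


# With rul_at_fpt = 1.0 the label stays flat until the degradation point.
labels = piecewise_linear_rul(n=11, fpt=5, rul_at_fpt=1.0)
```

Choosing `rul_at_fpt` below 1.0 gives a slow decline in the healthy stage followed by a faster decline after the degradation point, so the two stages have different slopes.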

3.1.4. Analysis of Experimental Results

First, the proposed method was applied to the training set, and the model was trained to obtain the optimal GAT-LSTM model. Subsequently, the RUL predictions for the bearings were derived based on the testing set. During the training process, parameters for the single graph node count

s

and the temporal observation window

w

were adjusted based on the optimal model. The optimal parameters were determined by minimizing the mean RMSE of the samples. For the XJTU-SY dataset, the values were set to

s = 12, w = 7

for the points before degradation and

s = 6, w = 10

for the points after degradation. In addition to validating the effectiveness of the proposed method, comparative experiments were conducted. The comparison methods included CNN [28], LSTM [29], CNN-LSTM [30], GRU [31], Transformer [32] and CNN–Transformer, all of which were based on the piecewise linear RUL labels designed in this study. The RUL prediction evaluation metrics are shown in Table 4, and the prediction results are presented in Figure 7.

Taken together, Table 4 and Figure 7 show that the proposed RUL prediction method, based on degradation assessment and spatiotemporal feature fusion, is effective. The sample means of MAE and RMSE achieved by the proposed method are the smallest among the compared methods, indicating superior predictive performance on the XJTU-SY dataset. While the comparison methods demonstrate reasonable predictive performance before the degradation point, their performance declines significantly after it. This is due to the exponential growth of degradation indicators, which leads to sharp anomalies and a brief transitional phase. As the degradation point shifts, the imbalance between the sample sizes of the healthy and degraded states becomes pronounced, making it difficult for the comparison methods to extract effective degradation features from the limited samples and resulting in a substantial loss of predictive performance. In contrast, the proposed method integrates spatial and temporal features, allowing information related to the bearing degradation stages to be extracted efficiently. This integration mitigates the impact of sample proportions and of the position of the degradation point on the prediction outcomes, ultimately achieving higher predictive accuracy.

To further highlight the advantages of the proposed method, the RUL prediction errors, defined as the predicted values minus the actual values, were calculated for all comparison methods, as shown in Figure 8. The error distribution shows that most errors are concentrated after the degradation point, where they increase sharply because the onset of bearing degradation causes significant fluctuations in the vibration signals. Although the RUL prediction errors increase during the later stages of bearing degradation, the proposed method keeps the errors for the tested bearings within an acceptable range for RUL prediction tasks, demonstrating its effectiveness.

3.2. Dataset of Electric Drive Bearings in New Energy Vehicles

3.2.1. Experimental Platform

Full life cycle data for electric drive bearings used in this study were obtained from a test rig designed for ball bearings used in electric vehicle drive motors. The testing system comprises four subsystems: an electrical control cabinet, a test bench, an environmental chamber, and a water-cooling unit. It supports a series of bearing performance tests, including high temperature fatigue durability, overspeed, and service life tests.

In this study, service life tests were conducted using the test rig, with emphasis on the main test module shown in Figure 9. The module comprises the bed assembly, drive assembly, test shaft system, radial loading assembly, axial loading assembly, and a sealed environmental enclosure. During testing, an electric spindle provides rotational drive, and the radial and axial loading assemblies apply combined loads, enabling acquisition of full life cycle monitoring data for the electric drive bearings under varying operating conditions. The radial loading assembly provides 0–12 kN loading with an accuracy class

\pm 2 %

, and the axial loading assembly offers 0–5 kN with the same accuracy. The system supports variable speeds from 2000 to 240,000 r/min. The test bearings are grease-lubricated deep groove ball bearings designated EV6206E/2RZTN/C3, with key parameters listed in Table 5. Two operating conditions were evaluated, as summarized in Table 6.

3.2.2. Data Collection

The test rig for ball bearings used in electric vehicle drive motors includes a specialized shaft system that accommodates two test bearings and two companion bearings. The bearing arrangement follows a simply supported beam configuration, with the test bearings located at the two shaft ends and the companion bearings centrally positioned. Only monitoring data from the test bearing at the drive motor outboard end were used in this study. An accelerometer was mounted on the housing of this test bearing using a magnetic base; data were acquired at 1 min intervals, yielding 16,184 samples in a single run, as shown in Figure 10.

Upon test completion, four full life cycle vibration datasets were obtained for the electric drive bearings, and their vibration signals were plotted, as illustrated in Figure 11. Figure 12 shows a representative failure of an electric drive bearing, with cage fracture identified as the predominant failure mode. During testing, the cage experienced alternating stresses exceeding the material fatigue limit, initiating fatigue cracks that propagated to final fracture.

3.2.3. Degradation Points of Bearing Service State

The proposed FPT determination method based on feature-weighted fusion was validated on the XJTU-SY dataset. The method is therefore applied to identify FPT points in the full life cycle vibration monitoring data of the electric drive bearings. First, the data are divided into samples, and the mean, standard deviation, peak-to-peak value, root mean square value, and kurtosis are calculated for each sample as temporal degradation indicators, as shown in Figure 13a. Next, a sensitivity assessment of these indicators is conducted to obtain the feature weighting coefficients, followed by weighted fusion, as illustrated in Figure 13b. Based on the weighted fused features, the PELT algorithm is employed for change point detection to identify the FPT points, as shown in Figure 13c. The FPT points for each set of full life cycle signals are summarized in Table 7.
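
The sensitivity assessment and weighted fusion can be sketched as follows. The scoring rules used here are common textbook choices and are assumptions of this example rather than the definitions used in the paper: temporal correlation is taken as the absolute Pearson correlation between an indicator and the sample index, monotonicity as the normalized difference between the numbers of positive and negative increments, and the weight of each indicator as its normalized average score. Change point detection with PELT is not included.

```python
import math
from typing import List, Sequence, Tuple


def pearson_with_time(x: Sequence[float]) -> float:
    """Pearson correlation between an indicator series and its sample index."""
    n = len(x)
    t_mean = (n - 1) / 2
    x_mean = sum(x) / n
    cov = sum((t - t_mean) * (v - x_mean) for t, v in enumerate(x))
    t_var = sum((t - t_mean) ** 2 for t in range(n))
    x_var = sum((v - x_mean) ** 2 for v in x)
    return cov / math.sqrt(t_var * x_var) if x_var > 0 else 0.0


def monotonicity(x: Sequence[float]) -> float:
    """|#positive increments - #negative increments| / (n - 1)."""
    diffs = [b - a for a, b in zip(x, x[1:])]
    pos = sum(d > 0 for d in diffs)
    neg = sum(d < 0 for d in diffs)
    return abs(pos - neg) / len(diffs)


def min_max(x: Sequence[float]) -> List[float]:
    """Scale a series to [0, 1]; a constant series maps to zeros."""
    lo, hi = min(x), max(x)
    return [(v - lo) / (hi - lo) for v in x] if hi > lo else [0.0] * len(x)


def fuse(indicators: List[Sequence[float]]) -> Tuple[List[float], List[float]]:
    """Return (weights, fused series) from equally long indicator series."""
    scores = [(abs(pearson_with_time(x)) + monotonicity(x)) / 2 for x in indicators]
    total = sum(scores)
    # Fall back to uniform weights if no indicator shows any trend.
    weights = [s / total for s in scores] if total > 0 else [1 / len(scores)] * len(scores)
    normed = [min_max(x) for x in indicators]
    fused = [sum(w * col[t] for w, col in zip(weights, normed)) for t in range(len(normed[0]))]
    return weights, fused


# One steadily rising indicator and one oscillating indicator without trend.
weights, fused = fuse([[0.0, 1.0, 2.0, 3.0, 4.0], [1.0, 0.0, 1.0, 0.0, 1.0]])
```

In this toy case the oscillating indicator receives zero weight, so the fused series equals the normalized rising indicator. A change point detector would then be applied to the fused series.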

3.2.4. Analysis of Experimental Results

Two prediction tasks are planned, with Bearing1 and Bearing2 serving as the training sets for their respective tasks, while Bearing4 and Bearing3 serve as the testing sets. The proposed method is applied to the training sets, and the model is trained to obtain the optimal GAT-LSTM model, from which the RUL predictions for the bearings are derived based on the testing sets. During the training process, parameters for the sliding window

l

and temporal observation window

w

are adjusted based on the optimal model, seeking the best parameters according to the mean RMSE of the samples. For the electric drive bearing dataset, the values are set to

s = 32, w = 10

before the degradation point and

s = 15, w = 12

after the degradation point.

While validating the proposed method, comparative experiments are conducted. The comparison methods include CNN, LSTM, CNN-LSTM, GRU, Transformer and CNN–Transformer. All comparative methods are based on the piecewise linear RUL labels designed in this study.

During training, model fit was assessed by tracking training and testing loss curves. For Prediction Task 1 on the electric drive bearing dataset, Figure 14 shows that both losses decline concurrently and then stabilize, indicating that the model has captured generalizable degradation patterns and exhibits good generalization performance.

The RUL evaluation metrics are shown in Table 8. The RUL prediction results are illustrated in Figure 15. Furthermore, the RUL prediction errors for all comparative methods are calculated, as shown in Figure 16.

Analysis of the figures and tables indicates that prediction performance on the electric drive bearing dataset exhibits greater variability than on the XJTU-SY dataset, with sharp error increases near degradation points. This behavior is attributable to the dataset’s long time span and large number of samples, which tend to dilute salient temporal features in the predictive methods. After the degradation point, degradation indicators increase exponentially, revealing the abrupt nature of the degradation process. The transition phase is brief, and the durations of the healthy-service and degradation phases are highly imbalanced, leading to a pronounced decline in prediction accuracy.

However, the proposed method exhibited substantially better adaptability to the electric drive bearing dataset. For the two prediction tasks, the MAE was 0.021 and 0.0619, respectively, and the RMSE was 0.029 and 0.0712, substantially lower than those of competing baselines. In the full life cycle prediction task for electric drive bearings, the method maintained high robustness and accuracy across both the healthy and degradation phases.

To demonstrate the advantages of the proposed method, prediction performance is assessed via quantitative analysis of errors before and after degradation points. For each comparative method, the root mean square error and standard deviation are computed. By weighting larger errors more heavily, RMSE reflects overall prediction accuracy and deviations at failure points. The standard deviation quantifies the dispersion of errors around the mean, indicating prediction stability. Results are reported in Table 9.
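
The stage-wise error analysis described above can be sketched as follows. Errors are taken as predicted minus actual values, and the RMSE and the standard deviation of the errors are computed separately on each side of the degradation point. Assigning the sample at the degradation point to the later stage and using the population standard deviation are assumptions of this sketch; the numerical values are illustrative only and are not results from the paper.

```python
import math
from typing import Dict, List, Sequence


def error_stats(errors: List[float]) -> Dict[str, float]:
    """RMSE and population standard deviation of a list of errors."""
    n = len(errors)
    mean = sum(errors) / n
    return {
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
        "sd": math.sqrt(sum((e - mean) ** 2 for e in errors) / n),
    }


def stage_error_stats(
    actual: Sequence[float], predicted: Sequence[float], fpt: int
) -> Dict[str, Dict[str, float]]:
    """Split errors (predicted - actual) at index fpt and summarize each stage."""
    errors = [p - a for a, p in zip(actual, predicted)]
    return {"before": error_stats(errors[:fpt]), "after": error_stats(errors[fpt:])}


# Illustrative labels and predictions with the degradation point at index 3.
actual = [1.0, 0.9, 0.8, 0.4, 0.0]
predicted = [1.0, 1.0, 0.8, 0.6, 0.2]
stats = stage_error_stats(actual, predicted, fpt=3)
```

In the toy data the later stage has a constant error of 0.2, so its RMSE is 0.2 while its standard deviation is zero, which shows how the two statistics separate bias from dispersion.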

Across full-cycle metrics, the proposed method consistently outperforms all comparative approaches. Before the degradation point, the RMSE is 0.206 with an SD of 0.036; after the degradation point, the RMSE is 0.072 and the SD is 0.031. The small RMSE indicates minimal deviation from ground-truth values and effective control of extreme errors. The low SD denotes limited dispersion of prediction errors, evidencing stable and consistent accuracy. Overall, prediction errors are maintained within an acceptable range, and the large time span and abrupt degradation in the electric drive bearing dataset exert minimal influence on performance, underscoring the method’s adaptability. The RUL prediction framework based on degradation assessment and spatiotemporal feature fusion effectively captures time-dependent degradation, making it suitable for RUL prediction of electric drive bearings in new energy vehicles.

3.3. Influence of Multiple Factors on Prediction Performance

FPT points determined via feature-weighted fusion were used to construct a path graph structure that explicitly models temporal dependencies among monitored vibration signals. A GAT-LSTM was subsequently employed to capture time-dependent degradation, thereby completing RUL prediction for electric drive bearings. The effects of graph data structures, model architectures, and degradation-point selection on RUL prediction performance were then investigated.

3.3.1. Impact of Graph Data Structure on Prediction Performance

The construction of graph-structured data directly influences how temporal dependencies among monitored signals are represented, thereby affecting the prediction performance of bearing RUL. To examine this effect, k-nearest neighbor (KNN) graphs, complete graphs, and path graphs were considered, and comparative experiments were conducted on Task 1 of the XJTU-SY dataset and Task 1 of the electric drive bearing dataset, with other factors held constant. Evaluation metrics for RUL prediction under different graph-construction methods are summarized in Table 10.

Results indicate that the path graph construction achieves the best performance, suggesting that a path graph effectively captures temporal dependencies in the monitoring data for bearing RUL prediction. In a KNN graph, edges are established based on feature similarity, enabling the aggregation of similar nodes; in classification tasks, such graphs facilitate information exchange among similar nodes. A complete graph connects every pair of nodes with an edge, which is beneficial for aggregating global information.

However, for prediction tasks, temporal dependency is the primary determinant of performance in modeling bearing degradation. Compared with the path graph, neither the KNN graph nor the complete graph explicitly encodes the temporal order among monitored vibration signals, leading to inferior prediction performance. These results suggest that explicitly encoding temporal dependencies among monitored vibration signals enhances bearing RUL prediction.
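
The difference between the three graph constructions can be seen from their edge lists. The sketch below is a simplified illustration: nodes are indexed in temporal order, the path graph links each node only to its successor, and the KNN graph uses Euclidean distance between node features. These choices and the function names are assumptions; the construction used in the experiments, including the node sliding window, is described in the method section.

```python
import math
from typing import List, Sequence, Tuple

Edge = Tuple[int, int]


def path_edges(n: int) -> List[Edge]:
    """Connect each node to its temporal successor."""
    return [(i, i + 1) for i in range(n - 1)]


def complete_edges(n: int) -> List[Edge]:
    """Connect every pair of nodes."""
    return [(i, j) for i in range(n) for j in range(i + 1, n)]


def knn_edges(features: Sequence[Sequence[float]], k: int) -> List[Edge]:
    """Connect each node to its k nearest neighbors in feature space (undirected)."""
    edges = set()
    for i, f_i in enumerate(features):
        ranked = sorted(
            (math.dist(f_i, f_j), j) for j, f_j in enumerate(features) if j != i
        )
        for _, j in ranked[:k]:
            edges.add((min(i, j), max(i, j)))
    return sorted(edges)


# Four time-ordered nodes whose features alternate between two levels.
features = [[0.0], [5.0], [0.1], [5.1]]
graphs = {
    "path": path_edges(4),
    "complete": complete_edges(4),
    "knn": knn_edges(features, k=1),
}
```

In this toy example the KNN graph links nodes 0 and 2 and nodes 1 and 3, so no edge joins temporally adjacent nodes, whereas the path graph preserves the temporal order by construction.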

3.3.2. Impact of Model Structure and Degradation Points on Prediction Performance

The RUL prediction performance of the GAT-LSTM model depends on its capacity to extract degradation-related temporal dependencies. To assess this relationship, ablation experiments evaluated the contributions of the GAT module, the LSTM module, and FPT points to bearing RUL prediction. Comparative experiments were conducted on Task 2 of the XJTU-SY dataset and Task 2 of the electric drive bearing dataset, with all other factors held constant. Evaluation metrics for prediction performance are summarized in Table 11.

Ablation results indicate that the degradation-aware GAT-LSTM surpasses GAT, which in turn surpasses LSTM. From an architectural standpoint, GAT operates on an explicitly constructed temporal graph, enabling more effective capture of dynamic dependencies among time-indexed nodes; by contrast, LSTM models temporal dependence through gating alone, limiting structural expressiveness and interpretability. The hybrid GAT-LSTM leverages both mechanisms: GAT explicitly encodes temporal topological relations, while LSTM enhances modeling of long-range temporal patterns, yielding a more comprehensive temporal feature extractor. Combining these capabilities with a piecewise linear degradation assumption produces performance superior to that of the linear degradation counterpart. The theoretical rationale is that piecewise linear degradation better reflects the multi-stage nature of bearing wear, with stage-specific rates, thereby improving adaptability to abrupt state transitions and strengthening degradation feature capture, which in turn enhances predictive accuracy and generalization.

4. Conclusions

For RUL prediction of electric drive bearings in new energy vehicles, a method based on degradation assessment and spatiotemporal feature fusion is proposed to effectively capture time-dependent relationships and degradation states within full life-cycle monitoring data. Based on the experimental results and analyses, the following conclusions are drawn.

The feature-weighted fusion method for determining the FPT is both practical and effective. A piecewise linear degradation model, constructed on this basis, captures bearing degradation states with improved accuracy. As a result, the RMSE and MAE prediction metrics are reduced by 54.3% and 49.6%, respectively, leading to a significant enhancement in the accuracy of RUL prediction.
To represent temporal dependencies in time-series vibration data across the bearing’s entire life cycle, a path graph is constructed. A temporal observation window is employed to encode temporal features, and a GAT extracts spatial features from the path graph. The resulting representations are then processed by an LSTM network to capture temporal dependencies, thereby enhancing characterization of the bearings’ degradation process. Nevertheless, the graph data structure and its construction strategy must be carefully determined.
The proposed RUL prediction method for bearings demonstrates strong predictive performance. On the electric drive bearing dataset, the sample mean values of MAE, RMSE, and R² are 0.021, 0.029, and 0.981, respectively. These results offer valuable insights for predicting the RUL of electric drive bearings in new energy vehicles, particularly under conditions of long time spans and abrupt degradation. However, the model exhibits limited sensitivity to degradation features under high-frequency speed variations and extreme loads. Moreover, processing extremely long time-series monitoring data remains relatively time-consuming, so the method does not yet meet real-time health assessment requirements for electric drive systems in new energy vehicles. In practice, obtaining full-cycle data is challenging, and data are often fragmented. Future research will focus on improving the model’s sensitivity to degradation features and enhancing real-time performance under complex operating conditions, using transfer learning and few-shot learning techniques to extract effective information from fragmented data and thereby reduce the reliance on full-cycle data.

Author Contributions

Conceptualization, F.Y. and E.D.; methodology, E.D.; validation, Z.Z., W.Z. and Y.C.; formal analysis, F.Y.; investigation, J.Y.; resources, J.Y.; data curation, E.D.; writing—original draft preparation, E.D.; writing—review and editing, F.Y. and Z.Z.; supervision, J.Y.; project administration, Y.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Major Science and Technology Project of Henan Province, grant number 251100220200.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding authors.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

RUL	Remaining Useful Life
FPT	First Predicting Time
GAT	Graph Attention Network
LSTM	Long Short-Term Memory Network
RNN	Recurrent Neural Network
CNN	Convolutional Neural Network
GRU	Gated Recurrent Units
MAE	Mean Absolute Error
RMSE	Root Mean Square Error

References

Takiso, A.T.; Yu, J. Research progress on the optimization of thermal management systems for lithium-ion batteries in new energy vehicles. J. Energy Storage 2025, 134, 118144.
Soresini, F.; Barri, D.; Cazzaniga, I.; Ballo, F.M.; Mastinu, G.; Gobbi, M. Artificial Intelligence for Fault Detection of Automotive Electric Motors. Machines 2025, 13, 457.
Luu, T.T.; Huynh, A.D. A ResNet-based deep reinforcement learning framework using soft actor-critic for remaining useful life prediction of rolling bearings. Results Eng. 2025, 27, 106739.
Yan, B.X.; Ma, X.B.; Huang, G.F.; Zhao, Y. Two-stage physics-based Wiener process models for online RUL prediction in field vibration data. Mech. Syst. Signal Process. 2021, 152, 107378.
Liu, C.Y.; Gryllias, K. A semi-supervised Support Vector Data Description-based fault detection method for rolling element bearings based on cyclic spectral analysis. Mech. Syst. Signal Process. 2020, 140, 106682.
Pei, X.W.; Dong, S.J.; Tang, B.P.; Pan, X.J. Bearing running state recognition method based on feature-to-noise energy ratio and improved deep residual shrinkage network. IEEE/ASME Trans. Mechatron. 2021, 27, 3660–3671.
Mi, J.P.; Hou, Y.C.; He, W.T.; He, C.C.; Zhao, H.P.; Huang, W.J. A nonparametric cumulative sum-based fault detection method for rolling bearings using high-level extended isolated forest. IEEE Sens. J. 2022, 23, 2443–2455.
Ding, N.; Li, H.L.; Yin, Z.W.; Zhong, N.; Zhang, L. Journal bearing seizure degradation assessment and remaining useful life prediction based on long short-term memory neural network. Measurement 2020, 166, 108215.
Chen, Z.Y.; Li, Z.R.; Wu, J.; Deng, C.; Dai, W. Deep residual shrinkage relation network for anomaly detection of rotating machines. J. Manuf. Syst. 2022, 65, 579–590.
Zhu, W.C.; Ni, G.X.; Cao, Y.P.; Wang, H. Research on a rolling bearing health monitoring algorithm oriented to industrial big data. Measurement 2021, 185, 110044.
Ma, Q.L.; Li, S.; Shen, L.F.; Wang, J.B.; Wei, J.; Yu, Z.W. End-to-end incomplete time-series modeling from linear memory of latent variables. IEEE Trans. Cybern. 2020, 50, 4908–4920.
Chen, Y.H.; Peng, G.L.; Zhu, Z.Y.; Li, S.J. A novel deep learning method based on attention mechanism for bearing remaining useful life prediction. Appl. Soft Comput. 2020, 86, 105919.
Zhao, Q.W.; Zhang, X.L. Prediction of remaining useful life for rolling bearing based on ISOMAP and multi-head self-attention with gated recurrent unit. J. Vib. Control 2025, 31, 3187–3205.
Qin, Y.; Xiang, S.; Chai, Y.; Chen, H.Z. Macroscopic–microscopic attention in LSTM networks based on fusion features for gear remaining life prediction. IEEE Trans. Ind. Electron. 2019, 67, 10865–10875.
Shen, Y.Z.; Tang, B.P.; Li, B.; Tan, Q.; Wu, Y.L. Remaining useful life prediction of rolling bearing based on multi-head attention embedded Bi-LSTM network. Measurement 2022, 202, 111803.
Wang, Z.; Cheng, J.F.; Zheng, H.; Zou, X.F.; Tao, F. Multistage Convolutional Autoencoder and BCM-LSTM Networks for RUL Prediction of Rolling Bearings. IEEE Trans. Instrum. Meas. 2023, 72, 2527713.
Zhu, J.; Chen, N.; Peng, W.W. Estimation of bearing remaining useful life based on multiscale convolutional neural network. IEEE Trans. Ind. Electron. 2018, 66, 3208–3216.
Cao, Y.D.; Ding, Y.F.; Jia, M.P.; Tian, R.S. A novel temporal convolutional network with residual self-attention mechanism for remaining useful life prediction of rolling bearings. Reliab. Eng. Syst. Saf. 2021, 215, 107813.
Kumar, A.; Parkash, C.; Kundu, P.; Tang, H.S.; Xiang, J.W. Enhanced deep learning framework for accurate near-failure RUL prediction of bearings in varying operating conditions. Adv. Eng. Inf. 2025, 65, 103231.
Jin, Z.; Wu, Z.W.; Zhao, J.Y.; He, D.Q.; Zhuang, Y. Graph Convolutional Neural Network Algorithms for Bearing Remaining Useful Life Prediction: A Review. J. Fail. Anal. Prev. 2025, 25, 1040–1056.
Han, K.; Wang, Y.; Chen, H.; Chen, X.; Guo, J.; Liu, Z.; Tang, Y.; Xiao, A.; Xu, C.; Xu, Y.; et al. A Survey on Vision Transformer. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 45, 87–110.
Wei, Y.P.; Wu, D.Z.; Janis, T. Bearing remaining useful life prediction using self-adaptive graph convolutional networks with self-attention mechanism. Mech. Syst. Signal Process. 2023, 188, 110010.
Chen, X.; Zeng, M. Convolution-graph attention network with sensor embeddings for remaining useful life prediction of turbofan engines. IEEE Sens. J. 2023, 23, 15786–15794.
Liang, P.H.; Li, Y.; Wang, B.; Yuan, X.M.; Zhang, L.J. Remaining useful life prediction via a deep adaptive transformer framework enhanced by graph attention network. Int. J. Fatigue 2023, 174, 107722.
Guo, X.; Tu, J.S.; Zhan, S.P.; Zhang, W.L.; Ma, L.X.; Jia, D. A novel method for online prediction of the remaining useful life of rolling bearings based on wavelet power spectrogram and Transformer structure. Eng. Res. Express 2023, 5, 045074.
Lei, Y.G.; Han, T.Y.; Wang, B.; Li, N.P. XJTU-SY rolling bearing accelerated life test data set interpretation. J. Mech. Eng. 2019, 55, 1–6.
Wang, B.; Lei, Y.G.; Li, N.P.; Li, N.B. A Hybrid Prognostics Approach for Estimating Remaining Useful Life of Rolling Element Bearings. IEEE Trans. Reliab. 2020, 69, 401–412.
Zhang, J.J.; Wang, P.; Yan, R.Q.; Robert, G. Long short-term memory for machine remaining life prediction. J. Manuf. Syst. 2018, 48, 78–86. [Google Scholar] [CrossRef]
Liu, Y.B.; Ge, M.F.; Zhang, C.H.; Liu, J. A deep feature learning method based on time-frequency images and MsCNN_SE for rul prediction. In Proceedings of the 2021 IEEE International Conference on Sensing Diagnostics, Weihai, China, 13–15 August 2021; pp. 163–167. [Google Scholar]
Zhou, J.Z.; Shan, Y.H.; Liu, J.; Xu, Y.H.; Zheng, Y. Degradation tendency prediction for pumped storage unit based on integrated degradation index construction and hybrid CNN-LSTM model. Sensors 2020, 20, 4277.
Sun, S.C.; Luo, J.; Huang, A.; Xia, X.Y.; Yang, J.L.; Zhou, H. Remaining useful life prediction for rolling bearings based on adaptive aggregation of dynamic feature correlations. J. Vib. Control 2025, 31, 2450–2463.
Jin, X.C.; Ji, Y.P.; Li, S.P.; Lv, K.L.; Xu, J.Z.; Jiang, H.N.; Fu, S.N. Remaining Useful Life Prediction for Rolling Bearings Based on TCN–Transformer Networks Using Vibration Signals. Sensors 2025, 25, 3571.

Figure 1. RUL Prediction Driven by Degradation Points and Temporal Data Fusion.

Figure 2. Flowchart of the FPT Determination Method Based on Feature Weight Fusion.

Figure 3. Change Point Determination Based on Weighted Fused Features.

Figure 4. Path Graph Structure Data Node Generation.

Figure 5. Graph Spatiotemporal Structure Data Construction.

Figure 6. Test Rig for the XJTU-SY Dataset.

Figure 7. RUL Curve for the XJTU-SY Dataset. (a) Task 1; (b) Task 2; (c) Task 3.

Figure 8. Prediction Errors for the XJTU-SY Dataset. (a) Task 1; (b) Task 2; (c) Task 3.

Figure 9. Test Rig for Ball Bearings Used in Electric Drive Motors for New Energy Vehicles.

Figure 10. Layout for Data Collection of Electric Drive Bearings.

Figure 11. Vibration Signals Throughout the Full Life Cycle of Electric Drive Bearings. (a) Bearing 1; (b) Bearing 2.

Figure 12. Failure Modes of Electric Drive Bearings.

Figure 13. Determination of FPT Points for Electric Drive Bearings. (a) Temporal Feature Indicators; (b) Weighted Fusion Features; (c) FPT Detection Results.

Figure 14. Evolution of GAT-LSTM Training Loss.

Figure 15. RUL Curve for the Electric Drive Bearing Dataset. (a) Task 1; (b) Task 2.

Figure 16. Prediction Errors for the Electric Drive Bearing Dataset. (a) Task 1; (b) Task 2.

Table 1. Prediction Tasks of the XJTU-SY Dataset.

Prediction Task	Training Set	Test Set
1	Bearing1_1, Bearing1_2	Bearing1_3
2	Bearing2_2, Bearing2_4	Bearing2_5
3	Bearing3_3	Bearing3_4

Table 2. Structure of the GAT-LSTM Model.

Model Composition	Input Size	Output Size
GAT_1	$N \times 16$	$N \times 400$
GAT_2	$N \times 400$	$N \times 300$
GAT_3	$N \times 300$	$N \times 200$
LSTM_1	$N \times 200$	$N \times 30$
LSTM_2	$N \times 30$	$N \times 20$
FC	$N \times 20$	$N \times 1$
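
The layer widths in Table 2 form a chain in which each layer's output width is the next layer's input width. The following is a minimal sketch of that consistency check in plain Python; the `LAYERS` list and the `propagate` helper are illustrative names, not part of the authors' implementation, and details the table does not give (attention heads, activations, LSTM sequence handling) are deliberately left out.

```python
# Layer widths transcribed from Table 2 as (layer type, input width, output width).
# N (the number of graph nodes) stays a free parameter because the table does not fix it.
LAYERS = [
    ("GAT", 16, 400),
    ("GAT", 400, 300),
    ("GAT", 300, 200),
    ("LSTM", 200, 30),
    ("LSTM", 30, 20),
    ("FC", 20, 1),
]


def propagate(n_nodes, layers):
    """Return the feature-matrix shape before the first layer and after every layer.

    Raises ValueError if a layer's input width does not match the width
    produced by the previous layer.
    """
    shape = (n_nodes, layers[0][1])
    shapes = [shape]
    for name, d_in, d_out in layers:
        if shape[1] != d_in:
            raise ValueError(f"{name} expects width {d_in}, got {shape[1]}")
        shape = (n_nodes, d_out)
        shapes.append(shape)
    return shapes


# Hypothetical example with N = 5 nodes.
for shape in propagate(5, LAYERS):
    print(shape)
```

With these widths, every node's 16 input features are reduced to a single scalar output, which is consistent with one RUL value per node.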

Table 3. FPT Point Recognition Results for the XJTU-SY Dataset.

Task	Bearing	Test Duration/min	FPT/min	Reference FPT/min
1	Bearing1_1	123	77	77
	Bearing1_2	161	45	35
	Bearing1_3	158	58	59
2	Bearing2_2	161	50	46
	Bearing2_4	42	31	30
	Bearing2_5	339	128	122
3	Bearing3_3	371	343	340
	Bearing3_4	1515	1443	1418

Table 4. Prediction Result Metrics for the XJTU-SY Dataset.

Model	Task	MAE (Mean)	MAE (SD)	RMSE (Mean)	RMSE (SD)	R² (Mean)
GAT-LSTM	1	0.03	0.024	0.036	0.022	0.974
	2	0.053	0.061	0.061	0.055	0.963
	3	0.006	0.027	0.007	0.025	0.878
LSTM	1	0.04	0.036	0.051	0.032	0.971
	2	0.081	0.054	0.089	0.049	0.907
	3	0.027	0.106	0.032	0.098	0.281
CNN	1	0.144	0.123	0.187	0.118	0.629
	2	0.108	0.044	0.112	0.037	0.866
	3	0.021	0.043	0.029	0.039	0.027
CNN-LSTM	1	0.13	0.096	0.145	0.082	0.728
	2	0.098	0.05	0.11	0.048	0.881
	3	0.01	0.024	0.019	0.023	0.701
GRU	1	0.049	0.029	0.058	0.022	0.966
	2	0.09	0.073	0.106	0.072	0.851
	3	0.021	0.062	0.03	0.053	0.742
Transformer	1	0.085	0.046	0.091	0.04	0.909
	2	0.096	0.052	0.098	0.048	0.883
	3	0.008	0.019	0.012	0.015	0.812
CNN–Transformer	1	0.06	0.036	0.065	0.03	0.952
	2	0.092	0.06	0.098	0.052	0.883
	3	0.033	0.021	0.041	0.017	0.366
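
For readers who want to reproduce the three metrics reported in Table 4, the following is a minimal sketch that assumes the standard definitions of MAE, RMSE, and R². The sample labels and predictions are hypothetical normalized RUL values (1 = start of life, 0 = failure) and are not taken from the experiments.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical normalized RUL labels and model outputs (illustration only).
y_true = [1.0, 0.75, 0.5, 0.25, 0.0]
y_pred = [0.9, 0.8, 0.5, 0.2, 0.1]

print(mae(y_true, y_pred), rmse(y_true, y_pred), r2(y_true, y_pred))
```

Under these definitions, MAE can never exceed RMSE on the same predictions, which is a quick sanity check when reading metric tables.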

Table 5. Parameters of EV6206E Bearings.

Parameter Name	Value	Parameter Name	Value
Bearing bore diameter/mm	30	Roller diameter/mm	9.525
Bearing outside diameter/mm	62	Roller number	9
Bearing width/mm	16	Dynamic load rating/kN	19.5
Contact angle/(°)	0	Rated static load/kN	11.3

Table 6. Experimental Conditions for Electric Drive Bearings.

Operating Conditions	1	2
Rotational speed/(r/min)	5400	18,000
Radial force/kN	1.79	1.79
Axial force/kN	0.26	0.26
Test Bearing	Bearing1, Bearing4	Bearing2, Bearing3

Table 7. FPT Point Recognition Results for Electric Drive Bearings.

Task	Bearing	Test Duration/h	FPT/h
1	Bearing1	84.4	68.17
1	Bearing4	115.7	105.91
2	Bearing2	125.7	73.45
2	Bearing3	93.2	99.89

Table 8. Prediction Result Metrics for Electric Drive Bearings.

Model	Task	MAE (Mean)	MAE (SD)	RMSE (Mean)	RMSE (SD)	R² (Mean)
GAT-LSTM	1	0.021	0.023	0.029	0.021	0.981
GAT-LSTM	2	0.062	0.019	0.065	0.018	0.904
LSTM	1	0.177	0.064	0.182	0.058	0.297
LSTM	2	0.18	0.072	0.187	0.071	0.141
CNN	1	0.374	0.094	0.398	0.092	0.421
CNN	2	0.346	0.092	0.375	0.084	0.419
CNN-LSTM	1	0.193	0.035	0.201	0.029	0.271
CNN-LSTM	2	0.117	0.024	0.132	0.021	0.686
GRU	1	0.13	0.062	0.153	0.057	0.603
GRU	2	0.286	0.073	0.287	0.068	0.592
Transformer	1	0.085	0.028	0.091	0.025	0.85
Transformer	2	0.097	0.055	0.103	0.051	0.726
CNN–Transformer	1	0.09	0.046	0.098	0.037	0.806
CNN–Transformer	2	0.037	0.02	0.042	0.018	0.906

Table 9. Analysis of Prediction Errors for Electric Drive Bearings.

Model	Task	RMSE (Before the FPT)	SD (Before the FPT)	RMSE (After the FPT)	SD (After the FPT)
GAT-LSTM	1	0.206	0.036	0.072	0.031
GAT-LSTM	2	0.833	0.016	0.767	0.086
LSTM	1	0.024	0.018	0.051	0.026
LSTM	2	1.07	0.015	0.918	0.309
CNN	1	0.132	0.044	0.192	0.105
CNN	2	0.72	0.027	0.59	0.218
CNN-LSTM	1	0.404	0.037	0.297	0.146
CNN-LSTM	2	0.65	0.024	0.531	0.197
GRU	1	0.199	0.01	0.184	0.066
GRU	2	1.124	0.009	0.957	0.36
Transformer	1	0.09	0.023	0.079	0.04
Transformer	2	0.924	0.01	0.773	0.298
CNN–Transformer	1	0.081	0.039	0.16	0.053
CNN–Transformer	2	0.967	0.004	0.842	0.277

Table 10. Impact of Different Graphing Methods on Prediction Results.

Dataset	Graph Construction Method	MAE	RMSE	R²
XJTU-SY	KNN Graph	0.067	0.051	0.802
	Complete Graph	0.043	0.039	0.921
	Path Graph	0.03	0.036	0.974
Electric Drive Bearing	KNN Graph	0.054	0.048	0.854
	Complete Graph	0.038	0.041	0.902
	Path Graph	0.021	0.029	0.981
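
The three graph construction methods compared in Table 10 differ only in which node pairs are connected. The sketch below builds the edge list for each in plain Python. It rests on several assumptions not stated in this table: nodes are indexed in time order, the path graph is stored with edges in both directions, and the KNN graph uses Euclidean distance with a user-chosen `k`. All function names are illustrative.

```python
import math


def path_graph_edges(n):
    """Path graph: node i is linked only to its temporal neighbour i + 1 (both directions)."""
    edges = []
    for i in range(n - 1):
        edges.append((i, i + 1))
        edges.append((i + 1, i))
    return edges


def complete_graph_edges(n):
    """Complete graph: every ordered pair of distinct nodes is connected."""
    return [(i, j) for i in range(n) for j in range(n) if i != j]


def knn_graph_edges(features, k):
    """KNN graph: each node points to its k nearest nodes by Euclidean distance."""
    edges = []
    for i, fi in enumerate(features):
        dists = sorted(
            (math.dist(fi, fj), j) for j, fj in enumerate(features) if j != i
        )
        edges.extend((i, j) for _, j in dists[:k])
    return edges


# Hypothetical one-dimensional node features for four time steps.
features = [[0.0], [1.0], [3.0], [7.0]]
print(path_graph_edges(4))
print(len(complete_graph_edges(4)))
print(knn_graph_edges(features, k=1))
```

For n nodes, the path graph has 2(n - 1) directed edges, the complete graph has n(n - 1), and the KNN graph has n·k, so the path graph is the sparsest of the three while still preserving temporal order.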

Table 11. Ablation Experiment Results.

Dataset	Model Structure	MAE	RMSE	R²
XJTU-SY	GAT	0.075	0.092	0.912
	LSTM	0.081	0.089	0.907
	GAT-LSTM	0.116	0.121	0.608
	FPT-GAT-LSTM	0.053	0.061	0.9632
Electric Drive Bearing	GAT	0.072	0.081	0.825
	LSTM	0.18	0.187	0.141
	GAT-LSTM	0.17	0.196	0.195
	FPT-GAT-LSTM	0.062	0.065	0.904

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Yang, F.; Dong, E.; Zhong, Z.; Zhang, W.; Cui, Y.; Ye, J. Remaining Useful Life Prediction of Electric Drive Bearings in New Energy Vehicles: Based on Degradation Assessment and Spatiotemporal Feature Fusion. Machines 2025, 13, 914. https://doi.org/10.3390/machines13100914

