Research on Fault Prediction Method for Electric Multiple Unit Gearbox Based on Gated Recurrent Unit–Hidden Markov Model

Due to the limited availability of fault samples and the high cost of labeling fault samples in Electric Multiple Unit (EMU) gearbox monitoring data, the degradation process of key components in the CRH5 gearbox was simulated using rigid-flexible coupling dynamics. Vibration acceleration data from the simulation were used to create a six-dimensional hybrid feature domain representing the degradation process. By combining the ability of the Hidden Markov Model (HMM) to handle hidden state transition probabilities in temporal data with the ability of the Gated Recurrent Unit (GRU) to handle long-range, strongly dependent temporal data, a GRU-HMM fault prediction model was developed. The model was validated using monitoring data and the six-dimensional hybrid feature domain from the CRH5 gearbox and compared against actual maintenance records. The findings indicate that the GRU-HMM fault prediction model can effectively recognize the degradation patterns of multiple components and offers higher fault prediction accuracy than traditional models. These research outcomes are expected to help optimize EMU maintenance schedules based on usage conditions, improve EMU utilization rates, and reduce operation and maintenance costs, thereby providing theoretical support for maintenance planning.


Introduction
As an important part of the traction transmission system, the gearbox affects the safety and stability of the Electric Multiple Unit (EMU). The components within the gearbox are subject to diverse loads, resulting in different types of malfunctions. Hence, analyzing operational data from the gearbox and anticipating potential failures are essential for ensuring the operational safety of the EMU.
The traditional method of mechanical system fault diagnosis uses fault samples with label information to construct a fault diagnosis model for rotating machinery gearboxes, decomposes the complex vibration signals into modal functions, and analyzes the single modes so as to reflect the local characteristics of non-stationary signals [1][2][3]. This method of modal decomposition identifies features instantaneously but requires a large amount of data to produce effective results [4][5][6]. With the development of artificial intelligence, machine learning and deep learning methods have become the main research direction of current fault diagnosis because of their efficient and powerful feature extraction; these methods mainly include support vector machines (SVMs) and neural network algorithms [7][8][9][10]. The SVM remains effective when the data dimension is larger than the number of samples, and it excels at solving nonlinear problems. However, the SVM is sensitive to noisy data, is not directly applicable to multi-class classification problems, and cannot directly provide probability estimates [11][12][13][14]. Neural networks are self-learning: they extract useful information from data through training and can find solutions quickly when dealing with complex problems. However, training and developing neural networks usually requires a large amount of computational resources, which may increase the cost [15][16][17][18][19][20].
In rail train fault diagnosis research, due to the difficulty of obtaining fault characteristics from actual operation monitoring data, most scholars use accelerated life experiments or dynamics simulation to obtain the fault characteristics of key components and use deep learning algorithms to categorize and predict the fault characteristics [21,22]. Generally, the frequency response function between acceleration signals is analyzed as a fault indicator by establishing a dynamic model of the train or a specific component and analyzing the impact of radial and axial vibration under different fault conditions [23,24]. Then, using an RNN or CNN, normal and fault conditions are classified and recognized [25][26][27][28]. However, in fault identification for EMU gearboxes, the complexity of the vibration response signal components makes it difficult to extract the characteristic signals. Additionally, the high cost of labeling fault samples through accelerated life experiments further complicates the process. How to carry out accurate fault prediction and diagnosis of key components of rolling stock through dynamics simulation and other cost-effective methods is a current fault prediction problem that needs to be solved in the field of rail transportation.
In summary, to solve the fault prediction problem caused by the scarcity of fault samples for the EMU gearbox, this paper establishes a rigid-flexible coupling dynamics model of the gearbox, taking into account the special structure of the CRH5 gearbox and the completeness of the collected data. The study focuses on the universal joints, driving gear, and driven gears, which are the components with the most frequent faults in the CRH5 gearbox system. Based on the rigid-flexible coupling dynamics model of the gearbox, a dynamics simulation of the whole life cycle degradation process of the key components is carried out. Using the vibration acceleration data of the degradation process, a hybrid feature domain for the degradation of key components of the gearbox is constructed. This approach addresses the problem of fault feature extraction when fault samples of the EMU gearbox are scarce. A fault prediction model of the EMU gearbox based on GRU-HMM is proposed, which overcomes the limitation of the traditional HMM regarding the input sequence length. The comprehensive identification accuracy of the GRU-HMM fault prediction model is 15% higher than that of traditional HMM and SVM fault prediction models. The research results improve the reliability of gearbox failure prediction for the EMU, improve the efficiency of EMU utilization, reduce the cost of utilization and maintenance, and provide theoretical support for the transition of the maintenance mode to condition-based repair. At the same time, they help guarantee safe operation, which has important practical application value.

Demand Analysis of GRU-HMM Fault Prediction Model
The transmission structure of the CRH5 is traction motor, universal shaft, and gearbox, and vibration acceleration and temperature data are collected simultaneously during operation. The monitoring data of one primary repair cycle amount to about 200,000 items, and the data show time series characteristics; they are typical large-capacity, long-range time series data. The Hidden Markov Model (HMM) is capable of modeling probability distributions of time-series data, and its hidden state setting is well suited for multicomponent fault prediction studies. However, the HMM has a limitation on the length of the input sequence, has weak performance in dealing with dependencies over long distances, and is prone to problems such as gradient vanishing and gradient explosion. Therefore, fusing a Gated Recurrent Unit (GRU) with the HMM can compensate for these shortcomings and provide better access to the dependencies of long sequence data. The gating mechanism of the GRU can also effectively mitigate gradient vanishing and gradient explosion, which improves the accuracy of the fault prediction model.
•
Initial vector $\pi = \{\pi_i\}$, which indicates the probability that the degraded state of the component occurs at the initial time, where C(i) is the probability that the initial state is state i.

•
State transition matrix $A = \{a_{ij}\}$, which denotes the transfer probability of the gearbox component from state $s_i$ to state $s_j$.

•
Confusion matrix $B = \{b_j(v_i)\}$, which indicates the probability of occurrence of eigenvector $v_i$ in state $s_j$ at train operating moment t.
The HMM fault prediction model is based on the observation sequence O and the initial model $\lambda_0 = (\pi, A, B)$. The parameters $\pi_i$, $a_{ij}$, and $b_j(v_i)$ are iteratively re-estimated to obtain the new model $\lambda = (\pi, A, B)$ until $P = P(O \mid \lambda)$ converges.
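As an illustration of the quantity $P(O \mid \lambda)$ that the re-estimation drives to convergence, the following is a minimal sketch of the scaled forward algorithm for a discrete-observation HMM. The two-state parameters and the observation sequence are hypothetical toy values chosen for the example, not values from this study.

```python
import numpy as np

def forward_likelihood(pi, A, B, obs):
    """Return P(O | lambda) for a discrete-observation HMM (scaled forward pass)."""
    pi, A, B = (np.asarray(m, dtype=float) for m in (pi, A, B))
    alpha = pi * B[:, obs[0]]            # alpha_1(i) = pi_i * b_i(o_1)
    log_p = 0.0
    for t in range(len(obs)):
        if t > 0:
            # alpha_t(j) = sum_i alpha_{t-1}(i) * a_ij * b_j(o_t)
            alpha = (alpha @ A) * B[:, obs[t]]
        scale = alpha.sum()              # rescale each step to avoid underflow
        log_p += np.log(scale)
        alpha /= scale
    return float(np.exp(log_p))

# Hypothetical toy model: 2 hidden states, 2 observation symbols.
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.9, 0.1], [0.2, 0.8]]
likelihood = forward_likelihood(pi, A, B, [0, 1, 0])
```

Accumulating the logarithm of the per-step scale factors keeps the computation stable for long monitoring sequences, where the unscaled forward variables would underflow.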

GRU Updates the HMM Input State Vector
The GRU, a type of Recurrent Neural Network, addresses the performance limitations of the HMM in handling long-distance dependencies and excels in predicting time-series data. First, the gearbox operation data of the on-board monitoring system of the EMU are input to the GRU as a training set. The result of the activation function of the GRU is then input to the HMM as the initial fault probability, and the results of the GRU training are input to the HMM as the state transition matrix. In conjunction with the experimental testing of the model, the number of hidden layers of the GRU network was set to 2. Due to the large amount of input data, the number of GRU units used in each layer was set to 128. The computational procedure for updating the input of the GRU to the HMM is as follows:

•
Construct an initial failure probability output layer. Using the gearbox operation monitoring data as the training set, calculate the output conditional probability $P_i \in [0, 1]$ for each fault type using the Softmax activation function. The output updates the initial vector of the HMM:
$$P_i = \frac{e^{V_i}}{\sum_{j=1}^{n} e^{V_j}}$$
where $V_i$ is the weighting factor for each fault type and n is the number of hidden states in the gearbox.

•
Calculate the reset gate, which determines, from the eigenvalues and the fault type output probabilities of the previous time step, how much of the previous stage information is to be forgotten, where $R_t$ denotes the reset vector; $X_i$ denotes the fault eigenvalue; $H_{t-1}$ denotes the hidden state of the fault type at the previous time step; $b_r$ denotes the bias parameter; and σ is a sigmoid function with the value range (0, 1).

•
Calculate the update gate, where $Z_t$ denotes the update vector and $b_z$ denotes the bias parameter.

•
Calculate the output value, which is used as the state transition matrix to update the parameters of the HMM, where $H_t$ denotes the output state vector at time t.
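The steps above can be sketched with the standard GRU recurrence. This is a minimal NumPy illustration rather than the implementation used in this study: the weight names (`W_xr`, `W_hr`, and so on), the candidate-state bias `b_h`, the output weights `V`, and the random inputs are assumptions introduced for the example. The layer width of 128 units, the six input features, and the five hidden states follow the values stated in the text.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(v):
    e = np.exp(v - np.max(v))            # shift for numerical stability
    return e / e.sum()

def gru_step(x_t, h_prev, p):
    """One step of the standard GRU recurrence (weight names are hypothetical)."""
    r_t = sigmoid(x_t @ p["W_xr"] + h_prev @ p["W_hr"] + p["b_r"])   # reset gate R_t
    z_t = sigmoid(x_t @ p["W_xz"] + h_prev @ p["W_hz"] + p["b_z"])   # update gate Z_t
    h_cand = np.tanh(x_t @ p["W_xh"] + (r_t * h_prev) @ p["W_hh"] + p["b_h"])
    return z_t * h_prev + (1.0 - z_t) * h_cand                        # output state H_t

rng = np.random.default_rng(0)
n_features, n_hidden, n_states = 6, 128, 5

params = {}
for gate in ("r", "z", "h"):
    params[f"W_x{gate}"] = rng.normal(0.0, 0.1, (n_features, n_hidden))
    params[f"W_h{gate}"] = rng.normal(0.0, 0.1, (n_hidden, n_hidden))
    params[f"b_{gate}"] = np.zeros(n_hidden)
V = rng.normal(0.0, 0.1, (n_hidden, n_states))   # hypothetical output-layer weights

h = np.zeros(n_hidden)
for x_t in rng.normal(size=(20, n_features)):    # 20 synthetic feature vectors
    h = gru_step(x_t, h, params)

initial_probs = softmax(h @ V)                   # candidate update for the HMM initial vector
```

Because the Softmax output is non-negative and sums to one, it can be used directly as a probability vector over the hidden fault states.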

GRU-HMM Fault Prediction Model Algorithm Flow
The algorithm flow of the GRU-HMM fault prediction model for the EMU gearbox is shown in Figure 1.
(1) Perform wavelet de-noising and normalization on the operation monitoring data samples $O = \{o_1, o_2, \ldots, o_N\}$ and the degradation feature sets $F = \{f_1, f_2, \ldots, f_N\}$ of each component to determine the training and testing sets.
(3) Construct the GRU network model; set the number of hidden layers, M = 4, and the number of GRU units in each layer, u = 128.
(4) Input the training set, optimize the network weights and parameters through iterative training, and find the optimal prediction step size and each parameter.
(5) Output the GRU training results to update the HMM parameters $\pi_i$, $a_{ij}$.
(6) Define the forward variable $\alpha_M(i)$.
(7) Define the backward variable $\beta_M(i)$.
(8) Train the parameters with the Baum-Welch algorithm. Define the variable $\gamma_t^{(n)}(i)$, denoting the probability of being in state $q_i$ at moment t, and the variable $\xi_m^{(n)}(i, j)$, denoting the probability of being in state $q_i$ at moment t and in state $q_j$ at t + 1.
(9) Update the HMM model parameters $\pi_i$, $a_{ij}$, and $b_j(v_i)$ again. If the values of $\pi_i$, $a_{ij}$, and $b_j(v_i)$ have converged, the algorithm ends; otherwise, go back to (5) to continue the iteration.
(10) Output the model $\lambda = (\pi, A, B)$ and use the test set to output the failure probabilities $P = P(O \mid \lambda)$.
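Steps (6) to (9) correspond to the textbook Baum-Welch re-estimation. The following is a minimal sketch for a discrete-observation HMM with a single observation sequence. It assumes toy parameters and uses convergence of the log-likelihood as the stopping test (an assumption made for the example; the flow above tests convergence of the parameters themselves). The GRU update of step (5) is not included.

```python
import numpy as np

def baum_welch_step(pi, A, B, obs):
    """One Baum-Welch re-estimation pass; returns (pi, A, B, log P(O | old model))."""
    pi, A, B = (np.asarray(m, dtype=float) for m in (pi, A, B))
    obs = np.asarray(obs)
    T, N = len(obs), len(pi)
    alpha, beta, c = np.zeros((T, N)), np.zeros((T, N)), np.zeros(T)

    # Scaled forward variables
    alpha[0] = pi * B[:, obs[0]]
    c[0] = alpha[0].sum()
    alpha[0] /= c[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        c[t] = alpha[t].sum()
        alpha[t] /= c[t]

    # Scaled backward variables (same scale factors as the forward pass)
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = (A @ (B[:, obs[t + 1]] * beta[t + 1])) / c[t + 1]

    gamma = alpha * beta                                   # P(state i at time t | O)
    emit_next = B[:, obs[1:]].T * beta[1:]                 # b_j(o_{t+1}) * beta_{t+1}(j)
    xi = alpha[:-1, :, None] * A[None] * emit_next[:, None, :] / c[1:, None, None]

    new_pi = gamma[0]
    new_A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    new_B = np.zeros_like(B)
    for k in range(B.shape[1]):
        new_B[:, k] = gamma[obs == k].sum(axis=0)
    new_B /= gamma.sum(axis=0)[:, None]
    return new_pi, new_A, new_B, float(np.log(c).sum())

# Hypothetical toy model and observation sequence
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.9, 0.1], [0.2, 0.8]])
obs = [0, 0, 1, 0, 1, 1, 1, 0, 0, 1]

history = []
for _ in range(50):
    pi, A, B, log_lik = baum_welch_step(pi, A, B, obs)
    history.append(log_lik)
    if len(history) > 1 and history[-1] - history[-2] < 1e-8:
        break
```

Each pass cannot decrease the likelihood of the training sequence, so the loop can stop as soon as the improvement becomes negligible.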

Simulation Modeling of Dynamics for EMU Gearbox System
The CRH5, which incorporates French ALSTOM manufacturing technology, is designed for a maximum speed of 250 km/h. The biggest difference between the CRH5 and other EMUs is the transmission system, which adopts the structure of a traction motor and universal shaft-mounted gearbox, in which the universal shaft is connected to the traction motor and the gearbox through the universal joints at the two ends; the structure of the CRH5 gearbox is shown in Figure 2.
To facilitate the subsequent dynamic simulation, the model is simplified by deleting the upper and rear covers of the box as well as all bolts, nuts, and spacers, and the traction motor is replaced by a bracket to optimize the hermetic structure of the gear, axle, and box. The exploded view of the optimized gearbox model is shown in Figure 3.

CRH5 Gearbox Rigid-Flexible Coupling Dynamic Modeling
The rigid-flexible coupled dynamics model is employed to analyze the vibration characteristics of the universal joint, driving gear, and driven gears of the gearbox in the normal state and fault state. This model can more accurately reflect the vibration signal characteristics.

Theory of Rigid-Flexible Coupled Dynamics
Universal joints and gears undergoing elastic deformation need to be described in terms of a relative deformation field, which must satisfy the boundary conditions of the member. The expression is as follows:
$$u_{f1} = \sum_{k} a_k f_k, \qquad u_{f2} = \sum_{k} b_k g_k, \qquad u_{f3} = \sum_{k} c_k h_k$$
where $u_{f1}$, $u_{f2}$, and $u_{f3}$ denote the deformation components along the flexible body coordinates ($x_1$, $x_2$, and $x_3$); $a_k$, $b_k$, and $c_k$ are coefficients that depend on time t; and $f_k$, $g_k$, and $h_k$ are the basis functions of the deformed body.
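The motion constraints of universal joints and gears need to consider elastic deformation. Their motion constraint equations, together with their first and second time derivatives, are as follows:
$$C(q, t) = 0$$
$$C_q \dot{q} = -C_t$$
$$C_q \ddot{q} = -C_{tt} - 2 C_{qt} \dot{q} - (C_q \dot{q})_q \dot{q}$$
where C is the constraint equation; $q$, $\dot{q}$, and $\ddot{q}$ are the position, velocity, and acceleration vectors of the flexible body; $C_q \dot{q}$ is the coordinate velocity term of the flexible body; and $C_q \ddot{q}$ is the acceleration term of its motion.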
The CRH5 gearbox rigid-flexible coupled dynamics system is connected by constraints and force elements. In the dynamics equations of the system, M is the inertia matrix of the gearbox; λ is the Lagrange multiplier; $Q_e$ is the generalized external force vector; $Q_V$ is the generalized inertia force vector, which is quadratic in the velocity; and i denotes the flexible body components, i.e., the universal joints, driving gear, and driven gear.

Working Environment and Model Setup
The 3D model of the CRH5 gearbox is imported into ADAMS, and the material and mass properties of the solid unit are defined to realize multi-rigid body dynamics modeling of the CRH5 gearbox. Since it is necessary to analyze the vibration characteristics of the gearbox universal joints, driving gear, and driven gears in normal and fault states, these components are made flexible to establish a rigid-flexible coupling dynamics model. The entire rigid-flexible coupled dynamics simulation model of the CRH5 gearbox is composed of three flexible bodies (the universal joints, the driving gear, and the driven gears) and eleven rigid bodies, including the gearbox case, bushings, axles, and connecting shafts. The six bearings in the gearbox are modeled using the Adams Machinery module in ADAMS. The gearbox case and axle take one vertical degree of freedom to characterize the wheel-rail excitation; the rest of the components take three degrees of freedom in the longitudinal, transverse, and vertical directions; the whole system contains 38 rigid and flexible degrees of freedom.
The View Flex module of ADAMS can generate flexible bodies, but it is only suited to simpler components; for complex components such as gears its accuracy is greatly reduced, which does not meet the research needs. Therefore, ANSYS is used to mesh the cross joints and gears into elements that satisfy the calculation accuracy, with the connections between components treated as rigid connection nodes. The modal analysis results are saved as .mnf format files, and the components are imported into ADAMS for further parameterization to construct the rigid-flexible coupling dynamics model of the CRH5 gearbox.

CRH5 Gearbox Rigid-Flexible Coupling Dynamic Modeling
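When ADAMS is used for CRH5 gearbox rigid-flexible coupled dynamics modeling, the normal contact forces between the cross shaft and follower shaft of the universal joint, as well as between the driving and driven gears, can be calculated by the impact function method. The impact model treats the meshing contact process as a nonlinear spring-damping model based on the depth of penetration, and the magnitude of the contact force increases with the depth of penetration and the contact stiffness. The function expression is as follows:
$$F_n = \begin{cases} K \delta^{e} + \mathrm{STEP}(\delta, 0, 0, d, C)\, \dot{\delta}, & \delta > 0 \\ 0, & \delta \le 0 \end{cases}$$
where δ is the penetration depth, $\dot{\delta}$ is the penetration velocity, K is the contact coefficient, e is the contact index, C is the damping coefficient, and d is the penetration depth at which the damping reaches its maximum value.
For the tangential contact force, the Coulomb friction method is used. Its magnitude is proportional to the normal force and its direction is opposite to the direction of the relative slip velocity, with the following functional expression:
$$F_t = -\mu(v)\, F_n\, \mathrm{sgn}(v)$$
where v is the relative slip velocity and μ(v) is the friction coefficient, which varies between the static and kinetic values with the slip velocity.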

Contact Force Parameter Setting
According to the analysis of domestic and international studies [31,32], the parameters are set as follows:

•
Modulus of elasticity E. Reflecting the stiffness of the material during elastic deformation, E = 2.07 × 10^11 Pa.

•
Poisson's ratio µ. Reflecting the coefficient of transverse deformation of the material in one direction, µ = 0.3.

•
Contact coefficient K. Related to the shape and material properties of the contact surface, it is calculated according to Hertz contact theory, K = 1.29 × 10^6.

•
Contact index e. Reflecting the degree of nonlinearity of the material and calculated according to the Hertz contact theory, e = 1.5.

•
Damping coefficient C. Reflecting the energy loss when objects collide, empirically, C = 10.

•
Gear penetration (cut-in) depth d. Empirically, d = 0.1 mm.

•
The coefficient of kinetic friction f_dy = 0.05, the kinetic slip velocity v_d = 10 mm/s, the coefficient of static friction f_st = 0.08, and the static slip velocity v_s = 1 mm/s.

•
According to a related study [33], load-bearing loads and line excitation curves are incorporated at the axle.
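To make the role of these parameters concrete, the following sketch evaluates an impact-type normal force and a velocity-dependent Coulomb friction force with the values listed above. It is an illustrative reimplementation under stated assumptions, not a call into ADAMS: the function names are hypothetical, the cubic smoothing follows the usual STEP polynomial, and consistent units (mm, mm/s, N) are assumed because the text does not state the units of K and C.

```python
import math

def step(x, x0, h0, x1, h1):
    """Cubic smooth transition from h0 at x0 to h1 at x1 (STEP-style polynomial)."""
    if x <= x0:
        return h0
    if x >= x1:
        return h1
    s = (x - x0) / (x1 - x0)
    return h0 + (h1 - h0) * s * s * (3.0 - 2.0 * s)

def normal_contact_force(delta, delta_dot, K=1.29e6, e=1.5, C=10.0, d=0.1):
    """Nonlinear spring-damper contact force for penetration depth delta (assumed mm)."""
    if delta <= 0.0:
        return 0.0                       # bodies are not in contact
    damping = step(delta, 0.0, 0.0, d, C) * delta_dot
    return max(0.0, K * delta ** e + damping)

def friction_coefficient(v, f_st=0.08, f_dy=0.05, v_s=1.0, v_d=10.0):
    """Friction coefficient as a function of slip speed (assumed mm/s)."""
    speed = abs(v)
    if speed <= v_s:
        return step(speed, 0.0, 0.0, v_s, f_st)   # ramp up to the static value
    return step(speed, v_s, f_st, v_d, f_dy)       # decay to the kinetic value

def tangential_contact_force(normal_force, v):
    """Coulomb friction force opposing the relative slip velocity v."""
    if v == 0.0:
        return 0.0
    return -math.copysign(friction_coefficient(v) * normal_force, v)

force_at_full_depth = normal_contact_force(0.1, 0.0)
```

The damping term is ramped in over the penetration depth d so that the contact force remains continuous at the instant of first contact.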

Feasibility Testing
The motor input time is 60 s and the angular velocity is 0-18,000°/s. The gear meshing force curve is shown in Figure 4. According to related research [34], the bevel gear meshing force is about 25 kN under a similar working condition; the simulation results are consistent with this measurement.
Then, the monitoring data of a particular CRH5 train from 18:08:08 to 18:27:43 on 9 June 2021 are selected for simulation, and the trend of the simulation results is compared with the actual monitoring results. The traction motor angular velocity input curve during this time period is shown in Figure 5.
The vibration data from the online monitoring system of China Railway are preprocessed, and the Root-Mean-Square (RMS) of the time-domain vibration acceleration over every 5 s is taken as the time-domain characteristic value. After calculating the RMS of the simulated time-domain vibration acceleration and comparing it with the time-domain characteristic value of the actual operation, it is found that the general trends of the two coincide. The actual and simulation comparisons are shown in Figure 6.
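The 5 s RMS characteristic value described above can be computed as in the following minimal sketch. The sampling rate `fs` and the synthetic test signal are assumptions for illustration, since the sampling rate of the on-board monitoring system is not given here.

```python
import numpy as np

def windowed_rms(signal, fs, window_s=5.0):
    """RMS of consecutive non-overlapping windows of window_s seconds."""
    signal = np.asarray(signal, dtype=float)
    n = int(round(fs * window_s))        # samples per window
    n_windows = len(signal) // n         # incomplete trailing window is dropped
    frames = signal[: n_windows * n].reshape(n_windows, n)
    return np.sqrt(np.mean(frames ** 2, axis=1))

# Synthetic check: a 10 Hz sine of amplitude 2 sampled at a hypothetical 100 Hz for 20 s
fs = 100.0
t = np.arange(int(20 * fs)) / fs
rms_values = windowed_rms(2.0 * np.sin(2 * np.pi * 10.0 * t), fs)
```

Applying the same function to the simulated and the measured acceleration gives two RMS series on a common 5 s grid whose trends can then be compared.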

The frequency-domain vibration acceleration of the simulation is obtained using the fast Fourier transform. After the same treatment as for the time-domain vibration acceleration, it is compared with the actual frequency-domain vibration eigenvalues, and the general trends of the two are found to match. The simulation and actual comparisons are shown in Figure 7.
Through the above analysis, the results of the CRH5 gearbox dynamics simulation model are basically consistent with the actual operation data, indicating the feasibility of this dynamics simulation model.

CRH5 Gearbox Degradation Feature Extraction and Analysis
Fault prediction of CRH5 gearbox key components requires the degradation characteristics of the key components over their full lifecycle to improve the accuracy of the prediction.Therefore, it is necessary to simulate the degradation process of key components using dynamics simulation, extract their vibration acceleration signals, calculate the vibration eigenvalues at each stage of the full lifecycle, and use the vibration eigenvalues and operation monitoring data to predict potential failures of key components.

Transient Fault Simulation and Deep Learning of Critical Component Degradation Processes
To accurately describe the degradation process of the key components of the CRH5 gearbox, this study utilizes the transient process at the key stage of component degradation for dynamics simulation, extracts the time-domain vibration acceleration signals at the key stage of component degradation, and trains the degradation characteristics of the key components over the whole life cycle using deep learning.
Taking the driving gear as an example, according to the relevant literature and field research [35][36][37], the degradation process of the gear is found to be driven by pitting and spalling caused by impact-type failures, leading to the uniform wear state characteristic of stable-type failures. Figure 8 shows a schematic diagram of the key stages of degradation such as pitting, spalling, and wear of a single tooth of the driving gear. After the critical degradation stage is remodeled, the model is imported into ANSYS for meshing and generation of a flexible body. The new flexible body is imported into ADAMS to replace the normal state driving gear, and a dynamic simulation is performed. Its time-domain vibration acceleration signal is extracted for training.
Since the degradation process of parts is time-sequential and there is a causal relationship between adjacent degradation stages, each piece of transient fault simulation data can also be trained using the GRU-HMM model proposed in this paper, with the model run only up to the initial vector update stage. A convergence optimization function F needs to be added to the GRU for gear degradation trend training. In this function, f is the convergence function, θ is the model update gradient after training a set of tasks, β is the step size of the gated neural network, and $L_{T_i}(f_\theta)$ is the loss of the function after the task $T_i$.
The training objective of the GRU is to minimize the loss. In the objective function and the loss function, $x_j$ denotes the jth data and $y_j$ denotes the corresponding predicted data.
Taking the driving gear at 1500 rpm as an example, the time-domain vibration accelerations of the normal state and three degradation state simulations are shown in Figure 9. After training the GRU-HMM model, the time-domain vibration acceleration of the driving gear at the 1500 rpm operating condition for the whole life cycle is shown in Figure 10. The training model for the full lifecycle of degradation was validated using the gearbox dataset from Southeast University and by comparison with the relevant literature and EMU on-board monitoring data, which showed that this method is feasible [38,39].
After remodeling the critical degradation stages of the universal joints, driving gear, and driven gears, the time-domain vibration acceleration at 500 rpm intervals from 0 to 3500 rpm is calculated separately for each component. These data are used in the construction of the hybrid feature domain.

Constructing Hybrid Feature Domains
The hybrid feature domain consists of time-domain feature metrics, frequency-domain feature metrics, and the energy distribution after wavelet packet decomposition. The time-domain and frequency-domain feature metrics are shown in Table 1.

[Table 1: time-domain and frequency-domain feature metrics, including the gravity frequency]
where x(n) is the time-domain sequence of the vibration signal, $\bar{x}$ is the mean value of x(n), s(n) is the spectrum of the signal x(n), and $f_n$ is the frequency value of the nth spectral line.
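As an illustration of how such metrics are computed from x(n), s(n), and $f_n$, the sketch below evaluates four of the indicators named in this paper (RMS, kurtosis index, gravity frequency, and RMS frequency). The formulas are the commonly used definitions and are an assumption here, since the entries of Table 1 are not reproduced in this text; the sampling rate and the test signal are also hypothetical.

```python
import numpy as np

def time_frequency_features(x, fs):
    """Common time- and frequency-domain indicators of a vibration signal."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    s = np.abs(np.fft.rfft(x))                     # amplitude spectrum s(n)
    f = np.fft.rfftfreq(len(x), d=1.0 / fs)        # frequency f_n of each spectral line
    return {
        "rms": np.sqrt(np.mean(x ** 2)),
        # fourth central moment normalized by the squared variance
        "kurtosis_index": np.mean(centered ** 4) / np.mean(centered ** 2) ** 2,
        # spectrum-weighted mean frequency
        "gravity_frequency": np.sum(f * s) / np.sum(s),
        # spectrum-weighted root-mean-square frequency
        "rms_frequency": np.sqrt(np.sum(f ** 2 * s) / np.sum(s)),
    }

# Synthetic check: 1 s of a unit-amplitude 50 Hz sine sampled at a hypothetical 1 kHz
fs = 1000.0
t = np.arange(1000) / fs
features = time_frequency_features(np.sin(2 * np.pi * 50.0 * t), fs)
```

For a pure sine the kurtosis index is 1.5 and both frequency indicators collapse onto the tone frequency, which gives a convenient sanity check before the function is applied to gearbox data.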
Wavelet packet decomposition is a time-frequency domain feature extraction algorithm that is able to observe frequency information in a small region of the frequency domain. Let $s_{j,k}(i)$ be the reconstructed signal of the kth band of the jth layer obtained after wavelet packet decomposition of the original signal; then, its corresponding energy $E_{j,k}$ is as follows:
$$E_{j,k} = \sum_{i=1}^{N} \left| s_{j,k}(i) \right|^2$$
where N is the length of the signal, k is the ordinal number of the sub-bands of the wavelet packet in layer j, and the number of sub-bands is $M = 2^j$. $E_j$ and $W_j$ are the energy and eigenvectors of each sub-band in layer j.
Setting the wavelet packet decomposition layer to 3, W3 is an eight-dimensional vector, which, together with the six-dimensional time-frequency domain feature metrics described above, contributes to a fourteen-dimensional hybrid feature domain.
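A three-layer decomposition into eight sub-band energies can be sketched as follows. This example assumes the Haar wavelet, normalizes the energies by their total, and computes energies from the packet coefficients; all three are assumptions made for illustration, because the wavelet basis and normalization used in this study are not specified. For an orthonormal basis the coefficient energy of a sub-band equals the energy of its reconstructed signal. Sub-bands are returned in natural (filter-tree) order, not frequency order.

```python
import numpy as np

def haar_wavelet_packet_energy(x, level=3):
    """Normalized energies of the 2**level Haar wavelet packet sub-bands."""
    x = np.asarray(x, dtype=float)
    x = x[: len(x) - len(x) % 2 ** level]          # trim to a multiple of 2**level
    nodes = [x]
    for _ in range(level):
        next_nodes = []
        for node in nodes:
            even, odd = node[0::2], node[1::2]
            next_nodes.append((even + odd) / np.sqrt(2.0))   # low-pass branch
            next_nodes.append((even - odd) / np.sqrt(2.0))   # high-pass branch
        nodes = next_nodes
    energies = np.array([np.sum(node ** 2) for node in nodes])
    return energies / energies.sum()

# Synthetic check on a hypothetical 100-point sample (trimmed to 96 points internally)
rng = np.random.default_rng(1)
w3 = haar_wavelet_packet_energy(rng.normal(size=100), level=3)
w3_constant = haar_wavelet_packet_energy(np.ones(64), level=3)
```

Concatenating the eight entries of `w3` with six time- and frequency-domain indicators gives a fourteen-dimensional feature vector of the kind described above.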

Hybrid Feature Domain Sensitivity Analysis
Taking a 1500 rpm working condition as an example, there are 3000 sampling points of vibration acceleration in the normal state and degradation process of the key components of the gearbox, and every 100 sampling points of each state type are intercepted as a sample, for a total of 30 samples for each state type.The six characteristic indexes in time and frequency domains are calculated, respectively, and, after normalizing the calculation results, the comparison diagrams of normal state and degradation positions are shown in Figure 11.Among them, the RMS, kurtosis index, and RMS frequency have better sensitivity to component degradation, but they cannot be accurately differentiated for fault location.
The wavelet packet decomposition feature indexes of the 30 samples for each state are calculated, and the feature indexes are compared in Figure 12. Among them, W(3,3), W(3,4), and W(3,7) are more sensitive to degradation and, when combined with the RMS, kurtosis index, and RMS frequency, can effectively separate the different fault locations. Therefore, these six feature indicators are selected to construct a six-dimensional hybrid feature domain. The remaining feature indicators with low sensitivity are eliminated.
The full lifecycle degradation hybrid feature domain of the CRH5 gearbox key components is set in the preprocessing stage of the GRU-HMM fault prediction model. Before fault prediction, the operation monitoring data are noise reduced and the six-dimensional hybrid feature domain is calculated. The RMS, kurtosis index, and RMS frequency during operation are compared with the full lifecycle degradation hybrid feature domain to determine the initial probability of failure π. Then, the eigenvalues of W(3,3), W(3,4), and W(3,7) are used to determine the location of the hidden failure, and the initial probability parameter is updated by the GRU to perform the failure prediction.

Source of Experimental Data
The verification data come from the operation monitoring data of a CRH5 within 30 days before advanced repair in an EMU depot of China Railway. This CRH5 had undergone a life-critical replacement of the driving gear and driven gears of the gearbox system during this advanced repair. Thus, this dataset is well suited for verifying GRU-HMM fault prediction and hidden fault judgment.
Because of the large number of operational monitoring data entries, the low-speed monitoring data resulting from transitions or waiting were deleted, and the monitoring data for every 48 h were used as a set of samples, totaling 15 sets of samples. Wavelet packet noise reduction was applied to all sample data, with the first ten sets used as the training set and the last five sets as the test set for fault prediction of the CRH5 gearbox.
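The grouping described above can be sketched as follows. This is only an illustration: the record layout `(timestamp, speed, value)`, the low-speed threshold `min_speed_kmh`, and the function names are hypothetical, since the original monitoring data format and the speed cut-off are not given here.

```python
from datetime import datetime, timedelta


def split_into_sets(records, start, window_hours=48, n_sets=15, min_speed_kmh=30.0):
    """Drop low-speed records, then group the rest into consecutive 48 h sets."""
    window = timedelta(hours=window_hours)
    sets = [[] for _ in range(n_sets)]
    for ts, speed, value in records:
        if speed < min_speed_kmh:  # assumed threshold for transition/waiting data
            continue
        idx = (ts - start) // window  # floor division of timedeltas gives an int
        if 0 <= idx < n_sets:
            sets[idx].append((ts, speed, value))
    return sets


def split_train_test(sets, n_train=10):
    """Chronological split: the first n_train sets train, the remainder test."""
    return sets[:n_train], sets[n_train:]


if __name__ == "__main__":
    t0 = datetime(2024, 1, 1)  # placeholder start date
    demo = [(t0 + timedelta(hours=h), 200.0, float(h)) for h in range(720)]
    train, test = split_train_test(split_into_sets(demo, t0))
    print(len(train), len(test))
```

Keeping the split chronological, rather than shuffling the sets, means the test sets are the ones closest to the advanced repair.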

Verification of GRU-HMM Fault Prediction Model
The HMM initial vector, state transfer matrix, and confusion matrix are first calculated from the training set, with the initial vector π = [0.782, 0.115, 0.06, 0.025, 0.018].
The hybrid feature domain is substituted into the GRU for feature fusion, and the initial vectors and state transfer matrices are updated. The log-likelihood probabilities of the initial vector updates for each hidden fault type are shown in Figure 13. The log-likelihood probability of the driving gear increases with the speed level, indicating that the risk of driving gear failure rises, especially at high speeds. The log-likelihood probability of the universal joint peaks at moderate speeds, indicating that its risk of failure is highest there. The log-likelihood probability of the driven gear is higher at lower speeds, decreases with increasing speed, and then increases again at higher speeds, indicating that the risk of failure of the driven gear is higher during low- and high-speed operation.
The updated initial vectors and state transfer matrices are substituted into the GRU-HMM for training. The probability distribution of hidden states is calculated using the test set, and the results are normalized to obtain the fault prediction results for each type of failure mode. The velocity-hidden state distribution for one of the sample sets of the test set is shown in Figure 14. The figure shows that the probability of hidden failures is higher for the driving gear and the driven gears. A comparison with the gear sets replaced according to the maintenance records of the EMU's advanced repair at the manufacturing plant confirms that the predicted results match the actual results, which demonstrates the feasibility of the GRU-HMM fault prediction model.
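To make the step of computing a log-likelihood and a normalized hidden-state distribution concrete, the sketch below implements the standard scaled forward algorithm for a discrete HMM. It is not the GRU-HMM of this study: the GRU update is omitted, the observations are assumed to be discretized into symbol indices, and the two-state parameters in the usage example are toy values chosen purely for illustration.

```python
import math


def forward_filter(pi, A, B, obs):
    """Scaled forward algorithm.

    pi[i]   : initial probability of hidden state i
    A[i][j] : transition probability from state i to state j
    B[i][k] : probability of emitting symbol k in state i
    obs     : sequence of observed symbol indices
    Returns (log-likelihood of obs, normalized state distribution at the last step).
    """
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [
                sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)  # probability of obs[t] given the earlier observations
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]  # normalize so the entries sum to 1
    return log_lik, alpha


if __name__ == "__main__":
    # Toy two-state parameters (hypothetical, not taken from the paper).
    pi = [0.6, 0.4]
    A = [[0.7, 0.3], [0.4, 0.6]]
    B = [[0.9, 0.1], [0.2, 0.8]]
    print(forward_filter(pi, A, B, [0, 1, 1]))
```

Rescaling at every step avoids numerical underflow on long monitoring sequences, and the rescaled vector is already the normalized hidden-state distribution at that step.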

Comparative Analysis
To verify the advantage of the proposed fault prediction method in terms of fault identification accuracy, it was compared with the traditional HMM and SVM diagnostic models for the prediction of EMU gearbox faults. The recognition accuracies of the four degraded positions are shown in Table 2. The recognition accuracy of GRU-HMM was found to be significantly higher than that of the traditional HMM and SVM methods. However, its recognition time was longer than that of the other two methods, because this method requires multiple rounds of training for the hybrid feature domain, initial vector, and state transfer matrix.
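A per-position recognition accuracy of the kind compared in Table 2 can be computed as in the following minimal sketch. The function name and the labels in the usage example are illustrative assumptions, and no values from Table 2 are reproduced.

```python
from collections import Counter


def recognition_accuracy(true_labels, predicted_labels):
    """Fraction of correctly recognized samples for each true label."""
    totals = Counter(true_labels)
    hits = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


if __name__ == "__main__":
    # Made-up labels for illustration only.
    truth = ["driving gear", "driving gear", "driven gear", "normal"]
    guess = ["driving gear", "driven gear", "driven gear", "normal"]
    print(recognition_accuracy(truth, guess))
```

Running the same function on the outputs of each model, using the same test sets, gives directly comparable per-position figures.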

Conclusions
• A model for predicting faults in the CRH5 gearbox has been developed by establishing a dynamic coupling dynamics model and utilizing vibration acceleration data from key components throughout the gearbox's degradation process. By creating a six-

Figure 6. Comparison of actual and simulation time-domain curves.

Figure 7. Comparison of actual and simulation frequency-domain curves.

Figure 9. Time-domain vibration acceleration in the normal state and in various degraded states. (a) Normal; (b) pitting; (c) spalling; (d) wear.

Figure 10. Time-domain vibration acceleration over the full lifecycle of driving gear (1500 rpm).

Figure 11. Comparison of time and frequency domain feature metrics. (a) RMS; (b) margin index; (c) kurtosis index; (d) pulse index; (e) gravity frequency; (f) RMS frequency.

Figure 13. Initial vector log-likelihood probabilities for each hidden fault type.

Figure 14. The results of the velocity-hidden state distribution.

• The hidden state Q is defined as Q = {normal state, universal joint degradation, driving gear degradation, driven gear degradation}, representing the hidden fault states of the CRH5 gearbox; here, the number of hidden states is N = 4.
• The observable state O is denoted as O = {o_1, o_2, ..., o_N}, where o_i ∈ (v_1, v_2, ..., v_t), and is a dataset for monitoring the operating status of the CRH5. Here, the observable state v_t is a feature vector capturing the data characteristics of the train running up to the moment t under the four hidden states mentioned above.

Table 1. Time-domain and frequency-domain vibration characterization metrics.

Table 2. Comparison of recognition accuracy of various fault prediction methods.