Prediction of Residual Electrical Life in Railway Relays Based on Convolutional Neural Network Bidirectional Long Short-Term Memory

Abstract: In this paper, we address several issues with existing methods for predicting the residual electrical life of railway relays. These issues include the difficulty of characterizing the degradation process with a single feature, the neglect of temporal and backward–forward correlations in the degradation process, and low prediction accuracy. To overcome these challenges, we propose an approach that combines convolutional neural networks (CNNs) and bidirectional long short-term memory (BiLSTM) to predict the life of railway relays and provide an accurate data basis for their maintenance. Firstly, we collected voltage and current signals from railway relay electrical life tests and extracted feature parameters that captured the relay's operating state. Next, we applied Spearman correlation coefficient analysis combined with random forest importance analysis to perform double-feature selection. This process eliminates redundant feature parameters and identifies the optimal feature subset. Finally, we constructed a convolutional neural network bidirectional long short-term memory (CNN BiLSTM) prediction model to predict the remaining electrical life of the railway relay. The prediction results show that the CNN BiLSTM model achieves an effective prediction accuracy of 96.3%, which is higher than that of the recurrent neural network (RNN), long short-term memory (LSTM), and BiLSTM models, with better stability and greater practicality in predicting the remaining electrical life of railway relays.


Introduction
Railway relays are essential switching devices in railway equipment, serving as specialized multi-way switches for controlling circuits by opening or closing them [1]. They play a crucial role in signal transmission, logic control, and other functions in trains. The stability and safety of relays are vital for ensuring the normal operation of railway signal systems [2]. Predicting the remaining electrical life of railway relays allows us to identify critical time points for performance testing, thus facilitating maintenance, improving economic efficiency, reducing failure rates, enhancing safety, and ensuring the safe and stable operation of the entire train [3,4].
As the performance requirements for relays have increased, research on the prediction of relay remaining electrical life has made progress. Reference [5] proposed an electromagnetic relay life prediction method based on a weighted optimization combination model, using degradation parameters of electromagnetic relays for the prediction process. Reference [6] utilized rough set theory to extract a set of lifespan-related feature parameters from various parameters and developed a relay lifespan decision-making method, completing the prediction process based on initial product life state information of the same model. Reference [7] addressed the issue of non-stationary time series for relay performance parameters and achieved lifespan prediction through wavelet packet transform and RBF neural networks. Reference [8] incorporated experimentally obtained relay overrun time data into a partitioned regression model to predict relay lifespan. Reference [9] proposed a support vector machine-based prediction method for electromagnetic relay storage life. Reference [10] systematically analyzed the common failure modes of electromagnetic relays during operation and their causes, thus providing a method for life determination. Reference [11] evaluated the relay life by monitoring the working status of relay contacts. Reference [12] proposed a one-dimensional convolutional neural network-based automotive electromagnetic relay life prediction method with a dual self-attention mechanism. Reference [13] estimated the relay life at room temperature by using unknown variables in the Arrhenius model. Reference [14] implemented the life prediction process by predicting the fracture cycle. Reference [15] proposed an electromagnetic relay life prediction method based on improved DBN and Softmax regression. Reference [16] established a balanced force-based electromagnetic relay life prediction model based on a failure mechanism.
Summarizing existing research results, three aspects need further improvement in the prediction of railway relay remaining electrical life. Firstly, existing prediction methods mostly rely on single-variable prediction, which fails to adequately represent the relay's performance degradation process, leading to significant prediction errors. Secondly, the relay's performance degradation is rarely treated as a long time series problem, so its temporal correlations are not reasonably utilized during lifespan prediction, which affects the accuracy of predictions. Thirdly, most existing prediction methods are based on traditional prediction algorithms, such as regression prediction, which is widely used in the references. However, the prediction accuracy of these algorithms is low, and an excessive amount of data can seriously degrade it, making it difficult to meet the requirements of practical applications.
Recently, deep learning-based prediction methods have made progress in various research areas such as power load forecasting [17], mechanical lifespan prediction [18], and battery lifespan prediction [19]. CNN, compared to other mathematical models, has stronger representation learning capabilities as it can learn through layered structures from input feature parameter data, with more noticeable feature extraction effects [20]. LSTM, an extension of RNN, introduced forget, input, and output gates, along with a cell state, greatly alleviating the vanishing or exploding gradient problem when dealing with time series with long-term dependencies [21]. BiLSTM, composed of forward and backward LSTMs, can simultaneously capture information from both directions, providing a clear advantage in capturing bidirectional temporal sequences.
Addressing the shortcomings of existing research and combining the advantages of CNN and BiLSTM models, this paper proposes a CNN BiLSTM-based method for predicting the residual electrical life of railway relays. The method predicts relay life from the optimal feature parameter subset obtained by feature filtering, providing an accurate data basis for the maintenance of railway relays and improving the safety and reliability of the railway system. It can directly perform life prediction using the feature parameter data obtained from the full life cycle test of railway relays, without the need to build accurate physical or statistical models [22]. Therefore, the deep learning life prediction method is more advantageous in solving the long time series prediction problem of relay remaining life.
This study first designed a railway relay electrical life test platform to complete the test process, collect voltage and current signals, and extract feature parameters. Then, through Spearman correlation coefficient analysis combined with random forest importance analysis, the feature parameter selection process was completed and the optimal feature parameter subset was obtained. Finally, the railway relay life prediction process was completed through CNN BiLSTM, and multiple methods were compared to prove its feasibility. The overall prediction process is based on multiple feature parameters, which can better characterize the degradation process of railway relays. The combined model reduces computational complexity; it can consider the correlation before and after the degradation process of railway relays, capture bidirectional time series information, and improve prediction accuracy and stability, thus achieving accurate life prediction of railway relays.

Spearman Correlation Coefficient Analysis
The Spearman correlation coefficient reduces the influence of outliers during the process of correlation analysis, and it also exhibits stronger robustness in capturing non-linear relationships [23,24]. The formula for calculating the Spearman correlation coefficient is:

$$\rho_{xy} = \frac{\sum_{k=1}^{n} (x_k - \bar{x})(y_k - \bar{y})}{\sqrt{\sum_{k=1}^{n} (x_k - \bar{x})^2 \sum_{k=1}^{n} (y_k - \bar{y})^2}}$$

In the equation, $n$ is the number of elements in each vector, $x_k$ represents the rank of the k-th element in the vector X, and $\bar{x}$ denotes the average rank of all elements in vector X. Similarly, $y_k$ represents the rank of the k-th element in the vector Y, and $\bar{y}$ denotes the average rank of all elements in vector Y. The values of the Spearman correlation coefficient correspond to the relationship exhibited between vectors x and y, as shown in Table 1.
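The rank-based coefficient described above can be illustrated with a short sketch. It is a minimal pure-Python version, assuming that tied values share their average rank (the text does not say how ties are handled); the function names `average_ranks` and `spearman` are illustrative and not taken from the paper.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # average 1-based position of the tied run
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks

def spearman(x, y):
    """Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)
```

Because only ranks enter the computation, any strictly increasing relationship between a feature and the remaining life gives a coefficient of 1, regardless of whether the relationship is linear.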

Importance Analysis of the Random Forest Features
Random forest is a parallel ensemble learning model that uses decision trees as base classifiers [25]. In this study, the feature importance analysis is conducted using the random forest's mean decrease accuracy method, with the evaluation metric being the out-of-bag (OOB) error rate. The importance $VIM_j^{OOB}(i)$ of a variable $X_j$ in the i-th tree is denoted as:

$$VIM_j^{OOB}(i) = \frac{\sum_{p=1}^{n_o^i} I(Y_p = Y_p^i)}{n_o^i} - \frac{\sum_{p=1}^{n_o^i} I(Y_p = Y_{p,\pi_j}^i)}{n_o^i}$$

In the equation, $n_o^i$ represents the number of observations in the i-th tree's OOB data, $Y_p \in \{0,1\}$ is the true outcome of the p-th observation, $Y_p^i \in \{0,1\}$ represents the predicted outcome of the i-th tree for the p-th observation in the OOB data before random permutation, and $Y_{p,\pi_j}^i \in \{0,1\}$ represents the predicted outcome of the i-th tree for the p-th observation in the OOB data after random permutation. $I$ denotes the indicator function, which takes the value of 1 when the two values are equal and 0 otherwise. When a variable j does not appear in the i-th tree, $VIM_j^{OOB}(i) = 0$. The permutation importance of a variable $X_j$ in RF is defined as:

$$VIM_j^{OOB} = \frac{1}{n} \sum_{i=1}^{n} VIM_j^{OOB}(i)$$

In the equation, n represents the number of classification trees in the random forest (RF).
When performing feature importance analysis through random forest, the importance scores of all features sum to 1. When the accumulated importance of a group of feature parameters reaches 0.8 or above, these feature parameters are considered to carry representative feature information and to have a significant impact on the model's predictions. Therefore, they can be used as the input data for the predictive model.
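The two ideas in this subsection, the per-tree accuracy drop after permutation and the 0.8 cumulative-importance rule, can be sketched as follows. This is an illustrative pure-Python sketch rather than the authors' implementation: the function names are hypothetical, the predictions before and after permutation are assumed to be already available, and the example scores are made-up placeholders, not values from the paper.

```python
def tree_permutation_importance(y_true, pred_before, pred_after):
    """OOB accuracy before permuting a feature minus OOB accuracy after permuting it."""
    n = len(y_true)
    acc_before = sum(t == p for t, p in zip(y_true, pred_before)) / n
    acc_after = sum(t == p for t, p in zip(y_true, pred_after)) / n
    return acc_before - acc_after

def select_by_cumulative_importance(importances, threshold=0.8):
    """Take features in descending importance until the normalized sum reaches threshold."""
    total = sum(importances.values())
    ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
    selected, cumulative = [], 0.0
    for name, score in ranked:
        selected.append(name)
        cumulative += score / total
        if cumulative >= threshold:
            break
    return selected, cumulative

# Hypothetical scores for illustration only; they are not the paper's results.
example_scores = {"feature_a": 0.40, "feature_b": 0.30, "feature_c": 0.15,
                  "feature_d": 0.10, "feature_e": 0.05}
chosen, reached = select_by_cumulative_importance(example_scores)
```

With these placeholder scores the first three features are kept, since their cumulative importance is the first to reach the 0.8 threshold.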

Convolutional Neural Network
CNN has excellent feature extraction capabilities, and it consists of an input layer, convolutional layers, pooling layers, fully connected layers, and an output layer [26][27][28]. The schematic diagram of the convolutional neural network model is shown in Figure 1. The convolutional layer performs convolution operations on the input features using convolutional kernels, which can be represented as follows:

$$x_j^l = f\left(x^{l-1} * M_j + b^l\right)$$

In the equation, $x_j^l$ represents the j-th feature map output after the l-th convolutional operation, $x^{l-1}$ is the input to the l-th layer, $*$ denotes the convolution operation, $f$ is the chosen activation function for the model, $b^l$ is the bias parameter for the l-th layer, and $M_j$ is the convolutional kernel for the input feature vector. The role of the activation function is to construct the output feature vector. The ReLU function is chosen as the activation function for this model to prevent gradient vanishing. It can be represented as follows:

$$f(x) = \max(0, x)$$

In the equation, $x$ represents the output data of the convolution operation. The pooling layer can reduce the size of the model and improve the computational speed of the predictive model. The pooling operation chosen for this model is max pooling, which can be represented as follows:

$$p_i^{l+1}(j) = \max_{(j-1)W + 1 \le t \le jW} q_i^l(t)$$

In the equation, $q_i^l(t)$ represents the value of the t-th neuron of the i-th feature vector in layer $l$, $p_i^{l+1}(j)$ represents the values corresponding to the neurons in layer $l + 1$, and $W$ represents the width of the pooling field.
The fully connected layer maps the learned features into the label space, and it can be represented as follows:

$$Y = WX + b$$

In the equation, $Y$ is the output of the layer, $W$ represents the weight matrix of the fully connected layer, $X$ represents the input, and $b$ represents the bias vector.
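The layer operations described in this subsection can be illustrated with a small pure-Python sketch. It assumes a single input channel, a "valid" sliding-window convolution without kernel flipping (the usual deep learning convention), and non-overlapping pooling windows; these choices and the function names are illustrative assumptions, not details given in the paper.

```python
def relu(x):
    """ReLU activation: max(0, x)."""
    return max(0.0, x)

def conv1d(signal, kernel, bias=0.0):
    """Slide the kernel over the signal, add the bias, and apply ReLU."""
    width = len(kernel)
    return [relu(sum(signal[t + k] * kernel[k] for k in range(width)) + bias)
            for t in range(len(signal) - width + 1)]

def max_pool1d(feature_map, width):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(feature_map[j:j + width])
            for j in range(0, len(feature_map) - width + 1, width)]

def dense(weights, inputs, biases):
    """Fully connected layer: one weighted sum plus bias per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

# Toy input: a short sequence, an edge-detecting kernel, and a negative bias.
features = conv1d([1, 3, 2, 5, 4, 6], [-1, 0, 1], bias=-1.5)
pooled = max_pool1d(features, 2)
```

In a real model the kernel weights and biases are learned during training; here they are fixed by hand only to make the arithmetic easy to follow.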

Bidirectional Long Short-Term Memory Network
BiLSTM is composed of two LSTM layers that process the sequence in opposite directions, which enables it to consider both forward and backward context information. It has been widely applied in language modeling, sentiment analysis, machine translation, sequence labeling, time series prediction, and other areas.
LSTM networks address the issues of vanishing and exploding gradients that may occur during training in traditional RNNs. They enhance the capability of handling long-term time series data [29,30]. The recurrent structure and internal architecture of LSTM networks are illustrated in Figure 2.
Figure 2. The LSTM loop structure and the internal structure.
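The mathematical calculation formulas of LSTM are shown in Table 2.

Table 2. LSTM computational formulas.
LSTM internal structure: expression formula
Forget gate value $f_t$: $f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$
Input gate value $i_t$: $i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$
The current input cell state $\tilde{c}_t$: $\tilde{c}_t = \tanh(W_c \cdot [h_{t-1}, x_t] + b_c)$
The cell state at the current time $c_t$: $c_t = f_t \circ c_{t-1} + i_t \circ \tilde{c}_t$
Output gate value $o_t$: $o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)$
Final output value $h_t$: $h_t = o_t \circ \tanh(c_t)$

BiLSTM incorporates an additional backward LSTM layer into the LSTM structure, allowing it to simultaneously consider both forward and backward context information. By capturing contextual information associated with the target event from both directions, BiLSTM can improve the model's performance [31,32]. The recurrent structure of BiLSTM is illustrated in Figure 3.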
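The gate computations of Table 2 and the forward and backward passes of BiLSTM can be sketched in pure Python. This is an illustrative sketch under stated assumptions: the parameter dictionary keys (`Wf`, `bf`, and so on), the zero initial states, and the choice to concatenate the forward and backward outputs at each time step are assumptions made for the example, not details given in the paper.

```python
from math import exp, tanh

def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))

def affine(W, v, b):
    """Compute W·v + b for a matrix W stored as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM time step using the standard gate equations."""
    hx = h_prev + x_t                                         # concatenation [h_{t-1}, x_t]
    f = [sigmoid(z) for z in affine(p["Wf"], hx, p["bf"])]    # forget gate
    i = [sigmoid(z) for z in affine(p["Wi"], hx, p["bi"])]    # input gate
    g = [tanh(z) for z in affine(p["Wc"], hx, p["bc"])]       # current input cell state
    o = [sigmoid(z) for z in affine(p["Wo"], hx, p["bo"])]    # output gate
    c = [ft * cp + it * gt for ft, cp, it, gt in zip(f, c_prev, i, g)]
    h = [ot * tanh(ct) for ot, ct in zip(o, c)]
    return h, c

def run_lstm(sequence, p, hidden):
    """Run the cell over a sequence, starting from zero states."""
    h, c, outputs = [0.0] * hidden, [0.0] * hidden, []
    for x_t in sequence:
        h, c = lstm_step(x_t, h, c, p)
        outputs.append(h)
    return outputs

def run_bilstm(sequence, p_fwd, p_bwd, hidden):
    """Forward pass plus a pass over the reversed sequence, aligned per time step."""
    forward = run_lstm(sequence, p_fwd, hidden)
    backward = run_lstm(sequence[::-1], p_bwd, hidden)[::-1]
    return [f + b for f, b in zip(forward, backward)]         # concatenate both directions
```

Each output vector then carries information from the samples before and after that time step, which is the property the text relies on for capturing bidirectional correlations.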

Parameter Setting and Evaluation Index
(1) Parameter Setting
The remaining action count of the railway relay is taken as the label of the prediction model, and the optimal feature subset obtained through double feature selection is taken as the input of the model. The first 70% of the data are selected as the training set, the next 15% (from 70% to 85%) are taken as the validation set, and the last 15% are taken as the test set. The parameters of the convolutional part are set as follows: the number of convolutional layers is 1, the number of kernels is 6, and the activation function is ReLU. The parameters of the BiLSTM part are set as follows: the loss function is the mean square error (MSE) function, the time step is 48, there are two BiLSTM layers with 32 and 8 cells, the number of iterations is 40, the optimizer is Adam, and the learning rate is 0.01.
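The chronological 70/15/15 split and the time step of 48 can be sketched as below. The sketch assumes that each training sample is a window of 48 consecutive feature rows labelled with the remaining action count of its last row; the paper does not state how windows and labels are aligned, so this pairing and the function names are illustrative assumptions.

```python
def chronological_split(samples, train_pct=70, val_pct=15):
    """Split in time order: first 70% train, next 15% validation, last 15% test."""
    n = len(samples)
    train_end = n * train_pct // 100
    val_end = n * (train_pct + val_pct) // 100
    return samples[:train_end], samples[train_end:val_end], samples[val_end:]

def make_windows(features, labels, time_step=48):
    """Pair each run of `time_step` consecutive rows with the label of its last row."""
    windows, targets = [], []
    for end in range(time_step, len(features) + 1):
        windows.append(features[end - time_step:end])
        targets.append(labels[end - 1])
    return windows, targets

# Toy data: 200 operations with one feature each and a decreasing remaining count.
rows = [[float(k)] for k in range(200)]
remaining = [200 - k for k in range(200)]
train, val, test = chronological_split(rows)
windows, targets = make_windows(rows, remaining)
```

Splitting in time order rather than at random keeps later operations out of the training set, which matches the description of the first 70% of the data being used for training.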
(2) Evaluation Metrics
The following metrics are used as evaluation indicators to assess the performance of the model: root mean square error (RMSE), mean absolute error (MAE), coefficient of determination ($R^2$), maximum error rate, effective accuracy, and computational time. The formulas for calculating RMSE, MAE, and $R^2$ are as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{t=1}^{n} (y_t - \hat{y}_t)^2}$$

$$\mathrm{MAE} = \frac{1}{n} \sum_{t=1}^{n} \left| y_t - \hat{y}_t \right|$$

$$R^2 = 1 - \frac{\sum_{t=1}^{n} (y_t - \hat{y}_t)^2}{\sum_{t=1}^{n} (y_t - \bar{y})^2}$$

where $n$ is the number of samples in the data set, $y_t$ is the actual value of the remaining break times, $\bar{y}$ is the average of the actual values, and $\hat{y}_t$ is the final predicted output value. Lower RMSE and MAE values indicate better prediction performance, whereas a higher $R^2$ value indicates a better fit of the model to the actual data [33,34].
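The three metrics with standard definitions (RMSE, MAE, and the coefficient of determination) can be computed with a few lines of pure Python, shown here as a minimal sketch. The maximum error rate and effective accuracy are not defined in the text, so they are deliberately left out; the sample values are arbitrary.

```python
from math import sqrt

def rmse(actual, predicted):
    """Root mean square error."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def r2(actual, predicted):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Arbitrary example values, not measurements from the paper.
y_true = [3, 5, 7, 9]
y_pred = [2, 5, 8, 9]
```

For these example values the squared errors sum to 2 and the total sum of squares is 20, so `r2` evaluates to 0.9 while `mae` is 0.5.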

Design and Construction of the Test Platform
In this experiment, the 3RH211-2KF40 (Siemens, Berlin, Germany) model railway relay was selected as the test object. The experiment used serial communication ports to connect the NI data acquisition card with LabVIEW virtual instrument technology. The data acquisition was carried out using the X-Series USB6356 synchronous data acquisition card from National Instruments (Austin, TX, USA) and voltage and current sensors from LEM (Geneva, Switzerland). This setup facilitated the collection of signals related to the main contact voltage, main contact current, and coil voltage and current during the entire make-and-break process of the railway relay, thus accomplishing the design and construction of the railway relay's electrical life test platform. The design conforms to the requirements of the GB14048.5-2020 national standard [35]. Specific experimental parameters are listed in Table 3. The schematic diagram of the experimental platform is illustrated in Figure 4.

Feature Parameter Extraction
Feature parameters serve as the basis for a railway relay's electrical life prediction, and using multiple feature parameters allows for a better representation of the relay's degradation process. The extraction of these feature parameters is based on the voltage and current signals during the make-and-break process of the railway relay. The waveform variations are depicted in Figures 5 and 6. Compared to single feature parameter prediction, electrical life prediction based on multiple feature parameters can better capture the interrelationships between features and exhibits stronger sequence modeling capabilities. Therefore, this study extracted eight types of feature parameters. The extraction process is as follows:
(1) Contact resistance extraction: the contact resistance is the average $\frac{1}{N} \sum_{n=1}^{N} u_n / i_n$, where $N$ represents the number of data points collected when the contacts are stably closed and conducting, $u_n$ represents the voltage across the contacts, and $i_n$ represents the current flowing through the contacts.
(2) Suction time extraction: the suction time is $t_2 - t_1$, where $t_1$ is the moment when the coil is energized, and $t_2$ is the first contact time between the dynamic and static contacts.
(3) Bounce time extraction: the bounce time is $t_3 - t_2$, where $t_3$ is the moment when significant voltage and current fluctuations end.
(4) Overshoot time extraction: the overshoot time is measured up to $t_4$, the moment when stable contact closure is achieved.
(5) Release time extraction: the release time is $t_6 - t_5$, where $t_5$ represents the moment when the coil power is disconnected, and $t_6$ represents the moment when the arcing starts.
(6) Arc energy extraction: the arc energy is the sum of $u_n i_n \Delta t$ over the sampling points between $t_6$ and $t_7$, where $t_7$ is the arc extinguishing time, $\Delta t = 1 / f_s$ is the time interval between adjacent sampling points, and $f_s$ is the sampling rate.
(7) Arcing time extraction: the arcing time is $t_7 - t_6$.
(8) Release voltage extraction: the release voltage is obtained from $u_{nx}$, which represents the coil voltage at different numbers of collected points when the contacts are opened.
The extracted feature parameters and their variations are shown in Figure 7.
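Some of the feature definitions in this section can be sketched in code. The sketch below is a hedged illustration rather than the authors' procedure: it assumes that the event times have already been located in the waveforms, that contact resistance is the mean of the sample-wise voltage-to-current ratio over the stably closed interval, and that arc energy is a rectangle-rule sum of instantaneous power over the arcing samples. All function names and numeric values are hypothetical.

```python
def contact_resistance(voltages, currents):
    """Mean of u_n / i_n over samples taken while the contacts are stably closed."""
    ratios = [u / i for u, i in zip(voltages, currents)]
    return sum(ratios) / len(ratios)

def interval(t_start, t_end):
    """Duration between two detected event times, in the same unit as the inputs."""
    return t_end - t_start

def arc_energy(voltages, currents, sampling_rate):
    """Sum of u_n * i_n * dt over the arcing samples, with dt = 1 / sampling_rate."""
    dt = 1.0 / sampling_rate
    return sum(u * i for u, i in zip(voltages, currents)) * dt

# Hypothetical event times in seconds (coil energized, first contact, arc start, arc end).
t_energized, t_first_contact = 0.010, 0.018
t_arc_start, t_arc_end = 0.120, 0.123

suction_time = interval(t_energized, t_first_contact)
arcing_time = interval(t_arc_start, t_arc_end)
```

Locating the event times themselves (for example by thresholding the coil and contact signals) is the harder part in practice and is outside the scope of this sketch.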

Model Construction and Example Analysis
From the railway relay's electrical life test data, eight types of feature parameters were extracted. Using multiple feature parameters for training allows for the retention of more useful information and better representation of the railway relay's degradation process. However, having too many feature parameters can lead to some redundant features, which can increase the difficulty of model training, increase training time, and may also affect the prediction accuracy. Therefore, in this study, a dual feature selection process was carried out using Spearman's rank correlation coefficient analysis and random forest importance analysis to eliminate redundant feature parameters while retaining the most useful feature information.
By combining the advantages of CNN and BiLSTM, the CNN BiLSTM prediction model was constructed. Firstly, data preprocessing was performed, including data normalization. The inputs were divided into three parts: training set, validation set, and test set. The remaining action count of the railway relay was used as the label for the prediction model. Then, the optimal feature subset was fed into the model. Finally, the predicted data were reverse normalized to obtain the remaining action count prediction, which was compared and analyzed against the actual remaining action count. The prediction flow structure is illustrated in Figure 8. The constructed CNN BiLSTM combined model possesses both the feature extraction capability of CNN and the stronger representation learning ability of BiLSTM. Additionally, it benefits from BiLSTM's capability to capture bidirectional time sequences, enabling better capturing of the correlation between input feature data in both directions and learning of long-term dependencies. This allows the model to more effectively predict the remaining electrical life of railway relays.
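The normalization and reverse normalization steps in this flow can be illustrated with a short sketch. The text does not name the normalization scheme, so min-max scaling to the range [0, 1] is assumed here purely for illustration; the function names and sample values are likewise hypothetical.

```python
def fit_min_max(values):
    """Return the (minimum, maximum) pair used for scaling."""
    return min(values), max(values)

def normalize(values, lo, hi):
    """Scale values linearly so that lo maps to 0 and hi maps to 1."""
    span = hi - lo
    if span == 0:
        raise ValueError("cannot scale a constant series")
    return [(v - lo) / span for v in values]

def denormalize(scaled, lo, hi):
    """Invert normalize(): map scaled values back to the original unit."""
    return [s * (hi - lo) + lo for s in scaled]

# Hypothetical remaining action counts used as labels.
labels = [100, 150, 200]
lo, hi = fit_min_max(labels)
scaled = normalize(labels, lo, hi)
restored = denormalize(scaled, lo, hi)
```

A common practice is to compute the scaling bounds on the training portion only and reuse them for the validation and test portions, so that the model's predictions can be mapped back to action counts with the same bounds.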

Feature Parameter Selection
The feature selection process was completed through the use of Spearman correlation coefficient analysis and random forest importance analysis. The results of the Spearman's rank correlation coefficient analysis are shown in Figure 9. From Figure 9, it can be observed that the Spearman's rank correlation coefficients for bounce time, arcing time, and overshoot time are all less than 0.2, indicating no significant correlation. Therefore, these features are removed from further consideration. Subsequently, the remaining feature parameters underwent random forest importance analysis, and the results are shown in Figure 10. Based on the results of the random forest importance analysis, the sum of the importance scores for the release voltage, contact resistance, and suction time (three feature parameters) is 0.824, which exceeds 0.8. This indicates that these features can effectively represent the feature information and reflect the degradation process of the relay. Therefore, they are taken as the optimal feature subset for the remaining life prediction process.

Comparative Analysis of the Prediction Results
In this study, four prediction models were constructed: RNN, LSTM, BiLSTM, and CNN BiLSTM. The models were evaluated using both the original feature parameters and the feature parameter subset obtained through dual feature selection. The prediction process was applied to estimate the remaining electrical life of railway relays. The corresponding performance evaluation indicators were calculated from the prediction results to compare the models and to verify that the CNN BiLSTM model has advantages over the other models.
The comparison of prediction results before and after feature selection is shown in Figures 11 and 12, respectively. The specific evaluation metrics are presented in Tables 4 and 5. Based on the comparison of Figures 11 and 12 and the data from Tables 4 and 5, the following observations can be made:
(1) The prediction results of the models without feature parameter selection show some large error fluctuations, which affect both prediction accuracy and stability. After feature parameter selection, the RMSE, MAE, and maximum error rate of the four prediction models are reduced to different degrees, and the effective accuracy and $R^2$ are higher. The effective accuracy of the four models after feature parameter selection is improved by 5.4%, 3%, 0.8%, and 1.4%, which directly demonstrates the significance of feature parameter selection.
(2) Regardless of whether feature parameter selection is performed, the RMSE, MAE, and maximum error rate of the CNN BiLSTM model are significantly lower than those of the other three prediction models, and its effective accuracy and $R^2$ are higher. This directly demonstrates that the CNN BiLSTM model has higher accuracy and predictive stability than the RNN, LSTM, and BiLSTM models. Moreover, the effective accuracy of the CNN BiLSTM model after the selection of feature parameters reaches 96.3%, which is 10.8%, 5.1%, and 3% higher than that of the RNN, LSTM, and BiLSTM models, respectively, indicating that the CNN BiLSTM prediction model has more obvious advantages in the prediction of the residual electrical life of railway relays.
The actual error rate directly reflects the quality of the prediction results. By calculating the actual error rate using the prediction results from the test set, further comparisons among different models can be made. The prediction error rates of the four models are shown in Figure 13. From the comparison of the prediction error rate among the four models, it is evident that the CNN BiLSTM model has a significantly lower prediction error rate compared to the RNN, LSTM, and BiLSTM models. Moreover, the CNN BiLSTM model exhibits overall lower fluctuation in the prediction error rate, indicating better stability.
Considering the prediction results, performance evaluation metrics, and the comparison of prediction error rates on the test set, it is clear that the CNN BiLSTM model outperforms the others, with higher effective accuracy and better stability. Therefore, the CNN BiLSTM model is more suitable for railway relay remaining electrical life prediction.

Conclusions
In this study, we proposed a railway relay remaining electrical life prediction method based on CNN BiLSTM, utilizing the extracted feature parameters from the railway relay electrical life test to model the degradation process. The following conclusions were drawn through error analysis and model comparisons:
(1) The process of Spearman correlation coefficient analysis and random forest importance selection directly influences the accuracy and stability of railway relay remaining electrical life prediction. Removing redundant features to obtain the optimal feature subset is an indispensable part of railway relay electrical life prediction.
(2) By introducing deep learning prediction methods into the relay electrical life prediction process, we constructed the CNN BiLSTM combined model. The comparison of different models' time series prediction results showed that the CNN BiLSTM prediction model achieved an effective accuracy of 96.3%, which was 10.8% higher than the RNN model, 5.1% higher than the LSTM model, and 3% higher than the BiLSTM model. Its stability and prediction accuracy were higher, making the CNN BiLSTM model more suitable for railway relay remaining electrical life prediction, and it demonstrated better performance in practical applications.
(3) This method provides a new approach for railway relay life prediction, which is beneficial for relay maintenance and can improve economic efficiency, reduce failure rates, and enhance safety and stability, thereby ensuring the safe and stable operation of the entire train.
(4) This study completed the test process based on the relay life test regulations. However, there are still slight differences between the test process and the actual working conditions of the relay. For example, the presence of vibration, humidity, and high temperature conditions in actual working conditions can affect the service life of the relay. Therefore, in the next step, we will conduct railway relay life prediction tests in a real working environment to further verify the feasibility of this method.

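In the per-tree random forest importance equation, $n_o^i$ represents the number of observations in the i-th tree's OOB data, $Y_p \in \{0,1\}$ is the true outcome of the p-th observation, $Y_p^i \in \{0,1\}$ represents the predicted outcome of the i-th tree for the p-th observation in the OOB data before random permutation, and $Y_{p,\pi_j}^i \in \{0,1\}$ is the corresponding prediction after random permutation.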

Figure 1. A schematic diagram of the convolutional neural network.

Figure 4. A schematic diagram of the test platform.

Figure 7. Trend diagram of the characteristic parameters.

Figure 8. Structural diagram of the prediction flow.

Figure 11. Comparison of prediction results before feature selection.
Figure 12. Comparison of prediction results after feature selection.

Figure 13. Comparison of diagrams of model errors.

Table 1 column headings: value range of $\rho_{xy}$; correlation between x and y.
In Table 2, $\sigma$ is the sigmoid activation function, $W_f$ is the forget gate weight matrix, $[h_{t-1}, x_t]$ is the combined matrix of the previous moment's output $h_{t-1}$ and the current moment's input $x_t$, $b_f$ is the forget gate bias term, $W_i$ is the input gate weight matrix, $b_i$ is the input gate bias, tanh is the hyperbolic tangent activation function, $W_c$ is the input unit state weight matrix, $b_c$ is the bias of the unit state, $\circ$ denotes element-wise multiplication of matrices, $c_{t-1}$ is the unit state at the previous moment, $W_o$ is the output gate weight matrix, $b_o$ is the output gate bias term, and $h_t$ is the final output value.

Table 3. Experimental conditions.
In the above table, $I$ represents the switching current, $I_e$ represents the rated operating current, $U$ represents the externally applied voltage, $U_e$ represents the rated operating voltage, $I_c$ represents the breaking current, and $T_{0.95}$ represents the time taken to reach 95% of the steady-state current.

Table 4. Evaluation index without feature selection.

Table 5. Evaluation index with feature selection.
Based on the comparison of Figures 11 and 12 and the data from Tables