Convolutional Neural Network-Based Pattern Recognition of Partial Discharge in High-Speed Electric-Multiple-Unit Cable Termination

Partial discharge detection is considered a crucial technique for evaluating insulation performance and identifying defect types in cable terminals of high-speed electric multiple units (EMUs). In this study, terminal samples exhibiting four typical defects were prepared from high-speed EMUs. A cable discharge testing system, utilizing high-frequency current sensing, was developed to collect discharge signals, and datasets corresponding to these defects were established. This study proposes the use of the convolutional neural network (CNN) for the classification of discharge signals associated with specific defects, comparing this method with two existing neural network (NN)-based classification models that employ the back-propagation NN and the radial basis function NN, respectively. The comparative results demonstrate that the CNN-based model excels in accurately identifying signals from various defect types in the cable terminals of high-speed EMUs, surpassing the two existing NN-based classification models.


Introduction
The cable of the high-speed electric multiple unit (EMU) plays a critical role in the power supply system of high-speed trains, directly impacting the safety of train operations. The terminal, as a vulnerable component of the high-speed EMU cable, is particularly susceptible to partial discharge (PD), which poses a threat to the efficient functioning of high-speed trains [1][2][3][4]. Detecting PD is crucial for assessing the insulation condition of cable terminals. By identifying local discharges, the extent of insulation deterioration can be determined, allowing for timely maintenance or replacement measures [5][6][7]. Currently, PD detection technologies include the pulse current method, the high-frequency pulse current method, the ultra-high frequency detection method, the ultrasonic detection method, the optical measurement method, and infrared imaging [8][9][10][11][12][13][14]. The advantages and disadvantages of these PD detection methods, along with their applicable scopes, are presented in Table 1.
Currently, the pulse current method, while being the earliest and most widely used PD detection method under the International Electrotechnical Commission standard, has some limitations [15]. Firstly, it primarily captures the lower frequency band of the PD signal, failing to acquire complete frequency information, particularly the high-frequency components. This omission may result in the neglect or misinterpretation of crucial discharge patterns. Secondly, the pulse current method has limited resistance to interference.
Based on the dissection of high-speed EMU cable terminals with on-site faults and the maintenance experience of on-site staff, faults in these terminals are generally categorized as long-term, short-term, and unpredictable. Among these, short-term faults are the most prominent, with typical defects including insulation scratches, interlayer air gaps, metal particles, and uneven semi-conductive layers [24]. Clarifying the cause and action mechanism of each type of discharge is important for the routine maintenance of high-speed EMU cables and the early warning of potential accidents.
In the realm of PD pattern recognition, the back-propagation NN (BPNN) and the radial basis function NN (RBFNN) are widely used. In [39], cable models with five types of defects were created, a dataset based on the phase-resolved PD (PRPD) spectrum was established, and the convolutional NN (CNN) was employed to achieve successful recognition of defect signals. In [40], PD signals from power cables with five types of insulation defects were collected, and a set of parameters characterizing the discharges was established. The study found that the CNN-based model outperformed the BPNN-based and SVM-based models in signal identification. In [41], a time-domain waveform image database of four kinds of PD defects was constructed, and the waveform images were processed with image enhancement and normalization. A DenseNet model was then built to recognize the four kinds of defects, and the model showed good robustness. In [42], PD and corona signals were collected from cable terminals, and a CNN-based model was used for signal recognition. However, this study bypassed the construction of a feature dataset, opting to feed discharge signals directly into the model.
In this paper, a CNN-based PD signal classification model is proposed for high-speed EMUs, which enables the identification of discharge signals from cable terminals exhibiting four typical defects. The main contributions of this paper are summarized as follows:

1. Characteristic parameters of terminal discharge signals in high-speed EMU cables are not extracted; instead, the four types of discharge signals are directly used as input to the model, achieving high accuracy.

2. The impact of different training datasets on the classification performance of the terminal discharge signal recognition model for high-speed EMUs is compared and analyzed. The proposed CNN-based model is demonstrated to flexibly meet the varying requirements for processing time and accuracy across different scenarios.

3. The proposed recognition model for terminal discharge signals in high-speed EMU cables is compared with two existing NN-based models, and it is verified that the CNN-based model exhibits superior recognition effectiveness.

PD Test Platform
The PD test platform used in this study comprises the voltage regulator, the test transformer, the protection resistor, the discharge sample, the high-frequency current transformer (HFCT), and the high-frequency oscilloscope, among other components, as depicted in Figure 1. The HFCT, known for its exceptional sensitivity, straightforward setup, and strong anti-interference capabilities [43], is a widely used instrument for the online detection of PD signals, particularly when the ground conductor of the device under test is accessible. The circuit of the PD test platform is illustrated in Figure 2. By increasing the test voltage with the voltage regulator, discharges are induced in the insulation defect samples, thereby simulating PD defects [44].

Four Typical Defect PD Models
During testing and operation, the cable terminal may exhibit discharge phenomena due to internal defects. The primary defect types include wire core burrs, surface sliding, internal air gaps, and suspended metal particles. To simulate these four discharge defects at the cable terminal, electrode structures for four typical discharge models are designed, as depicted in Figure 3. The tip discharge model simulates conductor burrs, which are difficult to eliminate completely during the fabrication of cable terminals and may cause discharge phenomena during operation. The surface discharge model simulates discharges caused by looseness or delamination between the insulation layers inside the cable terminal. The air gap discharge model simulates discharges caused by tiny bubbles or knife marks in the insulation layer during terminal operation. Lastly, the suspended discharge model simulates PD issues caused by conductive and semi-conductive impurities attached to the main insulation surface [45][46][47][48][49][50].

1. Tip discharge model: This model employs a steel needle with a curvature radius of 5 µm and uses ethylene-propylene-diene monomer (EPDM) rubber film as the insulating medium, with a diameter of 120 mm and a thickness of 3 mm. A ground electrode with a diameter of 80 mm is connected below the rubber film, and the steel needle is linked to a high-voltage electrode. The tip is inserted into the film to a depth of approximately 1 mm.

2. Surface discharge model: In this model, the insulating medium consists of an EPDM rubber film with a diameter of 60 mm, structured as a double layer with a total thickness of 6 mm. Below this, a ground electrode with a diameter of 80 mm is connected, and a copper disk with a diameter of 30 mm is positioned between the insulating medium and the high-voltage electrode.

3. Air gap discharge model: In this model, the insulating medium is again EPDM rubber film, with a diameter of 60 mm and a thickness of 3 mm. To simulate an air gap discharge, a circular hole with a diameter of 1 mm is created within the insulating medium. A copper disk is placed between the high-voltage electrode and the insulating medium. To avoid surface discharge interference, the high-voltage electrode in the air gap discharge model is sealed with epoxy resin.

4. Suspended discharge model: In this model, the insulating medium is EPDM rubber with a diameter of 120 mm and a thickness of 3 mm, and the high-voltage electrode is a copper disk with a diameter of 30 mm. A gap is left between the high-voltage electrode and the insulating medium, and a copper sheet with a thickness of 1 mm is placed in the gap as a suspended metal particle to simulate the suspended electrode.

High-Frequency Pulse Signals of PD with Four Typical Defects
The PD test platform shown in Figure 1 was used to carry out systematic voltage tests on the four defect models mentioned above. The test protocol involved gradually increasing the voltage in 1 kV increments, maintaining each level for 1 min after each voltage increase to ensure test stability. High-frequency pulse current signals from the four defect models were collected within two power frequency cycles under a 12 kV test voltage. The time-domain waveforms of these signals are presented in Figure 4.
Sensors 2024, 24, x FOR PEER REVIEW 6 of 18

CNN
The CNN primarily consists of three fundamental components: convolutional layers, pooling layers, and fully connected layers [51]. In this study, a CNN-based classification model is employed to categorize the discharge signals from four typical defects in high-speed EMU cables. This model is constructed using two convolutional layers, two activation layers, two maximum pooling layers, and one fully connected layer; its block diagram is illustrated in Figure 5. The discharge signal from the terminal of the high-speed EMU cable serves as the input to the model. The convolutional and pooling layers extract features and compress this information into feature maps. For the first convolutional layer, the kernel size is set to 3 × 1 with 32 kernels, and the step size is 1. For the second convolutional layer, the kernel size is 4 × 1 with 64 kernels, and the step size remains 1. Both maximum pooling layers use a kernel size of 3 × 1 and a step size of 1.
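To make the layer settings above concrete, the following minimal sketch traces how the signal length would shrink through the two convolution and two pooling layers. It assumes no padding (the text does not state the padding mode) and an input length of 251 sampling points, the value given later in the dataset description; the helper names are illustrative only.

```python
def valid_output_length(length: int, kernel: int, stride: int = 1) -> int:
    """Output length of a 1-D convolution or pooling window without padding."""
    return (length - kernel) // stride + 1


# (layer name, kernel size, step size, number of channels after the layer)
LAYERS = [
    ("conv1", 3, 1, 32),
    ("pool1", 3, 1, 32),
    ("conv2", 4, 1, 64),
    ("pool2", 3, 1, 64),
]


def trace_shapes(input_length: int = 251):
    """Return (name, length, channels) after each layer, assuming no padding."""
    length = input_length
    shapes = []
    for name, kernel, stride, channels in LAYERS:
        length = valid_output_length(length, kernel, stride)
        shapes.append((name, length, channels))
    return shapes


shapes = trace_shapes()
for name, length, channels in shapes:
    print(f"{name}: {length} x {channels}")

# Size of the vector handed to the fully connected layer after flattening.
flattened = shapes[-1][1] * shapes[-1][2]
print("flattened features:", flattened)
```

Under these assumptions the fully connected layer would map the flattened feature vector to the four defect classes; a different padding choice would change the intermediate lengths.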

Convolution Layer: This layer applies the convolution operation across local regions of the input using the convolution kernel of the filter, aiming to extract local features effectively. The convolution layer employs weight sharing, significantly reducing the number of learning parameters and thereby mitigating the risk of overfitting [52]. Each convolutional layer in the network comprises multiple convolutional kernels, with the parameters of each kernel being optimized through the backpropagation algorithm [53]. The convolution of each kernel with the input sequence can be written as
$$y_m^{(k)} = \sum_{l=1}^{L} w_l^{(k)} x_{m+l-1} \tag{1}$$
where $x_m$ is the input sequence, $w^{(k)}$ is the weight of the $k$th convolution kernel, $L$ is the kernel size, and $y_m^{(k)}$ is the corresponding output.
Activation Layer: The output from the convolutional layer serves as the input to the activation function. The role of the activation function is to transform the output of the convolutional layer in a nonlinear manner. This transformation enhances the linear separability of data that was originally scattered. Commonly used activation functions in CNNs include the sigmoid function, the tanh function, and the rectified linear unit (ReLU) function [54]. The expressions for these three functions are given as (2)-(4), respectively:
$$\mathrm{sigmoid}(x) = \frac{1}{1 + e^{-x}} \tag{2}$$
$$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \tag{3}$$
$$\mathrm{ReLU}(x) = \max(0, x) \tag{4}$$
where $x$ is the output data of the convolution layer.
Compared with the other two activation functions, the ReLU function can effectively reduce the amount of computation and improve the expressive ability of the network by regulating the activity of neurons in the network [54]. Therefore, the ReLU function is used as the activation function of the activation layer in this study.
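The convolution and activation steps described above can be sketched in a few lines of plain Python. This is an illustrative example only: the signal and kernel values are hypothetical, no padding or bias term is assumed, and the kernel is applied without flipping, as is customary in CNN implementations.

```python
import math


def conv1d_valid(x, w, stride=1):
    """Slide kernel w over sequence x without padding.

    Each output is sum_l w[l] * x[m + l] for a window starting at index m.
    """
    L = len(w)
    return [
        sum(w[l] * x[m + l] for l in range(L))
        for m in range(0, len(x) - L + 1, stride)
    ]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def tanh(v):
    return math.tanh(v)


def relu(v):
    return max(0.0, v)


# Hypothetical five-point signal and a 3 x 1 kernel (not taken from the paper).
signal = [0.0, 1.0, 3.0, -2.0, 0.5]
kernel = [1.0, 0.0, -1.0]

features = conv1d_valid(signal, kernel)
activated = [relu(v) for v in features]
print(features)
print(activated)
```

ReLU only needs a comparison per value, whereas sigmoid and tanh require exponentials, which is one reason it is cheaper to evaluate.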
Pooling Layer: Following the convolution operation, the quantity of extracted feature sequences increases, leading to an expansion in data dimensions and a rise in computational complexity. The pooling layer serves to reduce the data width and the number of network parameters, thus lowering computational costs and helping to prevent overfitting [55]. There are two common pooling functions: average pooling and max pooling. Average pooling computes the mean of the input data to serve as the output of the layer, while max pooling selects the maximum value from the input data as the output [56]. The expressions of average pooling and max pooling are given as (5) and (6), respectively:
$$y_c^{h+1} = \frac{1}{w} \sum_{t=1}^{w} x_c^{h}(t) \tag{5}$$
$$y_c^{h+1} = \max_{1 \le t \le w} x_c^{h}(t) \tag{6}$$
where $w$ is the width of the pooling window, $x_c^{h}(t)$ is the value of the $t$th neuron in the $c$th feature map of the $h$th layer, and $y_c^{h+1}$ is the value of the corresponding neuron in layer $h + 1$.
Max pooling is particularly effective in capturing important local features of the data, thereby improving the recognition accuracy of the model. Consequently, the max pooling function is adopted in this study.
Fully Connected Layer: The role of this layer is to integrate and refine the features extracted by the alternating convolution and pooling layers. This is achieved by flattening the output feature map from the final convolution or pooling layer into a one-dimensional feature vector, which allows for further feature extraction [57]. Then, the fully connected layer maps these extracted feature vectors to the sample label space and classifies them by constructing a classifier. For classification, the softmax function is usually selected as the activation function of the fully connected layer. It converts the output vector into a probability distribution, from which the model selects the category with the highest probability as the output [58]:
$$P_j = \frac{e^{r_j}}{\sum_{i=1}^{k} e^{r_i}} \tag{7}$$
where $P_j$ is the probability of belonging to the $j$th class, $r_j$ is the node value of the $j$th neuron, and $k$ is the total number of classes.
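As a small illustration of the pooling and softmax operations described above, the sketch below applies both pooling variants to a short sequence and converts four output node values into class probabilities. All numbers are hypothetical and the function names are illustrative; subtracting the maximum inside the softmax is a common numerical-stability step that does not change the result.

```python
import math


def pool1d(x, width, stride=1, mode="max"):
    """Apply max or average pooling over sliding windows of the given width."""
    windows = [x[t:t + width] for t in range(0, len(x) - width + 1, stride)]
    if mode == "max":
        return [max(win) for win in windows]
    return [sum(win) / width for win in windows]


def softmax(r):
    """Convert node values r into probabilities that sum to 1."""
    shift = max(r)  # stabilizes the exponentials without changing the output
    exps = [math.exp(v - shift) for v in r]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical feature sequence and output-node values (not from the paper).
feature = [0.0, 3.0, 2.5, 1.0, 4.0]
max_pooled = pool1d(feature, width=3, stride=1, mode="max")
avg_pooled = pool1d(feature, width=3, stride=1, mode="avg")

probs = softmax([2.0, 1.0, 0.1, -1.0])
predicted = probs.index(max(probs))  # index of the most probable class
print(max_pooled, avg_pooled, probs, predicted)
```

With a window of 3 and a step size of 1, as in the model settings, each pooling layer shortens the sequence by only two points, so its main effect is to keep the strongest local response rather than to downsample aggressively.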

PD Data for Training and Verification
Through the PD test platform shown in Figure 1, the PD signals from high-speed EMUs with various defect models are collected; from these collected PD signals, single-wave time-domain signals for the four types of PD signals associated with the defect models are extracted, as illustrated in Figure 6.

Classification Steps
The classification process of the proposed CNN-based model is illustrated in Figure 7 and can be summarized in the following steps:

1. Signal acquisition: A test platform is built, and cable terminal models with four types of defects are created. The HFCT is used to measure the PD signals of the cable terminals.

2. Dataset construction: Four different types of discharge signals are collected, and single signals are extracted. Each extracted signal comprises 251 sampling points, and 400 sets of data are obtained for each type. Out of these 1600 sets, 1200 are randomly chosen to construct the training dataset, while the remaining 400 sets are designated as the test dataset.

3. Data normalization: To reduce the differences in scale among the data, the dataset is normalized. This step facilitates faster gradient descent, aiding in finding the optimal solution and enhancing the model's accuracy and convergence speed. The dataset from step 2 is normalized using the following expression:
$$Y_i = \frac{y_i - y_{\min}}{y_{\max} - y_{\min}} \tag{8}$$
where $y_i$ is the sample value before normalization, $Y_i$ is the normalized sample value, $y_{\min}$ is the minimum value of the sample, and $y_{\max}$ is the maximum value of the sample.

4. CNN training and classification: The normalized dataset from step 3 serves as the input to the constructed CNN-based model. After processing through two convolutional layers, two pooling layers, and one fully connected layer, the classification results are produced.
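Steps 2 and 3 above can be sketched as follows. This is a minimal illustration on synthetic stand-in data, not the measured PD signals; it assumes that min-max normalization is applied to each signal separately and that the 1200/400 split is a simple random shuffle. The function names and the random seeds are hypothetical.

```python
import random


def min_max_normalize(sample):
    """Scale one signal to [0, 1] using its own minimum and maximum."""
    lo, hi = min(sample), max(sample)
    if hi == lo:  # constant signal: avoid division by zero
        return [0.0 for _ in sample]
    return [(v - lo) / (hi - lo) for v in sample]


def split_dataset(samples, labels, n_train, seed=0):
    """Randomly split (sample, label) pairs into training and test sets."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    train_idx, test_idx = indices[:n_train], indices[n_train:]
    train = [(samples[i], labels[i]) for i in train_idx]
    test = [(samples[i], labels[i]) for i in test_idx]
    return train, test


# Synthetic stand-in data: 4 defect types x 400 signals x 251 sampling points.
rng = random.Random(1)
samples = [[rng.uniform(-1.0, 1.0) for _ in range(251)] for _ in range(1600)]
labels = [i // 400 for i in range(1600)]

normalized = [min_max_normalize(s) for s in samples]
train_set, test_set = split_dataset(normalized, labels, n_train=1200)
print(len(train_set), len(test_set))
```

If normalization were instead computed over the whole dataset, the minimum and maximum would have to be taken from the training data only and reused for the test data.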

Results Analysis
The models discussed in this paper are implemented using the MATLAB 2021a software, installed on a personal computer equipped with an Intel i5-10210U CPU (1.6 GHz clock frequency) and 8 GB of RAM.
In this study, training loss and accuracy are used to evaluate the recognition effectiveness of the CNN. Loss quantifies the discrepancy between the predicted value and the true value. Cross entropy is used as the loss function to describe the gap between the probability distribution of the predicted values and that of the actual values, and thus reflects the model's degree of fit [59]. The expression for cross entropy is written as
$$\mathrm{Loss} = -\frac{1}{N} \sum_{i=1}^{N} y_i \log p_i \tag{9}$$
where $y_i$ is the label value of sample $i$, $p_i$ is the probability of correctly predicting the sample, and $N$ is the total number of samples.
Four performance indices, namely accuracy, precision, recall, and F1-score, are utilized to evaluate the effectiveness of classification. Taking binary classification as an example, accuracy measures the proportion of correctly predicted samples, precision is the proportion of actual positives among all samples predicted as positive, recall is the proportion of actual positives that are correctly identified, and F1-score reflects the balance between precision and recall. The formulas for these indices are provided in (10)-(13), and Table 2 illustrates the meanings of each term used in the formulas.
The adaptive moment estimation (Adam) optimizer [59] is employed, and the learning rate is reduced to one-tenth of its value every 10 training iterations.
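The loss and the learning rate reduction described above can be illustrated with a short sketch. It assumes one-hot labels, so that only the probability assigned to the true class enters the cross entropy, and a hypothetical initial learning rate of 0.01, which the text does not specify.

```python
import math


def cross_entropy(true_class_probs):
    """Mean negative log-probability assigned to the true class of each sample."""
    n = len(true_class_probs)
    return -sum(math.log(p) for p in true_class_probs) / n


def step_decay(initial_lr, iteration, drop_every=10, factor=0.1):
    """Multiply the learning rate by `factor` once every `drop_every` iterations."""
    return initial_lr * factor ** (iteration // drop_every)


# Hypothetical probabilities predicted for the correct class of three samples.
loss = cross_entropy([0.9, 0.8, 0.99])
print(f"loss = {loss:.4f}")

# Hypothetical initial learning rate of 0.01, reduced tenfold every 10 iterations.
for it in (0, 9, 10, 25):
    print(it, step_decay(0.01, it))
```

A perfectly confident correct prediction contributes zero loss, while a probability of 0.5 for the true class contributes log 2, which is why the loss curve falls toward zero as training accuracy approaches 100%.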

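$$\mathrm{Accuracy} = \frac{TN + TP}{TN + FN + TP + FP} \tag{10}$$
$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{11}$$
$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{12}$$
$$\mathrm{F1\text{-}score} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{13}$$

Table 2. Confusion matrix.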

Actual    Predicted 1    Predicted 0
1         TP             FN
0         FP             TN
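The four indices can be computed directly from the confusion-matrix counts, as in the minimal sketch below. The counts are hypothetical and chosen only to exercise the formulas; for the four-class problem, one common approach (assumed here, not stated in the text) is to evaluate each defect type one-vs-rest.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1-score from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts for one defect type treated as the positive class.
metrics = binary_metrics(tp=90, fp=5, fn=10, tn=295)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

The sketch does not guard against empty denominators, which would occur if a class were never predicted or never present in the test set.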

Influence of Different Optimizers
By updating the network parameters that influence model training and output, optimizers aim to minimize the loss function, guiding it toward an optimal value [42]. This study discusses the effects of the Adam optimizer, the stochastic gradient descent with momentum (SGDM) optimizer, and the root mean square propagation (RMSprop) optimizer on the CNN-based classification model to identify the most suitable optimizer. In total, 75% of the data is randomly selected from the PD dataset to construct the training dataset, with the remaining data serving as the test dataset.
The evaluation focuses on the impacts of three optimizers on the model performance and accuracy. The training accuracy and loss curves of the CNN-based classification model, utilizing these three optimizers through the iterative process, are depicted in Figure 8. As shown in Figure 8, the CNN-based classification model employing the Adam optimizer converges fastest and exhibits a smooth curve, indicating efficient learning. In contrast, the model using the SGDM optimizer shows the slowest convergence, whereas the RMSprop optimizer achieves better convergence than SGDM but with more considerable curve fluctuation, indicating less stability in the trained model. According to Figure 9, all three optimizer-based models achieve 100% signal classification accuracy during training. Combining the results from Figure 10 and Table 3, the average accuracy of the CNN-based classification model, based on the three optimizers for identifying the four defects, ranks as follows: Adam > SGDM > RMSprop. This suggests that the CNN-based classification model using the Adam optimizer demonstrates superior learning performance and classification effectiveness. Consequently, the Adam optimizer is selected as the optimizer for the cable terminal discharge classification model in high-speed EMUs for this study.
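For readers unfamiliar with the three optimizers compared above, the sketch below writes out their standard update rules on a toy one-dimensional loss f(x) = x². It is not the training setup of this study: the loss, the starting point, and all hyperparameters (learning rate 0.1, momentum 0.9, decay rates 0.9 and 0.999) are assumptions chosen for illustration, and behavior on such a toy problem does not necessarily carry over to the CNN.

```python
import math


def grad(x):
    """Gradient of the toy loss f(x) = x**2."""
    return 2.0 * x


def sgdm(x, steps, lr=0.1, momentum=0.9):
    """Stochastic gradient descent with momentum (here with exact gradients)."""
    v = 0.0
    for _ in range(steps):
        v = momentum * v - lr * grad(x)
        x += v
    return x


def rmsprop(x, steps, lr=0.1, rho=0.9, eps=1e-8):
    """RMSprop: scale each step by a running RMS of past gradients."""
    s = 0.0
    for _ in range(steps):
        g = grad(x)
        s = rho * s + (1.0 - rho) * g * g
        x -= lr * g / (math.sqrt(s) + eps)
    return x


def adam(x, steps, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """Adam: bias-corrected first and second moment estimates of the gradient."""
    m = 0.0
    v = 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        m_hat = m / (1.0 - beta1 ** t)
        v_hat = v / (1.0 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


results = {
    name: fn(5.0, 1000)
    for name, fn in (("SGDM", sgdm), ("RMSprop", rmsprop), ("Adam", adam))
}
for name, value in results.items():
    print(f"{name}: final x = {value:.6f}")
```

Because RMSprop normalizes the gradient by its running magnitude, its step length stays on the order of the learning rate even near the minimum, so with a constant learning rate it tends to hover around the optimum rather than settle exactly.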

Influence of Different Training Data Amounts
The first 20%, 40%, 60%, 80%, and 90% of the PD data of high-speed EMU cable terminals are used in turn as training data. Figure 11 demonstrates that as the training data volume expands, the range of recognition accuracy for the CNN-based classification model narrows, and the average accuracy approaches 100%. More training data introduce greater data diversity, enhancing the model's robustness and minimizing performance variances across different data subsets. This leads to a more concentrated range of high accuracy.
Moreover, as can be seen from Figure 12, the model's training time also increases with larger training data volumes. The average training time for 100 sessions across the five data groups was recorded as 18.91 s, 46.78 s, 66.84 s, 92.03 s, and 102.38 s, respectively.
More training data require more forward-propagation and back-propagation calculations. This allows the model to fit the distribution of the data better and improves its generalization ability and performance, but it also requires more running time.

Comparison of CNN with Other Classification Models
The proposed CNN-based classification model is compared with the BPNN-based and RBFNN-based classification models. In total, 75% of the PD dataset is utilized to train these three models, and the remaining data are employed to assess the models' recognition capabilities. For the BPNN, the number of nodes in the input layer is set to 251, the number of nodes in the hidden layer is set to 6, and the number of nodes in the output layer is set to 4. In addition, the target error of the model is set to $1 \times 10^{-6}$, and the learning rate is 0.01. For the RBFNN, the Gaussian kernel is used. The RBFs are determined adaptively according to the training data, and the spread of the RBF is set to 100.
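To give a sense of scale for the two baseline networks, the sketch below counts the trainable parameters of a 251-6-4 fully connected network and evaluates a Gaussian radial basis function. It assumes that every layer of the BPNN has a bias term and uses one common Gaussian parameterization, exp(−d²/(2σ²)) with σ equal to the spread; the width convention of the software actually used in the study may differ.

```python
import math


def dense_parameter_count(layer_sizes, use_bias=True):
    """Number of weights (and biases) in a fully connected network."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + (n_out if use_bias else 0)
    return total


def gaussian_rbf(x, center, spread):
    """Gaussian RBF response; this width convention is an assumption."""
    dist_sq = sum((a - b) ** 2 for a, b in zip(x, center))
    return math.exp(-dist_sq / (2.0 * spread ** 2))


# BPNN structure from the text: 251 inputs, 6 hidden nodes, 4 outputs.
bpnn_params = dense_parameter_count([251, 6, 4])
print("BPNN parameters:", bpnn_params)

# Response of one Gaussian unit at its center and at a distance equal to the spread.
center = [0.0, 0.0]
print(gaussian_rbf([0.0, 0.0], center, spread=100.0))
print(gaussian_rbf([100.0, 0.0], center, spread=100.0))
```

With only six hidden nodes the BPNN is a very small model, which is consistent with its shorter runtime relative to the CNN.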
The comparison results are depicted in Figure 13, while Tables 5 and 6 show the accuracy, precision, recall, and F1-score of the three models for the different types of discharge signals. It can be seen from Tables 5 and 6 that the CNN model has higher precision, recall, and F1-score for the four discharge signals. Although its recognition of tip discharge and air gap discharge is relatively poor, the accuracy of the CNN model is higher than that of the other two models, indicating that it predicts the four discharge signals more accurately overall, classifies them better, and is more stable.
Although the proposed CNN-based classification model demonstrates higher accuracy than the other two NN-based classification models, it also requires a longer runtime. This is attributed to the CNN's convolution layers, which enhance the model's local feature extraction capability by using convolution kernels to capture specific signal features, thereby effectively differentiating between different signal types. The increased sensitivity and accuracy of signal recognition come at the cost of increased computational complexity, resulting in higher time costs.

Conclusions
In this paper, a CNN-based classification method for distinguishing different defect discharge signals in high-speed EMU cable terminals is presented. Within a laboratory environment, a PD test platform using the HFCT was established for the collection of PD signals from various defects, and these signals were classified using the proposed CNN-based classification model. Furthermore, the effects of three different optimizers and varying amounts of training data on the classification accuracy of the high-speed EMU cable terminal PD model were investigated, and the proposed CNN-based classification model was compared with two existing NN-based classification models. The main conclusions are drawn as follows:

1. Compared with the SGDM and RMSprop optimizers, the Adam optimizer shows lower loss and higher classification accuracy in CNN-based classification model training, and the training effect is more stable.
2. It is found that increasing the amount of training data can enhance the robustness of the model and improve the classification accuracy, but at the cost of increasing the training time.
3. Compared with the BPNN-based and RBFNN-based classification models, the CNN-based classification model proposed in this paper shows higher classification accuracy and can identify four different types of defects more accurately.

The method proposed in this paper avoids the manual feature extraction required in traditional machine learning, effectively identifies the discharge signals of four different defect types, and achieves a high classification accuracy. In future research, we will explore new artificial intelligence methods to further optimize the classification model, with the aim of achieving better classification and more stable performance.

Figure 1. PD test platform for high-speed EMU cable terminals.

Figure 2. Circuit diagram of PD test platform based on the HFCT.

Figure 4. Time domain waveforms of discharge high-frequency pulse current signal of four defect models: (a) tip discharge, (b) surface discharge, (c) air gap discharge, and (d) suspended discharge.

Figure 5. CNN-based cable terminal discharge classification model of high-speed EMU.

Figure 6. Time-domain diagram of discharge high-frequency pulse current signals of four defect models: (a) tip discharge, (b) surface discharge, (c) air gap discharge, and (d) suspension discharge.

To ensure the richness and representativeness of the data, 400 sets of each of the four kinds of PD signals are extracted from the collected signals. Out of the 1600 total sets of data, 1200 sets are randomly selected to train the model. The remaining 400 sets are used to verify the model's accuracy in identifying different types of defects.

Figure 7. Classification steps based on the CNN.

Figure 8. Training accuracy and loss of three optimizers: (a) loss curve and (b) accuracy curve.

Figure 12. Box plot of training time based on different training data amounts.

1. Signal acquisition: A test platform is built, and cable terminal models with four types of defects are created. The HFCT is used to measure the PD signals of the cable terminals.
2. Dataset construction: Four different types of discharge signals are collected, and a single signal is extracted. For each of the four signals, 251 sampling points are selected, resulting in 400 sets of data for each type. Out of these 1600 sets, 1200 are randomly chosen to construct the training dataset, while the remaining 400 sets are designated as the test dataset.
3. Data normalization: To simplify the data complexity, disparate data in the set are processed. This step facilitates faster gradient descent, aiding in finding the optimal solution and enhancing the model's accuracy and convergence speed. The dataset from step 2 is normalized using the following expression.
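The dataset construction and normalization steps can be sketched as follows. The normalization expression itself is not reproduced in this section, so the sketch assumes the common min-max form x' = (x − x_min) / (x_max − x_min), applied to each signal separately; that choice, the function names, and the random placeholder data standing in for measured HFCT pulses are all assumptions. Only the sizes (4 classes, 400 sets per class, 251 sampling points, 1200/400 split) come from the text.

```python
import random

N_CLASSES, SETS_PER_CLASS, N_POINTS = 4, 400, 251  # sizes stated in the text
N_TRAIN = 1200                                     # remaining 400 sets form the test set


def min_max_normalize(signal):
    """Scale one signal to [0, 1]; min-max scaling is an assumed choice."""
    lo, hi = min(signal), max(signal)
    if hi == lo:  # constant signal: avoid division by zero
        return [0.0] * len(signal)
    return [(v - lo) / (hi - lo) for v in signal]


def build_placeholder_dataset(rng):
    """Random values stand in for measured pulses; labels 0..3 are the defect types."""
    data = []
    for label in range(N_CLASSES):
        for _ in range(SETS_PER_CLASS):
            signal = [rng.uniform(-1.0, 1.0) for _ in range(N_POINTS)]
            data.append((signal, label))
    return data


rng = random.Random(42)
dataset = build_placeholder_dataset(rng)

# Random 1200/400 split of the 1600 sets.
rng.shuffle(dataset)
train_raw, test_raw = dataset[:N_TRAIN], dataset[N_TRAIN:]

# Normalize each signal independently (assumed per-signal scaling).
train = [(min_max_normalize(s), y) for s, y in train_raw]
test = [(min_max_normalize(s), y) for s, y in test_raw]

print(len(train), len(test))  # 1200 400
```

A purely random split does not guarantee exactly 300 training sets per class; a stratified split would be an alternative if balanced classes are required.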

Table 3. Classification accuracy of cable terminal discharge by different optimizers.

Table 4. Classification results of terminal discharge of high-speed EMU cables based on different training data.

Table 4 reveals that the classification accuracy for the four types of discharge in the cable terminal discharge signal classification model of high-speed EMUs improves with an increase in training data volume. Furthermore, Figure

Table 5. Classification results of terminal discharge of high-speed EMU cables with different models: accuracy and precision.

Table 6. Classification results of terminal discharge of high-speed EMU cables with different models: recall and F1-score.