Defect Recognition in Ballastless Track Structures Based on Distributed Acoustic Sensors

Defect recognition in ballastless track structures, based on distributed acoustic sensors (DASs), was researched in order to improve detection efficiency and ensure the safe operation of trains on high-speed railways. A line in southern China was selected, and equipment was installed and debugged to collect the signals of trains and events along it. Track vibration signals were extracted by identifying the train trajectory, denoising, framing and labeling to build a defect dataset. Time–frequency-domain statistical features, wavelet packet energy spectra and the MFCCs of vibration signals were extracted to form a multi-dimensional vector. An XGBoost model was trained and its accuracy reached 89.34%. A time-domain residual network (ResNet) that expands the receptive field was proposed, the accuracies obtained with convolution kernels of different sizes were tested, and its accuracy reached 94.82%. In conclusion, both methods performed well on the built dataset. Additionally, the ResNet delivered more effective detection of DAS signals than conventional feature engineering methods.


Introduction
In recent years, China's high-speed railway has developed rapidly. By the end of 2022, its operating mileage had reached 42,000 km. People have put forward greater requirements for the running speed, safety and riding comfort of trains, which has resulted in slab ballastless tracks being widely used in high-speed railway construction. Compared with ballasted tracks, ballastless tracks are smooth, stable and durable. However, because this type of track contains no ballast buffer, trains have stronger impacts on these tracks when they pass at high speed. At the same time, the rigid concrete structures of these tracks not only introduce new hidden defects but also create certain difficulties for later maintenance.
With increased service life and under the impacts of high-speed trains and various environmental loads, ballastless tracks will suffer from cracks, damage and other defects [1][2][3], as shown in Figure 1. If these defects are not dealt with in time, the tracks will further deteriorate under the impacts and vibration of trains, which will not only seriously affect the strength of the track structure and reduce the life of the track slab but also increase later maintenance costs. Therefore, it is of great significance to detect defects in ballastless tracks, find hidden dangers and prevent them in time.
At present, defect detection for ballastless tracks is mainly carried out through a combination of manual inspection and comprehensive inspection trains. However, with sharp increases in operating mileage and the fact that manual inspections can only be performed during "skylight" (maintenance window) periods, heavy workloads and low efficiency are problems.
In terms of technical means, ground-penetrating radar [4,5] is used to detect the structural layer defects of ballastless tracks. However, it is difficult to interpret the actual radar profiles recorded with it because of the reflection and scattering of targets, the nonuniformity of the medium distribution and the complexity and diversity of geological structures during the underground propagation of electromagnetic waves. In addition, the uncertain factors in the manual interpretation of images will ultimately affect the judgment of the result.
Appl. Sci. 2023, 13, x FOR PEER REVIEW
Distributed acoustic sensors (DASs) based on the phase-sensitive optical time-domain reflectometer (Φ-OTDR) [6] are a new type of sensing technology that uses the interference effect of Rayleigh backscattering in optical fibers to realize the continuous distributed detection of acoustic signals. DASs not only have the advantages of immunity to electromagnetic interference, corrosion resistance and no required power supply but can also detect and locate weak vibration signals along optical fibers. They have been widely applied in fields [7][8][9][10][11][12] such as oil pipeline security monitoring, perimeter security and rail transit. Collecting the vibration signals of trains and events along a line by using DAS technology, combined with signal processing, deep learning and other methods to identify track defects, provides a reliable auxiliary method that has an important application value for the safe and high-quality development of high-speed railways.
For this paper, we have studied identification technology for ballastless track defects, collected and processed vibration signals and built a defect dataset. According to the design principles of ballastless tracks and the characteristics of the sample data, two pattern recognition methods, based on feature engineering and deep learning, were proposed to verify the feasibility of DAS technology for defect detection in ballastless tracks.

Experimental Setup
In the field, a high-speed railway line in southern China was selected for track vibration signal acquisition. The length of the testing line is 32.11 km. The track bed consists of CRTS II slabs, each 6.4 m long. The existing communication optical cable in the groove outside the line was used as the sensing fiber, as shown in Figure 2.
The length of the optical cable is 33.96 km, and the actual measured length is 26.21 km, so there are 4095 slabs in the range. There are multiple optical fiber connectors and coils in the line; thus, event locating will be affected. To reduce the positioning error, we carried out knocking experiments on both sides of the connectors and coils (15 locations in total) to determine the mapping relationship between the sampling points and the kilometer markers. The experimental process is shown in Figure 3, where the positions have been hidden and are not marked with real kilometers.
The DAS equipment (a coherent detection-based Φ-OTDR system) was placed in the communication room of the nearby station. The key parameters of the equipment are shown in Table 1. In summary, 4095 sampling points can be obtained, each corresponding to a track slab. Figure 4 shows the vibration states at certain moments on the line; the X-axis represents the spatial sampling points on the optical fiber, corresponding to real geographical locations, and the Y-axis represents the phases of the sampling points.
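The mapping between sampling points and kilometer markers obtained from the knocking experiments could, for instance, be applied by piecewise-linear interpolation between calibrated locations. The following is a minimal sketch under that assumption, not the procedure actually used in this work; the function name `point_to_km` and the calibration values are hypothetical, since the real positions are hidden.

```python
from bisect import bisect_right

def point_to_km(point, cal_points, cal_km):
    """Map a sampling-point index to a track position (km) by
    piecewise-linear interpolation between calibrated knock locations.

    cal_points: strictly increasing sampling-point indices (hypothetical).
    cal_km:     track positions in km measured at those indices.
    """
    if len(cal_points) != len(cal_km) or len(cal_points) < 2:
        raise ValueError("need at least two matching calibration pairs")
    if any(b <= a for a, b in zip(cal_points, cal_points[1:])):
        raise ValueError("cal_points must be strictly increasing")
    # Find the calibration segment containing the point (clamped to the ends).
    j = bisect_right(cal_points, point) - 1
    j = max(0, min(j, len(cal_points) - 2))
    p0, p1 = cal_points[j], cal_points[j + 1]
    k0, k1 = cal_km[j], cal_km[j + 1]
    return k0 + (point - p0) * (k1 - k0) / (p1 - p0)

# Hypothetical calibration: a fiber coil occupies sampling points 1000-1020,
# so the track position does not advance inside it.
CAL_POINTS = [0, 1000, 1020, 2020]
CAL_KM = [0.0, 6.4, 6.4, 12.8]
```

Knocking on both sides of a coil yields two calibration points with the same track position, so every sampling point inside the coil maps to one location rather than shifting all later positions.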

Moving Average and Moving Differential Method
Optical loss is inevitable. When light with a wavelength of 1330 nm or 1550 nm propagates in a G.652 single-mode fiber, the average losses per kilometer are 0.35 dB and 0.25 dB, respectively. The splice loss caused by fiber core fusion is about 0.05 dB per splice. These losses directly lead to the Rayleigh scattered light being strong at the near end and weak at the far end. At the same time, because the scattered light is highly sensitive, it is very easily disturbed by noise in the surrounding environment, such as construction noise along the line or traffic noise in parallel sections of highways and railways. The collected signals are mixed with many unknown noises, which may affect the positioning accuracy of DAS equipment.
To reduce noise and improve the frequency response, the moving average and moving differential method [13,14] was used in our detection system. This method consists of acquiring a certain number N of Rayleigh backscattering traces, $r = \{r_1, r_2, \ldots, r_N\}$, and choosing a number M of the acquired traces to be averaged. The averaged trace set is

$$R = \left\{ R_i \;\middle|\; R_i = \frac{1}{M}\sum_{j=i}^{i+M-1} r_j,\ i \in [1, N-M+1] \right\} \qquad (1)$$

Considering the pulse duration time and the decay time of track vibration, $R_r = R_{\mathrm{int}(i/2M)\times M+1}$ is used as the moving reference. Then, the differential traces can be obtained by the following equation

$$\Delta R = \left\{ \Delta R_i \;\middle|\; \Delta R_i = R_i - R_r \right\} \qquad (2)$$

where $i \in [1, N-M+1]$ and $\mathrm{int}(\cdot)$ takes the integer part.
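As a rough illustration of the moving average and moving differential steps described above, the sketch below treats each trace as a list of samples along the fiber and follows the 1-based indexing of the moving reference. It is a plain-Python sketch under those assumptions rather than the implementation used in the detection system; the function names are hypothetical.

```python
def moving_average(traces, m):
    """Average every m consecutive traces: N traces give N - m + 1 results.

    traces: list of N traces, each a list of P samples along the fiber.
    """
    n = len(traces)
    if not 1 <= m <= n:
        raise ValueError("m must satisfy 1 <= m <= number of traces")
    width = len(traces[0])
    # Running sum over the current window of m traces.
    window = [sum(traces[j][p] for j in range(m)) for p in range(width)]
    averaged = [[v / m for v in window]]
    for i in range(1, n - m + 1):
        for p in range(width):
            window[p] += traces[i + m - 1][p] - traces[i - 1][p]
        averaged.append([v / m for v in window])
    return averaged

def moving_differential(averaged, m):
    """Subtract the moving reference R_r, r = int(i / 2m) * m + 1 (1-based)."""
    result = []
    for i in range(1, len(averaged) + 1):
        ref = (i // (2 * m)) * m + 1
        current, reference = averaged[i - 1], averaged[ref - 1]
        result.append([a - b for a, b in zip(current, reference)])
    return result

# Six toy traces with two sampling points each (illustrative values only).
TOY_TRACES = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0],
              [6.0, 7.0], [8.0, 9.0], [10.0, 11.0]]
TOY_AVG = moving_average(TOY_TRACES, 2)
TOY_DIFF = moving_differential(TOY_AVG, 2)
```

The reference index changes only every 2M traces, so slowly varying background is removed while the disturbance caused by a passing train remains in the differential traces.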

Vibration Signal Extraction
Defects in ballastless tracks are distinguished by their vibration characteristics, which are regular only under the action of wheel-rail cycles. Data recorded when no train passes has little analytical value and would cause data imbalance. Therefore, it is necessary to locate trains and only retain the track vibration signals when they pass by. In this article, we have identified the train movement trajectory based on a connected component-labeling algorithm [15,16] and realized the extraction of effective signals according to the obtained fitting lines of the head and tail, as shown in Figure 6. Similarly, the X-axis represents the spatial sampling points on the optical fiber, and the Y-axis represents 10,000 pulse periods, or Rayleigh backscattering traces. In this way, given the displacement and time, it is easy to identify the trajectory, including its head, tail and direction. This task consisted of the following steps:
Step 1: Extract the phase from the original signal to acquire a matrix of size 10,000 × 4095;
Step 2: Denoise the matrix. Perform the moving average and moving differential method (M = 10) along the columns and median filtering along the rows;
Step 3: Transform the matrix into a binary image with an adaptive threshold calculated using the Otsu method [17];
Step 4: Label the target pixels in the binary image so that each individual connected region forms an identified block.
Assuming that the binary image is "data.bmp", the pseudocode of its connected component labeling is shown in Algorithm 1.
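Algorithm 1 is not reproduced in this text. As a stand-in for Step 4, the following sketch labels connected regions of a binary image with a breadth-first flood fill. It assumes 4-connectivity and an image given as nested lists of 0/1 values; both are assumptions made for illustration, and the function name `label_components` is hypothetical.

```python
from collections import deque

def label_components(binary):
    """Label 4-connected regions of foreground pixels.

    binary: list of rows, each a list of 0/1 values.
    Returns (labels, count), where labels has the same shape as the
    input, background is 0 and regions are numbered 1..count.
    """
    if not binary or not binary[0]:
        return [], 0
    rows, cols = len(binary), len(binary[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not binary[r][c] or labels[r][c]:
                continue
            # Unlabeled foreground pixel: start a new region and flood it.
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary[ny][nx] and not labels[ny][nx]):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count

# Tiny illustrative image with two separate regions.
TOY_IMAGE = [
    [1, 1, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
]
TOY_LABELS, TOY_COUNT = label_components(TOY_IMAGE)
```

In the setting of this section, each labeled block would correspond to one candidate train trajectory in the space-time image, from which head and tail lines can then be fitted.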

Dataset Building
To ensure the consistency and representativeness of the results, the vibration differences caused by the forces and speeds of different trains were excluded. We selected three trains of the same type and with the same speed in the interval as the objects of analysis. Specifically, a CR400-AF train with a 16-car formation and a length of 414.15 m takes about 5 s to pass a given location.
The signals generated when track vibration acts on the optical fiber are unstable and time-variant. However, these signals can be considered stable and time-invariant over short time periods, since in that case, vibrations are emitted by instantaneous events. After differential processing (M = 1), the signals in our study were framed and labeled with a fixed length of time (1 s; frame shift: 50%) in order to obtain 7067 data samples. The labels came from the equipment records of the railway engineering, and the proportion of each type of defect is shown in Figure 7.
With the through-crack in the track slab at location K1148+303 taken as an example, the time-domain waveform of one frame is shown in Figure 8. The phase value of the signal has a high signal-to-noise ratio and can be analyzed quantitatively. Next, we studied only the phases of the signals without considering the light intensity.
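The framing step (1 s frames with a 50% frame shift) can be sketched as follows. The sampling rate is not stated in this text, so the value used here is purely hypothetical, as is the function name `frame_signal`.

```python
def frame_signal(samples, frame_len, shift):
    """Split a 1-D signal into fixed-length, possibly overlapping frames.

    frame_len and shift are given in samples; a trailing partial frame
    is dropped.
    """
    if frame_len <= 0 or shift <= 0:
        raise ValueError("frame_len and shift must be positive")
    last_start = len(samples) - frame_len
    return [samples[s:s + frame_len] for s in range(0, last_start + 1, shift)]

# Hypothetical sampling rate; the real value would come from Table 1.
FS = 1000                      # samples per second (assumption)
FRAME_LEN = FS                 # 1 s frame
SHIFT = FRAME_LEN // 2         # 50% frame shift
PASS_SIGNAL = [0.0] * (5 * FS) # about 5 s of signal for one train pass
FRAMES = frame_signal(PASS_SIGNAL, FRAME_LEN, SHIFT)
```

With these assumed values, a 5 s train pass at one sampling point yields nine overlapping 1 s frames.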

Feature Extraction
The purpose of feature extraction is to extract meaningful information from track vibration signals. Good features are usually invariant across similar samples, discriminative between different samples and robust to noise. Generally speaking, data and features determine the upper limit of a classifier, and algorithms are only used to approach this upper limit. Therefore, feature extraction is the key to accurately identifying the target. Defective and normal points show different vibration characteristics, which are reflected in certain parameters of a signal [18]. We were able to construct a multi-dimensional feature vector [19] based on these parameters to identify and classify track defects, as shown in Table 2.

XGBoost
XGBoost [20] is an implementation of gradient-boosted decision trees (GBDTs) that can quickly solve prediction and classification problems in data science. The main idea of XGBoost is to continually add regression trees and obtain the result by summing the value of each tree.
(1) Objective function
Suppose the dataset has n samples and m features, $D = \{(x_i, y_i)\}$ ($|D| = n$, $x_i \in \mathbb{R}^m$, $y_i \in \mathbb{R}$). The predictive function of XGBoost can be expressed as:

$$\hat{y}_i = \sum_{k=1}^{K} f_k(x_i), \quad f_k \in F \qquad (3)$$

where $F = \{ f(x) = w_{q(x)} \}$ ($q: \mathbb{R}^m \to T$, $w \in \mathbb{R}^T$) is the CART space, K is the number of trees, q represents the structure of each tree, mapping samples to the corresponding leaf nodes, and T is the number of leaf nodes in the tree. Each $f_k$ corresponds to a tree whose leaf node weight is w.
To learn the set of functions used in the model, it is necessary to minimize the objective function with a regularization term:

$$Obj = \sum_{i=1}^{n} l(y_i, \hat{y}_i) + \sum_{k=1}^{K} \Omega(f_k) \qquad (4)$$

In Equation (4), $\hat{y}_i$ represents the predicted value of the model and $y_i$ represents the class label of the ith sample. The first term is the loss function and the second term is the regularizer, which is used to control the complexity of the trees and avoid overfitting. From Equation (3), the prediction result of sample i after the tth iteration is:

$$\hat{y}_i^{(t)} = \hat{y}_i^{(t-1)} + f_t(x_i) \qquad (5)$$

A new function $f_t(x_i)$ that minimizes the following objective function is introduced:

$$Obj^{(t)} = \sum_{i=1}^{n} l\left(y_i, \hat{y}_i^{(t-1)} + f_t(x_i)\right) + \Omega(f_t) \qquad (6)$$

Then, a Taylor expansion is performed on the objective function in Equation (6); the first three terms are taken and the high-order infinitesimal terms are removed. We arrive at:

$$Obj^{(t)} \simeq \sum_{i=1}^{n} \left[ l\left(y_i, \hat{y}_i^{(t-1)}\right) + g_i f_t(x_i) + \frac{1}{2} h_i f_t^2(x_i) \right] + \Omega(f_t) \qquad (7)$$

where $g_i = \partial_{\hat{y}^{(t-1)}} l\left(y_i, \hat{y}^{(t-1)}\right)$ and $h_i = \partial^2_{\hat{y}^{(t-1)}} l\left(y_i, \hat{y}^{(t-1)}\right)$ are the first and second derivatives of the loss function, respectively. Since $l\left(y_i, \hat{y}_i^{(t-1)}\right)$ is a constant in Equation (7), removing it does not affect the optimization. By removing all of the constant terms, the simplified objective function can be obtained:

$$\widetilde{Obj}^{(t)} = \sum_{i=1}^{n} \left[ g_i f_t(x_i) + \frac{1}{2} h_i f_t^2(x_i) \right] + \Omega(f_t) \qquad (8)$$

Only $g_i$ and $h_i$ need to be considered in Equation (8), so the first and second derivatives of the loss function for each round can be calculated to obtain $f_t(x_i)$.
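To make the role of $g_i$ and $h_i$ concrete, the sketch below computes them for two common losses and evaluates the data term of Equation (8). The choice of losses (squared error and binary logistic loss) is an assumption for illustration; this text does not state which loss was used for the six-class problem, and the function names are hypothetical.

```python
import math

def squared_grad_hess(y, y_hat):
    """g and h for l = 0.5 * (y - y_hat)^2."""
    return y_hat - y, 1.0

def logistic_grad_hess(y, y_hat):
    """g and h for binary log loss, where y_hat is a raw score and y is 0 or 1."""
    p = 1.0 / (1.0 + math.exp(-y_hat))
    return p - y, p * (1.0 - p)

def second_order_term(grads, hessians, updates):
    """Data term of Equation (8): sum of g_i * f + 0.5 * h_i * f^2."""
    return sum(g * f + 0.5 * h * f * f
               for g, h, f in zip(grads, hessians, updates))

# One sample with label 3 and previous prediction 1; the new tree adds 2.
G_SQ, H_SQ = squared_grad_hess(3.0, 1.0)
CHANGE = second_order_term([G_SQ], [H_SQ], [2.0])
```

For the squared-error loss the second-order expansion is exact: the loss drops from 2 to 0 when the prediction moves from 1 to 3, and the computed term equals that change of -2. For other losses it is only an approximation around the previous prediction.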

Evaluation Metrics
The confusion matrix is a widely used tool for evaluating the performance of classification models. It does this through the calculation of performance metrics such as accuracy, precision, recall and F1-score. Figure 9 shows a binary-classification confusion matrix. The specific formulas and meanings thereof are shown in Table 3.

| Name | Formula | Meaning |
| --- | --- | --- |
| Accuracy | (TP + TN)/(TP + FP + TN + FN) | How often the classifier makes the correct prediction. |
| Precision | TP/(TP + FP) | The fraction of positive predictions that are correct. |
| Recall | TP/(TP + FN) | The fraction of actual positive observations that are predicted as positive. |
| F1-score | (2 × TP)/(2 × TP + FP + FN) | A number between 0 and 1; the harmonic mean of precision and recall. |
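The formulas in Table 3 translate directly into code. The following sketch computes the four metrics from the entries of a binary confusion matrix; the function name and the example counts are hypothetical, and returning 0.0 for an empty denominator is a convention chosen here.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall and F1-score from counts."""
    def ratio(num, den):
        # Convention (assumption): report 0.0 when the denominator is empty.
        return num / den if den else 0.0
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }

# Illustrative counts only, not results from this study.
EXAMPLE = classification_metrics(tp=8, fp=2, tn=85, fn=5)
```

For a multi-class problem, these quantities can be computed per class by treating that class as positive and all others as negative.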

Results and Discussion
The proposed method was compared to three popular classifiers in the field of pattern recognition, namely random forest (RF), gradient boosting decision tree (GBDT) and support vector machine (SVM), which were also trained with the samples in Figure 7. The results of each classifier are shown in Table 4. As can be seen, XGBoost has an average accuracy of 89.34%, which is better than the accuracies of 83.07% for RF, 82.97% for GBDT and 85.56% for SVM. Although XGBoost has a high classification accuracy, its accuracy of track defect identification is still less than 90%. One reason for this may be that feature engineering limits the improvement of accuracy to some extent. Moreover, there are currently no stable parameter-tuning methods for some algorithms, and the best parameters are often selected through exhaustive methods, which adds uncertainty to model optimization.

ResNet-Based Defect Recognition Approaches for Ballastless Tracks

Convolutional Neural Network
A convolutional neural network (CNN) is a network architecture for deep learning [21] that learns directly from data. The concept was first introduced by postdoctoral computer science researcher Yann LeCun. Notably, it has significant advantages in image segmentation, detection and classification. The overall CNN architecture includes an input layer, multiple alternating convolution and max-pooling layers, one fully connected layer and one classification layer, as shown in Figure 10. The convolutional layer is the most important component of a CNN, since it is where most of the processing takes place. It requires input data, a filter and a feature map, among other things. The pooling layer simplifies the output by performing non-linear downsampling and reducing the number of parameters that the network needs to learn. The fully connected layer conducts categorization based on the characteristics retrieved by the preceding layers and the filters applied to them. In addition to these three layers, there are two more important components: the dropout layer and the activation function.

Residual Neural Network
Previous CNN architectures were not able to scale to large numbers of layers, which limited their performance. When more layers were added, a degradation problem was exposed: as the network depth increased, the accuracy became saturated and then degraded rapidly.
To overcome this problem, a concept called residual learning building blocks was introduced for a deep residual learning framework [22]. In each block, the input is split into two paths that are summed element-wise at the output. Denoting the block input by x and the residual mapping learned by the stacked layers by F(x), the desired mapping H(x) = F(x) + x can be realized by feed-forward neural networks with shortcut connections. In terms of architecture, if any layer ends up damaging the performance of the model in a plain network, it can be bypassed through the skip connection. Residual neural networks (ResNets) are made by stacking these residual blocks together, as shown in Figure 11.
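The behavior of a residual block can be illustrated with a small one-dimensional example. The sketch below uses plain Python lists, two convolutions with zero padding and a ReLU activation; the layer arrangement and kernel values are assumptions chosen for illustration and do not reproduce the network in Figure 11.

```python
def conv1d_same(x, kernel):
    """1-D convolution (cross-correlation) with zero padding, same length."""
    k = len(kernel)
    if k % 2 == 0:
        raise ValueError("kernel length must be odd for symmetric padding")
    pad = k // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [sum(kernel[j] * padded[i + j] for j in range(k))
            for i in range(len(x))]

def relu(values):
    return [max(0.0, v) for v in values]

def residual_block(x, kernel1, kernel2):
    """Compute relu(F(x) + x), where F is conv -> relu -> conv."""
    residual = conv1d_same(relu(conv1d_same(x, kernel1)), kernel2)
    return relu([f + s for f, s in zip(residual, x)])

SIGNAL = [1.0, 2.0, 3.0]
ZERO = [0.0, 0.0, 0.0]       # kernels that make F(x) = 0
IDENTITY = [0.0, 1.0, 0.0]   # kernels that pass the input through
SKIPPED = residual_block(SIGNAL, ZERO, ZERO)
DOUBLED = residual_block(SIGNAL, IDENTITY, IDENTITY)
```

When the convolutional path contributes nothing (zero kernels), the block returns its non-negative input unchanged, which is the sense in which an unhelpful layer can be bypassed.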

Receptive Field
In a fully connected layer, each neuron is affected by the whole of the input. In contrast, in a convolutional layer, each neuron has a strictly limited "field of view" (receptive field, or RF) [23][24][25]. The input of the layer outside of this field of view cannot alter the neuron's activation. Within a convolutional layer, the spatial RF of a neuron is determined by its filter size: the larger the filter, the bigger the region of the input it can "see". The filter size determines the RF of a convolutional layer based on the layer input. In a CNN, $RF_n$, which is the receptive field size of a unit in layer n with respect to the network input, can be calculated with:

$$RF_n = RF_{n-1} + (k_n - 1) \times S_{n-1}, \qquad S_n = \prod_{i=1}^{n} s_i$$

where $s_n$ and $k_n$ are the stride and kernel size of layer n, respectively, $S_n$ is the cumulative stride from layer n to the input layer, and $RF_0 = S_0 = 1$. The receptive field size of a unit can be increased in a number of ways. One option is to stack more layers to make the network deeper, which will increase the receptive field size linearly, in theory, as each extra layer will increase the receptive field size by the kernel size. Subsampling, on the other hand, will increase the receptive field size multiplicatively.
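The recursion for the receptive field can be checked with a few lines of code. The sketch below assumes the common convention that the receptive field of the input is 1 and that each layer adds (kernel size - 1) times the product of the strides of all preceding layers; the layer configurations are hypothetical and are not those of the proposed network.

```python
def receptive_field(layers):
    """Receptive field and cumulative stride after each layer.

    layers: list of (kernel_size, stride) pairs, ordered from the input.
    Returns a list of (rf, cumulative_stride) tuples, one per layer.
    """
    rf, cumulative = 1, 1
    history = []
    for kernel_size, stride in layers:
        rf += (kernel_size - 1) * cumulative  # growth scaled by earlier strides
        cumulative *= stride
        history.append((rf, cumulative))
    return history

# Hypothetical stacks: stride 1 grows linearly, stride 2 grows much faster.
PLAIN = receptive_field([(3, 1), (3, 1), (3, 1)])
STRIDED = receptive_field([(3, 2), (3, 2), (3, 1)])
```

With stride 1 throughout, three kernels of size 3 give receptive fields of 3, 5 and 7, whereas inserting strides of 2 gives 3, 7 and 15, which illustrates the linear versus multiplicative growth described above.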
In summary, the feature extraction of time-domain signals is more suitable for architectures with larger convolution kernels and fewer network layers. We have proposed a time-domain residual network that expands the receptive field by stacking residual blocks with larger multi-layer convolution kernels near the input and extracts local and detailed features by stacking several residual blocks with smaller convolution kernels toward the back of the network. Our calculations have shown that a large three-layer convolution kernel can obtain a receptive field size of 449 × 1 and a feature map size of 313 × 1, as shown in Figure 12.

Results and Discussion
The dataset was randomly divided into training and testing sets in a 3:1 ratio. To verify the impact of the receptive field on network performance, we tested the classification accuracy obtained with convolution kernels of different sizes. Global average pooling was used in the network. Instead of directly connecting the output to the Softmax layer, we adopted an architecture that was similar to GoogLeNet. In addition, we added dropout and fully connected layers and used L2 regularization to further suppress overfitting in the fully connected layers. The basic network architecture is shown in Figure 13.
The results thereof are shown in Table 5. A larger receptive field could be obtained when a larger convolution kernel was set. With a small convolution kernel, capturing low-frequency features with a large time-domain span was difficult and easily affected by high-frequency noise. On the other hand, a larger convolution kernel was not always better. When the convolution kernel was too large, not only did the computational cost increase but the accuracy also decreased. Additionally, an overly large kernel increased the number of parameters and reduced the speed of the convergence of the network.

Conclusions
Under the impacts of long-term train load and the external environment, ballastless tracks have different degrees of damage and defectiveness, which poses a huge threat to the safety of railway operation. In this paper, according to the characteristics of the CRTS II slab, research on DAS-technology-based defect identification in ballastless tracks was carried out, and the following conclusions were obtained:
(1) DAS equipment was installed and debugged to collect the vibration signals of trains and events along the line. Each track vibration signal was extracted by identifying the running train trajectory, denoising, framing and labeling to build a defect dataset.
(2) Time-frequency-domain statistical features, the wavelet packet energy spectrum and the MFCCs of vibration signals were extracted to form a multi-dimensional vector. The XGBoost model was trained using the built dataset and reached an accuracy of 89.34%. This model's performance was better than that of other popular classifiers, such as RF, GBDT and SVM.
(3) A time-domain residual network that expands the receptive field was proposed, and the accuracy obtained with convolution kernels of different sizes was tested. The best accuracy of this method reached 94.82%, eliminating the manual process of feature extraction in traditional algorithms and realizing end-to-end information processing.

Figure 2. Existing optical fiber cable used for sensing.
Thus, N − M + 1 subsets of averaged traces can be obtained with the noise power in a measurement reduced by a factor of 1/M. We performed filtering of the vibration signals generated by the train running at different times and obtained different results, as shown in Figure 5. The Y-axis of the difference signal represents the amplitude, where the sampling points with larger values are the disturbance caused by the train passing.

Figure 5. Smoothing results at different times.

Figure 7. Proportions of the six sample types in the dataset.

Figure 12. The module for increasing the receptive field.

Table 1. The key parameters of the equipment.

Table 2. The extracted features for each frame.

Table 3. Standard performance metrics of the classifier.

Table 4. Performance of the four classifiers.

Table 5. Various kernel sizes and their impacts on classification.