Railway Track Monitoring Using Train Measurements: An Experimental Case Study

Abstract: This paper investigates the use of drive-by train measurements for railway track monitoring. An in-service Irish Rail train was instrumented with accelerometers and a global positioning system. The measurements were taken over two months and the train bogie accelerations from 60 passes on the Dublin-Belfast line were used for this study. A 6 km section of the line is the particular focus, where the maintenance measurements from a Track Recording Vehicle (TRV) were available. The Hilbert transform is used to obtain the instantaneous amplitudes of the acceleration signals. A new representation of the signal is proposed to show the signal energy level as a function of train location. It is shown that the forward speed of the train has a significant influence on the energy level of the signals. Therefore, a two-step speed correction is applied to the data. First, data from passes with forward speed below a certain limit are removed from the data set. Subsequently, a scaling factor is defined for the remaining signals and the energy levels of those signals are scaled using online speed measurements. The scaled amplitudes are compared with the TRV data. It is shown that the energy levels of the signals match the TRV measurements very well.


Introduction
Railways constitute a major portion of transport infrastructure in most countries. It is essential to employ good strategies for the maintenance of railway networks to avoid disruption to services and ensure the safety of the system. For example, the railway track condition needs to be regularly monitored to detect faults at an early stage before they become major issues. Several types of defect can occur on railway tracks: rail cracks, track settlement, landslides onto track, hanging sleepers, etc. [1]. Visual inspection or 'walking the track' is one of the most common methods of track inspection. However, it is expensive and subjective, and some faults can be missed. A track recording vehicle (TRV) provides a mechanised method of railway track inspection using optical and inertial sensors in a specialised vehicle. The inspection vehicle periodically travels over the rail network and measures several geometric parameters of the track, such as: longitudinal level of left and right rail, alignment of left and right rail, track gauge, cross level, twist, curvature and curve radius, gradient, and position using GPS [2]. However, these vehicles are expensive and they may not always be available for the whole network. In addition, they do not always travel at the same speed as normal train traffic, which, in some cases, disrupts regular services.
Drive-by monitoring is an inspection method in which the responses that are measured on instrumented trains/vehicles are employed for the condition monitoring of components of transport infrastructure.
[Table: drive-by track monitoring studies by category. Track profile: stiffness profile, elevation profile [11][17][18][19]. Others: track replacement, tamping, rail bump and rail surface irregularities [20][21][22][23].]
In the context of railway track components, Wei et al. [13] investigate the degradation of a railway crossing using real train axle box measurements over time. They show that uneven deformation between the wing rail and crossing nose and local irregularity in the longitudinal slope of the crossing nose can be identified. Oregui et al. [14] employ axle box acceleration measurements for the monitoring of bolt tightness of rail joints. Molodova et al. [15] use drive-by measurements from a train axle box for condition monitoring of insulated joints. The power spectrum of the acceleration signals is used as a damage indicator. Salvador et al. [16] present a surveying approach for the detection and classification of track parameters, such as welded joints, turnout frogs, and squats. Changes in track typologies can be detected using measurements from a train axle box.
Several researchers have proposed using inertial measurements to find the railway track stiffness profile. Quirke et al. [18] propose the use of drive-by measurements to detect railway track stiffness variation using an optimization algorithm. Although their method provides good accuracy, it is computationally expensive and it has not been confirmed by experimental data. Le Pen et al. [19] use on-train measurements to identify changes in track support stiffness. Real et al. [17] propose a method that uses a Fourier transform to find the rail profile from vertical accelerations that were measured on a passing train. OBrien et al. [11] employ an inverse technique to find the longitudinal track profile using inertial measurements from an in-service train. The track longitudinal profile that generated the vehicle model responses that best match the measured responses is found using the Cross Entropy optimization technique. Tsunashima et al. [24] develop a portable condition monitoring system for track, which is integrated into an in-service train. In this method, the rail irregularities can be estimated from the vertical and lateral accelerations of the car body. Kobayashi et al. [25] use a Kalman Filter to solve an inverse problem, showing the possibility of estimating the track irregularities from car-body motions alone. Paixão et al. [26] present an approach that uses acceleration measurements from smartphones inside in-service trains for the assessment of the structural performance and geometrical degradation of railway tracks. They obtain cross-correlation values of greater than 0.85 between the standard deviations of the longitudinal level and the vertical accelerations measured on board a passenger train on an 11 km stretch of railway.
Cantero and Basu [20] use wavelet transforms of acceleration responses to a passing train to identify any railway track irregularities. They use a simple model and do not consider track stiffness. Tsai et al. [22] propose the application of the Hilbert transform to the measured signals from the axle box of an operational train. They show that decomposing the signals using the Hilbert transform has the potential to find track irregularities. Recently, Lederman et al. [23] propose an energy-based approach for track change monitoring. They plot the average energy of the measured acceleration in the space domain in a contour plot. A feature detection method is proposed to find changes in the contour plot, which can be indicators of faults on the track. In another study, Lederman et al. [27] show that in some parts of the track, the train is more excited, which creates bumps in the measured acceleration signals. They suggest that changes in the sizes of the bumps in consecutive runs are indicative of changes to the track.
Although energy-based methods have shown good potential for railway track monitoring using multiple passes, there are several challenges that need to be overcome. The first one is the method of representing the average energy of signals. The current method that was proposed by Lederman et al. [23] estimates the energy by summing the amplitudes of the acceleration signal over a pre-specified window. This method might not be accurate if the window length is not well defined. Another important challenge is the influence of train forward speed on the amplitude of the measured signals. There is usually considerable speed variation in multiple train passes over the same route. This variation has considerable influence on the energy of the signals. For example, a higher forward speed might create a higher energy for the same section of track than a lower speed. Therefore, it is difficult to determine whether the change in the signal energy is due to speed variation or track change. These challenges in the energy-based methods are addressed in this paper.
In this paper, an algorithm is proposed for railway track fault detection using measurements on the bogie of an in-service train. A regular Irish Rail passenger train was instrumented with accelerometers and a Global Positioning System (GPS). The location of the train and its approximate forward speed were recorded using the GPS system. Six kilometres of the Dublin-Belfast railway line are inspected here using 60 train passes. The instantaneous amplitudes of the acceleration signals are extracted using the Hilbert transform. A novel process is proposed for finding the energy levels of the amplitudes in the space domain. A data cleaning process is performed to remove the passes with low energy due to low train forward speeds. The cleaned data are then scaled using a new scaling method that emphasizes the importance of the vehicle forward speed to the level of energy. The scaled energy signals show a high level of consistency, which is ideal for the frequent inspection of railway tracks. It is suggested that a consistently high level of scaled energy over multiple passes might indicate a possible defect in the track. The TRV data recorded after the train measurements are used to check the track condition at two areas with high energy levels. It is shown that high energy parts of the track match well with the faults identified in the TRV data.

The Instrumented Train
An Irish Rail Hyundai Rotem InterCity fleet car (Figure 1a) was instrumented with inertial sensors that were installed on the trailer (non-powered) bogie in December 2015. The data were measured on the leading car (22337) of Set 37, a five-car train set. The train was in operation on the Dublin-Belfast Enterprise service line and the measurement equipment was installed while the train was stabled for a routine examination. The sensors were installed on the trailer bogie to minimize the noise contamination from the power train. A tri-axial accelerometer (Disynet-DA3802-015g with a range of ±15 g) and tri-axial gyrometer (Crossbow VG400CC-200 with a range of ±200 °/s) (Figure 1b) were installed on the bogie as close as possible to its center of mass. The data were measured at a sampling frequency of 500 Hz. The train location was also recorded using a GPS antenna. The approximate forward speed of the train was obtained from the GPS data and referenced to its location. The GPS system recorded the train position at a frequency of 5 Hz. The sensors and GPS system were connected to an HBM Somat eDAQ-lite data logger with 32 GB storage capacity to collect and store the data. The data logger was installed at a location on the train that needed minimum cabling. Power was taken from a spare circuit breaker on the underside of the carriage, which meant that the system required no connection into the carriage, satisfying the requirements of Irish Rail. The data were collected on the Dublin-Belfast line from 13 January to 3 February 2016. In total, 57 return journeys were taken on the line during this time. There was insufficient bandwidth to allow for the transfer of the data over a GSM network. Instead, an operator used a secure Wi-Fi connection to download the data from the data logger once every 7-10 days. The Wi-Fi connection avoided the need for physical access to the data logger, which was located underneath the carriage.
Appl. Sci. 2019, 9, x FOR PEER REVIEW
Figure 2 shows the accelerations measured on the first five passes against train location. The train forward speed is assumed constant for each GPS sample and the position of the train has been calculated using linear interpolation of that data. The amplitude of the accelerations varies as the train moves along the rail. It can also be noticed that, with the exception of the second pass, the patterns are similar, but at different levels. For example, the other four signals contain a peak with high amplitude at a distance of around 4000 m. However, Pass 1 has a maximum amplitude at this location of 38.15 m/s², while Pass 3 has an amplitude of 59.43 m/s², which is approximately 1.5 times that of Pass 1. Pass 2, on the other hand, has low levels of amplitude at most of the locations.
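The position calculation described above can be sketched as follows. This is a minimal illustration only, assuming the GPS log provides timestamps and forward speeds at 5 Hz and the accelerometer is sampled at 500 Hz; the function name `positions_at_accel_samples` and the array layout are hypothetical and are not taken from the study.

```python
import numpy as np

def positions_at_accel_samples(gps_t, gps_speed, accel_t):
    """Distance travelled (m) at each accelerometer timestamp.

    gps_t     : GPS timestamps in seconds (5 Hz in the study)
    gps_speed : forward speed in m/s, held constant over each GPS interval
    accel_t   : accelerometer timestamps in seconds (500 Hz in the study)
    """
    gps_t = np.asarray(gps_t, dtype=float)
    gps_speed = np.asarray(gps_speed, dtype=float)
    # Distance covered in each GPS interval at constant speed.
    step = gps_speed[:-1] * np.diff(gps_t)
    gps_dist = np.concatenate(([0.0], np.cumsum(step)))
    # Linear interpolation of GPS distance onto the accelerometer time base.
    return np.interp(accel_t, gps_t, gps_dist)
```

With positions attached to every acceleration sample, signals from different passes can be plotted and compared against distance rather than time.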

Accelerations
The frequency spectra of the measured signals may provide insights into the data. Figure 3a shows the frequency content of Pass 3 obtained using a Fast Fourier Transform (FFT). It can be seen that there are two dominant frequency ranges in the signal: under 20 Hz and 60-90 Hz. As the signal is long, measured for about 6 km over a period of about 5 min, the FFT has the significant disadvantage of not providing any spatial information. Therefore, a Short Time FFT (STFFT) is employed to show the frequency components of the signal in both the frequency and time (space) domains. The STFFT spectrum in Figure 3b shows that the ranges mentioned above are dominant for most locations as the train moves forward. It can be concluded that the signal components under 100 Hz contribute most of the energy of the signal.
It has been stated in the literature [28][29][30][31] that typical train eigenfrequencies are in the range of 0.5-15 Hz. This suggests that the first frequency range mentioned above is predominantly related to vehicle dynamics. It has been shown in [32] that the dominant peaks in the second range are dependent on the vehicle forward velocity and rail surface irregularities. This means that the second frequency range corresponds to properties such as rail roughness profile and sleeper spacing frequency, which are of most interest in this paper.
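The FFT and short-time FFT analysis described above can be illustrated with the following sketch. It is not the processing used for Figure 3; the window length, hop size, Hann window, and function names are illustrative assumptions, and only the 500 Hz sampling frequency comes from the text.

```python
import numpy as np

FS = 500.0  # accelerometer sampling frequency in Hz, as stated in the text

def amplitude_spectrum(signal, fs=FS):
    """One-sided FFT amplitude spectrum of a real signal."""
    signal = np.asarray(signal, dtype=float)
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fs)
    amps = np.abs(np.fft.rfft(signal)) / signal.size
    return freqs, amps

def short_time_spectrum(signal, fs=FS, win_len=1024, hop=512):
    """Magnitude spectra of overlapping Hann-windowed frames.

    Returns (freqs, frame_start_times, magnitudes), where magnitudes has
    shape (n_frames, n_freqs). win_len and hop are assumed values.
    """
    signal = np.asarray(signal, dtype=float)
    if signal.size < win_len:
        raise ValueError("signal is shorter than one analysis window")
    window = np.hanning(win_len)
    starts = np.arange(0, signal.size - win_len + 1, hop)
    frames = np.stack([signal[s:s + win_len] * window for s in starts])
    mags = np.abs(np.fft.rfft(frames, axis=1))
    freqs = np.fft.rfftfreq(win_len, d=1.0 / fs)
    return freqs, starts / fs, mags
```

The frame start times can be mapped to train location through the interpolated positions, which gives the space-frequency view discussed in the text.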

Energy of Acceleration Signals
As suggested by Lederman et al. [23], the energy of vertical acceleration signals measured on a passing train contains useful information regarding the railway track condition. For example, if the train passes over a bump or a defect on the track, the acceleration signal measured on the train might contain high amplitudes that represent high levels of energy at that location. Lederman et al. [23] propose averaging the amplitudes of the signal in a moving window along the track. The signal energy in each window is then obtained by summing the amplitudes of the acceleration signal. Here, the signal amplitudes are obtained using the Hilbert transform. Hence, the energy is represented by the true local peaks of the signal instead of by averaging.

Hilbert Amplitude
The amplitude of the signals is extracted using the Hilbert transform. This represents the energy of the vertical acceleration signal by extracting its instantaneous amplitude. The Hilbert transform of $a_z(t)$ is [33]:
$$H[a_z(t)] = \frac{1}{\pi}\, P \int_{-\infty}^{\infty} \frac{a_z(\tau)}{t - \tau}\, d\tau$$
where $P$ denotes the Cauchy principal value of the singular integral. The analytic signal of $a_z(t)$, which is a complex signal, is defined by [33]:
$$\beta(t) = a_z(t) + j\, H[a_z(t)]$$
where $j$ is $\sqrt{-1}$. The polar form of $\beta(t)$ is:
$$\beta(t) = \mathrm{amp}_{inst}(t)\, e^{j\theta(t)}$$
where $\theta(t)$ is the instantaneous phase. The instantaneous amplitude $\mathrm{amp}_{inst}(t)$ is calculated as:
$$\mathrm{amp}_{inst}(t) = \sqrt{a_z(t)^2 + H[a_z(t)]^2}$$
The Hilbert amplitude of a vertical acceleration signal, $a_z$, as measured over a distance of 300 m from a sample of the data set, is shown in Figure 4. It shows how the Hilbert transform extracts the instantaneous amplitude $\mathrm{amp}_{inst}$ of the signal, which represents the signal energy.
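As an illustration of the instantaneous amplitude defined above, the following sketch computes the magnitude of the analytic signal with a standard FFT-based construction. It is a minimal example under the assumption of a uniformly sampled, finite-length record; the function name `hilbert_amplitude` is hypothetical and this is not necessarily how the amplitudes in Figure 4 were computed.

```python
import numpy as np

def hilbert_amplitude(a_z):
    """Instantaneous amplitude |a_z + j*H[a_z]| via the FFT-based analytic signal."""
    a_z = np.asarray(a_z, dtype=float)
    n = a_z.size
    spectrum = np.fft.fft(a_z)
    # Keep DC (and Nyquist for even n), double the positive frequencies,
    # and zero the negative frequencies: this yields the analytic signal.
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    analytic = np.fft.ifft(spectrum * weights)
    return np.abs(analytic)
```

For a slowly varying envelope multiplying a faster oscillation, the returned amplitude follows the envelope, which is the property exploited here to represent signal energy.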


Peak Based Decomposition
The instantaneous amplitude of a signal is sampled at the same rate as the original signal, which means that the two have the same length. When the dataset for the entire track length is being processed, it is necessary to reduce the size of the data to avoid time-consuming and computationally expensive signal processing. At the same time, the energy information in the signal needs to be retained. In this section, a simple process, called peak based decomposition (PBD), is proposed to represent the signal energy in a much more compact function. PBD employs a simple multi-step process. In each step, the local maxima of the Hilbert amplitudes are taken and stored as a new representation of the same signal. In this way, small peaks in the original amplitude signal are removed, but the main peaks are kept. The output of this process constitutes a shorter version of the original signal, but it contains the significant energy levels of the signal in the space domain. The output of the first step is called Peak Function 1 (PF1). If the process is repeated on PF1, then a new shorter version of the signal is obtained, which is called PF2. The process can be repeated several times until the desired combination of signal energy and length is obtained. Figure 5 shows how PBD works by approximating the original signal in each step and decomposing it to reduced sizes while the main amplitudes are kept.
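The repeated peak-picking idea behind PBD can be sketched as below. This is a simplified reading of the description above; the treatment of end points and of equal neighbouring values (both are discarded here) is an assumption, and the function names are hypothetical.

```python
def peak_function(values, positions):
    """One PBD step: keep only the strict interior local maxima of `values`."""
    keep = [i for i in range(1, len(values) - 1)
            if values[i - 1] < values[i] > values[i + 1]]
    return [values[i] for i in keep], [positions[i] for i in keep]

def peak_based_decomposition(amplitude, positions, steps):
    """Apply `steps` successive peak-picking passes.

    Returns a list [(PF1 values, PF1 positions), (PF2 ...), ...], so each
    level keeps the track location of every retained peak.
    """
    levels = []
    vals, pos = list(amplitude), list(positions)
    for _ in range(steps):
        vals, pos = peak_function(vals, pos)
        levels.append((vals, pos))
    return levels
```

For example, the amplitudes `[0, 1, 0, 3, 0, 2, 0, 5, 0, 1, 0, 4, 0]` at positions 0 to 12 reduce to six peaks in PF1 and to the two peaks 3 and 5 (at positions 3 and 7) in PF2. Carrying the positions through each step is what allows the compact peak functions to be plotted in the space domain.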

The PBD process is applied to a 400 m sample of measurement in Figure 6. It can be observed how PF4 and PF5 represent the energy level of the raw signal. The average signal energy that was proposed by Lederman et al. [23], using a 25 m window, is also shown. As this algorithm averages the energy of the signal in a pre-specified window, it might not accurately find the locations in which the signal contains high levels of energy. In addition, if the width of the window is not properly chosen, the algorithm will split the energy of part of the signal between two windows and reduce the real amplitude. This might cause some parts of the signal with high energy levels not to be discovered. In contrast, there is no need to select a window width with PBD.
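The window-splitting effect mentioned above can be reproduced with a small sketch. This is a simplified stand-in for a windowed average, not the implementation of Lederman et al. [23]; the fixed, non-overlapping windows, the use of absolute acceleration, and the function name are assumptions made for illustration.

```python
import numpy as np

def windowed_mean_amplitude(position_m, accel, window_m=25.0):
    """Mean absolute acceleration in fixed, non-overlapping distance windows."""
    position_m = np.asarray(position_m, dtype=float)
    amp = np.abs(np.asarray(accel, dtype=float))
    # Index of the window each sample falls into, counted from the first sample.
    idx = np.floor((position_m - position_m[0]) / window_m).astype(int)
    sums = np.bincount(idx, weights=amp)
    counts = np.bincount(idx)
    centers = position_m[0] + (np.arange(sums.size) + 0.5) * window_m
    return centers, sums / np.maximum(counts, 1)
```

If a 10 m burst of amplitude 4 straddles the boundary between two 25 m windows, each window reports a mean of only 0.8, so both the height and the location of the burst are blurred. This is the behaviour that motivates a window-free representation.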
Figure 6. Comparison of the raw acceleration, Peak 4, Peak 5, and average energy.
Accurately finding the peaks is important when multiple train runs are studied, where the alignment of multiple measurements in the space domain is required. The proposed PBD process produces a peak at the right location with good accuracy. This allows the peaks from several runs to be aligned. However, several factors affect the accuracy of peak locations, e.g., the performance of the GNSS under different train forward speeds. These factors may cause some mismatch between the responses in different passes.

Forward Speed
The forward speed of the train has a direct influence on the energy of the signal. For example, if the train passes over a defect at low speed, it might not create enough energy for that defect to be detected. As a result, some passes may show high energy levels at a specific location while others show low amplitudes. Figure 7 shows the forward speeds for the first five passes from the dataset. It can be seen that there is considerable speed variation, which is the result of several factors, such as driver behavior, temporary speed restrictions, earlier delays, or weather. The second pass has the lowest speed through most of the sample zone. Figure 2 shows that, in the second pass, there is a low amplitude between 6000 and 7000 m relative to the other passes in this zone. This amplitude level is not useful in this study. Therefore, a cleaning process was employed to remove the low-speed signals from the dataset. In this work, the forward speed of the train at each kilometer point was monitored, and segments of data for which the minimum speed was less than 50 km/h were removed.
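The cleaning step above can be sketched as a simple mask over the samples of one pass. The text does not specify exactly how segments are delimited, so the 1 km bins measured from the start of the section are an assumption, as is the function name; only the 50 km/h threshold comes from the text.

```python
import numpy as np

def keep_mask(position_m, speed_kmh, segment_m=1000.0, min_speed_kmh=50.0):
    """True for samples in segments whose minimum speed meets the threshold."""
    position_m = np.asarray(position_m, dtype=float)
    speed_kmh = np.asarray(speed_kmh, dtype=float)
    # Assign each sample to a fixed-length distance segment (assumed 1 km).
    segment_id = np.floor(position_m / segment_m).astype(int)
    keep = np.ones(position_m.size, dtype=bool)
    for seg in np.unique(segment_id):
        in_seg = segment_id == seg
        # Drop the whole segment if the train was ever too slow within it.
        if speed_kmh[in_seg].min() < min_speed_kmh:
            keep[in_seg] = False
    return keep
```

Applying the mask to the amplitude and position arrays of a pass leaves only the segments travelled at or above the speed limit, so low-energy data caused purely by slow running do not enter the comparison between passes.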


Energy Scaling Using Forward Speed
Even when data with speed less than 50 km/h are removed from the dataset, speed still has a strong influence on signal energy. In some cases, this variability might cause fluctuations in calculated energy and it might de-emphasize the features caused by track faults. For example, the acceleration signals for two consecutive passes, Passes 19 and 20, are shown in Figure 8a for the zone from 3000 to 5000 m. Figure 8b shows the train forward speeds corresponding to these two passes. Figure 8c shows the PF5 of these two signals. The forward train speed of Pass 19 is about 30% less than that of Pass 20 in the range from 3000 to 3200 m, but it is increasing and catches up with the speed of Pass 20 at about 4000 m. There are two distance ranges in the signals where high energy levels are present: 3000 to 3100 m and 3800 to 4000 m. Figure 8c shows that there is a considerable difference in the energy of the signals in the first range, while both of the signals give similar levels of energy in the second one. This shows that a higher forward speed creates a higher energy at the same section of track, with the lower speed of Pass 19 de-emphasizing the peak around 3000 to 3100 m. The influence of forward speed on the signal amplitude is clearly not desirable for drive-by railway track monitoring.
A scaling method is proposed in order to minimize the effect of forward speed. Several scaling methods (linear, quadratic, etc.) were tested and showed a nonlinear relationship between the energy of the vertical acceleration responses and the train forward (horizontal) speed. However, a linear scaling method is adopted, as it gives reasonable performance while being simple and easy to apply. To obtain consistent amplitudes for different passes at different speeds, a scaling factor based on the train forward speed is applied:

$$\mathrm{amp}_{sc}(x) = \frac{v_{av}}{v(x)}\,\mathrm{amp}(x) \qquad (5)$$

where $\mathrm{amp}_{sc}(x)$ is the scaled amplitude at location $x$, $v_{av}$ is an arbitrary average speed used as a baseline for all passes (taken to be 80 km/h here), $v(x)$ is the speed of the train at location $x$, and $\mathrm{amp}(x)$ is the amplitude obtained from the PD algorithm. Figure 9 shows the scaled PF4 for both Passes 19 and 20 using $v_{av}$ = 80 km/h. Except for the first peak around 3020 m, the peaks are more consistent in the scaled PF4 for these passes.
There are several factors (e.g., change in the total mass of the train, temperature change, etc.) that impact the energy levels. However, this study focuses on the impact of vehicle forward speed, and the other factors were not measured during the experiment.
Figure 10 gives an overview of the proposed fault detection algorithm. The raw acceleration data are first processed using the Hilbert Huang Transform (HHT) method, as explained in Section 3.1, which results in Hilbert amplitudes. Subsequently, the Hilbert amplitudes are decomposed using the PBD method explained in Section 3.2. The selected PF is then scaled using the scaling process that is explained in Section 4.2. Finally, the scaled PF is used to detect faults on the railway track. The procedure is employed in an experimental case study in the following sections.
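The linear speed scaling can be illustrated with a short Python sketch. It assumes that the scaled amplitude is the measured amplitude multiplied by the ratio of the baseline speed (80 km/h) to the local train speed. The function name `scale_amplitude`, the handling of zero speed, and the numerical values are hypothetical and are not taken from the study.

```python
import numpy as np

def scale_amplitude(amp, speed_kmh, v_av=80.0):
    """Linearly rescale amplitudes to a baseline speed: amp * v_av / v(x)."""
    amp = np.asarray(amp, dtype=float)
    speed = np.asarray(speed_kmh, dtype=float)
    if amp.shape != speed.shape:
        raise ValueError("amp and speed_kmh must have the same shape")
    # Assumption: samples with non-positive speed cannot be scaled and are set to NaN.
    scaled = np.full_like(amp, np.nan)
    valid = speed > 0
    scaled[valid] = amp[valid] * v_av / speed[valid]
    return scaled

# Made-up amplitudes and speeds at three locations along the track.
amp = np.array([0.6, 1.0, 1.2])
speed = np.array([60.0, 80.0, 100.0])
scaled = scale_amplitude(amp, speed)
print(scaled)
```

In this sketch, amplitudes recorded below the baseline speed are scaled up and those recorded above it are scaled down, so passes at different speeds become more directly comparable.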

Data Scaling
The TRV data show several possible track defects in a zone between 2 and 8 km from Connolly station in Dublin. The bogie vertical accelerations measured over this 6 km of the Dublin-Belfast railway line form the dataset considered for fault detection. Figure 11a shows the 60 acceleration signals measured at the bogie. There are different energy levels in each signal, which makes it hard to infer the condition of the track. Figure 11b shows the PF5 of the Hilbert amplitudes of the signals that were calculated using Equation (5). Although it constitutes an improvement, high variations remain between passes, e.g., between 6000 and 7000 m.

Figure 11. Sixty passes of data: (a) Acceleration signals, and (b) PF5 of Hilbert amplitudes.
Figure 12 shows the forward speeds for all passes, which vary a great deal. As proposed in Section 4.1, data are removed for any 1 km length in which the minimum forward speed of the train is less than 50 km/h. The 1 km segments for which data have been removed are shown in bold red in the figure. It can be seen that there are many passes between 5000 and 7000 m that need to be removed from the data set. This could explain the high amplitude variations in this zone in Figure 11b.
Figure 12. The train forward speeds for all 60 passes (red indicates 1 km segments for which data have been removed).
Figure 13a shows the cleaned version of Figure 11b, where segments of passes with speed below the limit have been removed. This figure clearly shows greater consistency in the amplitudes as compared to Figure 11b, particularly in the peaks. This is further improved with the energy scaling of Equation (5). The scaled amplitudes are plotted in Figure 13b and they can be seen to be more consistent than those of Figure 13a. Table 2 gives the average standard deviations of the amplitudes in Figure 13a,b for different ranges. It shows that the average standard deviation of the data in all ranges has decreased after rescaling, which is a sign of greater consistency.
Figure 13. The PF5 (a) after cleaning to remove segments below the speed limit, (b) after rescaling the cleaned signals.
Figures 11 and 13 show how the raw acceleration signals are processed to create a robust and reliable representation of the signal energy along the track. The mean value of the amplitudes in Figure 13b can be treated as a condition indicator for the track that can be used for long-term monitoring. A high energy level can be taken to indicate a possible defect in the track. Therefore, monitoring the track condition using the proposed indicator and keeping its energy level to a minimum might be good practice for railway track maintenance.
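The cleaning step and the mean-based condition indicator can be sketched in Python as follows. This is only an illustration under stated assumptions: the 1 km segments are taken as fixed bins measured from the first sample, removed data are marked as NaN, and the function names `mask_slow_segments` and `condition_indicator` and all numerical values are hypothetical rather than taken from the study.

```python
import numpy as np

def mask_slow_segments(x_m, speed_kmh, amp, seg_len_m=1000.0, v_min=50.0):
    """Copy amp and set NaN in every segment whose minimum speed is below v_min."""
    x = np.asarray(x_m, dtype=float)
    speed = np.asarray(speed_kmh, dtype=float)
    out = np.array(amp, dtype=float)  # np.array makes a copy of the input
    # Assumption: segments are fixed-length bins starting at the first sample.
    seg_id = np.floor((x - x[0]) / seg_len_m).astype(int)
    for s in np.unique(seg_id):
        in_seg = seg_id == s
        if speed[in_seg].min() < v_min:
            out[in_seg] = np.nan
    return out

def condition_indicator(amps):
    """Mean and standard deviation across passes at each location, ignoring NaN."""
    amps = np.asarray(amps, dtype=float)
    return np.nanmean(amps, axis=0), np.nanstd(amps, axis=0)

# Two made-up passes sampled every 250 m over 2 km.
x = np.arange(0.0, 2000.0, 250.0)
speed_a = np.full(x.size, 80.0)
speed_b = np.array([80.0, 80.0, 45.0, 80.0, 80.0, 80.0, 80.0, 80.0])
amp_a = np.full(x.size, 1.0)
amp_b = np.full(x.size, 2.0)

masked_a = mask_slow_segments(x, speed_a, amp_a)
masked_b = mask_slow_segments(x, speed_b, amp_b)  # first 1 km segment is removed
mean_amp, std_amp = condition_indicator([masked_a, masked_b])
print(mean_amp)
print(std_amp)
```

Averaging `std_amp` over a distance range gives a single consistency figure for that range, which is one plausible way to compute values comparable to those reported in Table 2.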

Comparison of Results with Track Recording Vehicle Data
The locations in Figure 13 where the amplitudes exhibit high energy are inspected in this section for the purpose of fault detection. There are four main zones in Figure 13b that show high energy levels: around 3100 m, 3900 m, 6000-7000 m, and 7550 m. Irish Rail's Track Recording Vehicle (TRV) monitored the track shortly after the acceleration measurements were taken. The TRV data are used here as a baseline to check the effectiveness of the proposed damage indicator.
The scaled amplitudes around the second peak close to 3900 m are shown in Figure 14 with the corresponding TRV data. The TRV data include eight track properties, named here using the colloquial [...]
Figure 14. The Track Recording Vehicle (TRV) data and scaled amplitudes.
The TRV data clearly show some defects between 3800 and 4000 m, corresponding to the dominant peak in the scaled Hilbert amplitudes. The cross level parameter, 'XL', in the TRV data shows that the intervention limit has been exceeded at this location. Cross level is a measure of the difference in height of the running tables of both rails. It can be observed that this is caused by two changes in the longitudinal level of the up rail, which also manifest themselves as a 3 m twist. This defect in the longitudinal level would have caused significant vertical motion of the bogie frame, which was measured by the sensor attached to the bogie. The resulting scaled amplitudes show a strong correlation between the Hilbert amplitude and the recorded TRV damage. Figure 15 shows the scaled amplitudes and TRV data between 5800 and 7200 m, where there are also high energy levels. Again, the deviations in the vertical alignment of the up rail produce defects, which are also measured as defects in cross level and 3 m twist. The good match between the energy indicator and the TRV data confirms that drive-by monitoring using an in-service vehicle has good potential for railway track monitoring.
Figures 14 and 15 are two samples that show the effectiveness of the proposed method. However, further work is needed to improve the accuracy of the locations and to identify the types of fault detected by this method. For example, the accuracy of the defect locations is a function of the performance of the GNSS under different train forward speeds. This could create some mismatch between different passes. There is also some error in the PBD process in terms of the accuracy of the peak locations. However, there is reasonable consistency over several passes, which indicates acceptable performance.

Conclusions
In this paper, the idea of drive-by railway track monitoring is investigated using field measurements. An operational train from the Irish Rail fleet is instrumented with accelerometers installed on the train bogie. The train location and forward velocity are recorded using a GPS system. A data set including acceleration and train velocity measurements from 60 passes on the Dublin-Belfast line is studied. The energy of the signals is extracted from the amplitudes of the accelerations using a Hilbert transform and considered as a track condition indicator. It is shown that, when the train has a forward speed less than 50 km/h, the energy indicator is no longer reliable. The data set is cleaned by removing these low-speed passes for each 1 km data segment, and the remaining passes are scaled using the train forward speed. It is demonstrated that the scaled amplitudes provide good repeatability for multiple measurements and show considerable potential for drive-by track monitoring. The scaled amplitudes for a 6 km dataset are compared with TRV measurements. This confirms that the locations with high energy levels correspond to parts of the track that exhibit faults in the TRV data.