Suitability Evaluation of a Train’s Scheduled Section Travel Time

Abstract: Two methods are used to evaluate the suitability of a train’s scheduled section travel time (TSSTT): theoretical modeling and data analysis. The first is suitable for newly constructed railway projects; the second can reveal the reliability of the train section running time (TSRT) under a given TSSTT when train operation data are available. A suitability evaluation method for TSSTT is proposed that calculates the probability that a train completes its run within a time window centered on the TSSTT given in advance. The TSRTs between two adjacent stations are classified into four groups based on whether the train dwells at the two end stations of the railway section, and are then subdivided into subgroups by the TSSTT given. The kurtosis of each TSRT subgroup is larger than 3, so the Weibull distribution is selected to fit the TSRT distribution of each subgroup because of its good fit as measured by the root measurement of the least square (SRLSM). A busy high-speed railway line in the Wuhan area of China is used to validate the presented approach. Each railway section has its own suitable TSSTT, under which the TSRT may achieve 96% reliability of arriving within 2.5 minutes centered on the TSSTT, whereas under an unsuitable TSSTT the reliability may not reach 10%.


Introduction
The working diagram of a railway system specifies the wagon routing, start time, and time consumption of each train at each component of the railway network, and it also gives each train's scheduled section travel time (TSSTT) between two adjacent stations and its dwelling time at each station. The actual train section running time (TSRT) and the dwelling time at stations are affected by various factors, such as weather, power supply, train passenger capacity, passenger organization mode, train control system, and even the working diagram itself. Revealing the time deviation between TSRT and TSSTT on the railway section between two adjacent stations is the basis for compiling a highly reliable working diagram for the trains. Many factors affect TSSTT. First, the technical conditions of the railway line regarding safety constraints, such as the train's section speed limitations, should be met [1][2][3][4][5][6][7][8][9]. Additionally, the requirements of network operation and the demands of various passengers should be satisfied [10][11][12][13][14][15]. Lastly, energy conservation and environmental protection should be considered [16][17][18][19][20][21]. Hence, TSSTT should not only meet the demands of compiling a working diagram at the network level under different passenger needs, but should also account for energy conservation and environmental protection while ensuring that the TSRT on the same railway section falls within the neighborhood of the TSSTT. The railway dispatching command system and the operation-monitoring system not only provide the scheduled arrival and departure times of a train at each working station but also precisely record its actual arrival and departure times at each working station. Hence, there are enough data to judge the suitability of TSSTT and any deviation between TSRT and TSSTT caused by different factors, such as the weather, power, carrying capacity, and train-operation control.
Sustainability 2020, 12, 2399
At present, owing to the lack of appropriate analysis tools, the large volume of recorded data cannot be converted into valuable information for management purposes. As a result, the relevant departments of the railway operating company lack scientific and quantitative information for compiling train working diagrams and instead rely on traction-based simulation or subjective experience.
TSRT based on traction calculation is determined by equations of motion combined with the relationship between the train's tractive and braking forces, the sum of mechanical and aerodynamic resistances, and the force caused by the track gradient, which together reflect the train's traction, idling, and braking process in the section [16][17][18][19]. Hence, noise factors on the railway line, such as weather and power supply [9], receive little consideration in the calculation process.
Multiple noise factors can cause a train's actual running time to deviate from the planned travel time and disrupt the schedule, even causing delay propagation that affects passenger service. Suitable approaches for the detailed statistical analysis of train delays and the validation of running and dwell times based on standard track occupation and release data, including goodness-of-fit tests and estimates for the distributions and their parameters, have been provided in some articles [22,23]. Existing studies assume that train delays follow a negative exponential distribution [5,11,12,24,25]; that is, the frequency distribution of trains running behind schedule is negative exponential. The total number of trains, the number of late train trips, and the total time of trains behind schedule can be surveyed at a station or railway section. Hence, the average headway time of trains behind schedule is calculated as the total time of trains behind schedule divided by the number of late train trips. The buffer time of a train can then be calculated from the negative exponential distribution [11,23,26] or a weighted exponential distribution [27]. This is a conservative method with some deficits, such as the following. First, the method of calculating buffer time depends only on the headway time of trains behind schedule and does not consider the proportion of trains behind schedule. When the probability of late trains is very small, the method may waste railway line capacity by inserting buffer time into a schedule with a very small chance of delays.
Second, train delays are caused by weather, electricity supply, train capacity, passenger organization, train-operation control, the train working diagram, and other reasons, but the negative exponential distribution of delayed trains does not reveal the relationship between late arrivals and the causes of delays, and the parameters of the negative exponential distribution must be re-estimated after the trains' working diagram is adjusted. These parameters are not stable, which causes uncertainty in the trains' working diagram.
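The buffer-time calculation criticized above can be sketched as follows. This is a minimal illustration, not the exact procedure of the cited studies: it assumes the buffer is chosen as a quantile of an exponential delay distribution, and the function names, the coverage level, and the survey figures are all hypothetical. The last function illustrates the first deficit: the buffer itself never uses the proportion of late trains.

```python
import math


def mean_delay(total_delay_min: float, late_trips: int) -> float:
    """Average delay per late train: total delay time / number of late trips."""
    if late_trips <= 0:
        raise ValueError("late_trips must be positive")
    return total_delay_min / late_trips


def exponential_buffer(mean_delay_min: float, coverage: float) -> float:
    """Buffer b with P(delay <= b) = coverage for an exponential delay.

    Inverts the exponential CDF 1 - exp(-b / mean): b = -mean * ln(1 - coverage).
    """
    if not 0.0 < coverage < 1.0:
        raise ValueError("coverage must lie strictly between 0 and 1")
    return -mean_delay_min * math.log(1.0 - coverage)


def share_of_all_trains_covered(buffer_min: float, mean_delay_min: float,
                                late_share: float) -> float:
    """Share of all trains absorbed by the buffer, counting punctual trains too."""
    covered_late = 1.0 - math.exp(-buffer_min / mean_delay_min)
    return (1.0 - late_share) + late_share * covered_late


# Hypothetical survey: 40 late trips with 120 minutes of delay in total.
m = mean_delay(120.0, 40)
b = exponential_buffer(m, 0.95)   # buffer covering 95% of the late trains
print(round(m, 2), round(b, 2))
# If only 5% of trains are late, the same buffer covers far more than 95% of all trains.
print(round(share_of_all_trains_covered(b, m, 0.05), 4))
```

Under these assumed numbers the buffer is sized only from the late trains, so a line with very few late trains would still reserve the same capacity.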
There have also been many efforts to obtain the distribution of TSRTs in order to incorporate knock-on delays into models of delay propagation in networks [27][28][29][30][31]. Many data sources, such as train describer, track occupation, and clearance records, are used for this purpose. A group of candidate distributions, such as the normal, Weibull, gamma, and beta distributions, are considered to fit the empirical data. These efforts do not distinguish between systematic and stochastic railway factors in train delays, so they cannot reveal the relationship between factors and train delays. In fact, delays that occur on upstream railway sections and stations may not be relevant to the train running time on downstream railway sections and can be excluded in data preprocessing. Currently, research follows two directions: many studies focus on how delays propagate in heavily loaded railway systems [27][28][29][30][31], but few pay attention to how to shorten the headway time between successive trains on a railway line or at a route-conflict junction while ensuring the reliability of the trains' working diagram, with a relatively high level of punctuality and some capacity to recover from train delays.
In this paper, a railway section refers to the railway line section between two adjacent stations, each of which might be a working station for some trains; the section might include more than one block section or other technical sections. TSRT is calculated as the train's arrival time at the destination station minus its departure time at the starting station, so TSRT excludes delays propagated from upstream railway sections and the starting station while taking into account effects caused by weather, electricity supply, train capacity, passenger organization, train-operation control, the train working diagram, and other reasons. TSRTs are classified into four groups based on whether the train dwells at the two end stations of the railway section, and they are then subdivided into subgroups by the TSSTT given before the train departs. The kurtosis of each TSRT subgroup is larger than 3, so, rather than the log-normal, normal, gamma, or beta distribution, the Weibull distribution is selected to fit the TSRT distribution because of its good fit based on the root measurement of the least square. The new approach obtains the conditional distributions of TSRT given TSSTT, which makes the reliability calculation of TSSTT simple and effective. On the basis of the distribution of TSRT under different TSSTTs, this paper studies the suitable TSSTT for each railway section, considering the possibility of delays and accommodating the robustness of train travel times. The parameters of the theoretical distribution of TSRT are stable and do not depend on the train's working diagram.
The remainder of this paper is organized as follows. The suitable travel time and the criteria for measuring the reliability of train operations are given in Section 2. Section 3 describes the data. The selection of fitting functions and the distribution-fitting process of TSRT are given in Section 4. The appropriate TSSTT for each railway section of the studied railway system is given in Section 5. The paper concludes in Section 6 with problems requiring further study.

Evaluation Criterion
Assume that the analyzed railway system is represented by a set of stations $N$. The directional railway section between any two stations in $N$ is denoted by $a$, and the set of railway sections is denoted by $A$. The set of TSSTTs of all trains in section $a$ is denoted by $T^a$, and an arbitrary TSSTT $t_i^a$ belongs to $T^a$ (i.e., $t_i^a \in T^a$), where $i$ indexes the $i$-th scheduled section travel time on section $a$. In railway section $a$, the TSRT $\tilde{t}_i^a$ is a random variable whose distribution is conditional on the TSSTT $t_i^a$. Its probability density function is assumed to be $\rho(\tilde{t}_i^a \mid t_i^a)$. A train runs continuously through the railway section and may skip some stations. According to whether a train dwells or not, the natural railway section between two adjacent stations can be divided into four types, written in the notation (*,*). Each asterisk takes the value 1 or 2, where 1 means that the train passes the track section of the relevant station without stopping, and 2 means that the train stops at the station. The four types of railway sections are (2,2), (1,2), (2,1), and (1,1).
TSRTs are classified into four groups based on the train's dwelling type at the two end stations of the railway section, and they are then subdivided into subgroups by the TSSTT given to the train driver. Although trains can be further distinguished by type, TSRTs are classified only by the two factors mentioned above for concision in this paper. The kurtosis of each TSRT subgroup is larger than 3, so, rather than the log-normal or normal distribution, the Weibull distribution is selected to fit the distribution of each TSRT subgroup because of its good fit based on the root measurement of the least square (SRLSM).
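The two-level classification described above can be sketched as a grouping by section, dwell type, and TSSTT. The record layout and field names below are assumptions made for illustration; they are not the format of the paper's data.

```python
from collections import defaultdict


def dwell_type(stops_at_origin: bool, stops_at_destination: bool) -> tuple:
    """Section type (*,*): 2 means the train stops at that end station, 1 means it passes."""
    return (2 if stops_at_origin else 1, 2 if stops_at_destination else 1)


def group_tsrt(records):
    """Group TSRT values by (section, dwell type, TSSTT)."""
    groups = defaultdict(list)
    for rec in records:
        key = (
            rec["section"],
            dwell_type(rec["stops_origin"], rec["stops_destination"]),
            rec["tsstt"],
        )
        groups[key].append(rec["tsrt"])
    return dict(groups)


# Hypothetical records; all times are in minutes.
records = [
    {"section": "a", "stops_origin": True, "stops_destination": True, "tsstt": 16, "tsrt": 16.5},
    {"section": "a", "stops_origin": True, "stops_destination": True, "tsstt": 16, "tsrt": 17.0},
    {"section": "a", "stops_origin": False, "stops_destination": True, "tsstt": 14, "tsrt": 13.8},
]
subgroups = group_tsrt(records)
for key, values in subgroups.items():
    print(key, values)
```

Each resulting subgroup is the sample to which one conditional distribution of TSRT given TSSTT would be fitted.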

a. Suitability measurement
Assume that $\underline{c}$ and $\overline{c}$ are two positive values. The probability that the TSRT falls within the interval $[t_i^a - \underline{c},\, t_i^a + \overline{c}]$ centered on the TSSTT $t_i^a$ can be used to measure the suitability of the TSSTT $t_i^a$. If the probability is greater than a given threshold $\beta$, the TSSTT $t_i^a$ is appropriate.

b. Reliability measurement
Similarly, the probability $\int_{t_i^a - \underline{c}}^{t_i^a + \overline{c}} \rho(t \mid t_i^a)\,dt$ that the TSRT $\tilde{t}_i^a$ lies within the time interval $[t_i^a - \underline{c},\, t_i^a + \overline{c}]$ centered on the TSSTT $t_i^a$ can also be used to measure the reliability of the train operation. There are three purposes for studying the reliability of train operation. The first is to judge the reliability of the existing train working diagram; the second is to improve the reliability of the existing train working diagram by inserting buffer time or dwell time supplements; and the third is to measure the ability of the control system to realize the schedule in different running environments. In general, a train's working diagram requires a certain degree of reliability, which may be assumed to be a given threshold $\alpha$, $0 < \alpha < 1$. It is often required to solve for the window bounds $\underline{c}$ and $\overline{c}$ in Equation (1) to determine the buffer time or the supplements between trains in succession.
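The window probability and the search for a window that reaches a reliability threshold can be sketched as follows. This is a sketch under stated assumptions: a three-parameter Weibull distribution (shape, scale, location) stands in for the fitted density, the window is taken as symmetric, and the parameter values are hypothetical rather than taken from the paper's tables.

```python
import math


def weibull_cdf(x: float, shape: float, scale: float, loc: float = 0.0) -> float:
    """CDF of a Weibull distribution shifted by loc: 1 - exp(-((x - loc) / scale) ** shape)."""
    if x <= loc:
        return 0.0
    return 1.0 - math.exp(-(((x - loc) / scale) ** shape))


def window_probability(tsstt: float, c_low: float, c_high: float,
                       shape: float, scale: float, loc: float = 0.0) -> float:
    """P(tsstt - c_low <= TSRT <= tsstt + c_high), the integral of the density over the window."""
    upper = weibull_cdf(tsstt + c_high, shape, scale, loc)
    lower = weibull_cdf(tsstt - c_low, shape, scale, loc)
    return upper - lower


def smallest_half_width(tsstt: float, alpha: float, shape: float, scale: float,
                        loc: float = 0.0, c_max: float = 60.0, tol: float = 1e-6) -> float:
    """Smallest symmetric half-width c whose window probability reaches alpha.

    The probability never decreases as the window widens, so bisection applies.
    """
    if window_probability(tsstt, c_max, c_max, shape, scale, loc) < alpha:
        raise ValueError("alpha is not reachable within c_max")
    lo, hi = 0.0, c_max
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if window_probability(tsstt, mid, mid, shape, scale, loc) >= alpha:
            hi = mid
        else:
            lo = mid
    return hi


# Hypothetical fitted parameters for one subgroup with a TSSTT of 16 minutes.
params = dict(shape=3.0, scale=2.5, loc=14.0)
p = window_probability(16.0, 2.5, 2.5, **params)
c = smallest_half_width(16.0, 0.96, **params)
print(round(p, 4), round(c, 3))
```

Comparing `p` with a threshold corresponds to the suitability test, while `smallest_half_width` corresponds to solving for the window given a required reliability.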

Data Description
The raw data used were recorded by the pressure sensor of a track circuit from January 1 to March 30, 2016, on the Beijing-Shenzhen high-speed railway line in the Wuhan railway administrative area (East Xuchang to North Chibi), as shown in Table 1. These data include destination code, station number, scheduled arrival and departure times, and actual arrival and departure times. On the basis of these records, we can calculate each train's TSSTT and TSRT as well as the scheduled and actual dwelling times of the train at each station. The railway line in the Wuhan railway administrative area starts at 780 kilometers and 712 meters and ends at 1353 kilometers and 654 meters from the Beijing West railway station, giving a total length of 572.942 km. To the north, the line reaches East Xuchang station in the Zhengzhou administrative area, and the southernmost station is East Yueyang Station in the Guangzhou Railway Group. The area contains 10 stations and one block post: West Luohe, West Zhumadian, East Minggang, East Xinyang, North Xiaogan, East Hengdian, L1L2 block post, Wuhan Gaosuchang, East Wulongquan, North Xianning, and North Chibi. East Wulongquan is the cross station, and the rest of the stations are passenger stations. The down direction is from West Luohe Railway Station to North Chibi Station; the opposite is the up direction. For the railway section from North Chibi to North Xianning, the train G80 passed North Chibi at 15:50:00, precisely at the scheduled time on March 12, 2016, and spent 12 minutes covering the distance from North Chibi to North Xianning. Arriving at North Xianning at 16:01:00, it was delayed by just 1 minute compared to the scheduled arrival time. Train G80 departed punctually at 16:03:00 from North Xianning for its remaining trips, as described in Table 1.
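The calculation of TSSTT and TSRT from the scheduled and actual times can be sketched as follows. The field names, the timestamp format, and the sample record are hypothetical and are not taken from Table 1.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"  # assumed timestamp format


def minutes_between(start: str, end: str) -> float:
    """Elapsed minutes from start to end."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds() / 60.0


def section_times(rec: dict) -> tuple:
    """Return (TSSTT, TSRT, TSRT - TSSTT) in minutes for one section run.

    TSRT is measured from the actual departure, so a late departure inherited
    from upstream does not count as running time in this section.
    """
    tsstt = minutes_between(rec["sched_dep"], rec["sched_arr"])
    tsrt = minutes_between(rec["actual_dep"], rec["actual_arr"])
    return tsstt, tsrt, tsrt - tsstt


# Hypothetical record: the train leaves 1 minute late and arrives 2 minutes late.
rec = {
    "sched_dep": "2016-03-12 08:00:00",
    "sched_arr": "2016-03-12 08:16:00",
    "actual_dep": "2016-03-12 08:01:00",
    "actual_arr": "2016-03-12 08:18:00",
}
tsstt, tsrt, deviation = section_times(rec)
print(tsstt, tsrt, deviation)
```

In this sketch only 1 of the 2 minutes of arrival delay is attributed to the section itself, because the other minute was already present at departure.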
Raw data were recorded in the Wuhan railway administrative area, including high-speed trains and motor-train units in both the up and down directions, as long as trains passed through the area. Hence, the data show the following characteristics: trains either pass through all the stations within the area or run into the railway line from a station, or leave the railway line from a station. Table 2 shows the train-dwelling-station relationship table for some high-speed trains and a value of zero means the train does not pass through a station. For example, train G580 only passes through Wuhan Gaosuchang, L1L2 block post and East Hengdian, and only dwells at Wuhan Gaosuchang. There are 86 up-direction trains and 114 down-direction trains running in Wuhan railway administrative area. The number of motor-train units in the up and down directions is 31 and 22, respectively.
The total number of track circuit records is 91,080 items, which includes 9109 items on the section between North Xianning and North Chibi; other information on each section is listed in Table 3.

Fitting Distribution of Train Running Time
It is helpful to determine a fitting function that reflects the distribution law of TSRT, which can benefit both the optimization of a train's working diagram and the storage of historical TSRT data. For fitting purposes, the maximum value, minimum value, range, mean value, standard deviation, skewness, and kurtosis of TSRT are calculated for each of the four section types, as shown in Table 3. As section lengths differ between different pairs of adjacent stations, the average section running times ("expected" column of Table 3) are not the same, but the spread of TSRT is small, with values below 2 in the "standard deviation" column of Table 3, which means that high-speed trains have strong traction power and a highly reliable train-operation control system. It is evident from the range of TSRTs ("range" column in Table 3) that the gap is greater than 2 minutes, and up to 35 minutes, even in the same section and for the same TSSTT. This indicates that the ability of the train-operation control system to achieve the scheduled travel time must be improved and that the current train working diagram has great potential for optimization. The kurtosis values (last column of Table 3) are almost all greater than 3, except for three instances (rows 4, 35, and 45), which indicates that the high kurtosis of TSRT is a stable feature. This characteristic differs from that of urban metro trains, as shown in Table 3 of Li, Liu, et al. [32]. Hence, we can choose the Weibull distribution, with kurtosis greater than 3, to fit the distribution of TSRT. As the spread of TSRT is mostly below 2 whereas the range is sometimes quite large, there are some singular values in the TSRT data. Because singular values occur very rarely, they lie below the 1.3% quantile or above the 98.5% quantile of the TSRT data. In total, the singular values make up 3% of the TSRT data. After the singular values are removed, the Weibull distribution is used to fit the TSRT data.
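The descriptive statistics and the removal of singular values can be sketched as follows. Two assumptions are made here that the text does not specify: moments are computed as population moments (kurtosis is the non-excess value, equal to 3 for a normal distribution), and trimming drops a rounded share of the smallest and largest observations. The data are hypothetical.

```python
import math


def describe(xs):
    """Summary statistics of a TSRT sample using population moments."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return {
        "min": min(xs),
        "max": max(xs),
        "range": max(xs) - min(xs),
        "mean": mean,
        "std": math.sqrt(m2),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,  # non-excess kurtosis
    }


def trim_singular(xs, lower=0.013, upper=0.985):
    """Drop the lowest `lower` share and the highest `1 - upper` share of the sample."""
    s = sorted(xs)
    n = len(s)
    k_low = round(lower * n)
    k_high = round((1.0 - upper) * n)
    return s[k_low:n - k_high]


# Hypothetical sample: mostly 16 to 18 minutes with one very long run.
sample = [16.0, 16.5, 17.0, 17.0, 17.5, 18.0, 16.5, 17.0, 17.5, 45.0] * 10
stats = describe(sample)
kept = trim_singular(sample, lower=0.02, upper=0.98)
print(round(stats["kurtosis"], 2), len(sample), len(kept))
```

A single long run inflates both the range and the kurtosis, which is why the singular values are removed before fitting.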
The least squares between the frequencies in the histogram of TSRT and the probabilities of the fitting distribution are calculated, and SRLSM is set as the criterion for judging whether the fitting distribution is suitable. A significance test was carried out and showed that the Weibull distribution fits the TSRT data on every section, whereas the lognormal and normal distributions fit only some sections. The parameters of the fitting functions and their SRLSM values for West Luohe-East Xuchang are shown in Table 4. The table shows that the Weibull distribution is superior to the normal and lognormal distributions.
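One possible reading of the SRLSM criterion is sketched below: the square root of the summed squared differences between the relative frequency of each histogram bin and the probability that the fitted distribution assigns to that bin. The exact definition used in the paper is not given in this section, so this formula, the 1-minute bin width, and the Weibull parameters are assumptions.

```python
import math
from collections import Counter


def weibull_cdf(x: float, shape: float, scale: float, loc: float = 0.0) -> float:
    """CDF of a Weibull distribution shifted by loc."""
    if x <= loc:
        return 0.0
    return 1.0 - math.exp(-(((x - loc) / scale) ** shape))


def srlsm(tsrt, shape: float, scale: float, loc: float = 0.0,
          bin_width: float = 1.0) -> float:
    """Root of the summed squared gaps between bin frequencies and fitted bin probabilities."""
    n = len(tsrt)
    counts = Counter(math.floor(t / bin_width) for t in tsrt)
    total = 0.0
    for b in range(min(counts), max(counts) + 1):
        left = b * bin_width
        freq = counts.get(b, 0) / n
        prob = (weibull_cdf(left + bin_width, shape, scale, loc)
                - weibull_cdf(left, shape, scale, loc))
        total += (freq - prob) ** 2
    return math.sqrt(total)


# Hypothetical TSRT sample (minutes) and two candidate parameter sets.
sample = [15.6, 16.1, 16.3, 16.4, 16.8, 17.0, 17.2, 17.5, 18.1, 18.9]
fit_a = srlsm(sample, shape=3.0, scale=3.0, loc=14.0)
fit_b = srlsm(sample, shape=3.0, scale=8.0, loc=14.0)
print(round(fit_a, 4), round(fit_b, 4))
```

Under this reading, the candidate with the smaller value matches the histogram more closely, which is how different distributions could be ranked.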

Suitability of TSSTT
Let the set of high-speed railway stations be $N$ = {East Xuchang, West Luohe, West Zhumadian, East Minggang, East Xinyang, North Xiaogan, East Hengdian, L1L2 block post, Wuhan Gaosuchang, East Wulongquan, North Xianning, North Chibi}; the research scope is from North Chibi station to East Xuchang station. Let $a$ represent the West Luohe-East Xuchang section. Figure 1 shows the frequency of the TSSTTs used in the trains' working diagram on the West Luohe-East Xuchang section, which is a type (2,2) section. Noting that the data are only accurate to the minute, the figure shows that the set of TSSTTs on type (2,2) section $a$ is $T^a = \{16, 17, 18, 19, 20, 21, 22, 23, 24, 26, 27\}$. This demonstrates a variety of demands for TSSTT on the West Luohe-East Xuchang section. Over the three months, a TSSTT of 16 minutes appeared 468 times in total, which accounts for 48.1%, the highest frequency of occurrence. The next is 17 minutes, which occurs 310 times, accounting for 31.9%. TSSTTs of 24, 26, and 27 minutes are each used once, each accounting for 0.1% of the total number. Because TSSTTs greater than or equal to 20 minutes are rare, the histograms of TSRTs and the fitted distributions are plotted only for 16-19 minutes, as shown in Figures 2-5, and the fitted probability density function is written as $\rho(\tilde{t}_i^a \mid t_i^a)$. For example, the graph in Figure 2 corresponds to the TSSTT $t_1^a = 16$ minutes, and its probability density function is written as $\rho(\tilde{t}_1^a \mid 16)$. Other TSSTTs have similar usage and distribution rules.
Table 5 also gives the probability of punctual departures and arrivals in the time window $[t_i^a - \underline{c},\, t_i^a + \overline{c}]$ for type (1,2), (2,1), and (1,1) high-speed trains under different TSSTTs, obtained by the same calculation method. Table 6 shows the appropriate TSSTT value for each section in the Wuhan railway administrative area. As Table 5 shows, the reliability of TSRT can reach more than 96% in the neighborhood of 2.5 minutes centered on the TSSTT. This reliability may be less than 10% under an unsuitable TSSTT.