Exploring the Relationships between the Topological Characteristics of Subway Networks and Service Disruption Impact

Although numerous studies have considered the topological characteristics and the impact of disruptions in subway systems, their results have not been verified by empirical data. To address this limitation, we used a data set containing 392 detailed records of disruptions to subway services in Beijing from 2011 to 2017. The Spearman rank correlation coefficient analysis results indicate that the delay duration exhibits no significant relationship with the topological characteristics, whereas the reverse is true for the relationship between the number of affected trains and the topological characteristics. The results also demonstrate that subway network expansion will not result in a paradox between convenience and vulnerability from an actual data perspective. Moreover, contrary to previous research results, no significant relationship was found to exist between service interruption impact and the transit and key bridge stations. However, a high degree of clustering, characterized by redundant tracks between neighbours, tends to provide protection against service disruption for stations. In terms of the spatial variation, the influence of the disruption is greater when the station is further from the centre of the line. These results can support sustainable design in subway network planning.


Introduction
As a safer, greener, faster, and more punctual mode of transport, the subway has become a widely accepted means of public travel [1]. In this paper, a "subway" refers to urban rail transit that has an exclusive right of way (whether on the ground, underground, or elevated), as defined in [2]. In the urban and suburban areas of Beijing, tens of millions of passengers were transported by subway every day in 2018 [3]. The Beijing subway has been continually expanding its network by investing in new stations and lines. At the start of the study period in 2011, there were 14 operating lines, 174 stations, and 370 sections. At the end of 2017, there were 20 operating lines, 291 stations, and 658 sections. Moreover, the passenger traffic increased from 2.193 billion in 2011 to 3.778 billion in 2017 [3].
The rapid expansion of the network scale and the rapid increase in passenger flow have brought a series of challenges to efficient subway operation. To minimize costs, subway systems are often designed to operate at short intervals and to carry passengers close to the maximum capacity of trains [4], with little redundancy. This renders them sensitive to various disruptions [1]. Owing to the typical complexity of a subway network, when a disruption occurs at one station, it can easily cause knock-on

Literature Review
In recent years, an increasing number of researchers have used scientific network indicators to test the serviceability of subway networks [7,10,11]. The impacts of disruption on the serviceability of subway networks are often quantified by graph theory and complex network indicators, such as a decrease in the operational efficiency of the subway network following failure of a station or section [7]. In studies relating to transport and territory, the impact is generally measured in terms of accessibility or serviceability indicators, such as the proportion of delayed passengers [12], the proportion of cut-off passengers [12,13], or the time loss when passengers select alternative routes that are not optimal [13].
Disruption has been simulated by complete closure of a station or track section [13] which is passed by several lines with a high degree (an indicator used to measure the number of times a station is passed by subway lines) [14], high betweenness (an indicator used to measure the role that a station or section plays as a key bridge in all shortest paths in a network) [12], high closeness

In this section, the concepts and the measures that can be used to describe these station and subway network topologies are reviewed. Thereafter, the algorithm is formalized based on these definitions.

The Degree of a Station
The degree of a station is an indicator that describes the connectivity of a station within a subway network in L-space. The node degree is defined according to Formula (1) and represents the number of neighbours that a node has; that is, the higher the degree of a node, the more connected it is and the more important it is in the network [7]. The same applies to a subway network:

$$k_i = \sum_{j} a_{ij} \qquad (1)$$

where $k_i$ is the degree of station $i$ and $a_{ij}$ corresponds to the tracks connected to the station; each $a_{ij}$ takes a value of one for a track leading into or out of station $i$. If the degree of a station is greater than four, two or more subway lines pass through the station and it is considered a transfer station. Most stations with a degree of four or less are served by only one line and are referred to as non-transfer stations; in rare cases, if a station is the origin and terminal stop of two non-circular lines, it is also a transfer station, even though its degree is equal to four.
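As a minimal sketch of the degree calculation described above, the following Python snippet counts the tracks leading into and out of each station and flags stations whose degree exceeds four. The station names and the toy track list are hypothetical, and treating each direction of travel as a separate track is an assumption made for illustration; it is not the Urban-metro-cas implementation used in the study.

```python
from collections import defaultdict

def station_degrees(tracks):
    """Return {station: degree}, counting every directed track in or out."""
    degree = defaultdict(int)
    for origin, destination in tracks:
        degree[origin] += 1        # track leading out of `origin`
        degree[destination] += 1   # track leading into `destination`
    return dict(degree)

def both_directions(pairs):
    """Expand station pairs into two directed tracks each (assumption)."""
    return [track for a, b in pairs for track in ((a, b), (b, a))]

# Hypothetical toy network: one line runs A-B-C, another runs D-B-E.
toy_tracks = both_directions([("A", "B"), ("B", "C"), ("D", "B"), ("B", "E")])
degrees = station_degrees(toy_tracks)

# Stations with a degree greater than four are treated as transfer stations.
likely_transfer = {s for s, k in degrees.items() if k > 4}
```

In this toy network only station B, where the two lines cross, exceeds a degree of four. The rare case mentioned above (a shared terminal of two non-circular lines with a degree of exactly four) would not be caught by this threshold and would need line membership data.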

The Scaling Factor of a Subway Network
The scaling factor of a subway network is an important indicator for scale-free networks. It is used to describe the heterogeneity of the station degree distribution in the subway network [18,25]. The degree distribution of a subway network following the drift power law distribution [26] is given by:

$$p(k) \propto k^{b} \qquad (2)$$

where $p(k)$ is the proportion of stations with degree $k$ in the network and $b$ is the scaling factor of the network, which has a negative value; the smaller its absolute value, the lower the degree of heterogeneity of the network and the more transfer stations exist in the network.

The Clustering of a Station and Subway Network
The clustering of a station is an indicator that measures the degree to which stations cluster together in a network. In other words, the neighbours of a station with higher clustering can easily reach one another because of redundant tracks within the neighbourhood. The clustering degree is based on triples of stations [27]; a triangle graph, for instance, includes three closed triplets, with one centred on each node [24]. In an L-space subway network, two stations on the same line that have no direct track between them, but that can be reached from one another without a transfer, are defined as indirect neighbours in [28]. Therefore, the calculation of the clustering of a station needs to reflect the difference between the direct and indirect neighbours in the network. In Equation (3), which gives the clustering $c_i$ of station $i$, $l_i$ describes the path that station $i$ passes through, the number of track segments connecting any two stations is denoted by $c_{ij}$, $c_{jk}$, $V_{\mathrm{neighbour}}(i)$ is the number of all neighbouring stations of station $i$, and the number of stations on line $l_i$ corresponds to $V(i)$. For detailed explanations and physical meanings of the transformed clustering, please refer to [28]. The clustering of a subway network is a common property describing the small-world characteristics of the network [24]. It is the average of all individual $c_i$:

$$C = \frac{1}{N} \sum_{i=1}^{N} c_i \qquad (4)$$

where $C$ is the clustering of the whole network (its value lies between zero and one; the higher the value, the more tightly clustered the network) and $N$ is the total number of stations in the subway network. It has been proven that, in a subway network, the clustering is typically much larger than in a comparable random network.

The Betweenness of a Station
The betweenness of a station is considered as a means of detecting the amount of influence that the station has in a subway network. It is often used to determine nodes or edges that serve as a bridge from one part of a graph to another [28]. The betweenness of a station is calculated as the proportion of shortest paths passing through that station among all of the shortest paths between all origin-destination pairs in the network [12,29]:

$$B(i) = \sum_{j \neq k \neq i} \frac{p_{jk}(i)}{p_{jk}} \qquad (5)$$

where $B(i)$ is the betweenness of station $i$, $p_{jk}$ is the total number of shortest paths between stations $j$ and $k$ in the network, and $p_{jk}(i)$ is the number of shortest paths between stations $j$ and $k$ that pass through station $i$. The shortest path in this study is considered with respect to the number of edges passengers have to traverse to get from one station to another [7]. The distance between stations $j$ and $k$ is the number of track segments $c_{jk}$ along such a route.
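The shortest-path counting behind the betweenness can be sketched in a few lines of Python. This is an illustrative sketch only: it uses hop-count shortest paths found by breadth-first search, sums over ordered station pairs without normalization, and runs on a hypothetical four-station network; the normalization and the software used in the study may differ.

```python
from collections import deque

def bfs_counts(adj, source):
    """Hop distances and numbers of shortest paths from `source`."""
    dist = {source: 0}
    paths = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:              # first time v is reached
                dist[v] = dist[u] + 1
                paths[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:     # u lies on a shortest path to v
                paths[v] += paths[u]
    return dist, paths

def betweenness(adj):
    """Sum over ordered pairs (j, k) of p_jk(i) / p_jk for every station i."""
    nodes = list(adj)
    info = {s: bfs_counts(adj, s) for s in nodes}
    score = {i: 0.0 for i in nodes}
    for j in nodes:
        dist_j, paths_j = info[j]
        for k in nodes:
            if k == j or k not in dist_j:
                continue
            for i in nodes:
                if i in (j, k) or i not in dist_j:
                    continue
                dist_i, paths_i = info[i]
                # i is on a shortest j-k path if the two legs add up exactly
                if k in dist_i and dist_j[i] + dist_i[k] == dist_j[k]:
                    score[i] += paths_j[i] * paths_i[k] / paths_j[k]
    return score

# Hypothetical loop of four stations: A-B, A-C, B-D, C-D.
diamond = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"]}
scores = betweenness(diamond)
```

Because every opposite pair in this loop has two equally short routes, each intermediate station receives half of the credit per pair, which illustrates how the ratio of path counts splits influence between alternative routes.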

The Distance from the Centre of the Line
The distance from the centre of the line is defined as the number of track segments between a given station and the line central station, as proposed in [29]. For example, a distance of zero means that the line central station corresponds to the station under consideration, while a distance of 12 means that there are 12 track segments from the considered station to the line central station.
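The distance measure can be illustrated with a short Python sketch that also maps a distance to the six groups used for the "distance to centre of line" variable in this study (zero to one track as group 1, up to 10 to 13 tracks as group 6). The station names are hypothetical, and choosing the lower-middle station as the line central station when a line has an even number of stations is an assumption, since the text does not specify this case.

```python
def distance_from_centre(line_stations, station):
    """Track segments between `station` and the central station of its line."""
    # Assumption: for an even number of stations, take the lower-middle one.
    centre_index = (len(line_stations) - 1) // 2
    return abs(line_stations.index(station) - centre_index)

def distance_group(distance):
    """Map a distance in track segments to the six groups (1 to 6)."""
    if not 0 <= distance <= 13:
        raise ValueError("distance outside the 0-13 range covered by the groups")
    # 0-1 -> 1, 2-3 -> 2, 4-5 -> 3, 6-7 -> 4, 8-9 -> 5, 10-13 -> 6
    return 6 if distance >= 10 else distance // 2 + 1

# Hypothetical nine-station line S1 ... S9, whose central station is S5.
line = [f"S{n}" for n in range(1, 10)]
terminal_distance = distance_from_centre(line, "S1")
terminal_group = distance_group(terminal_distance)
```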

Case Study
Two types of data were used in the study: data of the transit service disruptions occurring in the Beijing subway from 2011 to 2017, and data of the evolution of the Beijing subway network from 2011 to 2017.

Beijing MRT Service Disruption Data
A total of 392 subway transit service disruptions, recorded by the Department of Safety Supervision of Beijing Subway Limited during the years 2011 to 2017, were used. The delay durations, which are also referred to as the disruption duration, of all the incidents were more than 5 min from occurrence to the resumption of normal operation. Subway service disruptions, such as lower speed of trains in the track segment or longer dwelling times at stations during peak hours, and large disruptions, such as incomplete or complete closure of a track segment, were all considered as disruptions in this research. Each disruption case was composed of an incident date and location, as well as a description of the incident process (from occurrence to normal service). The number of trains cancelled, the number of trains running late, the number of trains turning back to the depot empty, cause, disruption disposal measures, and the time at which normal operation was reinstated were described in the incident process. Normal operation means that all disruptions were cleared up, but not all services ran according to schedule again. Five examples of the total of 392 disruption records are listed in Table 1. Disruption disposal measures vary, according to the cause [19]. For example, for train malfunctions, such as a stuck door, malfunctioning brakes, or traction faults, the driver is first asked to perform a recovery operation. If the function cannot be restored, the train is requested to be removed from service and to return to the depot empty. In this case, the following trains may be able to arrive on time or slightly late. If the train cannot move by itself, another train nearby will be requested to perform the rescue operation. When the rescue train arrives at the disruption location, it pulls the faulty train to the nearest station to clear the passengers, and then returns to the depot. 
In this case, prior to the arrival of the rescue train, the following trains must stop and wait for the tracks ahead to be cleared, and service on the line is temporarily suspended. When the faulty train has returned to the depot, the line service returns to normal, although several trains may be late; the accumulation of delays may result in the cancellation of numerous trains. For signal failures, such as axle-counting faults in the station and turnout malfunctions, trains are requested to proceed under telephone blocking instead of automatic signals, at a lower speed. Certain disruptions may last for a long time yet affect few trains, such as the power failure in 2014 that lasted for 116 min, although only eight trains were affected (Case No. 3). Conversely, a very short disruption may have far-reaching consequences, such as the train failure in 2011 that lasted for 11 min but ultimately caused the cancellation of nine trains and delayed 47 trains by more than 5 min (Case No. 1).
Based on the above description of the disposal measures for disruptions, the service disruption duration (also referred to as the delay duration) cannot reflect the actual negative effect on the transport service, as it is largely dependent on the ability of operators, disposal regulations, types of disruptions, and so on. However, delayed and cancelled trains, defined as the number of affected trains in this study, have a serious negative impact on passenger travel. Our aim is to compare how the delay duration and the number of affected trains relate to the topological characteristics. Accordingly, our chosen response variables are the delay duration and the number of affected trains.

Beijing Subway Network Structure
The Beijing subway network, which is not immune to service disruptions, was expanded with new lines and stations every year during the study period from 2011 to 2017. These changes in the network structure affected the network characteristics of the stations. Therefore, according to the times at which the new stations and lines of the Beijing subway were placed into operation and the times at which the disruptions occurred at the stations, a total of seven subway network structures were established for 2011 to 2017. The suspended relationship between stations under disturbance cannot be reflected by C-space or R-space [30]. Therefore, the subway network map was established and analysed based on L-space, with reference to [30]. In this straightforward graph representation of a subway network, every station is a node, while each edge corresponds to a track between two stations [2].
In 2011, the network was composed of 14 lines, 174 stations, and 370 tracks (two shared tracks); in 2017, the network consisted of 20 lines, 291 stations, and 658 tracks. The expansion of the Beijing subway network structure is illustrated in Figure 1.
Sustainability 2020, 12, x FOR PEER REVIEW 7 of 18

Although the network expansion significantly improved the accessibility of subway stations to passengers, owing to the increasing number of accessible routes (the number of connected station pairs within the network increased from 24,644 in 2011 to 84,390 in 2017), the length of the shortest path also increased, regardless of whether it was calculated based on the number of transfers or the number of stations passed. The overall evolution characteristics of the Beijing subway network structure from 2011 to 2017 are summarized in Table 2.
Table 3 summarizes four independent variables regarding the complexity characteristics of the subway stations and two dependent variables regarding the impacts of service disruption, based on the above definitions. As the calculations of the station degree, betweenness, and clustering differ from those for nodes in social networks, popular complex network analysis software (e.g., Pajek or UCINET) could not be used in this study. Therefore, the four topological characteristics of stations were calculated using the subway simulation analysis system known as Urban-metro-cas (Patent no. CN108897920-A), which was developed in [31].

Dependent Variables
Delay duration: An integer-valued variable equaling the duration from the occurrence of the disruption to the resumption of normal operations (disruption clearing).

Independent Variables Regarding Station
Degree of station: An ordered categorical variable equaling the number of tracks into and out of a station.
Transfer station: A dummy variable equaling one if the station is passed by more than one line.
Clustering of stations: A continuous variable reflecting the tightness of a station, calculated using Equation (3).
Betweenness of station: A continuous variable reflecting the criticality of a station as a bridge, calculated using Equation (5).
Distance to centre of line: An ordered categorical variable reflecting the distance of the station to the centre of the line, categorised into six groups. A distance of zero to one track is group 1; two to three tracks is group 2; four to five tracks is group 3; six to seven tracks is group 4; eight to nine tracks is group 5; and 10 to 13 tracks is group 6.

Independent Variables Regarding Network
Scaling factor: A negative value. The smaller its absolute value, the lower the degree of heterogeneity of the network and the more transfer stations in the network.
Clustering of network: A continuous variable between zero and one, calculated using Equation (4).

Descriptive Statistics of Data
In this study, there were only seven subway networks. Consequently, the scaling factor, clustering, average minimum path length, and average minimum transfer, which describe the characteristics of the whole network, had only seven values each; these are listed in Table 4. Descriptive statistics of the remaining variables, excluding these four indicators of the overall subway network, are also presented.
This analysis is of interest for verifying the hypothesis that the complexity characteristics of the subway network and the topological characteristics of the subway stations interact to affect the number of affected trains and the delay duration. The Pearson correlation coefficient is used to quantify the strength of the relationship between linearly related variables [32], and the variables should be normally distributed for Pearson correlation analysis. The results of the normality test for our data are presented in Table 5. It can be observed that neither the four independent variables nor the two dependent variables were normally distributed. The data requirements of the Spearman rank correlation coefficient are less strict than those of the Pearson correlation coefficient [33]. The Spearman rank correlation coefficient can be used for correlation analysis regardless of the shape, sample size, or overall distribution of the two variables, provided that the observed values of the two variables are paired rank data or continuous observations that can be converted to ranks [34]. Therefore, the Spearman rank correlation coefficient was used for the correlation analysis in this study. The rank of each value, that is, its position when the values are sorted in ascending or descending order, was used instead of the value itself.
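The rank-based calculation can be made concrete with a small Python sketch that replaces each value by its rank and then applies the Pearson formula to the ranks. The sample values are hypothetical, tied values are given their average rank (a common convention, assumed here), and the significance test reported in the study is not reproduced.

```python
from math import sqrt

def average_ranks(values):
    """Ascending ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # extend the block while the next sorted value is tied with this one
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared
        start = end + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical paired observations (not taken from the disruption records).
station_clustering = [0.05, 0.10, 0.20, 0.40, 0.60]
affected_trains = [56, 60, 30, 20, 12]
rho = spearman(station_clustering, affected_trains)
```

For these made-up values the coefficient is strongly negative, because the ordering of the affected-train counts is almost the reverse of the ordering of the clustering values; only the ranks, not the magnitudes, enter the result.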

Effects of Complexity of Beijing Subway Network on Disruption Impact
The complexity characteristics changed with the evolution of the Beijing subway network from 2011 to 2017. According to Table 6, with increasing network size, the scaling factor of the Beijing subway network tended to settle at a relatively high value. However, the network became less clustered, as the clustering decreased, and both the average minimum distance and the average minimum transfer increased. The average disruption impact, in terms of both the number of affected trains and the delay duration, exhibited no obvious pattern. A bivariate correlation analysis was conducted using Spearman correlation coefficients to explore the relationships between the holistic topological characteristics of the subway network and the disruption impact. As train intervals differ between rush and non-rush hours, separate bivariate correlation analyses were conducted for rush and non-rush hours. The correlation results are presented in Table 7. Although the independent variables were highly correlated with each other, the Spearman correlation test was carried out on one pair of variables at a time; therefore, the test results were not affected by multicollinearity. The purpose of this table is to show whether there was any correlation between the topological characteristics of the holistic network and the average disruption impact on the network. From Table 7, we can see which variables were related to the disruption impact from a holistic perspective.
** Correlation is significant at the 0.01 level (two-tailed); * correlation is significant at the 0.05 level (two-tailed).
On one hand, the scale-free characteristic (scaling factor) of the network exhibited a strong correlation with the small-world characteristic (clustering of the network, average minimum transfer, and average minimum distance). The curves of the related variables are illustrated in Figure 2. The scaling factor was negatively correlated with the network clustering, but positively correlated with the shortest path length, in terms of both the average minimum transfer and average minimum distance. The network clustering was negatively correlated with the shortest path length, in terms of both the average minimum transfer and average minimum distance. The average minimum transfer was positively correlated with the average minimum distance.

On the other hand, there was no significant correlation between the average delay duration and the holistic topological characteristics in either the rush hour group or the non-rush hour group. However, the average number of affected trains was correlated with both the scale-free and the small-world holistic topological characteristics of the subway network in the non-rush hour group. Specifically, the average number of affected trains was positively correlated with the scaling factor, the average minimum transfer, and the average minimum distance, but negatively correlated with the clustering.

Correlation of Disruption Impact with Topological Characteristics of Stations
In general, the degree and betweenness of a station tend to increase the service disruption impact, while clustering has the opposite effect. However, the distance from the centre of the line is rarely discussed. To test the hypothesis that a more important station exhibits a higher number of affected trains or a longer delay duration, we further analysed the correlation between the topological characteristic variables and the number of affected trains and delay duration by means of the Spearman correlation test, as indicated in Table 8. The test was carried out in both the rush hour and non-rush hour groups. For clustering and distance, the null hypothesis of no correlation was rejected in both groups when the response variable was the number of affected trains. Surprisingly, when the response variable was the delay duration, none of the four variables exhibited a significant correlation in either group. Higher clustering tends to protect stations against service disruption, and its coefficient is significant at conventional significance levels. By plotting the clustering of stations versus the number of affected trains (as illustrated in Figure 3), it is easy to observe that the disruptions with the largest numbers of affected trains always occurred at stations with a clustering value of less than 0.2. As the station clustering increased, the number of affected trains became more concentrated around 50, as indicated in Figure 3. The number of affected trains at stations with a clustering of less than 0.5 was widely dispersed; that is, even if a disruption occurred, stations with higher clustering did not affect the service operations of their neighbouring stations.
Furthermore, lines of no more than 16 stations or of more than 20 stations can be planned to improve the resilience of the subway service under disruptions. This finding is consistent with [8], where it was found that networks with a larger diameter are often sparser and contain fewer redundant connections. Therefore, the positive relationship between the distance of the station from the centre of the line and the number of affected trains under service disruption is demonstrated. However, as transfer stations are less likely to appear at the edge of the network, when the network diameter continues to increase and becomes sufficiently large, service disruptions occurring at the edge of the network have a limited radiation capacity, which reduces the impact (as illustrated in Figure 4).

Regression Model of Disruption Impact with Topological Characteristics of the Station
Accordingly, the explanatory variables identified as correlated with the disruption impact, namely clustering and distance, were considered in the regression model. The dependent variable was the number of affected trains. Firstly, the multicollinearity of the two correlated variables was diagnosed, as shown in Table 9. The variance proportions show that distance can explain more than 70% of clustering. Additionally, in both groups, the eigenvalue of the distance variable was close to 0 and the condition index of distance was larger than that of clustering. Therefore, distance was excluded from the following regression analysis. Logarithmic regression and inverse regression were used for comparison with linear regression. The results of the three regression models are shown in Table 9. In non-rush hours, clustering and the number of affected trains maintained a linear relation, whereas in rush hours they maintained both linear and logarithmic relations. In the rush hour group, the R-Square value of the linear model was larger than that of the logarithmic model. In both groups, the R-Square value was far from 1, because the model contained only one variable. The parameter estimates are shown in Table 10 and the linear fit curve is shown in Figure 3.
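The comparison of linear, logarithmic, and inverse regression can be sketched in Python by noting that all three are ordinary least squares fits of the response on a transformed predictor. The data below are hypothetical and deliberately exactly linear, and the function names are illustrative; this is not the statistical software or the data used in the study.

```python
from math import log

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b, r_squared)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((v - mean_x) ** 2 for v in x)
    sxy = sum((v - mean_x) * (w - mean_y) for v, w in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((w - (intercept + slope * v)) ** 2 for v, w in zip(x, y))
    ss_tot = sum((w - mean_y) ** 2 for w in y)
    return intercept, slope, 1 - ss_res / ss_tot

# Each model is linear in a transformed predictor.
TRANSFORMS = {
    "linear": lambda v: v,           # y = a + b*x
    "logarithmic": log,              # y = a + b*ln(x), requires x > 0
    "inverse": lambda v: 1 / v,      # y = a + b/x, requires x != 0
}

def compare_models(x, y):
    """Fit all three models and return {name: (a, b, r_squared)}."""
    return {name: fit_line([f(v) for v in x], y) for name, f in TRANSFORMS.items()}

# Hypothetical data generated from y = 60 - 50*x (not from the study).
clustering = [0.1, 0.2, 0.4, 0.5, 0.8]
affected = [55, 50, 40, 35, 20]
results = compare_models(clustering, affected)
```

Comparing the third element of each tuple mirrors the R-Square comparison described above. Note that stations with a clustering of zero would have to be excluded or handled separately before fitting the logarithmic and inverse forms.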

Conclusions
In this paper, we presented a correlation analysis of the topological characteristics of a subway network with number of affected trains and delay duration. In contrast to studies based on simulation and graph theoretic methods, we used the data of 392 actual service disruptions that occurred in the Beijing subway network from 2011 to 2017. The Spearman rank correlation coefficient was used to analyse the correlations among the complex attributes of the holistic network, the complex attributes of local stations, and the service disruption. This enabled not only clarification of the relationships, but also an assessment of the possible means of optimizing the planning of subway networks. The information provided by the research results may aid scholars in identifying the shortcomings of several hypotheses in theoretical research. Moreover, this information can be provided to aid the preparation of planners and managers, in order to mitigate the impact of accidents on the network by planning new routes or expanding existing routes to reduce vulnerability and critical factors.
From a complex network theory perspective, the evolution of a subway network does not necessarily increase the network complexity while improving its convenience to passengers. Up to 2017, the main objective of the Beijing subway extension was to expand its coverage area. At this stage, the Beijing subway is expanding its network mainly by constructing new stations at the edges along radial lines, rather than by adding tracks between existing stations to increase the network density. Consequently, the addition of new stations generates substantially more non-transfer stations than transfer stations. An increase in transfer stations may decrease the heterogeneity of the network; from a mathematical perspective, the observed growth is instead reflected in an increasing scaling factor. The rapid increase in the number of stations leads to a lack of connectivity between existing stations, which eventually results in an increasing shortest path length and a decrease in the clustering. This is the reason that the scale-free feature of the network grows, but the small-world feature of the network does not improve significantly. Only at a later stage, in which the evolution of the network is based on the construction of tracks to connect old stations, will both the scale-free and small-world complexity attributes increase simultaneously.
Additionally, although the number of affected trains increases as the scaling factor increases, this negative impact is absorbed by the high clustering of the subway network. At a developed stage, in which the network evolves by constructing tracks among old stations, both the scale-free and small-world complexity attributes increase simultaneously. This means that, if the subway network is extended by adding tracks between existing stations, the vulnerability caused by the stronger scale-free features can be offset by improving the small-world features of the network. These results demonstrate that, from an actual data perspective, subway network expansion will not result in a paradox between convenience and vulnerability.
Meanwhile, according to our empirical results, the number of affected trains is a more valuable indicator than the delay duration for quantifying the relationship between the service delivery level under disruption and the topological characteristics of the MRT network structure. Of the two dependent variables describing a service disruption, only the number of affected trains exhibited a correlation with the independent variables; the delay duration exhibited none. The duration of a service disruption cannot reflect its actual negative effect on the transport service, because certain disruptions that last for a long time may affect only a few trains, whereas certain very short disruptions ultimately delay more trains. Nevertheless, in most transportation-related studies, the delay duration, or the delay time superimposed on passenger flow, is generally used to quantify the impact of disruptions.
The conclusion of most studies, that subway networks are more vulnerable to targeted attacks than to random attacks, was not supported by our empirical data from a service delivery perspective. In such studies, targeted attacks were simulated in numerous ways, such as deleting stations with higher degree, betweenness, or clustering. Our empirical analysis showed no significant difference in the number of affected trains between transfer and non-transfer stations, nor did the number of affected trains differ significantly with station betweenness. In theoretical studies, the platforms and tracks of a transfer station were assumed to be unique. This assumption leads to the conclusion that, once an incident occurs at a transfer station, its platform and tracks will be occupied, so that all trains planning to pass through the station will be affected, either by travelling at a lower speed or by being cancelled. However, in actual operations, the platforms and tracks are not unique: different lines occupy different platforms and tracks, and an incident on one platform does not necessarily affect the normal operation of trains on other platforms. Therefore, there was no significant difference in the number of affected trains across stations of different degree. Furthermore, the irrelevance of transfer stations serves as a reminder that simulation-based research should model multi-platform transfer stations correctly, rather than assuming that all lines at a transfer station share the same platform.
Moreover, the negative correlation between clustering and the number of affected trains indicates that the impact of a service disruption occurring at a station with high clustering was smaller than that at a station with low clustering. In practice, constructing more tracks between existing stations can improve the clustering coefficient of the stations. Higher clustering tends to protect stations against service disruption by providing alternative tracks. For example, in the Beijing subway network, the airport line has four stations in total: Dongzhimen station, Sanyuanqiao station, T2 station, and T3 station. Sanyuanqiao station, T2 station, and T3 station are all connected by tracks, forming a closed triple (a triangle). Therefore, the clustering values of T2 and T3 are greater than 0.6. If a disruption occurs at the T3 station, T2 can still be reached directly from Sanyuanqiao station, which provides a significant protective effect. Additionally, lines of up to 16 stations or of more than 20 stations can be planned to improve the resilience of the subway service under disruptions. This finding is consistent with [8], which notes that networks with a larger diameter are often sparser and contain fewer redundant connections. However, when the diameter is sufficiently large, service disruptions occurring at the edge of the network have a limited radiation capacity, which reduces their impact.
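The airport line example can be sketched as a local clustering coefficient calculation. The snippet below is an illustrative assumption rather than the procedure used in the study: it treats the four airport line stations as an isolated undirected graph and ignores every other line that serves Dongzhimen or Sanyuanqiao, so the values may differ from those obtained on the full network.

```python
from itertools import combinations


def local_clustering(adj, node):
    """Fraction of pairs of a node's neighbours that are directly linked."""
    neighbours = adj[node]
    k = len(neighbours)
    if k < 2:
        return 0.0  # fewer than two neighbours: no triangle is possible
    links = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
    return 2 * links / (k * (k - 1))


# Airport line only (assumption: all other lines are left out).
# Sanyuanqiao, T2 and T3 are mutually connected by tracks.
airport_line = {
    "Dongzhimen": {"Sanyuanqiao"},
    "Sanyuanqiao": {"Dongzhimen", "T2", "T3"},
    "T2": {"Sanyuanqiao", "T3"},
    "T3": {"Sanyuanqiao", "T2"},
}

for station in airport_line:
    print(station, round(local_clustering(airport_line, station), 3))
```

On this isolated four-station graph, the two neighbours of T2 (and likewise of T3) are themselves linked, so both stations have a coefficient of 1.0, whereas Dongzhimen, with a single neighbour, has a coefficient of 0.0.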
Finally, we note several limitations of the study. Firstly, using the number of affected trains as an indicator of disruption impact reflects the transport service delivery perspective. When a disruption happens, the providers or managers of the transport service can learn from the number of affected trains how the planned service delivery level is affected. However, this indicator does not consider the actual passenger capacity or density of the trains. Although we separated the analysis into rush hour and non-rush hour groups, the actual passenger capacity or density of trains may differ between city centre lines and suburban lines. The number of passengers affected by a disruption cannot be determined from the number of affected trains. Researchers have used AFC data to evaluate the impact of disruptions on passengers. On the one hand, affected passengers exhibit a delay effect and an escape effect. The delay effect means that affected passengers appear in the AFC data only after the disruption is over. The escape effect means that some affected passengers leave the MRT system and are therefore not reflected in the AFC data. Consequently, if AFC data are to be used as a tool for disruption impact assessment, the time window selected for the AFC data is critical and will directly change the results. On the other hand, even if an appropriate time window is selected to extract the AFC data, only the number of passengers actually served can be known. To determine how many passengers are affected by a disruption, a simulation method is needed to estimate the number of passengers who should have been provided with transport service. Therefore, the results also depend on the accuracy of the simulated passenger flow data.
Secondly, the length of the shortest path was calculated in this study from the number of transfers and the number of stations passed. Under this definition, extending the network by inserting a new station between existing stations can only increase the shortest path length between those stations. The Beijing subway network was not fully connected in 2011 and 2012; it became a fully connected network only in 2013. From 2013, there were 30,450 connected pairs among those 174 stations. The average shortest path length of the 30,450 OD pairs increased from 15.9713 in 2013 to 16.6098 in 2014, because new stations had been added between existing stations. The reverse holds if the shortest path is measured by geodesic link length, whereas using link travel times might increase the shortest path length somewhat if dwell times result in longer journey times for through passengers. In future studies, the shortest path should be calculated in a way that suits the purpose of the research.
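The effect described above can be illustrated on a toy network. The following sketch is a simplification under stated assumptions: path length is measured in hops only (transfers are not counted, unlike in this study), and the stations A to D and the infill station X are hypothetical. It compares the average shortest path length over the same set of old stations before the extension, after inserting a station on an existing link, and after adding a track between two existing stations.

```python
from collections import deque
from itertools import combinations


def build(edges):
    """Build an undirected adjacency map from a list of station pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def hop_distances(adj, source):
    """Breadth-first search: number of hops from source to each station."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def average_path_length(adj, stations):
    """Mean hop distance over all unordered pairs of the given stations."""
    pairs = list(combinations(stations, 2))
    return sum(hop_distances(adj, a)[b] for a, b in pairs) / len(pairs)


old_stations = ["A", "B", "C", "D"]  # hypothetical stations

before = build([("A", "B"), ("B", "C"), ("C", "D")])
# Infill: hypothetical station X is inserted on the existing link B-C.
infill = build([("A", "B"), ("B", "X"), ("X", "C"), ("C", "D")])
# Densification: a new track directly connects existing stations A and D.
new_link = build([("A", "B"), ("B", "C"), ("C", "D"), ("A", "D")])

avg_before = average_path_length(before, old_stations)
avg_infill = average_path_length(infill, old_stations)
avg_new_link = average_path_length(new_link, old_stations)
print(avg_before, avg_infill, avg_new_link)
```

In this toy case the infill station lengthens the average hop-count path among the old stations, while the new track between existing stations shortens it. Replacing hop counts with geodesic link lengths or travel times would only require weighting the edges and using a weighted shortest path search instead of breadth-first search.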
Thirdly, some disruptions resulted from the domino effect of another disruption, but this domino effect was not considered here. Such a treatment may cause spatial correlation information to be missed. Therefore, in future studies, disruptions should be considered as chains or even formulated as networks. Spatially related factors, such as geographical, economic, and demographic attributes, should be taken into account when carrying out spatial autocorrelation analysis. Whether subway disruption incidents exhibit spatial autocorrelation is an interesting open question.
From the perspective of predictive analysis, the regression results showed that topological features alone cannot adequately explain the disruption impact. Therefore, future regression analyses need to consider not only the topological characteristics, but also factors derived from GIS information, operations, passenger flow, and the cause of the disruptions.