Connectivity and Accessibility of the Railway Network in China: Guidance for Spatial Balanced Development

Good connectivity and accessibility help to enhance the competitiveness of regions and countries. This research provides a detailed analysis of the connectivity and accessibility of the Chinese railway network. The study period runs from 1949 to 2017. The research scope covers the railway system of the entire country (except Taiwan, Hong Kong, and Macao). Instead of focusing on main cities as research objects, this paper provides more detailed insights by using counties as the basic research units. The analysis shows that the achieved connectivity increased continuously over the study period. Four accessibility indicators (temporal location indicator, weighted average travel time, daily accessibility, and potential indicator) provide comprehensive and complementary results, indicating that the most accessible cities and units are located southeast of the Hu line. In addition, a higher economic level or a higher population density is correlated with higher accessibility. Furthermore, the current network exhibits an unbalanced spatial distribution pattern, with an underdeveloped west. All the indicators show that the accessibility of the northwest and southwest regions is the lowest. Based on these conclusions, regional policy-making suggestions can be made to guide a rational railway network expansion and to facilitate the equitable and sustainable economic development of regions. Future railway system development should focus more on enhancing intra- and inter-regional communication in the west of China and should attach importance to poverty-stricken counties in support of balanced regional growth and development. Railway development in the eastern regions needs to focus on optimizing the structure of the network and on organizing railway routes reasonably.


Introduction
The influence of traffic network expansion on the spatial distribution of accessibility remains a hot topic in transport geography. Accessibility is a concept used to measure the overall spatial structure of a transportation network and to evaluate the available opportunities, which correlate strongly with regional economic development. The development of railways in China has important implications for the improvement of accessibility and thus contributes remarkably to population agglomeration, economic growth, and urbanization at regional and national scales. After a hundred years of development, the skeleton of China's railway network has basically been formed. The latest Mid-to-Long-Term Railway Network Plan, published in 2016, outlines the future of the Chinese railway system. The railway mileage is planned to reach 175,000 km in 2025,

Data Source and Descriptive Statistics
The data used in this study are specified as follows:
• Travel time of the railway system: The travel time between adjacent station pairs was obtained from the official timetable database of the Chinese tourism website "Qunar Travel" in 2018. These times were then processed with the Floyd algorithm, which uses an adjacency matrix to compare all possible paths between each pair of vertices and find the shortest one [28], yielding the minimum travel time between any origin and destination.
The percentages of units with at least one railway station are lowest in the northwest and southwest. The presence of a station was significantly correlated with both the GDP and the population of units, with Spearman correlation coefficients of 0.374 and 0.307, respectively, both significant at the 0.01 level. The descriptive statistics of GDP and population are higher for units with at least one station than for units without a station. The coefficient of variation (CV) indicates the dispersion of GDP or population (a higher CV means a more unbalanced distribution of population or GDP).
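The shortest-path step described for the travel-time data can be illustrated with a minimal sketch of the Floyd (Floyd-Warshall) algorithm. The function name `floyd_warshall`, the four-station matrix, and its travel times are hypothetical and are not data from the study; the sketch also assumes that transfer and waiting times are ignored, which the text does not specify.

```python
import math

def floyd_warshall(times):
    """Return all-pairs minimum travel times from a direct-link matrix.

    times[i][j] is the direct travel time (hours) between adjacent
    stations i and j, math.inf where no direct link exists, 0 on the diagonal.
    """
    n = len(times)
    dist = [row[:] for row in times]  # copy so the input is not modified
    for k in range(n):  # allow station k as an intermediate stop
        for i in range(n):
            for j in range(n):
                via_k = dist[i][k] + dist[k][j]
                if via_k < dist[i][j]:
                    dist[i][j] = via_k
    return dist

INF = math.inf
# Hypothetical line A-B-C-D plus a slow direct A-C link (illustrative only).
direct = [
    [0.0, 1.0, 3.5, INF],
    [1.0, 0.0, 1.5, INF],
    [3.5, 1.5, 0.0, 2.0],
    [INF, INF, 2.0, 0.0],
]
shortest = floyd_warshall(direct)
print(shortest[0][2])  # A to C via B (2.5 h) beats the direct 3.5 h link
print(shortest[0][3])  # A to D through B and C
```

The triple loop has cubic cost in the number of stations, which is acceptable for a network of a few thousand stations computed once per study year.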

Method
This section introduces the main indicators for the analysis.

Connectivity Indicators
The following four indicators (β, µ, α, and γ) are generally used to evaluate the overall network connectivity [4,29]:
β represents the average number of edges (e) per node (n), i.e., β = e/n. The network has a tree structure when β < 1 and a closed-loop structure when β > 1.
µ represents the number of circuits, i.e., the gap between e and n, while also accounting for the number of subnets p (p = 1 for a fully connected network), written as µ = e − n + p.
γ represents the ratio of the actual to the maximal number of edges, i.e., γ = e/[3(n − 2)], 0 ≤ γ ≤ 1.
Larger values of the above indices indicate a better-connected railway network.
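As a minimal sketch, the three indicators whose formulas are stated above (β, µ, and γ) can be computed directly from the edge and node counts. The function name `connectivity_indicators` and the toy network are hypothetical. The α indicator is left out because its formula does not appear in this section.

```python
def connectivity_indicators(e, n, p=1):
    """Return beta, mu and gamma for a network with e edges, n nodes, p subnets."""
    beta = e / n               # average number of edges per node
    mu = e - n + p             # number of circuits
    gamma = e / (3 * (n - 2))  # actual edges over the maximal number of edges
    return {"beta": beta, "mu": mu, "gamma": gamma}

# Hypothetical toy network: 5 stations, 6 direct links, one connected subnet.
toy = connectivity_indicators(e=6, n=5)
print(toy)
```

For this toy network β exceeds 1 and µ equals 2, so the network contains closed loops rather than forming a pure tree.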

Temporal Location (TL) Indicator
The TL indicator [24] is defined as the ratio of the average travel time at node (station/unit) i to the network average across all nodes (i = 1, 2, ..., n), and thus reflects its relative accessibility. It is given by:
$$A_i = \frac{T_i}{\sum_{k=1}^{n} T_k / n}$$
where $T_i$ represents the average travel time between i and all other nodes in the network ($T_i = \sum_{j} t_{ij} / n$) (in hours). A smaller $T_i$ value indicates that it is more convenient to reach other nodes. $A_i > 1$ represents accessibility below the network average, while $A_i < 1$ indicates accessibility above the average. A node with a smaller value corresponds to better accessibility. In addition, the node with the smallest indicator normally becomes the center of the railway network.
In practice, a travel time matrix of size n × n is first obtained. The entry in the i-th row and j-th column represents the travel time from node i to node j. $T_i$ is computed by taking the average of all the entries in the i-th row. $\sum_{k} T_k / n$ is computed by taking the average of all the entries in the travel time matrix.
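The matrix procedure described above can be sketched in a few lines. The function name `temporal_location` and the 3 × 3 travel time matrix are hypothetical and serve only to illustrate the row-average and matrix-average steps.

```python
def temporal_location(t):
    """Return the TL indicator A_i for every node of travel time matrix t."""
    n = len(t)
    row_means = [sum(row) / n for row in t]  # T_i: average of the i-th row
    network_mean = sum(row_means) / n        # average of all matrix entries
    return [t_i / network_mean for t_i in row_means]

# Hypothetical minimum travel times (hours) between three nodes.
t = [
    [0.0, 1.0, 3.0],
    [1.0, 0.0, 2.0],
    [3.0, 2.0, 0.0],
]
tl = temporal_location(t)
center = min(range(len(tl)), key=lambda i: tl[i])  # node with the smallest TL
print(tl, center)
```

In this toy case the middle node has a TL below 1 and would be treated as the center of the network, while the third node has a TL above 1 and is less accessible than average.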

Weighted Average Travel Time (WATT)
The WATT of each node is generally used to compare accessibility across places. It is the population-weighted average travel time, defined as follows:
$$W_i = \frac{\sum_{j \neq i} (t_{ij} \times M_j)}{\sum_{j \neq i} M_j}$$
where $W_i$ represents the WATT of node i, which has also been referred to as the location indicator [2] (a lower value indicates a more accessible node); $t_{ij}$ represents the minimum travel time between nodes (stations/research units) i and j via the railway network (in hours); $M_j$ represents the population of destination j, which is used as a weight to distinguish the importance of the travel time from node i to node j; and n represents the total number of destinations that are accessible from node i.
To obtain the WATT indicator, a travel time matrix (of size n × n) and a column vector of the population (of size n × 1) are first derived. $\sum_{j \neq i} (t_{ij} \times M_j)$ is obtained by multiplying the travel time matrix by the population vector and then retrieving the entry in the i-th row (the diagonal entry $t_{ii}$ is zero, so the term j = i does not contribute). $\sum_{j \neq i} M_j$ is obtained by summing the population vector while excluding the i-th entry.
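A minimal sketch of this computation follows, using explicit sums instead of a matrix product. The function name `watt`, the travel times, and the population figures are hypothetical.

```python
def watt(t, pop):
    """Return the population-weighted average travel time W_i for every node."""
    n = len(t)
    result = []
    for i in range(n):
        # Numerator: travel time to each other node weighted by its population.
        weighted = sum(t[i][j] * pop[j] for j in range(n) if j != i)
        # Denominator: total population of all destinations except i itself.
        total_pop = sum(pop[j] for j in range(n) if j != i)
        result.append(weighted / total_pop)
    return result

# Hypothetical minimum travel times (hours) and populations for three units.
t = [
    [0.0, 1.0, 3.0],
    [1.0, 0.0, 2.0],
    [3.0, 2.0, 0.0],
]
pop = [100.0, 300.0, 200.0]
w = watt(t, pop)
print(w)
```

Because the weights are destination populations, a unit that is close to populous units obtains a lower WATT than one that is equally close to sparsely populated units.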

Daily Accessibility (DA) Indicator
This indicator is based on the concept of a fixed constraint for travel time, calculated as the number of opportunities that can be reached from a research unit within a certain travel time. In this work, we focus on the number of opportunities within two and four hours, which are two typical time periods adopted in previous works, e.g., [2,30].
This study investigated two DA indicators: daily accessible units and daily accessible population. Daily accessible units refers to the number of research units that can be reached from one origin within the specified travel time. This indicator describes the connectivity of a unit to other units via the railway system. A unit with a high value for this indicator has transportation infrastructure capable of quickly moving passengers to other places. The daily accessible population indicator identifies how many people can be reached from one location within a given travel time via the railway system. Overall, the daily accessible units metric describes the links between a unit and a set of neighboring units, while the daily accessible population measure reflects a regional demand effect [1].
In practice, the daily accessible units metric is obtained by counting the entries in the i-th row of the travel time matrix that are smaller than two hours or four hours. The daily accessible population is computed by summing the population of the units that can be reached from the i-th node within two hours or four hours.
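The counting procedure can be sketched as follows. The function name `daily_accessibility` and the sample data are hypothetical. Two choices in the sketch are assumptions, not statements from the text: the origin itself is excluded from the count, and the time limit is treated as inclusive.

```python
def daily_accessibility(t, pop, limit_hours):
    """Return (accessible unit counts, accessible populations) per origin."""
    n = len(t)
    units, people = [], []
    for i in range(n):
        # Assumption: exclude the origin and treat the limit as inclusive.
        reachable = [j for j in range(n) if j != i and t[i][j] <= limit_hours]
        units.append(len(reachable))
        people.append(sum(pop[j] for j in reachable))
    return units, people

# Hypothetical minimum travel times (hours) and populations for three units.
t = [
    [0.0, 1.0, 3.0],
    [1.0, 0.0, 2.0],
    [3.0, 2.0, 0.0],
]
pop = [100.0, 300.0, 200.0]
units_2h, pop_2h = daily_accessibility(t, pop, limit_hours=2.0)
units_4h, pop_4h = daily_accessibility(t, pop, limit_hours=4.0)
print(units_2h, pop_2h)
print(units_4h, pop_4h)
```

Calling the function once per threshold mirrors the two-hour and four-hour variants used in the study.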

Potential Accessibility (PA) Indicator
The potential of a research unit i is given as:
$$P_i = \sum_{j \neq i} \frac{D_j}{t_{ij}^{\alpha}}$$
where $D_j$ represents the attraction of node j, which is characterized by the GDP of the research unit in this study (by incorporating the GDP into the accessibility indicator, economic opportunities are integrated); $t_{ij}$ represents the minimum travel time between units i and j via the railway network (in hours); and α is a shaping parameter. A typical value for α in empirical studies is 1, which is also adopted in this paper.
Similarly, a row vector representing the GDP of all units (of size 1 × n) and a matrix representing the travel time between all unit pairs (of size n × n) are first obtained. $P_i$ is computed by dividing the GDP vector element-wise by the i-th row of the travel time matrix, replacing the i-th entry of the resulting vector with zero, and then summing the resulting vector.
The visualization of accessibility indicators for the research units was performed on a Geographical Information System (ArcGIS) platform.
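The PA computation described in this section can be sketched as below. The function name `potential_accessibility` and the GDP and travel time values are hypothetical. Skipping the term j = i in the sum is equivalent to replacing the i-th entry with zero, and it also avoids dividing by the zero diagonal.

```python
def potential_accessibility(t, gdp, alpha=1.0):
    """Return P_i = sum over j != i of gdp[j] / t[i][j] ** alpha."""
    n = len(t)
    return [
        sum(gdp[j] / t[i][j] ** alpha for j in range(n) if j != i)
        for i in range(n)
    ]

# Hypothetical minimum travel times (hours) and GDP values for three units.
t = [
    [0.0, 1.0, 3.0],
    [1.0, 0.0, 2.0],
    [3.0, 2.0, 0.0],
]
gdp = [10.0, 30.0, 20.0]
pa = potential_accessibility(t, gdp)  # alpha = 1, as adopted in the paper
print(pa)
```

Here the first unit scores highest because it sits one hour from the unit with the largest GDP, which shows how PA rewards proximity to economic mass.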

Overview of Railway Network Expansion
Prior to the foundation of the P.R. of China, the construction of the railway network went through three main stages: emergence, development, and decrease [31]. This study analyzes the expansion and connectivity of the railway network in the P.R. of China after its foundation in 1949. Figure 2 shows the operation mileage and rate of growth. The operation mileage increased from 21,800 km in 1949 to 127,000 km in 2017. The following years, which are marked by special events or relatively high growth rates, are selected to assess the railway network: 1949, 1958, 1966, 1978, 1990, 1997, 2006, 2010, 2014, and 2017.
Table 2 summarizes the increase in passenger stations (nodes) and edges over time. In 1949, there were 668 stations, and the number increased to 2066 by 2017. The percentage of main cities and small research units with stations increased from 36.50% to 86.05% and from 16.47% to 50.04%, respectively. According to graph theory, an edge is defined as a direct link between two stations. China's railway network had 712 edges in 1949, which increased to 2425 edges by 2017. Based on the indicators listed in Section 4.1, the connectivity indicators of the railway network are calculated and listed in Table 3, which presents the railway network expansion trends from 1949 to 2017. Overall, the four indicators (β, µ, α, and γ) all show increasing trends over this period, except for a slight decrease between 1949 and 1958. The annual growth rate after 1990 far exceeded that before 1990. In sum, the connectivity of the Chinese railway network steadily increased. β was always larger than 1, which indicates that the network has developed into a grid network rather than a tree-form network.
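As a quick check of the statement that β stayed above 1, the station and edge counts quoted in the text for 1949 and 2017 can be inserted into β = e/n. This is only an illustrative calculation from those two pairs of numbers; the values in Table 3 are not reproduced here and may have been computed with additional detail.

```python
# Station (node) and edge counts quoted in the text for 1949 and 2017.
counts = {
    1949: {"nodes": 668, "edges": 712},
    2017: {"nodes": 2066, "edges": 2425},
}

# beta = e / n, the average number of edges per node.
beta = {year: c["edges"] / c["nodes"] for year, c in counts.items()}
for year, value in beta.items():
    print(year, round(value, 3))
```

Both ratios exceed 1 (about 1.066 in 1949 and about 1.174 in 2017), which is consistent with a network containing closed loops rather than a pure tree.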
The expansion and evolution of the railway network are shown in Figure 3. The details of the Chinese railway network development are as follows:
In 1949, the railway lines mainly covered the Northeast and North China. A network was then gradually constructed based on the principle of linking the coastal area with inland regions, as well as connecting the administrative center (Beijing) with administrative and economic district centers [4].
1949-1966: The network was reconstructed and extended to the western areas (Gansu, Xinjiang, and Qinghai).
1966-1990: Regional investment policy was unstable. The operation mileage and the number of stations therefore increased slowly, owing to low investment.
1990-1997: The network of North China, Central China, South China, East China, and the Southwest was further improved. Based on the original railways, several corridors formed, stretching from north to south. The connection between regions was greatly strengthened by the inter-region railways.
1997-2006: The Northwest region achieved a notable improvement in terms of both operation mileage and connectivity, and Tibet was also covered by conventional railway in 2006. Over this period, the railway operation speed was raised five times, so the service level improved noticeably. The operation of HSR began in 2003 with the launch of a trial passenger line from Qinhuangdao to Shenyang.
2006-2010: The HSR operation speed was further raised in 2007. HSR infrastructure was constructed and put into operation at a large scale in the Southeast Coast, the Yangtze River Delta, Central China, and North China regions.
2010-2017: The HSR stretched to both the Northwest and the Southwest, inter-regional communication was continuously strengthened, and the four-vertical-and-four-horizontal fast-track railway was basically formed. Eighty-seven percent of HSR mileage is southeast of the Hu line, forming a complex railway network.
The current network exhibits an uneven spatial distribution pattern, with an underdeveloped western region. Restricted by natural conditions and an underdeveloped economy, the railway lines in the northwestern and southwestern regions suffer from poor technical performance and poor connectivity. Furthermore, a railway corridor linking the western area with the rest of China is still lacking.

Accessibility Analysis and Results
The expansion of the railway network has significantly improved the connectivity of county units in China: 50.04% of research units and 86.05% of the main cities were covered by railway stations in 2017. In this section, the current spatial distribution of accessibility is analyzed via TL, WATT, DA, and PA. We also adopt statistical measures such as the mean, maximum, minimum, median, and coefficient of variation (CV) of regional accessibility values to measure the degree of disparity. The results of this analysis can be used to guide policy-making on the balanced development of the railway and the economy.
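The regional summary statistics mentioned above can be sketched with the Python standard library. The function name `regional_summary` and the sample WATT values are hypothetical. The CV is taken as the standard deviation divided by the mean, and the use of the population (rather than sample) standard deviation is an assumption, since the text does not state which one is used.

```python
import statistics

def regional_summary(values):
    """Return mean, maximum, minimum, median and CV of accessibility values."""
    mean = statistics.mean(values)
    return {
        "mean": mean,
        "max": max(values),
        "min": min(values),
        "median": statistics.median(values),
        # Assumption: CV = population standard deviation / mean.
        "cv": statistics.pstdev(values) / mean,
    }

# Hypothetical WATT values (hours) for the units of one region.
region_watt = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
summary = regional_summary(region_watt)
print(summary)
```

A region with a higher CV has accessibility values that are spread more unevenly across its units, which is how the text uses the CV to compare disparity between regions.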

Analysis of the TL Indicator
This study calculates the TL of each railway station in 2017. Zhengzhou East station and Shijiazhuang station had the lowest and second-lowest TL (0.6045 and 0.6164, corresponding to the highest accessibility), while Hetian station and Moyu station in Xinjiang had the highest and second-highest TL (3.1858 and 3.1608, corresponding to the lowest accessibility). Zhengzhou, which has the highest accessibility according to the TL indicator, is the geometric center of the transport network, which is consistent with a previous report [4]. The 100 stations with the highest TL accessibility were located in North China (41%), Central China (37%), East China (21%), and the Northwest (1%). The 100 stations with the lowest accessibility were located in the Northwest (40%), the Northeast (24%), North China (16%), South China (11%), and the Southwest (9%). In summary, the central region of China had the highest accessibility according to the TL indicator.
The TL indicator of the research units was significantly correlated with both GDP and population at the 0.01 level, with correlation coefficients of −0.137 and −0.242, respectively. Although the coefficients are small in magnitude, the negative correlation indicates that regions with higher GDP and a larger population normally have lower TL values, i.e., better accessibility.
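A correlation of this kind can be sketched as a rank correlation. The sketch assumes a Spearman coefficient, as used in the data section; the text does not state which coefficient was used for the TL indicator. The function names and sample values are hypothetical, and no significance test is included.

```python
import math

def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared_rank
        pos = end + 1
    return ranks

def pearson(x, y):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman(x, y):
    """Spearman coefficient: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical TL values and GDP for five units (not data from the study).
tl_values = [0.7, 0.9, 1.1, 1.6, 2.8]
gdp_values = [900.0, 400.0, 650.0, 120.0, 80.0]
rho = spearman(tl_values, gdp_values)
print(round(rho, 3))
```

A negative coefficient, as in this toy data, means that units with higher GDP tend to have lower TL values and therefore better accessibility.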

Analysis of the WATT
In this study, the WATT values of research units were grouped into five classes using the geometrical interval classification. Different classes are distinguished by color. Figure 4 shows the spatial distribution of the WATT for the study areas connected via railway stations. The WATT indicator yielded results similar to those of the TL indicator: the research units with lower WATT are mainly located in the central region of China (e.g., Henan, Anhui, Hubei, Hebei, Jiangsu, and Shandong), exhibiting a 'core-periphery' pattern, i.e., the central region of China (core) has the best accessibility and regions far from the center (periphery) have the worst accessibility. The central regions also have a high population density. Clearly, the Hu line divides China into two parts in terms of railway accessibility. The most accessible research units are located southeast of the Hu line.
The average WATT differed considerably among the seven regions. Ordered from lowest to highest, they were Central China (7.90 h), East China (8.05 h), North China (10.63 h), South China (11.09 h), the Southwest (12.23 h), the Northeast (13.40 h), and the Northwest (14.64 h). The unit with the lowest WATT (and correspondingly the highest accessibility) was Zhengzhou (5.9 h) [4], while the unit with the highest WATT (and the lowest accessibility) was Hetian (39.29 h), a difference of 33.39 h between the most and least accessible research units as measured by WATT. Some units have both HSR stations and conventional stations. While conventional stations serve only conventional trains, most HSR stations serve both conventional and HSR trains. The WATT indicators of those units are listed in Table 4. Compared to conventional stations, the WATT of HSR stations was consistently lower, although the accessibility enhancement provided by HSR stations varied across units. The WATT indicator also indicated that the closer a unit is to the center of the entire railway network, the more accessible it is.

Analysis of the PA Indicator
PA indicates the economic potential of each research unit. Units with higher PA are able to reach other high-GDP units within a short time. The units are divided into five classes using the geometrical interval classification, and the spatial distribution of PA is shown in Figure 6. Compared with the other three indicators (TL, WATT, and DA), PA shows a more concentrated pattern for the most accessible areas. The areas with the highest PA mainly cover the regions surrounding Guangzhou, Shanghai, Beijing, Tianjin, and Wuhan.
featured with a high CV, while East China and Central China have the best accessibility with low CVs, which is in accordance with the result achieved in [3].