Integrating Network Centrality and Node-Place Model to Evaluate and Classify Station Areas in Shanghai

Abstract: Transit-oriented development (TOD) is generally understood as an effective urban design model for encouraging the use of public transportation. Inspired by TOD, the node-place (NP) model was developed to investigate the relationship between transport stations and land use. However, existing studies construct the NP model from statistical attributes alone and ignore travel characteristics, so they arguably cannot capture the complete picture of the stations. In this study, we integrate the NP model and travel characteristics with systematic insights derived from network theory to classify stations. A node-place-network (NPN) model is developed by considering three aspects: land use, transportation, and the travel network. Moreover, the carrying pressure is proposed to quantify the transport service pressure of a station. Taking Shanghai as a case study, our results show that the travel network affects the station classification and highlights the imbalance between the built environment and travel characteristics.


Introduction
Rapid urbanization and population growth worldwide have caused many urban problems, such as air pollution, traffic congestion, and excessive reliance on fossil fuels. An important cause of these problems is the absence of clear and effective integrated development plans for land use and transportation. Transit-oriented development (TOD) has been widely adopted as an urban planning strategy for achieving such integration, promoting efficient and mixed land use near public transportation hubs and stations [1,2]. At the metropolitan scale, metro stations provide greater accessibility to their catchment areas and play a crucial role in the public transport network. Meanwhile, stations offer ample opportunities to develop diverse and intensive land use, with the aim of creating walkable, activity-friendly communities. The implementation of TOD at the station level has been a major focus of scholarly work.
Various researchers have attempted to establish comparable TOD typologies for better strategic planning, investment guidance, and station quantification [3,4]. The best-known approach for evaluating TOD typology is the node-place (NP) model developed by [5], which is essentially a conceptual framework for assessing the transport supply and land use characteristics of a station simultaneously. Based on the NP model, various modifications to the original model have been introduced. The most notable modification is the extension of design dimension, which represents a considerable improvement regarding walkability and pedestrian comfort [6]. Specifically, three aspects need to be investigated to achieve effective integration of transportation and land use: (1) multimodal accessibility of the station, (2) land use diversity and intensity, and (3) pedestrian-oriented design [7]. In a similar vein, Ref. [8] extended the model by adding the experience dimension with indicators, which reflect the transit quality from a traveler's perspective.

1. Combining official data and open geographic big data, an extended node-place-network (NPN) model was proposed to measure the synergy between the built environment and travel characteristics around stations. Moreover, the carrying pressure indicator was utilized for the quantitative evaluation;
2. The proposed analytical methodology was employed to complement empirical knowledge concerning TOD in the station areas of Shanghai, which verified the effectiveness of the proposed NPN model.
The paper is organized as follows. The next section is a literature review of the NP model and relevant theory. Then, we elaborate on the methodology used and a case study of Shanghai. Afterwards, the results are discussed in Section 5, followed by conclusions in Section 6.

The Node-Place Model
In the past, many researchers examined the gaps between public transport supply and demand. Various approaches were employed to assess public transport performance, for example, the network supply model [12], Gini coefficients [13,14], and the node-place model [4].
Among these models, the node-place (NP) model, as the best-known approach leading to a TOD typology, was developed by [4]. It defined the double nature of the station: a node of networks and a place in the city. The founding principle of the NP model is balancing transportation with land use to achieve a state of cooperative development between them. An analysis framework, based on a dual axis of node and place indexes, is used to represent the conceptual model (Figure 1). The node dimension measures the offer of transport services (e.g., number of directions served, daily frequency), reflecting the accessibility of the station area by several transportation modes. The place dimension measures the land use features (e.g., density and diversity of activities) found around a station area, reflecting the potential demand for transport services. By identifying five types of station areas (Figure 1), the conceptual framework provides an integrated quantitative indicator for measuring the integration level of transportation and land use as well as assessing the evolution of a system over time [15,16].
During the last two decades, the dimensions of the NP model have been extended in many diverse aspects [18,19]. One major modification is its expansion to the urban design dimension. Ref. [6] first investigated the usefulness of adding the walking environment evaluation to the NP model as an urban planning tool. Based on that evaluation, Ref. [20] emphasized the functional and morphological interrelation between transportation and urban conditions, and added a third, 'oriented' dimension to quantify the accessibility of station areas. Likewise, Ref. [7] considered different access modes to stations and introduced the design index to measure the influence of the pedestrian accessibility of the station areas. Other feeder modes, such as bicycles and cars, were considered in the work of [21], which focused on geographical contexts characterized by medium or low population densities.
In a similar vein, several extra dimensions, such as 'proximity' [22], 'user-friendliness' [23,24], and the spatial configuration of the street network design [25,26] were also investigated in previous studies.
In addition to the extension of the design dimension, some scholars conducted research from a user-based perspective. For instance, the experience value [8] was added to reflect the traveler's experience at the station in terms of comfort (sheltered waiting, toilets), ambient elements (architecture and recent renovation), and social elements (presence of personnel). The people dimension proposed by [27] includes motivation (education, work, and other), ridership (using the station as an origin or destination), and effort (walking, biking, and travelling a farther distance).
Furthermore, the high salience of transit-related attributes, especially human mobility, has been identified in these studies. Two approaches have been suggested to include travel features in the NP model. The first approach highlights the importance of passenger frequencies [20] and adds passenger volume to the calculation of the node index [26,28]. Given the unique characteristics and importance of travel flow data, the second approach makes a comparative analysis between the travel characteristics and the results of station classification. Following this approach, Ref. [29] explored the intersection of two created classifications for New York City and unveiled the consistency between context and use of the station. Similarly, Ref. [11] added the centrality analysis of the movement network to investigate the performance of the Greater London metro station areas at both local and system levels. They found substantial discrepancies in the criticality of station areas within the same cluster with similar node-place-design characteristics, suggesting the various roles they play at multiple levels.

TOD and the Travel Network
Although improvements of the NP model varied to some extent, they have the common objective to increase transit ridership and create a more sustainable transit system. Compared with other NP indicators, travel flow has its own unique characteristics. First, it is the objective data generated by individual travel, reflecting the actual demand of people. Second, travel distribution reflects the characteristics of the entire travel system rather than those of a local station. Based on that, it is reasonable to think that taking travel flow as an indicator of the node index within the classification of TOD has limitations.
As for the specific index calculation method of the passenger travel dimension, most previous studies have used passenger frequency as an indicator [3,30,31]. One exception is [11], who introduced a strategic network (criticality) component to enhance the value of the NP model from the perspective of network analysis. As pointed out by [32], 'characterizing the network anatomy is important because structure always affects function and vice versa'. The study of networks pervades all of science. In particular, the new wave of research on spatial networks and flows in cities using movement data provides new insights into traffic network systems and urban spatial structure. For example, Refs. [33,34] used three-year smart card data to reveal a polycentric urban development phenomenon in Singapore. Similarly, Ref. [35] constructed a multi-layer network from three sources (i.e., bus GPS observations, smart card data, and roadside Bluetooth detector records) to identify the network structure at each layer and reveal comparable features of the network organization for different spatial layers. In addition, some scholars also used other movement flows, such as bicycle sharing flows and social media check-in streams, to reveal the hierarchical structure of urban space [36,37].
Numerous empirical studies have shown that the importance of nodes and the type of metro stations can be further identified using the metrics of complex network theory [38]. For example, Ref. [39] developed a new weighted composite index to position hub stations in a subway network. Ref. [40] used several improved network indexes to distinguish the key stations and sections of the metro network in Beijing, China. Ref. [41] used human mobility patterns and improved the PageRank algorithm to evaluate station importance.
Consequently, we develop a new model, which is called the node-place-network (NPN) model with an extended network dimension, as a complement to the NP model. Below we discuss how we developed and applied this extended model in a case study of Shanghai.

Methodology
In this section, we first introduce the node-place-network model, including the index selection and the station classification with the K-means++ method. Second, we calculate the carrying pressure to verify the meaning of the travel network dimension. Taking Shanghai as a case study, our methodology is depicted in Figure 2.

Node-Place-Network Model
As discussed in Section 2, the actual travel characteristics of passengers and the importance of the station in the network system were ignored in previous node-place models. Hence, an extended node-place-network model, as shown in Figure 3, is proposed. The node and place dimensions represent the traffic services and land use around the stations, respectively. The network dimension indicates the importance of the station in the whole subway network system.

Index Selection
(1) Network indicators Based on the smart card data (SCD), we first construct an OD (origin-destination) matrix of travel between all metro stations, and then convert it to an undirected weighted network, viewed as a passenger movement network:

$$G = (V, E, W) \qquad (1)$$

where $V$ is the set of vertices (i.e., stations) of network $G$, $E$ is the set of edges of network $G$, and $W$ is the set of weights $w_{ij}$ denoting the volume of travel between stations $i$ and $j$. Degree centrality and betweenness centrality are the two most representative indicators in complex network studies [42,43].
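The construction of the passenger movement network can be sketched as follows. This is a minimal illustration under stated assumptions rather than the processing pipeline actually used: the record layout, the rule that a zero fare marks a boarding tap, and the choice to sum both directions of flow when symmetrizing are all assumptions, and the station names are placeholders.

```python
from collections import defaultdict

# Hypothetical smart card records: (card_id, time, fare, station).
# Assumption: a zero fare marks a boarding tap and a positive fare marks
# the matching alighting tap; the real SCD encoding may differ.
records = [
    ("c1", "08:01", 0.0, "A"), ("c1", "08:25", 4.0, "B"),
    ("c2", "08:05", 0.0, "B"), ("c2", "08:31", 4.0, "A"),
    ("c3", "08:10", 0.0, "A"), ("c3", "08:40", 5.0, "C"),
]

def build_od(records):
    """Pair each boarding with the card's next alighting and count directed OD trips."""
    od = defaultdict(int)
    boarded = {}  # card_id -> origin station of the trip in progress
    for card, _time, fare, station in sorted(records, key=lambda r: (r[0], r[1])):
        if fare == 0:
            boarded[card] = station
        elif card in boarded:
            od[(boarded.pop(card), station)] += 1
    return dict(od)

def to_undirected(od):
    """Sum the flows in both directions so that w_ij = w_ji (an assumed convention)."""
    w = defaultdict(int)
    for (i, j), trips in od.items():
        if i != j:
            w[frozenset((i, j))] += trips
    return dict(w)

od = build_od(records)
weights = to_undirected(od)
for pair, trips in weights.items():
    print(sorted(pair), trips)
```

Keying the undirected weights by an unordered station pair keeps a single entry per edge, which matches the role of $w_{ij}$ as the travel volume between stations $i$ and $j$.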

• Degree centrality
The degree centrality refers to the number of links connected to a node in the network; in our study, it is the number of travel records of a metro station:

$$C_{\mathrm{degree}}(i) = \sum_{j \in N_i} w_{ij} \qquad (2)$$

where $N_i$ is the set of stations adjacent to node $i$, and $w_{ij}$ is the weight between nodes $i$ and $j$, which in our study is the volume of travel flow between stations $i$ and $j$.

• Betweenness centrality
Betweenness centrality measures the level of node connectivity and is a key indicator for identifying metro network hubs. The more shortest paths traverse a node $k$, the higher its centrality $C_{\mathrm{betweenness}}(k)$, which is defined as follows:

$$C_{\mathrm{betweenness}}(k) = \sum_{i \neq j \neq k} \frac{\delta_{ij}(k)}{\delta_{ij}} \qquad (3)$$

where $\delta_{ij}(k)$ is the number of shortest paths between any two stations $i$ and $j$ that pass through station $k$, and $\delta_{ij}$ is the total number of shortest paths between $i$ and $j$. Note that both centrality indicators are calculated on the passengers' movement network rather than on the transit infrastructure network.
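As an illustration of the two network indicators, the following sketch computes the weighted degree and an unnormalized betweenness centrality on a small toy network. It uses Brandes' algorithm with Dijkstra searches. How flow weights are turned into path lengths is not specified in the text, so the edge length 1 / w_ij used here (heavier flow means a shorter edge) is an assumption, as are the toy stations and flows.

```python
import heapq

def weighted_degree(adj):
    """Sum of the flows w_ij over the neighbours of each station."""
    return {i: sum(nbrs.values()) for i, nbrs in adj.items()}

def betweenness(adj):
    """Brandes' algorithm on an undirected graph; adj[i][j] is the flow w_ij.

    Assumption: the length of an edge is 1 / w_ij.
    """
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        dist = {s: 0.0}
        sigma = dict.fromkeys(adj, 0.0)   # number of shortest paths from s
        sigma[s] = 1.0
        preds = {v: [] for v in adj}
        done, order = set(), []
        heap = [(0.0, s)]
        while heap:
            d, v = heapq.heappop(heap)
            if v in done:
                continue                  # stale heap entry
            done.add(v)
            order.append(v)
            for w, flow in adj[v].items():
                nd = d + 1.0 / flow
                if w not in dist or nd < dist[w]:
                    dist[w] = nd
                    sigma[w] = sigma[v]
                    preds[w] = [v]
                    heapq.heappush(heap, (nd, w))
                elif nd == dist[w] and w not in done:
                    sigma[w] += sigma[v]  # another equally short path
                    preds[w].append(v)
        # Accumulate pair dependencies from the farthest node back to s.
        delta = dict.fromkeys(adj, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: b / 2.0 for v, b in bc.items()}

# Toy network: heavy flows A-B and B-C, a light direct flow A-C.
adj = {
    "A": {"B": 10, "C": 1},
    "B": {"A": 10, "C": 10},
    "C": {"A": 1, "B": 10},
}
deg = weighted_degree(adj)
bc = betweenness(adj)
print(deg, bc)
```

In this toy network the shortest path between A and C runs through B, because the two heavy edges are together shorter than the light direct edge, so B is the only station with non-zero betweenness.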
(2) Node indicators Node indicators can be divided into two groups depending on the transit mode: the transit quality of rail and feeder transport (Table 1).

(3) Place indicators
In terms of place indicators, three aspects of the station area are measured following the 'three-Ds' principle, namely density, diversity, and design [2]. Based on previous studies [20,27,44], the indicators of place are summarized in Table 2.
(4) Data normalization The first step in descriptive statistical analysis is data transformation. With formulas (4) and (5), all indicators were Box-Cox transformed [45] and rescaled to reduce the skewness of their univariate distributions and improve their comparability.

$$x_i^{(\lambda)} = \begin{cases} \dfrac{x_i^{\lambda} - 1}{\lambda}, & \lambda \neq 0 \\ \ln x_i, & \lambda = 0 \end{cases} \qquad (4)$$

where $x_i^{(\lambda)}$ is the transformed variable, $x_i$ is the original indicator value, and $\lambda$ is the transformation parameter, ranging from −5 to 5. The optimum value of $\lambda$ is obtained using the maximum likelihood estimate.
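The normalization step can be sketched as follows: a Box-Cox transform whose parameter is chosen by maximizing the profile log-likelihood over a grid on [−5, 5], followed by a rescaling. The grid search stands in for whatever optimizer was actually used, the min-max form of the rescaling is an assumption (the rescaling formula is not given in the text), and the sample values are invented. The data must be strictly positive and not all equal.

```python
import math

def box_cox(x, lam):
    """Box-Cox transform of strictly positive values."""
    return [math.log(v) if lam == 0 else (v ** lam - 1.0) / lam for v in x]

def log_likelihood(x, lam):
    """Profile log-likelihood of the Box-Cox parameter for positive data x."""
    n = len(x)
    y = box_cox(x, lam)
    mean = sum(y) / n
    var = sum((v - mean) ** 2 for v in y) / n
    return -0.5 * n * math.log(var) + (lam - 1.0) * sum(math.log(v) for v in x)

def best_lambda(x, lo=-5.0, hi=5.0, step=0.01):
    """Grid search for the maximum likelihood estimate of lambda."""
    steps = int(round((hi - lo) / step))
    grid = [round(lo + k * step, 10) for k in range(steps + 1)]
    return max(grid, key=lambda lam: log_likelihood(x, lam))

def min_max(y):
    """Rescale to [0, 1]; an assumed form of the rescaling step."""
    a, b = min(y), max(y)
    return [(v - a) / (b - a) for v in y]

# Invented, right-skewed indicator values.
x = [1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
lam = best_lambda(x)
z = min_max(box_cox(x, lam))
print(lam, [round(v, 3) for v in z])
```

Because the Box-Cox transform is monotonically increasing for every value of the parameter, the ranking of stations on an indicator is preserved; only the shape of the distribution changes.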
Then, a two-sided Spearman correlation analysis was conducted to understand the direction and strength of the relations between the indicators. Furthermore, the relative importance of the independent variables was explored with the entropy method [46].
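The entropy method mentioned above can be illustrated with the commonly used entropy weight formulation, in which indicators whose values are spread more unevenly across stations receive larger weights. This is a sketch of that standard formulation, not necessarily the exact variant of [46]; the indicator matrix is invented, and its values are assumed to be non-negative after rescaling.

```python
import math

def entropy_weights(rows):
    """Entropy weights for the columns of a station-by-indicator matrix."""
    n, m = len(rows), len(rows[0])
    divergence = []
    for j in range(m):
        col = [row[j] for row in rows]
        total = sum(col)
        p = [v / total for v in col]                      # share of each station
        e = -sum(q * math.log(q) for q in p if q > 0) / math.log(n)
        divergence.append(1.0 - e)                        # low entropy, high weight
    s = sum(divergence)
    return [d / s for d in divergence]

# Invented matrix: four stations, two indicators.
# The first indicator is nearly uniform, the second is concentrated.
rows = [
    [0.5, 0.1],
    [0.5, 0.9],
    [0.6, 0.1],
    [0.4, 0.1],
]
weights = entropy_weights(rows)
print(weights)
```

A composite index for one dimension can then be formed as the weighted sum of that dimension's rescaled indicators.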

Station Classification
After data preprocessing, the similarity of metro stations was explored by applying the K-means++ method [47]. K-means++ seeds the initial cluster centers so that they tend to be far apart from each other. The silhouette coefficient, which ranges over [−1, 1], is used to evaluate the clustering performance [48].
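The classification step can be sketched with a small, self-contained version of K-means++ seeding, Lloyd iterations, and the mean silhouette coefficient. The station scores are invented, Euclidean distance is assumed, and points in singleton clusters are given a silhouette of 0 by convention.

```python
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeanspp_init(points, k, rng):
    """Pick each new center with probability proportional to its squared
    distance from the nearest center chosen so far."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        d2 = [min(sq_dist(p, c) for c in centers) for p in points]
        centers.append(rng.choices(points, weights=d2, k=1)[0])
    return centers

def kmeans(points, k, seed=0, iters=100):
    rng = random.Random(seed)
    centers = kmeanspp_init(points, k, rng)
    labels = None
    for _ in range(iters):
        new = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        if new == labels:
            break                                   # assignments are stable
        labels = new
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

def silhouette(points, labels):
    """Mean silhouette coefficient over all points."""
    scores = []
    for i, p in enumerate(points):
        by_cluster = {}
        for j, q in enumerate(points):
            if i != j:
                by_cluster.setdefault(labels[j], []).append(sq_dist(p, q) ** 0.5)
        own = by_cluster.pop(labels[i], None)
        if not own or not by_cluster:
            scores.append(0.0)                      # singleton or single cluster
            continue
        a = sum(own) / len(own)                     # mean distance within cluster
        b = min(sum(d) / len(d) for d in by_cluster.values())
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Invented (node, place) scores for six stations.
points = [(0.10, 0.10), (0.12, 0.08), (0.09, 0.11),
          (0.90, 0.90), (0.88, 0.92), (0.91, 0.89)]
labels = kmeans(points, 2)
print(labels, round(silhouette(points, labels), 3))
```

In practice the number of clusters would be chosen by running the procedure for several values of k and keeping the one with the highest silhouette coefficient.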

Carrying Pressure
To verify to what extent the travel network dimension adds meaning to the findings of a traditional NP type of classification, we conduct a comparative analysis based on two classification strategies. The first strategy focuses on the original NP model, which describes the built environment properties around the station. The second strategy focuses on the NPN model, which additionally includes passenger travel characteristics and thus describes the real travel flow distribution of rail transit.
Furthermore, a new index called the carrying pressure of the node was proposed to quantitatively describe the development synergy between the travel flow and the supporting physical environment of the station. The carrying pressure value of the node can be determined using the following formula: where T, the score of the model, is the average of the integrated indices, representing a comprehensive measure of the TOD development status of the station. Index(.) represents the score of one dimension. For example, Index(Node) represents the composite score after weighting the node indicators.
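As a small worked example of the model score T only, the following sketch averages the dimension indexes under each classification strategy. It assumes T is the plain, unweighted mean of the indexes included in the model, and it uses the Jinke station scores reported in the results (network = 0.82, node = 0.55, place = 0.49). It does not attempt the carrying pressure value itself.

```python
def model_score(*indices):
    """T: assumed here to be the unweighted mean of the dimension indexes."""
    return sum(indices) / len(indices)

# Jinke station scores as reported in the results.
node, place, network = 0.55, 0.49, 0.82

t_np = model_score(node, place)             # T(NP): node and place only
t_npn = model_score(node, place, network)   # T(NPN): adds the network index
print(round(t_np, 2), round(t_npn, 2))
```

For this station, adding the network index raises the score from 0.52 to 0.62, which is the kind of shift that can move a station to a different cluster.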

Study Area
China has been experiencing rapid urbanization and a massive expansion of infrastructure construction during the past decades. Shanghai, as China's largest economic center, has a 548 km network, which is officially the largest network by route length in the world [49]. In 2018, the daily ridership of Shanghai Metro exceeded 10 million on weekdays and 7 million on weekends. The total number of subway passengers reached approximately 3.7 billion, which increased 1.62 billion over that of 2017.
According to the Master Plan, officials intend to have over 1000 km of network by track length and over 600 metro stations, forming an extensive network covering the entire city. In particular, the concept of TOD has been introduced and is considered a promising solution by the local government in land use planning. In this regard, it is necessary to have a universal approach to identifying the TOD variations and typology among the metro station areas and to providing general knowledge for understanding the complex relationship between metro trip demands, transportation, and land use. Based on our data source, all 286 metro stations, covering over 600 km that are part of the analysis, are shown on the map (Figure 4).

Data
This study used several data sources, including 642,724 POIs extracted from Gaode Map, LandScan, OpenStreetMap (OSM), government website data, and a month of smart card data (SCD) from Shanghai Metro during March 2018.
The recorded SCD contains detailed information on each trip, including the card ID, time, fare, and the station name. From the fare field, the state of travel can be detected (i.e., whether boarding or alighting). The government website data offers the basic information of the metro network, such as the metro daily frequency, number of directions served by metro, etc. In addition, road-related information, like intersection density and accessible network length, can be obtained from OSM data. The population distribution data was from the LandScan (http://web.ornl.gov/sci/landscan, accessed on 1 September 2020). The description of data can be seen in Appendix A.

Results and Discussion
In this section, we first carried out the variable selection and the comprehensive score calculation for the node, place, and network dimensions. Then, the indexes of NP and NPN were clustered separately to obtain different station categories. Finally, the differences between the station categories under NP and NPN were investigated, and the guidance for individual stations is discussed.

Variables
The direction and strength of correlations between the indicators were examined (Pearson's), and the results show a high and significant correlation between the variables (see Appendix B). Nevertheless, the high correlation between some variables was not considered problematic, as the indexes of each dimension are then given different weights and finally integrated into one index. Table 3 shows the relative importance of independent variables with the entropy method. For the node index, the number of directions served by the metro is the most important variable, which emphasizes the importance of a transfer station. For the place index, the density and diversity of land use have greater weight, followed by the mixing degree of land use, and finally, the density of intersections. This result is consistent with the previous study [50] that the population density and employment density are strongly positively associated with ridership. For the network index, the degree and betweenness indicator have identical weights, which indicates that these two variables are equally important in reflecting the importance of stations.

Indexes
A descriptive statistical analysis of the three dimensions is shown in Table 4. The ranges of the data indicate that the real passenger flow between subway stations varies more (0.046) than the node dimension (0.020). The result indicates that the average network accessibility is not being fully incorporated into transportation development. The geographic distribution of the node and place indexes follows a broadly concentric pattern, generally radiating away from the central area of Shanghai (Figure 5). Stations with better node and place development are mostly located in the central region, while the relatively underdeveloped ones occur on the outskirts of the city, which also reflects the strong monocentric urban structure of Shanghai. Notably, the spatial distribution of the network dimension differs substantially from that of the node and place indexes. Higher values of the network index are no longer located in the central area but appear on the periphery of the city. This suggests that the periphery and suburbs of the city have a high travel demand, whereas supply there is relatively short because the physical environment (including traffic facilities and land use allocation) is underdeveloped. In addition to the spatial analysis of all station areas, the relationships among these indexes were also analyzed to reveal the complex links between stations. As shown in Figure 6, Shanghai metro station areas exhibit substantial differences among these indexes: the positive relationship between node and place (r2 = 0.69) indicates that public transport accessibility has a good synergy with land use patterns. The relationship between the network index and the other two dimensions is relatively weak (0.36 and 0.48), which is consistent with [11] and indicates a significant difference between network centrality and the attributes of local station areas.

Typologies of the NP Model
Through K-means++ clustering, five clusters are chosen, with the best silhouette value of 0.38. The F-statistic of the clusters for the place index is 541, higher than that for the node index, which is 428. The lower descriptive power of the node index indicates that the transportation service level does not differ greatly. The reason for this situation may be that we mainly analyzed the urban area of Shanghai, which has well-developed public transportation.
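The F-statistics quoted for each index can be read as a measure of how strongly that index separates the clusters. The text does not state how they were computed; the sketch below assumes a one-way ANOVA F-statistic, the ratio of between-cluster to within-cluster mean squares, and uses invented index values.

```python
def f_statistic(groups):
    """One-way ANOVA F-statistic for one index, grouped by cluster."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Invented index values for three clusters of three stations each.
groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
f_value = f_statistic(groups)
print(f_value)
```

A larger value means that the cluster means of the index lie far apart relative to the spread within clusters, which is the sense in which an index has higher descriptive power.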
According to the geographic map and the NP scores, the node index and place index in Shanghai are generally in a state of equilibrium (Figure 7 and Table 5). The geographic distribution follows a broadly concentric pattern radiating from the central area of Shanghai. For instance, cluster 1 and cluster 2, the two better performing clusters, are primarily located in central urban areas with good traffic accessibility and diverse land use. They are followed by cluster 3, which is mostly located at the edge of the central city area and has a moderate level of development in terms of urban traffic and land use. The last two clusters, cluster 4 and cluster 5, are located at the edge of the city or at the ends of the rail transit lines.

Typologies of the Node-Place-Network Model
The same cluster analysis was applied to the NPN model, and five clusters are identified (the best silhouette value is 0.32). The F-statistics (network index = 345, place index = 204, and node index = 205) show the higher descriptive power of the network index. Further, among stations with similar built environments, the actual travel flow distribution differs considerably. This result again supports the necessity of travel flow for station classification.
Each of the five categories has its own distinctive characteristics (Table 6 and Figure 8). Cluster 2 (84 stations) contains the most stations and has a balanced development status. Cluster 3 has a network score similar to that of cluster 2, that is, a relatively high network score, but low node and place scores. This is possibly because the land use around these stations is usually dominated by residential types, and more commuters need to travel between workplaces in the city center and residential areas in the suburbs. One typical case is Sijing station (0.32, 0.42, 0.78), a typical residential area with a dense residential population, large travel flows during the peaks, and insufficient facilities.
Cluster 4 (61 stations) and cluster 5 (35 stations) are generally located at the ends of metro lines. Both have a relatively low network score, corresponding to lower demand for transportation. Furthermore, cluster 5 needs to be improved comprehensively in terms of travel demand, transport services, and land supply.
Figure 9 provides an alluvial diagram that depicts how the clusters under the two strategies relate to one another. Between the two classifications, there are thick edges that connect the clusters. This indicates that the set of nodes forming a particular cluster within the NP model constitutes a substantial portion of another cluster within the NPN model, reflecting similar development consistencies under different dimensions. Clusters 1 to 5 are arranged in descending order of average score. A large number of thin edges lead to neighboring cluster IDs, which indicates that some stations show slight differences between the two classifications. However, several thin edges lead to non-neighboring cluster IDs, reflecting significant mismatches between environmental attributes and travel demand. For example, Jinke station (network = 0.82, node = 0.55, place = 0.49), which belongs to cluster 3 under the NP classification, is transferred to cluster 1 under the NPN classification.
To quantitatively describe the development synergy between the travel flow and the physical environment of the station, we compare the carrying pressure within different categories across these two classifications. As shown in Figure 10, the NPN score shows more variability and comparability, while there is no clear distinction in the NP model. The results from the NPN model also clearly show that the nodes in cluster 3 have a higher T(NPN) score, while cluster 5 shows a lower performance in the public transport system. The results of the comparison show the validity of the extended NPN method.
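The comparison behind the alluvial diagram amounts to a cross-tabulation of cluster memberships under the two classifications. The sketch below counts the transitions and flags stations that jump to a non-neighboring cluster ID. The station identifiers and cluster labels are hypothetical and are not taken from the study's tables.

```python
from collections import Counter

# Hypothetical cluster IDs (1 = highest average score, 5 = lowest).
np_labels = {"S1": 1, "S2": 3, "S3": 3, "S4": 4, "S5": 5}
npn_labels = {"S1": 1, "S2": 1, "S3": 3, "S4": 5, "S5": 5}

# Edge widths of the alluvial diagram: stations per (NP, NPN) cluster pair.
flows = Counter((np_labels[s], npn_labels[s]) for s in np_labels)

# Stations whose cluster ID changes by more than one step.
jumps = sorted(s for s in np_labels if abs(np_labels[s] - npn_labels[s]) > 1)

for (src, dst), count in sorted(flows.items()):
    print(f"NP cluster {src} -> NPN cluster {dst}: {count}")
print("non-neighbor transitions:", jumps)
```

Here the hypothetical station S2 moves from cluster 3 to cluster 1, the same kind of transition reported for Jinke station.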

Guidance for Individual Stations
The carrying pressure analysis of the different clusters provides not only a general impression of the station groups but also specific details for the 286 stations in Shanghai (see Table 7). The stations with a high carrying pressure can be characterized as follows: (1) transportation hubs, such as Hongqiao Railway Station and Pudong International Airport, whose large travel demand is reasonable given their dominant position in transportation; (2) large residential districts, such as Jiuting and Guanglan Road stations, which are generally located on the edge of the main urban area or outside the city and serve as sub-centers for future urban development; (3) terminal stations, such as Shenshe Road station, which have a poorly developed physical environment but a large travel demand. Moreover, our findings are consistent with the Shanghai Master Plan 2017-2035. For instance, four integrated and promoted town clusters included in the plan are found in Figure 11 (Areas 1 to 4): Nanxiang-Jiangqiao, Jiuting-Sijing-Dongjing-Xinqiao, Pujiang-Zhoupu-Kangqiao-Hangtou, and Tangzhen-Caolu-Heqing. There are also two prominent clusters, which correspond to the planned central towns, Jiading and Nanhui.
Figure 11. Carrying pressure of stations.

Conclusions
In this paper, we extended the existing node-place (NP) model by adding a third dimension, the network dimension, which results in a new model, the node-place-network (NPN) model. The NPN model can be used to assess the extent of coordination among land use, public transport, and travel demand. An empirical case study of Shanghai demonstrated the effectiveness of our method. Three key findings are obtained. First, the results show that the TOD guidelines have already been implemented in Shanghai, although there is also a large mismatch between the travel demands and the surrounding environment of stations. This result is largely due to the complex relationship between transport systems and travel patterns. Second, through a comparative analysis of the discrepancies between the two classifications, the stations that may need improvement are identified. Third, through the index of the carrying pressure, the matching degree of travel characteristics and environmental development can be quantitatively evaluated, which provides a guideline for the development of multi-center cities.
Our results also provide recommendations for policy making. The stations with a relatively high carrying pressure have the potential for densification and diversification to increase land use efficiency. We recommend that planners apply our methodology to select potential places as sub-centers. Moreover, the reasons why stations have a high carrying pressure are worth monitoring. Heavy commuting burdens pose a major challenge to improving the quality of public transportation. In particular, the rapid growth of urban populations has placed enormous pressure on transportation services and land use allocation [51].
This study can be extended in several directions in future investigations. First, we only used the entire travel records during a day, while a more fine-grained temporal resolution, such as a 1-h interval, can be considered. In other contexts, Ref. [29] used smart card data to conduct station classification, which brings finer resolution for boarding and alighting information. Second, we chose a buffer zone of 600 m in this study, which is the service scope planned by the Shanghai Municipal Government. Due to the presence of the modifiable areal unit problem (MAUP) [52,53], the influence of different research scopes and research scales can be considered in the future. Third, different methods of index weighting can be adopted, e.g., the fuzzy approach and ANN [54,55]. Furthermore, as Shanghai is a monocentric city, further research will be carried out in cities with polycentric development to verify the performance of the proposed method.

Data Availability Statement:
Restrictions apply to the availability of these data. POI data was obtained from Gaode Map (https://lbs.amap.com/, accessed on 1 September 2020). OpenStreetMap (OSM) data was obtained from https://www.openstreetmap.org/ (accessed on 1 September 2020). Subway operation data is from the government's official website (http://service.shmetro.com/, accessed on 1 September 2020).

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A. Data Description
Approximately one million POI records for 2018 were acquired from Gaode Map (https://www.amap.com, accessed on 1 September 2020), with detailed information on addresses, titles, and spatial coordinates. They were coded into several levels of catalogs, depending on their attributes. Referring to the urban land classification and planning construction land standard, we reclassified the POIs and removed those with low public awareness, such as public toilets, kiosks, and house numbers. After deletion and integration, the raw data were reclassified into commercial service, industry, residence, green space, public services, and transportation, as shown in Figure A1, which yielded 642,724 POI data points (Table A1).

The recorded SCD contain detailed information on each trip, including the card ID, time, fare, and station name (Table A2). From the fare field, the state of travel can be detected (i.e., whether boarding or alighting). Based on the travel time, we can construct a travel chain for every user, which contains the origin and destination stations of each trip. As shown in Figure A2, more than one million trips were carried by Shanghai's metro system during the morning peak hour.
The government website data offer basic information on the metro network, such as the daily metro frequency and the number of directions served by metro. Road-related information, such as intersection density and accessible network length, can be obtained from OSM data. The population distribution data were obtained from LandScan (http://web.ornl.gov/sci/landscan/, accessed on 1 September 2020).
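The travel-chain construction described above can be sketched as follows. This is a minimal illustration, not the processing pipeline used in the study: the tuple layout and the function name `build_trips` are hypothetical, and the rule that a zero fare marks a boarding while a positive fare marks an alighting is an assumption about how the fare field encodes the travel state.

```python
from collections import defaultdict

def build_trips(records):
    """Pair each boarding with the next alighting of the same card.

    records: iterable of (card_id, time, station, fare) tuples, with time
    as a sortable value such as an "HH:MM:SS" string.
    Assumption (not confirmed by the source): fare == 0 marks a boarding
    and fare > 0 marks an alighting.
    """
    by_card = defaultdict(list)
    for card_id, time, station, fare in records:
        by_card[card_id].append((time, station, fare))

    trips = []
    for card_id, events in by_card.items():
        events.sort(key=lambda e: e[0])  # chronological order per card
        origin = None
        for time, station, fare in events:
            if fare == 0:
                origin = (time, station)  # a repeated boarding replaces the old one
            elif origin is not None:
                # (card, origin station, destination station, start, end)
                trips.append((card_id, origin[1], station, origin[0], time))
                origin = None
            # an alighting with no preceding boarding is dropped
    return trips
```

Records that cannot be paired (a boarding without an alighting, or the reverse) are discarded in this sketch; a real pipeline may need a different policy for incomplete chains.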

Directed Weighted Network Analysis
To estimate the actual distribution of travel flow in the Shanghai subway system, we used SCD to construct a directed weighted travel network. This network consists of 286 nodes (stations) and 78,375 edges. The average clustering coefficient is 0.946, which is much larger than the value for the random network (0.5); this indicates small-world characteristics, and the degree distribution obeys a power law. As shown in Figure A3, the node with the highest weighted degree is People's Square station, which has the largest travel flow throughout the day. This is because People's Square station is the transfer station of Shanghai Metro Line 1, Line 2, and Line 8, and is also the central point of the Shanghai metro network structure. Moreover, People's Square is the political, economic, cultural, tourism, and transportation hub of Shanghai. The next highest weighted degree belongs to Shanghai Railway Station, which is the largest transit hub in Shanghai. Jing'an Temple, Xujiahui, East Nanjing Road, and Zhongshan Park are not only transfer stations but also have plenty of commercial, residential, and recreational facilities. Lujiazui is well known as one of the top business circles, with excellent business facilities, and Xinzhuang is a mature leisure area.
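As an illustration of how such a network and its weighted degree can be derived from origin-destination pairs, a minimal sketch in plain Python follows. The function names are hypothetical, the weighted degree is taken here as the sum of incoming and outgoing flow, and dropping trips that start and end at the same station is an assumption; the study may define these details differently.

```python
from collections import Counter

def build_flow_network(od_pairs):
    """Aggregate (origin, destination) trips into directed edge weights."""
    edges = Counter()
    for origin, dest in od_pairs:
        if origin != dest:  # assumption: same-station trips are ignored
            edges[(origin, dest)] += 1
    return edges

def weighted_degree(edges):
    """Sum of outgoing and incoming flow for every station."""
    strength = Counter()
    for (origin, dest), weight in edges.items():
        strength[origin] += weight  # outgoing flow
        strength[dest] += weight    # incoming flow
    return strength

# Toy trips with invented counts, for illustration only.
od_pairs = [
    ("Xinzhuang", "People's Square"),
    ("Xinzhuang", "People's Square"),
    ("People's Square", "Xinzhuang"),
    ("Sijing", "People's Square"),
]
ranking = weighted_degree(build_flow_network(od_pairs)).most_common()
```

Here `ranking` lists stations from the highest to the lowest weighted degree, which is the ordering that a figure such as Figure A3 visualizes.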
The directed weighted network of travel flow shows that stations on the periphery of Shanghai are closely connected with stations in the central area, while some regional central stations are distributed along the inner ring road and share part of the passenger volume. Direct evidence is given in Table A3, which shows that the top station interaction pair is Sijing and Caohejing Development Zone. Sijing is located outside the inner ring line, while Caohejing Development Zone station is inside it. This also indicates that there remains a high demand for travel in the surrounding areas of Shanghai, with substantial interaction intensity between stations.
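A ranking of station interaction pairs of the kind reported in Table A3 could be computed from the edge weights as sketched below. The function name `top_station_pairs` is hypothetical, and summing the flow over both directions of a pair is an assumption, since the source does not state whether the table is directional.

```python
from collections import Counter

def top_station_pairs(edges, k=3):
    """Rank unordered station pairs by flow summed over both directions.

    edges: mapping {(origin, destination): trip_count}.
    Assumption: interaction intensity is the two-way total flow.
    """
    pair_flow = Counter()
    for (origin, dest), weight in edges.items():
        pair = tuple(sorted((origin, dest)))  # same key for A->B and B->A
        pair_flow[pair] += weight
    return pair_flow.most_common(k)
```

If the interaction is meant to be directional, the same ranking can be obtained by sorting the directed edge weights without merging the two directions.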