The spatial behavior of tourists can be understood as the set of travel choices that tourists make, comprising destinations visited in a particular sequence and direction and the connections between them. Tourist behavior therefore has both network and spatial attributes. Research on the network trajectory highlights the critical nodes in travel choice and the relationships among the nodes, whereas research on the spatial trajectory highlights how travel choices relate to urban space and clearly shows the spatial pattern outlined by travel behavior. This research therefore examines tourist trajectories in two dimensions: network relationships and spatial relationships.
The study of tourist trajectories differs from monitoring tourist behavior in terms of total quantity. It starts with individual tourists and arranges the statistics in a precise order of travel time and activities [64] (p. 28) to simulate the behavioral characteristics of urban tourists before and after the outbreak. Early, traditional simulations of human activities used three types of methods: the conversion method, the conventional and mature selection method, and the conventional and cultural transmission structure method. However, most traditional research methods take individual urban residents or families as sample objects to investigate discrete events within a certain period. They rely on a large number of social surveys, diary surveys, or interviews. The sample size is small but the field requirements are large, which makes these methods unsuitable for investigating the travel behavior of urban tourists in the context of the epidemic.
In contrast, Social Network Analysis (SNA), which originated in sociology, does not require additional fields in its research data. It is only necessary to obtain a large quantity of tourist-based information for a certain period, from which the behavior network of the actors can be constructed. It therefore gradually became a popular social science research paradigm after the 1960s [65]. This article applies SNA methods to study the spatial behavior of urban tourists. The urban attractions are regarded as nodes in the network, and the number and sequence of visitors to each node are used as the values for calculating the strength of the connections between the nodes. The author then loads the results of the data analysis into ArcGIS for visual analysis. The study thus builds two main research frameworks, the tourist travel contact trajectory and the spatial trajectory, to explore how the COVID-19 epidemic affected urban tourists’ spatial behavior (Figure 2).
2.3.1. Comparative Group
Put simply, this study examines the different behavioral characteristics and spatial representations of tourists across dynamic scales of time and space, with the epidemic as the primary external constraint on behavior choices. Setting up appropriate Comparative Groups covering the three periods before, during, and after the outbreak is therefore one of the keys to this study.
Affected by COVID-19, Jiangsu Province, where Nanjing is located, launched the “Level One Response to Public Health Emergencies” mechanism at 24:00 on 24 January 2020, and all types of external traffic were suspended. Shopping malls, entertainment venues, scenic spots, and other consumer spaces ceased operations. On 24 February 2020, Jiangsu Province downgraded its epidemic prevention and control response from Level One to Level Two. Subsequently, Nanjing issued the “Notice on Printing and Distributing Guidelines for Further Optimizing Epidemic Prevention and Control Measures in the Catering Industry and Accelerating the Resumption of Work and Resumption of Work” and successively lifted the business restrictions on consumer spaces from 3 March. Therefore, in this study, the period from 24 January to 3 March 2020 was taken as the “during” comparative group.
Ideally, the comment data for the whole year before COVID-19 would be captured and compared with those for the whole of the following year. However, because of a data storage limit on the Ctrip.com website, at most 3000 comments are available for each attraction. This means that the historical comment data of popular attractions are likely to be overwritten more quickly than those of other attractions. The data therefore need to be examined month by month to guarantee that the captured data cover all attractions and comments within the same selected period. In addition, tourism is a sensitive activity that is easily affected by various external constraints and has specific seasonal characteristics. In China, the arrangement of short and long vacations, summer vacations, and other holidays profoundly affects the formation of the off-season and peak seasons of the urban tourism industry. Therefore, to make the research results more representative and significant, this article sets up another two Comparative Groups around the National Day holiday period. The period from October to the end of December 2019 is regarded as the early stage before the outbreak in Nanjing, and the period from October to the end of December 2020 is regarded as the later stage after the outbreak in Nanjing.
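The three periods can be encoded as date ranges so that each comment is assigned to one Comparative Group. The following is a minimal sketch, not the authors' code: the labels `before`, `during`, and `after`, the function name `assign_group`, and the choice to treat both end dates as inclusive are assumptions made for illustration.

```python
from datetime import date

# Periods as stated in the text; labels and inclusive bounds are assumptions.
GROUPS = {
    "before": (date(2019, 10, 1), date(2019, 12, 31)),
    "during": (date(2020, 1, 24), date(2020, 3, 3)),
    "after": (date(2020, 10, 1), date(2020, 12, 31)),
}


def assign_group(comment_date):
    """Return the Comparative Group label for a comment date, or None."""
    for label, (start, end) in GROUPS.items():
        if start <= comment_date <= end:
            return label
    return None  # the comment falls outside all three periods
```

Comments that return `None` would simply be left out of the comparison.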
2.3.3. Research on Contact Trajectory of Tourists
The attractions in Nanjing are regarded as nodes in the network. The attractions commented on by each tourist (review ID) are defined as the nodes that the tourist visited in succession, and the order of the comments is defined as the order of arrival. Using Python, the visits of all tourists in each of the three Comparative Groups are accumulated to construct a two-way matrix of the number of visits between Nanjing’s attractions (Equation (1)), which is then used to calculate the Node Connection Strength (NCS), Node Degree Centrality (NDC), and Node Betweenness Centrality (NBC) in the network centrality analysis.
(1) Node Connection Strength
In simple terms, the strength of the connection between nodes is the flow of tourists who choose to travel from one attraction to the next. Constructing a two-way matrix determines the direction of choice between two attractions and identifies the initiator and the recipient of each connection. Although the NCS can clearly show tourist routes with high contact strength, the nodes at either end of a high-strength route do not necessarily have high centrality. It is therefore necessary to further calculate the degree centrality of the nodes in the network.
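The construction of the two-way matrix can be sketched as follows. This is an illustrative sketch only: the input layout (a mapping from review ID to the ordered list of attractions), the function name `build_flow_matrix`, and the decision to ignore consecutive comments on the same attraction are assumptions, and the attraction labels A, B, and C are placeholders rather than data from the study.

```python
from collections import defaultdict


def build_flow_matrix(visit_sequences):
    """Count directed moves between consecutively visited attractions.

    visit_sequences maps a review ID to the ordered list of attractions
    that the tourist commented on (hypothetical input layout).
    """
    flows = defaultdict(int)
    for sequence in visit_sequences.values():
        # Pair each attraction with the one visited immediately after it.
        for origin, destination in zip(sequence, sequence[1:]):
            if origin != destination:  # assumption: skip repeated comments
                flows[(origin, destination)] += 1
    return dict(flows)


# Placeholder sequences for three tourists.
sequences = {
    "u1": ["A", "B", "C"],
    "u2": ["A", "B"],
    "u3": ["C", "A"],
}
ncs = build_flow_matrix(sequences)
```

Each key is an (initiator, recipient) pair, so `ncs[("A", "B")]` and `ncs[("B", "A")]` are counted separately, which preserves the direction of choice.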
(2) Degree centrality
Centrality is one of the key concepts of SNA. The greater a node’s centrality, the greater the “power” that the node has in the entire network. Bavelas was the first to conduct innovative research on the formal characteristics of centrality, verifying that the closer an actor is to the center of the network, the greater its influence [66]. The centrality of a network is generally measured by two indicators: “Node Degree Centrality (NDC)” and “Network Centrality (NC)”. The former measures the centrality of a node, and the latter measures the centrality trend of the overall network.
The measurement of NDC is divided into absolute centrality and relative centrality. The two are logically consistent, but relative centrality standardizes the result of absolute centrality so that nodes in different networks can be compared with one another. Because this article compares three time-based Comparative Groups, the Normalized Node Degree Centrality (NrNDC) is selected to measure the centrality of the nodes in the network. The calculation formula is as follows:

$$C'_D(i) = \frac{\sum_{j=1}^{n} x_{ij}}{n - 1} \qquad (2)$$

In Formula (2), $C'_D(i)$ is the NrNDC of node i; $\sum_{j=1}^{n} x_{ij}$ represents the sum of the number of relations between point i and any other point j in the network, that is, the degree of node i; n represents the overall network scale, that is, the number of all nodes in the network; and n − 1 represents the number of remaining nodes in the network excluding node i, which is used in the standardization process.
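A small sketch of the normalized degree calculation, dividing each node's degree by n − 1, is given below. It assumes that a relation between two attractions is counted once regardless of direction or visit count (a binary, undirected tie); the text does not state this, so it is an assumption, as are the function name and the placeholder data.

```python
def normalized_degree_centrality(flows, nodes):
    """Degree of each node divided by n - 1.

    flows maps (origin, destination) pairs to visit counts. Assumption:
    a tie exists between two attractions if at least one visit links
    them in either direction.
    """
    neighbours = {node: set() for node in nodes}
    for (origin, destination), count in flows.items():
        if count > 0 and origin != destination:
            neighbours[origin].add(destination)
            neighbours[destination].add(origin)
    n = len(nodes)
    return {node: len(adjacent) / (n - 1) for node, adjacent in neighbours.items()}


# Placeholder flows among four attractions.
flows = {("A", "B"): 2, ("B", "C"): 1, ("C", "A"): 0, ("A", "D"): 3}
nrndc = normalized_degree_centrality(flows, ["A", "B", "C", "D"])
```

Here A and B are each tied to two of the three other attractions, while C and D are each tied to one.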
Formula (2) measures the degree centrality of each node in the network; it is also necessary to measure the centrality trend of the overall network. The higher the value of NC, the more strongly the network converges on its high-centrality nodes, and the more the actors’ activity range is concentrated toward the center of the network. The formula for calculating NC is as follows:

$$C = \frac{\sum_{i=1}^{n} \left(C_{\max} - C_i\right)}{n - 2} \qquad (3)$$

In Formula (3), $C$ is the value of NC, $C_{\max}$ is the value of the node with the highest relative degree centrality in the network, and $C_i$ is the relative degree centrality of the other points in the network. $\sum_{i=1}^{n} (C_{\max} - C_i)$ is the sum of the differences between the highest relative centrality and the relative centrality of the other points in the network. n represents the overall network scale, that is, the number of all nodes in the network. Freeman confirmed that the maximum value of this sum for absolute degree centrality is $(n - 1)(n - 2)$ in social network analysis [67]; n − 1 represents the number of remaining nodes in the network excluding the node itself, which is used in the index standardization process and leaves n − 2 in the denominator.
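The behavior of the network-level index can be illustrated with two extreme cases. The sketch below assumes Freeman's form for relative degree centrality, in which the summed gaps to the most central node are divided by n − 2; the function name and the two toy networks are hypothetical.

```python
def degree_centralization(relative_centrality):
    """Sum of gaps to the most central node, divided by n - 2.

    Assumes the input values are relative (normalized) degree
    centralities, as in Freeman's formulation.
    """
    n = len(relative_centrality)
    if n < 3:
        raise ValueError("centralization needs at least three nodes")
    highest = max(relative_centrality.values())
    total_gap = sum(highest - value for value in relative_centrality.values())
    return total_gap / (n - 2)


# A star (one hub tied to every other node) and a ring (all nodes alike).
star = {"hub": 1.0, "a": 1 / 3, "b": 1 / 3, "c": 1 / 3}
ring = {"a": 2 / 3, "b": 2 / 3, "c": 2 / 3, "d": 2 / 3}
star_nc = degree_centralization(star)
ring_nc = degree_centralization(ring)
```

The star reaches the maximum of the index and the ring the minimum, which matches the reading that a higher NC means activity is more concentrated on a few central attractions.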
(3) Betweenness centrality
Degree centrality, at the node and network levels, identifies the nodes that hold dominant “power” in the network. In tourism activities, however, there is another kind of node: when tourists travel from attraction A to attraction B, they must pass through an intermediate attraction C. In other words, the activity connection between attractions A and B depends on node C. Node C therefore also has a certain degree of centrality in the overall network, called Node Betweenness Centrality (NBC). As with degree centrality, the measure of betweenness centrality is divided into NBC and overall Network Betweenness (NB). The formula for calculating NBC is as follows:

$$C_B(i) = \frac{2 \sum_{j<k} b_{jk}(i)}{(n - 1)(n - 2)}, \qquad b_{jk}(i) = \frac{g_{jk}(i)}{g_{jk}} \qquad (4)$$

In Formula (4), $C_B(i)$ represents the betweenness centrality of node i, $g_{jk}$ is the number of shortest network paths between nodes j and k, and $g_{jk}(i)$ represents the number of those paths that pass through node i, so $b_{jk}(i)$ can represent the probability that node i is on the network path between nodes j and k. Freeman confirmed that the maximum value of the betweenness centrality of a node in social network analysis is $(n - 1)(n - 2)/2$ [67], where n − 1 represents the number of remaining nodes in the network excluding the node itself, which is used in the index standardization process. In summing $b_{jk}(i)$, it is necessary to ensure that j < k; otherwise, the same pair of nodes may be counted twice.
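Node betweenness can be computed without enumerating every path by using Brandes' accumulation algorithm, which is a swapped-in technique here; the text does not say how the values were computed. The sketch assumes an undirected, unweighted network given as a hypothetical adjacency mapping, and normalizes by the maximum (n − 1)(n − 2)/2.

```python
from collections import deque


def betweenness_centrality(adjacency):
    """Normalized betweenness for an undirected, unweighted graph.

    adjacency maps each node to the set of its neighbours.
    """
    nodes = list(adjacency)
    n = len(nodes)
    if n < 3:
        raise ValueError("betweenness needs at least three nodes")
    score = {node: 0.0 for node in nodes}
    for source in nodes:
        # Breadth-first search that counts shortest paths from source.
        stack = []
        predecessors = {node: [] for node in nodes}
        paths = {node: 0 for node in nodes}
        distance = {node: -1 for node in nodes}
        paths[source], distance[source] = 1, 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adjacency[v]:
                if distance[w] < 0:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    paths[w] += paths[v]
                    predecessors[w].append(v)
        # Accumulate dependencies from the farthest nodes back to source.
        dependency = {node: 0.0 for node in nodes}
        while stack:
            w = stack.pop()
            for v in predecessors[w]:
                dependency[v] += paths[v] / paths[w] * (1 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    # Every unordered pair was reached from both ends, so halve the total
    # (the j < k condition), then divide by the maximum (n - 1)(n - 2) / 2.
    scale = (n - 1) * (n - 2) / 2
    return {node: score[node] / 2 / scale for node in nodes}


# Placeholder chain A - C - B: every A-to-B trip must pass through C.
chain = {"A": {"C"}, "C": {"A", "B"}, "B": {"C"}}
nbc = betweenness_centrality(chain)
```

In this chain, C lies on the only path between A and B and receives the maximum value, while A and B lie on no path between other nodes.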
The formula for calculating NB is as follows:

$$C_B = \frac{\sum_{i=1}^{n} \left(C_{B\max} - C_B(i)\right)}{n - 1} \qquad (5)$$

In Formula (5), $C_B$ represents the betweenness centrality of the overall network. The full calculation is complicated, and the formula is shown above in its simplified form. $C_{B\max}$ represents the maximum value of the betweenness centrality of the nodes in the network, $C_B(i)$ represents the betweenness centrality values of the other nodes, excluding the maximum value, and n represents the number of all nodes in the network.
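A short worked check may help show how the network-level index behaves. It assumes Freeman's simplified form, in which the summed gaps in relative betweenness are divided by n − 1, and uses a hypothetical star network of n = 4 attractions in which the hub has relative betweenness 1 and the three outer attractions have 0.

```latex
C_B = \frac{\sum_{i=1}^{n}\bigl(C_{B\max} - C_B(i)\bigr)}{n - 1}
    = \frac{(1 - 1) + (1 - 0) + (1 - 0) + (1 - 0)}{4 - 1}
    = \frac{3}{3} = 1
```

Under the same assumption, a ring of attractions in which every node has the same betweenness gives a numerator of zero and therefore an NB of 0, so the index ranges from a fully even network to one that depends entirely on a single intermediary.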
2.3.4. Research on the Spatial Trajectory of Tourists
Although the SNA method can clearly show the centrality of each node in the network, the connection strength between the nodes, and the order in which they are selected, it does not contain the geographic coordinates of the nodes and therefore cannot show how the pattern changes in geographic space. It is thus necessary to load the SNA results into ArcGIS to analyze the spatial pattern. The author uses the Origin–Destination (OD) model and the Standard Deviation Ellipse (SDE) model to deduce the changes in the travel trajectories of tourists before, during, and after the epidemic and the range of their activities in the urban space.
(1) Spatial trajectory pattern
The SDE model calculates the mean center, the distribution direction, and the range of the spatial behavior of tourists. From the preceding SNA, the NrNDC and NBC of the attractions in Nanjing before, during, and after the epidemic were obtained and used as the weight coefficients of the SDE analysis, yielding a tourism spatial pattern map of Nanjing based on the trajectory of tourists’ travel choices. The SDE is generally calculated in three steps, in which the center of the ellipse, the rotation angle, and the lengths of the long and short axes are calculated separately. The calculation formula for the center of the ellipse is as follows:

$$\bar{X}_w = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}, \qquad \bar{Y}_w = \frac{\sum_{i=1}^{n} w_i y_i}{\sum_{i=1}^{n} w_i} \qquad (6)$$

In Formula (6), $\bar{X}_w$ and $\bar{Y}_w$ are the x- and y-coordinates of the center of the ellipse, that is, the arithmetic mean center point of all attractions under the weight w; $w_i$ represents the weight of attraction i, that is, the value of its degree centrality or betweenness centrality; and $x_i$ and $y_i$ are the spatial position coordinates of each attraction.
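The weighted mean center can be sketched in a few lines. The function name, the input layout (a list of coordinate pairs with a parallel list of weights), and the sample values are hypothetical; in the study the weights would be the centrality values of the attractions.

```python
def weighted_mean_center(points, weights):
    """Return the weighted arithmetic mean of the x- and y-coordinates."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    center_x = sum(w * x for (x, _), w in zip(points, weights)) / total
    center_y = sum(w * y for (_, y), w in zip(points, weights)) / total
    return center_x, center_y


# Placeholder coordinates; the third point carries twice the weight.
center = weighted_mean_center([(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)], [1.0, 1.0, 2.0])
```

The center is pulled toward the third point because of its larger weight, which is how a highly central attraction shifts the center of the ellipse.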
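Subsequently, the direction of the ellipse needs to be calculated, taking true north as 0 degrees and the x-coordinate axis as the reference. The calculation formula is as follows:

$$\tan\theta = \frac{\left(\sum_{i=1}^{n} \tilde{x}_i^2 - \sum_{i=1}^{n} \tilde{y}_i^2\right) + \sqrt{\left(\sum_{i=1}^{n} \tilde{x}_i^2 - \sum_{i=1}^{n} \tilde{y}_i^2\right)^2 + 4\left(\sum_{i=1}^{n} \tilde{x}_i \tilde{y}_i\right)^2}}{2 \sum_{i=1}^{n} \tilde{x}_i \tilde{y}_i} \qquad (7)$$

where $\tilde{x}_i = x_i - \bar{X}_w$, $\tilde{y}_i = y_i - \bar{Y}_w$.

In Formula (7), $\tan\theta$ is the tangent of the ellipse’s clockwise rotation angle from true north, where $\bar{X}_w$ and $\bar{Y}_w$ are the x- and y-coordinates of the arithmetic mean center point of all attractions under the weight w, calculated as in Formula (6).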
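Finally, the x- and y-axis lengths of the ellipse need to be calculated. The formula is as follows:

$$\sigma_x = \sqrt{\frac{\sum_{i=1}^{n} \left(\tilde{x}_i \cos\theta - \tilde{y}_i \sin\theta\right)^2}{n}}, \qquad \sigma_y = \sqrt{\frac{\sum_{i=1}^{n} \left(\tilde{x}_i \sin\theta + \tilde{y}_i \cos\theta\right)^2}{n}} \qquad (8)$$

In Formula (8), $\sigma_x$ is the length of the x-axis of the ellipse, and $\sigma_y$ is the length of the y-axis of the ellipse; the values of $\sin\theta$ and $\cos\theta$ can be calculated from $\tan\theta$ by the trigonometric formulas; $\tilde{x}_i$ and $\tilde{y}_i$ are the deviations of each attraction’s coordinates from the weighted mean center, calculated as in Formula (7).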
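The three steps can be combined in one sketch. It is written under stated assumptions rather than as a reproduction of the ArcGIS tool: the weights enter only through the mean center, the axis lengths are plain standard deviations of the rotated deviations (no additional scaling factor), and the rotation angle is obtained with `math.atan2` so that a zero denominator does not cause a division error. The function name and sample points are hypothetical.

```python
import math


def standard_deviational_ellipse(points, weights):
    """Return (center, theta in radians, sigma_x, sigma_y)."""
    total = sum(weights)
    cx = sum(w * x for (x, _), w in zip(points, weights)) / total
    cy = sum(w * y for (_, y), w in zip(points, weights)) / total
    # Deviations of every attraction from the weighted mean center.
    dx = [x - cx for x, _ in points]
    dy = [y - cy for _, y in points]
    sxx = sum(d * d for d in dx)
    syy = sum(d * d for d in dy)
    sxy = sum(a * b for a, b in zip(dx, dy))
    # tan(theta) = (A + B) / C with A = sxx - syy and C = 2 * sxy.
    a = sxx - syy
    b = math.sqrt(a * a + 4 * sxy * sxy)
    theta = math.atan2(a + b, 2 * sxy)
    n = len(points)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    sigma_x = math.sqrt(sum((u * cos_t - v * sin_t) ** 2 for u, v in zip(dx, dy)) / n)
    sigma_y = math.sqrt(sum((u * sin_t + v * cos_t) ** 2 for u, v in zip(dx, dy)) / n)
    return (cx, cy), theta, sigma_x, sigma_y


# Three equally weighted placeholder points lying on an east-west line.
center, theta, sigma_x, sigma_y = standard_deviational_ellipse(
    [(-2.0, 0.0), (0.0, 0.0), (2.0, 0.0)], [1.0, 1.0, 1.0]
)
```

For points spread along an east-west line, the angle comes out as a quarter turn clockwise from north, one axis carries all the spread, and the other collapses to zero.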
(2) Spatial trajectory changes
Python is used to construct the OD table (Table 2), with fields for the origin, the destination, and the number of visits, from the two-way matrix constructed in the SNA. Each origin–destination connection in the table is taken as a spatial trajectory. Comparing the data before, during, and after the epidemic yields three types of trajectory: increased trajectories, decreased trajectories, and unchanged trajectories whose values have changed.
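The OD table and the comparison between two periods can be sketched as follows. This is an illustration under assumptions: "increased" is read as a trajectory that appears only in the later period, "decreased" as one that appears only in the earlier period, and the third type as one present in both periods with a different number of visits. The function names and the sample flows are hypothetical.

```python
def od_table(flows):
    """Flatten a flow matrix into (origin, destination, visits) rows."""
    return [(o, d, v) for (o, d), v in sorted(flows.items()) if v > 0]


def compare_trajectories(earlier, later):
    """Classify trajectories between two periods.

    Returns (added, removed, changed), where changed maps each shared
    trajectory to the difference in its number of visits.
    """
    earlier_pos = {pair: v for pair, v in earlier.items() if v > 0}
    later_pos = {pair: v for pair, v in later.items() if v > 0}
    added = sorted(set(later_pos) - set(earlier_pos))
    removed = sorted(set(earlier_pos) - set(later_pos))
    changed = {
        pair: later_pos[pair] - earlier_pos[pair]
        for pair in set(earlier_pos) & set(later_pos)
        if later_pos[pair] != earlier_pos[pair]
    }
    return added, removed, changed


# Placeholder flow matrices for two Comparative Groups.
before = {("A", "B"): 5, ("B", "C"): 2, ("C", "A"): 1}
during = {("A", "B"): 1, ("C", "A"): 1, ("A", "C"): 3}
rows = od_table(before)
added, removed, changed = compare_trajectories(before, during)
```

Each row of `rows` corresponds to one line of an OD table, and the three returned collections can be joined to attraction coordinates for mapping.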