3.3.1. Social Network Analysis
A social network is a collection of social actors and the relationships among them. Social network analysis is mainly used to describe the characteristics and types of these relationships and to analyze their influence on the network [45]. It is now widely used in sociology, management, and economics. Tourist flows link tourism destinations into a spatial network made up of tourism nodes and the connections between them. Social network analysis combines graph theory with mathematical models to study such network relationships: it can describe the overall spatial distribution pattern of tourism flows and also reveal the intrinsic connections between tourism nodes. Because it supports quantitative analysis of the relationships in tourist flows, it provides tools for both theoretical construction and empirical testing [5]. It is therefore becoming a popular paradigm for studying the network structure of tourism flows. Specifically, we used social network analysis to study the node structure characteristics and the overall structure characteristics of the tourism flow in Qiandongnan Prefecture from a geographical perspective. Centrality analysis and the structural hole index were used to analyze the node structure characteristics. Network density, the core–periphery structure model, and cohesive subgroups were used to analyze the overall structure characteristics.
Degree centrality is the number of other nodes to which a node is directly connected. The node with the highest degree is regarded as the center of the network and has a strong influence on other nodes. In a directed graph, the degree of each node is divided into in-degree and out-degree. Equation (1) shows how to calculate the degree centrality $C_D(i)$ of node $i$:

$$C_D(i) = \sum_{j=1}^{n} x_{ij}, \quad i \neq j \tag{1}$$

where $x_{ij}$ represents the number of direct connections between nodes $i$ and $j$ ($i \neq j$), and $n$ is the number of nodes.
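As a minimal sketch of the in-degree and out-degree calculation, the following Python snippet counts direct connections in a directed 0/1 adjacency matrix. The function name `degree_centrality` and the four-node matrix are hypothetical and purely illustrative; they are not data from the study.

```python
def degree_centrality(adj):
    """Return (out_degree, in_degree) lists for a directed 0/1 adjacency matrix.

    adj[i][j] == 1 means there is a tourist flow from node i to node j.
    """
    n = len(adj)
    # Out-degree: ties sent by node i (row sum without the diagonal).
    out_deg = [sum(adj[i][j] for j in range(n) if j != i) for i in range(n)]
    # In-degree: ties received by node i (column sum without the diagonal).
    in_deg = [sum(adj[j][i] for j in range(n) if j != i) for i in range(n)]
    return out_deg, in_deg


# Hypothetical 4-node flow network (illustrative values only).
adj = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [1, 0, 0, 0],
]
out_deg, in_deg = degree_centrality(adj)
print(out_deg)  # [3, 1, 1, 1]
print(in_deg)   # [2, 2, 1, 1]
```

In this toy network, node 0 sends flows to every other node and therefore has the highest out-degree.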
Closeness centrality measures the extent to which a node is not controlled by other nodes. If the distances between a node and all other nodes in the network are short, the node has a high closeness centrality. It is based on the sum of the shortest-path distances between the node and all other nodes: the smaller the sum, the higher the closeness. Equation (2) shows how to calculate the closeness centrality $C_C(i)$ of node $i$:

$$C_C(i)^{-1} = \sum_{j=1}^{n} z_{ij}, \quad i \neq j \tag{2}$$

where $z_{ij}$ represents the shortest-path distance between nodes $i$ and $j$ ($i \neq j$).
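The distance-based idea can be sketched in Python with a breadth-first search. This sketch assumes an unweighted directed network, takes closeness as the reciprocal of the summed shortest-path distances (so that shorter distances give higher values), and returns 0.0 when some node cannot be reached. The function names and the matrix are hypothetical.

```python
from collections import deque


def shortest_path_lengths(adj, source):
    """Breadth-first search: steps from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v, tie in enumerate(adj[u]):
            if tie and v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def closeness_centrality(adj, i):
    """Reciprocal of the summed distances from node i (0.0 if a node is unreachable)."""
    dist = shortest_path_lengths(adj, i)
    if len(dist) < len(adj):
        return 0.0  # assumption: treat nodes with unreachable targets as having no closeness
    farness = sum(dist.values())
    return 1.0 / farness if farness else 0.0


# Hypothetical 4-node flow network (illustrative values only).
adj = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [1, 0, 0, 0],
]
for node in range(len(adj)):
    print(node, round(closeness_centrality(adj, node), 3))
```

Node 0 reaches every other node in one step, so its summed distance is 3 and it has the highest closeness in this toy network.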
Betweenness centrality measures the degree of control a node has over the whole network. A node that lies on the shortest paths between many other pairs of nodes has a high betweenness centrality. Equation (3) shows how to calculate the betweenness centrality $C_B(i)$ of node $i$:

$$C_B(i) = \sum_{j}\sum_{k} \frac{g_{jk}(i)}{g_{jk}}, \quad j \neq k \neq i \tag{3}$$

where $g_{jk}$ represents the number of shortest paths between nodes $j$ and $k$, and $g_{jk}(i)$ represents the number of those shortest paths that pass through node $i$.
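The following Python sketch illustrates the shortest-path counting behind betweenness centrality. It assumes an unweighted directed network and sums over ordered node pairs without normalization; software packages may normalize differently. The function names and the matrix are hypothetical.

```python
from collections import deque


def bfs_counts(adj, source):
    """Return (dist, sigma): shortest distances and numbers of shortest paths from source."""
    n = len(adj)
    dist = [None] * n
    sigma = [0] * n
    dist[source], sigma[source] = 0, 1
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in range(n):
            if not adj[u][v]:
                continue
            if dist[v] is None:          # first time v is reached
                dist[v] = dist[u] + 1
                queue.append(v)
            if dist[v] == dist[u] + 1:   # u precedes v on a shortest path
                sigma[v] += sigma[u]
    return dist, sigma


def betweenness_centrality(adj, i):
    """Sum over ordered pairs (j, k) of the share of shortest j->k paths through i."""
    n = len(adj)
    info = [bfs_counts(adj, s) for s in range(n)]
    dist_i, sigma_i = info[i]
    total = 0.0
    for j in range(n):
        dist_j, sigma_j = info[j]
        if j == i or dist_j[i] is None:
            continue
        for k in range(n):
            if k in (i, j) or dist_j[k] is None or dist_i[k] is None:
                continue
            # i lies on a shortest j->k path only if the distances add up.
            if dist_j[i] + dist_i[k] == dist_j[k]:
                total += sigma_j[i] * sigma_i[k] / sigma_j[k]
    return total


# Hypothetical 4-node flow network (illustrative values only).
adj = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [1, 0, 0, 0],
]
print([betweenness_centrality(adj, i) for i in range(4)])  # [5.0, 2.0, 0.0, 0.0]
```

Node 0 sits on most shortest paths in this toy network, which is the kind of intermediary position the index is meant to capture.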
Structural holes represent non-redundant connections and can be used to identify nodes that are at an advantage or disadvantage in the network. Nodes that span structural holes are generally more competitive than nodes elsewhere in the network and are hard to replace [46]. Measuring structural holes also helps identify potential bottlenecks in the network. Structural hole indexes generally cover three aspects: effective size, efficiency, and constraint, of which constraint is the most important.
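The text does not give a formula for the constraint index. As an illustration only, the sketch below assumes Burt's standard constraint measure, in which the proportional tie strength $p_{ij}$ is the share of node $i$'s ties invested in contact $j$, and the constraint of $i$ is the sum over its contacts of $(p_{ij} + \sum_{q} p_{iq} p_{qj})^2$. The function name and the star-shaped matrix are hypothetical.

```python
def burt_constraint(adj, i):
    """Burt's constraint of node i for a 0/1 (or weighted) adjacency matrix."""
    n = len(adj)

    def p(a, b):
        # Share of a's total tie strength (sent plus received) that involves b.
        total = sum(adj[a][k] + adj[k][a] for k in range(n) if k != a)
        return (adj[a][b] + adj[b][a]) / total if total else 0.0

    constraint = 0.0
    for j in range(n):
        if j == i or (adj[i][j] == 0 and adj[j][i] == 0):
            continue  # only direct contacts of i contribute
        indirect = sum(p(i, q) * p(q, j) for q in range(n) if q not in (i, j))
        constraint += (p(i, j) + indirect) ** 2
    return constraint


# Hypothetical star network: node 0 is tied to nodes 1, 2, and 3 (illustrative only).
star = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]
print(round(burt_constraint(star, 0), 3))  # 0.333
print(round(burt_constraint(star, 1), 3))  # 1.0
```

The hub of the star has low constraint because its contacts are not connected to one another, while each peripheral node depends entirely on the hub and has the maximum constraint of 1.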
2. Overall network characteristic indexes
Network density measures the overall tightness of the network structure. It is calculated as the ratio of the number of connections actually present in the network to the maximum number of possible connections [47]. The lower the density, the weaker the coordination between the nodes. Equation (4) shows how to calculate the network density $D$ of a directed network:

$$D = \frac{L}{N(N-1)} \tag{4}$$

where $L$ represents the number of arrows (directed connections) in the network, and $N$ represents the number of nodes in the network.
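A short Python sketch of the density calculation follows. It assumes a directed network without self-loops, so the maximum number of possible connections among N nodes is N(N - 1). The function name and the matrix are hypothetical.

```python
def network_density(adj):
    """Ratio of observed directed ties to the N * (N - 1) possible ties."""
    n = len(adj)
    if n < 2:
        return 0.0
    ties = sum(adj[i][j] for i in range(n) for j in range(n) if i != j)
    return ties / (n * (n - 1))


# Hypothetical 4-node flow network with 6 directed ties (illustrative only).
adj = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [1, 0, 0, 0],
]
print(network_density(adj))  # 0.5
```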
The core–periphery structure model divides nodes into a core area and a periphery area according to how closely the nodes in the network are connected [48]. The core area is the dominant, densely connected cluster, while nodes in the periphery area have relatively few connections. Core–periphery structure analysis quantifies the position of each node in the network and distinguishes the nodes in core positions from those in peripheral positions.
Cohesive subgroups are important analytical measures in social networks, which can reveal actual or potential relationships between social actors [49]. They are an important link between individuals and organizations. When the relationships among certain actors in the network are so close that the actors form a sub-group, social network analysis calls such a group a cohesive subgroup. If a cohesive subgroup has a high density, the nodes within it are closely connected.
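As a small illustration of the density of a cohesive subgroup, the Python sketch below computes the density of directed ties among a given set of members. It assumes the subgroup membership is already known; identifying the subgroups themselves is not shown here. The function name, the matrix, and the memberships are hypothetical.

```python
def subgroup_density(adj, members):
    """Density of directed ties among `members` (a list of node indices)."""
    m = len(members)
    if m < 2:
        return 0.0
    ties = sum(adj[i][j] for i in members for j in members if i != j)
    return ties / (m * (m - 1))


# Hypothetical 5-node network (illustrative only): nodes 0-2 are tightly
# connected, while nodes 3-4 share a single one-way tie.
adj = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
]
print(subgroup_density(adj, [0, 1, 2]))  # 1.0
print(subgroup_density(adj, [3, 4]))     # 0.5
```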
3.3.2. Quadratic Assignment Procedure
The quadratic assignment procedure (QAP) is a method that compares the values of the corresponding cells in two or more square matrices on the basis of matrix permutation. It compares the corresponding cell values of the matrices, gives the correlation coefficient between them, and performs a non-parametric test on the coefficient. Since there is no need to assume that the independent variables are independent of one another, QAP can effectively avoid the autocorrelation errors inherent in the data and obtain more reliable results than parametric methods. Because the tourism flow network structure and its influencing factors are relational data, analysis with ordinary least squares (OLS) leads to biased estimates. QAP, by contrast, explicitly accounts for the autocorrelation in network data and can therefore circumvent this problem to a large extent [50].

QAP is usually carried out in two steps: QAP correlation analysis and QAP regression analysis. QAP correlation analysis examines whether two matrices are correlated. It permutes one matrix, calculates the correlation coefficient by comparing the cell values of the matrices, and carries out a non-parametric test [51]. QAP regression analysis studies the regression relationship between several independent variable matrices and one dependent variable matrix. First, a standard multiple regression is carried out on the corresponding elements of the independent variable matrices and the dependent variable matrix. Then, the rows and columns of the dependent variable matrix are randomly and simultaneously permuted, the regression is recalculated, and all coefficient values and the coefficient of determination ($R^2$) are saved. This step is repeated hundreds of times to estimate the standard errors of the statistics [52].
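The permutation logic of QAP correlation analysis can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure of any particular software package: it uses the Pearson correlation of off-diagonal cells, a one-sided test, and a fixed random seed. The function names and the two matrices are hypothetical.

```python
import random
from math import sqrt


def offdiag(m, order=None):
    """Off-diagonal cells of m, with rows and columns relabelled by `order`."""
    n = len(m)
    if order is None:
        order = list(range(n))
    return [m[order[i]][order[j]] for i in range(n) for j in range(n) if i != j]


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def qap_correlation(x, y, n_perm=1000, seed=0):
    """Return (observed r, one-sided permutation p-value)."""
    rng = random.Random(seed)
    x_vec = offdiag(x)
    observed = pearson(x_vec, offdiag(y))
    order = list(range(len(y)))
    hits = 0
    for _ in range(n_perm):
        # Permute rows and columns of y together, which keeps its structure intact.
        rng.shuffle(order)
        if pearson(x_vec, offdiag(y, order)) >= observed - 1e-12:
            hits += 1
    # Count the observed arrangement itself so the p-value is never zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical 5-node matrices (illustrative only).
x = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [1, 0, 0, 0, 0],
]
y = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
r, p = qap_correlation(x, y, n_perm=500)
print(round(r, 3), round(p, 3))
```

Permuting rows and columns with the same ordering is what distinguishes QAP from shuffling cells independently: the dependence among cells that share a node is preserved in every permuted matrix. QAP regression follows the same pattern, with the regression refitted after each permutation of the dependent variable matrix.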
2. Theoretical model
Although the formation of tourism flow results from the subjective choices of tourists, it is influenced by multiple factors. Drawing on the existing literature [5,53,54] and the actual situation of Qiandongnan Prefecture, this paper selected geographical adjacency, traffic accessibility, tourism reception capacity, tourism resource endowment, the popularity of tourist attractions, and the ticket price of attractions as independent variables, took the spatial correlation matrix of tourism flow in Qiandongnan Prefecture as the dependent variable, and constructed the QAP model shown in Equation (5):

$$Y = f(GA, TA, TRC, TRE, TAP, TP) \tag{5}$$
In the above model, the data of all indicators are a series of matrices. Y denotes the spatial correlation matrix of the tourism flow network. GA, TA, TRC, TRE, TAP, and TP denote the dichotomous matrices of geographical adjacency, traffic accessibility, tourism reception capacity, tourism resource endowment, the popularity of tourist attractions, and ticket price of attractions, respectively.
In this paper, whether the nodes are in the same county or not is used to represent the geographic adjacency.
In this paper, the distance from the node to the train station (DTS) and the average travel time of the node (ATT) are used to represent traffic accessibility. The railway station in the case study area is Kaili South Station. The distance from the node to the train station is measured by the shortest travel time shown on amap (https://ditu.amap.com/ (accessed on 1 January 2020)). The average travel time of the node is calculated by Equation (6) [55]:

$$ATT_i = \frac{1}{n} \sum_{j=1}^{n} t_{ij} \tag{6}$$

where $ATT_i$ is the average travel time of node $i$; $i$ and $j$ represent tourist nodes; $t_{ij}$ represents the shortest travel time from node $i$ to node $j$, obtained from amap in driving mode; and $n$ represents the number of tourist nodes.
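The average travel time can be illustrated with a short Python sketch. It assumes that the mean is taken over all n nodes with a travel time of zero from a node to itself; averaging over the other n - 1 nodes is an equally plausible reading. The function name and the travel times (in minutes) are hypothetical.

```python
def average_travel_time(t):
    """Mean shortest travel time from each node to all n nodes (t[i][i] == 0)."""
    n = len(t)
    return [sum(row) / n for row in t]


# Hypothetical shortest driving times in minutes between three nodes (illustrative only).
t = [
    [0, 30, 60],
    [30, 0, 90],
    [60, 90, 0],
]
print(average_travel_time(t))  # [30.0, 40.0, 50.0]
```

A lower value indicates a node that is quicker to reach from the rest of the network, that is, a node with better accessibility.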
In this paper, tourism reception capacity is represented by the number of hotels within five kilometers of the node.
China assesses the quality level of tourist attractions according to the “Standard of rating for the quality of tourist attractions” (GB/T17775-2003), which classifies them into national 5A, 4A, 3A, 2A, and 1A level tourist attractions. Whether a tourist attraction is A-level is an important measure of resource endowment. Therefore, this paper uses whether the node is an A-level tourist attraction to represent tourism resource endowment.
This paper uses the average Baidu index of tourist attractions in 2019 to represent the popularity of tourist attractions.
Ticket prices can also affect tourism flow. This paper used the average of the off-season and peak-season ticket prices of tourist attractions in 2019 to represent the ticket price of attractions. The above indicator data came from the website of the Ministry of Culture and Tourism of the People’s Republic of China (https://mct.gov.cn/ (accessed on 1 January 2020)), Baidu.com (http://www.baidu.com/ (accessed on 1 January 2020)), Ctrip.com, Mafengwo.com, and amap.com.
The dichotomous matrices are constructed as follows. For geographical adjacency, if two nodes are in the same county, the matrix value is 1; otherwise, it is 0. For tourism resource endowment, if both nodes are A-level attractions, the value is 1; otherwise, it is 0. For the remaining variables, we took the mean value of the variable data as the dividing point: a node above the mean value is coded as 1; otherwise, it is coded as 0.
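The coding rules can be sketched in Python as below. The text does not spell out how the node-level 0/1 codes of the mean-split variables are expanded into a node-by-node matrix, so this sketch assumes, by analogy with the A-level rule, that a cell is 1 only when both nodes are coded 1. All function names and input values are hypothetical.

```python
def same_group_matrix(labels):
    """Cell (i, j) is 1 when nodes i and j share a label (e.g., the same county)."""
    n = len(labels)
    return [[int(i != j and labels[i] == labels[j]) for j in range(n)] for i in range(n)]


def above_mean(values):
    """Node-level coding: True when a node's value exceeds the mean of all nodes."""
    mean = sum(values) / len(values)
    return [v > mean for v in values]


def both_true_matrix(flags):
    """Cell (i, j) is 1 when both nodes are flagged (assumed expansion rule)."""
    n = len(flags)
    return [[int(i != j and flags[i] and flags[j]) for j in range(n)] for i in range(n)]


# Hypothetical inputs for three nodes (illustrative only).
counties = ["A", "A", "B"]
hotels = [10, 50, 60]  # hotels within five kilometers; the mean is 40

print(same_group_matrix(counties))           # [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
print(above_mean(hotels))                    # [False, True, True]
print(both_true_matrix(above_mean(hotels)))  # [[0, 0, 0], [0, 0, 1], [0, 1, 0]]
```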
3. Variable assumptions
Since there are many factors affecting tourism flow, this paper proposes the following hypotheses with reference to the existing literature:
Hypothesis 1. (H1) Geographic adjacency is an important factor affecting tourism flow. If two tourist attractions are geographically close to each other, this has a positive impact on the development of tourism flows. The scenic spots in Qiandongnan Prefecture are scattered, and tourists need to spend a lot of time on transportation, so geographical proximity can promote tourist flow between tourist attractions.
Hypothesis 2. (H2) The improvement of traffic accessibility has a positive impact on the improvement of tourist flows. Traffic facilities, as the infrastructure of tourist attractions, play an important role in improving the accessibility of tourist attractions and expanding the tourism market. Generally speaking, tourists prefer to travel to attractions with high traffic accessibility. Traffic accessibility directly affects the formation of tourist flows between two tourist attractions.
Hypothesis 3. (H3) The improvement of tourism reception capacity has a positive impact on the increase in tourist flows. Tourism reception capacity refers to the equipment and facilities of the tourism sector, which usually reflects the number of tourists received and the service level of the tourist attractions. Tourist reception capacity is one of the important indexes to attract tourists. It plays an important role in publicizing the culture of tourist destinations and promoting the comprehensive competitiveness of scenic spots.
Hypothesis 4. (H4) The improvement of tourism resource endowment has a positive influence on the improvement of tourism flow. Tourism resource endowment is an important condition to attract tourists. Tourism resources with special characteristics can help tourists effectively overcome the resistance of spatial distance and form a tourist flow.
Hypothesis 5. (H5) The improvement in popularity of tourist attractions has a positive impact on the improvement of tourist flows. The popularity of tourist attractions has an important impact on expanding the influence of the attractions and increasing revenue. Generally speaking, most tourists prefer to go to tourist attractions that are well known and popular.
Hypothesis 6. (H6) The reduction in ticket prices of attractions has a positive impact on the improvement of tourist flows. Ticket prices in attractions are directly related to tourists’ vital interests. In recent years, the number of tourists in some high-ticket scenic spots has begun to decline, which indicates that high ticket prices restrict the growth of tourism consumption, and it has become an urgent problem to be solved. Generally speaking, the lower the ticket price, the more tourists.