1. Introduction
The outbreak of the novel coronavirus disease (COVID-19) pandemic has had a huge impact on tourism development in countries around the world [1,2]. For example, from 23 to 26 January 2020, the hotel occupancy rate in China fell by nearly 70% from a year earlier [3]. Likewise, the tourism network attention index calculated later in this study averaged 40.92 in 2019 and dropped to 31.90 in 2020, an intuitive reflection of the pandemic's impact. Because tourism is a fast-growing industry, its fluctuations are also rapidly transmitted to economic, cultural, and social development [4]. In mid-2019, the mass media was still debating over-tourism, with concerns that the concentration of tourists could lead to conflicts and complaints between tourists and residents [5]. Then, almost in an instant, everything changed: global concerns about tourism shifted from over-tourism to under-tourism, which threatens the future of the global economy and society.
The impact is not limited to China. The spread of COVID-19 around the world has led to strict domestic and international travel restrictions, and most countries have restricted or completely suspended international travel. It should be noted that China has stricter rules than countries in Europe and America, which means that the case of China cannot be transferred to other countries without qualification. Nevertheless, as a new way of monitoring tourism development, extending research on the tourism network attention of Chinese cities can still offer observable cases that are useful to other countries.
For the tourism industry, the duration of the impact of COVID-19 is very different from that of other crises. The impact was once considered temporary, as with other emergencies, but the global pandemic situation and mutations of the virus have led to its continued spread and frequent recurrence. As a result, the impact of COVID-19 has lingered, greatly affecting the recovery of tourism development. Given the damage to the real economy caused by the pandemic, households lose stable income and employment opportunities, which indirectly affects the development of tourism. How to assess the impact of a crisis as early as possible is a focus of many scholars [2]. In previous studies, scholars have assessed the impact of SARS and other shocks on subsequent tourism development [6,7]. These analyses were usually conducted two to five years after the SARS event, and timely empirical analysis is lacking. In particular, a large body of literature examines the influence of such shocks on tourism development indicators such as the number of tourists, tourism revenue, and industry prospects [8,9,10,11,12]. In recent years, COVID-19 has attracted widespread attention from scholars and the mass media at home and abroad, yet the literature on this subject is still limited. Existing studies argue, inter alia, through normative rather than empirical analysis, that the crisis tends toward a pandemic and will depress the economy and tourism for some time [13,14,15]. Some research notes, such as Yang et al. [2], use the dynamic stochastic general equilibrium (DSGE) method to extend the model discussion. The importance of tourism in economic activity has led to increasing interest in this issue. The recession caused by the COVID-19 pandemic is expected to eliminate 50 million tourism jobs globally [1], and the impact looks likely to be felt across countries. More importantly, tourism is closely linked to other important industries such as air transportation, oil production, hotels, and retail, so the chain reaction will spread around the world. Scholars, government planners, and implementers must therefore better understand the impact of the pandemic, particularly on tourism.
When assessing the impact of COVID-19, Chinese cities provide a good sample for empirical studies. In China, due to measures such as “city lockdown”, “isolation”, and bans on crowd gatherings [2], many residents canceled their original travel plans and became less willing to travel in the future, which can be taken as a given at this stage. In theory, crisis events can have a significant impact on tourism and other activities. However, how to quantify the treatment effect from the perspective of the tourism economy within such a short period remains an open question that is rarely addressed in the existing literature. According to some scholars, based on the experience of SARS, influenza A (H1N1), and similar outbreaks [16,17], tourism activities may exacerbate the spread of a pandemic, while the pandemic in turn deals a heavy blow to the tourism industry. There is thus a two-way causal relationship between the two, which makes it inherently difficult to identify clean treatment effects [18]. In China’s case, however, the decisive adoption and enforcement of lockdowns and quarantines, which shaped the behavior of almost all Chinese people rather than only a few, provide an effective tool for addressing the endogeneity between the pandemic and tourism development and allow this study to identify the treatment effect of COVID-19 more accurately. Meanwhile, many unknowns, such as how far residents’ willingness to travel has fallen and when it will recover, make the future of tourism development more uncertain than ever before. To sum up, the previous literature (a) mainly analyzes the impact of crises using the values and trends of post-event annual statistical indicators, and (b) relies mainly on qualitative discussion, with empirical methods seldom applied and results mostly reflecting statistical correlation. The main drawback of these studies is that their results can be skewed by the lack of real-time tourism index data and by the neglect of possible endogeneity problems. This study aims to fill these gaps and provide a comprehensive analysis of the links between COVID-19 and tourism.
The main contribution of this study is twofold. First, this study crawled the daily Baidu Index of tourism-related keywords for 247 Chinese prefecture-level cities from the website http://index.baidu.com/v2/index.html#/ (accessed on 29 April 2021), and then constructed the “tourism network attention” (TNA) indicator to measure the overall state of tourism development in the COVID-19 era in real time. The use of these data can increase the generalizability of the empirical evidence, as most existing research relies on annual data or statistical indicators. Second, compared with previous research, the empirical analysis has certain advantages in methodology and identification strategy. This study uses the regression discontinuity design (RDD) method with a time cutoff to capture the causal link between COVID-19 and the degrowth in tourism, which makes it an early application of this design to pandemic impact assessment. Angrist and Pischke [19] argue that in a highly rules-based world, the “arbitrariness” of some contingencies provides good experiments (or quasi-experiments). The RDD, often regarded as second only to randomized trials among empirical methods, uses real-world constraints to analyze causality between variables and avoids inherent bias in parameter estimation, so it can credibly reflect the causal relationship between variables [20,21]. In addition, this study retrieved the Baidu Index of some daily-life keywords as control variables and tested their continuity at the cutoff to improve the robustness of the RDD analysis. The principles of the relevant methodologies can be found in related studies [18,20,22]. A distinctive feature of the methodology is the construction of a differenced RDD regression, which subtracts the value of the corresponding day one year earlier from the explained variable to further validate the identification and obtain a cleaner treatment effect.
The rest of this article is arranged as follows. First, this study introduces the data and identification strategy, briefly describes how the key explained variable TNA is constructed, and explains the basic details of the RDD model. Then the results are presented and robustness analysis is performed. Finally, the main findings are summarized in the conclusion.
2. Materials and Methods
The basic period covers 1 July 2019 to 25 April 2020. To obtain more reliable results, samples from 1 July 2018 to 25 April 2019 and from 1 July 2020 to 25 April 2021 were also collected, as explained later. The data contain 247 Chinese prefecture-level cities, which, subject to data availability, cover almost all prefecture-level cities. The dependent variable of this study is tourism development. It is usually measured with annual indicators from statistical yearbooks, such as the number of tourists received and tourism income [23]. Because this study focuses on the real-time development of the tourism industry, annual data are insufficient. Therefore, this study turns to the daily information of TNA, which can be calculated by Function (1) following the typical literature [24,25].
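Written out from the definitions that follow, Function (1) is most naturally a weighted sum over the K keywords. The form below is a minimal sketch under that reading; the index range, the normalization of the weights to one, and the absence of any rescaling of the raw Baidu Index are assumptions rather than details stated in the text.

```latex
TNA_{c,t} = \sum_{k=1}^{K} w_k \, BI_{k,c,t}, \qquad \sum_{k=1}^{K} w_k = 1 \tag{1}
```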
where $TNA_{c,t}$ is the comprehensive tourism index of city $c$ on day $t$, constructed from the Baidu Index of tourism-related keywords; $BI_{k,c,t}$ is the Baidu Index of the $k$-th keyword for city $c$ on day $t$; and $w_k$ is the corresponding weight, determined by the entropy weight method. Specifically, this study selects six keywords, “tourism”, “travel agency”, “self-driving tour”, “Ctrip”, “Qunar”, and “fun place” (in Chinese), to obtain $BI_{k,c,t}$. The Baidu Index provides a weighted sum of the search frequency of given keywords by network users in all prefecture-level cities of China. It reflects the distribution of residents' search information flow for the relevant keywords and has the advantage of reflecting tourism development in real time from the demand side while covering a wide range of cities.
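The entropy weight method mentioned above can be illustrated with a short sketch. It assumes the common textbook variant in which each keyword's raw values are converted to shares, the Shannon entropy of those shares is computed, and weights are proportional to one minus the entropy. The function name `entropy_weights`, the toy numbers, and the choice not to rescale the raw index first are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def entropy_weights(x):
    """Entropy-weight sketch for an (n_obs, n_keywords) array of non-negative values."""
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    col_sum = x.sum(axis=0)
    # Share of each observation within its keyword column; an all-zero column
    # is treated as uniform, so it carries no information.
    p = np.where(col_sum > 0, x / np.where(col_sum > 0, col_sum, 1.0), 1.0 / n)
    # Shannon entropy per column, scaled to [0, 1]; 0 * log(0) is taken as 0.
    plogp = np.where(p > 0, p * np.log(np.where(p > 0, p, 1.0)), 0.0)
    entropy = -plogp.sum(axis=0) / np.log(n)
    # Columns with more variation (lower entropy) receive larger weights.
    d = np.clip(1.0 - entropy, 0.0, None)
    if d.sum() == 0:
        return np.full(x.shape[1], 1.0 / x.shape[1])  # fall back to equal weights
    return d / d.sum()

# Hypothetical daily index values for three keywords (rows are days).
bi = np.array([
    [120.0, 30.0, 15.0],
    [ 90.0, 32.0, 10.0],
    [ 60.0, 31.0,  4.0],
    [ 45.0, 29.0,  2.0],
])
w = entropy_weights(bi)
tna = bi @ w  # composite index per day, a weighted sum of the keyword indexes
print(w, tna)
```

A keyword whose index barely changes across observations receives a weight close to zero, so the composite is driven by the keywords that actually vary.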
In addition, following the existing literature, this study adds several control variables to the empirical analysis. Unlike annual data, daily data face the problem that global economic and macro indicators are unavailable at that frequency. To address this shortcoming, this study makes improvements in three respects. First, the RDD method itself focuses on information near the cutoff of the forcing variable. Following this strategy, this study analyzes the time cutoff of COVID-19 and uses local estimation to largely eliminate interference from changes in the macro environment. Second, using a web crawl of the Tianqihoubao (“post-weather report”) website http://www.tianqihoubao.com/lishi/ (accessed on 29 April 2021), this study collated weather data for the 247 Chinese prefecture-level cities and introduced the daily maximum temperature, minimum temperature, sunny conditions, and wind grade as control variables. Third, a living-related keyword search indicator, denoted by Live, is constructed from the Baidu Index. Theoretically, the outbreak would not affect search behavior for these keywords, but attention to these aspects of life could affect tourism demand, so the variable can be treated as exogenous. In this way, through local restriction and appropriate control of direct time effects, the estimated result is the causal relationship between COVID-19 and tourism that this study is concerned with.
According to the evaluation of tourism development and the related classical literature of the RDD method, the regression model is set up as Function (2).
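Read together with the variable definitions that follow, Function (2) plausibly takes the form below. This is a sketch, not the authors' exact specification: the Greek letters $\alpha$, $\beta$, and $\gamma$ are assumed names, and the text does not say whether $f(t)$ is interacted with the treatment variable.

```latex
TNA_{c,t} = \alpha + \beta D_{c,t} + f(t) + X_{c,t}\gamma + DQ_{c,s} + TC_{c} + u_{c,t} \tag{2}
```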
$D_{c,t}$ is the treatment variable in the RDD model and depends on the outbreak date of the COVID-19 event; the start date is identified from the course of the COVID-19 crisis in China and the point at which it received national attention. On 20 January 2020, Zhong Nanshan, leader of the high-level expert group of the National Health Commission, academician of the Chinese Academy of Engineering, and expert in respiratory medicine, stated clearly in an interview on the CCTV program “News 1+1” that the novel coronavirus is transmitted from human to human. That day became the landmark date for COVID-19 and its control, and it is the cutoff point of this study. Therefore, this study sets D = 0 before that date and D = 1 after it. t is the forcing variable of the RDD and measures the time distance from the cutoff, with t = 0 on the cutoff day. f(t) is a polynomial function of the forcing variable, and the optimal polynomial order is selected during estimation. X is the vector of control variables, including the daily maximum temperature, minimum temperature, sunny condition, wind grade, and the Live variable mentioned above. The basic keywords of Live mainly include “food”, “oil price”, “reading”, and “cold”. Their Baidu Indexes were crawled and then synthesized into a composite index by the entropy weight method. In addition, the model controls for the regional fixed effect and for the fixed effect of whether the city is a major source of tourists, denoted as $DQ_{c,s}$ and $TC_c$, respectively; u is the random disturbance term. Descriptive statistics of the variables are shown in Table 1.
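The idea of a regression discontinuity with a time cutoff can be sketched in a few lines. The example below is an illustration under simplifying assumptions, not the estimator used in the study: it uses a uniform kernel with a fixed bandwidth, lets the polynomial trend differ on each side of the cutoff, treats the cutoff day itself as treated, and omits the control variables and fixed effects. The function name `rdd_in_time` and the simulated series are hypothetical.

```python
import numpy as np

def rdd_in_time(t, y, bandwidth, order=1):
    """Estimate the jump in y at t = 0 with a local polynomial fit.

    t: days relative to the cutoff (t = 0 on the cutoff day)
    y: outcome, for example a TNA-like index
    Returns the coefficient on the treatment dummy D.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    keep = np.abs(t) <= bandwidth      # uniform kernel: keep only days near the cutoff
    t, y = t[keep], y[keep]
    d = (t >= 0).astype(float)         # assumption: the cutoff day counts as treated
    cols = [np.ones_like(t), d]
    for p in range(1, order + 1):
        cols.append(t ** p)            # polynomial trend f(t)
        cols.append(d * t ** p)        # trend allowed to change after the cutoff
    design = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]                     # estimated discontinuity at t = 0

# Simulated index with a built-in drop of 8 points at the cutoff plus noise.
rng = np.random.default_rng(0)
days = np.arange(-90, 91)
index = 40 + 0.02 * days - 8.0 * (days >= 0) + rng.normal(0, 1, days.size)
print(rdd_in_time(days, index, bandwidth=30))
```

In applied work the bandwidth and polynomial order are usually chosen by a data-driven procedure, and inference accounts for serial correlation in daily data; both are left out here to keep the sketch short.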
Further, this study provided a differenced RDD shown as Function (3).
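A form of Function (3) consistent with the description that follows is sketched below. It assumes the same structure as the baseline model with the explained variable and the controls replaced by their year-on-year differences; the coefficient names and the retention of the fixed effects are assumptions.

```latex
\mathit{TNA\_diff}_{c,t} = \alpha + \beta D_{c,t} + f(t) + \mathit{X\_diff}_{c,t}\,\gamma + DQ_{c,s} + TC_{c} + u_{c,t} \tag{3}
```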
TNA_diff is the arithmetic difference between the current-period value of TNA and its value one year earlier. The current period in this study runs from 1 July 2019 to 25 April 2020, and the corresponding one-year lag period runs from 1 July 2018 to 25 April 2019. X_diff is the vector of differenced control variables, constructed in the same way. Function (3) is particularly useful for assessing the impact of COVID-19 on tourism because there was no pandemic in the same period of 2018–2019.
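The year-on-year differencing can be sketched as follows. The sketch assumes that days are matched by calendar date (the text does not say whether matching is by calendar date, weekday, or holiday calendar) and that days without a counterpart, such as 29 February 2020, are dropped. The function name `one_year_diff` and the sample values are hypothetical.

```python
from datetime import date

def one_year_diff(series):
    """Subtract from each day's value the value on the same calendar date one year earlier.

    series: dict mapping datetime.date -> float for one city
    Days without a counterpart one year earlier are skipped.
    """
    out = {}
    for day, value in series.items():
        try:
            lag_day = day.replace(year=day.year - 1)
        except ValueError:
            # 29 February has no matching date in a non-leap year.
            continue
        if lag_day in series:
            out[day] = value - series[lag_day]
    return out

# Hypothetical index values for one city.
tna = {
    date(2019, 1, 20): 40.0,
    date(2019, 1, 21): 42.0,
    date(2020, 1, 20): 30.0,
    date(2020, 1, 21): 27.5,
    date(2020, 2, 29): 25.0,
}
tna_diff = one_year_diff(tna)
print(tna_diff)
```

The same function can be applied to each control variable to build the differenced control vector.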
The analysis focuses primarily on the coefficient of $D_{c,t}$, which is the estimator of the local average treatment effect (LATE) at t = 0. A significantly negative coefficient would indicate that COVID-19 significantly reduces TNA or TNA_diff and harms tourism. In addition, by eliminating the samples after the cutoff day by day and running the RDD regression each time, this study can further explore the dynamics of the pandemic's impact. The related description is given in Section 3.
Panel A reports the variables obtained for 1 July 2019 to 25 April 2020; Panel B reports their differences from the one-year lagged period (the same interval from 1 July 2018 to 25 April 2019), which largely eliminates the impact of unobservable factors. The suffix “_diff” distinguishes the differenced variables from their basic form. Differencing is especially helpful for evaluating the impact of the pandemic because there was no pandemic in the same period of 2018–2019. The Panel B data therefore enable this study to apply the differenced model and provide more convincing empirical evidence. In addition, to study the recent recovery of China’s tourism market, this study uses data from 1 July 2020 to 25 April 2021 and compares them with the same periods in 2018–2019 and 2019–2020, exploring quantitatively how the market recovered under effective pandemic prevention and control.
5. Discussion and Implication
5.1. Discussion
In the basic results section, the coefficients that capture the impact of COVID-19 on TNA or TNA_diff are significantly negative across many different models, which verifies the adverse impact of the COVID-19 pandemic on tourism network attention. This is an intuitive description of the effect, and it also shows that TNA is, to a large extent, directly related to travel behavior: when stricter control measures are in place, people spend more time at home. If tourism-related searches served only as entertainment, the index would not have dropped significantly, but it did.
Heterogeneity analysis shows that cities with different characteristics are affected by the pandemic to different degrees. This is because the pandemic has mainly curbed tourism demand, especially in the major tourist source cities, while regions with lower demand were already less inclined to travel and are therefore less affected. This heterogeneity allows future studies to further identify the potential impact of the pandemic on urban, regional, or other tourism disparities, and to focus more on changing and nurturing the demand of key source populations.
In the dynamics of the pandemic’s impact on tourism development, the short-term recovery near t = 0 is more closely related to the traditional Chinese Spring Festival, but as the pandemic spread across regions of China, its damaging effect on tourism development became more obvious. The overall volatility is relatively high, and the significantly negative coefficients indicate that in recent months, owing to COVID-19 and regional prevention and control measures, tourism network attention in China has not shown any obvious sign of recovery. Based on the trend of TNA, this study finds that the tourism recovery around the May Day holiday expected by the media and experts did not materialize; at least, there is no clear evidence of it in 2020.
To give an intuitive view of tourism development around key holidays, namely the May Day Holiday and the November Holiday, which are the most important tourist weeks in China in the first and second half of the year, respectively, Figure 3 uses boxes to highlight the trend of the pandemic's impact, based on the differenced RD regressions, over a total of three weeks before and after the two holidays. The pandemic does have a significant negative impact on the TNA of cities, implying that TNA dropped sharply before and during the holiday weeks compared with the previous year. Although June to August is the peak season for China’s tourism market, cities showed a significant decline in online tourism search attention relative to the same period of the previous year. The results of the dynamic analysis intuitively reflect the negative impact of the pandemic on the tourism industry, even though China has achieved substantial results in prevention and control. Moreover, the coefficients are more negative and more significant during important holidays and the peak tourist season, which shows that the impact of the pandemic on tourism is substantial. As the pandemic continues to spread globally, the outlook for tourism development is not optimistic.
In addition, a slight sign of recovery has been observed in the trend, but the recurrence of the pandemic remains one of the determinants of tourism. As the current situation shows, when the pandemic becomes serious in one area, local control measures also spill over to tourism development in other areas, which is reflected first in changes in the level of TNA. TNA reflects potential tourism demand; whether that potential demand can be transformed into real demand also depends on support from the external environment. It is therefore possible to remain strategically optimistic about the post-pandemic recovery of the tourism market while being tactically cautious and choosing the timing of policy support for tourism recovery carefully. In designing support and intervention policies for the tourism and residence industry, countries have to balance the recovery and development of that industry against other national priorities.
5.2. Implication
Overall, the severe downturn of the tourism industry amid the pandemic can be understood on two levels. First, increased economic uncertainty has reduced tourism demand, which is linked to risk aversion in the face of uncertainty. Second, administrative restrictions introduced during the COVID-19 pandemic, including restrictions on population mobility, can significantly reduce the supply of tourism services. While the pandemic has sparked a tourism crisis in several countries, including China, the tourism industry is likely to be a key sector for the global economic recovery after the pandemic, so it is important to focus on tourism during the pandemic.
This study has theoretical and policy implications. Theoretically, it provides strong evidence of the damage caused by COVID-19 to tourism development. It clarifies how the pandemic’s treatment effect evolves over time and shows that there was no obvious recovery in the overall trend in 2020. By examining these dynamic impacts, the study seeks to capture the likely timing of tourism recovery from the recession amid the pandemic. In terms of policy, the pandemic poses significant economic and political challenges for national and local governments. This study shows from the tourism demand side that COVID-19 has greatly reduced residents’ willingness to travel in the short term. As the pandemic is gradually brought under control or recurs, its impact keeps changing. Therefore, while supporting the large-scale development of the tourism industry, the government should also make efforts to cultivate and guide tourism demand.
In the post-pandemic era, the findings offer two suggestions for the marketing of Chinese tourist cities. On the one hand, the recovery of the tourism market should be identified in time, and tourism demand should be guided and cultivated promptly, so that enterprises and self-employed individuals able to organize tourism demand can resume operation quickly, demand can converge, and the market can recover. On the other hand, the supply side of the tourism market needs to be monitored accurately. COVID-19 has forced many relatively weak tourism businesses to close or withdraw, which means that when the tourism market fully recovers there may be a gap in the provision of tourism services; this should be anticipated and addressed in advance. As for how to further increase the TNA of Chinese tourist cities, this study finds that tourism consumers show an obvious preference for “internet celebrity cities” when searching for tourism keywords. Cities can therefore take advantage of the current shift in information exchange in China’s era of short video and the digital economy: by actively uncovering their tourism resources or travel personality, consciously building an internet celebrity city, and increasing internet exposure and media voice, a city can attract attention.
Given the current global COVID-19 pandemic, China’s timely and advanced response means that China will enter the post-pandemic era earlier than other countries. Although European and American countries have adopted different measures from China to combat the pandemic, observing the performance and trends of China’s tourism development under COVID-19 can still provide some potential guidance for other countries and can serve as a signal for the recovery of the global tourism industry.
6. Conclusions
COVID-19 has forced countries to take various measures to restrict the movement and gathering of people. While these measures have helped slow the spread of the pandemic, the pandemic has also had a huge impact on the economy and society, attracting considerable interest from scholars and policymakers. This study investigates the treatment effect of COVID-19 on tourism network attention in 247 Chinese prefecture-level cities from 2018 to 2021, to explore the adverse effects of the pandemic on tourism and their trends. To this end, this study adopted the RDD method with a time cutoff. The basic regressions used TNA and TNA_diff as the explained variables, respectively. The estimated treatment effects were −2.12 (p < 0.10) and −10.77 (p < 0.01), both statistically significant, providing quantitative empirical evidence of a causal relationship between the pandemic and tourism development. The effects vary with whether a city is a major tourist source city: the estimated coefficient is −14.91 (p < 0.01) for major tourist source cities and −4.57 (p < 0.01) for non-major ones, that is, the impact of the pandemic on tourism is greater in the major tourist source cities. Looking at the dynamics over time, the impact of the pandemic fluctuates but remains negative, and it is more pronounced during the two major Chinese holidays and the peak tourist season. This study further compares the trend of TNA between the pre-pandemic and pandemic periods, and between the post-pandemic and pandemic periods. The results show that, after excluding individual effects, the TNA of Chinese cities shows an upward trend in 2021 relative to 2020. This is good news for tourism development, but the global spread, mutation, and recurrence of the pandemic remain key variables to watch.
Although this study adds to the existing literature, it still has limitations. First, Chinese cities have different economic and tourism characteristics and are at different stages of development. This heterogeneity makes it difficult to offer a single recommendation; spatial and cultural differences must be taken into account. Second, researchers must also clarify the theoretical mechanisms through which COVID-19 and other contingencies affect regional tourism development. Third, China has stricter rules for dealing with COVID-19 than other countries, which makes the case of China atypical and hard to transfer directly to other countries. Only when these factors are taken into account can the impact of COVID-19 be fully assessed.