Abstract
An analysis of historical tourism data shows that tourism revenue is influenced by multiple factors and that a linear relationship exists between these factors and tourism revenue. This paper therefore employs a linear regression model to investigate the factors influencing tourism revenue. However, the disturbance term of the linear regression model fitted to tourism data is not frequency-stable, which indicates that it should be treated as an uncertain variable rather than a random variable. Accordingly, this paper adopts an uncertain linear regression model and uses the tourism data of Shandong Province, China, from 2011 to 2020 as the sample. The study provides parameter estimation and residual analysis of the model, as well as forecasts and confidence intervals of tourism revenue. An uncertain hypothesis test verifies that the adopted model fits the tourism data well. The results show that the number of travel agencies, railway length, the number of domestic tourists, and the per capita disposable income of urban residents have a significant impact on tourism revenue. Based on these findings, recommendations and measures for improving the tourism revenue of Shandong Province are proposed.
MSC:
62P99
1. Introduction
Tourism revenue refers to all monetary income earned by tourism and sightseeing organizations through the sale of tourism-related goods during a given period [1]. Many scholars have studied the factors that influence it. Deng and Li [2] argue that, since tourism revenue is an important indicator of the level of tourism development of a country or region, the factors influencing it deserve in-depth research. Using a multiple linear regression model, they conclude that the improvement in the ticket market economy, the expansion of urban green spaces, and the construction of tourism infrastructure are the main factors affecting tourism revenue. Li [1] pointed out that China’s tourism industry has entered a period of explosive growth and that domestic tourism revenue has increased rapidly. Through a multiple linear regression model and econometric analysis, Li found that the main factors affecting domestic tourism revenue include the number of domestic tourists and the number of travel agencies: with other factors held constant, tourism revenue increases by about CNY 1.2 billion for every additional 1 million tourists and decreases by about CNY 50 million on average for every additional travel agency.
Zhou and Shi [3] first applied principal component analysis to reduce dimensionality and extract two main factors, a transportation and reception factor and a resource quality factor, and then compared the 16 cities of Anhui Province, China, based on the comprehensive ranking of these factors in 2017. Second, using multiple linear regression models, they showed that the green coverage area, the total number of tourists received, the annual average air quality index, the railway transportation volume, and highway mileage, among others, all have important effects on tourism revenue. Ren and Zuo [4] used an improved multiple linear regression method and stepwise regression to analyze the influencing factors of tourism revenue in Yan’an city, and concluded that the number of tourists and tourist turnover had the most significant impact on tourism revenue, followed by residents’ disposable income and private car ownership. Lin et al. [5] used the R language to build a multiple linear regression model of the main factors affecting Shanghai’s tourism revenue, and concluded that the number of inbound tourists, railway mileage, and per capita disposable income of urban residents had the greatest impact. Pan and Zhang [6] used principal component analysis to extract several principal components and build a linear regression model. Their analysis found that the level of economic development, the income of urban and rural residents, and consumer prices were the main factors affecting tourism income, and the regression model built on this basis also gave good predictions.
Jiang and Zhou [7] applied grey correlation analysis from grey system theory, taking the grey comprehensive correlation degree as the metric, to analyze the factors affecting domestic tourism income, and concluded that the disposable income of urban residents, the average annual salary of employees, and per capita GDP were the leading factors. He [8] selected nine factors that affect tourism income, analyzed them quantitatively with the grey comprehensive correlation degree method, and concluded that social and economic development and the continuous improvement in infrastructure have a strong promoting effect on tourism revenue.
According to the literature reviewed above, some studies have employed traditional quantitative methods such as multiple linear regression analysis [1,2,3,4,5,9] and principal component analysis [6,10] to investigate the influencing factors of tourism revenue. These methods typically assume that the sample data are accurate and that the frequency of the disturbance term is stable. Other studies have used grey correlation analysis, a method derived from grey system theory, for the same purpose [7,8,11]. Although grey correlation analysis does not require frequency stability of the disturbance term, it still assumes that the data are accurate. However, numerous examples have shown that the frequencies of the disturbance term in sample data are often unstable and deviate significantly from the true frequencies, and it is difficult to guarantee the accuracy of data during collection. The aforementioned methods are therefore ineffective in such cases. In contrast, uncertainty theory requires neither accurate sample data nor a disturbance term whose frequencies are stable and close to the true frequencies. Hence, this paper employs uncertainty theory for the first time to study the influencing factors of tourism revenue. Because the relationship between tourism revenue and its influencing factors is found to be linear, this paper adopts uncertain linear regression analysis and proposes an uncertain linear regression model.
2. Uncertain Linear Regression Model
Uncertainty theory is a branch of mathematics for modeling belief degrees; it covers uncertain measures, uncertain variables, uncertain statistics, and uncertain programming. Uncertain regression analysis, a primary research topic in uncertain statistics, uses uncertainty theory to study the relationship between the dependent and independent variables. It provides estimates of the regression model parameters, residual analysis, forecasts, and confidence intervals.
2.1. Uncertain Linear Regression Model
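Let $(x_1, x_2, \ldots, x_p)$ be a vector of explanatory variables, and let $Y$ be a response variable. Yao and Liu [12] suggested that the functional relationship between $(x_1, x_2, \ldots, x_p)$ and $Y$ be expressed by an uncertain regression model:
$$Y = f(x_1, x_2, \ldots, x_p \mid \boldsymbol{\beta}) + \varepsilon,$$
where $\boldsymbol{\beta}$ is a vector of parameters, and $\varepsilon$ is an uncertain disturbance term (uncertain variable).
When there is a linear relationship between $(x_1, x_2, \ldots, x_p)$ and $Y$, an uncertain linear regression model is obtained [13]:
$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p + \varepsilon.$$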
2.2. Parameter Estimation
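Assume we have a set of observed data $(x_{i1}, x_{i2}, \ldots, x_{ip}, y_i)$, $i = 1, 2, \ldots, n$. Based on the observed data, the least squares estimate of $\boldsymbol{\beta} = (\beta_0, \beta_1, \ldots, \beta_p)^T$ in the uncertain linear regression model [12]:
$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p + \varepsilon$$
is
$$\hat{\boldsymbol{\beta}} = (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \mathbf{Y},$$
provided that $\mathbf{X}^T \mathbf{X}$ is a non-singular matrix, where
$$\mathbf{X} = \begin{pmatrix} 1 & x_{11} & x_{12} & \cdots & x_{1p} \\ 1 & x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_{n1} & x_{n2} & \cdots & x_{np} \end{pmatrix}, \qquad \mathbf{Y} = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}.$$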
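The least squares estimate above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `least_squares_estimate` and the demonstration data are hypothetical, and the normal equations are solved by plain Gauss-Jordan elimination so that only the standard library is needed.

```python
def least_squares_estimate(rows, y):
    """Solve the normal equations (X^T X) beta = X^T y by Gauss-Jordan elimination.

    rows: list of explanatory vectors (x_i1, ..., x_ip); y: list of responses.
    Returns [beta_0, beta_1, ..., beta_p].
    """
    # A leading 1 in every row makes beta_0 the intercept.
    design = [[1.0] + [float(v) for v in row] for row in rows]
    k = len(design[0])
    # Augmented matrix [X^T X | X^T y].
    aug = [
        [sum(r[a] * r[b] for r in design) for b in range(k)]
        + [sum(r[a] * yi for r, yi in zip(design, y))]
        for a in range(k)
    ]
    for col in range(k):
        # Partial pivoting: pick the row with the largest entry in this column.
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("X^T X is singular; the estimate is not unique")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


# Hypothetical data generated exactly from y = 1 + 2*x1 - 3*x2 (no disturbance).
rows_demo = [(0.0, 1.0), (1.0, 0.0), (2.0, 1.0), (3.0, 5.0), (4.0, 2.0)]
y_demo = [1.0 + 2.0 * x1 - 3.0 * x2 for x1, x2 in rows_demo]
beta_hat = least_squares_estimate(rows_demo, y_demo)
```

Because the demonstration data contain no disturbance, the estimate should reproduce the generating coefficients up to rounding error. In practice a numerical library routine would normally be preferred over hand-written elimination.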
2.3. Residual Analysis
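Let $(x_{i1}, x_{i2}, \ldots, x_{ip}, y_i)$, $i = 1, 2, \ldots, n$ be a set of observed data, and let the fitted regression model be [14]:
$$y = f(x_1, x_2, \ldots, x_p \mid \hat{\boldsymbol{\beta}}).$$
Then for each $i$ $(i = 1, 2, \ldots, n)$, the term
$$\hat{\varepsilon}_i = y_i - f(x_{i1}, x_{i2}, \ldots, x_{ip} \mid \hat{\boldsymbol{\beta}})$$
is called the $i$-th residual.
The residuals $\hat{\varepsilon}_1, \hat{\varepsilon}_2, \ldots, \hat{\varepsilon}_n$ will be regarded as the samples of the uncertain disturbance term $\varepsilon$ in the uncertain regression model:
$$Y = f(x_1, x_2, \ldots, x_p \mid \boldsymbol{\beta}) + \varepsilon.$$
Thus, the expected value of the uncertain disturbance term $\varepsilon$ can be estimated as the average of residuals, i.e.,
$$\hat{e} = \frac{1}{n} \sum_{i=1}^{n} \hat{\varepsilon}_i,$$
and the variance can be estimated as
$$\hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^{n} (\hat{\varepsilon}_i - \hat{e})^2.$$
Therefore, the estimated disturbance term $\hat{\varepsilon}$ is an uncertain variable with expected value $\hat{e}$ and variance $\hat{\sigma}^2$.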
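As a small illustration of the residual analysis, the following Python sketch computes the residuals of a fitted linear model together with the estimated expected value and variance of the disturbance term (with divisor $n$, as in the text). The function name `residual_summary`, the fitted coefficients, and the data are hypothetical and are not taken from the paper.

```python
def residual_summary(rows, y, beta_hat):
    """Return (residuals, e_hat, var_hat) for a fitted linear model.

    beta_hat = [beta_0, beta_1, ..., beta_p]; rows holds (x_i1, ..., x_ip).
    """
    residuals = [
        yi - (beta_hat[0] + sum(b * v for b, v in zip(beta_hat[1:], row)))
        for row, yi in zip(rows, y)
    ]
    n = len(residuals)
    e_hat = sum(residuals) / n                              # average of residuals
    var_hat = sum((r - e_hat) ** 2 for r in residuals) / n  # divisor n, not n - 1
    return residuals, e_hat, var_hat


# Hypothetical fitted model y = 1 + 2*x and four made-up observations.
rows_demo = [(0.0,), (1.0,), (2.0,), (3.0,)]
y_demo = [1.5, 2.5, 5.5, 6.5]
residuals, e_hat, var_hat = residual_summary(rows_demo, y_demo, [1.0, 2.0])
```

For these made-up observations the residuals alternate between 0.5 and -0.5, so the estimated expected value is 0 and the estimated variance is 0.25.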
2.4. Forecast Value
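Assume the forecast uncertain variable $\hat{Y}$ of the response variable $Y$ with respect to $(x_1, x_2, \ldots, x_p)$ is determined by
$$\hat{Y} = f(x_1, x_2, \ldots, x_p \mid \hat{\boldsymbol{\beta}}) + \hat{\varepsilon}.$$
Lio and Liu [14] suggested that the forecast value is defined as the expected value of the forecast uncertain variable $\hat{Y}$, i.e.,
$$\hat{\mu} = f(x_1, x_2, \ldots, x_p \mid \hat{\boldsymbol{\beta}}) + \hat{e}.$$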
2.5. Confidence Interval
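Assume the forecast uncertain variable of the response variable $Y$ with respect to a new explanatory vector $(x_1, x_2, \ldots, x_p)$ is
$$\hat{Y} = f(x_1, x_2, \ldots, x_p \mid \hat{\boldsymbol{\beta}}) + \hat{\varepsilon}, \qquad \hat{\varepsilon} \sim \mathcal{N}(\hat{e}, \hat{\sigma}),$$
and the forecast value of the response variable $Y$ is
$$\hat{\mu} = f(x_1, x_2, \ldots, x_p \mid \hat{\boldsymbol{\beta}}) + \hat{e}.$$
It follows from the operational law that $\hat{Y}$ has a normal uncertainty distribution $\mathcal{N}(\hat{\mu}, \hat{\sigma})$, i.e.,
$$\Phi(z) = \left(1 + \exp\left(\frac{\pi(\hat{\mu} - z)}{\sqrt{3}\,\hat{\sigma}}\right)\right)^{-1}.$$
Taking $\alpha$ as the confidence level (e.g., 95%), it is easy to verify that
$$b = \frac{\hat{\sigma}\sqrt{3}}{\pi} \ln\frac{1+\alpha}{1-\alpha}$$
is the minimum value $b$ so that
$$\Phi(\hat{\mu} + b) - \Phi(\hat{\mu} - b) \geq \alpha.$$
Since
$$\mathcal{M}\{\hat{\mu} - b \leq \hat{Y} \leq \hat{\mu} + b\} \geq \Phi(\hat{\mu} + b) - \Phi(\hat{\mu} - b) = \alpha,$$
Lio and Liu [14] suggested that the $\alpha$ confidence interval of the response variable $Y$ is
$$\hat{\mu} \pm \frac{\hat{\sigma}\sqrt{3}}{\pi} \ln\frac{1+\alpha}{1-\alpha}.$$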
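The forecast value and confidence interval can be sketched in a few lines of Python. This is an illustrative sketch only: it assumes a linear fitted model and a normal uncertainty distribution for the forecast variable, and the function names and numbers (`forecast_value`, `confidence_interval`, the coefficients, and $\hat{\sigma} = 0.5$) are hypothetical rather than taken from the paper.

```python
import math


def normal_uncertainty_cdf(z, mu, sigma):
    """Normal uncertainty distribution with expected value mu and std. dev. sigma."""
    return 1.0 / (1.0 + math.exp(math.pi * (mu - z) / (math.sqrt(3.0) * sigma)))


def forecast_value(beta_hat, x_new, e_hat):
    """Forecast value mu_hat = beta_0 + sum_j beta_j * x_j + e_hat (linear model)."""
    return beta_hat[0] + sum(b * v for b, v in zip(beta_hat[1:], x_new)) + e_hat


def confidence_interval(mu_hat, sigma_hat, alpha=0.95):
    """Interval mu_hat +/- b with b = sigma_hat*sqrt(3)/pi * ln((1+alpha)/(1-alpha))."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    b = sigma_hat * math.sqrt(3.0) / math.pi * math.log((1.0 + alpha) / (1.0 - alpha))
    return mu_hat - b, mu_hat + b


# Hypothetical fitted coefficients, new explanatory vector, and residual statistics.
mu_hat = forecast_value([1.0, 2.0, -3.0], (2.0, 1.0), e_hat=0.0)
low, high = confidence_interval(mu_hat, sigma_hat=0.5, alpha=0.95)
# Sanity check: the distribution assigns exactly alpha to the interval.
coverage = normal_uncertainty_cdf(high, mu_hat, 0.5) - normal_uncertainty_cdf(low, mu_hat, 0.5)
```

The final line checks numerically that $\Phi(\hat{\mu} + b) - \Phi(\hat{\mu} - b)$ equals the chosen confidence level, which is the property that defines $b$.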
2.6. Uncertain Hypothesis Test
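In order to test whether the estimated disturbance term $\hat{\varepsilon} \sim \mathcal{N}(\hat{e}, \hat{\sigma})$ is appropriate, we consider the following hypotheses [15]:
$$H_0: e = \hat{e} \text{ and } \sigma = \hat{\sigma} \quad \text{versus} \quad H_1: e \neq \hat{e} \text{ or } \sigma \neq \hat{\sigma}.$$
Given a level of significance $\alpha$ (e.g., 5%), it follows from Corollary 1 that the test for the above hypotheses is
$$W = \left\{ (z_1, z_2, \ldots, z_n) : \text{there are at least } \alpha n \text{ indexes } i \text{ with } 1 \leq i \leq n \text{ such that } z_i < \Phi^{-1}\left(\tfrac{\alpha}{2}\right) \text{ or } z_i > \Phi^{-1}\left(1 - \tfrac{\alpha}{2}\right) \right\},$$
where
$$\Phi^{-1}(\alpha) = \hat{e} + \frac{\hat{\sigma}\sqrt{3}}{\pi} \ln\frac{\alpha}{1-\alpha}.$$
For each $i$ $(i = 1, 2, \ldots, n)$, if
$$\hat{\varepsilon}_i < \Phi^{-1}\left(\tfrac{\alpha}{2}\right) \quad \text{or} \quad \hat{\varepsilon}_i > \Phi^{-1}\left(1 - \tfrac{\alpha}{2}\right),$$
then the $i$-th observation $(x_{i1}, x_{i2}, \ldots, x_{ip}, y_i)$ is regarded as an outlier. If the number of outliers is at least $\alpha n$, i.e.,
$$(\hat{\varepsilon}_1, \hat{\varepsilon}_2, \ldots, \hat{\varepsilon}_n) \in W,$$
then either the estimated disturbance term or the fitted regression model is inappropriate. Otherwise, both the estimated disturbance term and the fitted regression model are appropriate.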
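The decision rule of the uncertain hypothesis test can be sketched as follows. This is a minimal illustration under the assumption that the estimated disturbance term is a normal uncertain variable; the function names and the residual values are hypothetical and are not results from the paper.

```python
import math


def inverse_normal_uncertainty(a, e_hat, sigma_hat):
    """Inverse normal uncertainty distribution evaluated at a in (0, 1)."""
    return e_hat + sigma_hat * math.sqrt(3.0) / math.pi * math.log(a / (1.0 - a))


def uncertain_hypothesis_test(residuals, e_hat, sigma_hat, alpha=0.05):
    """Return (rejected, outliers, (lower, upper)).

    A residual is an outlier if it lies outside [lower, upper]; the null
    hypothesis is rejected when the number of outliers is at least alpha * n.
    """
    lower = inverse_normal_uncertainty(alpha / 2.0, e_hat, sigma_hat)
    upper = inverse_normal_uncertainty(1.0 - alpha / 2.0, e_hat, sigma_hat)
    outliers = [r for r in residuals if r < lower or r > upper]
    rejected = len(outliers) >= alpha * len(residuals)
    return rejected, outliers, (lower, upper)


# Ten hypothetical residuals; e_hat = 0 and sigma_hat = 1 are assumed for illustration.
clean = [-1.5, -1.0, -0.6, -0.2, 0.0, 0.1, 0.4, 0.8, 1.1, 1.4]
rejected_clean, outliers_clean, bounds = uncertain_hypothesis_test(clean, 0.0, 1.0)

# Replacing one residual with an extreme value changes the decision.
contaminated = clean[:-1] + [3.0]
rejected_bad, outliers_bad, _ = uncertain_hypothesis_test(contaminated, 0.0, 1.0)
```

With $n = 10$ and $\alpha = 0.05$, the threshold $\alpha n$ equals 0.5, so a single outlier is enough to reject the estimated disturbance term or the fitted model.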
As the theory of uncertain regression analysis has developed, a growing number of scholars have contributed to and refined it. These contributions include studies on the theory itself [16,17,18,19,20,21] as well as applications of uncertain regression analysis to practical problems [22]. Although applied research is still at an early stage, the potential application domain of uncertain regression analysis is extensive.
3. Data Collection and Model Assumptions
This paper analyzes tourism-related data from 2011 to 2020 in Shandong Province, with specific data and statistical information provided in Table 1. The main sources of data are the Statistical Yearbook of Shandong Province and the Statistical Bulletin on National Economy and Social Development of Shandong Province from 2011 to 2020.
Table 1. Tourism-related data of Shandong Province from 2011 to 2020.
Scatter plots of the sample data in Table 1 are shown in Figure 1. The horizontal axis represents tourism revenue, while the vertical axes represent the number of domestic tourists, per capita disposable income of urban residents, per capita disposable income of rural residents, highway length, railway length, and the number of travel agencies, respectively.
Figure 1. Scatter plots of sample data.
From Figure 1, it is evident that the sample data points for $(x_i, Y)$, $i = 1, 2, \ldots, 6$, roughly fall near their respective straight lines. This indicates that there is a strong linear relationship between the variables $Y$ and $x_1, x_2, \ldots, x_6$, and the uncertain linear regression model can be used to describe their relationship. Therefore, an uncertain linear regression model is established:
$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 + \beta_6 x_6 + \varepsilon,$$
where $Y$ is the tourism revenue, $x_1$ is the number of domestic tourists, $x_2$ is the per capita disposable income of urban residents, $x_3$ is the per capita disposable income of rural residents, $x_4$ is the highway length, $x_5$ is the railway length, $x_6$ is the number of travel agencies, and $\varepsilon$ is the uncertain disturbance term.
4. Results and Analysis
4.1. Parameter Estimation
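Given the above uncertain linear regression model and the data provided in Table 1, the parameters are estimated by the following formula from the literature [13]:
$$\hat{\boldsymbol{\beta}} = (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \mathbf{Y},$$
where $\mathbf{X}$ is the $10 \times 7$ matrix whose $i$-th row is $(1, x_{i1}, x_{i2}, \ldots, x_{i6})$ and $\mathbf{Y} = (y_1, y_2, \ldots, y_{10})^T$ is the vector of observed tourism revenue, both taken from Table 1,
and $\mathbf{X}^T \mathbf{X}$ is a full-rank matrix. Therefore, the parameter estimates for the uncertain linear regression model are as follows: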
4.2. Residual Analysis
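According to the above uncertain linear regression model, the formula for calculating the residuals of a linear regression model is
$$\hat{\varepsilon}_i = y_i - \left(\hat{\beta}_0 + \hat{\beta}_1 x_{i1} + \hat{\beta}_2 x_{i2} + \cdots + \hat{\beta}_6 x_{i6}\right), \quad i = 1, 2, \ldots, 10,$$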
and then
and the estimated mean and variance of the residual are, respectively,
4.3. Forecast Tourism Revenue
According to the above uncertain linear regression model, the predicted value of tourism revenue for the year 2020 is
4.4. Confidence Interval
According to the above uncertain linear regression model, the uncertain distribution function of tourism revenue is
Assume that the confidence level is , and then the confidence interval of the above uncertain linear regression model is
that is
4.5. Uncertain Hypothesis Test
To test whether is appropriate, we consider the following hypotheses:
Given a significance level , we obtain
where is the inverse uncertainty distribution of , i.e.,
Since it follows from (16) that the test for the hypotheses (22) is : there is at least one of index i’s with , such that
As shown in Figure 2, we can see that none of Thus, . Therefore, we think the estimated disturbance term and the fitted linear regression model (20) are appropriate.
Figure 2. Sample residual plot.
4.6. Result Analysis
By observing the obtained residuals and their residual plot (Figure 2) of the aforementioned uncertain linear regression model, it was found that the frequency of the disturbance term is indeed unstable, which aligns with our previous assumption. Additionally, through the analysis of tourism data, a linear relationship between tourism revenue and the selected influencing factors was discovered. Therefore, it is reasonable for this paper to choose uncertain linear regression analysis to study the influencing factors of tourism revenue.
To better validate the rationality of the obtained uncertain linear regression model in the study, an uncertain hypothesis test was conducted. The final test results confirm that the uncertain linear regression model fits the tourism data well. This indicates that the established uncertain linear regression model in this paper is reasonable and can effectively study the relationship between tourism revenue and the selected influencing factors.
Inspection of the fitted uncertain linear regression model showed that the parameter estimates $\hat{\beta}_3$ and $\hat{\beta}_4$ were negative, which contradicts the actual situation. Therefore, to obtain a more accurate model, $x_3$ (per capita disposable income of rural residents) and $x_4$ (highway length) were removed from the regression equation. The final result is an optimized uncertain linear regression model in $x_1$, $x_2$, $x_5$, and $x_6$, which is
With other factors held constant, the factors rank by their impact on tourism revenue, from largest to smallest, as follows: the number of travel agencies, railway length, the number of domestic tourists, and per capita disposable income of urban residents. For every additional 100 million domestic tourists, tourism revenue increases by CNY 0.1386 billion. For every increase of CNY 1 in per capita disposable income of urban residents, tourism revenue increases by CNY 0.0961 billion. For every additional 1 km of railway length, tourism revenue increases by CNY 0.2145 billion, and for every additional travel agency, tourism revenue increases by CNY 0.6123 billion.
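The marginal effects quoted above can be combined to estimate how a change in several factors shifts tourism revenue, other factors held constant. The following Python sketch uses only the four slope coefficients stated in the text; the dictionary keys, the function name `revenue_change`, and the example scenario are hypothetical. Only changes in revenue are computed, since no intercept is used.

```python
# Slope coefficients quoted in the text, in CNY billion per unit of each factor.
MARGINAL_EFFECT = {
    "domestic_tourists": 0.1386,  # per additional 100 million domestic tourists
    "urban_income": 0.0961,       # per additional CNY 1 of urban disposable income
    "railway_length": 0.2145,     # per additional km of railway
    "travel_agencies": 0.6123,    # per additional travel agency
}


def revenue_change(deltas):
    """Change in tourism revenue (CNY billion) for the given factor changes."""
    return sum(MARGINAL_EFFECT[name] * delta for name, delta in deltas.items())


# Hypothetical scenario: 10 km of new railway and 5 new travel agencies.
change = revenue_change({"railway_length": 10, "travel_agencies": 5})

# Factors ordered by per-unit effect, largest first.
ranking = sorted(MARGINAL_EFFECT, key=MARGINAL_EFFECT.get, reverse=True)
```

In this scenario the estimated change is 10 × 0.2145 + 5 × 0.6123 = 5.2065, i.e., about CNY 5.2 billion. The per-unit ranking depends on the measurement units of each factor, so it should be read together with those units.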
5. Conclusions
For studying the factors affecting tourism revenue, uncertain linear regression analysis has an advantage over traditional methods such as multiple linear regression analysis, principal component analysis, and grey correlation analysis: it does not require accurate data and does not assume frequency stability of the disturbance term in the regression model, which makes it better aligned with real-world situations. In addition, the uncertain hypothesis test has verified that the uncertain linear regression model fits the tourism data well. For these reasons, this paper uses uncertain linear regression analysis for the first time to investigate the influencing factors of tourism revenue and proposes a corresponding uncertain linear regression model. Since other factors may also affect tourism revenue, further uncertain linear regression models can be built in the same way.
Based on the tourism data of Shandong Province from 2011 to 2020, the obtained uncertain linear regression model shows that the number of travel agencies, railway length, the number of domestic tourists, and per capita disposable income of urban residents have significant impacts on tourism revenue. To better promote the growth of tourism revenue and the development of the tourism industry in Shandong Province, the following recommendations are proposed.
5.1. Enriching the Types of Tourism Products and Expanding the Domestic Tourism Market
The tourism industry in Shandong Province has mainly relied on sightseeing-oriented tourism products, resulting in a relatively simple market structure and limited variety of tourism products. Visitors often have brief sightseeing experiences and are unable to deeply immerse themselves in the local culture, leading to short stays and an inability to fully unleash their consumption potential. As a result, the growth in tourism revenue in Shandong Province has been slow, impacting the economic and social development of cities heavily reliant on tourism revenue. Therefore, to increase tourism revenue, efforts should be made to innovate tourism products and continuously attract domestic tourists for consumption.
On one hand, based on the unique resource advantages of each city in Shandong Province, it is advisable to draw inspiration from popular tourism projects in other provinces and cities. While meeting the diverse, personalized, and innovative demands of current tourists, emphasis should be placed on incorporating local cultural elements and enhancing visitor engagement to create immersive tourism products.
On the other hand, Shandong Province can actively promote the “tourism+” model according to the resource advantages of each city. This involves developing related tourism projects such as “tourism + agriculture”, “tourism + industry”, “tourism + marine”, etc., to create tourism products with distinctive regional characteristics, thereby attracting more domestic tourists.
5.2. Improving Transportation Infrastructure and Reducing Travel Costs for Tourism
Research analysis indicates that a convenient, fast, and low-cost transportation system plays a crucial role in promoting the development of tourism economy. Therefore, Shandong Province should continuously improve its comprehensive transportation system, including roads, railways, and aviation, to enhance the accessibility of tourist attractions in different cities. Additionally, Shandong Province should leverage its policy advantages to coordinate the development and planning of transportation departments in various cities, aiming to establish an interconnected and unified transportation system that prioritizes safety, comfort, speed, and convenience. By integrating tourism routes to meet the diverse needs of tourists, the overall service level of tourism transportation can be improved, thereby reducing the travel costs for visitors.
5.3. Increasing the per Capita Disposable Income of Residents to Promote the Development of Tourism Economy
Generally speaking, urban residents have relatively high and stable per capita disposable income. To raise the tourism consumption level of urban residents, it is necessary to dispel their negative expectations about future employment while ensuring employment stability and a high level of disposable income. Measures such as gradually raising the minimum wage standard and continuously improving the social security system can therefore be implemented to ease their concern that tourism consumption might lower their future living standards. This will fully unleash the tourism consumption potential of urban residents and increase the total tourism revenue of Shandong Province.
However, rural residents have relatively lower levels of disposable income, and they often reduce their travel expenses in order to ensure a certain frequency of tourism. Therefore, to enhance the tourism consumption level of rural residents, it is necessary to vigorously develop the agricultural economy and continuously increase the disposable income of rural residents, thereby increasing their demand for tourism consumption.
In the future, when dealing with uncertain statistical data, it is necessary to consider not only uncertain linear regression models but also further study other types of uncertain regression models, such as exponential growth models, logarithmic growth models, and logistic growth models.
Author Contributions
Conceptualization, Y.N.; Methodology, L.W.; Data curation, S.W.; Writing—original draft, S.C. All authors have read and agreed to the published version of the manuscript.
Funding
This work was supported by the Natural Science Foundation of Shandong Province grant number ZR2014GL002 and the Project ZR2021MA057 of Shandong Provincial Natural Science Foundation.
Data Availability Statement
Not applicable.
Conflicts of Interest
The authors declare no conflict of interest.
References
- Li, J. An Empirical Analysis of the Influencing Factors of China’s Domestic Tourism Income. J. Chongqing Technol. Bus. Univ. (Nat. Sci. Ed.) 2019, 36, 105–114.
- Deng, H.; Li, M. An Empirical Analysis of the Influencing Factors of China’s Tourism Income Based on the Tourism-Related Data of 22 Provinces in China. Lanzhou Acad. J. 2015, 171–176.
- Zhou, X.; Shi, T. An Analysis of the Factors Affecting China’s Tourism Income-Taking Anhui Province as an Example. J. Jining Univ. 2019, 40, 25–34.
- Ren, F.; Zuo, T. Tourism Revenue Forecast and Influencing Factors Analysis of Yan’an City. J. Yanan Univ. (Nat. Sci. Ed.) 2019, 38, 18–22.
- Lin, X.; Pang, Y.; Yu, H. A Study of Influencing Factors of Tourism Revenues in Shanghai Based on Multiple Regression Analysis. J. Wuxi Vocat. Inst. Commer. 2017, 17, 53–58.
- Pan, J.; Zhang, H. The Application of Principal Component Analysis Method in the Evaluation of the Impact Factors of Domestic Tourism Income. J. Shanxi Datong Univ. (Nat. Sci.) 2018, 34, 31–33, 57.
- Jiang, R.; Zhou, J. Research on the Influence Factors of Domestic Tourism Revenue Based on Gray Relational Analysis. Commer. Res. 2010, 8, 203–206.
- He, Z. Study on the Influence Factors of Tourism Income-Taking Henan Province as An Example. Econ. Probl. 2009, 121–122.
- Chen, M.; Chen, G.; Lin, B.; Zheng, C. Analysis of Affecting Factors of Tourism Income in Fujian Based on Econometric Model. Sci. Technol. Ind. 2019, 19, 20–25.
- Zhao, H.; Zhang, X. On the Influencing Factors and Industrial Countermeasures of Tourism Revenue in Yunnan Province: Based on Principal Component Regression Model. Acad. Explor. 2018, 104–109.
- Li, L.; Ma, R.; Zheng, F.; Lu, H.; Liu, W. Research on influencing factors of Henan province tourism revenue based on grey correlation. Sci.-Technol. Manag. 2016, 18, 80–84, 89.
- Yao, K.; Liu, B. Uncertain regression analysis: An approach for imprecise observations. Soft Comput. 2018, 22, 5579–5582.
- Liu, B. Uncertainty Theory, 5th ed.; Springer: Berlin/Heidelberg, Germany, 2021.
- Lio, W.; Liu, B. Residual and confidence interval for uncertain regression model with imprecise observations. J. Intell. Fuzzy Syst. 2018, 35, 2573–2583.
- Ye, T.; Liu, B. Uncertain hypothesis test with application to uncertain regression analysis. Fuzzy Optim. Decis. Mak. 2022, 21, 157–174.
- Lio, W.; Liu, B. Uncertain maximum likelihood estimation with application to uncertain regression analysis. Soft Comput. 2020, 24, 9351–9360.
- Liu, Z.; Yang, Y. Least absolute deviations estimation for uncertain regression with imprecise observations. Fuzzy Optim. Decis. Mak. 2020, 19, 33–52.
- Chen, D. Tukey’s biweight estimation for uncertain regression model with imprecise observations. Soft Comput. 2020, 24, 16803–16809.
- Wang, S.; Ning, Y.; Shi, H. A new uncertain linear regression model based on equation deformation. Soft Comput. 2021, 25, 12817–12824.
- Wang, S.; Ning, Y.; Shi, H.; Chen, X. A new uncertain linear regression model based on slope mean. J. Intell. Fuzzy Syst. 2021, 40, 10465–10474.
- Wang, S.; Ning, Y.; Huang, H. Uncertain Least Squares Estimation Model Based on Relative Error. J. Intell. Fuzzy Syst. 2023, 44, 8281–8290.
- Liu, Z. Uncertain growth model for the cumulative number of COVID-19 infections in China. Fuzzy Optim. Decis. Mak. 2021, 20, 229–242.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).