A Study on the Fiscal Sustainability of China’s Provinces

Abstract: Fiscal imbalances in China are widening, and the problem of fiscal sustainability in each province is becoming increasingly serious. However, few studies so far have focused on the fiscal sustainability of China’s provinces. This paper clarifies the degree of fiscal sustainability in China’s provinces. The Gregory-Hansen (GH) test is used to identify structural breaks in the fiscal revenue and expenditure data of each province, the panel cointegration method is used to analyze the relationship between fiscal revenue and expenditure, and dynamic ordinary least squares (DOLS) is used to estimate the degree of fiscal sustainability of each province. The fiscal sustainability of most provinces in China, such as Beijing, Shanghai and Guangdong, is found to be strong, while that of some provinces, such as Gansu, Qinghai and Xinjiang, is weak. The paper argues that more attention should be paid to the fiscal sustainability of China’s provinces, that provinces with weak fiscal sustainability should minimize unproductive expenditures, and that the central government should continue to give appropriate financial support to local governments.


Introduction
)), respectively. A characteristic of these data is that the balance of revenue and expenditure has changed from positive to negative, and the deficit has grown steadily. Due to the impact of COVID-19, the central government requires provincial governments to tighten their budgets. China's local general public budget deficit is still large, but it has decreased slightly: the provincial deficit was CNY −980 billion in 2020 and CNY −820 billion in 2021, and is planned to be CNY −720 billion in 2022. Our preliminary judgment is that fiscal risks continue to increase in China's provinces and that the fiscal sustainability of each province needs to be studied.
Under China's current tax-sharing system, the central government tends to give local governments more responsibility for expenditure without giving them corresponding revenue powers. There is no one-to-one correspondence between expenditure responsibility and revenue power, and local fiscal revenues and expenditures are unbalanced. Under the decentralized system characterized by tax-sharing arrangements and balanced transfer payments, the incentive mechanism to promote efficiency is not powerful enough, and local finance habitually relies on transfer payments from the central government [1], further aggravating the imbalance of local revenue and expenditure. Under the financial system or political system in which the central government has the responsibility to provide relief to local governments that are caught in financial risks, relief will create soft constraints on local government budgets, which will lead local governments to expand spending, reduce

Literature Review
Fiscal sustainability has always been an important issue of concern to the international academic community. Keynes believed that the excessive proportion of government debt in GDP may lead to fiscal unsustainability [5]; that is, if the fiscal revenue is insufficient to cover the cost of issuing new debt, national finance will face a problem of sustainability. Domar did not directly use the concept of "fiscal sustainability", but actually defined it as the relationship between government debt and GDP. Government debt can grow, but the growth rate should not be faster than the growth rate of GDP [6]. Since the 1980s, government debt in Western countries has grown rapidly, and research on fiscal sustainability has sprung up. Hamilton and Flavin first used the annual data of the United States to test the stationarity of fiscal deficits and accumulated debts, which pioneered the use of stationarity testing to study fiscal sustainability [7]. Wilcox added the stochastic interest rate factor to the Hamilton and Flavin model, but the conclusion is contrary to that of Hamilton and Flavin [8]. Trehan and Walsh used the annual data of the United States to test the cointegration relationship between government revenue and government expenditure, and proved that a cointegration relationship between government revenue and government expenditure is a sufficient condition for fiscal sustainability [9]. In theory, the cointegration test has been applied to the study of fiscal sustainability. Hakkio and Rush analyzed the cointegration between fiscal revenue and fiscal expenditure, and found that a condition of fiscal sustainability is that there is a cointegration relationship between fiscal revenue and fiscal expenditure [10]. Quintos tested whether there was cointegration between the US's fiscal revenue and expenditure, and found that the US budget deficit was unsustainable [11]. 
Afonso and Rault used a multi-step empirical method to test the fiscal sustainability of the EU, and found that EU public finances are sustainable overall, but that fiscal sustainability is problematic in most individual countries [12]. Akram and Rath used the panel club convergence technique to study the convergence of fiscal revenue in Indian states, and found that India's finances were basically sustainable [13].
Some scholars have also studied the relationship between China's fiscal revenue and fiscal expenditure, and examined China's fiscal sustainability. Li used the bivariate cointegration and error correction model to analyze China's fiscal revenue and expenditure data from 1950 to 1997, and found that there was a stable correlation between China's fiscal revenue and expenditure [14]. Chang and Ho used the multivariate error correction model to analyze the relationship between China's tax and expenditure from 1977-1999, and found that there was a two-way causal relationship between China's tax and expenditure [15]. Zhou and Luo used the cointegration method and China's 1952-2006 time series data to empirically find that there is a cointegration relationship between China's fiscal revenue and expenditure, and that China's fiscal deficit is significantly sustainable [16]. Ma applied the panel data Granger test and panel data cointegration method to their study and found that there was a long-run equilibrium relationship between provincial fiscal revenue and expenditure from 1979 to 2005 [17]. Yang et al. used multiple nonlinear models such as momentum-consistent threshold autoregression to study the asymmetric driving relationship between government revenue and expenditure and the long-run sustainability of a fiscal deficit. Through comparative analysis of the asymmetric adjustment of government revenue and expenditure under different conditions of fiscal improvement and deterioration, they found that the self-correcting mechanism of government revenue and expenditure promoted the long-run sustainability of China's fiscal deficit [18]. Lv and Li estimated the impact of the epidemic on China's fiscal deficit rate and assessed China's fiscal sustainable development capacity. They found that the epidemic situation has led to a more severe financial situation in China, but China's fiscal sustainability is still strong [19].
From the Chinese and foreign literature, it is clear that the mainstream methods to test whether finance is sustainable are the unit root test and cointegration test. The traditional classical representative literature includes Hamilton and Flavin [7], Trehan and Walsh [9], Quintos [11] and Bohn [20]. The traditional unit root test and cointegration theory are mainly used for empirical analysis of panel data. Next, the traditional empirical analysis of unit root and cointegration is transformed into the panel unit root test and cointegration test. Afonso and Rault used the first-and second-generation panel unit roots to test the fiscal sustainability of EU countries [21], and Afonso used panel cointegration to test EU countries' fiscal sustainability [22]. However, this econometric analysis method based on unit root and cointegration was strongly criticized by Bohn [23]. Bohn pointed out that the unit root and cointegration test cannot reject the null hypothesis of fiscal sustainability, and also pointed out that the traditional measurement method missed three kinds of fiscal sustainability [23]. Later, Escario et al. discussed multi-dimensional cointegration and fiscal sustainability [24], and Chen studied the quantile cointegration method of fiscal sustainability [25].
Building on the previous literature, this paper takes the gap between the fiscal revenue and expenditure of China's provinces as its research object, and aims to assess the fiscal sustainability of those provinces.
The marginal academic contributions of this paper are as follows. Firstly, although there is some research on fiscal sustainability at the central and local government levels, there is little discussion of provincial fiscal sustainability, and in particular no research on the sustainability of specific provinces. This study is an exploratory attempt to fill that gap.
Secondly, this study follows the advanced panel cointegration method [13,26], and examines fiscal sustainability by examining the long-run relationship between fiscal revenues and expenditures. Considering the characteristics of the fiscal expenditure and revenue patterns of various provinces in China, this paper adds an industrial structure variable to the cointegration equation, which provides a basis for the central government to transfer payments to local governments and enriches the analysis of fiscal sustainability.
Thirdly, due to the heterogeneity in population, geographical location, economic environment, financial capital distribution, policy support and other aspects, we not only study the fiscal sustainability of specific provinces but also divide the finance of 30 provinces (except Tibet) into different groups to study the fiscal sustainability of different groups. We first construct seven groups: four are related to geographical location, namely, eastern provinces, central provinces, western provinces and northeastern provinces; the remaining three are based on the per capita GDP level, namely, high-income provinces, middle-income provinces and low-income provinces. In addition, according to the proportion of transfer payments in fiscal revenue, we repeat grouping into low-transfer-payment provinces and high-transfer-payment provinces. This grouping considers the fact that the revenue and expenditure gap is large, and thus, depends on the central transfer payment. Having several types of grouping has two research objectives: first, to enable us to understand and repeatedly verify the long-run empirical relationship; second, to enable us to create sub-samples of data and check the empirical relationship under different systems.
Fourthly, since the 1980s, China's financial and economic system has undergone many major structural reforms, such as the value-added tax system reform in 1994. These institutional changes inevitably create structural breaks. When a series contains a break, the standard unit root test loses power, and the sustainability analysis may yield erroneous results. Therefore, this paper uses the structural-break unit root test to determine the break dates in the sample data [27,28], and adjusts the series accordingly.
Our results confirm that, in general, China's provincial finances show strong fiscal sustainability. According to the dynamic ordinary least squares (DOLS) estimates, most provinces and sub-panels exhibit strong sustainability, but a few provinces and sub-panels exhibit weak fiscal sustainability.
The remainder of this paper is arranged as follows: Section 3 presents the method design and sample data, Section 4 the empirical analysis, Section 5 the discussion and Section 6 the conclusions, policy recommendations and future research directions.
If the sample data contain structural breaks, the unit root test may produce biased estimates, and erroneous results can be obtained from the unit root and cointegration tests [29]. Since 1978, in view of the defects of the planned economy, China has repeatedly implemented major market-oriented reforms. The sample period of this study is long, and major social and economic events or reforms of the financial and economic system during it may have caused structural breaks in the data. Yang et al. used the SupF_T test statistic and found structural breaks in the relationship between China's fiscal revenue and expenditure, but they did not adjust the revenue and expenditure data [18].

Method Design and Sample Data
In this paper, the Gregory and Hansen [27] and Narayan and Popp [28] models are used to identify structural breaks endogenously when the break dates are unknown. Starting from a standard cointegration model, the GH model defines a dummy variable to capture changes in the intercept and slope. Depending on which of these change, the model can be subdivided into different types. Three types are estimated in this paper: first, the mean (level shift) model, which allows a break in the intercept; second, the slope model, which allows a level shift in the presence of a trend; third, the state (regime shift) model, which allows simultaneous breaks in the mean and slope. Breakpoints are determined endogenously by minimizing the GH statistics.
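The endogenous break search can be illustrated with a minimal sketch of the mean (level shift) model. It is an illustration under simplifying assumptions, not the procedure used for Table 1: the function names are hypothetical, the residual test is a plain Dickey-Fuller t-statistic without lag augmentation, the trimming fraction of 15% is an assumed default, and the data are simulated. Inference would additionally require the critical values tabulated by Gregory and Hansen [27].

```python
import numpy as np

def ols_residuals(y, X):
    """Residuals from an OLS regression of y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ coef

def df_tstat(e):
    """Dickey-Fuller t-statistic for e_t (no constant, no lagged differences)."""
    de, lag = np.diff(e), e[:-1]
    rho = (lag @ de) / (lag @ lag)
    u = de - rho * lag
    sigma2 = (u @ u) / (len(de) - 1)
    return rho / np.sqrt(sigma2 / (lag @ lag))

def gh_level_shift(y, x, trim=0.15):
    """Return (minimum DF statistic, break index) over all candidate breaks."""
    n = len(y)
    best_stat, best_break = np.inf, None
    for tb in range(int(trim * n), int((1 - trim) * n)):
        dummy = (np.arange(n) >= tb).astype(float)  # 1 from the break onward
        X = np.column_stack([np.ones(n), dummy, x])
        stat = df_tstat(ols_residuals(y, X))
        if stat < best_stat:
            best_stat, best_break = stat, tb
    return best_stat, best_break

# Simulated example: a cointegrated pair with an intercept shift at t = 60.
rng = np.random.default_rng(0)
x = np.cumsum(rng.normal(size=120))
y = 1.0 + 3.0 * (np.arange(120) >= 60) + 0.9 * x + rng.normal(scale=0.3, size=120)
stat, tb = gh_level_shift(y, x)
print(f"min DF statistic = {stat:.2f} at break index {tb}")
```

The slope and state models would follow the same loop with a trend term or a dummy-regressor interaction added to the design matrix.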

• OLS Adjustment
Based on the break dates, we use the ordinary least squares (OLS) method to adjust the data series. In the regression equations, Rev and Exp represent fiscal revenue and expenditure, respectively, Dum_1 and Dum_2 represent dummy variables for break dates 1 and 2, respectively, and t represents time. The adjusted revenue and expenditure in period t are denoted adj.Rev_t and adj.Exp_t, respectively, namely:
$\mathrm{adj.Rev}_t = \hat{\alpha} + \hat{e}_t$, $\mathrm{adj.Exp}_t = \hat{\beta} + \hat{\varepsilon}_t$ (4)
where $\hat{\alpha}$ and $\hat{\beta}$ are the estimated intercepts and $\hat{e}_t$ and $\hat{\varepsilon}_t$ are the residuals.
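The adjustment step can be sketched as follows. Because the regression equations themselves are not reproduced in this text, the specification used here (an intercept, two level-shift dummies and a linear trend) is an assumption based on the variable definitions above; the function name and the simulated revenue series are hypothetical.

```python
import numpy as np

def adjust_for_breaks(series, break1, break2):
    """Regress the series on [1, Dum1, Dum2, t]; return intercept + residual."""
    series = np.asarray(series, dtype=float)
    n = len(series)
    t = np.arange(n, dtype=float)
    dum1 = (t >= break1).astype(float)   # assumed: level-shift dummy for break 1
    dum2 = (t >= break2).astype(float)   # assumed: level-shift dummy for break 2
    X = np.column_stack([np.ones(n), dum1, dum2, t])
    coef, *_ = np.linalg.lstsq(X, series, rcond=None)
    resid = series - X @ coef
    return coef[0] + resid               # same form as Equation (4)

# Hypothetical revenue-to-GDP ratio for 1980-2019 with breaks in 1994 and 1998.
years = np.arange(1980, 2020)
rng = np.random.default_rng(1)
rev = (0.20 - 0.05 * (years >= 1994) + 0.02 * (years >= 1998)
       + 0.001 * (years - 1980) + rng.normal(scale=0.005, size=years.size))
adj_rev = adjust_for_breaks(rev, break1=14, break2=18)
print(adj_rev[:5].round(3))
```

The adjusted series keeps the estimated intercept and the unexplained variation, while the estimated break and trend effects are removed.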

Stability and Long-Run Correlation Test
• ADF and PP Unit Root Test
The augmented Dickey-Fuller (ADF) test and the Phillips-Perron (PP) test are used to test the stationarity of the series.

• Johansen Cointegration Test
In order to detect cointegration among variables, the Johansen cointegration technique is used in this paper [30]. The Johansen procedure improves on the two-step estimation method of Engle and Granger [31] by generating trace and maximum eigenvalue statistics. The two-step estimation of Engle and Granger [31] requires a large number of observations to obtain consistent results, while the Johansen cointegration test does not. The Johansen test uses cointegration and error correction to identify the long-run relationship between non-stationary variables, and extends the vector autoregression model. We use the Johansen procedure to check for cointegration in two models, Model I and Model II, in which adj.Exp_t and adj.Rev_t represent expenditure and revenue adjusted for breaks. Considering that China's political and financial systems are incompletely decentralized, and considering also the Chinese government's Budget Law, we do not introduce fiscal decentralization variables in Model II. Following Akram and Rath [13,26], Model II adds the industrial structure variable (Indus, expressed as the proportion of primary industry in GDP) and the interaction term (INT, i.e., Indus × adj.Exp_t) to further examine the response of revenue to industrial structure. A high proportion of primary industry indicates poor financial autonomy, so the central government will make transfer payments to the province. ε and u are the error terms of Model I and Model II, respectively.
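The two model equations are not reproduced in this text. Based only on the variable definitions given above, one plausible reading is the following; the coefficient symbols γ and δ are assumptions introduced here for illustration, and the form should be checked against the original equations.

```latex
\begin{align}
\text{Model I:}\quad  & \mathrm{adj.Rev}_t = \alpha + \beta\,\mathrm{adj.Exp}_t + \varepsilon_t \\
\text{Model II:}\quad & \mathrm{adj.Rev}_t = \alpha + \beta\,\mathrm{adj.Exp}_t
      + \gamma\,\mathrm{Indus}_t + \delta\,\mathrm{INT}_t + u_t,
      \qquad \mathrm{INT}_t = \mathrm{Indus}_t \times \mathrm{adj.Exp}_t
\end{align}
```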

• IBC Model
In order to test fiscal sustainability, we design an intertemporal budget constraint (IBC) empirical model based on Hakkio and Rush [10] (Equation (7)), where Exp and Rev represent the percentages of fiscal expenditure and revenue in GDP, respectively, r represents the real interest rate and Debt represents the real government debt. Writing Equation (7) for periods t + 1, t + 2, etc., and solving recursively yields Equation (8). Equation (8) satisfies two assumptions proposed by Hamilton and Flavin [7]; however, Hamilton and Flavin believed that Equation (8) could not rule out a permanent budget deficit. In order for the debt stock to tend toward zero, which requires the growth rate of the debt stock to be lower than the interest rate, this paper proposes Equation (9), with the variables defined as above and ∆ denoting the difference operator. Under the no-Ponzi-scheme condition, as long as fiscal expenditure, revenue and the debt stock are stationary in first differences, the IBC restricts the time series properties of fiscal expenditure and revenue on the right side of Equation (9). If both Rev and Exp are I(1) and cointegrated, there is an error correction mechanism that guides the government to meet the IBC; conversely, when there is no cointegration, the error correction mechanism fails and the IBC cannot be satisfied (that is, Equation (9) does not hold).

• OLS Model
The empirical model is usually specified as $\mathrm{Rev}_t = \alpha + \beta\,\mathrm{Exp}_t + u_t$ (10), where u_t represents the error term, α is the intercept and β is the slope coefficient. Three conclusions can be drawn from Equation (10): (i) strong sustainability if Rev_t and Exp_t are cointegrated and β = 1; (ii) weak sustainability if Rev_t and Exp_t are cointegrated and 0 < β < 1 (in this situation, the government may default); and (iii) unsustainability if β ≤ 0 (in this case, the deficit accumulates faster than the economy grows) [11].
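The three cases can be written as a small decision rule. This is a sketch under stated assumptions: the function name and return labels are hypothetical, the absence of cointegration is treated as unsustainable in line with the IBC discussion, β = 1 is judged by whether a Wald test rejects it, and an estimate significantly above one is left unclassified because the three cases do not cover it.

```python
def classify_sustainability(cointegrated: bool, beta: float,
                            rejects_beta_one: bool) -> str:
    """Map a cointegration result and slope estimate to a sustainability label."""
    if not cointegrated or beta <= 0:
        return "unsustainable"      # no long-run relationship, or case (iii)
    if not rejects_beta_one:
        return "strong"             # case (i): beta = 1 cannot be rejected
    if beta < 1:
        return "weak"               # case (ii): 0 < beta < 1
    return "unclassified"           # beta significantly above 1: not covered

# Example: a slope of 0.99 with an insignificant Wald statistic.
print(classify_sustainability(True, 0.99, rejects_beta_one=False))
```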

• DOLS Model
In this paper, the dynamic ordinary least squares (DOLS) method is used to obtain the long-run coefficients of Model I and Model II. DOLS was developed by Stock and Watson [32] and augments the regression with leads and lags of the first differences of the explanatory variables. Stock and Watson argued that, in small samples, DOLS outperforms both OLS and fully modified OLS (FMOLS).
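A minimal single-regressor sketch of the DOLS idea follows, together with a Wald statistic for the hypothesis β = 1. The function name, the choice of p leads and lags and the simulated data are assumptions for illustration. The standard error here is the plain OLS one; an applied analysis would normally use a long-run (serial-correlation-robust) variance, so the Wald values below are indicative only.

```python
import numpy as np

def dols(y, x, p=1):
    """Regress y_t on [1, x_t, dx_{t-p}, ..., dx_{t+p}]; return (beta, se, wald)."""
    y, x = np.asarray(y, float), np.asarray(x, float)
    dx = np.diff(x)                      # dx[i] is the change from x[i] to x[i+1]
    n = len(y)
    rows = []
    for t in range(p + 1, n - p):        # periods with all leads and lags available
        leads_lags = [dx[t - 1 + j] for j in range(-p, p + 1)]
        rows.append([1.0, x[t], *leads_lags])
    X = np.array(rows)
    yy = y[p + 1 : n - p]
    coef, *_ = np.linalg.lstsq(X, yy, rcond=None)
    resid = yy - X @ coef
    sigma2 = resid @ resid / (len(yy) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    beta, se = coef[1], np.sqrt(cov[1, 1])
    wald = ((beta - 1.0) / se) ** 2      # H0: beta = 1, compared with chi-square(1)
    return beta, se, wald

# Simulated example: revenue moves one-for-one with expenditure in the long run.
rng = np.random.default_rng(42)
exp_ratio = np.cumsum(rng.normal(size=200))
rev_ratio = 0.5 + 1.0 * exp_ratio + rng.normal(scale=0.2, size=200)
beta, se, wald = dols(rev_ratio, exp_ratio, p=2)
print(f"beta = {beta:.3f}, Wald(beta = 1) = {wald:.2f}")
```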

Endogeneity Test
Akram and Rath found that tax-expenditure and expenditure-tax are endogenous [33]. Endogeneity is therefore tested with Equations (11)-(13), in which ∆adj.Exp_it and ∆adj.Rev_it are the first differences of fiscal expenditure and revenue adjusted for breaks, and ζ_it, δ_it and u_it represent the revenue error, expenditure error and combined error, respectively. The null hypothesis is that endogeneity is present.

• The Panel Unit Root CD and CIPS Test
For the panel unit root, this paper uses the cross-sectional dependence (CD) test [34] and the cross-sectional augmented Im-Pesaran-Shin (CIPS) test [35] respectively.
To test cross-sectional correlation, Pesaran [34] proposed the CD test statistic, which asymptotically follows a standard normal distribution. This test is valid even for small panels [34]. When the errors are weakly correlated, this paper uses the CD test of [36], whose null hypothesis is weak cross-sectional dependence rather than the cross-sectional independence assumed in [34]. The test in Pesaran [36] is also less restrictive and produces consistent results when N→∞ and T→∞.
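As an illustration, the CD statistic for a balanced panel is the scaled sum of the pairwise correlation coefficients of the residuals, CD = sqrt(2T / (N(N − 1))) Σ_{i<j} ρ_ij. The sketch below assumes a residual matrix with one row per province; the function name and the simulated residuals are hypothetical.

```python
import numpy as np

def pesaran_cd(resid):
    """CD statistic for a balanced panel; resid has shape (N, T)."""
    n, t = resid.shape
    corr = np.corrcoef(resid)                    # N x N pairwise correlations
    upper = corr[np.triu_indices(n, k=1)]        # each pair i < j once
    return np.sqrt(2.0 * t / (n * (n - 1))) * upper.sum()

# Simulated residuals for 30 provinces and 40 years sharing a common shock.
rng = np.random.default_rng(7)
common = rng.normal(size=40)
resid = common + rng.normal(size=(30, 40))
print(f"CD = {pesaran_cd(resid):.2f}")
```

A large absolute value relative to the standard normal distribution points to cross-sectional dependence, which motivates the use of second-generation panel unit root tests.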
In order to account for cross-sectional correlation and heterogeneity, Pesaran [34] augmented the IPS test [37] and proposed the cross-sectionally augmented ADF (CADF) test, which extends the standard ADF regression with the lagged cross-section average and its first difference. The panel unit root statistic (CIPS) developed by Pesaran [35] is the simple average of the individual CADF statistics. The CIPS test is superior to the Levin-Lin-Chu [38] and IPS panel unit root tests because it can capture the cross-sectional dependence and heterogeneity caused by correlation between variables. In the CIPS test, the error term of each regression is assumed to follow an unobserved common factor structure, which generates the cross-sectional correlation. This makes the CIPS test very useful for handling transfer payments between provincial governments in China.
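The construction of CIPS as an average of CADF t-statistics can be sketched as follows. This is a simplified illustration with an intercept and no lag augmentation; the function names and simulated panels are hypothetical, and inference requires the critical values tabulated by Pesaran, which are not included here.

```python
import numpy as np

def cadf_tstat(y_i, ybar):
    """t-statistic on y_{i,t-1} in the cross-sectionally augmented DF regression."""
    dy, dybar = np.diff(y_i), np.diff(ybar)
    # Regressors: intercept, own lagged level, lagged cross-section mean,
    # and the first difference of the cross-section mean.
    X = np.column_stack([np.ones(len(dy)), y_i[:-1], ybar[:-1], dybar])
    coef, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ coef
    sigma2 = resid @ resid / (len(dy) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return coef[1] / np.sqrt(cov[1, 1])

def cips(panel):
    """Average of the individual CADF t-statistics; panel has shape (N, T)."""
    ybar = panel.mean(axis=0)
    return float(np.mean([cadf_tstat(row, ybar) for row in panel]))

# Simulated panels: unit-root series versus stationary white noise.
rng = np.random.default_rng(3)
random_walks = np.cumsum(rng.normal(size=(10, 80)), axis=1)
white_noise = rng.normal(size=(10, 80))
print(f"CIPS (random walks) = {cips(random_walks):.2f}")
print(f"CIPS (white noise)  = {cips(white_noise):.2f}")
```

A strongly negative CIPS value is evidence against the unit root null, so the stationary panel should yield a much more negative statistic than the random-walk panel.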

• The Pedroni Heterogeneous Panel Cointegration Test
Using the Pedroni heterogeneous panel cointegration test [39], we test for panel cointegration and analyze the long-run relationship between the variables for the aggregate panel and sub-panels. This test allows cross-section interdependence and different individual effects. In the panel versions of Model I and Model II (Equations (14) and (15)), µ_i and γ_i capture individual-specific fixed effects, and ε_it and u_it are the estimated residuals from the regressions for Model I and Model II, respectively. The Pedroni test [39] is a residual-based panel cointegration method that assesses the long-run relationship with seven statistics. Four are within-dimension (panel) statistics: panel ν, panel ρ, panel PP and panel ADF. Taking into account common time factors and heterogeneity between individuals, these four statistics pool the autoregressive coefficients across individuals. The remaining three are between-dimension (group) statistics: group ρ, group PP and group ADF. They are based on averages of the individual autoregressive coefficients associated with the unit root tests of the residuals for each cross-section of the panel. All seven statistics are asymptotically normally distributed, and the null hypothesis of no cointegration is rejected when they are significant at the 1% or 5% level.

• Dumitrescu Hurlin Causality Test for Heterogeneous Panels
Under cross-sectional dependence and heterogeneity, the standard panel Granger causality test cannot provide consistent results. To solve this problem, we adopt the panel causality test of Dumitrescu and Hurlin [40], which produces reliable results when there is dependence and heterogeneity in the panel. This test can only be used when the series are stationary; the data are therefore converted into first differences to remove the unit root. In the linear model (Equations (16) and (17)), ∆adj.Exp_t and ∆adj.Rev_t are two stationary series, M is the lag length determined by the AIC, γ are the regression coefficients and ∆ is the difference operator. If the null hypothesis of homogeneous non-causality is rejected in favor of the alternative hypothesis of heterogeneous causality, then causality exists.
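The mechanics of the test can be sketched as follows: an individual Wald statistic is computed for each province, the statistics are averaged (W-bar), and the average is standardized as Z-bar = sqrt(N / (2M)) (W-bar − M), which is asymptotically standard normal under the null. The function names and simulated data are hypothetical, the lag length is fixed rather than chosen by AIC, and the small-sample corrected statistic is omitted.

```python
import numpy as np

def rss(y, X):
    """Residual sum of squares from an OLS regression of y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ coef
    return r @ r

def granger_wald(y, x, m):
    """Wald statistic for H0: the m lags of x do not help predict y."""
    T = len(y)
    target = y[m:]
    lags_y = np.column_stack([y[m - k : T - k] for k in range(1, m + 1)])
    lags_x = np.column_stack([x[m - k : T - k] for k in range(1, m + 1)])
    const = np.ones((T - m, 1))
    unrestricted = np.hstack([const, lags_y, lags_x])
    restricted = np.hstack([const, lags_y])
    rss_u, rss_r = rss(target, unrestricted), rss(target, restricted)
    dof = len(target) - unrestricted.shape[1]
    return (rss_r - rss_u) / (rss_u / dof)

def dumitrescu_hurlin(Y, X, m=1):
    """Y, X: arrays of shape (N, T). Returns (W-bar, Z-bar)."""
    walds = [granger_wald(y_i, x_i, m) for y_i, x_i in zip(Y, X)]
    w_bar = float(np.mean(walds))
    z_bar = float(np.sqrt(len(walds) / (2.0 * m)) * (w_bar - m))
    return w_bar, z_bar

# Simulated differenced series in which expenditure leads revenue by one period.
rng = np.random.default_rng(5)
d_exp = rng.normal(size=(10, 60))
d_rev = np.zeros((10, 60))
d_rev[:, 1:] = 0.5 * d_exp[:, :-1]
d_rev += rng.normal(scale=0.5, size=(10, 60))
w_bar, z_bar = dumitrescu_hurlin(d_rev, d_exp, m=1)
print(f"W-bar = {w_bar:.2f}, Z-bar = {z_bar:.2f}")
```

Running the function a second time with the arguments swapped tests causality in the opposite direction, which is how bi-directional causality would be assessed.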

Robustness Test
We use the Westerlund panel cointegration test [41] both to test the long-run equilibrium relationship between fiscal revenue and expenditure and to verify the robustness of our Pedroni [39] panel cointegration results. Unlike Pedroni [39], the Westerlund test [41] is built on an error correction term, allowing the model to accommodate cross-sectional dependence, heteroscedasticity and serial correlation. It is founded on structural dynamics rather than residuals and is therefore not subject to the common factor restriction. The test relies on four statistics. The first two test the alternative hypothesis that the panel is cointegrated as a whole. The other two test the alternative hypothesis that at least one cross-sectional unit is cointegrated. This test is superior to Pedroni [39] in scope of application and power.

Samples and Data
The main sample of this paper consists of fiscal revenue and expenditure data. We select the total fiscal revenue (Rev) and total fiscal expenditure (Exp) of 30 provincial-level regions (provinces, municipalities and autonomous regions, excluding Tibet; hereinafter referred to as provinces) from 1980 to 2019, each expressed as a ratio to the GDP of the corresponding region. The sample also includes industrial structure (Indus), expressed as the proportion of the primary industry in GDP. The data are taken from the China Statistical Yearbook, the China Financial Yearbook, and provincial statistical and financial yearbooks.

Sample Data Breaking and Adjustment
Before assessing fiscal sustainability, we check whether there are breaks in the data series and identify the break dates. The GH test results based on the Gregory and Hansen [27] cointegration model are shown in Table 1. They show three structural breaks in the total expenditure and total revenue series of China's provinces, and the break dates of most provinces are very close to actual events. The break date estimated by the mean model is 1988, when the intercept of the cointegration relationship shifted, possibly due to the wage and price reforms of the late 1980s. The break date estimated by the slope model is 1994, which may be due to the reform of the tax-sharing system. The break date estimated by the state model is 1998, which may be due to the transition of the fiscal and taxation system and the emergence of the normalized fiscal deficit in 1998. Accordingly, we adjust the fiscal expenditure and revenue series according to Equations (1)-(4).

Stability Analysis of Fiscal Revenue and Expenditure
We conducted ADF and PP tests on fiscal revenue and expenditure, and the results are reported in Table 2. We cannot reject the null hypothesis of a unit root for the level series, whereas it is rejected at the 1% significance level for the first differences, indicating that Rev and Exp are non-stationary in levels and integrated of order one, I(1). These findings are consistent with fiscal sustainability, as the series share the same order of integration across China's provinces.

The Long-Run Relationship between Fiscal Revenues and Expenditures
In view of the shortcomings of the unit root test [20], we use Johansen cointegration, based on the maximum eigenvalue and trace tests, to evaluate the long-run relationship between fiscal expenditure and revenue. The Johansen test is applied to Model I and Model II, and the results are reported in Table 3. There is a long-run relationship between fiscal revenue and expenditure for Beijing, Tianjin, Hebei, Shanxi, Inner Mongolia, Liaoning, Shanghai, Jiangsu, Zhejiang, Anhui, Fujian, Jiangxi, Shandong, Henan, Hubei, Hunan, Guangdong, Hainan, Chongqing, Sichuan and Shaanxi. These provinces satisfy the IBC and their finances are sustainable in the long run. By contrast, there is no long-run relationship between fiscal revenue and expenditure for Jilin, Heilongjiang, Guangxi, Guizhou, Yunnan, Gansu, Qinghai, Ningxia and Xinjiang. These provinces do not satisfy the IBC; their government debt increases and their finances are unsustainable. When we incorporate the industrial structure variable and its interaction term in Model II, we find that these fiscally unsustainable provinces become sustainable because they receive transfer payments from the central government.

Degree of Fiscal Sustainability
We use the DOLS technique and the Wald test to estimate the slope coefficients of Model I and Model II in order to judge the degree of fiscal sustainability. Quintos [11] distinguishes between strong and weak fiscal sustainability. If the slope coefficient β = 1, there is strong fiscal sustainability and solvency. If 0 < β < 1, fiscal sustainability is weak. If β ≤ 0, finance is not sustainable. Hakkio and Rush held that only the strong condition is suitable for establishing fiscal sustainability [10]. Under weak fiscal sustainability, the IBC may hold, but if the situation persists, the government will face a bottleneck in financing the deficit.
The DOLS results for Model I in Table 4 show evidence of strong fiscal sustainability for 21 provinces, including Shanghai, Beijing and Zhejiang: the Wald statistic is not significant, so we cannot reject the null hypothesis of strong fiscal sustainability (β = 1). Meanwhile, the Wald statistics are significant for nine provinces, including Jilin, Heilongjiang and Guangxi, so we reject the null hypothesis of strong fiscal sustainability for them. The reason for weak sustainability is that the accumulation of fiscal deficits over the years leads to weak fiscal performance. The DOLS results for Model II in Table 4 show that the conclusions on the degree of fiscal sustainability are basically unchanged after introducing the industrial structure variable and the interaction term. In provinces with strong fiscal sustainability, the Wald statistics remain insignificant, and the industrial structure and interaction terms have no significant impact on fiscal sustainability. In provinces with weak fiscal sustainability, the Wald statistics are significant and both the industrial structure and interaction terms are significantly positive, indicating that these provinces have received more transfer payments from the central government, which promote industrial development and make their finances sustainable.

Endogeneity Test
In accordance with Equations (11)-(13), we analyze the endogeneity in panel data. The results show that there is no evidence of endogeneity and there is no significant correlation between the two error terms.

Unit Root Test for the Full Sample and Sub-Panels
We group the full sample by region, income and transfer payments. By region, we subdivide the full sample into four sub-samples: the eastern, central, western and northeast region groups. By income, we subdivide the full sample into three sub-samples: the high-income, middle-income and low-income groups. By transfer payments, we subdivide it into two sub-samples: the low-transfer-payment and high-transfer-payment groups. Table 5 reports the CIPS unit root results. Fiscal revenue and expenditure are both found to be non-stationary at the 1% significance level: the null hypothesis of a homogeneous unit root cannot be rejected, and cross-sectional dependence is supported, which is consistent with fiscal sustainability for the aggregate panel and the groups defined by income and region.
Notes (Table 5): *** indicates significance at the 1% level. p-values are given in parentheses. K stands for lag length. CIPS = cross-sectional augmented Im-Pesaran-Shin. CD = cross-sectional dependence.

Cointegration Test for the Full Sample and Sub-Panels
Given that the fiscal expenditure and revenue series are non-stationary and I(1), we use Equations (14) and (15) to establish the long-run relationship between the variables. Table 6 reports the panel cointegration results based on the seven statistics proposed by Pedroni [39]. Most of the seven statistics are significant, so we can reject the null hypothesis of no cointegration: there is a long-run relationship between expenditure and revenue in the aggregate panel and the sub-panels for both Model I and Model II, which implies fiscal sustainability.
Notes (Table 6): ***, ** and * indicate significance at the 1%, 5% and 10% levels, respectively. p-values are given in parentheses. ER, CR, WR, NR, HIG, MIG, LIG, LTPG and HTPG denote the eastern region, central region, western region, northeast region, high-income group, middle-income group, low-income group, low-transfer-payment group and high-transfer-payment group, respectively.

Estimation of Long-Run Slope Coefficients for the Aggregate and Sub-Panels
We again use the DOLS technique and the Wald test to estimate the long-run slope coefficients of the full sample and sub-panels and to assess the degree of regional fiscal sustainability. The Model I estimates in Table 7 show the degree of fiscal sustainability for the full sample and sub-panels. The slope coefficient of the full sample is 0.99, and the Wald test statistic is not significant, so we cannot reject the null hypothesis of strong fiscal sustainability. We conclude that there is strong fiscal sustainability for the aggregate panel, which is basically consistent with the qualitative judgment of Lv and Li [19]. The results of Model II lead to the same conclusion, and we find no significant impact from industrial structure. Fiscal sustainability means that, given the current situation of China's provinces, there will be no fiscal bankruptcy in the future.
In the sub-panels classified by region, the results of Model I show that the Wald test statistics for the eastern and central regions are not significant, which indicates strong fiscal sustainability, whereas those for the western and northeast regions are significant, which indicates weak fiscal sustainability. Weak fiscal sustainability means that, although the IBC is satisfied, it may be difficult to finance fiscal deficits in the future. The Model II estimates of the degree of fiscal sustainability are basically unchanged. We also find that in regions with strong fiscal sustainability, the industrial structure has no significant impact on fiscal revenue and expenditure, while in regions with weak fiscal sustainability, the industrial structure and interaction terms are significantly positive at the 10% significance level. This means that relatively high fiscal expenditure is the basis of fiscal revenue in these regions, and that revenue and expenditure are highly correlated: a low level of industrial structure triggers central transfer payments, which in turn promote the growth of fiscal revenue.
In the sub-panels classified by income, the higher the income, the stronger the fiscal sustainability. High-income and middle-income provinces have strong fiscal sustainability, while low-income provinces have weak fiscal sustainability. In high-income and middle-income provinces the industrial structure and interaction terms are not significant, but in low-income provinces they are significant, indicating that low-income provinces have received transfer payments from the central government, which has promoted their economic growth and fiscal sustainability.
In the sub-panels classified by transfer payment, the more transfer payments a province receives, the weaker its fiscal sustainability. In the low-transfer-payment provinces the industrial structure and interaction terms are not significant. By contrast, in the high-transfer-payment provinces they are significant, indicating that these provinces have received more transfer payments from the central government, which has promoted their economic growth and fiscal sustainability.
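The logic of classifying sustainability as strong or weak from a DOLS slope and a Wald test of a unit slope can be sketched as follows. This is a simplified single-series illustration under stated assumptions, not the panel estimator behind Table 7: revenue is taken as the dependent variable, one lead and one lag of the differenced regressor are used, and the Wald statistic relies on the conventional OLS covariance rather than a long-run variance estimate. The function name `dols_slope` and the simulated series are hypothetical.

```python
import numpy as np

def dols_slope(y, x, p=1):
    """OLS on y_t = a + b*x_t + sum_{j=-p..p} c_j * dx_{t-j} + u_t.

    Returns the slope b, its (conventional OLS) standard error and the
    Wald statistic for H0: b = 1.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    dx = np.diff(x)                    # dx[i] holds the change x[i+1] - x[i]
    T = len(y)
    t_idx = np.arange(p + 1, T - p)    # periods with all leads and lags available
    cols = [np.ones(len(t_idx)), x[t_idx]]
    for j in range(-p, p + 1):
        cols.append(dx[t_idx - j - 1])  # change in x at time t - j
    X = np.column_stack(cols)
    yy = y[t_idx]
    beta, *_ = np.linalg.lstsq(X, yy, rcond=None)
    resid = yy - X @ beta
    sigma2 = (resid @ resid) / (len(yy) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    b, se = beta[1], np.sqrt(cov[1, 1])
    wald = ((b - 1.0) / se) ** 2
    return b, se, wald

# Simulated (hypothetical) expenditure and revenue series with a unit slope.
rng = np.random.default_rng(0)
expenditure = np.cumsum(0.2 + rng.normal(size=200))
revenue = 0.5 + 1.0 * expenditure + 0.3 * rng.normal(size=200)
b, se, wald = dols_slope(revenue, expenditure)
# 3.841 is the 5% critical value of the chi-square distribution with 1 d.o.f.
label = "strong" if wald < 3.841 else "weak"
print(round(b, 3), label)
```

Under this reading, a slope that is statistically indistinguishable from one corresponds to strong sustainability, while a rejected unit slope with a coefficient between zero and one corresponds to weak sustainability.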

Long-Run Causality Test for Heterogeneous Panel
Taking cross-sectional dependence and heterogeneity into account, we conduct Granger causality tests using Equations (16) and (17). The results reported in Table 8 show bi-directional Granger causality between the first differences of fiscal expenditure and revenue for the aggregate panel, indicating that the government synchronizes expenditure and revenue to achieve fiscal sustainability.
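A heterogeneous-panel Granger test of this type can be sketched by averaging province-level Wald statistics, in the spirit of the Dumitrescu-Hurlin approach. Whether Equations (16) and (17) correspond exactly to that approach is an assumption here, and the sketch makes no adjustment for cross-sectional dependence. The inputs are first-differenced series, and the function names and simulated data are hypothetical.

```python
import numpy as np

def _rss(X, y):
    """Residual sum of squares from an OLS fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return float(r @ r)

def granger_wald(dy, dx, K=1):
    """Wald-type statistic for H0: K lags of dx do not help predict dy."""
    dy = np.asarray(dy, dtype=float)
    dx = np.asarray(dx, dtype=float)
    T = len(dy)
    target = dy[K:]
    own = [dy[K - k:T - k] for k in range(1, K + 1)]    # lags of dy
    other = [dx[K - k:T - k] for k in range(1, K + 1)]  # lags of dx
    const = np.ones(T - K)
    X_r = np.column_stack([const] + own)           # restricted model
    X_u = np.column_stack([const] + own + other)   # unrestricted model
    rss_r, rss_u = _rss(X_r, target), _rss(X_u, target)
    dof = (T - K) - X_u.shape[1]
    return (rss_r - rss_u) / (rss_u / dof)

def panel_granger(dY, dX, K=1):
    """Average Wald statistic and its standardized version across units."""
    W = np.array([granger_wald(dy, dx, K) for dy, dx in zip(dY, dX)])
    N = len(W)
    w_bar = float(W.mean())
    z_bar = float(np.sqrt(N / (2.0 * K)) * (w_bar - K))  # approx. N(0, 1) under H0
    return w_bar, z_bar

# Simulated (hypothetical) panel: 10 units, dx leads dy by one period.
rng = np.random.default_rng(0)
dX_sim = [rng.normal(size=60) for _ in range(10)]
dY_sim = []
for dx in dX_sim:
    dy = np.zeros(60)
    dy[1:] = 0.8 * dx[:-1] + 0.5 * rng.normal(size=59)
    dY_sim.append(dy)
print(panel_granger(dY_sim, dX_sim, K=1))
```

Running the function in both directions, from expenditure to revenue and from revenue to expenditure, is what would reveal bi-directional causality.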

Robustness Check
We use the Westerlund [41] panel cointegration method to check the robustness of the empirical results. The results reported in Table 9 show that, similar to the results achieved from the Pedroni panel cointegration, fiscal expenditure and revenue are cointegrated, and there is a long-run interaction between expenditure and revenue. China's current fiscal policy is basically sustainable in the long run.

Discussion
In terms of research methods for fiscal sustainability, some of the recent literature adopts alternatives to cointegration, such as the panel club convergence technique [13], which is a useful exploration. However, cointegration remains the mainstream method in fiscal sustainability research [7,9,11,20], and this paper adopts it. Although the method has some defects [23], it has been improved in recent years [13,24-26]. The choice of method in this paper is therefore well founded.
As for the conclusions, Yang, Zhao and Wang show that, at the national level, deficit growth did not exceed the threshold of 4.4% during the period 1992-2011. There is a self-correcting mechanism for fiscal revenue and expenditure, and China's finance is sustainable in the long run [18]. This is consistent with the strong sustainability of national finance implied by the provincial results in this paper. At the provincial level, Lv and Li pointed out that when China's general public budget expenditure exceeds revenue, the difference is made up by carrying forward balances and by transfer-in funds. During 2018-2020, the balances carried forward and central transfer-in funds used by local finance reached CNY 1231.2 billion, CNY 1896.7 billion and CNY 2110 billion, respectively. The scale continued to increase, indicating that sustainability is weak in some provinces [19], which is similar to the conclusions of this study. However, the existing studies have not analyzed the fiscal sustainability of specific provinces in China. The conclusions of this paper therefore give concrete form to those of the existing literature.

Conclusions, Policy Recommendations and Future Research Directions
Given that the previous literature has mainly studied China's fiscal sustainability at the national level, this paper uses provincial fiscal and economic data for 1980-2019 to study fiscal sustainability at the provincial level, covering the full sample of provinces as well as regional, income and transfer-payment groups. Instead of a time-series cointegration approach, the paper uses an advanced panel cointegration approach following Akram and Rath [13,26] and adds industrial structure variables to the cointegration equation to reflect Chinese characteristics.
The results of this study are as follows. First, the unit root and cointegration tests indicate that provincial finance is sustainable for the full sample and for each group, and the DOLS and Wald tests indicate that fiscal sustainability is strong in most provinces. Second, from the perspective of grouping, the eastern region is stronger than the central region, and the central region is stronger than the western and northeast regions. Middle- and high-income areas are stronger than low-income areas, and low-transfer-payment regions are stronger than high-transfer-payment regions. Third, the industrial structure variable has a significant impact on fiscal sustainability only in the western region, the northeast region, low-income regions and high-transfer-payment regions, and has no significant impact in other regions.
This paper suggests that provincial governments should modify their fiscal expenditure policies to minimize non-productive expenditure, especially in provinces with weaker fiscal sustainability. If the level of fiscal expenditure in these provinces remains unchanged, it will threaten the government's reputation, worsen the fiscal deficit and increase the risk of government debt. Provinces with weak fiscal sustainability often see their fiscal position deteriorate because of tax-base erosion or overspending. To improve their fiscal sustainability, it is necessary to reduce the deficit caused by non-productive expenditure.
Finally, it must be mentioned that this paper does not take into account the limitations and impacts of COVID-19 on fiscal sustainability. Moreover, the paper does not use other research methods to test the deficiencies of cointegration methods. These will be future research directions.