1. Introduction
In recent years, spatial econometric models have been used to study many economic issues, such as regional employment growth, housing prices, and technology adoption. They are an important analytical tool in fields such as regional science, geography, and economics. Spatial econometric models can be divided into parametric and non-parametric models according to how the effect of the explanatory variables is specified. Parametric models assume that the functional relationship between the explanatory variables and the dependent variable is known, and it is mostly assumed to be linear. Through the efforts of many researchers, parametric spatial models have been extended from cross-sectional data to dynamically correlated panel data, yielding the spatial dynamic panel data (SDPD) model, which adds spatio-temporal dynamic elements. The SDPD model not only considers the influence of individuals, which are often heterogeneous, but also takes spatio-temporal dependence into account, improving the flexibility of the spatial panel data model.
A typical SDPD model includes individual (or regional) and time fixed effects to account for the effects of individual-invariant and time-invariant factors on the dependent variable. A data transformation approach can be adopted to wipe out either the individual or the time fixed effects from the model in order to make the incidental parameters problem less severe (Ref. [1]). For example, Refs. [2,3] established the asymptotic properties of quasi-maximum likelihood (QML) estimators and generalized method of moments estimators, respectively, for the spatial dynamic panel model with fixed effects when both the number of individuals n and the number of time periods T are large. Ref. [4] examined the asymptotics of QML estimators for unit root SDPD models with fixed effects. Then, Ref. [5] investigated the asymptotic properties of QML estimators in the presence of unstable unit roots generated by temporal and spatial correlations and proposed a bias correction for the estimators. Ref. [6] investigated the first difference (FD) estimation of SDPD models with fixed effects using the QML approach, where both n and T are large. These studies show that SDPD models with fixed effects have attracted considerable attention in recent years.
However, as a parametric model, the SDPD model has a serious limitation: it assumes that the independent variables enter the model linearly. If the functional relationship between economic variables is not linear, or if the true relationship is unknown, the use of parametric models may have serious consequences, such as inconsistent estimates. The non-parametric model only requires the relationship between the variables to be a smooth function satisfying certain moment conditions and does not need assumptions on the model form in advance, which effectively avoids the risk of model misspecification. Current research on non-parametric spatial panel data is still incomplete. Ref. [7] proved that the estimator of the regression function in non-parametric panel data models is consistent when the variance is known, but in most cases the variance is unknown. Ref. [8] proposed a test statistic for the null hypothesis of random effects versus fixed effects in non-parametric panel data regression models. This research on non-parametric panel data has not been extended to spatial econometric models because the spatial and time lag terms that characterize spatial econometric models make it impossible to apply non-parametric estimation methods directly. This poses difficulties for both computation and theoretical derivation, and it is the problem addressed in this paper.
Following the research trend in spatial econometric models, we design a non-parametric spatial dynamic panel data (NSDPD) model with fixed effects. The influence of the explanatory variables is no longer restricted to a known linear or nonlinear parametric structure. The model thus retains the advantages of the parametric SDPD model, which can deal with spatial and/or temporal individual characteristics and spatio-temporal dependence, while removing the restriction that may lead to specification error and enhancing the interpretability and explanatory power of the SDPD model. It is worth noting that estimation methods for finite-dimensional parametric spatial models may not be directly applicable to non-parametric models, because the model contains unknown functions, which amounts to an infinite-dimensional parameter estimation problem. To overcome these difficulties, we propose a profile maximum likelihood (PML) method, which eliminates the influence of time effects through a transformation and avoids the incidental parameters problem in estimation. We then present a rigorous theoretical analysis of the asymptotic properties of the PML estimator (PMLE) and verify some of its finite-sample properties through Monte Carlo experiments. Finally, we illustrate the empirical relevance of the model by applying it to examine the impact of tourism dynamics on economic development in the Yangtze River Delta region of China.
3. Profile Likelihood Estimators and Their Asymptotic Properties
For our analysis of the asymptotic properties of the estimators, we need the following assumptions:
Assumption 1. is a constant spatial weights matrix and its diagonal elements satisfy for . Also, is uniformly bounded in row and column sums in absolute value (for short, UB).
Assumption 2. The disturbances and , are across and with zero mean, variance and for some .
Assumption 3. is invertible for all . Furthermore, is compact and is in the interior of . Also, is UB, uniformly in .
Assumption 4. is an independent, identically distributed random sequence, which is nonstochastic and bounded uniformly in different and . have second-order continuously differentiable probability density functions , where , for any on the support set.
Assumption 5. has continuous derivatives and , where is a positive constant.
Assumption 6. When , , and .
Assumption 7. The kernel function is a bounded continuous non-negative function whose support set is compact: , where is a constant. In addition, , and are UB.
Assumption 8. is a compactly supported bounded kernel such that , where is a scalar. Furthermore, all odd moments of vanish, namely , for all non-negative integers whose sum is odd. (The last condition is satisfied by spherically symmetric kernels and by product kernels based on a symmetric univariate kernel function.)
Assumption 1 is a standard normalization assumption in spatial econometrics, and Assumption 2 provides regularity assumptions for . The invertibility and compactness of in Assumption 3 originate from Kelejian and Prucha (1998, 2001). When exogenous variables are included in the model, it is convenient to assume that the exogenous regressors are uniformly bounded, as in Assumption 4. Assumption 5 is a necessary condition for (3). Assumptions 6–8 are conditions for kernel estimation. The bandwidth of the kernel function, , is an important parameter that affects the kernel estimation result. Kernel functions that satisfy Assumptions 7 and 8 exist, such as the product kernel, , where is a symmetric kernel of one variable on the closed interval .
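To make the remark about product kernels concrete, the following Python sketch builds a product kernel from a symmetric univariate kernel and checks numerically that its odd moments vanish, as Assumption 8 requires. The Epanechnikov kernel on [-1, 1], the function names, and the midpoint integration rule are illustrative assumptions and are not taken from the paper.

```python
def epanechnikov(u: float) -> float:
    """Symmetric univariate kernel supported on [-1, 1] (assumed for illustration)."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def product_kernel(v) -> float:
    """Product kernel: the product of the univariate kernel over all coordinates of v."""
    value = 1.0
    for u in v:
        value *= epanechnikov(u)
    return value


def univariate_moment(order: int, steps: int = 20000) -> float:
    """Midpoint-rule approximation of the integral of u**order * k(u) over [-1, 1]."""
    width = 2.0 / steps
    total = 0.0
    for i in range(steps):
        u = -1.0 + (i + 0.5) * width
        total += (u ** order) * epanechnikov(u)
    return total * width


def product_moment(orders) -> float:
    """Moment of the product kernel; it factorizes into univariate moments."""
    result = 1.0
    for r in orders:
        result *= univariate_moment(r)
    return result


# Orders (1, 2) sum to an odd number, so the moment should be numerically zero.
odd_moment = product_moment((1, 2))
# Orders (2, 2) sum to an even number, so the moment is strictly positive.
even_moment = product_moment((2, 2))
```

Because the moment of a product kernel factorizes, an odd total order forces at least one univariate factor of odd order, which is zero for any symmetric univariate kernel.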
For the concentrated log likelihood function (4) divided by sample size , the corresponding expected value function is , which is:
To show the consistency of , we need the following uniform convergence result.
Lemma 1. Under Assumptions 1–6, for an nonstochastic UB matrix : [Ref. [2], Lemma 15].
Lemma 2. Under Assumptions 1–6, for an nonstochastic UB matrix : [Ref. [2], Lemma 16].
The consistency of will follow from the uniform convergence of to zero on and the uniqueness identification condition (White (1994, Theorem 3.4)). The properties of each part of are shown in Lemmas 1 and 2, so the following conclusions can be drawn:
Lemma 3. Let Θ be any compact parameter space. Then, under Assumptions 1–7, uniformly in .
Lemma 4. Let Θ be any compact parameter space. Then, under Assumptions 1–7, is uniformly equicontinuous for .
Before obtaining the information matrix, we need to compute the first and second derivatives of the log-likelihood function. The asymptotic distribution of the QMLE can be derived from the Taylor expansion of around . The first-order derivative of the concentrated likelihood function involves both linear and quadratic functions of as follows:
where , . Then, the second-order derivatives are:
The information matrix is as follows:
where , , .
From Lemma 2, , .
Assumption 9. .
Assumption 9 is an important condition for the non-singularity of the limiting information matrix in addition to the global identification in Lemma 5 and Theorem 1.
Lemma 5. The information matrix is non-singular.
The proofs of Lemmas 3–5 are given in Appendix A. With Lemmas 3 and 4 established, Theorem 1 presents the consistency of if Assumption 9 holds, while Theorem 2 proves the consistency of if Assumption 9 is not satisfied.
Theorem 1. Under Assumptions 1–9, is globally identifiable and is a consistent estimator of (similar to Ref. [2]).
Theorem 2. Under Assumptions 1–8, is globally identifiable and if for (similar to Ref. [2]).
Lemma 6. Under Assumptions 1–9, if is odd:
If is even:
In either of these cases, the variance is:
where is the constant defined by Ruppert and Wand (1994).
Lemma 6 states that the leading conditional bias term depends on whether is odd or even. From the Taylor series expansion, we know that when , the remainder term of the expansion of a polynomial of order should be of order , so the result for odd is easy to understand. When is even, is odd, so the term is associated with when is odd. Because is an even function, . Therefore, there is no term, and the remainder becomes . Whether is odd or even, the bias term is therefore always of an even power. This is similar to the case of using higher-order kernel functions based on symmetric kernel functions (even functions) for local constant estimates, where the bias is always an even power of . In summary, if is odd, , and if is even, .
Theorem 3. Under Assumptions 1–9, .
The proofs of Theorems 1–3 are given in Appendix B. Theorems 1 and 2 show that the PMLEs of the parameters are consistent, and Theorem 3 shows that the PMLE of the unknown function is also consistent.
4. Monte Carlo Simulations: Methods and Results
In this section, all experiments were run in R, and the figures were produced with the ‘ggplot2’ package.
For the parametric part, we generated samples from (1) and used , , where . and were generated from the uniform distribution and the independent normal distribution . The spatial weights matrix we used was the matrix, which is one of the main types of spatial weights matrices in spatial econometrics. For the non-parametric part, the kernel function we used was the common Gaussian kernel function, . As it is difficult to select the optimal window width, we simply used the rule-of-thumb method. The spatial specific effect was drawn from the standard normal distribution and controls for all space-specific, time-invariant variables. Finally, we used the sample size and the total number of periods . For each set of and , the sampling observations were generated with the Metropolis–Hastings sampling algorithm.
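As an illustration of the kernel setup described above, the following Python sketch evaluates a Gaussian kernel and a rule-of-thumb window width. The text does not state which rule of thumb was used, so the normal-reference rule h = 1.06 · sd · n^(-1/5) is assumed here, and all function names are hypothetical.

```python
import math
import statistics


def gaussian_kernel(u: float) -> float:
    """Standard Gaussian kernel K(u) = exp(-u^2 / 2) / sqrt(2 * pi)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def rule_of_thumb_bandwidth(sample) -> float:
    """Assumed normal-reference rule: h = 1.06 * sample sd * n^(-1/5)."""
    n = len(sample)
    return 1.06 * statistics.stdev(sample) * n ** (-0.2)


def scaled_kernel(x: float, x0: float, h: float) -> float:
    """Kernel weight of observation x for evaluation point x0: K((x - x0) / h) / h."""
    return gaussian_kernel((x - x0) / h) / h


# Toy covariate values, used only to show the calls.
covariate = [0.0, 1.0, 2.0, 3.0, 4.0]
h = rule_of_thumb_bandwidth(covariate)
weights = [scaled_kernel(x, 2.0, h) for x in covariate]
```

The weights are largest for observations closest to the evaluation point and decay smoothly, and a larger window width spreads the weight over more observations.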
The evaluation of the simulation results is also divided into parametric and non-parametric parts. In the parametric part, for each estimator, we calculate the standard deviation (Std) and root mean squared error (RMSE), where is the number of simulations and are the parameter estimates obtained from each simulation. In order to estimate the parameter values accurately, following Su (2012), we took the window width here, where represents the standard deviation of sequence . In the non-parametric part, we followed Chen (2012) in choosing the mean absolute deviation error (MADE) as the evaluation standard, which is where are the fixed grid points selected within the support set of the . We selected 20 fixed grid points in , namely . When estimating the non-parametric part, we used the leave-one-out cross-validation method to select the window width, that is, the window width minimizes , where is the ith element of after the estimated value , and is the estimate of obtained with all observations other than the ith observation.
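The leave-one-out selection of the window width and the MADE criterion described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions: it uses a local constant (Nadaraya-Watson) fit with a Gaussian kernel on a single covariate and a squared-error cross-validation score, which may differ from the estimator and criterion actually used; all names are hypothetical.

```python
import math


def nw_estimate(x0, xs, ys, h, skip=None):
    """Local constant fit at x0 with a Gaussian kernel.

    If `skip` is an index, that observation is left out of the fit.
    """
    num = 0.0
    den = 0.0
    for j, (x, y) in enumerate(zip(xs, ys)):
        if j == skip:
            continue
        w = math.exp(-0.5 * ((x - x0) / h) ** 2)
        num += w * y
        den += w
    return num / den


def loocv_score(xs, ys, h):
    """Mean squared error of predicting each y_i from all other observations."""
    errors = [
        (ys[i] - nw_estimate(xs[i], xs, ys, h, skip=i)) ** 2
        for i in range(len(xs))
    ]
    return sum(errors) / len(errors)


def select_bandwidth(xs, ys, candidates):
    """Return the candidate window width with the smallest leave-one-out score."""
    return min(candidates, key=lambda h: loocv_score(xs, ys, h))


def made(estimate, truth, grid):
    """Mean absolute deviation error between two functions over fixed grid points."""
    return sum(abs(estimate(u) - truth(u)) for u in grid) / len(grid)


# Toy example: a smooth deterministic signal on an even grid.
xs = [i / 49 for i in range(50)]
ys = [math.sin(2.0 * math.pi * x) for x in xs]
best_h = select_bandwidth(xs, ys, [0.02, 0.05, 0.1, 0.3])

# Evaluate the fit on 20 fixed interior grid points.
grid = [0.05 + 0.9 * k / 19 for k in range(20)]
fit_error = made(
    lambda u: nw_estimate(u, xs, ys, best_h),
    lambda u: math.sin(2.0 * math.pi * u),
    grid,
)
```

Leaving out the ith observation prevents the score from rewarding a window width so small that each point is fitted only by itself.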
For different cases of and , 100 simulations were carried out in R. In each simulation, the Metropolis–Hastings sampling algorithm was used to draw 1000 samples from the PML function. To obtain a sample distribution close to the target and to ensure that the chain was stable, the first 200 draws were discarded. With two different values of for each and , the finite sample properties of both estimators are summarized in Table 1 and Table 2, in which we report the means, variances (Vars), root mean squared errors (RMSEs), and coverage probabilities (CPs). In each case, the mean of the parameter estimates is relatively close to the true value. For each given , when is larger, the variance of the estimators is smaller; for each given , when is larger, the biases between the true values and the estimators are nearly the same, but the variance is smaller. When both and are at their largest, that is, , the variances and RMSEs of the parameter estimators are the smallest in all cases, which indicates that the parameter estimators converge as increases, consistent with the large sample property we have proven. Also, for different values of , the variances and RMSEs do not change much.
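The sampling scheme described above (1000 draws with the first 200 discarded) can be sketched with a generic random-walk Metropolis–Hastings sampler in Python. The Gaussian proposal, the step size, the seed, and the toy log-target below are assumptions for illustration only; in the paper the target is the PML function, which is not reproduced here.

```python
import math
import random


def metropolis_hastings(log_target, start, n_draws=1000, burn_in=200,
                        step=0.5, seed=0):
    """Random-walk Metropolis-Hastings for a scalar parameter.

    Returns the draws that remain after discarding the first `burn_in`.
    """
    rng = random.Random(seed)
    current = start
    current_lp = log_target(current)
    draws = []
    for _ in range(n_draws):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_target(proposal)
        # The proposal is symmetric, so the acceptance probability is
        # min(1, target(proposal) / target(current)).
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        draws.append(current)
    return draws[burn_in:]


def toy_log_target(x: float) -> float:
    """Hypothetical stand-in target: log density of N(1, 0.5^2) up to a constant."""
    return -0.5 * ((x - 1.0) / 0.5) ** 2


kept = metropolis_hastings(toy_log_target, start=0.0)
posterior_mean = sum(kept) / len(kept)
```

Discarding the early draws reduces the influence of the starting value, which is the purpose of dropping the first 200 samples in the experiments.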
Figure 1, Figure 2, Figure 3 and Figure 4 show the variances and fitted curves of each parameter component in and under various combinations of and , where the horizontal axis is the simulation index (0 to 100) and the vertical axis is the variance. The green points are the variances obtained from each simulation, and the red curve is fitted to the 100 variances. It can be clearly seen that when and 49, the variances of the estimated value of each parameter are distributed in , most of which are less than 0.01, with a minority between 0.01 and 0.02. When , the variances are all less than 0.015, indicating that the overall fitting error is small. In addition, the number of points exceeding 0.01 among the 100 points decreases significantly, which also indicates that the variances decrease as the sample size increases and that the fitting results improve. Moreover, from the shape of the fitted curve, the variance converges after about 70 simulations, and the value to which it converges becomes smaller as the number of simulations increases, indicating that the variances of the estimators do not increase with the number of simulations and that the variances of the parameters tend to be stable. The comparison of Figure 1 with Figure 3 and of Figure 2 with Figure 4, namely between the variance fitting curves of and when , shows that the convergence and range of the variances do not change with the time period , so the large sample property proven above is confirmed.
Figure 5, Figure 6, Figure 7 and Figure 8 show the mean value and confidence interval of each parameter in and under various combinations of and , where the blue area is the range covered by the 95% confidence interval, the red broken line is the mean of the parameter estimates, and the yellow line is the true value of the parameters. As can be seen from the figures, because the estimated variances are small, the means fluctuate very little around the true values, and the ranges of the confidence intervals are also relatively stable. In only a few cases do the confidence intervals fail to cover the true values, and as and increase, the coverage becomes higher, that is, the coverage probability (CP) gradually approaches 1.
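The summary statistics reported for the parametric part (mean, Std, RMSE, and CP) can be computed across replications as in the following Python sketch. It assumes a normal-approximation 95% interval of the form estimate ± 1.96 × standard error and a sample standard deviation with divisor R − 1; the paper's exact interval construction is not stated, and the function name and toy numbers are hypothetical.

```python
import math


def summarize(estimates, std_errors, true_value, z=1.96):
    """Monte Carlo summary of one parameter across replications."""
    r = len(estimates)
    mean = sum(estimates) / r
    # Sample variance of the estimates around their own mean.
    variance = sum((e - mean) ** 2 for e in estimates) / (r - 1)
    # RMSE measures the spread around the true value, so it includes bias.
    rmse = math.sqrt(sum((e - true_value) ** 2 for e in estimates) / r)
    # Share of replications whose interval contains the true value.
    covered = sum(
        1 for e, s in zip(estimates, std_errors)
        if abs(e - true_value) <= z * s
    )
    return {
        "mean": mean,
        "std": math.sqrt(variance),
        "rmse": rmse,
        "cp": covered / r,
    }


# Toy replications, for illustration only.
summary = summarize(
    estimates=[0.9, 1.0, 1.1, 1.4],
    std_errors=[0.1, 0.1, 0.1, 0.1],
    true_value=1.0,
)
```

When the estimator is unbiased, Std and RMSE nearly coincide, so a gap between them in a results table points to bias rather than sampling variability.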
Table 3 shows the mean absolute deviation error (MADE) and variance of the estimates of the unknown function under different samples. By comparing six simulation results under the initial values of different parameters, we can see that, in finite samples, when the number of periods T is fixed, the deviation between the estimated value and the true value of decreases as the sample size n increases, which is reflected mainly in decreasing MADE values. This indicates that when T is the same, the estimates converge as n increases; when n is fixed, the deviation between the estimated values and the true values also decreases as T increases. Combining these two results, we conclude that the estimates converge to the true values as n and T increase, which is consistent with the theoretical result of Theorem 3.
5. Empirical Application: Spatial Spillovers in the Yangtze River Delta
We selected panel data for 16 cities in the Yangtze River Delta (YRD) region (Shanghai, Hangzhou, Jiaxing, Huzhou, Ningbo, Shaoxing, Zhoushan, Nanjing, Suzhou, Wuxi, Changzhou, Zhenjiang, Nantong, Yangzhou, Taizhou (Jiangsu), and Taizhou (Zhejiang)) from 2019 to 2021 (data source: Statistical Yearbook of Shanghai, Zhejiang and Jiangsu 2020–2022) to study the relationship between urban tourism development and economic growth in the YRD. The YRD city cluster is an important intersection of the “Belt and Road” and the Yangtze River Economic Belt. It plays a pivotal strategic role in China’s modernization and opening-up, is an important platform for China’s participation in international competition, and is an important leader in economic and social development. The YRD city cluster has a vast economic hinterland, modern river ports, seaports, and airports, a relatively sound highway network, a nationally leading density of road and rail transportation lines, and a three-dimensional comprehensive transportation network, which together provide important conditions for economic agglomeration. Economic agglomeration is a general state of economic development, representing the geographical and spatial concentration of economic activities. Owing to objective factors such as location, ecological environment, development basis, and degree of market development, there are significant differences in the tourism development modes of the different cities. However, neighboring cities have close economic relations, that is, the development of tourism in a region not only has a direct impact on the local economy but also has a spillover effect on the economies of neighboring regions.
In addition to the spatial lag in the same period, there may also be a space-time lag, or diffusion; that is, given spatial interaction between regions, the spatial and temporal linkage between inter-regional tourism development and economic growth is further enhanced. Therefore, spatial panel analysis of the spatial spillover effect of economic agglomeration can reveal possible economic agglomeration between regions along both the temporal and spatial dimensions and allows the structure and process of regional economic development to be studied more objectively and scientifically.
Let denote the gross domestic product (GDP) in city at period . The spatial weights matrix is and if cities and share a border and otherwise. The tourism dynamic variables are constructed from the number of tourists travelling to each city at period , including international inbound tourists and domestic tourists, which are recorded as and , respectively. GDP is likely to be influenced by a number of macroeconomic factors, which are reflected by fixed effects. The model consists of:
First, we fit the data using the ‘nlme’ package in R to obtain the form of the non-parametric part as follows:
The estimates of the parameters and are shown in Table 4; they all pass the 1% significance test and reveal interesting spatial patterns. Since the expression is in power exponential form, an increase in tourist numbers in a region has a positive impact on local economic development, which is consistent with expectations. The power index ( ) of domestic tourists is negative, and the index ( ) of international inbound tourists is positive, indicating that an increase in international inbound tourist arrivals greatly promotes local economic development, whereas an increase in domestic tourists weakens some of the economic growth generated by tourism. This is related to the consumption habits of domestic tourists and the local tourism reception capacity. First, domestic tourists usually use various discount apps to book tickets and accommodation in advance, while foreign tourists lack the knowledge and means to do so. Second, tourist souvenirs are an important part of tourism profits; they are highly attractive to international tourists, while domestic tourists often choose to buy them online rather than in scenic areas. Finally, China is a populous country, and the local tourism reception capacity is limited; if there are too many domestic tourists at the same time, such as during holidays, the travel and consumption experience of foreign tourists will inevitably be affected. On the other hand, in the parametric SDPD model, the linear relationship at is defined in advance, which is obviously inconsistent with the actual data relationship and may lead to completely different and biased conclusions.
Table 5 reports the estimated results. Firstly, it shows that the economic development of a city has a positive impact on the neighboring region ( ), that is, a city with a high degree of economic activity promotes the growth of economic activity in neighboring cities through knowledge spillover, technological innovation, industrial experience, etc. The spatial agglomeration of economic development in the YRD region thus shows a positive spatial correlation. The positive time lag coefficient ( ) indicates that the economic development of each city in the previous period had a positive impact on its own development from 2019 to 2021. This is consistent with the view that these 16 cities form the most economically developed and most valuable urban agglomeration in China. These cities have already determined their own development directions and needs, and with the support of national policies and talent, their economic and social development has been improving steadily year on year. Note that and diffusion ( ) have opposite signs, which is a very instructive finding. It shows that although the 16 cities form an urban cluster whose members develop together and help each other, their economic development structures are different. Factors that produce spatial effects in the same period, such as industry experience, can be learned and emulated in the short term, but they are not suitable for long-term application. Local governments need to develop characteristic economies according to their own economic structure and regional characteristics. On the other hand, the fixed effect ( ) is positive, indicating that the regional economic development process has unique characteristics that do not change with time, including additional individual effects and time effects such as those arising in special cases. Therefore, it is necessary to use the NSDPD model with fixed effects.