Statistical Inference for Partially Linear Varying Coefficient Spatial Autoregressive Panel Data Model

Abstract: This paper studies estimation and inference for a partially linear varying coefficient spatial autoregressive panel data model with fixed effects. Using basis function approximations and instrumental variable methods, we propose a two-stage least squares procedure to estimate the unknown parametric and nonparametric components, and we study the asymptotic properties of the proposed estimators. We also construct an empirical log-likelihood ratio function for the regression parameters, which follows an asymptotic chi-square distribution under some regularity conditions, and use it to build accurate confidence regions for the unknown parameters. Simulation studies show that the finite sample performance of the proposed methods is satisfactory in a wide range of settings. Lastly, when applied to the public capital data, the proposed model reflects the changing characteristics of the US economy better than parametric panel data models do.


Introduction
Spatial econometrics is a branch of econometrics that mainly deals with the interactions of economic units in space, where the space can be in both physical and economic dimensions. Early work on spatial econometrics dates back to [1], when the spatial autoregressive model was first introduced, and since then, it has become an active area thanks to its simplicity in estimation and interpretation. For more details on the traditional spatial autoregressive model, one may refer to [2-4] and the references therein. On the other hand, the parametric structure of the spatial autoregressive model is often subject to the risk of model misspecification, resulting in modeling bias and even inconsistent estimates. To overcome this shortcoming, nonparametric and semiparametric spatial autoregressive models have also been introduced in the last decade. To name a few, Su and Jin [5] proposed a profile quasi-maximum likelihood estimation approach for a partially linear spatial autoregressive model; Su [6] considered a nonparametric spatial autoregressive model; Malikov and Sun [7] proposed a flexible semiparametric varying coefficient spatial autoregressive model; Sun [8] studied a spatial varying coefficient model with nonparametric spatial weights; and Du et al. [9] developed a partially linear additive spatial autoregressive model and studied the asymptotic properties of the proposed estimators.
Panel data track individual units over time, enabling the estimation of complex models and the extraction of information that is not available from cross-sectional or time series data alone. This two-dimensional information also allows for more comprehensive analysis and inference ([10,11]). In spatial econometrics specifically, panel data models are also popular because they account for spatial dependence and control for unobservable heterogeneity. For instance, Lee and Yu [12] focused on a spatial autoregressive panel data model with individual fixed effects. Zhang and Shen [13] considered a partially linear spatial autoregressive panel data model with functional coefficients and random effects. Ai and Zhang [14] considered the estimation of a partially specified spatial panel data model with fixed effects. Li [15] proposed a quasi-maximum-likelihood estimation method for a dynamic spatial panel data model. Sun and Malikov [16] studied a varying coefficient spatial autoregressive panel data model with fixed effects.
In this paper, we are interested in the partially linear varying coefficient spatial autoregressive panel data model with fixed effects, namely,

$$y_{it} = \lambda_0 \sum_{j=1}^{N} w_{ij} y_{jt} + x_{it}^{\tau}\beta_0 + z_{it}^{\tau}\gamma_0(u_{it}) + \alpha_i + \varepsilon_{it}, \quad i = 1, \ldots, N, \ t = 1, \ldots, T, \qquad (1)$$

where $y_{it}$ is the response variable, and $x_{it} = (x_{it,1}, \ldots, x_{it,p})^{\tau}$, $z_{it} = (z_{it,1}, \ldots, z_{it,q})^{\tau}$, and $u_{it} \in [a, b]$ are the associated covariates. In addition, $\beta_0 = (\beta_{01}, \ldots, \beta_{0p})^{\tau}$ is a $p$-dimensional vector of unknown parameters, $\gamma_0(\cdot) = (\gamma_{01}(\cdot), \ldots, \gamma_{0q}(\cdot))^{\tau}$ is a $q$-dimensional vector of unknown functions, and $\varepsilon_{it}$ is a random error with zero mean and finite variance $\sigma^2$. The unobserved individual-specific effect $\alpha_i$ is time-invariant and accounts for the individual's unobserved ability; it is allowed to be correlated with the covariates $x_{it}$, $z_{it}$, and $u_{it}$ through an unknown correlation structure. The quantity $w_{ij}$ is the spatial weight of observation $j$ on observation $i$, which can be a decreasing function of the spatial distance between $i$ and $j$. Lastly, the scalar parameter $\lambda_0$ measures the strength of spatial dependence.

Model (1) is a unified and flexible model that includes a variety of existing models as special cases. If $\gamma_0(u)$ does not vary over $u$, it reduces to a vector of constants, so that model (1) becomes the traditional spatial autoregressive panel data model [12]. If $q = 1$ and $z_{it} = 1$, the model reduces to the partially linear spatial autoregressive model studied in [14]. If $x_{it} = 0$, the model becomes a varying coefficient spatial autoregressive model. If $\lambda_0 = 0$, model (1) becomes the partially linear varying coefficient panel data model considered in [17]. Moreover, if $\lambda_0 = 0$ and $x_{it} = 0$, model (1) reduces to the classical varying coefficient panel data model with fixed effects. For further development of this model, one may refer to, for example, [18-20].
This paper considers estimation and empirical likelihood inference for model (1). For panel data models with fixed effects, the individual effects are often viewed as nuisance parameters. We first remove the fixed effects by applying differencing techniques. Then, to keep the computational cost low, we use B-splines to approximate the nonparametric functions and propose a two-stage least squares estimation method to consistently estimate the unknown parameters. The consistency and asymptotic normality of the resulting estimators are established under some mild conditions. Moreover, to construct confidence regions for $\beta_0$ in model (1), we also propose an empirical log-likelihood ratio function for the regression parameter and show that it asymptotically follows a standard chi-square distribution.
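To illustrate why differencing disposes of the fixed effects, the following is a minimal sketch in Python. It assumes first differences over time (the text above only says "differencing techniques"), uses a noise-free toy panel without the spatial or nonparametric terms, and the slope value and the function name `first_difference` are hypothetical.

```python
def first_difference(panel):
    """Return within-unit first differences v[t] - v[t-1] for each unit (row)."""
    return [[row[t] - row[t - 1] for t in range(1, len(row))] for row in panel]


# Toy panel with N = 3 units and T = 4 periods (illustrative values only).
beta = 2.0                      # hypothetical slope
alpha = [10.0, -5.0, 0.5]       # unit-specific fixed effects
x = [
    [1.0, 2.0, 4.0, 7.0],
    [0.0, 1.0, 1.0, 3.0],
    [2.0, 2.5, 3.5, 3.0],
]
y = [[alpha[i] + beta * x_it for x_it in x[i]] for i in range(len(x))]

dx = first_difference(x)
dy = first_difference(y)

# After differencing, alpha_i has dropped out, so pooled least squares on the
# differenced data recovers beta without estimating any fixed effect.
num = sum(a * b for row_x, row_y in zip(dx, dy) for a, b in zip(row_x, row_y))
den = sum(a * a for row_x in dx for a in row_x)
beta_hat = num / den
print(beta_hat)
```

Each unit loses one time period after differencing, which is why the differenced error covariance later in the paper has dimension $T - 1$.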
The rest of the paper is organized as follows. Section 2 introduces the two-stage least squares estimation and the empirical likelihood inference for the model. Section 3 provides the regularity conditions and then derives the asymptotic properties of the estimators. Section 4 reports the simulation results for assessing the finite sample performance of the proposed methods. Section 5 demonstrates the usefulness of the proposed methods via a real data analysis. Finally, Section 6 concludes the paper with some future directions, and Section 7 presents the technical results.
For the instrumental matrix $H$, we construct it in a similar way to [13]. Specifically, in the first step, we select an initial set of instrumental variables. In the second step, we use this initial instrument matrix to obtain initial consistent estimators of $\delta_0$ and $\theta_0$, and then use these estimators to construct an updated set of instrumental variables. Finally, we use the updated instrument matrix $H$ to obtain the final estimators of $\delta_0$ and $\theta_0$.
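The role of the instrument in two-stage least squares can be seen in a small numerical sketch. The example below is a simplification, not the paper's estimator: it assumes a single endogenous regressor `d`, a single instrument `h`, no intercept, and no spline terms, and all data values are hypothetical and chosen so that the error is exactly orthogonal to the instrument.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


def two_stage_least_squares(y, d, h):
    """2SLS with one regressor d and one instrument h (no intercept)."""
    xi_hat = dot(h, d) / dot(h, h)           # first stage: regress d on h
    d_hat = [xi_hat * hi for hi in h]        # fitted (projected) regressor
    return dot(d_hat, y) / dot(d_hat, d)     # second stage


h = [1.0, -1.0, 2.0, -2.0]                   # instrument
u = [1.0, 1.0, -1.0, -1.0]                   # error term, orthogonal to h
d = [hi + ui for hi, ui in zip(h, u)]        # regressor correlated with u
y = [2.0 * di + ui for di, ui in zip(d, u)]  # true coefficient is 2

ols = dot(d, y) / dot(d, d)                  # ignores endogeneity
tsls = two_stage_least_squares(y, d, h)
print(ols, tsls)
```

In this toy data set, ordinary least squares is pulled away from the true coefficient because the regressor shares the error `u`, while the instrumented estimate is not.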
In what follows, we apply the empirical likelihood method to construct the confidence regions of $\beta_0$ and $\lambda_0$ in model (1). The empirical likelihood method was first introduced by [22] and has since been applied to various regression models ([23,24]). Compared with the two-stage least squares method, one advantage of the empirical likelihood method is that it uses only the data to determine the shape and orientation of the confidence regions of $\beta_0$ and $\lambda_0$. Another advantage is that the empirical likelihood method can construct confidence regions without estimating the asymptotic covariance, which can be rather complicated for the partially linear varying coefficient spatial autoregressive panel data model with fixed effects.
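Specifically, consider model (6). If the covariate $D$ is exogenous, then the estimating equation for the parametric component $\delta_0$ can be defined as

$$\sum_{i=1}^{N} D_i^{\tau}(\Delta\tilde{Y}_i - D_i\delta_0) = 0. \qquad (8)$$

In practice, however, $D$ is often an endogenous covariate. In this case, the estimating equation (8) does not yield a consistent estimator of $\delta_0$. To overcome this problem, we propose an adjustment of (8) based on the instrumental variable $H$, where the key idea is to replace $D$ by its linear projection onto $H$. From the model $D = H\xi + e$, the least squares estimator of $\xi$ is $\hat{\xi} = (H^{\tau}H)^{-1}H^{\tau}D$. Letting $\hat{D} = H\hat{\xi} = (\hat{D}_1^{\tau}, \ldots, \hat{D}_N^{\tau})^{\tau}$, our adjustment of (8) is given by

$$\sum_{i=1}^{N} \hat{D}_i^{\tau}(\Delta\tilde{Y}_i - D_i\delta_0) = 0.$$

To define the empirical likelihood ratio, we first treat $\eta_i(\delta_0) = \hat{D}_i^{\tau}(\Delta\tilde{Y}_i - D_i\delta_0)$ as the auxiliary random vector. Then, by [22], an empirical log-likelihood ratio function for $\delta_0$ can be defined as

$$L(\delta_0) = -2\max\Big\{\sum_{i=1}^{N}\log(Np_i) : p_i \ge 0, \ \sum_{i=1}^{N}p_i = 1, \ \sum_{i=1}^{N}p_i\eta_i(\delta_0) = 0\Big\}, \qquad (9)$$

where $p_i$, $i = 1, \ldots, N$, are non-negative real numbers. Finally, through the Lagrange multiplier method, we can show that

$$L(\delta_0) = 2\sum_{i=1}^{N}\log\{1 + \phi^{\tau}\eta_i(\delta_0)\},$$

where $\phi$ is a $(p + 1)$-dimensional vector that satisfies the equation

$$\frac{1}{N}\sum_{i=1}^{N}\frac{\eta_i(\delta_0)}{1 + \phi^{\tau}\eta_i(\delta_0)} = 0. \qquad (10)$$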
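The Lagrange-multiplier step can be made concrete with a small numerical sketch. The code below is an illustration under simplifying assumptions: the auxiliary quantities are scalars rather than $(p + 1)$-dimensional vectors, the multiplier is found by bisection (a choice made here for transparency, not a method taken from the paper), and the function name `el_log_ratio` and the input values are hypothetical.

```python
import math


def el_log_ratio(eta, tol=1e-12):
    """Empirical log-likelihood ratio 2 * sum(log(1 + phi * eta_i)), scalar case.

    phi solves sum(eta_i / (1 + phi * eta_i)) = 0. A solution exists only if
    zero lies strictly inside the range of eta; otherwise return infinity.
    """
    eta_min, eta_max = min(eta), max(eta)
    if not (eta_min < 0.0 < eta_max):
        return math.inf

    # phi must keep every implied weight positive: 1 + phi * eta_i > 0.
    lo, hi = -1.0 / eta_max, -1.0 / eta_min

    def score(phi):
        return sum(e / (1.0 + phi * e) for e in eta)

    # score is strictly decreasing on (lo, hi), so bisection finds the root.
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break

    phi = 0.5 * (lo + hi)
    return 2.0 * sum(math.log(1.0 + phi * e) for e in eta)


centered = [-2.0, -1.0, 0.0, 1.0, 2.0]   # sample mean is zero
shifted = [-1.0, 0.0, 1.0, 2.0, 3.0]     # sample mean is one
ratio_centered = el_log_ratio(centered)
ratio_shifted = el_log_ratio(shifted)
print(ratio_centered, ratio_shifted)
```

When the auxiliary values already average to zero, the multiplier is zero and the ratio vanishes; the further the sample is from satisfying the estimating equation, the larger the ratio becomes.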

Asymptotic Properties
, and define $Z_i$, $u_i$, and $\varepsilon_i$ analogously. To derive the asymptotic properties of the proposed estimators, we need the following regularity conditions.
(C1) $\{(X_i, Z_i, u_i, \varepsilon_i), i = 1, \ldots, N\}$ are independent and identically distributed, and for all where $a_0 \asymp b_0$ means that the ratio $a_0/b_0$ is bounded away from zero and infinity.
(C6) $\Gamma^{\tau}\Gamma/N$ converges in probability to a positive definite matrix $\Pi$, where Furthermore, we assume that $f(u)$ is continuously differentiable on $(a, b)$.
(C9) Denote $\tilde{H} = (I - S)H = (\tilde{H}_1^{\tau}, \ldots, \tilde{H}_N^{\tau})^{\tau}$ and $\Sigma_i = E(\Delta\varepsilon_i\Delta\varepsilon_i^{\tau})$. We assume that
Condition (C1) or a variant of it is commonly assumed in spatial panel data models. It requires the explanatory variables $(x_{it}, z_{it}, u_{it})$, the instrumental variables $H$, and the spatial weighting matrix $W$ to be exogenous. Condition (C2) imposes restrictions on the spatial weighting matrix. These restrictions are required in the setting of a spatial autoregressive model ([3,4]). Condition (C3) is a standard condition for the polynomial spline method ([25]). Condition (C4) ensures that the functions $\gamma_{0l}(u)$ are sufficiently smooth. Condition (C5) is required to achieve the optimal convergence rate for the estimator of $\gamma_{0l}(u)$. Condition (C6) is required to establish the asymptotic results. Condition (C7) is required to ensure the identifiability of the parameters. Condition (C8) is commonly used in the nonparametric literature. Lastly, Condition (C9) is routinely used in empirical likelihood inference ([23,26]).
Let $\xrightarrow{D}$ denote convergence in distribution. The following two theorems give the asymptotic distribution and the convergence rate of the 2SLS estimators of $\delta_0$ and $\gamma_0(u)$, respectively.
Theorem 1. Under conditions (C1)-(C8), we have where
Theorem 2. Under conditions (C1)-(C8), we have
Theorem 1 shows that the 2SLS estimator of the parametric component $\delta_0$ is $\sqrt{N}$-consistent. Theorem 2 indicates that the 2SLS estimator of $\gamma_0(u)$ achieves the optimal convergence rate for nonparametric regression with independent and identically distributed data in [27]. In addition, the above two theorems allow us to construct a confidence region for $\delta_0$, provided that a consistent estimator of the asymptotic covariance $\Omega$ is available. Theorem 3 shows that $\hat{\Omega}$ is such a consistent estimator. Moreover, by Theorem 1 and Slutsky's theorem, the 2SLS estimator standardized by $\hat{\Omega}$ is asymptotically standard normal. Hence, the asymptotic $100(1 - \alpha)\%$ confidence interval for $\delta_{0k}$ can be constructed as the 2SLS estimate of $\delta_{0k}$ plus or minus $z_{1-\alpha/2}$ times its estimated standard error, where $z_{1-\alpha/2}$ is the $1 - \alpha/2$ quantile of the standard normal distribution and the standard error is based on $\hat{\Omega}_{kk}$, the $k$th diagonal element of $\hat{\Omega}$.
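As a rough numerical illustration of the normal-approximation interval, the sketch below assumes that $\Omega$ is the asymptotic covariance of $\sqrt{N}$ times the estimation error, so that the standard error of the $k$th component is $(\hat{\Omega}_{kk}/N)^{1/2}$. That scaling, the function name `wald_interval`, and the input values are assumptions made for this example.

```python
import math
from statistics import NormalDist


def wald_interval(estimate, omega_kk, n, alpha=0.05):
    """Normal-approximation interval: estimate +/- z * sqrt(omega_kk / n)."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)   # standard normal quantile
    half_width = z * math.sqrt(omega_kk / n)
    return estimate - half_width, estimate + half_width


# Hypothetical inputs: point estimate 0.5, variance element 4.0, N = 400.
lower, upper = wald_interval(estimate=0.5, omega_kk=4.0, n=400)
print(lower, upper)
```

The interval width shrinks at the rate $N^{-1/2}$, in line with the $\sqrt{N}$-consistency of the parametric estimator.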
Next, the following theorem establishes the asymptotic distribution of the empirical log-likelihood ratio function $L(\delta_0)$ in (9).
Theorem 4. Under conditions (C1)-(C9), if $\delta_0$ is the true value of the parameter, then

$$L(\delta_0) \xrightarrow{D} \chi^2_{p+1},$$

where $\chi^2_{p+1}$ is a standard chi-square distribution with $p + 1$ degrees of freedom.
Theorem 4 can be used to construct the empirical likelihood confidence regions for $\delta_0$. For any $0 < \alpha < 1$, an approximate $1 - \alpha$ confidence region for $\delta_0$ is given by

$$\{\delta : L(\delta) \le \chi^2_{p+1}(1 - \alpha)\},$$

where $\chi^2_{p+1}(1 - \alpha)$ is the $1 - \alpha$ quantile of the standard chi-square distribution with $p + 1$ degrees of freedom.

Simulation Study
In this section, we investigate the finite sample performance of the proposed estimation and inference methods with a simulation study. The data are generated from the model x it,1 and i ∼ N(0, 1), and $\alpha_1 = -\sum_{i=2}^{N}\alpha_i$. Throughout the simulation, we use centered cubic B-splines as the basis functions. The smoothing parameter $K$ is selected using the generalized cross-validation (GCV) criterion. Similar to [28], we focus on a spatial scenario with a total of $R$ districts, where each district has $l$ members and each neighbor of a member is given equal weight, that is, $W = I_R \otimes B_l$ with $B_l = (e_l e_l^{\tau} - I_l)/(l - 1)$, where $e_l$ is an $l$-dimensional column vector with all elements equal to 1 and $\otimes$ is the Kronecker product. In our simulation, the sample sizes are set to $T = 4$ and 6, and $N = R \times l$ with $R = 30$ and 50 and $l = 4$ and 8. For comparison, three different values, $\lambda_0 = 0.2$, 0.5, and 0.8, are considered, where $\lambda_0 = 0.2$ represents weak spatial dependence, $\lambda_0 = 0.5$ represents mild spatial dependence, and $\lambda_0 = 0.8$ represents strong spatial dependence.
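The district-based weighting scheme can be sketched in a few lines of Python. The sketch assumes the usual block-diagonal form in which every member of a district gives weight $1/(l - 1)$ to each of the other $l - 1$ members and zero weight to members of other districts; the function name `group_weight_matrix` is hypothetical.

```python
def group_weight_matrix(R, l):
    """Block-diagonal weights: R districts of l members, equal within-district weights."""
    n = R * l
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            same_district = (i // l) == (j // l)
            if same_district and i != j:
                W[i][j] = 1.0 / (l - 1)   # equal weight on each neighbor
    return W


W = group_weight_matrix(R=3, l=4)          # small example, N = 12
row_sums = [sum(row) for row in W]
print(row_sums[0], W[0][1], W[0][4])
```

By construction the matrix has a zero diagonal and rows that sum to one, so it is already row-normalized.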
We assess the performance of the two-stage least squares estimation by the average bias (Bias) and the sample standard deviation (SD) of the parametric estimators, and we assess the estimator of the varying coefficient function $\gamma_0(\cdot)$ by the square root of the average squared error (RASE), which is defined as

$$\mathrm{RASE} = \Big\{\frac{1}{N_0}\sum_{j=1}^{N_0}\big[\hat{\gamma}_0(u_j) - \gamma_0(u_j)\big]^2\Big\}^{1/2},$$

where $\{u_j, j = 1, \ldots, N_0\}$ are the regular grid points at which the function $\hat{\gamma}_0(u)$ is evaluated. In our simulation, $N_0 = 100$ is used. We carry out 1000 simulations for each setting and summarize the results in Table 1 and Figure 1. Table 1 lists the average biases and standard deviations of the estimators of $\lambda_0$, $\beta_1$, and $\beta_2$, and the average RASEs of the estimator of $\gamma_0(\cdot)$. Figure 1 presents the estimator of $\gamma_0(\cdot)$ in a typical sample, which is selected so that its RASE equals the median over the 1000 replications.
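The RASE criterion is straightforward to compute. The sketch below evaluates it on a regular grid of 100 points; the interval $[0, 1]$, the true curve, and the deliberately biased "estimate" are hypothetical stand-ins chosen only to make the result easy to check.

```python
import math


def rase(gamma_hat, gamma_true, grid):
    """Square root of the average squared error over the grid points."""
    squared_errors = [(gamma_hat(u) - gamma_true(u)) ** 2 for u in grid]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def gamma_true(u):
    """Hypothetical true coefficient function."""
    return math.sin(2.0 * math.pi * u)


def gamma_hat(u):
    """Hypothetical estimate: the true curve shifted upward by 0.1."""
    return gamma_true(u) + 0.1


n0 = 100
grid = [j / (n0 - 1) for j in range(n0)]   # regular grid on the assumed interval [0, 1]
value = rase(gamma_hat, gamma_true, grid)
print(value)
```

Because the hypothetical estimate is off by a constant, the RASE here equals the size of that constant shift.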
From Table 1 and Figure 1, we can make a few observations: (i) All the parameter estimates are close to their true values. (ii) The standard deviations of the estimators of $\lambda_0$, $\beta_1$, and $\beta_2$ decrease as the sample size increases. (iii) The RASEs of the estimator of $\gamma_0(u)$ are small in all cases and decrease as the sample size increases, so the estimated curves fit the corresponding true curve well, which agrees with Figure 1. To conclude, the simulation results support the validity and effectiveness of the proposed estimation procedure.
The second aim of this simulation study is to construct confidence intervals for the parameters $\lambda_0$, $\beta_1$, and $\beta_2$. We compare two approaches: the empirical likelihood (EL) approach and the normal approximation based on the two-stage least squares estimator (2SLS). The average lengths of the confidence intervals and their corresponding empirical coverage probabilities, at the confidence level $1 - \alpha = 95\%$, are computed from 1000 simulation runs. The simulation results are presented in Table 2. EL has shorter interval lengths and higher coverage probabilities, which implies that EL performs better than 2SLS in terms of the coverage accuracy of the confidence intervals. Lastly, we note that most of the interval lengths decrease and the empirical coverage probabilities increase as the sample size increases.

A Real Data Example
In this section, we apply the proposed estimation methods for model (1) to investigate the productivity of public capital in private production, based on data for 48 US states observed over 17 years (1970-1986). The public capital data have been considered in [11,29-31] and can be downloaded from http://www.mysmu.edu/faculty/zlyang/ (accessed on 1 March 2022). We also note that the previous works were all conducted within the parametric framework, assuming constant elasticities of the specified models across all the states and all the years. Nevertheless, due to changes in policies as well as changes in the economic environment, including the 1973 oil crisis and the 1979 energy crisis, the constant elasticity assumption can be questionable. In addition, spatial spillover effects are also discussed in the literature. For example, Xu and Yang [32] employed spatial panel data models to capture the possible spatial spillover effects, and they further pointed out that a temporal heterogeneity pattern is observed in the parameter estimation. In view of this, we propose the following partially linear varying coefficient spatial autoregressive panel data model: where $y_{it}$ denotes the gross state product of state $i$ in year $t$; $\alpha_i$ reflects the unobserved individual fixed effect; $Pc_{it}$ denotes the public capital, including highways and streets, water and sewer facilities, and other public buildings; $L_{it}$ denotes the labor input measured as employment in non-agricultural payrolls; $Ps_{it}$ is the stock of private capital; and $Unemp_{it}$ is the state unemployment rate, included to capture business cycle effects. The spatial weight matrix $W$ is specified in contiguity form, where the $(i, j)$th element is 1 if states $i$ and $j$ share a common border and 0 otherwise. Note that the final $W$ is also row-normalized.
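The contiguity specification followed by row normalization can be sketched as follows. The four regions and their border relationships are hypothetical, not the actual 48-state map, and the function name `row_normalized_contiguity` is an assumption made for this example.

```python
def row_normalized_contiguity(neighbors, units):
    """Build a 0/1 contiguity matrix and divide each row by its row sum."""
    index = {unit: k for k, unit in enumerate(units)}
    n = len(units)
    W = [[0.0] * n for _ in range(n)]
    for unit in units:
        for other in neighbors.get(unit, ()):
            W[index[unit]][index[other]] = 1.0   # shared border
    for row in W:
        total = sum(row)
        if total > 0:                            # leave isolated units as zero rows
            for k in range(n):
                row[k] /= total
    return W


# Hypothetical regions arranged in a line: A - B - C - D.
units = ["A", "B", "C", "D"]
neighbors = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
W = row_normalized_contiguity(neighbors, units)
for row in W:
    print(row)
```

After normalization, the spatial lag of a region is the simple average of the outcomes in its bordering regions.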
The fitted results are reported in Table 3, including the estimates (EST) of the parameters and the 95% confidence intervals (CI). The results in the left panel of Table 3 show that $Pc_{it}$ does not have a significant effect on the states' private economic growth. This conclusion is consistent with the finding in [30]. This leads to a reconstructed model as follows:
Table 3. The estimates of the parameters and their 95% confidence intervals.
From the right panel of Table 3, we can see that the spatial coefficient estimate is significant, which reflects the spatial dependence and confirms the existence of spillover effects between states. Moreover, $L_{it}$ affects the states' private economic growth positively, and $Unemp_{it}$ affects it negatively. Further, the fitted varying coefficient function curve is presented in Figure 2. The estimated curve has two inflection points, which approximately correspond to the 1973 oil crisis and the 1979 energy crisis. Figure 2 indicates the fluctuating effects of $Ps_{it}$ on the states' private economic growth. In the mid-1970s, the effect of $Ps_{it}$ on the states' private economic growth was approximately unchanged, while in the early 1970s and the mid-1980s, the negative effect of $Ps_{it}$ increased rapidly. This suggests that standard applications of parametric panel models may not be valid for these data.

Conclusions and Discussion
In this paper, we studied the statistical inference for a partially linear varying coefficient spatial autoregressive panel data model with fixed effects. By means of basis function approximations and instrumental variable methods, we proposed a two-stage least squares estimation procedure to estimate the unknown parametric and nonparametric components, and derived the convergence rate and asymptotic distributions of the estimators under some regularity conditions. We further constructed an empirical log-likelihood ratio function to derive the empirical likelihood confidence regions for the parametric component, which are shown to have asymptotically correct coverage probability. Simulation studies and a real data analysis also demonstrated that the proposed method performs well in finite sample settings.
Lastly, we note that there are some interesting directions for future research. First, extending the model to a case with spatial errors would be useful yet challenging work. Second, the present paper assumes the spatial matrix $W$ to be predetermined and time-invariant. In practice, however, the spatial structure $W$ may change over time, especially when $T$ is large. In addition, the spatial coefficient $\lambda_0$ may also change with time. These circumstances are outside the scope of the present paper and are left for future research.

Proof of the Main Results
To prove the theorems obtained in Section 3, we first present several lemmas. Note that the first three are essentially the same as Corollary 6.21 in [25], Lemma 4.5 in [33], and Lemma A.2 in [34], respectively. For convenience and simplicity, we let $C$ denote a positive constant that may be different at each appearance throughout this section.
For any given vector $a \in R^{p+1}$ with $a^{\tau}a = 1$, invoking condition (C1), it is easy to show that $E(a^{\tau}\omega_i) = 0$ and Hence, $a^{\tau}\omega_i$ satisfies the Lyapunov condition for the central limit theorem, yielding that Finally, by (13)-(17), we have This proves the lemma.
Lemma 5. Under conditions (C1)-(C9), if $\delta_0$ is the true value of the parameter, we have Proof. Following the same notation as in Lemma 4, we can derive that Further, by an argument similar to that for (16), we have This proves the lemma. It is easy to show that $E(\Delta\varepsilon\Delta\varepsilon^{\tau}) = \sigma^2 I_N \otimes A$, where $A = 2I_{T-1} - J_{T-1}(0) - J_{T-1}(0)^{\tau}$ is a $(T - 1) \times (T - 1)$ matrix.
Based on Gerschgorin's disk theorem ([36]), if $\lambda$ is an eigenvalue of $A$, then $0 \le \lambda \le 4$. Thus, there exists a constant $\lambda_{\max}$ such that $E(\Delta\varepsilon\Delta\varepsilon^{\tau}) < \lambda_{\max} I$. Further, by condition (C2), we have or, equivalently, $R_{12} = O_p(1)$. Invoking conditions (C2) and (C7), we have This implies that $R_{13} = O_p(1)$. Similarly, it can also be shown that $R_{14} = O_p(1)$. Taken together, we have Next, for the term $D^{\tau}(I - S)M(I - S)\Delta\varepsilon$, we note that By a simple calculation, we have Applying the triangle inequality and invoking (18) and (21), we obtain . Thus, we have Lastly, for the term $D^{\tau}(I - S)M(I - S)V_0$, we can represent it as Invoking condition (C7) and Lemma 1, we have Combining the above results, we can obtain that Finally, invoking conditions (C1) and (C6), and using the central limit theorem and Slutsky's theorem, we have This completes the proof of Theorem 1.
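The eigenvalue bound for $A$ can be checked directly from the Gerschgorin disks. The sketch below assumes that $J_{T-1}(0)$ is the shift matrix with ones on one off-diagonal, so that $A$ is tridiagonal with 2 on the diagonal and $-1$ beside it, as is the case for first-differenced errors with constant variance; the function names are hypothetical.

```python
def difference_cov_matrix(m):
    """Tridiagonal m x m matrix with 2 on the diagonal and -1 next to it (m = T - 1)."""
    return [
        [2.0 if i == j else (-1.0 if abs(i - j) == 1 else 0.0) for j in range(m)]
        for i in range(m)
    ]


def gershgorin_bounds(A):
    """Smallest and largest endpoints of the Gerschgorin intervals of a real symmetric A."""
    lows, highs = [], []
    for i, row in enumerate(A):
        radius = sum(abs(v) for j, v in enumerate(row) if j != i)
        lows.append(row[i] - radius)    # center minus radius
        highs.append(row[i] + radius)   # center plus radius
    return min(lows), max(highs)


A = difference_cov_matrix(5)            # corresponds to T = 6
low, high = gershgorin_bounds(A)
print(low, high)
```

Every disk is centered at 2 with radius at most 2, which is where the interval $[0, 4]$ for the eigenvalues comes from.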
Using condition (C7), similar to the proof of (20 Taken together, (24) also holds. This completes the proof of Theorem 3.
Further applying the Taylor expansion to $L(\delta_0)$ yields that In addition, from (10), we have the projection matrix onto the space spanned by $Q$. Partialling out the B-spline approximation, we obtain $(I - S)\Delta Y = (I - S)D\delta_0 + (I - S)V_0 + (I - S)\Delta\varepsilon$.
(C2) The matrix $I - \lambda_0 W$ is nonsingular with $|\lambda_0| < 1$, and the row and column sums of the matrices $W$ and $(I - \lambda_0 W)^{-1}$ are bounded uniformly in absolute value for any $|\lambda_0| < 1$. Moreover, for the matrix $C_0 = W(I - \lambda_0 W)^{-1}$, there exists a constant $\lambda_c$ such that $\lambda_c I - C_0 C_0^{\tau}$ is positive semidefinite.
(C3) Let the internal knots of the spline be $s_j$, $j = 1, \ldots, K_l$. Also, letting $d_j = s_j - s_{j-1}$ and $d = \max_{1 \le j \le K_l} d_j$, there exists a constant $M_0$ such that $\max_{1 \le j \le K_l}$
(C7) For the matrix $D^{*}$, there exists a constant $\lambda_{c^*}$ such that $\lambda_{c^*} I - D^{*}D^{*\tau}$ is positive semidefinite.
(C8) The density function of $u_{it}$, $f(u)$, is bounded away from zero and infinity on $[a, b]$.

Table 1. The finite sample performance of the two-stage least squares estimators.

Table 2. The coverage probabilities and average lengths (in parentheses) of the 95% confidence intervals for $\delta_0$ using different methods.
Hτ i ∆ε i .It is easy to verify that E(ω i |H i , Z i ) = 0 and Cov(ω i |H i , Z i ) = i Σ i Hi ξ P −→ Ψ.