Estimation in Partial Functional Linear Spatial Autoregressive Model

Abstract: Functional regression allows a scalar response to depend on a functional predictor; however, little work has addressed the case where the response variables are spatially dependent. In this paper, we introduce a new partial functional linear spatial autoregressive model, which explores the relationship between a spatially dependent scalar response variable and explanatory variables containing both multiple real-valued scalar variables and a function-valued random variable. By means of functional principal component analysis and the instrumental variable estimation method, we obtain estimators of the parametric component and the slope function of the model. Under some regularity conditions, we establish the asymptotic normality of the estimator of the parametric component and the convergence rate of the estimator of the slope function. Finally, we illustrate the finite sample performance of the proposed methods with some simulation studies.


Introduction
Over the last two decades, there has been increasing interest in functional data analysis in econometrics, biometrics, chemometrics, and medical research, as well as other fields. Due to the infinite-dimensional nature of functional data, classical methods for multivariate data are no longer applicable. There is a large body of work on functional data analysis; see Ramsay and Silverman [1], Cardot et al. [2], Yao et al. [3], Lian and Li [4], Fan et al. [5], Feng and Xue [6], Kong et al. [7], and Yu et al. [8]. Several methods and theories for partial functional linear models have been proposed. For example, based on a two-stage nonparametric regression calibration method, Zhang et al. [9] discussed a partial functional linear model. Shin [10] proposed new estimators of the parameters and coefficient function of the partial functional linear model. Lu et al. [11] considered quantile regression for the functional partially linear model. Yu et al. [12] proposed a prediction procedure for the partial functional linear quantile regression model. However, the aforementioned articles share a significant limitation: they assume that the response variables are independent. In many fields, such as economics, finance, and environmental studies, the response variables are often spatially dependent. Therefore, it is of practical interest to develop more flexible approaches that accommodate a broader family of data.
There has been considerable work on spatially dependent variables. One useful approach to dealing with spatial dependence is the spatial autoregressive model, which adds a weighted average of nearby values of the dependent variable to the base set of explanatory variables. Theories and methods based on parametric spatial autoregressive models have been extensively studied in Cliff and Ord [13], Anselin [14], and Cressie [15]. Lee [16] proposed the quasi-maximum likelihood estimation method. Then, Su and Jin [17] extended the quasi-maximum likelihood estimation method to partially linear spatial autoregressive models. Koch and Krisztin [18] developed a method based on B-splines and genetic algorithms for partially linear spatial autoregressive models. Chen et al. [19] proposed a new estimation method based on kernel estimation. Du et al. [20] considered partially linear additive spatial autoregressive models, proposed the instrumental variable estimation method, and established the asymptotic normality of the estimator of the parametric component.
The limitation described above can, in principle, be addressed by proposing a new model that accommodates a broader family of data. Thus, in this paper, we consider spatially dependent responses together with functional data: we combine the spatial autoregressive model and the partial functional linear model, and propose a partial functional linear spatial autoregressive model.
Let $Y_i$ be a real-valued spatially dependent response variable corresponding to the $i$th observation, and let $Z_i$ be a $p$-dimensional vector of associated explanatory variables, $i = 1, \cdots, n$. Let $X_i(t)$, $i = 1, \cdots, n$, be independent and identically distributed zero-mean random functions belonging to $L^2(T)$. For simplicity, we suppose throughout that $T = [0, 1]$. The partial functional linear spatial autoregressive model is given by

$$Y_i = \rho \sum_{j=1}^{n} w_{ij} Y_j + Z_i^T \beta + \int_0^1 \gamma(t) X_i(t)\,dt + \varepsilon_i, \quad i = 1, \cdots, n, \qquad (1)$$

where $\rho$ is an unknown spatial autoregressive coefficient and $w_{ij}$ is the $(i, j)$th element of a given $n \times n$ non-stochastic spatial weight matrix $W_n$, with $w_{ii} = 0$ for all $i$. The definition of the spatial weight matrix $W_n$ is based on the geographic arrangement of the observations or on contiguity. More generally, $W_n$ can be specified based on geographical distance decay, economic distance, or the structure of a social network. Here $\beta = (\beta_1, \cdots, \beta_p)^T$ is a $p$-dimensional vector of unknown parameters, $\gamma(t)$ is an unknown square integrable slope function on $[0, 1]$, and the $\varepsilon_i$ are independent and identically distributed random errors with zero mean and finite variance $\sigma^2$.
There are many methods for dealing with functional data, such as the functional principal component method, spline methods, and the roughness penalty method. Functional principal component analysis (FPCA) reduces an infinite-dimensional problem to a finite-dimensional one; therefore, FPCA is popular and widely used by researchers. Dauxois et al. [21] investigated the asymptotic theory of FPCA. Cardot et al. [22] applied FPCA to estimate the slope function of the functional linear model. Hall and Horowitz [23] and Hall and Hosseini-Nasab [24] established the optimal convergence rates for the slope function estimator based on the FPCA technique.
In this paper, we consider the estimation problem for model (1). Based on FPCA and the instrumental variable estimation technique, we obtain estimators of the parameters and the slope function of model (1) with the two-stage least squares method. Under some mild conditions, the rate of convergence and the asymptotic normality of the resulting estimators are established. Finally, some simulation studies are carried out to assess the finite sample performance of the proposed method. The results are encouraging and show that all estimators perform well in finite samples. Overall, the simulation experiments lend support to our asymptotic results.
The rest of the paper proceeds as follows. In Section 2, functional principal component analysis and the instrumental variable estimation method are used to estimate the partial functional linear spatial autoregressive model. In Section 3, the asymptotic properties are given. Some simulation studies are described in Section 4. Lastly, we conclude the paper in Section 5 with some future work.

Estimation Procedures
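First, we introduce FPCA. Denote the covariance function of $X(t)$ by $K_X(s, t) = \mathrm{Cov}(X(s), X(t))$. Then, by Mercer's theorem, we obtain the spectral decomposition

$$K_X(s, t) = \sum_{k=1}^{\infty} \lambda_k \phi_k(s) \phi_k(t),$$

where $\lambda_1 \ge \lambda_2 \ge \cdots \ge 0$ are the eigenvalues of the linear operator associated with $K_X(s, t)$, and $\phi_k(t)$ are the corresponding eigenfunctions. By the Karhunen-Loève expansion, $X_i(t)$ can be represented as

$$X_i(t) = \sum_{k=1}^{\infty} \xi_{ik} \phi_k(t),$$

where $\xi_{ik} = \langle X_i, \phi_k \rangle$ are uncorrelated random variables with mean zero and variances $E(\xi_{ik}^2) = \lambda_k$, also called the functional principal component scores. Expanded on the orthonormal eigenbasis $\{\phi_k(t)\}$, the slope function can be written as $\gamma(t) = \sum_{k=1}^{\infty} \gamma_k \phi_k(t)$. Based on the above FPCA, model (1) can be well approximated by

$$Y_i = \rho \sum_{j=1}^{n} w_{ij} Y_j + Z_i^T \beta + \sum_{k=1}^{m} \gamma_k \langle X_i, \phi_k \rangle + \varepsilon_i, \qquad (2)$$

where $\langle \cdot, \cdot \rangle$ represents the $L^2(T)$ inner product, $\gamma_k = \langle \gamma, \phi_k \rangle$, and $m$ is sufficiently large. The approximate model (2) naturally suggests the idea of principal components regression. However, in practice, the $\phi_k$ are unknown and must be replaced by estimates in order to estimate $\beta$ and $\gamma_k$ ($k = 1, \cdots, m$). For this purpose, we consider the empirical version of $K_X(s, t)$, which is given by

$$\hat{K}_X(s, t) = \frac{1}{n} \sum_{i=1}^{n} X_i(s) X_i(t) = \sum_{k=1}^{\infty} \hat{\lambda}_k \hat{\phi}_k(s) \hat{\phi}_k(t),$$

where $(\hat{\lambda}_k, \hat{\phi}_k)$ are pairs of eigenvalues and eigenfunctions of the covariance operator associated with $\hat{K}_X$, and $\hat{\lambda}_1 \ge \hat{\lambda}_2 \ge \cdots \ge 0$. We take $(\hat{\lambda}_k, \hat{\phi}_k)$ as the estimator of $(\lambda_k, \phi_k)$. Then, model (2) can be written as

$$Y_i = \rho \sum_{j=1}^{n} w_{ij} Y_j + Z_i^T \beta + \sum_{k=1}^{m} \gamma_k \langle X_i, \hat{\phi}_k \rangle + \varepsilon_i. \qquad (3)$$

In matrix notation, model (3) can be written as

$$Y_n = \rho W_n Y_n + Z_n \beta + \Pi \alpha + \varepsilon_n,$$

where $Y_n = (Y_1, \cdots, Y_n)^T$, $Z_n = (Z_1, \cdots, Z_n)^T$, $\Pi = (\langle X_i, \hat{\phi}_k \rangle)_{n \times m}$, $\alpha = (\gamma_1, \cdots, \gamma_m)^T$, and $\varepsilon_n = (\varepsilon_1, \cdots, \varepsilon_n)^T$.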
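As a rough numerical illustration of the FPCA step, the following Python sketch discretises the empirical covariance function on an equally spaced grid, extracts the leading eigenpairs, and computes the scores $\langle X_i, \hat{\phi}_k \rangle$ by a Riemann sum. The function name `fpca`, the Riemann-sum quadrature, and the toy sine-basis data are assumptions made for illustration only; they are not taken from the paper.

```python
import numpy as np

def fpca(X, grid, m):
    """Empirical FPCA for curves observed on a common equally spaced grid.

    X    : (n, T) array; row i holds X_i(t) on `grid` (assumed mean zero)
    grid : (T,) equally spaced points in [0, 1]
    m    : number of leading components to keep
    Returns (lam_hat, phi_hat, scores) with shapes (m,), (m, T), (n, m).
    """
    n = X.shape[0]
    h = grid[1] - grid[0]                    # Riemann-sum quadrature weight
    K_hat = X.T @ X / n                      # empirical covariance on the grid
    # Discretised covariance operator: (K f)(s) ~ h * sum_t K_hat(s, t) f(t)
    evals, evecs = np.linalg.eigh(h * K_hat)
    top = np.argsort(evals)[::-1][:m]        # indices of the m largest eigenvalues
    lam_hat = evals[top]
    phi_hat = evecs[:, top].T / np.sqrt(h)   # rescale so that h * sum(phi^2) = 1
    scores = h * (X @ phi_hat.T)             # <X_i, phi_hat_k> by Riemann sum
    return lam_hat, phi_hat, scores

# Toy data: three sine components with decaying variances (illustrative only).
rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 100)
k = np.arange(1, 4)
basis = np.sqrt(2.0) * np.sin(np.outer(k - 0.5, grid) * np.pi)
X = rng.normal(0.0, 1.0 / k, size=(200, 3)) @ basis

lam_hat, phi_hat, scores = fpca(X, grid, m=3)
Pi = scores    # plays the role of the n x m score matrix in the matrix-form model
```

With this discretisation, the sample second moment of each score column equals the corresponding estimated eigenvalue, which mirrors the population identity $E(\xi_{ik}^2) = \lambda_k$.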
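Let $P = \Pi(\Pi^T \Pi)^{-1} \Pi^T$ denote the projection matrix onto the space spanned by the columns of $\Pi$. Since $(I - P)\Pi = 0$, premultiplying the matrix-form model by $I - P$ eliminates the functional component, and we obtain

$$(I - P) Y_n = \rho (I - P) W_n Y_n + (I - P) Z_n \beta + (I - P) \varepsilon_n.$$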
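Let $Q = (W_n Y_n, Z_n)$ and $\theta = (\rho, \beta^T)^T$. Applying the two-stage least squares procedure proposed by Kelejian and Prucha [25], we propose the following estimator:

$$\hat{\theta} = \{Q^T (I - P) M (I - P) Q\}^{-1} Q^T (I - P) M (I - P) Y_n,$$

where $M = H(H^T H)^{-1} H^T$ and $H$ is a matrix of instrumental variables. Moreover,

$$\hat{\alpha} = (\hat{\gamma}_1, \cdots, \hat{\gamma}_m)^T = (\Pi^T \Pi)^{-1} \Pi^T (Y_n - Q \hat{\theta}).$$

Consequently, we use $\hat{\gamma}(t) = \sum_{k=1}^{m} \hat{\gamma}_k \hat{\phi}_k(t)$ as the estimator of $\gamma(t)$.

Similar to Zhang and Shen [26], we next construct the instrumental variables $H$. In the first step, the following instrumental variables are obtained:

$$\tilde{H} = (W_n (I - \tilde{\rho} W_n)^{-1} (Z_n \tilde{\beta} + \Pi \tilde{\alpha}), Z_n),$$

where $\tilde{\rho}$, $\tilde{\beta}$, and $\tilde{\alpha}$ are obtained by simply regressing $Y_n$ on the pseudo regressor variables $W_n Y_n$, $Z_n$, and $\Pi$. In the second step, we use $\tilde{H}$ to obtain the estimators $\bar{\alpha}$ and $\bar{\theta} = (\bar{\rho}, \bar{\beta}^T)^T$, and then we construct the instrumental variables

$$H = (W_n (I - \bar{\rho} W_n)^{-1} (Z_n \bar{\beta} + \Pi \bar{\alpha}), Z_n).$$

To implement our estimation method, we need to choose $m$. Here, the truncation parameter $m$ is selected by the AIC criterion. Specifically, we choose $m$ to minimize the AIC criterion, which is computed from the residual sum of squares

$$\mathrm{RSS}(m) = \sum_{i=1}^{n} \Big\{ Y_i - \hat{\rho} \sum_{j=1}^{n} w_{ij} Y_j - Z_i^T \hat{\beta} - \sum_{k=1}^{m} \hat{\gamma}_k \langle X_i, \hat{\phi}_k \rangle \Big\}^2,$$

with $\hat{\rho}$, $\hat{\beta}$, and $\hat{\gamma}_k$ being the estimated values.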
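The two-stage procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the standard two-stage least squares form $\hat{\theta} = \{Q^T(I-P)M(I-P)Q\}^{-1} Q^T(I-P)M(I-P)Y_n$ with $M$ the projection onto the instruments, a least squares step $\hat{\alpha} = (\Pi^T\Pi)^{-1}\Pi^T(Y_n - Q\hat{\theta})$, and instruments of the form $(W_n(I-\rho W_n)^{-1}(Z_n\beta + \Pi\alpha), Z_n)$ evaluated at preliminary estimates. All function names, the toy design, and the parameter values are hypothetical.

```python
import numpy as np

def projector(A):
    """Orthogonal projection matrix onto the column space of A."""
    return A @ np.linalg.solve(A.T @ A, A.T)

def two_stage_ls(Y, W, Z, Pi, H):
    """2SLS for theta = (rho, beta) after projecting out Pi, then LS for alpha."""
    n = len(Y)
    R = np.eye(n) - projector(Pi)          # I - P annihilates Pi @ alpha
    M = projector(H)                       # projection onto the instruments
    Q = np.column_stack([W @ Y, Z])
    A = Q.T @ R @ M @ R
    theta = np.linalg.solve(A @ Q, A @ Y)
    alpha = np.linalg.solve(Pi.T @ Pi, Pi.T @ (Y - Q @ theta))
    return theta, alpha

def build_instruments(rho, beta, alpha, W, Z, Pi):
    """Assumed instrument form: (W (I - rho W)^{-1} (Z beta + Pi alpha), Z)."""
    n = W.shape[0]
    mean_part = np.linalg.solve(np.eye(n) - rho * W, Z @ beta + Pi @ alpha)
    return np.column_stack([W @ mean_part, Z])

def estimate(Y, W, Z, Pi):
    """Pilot regression -> first-step instruments -> refined instruments."""
    p = Z.shape[1]
    # Pilot step: regress Y on the pseudo regressors (W Y, Z, Pi).
    pilot = np.linalg.lstsq(np.column_stack([W @ Y, Z, Pi]), Y, rcond=None)[0]
    H_tilde = build_instruments(pilot[0], pilot[1:1 + p], pilot[1 + p:], W, Z, Pi)
    theta_bar, alpha_bar = two_stage_ls(Y, W, Z, Pi, H_tilde)
    H = build_instruments(theta_bar[0], theta_bar[1:], alpha_bar, W, Z, Pi)
    return two_stage_ls(Y, W, Z, Pi, H)

# Hypothetical toy design: 20 districts of 5 members, p = 2, m = 3.
rng = np.random.default_rng(1)
B = (np.ones((5, 5)) - np.eye(5)) / 4
W = np.kron(np.eye(20), B)
n = W.shape[0]
Z, Pi = rng.normal(size=(n, 2)), rng.normal(size=(n, 3))
rho, beta, alpha = 0.5, np.array([1.0, -1.0]), np.array([1.0, 0.5, 0.25])
eps = 0.1 * rng.normal(size=n)
Y = np.linalg.solve(np.eye(n) - rho * W, Z @ beta + Pi @ alpha + eps)
theta_hat, alpha_hat = estimate(Y, W, Z, Pi)
```

In this toy design the score matrix `Pi` is drawn directly rather than computed from curves, so the sketch isolates the instrumental variable step. A useful sanity check is that with zero noise the procedure returns the true $(\rho, \beta)$ and $\alpha$ up to rounding error.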

Asymptotic Properties
In this section, we discuss the asymptotic normality of $\hat{\theta}$ and the rate of convergence of $\hat{\gamma}(t)$. For convenience and simplicity, we let $c$ denote a positive constant that may be different at each appearance. The following assumptions will be maintained throughout the paper.
Assumption 1. The matrix $I - \rho W_n$ is nonsingular with $|\rho| < 1$.
Assumption 5. For the matrix $\tilde{Q} = (S(Z_n \beta + \eta), Z_n)$, there exists a constant $\rho_c^*$ such that $\rho_c^* I - \tilde{Q} \tilde{Q}^T$ is a positive semidefinite matrix.
Assumption 6. The random vector $Z$ has bounded fourth moments.

Assumption 7. For any $c > 0$, there exists an $\epsilon > 0$ such that
Assumption 9. There exist constants $a > 1$ and $b > a/2$
Assumptions 1-3 impose restrictions on the spatial weighting matrix, and such restrictions are standard for spatial regression models (see Lee [16]; Zhang and Shen [26]; Du et al. [20]). For example, let the weighting matrix be $W_n = I_D \otimes B_F$, where $I_D$ is the $D$-dimensional identity matrix, $B_F = (l_F l_F^T - I_F)/(F - 1)$, $l_F$ is the $F$-dimensional vector of ones, and $\otimes$ is the Kronecker product; then $W_n$ satisfies Assumptions 1-3. Assumption 4 is used to represent the asymptotic covariance matrix of $\hat{\theta}$. Assumption 5 is required to ensure the identifiability of the parameter $\theta$. Assumption 6 is a standard condition for proving asymptotic properties of the estimators. Assumptions 7-9 are regularity assumptions for functional linear models (see Hall and Hosseini-Nasab [24]); for instance, a Gaussian process with Hölder continuous sample paths satisfies Assumption 7. Assumption 10 usually appears in functional linear regression (see Feng and Xue [6]; Shin [10]; Hall and Horowitz [23]).
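The block-structured weight matrix $W_n = I_D \otimes B_F$ can be built and inspected numerically. The Python sketch below constructs it for small, arbitrarily chosen $D$ and $F$, and checks three properties that are commonly required of spatial weight matrices: a zero diagonal, row sums equal to one, and nonsingularity of $I - \rho W_n$ for $|\rho| < 1$. The function name `block_weight_matrix` is hypothetical, $l_F$ is read as the vector of ones, and the particular checks are an assumption about what such conditions typically involve rather than a restatement of the paper's assumptions.

```python
import numpy as np

def block_weight_matrix(D, F):
    """W_n = I_D (Kronecker) B_F with B_F = (l_F l_F^T - I_F) / (F - 1)."""
    ones = np.ones((F, 1))                         # l_F, read as a vector of ones
    B = (ones @ ones.T - np.eye(F)) / (F - 1)      # equal weight on the F - 1 neighbours
    return np.kron(np.eye(D), B)

W = block_weight_matrix(D=4, F=3)                  # n = D * F = 12 observations
row_sums = W.sum(axis=1)                           # each row should sum to one
eigenvalues = np.linalg.eigvalsh(W)                # W is symmetric in this design
rho = 0.7
cond_number = np.linalg.cond(np.eye(W.shape[0]) - rho * W)
```

For this design the eigenvalues of $B_F$ are $1$ and $-1/(F-1)$, so all eigenvalues of $W_n$ lie in $[-1, 1]$ and $I - \rho W_n$ has eigenvalues bounded away from zero whenever $|\rho| < 1$.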
The following Theorem 1 shows the asymptotic property of the estimator of the parameter vector θ = (ρ, β T ) T .
By the definition of θ, we have Hence, we have By Lemma 1(b) of Kong et al. [7] with the help of Assumptions 7-9, we have

By Assumptions 7 and 9, one has
By Assumptions 9-10, one has Then, we can find Invoking the central limit theorem and Slutsky's theorem, we have
The rate of convergence of the estimator $\hat{\gamma}(t) = \sum_{k=1}^{m} \hat{\gamma}_k \hat{\phi}_k(t)$ of the slope function is given in the following theorem.

Theorem 2.
Under Assumptions 1-10,
The proof of Theorem 2 follows the proof of Theorem 2 of Shin [10], so we omit it here.

Simulation Studies
We suppose the functional predictors can be expressed as $X_i(t) = \sum_{j=1}^{50} U_{ij} v_j(t)$, where the $U_{ij}$ are independent normal random variables with mean 0 and variance $\lambda_j = ((j - 0.5)\pi)^{-2}$, and $v_j(t) = \sqrt{2} \sin((j - 0.5)\pi t)$. For the actual observations, we assume that they are realizations of $\{X_i(\cdot)\}$ on an equally spaced grid of 100 points in $[0, 1]$. As stated in Section 2, the truncation parameter $m$ is selected by the AIC criterion in our simulation. Similar to Lee [16] and Case [28], we focus on the spatial scenario with $R$ districts, $q$ members in each district, and each neighbor of a member in a district given equal weight, that is, $W_n = I_R \otimes B_q$, where $B_q = (l_q l_q^T - I_q)/(q - 1)$, $l_q$ is the $q$-dimensional vector of ones, and $\otimes$ is the Kronecker product. We consider $R = 50$ and $70$, $q = 2$, $5$, and $8$, and $\sigma^2 = 0.25$ and $1$. For comparison, three values of $\rho$ are considered: $\rho = 0.2$ represents weak spatial dependence, $\rho = 0.5$ represents mild spatial dependence, and $\rho = 0.7$ represents relatively strong spatial dependence.
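The functional predictors of this design can be generated with a few lines of Python. The sketch below draws $U_{ij} \sim N(0, \lambda_j)$ with $\lambda_j = ((j - 0.5)\pi)^{-2}$ and evaluates $X_i(t) = \sum_{j=1}^{50} U_{ij} v_j(t)$ on 100 equally spaced points. The function name `simulate_curves`, the seed, and the choice to include both endpoints of $[0, 1]$ in the grid are assumptions made for illustration.

```python
import numpy as np

def simulate_curves(n, n_grid=100, n_basis=50, seed=0):
    """Draw n curves X_i(t) = sum_j U_ij v_j(t) on an equally spaced grid."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, n_grid)              # grid includes 0 and 1 (assumption)
    j = np.arange(1, n_basis + 1)
    lam = ((j - 0.5) * np.pi) ** -2.0              # variances lambda_j
    v = np.sqrt(2.0) * np.sin(np.outer(j - 0.5, t) * np.pi)   # v_j(t), shape (n_basis, n_grid)
    U = rng.normal(0.0, np.sqrt(lam), size=(n, n_basis))      # U_ij ~ N(0, lambda_j)
    return t, U @ v, lam

# Example: R = 50 districts with q = 2 members gives n = R * q = 100 curves.
t, X, lam = simulate_curves(n=50 * 2)
```

Because every $v_j$ vanishes at $t = 0$, all simulated curves start at zero, which is a quick way to confirm that the basis has been evaluated on the intended grid.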
Throughout the simulations, for the scalar parameters $\rho$, $\beta_1$, and $\beta_2$, we use the average bias and the standard deviation (SD) as measures of estimation accuracy. The performance of the estimator of the slope function $\gamma(t)$ is assessed using the square root of average squared errors (RASE), defined as

$$\mathrm{RASE} = \Big\{ \frac{1}{N} \sum_{l=1}^{N} (\hat{\gamma}(t_l) - \gamma(t_l))^2 \Big\}^{1/2},$$

where $\{t_l, l = 1, \cdots, N\}$ are the regular grid points at which the function $\hat{\gamma}(t)$ is evaluated. In our simulation, $N = 200$ is used. The sample size is $n = Rq$. We use 1000 Monte Carlo runs for estimation assessment, and summarize the results in Tables 1-3 and Figures 1 and 2. Tables 1-3 list the average bias and SD of the estimators of $\rho$, $\beta_1$, and $\beta_2$, and the average RASE of the estimator of $\gamma(t)$ over the 1000 replications. Figures 1 and 2 present the average estimated curves of $\gamma(t)$.
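The RASE criterion is the root mean squared difference between the estimated and true slope functions over $N$ grid points, and is straightforward to compute. The Python sketch below assumes the grid points are equally spaced on $[0, 1]$ with both endpoints included; the function name `rase` and the example functions are hypothetical and are not the slope function used in the paper.

```python
import numpy as np

def rase(gamma_hat, gamma_true, N=200):
    """Square root of the average squared error over N regular grid points."""
    t = np.linspace(0.0, 1.0, N)                 # regular grid on [0, 1] (assumption)
    diff = gamma_hat(t) - gamma_true(t)
    return np.sqrt(np.mean(diff ** 2))

# Illustrative check: an estimate that is off by a constant 0.1 everywhere
# has RASE equal to 0.1, up to floating-point rounding.
example = rase(lambda t: t + 0.1, lambda t: t)
```

In a Monte Carlo study this quantity would be computed once per replication and then averaged over replications.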
From Tables 1-3 and Figures 1 and 2, we can see that: (1) The biases of the estimators of $\rho$, $\beta_1$, and $\beta_2$ are fairly small in almost all cases. (2) The standard deviations of the estimators of $\rho$, $\beta_1$, and $\beta_2$ decrease as either $R$ or $q$ increases. (3) The RASEs of the estimator of $\gamma(t)$ are small in all cases and decrease as the sample size $n$ increases or $\sigma^2$ decreases, which indicates that the estimated curves fit the true curve more closely; this agrees with what is seen in Figures 1 and 2. Overall, the simulation results suggest that the proposed estimation procedure is effective for the partial functional linear spatial autoregressive model.

Conclusions
In this paper, we proposed a partial functional linear spatial autoregressive model to study the link between a spatially dependent scalar response variable and explanatory variables containing both multiple real-valued scalar variables and a functional predictor. We then used the functional principal component basis and instrumental variables to estimate the parametric vector and the slope function based on the two-stage least squares procedure. Under some mild conditions, we obtained the asymptotic normality of the estimator of the parametric vector. Furthermore, the rate of convergence of the proposed estimator of the slope function was also established. The simulation studies demonstrate that the proposed method performs satisfactorily and support the theoretical results.
There are some interesting future directions. In this paper, we only considered the estimation of the unknown parametric vector and slope function, which does not present a way to test for the effects of the covariates, an important aspect of any statistical analysis. In the future, we would like to be able to identify the model structure by testing for the main effects of the scalar predictors and the functional predictor. Another interesting direction can be to extend our new procedure to the generalized partial functional linear spatial autoregressive model.

Conflicts of Interest:
The authors declare no conflict of interest.