Estimation in Partial Functional Linear Spatial Autoregressive Model

Hu, Yuping; Wu, Siyu; Feng, Sanying; Jin, Junliang

doi:10.3390/math8101680

¹ School of Mathematics and Statistics, Zhengzhou University, Zhengzhou 450001, China

² Henan Key Laboratory of Financial Engineering, Zhengzhou University, Zhengzhou 450001, China

* Author to whom correspondence should be addressed.

Mathematics 2020, 8(10), 1680; https://doi.org/10.3390/math8101680

Submission received: 6 September 2020 / Revised: 24 September 2020 / Accepted: 24 September 2020 / Published: 1 October 2020

(This article belongs to the Section D1: Probability and Statistics)

Abstract

Functional regression allows a scalar response to depend on a functional predictor; however, little work has been done for the case in which the response variables are spatially dependent. In this paper, we introduce a new partial functional linear spatial autoregressive model, which explores the relationship between a spatially dependent scalar response variable and explanatory variables that include both multiple real-valued scalar variables and a function-valued random variable. By means of functional principal component analysis and the instrumental variable estimation method, we obtain estimators of the parametric component and the slope function of the model. Under some regularity conditions, we establish the asymptotic normality of the estimator of the parametric component and the convergence rate of the estimator of the slope function. Finally, we illustrate the finite sample performance of the proposed methods with some simulation studies.

Keywords: partial functional linear spatial autoregressive model; spatial autoregression; functional principal component analysis; instrumental variable

1. Introduction

Over the last two decades, there has been increasing interest in functional data analysis in econometrics, biometrics, chemometrics, and medical research, as well as other fields. Due to the infinite-dimensional nature of functional data, classical methods for multivariate data are no longer applicable. There is a large body of work on functional data analysis; see Ramsay and Silverman [1], Cardot et al. [2], Yao et al. [3], Lian and Li [4], Fan et al. [5], Feng and Xue [6], Kong et al. [7], and Yu et al. [8]. Some methods and theories for partial functional linear models have been proposed. For example, based on a two-stage nonparametric regression calibration method, Zhang et al. [9] discussed a partial functional linear model. Shin [10] proposed new estimators of the parameters and coefficient function of the partial functional linear model. Lu et al. [11] considered quantile regression for the functional partially linear model. Yu et al. [12] proposed a prediction procedure for the partial functional linear quantile regression model. However, the aforementioned articles share a significant limitation: they assume that the response variables are independent. In many fields, such as economics, finance, and environmental studies, the response variables are often spatially dependent. Therefore, it is of practical interest to develop more flexible approaches that cover a broader family of data.

There has been considerable work on spatially dependent variables. One useful approach for dealing with spatial dependence is the spatial autoregressive model, which adds a weighted average of nearby values of the dependent variable to the base set of explanatory variables. Theories and methods based on parametric spatial autoregressive models have been extensively studied in Cliff and Ord [13], Anselin [14], and Cressie [15]. Lee [16] proposed the quasi-maximum likelihood estimation. Then, Su and Jin [17] extended the quasi-likelihood estimation method to partially linear spatial autoregressive models. Koch and Krisztin [18] developed the B-splines and genetic-algorithms method for partially linear spatial autoregressive models. Chen et al. [19] proposed a new estimation method based on the kernel estimation method. Du et al. [20] considered partially linear additive spatial autoregressive models, proposed the instrumental variable estimation method, and established the asymptotic normality of the estimator of the parametric component.

To overcome the limitation noted above, it is natural to develop a more flexible model that covers a broader family of data. Thus, in this paper, we combine the spatial autoregressive model and the partial functional linear model, and propose a partial functional linear spatial autoregressive model for spatially dependent responses with functional predictors.

Let $Y_{i}$ be a real-valued, spatially dependent response variable corresponding to the $i$th observation, and let $Z_{i}$ be a $p$-dimensional vector of associated explanatory variables, for $i = 1, \dots, n$. Let $X_{i} (t)$, $i = 1, \dots, n$, be independent and identically distributed zero-mean random functions belonging to $L^{2} (T)$. For simplicity, we suppose throughout that $T = [0, 1]$. The partial functional linear spatial autoregressive model is given by

$$Y_{i} = ρ \sum_{j = 1}^{n} w_{i j} Y_{j} + Z_{i}^{T} β + \int_{0}^{1} γ (t) X_{i} (t) d t + ε_{i}, \tag{1}$$

where $w_{i j}$ is the $(i, j)$th element of a given $n \times n$ non-stochastic spatial weight matrix $W_{n}$, such that $w_{i j} = 0$ for all $i = j$. The definition of the spatial weight matrix $W_{n}$ is based on the geographic arrangement of the observations or contiguity. More generally, $W_{n}$ can be specified based on geographical distance decay, economic distance, or the structure of a social network. Here, $ρ$ is the spatial autoregressive parameter, $β = {(β_{1}, \dots, β_{p})}^{T}$ is a $p$-dimensional vector of unknown parameters, $γ (t)$ is an unknown square integrable slope function on $[0, 1]$, and the $ε_{i}$ are independent and identically distributed random errors with zero mean and finite variance $σ^{2}$.

There are many methods for dealing with functional data, such as the functional principal component method, spline methods, and the roughness penalty method. Functional principal component analysis (FPCA) reduces an infinite-dimensional problem to a finite-dimensional one, so it is popular and widely used by researchers. Dauxois et al. [21] investigated the asymptotic theory of FPCA. Cardot et al. [22] applied FPCA to estimate the slope function of the functional linear model. Hall and Horowitz [23] and Hall and Hosseini-Nasab [24] showed the optimal convergence rates of the slope function estimator based on the FPCA technique.

In this paper, we consider the estimation problem for model (1). Based on FPCA and the instrumental variable estimation technique, we obtain estimators of the parameters and the slope function of model (1) with the two-stage least squares method. Under some mild conditions, the rate of convergence and asymptotic normality of the resulting estimators are established. Finally, some simulation studies are carried out to assess the finite sample performance of the proposed method. The results are encouraging and show that all estimators perform well in finite samples. Overall, the simulation experiments lend support to our asymptotic results.

The rest of the paper proceeds as follows. In Section 2, functional principal component analysis and the instrumental variable estimation method are used to estimate the partial functional linear spatial autoregressive model. In Section 3, the asymptotic properties are given. Some simulation studies are described in Section 4. Lastly, we conclude the paper in Section 5 with some future work.

2. Estimation Procedures

First, we introduce FPCA. Denote the covariance function of $X (t)$ by $K_{X}$. Then, by Mercer's theorem, we obtain the spectral decomposition $K_{X} (s, t) = \sum_{k = 1}^{\infty} λ_{k} ϕ_{k} (s) ϕ_{k} (t)$, where $λ_{1} \geq λ_{2} \geq \dots \geq 0$ are the eigenvalues of the linear operator associated with $K_{X} (s, t)$, and $ϕ_{k} (t)$ are the corresponding eigenfunctions. By the Karhunen-Loève expansion, $X_{i} (t)$ can be represented as

$$X_{i} (t) = \sum_{k = 1}^{\infty} ξ_{i k} ϕ_{k} (t),$$

where the coefficients $ξ_{i k} = \int X_{i} (t) ϕ_{k} (t) d t$, also called the functional principal component scores, are uncorrelated random variables with mean zero and variances $E (ξ_{i k}^{2}) = λ_{k}$. Expanded on the orthonormal eigenbasis $\{ϕ_{k} (t)\}$, the slope function can be written as $γ (t) = \sum_{k = 1}^{\infty} γ_{k} ϕ_{k} (t)$. Based on the above FPCA, model (1) can be well approximated by

$$Y_{i} \dot{=} ρ \sum_{j = 1}^{n} w_{i j} Y_{j} + Z_{i}^{T} β + \sum_{j = 1}^{m} γ_{j} 〈X_{i}, ϕ_{j}〉 + ε_{i}, \quad i = 1, \dots, n, \tag{2}$$

where $〈\cdot, \cdot〉$ represents the $L^{2} (T)$ inner product, $γ_{j} = 〈γ, ϕ_{j}〉$, and $m$ is sufficiently large.

The approximate model (2) naturally suggests the idea of principal components regression. However, in practice, the $ϕ_{j}$ are unknown and must be replaced by estimates in order to estimate $β$ and $γ_{j}$ $(j = 1, \dots, m)$. For this purpose, we consider the empirical version of $K_{X} (s, t)$, which is given by

$${\hat{K}}_{X} (s, t) = \frac{1}{n} \sum_{i = 1}^{n} X_{i} (s) X_{i} (t) = \sum_{j = 1}^{\infty} {\hat{λ}}_{j} {\hat{ϕ}}_{j} (s) {\hat{ϕ}}_{j} (t),$$

where $({\hat{λ}}_{j}, {\hat{ϕ}}_{j})$ are pairs of eigenvalues and eigenfunctions for the covariance operator associated with ${\hat{K}}_{X}$ and ${\hat{λ}}_{1} \geq {\hat{λ}}_{2} \geq \dots \geq 0$. We take $({\hat{λ}}_{j}, {\hat{ϕ}}_{j})$ as the estimator of $(λ_{j}, ϕ_{j})$.
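The empirical eigen-decomposition above can be sketched numerically when the curves are observed on an equally spaced grid. The following is a minimal illustration, not the authors' implementation: the function names `empirical_fpca` and `fpc_scores`, the toy curves, and the use of a simple Riemann sum to approximate integrals are all assumptions made for this example.

```python
import numpy as np

def empirical_fpca(X, t):
    """Eigen-decompose the empirical covariance of curves sampled on a grid.

    X : (n, T) array, row i holds X_i(t) on the equally spaced grid t.
    Returns eigenvalues (descending) and eigenfunctions as columns of phi.
    """
    n = X.shape[0]
    dt = t[1] - t[0]                           # grid spacing (Riemann-sum weight)
    K_hat = X.T @ X / n                        # K_hat(s, t) = n^{-1} sum_i X_i(s) X_i(t)
    evals, evecs = np.linalg.eigh(K_hat * dt)  # discretized covariance operator
    order = np.argsort(evals)[::-1]            # sort eigenpairs in descending order
    lam = evals[order]
    phi = evecs[:, order] / np.sqrt(dt)        # rescale so that sum(phi_j**2) * dt = 1
    return lam, phi

def fpc_scores(X, phi, t, m):
    """Approximate <X_i, phi_j>, j = 1..m, by a Riemann sum; returns (n, m)."""
    dt = t[1] - t[0]
    return X @ phi[:, :m] * dt

# Toy curves (hypothetical): three sine components with decaying variances.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 100)
freq = (np.arange(1, 4) - 0.5) * np.pi
basis = np.sqrt(2.0) * np.sin(np.outer(t, freq))   # shape (100, 3)
X = (rng.normal(size=(200, 3)) / freq) @ basis.T   # shape (200, 100)

lam, phi = empirical_fpca(X, t)
Pi = fpc_scores(X, phi, t, m=2)                    # columns play the role of <X_n, phi_hat_j>
print(lam[:3])
print(Pi.shape)
```

The estimated eigenfunctions are determined only up to sign, so individual columns of `phi` may be flipped relative to the generating basis; this does not affect the fitted slope function, because the corresponding coefficients flip sign as well.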

Replacing $ϕ_{j} (t)$ by ${\hat{ϕ}}_{j} (t)$, model (2) can be written as

$$Y_{i} \dot{=} ρ \sum_{j = 1}^{n} w_{i j} Y_{j} + Z_{i}^{T} β + \sum_{j = 1}^{m} γ_{j} 〈 X_{i}, {\hat{ϕ}}_{j} 〉 + ε_{i}, \quad i = 1, \dots, n . \tag{3}$$

Let $Y_{n} = {(Y_{1}, \dots, Y_{n})}^{T}$, $Z_{n} = {(Z_{1}, \dots, Z_{n})}^{T}$, $X_{n} = {(X_{1} (t), \dots, X_{n} (t))}^{T}$,

$$〈 X_{n}, {\hat{ϕ}}_{j} 〉 = \int_{0}^{1} {\hat{ϕ}}_{j} (t) X_{n} (t) d t = {(\int_{0}^{1} {\hat{ϕ}}_{j} (t) X_{1} (t) d t, \dots, \int_{0}^{1} {\hat{ϕ}}_{j} (t) X_{n} (t) d t)}^{T},$$

$Π = (〈 X_{n}, {\hat{ϕ}}_{1} 〉, \dots, 〈 X_{n}, {\hat{ϕ}}_{m} 〉)$, $α = {(γ_{1}, \dots, γ_{m})}^{T}$, and $ε_{n} = {(ε_{1}, \dots, ε_{n})}^{T}$. Then model (3) can be written in matrix notation as

$$Y_{n} \dot{=} ρ W_{n} Y_{n} + Z_{n} β + Π α + ε_{n} . \tag{4}$$

Let $P = Π {(Π^{T} Π)}^{- 1} Π^{T}$ denote the projection matrix onto the space spanned by the columns of $Π$. Premultiplying (4) by $I - P$ and using $(I - P) Π = 0$, we obtain

$$(I - P) Y_{n} \dot{=} ρ (I - P) W_{n} Y_{n} + (I - P) Z_{n} β + (I - P) ε_{n} . \tag{5}$$

Let $Q = (W_{n} Y_{n}, Z_{n})$ and $θ = {(ρ, β^{T})}^{T}$. Applying the two-stage least squares procedure proposed by Kelejian and Prucha [25], we propose the following estimator:

$$\hat{θ} = {(Q^{T} (I - P) M (I - P) Q)}^{- 1} Q^{T} (I - P) M (I - P) Y_{n}, \tag{6}$$

where $M = H {(H^{T} H)}^{- 1} H^{T}$ and $H$ is a matrix of instrumental variables. Moreover,

$$\hat{α} = {({\hat{γ}}_{1}, \dots, {\hat{γ}}_{m})}^{T} = {(Π^{T} Π)}^{- 1} Π^{T} (Y_{n} - Q \hat{θ}) . \tag{7}$$

Consequently, we use $\hat{γ} (t) = \sum_{k = 1}^{m} {\hat{γ}}_{k} {\hat{ϕ}}_{k} (t)$ as the estimator of $γ (t)$.
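Estimators (6) and (7) are plain matrix expressions, so they can be sketched directly in NumPy. The block below is an illustrative sketch under stated assumptions rather than the authors' code: the helper names `projector` and `tsls_estimate` are hypothetical, the score matrix `Pi` is a random stand-in instead of estimated FPC scores, and the instrument matrix `H = (W Z, Z)` is a simple choice made only so that the example runs on its own. The data are generated without noise, so the estimator should reproduce the generating values up to rounding.

```python
import numpy as np

def projector(A):
    """Orthogonal projection matrix onto the column space of A."""
    return A @ np.linalg.solve(A.T @ A, A.T)

def tsls_estimate(Y, W, Z, Pi, H):
    """Estimators (6) and (7): returns theta_hat = (rho_hat, beta_hat) and alpha_hat."""
    n = Y.shape[0]
    I_minus_P = np.eye(n) - projector(Pi)      # removes the span of the score columns
    M = projector(H)                           # projection onto the instruments
    Q = np.column_stack([W @ Y, Z])            # Q = (W_n Y_n, Z_n)
    A = Q.T @ I_minus_P @ M @ I_minus_P
    theta_hat = np.linalg.solve(A @ Q, A @ Y)  # Equation (6)
    alpha_hat = np.linalg.solve(Pi.T @ Pi, Pi.T @ (Y - Q @ theta_hat))  # Equation (7)
    return theta_hat, alpha_hat

# Noise-free toy data (hypothetical values).
rng = np.random.default_rng(1)
R, q = 20, 3
n = R * q
W = np.kron(np.eye(R), (np.ones((q, q)) - np.eye(q)) / (q - 1))
Z = np.column_stack([rng.uniform(-1, 1, n), rng.uniform(0, 1, n)])
Pi = rng.normal(size=(n, 2))                   # stand-in for the FPC scores
rho, beta, alpha = 0.5, np.array([1.0, -1.0]), np.array([1.0, -0.5])
Y = np.linalg.solve(np.eye(n) - rho * W, Z @ beta + Pi @ alpha)

H = np.column_stack([W @ Z, Z])                # simple instruments, assumed for this sketch
theta_hat, alpha_hat = tsls_estimate(Y, W, Z, Pi, H)
print(theta_hat, alpha_hat)
```

Forming the $n \times n$ projection matrices explicitly keeps the code close to the formulas; for large $n$ one would more likely work with least squares residuals instead of dense projectors.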

Similar to Zhang and Shen [26], we next construct the instrumental variables $H$. In the first step, the following instrumental variables are obtained:

$$\tilde{H} = (W_{n} {(I - \tilde{ρ} W_{n})}^{- 1} (Π \tilde{α}, Z_{n}), Z_{n}),$$

where $\tilde{ρ}$ and $\tilde{α}$ are obtained by simply regressing $Y_{n}$ on the pseudo regressor variables $W_{n} Y_{n}$, $Z_{n}$, and $Π$. In the second step, we use $\tilde{H}$ to obtain the estimators $\bar{α}$ and $\bar{θ} = {(\bar{ρ}, {\bar{β}}^{T})}^{T}$, and then we construct the instrumental variables

$$H = (W_{n} {(I - \bar{ρ} W_{n})}^{- 1} (Π \bar{α} + Z_{n} \bar{β}), Z_{n}) .$$
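The two-step construction of the instruments can be sketched as follows. This is one reading of the procedure described above, given as a hedged illustration: the function names `projector`, `tsls`, and `two_step_estimate` are hypothetical, the first step uses ordinary least squares for "simply regressing", and the toy data (including the random stand-in for the score matrix `Pi`) are assumptions made for the example.

```python
import numpy as np

def projector(A):
    """Orthogonal projection matrix onto the column space of A."""
    return A @ np.linalg.solve(A.T @ A, A.T)

def tsls(Y, W, Z, Pi, H):
    """Two-stage least squares step following Equations (6) and (7)."""
    n = Y.shape[0]
    I_minus_P = np.eye(n) - projector(Pi)
    Q = np.column_stack([W @ Y, Z])
    A = Q.T @ I_minus_P @ projector(H) @ I_minus_P
    theta = np.linalg.solve(A @ Q, A @ Y)
    alpha = np.linalg.solve(Pi.T @ Pi, Pi.T @ (Y - Q @ theta))
    return theta, alpha

def two_step_estimate(Y, W, Z, Pi):
    n, p = Z.shape
    I_n = np.eye(n)
    # Step 1: least squares of Y on (W Y, Z, Pi) gives rough rho and alpha.
    D = np.column_stack([W @ Y, Z, Pi])
    coef = np.linalg.lstsq(D, Y, rcond=None)[0]
    rho0, alpha0 = coef[0], coef[1 + p:]
    first = np.column_stack([Pi @ alpha0, Z])              # the block (Pi alpha~, Z_n)
    H0 = np.column_stack([W @ np.linalg.solve(I_n - rho0 * W, first), Z])
    theta1, alpha1 = tsls(Y, W, Z, Pi, H0)
    # Step 2: rebuild the instruments from the step-1 estimates and re-estimate.
    rho1, beta1 = theta1[0], theta1[1:]
    mean1 = Pi @ alpha1 + Z @ beta1
    H = np.column_stack([W @ np.linalg.solve(I_n - rho1 * W, mean1), Z])
    return tsls(Y, W, Z, Pi, H)

# Toy data with noise (hypothetical values).
rng = np.random.default_rng(2)
R, q = 70, 5
n = R * q
W = np.kron(np.eye(R), (np.ones((q, q)) - np.eye(q)) / (q - 1))
Z = np.column_stack([rng.uniform(-1, 1, n), rng.uniform(0, 1, n)])
Pi = rng.normal(size=(n, 2))                               # stand-in for the FPC scores
rho, beta, alpha = 0.5, np.array([1.0, -1.0]), np.array([1.0, -0.5])
signal = Z @ beta + Pi @ alpha
Y = np.linalg.solve(np.eye(n) - rho * W, signal + rng.normal(scale=0.5, size=n))

theta_hat, alpha_hat = two_step_estimate(Y, W, Z, Pi)
print(theta_hat, alpha_hat)
```

With noise-free responses, the first-step regression already recovers the generating values, and the final instrument matrix coincides with $Q$ itself, which gives a convenient sanity check for an implementation.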

To implement our estimation method, we need to choose $m$. Here, the truncation parameter $m$ is selected by the AIC criterion. Specifically, we minimize

$$AIC (m) = \log RSS (m) + 2 n^{- 1} m, \tag{8}$$

where

$$RSS (m) = \sum_{i = 1}^{n} {\{Y_{i} - (\hat{ρ} \sum_{j = 1}^{n} w_{i j} Y_{j} + Z_{i}^{T} \hat{β} + \sum_{j = 1}^{m} {\hat{γ}}_{j} 〈 X_{i}, {\hat{ϕ}}_{j} 〉)\}}^{2},$$

with $\hat{ρ}$, $\hat{β}$, and ${\hat{γ}}_{j}$ being the estimated values.
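The selection rule in (8) can be illustrated with a few lines of Python. This is a sketch only: the function names `aic` and `select_m` are hypothetical, the logarithm is assumed to be the natural logarithm, and the RSS values are made-up numbers standing in for the residual sums of squares that a fitting routine would return for each candidate $m$.

```python
import math

def aic(rss, n, m):
    """Equation (8): AIC(m) = log RSS(m) + 2 m / n (natural log assumed)."""
    return math.log(rss) + 2.0 * m / n

def select_m(rss_by_m, n):
    """Return the truncation level m with the smallest AIC value."""
    return min(rss_by_m, key=lambda m: aic(rss_by_m[m], n, m))

# Hypothetical RSS values for m = 1, ..., 4 with n = 100 observations.
rss_by_m = {1: 120.0, 2: 80.0, 3: 78.0, 4: 77.9}
best_m = select_m(rss_by_m, n=100)
print(best_m)
```

In this toy input, moving from $m = 3$ to $m = 4$ barely reduces the RSS, so the penalty term $2 m / n$ outweighs the gain and the smaller model is preferred.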

3. Asymptotic Properties

In this section, we discuss the asymptotic normality of $\hat{θ}$ and the rate of convergence of $\hat{γ} (t)$. For convenience and simplicity, we let $c$ denote a positive constant that may be different at each appearance. The following assumptions will be maintained throughout the paper.

Assumption 1. The matrix $I - ρ W_{n}$ is nonsingular with $|ρ| < 1$.

Assumption 2. The row and column sums of the matrices $W_{n}$ and ${(I - ρ W_{n})}^{- 1}$ are bounded uniformly in absolute value for any $|ρ| < 1$.

Assumption 3. For the matrix $S = W_{n} {(I - ρ W_{n})}^{- 1}$, there exists a constant $ρ_{c}$ such that $ρ_{c} I - S S^{T}$ is a positive semidefinite matrix.

Assumption 4. $\frac{1}{n} {\tilde{Q}}^{T} (I - P) M (I - P) \tilde{Q} \overset{P}{⟶} Σ$ for some positive definite matrix $Σ$, where "$\overset{P}{⟶}$" denotes convergence in probability, $\tilde{Q} = (S (Z_{n} β + η), Z_{n})$, and $η = {(\int_{0}^{1} γ (t) X_{1} (t) d t, \dots, \int_{0}^{1} γ (t) X_{n} (t) d t)}^{T}$.

Assumption 5. For the matrix $\tilde{Q} = (S (Z_{n} β + η), Z_{n})$, there exists a constant $ρ_{c^{*}}$ such that $ρ_{c^{*}} I - \tilde{Q} {\tilde{Q}}^{T}$ is a positive semidefinite matrix.

Assumption 6. The random vector $Z$ has bounded fourth moments.

Assumption 7. For any $c > 0$, there exists an $ϵ > 0$ such that

$$\sup_{t \in [0, 1]} E [{| X (t) |}^{c}] < \infty, \quad \sup_{s, t \in [0, 1]} E [{| s - t |}^{- ϵ} {| X (s) - X (t) |}^{c}] < \infty .$$

For each integer $r \geq 1$, $λ_{k}^{- r} E (ξ_{k}^{2 r})$ is bounded uniformly in $k$.

Assumption 8. $X (t)$ is twice continuously differentiable on $[0, 1]$ with probability 1 and $\int E {[X^{(2)} (t)]}^{4} d t < \infty$, where $X^{(2)} (t)$ denotes the second derivative of $X (t)$.

Assumption 9. There exist some constants $a > 1$ and $b > a / 2 + 1$ such that $λ_{j} - λ_{j + 1} \geq C j^{- a - 1}$ and $|γ_{j}| \leq C j^{- b}$ for $j \geq 1$.

Assumption 10. For the truncation parameter $m$, we assume that $m = O (n^{1 / (a + 2 b)})$.

Assumptions 1–3 impose restrictions on the spatial weighting matrix, and such restrictions are commonly imposed for spatial regression models (see Lee [16]; Zhang and Shen [26]; Du et al. [20]). For example, let the weighting matrix be $W_{n} = I_{D} \otimes B_{F}$, where $I_{D}$ is the $D$-dimensional identity matrix, $B_{F} = (l_{F} l_{F}^{T} - I_{F}) / (F - 1)$, $l_{F}$ is the $F$-dimensional vector of ones, and $\otimes$ denotes the Kronecker product; then the weighting matrix $W_{n}$ satisfies Assumptions 1–3. Assumption 4 is used to represent the asymptotic covariance matrix of $\hat{θ}$. Assumption 5 is required to ensure the identifiability of the parameter $θ$. Assumption 6 is the usual condition for the proofs of asymptotic properties of the estimators. Assumptions 7–9 are regularity assumptions for functional linear models (see Hall and Hosseini-Nasab [24]); for instance, a Gaussian process with Hölder continuous sample paths satisfies Assumption 7. Assumption 10 usually appears in functional linear regression (see Feng and Xue [6]; Shin [10]; Hall and Horowitz [23]).
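The block-diagonal weighting matrix mentioned above is easy to build and inspect numerically. The sketch below is an illustration for one small case, not a proof of Assumptions 1–3: the function name `block_weight_matrix` and the values $D = 3$, $F = 4$, $ρ = 0.5$ are assumptions chosen for the example, and $l_{F}$ is taken to be the vector of ones.

```python
import numpy as np

def block_weight_matrix(D, F):
    """W_n = I_D kron B_F with B_F = (l_F l_F^T - I_F) / (F - 1), l_F a vector of ones."""
    l_F = np.ones((F, 1))
    B_F = (l_F @ l_F.T - np.eye(F)) / (F - 1)
    return np.kron(np.eye(D), B_F)

W = block_weight_matrix(D=3, F=4)
rho = 0.5
inv = np.linalg.inv(np.eye(W.shape[0]) - rho * W)   # exists because |rho| < 1 here

# Zero diagonal, as required of the weights w_ii.
print(np.allclose(np.diag(W), 0.0))
# Largest absolute row and column sums of W and of (I - rho W)^{-1}.
print(np.abs(W).sum(axis=1).max(), np.abs(W).sum(axis=0).max())
print(np.abs(inv).sum(axis=1).max(), np.abs(inv).sum(axis=0).max())
```

Because each row of this $W_{n}$ sums to one and all entries are nonnegative, the absolute row sums of ${(I - ρ W_{n})}^{- 1}$ equal $1 / (1 - ρ)$ for $0 \leq ρ < 1$, which is one way to see the uniform bound required in Assumption 2 for this design.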

The following Theorem 1 shows the asymptotic property of the estimator of the parameter vector $θ = {(ρ, β^{T})}^{T}$.

Theorem 1. Under Assumptions 1–10,

$$\sqrt{n} (\hat{θ} - θ) \overset{D}{⟶} N (0, σ^{2} Σ^{- 1}),$$

where $\hat{θ} = {(\hat{ρ}, {\hat{β}}^{T})}^{T}$ and "$\overset{D}{⟶}$" denotes convergence in distribution.

Proof of Theorem 1. Let $e_{n} = η - Π α$; then

$$Y_{n} = Q θ + η + ε_{n} = Q θ + e_{n} + Π α + ε_{n} .$$

By the definition of $\hat{θ}$, we have

$$\begin{aligned} \hat{θ} - θ &= {(Q^{T} (I - P) M (I - P) Q)}^{- 1} Q^{T} (I - P) M (I - P) Y_{n} - θ \\ &= {(Q^{T} (I - P) M (I - P) Q)}^{- 1} Q^{T} (I - P) M (I - P) [(I - P) Y_{n}] - θ \\ &= {(Q^{T} (I - P) M (I - P) Q)}^{- 1} Q^{T} (I - P) M (I - P) (Q θ + e_{n} + ε_{n}) - θ \\ &= {(Q^{T} (I - P) M (I - P) Q)}^{- 1} Q^{T} (I - P) M (I - P) (e_{n} + ε_{n}) . \end{aligned}$$

First, consider $Q^{T} (I - P) M (I - P) Q$. Recall that $Y_{n} = {(I - ρ W_{n})}^{- 1} (Z_{n} β + η + ε_{n})$, so that

$$\begin{aligned} Q &= (W_{n} Y_{n}, Z_{n}) \\ &= (W_{n} {(I - ρ W_{n})}^{- 1} (Z_{n} β + η + ε_{n}), Z_{n}) \\ &= (W_{n} {(I - ρ W_{n})}^{- 1} (Z_{n} β + η), Z_{n}) + (W_{n} {(I - ρ W_{n})}^{- 1} ε_{n}, 0) \\ &\overset{Δ}{=} \tilde{Q} + \tilde{e}, \end{aligned}$$

where $\tilde{Q} = (S (Z_{n} β + η), Z_{n})$, $\tilde{e} = (S ε_{n}, 0)$, and $S = W_{n} {(I - ρ W_{n})}^{- 1}$.

Hence, one has

$$\begin{aligned} Q^{T} (I - P) M (I - P) Q &= {(\tilde{Q} + \tilde{e})}^{T} (I - P) M (I - P) (\tilde{Q} + \tilde{e}) \\ &= {\tilde{Q}}^{T} (I - P) M (I - P) \tilde{Q} + {\tilde{e}}^{T} (I - P) M (I - P) \tilde{e} \\ &\quad + {\tilde{Q}}^{T} (I - P) M (I - P) \tilde{e} + {\tilde{e}}^{T} (I - P) M (I - P) \tilde{Q} \\ &\overset{Δ}{=} R_{11} + R_{12} + R_{13} + R_{14}, \end{aligned}$$

where $R_{11} = {\tilde{Q}}^{T} (I - P) M (I - P) \tilde{Q}$, $R_{12} = {\tilde{e}}^{T} (I - P) M (I - P) \tilde{e}$, $R_{13} = {\tilde{Q}}^{T} (I - P) M (I - P) \tilde{e}$, and $R_{14} = {\tilde{e}}^{T} (I - P) M (I - P) \tilde{Q}$.

By the properties of the projection matrix and Assumption 3, we have

$$\begin{aligned} E [ε_{n}^{T} S^{T} (I - P) M (I - P) S ε_{n}] &= E (trace [ε_{n}^{T} S^{T} (I - P) M (I - P) S ε_{n}]) \\ &= E (trace [ε_{n}^{T} S^{T} (I - P) H {(H^{T} H)}^{- 1} H^{T} (I - P) S ε_{n}]) \\ &= E (trace [{(H^{T} H)}^{- \frac{1}{2}} H^{T} (I - P) S ε_{n} ε_{n}^{T} S^{T} (I - P) H {(H^{T} H)}^{- \frac{1}{2}}]) \\ &\leq σ^{2} ρ_{c} E (trace [{(H^{T} H)}^{- \frac{1}{2}} H^{T} (I - P) H {(H^{T} H)}^{- \frac{1}{2}}]) \\ &\leq σ^{2} ρ_{c} E (trace [{(H^{T} H)}^{- \frac{1}{2}} H^{T} H {(H^{T} H)}^{- \frac{1}{2}}]) \\ &= O (1) . \end{aligned}$$

Hence, we have

$$ε_{n}^{T} S^{T} (I - P) M (I - P) S ε_{n} = O_{p} (1) .$$

Then, we get that

$$R_{12} = {(S ε_{n}, 0)}^{T} (I - P) M (I - P) (S ε_{n}, 0) = O_{p} (1) .$$

By straightforward algebra, one has $E (R_{13}) = 0$. In addition, based on Assumption 3,

$$\begin{aligned} E (∥ {\tilde{Q}}^{T} (I - P) M (I - P) S ε_{n} ∥^{2}) &= E (trace [{\tilde{Q}}^{T} (I - P) M (I - P) S ε_{n} ε_{n}^{T} S^{T} (I - P) M (I - P) \tilde{Q}]) \\ &\leq σ^{2} ρ_{c} E (trace [{\tilde{Q}}^{T} (I - P) M (I - P) M (I - P) \tilde{Q}]) \\ &\leq σ^{2} ρ_{c} E (trace [{\tilde{Q}}^{T} M \tilde{Q}]) \\ &= σ^{2} ρ_{c} E (trace [{\tilde{Q}}^{T} H {(H^{T} H)}^{- 1} H^{T} \tilde{Q}]) \\ &= σ^{2} ρ_{c} E (trace [{(H^{T} H)}^{- \frac{1}{2}} H^{T} \tilde{Q} {\tilde{Q}}^{T} H {(H^{T} H)}^{- \frac{1}{2}}]) \\ &\leq σ^{2} ρ_{c} ρ_{c^{*}} E (trace [{(H^{T} H)}^{- \frac{1}{2}} H^{T} H {(H^{T} H)}^{- \frac{1}{2}}]) \\ &= O (1) . \end{aligned}$$

Therefore, we have $R_{13} = O_{p} (1)$. Similarly, we have $R_{14} = O_{p} (1)$. Combining the convergence rates of $R_{12}$, $R_{13}$, and $R_{14}$, we have

$$Q^{T} (I - P) M (I - P) Q = R_{11} + O_{p} (1) .$$

Now, we consider $Q^{T} (I - P) M (I - P) e_{n}$. Obviously,

$$\begin{aligned} Q^{T} (I - P) M (I - P) e_{n} &= {\tilde{Q}}^{T} (I - P) M (I - P) e_{n} + {\tilde{e}}^{T} (I - P) M (I - P) e_{n} \\ &= {\tilde{Q}}^{T} (I - P) M (I - P) e_{n} + ε_{n}^{T} S^{T} (I - P) M (I - P) e_{n} \\ &\overset{Δ}{=} R_{21} + R_{22}, \end{aligned}$$

where $R_{21} = {\tilde{Q}}^{T} (I - P) M (I - P) e_{n}$ and $R_{22} = ε_{n}^{T} S^{T} (I - P) M (I - P) e_{n}$.

By Lemma 1 of Hu et al. [27], we have

$$\begin{aligned} ∥e_{n}∥ &= ∥\sum_{j = 1}^{\infty} γ_{j} 〈 X_{n}, ϕ_{j} 〉 - \sum_{j = 1}^{m} γ_{j} 〈 X_{n}, {\hat{ϕ}}_{j} 〉∥ \\ &= ∥\sum_{j = 1}^{m} γ_{j} 〈 X_{n}, ϕ_{j} - {\hat{ϕ}}_{j} 〉 + \sum_{j = m + 1}^{\infty} γ_{j} 〈 X_{n}, ϕ_{j} 〉∥ \\ &\leq ∥\sum_{j = 1}^{m} γ_{j} 〈 X_{n}, ϕ_{j} - {\hat{ϕ}}_{j} 〉∥ + ∥\sum_{j = m + 1}^{\infty} γ_{j} 〈 X_{n}, ϕ_{j} 〉∥ . \end{aligned}$$

By Lemma 1 (b) of Kong et al. [7], with the help of Assumptions 7–9, we have

$$∥{\hat{ϕ}}_{j} - ϕ_{j}∥ = O_{p} (n^{- \frac{1}{2}} j) .$$

By Assumptions 7 and 9, one has

$$\begin{aligned} {∥\sum_{j = 1}^{m} γ_{j} 〈 X_{i}, ϕ_{j} - {\hat{ϕ}}_{j} 〉∥}^{2} &\leq {∥X_{i} (t)∥}^{2} {∥\sum_{j = 1}^{m} (ϕ_{j} - {\hat{ϕ}}_{j}) γ_{j}∥}^{2} \\ &= O_{p} ({(\sum_{j = 1}^{m} j^{- b} j n^{- \frac{1}{2}})}^{2}) \\ &= O_{p} (n^{- 1} m^{4 - 2 b}) . \end{aligned}$$

By Assumptions 9–10, one has

$$E | \sum_{j = m + 1}^{\infty} γ_{j} 〈 X_{i}, ϕ_{j} 〉 | \leq \sum_{j = m + 1}^{\infty} |γ_{j}| E |〈 X_{i}, ϕ_{j} 〉| = O (m^{- b + \frac{1}{2}}) .$$

Thus, we have

$$∥ \sum_{j = m + 1}^{\infty} γ_{j} 〈 X_{i}, ϕ_{j} 〉 ∥^{2} = O_{p} (m^{1 - 2 b})$$

and

$$\begin{aligned} {∥e_{n}∥}^{2} &\leq n \cdot {∥\sum_{j = 1}^{m} γ_{j} 〈 X_{i}, ϕ_{j} - {\hat{ϕ}}_{j} 〉∥}^{2} + n \cdot {∥\sum_{j = m + 1}^{\infty} γ_{j} 〈 X_{i}, ϕ_{j} 〉∥}^{2} \\ &= O_{p} (m^{4 - 2 b}) + O_{p} (n m^{1 - 2 b}) \\ &= o_{p} (n) . \end{aligned}$$

Combining this with Assumption 5, we have

$$\begin{aligned} E ({∥R_{21}∥}^{2}) &= E (trace [e_{n}^{T} (I - P) M (I - P) \tilde{Q} {\tilde{Q}}^{T} (I - P) M (I - P) e_{n}]) \\ &\leq ρ_{c^{*}} E (trace [e_{n}^{T} (I - P) M (I - P) e_{n}]) \\ &\leq ρ_{c^{*}} E (trace [e_{n}^{T} e_{n}]) \\ &= o_{p} (n) . \end{aligned}$$

Thus, we get $R_{21} = o_{p} (\sqrt{n})$. Similarly, we have $R_{22} = o_{p} (\sqrt{n})$.

Then, we find

$$\begin{aligned} \sqrt{n} (\hat{θ} - θ) &= \sqrt{n} {(R_{11} + O_{p} (1))}^{- 1} Q^{T} (I - P) M (I - P) (e_{n} + ε_{n}) \\ &= {(\frac{R_{11}}{n} + O_{p} (n^{- 1}))}^{- 1} (\frac{{\tilde{Q}}^{T} (I - P) M (I - P) ε_{n}}{\sqrt{n}} + o_{p} (1)) . \end{aligned}$$

Invoking the central limit theorem and Slutsky's theorem, we have

$$\sqrt{n} (\hat{θ} - θ) \overset{D}{⟶} N (0, σ^{2} Σ^{- 1}) .$$

□

The rate of convergence of the slope function estimator $\hat{γ} (t) = \sum_{k = 1}^{m} {\hat{γ}}_{k} {\hat{ϕ}}_{k} (t)$ is given in the following theorem.

Theorem 2. Under Assumptions 1–10,

$${∥\hat{γ} (t) - γ (t)∥}^{2} = O_{p} (n^{- \frac{2 b - 1}{a + 2 b}}) .$$

The proof of Theorem 2 follows the proof of Theorem 2 of Shin [10], so we omit it here.

4. Simulation Study

In this section, we conduct simulation studies to assess the finite sample performance of the proposed estimation method. The data $\{Y_{i}\}$ are generated from the following model:

$$Y_{n} = ρ W_{n} Y_{n} + Z_{n 1} β_{1} + Z_{n 2} β_{2} + \int_{0}^{1} γ (t) X_{n} (t) d t + ε_{n},$$

where $Z_{n 1} = {(Z_{11}, Z_{21}, \dots, Z_{n 1})}^{T}$, $Z_{n 2} = {(Z_{12}, Z_{22}, \dots, Z_{n 2})}^{T}$, $Z_{i 1}$ and $Z_{i 2}$ are independent and follow uniform distributions on $[- 1, 1]$ and $[0, 1]$, respectively, for $i = 1, 2, \dots, n$, $β_{1} = 1$, $β_{2} = - 1$, $γ (t) = \sqrt{2} \sin (π t / 2) + 3 \sqrt{2} \sin (3 π t / 2)$, and $ε_{n} \sim N (0, σ^{2} I_{n})$.

We suppose the functional predictors can be expressed as $X_{i} (t) = \sum_{j = 1}^{50} U_{i j} v_{j} (t)$, where the $U_{i j}$ are independently normally distributed with mean 0 and variance $λ_{j} = {((j - 0.5) π)}^{- 2}$, and $v_{j} (t) = \sqrt{2} \sin ((j - 0.5) π t)$. For the actual observations, we assume that they are realizations of $\{X_{i} (\cdot)\}$ on an equally spaced grid of 100 points in $[0, 1]$. As stated in Section 2, the truncation parameter $m$ is selected by the AIC criterion in our simulation. Similar to Lee [16] and Case [28], we focus on the spatial scenario with $R$ districts, $q$ members in each district, and each neighbor of a member in a district given equal weight, that is, $W_{n} = I_{R} \otimes B_{q}$, where $B_{q} = (l_{q} l_{q}^{T} - I_{q}) / (q - 1)$, $l_{q}$ is the $q$-dimensional vector of ones, and $\otimes$ denotes the Kronecker product. The simulations are run with $R = 50$ and $70$, $q = 2$, $5$, and $8$, and $σ^{2} = 0.25$ and $1$. For comparison, three values of $ρ$ are considered: $ρ = 0.2$ represents weak spatial dependence, $ρ = 0.5$ represents mild spatial dependence, and $ρ = 0.7$ represents relatively strong spatial dependence.
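The data-generating design described above can be sketched as follows. This is a hedged illustration rather than the authors' simulation code: the function name `simulate`, the returned dictionary keys, the random seed, and the use of the trapezoidal rule for $\int_{0}^{1} γ (t) X_{i} (t) d t$ are assumptions made for the example. The responses are obtained from the reduced form $Y_{n} = {(I - ρ W_{n})}^{- 1} (Z_{n} β + η + ε_{n})$.

```python
import numpy as np

def simulate(R, q, rho, sigma2, rng, n_grid=100, n_basis=50):
    """Draw one data set from the simulation design (a sketch under stated assumptions)."""
    n = R * q
    t = np.linspace(0.0, 1.0, n_grid)                  # equally spaced grid on [0, 1]
    freq = (np.arange(1, n_basis + 1) - 0.5) * np.pi
    V = np.sqrt(2.0) * np.sin(np.outer(t, freq))       # v_j(t) on the grid, (n_grid, n_basis)
    U = rng.normal(size=(n, n_basis)) / freq           # U_ij ~ N(0, ((j - 0.5) pi)^{-2})
    X = U @ V.T                                        # X_i(t) on the grid, (n, n_grid)
    gamma = (np.sqrt(2.0) * np.sin(np.pi * t / 2)
             + 3.0 * np.sqrt(2.0) * np.sin(3.0 * np.pi * t / 2))
    w = np.full(n_grid, t[1] - t[0])                   # trapezoidal quadrature weights
    w[[0, -1]] *= 0.5
    eta = (X * gamma) @ w                              # approximates int gamma(t) X_i(t) dt
    Z = np.column_stack([rng.uniform(-1, 1, n), rng.uniform(0, 1, n)])
    beta = np.array([1.0, -1.0])
    B_q = (np.ones((q, q)) - np.eye(q)) / (q - 1)
    W = np.kron(np.eye(R), B_q)                        # W_n = I_R kron B_q
    eps = rng.normal(scale=np.sqrt(sigma2), size=n)
    Y = np.linalg.solve(np.eye(n) - rho * W, Z @ beta + eta + eps)
    return {"Y": Y, "X": X, "Z": Z, "W": W, "t": t, "U": U,
            "eta": eta, "eps": eps, "beta": beta}

data = simulate(R=50, q=5, rho=0.5, sigma2=0.25, rng=np.random.default_rng(0))
print(data["Y"].shape, data["X"].shape, data["W"].shape)
```

Since $γ (t) = v_{1} (t) + 3 v_{2} (t)$ and the $v_{j}$ are orthonormal on $[0, 1]$, the integral term equals $U_{i 1} + 3 U_{i 2}$, which offers a quick check on the numerical integration.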

Throughout the simulations, for the scalar parameters $ρ$, $β_{1}$, and $β_{2}$, we use the average bias and standard deviation (SD) as measures of parametric estimation accuracy. The performance of the estimator of the slope function $γ (t)$ is assessed using the square root of average squared errors (RASE), defined as

$$RASE = {\{\frac{1}{N} \sum_{l = 1}^{N} {[\hat{γ} (t_{l}) - γ (t_{l})]}^{2}\}}^{1 / 2},$$

where $\{t_{l}, l = 1, \dots, N\}$ are the regular grid points at which the function $\hat{γ} (t)$ is evaluated. In our simulation, $N = 200$ is used.
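The RASE criterion is straightforward to compute. The sketch below assumes the regular grid consists of $N = 200$ equally spaced points on $[0, 1]$ including both endpoints; the function names `rase`, `gamma_true`, and `gamma_shifted`, and the artificially shifted "estimate", are hypothetical and serve only to illustrate the formula.

```python
import numpy as np

def rase(gamma_hat, gamma_true, N=200):
    """Square root of the average squared error over N regular grid points in [0, 1]."""
    t = np.linspace(0.0, 1.0, N)
    diff = gamma_hat(t) - gamma_true(t)
    return float(np.sqrt(np.mean(diff ** 2)))

def gamma_true(t):
    """Slope function used in the simulation design."""
    return (np.sqrt(2.0) * np.sin(np.pi * t / 2)
            + 3.0 * np.sqrt(2.0) * np.sin(3.0 * np.pi * t / 2))

def gamma_shifted(t):
    """Hypothetical estimate: the true curve shifted upward by 0.1."""
    return gamma_true(t) + 0.1

print(rase(gamma_shifted, gamma_true))   # close to 0.1 for a constant shift of 0.1
```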

The sample size is $n = R q$. We use 1000 Monte Carlo runs for estimation assessment, and summarize the results in Tables 1–3 and Figures 1 and 2. Tables 1–3 list the average bias and SD of the estimators of $ρ$, $β_{1}$, and $β_{2}$, and the average RASE of the estimator of $γ (t)$ over the 1000 replications. Figures 1 and 2 present the average estimated curves of $γ (t)$.

From Tables 1–3 and Figures 1 and 2, we can see that: (1) the biases of $\hat{ρ}$, ${\hat{β}}_{1}$, and ${\hat{β}}_{2}$ are fairly small in almost all cases; (2) the standard deviations of $\hat{ρ}$, ${\hat{β}}_{1}$, and ${\hat{β}}_{2}$ decrease as either $R$ or $q$ increases; (3) the RASEs of $\hat{γ} (t)$ are small in all cases and decrease as the sample size $n$ increases or $σ^{2}$ decreases, so the estimated curves fit the corresponding true curves well, which agrees with Figures 1 and 2. Overall, the simulation results suggest that the proposed estimation procedure is effective for the partial functional linear spatial autoregressive model.

5. Conclusions

In this paper, we proposed a partial functional linear spatial autoregressive model to study the link between a spatially dependent scalar response variable and explanatory variables containing both multiple real-valued scalar variables and a functional predictor. We then used the functional principal component basis and instrumental variables to estimate the parametric vector and the slope function with the two-stage least squares procedure. Under some mild conditions, we obtained the asymptotic normality of the estimator of the parametric vector. Furthermore, the rate of convergence of the proposed estimator of the slope function was also established. The simulation studies show that the proposed method performs satisfactorily and support the theoretical results.

There are some interesting future directions. In this paper, we only considered the estimation of the unknown parametric vector and slope function, which does not present a way to test for the effects of the covariates, an important aspect of any statistical analysis. In the future, we would like to be able to identify the model structure by testing for the main effects of the scalar predictors and the functional predictor. Another interesting direction can be to extend our new procedure to the generalized partial functional linear spatial autoregressive model.

Author Contributions

Conceptualization, Y.H.; Software, S.W.; methodology, Y.H. and S.F.; writing—original draft preparation, Y.H. and J.J.; writing—review and editing, S.F. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the National Social Science Foundation of China (18BTJ021).

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Ramsay, J.O.; Silverman, B.W. Functional Data Analysis; Springer: New York, NY, USA, 1997.
2. Cardot, H.; Ferraty, F.; Sarda, P. Spline estimators for the functional linear model. Statist. Sin. 2003, 13, 571–592.
3. Yao, F.; Müller, H.G.; Wang, J.L. Functional linear regression analysis for longitudinal data. Ann. Statist. 2005, 33, 2873–2903.
4. Lian, H.; Li, G. Series expansion for functional sufficient dimension reduction. J. Multivariate Anal. 2014, 124, 150–165.
5. Fan, Y.; James, G.M.; Radchenko, P. Functional additive regression. Ann. Statist. 2015, 43, 2296–2325.
6. Feng, S.Y.; Xue, L.G. Partially functional linear varying coefficient model. Statistics 2016, 50, 717–732.
7. Kong, D.; Xue, K.; Yao, F.; Zhang, H.H. Partially functional linear regression in high dimensions. Biometrika 2016, 103, 147–159.
8. Yu, P.; Du, J.; Zhang, Z.Z. Varying-coefficient partially functional linear quantile regression models. J. Korean Statist. Soc. 2017, 46, 462–475.
9. Zhang, D.; Lin, X.; Sowers, M.F. Assessing the effects of reproductive hormone profiles on bone mineral density using functional two-stage mixed models. Biometrics 2007, 63, 351–362.
10. Shin, H. Partial functional linear regression. J. Statist. Plann. Inference 2009, 139, 3405–3418.
11. Lu, Y.; Du, J.; Sun, Z. Functional partially linear quantile regression model. Metrika 2014, 77, 317–332.
12. Yu, P.; Zhang, Z.Z.; Du, J. A test of linearity in partial functional linear regression. Metrika 2016, 79, 953–969.
13. Cliff, A.; Ord, J.K. Spatial Autocorrelation; Pion: London, UK, 1973.
14. Anselin, L. Spatial Econometrics: Methods and Models; Kluwer: Dordrecht, The Netherlands, 1988.
15. Cressie, N. Statistics for Spatial Data; John Wiley & Sons: New York, NY, USA, 1993.
16. Lee, L.F. Asymptotic distributions of quasi-maximum likelihood estimators for spatial econometric models. Econometrica 2004, 72, 1899–1926.
17. Su, L.; Jin, S. Profile quasi-maximum likelihood estimation of partially linear spatial autoregressive models. J. Econom. 2010, 157, 18–33.
18. Koch, M.; Krisztin, T. Applications for asynchronous multi-agent teams in nonlinear applied spatial econometrics. J. Internet Technol. 2014, 12, 1007–1014.
19. Chen, J.; Wang, R.; Huang, Y. Semiparametric spatial autoregressive model: A two-step bayesian approach. Ann. Public Health Res. 2015, 2, 1012–1024.
20. Du, J.; Sun, X.Q.; Cao, R.Y.; Zhang, Z.Z. Statistical inference for partially linear additive spatial autoregressive models. Spat. Statist. 2018, 25, 52–67.
21. Dauxois, J.; Pousse, A.; Romain, Y. Asymptotic theory for the principal component analysis of a vector random function: Some applications to statistical inference. J. Multivariate Anal. 1982, 12, 136–154.
22. Cardot, H.; Ferraty, F.; Sarda, P. Functional linear model. Statist. Probab. Lett. 1999, 45, 11–22.
23. Hall, P.; Horowitz, J.L. Methodology and convergence rates for functional linear regression. Ann. Statist. 2007, 35, 70–91.
24. Hall, P.; Hosseini-Nasab, M. Theory for high-order bounds in functional principal components analysis. Math. Proc. Cambridge Philos. Soc. 2009, 146, 225–256.
25. Kelejian, H.H.; Prucha, I.R. A generalized spatial two-stage least squares procedure for estimating a spatial autoregressive model with autoregressive disturbances. J. Real Estate Financ. 1998, 17, 99–121.
26. Zhang, Y.Q.; Shen, D.M. Estimation of semi-parametric varying-coefficient spatial panel data models with random-effects. J. Statist. Plann. Inference 2015, 159, 64–80.
27. Hu, Y.P.; Xue, L.G.; Zhao, J.; Zhang, L.Y. Skew-normal partial functional linear model and homogeneity test. J. Statist. Plann. Inference 2020, 204, 116–127.
28. Case, A.C. Spatial patterns in household demand. Econometrica 1991, 59, 953–965.

Figure 1. Simulation result of $\hat{γ}(t)$ when $ρ = 0.2$, $R = 50$, $q = 5$, $σ^{2} = 0.25$. The solid curve denotes the true curve; the dashed curve denotes its estimate.

Figure 2. Simulation result of $\hat{γ}(t)$ when $ρ = 0.7$, $R = 70$, $q = 5$, $σ^{2} = 1$. The solid curve denotes the true curve; the dashed curve denotes its estimate.

Table 1. Simulation results for $ρ = 0.2$.

			$q = 2$			$q = 5$			$q = 8$
$σ^{2}$	R	Est	Bias	SD	RASE	Bias	SD	RASE	Bias	SD	RASE
0.25	50	$\hat{ρ}$	$-1.0 \times 10^{-4}$	0.022		$-1.4 \times 10^{-4}$	0.023		−0.001	0.021
		$\hat{β}_{1}$	−0.002	0.045		$-6.5 \times 10^{-4}$	0.028		$4.0 \times 10^{-4}$	0.022
		$\hat{β}_{2}$	0.003	0.048		−0.001	0.034		−0.002	0.030
		$\hat{γ}(t)$			0.400			0.255			0.197
	70	$\hat{ρ}$	$-1.4 \times 10^{-4}$	0.018		−0.001	0.019		$-9.2 \times 10^{-5}$	0.019
		$\hat{β}_{1}$	$5.1 \times 10^{-4}$	0.038		$-2.2 \times 10^{-4}$	0.023		$1.5 \times 10^{-4}$	0.018
		$\hat{β}_{2}$	$7.8 \times 10^{-4}$	0.042		$4.5 \times 10^{-4}$	0.029		$-1.8 \times 10^{-4}$	0.026
		$\hat{γ}(t)$			0.345			0.211			0.168
1	50	$\hat{ρ}$	$-5.4 \times 10^{-4}$	0.084		−0.006	0.092		−0.009	0.086
		$\hat{β}_{1}$	−0.012	0.179		−0.005	0.113		$6.6 \times 10^{-4}$	0.087
		$\hat{β}_{2}$	0.016	0.188		−0.006	0.136		−0.009	0.118
		$\hat{γ}(t)$			0.601			0.386			0.296
	70	$\hat{ρ}$	−0.001	0.070		−0.009	0.075		−0.004	0.075
		$\hat{β}_{1}$	$-5.8 \times 10^{-4}$	0.151		−0.002	0.092		$1.6 \times 10^{-4}$	0.072
		$\hat{β}_{2}$	0.003	0.166		$-3.7 \times 10^{-4}$	0.117		−0.003	0.104
		$\hat{γ}(t)$			0.517			0.317			0.251

Table 2. Simulation results for $ρ = 0.5$.

			$q = 2$			$q = 5$			$q = 8$
$σ^{2}$	R	Est	Bias	SD	RASE	Bias	SD	RASE	Bias	SD	RASE
0.25	50	$\hat{ρ}$	$-2.0 \times 10^{-4}$	0.017		$-1.3 \times 10^{-4}$	0.015		$-8.0 \times 10^{-4}$	0.014
		$\hat{β}_{1}$	−0.002	0.046		$-6.3 \times 10^{-4}$	0.029		$4.6 \times 10^{-4}$	0.022
		$\hat{β}_{2}$	0.003	0.051		−0.001	0.035		−0.002	0.030
		$\hat{γ}(t)$			0.401			0.255			0.197
	70	$\hat{ρ}$	$-1.9 \times 10^{-4}$	0.014		$-8.9 \times 10^{-4}$	0.013		$-7.2 \times 10^{-5}$	0.012
		$\hat{β}_{1}$	$6.0 \times 10^{-4}$	0.039		$-1.0 \times 10^{-4}$	0.023		$1.6 \times 10^{-4}$	0.018
		$\hat{β}_{2}$	$6.0 \times 10^{-4}$	0.045		$3.0 \times 10^{-4}$	0.030		$-2.0 \times 10^{-4}$	0.027
		$\hat{γ}(t)$			0.346			0.211			0.168
1	50	$\hat{ρ}$	−0.002	0.066		−0.004	0.062		−0.006	0.056
		$\hat{β}_{1}$	−0.010	0.182		−0.004	0.113		0.001	0.088
		$\hat{β}_{2}$	0.013	0.200		−0.008	0.141		−0.010	0.121
		$\hat{γ}(t)$			0.611			0.387			0.297
	70	$\hat{ρ}$	−0.002	0.055		−0.006	0.051		−0.003	0.049
		$\hat{β}_{1}$	$6.2 \times 10^{-4}$	0.153		−0.001	0.093		$3.8 \times 10^{-4}$	0.073
		$\hat{β}_{2}$	$6.7 \times 10^{-4}$	0.178		−0.002	0.121		−0.004	0.107
		$\hat{γ}(t)$			0.525			0.317			0.251

Table 3. Simulation results for $ρ = 0.7$.

			$q = 2$			$q = 5$			$q = 8$
$σ^{2}$	R	Est	Bias	SD	RASE	Bias	SD	RASE	Bias	SD	RASE
0.25	50	$\hat{ρ}$	$-1.9 \times 10^{-4}$	0.012		$-9.4 \times 10^{-5}$	0.010		$-5.0 \times 10^{-4}$	0.009
		$\hat{β}_{1}$	−0.001	0.047		$-6.1 \times 10^{-4}$	0.029		$5.1 \times 10^{-4}$	0.022
		$\hat{β}_{2}$	0.003	0.053		−0.001	0.036		−0.002	0.031
		$\hat{γ}(t)$			0.402			0.255			0.197
	70	$\hat{ρ}$	$-1.7 \times 10^{-4}$	0.010		$-5.7 \times 10^{-4}$	0.008		$-5.0 \times 10^{-5}$	0.007
		$\hat{β}_{1}$	$7.0 \times 10^{-4}$	0.040		$-2.0 \times 10^{-5}$	0.023		$1.6 \times 10^{-4}$	0.018
		$\hat{β}_{2}$	$4.4 \times 10^{-4}$	0.047		$1.9 \times 10^{-4}$	0.031		$-2.2 \times 10^{-4}$	0.027
		$\hat{γ}(t)$			0.348			0.211			0.168
1	50	$\hat{ρ}$	−0.003	0.046		−0.003	0.039		−0.004	0.035
		$\hat{β}_{1}$	−0.009	0.187		−0.004	0.114		0.002	0.088
		$\hat{β}_{2}$	0.010	0.210		−0.009	0.145		−0.011	0.123
		$\hat{γ}(t)$			0.622			0.389			0.298
	70	$\hat{ρ}$	−0.002	0.038		−0.004	0.032		−0.002	0.030
		$\hat{β}_{1}$	0.002	0.157		$-6.9 \times 10^{-4}$	0.093		$5.5 \times 10^{-4}$	0.073
		$\hat{β}_{2}$	−0.002	0.187		−0.003	0.124		−0.004	0.109
		$\hat{γ}(t)$			0.534			0.318			0.251
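
The Bias, SD, and RASE columns in Tables 1–3 summarize Monte Carlo replicates. As a minimal sketch, assuming the usual definitions (bias as the mean estimate minus the true value, SD as the sample standard deviation across replicates, and RASE as the root of the average squared error of the estimated curve over a grid), these summaries could be computed as follows. The function names, the `ddof=1` choice, and the toy inputs are hypothetical illustrations and are not taken from the paper.

```python
import numpy as np


def bias_and_sd(estimates, true_value):
    """Monte Carlo bias and sample SD of a scalar estimator.

    `estimates` holds one estimate per replicate. Using ddof=1 for the
    SD is an assumption of this sketch.
    """
    estimates = np.asarray(estimates, dtype=float)
    bias = float(estimates.mean() - true_value)
    sd = float(estimates.std(ddof=1))
    return bias, sd


def rase(gamma_hat, gamma_true):
    """Root average squared error of one estimated curve on a common grid."""
    gamma_hat = np.asarray(gamma_hat, dtype=float)
    gamma_true = np.asarray(gamma_true, dtype=float)
    return float(np.sqrt(np.mean((gamma_hat - gamma_true) ** 2)))


if __name__ == "__main__":
    # Toy inputs for illustration only; these are not the paper's data.
    rho_hats = [0.19, 0.21, 0.22, 0.18]
    print(bias_and_sd(rho_hats, true_value=0.2))

    grid = np.linspace(0.0, 1.0, 101)
    gamma_true = np.sin(np.pi * grid)   # hypothetical slope function
    gamma_hat = gamma_true + 0.05       # estimate with a constant offset
    print(rase(gamma_hat, gamma_true))
```

For a table entry, the per-replicate RASE values would typically be averaged over all replicates; whether the paper averages in exactly this way is an assumption here.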

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
