Abstract
In this paper, we focus on the partial functional linear model with linear process errors generated by not necessarily independent random variables. Based on Mercer's theorem and the Karhunen–Loève expansion, we construct estimators of the slope parameter and the coefficient function in the model, establish the asymptotic normality of the parameter estimator, and derive weak convergence rates for the proposed estimators. In addition, a penalized estimator of the parameter is defined via the SCAD penalty and its oracle property is investigated. The finite sample behavior of the proposed estimators is also analysed via simulations.
Keywords: asymptotic normality; convergence rate; linear process error; partial functional linear model; variable selection

MSC: 62J05; 62F12
1. Introduction
Over the last two decades, there has been increasing interest in functional data analysis due to its extensive applications in biometrics, chemometrics, econometrics, medical research and other fields. Functional data are intrinsically infinite-dimensional, so the classical methods for multivariate observations are no longer applicable. The functional linear model is an important model in functional data analysis and has been extensively investigated. Ramsay and Silverman [1] systematically introduced statistical analysis methods for functional data and described the regression relationship between functional covariates and scalar responses by functional linear models; further investigations include Cardot et al. [2], Cardot and Sarda [3], Li and Hsing [4], and Hall and Horowitz [5], who constructed estimators of the slope function in the functional linear model and established their convergence rates based on the functional principal component analysis (FPCA) technique. For more analysis of functional data, we refer to Hall and Hosseini-Nasab [6], Horváth and Kokoszka [7], Hsing and Eubank [8], Ferraty and Vieu [9], among others.
In real data, the response variable is also affected by other covariates. Shin [10] proposed the following partial functional linear model:
$$Y=\mathbf{Z}^{\top}\boldsymbol{\beta}+\int_0^1\gamma(t)X(t)\,dt+V,\qquad (1)$$
where $Y$ is the response variable, $\mathbf{Z}$ is a $d$-dimensional covariate vector with $E\mathbf{Z}=\mathbf{0}$, $X=\{X(t),\,t\in[0,1]\}$ is a square integrable stochastic process on $[0,1]$ with $EX(t)=0$, $\boldsymbol{\beta}$ is an unknown $d$-dimensional parameter vector, $\gamma(\cdot)$ is a square integrable unknown coefficient function and $V$ is the regression error, which is independent of $(\mathbf{Z},X)$.
The spline method is frequently employed to investigate functional data. Based on B-splines, Yuan and Zhang [11] used the residual sum of squares to construct a test statistic for the parametric component in model (1); Hu and Liang [12] considered empirical likelihood in the single-index partially functional linear model when the observations are missing at random; Jiang and Huang [13] discussed single-index partially functional linear quantile regression, used B-splines to approximate the link function and the slope function, and further established the convergence rates and asymptotic normality of the estimators. Bouka et al. [14] employed smoothing splines to study the estimators in a spatial functional linear regression model.
However, the spline method has some drawbacks; for example, shifting a control point changes the entire curve, making it hard to regulate the curve's trend locally. In recent years, many researchers have therefore developed an interest in the FPCA approach for analyzing functional data, since it enables a finite-dimensional study of a problem that is inherently infinite-dimensional. The estimation and testing of a partially functional linear varying coefficient model were covered by Feng and Xue [15]. Based on FPCA, Xie et al. [16] examined the asymptotic properties of a rank-based test for the parametric component in model (1). In addition, Hu et al. [17] concentrated on estimation for additive partial functional linear models with skew-normal errors. Tang et al. [18] proposed a two-step estimation procedure with FPCA in the partial functional partially linear additive model. Several papers also discuss penalized estimators related to the partially functional linear model based on FPCA. For instance, Kong et al. [19] applied a group penalty to select significant functional predictors in a high-dimensional setting; Du et al. [20] analysed estimation and variable selection; Yao et al. [21] selected the important variables based on the SCAD penalty in a partially functional linear quantile model; Wu et al. [22] constructed estimators of the parameter and slope function when the responses are right-censored and the censoring indicators are missing at random, and proposed a variable selection procedure via the adaptive lasso penalty.
It is known that the independence assumption for the model errors is not always appropriate in practical applications, especially for sequentially collected economic data, which often exhibit dependence in the errors. Wang et al. [23] established the asymptotic normality and weak convergence rates of the estimators of the parameter and the slope function, respectively, in model (1) when the errors form a stationary mixing sequence, while Hu and Liang [24] used the reproducing kernel Hilbert space technique to study the parameter estimator and the convergence rate of the estimator of the slope function with missing observations under correlated errors $V_i=\sum_{j=0}^{\infty}a_je_{i-j}$ (a linear process) with $\sum_{j=0}^{\infty}|a_j|<\infty$, and further defined a penalized estimator of the parameter via the SCAD penalty and a test statistic for checking a linear hypothesis.
Motivated by the discussion in Hu and Liang [24], in this paper we focus on the partially functional linear model (1) when the regression error $V$ is a linear process generated by not necessarily independent random variables, using the FPCA method. In particular, we construct the estimators $\hat{\boldsymbol{\beta}}$ and $\hat{\gamma}(\cdot)$ of $\boldsymbol{\beta}$ and $\gamma(\cdot)$, investigate the asymptotic normality of $\hat{\boldsymbol{\beta}}$, and discuss the weak convergence rates of $\hat{\boldsymbol{\beta}}$ and $\hat{\gamma}(\cdot)$. At the same time, a penalized estimator of $\boldsymbol{\beta}$ is defined based on the SCAD penalty introduced by Fan and Li [25] and its oracle property is established. The finite sample behavior of the proposed estimators is also investigated via simulations.
The rest of the paper is organized as follows. In Section 2, we construct the estimators of the parameter and the slope function, including the penalized estimator of the parameter. The main results are described in Section 3. A simulation study is presented in Section 4. Conclusions are given in Section 5, and all proofs are collected in Section 6.
2. Estimators
2.1. Least Squares Estimation
Let $\{(Y_i,\mathbf{Z}_i,X_i),\ 1\le i\le n\}$ come from $(Y,\mathbf{Z},X)$ based on model (1), i.e.,
$$Y_i=\mathbf{Z}_i^{\top}\boldsymbol{\beta}+\int_0^1\gamma(t)X_i(t)\,dt+V_i,\qquad i=1,\ldots,n,\qquad (2)$$
where the $(\mathbf{Z}_i,X_i)$ are assumed to be i.i.d. random variables, and the errors form the linear process $V_i=\sum_{j=0}^{\infty}a_je_{i-j}$ with $Ee_i=0$ and $\sum_{j=0}^{\infty}|a_j|<\infty$. Set $K(s,t)=\operatorname{Cov}(X(s),X(t))$. The covariance operator $K$ corresponding to $K(s,t)$ is defined by
$$(Kf)(s)=\int_0^1K(s,t)f(t)\,dt,\qquad f\in L^2[0,1].$$
Then $K$ is positive definite, i.e., $\langle Kf,f\rangle>0$ for $f\in L^2[0,1]$ with $f\neq 0$, where $\langle f,g\rangle=\int_0^1f(t)g(t)\,dt$ and $\|f\|=\langle f,f\rangle^{1/2}$. If $K(s,t)$ is continuous, then it has the following representation by Mercer's theorem (cf. Hsing and Eubank [8], Theorem 4.6.5, page 120):
$$K(s,t)=\sum_{j=1}^{\infty}\lambda_j\phi_j(s)\phi_j(t),$$
where $\{\phi_j\}_{j\ge1}$ is an orthonormal basis of $L^2[0,1]$ and $(\lambda_j,\phi_j)$ are the (eigenvalue, eigenfunction) pairs of $K$, which satisfy $K\phi_j=\lambda_j\phi_j$ with $\lambda_1\ge\lambda_2\ge\cdots\ge0$. Without loss of generality, we assume $\lambda_1>\lambda_2>\cdots>0$. The estimators of $(K(s,t),\lambda_j,\phi_j)$ are defined by
$$\hat{K}(s,t)=\frac{1}{n}\sum_{i=1}^{n}X_i(s)X_i(t),\qquad \hat{K}\hat{\phi}_j=\hat{\lambda}_j\hat{\phi}_j,$$
where $(\hat{\lambda}_j,\hat{\phi}_j)$ are the (eigenvalue, eigenfunction) pairs of the operator corresponding to $\hat{K}(s,t)$ with $\hat{\lambda}_1\ge\hat{\lambda}_2\ge\cdots\ge0$. Here, since each $\hat{\phi}_j$ is only identified up to sign, its sign is chosen to minimize $\|\hat{\phi}_j-\phi_j\|$ over the two possible signs, that is, $\hat{\phi}_j$ is chosen so that $\langle\hat{\phi}_j,\phi_j\rangle\ge0$. Clearly, $\{\hat{\phi}_j\}$ is an orthonormal basis of $L^2[0,1]$.
In addition, using the Karhunen–Loève expansion (cf. Hsing and Eubank [8], Theorem 2.4.13, page 34), $X_i$ and $\gamma$ have the following expressions:
$$X_i(t)=\sum_{j=1}^{\infty}\xi_{ij}\phi_j(t),\qquad \gamma(t)=\sum_{j=1}^{\infty}\gamma_j\phi_j(t),$$
where $\xi_{ij}=\langle X_i,\phi_j\rangle$ and $\gamma_j=\langle\gamma,\phi_j\rangle$. Then $E\xi_{ij}=0$, $E\xi_{ij}^2=\lambda_j$ and $E(\xi_{ij}\xi_{ik})=0$ for $j\neq k$. Thus, model (2) can be written as
$$Y_i=\mathbf{Z}_i^{\top}\boldsymbol{\beta}+\sum_{j=1}^{\infty}\gamma_j\xi_{ij}+V_i.\qquad (3)$$
In order to define the estimators of $\boldsymbol{\beta}$ and $\gamma$, we use an approximated form of (3),
$$Y_i\approx\mathbf{Z}_i^{\top}\boldsymbol{\beta}+\sum_{j=1}^{m}\gamma_j\hat{\xi}_{ij}+V_i,\qquad \hat{\xi}_{ij}=\langle X_i,\hat{\phi}_j\rangle,$$
with truncation level $m=m(n)$, which can be rewritten into the following matrix form:
$$\mathbf{Y}=\mathbb{Z}\boldsymbol{\beta}+\hat{\mathbf{U}}\boldsymbol{\gamma}_m+\mathbf{V},$$
where $\mathbf{Y}=(Y_1,\ldots,Y_n)^{\top}$, $\mathbb{Z}=(\mathbf{Z}_1,\ldots,\mathbf{Z}_n)^{\top}$, $\hat{\mathbf{U}}=(\hat{\xi}_{ij})_{1\le i\le n,\,1\le j\le m}$, $\boldsymbol{\gamma}_m=(\gamma_1,\ldots,\gamma_m)^{\top}$ and $\mathbf{V}=(V_1,\ldots,V_n)^{\top}$. The estimators of $(\boldsymbol{\beta},\boldsymbol{\gamma}_m)$ can be defined by minimizing the following objective function:
$$Q(\boldsymbol{\beta},\boldsymbol{\gamma}_m)=\big\|\mathbf{Y}-\mathbb{Z}\boldsymbol{\beta}-\hat{\mathbf{U}}\boldsymbol{\gamma}_m\big\|^2.\qquad (4)$$
Let $\hat{\mathbf{P}}=\hat{\mathbf{U}}(\hat{\mathbf{U}}^{\top}\hat{\mathbf{U}})^{-1}\hat{\mathbf{U}}^{\top}$ denote the projection onto the column space of $\hat{\mathbf{U}}$. When $\mathbb{Z}^{\top}(\mathbf{I}-\hat{\mathbf{P}})\mathbb{Z}$ is invertible, we have
$$\hat{\boldsymbol{\beta}}=\{\mathbb{Z}^{\top}(\mathbf{I}-\hat{\mathbf{P}})\mathbb{Z}\}^{-1}\mathbb{Z}^{\top}(\mathbf{I}-\hat{\mathbf{P}})\mathbf{Y},\qquad \hat{\boldsymbol{\gamma}}_m=(\hat{\mathbf{U}}^{\top}\hat{\mathbf{U}})^{-1}\hat{\mathbf{U}}^{\top}(\mathbf{Y}-\mathbb{Z}\hat{\boldsymbol{\beta}}).$$
Let $\delta_{jk}$ denote the Kronecker delta, then
$$\frac{1}{n}\sum_{i=1}^{n}\hat{\xi}_{ij}\hat{\xi}_{ik}=\hat{\lambda}_j\delta_{jk},$$
which implies $\hat{\mathbf{U}}^{\top}\hat{\mathbf{U}}=n\,\mathrm{diag}(\hat{\lambda}_1,\ldots,\hat{\lambda}_m)$. Put $\tilde{\mathbb{Z}}=(\mathbf{I}-\hat{\mathbf{P}})\mathbb{Z}$, $\tilde{\mathbf{Y}}=(\mathbf{I}-\hat{\mathbf{P}})\mathbf{Y}$.
Then $\hat{\boldsymbol{\beta}}$ and $\hat{\boldsymbol{\gamma}}_m$ can be rewritten, respectively, as
$$\hat{\boldsymbol{\beta}}=(\tilde{\mathbb{Z}}^{\top}\tilde{\mathbb{Z}})^{-1}\tilde{\mathbb{Z}}^{\top}\tilde{\mathbf{Y}},\qquad \hat{\boldsymbol{\gamma}}_m=\frac{1}{n}\mathrm{diag}(\hat{\lambda}_1^{-1},\ldots,\hat{\lambda}_m^{-1})\hat{\mathbf{U}}^{\top}(\mathbf{Y}-\mathbb{Z}\hat{\boldsymbol{\beta}}),$$
where $\hat{\gamma}_j$ denotes the $j$-th component of $\hat{\boldsymbol{\gamma}}_m$. The estimator of $\gamma$ is defined by
$$\hat{\gamma}(t)=\sum_{j=1}^{m}\hat{\gamma}_j\hat{\phi}_j(t).$$
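To make the estimation procedure concrete, the following R sketch implements the steps above for curves observed on a dense grid; it is a minimal illustration under stated assumptions (X observed on an equally spaced grid, Riemann-sum approximations of the integrals, illustrative function and variable names), not the authors' implementation.

```r
# Minimal sketch of the FPCA-based least squares estimators of Section 2.1.
# Assumptions: Xmat is an n x p matrix of X_i on p equally spaced points in
# [0, 1]; Z is an n x d covariate matrix; Y is the response vector; m is the
# truncation level. All names are illustrative.
pfl_fit <- function(Y, Z, Xmat, m) {
  n <- nrow(Xmat); p <- ncol(Xmat); h <- 1 / p        # grid spacing
  Khat <- crossprod(Xmat) / n                         # \hat K(s, t) on the grid
  eig  <- eigen(Khat * h, symmetric = TRUE)           # discretized operator
  lam  <- eig$values[1:m]                             # \hat\lambda_j
  phi  <- eig$vectors[, 1:m, drop = FALSE] / sqrt(h)  # \hat\phi_j, L2-normalized
  U    <- Xmat %*% phi * h                            # scores <X_i, \hat\phi_j>
  M    <- diag(n) - U %*% solve(crossprod(U), t(U))   # I - \hat P
  beta <- solve(t(Z) %*% M %*% Z, t(Z) %*% M %*% Y)           # \hat\beta
  gam  <- solve(crossprod(U), crossprod(U, Y - Z %*% beta))   # \hat\gamma_m
  list(beta = drop(beta), gamma_fun = drop(phi %*% gam),      # \hat\gamma(t)
       lambda = lam, phi = phi, scores = U)
}
```

A call such as `fit <- pfl_fit(Y, Z, Xmat, m = 5)` returns the parameter estimate and the slope function evaluated on the grid.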
2.2. Variable Selection
Variable selection is a crucial step when the dimensionality of the covariate $\mathbf{Z}$ in (1) is high, and it is of great interest to identify the nonzero components of $\boldsymbol{\beta}$. In this paper, we adopt the SCAD penalty introduced by Fan and Li [25] to obtain a penalized estimator. In particular, the first order derivative of the SCAD penalty function is
$$p'_{\omega}(u)=\omega\Big\{I(u\le\omega)+\frac{(a\omega-u)_+}{(a-1)\omega}I(u>\omega)\Big\},\qquad u\ge0,$$
where $\omega>0$ is a tuning parameter and $a=3.7$ is the value suggested by Fan and Li [25]. Hence, we define the penalized estimator of $\boldsymbol{\beta}$ as
$$\hat{\boldsymbol{\beta}}_P=\arg\min_{\boldsymbol{\beta}}\Big\{\big\|\tilde{\mathbf{Y}}-\tilde{\mathbb{Z}}\boldsymbol{\beta}\big\|^2+n\sum_{j=1}^{d}p_{\omega}(|\beta_j|)\Big\},\qquad (5)$$
where $p_{\omega}(\cdot)$ is the SCAD penalty function and $\beta_j$ is the $j$-th component of $\boldsymbol{\beta}$.
Remark 1.
In the simulations below, the tuning parameter ω in (5) is selected by 10-fold cross-validation.
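For illustration, a minimal R sketch of the SCAD derivative and one local quadratic approximation (LQA) step in the spirit of Fan and Li [25] follows; the LQA scheme and all names here are illustrative assumptions, not the authors' implementation, and Zt and Yt stand for the profiled design and response of Section 2.1.

```r
# First order derivative of the SCAD penalty (Fan and Li [25]), a = 3.7.
scad_deriv <- function(u, omega, a = 3.7) {
  omega * (u <= omega) + pmax(a * omega - u, 0) / (a - 1) * (u > omega)
}

# One LQA step for ||Yt - Zt %*% beta||^2 + n * sum p_omega(|beta_j|),
# given a current iterate beta0; eps guards against division by zero.
scad_lqa_step <- function(Yt, Zt, beta0, omega, eps = 1e-6) {
  n <- length(Yt)
  w <- scad_deriv(abs(beta0), omega) / (abs(beta0) + eps)  # penalty weights
  solve(2 * crossprod(Zt) + n * diag(w, length(beta0)),
        2 * crossprod(Zt, Yt))
}
```

Iterating `scad_lqa_step` until the coefficients stabilize, and setting sufficiently small components to zero, yields a sketch of the penalized fit; ω would be chosen by the 10-fold cross-validation mentioned in Remark 1.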
Let $\boldsymbol{\beta}^0$ be the true value of $\boldsymbol{\beta}$. Without loss of generality, we assume $\boldsymbol{\beta}^0=((\boldsymbol{\beta}^0_{(1)})^{\top},(\boldsymbol{\beta}^0_{(2)})^{\top})^{\top}$, where the $s$-dimensional $\boldsymbol{\beta}^0_{(1)}$ and $\boldsymbol{\beta}^0_{(2)}$ collect the nonzero and zero components of $\boldsymbol{\beta}^0$, respectively, i.e., $\boldsymbol{\beta}^0_{(2)}=\mathbf{0}$, and the penalized estimator is partitioned accordingly as $\hat{\boldsymbol{\beta}}_P=((\hat{\boldsymbol{\beta}}_{P(1)})^{\top},(\hat{\boldsymbol{\beta}}_{P(2)})^{\top})^{\top}$.
3. Main Results
In the sequel, let $C$ and $C_1$ denote generic finite positive constants, whose values may change from line to line; means ; . For convenience of statement, we give the following notations.
and for .
In order to list the main results in this paper, we impose the following assumptions.
- (A0)
- Let the random variables $\{e_i\}$ be identically distributed with finite second moment, or square uniformly integrable, and satisfy $Ee_i=0$.
- (A1)
- a.s., and .
- (A2)
- .
- (A3)
- For each $j$, $E\xi_{1j}^4\le C\lambda_j^2$ and $\lambda_j-\lambda_{j+1}\ge C^{-1}j^{-\alpha-1}$, for some $\alpha>1$.
- (A4)
- For each j and some : (i) ; (ii) for each k.
- (A5)
- .
- (A6)
- Let be i.i.d. random variables and satisfy a.s., a.s., which is k-th diagonal element of and is a positive definite matrix. Assume that for .
- (A7)
- , , .
Remark 2.
- (a)
- It is easy to verify .
- (b)
- (A1)–(A3), (A4)(i), (A5) and (A6) are general regularity conditions in the partially functional linear model (cf. Shin [10]); since the $\lambda_j$ decrease, (A3) implies $\lambda_j\ge Cj^{-\alpha}$. (A1) implies . In fact, , which implies .
- (c)
- From (A3) and (A4)(ii), we have
Theorem 1.
Let (A0)–(A6) hold with a.s. and a.s. for . Then
Theorem 2.
Let (A0)–(A6) hold. Then
Remark 3.
When $a_0=1$ and $a_j=0$ for $j\ge1$, so that the errors $V_i=e_i$ are i.i.d. random variables, Theorems 1 and 2 reduce to Theorems 3.1 and 3.2 of Shin [10], respectively.
Theorem 3.
Suppose that (A0)–(A7) hold. If $\omega\to0$ and $\sqrt{n}\,\omega\to\infty$ as $n\to\infty$, then
- (1)
- Selection consistency: $P(\hat{\boldsymbol{\beta}}_{P(2)}=\mathbf{0})\to1$ as $n\to\infty$;
- (2)
- Asymptotic normality: If a.s. and a.s. for , then where and is related to , which corresponds to , and is an $s$-dimensional subvector of for .
4. Simulation Study
4.1. Least Squares Estimation
In this subsection, we use Monte Carlo simulation to study the performance of the proposed methods. The data are generated from the following model:
$$Y_i=\mathbf{Z}_i^{\top}\boldsymbol{\beta}+\int_0^1\gamma(t)X_i(t)\,dt+V_i,\qquad (6)$$
with specified true values of $\boldsymbol{\beta}$ and $\gamma(t)$. The process $X_i$ in (6) is generated from a truncated Karhunen–Loève expansion, and the truncation level $m$ is chosen by the CPV method (see Horváth and Kokoszka [7], page 41), i.e.,
$$m=\min\Big\{k\ge1:\ \sum_{j=1}^{k}\hat{\lambda}_j\Big/\sum_{j=1}^{n}\hat{\lambda}_j\ge\tau\Big\}$$
for a prescribed proportion $\tau$ (e.g., 85%).
Let $\{V_i\}$ be an AR(1) process: $V_i=\rho V_{i-1}+e_i$, where $\{e_i\}$ are i.i.d. innovations and $|\rho|<1$. Thus, $V_i=\sum_{j=0}^{\infty}\rho^je_{i-j}$ is a linear process of the required form.
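The following R sketch shows one way to generate data of this form; the basis functions, the score and covariate distributions, and the true values of β and γ are illustrative assumptions, since the specific choices fixed in (6) are not reproduced here. The AR(1) errors are drawn with arima.sim, consistent with the linear process representation above.

```r
# Illustrative data generator for a model of type (6); the true beta, gamma
# and all distributional choices below are assumptions for demonstration.
gen_data <- function(n, rho, p = 100, J = 50) {
  tgrid <- seq(0, 1, length.out = p)
  phi   <- sapply(1:J, function(j) sqrt(2) * cos(j * pi * tgrid))  # basis
  xi    <- sapply(1:J, function(j) rnorm(n, 0, j^(-1)))            # scores
  Xmat  <- xi %*% t(phi)                                           # X_i(t)
  gamma <- (1:J)^(-2)                                              # gamma_j
  gfun  <- drop(phi %*% gamma)                                     # gamma(t)
  Z     <- matrix(rnorm(2 * n), n, 2); beta <- c(1, 2)             # assumed
  V     <- as.numeric(arima.sim(list(ar = rho), n))                # AR(1)
  Y     <- drop(Z %*% beta) + drop(xi %*% gamma) + V               # model (6)
  list(Y = Y, Z = Z, Xmat = Xmat, tgrid = tgrid, gfun = gfun, beta = beta)
}
```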
In the simulation, we take several values of the dependence parameter ρ and sample sizes n = 50 and 200. For each sample size, we replicate the simulation and take grid points of equal interval in [0, 1]. The mean square errors (MSE) of the estimators $\hat{\boldsymbol{\beta}}$ and $\hat{\gamma}$ of $\boldsymbol{\beta}$ and $\gamma$ are computed over the replications, respectively, as
$$\mathrm{MSE}(\hat{\boldsymbol{\beta}})=\frac{1}{N}\sum_{r=1}^{N}\big\|\hat{\boldsymbol{\beta}}^{(r)}-\boldsymbol{\beta}\big\|^2,\qquad \mathrm{MSE}(\hat{\gamma})=\frac{1}{NG}\sum_{r=1}^{N}\sum_{s=1}^{G}\big\{\hat{\gamma}^{(r)}(t_s)-\gamma(t_s)\big\}^2,$$
where $N$ is the number of replications, $t_1,\ldots,t_G$ are the grid points and the superscript $(r)$ indexes the replications.
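A minimal sketch of the replication loop and the MSE computations, reusing the hypothetical gen_data and pfl_fit helpers sketched earlier (the number of replications N and the CPV proportion are illustrative):

```r
# Monte Carlo loop computing MSE(beta_hat) and MSE(gamma_hat); gen_data and
# pfl_fit are the illustrative helpers sketched above.
mc_mse <- function(n, rho, N = 500, cpv = 0.85) {
  err_b <- err_g <- numeric(N)
  for (r in 1:N) {
    d   <- gen_data(n, rho)
    lam <- eigen(crossprod(d$Xmat) / n / ncol(d$Xmat), symmetric = TRUE)$values
    m   <- which(cumsum(lam) / sum(lam) >= cpv)[1]        # CPV choice of m
    fit <- pfl_fit(d$Y, d$Z, d$Xmat, m)
    err_b[r] <- sum((fit$beta - d$beta)^2)                # ||beta_hat - beta||^2
    err_g[r] <- mean((fit$gamma_fun - d$gfun)^2)          # grid-averaged error
  }
  c(MSE_beta = mean(err_b), MSE_gamma = mean(err_g))
}
```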
In Figure 1, we draw Q-Q plots of the estimators and with sample sizes n = 50, 200 and different values of ρ. Figure 1 illustrates that more points fall near the reference line as n increases and ρ decreases, and that more points lie away from the line when n = 50 and ρ = 0.9. This implies that the quality of fit decreases as the dependence of the observations increases, i.e., as the value of ρ increases, and that the normality of the distribution of the estimators improves as the sample size n increases, which confirms the asymptotic normality in Theorem 1.
Figure 1.
Q-Q plots of and with (the first row), (the second row), (the first and second column) and (the third and fourth column).
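Q-Q diagnostics of this kind can be produced directly from the replicated estimates; a minimal sketch, assuming the hypothetical vector rep_beta collects one estimate of a component of β per replication:

```r
# Q-Q plot of replicated estimates of one component of beta against normal
# quantiles; rep_beta is a vector of N replicated estimates (assumed).
qq_check <- function(rep_beta) {
  est <- as.numeric(scale(rep_beta))   # center and studentize the estimates
  qqnorm(est, main = "Q-Q plot of standardized estimates")
  qqline(est)                          # reference line through the quartiles
}
```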
In Table 1, we report the biases and MSEs of the estimators for , and . From Table 1, we can draw the following conclusions:
Table 1.
Values of bias, MSEs for , and with and .
- (1)
- For the same sample size n, the MSE values of the estimators increase as ρ increases;
- (2)
- With the same ρ, if the sample size n increases, then the MSE values decrease;
- (3)
- Changes in the sample size n and ρ have little effect on the biases of the estimators.
4.2. Variable Selection
In this subsection, we add six independent covariates to model (6), so that the augmented coefficient vector contains zero components. Set C as the average number of components of the penalized estimator correctly estimated to be zero, IC as the average number of components incorrectly estimated to be zero, C-fit as the probability of exactly fitting the model, and MSE as the mean square error of the estimator; a sketch of how these selection metrics can be computed is given after this paragraph.
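A minimal sketch of computing C, IC and C-fit from replicated penalized estimates (hypothetical inputs: B_hat, an N × d matrix of estimates, and true_zero, the indices of the true zero coefficients):

```r
# Selection metrics for the SCAD-penalized estimator over N replications.
# B_hat: N x d matrix of estimates; true_zero: indices of true zero betas.
selection_metrics <- function(B_hat, true_zero, tol = 1e-8) {
  is_zero <- abs(B_hat) < tol                       # estimated zero pattern
  C    <- mean(rowSums(is_zero[, true_zero, drop = FALSE]))
  IC   <- mean(rowSums(is_zero[, -true_zero, drop = FALSE]))
  Cfit <- mean(apply(is_zero, 1, function(z)
            all(z[true_zero]) && !any(z[-true_zero])))
  c(C = C, IC = IC, C_fit = Cfit)
}
```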
Figure 2 shows Q-Q plots of the SCAD-penalized estimators with n = 50, 200 and values of ρ up to 0.9. The performance of the Q-Q plots in this instance is comparable to that in Section 4.1: the plots align more closely with normality as the sample size rises, which supports the asymptotic normality in Theorem 3, while the fit deteriorates as the value of ρ increases.
Figure 2.
Q-Q plots of the estimators for with (the first row), (the second row), (the first and second column) and (the third and fourth column) by the method of SCAD.
In Table 2, we report the values of C, IC, C-fit and MSE. Table 2 indicates the following conclusions:
Table 2.
Values of C, IC, C-fit and MSE.
- (1)
- When the sample size n increases with the same ρ, the value of MSE decreases;
- (2)
- (i)
- If the sample size n increases with the same ρ, the average number of zero coefficients correctly estimated to be zero is close to 5, and the average number of components incorrectly estimated to be zero is near 0, even equaling 0 in some settings. This verifies the selection consistency in Theorem 3;
- (ii)
- When the value of ρ decreases with the same sample size, the average number of zero coefficients correctly estimated to be zero increases, and the average number of components incorrectly estimated to be zero decreases;
- (3)
- As the sample size n increases with the same ρ, or as ρ decreases with the same sample size, the probability of exactly fitting the model increases.
5. Conclusions
Using the least squares method based on FPCA, we construct estimators of the parameter and the coefficient function in the partially functional linear model with linear process errors, establish the asymptotic normality of the parameter estimator, and derive the rate of convergence of the estimator of the coefficient function. Additionally, we use the SCAD penalty to define a penalized estimator of the parameter and discuss its oracle property.
However, the proposed method has some limitations. First, we approximate the functional part by a partial sum via the FPCA method, which may discard information due to the truncation level m. Second, this work considers the scenario of complete data, while missing data often arise in practice; in the event of missing data, our proposed method may be inapplicable. In the future, we are interested in reducing this information loss and taking missing data into account.
6. Proofs of Main Results
In the proofs below, we use the following notations: for a linear operator T, let , for ; for a matrix , ; for , set , , , and .
Lemma 1
(Shin [10]).
- (1)
- Let , then , and for
- (2)
- Suppose that (A1), (A3), (A4)(i), (A5) and (A6) are satisfied. Then Further, if are identically distributed with finite second moment or square uniformly integrable,
Lemma 2
(Pollard [26], page 171). Let be a sequence of random variables and be an increasing sequence of σ-fields such that is measurable with respect to , and for . Assume that and for some constant and every . Then .
Lemma 3.
For , let be random variables with and . Set with . Assume that is independent of , and that are identically distributed with finite second moment or square uniformly integrable. Then
Proof.
Note that So which yields □
Lemma 4.
If (A1) and for each k hold, then and , where , .
Proof.
From , using (A1) we have
□
Lemma 5.
Let and be two sequences of independent random variables. Then, for any and , we have
Proof.
Using independence between and , it follows that
□
Proof of Theorem 1.
We write (cf. the proof of Theorem 3.1 in Shin [10])
Lemma 1 implies . Thus, it suffices to show that , and
Step 1. We prove . By applying Lemmas 1 and 3, it follows that
which yields .
Step 2. We prove . From , we know that the k-th () element of is
Since , to prove , we only need to prove that
To do this, we need the following results (i) and (ii), whose proofs can be found in Hall and Horowitz [5]:
- (i)
- If we have .
- (ii)
- . Furthermore, let , then .
Note that , then, from , we have
Using we get
The result (ii) implies on , hence using (11) and , On we have
By and , we find
When and using the conclusion and Lemma 4
The inequality (5.16) in Hall and Horowitz [5] shows , thus, from we have
On , we have Obviously, from we have
and for . Then, in view of (A3), (A4) and Lemma 4, we obtain
On , one can write
According to (A3) and (A4), by using Lemma 4 we have
Then, (10) is verified.
By and using , it follows that, for any
which implies .
Now, we use Lemma 2 to prove . In fact, , where . Set
Then is measurable with respect to , and for , . Thus, from Lemma 2, we only need to verify that
We first prove (13). Applying a.s., we can write
The law of large numbers implies , hence from we obtain
Since is a sequence of independent random vectors with , we have
which gives . Therefore, (13) is proved.
We next prove (14). Using Lemma 5, we write
According to the moment inequality for sum of independent random variables, we can write
By , we have
When are identically distributed, from we have ; when are square uniformly integrable, we have . Hence . Therefore, , which verifies (14). □
Proof of Theorem 2.
Applying Lemmas 1 and 3 and , and following the lines of the proof of Theorem 3.2 in Shin [10], one can prove this result. □
Proof of Theorem 3.
(1) Let and . We first prove that, for any and a large constant ,
which implies there exists a local minimizer such that .
In fact, . Let . Using the Taylor expansion, we have
and from , it follows that
where , and . (8), (10) and (9) imply . Hence, from Theorem 2 and we have
Next, we consider . Clearly, , where and are defined in the proof of Theorem 1; furthermore, in that proof it holds that and . Thus, .
From Lemma 1, we have , which implies .
As for , from (A7) we get
Therefore, for , we have , which yields (15) since is a positive definite matrix.
Note that Then
where . Thus, for any satisfying , by (A7) and Theorem 2 we find
then the sign of is dominated by the sign of . Note that is the estimator of . Then as .
(2) From the proof in (1) above, we know and . Hence, from (16) we get
where , and . Due to Lemma 1, we can obtain . Then, by (A7),
Similar to the proof of Theorem 1, we have , where and are defined similarly to and , which are related to for . Then, following the lines of Step 2 and Step 3 of the proof of Theorem 1, one can verify . □
Author Contributions
Conceptualization, Y.H.; Methodology, Y.H.; Software, Y.H.; Formal analysis, Y.H.; Data curation, Y.H.; Writing—original draft, Y.H.; Writing—review & editing, Z.P.; Visualization, Z.P. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Data Availability Statement
In this paper, the Monte Carlo simulation method was used for data analysis, and R software was used to generate the required data. No new data were created in this paper.
Conflicts of Interest
The authors declare no conflict of interest.
References
- Ramsay, J.O.; Silverman, B.W. Functional Data Analysis; Springer: New York, NY, USA, 1997. [Google Scholar]
- Cardot, H.; Ferraty, F.; Sarda, P. Spline estimators for the functional linear model. Stat. Sin. 2003, 13, 571–591. [Google Scholar]
- Cardot, H.; Sarda, P. Linear Regression Models for Functional Data. In The Art of Semiparametrics; Contributions to Statistics; Physica-Verlag: Heidelberg, Germany, 2006; pp. 49–66. [Google Scholar]
- Li, Y.; Hsing, T. On rates of convergence in functional linear regression. J. Multivar. Anal. 2007, 98, 1782–1804. [Google Scholar] [CrossRef]
- Hall, P.; Horowitz, J.L. Methodology and convergence rates for functional linear regression. Ann. Stat. 2007, 35, 70–91. [Google Scholar] [CrossRef]
- Hall, P.; Hosseini-Nasab, M. On properties of functional principal components analysis. J. R. Stat. Soc. Ser. B Stat. Methodol. 2006, 68, 109–126. [Google Scholar] [CrossRef]
- Horváth, L.; Kokoszka, P. Inference for Functional Data with Applications; Springer: New York, NY, USA, 2012. [Google Scholar]
- Hsing, T.; Eubank, R. Theoretical Foundations of Functional Data Analysis, with an Introduction to Linear Operators; John Wiley & Sons, Ltd.: Chichester, UK, 2015. [Google Scholar]
- Ferraty, F.; Vieu, P. Nonparametric Functional Data Analysis: Theory and Practice; Springer: New York, NY, USA, 2006. [Google Scholar]
- Shin, H. Partial functional linear regression. J. Stat. Plan. Inference 2009, 139, 3405–3418. [Google Scholar] [CrossRef]
- Yuan, M.G.; Zhang, Y. Test for the parametric part in partial functional linear regression based on B-spline. Commun. Stat.-Simul. Comput. 2021, 50, 1–15. [Google Scholar] [CrossRef]
- Hu, Y.P.; Liang, H.Y. Empirical likelihood in single-index partially functional linear model with missing observations. Commun. Stat.-Theory Methods 2022. [Google Scholar] [CrossRef]
- Jiang, Z.Q.; Huang, Z.S. Single-index partially functional linear quantile regression. Commun. Stat.-Theory Methods 2022. [Google Scholar] [CrossRef]
- Bouka, S.; Dabo-Niang, S.; Nkiet, G.M. On estimation and prediction in spatial functional linear regression model. Lith. Math. J. 2023, 63, 13–30. [Google Scholar] [CrossRef]
- Feng, S.; Xue, L. Partially functional linear varying coefficient model. Statistics 2016, 50, 717–732. [Google Scholar] [CrossRef]
- Xie, T.F.; Cao, R.Y.; Yu, P. Rank-based test for partial functional linear regression models. J. Syst. Sci. Complex. 2020, 33, 1571–1584. [Google Scholar] [CrossRef]
- Hu, Y.; Xue, L.; Zhao, J.; Zhang, L. Skew-normal partial functional linear model and homogeneity test. J. Stat. Plan. Inference 2020, 204, 116–127. [Google Scholar] [CrossRef]
- Tang, Q.G.; Tu, W.; Kong, L.L. Estimation for partial functional partially linear additive model. Comput. Stat. Data Anal. 2023, 177, 107584. [Google Scholar] [CrossRef]
- Kong, D.; Xue, K.; Yao, F.; Zhang, H.H. Partially functional linear regression in high dimensions. Biometrika 2016, 103, 147–159. [Google Scholar] [CrossRef]
- Du, J.; Xu, D.; Cao, R. Estimation and variable selection for partially functional linear models. J. Korean Stat. Soc. 2018, 47, 436–449. [Google Scholar] [CrossRef]
- Yao, F.; Sue-Chee, S.; Wang, F. Regularized partially functional quantile regression. J. Multivar. Anal. 2017, 156, 39–56. [Google Scholar] [CrossRef]
- Wu, C.X.; Ling, N.X.; Vieu, P.; Liang, W.J. Partially functional linear quantile regression model and variable selection with censoring indicators MAR. J. Multivar. Anal. 2023, 197, 105189. [Google Scholar] [CrossRef]
- Wang, Y.F.; Du, J.; Zhang, Z.G. Partial functional linear models with dependent errors. Acta Math. Appl. Sin. 2017, 40, 49–65. (In Chinese) [Google Scholar]
- Hu, Y.P.; Liang, H.Y. Functional regression with dependent error and missing observation in reproducing kernel Hilbert spaces. J. Korean Stat. Soc. 2023. [Google Scholar] [CrossRef]
- Fan, J.; Li, R. Variable selection via nonconcave penalized likelihood and its oracle properties. J. Am. Stat. Assoc. 2001, 96, 1348–1360. [Google Scholar] [CrossRef]
- Pollard, D. Convergence of Stochastic Processes; Springer: New York, NY, USA, 1984. [Google Scholar]