Estimation for Partial Functional Multiplicative Regression Model

Xiaojing Liu; Ping Yu; Jianhong Shi

doi:10.3390/math13030471

,

and

School of Mathematics and Computer Science, Shanxi Normal University, Taiyuan 031031, China

^*

Author to whom correspondence should be addressed.

Mathematics2025, 13(3), 471;https://doi.org/10.3390/math13030471

This article belongs to the Special Issue New Advances in High-Dimensional and Non-asymptotic Statistics

Version Notes

Order Reprints

Abstract

Functional data such as curves, shapes, and manifolds have become more and more common with modern technological advancements. The multiplicative regression model is well suited for analyzing data with positive responses. In this study, we study the estimation problems of the partial functional multiplicative regression model (PFMRM) based on the least absolute relative error (LARE) criterion and least product relative error (LPRE) criterion. The functional predictor and slope function are approximated by the functional principal component basis functions. Under certain regularity conditions, we derive the convergence rate of the slope function and establish the asymptotic normality of the slope vector for two estimation methods. Monte Carlo simulations are carried out to evaluate the proposed methods, and an application to Tecator data is investigated for illustration.

Keywords:

functional principal component analysis; least absolute relative error; least product relative error; partial functional multiplicative regression model

MSC:

62L05; 62L12; 62F10; 62F12; 62F25

1. Introduction

We consider the PFMRM which includes some scalar covariates and a functional predictor, paired with a scalar response. That is,

Y = \exp (Z^{⊤} α + \int_{T} β (t) X (t) d t) ϵ,

(1)

where Y is a positive response variable;

Z = {(Z_{1}, \dots, Z_{p})}^{⊤}

is a p-dimensional vector covariate,

α = {(α_{1}, \dots, α_{p})}^{⊤}

is the p-vector of slope coefficients, in which p is assumed to be fixed;

β (t) \in L^{2} (T)

is the unknown slope function associated with functional predictor

X (t)

; and

ϵ

is a positive random error independent of

Z

and

X (t)

. Here, the Hilbert space

L^{2} (T)

is the set of all square integrable functions on T, endowed with the inner product

⟨ x, Y ⟩ = \int_{T} x (t) Y (t) d t

and the norm

∥ x ∥ = {⟨ x, x ⟩}^{1 / 2}

. The model (1) generalizes both the classic multiplicative regression model [1] and the functional multiplicative regression model [2] that correspond to the cases of

β (t) = 0

and

α = 0

, respectively. When the log transformation applies to this model, the above model simply degrades to a partial functional linear regression model [3]. When the response variable Y is a failure time, model (1) is called the functional accelerated failure time model in survival analysis; see [4] for example. For simplicity of notation, we assume that

T = [0, 1]

, and

Z

and

X (t)

have zero mean throughout the study.

In many applications, the response variable is positive, for example, survival time, stock prices, income, body fat level, emissions of nitrogen oxides, and the value of owner-occupied homes frequently arise in statistical practice. The multiplicative regression model plays an important role in describing these types of data. To estimate the multiplicative regression models, Refs. [1,5] proposed LARE and LPRE estimation, respectively. The LARE criterion minimizes

\sum_{i = 1}^{n} (|ϵ_{i}^{- 1} - 1| + |ϵ_{i} - 1|)

, and the LPRE criterion minimizes

\sum_{i = 1}^{n} (|ϵ_{i}^{- 1} - 1| \times |ϵ_{i} - 1|)

, which is equivalent to minimizing

\sum_{i = 1}^{n} (ϵ_{i}^{- 1} + ϵ_{i})

. As pointed out by [5], the LARE estimation is robust and scale-free, but optimization of its use may be challenging as the objective function minimized is non-smoothing. In addition, confidence intervals for parameters are not very accurate due to the complexity of the asymptotic covariance matrix, which involves the density of the model error. In order to overcome the shortcoming of LARE, Ref. [5] proposed the LPRE criterion, which is strictly convex and infinitely differentiable, and the optimization procedure is much easier. In recent years, due to the excellent properties of LARE and LPRE estimation, scholars in various fields have been attracted to conducting extended research on them. The readers can refer to [6,7,8].

For functional multiplicative models, to the best of our knowledge, there are only a few works and all of them focus on the above two criteria. For example, Ref. [9] developed the functional quadratic multiplicative model and derived the asymptotic properties of the estimator with the LARE criterion. Later, Refs. [2,10] considered the variable selection for partially and locally sparse functional linear multiplicative models based on the LARE criterion. In this paper, we consider the modeling of a positive scalar response variable with both scalar and functional predictors under the PFMRM. The above two criteria are employed to estimate the parametric vector

α

and the slope function

β (t)

in model (1).

The major contributions of this paper are four-fold. First, this study first extends the LPRE criterion to the estimation of functional regression models. Second, we estimate the unknown slope function and functional predictor by using a functional principal component analysis technique, derive the convergence rates of the slope function, and establish the asymptotic normality of the parameter vector under mild regularity conditions for two estimation methods. Third, we develop an iterative algorithm to solve the involved optimization problem and propose a data-driven procedure to select the tuning parameters. Finally, we conduct extensive numerical studies to examine the finite sample performance of the proposed methods and find that the LPRE method has better performance than the LARE, least square, and least absolute deviation methods.

The rest of the article is organized as follows. Section 2 describes the detailed estimation procedures for model (1). Section 3 is dedicated to the asymptotic study of our estimators. The feasible algorithm for estimations of the parameters and nonparametric functions of PFMRM is proposed based on the LPRE criterion and presented in Section 4. Section 5 conducts simulation studies to evaluate the finite sample performance of the proposed methods. In Section 6, we apply the proposed method to the Tecator data. The article concludes with a discussion in Section 7. Proofs are provided in Appendix A.

2. Estimation Method

Let

(Z_{i}, X_{i} (\cdot), Y_{i}, ϵ_{i})

,

i \in {1, \dots, n}

be the independent realizations of

(Z, X (\cdot), Y, ϵ)

generated from model (1), that is,

Y_{i} = \exp (Z_{i}^{⊤} θ + \int_{0}^{1} β (t) X_{i} (t) d t) ϵ_{i}, i \in {1, \dots, n},

(2)

where random errors

ϵ_{i}

,

i \in {1, \dots, n}

are independent and identically distributed (i.i.d.) and independent of

Z_{i}

and

X_{i} (\cdot)

.

The covariance and empirical covariance functions of

X (\cdot)

can be defined as

C_{X} (t, s) = Cov (X (t), X (s)), {\hat{C}}_{X} (t, s) = \frac{1}{n} \sum_{i = 1}^{n} X_{i} (t) X_{i} (s) .

According to Mercer’s theorem, the spectral expansions of

C_{X}

and

{\hat{C}}_{X}

can be written as

C_{X} (t, s) = \sum_{j = 1}^{\infty} λ_{j} v_{j} (t) v_{j} (s), {\hat{C}}_{X} (t, s) = \sum_{j = 1}^{\infty} {\hat{λ}}_{j} {\hat{v}}_{j} (t) {\hat{v}}_{j} (s),

where

λ_{1} > λ_{2} > \dots > 0

and

{\hat{λ}}_{1} \geq {\hat{λ}}_{2} \geq \dots \geq {\hat{λ}}_{n + 1} = \dots = 0

are the ordered eigenvalue sequences of the linear operators with kernels

C_{X}

and

{\hat{C}}_{X}

, respectively, and

{v_{j} (\cdot)}_{j = 1}^{\infty}

and

{{\hat{v}}_{j} (\cdot)}_{j = 1}^{\infty}

are appropriate orthonormal eigenfunction sequences. With a slight abuse of notation, we use

C_{X}

to denote both the covariance operator and the covariance function of

X (\cdot)

. We assume that the covariance operator

C_{X}

defined by

C_{X} f (s) = \int_{0}^{1} C_{X} (t, s) f (t) d t

is strictly positive. In addition,

\{{\hat{v}}_{j} (\cdot), {\hat{λ}}_{j}\}

can be regarded as an estimator of

\{v_{j} (\cdot), λ_{j}\}

.

On the basis of the Karhunen–Loève decomposition,

β (t)

and

X_{i} (t)

can be expanded to

β (t) = \sum_{j = 1}^{\infty} γ_{j} v_{j} (t), X_{i} (t) = \sum_{j = 1}^{\infty} ξ_{i j} v_{j} (t), i \in {1, \dots, n},

(3)

where

γ_{j} = ⟨ β (\cdot), v_{j} (\cdot) ⟩ = \int_{0}^{1} β (t) v_{j} (t) d t

, and

ξ_{i j} = ⟨ X_{i} (\cdot), v_{j} (\cdot) ⟩

represents the coordinate of the ith curve with respect to the jth eigenbasis.

Analogously, we define

C_{Y X} (\cdot) = Cov (Y, X (\cdot))

,

C_{Z} = Var (Z)

,

C_{Z Y} = Cov (Z, Y)

, and

C_{Z X} (\cdot) = Cov (Z, X (\cdot)) = {(C_{Z_{1} X} (\cdot), \dots, C_{Z_{p} X} (\cdot))}^{⊤}

. Then, their corresponding empirical counterparts can be defined as

{\hat{C}}_{Y X} = \frac{1}{n} \sum_{i = 1}^{n} Y_{i} X_{i}, {\hat{C}}_{Z} = \frac{1}{n} \sum_{i = 1}^{n} Z_{i} Z_{i}^{⊤}, {\hat{C}}_{Z Y} = \frac{1}{n} \sum_{i = 1}^{n} Z_{i} Y_{i}, {\hat{C}}_{Z X} = \frac{1}{n} \sum_{i = 1}^{n} Z_{i} X_{i} .

Given the orthogonality of

{v_{1} (\cdot), \dots, v_{m} (\cdot)}

and (3), model (2) can be rewritten as

Y_{i} = \exp (Z_{i}^{⊤} α + \sum_{j = 1}^{m} γ_{j} ξ_{i j}) {\tilde{ϵ}}_{i} = \exp (Z_{i}^{⊤} α + U_{i}^{⊤} γ) {\tilde{ϵ}}_{i}, (i = 1, \dots, n)

where

U_{i} = {(ξ_{i 1}, \dots, ξ_{i m})}^{⊤}

,

γ = {(γ_{1}, \dots, γ_{m})}^{⊤}

,

{\tilde{ϵ}}_{i} = \exp (\sum_{j = m + 1}^{\infty} γ_{j} ξ_{i j}) ϵ_{i}

, and the truncation parameter

m \to \infty

as

n \to \infty

.

2.1. LARE Estimation

This is based on the LARE criterion of [1] and

υ_{j} (\cdot)

is replaced with its estimator

{\hat{υ}}_{j} (\cdot)

. The LARE estimates of model (1) can be obtained by minimizing the following loss functions:

(\hat{α}, \hat{γ}) ≜ \arg \min_{(α, γ)} \sum_{i = 1}^{n} \{| \frac{Y_{i} - \exp (Z_{i}^{⊤} α + {\hat{U}}_{i}^{⊤} γ)}{Y_{i}} | + | \frac{Y_{i} - \exp (Z_{i}^{⊤} α + {\hat{U}}_{i}^{⊤} γ)}{\exp (Z_{i}^{⊤} α + {\hat{U}}_{i}^{⊤} γ)} |\},

(4)

where

{\hat{U}}_{i} = {({\hat{ξ}}_{i 1}, \dots, {\hat{ξ}}_{i m})}^{⊤}

with

{\hat{ξ}}_{i j} = ⟨ X_{i} (\cdot), {\hat{υ}}_{j} (\cdot) ⟩

,

\hat{γ} = {({\hat{γ}}_{1}, \dots, {\hat{γ}}_{m})}^{⊤}

. Moreover, we can obtain the LARE estimator

\hat{β} (t) = \sum_{j = 1}^{m} {\hat{γ}}_{j} {\hat{υ}}_{j} (t)

.

2.2. LPRE Estimation

This is based on the LPRE criterion of [5] and

υ_{j} (\cdot)

is replaced with its estimator

{\hat{υ}}_{j} (\cdot)

. The LPRE estimates of model (1) can be obtained by minimizing the following loss functions:

(\tilde{α}, \tilde{γ}) ≜ \arg \min_{(α, γ)} \sum_{i = 1}^{n} \{Y_{i} \exp (- Z_{i}^{⊤} α - {\hat{U}}_{i}^{⊤} γ) + Y_{i}^{- 1} \exp (Z_{i}^{⊤} α + {\hat{U}}_{i}^{⊤} γ) - 2\},

(5)

where

\tilde{γ} = {({\tilde{γ}}_{1}, \dots, {\tilde{γ}}_{m})}^{⊤}

. Moreover, we can obtain the LPRE estimator

\tilde{β} (t) = \sum_{j = 1}^{m} {\tilde{γ}}_{j} {\hat{υ}}_{j} (t)

.

3. Asymptotic Properties

In this section, we establish the asymptotic properties of the estimators. Formulating the results requires the following technical assumptions. Firstly, we present some notations. Suppose that

α_{0}

and

β_{0} (t)

are the true values of

α

and

β (t)

, respectively, and let

γ_{0} = {(γ_{01}, \dots, γ_{0 m})}^{⊤}

be the true score coefficient vector. The notation

∥ \cdot ∥

denotes the norm

L^{2}

for a function or the Euclidean norm for a vector. In what follows, c denotes a generic positive constant that may take various values. Moreover,

a_{n} \sim b_{n}

implies that

a_{n} / b_{n}

is bounded away from zero and infinity as

n \to \infty

.

C1.: The random process $X (\cdot)$ and the score $ξ_{i}$ satisfy the following conditions:

${E | | X (\cdot) | |}^{4} \leq \infty, E {(ξ_{i})}^{4} \leq c λ_{i}^{2}, i \geq 1 .$
C2.: For the eigenvalues of the linear operator $C_{X}$ and the score coefficients, the following conditions hold:
$(a)$ There exist some constants c and $a > 1$ such that $c^{- 1} i^{- a} \leq λ_{i} \leq c i^{- a}, λ_{i} - λ_{i + 1} \geq c i^{- a - 1}, i \geq 1$ ;
$(b)$ There exist some constants c and $b > a / 2 + 1$ such that $| γ_{j} | \leq c j^{- b}, j \geq 1$ .
C3.: The tuning parameter $m \sim n^{1 / (a + 2 b)}$ .
C4.: For the random vector $Z$ , ${E ∥ Z ∥}^{4} < \infty .$
C5.: There exists some constant c such that $∣ ⟨ C_{Z_{l} X}, v_{j} ⟩ ∣ \leq c j^{- (a + b)}, l = 1, \dots, p, j \geq 1$ .
C6.: Let $η_{i k} = Z_{i k} - ⟨ g_{k}, X_{i} ⟩$ with $g_{k} = \sum_{j = 1}^{\infty} λ_{j}^{- 1} ⟨ C_{Z_{k} X}, v_{j} ⟩ v_{j}$ , for each k, then $η_{1 k}, \dots, η_{n k}$ are independent and identically distributed random variables. Assume that

$E [η_{1 k} | X_{1}, \dots, X_{n}] = 0, E [η_{1 k}^{2} | X_{1}, \dots, X_{n}] = Σ_{k k},$

where $Σ_{k k}$ is the kth diagonal element of $Σ = E (η_{i} {η_{i}}^{⊤})$ with $η_{i} = {(η_{i 1}, \dots, η_{i p})}^{⊤}$ , and $Σ$ is a positive definite matrix.
C7.: The error $ϵ$ has a continuous density $f (\cdot)$ in a neighborhood of 1, and is independent of $(Z, X (\cdot))$ .
C8.: $P (ϵ > 0) = 1$ , $E (ϵ + ϵ^{- 1}) < \infty$ , $E [(ϵ + ϵ^{- 1}) sgn (ϵ - 1)] = 0$ and $E [{(ϵ^{- 1} + ϵ)}^{2}] < \infty$ .
C9.: $E [(ϵ - ϵ^{- 1})] = 0$ , $E [{(ϵ^{- 1} + ϵ)}^{2}] < \infty$ .

Remark 1.

C1–C3 are standard assumptions used in classical functional linear regression (see, e.g., [11,12]). More specifically, C1 is needed for the consistency of

{\hat{C}}_{X} (t, s)

. C2(a) is required to identify the slope function

β (t)

by preventing the spacing between the eigenvalues from being too small, while C2 (b) is used to make the slope function

β (t)

sufficiently smooth. C3 is required to obtain the convergence rate of the slope function

β (t)

. C4–C6 are used to handle the linear part of the vector-type covariate in the model, which are similar to [3,13]. C4 is a little stronger than those in classical linear models and is primarily used to ensure the asymptotic behavior of

{\hat{C}}_{Z X}

and

{\hat{C}}_{Z}

. C5 makes the effect of truncation on the estimation of

β (\cdot)

small enough. Notably,

η_{i k}

is the regression error of

Z_{i k}

on

X_{i} (\cdot)

, and the conditions on

η_{s}

in C6 essentially restrict that

Z

can only be linearly related to

X (\cdot)

. C6 is also used to establish the asymptotic normality of the parameter estimator, in a similar manner to that applied in [3,13] for modeling the dependence between parametric and nonparametric components. C7–C8 are standard assumptions on random errors of the LARE estimator used in [1]. C9 is the standard assumption for random errors in the LPRE estimator used in [5].

The following two theorems present the convergence rate of the slope function estimator

\hat{β} (\cdot)

and establish the asymptotic normality of the parameter estimator

\hat{α}

, respectively, with the LARE method introduced in Section 2.1 above.

Theorem 1.

If conditions C1–C8 hold, then

∥ \hat{β} (\cdot) - β_{0} {(\cdot) ∥}^{2} = O_{p} (n^{- \frac{2 b - 1}{(a + 2 b)}}) .

Theorem 2.

If conditions C1–C8 hold, as

n \to \infty

, we have

\sqrt{n} (\hat{α} - α_{0}) \overset{D}{⟶} N (0, \frac{1}{4} {\{J + 2 f (1)\}}^{- 2} Σ^{- 1} A),

where

\overset{D}{\to}

represents convergence in distribution,

A = E {(ϵ + ϵ^{- 1})}^{2}, J = E \{ϵ sgn (ϵ - 1)\}

.

The following two theorems give the rate of convergence of the slope function and the asymptotic normality of the parameter vector, respectively, with the LPRE method introduced in Section 2.1 above.

Theorem 3.

Suppose conditions C1–C6 and C9 hold; then,

∥ \tilde{β} (\cdot) - β_{0} {(\cdot) ∥}^{2} = O_{p} (n^{- \frac{2 b - 1}{(a + 2 b)}}) .

Theorem 4.

Suppose conditions C1–C6 and C9 hold; as

n \to \infty

, we have

\sqrt{n} (\tilde{α} - α_{0}) \overset{D}{⟶} N (0, Σ_{1}^{- 1} Σ^{- 1} Σ_{2}),

where

Σ_{1} = E \{{(ϵ + ϵ^{- 1})}^{2}\}, Σ_{2} = E \{{(ϵ - ϵ^{- 1})}^{2}\}

.

Remark 2.

The convergence rate of the slope function

β (t)

obtained in Theorems 1 and 3 is the same as that of [12,13], which is optimal in the minimax sense. The variance in Theorems 2 and 4 involves the random error density function, which is the standard feature of multiplicative regression models. One can consult Theorem 3.2 of [14] for more details.

4. Implementation

Considering that the minimization problems of the LARE method are the special cases of the LPRE procedure, we only provide a detailed implementation of the LPRE approach. Specially, we use the Newton–Raphson iterative algorithm to solve the LPRE problem in Equation (5). Let

ζ_{i} = {(Z_{i}^{⊤}, {\hat{U}}_{i}^{⊤})}^{⊤}, θ = {(α^{⊤}, γ^{⊤})}^{⊤}

; then,

Q (θ) = \sum_{i = 1}^{n} \{| \frac{Y_{i} - \exp (ζ_{i}^{⊤} θ)}{Y_{i}} | + | \frac{Y_{i} - \exp (ζ_{i}^{⊤} θ)}{\exp (ζ_{i}^{⊤} θ)} |\} .

Then, the computation can be implemented as follows:

Step 1 Initialization step. In this paper, the least squares estimator

{\tilde{θ}}^{(0)}

is chosen as the initial estimator.

Step 2 Update the estimator

{\tilde{θ}}^{(k)}

of

θ

by using the following iterative procedure:

{\tilde{θ}}^{(k + 1)} = {\tilde{θ}}^{(k)} - {[\nabla^{2} Q ({\tilde{θ}}^{(k)})]}^{- 1} \times [\nabla Q ({\tilde{θ}}^{(k)})],

where

\nabla Q ({\tilde{θ}}^{(k)}) = \frac{\partial Q (θ)}{\partial θ} |_{θ = {\tilde{θ}}^{(k)}}

and

\nabla^{2} Q ({\hat{θ}}^{(k)}) = \frac{\partial^{2} Q (θ)}{\partial θ \partial θ^{⊤}} |_{θ = {\tilde{θ}}^{(k)}}

represent the gradient and Hessian matrix of

Q (θ)

at

{\tilde{θ}}^{(k)}

, respectively.

Step 3 Step 2 is repeated until convergence. We use the

L^{2}

norm of the difference between two consecutive estimates less than

10^{- 6}

as the convergence criterion. Note that [8] proposed a profiled LPRE method in partial linear multiplicative models. The algorithm in [8] requires that

E (\ln ε) = 0

and

E (ε - ε^{- 1}) = 0

hold simultaneously. Since the LPRE objective functions (5) are infinitely differentiable and strict, the prposed Newton–Raphson method can relax the restriction

E (\ln ε) = 0

. Moreover, the minimizer of the objective function (5) is just the root of its first derivative. We will express the final LPRE estimator of

θ

as

\tilde{θ}

. Furthermore, the LPRE estimator of the slope function is indicated by

\tilde{β} (t) = \sum_{j = 1}^{m} {\tilde{γ}}_{j} {\hat{υ}}_{j} (t)

.

5. Simulation Studies

In this section, the finite sample properties of the proposed estimation methods are investigated through Monte Carlo simulation studies. We compare the performance of the two proposed methods with the least absolute deviations (LAD) method in [15] and the least squares (LS) method in [3], where both the LS and LAD estimates are based on the logarithmic transformation on the two sides of the following model (6). The sample size n is set as 150, 300, and 600. And the datasets are generated from the following model:

Y = \exp (Z_{1} α_{1} + Z_{2} α_{2} + \int_{0}^{1} X (t) β (t) d t) ϵ,

(6)

where

Z_{1}

follows the standard normal distribution,

Z_{2}

follows the Bernoulli distribution with a probability of 0.5, and

α = {(α_{1}, α_{2})}^{⊤} = {(2, 1)}^{⊤}

. For the functional linear component, we use a similar setting to that used by [13] to set

β (t) = \sqrt{2} \sin (π t / 2) + 3 \sqrt{2} \sin (3 π t / 2)

and

X (t) = \sum_{j = 1}^{50} ξ_{j} υ_{j} (t)

, where

υ_{j} (t) = \sqrt{2} \sin ((j - 0.5) π t)

, and

ξ_{j}

s are independently distributed according to the normal distribution with mean 0 and variance

λ_{j} = 10 {((j - 0.5) π)}^{- 2}

for

j \geq 1

. Similar to [1], the random error

ϵ

is considered from the following three distributions: (i)

\log (ϵ) \sim 0.1 N (0, 1)

, (ii)

\log (ϵ) \sim U (- 2, 2)

, and (iii)

ϵ \sim U (0.5, a)

; the choice of a satisfies

E (ϵ) = E (ϵ^{- 1})

.

Implementing the proposed estimation method requires the tuning parameter m. Here, m is selected as the minimum value that reaches a certain proportion (denoted by

δ

) of the cumulative percentage of total variance (CPV) by the first m leading components as follows:

m = \arg \min_{K} \{\sum_{k = 1}^{K} {\hat{λ}}_{k} / \sum_{k = 1}^{M} {\hat{λ}}_{k} ⩾ δ\},

where M is the largest number of functional principle components, such that

{\hat{λ}}_{k} > 0

, and

δ = 90 %

is used in this study.

Based on 500 replications, Table 1 summarizes the performance of different estimators in terms of bias (Bias) and standard deviation (Sd) of the estimated

α_{1}

and

α_{2}

, as well as the mean squared error (MSE) of the estimated

α

. Table 2 provides the root average square errors (RASEs) of the estimated

β (t)

for LARE estimation, where the RASE is defined as follows:

RASE = \sqrt{\{\frac{1}{d} \sum_{k = 1}^{d} {(\hat{β} (t_{k}) - β (t_{k}))}^{2}\}},

where

t_{k}, k = 1, \dots, d

are equally spaced grids to calculate the value of function

β (t)

, and we take

d = 200

in this simulation. We compute the RASE for each replicate observation and obtain the average. In addition, the definitions of RASE for the LPRE, LAD, and LS methods are similar, we just replaced

\hat{β} (t)

with the corresponding estimators.

Table 1. The biases and standard deviations of the estimators for

α_{1}

and

α_{2}

, and the mean squared error of the estimators for

α

under different distributions (results

\times 100

).

Table 2. The RASEs of the estimators for

β (t)

under different error distributions (results

\times 100

).

From Table 1 and Table 2, we have the following observations: (a) Sd, MSE, and RASE decrease and the estimation performance improves as sample size n increases from 150 to 600. The estimates of the parametric covariate effects are basically unbiased and close to their true values, indicating that our proposed approaches produce consistent estimators. (b) When

\log (ϵ)

follows the normal random error, as expected, both LS and LPRE perform the best, and LAD performs the worst. (c) When

\log (ϵ)

follows

U (- 2, 2)

, LPRE performs the best, LARE also performs well, and LAD still performs the worst. (d)

\log (ϵ)

follows

U (0.5, a)

. Note that the random error violates condition C8

E [(ϵ + ϵ^{- 1}) sgn (ϵ - 1)] = 0

for the LARE method, which implies that the random error of zero mean in the least squares or of zero median in the LAD regression does not hold. Meanwhile, the LPRE method works well in the case of

E (ε - ε^{- 1}) = 0

. LPRE performs considerably better than LARE and LAD; this indicates that LPRE is much more robust than the LARE and LAD methods. In summary, LPRE performs the best in almost all the scenarios considered, confirming its superiority to LARE and other competing methods.

6. Application to Tecator Data

In this section, we introduce the application of the proposed estimation methods to Tecator data. The dataset is contained in the R package fda.usc in [16], and includes 215 independent food samples with fat, protein, and water of meat measured in percent. It has been widely used in the analysis of functional data. The Tecator data consist of a 100-channel spectrum of absorbances working in the wavelength from 850 to 1050 nanometers (nm). Further details on the data can be found in [2,9]. The purpose is to tease out the relation among the quantity of fatty Y (response), protein content

Z_{1}

, and water content

Z_{2}

(real random variables), and the spectrometric curve

X (t)

(a functional predictor). To predict the fat content of a meat sample, we consider the following PFMRM:

Y = \exp (Z_{1} α_{1} + Z_{2} α_{2} + \int_{850}^{1050} X (t) β (t) d t) ϵ .

(7)

To assess the predictive capability of the proposed methods, we followed [13] to randomly divide the sample into two subsamples:

I_{1} = \{(X_{i}, Y_{i}, Z_{i}, U_{i}), | I_{1} | = 165\}

as the training sample, where

| I_{1} |

represents the base of

I_{1}

, and

I_{2} = \{(X_{i}, Y_{i}, Z_{i}, U_{i}), | I_{2} | = 50\}

as the testing sample. The training and testing samples were used to estimate parameters and check the accuracy of the prediction, respectively. We used the mean quadratic prediction error (MQEP) as a criterion to evaluate the performance of various estimation procedures. The MQEP is defined as follows:

MQEP = \frac{1}{N} \sum_{j \in I_{2}} {(Y_{j} - {\hat{Y}}_{j})}^{2} / {Var}_{I_{2}} (Y_{j}),

where

{\hat{Y}}_{i}

is predicted based on the training sample, and

{Var}_{I_{2}}

is a response variable from test sample variance.

In addition, we compare the performances of the proposed model with the partial functional regression model in Shin [3], and log transformation on two sides of model (7) (denoted as “LogPFLM”). Specifically,

\begin{matrix} PFLM : Y = Z_{1} α_{1} + Z_{2} α_{2} + \int_{850}^{1050} X (t) β (t) d t + e \\ logPFLM : \ln Y = Z_{1} α_{1} + Z_{2} α_{2} + \int_{850}^{1050} X (t) β (t) d t + ε \end{matrix}

The CPV criterion introduced in Section 4 was used to determine the cutoff parameter m. Here,

m = 3

was selected to explain approximately 95% of the variance in the Tecator data. Table 3 shows the average MQEP of N times repeated operations. The first and second rows of Table 3 show the prediction results of the logPFLM using LS and LAD methods, respectively. The third and fourth rows give the prediction results of (7) under the LARE and LPRE methods, respectively. The final row presents the prediction results of the PFLM without logarithmic transformation by the LS method. Overall, the LPRE outperforms all other competing methods regardless of the number of random splits. LS performs the second best, whereas LAD performs the worst. In addition, we employed the above models and methods to Tecator data just considering the scalar predictors or functional predictors; and the results indicated relatively poor performance, so we have not reported them.

Table 3. MQEP of different random partitions.

Then, we used the best-performing LPRE method to estimate the unknown parameters based on the entire dataset. The estimates of

α_{1}

and

α_{2}

are

0.102

and

0.010

, respectively. Both protein and water are positively associated with the logarithmic transformation of fatty. Figure 1 depicts the estimated

β (t)

. In general, the spectrometric curve has a positive effect on the logarithmic transformation of fatty, and the estimated curve

β (t)

is small due to the large integral domain. The advantages of the LPRE method are particularly evident in the analysis of this dataset.

Figure 1. The estimated functional weight

β (t)

in model (7) with LPRE method.

7. Conclusions

In this paper, we study the estimated problems of PFMRM based on the LARE and LPRE criteria, and the unknown slope function and functional predictor are approximated by the functional principal component analysis technique. Under some regularity conditions, we obtain the convergence rates of the slope function and the asymptotic normality of the parameter vector for two estimation methods. Both the numerical simulation results and the real data analysis show that the LPRE method is superior to the LARE, least square, and least absolute deviation methods. Several issues still warrant further study. First, we choose the Karhunen–Loève expansion to approximate the slope function in this article. Other nonparametric smoothing techniques, such as B-spline, kernel estimation, and penalty regression spline, can be used in our proposed LARE and LPRE estimation methods, and the large sample characteristics and limited sample comparison are worth studying. Furthermore, the proposed methods can also be extended to more general situations, including, but not limited to, dependent functional data, partially observed functional data, and multivariate functional data. Substantial efforts must be devoted to related advances in the future.

Author Contributions

Conceptualization, X.L. and P.Y.; methodology and proof, X.L. and P.Y.; numerical study, X.L. and J.S.; writing—original draft preparation, P.Y. and J.S.; writing—review and editing, X.L., P.Y. and J.S. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (12401356), the Natural Science Foundation of Shanxi Province (20210302124262, 20210302124530, 202203021222223), the National Statistical Science Research Project of China (2022LY089), and the Natural Science Foundation of Shanxi normal University (JYCJ2022004).

Data Availability Statement

Researchers can download the Tecator dataset from the R package “fda.usc”.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

PFMRM	partial functional multiplicative regression model
PFLM	partial functional regression model
LARE	least absolute relative error
LPRE	least product relative error
LAD	least absolute deviations
LS	least square
RASE	root average square error
MQEP	mean quadratic error of prediction
MSE	mean squared error
CPV	Cumulative Percentage of Variance

Appendix A

In this appendix, we provide the technical proofs for the results presented in Section 3.

Proof of Theorem 1.

Let

δ_{n} = n^{- \frac{2 b - 1}{2 (a + 2 b)}}

,

S_{n} = δ_{n}^{- 1} (\hat{α} - α_{0})

,

V_{n} = δ_{n}^{- 1} (\hat{γ} - γ_{0})

,

r_{i} = \int_{0}^{1} β_{0} (t) X_{i} (t) d t - {\hat{U}}_{i}^{⊤} γ_{0}

, and

F_{n} = \{(S_{n}, V_{n}) : ∥ {(S_{n}^{⊤}, V_{n}^{⊤})}^{⊤} ∥ = L\}

, where L is a large enough constant. Next, we show that, for any given

η > 0

, there exists a sufficiently large constant

L = L_{η}

, such that

P \{\inf_{(S_{n}, V_{n}) \in F_{n}} Q_{n} (α_{0} + δ_{n} S_{n}, γ_{0} + δ_{n} V_{n}) > Q_{n} (α_{0}, γ_{0})\} \geq 1 - η .

(A1)

This implies that with the probability of at least

1 - η

there exists a local minimizer

\hat{α}

and

\hat{γ}

in the ball

(S_{n}, V_{n}) : ∥ {(S_{n}^{⊤}, V_{n}^{⊤})}^{⊤} ∥ = L},

such that

∥ \hat{α} - α_{0} ∥ = O_{p} (δ_{n})

,

∥ \hat{γ} - γ_{0} ∥ = O_{p} (δ_{n})

.

By using

{∥ {\hat{υ}}_{j} - υ_{j} ∥}^{2} = O_{p} (n^{- 1} j^{2})

(see, e.g., [3,13]), we have

\begin{matrix} {∣ r_{i} ∣}^{2} = & {∣ \int_{0}^{1} β_{0} (t) X_{i} (t) d t - {\hat{U}}_{i}^{⊤} γ_{0} ∣}^{2} \\ \leq & 2 {∣ \sum_{j = 1}^{m} ⟨ X_{i}, {\hat{υ}}_{j} - υ_{j} ⟩ γ_{0 j} ∣}^{2} + 2 {∣ \sum_{j = m + 1}^{\infty} ⟨ X_{i}, υ_{j} ⟩ γ_{0 j} ∣}^{2} \\ ≜ & 2 A_{1} + 2 A_{2} . \end{matrix}

For

A_{1}

, by conditions C1, C2, and the Hölder inequality, we can obtain

\begin{matrix} A_{1} = & {∣ \sum_{j = 1}^{m} ⟨ X_{i}, {\hat{υ}}_{j} - υ_{j} ⟩ γ_{0 j} ∣}^{2} \leq c m \sum_{j = 1}^{m} {‖ υ_{j} - {\hat{υ}}_{j} ‖}^{2} {∣ γ_{0 j} ∣}^{2} \\ \leq & c m \sum_{j = 1}^{m} O_{p} (n^{- 1} j^{2 - 2 b}) = O_{p} (n^{- \frac{1 + 4 b - 4}{a + 2 b}}) = o_{p} (δ_{n}^{2}) . \end{matrix}

For

A_{2}

, given that

E \{\sum_{j = m + 1}^{\infty} ⟨ X_{i}, υ_{j} ⟩ γ_{0 j}\} = 0,

Var \{\sum_{j = m + 1}^{\infty} ⟨ X_{i}, υ_{j} ⟩ γ_{0 j}\} = \sum_{j = m + 1}^{\infty} λ_{j} γ_{0 j}^{2} \leq c \sum_{j = m + 1}^{\infty} j^{- (a + 2 b)} = O (n^{- \frac{a + 2 b - 1}{a + 2 b}}) .

We have

A_{2} = O_{p} (n^{- \frac{a + 2 b - 1}{a + 2 b}}) = o_{p} (δ_{n}^{2}) .

Taking these together, we obtain

{∣ r_{i} ∣}^{2} = O_{p} (n^{- \frac{a + 2 b - 1}{a + 2 b}}) = o_{p} (δ_{n}^{2}) .

(A2)

Furthermore, a simple calculation yields

\begin{matrix} ψ_{n} (S_{n}, V_{n}) ≜ & (Q_{n} (α_{0} + δ_{n} S_{n}, γ_{0} + δ_{n} V_{n}) - Q_{n} (α_{0}, γ_{0})) \\ = & \sum_{i = 1}^{n} {| 1 - Y_{i}^{- 1} \exp (Z_{i}^{⊤} (α_{0} + δ_{n} S_{n}) + {\hat{U}}_{i}^{⊤} (γ_{0} + δ_{n} V_{n})) | - | 1 - Y_{i}^{- 1} \exp (Z_{i}^{⊤} \\ α_{0} + {\hat{U}}_{i}^{⊤} γ_{0}) |} + \sum_{i = 1}^{n} \{| 1 - Y_{i} \exp (Z_{i}^{⊤} (α_{0} + δ_{n} S_{n}) + {\hat{U}}_{i}^{⊤} (γ_{0} + δ_{n} V_{n})) | \\ - | 1 - Y_{i} \exp (Z_{i}^{⊤} α_{0} + {\hat{U}}_{i}^{⊤} γ_{0}) |} \\ ≜ & I_{1} + I_{2} . \end{matrix}

For

I_{1}

, by the Knight identity (see, e.g., [17]), we have

| x - y | - | x | = - y [I (x > 0) - I (x < 0)] + 2 \int_{0}^{y} [I (x \leq s) - I (x \leq 0)] d s . x \neq 0

By routine calculation, we have

\begin{matrix} I_{1} = & - \sum_{i = 1}^{n} ω_{1 i} \{I (1 - Y_{i}^{- 1} \exp (Z_{i}^{⊤} α_{0} + {\hat{U}}_{i}^{⊤} γ_{0}) > 0) - I (1 - Y_{i}^{- 1} \exp (Z_{i}^{⊤} α_{0} + {\hat{U}}_{i}^{⊤} γ_{0}) < 0)\} \\ + 2 \sum_{i = 1}^{n} \int_{0}^{ω_{1 i}} \{I (1 - Y_{i}^{- 1} \exp (Z_{i}^{⊤} α_{0} + {\hat{U}}_{i}^{⊤} γ_{0}) \leq s) - I (1 - Y_{i}^{- 1} \exp (Z_{i}^{⊤} α_{0} + {\hat{U}}_{i}^{⊤} γ_{0}) \leq 0)\} d s \\ ≜ & I_{11} + I_{12}, \end{matrix}

where

ω_{1 i} = Y_{i}^{- 1} \{\exp (Z_{i}^{⊤} (α_{0} + δ_{n} S_{n}) + {\hat{U}}_{i}^{⊤} (γ_{0} + δ_{n} V_{n})) - \exp (Z_{i}^{⊤} α_{0} + {\hat{U}}_{i}^{⊤} γ_{0})}

.

Using the Taylor series expansion, we have

\begin{matrix} I_{11} = & - δ_{n} \sum_{i = 1}^{n} {S_{n} Z_{i}^{⊤} Y_{i}^{- 1} \exp (Z_{i}^{⊤} α_{0}) + V_{n} {\hat{U}}_{i}^{⊤} Y_{i}^{- 1} \exp ({\hat{U}}_{i}^{⊤} γ_{0})} sgn (1 - Y_{i}^{- 1} \exp (Z_{i}^{⊤} α_{0} \\ + {\hat{U}}_{i}^{⊤} γ_{0})) - \frac{1}{2} δ_{n}^{2} \sum_{i = 1}^{n} Y_{i}^{- 1} \{S_{n}^{⊤} Z_{i} Z_{i}^{⊤} S_{n} \exp (ξ_{i}^{[1]}) + V_{n}^{⊤} {\hat{U}}_{i} {\hat{U}}_{i}^{⊤} V_{n} \exp (ξ_{i}^{[2]})\} \\ sgn (1 - Y_{i}^{- 1} \exp (Z_{i}^{⊤} α_{0} + {\hat{U}}_{i}^{⊤} γ_{0})) \\ ≜ & I_{111} + I_{112}, \end{matrix}

where

ξ_{i}^{[1]}

is between

Z_{i}^{⊤} (α_{0} + δ_{n} S_{n})

and

Z_{i}^{⊤} α_{0}

, and

ξ_{i}^{[2]}

is between

{\hat{U}}_{i}^{⊤} (γ_{0} + δ_{n} V_{n})

and

{\hat{U}}_{i}^{⊤} γ_{0}

.

For

I_{111}

, we have

\begin{matrix} I_{111} = & - \sum_{i = 1}^{n} δ_{n} \frac{1}{ϵ_{i}} \{S_{n} Z_{i}^{⊤} + V_{n} {\hat{U}}_{i}^{⊤}\} sgn (1 - Y_{i}^{- 1} \exp (Z_{i}^{⊤} α_{0} + {\hat{U}}_{i}^{⊤} γ_{0})) \\ + \sum_{i = 1}^{n} \frac{1}{ϵ_{i}} δ_{n}^{2} \{S_{n}^{⊤} Z_{i} Z_{i}^{⊤} S_{n} + V_{n}^{⊤} {\hat{U}}_{i} {\hat{U}}^{⊤} V_{n}\} sgn (1 - Y_{i}^{- 1} \exp (Z_{i}^{⊤} α_{0} + {\hat{U}}_{i}^{⊤} γ_{0})) + o_{p} (1) \\ ≜ & I_{1111} + I_{1112} + o_{p} (1) . \end{matrix}

It follows that

I_{1111} = o_{p} (1) ∥ S_{n} ∥^{2} + o_{p} (1) {∥ V_{n} ∥}^{2}

,

I_{1112} = O_{p} (δ_{n}^{2}) ∥ S_{n} ∥^{2} + O_{p} (δ_{n}^{2}) {∥ V_{n} ∥}^{2}

, and

I_{112} = O_{p} (δ_{n}^{2}) ∥ S_{n} ∥^{2} + O_{p} (δ_{n}^{2}) {∥ V_{n} ∥}^{2}

. Therefore,

I_{11} = O_{p} (δ_{n}^{2}) ∥ S_{n} ∥^{2} + O_{p} (δ_{n}^{2}) {∥ V_{n} ∥}^{2} .

For

I_{12}

, define

c_{1 i} = \exp (δ_{n} S_{n} + δ_{n} V_{n} - r_{i}), c_{2 i} = \exp ({\hat{U}}_{i}^{⊤} γ_{0} - U_{i}^{⊤} γ_{0} - r_{i})

, and

τ = s ϵ_{i}

; then,

\begin{matrix} I_{12} = & 2 \sum_{i = 1}^{n} \int_{0}^{ω_{1 i}} [I (1 - ϵ_{i}^{- 1} c_{2 i} \leq s) - I (1 - ϵ_{i}^{- 1} c_{2 i} \leq 0)] d s \\ = & 2 \sum_{i = 1}^{n} \int_{0}^{c_{1 i} - c_{2 i}} ϵ_{i}^{- 1} \{I (ϵ_{i} \leq c_{2 i} + τ) - I (ϵ_{i} \leq c_{2 i})\} d τ \\ = & 2 \sum_{i = 1}^{n} \int_{0}^{c_{1 i} - c_{2 i}} E_{ϵ | X} \{ϵ_{i}^{- 1} [I (ϵ_{i} \leq c_{2 i} + τ) - I (ϵ_{i} \leq c_{2 i})]\} d τ + o_{p} (1) \\ = & 2 \sum_{i = 1}^{n} \int_{0}^{c_{1 i} - c_{2 i}} E_{ϵ | X} \{I (ϵ_{i} \leq c_{2 i} + τ) - I (ϵ_{i} \leq c_{2 i})\} d τ \\ + 2 \sum_{i = 1}^{n} \int_{0}^{c_{1 i} - c_{2 i}} E_{ϵ | X} \{(ϵ_{i}^{- 1} - 1) [I (ϵ_{i} \leq c_{2 i} + τ) - I (ϵ_{i} \leq c_{2 i})]\} d τ + o_{p} (1) \\ = & δ_{n}^{2} \sum_{i = 1}^{n} \{S_{n}^{⊤} Z_{i} Z_{i}^{⊤} S_{n} + V_{n}^{T} {\hat{U}}_{i} {\hat{U}}_{i}^{⊤} V_{n}\} \{1 + o_{p} (1)\} \\ = & O_{p} (δ_{n}^{2}) ∥ S_{n} ∥^{2} + O_{p} (δ_{n}^{2}) {∥ V_{n} ∥}^{2} + o_{p} (1) . \end{matrix}

Therefore,

I_{1} = O_{p} (δ_{n}^{2}) ∥ S_{n} ∥^{2} + O_{p} (δ_{n}^{2}) {∥ V_{n} ∥}^{2} .

Similarly, we can prove that

I_{2} = O_{p} (δ_{n}^{2}) ∥ S_{n} ∥^{2} + O_{p} (δ_{n}^{2}) {∥ V_{n} ∥}^{2} .

Then, Equation (A1) holds, and there exists a local minimizer

\hat{γ}

, such that

∥ \hat{γ} - γ_{0} ∥ = O_{p} (δ_{n})

. Note that

\begin{matrix} {∥ \hat{β} (t) - β_{0} (t) ∥}^{2} = & {∥ \sum_{j = 1}^{m} {\hat{γ}}_{j} {\hat{υ}}_{j} (t) - \sum_{j = 1}^{\infty} γ_{0 j} υ_{j} (t) ∥}^{2} \\ \leq & 2 ∥ \sum_{j = 1}^{m} {\hat{γ}}_{j} {\hat{υ}}_{j} (t) - \sum_{j = 1}^{\infty} γ_{0 j} υ_{j} {(t) ∥}^{2} + 2 {∥ \sum_{j = m + 1}^{\infty} γ_{0 j} υ_{j} (t) ∥}^{2} \\ \leq & 4 ∥ \sum_{j = 1}^{m} ({\hat{γ}}_{j} - γ_{0 j}) {\hat{υ}}_{j} (t) ∥^{2} + 4 {∥ \sum_{j = 1}^{m} γ_{0 j} ({\hat{υ}}_{j} (t) - υ_{j} (t)) ∥}^{2} + 2 \sum_{j = m + 1}^{\infty} γ_{0 j}^{2} \\ ≜ & 4 D_{1} + 4 D_{2} + 2 D_{3} . \end{matrix}

According to Equation (A2), condition C2, the orthogonality of

{\hat{υ}}_{j}

, and

∥ υ_{j} (t) - {\hat{υ}}_{j} {(t) ∥}^{2} = O_{p} (n^{- 1} j^{2})

, we have

D_{1} = ∥ \sum_{j = 1}^{m} ({\hat{γ}}_{j} - γ_{0 j}) \hat{υ} (t) ∥^{2} \leq \sum_{j = 1}^{m} {| {\hat{γ}}_{j} - γ_{0 j} |}^{2} = {∥ \hat{γ} - {\hat{γ}}_{0} ∥}^{2} = O_{p} (δ_{n}^{2}) .

(A3)

\begin{matrix} D_{2} = & ∥ \sum_{j = 1}^{m} γ_{0 j} ({\hat{υ}}_{j} (t) - υ_{j} (t)) ∥^{2} \leq m \sum_{j = 1}^{m} {∥ {\hat{υ}}_{j} (t) - υ_{j} (t) ∥}^{2} γ_{0 j}^{2} \leq \frac{m}{n} O_{p} (\sum_{j = 1}^{m} j^{2} γ_{0 j}^{2}) \\ = & O_{p} (n^{- 1} m \sum_{j = 1}^{m} j^{2 - 2 b}) = O_{p} (n^{- 1} m) = o_{p} (n^{- \frac{2 b - 1}{a + 2 b}}) = o_{p} (δ_{n}^{2}) . \end{matrix}

(A4)

D_{3} = \sum_{j = m + 1}^{\infty} γ_{0 j}^{2} \leq C \sum_{j = m + 1}^{\infty} j^{- 2 b} = O (n^{- \frac{2 b - 1}{a + 2 b}}) = O (δ_{n}^{2}) .

(A5)

Then, Theorem 1 follows immediately from (A3)–(A5). □

Proof of Theorem 2.

Firstly, let

\begin{matrix} ψ_{n} (α, γ) ≜ & \sum_{i = 1}^{n} {| 1 - ϵ_{i}^{- 1} \exp (Z_{i}^{⊤} (α - α_{0}) + {\hat{U}}_{i}^{⊤} (γ - γ_{0}) - r_{i}) | \\ + | 1 - ϵ_{i} \exp (- Z_{i}^{⊤} (α - α_{0}) - {\hat{U}}_{i}^{⊤} (γ - γ_{0}) + r_{i}) |} . \end{matrix}

According to the convexity lemma of [18] and lemma A.1 in [1], for any compact sets

Θ

and ℜ, as

n \to \infty

, we have

\sup_{\begin{matrix} α \in Θ \\ γ \in ℜ \end{matrix}} \frac{1}{n} | ψ_{n} (α, γ) - E ψ_{n} (α, γ) | \overset{p}{⟶} 0 .

With a simple calculation, we have

\begin{matrix} E \{ψ_{n} (α, γ) - ψ_{n} (α_{0}, γ_{0})\} \\ = & \sum_{i = 1}^{n} E {| 1 - ϵ_{i}^{- 1} \exp (Z_{i}^{⊤} (α - α_{0}) + {\hat{U}}_{i}^{⊤} (γ - γ_{0}) - r_{i}) | + | 1 - ϵ_{i} \exp (- Z_{i}^{⊤} (α - α_{0}) \\ - {\hat{U}}_{i}^{⊤} (γ - γ_{0}) + r_{i}) | - | 1 - ϵ_{i}^{- 1} | - | 1 - ϵ_{i} |} + o_{p} (1) \\ = & \sum_{i = 1}^{n} E \{(ϵ_{i} + ϵ_{i}^{- 1}) sgn (1 - ϵ_{i}) [\exp (Z_{i}^{⊤} (α - α_{0}) + {\hat{U}}_{i}^{⊤} (γ - γ_{0}) - r_{i}) - 1]\} \\ + \sum_{i = 1}^{n} E {ϵ_{i} sgn (ϵ_{i} - 1) [\exp (Z_{i}^{⊤} (α - α_{0}) + {\hat{U}}_{i}^{⊤} (γ - γ_{0}) - r_{i}) + \exp (- Z_{i}^{⊤} (α - α_{0}) \\ - {\hat{U}}_{i}^{⊤} (γ - γ_{0}) + r_{i}) - 2]} + 2 \sum_{i = 1}^{n} E {[I (ϵ_{i} \leq \exp (Z_{i}^{⊤} (α - α_{0}) + {\hat{U}}_{i}^{⊤} (γ - γ_{0}) - r_{i})) \\ - I (ϵ_{i} \leq 1)] \times [ϵ_{i}^{- 1} \exp (Z_{i}^{⊤} (α - α_{0}) + {\hat{U}}_{i}^{⊤} (γ - γ_{0}) - r_{i}) - ϵ_{i} \exp (- Z_{i}^{⊤} (θ - θ_{0}) \\ - {\hat{U}}_{i}^{⊤} (γ - γ_{0}) + r_{i})]} + o_{p} (1) . \end{matrix}

(A6)

According to condition C8, we can obtain that the sum of first term in Equation (A6) is 0. Further, by condition C8, we have

\begin{matrix} 2 E \{ϵ I (ϵ > 1)\} > E \{(ϵ + ϵ^{- 1}) I (ϵ > 1)\} = E \{(ϵ + ϵ^{- 1}) I (ϵ \leq 1)\} > 2 E \{ϵ I (ϵ \leq 1)\} . \end{matrix}

This means

J = E \{ϵ sgn (ϵ - 1)\} > 0

, and the second term in Equation (A6) is non-negative. And it is easy to prove that the third term in Equation (A6) is also non-negative. Thus, for all

α

and

γ

, we have

E \{ψ_{n} (α, γ) - ψ_{n} (α_{0}, γ_{0})\} \geq 0 .

In addition, we have

E \{ψ_{n} (α, γ) - ψ_{n} (α_{0}, γ_{0})\} = o_{p} (1)

,

\sum_{i = 1}^{n} E \{ϵ_{i} sgn (ϵ_{i} - 1) \times [\exp (Z_{i}^{⊤} (α - α_{0}) + {\hat{U}}_{i}^{⊤} (γ - γ_{0}) - r_{i}) + \exp (- Z_{i}^{⊤} (α - α_{0}) - {\hat{U}}_{i}^{⊤} (γ - γ_{0}) + r_{i}) - 2]\} = o_{p} (1)

. And then

α = α_{0}, γ = γ_{0}

is a unique minimum point of

\{\exp [(Z_{i}^{⊤} (α - α_{0}) + {\hat{U}}_{i}^{⊤} (γ - γ_{0}) - r_{i}] + \exp [- Z_{i}^{⊤} (α - α_{0}) - {\hat{U}}_{i}^{⊤} (γ - γ_{0}) + r_{i}]\} (1 + o_{p} (1))

. According to condition C8 and

E \{ϵ sgn (ϵ - 1)\} > 0

, we can obtain that

α = α_{0}, γ = γ_{0}

is a minimizer of

E \{ψ_{n} (α, γ) - ψ_{n} (α_{0}, γ_{0})\}

. Let

ψ (α, γ) ≜ n^{- 1} E \{ψ_{n} (α, γ)\}

. Then, for each

δ_{1} > 0

,

δ_{2} > 0

, there exist

η > 0

, such that

∥ α - α_{0} ∥ \geq δ_{1}, ∥ γ - γ_{0} ∥ \geq δ_{2}

,

ψ (α, γ) > ψ (α_{0}, γ_{0}) + η

.

For any constant

δ_{1}

,

δ_{2} > 0

, and

C_{1}, C_{2}

, let (

α_{n}^{*}, γ_{n}^{*}

) be a minimizer of

ψ_{n} (α, γ)

such that

δ_{1} \leq ∥ α_{n}^{*} - α_{0} ∥ \leq C_{1}, δ_{2} \leq ∥ γ_{n}^{*} - γ_{0} ∥ \leq C_{2}

. According to (A6), as

n \to \infty

, we have

ψ_{n} (α_{n}^{*}, γ_{n}^{*}) \overset{p}{⟶} ψ (α_{n}^{*}, γ_{n}^{*})

, and

ψ_{n} (α_{n}^{*}, γ_{n}^{*}) > ψ (α_{0}, γ_{0}) + η

.

On the other hand, according to (A6), for arbitrary positive constants

δ_{1}

and

δ_{2}

, we have

\inf_{\begin{matrix} ∥ α - α_{0} ∥ \leq δ_{1} \\ ∥ γ - γ_{0} ∥ \leq δ_{2} \end{matrix}} ψ_{n} (α, γ) \leq ψ_{n} (α_{0}, γ_{0}) \overset{p}{⟶} ψ (α_{0}, γ_{0})

. Thus, with probability tending to 1, the minimizer is taken inside of

∥ α - α_{0} ∥ \leq δ_{1}, ∥ γ - γ_{0} ∥ \leq δ_{2}

for the strictly convex of

ψ_{n} (α, γ)

. Therefore, the local minimizer inside

∥ α - α_{0} ∥ \leq δ_{1}, ∥ γ - γ_{0} ∥ \leq δ_{2}

is the only global minimizer. According to the definition of

{\hat{α}}_{n}, {\hat{γ}}_{n}

, when

n \to \infty

,

P ({\hat{α}}_{n} \in \{α : ∥ α - α_{0} ∥ \leq δ_{1}\}) \to 1, P ({\hat{γ}}_{n} \in \{γ : ∥ γ - γ_{0} ∥ \leq δ_{2}\}) \to 1

. Thus, as

δ_{1} \to 0

,

δ_{2} \to 0

, we can obtain the weak consistency of

{\hat{α}}_{n}

and

{\hat{γ}}_{n}

.

Next, we prove the asymptotic normality. Note that

\exp (x) + \exp (- x) - 2 = x^{2} + O ({| x |}^{3})

. By invoking the Taylor expansion, we have

\begin{matrix} \frac{1}{n} E \{ψ_{n} (α, γ) - ψ_{n} (α_{0}, γ_{0})\} \\ = & J \frac{1}{n} \sum_{i = 1}^{n} E {\exp (Z_{i}^{⊤} (α - α_{0}) + {\hat{U}}_{i}^{⊤} (γ - γ_{0}) - r_{i}) + \exp (- Z_{i}^{⊤} (α - α_{0}) - {\hat{U}}_{i}^{⊤} (γ - γ_{0}) \\ + r_{i}) - 2} + 2 f (1) \frac{1}{n} \sum_{i = 1}^{n} E {{(α - α_{0})}^{⊤} Z_{i} Z_{i}^{⊤} (α - α_{0}) + {(α - α_{0})}^{⊤} Z_{i} {\hat{U}}_{i}^{⊤} (γ - γ_{0}) \\ + {(γ - γ_{0})}^{⊤} {\hat{U}}_{i} Z_{i}^{⊤} (α - α_{0}) + {(γ - γ_{0})}^{⊤} {\hat{U}}_{i} {\hat{U}}_{i}^{⊤} (γ - γ_{0})} + O (∥ α - α_{0} ∥^{3}) + O (∥ γ - γ_{0} ∥^{3}) \\ + O_{p} (n^{- \frac{a + 2 b - 1}{a + 2 b}}) \\ = & J {(α - α_{0})}^{⊤} V_{1} (α - α_{0}) + J {(α - α_{0})}^{⊤} V_{2} (γ - γ_{0}) + J {(γ - γ_{0})}^{⊤} V_{3} (α - α_{0}) \\ + J {(γ - γ_{0})}^{⊤} V_{4} (γ - γ_{0}) + 2 f (1) {(α - α_{0})}^{⊤} V_{1} (α - α_{0}) + 2 f (1) {(α - α_{0})}^{⊤} V_{2} (γ - γ_{0}) \\ + 2 f (1) {(γ - γ_{0})}^{⊤} V_{3} (α - α_{0}) + 2 f (1) {(γ - γ_{0})}^{⊤} V_{4} (γ - γ_{0}) + O (∥ α - α_{0} ∥^{3}) \\ + O (∥ γ - γ_{0} ∥^{3}) + O_{p} (n^{- \frac{a + 2 b - 1}{a + 2 b}}) \\ = & (J + 2 f (1)) [{(α - α_{0})}^{⊤} V_{1} (α - α_{0}) + {(α - α_{0})}^{⊤} V_{2} (γ - γ_{0}) + {(γ - γ_{0})}^{⊤} V_{3} (α - α_{0}) \\ + {(γ - γ_{0})}^{⊤} V_{4} (γ - γ_{0})] + O (∥ α - α_{0} ∥^{3}) + O (∥ γ - γ_{0} ∥^{3}) + O_{p} (n^{- \frac{a + 2 b - 1}{a + 2 b}}), \end{matrix}

(A7)

where

J = E \{ϵ sgn (ϵ - 1)\} > 0

,

V_{1} = E (Z Z^{⊤}), V_{2} = E (Z {\hat{U}}^{⊤}), V_{3} = E (\hat{U} Z^{⊤}), V_{4} = E (\hat{U} {\hat{U}}^{⊤}) .

Let

W_{1} = \sum_{i = 1}^{n} (ϵ_{i} + ϵ_{i}^{- 1}) sgn (ϵ_{i} - 1) Z_{i}

,

W_{2} = \sum_{i = 1}^{n} (ϵ_{i} + ϵ_{i}^{- 1}) sgn (ϵ_{i} - 1) {\hat{U}}_{i}

.

Next, we will show that, for every

C_{1} > 0, C_{2} > 0

, one has

\begin{matrix} \sup_{\begin{matrix} ∥ θ - θ_{0} ∥ \leq C_{1} n^{- 1 / 2} \\ ∥ γ - γ_{0} ∥ \leq C_{2} n^{\frac{2 b - 1}{2 (a + 2 b)}} \end{matrix}} & | ψ_{n} (θ, γ) - ψ_{n} (θ_{0}, γ_{0}) + W_{1}^{⊤} (θ - θ_{0}) + W_{2}^{⊤} (γ - γ_{0}) \\ - E \{ψ_{n} (θ, γ) - ψ_{n} (θ_{0}, γ_{0})\} | \overset{p}{⟶} 0 . \end{matrix}

(A8)

Let

θ_{1} = \sqrt{n} (α - α_{0})

,

θ_{2} = \sqrt{δ_{n}} (γ - γ_{0})

. Then, Equation (A8) is rewritten as

\begin{matrix} \sup_{\begin{matrix} ∥ θ_{1} ∥ \leq C_{1} \\ ∥ θ_{2} ∥ \leq C_{2} \end{matrix}} | ψ_{n} (α_{0} + \frac{θ_{1}}{\sqrt{n}}, γ_{0} + \frac{θ_{2}}{\sqrt{δ_{n}}}) - ψ_{n} (α_{0}, γ_{0}) + \frac{1}{\sqrt{n}} W_{1}^{⊤} θ_{1} + \frac{1}{\sqrt{δ_{n}}} W_{2}^{⊤} θ_{2} \\ - E {ψ_{n} (α_{0} + \frac{θ_{1}}{\sqrt{n}}, γ_{0} + \frac{θ_{2}}{\sqrt{δ_{n}}}) - ψ_{n} (α_{0}, γ_{0})} | \overset{p}{⟶} 0 . \end{matrix}

To prove the above equation, we will first prove that the following Equation (A9) holds for each fixed

θ_{1}

and

θ_{2}

, that is,

\begin{matrix} ψ_{n} (α_{0} + \frac{θ_{1}}{\sqrt{n}}, γ_{0} + \frac{θ_{2}}{\sqrt{δ_{n}}}) - ψ_{n} (α_{0}, γ_{0}) + \frac{1}{\sqrt{n}} W_{1}^{⊤} θ_{1} + \frac{1}{\sqrt{δ_{n}}} W_{2}^{⊤} θ_{2} \\ - E \{ψ_{n} (α_{0} + \frac{θ_{1}}{\sqrt{n}}, γ_{0} + \frac{θ_{2}}{\sqrt{δ_{n}}}) - ψ_{n} (α_{0}, γ_{0})\} \overset{p}{⟶} 0 . \end{matrix}

(A9)

Let

\begin{matrix} G_{i} (θ, γ) ≜ & ϵ_{i} sgn (ϵ_{i} - 1) \times {\exp (Z_{i}^{⊤} (α - α_{0}) + {\hat{U}}_{i}^{⊤} (γ - γ_{0}) - r_{i}) + \\ \exp (- Z_{i}^{⊤} (α - α_{0}) - {\hat{U}}_{i}^{⊤} (γ - γ_{0}) + r_{i}) - 2} . \end{matrix}

\begin{matrix} R_{i} (θ, γ) ≜ & [I (ϵ_{i} > \exp (Z_{i}^{⊤} (α - α_{0}) + {\hat{U}}_{i}^{⊤} (γ - γ_{0}) - r_{i})) - I (ϵ_{i} > 1)] \times [ϵ_{i} \exp (- {\hat{U}}_{i}^{⊤} (γ \\ - γ_{0}) - Z_{i}^{⊤} (α - α_{0}) + r_{i}) - ϵ_{i}^{- 1} \exp ({\hat{U}}_{i}^{⊤} (γ - γ_{0}) + Z_{i}^{⊤} (α - α_{0}) - r_{i})] . \end{matrix}

Then, we have

\begin{matrix} ψ_{n} (α, γ) - ψ_{n} (α_{0}, γ_{0}) - E \{ψ_{n} (α, γ) - ψ_{n} (α_{0}, γ_{0})\} \\ = & - \sum_{i = 1}^{n} (ϵ_{i} + ϵ_{i}^{- 1}) sgn (ϵ_{i} - 1) [\exp (Z_{i}^{⊤} (α - α_{0}) + {\hat{U}}_{i}^{⊤} (γ - γ_{0}) - r_{i}) - 1] \\ + \sum_{i = 1}^{n} \{G_{i} (α, γ) - E G_{i} (α, γ)\} + 2 \sum_{i = 1}^{n} \{R_{i} (α, γ) - E R_{i} (α, γ)\} . \end{matrix}

For each fixed

θ_{1}

and

θ_{2}

, as

n \to \infty

, one has

\begin{matrix} \sum_{i = 1}^{n} E {\{G_{i} (α_{0} + \frac{θ_{1}}{\sqrt{n}}, γ_{0} + \frac{θ_{2}}{\sqrt{δ_{n}}}) - E [G_{i} (α_{0} + \frac{θ_{1}}{\sqrt{n}}, γ_{0} + \frac{θ_{2}}{\sqrt{δ_{n}}})]\}}^{2} \\ \leq & \sum_{i = 1}^{n} E {\{ϵ_{i} sgn (ϵ_{i} - 1)\}}^{2} \times E {\exp (- \frac{1}{\sqrt{n}} Z_{i}^{⊤} θ_{1} - \frac{1}{\sqrt{δ_{n}}} {\hat{U}}_{i}^{⊤} θ_{2} + r_{i}) + \exp (\frac{1}{\sqrt{n}} Z_{i}^{⊤} θ_{1} \\ + \frac{1}{\sqrt{δ_{n}}} {\hat{U}}_{i}^{⊤} θ_{2} - r_{i}) - 2}^{2} \\ = & \sum_{i = 1}^{n} E {\{ϵ_{i} sgn (ϵ_{i} - 1)\}}^{2} \times E (\frac{1}{n} θ_{1}^{⊤} Z_{i} Z_{i}^{⊤} θ_{1} + \frac{1}{\sqrt{n δ_{n}}} θ_{1}^{⊤} Z_{i} {\hat{U}}_{i}^{⊤} θ_{2} + \frac{1}{\sqrt{n δ_{n}}} θ_{2}^{⊤} {\hat{U}}_{i} Z_{i}^{⊤} θ_{1} \\ + \frac{1}{δ_{n}} θ_{2}^{⊤} {\hat{U}}_{i} {\hat{U}}_{i}^{⊤} θ_{2} + a_{i})^{2} + O_{p} (n^{- \frac{a + 2 b - 1}{a + 2 b}}) \to 0, \end{matrix}

where

a_{i}

such that

P (∥ a_{i} ∥ \leq c δ_{n}^{- 3 / 2}) = 1

for

i = 1, \dots, n

. As

n \to \infty

, one has

\sum_{i = 1}^{n} \{G_{i} (α_{0} + \frac{θ_{1}}{\sqrt{n}}, γ_{0} + \frac{θ_{2}}{\sqrt{δ_{n}}}) - E [G_{i} (α_{0} + \frac{θ_{1}}{\sqrt{n}}, γ_{0} + \frac{θ_{2}}{\sqrt{δ_{n}}})]\} \overset{p}{⟶} 0 .

(A10)

On the other hand, by the Taylor series expansion, for every fixed

θ_{1}

and

θ_{2}

, we have

\begin{matrix} E {\{ϵ \exp (- \frac{1}{\sqrt{n}} Z_{i}^{⊤} θ_{1} - \frac{1}{\sqrt{δ_{n}}} {\hat{U}}_{i}^{⊤} θ_{2} - r_{i}) - ϵ^{- 1} \exp (\frac{1}{\sqrt{n}} Z_{i}^{⊤} θ_{1} + \frac{1}{\sqrt{δ_{n}}} {\hat{U}}_{i}^{⊤} θ_{2} + r_{i})\}}^{2} \\ = & E {(- ϵ \frac{1}{\sqrt{n}} Z_{i}^{⊤} θ_{1} - ϵ \frac{1}{\sqrt{δ_{n}}} {\hat{U}}_{i}^{⊤} θ_{2} - ϵ^{- 1} \frac{1}{\sqrt{n}} Z_{i}^{⊤} θ_{1} - ϵ^{- 1} \frac{1}{\sqrt{δ_{n}}} {\hat{U}}_{i}^{⊤} θ_{2} + ϵ - ϵ^{- 1} + b)}^{2} + O_{p} (n^{- \frac{a + 2 b - 1}{a + 2 b}}) \\ = & E {- (ϵ - 1) \frac{1}{\sqrt{n}} Z_{i}^{⊤} θ_{1} - (ϵ - 1) \frac{1}{\sqrt{δ_{n}}} {\hat{U}}_{i}^{⊤} θ_{2} - (ϵ^{- 1} - 1) \frac{1}{\sqrt{n}} Z_{i}^{⊤} θ_{1} - (ϵ^{- 1} - 1) \frac{1}{\sqrt{δ_{n}}} {\hat{U}}_{i}^{⊤} θ_{2} \\ - \frac{2}{\sqrt{n}} Z_{i}^{⊤} θ_{1} - \frac{2}{\sqrt{δ_{n}}} {\hat{U}}_{i}^{⊤} θ_{2} + (ϵ - 1) - (ϵ^{- 1} - 1) + b}^{2} + O_{p} (n^{- \frac{a + 2 b - 1}{a + 2 b}}) \\ \leq & E {9 [{(ϵ - 1)}^{2} + {(ϵ^{- 1} - 1)}^{2} + 4] \frac{1}{n} θ_{1}^{⊤} Z_{i} Z_{i}^{⊤} θ_{1} + 9 [{(ϵ - 1)}^{2} + {(ϵ^{- 1} - 1)}^{2} + 4] \frac{1}{δ_{n}} θ_{2}^{⊤} {\hat{U}}_{i} {\hat{U}}_{i}^{⊤} θ_{2} \\ + 9 {(ϵ - 1)}^{2} + 9 {(ϵ^{- 1} - 1)}^{2} + b^{2}} (1 + o_{p} (1)) + O_{p} (n^{- \frac{a + 2 b - 1}{a + 2 b}}), \end{matrix}

where b such that

P (∥ b ∥ \leq c δ_{n}^{- 1}) = 1

.

Similarly, we have

\begin{matrix} \sum_{i = 1}^{n} E {\{R_{i} (α_{0} + \frac{θ_{1}}{\sqrt{n}}, γ_{0} + \frac{θ_{2}}{\sqrt{δ_{n}}}) - E [R_{i} (α_{0} + \frac{θ_{1}}{\sqrt{n}}, γ_{0} + \frac{θ_{2}}{\sqrt{δ_{n}}})]\}}^{2} \\ \leq & \sum_{i = 1}^{n} E {{I (\frac{1}{\sqrt{n}} Z_{i}^{⊤} θ_{1} + \frac{1}{\sqrt{δ_{n}}} {\hat{U}}_{i}^{⊤} θ_{2} > 0) I (0 < \log ϵ_{i} \leq \frac{1}{\sqrt{n}} Z_{i}^{⊤} θ_{1} + \frac{1}{\sqrt{δ_{n}}} {\hat{U}}_{i}^{⊤} θ_{2}) \\ + I (\frac{1}{\sqrt{n}} Z_{i}^{⊤} θ_{1} + \frac{1}{\sqrt{δ_{n}}} {\hat{U}}_{i}^{⊤} θ_{2} \leq 0) I (0 \geq \log ϵ_{i} > \frac{1}{\sqrt{n}} Z_{i}^{⊤} θ_{1} + \frac{1}{\sqrt{δ_{n}}} {\hat{U}}_{i}^{⊤} θ_{2})} \\ \times [9 ({(ϵ - 1)}^{2} + {(ϵ^{- 1} - 1)}^{2} + 4) \frac{1}{\sqrt{n}} θ_{1}^{⊤} Z_{i} Z_{i}^{⊤} θ_{1} + 9 ({(ϵ - 1)}^{2} + {(ϵ^{- 1} - 1)}^{2} \\ + 4) \frac{1}{\sqrt{δ_{n}}} θ_{2}^{⊤} {\hat{U}}_{i} {\hat{U}}_{i}^{⊤} θ_{2} + 9 {(ϵ - 1)}^{2} + 9 {(ϵ^{- 1} - 1)}^{2} + b_{i}^{2}]} (1 + o_{p} (1)) + O_{p} (n^{- \frac{a + 2 b - 1}{a + 2 b}}) \to 0, \end{matrix}

where

b_{i}

such that

P (∥ b_{i} ∥ \leq c δ_{n}^{- 1}) = 1

. Furthermore, for each fixed

θ_{1}

and

θ_{2}

, one has

\sum_{i = 1}^{n} \{R_{i} (α_{0} + \frac{θ_{1}}{\sqrt{n}}, γ_{0} + \frac{θ_{2}}{\sqrt{δ_{n}}}) - E [R_{i} (α_{0} + \frac{θ_{1}}{\sqrt{n}}, γ_{0} + \frac{θ_{2}}{\sqrt{δ_{n}}})]\} \overset{p}{⟶} 0 .

(A11)

Combining Equation (A11) with condition C8, we complete the proof of (A9). According to Lemma A.1 in [1], we know that

ψ_{n} (θ_{0} + \frac{α_{1}}{\sqrt{n}}, γ_{0} + \frac{α_{2}}{\sqrt{δ_{n}}}) - ψ_{n} (θ_{0}, γ_{0}) + \frac{1}{\sqrt{n}} W_{1}^{⊤} α_{1} + \frac{1}{\sqrt{δ_{n}}} W_{2}^{⊤} α_{2}

is convex. Then, for each constant

C_{1} > 0, C_{2} > 0

, we have

\begin{matrix} \sup_{\begin{matrix} ∥ θ_{1} ∥ \leq C_{1} \\ ∥ θ_{2} ∥ \leq C_{2} \end{matrix}} | ψ_{n} (θ_{0} + \frac{θ_{1}}{\sqrt{n}}, γ_{0} + \frac{θ_{2}}{\sqrt{δ_{n}}}) - ψ_{n} (θ_{0}, γ_{0}) + \frac{1}{\sqrt{n}} W_{1}^{⊤} θ_{1} + \frac{1}{\sqrt{δ_{n}}} W_{2}^{⊤} θ_{2} \\ - E \{ψ_{n} (θ_{0} + \frac{θ_{1}}{\sqrt{n}}, γ_{0} + \frac{θ_{2}}{\sqrt{δ_{n}}}) - ψ_{n} (θ_{0}, γ_{0})\} | \overset{p}{⟶} 0 . \end{matrix}

Lastly, let

\begin{matrix} ξ_{n} (α, γ) ≜ & ψ_{n} (α, γ) - ψ_{n} (α_{0}, γ_{0}) - n (J + 2 f (1)) [{(α - α_{0})}^{⊤} V_{1} (α - α_{0}) + {(α - α_{0})}^{⊤} \\ \times V_{2} (γ - γ_{0}) + {(γ - γ_{0})}^{⊤} V_{3} (α - α_{0}) + {(γ - γ_{0})}^{⊤} V_{4} (γ - γ_{0})] \\ + W_{1}^{⊤} (α - α_{0}) + W_{2}^{⊤} (γ - γ_{0}) . \end{matrix}

Then, for each

C_{1} > 0, C_{2} > 0

, as

n \to \infty

, we have

\sup_{\begin{matrix} ∥ α - α_{0} ∥ \leq C_{1} n^{- 1 / 2} \\ ∥ γ - γ_{0} ∥ \leq C_{2} δ_{n}^{- 1 / 2} \end{matrix}} | ξ_{n} (α, γ) | \overset{p}{⟶} 0 .

(A12)

Let

{\hat{α}}_{n}^{*}

,

{\hat{γ}}_{n}^{*}

be a minimizer of

n (J + 2 f (1)) [{(α - α_{0})}^{⊤} V_{1} (α - α_{0}) + {(α - α_{0})}^{⊤} V_{2} (γ - γ_{0}) + {(γ - γ_{0})}^{⊤} V_{3} (α - α_{0}) + {(γ - γ_{0})}^{⊤} V_{4} (γ - γ_{0})] - W_{1}^{⊤} (α - α_{0}) - W_{2}^{⊤} (γ - γ_{0})

. A simple calculation yields

{\hat{α}}_{n}^{*} - α_{0} = \frac{1}{2 n} {(J + 2 f (1))}^{- 1} {(V_{1} - V_{2} V_{4}^{- 1} V_{3})}^{- 1} (W_{1} - W_{2} V_{2} V_{4}^{- 1})

. According to the definition of

W_{1}, W_{2}

, for each

δ > 0

, there exists some constant

C_{3}

,

C_{4}

, and

N_{δ}

, as

n \geq N_{δ}

. Therefore, one has

P (‖ {\hat{α}}_{n}^{*} - α_{0} ‖ > C_{3} n^{- \frac{1}{2}}) \leq \frac{δ}{2}, P (‖ {\hat{γ}}_{n}^{*} - γ_{0} ‖ > C_{4} δ_{n}^{- \frac{1}{2}}) \leq \frac{δ}{2}

. According to (A11), for each

η

, there exists some constant

N_{η}

, for any

n \geq N_{η}, δ_{n} \geq N_{η}

, such that

P (\sup_{\begin{matrix} ∥ α - α_{0} ∥ \leq C_{3} n^{- 1 / 2} \\ ∥ γ - γ_{0} ∥ \leq C_{4} δ_{n}^{- 1 / 2} \end{matrix}} | ξ_{n} (α, γ) | > η) \leq \frac{δ}{2} .

Therefore, for each

δ, η > 0

, there exists

N = \max \{N_{δ}, N_{η}\}

, for any

n > N, δ_{n} > N

, such that

\begin{matrix} P (| ξ_{n} ({\hat{α}}_{n}^{*}, {\hat{γ}}_{n}^{*}) | > η) \\ = & P (| ξ_{n} ({\hat{α}}_{n}^{*}, {\hat{γ}}_{n}^{*}) | > η, ‖ {\hat{α}}_{n}^{*} - α ‖ > C_{3} n^{- \frac{1}{2}}, ‖ {\hat{γ}}_{n}^{*} - γ ‖ > C_{4} δ_{n}^{- \frac{1}{2}}) \\ + & P (| ξ_{n} ({\hat{α}}_{n}^{*}, {\hat{γ}}_{n}^{*}) | > η, ‖ {\hat{α}}_{n}^{*} - α ‖ \leq C_{3} n^{- \frac{1}{2}} ‖ {\hat{γ}}_{n}^{*} - γ ‖ \leq C_{4} δ_{n}^{- \frac{1}{2}}) \\ \leq & P (‖ {\hat{α}}_{n}^{*} - α ‖ > C_{3} n^{- \frac{1}{2}}) + P (‖ {\hat{γ}}_{n}^{*} - γ ‖ > C_{4} δ_{n}^{- \frac{1}{2}}) + P (\sup_{\begin{matrix} | α - α_{0} ∥ \leq C_{3} n^{- 1 / 2} \\ ∥ γ - γ_{0} ∥ \leq C_{4} δ_{n}^{- 1 / 2} \end{matrix}} | ξ_{n} (α, γ) | > η) \leq δ . \end{matrix}

This means

ξ_{n} ({\hat{α}}_{n}^{*}, {\hat{γ}}_{n}^{*}) = o_{p} (1)

. Similarly, for each constant

C_{1} > 0, C_{2} > 0

, one has

\sup_{\begin{matrix} ∥ α - {\hat{α}}_{n}^{*} ∥ \leq C_{1} n^{- 1 / 2} \\ ∥ γ - {\hat{γ}}_{n}^{*} ∥ \leq C_{2} δ_{n}^{- 1 / 2} \end{matrix}} | ξ_{n} (α, γ) | = o_{p} (1) .

Let

A_{1} ≜ \frac{1}{2 n} {(J + 2 f (1))}^{- 1} [V_{1} - V_{2} V_{4}^{- 1} V_{3}] [W_{1} - W_{2} V_{2} V_{4}^{- 1}],

A_{2} ≜ \frac{1}{2 n} {(J + 2 f (1))}^{- 1}

, and

W_{2} V_{4}^{- 1} - V_{4}^{- 1} V_{2} A_{1} .

A simple calculation yields

A_{1} - J B = o_{p} (1)

. Note that

\begin{matrix} ψ_{n} (α, γ) - ψ_{n} (α_{0}, γ_{0}) = & n (J + 2 f (1)) [{(α - {\hat{α}}_{n}^{*})}^{⊤} V_{1} (α - {\hat{α}}_{n}^{*}) + {(α - {\hat{α}}_{n}^{*})}^{⊤} V_{2} (γ - {\hat{γ}}_{n}^{*}) \\ + {(γ - {\hat{γ}}_{n}^{*})}^{⊤} V_{3} (α - {\hat{α}}_{n}^{*}) + {(γ - {\hat{γ}}_{n}^{*})}^{⊤} V_{4} (γ - {\hat{γ}}_{n}^{*})] - W_{1}^{⊤} A_{1} \\ - W_{2}^{⊤} A_{2} + ξ_{n} (α, γ) . \end{matrix}

Then, for any constant

C_{1}

,

C_{2}

,

C_{5}

,

C_{6}

, such that

0 < C_{5} < C_{1} < \infty

,

0 < C_{6} < C_{2} < \infty

, one has

\begin{matrix} \inf_{\begin{matrix} C_{5} n^{- \frac{1}{2}} \leq ∥ α - {\hat{α}}_{n}^{*} ∥ \leq C_{1} n^{- \frac{1}{2}} \\ C_{6} δ_{n}^{- \frac{1}{2}} \leq ∥ γ - {\hat{γ}}_{n}^{*} ∥ \leq C_{2} δ_{n}^{- \frac{1}{2}} \end{matrix}} \{ψ_{n} (α, γ) - ψ_{n} (α_{0}, γ_{0})\} \\ \geq & \inf_{\begin{matrix} C_{5} n^{- \frac{1}{2}} \leq ∥ α - {\hat{α}}_{n}^{*} ∥ \leq C_{1} n^{- \frac{1}{2}} \\ C_{6} δ_{n}^{- \frac{1}{2}} \leq ∥ γ - {\hat{γ}}_{n}^{*} ∥ \leq C_{2} δ_{n}^{- \frac{1}{2}} \end{matrix}} {n (J + 2 f (1)) [{(α - {\hat{α}}_{n}^{*})}^{⊤} V_{1} (α - {\hat{α}}_{n}^{*}) + {(α - {\hat{α}}_{n}^{*})}^{⊤} V_{2} (γ - {\hat{γ}}_{n}^{*}) \\ + {(γ - {\hat{γ}}_{n}^{*})}^{⊤} V_{3} (α - {\hat{α}}_{n}^{*}) + {(γ - {\hat{γ}}_{n}^{*})}^{⊤} V_{4} (γ - {\hat{γ}}_{n}^{*})] - W_{1}^{⊤} A_{1} - W_{2}^{⊤} A_{2}} - \\ \sup_{\begin{matrix} C_{5} n^{- \frac{1}{2}} \leq ∥ α - {\hat{α}}_{n}^{*} ∥ \leq C_{1} n^{- \frac{1}{2}} \\ C_{6} δ_{n}^{- \frac{1}{2}} \leq ∥ γ - {\hat{γ}}_{n}^{*} ∥ \leq C_{2} δ_{n}^{- \frac{1}{2}} \end{matrix}} ∥ ξ_{n} (α, γ) ∥ \\ \geq & n (J + 2 f (1)) \{\frac{1}{n} C_{5}^{2} λ_{1} + \frac{1}{\sqrt{n δ_{n}}} C_{5} C_{6} λ_{2} + \frac{1}{\sqrt{n δ_{n}}} C_{5} C_{6} λ_{3} + \frac{1}{δ_{n}} C_{5} C_{6} λ_{4}\} - W_{1}^{⊤} A_{1} \\ - W_{2}^{⊤} A_{2} + o_{p} (1), \end{matrix}

(A13)

where

λ_{1}, λ_{2}, λ_{3}

, and

λ_{4}

are the smallest eigenvalues of

V_{1}, V_{2}, V_{3}

, and

V_{4}

, respectively.

On the other hand, for any constant

C_{7}

and

C_{8}

, one has

\begin{matrix} \inf_{\begin{matrix} ∥ α - {\hat{α}}_{n}^{*} ∥ \leq C_{7} n^{- \frac{1}{2}} \\ ∥ γ - {\hat{γ}}_{n}^{*} ∥ \leq C_{8} δ_{n}^{- \frac{1}{2}} \end{matrix}} \{ψ_{n} (α, γ) - ψ_{n} (α_{0}, γ_{0})\} \leq ψ_{n} ({\hat{α}}_{n}^{*}, {\hat{γ}}_{n}^{*}) - ψ_{n} (α_{0}, γ_{0}) \\ = & - W_{1}^{⊤} A_{1} - W_{2}^{⊤} A_{2} + ξ_{n} ({\hat{α}}_{n}^{*}, {\hat{γ}}_{n}^{*}) = - W_{1}^{⊤} A_{1} - W_{2}^{⊤} A_{2} + o_{p} (1) . \end{matrix}

(A14)

According to (A13) and (A14), we can obtain that, with probability tending to 1, the minimizer of

ψ_{n} (α, γ) - ψ_{n} (α_{0}, γ_{0})

is taken inside of

∥ α - {\hat{α}}_{n}^{*} ∥ \leq C_{7} n^{- \frac{1}{2}}, ∥ γ - {\hat{γ}}_{n}^{*} ∥ \leq C_{8} δ_{n}^{- \frac{1}{2}}

.

ψ_{n} (α, γ) - ψ_{n} (α_{0}, γ_{0})

is convex. Therefore, the local minimizer is also the only global minimizer. Thus,

\begin{matrix} {\hat{α}}_{n} - α_{0} = {\hat{α}}_{n}^{*} - α_{0} + o_{p} (n^{- \frac{1}{2}}) \\ = & \frac{1}{2 n} {(J + 2 f (1))}^{- 1} {[V_{1} - V_{2} V_{4}^{- 1} V_{3}]}^{- 1} [W_{1} - W_{2} V_{2} V_{4}^{- 1}] + o_{p} (n^{- \frac{1}{2}}) . \end{matrix}

Then, as

n \to \infty

, we have

\sqrt{n} ({\hat{α}}_{n} - α_{0}) \overset{D}{⟶} N (0, \frac{1}{4} {\{J + 2 f (1)\}}^{- 2} Σ^{- 1} A),

where

A = E {(ϵ + ϵ^{- 1})}^{2}, J = E \{ϵ sgn (ϵ - 1)\}

. □

Proof of Theorem 3.

Let

δ_{n} = n^{- \frac{2 b - 1}{2 (a + 2 b)}}

,

S_{n} = δ_{n}^{- 1} (\tilde{α} - α_{0})

,

V_{n} = δ_{n}^{- 1} (\tilde{γ} - γ_{0})

,

r_{i} = \int_{0}^{1} β_{0} (t) X_{i} (t) d t - {\tilde{U}}_{i}^{⊤} γ_{0}

,

F_{n} = \{(S_{n}, V_{n}) : ∥ {(S_{n}^{⊤}, V_{n}^{⊤})}^{⊤} ∥ = L\}

, where L is a large enough constant. Next, we show that, for any given

η > 0

, there exists a sufficiently large constant

L = L_{η}

, such that

P \{\inf_{(S_{n}, V_{n}) \in F_{n}} Q_{n} (α_{0} + δ_{n} S_{n}, γ_{0} + δ_{n} V_{n}) > Q_{n} (α_{0}, γ_{0})\} \geq 1 - η .

(A15)

This implies that with the probability of at least

1 - η

, there exists a local minimizer

\tilde{α}

and

\tilde{γ}

in the ball

\{(S_{n}, V_{n}) : ∥ {(S_{n}^{⊤}, V_{n}^{⊤})}^{⊤} ∥ = L\}

such that

∥ \tilde{α} - α_{0} ∥ = O_{p} (δ_{n})

,

∥ \tilde{γ} - γ_{0} ∥ = O_{p} (δ_{n})

.

With a simple calculation, we have

\begin{matrix} ψ_{n} (S_{n}, V_{n}) ≜ & (Q_{n} (α_{0} + δ_{n} S_{n}, γ_{0} + δ_{n} V_{n}) - Q_{n} (α_{0}, γ_{0})) \\ = & \sum_{i = 1}^{n} \{Y_{i}^{- 1} \exp (Z_{i}^{⊤} (α_{0} + δ_{n} S_{n}) + {\tilde{U}}_{i}^{⊤} (γ_{0} + δ_{n} V_{n})) - Y_{i}^{- 1} \exp (Z_{i}^{⊤} \\ α_{0} + {\tilde{U}}_{i}^{⊤} γ_{0})} + \sum_{i = 1}^{n} \{Y_{i} \exp (Z_{i}^{⊤} (α_{0} + δ_{n} S_{n}) + {\tilde{U}}_{i}^{⊤} (γ_{0} + δ_{n} V_{n})) \\ - Y_{i} \exp (Z_{i}^{⊤} α_{0} + {\tilde{U}}_{i}^{⊤} γ_{0})} \\ ≜ & I_{1} + I_{2} . \end{matrix}

Invoking the Taylor expansion, we have

\begin{matrix} I_{1} = & δ_{n} \sum_{i = 1}^{n} {S_{n} Z_{i}^{⊤} Y_{i}^{- 1} \exp (Z_{i}^{⊤} α_{0}) + V_{n} {\tilde{U}}_{i}^{⊤} Y_{i}^{- 1} \exp ({\tilde{U}}_{i}^{⊤} γ_{0})\} + \frac{1}{2} δ_{n}^{2} \sum_{i = 1}^{n} Y_{i}^{- 1} {S_{n}^{⊤} Z_{i} \\ Z_{i}^{⊤} S_{n} \exp (ξ_{i}^{[1]}) + V_{n}^{⊤} {\tilde{U}}_{i} {\tilde{U}}_{i}^{⊤} V_{n} \exp (ξ_{i}^{[2]})} \\ ≜ & I_{11} + I_{12}, \end{matrix}

where

ξ_{i}^{[1]}

is between

Z_{i}^{⊤} (α_{0} + δ_{n} S_{n})

and

Z_{i}^{⊤} α_{0}

, and

ξ_{i}^{[2]}

is between

{\tilde{U}}_{i}^{⊤} (γ_{0} + δ_{n} V_{n})

and

{\tilde{U}}_{i}^{⊤} γ_{0}

. For

I_{11}

, we have

\begin{matrix} I_{11} = & \sum_{i = 1}^{n} δ_{n} \frac{1}{ϵ_{i}} \{S_{n} Z_{i}^{⊤} + V_{n} {\tilde{U}}_{i}^{⊤}\} + \sum_{i = 1}^{n} \frac{1}{ϵ_{i}} δ_{n}^{2} \{S_{n}^{⊤} Z_{i} Z_{i}^{⊤} S_{n} + V_{n}^{⊤} {\tilde{U}}_{i} {\tilde{U}}^{⊤} V_{n}\} \\ ≜ & I_{111} + I_{112} + o_{p} (1) . \end{matrix}

It is easy to obtain

I_{111} = o_{p} (1) ∥ S_{n} ∥^{2} + o_{p} (1) {∥ V_{n} ∥}^{2}

,

I_{112} = O_{p} (δ_{n}^{2}) ∥ S_{n} ∥^{2} + O_{p} (δ_{n}^{2}) {∥ V_{n} ∥}^{2}

, and

I_{12} = O_{p} (δ_{n}^{2}) ∥ S_{n} ∥^{2} + O_{p} (δ_{n}^{2}) {∥ V_{n} ∥}^{2}

.

Furthermore, we have

I_{1} = O_{p} (δ_{n}^{2}) ∥ S_{n} ∥^{2} + O_{p} (δ_{n}^{2}) ∥ V_{n} ∥^{2}, I_{2} = O_{p} (δ_{n}^{2}) ∥ S_{n} ∥^{2} + O_{p} (δ_{n}^{2}) {∥ V_{n} ∥}^{2} .

Therefore, Equation (A15) holds, and there exists a local minimizer

\tilde{γ}

, such that

∥ \tilde{γ} - γ_{0} ∥ = O_{p} (δ_{n})

.

Note that

\begin{matrix} ∥ \tilde{β} (t) - β_{0} {(t) ∥}^{2} = & ∥ \sum_{j = 1}^{m} {\tilde{γ}}_{j} {\hat{υ}}_{j} (t) - \sum_{j = 1}^{\infty} γ_{0 j} υ_{j} {(t) ∥}^{2} \\ \leq & 2 ∥ \sum_{j = 1}^{m} {\tilde{γ}}_{j} {\hat{υ}}_{j} (t) - \sum_{j = 1}^{\infty} γ_{0 j} υ_{j} {(t) ∥}^{2} + 2 {∥ \sum_{j = m + 1}^{\infty} γ_{0 j} υ_{j} (t) ∥}^{2} \\ \leq & 4 ∥ \sum_{j = 1}^{m} ({\tilde{γ}}_{j} - γ_{0 j}) {\hat{υ}}_{j} (t) ∥^{2} + 4 {∥ \sum_{j = 1}^{m} γ_{0 j} ({\hat{υ}}_{j} (t) - υ_{j} (t)) ∥}^{2} + 2 \sum_{j = m + 1}^{\infty} γ_{0 j}^{2} \\ ≜ & 4 D_{1} + 4 D_{2} + 2 D_{3} . \end{matrix}

According to Equation (5), conditions C2, and the orthogonality of

{\tilde{υ}}_{j}

and

∥ υ_{j} (t) - {\hat{υ}}_{j} {(t) ∥}^{2} = O_{p} (n^{- 1} j^{2})

, we have

D_{1} = ∥ \sum_{j = 1}^{m} ({\tilde{γ}}_{j} - γ_{0 j}) {\hat{υ}}_{j} (t) ∥^{2} \leq \sum_{j = 1}^{m} | {\tilde{γ}}_{j} - γ_{0 j} |^{2} = {∥ \tilde{γ} - {\tilde{γ}}_{0} ∥}^{2} = O_{p} (δ_{n}^{2}) .

(A16)

\begin{matrix} D_{2} = & ∥ \sum_{j = 1}^{m} γ_{0 j} ({\hat{υ}}_{j} (t) - υ_{j} (t)) ∥^{2} \leq m \sum_{j = 1}^{m} {∥ {\hat{υ}}_{j} (t) - υ_{j} (t) ∥}^{2} γ_{0 j}^{2} \leq \frac{m}{n} O_{p} (\sum_{j = 1}^{m} j^{2} γ_{0 j}^{2}) \\ = & O_{p} (n^{- 1} m \sum_{j = 1}^{m} j^{2 - 2 b}) = O_{p} (n^{- 1} m) = o_{p} (n^{- \frac{2 b - 1}{a + 2 b}}) = o_{p} (δ_{n}^{2}) . \end{matrix}

(A17)

D_{3} = \sum_{j = m + 1}^{\infty} γ_{0 j}^{2} \leq C \sum_{j = m + 1}^{\infty} j^{- 2 b} = O (n^{- \frac{2 b - 1}{a + 2 b}}) = O (δ_{n}^{2}) .

(A18)

Then, by combining the equalities (A16)–(A18), we complete the proof of Theorem 3. □

Proof of Theorem 4.

According to Theorem 3, we know that, as

n \to \infty

, with probability tending to 1,

Q_{n} (\tilde{α}, \tilde{γ})

achieves the minimal value at

(\tilde{α}, \tilde{γ})

. We have the following score equations:

\frac{1}{n} \sum_{i = 1}^{n} \{- Z_{i} Y_{i} \exp (- Z_{i}^{⊤} \tilde{α} - {\tilde{U}}_{i}^{⊤} \tilde{γ}) + Z_{i} Y_{i}^{- 1} \exp (Z_{i}^{⊤} \tilde{α} + {\tilde{U}}_{i}^{⊤} \tilde{γ})\} = 0 .

(A19)

\frac{1}{n} \sum_{i = 1}^{n} \{- {\tilde{U}}_{i} Y_{i} \exp (- Z_{i}^{⊤} \tilde{α} - {\tilde{U}}_{i}^{⊤} \tilde{γ}) + {\tilde{U}}_{i} Y_{i}^{- 1} \exp (Z_{i}^{⊤} \tilde{α} + {\tilde{U}}_{i}^{⊤} \tilde{γ})\} = 0 .

(A20)

By Equations (A19) and (A20), we have

\begin{matrix} \frac{1}{n} \sum_{i = 1}^{n} {- Z_{i} ϵ_{i} \exp (- Z_{i}^{⊤} (\tilde{α} - α_{0}) - {\tilde{U}}_{i}^{⊤} (\tilde{γ} - γ_{0}) + r_{i}) \\ + Z_{i} ϵ_{i}^{- 1} \exp (Z_{i}^{⊤} (\tilde{α} - α_{0}) + {\tilde{U}}_{i}^{⊤} (\tilde{γ} - γ_{0}) - r_{i})} = 0 . \end{matrix}

\begin{matrix} \frac{1}{n} \sum_{i = 1}^{n} {- {\tilde{U}}_{i} ϵ_{i} \exp (- Z_{i}^{⊤} (\tilde{α} - α_{0}) - {\tilde{U}}_{i}^{⊤} (\tilde{γ} - γ_{0}) + r_{i}) \\ + {\tilde{U}}_{i} ϵ_{i}^{- 1} \exp ({\hat{Z}}_{i}^{⊤} (\tilde{α} - α_{0}) + {\tilde{U}}_{i}^{⊤} (\tilde{γ} - γ_{0}) - r_{i})} = 0 . \end{matrix}

(A21)

Using the Taylor series expansion for Equation (A21), we have

\begin{matrix} \frac{1}{n} \sum_{i = 1}^{n} {- {\tilde{U}}_{i} ϵ_{i} (1 - Z_{i}^{⊤} (\tilde{α} - α_{0}) - {\tilde{U}}_{i}^{⊤} (\tilde{γ} - γ_{0}) + r_{i}) \\ + {\tilde{U}}_{i} ϵ_{i}^{- 1} (1 + Z_{i}^{⊤} (\tilde{α} - α_{0}) + {\tilde{U}}_{i}^{⊤} (\tilde{γ} - γ_{0}) - r_{i})} + o_{p} (1) = 0 . \end{matrix}

Let

V_{1} = \frac{1}{n} \sum_{i = 1}^{n} Z_{i}^{⊤} Z_{i}

,

V_{2} = \frac{1}{n} \sum_{i = 1}^{n} Z_{i}^{⊤} {\tilde{U}}_{i}

,

V_{3} = \frac{1}{n} \sum_{i = 1}^{n} {\tilde{U}}_{i}^{⊤} Z_{i}

,

V_{4} = \frac{1}{n} \sum_{i = 1}^{n} {\tilde{U}}_{i}^{⊤} {\tilde{U}}_{i}

. A simple calculation yields

\tilde{γ} - γ_{0} = \frac{1}{n} V_{4}^{- 1} \sum_{i = 1}^{n} {\tilde{U}}_{i}^{⊤} {(ϵ_{i}^{- 1} + ϵ_{i})}^{- 1} (ϵ_{i} - ϵ_{i}^{- 1}) - V_{4}^{- 1} V_{3} (\tilde{α} - α_{0}) + o_{p} (1) .

Similarly,

\tilde{α} - α_{0} = \frac{1}{n} V_{1}^{- 1} \sum_{i = 1}^{n} Z_{i}^{⊤} {(ϵ_{i}^{- 1} + ϵ_{i})}^{- 1} (ϵ_{i} - ϵ_{i}^{- 1}) - V_{1}^{- 1} V_{2} (\tilde{γ} - γ_{0}) + o_{p} (1) .

Furthermore, we have

\tilde{α} - α_{0} = \frac{1}{n} {[V_{1} - V_{2} V_{4}^{- 1} V_{3}]}^{- 1} \sum_{i = 1}^{n} {(ϵ_{i}^{- 1} + ϵ_{i})}^{- 1} (ϵ_{i} - ϵ_{i}^{- 1}) [Z_{i}^{⊤} - Z_{i} {\tilde{U}}_{i}^{⊤} {({\tilde{U}}_{i} {\tilde{U}}_{i}^{⊤})}^{- 1} {\tilde{U}}_{i}] + o_{p} (1) .

Let

{\tilde{Z}}_{i} = Z_{i}^{⊤} - Z_{i} {\tilde{U}}_{i}^{⊤} {({\tilde{U}}_{i} {\tilde{U}}_{i}^{⊤})}^{- 1} {\tilde{U}}_{i}

. Then,

V_{1} - V_{2} V_{4}^{- 1} V_{3} \overset{p}{⟶} Σ .

\frac{1}{\sqrt{n}} \sum_{i = 1}^{n} {(ϵ_{i}^{- 1} + ϵ_{i})}^{- 1} (ϵ_{i} - ϵ_{i}^{- 1}) {\tilde{Z}}_{i} \overset{p}{⟶} N (0, Σ_{1}^{- 1} Σ Σ_{2}),

where

Σ_{1} = E \{{(ϵ + ϵ^{- 1})}^{2}\}, Σ_{2} = E \{{(ϵ - ϵ^{- 1})}^{2}\}

.

According to the law of large numbers and the central limit theorem, as

n \to \infty

, we obtain

\sqrt{n} (\tilde{α} - α_{0}) \overset{D}{⟶} N (0, Σ_{1}^{- 1} Σ^{- 1} Σ_{2}) .

□

References

Chen, K.; Guo, S.; Lin, Y.; Ying, Z. Least absolute relative error estimation. J. Am. Stat. Assoc. 2010, 105, 1104–1112. [Google Scholar] [CrossRef] [PubMed]
Fan, R.; Zhang, S.; Wu, Y. Penalized relative error estimation of functional multiplicative regression models with locally sparse properties. J. Korean Stat. Soc. 2022, 51, 666–691. [Google Scholar] [CrossRef]
Shin, H. Partial functional linear regression. J. Stat. Plan. Inference 2009, 139, 3405–3418. [Google Scholar] [CrossRef]
Liu, C.; Su, W.; Su, W. Efficient estimation for functional accelerated failure time model. arXiv 2024, arXiv:2402.05395. [Google Scholar]
Chen, K.; Guo, S.; Lin, Y.; Ying, Z. Least product relative error estimation. J. Multivar. Anal. 2016, 144, 91–98. [Google Scholar] [CrossRef]
Ming, H.; Liu, H.; Yang, H. Least product relative error estimation for identification in multiplicative additive models. J. Comput. Appl. Math. 2022, 404, 113886. [Google Scholar] [CrossRef]
Ye, F.; Zhou, H.; Yang, Y. Asymptotic properties of relative error estimation for accelerated failure time model with divergent number of parameters. Stat. Its Interface 2024, 17, 107–125. [Google Scholar] [CrossRef]
Zhang, J.; Feng, Z.; Peng, H. Estimation and hypothesis test for partial linear multiplicative models. Comput. Stat. Data Anal. 2018, 128, 87–103. [Google Scholar] [CrossRef]
Zhang, T.; Zhang, Q.; Li, N. Least absolute relative error estimation for functional quadratic multiplicative model. Commun. Stat.-Theory Methods 2016, 45, 5802–5817. [Google Scholar] [CrossRef]
Zhang, T.; Huang, Y.; Zhang, Q.; Ma, S.; Ahmed, S. Penalized relative error estimation of a partially functional linear multiplicative model. In Matrices, Statistics and Big Data: Selected Contributions from IWMS 2016; Springer: Cham, Switzerland, 2019; Volume 45, pp. 127–144. [Google Scholar]
Cai, T.; Hall, P. Prediction in functional linear regression. Ann. Stat. 2006, 34, 2159–2179. [Google Scholar] [CrossRef]
Hall, P.; Horowitz, J.L. Methodology and convergence rates for functional linear regression. Ann. Stat. 2007, 35, 70–91. [Google Scholar] [CrossRef]
Yu, P.; Song, X.; Du, J. Composite expectile estimation in partial functional linear regression model. J. Multivar. Anal. 2024, 203, 105343. [Google Scholar] [CrossRef]
Xia, X.; Liu, Z.; Yang, H. Regularized estimation for the least absolute relative error models with a diverging number of covariates. Comput. Stat. Data Anal. 2016, 96, 104–119. [Google Scholar] [CrossRef]
Tang, Q.; Cheng, L. Partial functional linear quantile regression. Sci. China Math. 2019, 57, 2589–2608. [Google Scholar] [CrossRef]
Febrero-Bande, M.; Fuente, M. Statistical Computing in Functional Data Analysis: The R Package fda.usc. J. Stat. Softw. 2012, 51, 1–28. [Google Scholar] [CrossRef]
Knight, K. Limiting distributions for L1 regression estimators under general conditions. Ann. Stat. 1998, 26, 755–770. [Google Scholar] [CrossRef]
Pollard, D. Asymptotics for Least Absolute Deviations Regression Estimators. Econom. Theory 1982, 7, 186–199. [Google Scholar] [CrossRef]

Figure 1. The estimated functional weight

β (t)

in model (7) with LPRE method.

Figure 1. The estimated functional weight

β (t)

in model (7) with LPRE method.

Table 1. The biases and standard deviations of the estimators for

α_{1}

and

α_{2}

, and the mean squared error of the estimators for

α

under different distributions (results

\times 100

).

Table 1. The biases and standard deviations of the estimators for

α_{1}

and

α_{2}

, and the mean squared error of the estimators for

α

under different distributions (results

\times 100

).

Errors	n	Criteria	LS	LAD	LARE	LPRE
(i)	150	$α_{1}$ /Bias (sd)	0.033 (1.268)	0.025 (1.607)	−0.041 (1.575)	0.032 (1.267)
		$α_{2}$ /Bias (sd)	0.086 (1.648)	−0.051 (3.064)	−0.019 (1.995)	0.086 (1.649)
		$α$ /MSE	0.043	0.120	0.065	0.043
	300	$α_{1}$ /Bias (sd)	−0.002 (0.731)	−0.010 (0.919)	−0.028 (0.893)	−0.001 (0.731)
		$α_{2}$ /Bias (sd)	−0.031 (1.014)	−0.188 (1.954)	−0.047 (1.294)	−0.031 (1.015)
		$α$ /MSE	0.016	0.047	0.025	0.016
	600	$α_{1}$ /Bias (sd)	−0.016 (0.442)	−0.003 (0.578)	0.005 (0.575)	−0.016 (0.442)
		$α_{2}$ /Bias (sd)	−0.029 (0.666)	−0.047 (1.132)	−0.031 (0.799)	−0.029 (0.666)
		$α$ /MSE	0.006	0.016	0.010	0.006
(ii)	150	$α_{1}$ /Bias (sd)	−0.047 (9.856)	−0.354 (15.946)	−0.070 (9.078)	−0.018 (8.179)
		$α_{2}$ /Bias (sd)	−0.026 (12.612)	−1.135 (30.386)	−0.078 (11.615)	−0.021 (10.537)
		$α$ /MSE	2.562	11.790	2.173	1.779
	300	$α_{1}$ /Bias (sd)	−0.421 (6.757)	−0.614 (11.549)	−0.421 (6.255)	−0.289 (5.548)
		$α_{2}$ /Bias (sd)	0.119 (10.009)	−0.572 (23.006)	0.086 (9.076)	0.016 (8.245)
		$α$ /MSE	1.460	6.634	1.217	0.988
	600	$α_{1}$ /Bias (sd)	0.085 (4.684)	0.097 (8.125)	0.079 (4.311)	0.049 (3.874)
		$α_{2}$ /Bias (sd)	0.354 (6.517)	0.447 (15.891)	0.325 (5.973)	0.297 (5.382)
		$α$ /MSE	0.645	3.188	0.544	0.441
(iii)	150	$α_{1}$ /Bias (sd)	−0.014 (2.888)	0.030 (4.275)	−0.027 (4.109)	−0.011 (2.846)
		$α_{2}$ /Bias (sd)	0.247 (3.767)	0.085 (8.040)	3.503 (5.390)	0.039 (3.719)
		$α$ /MSE	0.226	0.829	0.582	0.219
	300	$α_{1}$ /Bias (sd)	−0.130 (1.956)	−0.235 (3.032)	−0.206 (2.949)	−0.126 (1.926)
		$α_{2}$ /Bias (sd)	0.202 (2.921)	−0.014 (6.183)	4.028 (4.058)	−0.017 (2.861)
		$α$ /MSE	0.123	0.474	0.414	0.119
	600	$α_{1}$ /Bias (sd)	0.041 (1.347)	0.032 (2.133)	0.051 (2.052)	0.041 (1.327)
		$α_{2}$ /Bias (sd)	0.336 (1.851)	0.185 (4.112)	4.186 (2.674)	0.114 (1.822)
		$α$ /MSE	0.054	0.215	0.289	0.051

Table 2. The RASEs of the estimators for

β (t)

under different error distributions (results

\times 100

).

Table 2. The RASEs of the estimators for

β (t)

under different error distributions (results

\times 100

).

n	Methods	Error (i)	Error (ii)	Error (iii)
150	LS	30.752	34.818	31.769
	LAD	30.784	39.462	32.136
	LARE	30.781	34.429	32.098
	LPRE	30.752	33.920	31.760
300	LS	22.087	24.407	22.021
	LAD	22.104	28.314	22.348
	LARE	22.103	24.077	22.297
	LPRE	22.087	23.637	22.013
600	LS	15.245	16.884	15.363
	LAD	15.255	19.446	15.567
	LARE	15.254	16.643	15.556
	LPRE	15.245	16.391	15.358

Table 3. MQEP of different random partitions.

Methods	N = 100	N = 200	N = 400
LS	0.948	3.110	2.815
LAD	2.654	3.262	3.119
LARE	1.160	2.949	3.248
LPRE	0.898	2.696	1.995
PFLM	2.540	2.820	2.907

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Estimation for Partial Functional Multiplicative Regression Model

Abstract

1. Introduction

2. Estimation Method

2.1. LARE Estimation

2.2. LPRE Estimation

3. Asymptotic Properties

4. Implementation

5. Simulation Studies

6. Application to Tecator Data

7. Conclusions

Author Contributions

Funding

Data Availability Statement

Conflicts of Interest

Abbreviations

Appendix A

References

Article Metrics

Citations

Article Access Statistics