
Partially Linear Generalized Single Index Models for Functional Data (PLGSIMF)

1 Ecole Nationale des Sciences Appliquées, Université Cadi Ayyad, Marrakech 40001, Morocco
2 Laboratoire AGEIS, UFR SHS, Université Grenoble Alpes, BP. 47, CEDEX 09, 38040 Grenoble, France
3 Institut de Mathématiques de Toulouse, Université Paul Sabatier, CEDEX 9, 31062 Toulouse, France
* Author to whom correspondence should be addressed.
Stats 2021, 4(4), 793-813; https://doi.org/10.3390/stats4040047
Submission received: 8 April 2021 / Revised: 24 July 2021 / Accepted: 27 July 2021 / Published: 27 September 2021
(This article belongs to the Special Issue Functional Data Analysis (FDA))

Abstract

Single-index models are important tools for multivariate non-parametric regression analysis. They generalize linear regression models by replacing the linear combination $\alpha_0^\top X$ with a non-parametric component $\eta_0(\alpha_0^\top X)$, where $\eta_0(\cdot)$ is an unknown univariate link function. In this article, we extend the generalized partially linear single-index model $\eta_0(\alpha_0^\top X) + \beta_0^\top Z$ to include a functional component, where $\alpha_0$ is a vector in $\mathbb{R}^d$, and $\eta_0(\cdot)$ and $\beta_0(\cdot)$ are unknown functions to be estimated. We propose estimators of the unknown parameter $\alpha_0$ and of the unknown functions $\beta_0(\cdot)$ and $\eta_0(\cdot)$, and we establish their asymptotic distributions. Furthermore, a simulation study is carried out to evaluate the models and the effectiveness of the proposed estimation methodology.

1. Introduction

Generalized linear models (GLMs), $g(\mu(X)) = \beta^\top X$, were proposed by Nelder and Wedderburn [1]; for a detailed review, we refer the readers to McCullagh and Nelder [2]. A GLM consists of a random component and a systematic component, and assumes that the responses come from the exponential dispersion family. GLMs extend linear models by allowing a function of the mean of a continuous or discrete response to be related to the predictors through a canonical link function. These models encounter several problems: the canonical link function is sometimes unknown, the link between response and predictors can be complex, and the curse of dimensionality arises. To address these problems, several approaches have been developed. Hastie and Tibshirani [3] proposed generalized additive models (GAMs), in which the linear predictor depends linearly on smooth functions of the predictor variables; one criticism of these models is that they do not take interactions between covariates into consideration. The manuscripts of Wood [4] and Dunn and Smyth [5] are the latest references dealing with these two models.
The single-index model has been employed over the last few decades to reduce the dimensionality of data and avoid the "curse of dimensionality" while maintaining the advantages of non-parametric smoothing in multivariate regression; see, for example, the work of Lai et al. [6]. The single index $\alpha^\top X$ aggregates the influence of the observed values $X = (X_1, \ldots, X_d)^\top$ of the explanatory variables into one number. Examples of economic indices include stock indices, inflation indices, cost-of-living indices and price indices.
This idea was first extended to the functional setting by Ferraty, Vieu et al. [7] for functional regression problems, which led to the functional single-index regression model (FSIRM). The functional index acts as a filter permitting the extraction of the part of the predictor that explains the scalar response Y, and it plays an important role in such a model.
The predictor is generally not linear but complex, which prompted Carroll et al. [8] to propose the GPLSIM, $g(\mu(X, Z)) = \eta_0(\alpha_0^\top X) + \beta_0^\top Z$, applying the local quasi-likelihood function and kernel-type smoothing, with the function $\eta_0$ approximated by local linear methods. A GPLSIM of the form $g(\mu(X, Z)) = \beta^\top X + \varphi(\alpha^\top Z)$ was proposed by Chin-Shang Li et al. [9], in which the unknown smooth function of the single index is approximated by a spline function that can be expressed as a linear combination of B-spline basis functions, estimated with a modified Fisher-scoring method. Moreover, Wang and Cao [10] studied the GPLSIM $g(\mu(X, Z)) = \eta_0(\alpha_0^\top X) + \beta_0^\top Z$ by applying the quasi-likelihood and polynomial spline smoothing.
In recent years, the analysis of functional data has made considerable progress in several areas, including image processing, biomedical studies, environmental sciences and public health, and several researchers have focused their efforts on studying this type of data. We mention the work of Aneiros-Pérez and Vieu [11], and for more details, we refer to the books of Horváth and Kokoszka [12], Ferraty and Vieu [7], Aneiros-Pérez and Vieu [13], and Ramsay and Silverman [14]. Yu, Du and Zhang [15] proposed the SIPFLM, $Y = g(\alpha^\top X) + \int_T \beta(t) Z(t)\, dt$, combining the single-index model (SIM) and the functional linear model (FLM) by minimizing the sum of squares using a B-spline basis. Jiang Du et al. [16] proposed the GFPLM, $g(\mu(X, Z)) = \alpha^\top X + \int_T \beta(t) Z(t)\, dt$, using functional principal component analysis. Rachdi, Alahiane, Ouassou and Vieu presented a book chapter on the generalization of the GPLSIM at the IWFOS 2020 conference; see Rachdi et al. [17]. Our objective is to combine the GPLSIM with the SIPFLM and to consider the following generalized partially linear functional single-index model, called PLGSIMF, using a B-spline expansion and the quasi-likelihood function:
$$g(\mu(X, Z)) = \eta_0(\alpha^\top X) + \int_0^1 \beta(t) Z(t)\, dt,$$
in order to account for interaction effects, remedy the curse of dimensionality and take functional random variables into account.
The paper is organized as follows. In Section 1 and Section 2, we situate our model in the literature and present the Fisher-scoring update algorithm used to estimate the single-index vector, the non-parametric function and the slope function. In Section 3, we investigate the asymptotic properties of the estimators presented in the paper. Numerical simulations for the Gaussian case and the logistic case are presented in Section 4. The proofs of the results are developed in Section 6 and in the Appendix A, which contains the technical lemmas necessary to develop our asymptotic study of the non-parametric function, the single-index vector and the slope function.
Let H be a separable Hilbert space endowed with the scalar product $\langle \cdot, \cdot \rangle_H$ and the norm $\|\cdot\|_H$. Let Y be a scalar response variable and $(X, Z) \in \mathbb{R}^d \times H$ be the predictor pair, where $X = (X_1, \ldots, X_d)^\top$ and Z is a functional random variable valued in H. For a fixed $(x, z) \in \mathbb{R}^d \times H$, we assume that the conditional density function of the response Y given $(X, Z) = (x, z)$ belongs to the following canonical exponential family:
$$f_{Y|X=x,Z=z}(y) = \exp\left\{ y\, \xi(x, z) - B(\xi(x, z)) + C(y) \right\}, \qquad (1)$$
where B and C are two known functions defined from $\mathbb{R}$ into $\mathbb{R}$, and $\xi : \mathbb{R}^d \times H \to \mathbb{R}$ is the natural parameter, which is linked to the mean of the dependent variable through
$$\mu(x, z) = E[Y \mid X = x, Z = z] = B'(\xi(x, z)), \qquad (2)$$
where $B'$ denotes the first derivative of the function B. In what follows, we consider the function $g(\mu(x, z))$ as a generalized partially linear functional single-index model:
$$g(\mu(x, z)) = \eta_0(\alpha^\top x) + \int_0^1 \beta(t)\, z(t)\, dt, \qquad (3)$$
where $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_d)^\top \in \mathbb{R}^d$ is the d-dimensional single-index coefficient vector, β is the coefficient function of the functional component, and $\eta_0$ is the unknown single-index link function, which is assumed to be sufficiently smooth.
If the conditional variance is $\mathrm{Var}(Y \mid X = x, Z = z) = \sigma^2 V(\mu(x, z))$, where V is a positive variance function, then the estimate of the mean function may be obtained by replacing the log-likelihood associated with (1) by the quasi-likelihood $Q(u, v)$, which satisfies
$$\frac{\partial Q(u, v)}{\partial u} = \frac{v - u}{\sigma^2 V(u)}$$
for any real numbers u and v, and may therefore be written as
$$Q(u, v) = \int_v^u \frac{v - t}{\sigma^2 V(t)}\, dt. \qquad (4)$$
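As a quick numerical check, the quasi-likelihood can be computed by direct quadrature from its derivative property $\partial Q(u, v)/\partial u = (v - u)/(\sigma^2 V(u))$; in the Gaussian case ($V \equiv 1$, $\sigma^2 = 1$) it reduces to $-\tfrac{1}{2}(u - v)^2$. A minimal sketch, with function names that are ours and not from the paper:

```python
import numpy as np
from scipy.integrate import quad

def quasi_likelihood(u, v, V=lambda t: 1.0, sigma2=1.0):
    # Q(u, v) with dQ/du = (v - u) / (sigma^2 V(u)),
    # computed as the integral of (v - t) / (sigma^2 V(t)) from v to u,
    # so that differentiating in u recovers the stated derivative.
    val, _ = quad(lambda t: (v - t) / (sigma2 * V(t)), v, u)
    return val

# Gaussian case: V == 1 gives Q(u, v) = -(u - v)**2 / 2
q = quasi_likelihood(1.0, 3.0)   # -2.0
```

Note that $Q(u, v)$ is maximized in its first argument at $u = v$, as a log-likelihood surrogate should be.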

2. Estimation Methodology

Let $\{(X_i, Y_i, Z_i)\}_{i=1,\ldots,n}$ be a sequence of random variables independent and identically distributed (i.i.d.) as $(X, Y, Z)$, so that, for each $i = 1, \ldots, n$,
$$g(\mu(X_i, Z_i)) = \eta_0(\alpha^\top X_i) + \int_0^1 \beta(t) Z_i(t)\, dt.$$
We assume that the function $\eta_0$ is supported on the interval $[a, b]$, where $a = \inf(\alpha^\top X)$ and $b = \sup(\alpha^\top X)$.
We introduce a sequence of knots in the interval $[a, b]$, with J interior knots, such that
$$k_{-r+1} = \cdots = k_{-1} = k_0 = a < k_1 < \cdots < k_J < k_{J+1} = \cdots = k_{J+r} = b,$$
where $J := J_n$ is a sequence of integers that increases with the sample size n. Now, let $N_n = J_n + r$ be the number of B-spline basis functions, $\{B_j(u)\}_{j=1,\ldots,N_n}$ the B-spline basis functions of order r, and $h = (b - a)/(J_n + 1)$ the distance between neighboring knots.
Let $S_n$ be the space of polynomial splines on $[a, b]$ of order $r \geq 1$. By De Boor [18], we can approximate $\eta_0$, assumed to belong to $H^{(p)}$ (which will be defined in Section 3), by a function $\tilde\eta \in S_n$. So, we can write $\tilde\eta(u) = \tilde\gamma^\top B(u)$, where $B(u)$ is the spline basis vector and $\tilde\gamma \in \mathbb{R}^{N_n}$ is the spline coefficient vector.
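To illustrate the approximation $\tilde\eta(u) = \tilde\gamma^\top B(u)$, the sketch below builds an order-$r$ B-spline basis with the clamped knot scheme above (so that $N_n = J + r$) and fits a smooth function by least squares. This uses SciPy; the helper names are ours, not from the paper:

```python
import numpy as np
from scipy.interpolate import BSpline

def bspline_basis(a, b, J, r):
    # Clamped knot vector: a repeated r times, J equally spaced interior
    # knots, then b repeated r times -> J + r basis functions of order r.
    interior = np.linspace(a, b, J + 2)[1:-1]
    knots = np.concatenate([np.repeat(a, r), interior, np.repeat(b, r)])
    deg = r - 1
    n_basis = len(knots) - r                  # N_n = J + r
    def B(u):
        u = np.atleast_1d(np.asarray(u, dtype=float))
        out = np.empty((len(u), n_basis))
        for j in range(n_basis):
            c = np.zeros(n_basis)
            c[j] = 1.0                         # j-th basis spline B_j
            out[:, j] = BSpline(knots, c, deg, extrapolate=False)(u)
        return np.nan_to_num(out)
    return B, n_basis

# Approximate a smooth link eta0 by eta(u) = gamma' B(u) via least squares
eta0 = np.sin
a, b = 0.0, np.pi
B, N = bspline_basis(a, b, J=10, r=4)          # cubic splines (order 4)
u = np.linspace(a, b, 200)
gamma, *_ = np.linalg.lstsq(B(u), eta0(u), rcond=None)
err = np.max(np.abs(B(u) @ gamma - eta0(u)))   # small for smooth eta0
```

The least-squares step is a stand-in for the quasi-likelihood fit; it shows only how the coefficient vector $\tilde\gamma$ parameterizes the spline approximation.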
We introduce a new knot sequence $0 = t_0 < t_1 < \cdots < t_{k+1} = 1$ of $[0, 1]$. Then, there exist $N = k + r + 1$ normalized B-spline basis functions of order r such that
$$\beta(\cdot) \approx \delta^\top B_2(\cdot), \quad \text{where } B_2(\cdot) = \left( B_{21}(\cdot), B_{22}(\cdot), \ldots, B_{2N}(\cdot) \right)^\top \text{ and } \delta \in \mathbb{R}^N.$$
By setting
$$W = \left( \int_0^1 Z(t) B_{21}(t)\, dt, \ldots, \int_0^1 Z(t) B_{2N}(t)\, dt \right)^\top, \qquad (5)$$
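In practice, each $W_i$ is computed from a discretized trajectory; the sketch below approximates the integrals in (5) by the trapezoidal rule on a grid, using a toy curve $Z(t) = \sin(\pi t)$ and a stand-in polynomial basis in place of the B-spline basis $B_2$ (both are our illustrative choices):

```python
import numpy as np

def trapezoid(y, x):
    # composite trapezoidal rule along the last axis
    dx = np.diff(x)
    return np.sum((y[..., :-1] + y[..., 1:]) * dx / 2.0, axis=-1)

# W = (integral of Z(t) B_2j(t) dt for j = 1..N), evaluated on a grid
t = np.linspace(0.0, 1.0, 501)
Z = np.sin(np.pi * t)                           # toy trajectory observed on the grid
basis = np.stack([np.ones_like(t), t, t ** 2])  # stand-in for the basis B_2
W = trapezoid(Z[None, :] * basis, t)            # one entry per basis function
```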
with w and $W_i$ defined accordingly to (5), the mean function estimator $\hat\mu(x, z)$ is then obtained by evaluating the parameter $\theta = (\alpha^\top, \gamma^\top, \delta^\top)^\top$ and inverting the equation $g(\hat\mu(x, z)) = \hat\gamma^\top B_1(\hat\alpha^\top x) + \hat\delta^\top w$. Notice that the parameter θ is determined by maximizing the following quasi-likelihood rule:
$$\hat\theta = (\hat\alpha^\top, \hat\gamma^\top, \hat\delta^\top)^\top = \arg\max_{\theta = (\alpha, \gamma, \delta) \in \mathbb{R}^d \times \mathbb{R}^{N_n} \times \mathbb{R}^N} l(\theta),$$
where $l(\theta) := l(\alpha, \gamma, \delta) = \frac{1}{n} \sum_{i=1}^n Q\left( g^{-1}(m_i), Y_i \right)$, with
$$m(x, z) = \gamma^\top B_1(\alpha^\top x) + \delta^\top w,$$
$$m_i := \gamma^\top B_1(\alpha^\top X_i) + \delta^\top W_i \quad \text{and} \quad m_{0i} = \gamma_0^\top B_1(U_{0i}) + \delta_0^\top W_i,$$
where $U_{0i} = \alpha_0^\top X_i$, with $\alpha_0$, $\gamma_0$, $\delta_0$, $\eta_0$ and $\beta_0$ denoting the true values, respectively, of α, γ, δ, η and β.
To handle the constraints $\|\alpha\| = 1$ and $\alpha_1 > 0$ on the d-dimensional index α, we proceed by a re-parameterization similar to that of Yu and Ruppert [19]:
$$\alpha(\tau) = \left( \sqrt{1 - \|\tau\|^2}, \tau^\top \right)^\top \quad \text{for } \tau \in \mathbb{R}^{d-1}.$$
The true value $\tau_0$ of τ must satisfy $\|\tau_0\| \leq 1$; we assume that $\|\tau_0\| < 1$. The Jacobian matrix of $\alpha : \tau \mapsto \alpha(\tau)$, of dimension $d \times (d - 1)$, is $J(\tau)$. Notice that τ is unconstrained and has one dimension fewer than α.
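A small sketch of this re-parameterization (function names ours): $\alpha(\tau)$ has unit norm by construction, and the Jacobian $J(\tau)$ stacks the row $-\tau^\top/\sqrt{1 - \|\tau\|^2}$ on top of the $(d-1)$-dimensional identity.

```python
import numpy as np

def alpha_of_tau(tau):
    # alpha(tau) = (sqrt(1 - ||tau||^2), tau')' -- unit norm, alpha_1 > 0
    tau = np.asarray(tau, dtype=float)
    return np.concatenate([[np.sqrt(1.0 - tau @ tau)], tau])

def jacobian(tau):
    # J(tau) = d alpha / d tau', of dimension d x (d - 1)
    tau = np.asarray(tau, dtype=float)
    top = -tau / np.sqrt(1.0 - tau @ tau)   # derivative of the first entry
    return np.vstack([top, np.eye(len(tau))])

tau = np.array([0.3, 0.4])
alpha = alpha_of_tau(tau)   # unit-norm 3-vector
J = jacobian(tau)           # 3 x 2 Jacobian
```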
Finally, let
$$R(\tau) = \begin{pmatrix} J(\tau) & 0 \\ 0 & I_{N \times N} \end{pmatrix}$$
be the Jacobian matrix of $(\alpha(\tau)^\top, \delta^\top)^\top$, which is of dimension $(d + N) \times (d + N - 1)$. Let
$$(\tilde\alpha, \tilde\delta) = \arg\max_{(\alpha, \delta) \in \mathbb{R}^d \times \mathbb{R}^N,\ \tau \in \mathbb{R}^{d-1}} \frac{1}{n} \sum_{i=1}^n Q\left( g^{-1}\left( \tilde\eta(\alpha(\tau)^\top X_i) + \delta^\top W_i \right), Y_i \right) \quad \text{and} \quad T_i = (X_i^\top, W_i^\top)^\top.$$
Denote
$$m_i = \gamma^\top B_1(\alpha^\top X_i) + \delta^\top W_i, \qquad T_i = (X_i^\top, W_i^\top)^\top,$$
$$m_{0i} = m_{0i}(X_i, W_i) = \gamma_0^\top B_1(\alpha_0^\top X_i) + \delta_0^\top W_i = \gamma_0^\top B_1(U_{0i}) + \delta_0^\top W_i \quad \text{with } U_{0i} = \alpha_0^\top X_i,$$
$$m_0(T) = \gamma_0^\top B_1(\alpha_0^\top X) + \delta_0^\top W = \gamma_0^\top B_1(U_0) + \delta_0^\top W \quad \text{with } U_0 = \alpha_0^\top X,$$
and $(\tilde\tau, \tilde\delta) = \arg\max_{\tau, \delta} \tilde{l}(\tau, \delta)$, where $\tilde{l}(\tau, \delta) = \frac{1}{n} \sum_{i=1}^n Q\left( g^{-1}\left( \tilde\eta(\alpha(\tau)^\top X_i) + \delta^\top W_i \right), Y_i \right)$. Note that $\theta_\tau = (\tau^\top, \gamma^\top, \delta^\top)^\top$ is a $(d - 1 + N_n + N)$-dimensional parameter, while θ is a $(d + N_n + N)$-dimensional one. Let
$$\rho_l(m) = \frac{1}{\sigma^2 V(g^{-1}(m))} \left( \frac{d}{dm} g^{-1}(m) \right)^l$$
and denote
$$q_l(m, y) = \frac{\partial^l}{\partial m^l} Q\left( g^{-1}(m), y \right), \quad \text{for } l = 1, 2.$$
Then,
$$q_1(m, y) = \left( y - g^{-1}(m) \right) \rho_1(m) \quad \text{and} \quad q_2(m, y) = \left( y - g^{-1}(m) \right) \rho_1'(m) - \rho_2(m).$$
So, $l(\theta_\tau)$ becomes
$$l(\theta_\tau) = \frac{1}{n} \sum_{i=1}^n Q\left( g^{-1}\left( \gamma^\top B_1(\alpha(\tau)^\top X_i) + \delta^\top W_i \right), Y_i \right) = \frac{1}{n} \sum_{i=1}^n Q\left( g^{-1}(m_i), Y_i \right).$$
The score vector is then
$$S(\theta_\tau) = \frac{\partial l(\theta_\tau)}{\partial \theta_\tau} = \frac{1}{n} \sum_{i=1}^n q_1(m_i, Y_i)\, \xi_i(\tau, \gamma, \delta),$$
where
$$\xi_i(\tau, \gamma, \delta) = \begin{pmatrix} \gamma^\top B_1'(\alpha(\tau)^\top X_i)\, J(\tau)^\top X_i \\ B_1(\alpha(\tau)^\top X_i) \\ W_i \end{pmatrix}.$$
The expectation of the Hessian matrix is
$$H(\theta_\tau) = E\left[ \frac{\partial^2 l(\theta_\tau)}{\partial \theta_\tau \partial \theta_\tau^\top} \right] = -\frac{1}{n} \sum_{i=1}^n \rho_2(m_i)\, \xi_i(\tau, \gamma, \delta)\, \xi_i(\tau, \gamma, \delta)^\top.$$
The Fisher-scoring update equation, $\theta_\tau^{(k+1)} = \theta_\tau^{(k)} - H(\theta_\tau^{(k)})^{-1} S(\theta_\tau^{(k)})$, becomes
$$\theta_\tau^{(k+1)} = \theta_\tau^{(k)} + \left[ \sum_{i=1}^n \rho_2(m_i^{(k)})\, \xi_i(\tau^{(k)}, \gamma^{(k)}, \delta^{(k)})\, \xi_i(\tau^{(k)}, \gamma^{(k)}, \delta^{(k)})^\top \right]^{-1} \sum_{i=1}^n \left( Y_i - \mu_i^{(k)} \right) \rho_1(m_i^{(k)})\, \xi_i(\tau^{(k)}, \gamma^{(k)}, \delta^{(k)}),$$
where $m_i^{(k)} = \gamma^{(k)\top} B_1(\alpha(\tau^{(k)})^\top X_i) + \delta^{(k)\top} W_i$, for $1 \leq i \leq n$, and $\mu_i^{(k)} = g^{-1}(m_i^{(k)})$.
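The update above is the usual iteratively reweighted recursion. As a self-contained illustration (not the full PLGSIMF fit), the sketch below runs the same recursion for a plain logistic regression, where $\xi_i = X_i$ and, for the canonical logit link, $\rho_1 \equiv 1$ and $\rho_2(m) = \mu(1 - \mu)$; the simulated data and all names are ours:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 400, 3
X = rng.normal(size=(n, d))
theta_true = np.array([1.0, -2.0, 0.5])
p = 1.0 / (1.0 + np.exp(-X @ theta_true))
Y = rng.binomial(1, p)

theta = np.zeros(d)
for _ in range(50):
    m = X @ theta
    mu = 1.0 / (1.0 + np.exp(-m))
    S = X.T @ (Y - mu)                          # score: sum (Y_i - mu_i) xi_i
    H = (X * (mu * (1.0 - mu))[:, None]).T @ X  # sum rho_2(m_i) xi_i xi_i'
    step = np.linalg.solve(H, S)
    theta = theta + step                        # Fisher-scoring update
    if np.max(np.abs(step)) < 1e-10:
        break
```

At convergence, the score is numerically zero and `theta` is the maximum quasi-likelihood estimate for this toy model.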
It follows that, with k the final iteration,
$$\hat\beta(t) = \hat\delta^\top B_2(t) = \delta^{(k)\top} B_2(t), \qquad \hat\eta(t) = \hat\gamma^\top B_1(t) = \gamma^{(k)\top} B_1(t),$$
$$\hat m_i = \hat\gamma^\top B_1(\alpha(\hat\tau)^\top X_i) + \hat\delta^\top W_i = \gamma^{(k)\top} B_1(\alpha(\tau^{(k)})^\top X_i) + \delta^{(k)\top} W_i,$$
where $\hat\mu_i = g^{-1}(\hat m_i)$, and $\hat\alpha = \alpha(\tau^{(k)})$ is the estimator of the single-index coefficient vector of the PLGSIMF model.

3. Some Asymptotics

We present asymptotic properties of the estimators for the non-parametric components, the functional component, the single-index coefficient vector and the slope function of the PLGSIMF model. For this aim, we will need some assumptions.

3.1. Some Additional Notions and Assumptions

Let $\varphi$, $\varphi_1$ and $\varphi_2$ be measurable functions on $[a, b]$. We define the empirical inner product $\langle \varphi_1, \varphi_2 \rangle_n$ and its corresponding norm $\|\varphi\|_n$ as follows:
$$\langle \varphi_1, \varphi_2 \rangle_n = \frac{1}{n} \sum_{i=1}^n \varphi_1(U_i) \varphi_2(U_i) \quad \text{and} \quad \|\varphi\|_n^2 = \frac{1}{n} \sum_{i=1}^n \varphi^2(U_i), \quad \text{where } U_i = \alpha^\top X_i.$$
If $\varphi$, $\varphi_1$ and $\varphi_2$ are $L_2$-integrable, we define the theoretical inner product and its corresponding norm as follows:
$$\langle \varphi_1, \varphi_2 \rangle = E[\varphi_1(U) \varphi_2(U)] \quad \text{and} \quad \|\varphi\|_2^2 = E[\varphi^2(U)] = \int_a^b \varphi^2(u) f(u)\, du.$$
Let $v \in \mathbb{N}^*$ and $e \in (0, 1]$ be such that $p = v + e > 1.5$. We denote by $H^{(p)}$ the collection of functions g defined on $[a, b]$ whose v-th order derivative $g^{(v)}$ exists and satisfies the following e-th order Lipschitz condition:
$$\left| g^{(v)}(m') - g^{(v)}(m) \right| \leq C |m' - m|^e, \quad \text{for all } a \leq m, m' \leq b.$$
Let $\varepsilon = Y - g^{-1}(m_0(T))$, where $T = (X^\top, W^\top)^\top$.
(C1) The single-index link function $\eta_0 \in H^{(p)}$, where $H^{(p)}$ is defined as above.
(C2) For all $m \in \mathbb{R}$ and all y in the range of the response variable Y, the function $q_2(m, y)$ is strictly negative, and there exist positive constants $c_q$ and $C_q$ such that $c_q < |q_2(m, y)| < C_q$.
(C3) The marginal density function of $\alpha^\top X$ is continuous and bounded away from zero and infinity on its support $[a, b]$. The v-th order partial derivatives of the joint density function of X satisfy a Lipschitz condition of order $\alpha$ ($\alpha \in (0, 1]$).
(C4) For any vector τ, there exist positive constants $c_\tau$ and $C_\tau$ such that
$$c_\tau I_{t \times t} \leq E\left[ (1, T^\top)^\top (1, T^\top) \mid \alpha(\tau)^\top X = \alpha(\tau)^\top x \right] \leq C_\tau I_{t \times t},$$
where $t = 1 + N_n + N$ and $T = (X^\top, W^\top)^\top$.
(C5) The number of knots $N_n$ satisfies $n^{1/(2(p+1))} \ll N_n \ll n^{1/8}$, for $p > 3$.
(C6) The fourth-order moment of the random variable Z is finite, i.e., $E\|Z(\cdot)\|^4 \leq C$, where C denotes a generic positive constant.
(C7) The covariance function $K(t, s) = \mathrm{Cov}(Z(t), Z(s))$ is positive definite.
(C8) The slope function β is an r-th order continuously differentiable function, i.e., $\beta \in C^r([0, 1])$.
(C9) For some finite positive constants $C_\rho$, $C_\rho^*$ and $M_0$,
$$|\rho_1(m_0)| \leq C_\rho \quad \text{and} \quad |\rho_1(m) - \rho_1(m_0)| \leq C_\rho^* |m - m_0| \quad \text{for all } |m - m_0| \leq M_0.$$
(C10) For some finite positive constants $C_g$, $C_g^*$ and $M_1$, the link function g in the model (3) satisfies $\left| \frac{d}{dm} g^{-1}(m) \big|_{m = m_0} \right| \leq C_g$ and, for all $|m - m_0| \leq M_1$,
$$\left| \frac{d}{dm} g^{-1}(m) - \frac{d}{dm} g^{-1}(m) \Big|_{m = m_0} \right| \leq C_g^* |m - m_0|.$$
(C11) There exists a positive constant $C_0$ such that $E(\epsilon^2 \mid U_{\tau,0}) \leq C_0$, where $\epsilon = Y - g^{-1}(m_0(T))$.

3.2. Estimators Consistencies

Next we formulate several assertions on the considered estimators.

3.3. Estimation of the Nonparametric Component

The following theorem states the convergence, with rates, of the estimator η ^ .
Theorem 1.
Under assumptions (C6)–(C8), we have
$$\| \hat\eta - \eta_0 \|_2 = O_{\mathbb{P}}\left( N_n \left( \frac{1}{\sqrt{nh}} + h^p \right) \right) \quad \text{and} \quad \| \hat\eta - \eta_0 \|_n = O_{\mathbb{P}}\left( N_n \left( \frac{1}{\sqrt{nh}} + h^p \right) \right),$$
where $O_{\mathbb{P}}$ denotes "big O" in probability.
Proof of Theorem 1.
The proof of the previous theorem is given in the Appendix A. □

3.4. Estimation of the Slope Function

Theorem 2.
Under assumptions (C1)–(C8), and $k \sim n^{1/(2r+1)}$, we have
$$\| \hat\beta(\cdot) - \beta_0(\cdot) \|_2 = O_{\mathbb{P}}\left( N_n^2 \left( h^p + \frac{1}{\sqrt{nh}} \right)^2 \right) + O_{\mathbb{P}}\left( n^{-2r/(2r+1)} \right).$$
Proof of Theorem 2.
The proof of the previous theorem is given in the Appendix A. □

3.5. Estimation of the Parametric Components

The next theorem shows that the maximum quasi-likelihood estimator is root-n consistent and asymptotically normal, although the convergence rate of the non-parametric component $\hat\eta$ is slower than root-n. Before stating the theorem, let us denote
$$\Upsilon(u_{\tau,0}) = \frac{E[X \rho_2(m_0(T)) \mid U_{\tau,0} = u_{\tau,0}]}{E[\rho_2(m_0(T)) \mid U_{\tau,0} = u_{\tau,0}]}, \qquad \Gamma(u_{\tau,0}) = \frac{E[W \rho_2(m_0(T)) \mid U_{\tau,0} = u_{\tau,0}]}{E[\rho_2(m_0(T)) \mid U_{\tau,0} = u_{\tau,0}]},$$
$$\Phi(x) = \Phi(U_{\tau,0}, x) = x - \Upsilon(u_{\tau,0}) \quad \text{and} \quad \Psi(w) = \Psi(U_{\tau,0}, w) = w - \Gamma(u_{\tau,0}).$$
Theorem 3.
Under assumptions (C1)–(C11), the constrained quasi-likelihood estimators $\hat\alpha$ and $\hat\delta$, with $\|\hat\alpha\| = 1$, are jointly asymptotically normally distributed, i.e.,
$$\sqrt{n} \begin{pmatrix} \hat\alpha - \alpha_0 \\ \hat\delta - \delta_0 \end{pmatrix} \xrightarrow{D} N\left( 0, R(\tau_0) D^{-1} R(\tau_0)^\top \right),$$
where $\xrightarrow{D}$ denotes convergence in distribution,
$$D = E\left[ \rho_2(m_0(T)) \begin{pmatrix} \eta_0'(U_{\tau,0}) J(\tau_0)^\top \Phi(X) \\ \Psi(W) \end{pmatrix} \begin{pmatrix} \eta_0'(U_{\tau,0}) J(\tau_0)^\top \Phi(X) \\ \Psi(W) \end{pmatrix}^\top \right],$$
and
$$R(\tau) = \begin{pmatrix} J(\tau) & 0 \\ 0 & I_{N \times N} \end{pmatrix}.$$
Proof of Theorem 3.
The proof of the previous theorem is given in the Appendix A. □

Comments on the Assumptions

The smoothness condition (C1) ensures that the single-index function $\eta_0(\cdot)$ can be approximated by functions in the B-spline space with a normalized basis. Condition (C2) ensures the uniqueness of the solution, while condition (C3) is a smoothness condition on the joint and marginal density functions of $\alpha^\top X$ and X. Condition (C5) controls the rate of growth of the dimension of the spline spaces relative to the sample size. Conditions (C6) and (C7) are required for the functional covariate Z, and (C8) is a smoothness condition on the slope function. Conditions (C4) and (C9)–(C11) are technical conditions that will be used to prove the theorems stated in this article.
To summarize, in this paper we introduce a new generalized functional partially linear single-index model based on polynomial spline smoothing. The asymptotic properties of the resulting estimators are established under certain regularity assumptions, and the non-parametric component η and the slope function β are estimated by B-spline functions. Finally, we give some simulations to illustrate our results.

4. A Numerical Study

We conduct a simulation study in order to show our results’ effectiveness. We will treat two main cases of link functions: the identity and the logit link functions.
Recall that if the density of Y given X = x and Z = z is
$$f_{Y|X=x,Z=z}(y) = \exp\left\{ y\, \xi(x, z) - B(\xi(x, z)) + C(y) \right\},$$
then the mean and variance functions satisfy
$$\mu(x, z) = E[Y \mid X = x, Z = z] = B'(\xi(x, z)) \quad \text{and} \quad \mathrm{Var}(Y \mid X = x, Z = z) = B''(\xi(x, z))\, \sigma^2 = \sigma^2 V(\mu(x, z)).$$

4.1. Case 1: Identity Link Function

We consider the case where the link function is the identity and the model
$$Y_i = \sin\left( \frac{\pi (\alpha^\top X_i - A)}{B - A} \right) + \int_0^1 \beta(t) Z_i(t)\, dt + \varepsilon_i, \quad i = 1, \ldots, n. \qquad (6)$$
The responses $Y_i$ are simulated according to Equation (6), the components of $X_i$ are drawn uniformly over the interval $[-0.5, 0.5]$, and the errors are normally distributed with mean 0 and variance 0.01, $\varepsilon_i \sim N(0, 0.01)$. Moreover, we take the following coefficients:
$$\alpha = \frac{1}{\sqrt{3}} (1, 1, 1)^\top, \qquad A = \frac{\sqrt{3}}{2} - \frac{1.645}{\sqrt{12}} \quad \text{and} \quad B = \frac{\sqrt{3}}{2} + \frac{1.645}{\sqrt{12}}.$$
The functions $\beta(\cdot)$ and $Z_i(\cdot)$ are given by
$$\beta(t) = \sqrt{2} \sin\left( \frac{\pi t}{2} \right) + 3\sqrt{2} \sin\left( \frac{3\pi t}{2} \right) \quad \text{and} \quad Z_i(t) = \sum_{j=1}^{50} \xi_j v_j(t),$$
where $v_j(t) = \sqrt{2} \sin((j - 0.5)\pi t)$, $\xi_j \sim N(0, \lambda_j)$ and $\lambda_j = ((j - 0.5)\pi)^{-2}$.
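The trajectories $Z_i$ can be simulated by truncating the Karhunen-Loève expansion above on a time grid. A sketch follows, with grid, seed and variable names of our choosing; reading the eigenvalues as $\lambda_j = ((j - 0.5)\pi)^{-2}$, these are the Brownian-motion eigenpairs, so $\mathrm{Var}(Z(1)) \approx 1$:

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 101)
n, J = 500, 50

j = np.arange(1, J + 1)
lam = ((j - 0.5) * np.pi) ** (-2.0)                         # eigenvalues lambda_j
v = np.sqrt(2.0) * np.sin(np.outer((j - 0.5) * np.pi, t))   # eigenfunctions v_j(t)
xi = rng.normal(0.0, np.sqrt(lam), size=(n, J))             # scores xi_j ~ N(0, lambda_j)
Z = xi @ v                                                  # Z_i(t), shape (n, len(t))

# functional part of the regression: integral of beta(t) Z_i(t) dt (trapezoid rule)
beta = np.sqrt(2.0) * np.sin(np.pi * t / 2) + 3.0 * np.sqrt(2.0) * np.sin(3.0 * np.pi * t / 2)
prod = Z * beta
dt = t[1] - t[0]
functional_part = (prod[:, :-1] + prod[:, 1:]).sum(axis=1) * dt / 2.0
```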
The number of knots is selected according to the formula $C\, n^{1/(2r)} \log(n)$, where $C \in [0.3, 1]$ (as in Wang and Cao [10]). We chose $C = 0.6$ and performed 300 replications with samples of sizes $n = 500$ and $n = 1000$.
Computations of the bias, the standard deviation (SD) and the mean squared error (MSE) with respect to (i) the parameter τ, (ii) the parameter γ and (iii) the parameter δ are summarized, for $n = 500$ (respectively, $n = 1000$), in Table 1, Table 2 and Table 3 (respectively, Table 4, Table 5 and Table 6).

4.2. Case 2: Logit Link Function

By taking a logit link function, data are generated from the model
$$\mathrm{logit}\left\{ P[Y_i = 1 \mid X_i, Z_i] \right\} = \sin\left( \frac{\pi (\alpha^\top X_i - A)}{B - A} \right) + \int_0^1 \beta(t) Z_i(t)\, dt + \varepsilon_i, \quad i = 1, \ldots, n,$$
for which we have kept the same parameters and variables as for the identity link function. Then, similarly to the identity-link case, computations of the bias, SD and MSE with respect to the parameters τ, γ and δ are summarized, for $n = 500$ (respectively, $n = 1000$), in Table 7, Table 8 and Table 9 (respectively, Table 10, Table 11 and Table 12).
The quality of the estimators is illustrated by these simulations: the method performs quite well, and the bias, SD and MSE are reasonably small in general. The parametric and non-parametric components, the single index and the slope function are all computed by the procedure given in this paper. The tables also indicate the consistency of $\hat\alpha$ and $\hat\delta$, as the bias, SD and MSE decrease when the sample size increases. We developed our algorithm for both the identity link function and the logistic link function, and the simulations show that the PLGSIMF algorithm works well in both cases.
Figure 1 illustrates 500 realizations of the functional random variable Z. Figure 2 shows the near linearity between the single index $u = \alpha(\tau)^\top X$ and its estimate $\hat{u} = \alpha(\hat\tau)^\top X$. Figure 3 plots the slope function $\beta(\cdot)$ and its estimator $\hat\beta(\cdot)$: our model approximates the slope function well. Figure 4 compares the non-parametric function $\eta(\cdot)$ with its estimator $\hat\eta(\cdot)$: our model also approximates the non-parametric function well.
To study the performance of our estimators of the non-parametric function $\eta(\cdot)$ and of the slope function $\beta(\cdot)$, we use the square root of average squared errors criterion (RASE; see Peng et al. [20]):
$$\mathrm{RASE}_1 = \left\{ \frac{1}{n} \sum_{i=1}^n \left( \hat\eta(u_i) - \eta(u_i) \right)^2 \right\}^{1/2} \quad \text{and} \quad \mathrm{RASE}_2 = \left\{ \frac{1}{n} \sum_{i=1}^n \left( \hat\beta(t_i) - \beta(t_i) \right)^2 \right\}^{1/2}.$$
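The RASE criterion is straightforward to compute from the fitted and true values on an evaluation grid; a minimal sketch (the function name is ours):

```python
import numpy as np

def rase(fitted, truth):
    # square root of the average squared errors over the evaluation points
    fitted = np.asarray(fitted, dtype=float)
    truth = np.asarray(truth, dtype=float)
    return float(np.sqrt(np.mean((fitted - truth) ** 2)))
```

For instance, `rase([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])` equals $\sqrt{4/3}$.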
The following tables (Table 13 and Table 14) summarize the sample means, medians and variances of the $\mathrm{RASE}_i$ ($i = 1, 2$) for different sample sizes in the Gaussian case.
For the case n = 500 , we get
Table 13. The RASE criterion for the non-parametric function $\eta(\cdot)$ and the slope function $\beta(\cdot)$ for $n = 500$.

Gaussian Case | Mean  | Median | Var
RASE_1        | 0.039 | 0.038  | 0.003
RASE_2        | 0.123 | 0.122  | 0.020
For the case n = 1000 , we get
Table 14. The RASE criterion for the non-parametric function $\eta(\cdot)$ and the slope function $\beta(\cdot)$ for $n = 1000$.

Gaussian Case | Mean  | Median | Var
RASE_1        | 0.016 | 0.016  | 0.001
RASE_2        | 0.027 | 0.125  | 0.006
The following tables (Table 15 and Table 16) summarize the sample means, medians and variances of the RASE i ( i = 1 , 2 ) with different sample sizes in the Logistic case.
For the case n = 500 , we get
Table 15. The RASE criterion for the non-parametric function $\eta(\cdot)$ and the slope function $\beta(\cdot)$ for $n = 500$.

Logistic Case | Mean  | Median | Var
RASE_1        | 0.045 | 0.044  | 0.023
RASE_2        | 0.133 | 0.103  | 0.012
For the case where n = 1000 , we get
Table 16. The RASE criterion for the non-parametric function $\eta(\cdot)$ and the slope function $\beta(\cdot)$ for $n = 1000$.

Logistic Case | Mean  | Median | Var
RASE_1        | 0.028 | 0.026  | 0.014
RASE_2        | 0.124 | 0.121  | 0.002
We conclude that as the sample size n increases from 500 to 1000, the sample mean, median and variance of RASE i (i = 1, 2) decrease.

5. Application to Tecator Data

In this section, we apply the PLGSIMF model to the Tecator data, which are well known in functional data analysis. The data can be downloaded from http://lib.stat.cmu.edu/datasets/tecator (accessed on 1 August 2021); for more details, see Ferraty and Vieu [7].
For 215 finely chopped pieces of meat, the Tecator data contain the corresponding fat contents ($Y_i$, $i = 1, \ldots, 215$), the near-infrared absorbance spectra ($Z_i$, $i = 1, \ldots, 215$) observed at 100 equally spaced wavelengths in the range 850–1050 nm, the protein contents $X_{1,i}$ and the moisture contents $X_{2,i}$. We aim to predict the fat content of the finely chopped meat samples.
The following figure (Figure 5) shows the absorbance curves.
We divide the sample randomly into two sub-samples: a training sample $I_1$ of size 160 and a test sample $I_2$ of size 55. The training sample is used to estimate the parameters, and the test sample is employed to assess the quality of the predictions. To evaluate our model, we use the mean squared error of prediction (MSEP), as in Aneiros-Pérez and Vieu [11], defined as follows:
$$\mathrm{MSEP} = \frac{1}{55} \sum_{i \in I_2} (Y_i - \hat{Y}_i)^2 / \mathrm{var}_{I_2}(Y_i),$$
where $\hat{Y}_i$ is the predicted value based on the training sample and $\mathrm{var}_{I_2}$ is the variance of the response variable over the test sample.
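The MSEP is the test-sample mean squared prediction error normalized by the test-sample variance of the response, so that a constant predictor equal to the test mean scores about 1. A sketch (the function name is ours):

```python
import numpy as np

def msep(y_test, y_pred):
    # mean squared error of prediction over the test sample I_2,
    # normalized by the variance of the responses in I_2
    y_test = np.asarray(y_test, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_test - y_pred) ** 2) / np.var(y_test))
```

Perfect predictions give 0, while predicting the test-sample mean gives exactly 1; competitive models should score well below 1.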
Table 17 shows the performance of our PLGSIMF model compared with other models; we can conclude that PLGSIMF is a competitive model for such data. Figure 6 shows the estimator $\hat\eta(\cdot)$ of the non-parametric function, and Figure 7 shows the estimator $\hat\beta(\cdot)$ of the slope function.

6. Proofs

In what follows, when no confusion is possible, we will denote by C a generic positive constant.
The following lemmas will be used to prove Theorem 1; see also [21,22,23]. The proofs of these lemmas are developed in the Appendix A.
Lemma 1.
Under assumptions (C1)–(C4) and (C6)–(C8), we have
$$\sqrt{n} \begin{pmatrix} \tilde\tau - \tau_0 \\ \tilde\delta - \delta_0 \end{pmatrix} \xrightarrow{D} N\left( 0, A^{-1} \Sigma_1 A^{-1} \right),$$
where $\xrightarrow{D}$ denotes convergence in distribution, and $\Sigma_1$ and A are defined as follows (see the appendix for more details):
$$A = \begin{pmatrix} A_{11} & A_{12} \\ A_{12}^\top & A_{22} \end{pmatrix}, \quad \text{with}$$
$$A_{11} = E\left[ \rho_2(m_0(T)) \left( \eta_0'(U_{\tau,0}) \right)^2 J(\tau_0)^\top X X^\top J(\tau_0) \right], \qquad A_{22} = E\left[ \rho_2(m_0(T))\, W W^\top \right],$$
$$A_{12} = E\left[ \rho_2(m_0(T))\, \eta_0'(U_{\tau,0})\, J(\tau_0)^\top X W^\top \right],$$
$$\Sigma_1 = E\left[ q_1^2(m_0(T), Y) \begin{pmatrix} \eta_0'(U_{\tau,0}) J(\tau_0)^\top X \\ W \end{pmatrix} \begin{pmatrix} \eta_0'(U_{\tau,0}) J(\tau_0)^\top X \\ W \end{pmatrix}^\top \right].$$
By applying the δ-method, we get the following lemma.
Lemma 2.
Under the conditions of Lemma 1, we obtain
$$\sqrt{n} \begin{pmatrix} \alpha(\tilde\tau) - \alpha(\tau_0) \\ \tilde\delta - \delta_0 \end{pmatrix} \xrightarrow{D} N\left( 0, R(\tau_0) A^{-1} \Sigma_1 A^{-1} R(\tau_0)^\top \right),$$
where
$$R(\tau) = \begin{pmatrix} J(\tau) & 0 \\ 0 & I_{N \times N} \end{pmatrix}.$$
Furthermore, $\alpha(\tilde\tau) - \alpha(\tau_0) = O_{\mathbb{P}}(1/\sqrt{n})$ and $\tilde\delta - \delta_0 = O_{\mathbb{P}}(1/\sqrt{n})$.
Lemma 3.
Under the conditions of Lemma 1, we obtain
$$\hat\theta - \tilde\theta = O_{\mathbb{P}}\left( N_n \left( h^p + \frac{1}{\sqrt{nh}} \right) \right),$$
where $N_n$ is the number of B-spline basis functions of order r.
Then, we can state the following theorem.
Theorem 4.
Under assumptions (C1)–(C5) and (C6)–(C8), we obtain
$$\| \hat\eta - \eta_0 \|_2 = O_{\mathbb{P}}\left( N_n \left( \frac{1}{\sqrt{nh}} + h^p \right) \right)$$
and
$$\| \hat\eta - \eta_0 \|_n = O_{\mathbb{P}}\left( N_n \left( \frac{1}{\sqrt{nh}} + h^p \right) \right).$$
Theorem 5.
Under assumptions (C1)–(C8), and $k \sim n^{1/(2r+1)}$, we obtain
$$\| \hat\beta(\cdot) - \beta_0(\cdot) \|_2 = O_{\mathbb{P}}\left( N_n^2 \left( h^p + \frac{1}{\sqrt{nh}} \right)^2 \right) + O_{\mathbb{P}}\left( n^{-2r/(2r+1)} \right).$$
Theorem 6.
Under assumptions (C1)–(C11), the constrained quasi-likelihood estimators $\hat\alpha$ and $\hat\delta$, with $\|\hat\alpha\| = 1$, are jointly asymptotically normally distributed, i.e.,
$$\sqrt{n} \begin{pmatrix} \hat\alpha - \alpha_0 \\ \hat\delta - \delta_0 \end{pmatrix} \xrightarrow{D} N\left( 0, R(\tau_0) D^{-1} R(\tau_0)^\top \right),$$
where
$$D = E\left[ \rho_2(m_0(T)) \begin{pmatrix} \eta_0'(U_{\tau,0}) J(\tau_0)^\top \Phi(X) \\ \Psi(W) \end{pmatrix} \begin{pmatrix} \eta_0'(U_{\tau,0}) J(\tau_0)^\top \Phi(X) \\ \Psi(W) \end{pmatrix}^\top \right].$$
Notice that the proof of this theorem is quite long. In order to save space and keep the paper readable, we have collected the necessary details in the Supplementary Materials.

Supplementary Materials

The following are available online at https://www.mdpi.com/article/10.3390/stats4040047/s1.

Author Contributions

Methodology, M.A., I.O., M.R., P.V.; Software, M.A., I.O.; Visualization, M.A., I.O.; Writing, M.A., I.O., M.R., P.V.; original draft, M.A., I.O.; Writing—review & editing, M.A., I.O., M.R., P.V. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

In what follows, we will present results and technical lemmas that would be used for the proof of the previous theorems.
First of all, for all probability measures Q, we denote by $L_2(Q)$ the space of square-integrable functions, i.e., $L_2(Q) = \{ f : Qf^2 = \int f^2\, dQ < \infty \}$. Then, let $\mathcal{F}$ be a subclass of $L_2(Q)$. For all $f \in \mathcal{F}$, we denote by $\|f\| = \left( \int f^2\, dQ \right)^{1/2}$ the norm of f with respect to Q.
We give the following definition, which is necessary to understand the proofs of the results.
Definition A1.
  • The δ-covering number $N(\delta, \mathcal{F}, L_2(Q))$ of $\mathcal{F}$ is the smallest value N for which there exist functions $f_1, f_2, \ldots, f_N$ such that, for each $f \in \mathcal{F}$, there exists $j \in \{1, \ldots, N\}$ with $\|f - f_j\| < \delta$; equivalently, $\mathcal{F} \subset \bigcup_{j=1}^N B(f_j, \delta)$. Notice that the $f_j$'s are not necessarily in $\mathcal{F}$.
  • For two functions l and u, a bracket $[l, u]$ is the set of functions f such that $l \leq f \leq u$, i.e., $[l, u] = \{ f : l \leq f \leq u \}$.
  • The δ-covering number with bracketing $N_{[\,]}(\delta, \mathcal{F}, L_2(Q))$ is defined as the smallest value N, necessary to cover the whole of $\mathcal{F}$, for which there exist pairs of functions $\{f_j^L, f_j^U\}$, $j = 1, \ldots, N$, with $\|f_j^U - f_j^L\| \leq \delta$, such that for each $f \in \mathcal{F}$ there is a $j \in \{1, \ldots, N\}$ with $f_j^L \leq f \leq f_j^U$. Notice that $f_j^L$ and $f_j^U$ are not necessarily in $\mathcal{F}$.
  • The δ-entropy with bracketing is defined as $\log N_{[\,]}(\delta, \mathcal{F}, L_2(Q))$.
  • The uniform entropy integral $J_{[\,]}(\delta, \mathcal{F}, L_2(Q))$ is defined by
    $$J_{[\,]}(\delta, \mathcal{F}, L_2(Q)) = \int_0^\delta \left( 1 + \log N_{[\,]}(\kappa, \mathcal{F}, L_2(Q)) \right)^{1/2} d\kappa.$$
Let $Q_n$ be the empirical measure associated with Q, i.e., $Q_n = \frac{1}{n} \sum_{i=1}^n \delta_{X_i}(\cdot)$, so that
$$Q_n f = E_{Q_n}[f] = \int f\, dQ_n = \frac{1}{n} \sum_{i=1}^n f(X_i).$$
We denote by $G_n = \sqrt{n}(Q_n - Q)$ the standardized empirical process indexed by $\mathcal{F}$, and $\|G_n\|_{\mathcal{F}} = \sup_{f \in \mathcal{F}} |G_n f|$. Then, for all $f \in \mathcal{F}$, we have $Qf = E_Q[f(X)]$ and $G_n f = \frac{1}{\sqrt{n}} \sum_{i=1}^n \left( f(X_i) - E[f(X)] \right)$.
Lemma A1.
(Lemma 3.4.2. in Van Der Vaart and Wellner [23])
Let $M_0 > 0$ and $\mathcal{F}$ be a uniformly bounded class of measurable functions such that, for all $f \in \mathcal{F}$, $\|f\|_\infty \leq M_0$ and $Qf^2 < \delta^2$. Then
$$E_Q \| G_n \|_{\mathcal{F}} \leq c_0\, J_{[\,]}(\delta, \mathcal{F}, L_2(Q)) \left( 1 + \frac{J_{[\,]}(\delta, \mathcal{F}, L_2(Q))}{\delta^2 \sqrt{n}} M_0 \right),$$
where c 0 is a finite constant, which does not depend on n.
Lemma A2.
(Lemma A.1. in Huang [24])
For any $\lambda > 0$, let
$$\Theta_n = \left\{ \eta(\alpha_0^\top x) + \delta^\top w : \| \delta - \delta_0 \| \leq \lambda,\ \eta \in S_n,\ \| \eta - \eta_0 \|_2 \leq \lambda \right\}.$$
Then, for any $\varepsilon \leq \lambda$,
$$\log N_{[\,]}(\lambda, \Theta_n, L_2(P)) \leq C N_n \log\left( \frac{\lambda}{\varepsilon} \right).$$
Lemma A3.
(Lemma A.2. in Supplement Material of Wang and Yang [25] and Lemma A.4. in Xue and Yang [26])
Let $S_n$ be the space of all polynomial spline functions of order r on $[a, b]$. Under conditions (C1)–(C5), we have
$$A_n = \sup_{\eta_1, \eta_2 \in S_n} \left| \frac{\langle \eta_1, \eta_2 \rangle_n - \langle \eta_1, \eta_2 \rangle}{\| \eta_1 \|_2 \| \eta_2 \|_2} \right| = O_{a.s.}\left( \sqrt{\frac{\log n}{nh}} \right),$$
where a.s. means "almost surely".
Recall that $\theta = (\alpha^\top, \gamma^\top, \delta^\top)^\top$. Let
$$D_{i,\theta} = \begin{pmatrix} \gamma^\top B_1'(\alpha(\tilde\tau)^\top X_i) J(\tau) & 0 & 0 \\ 0 & I & 0 \\ 0 & 0 & B_1(\alpha(\tilde\tau)^\top X_i) \end{pmatrix} \quad \text{and} \quad T_i = (X_i^\top, W_i^\top)^\top.$$
Denote
$$W_{n,\theta} = \frac{1}{n} \sum_{i=1}^n D_{i,\theta}^\top \begin{pmatrix} 1 \\ T_i \end{pmatrix} \begin{pmatrix} 1 \\ T_i \end{pmatrix}^\top D_{i,\theta}$$
and
$$W_\theta = \frac{1}{n} \sum_{i=1}^n E\left[ D_{i,\theta}^\top \begin{pmatrix} 1 \\ T_i \end{pmatrix} \begin{pmatrix} 1 \\ T_i \end{pmatrix}^\top D_{i,\theta} \right].$$
Lemma A4.
(Lemma A.3 in the Supplement Material of Wang and Yang [25])
Under assumptions (C1)–(C5) and (C6)–(C8), there exists a positive constant $C$ such that
$$\sup_\theta \left\| W_\theta^{-1} \right\|_2 \le C \, N_n, \quad a.s.,$$
and
$$\sup_\theta \left\| W_{n,\theta}^{-1} \right\|_2 \le C \, N_n, \quad a.s.,$$
where
$$\|A\|_2 = \sup_{x \neq 0} \frac{\|Ax\|}{\|x\|} = \sup_{\|x\| = 1} \|Ax\|.$$
In what follows, we state the lemmas needed to prove Theorem 2.
Lemma A5.
(Lemma 1 in Yu et al. [15])
Under conditions (C1) and (C8), we have
$$\sup_{u \in [a,b]} \left| \eta_0(u) - \gamma_0^\top B_1(u) \right| \le C J^{-r} \quad \text{and} \quad \sup_{t \in [a,b]} \left| \beta_0(t) - \delta_0^\top B_2(t) \right| \le C k^{-r},$$
where J is the number of inner nodes for B 1 , and k is the number of inner nodes for B 2 .
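Lemma A5 is the classical B-spline approximation property: a function with $r$ continuous derivatives is approximated at rate $J^{-r}$ as the number of inner knots grows. A sketch with linear splines ($r = 2$), built from hat functions in plain NumPy (the target function and knot counts are illustrative, not the paper's):

```python
import numpy as np

x = np.linspace(0.0, 1.0, 2001)
eta0 = np.sin(2.0 * np.pi * x)              # smooth "true" link function (illustrative)

def sup_error(num_inner_knots):
    """Sup-norm error of a least-squares linear-spline (order r = 2) fit."""
    knots = np.linspace(0.0, 1.0, num_inner_knots + 2)
    # Hat-function basis: column j is the piecewise-linear bump centred at knot j.
    B = np.stack([np.interp(x, knots, np.eye(len(knots))[j])
                  for j in range(len(knots))], axis=1)
    coef, *_ = np.linalg.lstsq(B, eta0, rcond=None)
    return np.max(np.abs(B @ coef - eta0))

err_coarse, err_fine = sup_error(5), sup_error(20)   # error shrinks as knots grow
```

With linear splines the bound reads $C J^{-2}$, so quadrupling the number of inner knots should shrink the sup-norm error by roughly an order of magnitude.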
In what follows, we give the lemmas needed to prove Theorem 3.
Lemma A6.
Under assumptions (C1)–(C8), we have
$$\frac{1}{n} \sum_{i=1}^n \rho_2(m_{0i}) \left[ \hat{\eta}(U_{\tau,0i}) - \eta_0(U_{\tau,0i}) \right] \eta_0'(U_{\tau,0i}) \, J_{\tau_0} \, \Phi(X_i) = o_{\mathbb{P}}\!\left( \tfrac{1}{\sqrt{n}} \right),$$
$$\frac{1}{n} \sum_{i=1}^n \rho_2(m_{0i}) \, \eta_0(U_{\tau,0i}) \, \Phi(X_i) \, \Upsilon(U_{\tau,0i})^\top J_{\tau_0} \, (\hat{\tau} - \tau_0) = o_{\mathbb{P}}\!\left( \tfrac{1}{\sqrt{n}} \right),$$
$$\frac{1}{n} \sum_{i=1}^n \rho_2(m_{0i}) \, \eta_0(U_{\tau,0i}) \, \Phi(X_i) \, \Gamma(U_{\tau,0i})^\top (\hat{\delta} - \delta_0) = o_{\mathbb{P}}\!\left( \tfrac{1}{\sqrt{n}} \right),$$
where
$$\Upsilon(u_{\tau,0}) = \frac{\mathbb{E}\left[ X \, \rho_2(m_0(T)) \mid U_{\tau,0} = u_{\tau,0} \right]}{\mathbb{E}\left[ \rho_2(m_0(T)) \mid U_{\tau,0} = u_{\tau,0} \right]}, \qquad \Gamma(u_{\tau,0}) = \frac{\mathbb{E}\left[ W \, \rho_2(m_0(T)) \mid U_{\tau,0} = u_{\tau,0} \right]}{\mathbb{E}\left[ \rho_2(m_0(T)) \mid U_{\tau,0} = u_{\tau,0} \right]},$$
$$\Phi(x) = \Phi(U_{\tau,0}, x) = x - \Upsilon(u_{\tau,0}),$$
and
$$\Psi(w) = \Psi(U_{\tau,0}, w) = w - \Gamma(u_{\tau,0}).$$
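The functions $\Phi$ and $\Psi$ are the covariates $X$ and $W$ with their $\rho_2$-weighted conditional means given the index removed, so that $\mathbb{E}[\rho_2(m_0(T)) \Phi(X) \mid U_{\tau,0}] = 0$. A Monte Carlo sketch of this orthogonality in a toy model (taking $\rho_2 \equiv 1$ and a design in which $\Upsilon(u) = u$ by construction; both choices are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000
U = rng.uniform(0.0, 1.0, size=n)           # the index U_{tau,0}
X = U + rng.normal(0.0, 0.5, size=n)        # covariate correlated with the index

# Toy model: rho_2 = 1, so Upsilon(u) = E[X | U = u] = u here and Phi(X) = X - U.
Phi = X - U

# Binned check that E[Phi(X) | U] is approximately 0 on a coarse partition of [0, 1].
bins = np.digitize(U, np.linspace(0.0, 1.0, 11)) - 1
cond_means = np.array([Phi[bins == b].mean() for b in range(10)])
```

This projection is what makes the parametric parts of the model estimable at the $\sqrt{n}$ rate despite the presence of the non-parametric link.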
Lemma A7.
Under assumptions (C1)–(C8), we have
$$\frac{1}{n} \sum_{i=1}^n \rho_2(m_{0i}) \left[ \hat{\eta}(U_{\tau,0i}) - \eta_0(U_{\tau,0i}) \right] \Psi(T_i) = o_{\mathbb{P}}\!\left( \tfrac{1}{\sqrt{n}} \right),$$
$$\frac{1}{n} \sum_{i=1}^n \rho_2(m_{0i}) \, \eta_0(U_{\tau,0i}) \, \Psi(T_i) \, \Upsilon(U_{\tau,0i})^\top J_{\tau_0} \, (\hat{\tau} - \tau_0) = o_{\mathbb{P}}\!\left( \tfrac{1}{\sqrt{n}} \right),$$
$$\frac{1}{n} \sum_{i=1}^n \rho_2(m_{0i}) \, \Psi(U_{\tau,0i}, W_i) \, \Gamma(U_{\tau,0i})^\top (\hat{\delta} - \delta_0) = o_{\mathbb{P}}\!\left( \tfrac{1}{\sqrt{n}} \right).$$

Summary

In this paper, we introduced estimators for Partially Linear Generalized Single-Index Models for Functional Data (PLGSIMF). Our estimators are obtained via the Fisher scoring update equation derived from the quasi-likelihood function, together with a normalized B-spline basis and its derivatives.
We proved the $\sqrt{n}$-consistency and asymptotic normality of our estimators. First, the estimator $\hat{\eta}$ converges, at the optimal non-parametric rate, to the true link function $\eta_0$. Second, the estimator $\hat{\beta}$ converges, at the optimal rate, to the true slope function $\beta_0$. Finally, the estimators $\hat{\alpha}$ and $\hat{\delta}$ are $\sqrt{n}$-consistent and asymptotically normal for the index parameter $\alpha_0$ and the functional parameter $\delta_0$, respectively. A numerical study shows that our estimation procedure performs well even in higher dimensions; the quality of the estimators is illustrated via simulations.
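The Fisher scoring update mentioned above reduces, in the purely parametric logistic case, to the familiar iteratively reweighted least squares step. The sketch below shows this reduced update only; the paper's full PLGSIMF iteration additionally carries the B-spline blocks for $\eta$ and $\beta$:

```python
import numpy as np

def fisher_scoring_logit(X, y, n_iter=25):
    """Fisher scoring / IRLS for logistic regression:
    beta <- beta + (X' W X)^{-1} X' (y - mu)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-X @ beta))    # mean function g^{-1}(X beta)
        W = mu * (1.0 - mu)                     # GLM working weights
        score = X.T @ (y - mu)                  # quasi-likelihood score
        info = X.T @ (X * W[:, None])           # Fisher information
        beta = beta + np.linalg.solve(info, score)
    return beta

rng = np.random.default_rng(3)
n = 20_000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([-0.5, 1.0])
y = (rng.uniform(size=n) < 1.0 / (1.0 + np.exp(-X @ beta_true))).astype(float)

beta_hat = fisher_scoring_logit(X, y)
```

The same update structure drives the simulations with the logit link reported in Tables 7–12, with the spline coefficient blocks appended to the parameter vector.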

References

  1. Nelder, J.A.; Wedderburn, R.W.M. Generalized Linear Models. J. R. Stat. Soc. Ser. A 1972, 135, 370–384.
  2. McCullagh, P.; Nelder, J.A. Generalized Linear Models, 2nd ed.; Chapman and Hall: London, UK, 1989.
  3. Hastie, T.J.; Tibshirani, R.J. Generalized Additive Models; Chapman and Hall: London, UK, 1990.
  4. Wood, S. Generalized Additive Models: An Introduction with R, 2nd ed.; CRC Press: Boca Raton, FL, USA, 2017.
  5. Dunn, P.K.; Smyth, G.K. Generalized Linear Models with Examples in R; Springer Texts in Statistics; Springer: New York, NY, USA, 2018.
  6. Lai, P.; Tian, Y.; Lian, H. Estimation and variable selection for generalised partially linear single-index models. J. Nonparametr. Stat. 2014, 26, 171–185.
  7. Ferraty, F.; Vieu, P. Nonparametric Functional Data Analysis: Theory and Practice; Springer Series in Statistics; Springer: New York, NY, USA, 2006.
  8. Carroll, R.J.; Fan, J.; Gijbels, I.; Wand, M.P. Generalized partially linear single-index models. J. Am. Stat. Assoc. 1997, 92, 477–489.
  9. Li, C.S.; Lu, M. A lack-of-fit test for generalized linear models via single-index techniques. Comput. Stat. 2018, 33, 731–756.
  10. Li, W.; Cao, G. Efficient estimation for generalized partially linear single-index models. Bernoulli 2018, 24, 1101–1127.
  11. Aneiros-Perez, G.; Vieu, P. Semi-functional partial linear regression. Stat. Probab. Lett. 2006, 76, 1102–1110.
  12. Horváth, L.; Kokoszka, P. Inference for Functional Data with Applications; Springer: New York, NY, USA, 2012.
  13. Aneiros-Perez, G.; Vieu, P. Partial linear modelling with multi-functional covariates. Comput. Stat. 2015, 30, 647–671.
  14. Ramsay, J.O.; Silverman, B.W. Functional Data Analysis; Springer: New York, NY, USA, 2005.
  15. Yu, P.; Du, J.; Zhang, Z. Single-index partially functional linear regression model. Stat. Pap. 2020, 61, 1107–1123.
  16. Cao, R.; Du, J.; Zhou, J.; Xie, T. FPCA-based estimation for generalized functional partially linear models. Stat. Pap. 2020, 61, 2715–2735.
  17. Rachdi, M.; Alahiane, M.; Ouassou, I.; Vieu, P. Generalized Functional Partially Linear Single-index Models. In IWFOS 2020: Functional and High-Dimensional Statistics and Related Fields; Springer: Cham, Switzerland, 2020; pp. 221–228.
  18. De Boor, C. A Practical Guide to Splines, revised ed.; Applied Mathematical Sciences; Springer: Berlin, Germany, 2001; Volume 27.
  19. Yu, Y.; Ruppert, D. Penalized spline estimation for partially linear single-index models. J. Am. Stat. Assoc. 2002, 97, 1042–1054.
  20. Peng, Q.; Zhou, J.; Tang, N. Varying coefficient partially functional linear regression models. Stat. Pap. 2015, 57, 827–841.
  21. Pollard, D. Asymptotics for least absolute deviation regression estimators. Econom. Theory 1991, 7, 186–199.
  22. Stone, C.J. The dimensionality reduction principle for generalized additive models. Ann. Stat. 1986, 14, 590–606.
  23. Van der Vaart, A.W.; Wellner, J.A. Weak Convergence and Empirical Processes: With Applications to Statistics; Springer: New York, NY, USA, 1996.
  24. Huang, J. Efficient estimation of the partly linear additive Cox model. Ann. Stat. 1999, 27, 1536–1563.
  25. Wang, L.; Yang, L. Spline estimation of single-index models. Stat. Sin. 2009, 19, 765–783.
  26. Xue, L.; Yang, L. Additive coefficient modelling via polynomial spline. Stat. Sin. 2006, 16, 1423–1446.
Figure 1. A sample of curves $\{Z_i(t), \; t \in [0,1]\}_{i=1,\dots,500}$.
Figure 2. Single index $u = \alpha^\top X$ versus predicted single index $\hat{u} = \hat{\alpha}^\top X$.
Figure 3. Estimated slope function $\hat{\beta}(\cdot)$ and slope function $\beta(\cdot)$.
Figure 4. Estimated non-parametric function $\hat{\eta}(\cdot)$ and non-parametric function $\eta(\cdot)$.
Figure 5. Sample of 100 absorbance curves $Z$.
Figure 6. Estimated non-parametric function $\hat{\eta}(\cdot)$.
Figure 7. Estimated slope function $\hat{\beta}(\cdot)$.
Table 1. Bias, SD and MSE according to the parameter τ for PLGSIMF with the identity link function and n = 500.

|      | τ1 | τ2 |
|------|----|----|
| Bias | 0.004 | −0.0009 |
| SD   | 0.0031 | 0.0025 |
| MSE  | 2.6565 × 10⁻⁵ | 7.1819 × 10⁻⁶ |
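For reference, the Bias, SD and MSE rows reported in Tables 1–12 are linked by the decomposition MSE = Bias² + SD² (with the population-variance convention for SD). A sketch of how such a row is computed from Monte Carlo replicates of an estimator (the simulated replicates below are illustrative, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(4)
theta_true = 0.3
estimates = theta_true + rng.normal(0.0, 0.05, size=1000)   # Monte Carlo replicates

bias = estimates.mean() - theta_true
sd = estimates.std()                          # ddof = 0: population convention
mse = np.mean((estimates - theta_true) ** 2)
# With this convention the identity mse == bias**2 + sd**2 holds exactly.
```

Small discrepancies between the reported MSE entries and Bias² + SD² are due to rounding of the tabulated values.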
Table 2. Bias, SD and MSE with respect to the parameter γ for PLGSIMF with the identity link function and n = 500.

|      | γ1 | γ2 | γ3 | γ4 | γ5 | γ6 | γ7 | γ8 | γ9 |
|------|----|----|----|----|----|----|----|----|----|
| Bias | −0.0256 | −0.0575 | 0.0526 | 0.1230 | 0.0471 | −0.0015 | −0.0148 | 0.0241 | 0.0005 |
| SD   | 0.1834 | 0.2430 | 0.3037 | 0.3621 | 0.2049 | 0.3756 | 0.3101 | 0.2281 | 0.1710 |
| MSE  | 0.0343 | 0.0624 | 0.0950 | 0.1463 | 0.0442 | 0.1411 | 0.0964 | 0.0526 | 0.0292 |
Table 3. Bias, SD and MSE with respect to the parameter δ for PLGSIMF with the identity link function and n = 500.

|      | δ1 | δ2 | δ3 | δ4 | δ5 | δ6 | δ7 |
|------|----|----|----|----|----|----|----|
| Bias | −0.3444 | −0.5969 | −0.2527 | 0.0696 | 0.1093 | 0.0746 | 0.2909 |
| SD   | 0.9457 | 0.5569 | 0.2565 | 0.0916 | 0.1059 | 0.1150 | 0.1669 |
| MSE  | 1.0131 | 0.6664 | 0.1296 | 0.0132 | 0.0231 | 0.0188 | 0.1125 |
Table 4. Bias, SD and MSE according to the parameter τ for PLGSIMF with the identity link function and n = 1000.

|      | τ1 | τ2 |
|------|----|----|
| Bias | 0.0054 | 0.0021 |
| SD   | 0.0023 | 0.0021 |
| MSE  | 3.5152 × 10⁻⁵ | 9.5050 × 10⁻⁶ |
Table 5. Bias, SD and MSE with respect to the parameter γ for PLGSIMF with the identity link function and n = 1000.

|      | γ1 | γ2 | γ3 | γ4 | γ5 | γ6 | γ7 | γ8 | γ9 |
|------|----|----|----|----|----|----|----|----|----|
| Bias | −0.0969 | −0.2220 | −0.1260 | −0.0815 | −0.0806 | −0.0135 | 0.2509 | 0.2234 | 0.1636 |
| SD   | 0.1583 | 0.2194 | 0.2478 | 0.3041 | 0.2098 | 0.3092 | 0.2509 | 0.2234 | 0.1636 |
| MSE  | 0.0344 | 0.0974 | 0.0773 | 0.0991 | 0.0505 | 0.0958 | 0.1259 | 0.0998 | 0.0535 |
Table 6. Bias, SD and MSE with respect to the parameter δ for PLGSIMF with the identity link function and n = 1000.

|      | δ1 | δ2 | δ3 | δ4 | δ5 | δ6 | δ7 |
|------|----|----|----|----|----|----|----|
| Bias | −0.2189 | 0.0665 | −0.0113 | −0.0025 | −0.0171 | −0.0006 | −0.0231 |
| SD   | 1.1558 | 0.3477 | 0.1447 | 0.0644 | 0.0712 | 0.0716 | 0.1296 |
| MSE  | 1.3839 | 0.1253 | 0.0210 | 0.0041 | 0.0053 | 0.0051 | 0.0173 |
Table 7. Bias, SD and MSE with respect to the parameter τ for PLGSIMF with the logit link function and n = 500.

|      | τ1 | τ2 |
|------|----|----|
| Bias | −0.0094 | 0.0059 |
| SD   | 0.0102 | 0.0107 |
| MSE  | 0.0002 | 0.0001 |
Table 8. Bias, SD and MSE with respect to the parameter γ for PLGSIMF with the logit link function and n = 500.

|      | γ1 | γ2 | γ3 | γ4 | γ5 | γ6 | γ7 | γ8 | γ9 |
|------|----|----|----|----|----|----|----|----|----|
| Bias | 0.6075 | −0.1033 | 0.2749 | 0.1901 | 0.1506 | 0.1697 | 0.6662 | 0.5948 | 0.4662 |
| SD   | 0.2720 | 0.1529 | 0.1832 | 0.1055 | 0.0850 | 0.1266 | 0.4510 | 0.3224 | 0.4455 |
| MSE  | 0.4431 | 0.0340 | 0.1092 | 0.0473 | 0.0299 | 0.0448 | 0.6472 | 0.4577 | 0.4159 |
Table 9. Bias, SD and MSE with respect to the parameter δ for PLGSIMF with the logit link function and n = 500.

|      | δ1 | δ2 | δ3 | δ4 | δ5 | δ6 | δ7 |
|------|----|----|----|----|----|----|----|
| Bias | −0.2977 | 0.7625 | −0.3420 | −0.2347 | −0.0243 | 0.1246 | 0.0325 |
| SD   | 0.9025 | 0.2310 | 0.5342 | 0.2008 | 0.2160 | 0.2149 | 0.3286 |
| MSE  | 0.9031 | 0.6348 | 0.4024 | 0.0954 | 0.0472 | 0.0617 | 0.1090 |
Table 10. Bias, SD and MSE with respect to the parameter τ for PLGSIMF with the logit link function and n = 1000.

|      | τ1 | τ2 |
|------|----|----|
| Bias | 0.0054 | 0.0021 |
| SD   | 0.0023 | 0.0021 |
| MSE  | 3.5152 × 10⁻⁵ | 9.5050 × 10⁻⁶ |
Table 11. Bias, SD and MSE with respect to the parameter γ for PLGSIMF with the logit link function and n = 1000.

|      | γ1 | γ2 | γ3 | γ4 | γ5 | γ6 | γ7 | γ8 | γ9 |
|------|----|----|----|----|----|----|----|----|----|
| Bias | −0.0969 | −0.2220 | −0.1260 | −0.0815 | −0.0806 | −0.0135 | 0.2509 | 0.2234 | 0.1636 |
| SD   | 0.1583 | 0.2194 | 0.2478 | 0.3041 | 0.2098 | 0.3092 | 0.2509 | 0.2234 | 0.1636 |
| MSE  | 0.0344 | 0.0974 | 0.0773 | 0.0991 | 0.0505 | 0.0958 | 0.1259 | 0.0998 | 0.0535 |
Table 12. Bias, SD and MSE with respect to the parameter δ for PLGSIMF with the logit link function and n = 1000.

|      | δ1 | δ2 | δ3 | δ4 | δ5 | δ6 | δ7 |
|------|----|----|----|----|----|----|----|
| Bias | −0.2189 | 0.0665 | −0.0113 | −0.0025 | −0.0171 | −0.0006 | −0.0231 |
| SD   | 1.1558 | 0.3477 | 0.1447 | 0.0644 | 0.0712 | 0.0716 | 0.1296 |
| MSE  | 1.3839 | 0.1253 | 0.0210 | 0.0041 | 0.0053 | 0.0051 | 0.0173 |
Table 17. The MSEPs for different models.

| Functional Model | MSEP |
|------------------|------|
| Model 1 (PLGSIMF): $g(\mu(X_i, Z_i)) = \eta(\alpha_1 X_{1,i} + \alpha_2 X_{2,i}) + \int_{850}^{1050} \beta(t) Z_i(t) \, dt$ | 0.015 |
| Model 2 (GFPLM): $g(\mu(X_i, Z_i)) = \alpha_1 X_{1,i} + \alpha_2 X_{2,i} + \int_{850}^{1050} \beta(t) Z_i(t) \, dt$ | 0.024 |
| Model 3 (GFPLM): $g(\mu(X_i, Z_i)) = \eta(X_{1,i}) + \int_{850}^{1050} \beta(t) Z_i(t) \, dt$ | 0.029 |
| Model 4 (GFPLM): $g(\mu(X_i, Z_i)) = \eta(X_{2,i}) + \int_{850}^{1050} \beta(t) Z_i(t) \, dt$ | 0.046 |
| Model 5 (GFLM): $g(\mu(X_i, Z_i)) = \int_{850}^{1050} \beta(t) Z_i(t) \, dt$ | 0.051 |
| Model 6 (SIM): $Y_i = \eta(\alpha_1 X_{1,i} + \alpha_2 X_{2,i}) + \varepsilon_i$ | 1.112 |
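The MSEP criterion of Table 17 is the mean squared error of prediction on a hold-out sample. A minimal sketch of the computation (with an illustrative linear toy model standing in for the fitted functional models):

```python
import numpy as np

def msep(y_test, y_pred):
    """Mean squared error of prediction on a hold-out sample."""
    return float(np.mean((y_test - y_pred) ** 2))

rng = np.random.default_rng(5)
n = 500
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(0.0, 0.1, size=n)

# Fit on the first half, evaluate the MSEP on the held-out second half.
x_tr, x_te, y_tr, y_te = x[:250], x[250:], y[:250], y[250:]
slope = (x_tr @ y_tr) / (x_tr @ x_tr)        # least squares through the origin
score = msep(y_te, slope * x_te)
```

A smaller MSEP indicates better out-of-sample prediction, which is how Model 1 (PLGSIMF) is ranked ahead of the competing specifications in Table 17.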