Article

Double Penalized Expectile Regression for Linear Mixed Effects Model

1 Department of Statistics, College of Science, Wuhan University of Technology, Wuhan 430070, China
2 Department of Epidemiology and Biostatistics, College of Public Health, University of South Florida, Tampa, FL 33612, USA
* Author to whom correspondence should be addressed.
Symmetry 2022, 14(8), 1538; https://doi.org/10.3390/sym14081538
Submission received: 30 June 2022 / Revised: 22 July 2022 / Accepted: 25 July 2022 / Published: 27 July 2022
(This article belongs to the Special Issue Mathematical Models: Methods and Applications)

Abstract:
This paper constructs a double penalized expectile regression for the linear mixed effects model, which can estimate coefficients and select variables for the random and fixed effects simultaneously. The method is based on the linear mixed effects model combined with double penalized expectile regression. For this model, we propose an iterative Lasso expectile regression algorithm to solve for the parameters, and the Schwarz Information Criterion (SIC) and the Generalized Approximate Cross-Validation criterion (GACV) are used to choose the penalty parameters. Additionally, we establish the asymptotic normality of the suggested expectile regression coefficient estimators. Through simulation studies, we examine the effects of coefficient estimation and variable selection at varying expectile levels under various conditions, including different signal-to-noise ratios, random effects, and levels of model sparsity. We find that the proposed method is robust to various error distributions at every expectile level and is superior to the double penalized quantile regression method in the robustness of excluding inactive variables. The suggested method can still accurately exclude inactive variables and select important variables with high probability for high-dimensional data. The usefulness of double penalized expectile regression in applications is illustrated through a case study using real CD4 cell data.

1. Introduction

The linear mixed effects model (LME) is an important statistical model that is widely used in various fields. The LME model includes fixed effects and random effects. The fixed effects represent the general characteristics of the sampled population, while the random effects depict the divergences between individuals and the correlation among repeated observations. The structure and properties of this model differ substantially from those of general linear models and fully random coefficient linear models. Compared with other linear models, the inclusion of random effects in the mixed effects model captures the correlation between the observed variables.
Maximum likelihood and least squares are the classical estimation methods for the LME model. However, the least squares method leads to biased estimates when the data have a heavy-tailed distribution or significant heteroscedasticity. Koenker and Bassett (1978) [1] proposed quantile regression to address such problems by regressing covariates on the conditional quantiles of the response, capturing regression models at all quantiles. However, the sparsity of the covariates is an issue that cannot be ignored in regression analysis: not all variables have predictive roles. In practical applications, massive numbers of candidate variables can be used for modeling and prediction. Retaining irrelevant variables in the model is undesirable, as it produces large deviations and renders the model uninterpretable. How to select variables effectively is thus a challenging topic for linear mixed effects models. Similar to general linear regression, incorporating different types of penalty terms in quantile regression can shrink coefficients to achieve variable selection. For example, Tibshirani (1996) [2] proposed the Lasso method with a penalty term, which selects variables by sparsely shrinking parameters; applied to quantile regression, the Lasso constrains some coefficients of irrelevant variables to 0 when the sum of absolute values of the coefficients is less than a pre-specified constant, which yields a more parsimonious model, selects variables simultaneously, and handles sparse data. Zou (2006) [3] gave the proof of the oracle properties of the adaptive Lasso in generalized linear models. Biswas and Das (2021) [4] proposed a Bayesian approach for estimating the quantiles of multivariate longitudinal data. Koenker (2004) [5] proposed L1-penalized quantile regression based on random effects, which estimates parameters by weighting random effects information across multiple quantiles. Wang (2018) [6] proposed a new semiparametric approach that uses copulas to account for intra-subject dependence and approximates the marginal distributions of longitudinal measurements, given covariates, through quantile regression. Wu and Liu (2009) [7] proposed SCAD and adaptive Lasso penalized quantile regression and gave their oracle properties for a fixed number of variables. Peng (2021) [8] illustrated the practical utility of quantile regression in survival data analysis through two case studies. Li (2020) [9] constructed a double penalized quantile regression method, used a Lasso algorithm to solve for the parameters, and showed that its estimation accuracy is better than that of other quantile regressions. From the work above, we find that the double penalized quantile regression model lacks accuracy and stability in excluding inactive variables, whereas expectile regression can accurately reflect the tail characteristics of the distribution, so the variables of the model can be selected accurately. Therefore, in order to exclude inactive variables more accurately, this paper incorporates the double penalty terms into expectile regression to obtain the double penalized expectile regression model.
Newey and Powell (1987) [10] replaced the L1-norm loss function with an L2-norm loss function for weighted least squares and proposed expectile regression. It regresses the covariates on the conditional expectiles of the response to obtain regression models at all expectile levels. Almanjahie et al. (2022) [11] investigated nonparametric estimation of the expectile regression model for strongly mixing functional time series data. Gu (2016) [12] proposed regularized expectile regression to analyze heteroscedasticity in high-dimensional data. Farooq and Steinwart (2017) [13] analyzed a support vector machine type approach for estimating conditional expectiles. Expectiles and quantiles are both metrics that capture the tail behavior of a data distribution. When covariates affect the distribution of the response differently, such as under right or left skew, these two metrics can not only decrease the effects of outliers on statistical inference, but also provide a more comprehensive characterization of the entire distribution. Therefore, quantile and expectile regression provide a more complete picture of the relation between covariates and response.
Quantile regression generalizes median regression, and expectile regression generalizes mean regression, so expectile regression inherits the computational convenience of least squares and its sensitivity to the observed values. Especially in the financial field, researchers need this sensitivity of expectile regression to the data. For panel data, the model proposed by Schulze and Kauermann (2017) [14] allows multiple covariates, and a semi-parametric approach with penalized splines is pursued to fit smooth expectile curves. For cross-sectional data, expectile regression models and their applications have been studied extensively. Sobotka et al. (2013) [15] discussed expectile confidence intervals based on large sample properties; Zhao and Zhao (2018) [16] proposed a penalized expectile regression model with the SCAD penalty and proved its asymptotic properties; Liao et al. (2019) [17] proposed penalized expectile regression with adaptive Lasso and SCAD penalties for variable selection, and proved oracle properties under independent but non-identically distributed error terms. Waldmann et al. (2017) [18] proposed a Bayesian method combined with weighted least squares to estimate complex expectile regression. Xu and Ding (2021) [19] combined the elastic net penalty with expectile regression and constructed an elastic net penalized expectile regression model. For the underlying optimization problem, Farooq and Steinwart (2017) [20] proposed an efficient sequential-minimal-optimization-based solver and derived its convergence. For the model selection problem, Spiegel et al. (2017) [21] introduced several selection criteria and shrinkage methods to perform model selection in semiparametric expectile regression. Expectile regression has also received attention in the economic and financial sector, particularly in actuarial and financial risk management. Daouia et al. (2020) [22] derived joint weighted Gaussian approximations of the tail empirical expectile and quantile processes under the challenging model of heavy-tailed distributions. Ziegel (2013) [23] applied expectiles to the field of risk measurement.
Expectile regression is widely used in many fields, including economics [11,14,17,22,23], biomedicine [12,19], and health [15,18,21]. The study of expectile regression therefore has considerable practical significance.
To select the important variables into the model and exclude the inactive variables more accurately, this paper combines the linear mixed effects model with the expectile regression model and incorporates penalty terms into the estimation of both random and fixed effects, constructing the double penalized expectile regression for the linear mixed effects model. The iterative Lasso-Expectile regression algorithm is used to solve for the parameters, and the asymptotic properties of the double penalized expectile regression estimator are proved. The simulation studies analyze the coefficient estimation and variable selection of the proposed method under different conditions, focusing on its robustness in excluding inactive variables. Finally, based on real CD4 cell data, the practical utility of the double penalized quantile regression and the double penalized expectile regression is compared.
The rest of this paper is organized as follows. We propose the double penalized expectile regression method and the iterative Lasso expectile regression algorithm in Section 2. The convergence of the algorithm and the asymptotic properties of the model are given in Section 3. In Section 4, we present the simulation studies, and a real data example is illustrated in Section 5. Moreover, the proposed method is compared with the existing double penalized quantile regression method in parameter estimation and variable selection in both the simulation studies and the real data analysis. In Section 6, we give the conclusions. In the Appendices, we show the proofs of the lemmas and asymptotic properties, and some graphs and tables obtained from the simulation studies.

2. Methodologies

To derive the specific formulation of the double penalized expectile regression for the linear mixed effects model, we first introduce the LME model and summarize its estimation methods, then give the specific steps of the iterative Lasso-expectile regression algorithm, and finally discuss the selection criteria for the penalty parameters.

2.1. Model and Estimation

Firstly, we consider the LME model
$$y_{ij} = x_{ij}^T \beta + z_{ij}^T \alpha_i + r_{ij}, \quad i = 1, 2, \dots, n, \; j = 1, 2, \dots, n_i, \; \sum_i n_i = N, \tag{1}$$
where $\beta = (\beta_1, \dots, \beta_h)^T$ is the $h \times 1$ vector of fixed effects regression coefficients, $\alpha_i = (\alpha_{i1}, \alpha_{i2}, \dots, \alpha_{il})^T$ is an $l \times 1$ vector of random effects, $x_{ij}^T$ is a row vector of the known design matrix, $z_{ij}$ is the $l \times 1$ covariate associated with the random effects, and $y_{ij}$ is the $j$th scalar observation of the $i$th subject's continuous response. We let $\alpha_i \sim N(0, P)$ and $r_i \sim N(0, Q_i)$. Equation (1) can be expressed as
$$y_i = X_i \beta + Z_i \alpha_i + r_i, \quad \alpha_i \sim N(0, P), \; r_i \sim N(0, Q_i), \; i = 1, 2, \dots, n, \tag{2}$$
where $X_i = (x_{i1}, x_{i2}, \dots, x_{in_i})^T$, $Z_i = (z_{i1}, z_{i2}, \dots, z_{in_i})^T$, $y_i = (y_{i1}, y_{i2}, \dots, y_{in_i})^T$, and $r_i = (r_{i1}, r_{i2}, \dots, r_{in_i})^T$.
Next, we let $Z = \operatorname{diag}(Z_1, Z_2, \dots, Z_n)$, $r = (r_1^T, r_2^T, \dots, r_n^T)^T$, $X = (X_1^T, X_2^T, \dots, X_n^T)^T$, $y = (y_1^T, y_2^T, \dots, y_n^T)^T$, $\alpha = (\alpha_1^T, \alpha_2^T, \dots, \alpha_n^T)^T$, $\tilde P = \operatorname{diag}(P, P, \dots, P)$, and $Q = \operatorname{diag}(Q_1, Q_2, \dots, Q_n)$. In matrix form, the LME model (2) can be expressed as
$$y = X\beta + Z\alpha + r, \quad \alpha \sim N(0, \tilde P), \; r \sim N(0, Q) \tag{3}$$
Using maximum likelihood and generalized least squares, the parameters $\beta$ and $\alpha_i$ can be calculated. For known $P$ and $Q_i$, $i = 1, 2, \dots, n$, consider the following function of the joint density of $y_i$ and $\alpha_i$:
$$L(\beta, \alpha_i \mid y) = \sum_{i=1}^{n} \left\{ (y_i - X_i\beta - Z_i\alpha_i)^T Q_i^{-1} (y_i - X_i\beta - Z_i\alpha_i) + \alpha_i^T P^{-1} \alpha_i + \log|P| + \log|Q_i| \right\} \tag{4}$$
The $\beta$ and $\alpha_i$ can be determined by minimizing Equation (4), which is twice the negative logarithm of the joint density. Equation (4) is not a typical log-likelihood function since the $\alpha_i$ are vectors of random effects. More specifically, the first part of Equation (4) is a weighted residual that accounts for within-subject variation, while $\alpha_i^T P^{-1} \alpha_i$ is a penalty term arising from the random effects $\alpha_i$ that accounts for between-subject variation.
Minimizing Equation (4) is equivalent to solving the following mixed model equation [24,25] for given $Q_i$ and $P$:
$$\begin{pmatrix} X^T Q^{-1} X & X^T Q^{-1} Z \\ Z^T Q^{-1} X & Z^T Q^{-1} Z + \tilde P^{-1} \end{pmatrix} \begin{pmatrix} \beta \\ \alpha \end{pmatrix} = \begin{pmatrix} X^T Q^{-1} y \\ Z^T Q^{-1} y \end{pmatrix}$$
The solutions can be expressed as:
$$\hat\beta = (X^T W^{-1} X)^{-1} X^T W^{-1} y \tag{5}$$
$$\hat\alpha_i = P Z_i^T W_i^{-1} (y_i - X_i \hat\beta) \tag{6}$$
where $W_i = Z_i P Z_i^T + Q_i$ and $W = \operatorname{diag}(W_1, W_2, \dots, W_n)$. Robinson [26] credited Henderson [27] with the normal equations above, since this result had already been obtained there. The term "best linear unbiased predictor" (BLUP), introduced by Goldberger [28], refers to the estimator $\hat\beta$ and the associated random effects predictor $\hat\alpha_i$. Rao [24] gave a way to demonstrate the following Proposition 1:
Proposition 1.
$\hat\beta$ solves $\min_{\alpha, \beta} \|y - X\beta - Z\alpha\|_{Q^{-1}}^2 + \|\alpha\|_{\tilde P^{-1}}^2$, where $\|x\|_T^2 = x^T T x$.
Although the implicit solution for the random effects may look different from other estimators, viewing the random effects estimate as a penalized least squares estimator offers useful insight for the addition of penalty terms. Shrinking the unrestricted $\hat\alpha$ toward its expected value improves both the accuracy of the estimator $\hat\beta$ and the individual fixed effect estimates. By a classical statistical result, in the case of non-informative priors, the posterior expectations of the parameters in a Bayesian treatment of the LME are also the solutions (5) and (6).
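For concreteness, the closed-form solutions (5) and (6) can be computed directly once $P$ and the $Q_i$ are known. The following is a minimal numpy sketch, not the authors' implementation; the function and argument names are illustrative.

```python
import numpy as np

def lme_closed_form(X_list, Z_list, y_list, P, Q_list):
    """GLS estimator (5) of beta and BLUPs (6) of the alpha_i."""
    # W_i = Z_i P Z_i^T + Q_i is the marginal covariance of y_i;
    # W = diag(W_1, ..., W_n), so the per-subject sums below realize
    # the matrix forms (X^T W^{-1} X)^{-1} X^T W^{-1} y.
    W_list = [Z @ P @ Z.T + Q for Z, Q in zip(Z_list, Q_list)]
    A = sum(X.T @ np.linalg.solve(W, X) for X, W in zip(X_list, W_list))
    b = sum(X.T @ np.linalg.solve(W, y) for X, W, y in zip(X_list, W_list, y_list))
    beta_hat = np.linalg.solve(A, b)                              # Equation (5)
    alpha_hat = [P @ Z.T @ np.linalg.solve(W, y - X @ beta_hat)   # Equation (6)
                 for X, Z, W, y in zip(X_list, Z_list, W_list, y_list)]
    return beta_hat, alpha_hat
```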

2.2. Double Penalized Expectile Regression

Li (2020) [9] proposed a double penalized quantile regression estimation for the LME model that finds $\hat\alpha_i \in \mathbb{R}^l$ and $\hat\beta \in \mathbb{R}^h$, $i = 1, \dots, n$, that minimize
$$\sum_{i=1}^{n} \sum_{j=1}^{n_i} \gamma_\theta (y_{ij} - x_{ij}^T \beta - z_{ij}^T \alpha_i) + \lambda_\beta \sum_{k=1}^{h} |\beta_k| + \lambda_\alpha \sum_{i=1}^{n} \sum_{t=1}^{l} |\alpha_{it}| \tag{7}$$
where $\gamma_\theta(k) = k(\theta - I(k < 0))$ denotes the check function of quantile regression at the $\theta$th level. Equation (7) can simultaneously perform variable selection and estimate the conditional quantile functions of the response variable, as stated in Li (2020) [9].
Considering the lack of robustness of quantile regression in excluding inactive variables, and inspired by Newey and Powell (1987) [10], we model the conditional expectile functions of the response $y_{ij}$, the $j$th observation of the $i$th individual, as
$$Q_{y_{ij}}(\tau \mid x_{ij}, \alpha_i) = x_{ij}^T \beta + z_{ij}^T \alpha_i \tag{8}$$
Each component of $\alpha_i = (\alpha_{i1}, \alpha_{i2}, \dots, \alpha_{il})^T$ is associated with the same individual. We suggest a double penalized expectile regression method for the LME model based on Equation (8), which finds $\hat\alpha_i \in \mathbb{R}^l$ and $\hat\beta \in \mathbb{R}^h$, $i = 1, \dots, n$, that minimize
$$L(\beta, \alpha) = \sum_{i=1}^{n} \sum_{j=1}^{n_i} \rho_\tau (y_{ij} - x_{ij}^T \beta - z_{ij}^T \alpha_i) + \lambda_\beta \sum_{k=1}^{h} |\beta_k| + \lambda_\alpha \sum_{i=1}^{n} \sum_{t=1}^{l} |\alpha_{it}| \tag{9}$$
where $\rho_\tau(u) = |\tau - I(u < 0)|\, u^2$ is the expectile check function and $\tau \in (0, 1)$.
Compared with Equation (7), expectile regression is sensitive to extreme values, and the squared loss function it uses has a computational advantage. When discussing the asymptotic properties of the estimator, the covariance matrix of the asymptotic distribution does not require calculating the density function of the residuals. Therefore, compared with quantile regression, expectile regression depends more on the overall distribution, and, similar to Equation (7), our method also performs variable selection while estimating the coefficients.
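As a concrete illustration, the loss $\rho_\tau$ and the objective (9) can be written in a few lines. The following is a minimal Python sketch under our notation; the function names are illustrative and not part of any library.

```python
import numpy as np

def expectile_loss(u, tau):
    # rho_tau(u) = |tau - I(u < 0)| * u^2
    return np.abs(tau - (u < 0)) * u ** 2

def dler_objective(beta, alpha, X_list, Z_list, y_list, tau, lam_beta, lam_alpha):
    # first term of (9): asymmetric squared loss over all subjects and observations
    loss = sum(expectile_loss(y - X @ beta - Z @ a, tau).sum()
               for X, Z, y, a in zip(X_list, Z_list, y_list, alpha))
    # L1 penalties on the fixed effects and on every random effect vector
    return loss + lam_beta * np.abs(beta).sum() \
                + lam_alpha * sum(np.abs(a).sum() for a in alpha)
```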

2.3. Iterative Lasso-Expectile Regression Algorithm

Clearly, it is very difficult to estimate the parameters $\beta$, $\alpha_i$, $\lambda_\beta$ and $\lambda_\alpha$ of the double penalized expectile regression model directly. We propose the iterative Lasso-expectile regression algorithm for this optimization problem: fixing one block of variables and solving for the other, so that each update of $\alpha_i$ or $\beta$ is equivalent to solving a standard Lasso expectile model. Although the iterative sequence cannot be guaranteed to converge to the global optimum, the objective function decreases monotonically, so the algorithm terminates after finitely many iterations at a local optimum (the detailed proof is in Section 3).
Since this algorithm can find the solution paths of $\hat\beta$ over $\lambda_\beta$ when $\alpha$ is given, it is more efficient to combine the iterative approach with the tuning of $\lambda_\beta$ than to solve $\hat\beta$ for a single fixed $\lambda_\beta$. We choose the best $\lambda_\beta$ to fit $\beta$ and the best $\lambda_\alpha$ to fit $\alpha_i$ according to selection criteria. The Schwarz Information Criterion (SIC) [29] and the generalized approximate cross-validation criterion (GACV) [30] are the two criteria most frequently employed for expectile regression:
$$\mathrm{SIC}(\lambda_\beta) = \ln(S_K / N) + \frac{\ln N}{2N} |K|$$
$$\mathrm{GACV}(\lambda_\beta) = \frac{S_M}{N - |M|}$$
where $S_K = \sum_{i=1}^{n} \sum_{j=1}^{n_i} \rho_\tau (y_{ij} - x_{ij}^T \beta - z_{ij}^T \hat\alpha_i)$, $N = \sum_{i=1}^{n} n_i$, and $|K|$ is the dimension of the model $K$, i.e., the number of non-zero components of $\beta$ (similarly for $|M|$). The optimal penalty parameter is taken at the point where the SIC or GACV curve attains its minimum.
In practice, we can first present a grid of tuning parameters, for instance dividing $(0, a]$ evenly into $b$ parts, and then select the best tuning parameter based on these two criteria. We compared the two criteria when solving for $\beta$ given $\alpha_i = \hat\alpha_i$, $i = 1, 2, \dots, n$; the solution for $\alpha_i$ is derived similarly given $\beta = \hat\beta$.
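A grid search of this kind is straightforward to sketch. The snippet below is a hypothetical illustration: it assumes a solver `fit_lasso_expectile` and a loss evaluator `residual_loss`, neither of which is defined in the paper, and the default grid values are arbitrary.

```python
import numpy as np

def select_lambda_sic(fit_lasso_expectile, residual_loss, a=2.0, b=50, N=500):
    # grid over (0, a] divided evenly into b parts
    best = None
    for lam in np.linspace(a / b, a, b):
        beta_hat = fit_lasso_expectile(lam)
        S_K = residual_loss(beta_hat)             # sum of check-function losses
        K = int(np.sum(np.abs(beta_hat) > 1e-8))  # number of nonzero coefficients
        sic = np.log(S_K / N) + np.log(N) / (2 * N) * K
        if best is None or sic < best[0]:
            best = (sic, lam, beta_hat)
    return best[1], best[2]                       # optimal lambda and its fit
```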
We summarize the above procedure in Algorithm 1.
Algorithm 1. Iterative Lasso-Expectile Regression Algorithm.
Input: y , X , Z
Output: β ^ , α ^ , λ β , λ α .
Step 1: Give the initial value $\hat\alpha_i^{(0)} = 0$, $i = 1, \dots, n$, and obtain the standard Lasso solution $\hat\beta^{(0)} = \arg\min_\beta L(\beta, 0)$;
Step 2: Iterate the following two Lasso optimization steps for $v = 0, 1, \dots$:
 ● $\hat\alpha^{(v+1)} = \arg\min_\alpha L(\hat\beta^{(v)}, \alpha)$, where the modified residual is $f^{(v)}_{ij} = y_{ij} - x_{ij}^T \hat\beta^{(v)}$;
 ● $\hat\beta^{(v+1)} = \arg\min_\beta L(\beta, \hat\alpha^{(v+1)})$, using the new response variable $y^*_{ij} = y_{ij} - z_{ij}^T \hat\alpha_i^{(v+1)}$;
Step 3: Terminate when $\sum_{k=1}^{h} |\hat\beta_k^{(v+1)} - \hat\beta_k^{(v)}| / h < \delta$ for a pre-specified small value $\delta$.
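To make the loop above concrete, here is a self-contained Python sketch of Algorithm 1. The inner Lasso-expectile subproblems are solved by proximal gradient descent (ISTA), which is our own choice of solver rather than one specified by the paper; the stacked design matrices `X` and `Z` are assumed to be built beforehand.

```python
import numpy as np

def soft_threshold(v, t):
    # elementwise soft-thresholding, the prox of t * ||.||_1
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_expectile(A, y, tau, lam, iters=500):
    """min_w sum_i |tau - I(r_i < 0)| r_i^2 + lam * ||w||_1, with r = y - A w."""
    # 1/L step size, L = Lipschitz bound of the smooth part's gradient
    lr = 1.0 / (2.0 * max(tau, 1.0 - tau) * np.linalg.norm(A, 2) ** 2)
    w = np.zeros(A.shape[1])
    for _ in range(iters):
        r = y - A @ w
        grad = -2.0 * A.T @ (np.abs(tau - (r < 0)) * r)  # gradient of smooth part
        w = soft_threshold(w - lr * grad, lr * lam)      # ISTA / proximal step
    return w

def algorithm1(X, Z, y, tau, lam_beta, lam_alpha, delta=1e-3, max_iter=50):
    # X: N x h fixed-effect design; Z: N x (n*l) block-diagonal random-effect design
    beta = lasso_expectile(X, y, tau, lam_beta)          # Step 1 (alpha^(0) = 0)
    alpha = np.zeros(Z.shape[1])
    for _ in range(max_iter):                            # Step 2
        alpha = lasso_expectile(Z, y - X @ beta, tau, lam_alpha)
        beta_new = lasso_expectile(X, y - Z @ alpha, tau, lam_beta)
        if np.mean(np.abs(beta_new - beta)) < delta:     # Step 3 stopping rule
            return beta_new, alpha
        beta = beta_new
    return beta, alpha
```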
We now use an example to show the solution path in each iteration step under SIC and GACV. Consider the following data-generating process:
$$y_{ij} = x_{ij}^T \beta + z_{ij}^T \alpha_i + r_{ij}$$
where $\beta = (3, 1.5, 0, 0, 2, 0, 0, 0)^T$, $z_{ij}^T = (1, x_{ij1})$, and $x_{ij}^T = (x_{ij1}, x_{ij2}, \dots, x_{ij8})$, $i = 1, 2, \dots, 10$, $j = 1, 2, \dots, 20$, are iid from the standard normal distribution; the random effects are iid $\alpha_i = (\alpha_{i0}, \alpha_{i1})^T \sim N_2(0, I)$, and $r_{ij} \sim N(0, 4)$. We use the double penalized Lasso expectile regression (DLER) and double penalized Lasso quantile regression (DLQR) methods to study the two criteria and obtain the following results.
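The data-generating process above can be reproduced in a few lines. This is a minimal numpy sketch under the stated distributions; the random seed is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
n, ni, p = 10, 20, 8
beta = np.array([3.0, 1.5, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0])
X = rng.standard_normal((n, ni, p))                             # iid N(0, 1) covariates
Z = np.concatenate([np.ones((n, ni, 1)), X[:, :, :1]], axis=2)  # z_ij = (1, x_ij1)
alpha = rng.standard_normal((n, 2))                             # alpha_i ~ N_2(0, I)
r = rng.normal(0.0, 2.0, size=(n, ni))                          # r_ij ~ N(0, 4), sd = 2
y = X @ beta + np.einsum('ijk,ik->ij', Z, alpha) + r
```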
Figure A1 and Figure A2 (in Appendix B) show the search paths of the DLQR method for the penalty parameters of $\alpha$ and $\beta$ under the SIC and GACV criteria in the first iteration, respectively. The penalty parameter search paths of SIC and GACV are clearly different in the graphs, but the corresponding optimal penalty parameter values are close.
Figure A3 and Figure A4 (in Appendix B) show the corresponding results of the DLER method in the first iteration. The loss functions of the DLER and DLQR methods differ in nature, so under different penalty parameters the criterion values of the two methods are not comparable and cannot be analyzed horizontally from a numerical perspective. From the paths of the fixed effect coefficients, it can be seen that as the penalty coefficient $\lambda$ increases, the DLER method quickly compresses the inactive covariate coefficients to 0, while the coefficients of the important covariates change slowly around their true values. The DLQR coefficients, by contrast, do not fluctuate much with changes in the penalty coefficient. This indicates that the DLER method has a clear advantage over DLQR in variable selection.

3. Asymptotic Properties

In this part, we give the convergence of the iterative Lasso expectile regression algorithm and the asymptotic properties of the double penalized expectile regression.

3.1. Convergence of Iterative Lasso-Expectile Algorithm

In order to prove the asymptotic properties of the double penalized expectile regression method, the following lemmas are provided:
Lemma 1.
For any given $\lambda_\beta, \lambda_\alpha$, if $L(\beta, \alpha)$ is a continuous and strictly convex function of $\alpha$ and $\beta$, then there exists a unique $\beta(\alpha)$ that minimizes $L(\beta, \alpha)$. The same conclusion also applies to $\alpha$.
The Proof of Lemma 1 is given in Appendix A.
Lemma 2.
For given $\lambda_\beta, \lambda_\alpha$, define $\eta: \mathbb{R}^d \to \mathbb{R}^d$ as the one-step update mapping $(\beta, \alpha) \mapsto (\beta^{(1)}, \alpha^{(1)})$ of the above iterative algorithm; then $\eta$ is continuous.
The Proof of Lemma 2 is given in Appendix A.
Lemma 3.
The sequence of solutions $(\hat\beta^{(r)}, \hat\alpha^{(r)})$ produced by the iterative Lasso expectile algorithm decreases the objective function $L(\hat\beta^{(r)}, \hat\alpha^{(r)})$, and $(\hat\beta^{(r)}, \hat\alpha^{(r)})$ converges to $(\bar\beta, \bar\alpha)$ under the assumption that there is a unique $(\bar\beta, \bar\alpha)$ minimizing $L(\beta, \alpha)$ for the given $\lambda_\beta$ and $\lambda_\alpha$.
The Proof of Lemma 3 is given in Appendix A.

3.2. Asymptotic Properties of DLER

In order to obtain the asymptotic properties of the double penalized expectile regression method, we first state the following regularity assumptions:
(A1) Given $x_{ij}, z_{ij}, \alpha_i$, the $y_{ij}$ have independent conditional distribution functions $F_{ij}(\cdot)$ and density functions $f_{ij}(\cdot)$, with $0 < f_{ij}(\kappa_{ij}(\tau)) < +\infty$, $i = 1, \dots, n$, $j = 1, \dots, m$, where $\kappa_{ij}(\tau) = x_{ij}^T \beta(\tau) + z_{ij}^T \alpha_i$, $\operatorname{Var}[\Pi_\tau(r_{i\tau}) r_{i\tau}] = E[\Pi_\tau(r_{i\tau}) r_{i\tau} r_{i\tau}^T \Pi_\tau(r_{i\tau})] = \Lambda_{i\tau}$, $r_{i\tau} = (r_{i1\tau}, \dots, r_{im\tau})^T$, $r_{ij\tau} = y_{ij} - x_{ij}^T \beta_\tau$, and $\Pi_\tau(r_{i\tau}) = [\operatorname{diag}(\varphi_\tau(r_{ij\tau}))]_{j=1}^m$;
(A2) The $\alpha_{it}$, $i = 1, \dots, n$, $t = 1, \dots, l$, are independent with distribution functions $G_{it}(\cdot)$ and density functions $g_{it}(\cdot)$, and $0 < g_{it}(0) < +\infty$;
(A3) Define $\vartheta = \tau(1 - \tau)$ and $\Delta = \operatorname{diag}(f_{ij}(\kappa_{ij}(\tau)))$; the following positive definite matrices exist:
$$P_0(\tau) = \lim_{m, n \to \infty} \vartheta\, m^{-1} \begin{pmatrix} Z^T \Lambda_\tau Z & Z^T \Lambda_\tau X / \sqrt{n} \\ X^T \Lambda_\tau Z / \sqrt{n} & X^T \Lambda_\tau X / n \end{pmatrix}$$
$$P_1(\tau) = \lim_{m, n \to \infty} m^{-1} \begin{pmatrix} Z^T E[\Pi_\tau(r_\tau)] Z & Z^T E[\Pi_\tau(r_\tau)] X / \sqrt{n} \\ X^T E[\Pi_\tau(r_\tau)] Z / \sqrt{n} & X^T E[\Pi_\tau(r_\tau)] X / n \end{pmatrix}$$
where $\Lambda_\tau = \operatorname{Var}[\Pi_\tau(r_\tau) r_\tau] = \operatorname{diag}[\Lambda_{i\tau}]_{i=1}^n$;
(A4) $\max_{1 \le i \le n, 1 \le j \le m} \|x_{ij}\| / \sqrt{nm} \to 0$ and $\max_{1 \le i \le n, 1 \le j \le m} \|z_{ij}\| / \sqrt{m} \to 0$.
Assumptions (A1) and (A3) are standard conditions for panel data models. Assumption (A1) is common in the quantile regression literature; it not only ensures independence between observed individuals, but also allows heterogeneity within individuals. (A3) gives the full rank condition needed for the Lindeberg-Feller central limit theorem; when $\tau = 0.5$, $P_1(\tau)$ and hence (A3) simplify.
We give the asymptotic properties of the double penalized expectile regression estimator of Equation (9) and consider the following objective function:
$$\Gamma_{nm}(u) = \sum_{i=1}^{n} \sum_{j=1}^{m} \left[ \rho_\tau\!\left( y_{ij} - \kappa_{ij}(\tau) - \frac{x_{ij}^T u^{(1)}}{\sqrt{nm}} - \frac{z_{ij}^T u_i^{(2)}}{\sqrt{m}} \right) - \rho_\tau( y_{ij} - \kappa_{ij}(\tau) ) \right] + \lambda_\beta \sum_{k=1}^{h} \left( \left| \beta_k + \frac{u_k^{(1)}}{\sqrt{nm}} \right| - |\beta_k| \right) + \lambda_\alpha \sum_{i=1}^{n} \sum_{t=1}^{l} \left( \left| \alpha_{it} + \frac{u_{it}^{(2)}}{\sqrt{m}} \right| - |\alpha_{it}| \right)$$
where $\Gamma_{nm}(u)$ is a convex function whose minimizer is
$$\hat u = \begin{pmatrix} \hat u^{(1)} \\ \hat u^{(2)} \end{pmatrix} = \begin{pmatrix} \sqrt{nm}\,(\hat\beta(\tau) - \beta(\tau)) \\ \sqrt{m}\,(\hat\alpha - \alpha) \end{pmatrix}$$
Theorem 1.
Under assumptions (A1)–(A4), as $n, m \to \infty$, if $\lambda_\beta / \sqrt{mn} \to \lambda_1 \ge 0$, $\lambda_\alpha / \sqrt{m} \to \lambda_2 \ge 0$, and there exists $a > 0$ such that $n^a / m \to 0$, then
$$\hat u = \arg\min \Gamma_{mn}(u) \xrightarrow{d} \arg\min \Gamma_0(u)$$
where
$$\Gamma_0(u) = -2 u^T W + u^T P_1 u + \lambda_1 \sum_{k=1}^{h} \left[ u_k^{(1)} \operatorname{sgn}(\beta_k) I(\beta_k \ne 0) + |u_k^{(1)}| I(\beta_k = 0) \right] + \lambda_2 S^T u^{(2)}$$
$$W \sim N(0, P_0), \quad S \sim N(0, \operatorname{diag}(4 G_{it}(0)(1 - G_{it}(0))))$$
According to the above theorem, the double penalized estimator is biased for the non-zero coefficients, and the degree of bias is regulated by the tuning parameters. The proof of Theorem 1 is given in Appendix A.

4. Simulation Study

In this section, we illustrate the performance of the proposed double penalized expectile regression. Simulation 1 studies the impact of the SIC and GACV criteria on the double penalized Lasso expectile regression model and the double penalized Lasso quantile regression model [9], comparing the two methods at different expectile levels and signal-to-noise ratios; the four combinations are denoted DLER-SIC, DLER-GACV, DLQR-SIC, and DLQR-GACV. To illustrate the robustness of the DLER method in excluding inactive variables, Simulations 2 and 3 study the impact of random effects on coefficient estimation, comparing the performance of DLER and DLQR under different random effects, as well as under different error distributions and different degrees of model sparsity, to illustrate the advantage of DLER in variable selection. To illustrate the advantage of the DLER method on high-dimensional data, Simulation 4 compares the two methods when the dimension of the covariates is larger than the sample size.
In order to evaluate the accuracy of model coefficient estimation, we use mean square error (MSE) as the evaluation index. MSE is defined as:
$$MSE = (\hat\beta_t - \beta)^T \Sigma^{-1} (\hat\beta_t - \beta)$$
where $\Sigma = (\rho^{|k-h|})_{8 \times 8}$ and $\hat\beta_t$ is the estimator of $\beta$ in the $t$th simulation. SD stands for the 100-repetition bootstrap standard deviation. Corr denotes the average number of important variables correctly entered into the model, and Incorr denotes the average number of inactive variables incorrectly entered into the model. $X_1 (SD_1)$, $X_2 (SD_2)$, $X_5 (SD_5)$ represent the total counts of each important variable being correctly selected into the model and their bootstrap standard deviations; $X_3 (SD_3)$, $X_4 (SD_4)$, $X_6 (SD_6)$, $X_7 (SD_7)$, $X_8 (SD_8)$ represent the total counts of each redundant variable being correctly excluded from the model and their bootstrap standard deviations. Int $(SD_{int})$ represents the total count of the intercept being selected into the model and its bootstrap standard deviation.
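As a sketch, the weighted MSE above and the per-variable selection rates can be computed as follows; `Sigma` uses the correlation $\rho^{|k-h|}$ stated in the text, and the 0.1 threshold matches the truncation rule used in the simulations.

```python
import numpy as np

rho, p = 0.5, 8
Sigma = rho ** np.abs(np.subtract.outer(np.arange(p), np.arange(p)))

def weighted_mse(beta_hat, beta):
    d = beta_hat - beta
    return d @ np.linalg.solve(Sigma, d)   # (b_hat - b)^T Sigma^{-1} (b_hat - b)

def selection_rate(beta_hats, thresh=0.1):
    # proportion of repetitions in which each coefficient is retained
    # (estimates with |value| < thresh are treated as 0)
    return (np.abs(np.asarray(beta_hats)) >= thresh).mean(axis=0)
```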
Simulation 1.
The impact of “signal-to-noise ratio” on estimation.
We compare the effectiveness of the two tuning parameter criteria, SIC and GACV. The data-generating process is
$$y_{ij} = \beta_0 + x_{ij}^T \beta + z_{ij}^T \alpha_i + \sigma r_{ij} \tag{15}$$
We let $x_{ij}^T = (x_{ij1}, x_{ij2}, \dots, x_{ij8})$, $i = 1, 2, \dots, 20$, $j = 1, 2, \dots, 25$, and $z_{ij}^T = (1, x_{ij1}, x_{ij2}, \dots, x_{ij5})$, where $X_1, \dots, X_8$ are generated from $N(0, 1)$ with correlation between $X_k$ and $X_l$ equal to $\rho^{|l-k|}$, $\rho = 0.5$. The random effects are iid $\alpha_i = (\alpha_{i0}, \alpha_{i1}, \dots, \alpha_{i5})^T \sim N_6(0, P)$, where $P = \operatorname{diag}(1, 1, 1, 1, 0, 0)$, and $\beta_0 = 0$, $\beta = (3, 1.5, 0, 0, 2, 0, 0, 0)^T$. The error terms are iid $r_{ij} \sim N(0, 1)$. We set $\tau = 0.25, 0.5, 0.75$ and $\sigma = 1, 2, 3$ to compare the estimation methods. In every simulation, we regard a coefficient estimate as 0 if its absolute value is less than 0.1. We set $\varepsilon = 10^{-3}$ in the iterative Lasso algorithm of DLER, the same setting as in the iterative algorithm of DLQR of Li (2020) [9]. Table A1 and Table A2 (in Appendix C) give the results of coefficient estimation and variable selection, respectively.
According to Table A1, we first analyze the estimation accuracy. When $\sigma$ is fixed, the MSE of both methods almost always reaches its minimum at $\tau = 0.5$, where the estimation accuracy of the two methods is best; accuracy at the other expectile levels is slightly worse, but the difference is not large. For example, when $\sigma = 1$, the MSE of the DLER-SIC method is 0.292 at $\tau = 0.5$, while at the other expectile levels it is 0.306 and 0.374, respectively. This shows that the DLER method has the same accuracy as the DLQR method in parameter estimation. In addition, comparing the two tuning parameter criteria, the estimation accuracy under SIC is better than under GACV for both the DLER and DLQR models.
Next, we analyze the accuracy of variable selection. According to the SDs of each variable estimate in Table A1, for fixed $\sigma$, the SD decreases gradually as $\tau$ increases, and the variable selection accuracy of both models increases. According to Table A2, the DLER method selects 99% of the important variables into the model. For each inactive variable, the table shows that when $\tau$ is fixed, the SD increases with $\sigma$, indicating that the accuracy of the DLQR method in excluding inactive variables gradually weakens. For the DLER method, the SD is largest at $\sigma = 2$, where DLER has its lowest accuracy, yet it is still better than DLQR; at $\sigma = 3$ the SD of DLER is smallest, indicating that as the noise level increases, DLER excludes the redundant variables better than DLQR. The results of Table A2 lead to the same conclusion.
Therefore, for parameter estimation, when the noise level is relatively large, the DLER method is better than the DLQR method. In terms of variable selection, under both the SIC and GACV criteria, almost 99% of the important variables are selected by both methods. In the ability to exclude redundant variables, the DLER method is more advantageous than the DLQR method. In summary, the DLER and DLQR methods have comparable parameter estimation accuracy, but DLER is better in estimation stability and in excluding inactive variables.
Simulation 2.
The influence of random effects.
We use this simulation to show the influence of random effects on DLER-SIC, DLER-GACV, DLQR-SIC and DLQR-GACV. The data generation model is Equation (15) with fixed $\sigma = 1$, and we consider three forms of the covariance matrix $P$ of the random effects:
P 1 = diag ( 1 , 1 , 1 , 1 , 0 , 0 )
P 2 = diag ( 2 , 2 , 2 , 2 , 0 , 0 )
P 3 = diag ( 3 , 3 , 3 , 3 , 0 , 0 )
As the random effects increase, we obtain the results of 100 repeated simulations with $\tau = 0.25, 0.5, 0.75$. Table A3 and Table A4 (in Appendix C) give the estimation and variable selection results of the different methods under the different random effects at $\tau = 0.5$.
According to Table A3, with increasing interference from the random effects, the estimation accuracy of both methods decreases, and the accuracy of the fixed effect coefficients decreases, especially for the first two important covariates, which are affected by the random effects. However, in terms of variable selection, the accuracy of the DLER method changes little: almost all the important variables are still selected into the model, and its ability to exclude the inactive variables remains better than that of the DLQR method. In particular, for the DLER-SIC method, Table A4 shows that the proportion of correctly selected variables is 99%, and the accuracy of excluding the redundant variables is almost always above 90%. In general, since the double penalized expectile regression accounts for the random effects while selecting the fixed effects, its variable selection accuracy is almost free from interference by the random effects, although the accuracy of the fixed effect coefficient estimates is still affected to a degree. Therefore, the DLER and DLQR methods have comparable parameter estimation accuracy, but DLER remains more robust than DLQR in variable selection even when random effect interference is added to the model.
Simulation 3.
The case of different model sparsity and error distribution.
We compare DLQR and DLER under different degrees of model sparsity and different error distributions. The data generation model is Equation (15), with the fixed effects taking the following three forms:
(1) Dense: $\beta = (1, 1, 1, 1, 1, 1, 1, 1)^T$
(2) Sparse: $\beta = (3, 1.5, 0, 0, 2, 0, 0, 0)^T$
(3) Highly sparse: $\beta = (5, 0, 0, 0, 0, 0, 0, 0)^T$
We consider $\sigma = 1$, $\tau = 0.5$, $P = \operatorname{diag}(2, 2, 2, 2, 0, 0)$, with the error terms distributed as $N(0, 1)$, $t(3)$, and $Cauchy(0, 1)$, respectively, and compare DLER-SIC, DLER-GACV, DLQR-SIC and DLQR-GACV over 100 repeated simulations. Table A5 and Table A6 show the results of coefficient estimation and variable selection under the dense model, Table A7 and Table A8 show the results under the sparse model, and Table A9 and Table A10 show the results under the highly sparse model (see Appendix C for the tables).
Consider Table A5, where all variables are important. As the error distribution becomes heavier tailed, the MSE increases and the estimation accuracy of both methods decreases; the DLER and DLQR methods have comparable parameter estimation accuracy. Considering the variable selection accuracy of the two methods, although neither method selects all of the important variables into the model, Table A6 shows that the average number of correctly retained variables exceeds 7.6, very close to the true value of 8. Moreover, when the error term changes from the normal distribution to a heavy-tailed distribution, DLER changes the least among all methods, so it is only weakly influenced by the error distribution. In summary, the variable selection of DLER is robust to changes in the error distribution.
Next, the results for the sparse and highly sparse models are analyzed. The estimation accuracy results are similar to the dense model: as the error distribution becomes more complex, the MSE increases and the estimation accuracy of both methods decreases. For variable selection, Table A7 and Table A8 show that when the errors are normally distributed, the two methods differ little in the accuracy of excluding inactive variables in the sparse model. However, when the errors follow $t(3)$ or $Cauchy(0, 1)$, the DLER method is significantly better than DLQR. The advantage of DLER in excluding inactive variables is most obvious in the highly sparse model with $Cauchy(0, 1)$ errors, especially for the DLER-SIC method. This shows that expectile regression is considerably more robust than quantile regression here.
Simulation 4.
The case of high dimensional data.
High-dimensional data are widely available in the stock market, biomedicine, aerospace and other fields, so modeling and analyzing high-dimensional data has great practical significance. We investigate the performance of the proposed model in high-dimensional variable selection scenarios. The data generation model is still Equation (15), but we reduce the sample size to $n = 10$, $m = 10$, i.e., a total sample size of 100. In addition, 102 independent noise variables $X_9, X_{10}, \dots, X_{110}$ are added to the sparse model above; all of these variables are independent and identically distributed as $N(0, 0.25^2)$, so the total number of variables is 110, larger than the total sample size, and $\beta_{110 \times 1} = (3, 1.5, 0, 0, 2, 0, 0, 0, \dots, 0)^T$. There are three truly important covariates and 107 redundant covariates. In addition, we set $\sigma = 0.5$, $P = \operatorname{diag}(1, 1, 1, 1, 0, 0)$. Table A11 and Table A12 (in Appendix C) show 100 repeated results of the two methods at $\tau = 0.5, 0.75$, where $X_0 (SD_0)$ denotes the average number of redundant variables correctly excluded from the model and its bootstrap standard deviation.
First, we analyze the estimation precision of the two methods. According to Table A11, at a fixed expectile level, when the dimension of the covariates is larger than the sample size, the MSE is larger than in the previous simulations; the MSE of the DLER method changes less than that of DLQR, and the MSE of the DLQR method is significantly larger than that of DLER. Thus, although the estimation accuracy of both methods decreases, the DLER method is significantly better and more stable than the DLQR method. Next, we analyze the accuracy of variable selection. According to Table A11 and Table A12, the DLER method ensures that the proportion of excluded redundant variables exceeds 95% at both expectile levels. At a fixed expectile level, DLER has an absolute advantage over DLQR in excluding redundant variables; in particular, DLER-GACV excludes more than 97% of the inactive variables at $\tau = 0.5$. Therefore, when the dimension of the covariates is larger than the sample size, the DLER method is superior to DLQR in estimation accuracy and in excluding inactive variables, both at the median level $\tau = 0.5$ and at the extreme expectile level $\tau = 0.75$.

5. Application

CD4 cells play an important role in determining the efficacy of AIDS treatment and the immune function of patients, so excluding inactive variables is important when analyzing CD4 cell data. We applied the model to real CD4 cell count data. For a complete description of the data set, please refer to Diggle, P.J.'s homepage: https://www.lancaster.ac.uk/staff/diggle/, accessed on 30 June 2022; we selected a part of this data set. The response variable is the square-root transformation of the CD4 cell count. The variables in the data set include the time since seroconversion (time), the age relative to a starting point (age), the smoking status measured by the number of packs smoked (smoking), the number of sexual partners (sex partner), and the depression state and degree (depression). We choose time, smoking, age and sex partner as the candidate fixed effects for the CD4 cell counts, where time and age are the random effects. On this basis, consider the following model:
$$Y_{ij} = X_{ij}^T \beta + Z_{ij}^T \alpha_i + r_{ij}$$
where $Y_{ij}$ is the $j$th observation of the $i$th individual, $X_{ij}^T$ is the vector of observed explanatory variables, and $Z_{ij}^T$ is a subset of $X_{ij}^T$. We set the threshold of $\beta$ to 0.05 and the threshold of $\alpha_i$ to 0.1. Table A13 (in Appendix C) gives the results of the four methods at $\tau = 0.1, 0.25, 0.5, 0.75, 0.9$.
From the results in Table A13, the proposed double penalized expectile regression method gives variable selection results while estimating the coefficients. A variable with a value of 0 in the table is excluded from the model. Both estimation methods excluded variables $X_3$ and $X_4$ from the model: of the four variables, only time and smoking affect CD4 cells. From the signs of the coefficient estimates, the value for time is always less than 0, indicating that time has a negative influence on CD4 cells, while the values for smoking are all larger than 0, indicating that smoking has a positive influence on CD4 cells.
Analyzing the numerical changes: for variable $X_2$, under the SIC criterion, the DLER estimates fluctuate less as the expectile level changes, so DLER is relatively stable, indicating that the stability of the DLER-SIC method is slightly stronger. For variable $X_1$, under both the SIC and GACV criteria, the numerical fluctuation of the DLER method is smaller than that of the DLQR method, indicating that the stability of DLER in estimating $X_1$ is better than that of DLQR.
This shows that the DLER method has strong practical utility on the CD4 data for excluding inactive variables and analyzing the influencing factors.

6. Conclusions

In this paper, we propose the double penalized expectile regression method for the linear mixed effects model, which imposes penalties on both the fixed effects and the random effects, fully accounting for the random effects in coefficient estimation and variable selection. Simulations and an application study show that the proposed model is highly robust in excluding inactive variables. This conclusion is also supported by comparison with the double penalized quantile regression method: the double penalized expectile regression attains the same parameter estimation accuracy as the double penalized quantile regression, but has a clear advantage in selecting variables, especially in excluding inactive variables.
When the signal-to-noise ratio differs, the precision of coefficient estimation and the accuracy of variable selection differ as well. When the noise level is larger, the proposed method outperforms the quantile model in coefficient estimation and in excluding inactive variables. In addition, because the double penalized expectile regression selects variables for both the fixed and random effects, its variable selection accuracy is almost undisturbed by the random effects. Comparing the dense and sparse models shows that the sparser the model and the more complex the error distribution, the more obvious the advantage of the new method, indicating strong robustness. When the dimension of the covariates is larger than the sample size, the method has obvious advantages in estimation precision and in excluding inactive variables. Finally, in the real data analysis, the new method shows strong practical utility for the longitudinal CD4 cell data: it excludes the inactive variables and reveals the influence trend of each factor, thereby supporting accurate medical judgment.

Author Contributions

Conceptualization, J.C. and S.G.; methodology, S.G.; software, S.G.; validation, Z.Y., J.L. and Y.H.; formal analysis, S.G.; investigation, S.G.; resources, Z.Y.; data curation, J.L.; writing—original draft preparation, S.G.; writing—review and editing, S.G.; visualization, Z.Y.; supervision, J.C.; project administration, Y.H.; funding acquisition, J.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the National Natural Science Foundation of China under Grant 81671633 to J. Chen.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All data in this paper have been presented in the manuscript.

Acknowledgments

Many thanks to the reviewers for their positive feedback, valuable comments and constructive suggestions that helped improve the quality of this article. Many thanks also to the editors for their great help and coordination in the publication of this paper.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Proof of Lemma 1.
First, prove the existence of the conclusion.
We fix $\alpha$ and select $\beta_0 \in \mathbb{R}^h$ arbitrarily. Define
$$M_\beta = \{\beta \in \mathbb{R}^h \mid L(\beta, \alpha) \le L(\beta_0, \alpha)\}$$
The continuity of $L$ shows that $M_\beta$ is closed. For a given $\lambda_\beta > 0$, $L(\beta, \alpha) \le L(\beta_0, \alpha)$ implies $\|\beta\|_1 \le L(\beta_0, \alpha) / \lambda_\beta$, so $M_\beta$ is contained in a ball of $\mathbb{R}^h$ and is therefore bounded. Hence there exists $\beta$ in $M_\beta$ at which $L(\beta, \alpha)$ attains its minimum.
Uniqueness follows directly from the strict convexity of $L$ in $\beta$. Similarly, for given $\beta$ and $\lambda_\alpha > 0$, there is a unique $\alpha(\beta)$ minimizing $L(\beta, \alpha)$. □
Proof of Lemma 2.
From the iterative algorithm, $(\beta, \alpha) \mapsto (\beta^{(1)}, \alpha^{(1)})$ is a composite mapping, so it suffices to show that the mappings $\beta \mapsto \alpha^{(1)}$ and $\alpha^{(1)} \mapsto \beta^{(1)}$ are both continuous. Moreover, solving for $\alpha^{(1)}$ given $\beta$ (minimizing $L(\beta, \alpha^{(1)})$) is symmetric to solving for $\beta^{(1)}$ given $\alpha^{(1)}$ (minimizing $L(\beta^{(1)}, \alpha^{(1)})$), so we only need to show that the mapping $\alpha \mapsto \beta^{(1)}$ is continuous.
For simplicity, we omit the superscript below. Define $g(\alpha) = \inf_\beta L(\beta, \alpha)$; $g$ is a convex continuous function because $L(\beta, \alpha)$ is convex.
Next, we prove that $\beta(\alpha_n) \to \beta(\alpha) = \tilde\beta$ for every sequence $\alpha_n \to \alpha$. Since $g$ is continuous, for every $r > 0$ there exists $N$ such that $g(\alpha_n) \le g(\alpha) + r$ whenever $n \ge N$, that is, $L(\beta(\alpha_n), \alpha_n) \le L(\tilde\beta, \alpha) + r$. Define
$$M = \max\left\{ \max_{n = 1, 2, \dots, N} L(\beta(\alpha_n), \alpha_n),\; L(\tilde\beta, \alpha) + r \right\}, \quad A = \{(\beta, \alpha) \mid L(\beta, \alpha) \le M\}.$$
Since $A$ is a bounded closed set containing the sequence $(\beta(\alpha_n), \alpha_n)$ for the given $\lambda_\alpha$ and $\lambda_\beta$, there is a convergent subsequence $\beta(\alpha_{n_h})$ with limit $\bar\beta$. From the continuity of $L$ and $g$, we get
$$L(\tilde\beta, \alpha) = g(\alpha) = \lim_{h \to \infty} g(\alpha_{n_h}) = \lim_{h \to \infty} L(\beta(\alpha_{n_h}), \alpha_{n_h}) = L(\bar\beta, \alpha)$$
Lemma 1 demonstrates that the point at which $L(\beta, \alpha)$ attains its minimum for a given $\alpha$ is unique, so $\bar\beta = \tilde\beta$, and we may conclude
$$\lim_{n \to \infty} \beta(\alpha_n) = \tilde\beta = \beta(\alpha)$$
Thus $\beta(\alpha_n) \to \beta(\alpha)$ for every sequence $\alpha_n \to \alpha$, and the mapping $\alpha \mapsto \beta^{(1)}$ is continuous. □
Proof of Lemma 3.
Given λ β and λ α , from the definition ( β ^ ( v + 1 ) , α ^ ( v + 1 ) ) in the iterative Lasso expectile algorithm, it is easy to obtain
$$L(\hat\beta^{(v)}, \hat\alpha^{(v)}) \ge L(\hat\beta^{(v)}, \hat\alpha^{(v+1)}) \ge L(\hat\beta^{(v+1)}, \hat\alpha^{(v+1)})$$
Since $L$ is strictly convex, at least one of the two inequalities above is strict unless $(\hat\beta^{(v+1)}, \hat\alpha^{(v+1)}) = (\bar\beta, \bar\alpha)$. As a result, the sequence $L(\hat\beta^{(v)}, \hat\alpha^{(v)})$ is strictly decreasing, and since $L(\hat\beta^{(v)}, \hat\alpha^{(v)}) > 0$, its limit exists; denote it by $\bar L$.
Consider
$$A = \{(\beta, \alpha) \mid L(\beta, \alpha) \le L(\hat\beta^{(0)}, \hat\alpha^{(0)})\},$$
which is a bounded closed region.
Clearly $(\hat\beta^{(v)}, \hat\alpha^{(v)}) \in A$. Selecting an arbitrary convergent subsequence $(\hat\beta^{(v_h)}, \hat\alpha^{(v_h)})$, denote $\lim_{h \to \infty} (\hat\beta^{(v_h)}, \hat\alpha^{(v_h)}) = (\tilde\beta, \tilde\alpha)$.
By the continuity of $L$, we obtain
$$\lim_{h \to \infty} L(\hat\beta^{(v_h)}, \hat\alpha^{(v_h)}) = L(\tilde\beta, \tilde\alpha)$$
Assume $(\tilde\beta, \tilde\alpha) \ne (\bar\beta, \bar\alpha)$, that is, $L(\tilde\beta, \tilde\alpha) > L(\bar\beta, \bar\alpha)$; then a further update produces $(\tilde\beta^{(1)}, \tilde\alpha^{(1)})$ with
$$r = L(\tilde\beta, \tilde\alpha) - L(\tilde\beta^{(1)}, \tilde\alpha^{(1)}) > 0$$
Write $(\beta^{(1)}, \alpha^{(1)}) = \eta(\beta, \alpha)$. By Lemma 2, $\eta$ is continuous, and because $L(\eta(\beta, \alpha))$ and $L$ are continuous, there exists $\zeta > 0$ such that
$$|L(\eta(\beta, \alpha)) - L(\eta(\tilde\beta, \tilde\alpha))| = |L(\beta^{(1)}, \alpha^{(1)}) - L(\tilde\beta^{(1)}, \tilde\alpha^{(1)})| < \frac{r}{2}$$
for every $(\beta, \alpha) \in U((\tilde\beta, \tilde\alpha), \zeta)$.
Since $\lim_{h \to \infty} (\hat\beta^{(v_h)}, \hat\alpha^{(v_h)}) = (\tilde\beta, \tilde\alpha)$, when $h$ is large enough we have
$$L(\hat\beta^{(v_h + 1)}, \hat\alpha^{(v_h + 1)}) \le L(\tilde\beta^{(1)}, \tilde\alpha^{(1)}) + \frac{r}{2} = L(\tilde\beta, \tilde\alpha) - r + \frac{r}{2} = L(\tilde\beta, \tilde\alpha) - \frac{r}{2}$$
This is obviously in contradiction with
$$\lim_{h \to \infty} L(\hat\beta^{(v_h)}, \hat\alpha^{(v_h)}) = L(\tilde\beta, \tilde\alpha)$$
So $(\tilde\beta, \tilde\alpha) = (\bar\beta, \bar\alpha)$, and since every convergent subsequence has the same limit, we obtain $(\hat\beta^{(v)}, \hat\alpha^{(v)}) \to (\bar\beta, \bar\alpha)$.
The proof is ended. □
Proof of Theorem 1. 
First, similarly to [27], we decompose the objective function to prove Theorem 1. Define the objective function as
$$R_{mn}(u) = \sum_{i=1}^{n} \sum_{j=1}^{m} \left[ \rho_\tau\!\left\{ y_{ij} - \kappa_{ij}(\tau) - \frac{x_{ij}^T u^{(1)}}{\sqrt{mn}} - \frac{z_{ij}^T u_i^{(2)}}{\sqrt{m}} \right\} - \rho_\tau\{ y_{ij} - \kappa_{ij}(\tau) \} \right]$$
Our goal is to approximate $R_{mn}$ by a quadratic function with a unique minimizer, and to use this to derive the asymptotic distribution of the minimizer. The quadratic approximation consists of a Taylor expansion of the expected value and a linear approximation term.
Let $\tilde x_{ij} = (x_{ij}^T, z_{ij}^T)^T$, $\tilde u = (u^{(1)T}/\sqrt{mn},\; u^{(2)T}/\sqrt{m})^T$, and $r_{ij\tau} = y_{ij} - \kappa_{ij}(\tau)$. The function $E(\rho_\tau(r_{ij\tau} - \tilde x_{ij}^T \tilde u) - \rho_\tau(r_{ij\tau}))$ is convex and twice continuously differentiable, attains its minimum at $\tilde u = 0$, and around $\tilde u = 0$ can be represented as
$$E\left[\rho_\tau(r_{ij\tau} - \tilde x_{ij}^T \tilde u) - \rho_\tau(r_{ij\tau})\right] = \tilde u^T \tilde x_{ij} E[\varphi_\tau(r_{ij\tau})] \tilde x_{ij}^T \tilde u - 2 \tilde u^T \tilde x_{ij} E[\varphi_\tau(r_{ij\tau}) \cdot r_{ij\tau}] + o(\|\tilde u\|^2)$$
where $\varphi_\tau(u) = |\tau - I(u < 0)|$. Since the expectation above is minimized at $\tilde u = 0$, the first-order condition gives
$$E[\varphi_\tau(r_{ij\tau})\, r_{ij\tau}] = 0$$
so the expansion can be simplified to
$$E\left[\rho_\tau(r_{ij\tau} - \tilde x_{ij}^T \tilde u) - \rho_\tau(r_{ij\tau})\right] = \tilde u^T \tilde x_{ij} E[\varphi_\tau(r_{ij\tau})] \tilde x_{ij}^T \tilde u + o(\|\tilde u\|^2)$$
The Taylor expansion around $\tilde u = 0$ can be regarded as a linear approximation. Define
$$D_{ij}(r_{ij\tau}) = -2 \varphi_\tau(r_{ij\tau})\, r_{ij\tau}$$
so that $E(D_{ij}(r_{ij\tau})) = 0$ by the first-order condition, and define
$$q_{ij}(\tilde u) = \rho_\tau(r_{ij\tau} - \tilde x_{ij}^T \tilde u) - \rho_\tau(r_{ij\tau}) - \tilde u^T \tilde x_{ij} D_{ij}(r_{ij\tau})$$
Then
$$R_{mn}(\tilde u) = \sum_{i=1}^{n} \sum_{j=1}^{m} E\left[\rho_\tau(r_{ij\tau} - \tilde x_{ij}^T \tilde u) - \rho_\tau(r_{ij\tau})\right] + \sum_{i=1}^{n} \sum_{j=1}^{m} \tilde u^T \tilde x_{ij} D_{ij}(r_{ij\tau}) + \sum_{i=1}^{n} \sum_{j=1}^{m} \left( q_{ij}(\tilde u) - E[q_{ij}(\tilde u)] \right)$$
According to Koenker [5], the objective function $R_{mn}(\tilde u)$ reduces to
$$R_{mn}(\tilde u) = \tilde u^T \sum_{i=1}^{n} \sum_{j=1}^{m} \left( \tilde x_{ij} E[\varphi_\tau(r_{ij\tau})] \tilde x_{ij}^T \right) \tilde u + \tilde u^T \sum_{i=1}^{n} \sum_{j=1}^{m} \tilde x_{ij} D_{ij}(r_{ij\tau}) + o_p(1)$$
Let $\tilde x_{ij} = (z_{ij}^T, x_{ij}^T)^T$ and $\tilde u = (u_0^T/\sqrt{m},\; u_1^T/\sqrt{mn})^T$. Then
$$R_{mn}(u) = -\frac{2}{\sqrt{m}} \left[ u_0^T Z^T \Pi_\tau(r_\tau) r_\tau + \frac{u_1^T}{\sqrt{n}} X^T \Pi_\tau(r_\tau) r_\tau \right] + \frac{1}{m} \left[ u_0^T Z^T E[\Pi_\tau(r_\tau)] Z u_0 + \frac{2}{\sqrt{n}} u_0^T Z^T E[\Pi_\tau(r_\tau)] X u_1 + \frac{1}{n} u_1^T X^T E[\Pi_\tau(r_\tau)] X u_1 \right] = \Gamma_{mn}^{(1)}(u) + \Gamma_{mn}^{(2)}(u)$$
Therefore, we decompose Γ m n ( u ) into four parts
$$\Gamma_{mn}(u) = \Gamma_{mn}^{(1)}(u) + \Gamma_{mn}^{(2)}(u) + \Gamma_{mn}^{(3)}(u) + \Gamma_{mn}^{(4)}(u)$$
where
$$\Gamma_{mn}^{(1)}(u) = -\frac{2}{\sqrt{m}} \left[ u_0^T Z^T \Pi_\tau(r_\tau) r_\tau + \frac{u_1^T}{\sqrt{n}} X^T \Pi_\tau(r_\tau) r_\tau \right]$$
$$\Gamma_{mn}^{(2)}(u) = \frac{1}{m} \left[ u_0^T Z^T E[\Pi_\tau(r_\tau)] Z u_0 + \frac{2}{\sqrt{n}} u_0^T Z^T E[\Pi_\tau(r_\tau)] X u_1 + \frac{1}{n} u_1^T X^T E[\Pi_\tau(r_\tau)] X u_1 \right]$$
$$\Gamma_{mn}^{(3)}(u) = \lambda_\beta \sum_{k=1}^{h} \left( \left| \beta_k + \frac{u_k^{(1)}}{\sqrt{mn}} \right| - |\beta_k| \right), \quad \Gamma_{mn}^{(4)}(u) = \lambda_\alpha \sum_{i=1}^{n} \sum_{t=1}^{l} \left( \left| \alpha_{it} + \frac{u_{it}^{(2)}}{\sqrt{m}} \right| - |\alpha_{it}| \right)$$
For $\Gamma_{mn}^{(1)}(u)$, conditions (A2) and (A3) imply the Lindeberg condition, so we have
$$\Gamma_{mn}^{(1)}(u) = -\frac{2}{\sqrt{m}} \left[ u_0^T Z^T + \frac{u_1^T}{\sqrt{n}} X^T \right] \Pi_\tau(r_\tau) r_\tau \xrightarrow{d} -2 u^T W$$
where $W \sim N(0, P_0)$.
For $\Gamma_{mn}^{(2)}(u)$, according to hypothesis (A2), we have
$$\Gamma_{mn}^{(2)}(u) = \frac{1}{m} \left[ u_0^T Z^T E[\Pi_\tau(r_\tau)] Z u_0 + \frac{2}{\sqrt{n}} u_0^T Z^T E[\Pi_\tau(r_\tau)] X u_1 + \frac{1}{n} u_1^T X^T E[\Pi_\tau(r_\tau)] X u_1 \right] \to u^T P_1 u$$
For $\Gamma_{mn}^{(3)}(u)$,
$$\Gamma_{mn}^{(3)}(u) = \frac{\lambda_\beta}{\sqrt{mn}} \sum_{k=1}^{h} \left[ u_k^{(1)} \operatorname{sgn}(\beta_k) I(\beta_k \ne 0) + |u_k^{(1)}| I(\beta_k = 0) \right] \to \lambda_1 \sum_{k=1}^{h} \left[ u_k^{(1)} \operatorname{sgn}(\beta_k) I(\beta_k \ne 0) + |u_k^{(1)}| I(\beta_k = 0) \right]$$
For $\Gamma_{mn}^{(4)}(u)$, by the identity $|u - v| - |u| = -v \operatorname{sgn}(u) + 2 \int_0^{v} (I(u \le s) - I(u \le 0))\, ds$, it can be divided into two parts,
$$\Gamma_{mn}^{(4)}(u) = \Gamma_{mn}^{(41)}(u) + \Gamma_{mn}^{(42)}(u)$$
where
$$\Gamma_{mn}^{(41)}(u) = \frac{\lambda_\alpha}{\sqrt{m}} \sum_{i=1}^{n} \sum_{t=1}^{l} u_{it}^{(2)} \operatorname{sgn}(\alpha_{it}) \xrightarrow{d} \lambda_2 S^T u^{(2)}, \quad S \sim N(0, \operatorname{diag}(4 G_{it}(0)(1 - G_{it}(0))))$$
$$\Gamma_{mn}^{(42)}(u) = \frac{2 \lambda_\alpha}{\sqrt{m}} \sum_{i=1}^{n} \sum_{t=1}^{l} \int_0^{u_{it}^{(2)}} \left( I\!\left( \alpha_{it} \le \frac{s}{\sqrt{m}} \right) - I(\alpha_{it} \le 0) \right) ds$$
Since there exists $a > 0$ with $n^a / m \to 0$, we have
$$E(\Gamma_{mn}^{(42)}(u)) = \frac{2 \lambda_\alpha}{\sqrt{m}} \sum_{i=1}^{n} \sum_{t=1}^{l} \int_0^{u_{it}^{(2)}} \left[ G_{it}\!\left( \frac{s}{\sqrt{m}} \right) - G_{it}(0) \right] ds = \frac{\lambda_\alpha}{m} \sum_{i=1}^{n} \sum_{t=1}^{l} g_{it}(0) (u_{it}^{(2)})^2 + o(1) \to 0$$
and $\operatorname{Var}(\Gamma_{mn}^{(42)}(u)) \to 0$.
Combining the results of Equations (A1)–(A13), since the function $\Gamma_{mn}(u)$ is convex and its minimizer is unique, we obtain
$$\hat u = \arg\min \Gamma_{mn}(u) \xrightarrow{d} \arg\min \Gamma_0(u)$$
The proof is ended. □

Appendix B

The solution paths of β and α of DLQR and DLER under the two criteria.
Figure A1. The solution paths of β and α of DLQR under criterion SIC.
Figure A2. The solution paths of β and α of DLQR under criterion GACV.
Figure A3. The solution paths of β and α of DLER under criterion SIC.
Figure A4. The solution paths of β and α of DLER under criterion GACV.

Appendix C

Table A1. The results 1 of simulation 1 under three signal-to-noise ratios.

| Parameters | Method | MSE | SD_int | SD_1 | SD_2 | SD_3 | SD_4 | SD_5 | SD_6 | SD_7 | SD_8 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| τ = 0.25 | | | | | | | | | | | |
| σ = 1 | DLER-SIC | 0.306 | 0.300 | 0.313 | 0.288 | 0.029 | 0.033 | 0.054 | 0.022 | 0.027 | 0.012 |
| | DLER-GACV | 0.486 | 0.303 | 0.411 | 0.383 | 0.027 | 0.035 | 0.056 | 0.024 | 0.035 | 0.025 |
| | DLQR-SIC | 0.280 | 0.261 | 0.297 | 0.250 | 0.060 | 0.046 | 0.091 | 0.015 | 0.024 | 0.020 |
| | DLQR-GACV | 0.289 | 0.266 | 0.299 | 0.254 | 0.063 | 0.049 | 0.088 | 0.037 | 0.039 | 0.031 |
| σ = 2 | DLER-SIC | 0.388 | 0.261 | 0.307 | 0.292 | 0.063 | 0.070 | 0.113 | 0.074 | 0.079 | 0.080 |
| | DLER-GACV | 0.490 | 0.266 | 0.371 | 0.342 | 0.066 | 0.069 | 0.113 | 0.085 | 0.084 | 0.078 |
| | DLQR-SIC | 0.527 | 0.234 | 0.332 | 0.335 | 0.108 | 0.088 | 0.167 | 0.076 | 0.069 | 0.086 |
| | DLQR-GACV | 0.535 | 0.236 | 0.333 | 0.324 | 0.110 | 0.098 | 0.169 | 0.083 | 0.086 | 0.084 |
| σ = 3 | DLER-SIC | 0.585 | 0.337 | 0.324 | 0.321 | 0.117 | 0.129 | 0.178 | 0.138 | 0.107 | 0.113 |
| | DLER-GACV | 0.649 | 0.342 | 0.330 | 0.343 | 0.139 | 0.133 | 0.186 | 0.139 | 0.128 | 0.110 |
| | DLQR-SIC | 0.868 | 0.341 | 0.337 | 0.375 | 0.162 | 0.162 | 0.244 | 0.134 | 0.165 | 0.117 |
| | DLQR-GACV | 0.894 | 0.353 | 0.325 | 0.398 | 0.163 | 0.163 | 0.244 | 0.153 | 0.152 | 0.119 |
| τ = 0.5 | | | | | | | | | | | |
| σ = 1 | DLER-SIC | 0.292 | 0.283 | 0.305 | 0.265 | 0.016 | 0.015 | 0.048 | 0.013 | 0.013 | 0 |
| | DLER-GACV | 0.509 | 0.283 | 0.438 | 0.342 | 0.015 | 0.019 | 0.049 | 0.014 | 0 | 0 |
| | DLQR-SIC | 0.207 | 0.242 | 0.262 | 0.224 | 0.026 | 0.022 | 0.083 | 0.015 | 0.015 | 0.017 |
| | DLQR-GACV | 0.219 | 0.240 | 0.256 | 0.231 | 0.032 | 0.038 | 0.093 | 0.029 | 0.026 | 0.032 |
| σ = 2 | DLER-SIC | 0.403 | 0.278 | 0.284 | 0.335 | 0.072 | 0.075 | 0.128 | 0.067 | 0.081 | 0.071 |
| | DLER-GACV | 0.472 | 0.276 | 0.320 | 0.359 | 0.076 | 0.074 | 0.133 | 0.065 | 0.069 | 0.063 |
| | DLQR-SIC | 0.424 | 0.273 | 0.262 | 0.325 | 0.088 | 0.095 | 0.160 | 0.079 | 0.058 | 0.073 |
| | DLQR-GACV | 0.464 | 0.272 | 0.269 | 0.324 | 0.095 | 0.097 | 0.167 | 0.086 | 0.075 | 0.091 |
| σ = 3 | DLER-SIC | 0.558 | 0.333 | 0.313 | 0.326 | 0.109 | 0.095 | 0.171 | 0.106 | 0.121 | 0.111 |
| | DLER-GACV | 0.634 | 0.334 | 0.323 | 0.338 | 0.129 | 0.125 | 0.179 | 0.134 | 0.126 | 0.115 |
| | DLQR-SIC | 0.673 | 0.317 | 0.300 | 0.346 | 0.119 | 0.135 | 0.216 | 0.122 | 0.106 | 0.114 |
| | DLQR-GACV | 0.720 | 0.320 | 0.311 | 0.361 | 0.118 | 0.129 | 0.218 | 0.133 | 0.121 | 0.121 |
| τ = 0.75 | | | | | | | | | | | |
| σ = 1 | DLER-SIC | 0.374 | 0.272 | 0.322 | 0.316 | 0.021 | 0.027 | 0.056 | 0.012 | 0.017 | 0 |
| | DLER-GACV | 0.456 | 0.275 | 0.389 | 0.369 | 0.025 | 0.020 | 0.058 | 0 | 0.021 | 0.007 |
| | DLQR-SIC | 0.307 | 0.275 | 0.320 | 0.255 | 0.054 | 0.027 | 0.105 | 0.024 | 0.023 | 0.028 |
| | DLQR-GACV | 0.321 | 0.275 | 0.318 | 0.260 | 0.070 | 0.039 | 0.097 | 0.031 | 0.039 | 0.041 |
| σ = 2 | DLER-SIC | 0.454 | 0.263 | 0.321 | 0.302 | 0.071 | 0.087 | 0.129 | 0.098 | 0.083 | 0.056 |
| | DLER-GACV | 0.538 | 0.262 | 0.400 | 0.359 | 0.072 | 0.086 | 0.124 | 0.106 | 0.070 | 0.064 |
| | DLQR-SIC | 0.547 | 0.251 | 0.319 | 0.309 | 0.097 | 0.117 | 0.187 | 0.093 | 0.060 | 0.070 |
| | DLQR-GACV | 0.570 | 0.255 | 0.317 | 0.317 | 0.103 | 0.117 | 0.198 | 0.099 | 0.081 | 0.078 |
| σ = 3 | DLER-SIC | 0.577 | 0.312 | 0.330 | 0.318 | 0.116 | 0.145 | 0.183 | 0.111 | 0.119 | 0.125 |
| | DLER-GACV | 0.617 | 0.309 | 0.347 | 0.336 | 0.130 | 0.146 | 0.187 | 0.116 | 0.123 | 0.115 |
| | DLQR-SIC | 0.701 | 0.283 | 0.336 | 0.322 | 0.119 | 0.137 | 0.246 | 0.131 | 0.129 | 0.129 |
| | DLQR-GACV | 0.770 | 0.304 | 0.329 | 0.348 | 0.121 | 0.153 | 0.247 | 0.154 | 0.150 | 0.157 |
Table A2. The results 2 of simulation 1 under three signal-to-noise ratios.

Method      Corr (SD)     Incorr (SD)   Int  X1   X2   X3  X4  X5   X6   X7   X8
τ = 0.25, σ = 1
DLER-SIC    3 (0)         1.04 (0.665)  14   100  100  95  95  100  97   96   99
DLER-GACV   2.99 (0.1)    1.11 (0.751)  14   100  99   96  94  100  95   94   96
DLQR-SIC    3 (0)         1.29 (0.518)  0    100  100  87  91  100  99   96   98
DLQR-GACV   3 (0)         1.5 (0.628)   0    100  100  84  91  100  92   90   93
τ = 0.25, σ = 2
DLER-SIC    3 (0)         1.6 (0.865)   1    100  100  90  88  100  88   88   85
DLER-GACV   2.99 (0.1)    1.81 (0.982)  1    100  99   87  84  100  82   82   83
DLQR-SIC    3 (0)         1.99 (0.959)  0    100  100  73  79  100  82   87   80
DLQR-GACV   3 (0)         2.04 (1.063)  0    100  100  75  76  100  83   81   81
τ = 0.25, σ = 3
DLER-SIC    3 (0)         1.87 (1.116)  0    100  100  85  83  100  83   79   83
DLER-GACV   3 (0)         2.24 (1.311)  0    100  100  73  76  100  78   72   77
DLQR-SIC    3 (0)         2.56 (1.351)  0    100  100  67  70  100  69   65   73
DLQR-GACV   3 (0)         2.41 (1.280)  0    100  100  64  70  100  72   74   79
τ = 0.5, σ = 1
DLER-SIC    3 (0)         0.75 (0.520)  29   100  100  99  99  100  99   99   100
DLER-GACV   2.98 (0.2)    0.74 (0.525)  30   99   99   99  98  100  99   100  100
DLQR-SIC    3 (0)         0.78 (0.645)  35   100  100  96  96  100  98   99   98
DLQR-GACV   3 (0)         0.95 (0.770)  36   100  100  95  91  100  94   96   93
τ = 0.5, σ = 2
DLER-SIC    2.99 (0.1)    1.37 (0.960)  29   100  99   89  87  100  86   85   87
DLER-GACV   2.99 (0.1)    1.38 (1.080)  27   100  99   85  86  100  89   86   89
DLQR-SIC    2.99 (0.1)    1.66 (1.075)  27   100  99   78  78  100  83   88   80
DLQR-GACV   2.99 (0.1)    1.76 (1.264)  31   100  99   76  78  100  79   81   79
τ = 0.5, σ = 3
DLER-SIC    3 (0)         1.7 (1.124)   25   100  100  82  79  100  80   81   83
DLER-GACV   3 (0)         1.98 (1.303)  28   100  100  74  71  100  73   78   78
DLQR-SIC    3 (0)         2.38 (1.153)  22   100  100  66  69  100  62   72   71
DLQR-GACV   3 (0)         2.22 (1.211)  20   100  100  70  69  100  74   70   75
τ = 0.75, σ = 1
DLER-SIC    2.99 (0.1)    0.94 (0.528)  15   100  99   98  96  100  99   98   100
DLER-GACV   2.99 (0.1)    0.96 (0.470)  14   100  99   96  98  100  100  97   99
DLQR-SIC    3 (0)         1.29 (0.478)  0    100  100  87  96  100  96   98   94
DLQR-GACV   3 (0)         1.47 (0.717)  0    100  100  80  94  100  94   93   92
τ = 0.75, σ = 2
DLER-SIC    3 (0)         1.72 (0.877)  0    100  100  88  84  100  82   86   88
DLER-GACV   2.99 (0.1)    1.84 (1.042)  1    100  99   83  85  100  75   88   84
DLQR-SIC    3 (0)         2.04 (0.974)  0    100  100  78  72  100  71   92   83
DLQR-GACV   3 (0)         2.12 (1.113)  0    100  100  74  75  100  75   85   79
τ = 0.75, σ = 3
DLER-SIC    3 (0)         1.93 (1.066)  0    100  100  87  80  100  83   82   75
DLER-GACV   3 (0)         2.16 (1.245)  0    100  100  78  77  100  78   76   75
DLQR-SIC    3 (0)         2.44 (1.192)  0    100  100  72  69  100  70   72   73
DLQR-GACV   3 (0)         2.62 (1.237)  0    100  100  67  70  100  69   68   64
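In Tables A1–A12, MSE and the SD columns summarize the estimation error of the coefficient estimates over the replications; Corr (SD) and Incorr (SD) report the mean (standard deviation) of the numbers of correctly retained active variables and wrongly retained inactive ones; and the Int, X1–X8 columns appear to give the percentage of replications in which each variable was correctly classified (retained if truly active, excluded if truly inactive). A minimal sketch of how such summaries could be computed from a stack of estimated coefficient vectors (the column reading above is ours, and all names in the code are illustrative):

```python
import numpy as np

def selection_summaries(beta_hats, beta_true):
    """beta_hats: (n_rep, p) array of estimates; beta_true: (p,) true coefficients."""
    beta_hats = np.asarray(beta_hats, dtype=float)
    active = beta_true != 0                        # truly nonzero coefficients
    selected = beta_hats != 0                      # coefficients kept by the penalty
    mse = np.mean(np.sum((beta_hats - beta_true) ** 2, axis=1))
    sd = beta_hats.std(axis=0)                     # per-coefficient standard deviation
    corr = selected[:, active].sum(axis=1)         # active variables correctly kept
    incorr = selected[:, ~active].sum(axis=1)      # inactive variables wrongly kept
    correct = np.where(active, selected, ~selected)  # correct classification flags
    return {
        "MSE": mse,
        "SD": sd,
        "Corr": (corr.mean(), corr.std()),
        "Incorr": (incorr.mean(), incorr.std()),
        "pct_correct": 100.0 * correct.mean(axis=0),
    }
```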
Table A3. The results 1 of simulation 2 under different influences of random effects (τ = 0.5).

Method      MSE    SD_int  SD_1   SD_2   SD_3   SD_4   SD_5   SD_6   SD_7   SD_8
P1
DLER-SIC    0.336  0.268   0.314  0.281  0.016  0.012  0.054  0.022  0.010  0.026
DLER-GACV   0.490  0.266   0.409  0.321  0.019  0.017  0.057  0.021  0.016  0.019
DLQR-SIC    0.286  0.229   0.294  0.279  0.051  0.037  0.079  0.030  0.021  0.018
DLQR-GACV   0.294  0.233   0.298  0.276  0.056  0.043  0.085  0.038  0.034  0.031
P2
DLER-SIC    1.318  0.539   0.559  0.568  0.013  0      0.049  0.021  0.014  0.012
DLER-GACV   2.231  0.538   0.789  0.574  0.020  0      0.051  0.020  0.013  0.011
DLQR-SIC    0.898  0.520   0.534  0.508  0.045  0.478  0.099  0.022  0.026  0
DLQR-GACV   0.916  0.517   0.528  0.516  0.052  0.049  0.107  0.041  0.037  0.018
P3
DLER-SIC    2.577  0.876   0.906  0.726  0.022  0.024  0.049  0      0.017  0.022
DLER-GACV   3.946  0.879   1.065  0.733  0.015  0.018  0.052  0.014  0.013  0.012
DLQR-SIC    1.869  0.732   0.793  0.746  0.040  0.092  0.108  0.027  0.011  0.010
DLQR-GACV   1.878  0.727   0.803  0.760  0.050  0.099  0.076  0.027  0.019  0.020
Table A4. The results 2 of simulation 2 under different influences of random effects (τ = 0.5).

Method      Corr (SD)     Incorr (SD)   Int  X1   X2   X3  X4   X5   X6   X7  X8
P1
DLER-SIC    3 (0)         0.8 (0.651)   29   100  100  98  99   100  98   99  97
DLER-GACV   3 (0)         0.81 (0.598)  29   100  100  98  98   100  98   98  98
DLQR-SIC    3 (0)         1.03 (0.643)  30   100  100  85  93   100  94   97  98
DLQR-GACV   3 (0)         1.17 (0.877)  28   100  100  85  90   100  91   94  95
P2
DLER-SIC    2.91 (0.288)  0.98 (0.376)  9    100  91   99  100  100  97   98  99
DLER-GACV   2.78 (0.462)  0.99 (0.362)  8    98   80   98  100  100  98   98  99
DLQR-SIC    3 (0)         1.12 (0.591)  15   100  100  90  90   100  98   96  100
DLQR-GACV   3 (0)         1.24 (0.754)  17   100  100  87  90   100  91   93  98
P3
DLER-SIC    2.75 (0.435)  1.04 (0.470)  6    99   76   98  98   100  100  98  96
DLER-GACV   2.46 (0.673)  1.01 (0.460)  7    90   56   99  98   100  99   98  98
DLQR-SIC    2.97 (0.171)  1.11 (0.601)  15   100  100  93  87   100  96   99  99
DLQR-GACV   2.97 (0.171)  1.24 (0.605)  4    100  100  87  84   100  96   98  97
Table A5. The results 1 of simulation 3 with the dense model under different error distributions.

Method      MSE    SD_int  SD_1   SD_2   SD_3   SD_4    SD_5   SD_6   SD_7   SD_8
N(0,1)
DLER-SIC    0.908  0.522   0.544  0.527  0.058  0.061   0.066  0.068  0.058  0.053
DLER-GACV   0.922  0.522   0.558  0.550  0.058  0.062   0.066  0.068  0.059  0.053
DLQR-SIC    0.853  0.468   0.485  0.527  0.079  0.079   0.081  0.093  0.076  0.070
DLQR-GACV   0.882  0.463   0.476  0.536  0.088  0.086   0.087  0.102  0.084  0.073
t(3)
DLER-SIC    0.965  0.556   0.570  0.534  0.140  0.097   0.104  0.113  0.108  0.113
DLER-GACV   1.114  0.551   0.560  0.590  0.187  0.114   0.125  0.126  0.131  0.163
DLQR-SIC    0.871  0.518   0.525  0.467  0.113  0.099   0.101  0.118  0.108  0.094
DLQR-GACV   0.872  0.520   0.520  0.484  0.115  0.095   0.088  0.113  0.103  0.095
Cauchy(0,1)
DLER-SIC    4.397  0.391   0.404  0.678  0.105  0.272   0.391  0.370  0.753  0.377
DLER-GACV   4.828  0.535   0.405  0.795  0.306  0.310   0.397  0.370  0.725  0.424
DLQR-SIC    2.450  0.249   0.350  0.605  0.039  0.089   0.031  0.072  0.092  0.127
DLQR-GACV   2.157  0.237   0.361  0.555  0.008  0.0536  0.041  0.020  0.094  0.129
Table A6. The results 2 of simulation 3 with the dense model under different error distributions.

Method      Corr (SD)    Incorr (SD)  Int  X1   X2   X3   X4   X5   X6   X7   X8
N(0,1)
DLER-SIC    7.99 (0.33)  0.74 (0.44)  26   94   96   100  100  100  100  100  100
DLER-GACV   7.7 (0.63)   0.74 (0.44)  26   84   86   100  100  100  100  100  100
DLQR-SIC    7.9 (0.33)   0.78 (0.42)  22   94   96   100  100  100  100  100  100
DLQR-GACV   7.91 (0.32)  0.81 (0.39)  19   96   95   100  100  100  100  100  100
t(3)
DLER-SIC    7.72 (0.62)  0.84 (0.37)  16   83   90   99   100  100  100  100  100
DLER-GACV   7.37 (0.92)  0.83 (0.38)  17   66   73   99   100  100  100  100  99
DLQR-SIC    7.88 (0.33)  0.84 (0.37)  16   93   95   100  100  100  100  100  100
DLQR-GACV   7.91 (0.29)  0.84 (0.37)  16   95   96   100  100  100  100  100  100
Cauchy(0,1)
DLER-SIC    7.6 (0.80)   0.8 (0.40)   20   100  100  100  100  80   80   100  100
DLER-GACV   7.68 (0.47)  1 (0)        0    100  100  100  100  88   80   100  100
DLQR-SIC    8 (0)        0.8 (0.40)   20   100  100  100  100  100  100  100  100
DLQR-GACV   8 (0)        0.8 (0.40)   20   100  100  100  100  100  100  100  100
Table A7. The results 1 of simulation 3 with the sparse model under different error distributions.

Method      MSE    SD_int  SD_1   SD_2   SD_3   SD_4   SD_5   SD_6   SD_7   SD_8
N(0,1)
DLER-SIC    1.345  0.608   0.622  0.550  0.011  0      0.051  0.015  0.015  0.023
DLER-GACV   2.084  0.608   0.849  0.606  0.011  0      0.052  0.019  0.014  0.018
DLQR-SIC    0.978  0.508   0.527  0.544  0.030  0.033  0.123  0.030  0.020  0.013
DLQR-GACV   0.947  0.516   0.527  0.529  0.036  0.037  0.106  0.037  0.028  0.027
t(3)
DLER-SIC    1.230  0.676   0.533  0.605  0.055  0.039  0.104  0.075  0.049  0.064
DLER-GACV   2.737  0.670   1.083  0.742  0.070  0.053  0.204  0.082  0.057  0.075
DLQR-SIC    1.008  0.522   0.498  0.530  0.065  0.049  0.102  0.041  0.047  0.048
DLQR-GACV   1.000  0.529   0.492  0.531  0.066  0.055  0.106  0.046  0.047  0.049
Cauchy(0,1)
DLER-SIC    5.756  0.620   0.732  0.404  0.121  0      0.399  0.587  0.238  0.385
DLER-GACV   6.791  0.708   0.753  0.307  0.384  0.073  0.542  0.682  0.521  0.460
DLQR-SIC    2.099  0.593   0.411  0.310  0.105  0.193  0.152  0.119  0.046  0.062
DLQR-GACV   1.758  0.638   0.385  0.342  0      0.173  0.113  0.095  0.049  0.034
Table A8. The results 2 of simulation 3 with the sparse model under different error distributions.

Method      Corr (SD)     Incorr (SD)  Int  X1   X2   X3   X4   X5   X6  X7  X8
N(0,1)
DLER-SIC    2.91 (0.288)  0.98 (0.40)  8    100  91   99   100  100  99  98  98
DLER-GACV   2.79 (0.478)  0.98 (0.40)  8    97   82   99   100  100  97  99  99
DLQR-SIC    3 (0)         1.09 (0.62)  14   100  100  94   92   100  95  97  99
DLQR-GACV   3 (0)         1.17 (0.63)  14   100  100  94   92   100  92  95  96
t(3)
DLER-SIC    2.94 (0.24)   1.38 (0.84)  8    100  94   91   96   100  87  93  87
DLER-GACV   2.59 (0.71)   1.55 (1.06)  12   87   72   84   91   100  83  90  85
DLQR-SIC    2.97 (0.17)   1.27 (0.79)  15   100  97   88   91   100  92  94  93
DLQR-GACV   2.97 (0.17)   1.38 (0.85)  16   100  97   86   86   100  90  93  91
Cauchy(0,1)
DLER-SIC    2.84 (0.37)   2.02 (0.62)  0    100  84   79   100  100  80  59  80
DLER-GACV   3 (0)         4.62 (0.83)  7    100  100  2    78   100  18  15  18
DLQR-SIC    3 (0)         2.66 (1.11)  0    100  100  79   36   100  71  71  77
DLQR-GACV   3 (0)         2.09 (0.93)  0    100  100  100  43   100  71  84  93
Table A9. The results 1 of simulation 3 with the high sparse model under different error distributions.

Method      MSE     SD_int  SD_1   SD_2   SD_3   SD_4   SD_5   SD_6    SD_7   SD_8
N(0,1)
DLER-SIC    1.017   0.611   0.593  0.267  0.020  0.021  0.025  0.039   0.035  0.017
DLER-GACV   3.096   0.613   0.981  0.253  0.017  0.022  0.025  0.033   0.016  0
DLQR-SIC    0.773   0.523   0.493  0.475  0.038  0.029  0.030  0.020   0.018  0.035
DLQR-GACV   0.810   0.528   0.509  0.477  0.048  0.040  0.036  0.039   0.041  0.036
t(3)
DLER-SIC    1.264   0.528   0.730  0.282  0.055  0      0.024  0.042   0.083  0.091
DLER-GACV   2.478   0.519   1.019  0.307  0.069  0.039  0.030  0.039   0.092  0.087
DLQR-SIC    0.792   0.474   0.576  0.394  0.082  0.072  0.051  0.061   0.036  0.039
DLQR-GACV   0.777   0.483   0.575  0.374  0.056  0.054  0.047  0.038   0.029  0.035
Cauchy(0,1)
DLER-SIC    1.925   1.147   0.162  0.741  0      0.513  0      0       0      0.371
DLER-GACV   14.720  1.200   0.298  1.289  0.932  1.201  0.602  0.513   0.227  0.334
DLQR-SIC    0.798   0.650   0.256  0.484  0.117  0.095  0.082  0.029   0.124  0.121
DLQR-GACV   0.822   0.567   0.238  0.415  0.116  0.088  0.060  0.0315  0.109  0.109
Table A10. The results 2 of simulation 3 with the high sparse model under different error distributions.

Method      Corr (SD)   Incorr (SD)  Int  X1   X2  X3   X4   X5   X6   X7   X8
N(0,1)
DLER-SIC    1 (0)       1.43 (0.78)  4    100  75  98   98   98   92   95   97
DLER-GACV   1 (0)       1.27 (0.72)  4    100  83  98   98   98   94   98   100
DLQR-SIC    1 (0)       1.59 (0.78)  29   100  35  93   97   95   99   98   95
DLQR-GACV   1 (0)       1.71 (0.97)  31   100  39  92   92   94   95   92   94
t(3)
DLER-SIC    1 (0)       1.52 (0.92)  2    100  75  91   100  98   94   93   95
DLER-GACV   0.99 (0.1)  1.7 (1.11)   2    99   70  87   94   96   96   94   91
DLQR-SIC    1 (0)       2.25 (1.22)  14   100  34  78   84   89   90   94   92
DLQR-GACV   1 (0)       1.94 (1.08)  11   100  45  85   90   90   95   96   94
Cauchy(0,1)
DLER-SIC    1 (0)       1.71 (0.76)  0    100  82  100  53   100  100  100  94
DLER-GACV   1 (0)       6.53 (2.35)  0    100  11  12   11   29   30   30   24
DLQR-SIC    1 (0)       3.93 (1.28)  6    100  6   40   68   54   99   67   67
DLQR-GACV   1 (0)       3.06 (1.54)  6    100  48  39   68   96   99   67   71
Table A11. The results 1 of simulation 4 with the high-dimensional model.

Method      MSE       SD_0   SD_1   SD_2   SD_5
τ = 0.5
DLER-SIC    4.783     0.010  0.513  0.360  0.657
DLER-GACV   5.414     0.007  0.914  0.482  0.706
DLQR-SIC    39.235    0.406  0.585  0.606  0.443
DLQR-GACV   6.902     0.003  0.876  0.463  0.789
τ = 0.75
DLER-SIC    5.870     0.165  0.671  0.383  0.882
DLER-GACV   4.483     0.010  0.759  0.467  0.719
DLQR-SIC    40.624    0.413  0.593  0.625  0.507
DLQR-GACV   7.799     0.009  0.916  0.514  0.723
Table A12. The results 2 of simulation 4 with the high-dimensional model.

Method      Corr (SD)     Incorr (SD)     X0      X1   X2   X5
τ = 0.5
DLER-SIC    2.34 (0.590)  5.31 (2.237)    95.953  100  44   90
DLER-GACV   2.28 (0.805)  3.46 (1.904)    97.682  88   50   90
DLQR-SIC    3 (0)         94.3 (3.033)    12.794  100  100  100
DLQR-GACV   2.8 (0.426)   68.44 (33.202)  36.972  98   91   91
τ = 0.75
DLER-SIC    2.27 (0.737)  5.54 (7.885)    95.738  96   53   78
DLER-GACV   2.58 (0.684)  5.32 (3.490)    95.963  96   72   90
DLQR-SIC    2.99 (0.1)    94.46 (3.963)   12.654  100  100  99
DLQR-GACV   2.68 (0.490)  42.06 (41.702)  61.617  98   82   88
Table A13. CD4 cell data: the estimates from DLER-SIC, DLER-GACV, DLQR-SIC, and DLQR-GACV.

Method      Int     Time (X1)  Smoking (X2)  Age (X3)  Sex Partner (X4)
τ = 0.1
DLER-SIC    −0.255  −0.271     0.080         0         0
DLER-GACV   −0.190  −0.266     0.058         0         0
DLQR-SIC    −0.851  −0.314     0.117         0         0
DLQR-GACV   −0.852  −0.313     0.119         0         0
τ = 0.25
DLER-SIC    −0.047  −0.243     0.080         0         0
DLER-GACV   0       −0.234     0.051         0         0
DLQR-SIC    −0.422  −0.258     0.117         0         0
DLQR-GACV   −0.420  −0.259     0.123         0         0
τ = 0.5
DLER-SIC    0.129   −0.233     0.090         0         0
DLER-GACV   0.112   −0.241     0.101         0         0
DLQR-SIC    0.126   −0.208     0.113         0         0
DLQR-GACV   0.126   −0.207     0.105         0         0
τ = 0.75
DLER-SIC    0.345   −0.220     0.074         0         0
DLER-GACV   0.329   −0.230     0.098         0         0
DLQR-SIC    0.557   −0.167     0.104         0         0
DLQR-GACV   0.559   −0.166     0.129         0         0
τ = 0.9
DLER-SIC    0.492   −0.217     0.078         0         0
DLER-GACV   0.492   −0.228     0.093         0         0
DLQR-SIC    0.959   −0.141     0.154         0         0
DLQR-GACV   0.949   −0.140     0.153         0         0
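The -SIC and -GACV suffixes throughout Tables A1–A13 indicate which criterion chose the penalty parameters. In the quantile setting these criteria go back to Koenker, Ng and Portnoy [29] and Yuan [30]; a hedged sketch of the analogous formulas for a fitted model with effective dimension df is given below (the paper's exact expectile-loss versions may differ):

```python
import numpy as np

def sic(check_losses, df, n):
    # Schwarz-type criterion: log of the average check loss plus a BIC-style
    # complexity term, following Koenker, Ng and Portnoy (1994).
    return np.log(np.mean(check_losses)) + 0.5 * df * np.log(n) / n

def gacv(check_losses, df, n):
    # Generalized approximate cross-validation: total check loss divided by
    # the residual degrees of freedom, following Yuan (2006).
    return np.sum(check_losses) / (n - df)
```

In both cases the penalty parameters minimizing the criterion over a grid are retained.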

References

1. Koenker, R.; Bassett, G., Jr. Regression quantiles. Econom. J. Econom. Soc. 1978, 46, 33–50.
2. Tibshirani, R. Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B (Methodol.) 1996, 58, 267–288.
3. Zou, H. The adaptive lasso and its oracle properties. J. Am. Stat. Assoc. 2006, 101, 1418–1429.
4. Biswas, J.; Das, K. A Bayesian quantile regression approach to multivariate semi-continuous longitudinal data. Comput. Stat. 2021, 36, 241–260.
5. Koenker, R. Quantile regression for longitudinal data. J. Multivar. Anal. 2004, 91, 74–89.
6. Wang, H.J.; Feng, X.; Dong, C. Copula-based quantile regression for longitudinal data. Stat. Sin. 2019, 29, 245–264.
7. Wu, Y.; Liu, Y. Variable selection in quantile regression. Stat. Sin. 2009, 19, 801–817.
8. Peng, L. Quantile regression for survival data. Annu. Rev. Stat. Its Appl. 2021, 8, 413.
9. Li, H.; Liu, Y.; Luo, Y. Double penalized quantile regression for the linear mixed effects model. J. Syst. Sci. Complex. 2020, 33, 2080–2102.
10. Newey, W.K.; Powell, J.L. Asymmetric least squares estimation and testing. Econom. J. Econom. Soc. 1987, 55, 819–847.
11. Almanjahie, I.M.; Bouzebda, S.; Kaid, Z.; Laksaci, A. Nonparametric estimation of expectile regression in functional dependent data. J. Nonparametr. Stat. 2022, 34, 250–281.
12. Gu, Y.; Zou, H. High-dimensional generalizations of asymmetric least squares regression and their applications. Ann. Stat. 2016, 44, 2661–2694.
13. Farooq, M.; Steinwart, I. Learning rates for kernel-based expectile regression. Mach. Learn. 2019, 108, 203–227.
14. Schulze Waltrup, L.; Kauermann, G. Smooth expectiles for panel data using penalized splines. Stat. Comput. 2017, 27, 271–282.
15. Sobotka, F.; Kauermann, G.; Schulze Waltrup, L.; Kneib, T. On confidence intervals for semiparametric expectile regression. Stat. Comput. 2013, 23, 135–148.
16. Zhao, J.; Zhang, Y. Variable selection in expectile regression. Commun. Stat.-Theory Methods 2018, 47, 1731–1746.
17. Liao, L.; Park, C.; Choi, H. Penalized expectile regression: An alternative to penalized quantile regression. Ann. Inst. Stat. Math. 2019, 71, 409–438.
18. Waldmann, E.; Sobotka, F.; Kneib, T. Bayesian regularisation in geoadditive expectile regression. Stat. Comput. 2017, 27, 1539–1553.
19. Xu, Q.F.; Ding, X.H.; Jiang, C.X.; Yu, K.M.; Shi, L. An elastic-net penalized expectile regression with applications. J. Appl. Stat. 2021, 48, 2205–2230.
20. Farooq, M.; Steinwart, I. An SVM-like approach for expectile regression. Comput. Stat. Data Anal. 2017, 109, 159–181.
21. Spiegel, E.; Sobotka, F.; Kneib, T. Model selection in semiparametric expectile regression. Electron. J. Stat. 2017, 11, 3008–3038.
22. Daouia, A.; Girard, S.; Stupfler, G. Tail expectile process and risk assessment. Bernoulli 2020, 26, 531–556.
23. Ziegel, J.F. Coherence and elicitability. Math. Financ. 2016, 26, 901–918.
24. Rao, C.R.; Statistiker, M. Linear Statistical Inference and Its Applications; Wiley: New York, NY, USA, 1973.
25. Harville, D. Extension of the Gauss–Markov theorem to include the estimation of random effects. Ann. Stat. 1976, 4, 384–395.
26. Robinson, G.K. That BLUP is a good thing: The estimation of random effects. Stat. Sci. 1991, 6, 15–31.
27. Van Vleck, L.D.; Henderson, C.R. Estimates of genetic parameters of some functions of part lactation milk records. J. Dairy Sci. 1961, 44, 1073–1084.
28. Goldberger, A.S. Best linear unbiased prediction in the generalized linear regression model. J. Am. Stat. Assoc. 1962, 57, 369–375.
29. Koenker, R.; Ng, P.; Portnoy, S. Quantile smoothing splines. Biometrika 1994, 81, 673–680.
30. Yuan, M. GACV for quantile smoothing splines. Comput. Stat. Data Anal. 2006, 50, 813–829.