Abstract
As applied sciences grow by leaps and bounds, semiparametric regression analyses have broad applications in various fields, such as engineering, finance, medicine, and public health. The single-index varying-coefficient model is a common class of semiparametric models owing to its flexibility and ease of interpretation. Standard estimation procedures for single-index varying-coefficient regression models, whether parametric or semiparametric, assume that all covariates are fully observed; we relax this assumption by considering models with missing covariates. In this paper, we investigate robust variable selection for a single-index varying-coefficient model with missing covariates. To eliminate the potential bias caused by missing data, we propose an inverse probability weighted objective function and study estimators that use parametric and nonparametric estimates of the probability that an observation is fully observed. For variable selection, the weighted objective function is penalized by the non-convex SCAD penalty. The theoretical challenges include handling missing data in a single-index varying-coefficient model together with the robust exponential squared loss and the non-convex penalty function. We provide Monte Carlo simulations to evaluate the performance of our approach.
Keywords:
single-index varying-coefficient model; missing data; variable selection; inverse probability weighting; sparsity
MSC:
62F12; 62G08; 62G20; 62J07
1. Introduction
Traditional statistical techniques are based on completely observed data. However, in many scientific studies, such as questionnaire surveys, medical research, and psychological science, respondents are unwilling to provide some of the information that researchers need. In addition, many factors cannot be controlled during the research process, and it is often impossible to obtain all the desired data. When data are missing, traditional statistical techniques cannot be applied directly, and statisticians have considered how to draw valid conclusions from the observed data in this situation. To date, various methods have been employed to deal with missing data, such as complete-case analysis (CC) (Yates [1] and Healy and Westmacott [2]), imputation, inverse probability weighting (IPW), and likelihood-based methods. The IPW method, proposed by Horvitz and Thompson [3], deals with missing data by weighting each fully observed record by the inverse of its selection probability, so that the estimator is not distorted by random missingness. It has earned extensive attention in the field of missing data research; related literature includes Robins et al. [4], Wang et al. [5], Little and Rubin [6], Liang et al. [7], and Tsiatis [8]. However, when the error distribution is heavy-tailed or skewed, the results of the aforementioned methods are not stable because they are based on the least squares (LS) method.
In most regression models, it is critical to choose a proper loss function so that the resulting estimator is robust, and researchers therefore pay increasing attention to loss functions with higher robustness. The exponential squared loss is defined as $\phi_{\gamma}(t)=1-\exp(-t^{2}/\gamma)$, where $\gamma>0$ is a tuning parameter that determines the degree of robustness of the estimator. For large $\gamma$, $\phi_{\gamma}(t)\approx t^{2}/\gamma$, so in this extreme case the proposed estimator behaves like the LS estimator. When $\gamma$ is small, observations with large absolute residuals produce losses close to the upper bound of $\phi_{\gamma}$, so their influence on the estimate is insignificant. Thus, making $\gamma$ smaller limits the impact of outliers on the estimator but also reduces the sensitivity of the estimator. Like quantile regression (QR), which has become an increasingly popular robust method, regression based on the exponential squared loss is more resistant to the effects of outliers than LS. Such exponential loss functions have been used in classification problems in AdaBoost (Friedman et al. [9]) and for variable selection in regression models (Wang et al. [10]).
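A minimal numerical sketch (in Python, with hypothetical residual values) of how the exponential squared loss bounds the contribution of large residuals, in contrast to the squared loss; the function name and the choice of gamma values are ours, not the authors'.

```python
import numpy as np

def exp_squared_loss(r, gamma):
    """Exponential squared loss phi_gamma(r) = 1 - exp(-r^2 / gamma)."""
    return 1.0 - np.exp(-r**2 / gamma)

residuals = np.array([0.1, 0.5, 1.0, 3.0, 10.0])   # hypothetical residuals
for gamma in (0.5, 2.0, 10.0):
    print(f"gamma={gamma:4.1f}", np.round(exp_squared_loss(residuals, gamma), 3))
# The squared loss grows without bound, so a single large residual dominates:
print("squared   ", np.round(residuals**2, 3))
```

For small gamma the loss at r = 10 is essentially the same as at r = 3 (both close to the upper bound 1), which is why outliers have little influence; for large gamma the loss behaves like r^2/gamma and the estimator approaches LS.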
As applied sciences grow, research on semiparametric models has been extensively developed due to their high degree of flexibility and ease of interpretation. The single-index varying-coefficient model (SIVCM) is a common semiparametric model. One main advantage of the model is that it avoids the curse of dimensionality; another is that it retains the interpretability of parametric models. Generally, it takes the following form
where is the dependent variable, are the covariates and . and represent the vector of unknown coefficient functions and the vector of unknown parameters, respectively, whose dimensions are q and p. is the disturbance term with zero mean and finite variance, which is independent of . Furthermore, assume that the Euclidean norm of is equal to 1 and that its first component is positive. Moreover, in order to avoid the lack of identifiability due to the non-uniqueness of the index direction , cannot take the form of , where are constants, , , and are not parallel to each other (Feng and Xue [11]; Xue and Pang [12]).
Model (1) is flexible enough to cover a class of important statistical models. It becomes the standard single-index model (SIM) when and ; for related literature, see Hardle et al. [13] and Wu et al. [14]. When and , it simplifies to the varying-coefficient model (VCM) proposed by Hastie and Tibshirani [15] and Fan and Zhang [16]. Consequently, it is easily interpretable and has broad applications in practice. In particular, Xia and Li [17] first studied Model (1) using the kernel smoothing method together with the LS method. The empirical likelihood ratio method was proposed by Xue and Wang [18]. Based on estimating equations, an estimate of the parametric component was constructed by Xue and Pang [12]. Using basis function approximation, Feng and Xue [11] investigated Model (1).
Variable selection is of great importance in statistical modeling, because ignoring significant variables can cause seriously biased results, whereas including spurious variables leads to a substantial loss of estimation efficiency. Many penalty functions have therefore become popular choices, such as the least absolute shrinkage and selection operator (LASSO, Tibshirani [19]), the bridge penalty, the smoothly clipped absolute deviation (SCAD, Fan and Li [20]), and the adaptive lasso (Zou [21]). In particular, Peng and Huang [22] proposed a non-concave penalized least-squares method for the SIM based on SCAD penalization; Yang and Yang [23] adopted the SCAD penalty to achieve efficient estimation and variable selection simultaneously in partially linear single-index models (PLSIM); and Wang and Kulasekera [24] studied variable selection in partially linear varying-coefficient models (PLVCM) based on the adaptive lasso.
The SIVCM is a common semiparametric model, and variable selection in semiparametric models involves two parts: the selection of the nonparametric components and the selection of significant variables in the parametric part. Classical variable selection procedures involve stepwise regression and best subset selection. However, the nonparametric parts of each submodel need to be fitted separately, leading to high computational cost. Selecting variables in the SIVCM is a great challenge because the model has a complex multivariate nonlinear structure that includes both a nonparametric function vector and an unknown parameter vector. Based on basis function approximation and the SCAD penalty, Feng and Xue [11] developed a penalized method for the SIVCM that selects significant variables in both the parametric and nonparametric components. It should be noted that existing research adopts the LS or likelihood method and assumes that the error follows a normal distribution. Therefore, when the error is heavy-tailed, these methods are sensitive to outliers and become inefficient; in particular, the least squares criterion is not robust to outliers in the dependent variable. Yang and Yang [25] proposed an efficient iterative procedure for the SIVCM based on quantile regression, and their results indicate that the resulting estimator is robust in the presence of outliers and heavy-tailed errors. However, all existing work on the SIVCM assumes that all variables are fully observed; a robust variable selection approach for the SIVCM with missing covariates has not yet been studied.
The following are the innovations of this paper:
- For the case of missing covariates, we propose a robust variable selection approach based on exponential squared loss and adopt the IPW method to eliminate the latent bias due to the missing values in covariates.
- We consider parametric and nonparametric methods to estimate the selection probability model and propose a penalized, inverse probability weighted objective function for variable selection.
- We also examine how to select the tuning parameter of the exponential squared loss function to ensure that the corresponding estimator is robust.
The rest of this article is organized as follows. Section 2 proposes an efficient iterative procedure for the SIVCM using the exponential squared loss, with the SCAD penalty applied to select both the important parametric variables and the nonparametric components; in addition, we discuss implementation details, including the choice of the number of knots and the tuning parameters. Section 3 conducts several Monte Carlo experiments with different error distributions to show the finite-sample performance of the proposed method. Section 4 concludes the paper briefly.
2. Methodology
Using the exponential squared loss function, the basis function approximation, and the SCAD penalty function, we propose a robust variable selection procedure for the SIVCM with missing covariates. First, the unknown coefficient functions are approximated using B-spline basis functions. Next, under the unit-norm constraint on the index parameter, we use the 'delete-one-component' approach of Yu and Ruppert [26] to establish the objective function of the penalized exponential squared loss.
2.1. Basis Function Expansion
Consider a sample from model (1), i.e.,
where and are p-dimensional and q-dimensional independent variables, respectively. The disturbance term is an unobserved random variable with zero mean and finite variance, and it is assumed to be independent of the covariates.
In order to estimate the unknown coefficient functions, following He et al. [27], we replace them by their basis function approximations. More specifically, we construct B-spline basis functions of order M+1, , where , and K is the number of interior knots. We can then approximate as
where is the vector of spline coefficients. If all the data could be fully observed, the following robust estimation procedure would be used:
where is a tuning parameter. To prevent outliers from affecting the estimate, we introduce the exponential squared loss in (4). However, (4) cannot be optimized directly when the coefficient functions are unknown. After replacing the unknown functions by their basis function approximations in (4), we get
where
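Returning to the basis expansion in (3), the following is a small sketch of how the B-spline basis can be constructed and one coefficient function approximated; the degree M, the number of interior knots K, and the spline coefficients are illustrative choices, not values from the paper.

```python
import numpy as np
from scipy.interpolate import BSpline

M, K = 3, 5                                  # spline degree and number of interior knots (illustrative)
interior = np.linspace(0, 1, K + 2)[1:-1]    # equally spaced interior knots on [0, 1]
knots = np.r_[np.zeros(M + 1), interior, np.ones(M + 1)]
n_basis = K + M + 1                          # dimension of each spline coefficient vector

def basis_matrix(u):
    """Evaluate all B-spline basis functions at the index values u (len(u) x n_basis)."""
    return BSpline(knots, np.eye(n_basis), M)(u)

# One coefficient function g_k(u) is approximated by B(u)^T lam_k
u = np.random.uniform(0, 1, 200)             # e.g. values of the index, rescaled to [0, 1]
lam_k = np.random.normal(size=n_basis)       # hypothetical spline coefficients
g_approx = basis_matrix(u) @ lam_k
```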
We first handle the constraints (unit norm and positive first component) on the p-dimensional single-index parameter vector by reparametrization. Denote the subvector obtained by deleting its first component and define
The true parameter must satisfy a strict inequality constraint, so the index vector is infinitely differentiable with respect to the free parameter. The Jacobian matrix of the former with respect to the latter is
where the lower block is the (p−1)-order identity matrix. As we can see, the free parameter is one dimension lower than the original index vector, and the penalized robust regression with the exponential squared loss is converted to
where . By maximizing (7), we can get and . Then, through (3) and (6), the robust regression estimator of the index parameter based on the exponential squared loss is
and the estimator of the coefficient functions can be obtained by
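A sketch of the 'delete-one-component' transform in (6) and its Jacobian, under the convention that the first component of the index vector is the one removed; all variable names are ours.

```python
import numpy as np

def phi_to_beta(phi):
    """Map the free (p-1)-vector phi (with ||phi|| < 1) to the unit-norm index
    vector with positive first component: beta = (sqrt(1 - ||phi||^2), phi)."""
    return np.r_[np.sqrt(1.0 - phi @ phi), phi]

def jacobian(phi):
    """Jacobian d beta / d phi, a p x (p-1) matrix: first row -phi^T / sqrt(1 - ||phi||^2),
    remaining block the (p-1)-order identity matrix."""
    top = -phi / np.sqrt(1.0 - phi @ phi)
    return np.vstack([top, np.eye(phi.size)])

phi = np.array([0.3, -0.4, 0.2])     # hypothetical reduced parameter, ||phi|| < 1
beta = phi_to_beta(phi)
print(np.linalg.norm(beta))          # 1.0: the unit-norm constraint holds automatically
print(jacobian(phi).shape)           # (4, 3)
```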
2.2. Robust Estimation Based on Inverse Probability Weighting
We consider the case where a subset of covariates has missing values when estimating (5). Let be the vector of covariates that are always observed and be the vector of covariates, from or , that may be partly missing. We define the vector of variables that are always observed as , and . For each observation, an indicator variable R records whether the covariates are completely observed; it is given by the following formula
The missing mechanism we assume satisfies:
With this missing mechanism, conditional on , the event that is missing is unrelated to . Although the response data are fully observed, the selection probability in (10) is related only to the observed covariates and not to the observed response. Therefore, this missing mechanism differs from the usual missing at random (MAR) mechanism. We require this missing mechanism in order to carry out the theoretical development.
When faced with missing covariates, we estimate (5) with a naive approach; only observations with complete data are used to fit the model. The naive estimator is
while all observations with missing data are dropped when estimating the model. Unless the data are missing completely at random, this estimator will be asymptotically biased.
An objective function based on inverse probability weighting (IPW) is proposed in order to reduce the potential bias caused by missing data. In the IPW method, the ith data point is weighted by the inverse of its selection probability. The difference between IPW and the naive method is that IPW assigns different weights to the records with fully observed data. The idea behind the weighting is that a fully observed data point whose probability of being fully observed is π, say, represents on average 1/π data points with the same covariates that would be expected if there were no missing data.
The weight is usually unknown and needs to be estimated. We consider estimating the weights using a parametric model, which assumes a general relationship of the form
Taking the logistic relationship as an example,
In practice, the unknown parameter in the selection model is replaced by its estimate, and the fitted parametric model is used to estimate the selection probability.
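A sketch of the parametric (logistic) estimate of the selection probability and the resulting inverse probability weights; the data are simulated here and the coefficients of the missingness model are assumptions for illustration only.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 500
V = rng.normal(size=(n, 2))                                    # always-observed covariates
true_pi = 1.0 / (1.0 + np.exp(-(0.5 + V @ np.array([1.0, -0.5]))))  # assumed true selection probabilities
R = rng.binomial(1, true_pi)                                   # R_i = 1 if observation i is fully observed

# Parametric (logistic) estimate of the selection probability; large C => essentially no regularization
fit = LogisticRegression(C=1e6).fit(V, R)
pi_hat = fit.predict_proba(V)[:, 1]

# IPW weights: fully observed records get weight 1 / pi_hat, the others contribute nothing
weights = R / pi_hat
```

Consistent with the missing mechanism in (10), the selection model here depends only on the always observed covariates, not on the response.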
Throughout the paper, we distinguish the parametric estimate of the selection probability, a general estimate that could be parametric or nonparametric, and the true probability that observation i is fully observed. Our parametric robust regression estimator is defined as
According to the above, through (3) and (6) and using the exponential squared loss, can be robustly estimated by
Then, the estimator of can be written as
2.3. The Penalized Robust Regression Estimator
Here we consider the variable selection problem when Model (2) has missing covariates. To improve the accuracy and interpretability of the fitted model and to ensure identifiability, the true regression coefficient vector is generally assumed to be sparse, with only a small fraction of non-zero components (Fan and Li [20]; Tibshirani [19]).
To recover the true model and to estimate the unknown index parameter and coefficient functions, we consider the following penalized robust regression based on the exponential squared loss:
where
The penalty function is defined on and the regularization parameter is non-negative. We emphasize that the tuning parameters and need not be the same for all and . The purpose of using the exponential squared loss in (5) is to prevent outliers from affecting the estimation process. It is impossible to optimize (15) directly when the coefficient functions are unknown. To solve this problem, the unknown functions in (15) are replaced by their basis function approximations, which gives
where
When the selection probability is replaced by its parametric estimate, the parametric penalized robust regression with the exponential squared loss becomes
where . By maximizing (17), we can get the result and . Then, through (3) and (6), the penalized robust regression estimator of based on the exponential squared loss is
and the estimator of can be obtained by
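To make the structure of (17) concrete, the following sketch assembles an IPW-weighted exponential squared loss and a SCAD penalty into a single objective for a generic linear-in-parameters working model; the design matrix `W`, the way the penalty is applied to every component of the stacked parameter vector, and all tuning values are simplifications of the paper's setup, not its exact estimator.

```python
import numpy as np

def scad_penalty(t, lam, a=3.7):
    """SCAD penalty p_lambda(|t|) of Fan and Li [20]."""
    t = np.abs(t)
    return np.where(t <= lam, lam * t,
           np.where(t <= a * lam, (2 * a * lam * t - t**2 - lam**2) / (2 * (a - 1)),
                    lam**2 * (a + 1) / 2))

def penalized_objective(theta, y, W, R, pi_hat, gamma, lam):
    """IPW-weighted exponential squared fit term minus the SCAD penalty.
    theta: stacked parameter vector; W: working design matrix (rows W_i);
    only fully observed rows (R_i = 1) contribute, each weighted by 1 / pi_hat."""
    resid = y - W @ theta
    fit_term = np.sum((R / pi_hat) * np.exp(-resid**2 / gamma))
    penalty = len(y) * np.sum(scad_penalty(theta, lam))
    return fit_term - penalty     # to be maximized, e.g. via the algorithm of Section 2.4
```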
2.4. Algorithm
To facilitate the computation, the loss function is replaced by a quadratic approximation. Let
Given an initial estimator , the loss function can be approximated as
The SCAD penalty function is singular at the origin, which makes direct implementation of the Newton–Raphson algorithm difficult. We therefore develop an iterative algorithm based on the local quadratic approximation of the penalty function, as in Fan and Li [20]. More specifically, in a neighborhood of a given nonzero value , an approximation of the penalty function at the value can be given by
Hence, for the given initial value with , , and with , , we have
Let
Then, in addition to the constant term, we maximize
with respect to and , which yields an approximate solution of (17). We can then obtain the estimates and of and through (3) and (6), respectively.
In order to implement the above method, we need to choose the number of interior knots K and the tuning parameters a, , and in the penalty function appropriately. Fan and Li [20] showed that the choice a = 3.7 performs well in a variety of situations; hence, we follow their setup in this article.
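A sketch of the local quadratic approximation used in the algorithm: the SCAD derivative and one LQA weight update, with a = 3.7 as recommended by Fan and Li [20]; the numerical threshold for treating a coefficient as zero is an illustrative choice.

```python
import numpy as np

def scad_derivative(t, lam, a=3.7):
    """p'_lambda(t) for t >= 0 (Fan and Li [20])."""
    t = np.abs(t)
    return np.where(t <= lam, lam, np.maximum(a * lam - t, 0.0) / (a - 1))

def lqa_step(theta_old, lam, eps=1e-6):
    """Local quadratic approximation: near a nonzero theta_j0,
    p_lambda(|theta_j|) ~ p_lambda(|theta_j0|) + 0.5 * w_j * (theta_j**2 - theta_j0**2),
    with weight w_j = p'_lambda(|theta_j0|) / |theta_j0|.  Components whose current
    value is numerically zero are removed from the model, as in Fan and Li [20]."""
    theta_old = np.asarray(theta_old, dtype=float)
    active = np.abs(theta_old) > eps
    w = np.zeros_like(theta_old)
    w[active] = scad_derivative(theta_old[active], lam) / np.abs(theta_old[active])
    return active, w    # w enters the penalized quadratic problem maximized at each iteration
```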
2.5. The Choice of the Regularization Parameter and
We can choose the tuning parameters using a method similar to cross-validation. However, our penalty function contains many tuning parameters, and minimizing the cross-validation score over such a high-dimensional space is difficult. To overcome this difficulty, similar to Zhao and Xue [28], we take the tuning parameters as
where and are the unpenalized estimators of and , respectively. Then, we can estimate and K by minimizing the following cross-validation score:
where and are the solutions based on (17) after deleting the ith subject.
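A schematic of the leave-one-out cross-validation search over the base tuning parameter and the number of interior knots; `fit_penalized` and `predict` are hypothetical placeholders for maximizing (17) on the reduced sample and evaluating the fitted model at the deleted subject.

```python
import numpy as np

def cross_validate(y, data, lam_grid, K_grid, fit_penalized, predict):
    """Leave-one-out CV: for each (lam, K), refit without subject i and
    accumulate the squared prediction error at subject i."""
    n = len(y)
    best, best_score = None, np.inf
    for lam in lam_grid:
        for K in K_grid:
            score = 0.0
            for i in range(n):
                keep = np.arange(n) != i
                fit_i = fit_penalized(y[keep], data[keep], lam=lam, K=K)  # maximize (17) without subject i
                score += (y[i] - predict(fit_i, data[i])) ** 2
            if score < best_score:
                best, best_score = (lam, K), score
    return best
```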
2.6. The Choice of the Regularization Parameter
The tuning parameter plays a decisive role in the robustness and efficiency of the proposed robust regression estimators. We propose a data-driven procedure to choose it so that the resulting method achieves both high efficiency and high robustness. We first determine the tuning parameters for which the proposed penalized robust estimators have an asymptotic breakdown point of 1/2 and then, among these, select the tuning parameter that maximizes efficiency.
The specific procedure steps are as follows:
Step 1. In this step, we find the pseudo outlier set of the sample as in Wang et al. [10] (a sketch of this step is given after Step 4). Let and . Calculate , and . Then, take the pseudo outlier set , set , and .
Step 2. In this step, we are going to update the tuning parameter . Suppose there are m bad points and good points in . Define the bad points by and the good points by .
The proportion of bad points in is . We first compute the initial estimators and . For a contaminated sample , let
where . Let be the minimizer of in the set , where denotes the determinant operator,
and
Step 3. The value of can be calculated from (22). Then, we can get the value of and by (21). Through fixed and , and selected in Step 2, and can be updated by maximizing (17).
Step 4. Following Xue and Pang [12], we set the estimators and as the initial estimates, which means and . We then repeat Steps 1–3 until , , and converge.
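As mentioned in Step 1, the following is a sketch of the pseudo outlier detection, assuming the MAD-based rule of Wang et al. [10] (robust scale 1.4826 × MAD with cutoff 2.5); the residuals are assumed to come from an initial unpenalized fit, and the exact constants used in the paper may differ.

```python
import numpy as np

def pseudo_outlier_set(residuals, cutoff=2.5):
    """Flag observations whose absolute residual exceeds cutoff times a robust
    scale estimate (1.4826 * MAD), following Wang et al. [10]."""
    r = np.asarray(residuals, dtype=float)
    scale = 1.4826 * np.median(np.abs(r - np.median(r)))
    outliers = np.abs(r) >= cutoff * scale
    return np.flatnonzero(outliers), outliers.mean()   # indices and proportion of "bad" points
```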
3. Simulation
Here we compare the finite-sample performance of the proposed estimation and variable selection method with that of Yang and Yang [25] (QR), Xue and Wang [18] (EL), and Xue and Pang [12] (EE) via Monte Carlo simulations. Since Xue and Wang [18] (EL) and Xue and Pang [12] (EE) do not consider the selection of significant variables, we introduce an adaptive penalty term into their objective functions so that significant variables can be selected.
According to Yang and Yang [25], we choose the Gaussian kernel function in the simulations of the quantile regression method with . Evaluation of the performance of the estimators noted above is based on the following three criteria: (1) the average absolute deviations (AAD) of the estimated coefficients and the standard deviations (SD) for each; (2) mean absolute deviations (MAD) of , which can be calculated by the expression , where represents the p-norm; and (3) the square root of the average square error (RASE) as a measure of the performance of estimator , calculated as follows:
for , where denote the grid points used to assess the function .
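A small sketch of the RASE criterion over a fixed grid of evaluation points; `g_hat` and `g_true` stand for the estimated and true coefficient functions of one simulated replication and are hypothetical here.

```python
import numpy as np

def rase(g_hat, g_true, grid):
    """Square root of the average squared error of an estimated function over grid points."""
    diff = g_hat(grid) - g_true(grid)
    return np.sqrt(np.mean(diff ** 2))

# Example with hypothetical functions on 200 equally spaced grid points
grid = np.linspace(0, 1, 200)
g_true = np.sin
g_hat = lambda u: np.sin(u) + 0.05 * np.cos(5 * u)   # stand-in for a fitted spline
print(rase(g_hat, g_true, grid))
```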
Additionally, in order to demonstrate the effectiveness of the variable selection procedure, the average number of true zero coefficients correctly identified as zero (NC), the average number of true non-zero coefficients incorrectly identified as zero (NIC), as well as the probability of correctly selecting the true model (PC) are presented in our simulation. The tuning parameter is chosen for each simulation sample.
Example 1. In this example, we focus attention on the estimation of the proposed estimation procedure, and the following SIVCM is considered:
where , , and are jointly normally distributed with mean 0, variance 1 and correlation , , and . The error and , , , , are independent; may have missing values. The selection probability functions are given by:
We consider with . The corresponding average missing rates are . In our simulation, three different distributions of the model error are considered (a data-generating sketch follows this list):
Case 1: the standard normal distribution.
Case 2: the centralized t-distribution with three degrees of freedom, which generates a heavy-tailed distribution.
Case 3: a mixture of normals, which is used to produce outliers.
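For concreteness, a sketch of generating errors from the three cases; the mixture proportion and the variance of the contaminating component in Case 3 are assumptions for illustration (the paper's exact values are not reproduced above).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200

eps_case1 = rng.standard_normal(n)                  # Case 1: N(0, 1)
eps_case2 = rng.standard_t(df=3, size=n)            # Case 2: t(3), centred and heavy-tailed
# Case 3: mixture of normals producing outliers, e.g. 0.9 N(0, 1) + 0.1 N(0, 5^2) as an assumed example
outlier = rng.random(n) < 0.1
eps_case3 = np.where(outlier, rng.normal(0, 5, n), rng.standard_normal(n))
```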
Table 1 displays the average absolute deviations (AAD) and the standard deviations (SD), as well as the mean absolute deviations (MAD), for each case with sample sizes . It can be seen that when the errors are normally distributed, our proposed estimator based on the exponential squared loss (ESL) has smaller AAD, SD, and MAD than the quantile regression (QR), estimating equation (EE), and empirical likelihood ratio (EL) methods for all sample sizes, which means that the proposed estimator performs better than the other three estimators. The proposed estimator also gives good results for the other two error distributions. The significant improvement of our proposed estimator over the EE, EL, and QR estimators indicates that the proposed estimation method ESL is robust to datasets with outliers or heavy-tailed error distributions of the response variable. More importantly, as the sample size n increases, the performance of the estimator improves significantly.
Table 1.
Simulation results of AAD (), SD (), and MAD () for the estimators of .
The square root of the average squared error (RASE) of the estimator for the nonparametric functions with sample sizes of n = 50, 200, and 400 is reported in Table 2, which gives results similar to those in Table 1. We note that no matter which of the above three distributions the error follows, our proposed estimator has smaller RASE than the other three estimators and performs better. That is, for the non-normal distributions, our proposed estimation method ESL is consistently superior to QR, EE, and EL. When the selection probability is correctly specified and estimated using the parametric model, a clear pattern emerges: as the sample size n increases, the performance of both estimators improves steadily.
Table 2.
Simulation results of RASE for the estimators of .
Example 2. This example aims to study the variable selection performance for the index parameters in model (1). The model setup is similar to (24) except that the covariates are independently generated from and . As in Example 1, three different error distributions , , and are considered to show the robustness of the proposed estimation method based on the exponential squared loss (ESL). The error and , ⋯, , , are independent; may have missing values. The selection probability functions are given by:
We consider with . The corresponding average missing rates are .
For each mechanism mentioned above, we compare the performance of four methods: our proposed method (ESL-SCAD), the LSE-SCAD method of Feng and Xue [11], the LAD-SCAD method of Yang and Yang [25], and the EE-SCAD method based on Xue and Pang [12]. The results are reported in Table 3 and are similar to the conclusions of Example 1. Whether the error term follows the normal distribution, the centralized t-distribution, or the mixture of normals, our proposed method performs variable selection more efficiently, with larger NC and smaller NIC. When there are outliers in the response variable or heavy-tailed error distributions, ESL-SCAD performs clearly better than the LAD-SCAD, EE-SCAD, and LSE-SCAD estimators. For normal errors, ESL-SCAD hardly loses any efficiency.
Table 3.
Variable selection results and RASE of , in Example 2.
The proposed procedure is also competitive in terms of computational cost. The calculations were performed on a computer with an AMD Ryzen processor and 16 GB of RAM, running Windows 10, and only one CPU was used for fair comparison. Results on the computational efficiency of our proposed method are presented in Table 4 and Table 5, which show CPU times (in seconds) for different combinations of the full data size n and the number of covariates p. It is seen that the proposed algorithm is fast.
Table 4.
CPU times for different n in Example 1.
Table 5.
CPU times for different n in Example 2.
4. Discussion
In this paper, we use penalized regression with the exponential squared loss to propose a robust variable selection procedure for the single-index varying-coefficient model with missing covariates. B-splines are used to approximate the unknown coefficient functions, IPW is employed to deal with the bias resulting from missing covariates, and a non-convex penalty is used to estimate and select variables simultaneously. We examine the sampling properties and robustness of our estimator. The theoretical and simulation studies in this paper demonstrate the merits of our method, and we also illustrate that it performs well on real data. In particular, we show that the estimator attains the highest sample breakdown point and that its influence function is bounded for outliers in either the response domain or the covariate domain. When outliers are present (regardless of the mechanism), EE-SCAD and LSE-SCAD are inferior in terms of variable selection performance.
Moreover, further studies can be built on the proposed method. First, goodness-of-fit testing is worth considering; in this paper we only study sparse estimation and variable selection. Second, censored data could be examined under this model. An investigation of these issues is part of future work but is beyond the scope of this paper. Finally, in the proposed theory the interior knots are treated as fixed values; how to select the interior knots optimally when data are missing is an interesting problem worthy of future research.
Author Contributions
Formal analysis, H.S.; Methodology, Y.S.; Software, Y.L. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by NNSF project (61503412) of China, NSF project (ZR2019MA016) of Shandong Province of China.
Conflicts of Interest
The authors declare that they have no competing interest.
References
- Yates, F. The analysis of replicated experiments when the field results are incomplete. Emp. J. Exp. Agric. 1933, 1, 129–142.
- Healy, M.; Westmacott, M. Missing values in experiments analysed on automatic computers. J. R. Stat. Soc. Ser. B Methodol. 1956, 5, 203–206.
- Horvitz, D.G.; Thompson, D.J. A generalization of sampling without replacement from a finite universe. J. Am. Stat. Assoc. 1952, 47, 663–685.
- Robins, J.M.; Rotnitzky, A.; Zhao, L.P. Estimation of regression coefficients when some regressors are not always observed. J. Am. Stat. Assoc. 1994, 89, 846–866.
- Wang, C.; Wang, S.; Zhao, L.P.; Ou, S.T. Weighted semiparametric estimation in regression analysis with missing covariate data. J. Am. Stat. Assoc. 1997, 92, 512–525.
- Little, R.J.; Rubin, D.B. Statistical Analysis with Missing Data; John Wiley & Sons: Hoboken, NJ, USA, 2019; Volume 793.
- Liang, H.; Wang, S.; Robins, J.M.; Carroll, R.J. Estimation in partially linear models with missing covariates. J. Am. Stat. Assoc. 2004, 99, 357–367.
- Tsiatis, A.A. Semiparametric Theory and Missing Data; Springer: Berlin/Heidelberg, Germany, 2006.
- Friedman, J.; Hastie, T.; Tibshirani, R. Additive logistic regression: A statistical view of boosting (with discussion and a rejoinder by the authors). Ann. Stat. 2000, 28, 337–407.
- Wang, X.; Jiang, Y.; Huang, M.; Zhang, H. Robust variable selection with exponential squared loss. J. Am. Stat. Assoc. 2013, 108, 632–643.
- Feng, S.; Xue, L. Variable selection for single-index varying-coefficient model. Front. Math. China 2013, 8, 541–565.
- Xue, L.; Pang, Z. Statistical inference for a single-index varying-coefficient model. Stat. Comput. 2013, 23, 589–599.
- Hardle, W.; Hall, P.; Ichimura, H. Optimal smoothing in single-index models. Ann. Stat. 1993, 21, 157–178.
- Wu, T.Z.; Lin, H.; Yu, Y. Single-index coefficient models for nonlinear time series. J. Nonparametr. Stat. 2011, 23, 37–58.
- Hastie, T.; Tibshirani, R. Varying-coefficient models. J. R. Stat. Soc. Ser. B Methodol. 1993, 55, 757–779.
- Fan, J.; Zhang, W. Statistical estimation in varying coefficient models. Ann. Stat. 1999, 27, 1491–1518.
- Xia, Y.; Li, W.K. On single-index coefficient regression models. J. Am. Stat. Assoc. 1999, 94, 1275–1285.
- Xue, L.; Wang, Q. Empirical likelihood for single-index varying-coefficient models. Bernoulli 2012, 18, 836–856.
- Tibshirani, R. Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B Methodol. 1996, 58, 267–288.
- Fan, J.; Li, R. Variable selection via nonconcave penalized likelihood and its oracle properties. J. Am. Stat. Assoc. 2001, 96, 1348–1360.
- Zou, H. The adaptive lasso and its oracle properties. J. Am. Stat. Assoc. 2006, 101, 1418–1429.
- Peng, H.; Huang, T. Penalized least squares for single index models. J. Stat. Plan. Inference 2011, 141, 1362–1379.
- Yang, H.; Yang, J. A robust and efficient estimation and variable selection method for partially linear single-index models. J. Multivar. Anal. 2014, 129, 227–242.
- Wang, D.; Kulasekera, K. Parametric component detection and variable selection in varying-coefficient partially linear models. J. Multivar. Anal. 2012, 112, 117–129.
- Yang, J.; Yang, H. Quantile regression and variable selection for single-index varying-coefficient models. Commun. Stat. Simul. Comput. 2017, 46, 4637–4653.
- Yu, Y.; Ruppert, D. Penalized spline estimation for partially linear single-index models. J. Am. Stat. Assoc. 2002, 97, 1042–1054.
- He, X.; Zhu, Z.Y.; Fung, W.K. Estimation in a semiparametric model for longitudinal data with unspecified dependence structure. Biometrika 2002, 89, 579–590.
- Zhao, P.; Xue, L. Variable selection for semiparametric varying coefficient partially linear models. Stat. Probab. Lett. 2009, 79, 2148–2157.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).