
Selection Consistency of Lasso-Based Procedures for Misspecified High-Dimensional Binary Model and Random Regressors

by Mariusz Kubkowski 1,2,† and Jan Mielniczuk 1,2,*,†
1 Institute of Computer Science, Polish Academy of Sciences, Jana Kazimierza 5, 01-248 Warsaw, Poland
2 Faculty of Mathematics and Information Science, Warsaw University of Technology, Koszykowa 75, 00-662 Warsaw, Poland
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Entropy 2020, 22(2), 153; https://doi.org/10.3390/e22020153
Received: 13 November 2019 / Revised: 22 January 2020 / Accepted: 24 January 2020 / Published: 28 January 2020
We consider selection of random predictors for a high-dimensional regression problem with a binary response and a general loss function. An important special case occurs when the binary model is semi-parametric and the response function is misspecified under a parametric model fit. When the true response coincides with the postulated parametric response for a certain value of the parameter, we obtain a common framework for parametric inference. Both correct specification and misspecification are covered in this contribution. Variable selection in this scenario aims at recovering the support of the minimizer of the associated risk with large probability. We propose a two-step Screening-Selection (SS) procedure which consists of screening and ordering predictors by the Lasso method and then selecting the subset of predictors which minimizes the Generalized Information Criterion over the corresponding nested family of models. We prove consistency of the proposed selection method under conditions that allow for a much larger number of predictors than the number of observations. For the semi-parametric case, when the distribution of random predictors satisfies the linear regressions condition, the true and the estimated parameters are collinear and their common support can be consistently identified. This partly explains the robustness of selection procedures to response function misspecification.
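
The following is a minimal sketch of the two-step Screening-Selection (SS) idea summarised above, written with numpy and scikit-learn. The function name, the fixed Lasso penalty (C=1.0) and the GIC penalty weight log(p) are illustrative assumptions, not the authors' exact specification.

```python
# Sketch of the Screening-Selection (SS) procedure described in the abstract.
# Assumptions (not from the paper): a single fixed Lasso penalty, GIC weight log(p),
# and a near-unpenalised logistic refit in the selection step.
import numpy as np
from sklearn.linear_model import LogisticRegression


def screening_selection(X, y, gic_weight=None):
    """Return the subset of predictor indices minimising a GIC over the
    Lasso-ordered nested family of logistic models."""
    n, p = X.shape
    if gic_weight is None:
        gic_weight = np.log(p)  # assumed penalty weight per parameter

    # Step 1 (screening): L1-penalised logistic fit; keep predictors with
    # non-zero coefficients, ordered by decreasing |coefficient|.
    lasso = LogisticRegression(penalty="l1", solver="liblinear", C=1.0)
    lasso.fit(X, y)
    coef = lasso.coef_.ravel()
    order = np.argsort(-np.abs(coef))
    screened = [j for j in order if coef[j] != 0.0]

    # Step 2 (selection): minimise GIC(s) = 2 * neg. log-likelihood + gic_weight * |s|
    # over the nested family {}, {j1}, {j1, j2}, ... induced by the ordering.
    best_subset, best_gic = [], np.inf
    for k in range(len(screened) + 1):
        subset = screened[:k]
        if k == 0:
            # intercept-only model: constant predicted probability
            p_hat = np.full(n, min(max(y.mean(), 1e-12), 1 - 1e-12))
        else:
            # near-unpenalised maximum-likelihood refit on the candidate subset
            ml = LogisticRegression(C=1e6, max_iter=1000).fit(X[:, subset], y)
            p_hat = ml.predict_proba(X[:, subset])[:, 1]
        nll = -np.sum(y * np.log(p_hat + 1e-12) + (1 - y) * np.log(1 - p_hat + 1e-12))
        gic = 2.0 * nll + gic_weight * k
        if gic < best_gic:
            best_subset, best_gic = list(subset), gic
    return best_subset
```

Restricting the second step to the nested family induced by the Lasso ordering means at most p + 1 refits rather than an exhaustive search over 2^p subsets, which is what keeps the GIC minimisation feasible when the number of predictors far exceeds the number of observations.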
Keywords: high-dimensional regression; loss function; random predictors; misspecification; consistent selection; subgaussianity; generalized information criterion; robustness
MDPI and ACS Style

Kubkowski, M.; Mielniczuk, J. Selection Consistency of Lasso-Based Procedures for Misspecified High-Dimensional Binary Model and Random Regressors. Entropy 2020, 22, 153.
