Bias-Corrected Inference of High-Dimensional Generalized Linear Models
Abstract
1. Introduction
2. Weighted Link-Specific Method for Generalized Linear Models
2.1. Poisson Regression
2.2. Gamma Regression
3. Theoretical Properties
3.1. Asymptotic Normality
- The link function f is a monotone, twice-differentiable concave function on ℝ;
- There is a constant such that for all , follows a standard Gaussian distribution, and ;
- There exists a constant such that ;
- For the objective defined by Formula (2), there is a constant C such that the Hessian matrix can be expressed as , with and .

None of the above four conditions is very strict, and a large number of link functions satisfy them: conditions 1–3 are relatively easy to verify, while condition 4 is taken from Huang and Zhang [26]. Second, for the random design variables and their distributions, we assume that:

- The design vectors are independent and identically distributed sub-Gaussian random vectors; that is, there is a constant satisfying the sub-Gaussian norm bound (a simulated example consistent with these assumptions is sketched after this list);
- (1) , then we have and ;
- (2) , then .
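As a concrete illustration, the following is a minimal sketch (not the authors' code) of a data-generating process consistent with the assumptions above: i.i.d. Gaussian (hence sub-Gaussian) design vectors and Poisson responses under the canonical log link, followed by the kind of l1-penalized initial fit that bias-corrected inference typically starts from. The dimensions n and p, the sparsity level k, the coefficient value 0.1, and the penalty level are illustrative assumptions, not values taken from the paper.

```python
# Minimal sketch: sub-Gaussian design + high-dimensional Poisson GLM.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n, p, k = 200, 400, 20                    # sample size, dimension, sparsity (assumed)

X = rng.standard_normal((n, p))           # rows: i.i.d. sub-Gaussian design vectors
beta = np.zeros(p)
beta[:k] = 0.1                            # k nonzero coefficients (assumed value)
y = rng.poisson(np.exp(X @ beta))         # Poisson responses under the log link

# l1-penalized Poisson fit: the initial (biased) estimator that a
# bias-correction step would start from.
model = sm.GLM(y, X, family=sm.families.Poisson())
beta_hat = model.fit_regularized(alpha=0.05, L1_wt=1.0).params
```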
3.2. Confidence Interval
4. Simulations
CIs for High-Dimensional Poisson and Gamma Regression
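Before the results, here is a rough sketch of how a bias-corrected confidence interval for a single Poisson coefficient can be formed. This is a generic one-step correction in which a ridge-regularized Hessian inverse stands in for the paper's weighted link-specific projection direction; it continues from the simulated data X, y and the penalized fit beta_hat sketched in Section 3.1, and the ridge constant 0.05 and target coordinate j = 0 are illustrative assumptions.

```python
# Generic one-step debiasing sketch (not the authors' weighted link-specific
# procedure); reuses X, y, n, p, beta_hat from the previous sketch.
from scipy import stats

j = 0                                      # coordinate of interest (assumed)
mu_hat = np.exp(X @ beta_hat)              # fitted means under the log link
H = (X * mu_hat[:, None]).T @ X / n        # Hessian of the Poisson negative log-likelihood
w = np.linalg.solve(H + 0.05 * np.eye(p), np.eye(p)[j])   # surrogate projection direction

# One-step correction: initial estimate plus the projected average score.
score = X.T @ (y - mu_hat) / n
b_debiased = beta_hat[j] + w @ score

# Plug-in standard error and a nominal 95% confidence interval.
S = (X * ((y - mu_hat) ** 2)[:, None]).T @ X / n
se = np.sqrt(w @ S @ w / n)
z = stats.norm.ppf(0.975)
print(f"95% CI for beta_{j}: [{b_debiased - z * se:.3f}, {b_debiased + z * se:.3f}]")
```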
5. Real Data Analysis
5.1. False Discovery Rate and Power Comparison
5.2. CIs for the Different Stimuli
6. Discussion
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
1. Fotuhi, H.; Amiri, A.; Maleki, M.R. Phase I monitoring of social networks based on Poisson regression profiles. Qual. Reliab. Eng. Int. 2018, 34, 572–588.
2. Ortega, E.M.; Bolfarine, H.; Paula, G.A. Influence diagnostics in generalized log-gamma regression models. Comput. Stat. Data Anal. 2003, 42, 165–186.
3. Sørensen, Ø.; Hellton, K.H.; Frigessi, A.; Thoresen, M. Covariate selection in high-dimensional generalized linear models with measurement error. J. Comput. Graph. Stat. 2018, 27, 739–749.
4. Piironen, J.; Paasiniemi, M.; Vehtari, A. Projective inference in high-dimensional problems: Prediction and feature selection. Electron. J. Stat. 2020, 14, 2155–2197.
5. Liang, F.; Xue, J.; Jia, B. Markov neighborhood regression for high-dimensional inference. J. Am. Stat. Assoc. 2022, 117, 1200–1214.
6. Liu, C.; Zhao, X.; Huang, J. A random projection approach to hypothesis tests in high-dimensional single-index models. J. Am. Stat. Assoc. 2022, 1–21.
7. Deshpande, Y.; Javanmard, A.; Mehrabi, M. Online debiasing for adaptively collected high-dimensional data with applications to time series analysis. J. Am. Stat. Assoc. 2021, 1–14.
8. Cai, T.T.; Guo, Z. Confidence intervals for high-dimensional linear regression: Minimax rates and adaptivity. Ann. Stat. 2017, 45, 615–646.
9. Athey, S.; Imbens, G.W.; Wager, S. Approximate residual balancing: Debiased inference of average treatment effects in high dimensions. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 2018, 80, 597–623.
10. Zhu, Y.; Bradic, J. Linear hypothesis testing in dense high-dimensional linear models. J. Am. Stat. Assoc. 2018, 113, 1583–1600.
11. Sur, P.; Chen, Y.; Candès, E.J. The likelihood ratio test in high-dimensional logistic regression is asymptotically a rescaled chi-square. Probab. Theory Relat. Fields 2019, 175, 487–558.
12. Ma, R.; Cai, T.T.; Li, H. Global and simultaneous hypothesis testing for high-dimensional logistic regression models. J. Am. Stat. Assoc. 2021, 116, 984–998.
13. Shi, C.; Song, R.; Lu, W.; Li, R. Statistical inference for high-dimensional models via recursive online-score estimation. J. Am. Stat. Assoc. 2021, 116, 1307–1318.
14. Song, Y.; Liang, X.; Zhu, Y.; Lin, L. Robust variable selection with exponential squared loss for the spatial autoregressive model. Comput. Stat. Data Anal. 2021, 155, 107094.
15. Oda, R.; Mima, Y.; Yanagihara, H.; Fujikoshi, Y. A high-dimensional bias-corrected AIC for selecting response variables in multivariate calibration. Commun. Stat.-Theory Methods 2021, 50, 3453–3476.
16. Janková, J.; van de Geer, S. De-biased sparse PCA: Inference and testing for eigenstructure of large covariance matrices. arXiv 2018, arXiv:1801.10567.
17. Cai, T.T.; Guo, Z.; Ma, R. Statistical inference for high-dimensional generalized linear models with binary outcomes. J. Am. Stat. Assoc. 2021, 1–14.
18. Belloni, A.; Chernozhukov, V.; Wei, Y. Post-selection inference for generalized linear models with many controls. J. Bus. Econ. Stat. 2016, 34, 606–619.
19. Li, X.; Chen, F.; Liang, H.; Ruppert, D. Model checking for logistic models when the number of parameters tends to infinity. J. Comput. Graph. Stat. 2022, 1–30.
20. Ning, Y.; Liu, H. A general theory of hypothesis tests and confidence regions for sparse high dimensional models. Ann. Stat. 2017, 45, 158–195.
21. Buccini, A.; De la Cruz Cabrera, O.; Donatelli, M.; Martinelli, A.; Reichel, L. Large-scale regression with non-convex loss and penalty. Appl. Numer. Math. 2020, 157, 590–601.
22. Jiang, Y.; Wang, Y.; Zhang, J.; Xie, B.; Liao, J.; Liao, W. Outlier detection and robust variable selection via the penalized weighted LAD-LASSO method. J. Appl. Stat. 2021, 48, 234–246.
23. Cai, T.; Cai, T.T.; Guo, Z. Optimal statistical inference for individualized treatment effects in high-dimensional models. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 2021, 83, 669–719.
24. Javanmard, A.; Lee, J.D. A flexible framework for hypothesis testing in high dimensions. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 2020, 82, 685–718.
25. Guo, Z.; Rakshit, P.; Herman, D.S.; Chen, J. Inference for the case probability in high-dimensional logistic regression. J. Mach. Learn. Res. 2021, 22, 11480–11533.
26. Huang, J.; Zhang, C.H. Estimation and selection via absolute penalized convex minimization and its multistage adaptive applications. J. Mach. Learn. Res. 2012, 13, 1839–1864.
27. Shalek, A.K.; Satija, R.; Shuga, J.; Trombetta, J.J.; Gennert, D.; Lu, D.; Chen, P.; Gertner, R.S.; Gaublomme, J.T.; Yosef, N.; et al. Single-cell RNA-seq reveals dynamic paracrine control of cellular variation. Nature 2014, 510, 363–369.
28. Van de Geer, S.; Bühlmann, P.; Ritov, Y.; Dezeure, R. On asymptotically optimal confidence regions and tests for high-dimensional models. Ann. Stat. 2014, 42, 1166–1202.
| p | Coverage WLS (%) | Coverage WLP (%) | Coverage Lasso (%) | Length WLS | Length WLP | Length Lasso |
|---|---|---|---|---|---|---|
| k = 20 | | | | | | |
| 400 | 93.0 | 9.2 | 10.2 | 1.12 | 0.41 | 0.08 |
| 700 | 91.0 | 8.5 | 6.5 | 1.11 | 0.41 | 0.07 |
| 1000 | 96.0 | 10.2 | 4.5 | 1.13 | 0.40 | 0.07 |
| 1300 | 93.0 | 10.0 | 4.05 | 0.97 | 0.41 | 0.06 |
| k = 25 | | | | | | |
| 400 | 95.0 | 3.0 | 10.3 | 1.15 | 0.40 | 0.07 |
| 700 | 90.0 | 1.9 | 3.04 | 1.14 | 0.41 | 0.07 |
| 1000 | 98.0 | 1.2 | 2.23 | 1.34 | 0.41 | 0.07 |
| 1300 | 91.0 | 1.0 | 1.05 | 0.98 | 0.40 | 0.06 |
| k = 35 | | | | | | |
| 400 | 95.0 | 6.8 | 9.38 | 1.15 | 0.41 | 0.08 |
| 700 | 95.3 | 4.1 | 6.46 | 0.95 | 0.41 | 0.08 |
| 1000 | 96.8 | 1.2 | 5.31 | 1.35 | 0.41 | 0.06 |
| 1300 | 93.0 | 1.1 | 2.24 | 0.97 | 0.40 | 0.06 |
| p | Coverage WLS (%) | Coverage WLP (%) | Coverage Lasso (%) | Length WLS | Length WLP | Length Lasso |
|---|---|---|---|---|---|---|
| k = 20 | | | | | | |
| 400 | 95.0 | 99.0 | 16.3 | 1.11 | 0.41 | 0.08 |
| 700 | 99.3 | 99.2 | 13.2 | 1.23 | 0.40 | 0.07 |
| 1000 | 99.2 | 99.5 | 23.3 | 1.13 | 0.41 | 0.08 |
| 1300 | 99.6 | 99.8 | 24.2 | 1.02 | 0.40 | 0.06 |
| k = 25 | | | | | | |
| 400 | 97.0 | 94.1 | 11.3 | 1.15 | 0.41 | 0.07 |
| 700 | 96.0 | 95.2 | 27.1 | 1.23 | 0.40 | 0.08 |
| 1000 | 99.5 | 96.1 | 28.6 | 1.26 | 0.41 | 0.09 |
| 1300 | 96.0 | 98.2 | 37.6 | 1.03 | 0.41 | 0.10 |
| k = 35 | | | | | | |
| 400 | 95.0 | 94.1 | 21.56 | 1.16 | 0.41 | 0.09 |
| 700 | 96.9 | 96.4 | 34.8 | 1.03 | 0.41 | 0.08 |
| 1000 | 98.7 | 98.6 | 33.4 | 1.15 | 0.41 | 0.07 |
| 1300 | 99.5 | 98.9 | 35.3 | 1.03 | 0.40 | 0.06 |
| p | Coverage WLS (%) | Coverage WLP (%) | Coverage Lasso (%) | Length WLS | Length WLP | Length Lasso |
|---|---|---|---|---|---|---|
| k = 20 | | | | | | |
| 400 | 93.4 | 6.1 | 11.12 | 0.99 | 0.40 | 0.08 |
| 700 | 94.2 | 14.2 | 8.01 | 0.84 | 0.41 | 0.07 |
| 1000 | 92.5 | 13.2 | 3.66 | 0.85 | 0.41 | 0.08 |
| 1300 | 93.2 | 13.0 | 1.03 | 0.86 | 0.40 | 0.07 |
| k = 25 | | | | | | |
| 400 | 95.1 | 3.2 | 4.23 | 0.98 | 0.41 | 0.07 |
| 700 | 89.3 | 2.1 | 2.04 | 0.84 | 0.41 | 0.07 |
| 1000 | 91.7 | 1.1 | 3.26 | 0.87 | 0.40 | 0.08 |
| 1300 | 92.3 | 1.0 | 2.06 | 0.89 | 0.40 | 0.07 |
| k = 35 | | | | | | |
| 400 | 86.2 | 7.1 | 5.07 | 0.87 | 0.41 | 0.08 |
| 700 | 86.3 | 6.2 | 1.09 | 0.85 | 0.41 | 0.07 |
| 1000 | 80.4 | 1.2 | 1.03 | 0.78 | 0.41 | 0.07 |
| 1300 | 82.3 | 1.1 | 1.01 | 0.80 | 0.40 | 0.07 |
| p | Coverage WLS (%) | Coverage WLP (%) | Coverage Lasso (%) | Length WLS | Length WLP | Length Lasso |
|---|---|---|---|---|---|---|
| k = 20 | | | | | | |
| 400 | 96.2 | 99.1 | 10.02 | 0.98 | 0.41 | 0.08 |
| 700 | 98.3 | 98.2 | 24.02 | 0.97 | 0.41 | 0.07 |
| 1000 | 96.3 | 97.3 | 27.56 | 0.89 | 0.41 | 0.08 |
| 1300 | 98.7 | 98.0 | 26.04 | 0.87 | 0.40 | 0.07 |
| k = 25 | | | | | | |
| 400 | 95.2 | 95.0 | 11.25 | 0.98 | 0.41 | 0.08 |
| 700 | 99.2 | 96.8 | 17.06 | 0.91 | 0.41 | 0.07 |
| 1000 | 99.3 | 97.3 | 26.48 | 0.94 | 0.40 | 0.08 |
| 1300 | 99.6 | 98.1 | 35.76 | 0.95 | 0.41 | 0.07 |
| k = 35 | | | | | | |
| 400 | 98.6 | 96.2 | 20.0 | 0.88 | 0.40 | 0.07 |
| 700 | 99.2 | 97.3 | 26.3 | 0.90 | 0.40 | 0.08 |
| 1000 | 98.6 | 98.1 | 51.05 | 0.80 | 0.41 | 0.07 |
| 1300 | 99.3 | 98.3 | 35.32 | 0.85 | 0.41 | 0.07 |
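Reading the four tables above: in simulation studies of this kind, Coverage (%) is typically the percentage of Monte Carlo replications in which the interval contained the true coefficient, and Length is the average interval width. The following is a minimal sketch of how such summaries are computed from replicated intervals; the helper run_once() is hypothetical and stands for one repetition of the simulation-and-interval construction sketched earlier.

```python
import numpy as np

def coverage_and_length(cis, truth):
    """Summarize replicated confidence intervals.

    cis: array of shape (R, 2) with lower/upper endpoints over R replications.
    truth: the true coefficient the intervals target.
    """
    lo, hi = cis[:, 0], cis[:, 1]
    coverage = 100.0 * np.mean((lo <= truth) & (truth <= hi))  # empirical coverage in %
    length = float(np.mean(hi - lo))                           # average interval width
    return coverage, length

# Hypothetical usage: run_once() would rerun the earlier simulation and
# return one (lower, upper) pair for the coordinate of interest.
# cis = np.array([run_once() for _ in range(100)])
# print(coverage_and_length(cis, truth=0.1))
```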
Tang, S.; Shi, Y.; Zhang, Q. Bias-Corrected Inference of High-Dimensional Generalized Linear Models. Mathematics 2023, 11, 932. https://doi.org/10.3390/math11040932