Robust Permutation Tests for Penalized Splines
Abstract
1. Introduction
1.1. Penalized Spline Prevalence
1.2. Penalized Spline Definition
1.3. Penalized Spline Estimation
1.4. Bayesian Interpretation
1.5. Proposed Approach
2. Omnibus Regression Tests
2.1. Model and Estimation
2.2. Asymptotic Distributions
- A1.
- with and for
- A2.
- are iid from a distribution satisfying
- A3.
- and are nonsingular, where , and is almost surely invertible. In the fixed predictors case, the mean vector is and the covariance matrix terms are defined as and , where .
2.3. Test Statistics
2.4. Permutation Inference
3. Conditional Regression Tests
3.1. Model and Estimation
3.2. Asymptotic Distributions
- B1.
- with and for
- B2.
- are iid from a distribution satisfying
- B3.
- and are nonsingular, where , and is almost surely invertible. In the fixed predictors case, the mean vector is and the covariance matrix terms are defined as and , where .
3.3. Test Statistics
3.4. Permutation Inference
4. Simulation Studies
4.1. Simulation A
4.2. Simulation B
5. Discussion
5.1. Summary of Findings
5.2. Future Directions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
Abbreviation | Definition |
---|---|
BLUE | Best Linear Unbiased Estimator |
BLUP | Best Linear Unbiased Predictor |
GCV | Generalized Cross-Validation |
GRR | Generalized Ridge Regression |
OLS | Ordinary Least Squares |
PLS | Penalized Least Squares |
Appendix A. Proofs
Appendix A.1. Proof of Lemma 1
Appendix A.2. Proof of Lemma 2
References
- Fox, J. Quantitative Applications in the Social Sciences: Multiple and Generalized Nonparametric Regression; SAGE Publications, Inc.: Thousand Oaks, CA, USA, 2000.
- Helwig, N.E. Multiple and Generalized Nonparametric Regression. In SAGE Research Methods Foundations; Atkinson, P., Delamont, S., Cernat, A., Sakshaug, J.W., Williams, R.A., Eds.; SAGE Publications, Inc.: London, England, 2020.
- Wahba, G. Spline Models for Observational Data; Society for Industrial and Applied Mathematics: Philadelphia, PA, USA, 1990.
- Wang, Y. Smoothing Splines: Methods and Applications; CRC Press: Boca Raton, FL, USA, 2011.
- Gu, C. Smoothing Spline ANOVA Models, 2nd ed.; Springer: New York, NY, USA, 2013.
- Hastie, T.; Tibshirani, R. Generalized Additive Models; Chapman and Hall/CRC: New York, NY, USA, 1990.
- Ruppert, D.; Wand, M.P.; Carroll, R.J. Semiparametric Regression; Cambridge University Press: Cambridge, UK, 2003.
- Wood, S.N. Generalized Additive Models: An Introduction with R, 2nd ed.; Chapman & Hall: Boca Raton, FL, USA, 2017.
- Almquist, Z.W.; Helwig, N.E.; You, Y. Connecting Continuum of Care point-in-time homeless counts to United States Census areal units. Math. Popul. Stud. 2020, 27, 46–58.
- Kage, C.C.; Helwig, N.E.; Ellingson, A.M. Normative cervical spine kinematics of a circumduction task. J. Electromyogr. Kinesiol. 2021, 61, 102591.
- Helwig, N.E.; Shorter, K.A.; Hsiao-Wecksler, E.T.; Ma, P. Smoothing spline analysis of variance models: A new tool for the analysis of cyclic biomechanical data. J. Biomech. 2016, 49, 3216–3222.
- Hammell, A.E.; Helwig, N.E.; Kaczkurkin, A.N.; Sponheim, S.R.; Lissek, S. The temporal course of over-generalized conditioned threat expectancies in posttraumatic stress disorder. Behav. Res. Ther. 2020, 124, 103513.
- Helwig, N.E.; Sohre, N.E.; Ruprecht, M.R.; Guy, S.J.; Lyford-Pike, S. Dynamic properties of successful smiles. PLoS ONE 2017, 12, e0179708.
- Helwig, N.E.; Ruprecht, M.R. Age, gender, and self-esteem: A sociocultural look through a nonparametric lens. Arch. Sci. Psychol. 2017, 5, 19–31.
- Helwig, N.E.; Gao, Y.; Wang, S.; Ma, P. Analyzing spatiotemporal trends in social media data via smoothing spline analysis of variance. Spat. Stat. 2015, 14, 491–504.
- Helwig, N.E. Regression with ordered predictors via ordinal smoothing splines. Front. Appl. Math. Stat. 2017, 3, 1–13.
- Gu, C. Nonparametric regression with ordinal responses. Stat 2021, 10, e365.
- Gu, C.; Ma, P. Optimal smoothing in nonparametric mixed-effect models. Ann. Stat. 2005, 33, 1357–1379.
- Gu, C.; Ma, P. Generalized Nonparametric Mixed-Effect Models: Computation and Smoothing Parameter Selection. J. Comput. Graph. Stat. 2005, 14, 485–504.
- Helwig, N.E. Efficient estimation of variance components in nonparametric mixed-effects models with large samples. Stat. Comput. 2016, 26, 1319–1336.
- Kim, Y.J.; Gu, C. Smoothing spline Gaussian regression: More scalable computation via efficient approximation. J. R. Stat. Soc. Ser. B 2004, 66, 337–356.
- Gu, C.; Kim, Y.J. Penalized likelihood regression: General formulation and efficient approximation. Can. J. Stat. 2002, 30, 619–628.
- Helwig, N.E.; Ma, P. Fast and stable multiple smoothing parameter selection in smoothing spline analysis of variance models with large samples. J. Comput. Graph. Stat. 2015, 24, 715–732.
- Helwig, N.E.; Ma, P. Smoothing spline ANOVA for super-large samples: Scalable computation via rounding parameters. Stat. Interface 2016, 9, 433–444.
- Berry, L.N.; Helwig, N.E. Cross-validation, information theory, or maximum likelihood? A comparison of tuning methods for penalized splines. Stats 2021, 4, 701–724.
- Helwig, N.E. Spectrally sparse nonparametric regression via elastic net regularized smoothers. J. Comput. Graph. Stat. 2021, 30, 182–191.
- Kimeldorf, G.; Wahba, G. Some results on Tchebycheffian spline functions. J. Math. Anal. Appl. 1971, 33, 82–95.
- Ma, P.; Huang, J.; Zhang, N. Efficient computation of smoothing splines via adaptive basis sampling. Biometrika 2015, 102, 631–645.
- Moore, E.H. On the reciprocal of the general algebraic matrix. Bull. Am. Math. Soc. 1920, 26, 394–395.
- Penrose, R. A generalized inverse for matrices. Math. Proc. Camb. Philos. Soc. 1955, 51, 406–413.
- Wahba, G. Bayesian “confidence intervals” for the cross-validated smoothing spline. J. R. Stat. Soc. Ser. B 1983, 45, 133–150.
- Nychka, D. Bayesian confidence intervals for smoothing splines. J. Am. Stat. Assoc. 1988, 83, 1134–1143.
- Craven, P.; Wahba, G. Smoothing noisy data with spline functions: Estimating the correct degree of smoothing by the method of generalized cross-validation. Numer. Math. 1979, 31, 377–403.
- Gu, C.; Wahba, G. Smoothing spline ANOVA with component-wise Bayesian “confidence intervals”. J. Comput. Graph. Stat. 1993, 2, 97–117.
- Marra, G.; Wood, S.N. Coverage properties of confidence intervals for generalized additive model components. Scand. J. Stat. 2012, 39, 53–74.
- Cox, D.; Koh, E.; Wahba, G.; Yandell, B.S. Testing the (Parametric) Null Model Hypothesis in (Semiparametric) Partial and Generalized Spline Models. Ann. Stat. 1988, 16, 113–119.
- Zhang, D.; Lin, X. Hypothesis testing in semiparametric additive mixed models. Biostatistics 2003, 4, 57–74.
- Liu, A.; Wang, Y. Hypothesis testing in smoothing spline models. J. Stat. Comput. Simul. 2004, 74, 581–597.
- Crainiceanu, C.; Ruppert, D.; Claeskens, G.; Wand, M.P. Exact likelihood ratio tests for penalised splines. Biometrika 2005, 92, 91–103.
- Scheipl, F.; Greven, S.; Küchenhoff, H. Size and power of tests for a zero random effect variance or polynomial regression in additive and linear mixed models. Comput. Stat. Data Anal. 2008, 52, 3283–3299.
- Nummi, T.; Pan, J.; Siren, T.; Liu, K. Testing for Cubic Smoothing Splines under Dependent Data. Biometrics 2011, 67, 871–875.
- Wood, S.N. On p-values for smooth components of an extended generalized additive model. Biometrika 2013, 100, 221–228.
- Wood, S.N. A simple test for random effects in regression models. Biometrika 2013, 100, 1005–1010.
- DiCiccio, C.J.; Romano, J.P. Robust Permutation Tests For Correlation And Regression Coefficients. J. Am. Stat. Assoc. 2017, 112, 1211–1220.
- Hoerl, A.; Kennard, R. Ridge regression: Biased estimation for nonorthogonal problems. Technometrics 1970, 12, 55–67.
- White, H. A Heteroscedasticity-Consistent Covariance Matrix and a Direct Test for Heteroscedasticity. Econometrica 1980, 48, 817–838.
- Henderson, C.R. Estimation of genetic parameters (abstract). Ann. Math. Stat. 1950, 21, 309–310.
- Henderson, C.R. Best Linear Unbiased Estimation and Prediction under a Selection Model. Biometrics 1975, 31, 423–447.
- Robinson, G.K. That BLUP is a Good Thing: The Estimation of Random Effects. Stat. Sci. 1991, 6, 15–32.
- Helwig, N.E. Robust nonparametric tests of general linear model coefficients: A comparison of permutation methods and test statistics. NeuroImage 2019, 201, 116030.
- Helwig, N.E. Statistical nonparametric mapping: Multivariate permutation tests for location, correlation, and regression problems in neuroimaging. WIREs Comput. Stat. 2019, 2, e1457.
- Draper, N.R.; Stoneman, D.M. Testing for the Inclusion of Variables in Linear Regression by a Randomisation Technique. Technometrics 1966, 8, 695–699.
- O’Gorman, T.W. The Performance of Randomization Tests that Use Permutations of Independent Variables. Commun. Stat. Simul. Comput. 2005, 34, 895–908.
- Nichols, T.E.; Ridgway, G.R.; Webster, M.G.; Smith, S.M. GLM permutation: Nonparametric inference for arbitrary general linear models. NeuroImage 2008, 41, S72.
- Manly, B. Randomization and regression methods for testing for associations with geographical, environmental and biological distances between populations. Res. Popul. Ecol. 1986, 28, 201–218.
- Freedman, D.; Lane, D. A Nonstochastic Interpretation of Reported Significance Levels. J. Bus. Econ. Stat. 1983, 1, 292–298.
- ter Braak, C.J.F. Permutation Versus Bootstrap Significance Tests in Multiple Regression and ANOVA. In Bootstrapping and Related Techniques. Lecture Notes in Economics and Mathematical Systems; Jöckel, K.H., Rothe, G., Sendler, W., Eds.; Springer: Berlin/Heidelberg, Germany, 1992; Volume 376, pp. 79–86.
- Still, A.W.; White, A.P. The approximate randomization test as an alternative to the F test in analysis of variance. Br. J. Math. Stat. Psychol. 1981, 34, 243–252.
- Kennedy, P.E.; Cade, B.S. Randomization tests for multiple regression. Commun. Stat. Simul. Comput. 1996, 25, 923–936.
- Huh, M.H.; Jhun, M. Random Permutation Testing in Multiple Linear Regression. Commun. Stat. Theory Methods 2001, 30, 2023–2032.
- Schur, J. Über Potenzreihen, die im Innern des Einheitskreises beschränkt sind. J. für die reine und angew. Math. 1917, 1917, 205–232.
- Hotelling, H. Further Points on Matrix Calculation and Simultaneous Equations. Ann. Math. Stat. 1943, 14, 440–441.
- Hotelling, H. Some New Methods in Matrix Calculation. Ann. Math. Stat. 1943, 14, 1–34.
- Duncan, W.J. Some devices for the solution of large sets of simultaneous linear equations (with an appendix on the reciprocation of partitioned matrices). Lond. Edinb. Dublin Philos. Mag. J. Sci. Seventh Ser. 1944, 35, 660–670.
- Helwig, N.E. npreg: Nonparametric Regression via Smoothing Splines; R Package Version 1.0-9; R Foundation for Statistical Computing: Vienna, Austria, 2022; Available online: https://cran.r-project.org/package=npreg.
- Helwig, N.E. nptest: Nonparametric Tests; R Package Version 1.0-3; R Foundation for Statistical Computing: Vienna, Austria, 2021; Available online: https://cran.r-project.org/package=nptest.
- Wood, S.N. mgcv: Mixed GAM Computation Vehicle with GCV/AIC/REML smoothness estimation and GAMMs by REML/PQL; R Package Version 1.8-40; R Foundation for Statistical Computing: Vienna, Austria, 2022; Available online: https://cran.r-project.org/package=mgcv.
- Zou, H.; Hastie, T. Regularization and variable selection via the elastic net. J. R. Stat. Soc. Ser. B 2005, 67, 301–320.
- Kalpić, D.; Hlupić, N. Multivariate Normal Distributions. In International Encyclopedia of Statistical Science; Lovric, M., Ed.; Springer: Berlin/Heidelberg, Germany, 2011; pp. 907–910.
- Henderson, H.V.; Searle, S.R. On deriving the inverse of a sum of matrices. SIAM Rev. 1981, 23, 53–60.
Code | Method | Permutation Method | |
---|---|---|---|
DS | Draper-Stoneman (1966) | Y | |
OS | O’Gorman-Smith (2005/2008) | Y | |
MA | Manly (1986) | | |
FL | Freedman-Lane (1983) | | |
TB | ter Braak (1992) | | |
SW | Still-White (1981) | | |
KC | Kennedy-Cade (1996) | | |
HJ | Huh-Jhun (2001) | | |
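To make two of the tabulated schemes concrete, the following Python sketch contrasts the Draper-Stoneman (DS) and Freedman-Lane (FL) permutation strategies for testing one predictor x while adjusting for nuisance covariates Z in an ordinary linear model. It is only an illustration under simplifying assumptions: it uses unpenalized least squares and a classical t-type statistic rather than the penalized spline estimators and robust statistics this article studies, and the names `ols_coef`, `stat_for_x`, and `perm_test` are hypothetical helpers, not functions from the npreg or nptest packages.

```python
import numpy as np


def ols_coef(design, y):
    """Least-squares coefficients for y ~ design."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def stat_for_x(x, Z, y):
    """Absolute classical t-statistic for x, adjusting for Z (Z holds the intercept)."""
    design = np.column_stack([x, Z])
    coef = ols_coef(design, y)
    resid = y - design @ coef
    dof = design.shape[0] - design.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(design.T @ design)
    return abs(coef[0]) / np.sqrt(cov[0, 0])


def perm_test(x, Z, y, method="FL", n_perm=999, seed=0):
    """Monte Carlo permutation p-value for the effect of x given Z.

    method="DS": permute the rows of x, leaving Z and y fixed.
    method="FL": permute residuals of the reduced model y ~ Z and
                 add them back to the reduced-model fitted values.
    """
    if method not in ("DS", "FL"):
        raise ValueError("method must be 'DS' or 'FL'")
    rng = np.random.default_rng(seed)
    observed = stat_for_x(x, Z, y)
    fitted = Z @ ols_coef(Z, y)  # reduced-model fit (used by FL only)
    resid = y - fitted
    exceed = 1  # the observed data count as one permutation
    for _ in range(n_perm):
        idx = rng.permutation(len(y))
        if method == "DS":
            stat = stat_for_x(x[idx], Z, y)
        else:
            stat = stat_for_x(x, Z, fitted + resid[idx])
        exceed += int(stat >= observed)
    return exceed / (n_perm + 1)


# Simulated data (assumed purely for illustration): x is correlated with z
# and has a nonzero effect on y.
rng = np.random.default_rng(1)
n = 100
z = rng.normal(size=n)
x = 0.5 * z + rng.normal(size=n)
Z = np.column_stack([np.ones(n), z])
y = 1.0 + 2.0 * z + 1.0 * x + rng.normal(size=n)

p_ds = perm_test(x, Z, y, method="DS")
p_fl = perm_test(x, Z, y, method="FL")
```

Both calls return a p-value in (0, 1]; counting the observed statistic among the permutations keeps the Monte Carlo p-value strictly positive. The other schemes in the table differ in which quantity is permuted and which model is refit, and are not sketched here.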
© 2022 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Helwig, N.E. Robust Permutation Tests for Penalized Splines. Stats 2022, 5, 916-933. https://doi.org/10.3390/stats5030053