Multicollinearity and Linear Predictor Link Function Problems in Regression Modelling of Longitudinal Data
Abstract
1. Introduction
2. Model and Estimation Procedure
2.1. GPLMs for Longitudinal Data
2.2. Ridge Generalized Estimating Equation (RGEE)
Algorithm 1: Monte Carlo Newton–Raphson (MCNR) algorithm
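As a rough illustration of the idea behind a ridge-type estimating equation, the sketch below solves a ridge-penalised GEE in the simplest setting: an identity link, constant variance, and a fixed exchangeable working correlation. This is a generic textbook form and not the MCNR algorithm or the GPLM of this paper; the function names, the tuning parameter `lam`, the correlation `rho`, and the toy data are all assumptions made for the example.

```python
import numpy as np


def exchangeable_corr(m, rho):
    """m x m working correlation: 1 on the diagonal, rho elsewhere."""
    return (1.0 - rho) * np.eye(m) + rho * np.ones((m, m))


def ridge_gee_identity(X_list, y_list, lam, rho=0.0):
    """Solve sum_i X_i' R_i^{-1} (y_i - X_i beta) - lam * beta = 0 for beta.

    X_list, y_list: per-subject design matrices and response vectors.
    lam: ridge tuning parameter (hypothetical name); lam = 0 gives the
         unpenalised estimating equation.
    rho: exchangeable working correlation, assumed known and fixed here.
    """
    p = X_list[0].shape[1]
    A = lam * np.eye(p)          # penalty contribution
    b = np.zeros(p)
    for X, y in zip(X_list, y_list):
        R_inv = np.linalg.inv(exchangeable_corr(len(y), rho))
        A += X.T @ R_inv @ X
        b += X.T @ R_inv @ y
    return np.linalg.solve(A, b)


# Toy data (assumed): 30 subjects, 4 visits each, two nearly collinear covariates.
rng = np.random.default_rng(0)
X_list, y_list = [], []
for _ in range(30):
    x1 = rng.normal(size=4)
    x2 = x1 + 0.01 * rng.normal(size=4)  # almost a copy of x1
    X = np.column_stack([x1, x2])
    y = X @ np.array([1.0, 1.0]) + rng.normal(size=4)
    X_list.append(X)
    y_list.append(y)

beta_gee = ridge_gee_identity(X_list, y_list, lam=0.0, rho=0.3)
beta_rgee = ridge_gee_identity(X_list, y_list, lam=1.0, rho=0.3)
print("unpenalised:", beta_gee)
print("ridge:      ", beta_rgee)
```

With an identity link and a fixed working correlation the penalised equation is linear in the coefficients, so a single linear solve suffices; a non-identity link would need an iterative Newton-type update instead. In this linear case the ridge solution never has a larger Euclidean norm than the unpenalised one.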
2.3. Asymptotics
3. Numerical Analyses
3.1. Simulations
3.2. AIDS Data Analysis
4. Concluding Remarks and Discussion
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A
- (A.1) Number of observations over time () is a bounded sequence of positive integers, and the distinct values of form a quasi-uniform sequence that grows dense on , and the kth derivative of is bounded for some ;
- (A.2) The covariates , , are uniformly bounded;
- (A.3) The unknown parameter belongs to a compact subset , the true parameter value lies in the interior of ;
- (A.4) There exist two positive constants, and , such that
Methods | Parameters | | | | |
---|---|---|---|---|---|
RGEE-C | | 0.028(0.032) | 0.033(0.038) | 0.046(0.052) | 0.141(0.142) |
 | | 0.024(0.027) | 0.029(0.032) | 0.039(0.044) | 0.122(0.126) |
 | | 0.056(0.027) | 0.067(0.032) | 0.092(0.044) | 0.284(0.127) |
 | | 0.055(0.029) | 0.066(0.035) | 0.090(0.048) | 0.279(0.137) |
 | | 0.080(0.035) | 0.097(0.045) | 0.137(0.068) | 0.446(0.209) |
 | MSE | 0.243 | 0.291 | 0.404 | 1.271 |
RGEE-I | | 0.041(0.033) | 0.049(0.039) | 0.068(0.053) | 0.209(0.150) |
 | | 0.057(0.029) | 0.068(0.034) | 0.093(0.047) | 0.289(0.137) |
 | | 0.055(0.029) | 0.065(0.034) | 0.090(0.047) | 0.278(0.137) |
 | | 0.061(0.031) | 0.073(0.037) | 0.101(0.051) | 0.311(0.146) |
 | | 0.070(0.037) | 0.076(0.048) | 0.092(0.073) | 0.249(0.236) |
 | MSE | 0.284 | 0.331 | 0.444 | 1.335 |
GEE-C | | 0.027(0.055) | 0.033(0.066) | 0.045(0.090) | 0.129(0.279) |
 | | 0.024(0.044) | 0.028(0.052) | 0.039(0.072) | 0.113(0.223) |
 | | 0.058(0.051) | 0.069(0.061) | 0.096(0.084) | 0.308(0.259) |
 | | 0.055(0.050) | 0.065(0.060) | 0.089(0.082) | 0.244(0.254) |
 | | 0.082(0.053) | 0.100(0.068) | 0.143(0.103) | 0.449(0.391) |
 | MSE | 0.246 | 0.295 | 0.411 | 1.243 |
GEE-I | | 0.040(0.057) | 0.048(0.068) | 0.065(0.094) | 0.184(0.289) |
 | | 0.055(0.048) | 0.065(0.057) | 0.088(0.078) | 0.237(0.242) |
 | | 0.057(0.052) | 0.069(0.062) | 0.096(0.085) | 0.320(0.262) |
 | | 0.061(0.056) | 0.072(0.067) | 0.099(0.092) | 0.271(0.284) |
 | | 0.072(0.055) | 0.079(0.071) | 0.099(0.110) | 0.262(0.423) |
 | MSE | 0.286 | 0.333 | 0.447 | 1.275 |

Coefficients | RGEE | GEE | Coefficients | RGEE | GEE |
---|---|---|---|---|---|
AGE | 3.987 (0.006) | 4.298 (0.009) | AGE*CESD | −0.268 (0.008) | −0.262 (0.001) |
SMOKE | 32.780 (0.053) | 32.916 (0.062) | SMOKE*DRUG | −16.204 (0.046) | −16.221 (0.055) |
DRUG | 17.949 (0.066) | 18.254 (0.075) | SMOKE*SEXP | 4.051 (0.002) | 4.057 (0.005) |
SEXP | 2.801 (0.009) | 2.797 (0.013) | SMOKE*CESD | −0.268 (0.003) | −0.251 (0.002) |
CESD | −3.077 (0.002) | −3.077 (0.005) | DRUG*SEXP | −1.205 (0.005) | −1.292 (0.013) |
AGE*SMOKE | 0.039 (0.002) | −0.007 (0.003) | DRUG*CESD | 0.274 (0.003) | 0.273 (0.005) |
AGE*DRUG | −1.006 (0.003) | −1.017 (0.009) | SEXP*CESD | 0.033 (0.008) | 0.026 (0.001) |
AGE*SEXP | −0.565 (0.003) | −0.596 (0.001) | | | |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Taavoni, M.; Arashi, M.; Manda, S. Multicollinearity and Linear Predictor Link Function Problems in Regression Modelling of Longitudinal Data. Mathematics 2023, 11, 530. https://doi.org/10.3390/math11030530