Consistency of Restricted Maximum Likelihood Estimators in High-Dimensional Kernel Linear Mixed-Effects Models with Applications in Estimating Genetic Heritability
Abstract
1. Introduction
2. Materials and Methods
2.1. The REML Equations for Variance Components
2.2. A Review on Random Matrix Theory
1. ; that is, and remain bounded as .
2. is a positive definite matrix and remains bounded in p; that is, there exists such that for all p.
3. has a finite limit; that is, there exists such that .
4. .
5. The entries of , a p-dimensional random vector, are i.i.d. Also, as denoted by , the kth entry of , we assume that , , and for some .
6. f is a function in the neighborhood of and a function in the neighborhood of 0.
- (5’) The entries of , a p-dimensional random vector, are i.i.d. Also, as denoted by the kth entry of , we assume that , , and for some .
- (6’) f is in a neighborhood of τ.
2.3. Consistency of REML Estimators in Kernel LMMs
2.3.1. Weighted Product Kernel
1. are i.i.d. with , , and ;
2. with for some .
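To make the setting of these conditions concrete, the sketch below simulates a kernel LMM and estimates its variance components by REML. It is an illustration under stated assumptions and not the authors' implementation: the kernel is assumed to take the form K = X W Xᵀ / p with a bounded diagonal weight matrix W, the weights are drawn uniformly on [0.5, 1.5], and the names `reml_heritability`, `sigma_g2`, and `sigma_e2` are hypothetical. REML is computed through error contrasts with a grid search over the variance ratio, a simple stand-in for solving the REML equations directly.

```python
import numpy as np

def reml_heritability(y, K, Z, grid=None):
    """REML for y ~ N(Z b, sg2 * K + se2 * I) via a grid search over
    h2 = sg2 / (sg2 + se2). Returns (sg2_hat, se2_hat, h2_hat)."""
    if grid is None:
        grid = np.linspace(0.001, 0.999, 999)
    q = Z.shape[1]
    # Error contrasts: the columns of A are orthonormal and orthogonal to
    # the columns of Z (assumed full column rank), so A.T @ y is free of b.
    Q, _ = np.linalg.qr(Z, mode="complete")
    A = Q[:, q:]
    d, U = np.linalg.eigh(A.T @ K @ A)
    d = np.clip(d, 0.0, None)            # remove tiny negative round-off
    z2 = (U.T @ (A.T @ y)) ** 2          # z_i ~ N(0, sg2 * d_i + se2)
    best_loglik, best_h2, best_total = -np.inf, None, None
    for h2 in grid:
        v = h2 * d + (1.0 - h2)
        total = np.mean(z2 / v)          # profiled estimate of sg2 + se2
        loglik = -0.5 * (z2.size * np.log(total) + np.sum(np.log(v)))
        if loglik > best_loglik:
            best_loglik, best_h2, best_total = loglik, h2, total
    return best_h2 * best_total, (1.0 - best_h2) * best_total, best_h2

rng = np.random.default_rng(1)
n, p = 600, 300
X = rng.standard_normal((n, p))          # i.i.d. entries, mean 0, variance 1
w = rng.uniform(0.5, 1.5, size=p)        # hypothetical bounded weights
K = (X * w) @ X.T / p                    # assumed weighted product kernel
Z = np.ones((n, 1))                      # intercept-only fixed effects

sigma_g2, sigma_e2 = 1.0, 1.0            # true variance components
u = (X * np.sqrt(w)) @ rng.standard_normal(p) * np.sqrt(sigma_g2 / p)
y = 2.0 + u + rng.normal(scale=np.sqrt(sigma_e2), size=n)

sg2_hat, se2_hat, h2_hat = reml_heritability(y, K, Z)
print(f"sigma_g^2: {sg2_hat:.3f}, sigma_e^2: {se2_hat:.3f}, h^2: {h2_hat:.3f}")
```

The random effect is generated so that its covariance is exactly `sigma_g2 * K`, so the simulated truth corresponds to a variance ratio of 0.5. Profiling out the total variance reduces the search to one dimension, which keeps the sketch short at the cost of grid resolution.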
2.3.2. Inner-Product Kernel Matrices
1. .
2. are i.i.d. with , , and for some .
3. f is a function in the neighborhood of 1 and a function in the neighborhood of 0.
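The smoothness condition on f is stated near 1 and near 0. One way to see why is a small numerical check, sketched below under the assumption that the kernel matrix has entries f(x_iᵀx_j / p) and that the entries of X are i.i.d. with mean 0 and variance 1. The helper name `inner_product_kernel` and the choice f = exp are hypothetical and serve only as an illustration.

```python
import numpy as np

def inner_product_kernel(X, f):
    """Apply f entrywise to the scaled Gram matrix X X^T / p (assumed form)."""
    p = X.shape[1]
    return f(X @ X.T / p)

rng = np.random.default_rng(0)
n, p = 50, 2000
X = rng.standard_normal((n, p))   # i.i.d. entries with mean 0 and variance 1

gram = X @ X.T / p
off_diagonal = ~np.eye(n, dtype=bool)
diag_mean = float(np.mean(np.diag(gram)))            # concentrates near 1
off_diag_mean = float(np.mean(gram[off_diagonal]))   # concentrates near 0

K = inner_product_kernel(X, np.exp)   # f = exp is a hypothetical choice

print(f"mean diagonal of X X^T / p:     {diag_mean:.3f}")
print(f"mean off-diagonal of X X^T / p: {off_diag_mean:.3f}")
```

Because the diagonal of the scaled Gram matrix concentrates around 1 and the off-diagonal entries concentrate around 0, only the local behavior of f at those two points affects the kernel matrix when p is large.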
2.3.3. Euclidean Distance Kernel Matrices
1. .
2. are i.i.d. with , and for some .
3. f is a function in a neighborhood of 2.
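The condition on f near 2 can be illustrated the same way. The sketch below assumes the kernel matrix has entries f(‖x_i − x_j‖² / p) with i.i.d. entries of mean 0 and variance 1, in which case the scaled squared distances between distinct observations concentrate around 2. The helper name `scaled_squared_distances` and the Gaussian-type choice of f are hypothetical.

```python
import numpy as np

def scaled_squared_distances(X):
    """Return the n x n matrix with entries ||x_i - x_j||^2 / p."""
    p = X.shape[1]
    sq_norms = np.sum(X**2, axis=1)
    sq_dist = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(sq_dist, 0.0)          # exact zeros on the diagonal
    return np.maximum(sq_dist, 0.0) / p     # clip tiny negative round-off

rng = np.random.default_rng(0)
n, p = 50, 2000
X = rng.standard_normal((n, p))   # i.i.d. entries with mean 0 and variance 1

D = scaled_squared_distances(X)
off_diag_mean = float(np.mean(D[~np.eye(n, dtype=bool)]))  # near 2

def f(t):
    """Hypothetical Gaussian-type kernel function."""
    return np.exp(-t / 2.0)

K = f(D)   # Euclidean distance kernel matrix under the assumed form

print(f"mean off-diagonal of ||x_i - x_j||^2 / p: {off_diag_mean:.3f}")
```

Each coordinate difference has variance 2 under these assumptions, which is why the average lands near 2 and why only the behavior of f around that point matters for the off-diagonal entries.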
3. Results
3.1. Simulation Studies
3.1.1. Weighted Product Kernel
3.1.2. Inner Product Kernel
3.1.3. Euclidean Distance Kernel
3.2. Real Data Analysis
4. Discussion
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A. Mathematical Proofs of the Main Results
Appendix A.1. Proof of Lemma 1
Appendix A.2. Proof of Theorem 5
Appendix A.3. Proof of Corollary 2
Appendix A.4. Proof of Theorem 6
Appendix A.5. Proof of Theorem 7
Sample Size | ||
---|---|---|
100 | 0.837 (0.709) | 1.895 (0.415) |
200 | 0.698 (0.451) | 1.941 (0.302) |
400 | 0.668 (0.344) | 1.959 (0.237) |
600 | 0.634 (0.286) | 1.977 (0.204) |
800 | 0.631 (0.247) | 1.979 (0.182) |
1000 | 0.631 (0.231) | 1.985 (0.162) |
Sample Size | ||
---|---|---|
100 | 0.606 (0.375) | 2.004 (1.021) |
200 | 0.618 (0.284) | 1.974 (0.790) |
400 | 0.603 (0.206) | 1.985 (0.561) |
600 | 0.604 (0.172) | 1.992 (0.469) |
800 | 0.603 (0.152) | 1.995 (0.416) |
1000 | 0.601 (0.130) | 1.996 (0.364) |
Sample Size | ||
---|---|---|
100 | 1.000 (0.955) | 1.774 (0.631) |
200 | 0.818 (0.650) | 1.853 (0.432) |
400 | 0.755 (0.483) | 1.900 (0.320) |
600 | 0.715 (0.406) | 1.934 (0.262) |
800 | 0.717 (0.363) | 1.929 (0.229) |
1000 | 0.707 (0.335) | 1.938 (0.215) |
Trait | Product | Polynomial | Gaussian |
---|---|---|---|
BMI | 20.35% | 50.08% | 34.29% |
Height | 10.21% | 49.93% | 17.97% |
Weight | 8.46% | 37.02% | 16.31% |
 | BMI | Height | Weight |
---|---|---|---|
0 | 50.08% | 49.93% | 37.02% |
0.2 | 64.93% | 16.23% | 16.39% |
0.4 | 31.26% | 10.54% | 8.82% |
0.6 | 21.01% | 7.95% | 6.05% |
0.8 | 15.92% | 6.41% | 4.67% |
1 | 12.85% | 5.39% | 3.87% |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Shen, X.; Lu, Q. Consistency of Restricted Maximum Likelihood Estimators in High-Dimensional Kernel Linear Mixed-Effects Models with Applications in Estimating Genetic Heritability. Mathematics 2025, 13, 2363. https://doi.org/10.3390/math13152363