From Bilinear Regression to Inductive Matrix Completion: A Quasi-Bayesian Analysis
Abstract
1. Introduction
2. Bilinear Regression
2.1. Model
2.2. Prior Specification
2.3. Theoretical Results
3. Inductive Matrix Completion
3.1. Model and Method
3.2. Theoretical Results
4. Numerical Studies
4.1. Langevin Monte Carlo Implementation
4.2. Simulation Studies for Bilinear Regression
- Model I: The true coefficient matrix is a rank-2 matrix generated as $B^* = B_1 B_2^\top$, where $B_1 \in \mathbb{R}^{d_1 \times 2}$, $B_2 \in \mathbb{R}^{d_2 \times 2}$, and all entries in $B_1$ and $B_2$ are independently and identically sampled from $\mathcal{N}(0, 1)$.
- Model II: An approximately low-rank setup is studied. This series of simulations is similar to that of Model I, except that the true coefficient matrix is no longer exactly rank 2 but can be well approximated by a rank-2 matrix: $B^* = B_1 B_2^\top + E$, where $E$ is a small full-rank perturbation. A minimal R sketch of both designs is given after this list.
4.3. Simulation Studies for Inductive Matrix Completion
5. Discussion and Conclusions
Funding
Institutional Review Board Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A. Proofs
Appendix A.1. Preliminary Lemmas
Appendix A.2. Proof of Theorem 1
Appendix A.3. Proof of Theorem 2
Appendix A.4. Proof of Theorem 3
Appendix A.5. Proof of Theorem 4
Appendix B. Comments on Algorithm Implementation
Algorithm A1: LMC
Algorithm A2: MALA
| | Errors | LMC | MALA | OLS |
|---|---|---|---|---|
| n = 100 | Est | 1.0053 (0.5480) | 1.0342 (0.5559) | 1.0052 (0.5478) |
| | Pred | 0.1138 (0.0171) | 0.0985 (0.0151) | 0.1014 (0.0154) |
| | Nmse | 0.4931 (0.1178) | 0.5100 (0.1207) | 0.4930 (0.1178) |
| | Est | 1.3544 (0.5867) | 1.3384 (0.5836) | 1.3544 (0.5867) |
| | Pred | 1.0066 (0.0430) | 0.8761 (0.0756) | 1.0030 (0.0424) |
| | Nmse | 0.7049 (0.2944) | 0.6963 (0.2927) | 0.7049 (0.2944) |
| n = 1000 | Est | 1.0776 (0.5671) | 1.0900 (0.5670) | 1.0776 (0.5671) |
| | Pred | 0.0099 (0.0013) | 0.0099 (0.0013) | 0.0099 (0.0013) |
| | Nmse | 0.5185 (0.1198) | 0.5264 (0.1219) | 0.5185 (0.1198) |
| | Est | 0.9662 (0.3240) | 0.9688 (0.3244) | 0.9662 (0.3240) |
| | Pred | 0.0999 (0.0051) | 0.0989 (0.0049) | 0.0998 (0.0051) |
| | Nmse | 0.4961 (0.1183) | 0.4976 (0.1191) | 0.4961 (0.1183) |
| | Errors | LMC | MALA | OLS |
|---|---|---|---|---|
| n = 100 | Est | 4.0731 (1.828) | 4.0989 (1.821) | 4.0731 (1.828) |
| | Pred | 0.1090 (0.0160) | 0.0969 (0.0140) | 0.0987 (0.0145) |
| | Nmse | 0.5119 (0.1226) | 0.5162 (0.1241) | 0.5118 (0.1226) |
| | Est | 4.6047 (1.812) | 4.6038 (1.813) | 4.6047 (1.812) |
| | Pred | 1.0062 (0.0462) | 1.0597 (0.0495) | 1.0006 (0.0469) |
| | Nmse | 0.5801 (0.1942) | 0.5800 (0.1941) | 0.5801 (0.1942) |
| n = 1000 | Est | 3.6733 (1.606) | 3.6884 (1.606) | 3.6733 (1.606) |
| | Pred | 0.0098 (0.0015) | 0.0098 (0.0015) | 0.0098 (0.0015) |
| | Nmse | 0.4812 (0.1271) | 0.4835 (0.1260) | 0.4813 (0.1271) |
| | Est | 3.9972 (1.375) | 3.9986 (1.376) | 3.9972 (1.375) |
| | Pred | 0.1000 (0.0043) | 0.1032 (0.0057) | 0.0999 (0.0043) |
| | Nmse | 0.5013 (0.1061) | 0.5014 (0.1063) | 0.5013 (0.1062) |
| | Errors | LMC | MALA | OLS_imp |
|---|---|---|---|---|
| n = 100, 10% missing | Est | 1.0559 (0.5060) | 1.0803 (0.5122) | 1.0559 (0.5060) |
| | Pred | 0.1028 (0.0193) | 0.1082 (0.0143) | 0.1020 (0.0197) |
| | Nmse | 0.4986 (0.1116) | 0.5139 (0.1197) | 0.4986 (0.1116) |
| | Est | 1.4008 (0.8555) | 1.3987 (0.8542) | 1.4009 (0.8555) |
| | Pred | 1.2250 (0.4568) | 1.4468 (0.4137) | 1.2252 (0.4570) |
| | Nmse | 0.7148 (0.3591) | 0.7136 (0.3581) | 0.7148 (0.3591) |
| n = 100, 30% missing | Est | 1.0432 (0.4963) | 1.0917 (0.5085) | 1.0432 (0.4963) |
| | Pred | 0.2402 (0.2705) | 0.1447 (0.0204) | 0.2446 (0.2780) |
| | Nmse | 0.5242 (0.1257) | 0.5538 (0.1335) | 0.5242 (0.1257) |
| | Est | 1.6242 (0.8179) | 1.6224 (0.8169) | 1.6242 (0.8179) |
| | Pred | 9.8879 (14.11) | 10.807 (13.84) | 9.8901 (14.11) |
| | Nmse | 0.7993 (0.3340) | 0.7985 (0.3334) | 0.7993 (0.3340) |
| n = 1000, 10% missing | Est | 0.9810 (0.4532) | 0.9882 (0.4478) | 0.9810 (0.4532) |
| | Pred | 0.0114 (0.0033) | 0.0112 (0.0015) | 0.0114 (0.0033) |
| | Nmse | 0.4933 (0.1076) | 0.4984 (0.1075) | 0.4933 (0.1076) |
| | Est | 1.0063 (0.3465) | 1.0088 (0.3471) | 1.0063 (0.3465) |
| | Pred | 0.1902 (0.1758) | 0.1116 (0.0049) | 0.1902 (0.1759) |
| | Nmse | 0.5069 (0.1049) | 0.5082 (0.1050) | 0.5069 (0.1049) |
| n = 1000, 30% missing | Est | 1.0110 (0.4886) | 1.0223 (0.4872) | 1.0110 (0.4886) |
| | Pred | 0.0539 (0.0599) | 0.0141 (0.0019) | 0.0540 (0.0599) |
| | Nmse | 0.5129 (0.1030) | 0.5206 (0.1043) | 0.5129 (0.1030) |
| | Est | 1.0291 (0.3567) | 1.0312 (0.3555) | 1.0291 (0.3567) |
| | Pred | 1.7529 (1.914) | 0.1475 (0.0078) | 1.7530 (1.913) |
| | Nmse | 0.5054 (0.1055) | 0.5067 (0.1053) | 0.5054 (0.1055) |
| | Errors | LMC | MALA | OLS_imp |
|---|---|---|---|---|
| n = 100, 10% missing | Est | 3.8319 (1.691) | 3.8749 (1.719) | 3.8319 (1.690) |
| | Pred | 0.1604 (0.1271) | 0.1092 (0.0153) | 0.1598 (0.1322) |
| | Nmse | 0.5116 (0.1154) | 0.5169 (0.1147) | 0.5116 (0.1155) |
| | Est | 5.9500 (2.834) | 5.9452 (2.835) | 5.9500 (2.834) |
| | Pred | 4.7640 (5.272) | 4.6964 (5.515) | 4.7658 (5.275) |
| | Nmse | 0.7313 (0.3454) | 0.7307 (0.3455) | 0.7313 (0.3454) |
| n = 100, 30% missing | Est | 4.1838 (1.850) | 4.2535 (1.859) | 4.1839 (1.850) |
| | Pred | 0.7221 (0.7562) | 0.1498 (0.0183) | 0.7371 (0.7741) |
| | Nmse | 0.5182 (0.1128) | 0.5283 (0.1147) | 0.5182 (0.1128) |
| | Est | 7.1589 (4.084) | 7.1558 (4.083) | 7.1589 (4.084) |
| | Pred | 39.899 (52.40) | 40.233 (51.76) | 39.908 (52.41) |
| | Nmse | 0.8998 (0.3821) | 0.8994 (0.3820) | 0.8998 (0.3821) |
| n = 1000, 10% missing | Est | 3.9618 (1.678) | 3.9788 (1.677) | 3.9618 (1.678) |
| | Pred | 0.0409 (0.0269) | 0.0110 (0.0015) | 0.0409 (0.0269) |
| | Nmse | 0.4968 (0.1196) | 0.4989 (0.1195) | 0.4968 (0.1196) |
| | Est | 4.1153 (1.295) | 4.1163 (1.294) | 4.1153 (1.295) |
| | Pred | 1.0250 (0.9988) | 0.1135 (0.0051) | 1.0250 (0.9988) |
| | Nmse | 0.5060 (0.1096) | 0.5062 (0.1096) | 0.5060 (0.1096) |
| n = 1000, 30% missing | Est | 4.1647 (1.990) | 4.1836 (1.995) | 4.1647 (1.990) |
| | Pred | 0.4615 (0.3497) | 0.0141 (0.0017) | 0.4616 (0.3498) |
| | Nmse | 0.4905 (0.1157) | 0.4933 (0.1171) | 0.4905 (0.1157) |
| | Est | 4.0578 (1.400) | 4.0565 (1.397) | 4.0578 (1.400) |
| | Pred | 8.5608 (6.419) | 0.1538 (0.0069) | 8.5609 (6.419) |
| | Nmse | 0.4944 (0.1184) | 0.4943 (0.1180) | 0.4944 (0.1184) |
Share and Cite
Mai, T.T. From Bilinear Regression to Inductive Matrix Completion: A Quasi-Bayesian Analysis. Entropy 2023, 25, 333. https://doi.org/10.3390/e25020333