No Statistical-Computational Gap in Spiked Matrix Models with Generative Network Priors
Abstract
1. Introduction
- Spiked Wishart Model, in which the observation matrix $Y \in \mathbb{R}^{N \times n}$ is given by:
  $$Y = u \, x^{*\mathsf{T}} + \sigma Z, \qquad u \sim \mathcal{N}(0, I_N), \quad Z_{ij} \overset{\text{i.i.d.}}{\sim} \mathcal{N}(0, 1); \tag{1}$$
- Spiked Wigner Model, in which the observation matrix $Y \in \mathbb{R}^{n \times n}$ is given by:
  $$Y = x^* x^{*\mathsf{T}} + \sigma H, \qquad H = H^{\mathsf{T}}, \quad H_{ij} \overset{\text{i.i.d.}}{\sim} \mathcal{N}(0, 1) \ \text{for } i \leq j, \tag{2}$$
  a sampling sketch for both models is given after this list.
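For concreteness, here is a minimal sketch of how observations from the two models can be sampled, assuming the forms (1) and (2) above; the function names and the planted spike `x_star` are illustrative, not from the paper:

```python
import numpy as np

def spiked_wishart(x_star, N, sigma, rng):
    """Sample Y = u x*^T + sigma Z as in (1); each row is N(0, x* x*^T + sigma^2 I)."""
    n = x_star.shape[0]
    u = rng.standard_normal(N)          # spike coefficients, u ~ N(0, I_N)
    Z = rng.standard_normal((N, n))     # i.i.d. standard Gaussian noise
    return np.outer(u, x_star) + sigma * Z

def spiked_wigner(x_star, sigma, rng):
    """Sample Y = x* x*^T + sigma H as in (2), with H a symmetric Gaussian matrix."""
    n = x_star.shape[0]
    A = rng.standard_normal((n, n))
    H = (A + A.T) / np.sqrt(2)          # symmetric; off-diagonal entries ~ N(0, 1)
    return np.outer(x_star, x_star) + sigma * H
```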
Our Contribution
- for the Wishart model (1) we take $\widetilde{Y} = \frac{1}{N} \sum_{i=1}^{N} y_i y_i^{\mathsf{T}} - \sigma^2 I_n$, with $y_1, \dots, y_N$ the rows of $Y$ (a preprocessing sketch follows after this list);
- for the Wigner model (2) we take $\widetilde{Y} = Y$.
- in the Spiked Wishart Model:
- in the Spiked Wigner Model:
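A sketch of the preprocessing described above, under the same assumed forms (the debiased sample covariance for the Wishart data, the identity map for the Wigner data):

```python
import numpy as np

def preprocess_wishart(Y, sigma):
    """Ytilde = (1/N) sum_i y_i y_i^T - sigma^2 I, with y_i the rows of Y."""
    N, n = Y.shape
    return (Y.T @ Y) / N - sigma ** 2 * np.eye(n)

def preprocess_wigner(Y):
    """For the Wigner model the observed matrix is used as is."""
    return Y
```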
Algorithm 1: Subgradient method for the minimization problem (4)
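A minimal sketch of Algorithm 1, assuming a fixed step size and access to a subgradient oracle for the loss in (4); the loss shown in the docstring is an assumed form. The negation step is the distinctive ingredient (see Section 4.1):

```python
import numpy as np

def subgradient_method(f, subgrad, z0, step=1e-2, iters=10_000):
    """Subgradient descent with a negation step.

    f       : the loss z -> f(z), e.g. z -> 0.25 * ||G(z) G(z)^T - Ytilde||_F^2
    subgrad : returns one element of the Clarke subdifferential of f at z
    """
    z = np.asarray(z0, dtype=float).copy()
    for _ in range(iters):
        # Negation step: move to -z whenever it has strictly lower loss,
        # so the iterates escape the spurious basin (cf. Proposition A3).
        if f(-z) < f(z):
            z = -z
        z = z - step * subgrad(z)   # standard subgradient update
    return z
```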
2. Related Work
2.1. Sparse PCA and Other Computational-Statistical Gaps
2.2. Inverse Problems with Generative Network Priors
3. Algorithm and Main Result
- for the Spiked Wishart Model (1) take $\widetilde{Y} = \frac{1}{N} \sum_{i=1}^{N} y_i y_i^{\mathsf{T}} - \sigma^2 I_n$, with $y_1, \dots, y_N$ the rows of $Y$;
- for the Spiked Wigner Model (2) take $\widetilde{Y} = Y$.
Numerical Experiments
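As a hedged stand-in for the experiments (not the paper's setup: the network, dimensions, noise level, step size, and iteration count below are all illustrative and untuned), the following self-contained synthetic sanity check plants a code $z^*$ in a random two-layer expansive ReLU network and runs the subgradient method with negation on a spiked Wigner instance:

```python
import numpy as np

rng = np.random.default_rng(0)

def G(z, weights):
    """Expansive ReLU network G(z) = relu(W_d ... relu(W_1 z))."""
    for W in weights:
        z = np.maximum(W @ z, 0.0)
    return z

def subgrad_f(z, weights, Ytilde):
    """One Clarke subgradient of f(z) = 0.25 * ||G(z) G(z)^T - Ytilde||_F^2."""
    acts, h = [], z
    for W in weights:                    # forward pass, caching activations
        h = np.maximum(W @ h, 0.0)
        acts.append(h)
    g = (np.outer(h, h) - Ytilde) @ h    # gradient in x = G(z)
    for W, a in zip(reversed(weights), reversed(acts)):
        g = W.T @ (g * (a > 0))          # backprop through a ReLU subgradient
    return g

# Random expansive network and planted latent code z*.
k, n1, n = 5, 50, 200
weights = [rng.standard_normal((n1, k)) / np.sqrt(n1),
           rng.standard_normal((n, n1)) / np.sqrt(n)]
z_star = rng.standard_normal(k)
x_star = G(z_star, weights)

# Spiked Wigner observation, used directly as Ytilde.
A = rng.standard_normal((n, n))
Ytilde = np.outer(x_star, x_star) + 0.1 * (A + A.T) / np.sqrt(2)

# Subgradient descent with the negation step of Algorithm 1.
f = lambda z: 0.25 * np.linalg.norm(np.outer(G(z, weights), G(z, weights)) - Ytilde) ** 2
z = rng.standard_normal(k)
for _ in range(5000):
    if f(-z) < f(z):
        z = -z
    z -= 0.01 * subgrad_f(z, weights, Ytilde)

print("relative error:", np.linalg.norm(G(z, weights) - x_star) / np.linalg.norm(x_star))
```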
4. Recovery Under Deterministic Conditions
- (A)
- there exists an integer such that
- (B)
- for any :
4.1. Technical Tools and Outline of the Proofs
- In Proposition A1 (Appendix A.3) we show that the iterates of Algorithm 1 stay inside a Euclidean ball of fixed radius and remain nonzero at every iteration.
- We then identify two small Euclidean balls $\mathcal{B}(z^*, R)$ and $\mathcal{B}(-\rho_d z^*, R)$ around, respectively, $z^*$ and $-\rho_d z^*$, where $\rho_d > 0$ depends only on the depth $d$ of the network. In Proposition A2 we show that after a polynomial number of steps, the iterates of Algorithm 1 enter the region $\mathcal{B}(z^*, R) \cup \mathcal{B}(-\rho_d z^*, R)$ (Appendix A.4).
- We show, in Proposition A3, that the negation step causes the iterates of the algorithm to avoid the spurious point $-\rho_d z^*$ and actually enter $\mathcal{B}(z^*, R)$ within a polynomial number of steps (Appendix A.5).
- We finally show, in Proposition A4, that in $\mathcal{B}(z^*, R)$ the loss function $f$ enjoys a favorable convexity-like property, which implies that the iterates remain in $\mathcal{B}(z^*, R)$ and eventually converge to $z^*$ up to the noise level (Appendix A.6); a schematic form of this property is sketched after this list.
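Schematically, and with illustrative constants rather than the precise ones of Proposition A4, a convexity-like property of this kind asserts that every subgradient is positively correlated with the direction toward $z^*$:

```latex
% For all z in B(z^*, R) and every v_z in the Clarke subdifferential of f at z
% (c_1, c_2 > 0 illustrative; E denotes the effective noise matrix):
\langle v_z,\; z - z^* \rangle \;\ge\; c_1 \, \| z - z^* \|_2^2 \;-\; c_2 \, \| E \| \, \| z - z^* \|_2 .
```

A standard descent argument then shows that the iterates contract toward $z^*$ until $\|z - z^*\|_2 \lesssim \|E\|$, i.e., convergence up to the noise level.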
Author Contributions
Funding
Conflicts of Interest
Appendix A. Supporting Lemmas and Proof of Theorem 3
Appendix A.1. Notation
Appendix A.2. Preliminaries
Appendix A.3. Iterates Stay Bounded
Appendix A.4. Convergence to $\mathcal{B}(z^*, R) \cup \mathcal{B}(-\rho_d z^*, R)$
Appendix A.5. Convergence to a Neighborhood Around $z^*$
Appendix A.6. Convergence to $z^*$ up to Noise
Appendix A.7. Proof of Theorem 3
- (I)
- By assumption the hypotheses of Proposition A1 are satisfied, so that for any $t \geq 0$ the iterate $z_t$ stays inside the bounded ball and remains nonzero.
- (II)
- By Proposition A3, there exists an integer $T$ such that $z_T \in \mathcal{B}(z^*, R)$, and therefore $z_T$ satisfies the conclusions of Theorem 3.A.
- (III)
- Once in $\mathcal{B}(z^*, R)$, the iterates of Algorithm 1 converge to $z^*$ up to the noise level, as shown by Proposition A4 and Equation (A14).
- (IV)
- The reconstruction error bound (12) in Theorem 3.B then follows from (11) by applying Lemma A10 and the lower bound (A4).
Appendix B. Supplementary Proofs
Appendix B.1. Supplementary Proofs for Appendix A.2
Appendix B.2. Supplementary Proofs for Appendix A.3
Appendix B.3. Supplementary Proofs for Appendix A.4
Appendix B.4. Supplementary Proofs for Appendix A.5
- (1)
- : Then we have or as .
- (2)
- : Then (A24), (A35) and yield . Using (A30), we then get .
- (3)
- and : Then (A27) gives which used with (A24) leads to:
Appendix B.5. Supplementary Proofs for Appendix A.6
Appendix C. Proof of Theorem 1
- in the Spiked Wishart Model where and the ;
- in the Spiked Wigner Model where .
Appendix C.1. Spiked Wigner Model
Appendix C.2. Spiked Wishart Model
References
- Johnstone, I.M. On the distribution of the largest eigenvalue in principal components analysis. Ann. Stat. 2001, 29, 295–327.
- Amini, A.A.; Wainwright, M.J. High-dimensional analysis of semidefinite relaxations for sparse principal components. In Proceedings of the 2008 IEEE International Symposium on Information Theory, Toronto, ON, Canada, 6–11 July 2008; pp. 2454–2458.
- Deshpande, Y.; Montanari, A. Sparse PCA via covariance thresholding. In Proceedings of the Advances in Neural Information Processing Systems, Montréal, QC, Canada, 8–13 December 2014; pp. 334–342.
- Vu, V.; Lei, J. Minimax rates of estimation for sparse PCA in high dimensions. In Proceedings of the 15th International Conference on Artificial Intelligence and Statistics, La Palma, Canary Islands, Spain, 21–23 April 2012; pp. 1278–1286.
- Abbe, E.; Bandeira, A.S.; Bracher, A.; Singer, A. Decoding binary node labels from censored edge measurements: Phase transition and efficient recovery. IEEE Trans. Netw. Sci. Eng. 2014, 1, 10–22.
- Bandeira, A.S.; Chen, Y.; Lederman, R.R.; Singer, A. Non-unique games over compact groups and orientation estimation in cryo-EM. Inverse Probl. 2020, 36, 064002.
- Javanmard, A.; Montanari, A.; Ricci-Tersenghi, F. Phase transitions in semidefinite relaxations. Proc. Natl. Acad. Sci. USA 2016, 113, E2218–E2223.
- McSherry, F. Spectral partitioning of random graphs. In Proceedings of the 42nd IEEE Symposium on Foundations of Computer Science, Newport Beach, CA, USA, 8–11 October 2001; pp. 529–537.
- Deshpande, Y.; Abbe, E.; Montanari, A. Asymptotic mutual information for the binary stochastic block model. In Proceedings of the 2016 IEEE International Symposium on Information Theory (ISIT), Barcelona, Spain, 10–15 July 2016; pp. 185–189.
- Moore, C. The computer science and physics of community detection: Landscapes, phase transitions, and hardness. arXiv 2017, arXiv:1702.00467.
- D’Aspremont, A.; Ghaoui, L.; Jordan, M.; Lanckriet, G. A direct formulation for sparse PCA using semidefinite programming. Adv. Neural Inf. Process. Syst. 2004, 17, 41–48.
- Berthet, Q.; Rigollet, P. Optimal detection of sparse principal components in high dimension. Ann. Stat. 2013, 41, 1780–1815.
- Bandeira, A.S.; Perry, A.; Wein, A.S. Notes on computational-to-statistical gaps: Predictions using statistical physics. arXiv 2018, arXiv:1803.11132.
- Kunisky, D.; Wein, A.S.; Bandeira, A.S. Notes on computational hardness of hypothesis testing: Predictions using the low-degree likelihood ratio. arXiv 2019, arXiv:1907.11636.
- Hand, P.; Voroninski, V. Global guarantees for enforcing deep generative priors by empirical risk. IEEE Trans. Inf. Theory 2019, 66, 401–418.
- Heckel, R.; Huang, W.; Hand, P.; Voroninski, V. Rate-optimal denoising with deep neural networks. arXiv 2018, arXiv:1805.08855.
- Hand, P.; Leong, O.; Voroninski, V. Phase retrieval under a generative prior. In Proceedings of the Advances in Neural Information Processing Systems, Montréal, QC, Canada, 3–8 December 2018; pp. 9136–9146.
- Ma, F.; Ayaz, U.; Karaman, S. Invertibility of convolutional generative networks from partial measurements. Adv. Neural Inf. Process. Syst. 2018, 31, 9628–9637.
- Hand, P.; Joshi, B. Global Guarantees for Blind Demodulation with Generative Priors. In Proceedings of the Advances in Neural Information Processing Systems, Vancouver, BC, Canada, 8–14 December 2019; pp. 11535–11543.
- Song, G.; Fan, Z.; Lafferty, J. Surfing: Iterative optimization over incrementally trained deep networks. In Proceedings of the Advances in Neural Information Processing Systems, Vancouver, BC, Canada, 8–14 December 2019; pp. 15034–15043.
- Bora, A.; Jalal, A.; Price, E.; Dimakis, A.G. Compressed sensing using generative models. In Proceedings of the 34th International Conference on Machine Learning, Sydney, Australia, 6–11 August 2017; Volume 70, pp. 537–546.
- Asim, M.; Shamshad, F.; Ahmed, A. Blind Image Deconvolution using Deep Generative Priors. arXiv 2019, arXiv:1802.04073.
- Hand, P.; Leong, O.; Voroninski, V. Compressive Phase Retrieval: Optimal Sample Complexity with Deep Generative Priors. arXiv 2020, arXiv:2008.10579.
- Hand, P.; Voroninski, V. Compressed sensing from phaseless gaussian measurements via linear programming in the natural parameter space. arXiv 2016, arXiv:1611.05985.
- Li, X.; Voroninski, V. Sparse signal recovery from quadratic measurements via convex programming. SIAM J. Math. Anal. 2013, 45, 3019–3033.
- Ohlsson, H.; Yang, A.Y.; Dong, R.; Sastry, S.S. Compressive phase retrieval from squared output measurements via semidefinite programming. arXiv 2011, arXiv:1111.6323.
- Cai, T.; Li, X.; Ma, Z. Optimal rates of convergence for noisy sparse phase retrieval via thresholded Wirtinger flow. Ann. Stat. 2016, 44, 2221–2251.
- Wang, G.; Zhang, L.; Giannakis, G.B.; Akçakaya, M.; Chen, J. Sparse phase retrieval via truncated amplitude flow. IEEE Trans. Signal Process. 2017, 66, 479–491.
- Yuan, Z.; Wang, H.; Wang, Q. Phase retrieval via sparse Wirtinger flow. J. Comput. Appl. Math. 2019, 355, 162–173.
- Aubin, B.; Loureiro, B.; Maillard, A.; Krzakala, F.; Zdeborová, L. The spiked matrix model with generative priors. In Proceedings of the Advances in Neural Information Processing Systems, Vancouver, BC, Canada, 8–14 December 2019; pp. 8366–8377.
- Cocola, J.; Hand, P.; Voroninski, V. Nonasymptotic Guarantees for Spiked Matrix Recovery with Generative Priors. In Proceedings of the Advances in Neural Information Processing Systems, Vancouver, BC, Canada, 11 December 2020; Volume 33.
- Johnstone, I.M.; Lu, A.Y. On consistency and sparsity for principal components analysis in high dimensions. J. Am. Stat. Assoc. 2009, 104, 682–693.
- Wainwright, M.J. High-Dimensional Statistics: A Non-Asymptotic Viewpoint; Cambridge University Press: Cambridge, UK, 2019; Volume 48.
- Montanari, A.; Richard, E. Non-negative principal component analysis: Message passing algorithms and sharp asymptotics. IEEE Trans. Inf. Theory 2015, 62, 1458–1484.
- Deshpande, Y.; Montanari, A.; Richard, E. Cone-constrained principal component analysis. Adv. Neural Inf. Process. Syst. 2014, 27, 2717–2725.
- Zou, H.; Hastie, T.; Tibshirani, R. Sparse principal component analysis. J. Comput. Graph. Stat. 2006, 15, 265–286.
- Krauthgamer, R.; Nadler, B.; Vilenchik, D. Do semidefinite relaxations solve sparse PCA up to the information limit? Ann. Stat. 2015, 43, 1300–1322.
- Berthet, Q.; Rigollet, P. Computational lower bounds for Sparse PCA. arXiv 2013, arXiv:1304.0828.
- Cai, T.; Ma, Z.; Wu, Y. Sparse PCA: Optimal rates and adaptive estimation. Ann. Stat. 2013, 41, 3074–3110.
- Ma, T.; Wigderson, A. Sum-of-squares lower bounds for sparse PCA. Adv. Neural Inf. Process. Syst. 2015, 28, 1612–1620.
- Lesieur, T.; Krzakala, F.; Zdeborová, L. Phase transitions in sparse PCA. In Proceedings of the 2015 IEEE International Symposium on Information Theory (ISIT), Hong Kong, China, 14–19 June 2015; pp. 1635–1639.
- Brennan, M.; Bresler, G. Optimal average-case reductions to sparse PCA: From weak assumptions to strong hardness. arXiv 2019, arXiv:1902.07380.
- Arous, G.B.; Wein, A.S.; Zadik, I. Free energy wells and overlap gap property in sparse PCA. In Proceedings of the Conference on Learning Theory, PMLR, Graz, Austria, 9–12 July 2020; pp. 479–482.
- Fan, J.; Liu, H.; Wang, Z.; Yang, Z. Curse of heterogeneity: Computational barriers in sparse mixture models and phase retrieval. arXiv 2018, arXiv:1808.06996.
- Richard, E.; Montanari, A. A statistical model for tensor PCA. Adv. Neural Inf. Process. Syst. 2014, 27, 2897–2905.
- Decelle, A.; Krzakala, F.; Moore, C.; Zdeborová, L. Asymptotic analysis of the stochastic block model for modular networks and its algorithmic applications. Phys. Rev. E 2011, 84, 066106.
- Perry, A.; Wein, A.S.; Bandeira, A.S.; Moitra, A. Message-Passing Algorithms for Synchronization Problems over Compact Groups. Commun. Pure Appl. Math. 2018, 71, 2275–2322.
- Oymak, S.; Jalali, A.; Fazel, M.; Eldar, Y.C.; Hassibi, B. Simultaneously structured models with application to sparse and low-rank matrices. IEEE Trans. Inf. Theory 2015, 61, 2886–2908.
- Dhar, M.; Grover, A.; Ermon, S. Modeling sparse deviations for compressed sensing using generative models. arXiv 2018, arXiv:1807.01442.
- Shah, V.; Hegde, C. Solving linear inverse problems using GAN priors: An algorithm with provable guarantees. In Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Calgary, AB, Canada, 15–20 April 2018; pp. 4609–4613.
- Mixon, D.G.; Villar, S. SUNLayer: Stable denoising with generative networks. arXiv 2018, arXiv:1803.09319.
- Yeh, R.A.; Chen, C.; Lim, T.Y.; Schwing, A.G.; Hasegawa-Johnson, M.; Do, M.N. Semantic image inpainting with deep generative models. arXiv 2016, arXiv:1607.07539.
- Sønderby, C.K.; Caballero, J.; Theis, L.; Shi, W.; Huszár, F. Amortised MAP inference for image super-resolution. arXiv 2016, arXiv:1610.04490.
- Yang, G.; Yu, S.; Dong, H.; Slabaugh, G.; Dragotti, P.L.; Ye, X.; Liu, F.; Arridge, S.; Keegan, J.; Guo, Y.; et al. DAGAN: Deep de-aliasing generative adversarial networks for fast compressed sensing MRI reconstruction. IEEE Trans. Med. Imaging 2017, 37, 1310–1321.
- Qiu, S.; Wei, X.; Yang, Z. Robust One-Bit Recovery via ReLU Generative Networks: Improved Statistical Rates and Global Landscape Analysis. arXiv 2019, arXiv:1908.05368.
- Xue, Y.; Xu, T.; Zhang, H.; Long, L.R.; Huang, X. SegAN: Adversarial network with multi-scale L1 loss for medical image segmentation. Neuroinformatics 2018, 16, 383–392.
- Heckel, R.; Hand, P. Deep Decoder: Concise Image Representations from Untrained Non-convolutional Networks. In Proceedings of the International Conference on Learning Representations, New Orleans, LA, USA, 6–9 May 2019.
- Heckel, R.; Soltanolkotabi, M. Denoising and regularization via exploiting the structural bias of convolutional generators. arXiv 2019, arXiv:1910.14634.
- Heckel, R.; Soltanolkotabi, M. Compressive sensing with un-trained neural networks: Gradient descent finds the smoothest approximation. arXiv 2020, arXiv:2005.03991.
- Aubin, B.; Loureiro, B.; Baker, A.; Krzakala, F.; Zdeborová, L. Exact asymptotics for phase retrieval and compressed sensing with random generative priors. In Proceedings of the First Mathematical and Scientific Machine Learning Conference, PMLR, Princeton, NJ, USA, 20–24 July 2020; pp. 55–73.
- Clason, C. Nonsmooth Analysis and Optimization. arXiv 2017, arXiv:1708.04180.
- Daskalakis, C.; Rohatgi, D.; Zampetakis, M. Constant-Expansion Suffices for Compressed Sensing with Generative Priors. arXiv 2020, arXiv:2006.04237.
- Chi, Y.; Lu, Y.M.; Chen, Y. Nonconvex optimization meets low-rank matrix factorization: An overview. IEEE Trans. Signal Process. 2019, 67, 5239–5269.
- Vershynin, R. High-Dimensional Probability: An Introduction with Applications in Data Science; Cambridge University Press: Cambridge, UK, 2018; Volume 47.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).