Sharp Guarantees and Optimal Performance for Inference in Binary and Gaussian-Mixture Models
Abstract
1. Introduction
1.1. Motivation
1.2. Data Models
- (Noisy) Signed: y = sign(xᵀw₀), with each label possibly flipped independently with some probability (the noisy variant).
- Logistic: P(y = 1 | x) = 1 / (1 + exp(−xᵀw₀)).
- Probit: P(y = 1 | x) = Φ(xᵀw₀), where Φ is the standard normal CDF.
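Since the displayed model equations are not reproduced in this outline, the sketch below samples labels from the three models under their standard forms (the exact conventions, the flip probability `eps`, and the names `w0`, `X` are assumptions for illustration), with Gaussian features and a unit-norm parameter vector:

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(0)
n, m = 20, 500                       # ambient dimension, number of samples
w0 = rng.standard_normal(n)
w0 /= np.linalg.norm(w0)             # unit-norm true parameter vector
X = rng.standard_normal((m, n))      # Gaussian feature matrix
margins = X @ w0

# (Noisy) signed: deterministic signs, each flipped with probability eps
eps = 0.1                            # illustrative flip probability
flips = rng.random(m) < eps
y_signed = np.sign(margins) * np.where(flips, -1.0, 1.0)

# Logistic: P(y = +1 | x) = 1 / (1 + exp(-x^T w0))
p_logit = 1.0 / (1.0 + np.exp(-margins))
y_logistic = np.where(rng.random(m) < p_logit, 1.0, -1.0)

# Probit: P(y = +1 | x) = Phi(x^T w0), Phi the standard normal CDF
phi = np.array([0.5 * (1 + erf(t / sqrt(2))) for t in margins])
y_probit = np.where(rng.random(m) < phi, 1.0, -1.0)
```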
1.3. Empirical Risk Minimization
- Least Squares (LS): ℓ(t) = (t − 1)².
- Least-Absolute Deviations (LAD): ℓ(t) = |t − 1|.
- Logistic Loss: ℓ(t) = log(1 + e^(−t)).
- Exponential Loss: ℓ(t) = e^(−t).
- Hinge Loss: ℓ(t) = max(1 − t, 0).
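A minimal sketch of these margin-based losses and the resulting empirical risk; the formulas are the standard choices matching the names above (an assumption here, since the outline omits the displayed equations), applied to the margin t = y·xᵀw:

```python
import numpy as np

# Margin-based losses ell(t); standard textbook forms for these names.
losses = {
    "ls":          lambda t: (t - 1.0) ** 2,
    "lad":         lambda t: np.abs(t - 1.0),
    "logistic":    lambda t: np.log1p(np.exp(-t)),
    "exponential": lambda t: np.exp(-t),
    "hinge":       lambda t: np.maximum(1.0 - t, 0.0),
}

def empirical_risk(w, X, y, loss):
    """Average loss over the sample: (1/m) * sum_i ell(y_i * x_i^T w)."""
    return float(np.mean(loss(y * (X @ w))))
```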
1.4. Contributions and Organization
- Precise Asymptotics: We show that the absolute value of the correlation of the ERM solution to the true vector is sharply predicted by 1/√(1 + σ²), where the “effective noise” parameter σ can be explicitly computed by solving a system of three non-linear equations in three unknowns. We find that the system of equations (and, thus, the value of σ) depends on the loss function ℓ through its Moreau envelope function. Our prediction holds in the linear asymptotic regime in which m, n → ∞ and m/n → δ (see Section 2).
- Fundamental Limits: We establish fundamental limits on the performance of convex optimization-based estimators by computing an upper bound on the best possible correlation performance among all convex loss functions. We compute the upper bound by solving a certain nonlinear equation, and we show that such a solution always exists (see Section 3.1).
- Optimal Performance and (sub)-optimality of LS for binary models: For certain binary models, including signed and logistic, we find the loss functions that achieve the optimal performance, i.e., that attain the previously derived upper bound (see Section 3.2). Interestingly, for the logistic and Probit models, we prove that the correlation performance of least-squares (LS) is at least 0.9972 and 0.9804 times the optimal performance, respectively. However, as the signal strength grows large, the logistic and Probit models approach the signed model, in which case LS becomes sub-optimal (see Section 4.1).
- Extension to the Gaussian-Mixture Model: In Section 5, we extend the fundamental limits and the system of equations to the Gaussian-mixture model. Interestingly, our results indicate that, for this model, LS is optimal among all convex loss functions for all values of the problem parameters.
- Numerical Simulations: We specialize our results to popular models and loss functions, and we provide numerous simulation results that demonstrate the accuracy of the theoretical predictions (see Section 6 and Appendix E).
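As a toy version of the kind of experiment described above, the following sketch estimates the correlation of the least-squares solution with the true vector under a logistic label model (the setup, dimensions, and function name `ls_correlation` are illustrative assumptions, not the paper's code):

```python
import numpy as np

def ls_correlation(n=200, delta=5.0, seed=0):
    """Monte Carlo sketch: |correlation| of the LS solution with the true
    vector, with logistic labels and Gaussian features."""
    rng = np.random.default_rng(seed)
    m = int(delta * n)                  # proportional regime m/n = delta
    w0 = rng.standard_normal(n)
    w0 /= np.linalg.norm(w0)            # unit-norm true vector
    X = rng.standard_normal((m, n))
    p = 1.0 / (1.0 + np.exp(-X @ w0))   # logistic label probabilities
    y = np.where(rng.random(m) < p, 1.0, -1.0)
    w_ls, *_ = np.linalg.lstsq(X, y, rcond=None)   # argmin ||y - Xw||^2
    return abs(w_ls @ w0) / np.linalg.norm(w_ls)

corr = ls_correlation()
```

In the paper's asymptotic regime, this empirical correlation concentrates around the theoretical prediction 1/√(1 + σ²).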
1.5. Related Works
- (a) Sharp asymptotics for linear measurements.
- (b) One-bit compressed sensing.
- (c) Classification in high-dimensions.
2. Sharp Performance Guarantees
2.1. Definitions
2.2. A System of Equations
2.3. Asymptotic Prediction
3. On Optimal Performance
3.1. Fundamental Limits
3.2. On the Optimal Loss Function
4. Special Cases
4.1. Least-Squares
4.2. Logistic and Hinge Loss
5. Extensions to Gaussian-Mixture Models
5.1. System of Equations for GMM
5.2. Theoretical Prediction of Error for Convex Loss Functions
5.3. Special Case: Least-Squares
5.4. Optimal Risk for GMM
6. Numerical Experiments
Numerical Experiments for GMM
7. Conclusions
Author Contributions
Funding
Conflicts of Interest
Appendix A. Properties of Moreau Envelopes
Appendix A.1. Derivatives
- (a) The proximal operator prox_ℓ(x; λ) is unique and continuous. In fact, prox_ℓ(x_n; λ_n) → prox_ℓ(x; λ) whenever x_n → x and λ_n → λ > 0.
- (b) The value M_ℓ(x; λ) is finite and depends continuously on (x, λ), with M_ℓ(x; λ) → ℓ(x) for all x as λ → 0.
- (c) The Moreau envelope function is differentiable with respect to both arguments. Specifically, for all x and all λ > 0, the following properties are true:
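The derivative formulas referenced above are the standard envelope identities, ∂M/∂x = (x − prox)/λ and ∂M/∂λ = −(x − prox)²/(2λ²). The sketch below checks them by finite differences for the logistic loss, with the prox computed by Newton's method (an illustrative implementation, not the paper's code):

```python
import numpy as np

def logistic_loss(v):
    return np.log1p(np.exp(-v))

def prox_logistic(x, lam, iters=50):
    """Minimizer of logistic_loss(v) + (v - x)^2 / (2*lam), by Newton."""
    v = x
    for _ in range(iters):
        s = 1.0 / (1.0 + np.exp(-v))        # sigmoid(v); loss'(v) = s - 1
        grad = (s - 1.0) + (v - x) / lam    # derivative of prox objective
        hess = s * (1.0 - s) + 1.0 / lam
        v -= grad / hess
    return v

def moreau_logistic(x, lam):
    """Moreau envelope M(x; lam) = min_v loss(v) + (x - v)^2 / (2*lam)."""
    v = prox_logistic(x, lam)
    return logistic_loss(v) + (x - v) ** 2 / (2 * lam)

x, lam, h = 0.7, 0.5, 1e-5
v = prox_logistic(x, lam)
dMdx = (moreau_logistic(x + h, lam) - moreau_logistic(x - h, lam)) / (2 * h)
dMdlam = (moreau_logistic(x, lam + h) - moreau_logistic(x, lam - h)) / (2 * h)
# Finite differences match the standard identities:
#   dM/dx = (x - v)/lam,   dM/dlam = -(x - v)^2 / (2*lam^2)
```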
Appendix A.2. Alternative Representations of (8)
Appendix A.3. Examples of Proximal Operators
Appendix A.4. Fenchel–Legendre Conjugate Representation
Appendix A.5. Convexity of the Moreau Envelope
Appendix A.6. The Expected Moreau-Envelope (EME) Function and its Properties
Appendix A.6.1. Derivatives
Appendix A.6.2. Strict Convexity
“If ℓ is strictly convex and does not attain its minimum at 0, then the expected Moreau envelope is also strictly convex.”
Appendix A.6.3. Strict Concavity
“If ℓ is convex, continuously differentiable, and ℓ′(0) ≠ 0, then the expected Moreau envelope is strictly concave.”
Appendix A.6.4. Summary of Properties of Ω
- (a) The function Ω is differentiable and its derivatives are given as follows:
- (b) The function Ω is jointly convex in its first arguments and concave in γ.
- (c) The function Ω is increasing in α. For the statements below, further assume that ℓ is strictly convex and continuously differentiable with ℓ′(0) ≠ 0.
- (d) The function Ω is strictly convex in its first arguments and strictly concave in λ.
- (e) The function Ω is strictly increasing in α.
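As a quick numerical illustration of the monotonicity properties (c) and (e), the sketch below uses the closed-form Moreau envelope of the LS loss ℓ(v) = (v − 1)², namely M(x; λ) = (x − 1)²/(1 + 2λ), inside a simplified expected Moreau envelope Ω(α, λ) = E[M(αG; λ)] with G standard normal (a toy form assumed for illustration, not the paper's exact definition of Ω):

```python
import numpy as np

def eme_ls(alpha, lam, g):
    """Monte Carlo estimate of Omega(alpha, lam) = E[ M(alpha*G; lam) ]
    using the closed-form LS envelope M(x; lam) = (x - 1)^2 / (1 + 2*lam)."""
    return float(np.mean((alpha * g - 1.0) ** 2) / (1.0 + 2.0 * lam))

rng = np.random.default_rng(1)
g = rng.standard_normal(100_000)     # common random numbers across alpha
alphas = [0.5, 1.0, 1.5, 2.0]
vals = [eme_ls(a, lam=0.3, g=g) for a in alphas]
# vals increases with alpha, matching monotonicity properties (c)/(e)
```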