Article

Small-Deviation Inequalities for Sums of Random Matrices

School of Mathematical Sciences, Dalian University of Technology, Dalian 116024, China
* Author to whom correspondence should be addressed.
Symmetry 2019, 11(5), 638; https://doi.org/10.3390/sym11050638
Submission received: 16 April 2019 / Revised: 2 May 2019 / Accepted: 4 May 2019 / Published: 6 May 2019

Abstract

Random matrices have played an important role in many fields, including machine learning, quantum information theory, and optimization. One of the main research focuses is the deviation inequalities for eigenvalues of random matrices. Although there are intensive studies on the large-deviation inequalities for random matrices, only a few works discuss their small-deviation behavior. In this paper, we present small-deviation inequalities for the largest eigenvalues of sums of random matrices. Since the resulting inequalities are independent of the matrix dimension, they are applicable to high-dimensional and even infinite-dimensional cases.

1. Introduction

Random matrices have been widely used in many problems, e.g., compressed sensing [1], high-dimensional data analysis [2], matrix approximation [3,4] and dimension reduction [5]. In the literature, one of the main research issues is to study the deviation behavior of eigenvalues (or singular values) of random matrices.
In general, two types of deviation results are studied in probability theory: the large-deviation inequality, which describes the behavior of the probability $\mathbb{P}\{|x| > t\}$ for large $t$; and the small-deviation (or small-ball) inequality, which controls the probability $\mathbb{P}\{|x| < \epsilon\}$ for small $\epsilon$.
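As a quick illustration of the two regimes, the following Python sketch (our own illustration, not drawn from the cited works; the thresholds are arbitrary) estimates both probabilities for a standard Gaussian variable by Monte Carlo:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(1_000_000)

# Large-deviation regime: P(|x| > t) decays rapidly as t grows.
for t in [1.0, 2.0, 3.0]:
    print(f"P(|x| > {t}) ~= {np.mean(np.abs(x) > t):.4f}")

# Small-deviation regime: P(|x| < eps) shrinks as eps -> 0;
# for a standard Gaussian it behaves like sqrt(2/pi) * eps.
for eps in [0.5, 0.1, 0.01]:
    print(f"P(|x| < {eps}) ~= {np.mean(np.abs(x) < eps):.4f}")
```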
The early large-deviation inequalities for sums of random matrices date back to the work of Ahlswede and Winter [6]. Tropp [7] improved their results and developed a user-friendly framework for obtaining large-deviation inequalities for sums of random matrices. To overcome the limitation of the matrix-dimension dependence, Hsu et al. [8] and Minsker [9] introduced the concepts of intrinsic dimension and effective dimension, respectively, to tighten the large-deviation inequalities. Moreover, Zhang et al. [10] applied a diagonalization method to obtain dimension-free large-deviation bounds for the largest singular value of sums of random matrices, although selecting the auxiliary matrices and functions remains a challenge. In the scenario of a single random matrix, Ledoux [11] studied the largest eigenvalues of Gaussian unitary ensemble matrices, and Vershynin [12] studied the singular values of sub-Gaussian and sub-exponential matrices.
Small-deviation problems stem from practical applications, e.g., approximation problems [13], Brownian pursuit problems [14], the quantization problem [15], and convex geometry [16]. For more details, we refer to the bibliography maintained by Lifshits [17]. There have been some works on the small-deviation inequalities for specific types of random matrices. Aubrun [18] obtained small-deviation inequalities for the largest eigenvalue of a single Gaussian unitary ensemble matrix. Rudelson and Vershynin [19] presented small-deviation inequalities for the smallest singular value of a random matrix with independent entries. Volodko [20] estimated the small-deviation probability of the determinant of the matrix $\mathbf{B}\mathbf{B}^{\mathrm{T}}$, where $\mathbf{B}$ is a random matrix with $d$ rows whose entries obey a centered joint Gaussian distribution. To the best of our knowledge, there are few works on the small-deviation inequalities for sums of random matrices.

1.1. Related Works

Let $\{\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_K\} \subset \mathbb{C}^{d \times d}$ be a finite sequence of independent random Hermitian matrices. It follows from Markov's inequality that

$$\mathbb{P}\Big\{\lambda_{\max}\Big(\sum_k \mathbf{X}_k\Big) \geq t\Big\} \leq \inf_{\theta > 0}\ e^{-\theta t} \cdot \operatorname{tr} \mathbb{E}\, e^{\theta \sum_k \mathbf{X}_k},$$
where $\lambda_{\max}$ denotes the largest eigenvalue and $\mathbb{E}$ stands for the expectation operation. By using the Golden–Thompson inequality, Ahlswede and Winter [6] bounded the trace of the matrix moment generating function (mgf) in the following way:

$$\operatorname{tr} \mathbb{E}\, e^{\theta \sum_k \mathbf{X}_k} \leq \operatorname{tr}(\mathbf{I}) \cdot \prod_k \lambda_{\max}\big(\mathbb{E}\, e^{\theta \mathbf{X}_k}\big) = d \cdot \exp\Big(\sum_k \lambda_{\max}\big(\log \mathbb{E}\, e^{\theta \mathbf{X}_k}\big)\Big), \tag{1}$$
where $\operatorname{tr}(\mathbf{A})$ stands for the trace of the matrix $\mathbf{A}$. By applying Lieb's concavity theorem, Tropp [7] achieved a tighter matrix mgf bound than the above one:

$$\operatorname{tr} \mathbb{E}\, e^{\theta \sum_k \mathbf{X}_k} \leq d \cdot \exp\Big(\lambda_{\max}\Big(\sum_k \log \mathbb{E}\, e^{\theta \mathbf{X}_k}\Big)\Big), \tag{2}$$
which is tighter because the largest eigenvalue of the sum of matrices on the right-hand side of (2) is no larger than the sum of the largest eigenvalues appearing in (1). However, a shortcoming remains: the result (2) depends on the matrix dimension d, and its right-hand side becomes loose for high-dimensional matrices.
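The gap between (1) and (2) is easy to observe numerically. The following sketch (our illustration only; the Wishart-type ensemble $\mathbf{X} = \mathbf{G}\mathbf{G}^{\mathrm{T}}/d$ and all parameters are arbitrary assumptions) estimates the matrix mgfs by Monte Carlo and evaluates both bounds:

```python
import numpy as np

rng = np.random.default_rng(1)
d, K, theta, n_mc = 5, 8, 0.5, 2000

def herm_fun(M, f):
    """Apply a scalar function f to a symmetric matrix via its eigendecomposition."""
    w, U = np.linalg.eigh(M)
    return (U * f(w)) @ U.T

# Monte Carlo estimates of log E exp(theta * X_k) for K ensembles X_k = G G^T / d.
log_mgfs = []
for _ in range(K):
    mgf = np.zeros((d, d))
    for _ in range(n_mc):
        G = rng.standard_normal((d, d))
        mgf += herm_fun(G @ G.T / d, lambda w: np.exp(theta * w))
    log_mgfs.append(herm_fun(mgf / n_mc, np.log))

lmax = lambda M: np.linalg.eigvalsh(M)[-1]
bound_aw = d * np.exp(sum(lmax(L) for L in log_mgfs))   # Ahlswede-Winter, Eq. (1)
bound_tropp = d * np.exp(lmax(sum(log_mgfs)))           # Tropp, Eq. (2)
print(bound_aw, bound_tropp)  # the Tropp bound is never larger
```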
To overcome this shortcoming, Hsu et al. [8] employed the intrinsic dimension $\operatorname{tr}(\mathbf{X})/\lambda_{\max}(\mathbf{X})$ to replace the ambient dimension d in the case of real symmetric matrices. Minsker [9] provided a dimension-free version of Bernstein's inequality for sequences of independent random matrices. Zhang et al. [10] introduced a diagonalization method to obtain tail bounds for the largest singular values of sums of random matrices. Although these bounds are independent of the matrix dimension, selecting the appropriate parameters to obtain tight bounds remains a challenge.
There are also some small-deviation results on a single random matrix. Edelman [21] presented the small-deviation behavior of the smallest singular value of a Gaussian matrix:

$$\lim_{d \to \infty} \mathbb{P}\Big\{ s_{\min}(\mathbf{A}) \leq \frac{\epsilon}{\sqrt{d}} \Big\} = 1 - \exp\Big({-\epsilon} - \frac{\epsilon^2}{2}\Big),$$
where $\mathbf{A}$ is a $d \times d$ random matrix whose entries are independent standard normal random variables. Rudelson and Vershynin [22] studied the small-deviation bound of the smallest singular value of a sub-Gaussian matrix:

$$\mathbb{P}\Big\{ s_{\min}(\mathbf{B}) \leq \frac{\epsilon}{\sqrt{d}} \Big\} \leq C \epsilon + c^{d},$$
where $C > 0$ and $c \in (0, 1)$ depend only on the sub-Gaussian moments of the entries, and $\mathbf{B}$ is a $d \times d$ random matrix whose entries are i.i.d. sub-Gaussian random variables with zero mean and unit variance. However, to the best of our knowledge, there is little work on the small-deviation inequalities for sums of random matrices.
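Edelman's limiting law above is simple to probe empirically. The sketch below (our own illustration; the dimension and trial count are arbitrary) compares the empirical distribution of $s_{\min}$ with the limit $1 - \exp(-\epsilon - \epsilon^2/2)$:

```python
import numpy as np

rng = np.random.default_rng(2)
d, n_trials = 100, 2000

# Smallest singular value of a d x d standard Gaussian matrix, over many trials.
s_min = np.array([np.linalg.svd(rng.standard_normal((d, d)), compute_uv=False)[-1]
                  for _ in range(n_trials)])

for eps in [0.1, 0.5, 1.0]:
    empirical = np.mean(s_min <= eps / np.sqrt(d))
    limit = 1 - np.exp(-eps - eps ** 2 / 2)
    print(f"eps={eps}: empirical={empirical:.3f}, limiting law={limit:.3f}")
```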

1.2. Overview of Main Results

In this paper, we present small-deviation inequalities for the largest eigenvalue of sums of independent random Hermitian matrices, i.e., upper bounds on

$$\mathbb{P}\Big\{\lambda_{\max}\Big(\sum_k \mathbf{X}_k\Big) \leq \epsilon\Big\}.$$
In particular, we first present some basic small-deviation results for random matrices. We then obtain several types of small-deviation inequalities for the largest eigenvalue of sums of independent random positive semi-definite (PSD) matrices. In contrast to the large-deviation inequalities for random matrices, the resulting small-deviation inequalities are independent of the matrix dimension d, and thus our findings are applicable to high-dimensional and even infinite-dimensional cases.
The rest of this paper is organized as follows. In Section 2, we introduce some useful notations and then give some basic results on small-deviation inequalities for random matrices. The small-deviation results for sums of random PSD matrices are presented in Section 3. The last section concludes the paper.

2. Basic Small-Deviation Inequalities for Random Matrices

In this section, we first introduce the necessary notations and then present some basic small-deviation results of random matrices.

2.1. Necessary Notations

Given a Hermitian matrix $\mathbf{A}$, denote by $\lambda_{\max}(\mathbf{A})$ and $\lambda_{\min}(\mathbf{A})$ the largest and the smallest eigenvalues of $\mathbf{A}$, respectively. Denote by $\operatorname{tr}(\mathbf{A})$ and $\|\mathbf{A}\|$ the trace and the spectral norm of $\mathbf{A}$, respectively. Let $\mathbf{I}$ be the identity matrix, $\mathbf{U}$ be a unitary matrix, and $\mathbf{U}^{*}$ stand for the Hermitian adjoint of $\mathbf{U}$.
By the spectral mapping theorem, given a real-valued function $f : \mathbb{R} \to \mathbb{R}$,

$$f(\mathbf{A}) = \mathbf{U} \cdot f(\mathbf{\Lambda}) \cdot \mathbf{U}^{*},$$

where $\mathbf{A} = \mathbf{U} \mathbf{\Lambda} \mathbf{U}^{*}$ is a diagonalization of $\mathbf{A}$. If $f(a) \leq g(a)$ for all $a \in I$ and the eigenvalues of $\mathbf{A}$ lie in $I$, then $f(\mathbf{A}) \preceq g(\mathbf{A})$ (the transfer rule), where the semi-definite partial order $\preceq$ is defined as follows:

$$\mathbf{A} \preceq \mathbf{H} \quad \Longleftrightarrow \quad \mathbf{H} - \mathbf{A} \ \text{is positive semi-definite}.$$
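The following minimal sketch (ours, for illustration only) demonstrates both facts: $f(\mathbf{A})$ is computed by applying $f$ to the eigenvalues, and the transfer rule applied to $1 + x \leq e^x$ yields $\mathbf{I} + \mathbf{A} \preceq e^{\mathbf{A}}$:

```python
import numpy as np

rng = np.random.default_rng(3)
d = 4

# Random real symmetric (hence Hermitian) A = U diag(w) U^T.
G = rng.standard_normal((d, d))
A = (G + G.T) / 2
w, U = np.linalg.eigh(A)
exp_A = (U * np.exp(w)) @ U.T          # f(A) = U f(Lambda) U^* with f = exp

# Transfer rule: 1 + x <= e^x for all real x, so exp(A) - (I + A) must be PSD.
gap = exp_A - (np.eye(d) + A)
print(np.linalg.eigvalsh(gap).min())   # nonnegative up to round-off
```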

2.2. Basic Small-Deviation Inequalities for Random Matrices

Subsequently, we present the small-deviation inequalities for random matrices. First, we consider a small-deviation bound for a single matrix:
Lemma 1. 
Let $\mathbf{Y}$ be a $d \times d$ random Hermitian matrix. Then, for any $\epsilon > 0$,

$$\mathbb{P}\{\lambda_{\max}(\mathbf{Y}) \leq \epsilon\} \leq \inf_{\theta > 0}\ \frac{1}{d} \cdot e^{\theta \epsilon} \cdot \mathbb{E}\, \operatorname{tr}\, e^{-\theta \mathbf{Y}}.$$
Proof. 
For any $\theta > 0$, we have

$$\begin{aligned}
\mathbb{P}\{\lambda_{\max}(\mathbf{Y}) \leq \epsilon\} &= \mathbb{P}\big\{ e^{-\lambda_{\max}(\theta \mathbf{Y})} \geq e^{-\theta \epsilon} \big\} \\
&\leq \mathbb{E}\, e^{-\lambda_{\max}(\theta \mathbf{Y})} \cdot e^{\theta \epsilon} \qquad [\text{by Markov's inequality}] \\
&= \mathbb{E}\, e^{\lambda_{\min}(-\theta \mathbf{Y})} \cdot e^{\theta \epsilon} \qquad [\text{since } {-\lambda_{\max}(\mathbf{A})} = \lambda_{\min}(-\mathbf{A})] \\
&= \mathbb{E}\, \lambda_{\min}\big(e^{-\theta \mathbf{Y}}\big) \cdot e^{\theta \epsilon} \qquad [\text{by the spectral mapping theorem}] \\
&\leq \frac{1}{d} \cdot e^{\theta \epsilon} \cdot \mathbb{E}\, \operatorname{tr}\, e^{-\theta \mathbf{Y}}.
\end{aligned}$$
The last inequality holds because the smallest eigenvalue of a positive definite (pd) matrix is dominated by $\operatorname{tr}(\cdot)/d$, the average of its eigenvalues. Since this inequality holds for any $\theta > 0$, taking the infimum over $\theta > 0$ completes the proof. □
Then, by using the subadditivity of the matrix cumulant generating function (see [7], Lemma 3.4), we obtain the small-deviation bound for sums of random matrices:
Theorem 1. 
Let $\{\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_K\}$ be a finite sequence of independent random Hermitian matrices. Then, for any $\epsilon > 0$,

$$\mathbb{P}\Big\{\lambda_{\max}\Big(\sum_k \mathbf{X}_k\Big) \leq \epsilon\Big\} \leq \inf_{\theta > 0}\ e^{\theta \epsilon} \cdot \exp\Big(\lambda_{\max}\Big(\sum_k \log \mathbb{E}\, e^{-\theta \mathbf{X}_k}\Big)\Big).$$
Proof. 
By combining Lemma 1 and Lemma 3.4 of [7], we have, for any $\theta > 0$,

$$\begin{aligned}
\mathbb{P}\Big\{\lambda_{\max}\Big(\sum_k \mathbf{X}_k\Big) \leq \epsilon\Big\} &\leq \frac{1}{d} \cdot e^{\theta \epsilon} \cdot \mathbb{E}\, \operatorname{tr}\, e^{-\theta \sum_k \mathbf{X}_k} \\
&\leq \frac{1}{d} \cdot e^{\theta \epsilon} \cdot \operatorname{tr} \exp\Big(\sum_k \log \mathbb{E}\, e^{-\theta \mathbf{X}_k}\Big) \\
&\leq \frac{1}{d} \cdot e^{\theta \epsilon} \cdot d \cdot \lambda_{\max}\Big(\exp\Big(\sum_k \log \mathbb{E}\, e^{-\theta \mathbf{X}_k}\Big)\Big) \\
&= e^{\theta \epsilon} \cdot \exp\Big(\lambda_{\max}\Big(\sum_k \log \mathbb{E}\, e^{-\theta \mathbf{X}_k}\Big)\Big).
\end{aligned} \tag{3}$$
Taking the infimum over θ > 0 completes the proof. □
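To illustrate Theorem 1, consider a sketch of our own (the diagonal-exponential ensemble and all parameters are arbitrary assumptions, not from the results above): take $\mathbf{X}_k = \operatorname{diag}(u_1, \ldots, u_d)$ with i.i.d. standard exponential entries, so that $\mathbb{E}\, e^{-\theta \mathbf{X}_k} = (1 + \theta)^{-1}\mathbf{I}$ and the bound reduces to $\inf_{\theta > 0} e^{\theta\epsilon}(1 + \theta)^{-K}$:

```python
import numpy as np

rng = np.random.default_rng(4)
d, K, eps, n_mc = 5, 3, 1.5, 1_000_000

# Theorem 1 bound for X_k = diag of i.i.d. Exp(1) entries:
#   E exp(-theta X_k) = (1 + theta)^(-1) I  =>  bound = inf exp(theta*eps) (1+theta)^(-K).
theta = np.linspace(1e-3, 50, 50_000)
bound = np.min(np.exp(theta * eps) * (1 + theta) ** (-K))

# lmax(sum_k X_k) is the largest of d independent Gamma(K, 1) coordinates.
coords = rng.gamma(shape=K, scale=1.0, size=(n_mc, d))
empirical = np.mean(coords.max(axis=1) <= eps)
print(f"Monte Carlo: {empirical:.2e}   Theorem 1 bound: {bound:.3f}")
```

The bound is loose for this particular ensemble, but it is dimension-free: the empirical probability shrinks as d grows, while the bound stays fixed.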
Please note that the above small-deviation bound is independent of the matrix dimension d, and thus it is applicable to high-dimensional and even infinite-dimensional matrices. In addition, we derive the following further small-deviation bounds for sums of random matrices.
Corollary 1. 
Let $\{\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_K\}$ be a sequence of independent random Hermitian matrices. Assume that there are a function $g(\theta)$ and a sequence $\{\mathbf{A}_k\}$ of fixed Hermitian matrices such that

$$\mathbb{E}\, e^{-\theta \mathbf{X}_k} \preceq e^{g(\theta) \cdot \mathbf{A}_k}, \qquad \forall\, \theta > 0. \tag{4}$$
1. Define the scalar parameter

$$\eta_1 := \lambda_{\max}\Big(\sum_k \mathbf{A}_k\Big).$$

If $g(\theta) > 0$, then for any $\epsilon > 0$,

$$\mathbb{P}\Big\{\lambda_{\max}\Big(\sum_k \mathbf{X}_k\Big) \leq \epsilon\Big\} \leq \inf_{\theta > 0}\ \exp\big(\theta \epsilon + g(\theta) \cdot \eta_1\big). \tag{5}$$
2. Define the scalar parameter

$$\eta_2 := \lambda_{\min}\Big(\sum_k \mathbf{A}_k\Big).$$

If $g(\theta) < 0$, then for any $\epsilon > 0$,

$$\mathbb{P}\Big\{\lambda_{\max}\Big(\sum_k \mathbf{X}_k\Big) \leq \epsilon\Big\} \leq \inf_{\theta > 0}\ \exp\big(\theta \epsilon + g(\theta) \cdot \eta_2\big). \tag{6}$$
Proof. 
It follows from (4) and the operator monotonicity of the matrix logarithm that

$$\log \mathbb{E}\, e^{-\theta \mathbf{X}_k} \preceq g(\theta) \cdot \mathbf{A}_k,$$

and substituting this into Theorem 1 leads to the result (5). Then, the fact that $\lambda_{\max}(-\mathbf{X}) = -\lambda_{\min}(\mathbf{X})$ for any Hermitian matrix $\mathbf{X}$ leads to the result (6). This completes the proof. □
By using the operator concavity of the matrix logarithm, we obtain another small-deviation bound for sums of random matrices:
Corollary 2. 
Let $\{\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_K\}$ be a sequence of independent random Hermitian matrices. Then, for any $\epsilon > 0$,

$$\mathbb{P}\Big\{\lambda_{\max}\Big(\sum_k \mathbf{X}_k\Big) \leq \epsilon\Big\} \leq \inf_{\theta > 0}\ \exp\Big(\theta \epsilon + K \cdot \log \lambda_{\max}\Big(\frac{1}{K} \sum_{k=1}^{K} \mathbb{E}\, e^{-\theta \mathbf{X}_k}\Big)\Big).$$
Proof. 
Since the matrix logarithm is operator concave, for each $\theta > 0$, we have

$$\sum_{k=1}^{K} \log \mathbb{E}\, e^{-\theta \mathbf{X}_k} = K \cdot \frac{1}{K} \sum_{k=1}^{K} \log \mathbb{E}\, e^{-\theta \mathbf{X}_k} \preceq K \cdot \log\Big(\frac{1}{K} \sum_{k=1}^{K} \mathbb{E}\, e^{-\theta \mathbf{X}_k}\Big).$$
According to the second line of (3) and the monotonicity of the trace exponential with respect to the semi-definite order, we then arrive at

$$\mathbb{P}\Big\{\lambda_{\max}\Big(\sum_k \mathbf{X}_k\Big) \leq \epsilon\Big\} \leq \frac{1}{d} \cdot e^{\theta \epsilon} \cdot \operatorname{tr} \exp\Big(K \cdot \log\Big(\frac{1}{K} \sum_{k=1}^{K} \mathbb{E}\, e^{-\theta \mathbf{X}_k}\Big)\Big).$$
Since the trace of a PSD matrix is bounded by d times its largest eigenvalue, and since $\lambda_{\max}\big(e^{K \log \mathbf{M}}\big) = \lambda_{\max}(\mathbf{M})^{K}$ by the spectral mapping theorem, taking the infimum over $\theta > 0$ completes the proof. □
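The concavity step above can be checked numerically. In the sketch below (ours; random pd matrices stand in for the matrix mgfs $\mathbb{E}\, e^{-\theta \mathbf{X}_k}$, an arbitrary assumption), the difference $K \log\big(\frac{1}{K}\sum_k \mathbf{M}_k\big) - \sum_k \log \mathbf{M}_k$ should be PSD:

```python
import numpy as np
from scipy.linalg import logm

rng = np.random.default_rng(5)
d, K = 4, 6

# Random pd matrices standing in for the matrix mgfs E exp(-theta X_k).
Ms = []
for _ in range(K):
    G = rng.standard_normal((d, d))
    Ms.append(G @ G.T + 0.1 * np.eye(d))

# Operator concavity of log: K * log(mean of M_k) - sum of log(M_k) is PSD.
gap = (K * logm(sum(Ms) / K) - sum(logm(M) for M in Ms)).real
print(np.linalg.eigvalsh(gap).min())   # nonnegative up to round-off
```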
The following lemma relates the small-deviation probability of a sum of random PSD matrices to that of a single summand.
Lemma 2. 
Let $\{\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_K\}$ be a sequence of independent random Hermitian PSD matrices. Then, for all $k$ and any $\epsilon > 0$,

$$\mathbb{P}\Big\{\lambda_{\max}\Big(\sum_{j} \mathbf{X}_j\Big) \leq \epsilon\Big\} \leq \prod_{j=1}^{K} \mathbb{P}\{\lambda_{\max}(\mathbf{X}_j) \leq \epsilon\} \leq \mathbb{P}\{\lambda_{\max}(\mathbf{X}_k) \leq \epsilon\}.$$
Proof. 
Since $\{\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_K\}$ are PSD, we have $\lambda_{\max}\big(\sum_j \mathbf{X}_j\big) \geq \max_j \lambda_{\max}(\mathbf{X}_j)$, and hence, by independence,

$$\mathbb{P}\Big\{\lambda_{\max}\Big(\sum_{j} \mathbf{X}_j\Big) \leq \epsilon\Big\} \leq \mathbb{P}\Big\{\max\big(\lambda_{\max}(\mathbf{X}_1), \ldots, \lambda_{\max}(\mathbf{X}_K)\big) \leq \epsilon\Big\} = \prod_{j=1}^{K} \mathbb{P}\{\lambda_{\max}(\mathbf{X}_j) \leq \epsilon\} \leq \mathbb{P}\{\lambda_{\max}(\mathbf{X}_k) \leq \epsilon\}.$$
The last inequality holds for any $k = 1, 2, \ldots, K$. This completes the proof. □
This lemma shows that the small-deviation probability for a sum of random PSD matrices can be bounded by the small-deviation probability of a single summand. This suggests that a small-deviation bound can be independent of the length of the matrix sequence, a phenomenon that does not arise in the large-deviation scenario.
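The chain of inequalities in Lemma 2 can be seen directly by simulation. In the sketch below (our illustration only; the Wishart-type ensemble $\mathbf{X}_k = \mathbf{G}_k\mathbf{G}_k^{\mathrm{T}}/d$ and the parameters are arbitrary assumptions), all three quantities are estimated by Monte Carlo:

```python
import numpy as np

rng = np.random.default_rng(6)
d, K, eps, n_mc = 3, 4, 4.0, 50_000

# K independent PSD summands per trial: X_k = G_k G_k^T / d (so E X_k = I).
G = rng.standard_normal((n_mc, K, d, d))
X = G @ G.transpose(0, 1, 3, 2) / d
lmax_single = np.linalg.eigvalsh(X)[..., -1]           # shape (n_mc, K)
lmax_sum = np.linalg.eigvalsh(X.sum(axis=1))[..., -1]  # shape (n_mc,)

lhs = np.mean(lmax_sum <= eps)                          # P{lmax(sum) <= eps}
prod = np.prod(np.mean(lmax_single <= eps, axis=0))     # product over k
single = np.mean(lmax_single[:, 0] <= eps)              # one factor
print(f"{lhs:.4f} <= {prod:.4f} <= {single:.4f}")
```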

3. Small-Deviation Inequalities for Positive Semi-Definite Random Matrices

In this section, we present several types of small-deviation inequalities for the largest eigenvalue of sums of independent random PSD matrices. As with the scalar version of small-deviation inequalities, it remains challenging to bound the term $\mathbb{E}\, e^{-\theta \mathbf{X}_k}$. Here, we adopt several methods to handle this issue.
First, we use a negative-moment estimate for the largest eigenvalue to derive a small-deviation inequality for sums of random matrices:
Theorem 2. 
Let $\{\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_K\}$ be a sequence of independent random Hermitian PSD matrices. Given $p > 0$, if there exists a positive constant $C_p$ such that

$$\lambda_{\max}\Big(\sum_k \mathbb{E}\, \mathbf{X}_k^{-p}\Big) < C_p,$$

then, for any $\epsilon > 0$,

$$\mathbb{P}\Big\{\lambda_{\max}\Big(\sum_k \mathbf{X}_k\Big) \leq \epsilon\Big\} \leq C_p\, \epsilon^{p}.$$
Proof. 
It follows from Jensen's inequality that

$$\mathbb{E}\, \lambda_{\max}\Big(\sum_k \mathbf{X}_k\Big)^{-p} \leq \mathbb{E}\, \lambda_{\max}\Big(\sum_k \mathbf{X}_k^{-p}\Big) \leq \lambda_{\max}\Big(\sum_k \mathbb{E}\, \mathbf{X}_k^{-p}\Big) < C_p.$$
Then, Markov's inequality yields

$$\mathbb{P}\Big\{\lambda_{\max}\Big(\sum_k \mathbf{X}_k\Big) \leq \epsilon\Big\} = \mathbb{P}\Big\{\lambda_{\max}\Big(\sum_k \mathbf{X}_k\Big)^{-p} \geq \epsilon^{-p}\Big\} \leq \epsilon^{p} \cdot \mathbb{E}\, \lambda_{\max}\Big(\sum_k \mathbf{X}_k\Big)^{-p} \leq C_p\, \epsilon^{p}.$$
This completes the proof. □
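The Markov step in the proof is easy to reproduce numerically. A minimal sketch (ours; the Wishart-type ensemble and $p = 2$ are arbitrary assumptions):

```python
import numpy as np

rng = np.random.default_rng(7)
d, K, p, n_mc = 3, 4, 2.0, 100_000

# Z = lmax(sum_k X_k) > 0 for X_k = G_k G_k^T / d; then for any eps > 0,
#   P{Z <= eps} = P{Z^(-p) >= eps^(-p)} <= eps^p * E[Z^(-p)].
G = rng.standard_normal((n_mc, K, d, d))
Z = np.linalg.eigvalsh((G @ G.transpose(0, 1, 3, 2) / d).sum(axis=1))[..., -1]

neg_moment = np.mean(Z ** (-p))   # Monte Carlo estimate of E[Z^(-p)]
for eps in [0.5, 1.0, 2.0]:
    print(f"eps={eps}: P(Z<=eps)={np.mean(Z <= eps):.5f}, "
          f"bound={eps ** p * neg_moment:.5f}")
```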
In this theorem, we impose the assumption that the negative moment of $\lambda_{\max}\big(\sum_k \mathbf{X}_k\big)$ is bounded. In general, this assumption is mild and can be satisfied in most cases. The following small-deviation results are derived under the condition that the eigenvalues of the matrices $\{\mathbf{X}_k\}$ are bounded:
Theorem 3. 
Let $\{\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_K\}$ be a sequence of independent random Hermitian PSD matrices such that $\lambda_{\max}(\mathbf{X}_k) \leq L$ ($k = 1, 2, \ldots, K$) almost surely. Then, for any $0 < \epsilon < \mu$,

$$\mathbb{P}\Big\{\lambda_{\max}\Big(\sum_k \mathbf{X}_k\Big) \leq \epsilon\Big\} \leq \Big(\frac{\mu}{\epsilon}\Big)^{\epsilon/L} \cdot \exp\Big(\frac{\epsilon - \mu}{L}\Big), \tag{7}$$
where

$$\mu := \lambda_{\min}\Big(\sum_k \mathbb{E}\, \mathbf{X}_k\Big).$$
Furthermore, for any $0 < \epsilon < \min_k \mu_k$,

$$\mathbb{P}\Big\{\lambda_{\max}\Big(\sum_k \mathbf{X}_k\Big) \leq \epsilon\Big\} \leq \Big(\frac{1}{\epsilon}\Big)^{K\epsilon/L} \cdot \prod_{k=1}^{K} \mu_k^{\epsilon/L} \cdot \exp\Big(\frac{K\epsilon - \sum_k \mu_k}{L}\Big), \tag{8}$$
where

$$\mu_k := \lambda_{\min}(\mathbb{E}\, \mathbf{X}_k).$$
Proof. 
For any $\theta > 0$ and $x \in [0, L]$, the convexity of $x \mapsto e^{-\theta x}$ implies

$$e^{-\theta x} \leq 1 + \frac{e^{-\theta L} - 1}{L} \cdot x \leq \exp\Big(\frac{e^{-\theta L} - 1}{L} \cdot x\Big).$$
According to the transfer rule and the operator monotonicity of the matrix logarithm, we have

$$\log \mathbb{E}\, e^{-\theta \mathbf{X}_k} \preceq \frac{e^{-\theta L} - 1}{L}\, \mathbb{E}\, \mathbf{X}_k. \tag{9}$$
By substituting (9) into Corollary 1 (with $g(\theta) = (e^{-\theta L} - 1)/L$ and $\mathbf{A}_k = \mathbb{E}\, \mathbf{X}_k$), we then have

$$\begin{aligned}
\mathbb{P}\Big\{\lambda_{\max}\Big(\sum_k \mathbf{X}_k\Big) \leq \epsilon\Big\} &\leq \inf_{\theta > 0}\ e^{\theta \epsilon} \cdot \exp\Big(\lambda_{\max}\Big(\sum_k \frac{e^{-\theta L} - 1}{L}\, \mathbb{E}\, \mathbf{X}_k\Big)\Big) \\
&= \inf_{\theta > 0}\ \exp\Big(\theta \epsilon + \lambda_{\max}\Big(\frac{e^{-\theta L} - 1}{L} \sum_k \mathbb{E}\, \mathbf{X}_k\Big)\Big) \\
&= \inf_{\theta > 0}\ \exp\Big(\theta \epsilon + \frac{e^{-\theta L} - 1}{L} \cdot \lambda_{\min}\Big(\sum_k \mathbb{E}\, \mathbf{X}_k\Big)\Big) \\
&= \inf_{\theta > 0}\ \exp\Big(\theta \epsilon + \frac{e^{-\theta L} - 1}{L} \cdot \mu\Big),
\end{aligned}$$

where the third line holds since $(e^{-\theta L} - 1)/L < 0$. The infimum is achieved at $\theta = \frac{1}{L} \log\big(\frac{\mu}{\epsilon}\big)$, which is positive for $\epsilon < \mu$ and leads to the result (7).
Moreover, the combination of Lemma 1 and (9) leads to

$$\mathbb{P}\{\lambda_{\max}(\mathbf{X}_k) \leq \epsilon\} \leq \Big(\frac{\mu_k}{\epsilon}\Big)^{\epsilon/L} \cdot \exp\Big(\frac{\epsilon - \mu_k}{L}\Big).$$

Then, the result (8) follows from Lemma 2. This completes the proof. □
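A scalar sanity check of the bound (7) is straightforward (a sketch of our own; the choice $\mathbf{X}_k = u_k \mathbf{I}$ with $u_k \sim \mathrm{Uniform}[0,1]$ is an arbitrary assumption): then $\lambda_{\max}(\mathbf{X}_k) \leq L = 1$, $\mu = K/2$, and $\lambda_{\max}\big(\sum_k \mathbf{X}_k\big) = \sum_k u_k$ follows the Irwin–Hall distribution.

```python
import numpy as np

rng = np.random.default_rng(8)
K, L, eps, n_mc = 6, 1.0, 1.0, 1_000_000

mu = K / 2                                     # lmin(sum_k E X_k) for X_k = u_k * I
s = rng.uniform(size=(n_mc, K)).sum(axis=1)    # lmax(sum_k X_k) = sum_k u_k
empirical = np.mean(s <= eps)

bound = (mu / eps) ** (eps / L) * np.exp((eps - mu) / L)   # Theorem 3, Eq. (7)
print(f"Monte Carlo: {empirical:.2e}   bound (7): {bound:.3f}")
```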
Actually, the above results are derived from a geometric point of view, where the term $e^{-\theta x}$ is bounded by the linear function $1 + \frac{e^{-\theta L} - 1}{L}\, x$ for any $x \in [0, L]$. Finally, we study the small-deviation inequalities for the random matrix series $\sum_k x_k \mathbf{A}_k$, which is a sum of fixed Hermitian pd matrices $\mathbf{A}_k$ weighted by random variables $x_k$.
Theorem 4. 
Let $\{\mathbf{A}_1, \mathbf{A}_2, \ldots, \mathbf{A}_K\}$ be a sequence of fixed Hermitian pd matrices, and let $\{x_1, x_2, \ldots, x_K\}$ be a finite sequence of independent random variables. If there exist constants $C > 0$ and $\alpha > 0$ such that, for all $\theta > 0$,

$$\mathbb{E}\, e^{-\theta x_k} \leq C \cdot \theta^{-\alpha}, \tag{10}$$

then, for any $0 < \epsilon < \frac{K\alpha}{e} \cdot \big(\frac{K}{C\nu}\big)^{1/\alpha}$,

$$\mathbb{P}\Big\{\lambda_{\max}\Big(\sum_k x_k \mathbf{A}_k\Big) \leq \epsilon\Big\} \leq \Big(\frac{e\,\epsilon}{K\alpha}\Big)^{\alpha K} \cdot \Big(\frac{C\,\nu}{K}\Big)^{K}, \tag{11}$$
where

$$\nu = \lambda_{\max}\Big(\sum_k \mathbf{A}_k^{-\alpha}\Big).$$
Furthermore, for any $0 < \epsilon < \frac{\alpha}{e} \cdot C^{-1/\alpha} \cdot \big(\prod_{k=1}^{K} \nu_k\big)^{-\frac{1}{\alpha K}}$,

$$\mathbb{P}\Big\{\lambda_{\max}\Big(\sum_k x_k \mathbf{A}_k\Big) \leq \epsilon\Big\} \leq \prod_{k=1}^{K} \nu_k \cdot C^{K} \cdot \Big(\frac{e\,\epsilon}{\alpha}\Big)^{\alpha K}, \tag{12}$$
where

$$\nu_k = \lambda_{\max}\big(\mathbf{A}_k^{-\alpha}\big).$$
Proof. 
According to the transfer rule and (10), we have

$$\mathbb{E}\, e^{-\theta x_k \mathbf{A}_k} \preceq C \cdot (\theta \mathbf{A}_k)^{-\alpha}. \tag{13}$$
By substituting (13) into Corollary 2, we then have

$$\begin{aligned}
\mathbb{P}\Big\{\lambda_{\max}\Big(\sum_k x_k \mathbf{A}_k\Big) \leq \epsilon\Big\} &\leq \inf_{\theta > 0}\ \exp\Big(\theta \epsilon + K \cdot \log \lambda_{\max}\Big(\frac{1}{K} \sum_{k=1}^{K} C \cdot (\theta \mathbf{A}_k)^{-\alpha}\Big)\Big) \\
&= \inf_{\theta > 0}\ \exp\Big(\theta \epsilon + K \cdot \log \lambda_{\max}\Big(\frac{C\, \theta^{-\alpha}}{K} \sum_{k=1}^{K} \mathbf{A}_k^{-\alpha}\Big)\Big) \\
&= \inf_{\theta > 0}\ \exp\Big(\theta \epsilon + K \cdot \log\Big(\frac{C\, \nu\, \theta^{-\alpha}}{K}\Big)\Big).
\end{aligned}$$

The infimum is attained at $\theta = \frac{\alpha K}{\epsilon}$, which leads to the result (11). Moreover, the combination of Lemma 1 and (13) leads to
$$\mathbb{P}\{\lambda_{\max}(x_k \mathbf{A}_k) \leq \epsilon\} \leq C \cdot \nu_k \cdot \Big(\frac{e\,\epsilon}{\alpha}\Big)^{\alpha}.$$

Then, the result (12) follows from Lemma 2. This completes the proof. □
The above results hold under condition (10), which requires $\mathbb{E}\, e^{-\theta x_k}$ to admit a power-type upper bound $C \cdot \theta^{-\alpha}$. This condition is mild, and we refer to [23] for details. Moreover, to keep the results (11) and (12) non-trivial, their right-hand sides should be less than one, which leads to the ranges $\epsilon < \frac{K\alpha}{e} \cdot \big(\frac{K}{C\nu}\big)^{1/\alpha}$ and $\epsilon < \frac{\alpha}{e} \cdot C^{-1/\alpha} \cdot \big(\prod_{k=1}^{K} \nu_k\big)^{-\frac{1}{\alpha K}}$, respectively.
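As a concrete instance of condition (10) (a sketch of our own; the choices $x_k \sim \mathrm{Exp}(1)$ and $\mathbf{A}_k = \mathbf{I}$ are arbitrary assumptions): $\mathbb{E}\, e^{-\theta x_k} = (1+\theta)^{-1} \leq \theta^{-1}$, so (10) holds with $C = 1$ and $\alpha = 1$; with $\mathbf{A}_k = \mathbf{I}$, we get $\nu = K$, and the bound (11) reduces to $(e\epsilon/K)^K$ for $\epsilon < K/e$:

```python
import numpy as np

rng = np.random.default_rng(9)
K, eps, n_mc = 5, 1.0, 1_000_000

# x_k ~ Exp(1): E exp(-theta x_k) = 1/(1+theta) <= 1/theta, i.e. C = 1, alpha = 1.
# With A_k = I, lmax(sum_k x_k A_k) = sum_k x_k ~ Gamma(K, 1) and nu = K.
s = rng.standard_exponential((n_mc, K)).sum(axis=1)
empirical = np.mean(s <= eps)

bound = (np.e * eps / K) ** K    # Theorem 4, Eq. (11); requires eps < K/e
print(f"Monte Carlo: {empirical:.2e}   bound (11): {bound:.2e}")
```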

4. Conclusions

In this paper, we presented small-deviation inequalities for the largest eigenvalues of sums of random matrices. In particular, we first gave some basic results on small-deviation inequalities for random matrices. We then studied the small-deviation inequalities for sums of independent random PSD matrices. In contrast to the large-deviation inequalities for random matrices, our results are independent of the matrix dimension d and are thus applicable to high-dimensional and even infinite-dimensional matrices. In addition, by using the Hermitian dilation (see Section 2.6 of [7]), our small-deviation results can also be extended to non-Hermitian random matrices.
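For completeness, the Hermitian dilation mentioned above takes a general matrix $\mathbf{A}$ to $\mathcal{H}(\mathbf{A}) = \begin{pmatrix} \mathbf{0} & \mathbf{A} \\ \mathbf{A}^{*} & \mathbf{0} \end{pmatrix}$, which is Hermitian and satisfies $\lambda_{\max}(\mathcal{H}(\mathbf{A})) = \|\mathbf{A}\|$. A minimal numerical check (our sketch, with an arbitrary real $3 \times 4$ matrix):

```python
import numpy as np

rng = np.random.default_rng(10)
d1, d2 = 3, 4

# Hermitian dilation H(A) = [[0, A], [A^*, 0]]; lmax(H(A)) equals the spectral norm of A.
A = rng.standard_normal((d1, d2))
H = np.block([[np.zeros((d1, d1)), A],
              [A.T, np.zeros((d2, d2))]])
print(np.linalg.eigvalsh(H)[-1], np.linalg.norm(A, 2))   # equal up to round-off
```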

Author Contributions

Conceptualization, X.G. and C.Z.; methodology, X.G. and C.Z.; validation, X.G., C.Z. and H.Z.; resources, C.Z.; writing—original draft preparation, X.G.; writing—review and editing, C.Z.; supervision, C.Z. and H.Z.; project administration, C.Z. and H.Z.; funding acquisition, C.Z.

Funding

This research was funded by the National Natural Science Foundation of China, grant numbers 61473328 and 11401076.

Acknowledgments

We are grateful to the anonymous reviewers and the editors for their valuable comments and suggestions.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Chandrasekaran, V.; Recht, B.; Parrilo, P.A.; Willsky, A.S. The convex geometry of linear inverse problems. Found. Comput. Math. 2012, 12, 805–849. [Google Scholar] [CrossRef]
  2. Bühlmann, P.; van de Geer, S. Statistics for High-Dimensional Data: Methods, Theory and Applications; Springer: Berlin, Germany, 2011. [Google Scholar]
  3. Gittens, A.; Mahoney, M.W. Revisiting the Nyström method for improved large-scale machine learning. J. Mach. Learn. Res. 2016, 17, 3977–4041. [Google Scholar]
  4. Halko, N.; Martinsson, P.G.; Tropp, J.A. Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions. SIAM Rev. 2011, 53, 217–288. [Google Scholar] [CrossRef]
  5. Clarkson, K.L.; Woodruff, D.P. Low rank approximation and regression in input sparsity time. In Proceedings of the Forty-Fifth Annual ACM Symposium on Theory of Computing, Palo Alto, CA, USA, 1–4 June 2013; pp. 81–90. [Google Scholar]
  6. Ahlswede, R.; Winter, A. Strong converse for identification via quantum channels. IEEE Trans. Inf. Theory 2002, 48, 569–579. [Google Scholar] [CrossRef]
  7. Tropp, J.A. User-friendly tail bounds for sums of random matrices. Found. Comput. Math. 2012, 12, 389–434. [Google Scholar] [CrossRef]
  8. Hsu, D.; Kakade, S.M.; Zhang, T. Tail inequalities for sums of random matrices that depend on the intrinsic dimension. Electron. Commun. Prob. 2012, 17, 1–13. [Google Scholar] [CrossRef]
  9. Minsker, S. On some extensions of Bernstein's inequality for self-adjoint operators. Stat. Probabil. Lett. 2017, 127, 111–119. [Google Scholar] [CrossRef]
  10. Zhang, C.; Du, L.; Tao, D. LSV-based tail inequalities for sums of random matrices. Neural Comput. 2017, 29, 247–262. [Google Scholar] [CrossRef] [PubMed]
  11. Ledoux, M. Deviation inequalities on largest eigenvalues. In Geometric Aspects of Functional Analysis; Milman, V.D., Schechtman, G., Eds.; Springer: Berlin, Germany, 2007; pp. 167–219. [Google Scholar]
  12. Vershynin, R. Introduction to the non-asymptotic analysis of random matrices. arXiv 2010, arXiv:1011.3027. [Google Scholar]
  13. Li, W.V.; Linde, W. Approximation, metric entropy and small ball estimates for Gaussian measures. Ann. Probab. 1999, 27, 1556–1578. [Google Scholar]
  14. Li, W.V.; Shao, Q.M. Capture time of Brownian pursuits. Probab. Theory Relat. Fields 2001, 121, 30–48. [Google Scholar] [CrossRef]
  15. Dereich, S.; Fehringer, F.; Matoussi, A.; Scheutzow, M. On the link between small ball probabilities and the quantization problem for Gaussian measures on Banach spaces. J. Theor. Probab. 2003, 16, 249–265. [Google Scholar] [CrossRef]
  16. Klartag, B.; Vershynin, R. Small ball probability and Dvoretzky's theorem. Isr. J. Math. 2007, 157, 193–207. [Google Scholar] [CrossRef]
  17. Lifshits, M. Bibliography of Small Deviation Probabilities. Available online: https://www.lpsm.paris/pageperso/smalldev/biblio.pdf (accessed on 15 December 2016).
  18. Aubrun, G. A sharp small deviation inequality for the largest eigenvalue of a random matrix. In Séminaire de Probabilités XXXVIII; Springer: Berlin, Germany, 2005; pp. 320–337. [Google Scholar]
  19. Rudelson, M.; Vershynin, R. Non-asymptotic theory of random matrices: extreme singular values. arXiv 2010, arXiv:1003.2990. [Google Scholar]
  20. Volodko, N.V. Small deviations of the determinants of random matrices with Gaussian entries. Stat. Probabil. Lett. 2014, 84, 48–53. [Google Scholar] [CrossRef]
  21. Edelman, A. Eigenvalues and condition numbers of random matrices. SIAM J. Matrix Anal. Appl. 1988, 9, 543–560. [Google Scholar] [CrossRef]
  22. Rudelson, M.; Vershynin, R. The Littlewood–Offord problem and invertibility of random matrices. Adv. Math. 2008, 218, 600–633. [Google Scholar] [CrossRef]
  23. Li, W.V. Small Value Probabilities in Analysis and Mathematical Physics. Available online: https://www.math.arizona.edu/~mathphys/school_2012//WenboLi.pdf (accessed on 15 March 2012).
