Eigenvalue Estimates Using the Kolmogorov-Sinai Entropy

The scope of this paper is twofold. First, we use the Kolmogorov-Sinai entropy to estimate lower bounds for the dominant eigenvalues of nonnegative matrices. For symmetric matrices, the resulting lower bound is at least as good as the Rayleigh quotient. Second, we use this estimate to give a nontrivial lower bound for the gap between the dominant eigenvalues of $A$ and $A + V$.


Introduction
The main concern of this paper is to relate eigenvalue estimates to the Kolmogorov-Sinai entropy for Markov shifts. We begin with the definition of the Kolmogorov-Sinai entropy. Let $A = (a_{ij}) \in \mathbb{R}^{N \times N}$ be an irreducible nonnegative matrix. By an irreducible matrix $A$, we mean that for each $1 \le i, j \le N$ there exists a positive integer $k$ such that $(A^k)_{ij} > 0$. A matrix $P = (p_{ij}) \in \mathbb{R}^{N \times N}$ is said to be a stochastic matrix compatible with $A$ if $P$ satisfies
$$p_{ij} \ge 0, \qquad \sum_{j=1}^{N} p_{ij} = 1, \qquad p_{ij} = 0 \iff a_{ij} = 0, \qquad 1 \le i, j \le N.$$
We denote by $\mathcal{P}_A$ the set of all stochastic matrices compatible with $A$. Since $A$ is irreducible, every $P \in \mathcal{P}_A$ is irreducible, and by the Perron-Frobenius Theorem it has a unique left eigenvector $q > 0$ corresponding to the eigenvalue 1 with $\sum_{i=1}^{N} q_i = 1$. We call $q$ the stationary probability vector associated with $P$.
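As a minimal numerical sketch of these definitions (not part of the original argument), the following builds one compatible stochastic matrix and solves for its stationary probability vector. The helper names `compatible_stochastic` and `stationary_vector` and the example matrix are hypothetical choices made for illustration only.

```python
import numpy as np

def compatible_stochastic(A, rng):
    """Random row-stochastic P with p_ij > 0 exactly where a_ij > 0 (hypothetical helper)."""
    W = np.where(A > 0, rng.random(A.shape) + 0.1, 0.0)  # positive weights on the support of A
    return W / W.sum(axis=1, keepdims=True)

def stationary_vector(P):
    """Solve q^T P = q^T together with sum(q) = 1 by least squares (hypothetical helper)."""
    n = P.shape[0]
    M = np.vstack([P.T - np.eye(n), np.ones((1, n))])
    rhs = np.zeros(n + 1)
    rhs[-1] = 1.0
    q, *_ = np.linalg.lstsq(M, rhs, rcond=None)
    return q

# Illustrative irreducible nonnegative matrix (an assumption, not taken from the paper).
A = np.array([[0.0, 2.0, 1.0],
              [1.0, 0.0, 3.0],
              [2.0, 1.0, 0.0]])
rng = np.random.default_rng(0)
P = compatible_stochastic(A, rng)
q = stationary_vector(P)
```

The least-squares formulation is only one convenient way to obtain $q$; an eigenvector routine applied to $P^T$ would serve equally well.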
For any $P \in \mathcal{P}_A$ and its associated stationary probability vector $q$, the Markov measure of a cylinder may then be defined by
$$\mu_{P,q}(C_{j_0,j_1,\dots,j_n}) = q_{j_0}\, p_{j_0,j_1} \cdots p_{j_{n-1},j_n}.$$
Here $\mu_{P,q}$ is an invariant measure under the shift map $\sigma_A$ (see, e.g., [8]). The Kolmogorov-Sinai entropy (also called the measure-theoretic entropy) of $\sigma_A$ under the invariant measure $\mu_{P,q}$ is defined by
$$h_{P,q}(\sigma_A) = \lim_{n \to \infty} \frac{1}{n} \sum_{j_0,\dots,j_n} H\big(\mu_{P,q}(C_{j_0,j_1,\dots,j_n})\big),$$
where $H(x) = -x \log x$ and the convention $0 \log 0 = 0$ is adopted. The notion of the Kolmogorov-Sinai entropy, which measures the uncertainty of a dynamical system, was first studied by Kolmogorov in 1958 in connection with problems arising from information theory and the dimension of functional spaces (see, e.g., [6,7]). It is shown in [8] (p. 221) that
$$h_{P,q}(\sigma_A) = -\sum_{i,j} q_i\, p_{ij} \log p_{ij}, \tag{1}$$
where the summation in (1) is taken over all $i, j$ with $a_{ij} = 1$. On the other hand, it is shown by Parry [9] (Theorems 6 and 7) that the Kolmogorov-Sinai entropy of $\sigma_A$ has the upper bound $\log \lambda_N(A)$.
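To make the entropy formula concrete, here is a small sketch that evaluates $-\sum_{i,j} q_i p_{ij} \log p_{ij}$ for the transition matrix $A = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}$, whose dominant eigenvalue is the golden ratio. The example matrix, the one-parameter family `golden_mean_chain`, and the function names are assumptions chosen for illustration; they do not appear in the paper.

```python
import numpy as np

def markov_entropy(P, q):
    """-sum_ij q_i p_ij log p_ij with the convention 0 log 0 = 0."""
    P = np.asarray(P, dtype=float)
    logP = np.zeros_like(P)
    np.log(P, out=logP, where=P > 0)      # entries with p_ij = 0 keep log value 0
    return -np.sum(q[:, None] * P * logP)

def golden_mean_chain(p):
    """Compatible P for A = [[1, 1], [1, 0]] and its stationary vector (hypothetical helper)."""
    P = np.array([[p, 1.0 - p], [1.0, 0.0]])
    q = np.array([1.0, 1.0 - p]) / (2.0 - p)
    return P, q

phi = (1.0 + np.sqrt(5.0)) / 2.0           # dominant eigenvalue of A
P_half, q_half = golden_mean_chain(0.5)
P_star, q_star = golden_mean_chain(1.0 / phi)
h_half = markov_entropy(P_half, q_half)     # a generic compatible chain
h_star = markov_entropy(P_star, q_star)     # the chain with p = 1/phi
```

In this example the generic chain stays strictly below $\log \varphi$, while the choice $p = 1/\varphi$ attains it, in line with the upper bound $\log \lambda_N(A)$ quoted from Parry.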
Theorem 1.1 (Parry's Theorem). Let $A$ be an $N \times N$ irreducible transition matrix. Then for any $P \in \mathcal{P}_A$ and its associated stationary probability vector $q$, we have
$$h_{P,q}(\sigma_A) \le \log \lambda_N(A), \tag{2}$$
where $\lambda_N(A)$ denotes the dominant eigenvalue of $A$. Moreover, if $A$ is regular ($A^n > 0$ for some $n > 0$), the equality in (2) holds for a unique $P \in \mathcal{P}_A$ and the stationary probability vector $q$ associated with $P$.
Parry's Theorem shows that the Kolmogorov-Sinai entropy of a Markov shift is less than or equal to its topological entropy (that is, $\log \lambda_N(A)$), and that exactly one of the Markov measures on $\Sigma_A$ maximizes the Kolmogorov-Sinai entropy of $\sigma_A$, provided it is topologically mixing. This is also a crucial lemma for showing the Variational Property of Entropy [8] (Proposition 8.1) in ergodic theory. From the viewpoint of eigenvalue problems, however, the combination of (1) and (2) gives a lower bound for the dominant eigenvalue of the transition matrix $A$. In this paper, we generalize Parry's Theorem to general $N \times N$ irreducible nonnegative matrices. Toward this end, we extend the entropy to irreducible nonnegative matrices by
$$h_{P,q,A} = -\sum_{a_{ij} > 0} q_i\, p_{ij} \log \frac{p_{ij}}{a_{ij}}.$$
It is easily seen that $h_{P,q,A} = h_{P,q}(\sigma_A)$ when $A$ is a transition matrix.
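Theorem 1.2 (Main Result 1: The Generalized Parry's Theorem). Let $A \in \mathbb{R}^{N \times N}$ be an irreducible nonnegative matrix. Let $P \in \mathcal{P}_A$ and let $q$ be the stationary probability vector associated with $P$. Then we have
$$h_{P,q,A} = -\sum_{i,j} q_i\, p_{ij} \log \frac{p_{ij}}{a_{ij}} \le \log \lambda_N(A), \tag{3}$$
where the summation is taken over all $i, j$ with $a_{ij} > 0$. Moreover, the equality in (3) holds when
$$P = \frac{1}{\lambda_N(A)}\, \mathrm{diag}(x)^{-1} A\, \mathrm{diag}(x)$$
and
$$q = \frac{y \circ x}{y^T x},$$
where $x > 0$ and $y > 0$ are, respectively, the right and left eigenvectors of $A$ corresponding to the eigenvalue $\lambda_N(A)$. Here, $\mathrm{diag}(x)$ denotes the diagonal matrix with $x$ on its diagonal, $y \circ x$ denotes the vector $(y_1 x_1, \dots, y_N x_N)^T$, and $y^T$ denotes the transpose of the column vector $y$.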
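The inequality can be probed numerically. The sketch below samples random compatible stochastic matrices for one illustrative matrix $A$ and records $\log \lambda_N(A) - h_{P,q,A}$, reading the entropy as $-\sum_{a_{ij}>0} q_i p_{ij} \log(p_{ij}/a_{ij})$. The matrix, the sampling scheme, and the function names are assumptions made for this illustration; a sampled check is of course no substitute for the proof.

```python
import numpy as np

def stationary_vector(P):
    """Solve q^T P = q^T with sum(q) = 1 by least squares (hypothetical helper)."""
    n = P.shape[0]
    M = np.vstack([P.T - np.eye(n), np.ones((1, n))])
    rhs = np.zeros(n + 1)
    rhs[-1] = 1.0
    q, *_ = np.linalg.lstsq(M, rhs, rcond=None)
    return q

def generalized_entropy(P, q, A):
    """-sum over a_ij > 0 of q_i p_ij log(p_ij / a_ij)."""
    mask = A > 0
    weights = (q[:, None] * P)[mask]
    return -np.sum(weights * np.log(P[mask] / A[mask]))

def entropy_gap_samples(A, trials, seed=0):
    """log(lambda_N(A)) - h_{P,q,A} for randomly drawn compatible P."""
    rng = np.random.default_rng(seed)
    log_lam = np.log(np.max(np.linalg.eigvals(A).real))  # Perron root of A
    gaps = []
    for _ in range(trials):
        W = np.where(A > 0, rng.random(A.shape) + 0.1, 0.0)
        P = W / W.sum(axis=1, keepdims=True)
        q = stationary_vector(P)
        gaps.append(log_lam - generalized_entropy(P, q, A))
    return np.array(gaps)

# Illustrative irreducible nonnegative matrix (an assumption).
A = np.array([[0.0, 2.5, 0.3],
              [1.2, 0.0, 4.0],
              [0.7, 1.1, 0.5]])
gaps = entropy_gap_samples(A, trials=200)
```

Every sampled gap should be nonnegative up to rounding, which is what the inequality in the theorem predicts.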
Lower bound estimates for the dominant eigenvalue of a symmetric irreducible nonnegative matrix play an important role in various fields, e.g., the complexity of a symbolic dynamical system [5], the synchronization problem of coupled systems [10], or the ground state estimates of Schrödinger operators [2]. A usual way to estimate a lower bound for $\lambda_N(A)$ is the Rayleigh quotient
$$\lambda_N(A) \ge \frac{x^T A x}{x^T x}.$$
It is also well known (see, e.g., [4] (Theorem 8.1.26)) that
$$\lambda_N(A) \ge \min_{1 \le i \le N} \frac{(Ax)_i}{x_i}, \tag{4}$$
provided that $A \in \mathbb{R}^{N \times N}$ is nonnegative and $x \in \mathbb{R}^N$ is positive. Comparing the lower bound estimate (3) with (4) as well as with the Rayleigh quotient, we have the following result.
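Corollary 1.3. Let $A \in \mathbb{R}^{N \times N}$ be a symmetric, irreducible nonnegative matrix, and let $x \in \mathbb{R}^N$ be positive. Then the matrix $P = \mathrm{diag}(Ax)^{-1} A\, \mathrm{diag}(x)$ is in $\mathcal{P}_A$ and $q = \dfrac{x \circ (Ax)}{x^T A x}$ is the stationary probability vector associated with $P$. In addition,
$$\log \lambda_N(A) \ge h_{P,q,A} \ge \log \frac{x^T A x}{x^T x} \qquad \text{and} \qquad h_{P,q,A} \ge \log \min_{1 \le i \le N} \frac{(Ax)_i}{x_i}.$$
Here, each equality holds if and only if $x$ is the eigenvector of $A$ corresponding to the eigenvalue $\lambda_N(A)$.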
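The comparison in the corollary can be illustrated with a short sketch that computes the Rayleigh quotient, the minimum-ratio bound, and the entropy-based bound $\exp(h_{P,q,A})$ for one symmetric matrix and one positive trial vector. The matrix, the vector, and the function name `corollary_bounds` are hypothetical and chosen only for illustration; the entropy is evaluated directly from $-\sum_{a_{ij}>0} q_i p_{ij} \log(p_{ij}/a_{ij})$.

```python
import numpy as np

def corollary_bounds(A, x):
    """Lower bounds for the dominant eigenvalue of a symmetric nonnegative A (hypothetical helper)."""
    y = A @ x
    P = A * x[None, :] / y[:, None]          # diag(Ax)^{-1} A diag(x)
    q = x * y / (x @ y)                      # x o (Ax) / (x^T A x)
    mask = A > 0
    h = -np.sum((q[:, None] * P)[mask] * np.log(P[mask] / A[mask]))
    return {
        "P": P,
        "q": q,
        "entropy_bound": np.exp(h),
        "rayleigh": (x @ y) / (x @ x),
        "min_ratio": np.min(y / x),
        "lambda": np.max(np.linalg.eigvalsh(A)),
    }

# Illustrative symmetric irreducible nonnegative matrix and positive vector (assumptions).
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 0.0, 3.0],
              [0.0, 3.0, 1.0]])
x = np.array([1.0, 2.0, 1.5])
bounds = corollary_bounds(A, x)
```

For a trial vector that is not an eigenvector, one expects `rayleigh` and `min_ratio` to lie below `entropy_bound`, which in turn lies below `lambda`.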
Here we remark that for an arbitrary irreducible nonnegative matrix $A$, the entropy $h_{P,q,A}$ involves the left eigenvector $q$ of $P$. Hence, the lower bound estimate (3) is merely a formal expression. However, for a symmetric irreducible nonnegative matrix $A$ and $P$ chosen as in Corollary 1.3, the vector $q$ can be explicitly expressed. Therefore, $h_{P,q,A}$ can be written in an explicit form. We shall further show in Proposition 2.6 that
$$h_{P,q,A} = -\frac{1}{x^T y} \sum_{i=1}^{N} x_i y_i \log \frac{x_i}{y_i},$$
where $y = Ax$.

Considering a symmetric nonnegative $A$ and its perturbation $A + V$, it is easily seen that $\lambda_N(A + V) - \lambda_N(A) \ge x^T V x$, where $x$ is the normalized eigenvector of $A$ corresponding to $\lambda_N(A)$. This gives a trivial lower bound for the gap between $\lambda_N(A + V)$ and $\lambda_N(A)$. Upper bound estimates for the gap are well studied in perturbation theory [4,11]. By considering $A + V$ as a low rank perturbation of $A$, the interlacing structure of the eigenvalues of $A + V$ and of $A$ is studied in [1,3]. In the second result of this paper, we give a nontrivial lower bound for $\lambda_N(A + V) - \lambda_N(A)$.

Theorem 1.4 (Main Result 2). Let $A \in \mathbb{R}^{N \times N}$ be a symmetric, irreducible nonnegative matrix, let $V = \mathrm{diag}(v_1, \dots, v_N) \ge 0$, and let $x > 0$ be the normalized eigenvector of $A$ corresponding to $\lambda_N(A)$. Then
$$\lambda_N(A + V) - \lambda_N(A) \ge \big(f(1/\lambda_N(A)) - 1\big)\, \lambda_N(A), \tag{5}$$
where
$$f(z) = \prod_{i=1}^{N} (1 + v_i z)^{\frac{x_i^2 (1 + v_i z)}{1 + b z}}, \qquad b = \sum_{i=1}^{N} x_i^2 v_i.$$
Here $\big(f(1/\lambda_N(A)) - 1\big)\lambda_N(A) \ge x^T V x$. Furthermore, the equality in (5) holds if and only if $v_1 = \cdots = v_N$.

This paper is organized as follows. In Section 2, we prove the generalized Parry's Theorem in three steps. First, we prove the case in which the matrix $A$ has only integer entries. Next, we show that Theorem 1.2 is true for nonnegative matrices with rational entries. Finally, we show that it holds true for all irreducible nonnegative matrices. The proof of Corollary 1.3 is given at the end of that section. In Section 3, we give the proof of Theorem 1.4. We conclude this paper in Section 4.
Throughout this paper, we use boldface letters (or symbols) to denote matrices (or vectors). For $u, v \in \mathbb{R}^N$, the Hadamard product of $u$ and $v$ is their elementwise product, which is denoted by $u \circ v = (u_1 v_1, \dots, u_N v_N)^T$. The symbol $\lambda_N(A)$ denotes the dominant eigenvalue of a nonnegative matrix $A$.

Proof of the Generalized Parry's Theorem and Corollary 1.3
In this section, we shall prove the generalized Parry's Theorem and Corollary 1.3. To prove inequality (3), we proceed in three steps.
Step 1: Inequality (3) is true for all irreducible nonnegative matrices with integer entries.
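Let $A$ be an irreducible nonnegative matrix with integer entries. To apply Parry's Theorem, we shall construct a transition matrix $\bar{A}$ corresponding to $A$ for which $\lambda_N(\bar{A}) = \lambda_N(A)^{1/2}$. To this end, we define the sets of indices
$$I = \{1, \dots, N\}, \qquad E = \big\{\overrightarrow{ij}^{(k)} : a_{ij} > 0,\ 1 \le k \le a_{ij}\big\},$$
and set $\tilde{N} = \#E = \sum_{i,j} a_{ij}$. The matrix $\bar{A} = (\bar{a}_{\alpha\beta})$, indexed by $I \cup E$, is defined as follows:
(1) $\bar{a}_{i,\overrightarrow{ij}^{(k)}} = 1$ for all $\overrightarrow{ij}^{(k)} \in E$; (6a)
(2) $\bar{a}_{\overrightarrow{ij}^{(k)},j} = 1$ for all $\overrightarrow{ij}^{(k)} \in E$; (6b)
(3) the remaining entries are set to zero. (6c)
It is easily seen that $\bar{A}$ can be written in the block form
$$\bar{A} = \begin{pmatrix} 0_{N \times N} & \bar{A}_{IE} \\ \bar{A}_{EI} & 0_{\tilde{N} \times \tilde{N}} \end{pmatrix}, \tag{7}$$
where $0_{N \times N}$ and $0_{\tilde{N} \times \tilde{N}}$ are, respectively, the zero matrices in $\mathbb{R}^{N \times N}$ and $\mathbb{R}^{\tilde{N} \times \tilde{N}}$.

Proposition 2.1. $\lambda_N(\bar{A}) = \lambda_N(A)^{1/2}$.

Proof. From (7), we see that
$$\bar{A}^2 = \begin{pmatrix} \bar{A}_{IE} \bar{A}_{EI} & 0 \\ 0 & \bar{A}_{EI} \bar{A}_{IE} \end{pmatrix}.$$
From (6a) and (6b), for each $i, j$ with $a_{ij} \neq 0$, we have
$$\sum_{k=1}^{a_{ij}} \bar{a}_{i,\overrightarrow{ij}^{(k)}}\, \bar{a}_{\overrightarrow{ij}^{(k)},j} = a_{ij}. \tag{8}$$
Using (8), together with (6c), we have
$$\big(\bar{A}_{IE} \bar{A}_{EI}\big)_{ij} = a_{ij}, \qquad 1 \le i, j \le N. \tag{9}$$
From (9) we see that $\bar{A}_{IE} \bar{A}_{EI} = A$. Hence $\lambda_N(\bar{A}^2) = \lambda_N(\bar{A}_{IE} \bar{A}_{EI}) = \lambda_{\tilde{N}}(\bar{A}_{EI} \bar{A}_{IE}) = \lambda_N(A)$. On the other hand, $\bar{A}$ is a nonnegative matrix, so by the Perron-Frobenius Theorem its dominant eigenvalue is nonnegative. The assertion follows.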
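The construction can be sketched in a few lines: every directed edge of the multigraph of $A$ receives one extra vertex, and the enlarged 0-1 matrix is assembled from the two kinds of unit entries. The function names `edge_doubled_matrix` and `spectral_radius` and the example matrix are hypothetical; the ordering of the added vertices is an arbitrary choice of this sketch.

```python
import numpy as np

def edge_doubled_matrix(A):
    """Insert one vertex on every edge of the multigraph of an integer matrix A."""
    A = np.asarray(A, dtype=int)
    N = A.shape[0]
    # One entry per edge i -> j, repeated a_ij times.
    edges = [(i, j) for i in range(N) for j in range(N) for _ in range(A[i, j])]
    M = N + len(edges)
    A_bar = np.zeros((M, M))
    for e, (i, j) in enumerate(edges):
        A_bar[i, N + e] = 1.0     # vertex i -> added vertex on the edge
        A_bar[N + e, j] = 1.0     # added vertex -> vertex j
    return A_bar, edges

def spectral_radius(B):
    return np.max(np.abs(np.linalg.eigvals(B)))

# Illustrative integer matrix with dominant eigenvalue 2 (an assumption).
A = np.array([[1, 2],
              [1, 0]])
A_bar, edges = edge_doubled_matrix(A)
N = A.shape[0]
A_IE, A_EI = A_bar[:N, N:], A_bar[N:, :N]
```

Multiplying the two off-diagonal blocks recovers $A$, and the spectral radius of the enlarged matrix is the square root of that of $A$, as the proposition states.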
Remark 2.1. In the language of graph theory, $a_{ij}$ represents the number of directed edges from vertex $i$ to vertex $j$. Hence $\sum_{i,j} (A^n)_{ij}$ equals the number of all possible routes of length $n + 1$, i.e.,
$$\#\{\text{all possible routes of length } n + 1\} = \sum_{i,j} (A^n)_{ij}.$$
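For the construction of $\bar{A}$, we add an additional vertex on every edge from vertex $i$ to vertex $j$ (see Figure 2.1 for an illustration). Hence, each route
$$i_1 \to i_2 \to \cdots \to i_n \tag{10}$$
that obeys the rule defined by $A$ now becomes one of the following routes according to the rule defined by $\bar{A}$:
$$i_1 \to \overrightarrow{i_1 i_2}^{(k_1)} \to i_2 \to \cdots \to i_{n-1} \to \overrightarrow{i_{n-1} i_n}^{(k_{n-1})} \to i_n, \tag{11}$$
where $1 \le k_j \le a_{i_j i_{j+1}}$, $j = 1, \dots, n - 1$. However, a route of the form in (11) is equivalent to the form in (10), but its length is doubled. Hence the number of routes grows like $\lambda_N(\bar{A})^{2n}$ under $\bar{A}$ and like $\lambda_N(A)^{n}$ under $A$, which is consistent with Proposition 2.1.

Now, let $P \in \mathcal{P}_A$ be given and let $q$ be its associated stationary probability vector. We shall accordingly define a stochastic matrix $\bar{P} \in \mathcal{P}_{\bar{A}}$ and its associated stationary probability vector $\bar{q}$. The stochastic matrix $\bar{P} = (\bar{p}_{\alpha\beta})$ is defined as follows:
(1) $\bar{p}_{i,\overrightarrow{ij}^{(k)}} = p_{ij} / a_{ij}$ for all $\overrightarrow{ij}^{(k)} \in E$; (12a)
(2) $\bar{p}_{\overrightarrow{ij}^{(k)},j} = 1$ for all $\overrightarrow{ij}^{(k)} \in E$; (12b)
(3) the remaining entries are set to zero. (12c)
From (6) and (12), it is easily seen that $\bar{P}$ is a stochastic matrix compatible with $\bar{A}$. Let the vector $\bar{q} \in \mathbb{R}^{N + \tilde{N}}$ be defined by
$$\bar{q}_i = \frac{q_i}{2}, \qquad 1 \le i \le N, \tag{13a}$$
$$\bar{q}_{\overrightarrow{ij}^{(k)}} = \frac{q_i\, p_{ij}}{2 a_{ij}}, \qquad \overrightarrow{ij}^{(k)} \in E. \tag{13b}$$

Proposition 2.2. $\bar{q}$ is the stationary probability vector associated with $\bar{P}$.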
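Proof. We first show that $\bar{q}$ is a left eigenvector of $\bar{P}$ with the corresponding eigenvalue 1. For any $1 \le j \le N$, using (12b), (13b), and the fact that $q^T P = q^T$, we have
$$(\bar{q}^T \bar{P})_j = \sum_{i:\, a_{ij} > 0} \sum_{k=1}^{a_{ij}} \bar{q}_{\overrightarrow{ij}^{(k)}}\, \bar{p}_{\overrightarrow{ij}^{(k)},j} = \frac{1}{2} \sum_{i} q_i\, p_{ij} = \frac{q_j}{2} = \bar{q}_j. \tag{14a}$$
On the other hand, using (12a) and (13a), for all $\overrightarrow{ij}^{(k)}$ with $a_{ij} > 0$ and $1 \le k \le a_{ij}$, we have
$$(\bar{q}^T \bar{P})_{\overrightarrow{ij}^{(k)}} = \bar{q}_i\, \bar{p}_{i,\overrightarrow{ij}^{(k)}} = \frac{q_i}{2} \cdot \frac{p_{ij}}{a_{ij}} = \bar{q}_{\overrightarrow{ij}^{(k)}}. \tag{14b}$$
In (14), we have proved $\bar{q}^T \bar{P} = \bar{q}^T$. Now we show that the total sum of the entries of $\bar{q}$ is 1. Using the fact that $\sum_{i,j} q_i\, p_{ij} = 1$, we have
$$\sum_{i=1}^{N} \bar{q}_i + \sum_{a_{ij} > 0} \sum_{k=1}^{a_{ij}} \bar{q}_{\overrightarrow{ij}^{(k)}} = \frac{1}{2} \sum_{i=1}^{N} q_i + \frac{1}{2} \sum_{i,j} q_i\, p_{ij} = 1.$$
The proof is complete.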
From the construction of the transition matrix $\bar{A}$, it is easily seen that $\bar{A}$ is irreducible. By (12) and Proposition 2.2, $\bar{P} \in \mathcal{P}_{\bar{A}}$ and the vector $\bar{q}$ defined by (13) is its associated stationary probability vector. Hence the Kolmogorov-Sinai entropy $h_{\bar{P},\bar{q}}(\sigma_{\bar{A}})$ is well defined. Now we give the relationship between the quantities $h_{\bar{P},\bar{q}}(\sigma_{\bar{A}})$ and $h_{P,q,A}$ defined in Equation (3).

Proposition 2.3. $h_{\bar{P},\bar{q}}(\sigma_{\bar{A}}) = \frac{1}{2}\, h_{P,q,A}$.
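Proof. We note that by (12b), $\log \bar{p}_{\overrightarrow{ij}^{(k)},j} = 0$ if $a_{ij} > 0$. Using the definition of $\bar{P}$ and $\bar{q}$ in (12) and (13), as well as the entropy formula (1), we have
$$h_{\bar{P},\bar{q}}(\sigma_{\bar{A}}) = -\sum_{a_{ij} > 0} \sum_{k=1}^{a_{ij}} \bar{q}_i\, \bar{p}_{i,\overrightarrow{ij}^{(k)}} \log \bar{p}_{i,\overrightarrow{ij}^{(k)}} = -\frac{1}{2} \sum_{a_{ij} > 0} q_i\, p_{ij} \log \frac{p_{ij}}{a_{ij}} = \frac{1}{2}\, h_{P,q,A}.$$
The proof is complete.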
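The lifted chain can also be checked numerically. The sketch below assumes the reading used in the proofs above: a transition from a vertex $i$ onto any of the $a_{ij}$ added vertices has probability $p_{ij}/a_{ij}$, the added vertex then moves to $j$ with probability 1, and the stationary mass is split evenly between original and added vertices. The function names and the example data are hypothetical.

```python
import numpy as np

def stationary_vector(P):
    """Solve q^T P = q^T with sum(q) = 1 by least squares (hypothetical helper)."""
    n = P.shape[0]
    M = np.vstack([P.T - np.eye(n), np.ones((1, n))])
    rhs = np.zeros(n + 1)
    rhs[-1] = 1.0
    q, *_ = np.linalg.lstsq(M, rhs, rcond=None)
    return q

def lift_chain(A, P, q):
    """Lifted stochastic matrix and stationary vector for an integer-valued A."""
    N = A.shape[0]
    edges = [(i, j) for i in range(N) for j in range(N)
             for _ in range(int(A[i, j]))]
    M = N + len(edges)
    P_bar, q_bar = np.zeros((M, M)), np.zeros(M)
    q_bar[:N] = q / 2.0                            # mass on original vertices
    for e, (i, j) in enumerate(edges):
        P_bar[i, N + e] = P[i, j] / A[i, j]        # vertex i -> added vertex
        P_bar[N + e, j] = 1.0                      # added vertex -> vertex j
        q_bar[N + e] = q[i] * P[i, j] / (2.0 * A[i, j])
    return P_bar, q_bar

def entropy(P, q, A=None):
    """-sum q_i p_ij log(p_ij / a_ij); A=None treats a_ij as 1 on the support of P."""
    mask = P > 0
    ratio = P[mask] if A is None else P[mask] / A[mask]
    return -np.sum((q[:, None] * P)[mask] * np.log(ratio))

# Illustrative data (assumptions): integer A and one compatible P.
A = np.array([[1.0, 2.0], [1.0, 0.0]])
P = np.array([[0.4, 0.6], [1.0, 0.0]])
q = stationary_vector(P)
P_bar, q_bar = lift_chain(A, P, q)
h_lifted = entropy(P_bar, q_bar)
h_general = entropy(P, q, A)
```

In this example the lifted vector is stationary for the lifted matrix, and the entropy of the lifted chain is half of $h_{P,q,A}$.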
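Using Propositions 2.1 and 2.3 and Parry's Theorem (Theorem 1.1), it follows that
$$h_{P,q,A} = 2\, h_{\bar{P},\bar{q}}(\sigma_{\bar{A}}) \le 2 \log \lambda_N(\bar{A}) = \log \lambda_N(A). \tag{15}$$
Step 2: Inequality (3) is true for all irreducible nonnegative matrices with rational entries.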
Any $N \times N$ nonnegative matrix with rational entries can be written as $A/n$, where $A$ is a nonnegative matrix with integer entries and $n$ is a positive integer. Suppose $A$ is irreducible and $P \in \mathcal{P}_{A/n}$. Note that $\mathcal{P}_{A/n} = \mathcal{P}_A$. Letting $q$ be the stationary probability vector associated with $P$, inequality (3) for $A/n$ follows from the following proposition.
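Proposition 2.4. Let $A$, $n$, $P$ and $q$ be as above. Then $h_{P,q,A/n} \le \log \lambda_N(A/n)$.

Proof. From the definition of $h_{P,q,A/n}$, we see that
$$h_{P,q,A/n} = -\sum_{a_{ij} > 0} q_i\, p_{ij} \log \frac{n\, p_{ij}}{a_{ij}} = h_{P,q,A} - (\log n) \sum_{i,j} q_i\, p_{ij}. \tag{16}$$
On the other hand, since $q^T P = q^T$ and $\sum_i q_i = 1$, we have
$$\sum_{i,j} q_i\, p_{ij} = \sum_{j} q_j = 1. \tag{17}$$
Substituting (17) into (16) and using the result (15) in Step 1, we have
$$h_{P,q,A/n} = h_{P,q,A} - \log n \le \log \lambda_N(A) - \log n = \log \lambda_N(A/n).$$
Step 3: Inequality (3) is true for all irreducible nonnegative matrices.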
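It remains to show that (3) holds for nonnegative matrices $A$ with irrational entries. Such a matrix is the limit of nonnegative matrices with rational entries and the same zero pattern, so that the set of compatible stochastic matrices does not change along the sequence. For fixed $P$ and $q$, the entropy $h_{P,q,A}$ depends continuously on the positive entries of $A$. The assertion therefore follows from Step 2 and the continuous dependence of the eigenvalues on the matrix. Now, we give the proof of the second assertion of Theorem 1.2.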
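Proposition 2.5. The equality in (3) holds when one chooses
$$P = \frac{1}{\lambda_N(A)}\, \mathrm{diag}(x)^{-1} A\, \mathrm{diag}(x)$$
and
$$q = \frac{y \circ x}{y^T x},$$
where $x > 0$ and $y > 0$ are, respectively, the right and left eigenvectors of $A$ corresponding to the eigenvalue $\lambda_N(A)$.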
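Proof. To ease the notation, set $\lambda_N = \lambda_N(A)$. By setting $y^T x = 1$, we may write
$$q_i = x_i y_i, \qquad p_{ij} = \frac{a_{ij} x_j}{\lambda_N x_i}.$$
Hence, using $Ax = \lambda_N x$ and $y^T A = \lambda_N y^T$, we have
$$h_{P,q,A} = -\sum_{a_{ij} > 0} y_i\, \frac{a_{ij} x_j}{\lambda_N} \log \frac{x_j}{\lambda_N x_i} = \log \lambda_N - \sum_{j} y_j x_j \log x_j + \sum_{i} y_i x_i \log x_i = \log \lambda_N.$$
The proof of Theorem 1.2 is complete.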
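The equality case admits a direct numerical check. The sketch below computes the right and left Perron vectors of an illustrative nonsymmetric matrix, forms $P = \lambda_N(A)^{-1}\mathrm{diag}(x)^{-1} A\,\mathrm{diag}(x)$ and $q = (y \circ x)/(y^T x)$, and evaluates $-\sum_{a_{ij}>0} q_i p_{ij}\log(p_{ij}/a_{ij})$. The helper `perron_pair` and the example matrix are assumptions made for illustration.

```python
import numpy as np

def perron_pair(A):
    """Dominant eigenvalue with positive right and left eigenvectors (hypothetical helper)."""
    w, V = np.linalg.eig(A)
    k = np.argmax(w.real)
    lam = w[k].real
    x = np.abs(V[:, k].real)                     # right Perron vector, made positive
    wl, U = np.linalg.eig(A.T)
    y = np.abs(U[:, np.argmax(wl.real)].real)    # left Perron vector, made positive
    return lam, x, y

def generalized_entropy(P, q, A):
    mask = A > 0
    return -np.sum((q[:, None] * P)[mask] * np.log(P[mask] / A[mask]))

# Illustrative irreducible nonnegative matrix (an assumption).
A = np.array([[0.0, 2.5, 0.3],
              [1.2, 0.0, 4.0],
              [0.7, 1.1, 0.5]])
lam, x, y = perron_pair(A)
P = A * x[None, :] / (lam * x[:, None])          # diag(x)^{-1} A diag(x) / lambda
q = x * y / (x @ y)
h = generalized_entropy(P, q, A)
```

Up to rounding, `h` coincides with `np.log(lam)`, and `q` is stationary for `P`.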
In the following, we give the proof of Corollary 1.3. We first prove the following useful proposition. It will be used in Section 3 as well.
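Proposition 2.6. Let $A \in \mathbb{R}^{N \times N}$ be an irreducible nonnegative matrix. Suppose $A$ is symmetric and $x \in \mathbb{R}^N$ is positive. If $P = \mathrm{diag}(Ax)^{-1} A\, \mathrm{diag}(x)$ and $q = \dfrac{x \circ y}{x^T y}$, where $y = Ax$, then $P \in \mathcal{P}_A$, $q$ is the stationary probability vector associated with $P$, and
$$h_{P,q,A} = -\frac{1}{x^T y} \sum_{i=1}^{N} x_i y_i \log \frac{x_i}{y_i}. \tag{18}$$
By the first assertion, the matrix $P$ is a stochastic matrix compatible with $A$ and $q$ is its associated stationary probability vector. Hence, the entropy $h_{P,q,A}$ is well defined. Now, we give the proof of this proposition.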
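Proof. Since $A \ge 0$ is irreducible and $x > 0$, it follows that $Ax > 0$, and hence $\mathrm{diag}(Ax)^{-1}$ is well defined. It is easily seen that $p_{ij} = 0$ if and only if $a_{ij} = 0$. Moreover, $Pe = \mathrm{diag}(Ax)^{-1}(Ax) = e$, where $e$ denotes the vector of all ones. This shows that $P \in \mathcal{P}_A$. On the other hand, since $A$ is symmetric, we see that $y^T = x^T A$. Hence
$$q^T P = \frac{1}{x^T y} (x \circ y)^T \mathrm{diag}(y)^{-1} A\, \mathrm{diag}(x) = \frac{1}{x^T y}\, x^T A\, \mathrm{diag}(x) = \frac{1}{x^T y}\, y^T \mathrm{diag}(x) = q^T.$$
We have proved the first assertion of this proposition. By the definition of $h_{P,q,A}$ in (3) and $p_{ij} = a_{ij} x_j / y_i$, we have
$$h_{P,q,A} = -\sum_{a_{ij} > 0} q_i\, p_{ij} \log \frac{x_j}{y_i} = -\sum_{j} q_j \log x_j + \sum_{i} q_i \log y_i = -\frac{1}{x^T y} \sum_{i=1}^{N} x_i y_i \log \frac{x_i}{y_i}.$$
This completes the proof. Now, we are in a position to give the proof of Corollary 1.3.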
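Proof of Corollary 1.3. For convenience, we let $y = Ax$. Hence $q = \dfrac{x \circ y}{x^T y}$ and $p_{ij} = \dfrac{a_{ij} x_j}{y_i}$. Using Proposition 2.6, we have
$$h_{P,q,A} = -\frac{1}{x^T y} \sum_{i=1}^{N} x_i y_i \log \frac{x_i}{y_i} \ge -\log \left( \frac{1}{x^T y} \sum_{i=1}^{N} x_i y_i \cdot \frac{x_i}{y_i} \right) = \log \frac{x^T A x}{x^T x}. \tag{19}$$
Here inequality (19) follows from Jensen's inequality (see, e.g., [12] (Theorem 7.35)) for $-\log$ and the fact that $\sum_{i=1}^{N} \frac{1}{x^T y}\, x_i y_i = 1$. Similarly, using Proposition 2.6 and the monotonicity of $\log$, we also see that
$$h_{P,q,A} = \frac{1}{x^T y} \sum_{i=1}^{N} x_i y_i \log \frac{y_i}{x_i} \ge \log \min_{1 \le i \le N} \frac{(Ax)_i}{x_i}. \tag{20}$$
Together with Theorem 1.2, this proves the first assertion of Corollary 1.3. It is easily seen that if $x$ is an eigenvector corresponding to $\lambda_N(A)$, then both equalities in (19) and (20) hold. From the assumption that $A \ge 0$ is irreducible and $x > 0$, it follows that $y > 0$ also. This implies that there are $N$ terms in (18), each with a positive weight. Hence equality in (19) or in (20) holds only if the ratios $x_i / y_i$, $i = 1, \dots, N$, are constant. That is, $y = Ax = \lambda x$, where $\lambda$ is some eigenvalue of $A$. However, $x > 0$, so by the Perron-Frobenius Theorem it follows that $\lambda = \lambda_N(A)$. The proof is complete.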

Proof of Theorem 1.4
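In this section, we shall give the proof of Theorem 1.4. We first prove (5).

Proposition 3.1. Let $A$, $V$ and $x$ be as defined in Theorem 1.4. Then we have
$$\lambda_N(A + V) - \lambda_N(A) \ge \big(f(1/\lambda_N(A)) - 1\big)\, \lambda_N(A), \tag{21}$$
where
$$f(z) = \prod_{i=1}^{N} (1 + v_i z)^{\frac{x_i^2 (1 + v_i z)}{1 + b z}}, \qquad b = \sum_{i=1}^{N} x_i^2 v_i.$$
The equality holds in (21) if and only if $v_1 = \cdots = v_N$.

Proof. To ease the notation, we shall denote $\lambda = \lambda_N(A)$. Let $y = (A + V)x = \lambda x + Vx$, $q = \dfrac{x \circ y}{x^T y}$, and $P = \mathrm{diag}(y)^{-1} (A + V)\, \mathrm{diag}(x) \in \mathcal{P}_{A+V}$. Since $y_i = (\lambda + v_i) x_i$ and $x^T y = \lambda + b$, from Theorem 1.2 and Proposition 2.6 we have
$$\log \lambda_N(A + V) \ge h_{P,q,A+V} = \sum_{i=1}^{N} \frac{x_i^2 (\lambda + v_i)}{\lambda + b} \log (\lambda + v_i). \tag{22}$$
We note that
$$\log \lambda = \sum_{i=1}^{N} \frac{x_i^2 (\lambda + v_i)}{\lambda + b} \log \lambda. \tag{23}$$
Subtracting (23) from (22), we have
$$\log \frac{\lambda_N(A + V)}{\lambda} \ge \sum_{i=1}^{N} \frac{x_i^2 (\lambda + v_i)}{\lambda + b} \log \left( 1 + \frac{v_i}{\lambda} \right) = \log f(1/\lambda),$$
and hence
$$\lambda_N(A + V) \ge \lambda\, f(1/\lambda).$$
This proves (21). Now we prove the second assertion of this proposition. It is easily seen that Here (24) follows from the concavity of $\log$ and Jensen's inequality. Hence, if the equality in (22) holds, then the equality in (25) also holds. This means that $x$ is also an eigenvector of $A + V$. However, $x > 0$ is the eigenvector of $A$ corresponding to $\lambda_N(A)$, so $Vx = (A + V)x - Ax$ is a scalar multiple of $x$, and we conclude that $v_1 = \cdots = v_N$. This completes the proof.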
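A numerical sketch of the gap estimate follows. It assumes the explicit form $\log f(z) = \frac{1}{1+bz}\sum_i x_i^2 (1+v_i z)\log(1+v_i z)$ with $b = \sum_i x_i^2 v_i$, which is what subtracting (23) from (22) yields for a normalized Perron vector $x$; the function name `gap_lower_bound` and the example data are hypothetical.

```python
import numpy as np

def gap_lower_bound(A, v):
    """Return (true gap, entropy-based bound, x^T V x) for symmetric A and V = diag(v)."""
    w, U = np.linalg.eigh(A)
    lam = w[-1]                              # dominant eigenvalue of A
    x = np.abs(U[:, -1])                     # normalized Perron vector, x^T x = 1
    b = np.sum(x**2 * v)                     # b = x^T V x
    z = 1.0 / lam
    # Assumed explicit form of log f(z); see the lead-in paragraph.
    log_f = np.sum(x**2 * (1.0 + v * z) * np.log1p(v * z)) / (1.0 + b * z)
    bound = (np.exp(log_f) - 1.0) * lam
    gap = np.linalg.eigvalsh(A + np.diag(v))[-1] - lam
    return gap, bound, b

# Illustrative symmetric irreducible nonnegative A and diagonal perturbations (assumptions).
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 0.0, 3.0],
              [0.0, 3.0, 1.0]])
gap, bound, trivial = gap_lower_bound(A, np.array([0.5, 0.0, 2.0]))
gap_c, bound_c, trivial_c = gap_lower_bound(A, np.array([0.7, 0.7, 0.7]))
```

For the non-constant perturbation one expects `trivial <= bound <= gap`; for the constant perturbation all three quantities coincide with the common value of the $v_i$.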
The following proposition can be obtained from a standard calculation.
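Proposition 3.2. Let $f$ be the real-valued function in Proposition 3.1. Then we have
$$f'(z) = f(z) \left[ \frac{b}{1 + bz} + \frac{c(z)}{(1 + bz)^2} \right], \tag{26a}$$
$$f''(z) = f(z) \left[ \frac{d(z)}{(1 + bz)^2} + \frac{c(z)^2}{(1 + bz)^4} \right], \tag{26b}$$
where $b = \sum_{i=1}^{N} x_i^2 v_i$ and
$$c(z) = \sum_{i=1}^{N} x_i^2 (v_i - b) \log (1 + v_i z), \tag{27a}$$
$$d(z) = \sum_{i=1}^{N} x_i^2 (v_i - b)\, \frac{v_i}{1 + v_i z}. \tag{27b}$$
In the following, we show that the lower bound estimate (5) for $\lambda_N(A + V) - \lambda_N(A)$ is no smaller than $x^T V x$, that is, $\big(f(1/\lambda_N(A)) - 1\big)\lambda_N(A) \ge x^T V x$.

Proof. It is easily seen from the definition of $f(z)$ that $f(0) = 1$. Hence, by the Mean Value Theorem, there exists a $\zeta \in (0, 1/\lambda_N(A))$ such that
$$f(1/\lambda_N(A)) - 1 = \frac{f'(\zeta)}{\lambda_N(A)}. \tag{28}$$
From (26a) and (27a), we see that $f'(0) = b = x^T V x$. From (26b), (27a) and (27b), we also see that $f''(z) \ge 0$ for all $z \ge 0$. This implies
$$f'(\zeta) \ge f'(0) = x^T V x. \tag{29}$$
The assertion follows from (28) and (29) directly.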
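The two facts used in this proof, $f'(0) = b$ and $f''(z) \ge 0$ for $z \ge 0$, can be probed by finite differences. The sketch assumes the explicit form $\log f(z) = \frac{1}{1+bz}\sum_i x_i^2(1+v_i z)\log(1+v_i z)$; the vector `x` is a hypothetical normalized positive vector standing in for the Perron vector, and the step sizes are arbitrary choices.

```python
import numpy as np

def f(z, x, v):
    """Assumed explicit form of f; x is normalized and positive, v >= 0."""
    b = np.sum(x**2 * v)
    return np.exp(np.sum(x**2 * (1.0 + v * z) * np.log1p(v * z)) / (1.0 + b * z))

# Hypothetical data: x has unit Euclidean norm, v holds the diagonal of V.
x = np.array([0.6, 0.48, 0.64])
v = np.array([0.5, 0.0, 2.0])
b = np.sum(x**2 * v)

# Central difference for f'(0).
step = 1e-4
slope_at_zero = (f(step, x, v) - f(-step, x, v)) / (2.0 * step)

# Second differences on a grid of positive z as a convexity probe.
d = 1e-3
grid = np.linspace(0.01, 5.0, 200)
second_diffs = np.array([(f(z + d, x, v) - 2.0 * f(z, x, v) + f(z - d, x, v)) / d**2
                         for z in grid])

# Secant slopes (f(z) - 1) / z, which the Mean Value argument bounds below by b.
secants = np.array([(f(z, x, v) - 1.0) / z for z in grid])
```

A finite-difference probe of this kind only indicates the behaviour on the sampled grid; the inequalities themselves rest on the argument in the proof.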

Conclusions
In this paper, we first generalize Parry's Theorem to general nonnegative matrices. This can be treated as a lower bound estimate for the dominant eigenvalue of a nonnegative matrix. Second, we use the generalized Parry's Theorem to obtain a nontrivial lower bound for $\lambda_N(A + V) - \lambda_N(A)$, provided that $A \ge 0$ is symmetric and $V \ge 0$ is a diagonal matrix. The bound is optimal but implicit, and it can be applied when $\lambda_N(A)$ and