Article

Eigenvalue Estimates Using the Kolmogorov-Sinai Entropy

by
Shih-Feng Shieh
Department of Mathematics, National Taiwan Normal University, 88 SEC. 4, Ting Chou Road, Taipei 11677, Taiwan
Entropy 2011, 13(12), 2036-2048; https://doi.org/10.3390/e13122036
Submission received: 31 October 2011 / Revised: 28 November 2011 / Accepted: 12 December 2011 / Published: 20 December 2011
(This article belongs to the Special Issue Concepts of Entropy and Their Applications)

Abstract

The scope of this paper is twofold. First, we use the Kolmogorov-Sinai entropy to estimate lower bounds for the dominant eigenvalues of nonnegative matrices; the resulting lower bound is better than the Rayleigh quotient. Second, we use this estimate to give a nontrivial lower bound for the gap between the dominant eigenvalues of $A$ and $A + V$.

1. Introduction

The main concern of this paper is to relate eigenvalue estimates to the Kolmogorov-Sinai entropy of Markov shifts. We begin with the definition of the Kolmogorov-Sinai entropy. Let $A = (a_{ij}) \in \mathbb{R}^{N \times N}$ be an irreducible nonnegative matrix. By an irreducible matrix $A$ we mean that for each $1 \le i, j \le N$ there exists a positive integer $k$ such that $(A^k)_{ij} \ne 0$. A matrix $P = (p_{ij}) \in \mathbb{R}^{N \times N}$ is said to be a stochastic matrix compatible with $A$ if $P$ satisfies
• $0 < p_{ij} \le 1$ if $a_{ij} > 0$,
• $p_{ij} = 0$ if $a_{ij} = 0$,
• $\sum_{j=1}^{N} p_{ij} = 1$ for all $i = 1, \dots, N$.
We denote by $\mathcal{P}_A$ the set of all stochastic matrices compatible with $A$. By the Perron-Frobenius Theorem, it is easily seen that every stochastic matrix $P \in \mathcal{P}_A$ has a unique left eigenvector $q > 0$ corresponding to the eigenvalue 1 with $\sum_{i=1}^{N} q_i = 1$; we say $q$ is the stationary probability vector associated with $P$. For a transition matrix $A$, i.e., $a_{ij} = 1$ or $0$ for each $1 \le i, j \le N$, the subshift of finite type generated by $A$ is defined by
$\Sigma_A = \{\, \mathbf{i} = (i_0, i_1, \dots) \mid i_j \in \{1, \dots, N\},\ a_{i_j, i_{j+1}} = 1,\ j = 0, 1, 2, \dots \,\}$
and the shift map on $\Sigma_A$ is defined by $\sigma_A(i_0, i_1, \dots) = (i_1, i_2, \dots)$. A cylinder of $\Sigma_A$ is the set
$C_{j_0, j_1, \dots, j_n} = \{\, \mathbf{i} \in \Sigma_A \mid i_0 = j_0, \dots, i_n = j_n \,\}$
for any $n \ge 0$. Disjoint unions of cylinders form an algebra which generates the Borel $\sigma$-algebra of $\Sigma_A$. For any $P \in \mathcal{P}_A$ and its associated stationary probability vector $q$, the Markov measure of a cylinder may then be defined by
$\mu_{P,q}(C_{j_0, j_1, \dots, j_n}) = q_{j_0}\, p_{j_0, j_1} \cdots p_{j_{n-1}, j_n}$
Here $\mu_{P,q}$ is an invariant measure under the shift map $\sigma_A$ (see e.g., [8]). The Kolmogorov-Sinai entropy (also called the measure-theoretic entropy) of $\sigma_A$ under the invariant measure $\mu_{P,q}$ is defined by
$h_{P,q}(\sigma_A) = \lim_{n \to \infty} \frac{1}{n} \sum_{j_0, j_1, \dots, j_n} H\bigl(\mu_{P,q}(C_{j_0, j_1, \dots, j_n})\bigr)$
where $H(x) = -x \log x$ and the convention $0 \log 0 = 0$ is adopted. The notion of the Kolmogorov-Sinai entropy was first studied by Kolmogorov in 1958 on problems arising from information theory and the dimension of function spaces; it measures the uncertainty of a dynamical system (see e.g., [6,7]). It is shown in [8] (p. 221) that
$h_{P,q}(\sigma_A) = -\sum_{ij} q_i p_{ij} \log p_{ij}$ (1)
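As a concrete illustration of formula (1) (our own sketch, not part of the paper), the following pure-Python snippet computes $h_{P,q}(\sigma_A)$ for the standard golden-mean shift, whose transition matrix is $A = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}$; the stationary vector $q$ is approximated by left power iteration:

```python
import math

def stationary(P, iters=2000):
    """Approximate the stationary probability vector q with q^T P = q^T
    by repeated left-multiplication (power iteration)."""
    n = len(P)
    q = [1.0 / n] * n
    for _ in range(iters):
        q = [sum(q[i] * P[i][j] for i in range(n)) for j in range(n)]
    return q

def ks_entropy(P, q):
    """h_{P,q}(sigma_A) = -sum_{ij} q_i p_ij log p_ij, with 0 log 0 = 0."""
    return -sum(q[i] * p * math.log(p)
                for i, row in enumerate(P) for p in row if p > 0)

# A stochastic matrix compatible with the golden-mean transition matrix
# A = [[1, 1], [1, 0]] (an arbitrary choice of P in P_A):
P = [[0.5, 0.5], [1.0, 0.0]]
q = stationary(P)          # converges to (2/3, 1/3)
h = ks_entropy(P, q)       # (2/3) log 2
```

By Parry's Theorem (stated next), this entropy cannot exceed $\log \lambda_N(A)$, which for the golden-mean shift is the logarithm of the golden ratio.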
where the summation in (1) is taken over all $i, j$ with $a_{ij} = 1$. On the other hand, it is shown by Parry [9] (Theorems 6 and 7) that the Kolmogorov-Sinai entropy of $\sigma_A$ has the upper bound $\log \lambda_N(A)$.
Theorem 1.1 (Parry’s Theorem).
Let $A$ be an $N \times N$ irreducible transition matrix. Then for any $P \in \mathcal{P}_A$ and its associated stationary probability vector $q$, we have
$h_{P,q}(\sigma_A) \le \log \lambda_N(A)$ (2)
where $\lambda_N(A)$ denotes the dominant eigenvalue of $A$. Moreover, if $A$ is regular ($A^n > 0$ for some $n > 0$), the equality in (2) holds for some unique $P \in \mathcal{P}_A$ and $q$ the stationary probability vector associated with $P$.
Parry’s Theorem shows that the Kolmogorov-Sinai entropy of a Markov shift is less than or equal to its topological entropy (that is, $\log \lambda_N(A)$), and that, provided the shift is topologically mixing, exactly one of the Markov measures on $\Sigma_A$ maximizes the Kolmogorov-Sinai entropy of $\sigma_A$. This is also a crucial lemma for the Variational Property of Entropy [8] (Proposition 8.1) in ergodic theory. From the viewpoint of eigenvalue problems, however, the combination of (1) and (2) gives a lower bound for the dominant eigenvalue of the transition matrix $A$. In this paper, we generalize Parry’s Theorem to general $N \times N$ irreducible nonnegative matrices. Toward this end, we extend the entropy to irreducible nonnegative matrices by
$h_{P,q,A} = -\sum_{ij} q_i p_{ij} \log \frac{p_{ij}}{a_{ij}}$
For a transition matrix $A$, it is easily seen that $h_{P,q,A} = h_{P,q}(\sigma_A)$.
Theorem 1.2 (Main Result 1: The Generalized Parry’s Theorem).
Let $A \in \mathbb{R}^{N \times N}$ be an irreducible nonnegative matrix. Let $P \in \mathcal{P}_A$ and let $q$ be the stationary probability vector associated with $P$. Then we have
$h_{P,q,A} \le \log \lambda_N(A)$ (3)
where the summation in the definition of $h_{P,q,A}$ is taken over all $i, j$ with $a_{ij} > 0$. Moreover, the equality in (3) holds when
$P = \frac{1}{\lambda_N(A)} \operatorname{diag}(x)^{-1} A \operatorname{diag}(x)$
and
$q = \frac{y \circ x}{y^\top x}$
where $x > 0$ and $y > 0$ are, respectively, the right and left eigenvectors of $A$ corresponding to the eigenvalue $\lambda_N(A)$. Here, $\operatorname{diag}(x)$ denotes the diagonal matrix with $x$ on its diagonal, $y \circ x$ denotes the vector $(y_1 x_1, \dots, y_N x_N)$, and $y^\top$ denotes the transpose of the column vector $y$.
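Numerically, the maximizing pair $(P, q)$ of Theorem 1.2 can be formed directly from the Perron eigenvectors. The sketch below (our own illustration on a small symmetric matrix, where the left and right eigenvectors coincide) verifies that the generalized entropy then attains $\log \lambda_N(A)$:

```python
import math

def perron(A, iters=4000):
    """Dominant eigenvalue and positive eigenvector by power iteration."""
    n = len(A)
    x = [1.0] * n
    lam = 1.0
    for _ in range(iters):
        z = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        lam = max(z)
        x = [v / lam for v in z]
    return lam, x

A = [[1.0, 2.0], [2.0, 0.0]]   # symmetric, so the left eigenvector y equals x
lam, x = perron(A)             # lam = (1 + sqrt(17)) / 2
y = x
# Theorem 1.2's maximizing stochastic matrix and stationary vector:
P = [[A[i][j] * x[j] / (lam * x[i]) for j in range(2)] for i in range(2)]
s = sum(y[i] * x[i] for i in range(2))
q = [y[i] * x[i] / s for i in range(2)]
# Generalized entropy h_{P,q,A} = -sum q_i p_ij log(p_ij / a_ij):
h = -sum(q[i] * P[i][j] * math.log(P[i][j] / A[i][j])
         for i in range(2) for j in range(2) if A[i][j] > 0)
```

For this choice, $h = \log \lambda_N(A)$ up to floating-point error, matching the equality case of (3).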
Lower bound estimates for the dominant eigenvalue of a symmetric irreducible nonnegative matrix play an important role in various fields, e.g., the complexity of a symbolic dynamical system [5], the synchronization problem of coupled systems [10], or the ground state estimates of Schrödinger operators [2]. A usual way to estimate the lower bound for $\lambda_N(A)$ is the Rayleigh quotient
$\lambda_N(A) \ge \frac{x^\top A x}{x^\top x}$
It is also well known (see e.g., [4] (Theorem 8.1.26)) that
$\min_{1 \le i \le N} \frac{1}{x_i} \sum_{j=1}^{N} a_{ij} x_j \le \lambda_N(A) \le \max_{1 \le i \le N} \frac{1}{x_i} \sum_{j=1}^{N} a_{ij} x_j$ (4)
provided that $A \in \mathbb{R}^{N \times N}$ is nonnegative and $x \in \mathbb{R}^N$ is positive. Comparing the lower bound estimate (3) with (4) as well as with the Rayleigh quotient, we have the following result.
Corollary 1.3.
Let $A \in \mathbb{R}^{N \times N}$ be a symmetric, irreducible nonnegative matrix, and suppose $x \in \mathbb{R}^N$ is positive. Then the matrix $P = \operatorname{diag}(Ax)^{-1} A \operatorname{diag}(x)$ is in $\mathcal{P}_A$ and $q = \frac{x \circ (Ax)}{x^\top A x}$ is the stationary probability vector associated with $P$. In addition,
$h_{P,q,A} \ge \log \min_{1 \le i \le N} \frac{1}{x_i} \sum_{j=1}^{N} a_{ij} x_j$
and
$h_{P,q,A} \ge \log \frac{x^\top A x}{x^\top x}$
Here, each equality holds if and only if $x$ is the eigenvector of $A$ corresponding to the eigenvalue $\lambda_N(A)$.
We remark that for an arbitrary irreducible nonnegative matrix $A$, the entropy $h_{P,q,A}$ involves the left eigenvector $q$ of $P$; hence, the lower bound estimate (3) is merely a formal expression. However, for a symmetric irreducible nonnegative matrix $A$ and $P$ chosen as in Corollary 1.3, the vector $q$ can be expressed explicitly, and therefore $h_{P,q,A}$ can be written in an explicit form. We shall further show in Proposition 2.6 that $h_{P,q,A} = -\frac{1}{x^\top y} \sum_{i=1}^{N} x_i y_i \log \frac{x_i}{y_i}$, where $y = Ax$.
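To illustrate Corollary 1.3 and the explicit formula above (a numerical sketch of our own; the matrix and the trial vector are arbitrary choices), one can check that the entropy $h_{P,q,A}$ computed from a non-eigenvector $x$ already beats the logarithm of the Rayleigh quotient while staying below $\log \lambda_N(A)$:

```python
import math

# Symmetric irreducible nonnegative matrix (hypothetical example);
# its dominant eigenvalue is (1 + sqrt(17)) / 2.
A = [[1.0, 2.0], [2.0, 0.0]]
x = [1.0, 0.6]                    # arbitrary positive trial vector
y = [sum(A[i][j] * x[j] for j in range(2)) for i in range(2)]  # y = A x
xty = sum(x[i] * y[i] for i in range(2))                       # x^T A x

# Explicit entropy from Proposition 2.6:
h = -sum(x[i] * y[i] * math.log(x[i] / y[i]) for i in range(2)) / xty

# The two lower bounds of Corollary 1.3:
rayleigh = math.log(xty / sum(xi * xi for xi in x))     # log(x^T A x / x^T x)
row_min = math.log(min(y[i] / x[i] for i in range(2)))  # log min_i (Ax)_i / x_i
```

Here $h \approx 0.935$ sits between the Rayleigh bound $\log(x^\top A x / x^\top x) \approx 0.916$ and $\log \lambda_N(A) \approx 0.941$, so the entropy bound is the sharper of the two.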
Consider a symmetric nonnegative matrix $A$ and its perturbation $A + V$. It is easily seen that $\lambda_N(A+V) - \lambda_N(A) \ge x^\top V x$, where $x$ is the normalized eigenvector of $A$ corresponding to $\lambda_N(A)$. This gives a trivial lower bound for the gap between $\lambda_N(A+V)$ and $\lambda_N(A)$. Upper bound estimates for the gap are well studied in perturbation theory [4,11]. By considering $A + V$ as a low-rank perturbation of $A$, the interlacing structure of the eigenvalues of $A + V$ and of $A$ is studied in [1,3]. In the second result of this paper, we give a nontrivial lower bound for $\lambda_N(A+V) - \lambda_N(A)$.
Theorem 1.4 (Main Result 2).
Let $A \in \mathbb{R}^{N \times N}$ be an irreducible nonnegative matrix and let $x > 0$ be the eigenvector of $A$ corresponding to $\lambda_N(A)$ with $\|x\|_2 = 1$. Suppose $A$ is symmetric. Then for any nonnegative $V = \operatorname{diag}(v_1, \dots, v_N)$, we have
$\lambda_N(A+V) - \lambda_N(A) \ge \frac{f(1/\lambda_N(A)) - 1}{1/\lambda_N(A)}$ (5)
where
$f(z) = \prod_{i=1}^{N} (1 + v_i z)^{\frac{(1 + v_i z) x_i^2}{\sum_{j=1}^{N} (1 + v_j z) x_j^2}}$
Here $(f(1/\lambda_N(A)) - 1)\,\lambda_N(A) \ge x^\top V x$. Furthermore, the equality in (5) holds if and only if $v_1 = \dots = v_N$.
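As a sanity check of Theorem 1.4 on a small example (our own illustration; the matrix and the perturbation are arbitrary choices), the following snippet compares the actual gap, the bound $(f(1/\lambda_N) - 1)\lambda_N$ of (5), and the trivial bound $x^\top V x$:

```python
import math

def dominant(M, iters=5000):
    """Dominant eigenvalue and unit-norm eigenvector of a
    symmetric nonnegative matrix via power iteration."""
    n = len(M)
    x = [1.0] * n
    lam = 1.0
    for _ in range(iters):
        y = [sum(M[i][j] * x[j] for j in range(n)) for i in range(n)]
        lam = max(y)
        x = [v / lam for v in y]
    nrm = math.sqrt(sum(v * v for v in x))
    return lam, [v / nrm for v in x]

A = [[1.0, 2.0], [2.0, 0.0]]
v = [0.5, 0.1]                      # diagonal of the perturbation V >= 0
lam, x = dominant(A)

def f(z):
    """f(z) = prod_i (1 + v_i z)^{(1 + v_i z) x_i^2 / sum_j (1 + v_j z) x_j^2}"""
    s = sum((1 + v[j] * z) * x[j] ** 2 for j in range(len(v)))
    return math.prod((1 + v[i] * z) ** ((1 + v[i] * z) * x[i] ** 2 / s)
                     for i in range(len(v)))

bound = (f(1 / lam) - 1) * lam                     # lower bound of Theorem 1.4
trivial = sum(v[i] * x[i] ** 2 for i in range(2))  # x^T V x
AV = [[A[i][j] + (v[i] if i == j else 0.0) for j in range(2)] for i in range(2)]
gap = dominant(AV)[0] - lam                        # lambda_N(A+V) - lambda_N(A)
```

Since $v_1 \ne v_2$ here, both inequalities are strict: the trivial bound is below the entropy bound, which in turn is below the true gap.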
This paper is organized as follows. In Section 2, we prove the generalized Parry’s Theorem in three steps. First, we prove the case in which the matrix $A$ has only integer entries. Next we show that Theorem 1.2 is true for nonnegative matrices with rational entries. Finally we show that it holds true for all irreducible nonnegative matrices. The proof of Corollary 1.3 is given at the end of this section. In Section 3, we give the proof of Theorem 1.4. We conclude this paper in Section 4.
Throughout this paper, we use boldface letters (or symbols) to denote matrices (or vectors). For $u, v \in \mathbb{R}^N$, the Hadamard product of $u$ and $v$ is their elementwise product, denoted by $u \circ v = (u_i v_i)_{1 \le i \le N}$. The notation $\operatorname{diag}(u)$ denotes the $N \times N$ diagonal matrix with $u$ on its diagonal. A matrix $A = (a_{ij}) \in \mathbb{R}^{N \times N}$ is said to be a transition matrix if $a_{ij} = 1$ or $0$ for all $1 \le i, j \le N$. The eigenvalues of a nonnegative matrix $A$ are ordered as $\lambda_1(A) \le \dots \le \lambda_N(A)$, so that $\lambda_N(A)$ denotes its dominant eigenvalue.

2. Proof of the Generalized Parry’s Theorem and Corollary 1.3

In this section, we shall prove the generalized Parry’s Theorem and Corollary 1.3. To prove inequality (3), we proceed in three steps.

Step 1: Inequality (3) is true for all irreducible nonnegative matrices with integer entries.

Let $A$ be an irreducible nonnegative matrix with integer entries. To apply Parry’s Theorem, we shall construct a transition matrix $\bar{A}$ corresponding to $A$ for which $\lambda_{\bar{N}}(\bar{A}) = \lambda_N(A)^{1/2}$. To this end, we define the index sets
$I = \{1, \dots, N\}$, $E = \{\, \overrightarrow{ij}^{(k)} \mid a_{ij} \ne 0,\ 1 \le k \le a_{ij} \,\}$
Let $\tilde{N} = \sum_{i,j=1}^{N} a_{ij} = \#E$ and $\bar{N} = N + \tilde{N}$. The transition matrix $\bar{A} \in \mathbb{R}^{\bar{N} \times \bar{N}}$ corresponding to $A$, with index set $I \cup E$, is defined as follows:
$\bar{a}_{i, \overrightarrow{ij}^{(k)}} = 1$ for all $1 \le k \le a_{ij}$ if $a_{ij} \ne 0$, (6a)
$\bar{a}_{\overrightarrow{ij}^{(k)}, j} = 1$ for all $1 \le k \le a_{ij}$ if $a_{ij} \ne 0$, (6b)
all remaining entries are set to zero. (6c)
It is easily seen that $\bar{A}$ can be written in the block form
$\bar{A} = \begin{pmatrix} 0_{N \times N} & \bar{A}_{IE} \\ \bar{A}_{EI} & 0_{\tilde{N} \times \tilde{N}} \end{pmatrix}$ (7)
where $0_{N \times N}$ and $0_{\tilde{N} \times \tilde{N}}$ are, respectively, the zero matrices in $\mathbb{R}^{N \times N}$ and $\mathbb{R}^{\tilde{N} \times \tilde{N}}$, $\bar{A}_{IE} \in \mathbb{R}^{N \times \tilde{N}}$, and $\bar{A}_{EI} \in \mathbb{R}^{\tilde{N} \times N}$.
Proposition 2.1.
$\lambda_{\bar{N}}(\bar{A}) = \lambda_N(A)^{1/2}$.
Proof.
From (7), we see that
$\bar{A}^2 = \begin{pmatrix} \bar{A}_{IE} \bar{A}_{EI} & 0_{N \times \tilde{N}} \\ 0_{\tilde{N} \times N} & \bar{A}_{EI} \bar{A}_{IE} \end{pmatrix}$
From (6a) and (6b), for each $i, j$ with $a_{ij} \ne 0$, we have
$\sum_{k=1}^{a_{ij}} \bar{a}_{i, \overrightarrow{ij}^{(k)}}\, \bar{a}_{\overrightarrow{ij}^{(k)}, j} = a_{ij}$ (8)
Using (8), together with (6c), we have
$(\bar{A}_{IE} \bar{A}_{EI})_{ij} = \sum_{\alpha \in E} \bar{a}_{i\alpha} \bar{a}_{\alpha j} = \begin{cases} \sum_k \bar{a}_{i, \overrightarrow{ij}^{(k)}}\, \bar{a}_{\overrightarrow{ij}^{(k)}, j} = a_{ij} & \text{if } a_{ij} \ne 0 \\ 0 = a_{ij} & \text{if } a_{ij} = 0 \end{cases}$ (9)
From (9) we see that $\bar{A}_{IE} \bar{A}_{EI} = A$. Hence $\lambda_{\bar{N}}(\bar{A}^2) = \lambda_N(\bar{A}_{IE} \bar{A}_{EI}) = \lambda_{\tilde{N}}(\bar{A}_{EI} \bar{A}_{IE}) = \lambda_N(A)$. On the other hand, $\bar{A}$ is a nonnegative matrix, so by the Perron-Frobenius Theorem its dominant eigenvalue is nonnegative. The assertion follows. ☐
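The construction of $\bar{A}$ can be carried out mechanically. The following sketch (our own illustration) builds $\bar{A}$ for the example $A = \begin{pmatrix} 1 & 2 \\ 1 & 0 \end{pmatrix}$ of Remark 2.1 below and verifies that the upper-left block of $\bar{A}^2$ recovers $A$, as in Equation (9):

```python
# Build the transition matrix A-bar of Step 1 for a small integer matrix A.
A = [[1, 2], [1, 0]]
N = len(A)
# One edge-vertex ij->(k) per unit of a_ij, as in the definition of E.
edges = [(i, j, k) for i in range(N) for j in range(N) for k in range(A[i][j])]
Nbar = N + len(edges)
Abar = [[0] * Nbar for _ in range(Nbar)]
for e, (i, j, k) in enumerate(edges):
    Abar[i][N + e] = 1   # rule (6a): vertex i -> edge-vertex ij->(k)
    Abar[N + e][j] = 1   # rule (6b): edge-vertex ij->(k) -> vertex j
# The upper-left N x N block of Abar^2 should equal A:
block = [[sum(Abar[i][a] * Abar[a][j] for a in range(Nbar)) for j in range(N)]
         for i in range(N)]
```

Here $\tilde{N} = 1 + 2 + 1 = 4$ edge-vertices are added, so $\bar{N} = 6$, and Proposition 2.1 then gives $\lambda_{\bar{N}}(\bar{A}) = \lambda_N(A)^{1/2} = \sqrt{2}$ for this example.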
Remark 2.1. In the language of graph theory, $a_{ij}$ represents the number of directed edges from vertex $i$ to vertex $j$. Hence $\sum_{ij} (A^n)_{ij}$ equals the number of all possible routes of length $n+1$, i.e.,
$\#\{\text{all possible routes of length } n+1\} = \sum_{ij} (A^n)_{ij} = O(\lambda_N(A)^n)$
For the construction of $\bar{A}$, we add an additional vertex on every edge from vertex $i$ to vertex $j$ (see Figure 1 for an illustration). Hence, each route that obeys the rule defined by $A$,
$(i_1, i_2, \dots, i_j, i_{j+1}, \dots, i_{n-1}, i_n)$, provided $a_{i_j i_{j+1}} > 0$ for all $j = 1, \dots, n-1$ (10)
now becomes one of the following routes according to the rule defined by $\bar{A}$:
$(i_1, \overrightarrow{i_1 i_2}^{(k_1)}, i_2, \dots, i_j, \overrightarrow{i_j i_{j+1}}^{(k_j)}, i_{j+1}, \dots, i_{n-1}, \overrightarrow{i_{n-1} i_n}^{(k_{n-1})}, i_n)$ (11)
where $1 \le k_j \le a_{i_j i_{j+1}}$, $j = 1, \dots, n-1$. A route of the form (11) is equivalent to one of the form (10), but its length is doubled. Hence $O(\lambda_{\bar{N}}(\bar{A})^{2n}) = O(\lambda_N(A)^n)$.
Figure 1. Illustration for Remark 2.1 with the example $A = \begin{pmatrix} 1 & 2 \\ 1 & 0 \end{pmatrix}$.
Now, let $P \in \mathcal{P}_A$ be given and let $q$ be its associated stationary probability vector. We shall accordingly define a stochastic matrix $\bar{P} \in \mathcal{P}_{\bar{A}}$ and its associated stationary probability vector $\bar{q}$. The stochastic matrix $\bar{P}$ is defined as follows:
$\bar{p}_{i, \overrightarrow{ij}^{(k)}} = \frac{p_{ij}}{a_{ij}}$ for all $1 \le k \le a_{ij}$, provided $a_{ij} > 0$, (12a)
$\bar{p}_{\overrightarrow{ij}^{(k)}, j} = 1$ for all $1 \le k \le a_{ij}$, provided $a_{ij} > 0$, (12b)
all remaining entries are set to zero. (12c)
From (6) and (12), it is easily seen that $\bar{P}$ is a stochastic matrix compatible with $\bar{A}$. Let the vector $\bar{q} \in \mathbb{R}^{N + \tilde{N}}$ be defined by
$\bar{q}_i = \frac{q_i}{2}, \quad 1 \le i \le N$ (13a)
and
$\bar{q}_{\overrightarrow{ij}^{(k)}} = \frac{q_i p_{ij}}{2 a_{ij}}$, for all $1 \le k \le a_{ij}$ with $a_{ij} > 0$ (13b)
Proposition 2.2.
$\bar{q}$ is the stationary probability vector associated with $\bar{P}$.
Proof.
We first show that $\bar{q}$ is a left eigenvector of $\bar{P}$ with corresponding eigenvalue 1. For any $1 \le j \le N$, using (12b), (13b), and the fact that $q^\top P = q^\top$, we have
$(\bar{q}^\top \bar{P})_j = \sum_{i,k} \bar{q}_{\overrightarrow{ij}^{(k)}}\, \bar{p}_{\overrightarrow{ij}^{(k)}, j} = \sum_{i,\, a_{ij} > 0} \sum_{k=1}^{a_{ij}} \frac{1}{2} \frac{q_i p_{ij}}{a_{ij}} \cdot 1 = \sum_i \frac{1}{2} q_i p_{ij} = \frac{1}{2} q_j = \bar{q}_j$ (14)
On the other hand, using (12a) and (13a), for all $\overrightarrow{ij}^{(k)}$ with $a_{ij} > 0$ and $1 \le k \le a_{ij}$, we have
$(\bar{q}^\top \bar{P})_{\overrightarrow{ij}^{(k)}} = \bar{q}_i\, \bar{p}_{i, \overrightarrow{ij}^{(k)}} = \frac{1}{2} \frac{q_i p_{ij}}{a_{ij}} = \bar{q}_{\overrightarrow{ij}^{(k)}}$
In (14) and the identity above, we have proved $\bar{q}^\top \bar{P} = \bar{q}^\top$. Now we show that the total sum of the entries of $\bar{q}$ is 1. Using the fact that
$\sum_{ij} \sum_{k=1}^{a_{ij}} \bar{q}_{\overrightarrow{ij}^{(k)}} = \sum_{ij} \sum_{k=1}^{a_{ij}} \frac{q_i p_{ij}}{2 a_{ij}} = \sum_{ij} \frac{1}{2} q_i p_{ij} = \frac{1}{2} \sum_i q_i$
we conclude that
$\sum_{\alpha \in I \cup E} \bar{q}_\alpha = \sum_i \bar{q}_i + \sum_{ij} \sum_{k=1}^{a_{ij}} \bar{q}_{\overrightarrow{ij}^{(k)}} = \frac{1}{2} \sum_i q_i + \frac{1}{2} \sum_i q_i = 1$
The proof is complete. ☐
From the construction of the transition matrix $\bar{A}$, it is easily seen that $\bar{A}$ is irreducible. In (12) and Proposition 2.2, we showed that $\bar{P} \in \mathcal{P}_{\bar{A}}$ and that the vector $\bar{q}$ defined by (13) is its associated stationary probability vector. Hence the Kolmogorov-Sinai entropy $h_{\bar{P},\bar{q}}(\sigma_{\bar{A}})$ is well defined. We now give the relationship between $h_{\bar{P},\bar{q}}(\sigma_{\bar{A}})$ and the generalized entropy $h_{P,q,A}$ defined in Section 1.
Proposition 2.3.
$h_{\bar{P},\bar{q}}(\sigma_{\bar{A}}) = \frac{1}{2} h_{P,q,A}$
Proof.
We note that by (12b), $\log \bar{p}_{\overrightarrow{ij}^{(k)}, j} = 0$ if $a_{ij} > 0$. Using the definitions of $\bar{P}$ and $\bar{q}$ in (12) and (13), as well as the entropy formula (1), we have
$h_{\bar{P},\bar{q}}(\sigma_{\bar{A}}) = -\sum_{ij,\, a_{ij} > 0} \sum_{k=1}^{a_{ij}} \bar{q}_i\, \bar{p}_{i, \overrightarrow{ij}^{(k)}} \log \bar{p}_{i, \overrightarrow{ij}^{(k)}} = -\sum_{ij,\, a_{ij} > 0} \sum_{k=1}^{a_{ij}} \frac{1}{2} \frac{q_i p_{ij}}{a_{ij}} \log \frac{p_{ij}}{a_{ij}} = -\sum_{ij,\, a_{ij} > 0} \frac{1}{2} q_i p_{ij} \log \frac{p_{ij}}{a_{ij}} = \frac{1}{2} h_{P,q,A}$
The proof is complete. ☐
Using Propositions 2.1 and 2.3 together with Parry’s Theorem 1.1, it follows that
$\frac{1}{2} h_{P,q,A} = h_{\bar{P},\bar{q}}(\sigma_{\bar{A}}) \le \log \lambda_{\bar{N}}(\bar{A}) = \frac{1}{2} \log \lambda_N(A)$ (15)

Step 2: Inequality (3) is true for all irreducible nonnegative matrices with rational entries.

Any $N \times N$ nonnegative matrix with rational entries can be written as $A/n$, where $A$ is a nonnegative matrix with integer entries and $n$ is a positive integer. Suppose $A$ is irreducible and $P \in \mathcal{P}_{A/n}$. Note that $\mathcal{P}_{A/n} = \mathcal{P}_A$. Letting $q$ be the stationary probability vector associated with $P$, inequality (3) for $A/n$ follows from the following proposition.
Proposition 2.4.
$h_{P,q,A/n} \le \log \lambda_N(A/n)$
Proof.
From the definition of $h_{P,q,A/n}$, we see that
$h_{P,q,A/n} = -\sum_{ij,\, a_{ij} > 0} q_i p_{ij} \log \frac{p_{ij}\, n}{a_{ij}} = -\sum_{ij,\, a_{ij} > 0} q_i p_{ij} \log \frac{p_{ij}}{a_{ij}} - \sum_{ij} q_i p_{ij} \log n = h_{P,q,A} - \sum_{ij} q_i p_{ij} \log n$ (16)
On the other hand, since $q^\top P = q^\top$ and $\sum_i q_i = 1$, we have
$\sum_{ij} q_i p_{ij} \log n = \log n$ (17)
Substituting (17) into (16) and using the result (15) of Step 1, we have
$h_{P,q,A/n} = h_{P,q,A} - \log n \le \log \lambda_N(A) - \log n = \log \lambda_N(A/n)$
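The scaling identity $h_{P,q,A/n} = h_{P,q,A} - \log n$ behind (16)-(17) is easy to verify numerically. The following sketch (our own example, with an arbitrary choice of $P \in \mathcal{P}_A$) checks it for $n = 3$:

```python
import math

A = [[1.0, 2.0], [2.0, 0.0]]
n = 3
P = [[0.4, 0.6], [1.0, 0.0]]   # a stochastic matrix compatible with A (and A/n)
# Stationary vector of P: solving q = q P with q_1 + q_2 = 1 gives:
q = [0.625, 0.375]

def entropy(P, q, A):
    """h_{P,q,A} = -sum q_i p_ij log(p_ij / a_ij) over entries with a_ij > 0."""
    N = len(A)
    return -sum(q[i] * P[i][j] * math.log(P[i][j] / A[i][j])
                for i in range(N) for j in range(N) if A[i][j] > 0)

h_A = entropy(P, q, A)
h_An = entropy(P, q, [[a / n for a in row] for row in A])  # entropy w.r.t. A/n
```

The difference `h_A - h_An` equals $\log 3$ exactly, since $\sum_{ij} q_i p_{ij} = 1$.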

Step 3: Inequality (3) is true for all irreducible nonnegative matrices.

It remains to show (3) holds for all nonnegative $A$ with irrational entries. The assertion follows from Step 2 and the continuous dependence of eigenvalues with respect to the matrix.
Now, we give the proof of the second assertion of Theorem 1.2.
Proposition 2.5.
The equality in (3) holds when one chooses
$P = \frac{1}{\lambda_N(A)} \operatorname{diag}(x)^{-1} A \operatorname{diag}(x)$
and
$q = \frac{y \circ x}{y^\top x}$
where $x > 0$ and $y > 0$ are, respectively, the right and left eigenvectors of $A$ corresponding to the eigenvalue $\lambda_N(A)$.
Proof.
By normalizing so that $y^\top x = 1$, we may write
$p_{ij} = \frac{a_{ij} x_j}{\lambda_N(A)\, x_i} \quad \text{and} \quad q_i = x_i y_i$
To ease the notation, set $\lambda_N = \lambda_N(A)$. Using the facts $\sum_j a_{ij} x_j = \lambda_N x_i$ and $\sum_i y_i a_{ij} = \lambda_N y_j$, we have
$h_{P,q,A} = -\sum_{ij} x_i y_i \frac{a_{ij} x_j}{\lambda_N x_i} \log \frac{x_j}{\lambda_N x_i} = \sum_{ij} \frac{y_i}{\lambda_N} (a_{ij} x_j) \log(\lambda_N x_i) - \sum_{ij} \frac{x_j}{\lambda_N} (y_i a_{ij}) \log x_j = \sum_i y_i x_i \log(\lambda_N x_i) - \sum_j x_j y_j \log x_j = \sum_i x_i y_i \log \lambda_N = \log \lambda_N$
The proof of Theorem 1.2 is complete. ☐
In the following, we give the proof of Corollary 1.3. We first prove the following useful proposition. It will be used in Section 3 as well.
Proposition 2.6.
Let $A \in \mathbb{R}^{N \times N}$ be an irreducible nonnegative matrix. Suppose $A$ is symmetric and $x \in \mathbb{R}^N$ is positive. If $P = \operatorname{diag}(Ax)^{-1} A \operatorname{diag}(x)$ and $q = \frac{x \circ y}{x^\top y}$, where $y = Ax$, then
$h_{P,q,A} = -\frac{1}{x^\top y} \sum_{i=1}^{N} x_i y_i \log \frac{x_i}{y_i}$
As in Proposition 2.5, the matrix $P$ here is a stochastic matrix compatible with $A$ and $q$ is its associated stationary probability vector; hence the entropy $h_{P,q,A}$ is well defined. Now, we give the proof of this proposition.
Proof. Since $A \ge 0$ is irreducible and $x > 0$, it follows that $Ax > 0$, and hence $\operatorname{diag}(Ax)^{-1}$ is well defined. It is easily seen that $p_{ij} = 0$ if and only if $a_{ij} = 0$. Moreover, $Pe = \operatorname{diag}(Ax)^{-1}(Ax) = e$. This shows that $P \in \mathcal{P}_A$. On the other hand, since $A$ is symmetric, we see that $y^\top = x^\top A$. Hence
$q^\top P = (x \circ (Ax))^\top \operatorname{diag}(Ax)^{-1} A \operatorname{diag}(x) / (x^\top A x) = q^\top$
We have thus shown that $P \in \mathcal{P}_A$ and that $q$ is its stationary probability vector. By the definition of $h_{P,q,A}$, we have
$h_{P,q,A} = -\sum_{ij} \frac{a_{ij} x_i x_j}{x^\top y} \log \frac{x_j}{y_i} = \frac{1}{x^\top y} \Bigl( \sum_{i=1}^{N} x_i y_i \log y_i - \sum_{j=1}^{N} x_j y_j \log x_j \Bigr) = -\frac{1}{x^\top y} \sum_{i=1}^{N} x_i y_i \log \frac{x_i}{y_i}$
This completes the proof. ☐
Now, we are in a position to give the proof of Corollary 1.3.
Proof of Corollary 1.3.
For convenience, let $y = Ax$. Hence $q = \frac{x \circ y}{x^\top y}$ and $p_{ij} = \frac{a_{ij} x_j}{y_i}$. Using Proposition 2.6, we have
$h_{P,q,A} = -\frac{1}{x^\top y} \sum_{i=1}^{N} x_i y_i \log \frac{x_i}{y_i}$ (18)
$\ge -\log \frac{x^\top x}{x^\top y}$ (19)
$= \log \frac{x^\top A x}{x^\top x}$
Here inequality (19) follows from Jensen’s inequality (see e.g., [12] (Theorem 7.35)) applied to the convex function $-\log$ and the fact that $\sum_{i=1}^{N} \frac{x_i y_i}{x^\top y} = 1$. Similarly, using Proposition 2.6 and the monotonicity of $\log$, we also see that
$h_{P,q,A} = -\frac{1}{x^\top y} \sum_{i=1}^{N} x_i y_i \log \frac{x_i}{y_i} \ge \frac{1}{x^\top y} \sum_{i=1}^{N} x_i y_i \log \min_{1 \le i \le N} \frac{y_i}{x_i} = \log \min_{1 \le i \le N} \frac{y_i}{x_i}$ (20)
This proves the first assertion of Corollary 1.3. It is easily seen that if $x$ is an eigenvector corresponding to $\lambda_N(A)$, then the equalities in (19) and (20) both hold. From the assumption that $A \ge 0$ is irreducible and $x > 0$, it follows that $y > 0$ as well, so all $N$ terms appear in (18). Hence equality in (19) or in (20) holds only if the ratios $\frac{x_i}{y_i}$, $i = 1, \dots, N$, are all equal; that is, $y = Ax = \lambda x$ for some eigenvalue $\lambda$ of $A$. Since $x > 0$, the Perron-Frobenius Theorem gives $\lambda = \lambda_N(A)$. The proof is complete. ☐

3. Proof of Theorem 1.4

In this section, we shall give the proof of Theorem 1.4. We first prove (5).
Proposition 3.1.
Let $A$, $V$ and $x$ be as defined in Theorem 1.4. Then we have
$\lambda_N(A+V) - \lambda_N(A) \ge \frac{f(1/\lambda_N(A)) - 1}{1/\lambda_N(A)}$ (21)
where
$f(z) = \prod_{i=1}^{N} (1 + v_i z)^{\frac{(1 + v_i z) x_i^2}{\sum_{j=1}^{N} (1 + v_j z) x_j^2}}$
The equality holds in (21) if and only if $v_1 = \dots = v_N$.
Proof.
To ease the notation, we shall write $\lambda = \lambda_N(A)$. Let $y = (A+V)x = \lambda x + Vx$, $q = \frac{x \circ y}{x^\top y}$, and $P = \operatorname{diag}(y)^{-1}(A+V)\operatorname{diag}(x) \in \mathcal{P}_{A+V}$. From Theorem 1.2 and Proposition 2.6, we have
$\log \lambda_N(A+V) \ge h_{P,q,A+V} = \frac{1}{x^\top (A+V) x} \sum_{i=1}^{N} (\lambda + v_i) x_i^2 \log(\lambda + v_i)$ (22)
We note that
$\log \lambda_N(A) = \frac{1}{x^\top (A+V) x} \sum_{i=1}^{N} (\lambda + v_i) x_i^2 \log \lambda$ (23)
Subtracting (23) from (22), we have
$\log \frac{\lambda_N(A+V)}{\lambda_N(A)} \ge \frac{1}{\sum_{i=1}^{N} (1 + v_i/\lambda) x_i^2} \sum_{i=1}^{N} (1 + v_i/\lambda) x_i^2 \log(1 + v_i/\lambda) = \log f(1/\lambda)$
and hence,
$\frac{\lambda_N(A+V) - \lambda_N(A)}{\lambda_N(A)} \ge f(1/\lambda_N(A)) - 1$
This proves (21). Now we prove the second assertion of the proposition. It is easily seen that $v_1 = \dots = v_N$ implies that the equality in (21) holds. Conversely, suppose the equality in (21) holds; this is equivalent to equality in (22). Now, we write (22) in an alternative form:
$\frac{1}{x^\top (A+V) x} \sum_{i=1}^{N} (\lambda + v_i) x_i^2 \log(\lambda + v_i) \le \log \frac{1}{x^\top (A+V) x} \sum_{i=1}^{N} (\lambda + v_i)^2 x_i^2$ (24)
$= \log \frac{x^\top (A+V)^2 x}{x^\top (A+V) x} \le \log \lambda_N(A+V)$ (25)
Here (24) follows from the concavity of $\log$ and Jensen’s inequality. Hence, if the equality in (22) holds, then the equality in (25) also holds. This means $x$ is also an eigenvector of $A+V$. However, since $x > 0$ is the eigenvector of $A$ corresponding to $\lambda_N(A)$, we conclude that $v_1 = \dots = v_N$. This completes the proof. ☐
The following proposition can be obtained by a standard calculation.
Proposition 3.2.
Let $f$ be the real-valued function in Proposition 3.1. Then we have
$f'(z) = \Bigl[ \frac{b}{1+bz} + \frac{g(z)}{(1+bz)^2} \Bigr] f(z)$ (26a)
$f''(z) = \Bigl[ \frac{g'(z)}{(1+bz)^2} + \frac{g(z)^2}{(1+bz)^4} \Bigr] f(z)$ (26b)
where $b = \sum_{i=1}^{N} x_i^2 v_i$ and
$g(z) = \sum_{i=1}^{N} x_i^2 \sum_{j=1}^{N} x_j^2 (v_i - v_j) \log(1 + v_i z),$ (27a)
$g'(z) = \frac{1}{2} \sum_{i,j=1}^{N} x_i^2 x_j^2 (v_i - v_j)^2 \frac{1}{(1 + v_i z)(1 + v_j z)}.$ (27b)
In the following, we show that the lower bound estimate (5) for $\lambda_N(A+V) - \lambda_N(A)$ is greater than or equal to $x^\top V x$.
Proposition 3.3.
Let $f$ be the real-valued function in Proposition 3.1. Then we have
$\frac{f(1/\lambda_N(A)) - 1}{1/\lambda_N(A)} \ge x^\top V x$
Proof.
It is easily seen from the definition of $f(z)$ that $f(0) = 1$. Hence, by the Mean Value Theorem, there exists $\zeta \in (0, 1/\lambda_N(A))$ such that
$\frac{f(1/\lambda_N(A)) - 1}{1/\lambda_N(A)} = f'(\zeta).$ (28)
From (26a) and (27a), we see that $f'(0) = b = x^\top V x$. From (26b), (27a) and (27b), we also see that $f''(z) \ge 0$ for all $z \ge 0$. This implies
$f'(\zeta) \ge f'(0) = x^\top V x$ (29)
The assertion of the proposition follows directly from (28) and (29). ☐
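The facts about $f$ used above ($f(0) = 1$, $f'(0) = x^\top V x$, and convexity for $z \ge 0$) can be checked numerically by finite differences. The snippet below (our own sketch, with hypothetical values for $x$ and $V$) does so for a unit vector $x$:

```python
import math

x2 = [0.64, 0.36]   # squared entries of a unit vector x (hypothetical values)
v = [0.5, 0.1]      # diagonal entries of V (hypothetical values)

def f(z):
    """f(z) = prod_i (1 + v_i z)^{(1 + v_i z) x_i^2 / sum_j (1 + v_j z) x_j^2}"""
    s = sum((1 + v[j] * z) * x2[j] for j in range(2))
    return math.prod((1 + v[i] * z) ** ((1 + v[i] * z) * x2[i] / s)
                     for i in range(2))

b = sum(v[i] * x2[i] for i in range(2))   # b = x^T V x
eps = 1e-6
fd = (f(eps) - f(0.0)) / eps              # forward difference approximating f'(0)
# Midpoint convexity on the sample interval [0, 1]:
convex_ok = f(0.5) <= (f(0.0) + f(1.0)) / 2
```

Since $v_1 \ne v_2$ here, $g'(z) > 0$ and $f$ is strictly convex, so the finite-difference slope at larger $z$ exceeds $f'(0) = b$, which is exactly why the bound of Proposition 3.3 beats the trivial one.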

4. Conclusions

In this paper, we first generalize Parry’s Theorem to general nonnegative matrices. This can be regarded as a lower bound estimate for the dominant eigenvalue of a nonnegative matrix. Second, we use the generalized Parry’s Theorem to obtain a nontrivial lower bound for $\lambda_N(A+V) - \lambda_N(A)$, provided that $A \ge 0$ is symmetric and $V \ge 0$ is a diagonal matrix. The bound is optimal but implicit; it can be applied when $\lambda_N(A)$ and its corresponding eigenvector are known. As an interesting topic for future work, one may wish to derive an inequality similar to (3) for a general square matrix, or for a generalized eigenvalue problem $Ax = \lambda Bx$, rather than a nonnegative matrix eigenvalue problem.

References

1. Arbenz, P.; Golub, G.H. On the spectral decomposition of Hermitian matrices modified by low rank perturbations with applications. SIAM J. Matrix Anal. Appl. 1988, 9, 40–58. [Google Scholar] [CrossRef]
2. Chang, S.-M.; Lin, W.-W.; Shieh, S.-F. Gauss-Seidel-type methods for energy states of a multi-component Bose-Einstein condensate. J. Comput. Phys. 2005, 202, 367–390. [Google Scholar] [CrossRef]
3. Golub, G.H. Some modified matrix eigenvalue problems. SIAM Rev. 1973, 15, 318–334. [Google Scholar] [CrossRef]
4. Horn, R.A.; Johnson, C.R. Matrix Analysis; Cambridge University Press: Cambridge, UK, 1985. [Google Scholar]
5. Juang, J.; Shieh, S.-F.; Turyn, L. Cellular neural networks: Space-dependent template, mosaic patterns and spatial chaos. Internat. J. Bifur. Chaos Appl. Sci. Engrg. 2002, 12, 1717–1730. [Google Scholar] [CrossRef]
6. Kolmogorov, A.N. A new metric invariant of transitive dynamical systems and automorphisms of Lebesgue spaces. Dokl. Akad. Nauk SSSR 1958, 119, 861–864. [Google Scholar]
7. Kolmogorov, A.N. On the entropy per time unit as a metric invariant of auto-morphisms. Dokl. Akad. Nauk SSSR 1958, 21, 754–755. [Google Scholar]
8. Mañé, R. Ergodic Theory and Differentiable Dynamics; Springer-Verlag: Berlin, Germany, 1987. [Google Scholar]
9. Parry, W. Intrinsic Markov chains. Trans. Amer. Math. Soc. 1964, 112, 55–66. [Google Scholar] [CrossRef]
10. Shieh, S.F.; Wang, Y.Q.; Wei, G.W.; Lai, C.-H. Mathematical analysis of the wavelet method of chaos control. J. Math. Phys. 2006, 47, 082701. [Google Scholar] [CrossRef]
11. Stewart, G.W.; Sun, J.-G. Matrix Perturbation Theory; Academic Press: Boston, MA, USA, 1990. [Google Scholar]
12. Wheeden, R.L.; Zygmund, A. Measure and integral: An introduction to real analysis. In Monographs and Textbooks in Pure and Applied Mathematics; Marcel Dekker: New York, NY, USA, 1977. [Google Scholar]
