Abstract
The paper is a survey of recent results of the author on perturbations of matrices. A part of the results presented in the paper is new. In particular, we suggest a bound for the difference of the determinants of two matrices which refines the well-known Bhatia inequality. We also derive new estimates for the spectral variation of a perturbed matrix with respect to a given one, as well as estimates for the Hausdorff and matching distances between the spectra of two matrices. These estimates are formulated in terms of the entries of the matrices and via the so-called departure from normality. In appropriate situations they improve the well-known results. We also suggest a bound for the angular sectors containing the spectra of matrices. In addition, we suggest a new bound for the similarity condition numbers of diagonalizable matrices. The paper also contains a generalization of the famous Kahan inequality on perturbations of Hermitian matrices by non-normal matrices. Finally, taking into account that any matrix having more than one eigenvalue is similar to a block-diagonal matrix, we obtain a bound for the condition numbers in the case of non-diagonalizable matrices, and discuss applications of that bound to matrix functions and spectrum perturbations. The main methodology presented in the paper is based on a combined usage of recent norm estimates for matrix-valued functions with the traditional methods and results.
Keywords:
matrices; perturbations; spectral variation; Hausdorff distance between eigenvalues; matching distance between spectra
MSC:
15A15; 15A18; 15A42
1. Introduction
This paper is a survey of the recent results of the author on perturbations of the eigenvalues and determinants of matrices.
Finding the eigenvalues of a matrix is not always an easy task. In many cases it is easier to calculate the eigenvalues of a nearby matrix and then to obtain the information about the eigenvalues of the original matrix.
The perturbation theory of matrices has been developed in the works of R. Bhatia, C. Davis, L. Elsner, A.J. Hoffman, W. Kahan, T. Kato, L. Mirsky, A. Ostrowski, G.W. Stewart, J.G. Sun, H.W. Wielandt, and many other mathematicians.
To recall some basic results of perturbation theory, which will be discussed below, let us introduce the necessary notation.
Let $\mathbb{C}^n$ be the n-dimensional complex Euclidean space with a scalar product $(\cdot,\cdot)$, the norm $\|\cdot\| = \sqrt{(\cdot,\cdot)}$ and unit matrix I. $\mathbb{C}^{n\times n}$ denotes the set of complex $n\times n$-matrices. For an $A \in \mathbb{C}^{n\times n}$, $A^*$ is the adjoint matrix, $A^{-1}$ is the inverse one, $\|A\|$ is the spectral norm: $\|A\| = \sup_{x \ne 0} \|Ax\|/\|x\|$, $\lambda_k(A)$ ($k = 1, \dots, n$) are the eigenvalues of A taken with their multiplicities, $\sigma(A)$ is the spectrum, $R_\lambda(A) = (A - \lambda I)^{-1}$ ($\lambda \notin \sigma(A)$) is the resolvent, $\mathrm{Trace}\,A$ is the trace, $\det A$ is the determinant, $r_s(A)$ is the spectral radius, and $N_p(A)$ is the Schatten–von Neumann norm; in particular, $N_2(A)$ is the Hilbert–Schmidt (Frobenius) norm.
Let A and $\tilde A$ be $n\times n$-matrices whose eigenvalues, counted with their multiplicities, are $\lambda_k(A)$ and $\lambda_k(\tilde A)$, respectively. The following result is well-known:
$$|\det A - \det \tilde A| \le n M^{n-1} \|A - \tilde A\|, \qquad (1)$$
where $M = \max\{\|A\|, \|\tilde A\|\}$, cf. [1] (p. 107). The spectral norm is unitarily invariant, but often it is not easy to compute, especially if the matrix depends on many parameters. In Section 4 below we present a bound for $|\det A - \det \tilde A|$ in terms of the entries of the matrices in the standard basis. That bound can be directly calculated. Moreover, under some conditions our bound is sharper than (1).
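As a quick numerical illustration (not from the original text): inequality (1) is the Bhatia-type bound $|\det A - \det\tilde A| \le n M^{n-1}\|A - \tilde A\|$ with $M = \max\{\|A\|, \|\tilde A\|\}$, and it can be checked on random data with NumPy; the matrices below are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
A_tilde = A + 1e-3 * rng.standard_normal((n, n))  # a small perturbation of A

lhs = abs(np.linalg.det(A) - np.linalg.det(A_tilde))
M = max(np.linalg.norm(A, 2), np.linalg.norm(A_tilde, 2))  # spectral norms
rhs = n * M ** (n - 1) * np.linalg.norm(A - A_tilde, 2)
bound_holds = lhs <= rhs
```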
Recall some definitions from matrix perturbation theory (see [2] (p. 167)).
The spectral variation of $\tilde A$ with respect to A is $sv_A(\tilde A) := \max_j \min_k |\lambda_j(\tilde A) - \lambda_k(A)|$.
The Hausdorff distance between the eigenvalues of A and $\tilde A$ is $hd(A, \tilde A) := \max\{ sv_A(\tilde A),\, sv_{\tilde A}(A) \}$.
The matching (optimal) distance between the eigenvalues of A and $\tilde A$ is $md(A, \tilde A) := \min_\pi \max_j |\lambda_{\pi(j)}(\tilde A) - \lambda_j(A)|$,
where the minimum is taken over all permutations $\pi$ of $\{1, 2, \dots, n\}$.
The quantity $sv_A(\tilde A)$ is not a metric: it may be zero even when the spectra of A and $\tilde A$ are different (e.g., when $\sigma(\tilde A) = \{0\}$ while $\sigma(A) = \{0, 1\}$, so that $sv_A(\tilde A) = 0$ although $sv_{\tilde A}(A) = 1$).
Geometrically, the spectral variation has the following interpretation. If
$$sv_A(\tilde A) \le r,$$
then
$$\sigma(\tilde A) \subseteq \bigcup_{k=1}^n \{ z \in \mathbb{C} : |z - \lambda_k(A)| \le r \}.$$
In other words, the eigenvalues of $\tilde A$ lie in the union of the disks of radius $r$ centered at the eigenvalues of A.
The Hausdorff distance bounds the spectral variation and is actually a metric. The matching distance bounds the Hausdorff distance and is also a metric. The “smallness” of the matching distance means that the eigenvalues of a matrix and its perturbation are “close” and can be grouped into nearby pairs. In some cases bounds on the spectral variation or the Hausdorff distance can be converted into bounds on the matching distance.
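The three quantities can be computed directly from the definitions. The following NumPy sketch (illustrative only; the function names are ours) also exhibits the asymmetry of the spectral variation discussed above:

```python
import numpy as np
from itertools import permutations

def spectral_variation(A, A_tilde):
    """sv_A(A_tilde): max over eigenvalues of A_tilde of the distance to sigma(A)."""
    lam = np.linalg.eigvals(A)
    mu = np.linalg.eigvals(A_tilde)
    return max(min(abs(m - l) for l in lam) for m in mu)

def hausdorff_distance(A, A_tilde):
    return max(spectral_variation(A, A_tilde), spectral_variation(A_tilde, A))

def matching_distance(A, A_tilde):
    lam = np.linalg.eigvals(A)
    mu = np.linalg.eigvals(A_tilde)
    n = len(lam)
    return min(max(abs(mu[p[j]] - lam[j]) for j in range(n))
               for p in permutations(range(n)))

A = np.diag([0.0, 1.0])
B = np.diag([0.0, 0.0])
sv = spectral_variation(A, B)   # 0: every eigenvalue of B is an eigenvalue of A
hd = hausdorff_distance(A, B)   # 1
md = matching_distance(A, B)    # 1
```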
One of the well-known bounds for $sv_A(\tilde A)$ is the Elsner inequality
$$sv_A(\tilde A) \le (\|A\| + \|\tilde A\|)^{1 - 1/n} \|A - \tilde A\|^{1/n}, \qquad (3)$$
cf. [1,2,3]. Since the right-hand side of this inequality is symmetric in A and $\tilde A$, we have
$$hd(A, \tilde A) \le (\|A\| + \|\tilde A\|)^{1 - 1/n} \|A - \tilde A\|^{1/n}.$$
As was mentioned, calculating or estimating the spectral norm is often not an easy task. Below we suggest bounds for the spectral variation and the Hausdorff distance explicitly expressed via the entries of the matrices under consideration. In some cases our bounds are sharper than (3).
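For completeness, the Elsner inequality itself is easy to test numerically; the sketch below (illustrative random data, NumPy assumed) computes $sv_A(\tilde A)$ from the definition and compares it with the bound $(\|A\| + \|\tilde A\|)^{1-1/n}\|A - \tilde A\|^{1/n}$:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
A = rng.standard_normal((n, n))
A_tilde = A + 0.1 * rng.standard_normal((n, n))

lam = np.linalg.eigvals(A)
mu = np.linalg.eigvals(A_tilde)
sv = max(min(abs(m - l) for l in lam) for m in mu)   # sv_A(A_tilde) by definition

spec = lambda X: np.linalg.norm(X, 2)                # spectral norm
elsner = (spec(A) + spec(A_tilde)) ** (1 - 1 / n) * spec(A - A_tilde) ** (1 / n)
elsner_holds = sv <= elsner + 1e-12
```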
By means of inequality (3), the following result, called the Ostrowski–Elsner theorem, has been proved:
cf. [2] (p. 170, Theorem IV.1.4). In Section 7, we consider also other bounds for .
Put
$$md_2(A, \tilde A) := \min_\pi \Big( \sum_{j=1}^n |\lambda_{\pi(j)}(\tilde A) - \lambda_j(A)|^2 \Big)^{1/2},$$
where $\pi$ ranges over all permutations of the integers $1, 2, \dots, n$.
One of the famous results on $md_2(A, \tilde A)$ is the Hoffman–Wielandt theorem proved in [4] (see also [2] (p. 189) and [5] (p. 126)), which asserts the following: for all normal matrices A and $\tilde A$, the inequality $md_2(A, \tilde A) \le N_2(A - \tilde A)$ is valid.
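The Hoffman–Wielandt inequality for normal matrices can likewise be verified on a small example (an illustrative sketch; Hermitian matrices are used as a convenient special case of normal ones):

```python
import numpy as np
from itertools import permutations

rng = np.random.default_rng(2)
n = 4
X = rng.standard_normal((n, n))
A = (X + X.T) / 2                        # Hermitian, hence normal
E = rng.standard_normal((n, n))
B = A + 0.05 * (E + E.T) / 2             # a Hermitian perturbation

lam = np.linalg.eigvalsh(A)
mu = np.linalg.eigvalsh(B)
# optimal l2-matching of the two spectra over all permutations
best = min(sum(abs(mu[p[j]] - lam[j]) ** 2 for j in range(n)) ** 0.5
           for p in permutations(range(n)))
hw_holds = best <= np.linalg.norm(A - B, 'fro') + 1e-12
```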
In [6], L. Mirsky proved that for all Hermitian matrices A and $\tilde A$ with eigenvalues enumerated in non-increasing order,
$$\Big( \sum_{j=1}^n |\lambda_j(A) - \lambda_j(\tilde A)|^2 \Big)^{1/2} \le N_2(A - \tilde A)$$
(see also [2] (p. 194) and [5] (p. 126)). In 1975, W. Kahan [7] (see also [2] (Theorem IV.5.2, p. 213)) has derived the following result: let A be a Hermitian matrix and an arbitrary one in , and
Then
Here and below , , . The Kahan theorem generalizes the Mirsky result in the case . In Section 14 we present an analogous result for a .
Furthermore, as is well-known, the Hilbert identity
$$R_\lambda(A) - R_\lambda(\tilde A) = R_\lambda(A)(\tilde A - A) R_\lambda(\tilde A)$$
plays an important role in perturbation theory. In Section 15, we suggest a new identity for resolvents and show that it refines the results derived with the help of the Hilbert identity, if the commutator $A\tilde A - \tilde A A$ has a sufficiently small norm.
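With the resolvent convention $R_\lambda(A) = (A - \lambda I)^{-1}$, the Hilbert identity reads $R_\lambda(A) - R_\lambda(\tilde A) = R_\lambda(A)(\tilde A - A)R_\lambda(\tilde A)$; a direct numerical verification (illustrative NumPy sketch):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4
A = rng.standard_normal((n, n))
A_tilde = A + 0.1 * rng.standard_normal((n, n))
z = 10.0 + 1.0j                        # regular for both matrices (far from both spectra)

I = np.eye(n)
R_A = np.linalg.inv(A - z * I)
R_At = np.linalg.inv(A_tilde - z * I)
# Hilbert identity: R_z(A) - R_z(A_tilde) = R_z(A) (A_tilde - A) R_z(A_tilde)
residual = np.linalg.norm(R_A - R_At - R_A @ (A_tilde - A) @ R_At)
```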
A few words about the contents of the paper. It consists of 17 Sections.
In Section 2, we recall some classical results which are needed in our proofs. In Section 3, we present norm estimates for resolvents of matrices which will be applied in the sequel.
In Section 4 and Section 5, we derive the perturbation bound for determinants in terms of the entries of matrices and consider some of its applications. Section 6 deals with perturbation bounds for determinants expressed via rather general norms.
Section 7, Section 8, Section 9 and Section 10 are devoted to the spectral variations. The relevant bounds are obtained both in terms of the departure from normality and via the entries of matrices.
Section 11 and Section 12 deal with angular localization of matrices. The results of Section 12 are new.
Section 13 is devoted to perturbations of diagonalizable matrices. In particular, we suggest a bound for the condition numbers. Corollary 14 is new.
As mentioned above, in Section 14 we generalize the Kahan result.
In Section 16 and Section 17, taking into account that any matrix having more than one eigenvalue is similar to a block-diagonal matrix, we obtain a bound for the condition numbers in the case of non-diagonalizable matrices, and discuss applications of that bound to matrix functions and spectrum perturbations. The material of Section 16 and Section 17 is new.
2. Preliminaries
Recall the Schur theorem (Section I.4.10.2 of [8]). By that theorem there is an orthonormal (Schur) basis $\{e_k\}_{k=1}^n$, in which A has the triangular representation
Schur’s basis is not unique. We can write
$$A = D + V \qquad (5)$$
with a normal (diagonal) operator D defined by
$$D e_k = \lambda_k(A) e_k \qquad (k = 1, \dots, n)$$
and a nilpotent operator V defined by $V = A - D$ (the strictly upper-triangular part of A in the Schur basis).
Equality (5) is called the triangular representation of A; D and V are called the diagonal part and nilpotent part of A, respectively. Put
is called the maximal chain of the invariant projections of A. It has the properties
with and
So A, V and D have joint invariant subspaces. We can write
where .
Let us also recall the famous Gerschgorin theorem (see [2] and Section III.2.2.1 of [8]), which is an important tool for the analysis of the location of the eigenvalues.
Theorem 1.
The eigenvalues of $A = (a_{jk}) \in \mathbb{C}^{n\times n}$ lie in the union of the discs
$$\Big\{ z \in \mathbb{C} : |z - a_{jj}| \le \sum_{k \ne j} |a_{jk}| \Big\} \qquad (j = 1, \dots, n).$$
The Gerschgorin theorem implies the following inequality for the spectral radius:
$$r_s(A) \le \max_j \sum_{k=1}^n |a_{jk}|.$$
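Both statements are easy to verify numerically (an illustrative NumPy sketch; the matrix is random test data):

```python
import numpy as np

def in_gerschgorin_union(A):
    """True if every eigenvalue lies in the union of the Gerschgorin discs."""
    centers = np.diag(A)
    radii = np.sum(np.abs(A), axis=1) - np.abs(centers)   # off-diagonal row sums
    return all(any(abs(lam - c) <= r + 1e-10 for c, r in zip(centers, radii))
               for lam in np.linalg.eigvals(A))

rng = np.random.default_rng(4)
A = rng.standard_normal((5, 5))
discs_ok = in_gerschgorin_union(A)

r_s = max(abs(np.linalg.eigvals(A)))                      # spectral radius
row_bound = np.max(np.sum(np.abs(A), axis=1))             # max absolute row sum
radius_ok = r_s <= row_bound + 1e-10
```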
3. Norm Estimates for Resolvents
The following quantity (the departure from normality) of A plays an essential role hereafter:
$$g(A) = \Big( N_2^2(A) - \sum_{k=1}^n |\lambda_k(A)|^2 \Big)^{1/2}.$$
By Lemma 3.1 from [9], $g(A) = N_2(V)$, where V is the nilpotent part of A (see equality (5)). Therefore, if A is a normal matrix, then $g(A) = 0$. The following relations are checked in Section 3.1 of [9]:
and
By the inequality between the arithmetic and geometric means we have
Hence,
If $A_1$ and $A_2$ are commuting matrices, then $g(A_1 + A_2) \le g(A_1) + g(A_2)$. Indeed, since $A_1$ and $A_2$ commute, they have a joint basis of the triangular representation. So the nilpotent part of $A_1 + A_2$ is equal to $V_1 + V_2$, where $V_1$ and $V_2$ are the nilpotent parts of $A_1$ and $A_2$, respectively. Therefore,
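With the definition $g(A) = (N_2^2(A) - \sum_k |\lambda_k(A)|^2)^{1/2}$, the departure from normality can be computed directly; the sketch below (illustrative; a triangular matrix is used so that the nilpotent part is visible explicitly) checks that $g$ vanishes for a normal matrix and equals $N_2(V)$ for a triangular one:

```python
import numpy as np

def g(A):
    """Departure from normality: (N2(A)^2 - sum_k |lambda_k(A)|^2)^(1/2)."""
    lam = np.linalg.eigvals(A)
    val = np.linalg.norm(A, 'fro') ** 2 - np.sum(np.abs(lam) ** 2)
    return np.sqrt(max(val, 0.0))       # guard against tiny negative round-off

# a normal (Hermitian) matrix has g = 0
H = np.array([[2.0, 1.0], [1.0, 3.0]])
g_normal = g(H)

# for a triangular matrix the eigenvalues sit on the diagonal,
# so g equals the Frobenius norm of the strictly upper (nilpotent) part
T = np.array([[1.0, 5.0, 2.0],
              [0.0, 2.0, 3.0],
              [0.0, 0.0, 4.0]])
V = np.triu(T, 1)
g_triangular = g(T)                     # should equal N2(V) = sqrt(25 + 4 + 9)
```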
We will need the following
Theorem 2
(Theorem 3.1 of [9]). Let $A \in \mathbb{C}^{n\times n}$. Then
$$\|R_\lambda(A)\| \le \sum_{k=0}^{n-1} \frac{g^k(A)}{\sqrt{k!}\,\rho^{k+1}(A, \lambda)} \qquad (\lambda \notin \sigma(A)),$$
where $\rho(A, \lambda) := \min_k |\lambda - \lambda_k(A)|$ is the distance between $\lambda$ and the spectrum of A.
This theorem is sharp: if A is a normal matrix, then $g(A) = 0$ and we obtain the equality $\|R_\lambda(A)\| = 1/\rho(A, \lambda)$. Here and below we put $g^0(A) = 1$ and $0! = 1$.
Let us recall an additional norm estimate for the resolvent, which is sharper than Theorem 2 but more cumbersome. To this end, for an integer introduce the numbers
Here
are binomial coefficients. Evidently, for all ,
Theorem 3
(Theorem 3.10 of [9]). Let . Then
Moreover, the following result is valid.
Theorem 4
(Theorem 3.4 of [9]). Let . Then
Let us point to an inequality between the resolvent and determinant.
Theorem 5.
For any $A \in \mathbb{C}^{n\times n}$ and all regular λ of A one has
$$\|R_\lambda(A)\| \le \frac{N_2^{n-1}(A - \lambda I)}{(n-1)^{(n-1)/2}\,|\det(A - \lambda I)|}.$$
For the proof see, for example, Corollary 3.4 of [9].
4. Perturbation Bounds for Determinants in Terms of the Entries of Matrices
The following theorem is valid.
Theorem 6
(Reference [10]). Let , be an arbitrary orthonormal basis in and . Then
and, therefore,
Proof.
It is not hard to check that is a polynomial in and
Thanks to the Cauchy integral,
Hence,
Take into account that
Consequently,
In addition, according to (10)
Therefore, due to (13),
Taking , we get (10), as claimed. □
Obviously, the quantities involved can be directly calculated. Below we also show that in concrete situations Theorem 6 is sharper than (1) and enables us to establish sharp upper and lower bounds for the determinants of matrices that are “close” to triangular matrices.
Furthermore, making use of the inequality between the arithmetic and geometric means, from (11) we get
Put . Then by the latter inequality
Or
where
Denote . Then
Let us check that
Indeed, the derivative of the function on the left-hand-side is
Hence it follows that the infimum is reached at . This proves (14).
So we can write
We thus arrive at our next result.
Corollary 1.
Let and be an arbitrary orthonormal basis in . Then we have
5. Perturbations of Triangular Matrices and Comparison with Inequality (1)
In this section, , , and is the standard basis. Clearly,
and
Now Theorem 6 implies
Corollary 2.
One has
and, therefore,
Furthermore, let $A_+ = (a^+_{jk})_{j,k=1}^n$ be the upper triangular part of A, i.e.,
$a^+_{jk} = a_{jk}$ if $j \le k$ and $a^+_{jk} = 0$ for $j > k$. Then
Clearly,
Making use of Corollary 2, we arrive at our next result.
Corollary 3.
One has
where
From this corollary we have
Moreover, if
then
Inequalities (15) and (17) are sharp: they are attained if A is triangular.
Recall that $N_2(A)$ is the Frobenius norm of A.
The following lemma taken from Lemma 3.3 of [10] gives us simple conditions, under which (11) is sharper than (1).
Lemma 1.
If
then (11) is sharper than (1).
Proof.
By the Cauchy inequality,
Since , we easily have
Thus,
Now Corollary 3 implies
Since
we get
Thus, if (18) holds, then (17) improves (1). □
It should be noted that the determinants of diagonally dominant and doubly diagonally dominant matrices are very well explored, cf. [11,12,13,14]. At the same time, the determinants of matrices “close” to triangular ones are investigated considerably less than the determinants of diagonally dominant matrices. For bounds on determinants of matrices close to the identity matrix, see [15].
6. Perturbation Bounds for Determinants in Terms of an Arbitrary Norm
Let be an arbitrary fixed matrix norm of , i.e., the function from into , defined by the usual relations: for the zero matrix , if , , and
In addition, . So, . Therefore, there is a number , such that
We need the following result.
Theorem 7
(Theorem 1.7.1 of [16]). Let and condition (19) hold. Then
where
Recall that is the Schatten-von Neumann norm. Making use of the inequality between the arithmetic and geometric mean values, we obtain
Due to the Weyl inequalities
cf. Corollary II.3.1 of [17], Lemma 1.1.4 of [16], we get
So in this case
where
Now Theorem 7 implies
Corollary 4.
Let . Then for any finite ,
Note that Theorem 8.1.1 from the book [16] refines the Weyl inequality with the help of the self-commutator.
Furthermore, let
i.e., W is the off-diagonal part of A: . Then taking and making use of the previous corollary, we arrive at the following result.
Corollary 5.
Let . Then
7. Bounds for the Spectral Variations in Terms of the Departure from Normality
In this section, we estimate the spectral variation of two matrices in terms of the departure from normality introduced in Section 3. The results of the present section are based on the norm estimates for resolvents presented in Section 3 and the following technical lemma.
Lemma 2.
Let A and be linear operators in and . In addition, let
where is a monotonically increasing continuous function of a non-negative variable x, such that and . Then , where is the unique positive root of the equation
For the proof see Section 1.8 of [9]. Lemma 2 and Theorem 2 with
imply
Theorem 8.
Let A and be -matrices and . Then , where is the unique positive root of the equation
Since where (see Section 2), one can replace in (21) by
If A is normal, then $g(A) = 0$; Equation (21) becomes $q/z = 1$ with $q = \|A - \tilde A\|$, so $z(q) = q$ and, therefore, Theorem 8 gives us the well-known inequality $sv_A(\tilde A) \le \|A - \tilde A\|$, cf. [1,2]. Thus, Theorem 8 refines the Elsner inequality (3) if A is “close” to normal.
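A numerical sketch of Theorem 8 (hedged: we assume here that Equation (21) has the form $q\sum_{k=0}^{n-1} g^k(A)/(\sqrt{k!}\,z^{k+1}) = 1$ with $q = \|A - \tilde A\|$, consistent with the resolvent bound of Theorem 2; the data are illustrative). Since the left-hand side decreases in $z$, the unique positive root can be found by bisection:

```python
import numpy as np
from math import factorial, sqrt

def departure(A):
    """g(A) = (N2(A)^2 - sum_k |lambda_k(A)|^2)^(1/2)."""
    lam = np.linalg.eigvals(A)
    return sqrt(max(np.linalg.norm(A, 'fro') ** 2 - np.sum(np.abs(lam) ** 2), 0.0))

rng = np.random.default_rng(5)
n = 4
A = rng.standard_normal((n, n))
A_tilde = A + 1e-3 * rng.standard_normal((n, n))

q = np.linalg.norm(A - A_tilde, 2)
gA = departure(A)

def F(z):
    """Left-hand side of the assumed form of Equation (21)."""
    return q * sum(gA ** k / (sqrt(factorial(k)) * z ** (k + 1)) for k in range(n))

lo, hi = 1e-12, 1e6
for _ in range(200):                   # bisection: F decreases from +inf to 0
    mid = (lo + hi) / 2
    if F(mid) > 1:
        lo = mid
    else:
        hi = mid
z0 = hi

lam = np.linalg.eigvals(A)
mu = np.linalg.eigvals(A_tilde)
sv = max(min(abs(m - l) for l in lam) for m in mu)
theorem_holds = sv <= z0 + 1e-10
```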
Equation (21) can be written as
To estimate the root one can apply the well-known bounds for the roots of polynomials. For instance, consider the algebraic equation
$$z^n = a_1 z^{n-1} + a_2 z^{n-2} + \dots + a_n \qquad (23)$$
with non-negative coefficients $a_1, \dots, a_n$, not all zero.
Lemma 3.
The unique positive root $z_0$ of (23) satisfies the inequality $z_0 \le p$ if $p \ge 1$, and $z_0 \le p^{1/n}$ if $p \le 1$, where $p = a_1 + \dots + a_n$.
Proof.
Since all the coefficients of are non-negative, it does not decrease as increases. If , then and . Hence . If , then
and , as claimed. □
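A numerical illustration of Lemma 3 (assuming, as the proof suggests, that the bound reads $z_0 \le p$ for $p \ge 1$ and $z_0 \le p^{1/n}$ for $p \le 1$, with $p = a_1 + \dots + a_n$):

```python
import numpy as np

def positive_root(a):
    """Unique positive root of z^n = a_1 z^(n-1) + ... + a_n (a_j >= 0, not all zero)."""
    coeffs = np.concatenate(([1.0], -np.asarray(a, dtype=float)))
    roots = np.roots(coeffs)
    return max(r.real for r in roots if abs(r.imag) < 1e-8 and r.real > 0)

def lemma_bound(a):
    p = sum(a)
    return p if p >= 1 else p ** (1.0 / len(a))

cases = [[0.5, 0.2, 0.1],   # p < 1: bound is p^(1/n)
         [2.0, 1.0, 3.0]]   # p > 1: bound is p
checks = [positive_root(a) <= lemma_bound(a) + 1e-9 for a in cases]
```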
Substitute into (22), assuming that A is non-normal, i.e., . Then we obtain the equation
Putting
and applying Lemma 3 for the unique positive root of (24), we obtain
But ; consequently, according to Theorem 8, we get
Furthermore, put
Then Theorem 8 implies
Corollary 6.
One has , where is the unique positive root of the equation
Replacing in Corollary 6 by , we obtain the following result.
Corollary 7.
We have
Now we are going to derive an estimate for the matching distance introduced in Section 1. To this end we need the following well-known result.
Theorem 9
(Theorem IV.1.5, p. 170 in [2]). Let and . If is a nondecreasing bound on , then
If is a nondecreasing bound on , then
Here is the integer part of .
Note that for any . By (25),
Hence,
Making use of Theorem 9, we arrive at
Corollary 8.
Let . Then
Since $g(A) = 0$ for a normal matrix A, Corollary 8 refines the Ostrowski–Elsner theorem mentioned in Section 1 for matrices close to normal ones.
8. A Bound for the Spectral Variation Via the Entries of Matrices
As mentioned above, the spectral norm is unitarily invariant, but calculating or estimating it is often not an easy task, especially if the matrix depends on many parameters. In the paper [18], a bound for the spectral variation was explicitly expressed via the entries of the matrices under consideration. In the paper [19], we established a new bound via the entries. In appropriate situations it considerably improves Elsner’s inequality and the main result from [18]. In this section we present the main results from [19].
Theorem 10.
Let and be matrices. Then with the notations
one has
where
The proof of this theorem is presented in the next section. Simple calculations show that
Furthermore, let $A_+ = (a^+_{jk})$ be the upper triangular part of A, i.e., $a^+_{jk} = a_{jk}$ if $j \le k$ and $a^+_{jk} = 0$ for $j > k$. To illustrate Theorem 10 apply it with and , taking into account that
, where . In addition, , where
Now Theorem 10 can be applied.
Put
Since is triangular, we have . Making use of (26), we arrive at
Corollary 9.
All the eigenvalues of lie in the set .
This corollary is sharp: if A is triangular, then , and Corollary 9 gives us the equalities .
9. Proof of Theorem 10
In this section, for brevity, put and .
Lemma 4.
Let and be an arbitrary orthonormal basis in . Then for any eigenvalue of we have
where
and
Proof.
Due to Theorem 6,
Hence,
Since , (28) implies
Consequently,
as claimed. □
Proof of Theorem 10.
Obviously,
Therefore,
Now let be the standard basis, and A and be represented in that basis by matrices and , respectively. Clearly,
So . By the Gerschgorin theorem (see Section 2), we have . Thus,
Consequently, under consideration . Now Lemma 4 implies
Since the right-hand side does not depend on j, this finishes the proof. □
10. Comments and Examples to Theorem 10
Again is the spectral norm of A. To compare Theorem 10 with the Elsner inequality (3) consider the following examples.
Example 1.
Let , .
Then . Now the Elsner inequality implies
Since , , Theorem 10 yields the inequality
Obviously, (32) is sharper than (31).
Example 2.
Let
Simple calculations give us the following results: , , and . Hence, . To apply Theorem 10 note that in the considered example . So Theorem 10 gives us the following result:
and, therefore, .
Furthermore, under consideration , and thus the Elsner inequality implies
So (33) is sharper than this result.
Example 3.
Let
By the standard calculations we get , , and . Hence, . In the considered example . Omitting simple calculations, by Theorem 10, we get , and, therefore, .
11. Angular Localization of the Eigenvalues of Perturbed Matrices
In this section we consider the following problem: let the eigenvalues of a matrix lie in a certain sector. In what sector do the eigenvalues of a perturbed matrix lie?
Not too many works are devoted to the angular localization of matrix spectra. The papers [20,21] should be mentioned. In these papers it is shown that the test to determine whether all eigenvalues of a complex matrix of order n lie in a certain sector can be replaced by an equivalent test to find whether all eigenvalues of a real matrix of order lie in the left half-plane. Below we also recall the well-known results from Chapter 1, Exercise 32 of [22].
To the best of our knowledge, the problem just described of angular localization of the eigenvalues of perturbed matrices was not considered in the available literature, although it is important for various applications, cf. [22].
The results of this section are adopted from the paper [23].
Again, $\|A\|$ is the spectral norm of $A \in \mathbb{C}^{n\times n}$. For a Hermitian matrix Y we write $Y > 0$ if Y is positive definite, i.e., $(Yx, x) > 0$ for all $x \ne 0$.
Without loss of the generality, we assume that
If this condition does not hold, instead of A we can consider perturbations of the matrix with a constant .
By the Lyapunov theorem, cf. Theorem I.5.1 of [22], condition (34) implies that there exists a positive definite , such that . Define the angular Y-characteristic of A by
The set
will be called the Y-spectral-sector of A. Let $\lambda$ be an eigenvalue of A and d a corresponding eigenvector: $Ad = \lambda d$. Then
We, thus, get
Lemma 5.
For an , let condition (34) hold and Y be a positive definite matrix, such that . Then, any eigenvalue of A lies in the Y-spectral-sector of A.
Example 4.
Let . Then condition (34) holds. For any commuting with A (for example ) we have and . Thus and .
So Lemma 5 is sharp.
Remark 1.
Suppose that A is invertible. Recall that the quantity defined in the finite-dimensional case by
is called the angular deviation of A, cf. Chapter 1, Exercise 32 of [22]. For example, for a positive definite operator A one has
where , are the boundary of the spectrum of A (see Chapter 1, Exercise 33 of [22]).
In Exercise 32, it is shown that the spectrum of A lies in the corresponding sector. Lemma 5 refines that result.
Furthermore, by the above mentioned Lyapunov theorem, there exists a positive definite solving the Lyapunov equation
Hence,
Put
Now we are in a position to formulate the main result of this section.
Theorem 11.
Let , condition (34) hold and X be a solution of (35). Then, with the notation , one has
provided
The proof of this theorem is based on the following lemma.
Lemma 6.
Let , condition (34) hold and X be a solution of (35). If, in addition,
then
Proof.
Put . Then and due to (35), with we obtain
In addition,
But
Hence
Now (39) yields
provided (38) holds. Since
according to (36) we arrive at the required result. □
Proof of Theorem 11.
Note that X is representable as
cf. Section 1.5 of [22]. Hence, we easily have . Now the latter lemma proves the theorem. □
12. An Estimate for J(A) and Examples for Theorem 11
Lemma 7.
Let condition (34) hold. Then , where
Proof.
By virtue of Example 3.2 from [9],
Then
as claimed. □
If A is normal, then and, taking we have .
The latter lemma and Theorem 11 imply
Corollary 10.
Let and the conditions (34) and hold. Then
Now consider the angular localization of the eigenvalues of matrices “close” to triangular ones. Let $A_+ = (a^+_{jk})$ be the upper triangular part of A, i.e., $a^+_{jk} = a_{jk}$ if $j \le k$ and $a^+_{jk} = 0$ for $j > k$. To illustrate our results apply Corollary 10 with A instead of and with instead of A.
Since is triangular, we have ,
and . Assuming that we can write
In addition, . Now Corollary 10 implies
Corollary 11.
Let and the condition
hold. Let the diagonal entries of A lie in the sector . Then the eigenvalues of A lie in the sector with ψ satisfying
Example 5.
Consider the matrix
Then
We have , where , and, therefore, . In addition, , and consequently,
Hence,
Now Corollary 11 implies that the eigenvalues of the considered matrix A lie in the sector with satisfying
Direct calculations show that .
13. Perturbations of Diagonalizable Matrices
An eigenvalue is said to be simple if its geometric multiplicity is equal to one. In this section, we consider a matrix A all of whose eigenvalues are simple. As is well known, in this case there is an invertible matrix T, such that
where is a normal matrix. Such a matrix A is called diagonalizable. The condition number is very important for various applications. We obtain a bound for the condition number and discuss applications of that bound to matrix functions and spectral variations.
If is diagonalizable, it can be written as
where are one-dimensional eigen-projections. If is a scalar function defined on the spectrum of A, then is defined as
Let
be the interpolation Lagrange-Sylvester polynomial, such that . and
cf. Section V.1 of [24]. From (40) it follows
Since is normal, . We thus arrive at
Lemma 8.
Let A be diagonalizable and be a scalar function defined on the for an . Then
In particular,
Inequality (41) and Lemma 8 imply
Corollary 12.
Let and A be diagonalizable. Then
Now we are going to estimate the condition number of A assuming that all the eigenvalues of A are different:
In other words, the algebraic multiplicity of each eigenvalue is equal to one. Recall that
(see Section 3) and put
and
Theorem 12.
Let condition (42) be fulfilled. Then there is an invertible matrix T, such that (40) holds with
The proof of this theorem can be found in Theorem 6.1 of [9] and [25]. Theorem 12 is sharp: if A is normal, then and . Thus we obtain the equality .
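Numerically, the condition number $\kappa_T = \|T\|\,\|T^{-1}\|$ of the diagonalizing similarity and the bound of Lemma 8 are easy to inspect (an illustrative NumPy sketch; a random real matrix is almost surely diagonalizable):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 4
A = rng.standard_normal((n, n))
lam, T = np.linalg.eig(A)            # columns of T are eigenvectors: A T = T diag(lam)
T_inv = np.linalg.inv(T)
kappa = np.linalg.norm(T, 2) * np.linalg.norm(T_inv, 2)

recon_err = np.linalg.norm(T @ np.diag(lam) @ T_inv - A, 2)
# f(A) for f = exp, computed through the eigen-decomposition;
# Lemma 8 bounds its norm by kappa * max_k |f(lambda_k)|
fA = T @ np.diag(np.exp(lam)) @ T_inv
lemma_holds = np.linalg.norm(fA, 2) <= kappa * max(abs(np.exp(lam))) + 1e-8
```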
Lemma 8 and Theorem 12 immediately imply
Corollary 13.
Let condition (42) hold and be a scalar function defined on the for an . Then
Moreover, making use of Theorem 12 and Corollary 12, we arrive at the following result.
Corollary 14.
Let and condition (42) hold. Then
About additional inequalities for condition numbers via norms of the eigen-projections see [26,27]. About the functions of diagonalizable matrices see also [28].
14. Sums of Real Parts of Eigenvalues of Perturbed Matrices
The aim of the present section is to generalize the Kahan inequality (4). Again, put , and . Let be a sequence of positive numbers defined by the recursive relation
For a , put
As it is proved in Corollary 1.3 of [29],
Now we are in a position to formulate and prove the main result of this section.
Theorem 13.
Let be a Hermitian operator and be an arbitrary matrix. Let the conditions
hold. Then for any ,
Proof.
According to the Schur theorem (see Section 2), we can write
where is an upper triangular matrix. Since and are similar, they have the same eigenvalues, and without loss of generality we can assume that is already upper triangular, i.e.,
where is the diagonal matrix and is the strictly upper triangular matrix.
Here and below $\sigma(A)$ denotes the spectrum of A. We have and thus, the real and imaginary parts of A are
respectively. Since A and are Hermitian, by the Mirsky inequality mentioned in the Introduction, we obtain
Thus
Making use of Lemma 1.5 from [29], we get the inequality
(see also Section 3.6 of [30,31]). In addition, by (48) and, therefore,
Thanks to the above mentioned Weyl inequalities,
Thus,
Now (50) implies the inequality
So by (49) we get the desired inequality
□
The theorem just proved is sharp in the following sense: if is Hermitian, then and inequality (47) becomes the Mirsky result presented in Section 1.
Corollary 15.
Let a matrix have real diagonal entries. Let W be the off-diagonal part of : . Then for any ,
and, therefore,
Indeed, this result is due to the previous theorem with .
Certainly, inequality (51) makes sense only if its right-hand side is positive.
The case should be considered separately from the case , since the relations between and similar to inequality (50) are unknown if , and we could not use the arguments of the proof of Theorem 13. The case is investigated in [32].
15. An Identity for Resolvents
Let and . The Hilbert identity for resolvents mentioned in Section 1 gives the following important result: if λ is regular for A and
then λ is also regular for $\tilde A$. In this section we suggest a new identity for resolvents of matrices. It gives us new perturbation results which in appropriate situations improve condition (52). Put .
Theorem 14.
Let λ be regular for A and . Then,
Proof.
We have
as claimed. □
Denote
Lemma 9.
Let be a regular point of A and . Then and identity (53) holds. Moreover,
Proof.
Put . Since the regular sets of operators are open, for t small enough, is a regular point of . By the previous lemma we get
Hence,
Thus, with the notation
We have
Take an integer and put . For m large enough, is a regular point of and due to (54) we can write
Hence,
where . Due to inequality (55) we can assert that . So in our arguments we can replace by and obtain the relations
Therefore, . Continuing this process for , we get . Now (54) implies the required result. □
It is clear that , where
Now the previous lemma yields the following result.
Corollary 16.
Let and Then and relation (53) holds.
Example 6.
Let us consider the matrices
with arbitrary non-zero numbers a and b. It is clear that , . In this example we easily have and and, therefore, Corollary 16 gives us the sharp result.
At the same time (52) gives us the invertibility condition .
Example 7.
Let us consider the block matrices
where C and B are commuting -matrices. It is simple to check that
Corollary 16 gives us the equality . At the same time, due to (52), if we can assert that only if .
If A is invertible, then due to Theorem 5,
Now Corollary 16 implies
Corollary 17.
Suppose A is invertible, and
then $\tilde A$ is also invertible.
Recall that the quantity $g(A)$ is introduced in Section 3. Theorem 2 and Corollary 16 imply our next result.
Corollary 18.
If λ is regular for A and
then λ is regular for .
The following theorem gives us the bound for the spectral variation via the identity for resolvents considered in this section.
Theorem 15.
Let A and be matrices. Then , where is the unique positive root of the algebraic equation
Proof.
For any , due to Corollary 18 we have
Hence, it follows that where is the unique positive root of the equation
which is equivalent to (56). But . This proves the theorem. □
To estimate the root one can apply Lemma 3.
16. Similarity of an Arbitrary Matrix to a Block Diagonal Matrix
16.1. Preliminary Results
Again, is the spectral norm. and is the Frobenius norm of , are the different eigenvalues of A and is the algebraic multiplicity of . So
and . The aim of this section is to show that there are matrices and an invertible matrix , such that
Besides, each block has the unique eigenvalue . In addition, we obtain an estimate for the (block-condition) number and consider some applications of that estimate.
Put
By the Schur theorem (see Section 2) there is a non-unique unitary transform, such that A can be reduced to the triangular form:
Besides, the diagonal entries are the eigenvalues, enumerated as
Let be the corresponding orthonormal basis of the upper-triangular representation (the Schur basis). Denote
and
In addition, put and . We can see that each is an orthogonal invariant projection of A and
Besides, if , then and is one dimensional. If , then
where
In the matrix form the blocks can be written as
etc. Besides, each is a strictly upper-triangular (nilpotent) part of . So has the unique eigenvalue of the algebraic multiplicity : . We thus have proved the following result.
Lemma 10.
An arbitrary matrix can be reduced by a unitary transform to the block triangular form (59) with , where is either a nilpotent operator, or . Besides, has the unique eigenvalue of the algebraic multiplicity .
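The block-diagonalization behind this reduction can be sketched numerically. If $\sigma(A_{11})$ and $\sigma(A_{22})$ are disjoint, the Sylvester equation $A_{11}X - XA_{22} = -B$ has a unique solution, and the similarity $T = \begin{pmatrix} I & X \\ 0 & I \end{pmatrix}$ removes the off-diagonal block of $\begin{pmatrix} A_{11} & B \\ 0 & A_{22} \end{pmatrix}$. The NumPy sketch below (illustrative; it solves the Sylvester equation via a Kronecker-product system rather than by the methods of [33]) checks this on a small example:

```python
import numpy as np

def solve_sylvester(A, B, C):
    """Solve A X - X B = C via vec(AX - XB) = (I (x) A - B^T (x) I) vec(X)."""
    n, m = A.shape[0], B.shape[0]
    K = np.kron(np.eye(m), A) - np.kron(B.T, np.eye(n))
    x = np.linalg.solve(K, C.reshape(-1, order='F'))
    return x.reshape((n, m), order='F')

A11 = np.array([[1.0, 2.0],
                [0.0, 1.0]])           # sigma(A11) = {1}
A22 = np.array([[5.0]])                # sigma(A22) = {5}, disjoint from sigma(A11)
B = np.array([[3.0],
              [4.0]])
A = np.block([[A11, B],
              [np.zeros((1, 2)), A22]])

X = solve_sylvester(A11, A22, -B)
T = np.block([[np.eye(2), X],
              [np.zeros((1, 2)), np.eye(1)]])
T_inv = np.block([[np.eye(2), -X],
                  [np.zeros((1, 2)), np.eye(1)]])
D = T_inv @ A @ T                      # block diagonal: diag(A11, A22)
off_block = np.linalg.norm(D[:2, 2:])  # should be ~0
```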
16.2. Statement of the Main Result
Again, put
Introduce, also, the notations
and
It is not hard to check that . Now we are in a position to formulate the main result of this section.
Theorem 16.
Let an -matrix A have different eigenvalues of the algebraic multiplicity . Then there are -matrices each of which has a unique eigenvalue , and an invertible matrix T, such that (58) holds with the block-diagonal matrix . Moreover,
This theorem is proved in the next section. Theorem 16 is sharp: if A is normal, then and . Thus we obtain the equality .
16.3. Applications of Theorem 16
Let be a scalar function, regular on . Define f(A) in the usual way via the Cauchy integral [33]. Since the projections are mutually orthogonal, we have
Let
be the interpolation Lagrange–Sylvester polynomial such that and , cf. Section V.1 of [24].
Now (58) implies
Hence, (59) and (60) yield
Corollary 19.
Let . Then there is an invertible matrix T, such that
Due to Theorem 3.5 from the book [9] we have
Take into account that (see Section 17). Now, making use of Theorem 16, we arrive at the following result.
Corollary 20.
Let . Then
For example, we have
where .
About the recent results devoted to matrix-valued functions see, for instance, [9] and the references therein.
Now consider the resolvent. Then by (58) for we have
Extending this relation analytically to all regular z and taking into account that
we get
Corollary 21.
Let . Then there is an invertible matrix T, such that
for any regular z of A.
But due to Theorem 3.2 from [9] we have
where is the distance between z and the spectrum of A. Clearly, . Now Theorem 16 and (62) imply
Corollary 22.
Let . Then
Furthermore, let A and be complex -matrices. Recall that is the spectral variation of with respect to A.
For the proof see Lemma 1.10 of [9]. Making use of Lemma 2 and Corollary 22, we obtain the inequality , where is the unique positive root of the equation
This equation is equivalent to the algebraic one
For example, if
then due to Lemma 3.17 from [9], we have . So we arrive at
Corollary 23.
Let A and be -matrices. Then . If, in addition, condition (64) holds, then .
To illustrate Corollary 23 consider the matrices
The eigenvalues of A are . So ,
, and . Hence,
where . According to (59) consider the equation
So one can take where
Due to Corollary 23 we have .
Additional relevant results can be found in the papers [34,35].
17. Proof of Theorem 16
Recall that are the orthogonal invariant projections defined in Section 16.1 and ; and are also defined in Section 16.1. Put
By Lemma 10 has the unique eigenvalue and A is represented by (59). Represent and in the block-matrix form:
and
Since is a block triangular matrix, it is not hard to see that
cf. Lemma 6.2 of [9]. So due to Lemma 10,
Under this condition, the equation
has a unique solution
e.g., Section VII.2 of [33].
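The unique solvability of such equations when the relevant spectra are disjoint can be illustrated with the standard Sylvester equation AX − XB = C (a hedged sketch; the matrices below are arbitrary examples, and SciPy's solver is used in place of the explicit formula of [33]):

```python
import numpy as np
from scipy.linalg import solve_sylvester

A = np.array([[1.0, 2.0], [0.0, 3.0]])   # spectrum {1, 3}
B = np.array([[5.0, 0.0], [1.0, 6.0]])   # spectrum {5, 6}, disjoint from spec(A)
C = np.eye(2)

# solve_sylvester solves A X + X B' = C; taking B' = -B gives A X - X B = C,
# uniquely solvable since spec(A) and spec(B) are disjoint.
X = solve_sylvester(A, -B, C)

assert np.allclose(A @ X - X @ B, C)
```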
Lemma 11.
Let be a solution to (66). Then
Proof.
Due to (67) we can write . But . Therefore, and
Since is a projection invariant to A: , we can write . Thus, and, consequently,
Furthermore, . Hence,
Therefore,
Consequently,
Continuing this process and taking into account that , we obtain
as claimed. □
Take
According to (69)
So the matrix is the inverse of . Thus,
and (68) can be written as (58). We thus arrive at
Corollary 24.
Let an -matrix A have different eigenvalues of the algebraic multiplicity . Then there are -matrices each of which has a unique eigenvalue and such that (58) holds with T defined by (70).
By the inequality between the arithmetic and geometric means, from (70) and (71) we get
and
Proof of Theorem 16
Consider the Sylvester equation
where and are given; should be found. Assume that the eigenvalues and of B and , respectively, satisfy the condition.
Then Equation (74) has a unique solution X; see, e.g., Section VII.2 of [33]. Due to Corollary 5.8 of [9], the inequality
is valid and, therefore,
where .
Let us go back to Equation (66). In this case , , , , , and due to (57), . In addition, . Now (75) implies
where .
Recall that denotes the Schur basis. So
We can write with a normal (diagonal) matrix defined by and a nilpotent (strictly upper-triangular) matrix defined by , . and will be called the diagonal part and the nilpotent part of A, respectively. It can happen that , i.e., A is normal.
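The splitting into diagonal and nilpotent parts in a Schur basis can be reproduced numerically via the Schur factorization A = QTQ*, splitting the upper-triangular factor T into its diagonal and its strictly upper-triangular part. A sketch (the example matrix is ours):

```python
import numpy as np
from scipy.linalg import schur

A = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 1.0],
              [1.0, 0.0, 3.0]])

T, Q = schur(A, output='complex')   # A = Q T Q^*, T upper triangular
D = np.diag(np.diag(T))             # diagonal part (the eigenvalues)
N = T - D                           # strictly upper-triangular nilpotent part

assert np.allclose(Q @ (D + N) @ Q.conj().T, A)
assert np.allclose(np.linalg.matrix_power(N, 3), 0)   # N is nilpotent
assert np.isclose(np.trace(D.conj().T @ N), 0)        # D and N are orthogonal
```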
Besides, . In addition, the nilpotent part of is and the nilpotent part of is . So and are orthogonal, and
Thus, from (76) it follows
It can be directly checked that
and
Since , we have
and, consequently,
Take T as is in (70). Then (72), (73), and (77) imply
and
But by the Schwarz inequality and (78),
Thus,
and . Now (68) proves the theorem. □
Funding
This research received no external funding.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Not applicable.
Conflicts of Interest
The author declares no conflict of interest.
References
- Bhatia, R. Perturbation Bounds for Matrix Eigenvalues; Classics in Applied Mathematics; SIAM: Philadelphia, PA, USA, 2007; Volume 53.
- Stewart, G.W.; Sun, J.G. Matrix Perturbation Theory; Academic Press: New York, NY, USA, 1990.
- Elsner, L. An optimal bound for the spectral variation of two matrices. Linear Algebra Appl. 1985, 71, 77–80.
- Hoffman, A.J.; Wielandt, H.W. The variation of the spectrum of a normal matrix. Duke Math. J. 1953, 20, 37–39.
- Kato, T. Perturbation Theory for Linear Operators; Springer: Berlin, Germany, 1966.
- Mirsky, L. Symmetric gauge functions and unitarily invariant norms. Q. J. Math. 1960, 11, 50–59.
- Kahan, W. Spectra of nearly Hermitian matrices. Proc. Am. Math. Soc. 1975, 48, 11–17.
- Marcus, M.; Minc, H. A Survey of Matrix Theory and Matrix Inequalities; Allyn and Bacon: Boston, MA, USA, 1964.
- Gil’, M.I. Operator Functions and Operator Equations; World Scientific: Hackensack, NJ, USA, 2018.
- Gil’, M.I. Perturbations of determinants of matrices. Linear Algebra Appl. 2020, 590, 235–242.
- Li, H.B.; Huang, T.Z.; Li, H. Some new results on determinantal inequalities and applications. J. Inequal. Appl. 2010, 2010.
- Li, B.; Tsatsomeros, M.J. Doubly diagonally dominant matrices. Linear Algebra Appl. 1997, 261, 221–235.
- Wen, L.; Chen, Y. Some new two-sided bounds for determinants of diagonally dominant matrices. J. Inequal. Appl. 2012, 61, 1–9.
- Vein, R.; Dale, P. Determinants and Their Applications in Mathematical Physics; Applied Mathematical Sciences; Springer: New York, NY, USA, 1999; Volume 134.
- Brent, R.P.; Osborn, J.H.; Smith, W.D. Note on best possible bounds for determinants of matrices close to the identity matrix. Linear Algebra Appl. 2015, 466, 21–26.
- Gil’, M.I. Bounds for Determinants of Linear Operators and Their Applications; CRC Press: Boca Raton, FL, USA; Taylor & Francis Group: London, UK, 2017.
- Gohberg, I.C.; Krein, M.G. Introduction to the Theory of Linear Nonselfadjoint Operators; American Mathematical Society: Providence, RI, USA, 1969; Volume 18.
- Gil’, M.I. A new inequality for the Hausdorff distance between spectra of two matrices. Rend. Circ. Mat. Palermo Ser. 2 2020, 70, 341–348.
- Gil’, M.I. A refined bound for the spectral variations of matrices. Acta Sci. Math. 2021, 87, 1–6.
- Anderson, B.D.; Bose, N.K.; Jury, E.I. A simple test for zeros of a complex polynomial in a sector. IEEE Trans. Automat. Contr. 1974, 19, 437–438.
- Anderson, B.D.; Bose, N.K.; Jury, E.I. On eigenvalues of complex matrices in a sector. IEEE Trans. Automat. Contr. 1975, 20, 433.
- Daleckii, Y.L.; Krein, M.G. Stability of Solutions of Differential Equations in Banach Space; American Mathematical Society: Providence, RI, USA, 1974.
- Gil’, M.I. On angular localization of spectra of perturbed operators. Extr. Math. 2020, 35, 197–204.
- Gantmakher, F.R. Theory of Matrices; Nauka: Moscow, Russia, 1967. (In Russian)
- Gil’, M.I. A bound for condition numbers of matrices. Electron. J. Linear Algebra 2014, 27, 162–171.
- Gil’, M.I. Estimates for functions of finite and infinite matrices. Perturbations of matrix functions. Int. J. Math. Game Theory Algebra 2013, 21, 328–392.
- Gil’, M.I. On condition numbers of spectral operators in a Hilbert space. Anal. Math. Phys. 2015, 5, 363–372.
- Gil’, M.I. Norm estimates for functions of matrices with simple spectrum. Rend. Circ. Mat. Palermo 2010, 59, 215–226.
- Gil’, M.I. Lower bounds for eigenvalues of Schatten–von Neumann operators. J. Inequal. Pure Appl. Math. 2007, 8, 117–122.
- Gohberg, I.C.; Krein, M.G. Theory and Applications of Volterra Operators in Hilbert Space; American Mathematical Society: Providence, RI, USA, 1970; Volume 24.
- Gil’, M.I. Operator Functions and Localization of Spectra; Lecture Notes in Mathematics; Springer: Berlin, Germany, 2003; Volume 1830.
- Gil’, M.I. Sums of real parts of perturbed matrices. J. Math. Inequal. 2010, 4, 517–522.
- Bhatia, R. Matrix Analysis; Springer: New York, NY, USA, 1997.
- Gil’, M.I. Resolvents of operators on tensor products of Euclidean spaces. Linear Multilinear Algebra 2016, 64, 699–716.
- Gil’, M.I. On similarity of an arbitrary matrix to a block diagonal matrix. Filomat 2021, accepted for publication.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).