1. Introduction
In this paper, we are concerned with the derivation of realistic perturbation bounds on the sensitivity of various important subspaces arising in matrix analysis. Such bounds are especially needed in the perturbation analysis of high dimension subspaces when the known bounds may produce very pessimistic results. We show that much tighter bounds on the subspace sensitivity can be obtained by using a probabilistic approach based on the Markoff inequality.
The sensitivity of invariant, deflation and singular subspaces of matrices is considered in detail in the fundamental book of Stewart and Sun [1], as well as in the surveys of Bhatia [2] and Li [3]. In particular, perturbation analysis of the eigenvectors and invariant subspaces of matrices affected by deterministic and random perturbations is presented in several papers and books; see, for instance, [4,5,6,7,8,9,10,11,12]. Survey [13] is entirely devoted to the asymptotic (first-order) perturbation analysis of eigenvalues and eigenvectors. The algorithmic and software problems in computing invariant subspaces are discussed in [14]. The sensitivity of deflating subspaces arising in the generalized Schur decomposition is considered in [15,16,17,18], and numerical algorithms for analyzing this sensitivity are presented in [19]. Bounds on the sensitivity of singular values and singular subspaces of matrices that are subject to random perturbations are derived in [20,21,22,23,24,25,26], to name a few. The stochastic matrix theory that can be used in the case of stochastic perturbations is developed in the papers of Stewart [27] and Edelman and Rao [28].
In [29], the author proposed new componentwise perturbation bounds for unitary and orthogonal matrix decompositions based on probabilistic approximations of the entries of random perturbation matrices implementing the Markoff inequality. It was shown that using such bounds it is possible to decrease significantly the asymptotic perturbation bounds of the corresponding similarity or equivalence transformation matrices. Based on the probabilistic asymptotic estimates of the entries of random perturbation matrices presented in [29], in this paper we derive new probabilistic bounds on the sensitivity of invariant subspaces, deflation subspaces and singular subspaces of matrices. The analysis and the examples given demonstrate that, in contrast to the known deterministic bounds, the probabilistic bounds are much tighter with a sufficiently high probability. The analysis performed exploits a unified method for deriving asymptotic perturbation bounds of the subspaces of interest, developed in [30,31,32,33,34], and utilizes probabilistic approximations of the entries of random perturbation matrices implementing the Markoff inequality. As a result of the analysis, we determine, with a prescribed probability, asymptotic perturbation bounds on the angles between the perturbed and unperturbed subspaces. It is proved that the new probabilistic asymptotic bounds are significantly less conservative than the corresponding deterministic perturbation bounds. The results obtained are illustrated by examples comparing the deterministic perturbation bounds derived by Stewart [16,35] and Sun [11,18] with the probabilistic bounds derived in this paper.
The paper is structured as follows. In Section 2, we briefly present the main results concerning the derivation of lower magnitude bounds on the entries of a random matrix using only its Frobenius norm. In the next three sections, Section 3, Section 4 and Section 5, we show the application of this approach to derive probabilistic perturbation bounds for the invariant, deflating and singular subspaces of matrices, respectively. We illustrate the theoretical results by examples demonstrating that the probabilistic bounds for such subspaces are much tighter than the corresponding deterministic asymptotic bounds. We note that the known deterministic bounds for the invariant, deflating and singular subspaces are presented briefly as theorems without proof, only for the purpose of comparing them with the new bounds.
All computations in the paper are performed with MATLAB® Version 9.9 (R2020b) [36] using IEEE double-precision arithmetic. M-files implementing the perturbation bounds described in the paper can be obtained from the author.
2. Probabilistic Bounds for Random Matrices
Consider an m × n random matrix, E, with uncorrelated elements. In the componentwise perturbation analysis of matrix decompositions, we have to use a matrix bound whose entries dominate the magnitudes of the corresponding entries of E, i.e., (1), where the entrywise bound is expressed through some matrix norm of E. However, if, for instance, we use the Frobenius norm of E for this purpose, the resulting bound is very pessimistic for large m and n. To reduce the bound, in [29] it is proposed to decrease its entries by a scaling factor greater than one. Of course, in the general case, such a bound will not satisfy (1) for all i and j. However, we can allow some entries of the perturbation E to exceed in magnitude, with some prescribed probability, the corresponding entries of the bound. This probability can be determined by the Markoff inequality ([37], Section 5-4)

P(x >= a) <= E{x}/a,

where P(x >= a) is the probability that the random variable x is greater than or equal to a given number, a, and E{x} is the average (or mean value) of x. Note that this inequality is valid for an arbitrary distribution of x, which makes it conservative for any specific probability distribution. Applying the Markoff inequality, with x equal to an entry of the perturbation and a equal to the corresponding entry of the bound, we obtain the following result [29].
Theorem 1. For an m × n random perturbation, E, and a desired probability, the estimate defined through the scaling factor (3) satisfies the stated inequality.

Theorem 1 allows us to decrease the mean value of the bound, and hence the magnitude of its entries, by the scaling factor, choosing the desired probability less than 1. The probability value 1 corresponds to the case of the deterministic bound, for which all entries of the bound are larger than or equal to the corresponding entries of the perturbation. As mentioned above, the probability bound produced by the Markoff inequality is very conservative, with the actual results being much better than those predicted by the chosen probability. This is due to the fact that the Markoff inequality is valid for the worst possible distribution of the random variable.
According to Theorem 1, using the scaling factor (3) guarantees that the inequality holds for each i and j with a probability no less than the desired one. Since the entries of the perturbation are uncorrelated, this means that, for sufficiently large m and n, this probability also gives a lower bound on the relative number of the entries that satisfy the above inequality.
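The effect of scaling the entrywise bound can be illustrated with a short numerical experiment. The following Python/NumPy sketch is illustrative only (the paper's own computations use MATLAB, and the scaling factor xi = 50 below is an arbitrary choice, not the value given by (3)): it compares the deterministic entrywise bound, equal to the Frobenius norm of the perturbation, with a scaled-down bound, and counts how many entries still satisfy the scaled bound.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 100, 100
E = rng.standard_normal((m, n))

delta_F = np.linalg.norm(E, "fro")   # deterministic entrywise bound

# The deterministic bound |e_ij| <= ||E||_F holds for every entry,
# but overestimates typical entry magnitudes by roughly sqrt(m*n).
assert np.all(np.abs(E) <= delta_F)

# Hypothetical scaling factor (illustrative, not the paper's formula (3)):
# shrink the bound and measure how often it is still respected.
xi = 50.0
delta_hat = delta_F / xi
frac_within = np.mean(np.abs(E) <= delta_hat)
print(f"entries within scaled bound: {100 * frac_within:.1f}%")
```

Even with this aggressive scaling, only a small fraction of the entries violates the reduced bound, which is the behavior the probabilistic analysis exploits.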
In some cases, for instance in the perturbation analysis of the Singular Value Decomposition, tighter perturbation bounds are obtained if, instead of the Frobenius norm, we use the spectral norm. The following result is an analogue of Theorem 1 that allows us to use the spectral norm at the price of producing smaller values of the scaling factor.
Theorem 2. For an m × n random perturbation, E, and a desired probability, the estimate defined through the corresponding scaling factor satisfies the stated inequality.

This result follows directly from Theorem 1, replacing the Frobenius norm by its upper bound in terms of the spectral norm.
Since, frequently, the former norm is of the order of the latter for a large n, Theorem 2 may produce pessimistic results in the sense that the actual probability of fulfilling the inequality is much larger than the value predicted by (5).
In several instances of the perturbation analysis, we have to determine a bound on the elements of the vector (6), where M is a given matrix multiplying a random vector with a known probabilistic bound on its elements. In accordance with (6), the deterministic asymptotic (linear) componentwise bound (7) is valid. A probabilistic bound can be determined by the following theorem [29].
Theorem 3. If the estimate of the parameter vector x is chosen as stated, where Ξ is determined according to (8), then (9) holds.

The inequality (9) shows that the probabilistic estimate of each component can be determined if, in the linear estimate (7), we replace the perturbation norm by its probabilistic estimate, where the scaling factor is taken as shown in (8) for a specified probability. In this way, instead of the linear estimate, we obtain the probabilistic estimate
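The deterministic componentwise bound underlying Theorem 3 has the form of a triangle-inequality estimate: each component of the product of a matrix with a randomly perturbed vector is bounded by the corresponding component of the absolute matrix times the entrywise bound. A Python/NumPy sketch (illustrative sizes; the uniform entrywise bound delta is a hypothetical choice, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(5)
M = rng.standard_normal((4, 4))        # given matrix M
delta = 1e-2 * np.ones(4)              # entrywise bound on the random vector
dE = delta * (2 * rng.random(4) - 1)   # random vector with |dE_j| <= delta_j

x = M @ dE
# Deterministic componentwise bound: |x_i| <= sum_j |M_ij| * delta_j
x_bound = np.abs(M) @ delta
assert np.all(np.abs(x) <= x_bound + 1e-15)
```

The probabilistic estimate of Theorem 3 replaces the entrywise bound delta by a scaled-down value, so the right-hand side above shrinks by the same scaling factor.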
3. Perturbation Bounds for Invariant Subspaces
3.1. Problem Statement
Let (10) be the Schur decomposition of the matrix A, where the leading diagonal block of T contains a given group of the eigenvalues of A ([38], Section 2.3). The matrix, U, of the unitary similarity transformation can be partitioned as in (11), where the columns of the leading block are the basis vectors of the invariant subspace associated with the eigenvalues of the leading block of T, and the remaining columns form its unitary complement. The invariant subspace is mapped into itself by A. Note that the eigenvalues of A can be reordered in the desired way on the diagonal of T (and hence of its leading block) using unitary similarity transformations ([39], Chapter 7).
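The decomposition (10) and the extraction of an invariant subspace basis can be reproduced with standard software. The following Python/SciPy sketch is illustrative (the paper itself uses MATLAB, and selecting the eigenvalues in the left half-plane is a hypothetical ordering choice): it computes an ordered complex Schur form and verifies the invariance relation.

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 6))

# Complex Schur form with the left-half-plane eigenvalues ordered
# to the leading diagonal block of T; k is the block dimension.
T, Z, k = schur(A, output="complex", sort="lhp")

U1 = Z[:, :k]     # orthonormal basis of the invariant subspace
# Invariance: A maps the subspace into itself, i.e. A U1 = U1 T11.
residual = np.linalg.norm(A @ U1 - U1 @ T[:k, :k])
assert residual < 1e-10
```

Reordering with a different selection criterion places any desired group of eigenvalues in the leading block, as stated above.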
The invariant subspace, , is called simple if the matrices and have no eigenvalues in common.
If matrix A is subject to a perturbation, E, then, instead of the decomposition (10), we have the decomposition (12) with a perturbed matrix of the unitary transformation. The columns of the corresponding block of the perturbed matrix are basis vectors of the perturbed invariant subspace. We shall assume that matrix A has distinct eigenvalues, i.e., we deal with a simple invariant subspace, which ensures finite perturbations of the subspace for small perturbations of A.
Consider two subspaces of dimension k. The distance between them can be characterized by the gap between these subspaces, defined as the norm of the difference of the orthogonal projections onto the respective subspaces [11]. Further on, we shall measure the sensitivity of an invariant subspace of dimension k by the canonical angles between the perturbed and unperturbed subspaces ([35], Chapter 4). The maximum angle is related to the value of the gap by the relationship stating that the gap equals the sine of the maximum canonical angle [40]. We note that the maximum angle between the perturbed and unperturbed subspaces can be computed efficiently as shown in [41].
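In practice, the canonical angles can be computed from the singular values of the product of the two orthonormal bases. A Python/SciPy sketch (illustrative, with randomly generated subspaces; the relation between the gap and the sine of the maximum angle holds for subspaces of equal dimension):

```python
import numpy as np
from scipy.linalg import subspace_angles, orth

rng = np.random.default_rng(2)
X = orth(rng.standard_normal((8, 3)))               # basis of a subspace
Y = orth(X + 1e-3 * rng.standard_normal((8, 3)))    # slightly perturbed basis

theta = subspace_angles(Y, X)    # canonical angles between the subspaces
theta_max = np.max(theta)

# gap = || P_X - P_Y ||_2 = sin(theta_max) for equal-dimensional subspaces
P_X = X @ X.T
P_Y = Y @ Y.T
gap = np.linalg.norm(P_X - P_Y, 2)
assert np.isclose(gap, np.sin(theta_max), atol=1e-8)
```

For the small perturbation above, the maximum canonical angle is of the order of the perturbation size, as the asymptotic theory predicts.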
3.2. sep-Based Global Bound
Define the corresponding Kronecker-product matrices, where ⊗ denotes the Kronecker product ([42], Chapter 4). The norm of this matrix is closely related to the quantity sep, the separation between two matrices. The separation between given matrices A and B characterizes the distance between their spectra. Note that the separation is zero if and only if A and B have eigenvalues in common. In the given case, the separation between the two diagonal blocks of the Schur form can be determined from the corresponding Kronecker-product matrix of the associated Sylvester operator.
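Numerically, the separation can be evaluated as the smallest singular value of the Kronecker-product matrix of the Sylvester operator X → AX − XB. The sketch below is illustrative (Python/NumPy; practical algorithms, such as the one in [19], avoid forming the Kronecker product explicitly) and also checks that sep vanishes exactly when the spectra intersect.

```python
import numpy as np

def sep(A, B):
    """sep(A, B) = min over ||X||_F = 1 of ||A X - X B||_F, computed as
    the smallest singular value of the Kronecker matrix of X -> A X - X B."""
    m, n = A.shape[0], B.shape[0]
    K = np.kron(np.eye(n), A) - np.kron(B.T, np.eye(m))
    return np.linalg.svd(K, compute_uv=False)[-1]

# Common eigenvalue => sep = 0; nearly common eigenvalue => sep is tiny.
assert sep(np.diag([1.0, 3.0]), np.diag([3.0])) == 0.0
tiny = sep(np.diag([1.0, 2.0]), np.diag([2.0 + 1e-8]))
assert 0.0 < tiny < 1e-6
```

A small separation signals an ill-conditioned invariant subspace, since the separation enters the denominator of the global bound.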
Assume that the spectra of and are disjoint, so that . Then the following theorem gives an estimate of the sensitivity of an invariant subspace of A.
Theorem 4 ([16]). Let matrix A be decomposed, as in (10). Given the perturbation, E, set the quantities indicated. If the stated conditions hold, then there is a unique matrix, P, satisfying the stated bound, such that the associated columns span a right invariant subspace of the perturbed matrix.

It may be shown that the singular values of the corresponding matrix are the sines of the canonical angles between the perturbed and unperturbed invariant subspaces, and the sines and tangents of the canonical angles are expressed through the singular values of P. Thus, Theorem 4 bounds the tangents of the canonical angles between the perturbed and unperturbed invariant subspaces of A. The maximum canonical angle fulfils (20).
3.3. Perturbation Expansion Bound
A global perturbation bound for invariant subspaces is derived by Sun [11,40] using the perturbation expansion method. The essence of this method is to expand the perturbed basis in an infinite series in the powers of the perturbation and then estimate the series sum.

Theorem 5 ([11]). Let matrix A be decomposed, as in (10). Given the perturbation, set the quantities indicated. Then the simple invariant subspace of the perturbed matrix has the stated qth-order perturbation estimate for any natural number, q.

Theorem 5 can be used to estimate the canonical angles between the perturbed and unperturbed invariant subspaces. Taking into account (13), we obtain the bound (23). The implementation of Theorem 5 to estimate the sensitivity of an invariant subspace shows that the bound (23) tends to overestimate severely the true value of the maximum angle for large n. In practice, it is possible to obtain reasonable results if we use only the first-order term in the expansion (23), i.e., if we use the linear bound (24).
3.4. Bound by the Splitting Operator Method
The essence of the splitting operator method for perturbation analysis of matrix problems [31] consists in separately deriving perturbation bounds on the two factors of the decomposition. For this aim, we introduce the perturbation parameter vector x, whose components are the entries of the strictly lower triangular part of the corresponding perturbation matrix. This vector is then used to find bounds on the various elements of the Schur decomposition.
Construct the indicated vector of perturbation parameters. The equation for the perturbation parameters represents a linear system of equations [32], (25), where M is a matrix whose elements are determined from the entries of T, and the right-hand-side vector contains higher-order terms in the perturbations. Specifically, matrix M is determined by (26). Note that matrix M is non-singular although the underlying matrix is not of full rank.
Equation (25) is independent of the equations that determine the perturbations of the elements of the Schur form T. This allows us first to solve (25) and estimate x, and then to use the solution obtained to determine bounds on the elements of the Schur form.
Neglecting the second-order term in (25), we obtain the first-order (linear) approximation of x, which leads to the asymptotic bound (29) on x.
The perturbation of the transformation matrix can be estimated as in (31), where the leading term is a first-order approximation and the remainder contains higher-order terms in x. Thus, an asymptotic (linear) approximation of this matrix can be determined as in (32). Equation (32) shows that the sensitivity of the invariant subspace of dimension k is connected to the values of the perturbation parameters. Consequently, if the perturbation parameters are known, it is possible to find at once sensitivity estimates for all invariant subspaces of every admissible dimension. More specifically, let (33) hold, where * denotes an unspecified entry. Then the maximum angle between the perturbed and unperturbed invariant subspaces of dimension k follows from (33).
In this way, we obtain the following result.

Theorem 6. Let matrix A be decomposed, as in (10), and assume that the Frobenius norm of the perturbation is known. Set the indicated quantities, where matrix M is determined by (26). Then the stated asymptotic estimate holds.

The proof of Theorem 6 follows directly from (33), replacing the matrix by its linear approximation and substituting each perturbation parameter by its approximation (29). Note that, as always in the case of perturbation bounds, the equality can be achieved only for specially constructed perturbation matrices.
Denote by the corresponding quantities the changes of the diagonal elements of T, i.e., the perturbations of the eigenvalues of A. Then the first-order eigenvalue perturbations satisfy the bound (35). The obtained linear bound (35) coincides numerically with the well-known asymptotic bounds from the literature [12,35,39]. The quantity appearing in (35) is equal to the condition number of the corresponding eigenvalue.
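The eigenvalue condition numbers mentioned above can be computed from the unit-norm left and right eigenvectors as 1/|y_i^H x_i|. A Python/SciPy sketch (illustrative; the test matrices below are hypothetical examples, not taken from the paper):

```python
import numpy as np
from scipy.linalg import eig

def eig_cond(A):
    """First-order eigenvalue condition numbers 1/|y_i^H x_i|, where
    x_i, y_i are unit-norm right and left eigenvectors returned by eig."""
    w, vl, vr = eig(A, left=True, right=True)
    s = np.abs(np.sum(vl.conj() * vr, axis=0))   # |y_i^H x_i| per eigenvalue
    return w, 1.0 / s

# Symmetric (normal) matrix: every eigenvalue is perfectly conditioned.
w, c = eig_cond(np.array([[2.0, 1.0], [1.0, 2.0]]))
assert np.allclose(c, 1.0)

# Nonnormal matrix: a large off-diagonal entry inflates the condition number.
w, c = eig_cond(np.array([[1.0, 1e3], [0.0, 2.0]]))
assert c.max() > 100
```

Large condition numbers of this kind indicate that small perturbations of A may be strongly amplified in the eigenvalues, which is precisely the ill-conditioning exhibited in Example 1.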
3.5. Probabilistic Perturbation Bound
The idea of determining tighter perturbation bounds of matrix subspaces consists in replacing the Frobenius or 2-norm of the matrix perturbation by a much smaller probabilistic estimate of the perturbation entries, obtained using Theorems 1 and 3. This allows us to decrease, with a specified probability, the perturbation bounds for the different subspaces, achieving better results for higher dimensional problems. In simple terms, we replace in the corresponding asymptotic estimate by the ratio , thus decreasing the perturbation bound by the quantity , which is determined by the desired probability, . We shall illustrate this idea considering first the case of invariant subspaces.
Using Theorem 3, the probabilistic perturbation bounds of x and of the transformation matrix in the case of the Schur decomposition can be found from (29) and (31), respectively, replacing in (29) the perturbation norm by its scaled value, where the scaling factor is determined according to (3) from the desired probability and the problem order, n. In this way, we obtain the probabilistic asymptotic estimate of the maximum angle between the perturbed and unperturbed invariant subspaces of dimension k. In the same way, from (35), we obtain a probabilistic asymptotic estimate of the eigenvalue perturbations.
3.6. Bound Comparison
In the next example, we compare the invariant subspace deterministic perturbation bounds, obtained by the sep-based approach, the perturbation expansion method and the splitting operator method, with the probabilistic bound obtained by using the Markoff inequality.
Example 1. Consider a matrix A, taken aswhereand the matrix is constructed as [43]where are elementary reflections, σ is taken equal to and . The eigenvalues of A,are complex conjugated. The perturbation of A is taken as , where is a matrix with random entries with normal distribution and . The matrix M in (25) is of order and its inverse satisfies , which shows that the eigenvalue problem for A is ill-conditioned since the perturbations of A can be “amplified” times in x and, consequently, in and . In Figure 1, we show the mean value of the matrix and the relative number of the entries of the matrix for which , obtained for normal and uniform distribution of the entries of and for different values of the desired probability, . For the case of normal distribution and , the size of the probability entry bound, , decreases 10 times in comparison with the size of the entry bound , which allows the decrease of the mean value of the ratio from to (Table 1). For , the probability bound, , is 60 times smaller than the bound , and even for this small desired probability the number of entries for which is still . In Figure 2, we compare the asymptotic bound, , and the probabilistic estimate, , with the actual eigenvalue perturbations for normal distribution of perturbation entries and probabilities and . (For clarity, the perturbations of the superdiagonal elements of T are hidden). The probabilistic bound, , is much tighter than the linear bound, , and the inequality is satisfied for all eigenvalues and all chosen probabilities. In particular, the size of the estimate, , is 10 times smaller than the linear estimate, , for , 20 times for and 40 times for . In Figure 3, we show the asymptotic bound, , and the probabilistic estimate, , along with the actual value of the maximum angle between the perturbed and unperturbed invariant subspace of dimensions for the same probabilities and . The probability estimate satisfies for all . 
For comparison, we give the global bound on the maximum angle between the perturbed and unperturbed invariant subspaces computed by (20) and the first-order bound determined by (24). (The computation of the sep-based estimate is performed by using the numerical algorithm presented in [14].) The global bound is slightly larger than the asymptotic bounds, and the asymptotic bounds (24) and (34) coincide.

5. Perturbation Bounds for Singular Subspaces
5.1. Problem Statement
Let A be an m × n matrix. The factorization (65), where U and V are orthogonal matrices and Σ is a diagonal matrix, is called the singular value decomposition of A ([38], Section 2.6). The diagonal entries of Σ are the singular values of A. If U and V are partitioned conformingly, then the subspaces spanned by the leading columns of U and V form a pair of singular subspaces for A, satisfying the defining relations under multiplication by A and by its transpose.
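The defining relations of a pair of singular subspaces can be checked directly from the SVD. A short Python/NumPy sketch (illustrative dimensions, not the paper's example):

```python
import numpy as np

rng = np.random.default_rng(3)
m, n, k = 7, 5, 2
A = rng.standard_normal((m, n))

U, s, Vt = np.linalg.svd(A)        # A = U @ Sigma @ Vt
U1, V1 = U[:, :k], Vt[:k, :].T     # bases of the pair of singular subspaces
S1 = np.diag(s[:k])

# The pair satisfies A V1 = U1 S1 and A^T U1 = V1 S1.
assert np.allclose(A @ V1, U1 @ S1)
assert np.allclose(A.T @ U1, V1 @ S1)
```

Thus A maps the right singular subspace onto the left one, and A transposed maps the left singular subspace onto the right one, scaled by the corresponding singular values.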
If matrix A is subject to a perturbation, E, then there exists another pair of orthogonal matrices and a diagonal matrix such that the perturbed decomposition (66) holds.
The aim of the perturbation analysis of singular subspaces consists in bounding the angles between the perturbed, , and unperturbed, , right singular subspaces, and the angles between the perturbed, , and unperturbed, , left singular subspaces.
5.2. Global Bound
Consider first the global perturbation bound on the singular subspaces derived by Stewart [16].
Theorem 10. Let matrix A be decomposed, as in (65), and let the corresponding blocks form a pair of singular subspaces for A. Let the perturbation be given and partitioned conformingly with U and V. Define the indicated quantities, where, in the indicated case, Σ is understood to have zero singular values. If the stated condition holds, then there are matrices satisfying the stated bounds such that the corresponding columns form a pair of singular subspaces of the perturbed matrix.

Theorem 10 bounds the tangents of the angles between the perturbed and unperturbed right singular subspaces and the angles between the perturbed and unperturbed left singular subspaces of A. Thus, the maximum angles between the perturbed and unperturbed singular subspaces fulfil the stated bounds. Note that Theorem 10 produces equal bounds on the maximum angles between the perturbed and unperturbed right and left singular subspaces.
5.3. Perturbation Expansion Bound
Global perturbation bounds for singular subspaces that produce individual perturbation bounds for each subspace in a pair of singular subspaces are presented in [18].
Theorem 11. Let the matrix be given and let the indicated orthogonal matrices be such that the stated block decomposition holds. Assume that, for each j, the singular values of the one block are different from the singular values of the other. Define the indicated quantities, with the stated convention applied where needed. If the stated condition holds, then there exists a pair of k-dimensional singular subspaces of the perturbed matrix such that the stated angle bounds are fulfilled.

5.4. Bound by the Splitting Operator Method
Similarly to the perturbation analysis of the generalized Schur decomposition, performed by using the splitting operator method, it is appropriate first to find bounds on the entries of the matrices and , where is a matrix that consists of the first n columns of U. The matrices and are related to the corresponding perturbations and by orthogonal transformations.
Let us define the vectors of the subdiagonal entries of the two perturbation matrices. Further on, these quantities will be considered as perturbation parameters, since they determine the perturbations of the singular vectors.
Let us represent the perturbation matrix in the indicated partitioned form. Following the analysis performed in [30], it can be shown that the unknown vectors x and y satisfy the system of linear equations (75), where the right-hand-side terms contain higher-order terms in the perturbations.
In this way, determining the vectors x and y reduces to the solution of the system of symmetric coupled equations (75) with diagonal coefficient matrices. The remaining parameter vector can be found from the separate equation (78), whose right-hand side contains higher-order terms in the perturbations defined in (66).
Neglecting the higher-order terms in (75), we obtain the linearized system, where, taking into account that the two diagonal matrices commute, the coefficient matrices take the stated form.
The matrices and are diagonal matrices whose nontrivial entries are determined by the singular values .
Hence, the components of the vectors x and y satisfy the relations (79) and (80). Taking into account the diagonal form of the coefficient matrices, each component can be determined separately. Since only one element of f and g participates in (79) and (80), these elements can be replaced by the corresponding bound, and we find that the linear approximations of the vectors x and y fulfil the stated componentwise estimates.
An asymptotic estimate of the remaining parameter vector is obtained from (78), neglecting the higher-order term. Since each of its elements depends only on one element of the right-hand side, we obtain the componentwise estimate (84).
As a result of determining the linear estimates (82)–(84), we obtain an asymptotic approximation of the vector x as in (85). The perturbation matrices of the two orthogonal factors can be estimated as in (88) and (89), where the remainder matrices contain higher-order terms in the perturbation parameters; the leading terms are asymptotic approximations of the respective perturbation matrices.
Assume that the singular value decomposition of A is reordered as in (92), where the transformation matrices are orthogonal and the leading block contains the desired singular values. The corresponding column blocks are the orthonormal bases of the perturbed and unperturbed left singular subspace of dimension k, and of the perturbed and unperturbed right singular subspace of the same dimension.
Using the asymptotic approximations of the elements of the vectors x and y, we obtain the following result.
Theorem 12. Let matrix A be decomposed, as in (92). Given the spectral norm of the perturbation, set the matrices in (88), (89) using the linear estimates of the perturbation parameters x and y determined from (81)–(84). Then the asymptotic bounds of the angles between the perturbed and unperturbed singular subspaces of dimension k are given by the stated expressions.

The perturbed matrix of the singular values satisfies the stated relation, where the remainder contains higher-order terms in the perturbation parameters. Neglecting the higher-order terms, we obtain for the singular value perturbations the corresponding asymptotic bound. Bounding each diagonal element by the spectral norm of the perturbation, we find the normwise estimate (96), which is in accordance with Weyl's theorem ([44], Chapter 1). We have in a first-order approximation that
5.5. Probabilistic Perturbation Bound
Implementing a derivation similar to the one used in the proof of Theorem 3, the probabilistic estimates of the parameter vectors x and y can be obtained from the deterministic estimates. For this aim, the value of the perturbation norm in the expressions (82) and (83) is replaced by its scaled value, where the scaling factor is determined from (4) for the specified probability. According to (85), the probabilistic perturbation bound of x fulfils the stated estimate, where the component estimates satisfy (81) and (84), respectively. The bound of y is found analogously, using (83).
The bounds on the singular vector perturbation matrices are determined from (90) and (91), respectively. According to (96), the probabilistic bound on the singular value perturbations is found from (97).
5.6. Bound Comparison
Example 3. Consider a matrix, taken as stated, where the orthogonal factors are constructed as in Example 2 and the indicated matrices are elementary reflections. The condition numbers of the two factors with respect to inversion are controlled by the variables σ and τ. The perturbation of A is taken as a matrix with random entries with normal distribution generated by the MATLAB® function randn. The linear estimates of x and y, which are of size 19900, are found by using (49) and (50), respectively, computing in advance the corresponding diagonal matrices. These matrices show that the perturbations in A can be strongly amplified in x and y. In Figure 9, we represent the mean value of the perturbation bound and the scaling factor as a function of the desired probability. Since, in the given case, we use the spectral norm, the value of the scaling factor for a given probability is relatively small; for one of the chosen probabilities, the mean value of the bound is equal to 285.23 (Table 3). In Figure 10, we compare the actual perturbations of the singular values with the normwise bound (96) and the probabilistic bound (97) of the singular value perturbations for the chosen probabilities. The probabilistic perturbation bound is tighter than the normwise bound: specifically, it is 2 times smaller for the first probability, 4 times for the second and 10 times for the third. The inequality is satisfied for all singular values and probabilities due to the small values of the actual perturbations. Note that tighter probabilistic estimates can be obtained if the Frobenius norm and the corresponding scaling parameter are used instead. In Figure 11 and Figure 12, we show the actual values of the angles between the perturbed and unperturbed right and left singular subspaces, respectively, along with the corresponding linear bounds and probabilistic bounds. For comparison, we also give the global bounds (69), (70) and (73), (74). As in the case of determining the deflating subspace global bounds, since the norms of parts of the perturbation matrix E are unknown, these norms are approximated by the 2-norms of the whole corresponding matrices. Clearly, the probabilistic bounds outperform all deterministic bounds. For instance, for the smallest chosen probability, the probabilistic bounds are 10 times smaller than the deterministic asymptotic bound, as predicted by the analysis.