Abstract
In this paper, we consider various aspects of the Schur problem for a square complex matrix A, namely the unitary similarity transformation of A into upper triangular form containing the eigenvalues of A on its diagonal. Since the profound work of I. Schur published in 1909, this has become a fundamental issue in the theory and applications of matrices. Nevertheless, certain details concerning the Schur problem need further clarification, especially in connection with the perturbation analysis of the Schur decomposition relative to perturbations in the matrix A. We consider both canonical and condensed Schur forms. Special attention is paid to matrices with simple eigenvalues. Some new concepts, such as quasi-Schur forms and diagonally spectral matrices, are also introduced and studied.
Keywords:
Schur canonical form; Schur condensed form; diagonally spectral matrix; quasi-Schur form; perturbations of Schur form
MSC:
15A21; 65G30
1. Introduction and Notation
The Schur decomposition of a general square matrix [1] and its generalizations are major tools both in the theory and applications of matrix analysis; see, e.g., refs. [2,3,4,5]. In this paper, we consider the main definitions and properties of the Schur decomposition of a square matrix (see the definitions given later), which are important from the point of view of the perturbation analysis of the Schur problem; see [6,7,8,9,10]. Various aspects of Schur and Schur-like decompositions are considered in [11,12,13,14]. The monographs [15,16] also deal with these problems. Schur-like forms are also proposed for pairs of matrices (or matrix pencils) [3], for control systems [17,18,19], etc. A perturbation analysis of these constructions is presented in [18,20,21] and the references therein.
In this paper, we introduce new concepts in this field such as quasi-canonical Schur forms and diagonally spectral matrices. A number of examples illustrate the results presented. The subject is technical and requires a large amount of notation. For the convenience of the reader, the general notation is gathered below in this section, while some specific notation appears further in the text. Some of the matrix notation is inspired by the language of the program system MATLAB® Version 9.9 (R2020b) [22].
If is a finite set, then is the number of its elements. Let be the set of integers and , where . We denote by the set of integers . We write and when .
The set of real (resp. complex) numbers is denoted by (resp. ), and is the imaginary unit. A quantity is said to be genuinely complex. An array (vector or matrix) is genuinely complex if at least one of its elements is genuinely complex.
A complex number z is written as with , or , where is the absolute value and is the angle of z. The complex conjugate of z is denoted as .
The sign function of the scalar argument is defined as , and for , and , resp. The sign function for real n-tuples is defined by the expression
The lexicographical order ≺ for n-tuples is defined as , and if , and , resp. Otherwise, when either , or there exists such that for and .
We use this lexicographical order for complex numbers written as real pairs and for pairs of integers . For example, for the fourth roots of 1, we have .
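The ordering of the fourth roots of 1 can be checked with a minimal Python sketch. It assumes only what the text states, namely that a complex number is compared as the real pair of its real and imaginary parts; the helper name `lex_key` is ours.

```python
# Complex numbers compared as real pairs (Re z, Im z), lexicographically.
def lex_key(z):
    """Key realizing the lexicographical order on complex numbers."""
    return (z.real, z.imag)

# Fourth roots of 1, written exactly to avoid rounding noise.
roots = [complex(1, 0), complex(0, 1), complex(-1, 0), complex(0, -1)]

# Python compares tuples lexicographically, which is exactly the order used here.
ordered = sorted(roots, key=lex_key)
print(ordered)  # increasing order: -1, -i, i, 1
```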
We denote by (resp. ) the space of complex (resp. real) matrices with elements and we set , . The i-th row and the j-th column of A are denoted as and , respectively.
The column m-vector with elements is written as , while the row n-vector with elements is denoted as .
The Kronecker delta symbol is . The identity matrix is denoted as and has elements . The anti-identity matrix has elements .
The zero matrix is denoted as with , or simply as O. We denote by the strictly lower triangular matrix with ones below its main diagonal and zeros otherwise, i.e., if and if . The elementary matrix with element 1 in position and zero otherwise is denoted as , i.e., .
The absolute value of the matrix is the matrix with elements . The transpose of the matrix is denoted and has elements . The complex conjugate transpose of is denoted and has elements .
The inverse of the non-singular matrix A is denoted as . The flipped matrix of the matrix is and has elements .
For , we denote by the element-wise product of A and B, i.e., . The spectral and the Frobenius norms of the matrix are denoted as and , respectively.
The spectrum of the matrix is the collection, or the multiset, of the eigenvalues of A, , counted according to their algebraic multiplicities. With certain abuse of notation, we write in the general case, and in the case when all eigenvalues of A are real.
The multiplicative group of non-singular matrices is denoted as , while the group of unitary matrices such that is denoted by . The group of orthonormal matrices such that is denoted as .
For , we denote by and the strictly lower triangular and the diagonal parts of A, respectively. If x is an n-vector with elements then is the matrix with elements .
The set of upper triangular matrices is denoted as , while the set of diagonal matrices is denoted as . For , the group of diagonal matrices of the form
where , is denoted as .
The set of -tuples of pairs of integers , , where , is denoted as .
We set and
In particular, we have and .
Unspecified matrix blocks are denoted by star.
We use the following abbreviations: CSF—canonical Schur form; ConSF—condensed Schur form; DPS—diagonal preserving solution; GJF—generalized Jordan form; PSP—perturbed Schur problem; SP—Schur problem.
2. Condensed Schur Forms
Let a nonzero matrix , , be given. A condensed form of A relative to a set of invertible matrices is a matrix with elements fixed in advance (usually zeros and/or ones) at certain positions. Most often, the condensed form is a triangular matrix with at least zero elements. According to the famous Schur result [1], there exists a factorization of the matrix A, where and . Two things here deserve mentioning: the proof of the Schur result is elementary, and the result appeared surprisingly late in the history of mathematics.
Definition 1.
The pair
is said to be a Schur decomposition, or an upper triangular unitary decomposition of the matrix A. The matrix T is referred to as a condensed Schur form, or ConSF of A. The columns of the unitary transformation matrix U form a Schur basis for the space relative to the matrix A.
Thus, the ConSF T of A is not uniquely defined, and hence, it is not canonical in the sense of Definitions 7 and 8 below. If and is real, then the matrix U may be chosen as , and we have . If has at least one pair of complex eigenvalues , , then a real block-diagonal Schur form with diagonal blocks and may be constructed by orthogonal similarity transformations; see, e.g., refs. [23,24].
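A Schur decomposition can be built by hand in the 2 x 2 case, which may help to fix ideas. The following Python sketch is an illustration under our own assumptions, not the algorithm used in practice (that is the QR algorithm): it takes one eigenvalue from the characteristic polynomial, normalizes an eigenvector as the first column of U, completes it to a unitary matrix, and forms T as the conjugate transpose of U times A times U. The function names are hypothetical.

```python
import cmath

def matmul(x, y):
    """Product of two 2 x 2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def ctranspose(x):
    """Complex conjugate transpose of a 2 x 2 matrix."""
    return [[complex(x[j][i]).conjugate() for j in range(2)] for i in range(2)]

def schur_2x2(a):
    """Return (U, T) with U unitary and T = U^H a U upper triangular."""
    (a11, a12), (a21, a22) = a
    # One eigenvalue from the characteristic polynomial.
    tr, det = a11 + a22, a11 * a22 - a12 * a21
    lam = (tr + cmath.sqrt(tr * tr - 4 * det)) / 2
    # Each candidate annihilates one row of (a - lam*I); keep the larger one.
    v = max([(a12, lam - a11), (lam - a22, a21)],
            key=lambda w: abs(w[0]) + abs(w[1]))
    norm = (abs(v[0]) ** 2 + abs(v[1]) ** 2) ** 0.5
    if norm == 0:            # a is a multiple of the identity
        v, norm = (1, 0), 1.0
    u1 = (v[0] / norm, v[1] / norm)
    # Second column: unit vector orthogonal to u1.
    u2 = (-complex(u1[1]).conjugate(), complex(u1[0]).conjugate())
    u = [[u1[0], u2[0]], [u1[1], u2[1]]]
    return u, matmul(matmul(ctranspose(u), a), u)

U, T = schur_2x2([[1, 2], [-3, 1j]])
print(abs(T[1][0]))  # negligible: T is upper triangular up to rounding
```

Choosing the other root of the characteristic polynomial gives a second, diagonally different solution, which illustrates the non-uniqueness discussed above.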
Next, we define two sets of matrices depending on the matrix A that play an important role in our analysis. Denote
and
Thus, is the set of unitary matrices transforming the matrix A into ConSF, and is the set of ConSF of A. For matrices with real spectra, we denote
In general, the set is not a group and not even a groupoid, i.e., does not imply .
The most important property of the ConSF T of A is that its diagonal elements are the eigenvalues of A. Another application of ConSF is the evaluation of functions of matrices defined by a matrix power series or by Padé approximations [25].
Because of the only condition imposed on T, the matrix T is a condensed form (rather than a canonical form) of A relative to the similarity action , defined by , of the group on the set .
Definition 2.
The problem of finding the ConSF (1) is referred to as the Schur problem (SP) for the matrix . The general solution of the SP is the set
of all ConSF of A. A pair is a particular solution of the SP for the matrix A.
Sometimes the matrices U and T in a particular solution of SP for A are written as and to emphasize their dependence on A. This dependence, however, is not a functional one. Indeed, the matrix U is never unique. For example, and imply and, in particular, . With the exception of the case , , for which , the matrix is also not unique. For , we have .
All upper triangular unitary equivalent forms of a given matrix are unitary similar. In particular, the next proposition is a direct corollary of the definitions; see, e.g., ref. [24].
Proposition 1.
Let and be two solutions of the SP for A. Then, .
Proof.
It suffices to observe that . □
Definition 3.
The solutions and are said to be diagonally equal if , and diagonally different if .
The next proposition has generally been known since 1933 and is attributed to H. Röseler; see Theorem 2.3 of [24]. It gives sufficient and almost necessary conditions for diagonal equality of the solutions of SP. The formulation and proof of the results below are slightly different from the known ones.
Proposition 2.
The following assertions hold true.
1. If , then the solutions and are diagonally equal.
2. If the matrix A has pair-wise distinct eigenvalues and the solutions and are diagonally equal, then .
Proof.
To prove Assertion 1 note that the condition is equivalent to the existence of a matrix such that . In this case
and .
To prove 2, we use the fact that . Partition the matrices in this equality as
where , , , and ∗ is a matrix block of corresponding size. We have
and comparing the (2,1)-blocks of these matrices, we obtain . Since , we obtain . Hence , and . Now, the proof is completed by induction. □
The MATLAB® command [U,T] = schur(A) computes a particular solution of SP for a matrix . One of the aims of computing a ConSF T of a general matrix is to determine the eigenvalues of A as the diagonal elements of . Another aim is to evaluate matrix functions
where is a power matrix series [25] in A.
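The identity behind this use of the ConSF can be written out as a short worked derivation. It is a sketch in our own notation: the coefficients $c_k$ and the symbol $f$ are assumptions, and the decomposition is taken in the form $A = U T U^{\mathrm{H}}$ with $U$ unitary.

```latex
A = U T U^{\mathrm{H}}, \quad U^{\mathrm{H}} U = I
\;\Longrightarrow\;
A^{k} = U T^{k} U^{\mathrm{H}}, \quad k = 0, 1, 2, \ldots,
\qquad
f(A) = \sum_{k=0}^{\infty} c_k A^{k}
     = U \Bigl( \sum_{k=0}^{\infty} c_k T^{k} \Bigr) U^{\mathrm{H}}
     = U f(T) U^{\mathrm{H}} .
```

Since every power of an upper triangular matrix is upper triangular, $f(T)$ is again upper triangular with diagonal elements $f(t_{ii})$, provided the series converges.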
A particular challenge is to define SP for matrices , which are already in ConSF. Here, the problem is not to transform A into ConSF (it already is), but rather to find the matrices that keep in ConSF, i.e., .
For matrices , the MATLAB® command [U,T] = schur(A) returns the solution , of the SP. Next, for and a matrix
the computed solution [U,T] is of the form
where . In particular, for , the command [U,T] = schur(eye(n)) returns , .
The MATLAB® (Maple) command [V,J] = jordan(A) for computing an invertible matrix V and a Jordan canonical form of A, in case , also gives , .
It is interesting to reveal the action of existing computer codes on matrices A close to in double-precision floating-point arithmetic.
Example 1.
Let and , where ε is small. We have . For , the solution of SP for is computed wrongly by the MATLAB® code schur as , . Choosing ε as the next machine number, which is , the code schur computes exactly one of the solutions of SP for , namely , .
The code jordan exactly computes the Jordan canonical form of A as , for values of as small as . For , the matrix is rounded to , and the computed Jordan form is , .
Note that the code jordan works with variable precision arithmetic.
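The rounding effect that underlies such examples can be reproduced without any matrix code. The following Python sketch only illustrates how double-precision arithmetic treats numbers adjacent to 1; it does not reproduce the specific threshold values of Example 1, and the variable names are ours.

```python
import sys

# Spacing of double-precision numbers just above 1.0.
eps = sys.float_info.epsilon          # equals 2**-52

half = eps / 2                        # 2**-53
# 1 + 2**-53 lies halfway between two machine numbers and is rounded back to 1.
rounded_away = (1.0 + half == 1.0)
# 1 + 2**-52 is the next machine number after 1 and is stored exactly.
representable = (1.0 + eps != 1.0)

print(rounded_away, representable)    # True True
```

A perturbation of a unit diagonal element that is smaller than this spacing is therefore invisible to any code working in double precision.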
The SP for has infinitely many solutions for both U and T with one exception for T. Namely, the matrix T is uniquely determined only when , where . In this case, and is arbitrary, i.e., the general solution of SP is . It is also interesting to see whether the order of the diagonal elements of is the same as the order of the diagonal elements of . These considerations lead to the following definition.
Definition 4.
Let . Then, the pair , where and , is said to be a diagonal preserving solution (DPS) of the SP for A if .
In other words, the pair is the DPS of SP when the upper triangular matrices A and T are diagonally equal; see Definition 3. The DPS is thus defined as a particular solution of SP. The general DPS of SP for is the set of all particular DPS solutions. In practice, we are interested in finding a particular DPS rather than determining the general DPS as a subset of .
If the matrix has simple eigenvalues, the matrix U in any particular DPS is diagonal, i.e., . To illustrate this, let and
where . We have and , which yields and .
If the matrix has multiple eigenvalues but its upper triangular part is generically nonzero, i.e., for , then the matrix U in any particular DPS is again diagonal. To illustrate this, let and , , where . We have and , which yields and .
For all , a simple choice for U in any particular DPS is , which allows treating all cases in a unified manner. Choosing any other matrix U with is of course also possible. In the real case, orthogonal diagonal matrices U have diagonal elements .
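That a diagonal unitary U preserves both the triangular structure and the diagonal can be verified directly. The Python sketch below is a minimal check under our own assumptions (the test matrix, the phases, and the function name are hypothetical): a similarity with a diagonal matrix of unit-modulus elements only rotates the off-diagonal elements.

```python
import cmath

def diag_similarity(t, phases):
    """Return D^H t D for D = diag(exp(i*phases[k])); t is a nested list."""
    n = len(t)
    d = [cmath.exp(1j * p) for p in phases]
    # Element (i, j) becomes conj(d_i) * t_ij * d_j.
    return [[d[i].conjugate() * t[i][j] * d[j] for j in range(n)]
            for i in range(n)]

# Upper triangular test matrix with a double eigenvalue 2.
T = [[2, 1 + 1j, 3], [0, 2, -1j], [0, 0, 5]]
S = diag_similarity(T, [0.3, -1.1, 2.0])

print([S[i][i] for i in range(3)])  # the diagonal of T, up to rounding
```

The absolute values of all elements are unchanged, so only the angles of the elements above the diagonal depend on the choice of the diagonal matrix.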
The case when the matrix A has multiple eigenvalues and its upper triangular part is not generically nonzero, e.g., or , where and , needs special consideration. Here, the matrix U in a particular DPS may not be diagonal. For example, in the most non-generic case , any pair with is a particular solution of SP, and hence, the general solution of SP is . Anyway, for definiteness, we choose the particular solution with and of SP each time when .
Any particular choice of U in a DPS for matrices A with multiple eigenvalues and a non-generic upper triangular part may lead to jump effects, as described below.
Suppose that , is a matrix function of a scalar argument, where is a small parameter. If the matrix has simple eigenvalues (or multiple eigenvalues but a generic upper triangular part), then the matrix in a DPS for may be chosen so that . Here, the function will be continuous in the interval . If, however, has multiple eigenvalues and a non-generic upper triangular part, then the choice may lead to a jump of the function at the point .
We may use another function with in the DPS of the SP for , such that is continuous at for the above choice of . However, for another choice of , the matrix function will have a jump at . Hence, this type of discontinuity is inherent for the statement of the SP with T being only upper triangular. The same is true if T is always lower triangular.
The next example illustrates these slightly complicated considerations.
Example 2.
A possible DPS of the SP for is . Let , where and is fixed. Then, for , we have , , and the function has a jump at with for . If another DPS is chosen with , then the matrix function satisfies for and is constant (and hence continuous) at . If, however, we take and then for , and the function is discontinuous with the same jump at the point .
Example 2 shows that the discontinuity of is inherent to this statement of the SP in cases like . This is due to the definition of T as an upper triangular matrix (or as a lower triangular matrix). If we allow the condensed form T to be any triangular unitary equivalent form of A, then this type of artificial discontinuity may disappear due to the definitions. This is the idea of the quasi-Schur condensed forms introduced later on. The latter forms generalize triangular Schur forms.
Without additional assumptions, the matrix is only a condensed form rather than a canonical form of A relative to the similarity action of the group . The only (albeit most important) invariants for this action, which are revealed by the matrix T, are the eigenvalues of the matrix A.
The definition of complete invariants and canonical forms for the similarity action of on is mathematically interesting (see [24]), but it is much more complicated and is not considered in full detail here. Further on, we consider, among others, only a partial formulation of Schur canonical forms for generic matrices A; see also [26,27,28]. From the point of view of applications, the condensed forms provide the same advantages as the canonical forms. Moreover, strict unitary canonical forms of non-generic matrices A are rarely, if ever, used in practice, since they involve complicated conditions and procedures (which are hard to check) and are more sensitive to perturbations in A than the condensed forms.
Let and . Then, as well. In particular, for and . This fact has an important implication. The diameter of the set , i.e., the maximum of for , is equal to 2 and is achieved for .
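The value 2 of the diameter follows from a one-line estimate, sketched here under the assumption that the distance is measured in the spectral norm $\|\cdot\|_2$, for which every unitary matrix has norm 1 (the symbols $U_1$, $U_2$ are ours).

```latex
\|U_1 - U_2\|_2 \le \|U_1\|_2 + \|U_2\|_2 = 2
\quad \text{for all unitary } U_1, U_2,
\qquad
\|U - (-U)\|_2 = 2\,\|U\|_2 = 2 .
```

The upper bound is thus attained at the pair $U$, $-U$, both of which transform A into the same ConSF.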
Given a matrix , neither the ConSF of A nor the transformation matrix is unique in general. In fact, the matrix U is never unique, while the matrix T is unique if and only if , . In this case, and is an arbitrary unitary matrix, or, equivalently, and .
If A has at least two different eigenvalues, then we have a set of ConSF T with different ordering of the eigenvalues of A on the diagonal of T. The ConSF T also differ in their strictly upper triangular parts.
Suppose that consists of pair-wise distinct elements with multiplicities , where . Then, there are
different orderings of the elements on the diagonal of the ConSF T, or N diagonally different solutions of the SP for A.
Here, one of the ConSF of A is the block matrix with and , where . In the generic case, , we have diagonally different ConSF, while in the most non-generic case , we have and all ConSF are diagonally equal.
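The number of diagonally different solutions is the number of distinct arrangements of a multiset, that is, a multinomial coefficient. The following Python sketch computes it and checks it by brute force on a small spectrum; the function name and the sample spectrum are our assumptions.

```python
from itertools import permutations
from math import factorial

def diagonal_orderings(multiplicities):
    """Number of distinct orderings of eigenvalues with given multiplicities."""
    n = sum(multiplicities)
    count = factorial(n)
    for m in multiplicities:
        count //= factorial(m)   # each division is exact
    return count

# Brute-force check on a spectrum with one double and two simple eigenvalues.
spectrum = ["a", "a", "b", "c"]
brute = len(set(permutations(spectrum)))

print(diagonal_orderings([2, 1, 1]), brute)  # 12 12
```

For simple eigenvalues the count reduces to the factorial of n, and for a single eigenvalue of multiplicity n it reduces to 1, in agreement with the two extreme cases mentioned above.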
3. Canonical Schur Forms for Generic Matrices
In this section, we summarize and reformulate some of the results concerning Schur canonical forms for the unitary similarity action of on the set . The canonical Schur form of the matrix is a ConSF with additional conditions imposed on its elements; see [24] and the references therein. We consider only generic matrices A with pair-wise distinct eigenvalues for which the solution of the Schur problem is continuous as a function of the matrix A. At the same time, the Schur basis U for condensed forms (and hence for canonical forms as well) of a matrix A with multiple eigenvalues may be discontinuous as a function of A.
Definition 5.
For a given matrix , the set
is called the equivalence class, or orbit, of A relative to the similarity action of the unitary group .
Obviously, implies and vice versa. Let and be certain sets.
Definition 6.
The matrices are said to be unitarily equivalent (denoted as ) if .
Definition 7.
The function is said to be a canonical mapping for the similarity action of the group on the set when the equality holds if and only if .
Hence, the canonical mapping is a complete invariant [29] for the similarity action of the group on the set , but the converse, of course, is not true. Next, the canonical form of A is defined as the image of A under .
Definition 8.
The image of the matrix A under the canonical mapping γ is said to be the unitary canonical form, or Schur canonical form, of A.
Definition 9.
The subset of is said to be closed in the Zariski topology if it is the union of the zeros of a system of polynomials in . The subset is said to be open in the Zariski topology if its complement is closed in this topology.
Definition 10.
A property of a matrix is said to be generic if it is fulfilled on a subset , which is open in the Zariski topology.
Informally, the matrix A is said to be generic relative to a given property if this property is generic.
Proposition 3.
The following properties of a matrix are generic.
1. The matrix A is totally different from any fixed matrix , i.e., for ; in particular, for any given pair .
2. The matrix A is not normal, i.e., ; in particular, the matrix A is not unitary.
3. The singular values of the matrix A are positive and pair-wise different; in particular, .
4. The eigenvalues of the matrix A satisfy the inequalities and for ; in particular, for and the Jordan canonical form of A is diagonal.
5. Any ConSF T of the matrix A has nonzero and pair-wise different elements on and above its diagonal, i.e., and for , and .
4. Geometry of Schur Canonical Sets
Let be a permutation of the integers and recall that and . Set .
Below, we describe a possible set of canonical forms for the similarity action of the group on the subset of matrices with simple eigenvalues. Let be the set of -tuples
of integer pairs , , where . There are such -tuples; see Table 1. Later on, we shall define three important types of such sets.
Table 1. Number of generic canonical Schur forms.
Definition 11.
The conjugate pair of the pair is
The pair p is self-conjugate if .
Obviously, the pair is self-conjugate if and only if .
Definition 12.
The conjugate -tuple of the -tuple , where , , is
The -tuple θ is self-conjugate if .
The conjugation for pairs p and -tuples is an involution, i.e., and . It corresponds to reflection relative to the anti-diagonal of arrays.
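The reflection about the anti-diagonal can be made concrete as follows. The displayed formula of Definition 11 is not reproduced here; the Python sketch below assumes the usual reflection of a position (i, j) in an n x n array about its anti-diagonal, which sends it to (n + 1 - j, n + 1 - i). The function name is hypothetical.

```python
def conjugate_pair(p, n):
    """Reflect the position p = (i, j) about the anti-diagonal of an n x n array."""
    i, j = p
    return (n + 1 - j, n + 1 - i)

n = 4
# Positions strictly above the main diagonal, in lexicographical order.
upper = [(i, j) for i in range(1, n + 1) for j in range(i + 1, n + 1)]

# Pairs fixed by the reflection lie on the anti-diagonal: i + j = n + 1.
self_conjugate = [p for p in upper if conjugate_pair(p, n) == p]
print(self_conjugate)  # [(1, 4), (2, 3)]
```

Under this assumption the reflection maps positions above the main diagonal to positions above the main diagonal, and applying it twice returns the original pair.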
Definition 13.
The set has the following important subsets.
1. The set is of type 1 if its elements are of the form
2. The set is of type 2 if its elements are of the form
3. The set is of type 3 if it is neither of type 1 nor of type 2.
Note that the elements of the set are conjugate to the elements of the set .
Proposition 4.
The intersection has a single element
which is a self-conjugate -tuple.
Definition 14.
The elements of the set are said to be proper. The elements of the set are said to be improper.
There are elements in each of the sets and and one joint element of and . Thus, we have
Example 3.
For , there is pair of indexes and it is proper. For , there are sets of pairs of indexes
and they are all proper. For , there are triples of pairs of indexes of which 11 are proper, namely
and 9 are improper, namely
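The total number of tuples in Example 3 can be recovered by direct enumeration. The Python sketch below assumes that a tuple consists of n - 1 distinct index pairs (i, j) with i < j, listed in increasing lexicographical order, so that tuples correspond to subsets of pairs; this reading is our assumption, and it reproduces the total of 20 = 11 + 9 triples quoted above. It does not attempt to classify tuples as proper or improper.

```python
from itertools import combinations

def index_pairs(n):
    """Index pairs (i, j) with 1 <= i < j <= n, in lexicographical order."""
    return [(i, j) for i in range(1, n + 1) for j in range(i + 1, n + 1)]

def tuples_of_pairs(n):
    """All increasing (n-1)-tuples of distinct index pairs."""
    return list(combinations(index_pairs(n), n - 1))

counts = {n: len(tuples_of_pairs(n)) for n in (2, 3, 4)}
print(counts)  # {2: 1, 3: 3, 4: 20}
```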
Proposition 5.
The minimal and maximal elements relative to the order relation ≺ on the set are
and
respectively. The minimal and maximal elements of the set are and
respectively.
Now, we are in a position to define possible sets of canonical Schur forms (CSF) for generic matrices . There are such sets. The multiplier comes from the different orders of the (simple) eigenvalues of A on the diagonal of S. The multiplier corresponds to different choices of proper -tuples
such that the elements , , of S are positive.
If the eigenvalues of S are ordered as , then there remain sets of CSF. Note that any fixed order of the (simple) eigenvalues of A on the diagonal of S is preserved only by unitary similarity transformations with diagonal matrices U, i.e., .
If, in particular, we choose a given -tuple, say
then the set of CSF is uniquely fixed. In this case, the CSF have the form
where ⊕ denotes a positive element. The two other CSFs are
Note that there is a similar problem with canonical Jordan forms of matrices relative to general similarity transformations. Usually, it is assumed that different orders of the Jordan blocks do not produce different Jordan forms. Formally, this means that the canonical Jordan form of A is not a single block-diagonal matrix but a class of block-diagonal matrices, which are permutationally equivalent to J.
Definition 15.
A set of CSF for generic matrices and a fixed -tuple
is characterized as follows.
1. The n diagonal elements of the matrix T are ordered as
2. The elements of T over the diagonal are real and positive.
Of course, we may choose the elements to be real and negative as well, or to have angles equal to a fixed value , etc.
A matrix with eigenvalues may be transformed into CSF by the next three steps.
1. The matrix A is transformed into any ConSF by a matrix . Numerically this is performed by the QR algorithm [3]. For this purpose, the code schur from MATLAB® may be used [22].
2. A ConSF is constructed so that , . This may be performed by complex plane rotations, which interchange the positions of two diagonal elements and of such that but ; see, e.g., ref. [3].
3. A diagonal matrix with elements , , , is chosen so that the matrix has positive elements in positions .
In connection with step 3, we note that unitary similarity transformations that introduce positive elements in certain positions of the transformed matrices are considered in [30].
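The last step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure of the paper: the positive elements are placed in the positions (k, k+1) immediately above the diagonal, these elements of the ConSF are assumed nonzero, the transformed matrix is taken as the conjugate transpose of D times T times D, and the first angle is fixed to zero. The function name is hypothetical.

```python
import cmath

def positive_superdiagonal(t):
    """Return (phases, s) with s = D^H t D, D = diag(exp(i*phases)).

    Assumes t is upper triangular with nonzero elements t[k][k+1];
    the result satisfies s[k][k+1] > 0.
    """
    n = len(t)
    phases = [0.0]                       # normalization: first angle is zero
    for k in range(n - 1):
        # s[k][k+1] = t[k][k+1] * exp(i*(phases[k+1] - phases[k])) must have
        # zero angle, which is a linear equation for phases[k+1].
        phases.append(phases[k] - cmath.phase(t[k][k + 1]))
    d = [cmath.exp(1j * p) for p in phases]
    s = [[d[i].conjugate() * t[i][j] * d[j] for j in range(n)]
         for i in range(n)]
    return phases, s

T = [[1, 2j, 1 - 1j], [0, 2, -3 + 1j], [0, 0, 3]]
phases, S = positive_superdiagonal(T)
print(S[0][1], S[1][2])  # real and positive, up to rounding
```

The diagonal of T, and hence the chosen order of the eigenvalues, is not affected by this step.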
We recall that to introduce CSF in the set relative to the similarity action of , we use the lexicographical order ≺ on . For , where , , we write if either , or and .
There are
sets of generic canonical forms , , for . The values of for small values of n are given in Table 1.
There are different pairs of , . They are ordered lexicographically according to the rule if either , or and . We may order the pairs as , where
Thus, we have the chain of inequalities
For any , denote the conjugate pair , symmetric to p relative to the anti-diagonal of elements in positions , .
It follows from (2) that for n fixed and any there exists a unique integer such that , where and .
The integer may be defined from
or
Finally, set
Thus, we have defined a bijection
between the ordered sets of integers and integer pairs , where .
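One explicit realization of such a bijection is sketched below in Python. The closed-form expression is our own derivation from the lexicographical order of the pairs (i, j) with i < j and is not claimed to coincide with the formulas of the paper; the function names are hypothetical.

```python
def pair_to_index(i, j, n):
    """1-based position of the pair (i, j), i < j, in lexicographical order."""
    # Pairs whose first index is smaller than i: (n-1) + (n-2) + ... + (n-i+1).
    return (i - 1) * n - i * (i - 1) // 2 + (j - i)

def index_to_pair(k, n):
    """Inverse of pair_to_index, found by walking over the rows."""
    i = 1
    while k > n - i:         # row i contains n - i pairs
        k -= n - i
        i += 1
    return (i, i + k)

n = 3
print([index_to_pair(k, n) for k in range(1, 4)])  # [(1, 2), (1, 3), (2, 3)]
```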
Proposition 6.
The triple of pairs of indexes , where , , and , , is said to be improper. The triple of pairs of indexes , where , , and , , is said to be improper.
Theorem 1.
Each set of proper integer -tuples defines a class of canonical forms for the unitary similarity action of the group on generic matrices . These forms are upper triangular matrices S with for and , for .
If the matrix with eigenvalues is already transformed into ConSF, i.e., , it is then easily put into CSF as follows. First, a matrix is chosen so as
Then, a diagonal unitary matrix D with is found so that . Denoting and , , where , the conditions give the system of linear equations
for . If it happens that for some k then is replaced by
Three special sets of Schur canonical forms for generic matrices deserve attention. For these sets, the system (3) for , , is solved explicitly as follows.
- The first set corresponds to pairs of indexes , , and here .
- The second set corresponds to pairs of indexes , , and here .
- The third set corresponds to pairs of indexes , , and here .
In all these cases, the fulfilment of the convention (4) is presupposed.
The restrictions assumed in this section, and in particular, the condition that the eigenvalues of A are simple, seem serious, but in fact, their violation can make the perturbation analysis of this statement of SP meaningless. If, for example, A has two or more equal eigenvalues, then the Schur basis of the perturbed SP may be discontinuous as a function of the perturbation in A; see, e.g., ref. [6].
5. Real Schur Canonical Forms
The considerations above are valid for real or genuinely complex matrices with spectra that may in turn be real or genuinely complex. In particular, we have the following four possibilities.
1. The matrix A is real and has a real spectrum.
2. The matrix A is real and has a genuinely complex spectrum (i.e., there is at least one complex conjugate pair of eigenvalues, where ).
3. The matrix A is genuinely complex and has a real spectrum.
4. The matrix A is genuinely complex and has a genuinely complex spectrum.
When (cases 1 and 2), we may use orthogonal transformation matrices instead of unitary ones to obtain the real Schur canonical form and the real Schur condensed forms of A. In case 1, the transformation matrix is taken as and both the CSF and the ConSF of A are real upper triangular matrices T with the eigenvalues of A on their main diagonals.
Case 2 is slightly more subtle. Here, the transformation matrix U may be chosen as orthogonal [23,24], while the canonical form and the condensed forms of A are upper block-triangular matrices with or blocks ( or ) on the main diagonal. In this case, there is at least one block
corresponding to the eigenvalues of A, where and .
Let and suppose that the spectrum contains m real elements and genuinely complex elements , , where the number is even. Set . Then, the orthogonal canonical form of A has the structure
Here , , ,
and , . The diagonal blocks are ordered as and , .
6. Perturbations of the Schur Problem
Let be a particular DPS of SP for a matrix , i.e., and . If A is already in ConSF, i.e., , there is nothing to transform. Here, we may choose for definiteness among all unitary matrices U that keep the diagonal of A under similarity transformations . In addition, if the matrix has a simple spectrum, and/or has multiple eigenvalues but its upper triangular form is generically nonzero, then any unitary matrix U in the DPS is necessarily diagonal, i.e., .
The choice of U as the simplest diagonal matrix for is justified by a number of additional arguments as follows.
- It works for arbitrary matrices A with simple eigenvalues as well as with multiple eigenvalues.
- When using ConSF for the evaluation of matrix functions, the computational algorithm has to work with a value for U, which is applicable to all input matrices A. For this reason, the QR algorithm [3] for computing T first checks whether . If yes, it assumes and .
- Algorithms for computing a Jordan form of A act similarly. Indeed, for A already in Jordan form, computational algorithms for finding assume and .
As shown later on, the choice in the solution of SP for may lead to discontinuity of the function at the point , in particular solutions of SP for a perturbed matrix with a given E. If we choose , this discontinuity disappears. However, for another perturbed matrix , the discontinuity occurs as well. This inevitable discontinuity is due to the fact that ConSF T is upper triangular and for a lower triangular perturbation , the perturbed transformation matrix is a flip matrix such as .
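The jump caused by a lower triangular perturbation can be seen on the smallest possible case. In the Python sketch below the unperturbed matrix is taken to be the 2 x 2 zero matrix, which is our simplifying assumption, and the perturbation has a single nonzero element in position (2, 1). For every nonzero value of the parameter, the only eigenvector direction is the second unit vector, so the first column of any admissible U stays far from the first column of the identity.

```python
def matmul(x, y):
    """Product of two square matrices given as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Flip (anti-identity) matrix: real, orthogonal, and symmetric.
F = [[0, 1], [1, 0]]

def perturbed(eps):
    """A(eps): zero matrix plus a strictly lower triangular perturbation."""
    return [[0.0, 0.0], [eps, 0.0]]

def schur_pair(eps):
    """A particular solution (U, T) of the SP for A(eps); U(0) = I by convention."""
    a = perturbed(eps)
    if eps == 0:
        return [[1, 0], [0, 1]], a
    # F^T A F moves the element eps from position (2, 1) to position (1, 2).
    return F, matmul(matmul(F, a), F)

U, T = schur_pair(1e-8)
# Frobenius norm of U(eps) - U(0) for any nonzero eps.
jump = sum((F[i][j] - (i == j)) ** 2 for i in range(2) for j in range(2)) ** 0.5
print(T, jump)  # T is upper triangular with a tiny element; jump is 2.0
```

The Schur form itself tends to the unperturbed one as the parameter tends to zero; only the transformation matrix jumps.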
Let be a perturbation in A. Usually (but not always), we suppose that the matrix is small relative to A, e.g., , where is the rounding unit of FPA used in the computations [31].
We often assume that the perturbation is a 1-parameter family , where is a small parameter, and is a fixed matrix with , i.e., . The technique of the so-called fictitious small parameter can also be used in the perturbation analysis of matrix problems. Assuming that is small relative to , we use the identity , where and is finally set to .
The formulation of the perturbed Schur problem (PSP), i.e., the SP for a perturbed matrix , is not trivial. First, we mention two facts.
- If PSP for a perturbed matrix has a perturbed solution with as , then it also has a perturbed solution with and as . The reason is that .
- The solution of PSP for a matrix may have the form with . This will happen when , and in this case, .
Let be a particular solution of SP for a matrix . For definiteness and based on the above considerations, we assume that, if , then and .
Consider the perturbation , where , is a small parameter and is a fixed matrix with . Let be a particular solution to PSP for the matrix , i.e.,
Since the solution of PSP always exists, we have defined functions and through relations (5). The problem is that there are many such functions and not all of them are suitable for perturbation analysis. The aim of the next definition is to clarify the concepts in this area.
Definition 16.
The pair is said to be a regular solution of the PSP for the matrix if the functions U and T are continuous on the interval and, if , then is the DPS of SP for .
Example 4.
Let , . Then, the general solution of SP for A is . The opposite statement is also true in the form of the next two assertions.
1. If , then and .
2. If , then and .
Example 5.
Let , where and is a Jordan block with a zero eigenvalue. Then,
A number of examples of ConSF for presented in the next section illustrate the structure of these forms and the behavior of their perturbations; see also [7].
7. Examples of Real Matrices
In this section, we consider several examples illustrating the concepts introduced so far. The examples are for SP and PSP for matrices with for which the transformation group is . This is the simplest non-trivial case. However, the effects observed are valid for matrices of the form
where , , and .
Matrices correspond to linear operators and have the simplest nontrivial albeit rich structure. A surprisingly large number of facts about general linear operators is revealed by such matrices; see, e.g., ref. [32] and the examples below.
Example 6.
Let the matrix have eigenvalues and set
Then, the following four cases are possible in which the statements are reversible.
- If and , then there exists a unique ConSF of the matrix A.
- If and , then there exist two ConSF of the matrix A.
- If and , then there exist two ConSF and of the matrix A.
- If and , then there exist four ConSF , of the matrix A.
Example 7.
Let , . We have and . Since is in ConSF, we consider the simplest particular DPS, which is . Let the matrix be perturbed to , where ε is a small parameter. Then, a particular solution of PSP may be written as , , where . Here, the perturbation is small, of order ε, but the perturbation is not small even though ε is small.
For , the set of transformation matrices consists of the matrices , and their negations. In view of the equalities , we have and . At the same time, for , the set of Schur forms consists of two matrices . Thus, the transformation matrix is discontinuous at the point .
Consider the multivalued function , where is the set of subsets of , defined by . We have and for . Hence, the function Ψ, i.e., the Schur basis for relative to the matrix , is discontinuous at the point , while the Schur forms of are continuous in ε.
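The contrast between a continuous Schur form and a discontinuous transformation matrix can be reproduced numerically. The following minimal Python sketch does not use the matrices of Example 7; it assumes a hypothetical pair, namely the 2 × 2 identity matrix perturbed to [[1, ε], [ε, 1]]. The helper names `schur_2x2` and `frob_dist` are illustrative, and the helper handles only real 2 × 2 matrices with real eigenvalues.

```python
import math

def schur_2x2(A):
    """Return (U, T) with U orthogonal and T = U^T A U upper triangular
    (up to rounding) for a real 2x2 matrix A with real eigenvalues.
    Illustrative helper: the larger eigenvalue is placed at T[0][0]."""
    (a, b), (c, d) = A
    disc = (a - d) ** 2 + 4.0 * b * c
    if disc < 0.0:
        raise ValueError("complex eigenvalues are not handled in this sketch")
    lam = 0.5 * (a + d + math.sqrt(disc))
    # Two algebraically equivalent eigenvector formulas; keep the longer one.
    v = max((b, lam - a), (lam - d, c), key=lambda w: math.hypot(*w))
    norm = math.hypot(*v)
    # For a scalar matrix every orthogonal U works; this sketch chooses U = I.
    x, y = (1.0, 0.0) if norm == 0.0 else (v[0] / norm, v[1] / norm)
    U = [[x, -y], [y, x]]  # rotation whose first column is the eigenvector
    AU = [[sum(A[i][k] * U[k][j] for k in range(2)) for j in range(2)]
          for i in range(2)]
    T = [[sum(U[k][i] * AU[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]
    return U, T

def frob_dist(X, Y):
    """Frobenius norm of X - Y for 2x2 matrices given as nested lists."""
    return math.sqrt(sum((X[i][j] - Y[i][j]) ** 2
                         for i in range(2) for j in range(2)))

# Unperturbed problem: A = I, with solution U0 = I, T0 = I.
U0, T0 = schur_2x2([[1.0, 0.0], [0.0, 1.0]])

for eps in (1e-2, 1e-4, 1e-6):
    U, T = schur_2x2([[1.0, eps], [eps, 1.0]])
    print(f"eps={eps:.0e}  |T - T0| = {frob_dist(T, T0):.2e}  "
          f"|U - U0| = {frob_dist(U, U0):.3f}")
```

In this hypothetical instance, the distance between the perturbed and unperturbed Schur forms shrinks proportionally to ε, whereas the computed transformation matrix stays a rotation by 45 degrees for every nonzero ε and therefore does not approach the identity.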
Example 8.
Let be a Jordan block with eigenvalue . The set contains two matrices and , while the set contains the matrices , and their negations.
Let the matrix be perturbed to , where . The eigenvalues of are , . Setting
we see that there are four ConSF
and
The orthogonal matrices that transform into are
and
Hence, there are two regular solutions of this PSP, namely and corresponding to the unperturbed ConSF and , respectively.
Example 9.
Let , where . Here, the set contains two diagonally different ConSF and of , while the set has eight elements, namely and . Let us choose . For , the set has four elements , and and .
The matrices from Example 7 transform the perturbed matrix into the ConSF and the matrices transform into the ConSF since and transform in , respectively.
Consider the transformation of into some of the Schur forms or . Define the orthogonal matrices
where
We have
Furthermore, and are fulfilled. Hence, the regular solution of the PSP is .
Example 10.
Let , where and . The set of ConSF of contains four matrices:
Let the matrix be perturbed to , where ε is a small parameter such that , i.e., , where . The ConSF of the matrix are
where the quantities , and are analytic functions of ε. In particular,
Among the four ConSF, only the matrix corresponds to a regular solution.
8. Diagonally Spectral Matrices
Denote by the set of matrices such that the multiset of its diagonal elements is equal to the multiset of its eigenvalues, i.e.,
In other words, is the set of matrices A such that
The set is defined by n algebraic equations (7) (some of them may not be independent) in the elements of the matrix A and is hence a closed algebraic variety [29] of complex dimension of at least .
Upper triangular matrices and lower triangular matrices are diagonally spectral. Schur condensed forms in particular are diagonally spectral. More generally, if is a permutation matrix and is a diagonally spectral matrix, then the matrix is also diagonally spectral.
Example 11.
The elements of the matrix satisfy one independent algebraic equation . Hence, the matrix has the form
where ∗ denotes unspecified matrix elements.
Example 12.
The matrices , where
are diagonally spectral.
Matrices from may not be condensed in the sense that they may have less than zero elements. In particular, matrices from may have all their elements different from zero.
Example 13.
Let be a parameter. Then, the matrices
are diagonally spectral, i.e., . We stress that but for all .
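Membership in the set of diagonally spectral matrices can be tested exactly for matrices with rational entries by comparing the characteristic polynomial of A with the product of the factors (x − a_ii). The sketch below is one possible way to do this and is not taken from the paper: it uses the Faddeev-LeVerrier recursion as a convenient exact method, all function names are illustrative, and the 3 × 3 test matrix is a hypothetical example (not the one-parameter family of Example 13) in which every element is nonzero.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def char_poly(A):
    """Coefficients of det(x*I - A), highest power first, computed exactly
    by the Faddeev-LeVerrier recursion (rational entries assumed)."""
    n = len(A)
    A = [[Fraction(x) for x in row] for row in A]
    coeffs = [Fraction(1)]
    M = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        AM = matmul(A, M)
        # M_k = A * M_{k-1} + (previous coefficient) * I
        M = [[AM[i][j] + (coeffs[-1] if i == j else 0) for j in range(n)]
             for i in range(n)]
        AMk = matmul(A, M)
        coeffs.append(-sum(AMk[i][i] for i in range(n)) / k)
    return coeffs

def poly_from_roots(roots):
    """Coefficients of prod (x - r) over the given roots, highest power first."""
    p = [Fraction(1)]
    for r in roots:
        q = p + [Fraction(0)]          # multiply p by x
        for i, coef in enumerate(p):   # subtract r * p
            q[i + 1] -= Fraction(r) * coef
        p = q
    return p

def is_diagonally_spectral(A):
    """True if the multiset of diagonal elements equals the spectrum of A."""
    diagonal = [A[i][i] for i in range(len(A))]
    return char_poly(A) == poly_from_roots(diagonal)

def permute(A, p):
    """Permutation similarity: entry (i, j) of the result is A[p[i]][p[j]]."""
    return [[A[i][j] for j in p] for i in p]

# Hypothetical full matrix: identity plus a nilpotent matrix with zero diagonal.
A = [[1, 1, 1],
     [2, 1, -2],
     [2, 2, 1]]
print(is_diagonally_spectral(A))                      # all eigenvalues equal 1
print(is_diagonally_spectral(permute(A, [2, 0, 1])))  # permutation similarity
print(is_diagonally_spectral([[1, 1], [1, 1]]))       # eigenvalues 0 and 2
```

The coefficient of the second-highest power always matches, since the trace equals the sum of the diagonal elements, so at most n − 1 of the n coefficient equations impose genuine constraints. This agrees with the remark that some of the defining equations may not be independent.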
The main advantage of a Schur canonical or condensed form
of a matrix is that it reveals the spectrum of the matrix A as the collection of the diagonal elements
of the form T. Thus, the sets of Schur canonical and condensed forms are subsets of the larger set (closed in the Zariski topology)
where
of matrices having spectra equal to the collection of their diagonal elements.
Obviously, the set as well as the set of lower triangular matrices are subsets of . More generally, if is a permutation matrix (i.e., the columns of P are a permutation of the columns of the identity matrix ) and then . In particular, if , then the matrix is lower triangular with , .
Example 14.
Let , . One of the DPS of SP for A is the pair . If we perturb A to , where is small, the pair is transformed to , where and is any of the four matrices . Thus, and the transformation matrix is discontinuous at the point .
To avoid such artificially high sensitivity, in the next section, we introduce the concept of condensed quasi-Schur forms.
9. Condensed Quasi-Schur Forms
Let be a fixed integer.
Definition 18.
The m-tuple , where are positive integers, is said to be an m-partition of n if .
Next, we define condensed quasi-Schur forms of the matrix as block-triangular matrices (upper or lower) such that the blocks on the diagonal of A are, in turn, triangular matrices. In particular, condensed quasi-Schur forms of A are diagonally spectral and thus reveal the spectrum of A. The idea is that these forms are less sensitive to perturbations in A than other unitarily equivalent forms of A such as ConSF.
Definition 19.
A matrix , where , is said to be a condensed quasi-Schur form of if there exists an m-partition ν of n such that S or is block-upper triangular with diagonal blocks , i.e.,
and either or , .
Note that the upper triangular matrix is permutationally equivalent to the lower triangular matrix . At the same time, a matrix A and its transpose are similar [33], but we cannot use this result here since the matrix V is not unitary in general. Instead of the transposed matrix , we may use its flipped variant , which is lower triangular whenever A is upper triangular and is a permutation matrix.
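The flipped variant can be illustrated by a short Python sketch. It assumes that the permutation matrix in question is the reversal (anti-identity) matrix P, so that the flipped matrix is P A P; the function names are illustrative.

```python
def flip(A):
    """Return P A P, where P is the reversal permutation matrix,
    i.e., entry (i, j) of the result is A[n-1-i][n-1-j]."""
    n = len(A)
    return [[A[n - 1 - i][n - 1 - j] for j in range(n)] for i in range(n)]

def is_upper_triangular(A):
    return all(A[i][j] == 0 for i in range(len(A)) for j in range(i))

def is_lower_triangular(A):
    n = len(A)
    return all(A[i][j] == 0 for i in range(n) for j in range(i + 1, n))

A = [[1, 2, 3],
     [0, 4, 5],
     [0, 0, 6]]
F = flip(A)
print(F)                       # [[6, 0, 0], [5, 4, 0], [3, 2, 1]]
print(is_lower_triangular(F))  # True
```

Since P is a permutation matrix and hence unitary, the flipped matrix is unitarily similar to A; its diagonal is the diagonal of A in reverse order.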
Example 15.
For , the condensed quasi-Schur forms are , where . For , the condensed quasi-Schur forms are , and , where ,
and the star denotes unspecified elements.
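The structural condition of Definition 19 can be checked mechanically. The following Python sketch rests on one reading of the definition, which should be treated as an assumption: for a given partition ν, either S or its transpose is block upper triangular with respect to ν, and every diagonal block is itself upper or lower triangular. The function names and the 4 × 4 test matrix are hypothetical.

```python
def block_index(nu):
    """Map each row/column index to the index of its block in nu."""
    return [b for b, size in enumerate(nu) for _ in range(size)]

def transpose(S):
    return [list(col) for col in zip(*S)]

def is_triangular(B):
    """True if the square block B is upper or lower triangular."""
    n = len(B)
    upper = all(B[i][j] == 0 for i in range(n) for j in range(i))
    lower = all(B[i][j] == 0 for i in range(n) for j in range(i + 1, n))
    return upper or lower

def is_block_upper_with_triangular_blocks(S, nu):
    blk = block_index(nu)
    n = len(S)
    # All blocks strictly below the block diagonal must vanish.
    if any(S[i][j] != 0 for i in range(n) for j in range(n) if blk[i] > blk[j]):
        return False
    # Each diagonal block must itself be triangular.
    start = 0
    for size in nu:
        block = [row[start:start + size] for row in S[start:start + size]]
        if not is_triangular(block):
            return False
        start += size
    return True

def is_condensed_quasi_schur(S, nu):
    """Check the assumed reading of Definition 19 for the partition nu."""
    if any(k <= 0 for k in nu) or sum(nu) != len(S):
        raise ValueError("nu must consist of positive integers summing to n")
    return (is_block_upper_with_triangular_blocks(S, nu)
            or is_block_upper_with_triangular_blocks(transpose(S), nu))

# Hypothetical example: lower triangular first block, upper triangular second.
S = [[1, 0, 7, 8],
     [2, 3, 9, 1],
     [0, 0, 4, 5],
     [0, 0, 0, 6]]
print(is_condensed_quasi_schur(S, (2, 2)))   # True
print(is_condensed_quasi_schur(S, (4,)))     # False: S is not triangular
```

The test matrix is neither upper nor lower triangular, yet its eigenvalues 1, 3, 4, 6 can be read off the diagonal because the block structure with partition (2, 2) is triangular at both levels.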
Condensed quasi-Schur forms are diagonally spectral, but the converse is not true for ; see Example 13. Obviously, a ConSF is also a condensed quasi-Schur form, but the converse may not be true (we recall that ). We stress that the high sensitivity of Schur forms as in Example 14 may not be observed for condensed quasi-Schur forms.
We stress, finally, that the concept of a diagonally spectral matrix as an algebraic object may be of independent interest in matrix analysis and should be independently studied in more detail.
10. Properties of Canonical/Condensed Forms
Canonical and condensed forms of matrices and matrix pencils under the action of matrix transformation groups are widely used in matrix analysis and control theory. These forms have zeros at given positions and, optionally, ones at other positions. Introducing zeros at given positions may lead to discontinuity of the transformation matrix, while introducing ones may cause ill-conditioning of this matrix. In particular, the requirement that the condensed form T of A is upper triangular, i.e., , may lead to extreme sensitivity of the transformation pair relative to perturbations in the matrix A. At the same time, this high sensitivity may not be relevant to the problem of computing the spectrum of A. Moreover, whether a condensed form is upper or lower triangular matters from neither a theoretical nor a practical point of view. The concept of a condensed quasi-Schur form of a matrix exploits this idea: the condensed form is defined as a set containing all triangular forms.
This approach may be extended to generalized Jordan forms (GJF) of matrices. We recall that a GJF of the matrix A is a bi-diagonal matrix with the eigenvalues of A on its diagonal and the property that implies . The difference from the standard Jordan form of A is that the nonzero super-diagonal elements of G are not necessarily equal to 1.
11. Conclusions
Novelty
In this paper, we consider condensed Schur forms for a general square matrix A as well as various sets of canonical Schur forms for a square matrix A with distinct eigenvalues. This case is generic, and hence, it is a primary problem [34] in the analysis of Schur decompositions. We also study the most non-generic case of scalar matrices, which belong to a one-dimensional variety in the space of square matrices.
The sensitivity of the Schur forms relative to perturbations in A is also studied. The concepts of a diagonal preserving solution to the Schur problem and of a regular solution to the perturbed Schur problem are introduced and illustrated by many examples.
We also introduce the concepts of diagonally spectral matrices and of quasi-Schur condensed forms of a matrix A. The latter forms are much less sensitive (if at all) to perturbations in the matrix A than the upper triangular condensed forms of A. The concept of a diagonally spectral matrix yields a new algebraic object that may be studied independently in matrix theory as a closed variety in the Zariski topology.
Author Contributions
Methodology, M.M.K.; Validation, P.H.P.; Formal analysis, M.M.K.; Writing—review & editing, P.H.P. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Data are contained within the article.
Acknowledgments
The authors of this paper are grateful to the anonymous reviewers for their very useful and detailed comments and suggestions, which helped to improve the text.
Conflicts of Interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
References
- Schur, I. Beiträge zur Theorie der Gruppen linearer homogener Substitutionen. Trans. Am. Math. Soc. 1909, 10, 159–175.
- Bhatia, R. Matrix Analysis; Springer: Berlin/Heidelberg, Germany, 1996; ISBN 978-0387948461.
- Golub, G.; Van Loan, C. Matrix Computations, 4th ed.; The Johns Hopkins University Press: Baltimore, MD, USA, 2013; ISBN 978-1421407944.
- Horn, R.; Johnson, C. Matrix Analysis, 2nd ed.; Cambridge University Press: Cambridge, UK, 2012; ISBN 978-0521839402.
- Horn, R.; Johnson, C. Topics in Matrix Analysis; Cambridge University Press: Cambridge, UK, 1991; ISBN 978-0511840371.
- Konstantinov, M.; Petkov, P.; Christov, N. Nonlocal perturbation analysis of the Schur system of a matrix. SIAM J. Matrix Anal. Appl. 1994, 15, 383–392.
- Konstantinov, M.; Petkov, P. Perturbation Methods in Matrix Analysis and Control; Nova Science Publishers: New York, NY, USA, 2020; ISBN 978-1536174700.
- Minenkova, A.; Nitch-Griffin, E.; Olshevsky, V. Gohberg-Kaashoek numbers and forward stability of Schur canonical forms. arXiv 2021, arXiv:2110.15334.
- Minenkova, A.; Nitch-Griffin, E.; Olshevsky, V. Backward stability of the Schur decomposition under small perturbations. Linear Algebra Appl. 2024, in press.
- Zhang, G.; Li, H.; Wei, Y. Componentwise perturbation analysis for the generalized Schur decomposition. Calcolo 2022, 59, 19.
- Konstantinov, M.; Mehrmann, V.; Petkov, P. Perturbation analysis of Hamiltonian Schur and block-Schur forms. SIAM J. Matrix Anal. Appl. 2001, 23, 387–424.
- Chu, E. Pole assignment via the Schur form. Syst. Control. Lett. 2007, 56, 303–314.
- Chu, D.; Liu, X.; Mehrmann, V. A numerical method for computing the Hamiltonian Schur form. Numer. Math. 2007, 105, 375–412.
- Chen, J.; Ma, W.; Miao, Y.; Wei, Y. Perturbations of Tensor-Schur decomposition and its applications to multivariable control systems and facial recognitions. Neurocomputing 2023, 547, 126446.
- Stewart, G.; Sun, J. Matrix Perturbation Theory; Academic Press: Cambridge, MA, USA, 1990; ISBN 978-0126702309.
- Konstantinov, M.; Gu, D.; Mehrmann, V.; Petkov, P. Perturbation Theory for Matrix Equations; Science Direct: Amsterdam, The Netherlands, 2003; ISBN 0-444513159.
- Konstantinov, M.; Petkov, P.; Christov, N. Invariants and canonical forms for linear multivariable systems under the action of orthogonal transformation groups. Kybernetika 1981, 17, 413–424.
- Konstantinov, M.; Postlethwaite, I.; Gu, D.; Petkov, P. Perturbation analysis of orthogonal canonical forms. Linear Algebra Appl. 1997, 251, 267–291.
- Boley, D.; Datta, B. Numerical Methods for Linear Control Systems. In Systems and Control in the Twenty-First Century. Systems & Control: Foundations & Applications; Byrnes, C., Ed.; Birkhäuser: Boston, MA, USA, 1997; Volume 22.
- Sun, J. Perturbation bounds for the generalized Schur decomposition. SIAM J. Matrix Anal. Appl. 1995, 16, 1328–1340.
- Sun, J. Perturbation analysis of system Hessenberg and Hessenberg/triangular forms. Linear Algebra Appl. 1996, 241/243, 811–849.
- The MathWorks, Inc. MATLAB Version 9.9.0.1538559 (R2020b); The MathWorks, Inc.: Natick, MA, USA, 2020.
- Murnaghan, F.; Wintner, A. A canonical form for real matrices under orthogonal transformations. Proc. Natl. Acad. Sci. USA 1931, 17, 417–420.
- Shapiro, H. A survey of canonical forms and invariants for unitary similarity. Linear Algebra Appl. 1991, 147, 101–167.
- Higham, N. Functions of Matrices: Theory and Computation; SIAM: Philadelphia, PA, USA, 2008.
- Brenner, J. The problem of unitary equivalence. Acta Math. 1951, 86, 297–308.
- Littlewood, D. On unitary equivalence. J. Lond. Math. Soc. 1953, 28, 314–322.
- Ikramov, K. The canonical Schur form of a matrix with simple eigenvalues. Dokl. Math. 2008, 77, 359–360.
- Hartshorne, R. Algebraic Geometry; Springer: Berlin/Heidelberg, Germany, 1977; ISBN 978-0387902449.
- Ikramov, K. On a constructive procedure for verifying whether a matrix can be made real by a unitary similarity transformation. Comput. Math. Math. Phys. 2010, 50, 383–386.
- IEEE Std 754-2019; IEEE Standard for Floating-Point Arithmetic. IEEE Computer Society: Washington, DC, USA, 2019.
- Glazman, M.; Ljubich, J. Finite-Dimensional Linear Analysis: A Systematic Presentation in Problem Form (Dover Books in Mathematics); Dover Publications: New York, NY, USA, 2006; ISBN 978-0486453323.
- Taussky, O.; Zassenhaus, H. On the similarity transformation between a matrix and its transpose. Pac. J. Math. 1959, 9, 893–896.
- Arnold, V. Geometric Methods in the Theory of Ordinary Differential Equations; Springer Science + Business Media: New York, NY, USA, 1988; ISBN 978-1461269946.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).