
Numerical Range of Moore–Penrose Inverse Matrices

Department of Mathematics, Soochow University, Taipei 111002, Taiwan
Mathematics 2020, 8(5), 830; https://doi.org/10.3390/math8050830
Received: 29 April 2020 / Revised: 16 May 2020 / Accepted: 19 May 2020 / Published: 20 May 2020
(This article belongs to the Section Computational Mathematics)

Abstract

Let $A$ be an $n \times n$ matrix. The numerical range of $A$ is defined as $W(A) = \{x^*Ax : x \in \mathbb{C}^n,\ x^*x = 1\}$. The Moore–Penrose inverse $A^+$ of $A$ is the unique matrix satisfying $AA^+A = A$, $A^+AA^+ = A^+$, $(AA^+)^* = AA^+$, and $(A^+A)^* = A^+A$. This paper investigates the numerical range of the Moore–Penrose inverse $A^+$ of a matrix $A$ and examines the relation between the numerical ranges $W(A^+)$ and $W(A)$.
Keywords: Moore–Penrose inverse; numerical range; weighted shift matrix

1. Introduction

Let $A \in M_{m,n}$, the set of $m \times n$ complex matrices. The Moore–Penrose inverse $A^+$ of $A$ is the unique matrix that satisfies the following properties [1,2]:
$$AA^+A = A, \quad A^+AA^+ = A^+, \quad (AA^+)^* = AA^+, \quad \text{and} \quad (A^+A)^* = A^+A.$$
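The four defining conditions are easy to check numerically. The following sketch (not part of the paper) uses NumPy's SVD-based `np.linalg.pinv` on a random rectangular complex matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3)) + 1j * rng.standard_normal((4, 3))
Ap = np.linalg.pinv(A)  # Moore-Penrose inverse via SVD

# the four Penrose conditions
cond1 = np.allclose(A @ Ap @ A, A)
cond2 = np.allclose(Ap @ A @ Ap, Ap)
cond3 = np.allclose((A @ Ap).conj().T, A @ Ap)  # A A^+ is Hermitian
cond4 = np.allclose((Ap @ A).conj().T, Ap @ A)  # A^+ A is Hermitian
```

All four conditions hold up to floating-point tolerance for any matrix, square or rectangular.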
Consider the system of linear equations
$$Ax = b, \quad b \in \mathbb{C}^m.$$
Moore and Penrose showed that $x = A^+b$ minimizes $\|x\|_2$ among all vectors $x$ for which $\|Ax - b\|_2$ is minimal. The theory and applications of the Moore–Penrose inverse can be found, for example, in [3,4,5].
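This minimum-norm least-squares property can be illustrated with a rank-deficient system, where least-squares solutions are not unique. A minimal sketch (the specific matrix is an arbitrary example, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 4))
A[:, 3] = A[:, 0] + A[:, 1]          # force rank 3: column 3 is dependent
b = rng.standard_normal(5)

x_pinv = np.linalg.pinv(A) @ b       # A^+ b
# lstsq also returns the minimum-norm least-squares solution
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)

# any other least-squares solution differs by a null-space vector
v = np.array([1.0, 1.0, 0.0, -1.0])  # A @ v = 0 by construction
x_other = x_pinv + v                  # same residual, larger norm
```

Since $A^+b$ is orthogonal to the null space of $A$, adding any nonzero null vector keeps the residual but strictly increases the norm.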
Let $M_n$ be the set of $n \times n$ complex matrices. The numerical range of $A \in M_n$ is defined as
$$W(A) = \{x^*Ax : x \in \mathbb{C}^n,\ x^*x = 1\}.$$
The numerical radius $w(A)$ of $A$ is defined by $w(A) = \max\{|z| : z \in W(A)\}$. The well-known Toeplitz–Hausdorff theorem asserts that $W(A)$ is a convex set containing the spectrum $\sigma(A)$ of $A$. There are several fundamental facts about the numerical ranges of square matrices:
(a) $W(\beta A + \gamma I) = \beta W(A) + \{\gamma\}$ for all $\beta, \gamma \in \mathbb{C}$;
(b) $W(U^*AU) = W(A)$ for every unitary $U$;
(c) $W(C \oplus D) = \text{convex hull}\,\{W(C) \cup W(D)\}$, where $C \oplus D = \begin{pmatrix} C & 0 \\ 0 & D \end{pmatrix} \in M_{m+n}$ is the direct sum of $C \in M_m$ and $D \in M_n$;
(d) $W(A) \subseteq \mathbb{R}$ if and only if $A$ is Hermitian;
(e) if $A$ is normal, then $W(A)$ is the convex hull of $\sigma(A)$.
(For references on the numerical range and its generalizations, see, for instance, ref. [6]).
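Fact (e) can be checked numerically using a standard characterization of the numerical radius via supporting lines, $w(A) = \max_\theta \lambda_{\max}\big(\tfrac{1}{2}(e^{i\theta}A + e^{-i\theta}A^*)\big)$. The sketch below (an illustration, not from the paper) approximates this maximum on an angle grid and compares it with the spectral radius of a normal matrix:

```python
import numpy as np

def numerical_radius(A, m=2000):
    # w(A) = max over theta of the largest eigenvalue of Re(e^{i theta} A);
    # a fine grid slightly underestimates the true maximum
    r = 0.0
    for t in np.linspace(0, 2 * np.pi, m, endpoint=False):
        H = (np.exp(1j * t) * A + (np.exp(1j * t) * A).conj().T) / 2
        r = max(r, np.linalg.eigvalsh(H)[-1])
    return r

# (e): for a normal matrix, W(A) is the convex hull of sigma(A),
# so the numerical radius equals the spectral radius
A = np.diag([2 + 1j, -1, 0.5j])
w = numerical_radius(A)
rho = max(abs(np.linalg.eigvals(A)))
```

For non-normal matrices $w(A)$ generally exceeds the spectral radius; the grid resolution controls the approximation error.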
The numerical range of the inverse $A^{-1}$ of a nonsingular matrix $A$ is studied in [7,8], where the spectrum of any matrix is characterized as the intersection of a family of numerical ranges of inverses of nonsingular matrices. In this paper, we investigate the numerical range of the Moore–Penrose inverse and examine the relationship between the numerical ranges $W(A^+)$ and $W(A)$. In particular, we prove in Section 2 that $0 \in W(A)$ if and only if $0 \in W(A^+)$, and
$$\sigma(A) \subseteq W(A) \cap \frac{1}{W(A^+)}$$
if $AA^+ = A^+A$.
Recall that the singular value decomposition of a matrix $A \in M_{m,n}$ with rank $k$ is written as $A = U\Sigma V^*$, where $U \in M_m$ and $V \in M_n$ are unitary, and $\Sigma = (s_{ij}) \in M_{m,n}$ has $s_{ij} = 0$ for all $i \neq j$ and $s_{11} \geq s_{22} \geq \cdots \geq s_{kk} > s_{k+1,k+1} = \cdots = s_{pp} = 0$, $p = \min\{m, n\}$. The entries $s_{11}, s_{22}, \ldots, s_{pp}$ are called the singular values of $A$ (cf. [9]). The following facts list a number of useful properties of the Moore–Penrose inverse.
(F1). If $A = U\Sigma V^* \in M_{m,n}$ is a singular value decomposition of $A$, then $A^+ = V\Sigma^+U^*$.
(F2). If $A \in M_n$ is nonsingular, then $A^+ = A^{-1}$.
(F3). If $A = \mathrm{diag}(a_1, a_2, \ldots, a_k, 0, \ldots, 0) \in M_n$ with $a_j \neq 0$, $j = 1, \ldots, k$, then $A^+ = \mathrm{diag}(1/a_1, 1/a_2, \ldots, 1/a_k, 0, \ldots, 0)$.
(F4). For any nonzero vector $x \in \mathbb{C}^n = M_{n,1}$, $x^+ = x^*/(x^*x)$.
(F5). If $A \in M_{m,n}$, then for any unitary matrices $U \in M_m$ and $V \in M_n$, $(UAV)^+ = V^*A^+U^*$.
Throughout this paper, we define 1 / a = 0 if a = 0 .
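Facts (F3)–(F5) can be verified directly with NumPy; the following sketch (the particular matrices are arbitrary examples) checks each one:

```python
import numpy as np

rng = np.random.default_rng(2)

# (F3): the pseudoinverse of a diagonal matrix inverts the nonzero entries
D = np.diag([3.0, -2.0, 0.0])
f3 = np.allclose(np.linalg.pinv(D), np.diag([1 / 3, -1 / 2, 0.0]))

# (F4): for a nonzero column vector x, x^+ = x^* / (x^* x)
x = rng.standard_normal((4, 1)) + 1j * rng.standard_normal((4, 1))
f4 = np.allclose(np.linalg.pinv(x), x.conj().T / (x.conj().T @ x))

# (F5): (U A V)^+ = V^* A^+ U^* for unitary U, V (here orthogonal QR factors)
A = rng.standard_normal((3, 4))
U, _ = np.linalg.qr(rng.standard_normal((3, 3)))
V, _ = np.linalg.qr(rng.standard_normal((4, 4)))
f5 = np.allclose(np.linalg.pinv(U @ A @ V),
                 V.conj().T @ np.linalg.pinv(A) @ U.conj().T)
```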

2. Numerical Range

We begin with two examples illustrating geometric relations between the numerical ranges $W(A)$ and $W(A^+)$.
Example 1.
Consider a rank-one matrix
$$A = \begin{pmatrix} a_1 & a_2 & \cdots & a_n \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 0 \end{pmatrix} \in M_n, \quad |a_1| + |a_2| + \cdots + |a_n| \neq 0.$$
By the singular value decomposition of $A$, we find that
$$A^+ = \frac{1}{\alpha}\begin{pmatrix} \bar{a}_1 & 0 & \cdots & 0 \\ \bar{a}_2 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ \bar{a}_n & 0 & \cdots & 0 \end{pmatrix},$$
where $\alpha = |a_1|^2 + |a_2|^2 + \cdots + |a_n|^2$. Clearly, both $W(A)$ and $W(A^+)$ are elliptic disks.
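The closed-form expression for $A^+$ in Example 1 can be confirmed numerically; a minimal sketch with an arbitrary sample first row:

```python
import numpy as np

a = np.array([2.0, 1 - 1j, 3j])            # sample nonzero first row
n = a.size
A = np.zeros((n, n), dtype=complex)
A[0, :] = a                                # rank-one matrix of Example 1

alpha = np.sum(np.abs(a) ** 2)             # alpha = sum |a_j|^2
Ap_formula = np.zeros((n, n), dtype=complex)
Ap_formula[:, 0] = a.conj() / alpha        # first column: conj(a_j) / alpha

match = np.allclose(np.linalg.pinv(A), Ap_formula)
```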
On the other hand, the following example indicates that W ( A ) and W ( A + ) may differ in geometry types.
Example 2.
Let $z = 10 + 10i$. Consider the matrix
$$A = \mathrm{diag}(1, 1/z, 1/\bar{z}) \oplus \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}.$$
By (F3), and taking $n = 2$, $a_1 = a_2 = 1$ in Example 1, we have
$$A^+ = \mathrm{diag}(1, z, \bar{z}) \oplus \begin{pmatrix} 1/2 & 0 \\ 1/2 & 0 \end{pmatrix}.$$
Then $W(A^+) = W(\mathrm{diag}(1, z, \bar{z}))$ is a polygon, but $W(A) = W\!\left(\begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}\right)$ is an elliptic disk.
The following result can be easily derived from facts (F5) and (F3).
Theorem 1.
Let $A \in M_n$. Then $A$ is normal (resp. Hermitian) if and only if $A^+$ is normal (resp. Hermitian).
Theorem 1 asserts that $W(A)$ and $W(A^+)$ have the same geometric type, namely convex polygons or line segments, depending on whether $A$ is normal or Hermitian. We show later that certain non-normal matrices also admit this property.
The following result shows that the spectra of A and A + as well as their numerical ranges simultaneously contain the origin.
Theorem 2.
Let $A \in M_n$. Then
(i) $0 \in \sigma(A)$ if and only if $0 \in \sigma(A^+)$.
(ii) $0 \in W(A)$ if and only if $0 \in W(A^+)$.
(iii) If $A$ is normal and $\lambda \neq 0$, then $\lambda \in \sigma(A)$ if and only if $1/\lambda \in \sigma(A^+)$.
Proof. 
By the properties $AA^+A = A$ and $A^+AA^+ = A^+$, we have $\det(A) = 0$ if and only if $\det(A^+) = 0$. This proves (i).
Suppose $A$ is singular. Then, by (i), $0 \in \sigma(A) \subseteq W(A)$ and $0 \in \sigma(A^+) \subseteq W(A^+)$. Suppose $A$ is nonsingular. Then $A^+ = A^{-1}$, and
$$W(A^+) = \left\{\frac{x^*A^+x}{x^*x} : x \neq 0\right\} = \left\{\frac{(Ax)^*A^+(Ax)}{(Ax)^*(Ax)} : x \neq 0\right\} = \left\{\frac{x^*A^*A^+Ax}{x^*A^*Ax} : x \neq 0\right\} = \left\{\frac{x^*A^*x}{x^*A^*Ax} : x \neq 0\right\}.$$
Hence
$$0 = \frac{x^*Ax}{x^*x} \in W(A)$$
for some $x \neq 0$ if and only if
$$0 = \left(\frac{x^*Ax}{x^*x}\right)^* = \frac{x^*A^*x}{x^*x}$$
if and only if $x^*A^*x = 0$ for some $x \neq 0$, which is equivalent to $0 \in W(A^+)$. This proves (ii).
If $A$ is normal with spectral decomposition $A = U\Lambda U^*$, then $A^+ = U\Lambda^+U^*$. Suppose the diagonal matrix is $\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_k, 0, \ldots, 0)$, $\lambda_j \neq 0$, $j = 1, \ldots, k$. It is easy to see that $\Lambda^+ = \mathrm{diag}(1/\lambda_1, 1/\lambda_2, \ldots, 1/\lambda_k, 0, \ldots, 0)$, and thus (iii) follows. □
Choosing $a_1 = a_2 = \cdots = a_n = 1$ in Example 1 shows that (iii) of Theorem 2 may fail for non-normal matrices.
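For normal matrices, part (iii) says the nonzero eigenvalues of $A^+$ are exactly the reciprocals of the nonzero eigenvalues of $A$, while zero eigenvalues stay put. A quick numerical sketch (the unitary factor and eigenvalues are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(3)
# a singular normal matrix: unitary diagonalization of diag(2, i, 0)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3)))
lam = np.array([2.0, 1j, 0.0])
A = Q @ np.diag(lam) @ Q.conj().T

Ap = np.linalg.pinv(A)
eig_Ap = np.linalg.eigvals(Ap)

# expected spectrum of A^+: 1/2, 1/i = -i, and 0
targets = [0.5, -1j, 0.0]
match = all(np.abs(eig_Ap - t).min() < 1e-8 for t in targets)
```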
As a consequence of Theorem 2, we obtain the following reciprocal convexity.
Theorem 3.
Let $z_1, z_2, \ldots, z_n$ be nonzero complex numbers. If
$$0 = \alpha_1 z_1 + \alpha_2 z_2 + \cdots + \alpha_n z_n$$
for some nonnegative $\alpha_1, \alpha_2, \ldots, \alpha_n$ with $\alpha_1 + \alpha_2 + \cdots + \alpha_n = 1$, then there exist nonnegative $\beta_1, \beta_2, \ldots, \beta_n$ with $\beta_1 + \beta_2 + \cdots + \beta_n = 1$ such that
$$0 = \beta_1 \frac{1}{z_1} + \beta_2 \frac{1}{z_2} + \cdots + \beta_n \frac{1}{z_n}.$$
Proof. 
Consider the diagonal matrix $A = \mathrm{diag}(z_1, z_2, \ldots, z_n)$. If $0 = \alpha_1 z_1 + \alpha_2 z_2 + \cdots + \alpha_n z_n$, then $0 \in W(A)$, the convex polygon with vertices $z_1, z_2, \ldots, z_n$. By Theorem 2 (ii), we have that
$$0 \in W(A^+) = W\!\left(\mathrm{diag}\!\left(\frac{1}{z_1}, \frac{1}{z_2}, \ldots, \frac{1}{z_n}\right)\right),$$
which is the convex polygon with vertices $1/z_1, 1/z_2, \ldots, 1/z_n$. Therefore, there exist nonnegative $\beta_1, \beta_2, \ldots, \beta_n$ with $\beta_1 + \beta_2 + \cdots + \beta_n = 1$ such that
$$0 = \beta_1 \frac{1}{z_1} + \beta_2 \frac{1}{z_2} + \cdots + \beta_n \frac{1}{z_n}. \qquad \square$$
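A concrete instance of this reciprocal convexity: the theorem only guarantees that weights $\beta_k$ exist, but for the sample points below (chosen by hand for this illustration) equal weights happen to work:

```python
import numpy as np

# nonzero points whose convex hull contains 0
z = np.array([1.0, -1 + 1j, -1 - 1j])
alpha = np.array([0.5, 0.25, 0.25])      # alpha_k >= 0, sum = 1
combo = np.sum(alpha * z)                # convex combination equal to 0

# weights for the reciprocals, found by hand for this example
beta = np.array([1 / 3, 1 / 3, 1 / 3])
combo_recip = np.sum(beta / z)           # convex combination of 1/z_k
```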
Theorem 4.
Let $A \in M_n$. If $W(A)$ is symmetric with respect to the real axis, then
$$W(A) \cap s^2\,W(A^+) \neq \emptyset$$
for every singular value $s$ of $A$.
Proof. 
Let $A = U\Sigma V^*$ be a singular value decomposition of $A$, where $\Sigma = \mathrm{diag}(s_1, s_2, \ldots, s_n)$, $s_1 \geq s_2 \geq \cdots \geq s_n \geq 0$. If $s = 0$ is a singular value of $A$, then $A$ is singular. Hence $0 \in \sigma(A)$, and thus $0 \in W(A) \cap s^2\,W(A^+)$.
If $s \neq 0$ is a nonzero singular value of $A$, we may assume $s = s_1$; then $1$ is a singular value of $A/s$. Choose a unit vector $x$ such that $V^*x = [(V^*x)_1, 0, \ldots, 0]^T$ has only its first coordinate nonzero. Then $x^*(A/s)x = \overline{(U^*x)_1}(V^*x)_1$. Since $W(A/s)$ is symmetric with respect to the real axis, $W(A/s) = W((A/s)^*)$. Hence
$$\left(\overline{(U^*x)_1}(V^*x)_1\right)^* = \overline{(V^*x)_1}(U^*x)_1 \in W(A/s).$$
On the other hand, $sA^+ = V(s\Sigma^+)U^*$. Then
$$x^*(sA^+)x = x^*V(s\Sigma^+)U^*x = (V^*x)^*(s\Sigma^+)(U^*x) = \overline{(V^*x)_1}(U^*x)_1.$$
Hence
$$W(A/s) \cap W(sA^+) \neq \emptyset,$$
which is equivalent to
$$W(A) \cap s^2\,W(A^+) \neq \emptyset. \qquad \square$$
The conclusion of Theorem 4 may fail if the symmetry assumption on the numerical range of $A$ is omitted. For example, consider the matrix
$$A = \begin{pmatrix} 1+i & 0 \\ 0 & 2+2i \end{pmatrix}.$$
Then the singular values of $A$ are $s_1 = \sqrt{2}$, $s_2 = \sqrt{8}$, and
$$A^+ = \begin{pmatrix} 1/(1+i) & 0 \\ 0 & 1/(2+2i) \end{pmatrix}.$$
In this case, for each singular value $s_j$, $j = 1, 2$, we have
$$W(A) \cap s_j^2\,W(A^+) = \emptyset.$$
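The disjointness in this counterexample can be seen numerically: every point of $W(A)$ has strictly positive imaginary part, while every point of $s_1^2\,W(A^+)$ has strictly negative imaginary part. A sampling sketch (illustration only):

```python
import numpy as np

rng = np.random.default_rng(4)
A = np.diag([1 + 1j, 2 + 2j])
Ap = np.linalg.pinv(A)
s2 = 2.0                                   # s_1^2, with s_1 = sqrt(2)

im_signs_ok = True
for _ in range(200):
    x = rng.standard_normal(2) + 1j * rng.standard_normal(2)
    x /= np.linalg.norm(x)                 # random unit vector
    zA = x.conj() @ A @ x                  # point of W(A): Im > 0
    zP = s2 * (x.conj() @ Ap @ x)          # point of 2 W(A^+): Im < 0
    im_signs_ok &= (zA.imag > 0) and (zP.imag < 0)
```

Since $W(A)$ lies in the open upper half-plane and $2\,W(A^+)$ in the open lower half-plane, the two convex sets cannot meet.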
It is shown in [7,8,10] that, for any nonsingular matrix $A \in M_n$,
$$\sigma(A) \subseteq W(A) \cap \frac{1}{W(A^{-1})}. \tag{1}$$
We present the analogue of the spectrum inclusion in Equation (1) for Moore–Penrose inverses.
Theorem 5.
Let $A \in M_n$. If $AA^+ = A^+A$, then
$$\sigma(A) \subseteq W(A) \cap \frac{1}{W(A^+)}. \tag{2}$$
Proof. 
It is well known that $\sigma(A) \subseteq W(A)$. Suppose $\lambda \in \sigma(A)$. If $\lambda = 0$, then $0 \in W(A)$, and by (ii) of Theorem 2, $0 \in W(A^+)$; the inclusion in Equation (2) holds. Assume $\lambda \neq 0$. Choose a unit eigenvector $x$ with $Ax = \lambda x$. Then
$$A^+Ax = \lambda A^+x. \tag{3}$$
Using Equation (3), we have
$$\lambda x = Ax = AA^+Ax = \lambda AA^+x. \tag{4}$$
Equation (4) implies
$$AA^+x = x. \tag{5}$$
Again using Equation (3), we have
$$A^+x = \frac{1}{\lambda}A^+Ax. \tag{6}$$
From Equations (5) and (6), together with $AA^+ = A^+A$, we have
$$x^*A^+x = \frac{1}{\lambda}x^*A^+Ax = \frac{1}{\lambda}x^*AA^+x = \frac{1}{\lambda}x^*x = \frac{1}{\lambda}.$$
Thus
$$\lambda = \frac{1}{x^*A^+x} \in \frac{1}{W(A^+)}. \qquad \square$$
A matrix $A \in M_n$ satisfying the condition $AA^+ = A^+A$ in Theorem 5 is called an EP matrix. Baksalary [11] proposed that the class of EP matrices is characterized as those matrices $A$ for which the column space of $A^2$ coincides with the column space of $A^*$; Bapat [12] confirmed the characterization. The EP assumption in Theorem 5 is essential. For instance, taking $n = 2$, $a_1 = a_2 = 1$ in Example 1, the eigenvalue $1$ of $A$ is not in $1/W(A^+)$ since $w(A^+) < 1$. Note that $AA^+$ and $A^+A$ are nevertheless unitarily similar.
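The key identity in the proof of Theorem 5, $x^*A^+x = 1/\lambda$ for a unit eigenvector of a nonzero eigenvalue, can be checked on a concrete EP matrix. The sketch below uses a singular, non-normal EP matrix built (as an arbitrary illustration) from an invertible non-normal block and a zero block:

```python
import numpy as np

A = np.zeros((3, 3))
A[:2, :2] = [[1.0, 1.0], [0.0, 2.0]]       # invertible, non-normal block
Ap = np.linalg.pinv(A)

ep = np.allclose(A @ Ap, Ap @ A)           # EP condition A A^+ = A^+ A

# for each nonzero eigenvalue lambda with unit eigenvector x,
# x^* A^+ x = 1/lambda, hence lambda lies in 1/W(A^+)
vals, vecs = np.linalg.eig(A)
checks = []
for lam, x in zip(vals, vecs.T):
    if abs(lam) > 1e-10:
        x = x / np.linalg.norm(x)
        checks.append(np.isclose(x.conj() @ Ap @ x, 1 / lam))
```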
It is shown in [13] that, under the rank additivity condition $\mathrm{rank}(A + B) = \mathrm{rank}(A) + \mathrm{rank}(B)$, the Moore–Penrose inverse $(A + B)^+$ can be represented in terms of $A^+$ and $B^+$. Applying this result, one obtains
$$(uu^* + vv^*)^+ = uu^* + vv^* \tag{7}$$
for any orthonormal vectors $u, v \in \mathbb{C}^n$. We extend Equation (7) to a general result.
Theorem 6.
Let $\{u_1, u_2, \ldots, u_r\}$ and $\{v_1, v_2, \ldots, v_r\}$ be two orthonormal subsets of $\mathbb{C}^n$. If $A = u_1v_1^* + u_2v_2^* + \cdots + u_rv_r^*$, then $A^+ = v_1u_1^* + v_2u_2^* + \cdots + v_ru_r^*$, and $W(A^+) = W(A^*)$.
Proof. 
Extend $\{u_1, u_2, \ldots, u_r\}$ and $\{v_1, v_2, \ldots, v_r\}$ to orthonormal bases $\{u_1, \ldots, u_r, u_{r+1}, \ldots, u_n\}$ and $\{v_1, \ldots, v_r, v_{r+1}, \ldots, v_n\}$ of $\mathbb{C}^n$, respectively, and let $U = [u_1\ u_2\ \cdots\ u_n]$ and $V = [v_1\ v_2\ \cdots\ v_n]$ be the corresponding unitary matrices. Then
$$A = U(I_r \oplus 0_{n-r})V^*.$$
Hence, by (F5),
$$A^+ = V(I_r \oplus 0_{n-r})U^*.$$
It follows that $A^+ = A^*$, and thus $W(A^+) = W(A^*)$. □
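Theorem 6 is easy to test numerically: orthonormal subsets can be produced with reduced QR factorizations, and the conclusion $A^+ = A^*$ checked directly. A minimal sketch (the sizes $n = 5$, $r = 3$ are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(5)
n, r = 5, 3
# two orthonormal r-subsets of C^n from reduced QR factorizations
U, _ = np.linalg.qr(rng.standard_normal((n, r)) + 1j * rng.standard_normal((n, r)))
V, _ = np.linalg.qr(rng.standard_normal((n, r)) + 1j * rng.standard_normal((n, r)))

# A = sum_k u_k v_k^*
A = sum(np.outer(U[:, k], V[:, k].conj()) for k in range(r))

thm6 = np.allclose(np.linalg.pinv(A), A.conj().T)
```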

3. Bounds on Numerical Radii

Recall that, for any nonsingular matrix $A$, the number $\|A\|\,\|A^{-1}\|$ is called the condition number of $A$ with respect to the given matrix norm; the matrix $A$ is ill-conditioned if its condition number is large. For any matrix $A$, nonsingular or not, we also call the number $\|A\|\,\|A^+\|$ the condition number of $A$.
Theorem 7.
Let $0 \neq A \in M_n$. Then, for the spectral norm $\|\cdot\|$,
$$1 \leq \|A\|\,\|A^+\| \leq 4\,w(A)\,w(A^+).$$
Proof. 
Since $A \neq 0$, there exists $x$ such that $Ax \neq 0$. Then $AA^+(Ax) = Ax$, so $1 \in \sigma(AA^+)$. Since $AA^+$ is idempotent and Hermitian, $W(AA^+) \subseteq [0, 1]$, and hence $w(AA^+) = 1$. By the numerical radius inequality $w(A) \leq \|A\| \leq 2\,w(A)$ (cf. [6], p. 44), we obtain
$$1 = w(AA^+) = \|AA^+\| \leq \|A\|\,\|A^+\| \leq 4\,w(A)\,w(A^+). \qquad \square$$
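Theorem 7 can be checked numerically by approximating the numerical radii on an angle grid, as in the standard supporting-line characterization $w(A) = \max_\theta \lambda_{\max}\big(\tfrac{1}{2}(e^{i\theta}A + e^{-i\theta}A^*)\big)$; the grid slightly underestimates the true radii, but the inequality typically holds with ample slack (illustration only, not from the paper):

```python
import numpy as np

def numerical_radius(A, m=720):
    # grid approximation of w(A); error is O((pi/m)^2) relative
    return max(
        np.linalg.eigvalsh((np.exp(1j * t) * A + (np.exp(1j * t) * A).conj().T) / 2)[-1]
        for t in np.linspace(0, 2 * np.pi, m, endpoint=False)
    )

rng = np.random.default_rng(6)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
Ap = np.linalg.pinv(A)

cond = np.linalg.norm(A, 2) * np.linalg.norm(Ap, 2)   # spectral condition number
bound = 4 * numerical_radius(A) * numerical_radius(Ap)
```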
Let $A \in M_n$ be a weighted shift matrix
$$A = \begin{pmatrix} 0 & a_1 & & & \\ & 0 & a_2 & & \\ & & \ddots & \ddots & \\ & & & 0 & a_{n-1} \\ & & & & 0 \end{pmatrix}. \tag{8}$$
It is well known that $W(A)$ is a circular disk centered at the origin. The radius of the disk has attracted the attention of many authors; see, for example, [14,15,16,17]. In particular, if $a_1 = a_2 = \cdots = a_{n-1} = 1$, then $w(A) = \cos(\pi/(n+1))$ (cf. [15,17]). For weighted shift matrices, upper bounds on the numerical radii are found in [14,16]. The Moore–Penrose inverse provides an upper bound and a lower bound for the numerical radii of certain weighted shift matrices.
Theorem 8.
Let $A \in M_n$ be the weighted shift matrix defined in Equation (8). Then
$$A^+ = \begin{pmatrix} 0 & & & & \\ 1/a_1 & 0 & & & \\ & 1/a_2 & \ddots & & \\ & & \ddots & 0 & \\ & & & 1/a_{n-1} & 0 \end{pmatrix}. \tag{9}$$
Furthermore,
(i) $W(A)$ and $W(A^+)$ are circular disks centered at the origin, and
$$\frac{1}{4} \leq w(A)\,w(A^+) \leq \frac{\max_k |a_k|}{\min_k |a_k|}\,\cos^2\!\left(\frac{\pi}{n+1}\right),$$
where the minimum is taken over those $k$ with $a_k \neq 0$.
(ii) If $a_k a_{n-k} = 1$ for all $k = 1, 2, \ldots, [n/2]$, then $W(A^+) = W(A)$, and
$$\frac{1}{2} \leq w(A) = w(A^+) \leq \max\{|a_k|, 1/|a_k| : k = 1, 2, \ldots, [n/2]\}\,\cos\!\left(\frac{\pi}{n+1}\right).$$
Proof. 
A singular value decomposition of $A$ is
$$A = U\Sigma V^*,$$
where $U = I_n$,
$$\Sigma = \mathrm{diag}(|a_1|, |a_2|, \ldots, |a_{n-1}|, 0), \qquad V^* = \begin{pmatrix} 0 & e^{i\theta_1} & & & \\ & 0 & e^{i\theta_2} & & \\ & & \ddots & \ddots & \\ & & & 0 & e^{i\theta_{n-1}} \\ 1 & & & & 0 \end{pmatrix},$$
and $a_k = |a_k|e^{i\theta_k}$, $k = 1, 2, \ldots, n-1$. Direct computation of $A^+ = V\Sigma^+U^*$ gives the representation in Equation (9). It is easy to see that $A^+$ in Equation (9) is permutationally equivalent to the weighted shift matrix
$$\begin{pmatrix} 0 & 1/a_{n-1} & & & \\ & 0 & 1/a_{n-2} & & \\ & & \ddots & \ddots & \\ & & & 0 & 1/a_1 \\ & & & & 0 \end{pmatrix}. \tag{10}$$
The circularity of $W(A)$ and $W(A^+)$ follows from the well-known result that the numerical range of any weighted shift matrix is a circular disk centered at the origin (cf. [14]), together with the fact that the numerical range of the transpose of a matrix equals the numerical range of the matrix itself. Moreover, by Theorem 3 in [14],
$$w(A) \leq \cos\!\left(\frac{\pi}{n+1}\right)\max_k |a_k|.$$
Together with Theorem 7, assertion (i) follows.
If $a_k a_{n-k} = 1$ for all $k = 1, 2, \ldots, [n/2]$, then $A^+$ is permutationally equivalent to the matrix in Equation (10), which is exactly equal to $A$; thus $W(A^+) = W(A)$. Let $c = \max\{|a_k|, 1/|a_k| : k = 1, 2, \ldots, [n/2]\}$. Then $\min\{|a_k|, 1/|a_k| : k = 1, 2, \ldots, [n/2]\} = 1/c$, and the numerical radius inequality follows from (i). □
The lower bound $1/4$ in (i) is sharp, as can be seen by taking $n = 2$ and $A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$.
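The structure of $A^+$ in Equation (9) and the value $w(A) = \cos(\pi/(n+1))$ for unit weights are both easy to confirm. For a weighted shift with nonnegative real weights, $W(A)$ is a disk centered at the origin, so its radius equals the maximum real part of $W(A)$, i.e., $\lambda_{\max}\big(\tfrac{1}{2}(A + A^T)\big)$. A sketch with $n = 4$ and hand-picked weights satisfying $a_k a_{n-k} = 1$:

```python
import numpy as np

n = 4
a = np.array([2.0, 1.0, 0.5])                 # weights with a_k * a_{n-k} = 1
A = np.diag(a, k=1)                           # weighted shift of Equation (8)
Ap = np.linalg.pinv(A)

# Theorem 8: A^+ is the subdiagonal shift with weights 1/a_k
subdiag_ok = np.allclose(Ap, np.diag(1 / a, k=-1))

# unit weights: w(A) = cos(pi/(n+1)), computed here as the largest
# eigenvalue of (S + S^T)/2 since the disk is centered at the origin
S = np.diag(np.ones(n - 1), k=1)
w_unit = np.linalg.eigvalsh((S + S.T) / 2)[-1]
expected = np.cos(np.pi / (n + 1))
```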

Funding

This work was partially supported by the Ministry of Science and Technology, Taiwan, under Grant NSC 99-2115-M-031-004-MY2.

Acknowledgments

The author thanks the referees for their helpful comments and suggestions on earlier versions.

Conflicts of Interest

The author declares no conflict of interest.

References

  1. Moore, E. On the reciprocal of the general algebraic matrix. Bull. Am. Math. Soc. 1920, 26, 394–395.
  2. Penrose, R. A generalized inverse for matrices. Proc. Camb. Philos. Soc. 1955, 51, 406–413.
  3. Ben-Israel, A.; Greville, T.N.E. Generalized Inverses: Theory and Applications; Springer: New York, NY, USA, 2003.
  4. Lee, B.-G.; Park, Y. Distance for Bézier curves and degree reduction. Bull. Aust. Math. Soc. 1997, 56, 507–515.
  5. Mond, B.; Pečarić, J.E. Inequalities with weights for powers of generalised inverses. Bull. Aust. Math. Soc. 1993, 48, 7–12.
  6. Horn, R.; Johnson, C.R. Topics in Matrix Analysis; Cambridge University Press: New York, NY, USA, 1991.
  7. Hochstenbach, M.E.; Singer, D.A.; Zachlin, P.F. Eigenvalue inclusion regions from inverses of shifted matrices. Linear Algebra Appl. 2008, 429, 2481–2496.
  8. Zachlin, P.F. On the Field of Values of the Inverse of a Matrix. Ph.D. Thesis, Case Western Reserve University, Cleveland, OH, USA, 2007. Available online: http://rave.ohiolink.edu/etdc/view?acc_num=case1181231690 (accessed on 29 March 2020).
  9. Horn, R.; Johnson, C.R. Matrix Analysis; Cambridge University Press: New York, NY, USA, 1985.
  10. Manteuffel, T.A.; Starke, G. On hybrid iterative methods for non-symmetric systems of linear equations. Numer. Math. 1996, 73, 489–506.
  11. Baksalary, O.M. Characterization of EP matrices. Image 2009, 43, 44.
  12. Bapat, R. Characterization of EP matrices. Image 2010, 44, 36.
  13. Fill, J.A.; Fishkind, D.E. The Moore–Penrose generalized inverse for sums of matrices. SIAM J. Matrix Anal. Appl. 2000, 21, 629–635.
  14. Chien, M.T. On the numerical range of tridiagonal operators. Linear Algebra Appl. 1996, 246, 203–214.
  15. Eiermann, M. Fields of values and iterative methods. Linear Algebra Appl. 1993, 180, 167–197.
  16. Linden, H. Containment regions for zeros of polynomials from numerical ranges of companion matrices. Linear Algebra Appl. 2002, 350, 125–145.
  17. Marcus, M.; Shure, B.N. The numerical range of certain 0,1-matrices. Linear Multilinear Algebra 1979, 7, 111–120.