1. Introduction
One of the most powerful ideas in linear algebra is diagonalization, which renders many problems completely transparent. For example, if $A \in \mathbb{R}^{n \times n}$ is a symmetric matrix, the spectral theorem implies that there exists an orthogonal matrix $V \in \mathbb{R}^{n \times n}$ and a diagonal matrix $D \in \mathbb{R}^{n \times n}$ such that $A = VDV^T$ (see, for instance, ([1], Section 7.1)). To say that $V$ is orthogonal means that $V^TV = VV^T = I$, the $n \times n$ identity matrix, which implies that the columns $v_1, \ldots, v_n$ of $V$ form an orthonormal basis for $\mathbb{R}^n$. If $D$ has diagonal entries $\lambda_1, \ldots, \lambda_n$, then we can also write
$$A = \sum_{i=1}^{n} \lambda_i v_i v_i^T.$$
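Both forms of the decomposition are easy to check numerically; the following sketch uses an arbitrary symmetric matrix (chosen only for illustration, not an example from the text):

```python
import numpy as np

# An arbitrary symmetric matrix (illustrative values only).
rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))
A = B + B.T

# eigh returns the diagonal entries of D and the orthonormal columns of V.
lam, V = np.linalg.eigh(A)

assert np.allclose(V.T @ V, np.eye(4))          # V is orthogonal
assert np.allclose(A, V @ np.diag(lam) @ V.T)   # A = V D V^T
# The equivalent rank-one expansion A = sum_i lambda_i v_i v_i^T:
assert np.allclose(A, sum(l * np.outer(v, v) for l, v in zip(lam, V.T)))
```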
If $A \in \mathbb{R}^{m \times n}$ is a general matrix (not assumed to be symmetric or even square), the singular value decomposition (SVD) allows us to diagonalize the matrix, at the cost of using two different orthonormal bases. There exist orthogonal matrices $U \in \mathbb{R}^{m \times m}$ and $V \in \mathbb{R}^{n \times n}$ and a diagonal matrix $\Sigma \in \mathbb{R}^{m \times n}$ such that $A = U\Sigma V^T$ and $\Sigma_{ij} = 0$ for $i \neq j$, with $\Sigma_{ii} = \sigma_i$ and $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_{\min\{m,n\}} \ge 0$ (see ([2], Lectures 4–5) or ([1], Chapter 8)). This decomposition follows immediately from the spectral theorem for symmetric matrices; in fact, $A^TA = V(\Sigma^T\Sigma)V^T$ is the spectral decomposition of the symmetric matrix $A^TA$.
Among its other virtues, the SVD reveals the rank of $A$ and bases for the four fundamental subspaces associated with $A$. If $A$ has rank $r$, then exactly $r$ of the singular values $\sigma_1, \ldots, \sigma_{\min\{m,n\}}$ are positive, and we can write the SVD in the reduced form
$$A = \sum_{i=1}^{r} \sigma_i u_i v_i^T,$$
where $u_1, \ldots, u_m$, $v_1, \ldots, v_n$ are the columns of $U$, $V$, respectively, $\sigma_i = \Sigma_{ii}$, and $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r > 0$. The four fundamental subspaces are represented by orthonormal bases as follows:
$$\operatorname{col}(A) = \operatorname{span}\{u_1, \ldots, u_r\}, \quad \operatorname{null}(A^T) = \operatorname{span}\{u_{r+1}, \ldots, u_m\},$$
$$\operatorname{col}(A^T) = \operatorname{span}\{v_1, \ldots, v_r\}, \quad \operatorname{null}(A) = \operatorname{span}\{v_{r+1}, \ldots, v_n\}.$$
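These facts can be verified numerically; the sketch below uses an illustrative rank-2 matrix (not taken from the text):

```python
import numpy as np

# An illustrative rank-2 matrix (rows 3 and 4 repeat rows 1 and 2).
A = np.array([[1., 2., 0.],
              [0., 1., 1.],
              [1., 2., 0.],
              [0., 1., 1.]])
U, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-12 * s[0]))   # numerical rank: exactly r singular values are positive

# Reduced form A = sum_{i=1}^r sigma_i u_i v_i^T.
A_r = sum(s[i] * np.outer(U[:, i], Vt[i]) for i in range(r))
assert np.allclose(A, A_r)

# Orthonormal bases for the four fundamental subspaces:
col_A   = U[:, :r]     # column space of A
null_At = U[:, r:]     # null space of A^T
row_A   = Vt[:r].T     # column space of A^T (row space of A)
null_A  = Vt[r:].T     # null space of A
assert np.allclose(A @ null_A, 0) and np.allclose(A.T @ null_At, 0)
```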
It is well known that spectral theory is considerably more complicated for linear operators defined on infinite-dimensional spaces. However, for compact operators, the finite-dimensional theory carries through almost unchanged. Throughout the rest of this paper, X and Y will denote real Hilbert spaces. If $T : X \to Y$ is linear, then $T$ is called compact if and only if $\{Tx_n\}$ has a convergent subsequence in $Y$ for every bounded sequence $\{x_n\}$ in $X$ (this is equivalent to $T$'s being continuous when the weak topology is imposed on $X$ and the norm topology on $Y$).
The spectral theorem for self-adjoint compact operators (see ([3], Section 4.3) or ([4], Section 8.2)) says that if $T : X \to X$ is compact and self-adjoint, then there exists an orthonormal sequence $\{v_n\}$ in $X$ and a sequence $\{\lambda_n\}$ of nonzero real numbers such that
$$T = \sum_{n=1}^{\infty} \lambda_n\, v_n \otimes v_n,$$
where the outer product $u \otimes v$ is the bounded linear operator defined by
$$(u \otimes v)(x) = \langle x, v \rangle u \quad \text{for all } x \in X.$$
For ease of exposition, we will assume that the sequences $\{v_n\}$ and $\{\lambda_n\}$ are infinite sequences; in the contrary case, $T$ is a finite-rank operator and the infinite series becomes a finite sum.
For a general (that is, not necessarily self-adjoint) compact operator $T : X \to Y$, we can derive the singular value expansion (SVE) of $T$ by applying the spectral theorem for self-adjoint compact operators to $T^*T$ (see [3], Section 4.4). The result is
$$T = \sum_{n=1}^{\infty} \sigma_n\, u_n \otimes v_n, \qquad (2)$$
where $\{v_n\}$ is an orthonormal sequence in $X$, $\{u_n\}$ is an orthonormal sequence in $Y$, and $\sigma_1 \ge \sigma_2 \ge \cdots > 0$ are positive numbers converging to zero. Moreover, $\{v_n\}$ is a complete orthonormal set for $N(T)^\perp$ and $\{u_n\}$ is a complete orthonormal set for $\overline{R(T)}$.
The SVE of a compact operator has many applications, particularly in analyzing linear operator equations of the form $Tx = y$ (that is, given $y \in Y$, find or estimate $x \in X$ satisfying $Tx = y$). When T is compact and not of finite rank, this equation is ill-posed in the sense that the solution x (if it exists) does not depend continuously on the data y; in this case, the equation is often referred to as an inverse problem. The singular value expansion of T is useful in analyzing approaches to regularizing $Tx = y$, that is, to computing a stable (approximate) solution to the equation in the presence of noisy data.
When $T : X \to Y$ is not necessarily compact, there still exists a singular value expansion in the following form.
Theorem 1. Let X and Y be real Hilbert spaces and let $T : X \to Y$ be a bounded linear operator. Then there exist a Borel space $(M, \mathcal{A}, \mu)$ with a second-countable topology $\tau$, isometries $V : L^2(\mu) \to X$, $U : L^2(\mu) \to Y$, and an essentially bounded measurable function $\sigma : M \to \mathbb{R}$ such that
$$T = U m_\sigma V^\dagger,$$
where $V^\dagger$ is the generalized inverse of $V$ and $m_\sigma : L^2(\mu) \to L^2(\mu)$ is the multiplication operator defined by σ:
$$m_\sigma f = \sigma f \quad \text{for all } f \in L^2(\mu).$$
Moreover, $\sigma > 0$ a.e.

By Borel space, we mean a measure space $(M, \mathcal{A}, \mu)$ such that there is a topology $\tau$ defined on the set M and $\mathcal{A}$ (the collection of measurable subsets of M) is the $\sigma$-algebra of Borel sets of $(M, \tau)$. As noted in the theorem, the topology $\tau$ is guaranteed to be second-countable, that is, to have a countable base. Furthermore, note that for an isometry V, $V^\dagger = V^*$.
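In the matrix case, Theorem 1 reduces to the reduced SVD: take $M = \{1, \ldots, r\}$ with counting measure, so that $L^2(\mu) = \mathbb{R}^r$, and let $\sigma$ be the vector of positive singular values. A numerical sketch with an illustrative matrix (not from the text):

```python
import numpy as np

# A rank-deficient matrix (rows 1 and 3 coincide, so r = 2).
A = np.array([[2., 0., 1.],
              [0., 1., 0.],
              [2., 0., 1.]])
U_full, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-12))
U, V, sigma = U_full[:, :r], Vt[:r].T, s[:r]   # isometries U, V and sigma > 0

assert np.allclose(A, U @ np.diag(sigma) @ V.T)   # T = U m_sigma V^dagger
assert np.allclose(V.T @ V, np.eye(r))            # V is an isometry
assert np.allclose(np.linalg.pinv(V), V.T)        # V^dagger = V^* for an isometry
```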
We will call the representation of Theorem 1 the SVE of $T$. Pietsch ([5], Section D.3) outlines a short proof of Theorem 1 based on the polar decomposition. We give a direct proof in the next section that is analogous to the derivation of the SVD of a matrix $A$ from the spectral decomposition of $A^TA$. In Section 3, we derive some basic results about this form of the SVE, including its relationship to the classical SVE of a compact operator and how to recognize from the SVE when $R(T)$ fails to be closed. We also include a brief discussion of the relationship of the SVE to notions of $s$-numbers (the generalization of singular values) that have appeared in the literature. In Section 4, we analyze the inverse problem $Tx = y$, including methods for regularizing the equation, using Theorem 1. The results in Section 4 are not new, but we hope to convince the reader that the analysis based on Theorem 1 is particularly convenient. We conclude with a brief discussion in Section 5.
2. The SVE of a Bounded Linear Operator
As noted above, the SVD of a matrix $A$ can be derived from the spectral decomposition of the symmetric matrix $A^TA$. In the same way, the SVE described by Theorem 1 can be derived from the following spectral theorem for a bounded, self-adjoint linear operator.
Theorem 2. Let T be a bounded and self-adjoint linear operator mapping a real Hilbert space X into itself. Then there exist a Borel space $(M, \mathcal{A}, \mu)$ with a second-countable topology $\tau$, a unitary operator $V : L^2(\mu) \to X$, and an essentially bounded measurable function $\theta : M \to \mathbb{R}$ such that
$$T = V m_\theta V^{-1},$$
where $m_\theta : L^2(\mu) \to L^2(\mu)$ is the multiplication operator defined by θ.

This version of the spectral theorem is usually stated in terms of a complex Hilbert space $X$ and complex $L^2(\mu)$ (see, for instance, [6] for an accessible exposition), but it can be verified that the same proof yields this representation when $X$ is a real Hilbert space and $L^2(\mu)$ denotes the space of real-valued square-integrable functions.
To derive Theorem 1 from Theorem 2, we require the following preliminary result.
Lemma 1. ([7], Lemma 2.1) Let $(M, \mathcal{A}, \mu)$ be a measure space and let $\theta : M \to \mathbb{R}$ be a measurable function that is positive a.e. Define
$$S = \{\theta f : f \in L^2(\mu)\} \cap L^2(\mu).$$
Then S is dense in $L^2(\mu)$.
We can now prove a special case of Theorem 1.
Theorem 3. Let X and Y be real Hilbert spaces and let $T : X \to Y$ be a bounded linear operator with $N(T) = \{0\}$. Then there exist a Borel space $(M, \mathcal{A}, \mu)$ with a second-countable topology $\tau$, a unitary operator $V : L^2(\mu) \to X$, an isometry $U : L^2(\mu) \to Y$, and an essentially bounded measurable function $\sigma : M \to \mathbb{R}$ such that
$$T = U m_\sigma V^{-1}.$$
Moreover, $\sigma > 0$ a.e. and $R(U) = \overline{R(T)}$.
Proof. By Theorem 2, there exist a Borel space $(M, \mathcal{A}, \mu)$, a unitary operator $V : L^2(\mu) \to X$, and a bounded measurable function $\theta : M \to \mathbb{R}$ such that
$$T^*T = V m_\theta V^{-1}.$$
We first show that $\theta \ge 0$ a.e., which obviously follows if we prove that
$$\langle m_\theta f, f \rangle_{L^2(\mu)} \ge 0 \quad \text{for all } f \in L^2(\mu).$$
However, $m_\theta = V^{-1}T^*TV$ and hence, for any $f \in L^2(\mu)$,
$$\langle m_\theta f, f \rangle_{L^2(\mu)} = \langle T^*TVf, Vf \rangle_X = \|TVf\|_Y^2 \ge 0$$
(using the fact that $\langle Vf, Vg \rangle_X = \langle f, g \rangle_{L^2(\mu)}$ because $V$ is unitary). Therefore, $\theta \ge 0$ a.e., as desired, and we can assume that $\theta \ge 0$ everywhere.

Now define $Z = \{m \in M : \theta(m) = 0\}$. If $\mu(Z) > 0$, then there exists a measurable set $E \subset Z$ with $0 < \mu(E) < \infty$, so that $\chi_E \ne 0$ in $L^2(\mu)$, where $\chi_E$ is the characteristic function of the set $E$; this implies that $V\chi_E \ne 0$ in $X$ and hence that $T^*TV\chi_E \ne 0$ (since $N(T^*T) = N(T)$ is trivial). However, $T^*TV\chi_E = Vm_\theta\chi_E = 0$ because $\theta = 0$ on $E$ and $\chi_E = 0$ on $M \setminus E$. This contradiction shows that $\mu(Z)$ must be zero, that is, $\theta > 0$ a.e. in $M$.

Therefore, if we define $\sigma = \theta^{1/2}$ and
$$S = \{\sigma f : f \in L^2(\mu)\},$$
then Lemma 1 applies and we see that $S$ is dense in $L^2(\mu)$. We define $U : S \to Y$ by $U(\sigma f) = TVf$. Since $\|TVf\|_Y = \|\sigma f\|_{L^2(\mu)}$ for all $f \in L^2(\mu)$, $U$ is well-defined, and we see that it is linear and densely defined. We now show that $\|Ug\|_Y = \|g\|_{L^2(\mu)}$ for all $g \in S$. We have
$$\|U(\sigma f)\|_Y^2 = \|TVf\|_Y^2 = \langle T^*TVf, Vf \rangle_X = \langle m_\theta f, f \rangle_{L^2(\mu)} = \int_M \theta f^2\, d\mu = \|\sigma f\|_{L^2(\mu)}^2.$$
However, now we see that $U$ is bounded and densely defined and hence it extends to a bounded operator defined on all of $L^2(\mu)$. We will use $U$ to denote this extension as well (therefore $U$ satisfies $Um_\sigma = TV$, that is, $T = Um_\sigma V^{-1}$). By continuity, we have $\|Ug\|_Y = \|g\|_{L^2(\mu)}$ for all $g \in L^2(\mu)$ and hence $U$ is an isometry.
Next, we show that $\overline{R(T)} \subset R(U)$. For each $f \in L^2(\mu)$, $TVf = U(\sigma f) \in R(U)$ because $\sigma f \in L^2(\mu)$. However, then we see that, for each $x \in X$,
$$Tx = TV(V^{-1}x) = U(\sigma V^{-1}x) \in R(U).$$
Therefore, $R(T) \subset R(U)$; since $U$ is an isometry, $R(U)$ is closed, and hence $\overline{R(T)} \subset R(U)$, as desired.
It remains only to show that $R(U) \subset \overline{R(T)}$. Let $y \in R(U)$. Then there exists $g \in L^2(\mu)$ such that $Ug = y$. Since $S$ is dense in $L^2(\mu)$, there exists a sequence $\{\sigma f_n\}$ in $S$ such that $\sigma f_n \to g$, that is, $\|\sigma f_n - g\|_{L^2(\mu)} \to 0$. It follows that $\{\sigma f_n\}$ is a Cauchy sequence and hence, because $U$ is an isometry, $\{U(\sigma f_n)\}$ is a Cauchy sequence in $Y$. Suppose $U(\sigma f_n) \to z$. Then
$$\|y - z\|_Y = \lim_{n \to \infty} \|Ug - U(\sigma f_n)\|_Y = \lim_{n \to \infty} \|g - \sigma f_n\|_{L^2(\mu)} = 0,$$
which shows that $z = y$. Since $U(\sigma f_n) = TVf_n \in R(T)$ by definition of $U$, it follows that $y \in \overline{R(T)}$. This completes the proof. □
Theorem 1 is an immediate corollary of Theorem 3.
Proof of Theorem 1. If we apply Theorem 3 to $\tilde{T} : N(T)^\perp \to Y$, the restriction of $T$ to $N(T)^\perp$, we obtain
$$\tilde{T} = U m_\sigma \tilde{V}^{-1},$$
where $U : L^2(\mu) \to Y$ is an isometry and $\tilde{V} : L^2(\mu) \to N(T)^\perp$ is unitary. We claim that
$$T = U m_\sigma V^\dagger, \qquad (4)$$
where $V : L^2(\mu) \to X$ is defined by $Vf = \tilde{V}f$ for all $f \in L^2(\mu)$. Since $V$ is obviously an isometry, proving that (4) holds will complete the proof. By definition, $V^\dagger x$ is the minimum-norm least-squares solution of $Vf = x$. Moreover, since $N(V)$ is trivial, $V^\dagger x$ is the unique least-squares solution of $Vf = x$, which is defined by the normal equation $V^*Vf = V^*x$, that is, $V^\dagger x = V^*x = \tilde{V}^{-1}Px$, where $P$ is the orthogonal projection of $X$ onto $N(T)^\perp$. Hence, for every $x \in X$,
$$U m_\sigma V^\dagger x = U m_\sigma \tilde{V}^{-1}(Px) = \tilde{T}(Px) = Tx.$$
This proves (4), which completes the proof. □
4. The SVE and Tikhonov Regularization
We believe that Theorem 1 will prove to be useful in a variety of applications. Here we show that it can be used to give transparent proofs of convergence theorems in the theory of Tikhonov regularization, the most popular method for addressing inverse problems.
We consider an equation of the form $Tx = y$, where $T : X \to Y$ is bounded and linear. We are given $y \in Y$ and wish to compute or estimate $x \in X$ satisfying the equation. The problem is well-posed if there exists a unique solution x for each $y \in Y$, where x depends continuously on y. Existence fails, at least for some $y \in Y$, if $R(T)$ is a proper subspace of Y. However, in that case, it is common to settle for a least-squares solution of the equation, that is, an $x \in X$ that minimizes $\|Tx - y\|_Y$. Uniqueness fails to hold if $N(T)$ is nontrivial, but we can select a unique (least-squares) solution by choosing the unique solution lying in $N(T)^\perp$, which is equivalent to choosing the minimum-norm least-squares solution. The interesting case occurs when $R(T)$ fails to be closed. In that case,
- Least-squares solutions exist only for y in the dense subspace $R(T) + R(T)^\perp$ of Y;
- For each $y \in R(T) + R(T)^\perp$, there exists a unique minimum-norm least-squares solution x, but x does not depend continuously on y.
The generalized inverse $T^\dagger : \mathcal{D}(T^\dagger) \to X$, where $\mathcal{D}(T^\dagger) = R(T) + R(T)^\perp$, is defined by the condition that $T^\dagger y$ is the minimum-norm least-squares solution of $Tx = y$. It follows from the above discussion that, when $R(T)$ fails to be closed, then $T^\dagger$ is a densely defined unbounded linear operator. In this case, the problem $Tx = y$, even when interpreted as asking for the minimum-norm least-squares solution, is ill-posed in that the solution does not depend continuously on the data. In this case, we call $Tx = y$ a (linear) inverse problem.
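In finite dimensions, the generalized inverse is the Moore–Penrose pseudoinverse; a small sketch (with an illustrative matrix, not from the text) checks the two defining properties of the minimum-norm least-squares solution:

```python
import numpy as np

# A has nontrivial null space and deficient column space, so Tx = y is
# solvable only in the least-squares sense and not uniquely.
A = np.array([[1., 1.],
              [1., 1.],
              [0., 0.]])
y = np.array([1., 0., 5.])        # y is not in col(A)

x_dag = np.linalg.pinv(A) @ y     # minimum-norm least-squares solution

# Any least-squares solution satisfies the normal equations A^T A x = A^T y.
assert np.allclose(A.T @ A @ x_dag, A.T @ y)
# Among them, x_dag is the one lying in null(A)^perp; here null(A) = span{(1,-1)}.
assert abs(x_dag @ np.array([1., -1.])) < 1e-12
```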
Many regularization techniques for solving $Tx = y$ approximate $T^\dagger$ by a family $\{R_\lambda : \lambda > 0\}$ of bounded operators. Here $\lambda$ is called the regularization parameter and it is required that $R_\lambda \to T^\dagger$ pointwise (on $\mathcal{D}(T^\dagger)$) as $\lambda \to 0^+$. The most popular regularization method is Tikhonov regularization, in which $R_\lambda = (T^*T + \lambda I)^{-1}T^*$. This operator arises from solving
$$\min_{x \in X}\ \|Tx - y\|_Y^2 + \lambda\|x\|_X^2.$$
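In finite dimensions, $R_\lambda y$ is computed by solving the regularized normal equations. The following sketch uses the $8 \times 8$ Hilbert matrix as an illustrative ill-conditioned operator (not an example from the text) and checks the minimization characterization:

```python
import numpy as np

def tikhonov(A, y, lam):
    """Return x_lam = (A^T A + lam I)^{-1} A^T y."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ y)

# The 8x8 Hilbert matrix: a standard ill-conditioned test matrix, standing in
# for an operator whose range is "almost" not closed.
n = 8
A = 1.0 / (np.arange(n)[:, None] + np.arange(n)[None, :] + 1.0)
x_true = np.ones(n)
y = A @ x_true

lam = 1e-10
x_lam = tikhonov(A, y, lam)

# x_lam minimizes ||Ax - y||^2 + lam ||x||^2, so it beats any competitor, e.g. x = 0.
obj = lambda x: np.sum((A @ x - y) ** 2) + lam * np.sum(x ** 2)
assert obj(x_lam) < obj(np.zeros(n))
```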
We first show that $R_\lambda y \to T^\dagger y$ for all $y \in \mathcal{D}(T^\dagger)$. For convenience, we will write $\hat{x} = T^\dagger y$ and $x_\lambda = R_\lambda y$.
By definition, $\hat{x} \in N(T)^\perp$. Furthermore, $x_\lambda$ is defined by the equation
$$(T^*T + \lambda I)x_\lambda = T^*y,$$
which implies that
$$x_\lambda = (T^*T + \lambda I)^{-1}T^*y.$$
We will write $\hat{y} = Py$, where $P$ is the orthogonal projection of $Y$ onto $\overline{R(T)}$, and notice that $T^*\hat{y} = T^*y$ since $y - \hat{y} \in R(T)^\perp = N(T^*)$. It follows that $x_\lambda = (T^*T + \lambda I)^{-1}T^*\hat{y}$. Moreover, since the least-squares solutions of $Tx = y$ are precisely the solutions of $Tx = \hat{y}$, it also follows that $\hat{x} = T^\dagger\hat{y}$. These two facts (that $\hat{x} \in N(T)^\perp$ and that $\hat{x}$, $x_\lambda$ can be defined by $\hat{y}$ in place of y) make it convenient to use the singular value expansion as expressed in Theorem 3 (as opposed to the version of Theorem 1).
Suppose that, using the notation of Theorem 3,
$$T = U m_\sigma V^{-1},$$
and recall that $\sigma > 0$ a.e. in $M$. Since $\hat{x}$ satisfies $T^*T\hat{x} = T^*\hat{y}$,
$$V m_{\sigma^2} V^{-1}\hat{x} = V m_\sigma U^*\hat{y} \quad \Longrightarrow \quad \hat{x} = V m_{\sigma^{-1}} U^*\hat{y}.$$
Furthermore,
$$x_\lambda = (T^*T + \lambda I)^{-1}T^*\hat{y} = V m_{\sigma^2 + \lambda}^{-1} m_\sigma U^*\hat{y} = V m_{\sigma/(\sigma^2 + \lambda)} U^*\hat{y}$$
(where we write $m_{\sigma^2} + \lambda I$ as $m_{\sigma^2 + \lambda}$ when it is convenient to do so), and hence
$$\hat{x} - x_\lambda = V m_{\lambda/(\sigma(\sigma^2 + \lambda))} U^*\hat{y}.$$
To show that $x_\lambda \to \hat{x}$ as $\lambda \to 0^+$, it suffices to show that
$$\left\| \frac{\lambda}{\sigma(\sigma^2 + \lambda)}\, U^*\hat{y} \right\|_{L^2(\mu)} \to 0 \quad \text{as } \lambda \to 0^+.$$
Moreover, since $\hat{y} \in R(T)$ (as opposed to merely belonging to $\overline{R(T)}$; this follows from the fact that $y \in \mathcal{D}(T^\dagger)$), it follows that $\sigma^{-1}U^*\hat{y} \in L^2(\mu)$. However, then, since $\lambda/(\sigma^2 + \lambda)$ is bounded on $M$ and goes to 0 pointwise as $\lambda \to 0^+$, it follows that
$$\left\| \frac{\lambda}{\sigma^2 + \lambda} \cdot \frac{U^*\hat{y}}{\sigma} \right\|_{L^2(\mu)} \to 0$$
by the dominated convergence theorem. This shows that $R_\lambda y \to T^\dagger y$ as $\lambda \to 0^+$. Henceforth, we will write $\hat{g} = U^*\hat{y}$.
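The pointwise convergence $R_\lambda y \to T^\dagger y$ can be observed numerically. In the sketch below (an illustrative matrix, not from the text), one singular value is small, so the error decreases toward zero as $\lambda \to 0^+$ but does so slowly:

```python
import numpy as np

A = np.array([[1., 0., 0.],
              [0., 1e-3, 0.]])     # rank-deficient: null(A) = span{e3}
y = np.array([1., 2.])
x_dag = np.linalg.pinv(A) @ y      # minimum-norm least-squares solution T^dagger y

errs = []
for lam in [1e-2, 1e-4, 1e-6, 1e-8]:
    x_lam = np.linalg.solve(A.T @ A + lam * np.eye(3), A.T @ y)
    errs.append(np.linalg.norm(x_lam - x_dag))

# The error decreases monotonically as lam -> 0+, but the small singular
# value 1e-3 makes the convergence slow.
assert all(e2 < e1 for e1, e2 in zip(errs, errs[1:]))
```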
We will prove two other results to demonstrate the usefulness of the singular value expansion. The result just proved shows that, for each $y \in \mathcal{D}(T^\dagger)$, $R_\lambda y \to T^\dagger y$ as $\lambda \to 0^+$. However, the result says nothing about the rate of convergence and, in fact, the convergence can be arbitrarily slow depending on the data $y$ (or, equivalently, on the solution $\hat{x}$). For certain $\hat{x}$, though, we can bound the rate of convergence. We will not attempt to prove the most general theorem, but rather just consider what turns out to be the optimal rate of convergence. We will show that if $\hat{x} \in R(T^*T)$, then
$$\|\hat{x} - x_\lambda\|_X = O(\lambda).$$
From above, we have $\hat{x} = V m_{\sigma^{-1}} U^*\hat{y}$ and $x_\lambda = V m_{\sigma/(\sigma^2+\lambda)} U^*\hat{y}$. Therefore,
$$\hat{x} - x_\lambda = V m_{\lambda/(\sigma(\sigma^2+\lambda))} U^*\hat{y}.$$
If we now assume that $\hat{x} \in R(T^*T)$, say $\hat{x} = T^*Tw$ for some $w \in X$, then we obtain $U^*\hat{y} = m_{\sigma^3} V^{-1}w$ and hence
$$\hat{x} - x_\lambda = V m_{\lambda\sigma^2/(\sigma^2+\lambda)} V^{-1}w.$$
Since V is an isometry and $\sigma^2/(\sigma^2+\lambda)$ is bounded by 1 (in general, $0 \le \sigma^2/(\sigma^2+\lambda) \le 1$), it follows that $\|\hat{x} - x_\lambda\|_X \le \lambda\|w\|_X$, as desired.
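The bound $\|\hat{x} - x_\lambda\|_X \le \lambda\|w\|_X$ can be checked numerically; the sketch below (an illustrative diagonal operator, not from the text) constructs $\hat{x} = T^*Tw$ and verifies the $O(\lambda)$ rate:

```python
import numpy as np

A = np.diag([1.0, 0.5, 0.1])     # a diagonal operator with sigma_i > 0
w = np.array([1., 1., 1.])
x_hat = A.T @ A @ w              # source condition: x_hat in R(T*T)
y = A @ x_hat                    # exact data, so x_hat = T^dagger y

for lam in [1e-2, 1e-3, 1e-4, 1e-5]:
    x_lam = np.linalg.solve(A.T @ A + lam * np.eye(3), A.T @ y)
    # The error is bounded by lam * ||w|| at every regularization level.
    assert np.linalg.norm(x_lam - x_hat) <= lam * np.linalg.norm(w) + 1e-12
```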
We can also prove the following converse result, namely, that if $y \in \mathcal{D}(T^\dagger)$ and $\|\hat{x} - x_\lambda\|_X = O(\lambda)$, then $\hat{x} \in R(T^*T)$. We will use the fact, easily verified, that $\hat{x} \in R(T^*T)$ if and only if $\sigma^{-3}U^*\hat{y} \in L^2(\mu)$, that is, if and only if $U^*\hat{y}$ belongs to the domain of the densely defined operator $m_{\sigma^{-3}}$. Let us write $\hat{g} = U^*\hat{y}$; then we must show that
$$\int_M \frac{\hat{g}^2}{\sigma^6}\, d\mu < \infty.$$
We have
$$\hat{x} - x_\lambda = V m_{\lambda/(\sigma(\sigma^2+\lambda))}\hat{g},$$
which implies that
$$\|\hat{x} - x_\lambda\|_X = \left\| \frac{\lambda\hat{g}}{\sigma(\sigma^2+\lambda)} \right\|_{L^2(\mu)}.$$
Since $\|\hat{x} - x_\lambda\|_X = O(\lambda)$ by assumption, there exists a constant $C$ such that
$$\left\| \frac{\lambda\hat{g}}{\sigma(\sigma^2+\lambda)} \right\|_{L^2(\mu)} \le C\lambda \quad \text{for all sufficiently small } \lambda > 0,$$
that is,
$$\int_M \frac{\hat{g}^2}{\sigma^2(\sigma^2+\lambda)^2}\, d\mu \le C^2.$$
Since $\hat{g}^2/(\sigma^2(\sigma^2+\lambda)^2)$ converges monotonically to $\hat{g}^2/\sigma^6$ a.e. in $M$ as $\lambda \to 0^+$, it follows from the monotone convergence theorem that
$$\int_M \frac{\hat{g}^2}{\sigma^6}\, d\mu \le C^2 < \infty,$$
as desired.
5. Discussion
The proofs of the last section are offered as an illustration of the power of the singular value expansion. The reader can compare these proofs to other treatments of the same results that can be found in the literature on inverse problems. In Groetsch’s monograph [
14], the analysis is restricted to compact operators and Theorem 2.1.1, Corollary 3.1.2, and Theorem 3.2.2 correspond to our results; Groetsch’s proofs use the singular value expansion (
2) for compact operators. The reader will see that our proofs are direct generalizations of the derivations given there, and also that there is no difficulty in extending his other conclusions to general bounded linear operators. Groetsch does present his theory in greater generality, with much of the analysis applying to a certain family of regularization operators $\{R_\lambda\}$, as opposed to just the Tikhonov approach. We restricted our presentation to Tikhonov regularization simply for convenience of exposition; there would be no difficulty in reproducing his results in the same level of generality.
To extend the results of [14] to general operators, the standard approach is to use the spectral representation of $T^*T$ in the form
$$T^*T = \int \lambda\, dE_\lambda,$$
where $\{E_\lambda\}$ is the spectral resolution of $T^*T$, and apply the so-called functional calculus, which allows the representation of functions of $T^*T$ via
$$f(T^*T) = \int f(\lambda)\, dE_\lambda.$$
It can be shown, for example, that
$$(T^*T + \lambda I)^{-1}T^*T = \int \frac{\mu}{\mu + \lambda}\, dE_\mu.$$
A good reference for this approach is the book [15] by Engl, Hanke, and Neubauer, which (among other things) extends the results of [14] to general bounded linear operators. There is no intrinsic difficulty in doing so, but it may be argued that the arguments are less intuitive and therefore harder to follow. For instance, it is necessary to work with integrals of the following types:
$$\int f(\lambda)\, dE_\lambda x \quad \text{and} \quad \int f(\lambda)\, d\|E_\lambda x\|^2.$$
As Halmos stated in his popular expository article [
6] on the spectral theorem (one of the most-downloaded articles from the American Mathematical Monthly),
The result (namely, the spectral theorem for Hermitian matrices, when expressed using a resolution of the identity) is not intuitive in any language; neither Stieltjes integrals with unorthodox multiplicative properties, nor bounded operator representations of function algebras, are in the daily toolbox of every working mathematician. In contrast, the formulation of the spectral theorem given below uses only the relatively elementary concepts of measure theory.
We believe that the singular value expansion for general bounded linear operators, as described above, offers a similarly intuitive tool that can replace the standard use of the functional calculus in many contexts.