Article

Approximation Properties of the Vector Weak Rescaled Pure Greedy Algorithm

1 School of Science, China University of Geosciences, Beijing 100083, China
2 School of Mathematics and LPMC, Nankai University, Tianjin 300071, China
* Author to whom correspondence should be addressed.
Mathematics 2023, 11(9), 2020; https://doi.org/10.3390/math11092020
Submission received: 22 March 2023 / Revised: 14 April 2023 / Accepted: 19 April 2023 / Published: 24 April 2023
(This article belongs to the Special Issue Advances in Approximation Theory and Numerical Functional Analysis)

Abstract

We first study the error performance of the Vector Weak Rescaled Pure Greedy Algorithm for simultaneous approximation with respect to a dictionary $\mathcal{D}$ in a Hilbert space. We show that the convergence rate of the Vector Weak Rescaled Pure Greedy Algorithm on $\mathcal{A}_1(\mathcal{D})$, the closure of the convex hull of the dictionary $\mathcal{D}$, is optimal. The Vector Weak Rescaled Pure Greedy Algorithm has several advantages: it has a weaker convergence condition and a better convergence rate than the Vector Weak Pure Greedy Algorithm, and it is simpler than the Vector Weak Orthogonal Greedy Algorithm. Then, we design a Vector Weak Rescaled Pure Greedy Algorithm in a uniformly smooth Banach space setting. We obtain the convergence properties and error bound of the Vector Weak Rescaled Pure Greedy Algorithm in this case. The results show that the convergence rate of the VWRPGA on $\mathcal{A}_1(\mathcal{D})$ is sharp. Likewise, the Vector Weak Rescaled Pure Greedy Algorithm is simpler than the Vector Weak Chebyshev Greedy Algorithm and the Vector Weak Relaxed Greedy Algorithm.

1. Introduction

Approximation by sparse linear combinations of elements from a fixed redundant family is widely used because it yields concise representations and improves computational efficiency. It has been applied to signal processing, image compression, machine learning and the numerical solution of PDEs (see [1,2,3,4,5,6,7,8,9,10]). Among others, simultaneous sparse approximation has been utilized in signal vector processing and multi-task learning (see [11,12,13,14]). It is well known that greedy-type algorithms are powerful tools for generating such sparse approximations (see [15,16,17,18,19]). In particular, vector greedy algorithms are very efficient at approximating a given finite number of target elements simultaneously (see [20,21,22,23]). In this article, we propose a new vector greedy algorithm—the Vector Weak Rescaled Pure Greedy Algorithm (VWRPGA)—for simultaneous approximation. We estimate the error of the VWRPGA and show that its convergence rate on the closure of the convex hull of the dictionary is optimal.
Let $X$ be a real Banach space with norm $\|\cdot\|$. We say that a set of elements $\mathcal{D} \subset X$ is a dictionary if $\|\varphi\| = 1$ for each $\varphi \in \mathcal{D}$ and $\overline{\mathrm{span}}(\mathcal{D}) = X$. We assume that every dictionary $\mathcal{D}$ is symmetric, i.e.,
$$\varphi \in \mathcal{D} \quad \text{implies} \quad -\varphi \in \mathcal{D}.$$
If $f_m$ is the output of a greedy algorithm after $m$ iterations, then the efficiency of the approximation can be measured by the decay of the error $\|f - f_m\|$ as $m \to \infty$. We are mainly concerned with this error: does it tend to zero as $m \to \infty$, and if so, at what rate? To answer these questions, we need the following classes of elements.
For a general dictionary $\mathcal{D}$, we define the class of elements
$$\mathcal{A}_1^o(\mathcal{D}, M) := \Big\{ f : f = \sum_{k \in \Lambda} c_k(f)\varphi_k,\ \varphi_k \in \mathcal{D},\ |\Lambda| < \infty,\ \sum_{k \in \Lambda} |c_k(f)| \le M \Big\}$$
and $\mathcal{A}_1(\mathcal{D}, M)$ as the closure of $\mathcal{A}_1^o(\mathcal{D}, M)$. Let $\mathcal{A}_1(\mathcal{D})$ be the union of the classes $\mathcal{A}_1(\mathcal{D}, M)$ over all $M > 0$, and denote $\mathcal{A}_1(\mathcal{D}) := \mathcal{A}_1(\mathcal{D}, 1)$. For $f \in \mathcal{A}_1(\mathcal{D})$, we define its norm as
$$\|f\|_{\mathcal{A}_1(\mathcal{D})} := \inf\{ M : f \in \mathcal{A}_1(\mathcal{D}, M) \}.$$
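As a standard illustration (our own example, not taken from the paper): if $\mathcal{D} = \{\pm e_k\}_{k \ge 1}$ for an orthonormal basis $\{e_k\}$ of a Hilbert space, then
$$\mathcal{A}_1^o(\mathcal{D}, M) = \Big\{ f = \sum_{k \in \Lambda} c_k e_k : |\Lambda| < \infty,\ \sum_{k \in \Lambda} |c_k| \le M \Big\}, \qquad \|f\|_{\mathcal{A}_1(\mathcal{D})} = \sum_{k} |c_k|,$$
so $\mathcal{A}_1(\mathcal{D})$ is the unit ball of the space of elements with absolutely summable coefficients.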
We recall some related results in the Hilbert space setting, since such spaces are distinguished by their geometric properties and their role in applications. Let $H$ be a real Hilbert space with inner product $\langle\cdot,\cdot\rangle$ and norm $\|x\| := \langle x, x\rangle^{1/2}$.
The most natural greedy algorithm in a Hilbert space is the Pure Greedy Algorithm (PGA). This algorithm is also known as the Matching Pursuit in signal processing [24]. We recall its definition from [15].
  • PGA(H, $\mathcal{D}$):
Step 0: Define $f_0 = 0$.
Step m:
- If $f = f_{m-1}$, stop the algorithm and define $f_k = f_{m-1} = f$ for $k \ge m$.
- If $f \ne f_{m-1}$, choose an element $\varphi_m \in \mathcal{D}$ such that
$$|\langle f - f_{m-1}, \varphi_m \rangle| = \sup_{\varphi \in \mathcal{D}} |\langle f - f_{m-1}, \varphi \rangle|.$$
Define the next approximant to be
$$f_m = f_{m-1} + \langle f - f_{m-1}, \varphi_m \rangle \varphi_m,$$
and proceed to Step $m+1$.
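For readers who wish to experiment numerically, the following Python sketch (our own illustration, not part of [15] or [24]) implements the PGA for a finite dictionary stored as the unit-norm columns of a matrix; the names `pga`, `D` and `iters` are ours.

```python
import numpy as np

def pga(f, D, iters):
    """Pure Greedy Algorithm over a finite dictionary.

    f     : target vector in R^n.
    D     : n x K array whose unit-norm columns play the role of the dictionary.
    iters : number of greedy steps.
    """
    fm = np.zeros_like(f, dtype=float)
    for _ in range(iters):
        r = f - fm                        # residual f - f_{m-1}
        inner = D.T @ r                   # <f - f_{m-1}, phi> for every column phi
        j = int(np.argmax(np.abs(inner)))
        if inner[j] == 0.0:               # f = f_{m-1}: nothing left to pick
            break
        fm = fm + inner[j] * D[:, j]      # f_m = f_{m-1} + <f - f_{m-1}, phi_m> phi_m
    return fm
```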
The first upper bound on the rate of convergence of the PGA for $f \in \mathcal{A}_1(\mathcal{D})$ was obtained in [15] as follows:
$$\|f - f_m\| \le \|f\|_{\mathcal{A}_1(\mathcal{D})}\, m^{-\frac{1}{6}}, \quad m = 1, 2, \ldots.$$
Later, the above estimate for the PGA was improved in [25,26] to $O(m^{-\frac{11}{62}})$ and $O\big(m^{-\frac{s}{2(s+2)}}\big)$, where $s$ is the root of the equation
( 1 + x ) 1 2 + x 1 + 1 1 + x 1 1 x = 0
on the closed interval $[1, 1.5]$. It is known that $\frac{s}{2(s+2)} > \frac{11}{62}$.
Note that when $\mathcal{D}$ is an orthonormal basis of $H$, it is not difficult to prove that for any $f \in \mathcal{A}_1(\mathcal{D})$, there holds
$$\|f - f_m\| \le c\, \|f\|_{\mathcal{A}_1(\mathcal{D})}\, m^{-\frac{1}{2}}, \quad m = 1, 2, \ldots.$$
In addition, there exists an element $f^* \in \mathcal{A}_1(\mathcal{D})$ (see [27]) such that
$$\|f^* - f_m^*\| = c \cdot m^{-\frac{1}{2}}, \quad m = 1, 2, \ldots.$$
Thus, inequality (1) cannot be improved for orthonormal bases. A natural question arises: does inequality (1) hold for any dictionary $\mathcal{D} \subset H$? Unfortunately, the answer is negative.
In fact, Livshitz and Temlyakov [28] proved that there exist a dictionary $\mathcal{D} \subset H$, a positive constant $C$ and an element $f \in \mathcal{A}_1(\mathcal{D})$ such that
$$\|f - f_m\| \ge C\, m^{-0.27}, \quad m = 1, 2, \ldots.$$
This lower bound on the convergence rate of the PGA shows that the algorithm does not attain the rate $O(m^{-\frac{1}{2}})$ for all $\mathcal{D}$.
In [15], the idea of best approximation was introduced into the greedy algorithm, which led to the Orthogonal Greedy Algorithm (OGA). To construct an approximation, the OGA takes the orthogonal projection of $f$ onto the subspace generated by the chosen elements $\varphi_1, \ldots, \varphi_m$. We recall its definition from [15].
  • OGA(H, $\mathcal{D}$):
Step 0: Define $f_0 = 0$.
Step m:
- If $f = f_{m-1}$, stop the algorithm and define $f_k = f_{m-1} = f$ for $k \ge m$.
- If $f \ne f_{m-1}$, choose an element $\varphi_m \in \mathcal{D}$ such that
$$|\langle f - f_{m-1}, \varphi_m \rangle| = \sup_{\varphi \in \mathcal{D}} |\langle f - f_{m-1}, \varphi \rangle|.$$
Define the next approximant to be
$$f_m = P_m(f),$$
and proceed to Step $m+1$, where $P_m$ is the orthogonal projection onto $V_m := \mathrm{span}\{\varphi_1, \varphi_2, \ldots, \varphi_m\}$.
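The following minimal Python sketch (our own, with hypothetical names) realizes the OGA for a finite dictionary; the projection $P_m$ is computed by a least-squares solve on the selected columns.

```python
import numpy as np

def oga(f, D, iters):
    """Orthogonal Greedy Algorithm over a finite dictionary (unit-norm columns of D)."""
    fm = np.zeros_like(f, dtype=float)
    chosen = []                                    # indices of phi_1, ..., phi_m
    for _ in range(iters):
        r = f - fm
        inner = D.T @ r
        j = int(np.argmax(np.abs(inner)))
        if inner[j] == 0.0:
            break
        chosen.append(j)
        Vm = D[:, chosen]                          # columns spanning V_m
        coef, *_ = np.linalg.lstsq(Vm, f, rcond=None)
        fm = Vm @ coef                             # f_m = P_m(f): orthogonal projection onto V_m
    return fm
```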
In [15], it is shown that for any $\mathcal{D}$, the output of the OGA(H, $\mathcal{D}$) satisfies
$$\|f - f_m\| \le c\, \|f\|_{\mathcal{A}_1(\mathcal{D})}\, m^{-\frac{1}{2}}, \quad m = 1, 2, \ldots.$$
Note that when $\mathcal{D}$ is an orthonormal basis of $H$, the OGA(H, $\mathcal{D}$) coincides with the PGA(H, $\mathcal{D}$). Thus, the rate $O(m^{-\frac{1}{2}})$ is sharp.
The Relaxed Greedy Algorithm (RGA) is also a modification of PGA. We recall its definition from [15].
  • RGA(H, $\mathcal{D}$):
Step 0: Define $f_0 = 0$.
Step m:
- If $f = f_{m-1}$, stop the algorithm and define $f_k = f_{m-1} = f$ for $k \ge m$.
- If $f \ne f_{m-1}$, choose an element $\varphi_m \in \mathcal{D}$ such that
$$\langle f - f_{m-1}, \varphi_m \rangle = \sup_{\varphi \in \mathcal{D}} |\langle f - f_{m-1}, \varphi \rangle|.$$
For $m = 1$, define
$$f_1 = \langle f, \varphi_1 \rangle \varphi_1.$$
For $m \ge 2$, define the next approximant to be
$$f_m = \Big(1 - \frac{1}{m}\Big) f_{m-1} + \frac{1}{m} \varphi_m,$$
and proceed to Step $m+1$.
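A short Python sketch of the RGA in the same finite-dictionary setting (our own illustrative code, with hypothetical names); the convex relaxation update is what distinguishes it from the PGA.

```python
import numpy as np

def rga(f, D, iters):
    """Relaxed Greedy Algorithm over a finite dictionary (unit-norm columns of D)."""
    fm = np.zeros_like(f, dtype=float)
    for m in range(1, iters + 1):
        r = f - fm
        inner = D.T @ r
        j = int(np.argmax(np.abs(inner)))
        phi = np.sign(inner[j]) * D[:, j]    # dictionary symmetry: flip the sign if needed
        if m == 1:
            fm = (f @ phi) * phi             # f_1 = <f, phi_1> phi_1
        else:
            fm = (1.0 - 1.0 / m) * fm + (1.0 / m) * phi   # relaxation update
    return fm
```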
It is shown in [15] that the RGA also achieves the rate $O(m^{-\frac{1}{2}})$ on $\mathcal{A}_1(\mathcal{D})$.
The Rescaled Pure Greedy Algorithm (RPGA) [17] is another modification of the PGA: at each iteration it rescales the PGA-type update, replacing it with $f_m = s_m \hat{f}_m$. It is defined as follows.
  • RPGA(H, $\mathcal{D}$):
Step 0: Define $f_0 = 0$.
Step m:
- If $f = f_{m-1}$, stop the algorithm and define $f_k = f_{m-1} = f$ for $k \ge m$.
- If $f \ne f_{m-1}$, choose an element $\varphi_m \in \mathcal{D}$ such that
$$|\langle f - f_{m-1}, \varphi_m \rangle| = \sup_{\varphi \in \mathcal{D}} |\langle f - f_{m-1}, \varphi \rangle|.$$
With
$$\lambda_m = \langle f - f_{m-1}, \varphi_m \rangle, \quad \hat{f}_m := f_{m-1} + \lambda_m \varphi_m, \quad s_m = \frac{\langle f, \hat{f}_m \rangle}{\|\hat{f}_m\|^2},$$
define the next approximant to be
$$f_m = s_m \hat{f}_m,$$
and proceed to Step $m+1$.
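The rescaling step is the only difference from the PGA, as the following Python sketch (ours, with hypothetical names) makes explicit.

```python
import numpy as np

def rpga(f, D, iters):
    """Rescaled Pure Greedy Algorithm over a finite dictionary (unit-norm columns of D)."""
    fm = np.zeros_like(f, dtype=float)
    for _ in range(iters):
        r = f - fm
        inner = D.T @ r
        j = int(np.argmax(np.abs(inner)))
        if inner[j] == 0.0:
            break
        lam = inner[j]                         # lambda_m = <f - f_{m-1}, phi_m>
        f_hat = fm + lam * D[:, j]             # intermediate PGA-type update
        s = (f @ f_hat) / (f_hat @ f_hat)      # s_m = <f, f_hat_m> / ||f_hat_m||^2
        fm = s * f_hat                         # rescaled approximant
    return fm
```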
In [17], the convergence rate of the RPGA was obtained as follows:
$$\|f - f_m\| \le \|f\|_{\mathcal{A}_1(\mathcal{D})}\, (m+1)^{-\frac{1}{2}}, \quad m = 0, 1, 2, \ldots.$$
It is worth noting that the supremum of the inner product might not be attained. To remedy this problem, the original condition on the selection of $\varphi_m$ is replaced by
$$|\langle f - f_{m-1}, \varphi_m \rangle| \ge t_m \sup_{\varphi \in \mathcal{D}} |\langle f - f_{m-1}, \varphi \rangle|,$$
where $0 < t_m \le 1$. This is often referred to as the "weak" condition. The study of the weak versions of the above algorithms can be found in [15,25,29,30].
Meanwhile, building simultaneous approximations for a given vector of elements gives rise to the so-called vector greedy algorithms. Instead of running the algorithm separately for each element of a finite collection $f^1, \ldots, f^N$, a vector greedy algorithm obtains a simultaneous approximation of all the elements in a single run. Hence, the computational cost and the storage of information can be reduced greatly. The question is then how well this type of algorithm can perform; namely, we need to measure its efficiency via its error bound. The Vector Weak Pure Greedy Algorithm (VWPGA, also referred to as the Vector Weak Greedy Algorithm (VWGA)) and the Vector Weak Orthogonal Greedy Algorithm (VWOGA) have been introduced and studied in [21,22,23].
We recall the definitions of the VWPGA and VWOGA from [23] as follows. Let $\tau = \{t_m\}_{m=1}^{\infty}$, $0 < t_m \le 1$, be a given sequence.
  • VWPGA(H, $\mathcal{D}$): $f^i \in H$, $i = 1, \ldots, N$, are the target elements.
Step 0: Define $f_0^i := 0$, $i = 1, \ldots, N$.
Step m:
- If $f^i = f_{m-1}^i$ for all $i = 1, \ldots, N$, stop the algorithm and define $f_k^i = f_{m-1}^i = f^i$ for $k \ge m$.
- Otherwise, choose an element $\varphi_m \in \mathcal{D}$ such that
$$\max_i |\langle f^i - f_{m-1}^i, \varphi_m \rangle| \ge t_m \max_i \sup_{\varphi \in \mathcal{D}} |\langle f^i - f_{m-1}^i, \varphi \rangle|.$$
Define the next approximants to be
$$f_m^i := f_{m-1}^i + \langle f^i - f_{m-1}^i, \varphi_m \rangle \varphi_m, \quad i = 1, 2, \ldots, N,$$
and proceed to Step $m+1$.
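Assuming again a finite dictionary stored as the unit-norm columns of a matrix, the following Python sketch (with hypothetical names `vwpga`, `F`, `D`) shows how a single atom, selected from the worst-case inner products, updates all residuals simultaneously; exact maximization is used, which corresponds to $t_m = 1$.

```python
import numpy as np

def vwpga(F, D, iters):
    """Vector Weak PGA sketch with exact maximization (t_m = 1).

    F : n x N array whose columns are the targets f^1, ..., f^N.
    D : n x K array with unit-norm columns.
    """
    Fm = np.zeros_like(F, dtype=float)
    for _ in range(iters):
        R = F - Fm                                  # residuals f^i - f^i_{m-1}
        inner = D.T @ R                             # K x N matrix of <f^i - f^i_{m-1}, phi>
        j = int(np.argmax(np.max(np.abs(inner), axis=1)))   # phi_m realizing max_i sup_phi |.|
        Fm = Fm + np.outer(D[:, j], inner[j])       # update every target with the same phi_m
    return Fm
```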
  • VWOGA(H, $\mathcal{D}$): $f^i \in H$, $i = 1, \ldots, N$, are the target elements.
Step 0: Define $f_0^i := 0$, $i = 1, \ldots, N$.
Step m:
- If $f^i = f_{m-1}^i$ for all $i = 1, \ldots, N$, stop the algorithm and define $f_k^i = f_{m-1}^i = f^i$ for $k \ge m$.
- Otherwise, let $i_m$ be such that
$$\|f^{i_m} - f_{m-1}^{i_m}\| \ge \|f^i - f_{m-1}^i\|, \quad i = 1, \ldots, N.$$
Choose an element $\varphi_m \in \mathcal{D}$ such that
$$|\langle f^{i_m} - f_{m-1}^{i_m}, \varphi_m \rangle| \ge t_m \sup_{\varphi \in \mathcal{D}} |\langle f^{i_m} - f_{m-1}^{i_m}, \varphi \rangle|.$$
Define the next approximants to be
$$f_m^i = P_m(f^i), \quad i = 1, 2, \ldots, N,$$
and proceed to Step $m+1$, where $P_m$ is the orthogonal projection onto $V_m := \mathrm{span}\{\varphi_1, \varphi_2, \ldots, \varphi_m\}$.
We list the results on the convergence rate of the VWPGA and VWOGA in [23] as follows.
Theorem 1.
Let $\tau := \{t_k\}_{k=1}^{\infty}$, $t_k = t$, $0 < t \le 1$, be a given real sequence. Then, for any $f^1, \ldots, f^N$, $f^i \in \mathcal{A}_1(\mathcal{D})$, the output $\{f_m^i\}_{m \ge 0}$ of the VWPGA satisfies
$$\sum_{i=1}^{N} \|f^i - f_m^i\|^2 \le \Big(1 + \frac{m t^2}{N}\Big)^{-\frac{t}{2N+t}} N^{\frac{2N+2t}{2N+t}}.$$
Theorem 2.
Let $\tau := \{t_k\}_{k=1}^{\infty}$, $t_k = t$, $0 < t \le 1$, be a given real sequence. Then, for any $f^i \in \mathcal{A}_1(\mathcal{D})$, the output $\{f_m^i\}_{m \ge 0}$ of the VWOGA satisfies
$$\|f^i - f_m^i\| \le \min\Big\{1, \Big(\frac{N}{m t^2}\Big)^{\frac{1}{2}}\Big\}, \quad i = 1, \ldots, N.$$
Improvements to the above estimates are made in [19,21,22]. The results indicate that the VWOGA achieves a better convergence rate on A 1 ( D ) than that of the VWPGA.
In [23], the authors gave a sufficient condition of convergence for the VWPGA.
Theorem 3.
Assume that $\sum_{m=1}^{\infty} \frac{t_m}{m} = \infty$. Then, for any dictionary and any finite collection of elements $f^i \in H$, $i = 1, \ldots, N$, the VWPGA satisfies
$$\lim_{m \to \infty} \|f^i - f_m^i\| = 0.$$
Motivated by these studies, we design the Vector Weak Rescaled Pure Greedy Algorithm (VWRPGA) and study its efficiency. The remainder of the paper is organized as follows. In Section 2, we deal with the case of Hilbert spaces. In Section 3, we deal with the case of Banach spaces. In Section 4, we draw the conclusions. Below, we provide more details.
In Section 2, we define the VWRPGA in Hilbert spaces and study its approximation properties. We first prove that
$$\sum_{m=1}^{\infty} t_m^2 = \infty$$
is a sufficient condition for the convergence of the VWRPGA for any $f^i \in H$, $i = 1, \ldots, N$, and any $\mathcal{D} \subset H$. This convergence condition is weaker than that of the VWPGA. Then, we prove that the error bound of the VWRPGA on $\mathcal{A}_1(\mathcal{D})$ satisfies
$$\|f^i - f_m^{i,v,\tau,r}\| \le \min\Big\{1, \Big(\frac{1}{N}\sum_{k=1}^{m} t_k^2\Big)^{-\frac{1}{2}}\Big\}.$$
When $t_1 = t_2 = \cdots = t_m = 1$, we show that the convergence rate of the VWRPGA on $\mathcal{A}_1(\mathcal{D})$ is $O(m^{-\frac{1}{2}})$, which is sharp. This convergence rate is better than that of the VWPGA; the advantage is more pronounced when $N$ is large. The VWRPGA is also more efficient than the VWOGA from the viewpoint of computational complexity: for $N$ target elements, the VWRPGA only needs to solve $N$ one-dimensional optimization problems, while the VWOGA involves $N$ $m$-dimensional optimization problems.
In Section 3, we define the VWRPGA for uniformly smooth Banach spaces. We obtain a sufficient condition for the convergence of the VWRPGA in this case; to our knowledge, this is the first convergence analysis of vector greedy algorithms in the Banach space setting. Then, we derive the error bound of the VWRPGA. The results show that the convergence rate of the VWRPGA on $\mathcal{A}_1(\mathcal{D})$ is sharp. We compare the approximation properties of the VWRPGA with those of the Vector Weak Chebyshev Greedy Algorithm (VWCGA) and the Vector Weak Relaxed Greedy Algorithm (VWRGA). We show that the VWRPGA has better convergence properties than the VWRGA, and its computational complexity is essentially smaller than those of the VWCGA and VWRGA.
In Section 4, we draw the conclusions of our study. Our results show that the VWRPGA is the simplest vector greedy algorithm for simultaneous approximation with the best convergence property and the optimal convergence rate. We also discuss the possible applications of the VWRPGA in multi-task learning and signal vector processing.

2. The VWRPGA for Hilbert Spaces

In this section, we define the VWRPGA in Hilbert spaces and obtain a sufficient condition for its convergence together with an estimate of its error bound. Based on these results, we compare the VWRPGA with the VWPGA and the VWOGA.
Firstly, we recall the definition of the Weak Rescaled Pure Greedy Algorithm (WRPGA) in Hilbert spaces from [17]. Let $\tau = \{t_m\}_{m=1}^{\infty}$, $0 < t_m \le 1$, be a given sequence. The WRPGA consists of the following steps:
  • WRPGA(H, $\mathcal{D}$):
Step 0: Define $f_0 = 0$.
Step m:
- If $f = f_{m-1}$, stop the algorithm and define $f_k = f_{m-1} = f$ for $k \ge m$.
- If $f \ne f_{m-1}$, choose an element $\varphi_m \in \mathcal{D}$ such that
$$|\langle f - f_{m-1}, \varphi_m \rangle| \ge t_m \sup_{\varphi \in \mathcal{D}} |\langle f - f_{m-1}, \varphi \rangle|.$$
With
$$\lambda_m = \langle f - f_{m-1}, \varphi_m \rangle, \quad \hat{f}_m := f_{m-1} + \lambda_m \varphi_m, \quad s_m = \frac{\langle f, \hat{f}_m \rangle}{\|\hat{f}_m\|^2},$$
define the next approximant to be
$$f_m = s_m \hat{f}_m,$$
and proceed to Step $m+1$.
The error bound of the WRPGA has been obtained as follows.
Theorem 4
(see Theorem 4.1 in [17]). If $f \in \mathcal{A}_1(\mathcal{D}) \subset H$, then the output $\{f_m\}_{m \ge 0}$ of the WRPGA satisfies the error estimate
$$\|f - f_m\| \le \|f\|_{\mathcal{A}_1(\mathcal{D})} \Big(1 + \sum_{k=1}^{m} t_k^2\Big)^{-\frac{1}{2}}.$$
Based on the WRPGA, we can define the VWRPGA. Let $\tau = \{t_m\}_{m=1}^{\infty}$, $0 < t_m \le 1$, be a given sequence. The VWRPGA consists of the following steps:
  • VWRPGA(H, $\mathcal{D}$): Given $f^i \in H$, $i = 1, \ldots, N$.
Step 0: Define $f_0^{i,v,\tau,r} := 0$, $i = 1, \ldots, N$.
Step m:
- If $f^i = f_{m-1}^{i,v,\tau,r}$ for all $i = 1, \ldots, N$, stop the algorithm and define $f_k^{i,v,\tau,r} = f_{m-1}^{i,v,\tau,r} = f^i$ for $k \ge m$.
- Otherwise, let $i_m$ be such that
$$\|f^{i_m} - f_{m-1}^{i_m,v,\tau,r}\| = \max_{1 \le i \le N} \|f^i - f_{m-1}^{i,v,\tau,r}\|.$$
Choose an element $\varphi_m \in \mathcal{D}$ such that
$$\langle f^{i_m} - f_{m-1}^{i_m,v,\tau,r}, \varphi_m \rangle \ge t_m \sup_{\varphi \in \mathcal{D}} \langle f^{i_m} - f_{m-1}^{i_m,v,\tau,r}, \varphi \rangle.$$
With
$$\lambda_m^i = \langle f^i - f_{m-1}^{i,v,\tau,r}, \varphi_m \rangle, \quad \hat{f}_m^i := f_{m-1}^{i,v,\tau,r} + \lambda_m^i \varphi_m, \quad s_m^i = \frac{\langle f^i, \hat{f}_m^i \rangle}{\|\hat{f}_m^i\|^2},$$
define the next approximants to be
$$f_m^{i,v,\tau,r} := s_m^i \hat{f}_m^i, \quad i = 1, \ldots, N,$$
and proceed to Step $m+1$.
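To illustrate the definition, here is a minimal Python sketch of the VWRPGA for a finite dictionary in $\mathbb{R}^n$, with exact maximization ($t_m = 1$) and with names of our own choosing; it is an illustrative sketch under these assumptions, not the authors' implementation.

```python
import numpy as np

def vwrpga_hilbert(F, D, iters):
    """VWRPGA sketch in a finite-dimensional Hilbert space, with t_m = 1.

    F : n x N array whose columns are the targets f^1, ..., f^N.
    D : n x K array with unit-norm columns (the dictionary).
    """
    Fm = np.zeros_like(F, dtype=float)
    for _ in range(iters):
        R = F - Fm                                         # residuals r^i = f^i - f^i_{m-1}
        i_m = int(np.argmax(np.linalg.norm(R, axis=0)))    # target with the largest residual
        inner = D.T @ R[:, i_m]
        j = int(np.argmax(np.abs(inner)))                  # phi_m chosen from the worst residual only
        if inner[j] == 0.0:
            break
        phi = D[:, j]
        for i in range(F.shape[1]):                        # one-dimensional rescaled update per target
            lam = (F[:, i] - Fm[:, i]) @ phi               # lambda_m^i = <r^i, phi_m>
            f_hat = Fm[:, i] + lam * phi
            denom = f_hat @ f_hat
            if denom > 0.0:
                s = (F[:, i] @ f_hat) / denom              # s_m^i = <f^i, f_hat^i> / ||f_hat^i||^2
                Fm[:, i] = s * f_hat
    return Fm
```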
In this section, we establish two results on the approximation properties of the VWRPGA(H, $\mathcal{D}$). We first give a sufficient condition for the convergence of the VWRPGA for any dictionary $\mathcal{D}$ and any $f^i$, $i = 1, \ldots, N$.
Theorem 5.
Assume $\sum_{m=1}^{\infty} t_m^2 = \infty$. Then, the VWRPGA converges for any dictionary $\mathcal{D}$ and any $f^i \in H$, $i = 1, \ldots, N$.
In the proof of Theorem 5, we will reduce the approximation of a general element to that of an element from $\mathcal{A}_1(\mathcal{D})$. To this end, we recall from [31] the following lemmas on the approximation properties of $\mathcal{A}_1(\mathcal{D})$.
Lemma 1.
Let $X$ be a Banach space and $\mathcal{D} \subset X$ be a dictionary. Then, for any $\epsilon > 0$ and any $f \in X$, there exists $f^{\epsilon} \in X$ such that
$$\|f - f^{\epsilon}\| < \epsilon$$
and
$$f^{\epsilon} / A(\epsilon) \in \mathcal{A}_1(\mathcal{D}),$$
with some number $A(\epsilon) > 0$.
Lemma 2.
For any $f \in H$ and any dictionary $\mathcal{D}$, we have
$$\sup_{\varphi \in \mathcal{D}} \langle f, \varphi \rangle = \sup_{g \in \mathcal{A}_1(\mathcal{D})} \langle f, g \rangle.$$
Proof of Theorem 5.
Note that $f_m^{i,v,\tau,r}$ is the orthogonal projection of $f^i$ onto the one-dimensional space $\mathrm{span}\{\hat{f}_m^i\}$. Thus, it is the best approximation to $f^i$ from $\mathrm{span}\{\hat{f}_m^i\}$.
Let $r_m^i := f^i - f_m^{i,v,\tau,r}$, $i = 1, \ldots, N$, be the residual of $f_m^{i,v,\tau,r}$. By the definition of $\hat{f}_m^i$ and the choice of $\lambda_m^i$, we have
$$\|r_m^i\|^2 = \|f^i - f_m^{i,v,\tau,r}\|^2 = \|f^i - s_m^i \hat{f}_m^i\|^2 \le \|f^i - \hat{f}_m^i\|^2 = \langle f^i - f_{m-1}^{i,v,\tau,r} - \lambda_m^i \varphi_m,\ f^i - f_{m-1}^{i,v,\tau,r} - \lambda_m^i \varphi_m \rangle = \|f^i - f_{m-1}^{i,v,\tau,r}\|^2 - 2\lambda_m^i \langle f^i - f_{m-1}^{i,v,\tau,r}, \varphi_m \rangle + (\lambda_m^i)^2 = \|r_{m-1}^i\|^2 - \langle f^i - f_{m-1}^{i,v,\tau,r}, \varphi_m \rangle^2.$$
The latter inequality implies that $\{\|r_m^i\|\}_{m=0}^{\infty}$ is a decreasing sequence. By the Monotone Convergence Theorem, $\lim_{m \to \infty} \|r_m^i\|$ exists for each $i = 1, \ldots, N$.
We prove that $\lim_{m \to \infty} \|r_m^i\| = 0$ by contradiction. Assume $\lim_{m \to \infty} \|r_m^i\| \ge a > 0$, $i = 1, \ldots, N$. Then, for any $m$, we have $\|r_m^i\| \ge a$. By (2), we obtain that
$$\sum_{i=1}^{N} \|r_m^i\|^2 \le \sum_{i=1}^{N} \|r_{m-1}^i\|^2 - \sum_{i=1}^{N} \langle f^i - f_{m-1}^{i,v,\tau,r}, \varphi_m \rangle^2 = \sum_{i=1}^{N} \|r_{m-1}^i\|^2 \bigg(1 - \frac{\sum_{i=1}^{N} \langle f^i - f_{m-1}^{i,v,\tau,r}, \varphi_m \rangle^2}{\sum_{i=1}^{N} \|r_{m-1}^i\|^2}\bigg) \le \sum_{i=1}^{N} \|r_{m-1}^i\|^2 \bigg(1 - \frac{\langle f^{i_m} - f_{m-1}^{i_m,v,\tau,r}, \varphi_m \rangle^2}{N \|r_{m-1}^{i_m}\|^2}\bigg) \le \sum_{i=1}^{N} \|f^i\|^2 \prod_{j=1}^{m} \bigg(1 - \frac{\langle f^{i_j} - f_{j-1}^{i_j,v,\tau,r}, \varphi_j \rangle^2}{N \|r_{j-1}^{i_j}\|^2}\bigg).$$
Denote
$$x_j = \frac{\langle f^{i_j} - f_{j-1}^{i_j,v,\tau,r}, \varphi_j \rangle^2}{N \|r_{j-1}^{i_j}\|^2}.$$
By the inequality $1 - x \le \frac{1}{1+x}$, $0 \le x \le 1$, we obtain that
$$\sum_{i=1}^{N} \|r_m^i\|^2 \le \sum_{i=1}^{N} \|f^i\|^2 \prod_{j=1}^{m} \frac{1}{1+x_j} \le \sum_{i=1}^{N} \|f^i\|^2 \, \frac{1}{1 + \sum_{j=1}^{m} x_j}.$$
Next, we obtain a lower estimate for $x_j$, $j = 1, \ldots, m$.
Set $\epsilon = \frac{a}{2}$. In view of Lemma 1, we can find $f_j^{\epsilon}$ such that
$$\|f^{i_j} - f_j^{\epsilon}\| < \epsilon$$
and
$$f_j^{\epsilon} / A(\epsilon) \in \mathcal{A}_1(\mathcal{D}),$$
with some number $A(\epsilon) > 0$.
Using Lemma 2, we have
$$\langle f^{i_j} - f_{j-1}^{i_j,v,\tau,r}, \varphi_j \rangle \ge t_j \sup_{\varphi \in \mathcal{D}} \langle f^{i_j} - f_{j-1}^{i_j,v,\tau,r}, \varphi \rangle = t_j \sup_{g \in \mathcal{A}_1(\mathcal{D})} \langle f^{i_j} - f_{j-1}^{i_j,v,\tau,r}, g \rangle \ge t_j A(\epsilon)^{-1} \langle f^{i_j} - f_{j-1}^{i_j,v,\tau,r}, f_j^{\epsilon} \rangle.$$
Since $f_m^{i,v,\tau,r}$ is the orthogonal projection of $f^i$ onto $\mathrm{span}\{\hat{f}_m^i\}$, we have
$$\langle f^i - f_m^{i,v,\tau,r}, f_m^{i,v,\tau,r} \rangle = 0.$$
Then,
$$\langle f^{i_j} - f_{j-1}^{i_j,v,\tau,r}, f_j^{\epsilon} \rangle = \langle r_{j-1}^{i_j}, f^{i_j} + f_j^{\epsilon} - f^{i_j} \rangle = \langle r_{j-1}^{i_j}, r_{j-1}^{i_j} + f_{j-1}^{i_j,v,\tau,r} \rangle - \langle r_{j-1}^{i_j}, f^{i_j} - f_j^{\epsilon} \rangle > \|r_{j-1}^{i_j}\|^2 - \|r_{j-1}^{i_j}\| \cdot \epsilon.$$
Combining (4) and (5) with $\epsilon = \frac{a}{2}$, we obtain
$$x_j \ge \frac{1}{N \|r_{j-1}^{i_j}\|^2} \bigg( \frac{t_j \, \langle f^{i_j} - f_{j-1}^{i_j,v,\tau,r}, f_j^{\epsilon} \rangle}{A(\epsilon)} \bigg)^2 \ge \frac{1}{N} \bigg( \frac{t_j \big( \|r_{j-1}^{i_j}\| - \epsilon \big)}{A(\epsilon)} \bigg)^2 \ge \frac{a^2}{4 N A(\epsilon)^2} \, t_j^2.$$
Combining (3) with (6), we can obtain that
$$\sum_{i=1}^{N} \|r_m^i\|^2 \le \sum_{i=1}^{N} \|f^i\|^2 \, \frac{1}{1 + \frac{a^2}{4 N A(\epsilon)^2} \sum_{j=1}^{m} t_j^2}.$$
The assumption $\sum_{m=1}^{\infty} t_m^2 = \infty$ implies that $\sum_{i=1}^{N} \|r_m^i\|^2 \to 0$ as $m \to \infty$.
Hence, $\lim_{m \to \infty} \|r_m^i\| = 0$ for $i = 1, \ldots, N$, which contradicts our assumption and proves the theorem. □
Remark 1.
It is known from Theorem 2.1 in [32] that $\sum_{m=1}^{\infty} t_m^2 = \infty$ is also a necessary condition for the convergence of the VWRPGA.
Remark 2.
According to the Cauchy–Schwarz inequality, we have
$$\sum_{m=1}^{\infty} \frac{t_m}{m} \le \Big(\sum_{m=1}^{\infty} t_m^2\Big)^{\frac{1}{2}} \Big(\sum_{m=1}^{\infty} \frac{1}{m^2}\Big)^{\frac{1}{2}}.$$
Hence,
$$\sum_{m=1}^{\infty} \frac{t_m}{m} = \infty \quad \text{implies} \quad \sum_{m=1}^{\infty} t_m^2 = \infty.$$
On the other hand, taking $t_m = m^{-\frac{1}{2}}$, $m = 1, 2, \ldots$, we notice that
$$\sum_{m=1}^{\infty} t_m^2 = \infty, \qquad \sum_{m=1}^{\infty} \frac{t_m}{m} < \infty.$$
Therefore, the convergence condition of the VWRPGA is weaker than that of the VWPGA.
The following theorem gives the error bound of the VWRPGA(H, $\mathcal{D}$) for $f^i \in \mathcal{A}_1(\mathcal{D})$, $i = 1, \ldots, N$.
Theorem 6.
Let $\tau := \{t_k\}_{k=1}^{\infty}$, $0 < t_k \le 1$, be a weakness sequence. If $f^i \in \mathcal{A}_1(\mathcal{D}) \subset H$, $i = 1, \ldots, N$, then for the VWRPGA(H, $\mathcal{D}$) we have
$$\|f^i - f_m^{i,v,\tau,r}\| \le \min\Big\{1, \Big(\frac{1}{N}\sum_{k=1}^{m} t_k^2\Big)^{-\frac{1}{2}}\Big\}.$$
Proof. 
We establish the approximation error of the VWRPGA based on the methods of [25]. The main idea of this proof is that the VWRPGA can be seen as a realization of the WRPGA with a particular weakness sequence.
Let $i \in \{1, \ldots, N\}$. From the assumption $f^i \in \mathcal{A}_1(\mathcal{D})$ and the fact that the sequence $\{\|f^i - f_m^{i,v,\tau,r}\|\}_{m=0}^{\infty}$ is decreasing, we have
$$\|f^i - f_m^{i,v,\tau,r}\| \le 1.$$
Thus, we only need to prove the estimate below:
$$\|f^i - f_m^{i,v,\tau,r}\| \le \Big( \frac{1}{N} \sum_{k=1}^{m} t_k^2 \Big)^{-\frac{1}{2}}, \quad i = 1, \ldots, N.$$
At step $k$, the VWRPGA chooses $\varphi_k$ from $\mathcal{D}$ in terms of only one of the residuals $r_{k-1}^1, \ldots, r_{k-1}^N$. Hence, by the time the VWRPGA reaches step $m$, each $f^i$, $i = 1, \ldots, N$, has been used a different number of times to choose $\varphi_k$. We now record the usage of each $f^i$.
For every $l = 1, \ldots, N$, denote $E_l := \{k \,|\, i_k = l,\ 1 \le k \le m\}$ ($i_k$ is defined in the definition of the VWRPGA). Then, we have
$$E_1 \cup E_2 \cup \cdots \cup E_N = \{1, \ldots, m\}, \qquad E_i \cap E_j = \emptyset \ \text{if}\ i \ne j.$$
Hence,
$$\|f^l - f_{k-1}^{l,v,\tau,r}\| = \|f^{i_k} - f_{k-1}^{i_k,v,\tau,r}\| = \max_{1 \le i \le N} \|f^i - f_{k-1}^{i,v,\tau,r}\|, \quad k \in E_l.$$
Using $\sum_{k=1}^{m} t_k^2 = \sum_{l=1}^{N} \sum_{k \in E_l} t_k^2$, we can find $l_0$, $1 \le l_0 \le N$, such that
$$\sum_{k \in E_{l_0}} t_k^2 \ge \frac{1}{N} \sum_{k=1}^{m} t_k^2.$$
Next, let $k_0 = \max\{k \,|\, k \in E_{l_0}\}$, $k_0 \le m$. We have
$$\max_{1 \le i \le N} \|f^i - f_m^{i,v,\tau,r}\| \le \max_{1 \le i \le N} \|f^i - f_{k_0-1}^{i,v,\tau,r}\| = \|f^{l_0} - f_{k_0-1}^{l_0,v,\tau,r}\|.$$
Now, we only consider the element $f^{l_0} \in H$. For $f^{l_0} \in H$, the approximants $f_1^{l_0,v,\tau,r}, \ldots, f_m^{l_0,v,\tau,r}$ can be obtained as an application of the WRPGA with the weakness sequence $\tau^{l_0} := \{t_k^{l_0}\}$ given by
$$t_k^{l_0} = \begin{cases} t_k, & k \in E_{l_0}, \\ 0, & \text{otherwise}. \end{cases}$$
Therefore, by Theorem 4, we obtain
$$\|f^{l_0} - f_{k_0-1}^{l_0,v,\tau,r}\| \le \Big( 1 + \sum_{k \in E_{l_0} \setminus \{k_0\}} t_k^2 \Big)^{-\frac{1}{2}} \le \Big( 1 + \sum_{k \in E_{l_0}} t_k^2 - 1 \Big)^{-\frac{1}{2}} \le \Big( \frac{1}{N} \sum_{k=1}^{m} t_k^2 \Big)^{-\frac{1}{2}}.$$
Together with (7), we complete the proof of Theorem 6. □
We recall the theorem in [21] about the error estimate of the VWPGA.
Theorem 7.
Let $\tau := \{t_k\}_{k=1}^{\infty}$, $0 < t_k \le 1$, be a decreasing sequence. Then, for any $f^1, \ldots, f^N$, $f^i \in \mathcal{A}_1(\mathcal{D})$, the output $\{f_m^i\}_{m \ge 0}$ of the VWPGA satisfies
$$\sum_{i=1}^{N} \|f^i - f_m^i\|^2 \le N^2 \Big(1 + \frac{1}{N}\sum_{k=1}^{m} t_k^2\Big)^{-\frac{t_m}{2N^{1/2} + t_m}}.$$
We observe from Theorem 7 that, for a fixed $m$, the error bound of the VWPGA increases as the number of target elements increases; the exponent approaches zero as $N$ becomes large.
Taking $t_k = 1$, $k = 1, 2, \ldots$, in Theorem 7, we obtain the following theorem, which gives the convergence rate of the VWPGA.
Theorem 8.
Let $\tau := \{t_k\}_{k=1}^{\infty}$, $t_k = 1$. Then, for any $f^1, \ldots, f^N$, $f^i \in \mathcal{A}_1(\mathcal{D})$, the output $\{f_m^i\}_{m \ge 0}$ of the VWPGA satisfies
$$\sum_{i=1}^{N} \|f^i - f_m^i\|^2 \le N^2 \Big(1 + \frac{m}{N}\Big)^{-\frac{1}{2N^{1/2} + 1}}.$$
Again, by taking t k = 1 , k = 1 , 2 , in Theorem 6, we obtain the following theorem.
Theorem 9.
Let $\tau := \{t_k\}_{k=1}^{\infty}$, $t_k = 1$. If $f^i \in \mathcal{A}_1(\mathcal{D}) \subset H$, $i = 1, \ldots, N$, then for the VWRPGA(H, $\mathcal{D}$),
$$\|f^i - f_m^{i,v,\tau,r}\| \le \min\Big\{1, \Big(\frac{m}{N}\Big)^{-\frac{1}{2}}\Big\}.$$
Remark 3.
From Theorems 8 and 9, we see that the VWRPGA provides a significantly better convergence rate than the VWPGA. In particular, this advantage is more obvious when N is large.
Remark 4.
It is known from Theorems 2 and 9 that the approximation properties of the VWRPGA are almost the same as those of the VWOGA, while the VWRPGA is simpler than the VWOGA from the viewpoint of computational complexity. For $N$ target elements, one can see from the definitions of the algorithms that the VWOGA needs to solve $N$ $m$-dimensional optimization problems, whereas the VWRPGA only needs to solve $N$ one-dimensional optimization problems. This makes the VWRPGA easier to implement than the VWOGA in practical applications.

3. The VWRPGA for Banach Spaces

In this section, we consider the VWRPGA in the setting of Banach spaces. We remark that there are two natural generalizations of the PGA to the case of a Banach space X: the X-greedy algorithm and the dual greedy algorithm. However, there are no general results on the convergence and error bounds of these two algorithms; cf. [29]. On the other hand, the WOGA, WRGA, WRPGA and VWOGA have been successfully generalized to the case of Banach spaces. We first recall from [31] the definition of the Weak Chebyshev Greedy Algorithm (WCGA), which is a natural generalization of the WOGA.
For any non-zero element $f \in X$, we denote by $F_f$ a norming functional for $f$:
$$\|F_f\| = 1, \qquad F_f(f) = \|f\|.$$
The existence of such a functional is guaranteed by the Hahn–Banach theorem.
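Two classical examples, recalled here for the reader's convenience (they are standard facts, not specific to this paper): in a Hilbert space one may take $F_f(g) = \frac{\langle f, g \rangle}{\|f\|}$, and in $L_p$, $1 < p < \infty$,
$$F_f(g) = \|f\|_p^{1-p} \int |f|^{p-1}\,\mathrm{sign}(f)\, g \, d\mu,$$
which indeed satisfies $\|F_f\| = 1$ and $F_f(f) = \|f\|_p$.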
Let $\tau = \{t_m\}_{m=1}^{\infty}$, $0 < t_m \le 1$, be a given sequence. The WCGA is defined as follows.
  • WCGA(X, $\mathcal{D}$):
Step 0: Define $f_0 = 0$.
Step m:
- If $f = f_{m-1}$, stop the algorithm and define $f_k = f_{m-1} = f$ for $k \ge m$.
- If $f \ne f_{m-1}$, choose an element $\varphi_m \in \mathcal{D}$ such that
$$F_{f - f_{m-1}}(\varphi_m) \ge t_m \sup_{\varphi \in \mathcal{D}} F_{f - f_{m-1}}(\varphi).$$
Set
$$\Phi_m := \mathrm{span}\{\varphi_1, \varphi_2, \ldots, \varphi_m\}.$$
Define $f_m$ to be the best approximant to $f$ from $\Phi_m$, and proceed to Step $m+1$.
To estimate the error of the WCGA, we shall utilize some geometric properties of Banach spaces. For a Banach space $X$, we define $\rho(u)$, the modulus of smoothness of $X$, as
$$\rho(u) := \sup_{f, g \in X,\ \|f\| = \|g\| = 1} \Big( \frac{\|f + ug\| + \|f - ug\|}{2} - 1 \Big), \quad u > 0.$$
A uniformly smooth Banach space is one with the property
$$\lim_{u \to 0} \frac{\rho(u)}{u} = 0.$$
We shall only consider Banach spaces whose modulus of smoothness satisfies the inequality
$$\rho(u) \le \gamma u^q, \quad 1 < q \le 2,$$
where $\gamma$ is a constant independent of $u$.
A typical example of a uniformly smooth Banach space is the Lebesgue space $L_p$, $1 < p < \infty$. It is known from [33] that
$$\rho(u) \le \begin{cases} u^p / p, & 1 < p \le 2, \\ (p-1)u^2 / 2, & 2 \le p < \infty. \end{cases}$$
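As a worked illustration (ours, not from the paper), these estimates give explicit values of $q$ and $\gamma$ for two concrete Lebesgue spaces:
$$L_{3/2}:\ \rho(u) \le \tfrac{2}{3}u^{3/2},\ \text{so } q = \tfrac{3}{2},\ \gamma = \tfrac{2}{3}; \qquad L_4:\ \rho(u) \le \tfrac{3}{2}u^{2},\ \text{so } q = 2,\ \gamma = \tfrac{3}{2},$$
with the corresponding dual exponents $p = \frac{q}{q-1}$ equal to $3$ and $2$, respectively.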
Moreover, we obtain from [34] that for any $X$ with $\dim X = \infty$,
$$\rho(u) \ge (1 + u^2)^{\frac{1}{2}} - 1,$$
and for any $X$ with $\dim X \ge 2$,
$$\rho(u) \ge C u^2, \quad C > 0.$$
The following error bound of the WCGA on $\mathcal{A}_1(\mathcal{D})$ has been established in [31].
Theorem 10.
Let $X$ be a Banach space with modulus of smoothness $\rho(u) \le \gamma u^q$, $1 < q \le 2$. If $f \in \mathcal{A}_1(\mathcal{D}) \subset X$, then the output $\{f_m\}_{m \ge 0}$ of the WCGA(X, $\mathcal{D}$) satisfies the inequality
$$\|f - f_m\| \le C_1(q, \gamma)\Big(1 + \sum_{k=1}^{m} t_k^{\frac{q}{q-1}}\Big)^{-1 + \frac{1}{q}},$$
where the constant $C_1(q, \gamma)$ depends only on $q$ and $\gamma$.
Taking $t_k = 1$, $k = 1, 2, \ldots$, Theorem 10 implies the following corollary, which can be found in [31].
Corollary 1.
Let $X$ be a Banach space with modulus of smoothness $\rho(u) \le \gamma u^q$, $1 < q \le 2$. Then, for any $f \in \mathcal{A}_1(\mathcal{D})$, the output $\{f_m\}_{m \ge 0}$ of the WCGA(X, $\mathcal{D}$) satisfies the inequality
$$\|f - f_m\| \le c \cdot m^{-1 + \frac{1}{q}}.$$
In order to show that the convergence rate $O(m^{-1 + \frac{1}{q}})$ cannot be improved, we take $L_p$, $1 < p < \infty$, as an example.
Let $1 < p \le 2$ be fixed. Combining Corollary 1 with inequality (8), we have, for any $\mathcal{D} \subset L_p$ and any $f \in \mathcal{A}_1(\mathcal{D})$,
$$\|f - f_m\| \le c \cdot m^{-1 + \frac{1}{p}}.$$
When $\mathcal{D}$ is a wavelet basis of $L_p$, it is known from [35] that there is an $f \in \mathcal{A}_1(\mathcal{D})$ such that
$$\|f - f_m\| \ge c \cdot m^{-1 + \frac{1}{p}}.$$
Thus, inequality (9) cannot be improved.
Similarly, let $p > 2$ be fixed. Combining Corollary 1 with inequality (8), we have, for any $\mathcal{D} \subset L_p$ and any $f \in \mathcal{A}_1(\mathcal{D})$,
$$\|f - f_m\| \le c \cdot m^{-\frac{1}{2}}.$$
When $\mathcal{D}$ is the trigonometric system in $L_p$, it is known from [36] that there is an $f \in \mathcal{A}_1(\mathcal{D})$ such that
$$\|f - f_m\| \ge c \cdot m^{-\frac{1}{2}}.$$
Thus, inequality (10) cannot be improved.
Hence, the convergence rate $O(m^{-1 + \frac{1}{q}})$ in Corollary 1 serves as a benchmark for the performance of greedy algorithms in uniformly smooth Banach spaces.
Next, we recall the definition of the WRGA in the Banach space setting from [31]. Let $\tau = \{t_m\}_{m=1}^{\infty}$, $0 < t_m \le 1$, be a given sequence. The WRGA is defined as follows.
  • WRGA(X, $\mathcal{D}$):
Step 0: Define $f_0 = 0$.
Step m:
- If $f = f_{m-1}$, stop the algorithm and define $f_k = f_{m-1} = f$ for $k \ge m$.
- If $f \ne f_{m-1}$, choose an element $\varphi_m \in \mathcal{D}$ such that
$$F_{f - f_{m-1}}(\varphi_m - f_{m-1}) \ge t_m \sup_{\varphi \in \mathcal{D}} F_{f - f_{m-1}}(\varphi - f_{m-1}).$$
Find $0 \le \lambda_m \le 1$ such that
$$\|f - ((1 - \lambda_m) f_{m-1} + \lambda_m \varphi_m)\| = \inf_{0 \le \lambda \le 1} \|f - ((1 - \lambda) f_{m-1} + \lambda \varphi_m)\|.$$
Define $f_m := (1 - \lambda_m) f_{m-1} + \lambda_m \varphi_m$, and proceed to Step $m+1$.
The following error bound of the WRGA on $\mathcal{A}_1(\mathcal{D})$ has been established in [31].
Theorem 11.
Let $X$ be a Banach space with modulus of smoothness $\rho(u) \le \gamma u^q$, $1 < q \le 2$. If $f \in \mathcal{A}_1(\mathcal{D}) \subset X$, then the output $\{f_m\}_{m \ge 0}$ of the WRGA(X, $\mathcal{D}$) satisfies the inequality
$$\|f - f_m\| \le C_2(q, \gamma)\Big(1 + \sum_{k=1}^{m} t_k^p\Big)^{-\frac{1}{p}}, \quad p = \frac{q}{q-1},$$
where the constant $C_2(q, \gamma)$ depends only on $q$ and $\gamma$.
Now, we turn to the vector greedy algorithms. Let $\tau = \{t_m\}_{m=1}^{\infty}$, $0 < t_m \le 1$, be a given sequence. The Vector Weak Chebyshev Greedy Algorithm (VWCGA) [22] is defined as follows.
  • VWCGA(X, $\mathcal{D}$):
Step 0: Define $f_0^{i,v,\tau,c} := 0$, $i = 1, \ldots, N$.
Step m:
- If $f^i = f_{m-1}^{i,v,\tau,c}$ for all $i = 1, \ldots, N$, stop the algorithm and define $f_k^{i,v,\tau,c} = f_{m-1}^{i,v,\tau,c} = f^i$ for $k \ge m$.
- Otherwise, let $i_m$ be such that
$$\|f^{i_m} - f_{m-1}^{i_m,v,\tau,c}\| = \max_{1 \le i \le N} \|f^i - f_{m-1}^{i,v,\tau,c}\|.$$
Choose an element $\varphi_m \in \mathcal{D}$ such that
$$F_{f^{i_m} - f_{m-1}^{i_m,v,\tau,c}}(\varphi_m) \ge t_m \sup_{\varphi \in \mathcal{D}} F_{f^{i_m} - f_{m-1}^{i_m,v,\tau,c}}(\varphi).$$
Set
$$\Phi_m := \mathrm{span}\{\varphi_1, \varphi_2, \ldots, \varphi_m\}.$$
Define $f_m^{i,v,\tau,c}$ to be the best approximant to $f^i$ from $\Phi_m$, $i = 1, \ldots, N$, and proceed to Step $m+1$.
Let $\tau = \{t_m\}_{m=1}^{\infty}$, $0 < t_m \le 1$, be a given sequence. The Vector Weak Relaxed Greedy Algorithm (VWRGA) [22] is defined as follows.
  • VWRGA(X, $\mathcal{D}$):
Step 0: Define $f_0^{i,v,\tau,r} := 0$, $i = 1, \ldots, N$.
Step m:
- If $f^i = f_{m-1}^{i,v,\tau,r}$ for all $i = 1, \ldots, N$, stop the algorithm and define $f_k^{i,v,\tau,r} = f_{m-1}^{i,v,\tau,r} = f^i$ for $k \ge m$.
- Otherwise, let $i_m$ be such that
$$\|f^{i_m} - f_{m-1}^{i_m,v,\tau,r}\| = \max_{1 \le i \le N} \|f^i - f_{m-1}^{i,v,\tau,r}\|.$$
Choose an element $\varphi_m \in \mathcal{D}$ such that
$$F_{f^{i_m} - f_{m-1}^{i_m,v,\tau,r}}(\varphi_m) \ge t_m \sup_{\varphi \in \mathcal{D}} F_{f^{i_m} - f_{m-1}^{i_m,v,\tau,r}}(\varphi).$$
Find $0 \le \lambda_m^i \le 1$ such that
$$\|f^i - ((1 - \lambda_m^i) f_{m-1}^{i,v,\tau,r} + \lambda_m^i \varphi_m)\| = \inf_{0 \le \lambda \le 1} \|f^i - ((1 - \lambda) f_{m-1}^{i,v,\tau,r} + \lambda \varphi_m)\|.$$
Define $f_m^{i,v,\tau,r} := (1 - \lambda_m^i) f_{m-1}^{i,v,\tau,r} + \lambda_m^i \varphi_m$, $i = 1, \ldots, N$, and proceed to Step $m+1$.
The error bounds of the VWCGA and VWRGA on $\mathcal{A}_1(\mathcal{D})$ have been established in [22].
Theorem 12.
Let $X$ be a Banach space with modulus of smoothness $\rho(u) \le \gamma u^q$, $1 < q \le 2$. Then, for a sequence $\tau := \{t_k\}_{k=1}^{\infty}$, $0 < t_k \le 1$, and any $f^i \in \mathcal{A}_1(\mathcal{D}) \subset X$, $i = 1, \ldots, N$, we have
$$\|f^i - f_m^{i,v,\tau,c}\| \le C_1(q, \gamma)\min\Big\{1, \Big(\frac{1}{N}\sum_{k=1}^{m} t_k^p\Big)^{-\frac{1}{p}}\Big\}, \quad p = \frac{q}{q-1},$$
$$\|f^i - f_m^{i,v,\tau,r}\| \le C_2(q, \gamma)\min\Big\{1, \Big(\frac{1}{N}\sum_{k=1}^{m} t_k^p\Big)^{-\frac{1}{p}}\Big\}, \quad p = \frac{q}{q-1}.$$
Now, we define the VWRPGA(X, $\mathcal{D}$). To this end, we recall the definition of the WRPGA from [17]. Let $X$ be a Banach space with modulus of smoothness $\rho(u) \le \gamma u^q$, $1 < q \le 2$, and let $\tau = \{t_m\}_{m=1}^{\infty}$, $0 < t_m \le 1$, be a given sequence.
  • WRPGA(X, $\mathcal{D}$):
Step 0: Define $f_0 = 0$.
Step m:
- If $f = f_{m-1}$, stop the algorithm and define $f_k = f_{m-1} = f$ for $k \ge m$.
- If $f \ne f_{m-1}$, choose an element $\varphi_m \in \mathcal{D}$ such that
$$|F_{f - f_{m-1}}(\varphi_m)| \ge t_m \sup_{\varphi \in \mathcal{D}} F_{f - f_{m-1}}(\varphi).$$
With
$$\lambda_m = \mathrm{sign}\{F_{f - f_{m-1}}(\varphi_m)\}\, \|f - f_{m-1}\|\, (2\gamma q)^{\frac{1}{1-q}}\, |F_{f - f_{m-1}}(\varphi_m)|^{\frac{1}{q-1}},$$
$$\hat{f}_m := f_{m-1} + \lambda_m \varphi_m,$$
choose $s_m$ such that
$$\|f - s_m \hat{f}_m\| = \min_{s \in \mathbb{R}} \|f - s \hat{f}_m\|.$$
Define the next approximant to be $f_m = s_m \hat{f}_m$, and proceed to Step $m+1$.
The sufficient conditions for the convergence of the WRPGA in terms of the weakness sequence and the modulus of smoothness can be found in [17]. Moreover, the following theorem gives the error bound of the WRPGA on $\mathcal{A}_1(\mathcal{D})$.
Theorem 13
(see Theorem 6.1 in [17]). Let $X$ be a Banach space with modulus of smoothness $\rho(u) \le \gamma u^q$, $1 < q \le 2$. If $f \in \mathcal{A}_1(\mathcal{D}) \subset X$, then the output $\{f_m\}_{m \ge 0}$ of the WRPGA(X, $\mathcal{D}$) satisfies the inequality
$$\|f - f_m\| \le C_3(q, \gamma)\Big(1 + \sum_{k=1}^{m} t_k^p\Big)^{-\frac{1}{p}}, \quad p = \frac{q}{q-1},$$
where the constant $C_3(q, \gamma)$ depends only on $q$ and $\gamma$.
Let $X$ be a Banach space with modulus of smoothness $\rho(u) \le \gamma u^q$, $1 < q \le 2$, and let $\tau = \{t_m\}_{m=1}^{\infty}$, $0 < t_m \le 1$, be a given sequence. We define the VWRPGA(X, $\mathcal{D}$) as follows.
  • VWRPGA(X, $\mathcal{D}$): Given $f^i \in X$, $i = 1, \ldots, N$.
Step 0: Define $f_0^{i,v,\tau,R} := 0$, $i = 1, \ldots, N$.
Step m:
- If $f^i = f_{m-1}^{i,v,\tau,R}$ for all $i = 1, \ldots, N$, stop the algorithm and define $f_k^{i,v,\tau,R} = f_{m-1}^{i,v,\tau,R} = f^i$ for $k \ge m$.
- Otherwise, let $i_m$ be such that
$$\|f^{i_m} - f_{m-1}^{i_m,v,\tau,R}\| = \max_{1 \le i \le N} \|f^i - f_{m-1}^{i,v,\tau,R}\|.$$
Choose an element $\varphi_m \in \mathcal{D}$ such that
$$F_{f^{i_m} - f_{m-1}^{i_m,v,\tau,R}}(\varphi_m) \ge t_m \sup_{\varphi \in \mathcal{D}} F_{f^{i_m} - f_{m-1}^{i_m,v,\tau,R}}(\varphi).$$
With
$$\lambda_m^i = \mathrm{sign}\{F_{f^i - f_{m-1}^{i,v,\tau,R}}(\varphi_m)\}\, \|f^i - f_{m-1}^{i,v,\tau,R}\|\, (2\gamma q)^{\frac{1}{1-q}}\, |F_{f^i - f_{m-1}^{i,v,\tau,R}}(\varphi_m)|^{\frac{1}{q-1}},$$
$$\hat{f}_m^i := f_{m-1}^{i,v,\tau,R} + \lambda_m^i \varphi_m,$$
choose $s_m^i$ such that
$$\|f^i - s_m^i \hat{f}_m^i\| = \min_{s \in \mathbb{R}} \|f^i - s \hat{f}_m^i\|.$$
Define the next approximants to be $f_m^{i,v,\tau,R} = s_m^i \hat{f}_m^i$, $i = 1, \ldots, N$, and proceed to Step $m+1$.
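To make the Banach space version concrete, the following Python sketch runs the VWRPGA in $\ell_p$ with $p \ge 2$, so that one may take $q = 2$ and $\gamma = (p-1)/2$ by the estimate recalled above; it uses SciPy's scalar minimizer for the one-dimensional step and is our own illustrative code under these assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def norming_functional(r, p):
    """Coefficient vector of the norming functional F_r in ell_p (applied via a dot product)."""
    return np.sign(r) * np.abs(r) ** (p - 1) / np.linalg.norm(r, ord=p) ** (p - 1)

def vwrpga_banach(F, D, iters, p=4.0):
    """VWRPGA sketch in ell_p, p >= 2, so rho(u) <= ((p-1)/2) u^2, i.e. q = 2, gamma = (p-1)/2."""
    q, gamma = 2.0, (p - 1.0) / 2.0
    Fm = np.zeros_like(F, dtype=float)
    for _ in range(iters):
        R = F - Fm
        norms = np.array([np.linalg.norm(R[:, i], ord=p) for i in range(F.shape[1])])
        i_m = int(np.argmax(norms))                     # target with the largest residual norm
        if norms[i_m] == 0.0:
            break
        vals = D.T @ norming_functional(R[:, i_m], p)   # F_{r^{i_m}}(phi) for every column phi
        j = int(np.argmax(np.abs(vals)))
        phi = np.sign(vals[j]) * D[:, j]                # dictionary symmetry: use -phi if needed
        for i in range(F.shape[1]):
            r = F[:, i] - Fm[:, i]
            if not np.any(r):
                continue
            val = norming_functional(r, p) @ phi        # F_{r^i}(phi_m)
            lam = (np.sign(val) * np.linalg.norm(r, ord=p)
                   * (2.0 * gamma * q) ** (1.0 / (1.0 - q)) * abs(val) ** (1.0 / (q - 1.0)))
            f_hat = Fm[:, i] + lam * phi
            best = minimize_scalar(lambda s: np.linalg.norm(F[:, i] - s * f_hat, ord=p))
            Fm[:, i] = best.x * f_hat                   # f^i_m = s^i_m * f_hat^i_m
    return Fm
```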
We now obtain the convergence properties and error bound of the VWRPGA in this setting.
Firstly, we establish a theorem on the convergence of the VWRPGA; it appears to be the first result on the convergence of vector greedy algorithms in the Banach space setting.
Theorem 14.
Let $X$ be a Banach space with modulus of smoothness $\rho(u) \le \gamma u^q$, $1 < q \le 2$. Assume
$$\sum_{m=1}^{\infty} t_m^p = \infty, \quad p = \frac{q}{q-1}.$$
Then, for any $f^i \in X$, $i = 1, \ldots, N$, and any dictionary $\mathcal{D}$, the VWRPGA converges.
The idea of the proof of Theorem 14 is similar to that of Theorem 5. However, because of the more complicated geometry of Banach spaces, several arguments in the subsequent analysis must be modified, replaced, or generalized. Some useful results from the Hilbert space case have been generalized to Banach spaces, as shown in the following lemmas.
Lemma 3.
Let $X$ be a Banach space with modulus of smoothness $\rho(u) \le \gamma u^q$, $1 < q \le 2$. For any two nonzero elements $f, g \in X$ and any $h > 0$, we have
$$\|f - hg\| \le \|f\| + 2\|f\|\gamma \cdot \Big(\frac{h\|g\|}{\|f\|}\Big)^q - h F_f(g).$$
Proof. 
The proof of this lemma follows from the proof of Lemma 6.1 in [29] and the fact that the modulus of smoothness of $X$ satisfies $\rho(u) \le \gamma u^q$, $1 < q \le 2$. □
Lemma 4.
Let $X$ be a Banach space with modulus of smoothness $\rho(u) \le \gamma u^q$, $1 < q \le 2$. Let $f_{m-1}^{i,v,\tau,R}$ be the output of the VWRPGA at Step $m-1$ for $f^i$, $i = 1, \ldots, N$. If $f^i \ne f_{m-1}^{i,v,\tau,R}$, then we have
$$F_{f^i - f_{m-1}^{i,v,\tau,R}}(f_{m-1}^{i,v,\tau,R}) = 0.$$
Proof. 
Denote $L := \mathrm{span}\{\hat{f}_{m-1}^i\} \subset X$. By the definition of the VWRPGA(X, $\mathcal{D}$), $f_{m-1}^{i,v,\tau,R}$ is the best approximant to $f^i$ from $L$ for $i = 1, \ldots, N$. Thus, the conclusion of the lemma follows from Lemma 2.1 in [31]. □
Lemma 5
(see Lemma 2.2 in [31]). For any bounded linear functional $F$ and any dictionary $\mathcal{D}$ in a Banach space, we have
$$\sup_{\varphi \in \mathcal{D}} F(\varphi) = \sup_{g \in \mathcal{A}_1(\mathcal{D})} F(g).$$
Now, we prove Theorem 14.
Proof of Theorem 14.
Let $r_m^i$, $i = 1, \ldots, N$, be the residual of $f_m^{i,v,\tau,R}$. It is known from the definition of the VWRPGA(X, $\mathcal{D}$) that $r_m^i$ satisfies
$$\|r_m^i\| = \|f^i - f_m^{i,v,\tau,R}\| = \|f^i - s_m^i \hat{f}_m^i\| \le \|f^i - \hat{f}_m^i\| = \|r_{m-1}^i - \lambda_m^i \varphi_m\|.$$
We apply Lemma 3 to the latter quantity with $f = r_{m-1}^i$, $g = \mathrm{sign}(\lambda_m^i)\varphi_m$, $h = |\lambda_m^i|$, and obtain
$$\|r_m^i\| \le \|r_{m-1}^i\| + 2\|r_{m-1}^i\|\gamma \cdot \Big(\frac{|\lambda_m^i|}{\|r_{m-1}^i\|}\Big)^q - \lambda_m^i F_{f^i - f_{m-1}^{i,v,\tau,R}}(\varphi_m).$$
By the choice of $\lambda_m^i$, we have
$$\|r_m^i\| \le \|r_{m-1}^i\|\Big(1 - \frac{q-1}{q}(2\gamma q)^{\frac{1}{1-q}} \cdot |F_{f^i - f_{m-1}^{i,v,\tau,R}}(\varphi_m)|^{\frac{q}{q-1}}\Big).$$
Thus, it is easy to see that $\{\|r_m^i\|\}_{m=0}^{\infty}$ is a decreasing sequence. By the Monotone Convergence Theorem, $\lim_{m \to \infty} \|r_m^i\|$ exists for $i = 1, \ldots, N$.
Next, we prove that $\lim_{m \to \infty} \|r_m^i\| = 0$ by contradiction. Assume $\lim_{m \to \infty} \|r_m^i\| \ge a > 0$, $i = 1, \ldots, N$. Then, for any $m$, we have $\|r_m^i\| \ge a$. By (11), we obtain that
$$\sum_{i=1}^{N} \|r_m^i\| \le \sum_{i=1}^{N} \|r_{m-1}^i\| - \frac{q-1}{q}(2\gamma q)^{\frac{1}{1-q}} \sum_{i=1}^{N} \|r_{m-1}^i\| \cdot |F_{f^i - f_{m-1}^{i,v,\tau,R}}(\varphi_m)|^{\frac{q}{q-1}} \le \sum_{i=1}^{N} \|r_{m-1}^i\| \Bigg(1 - \frac{\frac{q-1}{q}(2\gamma q)^{\frac{1}{1-q}} \|r_{m-1}^{i_m}\| \big(F_{f^{i_m} - f_{m-1}^{i_m,v,\tau,R}}(\varphi_m)\big)^{\frac{q}{q-1}}}{\sum_{i=1}^{N} \|r_{m-1}^i\|}\Bigg) \le \sum_{i=1}^{N} \|r_{m-1}^i\| \Bigg(1 - \frac{\frac{q-1}{q}(2\gamma q)^{\frac{1}{1-q}} \|r_{m-1}^{i_m}\| \big(F_{f^{i_m} - f_{m-1}^{i_m,v,\tau,R}}(\varphi_m)\big)^{\frac{q}{q-1}}}{N \|r_{m-1}^{i_m}\|}\Bigg) \le \sum_{i=1}^{N} \|f^i\| \prod_{j=1}^{m} \Bigg(1 - \frac{1}{N} \cdot \frac{q-1}{q}(2\gamma q)^{\frac{1}{1-q}} \big(F_{f^{i_j} - f_{j-1}^{i_j,v,\tau,R}}(\varphi_j)\big)^{\frac{q}{q-1}}\Bigg).$$
Denote
$$x_j = \frac{1}{N} \cdot \frac{q-1}{q}(2\gamma q)^{\frac{1}{1-q}} \cdot \big(F_{f^{i_j} - f_{j-1}^{i_j,v,\tau,R}}(\varphi_j)\big)^{\frac{q}{q-1}}.$$
By the inequality $1 - x \le \frac{1}{1+x}$, $0 \le x \le 1$, we obtain
$$\sum_{i=1}^{N} \|r_m^i\| \le \sum_{i=1}^{N} \|f^i\| \prod_{j=1}^{m} \frac{1}{1+x_j} \le \sum_{i=1}^{N} \|f^i\| \, \frac{1}{1 + \sum_{j=1}^{m} x_j}.$$
Then, we proceed with a lower estimate for $x_j$, $j = 1, \ldots, m$.
By Lemma 1, we set $\epsilon = \frac{a}{2}$ and find $f_j^{\epsilon}$ such that
$$\|f^{i_j} - f_j^{\epsilon}\| < \epsilon$$
and
$$f_j^{\epsilon} / A(\epsilon) \in \mathcal{A}_1(\mathcal{D}),$$
with some number $A(\epsilon) > 0$.
We obtain from Lemma 5 that
$$F_{f^{i_j} - f_{j-1}^{i_j,v,\tau,R}}(\varphi_j) \ge t_j \sup_{\varphi \in \mathcal{D}} F_{f^{i_j} - f_{j-1}^{i_j,v,\tau,R}}(\varphi) = t_j \sup_{g \in \mathcal{A}_1(\mathcal{D})} F_{f^{i_j} - f_{j-1}^{i_j,v,\tau,R}}(g) \ge t_j A(\epsilon)^{-1} F_{f^{i_j} - f_{j-1}^{i_j,v,\tau,R}}(f_j^{\epsilon}).$$
By Lemma 4, we obtain
$$F_{f^{i_j} - f_{j-1}^{i_j,v,\tau,R}}(f_j^{\epsilon}) = F_{f^{i_j} - f_{j-1}^{i_j,v,\tau,R}}\big(f^{i_j} - (f^{i_j} - f_j^{\epsilon})\big) = F_{f^{i_j} - f_{j-1}^{i_j,v,\tau,R}}(f^{i_j}) - F_{f^{i_j} - f_{j-1}^{i_j,v,\tau,R}}(f^{i_j} - f_j^{\epsilon}) > F_{f^{i_j} - f_{j-1}^{i_j,v,\tau,R}}\big(f^{i_j} - f_{j-1}^{i_j,v,\tau,R} + f_{j-1}^{i_j,v,\tau,R}\big) - \epsilon = F_{f^{i_j} - f_{j-1}^{i_j,v,\tau,R}}\big(f^{i_j} - f_{j-1}^{i_j,v,\tau,R}\big) - \epsilon = \|r_{j-1}^{i_j}\| - \epsilon.$$
Inequalities (14) and (15) and $\epsilon = \frac{a}{2}$ result in
$$F_{f^{i_j} - f_{j-1}^{i_j,v,\tau,R}}(\varphi_j) \ge t_j A(\epsilon)^{-1}\big(\|r_{j-1}^{i_j}\| - \epsilon\big) \ge t_j A(\epsilon)^{-1} \cdot \frac{a}{2}.$$
Combining (12) with (16), we obtain
$$x_j \ge \frac{1}{N} \cdot \frac{q-1}{q}(2\gamma q)^{\frac{1}{1-q}} \cdot \Big(t_j A(\epsilon)^{-1} \cdot \frac{a}{2}\Big)^{\frac{q}{q-1}} = \frac{1}{N} \cdot \frac{q-1}{q}(2\gamma q)^{\frac{1}{1-q}} \cdot \Big(\frac{a}{2 A(\epsilon)}\Big)^{p} t_j^{p}.$$
Combining (17) with (13), we have
$$\sum_{i=1}^{N} \|r_m^i\| \le \sum_{i=1}^{N} \|f^i\| \cdot \frac{1}{1 + \frac{1}{N} \cdot \frac{q-1}{q}(2\gamma q)^{\frac{1}{1-q}} \big(\frac{a}{2 A(\epsilon)}\big)^{p} \sum_{j=1}^{m} t_j^{p}}.$$
The assumption $\sum_{m=1}^{\infty} t_m^p = \infty$ implies that $\sum_{i=1}^{N} \|r_m^i\| \to 0$ as $m \to \infty$.
Thus, $\lim_{m \to \infty} \|r_m^i\| = 0$ for $i = 1, \ldots, N$. We obtain a contradiction, which proves the theorem. □
Remark 5.
According to Theorem 3.1 in [32], $\sum_{m=1}^{\infty} t_m^p = \infty$ is also a necessary condition for the convergence of the VWRPGA.
Remark 6.
Since the WRGA converges only for target elements from $\mathcal{A}_1(\mathcal{D})$ (see [31]), the VWRGA also converges only for target elements from $\mathcal{A}_1(\mathcal{D})$. Thus, the convergence property of the VWRPGA is better than that of the VWRGA.
Remark 7.
For $f^i \in \mathcal{A}_1(\mathcal{D})$, $i = 1, \ldots, N$, the convergence of the VWCGA follows from Theorem 12. Theorem 15 below gives a convergence condition for arbitrary $f^i \in X$, $i = 1, \ldots, N$.
By using the same method, it is not difficult to prove the following theorem on the convergence of the VWCGA.
Theorem 15.
Let $X$ be a Banach space with modulus of smoothness $\rho(u) \le \gamma u^q$, $1 < q \le 2$. Assume
$$\sum_{m=1}^{\infty} t_m^p = \infty, \quad p = \frac{q}{q-1}.$$
Then, for any $f^i \in X$, $i = 1, \ldots, N$, and any dictionary $\mathcal{D}$, the VWCGA converges.
Next, we give the theorem on the error bound of the VWRPGA(X, $\mathcal{D}$) on $\mathcal{A}_1(\mathcal{D})$.
Theorem 16.
Let $X$ be a Banach space with modulus of smoothness $\rho(u) \le \gamma u^q$, $1 < q \le 2$. Then, for a sequence $\tau := \{t_k\}_{k=1}^{\infty}$, $0 < t_k \le 1$, and any $f^i \in \mathcal{A}_1(\mathcal{D}) \subset X$, $i = 1, \ldots, N$, we have
$$\|f^i - f_m^{i,v,\tau,R}\| \le C_3(q, \gamma)\min\Big\{1, \Big(\frac{1}{N}\sum_{k=1}^{m} t_k^p\Big)^{-\frac{1}{p}}\Big\}, \quad p = \frac{q}{q-1}.$$
Proof. 
It is known from (11) that the sequences $\{\|f^i - f_m^{i,v,\tau,R}\|\}_{m=0}^{\infty}$, $i = 1, \ldots, N$, are decreasing. Fix $i$. The inequality
$$\|f^i - f_m^{i,v,\tau,R}\| \le 1$$
follows from the assumption $f^i \in \mathcal{A}_1(\mathcal{D})$ and the fact that $\{\|f^i - f_m^{i,v,\tau,R}\|\}_{m=0}^{\infty}$ is decreasing.
Thus, we only need to prove the following estimate:
$$\|f^i - f_m^{i,v,\tau,R}\| \le C(q, \gamma)\Big(\frac{1}{N}\sum_{k=1}^{m} t_k^p\Big)^{-\frac{1}{p}}, \quad i = 1, \ldots, N.$$
We define the set $E_l := \{k \,|\, i_k = l,\ 1 \le k \le m\}$ just as we did in the proof of Theorem 6. It is obvious that
$$\sum_{k=1}^{m} t_k^p = \sum_{l=1}^{N} \sum_{k \in E_l} t_k^p.$$
Thus, there exists $l_0$, $1 \le l_0 \le N$, such that
$$\sum_{k \in E_{l_0}} t_k^p \ge \frac{1}{N}\sum_{k=1}^{m} t_k^p.$$
As in the proof of Theorem 6, for $f^{l_0} \in X$, the approximants $f_1^{l_0,v,\tau,R}, \ldots, f_m^{l_0,v,\tau,R}$ are the outputs of the WRPGA with the weakness sequence $\tau^{l_0} := \{t_k^{l_0}\}$. Therefore, using Theorem 13, we obtain
$$\|f^{l_0} - f_{k_0-1}^{l_0,v,\tau,R}\| \le C(q, \gamma)\Big(1 + \sum_{k \in E_{l_0} \setminus \{k_0\}} t_k^p\Big)^{-\frac{1}{p}} \le C(q, \gamma)\Big(1 + \sum_{k \in E_{l_0}} t_k^p - 1\Big)^{-\frac{1}{p}} \le C(q, \gamma)\Big(\frac{1}{N}\sum_{k=1}^{m} t_k^p\Big)^{-\frac{1}{p}}.$$
The proof of Theorem 16 is completed. □
Remark 8.
We know from Theorems 12 and 16 that the error bound of the VWRPGA is almost the same as those of the VWCGA and VWRGA, while the computational complexity of the VWRPGA is essentially smaller than those of the VWCGA and VWRGA.

4. Conclusions

In this paper, we consider the use of vector greedy algorithms for simultaneous approximation. We first work in a Hilbert space $H$. We propose a new vector greedy algorithm—the Vector Weak Rescaled Pure Greedy Algorithm (VWRPGA)—for simultaneous approximation with respect to a dictionary $\mathcal{D}$ in $H$. Then, we study the error performance of the VWRPGA. We show that the convergence rate of the VWRPGA on $\mathcal{A}_1(\mathcal{D})$ is optimal. The VWRPGA has a weaker convergence condition than the VWPGA, and its convergence rate is better than that of the VWPGA; this advantage is more obvious when $N$ is large. Moreover, the error performance of the VWRPGA is similar to that of the VWOGA. However, from the viewpoint of computational complexity, the VWRPGA is simpler than the VWOGA: for $N$ target elements, one can see from the definitions of the algorithms that the VWOGA needs to solve $N$ $m$-dimensional optimization problems, whereas the VWRPGA only needs to solve $N$ one-dimensional optimization problems.
Then, we design the Vector Weak Rescaled Pure Greedy Algorithm (VWRPGA) in a uniformly smooth Banach space setting. We obtain the convergence properties and error bound of the VWRPGA in this case. We also show that the convergence condition of the VWCGA is the same as that of the VWRPGA. We show that when the Banach space is a Lebesgue space, the convergence rate of the VWRPGA on $\mathcal{A}_1(\mathcal{D})$ is sharp. As for the convergence properties, the VWRGA converges only for target elements from $\mathcal{A}_1(\mathcal{D})$, while the VWRPGA converges for any element; therefore, the VWRPGA has better convergence properties than the VWRGA. The error bounds of the VWRPGA are similar to those of the VWCGA and VWRGA, and from the viewpoint of computational complexity, the VWRPGA is simpler than the VWCGA and the VWRGA.
In conclusion, the VWRPGA is the simplest vector greedy algorithm for simultaneous approximation with the best convergence property and the optimal convergence rate.
The VWRPGA is more efficient than the WRPGA, since the complexity of calculation and the storage of information are greatly reduced by running the VWRPGA instead of the $N$-fold WRPGA. If $\tau := \{t_k\}_{k=1}^{\infty}$, $t_k = 1$, $k = 1, 2, \ldots$, and $N = 1$, then the VWRPGA degenerates into the RPGA. In [5], the authors applied the RPGA to kernel-based regression. They defined the Rescaled Pure Greedy Learning Algorithm (RPGLA) and studied its efficiency. They showed that the computational complexity of the RPGLA is less than those of the Orthogonal Greedy Learning Algorithm (OGLA) [37] and the Relaxed Greedy Learning Algorithm (RGLA) [38]. When the kernel is infinitely smooth, the learning rate can be arbitrarily close to the best rate $O(m^{-1})$ under a mild assumption on the regression function. Since the VWRPGA is more efficient than the RPGA, the VWRPGA can be used to solve multi-task learning problems more efficiently. Moreover, it is natural to consider applications of the VWRPGA to vector signal processing. We will study these applications of the VWRPGA in the future.

Author Contributions

Conceptualization, X.X., P.Y. and W.Z.; methodology, X.X. and P.Y.; formal analysis, J.G.; investigation, all authors; resources, all authors; data curation, all authors; writing—original draft preparation, P.Y. and J.G.; writing—review and editing, X.X. and P.Y.; visualization, all authors; supervision, X.X. and P.Y.; project administration, all authors; funding acquisition, all authors. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (No. 11671213).

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Barron, A.; Cohen, A.; Dahmen, W.; DeVore, R. Approximation and learning by greedy algorithms. Ann. Stat. 2008, 36, 64–94. [Google Scholar] [CrossRef]
  2. Cohen, A.; Dahmen, W.; DeVore, R. Compressed sensing and best k-term approximation. J. Am. Math. Soc. 2009, 22, 211–231. [Google Scholar] [CrossRef]
  3. Yang, B.; Yang, C.; Huang, G. Efficient image fusion with approximate sparse representation. Int. J. Wavelets Multiresolut. Inf. Process. 2016, 14, 1650024. [Google Scholar]
  4. Zhang, W.H.; Ye, P.X.; Xing, S.; Xu, X. Optimality of the approximation and learning by the rescaled pure super greedy algorithms. Axioms 2022, 11, 437. [Google Scholar] [CrossRef]
  5. Zhang, W.H.; Ye, P.X.; Xing, S. Optimality of the rescaled pure greedy learning algorithms. Int. J. Wavelets Multiresolut. Inf. Process. 2023, 21, 2250048. [Google Scholar] [CrossRef]
  6. Nguyen, H.; Petrova, G. Greedy strategies for convex optimization. Calcolo 2017, 54, 207–224. [Google Scholar] [CrossRef]
  7. Huang, A.T.; Feng, R.Z.; Wang, A.D. The sufficient conditions for orthogonal matching pursuit to exactly reconstruct sparse polynomials. Mathematics 2022, 10, 3703. [Google Scholar] [CrossRef]
  8. Liu, Z.Y.; Xu, Q.Y. A multiscale RBF collocation method for the numerical solution of partial differential equations. Mathematics 2019, 7, 964. [Google Scholar] [CrossRef]
  9. Jin, D.F.; Yang, G.; Li, Z.H.; Liu, H.D. Sparse recovery algorithm for compressed sensing using smoothed l0 norm and randomized coordinate descent. Mathematics 2019, 7, 834. [Google Scholar] [CrossRef]
  10. Natsiou, A.A.; Gravvanis, G.A.; Filelis-Papadopoulos, C.K.; Giannoutakis, K.M. An aggregation-based algebraic multigrid method with deflation techniques and modified generic factored approximate sparse inverses. Mathematics 2023, 11, 640. [Google Scholar] [CrossRef]
  11. Argyriou, A.; Evgeniou, T.; Pontil, M. Convex multitask feature learning. Mach. Learn. 2008, 73, 243–272. [Google Scholar] [CrossRef]
  12. Schmidt, E. Zur Theorie der linearen und nichtlinearen Integralgleichungen. I Math. Annalen. 1906–1907, 63, 433–476. [Google Scholar] [CrossRef]
  13. Tropp, J.A.; Gilbert, A.C.; Strauss, M.J. Algorithms for simultaneous sparse approximation. Part I: Greedy pursuit. Signal. Process. 2006, 86, 572–588. [Google Scholar] [CrossRef]
  14. Wirtz, D.; Haasdonk, B. A vectorial kernel orthogonal greedy algorithm. Proc. DWCAA 2013, 6, 83–100. [Google Scholar]
  15. DeVore, R.A.; Temlyakov, V.N. Some remarks on greedy algorithms. Adv. Comput. Math. 1996, 5, 173–187. [Google Scholar] [CrossRef]
  16. Gao, Z.; Petrova, G. Rescaled pure greedy algorithm for convex optimization. Calcolo 2019, 56, 15. [Google Scholar] [CrossRef]
  17. Petrova, G. Rescaled pure greedy algorithm for Hilbert and Banach spaces. Appl. Comput. Harmon. Anal. 2016, 41, 852–866. [Google Scholar] [CrossRef]
  18. Jiang, B.; Ye, P.; Zhang, W. Unified error estimate for weak biorthogonal greedy algorithms. Int. J. Wavelets Multiresolut. Inform. Process. 2022, 5, 2150001. [Google Scholar] [CrossRef]
  19. Dereventsov, A.V.; Temlyakov, V.N. A unified way of analyzing some greedy algorithms. J. Funct. Anal. 2019, 12, 1–30. [Google Scholar] [CrossRef]
  20. Temlyakov, V.N. A remark on simultaneous greedy approximation. East J. Approx. 2004, 10, 17–25. [Google Scholar]
  21. Leviatan, D.; Temlyakov, V.N. Simultaneous approximation by greedy algorithms. Adv. Comput. Math. 2006, 25, 73–90. [Google Scholar] [CrossRef]
  22. Leviatan, D.; Temlyakov, V.N. Simultaneous greedy approximation in Banach spaces. J. Complex. 2005, 21, 275–293. [Google Scholar] [CrossRef]
  23. Lutoborski, A.; Temlyakov, V.N. Vector greedy algorithms. J. Complex. 2003, 19, 458–473. [Google Scholar] [CrossRef]
  24. Mallat, S.; Zhang, Z. Matching pursuit with time-frequency dictionaries. IEEE Trans. Signal Pross. 1993, 41, 3397–3415. [Google Scholar] [CrossRef]
  25. Konyagin, S.V.; Temlyakov, V.N. Rate of convergence of pure greedy algorithm. East. J. Approx. 1996, 5, 493–499. [Google Scholar]
  26. Sil'nichenko, A.V. Rates of convergence of greedy algorithms. Mat. Zametki 2004, 76, 628–632. [Google Scholar]
  27. Burusheva, L.; Temlyakov, V. Sparse approximation of individual functions. J. Approx. Theory 2020, 259, 105471. [Google Scholar] [CrossRef]
  28. Livshitz, D.; Temlyakov, V.N. Two lower estimates in greedy approximation. Constr. Approx. 2003, 19, 509–524. [Google Scholar] [CrossRef]
  29. Temlyakov, V.N. Greedy Approximation; Cambridge University Press: Cambridge, UK, 2011. [Google Scholar]
  30. Temlyakov, V.N. Weak greedy algorithms. Adv. Comput. Math. 2000, 12, 213–227. [Google Scholar] [CrossRef]
  31. Temlyakov, V.N. Greedy algorithms in Banach spaces. Adv. Comput. Math. 2001, 14, 277–292. [Google Scholar] [CrossRef]
  32. Jiang, B.; Ye, P.X. Efficiency of the weak rescaled pure greedy algorithm. Int. J. Wavelets Multiresolut. Inform. Process. 2021, 4, 2150001. [Google Scholar] [CrossRef]
  33. Donahue, M.; Gurvits, L.; Darken, C.; Sontag, E. Rate of convex approximation in non-Hilbert spaces. Constr. Approx. 1997, 13, 187–220. [Google Scholar] [CrossRef]
  34. Lindenstrauss, J.; Tzafriri, L. Classical Banach Spaces I; Springer: Berlin/Heidelberg, Germany, 1977. [Google Scholar]
  35. Temlyakov, V.N.; Yang, M.R.; Ye, P.X. Greedy approximation with regard to non-greedy bases. Adv. Comput. Math. 2011, 34, 319–337. [Google Scholar] [CrossRef]
  36. Ye, P.X.; Wei, X.J. Efficiency of weak greedy algorithms for m-term approximations. Sci. China Math. 2016, 59, 697–714. [Google Scholar] [CrossRef]
  37. Chen, H.; Zhou, Y.C.; Tang, Y.Y.; Li, L.Q.; Pan, Z.B. Convergence rate of the semi-supervised greedy algorithm. Neural Netw. 2013, 44, 44–50. [Google Scholar] [CrossRef]
  38. Lin, S.B.; Rong, Y.H.; Sun, X.P.; Xu, Z.B. Learning capability of the relaxed greedy algorithms. IEEE Trans. Neural Netw. Learn. Syst. 2013, 24, 1598–1608. [Google Scholar]