Article

Approximation Properties of the Vector Weak Rescaled Pure Greedy Algorithm

1 School of Science, China University of Geosciences, Beijing 100083, China
2 School of Mathematics and LPMC, Nankai University, Tianjin 300071, China
* Author to whom correspondence should be addressed.
Mathematics 2023, 11(9), 2020; https://doi.org/10.3390/math11092020
Submission received: 22 March 2023 / Revised: 14 April 2023 / Accepted: 19 April 2023 / Published: 24 April 2023
(This article belongs to the Special Issue Advances in Approximation Theory and Numerical Functional Analysis)

Abstract

We first study the error performance of the Vector Weak Rescaled Pure Greedy Algorithm for simultaneous approximation with respect to a dictionary $\mathcal{D}$ in a Hilbert space. We show that the convergence rate of the Vector Weak Rescaled Pure Greedy Algorithm on $\mathcal{A}_1(\mathcal{D})$, the closure of the convex hull of the dictionary $\mathcal{D}$, is optimal. The Vector Weak Rescaled Pure Greedy Algorithm has several advantages: it has a weaker convergence condition and a better convergence rate than the Vector Weak Pure Greedy Algorithm, and it is simpler than the Vector Weak Orthogonal Greedy Algorithm. Then, we design a Vector Weak Rescaled Pure Greedy Algorithm in a uniformly smooth Banach space setting. We obtain the convergence properties and error bound of the Vector Weak Rescaled Pure Greedy Algorithm in this case. The results show that the convergence rate of the VWRPGA on $\mathcal{A}_1(\mathcal{D})$ is sharp. Likewise, the Vector Weak Rescaled Pure Greedy Algorithm is simpler than the Vector Weak Chebyshev Greedy Algorithm and the Vector Weak Relaxed Greedy Algorithm.

1. Introduction

Approximation by sparse linear combinations of elements from a fixed redundant family is widely used because it yields concise representations and improves computational efficiency. It has been applied to signal processing, image compression, machine learning and the numerical solution of PDEs (see [1,2,3,4,5,6,7,8,9,10]). Among others, simultaneous sparse approximation has been utilized in signal vector processing and multi-task learning (see [11,12,13,14]). It is well known that greedy-type algorithms are powerful tools for generating such sparse approximations (see [15,16,17,18,19]). In particular, vector greedy algorithms are very efficient at approximating a given finite number of target elements simultaneously (see [20,21,22,23]). In this article, we propose a new vector greedy algorithm—the Vector Weak Rescaled Pure Greedy Algorithm (VWRPGA)—for simultaneous approximation. We estimate the error of the VWRPGA and show that its convergence rate on the closure of the convex hull of the dictionary is optimal.
Let $X$ be a real Banach space with norm $\|\cdot\|$. We say that a set of elements $\mathcal{D} \subset X$ is a dictionary if $\|\varphi\| = 1$ for each $\varphi \in \mathcal{D}$ and $\overline{\mathrm{span}}(\mathcal{D}) = X$. We assume that every dictionary $\mathcal{D}$ is symmetric, i.e.,
$$\varphi \in \mathcal{D} \quad \text{implies} \quad -\varphi \in \mathcal{D}.$$
If $f_m$ is the output of a greedy algorithm after $m$ iterations, then the efficiency of the approximation can be measured by the decay of the error $\|f - f_m\|$ as $m \to \infty$. We are mainly concerned with this error: does it tend to zero as $m \to \infty$, and if so, at what rate? To answer these questions, we need the following classes of elements.
For a general dictionary $\mathcal{D}$, we define the class of elements
$$\mathcal{A}_1^o(\mathcal{D}, M) := \Big\{ f : f = \sum_{k \in \Lambda} c_k(f)\varphi_k,\ \varphi_k \in \mathcal{D},\ |\Lambda| < \infty,\ \sum_{k \in \Lambda} |c_k(f)| \le M \Big\}$$
and $\mathcal{A}_1(\mathcal{D}, M)$ as the closure of $\mathcal{A}_1^o(\mathcal{D}, M)$. Let $\mathcal{A}_1(\mathcal{D})$ be the union of the classes $\mathcal{A}_1(\mathcal{D}, M)$ over all $M > 0$, and denote $\mathcal{A}_1(\mathcal{D}) := \mathcal{A}_1(\mathcal{D}, 1)$. For $f \in \mathcal{A}_1(\mathcal{D})$, we define its norm as
$$\|f\|_{\mathcal{A}_1(\mathcal{D})} := \inf\{ M : f \in \mathcal{A}_1(\mathcal{D}, M) \}.$$
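As a standard illustration (our own example, not taken from the paper): if $\mathcal{D} = \{\pm e_k\}_{k \ge 1}$ for an orthonormal basis $\{e_k\}$ of a Hilbert space, then
$$\mathcal{A}_1^o(\mathcal{D}, M) = \Big\{ f = \sum_{k \in \Lambda} c_k e_k : |\Lambda| < \infty,\ \sum_{k \in \Lambda} |c_k| \le M \Big\}, \qquad \|f\|_{\mathcal{A}_1(\mathcal{D})} = \sum_{k} |c_k|,$$
so $\mathcal{A}_1(\mathcal{D})$ is the unit ball of the space of elements with absolutely summable coefficients.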
We recall some related results in the Hilbert space setting, since such spaces are distinguished by their geometric properties and their role in applications. Let $H$ be a real Hilbert space with inner product $\langle\cdot,\cdot\rangle$ and norm $\|x\| := \langle x, x\rangle^{1/2}$.
The most natural greedy algorithm in a Hilbert space is the Pure Greedy Algorithm (PGA). This algorithm is also known as the Matching Pursuit in signal processing [24]. We recall its definition from [15].
  • PGA(H, $\mathcal{D}$):
Step 0: Define $f_0 = 0$.
Step m:
- If $f = f_{m-1}$, stop the algorithm and define $f_k = f_{m-1} = f$ for $k \ge m$.
- If $f \ne f_{m-1}$, choose an element $\varphi_m \in \mathcal{D}$ such that
$$|\langle f - f_{m-1}, \varphi_m \rangle| = \sup_{\varphi \in \mathcal{D}} |\langle f - f_{m-1}, \varphi \rangle|.$$
Define the next approximant to be
$$f_m = f_{m-1} + \langle f - f_{m-1}, \varphi_m \rangle \varphi_m,$$
and proceed to Step $m+1$.
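For readers who wish to experiment numerically, the following Python sketch (our own illustration, not part of [15] or [24]) implements the PGA for a finite dictionary stored as the unit-norm columns of a matrix; the names `pga`, `D` and `iters` are ours.

```python
import numpy as np

def pga(f, D, iters):
    """Pure Greedy Algorithm over a finite dictionary.

    f     : target vector in R^n.
    D     : n x K array whose unit-norm columns play the role of the dictionary.
    iters : number of greedy steps.
    """
    fm = np.zeros_like(f, dtype=float)
    for _ in range(iters):
        r = f - fm                        # residual f - f_{m-1}
        inner = D.T @ r                   # <f - f_{m-1}, phi> for every column phi
        j = int(np.argmax(np.abs(inner)))
        if inner[j] == 0.0:               # f = f_{m-1}: nothing left to pick
            break
        fm = fm + inner[j] * D[:, j]      # f_m = f_{m-1} + <f - f_{m-1}, phi_m> phi_m
    return fm
```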
The first upper bound on the rate of convergence of the PGA for $f \in \mathcal{A}_1(\mathcal{D})$ was obtained in [15] as follows:
$$\|f - f_m\| \le \|f\|_{\mathcal{A}_1(\mathcal{D})}\, m^{-\frac{1}{6}}, \quad m = 1, 2, \ldots.$$
Later, the above estimate for the PGA was improved in [25,26] to $O(m^{-\frac{11}{62}})$ and $O\big(m^{-\frac{s}{2(s+2)}}\big)$, where $s$ is the root of the equation
( 1 + x ) 1 2 + x 1 + 1 1 + x 1 1 x = 0
on the closed interval $[1, 1.5]$. It is known that $\frac{s}{2(s+2)} > \frac{11}{62}$.
Note that when $\mathcal{D}$ is an orthonormal basis of $H$, it is not difficult to prove that for any $f \in \mathcal{A}_1(\mathcal{D})$, there holds
$$\|f - f_m\| \le c\, \|f\|_{\mathcal{A}_1(\mathcal{D})}\, m^{-\frac{1}{2}}, \quad m = 1, 2, \ldots.$$
In addition, there exists an element $f^* \in \mathcal{A}_1(\mathcal{D})$ (see [27]) such that
$$\|f^* - f_m^*\| = c \cdot m^{-\frac{1}{2}}, \quad m = 1, 2, \ldots.$$
Thus, inequality (1) cannot be improved for orthonormal bases. A natural question arises: does inequality (1) hold for any dictionary $\mathcal{D} \subset H$? Unfortunately, the answer is negative.
In fact, Livshitz and Temlyakov [28] proved that there exist a dictionary $\mathcal{D} \subset H$, a positive constant $C$ and an element $f \in \mathcal{A}_1(\mathcal{D})$ such that
$$\|f - f_m\| \ge C\, m^{-0.27}, \quad m = 1, 2, \ldots.$$
This lower bound on the convergence rate of the PGA shows that the algorithm does not attain the rate $O(m^{-\frac{1}{2}})$ for all $\mathcal{D}$.
In [15], the idea of best approximation was introduced into the greedy algorithm, which led to the Orthogonal Greedy Algorithm (OGA). To construct an approximation, the OGA takes the orthogonal projection of $f$ onto the subspace generated by the chosen elements $\varphi_1, \ldots, \varphi_m$. We recall its definition from [15].
  • OGA(H, $\mathcal{D}$):
Step 0: Define $f_0 = 0$.
Step m:
- If $f = f_{m-1}$, stop the algorithm and define $f_k = f_{m-1} = f$ for $k \ge m$.
- If $f \ne f_{m-1}$, choose an element $\varphi_m \in \mathcal{D}$ such that
$$|\langle f - f_{m-1}, \varphi_m \rangle| = \sup_{\varphi \in \mathcal{D}} |\langle f - f_{m-1}, \varphi \rangle|.$$
Define the next approximant to be
$$f_m = P_m(f),$$
and proceed to Step $m+1$, where $P_m$ is the orthogonal projection onto $V_m := \mathrm{span}\{\varphi_1, \varphi_2, \ldots, \varphi_m\}$.
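The following minimal Python sketch (our own, with hypothetical names) realizes the OGA for a finite dictionary; the projection $P_m$ is computed by a least-squares solve on the selected columns.

```python
import numpy as np

def oga(f, D, iters):
    """Orthogonal Greedy Algorithm over a finite dictionary (unit-norm columns of D)."""
    fm = np.zeros_like(f, dtype=float)
    chosen = []                                    # indices of phi_1, ..., phi_m
    for _ in range(iters):
        r = f - fm
        inner = D.T @ r
        j = int(np.argmax(np.abs(inner)))
        if inner[j] == 0.0:
            break
        chosen.append(j)
        Vm = D[:, chosen]                          # columns spanning V_m
        coef, *_ = np.linalg.lstsq(Vm, f, rcond=None)
        fm = Vm @ coef                             # f_m = P_m(f): orthogonal projection onto V_m
    return fm
```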
In [15], it is shown that for any $\mathcal{D}$, the output of the OGA(H, $\mathcal{D}$) satisfies
$$\|f - f_m\| \le c\, \|f\|_{\mathcal{A}_1(\mathcal{D})}\, m^{-\frac{1}{2}}, \quad m = 1, 2, \ldots.$$
Note that when $\mathcal{D}$ is an orthonormal basis of $H$, the OGA(H, $\mathcal{D}$) coincides with the PGA(H, $\mathcal{D}$). Thus, the rate $O(m^{-\frac{1}{2}})$ is sharp.
The Relaxed Greedy Algorithm (RGA) is also a modification of PGA. We recall its definition from [15].
  • RGA(H, $\mathcal{D}$):
Step 0: Define $f_0 = 0$.
Step m:
- If $f = f_{m-1}$, stop the algorithm and define $f_k = f_{m-1} = f$ for $k \ge m$.
- If $f \ne f_{m-1}$, choose an element $\varphi_m \in \mathcal{D}$ such that
$$\langle f - f_{m-1}, \varphi_m \rangle = \sup_{\varphi \in \mathcal{D}} |\langle f - f_{m-1}, \varphi \rangle|.$$
For $m = 1$, define
$$f_1 = \langle f, \varphi_1 \rangle \varphi_1.$$
For $m \ge 2$, define the next approximant to be
$$f_m = \Big(1 - \frac{1}{m}\Big) f_{m-1} + \frac{1}{m} \varphi_m,$$
and proceed to Step $m+1$.
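A short Python sketch of the RGA in the same finite-dictionary setting (our own illustrative code, with hypothetical names); the convex relaxation update is what distinguishes it from the PGA.

```python
import numpy as np

def rga(f, D, iters):
    """Relaxed Greedy Algorithm over a finite dictionary (unit-norm columns of D)."""
    fm = np.zeros_like(f, dtype=float)
    for m in range(1, iters + 1):
        r = f - fm
        inner = D.T @ r
        j = int(np.argmax(np.abs(inner)))
        phi = np.sign(inner[j]) * D[:, j]    # dictionary symmetry: flip the sign if needed
        if m == 1:
            fm = (f @ phi) * phi             # f_1 = <f, phi_1> phi_1
        else:
            fm = (1.0 - 1.0 / m) * fm + (1.0 / m) * phi   # relaxation update
    return fm
```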
It is shown in [15] that the RGA also achieves the rate $O(m^{-\frac{1}{2}})$ on $\mathcal{A}_1(\mathcal{D})$.
The Rescaled Pure Greedy Algorithm (RPGA) [17] is another modification of the PGA: at each iteration it rescales the PGA-type update, replacing it with $f_m = s_m \hat{f}_m$. It is defined as follows.
  • RPGA(H, $\mathcal{D}$):
Step 0: Define $f_0 = 0$.
Step m:
- If $f = f_{m-1}$, stop the algorithm and define $f_k = f_{m-1} = f$ for $k \ge m$.
- If $f \ne f_{m-1}$, choose an element $\varphi_m \in \mathcal{D}$ such that
$$|\langle f - f_{m-1}, \varphi_m \rangle| = \sup_{\varphi \in \mathcal{D}} |\langle f - f_{m-1}, \varphi \rangle|.$$
With
$$\lambda_m = \langle f - f_{m-1}, \varphi_m \rangle, \quad \hat{f}_m := f_{m-1} + \lambda_m \varphi_m, \quad s_m = \frac{\langle f, \hat{f}_m \rangle}{\|\hat{f}_m\|^2},$$
define the next approximant to be
$$f_m = s_m \hat{f}_m,$$
and proceed to Step $m+1$.
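The rescaling step is the only difference from the PGA, as the following Python sketch (ours, with hypothetical names) makes explicit.

```python
import numpy as np

def rpga(f, D, iters):
    """Rescaled Pure Greedy Algorithm over a finite dictionary (unit-norm columns of D)."""
    fm = np.zeros_like(f, dtype=float)
    for _ in range(iters):
        r = f - fm
        inner = D.T @ r
        j = int(np.argmax(np.abs(inner)))
        if inner[j] == 0.0:
            break
        lam = inner[j]                         # lambda_m = <f - f_{m-1}, phi_m>
        f_hat = fm + lam * D[:, j]             # intermediate PGA-type update
        s = (f @ f_hat) / (f_hat @ f_hat)      # s_m = <f, f_hat_m> / ||f_hat_m||^2
        fm = s * f_hat                         # rescaled approximant
    return fm
```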
In [17], the convergence rate of the RPGA was obtained as follows:
$$\|f - f_m\| \le \|f\|_{\mathcal{A}_1(\mathcal{D})}\, (m+1)^{-\frac{1}{2}}, \quad m = 0, 1, 2, \ldots.$$
It is worth noting that the supremum of the inner product might not be attained. To remedy this problem, the original condition on the selection of $\varphi_m$ is replaced by
$$|\langle f - f_{m-1}, \varphi_m \rangle| \ge t_m \sup_{\varphi \in \mathcal{D}} |\langle f - f_{m-1}, \varphi \rangle|,$$
where $0 < t_m \le 1$. This is often referred to as the "weak" condition. The study of the weak versions of the above algorithms can be found in [15,25,29,30].
Meanwhile, building simultaneous approximations for a given vector of elements gives rise to the so-called vector greedy algorithms. Instead of running the algorithm separately for each element of a finite collection $f^1, \ldots, f^N$, a vector greedy algorithm obtains a simultaneous approximation of all the elements in a single run. Hence, the computational cost and the storage of information can be reduced greatly. The question is then how well this type of algorithm can perform; namely, we need to measure its efficiency via its error bound. The Vector Weak Pure Greedy Algorithm (VWPGA, also referred to as the Vector Weak Greedy Algorithm (VWGA)) and the Vector Weak Orthogonal Greedy Algorithm (VWOGA) have been introduced and studied in [21,22,23].
We recall the definitions of the VWPGA and VWOGA from [23] as follows. Let $\tau = \{t_m\}_{m=1}^{\infty}$, $0 < t_m \le 1$, be a given sequence.
  • VWPGA(H, $\mathcal{D}$): $f^i \in H$, $i = 1, \ldots, N$, are the target elements.
Step 0: Define $f_0^i := 0$, $i = 1, \ldots, N$.
Step m:
- If $f^i = f_{m-1}^i$ for all $i = 1, \ldots, N$, stop the algorithm and define $f_k^i = f_{m-1}^i = f^i$ for $k \ge m$.
- Otherwise, choose an element $\varphi_m \in \mathcal{D}$ such that
$$\max_i |\langle f^i - f_{m-1}^i, \varphi_m \rangle| \ge t_m \max_i \sup_{\varphi \in \mathcal{D}} |\langle f^i - f_{m-1}^i, \varphi \rangle|.$$
Define the next approximants to be
$$f_m^i := f_{m-1}^i + \langle f^i - f_{m-1}^i, \varphi_m \rangle \varphi_m, \quad i = 1, 2, \ldots, N,$$
and proceed to Step $m+1$.
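Assuming again a finite dictionary stored as the unit-norm columns of a matrix, the following Python sketch (with hypothetical names `vwpga`, `F`, `D`) shows how a single atom, selected from the worst-case inner products, updates all residuals simultaneously; exact maximization is used, which corresponds to $t_m = 1$.

```python
import numpy as np

def vwpga(F, D, iters):
    """Vector Weak PGA sketch with exact maximization (t_m = 1).

    F : n x N array whose columns are the targets f^1, ..., f^N.
    D : n x K array with unit-norm columns.
    """
    Fm = np.zeros_like(F, dtype=float)
    for _ in range(iters):
        R = F - Fm                                  # residuals f^i - f^i_{m-1}
        inner = D.T @ R                             # K x N matrix of <f^i - f^i_{m-1}, phi>
        j = int(np.argmax(np.max(np.abs(inner), axis=1)))   # phi_m realizing max_i sup_phi |.|
        Fm = Fm + np.outer(D[:, j], inner[j])       # update every target with the same phi_m
    return Fm
```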
  • VWOGA(H, $\mathcal{D}$): $f^i \in H$, $i = 1, \ldots, N$, are the target elements.
Step 0: Define $f_0^i := 0$, $i = 1, \ldots, N$.
Step m:
- If $f^i = f_{m-1}^i$ for all $i = 1, \ldots, N$, stop the algorithm and define $f_k^i = f_{m-1}^i = f^i$ for $k \ge m$.
- Otherwise, let $i_m$ be such that
$$\|f^{i_m} - f_{m-1}^{i_m}\| \ge \|f^i - f_{m-1}^i\|, \quad i = 1, \ldots, N.$$
Choose an element $\varphi_m \in \mathcal{D}$ such that
$$|\langle f^{i_m} - f_{m-1}^{i_m}, \varphi_m \rangle| \ge t_m \sup_{\varphi \in \mathcal{D}} |\langle f^{i_m} - f_{m-1}^{i_m}, \varphi \rangle|.$$
Define the next approximants to be
$$f_m^i = P_m(f^i), \quad i = 1, 2, \ldots, N,$$
and proceed to Step $m+1$, where $P_m$ is the orthogonal projection onto $V_m := \mathrm{span}\{\varphi_1, \varphi_2, \ldots, \varphi_m\}$.
We list the results on the convergence rate of the VWPGA and VWOGA in [23] as follows.
Theorem 1.
Let $\tau := \{t_k\}_{k=1}^{\infty}$, $t_k = t$, $0 < t \le 1$, be a given real sequence. Then, for any $f^1, \ldots, f^N$, $f^i \in \mathcal{A}_1(\mathcal{D})$, the output $\{f_m^i\}_{m \ge 0}$ of the VWPGA satisfies
$$\sum_{i=1}^{N} \|f^i - f_m^i\|^2 \le \Big(1 + \frac{m t^2}{N}\Big)^{-\frac{t}{2N+t}} N^{\frac{2N+2t}{2N+t}}.$$
Theorem 2.
Let $\tau := \{t_k\}_{k=1}^{\infty}$, $t_k = t$, $0 < t \le 1$, be a given real sequence. Then, for any $f^i \in \mathcal{A}_1(\mathcal{D})$, the output $\{f_m^i\}_{m \ge 0}$ of the VWOGA satisfies
$$\|f^i - f_m^i\| \le \min\Big\{1, \Big(\frac{N}{m t^2}\Big)^{\frac{1}{2}}\Big\}, \quad i = 1, \ldots, N.$$
Improvements to the above estimates are made in [19,21,22]. The results indicate that the VWOGA achieves a better convergence rate on A 1 ( D ) than that of the VWPGA.
In [23], the authors gave a sufficient condition of convergence for the VWPGA.
Theorem 3.
Assume that $\sum_{m=1}^{\infty} \frac{t_m}{m} = \infty$. Then, for any dictionary and any finite collection of elements $f^i \in H$, $i = 1, \ldots, N$, the VWPGA satisfies
$$\lim_{m \to \infty} \|f^i - f_m^i\| = 0.$$
Motivated by these studies, we design the Vector Weak Rescaled Pure Greedy Algorithm (VWRPGA) and study its efficiency. The remainder of the paper is organized as follows. In Section 2, we deal with the case of Hilbert spaces. In Section 3, we deal with the case of Banach spaces. In Section 4, we draw the conclusions. Below, we provide more details.
In Section 2, we define the VWRPGA in Hilbert spaces and study its approximation properties. We first prove that
$$\sum_{m=1}^{\infty} t_m^2 = \infty$$
is a sufficient condition for the convergence of the VWRPGA for any $f^i \in H$, $i = 1, \ldots, N$, and any $\mathcal{D} \subset H$. This convergence condition is weaker than that of the VWPGA. Then, we prove that the error bound of the VWRPGA on $\mathcal{A}_1(\mathcal{D})$ satisfies
$$\|f^i - f_m^{i,v,\tau,r}\| \le \min\Big\{1, \Big(\frac{1}{N}\sum_{k=1}^{m} t_k^2\Big)^{-\frac{1}{2}}\Big\}.$$
When $t_1 = t_2 = \cdots = t_m = 1$, we show that the convergence rate of the VWRPGA on $\mathcal{A}_1(\mathcal{D})$ is $O(m^{-\frac{1}{2}})$, which is sharp. This convergence rate is better than that of the VWPGA; the advantage is more pronounced when $N$ is large. The VWRPGA is also more efficient than the VWOGA from the viewpoint of computational complexity: for $N$ target elements, the VWRPGA only needs to solve $N$ one-dimensional optimization problems, while the VWOGA involves $N$ $m$-dimensional optimization problems.
In Section 3, we define the VWRPGA for uniformly smooth Banach spaces. We obtain a sufficient condition for the convergence of the VWRPGA in this case; to our knowledge, this is the first convergence analysis of vector greedy algorithms in the Banach space setting. Then, we derive the error bound of the VWRPGA. The results show that the convergence rate of the VWRPGA on $\mathcal{A}_1(\mathcal{D})$ is sharp. We compare the approximation properties of the VWRPGA with those of the Vector Weak Chebyshev Greedy Algorithm (VWCGA) and the Vector Weak Relaxed Greedy Algorithm (VWRGA). We show that the VWRPGA has better convergence properties than the VWRGA, and its computational complexity is essentially smaller than those of the VWCGA and VWRGA.
In Section 4, we draw the conclusions of our study. Our results show that the VWRPGA is the simplest vector greedy algorithm for simultaneous approximation with the best convergence property and the optimal convergence rate. We also discuss the possible applications of the VWRPGA in multi-task learning and signal vector processing.

2. The VWRPGA for Hilbert Spaces

In this section, we define the VWRPGA in Hilbert spaces and obtain a sufficient condition for its convergence together with an estimate of its error bound. Based on these results, we compare the VWRPGA with the VWPGA and the VWOGA.
Firstly, we recall the definition of the Weak Rescaled Pure Greedy Algorithm (WRPGA) in Hilbert spaces from [17]. Let $\tau = \{t_m\}_{m=1}^{\infty}$, $0 < t_m \le 1$, be a given sequence. The WRPGA consists of the following steps:
  • WRPGA(H, $\mathcal{D}$):
Step 0: Define $f_0 = 0$.
Step m:
- If $f = f_{m-1}$, stop the algorithm and define $f_k = f_{m-1} = f$ for $k \ge m$.
- If $f \ne f_{m-1}$, choose an element $\varphi_m \in \mathcal{D}$ such that
$$|\langle f - f_{m-1}, \varphi_m \rangle| \ge t_m \sup_{\varphi \in \mathcal{D}} |\langle f - f_{m-1}, \varphi \rangle|.$$
With
$$\lambda_m = \langle f - f_{m-1}, \varphi_m \rangle, \quad \hat{f}_m := f_{m-1} + \lambda_m \varphi_m, \quad s_m = \frac{\langle f, \hat{f}_m \rangle}{\|\hat{f}_m\|^2},$$
define the next approximant to be
$$f_m = s_m \hat{f}_m,$$
and proceed to Step $m+1$.
The error bound of the WRPGA has been obtained as follows.
Theorem 4
(see Theorem 4.1 in [17]). If $f \in \mathcal{A}_1(\mathcal{D}) \subset H$, then the output $\{f_m\}_{m \ge 0}$ of the WRPGA satisfies the error estimate
$$\|f - f_m\| \le \|f\|_{\mathcal{A}_1(\mathcal{D})} \Big(1 + \sum_{k=1}^{m} t_k^2\Big)^{-\frac{1}{2}}.$$
Based on the WRPGA, we can define the VWRPGA. Let $\tau = \{t_m\}_{m=1}^{\infty}$, $0 < t_m \le 1$, be a given sequence. The VWRPGA consists of the following steps:
  • VWRPGA(H, $\mathcal{D}$): Given $f^i \in H$, $i = 1, \ldots, N$.
Step 0: Define $f_0^{i,v,\tau,r} := 0$, $i = 1, \ldots, N$.
Step m:
- If $f^i = f_{m-1}^{i,v,\tau,r}$ for all $i = 1, \ldots, N$, stop the algorithm and define $f_k^{i,v,\tau,r} = f_{m-1}^{i,v,\tau,r} = f^i$ for $k \ge m$.
- Otherwise, let $i_m$ be such that
$$\|f^{i_m} - f_{m-1}^{i_m,v,\tau,r}\| = \max_{1 \le i \le N} \|f^i - f_{m-1}^{i,v,\tau,r}\|.$$
Choose an element $\varphi_m \in \mathcal{D}$ such that
$$\langle f^{i_m} - f_{m-1}^{i_m,v,\tau,r}, \varphi_m \rangle \ge t_m \sup_{\varphi \in \mathcal{D}} \langle f^{i_m} - f_{m-1}^{i_m,v,\tau,r}, \varphi \rangle.$$
With
$$\lambda_m^i = \langle f^i - f_{m-1}^{i,v,\tau,r}, \varphi_m \rangle, \quad \hat{f}_m^i := f_{m-1}^{i,v,\tau,r} + \lambda_m^i \varphi_m, \quad s_m^i = \frac{\langle f^i, \hat{f}_m^i \rangle}{\|\hat{f}_m^i\|^2},$$
define the next approximants to be
$$f_m^{i,v,\tau,r} := s_m^i \hat{f}_m^i, \quad i = 1, \ldots, N,$$
and proceed to Step $m+1$.
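To illustrate the definition, here is a minimal Python sketch of the VWRPGA for a finite dictionary in $\mathbb{R}^n$, with exact maximization ($t_m = 1$) and with names of our own choosing; it is an illustrative sketch under these assumptions, not the authors' implementation.

```python
import numpy as np

def vwrpga_hilbert(F, D, iters):
    """VWRPGA sketch in a finite-dimensional Hilbert space, with t_m = 1.

    F : n x N array whose columns are the targets f^1, ..., f^N.
    D : n x K array with unit-norm columns (the dictionary).
    """
    Fm = np.zeros_like(F, dtype=float)
    for _ in range(iters):
        R = F - Fm                                         # residuals r^i = f^i - f^i_{m-1}
        i_m = int(np.argmax(np.linalg.norm(R, axis=0)))    # target with the largest residual
        inner = D.T @ R[:, i_m]
        j = int(np.argmax(np.abs(inner)))                  # phi_m chosen from the worst residual only
        if inner[j] == 0.0:
            break
        phi = D[:, j]
        for i in range(F.shape[1]):                        # one-dimensional rescaled update per target
            lam = (F[:, i] - Fm[:, i]) @ phi               # lambda_m^i = <r^i, phi_m>
            f_hat = Fm[:, i] + lam * phi
            denom = f_hat @ f_hat
            if denom > 0.0:
                s = (F[:, i] @ f_hat) / denom              # s_m^i = <f^i, f_hat^i> / ||f_hat^i||^2
                Fm[:, i] = s * f_hat
    return Fm
```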
In this section, we establish two results on the approximation properties of the VWRPGA(H, $\mathcal{D}$). We first give a sufficient condition for the convergence of the VWRPGA for any dictionary $\mathcal{D}$ and any $f^i$, $i = 1, \ldots, N$.
Theorem 5.
Assume $\sum_{m=1}^{\infty} t_m^2 = \infty$. Then, the VWRPGA converges for any dictionary $\mathcal{D}$ and any $f^i \in H$, $i = 1, \ldots, N$.
In the proof of Theorem 5, we will reduce the approximation of a general element to that of an element from $\mathcal{A}_1(\mathcal{D})$. To this end, we recall from [31] the following lemmas on the approximation properties of $\mathcal{A}_1(\mathcal{D})$.
Lemma 1.
Let $X$ be a Banach space and $\mathcal{D} \subset X$ be a dictionary. Then, for any $\epsilon > 0$ and any $f \in X$, there exists $f^{\epsilon} \in X$ such that
$$\|f - f^{\epsilon}\| < \epsilon$$
and
$$f^{\epsilon} / A(\epsilon) \in \mathcal{A}_1(\mathcal{D}),$$
with some number $A(\epsilon) > 0$.
Lemma 2.
For any $f \in H$ and any dictionary $\mathcal{D}$, we have
$$\sup_{\varphi \in \mathcal{D}} \langle f, \varphi \rangle = \sup_{g \in \mathcal{A}_1(\mathcal{D})} \langle f, g \rangle.$$
Proof of Theorem 5.
Note that $f_m^{i,v,\tau,r}$ is the orthogonal projection of $f^i$ onto the one-dimensional space $\mathrm{span}\{\hat{f}_m^i\}$. Thus, it is the best approximation to $f^i$ from $\mathrm{span}\{\hat{f}_m^i\}$.
Let $r_m^i := f^i - f_m^{i,v,\tau,r}$, $i = 1, \ldots, N$, be the residual of $f_m^{i,v,\tau,r}$. By the definition of $\hat{f}_m^i$ and the choice of $\lambda_m^i$, we have
$$\|r_m^i\|^2 = \|f^i - f_m^{i,v,\tau,r}\|^2 = \|f^i - s_m^i \hat{f}_m^i\|^2 \le \|f^i - \hat{f}_m^i\|^2 = \langle f^i - f_{m-1}^{i,v,\tau,r} - \lambda_m^i \varphi_m,\ f^i - f_{m-1}^{i,v,\tau,r} - \lambda_m^i \varphi_m \rangle = \|f^i - f_{m-1}^{i,v,\tau,r}\|^2 - 2\lambda_m^i \langle f^i - f_{m-1}^{i,v,\tau,r}, \varphi_m \rangle + (\lambda_m^i)^2 = \|r_{m-1}^i\|^2 - \langle f^i - f_{m-1}^{i,v,\tau,r}, \varphi_m \rangle^2.$$
The latter inequality implies that $\{\|r_m^i\|\}_{m=0}^{\infty}$ is a decreasing sequence. By the Monotone Convergence Theorem, $\lim_{m \to \infty} \|r_m^i\|$ exists for each $i = 1, \ldots, N$.
We prove that $\lim_{m \to \infty} \|r_m^i\| = 0$ by contradiction. Assume $\lim_{m \to \infty} \|r_m^i\| \ge a > 0$, $i = 1, \ldots, N$. Then, for any $m$, we have $\|r_m^i\| \ge a$. By (2), we obtain that
$$\sum_{i=1}^{N} \|r_m^i\|^2 \le \sum_{i=1}^{N} \|r_{m-1}^i\|^2 - \sum_{i=1}^{N} \langle f^i - f_{m-1}^{i,v,\tau,r}, \varphi_m \rangle^2 = \sum_{i=1}^{N} \|r_{m-1}^i\|^2 \bigg(1 - \frac{\sum_{i=1}^{N} \langle f^i - f_{m-1}^{i,v,\tau,r}, \varphi_m \rangle^2}{\sum_{i=1}^{N} \|r_{m-1}^i\|^2}\bigg) \le \sum_{i=1}^{N} \|r_{m-1}^i\|^2 \bigg(1 - \frac{\langle f^{i_m} - f_{m-1}^{i_m,v,\tau,r}, \varphi_m \rangle^2}{N \|r_{m-1}^{i_m}\|^2}\bigg) \le \sum_{i=1}^{N} \|f^i\|^2 \prod_{j=1}^{m} \bigg(1 - \frac{\langle f^{i_j} - f_{j-1}^{i_j,v,\tau,r}, \varphi_j \rangle^2}{N \|r_{j-1}^{i_j}\|^2}\bigg).$$
Denote
$$x_j = \frac{\langle f^{i_j} - f_{j-1}^{i_j,v,\tau,r}, \varphi_j \rangle^2}{N \|r_{j-1}^{i_j}\|^2}.$$
By the inequality $1 - x \le \frac{1}{1+x}$, $0 \le x \le 1$, we obtain that
$$\sum_{i=1}^{N} \|r_m^i\|^2 \le \sum_{i=1}^{N} \|f^i\|^2 \prod_{j=1}^{m} \frac{1}{1+x_j} \le \sum_{i=1}^{N} \|f^i\|^2 \, \frac{1}{1 + \sum_{j=1}^{m} x_j}.$$
Next, we obtain a lower estimate for $x_j$, $j = 1, \ldots, m$.
Set $\epsilon = \frac{a}{2}$. In view of Lemma 1, we can find $f_j^{\epsilon}$ such that
$$\|f^{i_j} - f_j^{\epsilon}\| < \epsilon$$
and
$$f_j^{\epsilon} / A(\epsilon) \in \mathcal{A}_1(\mathcal{D}),$$
with some number $A(\epsilon) > 0$.
Using Lemma 2, we have
$$\langle f^{i_j} - f_{j-1}^{i_j,v,\tau,r}, \varphi_j \rangle \ge t_j \sup_{\varphi \in \mathcal{D}} \langle f^{i_j} - f_{j-1}^{i_j,v,\tau,r}, \varphi \rangle = t_j \sup_{g \in \mathcal{A}_1(\mathcal{D})} \langle f^{i_j} - f_{j-1}^{i_j,v,\tau,r}, g \rangle \ge t_j A(\epsilon)^{-1} \langle f^{i_j} - f_{j-1}^{i_j,v,\tau,r}, f_j^{\epsilon} \rangle.$$
Since $f_m^{i,v,\tau,r}$ is the orthogonal projection of $f^i$ onto $\mathrm{span}\{\hat{f}_m^i\}$, we have
$$\langle f^i - f_m^{i,v,\tau,r}, f_m^{i,v,\tau,r} \rangle = 0.$$
Then,
$$\langle f^{i_j} - f_{j-1}^{i_j,v,\tau,r}, f_j^{\epsilon} \rangle = \langle r_{j-1}^{i_j}, f^{i_j} + f_j^{\epsilon} - f^{i_j} \rangle = \langle r_{j-1}^{i_j}, r_{j-1}^{i_j} + f_{j-1}^{i_j,v,\tau,r} \rangle - \langle r_{j-1}^{i_j}, f^{i_j} - f_j^{\epsilon} \rangle > \|r_{j-1}^{i_j}\|^2 - \|r_{j-1}^{i_j}\| \cdot \epsilon.$$
Combining (4) and (5) with $\epsilon = \frac{a}{2}$, we obtain
$$x_j \ge \frac{1}{N \|r_{j-1}^{i_j}\|^2} \bigg( \frac{t_j \, \langle f^{i_j} - f_{j-1}^{i_j,v,\tau,r}, f_j^{\epsilon} \rangle}{A(\epsilon)} \bigg)^2 \ge \frac{1}{N} \bigg( \frac{t_j \big( \|r_{j-1}^{i_j}\| - \epsilon \big)}{A(\epsilon)} \bigg)^2 \ge \frac{a^2}{4 N A(\epsilon)^2} \, t_j^2.$$
Combining (3) with (6), we can obtain that
$$\sum_{i=1}^{N} \|r_m^i\|^2 \le \sum_{i=1}^{N} \|f^i\|^2 \, \frac{1}{1 + \frac{a^2}{4 N A(\epsilon)^2} \sum_{j=1}^{m} t_j^2}.$$
The assumption $\sum_{m=1}^{\infty} t_m^2 = \infty$ implies that $\sum_{i=1}^{N} \|r_m^i\|^2 \to 0$ as $m \to \infty$.
Hence, $\lim_{m \to \infty} \|r_m^i\| = 0$ for $i = 1, \ldots, N$, which contradicts our assumption and proves the theorem. □
Remark 1.
It is known from Theorem 2.1 in [32] that $\sum_{m=1}^{\infty} t_m^2 = \infty$ is also a necessary condition for the convergence of the VWRPGA.
Remark 2.
According to the Cauchy–Schwarz inequality, we have
$$\sum_{m=1}^{\infty} \frac{t_m}{m} \le \Big(\sum_{m=1}^{\infty} t_m^2\Big)^{\frac{1}{2}} \Big(\sum_{m=1}^{\infty} \frac{1}{m^2}\Big)^{\frac{1}{2}}.$$
Hence,
$$\sum_{m=1}^{\infty} \frac{t_m}{m} = \infty \quad \text{implies} \quad \sum_{m=1}^{\infty} t_m^2 = \infty.$$
On the other hand, taking $t_m = m^{-\frac{1}{2}}$, $m = 1, 2, \ldots$, we notice that
$$\sum_{m=1}^{\infty} t_m^2 = \infty, \qquad \sum_{m=1}^{\infty} \frac{t_m}{m} < \infty.$$
Therefore, the convergence condition of the VWRPGA is weaker than that of the VWPGA.
The following theorem gives the error bound of the VWRPGA(H, $\mathcal{D}$) for $f^i \in \mathcal{A}_1(\mathcal{D})$, $i = 1, \ldots, N$.
Theorem 6.
Let $\tau := \{t_k\}_{k=1}^{\infty}$, $0 < t_k \le 1$, be a weakness sequence. If $f^i \in \mathcal{A}_1(\mathcal{D}) \subset H$, $i = 1, \ldots, N$, then for the VWRPGA(H, $\mathcal{D}$) we have
$$\|f^i - f_m^{i,v,\tau,r}\| \le \min\Big\{1, \Big(\frac{1}{N}\sum_{k=1}^{m} t_k^2\Big)^{-\frac{1}{2}}\Big\}.$$
Proof. 
We establish the approximation error of the VWRPGA based on the methods of [25]. The main idea of this proof is that the VWRPGA can be seen as a realization of the WRPGA with a particular weakness sequence.
Let $i \in \{1, \ldots, N\}$. From the assumption $f^i \in \mathcal{A}_1(\mathcal{D})$ and the fact that the sequence $\{\|f^i - f_m^{i,v,\tau,r}\|\}_{m=0}^{\infty}$ is decreasing, we have
$$\|f^i - f_m^{i,v,\tau,r}\| \le 1.$$
Thus, we only need to prove the estimate below:
$$\|f^i - f_m^{i,v,\tau,r}\| \le \Big( \frac{1}{N} \sum_{k=1}^{m} t_k^2 \Big)^{-\frac{1}{2}}, \quad i = 1, \ldots, N.$$
At step $k$, the VWRPGA chooses $\varphi_k$ from $\mathcal{D}$ in terms of only one of the residuals $r_{k-1}^1, \ldots, r_{k-1}^N$. Hence, by the time the VWRPGA reaches step $m$, each $f^i$, $i = 1, \ldots, N$, has been used a different number of times to choose $\varphi_k$. We now record the usage of each $f^i$.
For every $l = 1, \ldots, N$, denote $E_l := \{k \,|\, i_k = l,\ 1 \le k \le m\}$ ($i_k$ is defined in the definition of the VWRPGA). Then, we have
$$E_1 \cup E_2 \cup \cdots \cup E_N = \{1, \ldots, m\}, \qquad E_i \cap E_j = \emptyset \ \text{if}\ i \ne j.$$
Hence,
$$\|f^l - f_{k-1}^{l,v,\tau,r}\| = \|f^{i_k} - f_{k-1}^{i_k,v,\tau,r}\| = \max_{1 \le i \le N} \|f^i - f_{k-1}^{i,v,\tau,r}\|, \quad k \in E_l.$$
Using $\sum_{k=1}^{m} t_k^2 = \sum_{l=1}^{N} \sum_{k \in E_l} t_k^2$, we can find $l_0$, $1 \le l_0 \le N$, such that
$$\sum_{k \in E_{l_0}} t_k^2 \ge \frac{1}{N} \sum_{k=1}^{m} t_k^2.$$
Next, let $k_0 = \max\{k \,|\, k \in E_{l_0}\}$, $k_0 \le m$. We have
$$\max_{1 \le i \le N} \|f^i - f_m^{i,v,\tau,r}\| \le \max_{1 \le i \le N} \|f^i - f_{k_0-1}^{i,v,\tau,r}\| = \|f^{l_0} - f_{k_0-1}^{l_0,v,\tau,r}\|.$$
Now, we only consider the element $f^{l_0} \in H$. For $f^{l_0} \in H$, the approximants $f_1^{l_0,v,\tau,r}, \ldots, f_m^{l_0,v,\tau,r}$ can be obtained as an application of the WRPGA with the weakness sequence $\tau^{l_0} := \{t_k^{l_0}\}$ given by
$$t_k^{l_0} = \begin{cases} t_k, & k \in E_{l_0}, \\ 0, & \text{otherwise}. \end{cases}$$
Therefore, by Theorem 4, we obtain
$$\|f^{l_0} - f_{k_0-1}^{l_0,v,\tau,r}\| \le \Big( 1 + \sum_{k \in E_{l_0} \setminus \{k_0\}} t_k^2 \Big)^{-\frac{1}{2}} \le \Big( 1 + \sum_{k \in E_{l_0}} t_k^2 - 1 \Big)^{-\frac{1}{2}} \le \Big( \frac{1}{N} \sum_{k=1}^{m} t_k^2 \Big)^{-\frac{1}{2}}.$$
Together with (7), we complete the proof of Theorem 6. □
We recall the theorem in [21] about the error estimate of the VWPGA.
Theorem 7.
Let $\tau := \{t_k\}_{k=1}^{\infty}$, $0 < t_k \le 1$, be a decreasing sequence. Then, for any $f^1, \ldots, f^N$, $f^i \in \mathcal{A}_1(\mathcal{D})$, the output $\{f_m^i\}_{m \ge 0}$ of the VWPGA satisfies
$$\sum_{i=1}^{N} \|f^i - f_m^i\|^2 \le N^2 \Big(1 + \frac{1}{N}\sum_{k=1}^{m} t_k^2\Big)^{-\frac{t_m}{2N^{1/2} + t_m}}.$$
We observe from Theorem 7 that, for a fixed $m$, the error bound of the VWPGA increases as the number of target elements increases; the exponent approaches zero as $N$ becomes large.
Taking $t_k = 1$, $k = 1, 2, \ldots$, in Theorem 7, we obtain the following theorem, which gives the convergence rate of the VWPGA.
Theorem 8.
Let $\tau := \{t_k\}_{k=1}^{\infty}$, $t_k = 1$. Then, for any $f^1, \ldots, f^N$, $f^i \in \mathcal{A}_1(\mathcal{D})$, the output $\{f_m^i\}_{m \ge 0}$ of the VWPGA satisfies
$$\sum_{i=1}^{N} \|f^i - f_m^i\|^2 \le N^2 \Big(1 + \frac{m}{N}\Big)^{-\frac{1}{2N^{1/2} + 1}}.$$
Again, by taking t k = 1 , k = 1 , 2 , in Theorem 6, we obtain the following theorem.
Theorem 9.
Let $\tau := \{t_k\}_{k=1}^{\infty}$, $t_k = 1$. If $f^i \in \mathcal{A}_1(\mathcal{D}) \subset H$, $i = 1, \ldots, N$, then for the VWRPGA(H, $\mathcal{D}$),
$$\|f^i - f_m^{i,v,\tau,r}\| \le \min\Big\{1, \Big(\frac{m}{N}\Big)^{-\frac{1}{2}}\Big\}.$$
Remark 3.
From Theorems 8 and 9, we see that the VWRPGA provides a significantly better convergence rate than the VWPGA. In particular, this advantage is more obvious when N is large.
Remark 4.
It is known from Theorems 2 and 9 that the approximation properties of the VWRPGA are almost the same as those of the VWOGA, while the VWRPGA is simpler than the VWOGA from the viewpoint of computational complexity. For $N$ target elements, one can see from the definitions of the algorithms that the VWOGA needs to solve $N$ $m$-dimensional optimization problems, whereas the VWRPGA only needs to solve $N$ one-dimensional optimization problems. This makes the VWRPGA easier to implement than the VWOGA in practical applications.

3. The VWRPGA for Banach Spaces

In this section, we consider the VWRPGA in the setting of Banach spaces. We remark that there are two natural generalizations of the PGA to the case of a Banach space X: the X-greedy algorithm and the dual greedy algorithm. However, there are no general results on the convergence and error bounds of these two algorithms; cf. [29]. On the other hand, the WOGA, WRGA, WRPGA and VWOGA have been successfully generalized to the case of Banach spaces. We first recall from [31] the definition of the Weak Chebyshev Greedy Algorithm (WCGA), which is a natural generalization of the WOGA.
For any non-zero element $f \in X$, we denote by $F_f$ a norming functional for $f$:
$$\|F_f\| = 1, \qquad F_f(f) = \|f\|.$$
The existence of such a functional is guaranteed by the Hahn–Banach theorem.
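Two classical examples, recalled here for the reader's convenience (they are standard facts, not specific to this paper): in a Hilbert space one may take $F_f(g) = \frac{\langle f, g \rangle}{\|f\|}$, and in $L_p$, $1 < p < \infty$,
$$F_f(g) = \|f\|_p^{1-p} \int |f|^{p-1}\,\mathrm{sign}(f)\, g \, d\mu,$$
which indeed satisfies $\|F_f\| = 1$ and $F_f(f) = \|f\|_p$.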
Let $\tau = \{t_m\}_{m=1}^{\infty}$, $0 < t_m \le 1$, be a given sequence. The WCGA is defined as follows.
  • WCGA(X, $\mathcal{D}$):
Step 0: Define $f_0 = 0$.
Step m:
- If $f = f_{m-1}$, stop the algorithm and define $f_k = f_{m-1} = f$ for $k \ge m$.
- If $f \ne f_{m-1}$, choose an element $\varphi_m \in \mathcal{D}$ such that
$$F_{f - f_{m-1}}(\varphi_m) \ge t_m \sup_{\varphi \in \mathcal{D}} F_{f - f_{m-1}}(\varphi).$$
Set
$$\Phi_m := \mathrm{span}\{\varphi_1, \varphi_2, \ldots, \varphi_m\}.$$
Define $f_m$ to be the best approximant to $f$ from $\Phi_m$, and proceed to Step $m+1$.
To estimate the error of the WCGA, we shall utilize some geometric properties of Banach spaces. For a Banach space $X$, we define $\rho(u)$, the modulus of smoothness of $X$, as
$$\rho(u) := \sup_{f, g \in X,\ \|f\| = \|g\| = 1} \Big( \frac{\|f + ug\| + \|f - ug\|}{2} - 1 \Big), \quad u > 0.$$
A uniformly smooth Banach space is one with the property
$$\lim_{u \to 0} \frac{\rho(u)}{u} = 0.$$
We shall only consider Banach spaces whose modulus of smoothness satisfies the inequality
$$\rho(u) \le \gamma u^q, \quad 1 < q \le 2,$$
where $\gamma$ is a constant independent of $u$.
A typical example of a uniformly smooth Banach space is the Lebesgue space $L_p$, $1 < p < \infty$. It is known from [33] that
$$\rho(u) \le \begin{cases} u^p / p, & 1 < p \le 2, \\ (p-1)u^2 / 2, & 2 \le p < \infty. \end{cases}$$
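As a worked illustration (ours, not from the paper), these estimates give explicit values of $q$ and $\gamma$ for two concrete Lebesgue spaces:
$$L_{3/2}:\ \rho(u) \le \tfrac{2}{3}u^{3/2},\ \text{so } q = \tfrac{3}{2},\ \gamma = \tfrac{2}{3}; \qquad L_4:\ \rho(u) \le \tfrac{3}{2}u^{2},\ \text{so } q = 2,\ \gamma = \tfrac{3}{2},$$
with the corresponding dual exponents $p = \frac{q}{q-1}$ equal to $3$ and $2$, respectively.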
Moreover, we obtain from [34] that for any $X$ with $\dim X = \infty$,
$$\rho(u) \ge (1 + u^2)^{\frac{1}{2}} - 1,$$
and for any $X$ with $\dim X \ge 2$,
$$\rho(u) \ge C u^2, \quad C > 0.$$
The following error bound of the WCGA on $\mathcal{A}_1(\mathcal{D})$ has been established in [31].
Theorem 10.
Let $X$ be a Banach space with modulus of smoothness $\rho(u) \le \gamma u^q$, $1 < q \le 2$. If $f \in \mathcal{A}_1(\mathcal{D}) \subset X$, then the output $\{f_m\}_{m \ge 0}$ of the WCGA(X, $\mathcal{D}$) satisfies the inequality
$$\|f - f_m\| \le C_1(q, \gamma)\Big(1 + \sum_{k=1}^{m} t_k^{\frac{q}{q-1}}\Big)^{-1 + \frac{1}{q}},$$
where the constant $C_1(q, \gamma)$ depends only on $q$ and $\gamma$.
Taking $t_k = 1$, $k = 1, 2, \ldots$, Theorem 10 implies the following corollary, which can be found in [31].
Corollary 1.
Let $X$ be a Banach space with modulus of smoothness $\rho(u) \le \gamma u^q$, $1 < q \le 2$. Then, for any $f \in \mathcal{A}_1(\mathcal{D})$, the output $\{f_m\}_{m \ge 0}$ of the WCGA(X, $\mathcal{D}$) satisfies the inequality
$$\|f - f_m\| \le c \cdot m^{-1 + \frac{1}{q}}.$$
In order to show that the convergence rate $O(m^{-1 + \frac{1}{q}})$ cannot be improved, we take $L_p$, $1 < p < \infty$, as an example.
Let $1 < p \le 2$ be fixed. Combining Corollary 1 with inequality (8), we have, for any $\mathcal{D} \subset L_p$ and any $f \in \mathcal{A}_1(\mathcal{D})$,
$$\|f - f_m\| \le c \cdot m^{-1 + \frac{1}{p}}.$$
When $\mathcal{D}$ is a wavelet basis of $L_p$, it is known from [35] that there is an $f \in \mathcal{A}_1(\mathcal{D})$ such that
$$\|f - f_m\| \ge c \cdot m^{-1 + \frac{1}{p}}.$$
Thus, inequality (9) cannot be improved.
Similarly, let $p > 2$ be fixed. Combining Corollary 1 with inequality (8), we have, for any $\mathcal{D} \subset L_p$ and any $f \in \mathcal{A}_1(\mathcal{D})$,
$$\|f - f_m\| \le c \cdot m^{-\frac{1}{2}}.$$
When $\mathcal{D}$ is the trigonometric system in $L_p$, it is known from [36] that there is an $f \in \mathcal{A}_1(\mathcal{D})$ such that
$$\|f - f_m\| \ge c \cdot m^{-\frac{1}{2}}.$$
Thus, inequality (10) cannot be improved.
Hence, the convergence rate $O(m^{-1 + \frac{1}{q}})$ in Corollary 1 serves as a benchmark for the performance of greedy algorithms in uniformly smooth Banach spaces.
Next, we recall the definition of the WRGA in the Banach space setting from [31]. Let $\tau = \{t_m\}_{m=1}^{\infty}$, $0 < t_m \le 1$, be a given sequence. The WRGA is defined as follows.
  • WRGA(X, $\mathcal{D}$):
Step 0: Define $f_0 = 0$.
Step m:
- If $f = f_{m-1}$, stop the algorithm and define $f_k = f_{m-1} = f$ for $k \ge m$.
- If $f \ne f_{m-1}$, choose an element $\varphi_m \in \mathcal{D}$ such that
$$F_{f - f_{m-1}}(\varphi_m - f_{m-1}) \ge t_m \sup_{\varphi \in \mathcal{D}} F_{f - f_{m-1}}(\varphi - f_{m-1}).$$
Find $0 \le \lambda_m \le 1$ such that
$$\|f - ((1 - \lambda_m) f_{m-1} + \lambda_m \varphi_m)\| = \inf_{0 \le \lambda \le 1} \|f - ((1 - \lambda) f_{m-1} + \lambda \varphi_m)\|.$$
Define $f_m := (1 - \lambda_m) f_{m-1} + \lambda_m \varphi_m$, and proceed to Step $m+1$.
The following error bound of the WRGA on $\mathcal{A}_1(\mathcal{D})$ has been established in [31].
Theorem 11.
Let $X$ be a Banach space with modulus of smoothness $\rho(u) \le \gamma u^q$, $1 < q \le 2$. If $f \in \mathcal{A}_1(\mathcal{D}) \subset X$, then the output $\{f_m\}_{m \ge 0}$ of the WRGA(X, $\mathcal{D}$) satisfies the inequality
$$\|f - f_m\| \le C_2(q, \gamma)\Big(1 + \sum_{k=1}^{m} t_k^p\Big)^{-\frac{1}{p}}, \quad p = \frac{q}{q-1},$$
where the constant $C_2(q, \gamma)$ depends only on $q$ and $\gamma$.
Now, we turn to the vector greedy algorithms. Let $\tau = \{t_m\}_{m=1}^{\infty}$, $0 < t_m \le 1$, be a given sequence. The Vector Weak Chebyshev Greedy Algorithm (VWCGA) [22] is defined as follows.
  • VWCGA(X, $\mathcal{D}$):
Step 0: Define $f_0^{i,v,\tau,c} := 0$, $i = 1, \ldots, N$.
Step m:
- If $f^i = f_{m-1}^{i,v,\tau,c}$ for all $i = 1, \ldots, N$, stop the algorithm and define $f_k^{i,v,\tau,c} = f_{m-1}^{i,v,\tau,c} = f^i$ for $k \ge m$.
- Otherwise, let $i_m$ be such that
$$\|f^{i_m} - f_{m-1}^{i_m,v,\tau,c}\| = \max_{1 \le i \le N} \|f^i - f_{m-1}^{i,v,\tau,c}\|.$$
Choose an element $\varphi_m \in \mathcal{D}$ such that
$$F_{f^{i_m} - f_{m-1}^{i_m,v,\tau,c}}(\varphi_m) \ge t_m \sup_{\varphi \in \mathcal{D}} F_{f^{i_m} - f_{m-1}^{i_m,v,\tau,c}}(\varphi).$$
Set
$$\Phi_m := \mathrm{span}\{\varphi_1, \varphi_2, \ldots, \varphi_m\}.$$
Define $f_m^{i,v,\tau,c}$ to be the best approximant to $f^i$ from $\Phi_m$, $i = 1, \ldots, N$, and proceed to Step $m+1$.
Let $\tau = \{t_m\}_{m=1}^{\infty}$, $0 < t_m \le 1$, be a given sequence. The Vector Weak Relaxed Greedy Algorithm (VWRGA) [22] is defined as follows.
  • VWRGA(X, $\mathcal{D}$):
Step 0: Define $f_0^{i,v,\tau,r} := 0$, $i = 1, \ldots, N$.
Step m:
- If $f^i = f_{m-1}^{i,v,\tau,r}$ for all $i = 1, \ldots, N$, stop the algorithm and define $f_k^{i,v,\tau,r} = f_{m-1}^{i,v,\tau,r} = f^i$ for $k \ge m$.
- Otherwise, let $i_m$ be such that
$$\|f^{i_m} - f_{m-1}^{i_m,v,\tau,r}\| = \max_{1 \le i \le N} \|f^i - f_{m-1}^{i,v,\tau,r}\|.$$
Choose an element $\varphi_m \in \mathcal{D}$ such that
$$F_{f^{i_m} - f_{m-1}^{i_m,v,\tau,r}}(\varphi_m) \ge t_m \sup_{\varphi \in \mathcal{D}} F_{f^{i_m} - f_{m-1}^{i_m,v,\tau,r}}(\varphi).$$
Find $0 \le \lambda_m^i \le 1$ such that
$$\|f^i - ((1 - \lambda_m^i) f_{m-1}^{i,v,\tau,r} + \lambda_m^i \varphi_m)\| = \inf_{0 \le \lambda \le 1} \|f^i - ((1 - \lambda) f_{m-1}^{i,v,\tau,r} + \lambda \varphi_m)\|.$$
Define $f_m^{i,v,\tau,r} := (1 - \lambda_m^i) f_{m-1}^{i,v,\tau,r} + \lambda_m^i \varphi_m$, $i = 1, \ldots, N$, and proceed to Step $m+1$.
The error bounds of the VWCGA and VWRGA on $\mathcal{A}_1(\mathcal{D})$ have been established in [22].
Theorem 12.
Let $X$ be a Banach space with modulus of smoothness $\rho(u) \le \gamma u^q$, $1 < q \le 2$. Then, for a sequence $\tau := \{t_k\}_{k=1}^{\infty}$, $0 < t_k \le 1$, and any $f^i \in \mathcal{A}_1(\mathcal{D}) \subset X$, $i = 1, \ldots, N$, we have
$$\|f^i - f_m^{i,v,\tau,c}\| \le C_1(q, \gamma)\min\Big\{1, \Big(\frac{1}{N}\sum_{k=1}^{m} t_k^p\Big)^{-\frac{1}{p}}\Big\}, \quad p = \frac{q}{q-1},$$
$$\|f^i - f_m^{i,v,\tau,r}\| \le C_2(q, \gamma)\min\Big\{1, \Big(\frac{1}{N}\sum_{k=1}^{m} t_k^p\Big)^{-\frac{1}{p}}\Big\}, \quad p = \frac{q}{q-1}.$$
Now, we define the VWRPGA(X, $\mathcal{D}$). To this end, we recall the definition of the WRPGA from [17]. Let $X$ be a Banach space with modulus of smoothness $\rho(u) \le \gamma u^q$, $1 < q \le 2$, and let $\tau = \{t_m\}_{m=1}^{\infty}$, $0 < t_m \le 1$, be a given sequence.
  • WRPGA(X, $\mathcal{D}$):
Step 0: Define $f_0 = 0$.
Step m:
- If $f = f_{m-1}$, stop the algorithm and define $f_k = f_{m-1} = f$ for $k \ge m$.
- If $f \ne f_{m-1}$, choose an element $\varphi_m \in \mathcal{D}$ such that
$$|F_{f - f_{m-1}}(\varphi_m)| \ge t_m \sup_{\varphi \in \mathcal{D}} F_{f - f_{m-1}}(\varphi).$$
With
$$\lambda_m = \mathrm{sign}\{F_{f - f_{m-1}}(\varphi_m)\}\, \|f - f_{m-1}\|\, (2\gamma q)^{\frac{1}{1-q}}\, |F_{f - f_{m-1}}(\varphi_m)|^{\frac{1}{q-1}},$$
$$\hat{f}_m := f_{m-1} + \lambda_m \varphi_m,$$
choose $s_m$ such that
$$\|f - s_m \hat{f}_m\| = \min_{s \in \mathbb{R}} \|f - s \hat{f}_m\|.$$
Define the next approximant to be $f_m = s_m \hat{f}_m$, and proceed to Step $m+1$.
The sufficient conditions for the convergence of the WRPGA in terms of the weakness sequence and the modulus of smoothness can be found in [17]. Moreover, the following theorem gives the error bound of the WRPGA on $\mathcal{A}_1(\mathcal{D})$.
Theorem 13
(see Theorem 6.1 in [17]). Let $X$ be a Banach space with modulus of smoothness $\rho(u) \le \gamma u^q$, $1 < q \le 2$. If $f \in \mathcal{A}_1(\mathcal{D}) \subset X$, then the output $\{f_m\}_{m \ge 0}$ of the WRPGA(X, $\mathcal{D}$) satisfies the inequality
$$\|f - f_m\| \le C_3(q, \gamma)\Big(1 + \sum_{k=1}^{m} t_k^p\Big)^{-\frac{1}{p}}, \quad p = \frac{q}{q-1},$$
where the constant $C_3(q, \gamma)$ depends only on $q$ and $\gamma$.
Let $X$ be a Banach space with modulus of smoothness $\rho(u) \le \gamma u^q$, $1 < q \le 2$, and let $\tau = \{t_m\}_{m=1}^{\infty}$, $0 < t_m \le 1$, be a given sequence. We define the VWRPGA(X, $\mathcal{D}$) as follows.
  • VWRPGA(X, $\mathcal{D}$): Given $f^i \in X$, $i = 1, \ldots, N$.
Step 0: Define $f_0^{i,v,\tau,R} := 0$, $i = 1, \ldots, N$.
Step m:
- If $f^i = f_{m-1}^{i,v,\tau,R}$ for all $i = 1, \ldots, N$, stop the algorithm and define $f_k^{i,v,\tau,R} = f_{m-1}^{i,v,\tau,R} = f^i$ for $k \ge m$.
- Otherwise, let $i_m$ be such that
$$\|f^{i_m} - f_{m-1}^{i_m,v,\tau,R}\| = \max_{1 \le i \le N} \|f^i - f_{m-1}^{i,v,\tau,R}\|.$$
Choose an element $\varphi_m \in \mathcal{D}$ such that
$$F_{f^{i_m} - f_{m-1}^{i_m,v,\tau,R}}(\varphi_m) \ge t_m \sup_{\varphi \in \mathcal{D}} F_{f^{i_m} - f_{m-1}^{i_m,v,\tau,R}}(\varphi).$$
With
$$\lambda_m^i = \mathrm{sign}\{F_{f^i - f_{m-1}^{i,v,\tau,R}}(\varphi_m)\}\, \|f^i - f_{m-1}^{i,v,\tau,R}\|\, (2\gamma q)^{\frac{1}{1-q}}\, |F_{f^i - f_{m-1}^{i,v,\tau,R}}(\varphi_m)|^{\frac{1}{q-1}},$$
$$\hat{f}_m^i := f_{m-1}^{i,v,\tau,R} + \lambda_m^i \varphi_m,$$
choose $s_m^i$ such that
$$\|f^i - s_m^i \hat{f}_m^i\| = \min_{s \in \mathbb{R}} \|f^i - s \hat{f}_m^i\|.$$
Define the next approximants to be $f_m^{i,v,\tau,R} = s_m^i \hat{f}_m^i$, $i = 1, \ldots, N$, and proceed to Step $m+1$.
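To make the Banach space version concrete, the following Python sketch runs the VWRPGA in $\ell_p$ with $p \ge 2$, so that one may take $q = 2$ and $\gamma = (p-1)/2$ by the estimate recalled above; it uses SciPy's scalar minimizer for the one-dimensional step and is our own illustrative code under these assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def norming_functional(r, p):
    """Coefficient vector of the norming functional F_r in ell_p (applied via a dot product)."""
    return np.sign(r) * np.abs(r) ** (p - 1) / np.linalg.norm(r, ord=p) ** (p - 1)

def vwrpga_banach(F, D, iters, p=4.0):
    """VWRPGA sketch in ell_p, p >= 2, so rho(u) <= ((p-1)/2) u^2, i.e. q = 2, gamma = (p-1)/2."""
    q, gamma = 2.0, (p - 1.0) / 2.0
    Fm = np.zeros_like(F, dtype=float)
    for _ in range(iters):
        R = F - Fm
        norms = np.array([np.linalg.norm(R[:, i], ord=p) for i in range(F.shape[1])])
        i_m = int(np.argmax(norms))                     # target with the largest residual norm
        if norms[i_m] == 0.0:
            break
        vals = D.T @ norming_functional(R[:, i_m], p)   # F_{r^{i_m}}(phi) for every column phi
        j = int(np.argmax(np.abs(vals)))
        phi = np.sign(vals[j]) * D[:, j]                # dictionary symmetry: use -phi if needed
        for i in range(F.shape[1]):
            r = F[:, i] - Fm[:, i]
            if not np.any(r):
                continue
            val = norming_functional(r, p) @ phi        # F_{r^i}(phi_m)
            lam = (np.sign(val) * np.linalg.norm(r, ord=p)
                   * (2.0 * gamma * q) ** (1.0 / (1.0 - q)) * abs(val) ** (1.0 / (q - 1.0)))
            f_hat = Fm[:, i] + lam * phi
            best = minimize_scalar(lambda s: np.linalg.norm(F[:, i] - s * f_hat, ord=p))
            Fm[:, i] = best.x * f_hat                   # f^i_m = s^i_m * f_hat^i_m
    return Fm
```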
We now obtain the convergence properties and error bound of the VWRPGA in this setting.
Firstly, we establish a theorem on the convergence of the VWRPGA; it appears to be the first result on the convergence of vector greedy algorithms in the Banach space setting.
Theorem 14.
Let $X$ be a Banach space with modulus of smoothness $\rho(u) \le \gamma u^q$, $1 < q \le 2$. Assume
$$\sum_{m=1}^{\infty} t_m^p = \infty, \quad p = \frac{q}{q-1}.$$
Then, for any $f^i \in X$, $i = 1, \ldots, N$, and any dictionary $\mathcal{D}$, the VWRPGA converges.
The idea of the proof of Theorem 14 is similar to that of Theorem 5. However, because of the more complicated geometry of Banach spaces, several arguments in the subsequent analysis must be modified, replaced, or generalized. Some useful results from the Hilbert space case have been generalized to Banach spaces, as shown in the following lemmas.
Lemma 3.
Let $X$ be a Banach space with modulus of smoothness $\rho(u) \le \gamma u^q$, $1 < q \le 2$. For any two nonzero elements $f, g \in X$ and any $h > 0$, we have
$$\|f - hg\| \le \|f\| + 2\|f\|\gamma \cdot \Big(\frac{h\|g\|}{\|f\|}\Big)^q - h F_f(g).$$
Proof. 
The proof of this lemma follows from the proof of Lemma 6.1 in [29] and the fact that the modulus of smoothness of $X$ satisfies $\rho(u) \le \gamma u^q$, $1 < q \le 2$. □
Lemma 4.
Let $X$ be a Banach space with modulus of smoothness $\rho(u) \le \gamma u^q$, $1 < q \le 2$. Let $f_{m-1}^{i,v,\tau,R}$ be the output of the VWRPGA at Step $m-1$ for $f^i$, $i = 1, \ldots, N$. If $f^i \ne f_{m-1}^{i,v,\tau,R}$, then we have
$$F_{f^i - f_{m-1}^{i,v,\tau,R}}(f_{m-1}^{i,v,\tau,R}) = 0.$$
Proof. 
Denote $L := \mathrm{span}\{\hat{f}_{m-1}^i\} \subset X$. By the definition of the VWRPGA(X, $\mathcal{D}$), $f_{m-1}^{i,v,\tau,R}$ is the best approximant to $f^i$ from $L$ for $i = 1, \ldots, N$. Thus, the conclusion of the lemma follows from Lemma 2.1 in [31]. □
Lemma 5
(see Lemma 2.2 in [31]). For any bounded linear functional $F$ and any dictionary $\mathcal{D}$ in a Banach space, we have
$$\sup_{\varphi \in \mathcal{D}} F(\varphi) = \sup_{g \in \mathcal{A}_1(\mathcal{D})} F(g).$$
Now, we prove Theorem 14.
Proof of Theorem 14.
Let $r_m^i$, $i = 1, \ldots, N$, be the residual of $f_m^{i,v,\tau,R}$. It is known from the definition of the VWRPGA(X, $\mathcal{D}$) that $r_m^i$ satisfies
$$\|r_m^i\| = \|f^i - f_m^{i,v,\tau,R}\| = \|f^i - s_m^i \hat{f}_m^i\| \le \|f^i - \hat{f}_m^i\| = \|r_{m-1}^i - \lambda_m^i \varphi_m\|.$$
We apply Lemma 3 to the latter quantity with $f = r_{m-1}^i$, $g = \mathrm{sign}(\lambda_m^i)\varphi_m$, $h = |\lambda_m^i|$, and obtain
$$\|r_m^i\| \le \|r_{m-1}^i\| + 2\|r_{m-1}^i\|\gamma \cdot \Big(\frac{|\lambda_m^i|}{\|r_{m-1}^i\|}\Big)^q - \lambda_m^i F_{f^i - f_{m-1}^{i,v,\tau,R}}(\varphi_m).$$
By the choice of $\lambda_m^i$, we have
$$\|r_m^i\| \le \|r_{m-1}^i\|\Big(1 - \frac{q-1}{q}(2\gamma q)^{\frac{1}{1-q}} \cdot |F_{f^i - f_{m-1}^{i,v,\tau,R}}(\varphi_m)|^{\frac{q}{q-1}}\Big).$$
Thus, it is easy to see that $\{\|r_m^i\|\}_{m=0}^{\infty}$ is a decreasing sequence. By the Monotone Convergence Theorem, $\lim_{m \to \infty} \|r_m^i\|$ exists for $i = 1, \ldots, N$.
Next, we prove that $\lim_{m \to \infty} \|r_m^i\| = 0$ by contradiction. Assume $\lim_{m \to \infty} \|r_m^i\| \ge a > 0$, $i = 1, \ldots, N$. Then, for any $m$, we have $\|r_m^i\| \ge a$. By (11), we obtain that
$$\sum_{i=1}^{N} \|r_m^i\| \le \sum_{i=1}^{N} \|r_{m-1}^i\| - \frac{q-1}{q}(2\gamma q)^{\frac{1}{1-q}} \sum_{i=1}^{N} \|r_{m-1}^i\| \cdot |F_{f^i - f_{m-1}^{i,v,\tau,R}}(\varphi_m)|^{\frac{q}{q-1}} \le \sum_{i=1}^{N} \|r_{m-1}^i\| \Bigg(1 - \frac{\frac{q-1}{q}(2\gamma q)^{\frac{1}{1-q}} \|r_{m-1}^{i_m}\| \big(F_{f^{i_m} - f_{m-1}^{i_m,v,\tau,R}}(\varphi_m)\big)^{\frac{q}{q-1}}}{\sum_{i=1}^{N} \|r_{m-1}^i\|}\Bigg) \le \sum_{i=1}^{N} \|r_{m-1}^i\| \Bigg(1 - \frac{\frac{q-1}{q}(2\gamma q)^{\frac{1}{1-q}} \|r_{m-1}^{i_m}\| \big(F_{f^{i_m} - f_{m-1}^{i_m,v,\tau,R}}(\varphi_m)\big)^{\frac{q}{q-1}}}{N \|r_{m-1}^{i_m}\|}\Bigg) \le \sum_{i=1}^{N} \|f^i\| \prod_{j=1}^{m} \Bigg(1 - \frac{1}{N} \cdot \frac{q-1}{q}(2\gamma q)^{\frac{1}{1-q}} \big(F_{f^{i_j} - f_{j-1}^{i_j,v,\tau,R}}(\varphi_j)\big)^{\frac{q}{q-1}}\Bigg).$$
Denote
$$x_j = \frac{1}{N} \cdot \frac{q-1}{q}(2\gamma q)^{\frac{1}{1-q}} \cdot \big(F_{f^{i_j} - f_{j-1}^{i_j,v,\tau,R}}(\varphi_j)\big)^{\frac{q}{q-1}}.$$
By the inequality $1 - x \le \frac{1}{1+x}$, $0 \le x \le 1$, we obtain
$$\sum_{i=1}^{N} \|r_m^i\| \le \sum_{i=1}^{N} \|f^i\| \prod_{j=1}^{m} \frac{1}{1+x_j} \le \sum_{i=1}^{N} \|f^i\| \, \frac{1}{1 + \sum_{j=1}^{m} x_j}.$$
Then, we proceed with a lower estimate for $x_j$, $j = 1, \ldots, m$.
By Lemma 1, we set $\epsilon = \frac{a}{2}$ and find $f_j^{\epsilon}$ such that
$$\|f^{i_j} - f_j^{\epsilon}\| < \epsilon$$
and
$$f_j^{\epsilon} / A(\epsilon) \in \mathcal{A}_1(\mathcal{D}),$$
with some number $A(\epsilon) > 0$.
We obtain from Lemma 5 that
$$F_{f^{i_j} - f_{j-1}^{i_j,v,\tau,R}}(\varphi_j) \ge t_j \sup_{\varphi \in \mathcal{D}} F_{f^{i_j} - f_{j-1}^{i_j,v,\tau,R}}(\varphi) = t_j \sup_{g \in \mathcal{A}_1(\mathcal{D})} F_{f^{i_j} - f_{j-1}^{i_j,v,\tau,R}}(g) \ge t_j A(\epsilon)^{-1} F_{f^{i_j} - f_{j-1}^{i_j,v,\tau,R}}(f_j^{\epsilon}).$$
By Lemma 4, we obtain
$$F_{f^{i_j} - f_{j-1}^{i_j,v,\tau,R}}(f_j^{\epsilon}) = F_{f^{i_j} - f_{j-1}^{i_j,v,\tau,R}}\big(f^{i_j} - (f^{i_j} - f_j^{\epsilon})\big) = F_{f^{i_j} - f_{j-1}^{i_j,v,\tau,R}}(f^{i_j}) - F_{f^{i_j} - f_{j-1}^{i_j,v,\tau,R}}(f^{i_j} - f_j^{\epsilon}) > F_{f^{i_j} - f_{j-1}^{i_j,v,\tau,R}}\big(f^{i_j} - f_{j-1}^{i_j,v,\tau,R} + f_{j-1}^{i_j,v,\tau,R}\big) - \epsilon = F_{f^{i_j} - f_{j-1}^{i_j,v,\tau,R}}\big(f^{i_j} - f_{j-1}^{i_j,v,\tau,R}\big) - \epsilon = \|r_{j-1}^{i_j}\| - \epsilon.$$
Inequalities (14) and (15) and $\epsilon = \frac{a}{2}$ result in
$$F_{f^{i_j} - f_{j-1}^{i_j,v,\tau,R}}(\varphi_j) \ge t_j A(\epsilon)^{-1}\big(\|r_{j-1}^{i_j}\| - \epsilon\big) \ge t_j A(\epsilon)^{-1} \cdot \frac{a}{2}.$$
Combining (12) with (16), we obtain
$$x_j \ge \frac{1}{N} \cdot \frac{q-1}{q}(2\gamma q)^{\frac{1}{1-q}} \cdot \Big(t_j A(\epsilon)^{-1} \cdot \frac{a}{2}\Big)^{\frac{q}{q-1}} = \frac{1}{N} \cdot \frac{q-1}{q}(2\gamma q)^{\frac{1}{1-q}} \cdot \Big(\frac{a}{2 A(\epsilon)}\Big)^{p} t_j^{p}.$$
Combining (17) with (13), we have
$$\sum_{i=1}^{N} \|r_m^i\| \le \sum_{i=1}^{N} \|f^i\| \cdot \frac{1}{1 + \frac{1}{N} \cdot \frac{q-1}{q}(2\gamma q)^{\frac{1}{1-q}} \big(\frac{a}{2 A(\epsilon)}\big)^{p} \sum_{j=1}^{m} t_j^{p}}.$$
The assumption $\sum_{m=1}^{\infty} t_m^p = \infty$ implies that $\sum_{i=1}^{N} \|r_m^i\| \to 0$ as $m \to \infty$.
Thus, $\lim_{m \to \infty} \|r_m^i\| = 0$ for $i = 1, \ldots, N$. We obtain a contradiction, which proves the theorem. □
Remark 5.
According to Theorem 3.1 in [32], $\sum_{m=1}^{\infty} t_m^p = \infty$ is also a necessary condition for the convergence of the VWRPGA.
Remark 6.
Since the WRGA converges only for target elements from $\mathcal{A}_1(\mathcal{D})$ (see [31]), the VWRGA also converges only for target elements from $\mathcal{A}_1(\mathcal{D})$. Thus, the convergence property of the VWRPGA is better than that of the VWRGA.
Remark 7.
For $f^i \in \mathcal{A}_1(\mathcal{D})$, $i = 1, \ldots, N$, the convergence of the VWCGA follows from Theorem 12. Theorem 15 below gives a convergence condition for arbitrary $f^i \in X$, $i = 1, \ldots, N$.
By using the same method, it is not difficult to prove the following theorem on the convergence of the VWCGA.
Theorem 15.
Let $X$ be a Banach space with modulus of smoothness $\rho(u) \le \gamma u^q$, $1 < q \le 2$. Assume
$$\sum_{m=1}^{\infty} t_m^p = \infty, \quad p = \frac{q}{q-1}.$$
Then, for any $f^i \in X$, $i = 1, \ldots, N$, and any dictionary $\mathcal{D}$, the VWCGA converges.
Next, we give the theorem on the error bound of the VWRPGA(X, $\mathcal{D}$) on $\mathcal{A}_1(\mathcal{D})$.
Theorem 16.
Let $X$ be a Banach space with modulus of smoothness $\rho(u) \le \gamma u^q$, $1 < q \le 2$. Then, for a sequence $\tau := \{t_k\}_{k=1}^{\infty}$, $0 < t_k \le 1$, and any $f^i \in \mathcal{A}_1(\mathcal{D}) \subset X$, $i = 1, \ldots, N$, we have
$$\|f^i - f_m^{i,v,\tau,R}\| \le C_3(q, \gamma)\min\Big\{1, \Big(\frac{1}{N}\sum_{k=1}^{m} t_k^p\Big)^{-\frac{1}{p}}\Big\}, \quad p = \frac{q}{q-1}.$$
Proof. 
It is known from (11) that the sequences $\{\|f^i - f_m^{i,v,\tau,R}\|\}_{m=0}^{\infty}$, $i = 1, \ldots, N$, are decreasing. Fix $i$. The inequality
$$\|f^i - f_m^{i,v,\tau,R}\| \le 1$$
follows from the assumption $f^i \in \mathcal{A}_1(\mathcal{D})$ and the fact that $\{\|f^i - f_m^{i,v,\tau,R}\|\}_{m=0}^{\infty}$ is decreasing.
Thus, we only need to prove the following estimate:
$$\|f^i - f_m^{i,v,\tau,R}\| \le C(q, \gamma)\Big(\frac{1}{N}\sum_{k=1}^{m} t_k^p\Big)^{-\frac{1}{p}}, \quad i = 1, \ldots, N.$$
We define the set $E_l := \{k \,|\, i_k = l,\ 1 \le k \le m\}$ just as we did in the proof of Theorem 6. It is obvious that
$$\sum_{k=1}^{m} t_k^p = \sum_{l=1}^{N} \sum_{k \in E_l} t_k^p.$$
Thus, there exists $l_0$, $1 \le l_0 \le N$, such that
$$\sum_{k \in E_{l_0}} t_k^p \ge \frac{1}{N}\sum_{k=1}^{m} t_k^p.$$
As in the proof of Theorem 6, for $f^{l_0} \in X$, the approximants $f_1^{l_0,v,\tau,R}, \ldots, f_m^{l_0,v,\tau,R}$ are the outputs of the WRPGA with the weakness sequence $\tau^{l_0} := \{t_k^{l_0}\}$. Therefore, using Theorem 13, we obtain
$$\|f^{l_0} - f_{k_0-1}^{l_0,v,\tau,R}\| \le C(q, \gamma)\Big(1 + \sum_{k \in E_{l_0} \setminus \{k_0\}} t_k^p\Big)^{-\frac{1}{p}} \le C(q, \gamma)\Big(1 + \sum_{k \in E_{l_0}} t_k^p - 1\Big)^{-\frac{1}{p}} \le C(q, \gamma)\Big(\frac{1}{N}\sum_{k=1}^{m} t_k^p\Big)^{-\frac{1}{p}}.$$
The proof of Theorem 16 is completed. □
Remark 8.
We know from Theorems 12 and 16 that the error bound of the VWRPGA is almost the same as those of the VWCGA and VWRGA, while the computational complexity of the VWRPGA is essentially smaller than those of the VWCGA and VWRGA.

4. Conclusions

In this paper, we consider the use of vector greedy algorithms for simultaneous approximation. We first work in a Hilbert space $H$. We propose a new vector greedy algorithm—the Vector Weak Rescaled Pure Greedy Algorithm (VWRPGA)—for simultaneous approximation with respect to a dictionary $\mathcal{D}$ in $H$. Then, we study the error performance of the VWRPGA. We show that the convergence rate of the VWRPGA on $\mathcal{A}_1(\mathcal{D})$ is optimal. The VWRPGA has a weaker convergence condition than the VWPGA, and its convergence rate is better than that of the VWPGA; this advantage is more obvious when $N$ is large. Moreover, the error performance of the VWRPGA is similar to that of the VWOGA. However, from the viewpoint of computational complexity, the VWRPGA is simpler than the VWOGA: for $N$ target elements, one can see from the definitions of the algorithms that the VWOGA needs to solve $N$ $m$-dimensional optimization problems, whereas the VWRPGA only needs to solve $N$ one-dimensional optimization problems.
Then, we design the Vector Weak Rescaled Pure Greedy Algorithm (VWRPGA) in a uniformly smooth Banach space setting. We obtain the convergence properties and error bound of the VWRPGA in this case. We also show that the convergence condition of the VWCGA is the same as that of the VWRPGA. We show that when the Banach space is a Lebesgue space, the convergence rate of the VWRPGA on $\mathcal{A}_1(\mathcal{D})$ is sharp. As for the convergence properties, the VWRGA converges only for target elements from $\mathcal{A}_1(\mathcal{D})$, while the VWRPGA converges for any element; therefore, the VWRPGA has better convergence properties than the VWRGA. The error bounds of the VWRPGA are similar to those of the VWCGA and VWRGA, and from the viewpoint of computational complexity, the VWRPGA is simpler than the VWCGA and the VWRGA.
In conclusion, the VWRPGA is the simplest vector greedy algorithm for simultaneous approximation with the best convergence property and the optimal convergence rate.
The VWRPGA is more efficient than the WRPGA, since the complexity of calculation and the storage of information are greatly reduced by running the VWRPGA instead of the $N$-fold WRPGA. If $\tau := \{t_k\}_{k=1}^{\infty}$, $t_k = 1$, $k = 1, 2, \ldots$, and $N = 1$, then the VWRPGA degenerates into the RPGA. In [5], the authors applied the RPGA to kernel-based regression. They defined the Rescaled Pure Greedy Learning Algorithm (RPGLA) and studied its efficiency. They showed that the computational complexity of the RPGLA is less than those of the Orthogonal Greedy Learning Algorithm (OGLA) [37] and the Relaxed Greedy Learning Algorithm (RGLA) [38]. When the kernel is infinitely smooth, the learning rate can be arbitrarily close to the best rate $O(m^{-1})$ under a mild assumption on the regression function. Since the VWRPGA is more efficient than the RPGA, the VWRPGA can be used to solve multi-task learning problems more efficiently. Moreover, it is natural to consider applications of the VWRPGA to vector signal processing. We will study these applications of the VWRPGA in the future.

Author Contributions

Conceptualization, X.X., P.Y. and W.Z.; methodology, X.X. and P.Y.; formal analysis, J.G.; investigation, all authors; resources, all authors; data curation, all authors; writing—original draft preparation, P.Y. and J.G.; writing—review and editing, X.X. and P.Y.; visualization, all authors; supervision, X.X. and P.Y.; project administration, all authors; funding acquisition, all authors. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (No. 11671213).

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Barron, A.; Cohen, A.; Dahmen, W.; DeVore, R. Approximation and learning by greedy algorithms. Ann. Stat. 2008, 36, 64–94. [Google Scholar] [CrossRef]
  2. Cohen, A.; Dahmen, W.; DeVore, R. Compressed sensing and best k-term approximation. J. Am. Math. Soc. 2009, 22, 211–231. [Google Scholar] [CrossRef]
  3. Yang, B.; Yang, C.; Huang, G. Efficient image fusion with approximate sparse representation. Int. J. Wavelets Multiresolut. Inf. Process. 2016, 14, 1650024. [Google Scholar]
  4. Zhang, W.H.; Ye, P.X.; Xing, S.; Xu, X. Optimality of the approximation and learning by the rescaled pure super greedy algorithms. Axioms 2022, 11, 437. [Google Scholar] [CrossRef]
  5. Zhang, W.H.; Ye, P.X.; Xing, S. Optimality of the rescaled pure greedy learning algorithms. Int. J. Wavelets Multiresolut. Inf. Process. 2023, 21, 2250048. [Google Scholar] [CrossRef]
  6. Nguyen, H.; Petrova, G. Greedy strategies for convex optimization. Calcolo 2017, 54, 207–224. [Google Scholar] [CrossRef]
  7. Huang, A.T.; Feng, R.Z.; Wang, A.D. The sufficient conditions for orthogonal matching pursuit to exactly reconstruct sparse polynomials. Mathematics 2022, 10, 3703. [Google Scholar] [CrossRef]
  8. Liu, Z.Y.; Xu, Q.Y. A multiscale RBF collocation method for the numerical solution of partial differential equations. Mathematics 2019, 7, 964. [Google Scholar] [CrossRef]
  9. Jin, D.F.; Yang, G.; Li, Z.H.; Liu, H.D. Sparse recovery algorithm for compressed sensing using smoothed l0 norm and randomized coordinate descent. Mathematics 2019, 7, 834. [Google Scholar] [CrossRef]
  10. Natsiou, A.A.; Gravvanis, G.A.; Filelis-Papadopoulos, C.K.; Giannoutakis, K.M. An aggregation-based algebraic multigrid method with deflation techniques and modified generic factored approximate sparse inverses. Mathematics 2023, 11, 640. [Google Scholar] [CrossRef]
  11. Argyriou, A.; Evgeniou, T.; Pontil, M. Convex multitask feature learning. Mach. Learn. 2008, 73, 243–272. [Google Scholar] [CrossRef]
  12. Schmidt, E. Zur Theorie der linearen und nichtlinearen Integralgleichungen. I Math. Annalen. 1906–1907, 63, 433–476. [Google Scholar] [CrossRef]
  13. Tropp, J.A.; Gilbert, A.C.; Strauss, M.J. Algorithms for simultaneous sparse approximation. Part I: Greedy pursuit. Signal. Process. 2006, 86, 572–588. [Google Scholar] [CrossRef]
  14. Wirtz, D.; Haasdonk, B. A vectorial kernel orthogonal greedy algorithm. Proc. DWCAA 2013, 6, 83–100. [Google Scholar]
  15. DeVore, R.A.; Temlyakov, V.N. Some remarks on greedy algorithms. Adv. Comput. Math. 1996, 5, 173–187. [Google Scholar] [CrossRef]
  16. Gao, Z.; Petrova, G. Rescaled pure greedy algorithm for convex optimization. Calcolo 2019, 56, 15. [Google Scholar] [CrossRef]
  17. Petrova, G. Rescaled pure greedy algorithm for Hilbert and Banach spaces. Appl. Comput. Harmon. Anal. 2016, 41, 852–866. [Google Scholar] [CrossRef]
  18. Jiang, B.; Ye, P.; Zhang, W. Unified error estimate for weak biorthogonal greedy algorithms. Int. J. Wavelets Multiresolut. Inform. Process. 2022, 5, 2150001. [Google Scholar] [CrossRef]
  19. Dereventsov, A.V.; Temlyakov, V.N. A unified way of analyzing some greedy algorithms. J. Funct. Anal. 2019, 12, 1–30. [Google Scholar] [CrossRef]
  20. Temlyakov, V.N. A remark on simultaneous greedy approximation. East J. Approx. 2004, 10, 17–25. [Google Scholar]
  21. Leviatan, D.; Temlyakov, V.N. Simultaneous approximation by greedy algorithms. Adv. Comput. Math. 2006, 25, 73–90. [Google Scholar] [CrossRef]
  22. Leviatan, D.; Temlyakov, V.N. Simultaneous greedy approximation in Banach spaces. J. Complex. 2005, 21, 275–293. [Google Scholar] [CrossRef]
  23. Lutoborski, A.; Temlyakov, V.N. Vector greedy algorithms. J. Complex. 2003, 19, 458–473. [Google Scholar] [CrossRef]
  24. Mallat, S.; Zhang, Z. Matching pursuit with time-frequency dictionaries. IEEE Trans. Signal Pross. 1993, 41, 3397–3415. [Google Scholar] [CrossRef]
  25. Konyagin, S.V.; Temlyakov, V.N. Rate of convergence of pure greedy algorithm. East. J. Approx. 1996, 5, 493–499. [Google Scholar]
  26. Sil'nichenko, A.V. Rates of convergence of greedy algorithms. Mat. Zametki 2004, 76, 628–632. [Google Scholar]
  27. Burusheva, L.; Temlyakov, V. Sparse approximation of individual functions. J. Approx. Theory 2020, 259, 105471. [Google Scholar] [CrossRef]
  28. Livshitz, D.; Temlyakov, V.N. Two lower estimates in greedy approximation. Constr. Approx. 2003, 19, 509–524. [Google Scholar] [CrossRef]
  29. Temlyakov, V.N. Greedy Approximation; Cambridge University Press: Cambridge, UK, 2011. [Google Scholar]
  30. Temlyakov, V.N. Weak greedy algorithms. Adv. Comput. Math. 2000, 12, 213–227. [Google Scholar] [CrossRef]
  31. Temlyakov, V.N. Greedy algorithms in Banach spaces. Adv. Comput. Math. 2001, 14, 277–292. [Google Scholar] [CrossRef]
  32. Jiang, B.; Ye, P.X. Efficiency of the weak rescaled pure greedy algorithm. Int. J. Wavelets Multiresolut. Inform. Process. 2021, 4, 2150001. [Google Scholar] [CrossRef]
  33. Donahue, M.; Gurvits, L.; Darken, C.; Sontag, E. Rate of convex approximation in non-Hilbert spaces. Constr. Approx. 1997, 13, 187–220. [Google Scholar] [CrossRef]
  34. Lindenstrauss, J.; Tzafriri, L. Classical Banach Spaces I; Springer: Berlin/Heidelberg, Germany, 1977. [Google Scholar]
  35. Temlyakov, V.N.; Yang, M.R.; Ye, P.X. Greedy approximation with regard to non-greedy bases. Adv. Comput. Math. 2011, 34, 319–337. [Google Scholar] [CrossRef]
  36. Ye, P.X.; Wei, X.J. Efficiency of weak greedy algorithms for m-term approximations. Sci. China Math. 2016, 59, 697–714. [Google Scholar] [CrossRef]
  37. Chen, H.; Zhou, Y.C.; Tang, Y.Y.; Li, L.Q.; Pan, Z.B. Convergence rate of the semi-supervised greedy algorithm. Neural Netw. 2013, 44, 44–50. [Google Scholar] [CrossRef]
  38. Lin, S.B.; Rong, Y.H.; Sun, X.P.; Xu, Z.B. Learning capability of the relaxed greedy algorithms. IEEE Trans. Neural Netw. Learn. Syst. 2013, 24, 1598–1608. [Google Scholar]