Abstract
This paper provides several error estimates for total variation (TV) type regularization, which arises in many areas, such as signal and image processing and machine learning. Basic properties of the minimizers of the TV regularization problem, such as stability, consistency, and convergence rates, are investigated. Both a priori and a posteriori parameter choice rules are considered. Furthermore, an improved convergence rate is derived under a sparsity assumption. The non-sparse case, which is common in practice, is also discussed, and the corresponding convergence rate is presented under mild conditions.
1. Introduction
Compressed sensing [1,2] has gained increasing attention in recent years; it plays an important role in signal processing [3,4], imaging science [5,6] and machine learning [7]. Compressed sensing focuses on signals with a sparse representation. Let be a Hilbert space, and be the orthonormal basis of . For any , let . Provided the operator K satisfies certain conditions, it is possible to recover a sparse signal of length n by Basis Pursuit (BP) [8], i.e.,
from the samples , even when K is ill-posed [2,9,10]. However, in most cases noise is inevitable, and the literature has therefore turned to the noisy BP model
where is the allowed error level. In fact, the unconstrained form of the noisy BP model, i.e., sparse regularization, which is the focus of [11,12,13,14,15,16], is often more attractive. While the success of compressed sensing greatly inspired the development of sparse regularization, it is interesting to note that sparse regularization appeared much earlier than compressed sensing [11,12]. As an inverse problem, the error theory of sparse regularization is well studied in the literature [17,18,19].
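To make the two formulations above concrete, the following is a minimal numerical sketch, assuming a small random measurement matrix and the generic convex solver cvxpy; the matrix K, the samples y and the tolerance delta below are illustrative placeholders, not quantities from this paper.

```python
# Minimal sketch of Basis Pursuit (BP) and its noise-aware variant,
# assuming the convex solver cvxpy; K, y and delta are illustrative placeholders.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
m, n = 48, 128
K = rng.standard_normal((m, n)) / np.sqrt(m)         # underdetermined measurement map
x_true = np.zeros(n)
x_true[rng.choice(n, size=5, replace=False)] = 1.0   # sparse signal of length n
y = K @ x_true                                        # noiseless samples

x = cp.Variable(n)
# BP: minimize ||x||_1 subject to Kx = y
cp.Problem(cp.Minimize(cp.norm1(x)), [K @ x == y]).solve()
print(np.linalg.norm(x.value - x_true))               # small: BP recovers the sparse signal

# Noisy BP: minimize ||x||_1 subject to ||Kx - y_delta||_2 <= delta
delta = 1e-2
y_delta = y + (delta / np.sqrt(m)) * rng.standard_normal(m)
cp.Problem(cp.Minimize(cp.norm1(x)), [cp.norm2(K @ x - y_delta) <= delta]).solve()
```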
In practice, many signals are not sparse unless transformed by some operator (which may be ill-posed). Thus, many studies have been devoted to analyzing the corresponding regularized optimization problem [20]. A typical example is a signal with a sparse gradient, which arises frequently in image processing (natural images are usually piecewise constant, i.e., they have sparse gradients). Total variation (TV) has been used extensively in imaging science for decades, and a series of techniques have been devoted to the choice of its regularization parameter [21,22,23,24,25,26,27,28,29,30,31]; other methods [32,33] have been developed based on this observation. Similar to [34], total variation can also be used to smooth the signal of interest. Let be another Hilbert space. For any , define satisfying
Under the above definition, T is an ill-posed linear operator. Given a linear map and , the total variation regularization problem can be written as
where is the regularization parameter. The regularization term is exactly the total variation (TV) of x. TV-type regularization has a form similar to sparse regularization. However, the exact reconstruction results established for sparse regularization cannot be applied to the TV type directly, especially since T is ill-posed (T has a nontrivial null space).
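For intuition, a discrete analogue of the TV regularization problem can be written down and solved directly with a generic convex solver. The sketch below is only illustrative: it assumes T is the first-order forward-difference operator and uses cvxpy; K, the noisy data and the regularization parameter are placeholders.

```python
# Minimal sketch of the discrete TV-type regularization problem
#     min_x  1/2 ||K x - y_delta||_2^2 + alpha * ||T x||_1,
# assuming T is the first-order forward-difference operator;
# K, y_delta and alpha are illustrative placeholders.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(1)
m, n = 60, 100
K = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = np.concatenate([np.zeros(40), np.ones(35), 2.0 * np.ones(25)])  # piecewise constant
y_delta = K @ x_true + 1e-2 * rng.standard_normal(m)

T = np.diff(np.eye(n), axis=0)        # (n-1) x n forward-difference matrix
alpha = 1e-2

x = cp.Variable(n)
objective = 0.5 * cp.sum_squares(K @ x - y_delta) + alpha * cp.norm1(T @ x)
cp.Problem(cp.Minimize(objective)).solve()
x_alpha_delta = x.value               # one minimizer of the regularized problem
```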
In this paper, we first discuss the stability and consistency of the minimizers of . Beyond these basic properties, we are also interested in convergence rates for the TV problem. Under source conditions [19,35,36], convergence rates are obtained for both a priori and a posteriori parameter choice rules. However, the linear convergence rate requires K to be injective, which is usually a restrictive assumption. In the latter part of the paper, the linear convergence rate is also derived under a sparsity assumption on and suitable conditions on K; this derivation does not depend on the injectivity of K. This paper also considers the case where the sparsity assumption on fails. Finally, based on some recent works [37,38,39], which likewise do not assume that is sparse, a convergence rate is given for this case as well.
2. Notation
The notation described in this section is adopted throughout this paper. Let , be two Hilbert spaces and , be orthonormal bases of and , respectively. For any and , and . The and norms of x and y are denoted by , and , , respectively. In this paper, unless otherwise specified, for any and , we assume that , i.e., and . means that converges weakly to x, while means that converges strongly to x. The operator norm of a linear operator is defined as
Throughout the paper, denotes the signal of interest and denotes the measurements. denotes an element of satisfying . With this notation, the TV regularization problem can be expressed as
We denote by one of the minimizers of .
Remark 1.
Consider the set , where
Obviously, for any n, and . As , and . This means that T is ill-posed.
Remark 2.
Let , where is the identity operator on . Then, for any . It is easy to verify that D is continuous. Hence, T is continuous on and
In practice, the ill-posedness of T complicates the analysis. To overcome this difficulty, we consider a condition that plays an important role in the derivations.
Condition 1.
There exist two constants such that
for any .
We present a finite-dimensional interpretation of this condition. Let and . Then, satisfies and . In the finite-dimensional case, T has the form
The definition of T gives that . If , then , and we have that , where . Hence, for any and some , . Noting that , we then have .
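Assuming, as is common in the discrete setting, that the finite-dimensional T above acts as a first-order difference operator, the following small numerical check illustrates the continuity claimed in Remark 2: the operator norm of T stays below 2 regardless of the dimension. The matrix form used here is an assumption for illustration.

```python
# Quick numerical check, assuming the finite-dimensional T is a first-order
# difference matrix: its spectral norm is bounded by 2 for every n,
# consistent with the continuity of T stated in Remark 2.
import numpy as np

for n in (10, 100, 1000):
    T = np.diff(np.eye(n), axis=0)       # rows are e_{i+1} - e_i (forward differences)
    print(n, np.linalg.norm(T, 2))       # approaches, but never exceeds, 2
```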
3. Basic Error Estimations
The properties of TV-type regularization are investigated in this section. First, we introduce a lemma that is used frequently in this section.
Lemma 1.
Let be bounded, α be fixed and be a sequence. Assume that Condition 1 holds and is bounded. Then, is also bounded.
Proof.
It is straightforward to show that and are bounded. Note that
which implies that and are bounded. From Condition 1, we derive that
which implies the boundedness of . □
3.1. Stability
In this subsection, we investigate the behavior of as , with fixed. We first recall a lemma from convex optimization.
Lemma 2
([40,41]). Let be the solution set of the convex minimization problem
Then, and is constant over .
Theorem 1.
Assume that satisfies Condition 1. For any fixed and , we have
Proof.
The minimizing property of gives that . Then, Lemma 1 indicates that there exists a subsequence of converging weakly to some . For simplicity, we also denote this subsequence by . By the weak lower semi-continuity of the norms, we have
Therefore, we have that
On the other hand, by the minimizing property of ,
Obviously, it holds that
This means that minimizes . From Lemma 2, and . Consequently, we have , and . We now complete the proof by contradiction. Assume that . We can obtain that
This is a contradiction. Then, we have
From relations (3), we can obtain that . □
If K is injective, we further obtain that . The theorem above indicates that and are continuous at . In fact, we can obtain a stronger result: the value function is differentiable at .
Theorem 2.
Let ; then, is differentiable with respect to α, and .
Proof.
For , we have
Since minimizes , we have
It follows that . On the other hand, can be written as
Similarly, we have . Combining the two inequalities above, we have
When , similar results can also be obtained. The continuity of at gives that . □
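The differentiability asserted in Theorem 2 can be checked numerically. The sketch below reuses the discrete TV setting assumed earlier (T a forward-difference matrix, cvxpy as solver, all quantities placeholders) and compares a centred difference quotient of the value function with the TV of the computed minimizer, the value an envelope-type argument suggests for the derivative; the exact statement of the theorem is the one above.

```python
# Numerical illustration of Theorem 2, assuming the discrete setting
#     F(alpha) = min_x 1/2 ||K x - y||_2^2 + alpha * ||T x||_1
# with T a forward-difference matrix and cvxpy as solver (placeholders).
# The centred difference quotient of F is compared with ||T x_alpha||_1.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(2)
m, n = 40, 60
K = rng.standard_normal((m, n)) / np.sqrt(m)
y = K @ np.concatenate([np.zeros(30), np.ones(30)])
T = np.diff(np.eye(n), axis=0)

def solve_tv(alpha):
    x = cp.Variable(n)
    prob = cp.Problem(cp.Minimize(0.5 * cp.sum_squares(K @ x - y) + alpha * cp.norm1(T @ x)))
    prob.solve()
    return prob.value, x.value

alpha, h = 0.05, 1e-4
F_plus, _ = solve_tv(alpha + h)
F_minus, _ = solve_tv(alpha - h)
_, x_alpha = solve_tv(alpha)
print((F_plus - F_minus) / (2 * h))        # numerical derivative of the value function
print(np.linalg.norm(T @ x_alpha, 1))      # candidate value suggested by an envelope argument
```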
3.2. Consistency
The behavior of is investigated under an a priori parameter choice as . In the analysis, we assume that the following condition holds.
Condition 2.
For any obeying , satisfies that
Equality holds if and only if .
Lemma 3.
Let , and . Then, we have and .
Proof.
We can obtain that
The triangle inequality gives that . Fatou's lemma gives that
Note that ; then, . Hence, . Similarly, we can obtain . Therefore,
Thus, we have
By the same method, we can also obtain that . □
Theorem 3.
Assume that satisfies Condition 1 and the assumptions of Lemma 1. Let the parameters satisfy
Then, the sequence .
Proof.
By the definition of , we have
From the choice rule for the parameters and , we see that are bounded. Then, by Lemma 1, there exist a subsequence, also denoted by , and some point such that . We have that
This means . It is easy to see that . On the other hand, we can obtain that
Condition 2 gives that . From the inequality above, we see that . By Lemma 3, we have and . Consequently, from Condition 1, it holds that
□
3.3. Convergence Rate
This subsection concerns convergence rates under different parameter choice rules (a priori and a posteriori). First, we discuss the a priori rule. As in the classical Tikhonov regularization method [19,35,36], we introduce a source condition.
Condition 3.
Let satisfy the source condition
Theorem 4.
If satisfies the source condition, it holds that
If K is injective, there exists such that
Proof.
The definition of gives that
Using the notation , we obtain that
For any , the convexity of C indicates . Then, we have that
Choose in the source condition; after simplification, we derive that
Adding to both sides, we obtain that
This means
If K is injective, there exists such that . Then, we derive that
□
Remark 3.
In fact, the first result in Theorem 4 has been proved in [42] for general convex regularization. The proof is included here for completeness.
The following part investigates the a posteriori parameter choice rule. The analysis is motivated by the work in [43,44]. For simplicity of presentation, the parameter is chosen as
Theorem 5.
Assume that α is chosen according to rule (5) and that satisfies Condition 2. Then, it holds that
If K is injective, there exists such that
Proof.
It is trivial to prove that
Lemma 2 indicates that is bounded. Note that and . It then follows that
Then, the sequence has a subsequence, also denoted by , converging weakly to some . We can easily see that
That is, . Moreover, it is easy to see that . Using relation (6), we have that
Condition 2 gives that ; hence, the whole sequence converges weakly to and
Thus, we have . From Lemma 3, we have and , which leads to
If K is injective, there exists such that . Then, we derive that
□
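Since the explicit form of rule (5) is not reproduced above, the following sketch only illustrates the general shape of such an a posteriori choice: a Morozov-type discrepancy criterion enforced by bisection over the regularization parameter. The helper solve(alpha) is assumed to return a minimizer of the TV problem (e.g., as in the earlier sketches); the threshold tau and the bracketing interval are placeholders.

```python
# Sketch of an a posteriori parameter choice in the spirit of a discrepancy
# principle; rule (5) itself is not reproduced in the text, so the Morozov-type
# criterion ||K x_alpha - y_delta|| ~ tau * delta is used as a stand-in.
# `solve(alpha)` is an assumed helper returning a minimizer of the TV problem.
import numpy as np

def choose_alpha(solve, K, y_delta, delta, tau=1.1, lo=1e-6, hi=1e2, iters=40):
    for _ in range(iters):
        alpha = np.sqrt(lo * hi)                          # geometric midpoint
        x_alpha = solve(alpha)
        residual = np.linalg.norm(K @ x_alpha - y_delta)
        if residual > tau * delta:
            hi = alpha                                    # residual too large: shrink alpha
        else:
            lo = alpha                                    # residual too small: grow alpha
    return np.sqrt(lo * hi)
```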
4. Improved Convergence Rate
In this section, we investigate the convergence rate when K may not be injective. The first part presents the analysis under the sparsity assumption, while the second deals with the case where the sparsity assumption fails.
4.1. Performance under Sparsity Assumption
The analysis in this subsection assumes that is sparse. To prove the convergence rate, we need the finite injectivity property [45].
Condition 4.
The operator K satisfies the uniform finite injectivity property, i.e., for any finite subset , is injective.
Remark 4.
In the finite-dimensional case, if is small, it is easy to see that the finite injectivity property is essentially the restricted isometry property [2,46].
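In a small finite-dimensional example, the finite injectivity property can be checked exhaustively: for every index set of a given size, the corresponding column submatrix of K must have a strictly positive smallest singular value. The sketch below does exactly this for a random placeholder matrix; the matrix and the support size are assumptions for illustration.

```python
# Exhaustive check of the finite injectivity property (Condition 4) in a small
# finite-dimensional example: K restricted to every support of size s must be
# injective, i.e. each column submatrix has a positive smallest singular value.
# The matrix K and the support size s are illustrative placeholders.
import numpy as np
from itertools import combinations

rng = np.random.default_rng(3)
m, n, s = 20, 12, 3
K = rng.standard_normal((m, n)) / np.sqrt(m)

worst = min(np.linalg.svd(K[:, list(S)], compute_uv=False)[-1]
            for S in combinations(range(n), s))
print(worst > 0)   # True: K is injective on every support of size s
```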
Let and . Denote by S the set , where satisfies the source condition. Let . Since , S is finite and contains the support of . Let P be the projection onto S and the projection onto . From Condition 4, there exists some such that
Lemma 4.
Assume that satisfies the source condition and Condition 1 holds. If , there exist and such that
Proof.
Assume that the conditions of Lemma 4 hold. Then, we can obtain that
Hence, we derive that
We now turn to estimating . Let . Obviously, . We then have that
The source condition gives that
Therefore, we have that
From Condition 1, we have that
Note that ; letting , we have that
□
With the lemma above, we can obtain the following result; its proof can be found in [44,47,48,49].
Theorem 6.
Let the regularization parameter be chosen a priori as or a posteriori as according to the strong discrepancy principle (5). Then, we have the convergence rate
4.2. Performance if the Sparsity Assumption Fails
In this subsection, we focus on the case where is not sparse. As shown in the previous subsection, Lemma 4 is critical for the convergence rate analysis. In this part, a similar lemma is established, and the convergence rate is then proved. The first lemma is motivated by [37].
Lemma 5.
For any and , it holds that
Proof.
Denote by the projection for any . Then, we have
Algebraic computation gives that
Note that
and
Combining the equations above, we obtain
□
Condition 5.
For all there exists such that and .
Lemma 6.
Let ; is a concave index function. Assume that satisfies the source condition and that Conditions 1 and 5 hold. If , it holds that
for some positive.
Proof.
is concave and upper semi-continuous, since it is an infimum of affine functions. For any , is finite and continuous. Note that ; the upper semi-continuity at gives the continuity of at . We now turn to the strict monotonicity of . By Condition 5, the infimum of is attained at some . Considering , we have
From Condition 5, we have that
Therefore, we obtain that
From Condition 1, we have that
Letting , we obtain that
□
Theorem 7.
Let the regularization parameter be chosen a priori as or a posteriori as according to the strong discrepancy principle (5). Then, we have the convergence rate
5. Conclusions
In this paper, we have studied several problems in total variation type regularization. Although it has a form similar to sparse regularization, the TV type is harder to investigate because of the ill-posedness of T. A set of conditions has been given in this paper, and under these conditions we have studied several theoretical properties of the minimizers of TV-type regularization, such as stability, consistency and convergence rates. The convergence rate analysis was further refined under a sparsity assumption. In the non-sparse case, we also presented a more conservative result based on some recent works. Regularizers learned from data are currently a very active research topic, so in future work we will develop error estimates for this type of regularization problem.
Author Contributions
Conceptualization, K.L. and Z.Y.; methodology, C.H.; validation, K.L., C.H. and Z.Y.; formal analysis, K.L.; writing—original draft preparation, K.L. and Z.Y.; writing—review and editing, K.L. and Z.Y.; supervision, C.H.; project administration, Z.Y.; funding acquisition, K.L. and C.H. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by the National Key Research and Development Program of China under Grant 2020YFA0709803, 173 Program under Grant 2020-JCJQ-ZD-029, the Science Challenge Project under Grant TZ2016002, and Dongguan Science and Technology of Social Development Program under Grant 2020507140146.
Conflicts of Interest
The authors declare no conflict of interest.
References
- Donoho, D.L. Compressed sensing. IEEE Trans. Inf. Theory 2006, 52, 1289–1306.
- Candes, E.J.; Tao, T. Decoding by linear programming. IEEE Trans. Inf. Theory 2005, 51, 4203–4215.
- Lustig, M.; Donoho, D.L.; Santos, J.M.; Pauly, J.M. Compressed Sensing MRI. IEEE Signal Process. Mag. 2008, 25, 72–82.
- Haupt, J.; Bajwa, W.U.; Rabbat, M.; Nowak, R. Compressed Sensing for Networked Data. IEEE Signal Process. Mag. 2008, 25, 92–101.
- Yin, W.; Osher, S.; Goldfarb, D.; Darbon, J. Bregman iterative algorithms for ℓ1-minimization with applications to compressed sensing. SIAM J. Imaging Sci. 2008, 1, 143–168.
- Cai, J.F.; Osher, S.; Shen, Z. Split Bregman methods and frame based image restoration. Multiscale Model. Simul. 2009, 8, 337–369.
- Van den Berg, E.; Friedlander, M.P. Probing the Pareto Frontier for Basis Pursuit Solutions. SIAM J. Sci. Comput. 2008, 31, 890–912.
- Chen, S.S.; Donoho, D.L.; Saunders, M.A. Atomic decomposition by basis pursuit. SIAM J. Sci. Comput. 1998, 20, 33–61.
- Candès, E.J.; Romberg, J.; Tao, T. Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information. IEEE Trans. Inf. Theory 2006, 52, 489–509.
- Candès, E.J. The restricted isometry property and its implications for compressed sensing. C. R. Math. 2008, 346, 589–592.
- Tibshirani, R. Regression Shrinkage and Selection via the Lasso. J. R. Stat. Soc. B 1996, 58, 267–288.
- Daubechies, I.; Defrise, M.; Mol, C.D. An iterative thresholding algorithm for linear inverse problems with a sparsity constraint. Commun. Pure Appl. Math. 2004, 57, 1413–1457.
- Daubechies, I.; Devore, R.; Fornasier, M.; Güntürk, C.S. Iteratively reweighted least squares minimization for sparse recovery. Commun. Pure Appl. Math. 2010, 38, 1–38.
- Bot, R.I.; Hofmann, B. The impact of a curious type of smoothness conditions on convergence rates in ℓ1-regularization. Eur. J. Math. Comput. Appl. 2013, 1, 29–40.
- Jin, B.; Maass, P. Sparsity regularization for parameter identification problems. Inverse Probl. 2012, 28, 123001.
- Grasmair, M.; Haltmeier, M.; Scherzer, O. Sparse regularization with ℓq penalty term. Inverse Probl. 2008, 24, 055020.
- Hans, E.; Raasch, T. Global convergence of damped semismooth Newton methods for ℓ1 Tikhonov regularization. Inverse Probl. 2015, 31, 025005.
- Lorenz, D.A.; Schiffler, S.; Trede, D. Beyond convergence rates: Exact recovery with the Tikhonov regularization with sparsity constraints. Inverse Probl. 2011, 27, 085009.
- Lorenz, D.A. Convergence rates and source conditions for Tikhonov regularization with sparsity constraints. J. Inverse Ill-Posed Probl. 2008, 16, 463–478.
- Rubinov, A.M.; Yang, X.Q.; Bagirov, A.M. Penalty functions with a small penalty parameter. Optim. Methods Softw. 2002, 17, 931–964.
- Cai, J.F.; Dong, B.; Osher, S.; Shen, Z. Image Restoration: Total Variation, Wavelet Frames and Beyond. J. Am. Math. Soc. 2012, 25, 1033–1089.
- Rudin, L.I.; Osher, S.; Fatemi, E. Nonlinear total variation based noise removal algorithms. Phys. D 1992, 60, 259–268.
- Jun, L.; Huang, T.Z.; Gang, L.; Wang, S.; Lv, X.-G. Total variation with overlapping group sparsity for speckle noise reduction. Neurocomputing 2019, 216, 502–513.
- Van den Berg, P.M.; Kleinman, R.E. A total variation enhanced modified gradient algorithm for profile reconstruction. Inverse Probl. 1995, 11, L5.
- Clason, C.; Jin, B.; Kunisch, K. A Duality-Based Splitting Method for ℓ1-TV Image Restoration with Automatic Regularization Parameter Choice. SIAM J. Sci. Comput. 2010, 32, 1484–1505.
- Cai, J.F.; Xu, W.Y. Guarantees of total variation minimization for signal recovery. Inform. Infer. 2015, 4, 328–353.
- Needell, D.; Ward, R. Stable Image Reconstruction Using Total Variation Minimization. SIAM J. Imaging Sci. 2012, 6, 1035–1058.
- Sun, T.; Yin, P.; Cheng, L.; Jiang, H. Alternating direction method of multipliers with difference of convex functions. Adv. Comput. Math. 2018, 44, 723–744.
- Sun, T.; Jiang, H.; Cheng, L.; Zhu, W. Iteratively Linearized Reweighted Alternating Direction Method of Multipliers for a Class of Nonconvex Problems. IEEE Trans. Signal Process. 2018, 66, 5380–5391.
- Leonov, A.S. On the total-variation convergence of regularizing algorithms for ill-posed problems. Comput. Math. Math. Phys. 2007, 47, 732–747.
- Tikhonov, A.; Leonov, A.; Yagola, A. Nonlinear Ill-Posed Problems; De Gruyter: Berlin, Germany, 2011; pp. 505–512.
- Hào, D.N.; Quyen, T.N.T. Convergence rates for total variation regularization of coefficient identification problems in elliptic equations I. Inverse Probl. 2011, 27, 075008.
- Hào, D.N.; Quyen, T.N.T. Convergence rates for total variation regularization of coefficient identification problems in elliptic equations II. J. Math. Anal. Appl. 2012, 388, 593–616.
- Ciegis, R.; Sev, A.J. Nonlinear Diffusion Problems in Image Smoothing. Math. Model. Anal. 2005, 1, 381–388.
- Grasmair, M.; Scherzer, O.; Haltmeier, M. Necessary and sufficient conditions for linear convergence of ℓ1-regularization. Commun. Pure Appl. Math. 2011, 64, 161–182.
- Burger, M.; Osher, S. Convergence rates of convex variational regularization. Inverse Probl. 2004, 20, 1411.
- Burger, M.; Flemming, J.; Hofmann, B. Convergence rates in ℓ1-regularization if the sparsity assumption fails. Inverse Probl. 2013, 29, 025013.
- Flemming, J.; Hofmann, B.; Veselić, I. A unified approach to convergence rates for ℓ1-regularization and lacking sparsity. J. Inverse Ill-Posed Probl. 2015, 24, 139–148.
- Anzengruber, S.W.; Hofmann, B.; Mathé, P. Regularization properties of the sequential discrepancy principle for Tikhonov regularization in Banach spaces. Appl. Anal. 2014, 93, 1382–1400.
- Luo, Z.Q.; Tseng, P. On the convergence of the coordinate descent method for convex differentiable minimization. J. Optim. Theory Appl. 1992, 72, 7–35.
- Luo, Z.Q.; Tseng, P. Error bound and convergence analysis of matrix splitting algorithms for the affine variational inequality problem. SIAM J. Optim. 1992, 2, 43–54.
- Jin, B.; Lorenz, D.A. Heuristic parameter-choice rules for convex variational regularization based on error estimates. SIAM J. Numer. Anal. 2010, 48, 1208–1229.
- Jin, B.; Zou, J. Iterative parameter choice by discrepancy principle. IMA J. Numer. Anal. 2012, 32, 1714–1732.
- Jin, B.; Lorenz, D.A.; Schiffler, S. Elastic-Net Regularization: Error estimates and Active Set Methods. Inverse Probl. 2009, 25, 115022.
- Bredies, K.; Lorenz, D.A. Linear convergence of iterative soft-thresholding. J. Fourier Anal. Appl. 2008, 14, 813–837.
- Foucart, S.; Rauhut, H. A Mathematical Introduction to Compressive Sensing; Springer: New York, NY, USA, 2013.
- Anzengruber, S.W.; Ramlau, R. Convergence rates for Morozov’s discrepancy principle using variational inequalities. Inverse Probl. 2011, 27, 105007–105024.
- Flemming, J. Generalized Tikhonov Regularization and Modern Convergence Rate Theory in Banach Spaces; Shaker Verlag: Düren, Germany, 2012.
- Hofmann, B.; Mathé, P. Parameter choice in Banach space regularization under variational inequalities. Inverse Probl. 2012, 6, 1035–1058.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).