Investigating the Influence of Box-Constraints on the Solution of a Total Variation Model via an Efficient Primal-Dual Method

In this paper, we investigate the usefulness of adding a box-constraint to the minimization of functionals consisting of a data-fidelity term and a total variation regularization term. In particular, we show that in certain applications an additional box-constraint does not affect the solution at all, i.e., the solution is the same whether a box-constraint is imposed or not. For applications where a box-constraint may influence the solution, we investigate how much it affects the quality of the restoration, especially when the regularization parameter, which weights the data term against the regularizer, is chosen suitably. In particular, for such applications we consider the case of a squared L² data-fidelity term. For computing a minimizer of the respective box-constrained optimization problems, a primal-dual semi-smooth Newton method is presented, which guarantees superlinear convergence.


Introduction
An observed image g, which contains additive Gaussian noise with zero mean and standard deviation σ, may be modeled as g = Kû + n, where û is the original image, K is a linear bounded operator, and n represents the noise. With the aim of preserving edges in images, total variation regularization in image restoration was proposed in [1]. Based on this approach and assuming that g ∈ L²(Ω) and K ∈ L(L²(Ω)), a good approximation of û is usually obtained by solving either the constrained problem

min_{u ∈ BV(Ω)} ∫_Ω |Du|  subject to  ‖Ku − g‖² ≤ σ²|Ω|,   (1)

or the penalized problem

min_{u ∈ BV(Ω)} ‖Ku − g‖² + α ∫_Ω |Du|,   (2)

where Ω ⊂ R² is a simply connected domain with Lipschitz boundary, |Ω| is its volume, and α > 0 is the regularization parameter. Here ∫_Ω |Du| denotes the total variation of u in Ω and BV(Ω) is the space of functions of bounded variation, i.e., u ∈ BV(Ω) if and only if u ∈ L¹(Ω) and ∫_Ω |Du| < ∞; see [2,3] for more details. We recall that BV(Ω) ⊂ L²(Ω) if Ω ⊂ R².
If we additionally know (or assume) that the original image lies in the dynamic range [c_min, c_max], i.e., c_min ≤ u(x) ≤ c_max for almost every (a.e.) x ∈ Ω, we incorporate this information into problems (1) and (2), obtaining

min_{u ∈ BV(Ω) ∩ C} ∫_Ω |Du|  subject to  ‖Ku − g‖² ≤ σ²|Ω|,   (3)

and

min_{u ∈ BV(Ω) ∩ C} ‖Ku − g‖² + α ∫_Ω |Du|,   (4)

respectively, where C := {u ∈ L²(Ω) : c_min ≤ u(x) ≤ c_max for a.e. x ∈ Ω}. In order to guarantee the existence of a minimizer of problems (3) and (4) we assume in the sequel that K does not annihilate constant functions. By noting that the characteristic function χ_C is lower semicontinuous, existence follows by the same arguments as in [4]. If additionally g ∈ K(BV(Ω) ∩ C), then by [4] (Prop. 2.1) it follows that there exists a constant α ≥ 0 such that problem (3) is equivalent to problem (4).
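To fix ideas in the discrete setting used later, membership in the set C and the pointwise projection onto it can be sketched as follows (a minimal illustration, not part of the algorithms of the paper; the function names are ours):

```python
import numpy as np

def in_box(u, c_min, c_max):
    """Discrete analogue of u in C: every pixel value lies in [c_min, c_max]."""
    return bool(np.all((u >= c_min) & (u <= c_max)))

def project_box(u, c_min, c_max):
    """Pointwise projection onto C, i.e. clipping each pixel to [c_min, c_max]."""
    return np.clip(u, c_min, c_max)
```

Clipping is also the thresholding strategy compared later in Section 6 (Tables 3 and 4), where the observation g is projected onto C before running the unconstrained method.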
For image restoration, box-constraints have been considered, for example, in [5,27–29]. In [29] a functional consisting of an L²-data term and a Tikhonov-like regularization term (i.e., the L²-norm of some derivative of u) in connection with a box-constraint is presented, together with a Newton-like numerical scheme. For box-constrained total variation minimization, a fast gradient-based algorithm, called the monotone fast iterative shrinkage/thresholding algorithm (MFISTA), is proposed in [5] and a rate of convergence is proven. Based on the alternating direction method of multipliers (ADMM) [30], a solver for the box-constrained L²-TV and L¹-TV models is derived in [27] and shown to be faster than MFISTA. In [28] a primal-dual algorithm for the box-constrained L¹-TV model and for box-constrained non-local total variation is presented. In order to achieve a constrained solution, which is positive and bounded from above by some intensity value, an exponential-type transform is applied to the L²-TV model in [31]. Recently, in [32] a box-constraint is also incorporated in a total variation model with a combined L¹-L² data fidelity, proposed in [33], for removing Gaussian and impulse noise simultaneously.
Setting the upper bound in the set C to infinity and the lower bound to 0, i.e., c_min = 0 and c_max = +∞, leads to a non-negativity constraint. Total variation minimization with a non-negativity constraint is a well-known technique to improve the quality of reconstructions in image processing; see, for example, [34,35] and references therein.
In this paper we are concerned with problems (3) and (4) when the lower bound c_min and the upper bound c_max in C are finite. However, the analysis and the presented algorithms are easily adjusted to the situation when one of the bounds is set to −∞ or +∞, respectively. Note that a solution of problem (1) or problem (2) is in general not an element of the set C. However, since g is an observation containing Gaussian noise with zero mean, a minimizer of problem (2) does lie in C if α in problem (2) is sufficiently large and the original image û ∈ C. This observation raises the question whether an optimal parameter α would already lead to a minimizer that lies in C. If this were the case, then incorporating the box-constraint into the minimization problem would not yield any improvement of the solution. In particular, there are situations in which a box-constraint does not affect the solution at all (see Section 3 below). Additionally, we expect that the box-constrained problems are more difficult to handle and numerically more costly to solve than problems (1) and (2).
In order to answer the question raised above, we numerically compute optimal values of α for the box-constrained and the non-box-constrained total variation models and compare the resulting reconstructions with respect to quality measures. By optimal values we mean parameters α such that the solutions of problems (1) and (2), or of problems (3) and (4), coincide. Note that several methods exist for computing the regularization parameter; see, for example, [36] for an overview of parameter selection algorithms for image restoration. Here we use the pAPS-algorithm proposed in [36] to compute reasonable α in problems (2) and (4). For minimizing problem (4) we derive a semi-smooth Newton method, which serves as a good method for quickly computing rather exact solutions. Second-order methods have already been proposed and used in image reconstruction; see [21,36–39]. However, to the best of our knowledge, semi-smooth Newton methods have not yet been presented for box-constrained total variation minimization. In this setting, differently from the aforementioned approaches, the box-constraint adds some additional difficulties in deriving the dual problems, which have to be calculated to obtain the desired method; see Section 4 for more details. The superlinear convergence of our method is guaranteed by the theory of semi-smooth Newton methods; see, for example, [21]. Note that our approach differs significantly from the Newton-like scheme presented in [29], where a smooth objective functional with a box-constraint is considered. This allows [29] to derive a Newton method without dualization. Here, our Newton method is based on dualization and may be viewed as a primal-dual (Newton) approach.
We remark that a scalar regularization parameter might not be the best choice for every image restoration problem, since images usually consist of large uniform areas as well as parts with fine details; see, for example, [36,38]. It has been demonstrated, for example in [36,38,40,41] and references therein, that with the help of spatially varying regularization parameters one may be able to restore images visually better than with scalar parameters. In this vein we also consider

min_{u ∈ BV(Ω)} ‖Ku − g‖² + ∫_Ω α|Du|   (5)

and

min_{u ∈ BV(Ω) ∩ C} ‖Ku − g‖² + ∫_Ω α|Du|,   (6)

where α : Ω → R⁺ is a bounded continuous function [42]. We adapt our semi-smooth Newton method to approximately solve these two optimization problems and utilize the pLATV-algorithm of [36] to compute a locally varying α.
Our numerical results in Section 6 show that in many applications the quality of the restoration is more a question of how the regularization parameter is chosen than of whether a box-constraint is included. The solutions obtained by solving the box-constrained versions (3), (4), and (6) improve the restorations slightly, but not drastically. Nevertheless, we also report on a medical application where a non-negativity constraint significantly improves the restoration.
We further observe that if the noise-level of the corrupted image is unknown, then we may use the information on the image intensity range (if known) to calculate a suitable parameter for problem (2). Note that in this situation the optimization problems (1) and (3) cannot be considered, since σ is not at hand. We present a method which automatically computes the regularization parameter α in problem (2), provided the information that the original image û ∈ [c_min, c_max].
Hence the contribution of the paper is threefold: (i) We present a semi-smooth Newton method for the box-constrained total variation minimization problems (4) and (6). (ii) We investigate the influence of the box-constraint on the solution of total variation minimization models with respect to the regularization parameter. (iii) In case the noise-level is not at hand, we propose a new automatic regularization parameter selection algorithm based on the box-constraint information.
The rest of the paper is organized as follows: In Section 2 we recall useful definitions and the Fenchel duality theorem, which will be used later. Section 3 is devoted to the analysis of box-constrained total variation minimization. In particular, we state that in certain cases adding a box-constraint to the considered problem does not change the solution at all. The semi-smooth Newton method for the box-constrained L²-TV model (4) and its multiscale version (6) is derived in Section 4 and its numerical implementation is presented in Section 5. Numerical experiments investigating the usefulness of a box-constraint are shown in Section 6. In Section 7 we propose an automatic parameter selection algorithm using the box-constraint. Finally, in Section 8 conclusions are drawn.

Basic Terminology
Let X be a Banach space. Its topological dual is denoted by X* and ⟨·, ·⟩ describes the canonical bilinear pairing over X × X*. For a convex functional J : X → R ∪ {+∞} we define the subdifferential of J at v ∈ X as the set-valued function

∂J(v) := {v* ∈ X* : J(u) ≥ J(v) + ⟨u − v, v*⟩ for all u ∈ X}.

It is clear from this definition that 0 ∈ ∂J(v) if and only if v is a minimizer of J. The conjugate function (or Legendre transform) of a convex function J : X → R ∪ {+∞} is defined as

J*(v*) := sup_{v ∈ X} {⟨v, v*⟩ − J(v)}.

From this definition we see that J* is the pointwise supremum of continuous affine functions and thus, according to [43] (Proposition 3.1, p. 14), convex, lower semicontinuous, and proper.
For an arbitrary set S we denote by χ_S its characteristic function, defined by χ_S(v) = 0 if v ∈ S and χ_S(v) = +∞ otherwise. We recall the Fenchel duality theorem; see, e.g., [43] for details.
Theorem 1 (Fenchel duality theorem). Let X and Y be two Banach spaces with topological duals X* and Y*, respectively, and let Λ : X → Y be a bounded linear operator with adjoint Λ* ∈ L(Y*, X*). Further, let F : X → R ∪ {∞} and G : Y → R ∪ {∞} be convex, lower semicontinuous, and proper functionals. Assume there exists u₀ ∈ X such that F(u₀) < ∞, G(Λu₀) < ∞, and G is continuous at Λu₀. Then we have

inf_{u ∈ X} F(u) + G(Λu) = sup_{p ∈ Y*} −F*(Λ*p) − G*(−p),   (7)

and the problem on the right-hand side of (7) admits a solution p̄. Moreover, ū and p̄ are solutions of the two optimization problems in (7), respectively, if and only if Λ*p̄ ∈ ∂F(ū) and −p̄ ∈ ∂G(Λū).
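As a concrete illustrative instance of Theorem 1 (not spelled out in this form in the paper), take discrete L²-TV denoising with F(u) = ½‖u − g‖², G(q) = α‖q‖₁, and Λ = ∇; computing the conjugates F* and G* yields the familiar dual problem in the variable p:

```latex
% F(u) = \tfrac12\|u-g\|_2^2  \implies  F^*(v) = \tfrac12\|v\|_2^2 + \langle v, g\rangle
% G(q) = \alpha\|q\|_1        \implies  G^*(p) = \chi_{\{\|p\|_\infty \le \alpha\}}(p)
\min_{u}\; \tfrac12\|u-g\|_2^2 + \alpha\|\nabla u\|_1
  \;=\; \sup_{\|p\|_\infty \le \alpha}\;
        -\tfrac12\|\operatorname{div} p\|_2^2 + \langle \operatorname{div} p,\, g\rangle ,
\qquad \Lambda^* p = -\operatorname{div} p .
```

The right-hand side is the well-known dual of TV denoising; the pointwise bound on p is exactly the constraint set appearing as A in Section 4 below.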

Limitation of Box-Constrained Total Variation Minimization
In this section we investigate the difference between the box-constrained problem (4) and the non-box-constrained problem (1). For the case when the operator K is the identity I, which is the relevant case in image denoising, we have the following obvious result:

Proposition 1. Let K = I and g ∈ C. Then the minimizer u* ∈ BV(Ω) of problem (2) also lies in the dynamic range [c_min, c_max], i.e., u* ∈ BV(Ω) ∩ C.
This result is easily extendable to optimization problems of the type

min_{u ∈ BV(Ω)} α₁‖Ku − g‖_{L¹(Ω)} + α₂‖Ku − g‖² + ∫_Ω |Du|   (9)

with α₁, α₂ ≥ 0 and α₁ + α₂ > 0, since for ũ, defined as in the above proof, and a minimizer u* ∈ BV(Ω) ∩ C of problem (9) the inequalities in (8) hold, as well as ‖u* − g‖_{L¹(Ω)} > ‖ũ − g‖_{L¹(Ω)}. Problem (9) has already been considered in [33,44,45] and can be viewed as a generalization of the L²-TV model, since α₁ = 0 in (9) yields the L²-TV model and α₂ = 0 in (9) yields the L¹-TV model. Note that if an image is corrupted only by impulse noise, then the observed image g is in the dynamic range of the original image. For example, an image contaminated by salt-and-pepper noise may be written as [46]

g(x) = c_min with probability s₁, g(x) = c_max with probability s₂, and g(x) = û(x) with probability 1 − s₁ − s₂,

and for random-valued impulse noise g is described as

g(x) = d(x) with probability s and g(x) = û(x) with probability 1 − s,

with d being a uniformly distributed random variable in the image intensity range [c_min, c_max]. Hence, following Proposition 1, in such cases considering constrained total variation minimization would not change the minimizer and no improvement in the restoration quality can be expected. This is the reason why we restrict ourselves in the rest of the paper to images contaminated by Gaussian white noise and consider solely the L²-TV model.
It is clear that if a solution of the non-box-constrained optimization problem already fulfills the box-constraint, then it is of course equivalent to a minimizer of the box-constrained problem. However, note that the minimizer is not unique in general.
In the following we compare the solution of the box-constrained optimization problem (4) with the solution of the unconstrained minimization problem (2).

The Model Problem
In general K*K is not invertible, which causes difficulties in deriving the dual problem of (4). In order to overcome these difficulties we penalize the L²-TV model by considering the following neighboring problem

min_{u ∈ H¹₀(Ω), u ∈ C} ‖Ku − g‖² + µ‖∇u‖² + α ∫_Ω |∇u|_{ℓ²} dx,   (10)

where µ > 0 is a very small constant such that problem (10) is a close approximation of the total variation regularized problem (4). Note that for u ∈ H¹₀(Ω) the total variation of u in Ω is equivalent to ∫_Ω |∇u|_{ℓ²} dx [3]. A typical example for which K*K is indeed invertible is K = I, which is used for image denoising. In this case we may even set µ = 0; see Section 6. The objective functional in problem (10) has already been considered, for example, in [21,48] for image restoration. In particular, in [21] a primal-dual semi-smooth Newton algorithm is introduced. Here we adapt this approach to our box-constrained problem (10).
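For K = I the discrete objective of the neighboring problem can be evaluated as follows (a sketch under our own discretization choices: forward differences with Neumann boundary; the paper's exact stencils and constants may differ):

```python
import numpy as np

def smoothed_tv_energy(u, g, alpha, mu):
    """Discrete objective ||u - g||^2 + mu*||grad u||^2 + alpha * sum |grad u|_2
    for the denoising case K = I of the neighboring problem."""
    gx = np.diff(u, axis=0, append=u[-1:, :])  # forward differences, zero on last row
    gy = np.diff(u, axis=1, append=u[:, -1:])
    pointwise_norm = np.sqrt(gx**2 + gy**2)
    return np.sum((u - g)**2) + mu * np.sum(gx**2 + gy**2) + alpha * np.sum(pointwise_norm)
```

For µ = 0 this reduces to the discrete L²-TV energy; the µ-term provides the additional coercivity needed when K*K is not invertible.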
In the sequel we assume for simplicity that −c min = c max =: c > 0, which changes the set C to C := {u ∈ L 2 (Ω) : |u| ≤ c}. Note, that any bounded image û, i.e., which lies in the dynamic range [a, b], can be easily transformed to an image ũ ∈ [−c, c]. Since this transform and K are linear, the observation g is also easily transformed to g = Kũ + n.
Problem (10) can be equivalently written as

min_{u ∈ H¹₀(Ω)} ‖Ku − g‖² + µ‖∇u‖² + αR(u) + χ_C(u),   (11)

where R(u) := ∫_Ω |∇u|_{ℓ²} dx. If u is a solution of problem (10) (and equivalently of problem (11)), then there exist λ* ∈ H¹₀(Ω)* and σ* ∈ ∂R(u) such that the corresponding first-order optimality system holds. For implementation reasons (actually for obtaining a fast, second-order algorithm) we approximate the non-smooth characteristic function χ_C by a smooth function in the following way:

χ_C(u) ≈ (η/2) ‖max{|u| − c, 0}‖²_{L²(Ω)},

where η > 0 is large. This leads to the following optimization problem:

min_{u ∈ H¹₀(Ω)} ‖Ku − g‖² + µ‖∇u‖² + αR(u) + (η/2) ‖max{|u| − c, 0}‖²_{L²(Ω)}.   (12)

Remark 1. By the assumption −c_min = c_max we have actually excluded the cases (i) c_min = 0, c_max = +∞ and (ii) c_min = −∞, c_max = 0. In these situations we just need to approximate χ_C by (i) (η/2)‖max{−u, 0}‖²_{L²(Ω)} and (ii) (η/2)‖max{u, 0}‖²_{L²(Ω)}, respectively. Noting this, a primal-dual semi-smooth Newton method can be derived for these two cases in a similar fashion as done below for problem (12).
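The smooth replacement of χ_C and its semi-smooth derivative, which the Newton method differentiates in the generalized sense, can be sketched as follows (our naming; the penalty scaling η/2 is an assumption):

```python
import numpy as np

def box_penalty(u, c, eta):
    """(eta/2) * || max(|u| - c, 0) ||_2^2 : zero inside the box |u| <= c,
    quadratic growth outside; a smooth approximation of chi_C."""
    excess = np.maximum(np.abs(u) - c, 0.0)
    return 0.5 * eta * np.sum(excess**2)

def box_penalty_derivative(u, c, eta):
    """Pointwise derivative eta * max(|u| - c, 0) * sign(u); only semi-smooth
    at |u| = c, which is where the generalized Newton derivative is needed."""
    return eta * np.maximum(np.abs(u) - c, 0.0) * np.sign(u)
```

For large η (here η = 10⁶ in the experiments) the penalty forces the iterates to stay close to the box [−c, c].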

Dualization
By a standard calculation one obtains that the dual of problem (12) is given by problem (13), with Λ*p = −div p₁ + p₂ and A := {v ∈ L²(Ω) : |v|_{ℓ²} ≤ α}. As the divergence operator does not have a trivial kernel, the solution of the optimization problem (13) is not unique. In order to render problem (13) strictly concave, we add an additional term, yielding problem (14), where γ > 0 is a fixed parameter.
The proof of this statement is a bit technical and is therefore deferred to Appendix A. Similarly as in [21], one can show that the solution of problem (15) converges to the minimizer of (12) as γ → 0.
From the Fenchel duality theorem we obtain the characterization of the solutions u and p of problem (15) and problem (14) as the optimality system (16)–(19). This system can be solved efficiently by a semi-smooth Newton algorithm. Moreover, equations (17) and (18) can be condensed into

p₁ = α∇u / max{γ, |∇u|_{ℓ²}}.
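Assuming the condensed relation takes the form p₁ = α∇u / max{γ, |∇u|_{ℓ²}} (our reading of the γ-regularized optimality system, cf. [21]), it can be evaluated pixelwise as:

```python
import numpy as np

def dual_from_primal(grad_u, alpha, gamma):
    """Evaluate p1 = alpha * grad(u) / max(gamma, |grad(u)|_2) pixelwise;
    grad_u has shape (2, n, m). By construction |p1|_2 <= alpha everywhere,
    so the dual feasibility constraint p1 in A is satisfied automatically."""
    pointwise_norm = np.sqrt(grad_u[0]**2 + grad_u[1]**2)
    return alpha * grad_u / np.maximum(gamma, pointwise_norm)
```

The max with γ both regularizes the otherwise non-differentiable quotient and keeps the denominator bounded away from zero.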

Adaptation to Non-Scalar α
For locally adaptive α, i.e., when α : Ω → R⁺ is a function, the minimization problem (12) changes to problem (20), and its dual problem follows by the same calculation as before. Similarly, but slightly differently from above (cf. problem (14)), we penalize this problem, arriving at problem (21) and its dual. Denoting by u a solution of problem (21) and by p the solution of the associated pre-dual problem, the optimality conditions due to the Fenchel duality theorem [43] follow analogously to (16)–(19).

Numerical Implementation
Similarly as in the works [21,36–39], where semi-smooth Newton methods for non-smooth systems arising from image restoration models have been derived, we can solve the discrete version of the system (16)–(19), obtained using finite differences, efficiently by a primal-dual algorithm. To this end, let u^h ∈ R^N, (p₁^h, p₂^h), and g^h ∈ R^N denote the discrete image intensity, the dual variables, and the observed data vector, respectively, where N ∈ N is the number of elements (pixels) in the discrete image Ω^h. Moreover, we denote by α^h > 0 the regularization parameter. Correspondingly, we define ∇^h ∈ R^{2N×N} as the discrete gradient operator, Δ^h ∈ R^{N×N} as the discrete Laplace operator, K^h ∈ R^{N×N} as a discrete operator, and (K^h)^t its transpose. Moreover, div^h = −(∇^h)^t. Here |·|, max{·,·}, and sign(·) are understood component-wise for vectors. Moreover, we use the function
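A concrete finite-difference realization of ∇^h and of div^h = −(∇^h)^t (one common choice; the paper's exact stencils are not spelled out here) is:

```python
import numpy as np

def grad(u):
    """Forward-difference gradient with Neumann boundary; returns shape (2, n, m)."""
    gx = np.zeros_like(u)
    gy = np.zeros_like(u)
    gx[:-1, :] = u[1:, :] - u[:-1, :]
    gy[:, :-1] = u[:, 1:] - u[:, :-1]
    return np.stack([gx, gy])

def div(p):
    """Discrete divergence defined as the negative adjoint of grad,
    so that <grad(u), p> = -<u, div(p)> holds exactly."""
    px, py = p
    d = np.zeros_like(px)
    d[0, :] += px[0, :]
    d[1:-1, :] += px[1:-1, :] - px[:-2, :]
    d[-1, :] -= px[-2, :]
    d[:, 0] += py[:, 0]
    d[:, 1:-1] += py[:, 1:-1] - py[:, :-2]
    d[:, -1] -= py[:, -2]
    return d
```

The adjoint relation div^h = −(∇^h)^t can be verified numerically on random inputs, which is a useful sanity check before assembling the Newton system.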

Scalar α
The discrete version of (16)–(19) reads as the system (22). Applying a generalized Newton step to solve (22), we can eliminate δp₁ from the Newton system, resulting in a reduced system (24) with system matrix H^k. If H^k is positive definite, then the solution δu of (24) exists and is a descent direction of (15). However, in general we cannot expect positive definiteness of H^k. In order to ensure that H^k is positive definite, we project p onto its feasible set by setting (p₁^k)_i := (p₁^k)_i / max{1, |(p₁^k)_i|/α^h} for i = 1, ..., 2N. The modified system matrix, denoted by H^k₊, is then positive definite. Our semi-smooth Newton solver may then be written as:

Primal-dual Newton method (pdN): Initialize (u⁰, p^{1,0}) ∈ R^N × R^{2N} and set k := 0. While the stopping criterion (25) is not satisfied, compute H^k₊, solve the reduced Newton system for (δu, δp₁), set (u^{k+1}, p^{1,k+1}) := (u^k, p^{1,k}) + (δu, δp₁), and increase k by one.
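The dual projection used to obtain H^k₊ can be sketched as follows (written blockwise over the two gradient components per pixel; the exact componentwise variant used in the paper may differ):

```python
import numpy as np

def project_feasible(p, alpha):
    """Project the dual variable onto A = {v : |v|_2 <= alpha}: each pixel's
    2-vector is scaled back by max(1, |p_i|_2 / alpha). Points that are
    already feasible are left unchanged."""
    pointwise_norm = np.sqrt(p[0]**2 + p[1]**2)
    return p / np.maximum(1.0, pointwise_norm / alpha)
```

This is the standard radial retraction onto the pointwise ball of radius α; it restores dual feasibility without changing feasible entries.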
This algorithm converges at a superlinear rate, which follows from standard theory; see [20,21]. The Newton method is terminated as soon as the initial residual is reduced by a factor of 10 −4 .
Note that, since η = 0 implies p₂ = 0, in this case the proposed primal-dual Newton method reduces to the method in [21].

Non-Scalar α
A similar semi-smooth Newton method may be derived for the locally adaptive case by noting that then α^h ∈ R^N, and hence the second equation in (22) changes accordingly. The positive definite modified matrix is then obtained by projecting p onto its feasible set as in the scalar case, with α^h replaced by its pixelwise values.

Numerical Experiments
For our numerical studies we consider the images shown in Figure 1, of size 256 × 256 pixels, and in Figure 2. The image intensity range of all original images considered in this paper is [0, 1], i.e., c_min = 0 and c_max = 1. Our proposed algorithms automatically transform these images into the dynamic range [−c, c], here with c = 1/2. That is, letting û ∈ [0, 1] be the original image before any corruption, it is shifted to û − 1/2. Moreover, the solution generated by the semi-smooth Newton method is afterwards back-transformed, i.e., the generated solution ũ is transformed to ũ + 1/2. As a comparison of the restoration qualities of the restored images we use the PSNR [49] (peak signal-to-noise ratio), computed from the original image û before any corruption and the restored image u*, which is widely used as an image quality assessment measure, and the MSSIM [50] (mean structural similarity), which usually relates to perceived visual quality better than PSNR. In general, when comparing PSNR and MSSIM, larger values indicate better reconstructions.
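The PSNR can be computed as follows (the standard definition for images with intensity range [0, 1]; the exact constant used in [49] is assumed):

```python
import numpy as np

def psnr(u_orig, u_restored, peak=1.0):
    """Peak signal-to-noise ratio in dB: 10*log10(peak^2 / MSE). Larger
    values indicate a reconstruction closer to the original image."""
    mse = np.mean((u_orig - u_restored)**2)
    return 10.0 * np.log10(peak**2 / mse)
```

For example, a restoration offset from the original by a constant 0.1 everywhere yields an MSE of 0.01 and hence a PSNR of 20 dB.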
In our experiments we also report on the computational time (in seconds) and the number of iterations (it) needed until the considered algorithms are terminated.
In all the following experiments the parameter µ is chosen to be 0 for image denoising (i.e., K = I), since then no additional smoothing is needed, and µ = 10⁻⁶ if K ≠ I (i.e., for image deblurring, image inpainting, reconstruction from partial Fourier data, and reconstruction from sampled Radon transform).

Dependency on the Parameter η
We start by investigating the influence of the parameter η on the behavior of the semi-smooth Newton algorithm and its generated solution. Recall that η determines how strictly the box-constraint is adhered to. In order to visualize how well the box-constraint is fulfilled for a chosen η, in Figure 3 we depict max_x u(x) − c_max and min_x u(x) − c_min with c_max = 1, c_min = 0, and u being the back-transformed solution, i.e., u = ũ + 1/2, where ũ is obtained via the semi-smooth Newton method. As long as max_x u(x) − c_max is positive or min_x u(x) − c_min is negative, the box-constraint is not perfectly adhered to. From our experiments for image denoising and image deblurring, see Figure 3, we clearly see that the larger η is, the more strictly the box-constraint is adhered to. In the rest of our experiments we choose η = 10⁶, which seems sufficiently large to us, as the box-constraint then holds accurately enough.
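The two quantities plotted in Figure 3 can be computed directly (our helper; c_max = 1 and c_min = 0 as above):

```python
import numpy as np

def box_violation(u, c_min=0.0, c_max=1.0):
    """Return (max u - c_max, min u - c_min): a positive first entry or a
    negative second entry means the box-constraint is violated somewhere."""
    return float(u.max() - c_max), float(u.min() - c_min)
```

Both values tend to zero as η grows, which is exactly the behavior observed in Figure 3.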

Box-Constrained Versus Non-Box-Constrained
In the rest of this section we investigate how much the solution (and its restoration quality) depends on the box-constraint and whether this is a matter of how the regularization parameter is chosen. We start by comparing, for different values of α, the solutions obtained by the semi-smooth Newton method without a box-constraint (i.e., η = 0) with the ones generated by the same algorithm with η = 10⁶ (i.e., with a box-constraint incorporated). The obtained results are shown in Table 1 for image denoising and in Table 2 for image deblurring. We observe that for small α we obtain "much" better results with respect to PSNR and MSSIM with a box-constraint than without. The reason for this is that if no box-constraint is used and α is small, then nearly no regularization is performed and hence noise, which violates the box-constraint, is still present. Therefore, incorporating a box-constraint is reasonable for these choices of parameters. However, if α is sufficiently large, then we numerically observe that the solutions of the box-constrained and the non-box-constrained problem are the same. This is not surprising, because there exists ᾱ > 0 such that for all α > ᾱ the solution of problem (2) is the constant (1/|Ω|) ∫_Ω g dx. That is, for such α the minimizer of problem (2) is the average of the observation, which lies in the image intensity range of the original image as long as the mean of the Gaussian noise is 0 (or sufficiently small). This implies that in such a case the minimizers of problem (2) and problem (4) are equivalent. Actually, this equivalence already holds if α is sufficiently large such that the respective solution of problem (2) lies in the dynamic range of the original image, which is the case in our experiments for α = 0.4. Hence, whether it makes sense to incorporate a box-constraint into the considered model depends on the choice of parameters.
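The large-α limit can be checked numerically: among constant images (which have zero total variation) the data term ‖u − g‖² is minimized by the mean of g, so once α exceeds the threshold ᾱ the minimizer of problem (2) is this constant (a sketch for K = I; function names are ours):

```python
import numpy as np

def data_term(const_value, g):
    """Data fidelity ||u - g||^2 for the constant image u == const_value; the
    total variation of a constant image is zero, so for large alpha
    minimizing over constants reduces to minimizing this term."""
    return float(np.sum((const_value - g)**2))

def large_alpha_minimizer(g):
    """Constant image equal to the mean of g: the minimizer of problem (2)
    once alpha exceeds the threshold discussed above (K = I)."""
    return np.full_like(g, g.mean())
```

Since g ∈ [0, 1] implies mean(g) ∈ [0, 1], this limit solution automatically satisfies the box-constraint, which is why the constrained and unconstrained minimizers coincide for large α.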
The third and fourth values of α in Tables 1 and 2 refer to the ones which equalize problems (2) and (1), and problems (4) and (3), respectively. In the sequel we call such parameters optimal, since a solution of the penalized problem then also solves the related constrained problem. However, we note that these α-values in general do not give the best results with respect to PSNR and MSSIM, but they are usually close to the results with the largest PSNR and MSSIM. For both types of applications, i.e., image denoising and image deblurring, these optimal α-values are nearly the same for problems (2) and (1), and for problems (4) and (3), respectively, and hence also the PSNR and MSSIM of the respective results are nearly the same. Nevertheless, we mention that for image deblurring the largest PSNR and MSSIM in these experiments are obtained for α = 0.01 with a box-constraint. In Tables 3 and 4 we also report on an additional strategy. In this approach we threshold (or project) the observation g such that the box-constraint holds in every pixel and then use the proposed Newton method with η = 0. For large α this is an inferior approach, but for small α it seems to work similarly to incorporating a box-constraint, at least for image denoising. However, it is outperformed by the other approaches.

Comparison with Optimal Regularization Parameters
In order to determine the optimal parameters α for a range of different examples, we assume that the noise-level σ is at hand and utilize the pAPS-algorithm presented in [36]. Alternatively, instead of computing a suitable α, we may solve the constrained optimization problems (1) and (3) directly by using the alternating direction method of multipliers (ADMM). An implementation of the ADMM for solving problem (1) is presented in [51], which we refer to as the ADMM in the sequel. For solving problem (3) a possible implementation is suggested in [27]. However, for comparison purposes we use a slightly different version, which uses the same succession of updates as the ADMM in [51]; see Appendix B for a description of this version. In the sequel we refer to this algorithm as the box-constrained ADMM. We do not expect the same results for the pAPS-algorithm and the (box-constrained) ADMM, since in the pAPS-algorithm we use the semi-smooth Newton method, which generates an approximate solution of problem (15), and this is not equivalent to problem (1) or problem (3). In all experiments, the initial regularization parameter in the pAPS-algorithm is set to 10⁻³.

pdN versus ADMM
We start by comparing the performance of the proposed primal-dual semi-smooth Newton method (pdN) and the ADMM. In these experiments we assume that we know the optimal parameters α, which are then used in the pdN. Note that a fair comparison of these two methods is difficult, since they solve different optimization problems, as already mentioned above. However, we still compare them in order to better understand the performance of the algorithms in the following section.
The comparison is performed for image denoising and image deblurring and the respective findings are collected in Tables 5 and 6. From these we clearly observe that the proposed pdN with η = 10⁶ reaches the desired reconstruction in all experiments significantly faster than the box-constrained ADMM. While the number of iterations for image denoising is approximately the same for both methods, for image deblurring the box-constrained pdN needs significantly fewer iterations than the other method. In particular, the pdN needs nearly the same number of iterations independently of the application, although more iterations are needed for small σ. Note that the pdN converges at a superlinear rate, so a faster convergence than the box-constrained ADMM is not surprising but supports the theory.

Image Denoising

In Tables 7 and 8 we summarize our findings for image denoising. We observe that adding a box-constraint to the considered optimization problem leads to a possible slight improvement in PSNR and MSSIM. While in some cases there is some improvement (see, for example, the image "numbers"), in other examples no improvement is gained (see, for example, the image "barbara"). In order to make the overall improvement more visible, in the last row of Tables 7 and 8 we add the average PSNR and MSSIM over all computed restorations. It shows that on average we may expect a gain of around 0.05 in PSNR and around 0.001 in MSSIM, which is nearly nothing. Moreover, we observe that the pAPS-algorithm computes the optimal α for the box-constrained problem on average faster than for the non-box-constrained problem. We remark that the box-constrained version needs fewer (or at most the same number of) iterations than the version with η = 0. The reason for this might be that in each iteration, due to the thresholding of the approximation by the box-constraint, a longer or better step towards the minimizer is performed than by the non-box-constrained pAPS-algorithm.
At the same time, the reconstructions of the box-constrained pAPS-algorithm yield higher PSNR and MSSIM values than the ones obtained by the pAPS-algorithm with η = 0. The situation seems to be different for the ADMM: on average, the box-constrained ADMM and the (non-box-constrained) ADMM need approximately the same run-time. For several examples (i.e., the images "phantom", "cameraman", "barbara", "house") the choice of the regularization parameter by the box-constrained pAPS-algorithm with respect to the noise-level is depicted in Figure 4. Clearly, the parameter is selected to be smaller the less noise is present in the image. We now ask whether a box-constraint is more important when the regularization parameter is non-scalar, i.e., when α : Ω → R⁺ is a function. For computing suitable locally varying α we use the pLATV-algorithm proposed in [36], whereby in all considered examples we set the initial (non-scalar) regularization parameter to the constant 10⁻². Note that the continuity assumption on α in problems (5) and (6) is not needed in our discrete setting, since ∑_{x ∈ Ω^h} α(x)|∇^h u^h(x)| is well defined for any α ∈ R^N. We approximate such α for problem (20) with η = 0 (unconstrained) and with η = 10⁶ (box-constrained) and observe also here that the gain with respect to PSNR and MSSIM is of the same order as in the scalar case; see Table 9. For σ = 0.1 and the image "barbara" we show in Figure 5 the reconstructions generated by the considered algorithms. As indicated by the quality measures, all reconstructions look nearly alike, whereby in the reconstructions produced by the pLATV-algorithm details, like the pattern of the scarf, are (slightly) better preserved. The spatially varying α of the pLATV-algorithm is depicted in Figure 6. There we clearly see that at the scarf around the neck and shoulders the values of α are small, allowing the details to be preserved better.

Image Deblurring
Now we consider the images in Figure 1a–c, convolve them first with a Gaussian kernel of size 9 × 9 and standard deviation 3, and then add Gaussian noise with mean 0 and standard deviation σ. Here we again compare the results obtained by the pAPS-algorithm, the ADMM, and the pLATV-algorithm for the box-constrained and non-box-constrained problems. Our findings are summarized in Table 10. Also here we observe a slight improvement with respect to PSNR and MSSIM when a box-constraint is used. The choice of the regularization parameters by the box-constrained pAPS-algorithm is depicted in Figure 7. In Figure 8 we present, for the image "cameraman" and σ = 0.01, the reconstructions produced by the respective methods. Also here, as indicated by the quality measures, all restorations look nearly the same. The locally varying α generated by the pLATV-algorithm are depicted in Figure 9.

Table 10. PSNR- and MSSIM-values of the reconstructions of different images corrupted by Gaussian blur and Gaussian white noise, via pAPS or pLATV using the primal-dual Newton method, or via the ADMM by solving the constrained versions.

Image Inpainting
The problem of filling in and recovering missing parts of an image is called image inpainting. We call the missing part the inpainting domain and denote it by D ⊂ Ω. The linear bounded operator K is then a multiplier, i.e., Ku = 1_{Ω\D} · u, where 1_{Ω\D} is the indicator function of Ω \ D. Note that K is not injective and hence K*K is not invertible. Therefore, in this experiment we need to set µ > 0 so that we can use the proposed primal-dual semi-smooth Newton method. In particular, as mentioned above, we choose µ = 10⁻⁶.
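The multiplier operator K and the resulting singular K*K can be sketched as follows (our mask convention: True on Ω \ D, False on the inpainting domain D):

```python
import numpy as np

def make_inpainting_operator(mask):
    """K u = 1_{Omega \\ D} * u : keeps pixels where the image is known
    (mask True) and zeroes the inpainting domain D (mask False)."""
    def K(u):
        return np.where(mask, u, 0.0)
    return K
```

Since K is a diagonal projection, K*K u = 1_{Ω\D} · u vanishes on D, so K*K has a nontrivial kernel; this is precisely why the µ-term is needed here.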
In the considered experiments the inpainting domain consists of gray bars, as shown in Figure 10a, where additionally additive white Gaussian noise with σ = 0.1 is present. In particular, we consider examples with σ ∈ {0.3, 0.2, 0.1, 0.05, 0.01, 0.005}. The performance of the pAPS- and pLATV-algorithm with and without a box-constraint on the considered examples is summarized in Tables 11 and 12. We observe that adding a box-constraint does not seem to change the restoration considerably. However, as in the case of image denoising, the pAPS-algorithm with box-constrained pdN needs fewer iterations, and hence less time, than the same algorithm without a box-constraint to reach the stopping criterion. Figure 10 shows a particular example of image inpainting and denoising with σ = 0.1. It demonstrates that visually there is nearly no difference between the restorations obtained by the considered approaches. Moreover, we observe that the pLATV-algorithm seems not suited to the task of image inpainting. A reason for this might be that the pLATV-algorithm does not take the inpainting domain correctly into account. This is visible in Figure 11, where the spatially varying α is chosen small in the inpainting domain, which does not seem to be a suitable choice.

Figure 11. Spatially varying regularization parameter generated by the respective pLATV-algorithm.

Reconstruction from Partial Fourier-Data
In magnetic resonance imaging one wishes to reconstruct an image which is only given by partial Fourier data, additionally distorted by additive Gaussian noise with zero mean and standard deviation σ. Hence, the linear bounded operator is K = S ∘ F, where F is the 2D Fourier matrix and S is a downsampling operator which selects only a few output frequencies. The frequencies are usually sampled along radial lines in the frequency domain, in our experiments along 32 radial lines, as visualized in Figure 12. In our experiments we consider the images of Figure 2, transformed to their Fourier coefficients.
As already mentioned, we sample the frequencies along 32 radial lines and add Gaussian noise with zero mean and standard deviation σ. In particular, we consider different noise-levels, i.e., σ ∈ {0.3, 0.2, 0.1, 0.05, 0.01, 0.005}. We reconstruct the obtained data via the pAPS- and pLATV-algorithm using the semi-smooth Newton method, first with η = 0 (no box-constraint) and then with η = 10^6 (with box-constraint). In Table 13 we collect our findings. We observe that the pLATV-algorithm seems not suitable for this task, since it generates inferior results. For scalar α we observe, as before, that a slight improvement with respect to PSNR and MSSIM can be expected when a box-constraint is used. In Figure 13 we present the reconstructions generated by the considered algorithms for a particular example, demonstrating the visual behavior of the methods.
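The operator K = S ∘ F can be sketched as a 2D FFT followed by restriction to a radial-line frequency mask. The line geometry and centring below are our own assumptions for illustration, not necessarily the exact sampling pattern of Figure 12.

```python
import numpy as np

def radial_line_mask(n, n_lines=32):
    """Boolean n-by-n frequency mask sampling along n_lines radial lines
    through the centre of the (shifted) 2D Fourier plane."""
    mask = np.zeros((n, n), dtype=bool)
    c = (n - 1) / 2.0
    radii = np.linspace(-c, c, 2 * n)            # sample points along each line
    for k in range(n_lines):
        theta = np.pi * k / n_lines              # line angles cover [0, pi)
        ii = np.clip(np.round(c + radii * np.sin(theta)).astype(int), 0, n - 1)
        jj = np.clip(np.round(c + radii * np.cos(theta)).astype(int), 0, n - 1)
        mask[ii, jj] = True
    return mask

def K(u, mask):
    """K = S o F: centred 2D FFT followed by downsampling to the sampled
    frequencies (returned as a vector of complex measurements)."""
    return np.fft.fftshift(np.fft.fft2(u))[mask]

mask = radial_line_mask(64, n_lines=32)
data = K(np.random.default_rng(0).random((64, 64)), mask)
```

Because the mask keeps far fewer than n² frequencies, K is not injective, which is precisely why the TV regularizer (and possibly the box-constraint) must supply the missing information.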

Reconstruction from Sampled Radon-Data
In computerized tomography, instead of a Fourier-transform a Radon-transform is used in order to obtain a visual image from the measured physical data. Also here the data is obtained along radial lines. We consider the Shepp-Logan phantom, see Figure 14a, and a slice of a body, see Figure 15a. The sinograms in Figures 14b and 15b are obtained by sampling along 30 and 60 radial lines, respectively. Note that the sinogram is in general noisy. Here the data is corrupted by Gaussian white noise with standard deviation σ, whereby σ = 0.1 for the data of the Shepp-Logan phantom and σ = 0.05 for the data of the slice of the body. Using the inverse Radon-transform we obtain Figure 16a,d, which are obviously suboptimal reconstructions. A more sophisticated approach utilizes the L2-TV model, which yields the reconstructions depicted in Figure 16b,e, where we use the pAPS-algorithm and the proposed primal-dual algorithm with η = 0. However, since an image can be assumed to have non-negative values, we may incorporate a non-negativity constraint via the box-constrained L2-TV model, yielding the results in Figure 16c,f, which are much better reconstructions. Also here the parameter α is automatically computed by the pAPS-algorithm, and the non-negativity constraint is incorporated by setting η = 10^6 in the semi-smooth Newton method. In order to compute the Radon-matrix in our experiments we used the FlexBox [52]. Other applications where a box-constraint, and in particular a non-negativity constraint, improves the image reconstruction quality significantly include, for example, magnetic particle imaging; see for example [53] and references therein.
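The effect of a non-negativity constraint in such an inverse problem can be illustrated with a generic projected-gradient stand-in (this is not the paper's semi-smooth Newton method, whose η-penalty enforces the constraint in the limit η → ∞; the toy forward matrix below is purely hypothetical).

```python
import numpy as np

def projected_landweber(K, g, step, n_iter=500):
    """Minimise ||K u - g||^2 over u >= 0: at each iteration take a step
    along -K^T (K u - g) and then project onto the non-negative orthant."""
    u = np.zeros(K.shape[1])
    for _ in range(n_iter):
        u = u - step * K.T @ (K @ u - g)
        u = np.maximum(u, 0.0)                 # projection: enforce u >= 0
    return u

# Tiny synthetic example with a non-negative ground truth, standing in for
# the (much larger) Radon-matrix of the tomography experiments.
rng = np.random.default_rng(1)
K = rng.random((30, 10))
u_true = np.abs(rng.random(10))
g = K @ u_true + 0.01 * rng.standard_normal(30)

step = 1.0 / np.linalg.norm(K, 2) ** 2         # stable step size
u = projected_landweber(K, g, step)
assert np.all(u >= 0.0)
```

The projection step is what rules out the negative over- and undershoots that a plain least-squares or unconstrained TV reconstruction may produce, mirroring the improvement seen in Figure 16c,f.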

Automated Parameter Selection
We recall that if the noise-level σ is not known, then problems (1) and (3) cannot be considered. Moreover, the selection of the parameter α in problem (2) cannot be achieved by using the pAPS-algorithm, since this algorithm is based on problem (1). Note that also other methods, like the unbiased predictive risk estimator (UPRE) [54,55] and approaches based on the Stein unbiased risk estimator (SURE) [56-60], use knowledge of the noise-level and hence cannot be used for selecting a suitable parameter if σ is unknown.
If we assume that σ is unknown but the image intensity range of the original image û is known, i.e., û(x) ∈ [c_min, c_max] for a.e. x ∈ Ω, then we may use this information for choosing the parameter α in problem (2). This may be performed by applying the following algorithm:

Box-constrained automatic parameter selection (bcAPS): Initialize α_0 > 0 (sufficiently small) and set n := 0.
1. Solve u_n ∈ arg min_{u∈BV(Ω)} ‖Ku − g‖² + α_n ∫_Ω |Du|.
2. If u_n ∉ C, set α_{n+1} := τ · α_n, n := n + 1, and return to step 1; otherwise stop.
Here τ > 1 is an arbitrary parameter chosen manually such that the generated restoration u is not over-smoothed, i.e., there exists x ∈ Ω such that u(x) ≈ c_min and/or u(x) ≈ c_max. In our experiments it turned out that τ = 1.05 seems to be a reasonable choice, so that the generated solution has the desired property.
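The bcAPS loop can be sketched as follows. To keep the sketch self-contained, the TV subproblem of step 1 is replaced here by a quadratic H¹ smoother with a closed-form Fourier solution; this is a stand-in of our own choosing, not the paper's TV solver, and the stopping rule (stop once the reconstruction lies in the box) is our reading of the algorithm.

```python
import numpy as np

def solve_smooth(g, alpha):
    """Stand-in for step 1 of bcAPS: closed-form minimiser of
    ||u - g||^2 + alpha * ||u'||^2 with periodic boundary conditions,
    computed in the Fourier domain (an H^1 smoother replacing TV)."""
    n = g.size
    lam = 2.0 - 2.0 * np.cos(2.0 * np.pi * np.arange(n) / n)  # Laplacian eigenvalues
    return np.real(np.fft.ifft(np.fft.fft(g) / (1.0 + alpha * lam)))

def bcaps(g, c_min, c_max, alpha0=1e-4, tau=1.05, max_iter=2000):
    """bcAPS sketch: enlarge alpha by the factor tau until the
    reconstruction satisfies the box-constraint c_min <= u <= c_max."""
    alpha = alpha0
    for _ in range(max_iter):
        u = solve_smooth(g, alpha)
        if c_min <= u.min() and u.max() <= c_max:
            return u, alpha          # box satisfied: stop
        alpha *= tau                 # step 2: increase the parameter
    return u, alpha

# Noisy ramp whose values leave [0, 1]; bcAPS smooths it back into the box.
rng = np.random.default_rng(0)
g = np.linspace(0.0, 1.0, 100) + 0.3 * rng.standard_normal(100)
u, alpha = bcaps(g, 0.0, 1.0)
```

Since τ = 1.05 increases α only gently, the loop stops at (approximately) the smallest α for which the box-constraint holds, which is why the resulting u still touches the bounds and is not over-smoothed.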

Numerical Examples
In our experiments the minimization problem in step 1 of the bcAPS algorithm is approximately solved by the proposed primal-dual semi-smooth Newton method with η = 0. We set the initial regularization parameter α 0 = 10 −4 for image denoising and α 0 = 10 −3 for image deblurring. Moreover, we set τ = 1.05 in the bcAPS-algorithm to increase the regularization parameter.
Experiments for image denoising, see Table 14, show that the bcAPS-algorithm finds suitable parameters in the sense that the PSNR and MSSIM of its reconstructions are similar to the ones obtained with the pAPS-algorithm (when σ is known); compare also with Tables 7 and 10. This is explained by the observation that the regularization parameters α calculated by the bcAPS-algorithm do not differ much from the ones obtained via the pAPS-algorithm. For image deblurring, see Table 15, the situation is less convincing. In particular, the regularization parameters obtained by the two considered methods differ more significantly than before, resulting in different PSNR and MSSIM. However, in the case σ = 0.05 the considered quality measures of the generated reconstructions are nearly the same. We also remark that in all experiments the pAPS-algorithm generated reconstructions with larger PSNR and MSSIM than the ones obtained by the bcAPS-algorithm. From this observation it seems more useful to know the noise-level than the image intensity range. However, if the noise-level is unknown but the image intensity range is known, then the bcAPS-algorithm may be a suitable choice.

Conclusions
In this work we investigated the quality of restored images when the image intensity range of the original image is additionally incorporated into the L2-TV model as a box-constraint. We observe that this box-constraint may indeed improve the quality of reconstructions. However, if the observation already fulfills the box-constraint, then it clearly does not change the solution at all. Moreover, in many applications the proper choice of the regularization parameter seems much more important than an additional box-constraint; even then a box-constraint may improve the quality of the restored image, although the improvement is only marginal. On the other hand, the additional box-constraint may affect the computational time significantly: while for image denoising and inpainting the box-constrained pAPS-algorithm needs less time, for image deblurring and in magnetic resonance imaging its computational time approximately doubles, while the quality of the restoration is basically not improved. This suggests that for these applications an additional box-constraint may not be reasonable. Note that the run-time of the ADMM is independent of whether a box-constraint is used or not.
For certain applications, such as computerized tomography, a box-constraint (in particular a non-negativity constraint) improves the reconstruction considerably. Hence, the question arises under which conditions an additional box-constraint indeed has significant influence on the reconstruction when the present parameters are chosen in a nearly optimal way.
If the noise-level of a corrupted image is unknown but the image intensity range of the original image is at hand, then the image intensity range may be used to calculate a suitable regularization parameter α, as explained in Section 7. Potential future research may consider different approaches, for example in an optimal control setting. Then one may want to solve

min_{u,α} ‖max{u − c_max, 0}‖²_{L²(Ω)} + ‖max{c_min − u, 0}‖²_{L²(Ω)} + κ J(α)
s.t. u ∈ arg min_{v∈BV(Ω)} ‖Kv − g‖² + α ∫_Ω |Dv|,

where κ > 0 and J is a suitable functional; cf. [61-63] for other optimal control approaches in image reconstruction.

Conflicts of Interest:
The author declares no conflict of interest.