Directional TGV-Based Image Restoration under Poisson Noise

We are interested in the restoration of noisy and blurry images where the texture mainly follows a single direction (i.e., directional images). Problems of this type arise, for example, in microscopy or computed tomography for carbon or glass fibres. In order to deal with these problems, the Directional Total Generalized Variation (DTGV) was developed by Kongskov et al. in 2017 and 2019, in the case of impulse and Gaussian noise. In this article we focus on images corrupted by Poisson noise, extending the DTGV regularization to image restoration models where the data fitting term is the generalized Kullback–Leibler divergence. We also propose a technique for the identification of the main texture direction, which improves upon the techniques used in the aforementioned work about DTGV. We solve the problem by an ADMM algorithm with proven convergence and subproblems that can be solved exactly at a low computational cost. Numerical results on both phantom and real images demonstrate the effectiveness of our approach.


Introduction
Poisson noise appears in processes where digital images are obtained by counting particles (generally photons). This is the case of X-ray computed tomography, positron emission tomography, confocal and fluorescence microscopy, and optical/infrared astronomical imaging, to name just a few applications (see, e.g., [1] and the references therein). In this case, the object to be restored can be represented as a vector u ∈ R^n and the data can be assumed to be a vector b ∈ N₀^n, whose entries b_j are sampled from n independent Poisson random variables B_j with probability

P(B_j = b_j) = exp(−(Au + γ)_j) (Au + γ)_j^{b_j} / b_j! .

The matrix A = (a_{ij}) ∈ R^{n×n} models the observation mechanism of the imaging system and the following standard assumptions are made:

a_{ij} ≥ 0 for all i, j,   Σ_{i=1}^n a_{ij} = 1 for all j.   (1)

The vector γ ∈ R^n, with γ > 0, models the background radiation detected by the sensors. By applying a maximum-likelihood approach [1,2], we can estimate u by minimizing the Kullback–Leibler (KL) divergence of Au + γ from b:

D_KL(Au + γ, b) = Σ_{j=1}^n ( (Au + γ)_j − b_j + b_j ln( b_j / (Au + γ)_j ) ),   (2)

where we set b_j ln( b_j / (Au + γ)_j ) = 0 whenever b_j = 0. Regularization is usually introduced in (2) to deal with the ill-conditioning of this problem. The Total Variation (TV) regularization [3] has been widely used in this context, because it preserves edges and is able to smooth flat areas of the image. However, since it may produce staircase artifacts, other TV-based regularizers have been proposed. For example, the Total Generalized Variation (TGV) has been proposed and applied in [4,5,6,7] to overcome the staircasing effect while keeping the ability to identify edges. On the other hand, to improve the quality of restoration for directional images, the Directional TV (DTV) regularization has been considered in [8], in the discrete setting. In [9,10], a regularizer combining DTV and TGV, named Directional TGV (DTGV), has been successfully applied to directional images affected by impulse and Gaussian noise.
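For concreteness, the generalized KL divergence (2) can be computed in a few lines of numpy. This is an illustrative sketch (the function name `kl_divergence` is ours), using the convention that the logarithmic term vanishes where b_j = 0.

```python
import numpy as np

def kl_divergence(au_gamma, b):
    """Generalized Kullback-Leibler divergence of au_gamma = A u + gamma from b.

    Adopts the convention that the logarithmic term vanishes where b_j = 0.
    """
    x = np.asarray(au_gamma, dtype=float)
    b = np.asarray(b, dtype=float)
    pos = b > 0
    return np.sum(x - b) + np.sum(b[pos] * np.log(b[pos] / x[pos]))
```

The divergence is zero when Au + γ = b and strictly positive otherwise, which is what makes it a natural data-fidelity term for Poisson data.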
Given an image u ∈ R^n, the discrete second-order Directional TGV of u is defined as

DTGV²_α(u) = min_{w ∈ R^{2n}}  α₀ ‖∇u − w‖_{2,1|R^{2n}} + α₁ ‖Ew‖_{2,1|R^{4n}},   (3)

where w ∈ R^{2n}, ∇ ∈ R^{2n×n} and E ∈ R^{4n×2n} are the discrete directional gradient operator and the directional symmetrized derivative, respectively, and α₀, α₁ ∈ (0, +∞). For any vector v ∈ R^{2n} we set

‖v‖_{2,1|R^{2n}} = Σ_{j=1}^n √( v_j² + v_{n+j}² ),   (4)

and for any vector y ∈ R^{4n} we set

‖y‖_{2,1|R^{4n}} = Σ_{j=1}^n √( y_j² + y_{n+j}² + y_{2n+j}² + y_{3n+j}² ).   (5)
Given an angle θ ∈ [−π, π] and a scaling parameter a > 0, the discrete directional gradient operator has the form

∇ = [ D_θ ; a D_{θ⊥} ] ∈ R^{2n×n},   with  D_θ = cos(θ) D_H + sin(θ) D_V  and  D_{θ⊥} = −sin(θ) D_H + cos(θ) D_V,

where D_θ, D_{θ⊥} ∈ R^{n×n} represent the forward finite-difference operators along the directions determined by θ and θ⊥ = θ + π/2, respectively, and D_H, D_V ∈ R^{n×n} represent the forward finite-difference operators along the horizontal and the vertical direction, respectively. Moreover, the directional symmetrized derivative is defined in block-wise form as

E = [ D_θ  0 ; ½ a D_{θ⊥}  ½ D_θ ; ½ a D_{θ⊥}  ½ D_θ ; 0  a D_{θ⊥} ] ∈ R^{4n×2n}.

It is worth noting that, by fixing θ = 0 and a = 1, we have D_θ = D_H and D_{θ⊥} = D_V, and the operators ∇ and E define the TGV² regularization [4]. We observe that the definitions of both the matrix A and the finite-difference operators D_H and D_V depend on the choice of boundary conditions. We make the following assumption.
Assumption 0.1. We assume that periodic boundary conditions are considered for A, D H and D V . Therefore, those matrices are Block Circulant with Circulant Blocks (BCCB).
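The periodic forward differences and the directional operator can be sketched in a few lines of numpy. This is an illustrative, matrix-free version (function names are ours); the BCCB property of Assumption 0.1 means that applying any of these operators is equivalent to a pointwise multiplication in the Fourier domain.

```python
import numpy as np

def d_h(u):
    """Forward horizontal difference with periodic boundary conditions."""
    return np.roll(u, -1, axis=1) - u

def d_v(u):
    """Forward vertical difference with periodic boundary conditions."""
    return np.roll(u, -1, axis=0) - u

def d_theta(u, theta, a=1.0):
    """Directional derivatives (D_theta u, a * D_theta_perp u)."""
    g_h, g_v = d_h(u), d_v(u)
    return (np.cos(theta) * g_h + np.sin(theta) * g_v,
            a * (np.cos(theta) * g_v - np.sin(theta) * g_h))
```

Since these operators are BCCB, their eigenvalues are the 2-D DFT of their impulse response, so for instance `d_h(u)` coincides with `ifft2(fft2(d_h(delta)) * fft2(u))`, where `delta` is a unit impulse image; this is the property exploited later to solve the ADMM subproblem in x.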
In this work we focus on directional images affected by Poisson noise, with the aim of assessing the behaviour of DTGV in this case. Besides extending the use of DTGV to Poisson noise, we introduce a novel technique for estimating the main direction of the image, which appears to be more efficient than the techniques applied in [9,10]. We solve the resulting optimization problem by using a customized version of the Alternating Direction Method of Multipliers (ADMM). We note that all the ADMM subproblems can be solved exactly at a low cost, thanks also to the use of FFTs, and that the method has proven convergence. Finally, we show the effectiveness of our approach on a set of test images corrupted by out-of-focus and Gaussian blurs and by noise with different signal-to-noise ratios. In particular, the KL-DTGV² model of our problem is described in Section 2 and the technique for estimating the main direction is presented in Section 3. A detailed description of the ADMM version used for the minimization is given in Section 4 and the results of the numerical experiments are discussed in Section 5. Conclusions are given in Section 6. Throughout this work we denote matrices with uppercase lightface letters, vectors with lowercase boldface letters and scalars with lowercase lightface letters. All the vectors are column vectors. Given a vector v, we use v_i or (v)_i to denote its i-th entry. We use R₊ to indicate the set of real nonnegative numbers and ‖·‖ to indicate the two-norm. For brevity, given any vectors v and w, we use the notation (v, w) instead of [vᵀ wᵀ]ᵀ. Likewise, given any scalars v and w, we use (v, w) to indicate the vector [v w]ᵀ.

The KL-DTGV² Model
We briefly describe the KL-DTGV² model for the restoration of directional images corrupted by Poisson noise. Let b ∈ R^n be the observed image. We want to recover the original image by minimizing a combination of the KL divergence (2) and the DTGV² regularizer (3), i.e., by solving the optimization problem

min_{u ∈ R^n₊, w ∈ R^{2n}}  λ D_KL(Au + γ, b) + α₀ ‖∇u − w‖_{2,1|R^{2n}} + α₁ ‖Ew‖_{2,1|R^{4n}},   (6)

where A ∈ R^{n×n}, γ, b ∈ R^n, and ∇ ∈ R^{2n×n} and E ∈ R^{4n×2n} are the linear operators defining the DTGV² regularization. The parameters λ ∈ (0, +∞) and α₀, α₁ ∈ (0, 1) determine the balance between the KL data-fidelity term and the two components of the regularization term. We note that problem (6) is a nonsmooth convex optimization problem because of the properties of the KL divergence (see, e.g., [11]) and the DTGV operator (see, e.g., [10]).

Efficient Estimation of the Image Direction
An essential ingredient of the DTGV regularization is the estimation of the angle θ representing the image texture direction. In [10], an estimation algorithm based on the one in [12] is proposed, whose basic idea is to compute a pixelwise direction estimate and then take θ as the average of the pixelwise estimates. In [9], which focuses on impulse noise removal, a more efficient and robust algorithm for estimating the direction is presented, based on the Fourier transform. The main idea behind this algorithm is to exploit the fact that two-dimensional Fourier basis functions can be seen as images with one-directional patterns. However, despite being very efficient from a computational viewpoint, this technique does not appear to be fully reliable in our tests on Poissonian images (see Section 5.1). Therefore, we propose a different approach for estimating the direction, based on classical tools of image processing: the Sobel filter [13] and the Hough transform [14,15]. Our technique is based on the idea that if an image has a one-directional structure, i.e., its main pattern consists of stripes, then the edges of the image mainly consist of lines going in the direction of the stripes. The first stage of the proposed algorithm uses the Sobel filter to determine the edges of the noisy and blurry image. Then, the Hough transform is applied to the edge image in order to detect the lines. The Hough transform is based on the idea that each straight line can be identified by a pair (r, η), where r is the distance of the line from the origin and η is the angle between the x axis and the segment connecting the origin with its orthogonal projection onto the line. The output of the transform is a matrix in which each entry is associated with a pair (r, η), i.e., with a straight line in the image, and its value is the sum of the values of the pixels lying on that line.
Hence, the elements with the highest value in the Hough transform indicate the lines that are most likely to be present in the input image. Because of its definition, the Hough transform tends to overestimate diagonal lines in rectangular images (diagonal lines through the central part of the image contain the largest number of pixels); therefore, before computing the transform we apply a mask to the edge image, considering only the pixels inside the largest circle centered at the center of the image. After the Hough transform has been applied, we compute the square of the two-norm of each column of the matrix resulting from the transform, to determine a score for each angle from −90° to 90°. Intuitively, the score for each angle is related to the number of lines with that particular inclination which have been detected in the image. Finally, we set the direction estimate θ ∈ [−π, π] according to η_max, the value of η corresponding to the maximum score. A pseudocode for the estimation algorithm is provided in Algorithm 1 and an example of the algorithm workflow is given in Figure 1.
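The two-stage procedure just described (Sobel edge map, circular mask, Hough-type voting, squared column norms) can be sketched in pure numpy as follows. This is an illustrative reimplementation, not the paper's Algorithm 1: the Sobel filtering uses periodic padding for brevity, the angle grid is fixed at 1°, and the function names are ours.

```python
import numpy as np

def sobel_edges(img):
    """Gradient-magnitude edge map via 3x3 Sobel filters (periodic padding for brevity)."""
    kx = np.array([[-1.0, 0.0, 1.0], [-2.0, 0.0, 2.0], [-1.0, 0.0, 1.0]])
    ky = kx.T
    def conv2(u, k):
        out = np.zeros_like(u, dtype=float)
        for di in (-1, 0, 1):
            for dj in (-1, 0, 1):
                out += k[di + 1, dj + 1] * np.roll(u, (-di, -dj), axis=(0, 1))
        return out
    return np.hypot(conv2(img, kx), conv2(img, ky))

def estimate_direction(img, n_angles=180):
    """Score each angle eta in [-90, 90) degrees by a Hough-type vote and
    return the angle with the largest squared-two-norm score."""
    edges = sobel_edges(img)
    m, n = edges.shape
    # circular mask: avoid the bias of the Hough transform toward long diagonals
    ii, jj = np.mgrid[0:m, 0:n]
    radius = min(m, n) / 2.0
    mask = (ii - (m - 1) / 2.0) ** 2 + (jj - (n - 1) / 2.0) ** 2 <= radius ** 2
    ys, xs = np.nonzero(mask)
    weights = edges[ys, xs]
    etas = np.deg2rad(np.linspace(-90.0, 90.0, n_angles, endpoint=False))
    diag = int(np.ceil(np.hypot(m, n)))
    scores = np.empty(n_angles)
    for k, eta in enumerate(etas):
        # signed distance r = x cos(eta) + y sin(eta) of each edge pixel
        r = np.round(xs * np.cos(eta) + ys * np.sin(eta)).astype(int) + diag
        acc = np.bincount(r, weights=weights, minlength=2 * diag + 1)
        scores[k] = np.sum(acc ** 2)  # squared two-norm of the accumulator column
    return np.rad2deg(etas[np.argmax(scores)])
```

On an image made of vertical stripes, the edges are vertical lines, so the score is maximized at η ≈ 0° (the normal to the lines is horizontal).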

ADMM for Minimizing the KL-DTGV² Model
Although problem (6) is a bound-constrained convex optimization problem, the nondifferentiability of the DTGV² regularizer does not allow its solution by classical optimization methods for smooth problems, such as gradient methods (see [16,17,18] and the references therein). However, the problem can be solved by methods based on splitting techniques, such as those in [19,20,21,22,23]. Here we solve (6) by the Alternating Direction Method of Multipliers (ADMM) [20].
To this end, we first reformulate the problem as follows:

min_{u, w, z}  λ D_KL(z₁ + γ, b) + α₀ ‖z₂‖_{2,1|R^{2n}} + α₁ ‖z₃‖_{2,1|R^{4n}} + χ_{R^n₊}(z₄)
s.t.  z₁ = Au,  z₂ = ∇u − w,  z₃ = Ew,  z₄ = u,   (7)

where z₁ ∈ R^n, z₂ ∈ R^{2n}, z₃ ∈ R^{4n}, z₄ ∈ R^n, and χ_{R^n₊}(z₄) is the characteristic function of the nonnegative orthant in R^n. A similar splitting has been used in [24] for TV-based deblurring of Poissonian images. By introducing the auxiliary variables x = (u, w) and z = (z₁, z₂, z₃, z₄), we can further reformulate the KL-DTGV² problem as

min_{x, z}  F₁(x) + F₂(z)   s.t.   Hx + Gz = 0,   (8)

where we set

F₁(x) = 0,   F₂(z) = λ D_KL(z₁ + γ, b) + α₀ ‖z₂‖_{2,1|R^{2n}} + α₁ ‖z₃‖_{2,1|R^{4n}} + χ_{R^n₊}(z₄),   (9)

and we define the matrices H ∈ R^{8n×3n} and G ∈ R^{8n×8n} as

H = [ A  0 ; ∇  −I_{2n} ; 0  E ; I_n  0 ],   G = −I_{8n}.   (10)

We consider the Lagrangian function associated with problem (8),

L(x, z, ξ) = F₁(x) + F₂(z) + ξᵀ (Hx + Gz),   (11)

where ξ ∈ R^{8n} is a vector of Lagrange multipliers, and then the augmented Lagrangian function

L_ρ(x, z, ξ) = L(x, z, ξ) + (ρ/2) ‖Hx + Gz‖²,   (12)

where ρ > 0. Now we are ready to introduce the ADMM method for the solution of problem (8).
At each step k ≥ 0 the ADMM method computes the new iterate (x^{k+1}, z^{k+1}, ξ^{k+1}) as follows:

x^{k+1} = argmin_x L_ρ(x, z^k, ξ^k),
z^{k+1} = argmin_z L_ρ(x^{k+1}, z, ξ^k),   (13)
ξ^{k+1} = ξ^k + ρ (H x^{k+1} + G z^{k+1}).

Note that the functions F₁(x) and F₂(z) in (8) are closed, proper and convex. Moreover, the matrices H and G defined in (10) are such that G = −I_{8n} and H has full rank. Hence, the convergence of the method defined by (13) can be proved by applying a classical convergence result from the seminal paper by Eckstein and Bertsekas [25, Theorem 8], which we report in a form that can be immediately applied to our reformulation of the problem.
Remark 1. Let us consider a problem of the form (8), where F₁(x) and F₂(z) are closed, proper and convex functions and H has full rank. Let {(x^k, z^k, ξ^k)} be the sequence generated by the ADMM iteration (13), for any starting point. If there exists a saddle point (x*, z*, ξ*) of L(x, z, ξ), then x^k → x*, z^k → z* and ξ^k → ξ*. If such a saddle point does not exist, then at least one of the sequences {z^k} or {ξ^k} is unbounded.
Since we are dealing with linear constraints, we can recast (13) in a more convenient form, by observing that the linear term in (12) can be included in the quadratic one. By introducing the vector of scaled Lagrange multipliers µ^k = (1/ρ) ξ^k, the ADMM method becomes

x^{k+1} = argmin_{x ∈ R^{3n}} ‖Hx − z^k + µ^k‖²,   (14)
z^{k+1} = argmin_{z ∈ R^{8n}} F₂(z) + (ρ/2) ‖H x^{k+1} − z + µ^k‖²,   (15)
µ^{k+1} = µ^k + H x^{k+1} − z^{k+1}.   (16)

In the next sections we show how the solutions to subproblems (14) and (15) can be computed exactly with a small computational effort.
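To make the scheme (14)–(16) concrete, here is a minimal scaled-ADMM loop on a toy one-dimensional problem, min_x ½‖x − c‖² + λ‖z‖₁ subject to x − z = 0 (so that the constraint matrices reduce to H = I and G = −I). This toy example is ours and is not the KL-DTGV² solver: the x-step is a quadratic solve as in (14), the z-step is a proximal map as in (15), and the multiplier update matches (16).

```python
import numpy as np

def admm_scaled(c, lam=1.0, rho=1.0, iters=300):
    """Scaled ADMM for the toy problem min_x 0.5*||x - c||^2 + lam*||z||_1
    subject to x - z = 0 (H = I, G = -I)."""
    x = np.zeros_like(c)
    z = np.zeros_like(c)
    mu = np.zeros_like(c)   # scaled multipliers, mu = xi / rho
    for _ in range(iters):
        # x-step: quadratic subproblem, analogue of (14)
        x = (c + rho * (z - mu)) / (1.0 + rho)
        # z-step: proximal map of (lam/rho)*||.||_1, analogue of (15)
        v = x + mu
        z = np.sign(v) * np.maximum(np.abs(v) - lam / rho, 0.0)
        # multiplier update, analogue of (16)
        mu = mu + x - z
    return x
```

The closed-form solution of this toy problem is the soft-thresholding of c, which the iteration reproduces to high accuracy.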

Solving the Subproblem in x
Problem (14) is an overdetermined least squares problem, since H is a tall-and-skinny matrix with full rank. Hence, its solution can be computed by solving the normal equations system

HᵀH x = Hᵀ v_x^k,   (17)

where we set v_x^k = z^k − µ^k. Starting from the definition of H given in (10), we have

HᵀH = [ AᵀA + ∇ᵀ∇ + I_n   −∇ᵀ ; −∇   I_{2n} + EᵀE ].

System (17) may be quite large and expensive to solve, even for relatively small images. However, as pointed out in Assumption 0.1, A, D_θ and D_{θ⊥} have a BCCB structure; hence all the blocks of HᵀH maintain that structure. By recalling that BCCB matrices can be diagonalized by means of two-dimensional Discrete Fourier Transforms (DFTs), we show how the solution to (17) can be computed expeditiously.
Let F ∈ C^{n×n} be the matrix representing the two-dimensional DFT operator, and let F* denote its inverse, i.e., its adjoint. By Assumption 0.1, there exist diagonal matrices Γ, Δ_θ, Δ_{θ⊥} ∈ C^{n×n} such that A = F*ΓF, D_θ = F*Δ_θF and D_{θ⊥} = F*Δ_{θ⊥}F. We can therefore write HᵀH as

HᵀH = (I₃ ⊗ F*) Θ (I₃ ⊗ F),   (18)

where each block of the central matrix Θ ∈ C^{3n×3n} is the diagonal complex matrix associated with the corresponding block of HᵀH, obtained by formally replacing A, D_θ and D_{θ⊥} with Γ, Δ_θ and Δ_{θ⊥}, and Δ*_θ, Δ*_{θ⊥} denote the (diagonal) adjoint matrices of Δ_θ, Δ_{θ⊥}. By (18) and the definition of x, we can reformulate (17) as

Θ (F u, F w₁, F w₂) = (I₃ ⊗ F) Hᵀ v_x^k,   (19)

where we split w in the two blocks w₁, w₂ ∈ R^n, so that the unknown (Fu, Fw₁, Fw₂) consists of three blocks of size n.
We recall that a nonsingular 2×2 block matrix with invertible lower-right block can be inverted blockwise as

[B₁₁  B₁₂ ; B₂₁  B₂₂]⁻¹ = [ S⁻¹   −S⁻¹B₁₂B₂₂⁻¹ ; −B₂₂⁻¹B₂₁S⁻¹   B₂₂⁻¹ + B₂₂⁻¹B₂₁S⁻¹B₁₂B₂₂⁻¹ ],   S = B₁₁ − B₁₂B₂₂⁻¹B₂₁;   (20)

when the blocks B_{ij} are diagonal, all the blocks of the inverse are diagonal too. By applying (20) to the matrix consisting of the second and third block rows and columns of the matrix in (19), which we denote Φ, we get

Φ⁻¹ = [ Ψ₁₁  Ψ₁₂ ; Ψ₂₁  Ψ₂₂ ].   (21)

To simplify the notation we set

Ψ = Φ⁻¹   (22)

and observe that the matrices Ψ_{ij} ∈ C^{n×n} are diagonal. Letting Δ = (Δ_θ, aΔ_{θ⊥}) ∈ C^{2n×n} denote the matrix such that Θ = [Θ₁₁  −Δ* ; −Δ  Φ], with Δ* = [Δ*_θ  aΔ*_{θ⊥}], applying the inversion formula (20) to the whole matrix in (19), and using (21) and (22), we get

Θ⁻¹ = [ Ξ⁻¹   Ξ⁻¹Δ*Ψ ; ΨΔΞ⁻¹   Υ ],   (23)

where

Ξ = Θ₁₁ − Δ*ΨΔ,   Ω = Φ − ΔΘ₁₁⁻¹Δ*.   (24)

We note that Ξ ∈ C^{n×n} is diagonal (and its inversion is straightforward), while Ω ∈ C^{2n×2n} has a 2×2 block structure with blocks that are diagonal matrices belonging to C^{n×n}. Thus, we can compute Υ = Ω⁻¹ by applying (20). Summing up, by (19), (23) and (24), the solution to (17) can be obtained by computing

(û, ŵ₁, ŵ₂) = Θ⁻¹ (I₃ ⊗ F) Hᵀ v_x^k   (25)

and setting

u^{k+1} = F* û,   w₁^{k+1} = F* ŵ₁,   w₂^{k+1} = F* ŵ₂.   (26)

Remark 1.1. The only quantity in (25) that varies at each iteration is v_x^k. Hence, the matrices Δ, Γ, Ψ, Ξ⁻¹, and Υ can be computed only once before the ADMM method starts. This means that the overall cost of the exact solution of (14) at each iteration reduces to six two-dimensional DFTs and two matrix-vector products involving 3×3 block matrices with diagonal blocks of dimension n.
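The Fourier-diagonalization trick above can be illustrated on a stripped-down version of the normal equations (17), with a single BCCB operator A and system (AᵀA + ρI)x = r: the matrix is never formed, and the solve costs two FFTs plus a pointwise division. The function name and the simplified system are ours.

```python
import numpy as np

def solve_circulant_normal(kernel, rhs, rho):
    """Solve (A^T A + rho*I) x = rhs, where A is the 2-D circular convolution
    whose impulse response is `kernel` (same shape as the image).

    Since A = F* Lambda F with Lambda diagonal, the normal matrix has
    eigenvalues |Lambda|^2 + rho, and the solve is a pointwise division."""
    lam = np.fft.fft2(kernel)                    # eigenvalues of A
    return np.real(np.fft.ifft2(np.fft.fft2(rhs) / (np.abs(lam) ** 2 + rho)))
```

The same idea, applied blockwise, is what makes the exact solution of (14) cheap: all blocks of HᵀH are diagonalized by the same transform, so the block system reduces to pixelwise small linear solves in the Fourier domain.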

Update of z₁
Since F₂ is separable in the blocks z₁, z₂, z₃ and z₄, subproblem (15) splits into four independent problems, denoted (27)–(30), where we set v_z^k = H x^{k+1} + µ^k. By the form of the Kullback–Leibler divergence in (2), the minimization problem (27) is equivalent to

min_{z ∈ R^n}  λ Σ_{j=1}^n ( z_j + γ_j − b_j ln(z_j + γ_j) ) + (ρ/2) ‖z − d‖²,   (31)

where we set d = [v_z^k]₁ to ease the notation. From (31) it is clear that the problem in z₁ can be split into n one-dimensional problems of the form

min_{z ∈ R}  λ ( z + γ − b ln(z + γ) ) + (ρ/2) (z − d)²,   (32)

where γ, b and d denote the corresponding entries of γ, b and d. Since the objective function of this problem is strictly convex, its solution can be determined by setting the derivative equal to zero, i.e., by solving

λ ( 1 − b/(z + γ) ) + ρ (z − d) = 0,

which leads to the quadratic equation

ρ z² + ( λ + ρ(γ − d) ) z + λ(γ − b) − ρ d γ = 0.   (33)

Since, by looking at the domain of the objective function in (32), z + γ has to be strictly positive, we set each entry of z₁^{k+1} as the largest solution of the corresponding quadratic Equation (33).
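A vectorized sketch of this update (the function name is ours) solves the quadratic pixelwise and keeps the largest root; the coefficients follow from multiplying the optimality condition λ(1 − b/(z + γ)) + ρ(z − d) = 0 by z + γ.

```python
import numpy as np

def update_z1(d, b, gamma, lam, rho):
    """Entrywise largest root of
    rho*z^2 + (lam + rho*(gamma - d))*z + lam*(gamma - b) - rho*d*gamma = 0,
    obtained by multiplying lam*(1 - b/(z + gamma)) + rho*(z - d) = 0 by z + gamma."""
    B = lam + rho * (gamma - d)
    C = lam * (gamma - b) - rho * d * gamma
    return (-B + np.sqrt(B ** 2 - 4.0 * rho * C)) / (2.0 * rho)
```

Because all operations are elementwise, the same code works for scalars and for whole images stored as arrays.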

Update of z₂ and z₃
The minimization problems (28) and (29) correspond to the computation of the proximal operators of the functions f(z₂) = (α₀/ρ) ‖z₂‖_{2,1|R^{2n}} and g(z₃) = (α₁/ρ) ‖z₃‖_{2,1|R^{4n}}, respectively. By the definitions given in (4) and (5), we see that the two (2,1)-norms correspond to the sum of two-norms of vectors in R² and R⁴, respectively. This means that the computation of both proximal operators can be split into the computation of n proximal operators of functions that are scaled two-norms in either R² or R⁴. The proximal operator of the function f(y) = c‖y‖, with c > 0, at a vector d is

prox_f(d) = argmin_y  c‖y‖ + (1/2)‖y − d‖².

It can be shown (see, e.g., [26, Chapter 6]) that

prox_f(d) = ( 1 − c/max{‖d‖, c} ) d.

Hence, for the update of z₂ we proceed as follows. By setting d = [v_z^k]₂ and c = α₀/ρ, for each i = 1, …, n we have

( (z₂^{k+1})_i, (z₂^{k+1})_{n+i} ) = ( 1 − c/max{‖(d_i, d_{n+i})‖, c} ) (d_i, d_{n+i}).

To update z₃, we set d = [v_z^k]₃ and c = α₁/ρ and compute, for each i = 1, …, n,

( (z₃^{k+1})_i, (z₃^{k+1})_{n+i}, (z₃^{k+1})_{2n+i}, (z₃^{k+1})_{3n+i} ) = ( 1 − c/max{‖(d_i, d_{n+i}, d_{2n+i}, d_{3n+i})‖, c} ) (d_i, d_{n+i}, d_{2n+i}, d_{3n+i}).
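Both updates are instances of group soft-thresholding. In the small numpy sketch below (function names ours), `prox_two_norm` implements the proximal operator of c‖·‖ at a single vector, and `prox_21` applies it groupwise to a vector stored in the stacked layout (d_i, d_{n+i}, …) used by the (2,1)-norms (4) and (5).

```python
import numpy as np

def prox_two_norm(d, c):
    """Proximal operator of f(y) = c*||y||_2 at d (block soft-thresholding)."""
    nrm = np.linalg.norm(d)
    if nrm <= c:
        return np.zeros_like(d)
    return (1.0 - c / nrm) * d

def prox_21(d, c, n_groups):
    """Apply prox_two_norm groupwise to a stacked vector d, whose i-th group
    is (d_i, d_{n+i}, ...), with n = d.size // n_groups."""
    n = d.size // n_groups
    blocks = d.reshape(n_groups, n)          # row k holds d_{k*n}, ..., d_{k*n+n-1}
    nrms = np.linalg.norm(blocks, axis=0)    # two-norm of each group (column)
    scale = 1.0 - c / np.maximum(nrms, c)    # equals 0 where the norm is below c
    return (blocks * scale).reshape(-1)
```

With n_groups = 2 this computes the z₂ update; with n_groups = 4, the z₃ update.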

Update of z₄
It is straightforward to verify that the update of z₄ in (30) can be obtained as

z₄^{k+1} = Π_{R^n₊}( [v_z^k]₄ ),

where Π_{R^n₊} is the Euclidean projection onto the nonnegative orthant in R^n, i.e., the componentwise maximum between the argument and zero.

Summary of the ADMM Method
For the sake of clarity, in Algorithm 2 we sketch the ADMM version used for solving problem (7). In many image restoration applications, a reasonably good starting guess for u is often available. For example, if A represents a blur operator, a common choice is to set u⁰ equal to the noisy and blurry image; we make this choice for u⁰. By numerical experiments we also verified that, once x = (u, w) has been initialized, it is convenient to set u¹ = u⁰, w₁¹ = w₁⁰ and w₂¹ = w₂⁰ and to shift the order of the updates in the ADMM scheme (14)–(16), so that a "more effective" initialization of z and µ is performed. We see from line 9 of Algorithm 2 that the algorithm stops when the relative change in the restored image u goes below a certain threshold tol ∈ (0, 1) or a maximum number of iterations k_max is reached. Finally, we note that for the case of the KL-TGV² model, corresponding to θ = 0 and a = 1, we have D_θ = D_H and D_{θ⊥} = D_V; hence, we use the initialization w₁⁰ = D_H u⁰ and w₂⁰ = D_V u⁰.

Numerical Results
All the experiments were carried out using MATLAB R2018a on a 3.50 GHz Intel Xeon E3 with 16 GB of RAM and the Windows operating system. In this section, we first illustrate the effectiveness of Algorithm 1 for the estimation of the image direction, by comparing it with the algorithm given in [9] and by analysing its sensitivity to the degradation in the image to be restored. Then, we present numerical experiments that demonstrate the improvement of the KL-DTGV² model upon the KL-TGV² model for the restoration of directional images corrupted by Poisson noise. Four directional images, named phantom (512 × 512), grass (375 × 600), leaves (203 × 300) and carbon (247 × 300), were used in the experiments. The first image is a piecewise-affine fibre phantom image obtained with the fibre_phantom_pa MATLAB function available from http://www2.compute.dtu.dk/~pcha/HDtomo/ (downloaded: 20 September 2020). The second and third images represent grass and veins of leaves, respectively, which naturally exhibit a directional structure. The last image is a Scanning Electron Microscope (SEM) image of carbon fibres. The images are shown in the figures accompanying this section. To simulate experimental data, each reference image was convolved with two PSFs, one corresponding to a Gaussian blur with variance 2, generated by the psfGauss function from [27], and the other corresponding to an out-of-focus blur with radius 5, obtained with the function fspecial from the MATLAB Image Processing Toolbox. To take into account the existence of some background emission, a constant term γ equal to 10⁻¹⁰ was added to all pixels of the blurry image. The resulting image was corrupted by Poisson noise, using the MATLAB function imnoise. The intensities of the original images were pre-scaled to obtain noisy and blurry images with Signal-to-Noise Ratio (SNR) equal to 43 and 37 dB. We recall that in the case of Poisson noise, which affects the photon counting process, the SNR can be estimated as in [28].
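The degradation pipeline just described (blur, constant background, Poisson noise) can be mimicked in numpy as follows; `np.random.Generator.poisson` plays the role of MATLAB's imnoise here, the PSF is assumed to be stored in wrap-around layout consistent with the periodic boundary conditions of Assumption 0.1, and the function name is ours.

```python
import numpy as np

def degrade(u, psf, gamma=1e-10, seed=0):
    """Periodic blur + constant background + Poisson noise.

    The PSF is assumed to be stored in wrap-around layout (peak at [0, 0])."""
    blurred = np.real(np.fft.ifft2(np.fft.fft2(psf) * np.fft.fft2(u)))
    rng = np.random.default_rng(seed)
    return rng.poisson(np.maximum(blurred + gamma, 0.0)).astype(float)
```

Pre-scaling the intensities of u before calling this function changes the photon counts and hence the effective noise level, which is how different SNR values are obtained.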

Direction Estimation
In Figures 2–5 we compare Algorithm 1 with the algorithm proposed in [9], showing that Algorithm 1 always correctly estimates the main direction of the four test images. We also test the robustness of our algorithm with respect to noise and blur. In Figure 6 we show the estimated main direction of the phantom image corrupted by Poisson noise with SNR = 35, 37, 39, 41, 43 dB and out-of-focus blurs with radius R = 5, 7, 9. In only one case (SNR = 35, R = 7) does Algorithm 1 fail, returning as estimate the orthogonal direction, i.e., the direction corresponding to the large black line and the background color gradient. Finally, we test Algorithm 1 on a phantom image with vertical, horizontal and diagonal main directions, corresponding to θ = 0°, 90° and 45°. The results, shown in Figure 7, indicate that our algorithm is not sensitive to the specific directional structure of the image.

Image Deblurring
We compare the quality of the restorations obtained by using the DTGV² and TGV² regularizers, applying ADMM to the solution of both models. In all the tests, the value of the penalty parameter was set as ρ = 10 and the value of the stopping threshold as tol = 10⁻⁴. A maximum number of k_max = 500 iterations was allowed. Following [9,10], the weight parameters of DTGV were chosen as α₀ = β and α₁ = 1 − β, with β = 2/3. For each test problem, the value of the regularization parameter λ was tuned by a trial-and-error strategy. This strategy consisted of running ADMM with initial guess u⁰ = b several times on each test image, varying the value of λ at each execution. For all the runs the stopping criterion for ADMM and the values of α₀, α₁ and ρ were the same as described above. The value of λ yielding the smallest Root Mean Square Error (RMSE) at the last iteration was chosen as the "optimal" value. The numerical results are summarized in Table 1, where the RMSE, the Improved Signal-to-Noise Ratio (ISNR) [29] and the Structural Similarity (SSIM) index [30] are used to give a quantitative evaluation of the quality of the restorations. As a measure of the computational cost, the number of iterations and the time in seconds are reported. Table 1 also shows, for each test problem, the value of the regularization parameter λ. The restored images are shown in the figures accompanying this section. From the results, it is evident that the DTGV² model outperforms the TGV² one in terms of quality of the restoration. A visual inspection of the figures shows that the DTGV² regularization is very effective in removing the noise, while for high noise levels the TGV² reconstructions still exhibit noise artifacts. Finally, by observing the "Iters" column of the table, we can conclude that, on average, the TGV² regularization requires fewer ADMM iterations to achieve a relative change in the restoration that is below the fixed threshold.
However, the computational time per iteration is very small, and ADMM for the KL-DTGV² model is efficient as well. Finally, to illustrate the behaviour of ADMM, in Figure 13 we plot the RMSE history for the carbon test problem. A similar RMSE behaviour was observed in all the numerical experiments.
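For completeness, RMSE and ISNR can be computed from their standard definitions (ISNR measures, in dB, how much closer the restoration is to the ground truth than the observed image is). This small sketch uses our own function names and does not include SSIM, which requires a more elaborate windowed computation.

```python
import numpy as np

def rmse(u_rest, u_true):
    """Root Mean Square Error of the restoration."""
    return np.linalg.norm(u_rest - u_true) / np.sqrt(u_true.size)

def isnr(u_rest, u_true, b):
    """Improved SNR (dB): error of the observed image over error of the restoration."""
    return 10.0 * np.log10(np.sum((b - u_true) ** 2) / np.sum((u_rest - u_true) ** 2))
```

A positive ISNR means the restoration improved upon the observed image; larger values indicate a better restoration, while RMSE should be as small as possible.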

Conclusions
We dealt with the use of the Directional TGV regularization for directional images corrupted by Poisson noise. We presented the KL-DTGV² model and introduced a two-block ADMM version for its minimization. Finally, we proposed an effective strategy for the estimation of the main direction of the image. Our numerical experiments show that for Poisson noise the DTGV² regularization provides superior restoration performance compared with the standard TGV² regularization (see, e.g., the difference images for the carbon test problem in Figure 12), thus highlighting the importance of taking into account the texture structure of the image. A crucial ingredient for the success of the model was the proposed direction estimation strategy, which proved to be more reliable than those proposed in the literature. Possible future work includes the use of space-variant regularization terms and the analysis of automatic strategies for the selection of the regularization parameters.